How WebRTC Works Inside Philippine Enterprise UC Platforms: Signaling, Media Negotiation, and NAT Traversal Explained

WebRTC’s ICE framework tests three distinct candidate paths before a single audio packet moves between browser-based VoIP endpoints on a Philippine enterprise network: host candidates on the local interface, server-reflexive candidates discovered through STUN, and relay candidates routed through TURN. Philippine network topologies push TURN relay usage to 8–20% of enterprise sessions because of widespread symmetric NAT and carrier-grade NAT (CGNAT) deployments across major ISPs.

TL;DR: WebRTC eliminates plugins for browser-based voice and video inside UC platforms, but Philippine enterprises face higher-than-average NAT traversal failures. Understanding signaling over WebSocket, SDP codec negotiation, ICE candidate gathering, and STUN/TURN fallback mechanics determines whether a deployment produces clear calls or silent black holes.

That 8–20% range matters for capacity planning. Globally, roughly 92% of connections establish over direct, STUN-assisted peer-to-peer paths; the remaining sessions fall back to TURN relay servers, which consume bandwidth at the relay point rather than only at the endpoints. For a Philippine BPO running 500 concurrent browser-based agent sessions, 8–20% TURN fallback means 40 to 100 sessions hitting your relay infrastructure simultaneously. Undersizing that relay layer produces exactly the kind of one-way audio and call drops that frustrate IT teams. If you’ve been tracking call quality degradation across your network, WebRTC NAT traversal failures are a frequent root cause that doesn’t show up in traditional SIP diagnostics.

How Signaling Bootstraps Every WebRTC Session

WebRTC deliberately leaves signaling unspecified at the protocol level. The standard defines media transport (SRTP), encryption (DTLS), and connectivity checks (ICE), but the mechanism for exchanging session descriptions between peers is the implementer’s problem. Philippine UC platforms from vendors like Yeastar, PortSIP, and Ribbon typically solve this with WebSocket connections over TLS (WSS on port 443), which provides the low-latency bidirectional channel that HTTP polling can’t match.

The signaling server’s job is narrow: relay SDP offers and answers between participants, forward ICE candidates, and handle participant presence. It never touches media. For a 200-seat Philippine enterprise, the signaling server handles session establishment rates measured in dozens of transactions per second, while the media plane handles hundreds of concurrent RTP streams. This asymmetry means signaling scales horizontally with shared state stored in Redis or a comparable in-memory store, keeping session data intact if a signaling node restarts.

Info: SIP over WebRTC requires a gateway or Session Border Controller that translates between SIP/UDP on the PBX side and WebSocket/DTLS on the browser side. PortSIP’s SBC documentation describes this as performing “federation services to transform SIP communications into WebRTC or vice versa,” allowing organizations to [retain their existing PBX infrastructure](/blog/on-premise-pbx-cloud-voip-philippines) while adding browser-based endpoints.

When a user opens a WebRTC-enabled UC client in Chrome or Edge, the browser establishes a WSS connection to the signaling server. The server authenticates the session (typically via OAuth2 or token-based auth), registers the endpoint’s presence state, and waits. No media resources are consumed until someone initiates a call.
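The narrow relay role described in this section can be sketched in a few lines. This is a minimal in-memory illustration, not any vendor’s implementation: the `SignalingRelay` class, its method names, and the message shape are all hypothetical, and a production server would sit behind a WSS listener with authentication and shared state.

```python
from typing import Callable, Dict

# Hypothetical message shape: {"type": ..., "to": peer_id, "payload": ...}
RELAYED_TYPES = {"offer", "answer", "candidate"}


class SignalingRelay:
    """Forwards SDP and ICE messages between registered peers; never touches media."""

    def __init__(self) -> None:
        # peer_id -> callback that delivers a message to that peer's connection
        self._peers: Dict[str, Callable[[dict], None]] = {}

    def register(self, peer_id: str, deliver: Callable[[dict], None]) -> None:
        """Record presence once the (assumed) authentication step has passed."""
        self._peers[peer_id] = deliver

    def unregister(self, peer_id: str) -> None:
        self._peers.pop(peer_id, None)

    def is_present(self, peer_id: str) -> bool:
        return peer_id in self._peers

    def relay(self, sender_id: str, message: dict) -> bool:
        """Forward an offer, answer, or candidate; return False if it cannot be routed."""
        if message.get("type") not in RELAYED_TYPES:
            return False
        deliver = self._peers.get(message.get("to"))
        if deliver is None:
            return False
        # Stamp the sender so the recipient knows where to send its reply.
        deliver({**message, "from": sender_id})
        return True
```

The relay only routes three message types and tracks presence, which is why the signaling tier can stay small relative to the media plane.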

Figure: WebRTC signaling flow between a browser client, WebSocket signaling server, and SIP PBX gateway, with arrows indicating SDP offer/answer exchange and ICE candidate forwarding.

SDP Offers, Answers, and How Codecs Get Negotiated

Session Description Protocol (SDP) carries the payload that makes or breaks call quality. When Peer A initiates a call, the browser generates an SDP offer containing every codec it supports (Opus at 48 kHz for audio, VP8 or VP9 or H.264 for video), its ICE candidates, DTLS fingerprint, and media line descriptions specifying whether the session includes audio, video, or data channels.

Peer B’s browser receives this offer through the signaling server and generates an SDP answer. The answer selects the highest-priority codec both sides support, confirms encryption parameters, and includes Peer B’s own ICE candidates. This exchange happens in 1–3 round trips over the WebSocket, typically completing in under 200 milliseconds on Philippine fiber connections from PLDT or Converge.
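The codec selection step can be illustrated with a small parser. This is a simplified sketch: the sample SDP fragment and the function names are invented for illustration, only the audio media line is handled, and it assumes the answerer honors the offerer’s preference order, which real stacks may override.

```python
import re
from typing import Iterable, List, Optional

# Matches lines such as "a=rtpmap:111 opus/48000/2"
RTPMAP = re.compile(r"^a=rtpmap:(\d+) ([^/]+)/(\d+)")

# Illustrative fragment only; a real browser offer carries many more lines.
SAMPLE_OFFER = """\
m=audio 9 UDP/TLS/RTP/SAVPF 111 0 8
a=rtpmap:111 opus/48000/2
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""


def offered_audio_codecs(sdp: str) -> List[str]:
    """Return audio codec names in the preference order given on the m=audio line."""
    payload_order: List[str] = []
    names = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio "):
            # m=audio <port> <proto> <payload type> <payload type> ...
            payload_order = line.split()[3:]
            continue
        match = RTPMAP.match(line)
        if match:
            names[match.group(1)] = match.group(2).lower()
    return [names[pt] for pt in payload_order if pt in names]


def select_codec(offer_order: Iterable[str], locally_supported: Iterable[str]) -> Optional[str]:
    """Pick the first offered codec that the answering side also supports."""
    supported = {c.lower() for c in locally_supported}
    for codec in offer_order:
        if codec in supported:
            return codec
    return None


offer = offered_audio_codecs(SAMPLE_OFFER)
print(offer)                                   # ['opus', 'pcmu', 'pcma']
print(select_codec(offer, ["PCMU", "PCMA"]))   # pcmu
```

An endpoint that only speaks G.711, such as a legacy SIP gateway, ends up on PCMU here, which is how a call can silently lose Opus’s bandwidth advantage.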

Codec selection has direct operational impact for Philippine enterprises. Opus, the default WebRTC audio codec, operates at bitrates from 6 kbps to 510 kbps and adapts dynamically to network conditions. For a BPO floor where Erlang capacity models drive agent staffing, Opus at 32 kbps per stream halves the payload bandwidth of G.711’s fixed 64 kbps (roughly 40% less once per-stream overhead is included), directly affecting how many concurrent calls a single internet circuit can carry. A 100 Mbps symmetrical fiber link supports approximately 3,100 concurrent Opus streams at 32 kbps versus about 1,560 G.711 streams at 64 kbps before accounting for IP/UDP/RTP overhead of roughly 12 kbps per stream; with that overhead included, the figures drop to roughly 2,270 and 1,315.
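The link-capacity arithmetic is a plain division of circuit bandwidth by per-stream bitrate. The sketch below uses the article’s 12 kbps overhead figure as given; the function name is hypothetical, and the result is a theoretical ceiling that ignores signaling, retransmissions, and other traffic on the circuit.

```python
def streams_per_link(link_mbps: float, payload_kbps: float, overhead_kbps: float = 0.0) -> int:
    """Theoretical number of one-way streams that fit in a link of the given size."""
    per_stream_kbps = payload_kbps + overhead_kbps
    return int(link_mbps * 1000 // per_stream_kbps)


# Payload only
print(streams_per_link(100, 32))        # 3125 (Opus)
print(streams_per_link(100, 64))        # 1562 (G.711)

# Including ~12 kbps of IP/UDP/RTP overhead per stream
print(streams_per_link(100, 32, 12))    # 2272 (Opus)
print(streams_per_link(100, 64, 12))    # 1315 (G.711)
```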

According to MDN’s WebRTC protocol reference, the entire media path uses DTLS-SRTP encryption by default, meaning every audio and video frame is encrypted between the two WebRTC endpoints without requiring the IT team to configure TLS certificates on the media plane. The signaling plane still needs its own TLS layer (WSS), but the media encryption is baked into the WebRTC specification itself.

ICE Candidate Gathering Decides Whether Calls Connect

Interactive Connectivity Establishment is the framework that systematically discovers how two endpoints can reach each other across NATs, firewalls, and multiple network hops. As GetStream’s ICE candidate documentation explains, “peers negotiate the actual connection between them by exchanging ICE candidates” through a structured process that tests every possible path.

ICE gathers three types of candidates in sequence. Host candidates come from the device’s local network interfaces and work when both peers sit on the same LAN. Server-reflexive candidates come from a STUN query that reveals the endpoint’s public IP and mapped port as seen by the outside world. Relay candidates come from a TURN server that agrees to forward media on behalf of the endpoint. The browser collects all three types, bundles them into the SDP, and sends them to the remote peer through the signaling channel.

Once both peers have each other’s candidate lists, ICE runs connectivity checks by sending STUN binding requests on every possible candidate pair. A session between two endpoints with 3 candidates each produces up to 9 candidate pairs to test. The ICE agent prioritizes host-to-host pairs first (fastest, zero relay cost), then server-reflexive pairs (STUN-assisted), then relay pairs (TURN-assisted, highest latency and cost). The first pair that produces a successful bidirectional STUN response wins, and media flows on that path.
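The pairing and ordering step can be sketched as follows, using the candidate and pair priority formulas and recommended type preferences from the ICE specification (RFC 8445). The addresses are documentation placeholders, the local side is assumed to be the controlling agent, and the sketch skips the pruning, trickle, and nomination steps that real ICE agents perform.

```python
from itertools import product
from typing import List, NamedTuple, Tuple

# Recommended type preferences from RFC 8445 (higher is preferred)
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


class Candidate(NamedTuple):
    kind: str      # "host", "srflx", or "relay"
    address: str
    port: int


def candidate_priority(c: Candidate, local_preference: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)"""
    return (TYPE_PREFERENCE[c.kind] << 24) + (local_preference << 8) + (256 - component_id)


def pair_priority(controlling: int, controlled: int) -> int:
    """pair priority = 2^32 * min(G, D) + 2 * max(G, D) + (1 if G > D else 0)"""
    g, d = controlling, controlled
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local: List[Candidate], remote: List[Candidate]) -> List[Tuple[Candidate, Candidate]]:
    """Form every local/remote pair and sort it into check order, highest priority first."""
    return sorted(
        product(local, remote),
        key=lambda p: pair_priority(candidate_priority(p[0]), candidate_priority(p[1])),
        reverse=True,
    )


local = [Candidate("host", "10.0.0.5", 50000),
         Candidate("srflx", "192.0.2.10", 40001),
         Candidate("relay", "198.51.100.1", 49152)]
remote = [Candidate("host", "10.1.0.9", 50010),
          Candidate("srflx", "203.0.113.7", 40002),
          Candidate("relay", "198.51.100.1", 49153)]

checks = ordered_pairs(local, remote)
print(len(checks))                              # 9
print(checks[0][0].kind, checks[0][1].kind)     # host host
print(checks[-1][0].kind, checks[-1][1].kind)   # relay relay
```

Because the pair formula is dominated by the lower-priority candidate in each pair, any pair that involves a relay candidate sorts below every pair that does not.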

Figure: the three ICE candidate types (host, server-reflexive, relay) with a percentage breakdown of connection success rates: 92% via STUN direct, 8-20% via TURN relay.

Philippine NAT Topologies That Break Direct Connections

Why do STUN and TURN matter more in the Philippines than in, say, Singapore or Tokyo? The answer sits in how Philippine ISPs deploy network address translation.

PLDT, Globe, and Converge all use CGNAT on residential and many SME circuits. CGNAT places subscribers behind a second layer of NAT at the carrier level, meaning the “public” IP a STUN server discovers is actually the carrier’s shared IP, not the subscriber’s. When both peers sit behind different CGNAT deployments, symmetric NAT behavior blocks inbound STUN binding requests because the carrier’s NAT router rejects connections from peers it hasn’t previously communicated with. According to MDN’s WebRTC documentation, TURN exists specifically to “bypass the Symmetric NAT restriction by opening a connection with a TURN server and relaying all information through that server.”
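The reason a STUN-discovered address stops being useful behind symmetric NAT can be shown with a toy model. The `SimulatedNat` class below is purely illustrative: it models only port-mapping behavior, the addresses are placeholders, and real carrier NAT equipment also applies filtering rules and timeouts.

```python
from typing import Dict, Tuple

Addr = Tuple[str, int]


class SimulatedNat:
    """Toy NAT that assigns public ports to outbound flows."""

    def __init__(self, public_ip: str, symmetric: bool, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.symmetric = symmetric
        self._next_port = first_port
        self._mappings: Dict[tuple, int] = {}

    def map_outbound(self, internal: Addr, destination: Addr) -> Addr:
        # Symmetric NAT keys the mapping on the destination as well, so every
        # new destination gets a fresh public port. Endpoint-independent NAT
        # reuses one mapping per internal address and port.
        key = (internal, destination) if self.symmetric else (internal,)
        if key not in self._mappings:
            self._mappings[key] = self._next_port
            self._next_port += 1
        return (self.public_ip, self._mappings[key])


client = ("100.64.1.5", 50000)      # subscriber address inside the carrier network
stun = ("198.51.100.10", 3478)
peer = ("203.0.113.20", 50010)

cone = SimulatedNat("192.0.2.1", symmetric=False)
print(cone.map_outbound(client, stun) == cone.map_outbound(client, peer))   # True

sym = SimulatedNat("192.0.2.1", symmetric=True)
print(sym.map_outbound(client, stun) == sym.map_outbound(client, peer))     # False
```

In the symmetric case, the server-reflexive candidate advertised to the peer describes a mapping that only exists toward the STUN server, so connectivity checks against it fail and the relay pair is the one that succeeds.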

Enterprise-grade fiber circuits from PLDT Enterprise or the provincial fiber buildouts Converge is deploying often provide static public IPs that avoid CGNAT, but branch offices, remote workers on residential connections, and hotel-based staff regularly hit the CGNAT wall. A Philippine enterprise with 15 branch offices might find that 4–6 branches connect through CGNAT circuits, pushing their TURN relay usage well above the 8% global floor.

Firewalls add a second layer of obstruction. Corporate Fortinet or Cisco ASA deployments commonly restrict outbound UDP to specific ports, blocking the UDP/3478 that STUN defaults to. Best practice for Philippine WebRTC enterprise deployments dictates configuring TURN servers to listen on TCP port 443, which firewalls almost universally permit because blocking it would break HTTPS. TURN credentials should use HMAC-SHA1 with tokens expiring within 24 hours to prevent open-relay abuse.
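Short-lived HMAC-SHA1 credentials are commonly generated with the time-limited convention that coturn supports in its `use-auth-secret` mode: the username carries an expiry timestamp, and the password is the base64-encoded HMAC-SHA1 of that username under a secret shared with the TURN server. The sketch below assumes your TURN server follows that convention; the function name, secret, and user ID are placeholders.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional, Tuple


def turn_credentials(shared_secret: str, user_id: str,
                     ttl_seconds: int = 24 * 3600,
                     now: Optional[float] = None) -> Tuple[str, str]:
    """Return (username, password) that stop being valid after ttl_seconds."""
    issued_at = time.time() if now is None else now
    expiry = int(issued_at) + ttl_seconds
    # The username embeds the expiry so the TURN server can reject stale tokens.
    username = f"{expiry}:{user_id}"
    digest = hmac.new(shared_secret.encode("utf-8"),
                      username.encode("utf-8"),
                      hashlib.sha1).digest()
    password = base64.b64encode(digest).decode("ascii")
    return username, password


# Placeholder values; in practice the secret stays on the server that issues tokens.
username, password = turn_credentials("replace-with-shared-secret", "agent42", ttl_seconds=3600)
print(username)
print(password)
```

The signaling server would typically hand these values to the browser at session start, so the shared secret itself never reaches the client.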

TURN Relay Sizing for Philippine Workloads

TURN server capacity is bounded by bandwidth, not CPU. Each relayed audio stream at Opus 32 kbps consumes roughly 44 kbps of relay bandwidth after overhead. Each relayed HD video stream at 1080p consumes 2.5–4 Mbps. A TURN server on a 1 Gbps link can theoretically relay around 22,000 concurrent audio-only sessions or approximately 250 concurrent 1080p video sessions.

For a 300-seat Philippine call center using browser-based VoIP, where 15% of sessions fall back to TURN, you’re looking at 45 concurrent TURN-relayed audio streams consuming roughly 2 Mbps of relay bandwidth. That’s trivial for a single TURN server. But a 300-seat video conferencing deployment with the same TURN fallback rate produces 45 concurrent video streams at 2.5 Mbps each, consuming 112.5 Mbps of relay bandwidth. The difference between audio-only and video workloads changes TURN sizing by a factor of 56.
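The sizing comparison in this section reduces to two multiplications. The helper below is a hypothetical sketch using the per-stream figures quoted in the text (44 kbps for relayed Opus audio, 2.5 Mbps for 1080p video); it counts one relayed stream per fallback session, as the text does.

```python
from typing import Tuple


def relay_load(seats: int, fallback_rate: float, per_stream_kbps: float) -> Tuple[int, float]:
    """Return (relayed streams, relay bandwidth in Mbps) for a given TURN fallback rate."""
    streams = round(seats * fallback_rate)
    return streams, streams * per_stream_kbps / 1000


audio_streams, audio_mbps = relay_load(300, 0.15, 44)
video_streams, video_mbps = relay_load(300, 0.15, 2500)

print(audio_streams, audio_mbps)        # 45 1.98
print(video_streams, video_mbps)        # 45 112.5
print(round(video_mbps / audio_mbps))   # 57
```

Dividing by the rounded 2 Mbps audio figure gives the factor of about 56 quoted above; the unrounded ratio is closer to 57.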

Softil’s WebRTC Interconnect Framework documentation describes an enterprise integration stack that includes “ICE/STUN/TURN toolkits with the addition of WebRTC transport mechanisms such as WebSocket and DTLS/multiplexing” as core components rather than optional add-ons. For Philippine deployments, placing a TURN cluster in a Manila data center (Equinix MN1 or VITRO Makati) keeps relay latency under 5 ms for Metro Manila users and under 25 ms for Cebu or Davao users, compared to 80–150 ms round-trip if your TURN server sits in Singapore or Tokyo.

| Component | Port | Protocol | Philippine Sizing Consideration |
|---|---|---|---|
| Signaling (WSS) | 443 | TCP/TLS | 1 server per ~5,000 concurrent sessions; horizontal scale with Redis |
| STUN | 3478 | UDP | Minimal resources; handles ~92% of connections |
| TURN | 443 (TCP fallback) | TCP/TLS or UDP | Bandwidth-bound; 1 Gbps supports ~22,000 audio or ~250 video relays |
| SFU (media server) | Dynamic | UDP/DTLS | CPU-bound for packet forwarding (an SFU does not transcode); 1 server per ~200 concurrent group calls |

Figure: network architecture of a Philippine enterprise WebRTC deployment, with browser clients in Manila, Cebu, and Davao connecting through a Manila-based TURN cluster, signaling server, and SBC.

SFU Architecture for Group Calls Beyond Five Participants

Peer-to-peer (mesh) WebRTC topology breaks down around 5 concurrent participants because each peer must upload a separate media stream to every other peer. A 5-person call requires each participant to maintain 4 upstream connections. A 10-person call would require 9; at the 2.5–4 Mbps per 1080p stream cited above, that is 22.5–36 Mbps of sustained upload, which saturates or exceeds most Philippine broadband connections running at 25–50 Mbps upload.
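The upload cost of a mesh call grows linearly with the number of participants, which a short calculation makes concrete. The function names are hypothetical, and the per-stream bitrates are the 2.5–4 Mbps 1080p figures used earlier in the article.

```python
def mesh_upload_mbps(participants: int, stream_mbps: float) -> float:
    """Each peer in a mesh call uploads one copy of its stream to every other peer."""
    return (participants - 1) * stream_mbps


def sfu_upload_mbps(participants: int, stream_mbps: float) -> float:
    """With an SFU, each peer uploads a single stream regardless of call size."""
    return stream_mbps


for n in (5, 10):
    print(n, mesh_upload_mbps(n, 2.5), mesh_upload_mbps(n, 4.0), sfu_upload_mbps(n, 2.5))
# 5 10.0 16.0 2.5
# 10 22.5 36.0 2.5
```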

Selective Forwarding Units solve this by having each participant upload a single stream to the SFU server, which then distributes copies to all subscribers. The SFU doesn’t transcode; it selectively forwards, keeping CPU costs low relative to a Multipoint Control Unit (MCU) that would decode and re-encode every stream. With simulcast enabled, each sender transmits at multiple quality tiers simultaneously (typically 1080p, 720p, and 360p), and the SFU delivers the appropriate tier based on each subscriber’s available bandwidth. A well-provisioned SFU deployment can hold end-to-end latency under 150 ms even with tens of thousands of concurrent participants in webinar-style sessions.
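Tier selection per subscriber can be sketched as a simple threshold check. The tier bitrates and the 80% headroom factor below are assumptions chosen for illustration (only the 2.5 Mbps 1080p figure comes from this article), and production SFUs rely on continuous bandwidth estimation rather than a single number.

```python
from typing import List, Optional, Tuple

# (tier name, assumed bitrate in kbps), ordered from highest to lowest quality
SIMULCAST_TIERS: List[Tuple[str, int]] = [("1080p", 2500), ("720p", 1200), ("360p", 400)]


def pick_tier(available_kbps: float,
              tiers: List[Tuple[str, int]] = SIMULCAST_TIERS,
              headroom: float = 0.8) -> Optional[str]:
    """Return the highest tier that fits within a fraction of the subscriber's bandwidth."""
    budget = available_kbps * headroom
    for name, kbps in tiers:
        if kbps <= budget:
            return name
    return None  # not enough bandwidth for video; an SFU might fall back to audio only


print(pick_tier(5000))   # 1080p
print(pick_tier(2000))   # 720p
print(pick_tier(600))    # 360p
print(pick_tier(300))    # None
```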

For Philippine enterprises evaluating whether cloud or on-premise UC fits their operations, the SFU hosting decision is significant. Self-hosting an SFU in a Manila colocation facility gives you control over latency and data sovereignty, but you absorb the bandwidth costs of every forwarded stream. Cloud-hosted SFUs from platform vendors distribute that load across regional points of presence but introduce dependency on international links that can degrade during undersea cable maintenance windows affecting Philippine connectivity.

The Open Threads

Several questions remain unresolved for enterprise WebRTC deployments in the Philippines. The NTC has not published specific guidance on encrypted WebRTC media traversing Philippine networks, leaving enterprises to interpret existing DICT data privacy frameworks on their own. Interoperability between WebRTC-native UC platforms and legacy SIP trunks still requires Session Border Controllers with protocol translation capabilities, adding a component that many mid-market Philippine businesses don’t budget for.

TURN server placement within the Philippines remains an infrastructure gap. Only 3–4 commercial data centers in Metro Manila offer the low-latency peering and bandwidth density needed for a production TURN cluster, and no equivalent facility exists in Visayas or Mindanao with comparable peering arrangements. Enterprises with distributed operations across those regions face a choice between higher relay latency to Manila-based TURN servers or the cost of deploying edge TURN nodes in locations with limited connectivity options.

The WebRTC standard itself continues to evolve. WebTransport as a signaling alternative to WebSocket, AV1 as a next codec option alongside VP9 and H.264, and Insertable Streams for end-to-end encryption in SFU topologies are all in various stages of browser adoption. Philippine IT teams building UC platforms on WebRTC today should architect their signaling and media layers as separable components, so swapping a TURN provider or adopting a new codec doesn’t require rebuilding the entire stack.

    About

    Kital is an innovative telecom, IP Telephony, and customized solutions provider to small-to-medium-sized businesses and large enterprises in the Philippines.
