RFC 3261 defines the Session Initiation Protocol as a text-based, request-response protocol for creating, modifying, and terminating communication sessions between IP devices. That definition has been stable since 2002. What hasn’t been stable is the way Philippine enterprises actually deploy it — on congested PLDT and Globe circuits, behind carrier-grade NAT, through firewalls that treat UDP port 5060 traffic with suspicion, and into BPO floors where 300 agents generate concurrent SIP sessions that stress-test every assumption the protocol’s designers made.
This article walks through one SIP call transaction, end to end, the way it actually behaves on a Philippine enterprise network. We’re dissecting the signaling, not the audio. By the time media (your actual voice) starts flowing, SIP’s work is already done — and most of the deployment problems your team will encounter live inside that signaling window.
The INVITE and Everything Before Ringback
A SIP call begins with an INVITE request. When an agent on a Fanvil X4U desk phone in a Makati BPO center dials an external number, the phone doesn’t call the number directly. It sends a SIP INVITE message to the on-premise PBX — a Yeastar P570, in the deployment we’re dissecting — which decides how to route it.
The INVITE contains several critical pieces:
- A Request-URI that identifies the called party
- Via headers that track the message’s path so responses can trace it back
- A Call-ID that, together with the From and To tags, identifies this dialog for its entire lifecycle
- A CSeq (command sequence) number that orders transactions within the dialog
- An SDP body (Session Description Protocol) that proposes media parameters: codecs, IP address, port numbers
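To make those fields concrete, here is a minimal Python sketch that splits a raw INVITE into its request line, headers, and SDP body. The sample message is hypothetical: the addresses, tag, branch, and Call-ID are made up for illustration, and several headers a real phone would send (Max-Forwards, Contact, Content-Length) are trimmed for brevity.

```python
# Hypothetical INVITE for illustration only; every address and ID is made up.
SAMPLE_INVITE = (
    "INVITE sip:+63281234567@pbx.example.local SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.168.10.57:5060;branch=z9hG4bK776asdhds\r\n"
    "From: <sip:1001@pbx.example.local>;tag=1928301774\r\n"
    "To: <sip:+63281234567@pbx.example.local>\r\n"
    "Call-ID: a84b4c76e66710@192.168.10.57\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "o=- 2890844526 2890844526 IN IP4 192.168.10.57\r\n"
    "s=-\r\n"
    "c=IN IP4 192.168.10.57\r\n"
    "t=0 0\r\n"
    "m=audio 11780 RTP/AVP 8 0 9\r\n"
)


def parse_sip_request(raw: str) -> dict:
    """Split a SIP request into request line, headers, and body."""
    # A blank line separates the header section from the SDP body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, request_uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, and Via can appear several times.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return {
        "method": method,
        "request_uri": request_uri,
        "version": version,
        "headers": headers,
        "body": body,
    }


if __name__ == "__main__":
    msg = parse_sip_request(SAMPLE_INVITE)
    print(msg["method"], msg["request_uri"])
    print("Call-ID:", msg["headers"]["call-id"][0])
    print("CSeq:", msg["headers"]["cseq"][0])
```

Note that the private address 192.168.10.57 appears in both the Via header and the SDP body, which is exactly what the SBC has to deal with later.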
The PBX inspects the dial plan, determines the call should route through the PLDT SIP trunk, and forwards the INVITE outbound. PLDT Enterprise offers standards-based SIP trunking that integrates voice and data on a single network, so the INVITE travels from the PBX through the Session Border Controller (SBC) and into PLDT’s voice network.
Within about 100 to 200 milliseconds (on a healthy Metro Manila connection), the agent hears ringback tone. But between pressing “Call” and hearing that tone, at least six SIP messages have typically been exchanged across the two legs of the call (phone to PBX, then PBX to trunk): an INVITE, a 100 Trying, and a 180 Ringing on each leg, plus, in many configurations, additional provisional responses or a digest authentication challenge. By this point the PBX has authenticated itself against the trunk, the SBC has rewritten internal IP addresses, and the SDP offer has reached the far end. The answer that completes the media negotiation arrives later, in the 200 OK.

SDP Negotiation and the Codec Handshake
The SDP body inside that initial INVITE is where session initiation gets practical. Your VoIP architecture lives or dies on what gets negotiated here.
The calling phone proposes a list of codecs it supports — typically G.711a, G.711u, G.722, and sometimes G.729. The SDP also declares the IP address and UDP port where the phone wants to receive RTP media. This is the offer side of the offer-answer model.
When the far end responds with a 200 OK, its SDP body contains the answer: which codec it selected from the offered list, and its own IP/port for receiving media. Both sides now have everything they need to open a direct RTP media channel.
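The selection step on the answering side can be sketched in a few lines of Python. This is a simplified illustration, not how any particular PBX implements it: it handles only the static payload types for the codecs named above (PCMA is G.711a, PCMU is G.711u), and it assumes the answerer honors the offerer's preference order, which is common but not universal.

```python
# Static RTP payload type numbers for the codecs discussed in this article.
STATIC_PAYLOAD_TYPES = {0: "PCMU", 8: "PCMA", 9: "G722", 18: "G729"}


def offered_payload_types(sdp: str) -> list:
    """Return the payload type numbers from the first audio m= line."""
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # Layout: m=audio <port> <proto> <fmt> <fmt> ...
            return [int(fmt) for fmt in line.split()[3:]]
    return []


def choose_codec(offer_sdp: str, supported: set):
    """Pick the first offered codec that the answering side supports.

    Assumption: the answerer follows the offerer's ordering. Returns None
    when there is no codec in common.
    """
    for payload_type in offered_payload_types(offer_sdp):
        name = STATIC_PAYLOAD_TYPES.get(payload_type)
        if name in supported:
            return name
    return None


if __name__ == "__main__":
    offer = "v=0\r\nc=IN IP4 192.168.10.57\r\nm=audio 11780 RTP/AVP 8 0 9 18\r\n"
    print(choose_codec(offer, {"PCMU", "G729"}))
```

When `choose_codec` finds nothing in common, the far end has no usable answer to give and the call setup is rejected, which is why a trunk that only accepts G.729 and a phone that only offers G.711 will never complete a call.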
Here’s where Philippine deployments get complicated. The IP address in the SDP body is the phone’s local address — often a 192.168.x.x or 10.x.x.x range behind NAT. If the SBC doesn’t rewrite that private address to the public-facing IP before the INVITE leaves the network, the far end receives an SDP offer with an unreachable IP. The RTP packets arrive at a dead end. The call connects — you see a timer counting up on the phone display — but one or both parties hear silence.
This one-way audio problem is the single most common SIP deployment failure in Philippine enterprises, and it has nothing to do with audio quality, codec selection, or bandwidth. It’s a call signaling problem. The SBC failed to perform proper NAT traversal on the SDP body. If you’ve run into intermittent device offline issues that seem random, the root cause often traces back to SDP handling under NAT, not to registration timeouts.
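As a rough illustration of the rewrite an SBC performs, the Python sketch below scans an SDP body for IPv4 connection lines and replaces any RFC 1918 address with an external one. The function names are hypothetical and the logic is deliberately narrow: a real SBC also anchors media ports, handles the `o=` line and per-media `c=` lines with extra fields, and rewrites SIP headers such as Contact and Via.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """True when the address falls inside an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)


def rewrite_private_connection_lines(sdp: str, external_ip: str) -> str:
    """Replace private addresses in simple 'c=IN IP4 <addr>' lines."""
    rewritten = []
    for line in sdp.split("\r\n"):
        parts = line.split()
        if line.startswith("c=IN IP4 ") and len(parts) == 3 and is_rfc1918(parts[2]):
            line = f"c=IN IP4 {external_ip}"
        rewritten.append(line)
    return "\r\n".join(rewritten)


if __name__ == "__main__":
    offer = "v=0\r\nc=IN IP4 192.168.10.57\r\nm=audio 11780 RTP/AVP 8 0\r\n"
    # 203.0.113.10 is a documentation address standing in for a real public IP.
    print(rewrite_private_connection_lines(offer, "203.0.113.10"))
```

Running the same check against a packet capture of an outbound INVITE is a quick way to confirm whether the private address is leaking past the network edge.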
The ACK That Closes the Handshake
After the 200 OK arrives, the calling side sends an ACK — an acknowledgment that completes the three-way handshake for INVITE transactions. SIP treats INVITE specially because it initiates a long-running dialog. Every other request type (REGISTER, BYE, OPTIONS) uses a simpler two-message exchange. INVITE demands the ACK because the cost of a missed connection setup is too high to leave to UDP’s unreliable delivery.
If the ACK doesn’t reach the server, the server retransmits the 200 OK. And retransmits again. And again, at growing intervals, for up to 32 seconds with RFC 3261’s default timers, after which it should tear the session down. Your PBX logs fill with duplicate 200 OK entries, and the call may be dropped shortly after it appeared to connect. On networks where packet loss exceeds 1% (common during peak hours on shared Philippine ISP circuits), these retransmission storms become measurable. You’ll find them if you’re doing proper network logging before issues cascade.
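The retransmission pattern is predictable enough to compute. The sketch below uses the RFC 3261 default timers (T1 = 500 ms, T2 = 4 s): the interval starts at T1, doubles each time until it reaches T2, and the server gives up after 64 × T1 without an ACK. Your PBX may use different timer values, so treat the output as the textbook schedule rather than what a given Yeastar build does.

```python
T1 = 0.5  # seconds, RFC 3261 default round-trip estimate
T2 = 4.0  # seconds, RFC 3261 default cap on the retransmit interval


def ok_retransmit_times(t1: float = T1, t2: float = T2) -> list:
    """Seconds after the first 200 OK at which it is resent if no ACK arrives."""
    times = []
    elapsed = 0.0
    interval = t1
    give_up_after = 64 * t1
    while elapsed + interval < give_up_after:
        elapsed += interval
        times.append(elapsed)
        # Double the wait each time, but never beyond T2.
        interval = min(interval * 2, t2)
    return times


if __name__ == "__main__":
    schedule = ok_retransmit_times()
    print(len(schedule), "retransmissions at", schedule)
```

Duplicate 200 OK entries in a capture that line up with this spacing are a strong hint that ACKs are being lost or blocked, rather than that the far end is misbehaving.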
Where Philippine Network Conditions Break the Signaling Ladder
The SIP call flow described above assumes clean, low-latency, low-loss connectivity. Philippine enterprise networks frequently deviate from that assumption in specific, predictable ways.
Carrier-Grade NAT and Double NAT
Many Philippine ISP connections, especially on business DSL or fiber plans that aren’t enterprise-grade, put the customer behind carrier-grade NAT (CGNAT). Your office router performs NAT once. The ISP performs NAT again. The SIP INVITE now carries an SDP body with a private IP, passes through your SBC (which rewrites it to your router’s WAN IP — also private), and arrives at the trunk provider with an address that’s still unreachable from the public internet.
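One quick way to tell whether you are in this situation is to look at the WAN address your router was actually assigned. The Python sketch below classifies an address as RFC 1918 private, CGNAT shared space (100.64.0.0/10, reserved by RFC 6598), or neither. The function name and return labels are hypothetical, and an address outside both ranges is only a candidate public address, not proof of one.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_NETWORK = ipaddress.ip_network("100.64.0.0/10")
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def classify_wan_address(address: str) -> str:
    """Label a router WAN address as 'cgnat', 'private', or 'other'."""
    ip = ipaddress.ip_address(address)
    if ip in CGNAT_NETWORK:
        return "cgnat"
    if any(ip in net for net in RFC1918_NETWORKS):
        return "private"
    # 'other' means neither range matched; it may or may not be routable.
    return "other"


if __name__ == "__main__":
    for wan in ("100.72.14.9", "192.168.1.1", "203.0.113.10"):
        print(wan, "->", classify_wan_address(wan))
```

If the router's WAN address comes back as "cgnat" or "private", rewriting SDP to that address cannot fix one-way audio, because the far end still has no route to it.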
The fix is straightforward but often missed: your SBC needs to be configured with the actual public IP that the ISP’s CGNAT maps to, or you need to use STUN (Session Traversal Utilities for NAT) to discover it dynamically. Yeastar PBXes support STUN configuration natively. If your ISP provides a static public IP — ask specifically, because many Philippine ISP “business plans” don’t include one by default — configure the SBC’s external media address to that IP directly and skip STUN entirely.
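For readers curious what STUN discovery involves on the wire, here is a minimal Python sketch of an RFC 5389 Binding Request and the parsing of the XOR-MAPPED-ADDRESS attribute in the reply. It is an illustration under assumptions, not what the Yeastar firmware runs: the server hostname must be supplied by the caller, only IPv4 is handled, and the transaction ID in the response is not verified. Behind CGNAT, the result describes the mapping for this one socket only.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes = b"") -> bytes:
    """Return a 20-byte STUN Binding Request carrying no attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(response: bytes):
    """Return (ip, port) from a Binding Success Response, or None."""
    if len(response) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, min(20 + length, len(response))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        # value layout: reserved byte, family byte (0x01 = IPv4), port, address
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        # Attributes are padded to a multiple of four bytes.
        pos += 4 + attr_len + (-attr_len % 4)
    return None


def discover_mapped_address(server: str, port: int = 3478, timeout: float = 2.0):
    """Query a STUN server chosen by the caller and return what it observed."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_binding_request(), (server, port))
        data, _ = sock.recvfrom(2048)
    return parse_xor_mapped_address(data)
```

Calling `discover_mapped_address` with a STUN server you trust returns the public IP and port the outside world sees, which you can compare against the address your SBC is advertising in SDP.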

Firewall Rules That Block Responses
SIP signaling typically runs on UDP port 5060 (or TCP port 5061 for TLS-encrypted signaling). RTP media runs on a dynamic range, commonly UDP 10000 through 20000. Philippine IT teams frequently open port 5060 outbound but forget that SIP is bidirectional. The trunk provider sends responses (180 Ringing, 200 OK) and its own requests (inbound INVITEs, BYEs) back to your PBX. If your Fortinet or Cisco firewall has no rule or open session that admits that return traffic, or worse, if it has a broken SIP ALG that mangles the headers, those messages never arrive.
The irony: many firewall vendors’ SIP ALG implementations are so buggy that the standard advice in the VoIP industry is to disable SIP ALG entirely and handle NAT traversal at the SBC instead. On FortiGate firewalls running FortiOS 7.x, disabling the SIP ALG and relying on the Yeastar SBC’s own NAT handling produces more consistent results than letting the firewall inspect and modify SIP packets.
Warning: If you’re running a FortiGate firewall and experiencing intermittent call failures, check whether the SIP ALG is enabled. Disable it, then configure your PBX’s SBC module for direct NAT traversal. This single change resolves a disproportionate number of SIP signaling issues on Philippine enterprise networks.
Registration Keepalives on Unstable Links
SIP devices maintain their connection to the PBX through periodic REGISTER messages. The default registration expiry in RFC 3261 is 3600 seconds (one hour), but on Philippine networks where ISP connections drop briefly and reconnect with a new IP, an hour is far too long. By the time the phone re-registers, it may have been unreachable for most of that hour.
Best practice for Philippine deployments: set the registration expiry to 120 to 300 seconds. Yes, this increases signaling traffic. On a 200-phone deployment, you’ll see roughly 40 to 100 additional REGISTER transactions per minute. That’s trivial bandwidth — a few kilobits per second — but it means your phones re-register within two to five minutes of any IP change.
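The arithmetic behind those figures is easy to reproduce. In the Python sketch below, the 1,200 bytes per REGISTER transaction (request plus 200 OK) is an assumed round number for illustration, not a measured value. Real sizes vary with headers and transport, a digest authentication challenge roughly doubles the message count, and many phones refresh somewhat before the expiry elapses.

```python
def registers_per_minute(phones: int, expiry_seconds: int) -> float:
    """Average REGISTER transactions per minute if each phone refreshes once per expiry."""
    return phones * 60 / expiry_seconds


def signaling_kbps(phones: int, expiry_seconds: int, bytes_per_transaction: int = 1200) -> float:
    """Approximate REGISTER bandwidth in kilobits per second.

    bytes_per_transaction is an assumption (request plus response),
    not a measured figure.
    """
    transactions_per_second = phones / expiry_seconds
    return transactions_per_second * bytes_per_transaction * 8 / 1000


if __name__ == "__main__":
    for expiry in (3600, 300, 120):
        rate = registers_per_minute(200, expiry)
        kbps = signaling_kbps(200, expiry)
        print(f"expiry={expiry}s: {rate:.1f} REGISTER/min, about {kbps:.1f} kbps")
```

Under these assumptions, even the 120-second setting stays in the low tens of kilobits per second for 200 phones, which is negligible next to the RTP load of the calls themselves.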
TLS, SRTP, and the Encryption Split
SIP as originally designed sends everything in plaintext. The INVITE, the SDP body with its IP addresses and port numbers, the authentication credentials (hashed, but with predictable challenge-response patterns) — all visible to anyone capturing packets on the network path.
Philippine enterprises handling sensitive communications — hospitals, banks, government agencies covered by NTC Memorandum Circular No. 05-08-2005 and DICT security guidelines — need to understand that SIP security splits into two separate layers.
Signaling encryption uses TLS (Transport Layer Security). Instead of sending SIP messages over UDP port 5060, the PBX connects to the trunk provider over TCP port 5061 with a TLS certificate exchange. This protects the INVITE, the authentication headers, and the SDP body from eavesdropping.
Media encryption uses SRTP (Secure Real-Time Transport Protocol). This protects the actual audio stream. TLS on the signaling layer doesn’t automatically encrypt the media. You need both.
International SIP trunk providers serving Philippine numbers (AVOXI, for example) advertise TLS and SRTP encryption, a mean opinion score (MOS) of 4.2, and 99.995% uptime. PLDT Enterprise’s SIP trunking supports TLS as well, though you’ll need to confirm certificate compatibility with your specific PBX model during provisioning.
On Yeastar P-Series PBXes, enabling TLS for SIP trunks requires uploading the provider’s CA certificate and configuring the trunk transport to TLS. SRTP is a separate checkbox per trunk. Don’t assume one implies the other.
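Because the two layers are independent, it helps to check them independently. The Python sketch below reports signaling and media encryption separately, given the trunk transport and the SDP that was actually negotiated. The function names are hypothetical, and the media check looks only at the transport profile on the audio `m=` line; it does not cover endpoints that signal optional SRTP through `a=crypto` lines under a plain RTP/AVP profile.

```python
# RTP profiles that indicate encrypted media (SDES-keyed or DTLS-keyed SRTP).
SECURE_PROFILES = (
    "RTP/SAVP",
    "RTP/SAVPF",
    "UDP/TLS/RTP/SAVP",
    "UDP/TLS/RTP/SAVPF",
)


def media_is_srtp(sdp: str) -> bool:
    """True when the SDP has audio and every audio m= line uses a secure profile."""
    audio_lines = [line for line in sdp.splitlines() if line.startswith("m=audio ")]
    if not audio_lines:
        return False
    # Layout: m=audio <port> <proto> <fmt> ...
    return all(line.split()[2] in SECURE_PROFILES for line in audio_lines)


def encryption_report(transport: str, sdp: str) -> dict:
    """Report the two encryption layers separately; neither implies the other."""
    return {
        "signaling_encrypted": transport.upper() == "TLS",
        "media_encrypted": media_is_srtp(sdp),
    }


if __name__ == "__main__":
    plain_media = "v=0\r\nm=audio 11780 RTP/AVP 8 0\r\n"
    print(encryption_report("TLS", plain_media))
```

A result with `signaling_encrypted` true and `media_encrypted` false is the exact gap described above: the call setup is protected, but the audio is not.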

For enterprises in regulated industries, this dual-layer requirement matters because compliance audits may check both independently. If your signaling is encrypted but your media isn’t, an attacker with network access can still record calls. If you’re designing VoIP infrastructure for government agencies, the encryption question needs explicit answers for both layers.
The Signaling Architecture That Survives Philippine Conditions
The SIP transaction we’ve walked through — INVITE, codec negotiation, NAT traversal, ACK, media flow, BYE — looks clean on a whiteboard. On a Philippine enterprise network running through PLDT or Globe circuits, with CGNAT, inconsistent firewall firmware, and ISP maintenance windows that coincide with BPO night shifts, it gets messy fast.
The deployments that work reliably share a few architectural choices:
Voice and data traffic live on separate VLANs. The SIP signaling and RTP media get their own network segment with QoS policies that prioritize voice packets. DSCP marking EF (Expedited Forwarding, decimal 46) on voice traffic gives it priority through every managed switch between the phone and the SBC.
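The number 46 is the DSCP value, but what lands in the IP header's legacy ToS byte is that value shifted left by two bits. The Python sketch below shows the conversion and how an application might request the marking on a UDP socket. Treat the socket part as an assumption about your platform: it works on typical Linux hosts, while some operating systems ignore or restrict `IP_TOS`, and switches must still be configured to trust the marking.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding


def dscp_to_tos(dscp: int) -> int:
    """Convert a 6-bit DSCP value to the 8-bit ToS/Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    # DSCP occupies the upper six bits; the lower two bits are ECN.
    return dscp << 2


def open_marked_udp_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Open a UDP socket whose outgoing packets request the given DSCP marking."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print("EF as a ToS byte:", dscp_to_tos(DSCP_EF), hex(dscp_to_tos(DSCP_EF)))
```

When you inspect a capture, voice packets marked EF show a ToS byte of 0xB8; anything else on the RTP stream means the marking was never applied or was stripped somewhere along the path.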
The SBC sits at the network edge, between your PBX and the ISP uplink. It handles NAT rewriting, topology hiding (so your internal IP scheme never appears in outbound SIP headers), and rate limiting against SIP scanning attacks. On Yeastar systems, the built-in SBC module handles this. Larger deployments — 500+ seats, multiple branch offices — benefit from a dedicated SBC appliance from AudioCodes or Ribbon.
Registration intervals are shortened to match the realities of Philippine ISP stability. Failover trunks from a second provider (TELNOVO, for example, or an international SIP trunk as backup) ensure that when the primary trunk goes down, calls reroute within seconds rather than minutes.
And monitoring watches the signaling layer specifically. Not CPU utilization on the PBX. Not mean opinion scores on completed calls. The signaling itself: REGISTER success rates, INVITE-to-200-OK latency, ACK delivery rates, BYE completion percentages. When these metrics drift, you catch problems before anyone picks up a phone and hears nothing.
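As a sketch of what signaling-level monitoring can compute, the Python below derives INVITE-to-200-OK latency and ACK delivery rate from a list of parsed events. The `SipEvent` record and its `kind` labels are hypothetical; in practice you would populate them from your PBX logs or a packet capture, and the field names will depend on that source.

```python
from dataclasses import dataclass


@dataclass
class SipEvent:
    """Hypothetical record of one observed SIP message."""
    timestamp: float  # seconds since some reference point
    call_id: str
    kind: str         # e.g. "INVITE", "INVITE-200", "ACK"


def invite_to_ok_latencies(events: list) -> dict:
    """Seconds from the first INVITE to the first 200 OK, per Call-ID."""
    invite_at = {}
    latencies = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.kind == "INVITE":
            # Keep the first INVITE only; retransmissions should not reset the clock.
            invite_at.setdefault(ev.call_id, ev.timestamp)
        elif ev.kind == "INVITE-200" and ev.call_id in invite_at:
            latencies.setdefault(ev.call_id, ev.timestamp - invite_at[ev.call_id])
    return latencies


def ack_delivery_rate(events: list) -> float:
    """Fraction of answered calls for which an ACK was observed."""
    answered = {e.call_id for e in events if e.kind == "INVITE-200"}
    acked = {e.call_id for e in events if e.kind == "ACK"}
    if not answered:
        return 1.0
    return len(answered & acked) / len(answered)


if __name__ == "__main__":
    sample = [
        SipEvent(0.0, "a", "INVITE"),
        SipEvent(2.5, "a", "INVITE-200"),
        SipEvent(2.6, "a", "ACK"),
        SipEvent(1.0, "b", "INVITE"),
        SipEvent(4.0, "b", "INVITE-200"),
    ]
    print(invite_to_ok_latencies(sample))
    print(ack_delivery_rate(sample))
```

In the sample data, call "b" is answered but never acknowledged, so the ACK delivery rate drops to one half. That is the kind of drift worth alerting on before users start reporting dead air.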
SIP as defined in RFC 3261 has been in service for more than two decades. The VoIP architecture built on it is mature, well-documented, and entirely capable of handling enterprise-scale Philippine deployments. The failures almost always live in the gap between how the protocol expects the network to behave and how Philippine networks actually behave. Understanding call signaling at the transaction level (knowing what each message does, where it can break, and why) is the difference between a VoIP deployment that works on the vendor’s demo and one that works at 2 AM on a Makati BPO floor with 300 concurrent calls.



