VoIPmageddon: Is Quality Leading to a Telephony Meltdown?
As the number of VoIP endpoints reaches critical mass, we need to address the quality issues that threaten to obliterate the chance of a good call.
All of the talk about Snowmageddon 2015 has me thinking about the Armageddon we could be heading toward in enterprise communications over degrading voice quality in the post-VoIP conversion world.
VoIP-driven quality issues, whether unacceptable latency, line noise, garbled speech, or dropped calls, have popped into recent conversations I've had with cloud telephony and conferencing providers, a few enterprises, and a number of other industry participants. And although my sampling may be unscientific, to me it seems to show a significant increase not only in the number of issues but also in the percentage of calls having those issues.
So I've been thinking: Is VoIP adoption beginning to create a larger percentage of low-quality/dropped calls? Are we approaching VoIPmageddon?
While most of the VoIP community sees VoIP as a basic service, the reality is that delivering quality voice over an IP-based implementation is challenging and can be fraught with issues. Many of the quality issues that arose during VoIP's 10-year boom (2000 to 2010) were resolved in the implementations of the period. That's led to the common thinking that using some form of quality-of-service (QoS) mechanism solves all problems. However, many other factors can cause voice quality issues.
Are You Still There?
A good understanding of VoIP quality starts with knowing how various factors affect voice calls. The first is latency, which impacts how we interact.
When we talk over distance, we pause to let the other person speak. If latency is too high, then the first speaker starts talking again before the second speaker's voice gets back. This results in a "collision," which feels like an interruption (for more information on this process, please read the PKE Consulting white paper, Making IP Networks Voice Enabled). In a traditional voice network, the TDM process assures low latency within the local loop as well as across long distances.
In the TDM infrastructure, the voice stream is sampled once every 125 microseconds, and every 125 microseconds the sample is moved to the next stage in the path. This results in very low latency. In the local loop, we typically see round-trip latency of less than 10 to 15 milliseconds (msec) on a TDM voice call. Long-haul latency also accounts for sampling (a few samples at 125 microseconds) plus fiber propagation and echo cancellation. So, the round-trip latency for a cross-country TDM call in the U.S. has four components: 30 msec maximum for the local loop (the local loop times two crossings), maybe another 5 msec for sampling the forward timing in the long-distance path, a couple of milliseconds for echo cancellation there and back, and propagation delay of about 16 msec in each direction (with a propagation speed in fiber of 85% of the speed of light and a 2,500-mile distance). In sum, total round-trip cross-country latency is about 84 msec.
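The propagation figure above can be checked with a few lines of Python. This is a minimal sketch that uses only the article's stated assumptions (fiber propagation at 85% of the speed of light, a 2,500-mile path); the function name and parameters are illustrative, not taken from any telephony library.

```python
# Sketch: one-way propagation delay over fiber, using the article's assumptions.
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum
METERS_PER_MILE = 1609.344


def one_way_propagation_ms(distance_miles, velocity_factor=0.85):
    """Return the one-way propagation delay in msec.

    velocity_factor is the fraction of the speed of light assumed for
    the fiber path (0.85 is the figure used in the article).
    """
    distance_m = distance_miles * METERS_PER_MILE
    speed_m_per_s = velocity_factor * SPEED_OF_LIGHT_M_PER_S
    return distance_m / speed_m_per_s * 1000.0


cross_country = one_way_propagation_ms(2500)
print(f"{cross_country:.1f} msec each direction")  # about 15.8 msec
```

The result rounds to the 16 msec in each direction quoted above. Because the delay scales linearly with distance, the 12,500-mile path works out to five times that figure in each direction.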
Considering the same factors, placing a TDM voice call halfway around the world -- 12,500 miles -- would result in latency of about 220 msec. This is well below the latency threshold of 275 to 300 msec we accept in natural conversation. Even adding in a conferencing bridge delay of an additional 5 msec in each direction results in acceptable overall delay.
In VoIP, delay is typically much higher because of the way the voice stream gets broken into packets. In a typical VoIP connection, at least three packets flow in each direction, resulting in at least 60 msec but more likely 80 msec or more of packet delay in each direction. In addition, protocol delays, router forwarding, codec time, and other network processes affect latency.
So, on a LAN with essentially zero propagation delay, the typical round-trip latency for an end-to-end VoIP call is around 130 to 170 msec (see the white paper for more details). Consider our 2,500-mile, cross-country scenario, and total round-trip latency increases to 160 to 200 msec due to the transmission time. For our halfway around the world call, due to the transmission time increase, the total is now 270 to 300 msec, right at the edge of our ability to perceive it in conversation. Of course, latency will increase even more if the IP path is indirect, looping back based on peering locations.
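As a rough illustration of how distance stacks on top of the LAN baseline, the sketch below adds round-trip fiber propagation to the 130 to 170 msec LAN figure. It is a simplified model under the article's assumptions (85% of the speed of light in fiber, a direct path); the names `LAN_ROUND_TRIP_MS` and `voip_round_trip_ms` and the `extra_ms` parameter are hypothetical, introduced only for this example.

```python
# Sketch: VoIP round-trip latency as LAN baseline plus fiber propagation.
SPEED_OF_LIGHT_M_PER_S = 299_792_458
METERS_PER_MILE = 1609.344
LAN_ROUND_TRIP_MS = (130, 170)  # low/high LAN round trip quoted in the article


def voip_round_trip_ms(distance_miles, velocity_factor=0.85, extra_ms=0.0):
    """Return (low, high) round-trip latency in msec for a direct IP path.

    extra_ms is a placeholder for anything the simple model leaves out,
    such as indirect routing through distant peering points.
    """
    one_way_ms = (distance_miles * METERS_PER_MILE
                  / (velocity_factor * SPEED_OF_LIGHT_M_PER_S) * 1000.0)
    low, high = LAN_ROUND_TRIP_MS
    return (low + 2 * one_way_ms + extra_ms,
            high + 2 * one_way_ms + extra_ms)


low, high = voip_round_trip_ms(2500)
print(f"{low:.0f} to {high:.0f} msec")  # 162 to 202 msec
```

For the cross-country case this lands on the 160 to 200 msec range given above. An indirect IP path can be approximated by passing a larger distance or a nonzero `extra_ms`.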
Now let's look at what happens when the call is not VoIP end to end but instead crosses the TDM-based PSTN as it moves from one VoIP network to another. This happens when two enterprises have deployed VoIP, but one or the other (or both) is using a TDM trunk to the PSTN. In this case, packetizing the voice for each VoIP segment, there and back, accounts for additional latency. The result is that the round-trip latency, even when the PSTN connection is local loop and both PBXes are on high-speed LANs, increases to 250 to 320 msec. As the distance increases, this number grows. For the 2,500-mile call (30 msec additional round trip plus 5 msec for router hops), the latency is now over the 275 to 300 msec threshold of perceived latency. The same holds for our international VoIP calling scenario, in which latency would reach well over 300 msec.
Conferencing in the VoIP domain has the same sort of results. When we join two IP networks by the PSTN or a conference bridge, even in the best cases, we approach the threshold of perception. If we add an additional packet into each of the jitter buffers, the resulting 80 msec (one 20-msec sample in each of the four IP domains round trip) takes the latency up closer to 400 msec, which makes natural interactive speech very challenging and will significantly degrade a provider's mean opinion score, or MOS.
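The jitter-buffer arithmetic can be sketched in a few lines. The example below assumes 20-msec packets and four IP-domain crossings per round trip, as described above, and compares the result with the 275 to 300 msec perception threshold; the function names and the three-way classification are illustrative choices, not an industry-standard scoring method.

```python
# Sketch: added jitter-buffer delay and a check against the perception threshold.
PERCEPTION_THRESHOLD_MS = (275, 300)  # range quoted in the article
FRAME_MS = 20                         # one packet carries 20 msec of voice


def added_jitter_buffer_ms(extra_packets_per_buffer, ip_domain_crossings=4,
                           frame_ms=FRAME_MS):
    """Round-trip delay added by deepening every jitter buffer on the path."""
    return extra_packets_per_buffer * ip_domain_crossings * frame_ms


def classify(round_trip_ms):
    """Place a round-trip latency relative to the perception threshold."""
    low, high = PERCEPTION_THRESHOLD_MS
    if round_trip_ms < low:
        return "below threshold"
    if round_trip_ms <= high:
        return "at threshold"
    return "above threshold"


base_ms = 320  # upper end of the PSTN-bridged local case (250 to 320 msec)
total_ms = base_ms + added_jitter_buffer_ms(1)
print(total_ms, classify(total_ms))  # 400 above threshold
```

One extra packet per buffer adds 80 msec, which takes the upper end of the PSTN-bridged case to 400 msec, well past the point where conversation still feels natural.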
This is all to say that VoIP calling works well when the communication is IP end to end or, if not end-to-end VoIP, then when only one end uses the PSTN. The potential for perceptible latency goes way up when the PSTN sits between two VoIP networks or during a conference call. While a conference bridge can mitigate this somewhat with jitter buffer management and error correction, in the PSTN/TDM case, the operation is defined by the requirement to minimize underruns with larger jitter buffers and the need for gateway error corrections. Using SIP trunking between two IP domains can eliminate the TDM-added latency, of course, but SIP trunks still make up a relatively low percentage of all trunks.
How this latency plays out for an enterprise can lead to challenges. When using a conference bridge with multiple VoIP domains and paths that often loop back on each other, for example, two speakers may be 400 to 500 msec apart. This will result in a relatively poor and inconsistent experience for participants, with latency varying based on the type of network they've used to dial in to the bridge.