Solving the Pain of VoIP Quality
VoIP quality problems are transient and ever changing. A voice quality problem can be so short-lived that by the time a troubleshooter attempts to diagnose it, network conditions have changed. As enterprises move to Unified Communications with conferencing traffic, especially video, the capacity and performance demands on the IP network will increase.
I had the opportunity to host a webcast, "Solving the Pain of VoIP Quality," with Psytechnics. The thrust of the webcast was "what you don't measure, you can't manage." During the webcast, we went beyond the traditional approaches to voice quality issues as viewed by network troubleshooters. What makes this presentation different is that it shows why measurements of the data side of a voice call are insufficient for determining what the voice quality problem is in human terms.
An early standard that attempts to measure voice quality is the RTP Control Protocol (RTCP), which reports on data network impairments. The impairment measurements are then used to calculate a prediction of the impact of the IP network on the voice Mean Opinion Score (MOS). MOS is the average of the opinions given by a group of people (subjects) for a given example of voice quality in a subjective test. The subjects typically give their opinions on the ITU-T P.800 opinion scale: excellent (5), good (4), fair (3), poor (2), bad (1). A MOS prediction is then a numeric estimate of voice quality on the same scale, where 5 = perfect, 4.4 = toll quality, and 3.5 = marginally acceptable voice quality. The RTCP measurements include:
- RTP time stamp
- Packet loss
- Sequence number
RTCP is a good start, but does not provide nearly enough information to really determine the call quality from a human perspective.
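To make the step from network impairments to a predicted MOS concrete, here is a minimal Python sketch. It is not the method presented in the webcast. It assumes a simplified form of the ITU-T G.107 E-model: a default R-factor of 93.2, a commonly cited linear approximation for delay impairment, and equipment impairment values (Ie = 0, Bpl = 25.1) that are assumed here for G.711 with packet loss concealment. All function names are hypothetical.

```python
def loss_percent(first_seq: int, last_seq: int, received: int) -> float:
    """Percent of packets lost, derived from RTP sequence numbers.

    Assumes no sequence number wraparound within the interval.
    """
    expected = last_seq - first_seq + 1
    lost = max(expected - received, 0)
    return 100.0 * lost / expected


def delay_impairment(one_way_delay_ms: float) -> float:
    """Simplified delay impairment Id (an approximation, not full G.107)."""
    d = one_way_delay_ms
    penalty = 0.11 * (d - 177.3) if d > 177.3 else 0.0
    return 0.024 * d + penalty


def loss_impairment(ppl: float, ie: float = 0.0, bpl: float = 25.1,
                    burst_r: float = 1.0) -> float:
    """Effective equipment impairment Ie-eff for a packet loss of ppl percent.

    ie and bpl are codec-dependent; the defaults are assumed values
    for G.711 with packet loss concealment. burst_r = 1 means random loss.
    """
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


def r_to_mos(r: float) -> float:
    """Map an E-model R-factor to an estimated MOS (G.107 conversion)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    return max(mos, 1.0)  # the polynomial dips just below 1 for very small R


def predict_mos(ppl: float, one_way_delay_ms: float) -> float:
    """Estimate MOS from packet loss (percent) and one-way delay (ms)."""
    r = 93.2 - delay_impairment(one_way_delay_ms) - loss_impairment(ppl)
    return r_to_mos(r)


if __name__ == "__main__":
    ppl = loss_percent(first_seq=1000, last_seq=1099, received=98)
    print(f"loss: {ppl:.1f}%")
    print(f"estimated MOS: {predict_mos(ppl, one_way_delay_ms=150):.2f}")
```

With no loss and no delay, this sketch tops out at a MOS of about 4.4, which lines up with the toll quality figure quoted above. Note that everything fed into it is a network measurement: nothing in the calculation knows about echo, speech level, or distortion.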
A newer standard, RTCP XR (Extended Reports), adds several more measured elements:
- Packet loss and discard rates
- Burst length and density
- Gap length and density
- Packet path, end system and round trip delays
- Signal, noise and echo levels
- Jitter buffer configuration
- Packet Loss Concealment (PLC) type
This is an improvement: the additional metrics produce a better estimate of the IP network's impact on MOS and give more information about the factors causing a voice quality issue. RTCP XR also provides fields for edge devices to report a locally calculated MOS. When those fields are used, it is better to rely on a standardized measure than on a proprietary approach. Still, RTCP XR is not the human ear. No listener is going to describe a voice quality complaint in terms of most of these measurements. There have also been cases where a vendor claims to support RTCP XR but does not populate all of the fields.
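As a rough illustration of what the burst and gap metrics capture, the sketch below groups losses into bursts using a threshold of 16 consecutive received packets, the value RFC 3611 recommends for Gmin. This is a simplified illustration and not the RFC 3611 reference algorithm: it ignores discards, treats a lone lost packet as a gap loss, and takes a hypothetical list of booleans (True = received) as input.

```python
from typing import List, Tuple

GMIN = 16  # consecutive received packets that end a burst (RFC 3611 default)


def find_bursts(received: List[bool], gmin: int = GMIN) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for each burst.

    Losses separated by fewer than gmin received packets are grouped.
    A group holding a single loss is not reported as a burst.
    """
    bursts: List[Tuple[int, int]] = []
    start = prev = -1
    count = 0
    for i, ok in enumerate(received):
        if ok:
            continue
        if count and i - prev - 1 < gmin:
            # Close enough to the previous loss: extend the current group.
            prev = i
            count += 1
        else:
            # Far from the previous loss: close the old group, start a new one.
            if count > 1:
                bursts.append((start, prev))
            start = prev = i
            count = 1
    if count > 1:
        bursts.append((start, prev))
    return bursts


def burst_gap_density(received: List[bool], gmin: int = GMIN) -> Tuple[float, float]:
    """Return (burst_density, gap_density) as fractions of packets lost."""
    bursts = find_bursts(received, gmin)
    burst_len = sum(end - start + 1 for start, end in bursts)
    burst_lost = sum(
        1 for start, end in bursts for i in range(start, end + 1) if not received[i]
    )
    total_lost = sum(1 for ok in received if not ok)
    gap_len = len(received) - burst_len
    gap_lost = total_lost - burst_lost
    burst_density = burst_lost / burst_len if burst_len else 0.0
    gap_density = gap_lost / gap_len if gap_len else 0.0
    return burst_density, gap_density


if __name__ == "__main__":
    # 20 good packets, a short lossy stretch, 20 good, one lone loss, 20 good.
    trace = [True] * 20 + [False, True, False, False] + [True] * 20 + [False] + [True] * 20
    print(find_bursts(trace))
    print(burst_gap_density(trace))
```

The point of separating the two densities is that the same average loss rate sounds very different depending on whether the losses are clustered or spread out, which a single packet loss percentage cannot show.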
Here are other factors that will influence voice quality that are not part of the RTCP XR measurements:
- CODEC type (G.711, G.729, G.722)
- Packet size (20 ms, 30 ms, 40 ms)
- Silence suppression (VAD) competition
- Clipping during silence suppression
- Link utilization and low speed transmission
- Adaptive jitter buffer operation
- Competing with data traffic
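To show how codec choice and packet size interact with link utilization, here is a small Python sketch that estimates the IP bandwidth of one direction of a call. It assumes 40 bytes of IPv4, UDP and RTP headers per packet, no header compression, and no silence suppression. Layer 2 overhead is left as an optional parameter because it depends on the link type. The function name is hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 40  # 20 (IPv4) + 8 (UDP) + 12 (RTP), assumed uncompressed


def call_bandwidth_kbps(codec_bitrate_kbps: float, packet_ms: float,
                        l2_overhead_bytes: int = 0) -> float:
    """Estimate one-way bandwidth of a voice stream in kbit/s.

    codec_bitrate_kbps: codec payload rate, e.g. 64 for G.711, 8 for G.729
    packet_ms: milliseconds of speech carried in each packet
    l2_overhead_bytes: optional per-packet link-layer overhead
    """
    payload_bytes = codec_bitrate_kbps * 1000 / 8 * packet_ms / 1000
    packets_per_second = 1000 / packet_ms
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return packet_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    for name, rate in (("G.711", 64), ("G.729", 8)):
        for ms in (20, 30, 40):
            kbps = call_bandwidth_kbps(rate, ms)
            print(f"{name} at {ms} ms: {kbps:.1f} kbit/s")
```

Under these assumptions a G.711 call at 20 ms uses about 80 kbit/s at the IP layer, while G.729 at 20 ms uses about 24 kbit/s, three times its nominal 8 kbit/s codec rate. Larger packets reduce the header overhead but add delay and make each lost packet more audible, which is why packet size belongs on the list above.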
Further factors directly determine voice quality but are not caused by the IP transport, so they go entirely unmeasured by IP network measurements. They include:
- Echo (hybrid and acoustic)
- Speech levels (volume)
- Speech distortion
The ITU has a group of standards for both the network side and the analog (listener) side of the voice call as shown in the following figure.
NB = narrowband, the classic voice bandwidth of about 3.4 kHz; WB = wideband, about 7 kHz of bandwidth; PESQ = Perceptual Evaluation of Speech Quality.
These ITU standards, which were initially developed for service providers, are better than the RTCP XR standard for reporting voice quality. Using them makes it easier to compare voice quality experiences across many networks and VoIP products. Some vendors use non-standard approaches, but these make comparing call quality more difficult because each calculates MOS with a different algorithm. Standard methods allow VoIP vendors and service providers to offer universal quantitative comparisons.