Network Visibility: Using Active Path Testing
How do you know your network is properly handling SIP voice?
You've updated your voice calling system. It uses SIP trunking, you've saved money, and everyone is happy. But a few nagging problems let you know it's not working smoothly all the time. How do you tell whether the network is at fault?
By now, you should know the factors that affect voice quality: latency, jitter, and packet loss. Except in some cases like satellite circuits, latency should be low, perhaps 100 to 200 milliseconds at most. Jitter should likewise be low, unless congestion is driving big changes in queue depths at multiple points in the path. Packet loss may be a significant factor as congestion fills buffers in the networking equipment along the path.
How do you know the network is running smoothly? How do you know a voice problem is external to the parts of the network path you administer? You can't answer these questions if you don't have good network monitoring instrumentation.
To start, the network monitoring system should be monitoring all interfaces in the paths over which voice may travel. That means the monitoring system has to be inexpensive enough to cover every interface, and it has to be correctly configured to poll all of them. I've seen too many implementations in which cost has limited the use of the monitoring system such that some network interfaces go unmonitored. The result is a lack of visibility into potential causes of packet loss.
I like to instrument a network to record interface errors and drops. Errors indicate a network interface or media problem, like a dirty optical connection or a noisy WAN circuit. Drops occur when a network interface's buffers fill and another packet needs to transit that interface. Congestion on egress is significantly more common than congestion on ingress. Investigate interfaces that have more than 0.0001% packet loss (that's a loss ratio of 1x10^-6, or one packet in a million), and fix those with errors. Drops require other measures, which I'll cover below.
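As a rough illustration of that threshold check, here is a minimal Python sketch. It assumes you have already collected two counter samples per interface (for example, from periodic SNMP polling). The CounterSample fields and the flag_interfaces helper are hypothetical and not taken from any particular monitoring product. It also ignores counter wrap and treats the packet counter as the total offered to the interface.

```python
from dataclasses import dataclass

# 0.0001% expressed as a fraction: one lost packet per million.
LOSS_THRESHOLD = 1e-6


@dataclass
class CounterSample:
    packets: int  # total packets offered to the interface (assumed)
    errors: int   # packets lost to interface or media errors
    drops: int    # packets discarded because buffers were full


def loss_ratios(prev, curr):
    """Return (error_ratio, drop_ratio) for the interval between two samples."""
    delta_packets = curr.packets - prev.packets
    if delta_packets <= 0:
        # No traffic (or a counter reset): nothing meaningful to report.
        return 0.0, 0.0
    error_ratio = (curr.errors - prev.errors) / delta_packets
    drop_ratio = (curr.drops - prev.drops) / delta_packets
    return error_ratio, drop_ratio


def flag_interfaces(samples, threshold=LOSS_THRESHOLD):
    """Return (name, total_loss_ratio, dominant_cause) tuples, worst first."""
    flagged = []
    for name, (prev, curr) in samples.items():
        error_ratio, drop_ratio = loss_ratios(prev, curr)
        total = error_ratio + drop_ratio
        if total > threshold:
            cause = "errors" if error_ratio >= drop_ratio else "drops"
            flagged.append((name, total, cause))
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Separating the dominant cause matters because the two call for different responses: errors point at the interface or the media, while drops point at congestion.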
I also like to get reports on call data records and call maintenance records from the call controller, and look for calls that have low mean opinion scores (MOS). A calculated MOS uses latency, jitter, and packet loss to arrive mathematically at a score for the call, so a breakdown of each factor helps identify the type of problem that needs investigation.
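To make the idea of a calculated MOS concrete, here is a simplified Python sketch. The final step, converting an R-factor to MOS, follows the mapping in ITU-T G.107. Everything before it is a common rule-of-thumb approximation rather than the full E-model: the "effective latency" weighting, the 10 ms allowance, and the 2.5 points deducted per percent of loss are all assumptions, and your call controller almost certainly uses its own method.

```python
def estimate_mos(latency_ms, jitter_ms, loss_pct):
    """Estimate MOS from one-way latency (ms), jitter (ms), and loss (%)."""
    # Fold jitter into an "effective latency". The 2x weight and the
    # 10 ms processing allowance are rule-of-thumb assumptions.
    effective_latency = latency_ms + 2.0 * jitter_ms + 10.0

    # Delay impairment: gentle below 160 ms, much steeper above it.
    if effective_latency < 160.0:
        r = 93.2 - effective_latency / 40.0
    else:
        r = 93.2 - (effective_latency - 120.0) / 10.0

    # Loss impairment: assumed 2.5 R-factor points per percent of loss.
    r -= 2.5 * loss_pct
    r = max(0.0, min(100.0, r))

    # R-factor to MOS conversion (ITU-T G.107), floored at 1.0.
    mos = 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)
    return max(1.0, mos)
```

Re-running the estimate with one factor set to zero shows how much that factor contributes to a poor score, which is the kind of breakdown that points you at the right investigation.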
Finally, I've found tracking call paths useful. I have firsthand examples where figuring out the call paths for poor video quality took a long time, as I discussed in my initial No Jitter post, "Know The Path Your Media Sessions Take." While we were working out what caused the performance issues in the two examples from that post, it would have helped to drop a test system into each of the two video facilities and measure the real network paths. That would have pushed us to make sure the test traffic went through the same media concentrators as the video, which would have let us identify the core problems quickly.
Networks are too big to examine one element at a time. I like to use the network management product's reporting system to sort the test results in Top-N reports that identify the worst offenders. I can then focus on the worst problems. I also look at the Top-N report to determine commonalities. Is common infrastructure involved? Does it always involve a certain site? Does a Top-N problem call report correlate with an interface in a Top-N drops report? Using the network management reporting tools to identify problem spots is very efficient.
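If your reporting tool can export its data, the Top-N and correlation questions above can be sketched in a few lines of Python. This is only an illustration. The record layout (dictionaries with "id", "mos", "path", "name", and "drops" keys) and the function names are hypothetical, and it assumes you already know which interfaces each call traversed.

```python
import heapq


def worst_calls(calls, n=10):
    """Top-N problem calls: the n calls with the lowest MOS."""
    return heapq.nsmallest(n, calls, key=lambda call: call["mos"])


def worst_interfaces(interfaces, n=10):
    """Top-N drops report: the n interfaces with the most drops."""
    return heapq.nlargest(n, interfaces, key=lambda intf: intf["drops"])


def correlate(bad_calls, bad_interfaces):
    """Map each Top-N interface to the poor calls whose path crossed it."""
    flagged = {intf["name"] for intf in bad_interfaces}
    hits = {}
    for call in bad_calls:
        for hop in call["path"]:
            if hop in flagged:
                hits.setdefault(hop, []).append(call["id"])
    return hits
```

An interface that appears in the result with many call IDs attached is a good candidate for the common infrastructure behind a cluster of poor calls.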