Terry Slattery | September 04, 2014

QoS - It Really Is Important

Tying together QoS, network monitoring, and the impact of packet loss

QoS Is Still Misunderstood
I received an email this week from a gentleman named Steve, who asked about QoS. It seems that some of his co-workers don't believe in QoS in their network. They evidently believe that the network has sufficient bandwidth that QoS is not needed. Steve, however, had read a couple of my blog posts about QoS, interface drops, and network performance. He was concerned that they needed QoS and wanted to learn more about the factors that indicate a need for it.

Over the years, I've written several articles about QoS, network monitoring, and the impact of packet loss on network performance. Since Steve asked about all of these, I thought that it would be useful to write a summary article that ties all the parts together.

The Impact of Packet Loss
It doesn't take much packet loss to negatively impact applications. For all the emphasis on network performance for real-time applications like voice and video, those applications are surprisingly resilient when faced with random packet loss. The codecs in popular use are able to interpolate between received data samples to synthesize samples that are close to what was lost. Our human senses are also very tolerant of visual and audio noise and dropouts, which allows us to make sense of imperfectly operating systems. So these real-time applications tend to perform reasonably well when faced with packet loss.

TCP, on the other hand, is seriously impacted by packet loss. What is surprising is how little packet loss it takes to impact TCP. We've been trained to believe that TCP handles packet loss all on its own. From my perspective, our intuition about how much packet loss is acceptable has been very wrong. I discovered the difference between reality and my intuition when I came across something that's been called the Mathis Equation, named after the principal author of the initial research paper on the impact of packet loss on TCP. I first wrote about it at Chesapeake Netcraftsmen.

The result is that packet loss of greater than 0.0001% (that's 0.000001 * packet_count) should be investigated. That's the point in the curve where packet loss begins to impact TCP performance. Could you use a larger packet loss figure? Sure. I wouldn't go any higher than 0.01% packet loss, due to the loss of throughput for TCP as shown in the Mathis Equation throughput chart. Since most business applications run on TCP, higher packet loss can have a significant impact on productivity.
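
To make the effect concrete, here is a minimal Python sketch of the Mathis bound, which is usually written as rate <= (MSS / RTT) * (C / sqrt(p)), where p is the packet loss probability. The 1460-byte MSS, the 50 ms round-trip time, and C = 1 are assumptions chosen for illustration; they are not figures from this article, and the function name is hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate, c=1.0):
    """Upper bound on steady-state TCP throughput in bits per second.

    loss_rate is a fraction (0.000001 means 0.0001%), not a percentage.
    c is the constant from the Mathis model; 1.0 is an assumed, simplified value.
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in the range (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


# Assumed path: 1460-byte MSS and 50 ms RTT (illustrative values only).
for loss_percent in (0.0001, 0.01, 1.0):
    loss_rate = loss_percent / 100.0
    mbps = mathis_throughput_bps(1460, 0.050, loss_rate) / 1e6
    print(f"loss {loss_percent}% -> at most {mbps:.1f} Mbps per TCP flow")
```

Because the bound scales with 1 / sqrt(p), every hundredfold increase in loss cuts the achievable per-flow throughput by a factor of ten, whatever MSS and RTT you assume.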

Network Monitoring
How do you know that you have a problem? After all, if no congestion is occurring, QoS isn't necessary. This is where your Network Management System (NMS) becomes useful.

I should point out here that simply monitoring link utilization isn't sufficient for detecting whether a link needs QoS. The averaging that an NMS performs when collecting interface performance data seriously understates the actual bandwidth used at any point in time. Data traffic is very bursty, and the NMS averages its measurements over periods that are much longer than the bursts themselves. We've seen customers whose links run at 40% long-term utilization and still experience significant packet loss from congestion. In my experience, most network engineers and managers would not be concerned by 40% utilization. They are missing the peaks and how often those peaks occur.
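
The averaging effect is easy to reproduce with a few lines of Python. The per-second samples below are synthetic, invented purely for illustration: a link that is saturated for 120 seconds of a 300-second polling interval and idle for the rest. The function names are hypothetical.

```python
def average_utilization(samples):
    """Mean utilization over the polling interval (what a typical NMS graph shows)."""
    return sum(samples) / len(samples)


def seconds_at_or_above(samples, threshold):
    """Count the one-second samples at or above the given utilization."""
    return sum(1 for s in samples if s >= threshold)


# Synthetic data: 120 s at 100% utilization, then 180 s idle (one sample per second).
samples = [1.0] * 120 + [0.0] * 180

print(f"5-minute average: {average_utilization(samples):.0%}")
print(f"seconds at or above 95%: {seconds_at_or_above(samples, 0.95)}")
```

The average looks comfortable, yet the link spent two full minutes with no headroom at all, which is exactly when queues fill and packets are dropped.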

It is better to look for other indicators of packet loss, such as interface drops. Some systems may report drops as "discards," so look carefully at both the network equipment and NMS to find the right variable. Two previous posts that deal with detecting packet loss are "Detecting Network Packet Loss" and "Detecting Link Congestion."
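
As a rough sketch, the drop counters can be turned into a loss percentage and compared against the thresholds discussed earlier (0.0001% and 0.01%). This assumes you have two snapshots of a packet counter and a drop/discard counter for the same interface and direction, that the packet counter includes the dropped packets, and that the counters did not wrap or reset between snapshots. The function names and the labels returned are hypothetical.

```python
INVESTIGATE_PERCENT = 0.0001  # loss above this should be investigated
UPPER_LIMIT_PERCENT = 0.01    # loss above this is beyond the suggested ceiling


def loss_percent(pkts_before, pkts_after, drops_before, drops_after):
    """Percentage of packets dropped between two counter snapshots."""
    packets = pkts_after - pkts_before
    drops = drops_after - drops_before
    if packets < 0 or drops < 0:
        raise ValueError("counter went backwards (wrap or reset); resample")
    if packets == 0:
        return 0.0
    return 100.0 * drops / packets


def classify(percent):
    """Map a loss percentage onto the two thresholds above."""
    if percent > UPPER_LIMIT_PERCENT:
        return "above 0.01%"
    if percent > INVESTIGATE_PERCENT:
        return "investigate"
    return "ok"


# Illustrative numbers only: 10 million packets and 50 drops in the interval.
pct = loss_percent(0, 10_000_000, 0, 50)
print(f"{pct:.4f}% loss -> {classify(pct)}")
```

Working from deltas rather than raw counter values matters, because the raw values accumulate from the last reboot or counter clear and say little about what is happening now.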

Another way to detect packet loss is to ask the endpoints, such as VoIP phones and video conferencing systems. The voice/video controller can report the statistics from the endpoints and allow you to sort by loss statistic or export to a tool in which you can do the sorting.

You can also monitor the business servers for packet loss, using 'netstat -s -p tcp'. The output includes the following:

    ~ tcs$ netstat -s -p tcp
    tcp:
        697719 packets sent
            313408 data packets (81383562 bytes)
            247 data packets (72106 bytes) retransmitted

TCP ramps up its throughput during the "slow start" phase of a data transfer. It relies on a dropped packet to tell it that it has reached the throughput limit of the path, so some packet loss is to be expected.

My sample netstat output above is from my laptop, which runs over wireless most of the time, so I expect it to have higher retransmissions than normal. I prefer a double sort of server data to show me the systems that need the most attention. The first sort is by retransmissions. The second sort is by 'packets sent.' I then look for the systems with the largest volume of traffic and the highest retransmissions.
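
The collection and sorting can be scripted. The sketch below assumes netstat output in the same format as the sample above (other operating systems word these counters differently), and it interprets the double sort as "retransmissions first, packets sent as the tie-breaker." The server names and their counts are hypothetical, as are the function names.

```python
import re

# The three counters from the sample output above.
SAMPLE = """\
tcp:
    697719 packets sent
        313408 data packets (81383562 bytes)
        247 data packets (72106 bytes) retransmitted
"""

PATTERNS = {
    "packets_sent": r"^\s*(\d+) packets sent\s*$",
    "data_packets": r"^\s*(\d+) data packets \(\d+ bytes\)\s*$",
    "retransmitted": r"^\s*(\d+) data packets \(\d+ bytes\) retransmitted\s*$",
}


def parse_tcp_stats(text):
    """Extract the send and retransmit counters from netstat -s -p tcp output."""
    stats = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, re.MULTILINE)
        if match is None:
            raise ValueError(f"could not find {name!r} in netstat output")
        stats[name] = int(match.group(1))
    return stats


def retransmit_percent(stats):
    """Retransmitted data packets as a percentage of data packets sent."""
    if stats["data_packets"] == 0:
        return 0.0
    return 100.0 * stats["retransmitted"] / stats["data_packets"]


def rank_servers(servers):
    """Sort by retransmissions, then by packets sent, both descending."""
    return sorted(
        servers,
        key=lambda s: (s["retransmitted"], s["packets_sent"]),
        reverse=True,
    )


laptop = parse_tcp_stats(SAMPLE)
print(f"laptop retransmit rate: {retransmit_percent(laptop):.3f}%")

# Hypothetical servers, for illustration only.
servers = [
    {"name": "web01", "packets_sent": 9_000_000, "retransmitted": 120},
    {"name": "db01", "packets_sent": 4_000_000, "retransmitted": 5_300},
    {"name": "app01", "packets_sent": 7_500_000, "retransmitted": 5_300},
]
for server in rank_servers(servers):
    print(server["name"], server["retransmitted"], server["packets_sent"])
```

In practice you would feed parse_tcp_stats the text collected from each server and compare the deltas between two runs, since these counters accumulate from boot.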

Once the high-loss systems are identified, you then need to determine the network path that's being used. You may need to determine the typical set of TCP connections with 'netstat -an' or by talking with the server team about the application architecture. Once you determine the other endpoints of that server, you can use trace/traceroute to identify the path that the packets are taking. All this investigation can be a bit time-consuming, but it is well worth it for critical business servers that are experiencing significant packet loss. While you're at it, you may find a simple duplex mismatch that's causing a significant problem.

There may also be other sources of packet loss, as described in this blog post:

An old switch interface card and unplanned server connections resulted in significant congestion at the switch interface card. We wouldn't have found this information without doing some CLI data collection and analysis.

QoS
Finally, let's talk about QoS. As I mentioned above, the network can experience congestion due to micro-bursts (also called instantaneous buffer congestion), as described in this blog post.

But you shouldn't stop at implementing QoS. You need to verify that the QoS implementation is doing what you want it to do. We discovered that a QoS configuration wasn't doing what we wanted on a highly congested T3 link in a customer's network. We had to modify the QoS buffering to force the drops into the low-priority traffic queue and buffer the high-priority data traffic.

Summary
I find it interesting that old ideas stay with us so long. It wasn't until I learned of micro-bursts that I gained a real appreciation for the value of QoS, even in the LAN. Then I gained some experience in several customer networks where we were able to see congestion and its impact. More experience allowed me to gain more knowledge and understanding of the impact of congestion and its sources. Actually implementing QoS and seeing that the default configuration didn't work for a significantly oversubscribed link was very interesting. We learned what to tweak and were able to achieve the desired result.




