ABOUT THE AUTHOR


Terry Slattery
Terry Slattery is a senior network engineer with decades of experience in the internetworking industry. Prior to joining Chesapeake NetCraftsmen as...
Terry Slattery | October 02, 2014 |

 
   

SDN: A Network Troubleshooting Black Hole?

How will we do network troubleshooting with dynamic networks?

What Tools Will Be Useful?
Software Defined Networks will make network troubleshooting more challenging. Flows can potentially take any of several paths through the network and traditional tools won't necessarily be useful for network testing. In today's networks, we have several common tools for troubleshooting: ping, traceroute, and, well, maybe some network management tool reports. How will ping and traceroute work where the network path for each flow is determined by a central controller? Will these tools continue to be useful, or will we need new tools?

Our network troubleshooting tools will need to provide information similar to the information that we've traditionally had available. Ping shows us connectivity, round trip time, and packet loss. Traceroute shows us the forwarding path between two systems.

Ping
Ping is a pretty simple tool. It sends an ICMP echo-request packet and looks for an echo-reply packet. I prefer the versions of ping that display the sequence number in the output. This lets me easily track which replies arrived, spot duplicate replies (which indicate replication of either the echo-request or the echo-reply), spot out-of-sequence packets, and see the round trip time of each request. Most networking staff use ping only to verify connectivity.

I've used ping in non-obvious ways to detect various network problems. One troubleshooting technique is to start a long-running ping and record the output. Import the output into Excel, using the Excel parsing mechanism to separate the sequence number and round trip times into separate columns. Plot the round trip times against the sequence numbers (I prefer sequence numbers on the X axis and RTT on the Y axis). It is really easy to see periodic changes in RTT when starting to diagnose problems.
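The same separation of sequence numbers and round trip times can be scripted instead of done in a spreadsheet. The following is a minimal sketch, assuming reply lines in the common "icmp_seq=N ... time=X ms" form; other ping versions format their output differently, and the function names here are hypothetical.

```python
import re

# Matches reply lines such as:
# "64 bytes from 192.0.2.1: icmp_seq=7 ttl=57 time=12.4 ms"
# The output format is an assumption; adjust the pattern for other ping versions.
REPLY = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+)\s*ms")

def parse_ping(lines):
    """Return a list of (sequence_number, rtt_ms) tuples in arrival order."""
    samples = []
    for line in lines:
        match = REPLY.search(line)
        if match:
            samples.append((int(match.group(1)), float(match.group(2))))
    return samples

def summarize(samples):
    """Report lost, duplicated, and out-of-order sequence numbers."""
    seqs = [seq for seq, _ in samples]
    seen = set(seqs)
    # Only gaps between the first and last observed sequence numbers are
    # counted as lost; replies missing at the very end are not detected.
    expected = range(min(seqs), max(seqs) + 1) if seqs else range(0)
    return {
        "lost": [s for s in expected if s not in seen],
        "duplicates": sorted({s for s in seqs if seqs.count(s) > 1}),
        "out_of_order": [b for a, b in zip(seqs, seqs[1:]) if b < a],
    }
```

The (sequence, RTT) pairs returned by parse_ping can be handed to any plotting tool, with sequence numbers on the X axis and RTT on the Y axis.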

I have occasionally been able to determine a culprit by looking at how often the RTT spikes. Also look for periodic instances where packets are dropped. Is there a routing change that always occurs at a set time of day? Or is there an outage that corresponds to a planned network change?
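Checking how often the RTT spikes can also be automated. This is a rough sketch, assuming samples are (sequence number, RTT) pairs and that a "spike" is any RTT above three times the median; that threshold is an illustrative assumption, not a standard, and the function name is hypothetical.

```python
from statistics import median

def spike_intervals(samples, factor=3.0):
    """Return the sequence-number gaps between consecutive RTT spikes.

    samples: list of (sequence_number, rtt_ms) tuples.
    A spike is any RTT greater than `factor` times the median RTT.
    """
    if not samples:
        return []
    baseline = median(rtt for _, rtt in samples)
    spikes = [seq for seq, rtt in samples if rtt > factor * baseline]
    # Equal gaps suggest a periodic cause, such as a scheduled job or timer.
    return [b - a for a, b in zip(spikes, spikes[1:])]
```

With one ping per second, a run of identical gaps of 10 would point at something that happens every 10 seconds.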

We will need something comparable for SDN troubleshooting. I don't think we should rely on end-systems for this functionality. The SDN infrastructure should be able to generate ping packets, to be sent from one or more SDN switches.

Traceroute
Traceroute has its problems, and perhaps we can improve on it. Traceroute works by sending multiple packets into the network, each with a successively larger TTL, toward a specified destination IP address; each router at which the TTL expires returns an ICMP time-exceeded message, revealing one hop of the path. One problem is that because it sends multiple packets, the route that is selected may change between packets. Network engineers have learned how to look for such changes in the output. RFC 1393 describes a different approach, an IP option and an ICMP message type that eliminate the need for multiple source packets. Routers along the path detect the option and return the new ICMP message. Unfortunately, it was not widely implemented and has been deprecated by RFC 6814.

Because traceroute generates multiple packets for each TTL probe value, it is possible to see multi-path selection where each probe packet takes a different path. This functionality is going to be important in the SDN world.
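Spotting multi-path selection in probe results amounts to grouping responders by TTL. The sketch below assumes the probe results have already been collected as (TTL, responder address) pairs; the input shape and function name are hypothetical. Note that different addresses at one TTL can also be different interfaces of the same router, so a hit here is a hint rather than proof.

```python
from collections import defaultdict

def multipath_hops(probes):
    """Map each TTL to its responders when more than one address answered.

    probes: iterable of (ttl, responder_ip) tuples, one per probe packet;
    responder_ip is None when a probe timed out.
    """
    responders = defaultdict(set)
    for ttl, ip in probes:
        if ip is not None:
            responders[ttl].add(ip)
    return {
        ttl: sorted(ips)
        for ttl, ips in sorted(responders.items())
        if len(ips) > 1
    }
```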

We will want to have the SDN traceroute show and verify multiple paths. Just because a controller says that a path exists doesn't necessarily mean that the path can carry packets. The SDN version of traceroute will need to generate probe packets to verify each path.

New Tools
The dynamics of SDN will require that some new tools exist. I would like to see a bi-directional path viewing tool. It would need to query only the SDN controller to gather information about the path over which two endpoints are communicating. It should show the path in both directions so that asymmetric paths can be detected. It should have a recording mode that monitors the path for changes (much as the long-running ping described above can reveal routing changes). Several tools on the market already perform this kind of recording and playback (AppNeta's PathView comes to mind), so we have prior work upon which we can build. The display might look like one used in video editing, with a timeline that displays markers where changes have occurred.
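The core of such a tool could be quite small. The following is a sketch only, assuming the paths have already been retrieved from a controller as ordered lists of switch identifiers; how to query a particular controller is not covered, and the function names and data shapes are hypothetical.

```python
def is_asymmetric(forward, reverse):
    """True when the reverse path is not the forward path traversed backwards."""
    return list(forward) != list(reversed(reverse))

def path_changes(snapshots):
    """Return (timestamp, old_path, new_path) for each change in a recording.

    snapshots: list of (timestamp, path) tuples in time order, where path
    is a sequence of switch identifiers.
    """
    changes = []
    for (_, prev), (when, curr) in zip(snapshots, snapshots[1:]):
        if list(prev) != list(curr):
            changes.append((when, list(prev), list(curr)))
    return changes
```

Each tuple returned by path_changes corresponds to one marker on the timeline display: the time the path changed, and the paths before and after.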

A neat addition to the above tool would be to incorporate a Mathis Equation calculation. The tool would need to run for a while over both path directions to measure packet loss and latency. It could then perform the Mathis Equation calculations to report on the maximum throughput over that path. By incorporating the calculation, it could report the path characteristics in something that everyone understands: potential throughput. Incorporating a version of TCP throughput tests in the SDN switches and controller would then allow network admins to initiate a test to verify throughput without needing access to the end systems and without impacting their operation.
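For reference, the Mathis Equation bounds TCP throughput as rate <= (MSS / RTT) * (C / sqrt(p)), where MSS is the maximum segment size, RTT is the round trip time, p is the packet loss rate, and C is a constant of order 1. A minimal sketch of the calculation follows; the function name is hypothetical, and the default of C = 1.0 is a simplifying assumption (the original derivation gives roughly 1.22 under its own assumptions).

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_seconds, loss_rate, c=1.0):
    """Upper bound on TCP throughput in bits per second (Mathis Equation).

    mss_bytes:   maximum segment size in bytes
    rtt_seconds: measured round trip time in seconds
    loss_rate:   measured packet loss as a fraction, 0 < loss_rate <= 1
    c:           constant of order 1 (assumed 1.0 here)
    """
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    if not 0 < loss_rate <= 1:
        # With zero measured loss the bound is unlimited, so reject it here.
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * (c / sqrt(loss_rate))
```

For example, a 1460-byte MSS, a 50 ms RTT, and 0.01% loss give a bound of about 23.4 Mbps, no matter how fast the underlying links are.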

To aid in troubleshooting flow tables, it will be useful to display the Flow Equivalence Class (FEC) information at each switch in a path. It will be important to know if a path between two endpoints is using an FEC that's used by other flows as well as how many flows are using those entries.
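Counting how many flows share an entry is straightforward once the mapping is available. The sketch below assumes a per-switch mapping from flow identifier to the flow-table entry it matched; that mapping, and the names used, are hypothetical, since how it is obtained depends on the controller and switch.

```python
from collections import Counter

def flows_per_entry(flow_to_entry):
    """Count how many flows map to each flow-table entry on one switch."""
    return Counter(flow_to_entry.values())

def shared_entries(flow_to_entry):
    """Return only the entries used by more than one flow, with their counts."""
    counts = flows_per_entry(flow_to_entry)
    return {entry: n for entry, n in counts.items() if n > 1}
```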

Similarly, we'll want tools that allow us to see the amount of activity of a given flow entry. We will need tools and modifications of existing tools that will allow us to define network criteria that might affect a flow, such as a QoS setting.

We should also see simulators that allow us to use data collected from an operational network and easily modify the data to do "what-if" analysis. I have never been happy with existing simulation tools because of the amount of work needed to properly instrument them. With SDN, we should have better ways of collecting the data necessary to populate the simulators. And since the simulator is simply an SDN controller running on a simulated network, the data should be easily imported. Simulators will also allow us to prove whether connectivity should, or should not, exist, enhancing network security and helping us answer questions about whether two systems should be able to communicate with each other.

Tools will also be needed for diagnosing problems within the SDN domain. For example, we will need mechanisms that help diagnose split-brain failures. Similarly, we will need something to diagnose problems of connectivity between the SDN controller and SDN switches.

Summary
I'm sure that other tools will be created as we encounter situations that need additional visibility. It will be interesting to see what results from real-world operations.




