Terry Slattery
Terry Slattery is a senior network engineer with decades of experience in the internetworking industry. Prior to joining Chesapeake NetCraftsmen as...

Terry Slattery | December 09, 2013


Monitoring a Software Defined Network, Part 1

In addition to many traditional problems, there will be a set of new problems that must be understood.

The Need for Monitoring
A software-defined network still needs to be monitored and managed. There are many network problems that SDNs do not eliminate, so there is still a requirement to identify the sources of problems. In fact, in addition to many traditional problems, there will be a set of new problems that must be understood, and a means must be developed for identifying and correcting them.

A lot has been written about the potential for SDN to change how networks are designed and how they operate. However, very little has been written about monitoring an SDN. Most of the development effort that I've seen is about OpenFlow, which one could argue is not by itself SDN, but is simply a mechanism by which an SDN could be implemented. Other mechanisms can be used to create an SDN, so let's not restrict ourselves to OpenFlow for this analysis.

Monitor the Network
The network will still need to be monitored to detect traditional problems. At the physical layer, we will still need to detect Frame Check Sequence (FCS) errors, runts, giants, late collisions, and link errors (some of these are specific to Ethernet; others are generic to any interface). Several of these errors are indications of duplex mismatch in Ethernet links. Perhaps the SDN initiative will allow us to create a mechanism that we can use to switch both ends of a link to the same duplex setting. It would be really nice to get this problem solved.

At the interface queuing level, we need to detect packet discards, which are due to interface congestion. A discard is a packet that can't be transmitted on an egress interface because there were no free interface buffers (i.e., the egress queue was full). This typically happens when one or more high-speed interfaces are feeding one lower-speed egress interface.
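To make the discard counter actionable, it helps to turn two successive samples into a ratio. The following is a minimal sketch, not taken from any particular switch or MIB: the field names `tx_pkts` (packets actually transmitted) and `out_discards` are hypothetical, and the counters are assumed to be 32 bits wide and to wrap at most once between samples.

```python
COUNTER32_MOD = 2 ** 32  # assumption: 32-bit counters that wrap to zero


def counter_delta(old, new, modulus=COUNTER32_MOD):
    """Difference between two samples of a wrapping counter."""
    return (new - old) % modulus


def discard_ratio(prev, curr):
    """Fraction of packets offered to the egress queue that were discarded.

    prev and curr are dicts with the hypothetical keys 'tx_pkts'
    (packets transmitted) and 'out_discards' (packets dropped because
    the egress queue was full).
    """
    sent = counter_delta(prev["tx_pkts"], curr["tx_pkts"])
    dropped = counter_delta(prev["out_discards"], curr["out_discards"])
    offered = sent + dropped
    return dropped / offered if offered else 0.0


# Usage: the tx_pkts counter wraps between the two samples.
earlier = {"tx_pkts": 4294967000, "out_discards": 10}
later = {"tx_pkts": 704, "out_discards": 110}
ratio = discard_ratio(earlier, later)  # 100 dropped out of 1100 offered
```

A sustained non-zero ratio on one egress interface points at the congestion case described above, where faster interfaces feed a slower one.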

An ingress overrun can occur where the switching hardware is unable to handle an inbound packet before the next inbound packet arrives on that interface. These should be rare, but can happen on low-cost devices that have less than line-rate ingress processing capability.

An incredibly useful addition to the basic error counters would be a cache of the failed packet's header. When an error is detected, the packet header would be copied into this cache. By saving the header, it is possible to retrieve the information regarding the most recent error, providing valuable troubleshooting data. Otherwise, it would be necessary to use a packet capture device to try to determine which systems were having the problem.

If only a single storage location were shared by all counters, any new error would overwrite the previous header, regardless of type. So ideally, there would be at least one storage location for each error counter. If two or four storage locations existed per counter, they would operate as a ring buffer, storing the headers of the last two or four errors. I have written about this suggestion and a workaround in the past at Netcraftsmen: "How To Improve SNMP MIBS" and "Diagnosing the ipOutNoRoute Counter". An alternative for SDN is for the switch to forward the header to the SDN controller for storage, possibly only when we're actively troubleshooting a problem, much like we do with the "debug" command in today's equipment.
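The per-counter ring buffer can be sketched in a few lines. This is only a software illustration of the idea, not a description of any existing switch feature: the class name `ErrorCounter`, the default of four slots, and the 64-byte header limit are all assumptions made for the example.

```python
from collections import deque

HEADER_BYTES = 64  # assumption: keep only the first 64 bytes of the packet


class ErrorCounter:
    """An error counter that also remembers the headers of recent errors."""

    def __init__(self, name, slots=4):
        self.name = name
        self.count = 0
        # A deque with maxlen behaves as a ring buffer: once it is full,
        # appending a new header silently discards the oldest one.
        self.headers = deque(maxlen=slots)

    def record(self, packet):
        """Count one error and cache the offending packet's header."""
        self.count += 1
        self.headers.append(bytes(packet[:HEADER_BYTES]))

    def recent(self):
        """Return the cached headers, oldest first."""
        return list(self.headers)


# Usage: with two slots, the third error pushes out the first header.
fcs_errors = ErrorCounter("fcs", slots=2)
for pkt in (b"header-1", b"header-2", b"header-3"):
    fcs_errors.record(pkt)
```

After the loop, the counter reads 3 but only the last two headers remain, which is the trade-off between storage cost and troubleshooting depth.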

Monitor Forwarding Counters
SDN switches make forwarding decisions based on a large set of bits, including MAC address, QoS bits, and, potentially, application header bits. It is important for the SDN switches to record forwarding successes and failures and to track the number of packets and octets that are processed by each forwarding criterion. This is particularly useful for tracking bandwidth utilization in a QoS queue or for detecting when an application or a Forwarding Equivalence Class is consuming more bandwidth than anticipated.
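As a rough illustration of per-criterion accounting, the sketch below keeps a packet count and an octet count for each forwarding entry, plus a separate pair for packets that matched nothing. The names `RuleStats`, `ForwardingStats`, and `utilization_bps` are invented for this example, and counter wrap is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    packets: int = 0
    octets: int = 0


class ForwardingStats:
    """Packet and octet counters kept per forwarding entry."""

    def __init__(self):
        self.by_rule = {}             # rule id -> RuleStats (forwarding successes)
        self.unmatched = RuleStats()  # packets no entry matched (failures)

    def account(self, rule_id, length):
        """Record one packet of the given length against a rule.

        Pass rule_id=None for a packet that matched no forwarding entry.
        """
        if rule_id is None:
            stats = self.unmatched
        else:
            stats = self.by_rule.setdefault(rule_id, RuleStats())
        stats.packets += 1
        stats.octets += length


def utilization_bps(prev_octets, curr_octets, interval_s):
    """Average bits per second between two octet-counter samples."""
    return (curr_octets - prev_octets) * 8 / interval_s


# Usage: two packets hit a hypothetical "voice" entry, one packet matches nothing.
stats = ForwardingStats()
stats.account("voice", 200)
stats.account("voice", 100)
stats.account(None, 64)
```

Comparing `utilization_bps` for an entry against the bandwidth provisioned for its queue is one way to spot a class that is consuming more than anticipated.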

Keep in mind that SDN switches, like today's switches, often have aggregate packet forwarding limits. With both packet count and octet count metrics available, the SDN controller can make smarter decisions and provide feedback to applications that are capable of understanding it.

While we're wishing for monitoring functions, let's ask for functionality to make the measurements more accurate. It would be useful for the switch hardware to support atomic snapshots of multiple counters. For example, getting the packet count and octet count in one atomic operation would make a value derived from both, such as average packet size, accurate. This is something that's not possible with SNMP today.
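The value of an atomic snapshot is easiest to see in software. The sketch below is an analogy only, with invented names: a lock stands in for whatever mechanism the hardware would use, and it guarantees that the packet count and octet count returned together describe exactly the same set of packets.

```python
import threading


class PacketOctetCounters:
    """A packet counter and an octet counter that are read as one unit."""

    def __init__(self):
        self._lock = threading.Lock()
        self._packets = 0
        self._octets = 0

    def add(self, length):
        """Account for one forwarded packet of the given length."""
        with self._lock:
            self._packets += 1
            self._octets += length

    def snapshot(self):
        """Return (packets, octets) captured at the same instant."""
        with self._lock:
            return self._packets, self._octets


def mean_packet_size(before, after):
    """Average packet size in octets between two snapshots."""
    d_packets = after[0] - before[0]
    d_octets = after[1] - before[1]
    return d_octets / d_packets if d_packets else 0.0


# Usage: two packets (100 and 300 octets) arrive between the snapshots.
counters = PacketOctetCounters()
before = counters.snapshot()
counters.add(100)
counters.add(300)
after = counters.snapshot()
average = mean_packet_size(before, after)
```

If the two counters were read separately while traffic was flowing, a packet could be included in one reading and not the other, and the computed average would be skewed.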

As with physical counters, it is important to cache the header of any packets that are not forwarded. Let's say that we are troubleshooting a connectivity problem. If we have the headers of dropped packets, we can check them against the forwarding entries that exist in the switch to determine why a specific set of packets is being dropped.
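Checking a cached header against the forwarding entries could look something like the following. The entry layout (`id`, `priority`, `match`, `action`) and the exact-match semantics are simplifying assumptions for illustration, not the format of OpenFlow or any specific switch; a field that is absent from `match` is treated as a wildcard.

```python
def matches(entry_match, header):
    """True if every field the entry specifies equals the header's value."""
    return all(header.get(field) == value for field, value in entry_match.items())


def lookup(table, header):
    """Return the highest-priority entry that matches, or None on a table miss."""
    for entry in sorted(table, key=lambda e: e["priority"], reverse=True):
        if matches(entry["match"], header):
            return entry
    return None


def explain_drop(table, header):
    """Describe why a packet with this header was not forwarded."""
    entry = lookup(table, header)
    if entry is None:
        return "no matching entry (table miss)"
    if entry["action"] == "drop":
        return f"dropped by entry {entry['id']}"
    return f"entry {entry['id']} would forward it ({entry['action']}); look elsewhere"


# Usage with a hypothetical two-entry table and a cached header.
table = [
    {"id": "allow-web", "priority": 10, "match": {"dst_port": 80}, "action": "output:2"},
    {"id": "block-guest", "priority": 20, "match": {"vlan": 99}, "action": "drop"},
]
cached_header = {"vlan": 99, "dst_port": 80}
reason = explain_drop(table, cached_header)  # "dropped by entry block-guest"
```

Here the higher-priority drop entry wins even though a lower-priority entry would have forwarded the packet, which is exactly the kind of interaction that is hard to spot without the dropped header in hand.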

In addition, it would be nice if the switch hardware supported tracing of packet processing, recording the table entries that matched for a specified packet header. We could then ask the switch to tell us what internal processing happened and know whether a packet with a certain header was forwarded or dropped and why. A version of this capability has been developed at Stanford, in a network debugger called "ndb".
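To show what such a trace might contain, here is a small sketch of a multi-table pipeline that records which entry matched at each stage. It is not how ndb works and does not model any real switch: the table layout, the `next_table` action, and the assumption that entries within a table are already in priority order are all invented for the example.

```python
def trace_packet(tables, header):
    """Walk the tables in order and record the entry that matched in each.

    Returns (trace, verdict), where trace is a list of
    (table index, entry id or None, action) tuples and verdict is
    "forwarded" or "dropped".
    """
    trace = []
    for table_index, entries in enumerate(tables):
        # First entry whose match fields all equal the header's values.
        hit = next(
            (e for e in entries
             if all(header.get(k) == v for k, v in e["match"].items())),
            None,
        )
        if hit is None:
            trace.append((table_index, None, "miss"))
            return trace, "dropped"
        trace.append((table_index, hit["id"], hit["action"]))
        if hit["action"] == "drop":
            return trace, "dropped"
        if hit["action"] != "next_table":
            return trace, "forwarded"
    # Ran out of tables without reaching an output action.
    return trace, "dropped"


# Usage with a hypothetical two-table pipeline.
tables = [
    [{"id": "t0-vlan10", "match": {"vlan": 10}, "action": "next_table"}],
    [
        {"id": "t1-web", "match": {"dst_port": 80}, "action": "output:3"},
        {"id": "t1-default", "match": {}, "action": "drop"},
    ],
]
trace, verdict = trace_packet(tables, {"vlan": 10, "dst_port": 80})
```

The trace answers both questions raised above: whether the packet was forwarded or dropped, and which table entries led to that outcome.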

To Be Continued...
It would be nice if the monitoring of traditional errors were improved, making it easier to troubleshoot common problems. With SDN, an opportunity exists to gain more visibility into the forwarding engine's operation. We should spend time investigating whether the proposed mechanisms are sufficient for doing the level of debugging and troubleshooting that will be required.

Next time, I'll discuss monitoring the SDN controller.

