ABOUT THE AUTHOR


Terry Slattery
Terry Slattery is a senior network engineer with decades of experience in the internetworking industry. Prior to joining Chesapeake NetCraftsmen as...
Terry Slattery | June 04, 2015

 
   

The Quest for Network Visibility

Monitoring all network interfaces isn't an unreasonable expectation -- especially since the tools exist to do it without breaking the budget.

Monitoring everything in a network is a big challenge, but I think it's a necessary undertaking.

Monitoring Everything
Unfortunately, many network management systems (NMS) aren't up to the challenge of polling every network interface, storing the data, and generating useful information out of the collected data. This is perhaps the biggest reason why network management is so difficult. Scaling up to handle a large network is challenging.

Why monitor everything? Can't we get by with a system that monitors only the important interfaces, as is often the case because of budget constraints? Licensing for most (all?) NMS products is based on the number of monitored elements, so limiting that number can keep the NMS budget in check. But this is one case where what you don't know can, in fact, hurt you.

As we attempt to identify the "important" interfaces, we end up having to implement a process and procedure to validate whether any new interfaces are important enough to be monitored. We also need a process and procedure to remove old interfaces from the monitoring system as endpoints and network devices are removed from the network. This process is likely to impede other network maintenance processes, and will soon fall into disuse. We either have a cumbersome process or a failed process, both of which have other costs.

So I like to monitor everything in the network. Network monitoring automation tools make it easy to identify interfaces and begin monitoring them. I let the NMS filter out the unimportant data. But this means that the NMS must handle all interfaces, which can be expensive.

It's Too Expensive
I was working with a customer a while back that had purchased an expensive NMS and was spending more money customizing it. I asked the NMS team to monitor all the interfaces in the data center, figuring that at least those interfaces were important. I wanted to be able to identify server interfaces that reported problems with the connected servers. I rely on late collisions and frame check sequence, or FCS, errors to indicate duplex mismatch problems. Counters for discards/drops and ingress overruns tell me about oversubscribed interfaces.
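As a minimal sketch of how those counters could be turned into findings, the Python below maps per-interval counter increases to the two conditions described above. The field names, the `diagnose` helper, and the "anything above zero" threshold are illustrative assumptions, not the behavior of any particular NMS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InterfaceCounters:
    """Counter increases for one interface over one polling interval (hypothetical fields)."""
    name: str
    late_collisions: int = 0
    fcs_errors: int = 0
    discards: int = 0
    ingress_overruns: int = 0


def diagnose(c: InterfaceCounters) -> List[str]:
    """Return likely causes for the errors seen on an interface.

    Assumption: any non-zero increase is worth flagging; a real deployment
    would probably use rate-based thresholds instead.
    """
    findings = []
    if c.late_collisions > 0 or c.fcs_errors > 0:
        findings.append("possible duplex mismatch")
    if c.discards > 0 or c.ingress_overruns > 0:
        findings.append("possible oversubscription")
    return findings


if __name__ == "__main__":
    sample = InterfaceCounters("Gi1/1", late_collisions=3, fcs_errors=12)
    print(sample.name, diagnose(sample))
```

An interface with no increases returns an empty list, so it can be filtered out of a report.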

In this same network, we found that several key servers were connected to the same 48-port 6148 Ethernet blade in a Cisco Catalyst 6500 switch. In fact, three of the highest-volume servers were connected to consecutive ports on one ASIC on this blade, as shown in the figure below. During the busiest part of the day, these servers would send more traffic than the switch could handle, resulting in high counts of ingress overruns. Distributing the servers to other blades in the same switch solved the congestion problem. In addition, the analysis identified this switch as a single point of failure for most of the business functions, which were running on these three servers.

[Figure: Oversubscribed ASIC]
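The arithmetic behind this kind of finding can be sketched in a few lines of Python: group ports by the ASIC they share, add up their peak traffic, and flag any ASIC whose total exceeds its capacity. The values of `PORTS_PER_ASIC` and `ASIC_CAPACITY_MBPS` are assumptions chosen for illustration and should be checked against the hardware documentation for the actual blade. The port numbers and traffic figures are made up.

```python
from collections import defaultdict
from typing import Dict

PORTS_PER_ASIC = 8          # assumption: consecutive ports share one ASIC
ASIC_CAPACITY_MBPS = 1000   # assumption: shared capacity per ASIC


def asic_index(port: int, ports_per_asic: int = PORTS_PER_ASIC) -> int:
    """Map a 1-based port number to a 0-based ASIC index (ports 1-8 -> 0, 9-16 -> 1, ...)."""
    return (port - 1) // ports_per_asic


def oversubscribed_asics(
    peak_mbps_by_port: Dict[int, float],
    ports_per_asic: int = PORTS_PER_ASIC,
    capacity_mbps: float = ASIC_CAPACITY_MBPS,
) -> Dict[int, float]:
    """Return {asic_index: total_peak_mbps} for ASICs whose summed peak load exceeds capacity."""
    load = defaultdict(float)
    for port, mbps in peak_mbps_by_port.items():
        load[asic_index(port, ports_per_asic)] += mbps
    return {asic: total for asic, total in load.items() if total > capacity_mbps}


if __name__ == "__main__":
    # Three busy servers on consecutive ports, one quieter server elsewhere (hypothetical data).
    peaks = {1: 600.0, 2: 500.0, 3: 400.0, 17: 300.0}
    print(oversubscribed_asics(peaks))
```

Moving any one of the three busy ports into a different port group lowers the first ASIC's total, which is the same effect as redistributing the servers.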

When I asked the NMS team to monitor all server interfaces, I was told that it was too expensive to do. In addition, the system's default configuration did not monitor the SNMP objects needed to identify problems similar to those described above.

Is There a Solution?
My current favorite interface monitoring system is Statseeker, because it is affordable and fast. It only needs one server to monitor more than 500,000 interfaces at a fast polling rate. Any of the collected data is viewable within a few seconds, versus the many minutes required by some other systems I've used. Among the many NMS options available, this is the one that seems to provide good value for the money while still being able to monitor everything. An added benefit is that it is easy to set up and use. Many monitoring systems require a lot of fiddling and configuring, effectively making them expensive to install and use. I prefer something that works well out of the box.

What About Deployment?
I use tags to build a hierarchy of relative interface importance. Briefly, I use interface descriptions to add one or more tags that the NMS can use to classify and rank the importance of interfaces. I add device tags to the SNMP Location string, or some other device string variable that's accessible via SNMP. Any interface tagged with "Critical Server" would be grouped into the "Server" interface group. A problem on any interface in this group would generate a high-priority alert. In normal operation, no interface in this group would have a problem that hasn't been diagnosed and corrected. (For more details, see my NetCraftsmen blog, Device and Interface Tagging.)

Similarly, infrastructure interfaces would have tags like "Core-Core," "Core-Dist" or "Dist-Edge," allowing for easy grouping for alerting and reporting purposes. In order for this mechanism to work, the NMS must be able to create device and interface groups automatically based on the tags.
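The automatic grouping can be sketched as follows, assuming the NMS (or a script in front of it) can read each interface description as a string. The tag names come from the examples above. The `TAG_TO_GROUP` mapping, the case-insensitive substring match, and the "Untagged" fallback group are hypothetical choices for illustration, not a description of how any specific product works.

```python
from collections import defaultdict
from typing import Dict, List

# Tag text found in an interface description -> interface group name.
TAG_TO_GROUP = {
    "Critical Server": "Server",
    "Core-Core": "Core-Core",
    "Core-Dist": "Core-Dist",
    "Dist-Edge": "Dist-Edge",
}
UNTAGGED_GROUP = "Untagged"  # hypothetical name for interfaces with no tag


def group_interfaces(descriptions: Dict[str, str]) -> Dict[str, List[str]]:
    """Group interface names by the tags found in their descriptions."""
    groups = defaultdict(list)
    for ifname, descr in descriptions.items():
        lowered = descr.lower()
        matched = [group for tag, group in TAG_TO_GROUP.items() if tag.lower() in lowered]
        # An interface with several tags joins several groups; one with none is untagged.
        for group in matched or [UNTAGGED_GROUP]:
            groups[group].append(ifname)
    return dict(groups)


if __name__ == "__main__":
    sample = {
        "Gi1/1": "Critical Server: db01",
        "Te2/1": "Core-Dist uplink to dist-sw1",
        "Gi3/7": "user port",
    }
    for group, members in group_interfaces(sample).items():
        print(group, members)
```

Because the groups are derived from the descriptions on every run, adding or removing an interface needs no separate change in the monitoring system.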

I handle relatively unimportant edge interfaces differently. (All active interfaces are important to some degree, or nothing would be connected to them.) The default is not to tag an edge interface, which results in a very large group of relatively unimportant interfaces. The NMS is set up to produce a report of interfaces with errors, sorted in descending order by error count, so the interfaces with the most errors appear first. The network operations team then uses this report to identify and correct problems that have a major impact on an edge device.
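A report of this kind amounts to a filter and a descending sort. The sketch below assumes the error counts for the reporting period are already available as a dictionary. The function name, the `limit` parameter, and the sample data are illustrative only.

```python
from typing import Dict, List, Tuple


def top_error_interfaces(
    errors_by_interface: Dict[str, int], limit: int = 10
) -> List[Tuple[str, int]]:
    """Return up to `limit` interfaces with errors, highest error count first.

    Interfaces with zero errors are dropped; ties are broken by interface name
    so the report is stable from run to run.
    """
    with_errors = [(name, count) for name, count in errors_by_interface.items() if count > 0]
    with_errors.sort(key=lambda item: (-item[1], item[0]))
    return with_errors[:limit]


if __name__ == "__main__":
    sample = {"Gi1/1": 5, "Gi1/2": 0, "Gi1/3": 120, "Gi1/4": 5}
    for name, count in top_error_interfaces(sample, limit=3):
        print(f"{name:8} {count}")
```

Keeping the list short lets the operations team work from the top down rather than wading through every untagged edge port.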

Applying Network Visibility to UC
Gaining visibility into UC system connections to the network is easy with the above tips. Use an NMS that allows monitoring of all interfaces. Tag key interfaces to UC infrastructure such as session border controllers, UC managers, multipoint control units, and important teleconferencing systems. Using the tags, these interfaces are then automatically grouped together for monitoring and reporting purposes, as described above.

When a trouble ticket gets opened for an edge interface, the operations team should first check the NMS reports on the affected edge interface as well as the server interface or interfaces to make sure the culprit isn't something like a simple duplex mismatch.

It is important to use the periodic interface error reports to correct simple network problems. I've seen numerous examples where the network staff refuses to correct problems because an end user didn't report a problem. That's not being proactive, and does not lead to good network operational practices. It typically takes some involvement by the IT management staff to encourage tracking network problems and correcting them.

Summary
It isn't unreasonable to monitor all network interfaces. The tools exist to do it without breaking the budget. Adding a few operational procedures like tagging makes the tools much more useful. Finally, create processes and procedures to follow for handling the problems that the NMS reports.




