Monitoring a Software Defined Network, Part 5
Is a separate control, monitoring, and management network needed?
Note: My discussion of SDN monitoring covers several topics. Here are the prior posts:
1. Monitoring the SDN data plane
2. What parameters to monitor in an SDN control system
3. Should the SDN monitoring system be integrated with the controller?
4. SDN offers an opportunity to re-design network monitoring
I've seen a number of comments about SDN requiring a separate monitoring and control network. Let's take a closer look at this requirement to decide how critical it is. I'll simply refer to the combination of control, monitoring, and management as control data and the network as a control network.
Advantages of a Separate Control Network
Telephony networks, such as those based on SS7, have relied on separate control and monitoring networks, also called "out-of-band" management networks. These networks are valuable because they insulate the control data from whatever the in-band data is doing. Because control data never competes with other traffic, there's no need to prioritize it, which simplifies the QoS configuration. The simplification is modest, though: most QoS class definitions already include Network Control as the top-priority class by default, at DSCP 48 (CS6). A separate control network doesn't eliminate that QoS class, but it may remove the traffic that would use it.
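To make the DSCP 48/CS6 marking concrete, here is a minimal Python sketch of how an application on a converged network might mark its own control traffic. The function names are hypothetical, and the socket call assumes a platform that exposes `IP_TOS` (Linux does). Whether the network honors the marking depends on the QoS trust policy of the switches, which is outside this sketch.

```python
import socket

CS6_DSCP = 48  # Network Control class selector (CS6)


def dscp_to_tos(dscp: int) -> int:
    """Return the IPv4 TOS byte for a 6-bit DSCP value.

    DSCP occupies the upper six bits of the byte; the lower two bits
    carry ECN and are left at zero here.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def mark_control_socket(sock: socket.socket, dscp: int = CS6_DSCP) -> None:
    """Ask the OS to mark packets sent on this IPv4 socket with the DSCP.

    Assumption: the platform exposes socket.IP_TOS. Some systems may
    require elevated privileges for high-priority values.
    """
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


print(dscp_to_tos(CS6_DSCP))  # 192
```

With an out-of-band control network this marking step becomes unnecessary, which is the simplification described above.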
A separate out-of-band network may be less susceptible to congestion-based packet loss. One would hope that no network event would ever generate enough control traffic to cause congestion-related packet loss on a relatively high-speed control network. The buffering inherent in the network equipment should be sufficient to absorb any imaginable burst of control traffic. The outcome probably depends somewhat on the transport protocol in use.
Traffic carried over User Datagram Protocol (UDP) could conceivably lose some packets to congestion at some point in the network, and UDP itself will not retransmit them. That would be a good reason to use Transmission Control Protocol (TCP) for the transport, especially since the upper-level protocols being discussed for network control are based on REST/XML and/or JSON. I don't see these protocols running over UDP. It makes more sense to run them over TCP and get reliable delivery without trying to fit individual request/response exchanges into single UDP packets.
A separate control network makes it easy to communicate with all the network devices, even when the SDN part of the network is experiencing problems. This is a powerful advantage that shouldn't be overlooked. Consider what will be required to bring up the network after a site-wide power outage, such as one caused by a major storm (Hurricane Sandy taught many of us lessons about the duration and scope of storm-related outages). Can the SDN control system easily communicate with the SDN switches in order to bring up the network and load the forwarding tables?
If a separate control network is not used, the SDN controller will need to understand the network's physical topology and talk to the devices nearest to the controller before attempting to talk with devices that are still isolated. Some zero-touch installation mechanisms may alleviate this potential problem, so make it one of the things you ask your SDN equipment provider about.
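The nearest-first requirement amounts to a breadth-first walk of the physical topology starting at the controller. The sketch below is only an illustration of that ordering, not how any particular controller works; the function name and the four-switch topology are made up for the example.

```python
from collections import deque


def bring_up_order(topology, controller):
    """Return devices in breadth-first order from the controller.

    topology maps each node name to a list of its physical neighbors.
    A device is listed only after a neighbor that is closer to the
    controller, so an in-band path exists by the time it is contacted.
    """
    visited = {controller}
    queue = deque([controller])
    order = []
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical topology for illustration only.
topology = {
    "controller": ["sw1", "sw2"],
    "sw1": ["controller", "sw3"],
    "sw2": ["controller", "sw3"],
    "sw3": ["sw1", "sw2", "sw4"],
    "sw4": ["sw3"],
}
print(bring_up_order(topology, "controller"))  # ['sw1', 'sw2', 'sw3', 'sw4']
```

Devices missing from the result are unreachable in-band from the controller, which is exactly the case where an out-of-band path would earn its keep.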
Disadvantages of a Separate Control Network
For reliability, a separate control network would need to be fully redundant. Building and maintaining a network of that complexity would be a challenge in its own right, and there can be significant cost in building a parallel but separate control network. Additional WAN links may be necessary. All connectivity in the SDN and in the control network must be checked to make sure that the control network traverses different links and shares no physical infrastructure with the SDN, potentially including power feeds as well as communication links.
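The shared-infrastructure check can be thought of as a set intersection over whatever each path depends on. Here is a minimal sketch, assuming you already have an inventory listing the conduits, circuits, and power feeds behind each path; the function name and the inventory entries are invented for illustration.

```python
def shared_risks(control_path, data_path):
    """Return the infrastructure elements that both paths depend on."""
    return sorted(set(control_path) & set(data_path))


# Hypothetical inventory: what each path physically depends on.
control_path = ["conduit-A", "carrier-1-ckt-17", "power-feed-2"]
data_path = ["conduit-A", "carrier-2-ckt-04", "power-feed-1"]

print(shared_risks(control_path, data_path))  # ['conduit-A']
```

An empty result is only as trustworthy as the inventory behind it, and that inventory goes stale whenever a carrier regrooms a circuit.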
How complex is it to create a diverse, redundant control network that's separate from the diverse, redundant SDN infrastructure? It is hard enough to get two diverse paths without having to also get two more diverse paths for the control network. This could quickly get very expensive. And I doubt that the diversity would exist for long, as carriers shift circuits onto different paths as they work to lower costs.
The separate control network could be based on SDN, or it could use traditional IP networking technology. If it were based on SDN, it would need a mechanism to bootstrap itself after a power failure. The bootstrap mechanism could be built to be reasonably efficient and fast, so this isn't a significant problem.
If the control network is constructed as a traditional IP network, does it imply keeping old-style network management tools active, just for monitoring the control network? It does mean that the control network uses different configuration and troubleshooting tools and techniques, which is an added cost. And that doesn't include the cost of retaining staff who understand how to configure, maintain, and troubleshoot the IP network.
Constructing a separate and resilient control network is quite expensive. In many cases, it will be impossible to find independent paths where a single failure won't affect both the control network and the SDN network itself. (Consider a backhoe digging up the fiber at a building entrance or a highway maintenance crew accidentally slicing a key cross-country link.) In many cases, the cost of building a redundant control network will be high enough that it won't win the CFO's approval.
The trend in networking has been to converge disparate networks where possible in order to reduce capital and operational costs. Why would the SDN control network be any different? Customers will pressure vendors to provide mechanisms that make a separate control network unnecessary, and that cost will eventually be eliminated.
Cost pressures will lead organizations to converge the control network with the operational SDN infrastructure. Even if an organization decides to build a separate control network, there will need to be a backup network, and using the SDN network itself will be very tempting. If the SDN architecture includes redundant devices and links, what makes a separate control network necessary?
It seems like the driver for a separate control network is avoiding shared fate: When an incident takes out part of the SDN infrastructure, you want to continue to have a control path in order to facilitate troubleshooting and remediation. But if the SDN architecture is resilient, with redundant, diverse paths, isn't that sufficient to provide the backup path that the control network also needs? The only factor that would seem to come into play is sharing bandwidth. SDN provides good QoS controls, so there's no problem in prioritizing the control traffic over application traffic. There are mechanisms that can be employed to handle the network startup after a major power outage.
I don't see the benefit of separate control networks and can see the value of a converged SDN and control network. That isn't to say that some topologies won't need some additional data paths in order to provide connectivity to remote parts of the network in case of disaster. I wouldn't count on those paths being totally independent of the other paths in the SDN infrastructure for those times when the backhoe finds the cable or when the road work crew drives a guard rail support through a cable.