Bare-Metal SDN Fabric

Bare-Metal SDN
The term "Bare-Metal" refers to switches that are delivered without an embedded, custom, and often exceedingly complex operating system. SDN vendors are then able to load their own simple operating systems on these switches to implement an SDN fabric.

These switches use common switching silicon, so it is easy for SDN vendors to create an OS that can be ported to any of several vendors' switches. This is much like the ability to run any of several hypervisors or operating systems on x86 hardware. Virtualization of the network is a step closer.

An advantage of these so-called bare-metal switches is that they come in at a lower price point than the traditional enterprise switch with its complex embedded operating system.

Last week, Big Switch announced its Big Cloud Controller, along with support for any of several vendors' switch hardware (Dell, Quanta, etc.). (See Big Switch Networks post.) The Big Switch controller operates as a cluster that is logically centralized and physically distributed.

The announcement included numerous partnerships. The Dell and Quanta partnerships allow the use of those vendors' switches, which can be ordered with the ONIE (Open Network Install Environment) bootstrap loader installed. ONIE allows touchless installation of new switches: a new switch automatically finds the SDN controller and loads the necessary switch software. The boot loader functionality is similar to what servers have had for some time.
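To make the touchless flow more concrete, here is a minimal Python sketch of the kind of discovery waterfall a loader like ONIE performs; the URLs, file names, and helper function are hypothetical illustrations, not ONIE's actual behavior or naming.

```python
import urllib.request

# Hypothetical candidate sources for the switch installer image, tried in the
# order a bare-metal boot loader might use: a DHCP-supplied URL, a DNS-based
# default, then a local fallback. None of these names come from ONIE itself.
CANDIDATE_URLS = [
    "http://192.0.2.10/installers/fabric-switch-os",       # from DHCP (assumed)
    "http://provisioning.example.com/fabric-switch-os",    # DNS default (assumed)
    "http://192.0.2.10/installers/default-installer",      # fallback (assumed)
]

def fetch_installer(urls):
    """Try each candidate URL in turn and return the first installer image found."""
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.read()              # raw installer image bytes
        except Exception:
            continue                            # fall through to the next source
    raise RuntimeError("no installer found; switch stays in discovery mode")

if __name__ == "__main__":
    image = fetch_installer(CANDIDATE_URLS)
    print(f"retrieved installer image ({len(image)} bytes)")
```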

Since the switches are managed from the central controller, the Operations, Administration, and Maintenance (OAM) functions are also significantly reduced. There is a single management system for configuring and controlling the physical infrastructure (leaf + spine topology) and virtual infrastructure (vSwitches), providing full connectivity across the SDN domain. There are efficiencies to be gained in this approach, due to a reduction in the number of hops across the network as well as the elimination of multiple layers of encapsulation headers.
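To put a rough number on the encapsulation point, here is a small back-of-the-envelope sketch; the 400-byte packet size is an assumption, while the roughly 50-byte outer header is the standard figure for one VXLAN-style overlay layer.

```python
# Rough per-packet cost of one VXLAN-style overlay encapsulation layer:
# outer Ethernet (14 B) + outer IPv4 (20 B) + UDP (8 B) + VXLAN (8 B) = 50 B.
OVERHEAD_BYTES = 14 + 20 + 8 + 8
PAYLOAD_BYTES = 400          # illustrative average packet size (assumed)

per_packet_pct = 100 * OVERHEAD_BYTES / (PAYLOAD_BYTES + OVERHEAD_BYTES)
print(f"{OVERHEAD_BYTES} B of outer headers is about {per_packet_pct:.1f}% "
      f"of each {PAYLOAD_BYTES} B packet")
# A second stacked encapsulation layer roughly doubles this overhead, which is
# what a fabric-native (non-overlay) SDN avoids.
```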

We can now run production SDN systems right on the hardware instead of implementing an overlay on top of an existing network. (The overlay approach typically requires operation and management of both the underlay network and the overlay network, increasing the management workload and making troubleshooting more difficult.) A "pure" SDN makes it easier to implement big, multi-tenant networks.

The Big Cloud Controller proactively calculates the SDN forwarding rules for all destinations within the forwarding domain when a new device is detected. This contrasts with the "reactive" approach, in which forwarding rules are loaded only when a new flow is detected. The downside of the proactive approach is that it permits more connectivity than some security-focused SDN proponents would like; they prefer that only the required forwarding rules be loaded, to prevent undesirable connectivity.
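The difference between the two modes can be sketched in a few lines of Python. The topology, rule format, and controller hooks below are invented for illustration and are not Big Switch's actual logic.

```python
from collections import deque

# Hypothetical two-leaf / two-spine fabric, as an adjacency list.
TOPOLOGY = {
    "leaf1":  ["spine1", "spine2"],
    "leaf2":  ["spine1", "spine2"],
    "spine1": ["leaf1", "leaf2"],
    "spine2": ["leaf1", "leaf2"],
}

def first_hop(src, dst):
    """BFS over the fabric; return the first hop on a shortest path src -> dst."""
    queue = deque((nbr, nbr) for nbr in TOPOLOGY[src])
    seen = {src}
    while queue:
        node, hop = queue.popleft()
        if node == dst:
            return hop
        if node in seen:
            continue
        seen.add(node)
        queue.extend((nbr, hop) for nbr in TOPOLOGY[node])
    return None

# Proactive mode: as soon as a new host is detected on dst_leaf, push a rule
# for it to every switch in the domain, before any traffic is seen.
def on_host_discovered(host_mac, dst_leaf, flow_tables):
    for switch in TOPOLOGY:
        flow_tables[switch][host_mac] = (
            "local-port" if switch == dst_leaf else first_hop(switch, dst_leaf))

# Reactive mode (the contrasting approach): install a rule only when a switch
# reports a table miss for a destination it has never seen.
def on_flow_miss(switch, host_mac, dst_leaf, flow_tables):
    flow_tables[switch][host_mac] = first_hop(switch, dst_leaf)

# Example: tables start empty; proactive discovery fills every switch at once.
tables = {sw: {} for sw in TOPOLOGY}
on_host_discovered("00:11:22:33:44:55", "leaf2", tables)
print(tables["leaf1"])   # {'00:11:22:33:44:55': 'spine1'}
```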

When I was talking with Kyle Forster (co-founder of Big Switch), we briefly touched on whether the Big Cloud Controller could be implemented on a VM. It should not be: the SDN controller must be dedicated, not virtualized. After a power outage, the controller needs to start up without any assistance from a VM controller, and it must be able to communicate with at least the adjacent network infrastructure.

I predict that some enterprising network admin will completely virtualize an SDN controller someday. When the system tries to come back up after a power outage, this admin will find that the VM controller can't talk with the VM hardware until parts of the network are initialized. But since the SDN controller is in a VM, the network can't be initialized until that VM is running and connected to the network. SDN network audits will therefore have to include a check that the basic infrastructure components can boot without relying on each other or on external systems.
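One way such an audit could catch this chicken-and-egg problem is to model boot-time dependencies as a graph and look for cycles. This is a generic sketch with made-up component names, not a feature of any particular product.

```python
# Boot-time dependency graph: component -> things it needs before it can start.
# The names are hypothetical; the cycle below is the failure mode described above.
BOOT_DEPS = {
    "sdn_controller_vm": ["hypervisor", "fabric"],  # controller runs as a VM
    "hypervisor":        ["fabric"],                # hosts need the network
    "fabric":            ["sdn_controller_vm"],     # switches need the controller
    "oob_management":    [],                        # out-of-band, no dependencies
}

def find_cycle(deps):
    """Depth-first search; return one dependency cycle if any exists, else None."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in deps}
    stack = []

    def dfs(node):
        color[node] = GREY
        stack.append(node)
        for need in deps.get(node, []):
            if color[need] == GREY:                 # back edge: we found a cycle
                return stack[stack.index(need):] + [need]
            if color[need] == WHITE:
                cycle = dfs(need)
                if cycle:
                    return cycle
        color[node] = BLACK
        stack.pop()
        return None

    for node in deps:
        if color[node] == WHITE:
            cycle = dfs(node)
            if cycle:
                return cycle
    return None

print(find_cycle(BOOT_DEPS))
# -> ['sdn_controller_vm', 'hypervisor', 'fabric', 'sdn_controller_vm']
# The controller and the network each wait on the other, so nothing comes up.
```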

Core + Pod
SDN domains are best implemented in pods. A pod can be anything from a few VM servers and their network equipment to a whole data center row (say, 16 racks of 20-40 multi-core servers each). In a multi-rack implementation, there are two Top of Rack Switches (ToRs) per rack. Each ToR is connected to all spine switches, providing a lot of network cross-sectional bandwidth. Resilience is increased because the loss of one leaf or one spine switch affects a smaller percentage of the overall network capacity.
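A quick back-of-the-envelope calculation shows why losing one spine hurts less as the spine count grows; the switch counts and link speeds below are illustrative assumptions only.

```python
# Illustrative leaf-spine pod: every leaf has one uplink to every spine.
LEAVES = 32          # e.g. 16 racks x 2 ToR switches (assumed)
SPINES = 4           # assumed spine count
UPLINK_GBPS = 40     # per leaf-to-spine link (assumed)

total_gbps = LEAVES * SPINES * UPLINK_GBPS
after_one_spine_fails = LEAVES * (SPINES - 1) * UPLINK_GBPS

print(f"fabric capacity: {total_gbps} Gbps")
print(f"after one spine failure: {after_one_spine_fails} Gbps "
      f"({100 * (1 - after_one_spine_fails / total_gbps):.0f}% loss)")
# A single spine failure removes only 1/SPINES (here 25%) of the capacity,
# and a single leaf failure affects only the servers in that one rack.
```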

The connections to the core are done in a similar manner. A pair of core switches connects to each leaf switch, not to the spine switches, as one would expect in a traditional three-tier (core, distribution, access) design.

The figure below, from Big Switch, shows the typical connectivity within such a design. The Services and Connectivity Racks, shown in the lower right, provide connectivity to the Core.

When I was talking with Kyle at Big Switch, he described how one customer creates pod designs. Each item in the pod is specified, down to device model numbers. That pod design is then replicated as many times as desired. When anything in the pod design changes, a new pod revision number is assigned. Changes in some hardware may dictate changes in the provisioning or management system, so that becomes a part of the pod design as well. Instead of attempting to have one management system that handles everything in all pods, the management system only needs to handle the systems within one pod. The end result is fewer management problems.
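The pod-as-a-unit idea can be captured in a small, versioned specification. The fields, model numbers, and software versions below are hypothetical; the point is only to show how a hardware change bumps the revision and carries the matching management stack with it.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PodDesign:
    """One replicable pod: a hardware bill of materials plus the tooling that manages it."""
    revision: str
    leaf_model: str
    spine_model: str
    racks: int
    servers_per_rack: int
    mgmt_stack: List[str] = field(default_factory=list)

# Revision 1 of a hypothetical pod design, specified down to device models.
POD_R1 = PodDesign(
    revision="r1",
    leaf_model="vendor-leaf-48x10G",       # placeholder model numbers
    spine_model="vendor-spine-32x40G",
    racks=16,
    servers_per_rack=32,
    mgmt_stack=["sdn-controller-2.x", "monitoring-agent-1.4"],
)

# A hardware change forces a new revision, and the provisioning/management
# stack that matches that hardware is versioned along with it.
POD_R2 = PodDesign(
    revision="r2",
    leaf_model="vendor-leaf-48x25G",
    spine_model=POD_R1.spine_model,
    racks=POD_R1.racks,
    servers_per_rack=POD_R1.servers_per_rack,
    mgmt_stack=["sdn-controller-3.x", "monitoring-agent-1.4"],
)

# Replicating a pod means stamping out copies of one known-good revision, so
# each management system only ever has to understand a single pod design.
deployment = [POD_R2] * 4
```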

Big Tap 4.0
Big Switch is also announcing Big Tap 4.0. Big Tap is a Network Packet Broker (NPB), a product category typified by Gigamon. Big Tap allows tapping of every data center rack, which is important in large, multi-tenant data centers, where being able to tap any flow at any server is necessary for scalable operations and troubleshooting. The tapped data flow is fed to any of several diagnostic or monitoring tools.
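Conceptually, an NPB policy is just a match filter plus a delivery destination. The sketch below is a generic illustration of that idea, not Big Tap's actual API or policy syntax.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class TapPolicy:
    """A hypothetical packet-broker rule: what to match and where to copy it."""
    name: str
    src_subnet: str    # e.g. the client subnet
    dst_subnet: str    # e.g. the application servers
    l4_port: int       # narrow the tap to one application
    tool_port: str     # monitoring tool that receives the copied traffic

# Copy the client-to-server flow of a slow application to an APM tool.
policy = TapPolicy(
    name="slow-app-investigation",
    src_subnet="10.20.0.0/16",          # illustrative addressing
    dst_subnet="10.40.8.0/24",
    l4_port=443,
    tool_port="apm-tool-1",
)

def matches(policy: TapPolicy, pkt: dict) -> bool:
    """Simplified software match; a real NPB programs this into switch hardware."""
    return (pkt["dst_port"] == policy.l4_port
            and ipaddress.ip_address(pkt["src_ip"]) in ipaddress.ip_network(policy.src_subnet)
            and ipaddress.ip_address(pkt["dst_ip"]) in ipaddress.ip_network(policy.dst_subnet))

print(matches(policy, {"src_ip": "10.20.3.7", "dst_ip": "10.40.8.21", "dst_port": 443}))  # True
```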

I've used NPBs in the past at enterprise network choke points to allow easy troubleshooting. When someone complains about a slow application, it is great to be able to feed the client-server flow to an application performance management tool to aid diagnosis of the problem.

The Big Switch Big Tap product does the same thing, but at much larger scales and in multi-tenant environments. Tools like these (NPB and app performance management) are wonderful to have when the trouble ticket says, "The network is slow."

Summary
SDN running on bare-metal switches is not new. Google and Facebook have been building their own switches and writing their own software. (See a description of the Facebook implementation.) But there are few commercially available implementations. Big Switch and its partners now give anyone interested in a pure SDN prototype the opportunity to get started.