Pondering a 'Third Way' for IT & Network Management
Proposed by a BT research exec, the Third Way of Management accounts for the virtualization of today's network and IT environments.
The notion of network management is as old as the notion of networks. I did a survey of enterprises back in 1982, and even then more than 80% of respondents cited network management as a problem. That's interesting because, asked again in the spring of 2014, that same question generated almost exactly the same response.
Now we're adding IT into the mix. In the service provider space, network functions virtualization (NFV) proposes to substitute software and hosting for custom network devices. In the cloud, networking and IT combine to create an experience that the user sees simply as a "cloud service." IT management has been a problem for even longer than network management, and if the two management needs are combining, then do we have any hope those survey numbers will ever improve?
Maybe there is hope. As Chris Bilton, BT's director of research and technology, recently said in a press interview, he thinks the time has come to admit that we can't face the future through either traditional network management or traditional IT management. He proposes a "Third Way of Management" that blends bits of the two. What might that look like, and could we ever get there? Let's see.
Who Does the Fixing?
All of management is really about fault tracing and problem resolution. Something breaks and you fix it. What's made things complicated in recent years has been the increased focus on virtualization, because virtualization means that the "something" that breaks isn't the usual solid old router or switch but some virtual feature running somewhere and connected via something. Just figuring out what it is confuses buyers, and deciding if it's broken or not seems almost out of reach.
Then there's the fixing part. You roll a truck to replace or repair a box, but you can't send a real service tech to fix a virtual problem. Most problem responses in the current age are really about remediation at the service level, the assignment of substitute resources to stand in for what's not working so the service really doesn't break at all.
Even our pronoun is at risk. "You" fix it? Today, any human intervention is seen as a prohibitive cost, so what actually has to determine if something breaks is an automated process, and another such process has to do the fixing. Humans might end up getting sent to do something with physical infrastructure, but by that point their automated servants have taken the faulty element offline and replaced it so there's no service outage. The tech is tidying up.
A Sign of What's to Come
All this may seem irrelevant but it's really the signpost to that Third Way of Management Bilton mentions. We have highly dynamic resource systems, multitenant in nature, that are pressed into service to perform some temporary function we'd call a "service feature" or an "application." From the pressing into service to the eventual end of the experience, users are served by a Third Way toolkit, because that's the only way to give them the kind of flexibility and efficiency they want.
Management of this sort starts with things to manage, though, and the Third Way has to manage abstractions. Applications and network services can be defined as a series of abstractions. We do that all the time when we draw diagrams to show network connectivity or application component parts. A set of tools can then take a given abstraction and deploy it to make it real. That's what IT management does with DevOps tools during the life of an application, and what NFV architects or software-defined network planners would do in the networking future.
Where we have a problem here isn't as much in the deploying as in that "something breaks and you fix it" stuff. The Internet has demonstrated that we can eliminate most of classical management by eliminating any specific service-level agreements (SLAs). We have a pool of stuff, we size the pool for a given load of users, and we fix stuff that breaks in the pool without worrying about what happened to service, because the user will re-click on the URL or try the Skype call again if nothing happens. But that doesn't work where we do need an SLA, or anywhere else that "best effort" isn't good enough.
All About the Virtual
The critical piece, and the piece that's missing, in the Third Way isn't how we look at services or applications. We know they're abstractions that we can deploy using tools or scripts. The missing piece is that "breaks and gets fixed" notion that IT and networking have had all along, and still have everywhere that best efforts won't cut it. The Third Way of Management, the one we're seeking, is all about virtual, but its problem is that once we realize our abstractions we lose the connection between them and the service.
In a complex system of devices or application components, "working" means that everything is working. In other words, the status of what the user buys or runs (the application or the service) is the combined status of all those little deployed pieces and all the virtual pathways that connect them. Deployment scripts compose the application or service, and they have to compose the management pathways as well, showing how the status of every high-level element, all the way up to the service or application itself, can be derived from the status of whatever is below. Status has to flow the way traffic or work flows, or we won't be able to tell whether the service is broken, let alone fix it.
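To make the idea of status flowing upward concrete, here is a minimal Python sketch. It is not drawn from any BT, NFV, or DevOps tool; the `Element` and `Status` names, the example service, and the "worst status wins" rule are all assumptions chosen to mirror the point that "working" means everything is working.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical status levels; a higher value is worse."""
    UP = 0
    DEGRADED = 1
    DOWN = 2


@dataclass
class Element:
    """A deployed piece of a service, or the service itself (hypothetical model)."""
    name: str
    own_status: Status = Status.UP
    children: list["Element"] = field(default_factory=list)

    def status(self) -> Status:
        # Assumed rollup rule: an element is only as healthy as the
        # worst of itself and everything deployed beneath it.
        worst = self.own_status
        for child in self.children:
            worst = max(worst, child.status())
        return worst


# The deployment step would build this tree alongside the service itself.
firewall = Element("virtual-firewall")
path = Element("virtual-path")
service = Element("vpn-service", children=[firewall, path])

before = service.status()
print(before.name)   # UP

# A fault in one virtual piece surfaces at the service level.
path.own_status = Status.DOWN
after = service.status()
print(after.name)    # DOWN
```

A real system would probably need richer rules than "worst wins," since a redundant component can fail without taking the service down, but the shape is the same: the tree that describes the service also describes how to read its health.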
The Third Way of Management is the "virtual way," the way that says that when you build anything from abstract parts that are then loosely bound to real resources, you have to build everything, even the management. And because you build management from virtual principles, it becomes not only the Third Way but the ultimate way.
Follow Tom Nolle on Google+!