There is not enough standardization in QoS implementations for a deployment of regional WANs to be successful. Again this month I am working with a global company that has deployed regional WAN carriers, connected them with an international carrier, and hopes to make real-time traffic work across this two-tiered WAN architecture. Last July I wrote about the problems debugging this kind of deployment. One would think that with all the standardization we have around IP and QoS, this approach would be solid. But that just doesn't seem to be the case.
The diagram below is a canonical drawing of the type of network I am discussing. Usually deployed by a global company, the network uses regional carriers to connect the facilities in each region back to a regional data center. These regional data centers are then connected by one of the global carriers that can provide access in all the regions in which the company does business.
Often this architecture is a direct reflection of the way the company was built: through acquisition. Company A does business primarily in one region (e.g. North America, Europe, Australia, Japan) and then purchases another company in its industry in another geography. Each part of the company already has its own network in place, so rather than restructuring the whole network, the company just binds them together with this hierarchical approach.
While this works pretty well for data applications that are primarily client-server based and predominantly local, i.e. clients are interacting with servers in their own regional data center, it does not work well for voice and video communications. Here is why.
1. To make real-time traffic work across the enterprise, the QoS policy has to be consistent across all local area networks, all regional carriers, the global carrier and the data centers. While this sounds possible, in reality it is difficult to accomplish. There is not sufficient standardization of QoS deployments today to make this plug-and-play. A concerted effort is required by a central authority (which may not exist in a conglomerated company) to get each participant to redesign or redeploy to meet the corporate standard.
2. The multiple hops of this network add latency by adding unnecessary layers of switches and routers on the global paths. This adds delay to a voice or video conversation that is already slow due to the large geographic distances and thus further degrades the quality of the conversation.
3. This design causes bandwidth scaling problems at the local data centers. Whenever voice or video demand increases between one region and the next, due to increased collaboration, increased voice or video deployment, or upgrades for higher quality communications (e.g. HD voice or video), the bandwidth in and out of the data centers has to be likewise increased. Again, a central designer must be watching this whole network and adjusting QoS bandwidth levels and/or deploying bandwidth in a timely manner to stay ahead of the bandwidth needs.
4. And lastly, as discussed in my previous article, this design is a bear to debug because there are so many different administrative domains involved in the path from one endpoint to the next.
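
Points 1 and 2 can be made concrete with a small model. The sketch below is only an illustration, not a measurement of any real network: the domain names, the per-domain delays and the remarking rule at the global carrier are all hypothetical values I made up. It walks a voice packet marked EF (DSCP 46) through each administrative domain on the path, reports the first domain where the marking no longer matches the corporate standard, and sums the one-way delay for comparison against the 150 ms guideline commonly cited from ITU-T G.114.

```python
from dataclasses import dataclass, field

EF = 46                # Expedited Forwarding code point, commonly used for voice
DELAY_BUDGET_MS = 150  # one-way delay guideline commonly cited from ITU-T G.114


@dataclass
class Domain:
    """One administrative domain on the path (all values here are hypothetical)."""
    name: str
    delay_ms: float
    # DSCP on ingress -> DSCP on egress; values not listed pass through unchanged.
    remark: dict = field(default_factory=dict)


def trace(path, dscp):
    """Return ([(domain name, DSCP on exit), ...], total one-way delay in ms)."""
    hops = []
    total = 0.0
    for domain in path:
        dscp = domain.remark.get(dscp, dscp)
        total += domain.delay_ms
        hops.append((domain.name, dscp))
    return hops, total


def first_mismatch(hops, expected):
    """Name of the first domain that hands the packet on with the wrong marking."""
    for name, dscp in hops:
        if dscp != expected:
            return name
    return None


# Hypothetical two-tiered path: site -> regional carrier -> regional data center
# -> global carrier -> regional data center -> regional carrier -> site.
PATH = [
    Domain("site-lan-a", 2),
    Domain("regional-carrier-a", 20),
    Domain("regional-dc-a", 3),
    Domain("global-carrier", 110, remark={EF: 0}),  # assumed: resets EF to best effort
    Domain("regional-dc-b", 3),
    Domain("regional-carrier-b", 20),
    Domain("site-lan-b", 2),
]

hops, total_ms = trace(PATH, EF)
print("marking lost at:", first_mismatch(hops, EF))
print("one-way delay (ms):", total_ms, "over budget:", total_ms > DELAY_BUDGET_MS)
```

With these made-up numbers the marking is lost at the global carrier, and the path comes in at 160 ms, over the 150 ms guideline. The point of keeping each domain as its own record is that each one is owned by a different party, which is exactly why getting the table consistent takes a central authority.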
I will write about my suggested approach to solving this problem in a few days' time. If you have thoughts on how to resolve these issues, leave me a comment and let's discuss it.