Bryan Johns
Bryan Johns is the Community Director for Digium, the Asterisk company. In this role, Bryan works globally to foster growth...

Bryan Johns | May 12, 2011


Business Telecom Continuity Strategies in the Cloud Era

The Amazon EC2 "cloudpocalypse" taught us all some lessons, and tornadoes in Alabama had a specific impact on Digium's BC strategies.

I'd like to begin this conversation by recognizing that this is neither the first nor the last blog post that you're likely to read on this topic. The big "oops" in the Amazon EC2 cloud in early May of this year has caused many to consider the potential shortcomings of public clouds with respect to business continuity. The EC2 outage was due to an apparent "fat-fingering" of a single router port configuration. We at Digium, meanwhile, were recently confronted with a six-day power outage at the hands of Mother Nature, who hammered our home town of Huntsville, Alabama with a barrage of tornadoes on April 28, 2011, knocking down the Tennessee Valley Authority's power grid across the entire northern half of the state. While these two events were very different in type and duration, both served as examples of how business continuity strategies work, and fail to work, in disastrous circumstances.

Because open source IP communications systems like Asterisk and Asterisk SCF can be deployed as software on almost any hardware and/or network configuration, they can be as attractive for cloud deployments as they are for premises deployments. However, the business continuity strategies for these two styles of deployment are decidedly different. Let's take a moment to look at both strategies through the lens of the Amazon EC2 outage and the Huntsville, Alabama storms respectively.

Surviving a "Cloudpocalypse"
Amazon's EC2 utility computing platform has become synonymous with cloud computing. It was one of the earliest and is one of the largest cloud computing infrastructures in existence. A wide variety of commercial web applications and services are hosted on EC2, including many VoIP telephony applications. The typical EC2 customer has come to expect that uptime and survivability are part of what they are buying in this environment.

The events in early May this year proved that no infrastructure is foolproof to a sufficiently talented fool: an engineer hit the enter key on an errant router configuration and triggered a cascading, systemic failure of Amazon's public cloud. For the duration of this disruption, any customer of the affected infrastructure who did not have redundant systems running on a different network or in a different infrastructure was offline. The outage affected a number of notable web properties deployed from EC2, including Foursquare and Quora.

The EC2 outage highlights the need for diversity at both the infrastructure layer and the network layer. If you're running your company's VoIP systems in a public cloud (not recommended), you must achieve infrastructure diversity by enlisting a backup cloud in separate infrastructure. One example would be an EC2 deployment backed up by a GoGrid deployment. It is important to confirm that the primary and the backup clouds are connected to the Internet through different primary providers in order to avoid a single point of failure at the network layer.
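As a rough illustration of the diversity check described above, here is a minimal Python sketch that compares a primary and a backup deployment and reports the layers at which they still share a single point of failure. The `Deployment` class, the `shared_failure_points` function, and the carrier names are all hypothetical, invented for this example; a real audit would need actual data from your cloud and network providers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of one VoIP deployment."""
    name: str
    cloud_provider: str   # e.g. "EC2" or "GoGrid"
    upstream_isp: str     # primary Internet provider of that cloud (assumed known)


def shared_failure_points(primary: Deployment, backup: Deployment) -> List[str]:
    """Return the layers at which both deployments depend on the same thing."""
    shared = []
    if primary.cloud_provider == backup.cloud_provider:
        shared.append("infrastructure")
    if primary.upstream_isp == backup.upstream_isp:
        shared.append("network")
    return shared


# Illustrative values only: different clouds, but the same upstream carrier.
primary = Deployment("voip-primary", "EC2", "carrier-a")
backup = Deployment("voip-backup", "GoGrid", "carrier-a")
print(shared_failure_points(primary, backup))
```

In this made-up case the two clouds are different, so the infrastructure layer is diverse, but both reach the Internet through the same carrier, so the network layer is flagged.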

Surviving a Natural Disaster
A number of weather events in recent times have highlighted the importance of having geographic diversity between primary and backup infrastructure. When Digium's headquarters lost power as the result of the tornadoes that struck Huntsville, Alabama in late April, the company's business continuity plan was enacted to allow the company to continue to operate all of its critical systems, including its Switchvox VoIP appliances, using emergency power supplied by a diesel generator. However, the duration of the outage raised concerns that the generator would not have enough fuel to last until power was restored, because without power, gas stations can't pump more diesel fuel.

Digium was fortunate to be able to fuel its generator and stay in business for the entire six-day power outage. However, plenty of other companies in the area were not so lucky and had to close their doors, with their telecom infrastructure offline until the power came back on. The localized nature of the weather event emphasizes the importance of geographic diversity for primary and secondary critical systems in a business continuity strategy.

If your primary business telecommunications infrastructure is run within your company's headquarters, you need a backup solution run in a different geographic location. This can be accomplished by using a SIP PBX in a different company facility in a different state, trunked to the same SIP trunking provider, or by deploying a backup within a cloud. Alternatively, many modern VoIP PBX platforms allow calls destined for extensions to be routed out to mobile devices when the user's desk phone is not available. As long as you can keep your VoIP switch online, you can get calls to where they need to go.
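The fallback behavior described above can be sketched as an ordered list of destinations per extension, tried in turn until one is reachable. This is only an illustration in Python, not how Asterisk, Switchvox, or any particular PBX implements it; `route_call`, the `FALLBACK_TARGETS` table, the example hostnames, and the phone number are all hypothetical.

```python
from typing import Callable, Dict, List, Optional


def route_call(
    extension: str,
    fallback_targets: Dict[str, List[str]],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first reachable destination for an extension, or None."""
    for target in fallback_targets.get(extension, []):
        if is_reachable(target):
            return target
    return None


# Hypothetical routing table: headquarters PBX first, then a backup PBX
# in another location, then the user's mobile phone.
FALLBACK_TARGETS = {
    "1001": [
        "sip:1001@hq-pbx.example.com",
        "sip:1001@backup-pbx.example.com",
        "tel:+15555550100",
    ],
}

# Simulate the headquarters PBX being offline during a power outage.
offline = {"sip:1001@hq-pbx.example.com"}
chosen = route_call("1001", FALLBACK_TARGETS, lambda target: target not in offline)
print(chosen)
```

With the headquarters PBX marked offline, the call falls through to the backup PBX. If that were also down, the same loop would hand the call to the mobile number.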

Lessons Learned
I am sure that the Amazon EC2 "cloudpocalypse" has caused many of Amazon's customers to take a long, hard look at their business continuity strategies. The obvious issue for these customers is how to overcome their single-vendor dependency on Amazon without spending the same money twice every month. Additionally, I would expect that Amazon is scrambling to improve its own infrastructure to cope better with broad infrastructure disruptions, so that it can recover the trust of its customers.

At Digium, the weather event that took out power for almost a week opened our eyes to opportunities for improvement in our core telecommunications infrastructure. There are a number of ways that we can better utilize our San Diego facilities as backup capacity for our Huntsville headquarters in order to achieve geographic diversity. Digium can also make more extensive use of cloud operations as backup for our central communications platforms as a means of enhancing infrastructure diversity.

It is not possible to predict the future and see what kinds of issues might create the need for a solid business continuity strategy. However, if your company relies upon its telecommunications infrastructure to sell its products, communicate with its customers, or collaborate internally, you must ask yourself what it costs to be without these tools for hours, days, or even weeks, and then you must plan accordingly.

