Planning for Disaster is Planning to Succeed
Business continuity in the face of disaster is achievable through small, medium, and large steps.
"There's nowhere you can be that isn't where you're meant to be."
― John Lennon
As big a Beatles fan as I am, I really don't believe that. Yes, we can influence where we are "meant to be" through our actions and attitudes, but I am a firm believer that some things just happen. Bad things. Good things. Accidents are just that. Accidents.
However, that doesn't excuse us from being prepared for whatever the world might throw our way. If you drive a car, you take out insurance. If you ride a bike, you wear a helmet. You don't necessarily plan for the worst, but you protect yourself in order to survive the worst while you are out there minding your own business.
The same holds true here in the world of communications. Despite the best intentions of the manufacturers, designers, installers, service providers, IT managers, and technicians, the best-laid plans sometimes go awry. Equipment fails, lines are cut, supposedly innocuous changes take down entire systems, and Mother Nature has a nasty habit of reminding us who's boss. Tiny disruptions can be annoying, while major interruptions in service result in lost revenue, disgruntled customers, lost productivity, and in the most extreme cases, injury or death.
I recently assisted a major contact center in conducting a thorough business risk assessment, and the results were sobering. The company knew that its contact center was a significant source of revenue, but it didn't realize just how valuable it was. After analyzing months of contact center and sales records, we determined that the company would lose upwards of $36,000 for every hour the contact center was down. Since it runs an international, 24/7 business, that comes to approximately $700,000 a day. Clearly, a catastrophic outage of days or weeks would wreak havoc on its bottom line and could even jeopardize its ability to stay in business.

Problems, Problems, Problems
The reasons for lost communications are vast and varied. I recently came across an article from Level 3 that presented The 10 Most Bizarre and Annoying Causes of Fiber Cuts. While some are quite laughable, the damage they caused was not.
While I wasn't surprised at what human mistakes and Mother Nature can do to underground and aboveground cables, who would have thought that 17% of Level 3's damage was caused by squirrels? Yes, you read that right. Those cute and furry critters that romp through trees and drive dogs wild are responsible for a significant number of lost telephone calls. I will never look at those buck teeth the same way again.
The next most surprising statistic was that 7% of Level 3's annual outages were caused by people using fiber cables for target practice. The article went on to say that many of these shot-up cables were in tough parts of town, and repair technicians required bodyguards on their service calls. You would have thought that shooting up stop signs would be enough, but there appears to be an attraction to fiber that I just don't get.
In addition to cut, gnawed, and shot-up cables, problems can occur in network and on-premises equipment. These failures come in many different flavors -- software, hardware, and configuration.

Planning for the Inevitable
The building blocks of a fully resilient solution are many, and the cost and complexity of adding additional 9's depend on how much risk you are willing to accept.
While this is not an exhaustive list, I consider the following as table stakes:
- Eliminate as many single points of failure as possible. This includes duplicated, high-availability instances of routers, call servers, networks, session border controllers, session management servers, etc. The goal is to be able to take a sledgehammer to anything without losing your ability to make and receive calls.
- Consider multiple carriers. It's extremely rare that two carriers stop providing service to a particular area at the same time. Even in huge natural disasters such as Hurricane Sandy, some carriers stayed up while others faltered.
- Spread your 800 numbers across multiple carriers. I am a big fan of using a third party RespOrg, such as ATL Communications, to manage your 800 numbers. This allows you to develop number distributions that save money during sunny days (least cost routing, etc.) and keep your agents and customers talking during rainy days (carrier failover).
- Create network failover paths that support both wired and wireless networks. This is most applicable to branch offices that can use wireless technologies such as 4G/LTE in disaster situations.
- Move to SIP. In addition to saving money, SIP supports robust disaster recovery scenarios not possible with TDM.
- Look to the cloud. The resiliency possibilities that cloud-based communications offers are myriad. You might use the cloud strictly for business continuity during major outages, or you might choose to move everything off premises. A mix-and-match approach might include provisioning trunks from traditional carriers such as AT&T and Verizon along with Internet trunks from the likes of Twilio. The cloud comes in many different flavors, and it's important to choose the one that is right for your business.
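The carrier-failover idea running through the list above can be sketched in a few lines of code. This is a minimal illustration, not a product implementation: the trunk names and priorities are hypothetical, and a real deployment would drive health status from SIP OPTIONS pings or carrier alarms rather than a boolean flag.

```python
from dataclasses import dataclass

@dataclass
class Trunk:
    """One outbound path: a carrier SIP trunk, an Internet trunk, or a wireless backup."""
    name: str
    priority: int        # lower number = preferred path (e.g., least-cost route)
    healthy: bool = True # in practice, set by periodic health checks

def select_trunk(trunks):
    """Return the highest-priority healthy trunk, or None if every path is down."""
    candidates = [t for t in trunks if t.healthy]
    return min(candidates, key=lambda t: t.priority) if candidates else None

trunks = [
    Trunk("primary-carrier-sip", priority=1),
    Trunk("secondary-carrier-sip", priority=2),
    Trunk("lte-backup", priority=3),
]

# Sunny day: calls ride the cheapest, preferred carrier.
assert select_trunk(trunks).name == "primary-carrier-sip"

# Rainy day: a fiber cut (or a squirrel) takes out the primary,
# and traffic shifts to the next healthy path automatically.
trunks[0].healthy = False
assert select_trunk(trunks).name == "secondary-carrier-sip"
```

The same priority-with-health-check pattern applies whether the failover is between wired carriers, from TDM to SIP, or from a branch office's wired circuit to 4G/LTE.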
I would be overjoyed if the answer to every problem was "all you need is love," but unfortunately, love isn't all you need. To keep your lines of communication open, your contact center agents busy, your customers happy, and your accountants counting all the money you are making, you need to plan, prepare, and build for disasters. Your losses may not be in the tens of thousands of dollars an hour, but significant money will be lost when the phones stop ringing and customers stop buying.
The point of this article is that business continuity in the face of disaster is achievable through small, medium, and large steps. The approach that one enterprise takes may be different than that of another, but the end result of survivability is the same.
There will always be squirrels with a taste for fiber and yahoos with itchy trigger fingers, but you get to decide how they will affect your operations – a little, a lot, or not at all. I know what I would choose.
Andrew Prokop writes about all things unified communications on his popular blog, SIP Adventures.