Gary Audin
Gary Audin is the President of Delphi, Inc. He has more than 40 years of computer, communications and security...

Gary Audin | November 11, 2012


Are You Planning for a Disaster or Downtime?

Enterprises should continually review and update the procedures and technologies they need in order to mitigate downtime.

Though the human costs from megastorm Sandy continue to be severe, from the perspective of those charged with maintaining enterprise systems, the key issue for the network is downtime. Enterprises should continually review and update the procedures and technologies they need in order to mitigate downtime.

The Many Elements of Downtime
Downtime may be unplanned--for example, as a result of a disaster--or it may be planned. Planned downtime may result from implementing new or changed/upgraded hardware, new or modified/upgraded software, new or changed network connections, system/network testing, and physical site construction and modification.

As a rule, these planned downtimes are not included in the calculation of the benchmark 99.999% availability goal. The primary element for the 99.999% calculation is unexpected hardware failures.
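To put the 99.999% goal in concrete terms, the arithmetic can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes a 365-day year with planned downtime excluded, as described above. At 99.999%, the budget works out to roughly 5.26 minutes of unplanned downtime per year.

```python
# Illustrative arithmetic only: converts an availability target into a
# yearly downtime budget. Assumes a 365-day year and excludes planned downtime.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the unplanned downtime budget per year, in minutes."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.2f} minutes/year")
```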

Unexpected downtime can be due to:

* Natural disasters (severe weather, floods, earthquakes, wildfires, etc.) that knock out resources such as power and building access
* Power outages that are not related to natural disasters
* Software bugs
* Malicious behavior
* Security breaches
* Human error

There are also problems that may not be technically considered a failure but cause performance degradation. These include:

* Network overload and congestion
* Poor operating system performance
* Unstable applications and application configuration
* Data unavailability, corruption and access limitation

No matter what the downtime problem is, the enterprise should establish key performance indicators (KPIs) that measure what matters most. KPIs for downtime include:

* Time to discover a problem
* Time to diagnose the problem
* Amount of the organization that is affected
* Amount of resources needed to stop the downtime
* Time to initiate the solution(s), measured from the time the problem is discovered
* Time to complete the resolution(s), measured from the time the solutions are initiated
* Cost to resolve the downtime
* Cost the downtime produces in lost productivity, customers, reputation, etc.
* The new availability figure once the downtime is included (i.e., how far the downtime drives you down from the 99.999% goal)
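The time-based KPIs in this list can be derived from a handful of timestamps recorded for each incident. The sketch below is one possible way to do it, not a prescribed method: the `Incident` fields and function names are hypothetical, and it assumes time to diagnose is measured from discovery. The cost and scope KPIs would have to come from other records.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # All fields are hypothetical; use whatever your ticketing system records.
    started: datetime      # when the problem actually began
    discovered: datetime   # when IT or the users noticed it
    diagnosed: datetime    # when the cause was identified
    fix_started: datetime  # when the solution was initiated
    resolved: datetime     # when service was restored


def downtime_kpis(incident: Incident) -> dict[str, timedelta]:
    """Derive the time-based KPIs from one incident's timestamps."""
    return {
        "time_to_discover": incident.discovered - incident.started,
        # Assumption: diagnosis time is counted from discovery.
        "time_to_diagnose": incident.diagnosed - incident.discovered,
        "time_to_initiate": incident.fix_started - incident.discovered,
        "time_to_resolve": incident.resolved - incident.fix_started,
        "total_downtime": incident.resolved - incident.started,
    }


def availability_percent(total_downtime: timedelta, period: timedelta) -> float:
    """Availability over a period once the downtime is included."""
    return 100.0 * (1.0 - total_downtime / period)


# Made-up example: a three-hour outage measured against a 365-day year.
example = Incident(
    started=datetime(2012, 11, 1, 2, 0),
    discovered=datetime(2012, 11, 1, 2, 20),
    diagnosed=datetime(2012, 11, 1, 3, 5),
    fix_started=datetime(2012, 11, 1, 3, 15),
    resolved=datetime(2012, 11, 1, 5, 0),
)
kpis = downtime_kpis(example)
print(kpis["time_to_discover"], kpis["total_downtime"])
print(availability_percent(kpis["total_downtime"], timedelta(days=365)))
```

In this made-up case, a single three-hour outage already pulls the yearly figure well below 99.999%.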

Stages of a Downtime Problem
There are four stages to dealing with downtime:

1. A problem has occurred and no one knows yet.
2. IT and/or the users determine there is a problem. Knowing the extent of the problem is critical to its resolution.
3. IT decides what to do in response to the problem: what resources are needed, what it will cost, what steps to take, and how long resolution will take. Alternative solutions should be investigated during this stage.
4. IT performs the failover and recovery procedures and notifies the users that the problem is resolved.

Is the Backup Working?
All too often I have encountered enterprises switching to a backup solution only to discover that the backup does not work. Those in charge of the backup systems made assumptions that proved incorrect. This happened frequently for enterprises and other organizations during and after Sandy. No one had tested the backup or it had been a long time since it was tested.

On one occasion, a client's backup carrier connection did not work. The client had had no need for the backup during the three years it was in place. We subsequently discovered that the backup had never worked: it had been installed improperly and never tested by the carrier, just checked off as working. My client was able to obtain a refund for the connection for its entire life span, but this was not much of a consolation.

On another occasion, the server backup had been taken off line for some work. No one who needed to know it was off line was informed of its status. Therefore no backup existed when the primary failed. It took a few hours to learn what was wrong and restore IP PBX service.

Oversights in Downtime Planning
Downtime can be caused by small events. You don't need a megastorm for a failure. Application and server failures are far more likely to occur than a natural disaster.

If you have set up an active/active configuration for the servers, ensure that when one server fails, the other server can handle the full workload. If not, users will see performance degradation or even terminated sessions.
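The active/active check amounts to simple capacity arithmetic, sketched below. The function name and the unit (for example, concurrent sessions) are assumptions for illustration, and the sketch conservatively assumes the largest server is the one that fails.

```python
def survives_single_failure(server_capacities: list[float], peak_load: float) -> bool:
    """True if the servers left after losing the largest one can carry peak load."""
    if len(server_capacities) < 2:
        return False  # no surviving partner to fail over to
    remaining = sum(server_capacities) - max(server_capacities)
    return remaining >= peak_load


# Hypothetical numbers: two servers rated for 1,000 sessions each.
print(survives_single_failure([1000, 1000], peak_load=900))   # survivor can cope
print(survives_single_failure([1000, 1000], peak_load=1400))  # survivor is overloaded
```

Two servers that each run at 70% of capacity look healthy in normal operation, yet neither can absorb the other's load after a failure.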

Monitoring the KPIs means more than detecting failures quickly. Degradation in a KPI should also trigger an alert as it approaches a poor performance level, before there is a real problem. Solving the problem at that stage may keep users from ever seeing degraded performance.
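One common way to alert on degradation before failure is a two-level threshold: a warning level well short of the point where users are affected, and a critical level at that point. The sketch below assumes a metric where higher is worse (such as link utilization); the names and threshold values are hypothetical and would need tuning for each KPI.

```python
from enum import Enum


class Status(Enum):
    OK = "ok"
    WARNING = "warning"    # degrading: alert before users notice
    CRITICAL = "critical"  # failure or unacceptable performance


def classify(value: float, warn_at: float, critical_at: float) -> Status:
    """Classify a 'higher is worse' metric against two thresholds."""
    if warn_at >= critical_at:
        raise ValueError("warn_at must be below critical_at")
    if value >= critical_at:
        return Status.CRITICAL
    if value >= warn_at:
        return Status.WARNING
    return Status.OK


# Hypothetical thresholds: warn at 70% link utilization, critical at 90%.
for utilization in (45.0, 78.0, 93.0):
    print(utilization, classify(utilization, warn_at=70.0, critical_at=90.0).value)
```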

The enterprise should also monitor for application software issues and data corruption. The backup hardware and software should be monitored as closely as the primary.

Testing the backup procedures and operation is extremely important; test frequently. I had one client that used their electrical generators for 24 hours every weekend as a live power source. Turning a backup on for a few minutes is not enough. Backup failure can occur once a full load is applied to the system.
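A simple record of each backup test makes it possible to flag tests that are too old, too short, or not run under full load. The sketch below is only one way to express such a policy; the record fields, the function name, and the 30-day and 4-hour limits are assumptions, not recommendations from any standard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupTest:
    # Hypothetical record of the most recent test of one backup system.
    performed: datetime
    duration: timedelta
    full_load: bool  # was the backup carrying the real production load?


def backup_test_is_adequate(
    test: BackupTest,
    now: datetime,
    max_age: timedelta = timedelta(days=30),       # assumed policy value
    min_duration: timedelta = timedelta(hours=4),  # assumed policy value
) -> bool:
    """True only if the test was recent, long enough, and run under full load."""
    recent = now - test.performed <= max_age
    long_enough = test.duration >= min_duration
    return recent and long_enough and test.full_load


# A recent ten-minute spot check does not pass, even under full load.
spot_check = BackupTest(datetime(2012, 11, 3), timedelta(minutes=10), full_load=True)
print(backup_test_is_adequate(spot_check, now=datetime(2012, 11, 11)))
```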

Don't select cheaper backup components. They should be equal in quality and reliability to the primary systems. Fully understand the assumptions made when coping with downtime--for example, beware of assuming that the mobile network will be operating when the wired network fails. Mobile services failed over a wide area during Sandy.

Practice, Practice, Practice
Practice the backup procedures live, not on paper. Don't let anyone know that a backup test will be performed. I had a client that warned all the users that a backup test was to be performed the next Monday. On Friday, most of the employees downloaded their work onto their PCs so they would not be affected by the backup procedure. This defeated the purpose of the backup test, as it did not demonstrate that the backup and procedures would work properly under a full load.

Most of the time, enterprise management does not allow full failover operation, as it will disrupt the business. This is not so smart when a big problem occurs and the failover does not work as expected.

