Big IT Crashes: Conclusions Small Businesses Should Not Draw
As experts mulled this summer's NYSE and United Airlines IT outages, they erroneously surfaced the idea that the days of five-nines availability are behind us.
This summer, due to a system outage, the New York Stock Exchange halted trading for several hours.
Think for a moment about how much that must have unsettled the investment community.
Then, think about how much more concerned they must have felt knowing that, earlier that same day, United Airlines was forced to delay almost 1,200 flights for nearly two hours due to a system outage of its own.
After the systems came back online, a swarm of pundits and experts analyzed the coincident events. Do we blame hackers? Poor systems design? Lack of government oversight? One unsettling conclusion rose to the forefront: The era of "five nines" is over, or so reported The Wall Street Journal.
What Is Five Nines?
The reliability of an IT system, including on-premises or cloud-based communications platforms, is described as a percentage of uptime. Three-nines, or 99.9% uptime, sounds impressive -- but it actually equates to more than eight hours of unplanned downtime over the course of the year.
Five-nines -- or 99.999% uptime -- means the system will have no more than six minutes of unplanned downtime over the course of the year, or less than 30 seconds per month.
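For readers who want to check the arithmetic behind these figures, here is a minimal Python sketch. It assumes a 365-day year and a month of exactly one-twelfth of a year; the function name is illustrative only and does not come from any provider's tooling.

```python
# Downtime allowed by an uptime percentage, assuming a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Return the minutes of downtime per year that an uptime level permits."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for label, pct in [("three nines", 99.9), ("five nines", 99.999)]:
    per_year = downtime_minutes_per_year(pct)
    per_month_seconds = per_year / 12 * 60  # assumes 12 equal months
    print(f"{label} ({pct}%): {per_year:.2f} min/year, "
          f"{per_month_seconds:.1f} s/month")
```

Under these assumptions, three nines works out to about 525.6 minutes (roughly 8.76 hours) a year, and five nines to about 5.26 minutes a year, or a little over 26 seconds a month.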
Both internal IT departments and external IT providers can guarantee certain application uptime levels depending on the architecture and operational procedures of the system implementing that application. External IT providers will often go a step further by backing their guarantees with financial remuneration, articulated in service-level agreements, should they fail to meet their promises.
(Some providers guarantee "100% uptime" which, in my opinion, is disingenuous. Being online 100% of the time is nearly impossible.)
Is Five Nines Now Impossible?
I can't speak for the airlines or the NYSE. The scope and complexity of their IT systems almost boggle the mind. But for the six million U.S. businesses with 1,000 or fewer employees, the era of five nines is far from over. In fact, these businesses should demand nothing less.
Prior to the cloud era, most businesses considered even three nines a pipe dream. When managed in-house, IT systems are only as reliable as internal personnel. If an email system crashed at 9 p.m., it probably wouldn't come back online until the IT person got to work in the morning and fixed the problem.
Some major providers of cloud services advertise three-nines uptime to attract smaller businesses that have retained on-premises IT and for which such reliability would be a dream come true. But the best providers are still offering five nines -- again, six minutes of yearly downtime vs. eight hours of yearly downtime -- and businesses with fewer than 1,000 employees should settle for nothing less for their critical applications.
The Hidden Cost of Downtime
One of the temptations of the cloud era is the widespread availability of freemium services, which often pull businesses into the cloud for the wrong reasons. These service providers attract customers with zero cost or rock-bottom prices, but their business models depend on converting a small percentage of free users into paid users.
In these cases, the cost of "free" is hidden -- often in support and downtime. These services can be wonderful when they work well, but providers must implement these applications on budget infrastructures since they provide little to no incoming revenue. They often use systems that lack the architecture to provide any uptime guarantees, and frequently lack the staff to provide quick recovery or high-quality support when problems do occur.
IDC has estimated that the average cost of a critical application failure is between $500,000 and $1 million per hour for Fortune 1000 companies. Even for smaller companies, the cost of application downtime can be significant. Analyst firms like Gartner offer cost-of-downtime calculators that highlight the risk in trying to save money by using rock-bottom-cost providers.
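To make the stakes concrete, the rough Python sketch below multiplies the downtime each uptime level permits by an hourly cost. It assumes a system uses its full downtime allowance each year and that every hour costs the same. The $500,000 figure is the low end of the IDC estimate for Fortune 1000 companies cited above; smaller businesses would substitute their own hourly figure. The function name is illustrative, not taken from any analyst firm's calculator.

```python
# Rough annual cost of downtime at a given uptime level.
# Assumptions: 365-day year, the full downtime allowance is used,
# and every hour of downtime costs the same amount.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours


def annual_downtime_cost(uptime_percent: float, cost_per_hour: float) -> float:
    """Return the yearly cost if downtime reaches the permitted maximum."""
    downtime_hours = HOURS_PER_YEAR * (1 - uptime_percent / 100)
    return downtime_hours * cost_per_hour


# Low end of the IDC range for Fortune 1000 companies (from the article).
COST_PER_HOUR = 500_000

for label, pct in [("three nines", 99.9), ("five nines", 99.999)]:
    cost = annual_downtime_cost(pct, COST_PER_HOUR)
    print(f"{label}: about ${cost:,.0f} per year")
```

With these inputs, three nines comes to roughly $4.38 million a year and five nines to roughly $43,800, a hundredfold difference. The ratio holds whatever hourly cost is plugged in.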
Even though some experts suggest that the world's biggest businesses should settle for less, I firmly believe that they won't, because the costs of downtime are too high.
And you shouldn't either.
If a major glitch can happen to the NYSE and United, it can happen to any business. The difference between freemium, three nines, and five nines can mean the difference between an utter catastrophe and a small bump in the road. Seek out five nines. It'll save you money in the long run.
Jonathan Levine, CTO of Intermedia, has more than 25 years of experience in IT, cloud operations, data analytics, software development, and corporate strategy. As CTO, he is responsible for the company's technology strategy, ensuring that Intermedia continues to deliver the security, availability, and value on which its customers depend.