Cloud: The Answer to a Good Night’s Sleep for the IT Manager?
Cloud benefits aren't white and fluffy, but rather they are hard, significant ones.
During my many years as an IT manager, many thoughts kept me up at night. This article focuses on improving the quality of sleep of the bloodshot-eyed IT manager and on determining whether the cloud can assist in this lofty endeavor.
Human Single Points of Failure
Most organizations have a single subject matter expert on critical technologies. This isn't necessarily by choice, since the skillset to design, build and maintain communications infrastructures is rare and difficult to find. If that subject matter expert leaves the company or moves on to another role within the company, they generally leave behind a gaping hole.
As a manager, I worried about this constantly, and at times it literally kept me up at night. In fact, a few times early in my career, when I was an engineer, I was that single point of failure. At one point I moved on to another role within the company I worked for, and since my management couldn't find anyone to replace me after a multi-month search, they ended up bringing in an outsourcing partner to manage the communication system I used to manage.
Moving to the cloud can alleviate this stress because cloud providers have teams of subject matter experts on staff, thereby negating the risk associated with one subject matter expert leaving the company.
Infrastructure Environment Build Time
In most organizations, particularly in large enterprises, bringing up a new environment is a very time-consuming process. Usually there is an overall governance process, along with other processes, paperwork and lead times for server builds, database builds, application builds, QA and testing. Often a project manager is assigned to manage the overall process.
In my experience, it generally takes 6 months to bring up a new environment for an existing application, and 12 months or more to bring up a new application. This was true regardless of which company I worked for, and my internal business customers were constantly frustrated with the amount of time it took for the IT department to implement technology solutions for them.
Cloud providers are able to spin up environments for their customers much more rapidly. They generally measure their implementation time in days or weeks rather than months, thereby improving customer satisfaction.
One of the most common, time-consuming and tedious initiatives in IT organizations is the system upgrade. Teams of people get together for numerous meetings spanning multiple weeks in order to plan for system upgrades, and the upgrades themselves are even more time-consuming, spanning months or even years.
In addition, an organization may not focus on currency initiatives (keeping its systems on current, supported versions), or may not wish to dedicate the financial and human resources required for currency. These companies may not realize they should have upgraded until they need a patch written to fix a production issue or to meet a business need, only to find out that their system has reached end of support.
With cloud infrastructures, the upgrades come along with the cloud, and you don't have to spend countless hours working on the upgrade. The only area of caution is to ensure that you have adequate test environments with your cloud provider so that you can properly test an upgrade before it migrates to production.
Enterprises generally build out systems to meet defined capacity requirements. If they need to scale an environment up, this can be a very time-consuming process (on the order of months), usually similar in nature to bringing up a new environment. My business customers were often baffled at how long it would take to simply add capacity to a system.
Clouds are generally designed to scale up or down very rapidly. Depending on the application and the provider, they may even have "elastic" capabilities, which allow you to dynamically expand and contract capacity as required. Your business customers will appreciate the speed at which you can offer scalability.
Enterprise applications are designed for a certain level of availability such as 99.5%, 99.9%, 99.99%, etc. The reality is that many factors can cause availability to fall short of the published metric, and often applications don't perform at the level where they should be performing. Below are a number of areas that can impact system stability:
* Disaster Recovery Testing: Many companies discuss DR testing, but DR tests aren't performed nearly as often as they should be. Or, if a DR test is performed, applications are shut down somewhat gracefully, as opposed to the abrupt nature of a true disaster. During disasters, data center outages, or partial system outages, systems often do not fail over as expected and outages end up occurring. In my experience over almost two decades with Fortune 500 companies, I've participated in fewer than half a dozen DR tests, and experienced three actual disasters. The applications I was responsible for did not recover as expected during the true disasters due to incomplete and infrequent DR tests.
* Stress Testing: Stress testing is commonplace for many applications, but can be difficult and costly to perform in the communications world. For example, most companies don't have the ability to drive hundreds or thousands of calls into their infrastructure, so they need to rely on third-party companies to assist. Using a third party can be costly, yet regardless of complexity or cost, it is still important to perform stress tests, because these tests can uncover vulnerabilities in an environment. Nevertheless, companies often choose to use their resources for initiatives other than stress testing. Unfortunately, it is in times of high volume that system availability is most important, and that is when many issues are encountered due to a lack of stress testing.
* Overall system and regression testing: Initial testing usually takes place when a system is put into production, but often, after changes are made to environments, full regression tests don't take place as frequently as they should. This can result in unexpected outages in production.
* System backups: This one should be a no-brainer, but surprisingly enough, even backups aren't always performed as they should be. I know of a Fortune 500 company that lost thousands of call recordings due to a maintenance glitch, only to discover that the servers weren't being backed up properly, so the recordings were lost forever.
Competent cloud providers know and understand that system stability is their number one job. Therefore, they provide infrastructures that are likely to meet or exceed availability targets on a regular basis.
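To put the availability levels mentioned above in perspective, it helps to translate each percentage into a yearly downtime budget. The following is a minimal sketch of that arithmetic, assuming a 365-day year and counting every hour as in-scope (real service agreements often exclude planned maintenance windows). The function name `allowed_downtime_hours` is illustrative only.

```python
# Assumption: a non-leap year, with all 24 hours of every day counted.
HOURS_PER_YEAR = 365 * 24  # 8760


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # The unavailable fraction of the year is whatever the target leaves over.
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.5, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability allows about {hours:.2f} hours "
              f"({hours * 60:.0f} minutes) of downtime per year")
```

Under these assumptions, 99.5% allows roughly 43.8 hours of downtime per year, 99.9% allows about 8.8 hours, and 99.99% allows less than one hour. A single failed failover or missed backup can consume that budget quickly.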
As you can see, there are many benefits to migrating applications to the cloud. The cloud benefits aren't white and fluffy, but rather they are hard, significant ones. As more applications become available from cloud providers, enterprises have more choices when it comes to on-premises vs. cloud and should take a serious look at what the cloud providers have to offer.