
Cloud Outages are the Bigger Risk



Recently, Brandon Butler of InfoWorld reported 'Cloud security: Outages are bigger risk than breaches'. He observes that as the IT industry moves to the cloud, "the biggest concern should not be that data might be compromised in the cloud, but rather that a cloud outage could lead to data loss."

Butler reports the conclusions of Gartner cloud security analyst Jay Heiser: there is a perception that the most significant risk in using the cloud is that sensitive data can be leaked. However, Heiser says that cloud outages and data loss are more common today. Even more alarming, many enterprises are not well prepared for those incidents.

Below we explore how pervasive outages and downtime are in the modern data center, and their consequences and impact on the business.

Downtime Impact on Reputation and Loyalty

A recent Gartner study projected that "Through 2015, 80% of outages impacting mission-critical services will be caused by people and process issues, and more than 50% of those outages will be caused by change/configuration/release integration and hand-off issues." Taken together, those figures imply that more than 40% of all mission-critical outages would trace back to change, configuration, release integration and hand-off issues. The fallout from the Amazon cloud outage added to the fear surrounding cloud security and downtime. As Amazon scrambled to get its cloud services back online, many customers questioned the reliability of the cloud, Amazon's communication around the outage, and whether they would be compensated for the downtime under their SLA.

Downtime, Outages And Failures - Understanding Their True Costs 

Stay Ahead of Cloud Outages

With this complex ecosystem of systems and services involved, the new point of failure becomes complexity itself. Some argue that in complex systems, failure is inevitable. Instead of assuming that providers will eventually make all of these interconnected systems and service offerings infallible, it is more prudent to assume that there will at some point be a catastrophic outage. 

10 Realizations On Cloud Outages 

IT Operations in the Cloud

IT operations in the cloud introduce many configuration management challenges: a growing number of dynamic changes, more complexity, and a faster pace of change spurred by business demand, all further complicated by the limited visibility that comes with infrastructure abstraction. The rapid pace of change that cloud computing supports makes it a major challenge for the enterprise to drive high-powered change while remaining firmly in control.

PCWorld added "Forrester Research analyst James Staten said that PaaS (platform as a service) clouds such as Azure are very complex and highly automated environments, and sometimes glitches crop up in production that can't be anticipated in test environments. This appears to be one of those cases." 

Configuration Error Brings Down The Azure Cloud Platform 

Even the biggest in the business suffer downtime and outages

What Happened: The trigger for this event was a network configuration change. A subset of the Amazon Elastic Block Store ("EBS") volumes in a single Availability Zone within the US East Region became unable to service read and write operations, making them "stuck" volumes. Instances trying to use the affected volumes also got "stuck" when they attempted to read from or write to them. As with any complicated operational issue, this one had several root causes interacting with one another, which gave Amazon many opportunities to protect the service against similar events recurring. The changes that Amazon made provide protection against a repeat of this event.

10 Devastating Outages And Failures Of Major Brands In 2011 

Can You Gain Insight To Stay Ahead Of Unplanned Outages?

Unplanned IT outages always cost money. A gap exists, and is widening, between the cost of unplanned downtime and an organization's ability to respond effectively with traditional activities, according to the aforementioned report. Traditional responses, including server reboots and retrying restoration steps that have already failed, compounded by incident management's limited understanding of the technical environment, do indeed slow restoration and cost the business money.

I Want Advance Notice Of Any Unplanned Outages!

Your Turn
How are you prepared for a cloud outage?

About the Author
Martin Perlin