What To Do When the Cloud Goes Down


Learn from the mistakes of others. It’s cheaper.

In December of 2012, two of the largest-ever cloud outages had businesses and service providers rethinking their decision to move their databases online. On Christmas Eve, a backend failure (which is a real thing) immobilized Amazon’s cloud-based infrastructure, taking Netflix (among others) down with it and preventing millions from digesting turkey while watching The Walking Dead. A few days later it was Microsoft’s turn. On December 28th, 2012, Windows Azure took a sick day that affected users in most of the South Central United States, leaving many unable to access their online databases for more than 24 hours.

It was Voltaire who said “doubt is not a pleasant condition, but certainty is absurd,” and I think he’s right. Power goes out and online databases go down, even for the big guys. It’s not a matter of “if,” but “when.” So what does that mean? It means the biggest thing in your control is how you handle it. The question, then, is how suppliers and businesses should prepare for and respond to the inevitable outage. Migrating your business and databases to an online platform in the cloud is no guarantee that your business won’t suffer, but preparing for some downtime can go a long way toward ulcer prevention.

Don’t do this

During the Windows Azure debacle, repair updates did not reflect correct information regarding service restoration, leaving users and businesses peeved and in the lurch. Soluto, a web service focused on PC enhancement and one of many companies affected by the outage, reached out to customers in frustration, saying, “For over 24 hours now, we’re down. It’s horrible. Seeing Google Real-Time Analytics show this image is… well… heart breaking at best and murderous-thoughts-invoking at worst.” … “We know people are working hard and around the clock to fix this failure, so instead of complaining, we decided to send our community to transmit positive karma in the direction of the people spending their weekend restoring the service instead of with their families.” Although a bit dramatic, these sentiments seem to adequately convey the emotional roller coaster that is an online database outage.

Do, do that

Despite the long interruptions and reportedly sloppy responses from Amazon and Microsoft, most service providers do (and should) build in contingency plans to protect their users’ online databases. Multiple availability zones are exactly what they sound like: a series of data centers, or “availability zones,” physically set up at different locations and programmed to pick up the slack of a fallen comrade in the event of an emergency. For companies transitioning from old-fashioned in-house servers and hardware, implementing a hybrid setup can be a lifesaver. Although data vulnerability is increased when databases are partially online and partially in-house, companies can carry out their own hardware backups and build contingency systems to be at the ready. Businesses using a hybrid system are wise to include some form of daily or weekly backup or data caching, particularly if they are on their way to fully migrating their databases online.
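For the curious, here is roughly what “picking up the slack of a fallen comrade” looks like from the application’s side. This is a minimal sketch, not any provider’s actual API: the zone names, the query_with_failover helper and the fake_query stand-in are all made up for illustration.

```python
from typing import Callable, Sequence


class AllZonesDown(Exception):
    """Raised when no availability zone could serve the request."""


def query_with_failover(zones: Sequence[str], run_query: Callable[[str], str]) -> str:
    """Try each zone in order and return the first successful result."""
    errors = []
    for zone in zones:
        try:
            return run_query(zone)
        except ConnectionError as exc:
            # This zone is unreachable; note it and let the next one pick up the slack.
            errors.append(f"{zone}: {exc}")
    raise AllZonesDown("; ".join(errors))


def fake_query(zone: str) -> str:
    # Hypothetical stand-in for a real database call; pretends zone-a is mid-outage.
    if zone == "zone-a":
        raise ConnectionError("zone-a is unreachable")
    return f"rows served from {zone}"


result = query_with_failover(["zone-a", "zone-b", "zone-c"], fake_query)
print(result)  # rows served from zone-b
```

In real life the provider’s load balancers and database replication usually do this for you, but the idea is the same: no single location gets to be the only one holding the lights on.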

Stan Klimoff, Director of Cloud Services for Grid Dynamics, has a few tips for readying online databases and systems for an outage. Running a disaster scenario lets businesses know where the vulnerabilities are and what expertise and time are required to fix them. Systems designed with failure in mind empower teams to prepare a coordinated response and recover more quickly. This is an expensive exercise, but designing and testing with disaster in mind can have a huge impact when there is a power outage, service interruption or natural disaster… like an earthquake… in Japan. It could happen.
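A disaster scenario does not have to start big. As a toy illustration (my own sketch, not Klimoff’s method), the drill below pretends the primary database is gone and reports what a backup copy could still serve. The table names, the row counts and the run_outage_drill helper are all hypothetical.

```python
def run_outage_drill(primary: dict, backup: dict, keys: list) -> dict:
    """Pretend the primary store is down and check what the backup can still serve."""
    served = [k for k in keys if k in backup]
    lost = [k for k in keys if k not in backup]
    # The drill harness can still peek at the primary to spot out-of-date backup data.
    stale = [k for k in served if k in primary and primary[k] != backup[k]]
    return {"served": served, "lost": lost, "stale": stale}


# Made-up row counts per table; the backup is missing one table and lags on another.
primary = {"orders": 120, "customers": 45, "invoices": 80}
backup = {"orders": 118, "customers": 45}

report = run_outage_drill(primary, backup, list(primary))
print(report)
```

Here the drill would flag invoices as lost and orders as stale, which is exactly the kind of vulnerability you want to find on a quiet Tuesday rather than on Christmas Eve.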

Companies must also have an in-depth understanding of the roles, responsibilities and limitations indicated in their service agreement. In addition, service providers should demonstrate to clients that their online database storage and management systems are robust, current and include contingencies and fail-safes.

Online database outages are becoming a fact of life, like death, taxes and Oprah. They’re going to happen. The only question is: what will you do when the lights go out?

 
