Defining Availability Requirements for Cloud Computing

Overview

In my article about Security and Privacy in Public Cloud Computing, I described a number of concerns that the US government has expressed about using public cloud computing service providers. The National Institute of Standards and Technology (NIST), part of the US Department of Commerce, published Special Publication 800-144, Guidelines on Security and Privacy in Public Cloud Computing, to express these concerns clearly.

I’m examining all of these concerns with the intent of mapping the theory and recommendations expressed in SP 800-144, and supporting reference documents, to a practical IT approach. In this article, I’m exploring the Availability Technologies recommendations.

Availability in the Cloud

It might seem odd to think critically about availability in a cloud scenario. After all, isn’t cloud computing a solution to availability problems? Don’t most of us assume that great backups, mirrored data, and multiple data centers are features of every cloud provider and solution?

Yes and no. Virtually all cloud providers use effective data backup and restore solutions, and these are usually part of the service offering. But backing up and restoring data is only part of what you really need. What you need is availability, and that’s different.

When you’re selecting a cloud provider and service package, you must first define the value of service availability to your business. That’s actually harder than it seems. Only then can you determine whether the cloud provider’s products meet your expectations.

How Do I Define My Availability Requirements?

I get this question in many forms and within many contexts. Often, it is while I’m teaching security to IT professionals. To explain, I draw this diagram:

Figure: CIA Pyramid

You might already know that the three key considerations for information security problems are confidentiality (keeping secrets secret), integrity (letting only authorized people change data), and availability (delivering data when it is needed). They form the CIA pyramid. The pyramid shape is crucial because it illustrates that when you plot a solution on it, you’re deciding the balance among these three. Plotting a solution close to availability gives you more resilient access to data and services, but at the cost of less confidentiality and integrity. Moving that solution along the edge towards confidentiality indicates a solution that restricts unauthorized access and remains highly available, but at the cost of reduced data integrity.

Your definition of availability, and your need for it, must in this light be decided within the overall pyramid. Keeping in mind that there are no absolutes, only solutions that focus more or less on certain areas, consider the business impact of these questions:

Is it better to lose the data permanently or have it fall into the wrong hands?
This is a balance between availability and confidentiality.

Is keeping the data tamper-free more important than unplanned data loss?
The answer helps you decide whether to focus on integrity or availability, or to balance between them.

Are all of these trade-offs unacceptable because I need absolute confidentiality, integrity, and availability?
If so, plan on spending a lot of time and money. Such comprehensive no-compromise solutions are rarely cost-effective, even in a cloud scenario.

How long can my company operate without access to cloud data and services?
This question gets right to the point. If the cloud is down, does that result in a minor inconvenience or a profit-shaking catastrophe? Would you gladly risk your data going public in order to get access restored?

Does My Provider Meet My Expectations?

Answering this question is far easier than you might think. Once you’ve thought about and researched the questions in the previous section, there are two aspects that you need to verify with the cloud provider: availability and recovery.

Availability in this context is how much of the time the service provider guarantees that your data and services are accessible. This is typically documented as a percentage of time per year. For example, 99.999% (or five nines) uptime means your resources will be inaccessible for no more than about five minutes per year.
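To make the arithmetic behind "five nines" concrete, here is a minimal Python sketch that converts an uptime percentage into a yearly downtime budget. The function name and the 365-day year are my own assumptions for illustration; a provider may measure over a month or a billing cycle instead.

```python
# Minutes in a 365-day year (assumption: leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in minutes, for an uptime percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Each additional nine cuts the downtime budget by a factor of ten, which is why each step up in guaranteed availability tends to cost noticeably more.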

You must look for a cloud service provider that both publishes and guarantees availability. Like an insurance policy, the guarantee should be written in terms that compensate your company when the provider misses the documented availability metrics. The stronger the guarantee and the higher the availability, the more reliable and expensive the service is.
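As a rough sketch of how such a guarantee can be checked, the Python below computes measured availability from recorded downtime and looks up a service credit. The credit tiers, the 30-day billing period, and all of the names are hypothetical; they are not taken from any real provider's agreement, so substitute the terms from your own contract.

```python
def measured_availability(downtime_minutes: float, period_minutes: float) -> float:
    """Return availability over a period as a percentage."""
    if period_minutes <= 0 or not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_minutes / period_minutes)


# Hypothetical schedule, ordered best to worst:
# (minimum measured availability %, credit as % of the period's fee)
HYPOTHETICAL_CREDIT_TIERS = [(99.9, 0.0), (99.0, 10.0), (95.0, 25.0), (0.0, 100.0)]


def service_credit_percent(measured: float, tiers=HYPOTHETICAL_CREDIT_TIERS) -> float:
    """Return the credit owed for the first tier whose floor the measurement meets."""
    for floor, credit in tiers:
        if measured >= floor:
            return credit
    return tiers[-1][1]


MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200

# Example: two hours of downtime in a 30-day month.
measured = measured_availability(downtime_minutes=120, period_minutes=MINUTES_PER_30_DAY_MONTH)
credit = service_credit_percent(measured)
print(f"Measured availability: {measured:.3f}%, credit owed: {credit}% of the fee")
```

Running the numbers this way before you sign shows whether the compensation on offer is meaningful next to what an outage would actually cost your business.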

Note that cloud service providers differ in their definition and measurement of availability. Some providers claim no downtime if a single internet client can access at least one service, while others require that multiple countries or internet service providers can access all services. Ensure that you review the provider’s definition, as you may need to change your requirement to match the way they measure it.
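The effect of the measurement definition is easy to underestimate, so here is a small Python sketch that scores the same probe results two ways. The probe data, region names, service names, and both definitions are invented for illustration; real providers describe their own measurement method in their service agreements.

```python
from typing import Callable, Dict, List

# One probe sample: region name -> service name -> was it reachable?
Sample = Dict[str, Dict[str, bool]]


def lenient_up(sample: Sample) -> bool:
    """Up if at least one region can reach at least one service."""
    return any(ok for services in sample.values() for ok in services.values())


def strict_up(sample: Sample) -> bool:
    """Up only if every region can reach every service."""
    return all(ok for services in sample.values() for ok in services.values())


def availability_percent(samples: List[Sample], is_up: Callable[[Sample], bool]) -> float:
    """Return the percentage of samples that count as up under a given definition."""
    if not samples:
        raise ValueError("need at least one sample")
    return 100.0 * sum(1 for s in samples if is_up(s)) / len(samples)


# Hypothetical probe results from two regions at four points in time.
samples: List[Sample] = [
    {"us": {"storage": True, "api": True}, "eu": {"storage": True, "api": True}},
    {"us": {"storage": True, "api": False}, "eu": {"storage": True, "api": True}},
    {"us": {"storage": False, "api": False}, "eu": {"storage": True, "api": True}},
    {"us": {"storage": True, "api": True}, "eu": {"storage": True, "api": True}},
]

lenient = availability_percent(samples, lenient_up)  # 100.0
strict = availability_percent(samples, strict_up)    # 50.0
print(f"Lenient definition: {lenient}%, strict definition: {strict}%")
```

The same outage history yields a perfect score under one definition and a poor one under the other, which is why the definition matters as much as the percentage.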

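Recovery is how the provider restores your data and services after something goes wrong. Relying on recovery alone is effective only for services that aren’t business-critical. Recovery operations include restoring from backup, repairing damaged systems, removing viruses, and mitigating denial of service (DoS) attacks.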

Most cloud providers have a good recovery story and you won’t need to look hard to find one that, for example, guarantees file restore within an hour. They tout tangible recovery features like redundant storage, multi-site mirroring, multiple internet paths, and rapid patch management.

Summary

To ensure you don’t engage the wrong service, you should not confuse recovery and availability. While both are critical elements in IT, and there is some degree of overlap, they mean very different things to your business processes. Regardless of your recovery and availability needs, you should always use a service provider that guarantees a specific level of availability in writing.