Categorizing the Cloud …

08/16/2010

I’ve been thinking about the Cloud a lot lately. I’ve also been talking to a lot of customers both large and small about Azure. I’m getting some fairly consistent messages. On the good side, most of my customers indicate that they are interested in the Cloud and what the Cloud can do for their business. We work together to devise a cloud strategy that works for their particular business. All is good. On the downside, many customers indicate that it is very difficult to find clarity in very basic questions about the Cloud – and even with the most basic question, what is the Cloud?

When these existential questions come up, I tend to go back to first principles. I first try and answer the question “how did I figure out what the cloud is?” and “what resources helped me?” I then try and relay these first principles to the customer that I am speaking with about the Cloud. Generally, it helps. I like to think I followed a fairly logical road to Cloud understanding. But, before we jump to the answers, we require a short digression or two.

The Evil Cloud: An Architect’s View (or Digression #1)

As you know, I’m an architect. To an architect, the image of a cloud has always represented “uncertainty”. When most architects draw a cloud in their Visios, they immediately pierce that cloud with an arrow entitled “VPN” or “HTTPS”. The notion being that the cloud represents a nebulous, unstructured mass through which our clear and concise communications will manage to travel …

Notice how evil the cloud is? Why, even its typeface portends darkness, rain ahead and unpleasantness. See how it is ruthlessly penetrated by VPN and HTTPS? Notice how Enterprisey Goodness is protected by the gallant gateways! Poor cloud, my poor cloud!

The Evil Cloud: A Developer’s View (or Digression #2)

I am also a developer. That means projects fall on my shoulders. I actually have to get things done. You can’t deposit money into a bank account or trade shares using a Visio diagram! Oh those architects with their heads in the clouds! Always talking about loose coupling and infinite elasticity (when what they actually mean is “virtually infinite” – sheesh!). They never care if I can actually accomplish what they want me, the reasonable and level-headed developer, to accomplish! As a developer, I need to understand how the Cloud actually works, how I integrate with it and how I get things done!

If I, the developer, am going to give up control of all the great levers I can pull when everything is On Premise, then I am going to need to know that the complete Cloud stack is there. Tools, APIs, debugging, logging, monitoring all have to be present and mature for me to move forward and actually create something that will work. My incomplete Cloud, my poor, incomplete Cloud.

The Good Cloud: First Principles

Okay, so given those two digressions, let me then present a different view of the Cloud. A good Cloud. This cloud is complete. This cloud has sharp edges and well-defined features. (Life would have been easier if the Cloud were called the Sky-Cube or something. Now you know why I’m not in marketing.)

You will immediately notice that the fluff is gone. We have gone from rounded, uncertain edges to straight lines. We’ve also gone from two dimensions to three – subtly implying that the Cloud is “real” and has “depth”. We’ve also added Models. (In case you were wondering, much like an extra level of indirection can solve any problem in computer science, models can solve any problem in design.) Let’s dive into our new view of the Cloud and see what we have.

On the front face of the cube, we have the Service Models:

Software as a Service (SaaS)
Platform as a Service (PaaS)
Infrastructure as a Service (IaaS)

Each of these Service Models represents a different offering in the cloud. These Service Models are best described in units of Management. That is, what do I, as a business, have to manage under each model?

(I borrowed this image from the great David Chou.) You will see that at each level, the amount of management increases or decreases. With IaaS for example, the customer is responsible for managing what is put on top of the metal whereas with Software as a Service, the vendor manages everything. Platform as a Service falls in the middle, with levels up to the runtime managed by the vendor and the application the responsibility of the customer.

Given that view, it’s not hard to imagine what On Premises looks like – basically Blue from top-to-bottom. The customer buys, licenses, racks and stacks as well as manages everything themselves.
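To make the "units of Management" idea concrete, here is a minimal sketch of the responsibility split. The layer names are my own assumption (the original diagram is not reproduced here), and the boundaries follow only what the text above says: with IaaS the customer manages what goes on top of the metal, with PaaS the vendor manages up to the runtime, with SaaS the vendor manages everything, and On Premises the customer manages everything.

```python
from enum import Enum


class Manager(Enum):
    CUSTOMER = "customer"
    VENDOR = "vendor"


# Hypothetical layer names, ordered from the metal up to the application.
LAYERS = [
    "networking",
    "storage",
    "servers",
    "virtualization",
    "operating_system",
    "runtime",
    "application",
]

# Index of the first layer the customer manages under each model.
# Every layer below that index is managed by the vendor.
FIRST_CUSTOMER_LAYER = {
    "on_premises": 0,                            # customer manages everything
    "iaas": LAYERS.index("operating_system"),    # customer deploys the OS and up
    "paas": LAYERS.index("application"),         # vendor manages up to the runtime
    "saas": len(LAYERS),                         # vendor manages everything
}


def responsibilities(model):
    """Map every layer to whoever manages it under the given model."""
    boundary = FIRST_CUSTOMER_LAYER[model]
    return {
        layer: Manager.VENDOR if index < boundary else Manager.CUSTOMER
        for index, layer in enumerate(LAYERS)
    }


def customer_managed(model):
    """List the layers the customer still has to manage, bottom to top."""
    return [
        layer
        for layer, manager in responsibilities(model).items()
        if manager is Manager.CUSTOMER
    ]
```

Calling customer_managed("paas") leaves only the application with the customer, while customer_managed("on_premises") returns the whole stack, which is the "Blue from top-to-bottom" picture.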

If you look at the Cloud from a deployment perspective (more like a developer), things look similar …

The major difference being that with IaaS, it’s necessary to deploy the OS layer on top of the metal that the vendor is renting to you.

This is an important dimension. If a company is to cede control and deploy only at the application level, it’s important to know a Cloud Provider has the proper capabilities to support all the different types of applications that may need to be created to meet business needs.

On the depth face, we have the Deployment Models:

Public
Hybrid
Private

I will define Public very simply as …

A pool of computing resources offered by a vendor, typically using a “pay as you go” model.

The definition of Private falls out from that as …

A pool of computing resources that lives within a self-managed datacenter.

(The definitions and architectural models below are borrowed from Simon Guest. If you ever get the chance to go see Simon speak, do not miss it.) The notion of a pool of computing resources is important. It implies that regardless of where the metal lives, the available resources are what’s important. From a very practical perspective, we care about Storage, Compute, Data, Connectivity, Security, Frameworks, and Application Services. We want those capabilities on a self-service, elastic and pay-as-you-go basis. Where those capabilities live defines the difference between Public and Private.

My friend Hybrid bridges those worlds. When we have systems that live in the Public Cloud and systems that live in the Private Cloud (like existing Line of Business applications for example), it can become necessary for those systems to communicate with each other. That’s a Hybrid Cloud – a Private Cloud and a Public Cloud working together to provide capability.

Federated identity (ADFS for example) is a great example of a Hybrid Cloud service. The concept is that an ADFS server can live in your datacenter and establish trust with a service living in Azure. Then, when a client goes to authenticate with the service in the Cloud, it takes a token granted by ADFS from an On Premise identity store like Active Directory and hands it to the Cloud service. The Cloud service can decrypt the token and pull out the claims without communicating back into the On Premise system at runtime. Private Cloud and Public Cloud working together. (Great, quick video on ADFS).
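The key property in that flow is that the Cloud service can trust the claims without calling back On Premise. Here is a minimal sketch of that property only. It is not how ADFS works on the wire: real ADFS issues tokens protected with certificates, whereas this sketch swaps in a shared HMAC key as a stand-in for the pre-established trust, and it signs the claims rather than encrypting them. The key and function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Stand-in for the trust established ahead of time between the identity
# provider and the cloud service. This is an assumption for illustration;
# real ADFS trust relies on certificates, not a shared secret.
SHARED_KEY = b"hypothetical-demo-key"


def issue_token(claims, key=SHARED_KEY):
    """On Premise side: serialize the claims and sign them."""
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload + b"." + signature


def read_claims(token, key=SHARED_KEY):
    """Cloud side: verify the signature locally, then pull out the claims.

    No call is made back to the issuer; the key alone is enough.
    """
    payload, _, signature = token.rpartition(b".")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(signature, expected):
        raise ValueError("token signature does not match the trusted key")
    return json.loads(base64.urlsafe_b64decode(payload))
```

A token issued with issue_token({"name": "alice"}) can be read back with read_claims, and a token altered in transit raises ValueError instead of yielding claims.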

On the top face is the Isolation Model. Too often, the Isolation Model is left out of categorizations for Cloud. And further, too often the Isolation Model is confused with the Deployment Model. In order to define the Isolation Models, we first need a definition of Tenant.

Just what is a Tenant? If I am a Cloud provider, a Tenant is my customer. But, if I am an ISV using a Cloud provider to serve up my application, then my customer is the Tenant. And, in either case, that Tenant might be made up of sub-tenants, such as subsidiaries or business groups that require isolation from each other for business or regulatory reasons. As such, we can define Tenant as …

An entity such as a customer or business unit or set of entities that require that events (such as changes made to data) are visible only to that entity or set of entities.

Note that the definition holds regardless of the infrastructure that is shared among the Tenants. Take this architectural example …

Three customers (Tenants) sharing the same Web Tier, Business Logic Tier and Database. Even in this simple example, we can get all kinds of geeky here. For example …

Is each Tier sharing hardware? If so, are they separated by VMs on that hardware?

Above, all the requests share the same hardware in the middle tier (of course there would be multiple servers to scale the request, so think of the servers in this case as logical servers). Below, all the requests share the same hardware, but each is isolated by its own VM on top of that hardware.

When the database comes into play, it can become even more complicated …

Again, are we talking about different databases on the same server? Different servers? The same database with filegroups on the same metal, on different metal, on virtual machines versus actual metal, and so on?

So, how do you avoid going down the tenancy rabbit hole?

First degree, second degree, lesser pure, not pure: it’s all a red herring for the security and SLA discussions. There are very Good Reasons to need a dedicated system, but is security truly one of them? And, if so, what security domain are we talking about? Certainly, the Isolation Model relates to Segmentation, Data Protection, Application Security, Business Continuity/Disaster Recovery, Physical Security, Key Management, Auditing, eDiscovery and Incident Response, just to name a few.

Given the potential high impact, what are those Good Reasons …

Compliance
Data Sovereignty
Residual Risk Reduction for high value business data

What about Physical Security? I don’t think so. Physical Security is definitely an issue impacted by the Isolation Model, but your provider’s datacenter should be more physically secure than yours. Unless you are the NSA/CIA or other governmental body, it is highly unlikely that your datacenter is more secure than the Azure datacenters, for example. Unless you have ISO 27001 and SAS 70 I & II certifications for your datacenters (PDF link), you’re not even in the game. If your Cloud Provider’s datacenter is not more Physically Secure than yours, find a different Cloud Provider.

It boils down to Compliance. If you need to comply with a government regulation (the Patriot Act and the EU Data Protection Directive are good examples) or an internal IT regulation (as in the case of data categorization), your business is in the Dedicated Cloud. Or, more correctly, you have some elements of your business that require a Dedicated Cloud, but it is almost certain that other elements of your business can be hosted in the Multi-Tenant Cloud.

Do not fall for the false choice of “security” versus the Isolation Model, however. I can think of at least 15 security domains, and not all of them have anything to do with the Isolation Model. The argument presupposes that one architecture is a priori more secure than another when it is really a posteriori knowledge that determines whether or not our data is going to be exposed to another Tenant.

If it were possible to ensure security through architecture alone, then all systems would be secure. It’s not. Execution and implementation play a huge role in the security of a system. I don’t want to turn this into a STRIDE discussion, but security in the Requirements and Design phases is not enough. There needs to be security through Implementation, Verification, Release and Response across all Security Domains. The bottom line is that if your data dependent routing implementation is flawed and you have not given yourself defense in depth with a proper Authentication Security Domain, whether it routes you to the wrong server, VM or database table is irrelevant.
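Here is a small sketch of that bottom line, under stated assumptions. The tenant names, shards and routing tables are all hypothetical, and a "shard" could stand for a server, a VM or a database table. The point is the second check: ownership is verified against the authenticated tenant on the data itself, so a flawed routing table fails closed no matter what kind of isolation sits underneath.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    tenant_id: str
    body: str


# Hypothetical shards. Each could be a server, a VM or a table; the
# isolation mechanism does not change the logic below.
SHARDS = {
    "shard-a": {1: Record("contoso", "contoso ledger")},
    "shard-b": {1: Record("fabrikam", "fabrikam ledger")},
}


def route(tenant_id, routing_table):
    """Data dependent routing: pick the shard for a tenant."""
    return SHARDS[routing_table[tenant_id]]


def fetch(authenticated_tenant, record_id, routing_table):
    """Fetch a record for the tenant the caller authenticated as."""
    record = route(authenticated_tenant, routing_table)[record_id]
    # Defense in depth: do not trust the routing layer alone. Check the
    # record's owner against the authenticated identity before returning.
    if record.tenant_id != authenticated_tenant:
        raise PermissionError("record belongs to a different tenant")
    return record.body
```

With a correct table such as {"contoso": "shard-a", "fabrikam": "shard-b"}, fetch("contoso", 1, table) returns the contoso ledger. With a flawed table that sends contoso to shard-b, the same call raises PermissionError rather than exposing another Tenant's data.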

If we take Compliance, Data Sovereignty and “security” out of the Isolation Model discussion, we are left where we should have started: with SLA. If a Cloud Provider cannot meet the specific SLA required by your business, then there is no reason to go down the Isolation Model discussion anyway (nor is it necessary to have the security discussion).

Clearly, the Isolation Model will affect SLA. Disaster recovery and data restoration are good examples of how the Isolation Model can affect SLA. Can your cloud provider restore your data at any time without affecting other Tenants? Can other Tenants’ data be restored at any time without affecting your SLA? I will argue that as long as the Cloud Provider’s SLA meets your needs, and absent a Good Reason, you should not care about the Isolation Model.

Conclusion

Those are the categorizations I think about when I think about Cloud. I take those categorizations and map Software and Services to each. This mapping helps customers understand what section their business falls into and how they can be using the Cloud appropriately. If you’re talking to a customer whose business is in the Dedicated/Public/SaaS space (BPOS-D, BPOS-F) about Multi-Tenant/Public/PaaS (Azure, xRM) then you’re doing it wrong!
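As a rough sketch, the three faces of the cube and the mapping described above can be written down directly. The two Isolation Model values (Dedicated and Multi-Tenant) are the ones named in this post, and the only cells filled in are the two mentioned above. The function and variable names are my own, and every other cell is deliberately left empty rather than guessed.

```python
from itertools import product

# The three faces of the cube.
SERVICE_MODELS = ("SaaS", "PaaS", "IaaS")
DEPLOYMENT_MODELS = ("Public", "Hybrid", "Private")
ISOLATION_MODELS = ("Dedicated", "Multi-Tenant")

# Every cell of the cube as (isolation, deployment, service).
CUBE = list(product(ISOLATION_MODELS, DEPLOYMENT_MODELS, SERVICE_MODELS))

# Only the cells named in the text are populated.
OFFERINGS = {
    ("Dedicated", "Public", "SaaS"): ["BPOS-D", "BPOS-F"],
    ("Multi-Tenant", "Public", "PaaS"): ["Azure", "xRM"],
}


def offerings_for(isolation, deployment, service):
    """Return the offerings mapped to one cell of the cube."""
    cell = (isolation, deployment, service)
    if cell not in CUBE:
        raise ValueError(f"not a cell of the cube: {cell}")
    return OFFERINGS.get(cell, [])
```

The cube has 2 x 3 x 3 = 18 cells, and looking up a customer's cell before the conversation starts is a cheap way to avoid pitching the wrong corner of it.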

Next time, we will look at the various permutations of the Service X Isolation X Deployment cube and talk about what types of business fall in each category and what Software + Services fit in each.
