On-premises, hosting & cloud: a metaphor for understanding how they differ

[Dear loyal readers, be warned: this post will contain some identity, but it is more about "pure" architecture & cloud services]

On a flight Honolulu-Seattle; vacation is over, and I'm back to sharing with you my wild thoughts (ok, not ALL of my wild thoughts :-)).

Have you noticed how people tend to react when they learn about some of the cloud "limitations"? Like "what do you mean, no atomic transactions?" or again: "no principals? Where do I manage my principals?". Well, weeell. The point is, guys, that cloud is not classic hosting; it is designed to solve different problems, and it is equipped with different weapons. I don't have a killer app to use for making my point here, this is all way too new, hence I'll try to use one of my signature metaphors (already used at TechEd EMEA & OreDev). I will argue that deploying a solution on-premises is a bit like owning a car, taking advantage of a hosting service is like renting a car, and that using cloud computing is a different thing altogether: it's like taking a train, going-from-A-to-B capability as pure as it can get.

On premises solutions ~= driving your own car

As somebody who walked a good 3 Km every day from age 12 to age 27, I can tell you that having your own car is just great. You can go wherever you want with it, whenever you want: if while going from A to B you want to make a detour to the memorabilia shop and buy a hideous spoon for your mom, you certainly can. You can customize your car at will, from the canoe holder to the flames finish on the doors; and if you scratch it on a fence, you can decide when (or if) to fix it. Of course, with great power comes great responsibility: the car needs to be bought, the insurance paid, and if you run over the foot of a bystander there are consequences.

The parallel with a solution on premises is obvious: you have as much control as you can get, but you pay for it.
You pay for the sheer hardware, but you can choose everything in detail; you pay for the platform maintenance, patching and the like, but you get to choose exactly what you want (say an old OS); you pay for the software (licenses, development costs, maintenance), but you get full control over what your solution looks like.

I don't know why the hype works the way it does. Once a technology rises to public attention, there's a tendency to overstretch the range of situations in which it will be used (often "in the future"), and only after some time does the pendulum finally swing back to reason. Don't fall into that trap with the cloud. There are plenty, PLENTY of reasons for which you may want to have a solution on premises: maybe you have a production line, and you want to keep the computation close to the actuators; or more commonly, you may have a heavy investment in various systems that work and from which you expect to get ROI for a long time to come. If you have 5 kids and you live 5 Km from the nearest town, it still makes sense to own a car.

Hosting ~= renting a car

I like to think of hosting as something akin to renting a car. You still have a good degree of control, though less than when you own: decent choice over the car model you want to pick, complete control over the itinerary and so on. You pay an amount that is proportional to the days of actual use, but in general if you need a car every day of the year you'd end up paying more than if you owned one. You still have responsibilities: running over the foot of the bystander can still cost you your license, if you want to go from A to B you'd better make sure you know which roads will get you there, and so on. Furthermore, you now have obligations such as treating the vehicle with some care.
Dealing with a hoster or a co-locator is similar to the above in many ways. You don't have to buy the hardware, and you don't have to provide the air conditioning or the electric power for the rooms where it sits. You pay for a period of use, as opposed to the capex you'd put down for your own data center (the costs are a point where the metaphor may be less accurate). On the other hand, your choices are not boundless anymore: which web server products are available, which databases or SDKs, which virtualization technology? And, above all, you still retain a level of responsibility. Do you have a series of VMs? When a zero-day patch comes out, it is your responsibility to apply it to your VMs. Do you want users from a partner org to have access to your app? It's up to you to install suitable software and set up a federation. And naturally, since you are expected to deal with every aspect of your app, you are still free to mess things up: if you do I/O in the wrong places in the pipeline of your web services, the performance of your app will suck regardless of how many VM instances you buy. Note: all this holds even if your hoster happens to be an internal hoster, that is to say another division within your company offering services such as virtualization/server consolidation to internal customers: the biz model does not change the architectural constraints much (thanks Ryan for the discussion that made me understand that this should be explicitly called out).
Hosting is awesome. I love hosting. You have an old solution that requires hardware no longer meeting your standards? Set up a few virtual machines: your application can ascend and the old machines can be decommissioned. You need to expose pages on the internet, but you don't have the structure for doing so? Buy some internet space on a web server, and that's it. Does that remind you of a company that leases, instead of buying, the car fleet for its sales force? (no pun intended).
Those are classes of problems in which renting infrastructure has a clear advantage over the approach in which you own everything; however, there are other classes of problems where neither approach is effective.

Say that your startup owns a web application which produces revenue mainly by selling ads, and say that for some reason you end up on top of Digg: for that day, you'll experience a surge in requests. If you own your own datacenter, chances are that you sized it for handling much less traffic and you won't keep up: that means missed revenue. If you deployed your app in a series of VMs at some hoster you may end up not keeping up anyway. Here I am talking about an application, not just a page server, hence scaling may require more than just firing up a new VM instance: should you add a new DB role machine or a new web frontend role? Maybe it is the backend that is choking performance? And once you fire up the new instance, how do you configure the rest of the solution to use it properly? No, simply not owning the hardware won't help you here: you need a different solution.
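
To make those questions concrete, here is a minimal Python sketch of the bookkeeping that stays on your plate when the unit you rent is a raw VM. Everything in it is hypothetical: the Tier type, the utilization figures (which would have to come from your own monitoring) and the VM names are made up for illustration, and "scale the most loaded tier" is a deliberately naive rule, not a recommendation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Tier:
    name: str
    instances: List[str]      # VMs currently serving this role
    utilization: float        # hypothetical 0..1 load figure from your own monitoring
    calls: List[str] = field(default_factory=list)  # tiers this one sends requests to


def pick_bottleneck(tiers: Dict[str, Tier]) -> Tier:
    """Naive heuristic: the most loaded tier is the one to scale."""
    return max(tiers.values(), key=lambda t: t.utilization)


def scale_out(tiers: Dict[str, Tier], new_instance: str) -> Tuple[str, List[str]]:
    """Add a VM to the bottleneck tier and report which tiers must be reconfigured."""
    tier = pick_bottleneck(tiers)
    tier.instances.append(new_instance)
    # Every tier that calls the scaled one has to learn about the new VM,
    # and with plain hosting that reconfiguration is your job.
    needs_update = [t.name for t in tiers.values() if tier.name in t.calls]
    return tier.name, needs_update


# Made-up numbers for the day the app lands on top of Digg.
tiers = {
    "frontend": Tier("frontend", ["vm-1", "vm-2"], 0.55, calls=["backend"]),
    "backend": Tier("backend", ["vm-3"], 0.95, calls=["db"]),
    "db": Tier("db", ["vm-4"], 0.70),
}
scaled, needs_update = scale_out(tiers, "vm-5")
print(f"scaled {scaled}; still to reconfigure by hand: {needs_update}")
```

Booting the new VM is the easy part; deciding which role it should take and then touching every tier that talks to it is exactly the work that does not go away just because somebody else owns the hardware.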

Think that this is confined to web2.0 scenarios? Think again. Say that you are a business ISV, selling subscription-based apps to many tenants. You need to ensure that every tenant can access your app in a secure way, but you also want to make it easy; and you want it easy, yes, but also expressive enough to allow fine-grained authorization. If your business customers are in the THOUSANDS, you *need* to find ways of streamlining the process: self-provisioning or similar tricks would be of use here. Now tell me, does hosting your app in some third-party VM system help with this? A little: at least you don't need to worry too much about handling bandwidth and ports. But that's not the point: the resource given to you, a VM, is still too raw; it still requires too much work and know-how specific to building distributed & highly scalable solutions. No, we don't need just more of the same but something qualitatively different: what you need for these classes of apps are capabilities that are closer to the result you want to obtain. Sometimes you just want to go from A to B, and focus on other things instead of remembering to check the fuel level or wondering if you're holding the map upside down.

Cloud solutions ~= taking the train

Above I said I've walked 3 Km every day for 15 years straight: did I do it because I'm a fitness nut? Unfortunately my oversized belly shows how preposterous that idea is. I walked because I was going from home to the train station and back.

The train is some kind of magic. You board, you sit down (good luck with that, if you are on the Recco-Brignole or the Pistoia-Rifredi lines, maybe with Lollo & Stefano), you open your book and... after a short time (ditto) you're there! No worries about taking wrong turns, switching to the car column that until you moved there was the fastest, finding a decent parking spot... and I don't mean to sound callous, but if the train somehow damages the foot of our bystander this time you have no legal responsibilities. That pretty much holds for every public means of transportation: for the airplane you may just need a bigger book. It's also true that you can't make that hideous-spoon-driven detour, that you need to stick to a timetable and that the location of the train station may be suboptimal.

Now, as anticipated, this comes as close to pure A-to-B capability as it can get: you can blissfully ignore everything about train crossing rules or how to land on a foggy day, all you need is a ticket and the knowledge of some basic procedures (after having spent from 9:00pm to 2:30am stranded in Rome's airport, I'd say that if you travel in Italy you also need some luck :-)).

The big thing about the cloud is not that your resource is hosted somewhere else. Well, it is ALSO that: but IMO that's not the defining bit. The point here is that you get capabilities-in-a-vacuum. Storage, compute, authentication... those are all things that (for example) the Azure platform offers without requiring you to worry about the paraphernalia (hardware and software) which is necessary for providing those capabilities. The issuing of a security patch does not impact the resources you choose to implement via cloud capabilities: you operate at a different granularity, you don't care about the machines running your cloud storage in the same way in which you don't care about the scheduler that assigns threads to the cores of a machine in your own data center. To push the reasoning to an extreme, if tomorrow somebody found a way of offering storage or compute capabilities without relying on Von Neumann-based machines, you'd be blissfully unaware of it.

There are some notes I should raise about this. First: if you want this to work, you need to release some control and trust the platform to take care of some details. Remember the scenario of the app deployed on VMs at the hoster? There we had issues about which VM role (DB, frontend, backend) we should have created a new instance of, and how to make the existing instances aware of the newly booted VM. In an ideal cloud platform, you'd follow a programming model that would invite you to develop solutions in which for scaling out you just need to add new instances and not have worries like the above. Of course that brings some limitations in your control, exactly like you forsake fine-grained memory management for the joys of having a garbage collector. Is that bad? If what you want is scale, that's not bad at all. And that leads me to the second note. Big scale problems are not the only ones that the cloud is good at (just think of the fantastic integration possibilities offered by the servicebus, the access control or the idea of a marketplace), but they are almost paradigmatic of the new approach. Let's get back to the example of the web app that gets DIGGed, and let's say that it's a social network. When a user lands on his (her?) page, he should be able to see the activities from his contacts and should be able to update his status. Consider for a moment the data store behind this application: what is more important to it, integrity or availability? Is it more important that the page is always able to serve a reasonably up-to-date view of the content, or is integrity so important that it justifies putting locks here and there that would sometimes result in not being able to serve the page (and with it the ads that are the lifeblood of its bizmodel)?
When you look at the first offerings in the cloud space, and you see some capabilities left out with respect to their on-premises counterparts, it is useful to think in those terms for understanding the intended usage of those new tools. This by no means implies that those capabilities aren't important! If your web app had a shopping cart as well, you'd definitely want to make sure that the paths leading to a payment are transactional where appropriate; the fact that you are using the cloud does not mean that you can't throw other technologies into the mix when needed.
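
The integrity-versus-availability choice for the social network page can be shown in a few lines of Python. This is only a sketch under my own assumptions: StatusFeed and its methods are hypothetical names, a single in-process lock stands in for whatever locking a real data store would do, and no actual cloud storage API is being modeled.

```python
import threading
from typing import List, Tuple

Update = Tuple[str, str]  # (user, status text)


class StatusFeed:
    """Toy in-memory feed offering two read paths (hypothetical design)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._updates: List[Update] = []      # authoritative data, guarded by _lock
        self._snapshot: Tuple[Update, ...] = ()  # last published read-only view

    def post(self, user: str, text: str) -> None:
        with self._lock:
            self._updates.append((user, text))
            # Publish a new immutable view once the write is complete.
            self._snapshot = tuple(self._updates)

    def read_consistent(self, timeout: float) -> Tuple[Update, ...]:
        """Integrity first: wait for writers, give up if the lock stays busy."""
        if not self._lock.acquire(timeout=timeout):
            raise TimeoutError("store is locked; the page cannot be served")
        try:
            return tuple(self._updates)
        finally:
            self._lock.release()

    def read_available(self) -> Tuple[Update, ...]:
        """Availability first: never wait, serve the last published view."""
        return self._snapshot
```

While a writer holds the lock, read_consistent may time out and leave the page (and its ads) unserved, whereas read_available always answers with a view that is at most one in-flight write behind. For the status page that staleness is a fair price; for the payment path of a shopping cart you would want the first behavior, which is the point made above.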

I hope I made my point here: a train is not just a really long limousine, it is something essentially different from a private vehicle.
You drive a car, regardless of whether you own it or rent it, while you ride a train. In the same way, a monstrous data center behind a cloud platform is not a super hoster. Let me be clear, it can be used that way: many years ago, we actually rented an entire train and celebrated New Year's Eve going from Genova to Firenze (thank goodness at the time we didn't have phones with cameras, or I'd be worried about reminding some friend who saw me that night of this; in fact, we didn't have cell phones yet). But that's hardly the best way of using a train.

Pat Helland and Nicholas Carr made exquisite use of historical metaphors for explaining big trends (Metropolis for service orientation, and the "electrification revolution" for cloud computing, respectively). I am not even vaguely in the league of those two, hence I'll resist the urge to draw parallels with what happened as public transportation arose and how it changed the way people live their lives in ways the former generations could not imagine. But if you want to go ahead and feel in awe when contemplating the implications of a computing platform you can ride, who am I to stop you? ;-)

[I ended up finalizing this post much after I got off the plane, but since the bulk of it was written in-flight I left the intro :-)]