At the Professional Developer’s Conference (PDC) in 2010 we announced an addition to the Computational Roles in Windows Azure, called the VM Role. This new feature allows a great deal of control over the applications you write, but some have confused it with our full infrastructure offering in Windows Hyper-V. There is a proper architecture pattern for both of them.
Virtualization is the process of taking all of the hardware of a physical computer and replicating it in software alone. This means that a single computer can “host” or run several “virtual” computers. These virtual computers can run anywhere – including at a vendor’s location. Some companies refer to this as Cloud Computing since the hardware is operated and maintained elsewhere.
The more detailed definition of this type of computing is called Infrastructure as a Service (Iaas) since it removes the need for you to maintain hardware at your organization. The operating system, drivers, and all the other software required to run an application are still under your control and your responsibility to license, patch, and scale. Microsoft has an offering in this space called Hyper-V, that runs on the Windows operating system. Combined with a hardware hosting vendor and the System Center software to create and deploy Virtual Machines (a process referred to as provisioning), you can create a Cloud environment with full control over all aspects of the machine, including multiple operating systems if you like. Hosting machines and provisioning them at your own buildings is sometimes called a Private Cloud, and hosting them somewhere else is often called a Public Cloud.
State-ful and Stateless Programming
This paradigm does not create a new, scalable way of computing. It simply moves the hardware away. The reason is that when you limit the Cloud efforts to a Virtual Machine, you are in effect limiting the computing resources to what that single system can provide. This is because much of the software developed in this environment maintains “state” – and that requires a little explanation.
“State-ful programming” means that all parts of the computing environment stay connected to each other throughout a compute cycle. The system expects the memory, CPU, storage and network to remain in the same state from the beginning of the process to the end. You can think of this as a telephone conversation – you expect that the other person picks up the phone, listens to you, and talks back all in a single unit of time.
In “Stateless” computing the system is designed to allow the different parts of the code to run independently of each other. You can think of this like an e-mail exchange. You compose an e-mail from your system (it has the state when you’re doing that) and then you walk away for a bit to make some coffee. A few minutes later you click the “send” button (the network has the state) and you go to a meeting. The server receives the message and stores it on a mail program’s database (the mail server has the state now) and continues working on other mail. Finally, the other party logs on to their mail client and reads the mail (the other user has the state) and responds to it and so on. These events might be separated by milliseconds or even days, but the system continues to operate. The entire process doesn’t maintain the state, each component does. This is the exact concept behind coding for Windows Azure.
The stateless programming model allows amazing rates of scale, since the message (think of the e-mail) can be broken apart by multiple programs and worked on in parallel (like when the e-mail goes to hundreds of users), and only the order of re-assembling the work is important to consider. For the exact same reason, if the system makes copies of those running programs as Windows Azure does, you have built-in redundancy and recovery. It’s just built into the design.
The Difference Between Infrastructure Designs and Platform Designs
When you simply take a physical server running software and virtualize it either privately or publicly, you haven’t done anything to allow the code to scale or have recovery. That all has to be handled by adding more code and more Virtual Machines that have a slight lag in maintaining the running state of the system. Add more machines and you get more lag, so the scale is limited. This is the primary limitation with IaaS. It’s also not as easy to deploy these VM’s, and more importantly, you’re often charged on a longer basis to remove them. your agility in IaaS is more limited.
Windows Azure is a Platform – meaning that you get objects you can code against. The code you write runs on multiple nodes with multiple copies, and it all works because of the magic of Stateless programming. you don’t worry, or even care, about what is running underneath. It could be Windows (and it is in fact a type of Windows Server), Linux, or anything else – but that’ isn’t what you want to manage, monitor, maintain or license. You don’t want to deploy an operating system – you want to deploy an application. You want your code to run, and you don’t care how it does that.
Another benefit to PaaS is that you can ask for hundreds or thousands of new nodes of computing power – there’s no provisioning, it just happens. And you can stop using them quicker – and the base code for your application does not have to change to make this happen.
Windows Azure Roles and Their Use
If you need your code to have a user interface, in Visual Studio you add a Web Role to your project, and if the code needs to do work that doesn’t involve a user interface you can add a Worker Role. They are just containers that act a certain way. I’ll provide more detail on those later.
Note: That’s a general description, so it’s not entirely accurate, but it’s accurate enough for this discussion.
So now we’re back to that VM Role. Because of the name, some have mistakenly thought that you can take a Virtual Machine running, say Linux, and deploy it to Windows Azure using this Role. But you can’t. That’s not what it is designed for at all. If you do need that kind of deployment, you should look into Hyper-V and System Center to create the Private or Public Infrastructure as a Service. What the VM Role is actually designed to do is to allow you to have a great deal of control over the system where your code will run. Let’s take an example.
You’ve heard about Windows Azure, and Platform programming. You’re convinced it’s the right way to code. But you have a lot of things you’ve written in another way at your company. Re-writing all of your code to take advantage of Windows Azure will take a long time. Or perhaps you have a certain version of Apache Web Server that you need for your code to work. In both cases, you think you can (or already have) code the the software to be “Stateless”, you just need more control over the place where the code runs. That’s the place where a VM Role makes sense.
Virtualizing servers alone has limitations of scale, availability and recovery. Microsoft’s offering in this area is Hyper-V and System Center, not the VM Role. The VM Role is still used for running Stateless code, just like the Web and Worker Roles, with the exception that it allows you more control over the environment of where that code runs.