Windows Azure Recipe: High Performance Computing

One of the most attractive ways to use a cloud platform is for parallel processing. Commonly known as high-performance computing (HPC) , this approach relies on executing code on many machines at the same time. On Windows Azure, this means running many role instances simultaneously, all working in parallel to solve some problem. Doing this requires some way to schedule applications, which means distributing their work across these instances. To allow this, Windows Azure provides the HPC Scheduler.

This service can work with HPC applications built to use the industry-standard Message Passing Interface (MPI). Software that does finite element analysis, such as car crash simulations, is one example of this type of application, and there are many others. The HPC Scheduler can also be used with so-called embarrassingly parallel applications, such as Monte Carlo simulations. Whatever problem is addressed, the value this component provides is the same: It handles the complex problem of scheduling parallel computing work across many Windows Azure worker role instances.

Drivers

  • Elastic compute and storage resources
  • Cost avoidance

Solution

Here’s a sketch of a solution using our Windows Azure HPC SDK:

image

Ingredients

  • Web Role – this hosts a HPC scheduler web portal to allow web based job submission and management. It also exposes an HTTP web service API to allow other tools (including Visual Studio) to post jobs as well.
  • Worker Role – typically multiple worker roles are enlisted, including at least one head node that schedules jobs to be run among the remaining compute nodes.
  • Database – stores state information about the job queue and resource configuration for the solution.
  • Blobs, Tables, Queues, Caching (optional) – many parallel algorithms persist intermediate and/or permanent data as a result of their processing. These fast, highly reliable, parallelizable storage options are all available to all the jobs being processed.

Training

Here is a link to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.)

Windows Azure HPC Scheduler (3 labs) 

The Windows Azure HPC Scheduler includes modules and features that enable you to launch and manage high-performance computing (HPC) applications and other parallel workloads within a Windows Azure service. The scheduler supports parallel computational tasks such as parametric sweeps, Message Passing Interface (MPI) processes, and service-oriented architecture (SOA) requests across your computing resources in Windows Azure. With the Windows Azure HPC Scheduler SDK, developers can create Windows Azure deployments that support scalable, compute-intensive, parallel applications.

See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.