Azure@home Part 9: Worker Role Run method

This post is part of a series diving into the implementation of the @home With Windows Azure project, which formed the basis of a webcast series by Developer Evangelists Brian Hitney and Jim O’Neil. Be sure to read the introductory post for the context of this and subsequent articles in the series.

In my last post, I covered the initialization of the WorkerRole in Azure@home, namely the implementation of the OnStart method, in which we set up the role to collect a few performance statistics and to log system and application errors. The next stage in a worker role’s lifecycle is the Run method, typically implemented as an infinite loop. In fact, if the Run method terminates, the role will be recycled and started again, so even if you don’t implement an infinite loop in Run directly, the Windows Azure fabric pretty much enforces one on you (granted, in a more disruptive fashion).

Revisiting the architecture diagram (see left), each instance of the WorkerRole is responsible for starting the Folding@home process (FAH.EXE – Step 4) and reporting the progress of the simulation (Steps 5 and 6) via an Azure table (workunit – Step 7) as well as a web service call  (Step 8) to the main distributed.cloudapp.net application.  When a single simulation run – known as a work unit – completes, the WorkerRole simply starts another Folding@home console application process and the cycle repeats.

The implementation of Run looks like this:

    1:  public override void Run()
    2:  {
    3:      // set up the local storage, including Folding@home executable
    4:      FoldingClient.SetupStorage();
    5:   
    6:      // poll for the client information in Azure table every 10 seconds
    7:      while (true)
    8:      {
    9:          ClientInformation clientInfo = FoldingClient.GetFoldingClientData();
   10:          if (clientInfo != null)
   11:          {
   12:              FoldingClient.LaunchFoldingClientProcess(clientInfo);
   13:              break;
   14:          }
   15:   
   16:          Thread.Sleep(10000);
   17:      }
   18:  }
  
  
  • In Line 4, infrastructure needed to support the Folding@home console application (FAH) in the cloud is set up.  FAH is a standard executable, so needs to be copied to the local file system within the VM to have a place to run.  Once that’s done, the customary infinite loop begins (Line 7ff).
  • Line 9 checks to see if there is information in the client table to support starting the worker role (this is Step 3 in the diagram above).  The client table captures the information submitted by the user on the WebRole’s default.aspx page, namely his/her name, location (latitude and longitude), and Folding@home team number.  If there is no information yet in the table (i.e., clientInfo is null on Line 10), the worker role sleeps for 10 seconds (Line 16) before checking again.
  • Once a client row is detected (Line 12) – indicating the user has submitted the default.aspx page - the ‘real work’ can begin!

Over the next few blog posts, I’ll cover the processing initiated by the code above, starting with SetupStorage, which is the focus for the remainder of this article.

 

Local Storage in Windows Azure

As you know, each role instance that is deployed to Windows Azure runs inside of a virtual machine (VM) instance.  Typically we focus on the number of CPUs  (or cores) to define the VM role type – small (1 CPU), medium (2 CPUs), large (4 CPUs), and extra-large (8 CPUs) – however, each option also includes a RAM allocation, local file storage, and a rough level of I/O performance:

Instance Size   CPU (1.6 GHz)   Memory    Local Storage   I/O Performance
small           1               1.75 GB   225 GB          Moderate
medium          2               3.5 GB    490 GB          High
large           4               7 GB      1000 GB         High
x-large         8               14 GB     2040 GB         High

Local storage is ephemeral and tied to a specific VM instance.  When the instance is recycled, the storage goes with it, so it’s not recommended for data that needs to be persistent or shared. In our case, though, it works perfectly, since each Folding@home process is a single standalone executable completely enveloped by the worker role.  We can use local storage, essentially the disk space on the VM instance, to ‘install’ the Folding@home console application and to store the output file (unitinfo.txt) that the process writes.

You might be inclined to think that interacting with the local file system on a VM instance requires direct access to the VM itself, such as via a remote desktop.  While such access has been announced as an upcoming feature of Windows Azure, it’s actually not necessary here (or for other more interesting scenarios): Windows Azure includes a LocalResource abstraction allowing you to access the local storage on the VM instance programmatically.

You define local resources as part of the ServiceDefinition.csdef file for the particular role requiring local storage access.   You can define multiple local resources within a role to correspond to discrete uses of file storage.  Within Visual Studio, you use the property dialog for the given role to set up local resource definitions; here’s the one defined for Azure@home:

Local Storage configuration

That translates to the following XML within the ServiceDefinition.csdef file:

  <WorkerRole name="WorkerRole">
    <ConfigurationSettings>...</ConfigurationSettings>
    <LocalResources>
      <LocalStorage name="FoldingClientStorage"
                    cleanOnRoleRecycle="false"
                    sizeInMB="1024" />
    </LocalResources>
  </WorkerRole>

The name provided is used to access the storage programmatically within the role code.  Each resource includes a size allocation.  The sum of the allocations, of course, must be less than what is allotted for the given VM instance size; here, 1 GB of 225 GB of storage is set aside.  The Clean on Role Recycle property is set to true by default and indicates that the local storage resource will be ‘wiped out’ and not available should the role be recycled. 

Roles can be recycled due to uncaught exceptions, in which case you may want to start fresh, but they can also be recycled on demand (via RequestRecycle), perhaps in response to a planned configuration change.  In the latter case you may want to retain the storage configuration by unchecking the Clean on Role Recycle property.  For Azure@home, there’s no compelling reason to set (or not) the property.

Programmatically, local storage resources are accessed via the LocalResource class, and each named instance of a local resource is obtained via the GetLocalResource method of the RoleEnvironment class.  Each LocalResource instance has three read-only properties:

  • MaximumSizeInMegabytes – the value provided when configuring local storage via the configuration file,
  • Name – the name identifying the resource, also provided via the configuration file, and
  • RootPath – the path to the local resource within the VM instance that’s provisioned for the web or worker role.
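As one illustration of how MaximumSizeInMegabytes might be put to use, the sketch below checks whether a set of files would fit within a local resource’s allocation before anything is copied. It is not part of Azure@home: the QuotaSketch class and its two methods are hypothetical names, and the code relies only on System.IO so that it can be tried outside of a role instance.

```csharp
using System;
using System.IO;

public static class QuotaSketch
{
    // Hypothetical helper: total size, in bytes, of every file under 'path'
    // (including files in subdirectories).
    public static long GetDirectorySizeInBytes(string path)
    {
        long total = 0;
        foreach (string file in Directory.GetFiles(path, "*", SearchOption.AllDirectories))
        {
            total += new FileInfo(file).Length;
        }
        return total;
    }

    // Hypothetical helper: true when the files under 'sourcePath' fit within
    // an allocation expressed in megabytes (as sizeInMB is in the .csdef file).
    public static bool FitsWithinQuota(string sourcePath, int maximumSizeInMegabytes)
    {
        long quotaInBytes = (long)maximumSizeInMegabytes * 1024L * 1024L;
        return GetDirectorySizeInBytes(sourcePath) <= quotaInBytes;
    }
}
```

Inside a role, the second argument would presumably come from RoleEnvironment.GetLocalResource("FoldingClientStorage").MaximumSizeInMegabytes, which is 1024 for the configuration shown earlier.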

 

Implementation of SetupStorage in Azure@home

It’s the RootPath that’s most useful here, since it identifies where in the VM’s file system you can start adding directories and files that might be necessary to support the functionality of the role.  In Azure@home, it’s the SetupStorage method (within the FoldingClient.cs file) that accesses the local resource; here are the first few lines of its implementation:

    1:      //setup the local file storage
    2:      LocalResource foldingIo = RoleEnvironment.GetLocalResource("FoldingClientStorage");
    3:      String targetPath = String.Format(@"{0}client", foldingIo.RootPath);
    4:   
    5:      //copy files to temp storage
    6:      String rootPath = Environment.GetEnvironmentVariable("RoleRoot");
    7:      String sourcePath = String.Format(@"{0}\approot\client", rootPath);

The first set of code (Lines 1 – 3) determines the physical path of the local resource and adds a client directory to that path designation.  In a sample execution of the code above, Windows Azure yielded the following path for targetPath:

C:\Resources\directory\1bfb645285504a1e82a9b9cf68ac3fd3.WorkerRole.FoldingClientStorage\client

The second set of code (Lines 5 – 7) recreates the path at which your application is deployed within the VM on Windows Azure; this is the destination of the code assembled into the .cspkg file deployed to the hosted service.  For the sample I ran, RoleRoot was a simple drive destination:

E:

In short, the code above yields two directories accessible in the confines of the deployed WorkerRole’s VM:

  • targetPath, the destination path corresponding to the named local storage resource, and
  • sourcePath, the location of the client directory files that were part of the original WorkerRole project (right) and likewise included in the package file uploaded to Windows Azure. 
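Note that the two format strings make opposite assumptions about trailing backslashes: in the sample run, RootPath ended with one (hence "{0}client"), while RoleRoot was a bare E: (hence "{0}\approot\client"). If you would rather not depend on that, a small helper can normalize the join. The sketch below is an assumption for illustration only; AppendSegment and PathSketch are hypothetical names, not part of Azure@home or the Azure SDK.

```csharp
using System;

public static class PathSketch
{
    // Hypothetical helper: joins a root and a relative segment with exactly
    // one backslash between them, whether or not the root already ends
    // with a backslash.
    public static string AppendSegment(string root, string segment)
    {
        if (root == null) throw new ArgumentNullException("root");
        if (segment == null) throw new ArgumentNullException("segment");

        return root.TrimEnd('\\') + "\\" + segment.TrimStart('\\');
    }
}
```

With this helper, both "E:" and "E:\" combined with "approot\client" produce E:\approot\client. Plain string handling is used here rather than Path.Combine because, as I understand the documented behavior on Windows, Path.Combine does not insert a separator after a bare drive designator such as E:, which would yield a drive-relative path.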

Don’t forget to set the Copy to Output Directory property in Visual Studio to Copy Always or Copy if Newer for any files that need to be included in the output directory and, therefore, the package file deployed to Windows Azure.

The remainder of the code in the SetupStorage method (below) uses fairly straightforward System.IO methods to copy the client files originating in the Visual Studio WorkerRole project to the local storage directory.  The result of this processing is that the Folding@home client console application (Folding@home-Win32-x86.exe) now resides in local storage in the VM instance housing the worker role.  Beyond this particular application, it should be clear that this technique can be generalized to copy additional ancillary assets to local storage in any Azure web or worker role instance.

 

    9:      // To copy a folder's contents to a new location:
   10:      // Create a new target folder, if necessary.
   11:      if (!System.IO.Directory.Exists(targetPath))
   12:      {
   13:          System.IO.Directory.CreateDirectory(targetPath);
   14:      }
   15:   
   16:      String[] files = System.IO.Directory.GetFiles(sourcePath);
   17:   
   18:      // Copy the files and overwrite destination files if they already exist.
   19:      foreach (String s in files)
   20:      {
   21:          // Use static Path methods to extract only the file name from the path.
   22:          String fileName = System.IO.Path.GetFileName(s);
   23:          String destFile = System.IO.Path.Combine(targetPath, fileName);
   24:          System.IO.File.Copy(s, destFile, true);
   25:      }
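The loop above copies only the files sitting directly in the source directory, which is all Azure@home needs. If your own ancillary assets include subdirectories, a recursive variant along the following lines may be a reasonable starting point. This is a sketch rather than code from the project; StorageSketch and CopyDirectory are hypothetical names.

```csharp
using System;
using System.IO;

public static class StorageSketch
{
    // Hypothetical helper: copies every file under sourcePath, including
    // those in subdirectories, to the same relative location under
    // targetPath. Existing destination files are overwritten.
    // Returns the number of files copied.
    public static int CopyDirectory(string sourcePath, string targetPath)
    {
        if (!Directory.Exists(sourcePath))
        {
            throw new DirectoryNotFoundException("Source directory not found: " + sourcePath);
        }

        // CreateDirectory does nothing if the directory already exists.
        Directory.CreateDirectory(targetPath);

        int copied = 0;
        foreach (string file in Directory.GetFiles(sourcePath))
        {
            string destFile = Path.Combine(targetPath, Path.GetFileName(file));
            File.Copy(file, destFile, true);
            copied++;
        }

        // Recurse into each subdirectory, mirroring its name under the target.
        foreach (string dir in Directory.GetDirectories(sourcePath))
        {
            string destDir = Path.Combine(targetPath, Path.GetFileName(dir));
            copied += CopyDirectory(dir, destDir);
        }

        return copied;
    }
}
```

Within SetupStorage, a call such as StorageSketch.CopyDirectory(sourcePath, targetPath) would take the place of the explicit loop.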

In subsequent blog posts, we’ll start a new process within the WorkerRole using that executable file (Step 4 in the architecture diagram shown at the beginning of this article) and collect information from its output file (Steps 5 and 6 from that same diagram).

 

Afterword: Environment Variables in Windows Azure

By the way, as you might have noticed from Line 6 above, RoleRoot is actually an environment variable rather than a configuration parameter on the service.  That may raise the question as to what other environment variables of interest are set on the role instances running in Windows Azure.  To satisfy my own curiosity, I threw a quick loop into the WorkerRole code to record the environment variables to the Azure log file and ended up with the following table.

Access only supported environment variables! The table below represents the variables returned for a specific role instance.  As MSDN states, the only variables specifically supported for your use within Windows Azure are RoleRoot, TEMP, and TMP.

Environment Variable (Sample) Value
COMPUTERNAME RD00155D328ACE
PUBLIC D:\Users\Public
LOCALAPPDATA D:\Users\3ae18441-313a-490d-ba26-033caeb8a555\AppData\Local
USERNAME 3ae18441-313a-490d-ba26-033caeb8a555
PROCESSOR_ARCHITECTURE AMD64
CommonProgramFiles(x86) D:\Program Files (x86)\Common Files
ProgramFiles(x86) D:\Program Files (x86)
PROCESSOR_LEVEL 16
APPDATA D:\Users\3ae18441-313a-490d-ba26-033caeb8a555\AppData\Roaming
ProgramFiles D:\Program Files
PATH D:\windows\system32;D:\windows;D:\windows\System32\Wbem; D:\windows\System32\WindowsPowerShell\v1.0\;E:\base\x64; E:\base\x86;E:\storage\cloud\x64;E:\diagnostics\x64;; D:\Packages\Runtime\WorkerRole_IN_6\x64;D:\Packages\Runtime\WorkerRole_IN_6\x86;
SystemRoot D:\windows
TEMP C:\Resources\temp\1bfb645285504a1e82a9b9cf68ac3fd3.WorkerRole\RoleTemp
RoleRoot E:\
ALLUSERSPROFILE D:\ProgramData
RdRoleRoot E:\
FP_NO_HOST_CHECK NO
ProgramData D:\ProgramData
DFSTRACINGON FALSE
USERPROFILE D:\Users\3ae18441-313a-490d-ba26-033caeb8a555
PATHEXT .COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
OS Windows_NT
CommonProgramFiles D:\Program Files\Common Files
PROCESSOR_IDENTIFIER AMD64 Family 16 Model 4 Stepping 2, AuthenticAMD
ComSpec D:\windows\system32\cmd.exe
TRACE_FORMAT_SEARCH_PATH \\winseqfe\release\Windows6.0\lh_sp2rtm\6002.18005.090410-1830\amd64fre\symbols.pri\TraceFormat
SystemDrive D:
PROCESSOR_REVISION 0402
RdRoleId 1bfb645285504a1e82a9b9cf68ac3fd3.WorkerRole_IN_6
NUMBER_OF_PROCESSORS 1
TMP C:\Resources\temp\1bfb645285504a1e82a9b9cf68ac3fd3.WorkerRole\RoleTemp
windir D:\windows
USERDOMAIN CIS
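
For reference, a loop of the kind that produced the table above might look something like the following. This is a sketch and not the exact code used: EnvironmentSketch and its methods are hypothetical names, and it assumes that a trace listener has been configured so that Trace output ends up in the Azure logs.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class EnvironmentSketch
{
    // Hypothetical helper: returns one "NAME = value" line per environment
    // variable visible to the current process, sorted by name.
    public static List<string> DescribeEnvironment()
    {
        var lines = new List<string>();
        foreach (DictionaryEntry entry in Environment.GetEnvironmentVariables())
        {
            lines.Add(String.Format("{0} = {1}", entry.Key, entry.Value));
        }
        lines.Sort(StringComparer.OrdinalIgnoreCase);
        return lines;
    }

    // Writes each line to the configured trace listeners.
    public static void TraceEnvironment()
    {
        foreach (string line in DescribeEnvironment())
        {
            Trace.WriteLine(line);
        }
    }
}
```

Calling EnvironmentSketch.TraceEnvironment() once, for example near the top of Run, would be enough to capture the values for a given role instance.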