Windows Azure Caching (Preview) and Ruby Cloud Services

07/02/2012

One of the new features introduced in the spring update to Windows Azure is the ability to use memory in your compute instances for caching. This cache is distributed across all instances of your application, so it scales with your application and data stored in it is available to all your instances.

The other cool feature of this cache is that it supports the Memcache wire protocol, so it can be accessed by any Memcache client. You can read more about the new caching feature at Windows Azure Caching (Preview).

Pretty much all the steps for setting up the Windows Azure Caching preview are performed in config files, so it can be used from any programming language you can publish as a Windows Azure Cloud Service. For this post I'm going to use the RubyRole project as an example.

Note: As of the time of this writing, the bits required to enable caching are only available with the Windows Azure SDK for .NET. So you will need to have this installed in order to successfully use caching from a Windows Azure cloud service.

Caching preview bits

The first thing you will need is the Windows Azure SDK for .NET. This is currently (7/2/2012) the only Windows Azure SDK that includes the bits needed to enable caching in a cloud service. You can tell if the bits are installed by checking for a caching directory at C:\Program Files\Microsoft SDKs\Windows Azure\.NET SDK\2012-06\bin\plugins.

These files are used both by the local Windows Azure Emulator and when you package and deploy your application to Windows Azure.

Co-located vs. dedicated

There are two ways you can use the new caching feature: co-locate the cache with your application, using memory in the same role that your application is running in, or create a separate worker role that is dedicated to caching. The trade-off is that a co-located cache shares the role's memory pool and leaves less memory for the application, while a dedicated worker role uses an additional compute instance and increases the cost of hosting the cloud service.

For the dedicated option there's also the problem that your application has to find the runtime name of the Memcache gateway for the dedicated instance so it knows where to point the client for the memcache endpoint. I haven't quite gotten that working yet, so I'll stick with describing co-located in this post and revisit dedicated caching at a later date.

Co-location example

In the ServiceDefinition, perform the following steps:

Import the caching module. This entry goes inside your web or worker role entry.
Add a local store, which the cache uses primarily to store information such as logs. This entry goes inside the LocalResources entry.
Add an endpoint that the client will use to access the cache using the Memcache protocol. Since we're only going to be using the cache from our application and not exposing it on the internet, this should be an InternalEndpoint. This entry goes inside the Endpoints section.
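As a rough sketch, the three steps above might look like the following fragment of ServiceDefinition.csdef. The role name, the sizeInMB value, and the endpoint name and port are assumptions for illustration, not values taken from this post; check them against your own project and the caching documentation.

```xml
<!-- Sketch only: role name, sizes, endpoint name and port are assumed values -->
<WorkerRole name="WorkerRole1">
  <Imports>
    <!-- Step 1: import the caching module -->
    <Import moduleName="Caching" />
  </Imports>
  <LocalResources>
    <!-- Step 2: local store the cache uses for information such as logs -->
    <LocalStorage name="Microsoft.WindowsAzure.Plugins.Caching.FileStore"
                  sizeInMB="1000" cleanOnRoleRecycle="false" />
  </LocalResources>
  <Endpoints>
    <!-- Step 3: internal endpoint for Memcache clients (11211 is the usual memcached port) -->
    <InternalEndpoint name="memcache_default" protocol="tcp" port="11211" />
  </Endpoints>
</WorkerRole>
```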

In the ServiceConfiguration.cscfg file(s), add the following entries to the ConfigurationSettings section:
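A sketch of what these entries might look like for the local emulator follows. The setting names use the Microsoft.WindowsAzure.Plugins.Caching prefix; the empty values and the 30 percent cache size are placeholder assumptions rather than recommendations.

```xml
<!-- Sketch only: values are placeholders; empty values are assumed to fall back to defaults -->
<ConfigurationSettings>
  <Setting name="Microsoft.WindowsAzure.Plugins.Caching.NamedCaches" value="" />
  <Setting name="Microsoft.WindowsAzure.Plugins.Caching.Loglevel" value="" />
  <Setting name="Microsoft.WindowsAzure.Plugins.Caching.CacheSizePercentage" value="30" />
  <Setting name="Microsoft.WindowsAzure.Plugins.Caching.ConfigStoreConnectionString"
           value="UseDevelopmentStorage=true" />
</ConfigurationSettings>
```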

These define configuration settings for the cache:

NamedCaches lets you create multiple caches that have individual configurations such as time to live, eviction policies, etc. For more information, see the Named Caches section of the Overview of Windows Azure Caching (Preview) topic.
Loglevel is used to specify the level of information logged for diagnostic purposes. For information on setting the Loglevel, see Troubleshooting and Diagnostics for Windows Azure Caching (Preview).
CacheSizePercentage is the percentage of memory in the role that should be allocated for the cache. For more information, see Capacity Planning Considerations for Windows Azure Caching (Preview).
ConfigStoreConnectionString specifies the storage account. When you're using the emulator, the ServiceConfiguration.local.cscfg file is used, so this file should have the value set to "UseDevelopmentStorage=true". When deployed to Windows Azure, this should point to a valid storage account, so the value would be something like "DefaultEndpointsProtocol=https;AccountName=YOURSTORAGEACCOUNT;AccountKey=YOURSTORAGEACCOUNTKEY".

That's it. You're ready to use caching from your application. For my tests, I used the Dalli gem with the following code:
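A minimal sketch of how an application might read and write the cache through Dalli is shown here. The fetch_cached helper and the FakeCache stand-in are hypothetical names introduced for illustration, and the 'localhost:11211' address in the comment is an assumption; use the address and port of your own internal endpoint. The helper only relies on get and set, which Dalli::Client provides.

```ruby
# Cache-aside helper. Works with any client that responds to get(key) and
# set(key, value, ttl), which Dalli::Client does.
def fetch_cached(cache, key, ttl = 300)
  cached = cache.get(key)
  return cached unless cached.nil?

  fresh = yield                # compute the value on a cache miss
  cache.set(key, fresh, ttl)   # store it for later requests
  fresh
end

# Hypothetical in-memory stand-in so the sketch runs without a cache server.
class FakeCache
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value, _ttl = nil)
    @data[key] = value
    true
  end
end

# In a deployed role you would use Dalli instead, for example:
#   require 'dalli'
#   cache = Dalli::Client.new('localhost:11211') # host and port are assumptions
cache = FakeCache.new

calls = 0
first  = fetch_cached(cache, 'greeting') { calls += 1; 'hello from the cache' }
second = fetch_cached(cache, 'greeting') { calls += 1; 'never computed' }
puts first, second, calls
```

The second call is served from the cache, so the block runs only once. Swapping FakeCache for a Dalli client should not require changes to the helper.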

Shim vs. no-shim

While the above example works, it isn't actually the fastest method. All traffic using the Memcache protocol gets routed through a gateway into the Windows Azure cache. Due to differences between the hashing schemes used by Memcache and Windows Azure cache, some translation has to happen at the gateway, and this can degrade performance.

There's a shim that can be loaded in your role that handles this translation before the request reaches the gateway, which improves performance over just using the gateway alone. The shim is distributed as the Windows Azure Caching Memcache Shim, which is a NuGet package. As far as I'm aware, you can only use NuGet packages in a Visual Studio project.

You can read more about it in the Memcache Wrapper for Windows Azure Caching (Preview) topic, and Simon Davies has published an example project of using this shim with a Node.js cloud service. You can find this NodeCacheExample project on GitHub.

Summary

While the above is pretty much a hack to get the caching preview working with a Ruby cloud service, I suspect that once it leaves preview it will be a bit easier to enable for non-.NET applications. Also, the above only addresses using a co-located cache; I'll continue researching how to implement a dedicated cache solution and post my findings on the blog.

For a full example of a solution using the co-located cache, see the co-located-cache branch of the RubyRole project.