Queues in Windows Azure Storage

This warrants calling out - the queues in Windows Azure aren't your typical FIFO queue/dequeue mechanism.  They actually use a two-step dequeuing process. 

Imagine the typical Windows Azure architecture where a web role (or multiple web roles) accepts incoming network traffic, pushes work onto a queue, and then the work is dequeued and done by worker roles. 

image

Let's look at how to dequeue.  This dequeuing code is taken from the Thumbnails sample in the Windows Azure SDK.  It's done by the worker role. 

 QueueStorage queueStorage = QueueStorage.Create(StorageAccountInfo.GetDefaultQueueStorageAccountFromConfiguration());
MessageQueue queue = queueStorage.GetQueue("thumbnailmaker");

...

 while (true)
{
    try
    {
        Message msg = queue.GetMessage();
        if (msg != null)
        {
            string path = msg.ContentAsString();

            RoleManager.WriteToLog("Information", string.Format("Dequeued '{0}'", path));

            // Insert code to process the message and do the work here...

            RoleManager.WriteToLog("Information", string.Format("Done with '{0}'", path));

            queue.DeleteMessage(msg);
        }
        else
        {
            Thread.Sleep(1000);
        }
    }
    catch (StorageException e)
    {
        // Explicitly catch all exceptions of type StorageException here because we should be able to 
        // recover from these exceptions next time the queue message, which caused this exception,
        // becomes visible again.

        RoleManager.WriteToLog("Error", string.Format("Exception when processing queue item. Message: '{0}'", e.Message));
    }
}

So what's going on here?  After we get a reference to the queue, we're starting an infinite loop.  That is because we want the worker to continuously wake up, check the queue for new messages, process the messages if there are any, and then sleep if there are no new messages. 

Now, let's look at the dequeuing process. 

The first step is to call queue.GetMessage() .  This gives us a copy of the message, but it actually doesn't remove the message from the queue.  Instead, the message remains on the queue, but is put into a hidden state.  Then worker roles can continue to process other messages behind it on the queue.  After a certain amount of time passes, if the second step of the dequeuing process hasn't been completed, the message will come out of the hidden state and be re-added to the queue.  Why does this behavior make sense?  Imagine if a worker role dequeued a message for processing, but then the worker role died in mid-process.  With this architecture, after the configured amount of time passes with no followup from the worker role, the message will be re-added to the queue and processed by another worker role, instead of data being lost. 

The second step is to call queue.DeleteMessage() , after the message has been successfully processed.  This step completes the dequeuing and removes the message from the queue. 

Also, note the use of RoleManager.WriteToLog() throughout the code.  When your code is running in the cloud, you can't attach a debugger to it.  (There are a couple of reasons for this.  First, there are security concerns.  Debuggers essentially latch onto a running process and this could be dangerous in a shared environment like the cloud.  Secondly, when your app is running in the cloud, it's in a production environment.  You really shouldn't be debugging and stepping through code that's running in production.)  Therefore, use logging to figure out what's going on with your code.  There's a good video from PDC 2008 that gives best practices on logging effectively in Windows Azure at https://channel9.msdn.com/pdc2008/ES03/.