.NET Services March 2009 CTP - Service Bus Routers and Queues - Part 1: Fundamentals

In the March 2009 CTP of the .NET Service Bus we’ve added two significant new capabilities, Routers and Queues, that also signal a change in how we’re thinking about the Service Bus namespace, its capabilities and the road ahead. Before the M5 release, the Service Bus’ primary capability was to act as a Relay between two parties. It’ll absolutely continue to play that role and we’re working to improve that capability further.

The significant shift we’ve made with M5 is that we’ve now started to add long-lived, system-inherent messaging primitives that exist and operate completely independently of any active listener that sits somewhere on some machine and is plugged into the Service Bus. That means that you can now leverage the Service Bus as an intermediary for push-pull translation, or as a publish/subscribe message distribution facility to optimize or facilitate messaging between places that are already “on the Web”. You can also set up message distribution scenarios where some messaging destinations are existing Web Services and other receivers require the Service Bus’ relay capability to be reachable.

Before I go into further detail on that, let me explain some of the more philosophical aspects of the model behind Routers and Queues and especially how it relates to the Service Bus namespace that I’ve already discussed to some degree in this post.

Names and Policies

The relationship between any messaging primitive and the Service Bus namespace is established by picking a name in your project’s Service Bus hierarchy, say https://clemensv.servicebus.windows.net/myapp/q1, and then assigning a role to that name. From an astronaut’s perspective, all names in a Service Bus namespace that can theoretically exist do already exist and their role is ‘none’. So when I’m assigning a role to a name, I don’t create the name itself. The name is already there, it’s just in hiding.

That mind-trick is necessary, because we don’t want to burden anyone with creating intermediary names leading up to a name in the hierarchy. In the example I’m using here I would have to first create ‘myapp’ and then create ‘q1’ if we weren’t operating under the assumption that all names you could ever interact with already exist.
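To make the "every name already exists" idea concrete, here is a minimal in-memory sketch in Python. It is not the Service Bus API; the Namespace class and its methods are hypothetical names invented for illustration. It only shows that a role can be assigned to a deep name without creating its parents first, and that any untouched name reports the role 'none'.

```python
class Namespace:
    """Toy in-memory model of a namespace (illustrative only, not the real API)."""

    def __init__(self):
        # Only names that have been given a role are actually stored.
        self._roles = {}

    def role_of(self, name):
        # Any name that is not stored is treated as existing with role 'none'.
        return self._roles.get(name, "none")

    def assign_role(self, name, role):
        # No parent names (such as 'myapp') have to be created beforehand.
        self._roles[name] = role


ns = Namespace()
ns.assign_role("myapp/q1", "queue")
print(ns.role_of("myapp"))     # none
print(ns.role_of("myapp/q1"))  # queue
```

The intermediate name 'myapp' never had to be created, yet asking for its role still gives a sensible answer.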

Assigning a role is commonly done using the Atom Publishing Protocol (there’s also a WS-Transfer head that we use for the .NET SDK bits) whereby the POSTed entry contains some form of policy that holds information about what role the name should take on, and what the applicable constraints or operational parameters are. The POST request is sent to the exact URI projection of the name you picked.

Why is that a POST and not a PUT when you already know the URI?

Because once you post a policy to a name, there’s a metamorphosis happening (think “magic little puff of smoke”, not Kafka) that transforms the name into an active messaging primitive. On success, the POST request will yield a 201 response code along with a Location header that indicates the place where you’ll further interact with the policy you just posted. The URI itself is taken over by the primitive. 

The picture on the right shows what happens in the case of a queue. As the policy is applied, the queue’s “tail” takes over the URI and two subordinate URIs are created, whereby one serves to interact with the policy and the other one to dequeue messages from the queue’s “head”.
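The effect of posting a queue policy can be sketched as a small Python simulation. This is an assumption-laden model rather than real client code: the function apply_queue_policy, the registry dictionary, the policy contents, and the "/policy" and "/head" suffixes are all placeholders made up here, since the actual subordinate URIs are not named in this post. What it does reflect from the text is the 201 status, the Location header pointing at the policy, and the original URI being taken over by the queue's tail.

```python
def apply_queue_policy(registry, uri, policy):
    """Simulate POSTing a queue policy to the URI projection of a name."""
    policy_uri = uri + "/policy"  # hypothetical suffix, for illustration only
    head_uri = uri + "/head"      # hypothetical suffix, for illustration only

    # The name itself is taken over by the primitive: it becomes the queue's tail.
    registry[uri] = {"role": "queue-tail", "messages": []}
    # Two subordinate URIs appear: one for the policy, one for dequeuing.
    registry[policy_uri] = {"role": "policy", "policy": dict(policy)}
    registry[head_uri] = {"role": "queue-head", "tail": uri}

    # 201 Created, with a Location header that points at the posted policy.
    return 201, {"Location": policy_uri}


registry = {}
queue_uri = "https://clemensv.servicebus.windows.net/myapp/q1"
status, headers = apply_queue_policy(registry, queue_uri, {"role": "queue"})
print(status, headers["Location"])
```

This also hints at why a POST fits better than a PUT: the resource you interact with afterwards (the policy) lives at a different URI than the one you posted to.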

Any name can play any role that’s supported by the system. We currently have a “metadata” role where you just stick an external reference such as a URI or a WS-Address endpoint reference into the name in the registry. We have a “connection point” role that’s established by the WCF listeners as they take over a name to listen on the Service Bus. And we’ve got these two new roles “queue” and “router” that I’m going to explain here.

Queues and Routers

A Router is a publish/subscribe message distribution primitive that allows “push” subscribers to subscribe and get messages that flow into the Router. A Queue is, well, a queue that accepts messages and holds them until one or more consumers come by and “pull” the messages off the queue. We’re explicitly allowing for Routers to subscribe to Routers and for Queues to subscribe into Routers. The resulting composite is typically quite a bit more powerful than any of the primitives alone. That is why we call these capabilities “primitives”: they explicitly allow for composition.

In the picture on the left you see one possible composition pattern.

We’ve got a number of processing services that we want to load balance jobs across. We also have an auditing service that ought to see and log every single “raw” job message that goes into the system.

The audit service is particularly interested in not losing any messages until they are secured on disk, while the processing services want to get their work pushed to them and run as fast as they can.

The setup is that we’re creating a Router with a message distribution policy of “All” that sends each message to all subscribers. Then we create a secondary Router with a distribution policy of “One”, which sends any incoming message to exactly one randomly selected current subscriber – which solves the load balancing problem for the Processing Service.

For auditing, we also create a Queue that subscribes into the top-level Router, so it receives every message and holds it until the Audit Service picks it up.
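The composition described above can be modelled in a few lines of Python. This is an in-memory sketch, not SDK code: the Router, Queue and Worker classes and their deliver/subscribe methods are hypothetical names, and dropping a message when a Router has no subscribers is an assumption of the sketch, not documented behavior. The “All” and “One” distribution policies follow the description in the text.

```python
import random


class Queue:
    """Holds messages until a consumer pulls them."""

    def __init__(self):
        self.messages = []

    def deliver(self, message):
        self.messages.append(message)


class Worker:
    """Stands in for a push subscriber such as a processing service."""

    def __init__(self, name):
        self.name = name
        self.jobs = []

    def deliver(self, message):
        self.jobs.append(message)


class Router:
    def __init__(self, distribution):
        if distribution not in ("All", "One"):
            raise ValueError("distribution must be 'All' or 'One'")
        self.distribution = distribution
        self.subscribers = []

    def subscribe(self, target):
        # Targets can be workers, Queues, or other Routers.
        self.subscribers.append(target)

    def deliver(self, message):
        if not self.subscribers:
            return  # assumption of this sketch: nothing to do without subscribers
        if self.distribution == "All":
            for subscriber in self.subscribers:
                subscriber.deliver(message)
        else:
            # "One": exactly one randomly selected current subscriber.
            random.choice(self.subscribers).deliver(message)


top = Router("All")        # fans every job out to all subscribers
balancer = Router("One")   # load-balances across the processing services
audit_queue = Queue()      # keeps a copy of every raw job for auditing
workers = [Worker("p1"), Worker("p2"), Worker("p3")]

top.subscribe(balancer)
top.subscribe(audit_queue)
for worker in workers:
    balancer.subscribe(worker)

for i in range(6):
    top.deliver(f"job-{i}")

print(len(audit_queue.messages), sum(len(w.jobs) for w in workers))
```

Every job lands in the audit queue and is also handed to exactly one worker, which is the behavior the two Routers and the Queue are composed to produce.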

The Audit Service would use the peek/lock pattern to get the messages off the queue. That means that the consumer puts an exclusive lock on the message that’s being retrieved and that the message is removed from the view of any competing consumers. The message isn’t gone, though. If the consumer doesn’t acknowledge the message within a minute, the message pops back into view. That means that if the Audit Service were to get a message but fumbled it or couldn’t get it on disk, the message wouldn’t be lost, even in the case of a catastrophic failure. Once the Audit Service has the message on disk, it deletes the lock and that finally removes the message from the Queue.
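Here is a minimal Python sketch of the peek/lock pattern as described above. The PeekLockQueue class and its method names are hypothetical, and the way an expired lock is treated (the stale lock can no longer delete the message) is an assumption of this sketch. The one-minute lock duration comes from the description; the clock is injectable so the timeout can be demonstrated without waiting.

```python
import itertools
import time


class PeekLockQueue:
    def __init__(self, lock_seconds=60.0, clock=time.monotonic):
        self._lock_seconds = lock_seconds
        self._clock = clock
        self._ids = itertools.count(1)
        # Each entry is [lock_id or None, lock_expiry, message], in arrival order.
        self._entries = []

    def enqueue(self, message):
        self._entries.append([None, 0.0, message])

    def peek_lock(self):
        """Return (lock_id, message) for the first visible message, or None."""
        now = self._clock()
        for entry in self._entries:
            lock_id, expiry, message = entry
            if lock_id is None or expiry <= now:
                # Unlocked, or the previous lock timed out: lock it for this consumer.
                entry[0] = next(self._ids)
                entry[1] = now + self._lock_seconds
                return entry[0], message
        return None

    def delete_lock(self, lock_id):
        """Acknowledge a message; returns False if the lock is unknown or expired."""
        now = self._clock()
        for index, (held, expiry, _message) in enumerate(self._entries):
            if held == lock_id and expiry > now:
                del self._entries[index]
                return True
        return False


now = [0.0]  # fake clock so the example does not have to sleep
queue = PeekLockQueue(lock_seconds=60.0, clock=lambda: now[0])
queue.enqueue("job-1")

lock1, message = queue.peek_lock()      # first consumer locks the message
hidden = queue.peek_lock()              # competing consumer sees nothing
now[0] = 61.0                           # first consumer never acknowledges
lock2, message_again = queue.peek_lock()  # message is visible again
stale = queue.delete_lock(lock1)        # the old lock no longer applies
done = queue.delete_lock(lock2)         # acknowledged: message is removed
```

The message only leaves the queue on the final delete_lock call, so a consumer that crashes between retrieving and storing it cannot lose it.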

So that’s the background on the relationship of Names and Policies and Queues and Routers and how they are designed for composition. In the next posts I’ll go into detail on what the policies for Queues and Routers look like, how you apply them via the SDK programming model or via plain HTTP and how you submit messages into and get messages out of a Router, a Queue or a composite like the one shown here.