Load Balancing at the Routing Service

This post subtitled “I hate to tell you this but….”

“Can the Routing Service do load balancing?” is one of the most
common questions I’m asked.  The immediate answer I give is almost
universally disappointing: No.  It’s actually hard to say that Load
Balancing (as most people mean it) should be done at the Routing
Service.  But why?  It seems like since I have this intermediary that
load balancing is one of the obvious applications.  Here’s why it
doesn’t work as well or as easily as it should.

How would I
build a filter that allowed me to do something simple like Round Robin
load balancing?  Well, it would seem that I would have a filter, one for
each destination that I wanted to send to, and as each message came
through, only one of the selected filters would respond that it
matched.  So a setup like this:

RoundRobinFilter1 –>
Endpoint1

Round RobinFilter2 –> Endpoint2

RoundRobinFilter3
–> Endpoint3

Which we then fed 3 messages would result in
message 1 going to Endpoint1, message 2 to Endpoint 2, etc.

But
building this is hard :) Why?  You immediately run into two issues:

  1. How do these MessageFilters exchange information?  Basically
    how does each filter know that this time it’s the one that needs to
    match (and the other two know that they’re not)?
  2. How does
    this information stay around between messages?  Remember that
    MessageFilters are designed to be stateless, meaning that they don’t
    normally have any information and would always match a given message the
    same way.

We actually have a sample out that shows how
this would work, but it relies on a “creative” use of the MessageFilter
API.  The hint is that MessageFilters can be a used as a factory to
create the MessageFilterTable that they belong to.  Another key point is
that MessageFilterTables can contain child MessageFilterTables (usually
of a specific type).  An example of this in the framework today is the
XPathMessageFilter.  When you add a XPathMessageFilter to the Routing
Service’s MessageFilterTable, that’s actually not what happens. 
Instead, this is the structure you end up with:

XPathMFT

Where did this middle
XPathMessageFilterTable come from?  Reflector gives us the answer. 
When the XPathMessageFilter is added to the parent MessageFilterTable
(red), the parent MFT actually calls CreateFilterTable on the
XPathMessageFilter, at which point the XPathMessageFilter creates a new
XPathMessageFilterTable and returns it to the parent MessageFilterTable
to use.

XPathMFTCreateFilterTable

When the parent
MessageFilterTable is asked by the RoutingService to return a list of
endpoints that match a particular message, the parent MessageFilterTable
in turn delegates this message to its own internal MessageFilterTables
and MessageFilters, building the full result set and returning it to the
Routing Service.

The sample RoundRobinMessageFilter inside the
Routing Service’s Advanced Filters sample (over
here
) does much the same thing as the XPathMessageFilter.  Now that
we have a custom MessageFilterTable in the mix, we can use that to
store state and coordinate the responses from the individual
MessageFilters that ultimately gets returned to the parent
MessageFilterTable.  So can you build things this way?  Absolutely.  Is
it the best answer? Probably not.  At the end of the day you’re still
stuffing state into the MessageFilter model, where it doesn’t belong. 
Furthermore, implementing custom MessageFilterTables is a pain because
it requires you to stub in a lot of method calls that won’t ever be
used.

Another answer is a different custom MessageFilter, which
wouldn’t do RoundRobin specifically, but would send messages to a
particular endpoint 1/n of the time, on average, where n is the number
of possible destinations.  This is also buildable, and could even be
stateless, but could also be tricky because you’d have to figure out how
to gaurentee that only one filter matched any given message.  Just
having say four filters that all compute a rand() between 0 and 1 and
then determine if it’s in a particular range isn’t going to cut it,
since multiple filters could independently report that they matched.  I
leave this one as an exercise to the reader, but one solution is to
predefine your ranges and use some piece of message data (such as say,
the time the message was sent, some random int that the client injected,
or a hash of some other piece of data) to pick which filter matches. 
You could also get really creative with your use of the DynamicUpdate
functionality to periodically switch out which endpoints the Routing
Service was pointed to, but this seems a kludge.

The final
solution, and the one I’m going to recommend, is to use a different
extensibility point within the Routing Service.  Note that the
MessageFilterTable that we use is a MessageFilterTable<IEnumerable<ServiceEndpoint>>
(something I alluded
to
a while ago).  Why not write your own custom
IEnumerable<ServiceEndpoint> and plug that into the right hand
side of the MessageFilterTable?  That way your filter selection could be
simple (you could even use just a MatchAll if you didn’t need any other
semantics).  Here’s what that would probably look like:

   class
RoundRobinEndpointList :
IEnumerable<ServiceEndpoint>

    {

        ServiceEndpoint[] endpoints;

        int head;

        public
RoundRobinEndpointList(params ServiceEndpoint[] endpoints)

        {

            if (endpoints == null
|| endpoints.Length == 0)

            {

                throw new ArgumentException("One or more ServiceEndpoints must be
provided", "endpoints");

            }

            //Copy the list to ensure it doesn't change on us
(and so we can lock() on our private copy)

            this.endpoints
= endpoints.ToArray();

        }

        public IEnumerator<ServiceEndpoint>
GetEnumerator()

        {

            int
currentHead;

            lock (this.endpoints)

            {

                currentHead = this.head++;

                if (this.head == this.endpoints.Length)

                {

                    //Wrap back to the start

                    this.head = 0;

                }

            }

            //If
one just wanted to return 1 endpoint as opposed to a list with

            //backup endpoints this 'yield' is all that would
be needed.

            //yield return
this.endpoints[currentHead];

            //return
results [current] ... [last]

            for
(int i = currentHead; i < this.endpoints.Length; i++)

            {

               
yield return this.endpoints[i];

            }

            //return wrap-around (if any) [0] ... [current-1]

            for (int i
= 0; i < currentHead; i++)

            {

                yield return this.endpoints[i];

            }

        }

        IEnumerator IEnumerable.GetEnumerator()

        {

            return this.GetEnumerator();

        }

    }

Overall, this is a much more pleasing
implementation of RoundRobin, and could easily be extended to other load
balancing patterns as well. It has the distinction of being 1) quite
small, 2) easy to understand (especially compared to the other
implementations) 3) thread safe, and 4) performant (since we only need
to lock on a copy of a list instead of the actual list that other
MessageFilters might need to be using).