What’s a Correlation and why do I want to Initialize it?

In .NET 4.0, we have introduced a framework for correlation.  What do I mean by correlation?  I’m glad you asked.  In our vocabulary, a “correlation” is actually one of two things:

  1. A way of grouping messages together.  A classic example of this is sessions in WCF, or even more simply the relationship between a request message and its reply. 
  2. A way of mapping a piece of data to a service instance.  Let’s use sessions as an example here also, because it makes sense that I’d want all messages in a particular session (SessionId = 123) to go to the same instance (InstanceId = 456).  In this case, we’ve created an implicit mapping between SessionId = 123 and InstanceId = 456.

As you can see, these patterns are related, which is why we call them both “correlations”.  But sessions are inherently short-lived, tied to the lifetime of the channel.  What happens if my service is long-running and the client connections aren’t?  The world of workflow services reinforces the need for a broader correlation framework, which is why we’ve invested in this area in .NET 4.0.

There are two operations that can be performed on a correlation: it can be initialized, or it can be followed.  This terminology is not new; in fact, BizTalk has had correlation for many releases.  But what does “initializing” a correlation mean?  Another great question.  It simply means creating this mapping between the data and the instance.
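To make the two operations concrete, here is a toy model in Python.  It is only a sketch of the idea, not how .NET implements it: the CorrelationTable class and its initialize and follow methods are names I made up for illustration.

```python
class CorrelationTable:
    """Toy model (hypothetical, not a .NET type): maps data to instance IDs."""

    def __init__(self):
        # (key name, value) -> instance ID
        self._instances = {}

    def initialize(self, key, value, instance_id):
        """Initialize a correlation: create the data-to-instance mapping."""
        entry = (key, value)
        if entry in self._instances:
            raise ValueError(f"correlation {entry!r} is already initialized")
        self._instances[entry] = instance_id

    def follow(self, key, value):
        """Follow a correlation: find the instance the data maps to, if any."""
        return self._instances.get((key, value))


table = CorrelationTable()
table.initialize("SessionId", 123, instance_id=456)

print(table.follow("SessionId", 123))  # the instance mapped above
print(table.follow("SessionId", 999))  # nothing was initialized for 999
```

Initializing writes an entry; following reads one.  Everything else in this post is about where the data for that entry comes from.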

There are many types of correlation available in .NET 4.0, but let’s focus on this category of associating data with an instance.  When that data comes from within the message itself (e.g. as a message header or somewhere in the body of the message), we call that content-based correlation.

OK, too much theory and not enough application; let’s look at how this manifests in WF 4.0.  With every message sent from or received by your workflow, you’ve got an opportunity to create an association between a piece of data in that message and the workflow instance.  That is, every messaging activity (Receive, SendReply, Send, and ReceiveReply) has a collection of CorrelationInitializers that lets you create these associations.  Here’s what the dialog looks like in Visual Studio 2010:

[Screenshot: CorrInitializers]

As you can see, it’s been populated with a Key and a Query.  The Key is just an identifier used to differentiate queries, e.g. DocId = 123 should not be confused with CustomerId = 123.  The Query part is how we retrieve the data from the message; in this case, it’s an XPath expression that points within the body of the ReceiveDocument request message to the Document.Id value.  Some of the resulting correlation information is pushed into a CorrelationHandle variable (the DocIdCorrelation), which will be used by later activities to follow this correlation.
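If the Key and Query pairing feels abstract, here is a rough Python sketch of what evaluating one amounts to.  The message shape, the evaluate_query function, and the path are all assumptions for illustration; the real query is an XPath expression evaluated by the runtime against the actual message.

```python
import xml.etree.ElementTree as ET

# Hypothetical request body; the real ReceiveDocument message may look different.
MESSAGE = """
<ReceiveDocument>
  <Document>
    <Id>123</Id>
    <Title>Q3 budget</Title>
  </Document>
</ReceiveDocument>
"""


def evaluate_query(message_xml, key, path):
    """Pull one value out of the message and pair it with its key name."""
    root = ET.fromstring(message_xml)
    node = root.find(path)
    if node is None or node.text is None:
        raise LookupError(f"query {path!r} matched nothing in the message")
    return (key, node.text.strip())


correlation = evaluate_query(MESSAGE, "DocId", "./Document/Id")
print(correlation)

# The key name is part of the result, so DocId = 123 and CustomerId = 123
# never collide even though the values are equal.
print(correlation == ("CustomerId", "123"))
```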

Now, you might be wondering: if I already have the Document.Id value in a workflow variable, why do I need this XPath expression in order to initialize my correlation?  That’s a great point.  In fact, we wrote another activity just for this purpose: the InitializeCorrelation activity.  As you would expect, the dialog looks very similar to what we just saw (Note: this dialog differs from the one in Visual Studio 2010 Beta 2):

[Screenshot: InitializeCorrelation]

This activity is particularly useful in scenarios where the data to initialize the correlation comes from a database, or if you need to create a composite correlation from multiple pieces of data like a customer’s first and last name concatenated together.  For all of you BizTalk experts, this means no more “dummy send” pattern!  Hooray!
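Here is a small Python sketch of the composite case, where the values are already in hand and no message query is needed.  The composite_correlation function is a made-up name, and keeping the parts as named pairs rather than one concatenated string is my own choice for the sketch (it avoids "ab" + "c" colliding with "a" + "bc"), not a statement about how the activity combines them.

```python
def composite_correlation(data):
    """Build one hashable correlation key from several named values."""
    # Sorting by name makes the key independent of the order the parts arrive in.
    return tuple(sorted(data.items()))


instances = {}  # composite key -> instance ID

# Initialize from values we already hold, e.g. loaded from a database.
key = composite_correlation({"FirstName": "Ada", "LastName": "Lovelace"})
instances[key] = 456

# A later lookup with the same data, supplied in a different order.
lookup = composite_correlation({"LastName": "Lovelace", "FirstName": "Ada"})
print(instances[lookup])
```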

OK, you’ve initialized a correlation … now what?  Regardless of how you’ve initialized it, a correlation is only useful if it is followed.  A Receive activity follows a correlation by correlating on the same piece of data (in this case, a DocId) and specifying the same CorrelationHandle.  Imagine that the document approval process includes some opportunity to update the document before the official approval is given.  An UpdateDocument request message is sent, which contains the Document.Id value.  Here we specify the XPath expression to point to that particular piece of data in our incoming message.  We also set the CorrelatesWith property to the same CorrelationHandle we specified previously; this ensures that when the Receive activity starts executing and goes idle (sets up its bookmark), the WorkflowServiceHost knows what correlation this workflow is waiting on and can resume the correct instance when the message with the corresponding Document.Id comes in.
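To picture the host’s side of this, here is a toy dispatcher in Python.  ToyHost and its methods are hypothetical stand-ins, not the WorkflowServiceHost API, and the messages are plain dictionaries rather than real requests; the point is only that the value in the incoming message picks which waiting instance gets resumed.

```python
class ToyHost:
    """Hypothetical stand-in for a host that routes messages to instances."""

    def __init__(self):
        self._correlations = {}  # (key, value) -> instance ID
        self.inbox = {}          # instance ID -> messages delivered so far

    def initialize(self, instance_id, key, value):
        """Record that this instance is associated with key = value."""
        self._correlations[(key, value)] = instance_id
        self.inbox.setdefault(instance_id, [])

    def dispatch(self, key, message):
        """Follow the correlation: read the value, then find the instance."""
        instance_id = self._correlations.get((key, message[key]))
        if instance_id is None:
            raise LookupError(f"no instance is waiting on {key}={message[key]!r}")
        self.inbox[instance_id].append(message)
        return instance_id


host = ToyHost()
host.initialize(instance_id=456, key="DocId", value="123")
host.initialize(instance_id=789, key="DocId", value="124")

# An update for document 124 reaches the second instance only.
update = {"DocId": "124", "Operation": "UpdateDocument"}
resumed = host.dispatch("DocId", update)
print(resumed)
```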

[Screenshot: CorrelatesOn]

And now you can consider yourself a content-based correlation expert!  No longer are you restricted to using a context-based binding for communicating with instances of your workflow services!  Now that’s freedom.  Give it a shot and let us know what you think!