WF4 Collection Activities and Object Equivalence

This morning, while reviewing sample code, I ran across the collection activities. Working with these activities is fairly straightforward, but there are some issues you should be aware of that can affect the behavior of your workflow. These issues are familiar to .NET developers because they come down to the basics of object equivalence and how the collection classes behave. However, many workflow developers are building workflow activities that are meant to be used by non-developers. For this reason I want to revisit the issue with you.

What problem are you talking about?

Scenario: Create an activity that ensures a customer exists in the list

Given

  • A workflow with a variable of type List<Customer>
  • And an InArgument<Customer>
  • That does not exist in the list

When

The activity is invoked

Then

The customer is added to the list

Sounds simple, right? Let’s implement the scenario with a workflow. The goal is to add a customer if and only if it does not already exist in the collection.

[image: the EnsureCustomerIsInList workflow]

Now let’s write some test code to verify the behavior using Microsoft.Activities.UnitTesting:

    [TestMethod]
    public void AddCustomerIfNotInList()
    {
        var collection = new Collection<Customer>();
        var customer = new Customer();
        var activity = new EnsureCustomerIsInList();
        var host = new WorkflowInvokerTest(activity);

        var inputs = new Dictionary<string, object> { { "CustomerCollection", collection }, { "Customer", customer } };

        try
        {
            host.TestActivity(inputs);
            Assert.IsTrue(collection.Contains(customer));
        }
        finally
        {
            host.Tracking.Trace();
        }
    }

Sure, this looks great, but there is a hidden danger. Can you find it?

If you are having trouble finding it, here is a unit test that reveals the danger. In this test I’ve defined a copy constructor for the customer type, and we will create two customer objects, customer1 and customer2. Then I will add customer1 to our customer list before passing the customer list and the customer2 object to our activity.

    [TestMethod]
    public void WhatHappensIfCopiedCustomerExists()
    {
        var collection = new Collection<Customer>();
        var customer1 = new Customer();

        // Add the customer to the list
        collection.Add(customer1);

        // Create a copy of the same customer
        var customer2 = new Customer(customer1);

        var activity = new EnsureCustomerIsInList();
        var host = new WorkflowInvokerTest(activity);

        // Pass the copy to the activity
        var inputs = new Dictionary<string, object> { { "CustomerCollection", collection }, { "Customer", customer2 } };

        try
        {
            host.TestActivity(inputs);

            this.TestContext.WriteLine("Customer collection count {0}", collection.Count);
            foreach (var cust in collection)
            {
                this.TestContext.WriteLine("Customer ID {0}", cust.Id);
            }

            // The copy will be added because the reference is not equal
            // This is probably not what you want
            Assert.IsTrue(collection.Contains(customer1));
            Assert.IsTrue(collection.Contains(customer2));

            // Both customer1 and customer2 are in the collection
            Assert.AreEqual(2, collection.Count);
        }
        finally
        {
            host.Tracking.Trace();
        }
    }

What just happened?

Look at the test results below. The collection holds two customer objects. They are different object instances, but the data they hold represents the same customer. Is this the correct behavior? In some cases yes, but for a business object such as a Customer, the answer is most likely no.

    Customer collection count 2
    Customer ID 2ee4f0ae-fdee-4fca-951a-81cad05ba5d3
    Customer ID 2ee4f0ae-fdee-4fca-951a-81cad05ba5d3

What is the solution?

This issue can result in very subtle bugs, especially when you cross layers between code and workflow activities. The solution in this case gets to the very heart of what we mean when we say that two Customer objects are equal.

Consider the Customer type:

[image: the Customer type]

Given two of these, what do we mean when we say customer1 == customer2? If we don’t provide an answer, Object.Equals will provide one. Its default answer is that the two are equal only if customer1 and customer2 refer to the same object instance.

In my test, when I constructed customer2 using a copy constructor, I created a second object instance with its own reference. As far as Object.Equals is concerned, that is a different customer. Yet from the business rules point of view we might say that the two customer objects are equal because customer1.Id == customer2.Id.
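To make the default behavior concrete, here is a minimal sketch that runs outside of workflow. The Customer class in it is a stand-in I am assuming for illustration (an Id, a Name, and a copy constructor); the real type in the sample may look different.

```csharp
using System;

// Stand-in for the sample's Customer type (an assumption for illustration).
public class Customer
{
    public Customer()
    {
        this.Id = Guid.NewGuid();
    }

    // Copy constructor: a new object instance carrying the same data.
    public Customer(Customer other)
    {
        this.Id = other.Id;
        this.Name = other.Name;
    }

    public Guid Id { get; set; }

    public string Name { get; set; }
}

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        var customer1 = new Customer();
        var customer2 = new Customer(customer1);

        // Same business identity...
        Console.WriteLine(customer1.Id == customer2.Id);                  // True

        // ...but two different object instances, so every default
        // equality check says they are different customers.
        Console.WriteLine(object.ReferenceEquals(customer1, customer2)); // False
        Console.WriteLine(customer1 == customer2);                        // False
        Console.WriteLine(customer1.Equals(customer2));                   // False
    }
}
```

Nothing in this sketch is specific to WF4. The collection classes ask the same question of the objects they hold, which is why the copy ends up in the list.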

Fortunately there is a solution. To see it in action I’ve created a new type derived from Customer.

[image: the EquatableCustomer type]

EquatableCustomer implements IEquatable<T> and also overrides Object.Equals and Object.GetHashCode(). This lets you define what equality means for the type. On a side note, ReSharper (which I totally love) will happily implement IEquatable<T> for you and warn you if you don’t override the appropriate methods. Here is the implementation:

    using System;

    public class EquatableCustomer : Customer, IEquatable<EquatableCustomer>
    {
        public EquatableCustomer()
        {
            this.Id = Guid.NewGuid();
        }

        // Copy constructor
        public EquatableCustomer(Customer other)
        {
            this.Id = other.Id;
            this.Name = other.Name;
        }

        public override bool Equals(object obj)
        {
            if (ReferenceEquals(null, obj))
            {
                return false;
            }

            if (ReferenceEquals(this, obj))
            {
                return true;
            }

            if (obj.GetType() != typeof(EquatableCustomer))
            {
                return false;
            }

            return this.Equals((EquatableCustomer)obj);
        }

        // Must agree with Equals: equal customers return equal hash codes
        public override int GetHashCode()
        {
            return this.Id.GetHashCode();
        }

        // Two customers are equal when their Ids are equal
        public bool Equals(EquatableCustomer other)
        {
            if (ReferenceEquals(null, other))
            {
                return false;
            }

            if (ReferenceEquals(this, other))
            {
                return true;
            }

            return other.Id.Equals(this.Id);
        }
    }
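Why override Equals(object) and GetHashCode() in addition to implementing the interface? The sketch below is my own illustration, using trimmed-down stand-ins for the types above (I am assuming only an Id property). It shows two things: hash-based collections such as HashSet<T> rely on GetHashCode() agreeing with Equals, and a collection typed as the base class Customer cannot see IEquatable<EquatableCustomer>, so it falls back to the Equals(object) override.

```csharp
using System;
using System.Collections.Generic;

// Trimmed-down stand-ins for the types above (assumed shape: just an Id).
public class Customer
{
    public Guid Id { get; set; }
}

public class EquatableCustomer : Customer, IEquatable<EquatableCustomer>
{
    public bool Equals(EquatableCustomer other)
    {
        return !ReferenceEquals(null, other) && other.Id.Equals(this.Id);
    }

    // Simplified compared to the full implementation: no exact-type check.
    public override bool Equals(object obj)
    {
        return this.Equals(obj as EquatableCustomer);
    }

    public override int GetHashCode()
    {
        return this.Id.GetHashCode();
    }
}

public static class EqualityContractDemo
{
    public static void Main()
    {
        var id = Guid.NewGuid();
        var original = new EquatableCustomer { Id = id };
        var copy = new EquatableCustomer { Id = id };

        // HashSet<T> buckets items by hash code, then confirms with Equals.
        var set = new HashSet<EquatableCustomer> { original };
        Console.WriteLine(set.Add(copy));        // False: treated as a duplicate
        Console.WriteLine(set.Count);            // 1

        // Customer does not implement IEquatable<Customer>, so the default
        // comparer for List<Customer> calls the Equals(object) override.
        var list = new List<Customer> { original };
        Console.WriteLine(list.Contains(copy));  // True
    }
}
```

If only the IEquatable<EquatableCustomer> method were implemented, the List<Customer> check in this sketch would go back to comparing references.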

Here is a test using EquatableCustomer that will get the correct results.

    [TestMethod]
    public void WhatHappensIfCopiedEquatableCustomerExists()
    {
        var collection = new Collection<Customer>();
        var customer1 = new EquatableCustomer();

        // Add the customer to the list
        collection.Add(customer1);

        // Create a copy of the same customer
        var customer2 = new EquatableCustomer(customer1);

        var activity = new EnsureCustomerIsInList();
        var host = new WorkflowInvokerTest(activity);

        // Pass the copy to the activity
        var inputs = new Dictionary<string, object> { { "CustomerCollection", collection }, { "Customer", customer2 } };

        try
        {
            host.TestActivity(inputs);

            this.TestContext.WriteLine("Customer collection count {0}", collection.Count);
            foreach (var cust in collection)
            {
                this.TestContext.WriteLine("Customer ID {0}", cust.Id);
            }

            // The copy will not be added because both customers have the same Id
            // collection.Contains returns true for both because EquatableCustomer compares by Id
            Assert.IsTrue(collection.Contains(customer1));
            Assert.IsTrue(collection.Contains(customer2));

            // Only 1 customer is in the collection
            Assert.AreEqual(1, collection.Count);
        }
        finally
        {
            host.Tracking.Trace();
        }
    }

Summary

This is not just an issue for WF4 and the collection activities. You can reproduce the problem without workflow, but I think the danger with workflows is that the problem is hidden in the code layer, which is less visible from the workflow layer. This is especially true for people who are building workflow activities and designers for non-developers to use.
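As a minimal sketch of reproducing it without workflow, assume the activity boils down to a Contains check followed by an Add. EnsureInList is a hypothetical helper I made up for this example, and Customer is again a stand-in type with an Id and a copy constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Stand-in for the sample's Customer type (an assumption for illustration).
public class Customer
{
    public Customer()
    {
        this.Id = Guid.NewGuid();
    }

    public Customer(Customer other)
    {
        this.Id = other.Id;
    }

    public Guid Id { get; set; }
}

public static class NoWorkflowRepro
{
    // Hypothetical helper: add the customer if and only if it is not
    // already in the collection.
    public static void EnsureInList(ICollection<Customer> customers, Customer customer)
    {
        if (!customers.Contains(customer))
        {
            customers.Add(customer);
        }
    }

    public static void Main()
    {
        var customers = new Collection<Customer>();
        var customer1 = new Customer();
        var customer2 = new Customer(customer1);

        EnsureInList(customers, customer1);
        EnsureInList(customers, customer1); // same reference: not added again
        EnsureInList(customers, customer2); // copy: added, the reference differs

        Console.WriteLine(customers.Count); // 2
    }
}
```

In code the Contains call is right in front of you. In a workflow the same check sits inside an activity, where a non-developer has no reason to suspect it.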

When creating business entities like “Customer”, “Product”, “Order” and the like, you should implement IEquatable<T>, and override Equals and GetHashCode to match, to define the semantic equivalence that non-developers expect. Ask yourself: how should we define the identity of this entity? Primary keys in the database are a good hint.