Implementing the N of M Pattern in WF

The second in my series of alternate execution patterns (part 1)

I recently worked with a customer who was implementing what I would call a "basic" human workflow system. It tracked approvals, rejections and managed things as they moved through a customizable process. It's easy to build workflows like this with an Approval activity, but they wanted to implement a pattern that's not directly supported out of the box. This pattern, which I have taken to calling "n of m", is also referred to as a "Canceling partial join for multiple instances" in the van der Aalst taxonomy.

The basic description of this pattern is that we start m concurrent actions, and when some subset of those, n, complete, we can move on in our process and cancel the other concurrent actions. A common scenario for this is where I want to send a document for approval to 5 people, and when 3 of them have approved it, I can move on. This comes up frequently in human or task-based workflows. There are also a couple of "business" questions which have to be answered; the implementation can support any set of answers to them:

  • What happens if an individual rejects? Does this stop the whole group from completing, or is it simply noted as a "no" vote?
  • How should delegation be handled? Some businesses want this to break out from the approval process at this point.

The first approach the customer took was to use the ConditionedActivityGroup (CAG). The CAG is probably one of the most sophisticated out of the box activities that we ship in WF today, and it gives you a lot of control. It also lets you set the Until condition, which would allow us to specify when the CAG can complete, at which point the remaining children are cancelled (see Using ConditionedActivityGroup).

ConditionedActivityGroup

What are the pros and cons of this approach?

Pros

  • Out of the box activity, take it and go
  • Focus on approval activity
  • Possibly execute same branch multiple times

Cons

  • Rules get complex (what happens if an individual rejection causes everything to stop?)
  • I need to repeat the same activity multiple times (especially in this case, it's an approval, we know what activity needs to be in the loop)
  • I can't control what else a developer may put in the CAG
  • We may want to execute on some set of approvers that we don't know at design time, imagine an application where one of the steps is defining the list of approvers for the next step. The CAG would make that kind of thing tricky.

This led us to the decision to create a composite activity that would model this pattern of execution. Here are the steps we went through:

Build the Approval activity

The first thing we needed was the approval activity. Since we know this is eventually going to have some complex logic, we decided to take the basic approach of inheriting from SequenceActivity and composing our approval activity out of other activities (sending email, waiting on notification, handling timeouts, etc.). We quickly mocked up this activity to have an "Approver" property and a property for a timeout (which will go away in the real version, but is useful for putting some delays into the process). We also added some code activities that write information out with Console.WriteLine so we know which one is executing. We can come back to this later and make it arbitrarily complex. We also added the cancel handler so that we can catch when this activity is canceled (and send out a disregard email, clean up the task list, etc.). Implementing ICompensatableActivity may also be a good idea so that we can play around with compensation if we want to (note that we will only compensate the closed activities, not the ones marked as canceled).

Properties of the Approval Activity

Placing the Approval Activity inside our NofM activity.

What does the execution pattern look like?

Now that we have our approval activity, we need to determine how this new activity is going to execute. This will be the guide that we use to implement the execution behavior. There are a few steps this will follow:

  1. Schedule the approvals to occur in parallel, one for each approver submitted through the activity's properties.
  2. Wait for each of those to finish.
  3. When one finishes, check to see if the condition to move onward is satisfied (in this case, we increment a counter towards a "number of approvers required" variable).
  4. If we have not met the criteria, we keep on going. [We'll come back to this, as we'll need to figure out what to do if this is the last one and we still haven't met all of the criteria.]
  5. If we have met the criteria, we need to cancel the other running activities (they don't need to make a decision any more).

Implement the easy part of this (scheduling the approvals to occur in parallel)

I say this is the easy part as this is documented in a number of places, including Bob and Dharma's book. The only trickery here is that we need to clone the template activity, that is, the approval activity that we placed inside this activity at design time. Creating a new execution context from the template gives each approver its own instance to run. This is a topic discussed in Nate's now defunct blog.

    protected override ActivityExecutionStatus Execute(ActivityExecutionContext executionContext)
    {
        // here's what we need to do.
        // 1.> Schedule these for execution, subscribe to when they are complete
        // 2.> When one completes, check if rejection, if so, barf
        // 3.> If approve, increment the approval counter and compare to above
        // 4.> If reroute, cancel the currently executing branches.
        ActivityExecutionContextManager aecm = executionContext.ExecutionContextManager;
        int i = 1;
        foreach (string approver in Approvers)
        {
            // this will start each one up.
            ActivityExecutionContext newContext = aecm.CreateExecutionContext(this.Activities[0]);
            GetApproval ga = newContext.Activity as GetApproval;
            ga.AssignedTo = approver;
            // this is just here so we can get some delay and "long running ness" to the
            // demo
            ga.MyProperty = new TimeSpan(0, 0, 3 * i);
            i++;
            // I'm interested in what happens when this guy closes.
            newContext.Activity.RegisterForStatusChange(Activity.ClosedEvent, this);
            newContext.ExecuteActivity(newContext.Activity);
        }
        return ActivityExecutionStatus.Executing;
    }

Code in the execute method

One thing that we're doing here is calling RegisterForStatusChange(). This is a friendly little method that lets me register for a status change event (thus it is very well named). It is a method on Activity, and I can register for different activity events, like Activity.ClosedEvent or Activity.CancelingEvent. On my NofM activity, I implement IActivityEventListener<ActivityExecutionStatusChangedEventArgs> (check out this article as to what that does and why). This requires me to implement OnEvent which, since it comes from a generic interface, is strongly typed to accept the right type of event arguments. That's always a neat trick that causes me to be thankful for generics. That's going to lead us to the next part.
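To make the generics point concrete, here is a minimal sketch that runs without WF installed. IListener<T>, StatusChangedArgs and ClosedCounter are hypothetical stand-ins that only mimic the shape of the WF listener interface; they are not the WF types themselves.

```csharp
using System;

// Hypothetical stand-in that mirrors the shape of a generic listener interface.
public interface IListener<T> where T : EventArgs
{
    void OnEvent(object sender, T e);
}

// Hypothetical event arguments carrying which child changed and its new status.
public sealed class StatusChangedArgs : EventArgs
{
    public StatusChangedArgs(string activityName, string status)
    {
        ActivityName = activityName;
        Status = status;
    }

    public string ActivityName { get; private set; }
    public string Status { get; private set; }
}

// The listener picks T once, and OnEvent is then typed to that argument class.
public sealed class ClosedCounter : IListener<StatusChangedArgs>
{
    public int ClosedCount { get; private set; }

    public void OnEvent(object sender, StatusChangedArgs e)
    {
        // No cast from EventArgs is needed; e already has the right type.
        if (e.Status == "Closed")
            ClosedCount++;
    }
}
```

Because the type parameter is fixed in the class declaration, the compiler rejects any attempt to hand this listener the wrong kind of event arguments.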

Implement what happens when one of the activities completes

Now we're getting to the fun part of how we handle what happens when one of these approval activities returns. For the sake of keeping this somewhat brief, I'm going to work off the assumption that a rejection does not adversely affect the outcome; it is simply one less person who will vote for approval. We can certainly get more sophisticated, but that is not the point of this post! ActivityExecutionStatusChangedEventArgs has a very nice Activity property which returns the Activity that caused the event. This lets us find out what happened, what the decision was, who it was assigned to, etc. I'm going to start by putting the code for my method in here and then we'll walk through the different pieces and parts.

    public void OnEvent(object sender, ActivityExecutionStatusChangedEventArgs e)
    {
        ActivityExecutionContext context = sender as ActivityExecutionContext;
        // I don't need to listen any more
        e.Activity.UnregisterForStatusChange(Activity.ClosedEvent, this);
        numProcessed++;
        GetApproval ga = e.Activity as GetApproval;
        Console.WriteLine("Now we have gotten the result from {0} with result {1}", ga.AssignedTo, ga.Result.ToString());
        // here's where we can have some additional reasoning about why we quit
        // this is where all the "rejected cancels everyone" logic could live.
        if (ga.Result == TypeOfResult.Approved)
            numApproved++;
        // close out the activity
        context.ExecutionContextManager.CompleteExecutionContext(context.ExecutionContextManager.GetExecutionContext(e.Activity));
        if (!approvalsCompleted && (numApproved >= NumRequired))
        {
            // we are done!, we only need to cancel all executing activities once
            approvalsCompleted = true;
            foreach (Activity a in this.GetDynamicActivities(this.EnabledActivities[0]))
                if (a.ExecutionStatus == ActivityExecutionStatus.Executing)
                    context.ExecutionContextManager.GetExecutionContext(a).CancelActivity(a);
        }
        // are we really done with everything? we have to check so that all of the
        // canceling activities have finished canceling
        if (numProcessed == numRequested)
            context.CloseActivity();
    }

Code from "OnEvent"

The steps here, in English

  • UnregisterForStatusChange - we're done listening.
  • Increment the number of activities which have closed (this will be used to figure out if we are done)
  • Write out to the console for the sake of sanity
  • If we've been approved, increment the counter tracking how many approvals we have
  • Use the ExecutionContextManager to CompleteExecutionContext. This marks the execution context we created for the activity as done.
  • Now let's check if we have the right number of approvals. If we do, set a flag so we know we're done worrying about approves and rejects, and then proceed to cancel the remaining activities with CancelActivity. CancelActivity schedules the cancellation; it is possible that this is not a synchronous thing (we can go idle waiting for a cancellation confirmation, for instance).
  • Then we check if all of the activities have closed. What will happen once the activities are scheduled for cancellation is that each one will eventually cancel and then close. This will cause the event to be raised and we step through the above pieces again. Once every activity is done, we finally close out the activity itself.
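The counting and cancellation logic in the steps above can also be exercised without the WF runtime. The following is a rough sketch under stated assumptions: Vote, BranchState, NofMOutcome and NofMSimulation are hypothetical names invented for illustration, cancellation is treated as immediate (the real activity has to wait for each canceled child to close), and a rejection simply counts as a "no" vote.

```csharp
using System;
using System.Collections.Generic;

public enum Vote { Approved, Rejected }

public enum BranchState { Executing, Closed, Canceled }

// Hypothetical result type: what the n-of-m group looked like when it stopped.
public sealed class NofMOutcome
{
    public bool Satisfied;
    public int Approved;
    public List<string> Canceled = new List<string>();
}

public static class NofMSimulation
{
    // arrivals: decisions in the order they come back from approvers.
    public static NofMOutcome Run(IList<string> approvers, int numRequired,
                                  IEnumerable<KeyValuePair<string, Vote>> arrivals)
    {
        // Every approver starts with an executing branch.
        var state = new Dictionary<string, BranchState>();
        foreach (string a in approvers)
            state[a] = BranchState.Executing;

        var outcome = new NofMOutcome();
        foreach (KeyValuePair<string, Vote> arrival in arrivals)
        {
            // Ignore unknown approvers and branches that are no longer executing.
            BranchState s;
            if (!state.TryGetValue(arrival.Key, out s) || s != BranchState.Executing)
                continue;

            state[arrival.Key] = BranchState.Closed;
            if (arrival.Value == Vote.Approved)
                outcome.Approved++;

            if (outcome.Approved >= numRequired)
            {
                // Threshold met: cancel every branch that is still executing.
                outcome.Satisfied = true;
                foreach (string a in approvers)
                {
                    if (state[a] == BranchState.Executing)
                    {
                        state[a] = BranchState.Canceled;
                        outcome.Canceled.Add(a);
                    }
                }
                break;
            }
        }
        return outcome;
    }
}
```

With five approvers and two approvals required, the simulation stops at the second approval and reports every approver who had not yet answered as canceled. A "rejection stops everything" policy would be an extra branch at the point where the vote is inspected.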

Using it

I placed the activity in a workflow, configured it with five approvers and set it for two to be required to move on. I also placed a code activity outputting "Ahhh, I'm done". I also put a Throw activity in there to raise an exception and cause compensation to occur to illustrate that only the two that completed are compensated for.

So, what did we do?

  • Created a custom composite activity with the execution logic to implement an n-of-m pattern
  • Saw how we can use IActivityEventListener to handle events raised by our child activities
  • Saw how to handle potentially long running cancellation logic, and how to cancel running activities in general
  • Saw how compensation only occurs for activities that have completed successfully

Extensions to this idea:

  • More sophisticated rules surrounding the approval (if a VP or two GMs say no, we must stop)
  • Non-binary choices (interesting for scoring scenarios: if the average score gets above 95%, regardless of how many approvers remain, we move on)
  • Create a designer to visualize this, especially when displayed in the workflow monitor to track it
  • Validation (don't let me specify 7 approvals required, and only 3 people)
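Two of these extensions reduce to small checks that can be sketched independently of WF. NofMRules, Validate and AverageAbove are hypothetical names for illustration only; in a real activity the first check would more likely live in a validator attached to the activity, and the second would replace the approval counter comparison.

```csharp
using System;
using System.Collections.Generic;

public static class NofMRules
{
    // Returns a list of configuration problems; an empty list means the setup is usable.
    public static IList<string> Validate(int numRequired, ICollection<string> approvers)
    {
        var errors = new List<string>();
        int count = approvers == null ? 0 : approvers.Count;

        if (count == 0)
            errors.Add("At least one approver is required.");
        if (numRequired < 1)
            errors.Add("NumRequired must be at least 1.");
        if (numRequired > count)
            errors.Add(string.Format(
                "NumRequired ({0}) exceeds the number of approvers ({1}).",
                numRequired, count));
        return errors;
    }

    // Non-binary completion rule: done once the average score so far is above the threshold.
    public static bool AverageAbove(ICollection<double> scores, double threshold)
    {
        if (scores == null || scores.Count == 0)
            return false;

        double sum = 0;
        foreach (double s in scores)
            sum += s;
        return sum / scores.Count > threshold;
    }
}
```

Asking for 7 approvals from 3 people yields a single error from Validate, while 2 of 5 yields none.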