Declarative workflow versioning and workflow instances duplication

In this post I will describe how the versioning mechanism for the declarative workflows (created either with SharePoint Designer [SPD] 2007 or 2010), in some circumstances, can cause an undesired duplication of “some” workflow instances already running on specific list items.

I refer to a sample scenario, but the concepts described here are generic. Before delving into the sample scenario, I recall two basic concepts regarding declarative workflows.

Two basic (preliminary) concepts…

1. In SharePoint, the workflow engine allows only one single workflow instance of a certain workflow type (also known as workflow model or workflow definition) to be running on a specific list item.

2. Every time a (declarative) workflow type is “modified and published” using the Workflow Designer tool in SPD 2007/2010, a new “version” of the workflow type is attached to the specific list; the previous versions of the same workflow type are maintained because there could be running workflow instances referring to those specific versions; it is not possible, however, to start new instances of older versions.

The sample scenario

I consider the simple scenario of an “issue management” workflow, where two individual contributors, in sequence, must each act on an open issue: the first is asked to propose a “technical solution”; the second is asked to define the “communication actions” to be taken.

This workflow can be described with a simple sequential flowchart, as shown below (with the actors and the status).


We can implement this scenario with two lists:

· An “Issues” list, created using the standard “Issues Tracking” list definition.

· A “Tasks” list, automatically created by SPD (at the moment of the first workflow publishing) using the standard “Tasks” list definition

On the “Issues” list we need to customize the “Issue Status” field with the four values described in the previous figure and here summarized:

· Active

· In Charge by Tech. Team

· In Charge by Comm. team

· Closed

It is also useful to hide the “Issue Status” field in the new/edit forms of the “Issue” content type associated with the list; in fact, we want to manage the value of this column automatically, using a workflow. A final customization on this list consists of renaming the “Assigned To” field to “Assigned To – Tech. Team” and adding a new identical field named “Assigned To – Comm. Team”.

The base implementation of the workflow can be realized with a single step, as shown below:

Base workflow implementation

The workflow consists of a simple sequence of the following activities:

· “Set Field in Current Item” activity (used to set the “Issue Status” field)

· “Collect Data from User” activity (used to collect the technical and communication contribution)

Let us suppose that we need to automatically activate the workflow, not only when an issue is created but also when an issue is modified:

Workflow startup options

After creating a new item…

New item creation

…the workflow starts and works as expected, as shown in the following picture:


The problem of the duplicated instances…

Now, let us suppose that, after the first publishing of this simple workflow model, the following sequence of events happens, in the order shown here below:

· there are already active workflow instances, waiting for the completion of the “first type” of task (some “technical contributors” have been already contacted by the workflow but they have not yet completed their tasks);

· a web designer opens the workflow in SharePoint Designer and presses the “Finish” button on the Workflow Designer tool (accidentally or in order to save an even minimal modification, like the change of a string message);

· one of the assignees of a “first type” of task (a “technical contributor”) completes his or her assigned task.

In this situation, the next assignee (the “communication contributor”) of that specific issue is correctly assigned the “second type” of task but, at the same time, the first assignee of that issue (the same “technical contributor” who has just answered) unexpectedly sees a new task assigned to him, identical to the one just completed and referring to the same issue! This will probably be judged as incomprehensible behavior on the part of the workflow.


Looking in more detail, we can see that a new instance of the workflow is started on that specific list item (the issue being worked on). Why does this happen? Because, after pressing the “Finish” button on the Workflow Designer tool, a new version of the workflow is created and associated with the list. This new version is not considered “the same workflow type”, so a new instance of this new version can start even if there is an already running instance of the previous version.

Specifically, in our example, when the first assignee pushes the “Complete Task” button (see step 4 in the figure above), the “Set Field in Current Item” action (setting the new issue status) causes the startup of a new workflow instance of the new workflow type; this new instance, in turn, creates a new task of the “first type” (a “technical contribution” task).

Please note that the problem is not only tied to the fact that the workflow itself updates a field on the current item: there are situations where users need to repeatedly change the list item to which workflows are attached, even when workflow instances are still running. These manual updates normally do not cause the startup of new workflow instances (just one instance of a certain workflow type can be running at a given moment) but, when a new version of a certain workflow is published, such changes cause the startup of new instances of that specific workflow version.

It is also worth noting that this problem is not tied to the existence of “task” activities in the workflow model: a “wait” activity can also cause the type of the running workflow instances to become obsolete and, therefore, possibly subject to duplication.

In summary, this unwanted duplication of workflow instances may happen for workflow types with the following characteristics:

· they are configured to start when their associated item is changed;

· they contain activities causing the “dehydration” of their workflow instances (“wait” activities or “user task” activities).

Specifically, these kinds of workflow types are subject to the described duplication of workflow instances when the following conditions are met:

· there are dehydrated workflow instances (waiting for a user task completion or for the end-condition of a “wait” activity) at the time when a new version of their workflow type is published;

· one or both of the following situations occur:

o a user updates a list item with which one of the dehydrated workflow instances is associated;

o the workflow instance itself, after the re-hydration, updates one or more fields on the associated list item.

Unfortunately, these conditions are met quite often in real-world workflow implementations. You can imagine the confusion that re-publishing a workflow can cause if it leads to the duplication of hundreds of tasks!
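As a rough illustration, the mechanism described in this section can be reproduced with a toy Python model. This is only a sketch under simplifying assumptions: `IssueList`, `item_changed` and `complete_technical_task` are hypothetical names invented for the example, not part of any SharePoint API, and the engine rule is reduced to "one running instance per item and per workflow version".

```python
from dataclasses import dataclass, field


@dataclass
class IssueList:
    """Toy model of a list with one workflow that starts on create/change."""
    latest_version: int = 1
    running: set = field(default_factory=set)   # (item_id, version) pairs
    tasks: list = field(default_factory=list)   # (item_id, task kind, version)

    def publish(self) -> None:
        # Re-publishing attaches a new version; older instances keep running.
        self.latest_version += 1

    def item_changed(self, item_id: int) -> None:
        # The engine only refuses a second instance of the *same* version.
        key = (item_id, self.latest_version)
        if key in self.running:
            return
        self.running.add(key)
        # A fresh instance always begins by assigning the technical task.
        self.tasks.append((item_id, "technical", self.latest_version))

    def complete_technical_task(self, item_id: int, version: int) -> None:
        # The re-hydrated instance sets "Issue Status": this is an item change.
        self.item_changed(item_id)
        self.tasks.append((item_id, "communication", version))


issues = IssueList()
issues.item_changed(item_id=1)                 # item created: v1 instance starts
issues.publish()                               # workflow re-published as v2
issues.complete_technical_task(1, version=1)   # old instance is re-hydrated

technical = [task for task in issues.tasks if task[1] == "technical"]
print(len(technical))  # 2: the technical task has been duplicated
```

Without the call to `publish()`, the same sequence produces exactly one technical task and one communication task, which matches the normal behavior described earlier.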

A possible solution

It is worth repeating that the duplication of workflow instances described in this post is not a bug; it is just a consequence of the workflow versioning mechanism and it appears only under the described conditions.

Here I propose a possible solution for the creation of workflows (with the characteristics mentioned above) having a more “predictable behavior” when they are updated (even if the conditions mentioned above are met).

I want to anticipate an important consideration: this solution does not come “for free”, in the sense that it causes a significant increase in the complexity of the workflow design. For this reason, my first suggestion is to carefully consider whether this unwanted workflow duplication can really cause “big problems” for your users (will you have dozens of users potentially affected? Can you handle these situations with a proper communication strategy?)

If you consider the risk of duplicated instances unacceptable in your scenario, you should first check whether:

· [Question A] you have to deal with users that can or need to update the “base list items” even when these items have workflow instances already running;

· [Question B] an instance of your workflow model, after the re-hydration, can update the current item (possibly causing the startup of another instance of a new type).

According to the answers that you give to the previous two questions, I propose below a solution that can be composed of just one or both of the following parts:

· [Solution part #1. To be used only if the answer to “Question A” is ‘yes’]: immediately stop any new workflow instance if there is another instance running for a previous version of the same workflow type.

· [Solution part #2. To be used only if the answer to “Question B” is ‘yes’]: immediately after any re-hydration, check if the current workflow instance is “of the latest published version” for its workflow type. If not, execute a “stop-and-restart”: arrange the startup of a new instance (of the latest version for the workflow type) and stop the current instance before doing any other action; the new instance should be able to restart the execution at the same “logical workflow point” where the previous instance left off.

The “part #1” of the solution can be implemented in the following manner:

· You have to create a custom list for each workflow type you have to deal with (for example, named “Running Workflow Instances - Type XYZ”); this list will contain a ‘lookup’ to the current item in your main list if and only if there is already a running workflow instance (of that specific workflow type, regardless of its version) on that specific list item.

· As its very first action, your workflow model needs to query this custom list in order to check if it contains the ‘lookup’ for the current list item. If this lookup is found, then the workflow should stop itself immediately. Otherwise the current instance must register itself by adding the lookup to the current item in that list.

· As its very last action, your workflow model needs to unregister itself by deleting the lookup to the current list item.
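The register / check / unregister protocol described in the three points above can be sketched in a few lines of Python. This is a minimal sketch only: the class `RunningInstancesRegistry` and the function `run_workflow` are hypothetical names standing in for the custom list and for the workflow model, and an in-memory set replaces the list of lookups.

```python
class RunningInstancesRegistry:
    """Stand-in for the "Running Workflow Instances - Type XYZ" list."""

    def __init__(self):
        self._item_ids = set()   # one entry per item with a running instance

    def try_register(self, item_id):
        """Return False if an instance (of any version) is already registered."""
        if item_id in self._item_ids:
            return False
        self._item_ids.add(item_id)
        return True

    def unregister(self, item_id):
        self._item_ids.discard(item_id)


def run_workflow(registry, item_id, body):
    """First action: check and register. Last action: unregister."""
    if not registry.try_register(item_id):
        return "stopped"         # another instance already owns this item
    try:
        body()
    finally:
        registry.unregister(item_id)
    return "completed"


registry = RunningInstancesRegistry()
outcomes = []


def body():
    # While the first instance is running, an item change starts a second one.
    outcomes.append(run_workflow(registry, 1, lambda: None))


outcomes.append(run_workflow(registry, 1, body))
print(outcomes)  # ['stopped', 'completed']
```

The second instance stops itself because the first one is still registered; once the first instance has unregistered, a later instance on the same item can register again.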

The “part #2” of the solution can be implemented in the following manner:

· You have to create a custom list (if you do not already have one defined; for example, you can name it “Configuration Parameters” ) where you can manually insert an item specifying the number of the latest version which has been published for the current workflow type (note: the “version” numbering here defined does not need to be the same used by the workflow versioning mechanism; you can define your own numbering strategy, for example just using 1, 2, 3, etc…).

· In your workflow model, you have to specify the current version in an appropriate workflow variable; this variable must be manually updated every time you update and re-publish the model.

· After each possible re-hydration (that is, after the completion of a user assigned task or after the completion of a pause), the workflow instance should check if its version (read from the variable) is still the same as the one specified in the configuration list; if it is the same, the instance can continue its execution; otherwise:

o the current instance must unregister itself (this step is required only if the registration mechanism previously described for the “part #1” of the solution has been set up);

o the current instance must make a “fake update” on the current item (for example, you can overwrite a field with its current value). This is required in order to trigger the start of an instance of the new version of the workflow type;

o the current instance must stop itself.

· Each action composing the workflow model should be wrapped in an appropriate “if” condition, checking for the expected status of the current list item; this will allow the new workflow instance to restart the execution exactly where the previous instance left off. The status of an item can usually be read from one of its field values or from a combination of them.
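The version check performed after each re-hydration, and the ordered hand-over that follows when the instance turns out to be stale, can be sketched as follows. This is an illustrative sketch, not an SPD artifact: `Environment` and `after_rehydration` are hypothetical names, the "Configuration Parameters" entry is reduced to a single integer, and the status-based “if” wrappers around each step are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    latest_version: int                                   # "Configuration Parameters" entry
    registered_items: set = field(default_factory=set)    # part #1 registry
    log: list = field(default_factory=list)               # actions, in order


def after_rehydration(env, item, instance_version, uses_registry=True):
    """Return True if the instance may continue, False if it handed over."""
    if instance_version == env.latest_version:
        return True                          # still the latest version: go on
    if uses_registry:
        env.registered_items.discard(item["ID"])
        env.log.append("unregister")
    item["Title"] = item["Title"]            # fake update: same value written back
    env.log.append("fake update")            # in SharePoint this triggers the new version
    env.log.append("stop")
    return False


env = Environment(latest_version=2, registered_items={7})
item = {"ID": 7, "Title": "Printer offline"}

print(after_rehydration(env, item, instance_version=1))  # False
print(env.log)  # ['unregister', 'fake update', 'stop']
```

The order matters: unregistering before the fake update gives the new instance a chance to register itself when it starts.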

Here below I show how the described strategy is practically applied to the simple scenario proposed above.

First, I want to emphasize the evident increase in the complexity of the workflow design, by showing how a “single-step” workflow is now composed of 8 steps:

Steps in the solution

Here I show how the workflow instance registration mechanism (part #1) is realized:


This figure shows how this mechanism works on a practical example:


Here I show the version registration that is performed at startup and the version check that is performed after each re-hydration (part #2):


Please note that, in my example, I can simply check the status of the item by looking at the value of a single field (the “Issue Status” field).

As you can see, in order to start a new instance I just overwrite the title field of the current item with its current value.

Here below I show how the status check must be performed at every single step:


The following figure shows how the described mechanism works; in the example, the assignee of the first task (the “Technical Contribution Task”) completes it after a new version of the workflow has been published (and registered in the “Configuration Parameters” list). As you can see in the figure, the currently running instance stops itself and a new instance automatically restarts the flow execution at exactly the correct point.


Do SharePoint 2010 and SPD 2010 change the game?

In my opinion, only minimally: the problem described in this post still exists, for the same reasons already detailed, and the described solution can still be applied.

The appreciable change is in the many enhancements of the “Workflow Editor” tool in SPD 2010; for example, it is now possible to easily nest the steps and the conditions, allowing a simpler and more readable organization of the workflow design.

The following figure shows the two parts of the solution applied to the workflow design for the scenario shown as an example in this post; as you can see, all the conditions and checks can be nicely organized into a single step.


A few other considerations

You may wonder why, in “part #1” of the proposed solution (the part aimed at avoiding workflow instance duplication when a user manually updates the current list item), I suggest writing to and reading from a custom “Running Workflow Instances” list, instead of simply checking the status of the current list item (as I do in “part #2” of the proposed solution). A status check is not practical for the aim pursued in “part #1”: in general, for every possible list item status, there are situations where a new workflow instance must start in order to continue the work of a previous instance, according to the desired “logical” organization of the flow (which may be composed of more than one declarative workflow implementation); for example, this happens when you have loops in your flowchart representation of the overall logical flow.

A simpler and smarter idea for “part #1” of the proposed solution could be to query the current list item in order to check whether there are other workflow instances running. Unfortunately, the field named after the current workflow type only gives information about the (numeric) status of the latest workflow instance run or running for the latest version of the current workflow type; it is not helpful if you want to discover the existence of workflow instances still running for the previous versions of the current workflow type.

Looking at “part #2” of the proposed solution, it is important to notice that it does not protect against the accidental re-publishing of the current workflow type (for example, in SPD 2007/2010, when the workflow model was only viewed but not modified, and the web designer closes the Workflow Editor tool with the ‘Publish’ button instead of the ‘Cancel’ button). In that case, neither the version stored in the dedicated workflow variable nor the matching item in the “Configuration Parameters” list is updated; for that reason, when a dehydrated instance of a previous workflow version is re-hydrated, the proposed stop-and-restart mechanism is not executed and the “standard but unwanted” workflow instance duplication will happen.

It is important to notice that both parts of the described solution can greatly reduce the occurrences of workflow instance duplication but do not give absolute certainty that no more unexpected situations will arise. In effect, they contain “sets of actions” that should be executed (by different workflow instances and/or by users) according to an expected logical timing sequence; unfortunately, this sequence may be subject to unexpected “timing and concurrency issues”. Just as an example: it is unlikely, but it may happen, that the new instance of the new workflow version (triggered by the stop-and-restart mechanism executed after the re-hydration, according to “part #2” of the proposed solution) starts before the deletion of the list item from the “Running Workflow Instances” list has completed; such an unpredictable situation will obviously cause the new workflow instance to stop itself.
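The timing issue mentioned in this example can be made concrete with a small deterministic sketch that simply plays the two possible orderings. The function `hand_over` and its `unregister_first` flag are hypothetical, introduced only for this illustration; a plain Python set stands in for the “Running Workflow Instances” list.

```python
def hand_over(registry, item_id, unregister_first):
    """Simulate a stale instance handing over to a new one.

    Returns True if the new instance managed to register and keep running.
    """
    def new_instance_starts():
        if item_id in registry:
            return False              # part #1 check: an entry is still present
        registry.add(item_id)
        return True

    if unregister_first:
        registry.discard(item_id)     # the deletion lands before the trigger
        return new_instance_starts()
    started = new_instance_starts()   # the new instance wins the race
    registry.discard(item_id)         # the deletion lands too late
    return started


print(hand_over({1}, 1, unregister_first=True))    # True
print(hand_over({1}, 1, unregister_first=False))   # False
```

In the second ordering the item ends up with no running instance at all, which is why this case, although unlikely, deserves attention.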

In my experience, it is difficult to predict the occurrence rate of these “timing and concurrency issues”: it may be negligible where the servers are not heavily used and where the users’ concurrency rate is small; it may be appreciable otherwise. There are other tips that reduce the incidence of this kind of problem; for example, it may be safer to introduce another custom list with another workflow, causing the restart of the main workflow (in its new version) after a pause of a few minutes (this may be considered acceptable in a human workflow implementation). I am planning to write more about my experiences with this kind of problem in other posts.

The solution proposed in this post is valid even when the workflow implementation is more complex; for example, the “conceptual model” of the overall workflow (represented with a flowchart or with a state machine or with other techniques) can contain loops (in our example, what if a third contributor needs to evaluate the first and the second contributions, possibly asking for a reworking of one or both of these contributions?). As you know, the “Workflow Designer/Editor” tool in SPD 2007/2010 does not allow the introduction of loops; in spite of this limitation, there are techniques allowing the creation of loops with the implementation of two or more workflow types triggering each other, as described in the following post (in the SPD team blog): “Service Pack 2 prevents an on-change workflow from starting itself”. Even in this situation, the solution described in this post can allow for a more predictable workflow behavior when the main workflow is updated.

Wrapping up…

In this post I have tried to pursue two aims:

· I have highlighted how the standard and powerful versioning mechanism for the SPD 2007/2010 declarative workflows, in certain conditions, can cause a duplication of workflow instances which may produce confusion and the perception of a misbehaving workflow implementation;

· I have proposed a repeatable solution designed to greatly reduce the risk that such unwanted situations will happen.

I admit that the solution I am proposing is complex and difficult to implement (but sometimes necessary) in real workflow designs, because the additional logic may have to be repeated many times in a single workflow implementation; moreover, not all the possible causes of problems are addressed (for example, the solution does not protect against “accidental” workflow re-publishing).

I will be glad to see, in the comments to this post, if other simpler and/or wider approaches can be used instead.