WF4 instance state size is smaller than WF3

Some of the performance enhancements in WF4 come from a smaller workflow instance state. Several design decisions contribute to this:

  • Separation of workflow definition from workflow instance - I like to think of this as the separation of a class and an object. WF3 keeps the workflow definition and instance data together, which is helpful if you want to do dynamic updates but hurts performance.
  • Clear scoping rules - WF3 variables did not have scope, meaning that everything had to be kept in the instance state. WF4 has changed this to use strict scoping rules where variables only live within their scope and are aggressively disposed when out of scope. Arguments are also handled in a cleaner manner.
  • Delay of initialized state on activities - All activities in a WF3 workflow are initialized when the workflow instance is started. WF4 initializes the activities only when it has to.

A good way to understand the differences in workflow instance state sizes between the two workflow versions is to examine the size of data written to the persistence store.

For this experiment, I created a workflow with a persistence point at the beginning and another at the end. Directly after each persistence point there is a pause for input so that I have time to query the database and see how large the persisted workflow instance is. Between the two persistence points, I inserted 50 empty CodeActivity activities.

In case you're wondering, I did not add 50 code activities by hand; they were code-generated. The persistence point I use in this workflow is just a delay: with the unload-on-idle time set low enough, the workflow is persisted when it goes idle. There are other ways to accomplish this, but this one was quick. The WaitForInput* code activities just call Console.ReadLine(). I installed the persistence store database and modified the persistence example from the WF3 samples to use this workflow.

using System;
using System.Threading;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;

namespace WF3LotsOfActivities
{
    class Program
    {
        static void Main(string[] args)
        {
            using (WorkflowRuntime workflowRuntime = new WorkflowRuntime())
            {
                WorkflowPersistenceService persistenceService =
                    new SqlWorkflowPersistenceService(
                    "Initial Catalog=SqlPersistenceService;Data Source=localhost;Integrated Security=SSPI;",
                    false,
                    new TimeSpan(1, 0, 0),
                    new TimeSpan(0, 0, 1));
                workflowRuntime.AddService(persistenceService);

                AutoResetEvent waitHandle = new AutoResetEvent(false);
                workflowRuntime.WorkflowCompleted += delegate(object sender, 
                    WorkflowCompletedEventArgs e)
                {
                    waitHandle.Set();
                };
                workflowRuntime.WorkflowTerminated += delegate(object sender, 
                    WorkflowTerminatedEventArgs e)
                {
                    Console.WriteLine(e.Exception.Message);
                    waitHandle.Set();
                };
                workflowRuntime.WorkflowIdled += OnWorkflowIdled;
                workflowRuntime.WorkflowPersisted += OnWorkflowPersisted;
                workflowRuntime.WorkflowUnloaded += OnWorkflowUnloaded;
                workflowRuntime.WorkflowLoaded += OnWorkflowLoaded;

                WorkflowInstance instance = workflowRuntime.CreateWorkflow(
                    typeof(WF3LotsOfActivities.Wf50Activities));
                instance.Start();

                waitHandle.WaitOne();
            }
        }

        static void OnWorkflowLoaded(object sender, WorkflowEventArgs e)
        {
            Console.WriteLine("Workflow was loaded.");
        }

        static void OnWorkflowUnloaded(object sender, WorkflowEventArgs e)
        {
            Console.WriteLine("Workflow was unloaded.");
        }

        static void OnWorkflowPersisted(object sender, WorkflowEventArgs e)
        {
            Console.WriteLine("Workflow was persisted.");
        }

        static void OnWorkflowIdled(object sender, WorkflowEventArgs e)
        {
            Console.WriteLine("Workflow is idle.");
            e.WorkflowInstance.TryUnload();
        }
    }
}

The SQL script I'm using to determine the size of the persisted workflow instance is pretty simple. The important portion of the size is the state column in the InstanceState table.

SELECT DATALENGTH(state) AS size
FROM   [InstanceState]

Running this code and using the SQL script, I found the first persistence point to have a size of 8534 bytes. At the second persistence point, the size is 9228 bytes. So running the 50 activities added 694 bytes even though those activities don't have any arguments or variables.

The next thing to do was write the same workflow in WF4.

WF4 has no direct equivalent of the empty WF3 CodeActivity, so I created an activity called Comment that does nothing. The code for Comment looks like this:

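using System.Activities;
using System.ComponentModel;
using System.Windows.Markup;

// A no-op activity that stands in for the empty WF3 CodeActivity.
[ContentProperty("Body")]
public sealed class Comment : CodeActivity
{
    public Comment()
        : base()
    {
    }

    // Optional child activity; it is never scheduled.
    [DefaultValue(null)]
    public Activity Body
    {
        get;
        set;
    }

    // Intentionally empty: the activity does no work.
    protected override void Execute(CodeActivityContext context)
    {
    }
}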

I also had to create a ReadLine activity.

[ContentProperty("Body")]
public sealed class ReadLine : CodeActivity
{
    public ReadLine() : base() { }

    [DefaultValue(null)]
    public Activity Body { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        Console.WriteLine("Press enter to continue");
        Console.ReadLine();
    }
}

With that in place, the code for running the workflow looks like this:

using System;
using System.Activities;
using System.Activities.DurableInstancing;
using System.Configuration;
using System.Threading;

class Program
{
    // Signaled when the workflow completes so Main can exit.
    private static ManualResetEvent mre = new ManualResetEvent(false);

    static void Main(string[] args)
    {
        WorkflowApplication wfApp = new WorkflowApplication(new Wf50Activities());

        // Configure the SQL instance store; compression is off for this measurement.
        SqlWorkflowInstanceStore instanceStore = new SqlWorkflowInstanceStore(
            ConfigurationManager.ConnectionStrings["WorkflowPersistenceStore"].ConnectionString);
        instanceStore.RunnableInstancesDetectionPeriod = TimeSpan.FromSeconds(1);
        instanceStore.InstanceCompletionAction = InstanceCompletionAction.DeleteAll;
        instanceStore.InstanceEncodingOption = InstanceEncodingOption.None;
        wfApp.InstanceStore = instanceStore;

        wfApp.Completed = new Action<WorkflowApplicationCompletedEventArgs>((e) =>
        {
            mre.Set();
        });
        wfApp.Idle = new Action<WorkflowApplicationIdleEventArgs>((e) =>
        {
            Console.WriteLine("Workflow idled");
        });
        wfApp.Run();

        // Block until the Completed callback fires.
        mre.WaitOne();
    }
}

Notice that I have the InstanceEncodingOption set to None. This keeps the GZip compression off as I want to measure the uncompressed size.

The SQL script that I use to calculate the size of the persisted instance is a tiny bit more complex since the WF4 persistence store has divided the data into separate columns.

SELECT ISNULL(DATALENGTH([ReadWritePrimitiveDataProperties]), 0) +
       ISNULL(DATALENGTH([WriteOnlyPrimitiveDataProperties]), 0) +
       ISNULL(DATALENGTH([ReadWriteComplexDataProperties]), 0) +
       ISNULL(DATALENGTH([WriteOnlyComplexDataProperties]), 0) size
FROM   [System.Activities.DurableInstancing].Instances

The first persistence point shows a size of 4925 bytes. That's a little more than half the size of the WF3 instance. The second persistence point has a size of 4935 bytes. That's only 10 new bytes after running those 50 activities.

So, how about we take that workflow and expand the 50 activities to 1000 activities? The table below shows the measurements from this.

All sizes are in bytes.

       # Activities   Initial Size   Final Size   Delta
WF3              50           8534         9228     694
               1000          83163        95784   12621
              Delta          74629        86556
WF4              50           4925         4935      10
               1000           4928         4940      12
              Delta              3            5

As you can see, the number of activities in the workflow has virtually no effect on the persisted size in WF4. In WF3, the number of activities has a profound impact on the serialized size. Also notice that in WF3 the growth between the two persistence points (the Delta column) scales with the activity count, from 694 to 12621 bytes, while in WF4 it stays at roughly 10 bytes. WF4 is handling the scoping much better.
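To put a rough number on that, the measurements can be turned into an average cost per additional activity. This is only a back-of-the-envelope sketch: the StateSizeMath class and BytesPerActivity method are names I made up for illustration, and the inputs are the initial sizes from the table above.

using System;

public static class StateSizeMath
{
    // Average bytes of persisted state added per extra activity in the workflow.
    public static double BytesPerActivity(int sizeSmall, int sizeLarge,
                                          int countSmall, int countLarge)
    {
        return (double)(sizeLarge - sizeSmall) / (countLarge - countSmall);
    }

    public static void Main()
    {
        // Initial sizes measured with 50 and with 1000 activities.
        double wf3 = BytesPerActivity(8534, 83163, 50, 1000);
        double wf4 = BytesPerActivity(4925, 4928, 50, 1000);

        Console.WriteLine("WF3: {0:F1} bytes per activity", wf3);
        Console.WriteLine("WF4: {0:F3} bytes per activity", wf4);
    }
}

For these numbers, WF3 comes out at roughly 79 bytes of state per additional activity, while WF4 stays well under one byte.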

If we set the InstanceEncodingOption to GZip on the WF4 persistence provider, the persisted state shrinks further. The initial size for 1000 activities drops from 4928 bytes to 1534 bytes, and the final size is 1547 bytes.
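Switching between the two encodings is a one-property change on the instance store. The helper below is a hypothetical wrapper of my own (InstanceStoreFactory is not part of WF4) around the same store settings used in the host code earlier. It is a configuration sketch rather than the exact code behind the measurements, and it assumes a .NET Framework 4 project that references System.Activities.DurableInstancing and System.Runtime.DurableInstancing.

using System;
using System.Activities.DurableInstancing;

public static class InstanceStoreFactory
{
    // Hypothetical helper: builds the store with or without GZip compression.
    public static SqlWorkflowInstanceStore Create(string connectionString, bool compress)
    {
        return new SqlWorkflowInstanceStore(connectionString)
        {
            // GZip compresses the serialized instance data before it is written.
            InstanceEncodingOption = compress
                ? InstanceEncodingOption.GZip
                : InstanceEncodingOption.None,
            InstanceCompletionAction = InstanceCompletionAction.DeleteAll,
            RunnableInstancesDetectionPeriod = TimeSpan.FromSeconds(1)
        };
    }
}

In the host, wfApp.InstanceStore = InstanceStoreFactory.Create(connectionString, true); would then turn compression on, with connectionString read from configuration as before.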

Obviously the size difference will depend on the type of workflow, the variables, and where in the workflow the persistence occurs. But it should be clear from this experiment that there are substantial reductions in the size of the workflow instance, and that will translate into less memory pressure on the host, less network traffic when contacting the SQL database, and less disk space taken up on the SQL database.