Less is More

I am a musician. Over the years I have played keyboards in many bands, traveling and performing mostly as a hobby. I’ll never forget what one of my mentors taught me. He said:

“The rest is just as important as the note”

The rest is the part of the song where you don’t play. Young musicians often forget this and take great delight in trying to dazzle you with their dexterity. Put a group of young musicians together and you get a cacophony of noise. The purpose, the beauty, the soul of the music is lost in the means of its production.

When I enter a planning process in building software, I want to keep this principle in mind. The things I leave out are just as important as the things I put in.

What Problem Are We Solving?

Unlike music or art, which can exist simply for their beauty, software is utilitarian. It is a tool, and no tool ever existed without a problem that it seeks to solve. The first step in building a great tool is to be very clear about the problem.

  • What is the problem?
  • Who has this problem?
  • When does this problem occur?
  • What would make it better?

Why This Matters

“Well those drifter’s days are past me now
I’ve got so much more to think about
Deadlines and commitments
What to leave in, what to leave out”

– Bob Seger – “Against the Wind”

The art of building great software begins with the art of making great choices. We can’t do everything… we need to do the best thing that we can with the time and resources we’ve got.

Taking a Risk

Right now I’m still in the stage of defining the problem, but I’m going to be as transparent as possible with you in this process in the hope that you can give me feedback. There is some risk in this. You might disagree with my conclusions. You might wonder about the priorities that we have as a team. You might get the wrong impression… blah, blah, blah…

We live in a time of increasing secrecy in the tech industry. Companies take delight in hiding the details and then, in one dramatic moment, pulling back the curtain to a round of applause. I suppose that for some things that is the best approach, but you and I are in a different place. We need each other and we need to be transparent, which means we run the risk of misunderstanding, and that’s OK.

What Problem is Ron Solving?

I’ve been given the job of making Workflow and System.Threading.Tasks.Task work well together. I began by researching the situation as it exists today. I might have come to the conclusion that there is no problem, or that even if there is a problem it is so small that it’s not worth addressing. Or I could have concluded that the problem is so massive that nothing can be done about it. I came to my initial conclusions by talking with some of you, drawing on my own experience, and spending many hours prototyping and trying to work with tasks and workflows together.

What is the problem?

There are two key problems and one opportunity.

  1. Problem: Windows Workflow Foundation (WF) has a large API surface composed of three key classes (WorkflowInvoker, WorkflowApplication and WorkflowServiceHost), none of which is Task-enabled as defined by the Task Asynchronous Pattern.
  2. Problem: Developers who want to implement the Task Asynchronous Pattern in their hosting code and activity code will find it difficult if not impossible to do so.
  3. Opportunity: Because we have to introduce new API surface to solve these problems, we may be able, if we are very careful, to make learning and using Workflow both easier and more powerful at the same time.

Problem: WF does not implement the Task Asynchronous Pattern (TAP)

Here are a few examples:

Task Async Pattern | Workflow
Async methods should end with the “Async” suffix | WorkflowApplication.Run() is async but has no “Async” suffix
Async methods should return a Task so the caller can call Task.Wait() | Async methods return void
Result accessed with Task.Result | Caller must implement delegates and wait handles to wait and access result
Exceptions in task are marshaled to calling thread in an AggregateException | Exceptions in Activities are handled by the Aborted delegate or the Completed delegate depending on the UnhandledExceptionAction of the host
Tasks are canceled by calling CancellationTokenSource.Cancel(), which signals a CancellationToken that is passed to child tasks | Activities are canceled by WorkflowApplication.Cancel or NativeActivityContext.CancelChild / CancelChildren

As more and more developers become familiar with System.Threading.Tasks.Task and the way it works with other classes in the .NET Framework, they will expect Workflow to work the same way. If it did, someone who is familiar with Task would find Workflow more familiar.
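
To make the gap concrete, here is a minimal sketch of the kind of adapter a host has to write today to get a Task out of WorkflowApplication. WorkflowTaskSketch and RunAsync are hypothetical names invented for this illustration and are not part of WF. Mapping a canceled workflow to a canceled task is an assumption of the sketch, not a decision that has been made.

```csharp
using System;
using System.Activities;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper (not part of WF): adapts WorkflowApplication's
// delegate-based notifications to a Task the caller can wait on.
public static class WorkflowTaskSketch
{
    public static Task<IDictionary<string, object>> RunAsync(Activity workflow)
    {
        var tcs = new TaskCompletionSource<IDictionary<string, object>>();
        var app = new WorkflowApplication(workflow);

        // Completed fires for closed, canceled and faulted (terminated) workflows.
        app.Completed = e =>
        {
            switch (e.CompletionState)
            {
                case ActivityInstanceState.Closed:
                    tcs.TrySetResult(e.Outputs);
                    break;
                case ActivityInstanceState.Canceled:
                    tcs.TrySetCanceled();
                    break;
                default:
                    // Faulted: surface the termination exception through the task.
                    tcs.TrySetException(e.TerminationException
                        ?? new InvalidOperationException("Workflow faulted without an exception."));
                    break;
            }
        };

        // Aborted fires instead of Completed when the host aborts the workflow.
        app.Aborted = e => tcs.TrySetException(e.Reason);

        app.Run(); // returns immediately; the workflow runs on another thread
        return tcs.Task;
    }
}
```

With a helper like this, a caller could write WorkflowTaskSketch.RunAsync(workflow).Result and receive the output arguments, with exceptions arriving wrapped in an AggregateException as they do for any other task.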

Problem: Implementing TAP With Activities

Suppose you want to create an activity that accesses a database and you want to use SqlConnection.OpenAsync(cancellationToken). This new API is great because if the database server is down, you can cancel the Open operation on demand. Your first challenge is how to pass a CancellationToken from your workflow hosting code to the database activity. You could declare it as an InArgument and require workflow authors to pass it to your activity, but that doesn’t seem like the right approach.

So instead, you decide to pass it as an extension, only to find that when you call workflowApp.Extensions.Add(cancellationToken) you get an error stating that extensions must be reference types. No problem: you create a class that allows you to pass the token as an extension to the activity.
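
Such a wrapper can be very small. The sketch below is one way to write it; CancellationTokenExtension is a hypothetical name chosen for this illustration.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: CancellationToken is a struct, so it is carried
// inside a reference type that can be registered as a workflow extension.
public sealed class CancellationTokenExtension
{
    public CancellationTokenExtension(CancellationToken token)
    {
        Token = token;
    }

    // The token supplied by the hosting code.
    public CancellationToken Token { get; private set; }
}
```

Under these assumptions the host would call workflowApp.Extensions.Add(new CancellationTokenExtension(source.Token)), and the activity would read the token back with context.GetExtension<CancellationTokenExtension>().Token.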

Now you’ve got it, right? So you create a unit test to verify that if you try to connect to a server that does not exist and then cancel the token using CancellationTokenSource.Cancel, the activity immediately cancels the OpenAsync call and returns.

At first you think it works but then you notice that your workflow terminates with a TerminationException of AggregateException –> OperationCanceledException.

Is that ok? Should a cancel cause a workflow to terminate?

No, it should not. Workflow already has a model for cancellation of activities. Someone may use your database activity inside of a CancellationScope. If your activity faults when it is really just canceled then the cancellation handler will not be invoked.
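
The distinction between "canceled" and "faulted" is easy to see with a plain task, outside of Workflow. The sketch below uses Task.Delay as a stand-in for OpenAsync. Outcome, CancellationSketch and WaitAndClassify are hypothetical names for this illustration. It shows only how the AggregateException can be inspected, not how an activity should report the result to the workflow runtime.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public enum Outcome { Completed, Canceled, Faulted }

public static class CancellationSketch
{
    // Waits for the task and reports whether it finished, was canceled, or failed.
    public static Outcome WaitAndClassify(Task task)
    {
        try
        {
            task.Wait();
            return Outcome.Completed;
        }
        catch (AggregateException ex)
        {
            // Wait() reports cancellation as AggregateException -> TaskCanceledException,
            // which derives from OperationCanceledException.
            bool onlyCancellation = ex.Flatten().InnerExceptions
                .All(inner => inner is OperationCanceledException);
            return onlyCancellation ? Outcome.Canceled : Outcome.Faulted;
        }
    }

    // Stand-in for a long-running cancelable call such as OpenAsync.
    public static Task SlowOperation(CancellationToken token)
    {
        return Task.Delay(TimeSpan.FromSeconds(30), token);
    }
}
```

An activity that treats every AggregateException as a fault loses exactly this information, which is why the workflow terminates instead of being canceled.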

This is just one of several problems I’ve identified with using tasks inside of activities.

Opportunity: Make Workflow Easier and More Powerful

Any time you change an existing API there are both risks and opportunities. We could try to make it better, simpler and more powerful and actually end up making it worse, more complicated or break existing things. We must be very careful about how we approach this opportunity.

When it comes to workflows and tasks here are my design principles.

  1. Embrace, don’t try to hide the asynchronous nature of Workflow
    • When a caller wants to wait for something to happen with a workflow (complete, idle, load, unload, cancel, etc.) they should receive a task that allows them to wait for that thing.
  2. A caller may create 0..n tasks when using Workflow
    • Users may “Fire and Forget” a workflow (0 tasks)
    • Users may want to create a task to wait for a workflow to become idle in addition to a task to wait for the workflow to complete (2 tasks)
  3. If a caller has requested a task to wait for a specific event (idle, complete, unload, etc.) and that event
    • Did happen, the task will complete.
    • Has not happened yet but still could happen within the lifetime of the AppDomain, the task will remain running.
    • Has not happened yet and will never happen within the lifetime of the AppDomain, the task will be canceled.
  4. If the workflow faults then
    • The main task will be faulted and all others will be canceled.
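
As a rough illustration of principles 3 and 4, the sketch below models two events (idle and complete) with one TaskCompletionSource each. WorkflowEventTasks and its members are hypothetical names. This is not a proposed API, and it creates both tasks eagerly for brevity rather than on request.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical illustration only; nothing here exists in WF.
public sealed class WorkflowEventTasks
{
    private readonly TaskCompletionSource<bool> completed = new TaskCompletionSource<bool>();
    private readonly TaskCompletionSource<bool> idle = new TaskCompletionSource<bool>();

    // The "main" task: completes, or faults if the workflow faults.
    public Task Completed { get { return completed.Task; } }

    // A secondary task for callers who want to wait for the workflow to go idle.
    public Task Idle { get { return idle.Task; } }

    // The event happened: the task completes.
    public void OnIdle()
    {
        idle.TrySetResult(true);
    }

    // The workflow finished: idle can never happen again, so a still-pending
    // idle task is canceled. If idle already happened, TrySetCanceled is a no-op.
    public void OnCompleted()
    {
        completed.TrySetResult(true);
        idle.TrySetCanceled();
    }

    // The workflow faulted: the main task is faulted, all others are canceled.
    public void OnFaulted(Exception error)
    {
        completed.TrySetException(error);
        idle.TrySetCanceled();
    }
}
```

Until one of these methods is called, both tasks simply remain pending, which corresponds to the "still could happen" case.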

Ok, ready for the risk disclaimer?

None of this might happen. Or what eventually happens might be radically different than what I am thinking right now. Blogs live forever so if you come back in 5 years and read this you might laugh… but that is a risk I’m willing to take.

Happy Coding!

Twitter: @ronljacobs

Comments (6)

  1. Josh Reuben says:

    Sounds like a TPL DataFlow Designer – an agent activity scheduler + activities representing various message passing blocks  !

  2. ronjacobs says:

    @Josh – don't get too far ahead of me… at this point I'm only scoping the problems and laying out some general principles for the design.  However… if you have scenarios in mind, I'm delighted to hear about them.

  3. Rory says:

    I was going to use your sample from blogs.msdn.com/…/windows-workflow-foundation-wf4-activities-and-threads.aspx to achieve this. Do you still feel that this is a valid way forward if cancellation is not required?

  4. ronjacobs says:

    @Rory – you can use tasks, just be aware of the issues.  Actually, if all you want to do is easily implement async execution, you can use Action.BeginInvoke / EndInvoke with a delegate.

  5. Dave says:

    I am trying to make the distinction between AppFabric and Workflow 1.0 Beta.  AppFabric doesn't install Service Bus, but AppFabric has a configuration and monitoring interface wrapped up into IIS7.  If I were to recommend one of these services to my clients, which should I recommend?  It is not clear which direction Microsoft is going with these two seemingly competing service offerings.  If I were considering either Workflow 1.0 Beta or AppFabric AND required access controls, will ACS work with both?  All of the services would be hosted on-premise.

  6. ronjacobs says:

    @Dave – You should check out our Team Blog blogs.msdn.com/…/workflowteam

    Jurgen (our boss) has written some posts about where we are investing and how these offerings fit together.