How to hibernate async methods (how to serialize Task)


[update: there’s now a sequel to this post, here, which improves the technique so it can handle async callstacks rather than just a single async method at a time.]

 

Sometimes people ask the question “How can I serialize a Task?” If you try, it throws a SerializationException:

Dim t = TestAsync()
Using stream As New MemoryStream
    Dim formatter As New Formatters.Binary.BinaryFormatter
    formatter.Serialize(stream, t)
    ' SerializationException: Type 'Task' is not marked as serializable.
End Using

Async Function TestAsync() As Task
    Await Task.Delay(10)
    Console.WriteLine("a")
End Function

It’s the wrong question to ask. A Task is something that conceptually can’t be serialized. Consider as an example this task: “Dim t As Task = httpClient.GetStringAsync(url)”. The task object embodies two things: (1) a task is a promise that the operation will be completed at some time in the future, regardless of whether that operation is being performed locally in an async method or thread, or through some packets hurtling through the internet; (2) a task contains a collection of signed-up behaviors, delegates, that will be invoked once the task transitions to the “completed” state. It doesn’t make sense to talk about serializing either of those two things.
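To make those two aspects concrete, here is a minimal sketch that is not part of the original code. It assumes a TaskCompletionSource standing in for the real download; the module name TaskAnatomy and the string "hello" are purely illustrative.

```vb
Imports System
Imports System.Threading.Tasks

Module TaskAnatomy
    Sub Main()
        ' Stand-in for an operation that finishes "at some time in the future".
        Dim tcs As New TaskCompletionSource(Of String)

        ' (1) The task is only a promise; it holds no description of the work itself.
        Dim t As Task(Of String) = tcs.Task

        ' (2) A signed-up behavior: a delegate to invoke once the task completes.
        Dim c = t.ContinueWith(Sub(prev) Console.WriteLine("got " & prev.Result))

        ' Whoever owns the operation fulfils the promise...
        tcs.SetResult("hello")

        ' ...and only then does the signed-up delegate run.
        c.Wait()
    End Sub
End Module
```

Neither half is plain data: the promise depends on whoever holds the TaskCompletionSource, and the continuation is a delegate pointing at live code.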

The right question is: “How can I serialize the current state of my progress through this async method?” In this article I’ll show how.

Caveat

Personally, I don’t yet believe that this is a good idea. I’m posting this blog article in case I’m wrong, and in case anyone finds a good design pattern for it.

In the late 1990s I did my PhD on the topic of message-passing as a way to write concurrent programs. It was a bad idea: “await” is better than message-passing. At the time most of my peers were doing PhDs on mobile agents which, I think, are an even worse idea. Mobile agents are pieces of code which can serialize their state to disk, be transmitted over the internet, be deserialized on a remote machine, and resume their execution there. BizTalk/XLANG also has the notion of long-running interactive processes which can be suspended to disk and resumed later.

I don’t think it’s a good idea to serialize processes/functions/behaviors/agents. That’s because they’re intangible. I think we should figure out the plain-old-data embodiment of the current state of our work, and serialize that data to disk, and deserialize it again all as a passive data-structure rather than an active agent. That way, if someone asks “What is currently in flight?” then in answer we can point them to the data structures.

So please only follow this article if you have a real vision of where it will be practically useful and clean.

Using a serializable async method

Here’s my serializable async method “TestAsync”.

Async Function TestAsync(min As Integer, max As Integer) As Task
    For i = min To max
        Console.WriteLine(i)
        Await Task.Delay(100)
        If i = 5 Then Await Hibernator.Hibernate("a.hib")
    Next
End Function

One of my programs is going to kick off the async method. But half way through, the async method will hibernate itself to disk and terminate abruptly with a “Serialized-to-disk” exception.

Console.WriteLine("INITIATING…")
Try
    Await TestAsync(1, 10)
Catch ex As OperationCanceledException
    Console.WriteLine("EX " & ex.Message)
End Try

INITIATING…
1
2
3
4
5
EX Serialized to a.hib

My other program might be running on a different machine, or maybe just a long time later, or maybe it just needed the async method to be snapshotted to disk for resilience against errors. It’s going to resume that same async method, mid-flight, from disk.

Console.WriteLine("RESUMING…")
Dim t As Task = Hibernator.Resume("a.hib")
Await t
Console.WriteLine("done")

RESUMING…
6
7
8
9
10
done

Q. What makes the async method serializable? A. All its parameters and local variables must themselves be serializable.

Q. Why design it so that hibernation throws an OperationCanceledException from the original method? A. Imagine you stepped into a Star Trek transporter to beam down to the planet’s surface. Would you want the original copy of you in the transporter bay to go on living, while the new copy on the planet’s surface goes on living as well? So there are now two of you? You’d have to be very careful of side-effects, in case the two copies tried to do the same thing at the same time. Throwing from the original method means that only the resumed copy carries on.

Q. Why design it so the async method itself is responsible for hibernating, rather than its caller? A. I think that hibernation is serious business, akin to thread-termination or task-cancellation, which should only be done cooperatively when the async method is ready for it. I’d probably want to extend the above code to use HibernationTokenSource / HibernationToken, by analogy to CancellationTokenSource / CancellationToken.

Implementing serialization of async methods

Here’s my implementation of the Hibernate and Resume functionality. Code first; discussion afterwards.

Imports System
Imports System.IO
Imports System.Reflection
Imports System.Runtime.CompilerServices
Imports System.Runtime.Serialization
Imports System.Threading.Tasks

<Serializable>
Public Class Hibernator : Implements INotifyCompletion, ISerializable
    Private m_ex As Runtime.ExceptionServices.ExceptionDispatchInfo
    Private m_fn As String

    '===================================================================================

    Public Shared Function Hibernate(fn As String) As Hibernator
        ' Intended use: "Await Hibernator.Hibernate(filename)"
        Return New Hibernator With {.m_fn = fn}
    End Function

    Private Sub New()
        ' Private constructor used only by the Hibernate() shared method
    End Sub

    Function GetAwaiter() As Hibernator
        ' The Hibernator is both the awaitable and its own awaiter
        Return Me
    End Function

    Public ReadOnly Property IsCompleted As Boolean
        Get
            Return False ' so that OnCompleted gets called
        End Get
    End Property

    Public Sub OnCompleted(continuation As Action) Implements INotifyCompletion.OnCompleted
        Dim ex0 As Exception = Nothing
        Try
            ' Dig the compiler-generated state machine out of the continuation delegate
            Dim sm = continuation.Target.GetType().GetField("m_stateMachine",
                      BindingFlags.NonPublic Or BindingFlags.Instance).GetValue(continuation.Target)
            Using stream As New FileStream(m_fn, FileMode.Create, FileAccess.Write)
                Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
                ' Header: enough information to re-create the state machine type on resume
                formatter.Serialize(stream, sm.GetType().Assembly.FullName)
                formatter.Serialize(stream, sm.GetType().FullName)
                ' Body: a sequence of (field name, field value) pairs
                For Each field In sm.GetType().GetFields(BindingFlags.Public Or
                        BindingFlags.NonPublic Or BindingFlags.Instance)
                    If field.Name = "$Builder" OrElse field.Name = "<>t__builder" Then Continue For
                    Dim fieldValue = field.GetValue(sm)
                    If fieldValue Is Nothing Then Continue For
                    ' Skip every awaiter except this Hibernator itself
                    If field.Name.Contains("$awaiter") AndAlso
                       Not Object.ReferenceEquals(fieldValue, Me) Then Continue For
                    formatter.Serialize(stream, field.Name)
                    formatter.Serialize(stream, fieldValue)
                Next
            End Using
            ex0 = New OperationCanceledException("Serialized to " & m_fn)
        Catch ex1 As Exception
            ' Don't leave a half-written file behind
            If File.Exists(m_fn) Then IO.File.Delete(m_fn)
            ex0 = ex1
        End Try
        m_ex = Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(ex0)
        ' That's so we can save+rethrow the exception with its intended callstack

        continuation()
    End Sub

    Public Sub GetResult()
        If m_ex IsNot Nothing Then m_ex.Throw()
    End Sub

    '===================================================================================

    Public Shared Function [Resume](fn As String) As Task
        Using stream As New IO.FileStream(fn, FileMode.Open, FileAccess.Read)
            Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
            Dim assemblyName = CStr(formatter.Deserialize(stream))
            Dim typeName = CStr(formatter.Deserialize(stream))
            Dim sm = TryCast(System.Activator.CreateInstance(assemblyName, typeName).Unwrap(),
                             IAsyncStateMachine)

            ' Restore each saved field onto the fresh state machine
            While stream.Position < stream.Length
                Dim fieldName = CStr(formatter.Deserialize(stream))
                Dim fieldValue = formatter.Deserialize(stream)
                Dim field = sm.GetType().GetField(fieldName, BindingFlags.Public Or
                                                  BindingFlags.NonPublic Or BindingFlags.Instance)
                field.SetValue(sm, fieldValue)
            End While

            ' The builder was not serialized, so give the state machine a new one
            Dim builder = AsyncTaskMethodBuilder.Create()
            Dim builderField = sm.GetType().GetField("$Builder", BindingFlags.Public Or
                                                    BindingFlags.NonPublic Or BindingFlags.Instance)
            If builderField Is Nothing Then
                builderField = sm.GetType().GetField("<>t__builder", BindingFlags.Public Or
                                                     BindingFlags.NonPublic Or BindingFlags.Instance)
            End If
            builderField.SetValue(sm, builder)
            builder.Start(sm) ' This invokes MoveNext()
            Dim builder2 = CType(builderField.GetValue(sm), AsyncTaskMethodBuilder)
            Return builder2.Task
        End Using
    End Function

    '===================================================================================

    Private Sub New(info As SerializationInfo, ctx As StreamingContext)
        ' Private constructor used only by the deserializer
    End Sub

    Public Sub GetObjectData(info As SerializationInfo,
                             context As StreamingContext) Implements ISerializable.GetObjectData
        ' we have nothing to serialize
    End Sub

End Class

NOTE 1: What fields need be serialized? Well, all of them, except for…

  • The builder field won’t be serialized. It must be reconstructed. In VB it’s called $Builder. In C# it’s called <>t__builder. Note that both names are unutterable in user code.
  • Most awaiters don’t need to be serialized; only the current one is needed. The VB compiler calls its awaiters $awaiterXYZ. The C# compiler calls its awaiters <>XYZ$awaiterABC. Again, both names are unutterable in user code. Both compilers create several awaiters: one awaiter of type ‘object’ for all reference-type awaitables, and one awaiter for each type of value-type.
  • Null fields don’t need to be serialized, and indeed can’t.
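To see which fields your own compiler actually generates, a small reflection sketch like the following may help. It is not part of the Hibernator code; it assumes the compiler marks the async method with AsyncStateMachineAttribute, and the module name StateMachineFields is made up for illustration. The field names it prints vary between compilers and compiler versions, which is exactly the fragility this article warns about.

```vb
Imports System
Imports System.Reflection
Imports System.Runtime.CompilerServices
Imports System.Threading.Tasks

Module StateMachineFields
    ' Same shape as the article's TestAsync, without the hibernation call.
    Async Function TestAsync(min As Integer, max As Integer) As Task
        For i = min To max
            Await Task.Delay(100)
        Next
    End Function

    Sub Main()
        Dim method = GetType(StateMachineFields).GetMethod("TestAsync")

        ' Assumption: the compiler attached AsyncStateMachineAttribute to the method.
        Dim attr = method.GetCustomAttribute(Of AsyncStateMachineAttribute)()
        If attr Is Nothing Then
            Console.WriteLine("No AsyncStateMachineAttribute found on TestAsync")
            Return
        End If

        ' List every instance field of the generated state machine:
        ' lifted parameters and locals, the state, the builder and the awaiters.
        For Each field In attr.StateMachineType.GetFields(BindingFlags.Public Or
                BindingFlags.NonPublic Or BindingFlags.Instance)
            Console.WriteLine(field.Name & " : " & field.FieldType.Name)
        Next
    End Sub
End Module
```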

NOTE 2: Normally it’s wrong for OnCompleted to invoke continuation() directly, because you end up with MoveNext invoking OnCompleted invoking MoveNext, which if it went too far would blow the stack. We should really be posting a delegate to the current synchronization context. But here the intended usage is “Await Hibernator.Hibernate(filename)” which will terminate the async method with an uncaught exception, so there’s no danger.
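For comparison, here is a rough sketch of what “posting a delegate to the current synchronization context” could look like in an awaiter that does not terminate its caller. PostContinuation is a hypothetical helper, not something the Hibernator class uses; falling back to the thread pool when there is no context is my assumption about reasonable behavior.

```vb
Imports System
Imports System.Threading

Module ContinuationPosting
    ' Hypothetical helper: schedules the continuation instead of calling it inline,
    ' so MoveNext -> OnCompleted -> MoveNext never nests on the same stack.
    Sub PostContinuation(continuation As Action)
        Dim ctx = SynchronizationContext.Current
        If ctx IsNot Nothing Then
            ' e.g. a UI thread: queue the continuation back onto that context
            ctx.Post(Sub(state) continuation(), Nothing)
        Else
            ' no context (e.g. a console app): assumption, fall back to the thread pool
            ThreadPool.QueueUserWorkItem(Sub(state) continuation())
        End If
    End Sub

    Sub Main()
        Dim done As New ManualResetEventSlim(False)
        PostContinuation(Sub()
                             Console.WriteLine("continuation ran")
                             done.Set()
                         End Sub)
        done.Wait() ' keep the process alive until the posted delegate has run
    End Sub
End Module
```

An awaiter’s OnCompleted would call such a helper in place of invoking continuation() directly.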

NOTE 3: In this design of hibernator, GetResult is called in two situations:

  1. when the user did “Await Hibernator.Hibernate()”, and our OnCompleted method saved an exception into m_ex, and we always wish to throw that exception
  2. when the user resumed the async method from hibernation, and the Hibernator was deserialized with all fields null, and we don’t want to throw any exception.

NOTE 4: Why does the [Resume] method retrieve the “builder” field a second time? Well, Async state machines are structures that get boxed at their first await. If they get boxed, then all structure-fields within them get boxed as well. Therefore “sm.builder” after the first MoveNext might not be the same “builder” that we created before.
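The underlying language rule is that structures are copied, not shared, when they are boxed or stored into a field. This standalone sketch shows just that rule; the Counter structure is a made-up stand-in for the builder and has nothing to do with the real AsyncTaskMethodBuilder.

```vb
Imports System

Module BoxingDemo
    ' Made-up stand-in for a structure-typed builder
    Structure Counter
        Public Value As Integer
    End Structure

    Sub Main()
        Dim original As New Counter With {.Value = 1}

        ' Boxing copies the structure; "boxed" is now independent of "original".
        Dim boxed As Object = original

        ' Changing the local does not affect the boxed copy.
        original.Value = 2

        Console.WriteLine(DirectCast(boxed, Counter).Value) ' prints 1
    End Sub
End Module
```

By the same rule, the local “builder” in [Resume] and the copy stored inside the state machine are separate values, so the Task has to be read from the copy that the state machine actually uses.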

Conclusions

This is a pretty sketchy implementation. It does just enough to make the sample code work. It doesn’t work with Async Subs or Async Task(Of T) Functions.

It also isn’t composable: you can’t write a wrapper method around a call to Hibernate(). That’s because it only serializes its immediate caller.

This code is tightly coupled to the .NET4.5 compilers’ internal implementations of async methods. It relies on the fact that “Await Hibernate()” will invoke Hibernate’s OnCompleted method with a delegate whose Target object has an m_stateMachine field that points to the state machine. It relies on this state-machine having lifted local variables, and a builder of a recognizable name, and recognizable awaiter fields.

We didn’t give these things a clean public API, because we didn’t have any good use-cases for how or why people would want to serialize their async methods. If you folks come up with compelling mainstream scenarios, then we might consider maybe a Microsoft-authored NuGet package which does serialization, or exposing the necessary fields via a clean public API.

Comments (10)

  1. Jon Skeet says:

    Humbug! This is one of the "evil code" examples I'm planning to give at CodeMash… will have to compare my code with yours…

  2. First, thanks for the demo code. However, using Reflection to serialize the compile-generated members is just what we are hoping to avoid. We're developing a library that is used in many projects, and we've been burned with framework-version specific reflection tricks before. Here are some problems:

    – We don't know if it will be possible to achieve this in a future version. Who wants to build an application on an architecture that might become obsolete due to some minor change in .NET?

    – If MS publishes a beta version or CTP of a new framework/compiler release, we have to ignore all other priorities and try to find a way to make our code work with it. If we find a problem too late, it won't get fixed. If we find it in time, there's still no guarantee.

    – Until we have a tested and production-ready version in our library, people won't be able to upgrade.

    – This version would have to be composable and support at least Task and Task<T>, handle problems gracefully and be completely unit-tested, which amounts to some serious effort.

    – Sometimes we have to support several framework versions in parallel, which is awfully hard to support (just think continuous integration)

    – Adding support for several compiler versions would complicate matters further.

    An officially supported NuGet package would solve most of the trouble, especially if MS guarantees to support it in future versions and actually does consider it when the compilers are changed. We would very much like it to be a single package that supports all versions, not one assembly per framework/compiler version. So yes, please, do make such a package!

    But let's take a step back. First, you say that serializing Tasks is the wrong question. We agree, we only want to serialize execution state. That's why our original request is to untie the CPS capabilities of C# and VB from Tasks. A simple continuation would be easy to serialize, just like an iterator. However, we have to deal with this graph of objects that is created around the Task class. We just need to support a specific scenario though, and your code is a nice hint through that labyrinth. Thanks for that!

    I described our scenario in your previous blog post. So rather than asking for a compelling scenario, I'd like you to tell us what's not so compelling with ours. I'll start with a short defense and a few details. You write that really long-running stuff should not use these mechanisms, but rather build explicit data structures. We totally agree. We have actually built a workflow engine that controls activity flow for humans, and it would never have occurred to us to use such a fragile mechanism. However, user interactions over the HTTP request/response paradigm are no more long-running than normal procedure calls on a fat client. We only have to unwind the call stack for technical reasons. Similarly, the need for serialization is only a technical one, for installations that choose to use state servers for load balancing. None of this changes the conceptually transient nature of these things.

  3. (cont'd)

    Just like there is no need to analyze the current execution state of every thread on a fat client through friendly data structures, there is no need to do that on a web server. For analytics, I know what requests are being processed, and what sessions are alive, and that's all I need. There is no need to maintain this state over application updates, share it with other users etc.

    On the plus side, nested procedural user interaction flows are much easier to write using async methods than explicit structures. We know that for sure, because explicit structures are what we're doing now, and it's awful. By providing a simple, CPS-based procedural way to write request-spanning logic, we can better modularize our applications, and we can even separate the general flow from UI technologies. I.e. the code that defines a use case in the business logic layer will eventually branch off to some Web form according to some mapping. (It might even dynamically choose a Web form or a Windows dialog, depending on the environment it runs in. Not something we're doing, and often not desirable, but definitely a powerful option.)

    Just for the sake of completeness, our library is at http://www.re-motion.org, the part I'm discussing now is here: svn.re-motion.org/…/ExecutionEngine

    Here's how we used to define procedural code with explicit data structures:

    class MyFunction : WxeFunction
    {
      MyFunction (string param1, int param2, …) : base (param1, param2, …) {}
      void Step1 () { /*a*/ }
      class Step2 : WxeTryCatch
      {
        class Try : WxeStepList
        {
          class Step1 : WxeIf
          {
            bool If () { return /*b*/ }
            class Then : WxeStepList
            {
              void Step1 () {/*c*/}
              WxePageStep Step2 = new WxePageStep ("somepage.aspx");
            }
          }
          void Step2 () {/*d*/}
        }
        [WxeException (typeof (InvalidOperationException))]
        class Catch1 : WxeCatchBlock { /*e*/ }
      }
    }

    That's just a vague recollection, it's actually even more complex if you use parameters and return values. It's powerful, but so obscure that we eventually made it obsolete. We're using some fragile workarounds now to achieve something similar.

    With async, it'd be just what it should be:

    async MyFunction (string param1, int param2)
    {
      /*a*/
      try
      {
        if (/*b*/)
        {
          /*c*/
          ShowPage ("somepage.aspx");
        }
        /*d*/
      }
      catch (InvalidOperationException ex) {/*e*/}
    }

    … and whatever you call in a, b, c etc. using await would either return immediately or show a web page and resume execution at the next request. For complex business apps with nested forms and multi-step use cases, this way of writing applications can be a huge time saver (if used wisely).

  4. @Stefan, thanks for the detailed explanation.

    As I understand, you're describing a revolutionary new way of writing the "choreography" of web sites. By "choreography" I mean the glue logic that describes the sequence of things that happen, which ones are conditional, which ones get done in a loop, how data flows between them &c. Your approach is to re-use the language constructs (if, semicolon, while, variables &c.) to describe this choreography.

    I have been working on a similar approach for the choreography of client applications. My first prototype http://www.wischik.com/…/PaddleGame.html used async choreography for the sequence of behaviors in a simple interactive game. I'm currently working on async choreography for which pages get displayed in a Windows Phone app.

    So I personally am sold on the design pattern! Obviously my client scenario doesn't need serialization; that seems specifically a need for stateless webservers. I can't think of other needs for it.

    Practically, what are the next steps? First, we need to solve the composition problem. I'll spend a while thinking about this.

    Second, people need to be shown this new choreography design pattern, and shown how it will solve their problems more easily.

    Third, the package you describe has to be built. It'll be a heavy undertaking for the reasons you list. It has to be built either within Microsoft or as a third-party project. The engineering effort will be about the same in both cases. Work always seems easier if someone else has to do it 🙂 I'm not sure that MS is the right place to develop these new ideas, but I'll chat with someone on the ASP.NET team to see what they think.

  5. Hi Lucian, it's good to hear that you can relate to our idea. We're currently trying to work out a few convincing scenarios. If you email me at stefan dot wenig at rubicon dot eu, I'll be able to send them straight to you. Or if you prefer, we can discuss this at our mailing list at groups.google.com/…/re-motion-dev.

    Thanks!

  6. Pierre Arnaud says:

    Hi Stefan and Lucian, I found your post and discussion fascinating. I was idly searching the web for some ideas to hibernate an async/await state machine, and there you have come up with some interesting ideas.

    I am currently doing some exploratory work in order to implement the CQRS pattern, in which my micro-services will consume commands in awaitable methods. What I'd love to create (that's still just a dream for now), is some way to suspend a long running micro-service (for instance if some internal command was just posted and the service needs to wait for it to be processed before continuing). I'd like to be able to completely shut down a service, unload its assembly, reload a newer version of the same service and restart it where it left off.

    I understand that tweaking the internals of the compiler generated state machine is brittle, and that it would be even more so if the code happened to get upgraded between hibernation and dehibernation. Nevertheless, it would be really cool 🙂

  7. Chui Tey says:

    Hi Lucian, Stefan, and Pierre, I've been searching on this topic as well. It may be possible on a limited basis to use Roslyn to modify subroutines so that local variables are stored in a propertybag which can be serialized.

    This leaves only the await part. When await is called, this can trigger hibernation/serialization/reification of the current thread's propertybag. This way, the thread can be put to sleep for days or months until an external event rehydrates the process.

  8. Reg Hammond says:

    OpenVMS could do this hibernate stuff with ease … in its 'sleep' – literally!!!

  9. felek says:

    Good code 🙂

    Just wish it were official.

    I created a simple workflow system in which workflows can be hibernated.

    Everything is based on C# code.

    I'm currently testing the proposed code, and the potential is huge.

    We can stop the execution of a method (in the case when we need input from the user), and then we can wait a very long time for the data.

    Once we have received the data, we resume the method.

    Something like MS Workflow long-running workflows (only in code, without designers and other strange mechanisms that probably only the designers of WF understand).

    Regards

  10. Just a note to anyone still following this old blogpost: I’ve updated the technique to handle arbitrary async callstacks (not just a single async method). New blogpost here: https://blogs.msdn.microsoft.com/lucian/2016/04/20/async-workflow-2/