How to hibernate async methods (how to serialize Task)

11/23/2012

[update: there's now a sequel to this post, here, which improves the technique so it can handle async callstacks rather than just a single async method at a time.]

Sometimes people ask the question “How can I serialize a Task?” If you try, it throws a SerializationException:

Dim t = TestAsync()
Using stream As New MemoryStream
    Dim formatter As New Formatters.Binary.BinaryFormatter
    formatter.Serialize(stream, t)
    ' SerializationException: Type 'Task' is not marked as serializable.
End Using

Async Function TestAsync() As Task
    Await Task.Delay(10)
    Console.WriteLine("a")
End Function

It’s the wrong question to ask. A Task is something that conceptually can’t be serialized. Consider, as an example, this task: “Dim t As Task = httpClient.GetStringAsync(url)”. The task object embodies two things: (1) a task is a promise that the operation will be completed at some time in the future, regardless of whether that operation is being performed locally in an async method or thread, or through some packets hurtling through the internet; (2) a task contains a collection of signed-up behaviors, delegates, that will be invoked once the task transitions to the “completed” state. It doesn’t make sense to talk about serializing either of those two things.
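To make those two aspects concrete, here is a minimal sketch. The names TaskAnatomy and Demo are made up for illustration, and a TaskCompletionSource stands in for "some operation that finishes later"; it is not how any particular library produces its tasks.

```vb
Imports System
Imports System.Threading.Tasks

Module TaskAnatomy
    ' Hypothetical demo: returns what the continuation observed.
    Function Demo() As String
        ' (1) The promise: nothing is running here. The task merely
        '     promises that a result will show up at some point.
        Dim tcs As New TaskCompletionSource(Of String)
        Dim t As Task(Of String) = tcs.Task

        ' (2) A signed-up behavior: a delegate that is invoked once
        '     the task transitions to the completed state.
        Dim observed As Task(Of String) =
            t.ContinueWith(Function(done As Task(Of String)) "got " & done.Result)

        ' Fulfil the promise; the continuation runs afterwards.
        tcs.SetResult("hello")
        Return observed.Result
    End Function

    Sub Main()
        Console.WriteLine(Demo())
    End Sub
End Module
```

Neither the pending promise nor the delegate list is plain data, which is why a formatter has nothing sensible to write out for them.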

The right question is: “How can I serialize the current state of my progress through this async method?” In this article I’ll show how.

Caveat

Personally, I don’t yet believe that this is a good idea. I’m posting this blog article in case I’m wrong, and in case anyone finds a good design pattern for it.

In the late 1990s I did my PhD on the topic of message-passing as a way to write concurrent programs. It was a bad idea: “await” is better than message-passing. At the time most of my peers were doing PhDs on mobile agents which, I think, are an even worse idea. Mobile agents are pieces of code which can serialize their state to disk, be transmitted over the internet, be deserialized on a remote machine, and resume their execution there. BizTalk/XLANG also has the notion of long-running interactive processes which can be suspended to disk and resumed later.

I don’t think it’s a good idea to serialize processes/functions/behaviors/agents. That’s because they’re intangible. I think we should figure out the plain-old-data embodiment of the current state of our work, and serialize that data to disk, and deserialize it again all as a passive data-structure rather than an active agent. That way, if someone asks “What is currently in flight?” then in answer we can point them to the data structures.

So please only follow this article if you have a real vision of where it will be practically useful and clean.

Using a serializable async method

Here’s my serializable async method “TestAsync”.

Async Function TestAsync(min As Integer, max As Integer) As Task
    For i = min To max
        Console.WriteLine(i)
        Await Task.Delay(100)
        If i = 5 Then Await Hibernator.Hibernate("a.hib")
    Next
End Function

One of my programs is going to kick off the async method. But half way through, the async method will hibernate itself to disk and terminate abruptly with a “Serialized-to-disk” exception.

Console.WriteLine("INITIATING...")
Try
    Await TestAsync(1, 10)
Catch ex As OperationCanceledException
    Console.WriteLine("EX " & ex.Message)
End Try

INITIATING...
1
2
3
4
5
EX Serialized to a.hib

My other program might be running on a different machine, or maybe just a long time later, or maybe it just needed the async method to be snapshotted to disk for resilience against errors. It’s going to resume that same async method, mid-flight, from disk.

Console.WriteLine("RESUMING...")
Dim t As Task = Hibernator.Resume("a.hib")
Await t
Console.WriteLine("done")

RESUMING...
6
7
8
9
10
done

Q. What makes the async method serializable? A. All its parameters and local variables must themselves be serializable.

Q. Why design it so that hibernation throws an exception (here an OperationCanceledException) from the original method? A. Imagine you stepped into a Star Trek transporter to beam down to the planet’s surface. Would you want the original copy of you in the transporter bay to go on living, while the new copy on the planet’s surface goes on living as well? So there are now two of you? You’d have to be very careful of side-effects, in case the two copies tried to do the same thing at the same time. Throwing the exception terminates the original copy, so that only the resumed copy carries on.

Q. Why design it so the async method itself is responsible for hibernating, rather than its caller? A. I think that hibernation is serious business, akin to thread-termination or task-cancellation, which should only be done cooperatively when the async method is ready for it. I’d probably want to extend the above code to use HibernationTokenSource / HibernationToken, by analogy to CancellationTokenSource / CancellationToken.

Implementing serialization of async methods

Here’s my implementation of the Hibernate and Resume functionality. Code first; discussion afterwards.

Imports System
Imports System.IO
Imports System.Reflection
Imports System.Runtime.CompilerServices
Imports System.Runtime.Serialization
Imports System.Threading.Tasks

<Serializable>
Public Class Hibernator : Implements INotifyCompletion, ISerializable
    Private m_ex As Runtime.ExceptionServices.ExceptionDispatchInfo
    Private m_fn As String

    '===================================================================================
    Public Shared Function Hibernate(fn As String) As Hibernator
        ' Intended use: "Await Hibernator.Hibernate(filename)"
        Return New Hibernator With {.m_fn = fn}
    End Function

    Private Sub New()
        ' Private constructor used only by the Hibernate() shared method
    End Sub

    Function GetAwaiter() As Hibernator
        Return Me
    End Function

    Public ReadOnly Property IsCompleted As Boolean
        Get
            Return False ' so that OnCompleted gets called
        End Get
    End Property

    Public Sub OnCompleted(continuation As Action) Implements INotifyCompletion.OnCompleted
        Dim ex0 As Exception = Nothing
        Try
            ' The continuation's Target holds the (boxed) state machine in a private field
            Dim sm = continuation.Target.GetType().GetField("m_stateMachine",
                         BindingFlags.NonPublic Or BindingFlags.Instance).GetValue(continuation.Target)
            Using stream As New FileStream(m_fn, FileMode.Create, FileAccess.Write)
                Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
                formatter.Serialize(stream, sm.GetType().Assembly.FullName)
                formatter.Serialize(stream, sm.GetType().FullName)
                For Each field In sm.GetType().GetFields(BindingFlags.Public Or
                                       BindingFlags.NonPublic Or BindingFlags.Instance)
                    ' Skip the builder, null fields, and every awaiter except this one (see NOTE 1)
                    If field.Name = "$Builder" OrElse field.Name = "<>t__builder" Then Continue For
                    Dim fieldValue = field.GetValue(sm)
                    If fieldValue Is Nothing Then Continue For
                    If field.Name.Contains("$awaiter") AndAlso
                        Not Object.ReferenceEquals(fieldValue, Me) Then Continue For
                    formatter.Serialize(stream, field.Name)
                    formatter.Serialize(stream, fieldValue)
                Next
            End Using
            ex0 = New OperationCanceledException("Serialized to " & m_fn)
        Catch ex1 As Exception
            If File.Exists(m_fn) Then File.Delete(m_fn)
            ex0 = ex1
        End Try
        m_ex = Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(ex0)
        ' That's so we can save+rethrow the exception with its intended callstack

        continuation()
    End Sub

    Public Sub GetResult()
        If m_ex IsNot Nothing Then m_ex.Throw()
    End Sub

    '===================================================================================
    Public Shared Function [Resume](fn As String) As Task
        Using stream As New FileStream(fn, FileMode.Open, FileAccess.Read)
            Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
            Dim assemblyName = CStr(formatter.Deserialize(stream))
            Dim typeName = CStr(formatter.Deserialize(stream))
            Dim sm = TryCast(System.Activator.CreateInstance(assemblyName, typeName).Unwrap(),
                             IAsyncStateMachine)
            ' Restore each saved field (name, value) onto the fresh state machine
            While stream.Position < stream.Length
                Dim fieldName = CStr(formatter.Deserialize(stream))
                Dim fieldValue = formatter.Deserialize(stream)
                Dim field = sm.GetType().GetField(fieldName, BindingFlags.Public Or
                                BindingFlags.NonPublic Or BindingFlags.Instance)
                field.SetValue(sm, fieldValue)
            End While
            ' The builder was not serialized, so reconstruct it (VB name first, then C# name)
            Dim builder = AsyncTaskMethodBuilder.Create()
            Dim builderField = sm.GetType().GetField("$Builder", BindingFlags.Public Or
                                   BindingFlags.NonPublic Or BindingFlags.Instance)
            If builderField Is Nothing Then builderField =
                sm.GetType().GetField("<>t__builder", BindingFlags.Public Or
                    BindingFlags.NonPublic Or BindingFlags.Instance)
            builderField.SetValue(sm, builder)
            builder.Start(sm) ' This invokes MoveNext()
            ' Re-read the builder from the state machine (see NOTE 4)
            Dim builder2 = CType(builderField.GetValue(sm), AsyncTaskMethodBuilder)
            Return builder2.Task
        End Using
    End Function

    '===================================================================================
    Private Sub New(info As SerializationInfo, ctx As StreamingContext)
        ' Private constructor used only by the deserializer
    End Sub

    Public Sub GetObjectData(info As SerializationInfo,
                             context As StreamingContext) Implements ISerializable.GetObjectData
        ' we have nothing to serialize
    End Sub
End Class

NOTE 1: What fields need be serialized? Well, all of them, except for...

The builder field won't be serialized. It must be reconstructed. In VB it's called $Builder. In C# it's called <>t__builder. Note that both names are unutterable in user code.
Most awaiters don't need to be serialized; only the current one is needed. The VB compiler calls its awaiters $awaiterXYZ. The C# compiler calls its awaiters <>XYZ$awaiterABC. Again, both names are unutterable in user code. Both compilers create several awaiter fields: one of type 'object' shared by all reference-type awaiters, and one for each value-type awaiter type.
Null fields don't need to be serialized, and indeed can't.
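If you want to see which fields your own compiler generates, a small reflection sketch like the following can list them. StateMachineFields, SampleAsync and ListFields are hypothetical names for this illustration, and the sketch assumes the compiler marks the async method with AsyncStateMachineAttribute; the field names it prints depend on the compiler version, so no particular output is promised.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Reflection
Imports System.Runtime.CompilerServices
Imports System.Threading.Tasks

Module StateMachineFields
    ' A throwaway async method whose state machine we inspect.
    Async Function SampleAsync(min As Integer, max As Integer) As Task
        For i = min To max
            Await Task.Delay(1)
        Next
    End Function

    ' Returns "name : type" for each field of SampleAsync's state machine.
    Function ListFields() As List(Of String)
        Dim result As New List(Of String)
        Dim method = GetType(StateMachineFields).GetMethod("SampleAsync",
            BindingFlags.Public Or BindingFlags.NonPublic Or BindingFlags.Static)
        ' Assumption: the compiler emitted this attribute on the async method.
        Dim attr = method.GetCustomAttribute(Of AsyncStateMachineAttribute)()
        If attr Is Nothing Then Return result
        For Each field In attr.StateMachineType.GetFields(
                BindingFlags.Public Or BindingFlags.NonPublic Or BindingFlags.Instance)
            result.Add(field.Name & " : " & field.FieldType.Name)
        Next
        Return result
    End Function

    Sub Main()
        For Each line In ListFields()
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

In the listing you would expect to recognize the lifted parameters and locals, a state field, the builder, and one or more awaiter fields.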

NOTE 2: Normally it's wrong for OnCompleted to invoke continuation() directly, because you end up with MoveNext invoking OnCompleted invoking MoveNext, which if it went too far would blow the stack. We should really be posting a delegate to the current synchronization context. But here the intended usage is "Await Hibernator.Hibernate(filename)" which will terminate the async method with an uncaught exception, so there's no danger.
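For comparison, here is a rough sketch of the "well-behaved" alternative that NOTE 2 alludes to: an awaiter whose OnCompleted posts the continuation instead of calling it inline. PostingAwaiter and PostingDemo are hypothetical names, not part of the Hibernator above, and falling back to the thread pool when there is no synchronization context is my assumption about what a reasonable default would be.

```vb
Imports System
Imports System.Runtime.CompilerServices
Imports System.Threading
Imports System.Threading.Tasks

' Hypothetical awaitable that never completes synchronously and
' always resumes its caller via a posted callback.
Public Class PostingAwaiter : Implements INotifyCompletion
    Public Function GetAwaiter() As PostingAwaiter
        Return Me
    End Function

    Public ReadOnly Property IsCompleted As Boolean
        Get
            Return False ' so that OnCompleted gets called
        End Get
    End Property

    Public Sub OnCompleted(continuation As Action) Implements INotifyCompletion.OnCompleted
        Dim ctx = SynchronizationContext.Current
        If ctx IsNot Nothing Then
            ' Post to the current context: MoveNext runs on a fresh stack frame.
            ctx.Post(Sub(state) continuation(), Nothing)
        Else
            ' Assumption: with no context, queue to the thread pool instead.
            ThreadPool.QueueUserWorkItem(Sub(state) continuation())
        End If
    End Sub

    Public Sub GetResult()
    End Sub
End Class

Public Module PostingDemo
    ' Awaits the posting awaiter n times; the stack does not grow with n.
    Public Async Function CountAsync(n As Integer) As Task(Of Integer)
        Dim total = 0
        For i = 1 To n
            Await New PostingAwaiter()
            total += i
        Next
        Return total
    End Function
End Module
```

Because each resumption starts from a posted callback, a long loop of awaits never nests MoveNext inside OnCompleted inside MoveNext.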

NOTE 3: In this design of hibernator, GetResult will be called on two occasions:

when the user did "Await Hibernator.Hibernate()", and our OnCompleted method saved an exception into m_ex, and we always wish to throw that exception
when the user resumed the async method from hibernation, and the Hibernator was deserialized with all fields null, and we don't want to throw any exception.

NOTE 4: Why does the [Resume] method retrieve the "builder" field a second time? Well, Async state machines are structures that get boxed at their first await. If they get boxed, then all structure-fields within them get boxed as well. Therefore "sm.builder" after the first MoveNext might not be the same "builder" that we created before.
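The underlying effect is ordinary value-type copy semantics, which this small sketch illustrates. FakeBuilder, Holder and BoxedCopies are made-up stand-ins for AsyncTaskMethodBuilder and the state machine; the real types are more involved.

```vb
Imports System

Module BoxedCopies
    ' Stand-in for a builder structure.
    Structure FakeBuilder
        Public TaskId As Integer
    End Structure

    ' Stand-in for a state machine that owns a builder field.
    Class Holder
        Public BuilderField As FakeBuilder
    End Class

    Function Demo() As String
        Dim local As New FakeBuilder
        Dim sm As New Holder
        sm.BuilderField = local      ' stores a COPY of the structure
        sm.BuilderField.TaskId = 42  ' later mutation touches only that copy
        Return local.TaskId & " vs " & sm.BuilderField.TaskId
    End Function

    Sub Main()
        Console.WriteLine(Demo()) ' prints "0 vs 42"
    End Sub
End Module
```

The local variable never sees what happened to the copy held by the state machine, which is why [Resume] reads the builder back out of the field before asking for its Task.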

Conclusions

This is a pretty sketchy implementation. It does just enough to make the sample code work. It doesn’t work with Async Subs or Async Task(Of T) Functions.

It also isn't composable: you can't write a wrapper method around a call to Hibernate(). That's because it only serializes its immediate caller.

This code is tightly coupled to the .NET 4.5 compilers’ internal implementations of async methods. It relies on the fact that “Await Hibernator.Hibernate()” will invoke the Hibernator’s OnCompleted method with a delegate whose Target object has an m_stateMachine field that points to the state machine. It relies on this state machine having lifted local variables, and a builder of a recognizable name, and recognizable awaiter fields.

We didn’t give these things a clean public API, because we didn’t have any good use-cases for how or why people would want to serialize their async methods. If you folks come up with compelling mainstream scenarios, then we might consider maybe a Microsoft-authored NuGet package which does serialization, or exposing the necessary fields via a clean public API.
