How to hibernate async methods (how to serialize Task)
[update: there's now a sequel to this post, here, which improves the technique so it can handle async callstacks rather than just a single async method at a time.]
Sometimes people ask the question “How can I serialize a Task?” If you try, it throws a SerializationException:
Dim t = TestAsync()
Using stream As New MemoryStream
    Dim formatter As New Formatters.Binary.BinaryFormatter
    formatter.Serialize(stream, t)
    ' SerializationException: Type 'Task' is not marked as serializable.
End Using
Async Function TestAsync() As Task
    Await Task.Delay(10)
    Console.WriteLine("a")
End Function
It’s the wrong question to ask. A Task is something that conceptually can’t be serialized. Consider as an example this task: “Dim t As Task = httpClient.GetStringAsync(url)”. The task object embodies two things: (1) a task is a promise that the operation will be completed at some time in the future, regardless of whether that operation is being performed locally in an async method or thread, or through some packets hurtling through the internet; (2) a task contains a collection of signed-up behaviors, delegates, that will be invoked once the task transitions to the “completed” state. It doesn’t make sense to talk about serializing either of those two things.
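As a rough illustration of those two aspects, here is a minimal sketch built on TaskCompletionSource. The names tcs, promise and continuation are illustrative only and do not come from the code in this post. The task is just a handle on a result that someone else will supply later, and the ContinueWith call signs up a delegate that lives in this process's memory.

```vb
Imports System
Imports System.Threading.Tasks

Module TaskAnatomyDemo
    Sub Main()
        ' (1) The promise: nothing has completed yet, the task only says
        '     "a result will arrive at some point".
        Dim tcs As New TaskCompletionSource(Of String)()
        Dim promise As Task(Of String) = tcs.Task
        Console.WriteLine(promise.IsCompleted)   ' prints False

        ' (2) A signed-up behavior: a delegate invoked once the task completes.
        Dim continuation As Task = promise.ContinueWith(
            Sub(antecedent As Task(Of String))
                Console.WriteLine("completed with: " & antecedent.Result)
            End Sub)

        ' Whoever owns the operation fulfils the promise; only then does the delegate run.
        tcs.SetResult("hello")
        continuation.Wait()
    End Sub
End Module
```

Neither half has an obvious on-disk form: the promise depends on whoever holds the TaskCompletionSource, and the delegate is a pointer to live code and captured state.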
The right question is: “How can I serialize the current state of my progress through this async method?” In this article I’ll show how.
Caveat
Personally, I don’t yet believe that this is a good idea. I’m posting this blog article in case I’m wrong, and in case anyone finds a good design pattern for it.
In the late 1990s I did my PhD on the topic of message-passing as a way to write concurrent programs. It was a bad idea: “await” is better than message-passing. At the time most of my peers were doing PhDs on mobile agents which, I think, are an even worse idea. Mobile agents are pieces of code which can serialize their state to disk, be transmitted over the internet, be deserialized on a remote machine, and resume their execution there. BizTalk/XLang also has the notion of long-running interactive processes which can be suspended to disk and resumed later.
I don’t think it’s a good idea to serialize processes/functions/behaviors/agents. That’s because they’re intangible. I think we should figure out the plain-old-data embodiment of the current state of our work, and serialize that data to disk, and deserialize it again all as a passive data-structure rather than an active agent. That way, if someone asks “What is currently in flight?” then in answer we can point them to the data structures.
So please only follow this article if you have a real vision of where it will be practically useful and clean.
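To make the plain-old-data alternative from this caveat concrete, here is a minimal sketch of the same kind of counting loop where the in-flight state is an ordinary data object saved as two lines of text. Everything in it (CountingState, Save, Load, CountAsync, the file name a.state) is hypothetical and written for illustration; it is not part of the Hibernator technique described below.

```vb
Imports System
Imports System.IO
Imports System.Threading.Tasks

Module PlainDataDemo
    ' Hypothetical plain-old-data record of how far the loop has got.
    Class CountingState
        Public NextValue As Integer
        Public Max As Integer
    End Class

    Sub Save(state As CountingState, fileName As String)
        File.WriteAllLines(fileName,
            New String() {state.NextValue.ToString(), state.Max.ToString()})
    End Sub

    Function Load(fileName As String) As CountingState
        Dim lines = File.ReadAllLines(fileName)
        Return New CountingState With {
            .NextValue = Integer.Parse(lines(0)),
            .Max = Integer.Parse(lines(1))}
    End Function

    ' Counts from state.NextValue to state.Max. After printing stopAfter it
    ' saves the passive state to disk and returns early.
    Async Function CountAsync(state As CountingState, stopAfter As Integer,
                              fileName As String) As Task
        While state.NextValue <= state.Max
            Dim i = state.NextValue
            Console.WriteLine(i)
            Await Task.Delay(100)
            state.NextValue = i + 1
            If i = stopAfter Then
                Save(state, fileName)
                Return
            End If
        End While
    End Function

    Sub Main()
        ' First run: prints 1 to 5, then saves and stops.
        CountAsync(New CountingState With {.NextValue = 1, .Max = 10}, 5, "a.state").Wait()
        ' "What is currently in flight?" can be answered by reading the data.
        Console.WriteLine("in flight: next=" & Load("a.state").NextValue)
        ' Second run: picks up at 6 and runs to 10 (-1 means never stop early).
        CountAsync(Load("a.state"), -1, "a.state").Wait()
    End Sub
End Module
```

The cost of this approach is that the method has to be written around an explicit state object, which is exactly the bookkeeping that hibernating the async method tries to avoid.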
Using a serializable async method
Here’s my serializable async method “TestAsync”.
Async Function TestAsync(min As Integer, max As Integer) As Task
    For i = min To max
        Console.WriteLine(i)
        Await Task.Delay(100)
        If i = 5 Then Await Hibernator.Hibernate("a.hib")
    Next
End Function
One of my programs is going to kick off the async method. But halfway through, the async method will hibernate itself to disk and terminate abruptly with a “Serialized-to-disk” exception.
Console.WriteLine("INITIATING...")
Try
    Await TestAsync(1, 10)
Catch ex As OperationCanceledException
    Console.WriteLine("EX " & ex.Message)
End Try
INITIATING...
1
2
3
4
5
EX Serialized to a.hib
My other program might be running on a different machine, or maybe just a long time later, or maybe it just needed the async method to be snapshotted to disk for resilience against errors. It’s going to resume that same async method, mid-flight, from disk.
Console.WriteLine("RESUMING...")
Dim t As Task = Hibernator.Resume("a.hib")
Await t
Console.WriteLine("done")
RESUMING...
6
7
8
9
10
done
Q. What makes the async method serializable? A. All its parameters and local variables must themselves be serializable.
Q. Why design it so that hibernation throws an exception (here an OperationCanceledException) from the original method? A. Imagine you stepped into a Star Trek transporter to beam down to the planet’s surface. Would you want the original copy of you in the transporter bay to go on living, while the new copy on the planet’s surface goes on living as well? So there are now two of you? You’d have to be very careful of side-effects, in case the two copies tried to do the same thing at the same time. Throwing from the original guarantees that only the hibernated copy carries on.
Q. Why design it so the async method itself is responsible for hibernating, rather than its caller? A. I think that hibernation is serious business, akin to thread-termination or task-cancellation, which should only be done cooperatively when the async method is ready for it. I’d probably want to extend the above code to use HibernationTokenSource / HibernationToken, by analogy to CancellationTokenSource / CancellationToken.
Implementing serialization of async methods
Here’s my implementation of the Hibernate and Resume functionality. Code first; discussion afterwards.
Imports System
Imports System.IO
Imports System.Reflection
Imports System.Runtime.CompilerServices
Imports System.Runtime.Serialization
Imports System.Threading.Tasks
<Serializable>
Public Class Hibernator : Implements INotifyCompletion, ISerializable
    Private m_ex As Runtime.ExceptionServices.ExceptionDispatchInfo
    Private m_fn As String
    '===================================================================================
    Public Shared Function Hibernate(fn As String) As Hibernator
        ' Intended use: "Await Hibernator.Hibernate(filename)"
        Return New Hibernator With {.m_fn = fn}
    End Function
    Private Sub New()
        ' Private constructor used only by the Hibernate() shared method
    End Sub
    Function GetAwaiter() As Hibernator
        Return Me
    End Function
    Public ReadOnly Property IsCompleted As Boolean
        Get
            Return False ' so that OnCompleted gets called
        End Get
    End Property
    Public Sub OnCompleted(continuation As Action) Implements INotifyCompletion.OnCompleted
        Dim ex0 As Exception = Nothing
        Try
            ' Dig the boxed state machine out of the continuation delegate's target
            Dim sm = continuation.Target.GetType().GetField("m_stateMachine",
                BindingFlags.NonPublic Or BindingFlags.Instance).GetValue(continuation.Target)
            Using stream As New FileStream(m_fn, FileMode.Create, FileAccess.Write)
                Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
                ' Header: which state-machine type to re-create on resume
                formatter.Serialize(stream, sm.GetType().Assembly.FullName)
                formatter.Serialize(stream, sm.GetType().FullName)
                ' Body: a sequence of (field name, field value) pairs
                For Each field In sm.GetType().GetFields(BindingFlags.Public Or
                        BindingFlags.NonPublic Or BindingFlags.Instance)
                    If field.Name = "$Builder" OrElse field.Name = "<>t__builder" Then Continue For
                    Dim fieldValue = field.GetValue(sm)
                    If fieldValue Is Nothing Then Continue For
                    If field.Name.Contains("$awaiter") AndAlso
                        Not Object.ReferenceEquals(fieldValue, Me) Then Continue For
                    formatter.Serialize(stream, field.Name)
                    formatter.Serialize(stream, fieldValue)
                Next
            End Using
            ex0 = New OperationCanceledException("Serialized to " & m_fn)
        Catch ex1 As Exception
            ' Don't leave a half-written file behind
            If File.Exists(m_fn) Then File.Delete(m_fn)
            ex0 = ex1
        End Try
        m_ex = Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(ex0)
        ' That's so we can save+rethrow the exception with its intended callstack
        continuation()
    End Sub
    Public Sub GetResult()
        If m_ex IsNot Nothing Then m_ex.Throw()
    End Sub
    '===================================================================================
    Public Shared Function [Resume](fn As String) As Task
        Using stream As New IO.FileStream(fn, FileMode.Open, FileAccess.Read)
            Dim formatter As New Runtime.Serialization.Formatters.Binary.BinaryFormatter
            Dim assemblyName = CStr(formatter.Deserialize(stream))
            Dim typeName = CStr(formatter.Deserialize(stream))
            Dim sm = TryCast(System.Activator.CreateInstance(assemblyName, typeName).Unwrap(),
                IAsyncStateMachine)
            While stream.Position < stream.Length
                Dim fieldName = CStr(formatter.Deserialize(stream))
                Dim fieldValue = formatter.Deserialize(stream)
                Dim field = sm.GetType().GetField(fieldName, BindingFlags.Public Or
                    BindingFlags.NonPublic Or BindingFlags.Instance)
                field.SetValue(sm, fieldValue)
            End While
            Dim builder = AsyncTaskMethodBuilder.Create()
            Dim builderField = sm.GetType().GetField("$Builder", BindingFlags.Public Or
                BindingFlags.NonPublic Or BindingFlags.Instance)
            If builderField Is Nothing Then builderField =
                sm.GetType().GetField("<>t__builder", BindingFlags.Public Or
                    BindingFlags.NonPublic Or BindingFlags.Instance)
            builderField.SetValue(sm, builder)
            builder.Start(sm) ' This invokes MoveNext()
            Dim builder2 = CType(builderField.GetValue(sm), AsyncTaskMethodBuilder)
            Return builder2.Task
        End Using
    End Function
    '===================================================================================
    Private Sub New(info As SerializationInfo, ctx As StreamingContext)
        ' Private constructor used only by the deserializer
    End Sub
    Public Sub GetObjectData(info As SerializationInfo,
            context As StreamingContext) Implements ISerializable.GetObjectData
        ' we have nothing to serialize
    End Sub
End Class
NOTE 1: Which fields need to be serialized? Well, all of them, except for...
- The builder field won't be serialized. It must be reconstructed. In VB it's called $Builder. In C# it's called <>t__builder. Note that both names are unutterable in user code.
- Most awaiters don't need to be serialized; only the current one is needed. The VB compiler calls its awaiters $awaiterXYZ. The C# compiler calls its awaiters <>XYZ$awaiterABC. Again, both names are unutterable in user code. Both compilers create several awaiter fields: one of type 'object' shared by all reference-type awaiters, and one for each value-type awaiter.
- Null fields don't need to be serialized, and indeed can't.
NOTE 2: Normally it's wrong for OnCompleted to invoke continuation() directly, because you end up with MoveNext invoking OnCompleted invoking MoveNext, which if it went too far would blow the stack. We should really be posting a delegate to the current synchronization context. But here the intended usage is "Await Hibernator.Hibernate(filename)" which will terminate the async method with an uncaught exception, so there's no danger.
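For comparison, here is a minimal sketch of what "posting a delegate to the current synchronization context" might look like in an awaiter. PostingAwaitable is a hypothetical type written for this illustration, not part of the Hibernator above, and falling back to the thread pool when there is no synchronization context is an assumption of the sketch.

```vb
Imports System
Imports System.Runtime.CompilerServices
Imports System.Threading
Imports System.Threading.Tasks

' Hypothetical awaitable whose OnCompleted never runs the continuation inline.
Public Class PostingAwaitable
    Implements INotifyCompletion

    Public Function GetAwaiter() As PostingAwaitable
        Return Me
    End Function

    Public ReadOnly Property IsCompleted As Boolean
        Get
            Return False ' force the compiler-generated code to call OnCompleted
        End Get
    End Property

    Public Sub OnCompleted(continuation As Action) Implements INotifyCompletion.OnCompleted
        Dim ctx = SynchronizationContext.Current
        If ctx IsNot Nothing Then
            ' Queue the continuation: MoveNext returns before it runs again
            ctx.Post(Sub(state) continuation(), Nothing)
        Else
            ' No context (e.g. a console app): assumed fallback is the thread pool
            ThreadPool.QueueUserWorkItem(Sub(state) continuation())
        End If
    End Sub

    Public Sub GetResult()
    End Sub
End Class

Module PostingDemo
    Async Function LoopAsync() As Task
        ' Many awaits in a row; each continuation starts on a fresh stack
        For i = 1 To 100000
            Await New PostingAwaitable()
        Next
        Console.WriteLine("finished")
    End Function

    Sub Main()
        LoopAsync().Wait()
    End Sub
End Module
```

Because each continuation is queued rather than called, the stack depth stays flat however many times the loop awaits. The Hibernator can skip this only because its continuation immediately throws.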
NOTE 3: In this design of hibernator, GetResult will be called on two occasions:
- when the user did "Await Hibernator.Hibernate()", and our OnCompleted method saved an exception into m_ex, and we always wish to throw that exception
- when the user resumed the async method from hibernation, and the Hibernator was deserialized with all fields null, and we don't want to throw any exception.
NOTE 4: Why does the [Resume] method retrieve the "builder" field a second time? Well, async state machines are structures that get boxed at their first await. When that happens, all structure-typed fields within them are copied to the heap along with them. The builder is itself a structure, so "sm.builder" after the first MoveNext might not be the same "builder" that we created before.
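The underlying value-type behavior can be shown with a small standalone sketch. Counter and Holder are hypothetical stand-ins for the builder structure and the state machine respectively; the real types are more involved, but the copying rule is the same.

```vb
Imports System

Module StructCopyDemo
    ' Stand-in for the builder: a structure with mutable state
    Structure Counter
        Public Count As Integer
        Public Sub Increment()
            Count += 1
        End Sub
    End Structure

    ' Stand-in for the state machine that holds the builder in a field
    Class Holder
        Public C As Counter
    End Class

    Sub Main()
        Dim local As New Counter()
        Dim h As New Holder()
        h.C = local          ' assignment copies the structure into the field
        h.C.Increment()      ' mutates the copy stored inside h

        Console.WriteLine(local.Count)   ' prints 0: the local never saw the change
        Console.WriteLine(h.C.Count)     ' prints 1: the field holds the live copy
    End Sub
End Module
```

In the same way, the local "builder" in [Resume] is only the value that was copied into the state machine; the copy that MoveNext actually worked on has to be read back out of the field.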
Conclusions
This is a pretty sketchy implementation. It does just enough to make the sample code work. It doesn’t work with Async Subs or Async Task(Of T) Functions.
It also isn't composable: you can't write a wrapper method around a call to Hibernate(). That's because it only serializes its immediate caller.
This code is tightly coupled to the .NET 4.5 compilers’ internal implementations of async methods. It relies on the fact that “Await Hibernate()” will invoke Hibernate’s OnCompleted method with a delegate whose Target object has an m_stateMachine field that points to the state machine. It relies on this state machine having lifted local variables, and a builder of a recognizable name, and recognizable awaiter fields.
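Given that coupling, one option is to make the reflection step fail softly when the expected layout isn't there. The helper below is a hypothetical sketch (TryGetStateMachine is not part of the code above); it uses the same m_stateMachine lookup as OnCompleted but returns Nothing instead of throwing a NullReferenceException.

```vb
Imports System
Imports System.Reflection
Imports System.Runtime.CompilerServices

Module StateMachineLookup
    ' Returns the state machine behind an await continuation, or Nothing if the
    ' delegate does not have the compiler-internal shape this technique expects.
    Function TryGetStateMachine(continuation As Action) As IAsyncStateMachine
        If continuation Is Nothing OrElse continuation.Target Is Nothing Then Return Nothing
        Dim target = continuation.Target
        Dim field = target.GetType().GetField("m_stateMachine",
            BindingFlags.NonPublic Or BindingFlags.Instance)
        If field Is Nothing Then Return Nothing
        Return TryCast(field.GetValue(target), IAsyncStateMachine)
    End Function

    Sub Main()
        ' An ordinary delegate has no m_stateMachine field on its target,
        ' so the lookup reports failure rather than crashing.
        Dim ordinary As Action = Sub() Console.WriteLine("not an async continuation")
        Console.WriteLine(TryGetStateMachine(ordinary) Is Nothing)   ' prints True
    End Sub
End Module
```

A caller could then raise a clear "cannot hibernate on this runtime" error when the result is Nothing, which is friendlier than a failure deep inside reflection code.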
We didn’t give these things a clean public API, because we didn’t have any good use-cases for how or why people would want to serialize their async methods. If you folks come up with compelling mainstream scenarios, then we might consider a Microsoft-authored NuGet package which does serialization, or exposing the necessary fields via a clean public API.