Power2: Async and Resumable methods

02/01/2010

[This post is part of a series, "wish-list for future versions of VB"]

IDEA: Async and Resumable Methods. We should be able to use the "Asynchronous Programming Model" (APM) more easily, via an extra keyword "Async":

    Async Function Test() As Integer
        Dim fs = IO.File.OpenRead("c:\windows\win.ini")
        Dim buf(100) As Byte
        Dim numBytesRead = Async fs.Read(buf, 0, 100)
        numBytesRead += Async fs.Read(buf, 0, 100)
        Return numBytesRead
    End Function

    Sub Main()
        Dim x = Test()
    End Sub

The goal is to make this simple enough that anyone can read it and make sense of it without knowing what's going on under the hood. And the only thing to make sense of is "Async keyword means that the call is non-blocking".

Here's what happens under-the-hood...

Normally fs.Read() would be a blocking call, which would block the calling thread until it has finished. But because of the Async keyword on it, we instead use the APM to initiate the Read; the remainder of the function will be resumed only when the IO system has completed the read. This leaves the thread free to do other work. The motive for this is that threads are costly on Windows and the CLR, and it's not good to have thousands or even hundreds of them.
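For comparison, here is a rough sketch of what a single non-blocking read looks like today when the APM is used by hand. It relies only on the existing FileStream.BeginRead / EndRead pair; the module name and the Console.ReadLine used to keep the process alive are illustrative assumptions, not part of the proposal.

```vb
Imports System
Imports System.IO

Module ApmByHand
    Sub Main()
        ' Path taken from the example in this post; any readable file would do.
        ' The final argument (True) asks for overlapped, i.e. asynchronous, IO.
        Dim fs As New FileStream("c:\windows\win.ini", FileMode.Open,
                                 FileAccess.Read, FileShare.Read, 4096, True)
        Dim buf(99) As Byte ' 100 bytes, indices 0..99

        ' The "rest of the function" has to be packaged up as a callback.
        Dim callback As AsyncCallback =
            Sub(ar As IAsyncResult)
                Dim numBytesRead As Integer = fs.EndRead(ar)
                Console.WriteLine("Read {0} bytes", numBytesRead)
                fs.Close()
            End Sub

        ' BeginRead returns at once; the callback runs later on another thread.
        fs.BeginRead(buf, 0, buf.Length, callback, Nothing)

        Console.WriteLine("Read started; this thread is free for other work.")
        Console.ReadLine() ' assumption: keeps the console process alive
    End Sub
End Module
```

Note how the code after the read has moved into a lambda. The proposed Async keyword is meant to let the compiler do that packaging instead of the programmer.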

What we end up with is that the code is a bit like an iterator: it does some work, then comes to a yield point (i.e. the start of an async call), and someone has to resume the code once the work has finished. Then it comes to a second asynchronous call, and again yields, and again someone has to resume the work.

The question is, who does the job of resuming once the work has finished? And where does control-flow go when we hit a yield point?

Answer1: "Async Function Test()" was declared with the Async keyword. That means that it defers the decision: it simply "bubbles up" the yield point to its caller. (Under the hood, the compiler synthesizes "BeginTest()" and "EndTest()" so that Test is callable via the async pattern.)

Answer2: In the case of "Sub Main()", it was not declared as an Async sub. That's a way of saying "The async buck stops here". Main will block until the call to Test() has returned.

The next question is, how do things actually get resumed? The way the APM works is that when you call BeginRead(), you pass a lambda, and this lambda gets executed by some IO thread once the read has finished. In our case it will be a small lambda which simply sets an event. Meanwhile the main thread is inside "Sub Main", blocked, waiting for the event. As soon as it gets the event, it continues by executing the next statement inside Test().
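The event-based resume described above can be approximated by hand with today's APM. The sketch below is a simplification: the helper ReadAndWait is a hypothetical name, and the wait sits inside that helper rather than in Sub Main as the proposal describes. It does show the key property, which is that the code after each read continues on the thread that started it.

```vb
Imports System
Imports System.IO
Imports System.Threading

Module ResumeOnCallingThread
    ' Hypothetical helper: starts an APM read, then blocks the calling thread
    ' on an event that the completion callback sets.
    Function ReadAndWait(fs As FileStream, buf As Byte()) As Integer
        Using done As New ManualResetEvent(False)
            ' The small lambda from the text: it only sets the event.
            Dim callback As AsyncCallback = Sub(r As IAsyncResult) done.Set()
            Dim ar As IAsyncResult = fs.BeginRead(buf, 0, buf.Length, callback, Nothing)
            done.WaitOne()        ' blocked until the IO thread signals
            Return fs.EndRead(ar) ' back on the thread that started the read
        End Using
    End Function

    Function Test() As Integer
        Dim buf(99) As Byte
        Using fs As New FileStream("c:\windows\win.ini", FileMode.Open,
                                   FileAccess.Read, FileShare.Read, 4096, True)
            Dim numBytesRead As Integer = ReadAndWait(fs, buf)
            numBytesRead += ReadAndWait(fs, buf)
            Return numBytesRead
        End Using
    End Function

    Sub Main()
        Console.WriteLine("Started on thread {0}", Thread.CurrentThread.ManagedThreadId)
        Dim x As Integer = Test()
        Console.WriteLine("Read {0} bytes, finished on thread {1}",
                          x, Thread.CurrentThread.ManagedThreadId)
    End Sub
End Module
```

Both lines should report the same thread id, since only the tiny event-setting lambda ever runs on the IO thread.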

The overall goal is that the user can write what looks like sequential code, and their code always resumes on the same thread it started on, even though under-the-hood the APM works very differently.

SCENARIO: You have written a Silverlight application. This simply does not allow blocking calls: instead it requires APM. You want an easy syntax to use it.

SCENARIO: You have written a web service and don't want to use up lots of threads because they are costly: instead you use APM. And again you want an easy syntax.

Early last year, researcher Claudio Russo of Microsoft Research in Cambridge, England, worked on an experiment called "Concurrent Basic". It was a version of Visual Basic with message-passing primitives. I had done my own PhD research on similar message-passing primitives. We have taken some of the lessons learned from that experiment. The two most important lessons for me were:

1. If the UI makes an async call, it usually wants to resume on the UI thread. The typical pattern is that you make a web request or file operation which takes some time, and when it's done you have to update the UI, on the UI thread, with the results. That's exactly what this idea does.

2. Users shouldn't have to "invert" their programs, shouldn't have to turn them into state-machines. Think of iterators in C#. The compiler takes sequential-looking code and turns it into a finite state machine. But VB users don't have that and so are forced to write their iterator state machines manually. It's always better to have code that looks sequential. In other words, if two things are meant to happen sequentially, then you should be able to write them sequentially in your source code. You shouldn't have to package them up into lambdas or functions or anything like that.
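To make the "inversion" concrete, here is a small sketch of an iterator written by hand in VB as an explicit state machine. The sequential intent is simply "yield start, then middle, then end"; the class name ThreeSteps and its fields are made up for illustration.

```vb
Imports System
Imports System.Collections
Imports System.Collections.Generic

' Sequential intent: yield "start", then "middle", then "end".
Class ThreeSteps
    Implements IEnumerator(Of String)

    Private state As Integer = 0          ' which step comes next
    Private currentValue As String = Nothing

    Public Function MoveNext() As Boolean Implements IEnumerator.MoveNext
        Select Case state
            Case 0
                currentValue = "start"
                state = 1
                Return True
            Case 1
                currentValue = "middle"
                state = 2
                Return True
            Case 2
                currentValue = "end"
                state = 3
                Return True
            Case Else
                Return False              ' sequence exhausted
        End Select
    End Function

    Public ReadOnly Property Current() As String Implements IEnumerator(Of String).Current
        Get
            Return currentValue
        End Get
    End Property

    Private ReadOnly Property CurrentObject() As Object Implements IEnumerator.Current
        Get
            Return currentValue
        End Get
    End Property

    Public Sub Reset() Implements IEnumerator.Reset
        state = 0
        currentValue = Nothing
    End Sub

    Public Sub Dispose() Implements IDisposable.Dispose
        ' Nothing to release in this sketch.
    End Sub
End Class

Module Program
    Sub Main()
        Dim steps As New ThreeSteps()
        While steps.MoveNext()
            Console.WriteLine(steps.Current)
        End While
    End Sub
End Module
```

Running it prints start, middle and end on separate lines. Three sequential steps have turned into a state field, a Select Case and a page of boilerplate, which is the kind of hand-written inversion this lesson argues against.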

There's one scenario we've not covered yet in this proposal, "bottlenecks". The idea is that locks and mutexes and critical-sections are awkward to use. Instead you'd like to protect a block of code so that only one thread can ever be executing this block at a time. It would look like this:

    Sub f()
        Console.WriteLine("hello")

        Async On MyThreadpool1
            Dim s = Async workItems.Pop()
            s &= "work"
            finishedItems.Push(s)
        End Async

        Console.WriteLine("goodbye")
    End Sub

This is a generalization of a critical section. Only threads from "MyThreadpool1" are ever allowed to execute inside this block. If MyThreadpool1 contains just a single thread, then only one thread can ever be active inside the block at a time. This isn't quite like a critical section, though: if one caller has yielded its thread while waiting for its Async call to finish, another caller can use that thread to enter the block in the meantime. The end result is an easier and more efficient way to protect your variables against concurrent access.
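One way to approximate a single-threaded "MyThreadpool1" with existing library types is a dedicated worker thread that drains a queue of work items. This is only a sketch under assumptions: SingleThreadPool and Post are hypothetical names, it uses the .NET 4 types BlockingCollection and CountdownEvent, and it does not model the yielding behaviour of the proposed Async block.

```vb
Imports System
Imports System.Collections.Concurrent
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical stand-in for "MyThreadpool1" with exactly one thread.
Class SingleThreadPool
    Private ReadOnly queue As New BlockingCollection(Of Action)()
    Private ReadOnly worker As Thread

    Public Sub New()
        worker = New Thread(AddressOf Run)
        worker.IsBackground = True ' do not keep the process alive
        worker.Start()
    End Sub

    ' Runs on the single worker thread: executes posted blocks one at a time.
    Private Sub Run()
        For Each work As Action In queue.GetConsumingEnumerable()
            work.Invoke()
        Next
    End Sub

    Public Sub Post(work As Action)
        queue.Add(work)
    End Sub
End Class

Module Program
    Sub Main()
        Dim pool As New SingleThreadPool()
        Dim finishedItems As New Stack(Of String)() ' not thread-safe by itself
        Dim remaining As New CountdownEvent(4)

        For i As Integer = 1 To 4
            Dim item As String = "item" & i
            ' The protected block: only ever runs on the pool's one thread.
            Dim work As Action =
                Sub()
                    finishedItems.Push(item & "work")
                    remaining.Signal()
                End Sub
            ' Several callers on different threads all funnel into the pool.
            ThreadPool.QueueUserWorkItem(Sub(state As Object) pool.Post(work))
        Next

        remaining.Wait()
        Console.WriteLine(finishedItems.Count) ' prints 4
    End Sub
End Module
```

The Stack is touched without any SyncLock, yet stays consistent because every Push happens on the same thread. The proposed "Async On" block would express the same funnelling directly in the language.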

And here's how to use the same construct to do some work on the UI thread:

    Async On UI_Thread
        Form1.Button1.Text = "rendezvous"
    End Async
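For reference, this is roughly how the same rendezvous is written today in Windows Forms, using Control.BeginInvoke to marshal the update onto the UI thread. The form, the one-second sleep and the control names are illustrative assumptions.

```vb
Imports System
Imports System.Threading
Imports System.Windows.Forms

Public Class DemoForm
    Inherits Form

    Private ReadOnly button1 As New Button() With {.Text = "waiting", .Dock = DockStyle.Fill}

    Public Sub New()
        Controls.Add(button1)
    End Sub

    Protected Overrides Sub OnShown(e As EventArgs)
        MyBase.OnShown(e)
        ' Do the slow part on a thread-pool thread so the UI stays responsive.
        ThreadPool.QueueUserWorkItem(
            Sub(state As Object)
                Thread.Sleep(1000) ' stand-in for a web request or file operation
                ' Controls may only be touched on the UI thread, so hand the
                ' update back to it instead of setting Text here directly.
                BeginInvoke(New Action(Sub() button1.Text = "rendezvous"))
            End Sub)
    End Sub
End Class

Module Program
    <STAThread()>
    Sub Main()
        Application.Run(New DemoForm())
    End Sub
End Module
```

The proposed "Async On UI_Thread" block would replace the BeginInvoke call and its lambda with ordinary-looking statements.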

Provisional evaluation from VB team: This idea obviously needs a lot more work and design. For instance, inside "Async Function Test()", could we infer that the function is Async because an Async call was made inside? Or could we assume that all calls made inside are Async because the function was declared as Async? How many blocking constructs are there in WPF and Winforms that would have to be updated to pump resumables? Will this really work in the Silverlight scenario? Is it sufficiently better than the current BackgroundWorker solution? Is it really useful outside Silverlight? And, most importantly, is it really easy enough to use? How could it be made easier?

We think this is a decent idea, one worth considering against the other decent ideas.
