Grokking LazyAsyncResult (.Net internal)

(Background: sometimes as I try to understand our bugs, I have to learn about the .net internal classes used for implementation of the public classes that we're consuming - my motivator today is SocketAsyncEventArgs.)

Today I'm going to try to understand the internal class LazyAsyncResult.

Now that .Net Core is open source, we can read its source code directly. The version I'm looking at is here:

https://github.com/dotnet/corefx/blob/bffef76f6af208e2042a2f27bc081ee908bb390b/src/Common/src/System/Net/LazyAsyncResult.cs

- but I like to use Reflector, which helps me figure out how the class is actually used too.

The class comment in source is pretty helpful: "LazyAsyncResult - Base class for all IAsyncResult classes that want to take advantage of lazily-allocated event handles".

In plainer language, imagine that you have to implement the IAsyncResult interface from scratch as part of a new API you're writing. (Yes, this is less and less likely as the world moves to async/await and Task based APIs, but humor me!)

The interface must provide the following properties (snipped from MSDN):

  1. object AsyncState
  2. WaitHandle AsyncWaitHandle
  3. bool CompletedSynchronously
  4. bool IsCompleted

The member to pay attention to for understanding LazyAsyncResult is AsyncWaitHandle. Clearly, the goal of this class is that if no caller ever actually needs a WaitHandle to wait on, then we shouldn't need to allocate one - after all, allocating WaitHandles is slightly expensive.
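To make the 'lazy' idea concrete, here is a minimal sketch of an IAsyncResult that only allocates its event when somebody asks for AsyncWaitHandle. This is my own illustration, not the framework's code: the class name LazyHandleResult and the HandleAllocated property are made up for the example, and the real LazyAsyncResult does considerably more.

```csharp
using System;
using System.Threading;

// Hypothetical illustration only; NOT the framework's LazyAsyncResult.
public sealed class LazyHandleResult : IAsyncResult
{
    private ManualResetEvent _event; // stays null until someone asks for it
    private int _completed;          // 0 = pending, 1 = completed

    public LazyHandleResult(object state) { AsyncState = state; }

    public object AsyncState { get; }
    public bool CompletedSynchronously => false;
    public bool IsCompleted => Volatile.Read(ref _completed) != 0;

    // Made-up helper so the example can show whether a handle was ever created.
    public bool HandleAllocated => Volatile.Read(ref _event) != null;

    public WaitHandle AsyncWaitHandle
    {
        get
        {
            if (Volatile.Read(ref _event) == null)
            {
                var created = new ManualResetEvent(false);
                if (Interlocked.CompareExchange(ref _event, created, null) != null)
                {
                    created.Dispose(); // another thread published its event first
                }
                if (Volatile.Read(ref _completed) != 0)
                {
                    _event.Set(); // completion raced with (or preceded) allocation
                }
            }
            return _event;
        }
    }

    public void Complete()
    {
        Interlocked.Exchange(ref _completed, 1);
        // Only signal if a handle was ever handed out.
        Volatile.Read(ref _event)?.Set();
    }
}
```

If a caller only ever uses the callback or polls IsCompleted, Complete() never touches a kernel event at all; the cost is only paid by callers that actually read AsyncWaitHandle.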

How expensive? Well, here is an interesting discussion related to the subject by Joe Duffy, which basically explains the entire reason for having this LazyAsyncResult class. Aside from mentioning that allocating these handles is kinda expensive, he does some performance measurements, and then notes "In the case of high performance asynchronous IO, for example, where completion often involves simply marshaling some bytes between buffers, this can be a key step in the process of improving system throughput."

I think Joe was not being explicit in that sentence, but he probably meant that if async callbacks just move some bytes between buffers, you might never actually wait on the event at all, which is what makes this the biggest performance win. I'm also inferring, since he didn't include his test harness, that his Fibonacci scenario is probably not actually waiting on the event either - otherwise we wouldn't expect to see an actual performance improvement for large Fibonacci scenarios.

So is that the case that you won't be using any WaitHandles when you are using the async socket API?... Hopefully. :)

Anyway, now we understand most of the class design, but what about all these internal members the .net framework adds on top of the IAsyncResult interface? There are heaps of them! Here they are listed, with descriptions lifted from the source, broken into groups with commentary:

These properties are an obvious set to have in order to implement a 'Begin*/End*' IAsyncResult pattern, which promises to call back some user-supplied callback method with a user-supplied async object parameter. Normally, the 'End*' method also returns the final result of the async operation.

  1. protected AsyncCallback AsyncCallback - "Caller's callback method"
  2. internal object AsyncObject - "Caller's async object"
  3. internal object Result - "Final I/O result returned by the End* method"
  4. internal int ErrorCode - "Win32 error code for Win32 IO async calls (that want to throw)."

Note that sometimes Result is actually set to hold an exception object, when the result of the async operation was a failure, and as a bonus, there is an int ErrorCode field for storing Win32 error codes such as socket errors, where applicable.
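For anyone who hasn't used the Begin*/End* pattern lately, here is roughly what it looks like from the caller's side. This sketch uses the public Stream.BeginRead/EndRead pair on a MemoryStream purely as a convenient stand-in (it is not a claim about which IAsyncResult implementation backs it); the BeginEndDemo class and ReadAll method are names I made up for the example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical demo class: shows the caller's view of Begin*/End*.
public static class BeginEndDemo
{
    public static string ReadAll(byte[] data)
    {
        var stream = new MemoryStream(data);
        var buffer = new byte[data.Length];
        string text = null;

        using (var done = new ManualResetEventSlim())
        {
            // The last two arguments are the user-supplied callback and the
            // user-supplied state object that comes back as ar.AsyncState.
            stream.BeginRead(buffer, 0, buffer.Length, ar =>
            {
                var s = (Stream)ar.AsyncState;
                int read = s.EndRead(ar); // End* hands back the final result
                text = Encoding.ASCII.GetString(buffer, 0, read);
                done.Set();
            }, stream);

            done.Wait(); // the demo blocks here only so it can return the text
        }
        return text;
    }
}
```

Note that this caller never touches AsyncWaitHandle: it learns about completion through the callback, which is exactly the kind of usage where a lazily allocated handle never needs to exist.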

Some of these next properties are a little less clearly described:

  1. internal bool EndCalled - "True if the user called the End*() method" [what? the user calls their own end method?]
  2. internal bool InternalPeekCompleted - "Returns true if this call created the event." [what???]
  3. internal void InvokeCallback()
  4. internal void InvokeCallback(object result)
  5. protected void ProtectedInvokeCallback(object result, IntPtr userToken) - "A method for completing the IO with a result and invoking the user's callback. Used by derived classes to pass context into an overridden Complete()." [note: ProtectedInvokeCallback calls into Complete() if the async result had not been completed yet.]

I mean, come on, how could LazyAsyncResult possibly know that the user called their own end method without going through LazyAsyncResult at all? That can't really be what the comment means, so what is EndCalled actually for? Inspection shows that the purpose of EndCalled is to implement 'double-End-call' checking, i.e. inside an End*() method, the IAsyncResult will be cast as LazyAsyncResult and used to perform a sanity check, which becomes a nice debugging assistant:

    if (lazyAsyncResult.EndCalled)
    {
        throw new InvalidOperationException("You already called End* on this async result, calling it twice is an error");
    }
    lazyAsyncResult.EndCalled = true;

And what about this InternalPeekCompleted boolean? Well, untangling its implementation, it relies on a field m_IntCompleted in the LazyAsyncResult object. The field m_IntCompleted stores two pieces of information: a) whether the async result completed synchronously, and b) how many times ProtectedInvokeCallback() has been called (either directly, or via InvokeCallback(), which calls it). The count is most useful for ensuring that when the async result finally does complete, the user callback gets called exactly once.

Anyway, InternalPeekCompleted is, in short, designed to return true not exactly whenever the async result is completed, but rather when the user's callback has been dispatched or the user has otherwise observed the event to be completed. It gets used as a general book-keeping aid by various classes that know about the LazyAsyncResult implementation and want to e.g. keep track of their own internal state without calling IsCompleted, which would have an undesirable side effect. 'IsCompleted' is rather special: as soon as any external observer of the LazyAsyncResult has read the 'IsCompleted' property, every thread sees a consistent story from then on, i.e. reading it 'locks in' a truth value for synchronous completion.
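Here is a simplified model of how one int can carry both pieces of information, and why peeking differs from calling IsCompleted. It follows my reading of the source, but treat it as a sketch under assumptions: the class name CompletionState and the method TryComplete are invented for the example, and the real class wraps much more logic around this.

```csharp
using System;
using System.Threading;

// Hypothetical, simplified model of a packed completion field.
public sealed class CompletionState
{
    private const int HighBit = unchecked((int)0x80000000);

    // Sign bit: an observer looked before completion (so not synchronous).
    // Remaining bits: how many times completion has been attempted.
    private int _state;

    // Framework-style peek: reads the count and changes nothing.
    public bool PeekCompleted => (Volatile.Read(ref _state) & ~HighBit) != 0;

    // Public-style check: if nothing has happened yet, set the sign bit so
    // the operation can no longer be reported as synchronous.
    public bool IsCompleted
    {
        get
        {
            int s = Volatile.Read(ref _state);
            if (s == 0) s = Interlocked.CompareExchange(ref _state, HighBit, 0);
            return (s & ~HighBit) != 0;
        }
    }

    public bool CompletedSynchronously
    {
        get
        {
            int s = Volatile.Read(ref _state);
            if (s == 0) s = Interlocked.CompareExchange(ref _state, HighBit, 0);
            return s > 0; // completed, and the sign bit was never set
        }
    }

    // Returns true only for the first caller, so a callback guarded by this
    // check runs exactly once.
    public bool TryComplete()
    {
        return (Interlocked.Increment(ref _state) & ~HighBit) == 1;
    }
}
```

The design point is that PeekCompleted is safe for internal book-keeping because it never flips the sign bit, whereas the first early read of IsCompleted or CompletedSynchronously commits the object to reporting 'asynchronous' forever after.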

That leaves:

  • internal void InternalCleanup() - "A general interface that is called to release unmanaged resources associated with the class. It completes the result but doesn't do any of the notifications."
  • internal object InternalWaitForCompletion() - "If [AsyncWaitHandle] is used, the [manual reset event] cannot be disposed because it is under the control of the application. Internal should use InternalWaitForCompletion instead - never AsyncWaitHandle"

The main learning I see here is that InternalWaitForCompletion comes with benefits that echo InternalPeekCompleted. Both are framework-internal substitutes for a public member, and both know the difference between an 'external observer' viewing the completion state of the async result and a 'piece of the framework' peeking at that completion state (or waiting upon it). That distinction lets the class be more efficient about whether to create wait handles, whether to clean them up, and whether to bother presenting a universally consistent story about whether the event completed synchronously or not. The framework can be assumed to a) not do anything silly like wait on the wait handle after the event has actually completed, and b) be robust to the possibility that the user doesn't know yet whether the event is completing synchronously. (Presumably the framework code verifiably does it correctly - if only all user code were also easily verifiable to do the same!)

There's one really nice simplification in the API that the framework gives itself. In the framework-internal world, LazyAsyncResult only has to implement 'wait for the event', and can implement that however it likes. In the public code scenario, LazyAsyncResult has a much more demanding spec: do it with a WaitHandle or bust! (D'oh.)

So TL;DR: LazyAsyncResult is just a reusable 'IAsyncResult' implementation for the framework to use, and it comes with a bunch of commonly useful optimizations. Their benefit comes partly from optimizing common user scenarios, and partly from the framework knowing that it can use more specialized APIs than the IAsyncResult interface in its own implementation (when the IAsyncResult way would be suboptimal).