Anonymous Methods, Part 1 of ?

Although I was involved in many of the language design meetings, I don't think I can claim much of the credit for the actual design of the feature in the language spec.  There were a few technical gotchas that I discovered and proposed solutions to, and I'll try to remember to point them out as appropriate. 

First of all, although anonymous methods can be compared to several other features in several languages, I think they are different enough to really be classified as their own beast (if somebody does know of another language that really does have something like anonymous methods, please enlighten me).  Functional programmers will be sad when they learn that the captured locals are read/write.  Anonymous methods are different from regular delegates because of their ability to capture locals.  They aren't exactly anonymous classes (although with careful knowledge of how the compiler works you could use them as such).

So what are anonymous methods?  They are a way of writing an unnamed nested method that, just like in most languages that have nested methods, allows access to all of the outer method's locals and parameters, including the 'this' parameter.  Here's where we hit the first gotcha.  In order to allow access to the outer method's locals and parameters, we have to move them off of the stack and onto the heap so they can live beyond the method's execution and still be verifiable (no pointer arithmetic or passing of stack frame pointers into nested methods).  Well, for starters, we can't move reference parameters onto the heap, hence the restriction that ref or out parameters cannot be captured (used inside anonymous methods).  If you do need to do something like that, the suggested pattern is to make the code explicitly perform copy-in and/or copy-out semantics by assigning to or from a local of the same type.

Example:


void SomeMethod(ref int ri)
{
    ...
    delegate { ri++; }  // This use is illegal because some users might expect
                        // the reference to outlive the method, and thus whenever
                        // they invoke the delegate some integer someplace gets incremented
    int copy_of_i = ri;
    ...
    delegate { copy_of_i++; }  // This is legal because now the programmer is
                               // forced to realize the delegate can't mess with ri
    ...
    // Then if the delegate is invoked, or copy-out is required
    ri = copy_of_i;
}
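For anyone who wants something that compiles, here is a minimal sketch of the same copy-in/copy-out pattern. The delegate type Bump and the names CopyInCopyOutDemo, IncrementTwice, and MakeBumper are made up for this illustration; they are not part of any library.

```csharp
using System;

// Hypothetical delegate type used only for this illustration.
delegate void Bump();

static class CopyInCopyOutDemo
{
    // Copy-in/copy-out around an anonymous method; 'ri' itself is never captured.
    public static void IncrementTwice(ref int ri)
    {
        int copyOfI = ri;                       // copy-in
        Bump bump = delegate { copyOfI++; };    // captures the local, not the ref parameter
        bump();
        bump();
        ri = copyOfI;                           // explicit copy-out
    }

    // The delegate outlives this call, but it only ever touches the copy.
    public static Bump MakeBumper(ref int ri)
    {
        int copyOfI = ri;                       // copy-in, with no copy-out at all
        return delegate { copyOfI++; };
    }

    public static void Main()
    {
        int x = 5;
        IncrementTwice(ref x);
        Console.WriteLine(x);   // 7

        Bump later = MakeBumper(ref x);
        later();
        Console.WriteLine(x);   // still 7: the delegate incremented its own copy
    }
}
```

The second method is the case the restriction protects against: once MakeBumper has returned there is nowhere left to copy back to, so the caller's variable is never affected.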


Now to complicate things even further, we have to remember that the 'this' pointer of a struct is passed as if it were a reference parameter.  That is why anonymous methods inside structs cannot access any instance members, while their counterparts inside classes can.  Again, the suggested pattern is to do a copy-in or copy-out as needed.  The language designers briefly toyed with having the compiler automatically inject the copy semantics, but then what happens when the real struct is modified on another thread?  Suddenly the user is able to perceive the difference between passing-by-reference and copy-in/copy-out.  Then there was also the question of where the copy should occur, and what happens when the delegate/method is invoked long after the outer method has returned: then there is no place to do the copy-out!  In the end we figured this was the safest route because it made it crystal clear what the compiler was going to do, and the programmer would be forced into thinking about when and if a copy-out was needed.  I think that was probably my biggest contribution to anonymous methods.
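Here is a small sketch of what that manual copy looks like inside a struct. The Counter struct, the Reader delegate type, and the method names are invented for this example.

```csharp
using System;

// Hypothetical delegate type used only for this illustration.
delegate int Reader();

struct Counter
{
    public int Count;

    public Reader MakeReader()
    {
        // An anonymous method in a struct cannot touch instance members,
        // so copy in whatever the delegate needs first.
        int snapshot = Count;                   // copy-in of the field
        return delegate { return snapshot; };   // captures the local copy only
    }

    public void AddViaDelegate(int amount)
    {
        int local = Count;                      // copy-in
        Reader add = delegate { local += amount; return local; };
        add();
        Count = local;                          // explicit copy-out, done by hand
    }
}

static class StructCaptureDemo
{
    public static void Main()
    {
        Counter c = new Counter();
        c.Count = 1;
        Reader r = c.MakeReader();
        c.Count = 10;
        Console.WriteLine(r());     // 1: the delegate sees the snapshot, not the live struct
        c.AddViaDelegate(5);
        Console.WriteLine(c.Count); // 15
    }
}
```

Because the copy is written out by hand, it is obvious from the source that the delegate returned by MakeReader never observes later changes to the struct.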

Now earlier I mentioned that the compiler moves captured locals from the stack onto the heap.  As you might have already guessed, this involves allocation of a GC object.  The compiler works very hard to preserve the semantics of your original method, and as such tries to only create things when needed, but still in a very predictable manner.  To find out where an object will be created, simply look at the scopes.  Start at the outermost scope of the method (the open curly), and then work inward with this simple rule: if a scope contains any captured locals, the compiler will remove those locals from the generated method and replace them with fields in a new class (a display class).  A new instance of that class will be created on each entry into that scope (so loops like for, foreach, do, and while will get new locals for each iteration, but gotos within a scope will not).  So if you have 5 scopes, each with one or more captured locals, you'll get 5 display classes.  So choose carefully which locals you capture and where they are declared.  In the outer method, accessing a captured local becomes a single indirection through a display class.  From inside the anonymous method, the local is often accessed as a member of 'this'.

The other performance gain that the compiler strives for is to eliminate extra delegate creations.  To accomplish that, the compiler creates another local to cache the created delegate.  Thus if your anonymous method appears in the middle of a tight hot loop, as long as you don't capture any locals declared inside the loop, the constructed delegate will be cached outside the loop, so it will only be created once and successive loop iterations will simply re-use it.
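The scope rule is easiest to see from the outside, by watching which delegates share a local and which get their own. The sketch below only shows the observable behavior; it does not show the generated display classes themselves, and the Getter delegate type and the method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical delegate type used only for this illustration.
delegate int Getter();

static class ScopeCaptureDemo
{
    // The captured local is declared in the loop body's scope, so each
    // entry into that scope (each iteration) gets a fresh instance.
    public static List<Getter> CapturePerIteration()
    {
        List<Getter> getters = new List<Getter>();
        for (int i = 0; i < 3; i++)
        {
            int inner = i * 10;                 // declared inside the loop scope
            getters.Add(delegate { return inner; });
        }
        return getters;
    }

    // The captured local is declared in the method's outermost scope, so
    // every delegate shares the same instance.
    public static List<Getter> CaptureShared()
    {
        List<Getter> getters = new List<Getter>();
        int outer = 0;                          // declared outside the loop scope
        for (int i = 0; i < 3; i++)
        {
            outer = i * 10;
            getters.Add(delegate { return outer; });
        }
        return getters;
    }

    public static void Main()
    {
        foreach (Getter g in CapturePerIteration())
            Console.Write(g() + " ");           // 0 10 20
        Console.WriteLine();
        foreach (Getter g in CaptureShared())
            Console.Write(g() + " ");           // 20 20 20
        Console.WriteLine();
    }
}
```

The first method pays for a new heap object on every iteration; the second allocates once but has every delegate observing the same variable. Which one you want depends on the semantics you need, which is exactly why where a captured local is declared matters.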

So I guess my final point for this post will be: be careful when writing new code.  Anonymous methods can be very useful, and make certain programming tasks very easy, but “with great power comes great responsibility”, because these wonderful anonymous methods, if used improperly, can literally kill performance...

--Grant