The implementation of anonymous methods in C# and its consequences (part 1)


You may not even have realized that there are two types of anonymous methods. I'll call them the easy kind and the hard kind, not because they're actually easy and hard for you the programmer, but because they are easy and hard for the compiler.

The easy kind is the anonymous method that doesn't use any local variables from its lexically-enclosing method. These are anonymous methods that could have been their own separate member functions; all the anonymization does is save you the trouble of coming up with names for them:

class MyClass1 {
 int v = 0;
 delegate void MyDelegate(string s);

 MyDelegate MemberFunc()
 {
  int i = 1;
  return delegate(string s) {
          System.Console.WriteLine(s);
         };
 }
}

This particular anonymous method doesn't access any MyClass1 members, nor does it access the local variables of the MemberFunc function; therefore, it can be converted to a static method of the MyClass1 class:

class MyClass1_converted {
 int v = 0;
 delegate void MyDelegate(string s);

 // Autogenerated by the compiler
 static void __AnonymousMethod$0(string s)
 {
  System.Console.WriteLine(s);
 }

 MyDelegate MemberFunc()
 {
  int i = 1;
  return __AnonymousMethod$0;
  // which is in turn shorthand for
  // return new MyDelegate(MyClass1.__AnonymousMethod$0);
 }
}

All the compiler did was give your anonymous method a name and use that name in place of the "delegate (...) { ... }". (Note that all compiler-generated names I use here are purely illustrative. The actual compiler-generated name will be something different.)

On the other hand, if your anonymous method uses the this parameter, then that makes it an instance method instead of a static method:

class MyClass2 {
 int v = 0;
 delegate void MyDelegate(string s);

 MyDelegate MemberFunc()
 {
  int i = 1;
  return delegate(string s) {
          System.Console.WriteLine("{0} {1}", v, s);
         };
 }
}

The anonymous method in MyClass2 uses the this keyword implicitly (to access the member variable v). Therefore, the conversion is to an instance member rather than to a static member.

class MyClass2_converted {
 int v = 0;
 delegate void MyDelegate(string s);

 // Autogenerated by the compiler
 void __AnonymousMethod$0(string s)
 {
  System.Console.WriteLine("{0} {1}", v, s);
 }

 MyDelegate MemberFunc()
 {
  int i = 1;
  return this.__AnonymousMethod$0;
  // which is in turn shorthand for
  // return new MyDelegate(this.__AnonymousMethod$0);
 }
}

So far, we've only dealt with the easy cases. The transformation is local and not particularly complicated. These are the sorts of transformations you could make yourself without too much difficulty in the absence of anonymous methods.
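To make the "you could do this yourself" point concrete, here is a minimal hand-written sketch of both easy cases. The names EasyCases, StaticHelper, and InstanceHelper are made up for illustration, and this is not what the compiler emits. It shows only that a delegate built from a static method carries no target object, while one built from an instance method carries the object it was created from.

```csharp
using System;

class EasyCases
{
    int v = 42;
    delegate void MyDelegate(string s);

    // Hand-written stand-in for the first easy case: touches no instance state.
    static void StaticHelper(string s)
    {
        Console.WriteLine(s);
    }

    // Hand-written stand-in for the second easy case: reads the field v through this.
    void InstanceHelper(string s)
    {
        Console.WriteLine("{0} {1}", v, s);
    }

    // A delegate created from a static method has a null Target.
    public static bool StaticDelegateHasNoTarget()
    {
        MyDelegate d = StaticHelper;
        return d.Target == null;
    }

    // A delegate created from an instance method remembers that instance.
    public static bool InstanceDelegateTargetsItsObject()
    {
        EasyCases c = new EasyCases();
        MyDelegate d = c.InstanceHelper;
        return ReferenceEquals(d.Target, c);
    }

    static void Main()
    {
        Console.WriteLine(StaticDelegateHasNoTarget());        // True
        Console.WriteLine(InstanceDelegateTargetsItsObject()); // True
    }
}
```

The only thing the delegate has to carry in the instance case is the this reference, which is why the transformation stays local.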

The hard case is where things get interesting. The body of an anonymous method is permitted to access the local variables of its lexically-enclosing method, in which case the compiler needs to keep those variables alive so that the body of your anonymous method can access them. Here's a sample anonymous method that accesses local variables from its lexically-enclosing method:

class MyClass3 {
 int v = 0;
 delegate void MyDelegate(string s);

 MyDelegate MemberFunc()
 {
  int i = 1;
  return delegate(string s) {
          System.Console.WriteLine("{0} {1} {2}", i++, v, s);
         };
 }
}

In this example, the anonymous method prints "1 v s" the first time it is called, then "2 v s" the second time it is called, and so on, with the integer increasing by one. (And where v s are the current values of v and s, of course.) This happens because the i variable that the anonymous method is accessing is the same one each time, and it's the same i that the MemberFunc method was using, too. If the function were rewritten as

class MyClass4 {
 int v = 0;
 delegate void MyDelegate(string s);

 MyDelegate MemberFunc()
 {
  int i = 0;
  MyDelegate d = delegate(string s) {
          System.Console.WriteLine("{0} {1} {2}", i++, v, s);
         };
  i = 1;
  return d;
 }
}

the behavior would be the same as in MyClass3. The creation of the delegate from the anonymous method does not make a copy of the i variable; changes to the i variable in the MemberFunc are visible to the anonymous method because both are accessing the same variable.
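Here is a small runnable sketch of that sharing, assuming a simplified delegate that returns the counter instead of printing it. The names SharedLocalDemo, Counter, and MakeCounter are made up for illustration. It follows the MyClass4 pattern: the local is reassigned after the delegate is created, and the delegate sees the new value.

```csharp
using System;

class SharedLocalDemo
{
    delegate int Counter();

    // Mirrors MyClass4: i is reassigned after the delegate has been created.
    static Counter MakeCounter()
    {
        int i = 0;
        Counter d = delegate { return i++; };
        i = 1;      // the delegate sees this write: same variable, not a copy
        return d;
    }

    public static int[] Run()
    {
        Counter a = MakeCounter();
        Counter b = MakeCounter();   // a second call gets its own, separate i
        return new int[] { a(), a(), b() };
    }

    static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0} {1} {2}", r[0], r[1], r[2]);  // 1 2 1
    }
}
```

Note that the sharing is per call of the enclosing method: two calls to MakeCounter produce two independent counters.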

When faced with this "hard" type of anonymous method, wherein variables are shared with the lexically-enclosing method, the compiler generates a helper class:

class MyClass3_converted {
 int v = 0;
 delegate void MyDelegate(string s);

 // Autogenerated by the compiler
 class __AnonymousClass$0 {
  MyClass3_converted this$0;
  int i;
  public void __AnonymousMethod$0(string s)
  {
    System.Console.WriteLine("{0} {1} {2}", i++, this$0.v, s);
  }
 }

 MyDelegate MemberFunc()
 {
  __AnonymousClass$0 locals$ = new __AnonymousClass$0();
  locals$.this$0 = this;
  locals$.i = 1;
  return locals$.__AnonymousMethod$0;
  // which is in turn shorthand for
  // return new MyDelegate(locals$.__AnonymousMethod$0);
 }
}

Wow, there was a lot of rewriting this time. A helper class was created to contain the local variables that were shared between the MemberFunc function and the anonymous method (in this case, just the variable i), as well as the hidden this parameter (which I have called this$0). In the MemberFunc function, access to that shared variable is done through this helper class, and the anonymous method that you wrote becomes an ordinary named method on the helper class.

Notice that every assignment to i in MemberFunc now writes to the field inside locals$, which is the same object that the anonymous method will be using when it runs. That's why MyClass4 also prints "1 v s" the first time: its i = 1 assignment becomes locals$.i = 1, so the value had already been changed to 1 by the time the delegate ran for the first time.

Those who have done a good amount of C++ programming (or C# 1.0 programming) are well familiar with this technique, since C++ callbacks typically are given only one context variable; that context variable is usually a pointer to a larger structure that contains all the complex context you really want to operate on. C# 1.0 programmers went through a similar exercise. The "hard" type of anonymous method provides syntactic sugar that saves you the hassle of having to declare and manage the helper class.
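For comparison, here is roughly what doing it by hand looks like, as a sketch under assumptions rather than actual compiler output. The names ManualClosure, Locals, Owner, and Body are made up, and the delegate returns the formatted string instead of printing it so the result is easy to inspect.

```csharp
using System;

class ManualClosure
{
    int v = 7;
    delegate string MyDelegate(string s);

    // Hand-written stand-in for the compiler-generated helper class.
    class Locals
    {
        public ManualClosure Owner;  // plays the role of this$0
        public int i;                // the local variable, hoisted into a field

        public string Body(string s)
        {
            return string.Format("{0} {1} {2}", i++, Owner.v, s);
        }
    }

    MyDelegate MemberFunc()
    {
        Locals locals = new Locals();
        locals.Owner = this;
        locals.i = 1;
        return new MyDelegate(locals.Body);
    }

    public static string[] Run()
    {
        MyDelegate d = new ManualClosure().MemberFunc();
        return new string[] { d("a"), d("b") };
    }

    static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
        // 1 7 a
        // 2 7 b
    }
}
```

The helper object outlives the call to MemberFunc because the delegate holds a reference to it, which is what keeps the captured variable alive.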

If you thought about it some, you'd have realized that the way it's done is pretty much the only way it could have been done. It turns out that most computer programming doesn't consist of being clever or making hard decisions. You just have one kernel of an idea ("hey let's have anonymous methods") and then the rest is just doing what has to be done, no actual decisions needed. You just do the obvious thing. Most programming consists of just doing the obvious thing.

Okay, so that's a quick introduction to the implementation of anonymous methods in C#. Mind you, this information isn't just for your personal edification. It's actually important that you understand how this works (and not just treat it as "magic"), because lack of said understanding can lead to subtle programming errors. We'll look at those types of errors over the next few days.

Update: This behavior changed in Visual Studio 2015 with the switch to the Roslyn compiler. For performance reasons, anonymous methods are now always instance methods, even if they capture nothing.
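If you want to see which strategy your own compiler uses, a small probe like the following can report it. This is only a sketch: the names Probe and Describe are made up, and the output is deliberately not shown because it depends on the compiler version.

```csharp
using System;

class Probe
{
    delegate void MyDelegate(string s);

    // Reports how the compiler chose to implement a non-capturing anonymous method.
    public static string Describe()
    {
        MyDelegate d = delegate(string s) { Console.WriteLine(s); };
        string target = d.Target == null ? "(none)" : d.Target.GetType().Name;
        return string.Format("IsStatic={0}, Target={1}", d.Method.IsStatic, target);
    }

    static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

The probe reads only Delegate.Method and Delegate.Target, so it does not rely on any particular compiler-generated name.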

Comments (50)
  1. Anonymous says:

    This is why Java went with "anonymous inner classes can only access final parameters and local variables".  No subtle side-effects here!

    If C#2.0 allows you to write subtly incorrect code, why not just get rid of that damned "can’t declare the same variable in nested scopes" compiler error too?

  2. Anonymous says:

    I thought the object of objects was that they could be used without knowledge of their internals.  It seems here that you need to know the internals of BOTH the objects you are using and the object compiler.  This is progress?  Looks to me to be nothing but fire and motion to distract the competition.

    I would much rather do structured and object oriented coding in C where I can know what is going on because I explicitly made it work that way.  Oh wait.  That’s what I have been doing for almost 15 years.  Never mind.

    Thanks for reminding me of why I still don’t want to use C++ or C#.  

  3. Anonymous says:

    Allowing the lambada function to access data in the local scope of the function seems like it’s the wrong answer. The lambada exists longer than the local variable so it would be a scoping issue. Allowing it just seems to be asking for trouble.

    Not that I would have even known that it did that; that is not something I would ever try myself. It just seems to be asking for trouble.

  4. Anonymous says:

    tylar – teh scopign isue is resolved by the rule that a varable exists as long as sombody has a refrance to it. easy. i mean ‘easy’ in teh abstarct.

    whe’re it gets weird is if youve got multipal nestad scopes invloved. say a closuare returnign a closure waht raturns a colsure. u can have moare then one closure runign around loose with reefrences to theh same object which was mabe orignaly on the stack. so if that objects a int or somthing the compilar also has to know to put on the heep an onyl put a refrence on the stack. so it can outlive the stdackframe.

    wondar if u can do thatwith a registar varable? hyuk huyk hyuk!

  5. Anonymous says:

    linel – why do u need to know the intrenals? use it an it wroks. how meany poeople who wriate vrtual functoins or use multaple in heritance know about vtables?

  6. zahical says:

    Funny thing, that the argument of ‘not knowing what is happening’ is being used to bash C++ or C#. I think that, for the sake of the argument, one can say that we lost track of what’s happening the moment we stopped programming in assembler or the /O2 switch appeared on the command line of the compiler. Or possibly, the moment when everyone stopped developing their own OSes – just to keep things in control.

    And on the other hand – what’s not to know in the example with the anonymous delegates? I know exactly what is happening – the local variable referenced in the body of the anonymous delegate and in the body of its lexically-enclosing method is one and the same, and so any changes to it are visible in both places.

    MS could change the implementation someday but if the end result is the same I’d still know what’s happening.

    You don’t need to know the internals of this to know what’s happening. Of course, the internals are very interesting and I think they illustrate very well Raymond’s point about the “kernel idea” and then just “doing what has to be done”.

    Anyway, judging by yesterday’s and today’s comments it actually turned out that this actually is ‘not actually a .NET blog’. :-)

  7. Anonymous says:

    The inner class should have access to the outer class’s members and functions, regardless of protection (public, private, etc.)  If you consider the outer class as a translation unit, then that class’s variables and functions can be considered "global" to the inner class.   The outer class should not have access to the inner class’s members just as global functions do not have access to a class’s private members and functions (unless explicitly provided).

    In Raymond’s "hard" example where the delegate accesses local variables, thus causing compiler generation of anonymous classes, I don’t see how this is any different from a closure.  But I suppose that if there are two things that are hard for programmers to get (besides pointers and recursion — but that’s a different story) it’s closures and coroutines.

    With regard to Matt’s complaint about serialization, I agree that is annoying.  I’d also say that it was a valid language/compiler design choice.  If you have the compiler automatically make closure/anon classes serializable, it will likely impose limits on the types of the data members in said class.  As a language designer, the tradeoff would be making anon classes serializable but limited, or making anon classes unlimited but classes using delegates with anonymous classes unserializable.

    In the end, the anon classes are just syntactic sugar anyway, so why not just make named serializable delegates?

  8. Anonymous says:

    I appreciate your C# insights, you have a skill at explaining things.

    However, I have to agree with many people that I don’t completely grasp /why/ anonymous methods are so great. On a technical level what is gained over named delegates/ methods or just manually inlining the code? I guess I view it as another tool in my toolbox that I don’t understand why it exists.

  9. Anonymous says:

    Is this actually how the CLR (or C#… is this language specific?) implements closures? Or only conceptually? Doesn’t it use a more compact representation?

    I can see why having a "complete" class with the environment as member variables may be better by "playing nice" (or "playing by the book") with the CLR, so this is not a critique, only a question…

  10. Anonymous says:

    I have apparently hit a few hot buttons by saying that there are application classes that are not well solved by using .NET, C++, or C#.  They are good tools for low performance and simple CRUD centric applications on or off the internet and little else.  If that’s what you do, expect your job to be outsourced in a few months to a year, if not next week.  

    Meanwhile, I will accomplish my goals my way.

  11. Anonymous says:

    Funny thing, that the argument of ‘not knowing what is happening’ is being used to bash C++ or C#.

    I find it funnier when MSDN already has a growing collection of articles on the intricacies of C#, yet some people claim that C++ is an abomination purely because there are some subtle details.

    I’m not sure if "HA HA HA"’s badly spelled rants are entirely serious – but fanboys are always funny, regardless of the object of their affection.  ;)

  12. Anonymous says:

    Lionell: I think you missed the point of the article, which is about anonymous functions, and nothing to do with objects really.  The same idiosyncrasies come up in other languages like LISP, which is functional, and has closures/lambda functions.  Your little rant didn’t make much sense in context.

    TW: I tend to agree that anon functions bring in a lot more complexity than they’re worth, at least from my experience, true closures can be pretty nasty.  Having the compiler generate anonymous functions for simple wrappers can be very handy though, if for nothing else to prevent the namespace from getting cluttered with trivial functions.

  13. Anonymous says:

    I am reminded of why I still don’t want to use C every time I call malloc, every time I want to concatenate two strings (or for that matter, every time I want to model a 1:n relationship without fixing the value of n to some hard-coded constant), every time I want to deep copy a struct, and every time I check the same damn error code through ten call stack frames.

    It’s not that we don’t know how to program in C.  We do.  We know how bad it sucks.  We know that a C program can be replaced by an equivalent C++ or C# program in half the number of lines of code or less, without sacrificing performance (re-read Raymond Chen’s series on building a Chinese / English dictionary).  We know that complexity scales exponentially with LOC; that’s why we want the compiler to generate all that boilerplate, instead of fetishizing our ignorance and insisting on using an exacto knife for a job that requires a chainsaw.

    Lionell, if you can’t be bothered to learn something new after 15 years, perhaps you should be the one worried about your future employability.

  14. Anonymous says:

    ccx:

    I’m not opposed to anon methods. As you said I don’t understand them so I have trouble seeing the usefulness. When I first started programming I thought the same thing about function pointers (functors, delegates, whatever). I now love them.

    I think part of the one-sidedness of the comments could just show who reads "The Old New Thing". Namely, C/ C++ coders, not C#.

    Tools change, sometimes for the better, sometimes not. More tools are not a bad thing as long as they make the builders more productive and they enjoy using them.

  15. Anonymous says:

    How does this work in a function containing multiple anonymous functions sharing a subset of local variables? I’d guess there’d either have to be one anonymous class created which shares all local variables referenced by any anonymous functions, or else a tree of anonymous classes referencing others depending on scope. The first seems like it could cause some massive memory leaks you wouldn’t expect, but the second would be harder for the compiler and could easily generate some horrible code for all the dereferences in edge cases.

    (I’m not a C# programmer, but enjoyed the article btw…)

  16. Anonymous says:

    A notable point which annoys the hell out of me but which is beautifully illustrated by your post is :

    // Autogenerated by the compiler
    class __AnonymousClass$0 {
     MyClass this$0;

    Anyone who’s ever serialized a large object graph via binary or soap formatters will spot it pretty quick. [Serializable] is missing.
    So any object graph containing one of these won’t work.

    Really, **really** annoying.

    It is worth therefore being very aware of this distinction between ‘easy’ and ‘hard’ if you are going to use anonymous delegates and you ever use the invasive* serialization options.

    My team is probably going to have a look at whether we can binary hack our dll’s on loading (we already use custom assembly resolvers so it’s not as big a step as it sounds) to fake in the Serializable attributes at runtime.
    The ability to checkpoint almost the entire application to an arbitrary stream is *very* tempting.

    Not intended as a rant – just as an additional warning to anyone looking into this functionality and its side effects.

    * (as opposed to the public properties / or special interface only XmlSerialization)

    [I don’t know what attributes the compiler autogenerates. This was an informal discussion, not a specification. -Raymond]

  17. Anonymous says:

    Gosh!

    From what I used to see on other sites about programming languages, I thought people would say: "Great, finally C# has this new cool feature (closures)!". Instead, what I see is people bashing a useful feature just because they don’t quite understand it. Maybe recursion and virtual functions received the same bad looks when they were introduced…

    Am I the only one who **likes** when a programming language has more features? Or does everybody prefer minimalistic languages where you have to reimplement everything (including pass-by-reference, as in Java)?

  18. Anonymous says:

    The first seems like it could cause some massive memory leaks you wouldn’t expect

    Two words: Garbage collection.

  19. Anonymous says:

    Anonymous: The point of bok is more subtle, if anon method a keeps a reference to a local variable that’s a huge dictionary z, and anon method b is a simple "return x + y" method, and b is shared by both anon functions, then the reference to z would be kept as long as there was any reference to anon method b. Not so good, or at least a real issue. It could be one of the cases where some knowledge of the actual implementation is needed.

  20. Anonymous says:

    [I don’t know what attributes the compiler auto generates. This was an informal discussion, not a specification. -Raymond]

    Sorry I wasn’t trying to pick at anything in your example – your example was completely correct in not having the attribute.

    I was just pointing out the limitations imposed by the silent hoops the compiler jumped through in the hard case.

    I agree simply adding it willy nilly would also be unrealistic since it would needlessly constrain things in another (far more important) direction.

    For the record I love anon delegates for two principal reasons neither of which rely on the ‘hard’ case.

    As already mentioned by others they reduce the need for a separate function name (which as an event handler callback is probably not something you should be calling directly).

    But better still for a very small function (hopefully one or two lines) they provide greater ‘cohesion’ between the event subscription and the desired behaviour when it executes. I love this and find, when used well, it dramatically improves the readability of the code. I find reading the auto generated names from the designer a little ugly (but barring anon delegates have no better solution to offer :)

  21. Anonymous says:

    TW:

    One great use for anonymous delegates is unit testing event handlers; each test can be self contained and not require any additional methods to handle the event.

  22. Anonymous says:

    CN: I don’t really understand… Assuming that’s something like this:

     BigDictionary z;

     anon_a = delegate (x) { return z[x]; }

     anon_b = delegate (x,y) { return x + y; }

    And assuming that both anon_a and anon_b are returned, the compiler will generate one wrapper class and use it for anon_a, but anon_b falls into what Raymond classified as the easy case.

    Unless anon_b directly references either z or anon_a, it is unrelated to z, therefore has no impact on the lifetime of z.

    It is easier if you think that each closure (delegate…) has an internal hashtable with the variables that it refers to and that are not parameters. This hashtable maps the name of the variable to a reference to it. The wrapper class thingy is only an implementation detail of C#/CLR.

  23. Anonymous says:

    Azrael: I don’t believe that was what CN meant. My original point was more the following situation:

    int w;

    BigDictionary z;

    anon_a = delegate (x) { return z[x + w++]; }

    anon_b = delegate (x) { return x + w++; }

    if( something )

    {

       return anon_a;

    }

    else

    {

       return anon_b;

    }

    Neither anonymous function fits into the easy case, because both reference ‘w’. If anon_b happens to be returned, does the anonymous class include a live reference to ‘z’ as well? There can’t be different anonymous classes which both contain ‘w’, in case either delegate modifies it.

  24. Anonymous says:

    Here you can read some articles extracted from my book ‘Practical .NET2 and C#2’ where I explain the ‘under the hood’ of iterators in C#2 and how they are related to anonymous methods:

    http://www.theserverside.net/tt/articles/showarticle.tss?id=AnonymousMethods

    http://www.theserverside.net/tt/articles/showarticle.tss?id=IteratorsWithC2

  25. Anonymous says:

    Allowing the lambada function to access data in the local scope of the function seems like it’s the wrong answer.

    Is that the forbidden dance function?  (http://en.wikipedia.org/wiki/Lambada)

  26. Anonymous says:

    A three-part article has appeared on the oldnewthing blog about the implementation of anonymous methods in…

  27. Anonymous says:

    CN/Azrael/bok, the page Raymond linked to in a later entry seems to suggest that the comparative lifetime depends on the scope the variable is declared in. This means that you could declare z in an inner scope to w so that anon_b wouldn’t need to keep a reference to z.

  28. Anonymous says:

    Raymond wrote a really nice series of posts on this:

    Part 1

    Part 2

    Part 3

    He also points out that…

  29. Anonymous says:

    This is an interesting thread, one that reminds me strongly of discussion threads I was involved in back in the ’70s when we (a group of developers) were trying to get a grasp of closures and continuations in Scheme. I would strongly recommend that people interested in learning more about these take a look at Guy Steele’s "The Lambda Papers" (http://library.readscheme.org/page1.html). An excellent introduction to the power of these ideas is Dan Friedman’s "Little Schemer" book.

    These are certainly not concepts that are either new or Microsoft specific. They certainly are very powerful concepts and I for one am tickled pink that they are showing up in languages such as C#.

    If you don’t take the time and effort to understand them they will remain "quirky" facets of the language. If you *do* take the time then I believe your code will be the better for it.

    Regards,

    Bill

  30. Anonymous says:

    You’ve been kicked (a good thing) – Trackback from DotNetKicks.com

  31. Anonymous says:

    An anonymous method is of course not anonymous at all; how else would you find it at runtime if it were?

  32. Anonymous says:

    In place of an epigraph: information hiding (1) In programming, the process of hiding details of an object or

  33. Anonymous says:

    This post assumes that you understand how closures are implemented in C#. They’re implemented in essentially

  34. Anonymous says:

    I read a nice and not too complicated post regarding the "behind the scenes" of anonymous methods. I

  35. Anonymous says:

    One of the most useful features of .NET 2.0 is anonymous delegates. They allow you to create "wrappers"

  36. Anonymous says:

    One of my favorite new features for Code Analysis in Visual Studio 2008 is our support for analyzing

  37. Anonymous says:

    The post I wrote yesterday about Expression Trees has inspired to find some more cool usages for this

  38. Anonymous says:

    On the advice of Jay Wren , I decided to try our ReSharper 4.1 .  I had previously installed DevExpress

Comments are closed.