The implementation of iterators in C# and its consequences (part 1)

Like anonymous methods, iterators in C# are very complex syntactic sugar. You could do it all yourself (after all, you did have to do it all yourself in earlier versions of C#), but the compiler transformation makes for much greater convenience.

The idea behind iterators is that they take a function with yield return statements (and possible some yield break statements) and convert it into a state machine. When you yield return, the state of the function is recorded, and execution resumes from that state the next time the iterator is called upon to produce another object.

Here's the basic idea: All the local variables of the iterator (treating iterator parameters as pre-initialized local variables, including the hidden this parameter) become member variables of a helper class. The helper class also has an internal state member that keeps track of where execution left off and an internal current member that holds the object most recently enumerated.

class MyClass {
 int limit = 0;
 public MyClass(int limit) { this.limit = limit; }

 public IEnumerable<int> CountFrom(int start)
  for (int i = start; i <= limit; i++) {
   yield return i;

The CountFrom method produces an integer enumerator that spits out the integers starting at start and continuing up to and including limit. The compiler internally converts this enumerator into something like this:

 class MyClass_Enumerator : IEnumerable<int> {
  int state$0 = 0;// internal member
  int current$0;  // internal member
  MyClass this$0; // implicit parameter to CountFrom
  int start;      // explicit parameter to CountFrom
  int i;          // local variable of CountFrom

  public int Current {
   get { return current$0; }

  public bool MoveNext()
   switch (state$0) {
   case 0: goto resume$0;
   case 1: goto resume$1;
   case 2: return false;

   for (i = start; i <= this$0.limit; i++) {
    current$0 = i;
    state$0 = 1;
    return true;

   state$0 = 2;
   return false;
  ... other bookkeeping, not important here ...

 public IEnumerable<int> CountFrom(int start)
  MyClass_Enumerator e = new MyClass_Enumerator();
  e.this$0 = this;
  e.start = start;
  return e;

The enumerator class is auto-generated by the compiler and, as promised, it contains two internal members for the state and current object, plus a member for each parameter (including the hidden this parameter), plus a member for each local variable. The Current property merely returns the current object. All the real work happens in MoveNext.

To generate the MoveNext method, the compiler takes the code you write and performs a few transformations. First, all the references to variables and parameters need to be adjusted since the code moved to a helper class.

  • this becomes this$0, because inside the rewritten function, this refers to the auto-generated class, not the original class.

  • m becomes this$0.m when m is a member of the original class (a member variable, member property, or member function). This rule is actually redundant with the previous rule, because writing the name of a class member m without a prefix is just shorthand for this.m.

  • v becomes this.v when v is a parameter or local variable. This rule is actually redundant, since writing v is the same as this.v, but I call it out explicitly so you'll notice that the storage for the variable has changed.

The compiler also has to deal with all those yield return statements.

  • Each yield return x becomes
     current$0 = x;
     state$0 = n;
     return true;

    where n is an increasing number starting at 1.

And then there are the yield break statements.

  • Each yield break becomes
     state$0 = n2;
     return false;

    where n2 is one greater than the highest state number used by all the yield return statements. Don't forget that there is also an implied yield break at the end of the function.

Finally, the compiler puts the big state dispatcher at the top of the function.

  • At the start of the function, insert
    switch (state$0) {
    case 0: goto resume$0;
    case 1: goto resume$1;
    case 2: goto resume$2;
    case n: goto resume$n;
    case n2: return false;

    with one case statement for each state, plus the initial zero state and the final n2 state.

Notice that this transformation is quite different from the enumeration model we built based on coroutines and fibers. The C# method is far more efficient in terms of memory usage since it doesn't consume an entire stack (typically a megabyte in size) like the fiber approach does. Instead it just borrows the stack of the caller, and anything that it needs to save across calls to MoveNext are stored in a helper object (which goes on the heap rather than the stack). This fake-out is normally quite effective—most people don't even realize that it's happening—but there are places where the difference is significant, and we'll see that shortly.

Exercise: Why do we need to write state$0 = n2; and add the case n2: return false;? Why can't we just transform each yield break into return false; and stop there?

Comments (33)
  1. Chris says:

    Because weird things might happen with MoveNext() if it’s called more than once?

  2. Chris says:

    (…Past the end of the enumerated collection?)

  3. Damien says:

    Because whatever state it was in might have a non-deterministic path that sometimes leads to a yield break, and other times to a yield return?

  4. Damien says:

    Ooh. Plus, whatever code is in the last but one state might have side effects (might be what Chris was implying)

  5. Gabe says:

    If you don’t change the state to n2, you can keep calling MoveNext() on it and always redo the last state.

  6. bahbar says:

    I believe Damien and Chris nailed it.

    MoveNext is supposed to keep returning false once it started to do so.

    From MSDN "If MoveNext passes the end of the collection, the enumerator is positioned after the last element in the collection and MoveNext returns false. When the enumerator is at this position, subsequent calls to MoveNext also return false"

    Imagine the for loop condition is  i != limit rather than <=… Then MoveNext will return false then start to return true again.

  7. hova says:

    What do you plan to do with all the excercises when you write your next book?   Put a workbook or test section at the end of each chapter?

    Why some teacher might even use that book to teach students with!

  8. Michael says:

    The C# method … just borrows the stack of the caller

    Interesting… When I first saw how iterators were implemented in C#, I saw the "yield" and immediately assumed coroutines.  So, I’m surprised to learn .NET iterators are really closer to Java iterator objects than "true" (coroutine-based or thread-based) iterators; it’s just that the nice syntax of C# hides this from us.

  9. EricLippert says:

    Indeed. You can use iterators as poor-mans coroutines; in fact, the managed robotics framework that Microsoft ships does exactly that!

    Thanks for doing this Raymond; this is better than our internal documentation. :-)  

  10. philq says:

    Iterators are a really great feature.  The only drawback is that making recursive calls in them is n^2.  We need a ‘yield foreach’ statement!  Eric please implement this feature immediately.

  11. Ajai Shankar says:

    Hi Raymond

    Strange I came across only now across this series on enumerators & fibers.

    I’d written an MSDN article way back in ’03 on how to get fiber based enumerators to work with .NET using an undocumented hosting API.

    Implementing Coroutines for .NET by Wrapping the Unmanaged Fiber API (

    Of course now there is a dire warning at the beginning of the article saying it is not supported by Microsoft :-)

    Ajai Shankar

  12. Igor Levicki says:

    Not actually a .Net blog but it might occasionally dip its toes in C# it seems.

    [Why is the blog’s subtitle “Not actually a .NET blog”? -Raymond]
  13. Ian says:

    .net week makes the baby jesus cry.

  14. Miral says:

    Gah, stuffed that last example up a bit (row => rowIndex, col => colIndex), and the formatting sucks.  Still, you get the idea :)  (Oh, for an edit button…)

  15. wtroost says:

    Thanks for the great post, I always wondered how the yield statement worked its magic!

  16. SRS says:

    I really wish Sun would put ‘yield’ into Java. Oh, and fix generics.

    Just saying.

  17. Miral says:

    My favourite iterator trick is the synchronised foreach:

    public IEnumerator<T> GetEnumerator()


     lock (SyncRoot)


        foreach (T item in BaseList)


           yield return item;




    (Not 100% sure that’s the right code; I did it from memory.)

    Of course there are drawbacks to doing something like the above, but it *does* mean that your foreach loops on that collection won’t get borked by other threads modifying the collection.  (It might kill your performance, though.)

    My second favourite trick is the way you can now define multiple kinds of enumeration trivially:

    // .. in some grid/matrix-like class:

    public IEnumerable<T> AcrossThenDown




       for (int rowIndex = 0; rowIndex < RowCount; ++rowIndex)


         for (int colIndex = 0; colIndex < ColCount; ++colIndex)


           yield return this[row, col];





    public IEnumerable<T> DownThenAcross




       for (int colIndex = 0; colIndex < ColCount; ++colIndex)


         for (int rowIndex = 0; rowIndex < RowCount; ++rowIndex)


           yield return this[row, col];





  18. Stuart Leeks says:

    This was only going to be two posts, but after my last post I’d been mulling over a post that looks at

  19. Rob says:

    Shouldn’t public IEnumerator<int> CountFrom(int start) [fourth line]


    public IEnumerable<int> CountFrom(int start)


    [Fixed. Thanks. -Raymond]
  20. Ian Marteens says:

    Actually, the translation get trickier when there are try/finally blocks, either explicit or implicit, in the iterator block. That’s because you must track iteration state in case the Dispose method of the enumerator must be called. That’s also the reason why try/catch is not allowed in iterators: translation becomes even trickier.

    There are a couple of enhancements that Microsoft could apply to iterators:

    1- With a good peephole optimizer, the target code generated for the compiler can be greatly simplified. Most compiled code have a lot of redundant state assignments that a peephole optimizer could erase. I have done some tests on this with Freya, my own pet language and compiler for .NET.

    2- A very simple extension for iterators: allow "yield" to receive a whole iterator call. It would translate as a foreach/yield combo statement, and it would be great for composing iterators, such as in recursive iterators. For simpler composite iterators, we have LINQ, of course.

    3- I have read a paper from Microsoft Research where they fiddle with a IEnumerable extension for handling recursive iterators in an efficient manner. It would be great to see that research applied to the .NET compilers. Of course, it’s not an easy task.

  21. Andrew says:

    “…Instead it just borrows the stack of the caller, and anything that it needs to save across calls to MoveNext are stored in a helper object (which goes on the heap rather than the stack)…”

    I used to think that implementation of an iterator as a value type is more efficient in terms of performance (say, generic list and dictionary have their iterators as structs).

    And I used to think that “yield return”  just fails to expose a struct since that should be public (to aviod boxing) and that’s not appropriate for autogenerated classes.

    But Raymond seems to say that heap-based iterators are better, why? Because of possible nesting?

    [I believe most people expect reference semantics for iterators, not value semantics.

    e = o.GetEnumerator();

    I suspect most people would expect DoSomething2 to continue where DoSomething1 left off. I don’t see how nesting is involved. -Raymond]

  22. My latest in a series of the weekly, or more often, summary of interesting links I come across related to Visual Studio. The Visual C++ Team blog has a post about IconWorkshop Lite , which is available as a free download. Greg Duncan posted a link to

  23. Andrew says:

    [I believe most people expect reference semantics for iterators, not value semantics.]

    Vast majority of people writing vernacular C# use them in foreach loops. They never bother about ref/val semantics. Also, you palter a little bit: with a value type enumerator, in your example

    e = o.GetEnumerator();



    e gets boxed, so no difference between a class and a struct iterator. The only benefit for a value type iterator comes from the foreach loop: it uses duck typing against iterators,  thus doesn’t cast to IEnumerator.

    […I don’t see how nesting is involved]

    Um, sorry – I meant recursion. If you  recursively interate through, all iterators (as far as I understand) should be on the stack that might, eg. cause stack overflow.

    [You focused on boxing and missed the original problem. DoSomething2(e) will not resume where DoSomething1(e) left off. And stack overflow is not the issue here either. -Raymond]
  24. Ask the Directory Services Team : MCS Talks Infrastructure Architecture joeware – never stop exploring…

  25. Alexander Morou says:

    As stated before, the primary reason the state machine transitions into a final state is to prevent pre-final states from being repeated, since the logic of them could alter the state of other objects within the iterator’s scope (and potentially even cause run-time errors).

    I’m guessing the other reason yield break; transitions into a state, is state reduction, the most optimal state machine is the one with the fewest states.  Multiple yield breaks do transition into a single state, don’t they? (I haven’t actually checked.)

    Recursive iterators merely require a more complex approach, but are entirely possible.  Instead of being directly recursive, you need to cast part of the logic into a sub-iterator that would effectively emit the transition series of objects used in the recursive approach, then using the flat form of the hierarchy, yield the members respectively.  Alternatively you can flatten your recursive approach into a non-recursive solution and yield the members that way.

    I found iterators interesting because you can essentially do whatever you want with them.  My favorite was creating a multi-dimensional array iterator that iterated the indices of an array, yielding the same array instance each time, just with different values.  Good for debugging where you send in a nondescript array and its internals are iterated and logged regardless of the array’s dimensional structure.

  26. CodeMage says:

    Another interesting fact about iterators is the garbage they generate, by virtue of implementing IEnumerator<T> and IEnumerable<T>. Every time you use an iterator in a foreach loop, you’ll instantiate the implementation of these interfaces. In case of yield return, this implementation is hidden from you by the compiler, but it still creates garbage.

    There’s a way to avoid that, though. By creating a value type that implements the iterator design contract (GetEnumerator, MoveNext, Reset and Current) without implementing the IEnumerator<T> and IEnumerable<T> interfaces, you can still use it in foreach loops but you don’t generate any garbage. The bad news is that you have to implement it by hand: there’s no syntactic sugar for this.

  27. Andrew says:

    [You focused on boxing and missed the original problem. DoSomething2(e) will not resume where DoSomething1(e) left off. And stack overflow is not the issue here either. -Raymond]

    Wow! Raymond, yep I missed the point – ultimately had to write a small app in VS to get to it, sorry :)

    But. Iterators for List<T> and Dictionary<K,V> are public nested structs. Does this mean that perf gain in foreach loops was more important back then, than the continuation idea (implemented in yield return) ?

    (just in case you or someone else here has an explanation)



  28. Andrew says:


    …By creating a value type that implements the iterator design contract…

    Exactly, and that’s what bothers me: why didn’t they implement ‘yield return’ to generate structs? Raymond answered above, but it still somehow not quite persuading

    (btw – that nested value type iterator should be public, that’s possibly something you cannot always afford)

  29. Implementing IEnumerable&lt;T&gt; can turned out to be tricky in one particular situation. Consider the

Comments are closed.