Foreach codegen


Do you know how the following C# code compiles?

foreach (int i in c) { Console.WriteLine(i); }

 

In addition to the obvious branching opcodes, this can also emit try/finally, a call to IDisposable, unboxing opcodes.

 

It turns out it really depends on the type of C. There are 4 interesting cases:

  1. If C is an T[], then the compiler emits an efficient for(int i = 0; i < c.Length; i++) loop. Sweet.
  2. IF C implements the method protocol for IEnumerable but doesn’t actually implement the interfaces, then the compiler emits the calls directly.
  3. If C is IEnumerable<T>, then then compiler calls the proper methods on the interface. It will also setup a Try/Finally to ensure that IDisposable.Dispose is called on the IEnumerator<T> object. 
  4. If C is  IEnumerable (the non-generic version), it’s similar to IEnumerable<T>. But since IEnumerator doesn’t derive from IDisposable, the compiler must also check if the enumerator implements IDisposable.

You can figure this out empirically (see the code snippet below), or you can read the C# spec.

The neat part is that for most people, this is all just an optimization.  Since typeof(c) is known at compile-time, this is all resolved statically and so the compiler can pick the best technique.

 

Sample code:

Here’s a more complete sample that illustrates the different cases, and shows how each compiles.

 

// Show how various foreach constructs compile.
using System;
using System.Collections.Generic;
using System.Text;
using System.Collections;

class Program
{
    //-------------------------------------------------------------------------
    #region Test with IEnumerable<T>
    // Test foreach using IEnumerator
    static void IEnumTest_Generic(IEnumerable<int> c)
    {         
        foreach (int i in c)
        {
            Console.WriteLine(i);
        }
    }
    
    // This is what a foreach for  IEnumerable<T> compiles to.
    // - has a call to Dispose() on the enuemerator. 
    static void IEnumTest_Generic_Compiled(IEnumerable<int> c)
    {
        IEnumerator<int> e = c.GetEnumerator();
        try
        {
            goto Check;

        Resume:
            int i = e.Current;
            // Body(i);
            {
                Console.WriteLine(i);
            }

        Check:
            if (e.MoveNext()) goto Resume;
        }
        finally 
        {
            // IEnumerator<T> inherits from IDisposable, so no need to use 'as' keyword.
            if (e != null) e.Dispose();
        }
    }

    // Implement an IEnumerable<T>.
    static IEnumerable<int> List()
    {
        for (int i = 0; i < 5; i++)
        {
            yield return i;
        }

    }
    #endregion


    //-------------------------------------------------------------------------
    #region Test with IEnumerable, non-generic version.
    static void IEnumTest_NonGeneric(IEnumerable c)
    {        
        foreach (int i in c)
        {
            Console.WriteLine(i);
        }
    }
    // This is what a foreach for  IEnumerable<T> compiles to.
    // - has a call to Dispose() on the enuemerator. 
    static void IEnumTest_NonGeneric_Compiled(IEnumerable c)
    {
        IEnumerator e = c.GetEnumerator();
        try
        {
            goto Check;

        Resume:
            int i = (int) e.Current; // Implicit casting.
            // Body(i);
            {
                Console.WriteLine(i);
            }

        Check:
            if (e.MoveNext()) goto Resume;
        }
        finally
        {
            // Check if the enumerator implements disposable.
            IDisposable d = e as IDisposable;
            if (d != null) d.Dispose();
        }
    }
    // Implement an IEnumerable (non-generic)
    static IEnumerable List2()
    {
        for (int i = 0; i < 5; i++)
        {
            yield return i;
        }
    }
    #endregion


    //-------------------------------------------------------------------------
    #region Test with Array

    // Test with an array
    // Compiler detects this is an array and generates efficient
    // for(int i = 0; i < array.Length; i++) code. 
    static void ArrayTest(int[] array)
    {
        foreach (int i in array)
        {
            Console.WriteLine(i);
        }
    }
    
    // This is what that foreach compiles to:
    static void ArrayTest_Compiled(int[] array)
    {
        int[] _array = array;
        int _counter = 0;

        goto Check;

    Resume:
        int i = _array[_counter];
        // body of for-loop using i
        {
            Console.WriteLine(i);
        }

        _counter++;

    Check:
        if (_counter < _array.Length) goto Resume;
    }
    
    #endregion


    //-------------------------------------------------------------------------
    #region Test with explicit enumeration.
    // Test using explicit methods.
    // The argument to Foreach does not implement IEnumerable. 
    // The compiler will codegen like IEnumerable, but call the methods statically instead
    // of through an interface.
    // This generates direct calls, but the code can't be shared.
    static void ExplicitTest()
    {
        Thing t = new Thing();
        foreach (int i in t)
        {
            Console.WriteLine(i);
        }
    }

    // explicitly impement an enumerator. These don't implement IEnumerable, IEnumerator.
    class Thing 
    {
        public ThingEnumerator GetEnumerator()
        {
            return new ThingEnumerator();
        }
    }
    class ThingEnumerator
    {
        int i = -1;
        public object Current
        {
            get { return i; }
        }

        public bool MoveNext()
        {
            i++;
            return i < 5;
        }

        public void Reset()
        {
            i = -1;
        }
    }
    #endregion


    //-------------------------------------------------------------------------
    static void Main(string[] args)
    {
        Console.WriteLine("T[]");
        int[] array = new int[] { 0, 1, 2, 3, 4 };
        ArrayTest(array);
        ArrayTest_Compiled(array);

        Console.WriteLine("IEnumerable<T>");
        IEnumerable<int> c1 = List();
        IEnumTest_Generic(c1);
        IEnumTest_Generic_Compiled(c1);

        Console.WriteLine("IEnumerable");
        IEnumerable c2 = List();
        IEnumTest_NonGeneric(c2);
        IEnumTest_NonGeneric_Compiled(c2);

        Console.WriteLine("Explicit implementation");
        ExplicitTest();
    }
}

Comments (5)

  1. Sergey Kanzhelev says:

    What about parallel execution on different processors? I heard that "foreach" will differ from "for" on the multi-processor machines. But the case 1 (Array) do not supposes so.

  2. A great breakdown of how foreach works

  3. jmstall says:

    Sergey – where did you hear that? What differences were you expecting? Part of the challenge is that you can compile to IL on a single-proc and then JIT + run on a multi-proc (or vice versa), so the compilation phase can’t assume very much about the final execution environment.

    Thus you can verify what’s supported by inspecting the IL. In this case, you can indeed see that the compiler is not parallelizing the for-loop (although the jitter could still do some smaller optimizations for multi-proc)

  4. Stuart says:

    Out of curosity when was this optimisation introduced to the compiler, obviously the generics part was introduced for 2.0 but was any of this there for 1.0 or 1.1

  5. jmstall says:

    Stuart – which optimization?