Yield and usings - your Dispose may not be called!

We ran into an interesting bug recently where a resource was being leaked because we weren't disposing of an IDisposable in some cases. Before I go any further I should state that the root bug is that the IDisposable hanging onto the resource should have implemented a finalizer, or even better, used SafeHandle. However it wasn't our code.

We tracked down the bug to some code that looked like:

     static IEnumerable<Foo> MyEnumerable()
    {
        using (new MyDisposable())
        {
            // code that calls yield
        }
    }

Then, someone called MyEnumerable().GetEnumerator().MoveNext() without disposing it or wrapping it in a foreach. The result was that Dispose was never called on the disposable. I usually think of using guaranteeing that Dispose() gets called, except in extreme cases like thread abort exceptions. It sparked my curiosity to dig and figure out how this all works.

First, a couple of words about Enumerables, Enumerators and yield. An IEnumerator provides a simple way to iterate through a collection, using MoveNext() to advance to the next element, and a property Current to get the value at the current position. MoveNext() returns false at the end of the collection. An IEnumerable is simply an object you can call GetEnumerator() on to get an IEnumerator you can use to enumerate it (for example, collections are IEnumerables).

The yield keyword is a handy way to implement an IEnumerable that exposes the result of a series of calculations or operations in such a way that you can write serial code. I could spend a whole post on how cool yield is, but you can find that elsewhere. For a cool example, you can see the Fibonacci series implemented using yield. Note: yield is not a feature of the CLR, it's magical syntactic sugar provided by C# 2.0.

Let's take a simple example:

     class MyDisposable : IDisposable
    {
        public void Dispose()
        {
            Console.WriteLine("Disposed");
        }
    }

and the function:

     static IEnumerable<int> MyEnumerable()
    {
        using (new MyDisposable())
        {
            yield return 1;
            yield return 2;
        }
        Console.WriteLine("complete");
    }

If you walk the collection with the following code:

     IEnumerator<int> enumerator = MyEnumerable().GetEnumerator();
    while (enumerator.MoveNext())
    {
        Console.WriteLine(enumerator.Current);
    }

It will print "1", "2", "Disposed", "complete". But, how does it work? If you were to implement this without yield, you'd probably have to create a gnarly state machine, with some special cases for cleanup. Well, that's exactly what the C# compiler does for you!

Using .NET Reflector, we can see what code is generated. The MyEnumerable() function returns a new instance of a compiler generated class. Leaving out some of the details, the core of MoveNext() looks like:

     switch (this.<>1__state)
    {
        case 0:
            this.<>1__state = -1;
            this.<>7__wrap1 = new MyDisposable();
            this.<>1__state = 1;
            this.<>2__current = 1;
            this.<>1__state = 2;
            return true;

        case 2:
            this.<>1__state = 1;
            this.<>2__current = 2;
            this.<>1__state = 3;
            return true;

        case 3:
            this.<>1__state = 1;
            this.<>m__Finally2();
            Console.WriteLine("complete");
            break;
    }
    return false;

With:

     private void <>m__Finally2()
    {
        this.<>1__state = -1;
        if (this.<>7__wrap1 != null)
        {
            this.<>7__wrap1.Dispose();
        }
    }

So, if you walk through the states, you can see in case 0, it creates the MyDisposable, sets current to 1 and returns true. When it's called again in state 2, it sets current to 2 and returns true. When it's called a third time, it calls the finally function that disposes of the MyDisposable and then writes out "complete".

But, what happens in a partial enumeration? If you use foreach, like:

     foreach (int i in MyEnumerable())
    {
        Console.WriteLine(i);
        break;
    }

You might not expect it to call dispose. But, what it outputs is "1", "Disposed". The reason that this works is that the generated IEnumerator class also implements IDisposable where Dispose() calls <>m__Finally2(). And, foreach calls Dispose on its enumerator if it is an IDisposable.

However, in the case of our bug, we weren't using a foreach, we were just using MoveNext/Current directly without enumerating through the entire list. In this case, the MyDisposable is never disposed, even when GC takes place. Is this a bug? I've gone back in forth in my mind. On the one hand, the Enumerable code has a using, and that's supposed to clean things up! The compiler could have implemented a finalizer for the temporary class to clean up. On the other hand, the IEnumerator it generated was an IDisposable, which we failed to dispose. I tend to land on the side that this isn't a bug (and I bet the C# team agrees). If you have something that's holding onto a critical resource, you should implement a finalizer or use a SafeHandle and not rely on Dispose being called. However, I did find it a bit surprising. Now I know!