The reason why IEnumerator<T> extends IDisposable


Lots of people have asked me why IEnumerator<T> extends IDisposable. We did this to support some obscure yet important scenarios in which an enumerator enumerates database rows, files in a directory, and so on. In such cases, the enumerator usually opens a connection or a handle, which then needs to be closed at the end of the enumeration.

Now, we could solve the problem in two ways.

1)     We could do what we did: have IEnumerator<T> extend IDisposable. The foreach loop can then simply call Dispose when the loop terminates.

2)     We could implement the foreach loop to do a dynamic cast to IDisposable at the end of the loop and, if the enumerator happened to implement IDisposable, call Dispose on it.

We chose #1 because it’s slightly faster and the cost of implementing an empty Dispose method, for those who don’t need it, is relatively low.
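To make the two options concrete, here is a rough sketch of what a foreach loop expands to in each case. It is hand-written for illustration, not actual compiler output, and the class and method names (`ForeachExpansion`, `SumGeneric`, `SumNonGeneric`) are made up. The non-generic IEnumerator, which does not extend IDisposable, stands in for option #2.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // Option 1: IEnumerator<T> is statically known to be IDisposable,
    // so the loop can call Dispose() directly in a finally block.
    public static int SumGeneric(IEnumerable<int> source)
    {
        int sum = 0;
        IEnumerator<int> e = source.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                sum += e.Current;
            }
        }
        finally
        {
            e.Dispose(); // no run-time type check needed
        }
        return sum;
    }

    // Option 2: the non-generic IEnumerator does not extend IDisposable,
    // so the loop has to test for it at run time on every enumeration.
    public static int SumNonGeneric(IEnumerable source)
    {
        int sum = 0;
        IEnumerator e = source.GetEnumerator();
        try
        {
            while (e.MoveNext())
            {
                sum += (int)e.Current;
            }
        }
        finally
        {
            IDisposable d = e as IDisposable; // the dynamic cast
            if (d != null)
            {
                d.Dispose();
            }
        }
        return sum;
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumGeneric(numbers));
        Console.WriteLine(SumNonGeneric(numbers));
    }
}
```

In the second method the cast is paid even when the enumerator turns out not to be disposable, which is the cost the choice of #1 avoids.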

 

… besides, have you heard about the new C# yield statement?
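As a small illustration of why yield matters here: with an iterator method you never write Dispose yourself. The sketch below assumes a made-up `LineReader.ReadLines` helper and a `ReaderClosed` flag that exists only for demonstration. Cleanup goes in a finally block, and the compiler-generated enumerator runs that block when foreach calls Dispose, even if the loop exits early.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineReader
{
    // Records whether cleanup ran; for demonstration only.
    public static bool ReaderClosed;

    // Hypothetical helper: yields the lines of a reader and closes it afterwards.
    public static IEnumerable<string> ReadLines(TextReader reader)
    {
        ReaderClosed = false;
        try
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
        finally
        {
            // Runs when the sequence ends or when the enumerator is disposed.
            reader.Dispose();
            ReaderClosed = true;
        }
    }

    public static void Main()
    {
        foreach (string line in ReadLines(new StringReader("a\nb\nc")))
        {
            Console.WriteLine(line);
            break; // leave early; foreach still calls Dispose()
        }
        Console.WriteLine(ReaderClosed);
    }
}
```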

Comments (7)

  1. Brad Abrams has pointed to this post here. Krzysztof’s bit here makes perfect sense to me and honestly it’s not obscure! (well at least not around here) Perhaps it seems obscure to the folks writing stuff like the CLR, but to those of us doing LOB apps out h

  2. Kenny Kerr says:

    In part 8 of my series on MSIL I discuss the inner workings of the for each statement, including the logic behind having the generic IEnumerator inherit from IDisposable.

    http://weblogs.asp.net/kennykerr/archive/2004/12/15/316014.aspx

  3. Ken Beckett says:

    I think it’s unfortunate that you only considered costs at the machine level – CPU and memory costs.  You should always make a point of considering the human costs.  The effort of hundreds of thousands of developers taking the time to understand why Dispose() is there, how it needs to be implemented in their case, and then actually banging out the simple implementation on their keyboard vs. a slight performance improvement for a rare scenario (not to mention a scenario that implies the use of relatively expensive resources, which in turn would make the dynamic cast insignificant).

    If you had considered that, I think you would have seen option #2 as the obvious choice.

  4. Krzysztof Cwalina says:

    Ken, we did take human cost into account. We provide built-in collections, base collections (like Collection<T>), and the yield statement so most developers don’t have to implement IEnumerable<T>.

    But, there is a small set of the Framework types for which performance at the "machine level" is important. This applies to some basic types (like IEnumerable<T>) which are implemented by the building blocks of the Framework (e.g. arrays).  

    The performance overhead of doing the dynamic cast would be present in all scenarios, not only in rare scenarios; i.e. even in scenarios where the collection is not disposable, we would need to do the cast.

  5. Christoph Ammann says:

    Seems to me that there’s another reason why you’d want IEnumerator<T> to extend IDisposable. Suppose you want to write a collection that avoids allocating an object in GetEnumerator. You’d do that by the collection class extending IEnumerator<T> and returning "this" in GetEnumerator. However, you can only return "this" if no loop is currently "active" because otherwise one loop would end up modifying the enumerator state of the other, say if there are two nested foreach loops over the same collection. To maintain this "active" flag you need to know when the loop exits. This notification is provided by the Dispose method, and there was nothing like it in .netfx 1.0.

  6. Anup says:

    I have a Matrix class whose member variables are:

    double[,] elements;

    int rows;

    int columns;

    The Matrix class also serves as an enumerator for the MatrixRow class.

    If I want to manipulate the rows of a particular matrix, I write the following code

    foreach (MatrixRow row in matrix)

    Why is IDisposable required for the Matrix class in this case, since the Matrix class contains simple data types only?

    What is the performance cost if I have an empty Dispose method?

  7. Jon Hanna says:

    I immediately welcomed the fact that IEnumerator<T> is derived from IDisposable because of the human costs being less that way.

    Previously, in .NET 1.1, I had many cases where an enumerator object was based on a resource. I’m a web developer, and almost everything I write has many cases where I’m creating objects based on the values in a row from a database. If I couldn’t create an IEnumerator from an IDataReader, my only alternative would be to create an entire collection like an ArrayList or a List<T> and return it, which would be dreadfully inefficient in cases where I didn’t need the extra functionality of such a class. Obviously, though, I need to close the database connection when I’m finished with it.

    As such I needed to implement IDisposable. I didn’t know that IDisposable.Dispose() would be called until I experimented, and I couldn’t be as sure that it would continue to be called in later frameworks as I can now that it is in the core interface.

    To talk of the "human cost" of programmers having to understand what they are doing with IDisposable is nonsense.

    This implies that there are programmers building classes without considering the lifetime management of that class.

    Sorry, but claiming that there is a "human cost" in making people consider the basics is actually making me angry. Why are these people calling themselves "developers" or "programmers"?

    Still, the human cost is now reduced. Putting IDisposable into a class where it is particularly likely that it will exit prior to the most obvious ending event (which is true of all cases of IEnumerator<T> since the loop could be exited for any reason prior to MoveNext() returning false) means you have to consider your implementation of IDisposable in EVERY implementation of IEnumerator. It’s just more obvious with IEnumerator<T>.

    And the performance cost is reduced in every case. Calling an empty Dispose() method has a performance cost somewhere between zero and very small (depending on how much the optimiser deals with). Checking to see if a class implements IDisposable, discovering that it doesn’t, and hence not calling Dispose() is both more expensive and less likely to be optimised out.

    So. Machine cost is improved by making both cases where Dispose() is needed and cases where it isn’t needed more efficient. Human cost is improved by making it harder to make a mistake, admittedly a rather stupid mistake, but one that people are complaining they don’t get to make any more.

    If only we could turn back the clock and undo the error of having Reset() in the interface. It’s not needed in the most common use case and indeed not implemented in quite a few MS-supplied cases (including the class generated if you use yield), but it’s still in there. As far as I can see the only reason it’s in there is that it was in the old COM equivalent. Why it was ever in the COM equivalent is beyond me.

    That one has a real human cost, because a developer examining the interface has to consider whether failing to implement it (and it can often be impossible to implement) will cause problems (I’ve never found that it does, but wouldn’t it be nice to have a guarantee).
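
Comment 5 above describes a collection that avoids an allocation by returning itself from GetEnumerator and using Dispose to learn when the loop has exited. A minimal sketch of that idea follows. It assumes a made-up `SelfEnumeratingList` type, is not thread-safe, and falls back to a fresh enumerator for nested loops; it is not taken from the comment's author.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection that doubles as its own enumerator.
public sealed class SelfEnumeratingList : IEnumerable<int>, IEnumerator<int>
{
    private readonly int[] items;
    private int index = -1;
    private bool active; // true while a loop is using "this" as its enumerator

    public SelfEnumeratingList(params int[] items)
    {
        this.items = items;
    }

    public IEnumerator<int> GetEnumerator()
    {
        if (active)
        {
            // A loop is already running on "this": hand out a separate enumerator.
            return ((IEnumerable<int>)items).GetEnumerator();
        }
        active = true;
        index = -1;
        return this; // no allocation in the common, non-nested case
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public int Current { get { return items[index]; } }

    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext() { return ++index < items.Length; }

    public void Reset() { throw new NotSupportedException(); }

    // Called by foreach when the loop exits; frees "this" for reuse.
    public void Dispose() { active = false; }
}

public static class SelfEnumeratingDemo
{
    // Counts the pairs visited by two nested loops over the same collection.
    public static int CountPairs(SelfEnumeratingList list)
    {
        int pairs = 0;
        foreach (int outer in list)
        {
            foreach (int inner in list)
            {
                pairs++;
            }
        }
        return pairs;
    }

    public static void Main()
    {
        var list = new SelfEnumeratingList(1, 2, 3);
        Console.WriteLine(CountPairs(list));
    }
}
```

Without the Dispose notification the `active` flag could never be cleared reliably, because a loop can exit before MoveNext returns false.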