The implementation of iterators in C# and its consequences (part 2)


Now that you have the basic idea behind iterators under your belt, you can already answer some questions on iterator usage. Here's a scenario based on actual events:

I have an iterator that is rather long and complicated, so I'd like to refactor it. For illustrative purposes, let's say that the enumerator counts from 1 to 100 twice. (In real life, of course, the iterator will not be this simple.)

IEnumerable<int> CountTo100Twice()
{
 int i;
 for (i = 1; i <= 100; i++) {
  yield return i;
 }
 for (i = 1; i <= 100; i++) {
  yield return i;
 }
}

As we learned in Programming 101, we can pull common code into a subroutine and call the subroutine. But when I do this, I get a compiler error:

IEnumerable<int> CountTo100Twice()
{
 CountTo100();
 CountTo100();
}

void CountTo100()
{
 int i;
 for (i = 1; i <= 100; i++) {
  yield return i;
 }
}

What am I doing wrong? How can I move the "count to 100" into a subroutine and call it twice from the CountTo100Twice function?

As we saw last time, iterators are not coroutines. The technique above would have worked great had we built iterators out of, say, fibers instead of building them out of state machines. As state machines, all yield return statements must occur at the "top level". So how do you iterate with the help of subroutines?

You make the subroutine its own iterator and suck the results out from the main function:

IEnumerable<int> CountTo100Twice()
{
 foreach (int i in CountTo100()) yield return i;
 foreach (int i in CountTo100()) yield return i;
}

IEnumerable<int> CountTo100()
{
 for (i = 1; i <= 100; i++) {
  yield return i;
 }
}

Exercise: Consider the following fragment:

 foreach (int i in CountTo100Twice()) {
  ...
 }

Explain what happens on the 150th call to MoveNext() in the above loop. Discuss its consequences for recursive enumerators (such as tree traversal).

Comments (49)
  1. John says:

    Am I alone in thinking that CLR week sucks?

  2. Rick C says:

    If you think CLR week sucks, go somewhere else for a week. It’s Raymond’s blog, he gets to choose his own comments.

    "Gee, Raymond, I don’t like the brand of free ice cream you’re serving!  You should give me what I like!"

  3. Mark says:

    John: most likely, but who cares?

  4. Nathan_works says:

    Since I don’t do much CLR type things, I’ve got to ask: where are anonymous functions and iterators used/useful ?

    C++/STL iterators are handy, but in my experience I’ve never had the need to write one. The ones Raymond has show here don’t fit the C++/STL iterator pattern, so I’m more left to wonder why someone would need it and what it would be used for (aside from "it’s neat!").

    Can anyone fill in the blanks (CLR haters excused from answering this..)

  5. acq naq says:

    Everybody who criticizes NET week should instead of sharing their comprehension problems with us put on their thinking caps and read and reread Raymond’s posts and think very, very hard until something happens.

    That’s because the quality of Raymond’s NET posts is unbelievably high compared to almost any other material you are about to encounter on the web, no matter in which programming language the topic is presented! The material presented IS in fact of universal value, but you have to understand it first. If you manage to understand, you’ll be immensely better programmer. And then you’ll also agree with my claims.

    I don’t use C# but I can’t express enough how thankful I am to Raymond for these posts. Thanks Raymond.

  6. Tim says:

    Nathan: iterators, anonymous functions, and anonymous types, are all extremely useful with LINQ, which is awesome and everyone should make an effort to learn as soon as possible.

  7. weird examples ahead! says:

    @Nathan:

    anonymous delegates are very useful everytime you need a very short delegate since they improve readability of code.

    For example the Sort method might accept a delegate with the comparing function and it’s both shorter and more readable to have it inline; another example are BeginInvokes or SynchronizationContexts which are in practice ways to use the messaging facitilites to execute code in the UI thread – you can pass the code inline using an anonymous delegate instead of having a separate one.

    Iterators are unbeliavably useful.. when you get used to them you can’t go back.

    You can create easy to navigate views on your data structures, filters and much more.

    For example:

    List<Entry> m_AddressBook1 = …;

    List<Entry2> m_AddressBook2 = …;

    IEnumerable<Entry> GetAddressBooksFilteredBySomeWeirdMethodology()

    {

      foreach(Entry E in m_AddressBook1)

        if (Whatever(E)) yield return E;

      foreach(Entry2 E in m_AddressBook2)

        if (Whatever(E)) yield return Entry2ToEntry(E);

    }

    GetAddressBooksFilteredBySomeWeirdMethodology() returns both address books combined, with conversion of the second addressbook format on the fly, of only items matching some weird criterion. Without copying any memory around and you can navigate the result with a simple

    foreach(Entry E in GetAddressBooksFilteredBySomeWeirdMethodology())

  8. Mark Baker says:

    I’ve never used .NET, and only done a littl ebit of Win32. I develop for linux at work, and use a Mac (mostly just for web browsing) at home. And I find almost everything Raymond writes about interesting, including the C# stuff this week.

    If you’re a programmer, and you don’t think what Raymond writes about is relevant to what you do, you’re probably not a very good one.

  9. EricLippert says:

    To return to the subject at hand…

    We could fix the issue Raymond alludes to by implementing a new syntax which allows you to

    "yield foreach CountTo100()".

    The compiler could generate efficient code which does not run into the problem. We could make this work even with complex iterators on recursive data structures.

    We know that in both principle and practice it is possible to build a compiler which does this because someone has:

    http://citeseer.ist.psu.edu/cache/papers/cs2/355/http:zSzzSzwww.cs.kuleuven.ac.bezSz~frankzSzPAPERSzSzFTfJP2005.pdf/iterators-revisited-proof-rules.pdf

    However, this has never been a high enough priority to make it into the C# language. I would very much like it to be in the future, but I would not get my hopes up too high if I were you.

  10. As an alternative to having two foreach loops within CountTo100Twice(), you could instead use the .Concat() extension method [0]

    IEnumerable<int> CountTo100Twice()

    {

     return CountTo100().Concat(CountTo100());

    }

    The only problem with .Concat() is that it’s limited to combining only two lists, though it’s ~trivial to write a verion of .Concat() that handles more [1].

    • Jon

    [0] http://msdn.microsoft.com/en-us/library/bb302894.aspx

    [1] http://anonsvn.mono-project.com/source/branches/rocks-playground/Mono.Rocks/IEnumerable.cs

    Search for "Concat<TSource>".

  11. Lukas Beeler says:

    I’m mostly a system administratos, and while there’s a lot of content on this blog that i don’t understand at all, there’s still so much useful information to be found here even for nonprogrammers.

  12. KenW says:

    @John: "Am I alone in thinking that CLR week sucks?"

    Yes, you’re alone. You’re also alone in having too low an IQ to see where Raymond’s posts are educational whether you like or use C# or not.

    Find somewhere else to spend your time, please.

  13. Nick Palladinos says:

    One typo

    IEnumerable<int> CountTo100()

    {

    // should be “for”

    foreach (i = 1; i <= 100; i++){

     yield return i;

    }

    }

    [Fixed. Thanks. -Raymond]
  14. Grumpy says:

    Am I alone in thinking that CLR week sucks?

    God, where do you people come from?  Go whine at someone else.

    You don’t like what’s here?  Go bore someone else with your whining.

  15. Nick Palladinos says:

    I believe that in F# you can write

    seq {

      yield! CountTo100()

    }

    Raymond keep up the good work

  16. alex.r. says:

    Nathan:

    Most people never had to write their own  STL iterators mainly because it’s really the last thing you want to do given the complexity it involves. It seems that STL iterators were meant to be easy to use but hard to build.

    Things don’t have to be that way.

    An example : you might have some file containing a list of serialized objects. By building an iterator that parses each record and outputs the de-serialized objects one at a time, you can use the same functions to do operations on an in-memory list of these objects and the serialized representation.

    You’d probably do this anyway by re-implementing the general idea of the iterator in your program — the fact that there’s a built-in mechanism for it is just nice.

    In the same way, anonymous methods are nothing more than a convenience really. They favor a certain program style that is not necessarily bad in itself but would be too ‘messy’ to be practical without them.

    For example, some people prefer not to use std::foreach because you have to separate the looping from what you’re actually doing in the loop. With anonymous functions you wouldn’t have to.

  17. Daniel Colascione says:

    alex.r, boost has a bunch of utilities to make iterators much easier to build.

    Also, I haven’t used C# iterators, but I do use the Python equivalents quite often. Generally, I don’t try to make them recursive, as in Raymond’s example. Instead, if I need to walk some tree structure, I just write the walk with an explicit stack inside the iterator. (Which also makes it simple to change the walk to a breadth-first one.)

  18. Jonathan Allen says:

    The VB team has recognized this limitation and has proposed an alternate syntax for iterators.

    Function CountTo100Twice() As IEnumerable<int>

    Dim Count100 = Iterator
    
        For i = 1 to 100
    
            Return i
    
        Next
    
    End Iterator
    
    Return Iterator
    
        Return Each CountTo100()
    
        Return Each CountTo100()
    
    End Return
    

    End Function

    This function creates two anonymous iterators.

    The first is assigned to CountTo100.

    The second, which is returned, iterates through each item in CountTo100 (Return Each), and then of course does it again.

  19. Robert says:

    You mentioned fibres and coroutines. You could also use continuations to implement this.

  20. Michael says:

    Robert:> You mentioned fibres and coroutines. You could also use continuations to implement this.

    Yes!  That should be the NEXT BIG FEATURE of .NET — "call-with-current-continuation"!  Is foreach not good enough for you?  Are you tired of writing your tree traversal algorithms recursively?  With "call/cc", now you can have the spaghetti code you’ve always dreamed of, and still manage to look like a programming master to the maintainer of your code 50 years from now!

    Sarcasm aside, I would certainly love to have this Scheme feature in some .NET language.  While it will probably never happen (does anyone actually use call/cc, other than Scheme implementors?), it would certainly help show that .NET has arrived as a system for programming language research, as well as a boring but practical app development system.

  21. Christian says:

    Raymond picked the most interessting thing about iterators in C# 2.0. These two blog posts were STRAIGT to the point and absolutely excellent.

    Not some "how to", but directly how it works!

    I had to use Reflector back then to figure out how they work.

  22. steveg says:

    Michael + Robert

    I think (hope?) you’re joking… A programming jedi always chooses languages that use { and }. A programming sith choose languages with ( ).

    Ewoks choose VB (don’t be angry, you’re cute).

    For those of you too lazy to google:

    http://en.wikipedia.org/wiki/Call-with-current-continuation.

    I like C#’s simplicity… I did a lot of C++ STL programming and never really liked the library. Too many of its design decisions favoured performance over readability and it always reeked of an academic provenance. And the error messages… you know something’s wrong with a language when the error message fills the screen.

  23. Jolyon Smith says:

    A very useful series of posts.

    They are demonstrating very nicely what a waste of time a lot of so called "language improvements" (especially those that the .NET camp smugly call their own, whilst sneering at unmanaged languages that have to be more explicit to achieve the same thing) can lead to.

    by removing the details of how some piece of code works SO FAR from the way that code is expressed, the potential for unanticipated side-effects is increased and huge impediments introduce to being able to identify the causes of, and resolutions to, those side-effects.

    So for the occasional trivial use, where you save a few scant seconds by using these oh-so-clever language features, you pay a huge price in hours/days spent scratching your head trying to figure out how some piece of code that you think you understand but clearly don’t (because you don’t have a compiler embedded in your brain).

    I am not some luddite who believes we should all be using assembler, but I do believe there is a PRACTICAL limit to how high a high level language should go.

    Pressure suits and oxygen masks shouldn’t be required.

    Keep It Simple, doesn’t necessarily mean "make it so we can write as little code as possible".

  24. Wolf Logan says:

    @Jolyon Smith

    That practical limit is your ability to hold onto the concepts expressed in a particular piece of code in a particular language. I for one find nothing complicated or abstruse about C#3, and save more than "a few scant seconds" with "occasional trivial use" of its many "language improvements". Rather, my livelihood is based on making very good use of them to simplify not only the lines of code I type, but the ability of those after me to understand and maintain that code.

    I have difficulty following some types of monadic functional code in certain languages. Should I then declare that such things are stratospheric and baroque, and then deride those that use them? No. Instead I recognise that code written in that style has value, and that I as a professional programmer have some work to do to improve my skills.

    Compared to the languages that we were programming in twenty years ago, modern languages (even those without "language improvements") are dramatically more abstract and removed from implementation. There’s no reason to believe that we’re somehow in programming’s "golden age" with languages that are at the pinnacle of perfection with regards to abstraction, and that any further refinement is just architecture astronautics.

    I for one welcome our new abstraction overlords.

  25. Someone suggested instead of complaining about CLR/.NET I should state my reasons why?

    Here they are:  The concept was stolen from the original VB Classic team and mutilated.  The folks who currently develop it have no idea of the true VB history and what it represents…and in the process M$ declared VB6 dead and not work saving…even tho we could not move our code assets to .NET without a rewrite.  There is a word for that, but as this is a PG-13 blog, I’ll keep it to myself.  WHy build another layer (.NET) ontop of the WIn32 and makes our lives even more complicated.  Does someone really think this is an advantage?  It isn’t.  With VB6 I had ample control over what kind of code I could write and how to hack it to make VB do col things it was never meant to do.  I can’t do this in .NET because it’s either do it M$’s way via the framework, so go use Delphi.  I chose the latter, mostly because I don’t trust MS to screw us developers again when they get tired of .NET and want to do .ORG where nothing is backwards compatible, we can only write the code MS wants us to and only have it work on the latest OS (and if it’s like Vista (which I do like) which many do not…it’s yet more bullets in the foot if MSFT.  Don’t they have pain receptors to keep them from making the same mistakes over and over again, or will they just slowly bleed to death until they are dead.

    One has to wonder.

  26. MichaelGG says:

    @Nick I came here for that :)

    #light

    open System

    let countTo100 = { 1..100 }

    let countTo100Twice = {

       yield! countTo100

       yield! countTo100 }

    @Wolf Logan: Exactly. Once you learn and grasp a concept, its yours to reuse anytime you need it – it’s a one time payment. If you don’t learn the concept, you have to pay every single time you run into that problem.

  27. @Nathan:

    Iterators (writing my own) have been very useful to me in various API wrapping situations: writing an iterator around an accounting system SDK isn’t hard, but makes life so much easier for the people using your wrappers.

    I haven’t had much need for anonymous delegates so far, but have been nice in cases of small IDisposable utility classes.

  28. Nick Palladinos says:

    @MichaelGG

    :) F# is the elite language for .Net

  29. MichaelGG says:

    @Nathan, anonymous methods can replace objects.

    For a cheap example, let’s say we have some method "SanitizeWords" that takes a callback of (string -> string). Taking a callback ensures the consumer of this method can customize the behaviour easily.

    .NET delegates, contain a function pointer + an object. So you can define a new type (say, ListBasedSanitizer) and create a "string Sanitize(string word)" function and any supporting fields (string[] badWords), etc. When you call SanitizeWords, you first create a new ListBasedSanitizer, then specify that instance’s Sanitize method as the callback. If the logic for Sanitize needs to be different, you define a new type. You end up with little types just to hold your state and logic for the callback. In some cases, you could add more methods and fields to the type you’re currently in. At best:

    string mySanitize(string w) { … }

    this.someDataForSanitize = …

    SanitizeWords(this.mySanitize)

    "Closures" eliminate all this work. They work by "capturing" variables. Suppose you want to call SanitizeWords with some custom logic, like randomly replacing 4 letter words with "OHMY". You don’t need to create a new type or define fields and methods:

    var r = new Random()

    SanitizeWords(word =>

       word.Length != 4 ? word :

       r.NextDouble() > 0.1 ? word :

       "OHMY");

    Even though r is outside of the anonymous method, we can use it. Using r "captures" it, and now you’ve brought external data into SanitizeWords without having to write unnecesary code. Behind the scenes it’ll create a type and field and new up a new instance for you.

    My favourite thought on closures and objects is the koan here:

    http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg03277.html

  30. MichaelGG says:

    @Jonathan Pryor

    What about:

    return (new[] { CountTo100(), CountTo100() }).Aggregate(Enumerable.Concat);

    (Or SelectMany)?

    A problem may be that each enumerator in the array will be created before it’s actually needed. Getting around that isn’t hard, but it’s not exactly elegant either:

    return (new Func<IEnumerable<int>> [] { ()=> CountTo100(), ()=>CountTo100() }).Select(f=>f()).Aggregate(Enumerable.Concat);

    [Not the first or last time I wish Expression<T> had different syntax.]

    @Nick – Oh no! Don’t use that term :(. I’m trying to keep some hope that F# will become mainstream, even though everyone tells me that’ll never happen. (Although, my ego does enjoy the delusion of being "elite".)

  31. Nick Palladinos says:

    @MichaelGG

    I mean that it is elite in terms of rich language features (among other .Net languages)

  32. MichaelGG says:

    @Nick – Definately agree with that then :)

  33. AndyB says:

    @steveg: absolutely right, if MS has released .NET as a VB.NET only language (ie and not released C#) then everyone would be complaining about it, saying its not a real language, its performance and ease-of-use, and ‘fancy features’ suck.

    As they cleverly created a Ja.. ahem, sorry, a language with curly braces, they made it acceptable for everyone to use and love it.

    Strangely, VB.NET is actually the stronger and more powerful language, it has a few additional bits that are missing from C# but everyone thinks the only .NET language is VB.NET. Go figure.

    I always recall the COM attacks that appeared from MS when .NET first came out, all the crufty things that were added to COM that sucked were attacked by MS people, only now I see .NET getting the same kind of crufty things added to it. I reckon 5, 10 years from now someone will post a blog saying how iterators were such a mistake and how we should all move to MSs’ coolest newest architecture…..

    CLR week does suck, but Raymond – keep it up, its still informative and entertaining. (can we have more on exceptions please – check Chris Brumme’s old blog for posts as excellent as Raymonds).

  34. CShartp says:

    What a bunch of crusty old whiners.  C# rocks.  Everything is 10x simpler than C++.  I can focus on the problem I’m trying to solve instead of problems with the language.

  35. In short: not only does the CLR week not suck, if possible, I’d love to see more of it please.

    It hasn’t been as hard-core as some of the Win32 entries, but it’s surely interesting.

    Thanks Raymond.

  36. John says:

    Well, at least I am not alone.  But to clarify, I didn’t mean his writing was poor or anything; as a Win32 guy, I just don’t find .NET stuff particularly interesting compared to the usual topics.  For a Win32 “Not a .NET blog”, I was not expecting a full week of it.  Granted its his blog and he can do what he wants, but I suspect it probably wasn’t his choice.

    [“I was not expecting a full week of it.” You must be new here. It’s an annual tradition. -Raymond]
  37. Andrew says:

    As of the question, I don’t see anything specific about the 150th call.

    (If it meant to be 101st – then it’s the point  where the first sub-enumerator gets disposed. So with recursive itarators, you’d still had a non-disposed object to put more pressure on GC)

  38. Russell says:

    I’m a .NET developer who comes to this site to gain a better understanding of programming in general and also get a clearer understanding of what is going on under the hood in an OS (it just happens to be WIN32 here). The programming language used to describe the functionality or idea isn’t that important but it sure is nice to get some of Raymonds extremely in depth thoughts and explanations on something that is actually of use to me on a daily bases. DOTNET week is fascinating!

    Thanks Raymond.

  39. KenW says:

    @Kevin Provance: "Someone suggested instead of complaining about CLR/.NET I should state my reasons why?

    /snip long tirade about demise of VB6"

    You still haven’t. All you’ve done is complain about how VB6 is finally dying, and you can’t seem to move on to a real language (although you did mention Delphi, which is very much a real language – that confused me somewhat, I’ll admit).

    You said nothing about problems with .NET whatsoever, and even less about C# and iterators (which were the topic of Raymond’s blog post). All you did is whine about VB6’s demise.

    Other than you lack of skills (which if you had would make it easy for you to move on to a better language), what is your problem with .NET?

  40. EricLippert says:

    To attempt to bring this back on topic again…

    The only problem with .Concat() is that it’s limited to combining only two lists

    Though your suggestion is an excellent one, its parameter arity is not the only problem with Concat. Concat suffers from exactly the problem that Raymond was alluding to, a problem which as yet no one has mentioned here.

    I believe the problem that Raymond is alluding to is that we now have an iterator which calls another iterator; to get each item, two calls to MoveNext have to happen, so the total cost of enumerating is O(2n). Which is fine, that’s O(n).  

    But if you were enumerating a binary tree, say:

    foreach(T t in Left) yield return t;

    yield return Value;

    foreach(T t in Right) yield return t;

    then the average number of calls to MoveNext per item yielded will be O(h) where h is the average height of the tree.  In a balanced binary tree, h is O(lg n), so enumerating all n items is O(n lg n).  In an unbalanced binary tree, h is potentially O(n), so enumerating all n items is O(n^2).

    Grant Richins, Wes Dyer and I have all written blog articles on ways to solve this problem; as I said before, we could have the compiler solve it for you, but don’t hold your breath waiting. We have other priorities.

  41. Ask the Directory Services Team : MCS Talks Infrastructure Architecture joeware Рnever stop exploring…

  42. Greg D says:

    @KenW: "You’ve managed to join the company of Igor and Yuhong*"

    Don’t forget everybody’s favorite blogs.msdn.com troll Norman Diamond!  I wonder if he ever got his SCSI floppy’s interface working on that ancient laptop of his.  I think it cost him, like, 2000 yen. heheheh.

  43. EricLippert says:

    Re: people who still refer to Microsoft as "M$".

    http://www.penny-arcade.com/comic/2002/7/22/ms/

  44. Good Point says:

    @Kevin Provance: "and in the process M$ declared VB6 dead and not work saving"

    And God Bless Microsoft for doing that!

    They wanted to move development away from native code to some kind of managed code.  I congratulate them for not dragging along all of the VB baggage and proposing a new solution.

    If not, in 2015 (maybe earlier) Raymond would have been blogging about how the design decisions in VB3 shape our programming environment.

  45. KenW says:

    @John: "Well, at least I am not alone. "

    No. You’ve managed to join the company of Igor and Yuhong*. Congratulations.

    And it was Raymond’s choice, as it is every year, and as is every blog entry he writes. Nice try, though.

    *I suspect you’re too new here to understand the reference. The two people I mentioned are the two biggest trolls here, who use many of Raymond’s articles as the launching point to bring up old, long resolved issues relating to MS and Windows. IOW, they constantly whine and complain and never post anything of any value, and generally just tick people off.

  46. John says:

    @Ken:  Actually Yuhong is one of my favorites, though I don’t care much for Igor.  AndyB said he also thinks CLR week sucks, therefore I am not alone.

  47. Anonymous says:

    John: you are quite OT – http://tinyurl.com/2bjua8

  48. AndyB says:

    Come to think of it, I was mistaken. CLR week does NOT suck. At all.

    Its the CLR and all the Managed code hype and reinvention that sucks. my most abject apologies to Mr Chen.

Comments are closed.