Iterator Blocks Part Seven: Why no anonymous iterators?


I think this annotation to a comment in part five deserves to be promoted to a post of its own.

Why do we disallow anonymous iterators? I would love to have anonymous iterator blocks.  I want to say something like:

IEnumerable<int> twoints = ()=>{ yield return x; yield return x*10; };
foreach(int i in twoints) …

It would be totally awesome to be able to build yourself a little sequence generator in-place that closed over local variables. The reason why not is straightforward: the benefits don’t outweigh the costs. The awesomeness of making sequence generators in-place is actually pretty small in the grand scheme of things and nominal methods do the job well enough in most scenarios. So the benefits are not that compelling.
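For comparison, here is a minimal sketch of the "nominal method" alternative mentioned above. The names `SequenceDemo`, `TwoInts`, and `SumTwoInts` are invented for illustration. It assumes the value the hypothetical lambda would have closed over can be passed as a parameter instead, which is not quite the same thing as capturing the variable itself.

```csharp
using System;
using System.Collections.Generic;

static class SequenceDemo
{
    // Named ("nominal") iterator method. The value that the hypothetical
    // anonymous iterator would have closed over is passed in as a parameter.
    public static IEnumerable<int> TwoInts(int x)
    {
        yield return x;
        yield return x * 10;
    }

    // Hypothetical caller: consumes the sequence with foreach, as in the post.
    public static int SumTwoInts(int x)
    {
        int total = 0;
        foreach (int i in TwoInts(x))
        {
            total += i;
        }
        return total;
    }
}
```

The cost is that the generator lives somewhere other than the place it is used, and any local state it needs has to be threaded through as arguments.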

The costs are large. Iterator rewriting is the most complicated transformation in the compiler, and anonymous method rewriting is the second most complicated. Anonymous methods can be inside other anonymous methods, and anonymous methods can be inside iterator blocks. Therefore, what we do is first we rewrite all anonymous methods so that they become methods of a closure class. This is the second-last thing the compiler does before emitting IL for a method. Once that step is done, the iterator rewriter can assume that there are no anonymous methods in the iterator block; they’ve all been rewritten already. Therefore the iterator rewriter can just concentrate on rewriting the iterator, without worrying that there might be an unrealized anonymous method in there.
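To make "methods of a closure class" concrete, here is a hand-written approximation of that kind of rewrite. This is a sketch under assumptions, not the compiler's actual output: the names `ClosureDemo`, `ScalerClosure`, `MakeScalerWithLambda`, and `MakeScalerByHand` are invented, and the real generated code uses compiler-chosen names and handles many more cases.

```csharp
using System;

static class ClosureDemo
{
    // What the programmer writes: a lambda that closes over the local "factor".
    public static Func<int, int> MakeScalerWithLambda(int factor)
    {
        return n => n * factor;
    }

    // Roughly what a closure-class rewrite looks like when written by hand.
    // All names in this class are hypothetical.
    private sealed class ScalerClosure
    {
        // The captured local is hoisted into a field.
        public int factor;

        // The lambda body becomes an ordinary instance method.
        public int Invoke(int n)
        {
            return n * factor;
        }
    }

    public static Func<int, int> MakeScalerByHand(int factor)
    {
        ScalerClosure closure = new ScalerClosure();
        closure.factor = factor;
        // Method group conversion: the delegate is bound to the closure instance.
        return closure.Invoke;
    }
}
```

After a rewrite of this shape, no lambda remains in the method body, which is the property the iterator rewriter relies on.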

Also, iterator blocks never “nest”, unlike anonymous methods. The iterator rewriter can assume that all iterator blocks are “top level”.

If anonymous methods are allowed to contain iterator blocks, then both those assumptions go out the window. You can have an iterator block that contains an anonymous method that contains an anonymous method that contains an iterator block that contains an anonymous method, and… yuck. Now we have to write a rewriting pass that can handle nested iterator blocks and nested anonymous methods at the same time, merging our two most complicated algorithms into one far more complicated algorithm. It would be really hard to design, implement, and test. We are smart enough to do so, I’m sure. We’ve got a smart team here. But we don’t want to take on that large burden for a “nice to have but not necessary” feature.

Comments (28)

  1. MikeG says:

    That just sounds scary.

    Seems perfectly reasonable why it wasn’t done.

  2. Ben Voigt [C++ MVP] says:

    What would be really cool would be for the compiler error messages to link to Eric’s blog posts.  They already link to MSDN, so adding "This error is explained on Eric Lippert’s blog" hyperlinks automatically to topics matching specially formatted blog keywords ("keyword: CSnnnn" anyone) would be really neat.  Even someone adding the links by hand would be really helpful.

  3. Eric: Thanks for devoting an entire post to my question (comment) about anonymous iterators from Part 5 🙂

  4. Goran says:

    First, I believe that Microsoft should commit itself to achieving the goal, before this decade is out, of allowing anonymous iterator blocks and having them nest safely. No single feature in this period will be more awesome to mankind, or more important for the future of c#; and none will be so difficult or expensive to accomplish. We propose to accelerate the development of the appropriate compiler. We propose to develop alternate iterator anonymous method rewriter, much smarter than any now being developed, until certain which is superior.

  5. Fahad says:

    @Eric: One more approach to get this done, you could have some Sequence expression pattern as in F#(using Monads), that allows you to wrap a IEnumerable on the return values. Of course, this is not totally applicable for C#, but that could be one way to get this done for anonymous methods? What do you say?

    Seq(()=>{yield return x;})

    On top of that, this would be a great benefit for LINQ too.

  6. Dave Sexton says:

    @Eric: "If anonymous methods are allowed to contain iterator blocks…"

    Is the cost of just discovering nested iterators and nested anonymous methods within iterators any lower?  If a restriction were to be placed on anonymous yields to prevent these compiler complexities, would this feature still be worth implementing?

    I’m thinking that in the phase when anonymous methods are rewritten, the compiler can issue an error if it detects an anonymous method within an anonymous iterator.  Likewise for the iterator rewriter phase, but detecting nested anonymous iterators instead.

    The cost of each may be smaller if nesting could be detected after the top-level closures have already been created, assuming that each phase already works recursively.  (Full disclosure: I don’t know diddly about compiler theory, so this is all probably just rubbish anyway :p)

    Sure, it would suck to not have the ability to nest anonymous iterators and methods.  And it would also be inconsistent in C# (barring the existing strangeness of iterators, as you’ve pointed out in other posts).  But I think it might still be useful.

    Either that or perhaps put something into the BCL that’s like the following?

    static class Enumerable {
      // NOTE: this is not meant to be an extension method
      public static IEnumerable<T> Iterate<T>(params T[] items)
      {
        return items.ToList().AsReadOnly();
      }
    }

  7. Morten Mertner says:

    Any thoughts on allowing anonymous types to implement interfaces? That would be a really useful feature.

  8. Daniel Pratt says:

    C# is a great language and it seems to get significantly better with each version. For the most part, I really like the direction your team has taken the language.

    If there is one criticism I would level, it is that I feel that I often bump into what seem like arbitrary limitations of a given language feature. To put it more precisely, I feel that a lot of the language features are not very isomorphic.

    The word you’re looking for is “orthogonal”, not “isomorphic”. “Orthogonal” means that two things can vary independently of each other. By analogy, “orthogonal” features are those where there are no weird interactions with other features. “Isomorphic” means “having the same shape”. — Eric

    I guess I’m trying to say that when you folks are doing your cost/benefit analysis, I hope that a lot of consideration is given to trying to make the language natural and consistent.

    Perhaps I was not sufficiently clear. The whole purpose of this series of articles was to justify why it is that sometimes we choose to make language features nonorthogonal: orthogonality is a goal, but sometimes it is too expensive. Sometimes it is better to get a feature that works well with 95% of the language. — Eric

  9. Steve Bjorg says:

    Anonymous iterators is simply a *must* do for the compiler team.  More and more async frameworks are discovering that iterators can be used to implement coroutine-like functions.  However, in the absence of anonymous iterators, it puts additional effort and complexity on the developer.  Given that we’re in the age of concurrency, there will be more need for async/concurrent/coordinated code.  Do we really want to leave this responsibility to each and every developer instead of the compiler and/or a framework?

    It may be a complex endeavor, but it’s also one that will yield many rewards!  No pun intended.

  10. I would like to add my voice to the plea for anonymous iterators. We’ve built a coroutine framework on top of yield and do a lot of iterator work. But we are basically stuck in .NET 1.0 land in many ways, since we cannot take advantage of closure capturing to simplify our code.

    In short, adding anonymous iterators would bring us folks into .NET 2.0 land and give us all the benefits of anonymous methods, and I’m sure that nobody disputes that anonymous methods were hugely beneficial!

  11. Well I’ll add my voice to the chorus of folks saying that anonymous iterators would be nice to have, but given a finite budget I would much rather see the brains at Microsoft implement deep const and isolation semantics into the CLR so that languages targeting it can then expose them.

  12. TheCPUWizard says:

    @Tom, I agree with you! [eric: I also know how difficult "const" is to properly implement]

  13. Daniel Pratt says:

    @Eric:

    >> The word you’re looking for is "orthogonal", not "isomorphic".

    Well, that’s embarrassing, but not at all out of character for the day I had. For what it’s worth, I do know what those words mean, but they’re certainly not part of my daily vernacular.

    >> The whole purpose of this series of articles was to justify…

    I have not read the whole series of posts. In any case, I imagine that much of the feedback your team receives is asking you to focus on the pragmatic aspects of the language. I just wanted to let it be known that even though this specific feature is not one I (yet) feel is absolutely necessary, I would still love to see it implemented someday for the sake of language consistency.

  14. Richard says:

    Of course you can do anonymous iterators in F#:

    let y = 10

    seq { yield y; yield 10*y } |> Seq.iter (fun x -> printfn "%A" x )

    (and you can even have "inner iterators" via "yield!".)

  15. Jay R. Wren says:

    unacceptable!

    This isn’t a “would be nice to have” feature. This feature is absolutely required for writing the most readable code possible. Code readability and thus maintainability are the number one concerns when writing. Having to hunt down some generator method somewhere because I could not write it inline reduces readability and increases maintenance cost.

    Well then, if C# does not meet your absolute requirements then you shouldn’t use it. Though it is disappointing to not delight every customer, I do not think it reasonable to expect that we’ll meet all the requirements of every person. Try F# instead. — Eric

  16. TheCPUWizard says:

    @Jay – of course this is a subjective discussion, but I have to disagree. Too many times I have found even anonymous methods to be a MAJOR source of testability issues. With very few exceptions, I have chosen to go with well named, well factored methods.

    Inlining does not [IMHO] improve readability or understandability. Well factored code where a given item performs one well defined function does improve understanding of the code, and allows a person to focus on just the specific area instead of being bound into a given context.

  17. Timwi says:

    Hi Eric. You say that the two algorithms for rewriting anonymous methods and iterator blocks would interact too much if anonymous iterators were allowed. I would be very interested in seeing a more detailed explanation of this. Why can’t the algorithm for rewriting anonymous methods simply rewrite anonymous methods irrespective of whether they contain ‘yield’ or not? In other words, does the anonymous methods rewriter really need to care whether any particular anonymous method is an iterator block? The way I see it, it doesn’t; you already have the algorithm in place to handle nested anonymous methods, so an anonymous iterator block nested inside an anonymous method doesn’t seem like a barrier.

    Once anonymous methods have been rewritten into actual (top-level) methods, the iterator-block rewriter can work its magic, and at this point does not need to care whether any particular iterator block used to be an anonymous method and whether it used to be nested.

    Thus, to me the two features seem completely orthogonal. So orthogonal, in fact, that the relevant rewriting algorithms should be orthogonal too.

    What is the detail I am missing that makes them interact in complex ways?

  18. Another limitation of iterator blocks and anonymous methods that I’d love to see a post on – or better yet, just a bug fix in the next version, because it clearly is a bug, and by the standards of what we’re talking about, a very simple one to fix.

    Why can’t you use “base” in an anonymous delegate or iterator block without getting an ‘unverifiable code’ warning?

    That’s fixed in C# 4. I regret not getting the fix into the final release of C# 3, but it was simply too dangerous given all the radical changes we’d made to the anonymous function binding code to make lambdas work. It should have made it into the service release, but there was some scheduling mixup that I don’t recall the details of, so it didn’t make it in there either. In C# 4, we do the right thing: generate a helper method for you and call it. — Eric

    I know exactly why you can’t from a *technical* perspective: because no outside class is allowed to make a non-virtual call to a base class virtual method, and the delegate or iterator is being implemented as a separate class by the compiler under the hood. But compared to the complexity of the magic that the compiler is *already* doing to generate that separate class, especially in the case of an iterator, the fix (generate a private helper method on the containing class to do the necessary base class operation) is trivial.

    I note that “relatively trivial” does not logically imply “trivial”. — Eric

    I raised this question shortly after C#2 was released and got the answer that yes, the compiler team knew that and intended to fix it but there was no time to do so prior to the C#2 release. Well, I’m now using C#3.5 which has seen a plethora of radical new features in the compiler since C#2 – but still no fix for this simple bug? Is it fixed in C#4?

    Your idea of what’s a simple bug and my idea of what’s a simple bug are perhaps rather different. A simple bug, for example, does not require me to consider whether fixing the bug will subtly change the order in which metadata is emitted, and thereby hit unperformant code paths in the unmanaged metadata emitter provided by the CLR, code paths which we work hard to avoid. A simple bug doesn’t require cross-team communication with the jitter and verifier teams to determine whether the new codegen is likely to result in any additional verification problems or hit any unexpectedly bad performing jit scenarios. Simple bugs do not require the construction of new visitor passes that have to happen in the precisely correct order related to other passes that do rewriting on “base” calls and anonymous function and iterator rewritings. Simple bugs take a few minutes of code review, not several hours poring over the semantic analyzer with multiple senior team members. In my opinion, this was not a simple bug, which is why it went unfixed for an entire version. You are probably smarter than me and find these sorts of things simple, but I do not. — Eric

    Also, will we ever get a ‘yield foreach’?

    It’s on the list, but it’s not real high on the list. I wouldn’t hold my breath waiting if I were you. Particularly since I have heard rumours that Erik’s solution for C-Omega has been shown to have certain shortcomings; apparently his transformations do not correctly handle some cases in which the nested iterator throws exceptions. Since the existence of incorrect codegen scenarios implies that fixing the algorithm is an open research problem, and since the compiler team is not really in the business of solving open research problems — that’s MSR’s department — that’s more points against the feature. I’d love to have it, but I don’t think it is likely any time soon. — Eric

  19. "Your idea of what’s a simple bug and my idea of what’s a simple bug are perhaps rather different."

    Touché. I think what I meant was that it is ‘simply’ a bug, as opposed to a language feature with benefits and downsides and tradeoffs like most of the other things that have been argued about here. In retrospect I’m not at all surprised that the implementation was bloody hard and I’m sorry for implying that it wouldn’t be.

    "Particularly since I have heard rumours that Erik’s solution for C-Omega has been shown to have certain shortcomings; apparently his transformations do not correctly handle some cases in which the nested iterator throws exceptions."

    I’d think even a really naïve approach would be worth implementing from a code readability perspective. If, in the first pass of implementation, you simply translated "yield foreach foo;" into "foreach (var x in foo) yield return x;" it’d make code nicer to read without precluding doing smarter, optimized codegen when the research problem is solved, wouldn’t it?

  20. (things I forgot in the prior post)

    "You are probably smarter than me and find these sorts of things simple, but I do not."

    More like, I have a tendency to frequently forget that the C# language team is, in fact, made up of mortals. Naturally, from the results of your work, I tend to assume you are all extreme programming gods 😉

    (quoting myself) "make code nicer to read without precluding doing smarter, optimized codegen when the research problem is solved, wouldn’t it?" – Not to mention that, presuming the research problem will eventually be solved, having a naive implementation now would mean that when that solution is found, C# can take advantage of it to produce drastically better performing binaries from *existing* code. If you wait for the solution to the research problem before adding the language feature that could take advantage of it, then everyone will need to scan their code for "foreach" loops that could benefit. Or the compiler’s optimizer will need to be a lot cleverer to identify yields in foreach blocks that could also be transformed in the same way.

  21. @Steve Bjorg:

    Eric requested in a comment a year ago (http://blogs.msdn.com/oldnewthing/archive/2008/08/15/8868267.aspx) that programmers not make "clever use" of iterators.

  22. James says:

    This feature isn't a big loss. 99% of the cases where I'd wanted to use it, new[] { a } is more appropriate… especially considering that any iterator will be a new object *anyway*.

    On the C# side, the language has essentially everything one could desire at this point (except A<T> : T, which opens up a whole world of great possibilities for generics, but…).

    *.Net* has bigger problems — for example, if a struct or array of structs has no object pointers, *I should be able to modify it at a pointer level without going to unsafe code*. This is also the appropriate way to allow access to SIMD features…

  23. ShuggyCoUk says:

    "I should be able to modify it at a pointer level without going to unsafe code"

    If you modify/access it at pointer level (as in C style) then you are being unsafe, as you have no bounds checking and can read/write anywhere. How is that not unsafe in the managed world?

    Or did you mean pointers with bounds checking (which sounds rather tricky to integrate alongside raw pointers while keeping the syntax sane)?

    I would like SIMD support, perhaps Mono's testing of the waters would be an option (I haven't personally tried it) but actually what I really want is intrinsics. If I'm running on a machine with crc32 I really want to be able to get to it without having to do a managed/unmanaged transition. Having the JIT transparently give me a software based implementation when it's not present is nice, but I'd even accept it throwing NotSupportedException.

  24. Async/Await says:

    Hi Eric,

    You have gone to the trouble of supporting anonymous async/await methods.

    Will you take the trouble to bring iterators into line with this feature? 😉

    Keep up the good work!

    James Miles

    http://enumeratethis.com

  25. tormod.jervell.steinsholt@kongsberg.com says:

    Thank you for writing this article.

    It is easy to forget the huge complexity underneath. From the face of it, it seemed trivial to just have the compiler wrap it into a private helper method just because I wasn't using any other lambda features (didn't access any variable from external scope). Your two initial points about iterator blocks and anonymous methods being the two most complicated language features, compiler wise, quickly set me straight. From proudly professing my newfound functional highground to being humble.

    Let's just take it as a positive signal that more and more laypeople are endeavouring into functional paradigms. And as a reminder that even C# will at some time be replaced.

    I hope to see you at the NDC 2011.

  26. Dave Sexton says:

    It's disappointing that VB 11 supports anonymous iterators and C# 5 doesn't, while C# has supported named iterators since C# 2.  I understand that it's not a generally useful feature that outweighs all of its associated development costs, but why promise language parity and then decide that this feature outweighs its costs for VB only?  I'd really like to use anonymous iterators in my reactive parsers library, and I'll bet I'd find lots more unrelated uses if it were available in C#.

  27. Dave Sexton says:

    Just thought of another great use for anonymous iterators: Rx Experimental contains an overload of Observable.Create that accepts an iterator block.  It provides a way to write Async-like code that generates an observable sequence (as opposed to a  scalar-valued Task.)  The fact that I have to use a named method for the iterator argument takes away from the "flow" of a reactive LINQ query.

  28. Simon says:

    Thanks for the post, that's a nice explanation. It would definitely be "nice to have" and I hope it's on the list *somewhere* (probably near the bottom), but there are many other features I'd consider more important 🙂
