Why "yield return" rather than "yield"?


This came up in another context, and I thought I’d share the story.


In the first version of iterators, you would use “yield” when you wanted to “return” an iterator value, and things worked great. So, that’s the way it was for quite a long period.


We knew at the start of the 2.0 effort that we were doing some major things to the language, and major things often require new keywords. It seemed likely that some of the generics work would lead to this.


So, we planned on how to deal with it. Now, new keywords are bad. Bad, because developers have the annoying habit of using non-keywords as identifiers (someday I’ll tell the story of the generated FORTRAN code I worked on when I was fresh out of college), and if their choice happens to align with a keyword, their used-to-be-working-perfectly code now breaks.


Which is, as they say, bad.


So, we came up with a mitigation strategy – we would provide a utility that you could run over your source code, and it would replace any identifier that had become a keyword with the escaped version. It does this by putting an “@” sign in front of it.
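
To make the escaping concrete, here is a minimal sketch of what code looks like after such a rewrite. The class and method names are invented for illustration, and “class” simply stands in for any identifier that collides with a keyword; this is not output from the actual utility.

```csharp
using System;

public static class EscapeDemo
{
    // Hypothetical example: "@class" is the ordinary identifier "class",
    // escaped with "@" so the compiler does not read it as the keyword.
    public static string Describe(string @class)
    {
        return "class = " + @class;
    }

    public static void Main()
    {
        Console.WriteLine(Describe("warrior"));
    }
}
```

The “@” is not part of the name itself; it only tells the compiler to treat the following word as an identifier.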


I expect to see heavy usage of “@” in any obfuscated C# contests…


So, anyway, we had this plan in place. But as we finished the work with generics, we found that we had managed to do it without any real keywords (see my next post for more information…).


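This meant that the only new keyword in 2.0 was “yield”. Since “yield return” could never have appeared in valid existing code, switching to it let “yield” remain an ordinary identifier everywhere else, which meant that C# 2.0 would compile all existing C# code without change.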
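
As a rough sketch of what that buys you, the following compiles with “yield” used both as a plain identifier and as part of “yield return”. The class and method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ContextualYieldDemo
{
    // "yield" is an ordinary parameter name here, as it could have been
    // in code written before iterators existed.
    public static int Twice(int yield)
    {
        return yield * 2;
    }

    // Inside an iterator, "yield" is treated specially only when it is
    // immediately followed by "return" or "break".
    public static IEnumerable<int> CountDown(int yield)
    {
        while (yield > 0)
        {
            yield return yield;
            yield--;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Twice(21));
        foreach (int n in CountDown(3))
        {
            Console.WriteLine(n);
        }
    }
}
```

Running it prints 42, then 3, 2 and 1 on separate lines.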


Which is good.

Comments (9)

  1. Peter Ritchie says:

    Eric, interesting read.  Thanks for the post.  I see your point about willy-nilly adding a new keyword and collisions with identifiers (although being forced to use non-keywords as identifiers doesn’t seem like a "habit" to me :-).

    It’s sort of the dance of an evolving language.

    One of the things C++ "mandates" (whether vendors follow it, or whether it is checked during compilation, is a different story) is that identifiers beginning with underscore (‘_’) are automatically reserved for compiler use.  One of the things that this allows is the addition of keywords that are semantically guaranteed not to collide with any existing identifiers (when they start with underscore).  I haven’t seen anything of the sort in any C# spec that I’ve read.  Little late to think about something like that though.  Any reason why it wasn’t carried over from C/C++?  Was the "@" operator expected to compensate?

  2. Gabe says:

    yield is still a keyword, it’s just context-sensitive so it’s not a reserved word. I’m just glad it’s not like some versions of ALGOL where you would have to quote every keyword:

    ‘if’ n ‘notequal’ 1 ‘then’ ‘goto’ do;

  3. @ is there primarily because .NET is a multi-language environment. It’s possible to reference a class with an identifier that matches one of your keywords, and C# needs a way to be able to write that.

    C# does not reserve identifiers that start with "_". I recall that we talked about it at one point but don’t recall anything beyond that, and I couldn’t find mention in the design notes.

    My personal feeling is that putting underscores in front of a keyword is both an ugly and a confusing thing to do to people, which is compounded in the C++ case by the use of double underscores in some cases. I understand that for a language like C++, using _ was an expedient choice – and perhaps the only real choice given the realities of lots of different vendors – but I don’t think it’s a good one.

    I also think having _ as an out makes it too easy to add features, not that I’m suggesting that that happened in the C++ case…

    So, that’s what I think, and I don’t think there has been any real cause to regret that decision so far.

    Eric

  4. Peter Ritchie says:

    Agreed, I think underscore is ugly as well.  It’s a case of being between a rock and a hard place though, with an evolving language.  You can’t please everyone when you make language additions/changes.  At least with "yield return" and "yield break", keywords existed that reasonably match the intention of the context.

    Was there a situation that suggested a new contextual keyword would not have been a better choice?  Could you not reasonably expect no other identifier would follow a "yield" keyword?

    Off the top of my head, "yield return" sort of suggests a return from GetEnumerator(); where "yield value = x" seems a better alternative, and it overrides another keyword.  Why would plain old "return" not work for "yield break"?

  5. Vince P says:

    They had to pair any new contextual keyword with an existing keyword, otherwise the same problem of breaking code would occur which is what the contextual keyword was created to solve.

  6. My "yield value = x" example would have been a breaking change (get rid of the =, in the case of "using yield = System.Int32;"); but, that doesn’t mean they *had* to use an existing keyword (contextual or not).  They just had to use a syntax that could not possibly have compiled before that.  For example, "yield iteration obj;" could never have compiled in VCS 2003 and therefore "iteration" could be added as a new contextual keyword without causing existing code to break, even if "iteration" was used as an identifier.

    They had the forethought to add the concept of contextual keywords to help avoid collisions when adding real keywords. Brilliant.

  7. Jeff M says:

    Is there a way yield return could be used to flatten a hierarchical memory structure?  As I understand, all yield statements must figure in the same method and we can not call sub-methods or do recursive calls to achieve the iteration.  Is there any work-around?  Thanks.

  8. Peter Ritchie says:

    Jeff, check out the MSDN Magazine article "Create Elegant Code With Anonymous Methods, Iterators, And Partial Classes" (http://msdn.microsoft.com/msdnmag/issues/06/00/C20/default.aspx) which shows an example of "flattening" a binary tree with recursive iterations
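
For readers following the last two comments, here is a minimal sketch of a recursive iterator that flattens a tree. It is not taken from the MSDN article; the Node type and the FlattenDemo class are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type, invented for this sketch.
public sealed class Node
{
    public readonly int Value;
    public readonly List<Node> Children = new List<Node>();

    public Node(int value)
    {
        Value = value;
    }
}

public static class FlattenDemo
{
    // Depth-first, pre-order traversal. The iterator calls itself and
    // re-yields every item produced by the nested iterator.
    public static IEnumerable<int> Flatten(Node node)
    {
        yield return node.Value;
        foreach (Node child in node.Children)
        {
            foreach (int value in Flatten(child))
            {
                yield return value;
            }
        }
    }

    public static void Main()
    {
        Node root = new Node(1);
        Node left = new Node(2);
        left.Children.Add(new Node(3));
        root.Children.Add(left);
        root.Children.Add(new Node(4));

        foreach (int value in Flatten(root))
        {
            Console.WriteLine(value);
        }
    }
}
```

Each level of recursion adds another enumerator that every item passes through, so this pattern is convenient for shallow structures but can get slow on very deep ones.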
