Closing over the loop variable considered harmful


(This is part one of a two-part series on the loop-variable-closure problem. Part two is here.)


UPDATE: We are taking the breaking change. In C# 5, the loop variable of a foreach will be logically inside the loop, and therefore closures will close over a fresh copy of the variable each time. The “for” loop will not be changed. We return you now to our original article.


I don’t know why I haven’t blogged about this one before; this is the single most common incorrect bug report we get. That is, someone thinks they have found a bug in the compiler, but in fact the compiler is correct and their code is wrong. That’s a terrible situation for everyone; we very much wish to design a language which does not have “gotcha” features like this.

But I’m getting ahead of myself. What’s the output of this fragment?

var values = new List<int>() { 100, 110, 120 };
var funcs = new List<Func<int>>();
foreach(var v in values)
  funcs.Add( ()=>v );
foreach(var f in funcs)
  Console.WriteLine(f());

Most people expect it to be 100 / 110 / 120.  It is in fact 120 / 120 / 120. Why?

Because ()=>v means “return the current value of variable v“, not “return the value v was back when the delegate was created”. Closures close over variables, not over values. And when the methods run, clearly the last value that was assigned to v was 120, so it still has that value.

This is very confusing. The correct way to write the code is:

foreach(var v in values)
{
  var v2 = v;
  funcs.Add( ()=>v2 );
}

Now what happens? Every time we re-start the loop body, we logically create a fresh new variable v2. Each closure is closed over a different v2, which is only assigned to once, so it always keeps the correct value.

Basically, the problem arises because we specify that the foreach loop is a syntactic sugar for

  {
    IEnumerator<int> e = ((IEnumerable<int>)values).GetEnumerator();
    try
    {
      int m; // OUTSIDE THE ACTUAL LOOP
      while(e.MoveNext())
      {
        m = (int)(int)e.Current;
        funcs.Add(()=>m);
      }
    }
    finally
    {
      if (e != null) ((IDisposable)e).Dispose();
    }
  }

If we specified that the expansion was

    try
    {
      while(e.MoveNext())
      {
        int m; // INSIDE
        m = (int)(int)e.Current;
        funcs.Add(()=>m);
      }

then the code would behave as expected.

It’s compelling to consider fixing this for a hypothetical future version of C#, and I’d like to hear your feedback on whether we should do so or not. The reasons FOR making the change are clear; this is a big confusing “gotcha” that real people constantly run into, and LINQ, unfortunately, only makes it worse, because it is likely to increase the number of times a customer is going to use a closure in a loop. Also, it seems reasonable that the user of the foreach loop might think of there being a “fresh” loop variable every time, not just a fresh value in the same old variable. Since the foreach loop variable is not mutable by user code, this reinforces the idea that it is a succession of values, one per loop iteration, and not “really” the same variable over and over again. And finally, the change has no effect whatsoever on non-closure semantics. (In fact, in C# 1 the spec was not clear about whether the loop variable went inside or outside, since in a world without closures, it makes no difference.)

But that said, there are some very good reasons for not making this change.

The first reason is that obviously this would be a breaking change, and we hates them, my precious. Any developers who depend on this feature, who require the closed-over variable to contain the last value of the loop variable, would be broken. I can only hope that the number of such people is vanishingly small; this is a strange thing to depend on. Most of the time, people do not expect or depend on this behaviour.

Second, it makes the foreach syntax lexically inconsistent. Consider foreach(int x in M()) The header of the loop has two parts, a declaration int x and a collection expression, M(). The int x is to the left of the M(). Clearly the M() is not inside the body of the loop; that thing only executes once, before the loop starts. So why should something to the collection expression’s left be inside the loop? This seems inconsistent with our general rule that stuff to the left logically “happens before” stuff to the right. The declaration is lexically NOT in the body of the loop, so why should we treat it as though it were?

Third, it would make the “foreach” semantics inconsistent with “for” semantics. We have this same problem in “for” blocks, but “for” blocks are much looser about what “the loop variable” is; there can be more than one variable declared in the for loop header, it can be incremented in odd ways, and it seems implausible that people would consider each iteration of the “for” loop to contain a fresh crop of variables. When you say for(int i; i < 10; i += 1) it seems dead obvious that the “i += 1” means “increment the loop variable” and that there is one loop variable for the whole loop, not a new fresh variable “i” every time through! We certainly would not make this proposed change apply to “for” loops.

And fourth, though this is a nasty gotcha, there is an easy workaround, and tools like ReSharper detect this pattern and suggest how to fix it. We could take a page from that playbook and simply issue a compiler warning on this pattern. (Though adding new warnings brings up a whole raft of issues of its own, which I might get into in another post.) Though this is vexing, it really doesn’t bite that many people that hard, and it’s not a big deal to fix, so why go to the trouble and expense of taking a breaking change for something with an easy fix?

Design is, of course, the art of compromise in the face of many competing principles. “Eliminate gotchas” in this case directly opposes other principles like “no breaking changes”, and “be consistent with other language features”. Any thoughts you have on pros or cons of us taking this breaking change in a hypothetical future version of C# would be greatly appreciated.

(This is part one of a two-part series on the loop-variable-closure problem. Part two is here.)

 

Comments (134)

  1. Mike Van Til says:

    If you have a good understanding of lambdas and closures, this code does exactly what you would expect it to do (nothing like your foreach (var m in from m in data orderby m select m) from a few days ago).

    It seems like changing it would do far more harm than good.  I think you adequately summed up the arguments in favor of not changing it.

    In particular, if you use re-sharper (and pay attention to it) you’ll catch this right away.  Perhaps you should just buy JetBrains and integrate their product?  😛

  2. TheCPUWizard says:

    Please do NOT change this, it is 100% correct. If people do not understand "Closures close over variables, not over values" then it is a matter of training [I actually use a similar sample in my assessment screening for midlevel and senior developers]

    Remember the imortal words of Scott Meyers from nearly 20 years ago (has it really been that long?).."Do not Mollycoddle your users (programmers in this case)"

  3. jcoehoorn says:

    You should definitely do something about this.  I see it on StackOverflow all the time.

    I like the "Add a compiler warning" idea. I like it a lot, because it side-steps the breaking change and still throws something in front of the user to help avoid the mistake.

    As for the "lexical" argument, I think the stronger argument is that the declaration of the variable occurs outside of the { } block.  Consider:

    " { int x; } " versus " int x; { } ".  This code feels much more like the latter.  Since you declare the variable outside of the scope block, obviously it should be the same variable.  Changing it to something else would be deeply weird.

  4. JD says:

    Initially I was all for making this change. I haven’t ever seen a legitimate use of closing over the iteration variable (except as a demonstration of why you shouldn’t). However, I hadn’t thought about the consistency issue with for loops. Considering that, I would argue that consistency wins. Reducing the cognitive load of understanding a language is highly valuable, and having these constructs behave differently in this regard seems like a recipe for confusion.

    Definitely do make a warning for this though. Surely in upwards of 99% of cases closing over the iteration variable is not going to produce the result the programmer expects.

  5. P.Baughman says:

    Please do not change this.  The behavior is consistant with the rest of the language.  I can’t think of another case where declaring a variable actually declares a bunch of variables that appear to point to the same data but don’t.

  6. Chris B says:

    In my opinion, this should at worst be a compiler warning.  The current behavior fits exactly with the delayed execution I would expect from a lambda closure.  For example:

    void Main()
    {
       int i = 0;
       Action foo = () => Console.WriteLine(i);
       i = 6;
       Console.WriteLine(i); // 6, as expected
       foo(); // also 6, not 0
    }

    The foreach is currently behaving consistently with this and this is the behavior I would expect.  If this change was made to foreach, consistency would suggest to me that this program print out 6 and then 0.  That change would mean that lambdas closed over values, not variables which would be a tremendous breaking change.

    I take your point. But let’s be clear on this. The proposal is not to make lambdas close over values. The proposal is to make the loop variable of a foreach loop into one variable per iteration, not one variable shared between all iterations. Since a foreach loop variable is logically immutable anyway, it’s already a very weak sort of “variable”. The question is not whether lambdas close over variables; clearly they do. The question is about how many variables a foreach loop generates, one per loop, or one per loop iteration. — Eric

  7. Pavel Minaev [MSFT] says:

    I’d say change. The strongest argument I can think of is that it is inconsistent with seemingly equivalent LINQ:

      var funcs = new List<Func<int>>();

      funcs.AddRange(from i in Enumerable.Range(0, 10) select(Func<int>)(() => i));

      funcs.ForEach(f => Console.WriteLine(f()));

    This prints 0 to 9, as expected. It’s also a counter-argument to this claim:

    > So why should something to the collection expression’s left be inside the loop? This seems inconsistent with our general rule that stuff to the left logically "happens before" stuff to the right. The declaration is lexically NOT in the body of the loop, so why should we treat it as though it were?

    Because you already do just that in at least one place in the language.

    The other strong argument is that the existing behavior is deeply unintuitive. Vast majority of C# developers get it wrong the first time they use lambda inside a foreach (and many get it wrong repeatedly even after they already know how it actually works). This is a clear sign that it’s not a good approach.

    Finally, the existing behavior is inconsistent with regard to pseudo-readonly status of the range variable. For normal readonly fields, I can expect that not only I cannot write to them, but noone else can, either (except during initialization). Foreach range variable is kinda weird – it’s readonly to my code, but there’s some other magic code there that can change it. IMO, a much more straightforward definition is as a true readonly local, which it can be if it’s "auto-declared" inside the loop body.

    > Please do NOT change this, it is 100% correct. If people do not understand "Closures close over variables, not over values" then it is a matter of training

    People do understand "close over variables" just fine. They don’t understand the lifetime of the range variable.

    > I can’t think of another case where declaring a variable actually declares a bunch of variables that appear to point to the same data but don’t.

    Uh, how about any time you explicitly declare a local inside the body of the loop?

  8. P.Baughman says:

    >Uh, how about any time you explicitly declare a local inside the body of the loop?

    If it’s in the body of the loop, you expect that line of code to be executed multiple times, therefore you expect multiple allocations.  If it’s not inside the body of the loop, you expect that line to get hit once

  9. L.Kean says:

    I agree with Mike Van Til, if you understand how it all works, it only makes sense that it should work that way. I ran into this problem about a month ago, and once I discovered the source of the bug, I immediately realized what was going on and thought it made perfect sense. Although I use resharper with visual studio, at the time I was using monodevelop. If anything is done about this at all, I like the idea of a compiler warning.

  10. Pavel Minaev [MSFT] says:

    There’s one more argument that I’ve missed. It is perhaps weak, but nonetheless…

    The problem is with wording. C# foreach statement reads rather naturally:

      foreach (var item in items) …

    That’s literally "for each item in items, do …". The "each" word here strongly implies that we’re taking each _separate_ item in turn, and doing something to it.

    In contrast, with a plain for, the fact that variable is shared is also quite explicit when you "say it aloud":

      for (int i = 0; i < 10; ++i) …

    That’s "for i, initially assigned to 0, incremented by 1, until it reaches 10, do …". It’s very explicit and obvious here that there’s a single i for the whole loop, and it gets mutated on each iterator. Not so with "for each".

    For people arguing strongly against the change (and to remind, the change is _not_ "make lambda capture the value"; the change is to the scope of the range variable), I’m genuinely curious: is there any particular reason apart from backwards compatibility? Do you actually ever _relied_ on the fact that there’s only one range variable in foreach, and it changes? Can you think of any scenario in which you’d prefer the existing behavior with respect to foreach and lambdas to be the case, because it’s more convenient? If not, then what difference would it make to you either way?

  11. Chris B says:

    @Eric,

    I think I poorly phrased a portion of my response.  That is what I get for hurriedly writing responses while at work! Anyway, I really intended to convey that if the foreach behaved as proposed, it would "feel" as if the foreach were closing over the values instead of the variables, which could lead one to expect similar behavior out of the sample program.  I certainly did not intend to suggest that you were proposing that lambdas would close over values.

  12. TheCPUWizard says:

    @Pavel – [IMHO] programmers should realize that:

         foreach(var x in y)

    should be "read" as:

         for(var x assign each element from y)

  13. Pavel Minaev [MSFT] says:

    > If it’s in the body of the loop, you expect that line of code to be executed multiple times, therefore you expect multiple allocations.  If it’s not inside the body of the loop, you expect that line to get hit once

    Well, conditions inside while and for loops are not inside the body of the loop, either:

      while (GetFoo() != 0) …

      for (int i = 0; i < 10; ++i)

    but you expect "that line" – specifically, the call to GetFoo(), and ++i – to be hit repeatedly nonetheless.

  14. Pavel Minaev [MSFT] says:

    > "programmers should realize that"

    Why _should_ they?

    I understand what you probably mean – that someone coding in C# should know this particular thing about the language. But we aren’t talking about knowing what’s there, but rather about the "perfect" state of affairs, and then I think that it’s best to stick to most intuitive interpretation of the keyword. It doesn’t say "assign" anywhere, or "=", or anything else to imply assignment.

    In any case, the strongest argument is still the existing mismatch in lifetime of range variable between loop "foreach (var x in xs)" and LINQ "from x in xs". Everything else is just the lack of icing on the cake that isn’t there.

  15. Random832 says:

    In Javascript, I generally resort to a ‘double closure’ – this is particularly needed in JS as nested functions are the only way to create a new scope block.

    funcs.Add((function(v2){return function(){return v2};})(v));

    translated to mostly equivalent [except i’d have to know the type of v, using int here] C#

    funcs.Add(((Func<int,Func<int>>)((v2) => () => v2))(v));

  16. DRBlaise says:

    I have to agree with Pavel on all his responses, especially the one about the plain meaning of foreach.  It would be fantistic and best if foreach actually meant "for each item in the collection."  I thought Eric’s second reason was particularly weak as the foreach statement already has the problem of the right side being evaluated before the left as pointed out in the last post about "foreach (var m in from m in data orderby m select m)".

    That being said, I believe Eric’s reason number 1 "breaking change" out ways everything else.  The train has already left the stop, it’s too late to get on board.

  17. TheCPUWizard says:

    If the lifetime of the range variable in foreach was to be changed, then it would not only conflict with a simple "for" it would conflict with the meaning for foreach in other environments.

    IF consistancy was to be the prime driver [I do NOT believe it should be] then LINQ as the new-comer to the block would be the one to change [I DONT think it should be changed]

    Consider code:

       for (var x=C.First(); x<=C.Last(); x=C.Next) {}

    and

       foreach (var x in C) {}

    If these behaved differently many more [IMHO] people would be confused.

    ps: I DO expect people to study and learn a language over a period of time [months to years], I am not a believer in the idea of picking up a programming language in "21 days" [even though I have been the Technical Editor of such books <grin>]. In general I would expect a person to have about 3 months [ about 225 hours] of study/usage of C# before they ever see LINQ or Lambdas….

  18. Phil Koop says:

    It seems to me that the responses that run "this is obvious to anyone who understands lambdas and closures" have missed the point. Anyone who understands lambdas and closures can state the behavior of the "desweetened" (soured?) code at a glance, but few people correctly predict this code. I think that Pavel Minaev has divined the reason correctly: the "each" in foreach makes a stronger impression than the "for", and to most people it suggests a per-instance lifetime.

    I am tempted to remind you about that hobgoblin, a foolish consistency, but perhaps you simply can’t bear to change the lifetime of foreach variables. I also understand that you must avoid head-to-head competition with partners like Resharper. But perhaps, in cases where the specification defines one syntactic construct as a sweetener for another, it would be permissible to allow Visual Studio to mark such sugar and display the true structure on demand.

  19. Ali TAKAVCI says:

    i don’t believe it really does matter,

    i think in both of situations one should  (i mean not intuitively, but  by knowledge) know what he is using, and how it behaves.

    but…

    In the ‘change’ case you’ll have allowed someone (most people by most of your sayings) code and it works fine, but they don’t know what is under the hood…  and that’s what i’ll never appreciate.

    if you say how ‘foreach’ reads naturally,  (i dont want to be arguing readibility) than i remind its a formal language .

    if you stick on with "i suppose this work by this way"ers to change this,,, uueh , that s another thing.

  20. Thomas Levesque says:

    IMHO the current behavior is fine, although a bit confusing the first time you realize it. However a compiler warning would be nice…

  21. Joren says:

    The foreach variable and the range variable in a from clause are conceptually nearly identical, so I think they should be as similar in behaviour as possible. Declaring the foreach variable inside the generated while loop would be a great step in that direction.

  22. P.Baughman says:

    Have you considered the possibility that the people who submitted these bug reports:

    A) don’t understand lambda expressions

    instead of

    B) don’t understand foreach

    No, you’re right. Usually its a failure to understand that lambdas close over variables, not values. But who cares? The point is that people are writing broken code because the language semantics do not match people’s intuition in some way. My goal is not to write a language which punishes people for failing to understand the nuances of closure semantics; my goal is to write a language that naturally encourages writing correct code, even when the developer has a slightly incorrect mental model. Believe me, plenty of developers have incorrect mental models of how C# works in almost every respect. A language that can *only* be correctly used by experts isn’t what we’re after here. — Eric

  23. Pavel Minaev [MSFT] says:

    > Have you considered the possibility that the people who submitted these bug reports:

    > A) don’t understand lambda expressions

    > instead of

    > B) don’t understand foreach

    If people were simply misunderstanding lambdas, we’d see a similar amount of confusion when capturing for-loop counters, but it doesn’t seem to be the case (at least judging by the number of SO questions). In fact, I’ve yet to see anyone being confused about for-loops in a similar case; it’s invariably foreach only.

  24. TheCPUWizard says:

    Pavel – I believe that is because most people who are using for-loop are not using lambdas and that this accounts for the skew.

    If you don’t mind please contact me privately to discuss further…

    My contact info is in my MVP Profiile…

    https://mvp.support.microsoft.com/profile=9CAAC837-1258-4B5F-B6D0-7D9CD800EBBB

  25. I’m strongly in favor of making this change. The main reason is not so much that the current behavior is unintuitive – which it is, even when you understand the reasons for it – it’s that it’s unUSEFUL.

    If there were two possible things you might *want* to do – even if one of them was more common than the other – the argument against breaking changes would clearly be a reason not to ever consider making the change, because you never want to break working code.

    But as it stands, the current behavior, while "consistent" in some senses (and I really don’t buy most of the consistency arguments; the variable is readonly so it’s inconsistent that it changes, too), is useful for nothing at all. C# had the good sense to not keep the C behavior where loop variables are still in scope after the loop is finished and holding their last value around – because it’s confusing and not useful and leads to bugs. This is similar.

    I suggest that if you aren’t going to actually change the scope of the loop variable in this case, I’d go one stronger than a warning and make attempts to close over a foreach loop variable actually an error. Someone who wants the weird current behavior for some reason can still get it:

    int sharedValue;

    foreach (var value in values) {

     sharedValue = value;

     doSomething(() => sharedValue);

    }

    Basically, making it an error *forces* the programmer to disambiguate. Making it work the "expected" way would be preferable, as far as I’m concerned, but if there’s too much resistance to that, at least require the programmer to spell out what they want if they want something so deeply counterintuitive.

  26. Eoin Coffey says:

    I would have to agree with not changing the current behavior.

    The current behavior also matches what happens when you attempt the same thing in javascript:

    var a = [1, 2,3,4]

    var funcs = []

    for (var i in a) {

        funcs.push(function() { return i; });

    }

    for (var f in funcs) {

      console.log( f() );

    }

    will print out "4 4 4 4".

    Its helpful to think as lambdas and closures working on a semantic scope;  A method is a natural definition for semantic scope (I’m just kind of winging that definition; I am not a compiler guy), so in JS the preferred solution is to define another lambda that closes around what you need like this:

    for (var i in a) {

        (function(x) {

             funcs.push(function() { return x; });

        })(i);

    }

    –OR–

    for (var i in a) {

       funcs.push(function(x) {

          return function() { return x; };

      }(i));

    }

    Fiirst class functions are a beautiful thing. 🙂

  27. Pavel Minaev [MSFT] says:

    > The current behavior also matches what happens when you attempt the same thing in javascript … a method is a natural definition for semantic scope

    Not in any C-derived language with the notable exception of JavaScript. Natural definition of scope was always block – {} – and never the entire function. In fact, it seems to be one of the most hated things about JS:

       var x = 1;

       if (…) {

         var x = 2; // no error

       }

       // x is now 2

    It’s bad enough that they’ve added "let(x=1) {…}" specifically to support better scoping in JS1.7 (https://developer.mozilla.org/en/New_in_javascript_1.7#let_statement).

  28. SS says:

    Avoid writing code like this at all, regardless of how C# handles it. But if you must write code like this to impress your friends, C# appears to be handling things correctly as is.

  29. Adam Azarchs says:

    Parallel.ForEach in c# 4.0 behaves like the proposed change to foreach, not like the current implementation.  It also somewhat breaks the scoping for the loop variable.

    In fact, implementing my own Parallel.ForEach (because I’m impatient) was the first time I ran across this unexpected behavior.  The behavior so surprised my coworker that he bet me that wasn’t it.  This seems like an argument for changing the behavior, however it’s also the kind of thing that if you’ve seen it once, you’ll learn your lesson and not repeat the mistake – and get a better understanding of closure semantics in the process – so I’d favor a warning over any kind of breaking behavior change.

  30. Dean Harding says:

    And *I* think we should paint the bike shed cornflower blue!

  31. John Melville says:

    Initially I wanted the warning, but now I think .. just change it.

    My reason has nothing to do with consistency or the backward change.

    Apparently iterating through a collection and making a bunch of lambdas is common.  (Otherwise the bug would not come up.)  Everyone seems to agree that what is wanted is not what happens.  If the syntax exists and the desired semantic is clear, just do what I want and be over it.  Why should I have to clutter up my code with an extra local variable just to show you that I want what you already know I wanted?  

    I like the semantic idea that a foreach loop runs once on each element of a collection, not on a variable that changes with each iteration.  This is very different from the semantic of a for loop in which you explicitly create and increment a single variable.  Consistency is not always a virtue — why should dissimilar ideas be forced to be consistent?

    I have another question about backwards compatability.  I suspect that most of the code using this pattern right now is buggy code that the bugs haven’t been found yet.  It is clear that backwards compatability requires correct code to continue tp work correctly — does it require buggy code that happened to work by chance to continue to do so?

  32. JudahGabriel says:

    I vote for a compiler warning at least, or a breaking change to the language at most. This bug has biten some of our devs, who are by no means C# rookies, but at least mid-level devs.

    A compiler warnings seems like good fit to this problem.

  33. TheCPUWizard says:

    @Stuart – FYI: C++ for loop variables existing past the scope of the loop has NEVER been part of the specification. It was a common bug in many compilers [including VC++ upto and including 6.0].

  34. DRBlaise says:

    After reading the additional responses, I like Stuart Ballard’s answer the best.  Instead of making a WARNING message or "fixing" the problem, "make attempts to close over a foreach loop variable actually an error."  This would make people fix buggy code and make a program "breaking change" into a compile time breaking change.  If the error is well documented it should get rid of all the "incorrect bug reports", force the fixing of a potential bug, and educate the developer at the same time.  Sounds like a WIN-WIN-WIN to me.

    By the way, great post and fantastic responses all around.

  35. Joe White says:

    I’m with Pavel: the "each" is stronger. The natural expectation, to someone actually reading the code (as opposed to a commenter trying to be pedantic about the expansion), is that the item variable should be immutable and scoped to that loop iteration. I was shocked the first time ReSharper warned me about this (and it was right, I had exactly the bug it was warning me about). And when I tell ReSharper to use a different color to highlight mutable variables, every foreach variable gets highlighted. This is not reasonable behavior, especially as C# borrows more and more from functional languages where immutability is the norm.

    As to your arguments about the syntax suggesting that the variable declaration happens first… that just means the syntax is wrong. To steal from another language I’m familiar with, Ruby chose a syntax that doesn’t suffer from that ambiguity:

    collection.each do |item|

     …

    end

    I.e., call the collection’s "each" method and pass it a lambda. I know it’s a little late to rework the syntax of C#’s foreach keyword, but if LINQ shipped an .Each or .ForEach(Action<T>) method, then C# could use the exact same pattern, with the same clarity:

    collection.ForEach(item => {

     …

    });

    The extra parentheses around the lambda are yucky, but this would have the desired scoping; and the scoping would be clear at a glance, even to someone unfamiliar with the esoteric details of the expansion.

    If the performance overhead of the extra method call was a concern, the compiler could always look for this pattern and inline the same codegen as foreach does today (but with the right variable lifetime). I’m told that’s what most Smalltalks do with their "if" syntax — syntactically it looks a method call that takes two closures, but in real life it’s optimized to use native branch instructions just like you’d expect in any other language.

    My vote is to add an .Each or .ForEach method to LINQ, and then add a compiler warning for this pattern. (If it were me, I’d make sure the text of the warning recommends that you use .Each instead for clarity. But that’s just me.)

  36. Blake says:

    +1 on adding a compiler warning, but absolutely not changing the current behaviour.

  37. configurator says:

    I’m in favor of making the change.

    Here:

    foreach (int x in GetNumbers()) { … }

    Is x assigned before GetNumbers is called? Of course not. So something on the left doesn’t necessarily happen before something on the right. Now, if I were to exapnd it manually (forget error checking and dispose for now) I’d do:

    var iterator = GetNumbers().GetEnumerator();

    while (iterator.MoveNext()) {

       int x = iterator.Current;

       …

    }

    Most people I believe would do the same thing. I’ve commented asking why the compiler does this weirdly on your previous post, and your response was (I’m paraphrasing): It only makes a difference in closures, and even than it’s for the worse.

    I say make the change. But maybe someone is using that oddity on purpose? I have thought of a theoretical reason to do that, although I’d do this completely differently:

    bool first = true;

    foreach (int x in GetNumbers()) { // returns numbers 1..Max

       if (first) {

           UpdateProgress(x);

           first = false;

       }

       DoLongOperation(x);

    }

    void UpdateProgress(int x) {

       while (x < Max) {

           UpdateProgressBarInUI(x);

           Thread.Sleep(500);

       }

    }

  38. Jon Skeet says:

    I believe it should have originally been specified in the suggested new way – but I’m slightly opposed to it changing now. In terms of my "pro" vote, I vehemently concur with Pavel about the implication of the "each" part. The way it reads implies a new variable every time – and this is reinforced by the fact that the variable is readonly within the body of the loop. It’s readonly, and yet the same variable gets a new value each time through the loop? Weird! It’s far easier to conceive of a *new* readonly variable for each iteration of the loop.

    However, my "con" hat is the backward compatibility argument – not for breaking current code, but for making future code break under current compilers. Stay with me, it makes sense 🙂

    Currently I’m building Noda Time, and we have a Visual Studio solution for both VS2008 and VS2010. We’re not using any C# 4 features, but it’s nice to be able to build in both. If someone *does* try to use the new C# 4 features, their code will fail to compile in VS2008 – I *think* that even the overload resolution issue (where variance introduces makes some previously invalid candidates valid) won’t be an issue so long as we target .NET 3.5. (It may cause problems if we have a configuration targeting .NET 4.0 for the purpose of Code Contracts, but I think it’s unlikely to actually hit us.)

    Now fast forward a couple of years, and suppose we have a similar situation with VS2010 and VS2012 – but C# 5 has made a change to the scope of foreach iteration variables. A developer writes perfectly valid C# 5 – which is also perfectly valid C# 4, but which has the wrong semantics. No compiler warnings or errors – at best a ReSharper warning which could easily go unnoticed (as the developer with VS2010 may not even be looking at that code). At that point, you’d better hope your unit tests spot the difference…

    One possible solution to this issue is to make closing over a foreach variable (in a way which lets it escape from the loop; if the delegate never leaves the loop, it’s fine – how you detect that is tricky though!) issue a warning in C# 4, and then change the behaviour for C# 5. That way you’d have to be in a situation where two people are compiling the same code, one in VS2012 and one in VS2008, to see a silent change in behaviour.

    I hope this makes sense – I haven’t had my first coffee of the day yet – but it does mean that the backward compatibility issue is worse than I had originally imagined.

  39. Frank Bakker says:

    There are situations were closing over a loop variable does not create ‘unexpected’ results in eighter the current or the proposed situation, like in:

    foreach(var item in list)

    {

       Console.WriteLine(otherlist.where(i => i.name == item).Count());

    }

    As long as the lamba is not executed outside of the current loop itteration there is not really a problem. I think this is a pretty common and valid scenarion that should not yield a warning. problem is ofcourse that for the compiler it will be harder to detect if the lamda ‘surives’ the loop body.

  40. Mihailik says:

    Can you just ask Anders? The answer seems clear, just wondering if C# losing its momentum in conversations abstracting out into account its core metaphors and principles.

    Just come back to English: ‘for each apple x in the basket’. *Each* x, come on!

    I think the designers of C# should feel more contact with the ground, do not leak simplicity and original C# spirit in favour of obscure synthetic ideas.

  41. Joe says:

    Since both sides (language purism vs. practical application) have compelling arguments, why not allow for both scenarios?

    SomeType x;

    foreach(x in collection) { … }

    foreach(SomeType y in collection) { … }

    If you want the behavior that the iteration variable should be declared exactly once, then you have to pay the price of stating that intent explicitly and live with the fact that your variable is declared in the containing scope.

    If you want to iterate through a collection and have it "just work," then you use the second syntax, which would declare the variable in the loop body since you didn’t take the extra step of declaring it yourself elsewhere.  This aligns well with the code you would have otherwise written by hand:

    for(int index = 0; index < upperBound; index++)

    {

     SomeType y = collection.ItemAt(index);

     // do stuff with y and never touch index or upperBound

    }

    using(IEnumerator e = collection.GetEnumerator())

    {

     while(e.MoveNext())

     {

       SomeType y = (SomeType)e.Current;

       // do stuff with y and never touch e

     }

    }

  42. Adam Maras says:

    This might be opening an entirely new can of worms… but what if we added some sort of syntax to the foreach construct that specified whether the range variable is declared outside (like it is currently) or inside (like the proposed change) the body of the loop. Consider:

    var values = new List<int>() { 100, 110, 120 };
    var outerFuncs = new List<Func<int>>();
    var innerFuncs = new List<Func<int>>();
    foreach(var v in values)
    {
       // default behavior (variable declared outside loop body)
       outerFuncs.Add( ()=>v );
    }

    foreach(var v in values) inner
    {
       // new behavior (variable declared inside loop body)
       innerFuncs.Add( ()=>v );
    }

    foreach(var f in outerFuncs)
    {
       Console.WriteLine(f());
    } // 120 120 120

    foreach(var f in innerFuncs)
    {
       Console.WriteLine(f());
    } // 100 110 120

  43. Mihailik says:

    Also note, that the current implementation DEMANDS use of variable names such as ‘v1’.

    I am sure there’s a lot of people who will argue ‘v1’ is an excellent and natural thing to have.

  44. Alex Humphrey says:

    My immediate thought was to leave it as it, and avoid the breaking change, despite the fact that I view the variable declared in the foreach as immutable.

    However, I tested the following.

    var funcs = new List<int>() { 100, 110, 120 }.Select(v => () => v).ToList();
    foreach (var f in funcs)
    {
       Console.WriteLine(f());
    }

    And found that it prints 100, 110, 120. I don’t know much about the internals of the Select extension method, but I would have thought it used something similar to foreach. I guess it uses enumerators somewhat more explicitly with a new variable declared for each item.

    Your code is the moral equivalent of:

    foreach(int x in intList) funcs.Add(MakeMeAFunc(x));

    Func<int> MakeMeAFunc(int y) { return ()=>y; }

    See why that works? Every time through the loop you create a new variable y by calling MakeMeAFunc. The closure is closed over the parameter, and the parameter is a fresh new variable on each invocation. (Imagine if it wasn’t; recursive functions would be nigh-impossible!) — Eric

    To be honest, knowing what happens with foreach, I really didn’t know what to expect from the above code. I’m glad it did what I thought it should do though.

  45. Ben Lings says:

    My preference would be to allow ‘val’ as an alternative to ‘var’ for an implicitly typed local *value*. When used in a foreach loop this would cause the delegates to capture the value of the iteration variable.

    Admittedly this is a much larger change.

  46. aaron says:

    "Closures close over variables, not over values".  It’s not a matter of variables vs. values, it’s a matter of scope.  Placing this text in bold will not help.

    IIRC, C# _forbids_ the programmer from modifying this variable.  It’s no wonder programmers are surprised by this behavior — one moment the variable is immutable, and the next moment the language pulls the rug out from beneath you and rebinds the unrebindable variable.

    Perhaps you should put something in bold to that effect.

  47. aaron says:

    "Second, it makes the foreach syntax lexically inconsistent".  Lexical scoping isn’t the concern, but rather it’s a matter of dynamic extent — the same storage cell gets re-used in the next loop iteration, getting reassigned.

    This "inconsistency" the present semantics is trying to avoid already exists elsewhere in the language.  Consider "public void foo(int x) { … }" — in this case, int x’s lifetime is bound to the scope of the _body_ of the function, and does _not_ get re-used the next function invocation.  If you were striving for consistency, the other proposed behavior has more in common with function argument dynamic extent than the present behavior does.

  48. Steve Strong says:

    I’m completely in agreement with Jon Skeet here – make the change (the current behaviour almost inevitably isn’t what folk want their code to do), but make it in some future version.  For the next release, just emit a warning and indicate that this behaviour is to change.  The scenario that Jon gives is a very real one, and having identical code have different semantics in two adjacent versions of the compiler sounds like a bad idea.

  49. Konrad says:

    I much prefer Java’s rule here: only allow closing over variables which are declared `final`. But then, this only works because Java allows local variables to be declared `final`, which C# doesn’t.

    That said, I am all *for* changing this. You’ve already given the rationale: The current behaviour creates a lot of bugs. And from a programmer’s point of view, the loop variable is actually inside immutable inside the loop, and it’s only valid inside the loop. It is extremely unintuitive that this behaviour is not really backed by the language. Consistency and logic would dictate that the loop variable is indeed local to the loop body, immutable, and re-declared/initialized for every round of the loop.

    That this is currently handled differently is – for me – only a historical accident.

    To address your objections to making this change:

    * First, I don’t think that’s an issue. Just *assuming* that it might be is a wild guess. Yes, it’s erring on the safe side but it’s not a basis for an informed decision.

    * Second, I disagree that this is inconsistent. As I’ve said above, I’ve always considered the loop variable to pertain to the loop body. Yes, there are two parts in the head, one which belongs to the loop body and one which doesn’t. But I honestly don’t see a problem in this.

    * Third, “certainly would not make this proposed change apply to "for" loops.” – Whyever not? The `+= 1` in your example doesn’t change the value of `i`, it creates a *new* value and binds it to the symbol of `i`. But `i` has value semantics, right? So I expect the lambda to access the current *value* of `i`, not a reference to it.

  50. Konrad says:

    Duh, I’t like to clarify a point which was incorrect from my previous posting:

    Of course the lambda shouldn’t close over the *value* if a loop variable `i` because that’s precisely what lambdas don’t do. Rather, my point should have been that classical `for` loops are just a big problem when it comes to lambdas and their inevitable functional mindset. A variable which is “recycled” in this manner just doesn’t sit well with functional programming. It would be much cleaner if the conventional `for` loop actually *did* create a new *variable* and initialize it with the value of the old one – plus one.

    So, all things considered, the behaviour of the classical `for` loop probably shouldn’t be changed after all. Much better would be to deprecate the old-style `for` loop (which of course won’t happen) – Python demonstrates that one can live very well without one, even in an imperative language (`foreach (var i in Enumerable.Range(1, 10)` …).

  51. My vote would be to make the change and to break consistency between for and foreach in this matter.

    In any case, I don’t think for is really all that similar; for iterates "over" an indexer, and logically increments it all the time; whereas foreach hides the indexer (i.e., the enumerator) and shows you only the indexed _value_.

    In short, the index in a typicial for loop is not at all the same as the value in a typical foreach loop.

    Another reason to make the change is that the "correct" version currently looks like duplicated code – and correct code should look correct if at all possible.

  52. Austin Donnelly says:

    Issue a warning in C#4, and change it in C#5.

  53. ShuggyCoUk says:

    ++ to compiler warning. I understand the logic fully but I’ve still accidentally done it myself.

  54. Please don’t change it! This forms the basis of an interview question I give to developers. If you change it I’d have to come up with a new question and I’m very lazy…

    Do you find that this is a good interview question? It seems like an odd corner case; I wouldn’t expect most developers to know about this off the top of their heads, though I would expect them to be able to figure out what was going on eventually. — Eric

    A warning along the lines of ReSharper’s “access to modified closure” would be good though.

  55. fred says:

    What about nonbreaking change:

    foreach (new var iten in sequence) { /* … */ }

    ———

    Excerpt from MSDN "from clause (C# Reference)"

    … The range variable is like an iteration variable in a foreach statement except for one very important difference:

    a range variable never actually stores data from the source. It just a syntactic convenience that enables the query

    to describe what will occur when the query is executed. …

    Therefore closures close not over range variables but over items in the sequence.

  56. Filini says:

    I would avoid a breaking change like this, because the behavior change would be so subtle that developers will spend days before understanding why their code suddenly stopped to work.

    Think about this, you have 3 types of developers who are using this scenario:

    1. I wrote my code, I rely on the fact that var v is always the same, I am happy

    2. I wrote my code, it works, by pure luck, because var v is always the same, I am happy

    3. I wrote my code, it doesn’t work because var v is not new, I research/fix my code, I am happy (could be happier if foreach worked like I want, but still…)

    The breaking change would make developer 2 really, really, really unhappy 🙂

    Have you considered adding a new syntax? Something like this, that would be translated in the alternative code with the declaration inside the loop:

    foreachnew(var v in M())

    foreach(new v in M())

    foreach(new var v in M())

  57. Jeff Yates says:

    I’m with Jon Skeet on this one. It is the way it is regardless of how it should be – changing it now would incur bigger problems. Perhaps instead, slightly new syntax could be used as in:

    foreach (var v from M())
    {
    }

    Note the “from” instead of “in”. It’s a subtle change but maybe that allows the old and the new to sit side-by-side.

  58. André Vianna says:

    I only disagree with the second statement about the lexical inconsistency that the change would cause in the “foreach” command. The terms of the foreach command are stronger that the “left to right precedence” concept.

    I agree with some of the comments above that says that the "each" and the "in" terms in the command implies that the scope of the variable should occur within the enumeration loop.

    The code below makes more sense if you consider that each m is in the enumeration:

         while(e.MoveNext())

         {

           int m; // INSIDE

           m = (int)(int)e.Current;

           …

         }

    It’s not a single variable. It’s several, each one occurring inside the loop.

    So I’m in favor of the change.

    For those who need to maintain the previous behavior I propose a new command:

    iterate({variable} through {collection})

    that command would make clear that a single variable would iterate through the whole collection receiving it’s value.

  59. Rjae says:

    Delegates Are Easy, Closure Requires Craftsmanship.

    If this is true, and I think it is, a compiler or tool warning leaves full control in the hands of your customers. I completely sympathize with the reality that the overwhelming bugged bug report is based on misunderstanding of closure. Warnings seem to be the best you can do in this situation.

    Perhaps mentioned earlier (no time to read every single comment), this "breaking change" would have to be propagated to other "foreach" iterators, no? For example rewriting the code above using List<int>.ForEach would have to work the new way as well.

    I have no fear that you guys would introduce the change properly – and I would learn the special case – but it seems like a lesser option than the warning approach.

  60. tobi says:

    this should be fixed but just cannot be. i guess 1000s of people depend on this without knowing.

  61. Kevin says:

    I agree with Mike Van Til and the others stating the same thing.  Once you read it and think about what is going on, it makes perfect sense.  I would really prefer not to change it as that would break previous code, and make inconsistencies in how things function.  

  62. brad dunbar says:

    I love the fact that you’re looking at pain points and attempting to eliminate them.  I have found this gotcha in many other languages and usually people give the standard answer, "That’s the way it is."  I think new language constructs can deal with this rather well.  For instance, I think the ForEach method is much clearer and does what the user expects.  For instance:

    values.ForEach(v => funcs.add( () => v));

    funcs.ForEach(f => Console.WriteLine(f()));

  63. "…issue a warning in C# 4, and then change the behaviour for C# 5. That way you’d have to be in a situation where two people are compiling the same code, one in VS2012 and one in VS2008, to see a silent change in behaviour."

    Outside the corporate world you might upgrade your toolchain as every new release comes out, but inside it things are far more glacial. In the C++ world there are plenty of systems still based around Visual C++ 6.0 (1998!), and it’s not uncommon for teams to jump two or 3 versions when push finally comes to shove (just as many businesses leapt from Windows 95 to Windows XP will leap to Windows 7 bypassing Vista).

    Hopefully C# based development hasn’t plummeted to these depths just yet, but as it matures this will start to become an issue. C#3 is already providing plenty of bang-for-your-buck though, so C#4 and C#5 have a considerable amount work to do to keep the upgrade momentum going. Fortunately C#4 is looking enticing enough to do just that 🙂

  64. NickLarsen says:

    This is a weak point, but I distinctly remember my thinking before I web searched to understand why this was acting the way it does.

    IMO, the debugger helps people believe (like me when I ran into this the first time) that there variable inside the foreach loop is new each time because it highlights the type on each iteration.  In

    foreach(var value in values)  { doSomething(value) }

    it highlights the "in" first, and the "var value" next helping me think that there was a new value variable declared on each iteration.  Had it just highlighted "value in values" instead I might have realized there was only one variable being assigned..

  65. Alex Humphrey says:

    How about looking at a more general issue – closing over any mutable value.

    Although the foreach mistake is easier to make – even for developers with good knowledge of the workings of closures, I and others still easily make the mistake when closing over, say, the mutable member variable of a class instance.

    Detecting whether or not a variable referenced in a closure is mutable would mean providing a way of declaring any variable as immutable, but I think that would be a nice language feature anyway.

    We are considering language features that let you enforce constraints on regions where mutations can be legal/visible; being able to make specific variables more immutable would be a part of that. But these are very long-term research-y ideas. — Eric

    However, if I do close over a mutable variable, I’m not sure what should happen. Should it be illegal to close over a mutable variable? (you could always close over an immutable container for a mutable value, much like F#’s ref type – that would probably prove you knew what you were doing.)

    However, that might be way too harsh, and would break a hell of a lot of code. Maybe force a syntax that acknowledges you are closing over a mutable variable? Instead of => use ==> or something? That still doesn’t feel correct to me.

    Any opinions?

  66. My vote: make the change.

    Which change: the concept of foreach from <changing a read-only variable> to <multiple variables with the same name>

    I seriously doubt there is any (working) code that depends on lambdas binding to a foreach variable. The only possible usage I can think of is to (repeatedly) bind to the variable and then have some break condition.

    Initially, I was opposed to the change ("closures close over variables, not values" is just one of the learning points of C#, and every language has them no matter how well-designed). However, I have to agree with the "each" wording argument. It just seems more natural to work the other way.

  67. Steve Bjorg says:

    Make the compiler spit out a warning.  I see this way too often during code reviews.  It’s too easy to miss.  Then anyone can compile with warnings-as-errors and easily catch them… though I like the idea of MS buying Jetbrains too. 🙂  ReSharper is a great product!

  68. setc says:

    This more like a programmer issue than a compiler issue. I would rather vote for the warning from a compiler than changing it. Good experienced programmer would catch it, but again the warnings should be enough for someone to pay attention to and fix it. The only danger is that, most ignore the warnings and expect the result as it seems from the code.

    Still i think compiler warnings is the way to go for now. Please don’t change the way it works.

    Thanks,

  69. Don’t change this. Javascript handles its closures the exact same way, and that is the proper way.

    Again, just to be clear, I am not proposing that we change how closures are handled. Closures close over variables, great. What I’m proposing is that we change the loop variable of the foreach loop to go logically insidethe body of the loop. — Eric

  70. DRBlaise says:

    After reading even more great comments, my opinion has changed yet again.  I definitely don’t like the idea of a warning message.  After Frank Bakker’s comment, I realized there would be too many valid examples of "closure over a foreach loop variable" where the closure never leaves the foreach loop.  As some others have suggested I think new C# context keywords are the best answer. I suggest 3 versions of foreach:

    foreach (var item in collection) — error on closure of item (this is now deprecated old-style)

    foreach (var item inside collection) — item is inside body and is instantiated once per loop (closure allowed)

    foreach (var item outside collection) — item is outside body and is instantiated once (closure allowed)

    There would need to be an easy way to convert older "in" versions of foreach to a new (inside/outside) version with the default being inside.  However the older "in" version of foreach would still be accepted except when using closure.

  71. Peter Seibel says:

    I find it interesting that some people claim that "good understanding of lambdas and closures, this code does exactly what you would expect it to do." The folks who wrote the Common Lisp standard, who presumably understood both those things, still saw fit to include in the documentation of DOTIMES and related looping constructs: "It is implementation-dependent whether dotimes establishes a new binding of var on each iteration or whether it establishes a binding for var once at the beginning and then assigns it on any subsequent iterations." Which is of course the question you face. They chose not to choose because they wanted to leave implementations some latitude in how they compiled these constructs. But if you expect people to frequently close over the loop variable you might choose to choose: say whether it’s one binding or one per iteration. But it doesn’t seem to me that the choice is a logical consequence of the nature of closures.

  72. Fj_ says:

    Make it an error in C# 4.0 (and provide an explanation how to achieve _both_ behaviors)

    Make it behave as expected in C# 5.0

    Silently going from one behavior to another is bad. Doing it this way gives a hope that there wouldn’t be any single person in existence who not only depends on weird behavior, but also  would manage to jump from exactly C# 3.0 over 4.0 to 5.0 or further.

    By the way, "int item; foreach (item in data) …" is an error already. So much for `for` compatibility.

  73. David Wilcock says:

    +1 for warning message (which could always be turned off with #pragma warning of course). it is clearly a problem but a breaking change is a more of a problem.

  74. P.Baughman says:

    >No, you’re right. Usually its a failure to understand that lambdas close over variables, not values. But who cares

    I would say that I care because you’re proposing to fix a symptom instead of a problem which leaves you with a sort of patchwork of fixes that aren’t worth the 100 points to get you back into positive territory and certainly not enough points to cancel out whatever the penalty is for introducing a breaking change.  There’s also a "give a mouse a cookie" argument in here somewhere too.  Once you fix this, then something else will be the thing that causes confusion with lambdas and you’ll need to fix THAT too because now things are even more inconsistant than they were when you started.

    That said, I cannot think of a scenario where you would rely on the "old" behavior so my backwards compatibly demagoguery may not apply.

  75. Pavel Minaev [MSFT] says:

    Some food for thought:

    Imports System.Collections.Generic

    Module Program

       Sub Main

           Dim fs As New List(Of Func(Of Integer))

           For Each x In Enumerable.Range(0, 10)

               fs.Add(Function() x)

           Next

           For Each f In fs

               Console.WriteLine(f())

           Next

       End Sub

    End Module

    C:Temptest.vb(8) : warning BC42324: Using the iteration variable in a lambda expression may have unexpected results.  Instead, create a local variable within the loop and assign it the value of the iteration variable.

               fs.Add(Function() x)

  76. Pavel Minaev [MSFT] says:

    Here’s my broad suggestion:

    Leave the existing syntax as is, but add a warning in the same way VB does (yes, even though the lambda doesn’t always outlive the iteration). Introduce a new syntax which unifies foreach and LINQ:

       foreach (from x in xs) { … }

    The expression inside () can be any LINQ comprehension that is allowed today, but without the final "select". All range variables that would be visible in "select" are visible in the body of foreach, and bound for every iteration (consistent with plain LINQ). So, for example, in:

       foreach (from x in xs group x by x%3 into xg where xg.Count() > 1) { … }

    the body of foreach would be able to reference xg, but not x.

    This has the added beneficial effect of simplifying the existing pattern of iterating directly over simple queries. You’d think you can at least write:

      foreach (var x in from x in xs where x > 0 select x) { … }

    but this won’t work because two "x" will clash, so you need to come up with "x1" or other similarly descriptive name for one of them.

    Getting back to the original problem, it’s also why I also propose to make a warning for lambda capturing a variable in old-style foreach(var) loop – any existing code that does it for legitimate reasons (e.g. because lambda doesn’t escape) can be trivially rewritten to use foreach(from).

    Thoughts?

  77. Scott Langham says:

    Yes, please change it. This problem caught me out. The ‘each’ in the keyword ‘foreach’ led me to believe I’d get a new variable for each element in the collection. It turns out that the name of that keyword is currently a lie.

    When I fixed the bug this problem caused I felt I had to put lots of comments next to the code to prevent any of my colleagues from trying to helpfully clean up the code by removing the unnecessary looking variable I had to declare to fix the problem.

    The only reason not to change is if a significant number of people can cite existing code bases that would break if the change was made. If people do have code that would break, then we should at least have a warning.

    Keep up the good work.

    Regards,

      Scott

  78. Peter J Fraser says:

    I would like a syntax that forces the evaluation of a variable at the time of the lamda’s declaration.  It would save the necessity of creating dummy variables (as in your example) and have the secondary effect of allowing the compiler to produce better code.

  79. Niall Connaughton says:

    I agree with the calls to change it.

    Some small percentage of users will have written code that relies upon the existing behaviour. Some percentage of this group will be unaware of the change and they will see code that does not execute as expected. A lot of developers don’t understand the existing behaviour of foreach and so write code that does not execute as expected today.

    Most people (not all) agree that from the reading of foreach, the current behaviour is counter-intuitive. People having trouble today are being caught by the intuitive way being incorrect. People who would have trouble with the change are people who have adapted to unintuitive behaviour. Following Rico Mariani’s "Pit of Success" concept, making the change would lead to the "right" way being the obvious way, which is certainly not true today.

    The number of people suffering from breaking changes will diminish over time and most of the pain will be dealt with by the following (hypothetical) version of C#. If the change is not made, the current pain is not likely to get much better over time. Each successive release of C# will carry the same burden.

    Problems like this collect as baggage in the language. I think C# is really evolving well, but the evolution can’t just be about picking up new great things. Taking the pain to change the bad things early keeps the language … sharp? Oh god that was terrible 😛

  80. Drew Vogel says:

    The bugs caused by this common misunderstanding of closures are very easy to catch. I do agree with most of Pavel Minaev’s points, but I just don’t care strongly whether the foreach expansion is changed. I do feel strongly that something should be done to remedy the situation. For me, the biggest win that closures provide in C# is making my code clearer by eliminating noise like the v2 local var. If you do not change the foreach expansion, please add the warning and add a mechanism to avoid the inner variable declaration. It could be something like this:

    foreach(var v in values)
    {
       funcs.Add( ()=>valueof(v) );
    }

    In this case you could use:

       values.ForEach( v => funcs.Add( () => v ) );

    Why isn’t .ForEach() part of IEnumerable?

     

  81. Fj_ says:

    I want to emphasize: it should be an error rather than a warning. Because for someone who made this error mistakenly, it’s an error. And someone who wrote that code deliberately (for whatever unimaginable reason) better rewrite it as

       int item_global;

       foreach (int item in items) {

           item_global = item;

           /* do whatever you want now */

       }

    Really, making it a warning makes no sense at all. Even people porting legacy code are better served with an error.

  82. Lars Thorup says:

    Change it so new developers don’t get bitten. Or at least start giving warnings like VB seems to do according to Pavel Minaev [MSFT]

  83. Carl Daniel says:

    @TheCPUWizard –  In C++ the for loop variable did exist past the end of the block in most implementations up until just before the original C++ standard was approved in 1998 – several months after VC6 was released.  VC7.1 introduced the standard-compliant behavior as an option, with VC8 changing to the standard-compliant behavior as the default and the pre-standard behavior as the option.

  84. Carl Daniel says:

    For the record, after reading many of these reponses, I think the right thing to do would be to issue a warning in the nearest release possible and change the behavior in the release after that.  

    The current behavior, while consistent (and intuitive if you understand enough of the details) is UNuseful – it’s likely that any code today that’s relying on the current behavior is already broken.  

  85. Sebastian Good says:

    Perhaps *because* I come from a FP background, the behavior of the foreach variable surprised me. I have made this mistake several times! You never see the variable being mutated explicitly, and indeed you’re not allowed to change it, so you don’t realize you’re capturing the variable. I think the behavior should be changed so that the variable was bound freshly for each loop iteration.

  86. Loic Henry-Greard says:

    These weird semantics are the first thing I noticed when I first saw code samples for the then-new closures in C# 3.0.  They go against all of the functional programming semantics that I’ve ever experienced (OCaml, Haskell, F#, Scheme, …) and I find it very unnatural.  Closing on a variable reference is a state-heavy concept in the first place, it’s contrary to the aspirations of functional programming, and closures are meant to fulfill these aspirations.

    I think it’s still time to change.

  87. DRBlaise says:

    @Pavel – I love your foreach(from x in xs) solution for LINQ syntax, but not really for regular foreach syntax.  It just seems too much of a subtle difference in syntax that does not make the difference obvious.  When there are subtle differences in meaning that has caused a lot of problems in the past, I think it is best to be as blunt and obvious as possible in syntax.  That is why I suggested the 3 foreach syntaxes:

    foreach (var x in xs) — error or warning (if people insisted) on closure of x

    foreach (var x inside xs) — x is inside body and is instantiated once per loop (closure allowed)

    foreach (var x outside xs) — x is outside body and is instantiated once (closure allowed)

    However, if your foreach(from) syntax was the solution chosen, I would not complain … it would be much better than the current state.

  88. Pent says:

    Before reading Frank Bakker’s comment about valid examples I thought a warning would be nice. But closing over the loop variable inside the loop is quite common, so either make the change or don’t do anything (don’t add a warning).

  89. Andrew Backer says:

    Pavel, thanks for the VB note.  I was wondering if anyone was going to say that the competing compiler gives this warning already.   I get it each time I do it, and fix it.   I’d hate to think how many of those might have turned into bugs… I didn’t realize c# doesnt remove the burden of omniscience like vb did.

    ** I say CHANGE IT.  And change it in VB too, please!  I hear there is some sorta cooperation on things like this now 🙂

    It always surprises me, and everyone else, that it works the way it does.  It reads as pavel says, and .   If I wanted torture I’d go back to C++ 🙂  or C# can just become as hard and we can call it C##!

    However, if the backwards compatibility issues are real and not imagined, then it’s up to you guys.  We can always code in the workaround variable.

    I don’t think that anyone who isn’t trying to say some variation of how cool/smart/experienced/rough and tough they are would *expect* it to behave how it does.  Even those who do understand always figured it out after it bit them, never before.  For every weird behavior there is always a chorus of "but you just need to learn the language more!" or my favorite "I’d fire anyone who wrote that code".  

  90. Pavel Minaev [MSFT] says:

    Another potentially interesting data point is that Python also mutates range variable (and its lambdas consequently capture it in the same way C# does that):

       fs = []

       for x in range(0, 3):

           fs.append(lambda: x)

       for f in fs:

           print f()

    Output:

       2

       2

       2

    But so far Python is the only other language (apart from C# and VB) in which I can observe this.

    Furthermore, in Python, it is at least consistent with its list comprehensions:

       fs = [(lambda: x) for x in range(0, 3)]

       for f in fs:

           print f()

    Also outputs:

       2

       2

       2

    And Python lets you change range variable in "for" (it will simply be overwritten on the next iteration), and doesn’t scope it to the loop body:

       for x in range(0, 3):

           pass

       print x

    Output:

       2

    So at least its behavior, while still (IMO) unintuitive, is at least fully self-consistent: a range variable really is just a repeatedly assigned variable, nothing more. There’s nothing magical about it – it’s not "semi-readonly", it’s not specially scoped, etc.

  91. CuttingEdge says:

    I vote for a compiler warning, but I’m very interested in the issues around adding compiler warnings.

  92. At first, I thought having a compiler warning would be the best, but after reading the above comments, I too realized that there are way too many common, legal and wanted situations to close over the loop variable. Note that in those situations, the proposed change would not make a (functional) difference (it might have non-functional implications, e.g. performance – but not against the manual fix we need to do now).

    I say make the change.

    The breaking change argument is valid, but has lost it from other arguments before. I admit in some cases breaking changes were allowed to bring the implementation in line with the spec, but that is not the point. The point is that breaking changes were made before.

    I find the "for each" argument very strong, as well as the consistency argument when applied to the read-only nature of the loop variable. I also agree with the argument that the change is unlikely to break really existing code, since very few would (or even should) rely on the existing behavior. Finally, the readability argument is important: code with the manual fix looks redundant and wrong.

    The statement that the compiler correctly implements the specification is true, but not relevant. The bug is in the specification. Fix it, and adjust the compiler to the improved specification.

  93. That’s why I like resharper. Resharper kindly warns stupid developers like me about access to modified closure and offers refactorings like introducing a variable as your example states.

    Daniel

  94. Martin Mueller (MillKa) says:

    I vote for behaviour change in C#5.

    I vote for a warning in C#4.

    I think its too late to change the behaviour in C#4, but C#4 should definitely give a warning. This warning should be on by default in C#4 and C#5.  The warning could be off by default in C#6 and later.

    In C#5 the behaviour should be changed to the new, expected way.

    For those cases, where the old behaviour is still needed, I like the syntax Joe suggested above:

    SomeType item;

    foreach (item in collection) { .. } // note: no var before item, so the variable outside the loop is used.

    However, I have failed so far to find a situation that needs the old behaviour …

    In my humble opinion, the outside expansion

    // item var OUTSIDE loop

    try {

     int m;

     while (e.MoveNext ()) {

       m = (int) (int) e.Current;

       funcs.Add (() => m);

     }

    }

    is just bad code, because it conflicts with a rule I consider good:

    Declare/initialize/allcote a variable as late as possible with the smallest scope possible.  

    Proof that this rule is good: Pulling variable m out of the loop body for no apparent reason introduces weird and unexpected behaviour … ;-P

    Therefore, I think the inside expansion

    // item var INSIDE loop

    try {

     while (e.MoveNext ()) {

       int m = (int) (int) e.Current;

       funcs.Add (() => m);

     }

    }

    is better code.  I think thats the code that the majority of prgrammers would wrote by hand if they had to replace foreach.

    To guesstimate the pain of this breaking change, it might make sense to feed all NET source plus codeplex and then something through a modified c#4 compiler version ..

  95. Gabe says:

    I say to give the warning only where it would make a difference. That is, if you close over a loop variable and that closure escapes the loop, give a warning. In theory the only thing it should break is code that gets compiled with warnings as errors, and has closures over escaping loop variables.

  96. Pavel Minaev [MSFT] says:

    > I say to give the warning only where it would make a difference. That is, if you close over a loop variable and that closure escapes the loop, give a warning.

    The problem, as pointed out earlier, is that a closure may not actually escape the loop, and there’s no way the compiler can find out. Consider some hypothetical L2S code:

       foreach (var customer in ctx.Customers) {

           var orders = ctx.Orders.Where(o => o.Customer == customer).ToList();

           …  

       }

    Now we have a lambda that captures "customer". It doesn’t really escape here, because the enumerator in which it is stored is immediately discarded, and cannot be referenced outside the loop. But how is the compiler supposed to know that?

    So it would have to give a warning for the above, perfectly reasonable, code, and also the equivalent LINQ comprehension.

  97. Alexey Reznick says:

    + for compiler warning.

    Anyone who stumbles across this issue once, and realizes the mistake – won’t do it again. It’s not worth introducing breaking changes, and making the language inconsistent. The compiler warning should suffice for all matters.

  98. I would prefer that the current foreach semantics not be changed, but rather add another use for the ‘new’ keyword such that –

    foreach (new Object in Collection)…

    – implements the altered semantics of relocating the declaration.

  99. Andrew Chisholm says:

    +1 for a compiler warning, I got caught by this pitfall once and gave the weirdest program behaviour ever. Once I found the cause I immediately knew the mistake I’d made, but it is such a subtle semantic one it was difficult to find when reading my code. I think a compiler warning for it is a must. It would have saved me a lot of time tracking down the issue.

  100. I give a vote (+1) for the warning. I have struggled with this a couple of times but not enough to want you guys to change the way the compiler behaves.

  101. Make the breaking change.  Some seem to think a compiler warning is "more acceptable" than a breaking change, but in reality it just gives many of the negatives of breaking changes without any of the benefits of the actual change.  Many companies set "warnings as errors" making it the equivalent of an error, without the benefit of being able to use the corrected behavior.

    This change would be very beneficial in terms of making the language more closely follow the "principle of least astonishment."  Some have said that once they "discover" the behavior, surely it makes perfect sense.  However, I think something cannot possibly make "perfect sense" yet have to be discovered after getting it wrong.  I think "natural" and "intuitive" should take precedence here.

    Finally, as Pavel has mentioned repeatedly, making the change would make the foreach construct more consistent with Linq’s "from item in items".

  102. Meta-Knight says:

    I vote for making the change. I mostly code in VB so at least I get a warning but still, it’s annoying and counter-intuitive to have to create a local variable for this.

    If you decide to not make the change, at the very least add the warning in C#. If there wasn’t a warning for this in VB I would have been bitten by this several times.

  103. Julien Lebosquain says:

    I’d suggest to make the breaking change too. I didn’t got caught at all by this pitfall but that’s only thanks to ReSharper. The first time I saw this warning I was like "but what the hell is resharper talking about?". Two minutes of googling later I understood the problem and immediately thought that this behavior was really counter-intuitive. That being said, I always keep in mind the existence of that behavior now that I know it and r# didn’t caught me again.

    As many people stated, I too cannot think of any code that would rely on this behavior. So make the breaking change. If you don’t, at least issue a warning so that people get used to it and change the behavior in a future version.

    Some have proposed to use "foreach (new …)" but IMHO, it’s just worse. What does the "new" here, it certainly doesn’t construct a new instance like everywhere else in the language: it’s not intuitive at all. To understand why ‘new’ is here, you have to understand the problem it solves. If you do, you don’t have any use for this keyword since you’re already aware of the modified closure problem. If you don’t, you’ll get caught by the old behavior. The same goes for the "inside/outside" keyword. Just read this sentence: "for each value outside enumeration" and you’ll see what I mean: as stated before, "each" prevails in the understanding of "for each" and there shouldn’t be any keyword affecting the loop variable.

    However, I’m totally for the "int item; foreach (item in data) {}" construct for those who need it since it’s simple, natural and intuitive.

  104. Brad says:

    m = (int)(int)e.Current;

    Why the double cast?

  105. Pavel Minaev [MSFT] says:

    @Brad: according to the definition of foreach in the language spec, there are two types involved: element type (which is normally the type of Current property of the enumerator), and iteration variable type (which is the type you explicitly specify in foreach, or same as element type if you use "var"). The associated line of the expansion of foreach is this:

           v = (V)(T)e.Current;

    So far as I can see, the (T) cast will almost always be a no-op, because it will correspond to type of Current. Seemingly the only case where this isn’t so is when you’re iterating over an array (the spec requires the use of IEnumerable over IEnumerable<T> in that case, and element type is derived not from Current, but from array type).

    I wonder why it wasn’t simply redefined to use IEnumerable<T> for arrays as well (which would remove the need to special-case that, as well as the cast) – so far as I can see, the change wouldn’t be observable to the user…

  106. Tim Acheson says:

    Nice analysis, this is a classic trap to fall into.

  107. Pop.Catalin says:

    The way you described how the foreach variable should be scoped inside the foreach loop, seems the right way to be.

    But,

    I’m afraid this has allot of potential to introduce multi targeting bugs, code written with the new foreach rules will not work correctly when re targeted to older compilers (C# 2.0, C# 3.0). There are allot of libraries (we have 3 internal libraries for Desktop and CF version of .Net and 1 for Desktop and Silverlight) that are multi targeted for various versions of .Net framework (Desktop, Compact Framework or Silverlight).

    If this change is put in the new compiler there will be a new multi targeting gotcha, instead of a lambda over foreach variable gotcha.

    I don’t think there is a real use for the way closures work right now with foreach, and that there is someone who depends on the feature, but I don’t think that’s the real issue, I think the real issue is multi targeting.

  108. P. Baughman says:

    While we’re on about counter intuative things with the foreach loop how about the fact that

               List<Animal> AnimalList = new List<Animal>();
               AnimalList.Add(new Dog());
               AnimalList.Add(new Cat());
               foreach (Dog d in AnimalList)
                   d.Bark();

    Produces an InvalidCastException instead of barking the dog but not the cat?  After all, I said for each dog in the animal list.  If I wanted to try to bark every animal I would’ve written

               List<Animal> AnimalList = new List<Animal>();
               AnimalList.Add(new Dog());
               AnimalList.Add(new Cat());
               foreach (Animal a in AnimalList)

    The foreach loop was designed for a language without generics. In the C# 1 world, most of the time you’re iterating over a sequence of objects that all happen to be of the same type, but all you know about them at compile time is that they’re “object”. Now, suppose you are iterating over what you think is a sequence of dogs, but a cat got in there by accident. Remember, we don’t know that it’s a list of animals, we know that its a sequence of objects and that’s all. Is the right thing to do to ignore the error and keep on trucking, or to inform the developer that their assumptions are wrong, and this in fact is not a list of dogs? C# is not a “muddle on through and hope for the best” language. Rather, a design point of C# is that being brittle is better than being lax. A lax compiler hides your bugs, and we do not want to hide your bugs, we want to bring them to your attention.

    If what you want to do is iterate through only the members of a particular sequence that are of a particular type, we have an extension method for that: 

    foreach (Dog d in AnimalList.OfType<Dog>())

    — Eric

  109. P.Baughman says:

    Yeah, you’ve got me there.  The two situations aren’t really analogous at all.  I bow to your superior reasoning.  Although AnimalList.OfType<Dog> still requires extra knowledge about the problem, just like the proposed foreach(new dog d in AnimalList) syntax, I’d say that gets outweighed by the risk of compiling code that could be silently broken.

  110. @P.Baughman – one way to avoid your problem is to always use var for the loop variable in foreach. This has the effect of neatly disabling the casting behaviour; then as Eric says you can also use OfType to filter the list.

  111. Brad says:

    @Pavel

    Thank you! That clears it up 🙂

  112. Duarte Cunha Leão says:

    My vote: change this!

    I agree mostly with Stuart Ballard and Pavel Minaev.

  113. Somejan says:

    I say: change it!

    Like you said, nearly nobody is (or should be) depending on the current behavior, and it is counter intuitive.  

    About consistency, I think a control structure is special anyway, so this perceived inconsistency is not important. Consistency is there to make a language easier to understand, not for it’s own sake. The fact that we are writing foreach(var foo in foos) instead of foreach(foos as var foo) is basically a historical accident, with no good reason for one or the other except for what people are used to. [php uses the second variant]

    The standard for loop isn’t very consistent in this respect either: in for(int i=0; i<10; i++) the first clause is evaluated once, while the second and third clauses are both evaluated every loop. Sure, they are separated by a semicolon instead of a keyword, but they are still within brackets in the same line in a single control structure. And if the first semicolon means that expressions are evaluated at different times, why doesn’t the second mean the same? There is no real consistency to start with.

    So do whatever you can to fix language warts if at all possible. The longer you wait, the harder it becomes to fix.

  114. Ferran says:

    I think is better a warning because that’s the purpose of a waring, it isn’t?. Also it’s important for backward compatibility.

  115. I see a lot of comments here about backward compatibility, and that surprises me in this case.

    The set of cases that actually change is minuscule – in fact, I doubt the current behavior is used very often at all right now.  How often do people really make closures including the loop variable inside a foreach loop and then actually use that closure after the current iteration?

    That’s got to be a really weird corner case.

    After all, even with this proposal, the closures that are immediately used (say, as part of an immediately executed linq query) would behave identically.

    I mean, when you close over a loop variable, you must be inside the loop, and you thus must be doing so repeatedly as part of the loop.  The whole point of a loop is to execute code for various values – but constructing various (identical) closures over a single variable is inherently an odd  thing to do.

    Does *anybody* actually rely on the current behaviour?  Backwards compatibility is only a reasonable argument if there actually exists code to retain compatibility with.

  116. Gabe says:

    Eamon: It’s  unlikely that anybody knowingly relies on the current behavior, but it’s quite likely that many people unknowningly rely on the behavior. You can easily write code to rely on any obscure behavior, and never realize it until the behavior changes.

  117. Svolo4 says:

    I vote for the change.

    I don’t think than the behavior of the foreach should match the behavior of the for because it is a higher-level language construct. It feels more natural to me that foreach(var x in y) is implemented as:

    for(int i = 0; i < y.Count; i++)

    {

     const var x = y[i];

     // your foreach body goes here

    }

    (actually it is the way I implement it once as a macro in C code).

    And I think this even can enable more agressive optimization of simple foreach body: use IEnumerable.Current value instead of assigning it first to a variable, for example in loop like this:

    int sum = 0; foreach(int x in y) sum += y;

  118. Pavel Minaev [MSFT] says:

    > And I think this even can enable more agressive optimization of simple foreach body: use IEnumerable.Current value instead of assigning it first to a variable

    Why do you think it’s an optimization?

  119. John Clements says:

    From a Functional Programmer’s perspective: make the change.  Wait, that’s what Sebastian Good said!  Perhaps our minds have been warped by excessive exposure to Matthias Felleisen.

    Final comment: grep for the word "clearly". These (in general) represent points where the author is not convinced of his own argument.

    Pavel +1

    Sebastian +1

  120. Sorry for getting in the discussion quite late. I understand the delicacy of the issue and I hate inconsistency but  I do lean towards a change. HOWEVER I think this bad can be cured by introducing a compiler flag somewhere to control the behaviour. If you convert your old project to C# 5 you get the "old behaviour" flag if you have some code that would run into the issue (along with some warning). New project = new behaviour flag. As versions of C# progress the compiler flag setting gets more hidden somewhere deep in the GUI 🙂

    I think warnings are a bad idea. Either you define the construct as a shorthand for the variable being declared outside the loop or inside the loop. Both behaviuors are logically sound and ok for usage. I’d argue that the actual problem is that developers assume one behaviour whereas the language is implemented using the other. I may be cynical but I’d argue that this mismatch can only be remedied by changing the compiler behaviour or prohibiting it (warning/errors). If you continue having a mismatch, people will produce errors. Which is the worst outcome of all. If you have a problem, take the cure now, not later. And – people WILL loathe the C# compiler for having to introduce that weird copying all the time.

    You give four arguments to why change is a bad idea:

    1. breaking changes

    2. Lexical inconsistency since "left" should execute before "right"

    3. Inconsistency with "for" semantics

    4. There is already a good workaround

    As for 1, I agree, but I think the bad things can be overcome (see my first paragraph).

    As for 4, this is not really an argument. If people make errors, then it’s bad. And it is a common error.

    As for 2, I don’t agree. The syntax "foreach(var x in Y)" basically conveys that you pick out objects "in" the bucket Y and name each object x – not the you place those Y objects in a separate container x. They’re suppose to be "in" Y, right? Contrary to your argument I feel that "in" implies that it has to execute after and that the left -> right is a vague language idea. I believe users misunderstanding the current compiler’s behaviour is evidence for this.

    As for 3, I don’t think it’s a fair comparison. In a for loop you declare an index variable that HAS to live throughout the iteration. It’s clear. The foreach loop on the other hand is subject to a totally different idea interpretation.

  121. Timwi says:

    While reading this blog entry, I was reminded of an earlier entry where you described a breaking change that was made in order to introduce generics:

    public static void Method(bool one, bool two) { }

    public static void Main() {

       int a = 0, b = 0, c = 0, d = 0, e = 0;

       Method(a < b, c > (d + e));  // used to compile in C# 1, now compiler error

       Method(a < b, c > (int)d);    // even worse; compiler errors rather hard to understand

    }

    I was surprised that this kind of breaking change would have been made; it was easy for me to imagine that some amount of code might exist that uses this construct and would now fail to compile.

    With the foreach loop, the danger of the breaking change is greater because the change is not caught by a compiler error; however, my intuition is that the frequency of incidence is vastly lower than in the above case. I can see the above construct used intentionally in meaningful code, but I cannot see the current behaviour of the foreach loop used intentionally at all. More likely it was used accidentally in the belief that the foreach variable is scoped inside the loop, and the bug never noticed; hence, this change would be more of a fixing change than a breaking change.

    I agree that the variable scoping would become inconsistent with that in ‘for’ loops; however, I find that argument rather weak because it is not the kind of inconsistency that would trip anyone up. You can declare a variable in the head of a ‘for’ loop but not in a ‘while’ loop; this is strictly an "inconsistency" but no-one really bothers because it doesn’t cause people to write subtle bugs.

    The lexical inconsistency I feel is the weakest argument, because it is inconsistent _either_ way, and so is the ‘for’ loop. Consider:

    for (var i = 0; i < 10; i++)

     DoSomething(i);

    for (var i = 0; i < 10; i++)

     DoSomethingElse(i);

    If the variable ‘i’ was scoped outside the ‘for’ loop, then you shouldn’t be able to declare it twice. But you can, and they are separate variables. And yet it is not scoped inside the loop. It seems to be scoped somewhere in the middle; it has a scope all to itself. The same is true of ‘foreach’ at the moment, by being scoped inside the ‘try’ but outside the ‘while’. To fully appreciate this, the programmer needs to realise the existence of that ‘try’ and ‘while’. This is bad enough in the case of ‘foreach’ but at least you can find it in the specification and lots of blog entries and websites; but in the case of ‘for’, it is really rather mysterious.

    My conclusion, therefore, is that the arguments against are weak, and the benefits reaped are large, given the apparent notoriety of this gotcha and the desirable goal of a "pit of success" language.

  122. Martinez says:

    According to the article, int acts as a reference type, not a variable type. Where do I make a mistake?

    I do not understand the question. Ints are of value type, not of reference type. I do not know what “variable type” means. Can you rephrase the question to explain what it is that you do not understand? — Eric

  123. Martinez says:

    According to the article, int acts as a reference type, not a variable type. Where do I make a mistake?

    [Rephrased]: In C# all instances of all classes are of reference types. All instances of structs are simple “variable” types. So if two references are pointing to one object and you will make some changes in this object via first reference, changes will also be reflected if you examine this object via second reference. This is of course not true for simple types, like int. Sorry for not being clear at the beginning.

    You seem to be confusing variables with value types. There’s no such thing as a “variable type”. A variable HAS a type, which can be a reference type or a value type. Either way, a variable is a storage location which can store a value or a reference, depending on its type. Closures close over variables, not values. You can of course change the value stored in a variable; that’s what “variable” means: that the value can vary. — Eric

  124. Igor Z says:

    C# Reference clearly states that each element in the loop should be treated as a separate entity:

    "The foreach statement repeats a group of embedded statements for each element in an array or an object collection. The foreach statement is used to iterate through the collection to get the desired information, but should not be used to change the contents of the collection to avoid unpredictable side effects."

    As was repeatedly stated above: The syntax "foreach(var x in Y)" basically conveys that you pick out objects "in" the bucket Y and name each object x – not the you place those Y objects in a separate container x.

    Current implementation is not conforming to the above and so should be fixed.

  125. Issue a warning in C#4, and change it in C#5.

  126. John Rose says:

    @Konrad: "I much prefer Java’s rule here: only allow closing over variables which are declared `final`. But then, this only works because Java allows local variables to be declared `final`, which C# doesn’t."

    The reason Java supports final variables is exactly to fix this problem.  The designer of inner classes was a Lisp refugee who had been bitten by the DOTIMES issue (see Peter Seibel above), and didn’t want to allow the problem into Java.

    This particular barn door is hard to close after the mutable iteration variable has galloped away.

  127. NetMage says:

    I vote for making the change.

    I think adding a warning or an error is the worst possible thing to do – I hate it when the message I get from a system is "you did this wrong, we know what you did wrong, we would rather tell you then do the right thing for you." – that’s not the kind of system I enjoy using, be it programming language or business application.

    This one may go even further – "we tricked you into doing the wrong thing, we know what you did was wrong, but we’d rather tell you about it than do the right thing for you."

    All kinds of annoyance there. The VB error message is exactly that. See Alan Cooper for how error messages should be (mainly, don’t have them).

  128. Python allows both ways:

    >>> f = []

    >>> for i in range(10): f.append(lambda : i)

    >>> f[0]()

    9

    On the other hand:

    >>> f = []

    >>> for i in range(10): f.append(lambda v = i: v)

    >>> f[0]()

    0

    I’ve called this "v" for clarity. Of course, this also works:

    >>> f = []

    >>> for i in range(10): f.append(lambda i = i: i)

    >>> f[0]()

    0

  129. dmihailescu says:

    I say change it since few people used it knowingly.

    Other languages should follow suit too in the same VS release.

    The warning is too often overlooked by many peoples.

  130. hoodaticus says:

    Thanks for the clear explanation!

  131. Brian Gideon says:

    Something has got to change.  Otherwise, the StackOverflow developers are going to have to create a new site called closeoverloopvariable.stackexchange.com dedicated specifically to this problem.  It has to be one of the most asked questions on SO.  I bet there are literally thousands of them lurking there already.

  132. Roman Starkov says:

    Yes, please, Eric, consider changing the scope of the foreach variable so that it isn't shared between iterations.

    C# is generally fantastic at doing just the right thing, without introducing ambiguity and without the developers even fully understanding just what really happens behind the scenes. The foreach variable capturing is an unfortunate example of where this doesn't happen.

  133. Kyralessa says:

    I just saw it claimed elsewhere that this change would be made in C# 5, and I came here to see if it were true.  It is!  Hooray!  I shall bid a not-so-fond farewell to the bugs this used to lead to.

  134. fpelliccioni@gmail.com says:

    How do you solve this?

    //—

    var actions = new List<Action>();

    for ( int counter = 0; counter < 10; ++counter )

    {

       actions.Add ( () => Console.WriteLine(counter) );

    }

    foreach (var action in actions)

    {

       action();

    }

    //—

    The range-based-for (foreach) sugar is wrong, but you need a capture-by-value lambdas.

    Especially in a language that tries to support value semantics.