Implications of the just-in-time nature of LINQ

I love LINQ! Much of the code I write involves manipulating collections in ways that can be expressed very naturally in LINQ. One interesting aspect of LINQ is that queries are evaluated just in time, as you enumerate over them (this is usually called deferred execution). This can have a few unexpected consequences. Here are a couple of examples. Take the following test class:

        private class Test
        {
            public int Value { get; set; }
        }

Now, think about the following code:

            IEnumerable<Test> tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = i });

            foreach (Test test in tests)
            {
                test.Value += 100;
            }

            foreach (Test test in tests)
            {
                Console.WriteLine(test.Value);
            }

For those not familiar with LINQ, Enumerable.Range(1, 3) creates an IEnumerable<int> that ranges from 1 to 3, and the Select then creates a new Test object for each number with Value set to that number, so the overall expression produces Tests with values 1 to 3. So, what does this output? You might expect 101, 102, and 103 because the first foreach increments the values. However, it prints 1, 2, 3.

The reason is that each foreach calls tests.GetEnumerator(), and the Test objects are created as the loop calls MoveNext/Current. So, the first time we go through the loop it creates 3 Test objects and we increment their values. However, those Test objects are just returned by the enumerator and don’t get stored anywhere. The second time we go through the loop, a new enumerator creates 3 new Test objects with values 1-3. One way to get the expected behavior would be to replace the first line with:

            List<Test> tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = i }).ToList();

This creates the Test objects once and stores them in a list, and then both foreach statements enumerate the same Test objects in that list.
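To make the difference visible, here is a minimal sketch that counts how often the Select lambda runs. The names DeferredDemo, Created, MakeQuery, and MutateThenRead are made up for this illustration, and Test is declared as a public top-level class here so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    public int Value { get; set; }
}

public static class DeferredDemo
{
    // Hypothetical counter: incremented every time the Select lambda runs.
    public static int Created;

    // Builds the query but does not run it; nothing is created yet.
    public static IEnumerable<Test> MakeQuery()
    {
        return Enumerable.Range(1, 3).Select(i =>
        {
            Created++;
            return new Test { Value = i };
        });
    }

    // Enumerates the sequence twice: the first pass adds 100 to each Value,
    // the second pass records the values it sees.
    public static List<int> MutateThenRead(IEnumerable<Test> tests)
    {
        foreach (Test test in tests)
        {
            test.Value += 100;
        }

        var seen = new List<int>();
        foreach (Test test in tests)
        {
            seen.Add(test.Value);
        }
        return seen;
    }
}
```

Calling DeferredDemo.MutateThenRead(DeferredDemo.MakeQuery()) should return 1, 2, 3 and leave Created at 6, because each foreach runs the lambda three times. Passing DeferredDemo.MakeQuery().ToList() instead should return 101, 102, 103 with the lambda run only three times (assuming Created was reset to 0 first).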

Another gotcha of just-in-time evaluation is that variables captured by a lambda are read at the time of enumeration, not when the query is defined. So, in the following example:

            int x = 0;

            tests = Enumerable.Range(1, 3).Select(i => new Test() { Value = x + i });

            x = 100;

            foreach (Test test in tests)
            {
                Console.WriteLine(test.Value);
            }

You get 101, 102, and 103 output. That’s because the “x + i” expression is evaluated during the foreach, after x has been set to 100. This sort of issue is more subtle when you return a LINQ expression from a function: the caller decides when it is evaluated, and anything the lambda captured may have changed by then. Using ToList() is a reasonable way to force evaluation at a known point.
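As a rough sketch of the "returned from a function" case, consider a class that hands out both a deferred query and a ToList() snapshot. OffsetSource, Offset, Deferred, and Snapshot are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class OffsetSource
{
    // Hypothetical mutable state that the lambdas below read through `this`.
    public int Offset { get; set; }

    // Returns a deferred query: Offset is read each time the result is enumerated.
    public IEnumerable<int> Deferred()
    {
        return Enumerable.Range(1, 3).Select(i => Offset + i);
    }

    // Returns a snapshot: ToList() runs the query before the method returns,
    // so later changes to Offset do not affect the result.
    public List<int> Snapshot()
    {
        return Enumerable.Range(1, 3).Select(i => Offset + i).ToList();
    }
}
```

If you create an OffsetSource with Offset = 0, call both Deferred() and Snapshot(), and only then set Offset to 100, enumerating the deferred result should give 101, 102, 103 while the snapshot still holds 1, 2, 3. The trade-off is that the snapshot does all of its work up front and holds the whole list in memory.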