C# Fun at AADND - A Closer Look at the Where Extension Method and Lambda Expressions

Last week at the Ann Arbor Dot Net Developers group (AADND), Bill Wagner gave a talk entitled "After the Launch: C# 3.0".  He discussed:

  • Implicit properties
  • Extension methods
  • Enumerators and deferred vs. immediate execution
  • Lambda expressions
  • Sequence operators

 

One of the questions that was asked involved the code below:

 

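using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        // Where is lazy: the filter runs as the foreach loop pulls each value.
        var smallNumbers = Enumerable.Range(0, int.MaxValue).Where(n => n < 10);
        foreach (int i in smallNumbers)
        {
            Console.WriteLine(i);
        }
    }
}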

 

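This code examines the non-negative integers from 0 through 2,147,483,646 and filters them down to the integers less than 10.  (The second argument to Range is a count, not an upper bound, so passing int.MaxValue, which is 2,147,483,647, produces that many values starting at 0.)  It prints the numbers 0 through 9, one per line.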

The question was (drum roll please): "Where is n initialized?"   This may be a little confusing because there is no explicit "int n;" declaration anywhere in this program. 

Let's break down what this code is actually doing. First of all, the Enumerable.Range method produces a sequence of 2,147,483,647 integers, from 0 to 2,147,483,646.  Then the Where method is called.

Before we talk about what the Where method does, let's take a quick look at the parameter it takes: the lambda expression "n => n < 10".  If you use IntelliSense to examine what parameter type the Where method is expecting, you will see "Func<int, bool> predicate".  This is a generic delegate type that takes one parameter (an int) and returns a value (a bool).

But we are passing a lambda expression where a delegate is expected...why does that compile?  A lambda expression has no type of its own, but the compiler will implicitly convert it to any compatible delegate type, and Func<int, bool> is compatible here.  That is why we can pass a lambda expression as an argument without first assigning it to a delegate variable.

Therefore, in our Func<int, bool>, the int is "n" and the bool is the result of "n < 10".  These types are inferred by the compiler and don't need to be declared explicitly.  It might be easier to think of "n => n < 10" as the method below, which is roughly what the compiler generates (under a compiler-generated name):

 

private static bool LambdaFunction(int n)
{
    return n < 10;
}
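
To make the conversion concrete, here is a small sketch showing three ways to produce the same Func<int, bool> and pass it to Where.  The class name LambdaForms and the method IsLessThanTen are made up for illustration, and the range is shortened to 20 so the program finishes instantly.

```csharp
using System;
using System.Linq;

static class LambdaForms
{
    // Hypothetical named method with the signature Func<int, bool> expects.
    public static bool IsLessThanTen(int n)
    {
        return n < 10;
    }

    public static void Main()
    {
        // The lambda is converted to Func<int, bool>; "n" is inferred to be an int.
        Func<int, bool> fromLambda = n => n < 10;

        // A method group with a matching signature converts the same way.
        Func<int, bool> fromMethod = IsLessThanTen;

        // C# 2.0 anonymous-method syntax, shown for comparison.
        Func<int, bool> fromAnonymous = delegate(int n) { return n < 10; };

        var source = Enumerable.Range(0, 20);

        // Each line prints: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
        Console.WriteLine(string.Join(", ", source.Where(fromLambda)));
        Console.WriteLine(string.Join(", ", source.Where(fromMethod)));
        Console.WriteLine(string.Join(", ", source.Where(fromAnonymous)));
    }
}
```

All three delegates behave identically; the lambda is simply the most compact way to write one.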

 

Now, what is the Where method doing?  It actually calls this lambda function, passing in each integer from the Range (0 through 2,147,483,646) as "n", and getting back a Boolean result indicating whether each is less than 10.  Yes, the lambda expression is called 2,147,483,647 times.  That is the correct behavior of the Where method: to evaluate the predicate for every member of the sequence.  This is not optimal for this particular scenario, but nothing tells the compiler or the Where method that the sequence is sorted, so neither can know that once you go beyond 10 on the number line, nothing will be less than 10.  (NOTE: there are ways to avoid this if you are processing a sorted sequence with a well-defined cutoff point - use the TakeWhile method instead of Where.)
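
The difference between Where and TakeWhile is easy to see by counting predicate calls.  The sketch below uses a made-up helper class, PredicateCallCounter, and a range of only 1,000 values so that it runs quickly; the same reasoning applies to the full range.

```csharp
using System;
using System.Linq;

static class PredicateCallCounter
{
    // Counts how often the predicate runs when filtering with Where.
    public static int CountCallsWithWhere(int count)
    {
        int calls = 0;
        Enumerable.Range(0, count)
                  .Where(n => { calls++; return n < 10; })
                  .ToList(); // ToList forces the query to run to completion
        return calls;
    }

    // Counts how often the predicate runs when filtering with TakeWhile.
    public static int CountCallsWithTakeWhile(int count)
    {
        int calls = 0;
        Enumerable.Range(0, count)
                  .TakeWhile(n => { calls++; return n < 10; })
                  .ToList();
        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCallsWithWhere(1000));     // prints 1000
        Console.WriteLine(CountCallsWithTakeWhile(1000)); // prints 11
    }
}
```

TakeWhile stops at the first element that fails the test (here, 10), so it makes 11 calls instead of 1,000.  It only gives the same answer as Where when every matching element comes before the first non-matching one.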

Also, note that the "Where" method uses deferred execution.  It actually returns an object which stores all of the info needed to run the query, and the query itself isn't executed until that object is enumerated in the "foreach" statement.  Specifically, the Where method just checks its input parameters (so "bad argument" exceptions can be thrown immediately, not deferred) and then returns an enumerable object (an IEnumerable<int> here); the predicate only runs once something asks that object for its elements.
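
Here is a short sketch that makes the deferral visible.  The class DeferredDemo and its methods are invented for this example; the log simply records the order in which things happen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Builds a query, then enumerates it, logging each step in order.
    public static List<string> Run()
    {
        var log = new List<string>();

        var query = Enumerable.Range(0, 3).Where(n =>
        {
            log.Add("predicate(" + n + ")");
            return n < 2;
        });

        // Nothing has been filtered yet; the predicate has not run.
        log.Add("query built");

        foreach (int i in query)
        {
            log.Add("got " + i);
        }
        return log;
    }

    // Argument checks are not deferred: a null predicate fails right away.
    public static bool NullPredicateThrowsEagerly()
    {
        try
        {
            Func<int, bool> missing = null;
            Enumerable.Range(0, 3).Where(missing); // never enumerated
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    public static void Main()
    {
        foreach (string entry in Run())
        {
            Console.WriteLine(entry);
        }
        Console.WriteLine(NullPredicateThrowsEagerly()); // prints True
    }
}
```

The log starts with "query built", and only then do the "predicate(...)" and "got ..." entries appear, interleaved one element at a time.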

So where is n actually initialized?  The "n" is a parameter to the lambda function.  Again, the actual lambda function would look like this:

 

private static bool LambdaFunction(int n)
{
    return n < 10;
}

 

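So "n" is initialized the same way any other function parameter is initialized: each time the function is called, the argument is copied into the parameter.  Here, the caller is the Where method's iterator, which invokes the delegate once per element and passes the current element by value as "n".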
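
To see exactly where that call happens, here is a simplified stand-in for Where.  This is a sketch for illustration only, not the framework's actual source; the names WhereSketch, MyWhere, and MyWhereIterator are made up.

```csharp
using System;
using System.Collections.Generic;

static class WhereSketch
{
    // Hypothetical stand-in for Enumerable.Where.
    // Arguments are validated here, eagerly, before any iteration starts.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");
        return MyWhereIterator(source, predicate);
    }

    // The iterator body does not run until the result is enumerated.
    private static IEnumerable<T> MyWhereIterator<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            // This call is where the lambda's parameter "n" gets its value.
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        var smallNumbers = new[] { 3, 42, 7, 10 }.MyWhere(n => n < 10);
        Console.WriteLine(string.Join(", ", smallNumbers)); // prints 3, 7
    }
}
```

Splitting the method in two is what allows the argument checks to run immediately while the filtering itself stays deferred.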

Special thanks to Eric Lippert, David Carley, and Wolf Logan for their insights and discussion on implementation details.