Behind the Scenes of LINQ - Part 1

Now it's getting confusing :-) As a prequel to my LINQ-To-Sql article, I'd like to dig into the details of LINQ in general!

Language Integrated Query (LINQ) is a new feature of C# 3.0 in conjunction with .NET 3.5. This means LINQ is implemented purely at the compiler and .NET library level. .NET 3.5 doesn't come with a new CLR, but consists only of additional framework libraries (like System.Core.dll) on top of CLR 2.0. So the MSIL the compiler produces for the code below is plain MSIL that runs on CLR 2.0!

So how do LINQ queries work?
Let's take a basic query as a sample:

List<int> ints = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

var filteredInts = from i in ints where i > 5 select i;

foreach (int i in filteredInts)
{
   Console.WriteLine(i);
}

First of all we create a new list of integers, which is initialized with the values 1 to 10 (using the new C# 3.0 collection initializer feature).
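As a rough sketch of what the collection initializer stands for: the compiler turns it into a constructor call followed by one Add call per element. The class and method names below are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the article's sample.
public static class CollectionInitializerDemo
{
    // Written with the C# 3.0 collection initializer.
    public static List<int> WithInitializer()
    {
        return new List<int>() { 1, 2, 3 };
    }

    // Roughly what the compiler makes of it: construct, then call Add per element.
    public static List<int> WithAddCalls()
    {
        List<int> temp = new List<int>();
        temp.Add(1);
        temp.Add(2);
        temp.Add(3);
        return temp;
    }

    public static void Main()
    {
        // Both lists hold the same elements in the same order.
        Console.WriteLine(WithInitializer().SequenceEqual(WithAddCalls())); // prints "True"
    }
}
```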

Then we query the list with a LINQ query.

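The query result is assigned to an implicitly typed local variable, another C# 3.0 feature: we do not need to spell out the type, but can use var instead. The compiler infers the actual type from the right-hand side and substitutes it at compile time, so the variable is still statically typed.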

This is exactly the same as in these samples:

var s = "Hallo"; // Substitute with string
var i = 10;      // Substitute with int
var f = new List<AppDomain>(); // Substitute with List<AppDomain>

In the same way the type IEnumerable<int> is filled in by the compiler instead of var above.
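One way to check this yourself is to write the type out by hand and see that it still compiles. This is only a small sketch; the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the article's sample.
public static class VarInferenceDemo
{
    public static IEnumerable<int> Filter(List<int> ints)
    {
        // Implicitly typed: the compiler infers IEnumerable<int>.
        var inferred = from i in ints where i > 5 select i;

        // Assigning to an explicitly typed variable needs no cast,
        // because the static type of 'inferred' already is IEnumerable<int>.
        IEnumerable<int> explicitlyTyped = inferred;
        return explicitlyTyped;
    }

    public static void Main()
    {
        List<int> ints = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        foreach (int i in Filter(ints))
        {
            Console.WriteLine(i); // 6 to 10, one per line
        }
    }
}
```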

As with every IEnumerable<T>, we can iterate through filteredInts with a foreach loop and output all the filtered integers.

So how does the actual LINQ query work?

The LINQ query

var filteredInts = from i in ints where i > 5 select i;

could be rewritten into:

var filteredInts = ints.Where(i => i > 5);

which is what the compiler does.
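If you want to convince yourself that both spellings describe the same query, a small check along these lines should do. The class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the article's sample.
public static class QueryTranslationDemo
{
    // Query expression syntax.
    public static IEnumerable<int> QuerySyntax(List<int> ints)
    {
        return from i in ints where i > 5 select i;
    }

    // Method syntax: the form the compiler rewrites the query into.
    public static IEnumerable<int> MethodSyntax(List<int> ints)
    {
        return ints.Where(i => i > 5);
    }

    public static void Main()
    {
        List<int> ints = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        // Same elements in the same order.
        Console.WriteLine(QuerySyntax(ints).SequenceEqual(MethodSyntax(ints))); // prints "True"
    }
}
```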

So it calls a method named Where on the list and passes it a lambda expression.

First of all: where does the Where method come from? It is an extension method for the type IEnumerable<T>. This means the method is not actually implemented on List<int>; it is defined elsewhere and only extends the type.

Extension methods are static methods in a static class, which take a special argument:

public static void ConsoleWrite(this IEnumerable enumerable);

would be an extension method for all classes implementing IEnumerable (the this keyword in front of the first parameter marks the parameter's type as the extended type).
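To make this concrete, here is a minimal sketch of how such a ConsoleWrite method could be implemented and called. The method body and the class names are my assumptions; only the signature comes from the text above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Extension methods must live in a static, non-nested class.
// The class name is hypothetical.
public static class EnumerableExtensions
{
    // 'this' on the first parameter makes the method callable
    // as if it were an instance method of every IEnumerable.
    public static void ConsoleWrite(this IEnumerable enumerable)
    {
        foreach (object item in enumerable)
        {
            Console.WriteLine(item);
        }
    }
}

public static class ExtensionMethodDemo
{
    public static void Main()
    {
        List<int> ints = new List<int>() { 1, 2, 3 };

        // Extension method syntax.
        ints.ConsoleWrite();

        // The same call written as the ordinary static call it compiles to.
        EnumerableExtensions.ConsoleWrite(ints);
    }
}
```

Both calls in Main print the same three numbers; the first form is only syntactic sugar for the second.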

If we look into the class Enumerable (namespace System.Linq, in System.Core.dll) we find a bunch of extension methods there, like Average, Sum, OrderBy or...

public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)

This is the method used here. It extends all IEnumerable<T> types and takes a delegate as a parameter!

What does it do? Something similar to this:

public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)
{
  // Simplified sketch: the real Enumerable.Where yields matches lazily
  // instead of filling a list up front.
  List<TSource> result = new List<TSource>();
  foreach (TSource item in source)
  {
    // Keep only the items for which the delegate returns true.
    if (predicate(item))
      result.Add(item);
  }
  return result;
}

It iterates through all items of the source collection, calls the delegate for each of them, and adds only those for which the delegate returns true to the result.
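One caveat about the list-based version: the real Enumerable.Where uses deferred execution, so nothing is filtered until you actually iterate the result. A closer approximation can be sketched with yield return. LazyWhere and the demo class are hypothetical names, and this is not the framework's actual source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not the framework implementation.
public static class DeferredWhereDemo
{
    // The iterator block runs only while the result is being enumerated.
    public static IEnumerable<TSource> LazyWhere<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        foreach (TSource item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }

    public static void Main()
    {
        List<int> ints = new List<int>() { 1, 6 };

        var lazy = ints.LazyWhere(i => i > 5);
        var builtIn = ints.Where(i => i > 5);

        // Added after the queries were defined, but before they are enumerated.
        ints.Add(7);

        Console.WriteLine(lazy.Count());    // prints 2 (6 and 7)
        Console.WriteLine(builtIn.Count()); // prints 2 (6 and 7)
    }
}
```

An eager, list-building Where would have counted only the 6 here, because it would have run before the 7 was added.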

So i => i > 5 is basically a different syntax for specifying the delegate!

How do they play together?! Come back for the next part of the article!
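To illustrate the "different syntax" point, here is a small sketch that creates the same Func<int, bool> delegate in three ways: from a named method, from a C# 2.0 anonymous method, and from a C# 3.0 lambda. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the article's sample.
public static class DelegateSyntaxDemo
{
    // A plain named method with a matching signature.
    public static bool IsGreaterThanFive(int i)
    {
        return i > 5;
    }

    public static void Main()
    {
        List<int> ints = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        Func<int, bool> fromMethod = IsGreaterThanFive;                   // named method
        Func<int, bool> fromAnonymous = delegate(int i) { return i > 5; }; // C# 2.0 anonymous method
        Func<int, bool> fromLambda = i => i > 5;                          // C# 3.0 lambda expression

        // All three delegates select the same items.
        Console.WriteLine(ints.Where(fromMethod).SequenceEqual(ints.Where(fromLambda)));    // prints "True"
        Console.WriteLine(ints.Where(fromAnonymous).SequenceEqual(ints.Where(fromLambda))); // prints "True"
    }
}
```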