Loving LINQ

I have to say, I love how easy it is to do SQL-like expressions over collections of objects with LINQ...

If we have two simple classes that define our data;

class Person

{

    public int Id { get; set; }

    public string Name { get; set; }

    public int AddressId { get; set; }

}

class Address

{

    public int Id { get; set; }

    public string Line1 { get; set; }

    public string PostCode { get; set; }

}

Then we can “join” them in memory with a quick LINQ query;

static void JoiningCollections()

{

    // create some test data

    List<Person> people = new List<Person>()

    {

        new Person() { Id=1, Name="Simon Ince", AddressId=1},

        new Person() { Id=2, Name="Someone Else", AddressId=1},

        new Person() { Id=3, Name="My Friend", AddressId=2},

        new Person() { Id=4, Name="My Other Friend"}

    };

    // create some more test data

    List<Address> addresses = new List<Address>()

    {

        new Address() { Id=1,

                        Line1="1 Aroad",

                        PostCode="A1 B23" },

        new Address() { Id=2,

                        Line1="3 Another Road",

                        PostCode="Z1 ZZZ" }

    };

    // define a really simple query

    var results =

        from p in people

        join a in addresses

        on p.AddressId equals a.Id

        select new { p.Name, a.Line1, a.PostCode };

    // and enjoy the simplicity of the results

    foreach (var result in results)

    {

        Console.WriteLine(String.Format("{0} lives at {1}, {2}.",

                         result.Name,

                         result.Line1,

                         result.PostCode));

    }

}

Imagine how many more lines of code you would have to write to achieve that before LINQ existed.

The one thing I don’t like about this is that you may need to avoid the “var” keyword when returning result sets to a calling function. No problem... just add a class that can pair up your names and addresses;

class PersonsAddress

{

    public Person Person { get; set; }

    public Address Address { get; set; }

}

... and change your query and output routine to use it;

IEnumerable<PersonsAddress> results =

    from p in people

    join a in addresses

    on p.AddressId equals a.Id

    select new PersonsAddress { Person=p, Address=a };

foreach (PersonsAddress result in results)

{

    Console.WriteLine(String.Format("{0} lives at {1}, {2}.",

                     result.Person.Name,

                     result.Address.Line1,

                     result.Address.PostCode));

}

Hey presto – you have just joined together two sets of objects in memory, and could now return them as a strongly typed collection of data. What is also interesting is that by using the IEnumerable<PersonsAddress> type this query won’t be executed until you start to iterate over the results.

The only gotcha with this is that I dread the day when I see two massive collections of data retrieved from a database and joined in memory... so be careful! Bear in mind that in memory joins like this will almost certainly be slower than when done in an engine designed to do it all the time – SQL Server. I say almost as you could point out there is overhead to hit a database, and if you have two collections containing only a few items each I might be persuaded it is quicker to do it in memory... but you see my point, non?!

Still, you’ve got to love LINQ.