C# 3.0: LINQ, I’m not sure I like it that much

This is my sixth post in the series I am writing on the new features of C# 3.0. See the previous posts on var, extension methods, lambda expressions, object-collection initializers and anonymous types.

I think the single biggest thing in C# 3.0 is Language INtegrated Query, or LINQ. Looking at all the other features of 3.0 (listed above), I get the feeling that they all came into the picture because LINQ needs them to work well. This does not mean that these features have no use elsewhere (they definitely do), but they look like parts of the grand plan of LINQ. CyrusN has some great blog posts on LINQ.

I generally give strong opinions straightaway about whether I like a feature or not (no IMHO). But I am kind of divided on LINQ. Let’s first see what LINQ is and how it works, and then I’ll go into why I like it and why I don’t.

Using LINQ

Traditionally, data has always been disjoint from code. A programming language would provide statements and expressions to work with data types, and each programmer would write specialized code in his own style to filter and manipulate data. Let’s consider an array of employees, defined as below, as our data source. Note that the definition uses some of the new C# 3.0 features, including implicitly typed variables, anonymous types, object-collection initializers and anonymous array declaration.

var employees = new [] {
    new { Name = "Arthur Dent", JobGrade = 3, JobTitle = "SDE",
          Salary = new { Base = 2000, Allowance = 1000 }},
    new { Name = "Ford Prefect", JobGrade = 2, JobTitle = "SDE",
          Salary = new { Base = 5000, Allowance = 500 }},
    new { Name = "Slartibartfast", JobGrade = 2, JobTitle = "SDET",
          Salary = new { Base = 3000, Allowance = 1000 }},
    new { Name = "Zaphod Beeblebrox", JobGrade = 1, JobTitle = "SDE",
          Salary = new { Base = 6000, Allowance = 1000 }},
    new { Name = "Trillian", JobGrade = 3, JobTitle = "SDET",
          Salary = new { Base = 12000, Allowance = 1000 }},
};


In the old world you’d use custom functions to work on this array (data) and filter it based on some criteria. In C# 3.0 you can use extension methods and lambda expressions to do this:

var highlyPaid = employees.Where(e => e.Salary.Base > 5000).Select(e => e.Name);

Effectively, this returns the names of all employees whose base salary is over 5000. LINQ also lets you write the same thing in query syntax, which is similar to SQL:

// Query-1
var highlyPaid =
    from e in employees
    where e.Salary.Base > 5000
    select e.Name;
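To make the comparison concrete, here is a minimal, self-contained sketch that runs the same filter three ways: a hand-written loop, extension methods with lambdas, and query syntax. The Employee class, the BaseSalary property and the QueryStyles class are names made up for this sketch (the post uses an anonymous type with a nested Salary), and it assumes the System.Linq namespace of the released framework rather than the System.Query name from the preview.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical named type standing in for the anonymous type used in the post.
public class Employee
{
    public string Name { get; set; }
    public int BaseSalary { get; set; }
}

public static class QueryStyles
{
    public static readonly Employee[] Employees =
    {
        new Employee { Name = "Arthur Dent", BaseSalary = 2000 },
        new Employee { Name = "Ford Prefect", BaseSalary = 5000 },
        new Employee { Name = "Slartibartfast", BaseSalary = 3000 },
        new Employee { Name = "Zaphod Beeblebrox", BaseSalary = 6000 },
        new Employee { Name = "Trillian", BaseSalary = 12000 },
    };

    // "Old world": a hand-written loop that filters and projects.
    public static List<string> WithLoop()
    {
        var names = new List<string>();
        foreach (Employee e in Employees)
        {
            if (e.BaseSalary > 5000)
                names.Add(e.Name);
        }
        return names;
    }

    // Extension methods plus lambda expressions.
    public static List<string> WithMethods()
    {
        return Employees.Where(e => e.BaseSalary > 5000)
                        .Select(e => e.Name)
                        .ToList();
    }

    // Query syntax over the same data.
    public static List<string> WithQuery()
    {
        var highlyPaid =
            from e in Employees
            where e.BaseSalary > 5000
            select e.Name;
        return highlyPaid.ToList();
    }

    public static void Main()
    {
        // Each line prints: Zaphod Beeblebrox, Trillian
        Console.WriteLine(string.Join(", ", WithLoop().ToArray()));
        Console.WriteLine(string.Join(", ", WithMethods().ToArray()));
        Console.WriteLine(string.Join(", ", WithQuery().ToArray()));
    }
}
```

Ford Prefect is left out in all three versions because the comparison is strictly greater than 5000.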

You can use other features like anonymous types to group data as well. If you want the name of each person as well as his or her salary, you’d write something like

// Query-2
var highlyPaid =
    from e in employees
    where e.Salary.Base > 5000
    select new { e.Name, e.Salary.Base };

There is something interesting here. A classic anonymous type declaration has the form new { name = value }. In the query above we have not specified the names, and yet you can do

foreach (var v in highlyPaid)
    Console.WriteLine("{0} {1}", v.Name, v.Base);

Here e.Name and e.Salary.Base are available as v.Name and v.Base. This works because the compiler takes the name of each property from the last identifier of the expression that initializes it, so the generated anonymous type gets properties called Name and Base.
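A small sketch of that name inference, outside of any query. The class and method names (ProjectionNames, SameShape) are made up for illustration; the point is only that the shorthand and the spelled-out form describe the same anonymous type.

```csharp
using System;

public static class ProjectionNames
{
    public static bool SameShape()
    {
        var e = new
        {
            Name = "Zaphod Beeblebrox",
            Salary = new { Base = 6000, Allowance = 1000 }
        };

        // Property names are inferred from the last identifier: Name and Base.
        var inferred = new { e.Name, e.Salary.Base };

        // The same thing with the names spelled out.
        var spelledOut = new { Name = e.Name, Base = e.Salary.Base };

        // Same property names and types in the same order, so the compiler
        // reuses one anonymous type, and its Equals compares property values.
        return inferred.GetType() == spelledOut.GetType()
            && inferred.Equals(spelledOut);
    }

    public static void Main()
    {
        Console.WriteLine(SameShape()); // True
    }
}
```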

How LINQ works

C# 3.0 does not put any restriction on the semantics of query expressions. The language defines translation rules that map each query expression onto method invocations. So when Query-1 above is compiled, the compiler emits code equivalent to the following:

var highlyPaid = employees.Where(e => e.Salary.Base > 5000).Select(e => e.Name);

The language specifies that for the where clause a method of the following shape will be called:

delegate R Func<A,R>(A arg);

class C<T> // This is the data type on which the query is run
{
    public C<T> Where(Func<T,bool> predicate);
}
Since this call is made by syntactic mapping, the type on which the query is run is free to implement Where as an instance method or an extension method, or to use the implementation of Where in System.Query. If you open the assembly with a tool like Reflector to see the generated code, you’ll see that the whole query is just syntactic sugar that generates calls to these methods.

The formal translation rules and the recommended shape of a generic type that supports the query pattern are documented in the C# 3.0 spec.
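The syntactic mapping can be observed with a type that supplies its own Where and Select as instance methods. The sketch below is only an illustration under assumptions: Bag<T>, CallLog and QueryPatternDemo are hypothetical names, and the file deliberately has no using directive for the framework's query operators, so the query expression can only bind to the methods on Bag<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical log so the calls made by the compiled query can be observed.
public static class CallLog
{
    public static readonly List<string> Entries = new List<string>();
}

// Hypothetical container that implements just enough of the query pattern.
public class Bag<T>
{
    private readonly List<T> items;

    public Bag(IEnumerable<T> source)
    {
        items = new List<T>(source);
    }

    public Bag<T> Where(Func<T, bool> predicate)
    {
        CallLog.Entries.Add("Where");
        var kept = new List<T>();
        foreach (T item in items)
        {
            if (predicate(item))
                kept.Add(item);
        }
        return new Bag<T>(kept);
    }

    public Bag<R> Select<R>(Func<T, R> selector)
    {
        CallLog.Entries.Add("Select");
        var mapped = new List<R>();
        foreach (T item in items)
            mapped.Add(selector(item));
        return new Bag<R>(mapped);
    }

    public List<T> ToList()
    {
        return new List<T>(items);
    }
}

public static class QueryPatternDemo
{
    public static Bag<int> Run()
    {
        var numbers = new Bag<int>(new[] { 1, 6, 3, 8 });

        // Translated to numbers.Where(n => n > 5).Select(n => n * 10).
        return from n in numbers
               where n > 5
               select n * 10;
    }

    public static void Main()
    {
        Bag<int> result = Run();
        Console.WriteLine(string.Join(", ", CallLog.Entries.ToArray())); // Where, Select
        foreach (int n in result.ToList())
            Console.WriteLine(n); // 60, then 80
    }
}
```

Because the binding is purely by method name and shape, Bag<T> does not need to implement any particular interface for the query to compile.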

Why I like it

There are a lot of reasons to like LINQ.

  • First of all, it introduces a consistent and general way of querying data, be it in a database, in memory or in XML. This will go a long way in increasing the maintainability of code.

  • Since the language specifies no semantics, the user is free to implement the query pattern. This gives a lot of flexibility.

  • If the data source is a database, DLinq will ensure that the query is executed remotely on the DB using SQL. This means the data is filtered on the server side before it arrives, instead of the whole data set being pulled in and then filtered on the client.
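The remote execution in the last point relies on the compiler being able to hand a lambda to a library as data (an expression tree) rather than as compiled code. The sketch below is not DLinq's translator; it only shows, under that assumption, what a provider gets to inspect before it generates SQL. Employee, BaseSalary and ExpressionPeek are names made up for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity type for the illustration.
public class Employee
{
    public string Name { get; set; }
    public int BaseSalary { get; set; }
}

public static class ExpressionPeek
{
    public static string Describe()
    {
        // Assigning the lambda to Expression<...> makes the compiler build
        // a tree describing the code instead of an executable delegate.
        Expression<Func<Employee, bool>> filter = e => e.BaseSalary > 5000;

        var comparison = (BinaryExpression)filter.Body;
        var member = (MemberExpression)comparison.Left;
        var constant = (ConstantExpression)comparison.Right;

        // A provider could map these pieces to a column, an operator
        // and a parameter value.
        return member.Member.Name + " " + comparison.NodeType + " " + constant.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // BaseSalary GreaterThan 5000
    }
}
```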

Why I do not like it

  • This is another new way of doing things and will add to the burden of C#. I keep saying this over and over again, as I strongly believe that the surface area of a language should be minimal and that too much change justified by specific usage leads to trouble down the line. Soon the language becomes capable of doing everything in totally different ways, and it becomes less discoverable and full of surprises.

  • The flexibility comes with a price. The same thing that can happen with operator overloading may happen with the query syntax as well. Someone can implement a custom Where for his data type which is non-standard and can take the code maintainer or a client of that code by surprise.

  • I think that this might be used in small projects, but in large data-driven applications it’ll rarely be used. People traditionally have a separate data tier with stored procedures, and that works out really well in terms of performance, maintainability and security.

  • I have a little doubt about the security. In some blog I read that, depending on the DB vendor, the SQL statement might be generated and sent to the DB. Can this lead to security holes? I am not too sure about this.
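The concern above about a non-standard Where can be sketched with a deliberately bad example. Blacklist and SurprisingWhere are hypothetical names invented for this illustration; the query reads like a normal filter, but the type's own Where does the opposite.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type, written badly on purpose: its Where removes the
// matching items instead of keeping them.
public class Blacklist
{
    private readonly List<string> names;

    public Blacklist(IEnumerable<string> source)
    {
        names = new List<string>(source);
    }

    public Blacklist Where(Func<string, bool> predicate)
    {
        var remaining = new List<string>();
        foreach (string name in names)
        {
            if (!predicate(name)) // inverted on purpose
                remaining.Add(name);
        }
        return new Blacklist(remaining);
    }

    public List<string> ToList()
    {
        return new List<string>(names);
    }
}

public static class SurprisingWhere
{
    public static List<string> Run()
    {
        var names = new Blacklist(new[] { "Arthur", "Ford", "Zaphod" });

        // Reads like "keep the names starting with Z", but binds to
        // Blacklist.Where, which drops them.
        var result = from n in names
                     where n.StartsWith("Z")
                     select n;

        return result.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run().ToArray())); // Arthur, Ford
    }
}
```

Nothing at the call site hints at the inverted behaviour, which is the same kind of surprise a badly overloaded operator can cause.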


Comments (7)

  1. christophep says:


    In fact, projects from Microsoft Research usually do not end up in a box for sale, and in this case it will be different: MS is trying to put the Comega project into C#. I don’t like it, because if I need to sort and find things, I use collections with specialized algorithms. We have beautiful things in C++. If I need a fast DB, I can use a DBBase-like engine; some small ones can be easily integrated into apps and have custom indexes based on custom tables. Here we are in the middle. But OK, let’s try it and see whether the community adopts it or not.

    Axapta and its X++ language have something like that.

    It’s horrible.

    Christophe (France)

  2. Diego Vega says:

    I am not that concerned about the future of C# (or VB). I think LINQ adds a lot of expressiveness to the language, meaning that you can write more concise, natural and easy-to-maintain code.

    There are some new advanced features that can get difficult to explain to a newbie, but looking at the resulting LINQ code and then comparing it to the concatenated-spaghetti SQL, the a.("b").Value field accessors, data type conversions and null checks you need today, the improvement is obvious!

    You can always use expressiveness to obfuscate your code, but this is not a new problem. It is already possible to write completely awful C# code without LINQ.

    I don’t agree that big projects are not going to use LINQ. As long as DLINQ is done well (it gets multiple database support) I see a great future for it.

    As a matter of fact, I think DLINQ could help with database engine abstraction, which is needed in many big projects. This is something you only get now through high amounts of discipline.

  3. Sam says:

    I’ve done plenty of projects involving OR mapping. You have to go through the pain of doing it without LINQ to see the beauty of it. Embedding SQL and generating it manually can be a nightmare. The type safety LINQ provides is invaluable. The other features are also very useful in allowing refactoring in dimensions not possible before, enabling more cohesive, decoupled code. With lambda expressions, you can finally do true functional programming. For example, before you could not remove duplication resulting from similar method calls.

  4. Gabe says:

    While I agree that just adding features willy-nilly like C++ is not a good idea, look at Java: I’m sure they thought that not having enumerated types, function pointers (closures or delegates), foreach, and other features of C# made the language nice, simple, clean, and easy to learn. Unfortunately, programmers needed these regardless of what the language designers thought, and just ended up coming up with multiple incompatible (and often incorrect) ways of doing them. Witness the various ways of making Java enumerations ("public static final int" isn’t typesafe; "public static final Enum" can’t deserialize). So after many years Sun added anonymous inner classes to imitate function pointers (the JVM’s design won’t allow real ones), and only recently added real enums and foreach.

    The key to determining what language features to implement is figuring out what is used the most and is the most error-prone, and giving the programmer a simpler, more elegant way to do it.

    As it turns out, nowadays almost every program contains some sort of data access, be it from a SQL source or XML. There is currently no good, single, easy way to access data in C#. Every data source (SQL Server, Access, XML, arrays) has its own way of accessing data and manipulating it, each requiring its own query language to be most efficient. The world is awash in wrappers, object-relational mappers, and rote error-prone boilerplate code that’s required to do even the simplest things with data.

    So, in creating LINQ and adding more C# to learn, they are simultaneously eliminating 10 times as much stuff to learn that isn’t C# AND making the same code less error-prone.

  5. It’s apparent from the comments that people do think that LINQ is going to add real value. A programming language is all about expressiveness and choice. This provides another way to get things done and I am not against it. May the better choice win 🙂 However, I still think that in spite of the benefits it’s much better to separate the queries out from your code, place them as stored procedures on your data tier (DB) and call them from the client. The benefits are multiple:

    1. Code is better organized.

    2. Maintainability is improved. If I need to fix a query, I know where to look instead of grepping through source files.

    3. Serviceability is vastly improved, as stored procs can be individually hot fixed.

    4. Security is enhanced, as the SQL is not generated on the client.

    5. For large projects the SQL/data handling is typically handled by a separate developer, so ownership of these stored procedures can be with him.

  6. Gabe says:

    It sounds like you’re arguing for stored procedures versus ad hoc SQL. That’s fine for some applications, and still supported by DLINQ according to the docs I read. However, most people don’t even use stored procedures, not all databases support them, and sometimes the stored procs can’t even do queries where the expressions needed to query aren’t known until runtime.

    Besides that, LINQ is also invaluable for manipulating data after it comes back from the DB, or data that may not have come from the DB in the first place.

  7. Diego Vega says:

    I am completely in favor of keeping the tier separation and concentrating all your database access code in your DAL. I just think it will be great to do it using LINQ instead of whatever you use now. It will also be great to be able to use the same principles and syntax for data manipulation on layers that are far from the database.