LINQ to SQL: the good, the bad and the ugly

Let’s start with the good:

I was never a big fan of O/R mapping technologies, nor of doing all the work of populating objects by hand.

So I tended to choose typed DataSets in projects where speed of development is a key driver. Of course, DataSets are big on the wire and .NET only, so they are not the right choice for large, complex distributed apps.

LINQ adds a couple of key advantages: first, powerful query capabilities over in-memory cached collections; second, a strongly typed, compile-time-verifiable query syntax.

LINQ to SQL maps these queries to SQL Server queries, giving you full IntelliSense and the possibility to verify your query at design time. Add the advantage of getting sleek objects and collections representing your data basically for free, and it is a clear winning technology.

So in ASP.NET apps I would definitely start considering LINQ to SQL as a viable and productive technology.

The bad:

Actually it is not really bad, but rather a consequence of O/R mapping technologies in general. LINQ to SQL makes all the data work simpler and almost effortless, but the result can lead to a performance disaster. Not because LINQ to SQL does not perform well, but because you as a developer misuse it or do not really understand what is going on. The queries that LINQ to SQL generates and sends to SQL Server are good and efficient, and in some cases, like paging with .Skip and .Take, they are extremely good. If you are really paranoid about dynamic SQL statements, you can force LINQ to SQL to use your manually defined stored procedures. There is of course a small overhead in filling the objects and in generating the SQL statements, but I would not consider this a real problem.
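To give an idea of the paging case, here is a minimal sketch reusing the DALDataContext and Customers table from the examples below (pageIndex, pageSize and the Name column are made up for illustration):

DALDataContext dc = new DALDataContext();
int pageIndex = 2;    // zero-based page number (hypothetical)
int pageSize = 20;

var page = dc.Customers
             .OrderBy(c => c.DateCreated)    // Skip/Take needs a stable ordering
             .Skip(pageIndex * pageSize)     // translated to a single efficient T-SQL query
             .Take(pageSize);                // only this page comes back over the wire

foreach (Customer c in page)
    Console.WriteLine(c.Name);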

So there is nothing inherently wrong with LINQ to SQL and its performance.

The big challenge is for the developer: it is so easy to issue queries that have the wrong effect on the database.

A classic example is lazy vs. eager data loading. Let's look at one:

I have a Customer and an Order table.

I query the customers that were created in the last week:

DALDataContext dc = new DALDataContext();

var query = from c in dc.Customers
            where c.DateCreated > DateTime.Now.AddDays(-7)   // created in the last week
            select c;

foreach (Customer cu in query)
{
    doSomething(cu.Orders);
}

Depending on what you tell LINQ to SQL to do, you will get one of two behaviors:

- The default (lazy) behavior, assuming you have 100 customers returned, is that your foreach loop will issue 100 additional queries to the DB, one per customer, to fetch the orders. This can be the right thing or a really bad thing.

- In case you know you will get at most 10 orders per customer, it would be much better to retrieve all 100 customers and the 1,000 orders in one shot. Of course LINQ to SQL enables you to do exactly this, using this option:

DALDataContext dc = new DALDataContext();

System.Data.Linq.DataLoadOptions dos = new System.Data.Linq.DataLoadOptions();
dos.LoadWith<Customer>(c => c.Orders);   // eager-load the orders together with each customer
dc.LoadOptions = dos;
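With these load options in place, the same query above brings back the customers and their orders together, so the foreach no longer goes back to the database per customer. One thing to be aware of: LoadOptions must be assigned before the first query is executed on that DataContext.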

As you can see, you really need to understand what is going on in your DB and what behavior you actually want. If you had 100,000 orders per customer, you would want to take the default approach with some sub-filtering (see the sketch below) and not load everything in one shot.
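For the sub-filtering case, DataLoadOptions also offers AssociateWith, which restricts which child rows get loaded for each parent. A minimal sketch (the OrderDate column is a made-up example):

DALDataContext dc = new DALDataContext();
System.Data.Linq.DataLoadOptions dlo = new System.Data.Linq.DataLoadOptions();

// only load each customer's recent orders instead of the full history
dlo.AssociateWith<Customer>(c => c.Orders.Where(o => o.OrderDate > DateTime.Now.AddMonths(-1)));

dc.LoadOptions = dlo;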

So there is no general best way to do it; it is case by case. Unfortunately, both approaches will work, but each can have really bad performance side effects depending on the scenario.

There is a solution: fully understand how LINQ to SQL works, always have SQL Server Profiler open, and ask yourself whether the query LINQ to SQL is sending is the right one for your problem.

You can even have LINQ to SQL write the generated SQL to any TextWriter, for example the console:

dc.Log = Console.Out;
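Console.Out is fine for a console app; if you want the SQL in the Visual Studio Output window instead, one trick (not built into the framework, just a sketch) is a tiny TextWriter that forwards everything to System.Diagnostics.Debug:

using System.Diagnostics;
using System.IO;
using System.Text;

// Minimal TextWriter that forwards to Debug, so LINQ to SQL's
// log shows up in the VS Output window while debugging.
public class DebugTextWriter : TextWriter
{
    public override Encoding Encoding
    {
        get { return Encoding.Default; }
    }

    public override void Write(char value)
    {
        Debug.Write(value.ToString());
    }

    public override void Write(string value)
    {
        Debug.Write(value);
    }
}

// usage:
// dc.Log = new DebugTextWriter();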

The ugly:

No real support for multi-tier applications. LINQ to SQL does its change tracking on the DataContext, which of course is not serializable (it makes sense: imagine, in a SOA world, a Java client; what should it do with a DataContext object?). You can mark your "data container" objects as WCF serializable with the VS UI, and that works well for passing these objects around across tiers. But as soon as you want to modify some data and have only those changes applied back to the DB, you need to do manual work. If you have changed one Boolean in one customer out of the 100 retrieved, LINQ to SQL will update all 100, because you need a new DataContext that doesn't know what has changed and what has not. Not something you really want.

There are solutions to this problem, and one is here: https://www.codeplex.com/interlinq. Of course it is not that we don't want multi-tier support, but we didn't make it into 3.5. Future versions will have better support.
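To make the manual work concrete: LINQ to SQL does give you Table<T>.Attach for hooking a detached object up to a new DataContext, but you have to carry the original values (or use a timestamp column) across the tiers yourself. A sketch, where modifiedCustomer and originalCustomer are assumed to come back from the client tier:

// server side, after the client sends back the modified customer
DALDataContext dc = new DALDataContext();

// Attach needs the original values to figure out what changed;
// with a timestamp/rowversion column on the table you could instead
// call dc.Customers.Attach(modifiedCustomer, true)
dc.Customers.Attach(modifiedCustomer, originalCustomer);

dc.SubmitChanges();   // updates just this one customer, not all 100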

Bottom line: LINQ to SQL is a powerful, productive and cool technology, but you must understand it well if you don't want to get burned. The current version is a good candidate for ASP.NET apps and WinForms/WPF two-tier apps. The lack of good multi-tier support is the major drawback: it is not too bad for reads, but it is challenging on updates, making LINQ to SQL a harder choice for multi-tier applications.

And remember: always use SQL Server Profiler and ask yourself if you are issuing the right query. LINQ to SQL still requires a good understanding of the SQL language and DB technologies.