Playing with DLINQ

Today, I was playing with DLINQ and would like to share my findings with you. First of all, if you never heard of LINQ or want a basic intro, I suggest starting with the following 3 short videos (in order):

First of all, I generated a C# class file representing my SQL database tables by using a built-in SDK tool. The class is generated from my SQL database, called PfRep: I started an Orcas command prompt and stepped into the C:\Program Files\Microsoft Visual Studio 9.0\SDK\v3.5\Bin folder, then ran the following command:

sqlmetal /server:local host /database:PfRep /code /pluralize C:\blabla\PfRep.cs

This creates a .CS file with the class structure discovered from the database tables.

(by the way, the .CS file can be generated by a Visual Studio designer as well - by adding a "Linq to SQL File" file to the project - sqlmetal is good for re-generating the structure if it has changed.)

The first thing that's a bit weird is that the generated file contains modified table names. For example, I have the following tables in my PfRep database:

  • Folder
  • ContentIndex

that (in PfRep.cs) were transformed to the following class names:

  • Folders
  • ContentIndexes

I guess, it doesn't want the programmer to mix up the classes (table definitions) with the property methods used to retrieve the records from the tables - or something like that. Let's write some easy code to run a query:

PfRep pfRep = new PfRep("database=PfRep;server=dszabo1");

var bigFolders =

    from folder in pfRep.Folders

    where folder.ItemCount > 1000

    select folder;

foreach (Folder folder in bigFolders)

{

    Console.WriteLine(folder.FolderKey + " " + folder.FolderTree.FolderTreeName);

    Console.ReadLine();

}

What it does, it queries the Folder table for records with an item count larger than 1,000. Then it iterates through the results, looks up a referenced record (in the FolderTree table, which is referenced by Folder through the FolderTreeID property) and dumps out these data.

I was curious how does it spend the expensive transaction "fees" against SQL Server. :)  It turns out that it's pretty clever. Here's what I captured with SQL Profiler. First, it queries for all the Folder records with an ItemCount larger than 1,000:

exec sp_executesql N'SELECT [t0].[FolderID], [t0].[FolderTreeID], [t0].[FolderName], [t0].[FolderKey], [t0].[Indexed], [t0].[ItemCount], [t0].[ParentFolderID], [t0].[ActionPlanID], [t0].[OwnerName], [t0].[OwnerEmail], [t0].[Note]

FROM [Folder] AS [t0]

WHERE [t0].[ItemCount] > @p0',N'@p0 int',@p0=1000

then when in the loop, it queries for the FolderTree record associated with the Folder record:

exec sp_executesql N'SELECT [t0].[FolderTreeID], [t0].[FolderTreeName], [t0].[DateCreated], [t0].[FolderTreeUrl]

FROM [FolderTree] AS [t0]

WHERE [t0].[FolderTreeID] = @p0',N'@p0 int',@p0=1

It checks whether the given FolderTree record was already retrieved and it doesn't run the query twice if already got the record - I'm wondering where does it store the results though ... and how.

sp_executesql is a good choice of running the queries, in my example, it's especially useful when running the FolderTree query (in the loop). Here's how it works:

"sp_executesql can be used instead of stored procedures to execute a Transact-SQL statement a number of times when the change in parameter values to the statement is the only variation. Because the Transact-SQL statement itself remains constant and only the parameter values change, the Microsoft® SQL Server™ query optimizer is likely to reuse the execution plan it generates for the first execution. " msdn2.microsoft.com/en-us/library/aa933299(SQL.80).aspx

Sounds good, isn't it? I'm still very curious about how is the cached data stored, it could be an important question by Customers on large database projects. After some debugging and reverse-engineering, it turns out that the data is stored by using the type of the underlying entity (FolderTree in my case) and is indexed by a special generic collection.

Have fun with LINQ! :)