LINQ to SQL: The Mapping Engine

The primary purpose of any ORM system is to map relational data onto objects in your programming environment. Mapping here refers to the meaning in the mathematical sense that there is a correspondence from one item to the other. For example, a database row might map to an object, or a field in a database might map to a property. Some mappings are simple, like the ones I’ve already mentioned; others are more complex such as parts of multiple rows combining to form a single object.

LINQ to SQL has a variety of mapping capabilities some simple and some edging toward complex. Other ORM systems have additional mappings too, though none ever seem to fully satisfy the somewhat inexhaustive list of mappings that domain experts imagine.

Mappings in an ORM system are very similar to the view system employed by databases, yet are often described as data for a mapping engine rather than as query in-and-of itself, which is likely why they never seem to cover all the possible cases. On the surface, views through queries seem much more powerful since you can basically write code to transform one set of information into another. A declarative model is more constrained, but that’s not necessarily a bad thing. It keeps you from expressing mappings that have no inverse, which would be unfortunate since after mapping relational data out of database and into objects you often want to map it back. LINQ to SQL is no different. It allows you to describe mapping using attributes placed on your objects or via an XML ‘map’ file you can author and load at runtime.

Yet, even though the API only allows you to declare mapping in this constrained fashion, the actual mapping engine used by LINQ to SQL operates closer to the view model. The information you specify in the mapping is actually turned into a view query over the tables in your database. The query you were trying to execute actually ends stuck on top of this view query, similarly to if you were querying against an actual database view.

Let’s take a look at how this works.

var q = from c in db.Customers

        where c.City == "London"

        select c;

This simple query is actually the following API calls underneath:

var q = db.Customers
.Where(c => c.City == "London")
.Select(c => c);

Translated into a first approximation SQL query:

SELECT c
FROM db.Customers AS c
WHERE c.City = ‘London’

Of course, ‘db.Customers’ is a database table so we must translate it and the references to it into something that SQL can understand. Here’s where the mapping comes in. In this case, our mapping between the database and the customer class is basically one-to-one so we can rewrite the expression ‘db.Customers’ with a query that constructs an object out of the underlying columns using a hybrid C# & SQL syntax.

SELECT c
FROM (
SELECT new Customer {
CustomerID = t.CustomerID,
CustomerName = t.CustomerName, …
}
FROM [Customers] AS t

) AS c
WHERE c.City = ‘London’

Of course, the server cannot actually construct an object instance and send it back to us (at least not yet) so we must reduce this query further into something it can understand. We do this by extracting out the portion of the projection that cannot be understood and leaving the parts that can be understood as a flat tuple of columns.

Projection
new Customer {
CustomerID = Column(“CustomerID”),
ContanctName = Column(“ContactName”), …
}

Query
SELECT c.CustomerID, c.ContactName, …
FROM [Customers] AS c
WHERE c.City = ‘London’

Now, it easy to see that the remaining query is what is executed against the server and the projection becomes the rule for turning the tabular results of the query into actual objects.

The mapping engine took the declared mapping and turned it into a query expression that produces the desired objects. This query is just like the ‘view’ used by a database, except it’s expressed in terms of constructing CLR objects and not relational tables.

A slightly more complicated query shows the truth of this a little better. For instance, a query can project any shape it wants, not just one of the mapped objects.

var q = from c in db.Customers

        where c.City == "London"

        select new { c.ContactName, c.Phone };

Jumping right to the SQL we see:

SELECT new { ContactName = c.ContactName, Phone = c.Phone }
FROM (
SELECT new { CustomerID = t.CustomerID … }
FROM db.Customers AS t

) AS c
WHERE c.City = ‘London’

Which reduces to:

SELECT new { ContactName = c.ContactName, Phone = c.Phone }
FROM [Customers] AS c
WHERE c.City = ‘London’

And then to:

Projection
new {
ContanctName = Column(“ContactName”),
Phone = Column(“Phone”);
}

Query
SELECT c.ContactName, c.Phone
FROM [Customers] AS c
WHERE c.City = ‘London’

It turns out, representing the mapping as an actual query expression is very useful. Because mappings are expressed in the exact same form you would expression your own projection, all projections compose nicely on top of any mapping. Also, since both mappings and projections are expressed using a single language, the top-level projections extracted from the overall query become more than just descriptions of the objects you want, they in fact are the actual code that is used to construct them.

CODE is DATA çèDATA is CODE.

Hopefully, you can start to see how other more complicated mappings such as inheritance would be represented.

Given a hierarchy such as: Person -> Contact -> Customer

var q = from p in db.People

        select p;

The expression ‘db.People’ is mapped into a query that constructs one of three different types.

SELECT
CASE p.Type
WHEN 1 THEN new Person {Name = p.Name}
WHEN 2 THEN new Contact {Name = p.Name, Phone = p.Phone}
WHEN 3 THEN new Customer {Name= p.Name, Company = p.Company }
END
FROM [People] AS p

Which becomes:

Projection
CASE Column(“Type”)
WHEN 1 THEN new Person {Name = Column(“Name”)}
WHEN 2 THEN new Contact {Name = Column(“Name”), Phone = Column(“Phone”)}
WHEN 3 THEN new Customer {Name = Column(“Name”), Company = Column(“Company”) }
END

Query
SELECT p.Type, p.Name, p.Phone, p.Company
FROM [People] AS p

Mapping columns from multiple tables into a single object becomes downright trivial.

SELECT new Customer {
CustomerID = c.CustomerID,
ContactName = c.ContactName,
Address = ca.Address,
City = ca.City, …
}
FROM [Customers] AS c
INNER JOIN [CustomerExtended] AS ca ON c.CustomerID = ca.CustomerID

Of course the opposite is true as well; mapping a single table into multiple objects is just as easy.

SELECT new Customer {
Customer ID = c.CustomerID,
ContactName = c.ContactName,
Address = new Address {
Street = c.Address,
City = c.City, …
}}
FROM [Customers] AS c

Because the LINQ to SQL mapping engine uses query expressions to express the mapping, just about any mapping imaginable is possible through the query translator. In fact, any query that you could write to produce the output you want could be represented within the mapping engine.

Aren’t you just completely geeking-out about now? I know I am. J

Of course, that might lead you to question why LINQ to SQL does not support some of these more advanced mappings, if it’s so easy. The truth is that even though the mapping engine and query translator can understand all of these, the test burden to prove that they all work as expected, with the correct plumbing through the change tracker and update processor, not to mention the complication added to the mapping user model (attributes and xml file) and designer was simply too large to swallow in a single release.

Instead, we took a conservative approach and engineered the most common and useful mappings as a starting point, allowing the system to stay simple and easy to understand by most people.

And yes, we do want your feedback on what should come next. Please help us budget our time for the next version by either leaving comments here or on the MSDN forum.