Entity SQL Non-Quirkiness

Zlatko has been posting about one LINQ to Entities new feature included in the upcoming Beta 3, so I decided to take revenge and make a 100% Entity SQL post. Here is something I ran against the other day:

Let's assume we need to retrieve the Order with the maximum OrderID, which is a really awful way to get the ID of the order you just inserted! :)

In your everyday store-specific SQL, you can use a MAX() aggregate function in a subquery as a WHERE predicate. In Transact SQL, it should look like this:

SELECT *
FROM   Products AS p
WHERE  p.ProductID =(SELECT MAX(p2.ProductID)
                     FROM Products as p2);

So far, so good. If you have been playing a little with Entity SQL, you will probably guess how the equivalent Entity SQL would look like:

SELECT VALUE p
FROM   Products AS p
WHERE  p.ProductID =(SELECT MAX(p2.ProductID)
                     FROM Products as p2);

But if you run this query, what you get is this interesting exception:

System.Data.QueryException: Argument types 'Edm.Int32' and 'Transient.collection[Transient.rowtype[(_##groupAggMax2,Edm.Int32(Nullable=True,DefaultValue=))](Nullable=True,DefaultValue=)]' are incompatible for this operation, near WHERE predicate, line 1, column 60.

The subquery is actually returning a Transient.collection of a Transient.rowtype... Those are internal things, so for illustration purposes, let's turn to the LINQ perspective of life:

var query = from p in context.Products
            select new { p.ProductID };

int productID = query;

 

(Argh, this post is no longer 100% Entity SQL!)

No surprisingly, what you get is a compile-time exception:

Cannot implicitly convert type 'System.Linq.IQueryable<AnonymousType#1>' to 'int'.

Both exceptions are homologous, and for a text-based query language, Entity SQL happens to be very type-safe at its core. Standard SQL makes the basic assumption that it is ok to implicitly convert single-item collections of single-column projections to discrete scalars. We don't.

The basic theme in Version 1.0 of the Entity Framework is to build a solid foundation for the future. As a consequence, one thing we avoid doing is "magic" type conversions except when they make perfect sense (think union of projection queries with exactly the same shape). The motive: magic conversions tend to mine the stability and composability of the language.

That said, this buys us freedom to hand-pick certain implicit behavior in the future, if we find enough feedback and proof that it makes sense.

That's enough on the rationale. Now, how do I make it work? There are two approaches.

First:

SELECT VALUE p
FROM   Products AS p
WHERE  p.ProductID = MAX(SELECT VALUE p2.ProductID
FROM Products AS p2);

This one works because:

a) The SELECT VALUE returns the scalar itself, instead of a projection (rowtype) of the scalar.

b) MAX() operates on the collection of scalars returned by the subquery, returning a single maximum value that will be directly comparable (same type) as ProductID.

Second:

SELECT VALUE p
FROM   Products AS p
WHERE  p.ProductID = ANYELEMENT(
SELECT VALUE MAX(p2.ProductID)
FROM Products AS p2);

This works because:

a) The subquery will return a single-item collection of a scalar value.

b) ANYELEMENT will retrieve a single element (in this case, the only one) contained in the collection. That element will be directly comparable with ProductID.

In case you are wondering now how efficient this is, don't worry. Entity SQL is still a functional language. So, while understanding the type reasoning is interesting and useful, these queries still express "what you want to get" rather than "how you want the job done".

As a matter of fact, with our current SqlClient implementation, these queries will be translated to some simple, yet unexpected Transact-SQL. But I'll leave that to you as an exercise...