ADO.NET’s Entity Client comes with a client-side query engine that provides a consistent entity-level platform over different store providers. The input to that query engine is an entity-level query written in a language naturally called Entity SQL. The purpose of Entity SQL is to provide a common, entity-level, query language independent of any data store, so that queries written in it are the same across different store implementations. People might expect Entity SQL to be a small subset of SQL constructs that is common for most data stores. Those will be pleasantly surprised - Entity SQL is a very rich functional language whose expressiveness supersedes conventional SQL dialects.
The building block of Entity SQL is expression. The language is about composing complex expressions. Thus the simplest Entity SQL query looks like this:
The domain of expression types is multidimensional. With regard to multiplicity, expressions fall into two categories: items and collections. The difference between those two is easy to guess – collections are subject to relational operations while items are not. Here are some valid collection expressions:
With regard to source, expressions may be literals, parameters, or nested expressions. That means anywhere a value is expected, it could be parameterized or composed of some other expression. Parameterization has both security- and performance benefits. By passing user-supplied input as parameters, one avoids the risk of SQL injection. And by using a constant query text, a frequently used query is likely to stay in the query plan cache and won’t waste time to compile for each execution.
Composition means that queries are no longer constrained to the tabular shape of conventional SQL. A nested expression may be placed anywhere a value of that type would be accepted. For instance:
If a collection represents the vertical axis of a conventional SQL row set, then what represents the horizontal axis, i.e. how can we make a composite item? In addition to the scalar items that we’ve seen already, there are rows and entities. An Entity SQL row is identical to the conventional SQL row – it’s a set of items eventually of different types. A row is constructed through the ROW() constructor.
An entity is an instance of an entity type from the entity model. An entity could be flattened to a row through the VALUE keyword. The opposite is also true – an entity may be constructed from a row using the entity type constructor as long as all the properties are supplied in the exact order as they are defined in the entity model.
Now we are ready for some samples with relational operations. All the samples bellow use the Northwind sample model:
Navigate vs. Join
An entity model has no notion of normalization as that is only applicable to store implementations. Therefore entity-level queries against a well-designed model should rarely need to use JOIN. Instead, those queries should use navigation properties. Those are properties that represent relationships in an entity model. For instance:
Navigating the same relationship in the opposite direction is even more interesting:
References and Entities
References and entities in Entity SQL differ with regard to entity properties - properties are directly applicable only on entities, and not on references. A reference must be “dereferenced” to an entity first. For instance, to query all products and the name of each product’s category, we need the following query:
As it was mentioned above, a nested query may be used anywhere an expression of the type it returns is accepted. Furthermore, since Entity SQL is not limited to a tabular result set as conventional SQL dialects are, one can place a nested query even in a projection clause:
Paging is a hot feature that web applications demand from data providers. Roughly speaking paging is extracting a range/page of a base collection starting at a given offset and proceeding for a given length/size. The benefit of paging is reducing the cost of building and transporting the result on the backend and processing it on the frontend.
Entity SQL supports physical paging. That’s what the major data store providers support today. Physical paging means both the offset and the size of the requested page are given in number of items. For instance, if an application is rendering products from the Northwind model in groups of 10, on page 1 it will render items 1 through 10, on page 2 – items 11 through 20, and so forth. Physical paging has some deficiencies with regard to changes in the base collection:
· For each item that is removed from a lower-index page, the rest of the collection will shift towards the beginning, and one item will not be shown at all.
· For each item that is added to a lower-index page, the rest of the collection will shift towards the end, and one item will be shown more than once.
Despite of its semantic deficiencies physical paging could be implemented efficiently by data providers. It is also easy to comprehend and to adopt in applications.
Paging implies some sort order. Although in standard SQL one can query a range without specifying any ordering, Entity SQL doesn’t allow that exception – paging constructs are invalid without an ORDER BY clause. The clauses to specify page offset and page size are SKIP and LIMIT respectively. The following example would retrieve products from position 21 through position 30 ordered by name:
Each of SKIP and LIMIT is optional. When LIMIT is given alone, it is equivalent to TOP in the projection clause. However, mixing SKIP and TOP is not allowed.
Differences from Standard SQL
Although Entity SQL shares a set of operations and syntax constructs with standard SQL, it should not be approached as another “SQL dialect”. Instead, it should be regarded as a new, collection-oriented, query language with its own merits.
The syntax of standard SQL is driven by the tabular shape of the result sets that data stores return. There are some shortcuts that are not available in Entity SQL since Entity SQL does not imply any particular shape of the collections it operates on. Following are examples of how some standard SQL statements map into Entity SQL:
Writing such a statement in Entity SQL is ambiguous since the operands of the UNION ALL operator are unclear. It would work over named collections like this:
The rule in Entity SQL is that nested queries must always be enclosed in parentheses:
Another convenient shortcut in T-SQL is that an ORDER BY clause may be specified at the end of the last query of a UNION ALL operator:
Entity SQL’s UNION ALL operator doesn’t have an ORDER BY clause at all. Again, nesting should be used:
Overall our belief is that having stricter rules and less exceptions is better for the adoption of a programming language. Once a developer abstracts herself from the tabular constraint of conventional result sets, Entity SQL will become much easier to comprehend than standard SQL.
A beta version of the Entity SQL language reference is available as part of the June CTP distribution. Take a look and try out some queries. Don’t hesitate to post questions and ask for help at the Entity Framework forum. Feedback is welcome.
Program Manager, Data Programmability Runtime