Improvements to Generated SQL in .NET 4.0


The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.


In a previous post (https://blogs.msdn.com/adonet/archive/2009/08/05/improvements-to-the-generated-sql-in-net-4-0-beta1.aspx), we talked about the improvements to the quality and the readability of the SQL generated when querying using the Entity Framework that we made in .NET 4.0 Beta 1. We continued working on that task and here I talk about the additional improvements provided with .Net 4.0.

 

Again, most of the improvements we have made are in the Entity Framework query pipeline as opposed to specific SqlClient changes, thus affecting the SQL generated by any backend, not only by our SQL Server provider.

 

Unless otherwise noted:

· The examples are based on the NorthwindEF Entity Data Model provided with the ADO.NET Entity Framework Query Samples (https://code.msdn.microsoft.com/EFQuerySamples).

· The “Before” generated SQL is produced by the Entity Framework in .NET 3.5, and the “After” by the Entity Framework in .NET 4.0.

 

Also, in all the examples below, the parts of the SQL query highlighted in red have been removed and those in yellow have been added.

 

1. Elimination of internal constants used as null sentinels

In the query pipeline we introduce internal constants to serve as null sentinels, i.e. based on whether they come out as null we make decisions whether a record should be null /collection should be empty. In some cases, the introduction of the new column can cause a new level of nesting to be introduced. The improvement is instead of introducing new constants to try reusing other columns for the same purpose. This in many cases leads to simpler queries with less nesting.

Example:

var query = from c in context.Categories

select new { c.CategoryID, Count = c.Products.Count }

Before

SELECT

1 AS [C1],

[Project1].[CategoryID] AS [CategoryID],

[Project1].[C1] AS [C2]

FROM ( SELECT

      [Extent1].[CategoryID] AS [CategoryID],

      (SELECT

            COUNT(cast(1 as bit)) AS [A1]

            FROM [dbo].[Products] AS [Extent2]

            WHERE [Extent1].[CategoryID] = [Extent2].[CategoryID]) AS [C1]

      FROM [dbo].[Categories] AS [Extent1]

)  AS [Project1]

After

 

SELECT

[Extent1].[CategoryID] AS [CategoryID],

(SELECT

      COUNT(1) AS [A1]

      FROM [dbo].[Products] AS [Extent2]

      WHERE [Extent1].[CategoryID] = [Extent2].[CategoryID]) AS [C1]

FROM [dbo].[Categories] AS [Extent1]

 

2. Translating String.StartsWith, String.EndsWith and String.Contains to LIKE in LINQ to Entities

In .NET 3.5, in LINQ to Entities we provided support for the String.StartsWith, String.EndsWith and String.Contains and we translated them to canonical functions and ultimately to SQL in the following way:

CLR Method

Translation in terms of canonical functions

Resulting translation for SQL Server

Boolean String.Contains(String value)

IndexOf(this, value) > 0

CHARINDEX(value, this) > 0

Boolean String.EndsWith(String value)

Right(this, length(value)) = value

(RIGHT(this, LEN(value)) = value

Boolean String.StartsWith(String value)

IndexOf(this, value) = 1

CHARINDEX(value, this) = 1

 

These translations resulted in inability to utilize indexes on the tested string and were not as easy to read. Therefore, in .NET 4.0, when the given value is a constant or a parameter and the provider support translation to LIKE, we translate these functions into a DbLikeExpression, which ultimately results in a “LIKE” on SQL Server.

Example:

var query = from p in context.ProductSet

            where p.ProductName.Contains("va")

            select p.ProductID;

Before

SELECT

[Extent1].[ProductID] AS [ProductID]

FROM [dbo].[Products] AS [Extent1]

WHERE (CAST(CHARINDEX(N'va', [Extent1].[ProductName]) AS int)) > 0

After

 

SELECT

[Extent1].[ProductID] AS [ProductID]

FROM [dbo].[Products] AS [Extent1]

WHERE [Extent1].[ProductName] LIKE '%va%'

3. Improve the translation of the canonical functions StartsWith, Contains and EndsWith to use LIKE

 

In .NET 4.0 Beta 1 we introduced new canonical functions, among which the following string functions along with their translations for SQL Server:

Canonical Function

Translation for SQL Server

Boolean String.Contains(String value)

CHARINDEX(value, this) > 0

Boolean String.EndsWith(String value)

CHARINDEX(REVERSE(arg1), REVERSE(arg0)) = 1

Boolean String.StartsWith(String value)

CHARINDEX(value, this) = 1

 

We have now changed the translations of these functions to use LIKE when the value is a constant.

Example:

Note: The ‘Before’ SQL in this example is produced by Entity Framework in .NET 4.0 Beta1

string entitySql = "contains('abcd', 'bc')";

Before

SELECT

CASE WHEN (CHARINDEX( 'bc', 'abcd') > 0) THEN cast(1 as bit) WHEN ( NOT (CHARINDEX( 'bc', 'abcd') > 0)) THEN cast(0 as bit) END AS [C1]

FROM ( SELECT 1 AS X ) AS [SingleRowTable1]

After

SELECT

CASE WHEN ('abcd' LIKE '%bc%') THEN cast(1 as bit) WHEN ( NOT ('abcd' LIKE '%bc%')) THEN cast(0 as bit) END AS [C1]

FROM ( SELECT 1 AS X ) AS [SingleRowTable1]

 

4. Additional Join Elimination when there is more than one navigation property from one entity type to another

In some cases, then there is more than one navigation property from one entity type to another, and both naviation properties were accessed mulitple times we were generating an output quey with redundant joins.

Example :

Note: This example is based on a slightly modified schema from the one provided in the ADO.NET Entity Framework Query Samples. In particular, a new Association Products_SecondaryCategory has been added identical to the Products_Category association along with the navigation properties. Also, referential constrains between Categories and Products for both associations have been added to the SSDL.

var query = from p in context.ProductSet select new {

          p.ProductID,

         category = p.Category.CategoryName,

         categoryDescription = p.Category.Description,

           secondaryCategory = p.SecondaryCategory.CategoryName,

         secondaryCategoryDescription = p.SecondaryCategory.Description

            };

Before

SELECT

1 AS [C1],

[Extent1].[ProductID] AS [ProductID],

[Extent2].[CategoryName] AS [CategoryName],

[Extent3].[Description] AS [Description],

[Extent4].[CategoryName] AS [CategoryName1],

[Extent5].[Description] AS [Description1]

FROM [dbo].[Products] AS [Extent1]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent2] ON [Extent1].[CategoryID] = [Extent2].[CategoryID]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent3] ON [Extent1].[CategoryID] = [Extent3].[CategoryID]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent4] ON [Extent1].[SecondaryCategoryID] = [Extent4].[CategoryID]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent5] ON [Extent1].[SecondaryCategoryID] = [Extent5].[CategoryID]

After

 

SELECT

[Extent1].[ProductID] AS [ProductID],

[Extent2].[CategoryName] AS [CategoryName],

[Extent2].[Description] AS [Description],

[Extent3].[CategoryName] AS [CategoryName1],

[Extent3].[Description] AS [Description1]

FROM [dbo].[Products] AS [Extent1]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent2] ON [Extent1].[CategoryID] = [Extent2].[CategoryID]

LEFT OUTER JOIN [dbo].[Categories] AS [Extent3] ON [Extent1].[SecondaryCategoryID] = [Extent3].[CategoryID]

 

5. Provide mechanism for efficient queries on non-Unicode columns

In .NET 3.5, whenever a constant or a parameter was used in LINQ to Entities query, we treated it as being Unicode. As a result, when comparing a constant to a property stored in a non-unicode column on SQL Server, if there was an index on that column, it was not being used.

To address the issue, we now generate non-unicode constants and parameters when these are used in LINQ to Entities queries in comparisons with non-unicode columns.

Example:

Note: This example is based on a slightly modified schema from the one provided in the ADO.NET Entity Framework Query Samples. In particular, the SSDL has been modified, to specify the ProductName column as non-unicode.

 var query = from p in context.ProductSet

where p.ProductName == "Chocolade"

select p.ProductID;

Before

SELECT

[Extent1].[ProductID] AS [ProductID]

FROM [dbo].[Products] AS [Extent1]

WHERE N'Chocolade' = [Extent1].[ProductName]

 

After

 

SELECT

[Extent1].[ProductID] AS [ProductID]

FROM [dbo].[Products] AS [Extent1]

WHERE 'Chocolade' = [Extent1].[ProductName]

 

6. Collapse multiple Case statements into one

Queries with deeply nested conditional statements could fail due a SQL Server’s limitation on the level of nesting of CASE statements. This scenario is especially interesting with LINQ to Entities queries like the one in the example, which are natural to write. To provide a better support for queries like this, we now collapse multiple cascading case statements into one.

Example:

var query = from p in context.ProductSet

select

(p.ProductID == 1) ? "first" :

                 (p.ProductID == 2) ? "second" :

                (p.ProductID == 3) ? "third" :

                 (p.ProductID == 4) ? "fourth" :

      "other";

 

Before

SELECT

CASE WHEN (1 = [Extent1].[ProductID]) THEN N'first' ELSE

  CASE WHEN (2 = [Extent1].[ProductID]) THEN N'second' ELSE

    CASE WHEN (3 = [Extent1].[ProductID]) THEN N'third' ELSE

      CASE WHEN (4 = [Extent1].[ProductID]) THEN N'fourth' ELSE N'other'

END END END END AS [C1]

FROM [dbo].[Products] AS [Extent1]

After

 

SELECT

CASE WHEN (1 = [Extent1].[ProductID]) THEN N'first'

     WHEN (2 = [Extent1].[ProductID]) THEN N'second'

     WHEN (3 = [Extent1].[ProductID]) THEN N'third'

     WHEN (4 = [Extent1].[ProductID]) THEN N'fourth' ELSE N'other' END AS [C1]

FROM [dbo].[Products] AS [Extent1]

 

Thank you!

Kati Iceva,

Developer, Entity Framework