Performance Considerations when using TPT (Table per Type) Inheritance in the Entity Framework

 


The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.

For Performance Considerations see https://msdn.com/data/hh949853


 

The Entity Framework offers very rich inheritance mapping schemes for building your conceptual model. The inheritance capabilities in EF include:

· TPH (Table per Hierarchy): This is the inheritance scheme where there is a single table in the database representing the hierarchy, but the conceptual model includes multiple entities.

· TPT (Table per Type): In this scheme, each type in the inheritance hierarchy is stored in separate tables in the database

· TPC (Table per Concrete type): In this scheme, each concrete type in the inheritance hierarchy is stored as a separate database table, without any tables in the database for the abstract types. TPC is currently not supported by the Entity Designer in Visual Studio.

All of these inheritance schemes have their own strengths and weaknesses. One thing in particular to consider is that queries done on entities in a TPC or TPT mapped hierarchy are naturally more complicated to represent and translate to the store because of the fact that multiple tables may be involved, which usually require joins or unions. This can introduce performance problems that stem from the complexity of queries generated by EF. On the other hand, TPH fares relatively well when it comes to query complexity.

Take an example of a simple TPT inheritance hierarchy, as shown below:

Now, let’s take an example of a rather simple query that projects some properties out of the base type Guest, without ever querying for anything from any of the subtypes.

var query = from g in db.Guests
select new { Id = g.GuestId, Name = g.Name };

This results in a query that is shown below. This scenario really should only generate a query that involves the base Guests table – however, EF generates a query that includes a join over the union of two of the tables for the sub-types as well.

SELECT

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name]

FROM  [dbo].[Guests] AS [Extent1]

LEFT OUTER JOIN (SELECT

      [Extent2].[GuestId] AS [GuestId]

      FROM [dbo].[Guests_USGuest] AS [Extent2]

UNION ALL

      SELECT

      [Extent3].[GuestId] AS [GuestId]

      FROM [dbo].[Guests_UKGuest] AS [Extent3]) AS [UnionAll1] ON [Extent1]

.[GuestId] = [UnionAll1].[GuestId]

Here’s another example.

 var query = db.Guests;

This LINQ query seems even more simple than the previous one – but the resulting query is not as simple as you might expect, because what we are effectively doing is  requesting a polymorphic result set that includes multiple types.

The scenario itself is naturally more complicated than it seems – however EF is doing extra work to ensure that when returning flattened table that includes the data for all the types, that null values are returned for columns that may be irrelevant for a particular type. Technically, this extra validation is not necessary and the highlighted parts of the query can be safely ommitted from the SQL we generate.

SELECT

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN '0X' WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1],

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name],

[Extent1].[Address] AS [Address],

[Extent1].[City] AS [City],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[State] END AS [C2],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[Zip] END AS [C3],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[PhoneNumber] END AS [C4],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C1] END AS [C5],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C2] END AS [C6]

FROM  [dbo].[Guests] AS [Extent1]

LEFT OUTER JOIN (SELECT

      [Extent2].[GuestId] AS [GuestId],

      [Extent2].[State] AS [State],

      [Extent2].[Zip] AS [Zip],

      [Extent2].[PhoneNumber] AS [PhoneNumber],

      CAST(NULL AS varchar(1)) AS [C1],

      CAST(NULL AS varchar(1)) AS [C2],

      cast(1 as bit) AS [C3],

      cast(0 as bit) AS [C4]

      FROM [dbo].[Guests_USGuest] AS [Extent2]

UNION ALL

      SELECT

      [Extent3].[GuestId] AS [GuestId],

      CAST(NULL AS varchar(1)) AS [C1],

      CAST(NULL AS varchar(1)) AS [C2],

      CAST(NULL AS varchar(1)) AS [C3],

      [Extent3].[PostalCode] AS [PostalCode],

      [Extent3].[PhoneNumber] AS [PhoneNumber],

      cast(0 as bit) AS [C4],

      cast(1 as bit) AS [C5]

      FROM [dbo].[Guests_UKGuest] AS [Extent3]) AS [UnionAll1] ON [Extent1].[GuestId] = [UnionAll1].[GuestId]

These are two very common cases that you may have run into when using TPT with the Entity Framework. The good news is that we are working on these issues so that EF no longer generates unnecessary SQL. The bad news is that it will take some time before the fix is delivered in a future release.

 

So, until we are able to get these fixes in place, there are a few things you can do to make sure you don’t get affected by some of these issues:

Pay special attention when using Model-First with EF

Model-First is a fantastic capability of the Entity Designer to allow you to focus on your conceptual model rather than the storage model when starting to build your application. The thing to keep in mind here is that Model-First uses TPT as the default inheritance mapping strategy when you build inheritance hierarchies. 

 

Consider using TPH inheritance (along with Entity Designer Database Generation Power Pack)

TPH support is great from a performance perspective – and in many ways, TPH can help keep things simple as far as storage and query translation goes. TPH is an option within Model-First if you download and install the Entity Designer Database Generation Power Pack. A screenshot of the designer with the Entity Designer Power Pack installed are shown below:  

 

We understand that there are scenarios where TPT is required and it is a goal of EF to provide the best possible performance when using all the inheritance mapping strategies supported out of the box. While the story is not perfect today with TPT, we hope that you are able to use some of the information mentioned above until we are able to release fixes for these in a future release.

 

Faisal Mohamood
Program Manager, Entity Framework