Tip 12 - How to choose an Inheritance Strategy

Article
04/14/2009

What strategies does the Entity Framework support?

The Entity Framework supports 3 primary inheritance strategies:

Table Per Hierarchy (TPH):

In TPH, all data for a type hierarchy is stored in one table, and there is a discriminator column that is used to establish the type of a particular row (i.e. 'C' for Car or 'B' for Boat).

Columns for Properties that are not shared across all types need to be nullable, even if they aren't nullable properties. Which means the database is not completely enforcing your null-ability constraints.

A TPH table might look something like this:

ID	Description	Discriminator	HorsePower	KeelLength
1	Porsche 911	C	600	NULL
2	Yacht - KZ7	B	NULL	2.2

Table Per Type (TPT):

In TPT properties in a base type are stored in a shared table. For example a Vehicle table:

ID	Description
1	Porsche 911
2	Yacht - KZ7

And then there is a table for each sub type, with columns for just newly declared properties and the key, so the tables can be joined. So there would be a Car table:

ID	HorsePower
1	600

and Boat table:

ID	KeelLength
2	2.2

Table Per Concrete Class (TPC):

In Table per Concrete Class there is a table for each class, and each of those tables has a column for every property of that type, i.e. a Car table:

ID	Description	HorsePower
1	Porsche 911	600

And a Boat table:

ID	Description	KeelLength
2	Yacht - KZ7	2.2

Which strategy is the Best?

Trick question! In isolation of your requirements there is no 'Best' strategy.

But...

While the EF runtime supports TPC, the designer doesn't, and using TPC in the EF forces you to avoid associations in your base type. Because of these issues we generally discourage the use of TPC with the Entity Framework.

This means in most situations the question comes down to TPH or TPT?

Here are some of the things you might want to consider when making your decision:

Concern	Winner	Reasoning
Performance	TPH	Table Per Hierarchy is generally better performing, because no joins are necessary, everything is in one table. The decision becomes even more clear cut once the inheritance hierarchy gets wide or deep.
Flexibility	TPT	Table Per Type is often used by ISVs, as it allows customizations without modifying the 'base' table. I.e. new subtypes can be added simply by creating new tables for those sub-types.
Database Validation	TPT	TPH requires columns in derived types to be NULLABLE in the database, so that other derived types can be stored in the same table. Because of this it is possible to create rows in the table that are not valid according to the conceptual model. I.e. the column is NULLABLE but the combination of a particular column being NULL and a particular discriminator or type is not valid. This means the database is not enforcing the conceptual model for you anymore. This is fine if all access to the database is via the EF, but if anything else is used, you can end up with 'dirty' data.
Aesthetics	TPT	This one is completely subjective, but TPT feels more Object Oriented to me :)
Storage Space	TPT	Ward Bell pointed this one out: If your inheritance hierarchy has lots of types, then using TPH will result in lots of empty cells. If your database can handle 'sparse' columns well this probably isn't a real concern.

As you can see once you know what it is you are looking for it should be a pretty easy task to choose a strategy. Most of the time the recommendation is TPH because generally performance trumps these other concerns.

But every situation is different, and the key is to understand exactly what you value, and then make the decision accordingly.

Let me know if you have any questions.