Foreign Keys are our friends…

I often find customers who question the value of database-enforced integrity constraints.   While people often create primary keys, the foreign ones are perhaps a bit more …. well, foreign. 

Let’s start at the beginning. In traditional, old-school database design, you take your business object and you divide it up into tables. A foreign key (FK) lets you tie the rows (even in different tables) associated with one business object together and describe them as one set that should be preserved.  So, if we create an FK from Orders to Customers and from OrderDetails to Orders, we can now make sure that nobody can delete the customer row.

Foreign keys are enforced on top of the indexing mechanism in SQL Server.  So, this implies that you need indexes in order to use this mechanism.  Especially in larger data warehouses, people scrutinize each index due to the space requirements, often leaving off the definition of foreign keys in their warehouse databases.  In other cases, people might just be lazy and not bother creating them.

 Today, I’ll show you a case when the foreign key definition matters to the database engine.

 The SQL Server Query Optimizer contains logic to detect that some joins are unnecessary and to remove them.  I will show you an example where this happens only when the Foreign Key is defined.  If you think about it, a foreign key enforces that there must be a parent row for any child in the relationship.  So, an equijoin join across this condition is unnecessary if you don’t need any columns from the parent side of the relationship.

 There are two example queries and plans at the end that show that the resulting plans have no joins at all – they have been removed.  So, the next time someone says that foreign keys don’t matter, think about that for a minute before you agree.  While I don’t believe that they are appropriate everywhere, the SQL Server Optimizer does benefit from knowing about constraints in the data like foreign keys.




drop table t1

drop table f1

create table t1(id int primary key not null, data binary(2000))

create table f1(f_id int primary key identity, t_id int not null, otherdata binary(1000))

alter table f1 add constraint fk1 FOREIGN KEY (t_id) REFERENCES t1(id)

–alter table f1 drop constraint fk1

declare @i int=0

set nocount on

begin transaction

while @i < 2000


insert into t1 (id) values (@i*100)

insert into f1 (t_id) values (@i*100)

insert into f1 (t_id) values (@i*100)

insert into f1 (t_id) values (@i*100)

set @i+= 1


commit transaction


select f1.* from t1 inner join f1 on = f1.t_id

select f1.* from f1 where exists (select 1 from t1 where = f1.t_id)



select f1.* from t1 inner join f1 on = f1.t_id



|–Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))



select f1.* from f1 where exists (select 1 from t1 where = f1.t_id)



|–Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))


Comments (3)

  1. Molto spesso mi imbatto in strutture dati che non implementano foreign key demandando l&#39;integrità

  2. erincon says:

    Hi Connor,

    nice to see your blog back 🙂

    Some suggestions with all my respect, and maybe I’m wrong due to set options:

    select, f1.*

    from t1 inner join f1

    on = f1.t_id

    table t1 is accessed, while optimizer should know that is equal to f1.t_id

    if you create an index for the FK column reference:

    create nonclustered index nci_f1_t_id

    on f1 (t_id)

    and you run a query like this (OLTP Traditional)

    select f1.*

    from t1 inner join f1

    on = f1.t_id

    where f1.t_id = 1

    the query uses properly the new NCI, but needs to access the parent table.

    Do you see this recommendations usefull at connect?



  3. Frequently, as a trainer and consultant I have to face in our clients the benefits that customers can