Foreign Keys are our friends...

I often find customers who question the value of database-enforced integrity constraints.   While people often create primary keys, the foreign ones are perhaps a bit more .... well, foreign. 

Let's start at the beginning. In traditional, old-school database design, you take your business object and you divide it up into tables. A foreign key (FK) lets you tie the rows (even in different tables) associated with one business object together and describe them as one set that should be preserved.  So, if we create an FK from Orders to Customers and from OrderDetails to Orders, we can now make sure that nobody can delete the customer row.

Foreign keys are enforced on top of the indexing mechanism in SQL Server.  So, this implies that you need indexes in order to use this mechanism.  Especially in larger data warehouses, people scrutinize each index due to the space requirements, often leaving off the definition of foreign keys in their warehouse databases.  In other cases, people might just be lazy and not bother creating them.

 Today, I'll show you a case when the foreign key definition matters to the database engine.

 The SQL Server Query Optimizer contains logic to detect that some joins are unnecessary and to remove them.  I will show you an example where this happens only when the Foreign Key is defined.  If you think about it, a foreign key enforces that there must be a parent row for any child in the relationship.  So, an equijoin join across this condition is unnecessary if you don't need any columns from the parent side of the relationship.

 There are two example queries and plans at the end that show that the resulting plans have no joins at all - they have been removed.  So, the next time someone says that foreign keys don't matter, think about that for a minute before you agree.  While I don't believe that they are appropriate everywhere, the SQL Server Optimizer does benefit from knowing about constraints in the data like foreign keys.

 

Thanks,

Conor

drop

table t1

drop

table f1

create

table t1(id int primary key not null, data binary(2000))

create

table f1(f_id int primary key identity, t_id int not null, otherdata binary(1000))

alter

table f1 add constraint fk1 FOREIGN KEY (t_id) REFERENCES t1(id)

--alter table f1 drop constraint fk1

declare

@i int=0

set

nocount on

begin

transaction

while

@i < 2000

begin

insert

into t1 (id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

insert

into f1 (t_id) values (@i*100)

set

@i+= 1

end

commit

transaction

select

f1.* from t1 inner join f1 on t1.id = f1.t_id

select

f1.* from f1 where exists (select 1 from t1 where t1.id = f1.t_id)

StmtText

-------------------------------------------------------

select f1.* from t1 inner join f1 on t1.id = f1.t_id

StmtText

----------------------------------------------------------------------------------

|--Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))

StmtText

------------------------------------------------------------------------------

select f1.* from f1 where exists (select 1 from t1 where t1.id = f1.t_id)

StmtText

----------------------------------------------------------------------------------

|--Clustered Index Scan(OBJECT:([blog1].[dbo].[f1].[PK__f1__2911CBED1367E606]))