Conor vs. FOREIGN KEY join elimination

I received a question from a friend in Brazil related to foreign keys and join elimination in SQL Server.  Yes, SQL Server can detect that some joins are not needed to return results to the user and skip them.  What is this evil magic, you ask?  Well, let’s walk through it before we get out our torches and pitchforks…

 

What’s a Foreign Key?  To most users, it is something that throws errors and blocks you from inserting data where you want it.  However, to SQL Server, it is a glorious thing!  It lets the Query Optimizer know that the data in one table absolutely correlates to the data in another table.  (So, if I have a Customers table and an Orders table, a Foreign Key from Orders to Customers tells me that every Order has a Customer).  Now, when one queries the join of Orders and Customers, it might be that you only really need columns from the Orders table.  Now we’re in business – the Optimizer can detect this and see that the result of the query is the same if we just skip all of the work of doing random IO lookups into the Primary Key index on Customers to make sure that they are there.

 

So, you should think of Foreign Keys as a great way to give the Optimizer hints to speed up your query, in most cases.  The only cases where I recommend that customers NOT define Foreign Keys is in very large data warehouses (which I will save for another blog post).

 

So, the question from my Brazillian friend is why we DO the elimination for single-column Foreign Keys and not for multi-column cases.

 

(I will post the mail I got from him here – so, his example + comments)

    1:   
    2:  IF OBJECT_ID('Tab2') IS NOT NULL
    3:  BEGIN
    4:    DROP TABLE Tab2
    5:  END
    6:  IF OBJECT_ID('Tab1') IS NOT NULL
    7:  BEGIN
    8:    DROP TABLE Tab1
    9:  END
   10:  GO
   11:  CREATE TABLE Tab1 (Tab1_Col1 Integer NOT NULL PRIMARY KEY, Tab1_Col2 Char(200))
   12:  CREATE TABLE Tab2 (Tab2_Col1 Integer NOT NULL PRIMARY KEY, Tab1_Col1 Integer NOT NULL, Tab2_Col2 Char(200))
   13:  ALTER TABLE Tab2 ADD CONSTRAINT fk FOREIGN KEY (Tab1_Col1) REFERENCES Tab1(Tab1_Col1)
   14:  GO
   15:   
   16:  -- Fine, the execution plan not use Tab1
   17:  SELECT Tab2.* 
   18:    FROM Tab2
   19:   INNER JOIN Tab1 
   20:      ON Tab1.Tab1_Col1 = Tab2.Tab1_Col1
   21:  --|--Clustered Index Scan(OBJECT:([dbo].[Tab2].[PK__Tab2__993AF6027FB886E3]))
   22:   
   23:  GO
   24:   
   25:  IF OBJECT_ID('Tab2') IS NOT NULL
   26:  BEGIN
   27:    DROP TABLE Tab2
   28:  END
   29:  IF OBJECT_ID('Tab1') IS NOT NULL
   30:  BEGIN
   31:    DROP TABLE Tab1
   32:  END
   33:  GO
   34:  CREATE TABLE Tab1 (Tab1_Col1 Integer NOT NULL, Tab1_Col2 Integer NOT NULL, Tab1_Col3 Char(200), PRIMARY KEY(Tab1_Col1, Tab1_Col2))
   35:  CREATE TABLE Tab2 (Tab2_Col1 Integer NOT NULL PRIMARY KEY, Tab1_Col1 Integer NOT NULL, Tab1_Col2 Integer NOT NULL, Tab2_Col2 Char(200))
   36:  ALTER TABLE Tab2 ADD CONSTRAINT fk FOREIGN KEY (Tab1_Col1, Tab1_Col2) REFERENCES Tab1(Tab1_Col1, Tab1_Col2)
   37:  CREATE INDEX ix ON Tab2(Tab1_Col1, Tab1_Col2)
   38:  GO
   39:   
   40:  -- Why not use the same behavior above ? Just because the multi column foreign key?
   41:  SELECT Tab2.*
   42:    FROM Tab2
   43:   INNER JOIN Tab1 
   44:      ON Tab1.Tab1_Col1 = Tab2.Tab1_Col1
   45:     AND Tab1.Tab1_Col2 = Tab2.Tab1_Col2
   46:  --|--Nested Loops(Inner Join, OUTER REFERENCES:([dbo].[Tab2].[Tab1_Col1], [dbo].[Tab2].[Tab1_Col2]))
   47:  --     |--Clustered Index Scan(OBJECT:([dbo].[Tab2].[PK__Tab2__993AF602084DCCE4]))
   48:  --     |--Clustered Index Seek(OBJECT:([dbo].[Tab1].[PK__Tab1__6D1AC6A2047D3C00]), SEEK:([dbo].[Tab1].[Tab1_Col1]=[dbo].[Tab2].[Tab1_Col1] AND [dbo].[Tab1].[Tab1_Col2]=[dbo].[Tab2].[Tab1_Col2]) ORDERED FORWARD)
   49:   
   50:  -- Why not apply the predicate at Tab1.Tab1.Col2 too? We have the trusted foreing key to be able to do that.
   51:  -- If we do it, in that case we could use the PK of Tab1 and the Index ix of Tab2 just filtering by Tab1_Col1(JOIN) and Tab1_Col2(WHERE)
   52:  SELECT Tab2.* 
   53:  FROM Tab2
   54:  INNER JOIN Tab1
   55:    ON Tab1.Tab1_Col1 = Tab2.Tab1_Col1
   56:  WHERE Tab2.Tab1_Col2 = 10
   57:  --|--Nested Loops(Inner Join, OUTER REFERENCES:([dbo].[Tab2].[Tab1_Col1]))
   58:  --     |--Clustered Index Scan(OBJECT:([dbo].[Tab2].[PK__Tab2__993AF602084DCCE4]), WHERE:([dbo].[Tab2].[Tab1_Col2]=(10)))
   59:  --     |--Clustered Index Seek(OBJECT:([dbo].[Tab1].[PK__Tab1__6D1AC6A2047D3C00]), SEEK:([dbo].[Tab1].[Tab1_Col1]=[dbo].[Tab2].[Tab1_Col1]) ORDERED FORWARD)

 

Before I answer, let me just say that I love getting questions like this – it shows that people really do like to deeply understand the engine and use what we put in it to its fullest. 

 

So, the answer is that we did the single-column case many years ago since most FKs are single-column.  We haven’t seen enough cases where the multi-column FK is used to make it more important than all of the other things that we’ve been requested to add to the product.  I wish the answer had more intrigue, but it’s just a case where we have lots and lots of things to code and we haven’t had time to do this one yet.  The technical side of me obviously wishes we had all of these things coded into the product.  The other way to look at it is that people are upset if query compilation takes too long, so we have to find a balance between what is mathematically possible and what is useful to the broadest set of customers.

If we do find that this kind of construct becomes more common, we would likely go and add this support for a future release of the product.

 

For the last example, predicate duplication across multi-column joins is another one of these cases where it is possible to do but does not show up as being the dominant factor in queries often enough to enable by default.

 

One last note – we also do not do FK join elimination on tempdb, even if you have a FK defined.

Happy Querying!

Conor Cunningham