Table variable vs. temp table and estimated vs. actual query plan

The other day we got a call from a puzzled friend. He has rewritten a stored procedure using table variable instead of temp table, it makes the stored procedure code look more tidy. The puzzling part is the same stored procedure now running a lot slower. The estimated plans looked the same using temp table and table variable, but the execution plans are very different. Unbeknownst to him, rewriting the procedure using table variable bumped into a query optimizer blind spot. While the “Estimated” and “Actual” query plans are same in most cases, there are some exception. If a table (temporary or permanent) is created in the same batch with the query, SQL Serve has to recompile the query since the query definition is unknown when the batch is compiled the first time. This is not the case for table variables (and that was the main reasons SQL Server 2005 introduce them – to reduce recompilations when unnecessary). In order to explain in detail, let’s use the following example:

--0. create test data

use test

if (OBJECT_ID('Test1') is not null)

DROP TABLE Test1

go

CREATE TABLE Test1 (ID int)

DECLARE @i int

SET @i = 0

SET NOCOUNT ON

WHILE @i < 20000

BEGIN

INSERT INTO Test1 (ID) Values (@i)

SET @i = @i + 1

END

CREATE CLUSTERED INDEX IX_Test1 ON dbo.Test1 (ID)

--1. Query using table variable

DECLARE @Tmp1 TABLE (ID int)

INSERT INTO @Tmp1(ID)

SELECT ID

FROM Test1

SELECT *

FROM Test1

WHERE ID NOT IN (SELECT ID FROM @Tmp1)

--2. Query using temp table

CREATE TABLE #Tmp1(ID int)

INSERT INTO #Tmp1(ID)

SELECT ID

FROM Test1

SELECT *

FROM Test1

WHERE ID NOT IN (SELECT ID FROM #Tmp1)

DROP TABLE #Tmp1

The estimated plans for both SELECT queries in the example above look the same – Nested Loops (Left Anti Semi Join) with the outer (top in the graphical display) table the Clustered index on Test1 and the inner table is the table variable in the first case and the temporary table in the second case. But if you run both scripts you will see that the first runs considerably slower – a minute or two depending on your processor speed, memory available to SQL Server and disk I/O capacity. How is this possible?

When the queries are compiled neither the table variable nor temporary table are populated and the query optimizer assumes “minimum” number of rows and that is always 1 in SQL Server. This estimation is used when SQL Server generates the “Estimated” query plan. While the “Estimated” and “Actual” query plans are same in most cases, there are some exception. If a table (temporary or permanent) is created in the same batch with the query, SQL Serve has to recompile the query since the query definition is unknown when the batch is compiled the first time. This is not the case for table variables.

Now turn on “Display  Actual Execution Plan” in SSMS and re-run the scripts 1. in one query window and 2. in another one.  You will see that the actual plan for script 1 is the same as estimated plan and took substantially longer than the script 2 to complete. The actual execution plan for Script 2 is very different from its estimated plan. The actual execution plan for script 2 uses Hash Match to execute the Anti Semi Join at the top of the plan. There is a table scan of the temporary table #Tmp1 at the top of the plan building the hash table. If you examine the properties of this scan you will find out that the number of Estimated rows is 20000 (was 1 in the Estimated plan as well as it is 1 for the table variable plan).

Figure 1. Estimated plan

clip_image001

Figure 2. Actual plan of SELECT query using temp table

clip_image001[10]

When examining query plans be careful to consider possible plan change during the batch execution due to recompiles. Bear in mind that while table variables are not causing recompiles, you may need just the opposite to get the best query plan. You should always use temporary tables in your queries unless you intentionally want to avoid query recompilation and you are confident query variable will give you a good plan. Introduction of statement level recompilation in SQL Server 2005 substantially lowered the need to use table variables in your queries.

You will find some great information about table variables and temporary tables in

https://databases.aspfaq.com/database/should-i-use-a-temp-table-or-a-table-variable.html .

Please note that the actual execution plans for the above batches will change if you merge 1 and 2 into to a single batch and will change again if you merge all three into single batch… There are some interesting learnings surrounding such modifications and we will talk about them in some later blog entries! Stay tuned!

Lubor Kollar, Lindsey Allen