SYSK 35: Performance comparison of WITH and TABLE variable queries

As many of you know, SQL 2005 now has a common table expression (CTE) WITH.  You can think of it as a temporary (non-persistent) view defined within the scope of an executing statement (see SQL help topic ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/27cfb819-3e8d-4274-8bbe-cbbe4d9c2e23.htm for more information).

I’m sure many of you can point to a number of benefits of CTE.  The question for me was – if I can do the same with a table variable and CTE, what’s better in terms of performance? 

I’ve created a simple table with four columns – identity primary key, group id, data id, and data value.  Populated it with the data, and ran the queries (see below) that returned values of the data column when data id = 4 or 17 for a randomly selected group id.

The answer may surprise you…  My tests show that the WITH statement is about 2 times slower!   On a 1 GHz/1Mb server populated with few thousands of rows, WITH statement took approx 1.2 sec vs. 0.6 sec using the table variable.  Same query with 2.5 million rows took a whopping 49 min 59 sec using WITH statement and 23 min 33 sec using Table variable.

Query using table variables (the faster way):

DECLARE @Temp TABLE ([pk_id] [bigint] NOT NULL,
[GroupID] [bigint] NOT NULL,
[DataId] [int] NOT NULL,
[DataValue] [varchar](256) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL)

INSERT INTO @Temp
SELECT pk_id, GroupID, DataId, DataValue
FROM Table1
WHERE GroupID = @GroupID

SELECT
t1.DataValue Param4, t2.DataValue Param17
FROM @Temp t1
INNER JOIN @Temp t2 ON t1.GroupID = t2.GroupID
WHERE t1.DataId = 4 AND t2.DataId = 17

Query using WITH statement:

WITH ScenarioData AS
(
SELECT * FROM Table1 WHERE GroupID = @GroupID
)

SELECT
t1.DataValue Param4, t2.DataValue Param17
FROM ScenarioData t1
INNER JOIN ScenarioData t2 ON t1.GroupID = t2.GroupID
WHERE t1.DataId = 4 AND t2.DataId = 17