I Smell a Parameter!

Parameters are a useful way to improve overall system performance when many common queries share the same shape. Instead of compiling each query separately, one plan can be used for all similar queries. This can significantly reduce CPU overhead and improve throughput. As long as the queries really would have produced the same plan, this is a big performance win. SQL Server even tries to automatically turn simple non-parameterized user queries into parameterized queries internally to take advantage of this performance gain.
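
One quick way to observe auto-parameterization is to peek at the plan cache. Here's a sketch using SQL 2005's plan cache DMVs (it assumes the demo table t created in the script later in this post, and requires VIEW SERVER STATE permission):

-- run a simple query with a literal predicate
select * from t where col1 = 7

-- for queries simple enough to auto-parameterize, the cached entry typically
-- shows objtype = 'Prepared', with the literal replaced by a parameter marker
-- such as (@1 tinyint)
select cp.objtype, st.text
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%col1%'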

Parameter use, especially in more complex scenarios, can also cause performance issues. If the queries are complex and/or the data distribution of the columns against which the parameter is compared varies, the costs of different plan choices can change; a plan that is optimal for one parameter value may perform poorly for another value. The query optimizer still needs to estimate the selectivity and cardinality of predicates that use a parameter value. This section describes how this process works in more detail.

The default behavior is to use the average number of duplicates per value (the average frequency) to guess how many rows the parameter will match. For an equality predicate, that’s exactly what happens in the base case. However, the optimizer tries to find any specific value that is available at the time the query is compiled and will use that value to estimate cardinality instead. This process of using the parameter value to estimate selectivity and cardinality is called “parameter sniffing”.
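
To make the math concrete: with nothing to sniff, the estimate is simply the table's cardinality multiplied by the column's density (1 / number of distinct values). For the 2000-row, 1001-distinct-value table built in the script below, that works out to:

    estimated rows = cardinality * density = 2000 * (1/1001) ≈ 2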

There are some non-obvious implementation details that can get in the way of giving the optimizer the chance to use that value. One common issue is that SQL Server’s procedural language, T-SQL, is interpreted. Additionally, the compilation model for batches of commands will compile all commands and then execute them. If you set a parameter value and then run a query that uses that parameter within the same batch, the value isn’t available to the optimizer (and therefore it can’t be sniffed).

Here are some SQL examples to demonstrate the various behaviors in SQL 2005:

use tempdb
go

-- create a table with 2000 rows. 1000 of them have the values 1 to 1000, each appearing
-- exactly once (no duplicates). Then we have 1000 rows with the value 5000.
if object_id('dbo.t') is not null
    drop table t
create table t(col1 int)

declare @i int
set @i = 1
while @i <= 1000
begin
    insert into t(col1) values (@i)
    set @i = @i + 1
end
set @i = 0
while @i < 1000
begin
    insert into t(col1) values (5000)
    set @i = @i + 1
end

-- Let's create some fullscan statistics on the column in our table
create statistics t_col1 on t(col1) with fullscan

-- note that the density ("All density") in the statistics is 1/1001 = 9.9900097E-4;
-- there are 1001 distinct values in the table
dbcc show_statistics ('dbo.t', 't_col1')

-- compile with no value to sniff. We should use the "density" for the whole table to make
-- our estimate, which means we'll take 2000 rows * 1/1001 = about 2 rows returned
dbcc freeproccache
declare @p int
select * from t where col1 = @p
-- (look at the output plan to see the estimate)

-- same scenario, but set a value. The value 5000 has 1000 instances, yet we still estimate
-- 2 rows. Why? Well, we compile the batch before we execute it, so the optimizer in 2005
-- does not see the parameter value; we treat this the same as the previous case because
-- the value hasn’t actually been set when the query is compiled
dbcc freeproccache
declare @p int
set @p = 5000
select * from t where col1 = @p

-- Let's use option (recompile) as a workaround.
-- The first optimization has the same problem as before - estimates 2 rows
dbcc freeproccache
declare @p int
set @p = 1
select * from t where col1 = @p
option (recompile)

-- now look at the compiled plan for this case - we've recompiled and correctly estimate 1 row
select * from t where col1 = @p
option (recompile)

-- Another (better) workaround is to use the new optimize for hint - it avoids the recompile
-- and we estimate 1 row
dbcc freeproccache
declare @p int
set @p = 1
select * from t where col1 = @p
option (optimize for (@p = 1))

-- last workaround - use a stored procedure. This creates a new context in the
-- server and lets the optimizer "see" the parameter and sniff it during compilation.
go
create procedure foo (@p int)
as
select * from t where col1 = @p
go

-- compile and examine the plan for this - estimates 1 row instead of 2
dbcc freeproccache
execute foo @p = 1

-- how does one see if the plan used a sniffed parameter? If sniffed, the information
-- is visible in showplan_xml (note that SET SHOWPLAN_XML must be the only statement
-- in its batch)
set showplan_xml on
go
execute foo @p = 1
go
set showplan_xml off
go

-- look for a node like this:
<ParameterList>
  <ColumnReference Column="@p" ParameterCompiledValue="(1)" />
</ParameterList>
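
If you'd rather pull this from the plan cache than re-run the statement under showplan, the same attribute is present in the cached plan XML. A sketch, again assuming the SQL 2005 DMVs (the LIKE filter is just a crude way to find the procedure's entry):

select st.text, qp.query_plan
from sys.dm_exec_cached_plans cp
cross apply sys.dm_exec_query_plan(cp.plan_handle) qp
cross apply sys.dm_exec_sql_text(cp.plan_handle) st
where st.text like '%foo%'
-- open query_plan and look for ParameterCompiledValue in the ParameterList node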

The examples here demonstrate how the cardinality estimates change based on the patterns you use to invoke your queries. In these examples, we expect an estimate of 1 row whenever we sniff a parameter with a value between 1 and 1000, since there is exactly one row for each of those values. If we sniff nothing, we’d expect to use the average frequency (about 2 rows in this example). If you pick the outlier (in this case, the 1000 rows with the value 5000), then you should see a much higher estimate.
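
The script above never actually compiles with the outlier visible, so here is one more variation to try; when the optimizer sees 5000 (here via the hint), the estimate should jump to 1000 rows:

dbcc freeproccache
declare @p int
set @p = 5000
select * from t where col1 = @p
option (optimize for (@p = 5000))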

For the examples I’ve given here, I purposely picked a very simple query plan so that you could easily see the differences in the cardinality estimates without my having to explain any other oddities in the query plan. While there is no plan choice impact in these examples, cardinality estimation errors such as the ones introduced here do cause plan changes in more complex queries.
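
To see an actual plan choice flip rather than just an estimate change, you can add a nonclustered index to the demo table. A sketch (the index name is mine; with a table this tiny the crossover point may not reproduce exactly, but the principle holds on larger data):

create index idx_t_col1 on t(col1)
declare @p int

-- a selective value: the 1-row estimate should favor an index seek plus a lookup
set @p = 1
select * from t where col1 = @p
option (optimize for (@p = 1))

-- the outlier: a 1000-row estimate (half the table) should favor scanning instead
set @p = 5000
select * from t where col1 = @p
option (optimize for (@p = 5000))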

The main points that you need to remember are:

  1. If you don’t see any problems due to parameters in your application, you don’t need to do anything. Keep this information in the back of your head, just in case.
  2. If you find that the optimizer is picking different plans over time with varying performance characteristics, consider using a parameter hint with a representative “average” value to get a good, common query plan that works reasonably well for all values (see the sketch after this list).
  3. If you are running really complex queries and need the plan choice to be exact in each case, you can consider using the “recompile” hint - just know that the query will compile each time it runs, so this is more appropriate for longer-running queries, where the savings justify the compilation cost.
  4. Moving a query into a stored procedure puts it into a separate procedural context and can be a good way to make the parameter value visible to the optimizer (note: this works in SQL 2000 as well).
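
Points 2 and 4 combine naturally: if a procedure’s plan swings with whichever parameter value happens to arrive first, you can pin the estimate to a representative value inside the procedure. A minimal sketch (the procedure name foo_stable is mine, and 1 stands in for whatever “average” value is representative of your workload):

go
create procedure foo_stable (@p int)
as
select * from t where col1 = @p
option (optimize for (@p = 1))
go

-- every execution now compiles (and reuses) the plan as if @p were 1,
-- whatever value is actually passed:
execute foo_stable @p = 5000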