6.0 Best Programming Practices

Article
01/13/2007

In this section we will outline some programming practices for efficient plan cache usage:

6.1 Client Side Parameterization of Queries

If your application has repeated execution of the same query with only parameter values changing from query to another, then parameterizing the query in the client application code before execution gives some significant performance gains. The performance gains come from the fact that the query is compiled just once versus once for every query issued for execution. Prepared execution also provides advantage from the fact that it helps reduce network traffic by eliminating the need to send the sql query text to the server each time. In cases where the statement text is fairly large the cost of transmitting this text from client to server repeatedly can quickly add up. In the case of prepared execution however all that needs to be sent over the network are parameter values, and the prepared handle (there is a list of prepared handles in a session). The prepared handle has the hash value associated with the query text which needs to be computed only once (versus once per query execution). Also since the server does not do any parameterization, there is no chance for plan cache bloating due to caching the shell queries.

Client side parameterization if applied incorrectly can cause more grief than gain. Since the server cannot parameterize the queries, and the advantage of simple or forced parameterization is lost. We will try to illustrate some of the common mistakes made, and also recommend some best practices. Consider the example below where the client code relies entirely on server side parameterization:

command.CommandText = "select * from t1 where col1 = 1 and col2 = 'abcd'";

command.ExecuteNonQuery();

command.CommandText = "select * from t1 where col1 = 1 and col2 = 'abc'";

command.ExecuteNonQuery();

This results in one parameterized plan:

(@1 tinyint,@2 varchar(8000))SELECT * FROM [t1] WHERE [col1]=@1 AND [col2]=@2

Notice that in the absence of any client side parameterization, the server side forced bucketization applies. All queries are bucketized to varchar(8000) or nvarchar(4000). If the parameter is greater than varchar(8000) then nvarchar(max) is used. While this may seem acceptable, if the parameter data is always limited to say 50 characters (limited by the column data type and length) then it is not an optimal solution. Now consider the example below where the query has been parameterized incorrectly:

command.CommandText = "select * from t1 where col1 = @id and col2 = @str";

command.Parameters.Add("@id", 1);

command.Parameters.Add("@str", "abc");

command.ExecuteNonQuery();

command.Parameters[0].Value = 2;

command.Parameters[1].Value = "abcd";

command.ExecuteNonQuery();

This results in two parameterized queries:

(@1 tinyint,@2 varchar(3))SELECT * FROM [t1] WHERE [col1]=@1 AND [col2]=@2

(@1 tinyint,@2 varchar(4))SELECT * FROM [t1] WHERE [col1]=@1 AND [col2]=@2

Since the parameter type and lengths are not specified, different parameter values lengths can potentially generate different parameterized plans. This can cause a significant performance degradation in SQL Server 2005 RTM and SP1 since all these parameterized queries will hash into the same hash bucket. The reason they all hash into the same bucket is because they have the exact same parameterized query text but for the declarations, and the hash value is only computed in SQL Server 2005 and RTM based on the parameterized query text not including the declarations.

For legacy applications where change to client side code is not an option, use TF 144 to force server side bucketization. When the application has queries that are improperly parameterized, this trace flag forces server side bucketization. This will result in only one parameterized plan:

(@1 tinyint,@2 varchar(8000))SELECT * FROM [t1] WHERE [col1]=@1 AND [col2]=@2

Note however this a server wide option and will apply to all parameterized queries executed. In some cases this may not be suitable, and hence we advise use of this trace flag with caution.

The recommended method to parameterize queries in client application code has been illustrated below:

SqlConnection connection = new SqlConnection ("context connection = true");

connection.Open();

SqlCommand command = connection.CreateCommand();

command.CommandText = "select * from t1 where col1 = @id and col2 = @str";

command.Parameters.Add("@id", SqlDbType.Int);

command.Parameters.Add("@str", SqlDbType.VarChar, 50);

command.Parameters[0].Value = 1;

command.Parameters[1].Value = "abc";

command.ExecuteNonQuery();

command.Parameters[0].Value = 2;

command.Parameters[1].Value = "abcd";

command.ExecuteNonQuery();

This results in only one prepared compiled plan for both executions for the select query:

(@id int,@str varchar(50))select * from t1 where col1 = @id and col2 = @str

Some key points to note here include that we have parameterized the query and carefully specified the data type and the max length for the parameters (we specify the max length of the varchar parameter the same as the table column data type). Note that the query optimizer will try to ‘sniff’ parameter values and choose an optimal plan appropriately. While in most cases this works to our advantage, in some cases where the first parameter values are atypical it can actually cause a suboptimal plan to be chosen.

We can still do one better than the example above by deferring query preparation to execution time:

command.CommandText = "select * from t1 where col1 = @id and col2 = @str";

command.Parameters.Add("@id", SqlDbType.Int);

command.Parameters.Add("@str", SqlDbType.VarChar, 50);

command.Parameters[0].Value = 1;

command.Parameters[1].Value = "abc";

command.Prepare();

command.ExecuteNonQuery();

command.Parameters[0].Value = 2;

command.Parameters[1].Value = "abcd";

command.ExecuteNonQuery();

Here query preparation is deferred to execution time and we gain all the advantages of parameterization including the fact that we don’t have to send the sql query text over the network everytime. For ODBC the SQL_SOPT_SS_DEFER_PREPARE attribute (for OLEBD it is SSPROP_DEFERPREPARE) determines whether the statement is prepared immediately or deferred until execution. BOL has more information.

It is important to note that when specifying query parameters all statements in the batch should be parameterized. Consider the example where we will use sp_executesql and parameterize the queries:

declare @param_value int, @sqlstring nvarchar(500),

@param_definition nvarchar(500), @col2 int;

set @param_value = 5;

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 6';

set @param_definition = N'@param int, @col2_out int output';

exec sp_executesql @sqlstring, @param_definition, @param = @param_value, @col2_out = @col2 output;

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 7';

exec sp_executesql @sqlstring, @param_definition, @param = @param_value, @col2_out = @col2 output;

The select statement has been parameterized, while the update statement has not been parameterized making each batch unique. When these 2 batches are executed, there is a compiled plan for each batch in the cache.

(@param int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 7

(@param int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 6

With no reuse of the prepared compiled plan, every batch that comes in is compiled and a fresh plan is inserted into the cache. This will lead to performance degradation due to large number of compiles and plan cache bloating. Therefore it is important to make sure all statements in the batch are parameterized as follows:

declare @param_value int, @sqlstring nvarchar(500),

@param_definition nvarchar(500), @col2 int;

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param1; update t1 set col1 = col1 + 1 where col2 = @param2';

set @param_definition = N'@param1 int, @param2 int, @col2_out int output';

set @param_value = 5;

exec sp_executesql @sqlstring, @param_definition, @param1 = @param_value, @param2 = @param_value, @col2_out = @col2 output;

set @param_value = 6;

exec sp_executesql @sqlstring, @param_definition, @param1 = @param_value, @param2 = @param_value, @col2_out = @col2 output;

The compiled plan generated for both executions as:

(@param1 int, @param2 int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param1; update t1 set col1 = col1 + 1 where col2 = @param2

This is the recommended approach to parameterizing batches with more than one query since it benefits from plan re-use and avoids plan cache bloating problems.

A word of caution on potential over use of prepared queries: If a query is going to be executed only once, then it is probably more efficient to directly execute the query on the server. This avoids the extra network round trip to prepare the query.

6.2 Use Fully Qualified Names

It is advisable to use fully qualified names in queries. There are some significant performance gains in doing so because it avoids the need to look up the default schema for the current user. It also provides us with a plan that can be executed by multiple users (irrespective of the default schemas). Let us demonstrate with an example: table t1 is part of user u1’s default schema s1. Executing a select query as user u1, without a fully qualified object name, results in the cache key user_id to be equal to the schema id to which this table belongs.

select * from t1

Use the query below to view the cache key attributes for the compiled plan for the query above:

select st.text, cp.plan_handle, cp.cacheobjtype, cp.objtype, pa.attribute, pa.value, pa.is_cache_key

from sys.dm_exec_cached_plans cp cross apply sys.dm_exec_sql_text(plan_handle) st

outer apply sys.dm_exec_plan_attributes(cp.plan_handle) pa

where cp.cacheobjtype = 'Compiled Plan'

and st.text not like '%select st.text%'

and pa.is_cache_key = 1

order by pa.attribute;

Text

Plan_

handle

Cache

objtype

Obj

type

Attribute

Value

Is_

cache_

key

select * from t1

0x0600010

0F299431E

B8614C040

000000000

00000

Compiled Plan

Adhoc

user_id

However executing a select query with a fully qualified table name results in a plan who’s cache key user_id is -2 indicating that this plan can be shared across multiple users (irrespective of what their default schema is).

select * from s1.t1

Use the query above to view the user_id cache key attribute for the compiled plan:

Text

Plan_

handle

Cache

objtype

Obj

type

Attribute

Value

Is_

cache_

key

select * from

master.

u1.t1

0x060001

00609A91

2EB86175

04000000

00000000

0000000000

Compiled Plan

Adhoc

user_id

-2

6.3 Use RPC Events over Language Events

RPC execution gives better performance than using language events. For example consider stored procedure executions with different parameters passed to the stored procedures on each execution. If we executed adhoc queries as below:

exec master.dbo.test_proc 1, 1000;

exec master.dbo.test_proc 99, 99000;

exec master.dbo.test_proc 100, 100000;

Then depending on the number of stored procedure executions we may cache the batch. Now if subsequent executions of the stored procedure have different parameter values, then this means we will land up caching all of these unique batches that have small probability of re-execution. This can cause the cache to bloat quickly and come under memory pressure. This situation can be entirely avoided by replacing these adhoc queries with RPC executions of the stored procedures as below:

{call master.dbo.test_proc (1, 1000)}

{call master.dbo.test_proc (99, 99000)}

{call master.dbo.test_proc (100, 100000)}

Now we have eliminated the problem of additional caching of the adhoc batches and have also got performance gains from using RPC.

6.4 Choose Suitable Object Names

It is also important to name tables, functions and procedures suitably. Procedures prefixed with “sp_”, functions names starting with “Fn_”, tables prefixed with “sys” might result in resource database look ups in the absence of fully qualified names in queries using these objects.

6.5 Exact Match of High Re-use Adhoc Query Text

For adhoc queries, the query text needs to be identical. Small differences like extra white space characters or case difference of the query text will result in different plans, and no re-use of cached plans. Comments are also considered part of the query text. Therefore the queries below result in 2 different cached plans:

---this is an example of an adhoc query

if exists(select col1 from t1 where col2 > 5) declare @x int;

---this is an example of an adhoc query with different comment

if exists(select col1 from t1 where col2 > 5) declare @x int;

Use the query below to view the sql_handle and plan_handles for both plans and notice that the sql_handles are different:

select text, sql_handle, plan_handle

from sys.dm_exec_query_stats qs cross apply sys.dm_exec_sql_text(sql_handle)

where text not like '%select text%'

Text

Sql_handle

Plan_handle

---this is an example of an adhoc query with different comment if exists(select col1 from t1 where col2 > 5) declare @x int;

0x0200000

0EC59E433

8B492CC1D

F65C99E5D

352A4E641

06FA0

0x060001

00EC59E43

3B8417204

000000000

000000

---this is an example of an adhoc query if exists(select col1 from t1 where col2 > 5) declare @x int;

0x0200000

04C1C833A

B811F38FB

C18913249

C73C4DFD2

A74DC

0x0600010

04C1C833A

B80167040

000000000

00000

As demonstrated previously different drivers have different default settings for set options. Difference in set options will not result in plan re-use for identical queries. For example, OSQL sets quoted_identifier on while SQL Server Management Studio sets quoted_identifier off. Identical queries issued from these different clients will result in two different plans.

6.0 Best Programming Practices

Additional resources