2.0 Diagnosing Plan Cache Related Performance Problems and Suggested Solutions

 

Plan cache related performance problems usually manifest in decrease in throughput (or increase in query response time), and some times out of memory errors, connection time outs. None of these symptoms however point to plan cache related performance problems specifically. In order to determine and further isolate the cause of the slowdown the following steps are recommended:

 

2.1 Periodically query sys.dm_os_wait_stats and examine the waits

 

  1. CMEMTHREAD waits: This wait type indicates that the rate of insertion of entries into the plan cache is very high and there is contention on the memory object from which the cache descriptors are allocated from. Contention is also observed when entries are being removed from the plan cache resulting in the resource monitor thread being blocked. There is a single memory object per cachestore from which the cache descriptors for the cachestore are allocated. Since there is a single global cachestore for the ad-hoc compiled plans, there is only one thread safe memory object creating the cache descriptors for each cacheable compiled plan.

 

For example: Consider the following query batch:

 

begin tran;

exec sp_some_procedure 'param1';

exec sp_some_procedure 'param2';

exec sp_some_procedure 'param3';

exec sp_some_procedure 'param4';

exec sp_some_procedure 'param5';

exec sp_some_procedure 'param6';

exec sp_some_procedure 'param7';

exec sp_some_procedure 'param8';

exec sp_some_procedure 'param9';

exec sp_some_procedure 'param10';

commit tran;

go

 

Each time the batch is executed with different parameter values passed to the stored procedure making the batch unique. This batch is cached in SQL Server 2005 and not in SQL Server 2000 due to different costing formulas and heuristics. Hence on upgrade from SQL Server 2000 this manifests as a significant drop in throughput once memory pressure is reached in SQL Server 2005 SP1 and sometimes OOM (out of memory) errors as well if resource monitor is not able to keep up with cache insertions. Query sys.dm_os_wait_stats looking for wait of type CMEMTHREAD:

 

select * from sys.dm_os_wait_stats

order by waiting_tasks_count desc

go

 

On a 32 processor, 8 Node NUMA Machine when this batch is executed over 64 connections there are a large number of CMEMTHREAD waits seen in sys.dm_os_wait_stats. SQL Server 2005 SP2 has a 2 part fix for the query batch described above: a) Zero cost queries that have set statements and transaction statements are no longer cached, unless the batch is entirely composed of set and/or transaction statements. This should dramatically reduce the rate of insertion of plans into the cache. b) Partition the memory object from which the cache descriptors are allocated per CPU. This will reduce the contention around the memory object, and therefore CMEMTHREAD waits.

 

Another approach to reducing contention would be examine the application code and use batched RPC to execute stored procedures.

 

  1. SOS_RESERVEDMEMBLOCKLIST waits:

 

select * from sys.dm_os_wait_stats

order by wait_type

go

 

A large number of waits of type SOS_RESERVEDMEMBLOCKLIST indicates that the query probably has a procedure with a large number of parameters, or the query has a long list of expression values specified in an IN clause, which would require multi-page allocations. When such queries are executed concurrently over multiple connections on 64 bit systems, there is a severe degradation in throughput especially when the total multi-page allocations exceed a couple of GB.

 

Consider the example below where the select query has an IN clause with a large number of values specified in the expression list:

 

select col3 from t1 where col1 = 5 and col2 in (

'1', '2', '3', '4', '5', '6', '7', '8', '9', '10',

.

.

'500')

go

 

When this query is executed with different values of literals in the IN clause over multiple connections, the multi-page allocations are made. Another example of when multi-page allocations are made is execution of a stored procedure that takes a large number of parameters over multiple connections, each time with different parameter values:

 

exec p1 '1', '2', '3', '4', ...'500'

go

 

Sys.dm_os_memory_cache_counters has the multipages allocated in KB (multi_pages_kb):

 

select name, type, single_pages_kb, multi_pages_kb,

single_pages_in_use_kb, multi_pages_in_use_kb

from sys.dm_os_memory_cache_counters

where type = 'CACHESTORE_SQLCP' or type = 'CACHESTORE_OBJCP'

go

 

It is also worth noting that periodically executing 'dbcc freeproccache' alleviates the problem temporarily. SQL Hotfix 543 contains the multi-page allocator fix to reduce this contention. SQL Server 2005 SP2 includes this Hotfix and has an additional fix to relieve contention on this wait type. Note that the fix in SP2 is not guaranteed to fix the contention on this wait type, but should reduce the contention. It is also worth examining the client application for alternatives to passing long parameter list to functions or stored procedures or rewriting queries with a long IN clause.

 

  1. RESOURCE_SEMAPHORE_QUERY_COMPILE waits:

 

select * from sys.dm_os_wait_stats

order by wait_type

go

 

A large number of waits of type RESOURCE_SEMAPHORE_QUERY_COMPILE indicates a large number of concurrent compiles. In SQL Server 2005 we limit the number of compiles than can start at any given time to prevent the following scenario: a large number of compilations are initiated and memory partially allocated. A large percentage of queries land up being suspended (waiting) for resources to complete compilation, eventually timing out and releasing memory allocated to them. This is an inefficient use of server resources, and hence in SQL Server 2005 there is a limit on the number of concurrent compiles that use a lot of memory at any given time. In such scenarios examine the entries in the plan cache to see if queries are getting compiled as follows:

 

select usecounts, cacheobjtype, objtype, bucketid, text

from sys.dm_exec_cached_plans cp cross apply

sys.dm_exec_sql_text(cp.plan_handle)

where cacheobjtype = 'Compiled Plan'

order by objtype

go

 

If there are no rows returned from the above query with objtype 'Prepared' then queries are not getting parameterized. First thing to try here is to alter the database parameterization setting:

 

alter database <database_name> set parameterization forced

go

 

However the recommended approach is to use plan guides for queries and specify parameterization option as forced instead of changing the database parameterization option. For example:

 

When the database parameterization option is set to simple the following query does not get parameterized:

 

select t1.col1 from t1 join t2 on t1.col1 = t2.col2 join t3 on t2.col1 = t3.col1 where t1.col1 = 200

 

When executed concurrently with different parameter values on a multi-processor machine results in large number of waits of type RESOURCE_SEMAPHORE_QUERY_COMPILE. When the database parameterization option is set to forced, the query gets parameterized as:

 

(@0 int)select t1.col1 from t1 join t2 on t1.col1 = t2.col2 join t3 on t2.col1 = t3.col1 where t1.col1 = @0

 

A more suitable alternative to setting the database parameterization option to forced would be to use plan guides as below:

 

declare @stmt nvarchar(max), @params nvarchar(max);

 

exec sp_get_query_template

N'select t1.col1 from t1 join t2 on t1.col1 = t2.col2 join t3 on t2.col1 = t3.col1 where t1.col1 = 200',

@stmt output, @params output;

 

exec sp_create_plan_guide N'TemplateGuide',

@stmt, N'template', NULL,

@params, N'option(parameterization forced)';

 

For more information see 'Designing Plan Guides for Parameterized Queries'.

 

It is important to note that when specifying query parameters all statements in the batch should be parameterized. Consider the example below:

 

declare @intvalue int, @sqlstring nvarchar(500),

@parmdefinition nvarchar(500), @col2 int;

set @intvalue = 5;

 

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 6';

set @parmdefinition = N'@param int, @col2_out int output';

exec sp_executesql @sqlstring, @parmdefinition, @param = @intvalue, @col2_out = @col2 output;

 

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 7';

exec sp_executesql @sqlstring, @parmdefinition, @param = @intvalue, @col2_out = @col2 output;

go

 

The select statement has been parameterized, while the update statement has not been parameterized making each batch unique. When these 2 batches are executed, there is a compiled plan for each batch in the cache.

 

(@param int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 7

(@param int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param; update t1 set col1 = col1 + 1 where col2 = 6

 

With no reuse of the prepared compiled plan, every batch that comes in is compiled and a fresh plan is inserted into the cache. This will lead to performance degradation due to large number of compiles and plan cache bloating. Therefore it is important to make sure all statements in the batch are parameterized as follows:

 

declare @intvalue int, @sqlstring nvarchar(500),

@parmdefinition nvarchar(500), @col2 int;

 

set @sqlstring = N'select @col2_out = col2 from t1 where col1 = @param1; update t1 set col1 = col1 + 1 where col2 = @param2';

set @parmdefinition = N'@param1 int, @param2 int, @col2_out int output';

 

set @intvalue = 5;

exec sp_executesql @sqlstring, @parmdefinition, @param1 = @intvalue, @param2 = @intvalue, @col2_out = @col2 output;

 

set @intvalue = 6;

exec sp_executesql @sqlstring, @parmdefinition, @param1 = @intvalue, @param2 = @intvalue, @col2_out = @col2 output;

go

 

The compiled plan generated for both batches is:

 

(@param1 int, @param2 int, @col2_out int output)select @col2_out = col2 from t1 where col1 = @param1; update t1 set col1 = col1 + 1 where col2 = @param2

 

2.2 Periodically look at the spins and collisions data

 

create table t_spinlock_stats

(spinlockname varchar(64), collisions int,

spins int, [spins/collisions] int)

 

insert into t_spinlock_stats

exec ('dbcc sqlperf(spinlockstats)');

 

select * from t_spinlock_stats order by collisions desc

go

 

1. If the numbers of collisions (and spins) for SQL_MGR is relatively high, next examine the parameterized queries:

 

select usecounts, cacheobjtype, objtype, bucketid, text

from sys.dm_exec_cached_plans cp cross apply sys.dm_exec_sql_text(cp.plan_handle)

where cacheobjtype = 'Compiled Plan' and objtype = 'Prepared'

order by bucketid

go

 

If there are a large number of parameterized queries with the exact same sql text but different parameter lists, and identical bucketid’s then this might explain the large number of collisions for SQL_MGR.

 

For example:

 

select * from foo where col2 = 2.23

translates to: (@1 numeric(3,2))SELECT * FROM [foo] WHERE [col2]=@1

 

select * from foo where col2 = 2.2 

translates to:(@1 numeric(2,1))SELECT * FROM [foo] WHERE [col2]=@1

 

When a statement is parameterized by the server (either simple or universal parameterization) it makes type decisions based on the parsed type. If the parameter passed is numeric, then the type is determined based on precision and scale. So a value like 2.2 will have a type of numeric(2, 1) and 2.23 will have a type of numeric(3, 2). For varchar data type, server side parameterization will not exhibit this behavior. For example:

 

select * from table1 where col1 = 'abc'

 

select * from table1 where col1 = 'abcd'

both translate to:

(@0 varchar(8000))select * from table1 where col1 = @0

 

Failing to set the max length of the parameter when using client side parameterization, can however generate one parameterized plan for each parameter value with a different length. For example:

 

using SqlConnection myConnection = new SqlConnection ("context connection = true");

SqlCommand myCommand = new SqlCommand("SELECT * FROM t1 WHERE col1 = @ID", myConnection );

 

myCommand.Parameters.Add( "@ID", SqlDbType.VarChar);

myCommand.Parameters["@ID"].Direction = ParameterDirection.Input;

myCommand.Parameters["@ID"].Value = "ABC";

myConnection.Open();

myCommand.ExecuteNonQuery();

 

myCommand.Parameters["@ID"].Value = "ABCD";

myCommand.ExecuteNonQuery();

myConnection.Close();

 

This translates to two parameterized queries i.e. unique parameterized query if the length of the parameter is different.

 

(@ID varchar(3))SELECT * FROM t1 WHERE col1 = @ID

(@ID varchar(4))SELECT * FROM t1 WHERE col1 = @ID

 

If you pass different literal values with different number of decimal points they will all have the same parameterized query text but the types in the parameter list is different. The Sql 2005 caching algorithm uses only the parameterized query text to compute the hash value for the entry, and so it is possible to have a very large number of entries with the same query text but different parameter lists. This leads to long lookup times for the cache entries and high CPU as it walks the hash chain and does comparisons on many similar entries.

 

SQL Hotfix 716 (fix also in SQL Server 2005 SP2) fixes this issue by computing the hash value based on the entire query text including the parameter list. Minor modifications to the application code can mitigate the problem when the parameter type is varchar. It is recommended that the max length is always set suitably when using ADO.NET as illustrated below in the code snippet:

 

using SqlConnection myConnection = new SqlConnection ("context connection = true");

SqlCommand myCommand = new SqlCommand("SELECT * FROM t1 WHERE col1 = @ID", myConnection );

 

      myCommand.Parameters.Add( "@ID", SqlDbType.VarChar, 20);

      myCommand.Parameters["@ID"].Direction = ParameterDirection.Input;

      myCommand.Parameters["@ID"].Value = "ABC";

      myConnection.Open();

      myCommand.ExecuteNonQuery();

     

      myCommand.Parameters["@ID"].Value = "ABCD";

      myCommand.ExecuteNonQuery();

      myConnection.Close();

 

This translates into one parameterized query irrespective of the length of the actual parameter passed in:

 

      (@ID varchar(20))SELECT * FROM t1 WHERE col1 = @ID

 

2. If the numbers of collisions (and spins) for SPL_MUTEX is high, then examine your application code to see if queries are being executed under a security context other than the sysadmin.

 

Examine the size (single_pages_kb) of the TokenAndPermUserStore cache either through dbcc memorystatus or through the sys.dm_os_memory_clerks DMV. The number of single_pages_kb should be very high.

 

select type, name, single_pages_kb,

multi_pages_kb, memory_clerk_address

from sys.dm_os_memory_clerks where type =

'USERSTORE_TOKENPERM'

go

 

Next examine the buckets_count in sys.dm_os_memory_cache_hash_tables and look for long hash bucket chains:

 

select type, name, cache_address, buckets_count, buckets_in_use_count,

from sys.dm_os_memory_cache_hash_tables

where type = 'USERSTORE_TOKENPERM'

go

 

Symptoms above describe a situation where performance degrades because access to many entries in the TokenAndPermUserStore cache is going through a single spinlock. This is because entries end up in the same hash bucket. This leads to the response time of queries degrading significantly and has been observed on 64 bit systems.

 

If yes, next try to free the token and permissions cache (TokenAndPermUserStore) and see if that temporarily alleviates the problem:

 

dbcc freesystemcache ('TokenAndPermUserStore')

go

 

Changes to the application code that avoid this problem include executing all queries under the context of sysadmin to bypass security checks. A server side fix for this issue will be shipped as part of SP2.

 

2.3 Monitor Plan Cache size and Database pages size

 

Collect 'SQL Server: Plan Cache\Cache Pages(_Total)' and 'SQLServer: BufferManager\Database pages' performance counters. Typically the database pages increases along with the plan cache pages especially for workloads with adhoc queries. However due to the difference between SQL Server 2000 and SQL Server 2005 plan cache size limits, it is possible for the plan cache pages to grow up to ~80% of buffer pool without being under memory pressure leaving no room for database pages. This results in the database pages being evicted and subsequent performance degradation. The max limit of SQL Server 2005 Plan Cache size in SQL Server 2005 RTM and SP1 is bigger than in SQL Server 2000. To bring the SQL Server 2005 plan cache size limit to be similar to that of SQL Server 2000, in SP2 the limit for signaling internal memory pressure on a cachestore will be changed as described in section 4.