Find Top N costly query plans in adhoc batches or modules…


I encountered a problem about a query not using a particular indexed view in SQL Server 2005. To investigate this issue, I figured that I would go about writing a query using the execution related dynamic dynamic management views to get the cost of the query that should have used the indexed view. This proved to be an interesting adventure that also revealed some inconsistencies in how we expose certain metadata in the DMVs. On the bright side, the final query showcases the powerful new relational, TSQL and XQuery features of SQL Server 2005.

 

Let’s start with brief outline of the DMVs that I planned to use:


  • sys.dm_exec_query_stats – Returns performance statistics for cached query plans. This contains one row per query plan so if a stored procedure or batch contains two SELECT statements you may get two rows here
  • sys.dm_exec_sql_text – Returns the text of the sql statement based on the SQL handle
  • sys.dm_exec_query_plan – Returns the showplan in XML format for a batch or module based on the plan handle

A simple query to get the elapsed time counters and SQL text of the batch/module is shown below. This query uses the new APPLY operator to join with a table-valued function passing the sql_handle value for each row.

 

select qs.sql_handle, qs.execution_count

     , qs.total_elapsed_time, qs.last_elapsed_time

     , qs.min_elapsed_time, qs.max_elapsed_time

     , qs.total_clr_time, qs.last_clr_time

     , qs.min_clr_time, qs.max_clr_time
  from sys.dm_exec_query_stats as qs
 cross apply sys.dm_exec_sql_text(qs.sql_handle) as st

 

This query can be modified to return the actual statement for which the stats are returned using the offset columns in sys.dm_exec_query_stats. Please note that the statement_start_offset and statement_end_offset columns are zero based, represents number of bytes and the end offset is -1 for last statement in the batch. In addition the text value returned by sys.dm_exec_sql_text is nvarchar(max) instead of text as documented incorrectly in Books Online. The modified query that accounts for these quirks is shown below:

 

select qs.sql_handle, qs.execution_count
     , qs.total_elapsed_time, qs.last_elapsed_time

     , qs.min_elapsed_time, qs.max_elapsed_time

     , qs.total_clr_time, qs.last_clr_time

     , qs.min_clr_time, qs.max_clr_time
     , substring(st.text, (qs.statement_start_offset/2)+1

                        , ((case qs.statement_end_offset
                              when -1 then datalength(st.text)
                              else qs.statement_end_offset
                           end – qs.statement_start_offset)/2) + 1) as statement_text
  from sys.dm_exec_query_stats as qs
 cross apply sys.dm_exec_sql_text(qs.sql_handle) as st

 

Next, a similar query that gets the elapsed time counters and the XML showplan output for the batch/module.

 

select qs.sql_handle, qs.execution_count


     , qs.total_elapsed_time, qs.last_elapsed_time

     , qs.min_elapsed_time, qs.max_elapsed_time

     , qs.total_clr_time, qs.last_clr_time

     , qs.min_clr_time, qs.max_clr_time
     , qp.query_plan
  from sys.dm_exec_query_stats as qs
 cross apply sys.dm_exec_query_plan (qs.plan_handle) as qp

 

Note that as I mentioned above, sys.dm_exec_query_stats returns a row for each query plan in a batch/module. The showplan output on the other hand contains the entire plan in XML format with each statement identified by the StatementID attribute which is simply a counter starting from the first statement in a batch. For example, if a stored procedure contains two queries then there will be two entries in the XML plan – one with StatementID = 1 and another with StatementID = 2. If there are additional variable assignments statements or DDLs or other TSQL statements they are also numbered appropriately in the order of appearance.

 

Consider the example stored procedure below:

 

create procedure TestProc
as
begin
 declare @spid int
 set @spid = @@spid
 select count(*) from master.sys.sysprocesses where spid = @spid
 select count(*) from master.sys.syslockinfo where req_spid = @spid
end

 

The showplan output will contain 3 StmtSimple nodes (fragments shown below with StatementText, StatementType and StatementId attributes only):

 


  1. StatementText=”create procedure testproc
as
begin
    declare @spid int
    set @spid = @@spid
   “
    StatementId=”1″
    StatementType=”ASSIGN”
  2. StatementText=” select count(*) from master.sys.sysprocesses where spid = @spid
   “
    StatementId=”2″
    StatementType=”SELECT”
  3. StatementText=” select count(*) from master.sys.syslockinfo where req_spid = @spid
”
    StatementId=”3″
    StatementType=”SELECT”

 

The sys.dm_exec_query_stats will contain two rows one for each query only. So when we join sys.dm_exec_query_stats with sys.dm_exec_query_plan you will get the same showplan for both rows and we need to aggregate information from the plan or identify the appropriate StmtSimple nodes to get the cost of each query. So I generate first artificial counters by sorting and counting the statment_start_offset value from sys.dm_exec_query_stats. The same is done for the StmtSimple nodes in the query plan by selecting those of type “SELECT” and sorting them by StatementId value. Now this relative sequence numbers can be used to join and get the costing attributes for each query.

 

The final query that contains the logic to match the query stats with the plan nodes is shown below:

 

declare @top_n_plans int;
set @top_n_plans = 25;
with XMLNAMESPACES (‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’ as sql)
select top(@top_n_plans)
       qst.text as sql_text, qp.query_plan
     , qst.statement_id
     , qst.statement_text as select_statement
     , qps2.statement_optimization_level
     , qps2.statement_optimization_early_abort_reason
     , (
        select sum(ro.SubTreeCost.value(
N’@EstimatedTotalSubtreeCost’, ‘float’))
          from qp.query_plan.nodes(

               N’//sql:Batch/sql:Statements/sql:StmtSimple/sql:QueryPlan[1]/sql:RelOp[1]’

             ) as ro(SubTreeCost)
       ) as Totalcost
     , qps2.statement_sub_tree_cost
     , qst.creation_time, qst.last_execution_time, qst.execution_count
     , qst.total_elapsed_time, qst.last_elapsed_time, qst.min_elapsed_time, qst.max_elapsed_time
     , qst.total_worker_time, qst.last_worker_time, qst.min_worker_time, qst.max_worker_time
     , qst.total_physical_reads, qst.last_physical_reads

     , qst.min_physical_reads, qst.max_physical_reads
     , qst.total_logical_writes, qst.last_logical_writes

     , qst.min_logical_writes, qst.max_logical_writes
     , qst.total_logical_reads, qst.last_logical_reads

     , qst.min_logical_reads, qst.max_logical_reads

     , qst.total_clr_time, qst.last_clr_time, qst.min_clr_time, qst.max_clr_time
     , qst.sql_handle, qst.plan_handle
  from (
        select *
             , substring(st.text, (qs.statement_start_offset/2)+1

                                , ((case qs.statement_end_offset
                                      when -1 then datalength(st.text)
                                      else qs.statement_end_offset
                                   end – qs.statement_start_offset)/2) + 1) as statement_text
             , ROW_NUMBER() OVER(PARTITION BY qs.plan_handle

                                     ORDER BY qs.statement_start_offset) as statement_id
          from sys.dm_exec_query_stats as qs
         cross apply sys.dm_exec_sql_text(qs.sql_handle) as st
     ) as qst
 cross apply sys.dm_exec_query_plan (qst.plan_handle) as qp
 cross apply (
        — Since sys.dm_exec_query_stats doesn’t have statement id,

        — we just sort the actual statement id from showplan for

        — SELECT statements and join them with similar sequence number generated based
        — on the statement start offset in sys.dm_exec_query_stats.
        — This allows us to match the row from showplan with that of the query stats.
        — This is a problem for batches containing multiple SELECT statements

        — and hence this solution.
        select ROW_NUMBER() OVER(ORDER BY qps1.statement_id) as rel_statement_id
             , qps1.statement_optimization_level, qps1.statement_sub_tree_cost

             , qps1.statement_optimization_early_abort_reason
          from (
                select sel.StmtSimple.value(
‘@StatementId’, ‘int’)
                     , sel.StmtSimple.value(
‘@StatementSubTreeCost’, ‘float’)
                     , sel.StmtSimple.value(
‘@StatementOptmLevel’ , ‘varchar(30)’)
                     , sel.StmtSimple.value(
‘@StatementOptmEarlyAbortReason’, ‘varchar(30)’)
                  from qp.query_plan.nodes(

                       N’//sql:Batch/sql:Statements/sql:StmtSimple[@StatementType = “SELECT”]’

                     ) as sel(StmtSimple)
             ) as qps1(statement_id, statement_sub_tree_cost

                     , statement_optimization_level, statement_optimization_early_abort_reason)
     ) as qps2
 where qps2.rel_statement_id = qst.statement_id
 —  and qst.text like … /* can be used to filter only particular statemetns */
 order by Totalcost desc, qst.plan_handle, qst.statement_id;

 


Lastly, let me highlight the new relational, TSQL and Xquery features of SQL Server 2005 as used in the query.

 


  • TOP clause – Support for variables or expressions in the TOP clause
  • APPLY operator – Used to join table against table-valued function in the query
  • ROW_NUMBER analytic function – To generate sequence numbers based on a particular order and optionally partition the rows into different groups
  • Xquery nodes method – To get the matching nodes based on a XPath expression for each instance of a query plan
  • Xquery value method – To get a particular Xml attribute as a typed column
  • WITH XMLNAMESPACES – To use namespace prefix in XPath expressions easily in a query

 

So run the query and enjoy the output / richness of metadata that SQL Server 2005 exposes. The Books Online topics and the MSDN links of the various features used in the query is below for convenience sake:

 



 
Umachandar Jayachandran

Comments (12)

  1. Cristian Lefter, SQL Server MVP says:

    That’s another great example. Thank you Umachandar! I love the new Summary Reports and of course the sources for it (the default trace and DMVs and DMFs). I use Profiler to see the use of them 🙂

  2. Veeral says:

    Hi,

    I have two stored procedures one written with SQL and other in CLR language.I m able to find CPU time takenby stored proc written in SQL with dm_exec_query_stat.But I m unable to find CPU time for CLR written stored procedure with dm_exec_query_stat.Can u plz tell me how can this be done.

    Veeral

    veeralmpatel@rediffmail.com

  3. sqletips says:

    Sorry, I didn’t include the CLR time counters in my query before. That is why you were seeing zero for the elapsed time. See the topic on sys.dm_exec_query_stats DMV. I modified the query to include the CLR counters too.

  4. Great blog says:

    Your site is realy very interesting.

  5. Jerry says:

    This stuff is all well and good, but you sure do have to be careful.  I attempted to run the final query above on our development box – it still had not completed after 3 minutes. It pegged CPU at 100% for the duration and had taken 16 million page reads by the time I cancelled it.  Can’t even imagine what it would do to our production box that has 3k trans/sec.

    Note – I am not trying to be overly critical, I *love* 2005 and the dmv’s, and judicious use of the query stats has helped us tremendously.  But often things are quite different out here in the wild; I just wanted to caution anyone from running the code on a production box w/o testing it first.

    Jerry

    jfos@plex.com

  6. sqletips says:

    I just wanted to update that we found and fixed a performance problem with the sys.dm_exec_query_plan DMV. This DMV can take a long time to generate plans for certain statements and this will reduce the overall performance of the query. This bug fix is in SQL Server 2005 SP1 so I would encourage you to try & use the DMV to see if you get noticeable improvements.



    Umachandar

  7. sangeethashekar says:

    Please check out http://blogs.msdn.com/sqlprogrammability/ for  a series of articles on the Plan Cache and improvements made to the plan cache in SQL Server 2005 RTM. The articles also have a number of examples and queries on plan cache related DMVs that you may find useful.

  8. ALZDBA says:

    I’m not into this xquery stuff, but can someone please give me a query to select only the queryplans that have implicit conversions ?

    <ScalarOperator ScalarString="CONVERT_IMPLICIT (whateverconversionwasneeded)">

  9. sqletips says:

    Hi,

    You can get the SELECT statements that have an expression with CONVERT_IMPLICIT in many ways. Below is a query that uses the sys.dm_exec_query_plan and XQuery to get the information:

    with XMLNAMESPACES (‘http://schemas.microsoft.com/sqlserver/2004/07/showplan‘ as sql)

    select qs.sql_handle, qs.plan_handle, qs.total_elapsed_time, qs.last_elapsed_time, qs.min_elapsed_time, qs.max_elapsed_time

    , db_name(qp.dbid) as database_name

    , quotename(object_schema_name(qp.objectid, qp.dbid)) + N’.’ + quotename(object_name(qp.objectid, qp.dbid)) as obj_name

    , qp.query_plan.value( N'(/sql:ShowPlanXML/sql:BatchSequence/sql:Batch/sql:Statements/sql:StmtSimple[@StatementType = "SELECT"])[1]/@StatementText’, ‘nvarchar(max)’ ) as plan_stmt_text

    , qp.query_plan.value( N'(/sql:ShowPlanXML/sql:BatchSequence/sql:Batch/sql:Statements/sql:StmtSimple[@StatementType = "SELECT"]/sql:QueryPlan/sql:RelOp/descendant::*/sql:ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")])[1]/@ScalarString’, ‘nvarchar(4000)’ ) as scalar_string

    from sys.dm_exec_query_stats as qs

    cross apply sys.dm_exec_query_plan(qs.plan_handle) as qp

    where qp.query_plan.exist( N’/sql:ShowPlanXML/sql:BatchSequence/sql:Batch/sql:Statements/sql:StmtSimple[@StatementType = "SELECT"]/sql:QueryPlan/sql:RelOp/descendant::*/sql:ScalarOperator[contains(@ScalarString, "CONVERT_IMPLICIT")]’ ) = 1;

    You can also use the new DMV sys.dm_exec_text_query_plan and do a text-based search which is less accurate.

    select qs.sql_handle, qs.plan_handle, qs.total_elapsed_time, qs.last_elapsed_time, qs.min_elapsed_time, qs.max_elapsed_time

    , db_name(tqp.dbid) as database_name

    , quotename(object_schema_name(tqp.objectid, tqp.dbid)) + N’.’ + quotename(object_name(tqp.objectid, tqp.dbid)) as obj_name

    , tqp.query_plan

    from sys.dm_exec_query_stats as qs

    cross apply sys.dm_exec_text_query_plan( qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset ) as tqp

    where tqp.query_plan like ‘%"%CONVERT_IMPLICIT%"%’;

    Note that you need to change the query (OBJECT_NAME and OBJECT_SCHEMA_NAME usage) if you are not running on SQL Server 2005 SP2. I will be discussing this particular question in a new BLOG article to explain further.

    Umachandar

  10. gqnelson says:

    How do I see the database name that the query is executing against?

  11. gqnelson says:

    This returns the top 25 SELECT statements. How would I change it to return all types of statements?

  12. gqnelson says:

    I found if you change, you get all queries instead of just SELECT:

    N’//sql:Batch/sql:Statements/sql:StmtSimple[@StatementType = "SELECT"]’

    to

    N’//sql:Batch/sql:Statements/sql:StmtSimple/sql:QueryPlan[1]/sql:RelOp[1]’