Getting more statistics information programmatically

As the building blocks which the Query Optimizer uses to know data distribution on tables, statistics are one of the key aspects in calculating a good enough plan to read data. So, when engaging in query performance troubleshooting, information about statistics is something you need to look at.

Let’s say we need to look at statistics information on table SalesOrderDetail in Adventureworks2016CTP3.

 SELECT sch.name + '.' + so.name AS [TableName], so.[object_id], ss.name, ss.stats_id,
(SELECT CONVERT(VARCHAR,c.[name]) + N',' AS [data()]
        FROM sys.stats_columns sc
        INNER JOIN sys.columns c ON sc.[object_id] = c.[object_id] AND sc.column_id = c.column_id
        WHERE sc.stats_id = ss.stats_id AND sc.[object_id] = ss.[object_id]
        FOR XML PATH('')) AS Columns
FROM sys.stats ss
INNER JOIN sys.objects so ON ss.object_id = so.object_id
INNER JOIN sys.schemas sch ON so.schema_id = sch.schema_id
WHERE so.name =  N'SalesOrderDetail';

This is the output:

image

For a long time the DBCC SHOW_STATISTICS command was the only way to get information about statistics. Let’s say I need to look at the stats over column ModifiedDate.

 DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail','_WA_Sys_0000000B_57DD0BE4') WITH STAT_HEADER, HISTOGRAM

The above provides the following:

image

But if you need to get most of this information programmatically, you would need to dump this into temp tables, and query from there – can be cumbersome. So, back in SQL Server 2008R2 (SP2) and SQL Server 2012 (SP1) we introduced the sys.dm_db_stats_properties DMV. This DMV outputs the equivalent of executing DBCC SHOW_STATISTICS WITH STAT_HEADER (for partitioned tables, see the similar sys.dm_db_incremental_stats_properties). You can see examples of these DMVs being used programmatically in the TigerToolbox Github, with BPCheck and AdaptiveIndexDefrag.

Looking at the same stat over ModifiedDate column using sys.dm_db_stats_properties we get:

 SELECT * FROM sys.dm_db_stats_properties(1474104292, 4)

image

But what if you need to programmatically query the histogram? You would still need to dump DBCC SHOW_STATISTICS to a table and work from there.

Not anymore. With the latest SQL Server vNext CTP1.3* we introduced a new DMV sys.dm_db_stats_histogram. This DMV returns the statistics histogram for the specified object, equivalent to DBCC SHOW_STATISTICS WITH HISTOGRAM, as seen below for the same stat over ModifiedDate column:

 SELECT * FROM sys.dm_db_stats_histogram(1474104292, 4)

image

 

This opens new possibilities to programmatically explore a histogram for a specific query predicate, and understand data distribution inline. We will be updating the relevant scripts in the TigerToolbox Github.

* This DMV will also become available in an upcoming SQL Server 2016 SP1 CU.

Pedro Lopes (@sqlpto) – Senior Program Manager