Are My Statistics Correct?

The question is often “Are my statistics up-to-date?” which can be a bit misleading.   I can make sure I have up-to-date statistics but the statistics may not be accurate. 

I recently engaged in an issue where the statistics were rebuilt nightly.   A maintenance job change had been made moving from FULLSCAN to WITH SAMPLE statistics creation/update that dramatically altered the statistical layout.  The underlying data was skewed and as such the execution plan generation(s) varied significantly.  Queries running in 1 minute now took over an hour to complete using an alternate plan with significant memory grants and TEMPDB usage.

As you can imagine this issue has resulted in a series of DCR asks from the product team.  

The dilemma we all run into is what level of SAMPLED statistics is appropriate?   The answer is you have to test but that is not always feasible and in the case of Microsoft CSS we generally don’t have histogram, historical states to revisit.  

Microsoft CSS is engaged to help track down the source of a poorly performing query.   It is common step to locate possible cardinality mismatches and study them closer.   Studying the statistics dates, row modification counter(s), atypical parameters usage and the like are among the fundamental troubleshooting steps.

The script, below, is one way Microsoft CSS may use to help determine the accuracy of the current statistics.   You can use similar techniques to check the accuracy of your statistical, SAMPLING choices or to store historical information.  The example loads a specific histogram for the ‘SupportCases’ table then executes queries to, using the key values and range information to determine actual counts (as if FULLSCAN) had been executed.   The final select of the captured data can be used to detect variations in current actual vs the in use histogram.

create table #tblHistogram
(
vData sql_variant,
range_rows bigint,
eq_rows bigint,
distinct_range_rows bigint,
avg_range_rows bigint,
actual_eq_rows bigint DEFAULT(NULL),
actual_range_rows bigint DEFAULT(NULL)
)
go

create

procedure #spHistogram @strTable sysname, @strIndex sysname
as

dbcc show_statistics(@strTable, @strIndex) with HISTOGRAM
go

truncate

table #tblHistogram
go

insert

into #tblHistogram (vData, range_rows, eq_rows, distinct_range_rows, avg_range_rows)
exec #spHistogram 'SupportCases', 'cix_SupportCases'
go

-- EQ_ROWS

update #tblHistogram
set actual_eq_rows = (select count(*) from SupportCases with(NOLOCK) where ServiceRequestNumber = h.vData)
from #tblHistogram h;

-- RANGE_ROWS

with BOUNDS (LowerBound, UpperBound)
as
(
select LAG(vData) over(order by vData) as [LowerBound], vData [UpperBound] from #tblHistogram
)

update

#tblHistogram
set actual_range_rows = ActualRangeRows
from (select LowerBound, UpperBound,
(select count(*) from SupportCases with(NOLOCK) where ServiceRequestNumber > LowerBound and ServiceRequestNumber < UpperBound) as ActualRangeRows from BOUNDS
) as t
where vData = t.UpperBound
go

select

/*TOP 10 NEWID(),*/ vData, eq_rows, actual_eq_rows, range_rows, actual_range_rows from #tblHistogram
where eq_rows <> actual_eq_rows or range_rows <> actual_range_rows
--order by 1
go

Testing the script I leveraged UPDATE STATISTICS with SAMPLE 1 PERCENT and skewed data in my table.   This resulted in several steps of the histogram having a statistical variation of +200% from the actual (FULLSCAN) values.

I continued to test variants of SAMPLE PERCENTAGE until the statistical relevance level from actuals fell within a noise range.   For my data this was 65 PECENT.   SAMPLING at 65 PERCENT allows reduction of statistics creation/modification time while retaining the necessary statistical relevance.

Bob Dorr - Principal SQL Server Escalation Engineer