SQL Server Statistics: Explained

Several DBA’s asked me these questions about SQL Server statistics at different customer places.

a)      Should we disable Auto update stats for the database and run maintenance plans instead?

b)      What sampling rate does Auto Stats use? Can we control it?

c)       Should we update statistics after rebuilding indexes during maintenance plans?

 

Here is an explanation for all these questions.

There are 3 types of statistics in a SQL Server database.

1)      Statistics created due to index creation. These statistics have the index name

2)      Statistics created by Optimizer(Column statistics). Starts with _WA_*

3)      User defined statistics which are created with CREATE STATISTICS command by the DBA

 Facts about statistics:  

  • Index statistics are always updated with fullscan when we rebuild the index (Only exception is in SQL 2012 partitioned index when the number of partitions >1000 it uses default sampling)
  • Column Statistics are not updated because of index rebuilds, they are only updated by either DB Auto update stats option or if we manually update statistics using UPDATE STATISTICS command.
  • If we do not specify the sampling rate while running the UPDATE STATISTICS command it will take the default sampling rate. See below how the default sampling rate algorithm works.
  • If you specify Update statistics command with just the table name it will update statistics for all the statistics including the index statistics on the table. So you may lose the full scan advantage which you get for index statistics if you update statistics with sampling rate after rebuilding indexes (Be very careful with this step during your maintenance windows)

 

Auto Update stats Algorithm:  

So the Auto Update stats will fire for every 500 + 20% change in table rows. Of course, we have an improved algorithm in SQL 2012 which is SQRT(1000 * Table rows) which is much better.

 

When it fires it will use the default sampling rate and here is the algorithm how it calculates the sampling rate.

1)      If the table < 8MB then it updates the statistics with a fullscan.

2)      If the table > 8MB, it follows an algorithm. It reduces the sampling rate as the number of rows in the table are increased to make sure we are not scanning too much data. This is not a fixed value but is under the control of optimizer. It is not a linear algorithm either.

Example: if we have 1,000,000 rows it would use a sampling rate of 30% but when the number of rows increase to 8,000,000 it would reduce the sampling rate to 10%. These sampling rates are not under the DBAs control but optimizer decides it.

 

Question: So, what if we disable Autostats on the database and decide to update statistics manually by the DBA using a scheduled job?

Answer: It depends, on what the sampling rate was used by the optimizer versus what we specify manually. The larger the sampling rate the better the results.

                Here is how you can calculate the sampling rate used by Autostats algorithm, Run DBCC SHOW_STATISTICS(Tablename, statistic name), This will come up with Num of rows and Sampled rows. So you do a math, Sampling Rate = (sampled rows/Actual Rows)*100

 

In case you are running on SQL 2008 R2 SP2 or SQL 2012 SP1, there is a new DMV called as sys.dm_db_stats_properties where you can get the statistics sampling rate for all tables in the db.

 

You can check this for every table and see which method has the largest sampling rate and select it. If you think manually updating statistics is yielding higher sampling rate, then you can run Sp_autostats ‘OFF’ for that statistic and use your job to update it manually. In this way either the statistics are updated with Auto stats or updated manually yielding better execution plans for the optimizer.

 

Perils of updating statistics manually:

1)      Adds overhead for the DBA to write custom scripts for individual statistics

2)      If the job fails, we don’t have updated statistics until the next run.

3)      If SQL Agent fails, then your update statistics job cannot run.

 

Recommendation: We would recommend to leave the Auto Stats option for the database to ON, and if there is a particular case where your seeing bad query plans on certain tables, just disable Autostats for that table and enable the manual statistics job for that table.

References:

Internal employees

https://msdn.microsoft.com/en-us/library/dd535534.aspx

https://www.sqlskills.com/blogs/erin/post/New-Statistics-DMF-in-SQL-Server-2008R2-SP2.aspx