Lesson Learned #69: Azure SQL Database – How are the statistics updated automatically by SQL engine


Hello,

Today I worked on a service request that our customer needs to know how the statistics are automatically updated by SQL Engine and what is the impact of this process in terms of consumption resources and distribution of the statistics.

STEP 1 – Created the table and statistics.

I have created an example table with name Example1 that contains two fields ID and Name.

				CREATE TABLE [dbo].[Example1](
				   [Id] [int] NOT NULL IDENTITY,
				   [Name] [varchar](200) NULL,
				CONSTRAINT [idcontraint_name] PRIMARY KEY CLUSTERED 
				(
				   [Id] ASC
				)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
				)
		 
				CREATE STATISTICS [xxExample1] ON [dbo].[Example1]([Name])
 

STEP 2 – I inserted around 40000 rows.

          DECLARE @T AS INT = 0
		WHILE @T < = 40001 
		BEGIN 
			SET @T = @T +1 
			INSERT EXAMPLE1 (ID,NAME) VALUES(@T,'A' + CONVERT(VARCHAR(20),@T))
		END 
 

STEP 3 – If we run the DBCC SHOW_STATISTICS, I have empty result set.

         dbcc show_statistics('example1',[idcontraint_name])
         dbcc show_statistics('example1',[xxExample1])

STEP 3 – Running the sys.dm_db_index_operational_stats I saw 40002 inserted correctly on the index.

   select * from sys.dm_db_index_operational_stats(null,null,null,null) where object_id=object_id('example1')

STEP 4 – Running the sys.dm_db_partition_stats I saw 40002 inserted correctly on the index.

   SELECT * FROM sys.dm_db_partition_stats where object_id=object_id('example1')

STEP 5 – As we explained, if you run any query and the number of updates are greater of 20% of the total of the rows and this query is using any field that the statistics is based, you will get an update statistics.

    select name from Example1 where name = 'A200' order by Id
         dbcc show_statistics('example1',[idcontraint_name])
         dbcc show_statistics('example1',[xxExample1])

STEP 6 – On the same way, you will get an updated/delete statistics when you update data if this is more 20% of total rows.

    update EXAMPLE1 SET name = 'B2'+NAME where id >100 AND ID <= 8200

For this example I changed the compatibility level of the database to 110, please, review this URL because depending on the compatibility model you could have a different way to calculate automatically the statistics.

Also, I would like to highlight two points:

  • Depending on the number of rows of the table, SQL Engine could choose update the statistics using only a sample instead of all the rows of the table and depending on this distribution we may have an incorrect distribution data.
  • Rebuild indexes or update the statistics will generate a new execution plan for your queries.

Enjoy!


Comments (0)

Skip to main content