FAQ around sys.dm_db_index_usage_stats

For those of you who may not be familiarized with this object, sys.dm_db_index_usage_stats is one of the many Dynamic Management Views built into SQL Server. This one specifically falls into the category index related DMVs, and it returns counts of different types (scans, seeks, and lookups) of index operations, triggered by user or system initiated actions, and the time each type of operation was last performed. It reveals enlightening facts about how useful your indexes are or how much of an overhead they are to the performance of the system.

Following are some questions I’ve received from others or that I’ve asked to myself about the information exposed by sys.dm_db_index_usage_stats. I’ll keep this post alive, adding new questions and their corresponding answers as I come across them.

Q: Does SQL Server always collects these index usage statistics?

A: By default, yes. However, circumstances that prevent index usage statistics collection are the following:

  • The server was started with command line parameter -x (disable collection).
  • The server was started with command line parameter -m (maintenance mode).
  • Trace flag 2330 (disable index usage statistics collection) has been enabled. By enabling the trace flag session wide, it only affects the objects accessed from that specific session. If it is enabled server wide, from the instant the flag is enabled, SQL stops colleting usage stats for any object, from any database, being accessed by any user. Whatever has been collected to that point remains collected, can potentially be removed, but it is not updated as further accesses happen.

 

Q:   Does it gather usage statistics for system objects located in any database or for objects stored inside the Resource database? For example, if I run select * from sys.objects, will it keep usage stats of clustered indexes for sysschobjs, syssingleobjrefs, and syspalnames?

A: No. No statistics are collected for system objects. And there is no trace flag available in the retail build of SQL Server to override this default behavior. However, in the debug build of SQL Server there exists one trace flag that permits collecting these stats for absolutely all objects.

 

Q: If I access a table object through a heap data structure (one table with no clustered index), will that be counted as part of the usage statistics?

A: Yes. Access through heaps are also reported. The statistics associated to heaps can be identified because the column index_id reports a value of zero.

 

Q: If a query accesses an empty heap or index, does it get counted? For example, if you have just created a table and still haven’t inserted even a single row into it, will a SELECT on that table be counted in these stats?

A: Yes. That one also counts.

 

Q: If a database is shutdown (due to autoclose, detach, drop, any other DB reason that triggers the shutdown of the database) what happens to the collected statistics?

A: All the statistics associated to any object contained in that database are removed from memory. Next time the database is started up, the usage statistics will begin to be collected again.

 

Q: If an index is dropped what happens to the collected statistics associated to it?

A: They are removed from memory.

 

Q: If an object (table/view) is dropped what happens to the collected statistics associated to all their associated indexes?

A: They are all removed.

 

Q: If the base data structure used for a table object is a heap, on top of that heap, several non-clustered indexes have been created, and accesses have occurred through the different indexes so that there are usage statistics recorded for all those accesses. What happens if the user creates a clustered index on that table? As you should know, when you create a clustered index the heap is removed (heap and clustered index are mutually exclusive on the same table). But what happen to the usage statistics that existed before such action?

A: Any usage statistics associated to the heap are removed after the clustered index is created. Whatever usage statistics were associated to the non-clustered indexes before the clustered index was created remain intact. During the index creation process the clustered index must be scanned as many times as non-clustered indexes the table has. Those scans are reported as system_scans for the usage stats of the clustered index. However, when the clustered index creation statement completes, its usage statistics entry is removed from memory. Until some other statement accesses through that newly created clustered index, no usage statistics will be created for it.

 

Q: If the base data structure used for a table is a clustered index, on top of that index, several non-clustered indexes have been created, and accesses have occurred through the different indexes so that there are usage statistics recorded for all those accesses. What happens if the user drops the clustered index on that table? As you probably know, when you drop the clustered index of a table, it is replaced with a heap. And, any existing non-clustered indexes that were built on top of the clustered index must be rebuilt so that their keys don’t point to the clustering key (key of the clustered index) but to the Row ID (file + page + row of the heap pages).

A: If any usage statistics associated to the clustered index existed, they are wiped out. And one index usage statistic is created for the new heap the first time it gets scanned to populate the first non-clustered index. The heap will be scanned as many times as non-clustered indexes existed when the clustered index was dropped, and all those accesses will be counted under the system_scans column of the heap’s stats.

 

Q: When a clustered or non-clustered index is rebuilt using ALTER INDEX … REBUILD, how is it reflected in the usage statistics?

A: The system_scans value of the usage statistics entry associated to the index being rebuild is incremented by one. If there was no usage stats entry for that index when it was rebuilt, one is created at that point.

 

Q: When a clustered or non-clustered index is dropped and rebuilt using CREATE INDEX … WITH DROP_EXISTING, how is it reflected in the usage statistics?

A: After the index is rebuilt, the usage stats entry for that index is removed from memory. And, if the index definition is changed so that the rebuild must scan the heap or some other index to retrieve the values that were missing from the original index structure, the usage statistics entry associated to that other structured being scanned is updated to reflect an increment of its system_scans.

 

Q: If a row gets inserted into a table, how is that reflected in its heap/indexes usage stats?

A: The user_updates for each index usage statistic row associated to the target object (table) is incremented as part of the INSERT DML operation.

 

Q: Does the previous assertion also apply to filtered indexes or it depends on whether or not the row meets the filtering predicate in order for it to be accounted?

A: Whether or not the inserted row meets the filtering predicate, the user_updates counter will always be incremented as part of the INSERT DML operation.

 

Q: What if an UPDATE modifies one or more colums which are part of the clustered index key?

A: That increments the user_updates not only for the clustered index, but also for each and every non-clustered index (as they all contain a copy of the clustering key which also must be updated). That is also the case for filtered indexes.

 

Q: What if an UPDATE modifies one or more colums which are NOT part of the clustered index key?

A: That increments the user_updates of the heap or clustered index (where the base data resides), as well as that of any non-clustered indexes maintaining a copy of that column (either because the column is part of the index key, or because it is INCLUDEd in the leaf level of the index). For filtered indexes, as it has been explained in the two previous answers, as long as the index definition contains any of the updated columns, its usage statistics will be updated regardless of the subset of actual rows modified.

 

Q: And if the UPDATE doesn’t actually affects any rows because no rows meet the search predicate?

A: Still, that increments the user_updates of the heap or clustered index (where the base data resides), as well as that of any non-clustered indexes maintaining a copy of that column.

 

Q: Are XML indexes also included in the monitoring range of sys.dm_db_index_usage_stats?

A: Yes, the use of both primary and secondary XML indexes is also recorded and can be viewed through this DMV.

 

Q: Do the seeks, lookups or scans counters tell you anything about the volume of data (rows or pages) visited/moved/updated by the workload?

A: Not at all. A SELECT statement whose optimal execution plan includes an Index Scan over a particular index will increment the user_scans counter by 1 for that index, whether the index is empty, whether the Index Scan iterator is preceeded by a TOP 1 operator that stops the scan after the first row has been read, or whether it has to read hundreds of thousands of pages in order to complete the scan.

 

Q: If a statement fails with a runtime error, like in the case an INSERT fails with error 1946 because the data it is trying to insert exceeds the maximum length of the key in one of the indexes where it must be referenced, what happens to the usage statistics?

A: Since the usage counters exposed by this DMV are incremented when the statement begins its execution, before it even knows whether it will encounter a runtime error or not, the statistics for all of the indexes targeted by the statement will be updated in both cases: when the statement encounters a runtime error as when it doesn’t and executes successfully to the end.

 

Q: How does the existence of an AFTER TRIGGER can affect how and what statistics are collected?

A: If a statement targets an object for which there is a for/after trigger for such operation, the statistics collected for the affected indexes will be exactly the same as if there is no trigger. However, inside the trigger there could be additional statements targeting the exact same object or a set of completely different ones. For each of those additional statements, SQL will record whatever uses they make of heaps and indexes. Although this second part won’t happen if the original operation fails with a runtime error that prevents executing the statements inside the trigger.

 

Q: How does the existence of an INSTEAD OF TRIGGER can affect how and what statistics are collected?

A: If a statement targets an object for which there is an IOT for such operation, only the statistics of the objects accessed by the statements inside the trigger will be collected, because the original statement is not even executed. It is overridden by the contents of trigger. That is the idea behind IOTs.