Optimized Non-clustered Index Maintenance

Insert, update, and delete plans consist of two parts.  The first part or read cursor identifies the list of rows to be inserted, update, or deleted.  The second part or write cursor performs the actual insert, update, or delete.  Let's look at a simple example:

CREATE TABLE T (PK INT, A INT, B INT, C INT, D INT, E INT)
CREATE UNIQUE CLUSTERED INDEX TPK ON T(PK)

CREATE INDEX TB ON T(B)
CREATE INDEX TCD ON T(C,D)
CREATE INDEX TE ON T(E)

INSERT T VALUES(0, 10, 20, 30, 40, 50)

UPDATE T SET A = 19

Here is the plan for the update statement:

  |--Clustered Index Update(OBJECT:([T].[TPK]), SET:([T].[A] = [@1]))
       |--Top(ROWCOUNT est 0)
            |--Index Scan(OBJECT:([T].[TE]), ORDERED FORWARD)

In this plan, the index scan is the read cursor and the clustered index update is the write cursor.  I explained the purpose of the ROWCOUNT top in a prior post.  Although we have three non-clustered indexes on table T, SQL Server recognizes that these indexes do not cover the columns affected by the update and do not need to be maintained.  Hence this query plan only updates the clustered index.

If there are non-clustered indexes to maintain, the optimizer may choose to build a plan similar to the one above where the clustered index update also updates any non-clustered indexes.  Alternatively, the optimizer may choose to introduce separate clustered index update operators for each non-clustered index.  We often refer to a plan with a single update operator as a per-row or narrow plan and to a plan with one update operator per index as a per-index or wide plan.  Bart Duncan discusses the difference between these types of plans in this post.

Let's look at what happens if we try to update columns covered by the non-clustered indexes:

UPDATE T SET B = 29, C = 39, D = 49

  |--Clustered Index Update(OBJECT:([T].[TPK]), OBJECT:([T].[TB]), OBJECT:([T].[TCD]), SET:([T].[B] = [@1],[T].[C] = [@2],[T].[D] = [@3]))
       |--Compute Scalar(DEFINE:([Expr1015]=[Expr1015], [Expr1016]=[Expr1016]))
            |--Compute Scalar(DEFINE:([Expr1015]=CASE WHEN CASE WHEN [T].[B] = [@1] THEN (1) ELSE (0) END THEN (1) ELSE (0) END, [Expr1016]=CASE WHEN CASE WHEN [T].[C] = [@2] THEN (1) ELSE (0) END AND CASE WHEN [T].[D] = [@3] THEN (1) ELSE (0) END THEN (1) ELSE (0) END))
                 |--Top(ROWCOUNT est 0)
                      |--Clustered Index Scan(OBJECT:([T].[TPK]))

This is a per-row plan.  The single clustered index update operator updates the clustered index and two of the non-clustered indexes (TB and TCD).  The third non-clustered index (TE) does not cover the affected columns and still does not need to be maintained.  The showplan output for the clustered index update lists all three objects maintained by the operator.  (This is an improvement added in SQL Server 2005.  In versions prior to SQL Server 2005, the clustered index update operator only lists the clustered index.)

The plan also includes two new compute scalars that were not present in the first plan.  These operators are computing on a row by row basis whether there were any actual changes to each non-clustered index.  If there are no changes to a particular non-clustered index for a particular row, the server does not update that index for that particular row.  (This is another improvement added in SQL Server 2005.)

The first compute scalar calculates one boolean for each index by comparing the old and new values for each column in the index.  The expressions look slightly complex but if you look carefully you will see that they evaluate to 1 if there is no change to the index and if the index does not need to be updated while they evaluate to 0 if there is a change and if the index must be updated.  [Expr1015] checks column B and indicates whether index TB must be updated.  [Expr1016] checks columns C and D and indicates whether index TCD must be updated.  The second compute scalar is needed only for internal purposes.

There are at least three ways that we can verify that SQL Server is indeed only updating those non-clustered index rows that actually change.  First, we can compare the before and after output from the sys.dm_db_index_operational_stats DMV.  Second, we can check which locks are held by the update using the sys.dm_tran_locks DMV.  Third, we can look at the number of logical reads reported by SET STATISTICS IO ON.  Let's give each of these methods a try.  If you've already run either of the above update statements, drop and re-create the table to ensure that you get the same results.

Here is the output of sys.dm_db_index_operational_stats immediately after creating and populating the table with one row:

SELECT index_id, leaf_insert_count, leaf_update_count
FROM sys.dm_db_index_operational_stats (db_id(), object_id('T'), NULL, NULL)
ORDER BY index_id

 index_id    leaf_insert_count    leaf_update_count
----------- -------------------- --------------------
1           1                    0
2           1                    0
3           1                    0
4           1                    0

As expected, we've inserted one row into the clustered index (index_id 1) and into each non-clustered index (index_id 2 through 4).  Next, start a transaction and run an update that really modifies every column:

SET STATISTICS IO ON

BEGIN TRAN
UPDATE T SET B = 29, C = 39, D = 49

The STATISTICS IO output shows that we performed 10 logical reads:

Table 'T'. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now let's re-check sys.dm_db_index_operational_stats:

 index_id    leaf_insert_count    leaf_update_count
----------- -------------------- --------------------
1           1                    1
2           2                    0
3           2                    0
4           1                    0

This DMV reports cumulative data.  If we compare this data with the data from prior to the update, we see that as expected we've updated one row in the clustered index.  We also see that we've inserted one row each in two of the non-clustered indexes.  These two inserts are actually from updates that SQL Server internally split into a delete and an insert.

Next let's check sys.dm_tran_locks:

SELECT index_id, request_mode, request_type, request_status
FROM sys.dm_tran_locks JOIN sys.partitions
ON resource_associated_entity_id = hobt_id
WHERE request_session_id = @@spid AND resource_type = 'KEY'
ORDER BY index_id

 index_id     request_mode  request_type  request_status
-----------  ------------  ------------  --------------
1            X             LOCK          GRANT
2            X             LOCK          GRANT
2            X             LOCK          GRANT
3            X             LOCK          GRANT
3            X             LOCK          GRANT

As expected, we see that we are holding exclusive locks on the clustered index and on the two non-clustered indexes.  We actually have two locks on each non-clustered index because, as noted above, SQL Server split each update into a delete and insert.

Finally, commit the previous transaction and repeat the update:

COMMIT TRAN

BEGIN TRAN
UPDATE T SET B = 29, C = 39, D = 49

This time the update does not modify any data since we've already updated the single row to these values.  This is reflected in the STATISTICS IO output which shows that we performed only 2 logical reads compared the 10 logical reads from the first execution:

Table 'T'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If we re-check sys.dm_db_index_operational_stats one more time, we can see that this time only the clustered index is updated.  SQL Server does not touch the rows in the non-clustered index:

 index_id    leaf_insert_count    leaf_update_count
----------- -------------------- --------------------
1           1                    2
2           2                    0
3           2                    0
4           1                    0

And, if we re-check sys.dm_tran_locks, we can see that we are holding an exclusive lock on the clustered index only:

 index_id     request_mode  request_type  request_status
-----------  ------------  ------------  --------------
1            X             LOCK          GRANT

In my next post, I'll look at how this optimization works in per-index or wide update plans.  You can also read more about this optimization in this post by Stefano, a developer on the query optimization team.