Non updating updates



A question we are frequently asked is what happens when an update statement assigns a column to its same current value. For example,


 


use tempdb


go


 


create table t(i int, cc as i + 1)


create index t_cc on t(cc)


go


 


insert into t values(1)


go


 


update t set i = 1


go


 


update t set i = i


go


 


Columns do get updated even if the value does not change. However, to be honest, i wouldn’t worry too much about it.
The runtime cost for updating a row is roughly equal to


a) locating the row in the heap or B-Tree


b) locking the row


c) changing the updated column values


d) logging the change.


For updates to columns that fit in the 8Kb page, i don’t think that avoiding part of the copy would make too much of a difference. Realistically, performing a comparison to tell what changed with memcmp would roughly cost as much as changing the value with memcpy. And this cost would surely be overshadowed by a) + b) + d). Updates to columns that don’t fit in the 8Kb page are inherently slower, and i think it is up to the application to avoid making unnecessary changes to these.

Now a), b), c) and d) need to be performed for every index carrying one or more columns being modified. So a major factor for the runtime cost of an update query is the number of nonclustered indexes that need to be maintained. In SQL 2005 we introduced an optimization to skip nonclustered index maintenance if none of the updated columns stored in the index actually changed. The reasoning is that running the comparison for only the modified columns stored in one or more nonclustered indexes introduces a runtime overhead that is very small, and surely significantly lower than a), b) and d). Given that columns stored in an index are typically pretty small in size, it is very easy to break even if the optimization actually saves a small minority of nonclustered index row updates. The optimization does not apply to the clustered index, as we want to ensure to always exclusively lock the affected rows even if no columns are really changing.

You can see the new optimization in action by comparing the statistics profile output for one of the update statements in the previous example in SQL 2000 and 2005 (see attached image).


set statistics profile on
go

update t set i = i


go


In SQL 2000, the lack of the optimization leads to updating the nonclustered index even if the value is not changing.

 

In the SQL 2005 plan, it is possible to appreciate


– a “Compute Scalar” operator that compares the current value and new value of the column being modified


– a new filter operator that on a row by row basis will determine whether the nonclustered index is being affected or not


– the fact that nonclustered index maintenance is now bypassed


 


This new SQL 2005 optimization allows to better address the problem of maintaining a diverse set of columns across multiple recurring update statements against a certain table. In SQL 2000 there was a tradeoff between building dynamic SQL statements and possibly incur in frequent compilations, to only maintain the modified columns vs. using a parameterized and standard statement that updated all the columns, but would also maintain all the nonclustered indexes all the time. Like i said before, special precautions should still be taken for columns that don’t fit the 8Kb page limit.


 


Ciao,


Stefano

Profiles.png


Comments (6)

  1. JFOS says:

    This is really cool, and it makes sense, except for one thing.

    Why does the example only seem to work if cc is a computed column?  If I make cc a normal column and then try the test I don’t even see the "Index Update" entity in the profile at all, even though the index does exist.  In fact, I don’t even see the Index Update even if I set cc to any value in the UPDATE, which is confusing.  Doesn’t the index get updated in that case?  Am I missing something?

    Thanks.

    Jerry

  2. QueryOptTeam says:

    Hi Jerry – thank you for the feedback. The answer to your question is actually fairly complex, so please look forward to a forthcoming post to explain this in better detail.

    Stefano

  3. scherbinin says:

    Hi Stefano! Great blog, I really enjoyed.
    Some correction: I think that d) logging the change must come berfore c) changing the updated column values, because of WAL. Isn’t it?

  4. QueryOptTeam says:

    You are correct. Because of write-ahead logging, logging the change will occur before modifying the value in the row. Thank you for pointing the inaccuracy out.

    Stefano

  5. Insert, update, and delete plans consist of two parts. The first part or read cursor identifies the list