CHECKDB (Part 5): What does CHECKDB really do? (part 4 of 4)

Article
09/20/2006

(Another airport, another blog post - I really must make an effort to come up with more original banter - I'm sure I've used that line before. TechEd Guangzhou has finished and I'm on the way to Beijing for TechEd #3. Checking in and getting through security were challenging to say the least this morning - there seemed to be problems at every turn, exacerbated by my China Southern airlines chaperone who spoke limited English (or what I should really say is that the problem was I don't speak Mandarin). Finally at the security checkpoint they asked for my ticket as well as my boarding pass - try explaining "e-ticket" to someone who doesn't speak the same language...)

So part 4 of the high-level explanation of what DBCC CHECKDB does will focus on the three completely new sets of checks that are in CHECKDB in SQL Server 2005. Well, the metadata checks (i.e. DBCC CHECKCATALOG) used to exist but so many people used to run both CHECKDB and DBCC CHECKCATALOG that I thought I'd make it easier in SQL Server 2005 by running DBCC CHECKCATALOG as part of CHECKDB. Maybe in Katmai we'll add in DBCC CHECKCONSTRAINTS - haven't decided yet.

Primitive checks of critical system tables
(Part 1...)

Allocation checks
(Part2...)

Logical checks of critical system tables
Logical checks of all tables
(Part 3...)

Service Broker checks
Metadata cross-checks
Indexed view and XML index checks
Its important that these checks are done after the per-table logical checks. This is because any repairs done as part of the per-table checks may affect the outcome of these checks quite substantially.

Let's look at an example of this. Imagine the case where an indexed view is based on a join between two tables, foo and bar. Table foo has a damaged page and is damaged in a way that reading the page as part of a query would not recognize - because full page audits are not done as part of regular page reads. The damage is such that the page has to be deleted as part of repair - thereby changing the results of the view. If the indexed view was checked before tables foo and bar, then the repair that was done would not get reflected in the indexed view and so the indexed view would essentially be corrupt. The same logic holds for checking XML indexes and Service Broker tables.

(Ok - in the air now with some serious turbulence - luckily I popped a Dramamine before we took off. And, more jellyfish in the in-flight meal - very cool :-) The 1st class cabin has what looks like a fully stocked bar - a little too early at 10am to be starting, although it could make for some interesting blog posts...)

Service Broker checks
The Service Broker dev team wrote a comprehensive set of checks of the data stored in their internal on-disk structures. The checks validate the relationships between conversations, endpoints, messages and queues. For example:

a conversation must have two endpoints
a service must be related to a valid contract
a service must be related to a valid queue
a message must have a valid message type

What's even cooler is that they implemented a set of logical repairs, so if one of their internal tables was damaged, and repaired by the earlier logical checks, then the Service Broker repair code can clean up any Service Broker entities and entity relationships that were damaged too.

Metadata cross-checks
The Metadata team also wrote a great set of checks for the relational metadata stored in the system tables. These checks that are run here are the same code that's run for DBCC CHECKCATALOG. The actual checks themselves in SQL Server 2005 are far more involved than in SQL Server 2000, and they're done way more efficiently too. However, they're not comprehensive by any means - we'll add some more checks during Katmai.

The checks only cover the relational metadata - i.e. the relationships between system table storing relational metadata. There are no such checks for the tables storing the storage engine metadata (also known as the critical system tables that were described in Part 1 of this 4 part sub-series. At present, these relationships between these tables are checked implicitly by the metadata checks I described in the last post. The checks are a series of validations between the various system tables, such as checking each rowset referenced in the sysrowsetcolumns system table exists in the sysrowsets system table.

There are no repairs for metadata corruptions. Metadata corruptions are extremely difficult to deal with because changing/deleting metadata has an effect on the entire table the corrupt metadata describes and could potentially be as bad as deleting the table - a table without metadata is just a collection of pages with indecipherable rows (well, not quite true - its possible to decipher any record without metadata but it requires human intervention and is incredibly hard).

Its possible we may put in some limited metadata repairs in a future release, but the frequency of their occurence in the field is so low that I decided that the engineering investment was not justified for SQL Server 2005 and so didn't push it. So - if you get any metadata corruption, you need to restore from your backups which, of course, after reading through my blog, you've been scared into making sure you have, right?...

Indexed view and XML index checks
These are very cool and many thanks to Conor Cunningham (dev lead of one of the two Query Optimizer teams) for helping work out how to do this.The indexed view contains the persisted results of a query, and we have the actual query it persists stored in metadata - so the easiest way to check whether the indexed view is accurate is to recalculate the view results into a temp table in TEMPDB and then compare the calculated values with the values persisted in the indexed view. The view is regenerated into a temporary table and then two concurrent left-anti-semi-joins are run, that basically return all the rows in the indexed view that are not in the recalculated view results, and vice-versa. This gives us all the rows that are extraneous in the indexed view and all the rows that are missing from it.

Indexed-view problems can also be repaired. The repair for extra rows is to delete them one by one (using internal query syntax that only works from DBCC), and the repair for missing rows is to rebuild the indexed view. This is done by simply disabling the indexed view and then bringing it back online (which rebuilds it).

There are two drawbacks to the indexed view checks if the views are large:

it can take up a lot of space in tempdb
it can take a lot of time to run the regeneration of the indexed views and to run the left-anti-semi-joins

So if you upgraded your database from SQL Server 2000 and you regularly run full CHECKDBs (i.e. without using WITH PHYSICAL_ONLY), then you may see a run-time increase for CHECKDB on SQL Server 2005 - this is documented in BOL and the README.

XML index checks work in a similar way. The XML blobs in the table are re-shredded and checked against the shredding that's persisted in the primary XML index. If anything is wrong, the primary XML index is recreated.

And that's it. Now you have a complete picture of what's going on with CHECKDB when it runs. In the next few posts I want to give some insight into how CHECKDB manages to do all these checks while only making a single pass through the database and shed some light on how repair works. Until then - time to chill out before doing more sessions in Beijing tomorrow... qing lai yiping pijiu!!!

CHECKDB (Part 5): What does CHECKDB really do? (part 4 of 4)

Additional resources