CHECKDB (Part 3): What does CHECKDB really do? (2 of 4)

In the previous post of this series, I covered the system table checks that have to be done before anything else can be checked by CHECKDB. Now that I've described pages, extents, IAM chains/allocation units and the major allocation bitmaps, in this post I'll cover the allocation checks.

1. Primitive checks of critical system tables
(Part 1...)

2. Allocation checks

This stage is in SQL Server 2000 and 2005. Allocation checks are checks of the various structures (IAM pages, IAM chains, allocation units, GAM/SGAM pages, PFS pages) that describe the allocations (pages, extents) that have been done within a database.

I'll describe how we go about collecting information from the various pages and then describe what some of the actual checks are.

The allocation checks are very fast (orders of magnitude faster than the logical checks, so fast in fact that they happen in the blink of an eye! well, perhaps I'm getting carried away a little as usual but you get the idea) because the number of database pages that have to be read is very small (so small in fact that... ok, I'll shut-up).  The algorithm for gathering allocation data is as follows:

  • For each file in each online filegroup in the database (except transaction log files):
    • read all PFS pages
      • This gives us a bitmap showing all IAM pages, plus another one showing all mixed pages
    • read the GAM pages
      • This gives us bitmaps of all allocated extents
    • read the SGAM pages
      • This gives us bitmaps of all mixed extents with at least one unallocated page
    • read the DIFF_MAP pages (a 'differential bitmap' page shows which extents in the GAM interval have been modified since the last full or differential backup - a differential backup only needs to backup those extents marked modified in the various DIFF_MAP pages)
      • This is just to make sure the pages can be read
    • read the ML_MAP pages (a 'minimally-logged bitmap' page shows which extents in the GAM interval have been modified in bulk-logged recovery mode since the last log backup - a log backup must also backup all such extents to ensure that all changes to the database have been backed up. This can make the log backup quite large (although the log itself stays much smaller) - but that's a topic for another blog post.
      • This is just to make sure the pages can be read
    • read all IAM pages
      • This gives us:
        • a list of all the mixed pages in the file, and by derivation, a list of all mixed extents in the file (remember that the first IAM page in an IAM chain/allocation unit contains an array to hold up to 8 mixed pages for the object/index/partition it represents
        • a list of all the valid IAM pages in the file
        • a list of all the allocated dedicated extents in the file
        • linkage information for IAM chains
  • After all the per-file stuff, read the Storage Engine metadata
    • This gives us:
      • information about the root of each IAM chain. Each row in the sys.allocation_units hidden metadata table contains the page Id of the first IAM page in the IAM chain for that allocation unit.
      • information about IAM chains currently waiting to be deferred-dropped. (This is the process by which an IAM chain with > 128 extents that is dropped - by dropping/rebuilding an index or dropping/truncating a table - does not have its actual pages and extents deallocated until after the transaction has committed. The IAM chain is unhooked from sys.allocation_units though and hooked into an internal queue - if we didn't scan that queue too as part of the allocation checks, we'd see all kinds of inconsistencies with the various allocation bitmaps)

So, now we've got a whole bunch of allocation data that we're juggling and we need to make sense of it all to see if all the allocation structures are correct. Here's a non-exhaustive list of checks that we do with all this data:

  • Check that each extent is either allocated to:
    • the GAM page for the GAM interval
    • or, the SGAM page for the GAM interval, as a non-full mixed extent
    • or, exactly one IAM page that covers the GAM interval
    • or, to no bitmap page, but all pages in the extent must be allocated to IAM pages as mixed pages
    • This could result in an 8903 (GAM and SGAM), 8904 (multiple IAMs), or 8905 (no page) errors depending on the combination of bitmaps that have the extent allocated
  • Check that all pages marked as being IAM pages in PFS pages really are IAM pages when they're read
  • Check that all pages marked as being mixed pages in PFS pages appear somewhere in a mixed page array on an IAM page
  • Check that each mixed page is only allocated in a single IAM page
  • Check that the IAM pages in an IAM chain have monatonically increasing sequence numbers
  • Check that the first IAM page in an IAM chain has a reference from a row in sys.allocation_units
  • Check that no two IAM pages within the same IAM chain map the same GAM interval
  • Check that all IAM pages within an IAM chain belong to the same object/index/partition
  • Check that the linkages within an IAM chain are correct (no missing pages for instance)
  • Check that all IAM/GAM/SGAM pages that map the final GAM interval in a file do not have extents marked allocated that are beyond the physical end of the file

Any errors found here will require REPAIR_ALLOW_DATA_LOSS to repair and some of the repairs are very complicated (e.g. multiply-allocated extents) - topic for a future blog post.

So, the allocation checks lay the next foundation level over the system table primitive checks and we're ready to move on to the logical checks.

3. Logical checks of critical system tables
4. Logical checks of all tables

(Part 3...)

5. Service Broker checks
6. Metadata cross-checks
7. Indexed view and XML index checks

(Part 4...)