When can allocation order scans be used?

I know this is jumping the gun a little as I haven't made it this far in my series on fragmentation, but this came up in a chalk-talk I did yesterday at TechEd Developers in Barcelona and is worth blogging about.

You'd expect a select * from mytable query on a table with a clustered index to use an allocation order scan to return the data, as that's the fastest way to read all the pages at the leaf-level of an index. In fact, the query plan will show an unordered clustered index scan. Well, guess again. What I didn't remember, and thanks to SQL Server MVP Maciej Pilecki for pointing this out during my talk (even though I didn't believe him at first), is that an allocation order scan can't be used when there's any possibility of the data changing beneath the scan.

Consider an example where an allocation order scan is progressing and a page that the scan has already read then splits - adding a newly allocated page at the end of the allocation-order list of pages. The scan will eventually come to the new page and then re-read some of the rows it has already read - producing duplicates in the scan output - clearly undesirable.

The only time such a scan will be used is when there's no possibility of the data changing (e.g. when the TABLOCK hint is specified, or when the table is in a read-only database) or when its explicitly stated that we don't care (e.g. when the NOLOCK hint is specifed or under READ UNCOMMITTED isolation level). As a further twist, there's a trade-off with setup cost of the allocation order scan against the number of pages that will b read - an allocation order scan will only be used if there's more than 64 pages to be read.

Below I've included a simple script that demonstrates the behavior. For me, the funny thing is that I should have remembered this behavior as it's exactly the same reason why DBCC SHOWCONTIG takes a shared table lock when in the default mode - it's using an allocation order scan and needs to ensure that the data doesn't change and produce duplicates. Oh well - you (re)learn something every day!

-- Drop and recreate the test database

USE

master;

DROP

DATABASE allocationordertest;

CREATE

DATABASE allocationordertest;

USE

allocationordertest;

GO

-- Create a simple table that we can fill up quickly, plus a clustered index

CREATE

TABLE t1 (c1 INT, c2 VARCHAR (8000));

CREATE

CLUSTERED INDEX t1c1 ON t1 (c1);

GO

-- Add some rows, making sure to produce an out-of-order dataset when scanned in allocation order.

DECLARE

@a INT;

SELECT

@a = 10;

WHILE

(@a < 100)

BEGIN

    INSERT INTO t1 VALUES (@a, replicate ('a', 5000))

    SELECT @a = @a + 1

END

;

SELECT

@a = 1;

WHILE

(@a < 10)

BEGIN

    INSERT INTO t1 VALUES (@a, replicate ('a', 5000))

    SELECT @a = @a + 1

END

;

GO

-- Now try to do an allocation order scan - doesn't work...

SELECT

* FROM t1;

GO

-- Until we add the right conditions.

SELECT

* FROM t1 WITH (TABLOCK);

GO