Enabling compression on a HEAP

I often get a question how to do enable compression on a table that is a heap (i.e. it does not have a clustered index). Clearly, one could create the clustered index with compression option set to PAGE and then drop the clustered index. This is an expensive operation because

· Creating a clustered index requires a SORT

· When you drop the clustered index, internal allocation structure, namely the PFS (Page Free Space) needs to updated to reflect the free space on each page. Note, when a clustered index is dropped, the resultant heap retains the compression state.

Instead, in SQL Server 2008, there is a new REBUILD command available that can be used to rebuild a HEAP. You can use this command to enable compression on the HEAP. Also, you can use this command to defragment a HEAP. You may wonder what is there to defrag in a heap because the pages don’t need to be traversed in the key order? Well, it is possible the pages in a HEAP are not full so the defragmentation can be used to reclaim the free space. In any case, let us get back to compression. Let me start with the following example

create table t1_big (c1 int, c2 int, c3 char(1000))

go

- Load the data

declare @i int

select @i = 0

while (@i < 5000)

begin

     insert into t1_big values (@i, @i + 5000,

                              replicate('a', 500))

     set @i = @i + 1

end

- Look at the current row size

- The following query returns a length of 1015

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('compression'),

                   object_id('compression.t1_big'),

                   null, null, 'DETAILED')

where object_name (object_id) like '%t1_big%'

-- Now, you can use the following NEW command to enable

-- PAGE compression. You could have used this command to

-- enable ROW compression or even NONE compression. As

-- part of rebuilidng the HEAP, it is also defragmented.

-- This command can also be executed ONLINE

alter table t1_big rebuild

 with (

    data_compression = PAGE)

 

-- compression setting shows the PAGE compression

select object_name (object_id) as table_name, data_compression_desc

from sys.partitions

where object_name (object_id) like '%t1_big%'

-- check the length again and it returns a value 9

-- Note, the example I have used here is not realistic so it

-- show almost 99% compression but you get the idea.

select max_record_size_in_bytes

from sys.dm_db_index_physical_stats (db_id('compression'),

                   object_id('compression.t1_big'),

                   null, null, 'DETAILED')

where object_name (object_id) like '%t1_big%'

There are few restrictions on ALTER TABLE REBUILD command as follows

(1) If you use ONLINE option then HEAP rebuild is single threaded. Note, this restriction is not there when you are rebuilding an index which supports multi-threaded ONLINE build. However, in both cases, OFFLINE build does support multi-threading.

(2) When you rebuild a HEAP and specify compression option, the corresponding non clustered indexes are also rebuilt because RIDs will change but the compression state of the indexes will not change. If you want to alter compression option of existing nonclustered indexes, you will need to execute ‘Alter Index’ command on each such indexes. So effectively, you will need to rebuild the nonclustered index twice if you want to change the compression setting both for the HEAP and its associated nonclustered indexes. Here is the rationale for this

a. Most index rebuild options are not applicable to HEAP. For this reason, we did not add ‘ALL’ keyword like the one with Alter Index command.

b. Since indexes are typically much smaller compare to the table, we think most customers will not want to change compression state on indexes automatically as part of rebuilding the HEAP or clustered index for that matter.

In the next BLOG I will discuss DML operations on the HEAP and compression

Thanks

Sunil