Does data compression lead to more fragmentation?

Lately I have been asked how data compression impacts fragmentation (i.e. does it cause more or less fragmentation?). I believe this question is best answered by looking at how does fragmentation occur in the first place and then analyze each of these cases for compressed and uncompressed data.

Let us start off with create clustered index on a table so there is no fragmentation to begin with. Now the following operations will cause fragmentation

· Delete Operaton: This means the pages are not as full as they can be. The delete operation will cause fragmentation both with/without data compression but the impact will be lower with data compression. For example, if you assume 50% compression (i.e. compressed data row is ½ the size of the uncompressed row) for an average uncompressed rowsize of 200 bytes. Now if we delete 10 rows on a page, the uncompressed page will have 2000 bytes of free space (i.e. 25% additional fragmentation within the page) while the compressed page will have 1000 free bytes (i.e. 12.5% additional fragmentation within the page)

· Insert Operation: when a row is inserted, the SQL Server will find the appropriate page to insert the row. If there is space in the page, the row will be inserted without additional fragmentation. However, if there is not sufficient space on the page to store the row, it will cause a page split which will potentially lead to fragmentation. You should see similar fragmentation with/without data compression

· Update Operation: An update operation can cause a page split if the update is causes rowsize to increase, a common case when updating variable length columns. As you may recall, both with ROW and PAGE compression, we store fixed length data type as variable length. So if you are changing the value of a say an integer column from 1 to 100000, it will cause the rowsize to increase from 1 byte to 3 bytes. If there was no space on the page, it will lead to page split thereby causing potential fragmentation. However, if you had not compressed the data, then fixed length columns take the maximum space possible in the row so such updates will not cause any changes to the size of the row. So if your application changes fixed length column types, you can potentially see additional fragmentation with data compression.

As you can see from the discussion above, the impact of data compression on fragmentation will depend on the schema, data distribution and the application. You can minimize fragmentation by using a lower fill factor but then you table will take more space.