Built-in functions for compression/decompression in SQL Server 2016

SQL Server 2016 provides built in functions for compression and decompression:

  • COMPRESS - compress data using GZip algorithm and returns binary data.
  • DECOMPRESS - decompress binary data using GZip algorithm and returns binary data. You will need to cast binary data to text if you have originally compressed text.

These functions use standard GZip algorithm so you can compress data in application layer/client side and send compressed data to SQL Server, or you can compress data in SELECT query and DECOMPRESS it in client side.

In this example I will create standard table with binary data that will contain compressed content:

 CREATE TABLE People (
 _id int primary key identity,
 name nvarchar(max),
 surname nvarchar(max),
 info varbinary(max)
)

Now, we can directly load compressed information into this column.

 INSERT INTO People (name, surname, info)
 SELECT FirstName, LastName, COMPRESS(AdditionalInfo) FROM Person.Person
 

We can return compressed data directly to client who can decompress it, or we can decompress data in query:

 SELECT name, surname, DECOMPRESS(info) AS original
 FROM People

As an alternative, we can add computed column (non-persisted) that dynamically decompress data:

 ALTER TABLE People
 ADD info_text as CAST( DECOMPRESS(info) AS NVARCHAR(MAX))

What is compression rate?

You can try it - just create any text, compress it and see the ratio:

 declare @text nvarchar(max) = (select top 100 * from sys.all_objects for json path)
select DATALENGTH(@text) AS original, DATALENGTH(COMPRESS(@text)) AS compressed, 1.0 * DATALENGTH(@text) / DATALENGTH(COMPRESS(@text)) AS ratio 

You can change the number of rows that will be formatted as JSON to change size of text.

With these functions you can choose what data should be compressed in your tables. Compression may help you if you have text data, but if you have already compressed binary content (jpeg, pdf) you might not have good compression ratio and you will just spend CPU cycles.