Compressing the Web

Article
10/21/2014

Be succinct.

Virtually any network-based application can be made faster by optimizing the number of bytes transferred across the network. Taking advantage of caching is a great way to minimize transfer sizes, but just as important is to reduce the size of the resources you transfer.

Data compression is used throughout the protocols and formats used by browsers, and today’s post is a summary of where you can find compression in use and how you can optimize it to improve the performance of your sites and services.

Compression Categories

Conceptually, compression is simple: recognize patterns in data and reduce repetition to minimize the size of the data. There are two major categories of compression: lossy and lossless.

When information is compressed with lossy compression, reversing the process (decompression) doesn’t result in the original data, but instead a surrogate which resembles the original either closely or loosely depending on the quality of the compression. In browsers, the primary uses of lossy compression are in the JPEG image file format, the MP3 audio file format, and the MPEG video file format. These formats utilize understanding of human perception to drop details that are unlikely to be noticed by the viewer/listener, resulting in often-huge savings in data size. While important, this post won’t discuss lossy compression any further.

In contrast, decompressing data that was losslessly compressed results in an exact copy of the original data—every byte is identical. Lossless compression is used throughout the web platform, in both the HTTP layer and internally within many web file formats including PNG, PDF, SWF, WOFF, and many more. The remainder of this post explores the use and optimization of lossless compression.

HTTP Headers

HTTP requests advertise the decompression algorithms the client supports using the Accept-Encoding request header.

Servers indicate which compression algorithm was used for a given response using the Content-Encoding response header. Unfortunately, even the updated HTTP/1.1 specification implies that servers can use compression-related tokens in the Transfer-Encodingresponse header, but indicating compression using the Transfer-Encoding header does not work in any major browser (all support only the chunked token).

If the server utilizes compression, it should generally include a Vary: Accept-Encoding response header to help ensure that a cache does not mistakenly serve a compressed response to a client that cannot understand it.

Algorithms

The most popular compression algorithm in use on the web is the DEFLATE algorithm, specified in RFC 1951. DEFLATE combines the LZ77 algorithm with Huffman encoding; it is straightforward to implement and effectively compresses a wide variety of data types. You can read a straightforward explanation of the algorithm here, and watch a fun video showing how repeated byte sequences are referenced here.

In browsers, the most obvious use of the DEFLATE algorithm is for HTTP/1.1 Content-Encoding. DEFLATE is the algorithm underlying two of the three encodings defined in the HTTP specification (“Content-Encoding: gzip” and “Content-Encoding: deflate”).

DEFLATE vs. GZIP for Content-Encoding?

Given that virtually all clients allow both “Content-Encoding: gzip” and “Content-Encoding: deflate”, which should you use?

You’ll find conflicting opinions on the topic, many of which are written without the understanding that both encodings are based on exactly the same algorithm, and the encodings differ only in the header and trailer bytes that wrap the compressed data.

The GZIP encoding of a resource starts with two magic bytes 0x1F 0x8B, a byte representing the compression method (0x08, indicating DEFLATE), seven additional bytes of metadata, several optional fields (rarely used), and then the DEFLATE-compressed bytes. After the compressed data, a 32bit CRC-32 and 32bit original datasize field complete the format.

Despite its name, the DEFLATE encoding of a resource is specified to be the ZLIB data format. This format consists of two header bytes; the first contains the compression method and window-size information (the low bits of the byte are 0x08, indicating DEFLATE). The second header byte is a flag byte which indicates the compression strategy (minimize size vs. maximize speed) and serves as a checksum of the header bytes. The DEFLATE-compressed bytes follow. After the compressed data, a 32-bit ADLER32 checksum field completes the format.

So, all other things being equal,a C-E:GZIP-encoded resource will be exactly 11 bytes bigger than a C-E:DEFLATE-encoded resource, due to the seven additional header bytes and four additional trailer bytes. In practice, however, your compression program may generate wildly different results.

Now, note that I said the encoding is specified to be the ZLIB data format. Here’s what you’ll see in Internet Explorer if you try to load a page which has the ZLIB-wrapper around the DEFLATE data:

The problem is that Internet Explorer expects a bare DEFLATE stream without the ZLIB wrapper bytes. So, does that mean you can’t use Content-Encoding: deflate? Not really, no. In addition to supporting the proper ZLIB-wrapped data, Chrome, Firefox, Safari, and Opera all support bare DEFLATE.

Some folks recommend you avoid C-E:Deflate and just stick with C-E:GZIP. It’s hard to argue with that approach.

Optimizing DEFLATE

No matter which DEFLATE-based Content-Encoding you choose, the quality of the implementation underlying DEFLATE implementation is key to reducing the size of the data stream. Many developers mistakenly assume that “DEFLATE is DEFLATE”, and any compressor that implements the algorithm will get the same result. This is a mistake.

Instead, think of DEFLATE’ing as solving a maze. A really really complicated maze. You might run around as fast as you can and eventually stumble upon an exit, but that’s very different than finding the optimal path. If you instead spent much longer, frequently retracing your steps to find out whether there’s a shortcut you missed, you may find a much shorter path. DEFLATE works much the same way—you can expend more resources (CPU time and memory) in the compression process finding the optimal compression choices. The best part is that expending resources to optimize DEFLATE compression doesn’t typically increase decompression time—uncompressing DEFLATEd content is a comparatively straightforward process.

For static resources that will be reused often (think of jQuery.js and other frameworks) you should be delighted to trade a one-time compression cost for a millions-of-times transfer size savings. Fortunately, folks at Google have done the hard work of making a great DEFLATE implementation, called Zopfli (pronounced "zopflee"); you can read more about it and its real-world savings over here. Unfortunately, far too few teams have integrated Zopfli into their workflow; even Google hasn’t gotten around to using it for most of their resources… yet.

If Microsoft were to use Zopfli when building the browser-based versions of Office, they’d see significant savings; three of their largest files shrink by just over 4% each:

File	Original Size	Served (gzip) Size	Zopfli Size	Savings
WordViewer.js	661,771	171,189	164,272	4%
Ewa.js	729,547	202,400	193,822	4.2%
WordEditor.js	1,876,196	482,125	462,147	4.1%

Of course, you won’t always be able to use Zopfli to compress your resources—the compression tradeoff isn’t appropriate for dynamically-generated responses which will only be served once. But you can use it for more than just your HTML, CSS, and JS. Read on for more details.

Exotic Encodings

One of the first criticisms of Zopfli was “it seems like an awful lot of effort for a small improvement. Perhaps it’s time to add a better compression method.” And this criticism is valid to a certain extent.

bzip2

For instance, the first versions of Google Chrome added support for bzip2 as a HTTP Content-Encoding, because (quoting a Google engineer) “well, we had the code laying around." Bzip2 yields much better compression than even Zopfli-optimized DEFLATE:

File	Zopfli Size	BZIP2 Size	Savings
WordViewer.js	164,272	139,393	15.1%
Ewa.js	193,822	163,204	15.8%
WordEditor.js	462,147	396,897	14.1%

And bzip2 doesn’t even offer the highest compression ratios: lzma2 showed even better results when we looked at it back in the IE9 timeframe.

The challenge with more advanced compression schemes is that there are billions of clients that support DEFLATE-based schemes, and as we’ve seen previously, most implementations haven’t yet optimized their use of DEFLATE. Making matters worse are intermediaries: Google reportedly encountered network proxies and security applications that would corrupt bzip2-encoded traffic.

Xpress

Some Microsoft products (Exchange, Software Update Services, etc) use Content-Encoding: xpress, a scheme based on LZ77 (like DEFLATE), optimized for compression speed. Xpress encoding is not used by Internet Explorer, WinINET, WinHTTP, or System.NET, the dominant HTTP implementations on Windows. When run on Windows 8 or later, Fiddler can decompress Xpress-encoded content using the native libraries.

SCDH

Back in 2008, Google proposed the SCDH (Shared Dictionary Compression over HTTP) Content-Encoding, and support has subsequently been added to Chrome and Android. Despite offering large potential savings, this lossless compression scheme is not broadly used and is comparatively complex to implement. It also has privacy implications.

Compress

The only non-DEFLATE-based HTTP-specification defined encoding (“Content-Encoding: compress”) is based on the LZW compression algorithm. Compress is not broadly supported (it doesn’t work in any major browser) and a quick test suggests that it’s not even as effective as a basic DEFLATE encoder.

What Gets Compressed

In general, HTTP Content-Encoding based compression applies to only the response body. This means that HTTP headers are not compressed, a significant shortcoming in the compression scheme. For instance, I recently observed Facebook delivering a 49 byte GIF file with 1050 bytes of HTTP headers, an overhead of over 2000% for a single file. Similarly, Content-Encoding is rarely applied to request bodies.

TLS Compression

When negotiating a HTTPS connection, the client may indicate support for automatic compression of all data on the connection, which has the benefit of applying to both headers and bodies of both the request and the response. TLS-based compression was not broadly implemented (Microsoft products have never supported it) and in 2012 it was disabled entirely by major products to address an exploit called CRIME (Compression Ratio Info-leak Made Easy).

HTTP/2 Compression

The HTTP/2 draft 14 enables header compression using an algorithm called HPACK, designed to combat the CRIME exploit against the SPDY protocol (which used DEFLATE to compress header fields). Draft 12 of the specification removed per-frame GZIP compression for data; you’ll still use Content-Encoding.

Compressing WebSocket Data

Check out https://www.igvita.com/2013/11/27/configuring-and-optimizing-websocket-compression/ for discussion of a mechanism to use DEFLATE to compress data sent over WebSockets.

Compressing Request Bodies

In theory, HTTP allows clients to compress request bodies using the same Content-Encoding mechanism used for HTTP responses. In practice, however, this feature is not used by browsers and is only rarely used by other types of HTTP clients.

One problem is that a client does not know, a priori, whether a server accepts compressed requests. In contrast, a server knows whether a client accepts compressed responses by examining the Accept-Encoding request header. Many servers do not accept compressed requests, although some have no objection (e.g. onedrive.live.com allows them). One consequence of this is that HTML forms do not (yet?) expose any way for authors to specify that compression of the request body should be undertaken.

Modern web applications can workaround the shortcomings in browser APIs with script:

Use HTML5 File API to load a file for upload into JavaScript
Compress it using a script-based compression engine (of many, Pako and compressjs look nice)
Upload the compressed array to the server.

You can use Fiddler to see whether your server accepts request bodies with Content-Encoding; click Rules > Customize Rules. Scroll to OnBeforeRequest and add the following code inside the function:

if (oSession.HTTPMethodIs("POST"))
{
oSession["ui-backcolor"] = "yellow";
oSession.utilGZIPRequest();
}

If the server accepts the upload, you should see a normal HTTP/200 response (although you may wish to ensure that the server properly recognized the compression and didn’t just treat it like binary garbage). If the server doesn’t accept compressed uploads, you will likely see a HTTP/400 or HTTP/500 error code.

Compression Bombs

One reason that servers might be reluctant to support compressed uploads is the fear of “Compression bombs”. The DEFLATE algorithm allows a peak compression ratio approaching 1032 to 1, so a one megabyte upload can explode to 1 gigabyte. Such attacks are generally possible against client applications like browsers, but tend to be much more potent against servers, where a single CPU is required to serve thousands of users simultaneously.

Protecting against maliciously crafted compression streams requires additional diligence on the part of implementers.

Best Practice: Minify, then Compress

Because DEFLATE yields such great reductions in size, you might be thinking “I don’t need to minify my assets. I can just serve them compressed.”

From a networking point of view, you’re mostly correct: the bytes-on-wire for Minify+DEFLATE will likely be nearly the same as DEFLATE alone. However, there are other concerns:

Memory: The browser must decompress your stylesheets, script, and HTML to strings, so a 50kb script file, compressed to 5kb, still uses 50kb of memory after it has been decompressed. If you can minify the script to 40kb before compressing, you can save at least 10kb of memory on the client (and likely more). This is particularly important on memory-constrained mobile devices.

Disk Cache: Some browsers store compressed responses in the cache in compressed form (and Firefox will even proactively compress uncompressed responses). However, Internet Explorer removes all Content-Encoding when writing to the cache file, meaning that your 50kb script file will occupy 50kb of the client’s cache. Smaller files will reload faster and are less enticing candidates for eviction when cache space is reclaimed.

File Formats

As mentioned previously, many file formats internally use DEFLATE compression; a great example is the PNG image format. Unfortunately, many generators of those formats opt for minimum compression time, failing to achieve maximum compression.

Fiddler now includes PNGDistill, a simple tool that removes unnecessary metadata chunks and recompresses image data streams using Zopfli. Zopfli-based recompression often shaves 10% or more from the size of PNG files. You can access this tool from the context menu of the ImageView inspector:

The “Minify-then-compress” Best Practice applies to image file types as well. While large fields of empty pixels compress really well, the browser must decompress those fields back into memory. So, if you're building a sprite image with all of your site's icons, don't leave a huge empty area in it.

Beyond PNG, the WOFF font format also uses DEFLATE. Google recently shrank their WOFF font library by an average of 6% by reencoding their internal DEFLATE streams using Zopfli. Not content to stop there, Google has also introduced the WOFF2 format, which utilizes an innovative new algorithm known as Brotli, yielding size reductions around 20% over WOFF. WOFF2 is currently usable in Chrome, Opera, and Firefox dev builds; see CanIUse.com and status.modern.ie.

Here are a few points of compression-related arcana that have come up over the years…

1. The HTTP decompression implementation used by Internet Explorer 6 and earlier was inside URLMon; the WinINET HTTP stack did not itself support decompression. As a consequence, there was a “dance” whereby WinINET would download the compressed version and ask URLMon to decompress the response in place, replacing the original compressed body. This “dance” was buggy in many corner cases, and led to problems where content could not be decompressed, or Vary headers were ignored, or similar.

In Internet Explorer 7, decompression was moved from URLMon down into WinINET, eliminating the “dance” and slaying many bugs.

2. Unfortunately, the fact that decompression happens so low in the stack causes its own minor problems, like the F12 Developer Tools not being aware of the use of Content-Encoding.

3. Some Apache servers will mistakenly serve .tar.gz files with a Content-Encoding: gzip header, leading the browser to eagerly decompress the downloaded file into a raw tar file with now-misleading filename extension. Firefox works around this issue by ignoring the Content-Encoding header if the response Content-Type is application/x-gzip. Internet Explorer 11 recently and briefly had a regression in its handling of related scenarios.

4. Despite the fact that the GZIP footer includes a CRC32 and OriginalSize field to enable detection of corruption, no browser complains if these fields are incorrect or missing entirely (which makes some sense, insofar as the original requester has already read the entire response before these fields are available). But you should set these correctly, or various tools and middleware might reject your content.

5. Even worse, a small number of servers apparently fail to update the HTTP Content-Length header when compressing the body.

-Eric Lawrence
MVP, Internet Explorer