HTTPS Caching and Internet Explorer

From time to time, I get questions about how Internet Explorer caches HTTPS-delivered content.

It comes as a surprise to many that, by default, all versions of Internet Explorer will cache HTTPS content so long as the caching headers allow it. If a resource is sent with a Cache-Control: max-age=600 directive, for instance, IE will cache the resource for ten minutes. The use of HTTPS alone has no impact on whether or not IE decides to cache a resource. (Non-IE browsers may have different default behavior for caching of HTTPS content, depending on which version you’re using, so I won’t be talking about them.)
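To make the max-age arithmetic concrete, here is a minimal Python sketch that pulls the max-age value out of a Cache-Control header and converts it to minutes. The function name max_age_seconds is made up for illustration; this is not how IE or WinINET parses headers, and it ignores other directives (such as no-store) that would also affect caching.

```python
def max_age_seconds(cache_control):
    """Return the max-age value in seconds, or None if it is absent or malformed."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age":
            value = value.strip().strip('"')
            if value.isdigit():
                return int(value)
    return None


seconds = max_age_seconds("max-age=600")
minutes = seconds / 60  # 600 seconds is ten minutes
print(minutes)
```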

Now, having said that, there are a few caveats to be aware of.

Users May Disable HTTPS Caching

First, the Internet Options control panel’s Advanced tab includes an option “Do not save encrypted pages to disk.” When this option is set, caching directives from the server are ignored, and HTTPS-delivered resources will not be cached for reuse, even during the same browser session.

This option is unticked by default on all client versions of Windows, but is ticked by default on the Windows Server versions. Hence, most of your site’s visitors will not have this option set. However, the option is occasionally set by Group Policy as an attack-surface-reduction measure, although the side-effects of doing so can be pretty dire. The problem is that this option makes SSL-delivered downloads uncacheable by default, and that can lead to the dreaded “Internet Explorer cannot download” dialog. Rather than use this option, I recommend you use BitLocker Drive Encryption and/or use the Delete Browser History on Exit option.

WinINET's Per-Process Artifact

The second caveat to be aware of relates to an obscure architectural artifact in WinINET/IE. WinINET will not reuse a previously-cached resource delivered over HTTPS until at least one secure connection to the target host has been established by the current process[1]. This can cause previously-cached resources to be ignored, which increases the number of network requests.

Some examples are in order.

Test Page #1 is a page at https://www.bayden.com which contains 5 embedded images. Four of the five images on the page are marked as cacheable. The cacheable images come from https://www.bayden.com, https://bayden.com, and https://login.live.com.

  • When you first visit this page, a network inspector will show that there are 6 requests: one for the page itself, and five for the images.
  • If you subsequently push the “Reload!” button, you will see just 2 requests: one for the page, and 1 for the only image marked not cacheable.
  • If you close IE and reopen it to the test page, you will find there are 4 unconditional requests: 1 for the page, 1 for the uncacheable image, 1 for the https://login.live.com image and 1 for the https://bayden.com image. The two images from https://www.bayden.com are not re-downloaded, because the previously-cached resources were reused.

Internet Explorer had already established a secure connection to https://www.bayden.com during this browser process' lifetime, when it downloaded the test page itself. On the other hand, https://login.live.com and https://bayden.com had not been contacted by this browser process, so their cached resources were ignored.
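The request counts for Test Page #1 can be reproduced with a toy Python model of the rule described above. This is only a sketch of the observed behavior, not WinINET's actual implementation: the class name, the image paths, and the host of the uncacheable image are all assumptions made for illustration. It also treats requests as if they were issued one at a time, so it does not capture any timing effects between parallel requests.

```python
from urllib.parse import urlsplit


class BrowserProcess:
    """Toy model: one browser process in front of a shared on-disk cache."""

    def __init__(self, disk_cache):
        self.disk_cache = disk_cache   # set of cached URLs; survives restarts
        self.contacted_hosts = set()   # secure hosts contacted by THIS process

    def request(self, url, cacheable):
        """Return True if the request goes to the network."""
        host = urlsplit(url).hostname
        if url in self.disk_cache and host in self.contacted_hosts:
            return False               # cached copy is reused
        self.contacted_hosts.add(host)
        if cacheable:
            self.disk_cache.add(url)
        return True

    def load(self, page_url, images):
        """Request the page, then each image; return the URLs sent to the network."""
        sent = [page_url] if self.request(page_url, cacheable=False) else []
        sent += [u for u, cacheable in images.items() if self.request(u, cacheable)]
        return sent


# Hypothetical URLs standing in for Test Page #1 and its five images.
PAGE = "https://www.bayden.com/test/page.htm"
IMAGES = {
    "https://www.bayden.com/a.png": True,
    "https://www.bayden.com/b.png": True,
    "https://bayden.com/c.png": True,
    "https://login.live.com/d.png": True,
    "https://www.bayden.com/e.png": False,  # the uncacheable image (host assumed)
}

disk_cache = set()
first_process = BrowserProcess(disk_cache)
first_visit = len(first_process.load(PAGE, IMAGES))
reload_visit = len(first_process.load(PAGE, IMAGES))
after_restart = len(BrowserProcess(disk_cache).load(PAGE, IMAGES))
print(first_visit, reload_visit, after_restart)  # 6 2 4
```

The restart is modeled by creating a new BrowserProcess over the same disk_cache: the cached files are still there, but the set of contacted hosts starts out empty.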

As you can see from this example, the simplest strategy to avoid the WinINET artifact is to ensure that if you have a page at https://www.example.com, it only tries to download cacheable resources from itself (https://www.example.com). Since a secure connection will be established when downloading the page itself, before any of its resources are downloaded, you’re guaranteed that the cache will not be bypassed.
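If you want to check a page against this strategy, a small script can flag the resources that live on a different HTTPS host than the page. The following Python sketch assumes you already have the list of resource URLs; the function name and the example.com URLs are placeholders. It compares hostnames only, so https://example.com and https://www.example.com count as different hosts, just as bayden.com and www.bayden.com did above.

```python
from urllib.parse import urlsplit


def cross_host_resources(page_url, resource_urls):
    """Return the HTTPS resource URLs whose host differs from the page's host."""
    page_host = urlsplit(page_url).hostname
    flagged = []
    for url in resource_urls:
        parts = urlsplit(url)
        # Relative URLs have no scheme and resolve to the page's own host.
        if parts.scheme == "https" and parts.hostname != page_host:
            flagged.append(url)
    return flagged


flagged = cross_host_resources(
    "https://www.example.com/index.htm",
    [
        "https://www.example.com/logo.png",
        "/styles/site.css",
        "https://example.com/banner.png",
        "https://login.live.com/badge.png",
    ],
)
print(flagged)
```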

Test Page #2 is the same page at a different host, https://www.enhanceie.com, and it contains the same 5 embedded images.

  • First close your browser.
  • Visit Test Page #2 and you will again see that there are 6 requests: one for the page, and five for the images.
  • Again, if you push the “Reload!” button, you will see just 2 requests: one for the page, and 1 for the only image marked not cacheable.
  • If you close IE and restart it, you will find there are 6 unconditional requests: 1 for the page, and 5 for the images.

Because this Internet Explorer process hasn’t previously contacted any of the secure servers, it will bypass the cache for all of the images.

Now, at this point, alert readers will shout “Hey, wait a second!”

They’ll note: “Two of those images came from the same secure server. So shouldn’t the second image get pulled from the cache because the first image resulted in the server getting contacted?” And sadly, the answer here is “Not in this case.” The problem is that the two network requests are sent in parallel, so the cache has already been bypassed for the second request before the first request’s secure connection to the server has been established. In this surprising example, parallelism in the network stack leads to slower overall performance.

Having said that, sequencing your requests can act as a workaround for the cache-bypassing behavior. If you construct your page such that it doesn’t make subsequent network requests to a cross-domain HTTPS server until the first response is returned, you will find that cached resources are reused for all subsequent requests to that HTTPS server for the life of the browser process.
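The difference between sending everything in parallel and waiting for the first response can be sketched with a small Python simulation. This is a simplified model under stated assumptions, not a description of WinINET internals: it assumes every request in a parallel batch makes its cache decision before any connection in that batch is established, and the host cdn.example.com and the image names are hypothetical.

```python
def count_network_requests(batches, cached_urls):
    """Count requests that hit the network in a freshly started browser process.

    batches is a list of lists of (host, url) pairs. Requests within one
    batch are sent in parallel; a batch starts only after the previous
    batch has completed.
    """
    contacted = set()  # hosts with an established secure connection
    hits = 0
    for batch in batches:
        contacted_before = set(contacted)  # all a parallel request can see
        for host, url in batch:
            if url in cached_urls and host in contacted_before:
                continue  # cached copy is reused
            hits += 1
            contacted.add(host)
    return hits


HOST = "cdn.example.com"
images = [(HOST, f"https://{HOST}/{name}.png") for name in ("a", "b", "c")]
cached = {url for _, url in images}  # all three are already in the cache

all_parallel = count_network_requests([images], cached)
first_alone = count_network_requests([images[:1], images[1:]], cached)
print(all_parallel, first_alone)  # 3 1
```

In this model, letting a single request to the cross-domain host complete first costs one network request, and the remaining cached images are then reused.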

7/14/2010 Update: The situation is improved in IE9; IE9 will make conditional HTTPS requests instead of unconditional requests. Additionally, the problems mentioned in “Internet Explorer cannot download” have been resolved so long as the Tools > Internet Options > Advanced > "Do not save encrypted pages to disk" option is not set.

6/1/2012 Update: The situation is improved (basically resolved) in IE10. IE10 does not force a connection to the origin HTTPS server for sub-downloads if the sub-downloads are in the cache and within their freshness lifetime.

Until next time,

-Eric

[1] It doesn't matter whether that first secure connection is still alive; it's enough that it was ever established during the lifetime of the current process.