A bit about WinInet’s Index.dat – Q&A

In my previous post I tried to explain a bit about what the index.dat files are and what has changed in the IE7/Windows Vista timeframe. The post got a couple of questions that I'll attempt to answer here.

1) Mike: The real problem behind index.dat is that whether or not the indexes inside are still relevant or not, it keeps named urls forever. … As a user, I want to be able to turn on an hypothetical "auto-delete" of everything either anytime the web browser is restarted, or windows is restarted, or even on a schedule basis.

In IE7 and Windows Vista, deleted entries are actually zeroed out instead of just marked free till overwritten. There shouldn't be any residual URL names lying around if the entry was deleted.
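To make the difference concrete, here is a minimal sketch of the two deletion strategies. It is a toy model only: the `ToyIndex` class, the 64-byte record size and the one-byte in-use flag are all assumptions made up for illustration and do not reflect the real index.dat layout.

```python
RECORD_SIZE = 64        # hypothetical fixed record size, not the real format
IN_USE, FREE = 1, 0     # hypothetical one-byte flag at the start of a record


class ToyIndex:
    """Toy fixed-slot index: each record is a flag byte plus a URL."""

    def __init__(self, slots):
        self.slots = slots
        self.data = bytearray(RECORD_SIZE * slots)

    def add(self, url):
        raw = url.encode("ascii")
        if len(raw) > RECORD_SIZE - 1:
            raise ValueError("URL too long for a toy record")
        for i in range(self.slots):
            off = i * RECORD_SIZE
            if self.data[off] == FREE:
                record = bytes([IN_USE]) + raw
                # Overwrite the whole slot, padding the tail with zeros.
                self.data[off:off + RECORD_SIZE] = record.ljust(RECORD_SIZE, b"\0")
                return i
        raise RuntimeError("index full")

    def delete_mark_free(self, slot):
        # Older behaviour: only flip the flag; the URL bytes stay in the
        # file until some later entry happens to reuse the slot.
        self.data[slot * RECORD_SIZE] = FREE

    def delete_zero(self, slot):
        # IE7/Vista-style behaviour: wipe the whole record.
        off = slot * RECORD_SIZE
        self.data[off:off + RECORD_SIZE] = bytes(RECORD_SIZE)

    def raw_contains(self, url):
        # What someone scanning the raw file for a URL would find.
        return url.encode("ascii") in self.data


url = "http://example.com/private"

old_style = ToyIndex(4)
old_style.delete_mark_free(old_style.add(url))
print(old_style.raw_contains(url))   # True: the name is still in the bytes

new_style = ToyIndex(4)
new_style.delete_zero(new_style.add(url))
print(new_style.raw_contains(url))   # False: nothing left to find
```

In both cases the slot becomes reusable; the only difference is whether the old URL can still be recovered from the raw bytes in the meantime.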

As for "Auto-Delete", we have very limited support for this right now. I believe there is an option to delete the content cache on IE exit, but nothing that allows cleaning up everything. I suggest that you vote for this suggestion and this suggestion, and if they don't cover what you are looking for, you can file your own on the IE feedback site and/or the WNDP Feedback site.

2) Jorrit: Why is the directory still called IE5?

The cache directory is called Content.IE5 because we haven't changed the file format since IE5. The same is true for a lot of the registry settings. We have been tempted to change the file format a couple of times, but there is always the cost of having to upgrade and potentially downgrade (on uninstall), and the benefits of changing the file format haven't been worth that cost. So why not update the name but leave the file format alone? It is a smaller cost (yes, there is a cost: we would have to move the files on upgrade and downgrade), but there is no apparent benefit, and the point of the AppData directory is to not be user facing.

3) Nicholas: Also, I remember there being issues with the index.dat becoming corrupted or full. This is what leads to the "right click-> view source -> nothing happens." bug, right?

Full is correct. Another manifestation of the same issue was "Save Image" wanting to save a .bmp file. A number of features in IE and other programs expect a file behind the cache entry; when the file isn't there, they have to try something else. In the image case they give you the in-memory bitmap version. With view source, they just give up. The root cause is that the index.dat file format can only hold so many entries, and when that limited resource ran out, we weren't firing off the scavenger code to try to free up some space. This has been fixed in IE7 and Windows Vista, and maybe even in IE6, but you'll have to ask the IE team about that (lazy Ari…).
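The failure mode described above can be sketched with a toy model. Everything here is hypothetical and chosen only for illustration: the `ToyCache` class, the oldest-first eviction policy, the fraction scavenged, and the `untitled.bmp` fallback name. The real scavenger's policy is not described in this post.

```python
from collections import OrderedDict


class ToyCache:
    """Toy bounded cache index: URL -> backing file name, oldest first."""

    def __init__(self, capacity, scavenge_when_full):
        self.capacity = capacity
        self.scavenge_when_full = scavenge_when_full
        self.entries = OrderedDict()

    def scavenge(self, fraction=0.5):
        # Assumed policy: evict the oldest entries, at least one of them.
        count = max(1, int(len(self.entries) * fraction))
        for _ in range(count):
            self.entries.popitem(last=False)

    def add(self, url, file_name):
        if url not in self.entries and len(self.entries) >= self.capacity:
            if not self.scavenge_when_full:
                return False      # the bug: no entry, so no file behind the URL
            self.scavenge()       # the fix: free some slots, then store
        self.entries[url] = file_name
        return True

    def backing_file(self, url):
        return self.entries.get(url)


def view_source(cache, url):
    # View source needs the cached file; without one it just gives up.
    return cache.backing_file(url)


def save_image(cache, url):
    # Save Image falls back to the in-memory bitmap when no file exists.
    return cache.backing_file(url) or "untitled.bmp"


buggy = ToyCache(capacity=2, scavenge_when_full=False)
fixed = ToyCache(capacity=2, scavenge_when_full=True)
for cache in (buggy, fixed):
    cache.add("http://example.com/a", "a.htm")
    cache.add("http://example.com/b", "b.htm")
    cache.add("http://example.com/logo.png", "logo.png")

print(view_source(buggy, "http://example.com/logo.png"))  # None
print(save_image(buggy, "http://example.com/logo.png"))   # untitled.bmp
print(save_image(fixed, "http://example.com/logo.png"))   # logo.png
```

With scavenging enabled, the oldest entry is evicted to make room, so the newest page always has a file behind it.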

   -- Ari Pernick