A bit about WinInet’s Index.dat – Q&A

In my previous post I tried to explain a bit about what the index.dat files are and what has changed in the IE7/Windows Vista timeframe. The post got a couple of questions that I’ll attempt to answer here.

1) Mike: The real problem behind index.dat is that whether or not the indexes inside are still relevant or not, it keeps named urls forever. … As a user, I want to be able to turn on an hypothetical “auto-delete” of everything either anytime the web browser is restarted, or windows is restarted, or even on a schedule basis.

In IE7 and Windows Vista, deleted entries are actually zeroed out instead of just being marked free until they are overwritten. There shouldn’t be any residual URL names lying around once an entry has been deleted.
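To make the difference concrete, here is a minimal Python sketch of the two deletion strategies. It is a toy model only: the 64-byte slot size, the flat slot layout, and the helper names are all assumptions made for illustration and do not reflect the real index.dat format.

```python
# Toy model only: RECORD_SIZE, the slot layout and every helper name here
# are hypothetical and do not reflect the real index.dat file format.
RECORD_SIZE = 64


def make_index(urls):
    """Pack each URL into a fixed-size, zero-padded slot plus an in-use flag."""
    data = bytearray()
    for url in urls:
        raw = url.encode("ascii")
        if len(raw) > RECORD_SIZE:
            raise ValueError("URL does not fit in one toy slot")
        data += raw.ljust(RECORD_SIZE, b"\0")
    return data, [True] * len(urls)


def delete_mark_free(data, in_use, slot):
    """Older behaviour: flag the slot as free and leave its bytes alone."""
    in_use[slot] = False


def delete_zero_out(data, in_use, slot):
    """IE7/Vista-style behaviour: flag the slot as free and wipe its bytes."""
    in_use[slot] = False
    start = slot * RECORD_SIZE
    data[start:start + RECORD_SIZE] = bytes(RECORD_SIZE)


data, in_use = make_index(["http://example.com/a", "http://example.com/b"])
delete_mark_free(data, in_use, 0)
delete_zero_out(data, in_use, 1)
print(b"example.com/a" in data)  # True: the URL text is still in the buffer
print(b"example.com/b" in data)  # False: the slot was overwritten with zeros
```

In the mark-free case the URL survives until something else happens to reuse that slot, which is the residue Mike is worried about.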

As for “Auto-Delete”, we have very limited support for this right now. I believe there is an option to delete the content cache on IE exit, but nothing that allows cleaning up all the stuff. I suggest that you vote for this suggestion and this suggestion, and if they don’t cover what you are looking for, you can file your own on the IE feedback site and/or the WNDP Feedback site.

2) Jorrit: Why is the directory still called IE5?

The cache directory is called Content.IE5 because we haven’t changed the file format since IE5. The same is true for a lot of the registry settings. We have been tempted to change the file format a couple of times, but there is always the cost of having to upgrade and potentially downgrade (on uninstall), and the benefits of changing the file format haven’t been worth the cost. So why not update the name but leave the file format alone? It is a smaller cost (yes, there is a cost: we would have to move the files on upgrade and downgrade), but there is no apparent benefit, and the point of the AppData directory is to not be user facing.

3) Nicholas: Also, I remember there being issues with the index.dat becoming corrupted or full. This is what leads to the “right click-> view source -> nothing happens.” bug, right?

Full is correct. Another manifestation of the same issue was “Save Image” wanting to save a .bmp file. A number of features in IE and other programs expect a file behind the cache entry; when the file isn’t there, they have to try something else. In the image case, they give you the in-memory bitmap version. With view source, they just give up. The root cause is that the index.dat file format can hold only so many entries, and when this limited resource ran out, we weren’t firing off the scavenger code to try to clear up some space. This has been fixed in IE7 and Windows Vista and maybe even in IE6, but you’ll have to ask the IE team about that (lazy Ari…).
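The failure mode can be illustrated with a small Python sketch of a fixed-capacity index. Everything in it is hypothetical: the class name, the capacity of two entries, and the oldest-first eviction are stand-ins chosen for illustration, not the actual scavenger policy or the real index.dat logic.

```python
from collections import OrderedDict


class ToyCacheIndex:
    """Hypothetical fixed-capacity URL -> file map (not the real index.dat)."""

    def __init__(self, capacity, scavenge_on_full=True):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.scavenge_on_full = scavenge_on_full
        self.entries = OrderedDict()  # oldest entry first

    def scavenge(self, count=1):
        """Evict the oldest entries to free slots (an assumed policy)."""
        for _ in range(min(count, len(self.entries))):
            self.entries.popitem(last=False)

    def add(self, url, path):
        """Return True if an index entry was created for the URL."""
        if url not in self.entries and len(self.entries) >= self.capacity:
            if not self.scavenge_on_full:
                return False  # index full: no entry, so no file to hand out
            self.scavenge()
        self.entries[url] = path
        self.entries.move_to_end(url)
        return True

    def lookup(self, url):
        """Return the cached file path, or None if there is no entry."""
        return self.entries.get(url)


old = ToyCacheIndex(capacity=2, scavenge_on_full=False)
new = ToyCacheIndex(capacity=2)
for i in range(3):
    old.add(f"http://example.com/{i}", f"file{i}")
    new.add(f"http://example.com/{i}", f"file{i}")
print(old.lookup("http://example.com/2"))  # None
print(new.lookup("http://example.com/2"))  # file2
```

With scavenging disabled, the third URL never gets an entry, so a feature asking for its backing file (view source, for example) has nothing to work with. With scavenging enabled, the oldest entry is dropped to make room.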

   — Ari Pernick

Comments (11)

  1. Nicholas says:

    Thanks. Looks like I’ll have to keep CacheSentry around a bit longer, then 🙂

  2. The mysterious history file.

  3. Stu says:

    So if IE will still work (mostly) with a full or corrupted index.dat and the few issues that present themselves should be fixable, why is it needed?

  4. Dean Harding says:

    Stu: index.dat is the file which tells IE how to map a URI to the actual file on disk. IE7/Vista works by removing old items from the cache when the index.dat fills up – and thereby also freeing up space in the index.dat.

  5. Ffoeg says:

    How about I don’t want a file on my disk to begin with? That’s why I don’t use IE. I’d like to know how to run FreeBSD but unfortunately I don’t really have the time to learn it right now, so I use Firefox.

  6. wndpteam says:

    IIRC Firefox has a way to clear out your cache/cookies/etc. on browser close, but I don’t believe it has a feature to avoid writing a file to begin with. It is unlikely that IE will be able to have such a feature in the future without breaking a large number of important plugins that expect to work with files instead of data streams.

  7. ghoch says:

    Why change your favorite browser when you can just delete from index.dat file any information you need? Even you can delete information regarding a particular website. Say try History Killer Pro – http://www.historykillerpro.com

  8. Hari says:

    There is a new index.dat file in the PrivacIE folder in IE8. Can anyone explain?

  9. Hugo says:

    Hi There,

    Is it possible to forcibly start the cache scavenger programmatically (C++)? I can find lots of references to the scavenger but nothing that says it’s possible to manually start it when desired, only that it runs intermittently.



  10. oscar says:

    Hi, I captured a memory dump with the tool DumpIt and then analyzed it with the tool HBGary Responder, and I found .cn URLs under the internet history. The problem is that I took the memory dump on a recently installed OS, without using the internet. These are the URLs I found:

    Then, after installing the AV, I found more weird .cn URLs:

    Can you help me understand why those URLs appear even though they are not linked to the IE process?