Taking advantage of the asymmetry of offline compression


Online compression is a difficult balancing act. Time you spend trying to get good compression is time the user could have been using to do something useful. For the same reason, the decompression algorithm needs to be relatively fast as well. And since files support random access, you need to be able to compress and decompress the data at an arbitrary location in the file, which typically means that the file is broken down into chunks that are compressed and decompressed independently.
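To make the chunking idea concrete, here is a minimal Python sketch. It uses `zlib` from the standard library as a stand-in compressor; the 4096-byte chunk size and the names `compress_chunks` and `read_range` are made up for illustration and do not describe how any particular file system does it.

```python
import zlib

CHUNK_SIZE = 4096  # hypothetical chunk size, chosen only for illustration


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk independently of the others."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_range(chunks: list[bytes], offset: int, length: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Read `length` bytes at `offset`, decompressing only the chunks touched."""
    if length <= 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    # Only the chunks overlapping the requested range are decompressed.
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return raw[start:start + length]
```

The price of random access is that each chunk is compressed without knowledge of the others, so redundancy that spans chunk boundaries goes unexploited.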

Windows comes in a large compressed image, and starting in Windows 8.1, systems can be configured so that the "files" on disk are really pointers into the compressed image, which Windows can decompress on the fly as if they were real files. Windows 10 expands this technique to all systems if an assessment determines that the system would benefit from it.

Offline compression with online decompression means that you can spend a lot of time in the compression step, since users never have to sit around waiting for the compression to take place. The compression can take place in a lab on a machine with a large amount of memory and CPU resources available, in order to eke out as much compression goodness as possible; only the compressed results need to be delivered to the user. On the other hand, decompression is still done in real time, so the compression algorithm needs to be one whose output can still be decompressed relatively quickly.
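The asymmetry can be sketched in a few lines of Python. This is only an illustration using `zlib` effort levels, not the algorithm Windows uses; the function names are hypothetical. The point is that the effort knob exists only on the compression side, and the same decompressor handles both outputs.

```python
import zlib


def compress_offline(data: bytes) -> bytes:
    """Spend as much effort as zlib allows; nobody is waiting on this step."""
    return zlib.compress(data, 9)


def compress_online(data: bytes) -> bytes:
    """Favor speed, because the user is waiting while this runs."""
    return zlib.compress(data, 1)


def decompress(blob: bytes) -> bytes:
    """One decompressor serves both; there is no effort level to choose here."""
    return zlib.decompress(blob)
```

How much extra space the high-effort setting saves depends entirely on the data, so measure it on your own files rather than assuming a particular ratio.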

Comments (15)
  1. alegr1 says:

    That’s fancy. Wish the boot from installation media (the progress bar phase) used it.

    1. Rick C says:

      You gotta try installing from USB instead of optical media sometime. That phase is much faster.

      1. Darran Rowe says:

        Yeah, I install Windows using a USB flash drive all the time, it makes the driver installation and the OOBE the slowest part of setup by far.

      2. alegr1 says:

        Most of the time, I’m installing from an ISO image over iDRAC or its equivalent (Raritan). Inside, iDRAC controller is connected through USB 1.1.

  2. IanBoyd says:

    With the size of the Windows installation, there may be some areas where you can have a large, Windows-specific dictionary, looking across tens of gigabytes across thousands of files.

    The copyright and build string common to 750 dlls can populate the master decompression dictionary.

    1. kc0rtez says:

      Lol, we should gather information on how much space copyright stuff takes in Microsoft products just for the sake of curiosity and to make salty remarks on it like “Oh, Microsoft takes 1% of my hard disk just to make sure it can sue me”.

      1. On an ordinary 1 TiB hard disk, 1% is like 10 GiB. That’s 2/3rd of Windows’s size. I count 26,420 .exe and .dll files on my system belonging to Windows. So, those stuff should take around 1,321,000 bytes or 1.26 MiB.

  3. Wow! This blog’s comment is refusing to display my whole comment above! Why? Because it has math calculations in it?

    Tell you what, this blog wins the award for the funniest-acting comment section on the web.

    1. Joshua says:

      Less than and greater than signs most likely.

      1. Very astute! But no. And anyway, unless I use them at the very beginning and the very end, it shouldn’t cause the whole comment to go. Add the fact that I’ve been having problems with the comment system of this certain blog for a long time now. I can’t write multi-paragraph comments while others can. (It joins the paragraphs together.)

        1. Scarlet Manuka says:

          In a recent comment, Raymond noted that the blog hosting software seems to mistrust any commenter without (I think it was) an MSDN login, and he has to manually approve about 90% of comments. So nobody (except the privileged few) should expect to see their comments straight away.

          1. Yup, I have to remember to check in every few hours to approve comments. And when I’m on vacation, it may be days before I get around to it.

          2. Well, that’s not my problem either. I do have an “MSDN login” as you call it. And I do see my comment immediately; albeit, without the body text and without the delete button.

          3. My guess is that comment disappeared because the server saw the < and interpreted it as the start of an HTML tag. When it never saw a matching > it decided to “fix” the problem by throwing the corrupted “tag” away.

  4. Paul Sanders says:

    And I bet whoever designed the preprocessor is spinning in his grave.

Comments are closed.
