The Alpha AXP, epilogue: A correction about file system compression on the Alpha AXP

Some time ago, I claimed that Windows file compression had to be dumbed down in order for the Alpha AXP to hit its performance targets.

I have since been informed by somebody who worked on file system compression for Windows NT that the information was badly sourced. (My source was somebody who worked on real-time file compression, but not specifically on the Windows NT version.)

This is a bit of a futile correction because the wrong information has already traveled around the world [citation needed], posted some selfies to Instagram, and renewed its passport.

Windows NT file compression worked just fine on the Alpha AXP. It probably got a lot of help from its abundance of registers and its ability to perform 64-bit calculations natively.

We regret the error.

Bonus chatter: Real-time file compression is a tricky balancing act.

Compression unit: If you choose a small compression unit, then you don't get to take advantage of as many compression opportunities. But if you choose a large compression unit, then reads become more inefficient in the case where you needed only a few bytes out of the large unit, because you had to read the entire unit and decompress it, only to get a few bytes. Updates also become more expensive the larger the compression unit because you have to read the entire unit, update the bytes, compress the whole unit, and then write the results back out. (Possibly to a new location if the new data did not compress as well as the original data.) Larger compression units also tend to require more memory for auxiliary data structures in the compression algorithm.

Compression algorithm: Fancier algorithms will give you better compression, but cost you in additional CPU time and memory.
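The compression-unit trade-off can be seen in a small experiment. The following is a minimal sketch, not how any real file system does it: it uses Python's zlib as a stand-in compressor, and the helper names (compress_in_units, read_bytes), the sample data, and the unit sizes are all made up for illustration. It compresses the same data in independent units of various sizes, then reports how much was stored and how many bytes had to be decompressed to satisfy a 16-byte read.

```python
import zlib


def compress_in_units(data: bytes, unit_size: int) -> list[bytes]:
    """Compress each fixed-size unit independently (hypothetical helper)."""
    return [zlib.compress(data[i:i + unit_size])
            for i in range(0, len(data), unit_size)]


def read_bytes(units: list[bytes], unit_size: int, offset: int, length: int):
    """Read `length` bytes at `offset`.

    Returns (requested bytes, number of bytes that had to be decompressed).
    Every unit touched by the read must be decompressed in full.
    """
    first = offset // unit_size
    last = (offset + length - 1) // unit_size
    plain = b"".join(zlib.decompress(units[k]) for k in range(first, last + 1))
    start = offset - first * unit_size
    return plain[start:start + length], len(plain)


# Made-up sample data: 20,000 similar-looking text records.
data = b"".join(b"record %06d: the quick brown fox\n" % i for i in range(20000))

for unit_size in (4096, 65536, 1048576):
    units = compress_in_units(data, unit_size)
    stored = sum(len(u) for u in units)
    chunk, decompressed = read_bytes(units, unit_size, 123456, 16)
    assert chunk == data[123456:123472]
    print(f"unit={unit_size:>8}  stored={stored:>7}  "
          f"decompressed for a 16-byte read={decompressed}")
```

On this kind of repetitive data, larger units should store fewer bytes in total, but the cost of the small read grows with the unit size. An update would pay a similar price in the other direction, since the whole unit has to be recompressed and rewritten.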

What makes twiddling the knobs particularly difficult is that the effects of the knobs aren't even monotonic! As you make the compression algorithm fancier and fancier, you may find at first that things get slower and slower, but when the compression ratio reaches a certain level, then you find that the reduction in I/O starts to dominate the extra costs in CPU and memory, and you start winning again. This crossover point can vary from machine to machine because it is dependent upon the characteristics of the hard drive as well as those of the CPU. A high-powered CPU on a slow hard drive is more likely to see a net benefit, whereas a low-powered CPU may never reach the breakeven point.
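To make the crossover concrete, here is a toy cost model. It is only a sketch under loudly stated assumptions: the time to read a file is modeled as disk time for the compressed bytes plus CPU time for decompression, and every number in it (compression ratios, CPU costs, disk speeds) is invented for illustration rather than measured on any real machine. The names Algorithm, Machine, and read_time are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    name: str
    ratio: float     # compressed size / original size (lower is better)
    cpu_cost: float  # CPU seconds per MB of original data on a reference CPU


@dataclass(frozen=True)
class Machine:
    name: str
    disk_mb_per_s: float  # sustained disk throughput
    cpu_speed: float      # speed relative to the reference CPU


def read_time(size_mb: float, algo: Algorithm, machine: Machine) -> float:
    """Toy model: I/O time for the compressed bytes plus decompression time."""
    io_time = size_mb * algo.ratio / machine.disk_mb_per_s
    cpu_time = size_mb * algo.cpu_cost / machine.cpu_speed
    return io_time + cpu_time


# All figures below are made up to illustrate the shape of the curve.
ALGORITHMS = [
    Algorithm("none", ratio=1.00, cpu_cost=0.000),
    Algorithm("light", ratio=0.95, cpu_cost=0.003),
    Algorithm("medium", ratio=0.85, cpu_cost=0.006),
    Algorithm("heavy", ratio=0.50, cpu_cost=0.009),
]
FAST_CPU = Machine("fast CPU, slow disk", disk_mb_per_s=40, cpu_speed=1.0)
SLOW_CPU = Machine("slow CPU, slow disk", disk_mb_per_s=40, cpu_speed=0.25)

for machine in (FAST_CPU, SLOW_CPU):
    print(machine.name)
    for algo in ALGORITHMS:
        print(f"  {algo.name:<6} {read_time(100, algo, machine):.3f} s")
```

With these invented numbers, the fast-CPU machine gets slower with the light and medium algorithms and only comes out ahead of no compression with the heavy one, while the slow-CPU machine never reaches breakeven. Change the disk speed or the CPU speed and the crossover moves, which is the point.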

Comments (17)
  1. pc says:

    Seems somehow appropriate that “Regret the Error” goes to a 404.

    It’s okay Raymond. Even when you tell the truth, the media manages to distort it (I’m thinking of “BSOD week”). People somehow forget that your blog is “for entertainment purposes only”, and your stories are often your best guess at what happened. (Much like when any of us ever tell a story.) You are, in fact, a human, and humans make mistakes.

    We forgive you.

    1. Antonio Rodríguez says:

      “For entertainment purposes only”? Yes, most articles are fun or interesting (they are so varied that sometimes Raymond writes about something I’m not interested in, but hey, this is his blog, and he gets to decide what to post!). But you can learn from the most unexpected sources. In particular, I have put to work some tips extracted from Raymond’s grandfather tales about Windows 95 (not to mention more practical articles, like the yearly Batch File Week). For entertainment? Sure. But not only :-) .

      1. Antonio Rodríguez says:

        Clarification: those tips extracted from the Windows 95 articles are UI-related. Windows 95 may be obsolete, but all the user testing done by Microsoft (and its findings) are timeless. After posting my comment, I realized it sounded strange…

      2. pc says:

        It was a reference to something that I thought Raymond has posted more than once, but maybe I’d just read it more than once:

        One should also of course read his “Disclaimers and such” listed in the right nav:
        “In summary, readers are expected to employ critical thinking skills to evaluate statements in context.”

      3. Antonio Rodríguez says:

        Yes, I know. I was poking fun at it :-) .

        There used to be a first post in this blog where Raymond set up the basic rules (like not referencing by name products of other companies). IIRC, that is where Raymond first posted his classic “for entertainment purposes only”, along with other disclaimers (that his opinions are not necessarily Microsoft’s official position, for example). I can’t find that post, but I remember having read it.

  2. DWalker07 says:

    Raymond is a human? I didn’t think that any human could know and remember as much as he does.

    1. pc says:

      Oh, right. Never mind then.

  3. Antonio Rodríguez says:

    Disk optimization is tricky. I still keep in use an original Acer Aspire One 110 netbook from 2008, with the SSD upgraded to 16 GB, RAM upgraded to 1 GB and a 16 GB SD card in the expansion slot (this machine has two card readers, one for storage expansion and another one for regular use). The internal SSD is relatively slow (40 MB/s) on reads, and really painfully slow (often under 5 MB/s!) on random writes, so I use it purely as a system drive, and have every user folder (Documents, Music, Pictures…) moved to the permanent SD card (which is faster on random writes, and can be plugged into my study’s PC for a quick backup).

    But on to the point. I have the internal drive compressed. Being relatively slow at reads, this speeds up OS booting and application launching. This is one of the cases where file compression actually makes the system run faster.

    Tuning up this configuration (deciding what to put where, whether filesystem compression was worth it, what to compress, the right amount of memory and swap file, which applications to use, etc.) took a lot of trial and error, but the result was worth it: an underpowered 9-year-old machine that beats many current laptops in day-to-day performance.

    1. Yukkuri says:

      I see you don’t consider your time to be very valuable…

      1. Antonio Rodríguez says:

        On the contrary. It is a machine I use every day, my primary portable computer. If I spent a few afternoons making it run fast eight years ago, and that has saved me five to ten minutes each day since, I have gained a lot of time. Do the math.

  4. Simon Clarkstone says:

    Are you going to edit the original blog post to mention that you have retracted it?

    1. Yes, a retraction would be useful at the top of the original blog post. Raymond might think a correction is futile, but I’d respectfully disagree. On the web, corrections to wrong information essentially don’t exist unless they are on the same page as the wrong information was originally.

    2. Done. I couldn’t update the original until this article was posted, and then various issues prevented me from updating it immediately.

  5. Neil says:

    A case of twiddling the knobs: Firefox keeps its UI strings in a number of text files in DTD or Java properties format. Originally these (and other data files) were stored in a .ZIP format file with compression. However, the download is itself a compressed file, so storing them without compression actually reduced the size of the download. The latest thinking has switched back to using compression again, as that saves on I/O.

  6. Yuhong Bao says:

    So did the lack of BWX affect the Alpha compression in any way, and if so, how?

  7. I’m surprised the NTFS compression has never been updated since at least Win2K, despite the enormous boosts in CPU speed compared to I/O speed in the 2000-2010 timeframe, and that Windows has never come pre-compressed. Pre-compression would have been a huge I/O and space gain, especially on itty-bitty early SSDs.

    Transparent compression has always seemed like something Microsoft wasn’t sure what to do with, and didn’t really use to its maximum potential. Now ReFS completely strips it out, and btrfs has lost the support of Red Hat, so the future will probably be uncompressed. (Or rather, only internally compressed on the drive, but the user won’t be able to see any of those space savings.)

    1. Windows came pre-compressed starting in Windows 8.1. It executes directly out of the compressed image. Since the image is read-only, the compression can be done offline, which means an algorithm that is slow to compress but fast to decompress becomes a viable option. Windows 10 continues the evolution of compressed operating system files.

Comments are closed.
