The Windows 95 I/O system assumed that if it wrote a byte, then it could read it back


In Windows 95, compressed data was read off the disk in three steps.

  1. The raw compressed data was read into a temporary buffer.
  2. The compressed data was uncompressed into a second temporary buffer.
  3. The uncompressed data was copied to the application-provided I/O buffer.

But you could save a step if the I/O buffer was a full cluster:

  1. The raw compressed data was read into a temporary buffer.
  2. The compressed data was uncompressed directly into the application-provided I/O buffer.

A common characteristic of dictionary-based compression is that a compressed stream can contain a code that says "Generate a copy of bytes X through Y from the existing uncompressed data."

As a simplified example, suppose the cluster consisted of two copies of the same 512-byte block. The compressed data might say "Take these 512 bytes and copy them to the output. Then take bytes 0 through 511 of the uncompressed output and copy them to the output."
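A minimal sketch of that kind of back-reference, assuming a made-up token format: `struct token`, `TOKEN_LITERAL`, `TOKEN_COPY`, and `toy_decompress` are hypothetical names invented for illustration and are not the actual DriveSpace format. The point is only that a copy token reads bytes back out of the very buffer that is being filled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token format, invented for illustration only. */
enum token_kind { TOKEN_LITERAL, TOKEN_COPY };

struct token {
    enum token_kind kind;
    const unsigned char *literal; /* TOKEN_LITERAL: bytes to emit */
    size_t offset;                /* TOKEN_COPY: start position in the output */
    size_t length;                /* number of bytes this token produces */
};

/* Decompress straight into `out`, resolving copy tokens by reading
   earlier bytes of `out` itself. The caller must size `out` to hold
   everything the tokens produce. Returns the number of bytes written. */
size_t toy_decompress(const struct token *tokens, size_t count,
                      unsigned char *out)
{
    assert(tokens != NULL && out != NULL);

    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        const struct token *t = &tokens[i];
        for (size_t j = 0; j < t->length; j++) {
            out[pos + j] = (t->kind == TOKEN_LITERAL)
                               ? t->literal[j]
                               : out[t->offset + j]; /* read back earlier output */
        }
        pos += t->length;
    }
    return pos;
}
```

For the two-copy cluster in the example, the token list would be one literal token carrying the 512 bytes followed by one copy token with offset 0 and length 512.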

So far, so good.

Well, except that if the application wrote to the I/O buffer while the read was in progress, then the read would get corrupted because it would copy the wrong bytes to the second half of the cluster.

Fortunately, writing to the I/O buffer is forbidden during the read, so any application that pulled this sort of trick was breaking the rules, and if it got corrupted data, well, that's its own fault. (You can construct a similar scenario where writing to the buffer during a write can result in corrupted data being written to disk.)
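The corruption can be imitated deterministically, without real concurrency, by splitting a toy decompressor into its two steps and letting the "application" scribble on the buffer in between. This is only a sketch: `HALF`, `emit_literal`, `emit_backreference`, and `read_with_interference` are hypothetical names, and the 4-byte block stands in for the 512-byte one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HALF 4 /* toy stand-in for the 512-byte block */

/* Step 1 of the toy decompressor: emit the literal first half. */
void emit_literal(unsigned char *out, const unsigned char *lit)
{
    assert(out != NULL && lit != NULL);
    memcpy(out, lit, HALF);
}

/* Step 2: "copy bytes 0 through HALF-1 of the output to the output". */
void emit_backreference(unsigned char *out)
{
    assert(out != NULL);
    memcpy(out + HALF, out, HALF);
}

/* A read during which the application breaks the rules and modifies
   the I/O buffer between the two decompression steps. */
void read_with_interference(unsigned char *out, const unsigned char *lit)
{
    emit_literal(out, lit);
    out[0] ^= 0xFF; /* the misbehaving application scribbles here */
    emit_backreference(out);
}
```

After `read_with_interference`, the second half of the buffer starts with the scribbled byte rather than the byte that was on disk, because the back-reference faithfully copied whatever was in the buffer at the time.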

Things got even weirder if you passed a memory-mapped device as your I/O buffer. There was a bug that said, "The splash screen for this MS-DOS game is all corrupted if you run it from a compressed volume."

The reason was that the game issued an I/O directly into the video frame buffer. The EGA and VGA video frame buffers used planar memory and latching. When you read or write a byte in video memory, the resulting behavior is a complicated combination of the byte you wrote, the values in the latches, other configuration settings, and the values already in memory. The details aren't important; the important thing is that video memory does not act like system RAM. Write a byte to video memory, then read it back, and not only will you not get the same value back, but you probably modified video memory in a strange way.

The game in question loaded its splash screen by issuing I/O directly into video memory, knowing that MS-DOS copies the result into the output buffer byte by byte. It set up the control registers and the latches in such a way that the bytes written into memory go exactly where they should. (It issued four reads into the same buffer, with different control registers each time, so that each read ended up being issued to a different plane.)

This worked great, unless the disk was compressed.

The optimization above relied on the property that writing a byte followed by reading the byte produces the byte originally written. But this doesn't work for video memory because of the weird way video memory works. The result was that when the decompression engine tried to read what it thought was the uncompressed data, it was actually asking the video controller to do some strange operations. The result was corrupted decompressed data, and corrupted video data.

The fix was to force double-buffering in non-device RAM if the I/O buffer was in device-mapped memory.
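Both the failure and the fix can be sketched with a simulated device. Everything here is an assumption made for illustration: `struct fake_device` is a hypothetical stand-in whose loads return the complement of what was stored, which is far simpler than real EGA/VGA behavior, and the "compressed" input is reduced to "these literal bytes, then repeat them".

```c
#include <assert.h>
#include <stddef.h>

#define DEVICE_SIZE 16

/* Hypothetical device memory: stores land in `cells`, but loads do not
   return what was stored (here they return the complement). */
struct fake_device {
    unsigned char cells[DEVICE_SIZE];
};

static void dev_store(struct fake_device *d, size_t i, unsigned char v)
{
    d->cells[i] = v;
}

static unsigned char dev_load(const struct fake_device *d, size_t i)
{
    return (unsigned char)~d->cells[i];
}

/* Optimized path: decompress directly into the device and resolve the
   back-reference by reading the device. This yields the wrong bytes. */
void decompress_direct(struct fake_device *d,
                       const unsigned char *lit, size_t half)
{
    assert(2 * half <= DEVICE_SIZE);
    for (size_t i = 0; i < half; i++)
        dev_store(d, i, lit[i]);
    for (size_t i = 0; i < half; i++)
        dev_store(d, half + i, dev_load(d, i));
}

/* Fixed path: decompress in ordinary RAM, then only write to the device. */
void decompress_double_buffered(struct fake_device *d,
                                const unsigned char *lit, size_t half)
{
    unsigned char staging[DEVICE_SIZE];

    assert(2 * half <= DEVICE_SIZE);
    for (size_t i = 0; i < half; i++)
        staging[i] = lit[i];
    for (size_t i = 0; i < half; i++)
        staging[half + i] = staging[i]; /* back-reference read from RAM */
    for (size_t i = 0; i < 2 * half; i++)
        dev_store(d, i, staging[i]);
}
```

The double-buffered path costs an extra copy, which is why it makes sense to reserve it for buffers that live in device-mapped memory.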

Comments (15)
  1. Joshua says:

    I must be strange for expecting read video memory to work.

  2. John Elliott says:

    Reading planar video memory *can* work, depending how you've set up the registers and the latches. But the game obviously didn't bother to do that, because it only wanted DOS to write into the buffer, not read from it.

  3. Zarat says:

    @Joshua: You should never write a device driver then.

    If you configured your memory mapping to be write-only then you are supposed to do only write operations on it. No safeguards from Windows or your compiler (doesn't matter if you are user-mode or kernel-mode).

    And to know it's still relevant today just look at the warnings in the DX12 docs about mapped memory:

    msdn.microsoft.com/.../dn788712.aspx

  4. Sean Liming says:

    I wonder if this is the reason the original Windows CE video driver was a double copy. The result was a dirty rectangle driver.

  5. Yuri Khan says:

    The DOS game would not be governed by the Windows API’s restriction on reading from or writing to the buffer being used by an I/O operation, unless the DOS API had a similar restriction. Which, as far as I understand, it did not, as DoubleSpace only appeared in DOS 6 (renamed to DriveSpace in 6.22).

    It would be DriveSpace’s responsibility to only extend the optimization to client programs written against the Windows API.

  6. ErikF says:

    @Yuri: But the decompression logic would have been handled by the DriveSpace VxD, which had no knowledge of where the request is coming from; it would just see a read request for file XYZ with a buffer at ABC. I don't think that VxDs really could tell where requests came from (i.e. Windows programs vs. DOS programs), but I could easily be mistaken because I never had to program at that low of a level.

    FWIW, I used DriveSpace for a while way back when (and Stacker even before then) and dropped it in WfW 3.11 after too many data corruption events. I'm sure that things got better in Windows 95 but was just bitten too many times to try again.

  7. Killer{R} says:

    ...on Speccy video RAM sometimes was used even for code.

  8. Azarien says:

    All the machines I have seen that used DOS/Win9x disk compression ran absurdly slow.

    Things work much better now: recently I compressed a partition with Windows 10 and VS2015, and I don't notice any difference in performance.

  9. Mike Dimmick says:

    @Joshua: But if you multiplex your memory-mapped I/O, you can get away with half as many physical addresses! In general, I/O devices routinely use the same address for different functions if you read from the address versus writing to it. Essentially they use the read/write pin as an additional address line. A 'read' may often invoke some behaviour from the device, with the actual value ending up on the data bus being meaningless, or the value written to the data bus for a 'write' being unimportant, or being an additional parameter.

    In other words, hardware is not RESTful. GET has side-effects.

  10. Dave Bacher says:

    @Joshua:

    void cVGA::PutPixel(short int x, short int y, char color)
    {
      /* Each address accesses four neighboring pixels, so set
         Write Plane Enable according to which pixel we want
         to modify.  The plane is determined by the two least
         significant bits of the x-coordinate: */
      outportb(0x3c4, 0x02);
      outportb(0x3c5, 0x01 << (x & 3));

      /* The offset of the pixel into the video segment is
         offset = (width * y + x) / 4, and write the given
         color to the plane we selected above.  Heed the active
         page start selection. */
      Screen[(unsigned short int)(YTable[y]) + (x >> 2) + ActiveStart] = color;
    }

    char cVGA::GetPixel(short int x, short int y)
    {
      /* Select the plane from which we must read the pixel color: */
      outportb(GRAC_ADDR, 0x04);
      outportb(GRAC_ADDR+1, x & 3);

      return Screen[(unsigned short int)(YTable[y]) + (x >> 2) + ActiveStart];
    }

    Maybe that'll help?

  11. Dave Bacher says:

    Oh and since it'll come up:

    #define outport outpw
    #define outportb outp
    #define inport inpw
    #define inportb inp

    #define SEQU_ADDR 0x3c4
    #define CRTC_ADDR 0x3d4
    #define GRAC_ADDR 0x3ce

  12. boogaloo says:

    @killer: The Amiga only ever shipped with video ram. You could add non-video ram, but the standard ram expansion on the A500 was a special type of "video" ram that could only be used by the CPU. Although a later Agnus chip revision fixed it so you could.

  13. Joshua says:

    @Dave Bacher: That's a pretty good expansion of what was higher up. (That is, the readback works if the latches are set correctly.)

  14. Dave Bacher says:

    @Joshua:

    That's the actual code from a 1992 MS-DOS video game, although those two routines are never actually called.

    In the DOS version, no routine that does a write ever sets the read page, and no routine that does a read ever sets the write mask.  Because the two are asymmetric, and the bus write was really slow.  So the odds of the latches being correct are 25%, basically, pure random. :)

  15. boogaloo says:

    @Dave Bacher The odds are less than 25% that a written byte can be read back because the barrel shift or logical operation may have been enabled. wiki.osdev.org/VGA_Hardware

Comments are closed.
