Hardware backward compatibility: The firmware that missed one tiny detail


The person responsible for the floppy disk driver in Windows 95 also was responsible for the low-level CD-ROM drivers. (Not to be confused with the CDFS file system, which was handled by the file system team, not the hardware driver folks.) And I remember a story about one particularly strange CD-ROM drive.

This drive was produced by a name-brand manufacturer. The box that the drive came in proudly announced that it was an IDE ATAPI drive. And they did a fantastic job. They implemented all the ATAPI commands that were defined at the time, with one tiny exception.

They forgot to implement the "Are you an ATAPI drive?" command.

Comments (24)
  1. Mathieu Garstecki says:

    Of course, this is a Windows 95 bug. It should be able to read the box the drive came in to determine its type.

  2. Z.T. says:

    @Mathieu Garstecki: The box often lies, so that would not be a good idea even if it were possible.

    Was the drive's firmware upgradeable?

  3. Mathieu Garstecki says:

    @Z.T.: of course it would be a bad idea, that was a joke.

    Even if a new firmware was available, I wonder how one could update it. If the disk controller can't recognize the drive as ATAPI, how could the firmware update routine access the drive?

  4. Evan says:

    Wow, it's like the opposite problem as blogs.msdn.com/…/71307.aspx

  5. kip says:

    CDFS file system… that's kind of like an ATM machine or a PIN number, right? ;)

  6. pcooper says:

    But you see, by not implementing that, they in fact are not complying with the spec, so it's correct that it doesn't identify as an ATAPI drive.

    The only real problem is that the box was lying. But as already stated, that's the norm.

  7. Limited_Atonement says:

    @kip

    It's creepy how easily those two examples fall on my ears. I try to be a purist (and I noticed the CDFS thing immediately), but obviously my old age (25) is causing me to be sluggish. ;)

  8. blah says:

    How did something so stupid make it to manufacturing?

    [It works great as long as you use the manufacturer's driver. -Raymond]
  9. Falcon says:

    @Mathieu Garstecki: I'm guessing the ATAPI query command is for software only. Raymond's response to blah's comment above seems to support that theory.

    [I don't know where the command is issued. Maybe the drive comes with a custom interface card (and a custom driver that knows how to talk to the custom interface card). -Raymond]
  10. Marquess says:

    I assume the fix consisted of identifying the drive some other way (like the name) and then pretending that it answered positively to the “Are you an ATAPI drive?” question. This would also make a firmware update unnecessary.
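Nobody in this thread knows how the drive was actually handled, so the following is only a minimal sketch of the idea in this comment: a quirk list consulted when the drive fails the query. The `Drive` type, the `is_atapi` function, and the model string are all hypothetical; the story never names the drive.

```python
from dataclasses import dataclass

# Hypothetical quirk list: model strings of drives known to speak ATAPI
# even though they do not answer the "are you ATAPI?" query.
ATAPI_QUIRKS = {"EXAMPLE CD-400"}


@dataclass
class Drive:
    model: str                 # model string reported by the drive
    answers_atapi_query: bool  # did it respond to the identification query?


def is_atapi(drive: Drive) -> bool:
    """Treat a drive as ATAPI if it says so or if it is on the quirk list."""
    if drive.answers_atapi_query:
        return True
    # Fallback: pretend the drive answered "yes" when we recognize its name.
    return drive.model.strip().upper() in ATAPI_QUIRKS


print(is_atapi(Drive("Example CD-400", False)))
print(is_atapi(Drive("Unknown Drive", False)))
```

The weakness of this design is the one raised later in the thread about the TK50: any other device that happens to report the same name inherits the workaround.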

  11. Nick says:

    So now I'm curious… what was Windows 95's behavior when this drive was installed? Did Windows just ignore it because it didn't respond to the needed commands, or was there some hardware sniffing done to try and detect this particular drive and work around its deficiencies?

    [I don't know either. This was just something mentioned in a casual hallway conversation over 15 years ago. I didn't realize I had to follow up once he decided how to handle the situation. -Raymond]
  12. benjamin says:

    There's something I never understood: why does it seem like no matter how advanced Windows gets and how resilient it gets when dealing with errors, faults and corruption, somehow a bad CD (where 'bad' can be defined as having an unfortunate scratch somewhere) can cause Windows to completely freak out?

    The UI seems to completely lock up, hitting the SAS doesn't respond well and it just seems like the whole thing loses its marbles as it attempts to interpret what's going on. Generally if you can manage to eject the CD Windows regains its senses, gives you a 'this disk can't be read' message, and you can go on with life.

    I've had all manner of faulty hardware, from broken USB HDs to flash drives that've gotten a run through the washer, just be rejected by Windows in a sensible manner, but for some reason CDs just completely seem to perplex it.

  13. Dean Harding says:

    @Benjamin: What version of Windows are you using? I haven't seen that kind of thing happen for a long time…

  14. steveg says:

    Probably the CD firmware was 100% sufficient for an MSDOS driver.

  15. Jason T. Miller says:

    This reminds me of the DEC TK50 tape drive (first of the type that eventually became known as DLT; this one stored a whopping 94MB on what was then known as a CompacTape): we wrote software that supported this drive, but were forced to identify it as "maybe TK50," as it responded to the SCSI INQUIRY command with a blank string.

    Trouble is, the TK50 was not without "quirks," so, should some _other_ vendor create a tape drive that identified itself as the empty string, we'd have been in hot water, at least as far as auto-configuration is concerned (there were config file overrides).

    Needless to say, not supporting the drive was not an option: we had several _very_ large clients who had built important systems around the "dodgy" hardware (including DEC itself, who, as I recall, used our products in their media duplication department to create TK50 distribution tapes, so there's at least one application where "why use TK50 in the first place?" has an obvious answer).

    (To be fair, my understanding from DEC engineering was that the TK50 was essentially a "paleo-SCSI" drive in the following sense: the standard wasn't finalized, but DEC wanted to use it to connect the TK50 to the MicroVAX II. So the MicroVAX II was released with a "TK50 port" that was almost, but not entirely, a SCSI port, intended only for the TK50, itself intended for use only with the "TK50 port." Given this closed-world assumption, device identification isn't much of a problem: if it responds to commands at all, it exists, therefore it's a TK50. As usual, standardization committees don't always yield to schedule pressures imposed by marketing. But our software ran on MS-DOS, not the MicroVAX II, and Adaptec didn't sell "TK50 cards" for PCs.)
  16. I think I know the brand of the drive. It was from a then-famous maker that made the most popular dual-, tri-, and quad-speed IDE CD-ROM drives. That manufacturer is also famous for making the official controllers for many Nintendo and Sony game consoles. I'm not giving its name because it's still in the market of optical drives for computers.

    I remember my 4x unit from that maker worked perfectly under MS-DOS with the manufacturer's drivers, but didn't work with any generic IDE drivers. On the other hand, the drive's drivers worked without problem with drives from other makers. The surprise came when Windows 95 recognized my unit as an IDE drive without the need of MS-DOS drivers, proving that it was, indeed, IDE-compatible.

    To complete Raymond's story where he left off: it would not have been difficult for the driver's programmer to include a compatibility hack for that brand, making it work out of the box with Windows 95.

  17. Nick says:

    "I didn't realize I had to follow up once he decided how to handle the situation."

    You didn't, of course.  I just thought I'd ask.

    @benjamin

    Whenever I encounter these kinds of problems with Windows, it's almost always tied to pending I/O operations and faulty or misbehaving hardware of some kind.  For example, mapping a drive to a WebDAV share and starting a file copy.  If the network is interrupted, Windows and Explorer will act very strangely and not respond until the operation times out or completes (which can take MUCH longer than it seems like it should).

    Mark Russinovich's tool NotMyFault can provide examples of a lot of this kind of behavior (hanging IRPs, deadlocks, etc).  You can grab it here for kicks if you want: technet.microsoft.com/…/bb963901.aspx

  18. Anonymous Coward says:

    Dean Harding, it has happened to me on every Windows system I've ever used with a cd-rom drive: 95, 98, Me and XP.

    The symptoms are hilarious (except when you've got work to do), ranging from interface lock-ups, to a sluggish mouse-pointer that only updates once every ten seconds, indefinite up/down-spinning of the cd, funny noises (fweep fwup fwup fwup fwop fwop fwoap) and so on.

    It goes away when you eject the cd. The problem is purely transient; I've run a computer to which this happened non-stop afterwards without problems for almost a month, until patch-Tuesday came along. But sometimes the drive doesn't even react to the eject button any more. No panic though, you can still eject from Windows, that always works. Have fun navigating to My Computer when Windows takes ten seconds to react to every key and mouse movement or click.

  19. Worf says:

    A popular Unix-like OS, not by a Fruit Company though, had an issue that destroyed a certain manufacturer's CD-ROM drives. Apparently the drive took one of the ATAPI commands to mean "update firmware" when it shouldn't have…

    linux.slashdot.org/article.pl

  20. D. Garlans says:

    If you're having issues with slowdowns using damaged CDs (which I have encountered as well) it might be worth checking out the IDE controller settings in Control Panel and making sure the controller didn't end up having one of its channels set to PIO mode rather than DMA. I know some versions of Windows (XP at least) sometimes, if they caught too many read errors on a CD or whatever, would set the channel down to PIO mode to try and get slower and better access to the drive. PIO mode however completely dominates the CPU and you end up with insanely poor performance every time you try to access the drive.

    I've only been bitten by that particular quirk maybe twice (and I also reaffirmed my computer god status with my dad by diagnosing that problem on his computer) but it's certainly worth considering!

    Nothing is as bad as the old MacOS (pre os9) days when if you ejected a floppy disk before it decided everything was done accessing it, you blocked the entire system until you put that exact disk back in. that was fun ;)

  21. Evan says:

    @D. Garlans's post reminds me of one of the most amusing bugs I've had to track down.

    I had just installed Gentoo Linux, and I noticed that my system clock was getting way off. (It was also true that the system would pause a bit when I'd, say, untar a file, but that seemed less out of norm. This was many years ago.) Anyway, I eventually tracked it down to the following trace of causes:

    1. A lot of motherboard clock chips at least at one point were inaccurate enough that Linux doesn't actually trust them to keep good enough time while the system is on. (Don't know if that's still true or not.)

    2. To compensate, Linux ignores it. What it does instead is set a high-frequency interrupt on the CPU. The interrupt handler updates the kernel's notion of what time it is.

    3. I did not compile the driver for my motherboard's northbridge(?).

    4. Because it didn't know how to do DMA, it would do all disk IO as PIO.

    5. I guess the PIO thread ran at high priority, because while it was accessing the disk, the CPU would start dropping timer interrupts.

    6. The dropped timer interrupts meant that the internal clock was updated less frequently than it should have been, and so time ran slower.

    In other words: my computer's clock worked poorly because the kernel couldn't use DMA for I/O. In retrospect the chain from cause to effect is perfectly logical, but it's still about the weirdest cause/effect pair on its face that I've seen.
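The last two steps of that chain can be illustrated with a toy model of a clock that advances only when a timer interrupt is delivered. This is a sketch under assumptions, not a description of the Linux timekeeping code: the tick rate `HZ`, the function name, and the dropped fraction are all made up for illustration.

```python
HZ = 100  # assumed timer frequency in interrupts per second (illustrative)


def perceived_seconds(real_seconds: int, dropped_fraction: float) -> float:
    """Seconds a tick-counting clock believes have elapsed.

    dropped_fraction is the share of timer interrupts that were lost,
    for example while the CPU was busy with PIO transfers.
    """
    expected_ticks = real_seconds * HZ
    dropped_ticks = int(expected_ticks * dropped_fraction)
    # Each delivered tick advances the clock by 1/HZ seconds.
    return (expected_ticks - dropped_ticks) / HZ


# One real minute of heavy disk activity with a quarter of the ticks lost.
drift = 60 - perceived_seconds(60, 0.25)
print(f"clock fell behind by {drift} seconds")
```

Because every lost interrupt is simply never counted, the error only accumulates: the clock falls behind during disk activity and never catches up on its own.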

  22. James Schend says:

    "Nothing is as bad as the old MacOS (pre os9) days when if you ejected a floppy disk before it decided everything was done accessing it, you blocked the entire system until you put that exact disk back in. that was fun ;)"

    Sorry, have to defend my beloved Classic Macs here. In Mac terminology, "Eject" meant "eject the disk so I can put another disk in", not "eject the disk and I'm not ever going to use it again." The latter was actually the "Put Away" command (shortcut: Command-Y.)

    Really, the problem is just that Apple:

    1) Used different terminology than everybody else

    2) Made "Eject" more prominent than "Put Away", so people generally tried it first

  23. benjamin says:

    I remember the fix for the "Ejected, not Put Away" disk problem was to mash Command-. until MacOS just threw up its hands and gave up.

    AC's experiences mirror my own, almost all of which are with XP. Honestly, I've stopped using CDs in favor of external HDs and flash drives, to the point that I don't even remember using them much with either Vista or 7. I know I've done that "change PIO back to DMA" trick before and can't recall it happening with a SATA-based CD drive, so maybe it really is a thing of the past.

    I guess I'll need to see if I can dig up one of my flaky CDs and see how 7 deals with it.

  24. Syllopsium says:

    That's pretty good and I can't quite top it. I can, however, offer modems that don't behave as modems.

    On the milder end of the scale there was the ISDN Terminal Adaptor which had a broken Windows TAPI .INF 'driver'. Most people use ISDN as a high speed synchronous connection, usually between two routers – or if you're cheap/stupid two TAs and connect to it at ISDN line speed (a multiple of 64Kb). What's commonly forgotten is that TAs also support V.110/V.120 – a method of interfacing to an asynchronous DTE (usually a computer with a serial port). The upshot is that it sends commands to the TA at less than 64Kb/s.

    The INF file had no knowledge of anything asynchronous and had to be manually hacked.

    The best one, however, was a particularly specialist modem. The customer couldn't get it to work, so I had a look.

    Now, modems can send results in one of two ways – numeric or text based. The majority of software uses text, as it's easier to interpret. However, it's entirely possible to use numeric results even if after the first 8 or so generic codes the numbers quickly degrade into manufacturer specific lovelies like 46 for V.32bis with error correction, password support and grunge sprocket protection.

    I loaded up a terminal program, asked the modem its status and got back an OK. Switched into numeric mode and got.. nothing. No matter what I tried it was completely non functional.

    Being an incredibly specialist modem, it was possible to phone the manufacturer and speak to the firmware programmer, who had omitted numeric results because 'no one uses them any more'. Fortunately a re-flash was possible.
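The two result formats in this story are easy to handle side by side. A minimal sketch, assuming only the five basic Hayes-style result codes (0 through 4); the function name is hypothetical, and anything beyond the basic codes is left unmapped because, as noted above, the higher numbers are manufacturer specific.

```python
# Basic result codes shared by Hayes-compatible modems.
BASIC_RESULT_CODES = {
    0: "OK",
    1: "CONNECT",
    2: "RING",
    3: "NO CARRIER",
    4: "ERROR",
}


def normalize_result(line: str) -> str:
    """Map a modem response line to its text form, whether numeric or text."""
    text = line.strip()
    if text.isdigit():
        # Numeric mode: look up the code, flagging anything vendor specific.
        return BASIC_RESULT_CODES.get(int(text), f"UNKNOWN({text})")
    # Text mode: the response already is the result word.
    return text.upper()


print(normalize_result("0\r"))
print(normalize_result("OK\r\n"))
print(normalize_result("46"))
```

Software written this way would still have failed against the modem in the story, since it returned nothing at all in numeric mode.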
