Why did disabling interrupts cause Windows 95 to hang?


One of the ways people used to demonstrate the awfulness of Windows 95 was that you could hang the system with this simple two-instruction MS-DOS program:

    cli
    jmp $

The cli instruction disabled interrupts, and the jmp $ instruction merely jumped to itself. This spun the CPU with interrupts disabled, hanging the system. There is no way to break out of the loop because you told the CPU to stop listening to the outside world.

Why did Windows 95 let MS-DOS applications disable interrupts?

Compatibility, of course.

In principle, Windows 95 (and Windows 3.1) could have virtualized the interrupt flag for MS-DOS programs. If a virtual machine disabled interrupts, it would disable interrupts only for itself; other virtual machines would still have interrupts enabled, and interrupts would still be enabled in the virtual machine manager. Indeed, if the program was running in protected mode, the interrupt flag was virtualized. The special case was for code running in virtual-8086 mode. Why the special exemption just for v86 mode?
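
Conceptually, virtualizing the interrupt flag means giving each virtual machine its own private copy of the flag and having cli and sti operate on that copy rather than on the real CPU flag. Here is a minimal sketch in C of what that bookkeeping might look like; the structure and function names are invented for illustration and are not taken from the actual Windows 95 virtual machine manager:

    /* Hypothetical per-VM interrupt-flag virtualization.  Everything here is
       illustrative; none of these names come from the real code. */

    #include <stdbool.h>

    struct vm {
        bool virtual_if;    /* this VM's private copy of the interrupt flag */
        /* ... saved registers, memory map, pending interrupts, etc. ... */
    };

    /* Called by the monitor when a protected-mode program executes cli or sti.
       The real CPU flag never changes, so other VMs and the virtual machine
       manager keep receiving hardware interrupts as usual. */
    void vm_cli(struct vm *vm) { vm->virtual_if = false; }
    void vm_sti(struct vm *vm) { vm->virtual_if = true;  }

    /* Before reflecting a hardware interrupt into a VM, the manager checks the
       VM's virtual flag instead of the real one. */
    bool vm_can_deliver_interrupt(const struct vm *vm)
    {
        return vm->virtual_if;
    }

With this scheme, a cli in one virtual machine has no effect on interrupt delivery to any other virtual machine or to the manager itself.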

There were a lot of MS-DOS drivers that relied on timing loops and tight polling. If you virtualized the interrupt flag, then the virtual machine that disabled interrupts would have a messed-up timing loop, because its loop would keep getting interrupted by other virtual machines running at the same time. Similarly, a tight polling loop could miss an event because the hardware gave you only a 10ms window to respond to the signal, but the virtual machine got pre-empted for 55ms due to multi-tasking. That would cause your scanner to return garbage data, or your tape backup to fail, or your CD-ROM burning software to create a coaster.
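
To make the failure mode concrete, here is a hypothetical polling loop in C of the kind such a driver might contain. The port numbers, status bit, and timing window are made up for illustration:

    #include <stdint.h>

    #define STATUS_PORT 0x300       /* made-up device status port */
    #define DATA_PORT   0x301       /* made-up device data port */
    #define DATA_READY  0x01

    uint8_t inb(uint16_t port);     /* assumed helper: read a byte from an I/O port */

    /* The device raises DATA_READY and expects the byte to be read within
       roughly 10ms.  Run natively with interrupts disabled, this loop makes the
       deadline easily.  If the virtual machine is pre-empted for a 55ms time
       slice in the middle of the loop, the window is long gone by the time the
       loop resumes, and the transfer is corrupted. */
    uint8_t wait_for_byte(void)
    {
        while (!(inb(STATUS_PORT) & DATA_READY))
            ;                       /* spin until the device signals ready */
        return inb(DATA_PORT);
    }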

Windows 3.1 (and Windows 95) addressed this problem by disabling multi-tasking when a virtual machine disabled interrupts. Disabling interrupts allowed a virtual machine to prevent other virtual machines from stealing CPU and messing up its hardware timing and polling loops.
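
One way to picture this behavior is as a scheduling policy: once a virtual-8086 machine disables interrupts, the virtual machine manager stops time-slicing other virtual machines until that machine re-enables them. The C sketch below models only that policy, with invented names; the real implementation (which may simply let the hardware interrupt flag stay cleared) is not shown here:

    /* Hypothetical sketch of the Windows 3.x/95-era policy for virtual-8086
       machines: cli not only records the state, it also pauses time-slicing so
       no other VM can run until the matching sti.  Names are invented. */

    #include <stdbool.h>
    #include <stddef.h>

    struct v86_vm {
        bool interrupts_disabled;
    };

    static struct v86_vm *exclusive_vm;     /* VM that currently owns the CPU, if any */

    void v86_cli(struct v86_vm *vm)
    {
        vm->interrupts_disabled = true;
        exclusive_vm = vm;                  /* stop scheduling other VMs */
    }

    void v86_sti(struct v86_vm *vm)
    {
        vm->interrupts_disabled = false;
        if (exclusive_vm == vm)
            exclusive_vm = NULL;            /* resume normal multi-tasking */
    }

    /* The scheduler consults this before switching to another VM.  A VM that
       never executes sti (like the cli / jmp $ program above) keeps the CPU
       forever, which is exactly the hang described at the top of this post. */
    bool may_schedule(const struct v86_vm *candidate)
    {
        return exclusive_vm == NULL || exclusive_vm == candidate;
    }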

The general impression was that end users would prefer to keep using the hardware that they paid good money for, and which was working just fine in MS-DOS. (Back in those days, a low-end CD-ROM drive cost around $200. I owned one such, and the only driver it came with was an MS-DOS driver.)

Of course, Windows NT addressed this problem a different way: It simply didn't support MS-DOS drivers. But in the early 1990s, a lot of hardware devices didn't have drivers for Windows NT (and a lot of computers didn't meet Windows NT's hardware requirements), so your choices were limited.

  • Stick to MS-DOS and don't upgrade.
  • Suck it up and run Windows 95.
  • Use your external CD-ROM/Bernoulli/ZIP/tape drive as a doorstop.
Comments (35)
  1. Joshua says:

    Now I understand. I used to think it was a bug in the CPU architecture.

  2. Hildar says:

    Sounds like "code-execution results in executed code" again to me.

  3. Kemp says:

    This is one of those "Windows sucks - if I pull the power plug then my program stops executing" variety of issues. You told it to break, it broke. Congrats.

  4. Mike Caron says:

    Ah, the logical implementation of the hcf instruction. Neat, I love seeing low level details like this!

  5. Antonio 'Grijan' says:

    It still amazes me that Windows 95 managed to run, *inside* a virtual machine and *in userland*, drivers that were designed to have total control of the machine. It had to make some compromises, and it was less stable than one could wish if you used legacy drivers, yes. But it worked most of the time, and it is an amazing feat!

  6. ErikF says:

    @Antonio: I agree. The tradeoffs that Microsoft made seem to have worked well: DOS and Win16 drivers and apps worked fine for the most part, and Win32 apps were introduced for the future. It's probably not too surprising that Windows 9x stayed popular for as long as it did (the more modest hardware requirements probably didn't hurt either.)

  7. JJJ says:

    To this day, when I have occasion to burn a DVD, I still don't touch the PC until the process is complete.  I was so traumatized by the '90s, when any slight timing delay (even from moving the mouse too much!) would result in a coaster.

    :(

  8. Mike Dunn says:

    Ah yes, the days when you had to defrag your disk before burning a CD.  If the HD did too much seeking, it might be unable to stream enough data to the burner, and you'd get a coaster.

  9. Joshua says:

    Now that I can afford the RAM I burn CDs from RAM images.

  10. Antonio 'Grijan' says:

    @JJJ: back then, I ran NT 4, and had a dedicated 650 MB partition to store the files, so I didn't have too much trouble running Word or Visual Basic while burning a CD. Having 32 MB of RAM, a decent amount at the time, didn't hurt, either. If Win95 was a fair trade off, NT 4 was rock solid. I had to look at Microsoft's hardware compatibility list before shopping for parts, true, but it was more than worth the trouble. Did I mention that my personal uptime record for a *desktop* computer is 111 days, with NT 4 SP 6a? And yes, it was before Patch Tuesdays were established.

  11. MichaelQuinlan says:

    In 1995, Microsoft had $6,940,000,000 (almost $7 billion) sitting in cash (http://www.microsoft.com/.../fh.htm). I realize we can't go back in time, but I sometimes wonder about an alternative history where Microsoft spent some of that money dealing with problems like this. For example, Microsoft could have taken over the job of writing device drivers for hardware vendors, making sure the drivers were reliable and compatible with both the software and the hardware.

  12. Kevin says:

    I'm not sure why this is (was) such a big deal.  It's morally equivalent to the fact that a kernel-mode driver can do various bad things under modern Windows (and other OSs, for that matter).  If you don't want problems to affect the whole system, you don't run code at that level.

  13. @MichaelQuinlan: Money does not develop software; developers do. Microsoft does not have an infinite supply of expert developers, especially those willing to do that. Remember Longhorn?

  14. @carlmess says:

    I personally look at all of this as the normal evolution of things as applied to computers; things were simply different. I still remember waiting a long time before applying service packs or updates, or even upgrading my OS. Nowadays we anxiously await new versions and new fixes, install them as soon as they appear, and even ask for them. Things have simply changed...

  15. voo says:

    That $200 CD drive would be about $370 in today's money, which, all things considered, isn't even *that* outrageous.

    On the other hand, aren't we all happy that we moved on from disks to flash drives? My god how bothersome burning CDs always could be.

  16. appendix h says:

    Also, the Pentium (and late-model 486s) were the first CPUs that could hardware-virtualize these instructions.  Before that, any attempt to software-virtualize them would cause an exception to the virtual machine manager (aka the kernel) every time.

    So, speed too.

  17. T. West says:

    @MichaelQuinlan: Indeed, if MS had spent the money to help control the user experience to make it that much better, Windows could have dominated the late 90's market instead of being a pale shadow of Macintosh sales.  Oh wait...

    The reality is that you don't get to optimize on all axes simultaneously.  MS worked very hard on enabling other companies to make hardware and software, including companies that took shortcuts at every possible opportunity, making stuff that worked on and off (and the "on" part was often thanks to people in MS like Raymond), but was 1/4 the price.

    And guess who *thoroughly* won the PC wars.

    Reliability is nice, but when it comes to widespread market dominance, it consistently places third behind features and price.  You need to be reliable *enough*, but after that you really have to watch the trade-off.  After all, your customers pretty much assume 100% quality for anything they buy, so *actually* having the quality isn't a competitive advantage unless your competitors fall *too* far down the quality scale.

    (And yes, I'm a slightly embittered QA guy who talked too much to sales :-))

  18. Patrick Star says:

    I've always wondered why Win9x didn't use VME. This makes sense, but I do have some - admittedly vague - memories of really timing-sensitive stuff (like racing the beam and twiddling the palette for each line on the screen) not working too well under Win9x regardless.

  19. Yuhong Bao says:

    @Antonio 'Grijan': Classic Mac OS did something similar too: lists.apple.com/.../msg00061.html

  20. ErikF says:

    @Patrick: According to Wikipedia, VME wasn't publicly documented until the P6, which only launched in early 1995 (so really too late for any kind of testing for an OS releasing later that year!) Also, the number of people with the hardware required to use it would have been fairly small, meaning there would be a lot of coding for very little immediate benefit. I don't believe that I even had a Pentium-class computer until closer to 2000 if memory serves, as the initial Pentiums were pretty expensive.

  21. Yuhong Bao says:

    @ErikF: MS and IBM at least definitely had the NDA VME documentation though.

  22. Patrick Star says:

    Exact dates are a bit fuzzy, but I got an AMD K6 in 1997 and ran Win95 on it - in fact, it was kinda targeted at mixed 32/16-bit code. And the K6 was a competitor to the PMMX/PII, so clearly the vanilla Pentium had been around for some time by then.

    AFAIK OS/2 Warp, which actually pre-dates Win95, used VME when available - when not, it defaulted to allowing DOS boxes to disable IF, but on machines that supported it set up some sort of watchdog that triggered when interrupts had been disabled for too long. But I only ran OS/2 on 486 so can't swear that this is correct; however, Wikipedia seems to agree with me: en.wikipedia.org/.../2

  23. Viila says:

    @MichaelQuinlan: Would never, ever happen. Among other reasons, to write drivers, MS would have to be given the specs and interfaces of all the hardware, and hardware vendors guard every scrap of info about those with insane jealousy because... [Take your pick:]

     A) ...Their software and design gives them the edge over competition.

     B) ...They THINK their software and design gives them the edge over competition.

     C) ...Their hardware is a miserable stinking pile of bugs that the driver has to constantly work around, and they don't want anybody to find out.

     D) ...The only physical difference between Thingamafrob 200 and Thingamafrob 4000 Turbo is a flag in register 3 indicating whether the software should lock out all the features or not.

  24. DWalker says:

    An assembly loop can cause a computer to hang!  Wow!

    @MichaelQuinlan:  For the drivers issue -- although this is up for debate, I doubt that Microsoft would have wanted to, or been able to, write device drivers for ALL of the millions of devices that can be connected to a PC.

    I believe there are already "driver development kits" and also some built-in drivers that do most of the work, requiring the vendor to write a filter driver or some such.  (Don't quote me on this.)  But yes, getting the interrupts correct on hardware drivers is very hard.

  25. GWO says:

    @Viila - and anyone who's followed in-tree Linux driver development knows that every one of those reasons is still in use now when a Linux dev asks many hardware companies for enough documentation to write a driver.

  26. dasuxullebt says:

    That is why I am still waiting for a real usable microkernel that doesn't have these problems.

  27. Yukkuri says:

    @dasuxullebt

    VLADIMIR:

       You have a message from Mr. Microkernel.

    BOY:

       Yes Sir.

    VLADIMIR:

       He won't come this evening.

    BOY:

       No Sir.

    VLADIMIR:

       But he'll come tomorrow.

    BOY:

       Yes Sir.

    VLADIMIR:

       Without fail.

    BOY:

       Yes Sir.

    1. dasuxullebt says:

      True :(

      But I wonder… in the 90s and 2000s, the main argument against microkernels was that they are slow. Is this still an issue when it comes to things like filesystem drivers or network I/O?

      Btw, the nicest way to take most Linux machines down is still :(){ :|:& };: (it can be mitigated with ulimit, but the default setup of most distros is unlimited). (Just tried %0|%0 on my Windows 10 VM, and it worked, too.)

  28. Killer{R} says:

    be green, use power-saving features to get your PC stuck:

        cli
        hlt

  29. Killer{R} says:

    Also, I don't think it is a big problem that Win'9x could be deadlocked by executing a specific couple of machine opcodes. It's not a security-centric OS; the only thing protected mode was used for was to provide satisfactory stability of the OS environment against software bugs, not attacks. And I suspect cli was the cause of less than 1% of the problems in this area.

    And there were plenty of ways to bring Win'9x down without knowing assembler. My favorite was opening Start/Run, typing NULCON there, and pressing Enter. A whole university lab could be BSODed in a minute with this trick.

  30. Patrick Star says:

    On susceptible systems, F0 0F C7 C8 was great for that. Even under Win9x, which could be crashed in lots of other ways, it had an advantage for pranksters: because of the way it hung, ATX poweroff didn't work. You had to force poweroff (by holding down the power button for 3-5s, if the system even supported it) or pull the plug.

    1. Yuhong Bao says:

      @Patrick Star: And I think this was because an SMI was used to handle the power-off button, which after booting an ACPI-capable OS became an SCI. "F00F" locked up the CPU so completely that even SMIs couldn't be processed.

  31. santcugat says:

    Remember the joy of timing loops and calibrating analog joysticks in MSDOS? Those were the days...

  32. Dennis says:

    It reminds me of a story that 90% of crashes and kernel panics were actually caused by the VGA drivers, before Windows Vista required them to be digitally signed.

    x86_64 became more stable once they forced those vendors to review and rewrite their drivers.

    1. Jack says:

      I'd buy that. I had a machine running XP with some crap nVIDIA drivers that crashed, without fail, every time an application tried to use OpenGL.

Comments are closed.
