What did the Ignore button do in Windows 3.1 when an application encountered a general protection fault?


In Windows 3.0, when an application encountered a general protection fault, you got an error message that looked like this:

Application error
CONTOSO caused a General Protection Fault in
module CONTOSO.EXE at 0002:2403
Close

In Windows 3.1, under the right conditions, you would get a second option:

CONTOSO
An error has occurred in your application.
If you choose Ignore, you should save your work in a new file.
If you choose Close, your application will terminate.
Close
Ignore

Okay, we know what Close does. But what does Ignore do? And under what conditions will it appear?

Roughly speaking, the Ignore option becomes available if

  • The fault is a general protection fault,
  • The faulting instruction is not in the kernel or the window manager,

  • The faulting instruction is one of the following, possibly with one or more prefix bytes:

    • Memory operations: op r, m; op m, r; or op m.

    • String memory operations: movs, stos, etc.

    • Selector load: lds, les, pop ds, pop es.
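As a rough sketch only, those conditions can be written as a predicate. This is not the real kernel code, which presumably worked on raw opcode bytes. The `Fault` record, the `"GP"` tag, the operand shapes, and the module names `KERNEL` and `USER` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed module names for "the kernel" and "the window manager".
SYSTEM_MODULES = {"KERNEL", "USER"}
# The list above says "movs, stos, etc."; only the two named ones are modeled.
STRING_OPS = {"movs", "stos"}


@dataclass
class Fault:
    kind: str        # "GP" for a general protection fault (hypothetical tag)
    module: str      # module that contains the faulting instruction
    mnemonic: str    # decoded mnemonic, with any prefix bytes already skipped
    operands: tuple  # operand shapes: "r", "m", or a segment register name


def can_offer_ignore(fault: Fault) -> bool:
    """Return True if the fault meets the rough conditions for Ignore."""
    if fault.kind != "GP":
        return False
    if fault.module in SYSTEM_MODULES:
        return False
    # Selector loads: lds, les, pop ds, pop es.
    if fault.mnemonic in ("lds", "les"):
        return True
    if fault.mnemonic == "pop" and fault.operands in (("ds",), ("es",)):
        return True
    # String memory operations.
    if fault.mnemonic in STRING_OPS:
        return True
    # Memory operations: op r, m; op m, r; or op m.
    return fault.operands in (("r", "m"), ("m", "r"), ("m",))
```

For example, `can_offer_ignore(Fault("GP", "CONTOSO", "mov", ("r", "m")))` is true, while the same fault inside `KERNEL` is not eligible. The sketch makes no attempt to model control transfers.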

If the conditions were met, then the Ignore option became available. If you chose to Ignore, then the kernel did the following:

  • If the faulting instruction is a selector load instruction, the destination selector register is set to zero.

  • If the faulting instruction is a pop instruction, the stack pointer is incremented by two.

  • The instruction pointer is advanced over the faulting instruction.

  • Execution is resumed.
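Here is a minimal sketch of those fix-ups, applied to a simulated register set. It is an illustration under assumptions, not the kernel's implementation: the `ignore_fault` function, the register dictionary, and the idea that the caller already knows the instruction's length (prefix bytes included) are all hypothetical.

```python
def ignore_fault(regs: dict, mnemonic: str, dest: str, length: int) -> dict:
    """Apply the Ignore fix-ups to a dict of 16-bit registers, in place.

    regs     -- must contain "ip" and "sp", plus "ds"/"es" for selector loads
    mnemonic -- decoded mnemonic of the faulting instruction
    dest     -- destination operand name (only inspected for pop)
    length   -- size of the faulting instruction in bytes, prefixes included
    """
    # Selector load: zero the destination selector register.
    # lds always targets ds, and les always targets es.
    if mnemonic == "lds":
        regs["ds"] = 0
    elif mnemonic == "les":
        regs["es"] = 0
    elif mnemonic == "pop" and dest in ("ds", "es"):
        regs[dest] = 0

    # A pop never completed, so discard the 16-bit value it would have popped.
    if mnemonic == "pop":
        regs["sp"] = (regs["sp"] + 2) & 0xFFFF

    # Step over the faulting instruction; the caller then resumes execution.
    regs["ip"] = (regs["ip"] + length) & 0xFFFF
    return regs
```

With the fault address from the sample dialog, ignoring a one-byte `pop ds` at offset 0x2403 would leave `ds` at zero, move `sp` up by two, and resume at 0x2404. For any other eligible instruction, only the instruction pointer changes.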

In other words, the kernel did the assembly language equivalent of ON ERROR RESUME NEXT.

Now, your reaction to this might be, "How could this possibly work? You are just randomly ignoring instructions!" But the strange thing is, this idea was so crazy it actually worked, or at least worked a lot of the time. You might have to hit Ignore a dozen times, but there's a good chance that eventually the bad values in the registers will get overwritten by good values (and it probably won't take long because the 8086 has so few registers), and the program will continue seemingly normally.

Totally crazy.

Exercise: Why didn't the code have to know how to ignore jump instructions and conditional jump instructions?

Bonus trivia: The developer who implemented this crazy feature was Don Corbitt, the same developer who wrote Dr. Watson.

Comments (37)
  1. Some people joke about how bad On Error Resume Next is, then try { ... } catch (e) {} without batting an eye.

  2. anonymouscommenter says:

    Well that explains why I thought it didn't work

    I never guessed you'd have to press it a bunch of times.

  3. anonymouscommenter says:

    Hey, if it works for web browsers (they completely ignore unknown HTML elements, unknown attributes, unknown CSS, and undefined variables; web sites frequently have JS errors, yet keep working normally) then why not for machine code too?

  4. anonymouscommenter says:

    Thank you Don Corbitt, for all the work I *didn't* lose.

  5. anonymouscommenter says:

    Answer to exercise: what would it mean to ignore a jump instruction??? For unconditional jumps, the program was probably REALLY expecting to jump here. The only way it can fault on the jump instruction itself is a computed jump for which the computation failed, so it has no address to jump to. Failing to jump and just dropping through to the next instruction will probably lead the program somewhere completely random; whatever case appears first in your switch statement, or possibly the top of some other function entirely if this was some kind of computed-tail-call thing. Or some completely other use of computed jumps produced by some esoteric language. In any case, the program will not be happy if the jump doesn't happen.

    The much more likely case of a GPF "caused by" a jump instruction is a perfectly good jump (conditional or not) that led to somewhere meaningless. At this point it's too late; all Windows knows is you tried to execute instructions somewhere meaningless. It doesn't know how you got here, so it can't go back in time and pretend not to take that jump after all. At least, not until MS Research finishes that time machine.

  6. anonymouscommenter says:

    @PaulZ: most likely it executes the jump table.

  7. anonymouscommenter says:

    @PaulZ:  Why not just record the full memory state of every app continuously in a buffer, to allow the kernel to rewind the app to a time before the issue occurred?  I checked with my web developer friends, and they concur that memory is so cheap and plentiful that this is a good idea

  8. Michael Kjörling says:

    @Bromide

    Because Windows 3.1 was supposed to be able to do something at least moderately *useful* in 1992, on an 80286 with 1 MB RAM?

  9. @Michael Kjörling says:

    *Woosh*

  10. anonymouscommenter says:

    I like @Bromide's idea.  It's like formally codifying the "undefined behavior can lead to time travel" rule.

  11. anonymouscommenter says:

    There was also an application like this for Windows 95 called Norton Crash Guard. It intercepted crashes and did similar things, but with a fancy UI.

  12. anonymouscommenter says:

    @Maurits

    Surely you can understand the difference between ON ERROR RESUME NEXT, which ignores all errors and executes the next statement immediately after the problem statement, and a try-catch block that can ignore a whole batch of related statements if the exception is recoverable and even expected condition (such as received Parse(...) call). Of course, throwing exceptions should be avoided as flow control, but .NET 1.0 and 1.1 taught many developers bad habits.

  13. anonymouscommenter says:

    For jump instructions, it's either a near jump or a far jump. Near jumps can't fault because Windows would always map in the entire segment. Far jumps would be to the kernel code that does segment loading and so the actual fault would be in kernel code.

  14. anonymouscommenter says:

    @Maurits:

    Now, wait a minute. try/catch is an explicitly anticipated code flow: the designer understands that if the code breaks anywhere in the try{} block, execution will escape the entire block (terminating in-progress loops, trashing local scopes, etc.) and GOTO the catch block. By invoking try/catch, the designer explicitly consents to this flow, and hopefully writes the catch block as a suitable response to that exception, leaving the application in a logically consistent state. If that's not the result, then the designer is at fault for writing an inadequate catch block.

    But Windows 3.1's IGNORE is in no way anticipated! It skips over an instruction that, in virtually all cases, the designer fully intended to be executed. The odds of inconsistent state after such an operation are astronomical, and the designer can't be faulted. Even if the designer expected the exception, "just skip it and resume execution" is not an option that the designer can expect to have occurred!

  15. anonymouscommenter says:

    I have always wanted to leave a comment in this blog but I have always been too late, anyway:

    A VERY nice work on those SysErrorBox() designs Mr. Chen!

    If you ever meet the people/team behind that part of the Windows development (the SysErrorBox API), they were really doing an awesome job, as well, on the graphical design and keeping it all alive during CPU interrupt execution!

  16. anonymouscommenter says:

    David Stein: I don't think Maurits was talking about programmers making proper, intentional use of try/catch with carefully designed catch blocks. I think he's talking about situations where the code is littered with try blocks where the catch block does nothing.

    For example, one code base I work with has EVERY. SINGLE. FUNCTION. wrapped in a try block with no logic in the catch block. All of the function's logic is in the try block; the catch block catches every exception and logs the name of the function and the error message, but there's no stack trace and no propagation of the error. It's as if the author wished for ON ERROR RESUME NEXT and did the best he could with his tools to emulate it. I'm honestly not sure which is worse.

  17. Azarien says:

    @Ken in NH: RESUME NEXT leaves your program in an inconsistent state. You can hope that every instruction that depended on failed instruction will fail also, but with every "NEXT" you are deeper in the undefined behavior zone.

    On the other hand, a try block with an empty catch {} will bail out immediately, silently abandon the whole task, and resume work on something else.

    This may be a desired idiom in case we want to just carry on or with badly designed APIs that throw spurious exceptions left and right.

  18. anonymouscommenter says:

    Back in the bad old days of DOS we had a dbase application that kept suffering from data corruption.  I was observing the user work with the program when I noticed that he was presented with the "Abort, Retry, Ignore?" prompt from DOS.  He ignored the error and went back to working with the application. He had never mentioned to anyone that he was receiving these messages.  When I asked him about why he never said anything, he said that he figured if he was given a choice to ignore the problem by the computer then it must be ok to do so.

  19. anonymouscommenter says:

    Getting Abort, Retry or Ignore when an application was printing and the printer was offline was also very annoying. I wrote a TSR to solve that one.

  20. Darran Rowe says:

    @Azarien:

    Except in the case where someone does:

    try{statement();}catch(e){}

    try{statement_that_depends_on_previous_statement();}catch(e){}

    While it bails out of the current instruction stream immediately, it doesn't bail out of the current set of actions. Exceptions don't guarantee that invariants are re-established.

  21. anonymouscommenter says:

    A try { ... } catch {} can be reasonable if the code is using an idiom such as RAII with the strong exception safe guarantee - under such circumstances any failure merely leaves the program in the state it was before the operation (or set of operations) started. (Some of the problems with exceptions and the difficulty of offering the strong guarantee have been previously noted in this blog, of course...)

  22. Killer{R} says:

    /*he said that he figured if he was given a choice to ignore the problem by the computer then it must be ok to do so.*/

    In one of my apps I did the following: if an unexpected exception was caught, the app entered a 'crashed' state: it immediately saved the current data to a specially named save file and disabled all data-saving capabilities (menu items grayed out, plus additional checks in code). Of course, the user was informed about the crash too.

  23. anonymouscommenter says:

    I Googled this Don Corbitt and found he died in a plane crash in 2007.  RIP.  May your slumber never get an interrupt, and if you do we hope you are patching instruction pointers up in the sky.

  24. anonymouscommenter says:

    Update to previous: I guess it was 1999.  My mistake.

  25. General Protection Fault in Windows 3.0? Are you sure?

    As far as I remember, GPF came to be in Windows 3.1. Windows 3.0 generated Unrecoverable Application Error. Am I wrong?

  26. Now, this is interesting:

    support.microsoft.com/.../75490

    Dr. Watson, once used in Windows 3.0, turned Unrecoverable Application Errors into Recoverable Application Errors and added the ignore button. Hence, your "bonus trivia" at the bottom.

  27. anonymouscommenter says:

    > Fleet Command said:

    > As far as I remember, GPF came to be in Windows 3.1. Windows 3.0 generated Unrecoverable Application Error.

    That's also my recollection.

  28. anonymouscommenter says:

    @Gabe: "EVERY. SINGLE. FUNCTION."

    Yes, I've seen this too, and I always wondered who writes this kind of code. Until one day, I spoke with one developer who argued such code was the best possible code one could write. Unfortunately, that was probably not the worst of all problems he had with his brain.

  29. Dave Bacher says:

    There is a JavaScript library on GitHub with an impolite name that recursively loads the JavaScript on the page, removing whatever parts error out.  This reminds me a lot of that.

    As a side note, either option was a sign a blue screen was coming -- and so regardless of which button you hit (often times repeatedly), your next step should be to shut down to DOS, then restart the operating environment.  Any time you started receiving GPFs, more bad news was invariably on the way.

  30. anonymouscommenter says:

    @DavidStein I'd expect that a large fraction of crashes are from non-critical code. Sure, clicking "ignore" might cause your window to be drawn incorrectly or something.

    I'd also expect there was less critical Windows code around when 3.1 was new.

  31. @Nitpicker (Corner?): Actually, saying "it was UAE not GPF" has more implications than just nitpicking on a name (or the fact that both messages above are from Windows 3.1). Back then, UAE was just a plain worthless message that wasn't even worth seeing. GPF, however, showed the segment and offset address. Back then, it was a troubleshooting clue; Windows 3.1 came with Dr. Watson built-in.

    Those days were bad days. All those conventional memory, upper memory area, extended memory, expanded memory and worst of all, Lotus-Intel-Microsoft specifications!

  32. anonymouscommenter says:

    >Until one day, I spoke with one developer who argued such code was the best possible code one could write.

    But he missed an important opportunity for optimization.

    Having adopted a programming style in which it was made explicit that it didn't really matter whether you executed any given statement or not, you can take the next step and delete all statements.

  33. anonymouscommenter says:

    "Why didn't the code have to know how to ignore jump instructions and conditional jump instructions?"

    For an even more obvious reason than everyone else has given: they aren't memory operations, string memory operations, or selector loads, by the definition of this handler. The 8086 family has never had r/m forms of the conditional jump instructions. As for an indirect unconditional jump, it isn't really a "memory operation" as such. Besides, there is no way to "ignore" that, so it simply isn't handled by this handler.

  34. anonymouscommenter says:

    > Fleet Command said:

    > Actually, saying "it was UAE not GPF" has more implications than just nitpicking on a name

    > (or that fact that both messages above are from Windows 3.1.) Back then, UAE was just a plain

    > worthless message that wasn't even worth seeing. GPF, however, showed segment and offset

    > address.

    Strange thing: I was sure the UAE pop-up had some sort of debug info in it too. I was already tempted to jump into my DOS VM and take a looksie - now I will, for sure, install Windows 3.0 and try to crash it (shouldn't be hard :)).

  35. anonymouscommenter says:

    @Gabe: It is perfectly sensible in C++ except it shouldn't be needed since no one should be throwing exceptions in the first place, but in case someone does the best you can do is log it and terminate, if you don't wish to terminate, or the idiot throwing exceptions was truly idiotic and did it when it was exceptional (should have crashed), then you just log it and continue. You can hunt down the idiot using exceptions in C++ later (with a pitchfork and mob).

  36. anonymouscommenter says:

    "Exercise: Why didn't the code have to know how to ignore jump instructions and conditional jump instructions?"

    Because the ignore option does not even become available in those cases as for the conditions listed?

    "The faulting instruction is one of the following, possibly with one or more prefix bytes:

    Memory operations: op r, m; op m, r; or op m.

    String memory operations: movs, stos, etc.

    Selector load: lds, les, pop ds, pop es."

    [I guess I didn't phrase the question clearly enough. "Why weren't jump instructions on the list?" -Raymond]
  37. anonymouscommenter says:

    Allan: In my case, every single function body is wrapped in a single try{}. As far as I can see, there is no thought given to whether exceptions are expected or not. If any exception is thrown, it gets logged without a stack trace and the function returns as if nothing abnormal happened.

    Say you have a constructor that is supposed to assign valid objects to A, B, and C using this code: this.A = GetA(); this.B = GetB(); this.C = GetC(). If there is an exception in GetB(), the constructor will return with A and C having valid values but B being null (or an invalid value, depending on how GetB was written). This may cause data corruption or random breakage in unrelated parts of the app.

    While this app happens to not be C++, I can't imagine any language where this sort of anti-pattern is acceptable.

Comments are closed.
