One way people abused hooks in 16-bit Windows


We saw last time how windows hooks were implemented in 16-bit Windows. Even though the HHOOK was an opaque data type that should have been treated like a handle, many programs "knew enough to be dangerous" and took advantage of the fact that the HHOOK was just a pointer to the previous hook procedure.

The most common way of abusing this knowledge was by unhooking from the windows hook chain the wrong way. Instead of calling the UnhookWindowsHook function to unhook a windows hook, they called SetWindowsHook again! Specifically, they removed their hook by simply reinstalling the previous hook at the head of the chain:

HHOOK g_hhkPrev;

// install the hook
g_hhkPrev = SetWindowsHook(WH_KEYBOARD, MyHookProc);
...

// crazy! uninstall the hook by setting the previous hook "back"
SetWindowsHook(WH_KEYBOARD, g_hhkPrev);

This code worked in spite of itself; it's as if two wrongs made a "sort of right". If nobody else messed with the hook chain in between the time the hook was installed and it was subsequently "uninstalled", then reinstalling the hook at the head of the chain did restore the chain variables in the same way they would have been restored if they had uninstalled the hook correctly.

But if somebody else installed their own WH_KEYBOARD hook in the meantime, then setting the previous hook "back" would have the effect of not only "uninstalling" the MyHookProc but also all other hooks that were installed in the meantime. (This is exactly the same problem you have if you aren't careful in how you remove subclassed window procedures.)

I still have no idea why they used this strange technique instead of doing the right thing, which is just swapping out one line of code for another:

UnhookWindowsHook(WH_KEYBOARD, MyHookProc);

Windows 3.1 introduced the SetWindowsHookEx/CallNextHookEx model, which doesn't use the external linked list technique but rather manages the hook chain internally. This protected the hook chain from programs that corrupted it by mismanaging the external hook chain, but it meant that when these crazy programs tried to unhook by hooking, they ended up corrupting the internal hook chain. Special code had to be written to detect these crazy people and turn their bad call into the correct one so that the hook chain wouldn't get corrupted.

Comments (11)
  1. And where else have we seen this (bad) practice?

    Why, under DOS, when installing ISRs for background processing. Which just goes to show that people never, ever, EVER learn.

  2. nksingh says:

    Well, it’s never ever the same people every time.  It’s like Nietzsche’s Eternal Recurrence of inexperienced (too smart for their own good) programmers.  Also, old programmers never die, they just move up the management hierarchy.

  3. Joseph Bruno says:

    There was also the problem that there was a bug in the debug kernel of Windows 3.1 that made it crash if you used SetWindowsHookEx. The only cure was to go back to the deprecated SetWindowsHook.

  4. Kevin says:

    This is all well and good, but how does one restore GDI objects? By calling SelectObject and passing the old value, of course. So is it so hard to understand why programmers get confused? I’m not blaming MS in particular here, but this is a common problem when there are inconsistencies across an API. In this case SetWindowsHook should not have returned something that was a valid parameter for a subsequent call (although given the rampant casting required under Win16 that wouldn’t have stopped some people from still trying!).

    [Um, but there can be multiple hooks, but only one bitmap selected into a DC. Different models. You don’t “chain to the previous bitmap” in a DC. -Raymond]
  5. Coderjoe says:

    Right, but inexperienced programmers in a rush could wind up with the mistaken impression that there is only one hook, or something, and treat it like the do GDI objects.

  6. Shog9 says:

    Joe/Kevin:

    You know, most of us have probably seen and/or made enough newbie mistakes to recognize the strange logic that goes into making them.

    It doesn’t make them any less stupid.

    Now, misunderstanding the difference between selecting a bitmap into a DC and selecting anything else…

  7. Merit says:

    I think it has to do with the requirement that you save the old value in a global variable.  Without stopping to think about it pretty hard thats a really strange requirement, so people might have assumed that you needed to keep it around in order to do the unhook.

  8. Kevin says:

    [Um, but there can be multiple hooks, but only one bitmap selected into a DC. Different models. You don’t “chain to the previous bitmap” in a DC. -Raymond]
    But will a programmer writing his first (and only one in that program) windows hook necessarily realize that? Would it not be better to design API calls, where possible, so that this kind of mistake is impossible or at least unlikely, even for programmers who might miss a detail in the docs? The tone of some of these comments is starting to remind me of cases of “blame the user”, which I believe Raymond’s written about in the past. In this case it’s “blame the user of this API for not understanding the model the author had in mind, when there’s an incorrect but superficially similar model in the same general space (ie Win16 development) that the user has probably been using all along”. Whose fault is that?

    [You’d think a function called CallNextHook would be an extremely strong indication that there can be multiple hooks. Setting the current object into a container and installing a hook into a container are very different concepts. It actually surprises me that people think they’re related… -Raymond]
  9. Norman Diamond says:

    I still have no idea why they used this

    > strange technique instead of doing the right

    > thing

    Although I know (after taking too long to learn this) not to blame you for two possible reasons, nonetheless there are two pretty obvious possible reasons.

    1.  Not everyone was a Win16 programmer, no problem.  Some people who weren’t Win16 programmers were learning to be Win16 programmers, no problem.  Some people who weren’t Win16 programmers were doing Win16 programming in products to be released, without anyone else checking and fixing their work, oops.

    2.  Some people don’t always have time to experiment to see which 75% of MSDN is correct, so they skip MSDN and just do the experiments.

    I also agree with Merit’s suggestion that the use of a global variable provides some pretty powerful intuition that the global variable was intended to be used that way, and then add the fact that it seemed to work sometimes…

  10. Mihai says:

    I think it was inspired by DOS and TSR.

    The only way to unhook an interrupt was to set the previous interrupt address.

    The rigth way was to GetInterrupt (ah=35h), compare it to the saved interrupt address, and not uninstall yourself if different, only set a flag to disable processing in your own interrupt routine.

    But I have seen enough applications that just restored the original interrupt handler, "killing" other TSR or even the system.

    Fun times :-)

  11. Ben Hutchings says:

    Well, they would have to had to know about UnhookWindowsHook, and that’s not an obvious counterpart to SetWindowsHook. The name SetWindowsHook doesn’t suggest that any counterpart exists (except maybe GetWindowsHook). (Further thought would lead to the realisation that this can’t work for out-of-order removal, but there are plenty of hook mechanisms that are broken in that way.) The lesson I would draw is that functions that attach and detach functions from hooks should be done using names that suggest an inverse function exists. For example, Attach/Detach, Install/Remove, Add/Remove, Register/Unregister, Subscribe/Unsubscribe.

Comments are closed.

Skip to main content