Passing the incorrect object type for a handle, how bad is it?

A customer asked a somewhat strange question: "We are calling Set­Event but passing the handle to a waitable timer. Application Verifier reports, "Incorrect object type for handle." But the code works well. We want to know the risks of passing the wrong object type to Set­Event. Is the recommendation only to pass handles of type "Event" to Set­Event?

Let's answer those questions in reverse order.

Yes, the recommendation is only to pass handles of type "Event" to Set­Event, just as the recommendation is only to pass handles of type "Semaphore" to Release­Semaphore, and more generally, only to pass valid parameters to functions.

What is the risk of passing the wrong object type? You're lucky that the kernel does object type validation before proceeding, so your error is caught during parameter validation and the function fails with the error ERROR_INVALID_HANDLE (or status code STATUS_OBJECT_TYPE_MISMATCH, if the function returns status codes instead of error codes).

Of course, if you are encountering this problem only because you are using a handle after closing it (and then the handle got recycled as a timer handle), then you merely got lucky. Maybe tomorrow you won't be so lucky, and the handle will get recycled as another unrelated event. Tomorrow, your Set­Event call will succeed and set some other guy's event. This will probably cause that other guy to get really confused. "This event is set when the modulator has finished calibrating. But the event is getting signaled before the calibration is complete, so my code ends up using an uncalibrated modulator! I set a breakpoint on my Set­Event call, and it never fires, yet the event is set. Help me debug this. I've spent a week trying to figure out what's wrong!"

As to the final remark, "But the code works well," it's not clear what the customer meant by that. What does "works well" mean in this context? Do they mean, "The event is successfully set even though it's not an event"? How can you successfully perform an event operation on something that isn't an event? Or perhaps they mean, "Our code seems to work okay in spite of this mistake." The operative phrase there is "seems to". It may seem to work well, but someday it won't, and at the most inconvenient time.

Comments (24)
  1. alegr1 says:

    That's very childish mentality. "If not following rules doesn't cause crash most of the time, it's OK to not follow rules".

  2. Cesar says:

    …and then a rogue injected DLL patches SetEvent (…/10568872.aspx), expecting it to receive only event handles…

  3. Cesar says:

    OK, leaving the jokes aside: the call crosses a security barrier (the userspace/kernel barrier), so we can reasonably expect the handle type to be checked there. We could postulate a user-only implementation of SetEvent, but that would not work if the handle crosses a security barrier (between processes with different permissions, for instance), so we can reasonably expect it to always cross the userspace/kernel barrier.

    Which means we can reasonably expect it to always return an invalid handle error, without doing anything with the handle. That, then, means the SetEvent call can be replaced by a small stub function which returns an error and does nothing. Given that they do not check the error returned by the function (they only found it via Application Verifier), the whole SetEvent call is a NOP.

    If you have code in your program which is a guaranteed NOP, you're doing something wrong.

  4. bcs says:

    "It's going to cost $10M to fix this and deploy the patch to all the off net costumer sites. We don't know of any failures it's caused. Is this a critical issue or can we deploy it with the next release?"

  5. dave says:

    We need a 'customer-speak' series analogous to the "microspeak" articles.

    "But the code works well" == "We didn't check the return code".

  6. EduardoS says:

    The SetEvent function returns a BOOL (or is documented as such) the error code comes from GetLastError

  7. dave says:

    Yes, and the BOOL value is the 'return code' — the return is encoded as TRUE for success and FALSE for failure.

  8. Paul Z says:

    Raymond, do you ever get the impulse to grab these people by their shirt lapels and shake them back and forth until their head pops off? Seriously, what would cause someone to think this was a good idea?

    "Pray tell, Mr Babbage, if you put the wrong numbers into the machine, will the right answers come out?" "…….."

  9. voo says:

    @Paul Z: The first thing that sprang to mind was the Babbage quote as well. Wouldn't it be nice if some independent panel could just forbid some people to go anywhere near code? That person clearly shouldn't make their living by programming.

  10. Myria says:

    This is why I love the SetProcessMitigationPolicy(ProcessStrictHandleCheckPolicy) mode.  Pass a bad handle, and you crash instead of other undefined behavior.

    Unfortunately, I can't recommend its use in released products: too much third-party software injects DLLs into every process, and taking your application down when another company's product is misbehaving makes you look bad.

  11. Gabe says:

    Myria: I'm not sure exactly how ProcessStrictHandleCheckPolicy works, but I don't think it applies in this situation. It sounds like it raises exceptions for values that aren't handles in your process (STATUS_INVALID_HANDLE), whereas this situation just involved passing a valid handle of the wrong kind (STATUS_OBJECT_TYPE_MISMATCH).

  12. Gabe says:

    If they're passing a handle to a WaitableTimer to SetEvent, then they're probably trying to manually trigger a periodic timer. My guess is that the code works well because the timer automatically fires before they notice that the manual trigger failed.

  13. Joshua says:

    SetEvent(INVALID_HANDLE_VALUE) is still safe, right? (Yes I know it should error, that's not the point.)

  14. bcs says:

    "It's going to cost $10M to fix this and deploy the patch to all the off net costumer sites. We don't know of any failures it's caused. Is this a critical issue or can we deploy it with the next release?"

    It sounds critical to me.

    What you need to do urgently is identify what causes it and how to fix it. You can then try to guess how likely it is that you'll SetEvent the wrong event and whether the cost of fixing those situations where it does is more or less than the $10M.

    None of those questions can be answered by Microsoft.

    Microsoft aren't going to give you gambling tips. The developers should know more about how the software works than anyone, if they don't then you have no hope and this problem isn't likely to be your largest one.

  15. alegr1 says:

    About the first thing the kernel does for APIs that supply a handle, it calls ObReferenceObjectByHandle. One of the arguments is an object type. If the handle refers to the wrong object type, the function fails right there.

  16. Dave Bacher says:

    They're likely right that it works well.

    Case 1:  Timer goes off before SetEvent — Wait resumes and does whatever.  This is an expected case, and so it probably works.

    Case 2:  Timer goes off after SetEvent.  While SetEvent failed to signal the waiting handle, when the timer goes off, it will do so.  This will be indistinguishable to the code in the Wait statement from if SetEvent had actually done the signal.

    And so the sole impact is…

    …the code takes longer to run.

    It will pass any unit test of the two threads independently.

    It will pass any unit test of the two threads together.

    It'll just be a little slow.

    I'd define that as "works great," depending on how close the timer is set to the length of the operations being defined.

    Which then, leads into their question…

    So they have this app that works great and passes their tests, but App Verifier is saying "wait, wait, wait, you made an error here."  And so they're asking Microsoft, because everything they expect would happen if the code worked is actually happening.  Their, incorrect but consistent with most of the Windows API, thought was that the timer event derived from AutoResetEvent or w/e, and then just added some functionality to it — and therefore they should be able to call a base class method in order to signal it.

    Most places in Windows actually allow that — this spot doesn't.

    And so they're effectively asking why they can't pass a derived class to a function expecting a base class, because they've misunderstood the class graph in this region of the API.

    I think there's a world of difference there between that, and passing — for example — outright bad data.  Because it's a C API, the compiler and headers are limited in how much help they can give you in catching this.  But effectively, it's just a misunderstanding of a class graph, and not some dire programming error revealing the original requestor's lack of an understanding of basic computer science.

  17. Eric says:

    @Cesar: "If you have code in your program which is *at best* a NOP, you're doing something wrong."

    Raymond has a gift for describing what seems like a harmless bug in the first paragraph, then explaining why it's a true WTF in the next 3-4.

  18. Matt says:

    Worth noting that the SetEvent API *happens* to call into the kernel and is thus safe, rather than being *documented* as being safe when passed an invalid handle.

    It is entirely plausible that a future version of Windows will, in fact, do user-mode caching of data in such a way that passing the wrong handle value as a parameter will cause memory corruption in the process.

    It is undefined behavior to send a non-live Event parameter to SetEvent.

    And when you invoke undefined behavior, possible outcomes include "it seems to work correctly", "it sets some other guy's events" and "it causes a memory corruption, and the clever hacker on the network that triggered it uses that memory corruption to take control of your process, install malware and take your credit card numbers".

  19. Count Zero says:

    @Paul Z – At first I was almost sure that what happened is not what you assume to be happened. I have seen situations at one of my former workplaces  where sending a patch/bugfix out was practically not possible (because of reasons I would(/could) not describe here), and when an error like the described is found in the sources – which was quite rare, but occured once or twice – the PM asked all developers to predict what problems could an error like that cause IRL. I instantly thought that this question is indicated by a scenario like that until I have read the comment of @Gabe about "they're probably trying to manually trigger a periodic timer", a case where the above question only shows incompetence.

  20. Somebody says:

    Were you amazed that someone actually used Application Verifier?

  21. *blinks*

    Things like this make me think I'm actually a much better programmer than I assume I am normally.

  22. Goran says:

    Ah, the "works by accident", most vicious of all kinds of "it works"…

  23. John Doe says:

    "Say, if I provide wrong input, will I still get the right output?"

  24. JM says:

    @Dave Bacher: even if you wanted to reframe things in terms of classes, this doesn't get around the fact that a waitable timer is not an event, or vice versa. They both happen to be synchronization objects, that's all. I really don't see the difference between "misunderstanding a class graph" and "passing a non-event to SetEvent()", other than that the latter sounds like the fundamental error it is. If anything, it shows that pretending that an API works a certain way and letting ideas mushroom out of that is an impediment, not a substitute for actually understanding it.

Comments are closed.

Skip to main content