Function requirements are cumulative: If you fail to meet any of them, then all bets are off


A customer was having problems with the Wait­For­Multiple­Objects function:

We are looking for a clarification of the behavior of Wait­For­Multiple­Objects. We have a thread that waits on two handles (call them Handle1 and Handle2) with Wait­For­Multiple­Objects, bWaitAll = FALSE. Under certain conditions, we signal Handle1 and close Handle2 from another thread while the wait is in progress. This results in WAIT_FAILED being returned from the wait call. MSDN is not clear on what the expected behavior is here. On the one hand, it says

  1. When bWait is FALSE, this function checks the handles in the array in order starting with index 0, until one of the objects is signaled. If multiple objects become signaled, the function returns the index of the first handle in the array whose object was signaled.

This description implies that the wait should never fail, because the function should have noticed that Handle1 is signaled before it got around to noticing that Handle2 was invalid.

On the other hand, the documentation also says

  1. If one of these handle is closed while the wait is still pending, the function's behavior is undefined.

What behavior is guaranteed here?

Once you violate a constraint (in this case, the constraint that the handles remain valid for the duration of the wait operation), you have failed to meet the prerequisites and the behavior is undefined. You can't say, "Well, sure you say I can't do X, but if you follow exactly the steps given in this description of how signaled objects are selected, then the function wouldn't even have noticed X before coming to a decision, so the fact that I broke one of the rules is irrelevant!"

The description of the behavior of the Wait­For­Multiple­Objects function when bWait is FALSE is not an implementation specification. It's a description of how to interpret the behavior of the function. The algorithmic way the function behavior is described is to assist in understanding the behavior; it doesn't mean that the function actually follows the algorithm step-by-step. (In reality, there is no polling loop, as the algorithmic description implies. All the handles are waited on simultaneously. It's like Lebesgue integration: You integrate over the entire domain at once.)

An algorithm-free description of the behavior when bWait is false might go like this:

  1. When bWait is FALSE, the function does not return until one of the handles in the array becomes signaled. The return value is the smallest index corresponding to a signaled handle.

This description is purely declarative but gives you no mental model.

It's like saying that "Water seeks its own level." Water doesn't have a will that compels it to behave in a certain way, but describing the behavior in that way makes reasoning about water easier.

Comments (33)
  1. Adam Rosenfield says:

    Ooh, Lebesgue integration (pronounced le-BAYG).  I was not expecting a reference to that.  I never did get around to studying it, I think it was a couple chapters past the last chapter of Rudin that we studied in my analysis class.

  2. Brian G. says:

    Your analogies never fail to be clever and instructive.

  3. Barbie says:

    I might be in the minority, but I do find the second description clearer.

    Oh, and Lebesgue has no 'AY' in the pronunciation. It's just "ləˈbɛg". With lə as in Pulp fiction's "Le big-mac", and bɛg as something close to the English beg.

  4. Adam Rosenfield says:

    @Barbie: That's the correct French pronunciation, but all of the mathematicians that I've heard say it have pronounced it lə'beg (le-BAYG).

  5. Alexey says:

    Would be fun to see you work a tensor theory analogy into a future post.

  6. Voo says:

    @Adam Rosenfield: So all mathematicians you've heart so far, don't speak french and therefore pronounce it however they see fit? ;) It's named after the french mathematician of the same name, so why should it be pronounced in some completely arbitrary variant?

    [And this is surprising? Chern classes are named after the Chinese mathematician Shiing-Shen Chern. The 'r' is silent, but in English, everybody pronounces it with the 'r'. Weierstrass is pronounced with a "w" instead of a "v". The only part of Noether that people pronounce authentically is the n. -Raymond]
  7. Jim says:

    I still think that's a poorly-written description, because it sounds like it's describing the implementation.  It implies an ordering behavior which doesn't exist.  To me, the "purely declarative" description is much clearer, and wouldn't lead to as much confusion when combined with the rule about undefined behavior.

  8. Joshua says:

    XP SP2, single core has worse behavior. Closing a handle waited on by one thread from another thread -> uninterruptable deadlock. Is this behavior a bug? Yes. Is there a defined behavior? No.

  9. In my understanding of Lebesgue integration you integrate over the range rather than the domain (that is to say, in your algorithmic loop, the range "varies slower".)

    [In Riemann integration, you study individual pieces of the domain and combine the local results. But in Lebesgue integration, you study the function's behavior over the entire domain at once (by studying its behavior in the range). -Raymond]
  10. laonianren says:

    Of the differing pronunciations of Niklaus Wirth's surname it is said that Europeans call him by name but Americans call him by value.

  11. Joe says:

    "All the handles are waited on simultaneously."

    Not exactly. In practice they are waited on sequentially. Thus if you have two events, with event[0] being signalled frequently and event[1] less frequently, you may never find out event[1] was signalled from just the return from a WaitForMultipleObjects call.

  12. Alex Grigoriev says:

    @Joe:

    Yes, all the handles are waited on simultaneously. Cost of wakeup by any handle is the same. The wait list is processed sequentially, because the function spec says it will pick the first signalled object in the array.

    WFMO function in the kernel first builds the wait list by resolving the handles to the object pointers. The current thread is added to each object's wait list. When an object is signalled, its wait list is scanned to find out what threads should wake up. For each thread, its array of wait objects will be checked to find out if the wait condition is satisfied.

  13. rs says:

    In my understanding of Lebesgue integration you integrate over the range rather than the domain.

    The point with Lebesgue integral is that it's defined only if it gives the same value whichever way you compute it. You can split the area below the graph horizontally, vertically or in any other way you like (into countably many pieces). That's why improper Riemann integrals (which depend on the order of the domain) need not exist as Lebesgue integrals.

    (Sorry for being off-topic, but when people start talking about Lebesgue integration, I just can't resist.)

  14. Bob says:

    @Walker: Nit2:  You describe single-threaded processors.  Multi-threaded processors can indeed execute multiple threads simultaneously in one processor.

  15. Aleksander Oven says:

    Would it be ok to signal the second handle and then close it immediately, without waiting for WaitForMultipleObjects to notice it's been signaled? I'm never sure about this, even with .NET's WaitHandles…

  16. LR says:

    @Aleksander Oven: Why do you think so? WaitForMultipleObjects expects all handles to be valid from entry to exit. The "signaled" state has nothing to do with that requirement.

    msdn.microsoft.com/…/ms687025%28v=vs.85%29.aspx  "If one of these handles is closed while the wait is still pending, the function's behavior is undefined."

  17. Jeremy Price says:

    Lebesgue integration?  Chern classes?  Sometimes you scare me.  Tomorrow can you compare NOPs to trivial holonomy groups of Riemannian manifolds?

  18. SQL cursor analogy left to the reader.

  19. David Walker says:

    Nit:  Single-processor computers don't actually *implement* anything by doing anything simultaneously, since there is only one instruction pointer in the processor!  Everything is done one at a time although there is a lot of task switching.  Conceptually, the system can act like some things are done simultaneously, but in practice, the code runs one instruction at a time.  (Multiple instruction decoding pipelines in a single-processor core don't change this.)  I'm just sayin'.  :-)

  20. Almanac says:

    @Joshua Yeah you've got the definition of "undefined"

  21. Zan Lynx says:

    @Walker: Nit: Superscalar processors do indeed execute multiple instructions at once. Then they go to great effort to make it *appear* as if they did it one at a time, even going so far as to discard all the successful work of other instructions if a previous one failed or if there was an interrupt for some reason.

  22. Goran says:

    @Joshua. Ugh, dude! Closing a handle you are waiting on is the original bug if you ask me, and whatever consequence you might have later on are tangential. Yes, I know there's a lot of code written in that manner, but that doesn't make code right.

  23. Simon Robinson says:

    How likely is it that two handles could become signalled simultaneously? On a single core machine that would seem to be impossible, on a multicore machine, it would seem at best extraordinarily unlikely. So if that scenario (almost) never happens, then how often is WaitForSingleObject ever going to be in a situation of needing to select from multiple signalled handles? I'm assuming that scenario must be able to happen, otherwise there would be no point the documentation referring to it.

    On a single core machine I would guess that the answer is, it can happen if one thread causes two handles to become synchronized within the same timeslice, but that's a bit of a guess. Does the OS run a timeslice to completion or does it kill the timeslice immediately on detecting that a handle is signalled? If the latter than it would be impossible for WaitForMultipleObjects() to have more than one signalled handle. And I've no idea how that all translates to a multicore machine.  

  24. Neil says:

    Now you've got me wondering whether Raymond's surname was derived from Chern.

  25. Adam Rosenfield says:

    @Simon Robinson: Easy, just signal the handles before calling WaitForMultipleObjects.  The following call to WaitForMultipleObjects will always return 0:

    HANDLE h1 = CreateEvent(NULL, FALSE, TRUE, NULL);

    HANDLE h2 = CreateEvent(NULL, FALSE, TRUE, NULL);

    HANDLE h[2] = {h1, h2};

    WaitForMultipleObjects(2, h, …);

    [Or the thread could be temporarily pulled out of its wait state by a kernel APC, and when it returns to the wait state, it finds that both handles have been signaled. -Raymond]
  26. Alex Grigoriev says:

    @Simon:

    Change of object state (SetEvent, etc) modifies the target thread state (from waiting to ready) immediately, in the course of the call. When that thread will be picked for execution, is another matter.

  27. Mike Dimmick says:

    Simon, Alex Grigoriev: my understanding is that before Windows 7/Server 2008 R2, all manipulation of notification objects is done under the single system dispatcher lock. It's a hard problem to do it in a way that avoids deadlock. Therefore you cannot signal two events 'simultaneously' even on a multi-core system, because the first attempt to do so gets the dispatcher lock and the second one waits for it. (The multiprocessor specifications guarantee that one will be 'first' even if the processors genuinely do manage to generate the memory access at the same time.)

    If changing the state of an object makes a thread runnable, it will pre-empt a thread that's running at a lower priority (allowing for any dynamic priority boost from the wakeup – waiting on an event gets a boost of 1 and the foreground process gets an extra boost). That might be on this CPU or another one depending on CPU affinity mask, preferred CPU and the priorities of any running threads.

    A goal for Windows Server 2008 R2 was to exceed 64 cores – in large systems much CPU time was wasted spinning on the dispatcher lock. Thread scheduling was redesigned to remove the dispatcher lock. I don't really understand it yet (Windows Internals 6th Edition isn't out yet) but in the meantime, try Arun Kishan's conversation with Channel 9: channel9.msdn.com/…/Arun-Kishan-Farewell-to-the-Windows-Kernel-Dispatcher-Lock

    Joshua: Because of the way thread scheduling worked in Windows XP, as described above, I have a hard time believing that closing an event handle that is being waited on causes a deadlock. If it did I would expect it to cause a complete system deadlock, not just deadlock one process.

  28. Alex Grigoriev says:

    @Mike D:

    TL;DV. I understand that the Big Fat Lock was replaced by smaller locks. Perhaps a lock per thread, per node, or like that. Cross-node locking is really bad, because it needs to fire those cache management messages across. If you can limit your lock to a single node, that really helps.

  29. Dmitry Kolosov says:

    @Raymond: "When bWait is FALSE, the function does not return until one of the handles in the array becomes signaled. "

    This is not a specification. It says what the function DOES NOT DO ("does not return until"), but says nothing about what function DOES ("returns when…")

    [Would you prefer "suspends execution" instead of "does not return"? Because the point of the function is to not return until… "Returns when" leaves open the possibility that it may return for other reasons too. -Raymond]
  30. Dmitry Kolosov says:

    @Alex Grigoriev:

    If change of object state modifies the target thread immediately, then it should be perfectly OK to signal Handle 1 and close Handle 2 (as customer wanted). "The constraint that the handles remain valid for the duration of the wait operation" is not violated by closing Handle 2 because wait operation IS OVER as soon as Handle 1 is signaled.

  31. ficedula says:

    @Dmitry: Eh, not necessarily. While the wait is certainly over from the kernel's perspective (the thread has been marked as ready to run again), from the perspective of user code you could describe the wait as still "pending" – because WFMO still hasn't returned.

    I suppose you could argue for the documentation saying "If one of these handles is closed before WFMO returns" instead of "while the wait is still pending", but it's a bit nitpicky; the documentation is meant to help you understand how to call the function correctly, not act as a set of definitions to be rules-lawyered around.

  32. Alex Grigoriev says:

    @Dmitry, ficedula:

    The devil is in the "thread stack hijacking". When the kernel needs a thread to execute a kernel APC, the waiting thread may be taken from the "not ready" list. When it happens, all references to wait objects are returned. When the APC is done, the handles are used again to build the wait array. If a handle has become invalid, the function fails.

    It may happen that you signalled the event and closed the handle while the APC was executing. FAIL.

  33. ficedula says:

    @Alex: Right, the point is that the rule is basically "don't close a handle you've used in a WFMO call until WFMO returns". Which seems reasonable to me, and while you could argue that the documentation for WFMO could be worded better I don't think it's really that misleading as it stands; most people are going to read "don't close a handle while the wait is pending" as basically equivalent to "don't close a handle until WFMO returns".

Comments are closed.

Skip to main content