What does an invalid handle exception in LeaveCriticalSection mean?


Internally, a critical section is a bunch of counters and flags, and possibly an event. (Note that the internal structure of a critical section is subject to change at any time—in fact, it changed between Windows XP and Windows 2003. The information provided here is therefore intended for troubleshooting and debugging purposes and not for production use.) As long as there is no contention, the counters and flags are sufficient because nobody has had to wait for the critical section (and therefore nobody had to be woken up when the critical section became available).

If a thread needs to be blocked because the critical section it wants is already owned by another thread, the kernel creates an event for the critical section (if there isn't one already) and waits on it. When the owner of the critical section finally releases it, the event is signaled, thereby alerting all the waiters that the critical section is now available and they should try to enter it again. (If there is more than one waiter, then only one will actually enter the critical section and the others will return to the wait loop.)
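
In pseudocode, the shape of that logic looks roughly like this. (Field names follow the RTL_CRITICAL_SECTION declaration in winnt.h, but this is a simplification for illustration only; as noted above, the real algorithm is internal and subject to change.)

```
EnterCriticalSection(cs):
    if interlocked claim of cs->LockCount succeeds:
        cs->OwningThread = current thread       // uncontended: no event needed
    else if cs->OwningThread == current thread:
        cs->RecursionCount++                    // re-entry by the owner
    else:
        create cs->LockSemaphore if it does not exist yet
        wait on cs->LockSemaphore               // block until the owner leaves
        retry

LeaveCriticalSection(cs):
    if owner has re-entered: decrement cs->RecursionCount and return
    cs->OwningThread = none
    if interlocked release of cs->LockCount says waiters exist:
        SetEvent(cs->LockSemaphore)             // the invalid handle exception
                                                // is raised here if the
                                                // handle is no good
```

Note that the event handle is touched only on the contended paths, which is why the exception shows up intermittently: an uncontended enter/leave pair never uses it.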

If you get an invalid handle exception in LeaveCriticalSection, it means that the critical section code thought that there were other threads waiting for the critical section to become available, so it tried to signal the event, but the event handle was no good.

Now you get to use your brain to come up with reasons why this might be.

One possibility is that the critical section has been corrupted, and the memory that normally holds the event handle has been overwritten with some other value that happens not to be a valid handle.

Another possibility is that some other piece of code passed an uninitialized variable to the CloseHandle function and ended up closing the critical section's handle by mistake. This can also happen if some other piece of code has a double-close bug, and the handle (now closed) just happened to be reused as the critical section's event handle. When the buggy code closes the handle the second time by mistake, it ends up closing the critical section's handle instead.

Of course, the problem might be that the critical section is not valid because it was never initialized in the first place. The values in the fields are just uninitialized garbage, and when you try to leave this uninitialized critical section, that garbage gets used as an event handle, raising the invalid handle exception.

Then again, the problem might be that the critical section is not valid because it has already been destroyed. For example, one thread might have code that goes like this:

EnterCriticalSection(&cs);
... do stuff...
LeaveCriticalSection(&cs);

While that thread is busy doing stuff, another thread calls DeleteCriticalSection(&cs). This destroys the critical section while the first thread is still using it. Eventually that thread finishes doing its stuff and calls LeaveCriticalSection, which raises the invalid handle exception because DeleteCriticalSection already closed the handle.

All of these are possible reasons for an invalid handle exception in LeaveCriticalSection. To determine which one you're running into will require more debugging, but at least now you know what to be looking for.

Postscript: One of my colleagues from the kernel team points out that the Locks and Handles checks in Application Verifier are great for debugging issues like this.

Comments (27)
  1. Tom says:

    I’ve never heard of the Application Verifier before.  Thanks for the tip!  Has anybody had any real-world experience using it?

  2. J says:

    Yeah, my real-world experience with it consisted of this:

    Me:  Hey, our app crashes under Microsoft’s Application Verifier.  You need to figure out why and fix the problem.

    Other developer:  Come on, you can’t expect our application to work flawlessly under every possible situation.  We only need to fix things our internal testers can find.

    Me:  Oh how I hate you.

    (and I debugged and fixed the problem when I was moved full time to the project)

  3. Stu says:

    So, just out of interest, why is it possible to destroy a critical section while it is in use?

    Is there ever a valid reason to do it? To me it seems like something you would never do, like closing a file while another thread is reading/writing it. Is that even possible?

  4. Norman Diamond says:

    > This can also happen if some other piece of code has a double-close bug, and the handle (now closed) just happened to be reused as the critical section’s event handle.

    Ouch.  In some programs I declared a boolean together with each file, and if error processing closed a file then I set the boolean to FALSE in order to know later not to close the same file again.  But in some programs I was lazy and if error processing closed a file then other processing might just observe and discard an error return from closing the same file again.  So in fact it’s a serious problem if later processing doesn’t get an error return because it closed some innocent victim instead, and the innocent victim is something that the program didn’t even know about.

    The double close bug needs a lot more emphasis.  It differs vastly from most other operating systems where a double close of a file has no effects beyond the file.

    [*ix has the same double-close problem. It too recycles file handles. -Raymond]
  5. Roman Belenov says:

    “One possibility is that the critical section has been corrupted, and the memory that normally holds the event handle has been overwritten with some other value that happens not to be a valid handle.”

    IMHO it’s worth mentioning that this corruption can be done by a perfectly legal call to EnterCriticalSection in some rare circumstances.

    http://www.bluebytesoftware.com/blog/PermaLink,guid,db9f8f5b-8d1d-44b0-afbd-3eadde24b678.aspx

    [But only if you tried to catch the exception. Win32 exceptions are, as a general rule, not safely catchable since the code in between is rarely exception-safe. This is just another example. -Raymond]
  6. Arlie Davis says:

    The double-close problem is actually *worse* on UNIX because the UNIX handle table will re-use the lowest unused fd index, which means that fds will be reused very, very quickly.  NT does not recycle object handles nearly as quickly, so the problem is less likely to occur on NT.

    It’s still a serious, serious flaw in any application, of course, and should always be fixed.

  7. Norman Diamond says:

    > *ix has the same double-close problem. It too recycles file handles.

    But only if the program opens another file between the two closes.

    Monday, December 11, 2006 10:04 PM by Arlie Davis

    > The double-close problem is actually *worse* on UNIX because the UNIX handle table will re-use the lowest unused fd index, which means that fds will be reused very, very quickly.

    But fds won’t be reused for critical sections or tons of other stuff, they’ll only be reused for files (and pipes and stuff that get accessed as files).

  8. Tal says:

    I’ve used Application Verifier just a little bit. Gflags.exe was more expressive.

    In any case you will need a debugger attached to see the results, and probably do more analysis, (so learn how to use WinDbg…)

  9. LarryOsterman says:

    Norman, the double close problem on Windows is exactly the same as the double close problem on *nix, and has exactly the same pre and post conditions.

    What I don’t know is if *nix has a global handle manager or if each of the handles to objects is managed by a separate subsystem – if the handles are managed by unrelated subsystems, then that causes a whole other set of problems (if handle number 17 is used for both a critical section and for a file, hilarity can ensue when handle 17 is handed to the wrong close API).

  10. Arlie Davis says:

    James: UNIX "file" handles have not just been "file" handles for a long, long time.  They are, conceptually and in real-life use, exactly the same as object handles on NT.  You’re quibbling about details of the behavior of a class of bugs (double-close) on different platforms.  But it’s all the same bug, with all the same effects!

    Your "liquid dispenser" analogy is idiotic and wrong.  If you look at the Windows API functions that take objects as parameters, they DO take the type of the object into account.  Every function validates the type of object that it is acting on; if you call ReadFile on a semaphore, the call will fail with a "wrong object type" error.

    There are only a handful of functions that can work on more than one object type.  The WaitFor* functions, and CloseHandle.  This is hardly "reactor coolant" and "baby food".  You are intentionally skewing this discussion to make your pet OS look better, when in reality, it isn’t.  If there were 15 types of CloseHandle functions (CloseFileHandle, CloseXxxHandle), you would probably be complaining that Windows makes you remember too much.

    BryanK: All object handles that user-mode processes can access are bound to that process.  If this was not the case, it would be impossible to create a secure system.

  11. Norman Diamond says:

    Monday, December 11, 2006 11:31 PM by LarryOsterman

    > Norman, the double close problem on Windows is exactly the same as the double close problem on *nix, and has exactly the same pre and post conditions.

    I still don’t believe it.  In Windows the innocent victim can be an event handle that the program didn’t even know about, or surely other kinds of handles that the program didn’t even know about.  In Unix the innocent victim can only be a file identifier, and the only way for it to be a valid file identifier is if the program has opened another file in between the two closes.

    > What I don’t know is if *nix has a global

    > handle manager

    It doesn’t matter.  It could vary by kernel (System V, NetBSD, Linux 2.4.9, Linux 2.4.10, etc., all completely different).  It doesn’t matter what they do internally because the API presented to programs uses small integer file identifiers.  0 is stdin, 1 is stdout, 2 is stderr, 3 is whatever the program opens next, etc.  (I guess 3 doesn’t have to be next but it still doesn’t matter, 3 isn’t going to be a critical section’s event handle.)

    [I don’t get it. You say it’s not the same problem, and then describe the same problem: If you close something twice and somebody reuses the handle in between, then you closed the wrong thing. The details may be different but the problem is the same: Handle reuse. -Raymond]
  12. sergio says:

    "The Windows approach is like a generic ‘liquid dispenser’"

    That was exactly the initial goal of the Unix authors too: to treat as much as possible as a file handle. And it is actually a good concept. For just the simplest example, although not historically the first: to read random bits, you just open a file named "/dev/random" and read from it. Only when unix started to be expanded, the "non file" calls were added in quantities (that’s why you can’t read sockets like files). Windows designers sometimes tried to follow the initial Unix idea (e.g. a COM port really is opened by CreateFile, although reading from it properly is not so easy) but more often they didn’t.

    I would personally be happier if more things were visible "as files".

  13. Nick Lamb says:

    "Only when unix started to be expanded, the "non file" calls were added in quantities (that’s why you can’t read sockets like files)."

    You can (and often do) call read(2) on the handle for a connected socket. You only need something else if your socket isn’t connected or you want extra information on top of the actual transmitted data, such as the return address, out of band messages, error statistics etc.

    If a connected socket is passed as e.g. standard input to an ordinary Unix process, messages that arrive at that socket will be read by the process as its input.

  14. sandman says:

    I’ve never been hit by the double close thing on *ix, but I can’t
    recall getting hit by this on Windows either – or at least not badly
    enough to have difficulty finding the issue.

    I agree with the MS team here that the bug is the same between OSes;
    if anything it should be worse under *ix because of the more frequent
    re-use, but I’ve done a lot of programming under *ix without being hit
    seriously by it.

    However I think it could be really bad in the case Raymond’s describing
    here, as the actual handle is hidden behind a pointer to an opaque (and
    rightly so) data structure. The fact that a critical section contains
    an event handle, and that the hidden handle can be invalid, is a
    problem. Particularly if the error message returned directs the user to
    think the outer (critical section) handle is the invalid one.

    Does Windows return the same error for an invalid cs handle as for an invalid cs->event handle?

    [Critical sections are not handles. They’re just a
    chunk of memory (CRITICAL_SECTION). If you pass a pointer to a bogus
    chunk of memory then undefined things happen. -Raymond
    ]
  15. asampson says:

    Windows 2003? Did I miss a release?

  16. Gabe says:

    Modern versions of Unix use "files" for all kinds of things that aren’t even close to files, such as initializing memory (/dev/zero), generating random numbers (/dev/random), and getting process statistics (/proc).

    It’s entirely possible that a double-close bug in your process statistics code (reading /proc) could cause a failure in your random number generator.

  17. BryanK says:

    > If you close something twice and somebody reuses the handle in between, then you closed the wrong thing.

    But it depends on who “somebody” is.

    In Windows, if another process was the one that had the double-close
    bug, is it possible for your critical section’s event to have been
    closed?  I’d hope not (since that would violate just about every
    principle of keeping processes separate), but I don’t know for sure.

    I know that on *nix, file handles are per-process, so if some other
    process gets 5 back from an open() call, and you’ve done a close(5)
    twice, you won’t corrupt that other process.

    (Of course, even if handles in Windows are per process, you can
    still screw up your own process from another thread.  Same on
    *nix, I believe — AFAIK file handles are shared between threads.
     OTOH, it seems that vastly fewer *nix programs are
    multithreaded…)

    [I’m assuming everybody knows whether handles are global or per-process. -Raymond]
  18. James says:

    Raymond: The difference is that Unix file handles are not “handles”,
    they are “*file* handles”, completely distinct from, say, semaphores.
    For a Unix library to close your semaphore when it meant to close its
    own file, it would have to be calling completely the wrong function,
    not just calling the right function too many times.

    The Windows approach is like a generic ‘liquid dispenser’, with
    numbered nozzles for everything from baby food to reactor coolant, as
    opposed to Unix having a completely separate set of fuel nozzles: you
    might still end up putting diesel in a petrol car by mistake, but you
    won’t be left drinking a pint of Draino instead of the
    similarly-numbered beer you expected.

    It’s funny; I’d expect freeing and allocating blocks of memory to go
    through a single API (malloc/free or equivalent) while files and
    critical sections would be separate (since they have nothing in
    common)…

    Bryan: You’re right; between processes being much “cheaper” and
    greater use of non-blocking I/O, there’s much less need for threads.

    [Yes, I know that they operate only on file
    descriptors, but the core problem is the same – handle reuse. (Files
    and critical sections *are* separate. You can’t CloseHandle a critical section.) -Raymond
    ]
  19. James says:

    Arlie: you have a point that double-closing remains possible, but a file descriptor remains precisely that: a file, not an "object" wrapped around one of a variety of possible things.

    You haven’t identified anything "wrong" about it, nor am I "skewing" anything: yes, attempting the impossible fails, as one would expect (did you seriously expect trying to read data from something which doesn’t contain any to do anything else?!) – but why doesn’t closing something mirror opening it? You don’t call CreateHandle to get an event, you call CreateEvent: why break the symmetry like that? I very much doubt people would complain about having simple and obvious symmetry – CreateEvent/CloseEvent and CreateFile/CloseFile – rather than the present, more complex setup, where operating on an event might involve an Event function or a Handle one depending on the weather.

    By calling both files and events "HANDLE", you lose type checking. Yes, the *OS* knows what’s inside – but now, the compiler doesn’t. I’m no fan of languages like ML, but I still value some type checking.

  20. Jim says:

    asampson:

    Windows 2003? Did I miss a release?

    I don’t know, did you?

    http://www.microsoft.com/windowsserver2003/

  21. Norman Diamond says:

    > I don’t get it. You say it’s not the same problem, and then describe the same problem: If you close something twice and somebody reuses the handle in between, then you closed the wrong thing. The details may be different but the problem is the same: Handle reuse.

    Well, if I think really hard then I can imagine some cases where Unix would open a file and assign the fd number to a process without the process expecting it or knowing about it.  But if I think harder then I forget what I was thinking of.  It still doesn’t seem to be as easy to do as unknowingly having handles opened by Windows APIs.

    By the way notice that I didn’t complain about the design philosophy.  I only asked that the warning about double close bugs be emphasized more widely.  Programs get in real trouble when a duplicate call to CloseHandle *doesn’t* return an error.  For example the MSDN page for CloseHandle should emphasize that programmers must take care never to do a double close.

    [What about double-DestroyWindow? Double-LocalFree? Double-delete? Should they all say “don’t destroy something more than once”? At what point do you stop restating the ground rules for programming? -Raymond]
  22. Mike says:

    Interesting and frustrating problem I had with Application Verifier.  I read about it on your blog, downloaded it, installed it, and (to try it out) picked notepad.exe for the application.

    At that point it told me of the need for a debugger, which I didn’t have installed on this computer, and quit.  I uninstalled the software and forgot about it.

    Today rolls around. I’m on a web site browsing a catalog without a wishlist and I find a bunch of new items I want.  I fire up notepad, go page by page through the items – recording description, cost, etc. until I’m done.  When I go to save the list on my desktop a dialog pops up telling me I’ve hit some sort of breakpoint.  Crap, I think, as I press <Ok>.  Gone is notepad, my list, and my time.

    Works OK now, but boy am I glad I didn’t try App Verifier out on Word (or worse).  I might have lost a day’s worth of work (or a year’s worth of Quicken data).

  23. Norman Diamond says:

    > What about double-DestroyWindow? Double-LocalFree? Double-delete?

    OK, I see your point.

    > At what point do you stop restating the ground rules for programming?

    I see your point there too, but it leads to another odd observation.  Some MSDN pages about APIs restate ground rules about C (often correctly) where I think it would be better to restate ground rules about Windows.  In view of a lot of garbage programs that you and I both have to contend with, a lot of programmers need a lot of ground rules to be restated.  On the other hand a lot of those programmers would never understand the purpose or usage of MSDN pages.  So here you go.  You ask an impossible question, you get a rambling answer.

    Wednesday, December 13, 2006 7:26 PM by Mike

    > Interesting and frustrating problem I had

    > with Application Verifier.

    I wonder.  I thought release builds of applications (Notepad or other) were supposed to have no breakpoints.  I wish that release builds didn’t have most of their debugging information stripped out but that’s a bit different from imposing breakpoints on retail customers.

    > Gone is notepad, my list, and my time.

    If you were using Internet Explorer 7 you wouldn’t even need Notepad, Application Verifier, or anything else in order to reach that state.  IE 7 crashed several times per day, closing three or four windows at a time, and only half the time did it offer to send crash dumps to Microsoft.  Yesterday I uninstalled it.  The uninstallation deleted several Windows Update security patches in addition to uninstalling IE 7.  Running Windows Update under the restored IE 6 did not offer to reinstall the deleted security patches.  One machine now is running with known vulnerabilities and I’m considering reinstalling XP.

    [The breakpoint came from the verifier; it’s not from Notepad proper. That’s why the verifier documentation says you need a debugger. I’m not sure why you’re ranting about IE7 here. -Raymond]
  24. Norman Diamond says:

    > The breakpoint came from the verifier; it’s not from Notepad proper.

    OK, glad to hear it.

    > I’m not sure why you’re ranting about IE7 here.

    Well, if I were using IE7 I wouldn’t be able to post a rant about it ^_^  But you’re right, my frustration and Mike’s frustration had unrelated roots.  Sorry.

  25. Mike says:

    >>The breakpoint came from the verifier;

    Oops, I meant to comment on this when I posted and forgot.  What I found interesting was that I hit the breakpoint two days later, after I had shut down and uninstalled Application Verifier and the computer had been rebooted a couple times.

    Not only was I not expecting a breakpoint, I’d always assumed that, when setting one, the executable was patched up in memory (INT3 or some such, it’s been a while for me) to cause one.  The mechanism that allowed the breakpoint to hit days later threw me.

    [The verifier also turns on debugging code in the core OS that is off by default. That’s why you hit it even after you uninstalled the verifier. -Raymond]
  26. Paul says:

    >The verifier also turns on debugging code in the core OS that is off by default. That’s why you hit it even after you uninstalled the verifier.

    This behaviour is really, *really* annoying. For reasons too complicated to go into, I had to run App.Verifier on a server that was running some custom app that was causing problems. Unfortunately uninstalling it afterwards didn’t completely get rid of it (as you mention above), and in the end the only way to get the server running properly was a complete reinstall, which was what running App.Verifier on the thing was supposed to avoid in the first place (the custom stuff on there is extremely painful to configure). Why can’t the uninstall restore the original state, instead of leaving performance-affecting fundamental system changes in effect? Perhaps the App.Verifier install should warn users that it’ll make changes to your system that can’t be undone, and it should only be used on sacrificial test machines.

    [You can undo them, you just have to remember to undo them before you uninstall Application Verifier. -Raymond]
  27. Paul says:

    > You can undo them, you just have to remember to undo them before you uninstall Application Verifier.

    The problem with this is that (a) you need to know the changes have been made (which I didn’t until I read your comment above) and (b) you need to know what it is that A.V. changed. I have no idea what it’s done to the system, and consequently what needs to be undone.

Comments are closed.