Why did HeapFree fail with ERROR_POSSIBLE_DEADLOCK?


A customer reported that they were receiving assertion failures because the HeapFree function was failing on what they believed to be a valid heap block, and the GetLastError function reported that the reason for failure was ERROR_POSSIBLE_DEADLOCK. What's going on?

One of my colleagues asked the psychic question, "Is the process exiting?"

"Why yes, in fact it is. How did you know?"

Recall how processes exit. One of the first things that happens is that all the other threads in the process are forcibly terminated, with the consequence that any synchronization resources owned by those threads are now orphaned. And in this case, the synchronization resource in question was the heap.

When the program calls HeapFree, the heap code tries to take the heap lock but finds that it can't, because the heap lock is owned by another thread, and that other thread no longer exists. (Perhaps it was terminated while it was in the middle of its own HeapFree operation.) The heap code detects this, and instead of deadlocking on its own custom synchronization object, it fails with the error ERROR_POSSIBLE_DEADLOCK.
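The customer's failing code path might have looked something like this hypothetical reconstruction (the APIs are real Win32 functions, but the surrounding code and the assertion are invented for illustration):

```c
#include <windows.h>
#include <assert.h>

// Hypothetical sketch: during process exit, HeapFree can fail even for a
// perfectly valid heap block, because the heap lock was orphaned by a
// forcibly terminated thread.
void FreeBlock(void *p)
{
    if (!HeapFree(GetProcessHeap(), 0, p)) {
        DWORD error = GetLastError();
        // At process shutdown this can legitimately be
        // ERROR_POSSIBLE_DEADLOCK rather than a sign of heap corruption,
        // so an assertion like this one fires spuriously.
        assert(error != ERROR_POSSIBLE_DEADLOCK);
    }
}
```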

By the same logic, you can demonstrate that you cannot reliably allocate memory at process shutdown either. So you can't allocate memory, and you can't free memory. As we saw last time, when you are told that the process is exiting, you should not do any cleanup at all. The memory will be freed when the process address space is torn down; freeing it manually is just a waste of time.
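The usual way to act on this advice is to check the lpReserved parameter in DllMain: for DLL_PROCESS_DETACH, a non-NULL value means the process is terminating. A minimal sketch (CleanUpAllocations is a hypothetical helper standing in for whatever per-allocation cleanup your DLL does):

```c
#include <windows.h>

void CleanUpAllocations(void); // hypothetical cleanup helper

BOOL WINAPI DllMain(HINSTANCE hinst, DWORD reason, LPVOID lpReserved)
{
    switch (reason) {
    case DLL_PROCESS_DETACH:
        if (lpReserved != NULL) {
            // Non-NULL lpReserved: the process is terminating.
            // Do not call HeapFree (it may fail with
            // ERROR_POSSIBLE_DEADLOCK) and do not allocate.
            // The address space is about to be torn down anyway.
            return TRUE;
        }
        // NULL lpReserved: FreeLibrary case. The process lives on,
        // so clean up normally.
        CleanUpAllocations();
        break;
    }
    return TRUE;
}
```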

Comments (17)
  1. Joshua says:

    Sure, you can allocate memory. VirtualAlloc and friends work just fine.

  2. avakar says:

    What you actually need to ensure is that you join all threads before you start exiting. There are other locks that may easily become abandoned (notably the loader lock) with catastrophic results.

  3. Crescens2k says:

    @Joshua

    VirtualAlloc still needs unfragmented virtual address space to work. If the process is terminating under low memory conditions then you are going to have problems even with VirtualAlloc. That means even VirtualAlloc isn't able to reliably allocate memory, so you still can't allocate during process termination.

    Of course, whipping out VirtualAlloc feels a lot like using napalm to kill an ant. I would also question whether someone allocating memory in DllMain during DLL_PROCESS_DETACH is doing the correct thing.

  4. Leonard Crestez says:

    The correct way to handle errors from free functions is to just ignore them, right?

  5. Leo Davidson says:

    Thinking more about this and yesterday's post, I have to wonder why anything is running any code during process exit.

    Yesterday I was thinking in terms of a process that was about to exit, cleaning things up before actually exiting, but we're obviously talking about processes which have crossed that line. (As we were yesterday, too.)

    The fact that any code is being run at all once that line has been crossed seems like a design flaw to me. (Outside of very special cases.)

    The process should be cleaning up anything that needs to be cleaned up (or that it is inconvenient, and not worth the effort, to prevent being cleaned up; e.g. destructors) before it exits, not after.

  6. Clinton Pierce says:

    After I read the previous post on the topic I went home and had a real-life experience that made me flash back…

    I was replacing a cat litter box that had cracked with a new one.  I had prepared things to empty the "used" litter from the old box into the trash when I realized… "the process is exiting, there's no need to free the memory" … and threw away the entire bin: litter, "unpleasant stuff", and all.

  7. NB says:

    It seems the design (if any!) for process termination is a little ad-hoc to me. Was it always like that or did it just happen over the years?

    [See the classical model on how processes exit. -Raymond]
  8. Michael Grier [MSFT] says:

    @NB: I think that you can pretty easily attribute the whole thing to simplistic views of the world (as Raymond points out by referencing the older post).  Another good example of this sort of thing which you can't really call ad-hoc is the C++ static object construction/destruction problem.  It is well defined for a source unit but not even defined across source units.

    @Leo Davidson: The problem is at what granularity do you support this option?  For individual DLLs, they can already do the right thing and just return immediately in this case.  For executables, it's really not OK for them to set policy for every component that is in their address space (welcome to the world of in-process components).  The classic case which stymied my efforts a few years ago to do exactly what you describe is the C runtime library.  Nobody expects that helloworld.c may not actually write its output.

    Executables which want to take control of this can always call TerminateProcess() instead of ExitProcess() and the whole thing is over and done with.

  9. cheong00 says:

    Actually, I think that while memory resources will be freed by the system automatically, hardware resources will not, because the system has no way to know what to do with all kinds of non-standard (read: weird) hardware.

    And then there's a question: While we should not free memory at termination, we still need to release locks, especially if they are shared with other processes, right?

  10. Worf says:

    @cheong00: If the driver is coded properly, it doesn't matter.

    You see, when a process terminates, after the kernel reclaims the memory, it closes all the handles, so the driver's close functions are invoked. As long as those close functions release the locks the process was holding, everything gets cleaned up (the locks are held by the kernel, so they remain held until the close functions release them).

    Closing handles also closes kernel managed application locks like semaphores and such.

    About the only things that don't happen are the flushing of application-managed file buffers (kernel buffers are flushed when the handle is closed), and telling other processes that are interacting via IPC that you're closing up shop. For example, if you have shared memory and use events to signal when the shared memory changes, then the other processes will wait forever unless someone else can set that event. Of course, if you're using kernel-managed IPC, then some mechanisms may provide such notification (e.g., TCP/IP sockets are closed by the kernel).

    I had this problem because a driver I wrote required per-process data structures, and I needed to free those data structures when the process quit, either by crashing or cleanly. Otherwise the memory would end up leaked.

  11. 640k says:

    It's very naive to think that a program doesn't have to allocate memory at shutdown. Meaning, it's easy to say, but it happens a lot, and when you least expect it. Try to write an app with 1M SLOC and tell me all the corner cases of the shutdown process are simple to define. It's not going to happen.

    There are lots of different kinds of resources that you want to dispose of gracefully. Memory allocations are the obvious one; those the OS at least has a theoretical chance of disposing of itself. Other things I can think of right away are RPC and SOAP, and all the frameworks layered on top of them. With those it's usually nearly impossible not to allocate memory, because they are typically designed to allocate when destructing/disposing remote objects.

    [If you're doing RPC or sending an HTTP request in your DLL_PROCESS_DETACH handler, you've already screwed up badly. -Raymond]
  12. Joshua says:

    [If you're doing RPC or sending an HTTP request in your DLL_PROCESS_DETACH handler, you've already screwed up badly. -Raymond]

    I can imagine a situation where this could be classified as flushing buffers. Anyway, if I had such an application I'd make sure that all of the paths in question follow the classical model, but somebody else might not be in a position to do that, e.g., a web browser plugin.

  13. cheong00 says:

    @640k: A networking application should handle the case where the network is disconnected unexpectedly. So for RPC or SOAP requests, your application should be able to die right away without much problem, other than loss of work.

  14. Joshua says:

    @640k: What do you think private heap is *for*?

  15. Hm says:

    @Joshua and others with "What do you think private heap is *for*?" and the like:

    Most people are using frameworks for 99.99% of the code. How do you redesign 10,000 or more predefined .NET or Delphi or C++ classes, with complicated relationships between them, to meet your point of view? In the managed case (.NET and Java) you still have resources to manage; only memory is managed automatically, and all the others are still managed by destructor-like constructs.

  16. Joshua says:

    @Hm: You know full well you cannot call managed code from DllMain. As for C++, you do know you can redefine global operator new, right?

  17. Hm says:

    >As for C++, you do know you can redefine global operator new, right?

    How do you override the global new operator when the foreign software is not compiled with the compiler product you are using?

    How do you do that for COM components or data providers you have to use, or for the Delphi VCL and runtime when your program is not written in Delphi? How do you do that for XML parsers, SOAP classes, and so on, that you need to use but that come as external DLLs with some C-style API?

    These utility classes or modules may in turn use large third-party DLLs to do the job (database clients!), may be written in pure C, or may be linked against some runtime you do not use yourself.

