How to avoid accessing freed memory when canceling a thread pool callback


The Windows thread pool is convenient, but one tricky part is how to remove items from the thread pool.

This discussion applies to all the thread pool objects, but I'll use thread pool timers for concreteness. You start by calling Create­Threadpool­Timer to establish the callback function and its context. Next, you call Set­Threadpool­Timer to configure the timer: When the timer becomes due and its optional periodicity.

At this point, the timer is live. It will queue a callback (or callbacks, if periodic) to the thread pool according to the schedule you specified.

At some future point, you decide that you are finished with the timer. The timer may have elapsed by this point, or maybe you're cleaning up the timer before it elapsed.

Now you have a few options.

The simplest way is just to call Close­Threadpool­Timer. If the callback is not running, then this frees the timer immediately. Otherwise, it waits for the callback to complete before freeing the timer.

This "either/or" behavior makes Close­Threadpool­Timer basically useless for any callback with nontrivial context data, because you don't know when it's safe to free the context data. If you free it as soon as Close­Threadpool­Timer returns, then you might free it out from under an active callback.

That would be bad.

If you make the callback itself responsible for freeing the context data, you have the new problem of not knowing whether the callback is running, so the thread trying to close the timer doesn't know whether it should free the data or not. You can't have the callback set a flag saying, "Hey, I've started!" because there's still a race condition where the thread trying to close the timer checks the flag just before the callback manages to set it. You might try to fix this by making the context pointer be a pointer to a control block that in turn contains the data pointer, and having the callback and the main thread perform an atomic exchange on the data pointer, but you merely replaced the problem with an identical one: How do you know when it's safe to free the control block?

Fortunately, the documentation suggests an alternative:

  • Call Set­Threadpool­Timer to reconfigure the timer so it never comes due. This prevents new callbacks from occurring.
  • Call Wait­For­Threadpool­Timer­Callbacks to wait for any outstanding callbacks to complete.
  • Call Close­Threadpool­Timer.
  • Free the context data.

When Wait­For­Threadpool­Timer­Callbacks returns, you know that there are no active callbacks, and your prior call to Set­Threadpool­Timer makes sure that no new callbacks are scheduled. This means that you can call Close­Threadpool­Timer, and it will always be in the "callback is not running" case, so you can free the context data as soon as Close­Threadpool­Timer returns.

Great, we solved the context data lifetime problem, but we introduced a new problem: Deadlock.

Oh, look at the time. We'll continue this discussion next time.

Comments (13)
  1. Giedrius says:

    Wouldn’t using three states like Nothing/Running/Wont run and atomically swapping them work?

    1. Clockwork-Muse says:

      Okay, where did you store that state? When is it safe to reuse that piece of memory?

      It also doesn’t solve the problem of starvation:
      1. Created as Nothing
      2. Timer procs and sets as Running.
      3. Main thread wants to close, but sees Running. Setting the variable to Won’t Run doesn’t help, because it doesn’t tell you when the callback finishes (and is safe for free). Loops for retry to set Won’t Run.
      4. Timer finishes and sets Nothing.
      5. Go back to 2

      Attempting to add a fourth value – LastRun – doesn’t solve the problem (there’s a race condition when the main thread does the recovery between the two atomic sets). Using two variables (one for “can start” and one for “running”) solves the issue, but doesn’t solve the “freeing the shared data” problem.

      1. Giedrius says:

        What’s “freeing the shared data” problem? Free the shared data when timers are finished and cancelled.

        1. And how do you know when the timers are finished and cancelled? They finish asynchronously, and there is no notification when they complete.

        2. Clockwork-Muse says:

          Ah. You’re thinking of something like this:
          1. Created as Nothing
          2. Timer procs and sets as Running
          3. Main thread wants to close, so blanket sets Won’t Run
          4. Timer finishes and sees Won’t Run, so the timer thread frees the shared data.

          The problem is that this requires the timer to run, and (perhaps more importantly) that there be only one timer accessing the shared memory: this is a problem if your process takes longer to run than the timer proc interval, or was delayed, or a couple of other possibilities.

  2. Joshua says:

    My solution: never use timers in threadpools. The most expediant way to create timers is to use WaitForMultipleObjects (or MsgWaitForMultipleObjects) in the master thread with the appropriate timeout.

  3. laonianren says:

    “If the callback is not running, then this frees the timer immediately. Otherwise, it waits for the callback to complete before freeing the timer.”

    I read this and thought: isn’t this exactly what you want?

    I had to reread it four times before I realised the problem: CloseThreadpoolTimer doesn’t wait for the callback to complete. It returns immediately while something else waits for the callback to complete.

    1. Jon says:

      Agree. “It waits” reads like “CloseThreadpoolTimer waits”. Raymond, are you able to rephrase this to make it more clear?

  4. VinDuv says:

    I was confused at first because I though “waits for the callback to complete before freeing the timer” meant that Close­Threadpool­Timer only return after callback completion. It isn’t; the documentation makes it clear that it frees the timer asynchronously, which is why the context data can not be immediately freed after its return.

  5. JAS says:

    You have to support cancellation, meaning that every timer call should have a context/lock saying whether you acquired the timer or were cancelled. Whichever it was, gets to free any resources immediately and blocks off the opposite path. If a timer is cancelled this way, the timer callback might still fire late, but the cleanup code was already run on the canceller’s thread so nothing will run inside the callback. In C# this is absolutely trivial with try-catch/async-await and CancellationToken.

    1. Zarat says:

      In C# this is only trivial because someone else already solved the problem for you. (The GC takes care of freeing the memory and the implementors of the C# wrapper around threadpool timers had to take care of any pinning problems and race conditions).

      So yeah, easy if someone else solved the problem for you, but it doesn’t teach you how to solve it when you have to face it.

    2. djingis1 says:

      For certain definitions of “absolutely trivial”.

  6. Neil says:

    What you want is a notification that the timer was freed. The question arises, how does CloseThreadpoolTimer get notified that it should free the timer? Maybe it just dispatches a task to the thread pool to wait for the callbacks to complete… in which case, you probably want to do that too instead of waiting on the calling thread.

Comments are closed.

Skip to main content