Hidden gotcha in the thread pool sample program on MSDN


There's a hidden gotcha in the MSDN thread pool sample that one of our interns stumbled across.

"I am trying to create a simple thread pool rather than creating a new thread for each task I want to perform. I based this program on the MSDN thread pool sample, but I found that the work items never run in parallel. They always run sequentially. All calls succeed. Can anybody explain why this is happening? Thanks."

#include <windows.h>
#include <iostream>
#include <cstdlib>

VOID
CALLBACK
Callback(
    PTP_CALLBACK_INSTANCE Instance,
    PVOID                 /* Parameter */,
    PTP_WORK              /* Work */
    )
{
    std::cout << "Starting " << Instance << std::endl;
    Sleep(3000); // Pretend to do work
    std::cout << "Ending " << Instance << std::endl;
}

int
__cdecl
main(int, char**)
{
    TP_CALLBACK_ENVIRON CallBackEnviron;

    InitializeThreadpoolEnvironment(&CallBackEnviron);

    auto pool = CreateThreadpool(NULL);

    SetThreadpoolThreadMaximum(pool, 1);

    SetThreadpoolThreadMinimum(pool, 1);

    auto cleanupGroup = CreateThreadpoolCleanupGroup();

    SetThreadpoolCallbackPool(&CallBackEnviron, pool);


    SetThreadpoolCallbackCleanupGroup(&CallBackEnviron,
                                      cleanupGroup,
                                      NULL);


    auto work = CreateThreadpoolWork(Callback,
                                NULL, 
                                &CallBackEnviron);

    for (int i = 0; i< 100; i++) {
        SubmitThreadpoolWork(work);
    }

    CloseThreadpoolCleanupGroupMembers(cleanupGroup,
                                       FALSE,
                                       NULL);


    CloseThreadpoolCleanupGroup(cleanupgroup);

    CloseThreadpool(pool);

    return 0;
}

The tasks all run sequentially because of these two lines:

    SetThreadpoolThreadMaximum(pool, 1);
    SetThreadpoolThreadMinimum(pool, 1);

If you set the minimum and maximum thread counts both to one, then that means that the thread pool consists of a single permanent thread. This really isn't much of a thread pool any more, although I guess it gives you the convenience of being able to add work to it relatively easily.

This hidden gotcha was called out in the sample where it says "The pool consists of one persistent thread." Mind you, it says so in a rather unobtrusive place, so I don't blame you for missing it.

If you want to allow tasks to run in parallel, remove the call to Set­Thread­pool­Thread­Maximum, or at least set the maximum to more than one. While you're at it, remove the call to Set­Thread­pool­Thread­Minimum, since there is nothing in this sample that requires that the threads be persistent. (If there is no work queued on the thread pool, we should let the thread pool destroy all its threads.)

And while I understand that this was just an intern playing around with a sample program, it should be called out that in general, you don't need to create your own thread pool. Just use the system one, and use a cleanup group if you want to be able to do bulk cancellation.

Comments (11)
  1. Kevin says:

    This looks like a case of API-colored glasses to me.  The author of the code fragment wanted to show off all the cool things you can do with thread pools, even though you probably don't actually want to do all of those things (at least, not all in the same piece of code).

  2. sense says:

    That's what you get from not reading the code you copy and paste.

  3. Michael Burr says:

    I wouldn't be surprised if there are a lot of thread pools out there that have this bug - that's the nature of examples in official or defacto documentation.

    I think the issue is actually more likely to make to the wild than many 'example code bugs' since the work items will get done, just not as efficiently as they should. Testing might not detect the problem unless someone is looking very, very closely at the thread pool's behavior.

  4. Sean says:

    @Kevin: In general I tend to prefer comprehensive examples that show a bunch of parts of a new API (and how they are put together in relation to each other). I don't want to say that this is the only alternative, but I've come across far too many APIs that go in the opposite direction, with a few very concise examples that may nicely target the 80% use case but are useless for the 20% of the time I have a special case that needs to use a niche feature of the API. I find it easier to trim out useless cruft from example code than to figure out where to put in useful specialization when I need it.

    However, the ideal is obviously enough good examples to show off 95+% of the use cases plus good documentation for each individual part of the API which makes it clear how they're supposed to be used.

  5. Kevin says:

    @Sean: Sure, API-colored glasses can be helpful.  But you have to be very clear upfront that that's what you're doing, or else people will just blindly copy and paste.

    Well, people will blindly cut and paste no matter what, but at least this way, those of us on Stack Overflow have something to hit them with when they realize it doesn't work and come crying to us for help.

  6. acq says:

    And where is the documentation about "using the system thread pool"? I tried to find out and probably followed the path of the intern:

    "OK, somebody who knows told me to use thread pool. Where to look? There's this Thread Pool API msdn.microsoft.com/.../ms686766%28v=vs.85%29.aspx Good what do I need? A Thread Pool, well there's the Pool section in the table, and 4 entries, the second one is CreateThreadpool, I guess I can start there, and lo! It doesn't mention that some pool I can use already exist. Just that this way a new is created. And there is link to the example. And that's the example Raymond discusses here."

    So what's the "simpler" way?

  7. acq says:

    It seems that the trick is passing NULL to the second parameter of SetThreadpoolCallbackPool, but see the description of that parameter now:

    "ptpp [in]    A TP_POOL structure that defines the thread pool. The CreateThreadpool function returns this structure."

    That it can be null is not specified here in any way. But there's the remark "If you do not specify a thread pool, the global thread pool is used."

    Now I entered "global thread pool" in MSDN search and... I can't say I've found any description of what a "global thread pool" is at all. So how's a beginner to even get the idea that it's OK thing to do that, and what he can expect?

  8. acq: msdn.microsoft.com/.../ms686756(v=vs.85).aspx may give you some clue.  Specifically "To request that a work item be handled by a thread in the thread pool, call the QueueUserWorkItem function. " and "The thread pool is created the first time you call QueueUserWorkItem..."

  9. acq says:

    Chris Crowther: Thanks! Still: "This topic describes the original thread pool API. The thread pool API introduced in Windows Vista is simpler, more reliable, has better performance, and provides more flexibility for developers." Not surprising that the programmer believes that he should use the new API then? Neither "system thread pool" nor "global thread pool" are named. And the parameter ptpp  I've mentioned is "in" and not "in_opt" like it would be marked in some other functions that accept NULL.

  10. skSdnW says:

    ..and there is/was a even older threadpool API in shlwapi, created around the IE4/5 timeframe.

  11. DWalker says:

    @Sean, @Kevin:  Speaking of examples, and how poor they often are:  I have NEVER seen an example, in any Microsoft SQL documentation, of how to use the Join clause to join two tables *on one more than one column*.  Every single Join example joins two tables on one column only.  

    When you are trying to learn something, as we all have at one time or another, some examples beyond the bare minimum would be very helpful.

Comments are closed.

Skip to main content