What’s wrong with this code, part 26 – the answer

Yesterday I posted a code snippet from inside a real piece of code inside the client side of a client/server utility in a Microsoft product.

 static DWORD WINAPI _PlayBeep(__in void* pv)
{
    UNREFERENCED_PARAMETER(pv);
    PlaySound(L".Default", NULL, SND_SYNC | SND_ALIAS);
    return 0;
}

LRESULT WndProc(...)
{
    :
    :
    case WM_KEYDOWN:
        if (!_AcceptInputKeys(wParam, lParam))
        {
            QueueUserWorkItem(_PlayBeep, NULL, 0);
        }
        break;
}

The bug here was that the call to PlaySound is synchronous.  Like many Win32 APIs, PlaySound is made thread-safe by taking a critical section around the code that does the work inside the call to PlaySound (since the semantics of PlaySound are that only one PlaySound call plays at any time, this is a reasonable thing to do).  Because the call to PlaySound specifies SND_SYNC, that processing is done on the thread that calls PlaySound and thus blocks the _PlayBeep call until the sound completes.  The “.Default” sound is approximately 1 second long, so the call to _PlayBeep takes about a second to complete.

Unfortunately keyboards repeat at a lot faster than 1 keystroke/second and each keystroke causes a call to QueueUserWorkItem.  Because there’s no thread available to process the request, the thread pool logic creates a new thread to process the next work item for each keystroke.  This quickly backs up and eventually you run into the 500 thread limit in the default process threadpool.  So you’re going to have the machine play “Ding’s” for hours before this cleans itself up.

But in this case the effects were was even worse. 

Remember when I told you that it was a client/server utility?  In this case, it meant that it used async RPC to communicate with the server.  And async RPC uses the default thread pool for on certain applications.  When the response to a server request came in, there were no RPC threads running so RPC tried to create a thread in the default threadpool.  Which failed because the default threadpool was full.

So the client/server utility stopped working.  It took about 15 seconds of hammering on invalid keys to get the utility into this state, NOT pretty.

 

The first fix was to change the WM_KEYDOWN to WM_KEYUP (so that you actually had to release the keys instead of letting the typematic feature repeat for you).  That didn’t work because it still could be reproduced, it just took longer.

The final fix was:

 static DWORD WINAPI _PlayBeep()
{
    PlaySound(L".Default", NULL, SND_ASYNC | SND_ALIAS);
    return 0;
}

LRESULT WndProc(...)
{
    :
    :
    case WM_KEYDOWN:
        if (!_AcceptInputKeys(wParam, lParam))
        {
            _PlayBeep();
        }
        break;
}

So instead of queueing a work item, the team changed the code to simply call PlaySound with SND_ASYNC and they were done.  When you specify SND_ASYNC, the PlaySound call queues the request to an internal-to-playsound worker thread, and it cancels the old sound before it starts playing a new sound (thus the sounds don’t back up).

 

The object lesson here is: Don’t queue long running items to the thread pool because it can lead to unexpected results.  And even a 1 second “Ding” can count as “long running” if it can be queued rapidly enough.