Sleeping vs. Yielding

According to Books Online (BOL), the recommended way of yielding to other workers in SQLCLR is to call System.Threading.Thread.Sleep(0). Long before Yukon shipped, I had a conversation with a coworker who was responsible for knowing something about SQLCLR, and I asked how the SQLCLR folks had handled Windows’ problem with Sleep(0) not actually yielding. His face went blank, so I then asked if he knew of the well-documented problems with people using Sleep(0) to yield to other threads in Windows. Windows OS team member Raymond Chen puts it pretty succinctly here, but I’ll try to summarize it for those who haven’t written many multi-threaded Windows apps.

Windows schedules threads according to priority. Naturally, higher priority threads win out over lower ones. Although calling Sleep(0) yields the remainder of the caller’s quantum and allows other threads of the same priority to run, it does not necessarily allow lower priority threads to run. If a thread has a high-enough priority, it can prevent all other user threads on the system from running, even if it frequently calls Sleep(0).

If you think of each priority level as its own scheduling queue, this makes perfect sense. Sleep(0) merely gives up the remainder of a thread’s quantum and puts it back in line for scheduling. If a high priority thread is the only thread at its particular priority, it merely moves to the back of its own queue. If no one else is in line, it is scheduled again immediately since its priority level trumps that of lower priority threads. In that sense, Sleep(0) can be completely ineffective as a yielding mechanism.
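The queue-per-priority picture is easy to try out in a few lines of Python. This is only a toy model under simplifying assumptions (one CPU, every thread always ready, every thread calling Sleep(0) after each time slice); it is not how the Windows scheduler is actually implemented, and the function name `run_sleep0` and the priority numbers are made up for illustration.

```python
from collections import deque

def run_sleep0(threads, slices):
    """Toy model: one FIFO ready queue per priority level.

    threads: list of (name, priority) pairs. Every thread calls Sleep(0)
    after each time slice. Returns the order in which threads ran.
    """
    queues = {}
    for name, priority in threads:
        queues.setdefault(priority, deque()).append(name)

    history = []
    for _ in range(slices):
        top = max(queues)              # highest priority level present
        name = queues[top].popleft()   # head of that queue gets the CPU
        history.append(name)
        queues[top].append(name)       # Sleep(0): back of its *own* queue
    return history

# A lone high-priority thread starves the low-priority one.
starved = run_sleep0([("high", 15), ("low", 8)], slices=4)
# Two threads at the same priority take turns.
shared = run_sleep0([("a", 8), ("b", 8)], slices=4)
print(starved)  # ['high', 'high', 'high', 'high']
print(shared)   # ['a', 'b', 'a', 'b']
```

In the first run, "low" never gets a slice even though "high" yields after every one, which is exactly the starvation case described above.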

So, I asked my coworker whether he was aware of this. He was a nonprogrammer, but I thought he might know about it given his SQLCLR interest. I pointed out that if SQLOS supported distinct worker priorities (UMS didn’t), they would have had to provide for this scenario (or advise people to use a different yield mechanism). I got another blank stare, so I then asked if he knew whether the SQLOS guys had implemented worker priorities yet or whether they planned to. I asked because the Sleep(0) issue can’t occur if your scheduler doesn’t support distinct worker priorities (or it handles sleeping for 0 time inefficiently). SQLOS didn’t support worker priorities at the time and still doesn’t, but he didn’t know that, so he again had a blank look on his face. I was beginning to wonder whether he knew what a scheduling priority was and why it was important, so I tried to explain it to him. But I could see his eyes glaze over as I began to talk, and I could see that that was irritating him, so I threw up my hands and gave up on getting through to him.

The moral of the story is this: calling System.Threading.Thread.Sleep(0) is the recommended way to yield to another worker in SQLCLR because SQLOS doesn’t have to worry about managing workers of different priorities. No worker is any more entitled to run than any other; therefore, giving up the remainder of your quantum sends you to the back of a line shared by all workers. If anyone has pending work requests, you will yield to them.
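To see why a priority-free scheduler sidesteps the problem, here is a minimal Python sketch of a single shared line of workers. It is an illustration only and makes no claim about how SQLOS is really built; the name `run_shared_queue` and the worker names are hypothetical.

```python
from collections import deque

def run_shared_queue(workers, slices):
    """Toy model of a scheduler with no priorities: one FIFO line for all."""
    ready = deque(workers)
    history = []
    for _ in range(slices):
        worker = ready.popleft()   # whoever is at the front runs next
        history.append(worker)
        ready.append(worker)       # yielding sends it to the back of the line
    return history

order = run_shared_queue(["w1", "w2", "w3"], slices=6)
print(order)  # ['w1', 'w2', 'w3', 'w1', 'w2', 'w3']
```

With only one line, a worker that yields always ends up behind everyone else who is waiting, so nobody can be starved by a yielding worker.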

I should point out that, in order to deal with the thread starvation scenario I described above, Windows Server 2003 changed the behavior of Sleep(0) to yield to lower-priority threads. This doesn’t work the same on all releases of Windows, though, so you probably shouldn’t yet rely on it in Windows applications you develop. The old workaround of sleeping for a non-zero time (e.g., Sleep(1)) or calling SwitchToThread() still works. Again, this doesn’t affect SQLCLR apps because SQLOS handles SQL Server’s scheduling, and it doesn’t support distinct worker priorities. Developers of other types of apps (including regular managed code apps), however, still have to be mindful of the potential pitfalls of Sleep(0).
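The reason a non-zero sleep works as a yield is that the sleeping thread is not ready to run at all until its timeout expires, so the scheduler has to pick someone else. A rough Python sketch of that idea follows. It assumes one CPU, that every thread sleeps after each slice, and that a sleep costs exactly one time slice (on real Windows the actual delay depends on the timer resolution). The name `run_sleep1` is hypothetical.

```python
def run_sleep1(threads, slices):
    """Toy model: after each slice a thread sleeps for a non-zero time,
    so it is not ready during the following slice.

    threads: list of (name, priority) pairs.
    Returns who ran in each slice (None means the CPU sat idle).
    """
    ready_at = {name: 0 for name, _ in threads}
    history = []
    for tick in range(slices):
        ready = [(priority, name) for name, priority in threads
                 if ready_at[name] <= tick]
        if not ready:
            history.append(None)
            continue
        _, name = max(ready)        # highest-priority *ready* thread wins
        history.append(name)
        ready_at[name] = tick + 2   # asleep for the whole next slice
    return history

timeline = run_sleep1([("high", 15), ("low", 8)], slices=4)
print(timeline)  # ['high', 'low', 'high', 'low']
```

Unlike the Sleep(0) case, the low-priority thread gets every other slice here, because the high-priority thread is out of the running while it sleeps.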