Today I witnessed a performance load test against a 16 proc 64-bit SQL Server instance. During the first run we saw a very high value for CXPACKET, followed up with SOS_SCHEDULER_YIELD. After disabling parallelism and clearing the wait stats, CXPACKET disappeared and was replaced with a significant percentage of SOS_SCHEDULER_YIELD (representing > 80% of the total accumulated wait time). The load test was driving all 16 CPUs quite high during the test, almost pegging them at 100%.
So here are two questions to ask in this situation:
· What does the wait type mean (in this case SOS_SCHEDULER_YIELD)?
· Is it actionable?
In answer to the first question, SOS_SCHEDULER_YIELD represents a SQLOS worker (thread) that has yielded the CPU, presumably to another worker. My favorite white paper, SQL Server 2005 Waits and Queues, defines SOS_SCHEDULER_YIELD as occurring “when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.” The over-simplified explanation of a quantum is the amount of time a thread is scheduled to run. The thread that yielded the scheduler (CPU) is now waiting for its turn again to run on the processor. SQL Server runs cooperatively (non-preemptive mode) for many activities, which means that SQL Server decides when a thread will yield – not the Operating System. With thread yielding being voluntary – if the thread was greedy, we could see a risk of the thread running until it was complete - however that is not the case. The SQLOS was designed in such a way that active threads should not starve out other runnable threads. Threads will voluntarily yield their time so that other tasks can execute. On a busy, highly concurrent SQL Server instance, this is a good thing.
Now is this wait type actionable? If this wait type represents a high percentage of overall wait time accumulated for a workload or important period of time, we can surmise that SOS_SCHEDULER_YIELD is a symptom of CPU pressure. If you see a high value for SOS_SCHEDULER_YIELD you should check perfmon counters to see if you also see sustained high values for % Processor Time and Processor Queue Length. You should also validate that it is indeed the SQL Server process that is causing the CPU pressure using the Process: % Processor Time counter, checking specifically for the SQL Server instance (or other processes responsible). You may also want to validate the number of workers in a runnable state in sys.dm_os_workers and sys.dm_os_schedulers, revealing if a significant number of tasks are waiting in line for their chance to run on a scheduler.
If you determine that SQL Server is responsible for the sustained CPU usage, then you should follow the standard CPU pressure troubleshooting path ( for example - query sys.dm_exec_query_stats ordering by total_worker_time, check for high compilations and recompilations, optimization time spent with sys.dm_exec_query_optimizer_info, and more).
Hope this helps,