Should you worry about SOS_SCHEDULER_YIELD?


Today I witnessed a performance load test against a 16 proc 64-bit SQL Server instance.  During the first run we saw a very high value for CXPACKET, followed up with SOS_SCHEDULER_YIELD.  After disabling parallelism and clearing the wait stats, CXPACKET disappeared and was replaced with a significant percentage of SOS_SCHEDULER_YIELD (representing > 80% of the total accumulated wait time).  The load test was driving all 16 CPUs quite high during the test, almost pegging them at 100%.


So here are two questions to ask in this situation:


·        What does the wait type mean (in this case SOS_SCHEDULER_YIELD)?


·        Is it actionable?


In answer to the first question, SOS_SCHEDULER_YIELD represents a SQLOS worker (thread) that has yielded the CPU, presumably to another worker.  My favorite white paper, SQL Server 2005 Waits and Queues, defines SOS_SCHEDULER_YIELD as occurring “when a task voluntarily yields the scheduler for other tasks to execute. During this wait the task is waiting for its quantum to be renewed.”  The over-simplified explanation of a quantum is the amount of time a thread is scheduled to run. The thread that yielded the scheduler (CPU) is now waiting for its turn again to run on the processor. SQL Server runs cooperatively (non-preemptive mode) for many activities, which means that SQL Server decides when a thread will yield – not the Operating System.  With thread yielding being voluntary – if the thread was greedy, we could see a risk of the thread running until it was complete – however that is not the case. The SQLOS was designed in such a way that active threads should not starve out other runnable threads.  Threads will voluntarily yield their time so that other tasks can execute. On a busy, highly concurrent SQL Server instance, this is a good thing.


Now is this wait type actionable?  If this wait type represents a high percentage of overall wait time accumulated for a workload or important period of time, we can surmise that SOS_SCHEDULER_YIELD is a symptom of CPU pressure.  If you see a high value for SOS_SCHEDULER_YIELD you should check perfmon counters to see if you also see sustained high values for % Processor Time and Processor Queue Length.  You should also validate that it is indeed the SQL Server process that is causing the CPU pressure using the Process: % Processor Time counter, checking specifically for the SQL Server instance (or other processes responsible).  You may also want to validate the number of workers in a runnable state in sys.dm_os_workers and sys.dm_os_schedulers, revealing if a significant number of tasks are waiting in line for their chance to run on a scheduler.


If you determine that SQL Server is responsible for the sustained CPU usage, then you should follow the standard CPU pressure troubleshooting path ( for example – query sys.dm_exec_query_stats ordering by total_worker_time, check for high compilations and recompilations, optimization time spent with sys.dm_exec_query_optimizer_info, and more).


Hope this helps,


Joe


Comments (3)

  1. ftpex says:

    everytime I see a thread on this I alway wonder why are you not looking for high IO.  Can cause High IO I never see it I see IO problems lack or memory but never the things you are talking about

  2. josephsa says:

    Hi ftpex – while I could see a high IO issue causing a downstream effect as you described, there are other scenarios (and in my experience – I've seen this with different customers).