What is RESOURCE_GOVERNOR_IDLE and why you should not ignore it completely

If you have a query that runs slowly, would you believe me if I told you that you instructed SQL Server to make it so? This can happen with Resource Governor.

My colleague Bob Dorr has written a great blog post about the Resource Governor CPU cap, titled “Capping CPU using Resource Governor – The Concurrency Mathematics”.

Today, I will explore a customer scenario related to this topic. A customer complained that their queries ran slowly. Our support team captured data and noticed that the wait time for the wait type “RESOURCE_GOVERNOR_IDLE” was very high. Below is the SQL Nexus Bottleneck Analysis report.

[Image: SQL Nexus Bottleneck Analysis report]

My initial thought was that this wait type could safely be ignored. SQL Server has many wait types that idle threads use while the queues they service are empty. I assumed this was one of those.

Since I hadn’t seen it before, I decided to check the code. It turned out to be significant. This wait type is related to the Resource Governor CPU cap implementation (CAP_CPU_PERCENT). When you enable CAP_CPU_PERCENT for a resource pool, SQL Server ensures that the pool won’t exceed the CPU cap. If you configure 10% for CAP_CPU_PERCENT, SQL Server ensures that the pool uses at most 10% of the CPU. If you pound the server (from that pool) with CPU-bound requests, SQL Server inserts an ‘idle consumer’ into the runnable queue to take up the quantum that the pool is not entitled to. While the ‘idle consumer’ is waiting, SQL Server records the RESOURCE_GOVERNOR_IDLE wait type to indicate that the ‘idle consumer’ is taking up the quantum. Here is what the runnable queues for a particular resource pool look like with and without CAP_CPU_PERCENT configured.
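To make the configuration concrete, here is a minimal sketch of how a pool with a 10% cap might be set up. The names CappedPool and CappedGroup are hypothetical and used only for illustration, and sessions will only land in this pool if a classifier function (not shown) routes them to the workload group.

```sql
-- Hypothetical names: CappedPool and CappedGroup are illustrative only
CREATE RESOURCE POOL [CappedPool]
WITH ( CAP_CPU_PERCENT = 10 );   -- hard cap: the pool may not exceed 10% of the CPU
go
CREATE WORKLOAD GROUP [CappedGroup]
USING [CappedPool];
go
ALTER RESOURCE GOVERNOR RECONFIGURE;
go
-- confirm the cap that is now in effect for each pool
select name, cap_cpu_percent
from sys.dm_resource_governor_resource_pools
go
```

With a cap like this in place, CPU-bound requests from that pool are the ones that would cause the ‘idle consumer’ described above to show up in the runnable queue.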

[Images: runnable queues for a resource pool with and without CAP_CPU_PERCENT configured]

Not only will you see that wait type in sys.dm_os_wait_stats, but you will also see ring buffer entries like the one below:

select * from sys.dm_os_ring_buffers
where ring_buffer_type ='RING_BUFFER_SCHEDULER' and record like '%SCHEDULER_IDLE_ENQUEUE%'
<Record id = "139903" type ="RING_BUFFER_SCHEDULER" time ="78584090"><Scheduler address="0x00000002F0580040"><Action>SCHEDULER_IDLE_ENQUEUE</Action><TickCount>78584090</TickCount><SourceWorker>0x00000002E301C160</SourceWorker><TargetWorker>0x0000000000000000</TargetWorker><WorkerSignalTime>0</WorkerSignalTime><DiskIOCompleted>0</DiskIOCompleted><TimersExpired>0</TimersExpired><NextTimeout>6080</NextTimeout></Scheduler></Record>
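If you want to read these entries as columns rather than raw XML, something like the following sketch may help. It assumes the record layout matches the sample above (the element and attribute names are taken from that sample), and since ring buffers are an internal diagnostic, the layout may differ between versions.

```sql
-- Shred the scheduler ring buffer records into columns.
-- Element paths are assumed from the sample record shown above.
select
    x.rec.value('(/Record/@id)[1]', 'bigint') as record_id,
    x.rec.value('(/Record/Scheduler/@address)[1]', 'varchar(20)') as scheduler_address,
    x.rec.value('(/Record/Scheduler/Action)[1]', 'varchar(64)') as scheduler_action,
    x.rec.value('(/Record/Scheduler/TickCount)[1]', 'bigint') as tick_count,
    x.rec.value('(/Record/Scheduler/SourceWorker)[1]', 'varchar(20)') as source_worker
from (
    select cast(record as xml) as rec
    from sys.dm_os_ring_buffers
    where ring_buffer_type = 'RING_BUFFER_SCHEDULER'
      and record like '%SCHEDULER_IDLE_ENQUEUE%'
) as x
```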

 

Conclusion:

If you see the wait type RESOURCE_GOVERNOR_IDLE, don’t ignore it. You need to evaluate whether you have set the CPU cap correctly. It may be what you wanted. But it may be that you have capped it too low and the queries are impacted in a way you didn’t intend. If it is what you intended, you will need to explain to your users that they are being “throttled”.
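As a starting point for that evaluation, the sketch below lists the cap configured for each pool and shows which user sessions currently run in which pool. It only uses the Resource Governor and session DMVs; how you interpret the result depends on your own workload.

```sql
-- configured cap per pool (a value of 100 means no hard cap is in effect)
select pool_id, name, min_cpu_percent, max_cpu_percent, cap_cpu_percent
from sys.dm_resource_governor_resource_pools

-- which user sessions run in which workload group and pool
select s.session_id, s.login_name,
       g.name as workload_group,
       p.name as resource_pool,
       p.cap_cpu_percent
from sys.dm_exec_sessions as s
join sys.dm_resource_governor_workload_groups as g on g.group_id = s.group_id
join sys.dm_resource_governor_resource_pools as p on p.pool_id = g.pool_id
where s.is_user_process = 1
```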

Demo

For the demo, observe how long the query runs before and after the CPU cap is configured.

--first measure how long this takes
select count_big (*) from sys.messages m1 cross join sys.messages m2  -- cross join sys.messages m3

go
--alter to 5 (make sure you revert it back later)
ALTER RESOURCE POOL [default]
WITH ( CAP_CPU_PERCENT = 5 );
go
ALTER RESOURCE GOVERNOR RECONFIGURE;
go

--see the configuration
select * from sys.dm_resource_governor_resource_pools

go

--now see how long it takes
select count_big (*) from sys.messages m1 cross join sys.messages m2  -- cross join sys.messages m3

go
--while the above query is running, open a different connection and run the following query
--you will see that the wait time keeps going up. Note that if you don't configure CAP_CPU_PERCENT, this value will be zero
select * from sys.dm_os_wait_stats where wait_type ='RESOURCE_GOVERNOR_IDLE'

go

--revert it back
ALTER RESOURCE POOL [default]
WITH ( CAP_CPU_PERCENT = 100 );
go
ALTER RESOURCE GOVERNOR RECONFIGURE;
go
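Because sys.dm_os_wait_stats is cumulative, it can be easier to look at how much the wait grows over a fixed interval. Here is a small sketch that samples the wait twice, ten seconds apart; the interval is an arbitrary choice, and you would run it from a second connection while the capped query is executing.

```sql
-- sample RESOURCE_GOVERNOR_IDLE twice and report the growth
declare @before bigint, @after bigint

select @before = wait_time_ms
from sys.dm_os_wait_stats
where wait_type = 'RESOURCE_GOVERNOR_IDLE'

waitfor delay '00:00:10'   -- arbitrary 10-second sampling interval

select @after = wait_time_ms
from sys.dm_os_wait_stats
where wait_type = 'RESOURCE_GOVERNOR_IDLE'

select @after - @before as resource_governor_idle_ms_in_10s
go
```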

 

Jack Li | Senior Escalation Engineer | Microsoft SQL Server