Troubleshooting Slow Disk I/O in SQL Server
If you suspect you are experiencing poor disk performance you can use internal DMVs combined with a Performance Monitor collection to get a good picture of the health of the disk I/O subsystem and any latency SQL Server is experiencing from its poor performance.
Performance Monitor Counters:
Below is a list of counters and thresholds to watch for when monitoring IO on your SQL Server:
Edit 7/19/2011 - Please monitor the below counters and compare average values over time to the thresholds provided. The counters are intended for OLTP workloads and larger workloads such as OLAP and batch activity may not fit within these thresholds.
Object |
Counter |
Instance |
Threshold |
Comments |
Physical Disk / Logical Disk |
Avg. Disk sec/Read |
* |
( < .005 excellent; .005 - .010 Good; .010 - .015 Fair; > .015 investigate) |
Measures read latency on the disks |
Physical Disk / Logical Disk |
Avg. Disk sec/Write |
* |
( < .005 excellent; .005 - .010 Good; .010 - .015 Fair; > .015 investigate) |
Measures write latency on the disks |
Physical Disk / Logical Disk |
Avg. Disk sec/Transfer |
* |
( < .005 excellent; .005 - .010 Good; .010 - .015 Fair; > .015 investigate) |
Measures average latency for read or write operations |
Physical Disk / Logical Disk |
% Idle Time |
* |
> 50% |
The disk should not be working constantly |
Physical Disk / Logical Disk |
Avg. Disk Queue Length |
* |
None |
Be careful when using old thresholds of 2 with SQL Server and SANs. This value can be much higher than 2. Do not rely solely on this counter to diagnose an IO problem. |
Physical Disk / Logical Disk |
Disk reads/sec |
* |
None |
Number of read IOPs |
Physical Disk / Logical Disk |
Disk writes/sec |
* |
None |
Number of write IOPs |
Physical Disk / Logical Disk |
Disk Transfers/sec |
* |
None |
Number of read and write IOPs (used to compare against capacity of storage subsystem) |
Physical Disk / Logical Disk |
Disk Read Bytes/sec |
* |
None |
Used for determine bandwidth utilization for Read Ops |
Physical Disk / Logical Disk |
Disk Write Bytes/sec |
* |
None |
Used to determine bandwidth utilization for Write Ops |
Physical Disk / Logical Disk |
Disk Bytes/sec |
* |
None |
Used to determine bandwidth total bandwidth utilization |
Processor |
% Processor Time |
_Total |
< 80% |
Amount of total CPU usage across all processors |
Processor |
% User Time |
_Total |
< 80% |
Amount of total CPU usage in user mode across all processors |
Processor |
% Privileged Time |
_Total |
< 30% |
Amount of total CPU usage in kernel mode across all processors |
Memory |
Available Mbytes |
n/a |
> 100MB |
Available Physical RAM on the box |
SQL Server:Buffer Manager |
Buffer Cache Hit Ratio |
n/a |
> 98% |
Indication of how often SQL Server is finding data pages in the buffer pool |
SQL Server:Buffer Manager |
Page Life Expectancy |
n/a |
> 300 |
Indicates pressure on memory as data pages are flushed to disk – inducing IO |
SQL Server:Buffer Manager |
Lazy Writes/sec |
n/a |
< 20 |
Measure of when the lazy writer thread must clear up buffers to free space for pages – inducing IO |
SQL Server:Buffer Manager |
Free Pages |
n/a |
> 640 |
Measure of how many free pages are in the buffer pool for use |
SQL Server:Wait Statistics |
Page IO latch waits |
Average wait time (ms) |
None |
Average wait time spent waiting on Page IO between disk and memory |
Additionally, by storing scalar values returned from any of the queries in this post into a variable and executing those queries in a loop or a job, you can then make them available to Performance Monitor as a counter to plot them with other metrics. This allows you to get more granular detailed information and track IO waits and other data on a particular database, application, or session if required.
If you can quantify the information from a query as a single numeric value, then you can pass it out to Performance Monitor to enhance your analysis. For instance, you may choose to find out how many outstanding IOPs exist on SQL Server waiting longer than 10ms with:
select COUNT(*) from sys.dm_io_pending_io_requests where io_type = 'disk' and io_pending_ms_ticks > 10
Or you could map the time spent waiting on PAGEIOLATCH_% with:
select SUM(wait_time_ms) from sys.dm_os_wait_stats where wait_type like 'PAGEIOLATCH%'
For more information on setting up a custom performance counter, check here:
Creating a custom performance monitor counter for SQL Server
SQL Server Dynamic Management Views (DMVs):
The following DMVs will return extensive information that will help determine and diagnose an IO problem as seen from within SQL Server:
To check for outstanding IOPs from within SQL Server waiting longer than 10ms, start with:
select * from sys.dm_io_pending_io_requests where io_type = 'disk' and io_pending_ms_ticks > 10
To get information on IO stalls (a wait for IO that can’t be completed immediately), and the amount of traffic per file in a database, use:
select * from sys.dm_io_virtual_file_stats(db_id(‘<ENTER YOUR DATABASE NAME HERE>’), NULL)
To see if IO is the primary waiter for SQL Server, execute the following and look for wait types with a naming convention like PAGEIOLATCH_%%:
select top 10 * from sys.dm_os_wait_stats order by wait_time_ms desc
To monitor PAGEIOLATCHes for increases in time, periodically check and store these:
select * from sys.dm_os_wait_stats where wait_type like 'PAGEIOLATCH%'
To capture current executing statements that are waiting on IO or just finished waiting on IO:
select
r.session_id,
s.login_name,
s.program_name,
r.start_time,
r.status,
r.command,
r.wait_type,
r.wait_time,
r.last_wait_type,
r.logical_reads,
(r.logical_reads * 8192) as 'KB Read',
r.writes,
(r.writes * 8192) as 'KB Written',
t.[text]
from sys.dm_exec_requests r
cross apply sys.dm_exec_sql_text(sql_handle) t
inner join sys.dm_exec_sessions s
on r.session_id = s.session_id
where s.is_user_process = 1 and
(r.wait_type like 'PAGEIOLATCH%' or r.last_wait_type like
'PAGEIOLATCH%') and
r.session_id != @@SPID
Finally, suspect I/O capacity or bandwidth can be verified with SQLIO and the I/O integrity checked with SQLIOSim:
SQLIO Disk Subsystem Benchmark Tool (SQLIO is a tool provided by Microsoft which can also be used to determine the I/O capacity of a given configuration.)
How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem
https://support.microsoft.com/kb/231619
- Jay