How It Works: sys.dm_io_pending_io_requests

I have been working an issue where the DMV was returning io_pending_ms_ticks incorrectly. The following output is an example of ~164 hour delay. Unlikely to occur without any other side effects noted.

select * from sys.dm_io_pending_io_requestsimage

In the process of this investigation I uncovered several details that I found helpful to share.

Full Text and File Streams
There is a small set of I/Os that could show a large io_pending_ms_ticks value but io_pending should = 1. There are administrator interfaces for both the full text and file stream features. Think if them a bit like a new TDS connection to the server. When full text or a file stream request arrives the request needs to be processed. These requests are simply waiting on the arrival of a new request from the respective features. They seldom show up on this list and the file handle will not map to any of the handles exposed in the DMF - virtual_file_stats.

io_pending is the key
The io_pending column indicates 1 if the I/O is still pending within the kernel. A value of 0 indicates the I/O only needs to be processed by the local SOS scheduler. In this case we are not getting any I/O delay warnings, performance monitor is not showing I/O issues and there are no SOS scheduler issues.  

Dirty Read
After some more digging the issue is a DIRTY READ. SQL Server maintains a set of I/O request structures (request dispenser). When the I/O completes the request structure is returned to the dispenser and can be reused for another I/O. The DMV needs to materialize the list without causing scheduler issues on the system so it is designed to perform (NO LOCK) dirty reads of the I/O list. This is where your system may return you incorrect io_pending_ms_ticks value when the io_pending flag = 0.

Take the diagram below as an example. It is possible that the DMV is being materialized on one scheduler but the I/O is being completed on the I/Os owning scheduler. (Each scheduler has its own I/O list).

image

The DMV query can get access to the I/O request structure but it does not hold any locks on the structure. If it did it could lead to unwanted blocking. Take for instance that the application fetching the DMV output stalls. If a lock is held on the I/O list the lock can't be released until the entire list is properly traversed. Because the client is not fetching results it could lead to the I/O being stalled on the scheduler for a long period of time.

To avoid this the list is run in a dirty fashion. However, this means the IO request could finish I/O completion in parallel with the reading of the structure data. The SQL Server protects the DMV query but does not indicate to the DMV user that no lock data movement has occurred. Instead it is possible that the output can become skewed, as shown in previous output. The IO data structure can be re-assigned between the time the DMV query starts to read the information and the time all columns are produced, leading to unexpected output.

SQL Server 2012 (Denali) updates this behavior by adding a signature to the I/O request. This allows SQL to maintain the dirty read capability while also identifing I/O requests that fall into this category.

When you see large pending_io_ms_ticks consider the io_pending flag and additional scheduler warning information (178** messages) in the error log.

Bob Dorr - Principal SQL Server Escalation Engineer