SQLIOSim - Outstanding I/O Warnings

I have encountered some questions pertaining to the SQLIOSim outstanding IO warnings and I am posting some details.

The Display monitor thread is responsible for updating the display (progress outputs) and checking for long IOs (CheckForLongIO/ReportLongIo). Each time it wakes (from 1 to 10 second range depending on state) it fires the check.

Each logical CPU (scheduler) in the simulator has an IO list. The CheckForLongIO loops over each logical CPU and checks the pending IO list of each logical scheduler.

When the IO is posted (async) we grab the current time and add the IO to the logical scheduler specific list. When the check is performed each IO in the list is checked with HasOverlappedIoCompleted . This is a WIN32 macro that checks the status stored in the OVERLAPPED structure of the IO to see if the system has reported the IO as complete. When FALSE is returned and the duration of the IO is > 15 seconds the condition is flagged as a ‘stalled’ IO as the OS and sub-system have not completed the specific IO.

The number of these outstanding IOs is accumulated and reported after all logical CPUs have been checked.

The following output shows 2 logical CPUs being used by the simulator.

03/10/07

15:35:58

3264

System

CPUCount = 2, Affinity = 0, IOAffinity = 0, MaxMemory = 630 MB, StopOnError = Yes, TestCycles = 1, TestCycleDuration = 300 s, CacheHitRatio = 100.00%

At 15:36:46 you have 5 IO requests that exceeded 15 seconds. One second later another 72 requests exceeded the 15 second mark. (The same IO is not reported again, once flagged as long it won’t be counted on secondary passes.)

What this does not show is how many IOs have completed. SQLIOSim could have completed 1000s of IOs during this 15 second window but some of the requests are being stalled.

03/10/07

15:36:45

3764

Display Monitor

Error: 0x00000000 Error Text: Description: 5 IO requests are outstanding for more than 15 sec.

03/10/07

15:36:46

3764

Display Monitor

Error: 0x00000000 Error Text: Description: 72 IO requests are outstanding for more than 15 sec.

The files are open with NO buffering so they should not involve system cache components. You should review filter drivers and the like. Likely something like elevator seek or something of this nature is at play when you see the outstanding IO warnings.

 

03/10/07

15:35:58

3264

System

NoBuffering = Yes, WriteThrough = Yes, UseScatterGather = Yes, ForceReadAhead = No, MaxOutstandingIO = 0, TargetIODuration = 100 ms

LOGICALCONFIG::PrintConfigurationToLog

e:yukonsosbranchsqlntdbmsstorengutilsqliosimlogicalconfig.cpp

228

 

03/10/07

15:35:58

3264

System

AllowIOBursts = Yes, DeleteFilesAtStartup = Yes, DeleteFilesAtShutdown = No, StampFiles = No

     

The following error text is reported for each, individual file no more than every 30 seconds, indicating the number of total IOs identified as taking longer than 15 seconds.

The following indicates that 826 IO requests for the file sqliosim.mdx exceeded 15 seconds to complete in the last 30 second processing window. The last IO that is currently STATUS_PENDING (256) is at the specified offset on the specified handle.

 

03/10/07

15:37:16

3764

Display Monitor