New for Windows Vista: thread agnostic I/O

When an I/O is issued on behalf of a system service call (for example, NtReadFile,
NtWriteFile, etc.), the I/O manager creates a threaded IRP and then issues the
I/O. I previously wrote about this. Today, Paul has returned as a guest writer
and will talk about threaded (or lack thereof ;)) IRP functionality in Vista...

One major change in I/O in Vista was the addition of thread agnostic I/O.
What is thread agnostic I/O, you ask? It simply means that the original
thread which issued the I/O no longer has to be present when the I/O it issues
completes. From a driver writer's perspective, this change is completely under the covers, although
it can bite developers in subtle ways if they were making incorrect assumptions
about the calling thread always being around when the IRP completes.

Typically when an IRP completes, the I/O manager attaches back to the original thread
and finishes the completion of the IRP. Attaching back to the original thread is expensive.
Any time the I/O manager can avoid this, especially in paths which demand high performance
such as the read path, developers are happy. If a file object has a completion port
associated with it (*), the I/O manager tries to optimize
out the attaching back to the original thread. In addition, developers who spin up a lot of
threads to issue I/O requests don't necessarily care if the original threads hang
around until the I/O completes. Now, with the addition of thread agnostic I/O,
the original threads can exit before the I/O completes.

Instead of immediately queuing requests onto the issuing thread (**),
the I/O manager checks the I/O to see if it is asynchronous and targeted towards a file object with a
completion port. If both of these conditions are met (and the completion port
and file object are not being torn down), the I/O manager queues the IRP to the file object. When
the I/O is eventually completed, the IRP is placed in the completion port until the application retrieves it
(by calling GetQueuedCompletionStatus()). Once
the IRP is retrieved from the completion port, the I/O manager performs the processing that would
have previously taken place in the original thread's context. This might
require attaching to the original process, but it does not require the issuing thread.
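
The queuing decision described above can be sketched as a small toy model. This is not the I/O manager's actual code: the types and the function name (ModelFileObject, ModelIrp, model_choose_queue) are hypothetical stand-ins invented for illustration, and the sketch only covers the completion port case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-ins, not the real kernel
   structures (IRP, FILE_OBJECT, ETHREAD). */
typedef enum { QUEUE_ON_THREAD, QUEUE_ON_FILE_OBJECT } IrpQueueTarget;

typedef struct {
    bool has_completion_port; /* a completion port is associated with the file object */
    bool being_torn_down;     /* the file object or its completion port is going away */
} ModelFileObject;

typedef struct {
    bool via_nt_service;      /* issued through an NT service such as NtReadFile */
    bool is_asynchronous;     /* the request was issued as asynchronous I/O */
    const ModelFileObject *file;
} ModelIrp;

/* Decide where a newly issued IRP is queued, following the rules in the text:
   asynchronous + completion port + nothing being torn down => file object.
   Everything else stays on the issuing thread. */
IrpQueueTarget model_choose_queue(const ModelIrp *irp)
{
    assert(irp != NULL && irp->file != NULL);

    if (irp->via_nt_service &&
        irp->is_asynchronous &&
        irp->file->has_completion_port &&
        !irp->file->being_torn_down) {
        return QUEUE_ON_FILE_OBJECT;
    }
    return QUEUE_ON_THREAD;
}
```

In this model, a synchronous request, or an asynchronous request on a file object without a completion port, still lands on the issuing thread, which matches the pre-Vista behavior.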

Thread agnostic I/O also changes the behavior of CancelIo().
Since the IRPs are no longer associated with the thread, they will not be cancelled
when this API is called. There are some new cancellation APIs for Longhorn that
take care of this problem; I will discuss them in a future post.
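
To see why a thread-scoped cancel misses these requests, here is a rough toy model. It assumes that the cancel only looks at IRPs still linked to the calling thread; ModelPendingIrp and model_cancel_io are hypothetical names made up for this sketch, not real kernel or Win32 definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical stand-in for a pending request. */
typedef struct {
    int  file_id;           /* which file object the request targets */
    int  issuing_thread_id; /* thread that issued the request */
    bool queued_on_thread;  /* false when the IRP sits on the file object's list */
    bool cancelled;
} ModelPendingIrp;

/* Models a thread-scoped cancel in the spirit of CancelIo(): only requests
   for the given file that are still queued on the calling thread are found.
   Returns how many requests were marked cancelled. */
size_t model_cancel_io(ModelPendingIrp *irps, size_t count,
                       int calling_thread_id, int file_id)
{
    size_t cancelled = 0;

    assert(irps != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        ModelPendingIrp *irp = &irps[i];
        if (irp->file_id == file_id &&
            irp->queued_on_thread &&
            irp->issuing_thread_id == calling_thread_id &&
            !irp->cancelled) {
            irp->cancelled = true;
            ++cancelled;
        }
    }
    return cancelled;
}
```

A request that was queued to the file object has queued_on_thread set to false in this model, so the loop skips it even though the calling thread issued it.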

So, what does this mean for driver writers?

  1. Since IRPs can now be queued on the file object, the !thread
    and !process debugger extensions may not display
    all pending I/O. To fix this, the !fileobj
    debugger extension has been updated to dump the list of IRPs that are
    associated with the target file object. If you can't find your IRP queued to a
    thread, try this extension out on the target file object.
  2. Don't assume that the original thread
    will stick around when you pend asynchronous I/O and handle it later on a worker
    thread.

* - There is one other case in which the I/O manager does this. An application
can now lock a range of overlapped structures into memory by calling
SetFileIoOverlappedRange().
After making this call, any non-buffered requests that have I/O status blocks
that lie in this range do not need the original thread (as their memory is locked into system
space).
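
The condition in this footnote boils down to a containment check. The sketch below is a toy model under that assumption; ModelLockedRange and both function names are hypothetical and are not taken from any Windows header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: the range an application registered for its
   overlapped structures. */
typedef struct {
    uintptr_t base;   /* start address of the locked range */
    size_t    length; /* size of the locked range in bytes */
} ModelLockedRange;

/* True when [block, block + block_size) lies entirely inside the range.
   Written with subtractions so the arithmetic cannot wrap around. */
bool model_block_in_locked_range(const ModelLockedRange *range,
                                 uintptr_t block, size_t block_size)
{
    assert(range != NULL);

    if (block_size > range->length || block < range->base) {
        return false;
    }
    return (block - range->base) <= (range->length - block_size);
}

/* Per the footnote: a non-buffered request whose I/O status block lies in
   the locked range does not need the original thread. */
bool model_can_skip_original_thread(const ModelLockedRange *range,
                                    uintptr_t status_block,
                                    size_t status_block_size,
                                    bool is_buffered)
{
    return !is_buffered &&
           model_block_in_locked_range(range, status_block, status_block_size);
}
```

A status block that straddles the end of the range fails the check in this model, so such a request would still be treated as needing the original thread.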

** - This only applies to requests that go through the NT
services. For example, IRPs that are built by calling
IoBuildSynchronousFsdRequest() will still be queued
to the calling thread.