Threaded and non-threaded PIRPs

Broadly speaking, there are two (*) types of PIRPs in Windows. By "type" I do not mean the MajorFunction (e.g. IRP_MJ_Xxx) of the PIRP. Rather, a PIRP can be categorized as either threaded or non-threaded. Hopefully I will be able to introduce and clearly explain the distinction between these two types, but this is a complicated topic, so this entry is a bit long!

Before I get into the heart of the matter, I need to define completing a request. Completion of a PIRP is not always as simple as calling IoCompleteRequest(). When I talk about completing a PIRP, it can mean any of the following:

  1. Your driver doesn't forward the PIRP to another driver and just calls IoCompleteRequest()
  2. Your completion routine calls IoCompleteRequest() and returns STATUS_MORE_PROCESSING_REQUIRED
  3. Your completion routine does not call IoCompleteRequest() and returns a value != STATUS_MORE_PROCESSING_REQUIRED

Threaded PIRPs

Simply, a threaded PIRP is tied to the lifetime of the thread that initially sent the PIRP. The initiating thread will not exit until every one of its threaded PIRPs has been completed. The kernel object that represents the thread keeps a list of sent I/O and updates this list (**) whenever a driver issues an I/O request. This list is also updated if the driver creates a threaded PIRP and sends it to another driver. How a PIRP is enqueued on the list is rather straightforward:

  • A caller initiates I/O from either user mode (e.g. ReadFile()) or kernel mode (e.g. ZwReadFile())
  • The I/O manager then
    • Finds the underlying top level PDEVICE_OBJECT that corresponds to the file handle
    • Allocates a PIRP
    • Adds the new PIRP to the list of I/O sent on the current thread and stores a pointer to the current thread in the PIRP
    • Sends the PIRP to the underlying PDEVICE_OBJECT

But how does the PIRP get removed from the list? It is removed when the top level device in the stack that was the target of the I/O request completes the PIRP. How is this different from a lower level device object (assuming the top level device sent the PIRP down to a lower level device) completing the PIRP back to the top level device? The answer lies in how the current stack location in the PIRP changes when the PIRP is either sent to another driver or completed. Remember that the PIRP contains an I/O stack location for each device object in the stack. When you send a PIRP to another driver, the current stack location is decremented (pushed might be another way to think of it). Each time a device object completes the PIRP, the I/O manager increments (or pops) the current stack location, thus "unwinding" back to the previous stack location. So, when the lower level device object completes the PIRP, the PIRP still contains valid I/O stack locations. When no more stack locations remain in the PIRP, the PIRP has completed back to the I/O manager, which then removes the PIRP from the original thread's list and queues an APC to the thread.
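
The push/pop behavior described above can be sketched with a small user-mode model. This is only an illustration under stated assumptions: every name here (ModelThread, ModelIrp, ModelInitThreadedIrp, ModelSendDown, ModelCompleteOneLevel) is hypothetical, the thread's list is reduced to a simple counter, and none of it is the real I/O manager's code or data layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; the real PIRP and thread objects look different. */
typedef struct ModelThread {
    int pendingIrps;              /* stands in for the thread's list of sent I/O */
} ModelThread;

typedef struct ModelIrp {
    int stackCount;               /* one stack location per device in the stack */
    int currentLocation;          /* stackCount + 1 means "no location in use" */
    ModelThread *thread;          /* initiating thread, NULL if non-threaded */
} ModelIrp;

/* Model of the I/O manager allocating a threaded IRP on the current thread. */
void ModelInitThreadedIrp(ModelIrp *irp, int stackCount, ModelThread *t)
{
    irp->stackCount = stackCount;
    irp->currentLocation = stackCount + 1;
    irp->thread = t;
    t->pendingIrps++;             /* enqueue on the thread's list */
}

/* Sending the IRP to the next driver pushes (decrements) the location. */
void ModelSendDown(ModelIrp *irp)
{
    assert(irp->currentLocation > 1);   /* there must be a location left */
    irp->currentLocation--;
}

/*
 * One device completing the IRP pops (increments) the location.
 * Returns 1 when no stack locations remain, meaning the IRP has completed
 * back to the I/O manager and leaves the thread's list; otherwise returns 0.
 */
int ModelCompleteOneLevel(ModelIrp *irp)
{
    assert(irp->currentLocation <= irp->stackCount);
    irp->currentLocation++;
    if (irp->currentLocation > irp->stackCount) {
        if (irp->thread != NULL) {
            irp->thread->pendingIrps--; /* dequeue from the thread's list */
        }
        return 1;
    }
    return 0;
}
```

With a two-device stack in this model, the lower device's completion returns 0 and the PIRP stays on the thread's list. Only the top device's completion returns 1 and removes it.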

Non-threaded PIRPs

Let's push the threaded PIRP discussion onto the stack and talk about non-threaded PIRPs for a second. A non-threaded PIRP is not associated with any initiating thread. As such, the initiating thread pointer which is set for a threaded PIRP is instead left NULL. This is the key: there is no initiating thread pointer stored in a non-threaded PIRP.

Why you need to know the difference

So why is any of this important to you as a driver writer? If your driver allocates PIRPs, it must be careful to complete only the threaded PIRPs, and not the non-threaded PIRPs. If you, as the allocator of a non-threaded PIRP, complete it back to the I/O manager, the I/O manager will blindly attempt to queue the APC to the initiating thread. But the thread pointer is NULL in this case, so you will get a bug check, most likely code 0xA (IRQL_NOT_LESS_OR_EQUAL). What this means is that if you allocate a non-threaded PIRP in your driver, you must not complete it back to the I/O manager. Instead, you must free the PIRP by calling IoFreeIrp(). The "Bad Code and call stacks" section later in this entry has example code and the call stacks you get when you complete a non-threaded PIRP back to the I/O manager instead of freeing it with IoFreeIrp().
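
To make the rule concrete, here is a rough user-mode model of the last step of completion. It is a sketch under assumptions, not kernel code: all of the Model* names are made up for illustration, the model handles a single completion routine, and it reports an outcome instead of crashing where the text above says the I/O manager would use the NULL thread pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two kinds of completion routine result. */
typedef enum {
    MODEL_CONTINUE_COMPLETION,
    MODEL_MORE_PROCESSING_REQUIRED
} ModelStatus;

/* What the modeled I/O manager ended up doing with the IRP. */
typedef enum {
    MODEL_STOPPED_BY_ROUTINE,     /* the completion routine reclaimed the IRP */
    MODEL_APC_QUEUED,             /* threaded IRP: APC queued to its thread */
    MODEL_WOULD_BUGCHECK          /* non-threaded IRP: thread pointer is NULL */
} ModelOutcome;

typedef struct ModelThread {
    int apcCount;
} ModelThread;

typedef struct ModelIrp {
    ModelThread *thread;          /* NULL for a non-threaded IRP */
    ModelStatus (*completionRoutine)(struct ModelIrp *irp);
    int freed;                    /* set when the allocator frees the IRP */
} ModelIrp;

/* Mirrors the mistake: let completion continue back to the I/O manager. */
ModelStatus ModelBadCompletionRoutine(ModelIrp *irp)
{
    (void)irp;
    return MODEL_CONTINUE_COMPLETION;
}

/* Mirrors the fix: free the IRP and stop completion. */
ModelStatus ModelGoodCompletionRoutine(ModelIrp *irp)
{
    irp->freed = 1;               /* a real driver would call IoFreeIrp() here */
    return MODEL_MORE_PROCESSING_REQUIRED;
}

/* Model of the final completion step for an IRP with one completion routine. */
ModelOutcome ModelCompleteRequest(ModelIrp *irp)
{
    if (irp->completionRoutine != NULL &&
        irp->completionRoutine(irp) == MODEL_MORE_PROCESSING_REQUIRED) {
        return MODEL_STOPPED_BY_ROUTINE;  /* the IRP must not be touched again */
    }
    if (irp->thread == NULL) {
        return MODEL_WOULD_BUGCHECK;      /* no thread to queue the APC to */
    }
    irp->thread->apcCount++;
    return MODEL_APC_QUEUED;
}
```

In this model a non-threaded IRP (thread == NULL) reaches MODEL_WOULD_BUGCHECK with the bad routine, while the good routine stops completion before the thread pointer is ever consulted.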

 

How you know what to do when

Great, so PIRPs are now even more complicated than you originally thought. How are you supposed to know what to do when? First off, if you didn't allocate the PIRP, you can safely complete it back to the caller and not worry about the type of PIRP. You never need to call IoFreeIrp() on a PIRP that was presented to your driver in one of your dispatch routines. You just have to call IoCompleteRequest() to complete the PIRP back to the I/O manager.

If you did allocate the PIRP, the following table describes the type of PIRP each of the PIRP allocation routines creates:

Allocates a non-threaded PIRP      Allocates a threaded PIRP in the current thread
-----------------------------      -----------------------------------------------
IoAllocateIrp()                    IoBuildSynchronousFsdRequest()
IoBuildAsynchronousFsdRequest()    IoBuildDeviceIoControlRequest()
                                   TdiBuildInternalDeviceControlIrp()

For a threaded PIRP you must complete the PIRP back to the I/O manager. For a non-threaded PIRP, you must call IoFreeIrp(). If I missed any WDM functions please let me know (I'll cover how KMDF handles this differently in another topic). If you are reading the DDK docs and you are unsure what type of PIRP is being allocated, a good way to tell is whether or not the docs tell you to call IoFreeIrp() on the resulting PIRP.

There is one exception to the rule for how to free a non-threaded PIRP. If you allocate a PIRP by calling IoMakeAssociatedIrp() (which is not a very common function for a driver to call), you must complete the PIRP back to the I/O manager so that the MasterIrp can be completed when the last associated PIRP has completed.
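
As a quick reference, the rules in this section can be written down as a lookup. This is a hypothetical helper for illustration only: ModelDisposal and ModelDisposalFor are made-up names, nothing like them exists in the WDK, and the list covers only the routines named in this entry.

```c
#include <assert.h>
#include <string.h>

/* How the allocator of the PIRP is expected to get rid of it. */
typedef enum {
    MODEL_FREE_WITH_IOFREEIRP,        /* non-threaded: call IoFreeIrp() */
    MODEL_COMPLETE_TO_IO_MANAGER,     /* let completion reach the I/O manager */
    MODEL_UNKNOWN                     /* not covered by this entry */
} ModelDisposal;

/* Maps an allocation routine name to the disposal rule described above. */
ModelDisposal ModelDisposalFor(const char *allocator)
{
    static const struct {
        const char *name;
        ModelDisposal how;
    } table[] = {
        { "IoAllocateIrp",                    MODEL_FREE_WITH_IOFREEIRP },
        { "IoBuildAsynchronousFsdRequest",    MODEL_FREE_WITH_IOFREEIRP },
        { "IoBuildSynchronousFsdRequest",     MODEL_COMPLETE_TO_IO_MANAGER },
        { "IoBuildDeviceIoControlRequest",    MODEL_COMPLETE_TO_IO_MANAGER },
        { "TdiBuildInternalDeviceControlIrp", MODEL_COMPLETE_TO_IO_MANAGER },
        /* The exception: non-threaded, but completed so the MasterIrp finishes. */
        { "IoMakeAssociatedIrp",              MODEL_COMPLETE_TO_IO_MANAGER },
    };
    size_t i;

    assert(allocator != NULL);
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcmp(allocator, table[i].name) == 0) {
            return table[i].how;
        }
    }
    return MODEL_UNKNOWN;
}
```

For a routine that is not listed, the helper returns MODEL_UNKNOWN. In that case, check whether the docs tell you to call IoFreeIrp() on the resulting PIRP.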

Bad Code and call stacks

The following code snippet shows how to incorrectly complete a non-threaded PIRP back to the I/O manager. The call stacks are from an interim build of Vista, so the offsets and exact functions in the stack may vary from your machine.

NTSTATUS
BadCompletionRoutine(
    PDEVICE_OBJECT DeviceObject,
    PIRP Irp,
    PVOID Context
    )
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Irp);
    UNREFERENCED_PARAMETER(Context);

    //
    // By returning STATUS_CONTINUE_COMPLETION, the PIRP will be completed back
    // to the I/O manager
    //
    return STATUS_CONTINUE_COMPLETION;
}

NTSTATUS SendIrp(PDEVICE_OBJECT TargetDeviceObject)
{
    PIRP pIrp;
    PIO_STACK_LOCATION pNext;

    // allocate a non-threaded PIRP
    pIrp = IoAllocateIrp(TargetDeviceObject->StackSize, FALSE);
    if (pIrp == NULL) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    IoSetCompletionRoutine(pIrp, BadCompletionRoutine, NULL, TRUE, TRUE, TRUE);

    pNext = IoGetNextIrpStackLocation(pIrp);

    // ..format pNext, in this case a PURB to send to the USB core...

    return IoCallDriver(TargetDeviceObject, pIrp);
}

and the subsequent bugcheck

IRQL_NOT_LESS_OR_EQUAL (a)

Arguments:
Arg1: 00000054, memory referenced
Arg2: 0000001b, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 81bb3e79, address which referenced memory

818f1b40 8187c4e0 0000000a 00000054 0000001b nt!KeBugCheck2+0x5f4
818f1b40 81bb3e79 0000000a 00000054 0000001b nt!_KiTrap0E+0x2ac
818f1bd0 8189ab83 846d8348 846d8308 00000000 hal!KeAcquireInStackQueuedSpinLockRaiseToSynch+0x19
818f1bf0 8189511c 846d8348 00000000 00000000 nt!KeInsertQueueApc+0x21
818f1c28 882094a8 8187e724 8454d9e8 00000000 nt!IopfCompleteRequest+0x431
818f1c64 8820ad4a 85db3028 846d8308 8539c7c0 USBPORT!<PIRP completion function>
[...]

Note that our sample driver is not on the stack. This occurs because we complete the PIRP by returning != STATUS_MORE_PROCESSING_REQUIRED from the completion routine. In this case, it would be incorrect to blame USBPORT for the bugcheck. If we modify the completion routine to explicitly call IoCompleteRequest, we get a different bugcheck call stack.

NTSTATUS
BadCompletionRoutine(
    PDEVICE_OBJECT DeviceObject,
    PIRP Irp,
    PVOID Context
    )
{
    UNREFERENCED_PARAMETER(DeviceObject);
    UNREFERENCED_PARAMETER(Context);

    IoCompleteRequest(Irp, IO_NO_INCREMENT);

    //
    // By returning STATUS_MORE_PROCESSING_REQUIRED, we tell the I/O manager not
    // to continue PIRP completion
    //
    return STATUS_MORE_PROCESSING_REQUIRED;
}

and the subsequent bugcheck

IRQL_NOT_LESS_OR_EQUAL (a)

Arguments:
Arg1: 00000054, memory referenced
Arg2: 0000001b, IRQL
Arg3: 00000001, value 0 = read operation, 1 = write operation
Arg4: 81bb3e79, address which referenced memory

818f1b40 8187c4e0 0000000a 00000054 0000001b nt!KeBugCheck2+0x5f4
81c06da0 81bb3e79 0000000a 00000054 0000001b nt!_KiTrap0E+0x2ac
81c06e30 8189ab83 8461c6a8 8461c668 00000000 hal!KeAcquireInStackQueuedSpinLockRaiseToSynch+0x19
81c06e50 8189511c 8461c6a8 00000000 00000000 nt!KeInsertQueueApc+0x21
81c06e8c 89144060 81c06ec8 81894e24 00000000 nt!IopfCompleteRequest+0x431
81c06e94 81894e24 00000000 8461c668 83017ab0 <sample driver>!BadCompletionRoutine+0xa
81c06ec8 8840f4a8 8187e724 845db620 00000000 nt!IopfCompleteRequest+0x13d
818f1c64 8820ad4a 85db3028 846d8308 8539c7c0 USBPORT!<PIRP completion function>
[...]

(*) You could classify PIRP types differently and say there are more (like paging I/O), but I am ignoring those other types.
(**) Note that the user mode API CancelIo() uses this list to cancel all I/O sent by the calling thread.