Tips On How To Analyze Strange Crash Dumps And Uninstall Hidden Drivers

Recently, a friend of mine had the following problem: his computer crashed exactly 2 hours after booting into Windows. As usual, I opened WinDbg and ran !analyze -v on the minidumps, but I didn't get any useful information:

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too high.  This is usually caused by drivers using improper addresses. If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000081, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 865d7aa8, address which referenced memory

Debugging Details:
------------------
READ_ADDRESS:  00000081
CURRENT_IRQL:  2
FAULTING_IP: +ffffffff865d7aa8
865d7aa8 0000             add     [eax],al

CUSTOMER_CRASH_COUNT:  2
DEFAULT_BUCKET_ID:  DRIVER_FAULT
BUGCHECK_STR:  0xD1
LAST_CONTROL_TRANSFER:  from 80544f5f to 865d7aa8

STACK_TEXT:  
WARNING: Frame IP not in any known module. Following frames may be wrong.
f78aafcc 80544f5f 865941a8 86581000 00000000 0x865d7aa8
f78aaff4 80544acb a7554d44 00000000 00000000 nt!KiRetireDpcList+0x61
f78aaff8 a7554d44 00000000 00000000 00000000 nt!KiDispatchInterrupt+0x2b
80544acb 00000000 00000009 0081850f bb830000 0xa7554d44

STACK_COMMAND:  kb
FOLLOWUP_IP:
nt!KiRetireDpcList+61
80544f5f 837c240c00       cmp     dword ptr [esp+0xc],0x0
FAULTING_SOURCE_CODE:  
SYMBOL_STACK_INDEX:  1
FOLLOWUP_NAME:  MachineOwner
SYMBOL_NAME:  nt!KiRetireDpcList+61
MODULE_NAME:  nt
IMAGE_NAME:  ntkrpamp.exe
DEBUG_FLR_IMAGE_TIMESTAMP:  4356d823
FAILURE_BUCKET_ID:  0xD1_nt!KiRetireDpcList+61
BUCKET_ID:  0xD1_nt!KiRetireDpcList+61
Followup: MachineOwner

Unfortunately, WinDbg has no other information from the call stack, so it can only point to the Windows kernel. This is common when a driver really messes things up. When it happens, be prepared to do some more advanced digging or some more "trial and error" (like I did). I'm sure the problematic driver could have been found using Driver Verifier or some more advanced techniques, but here I would like to show a quick-and-dirty solution.
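Before digging further, it helps to read the four bugcheck arguments as one sentence. The following is only a minimal Python sketch of that idea: the excerpt is copied from the !analyze -v output above, while the function names (parse_bugcheck_args, describe_d1) are my own and not part of WinDbg.

```python
import re

# Excerpt copied from the !analyze -v output shown in this post.
ANALYZE_EXCERPT = """\
Arg1: 00000081, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 865d7aa8, address which referenced memory
"""

# Matches lines of the form "ArgN: <hex value>, <description>".
ARG_LINE = re.compile(r"^Arg(\d): ([0-9a-fA-F]+), (.*)$", re.MULTILINE)


def parse_bugcheck_args(text):
    """Return {arg_number: (value, description)} from !analyze -v text."""
    return {
        int(number): (int(value, 16), description.strip())
        for number, value, description in ARG_LINE.findall(text)
    }


def describe_d1(args):
    """Summarize the four arguments of bugcheck 0xD1 in one sentence."""
    # Arg3 is 0 for a read and 1 for a write, as stated in the dump text.
    access = "read" if args[3][0] == 0 else "write"
    return (
        f"code at 0x{args[4][0]:08x} tried to {access} "
        f"0x{args[1][0]:08x} at IRQL {args[2][0]}"
    )


args = parse_bugcheck_args(ANALYZE_EXCERPT)
summary = describe_d1(args)
print(summary)
```

Read this way, the interesting value is Arg4: it is the address of the code that made the bad access, and it is the same address that WinDbg could not map to any known module.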

Since the problem happened exactly 2 hours after he booted into Windows, a hardware problem seemed very unlikely (a hardware fault would strike at more random times). Also, the bugcheck code points to a driver fault. As a first step, I executed

lm
lm kv m specific_driver*

to list all the drivers that were loaded into the system and to get detailed information about some "interesting" ones. I saw that no driver was loaded at an address close to 0x865d7aa8, the faulting address from the dump.
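Checking by eye whether an address falls inside a loaded module gets tedious with a long list, so here is a rough Python sketch of the same check. The module list below is made up for illustration (it is not my friend's real lm output), and the helper names parse_lm and owning_module are hypothetical. It only assumes that each line starts with a start address, an end address and a module name.

```python
# Made-up module list in the general shape of `lm` output.
# These ranges and the name "example_drv" are illustrative only.
SAMPLE_LM = """\
start    end        module name
804d7000 806e2000   nt
806e2000 80702d00   hal
f7987000 f7995000   example_drv
"""


def parse_lm(text):
    """Return a list of (start, end, name) tuples parsed from lm-style text."""
    modules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            # 64-bit debuggers print addresses with a backtick separator.
            start = int(parts[0].replace("`", ""), 16)
            end = int(parts[1].replace("`", ""), 16)
        except ValueError:
            continue  # header line or anything else that is not a range
        modules.append((start, end, parts[2]))
    return modules


def owning_module(modules, address):
    """Return the name of the module containing address, or None."""
    for start, end, name in modules:
        if start <= address < end:
            return name
    return None


modules = parse_lm(SAMPLE_LM)
print(owning_module(modules, 0x80544F5F))  # return address inside nt
print(owning_module(modules, 0x865D7AA8))  # faulting address from the dump
```

With this sample data, the return address on the stack resolves to nt, while the faulting address resolves to nothing. That matches the "Frame IP not in any known module" warning in the dump.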

The next step was to isolate the drivers that seemed more suspicious than others. I found that an easy way to look at the drivers running on a system is DriverView. This tool shows approximately the same information as WinDbg (driver name, corresponding filename, description, company name, etc.), but also has a nice GUI. So, after finding some "interesting" cases, the next step was to uninstall some drivers. Of course, before that I tried to enable Driver Verifier on different driver categories, but this took quite some time and I opted for an easier solution :)

The problem here was that, by default, not all drivers are visible from the Control Panel. To show all drivers (even the hidden ones), do the following:

  1. Open a command prompt (e.g. go to Start/Run and type cmd)
  2. At the command prompt, type "set devmgr_show_nonpresent_devices=1" (without the quotes)
  3. Type "devmgmt.msc" (without the quotes) and press enter. The Device Manager comes up.
  4. Go to "View" -> "Show Hidden Devices" and you'll see all the drivers that are currently running on your computer, as well as the drivers that are installed, but not running (e.g. because the corresponding device is not currently connected to the computer).
  5. Remove the drivers that you are sure you don't need by right-clicking on the corresponding device and clicking "Uninstall". Don't remove all the unused devices just because they are greyed out! Remove only the ones that you really don't need. When you are done, you can close the Device Manager.
  6. Keep in mind that, until you close the above command prompt, every instance of device manager that you launch from there will be able to show all the hidden devices.
  7. Also, if you want to avoid setting the variable every time you want to look at the hidden device drivers, you can go to Control Panel -> System -> Advanced -> Environment Variables and set it there.
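Steps 1 to 3 above can also be scripted. The following Python sketch is just one way to do it, not the only one: it sets the variable from step 2 for a single child process and then opens devmgmt.msc through cmd, as in step 3. The function names are my own, and launch_device_manager only makes sense on Windows.

```python
import os
import subprocess

# Variable name taken from step 2 above.
ENV_VAR = "devmgr_show_nonpresent_devices"


def hidden_devices_env(base_env=None):
    """Return a copy of the environment with the variable from step 2 set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = "1"
    return env


def launch_device_manager():
    """Windows only: open Device Manager with the variable set (steps 2 and 3)."""
    # Going through cmd mirrors typing devmgmt.msc at the command prompt.
    return subprocess.Popen(
        ["cmd", "/c", "devmgmt.msc"], env=hidden_devices_env()
    )
```

Calling launch_device_manager() on a Windows machine should bring up the Device Manager. You still have to tick "View" -> "Show Hidden Devices" yourself (step 4). Because the variable is set only for the child process, nothing changes in your own environment.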

Of course, if you want a tool that lets you remove drivers easily, you can download Driver Manager, which shows the list of running drivers and allows you to disable or remove them.

So, in my case, after removing different sets of suspicious drivers, I found the culprit and removed it from the system, and everything is now back to normal :)