Intro to kernel debugging 1

Topic: KD Setup

I am a user-mode developer, but working on the Windows team (HoloLens runs on Windows!) requires knowing how to use a kernel debugger on that OS.  Some problems are difficult to debug with user-mode debuggers alone and are simpler to investigate in a kernel debugger.  Examples include:

  • Failure to launch process
  • Early-stage OS boot or similar
  • Inter-process communication

But how does one learn how to use the kernel debugger on Windows if the code you write only runs in user mode?  Many tutorials are intended for driver authors.  I intend to author a brief intro to kernel debugging from the perspective of someone who doesn't write code there.  However, my perspective also includes being a Microsoft employee.  As such, I have access to source code and symbols that the general public does not have.  Kernel debugging is likely more applicable to someone in my position.

There are some topics that you should learn outside of this tutorial that will make you more effective at kernel debugging:

  • Familiarity with debugging, particularly with any one of: {windbg, cdb, kd}
  • Difference between kernel mode and user mode execution
  • High-level understanding of interrupts and IRQLs

I learned about these topics while on the job or through reading the "Windows Internals" book by Russinovich / Solomon / Ionescu.  If you don't want to read that book, find some other way to get familiar with these concepts, as it really helps.

Let's get started with kd!

Note: This tutorial is part of a series.

Terminology

  • Debug target - The machine being interrogated by the debugger
  • Debug host - The machine doing the interrogating through the debugger.  Likely your dev machine.

Set up apparatus

In the past, setting up kd was cumbersome.  Today, we can whip up a virtual machine and hook up a kernel debugger with a few commands.  Here is how to set up kd with a Hyper-V virtual machine:

  1. Install your Windows OS on the VM
  2. In the Hyper-V settings for the VM, set COM 1 to use a Named Pipe.  I will name mine "test1", which ends up creating a named pipe at \\.\pipe\test1
  3. In an elevated command prompt from within the OS running in the VM (the "debug target"), execute the following commands:
    1. bcdedit /set {default} debug on
    2. bcdedit /dbgsettings serial debugport:1 baudrate:115200
  4. Launch your favorite kernel debugger from your debug host (your dev machine).  My favorite is windbg.exe.
    1. (In windbg, I press control+K to open up the kd window and specify settings through there)
    2. Set the debugger to connect with the following settings:
      1. Type = COM
      2. Baud Rate = 115200
      3. Connect through Pipe
      4. Port = \\.\pipe\test1
      5. Set it to reconnect automatically with Resets = 0
  5. Reboot the VM OS (the "debug target")
  6. As soon as the debug target gets far enough along in the boot process, the kernel debugger will automatically attach
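
The steps above can be collected into a small script.  This is a minimal sketch, not part of the original setup: the function names are hypothetical, and the "-k com:pipe,port=...,resets=0,reconnect" form is assumed to be the command-line counterpart of the dialog settings in step 4, so check it against your debugger docs before relying on it.

```python
def bcdedit_commands(debug_port=1, baud_rate=115200):
    """Commands to run from an elevated prompt on the debug target (step 3)."""
    return [
        "bcdedit /set {default} debug on",
        f"bcdedit /dbgsettings serial debugport:{debug_port} baudrate:{baud_rate}",
    ]


def debugger_command_line(pipe_name, debugger="windbg.exe", resets=0):
    """Command line to run on the debug host (step 4).

    Assumption: "-k com:pipe,port=<pipe>,resets=<n>,reconnect" matches the
    Type = COM / Pipe / Port / Resets / reconnect settings from the dialog.
    """
    # Hyper-V exposes the virtual COM port as \\.\pipe\<name>
    pipe_path = "\\\\.\\pipe\\" + pipe_name
    return f"{debugger} -k com:pipe,port={pipe_path},resets={resets},reconnect"


if __name__ == "__main__":
    for command in bcdedit_commands():
        print(command)
    print(debugger_command_line("test1"))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; the bcdedit lines only make sense inside the VM, and the debugger line only on the host.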

You will see something similar to the following when the debugger officially attaches:

 Opened \\.\pipe\test1
Waiting to reconnect...
Connected to Windows 10 xxxxx x86 compatible target at (Fri Jun 10 14:26:44.374 2016 (UTC - 7:00)), ptr64 FALSE
Kernel Debugger connection established.


First kernel debugging commands

After you have connected, you can break in at any moment in order to see what's going on.  Press control+break (windbg) or control+c (kd, cdb) to break in:

 Break instruction exception - code 80000003 (first chance)
  *******************************************************************************
  *                                                                             *
  *   You are seeing this message because you pressed either                    *
  *       CTRL+C (if you run console kernel debugger) or,                       *
  *       CTRL+BREAK (if you run GUI kernel debugger),                          *
  *   on your debugger machine's keyboard.                                      *
  *                                                                             *
  *                   THIS IS NOT A BUG OR A SYSTEM CRASH                       *
  *                                                                             *
  * If you did not intend to break into the debugger, press the "g" key, then   *
  * press the "Enter" key now.  This message might immediately reappear.  If it *
  * does, press "g" and "Enter" again.                                          *
  *                                                                             *
  *******************************************************************************
  nt!RtlpBreakWithStatusInstruction:
  819a4bc4 cc              int     3

The first thing I always do after connecting the debugger is make sure the symbols are resolved, loaded, and cached.  I normally have my sympath set in the _NT_SYMBOL_PATH environment variable, but you can also set it explicitly with the ".sympath" command.  Non-Microsoft employees should include Microsoft's public symbol server, as follows:

 2: kd> .sympath cache*c:\sym;c:\MySymbolPath1;\\MySymbolServer\SomePath2\foo\bar;srv*https://msdl.microsoft.com/download/symbols
 Symbol search path is: cache*c:\sym;c:\MySymbolPath1;\\MySymbolServer\SomePath2\foo\bar;srv*https://msdl.microsoft.com/download/symbols
 Expanded Symbol search path is: cache*c:\sym;c:\mysymbolpath1;\\mysymbolserver\somepath2\foo\bar;srv*https://msdl.microsoft.com/download/symbols
************* Symbol Path validation summary **************
 Response                         Time (ms)     Location
 Deferred                                       cache*c:\sym
 Deferred                                       c:\MySymbolPath1
 Deferred                                       \\MySymbolServer\SomePath2\foo\bar
 Deferred                                       srv*https://msdl.microsoft.com/download/symbols
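
The symbol path shown above is a semicolon-separated list: a local cache entry, any extra directories or shares, and finally a symbol server.  The sketch below assembles such a string so it can be pasted into .sympath or stored in _NT_SYMBOL_PATH.  The helper name and its parameters are hypothetical; only the "cache*" and "srv*" prefixes and the server URL come from the example above.

```python
PUBLIC_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_sympath(cache_dir, extra_paths=(), symbol_server=PUBLIC_SYMBOL_SERVER):
    """Join the pieces of a symbol path in the order used in the example."""
    parts = [f"cache*{cache_dir}"]          # local cache for downloaded symbols
    parts.extend(extra_paths)               # plain directories or UNC shares
    parts.append(f"srv*{symbol_server}")    # symbol server, searched last
    return ";".join(parts)


if __name__ == "__main__":
    print(build_sympath(r"c:\sym"))
```

With no extra paths, this yields the shortest path a reader outside Microsoft would likely need: the cache entry followed by the public server.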

Once the sympath is set, try loading all symbols in order to get the major symbols cached locally.  If this is your first time doing this, it can take a long time, on the order of 3-5 minutes.  Symbol files can be large.

 2: kd> .reload /f *.*
Press ctrl-c (cdb, kd, ntsd) or ctrl-break (windbg) to abort symbol loads that take too long.
Run !sym noisy before .reload to track down problems loading symbols.

If a particular file has trouble loading and you feel you need it, turn on verbose symbol resolving (!sym noisy) and force that one module to reload (.reload /f foo.dll).

Once symbols have loaded, you can use the 'lm' command to see which symbols (if any) are loaded for each module:

 2: kd> lm
start    end        module name
81341000 8134c000   kdcom      (private pdb symbols)  c:\sym\kdcom.pdb\8F834143CF1A42A3BC536396DA9853A91\kdcom.pdb
8181d000 8187e000   hal        (private pdb symbols)  c:\sym\halmacpi.pdb\EE1E389525C24DBEBF46DDDD10F6F6DF1\halmacpi.pdb
8187e000 81e95000   nt         (private pdb symbols)  c:\sym\ntkrpamp.pdb\73A615FD399C45A7A669ACFBAEBC82201\ntkrpamp.pdb
82000000 82072000   storport   (private pdb symbols)  c:\sym\storport.pdb73EE2F1FFD246009F9E36C45650BBF91\storport.pdb
82080000 8208a000   Fs_Rec     (private pdb symbols)  c:\sym\fs_rec.pdb\765EE27940FF458D95602B9CC799F83D1\fs_rec.pdb
82090000 820b7000   ksecpkg    (private pdb symbols)  c:\sym\ksecpkg.pdb\E5C81256D7F84A03B25B69E203B2A8261\ksecpkg.pdb
820c0000 820d1000   mpsdrv     (private pdb symbols)  c:\sym\mpsdrv.pdb\37DB180E8ABE4217B848B65D56E0B8481\mpsdrv.pdb
820f0000 82106000   mountmgr   (private pdb symbols)  c:\sym\mountmgr.pdb\9F665C8CB77344BE8D230633D672E6C71\mountmgr.pdb
82110000 82117000   intelide   (private pdb symbols)  c:\sym\intelide.pdb\A0F94CA95D964E47A325BA72DF9CCFF91\intelide.pdb
82120000 8212e000   PCIIDEX    (private pdb symbols)  c:\sym\pciidex.pdb\27855013369A402FA201B2DAFD92DE1B1\pciidex.pdb
82130000 82139000   atapi      (private pdb symbols)  c:\sym\atapi.pdb\CC79A012F9464F3BAA9D1B3926F6275D1\atapi.pdb
82140000 82169000   ataport    (private pdb symbols)  c:\sym\ataport.pdb\2274ED5AC2B344A8807C0CDD155B154A1\ataport.pdb
82170000 82182000   fileinfo   (private pdb symbols)  c:\sym\fileinfo.pdb\81692D1241464AE898EF26401F6FD0D11\fileinfo.pdb
82190000 821a2000   WimFsf     (private pdb symbols)  c:\sym\wimfsf.pdb\85AD06671DFF4CCABC15341F12D5571C1\wimfsf.pdb
821b0000 82200000   ks         (private pdb symbols)  c:\sym\ks.pdb\432B6F981F0441EDBED8C24DA0C4C2151\ks.pdb
82400000 825da000   Ntfs       (private pdb symbols)  c:\sym\ntfs.pdb\E9EF50EFE34D41FEA4703C0E78C9B0921\ntfs.pdb
825e0000 825ea000   storvsc    (private pdb symbols)  c:\sym\storvsc.pdb\BDCC5C0D12B44942AFB8CC7339081E401\st

(truncated for brevity)

Having symbols loaded for some critical modules is vital to making any sense of anything.  If you aren't getting symbols to load for the 'nt' module, stop what you are doing and figure it out.  Most debugger commands will likely NOT work properly unless you have the 'nt' symbols loaded.  Most other modules in the kernel space are safe to not have symbols loaded for ordinary debugging, but 'nt' is crucial.
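
If you save the 'lm' output to a text file, a few lines of script can confirm that 'nt' really has symbols.  This is a rough sketch that assumes the column layout shown above (start, end, module name, a status in parentheses, then the pdb path).  The "mydriver" line is a made-up example of a module whose symbols have not been loaded, and the function names are hypothetical.

```python
import re

# start  end  module  (status)  optional-path
LM_LINE = re.compile(
    r"^(?P<start>[0-9a-fA-F`]+)\s+(?P<end>[0-9a-fA-F`]+)\s+"
    r"(?P<module>\S+)\s+\((?P<status>[^)]*)\)\s*(?P<path>.*)$"
)

SAMPLE = r"""
start    end        module name
8187e000 81e95000   nt         (private pdb symbols)  c:\sym\ntkrpamp.pdb\73A615FD399C45A7A669ACFBAEBC82201\ntkrpamp.pdb
82000000 82072000   mydriver   (deferred)
"""


def parse_lm(output):
    """Map module name -> (symbol status, pdb path) for each 'lm' row."""
    modules = {}
    for line in output.splitlines():
        match = LM_LINE.match(line.strip())
        if match:  # the header row and blank lines do not match
            modules[match.group("module")] = (match.group("status"), match.group("path"))
    return modules


def has_symbols(modules, name="nt"):
    """True when the module's status mentions pdb symbols (public or private)."""
    status = modules.get(name, ("", ""))[0]
    return "pdb symbols" in status


if __name__ == "__main__":
    modules = parse_lm(SAMPLE)
    for name in ("nt", "mydriver"):
        print(name, "symbols loaded:", has_symbols(modules, name))
```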


Exploring kernel land

My key to exploring the kernel space is the !process command.  I recommend looking at your debugger docs for the command; the debugger docs are well-maintained and very informative.  The !process command allows you to enumerate and query data for all processes in the system.  Here are some sample uses for !process:

Enumerate all processes

 2: kd> !process 0 0
**** NT ACTIVE PROCESS DUMP ****
PROCESS 9fc41040  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001a9000  ObjectTable: 82804000  HandleCount: 510.
    Image: System
PROCESS a514c8c0  SessionId: none  Cid: 0174    Peb: 02471000  ParentCid: 0004
    DirBase: 7ffe0020  ObjectTable: 829a8bc0  HandleCount:  46.
    Image: smss.exe

(truncated for brevity)
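
One detail that is easy to miss in this output: the Cid value is the process ID printed in hex, while tools such as Task Manager show it in decimal.  The sketch below parses a saved copy of the listing and converts each Cid.  It assumes the two-line PROCESS / Image layout shown above; the function name is hypothetical.

```python
import re

PROCESS_LINE = re.compile(r"^PROCESS\s+(?P<addr>[0-9a-fA-F`]+)\s+.*?\bCid:\s+(?P<cid>[0-9a-fA-F]+)")
IMAGE_LINE = re.compile(r"^\s*Image:\s+(?P<image>.+?)\s*$")

SAMPLE = """\
**** NT ACTIVE PROCESS DUMP ****
PROCESS 9fc41040  SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000
    DirBase: 001a9000  ObjectTable: 82804000  HandleCount: 510.
    Image: System
PROCESS a514c8c0  SessionId: none  Cid: 0174    Peb: 02471000  ParentCid: 0004
    DirBase: 7ffe0020  ObjectTable: 829a8bc0  HandleCount:  46.
    Image: smss.exe
"""


def parse_processes(output):
    """Return one dict per PROCESS entry: address, decimal pid, image name."""
    processes = []
    current = None
    for line in output.splitlines():
        match = PROCESS_LINE.match(line.strip())
        if match:
            current = {
                "address": match.group("addr"),
                "pid": int(match.group("cid"), 16),  # Cid is hexadecimal
                "image": None,
            }
            processes.append(current)
            continue
        match = IMAGE_LINE.match(line)
        if match and current is not None:
            current["image"] = match.group("image")
    return processes


if __name__ == "__main__":
    for proc in parse_processes(SAMPLE):
        print(proc["address"], proc["pid"], proc["image"])
```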

Locate a specific process

This example is looking for all instances of meason_test.exe, which is a test app I created that does nothing but Sleep(INFINITE).

 1: kd> !process 0 0 meason_test.exe
PROCESS ab870bc0  SessionId: 0  Cid: 01dc    Peb: 00458000  ParentCid: 0af8
    DirBase: 7ffe04c0  ObjectTable: b0533040  HandleCount:  28.
    Image: meason_test.exe

Query all threads for a specific process

This example references the PROCESS address found in the previous example, in order to restrict output to one specific process (rather than all processes that match the string specified).

 1: kd> !process ab870bc0 2
PROCESS ab870bc0  SessionId: 0  Cid: 01dc    Peb: 00458000  ParentCid: 0af8
    DirBase: 7ffe04c0  ObjectTable: b0533040  HandleCount:  28.
    Image: meason_test.exe
        THREAD ac5435c0  Cid 01dc.0fe8  Teb: 00459000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Non-Alertable
            ffffffff  NotificationEvent
        THREAD ae63ba00  Cid 01dc.095c  Teb: 0045a000 Win32Thread: 00000000 WAIT: (WrQueue) UserMode Alertable
            ae6147c0  QueueObject
        THREAD 9fcac040  Cid 01dc.07cc  Teb: 0045b000 Win32Thread: 00000000 WAIT: (WrQueue) UserMode Alertable
            ae6147c0  QueueObject
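
For a process with many threads, it can help to tally what the threads are waiting on.  The sketch below does that for a saved copy of the thread listing.  It assumes each THREAD line carries "Cid <pid>.<tid>" in hex and an optional "WAIT: (<reason>)" field, as in the output above; the function names are hypothetical.

```python
import re
from collections import Counter

THREAD_RE = re.compile(r"THREAD\s+([0-9a-fA-F`]+)\s+Cid\s+([0-9a-fA-F]+)\.([0-9a-fA-F]+)")
WAIT_RE = re.compile(r"WAIT:\s+\((\w+)\)")

SAMPLE = """\
        THREAD ac5435c0  Cid 01dc.0fe8  Teb: 00459000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Non-Alertable
            ffffffff  NotificationEvent
        THREAD ae63ba00  Cid 01dc.095c  Teb: 0045a000 Win32Thread: 00000000 WAIT: (WrQueue) UserMode Alertable
            ae6147c0  QueueObject
        THREAD 9fcac040  Cid 01dc.07cc  Teb: 0045b000 Win32Thread: 00000000 WAIT: (WrQueue) UserMode Alertable
            ae6147c0  QueueObject
"""


def parse_threads(output):
    """Return one dict per THREAD line: address, decimal pid/tid, wait reason."""
    threads = []
    for line in output.splitlines():
        match = THREAD_RE.search(line)
        if not match:
            continue  # wait-object lines and anything else are skipped
        wait = WAIT_RE.search(line)
        threads.append({
            "address": match.group(1),
            "pid": int(match.group(2), 16),
            "tid": int(match.group(3), 16),
            "wait_reason": wait.group(1) if wait else None,
        })
    return threads


def wait_reason_counts(threads):
    """Count how many threads share each wait reason."""
    return Counter(thread["wait_reason"] for thread in threads)


if __name__ == "__main__":
    threads = parse_threads(SAMPLE)
    for reason, count in wait_reason_counts(threads).items():
        print(reason, count)
```

For the test app, the single DelayExecution thread is the one sitting in Sleep(INFINITE), which matches the call stack shown in the next example.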

Query call stacks for all threads

This command lets you see the kernel mode portion of the call stack for each thread.  Note, I had to obscure the nt and ntdll symbol names and source code paths, as I am not sure what the private symbol server exposes relative to the public symbol server.

 1: kd> !process ab870bc0 17
PROCESS ab870bc0  SessionId: 0  Cid: 01dc    Peb: 00458000  ParentCid: 0af8
    DirBase: 7ffe04c0  ObjectTable: b0533040  HandleCount:  28.
    Image: meason_test.exe
    VadRoot ab89f730 Vads  Clone 0 Private . Modified . Locked .
    DeviceMap 82807ba0
    Token                             ad57c438
    ElapsedTime                       00:00:04.735
    UserTime                          00:00:00.000
    KernelTime                        00:00:00.000
    QuotaPoolUsage[PagedPool]         16472
    QuotaPoolUsage[NonPagedPool]      1472
    Working Set Sizes (now,min,max)  (, , ) (KB, KB, KB)
    PeakWorkingSetSize                
    VirtualSize                        Mb
    PeakVirtualSize                    Mb
    PageFaultCount                    
    MemoryPriority                    BACKGROUND
    BasePriority                      
    CommitCharge                     
        THREAD ac5435c0  Cid 01dc.0fe8  Teb: 00459000 Win32Thread: 00000000 WAIT: (DelayExecution) UserMode Non-Alertable
            ffffffff  NotificationEvent
        Not impersonating
        DeviceMap                 82807ba0
        Owning Process                   Image:         meason_test.exe
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      1386973        Ticks:  (0:00:00:04.015)
        Context Switch Count      39             IdealProcessor: 3             
        UserTime                  00:00:00.000
        KernelTime                00:00:00.000
        Win32 Start Address meason_test!mainCRTStartup (0x00ad5bc0)
        Stack Init  Current  Base  Limit  Call 
        Priority  BasePriority  PriorityDecrement  IoPriority 2 PagePriority 5

        ChildEBP RetAddr  Args to Child              
        a9b7cc04 818be3a5 00000000 ac543678 ac5435c0 nt!(omitted)+0x19 (FPO: [Uses EBP] [1,0,4]) [(omitted)]
        a9b7cc78 818bde89 ac5435c0 a9b7cd1c 81af9101 nt!(omitted)+0x195 (FPO: [Non-Fpo]) (CONV: fastcall) [(omitted)]
        a9b7cccc 818b495d 00000002 7531db3b 80000032 nt!(omitted)+0x159 (FPO: [Non-Fpo]) (CONV: stdcall) [(omitted)]
        a9b7ccf8 81af9189 ffffff00 00000000 00000000 nt!(omitted)+0xad (FPO: [Non-Fpo]) (CONV: stdcall) [(omitted)]
        a9b7cd44 819b2097 00000000 003af8b8 003af8dc nt!(omitted)+0x89 (FPO: [Non-Fpo]) (CONV: stdcall) [(omitted)]
        a9b7cd44 77302090 00000000 003af8b8 003af8dc nt!(omitted) (FPO: [0,3] TrapFrame @ a9b7cd54) [(omitted)]
        003af870 77300bea 76fae258 00000000 003af8b8 ntdll!(omitted) (FPO: [0,0,0]) [(omitted)]
        003af874 76fae258 00000000 003af8b8 0ebd6750 ntdll!(omitted) +0xa (FPO: [2,0,0]) [(omitted)]
        003af8dc 76fae1af ffffffff 00000000 003af8f8 KERNELBASE!SleepEx+0x98 (FPO: [SEH]) (CONV: stdcall) [(omitted)]
        003af8ec 00ad2c38 ffffffff 003af93c 00ad5aff KERNELBASE!Sleep+0xf (FPO: [Non-Fpo]) (CONV: stdcall) [(omitted)]
        003af8f8 00ad5aff 00000001 009e1ed8 009e1230 meason_test!main+0x38 (FPO: [Non-Fpo]) (CONV: cdecl) [(omitted)]
        003af93c 77291154 00458000 744ffd4f 00000000 meason_test!__mainCRTStartup+0x107 (FPO: [Non-Fpo]) (CONV: cdecl) [(omitted)]
        003af984 77291114 ffffffff 7731448a 00000000 ntdll(omitted)+0x3a (FPO: [SEH]) (CONV: stdcall) [(omitted)]
        003af994 00000000 00ad5bc0 00458000 00000000 ntdll(omitted)+0x1b (FPO: [Non-Fpo]) (CONV: stdcall) [(omitted)]

(truncated for brevity)
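
The last argument to !process (0, 2, and 17 in the examples above) is a bit mask, and the debugger reads it as hex by default, so "17" means 0x17.  The sketch below decodes such a value.  The bit descriptions are paraphrased from memory of the !process documentation and should be treated as assumptions; check your debugger docs for the authoritative wording.  The function name is hypothetical.

```python
# Assumed meanings of the !process Flags bits (paraphrased, verify in the docs).
PROCESS_FLAG_BITS = {
    0x01: "time and priority statistics",
    0x02: "threads and events for the process, with wait states",
    0x04: "thread list (with stack traces when combined with 0x2)",
    0x08: "return address and stack pointer per frame, arguments suppressed",
    0x10: "process context switched to this process for the command",
}


def decode_process_flags(flags_text):
    """Split a hex flags string such as '17' into the bits that are set."""
    value = int(flags_text, 16)  # the debugger's default radix is 16
    return [(bit, desc) for bit, desc in sorted(PROCESS_FLAG_BITS.items()) if value & bit]


if __name__ == "__main__":
    for flags in ("0", "2", "17"):
        print(f"!process <addr> {flags}")
        for bit, desc in decode_process_flags(flags):
            print(f"    0x{bit:02x}: {desc}")
```

Under these assumptions, 0x17 sets every bit except 0x8, which would explain why the output above includes statistics, per-thread stacks with arguments, and the user-mode frames of the target process.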


Summary

In this tutorial, we didn't get very deep into the bowels of the OS.  However, we cracked open the door and took a peek.  In the next tutorial, we will get a peek at what the kernel debugger looks like when it first breaks in.