Intro to kernel debugging 2

Topic: Debugger Context

This is part 2 of the intro to kernel debugging series.

In this post, we will explore the following:

  • What the debugger is looking at when it first breaks in
  • How to get the current call stack
  • How to get the current process
  • How to get the current processor
  • How to get the current IRQL

Reminders about how this tutorial is authored:

  • The author is a user mode developer, and tailors the conversation to that audience
  • Always make sure you have symbols loaded!
  • We are using a kd connection to a VM, as explained in a previous tutorial

 

Context of debugger break-in

When your debugger has broken into the target, you may notice that the target operating system has become completely unresponsive.  That's what kernel debugging is: You're debugging the operating system itself, and it needs to be frozen in order to inspect it.

With your operating system frozen, what you are looking at in the debugger is the state of the system at the moment it froze.  That includes all processors, all processes, all threads in all processes, and even the operating system's own threads.  The debugger starts out looking at the processor, process, and thread where the break-in occurred (its context).  Some potential debugger break-in points include:

  • Kernel breakpoint
    • Debugger breakpoint for kernel mode code
  • User breakpoint
    • Debugger breakpoint for user mode code
  • Bugcheck
    • The infamous bluescreen / BSOD
  • DbgBreakPoint()
    • Code-defined breakpoint
  • Clock interrupt check for break-in request
    • The user pressed the break keystroke in the debugger (Ctrl+Break in windbg, Ctrl+C in kd), and the OS clock interrupt checks for the pending request

 

Generally, there are up to 4 items that developers immediately want to know about to establish context once the debugger has broken in:

  1. Call stack
  2. Current process
  3. Current processor
  4. Current IRQL

In order to establish context, it seems all developers are interested in #1: What's the call stack?  However, there is a divergence with the others.  User mode developers tend to care more about #2, whereas kernel developers tend to care more about #3 and #4.  We will cover each.

 


Get call stack

If you've been using {windbg, cdb, kd} for a while, you should already know how to do this.  If not, take a look at the documentation for it.  The debugger docs are well-maintained, and are replicated to MSDN.

The most basic call stack is available with the 'k' command.  Note that the output here has been modified to obscure some data, as I am not sure what is exposed to the public versus what is exposed to Microsoft employees:

 1: kd> k
# ChildEBP RetAddr
00 9f1e69f8 819fa3a6 nt!(omitted) [(omitted) @ 48]
01 9f1e6a1c 8191a4cd nt!(omitted)+0x74a96 [(omitted) @ 244]
02 9f1e6a40 8191916b nt!(omitted)+0x3d [(omitted) @ 672]
03 9f1e6a8c 81825e70 nt!(omitted)+0x10b [(omitted) @ 1331]
04 9f1e6a90 81837857 hal!(omitted)+0x6 [(omitted) @ 328]
05 9f1e6a90 8183941b hal!(omitted)+0x1f7 [(omitted) @ 228]
06 9f1e6b94 81a4866d hal!(omitted)+0x7 [(omitted) @ 87]
07 9f1e6ba4 8191be83 nt!(omitted)+0x31 [(omitted) @ 671]
08 9f1e6cc8 8191b836 nt!(omitted)+0x513 [(omitted) @ 3856]
09 9f1e6d44 819b69a1 nt!(omitted)+0x306 [(omitted) @ 1046]
0a 9f1e6d48 00000000 nt!KiIdleLoop+0xd [(omitted) @ 1451]

Many people use exotic versions of the 'k' command.  Read the docs and come up with your own.  The author prefers 'kpnL': 'p' shows the parameter names and values for each function, 'n' shows frame numbers, and 'L' hides the source file and line information.

 


Get current process

When kd breaks in, you will always be in the context of some process: the one running on the CPU that handled the break-in (even threads executing in system space belong to a process).  For user mode processes, much of the CPU state refers to virtual addresses that are only meaningful within that process.  Hence, it is vital to set the debugger context to the desired process so that virtual addresses are translated correctly.

To query the current process context that the debugger is using, execute !process -1 0:

 2: kd> !process -1 0
PROCESS a027cb80  SessionId: 0  Cid: 02b0    Peb: 02f7b000  ParentCid: 0210
    DirBase: 7ffe0520  ObjectTable: af387300  HandleCount: 694.
    Image: dwm.exe

If you then use the 'k' command to enumerate the call stack, you'll be seeing the stack for a thread in this process (dwm.exe in the example shown here).  Address translation and symbol lookups will be taken care of appropriately by the debugger.
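
If you save debugger output to a log, the brief !process record shown above is easy to pick apart with a script.  The following is a minimal Python sketch, assuming the record has exactly the layout shown in this post; the function name parse_process_summary is hypothetical and not part of any debugger tooling.

```python
import re

# Sample text copied from the "!process -1 0" output shown in this post.
SAMPLE = """\
PROCESS a027cb80  SessionId: 0  Cid: 02b0    Peb: 02f7b000  ParentCid: 0210
    DirBase: 7ffe0520  ObjectTable: af387300  HandleCount: 694.
    Image: dwm.exe
"""


def parse_process_summary(text):
    """Return a dict of the fields in one brief !process record.

    Hypothetical helper: it assumes the layout shown above and nothing more.
    """
    head = re.search(r"^PROCESS\s+([0-9a-fA-F]+)", text, re.MULTILINE)
    if head is None:
        raise ValueError("no PROCESS line found")
    fields = {"PROCESS": head.group(1)}
    # Every other field is printed as "Name: value".
    for name, value in re.findall(r"(\w+):\s+(\S+)", text):
        if name == "HandleCount":
            value = value.rstrip(".")  # the count is followed by a period
        fields[name] = value
    return fields


if __name__ == "__main__":
    info = parse_process_summary(SAMPLE)
    print(info["Image"], info["PROCESS"], info["Cid"])
```

The PROCESS value is the address you would later hand to commands that take a process argument, so it is the field most worth extracting.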

While !process lets you view the state of any process, it does not change the debugger's process context.  The one exception is the 0x10 bit in the display flags: with flags 0x17, for example, !process temporarily switches context to the target process while it produces its output.  To set a new process context that persists, use the .process command.  User mode developers will want to add the /p and /r options: /p makes the debugger translate addresses through the new process's page tables, and /r reloads the user mode symbols for that process.

 3: kd> !process -1 0
PROCESS a027cb80  SessionId: 0  Cid: 02b0    Peb: 02f7b000  ParentCid: 0210
    DirBase: 7ffe0520  ObjectTable: af387300  HandleCount: 694.
    Image: dwm.exe
3: kd> .process
Implicit process is now a027cb80
3: kd> !process 0 0 meason_test.exe
PROCESS acac1040  SessionId: 0  Cid: 0bd8    Peb: 0290d000  ParentCid: 0f40
    DirBase: 7ffe05a0  ObjectTable: b2248580  HandleCount:  28.
    Image: meason_test.exe
3: kd> .process /p /r acac1040
Implicit process is now acac1040
.cache forcedecodeuser done
Loading User Symbols
....

Note that this sequence of commands utilized !process to perform a process search.
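
The display flags that !process accepts form a bitmask, which is why a value such as 0x17 includes the 0x10 context-switching bit.  Here is a small Python sketch that breaks a flags value into its bits.  Only the meaning of 0x10 comes from this post; the other descriptions are the author's paraphrase of the !process documentation and should be checked against it, and the function names are hypothetical.

```python
# Bit meanings for the !process flags argument.  Only 0x10 is described in
# this post; the rest are paraphrased from the debugger documentation and
# are an assumption to verify there.
PROCESS_FLAG_BITS = {
    0x01: "time and priority statistics",
    0x02: "threads and events, with wait states",
    0x04: "list of threads",
    0x08: "return address and stack pointer for each function",
    0x10: "set process context for the duration of the command",
}


def describe_process_flags(flags):
    """List the descriptions of every known bit set in flags."""
    return [text for bit, text in PROCESS_FLAG_BITS.items() if flags & bit]


def sets_process_context(flags):
    """True if !process would temporarily switch process context."""
    return bool(flags & 0x10)


if __name__ == "__main__":
    for line in describe_process_flags(0x17):
        print(line)
    print(sets_process_context(0x17), sets_process_context(0x07))
```

Decoding 0x17 this way shows it is 0x10 plus the three low bits, so dropping to 0x7 gives the same level of detail without the temporary context switch.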

What can you do now that the process context is set to a particular process?  You can probe all of its threads and get a user mode call stack for each, with address translation done for you by the debugger.  If you attempt to output a user mode call stack while the process context is set to a different process, the best you can hope for is a stack with no symbols!

 


Get current processor, IRQL

The kd break-in must occur on one processor in your multiprocessor system, and that processor was running at a particular IRQL when the break occurred.  Sometimes, you may need to reference these values.

The current processor is shown directly in your kd prompt.  For example, the following is within the context of CPU #2:

 2: kd>

In {cdb, kd}, the processor number appears in the prompt at the start of each command line.  In {windbg}, it is shown in the graphical interface next to the box where you enter commands.
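
When reading a saved kd log, the prompt is often the quickest way to tell which processor a command ran on.  A minimal Python sketch follows, assuming the prompt format shown in this post; processor_from_prompt is a hypothetical helper name.

```python
import re

# Matches a multiprocessor kd prompt such as "2: kd>" at the start of a line.
PROMPT = re.compile(r"^\s*(\d+): kd>")


def processor_from_prompt(line):
    """Return the processor number from a kd prompt line, or None."""
    match = PROMPT.match(line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(processor_from_prompt(" 2: kd>"))
    print(processor_from_prompt("3: kd> .process /p /r acac1040"))
```

Lines that do not start with a numbered prompt, such as command output, return None, so the helper can be run over every line of a log.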

At the time of the break-in, the CPU executing the break-in code was running at a particular IRQL.  You can view this IRQL via the !irql command, which also displays the current CPU:

 2: kd> !irql
Debugger saved IRQL for processor 0x2 -- 28 (CLOCK2_LEVEL)

From the example above, I can see the CPU was running at a very high IRQL: the clock interrupt level.  This generally means the debugger user manually requested a break-in.  You will see more interesting values at breakpoints or bugchecks.

You can switch the debugger context to another processor with the processor switch command, ~Ns, where N is the processor number:

 2: kd> !irql
Debugger saved IRQL for processor 0x2 -- 28 (CLOCK2_LEVEL)
2: kd> ~0s
0: kd> !irql
Debugger saved IRQL for processor 0x0 -- 0 (LOW_LEVEL)
0: kd> ~1s
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
1: kd> ~3s
3: kd> !irql
Debugger saved IRQL for processor 0x3 -- 0 (LOW_LEVEL)
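
A session like the one above can be summarized per processor with a few lines of script.  The following Python sketch assumes the !irql output format shown in this post; the transcript is copied from above, and saved_irqls is a hypothetical helper name.

```python
import re

# Transcript copied from the session shown in this post.
TRANSCRIPT = """\
2: kd> !irql
Debugger saved IRQL for processor 0x2 -- 28 (CLOCK2_LEVEL)
2: kd> ~0s
0: kd> !irql
Debugger saved IRQL for processor 0x0 -- 0 (LOW_LEVEL)
0: kd> ~1s
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
1: kd> ~3s
3: kd> !irql
Debugger saved IRQL for processor 0x3 -- 0 (LOW_LEVEL)
"""

IRQL_LINE = re.compile(
    r"Debugger saved IRQL for processor (0x[0-9a-fA-F]+) -- (\d+) \((\w+)\)"
)


def saved_irqls(text):
    """Map processor number -> (IRQL value, IRQL name) for each !irql line."""
    result = {}
    for cpu, level, name in IRQL_LINE.findall(text):
        # The processor is printed in hex, the IRQL value in decimal.
        result[int(cpu, 16)] = (int(level), name)
    return result


if __name__ == "__main__":
    for cpu, (level, name) in sorted(saved_irqls(TRANSCRIPT).items()):
        print(cpu, level, name)
```

In this transcript only processor 2, the one that handled the break-in, is above LOW_LEVEL.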

 


Summary

In this tutorial, we saw what the kernel debugger is looking at when it first breaks in.  We also moved around a little to view various artifacts of the system.  In the next tutorial, we'll get more comfortable with playing around with kernel state.