Debugging kernel objects using windbg & kd

One of the most common issues I deal with are related to blocking. Some of this are pretty straight forward to identify, Others not that easy, especially if you have a process with many threads. When I have to troubleshoot blocking on WaitHandles (e.g Mutex) I usually find the need to use both user mode and kernel mode debugging to identify the guilty thread.

For demonstration purposes I’ve built a simple sample where a process appears hung.

The first thing to do is to attach the debugger to the process. My sample was built with CLR 4.0 so no more mscorwks (now it´s called clr.dll). Let’s start by loading the SOS extension.

0:007> .loadby sos clr

After looking at the threads stack with !clrstack we can see a thread blocked on WaitOne.

(managed stack)

0:007> !clrstack

OS Thread Id: 0x1b18 (7)

Child SP IP Call Site

04fbf4f4 76fc00fd [HelperMethodFrame_1OBJ: 04fbf4f4] System.Threading.WaitHandle.WaitOneNative(System.Runtime.InteropServices.SafeHandle, UInt32, Boolean, Boolean)

04fbf59c 6a4b6eff System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)

04fbf5b8 6a4b6ebd System.Threading.WaitHandle.WaitOne(Int32, Boolean)

04fbf5d0 6a4b6e80 System.Threading.WaitHandle.WaitOne()

04fbf5d8 002006c6 BlockingQueueWithManualResetEvents`1[[System.__Canon, mscorlib]].Enqueue(System.__Canon)

04fbf610 00200666 ThreadsManaged.Program+<>c__DisplayClass4.<Main>b__3()

04fbf61c 6a4d0bea System.Threading.ThreadHelper.ThreadStart_Context(System.Object)

04fbf628 6a500b6c System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

04fbf64c 6a500aac System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)

04fbf668 6a4d0b74 System.Threading.ThreadHelper.ThreadStart()

04fbf88c 6b1d1f02 [GCFrame: 04fbf88c]

04fbfb50 6b1d1f02 [DebuggerU2MCatchHandlerFrame: 04fbfb50]

(native stack)

0:007> knL30

# ChildEBP RetAddr

00 0508ef74 75c90962 ntdll!ZwWaitForMultipleObjects+0x15

01 0508f010 7607162d KERNELBASE!WaitForMultipleObjectsEx+0x100

02 0508f058 665cd0b6 KERNEL32!WaitForMultipleObjectsExImplementation+0xe0

03 0508f0bc 665cce8f clr!WaitForMultipleObjectsEx_SO_TOLERANT+0x56

04 0508f0dc 665ccd11 clr!Thread::DoAppropriateAptStateWait+0x4d

05 0508f170 665ccdd2 clr!Thread::DoAppropriateWaitWorker+0x17d

06 0508f1dc 6661a384 clr!Thread::DoAppropriateWait+0x60

07 0508f2dc 658f6eff clr!WaitHandleNative::CorWaitOneNative+0x196

My next step is to look at objects on this thread stack. We can see on top a Mutex sync object. Let’s dig into that one!!

0:007> !dso

OS Thread Id: 0x1b18 (7)

ESP/REG Object Name

04FBF53C 0230c0bc Microsoft.Win32.SafeHandles.SafeFileHandle

04FBF574 0230bc8c Microsoft.Win32.SafeHandles.SafeWaitHandle

04FBF59C 0230bc08 System.Threading.Mutex

04FBF5BC 0230bf0c System.Threading.ThreadStart

(…)

Below are the details on this Mutex object. Here we are interested on the waitHandle

0:007> !do 0230bc08

Name: System.Threading.Mutex

MethodTable: 6a5643c4

EEClass: 6a337fb0

Size: 24(0x18) bytes

File: C:\Windows\Microsoft.Net\assembly\GAC_32\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll

Fields:

      MT Field Offset Type VT Attr Value Name

6a58bcb8 40001cd 4 System.Object 0 instance 00000000 __identity

6a5642b4 4000690 c System.IntPtr 1 instance 1b8 waitHandle

6a5657c0 4000691 8 ...es.SafeWaitHandle 0 instance 0230bc8c safeWaitHandle

6a59168c 4000692 10 System.Boolean 1 instance 1 hasThreadAffinity

6a5642b4 4000693 b74 System.IntPtr 1 shared static InvalidHandle

    >> Domain:Value 008056b8:ffffffff <<

6a59168c 40006dd b80 System.Boolean 1 shared static dummyBool

    >> Domain:Value 008056b8:1 <<

6a5643c4 40006de 180 ...m.Threading.Mutex 0 shared static s_ReservedMutex

    >> Domain:Value 008056b8:00000000 <<

!handle is the command to see details on an existing handle. For more information on !handle please check the debugging tools documentation.

0:007> !handle 1b8 f

Handle 1b8

  Type Mutant

  Attributes 0

  GrantedAccess 0x1f0001:

         Delete,ReadControl,WriteDac,WriteOwner,Synch

         QueryState

  HandleCount 2

  PointerCount 4

  Name <none>

  Object Specific Information

    Mutex is Owned

We can see clearly that this mutex is owned by someone but it doesn´t state who is the owner.

Since we are dealing with kernel objects, some information is kept on internal kernel structures. In order to access this information we can use the kernel debugger KD.EXE (it comes with the same package as Windbg.exe). As a side note please be aware that when using this what you are seeing is a snapshot and therefore it might sometimes not be enough to troubleshoot the issue.

Start by opening a command prompt on the debugger directory and issue kd /kL (kL is the option for local kernel debugging). Here you might face a “little” issue because if you haven´t enabled local kernel debugging you will have to do it now and that requires a reboot (causing you to lose the opportunity to dig further on the issue in your application and having to wait for it to occur again)

C:\Debuggers\Debugging Tools for Windows (x64)>kd /kL

Microsoft (R) Windows Debugger Version 6.12.0001.577 AMD64

Copyright (c) Microsoft Corporation. All rights reserved.

Local kernel debugging is disabled by default since Windows Vista, you must run "bcdedit -debug on" and reboot to enable it.

Debuggee initialization failed, HRESULT 0x80004001

    "Not implemented"

C:\Debuggers\Debugging Tools for Windows (x64)>bcdedit -debug on

The operation completed successfully.

o After enabling debug I can start the debugger on kernel mode (it´s a bit ugly since it´s in the command prompt window, no pretty GUI. Also ensure that your user mode session is running and not stopped on the debugger). My first step would be to configure symbols if they aren´t already configured (for more information on configuring symbols please check www.microsoft.com/whdc/devtools/debugging/debugstart.mspx ).

Additionaly I always like to set !sym noisy so that I can see everything the debugger is doing when loading symbols.

lkd> .sympath SRV*c:\symbols*https://msdl.microsoft.com/download/symbols

Symbol search path is: SRV*c:\symbols*https://msdl.microsoft.com/download/symbols

Expanded Symbol search path is: srv*c:\symbols*https://msdl.microsoft.com/download/symbols

lkd> !sym noisy

noisy mode - symbol prompts on

Our first step in kernel mode is to find our process (in my case the process is called ThreadsManaged.exe) . To do that issue the command below (in yellow). As a result we are interested the process address (in green)

lkd> !process 0 0 ThreadsManaged.exe

PROCESS fffffa8009a3c7e0

    SessionId: 1 Cid: 006c Peb: 7efdf000 ParentCid: 073c

    DirBase: 15849000 ObjectTable: fffff8a012fa3410 HandleCount: 138.

    Image: ThreadsManaged.exe

Now with the process address let’s see a bit detail about threads on this process. Our next step will require some cross information between our user mode session and kernel session.

lkd> !process fffffa8009a3c7e0 2

PROCESS fffffa8009a3c7e0

    SessionId: 1 Cid: 006c Peb: 7efdf000 ParentCid: 073c

    DirBase: 15849000 ObjectTable: fffff8a012fa3410 HandleCount: 140.

    Image: ThreadsManaged.exe

        THREAD fffffa8009162b60 Cid 006c.1974 Teb: 000000007efdb000 Win32Thread: fffff900c2d54990 WAIT: (WrLpcReply) UserMode Non-Alertable

            fffffa8009162f20 Semaphore Limit 0x1

        THREAD fffffa800b7c3b60 Cid 006c.1b20 Teb: 000000007efd8000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable

            fffffa8009980c10 SynchronizationEvent

            fffffa8009a50210 SynchronizationEvent

            fffffa80077dfce0 SynchronizationEvent

        THREAD fffffa800b874b60 Cid 006c.04f4 Teb: 000000007efd5000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Non-Alertable

            fffffa8006d6cb20 NotificationEvent

            fffffa800d6afb90 SynchronizationEvent

            fffffa800737c9a0 SynchronizationEvent

        THREAD fffffa8007d83760 Cid 006c.1b3c Teb: 000000007efad000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Alertable

            fffffa800b780060 SynchronizationTimer

            fffffa8008d5b060 SynchronizationTimer

            fffffa800d6eec20 SynchronizationTimer

        THREAD fffffa800dd75060 Cid 006c.1b34 Teb: 000000007efaa000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Alertable

            fffffa8007779fe0 NotificationEvent

        THREAD fffffa8007066b60 Cid 006c.0d10 Teb: 000000007efa7000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Alertable

            fffffa8007779fe0 NotificationEvent

        THREAD fffffa800d51a060 Cid 006c.1894 Teb: 000000007efa4000 Win32Thread: 0000000000000000 WAIT: (DelayExecution) UserMode Alertable

            fffffa800d51a338 Semaphore Limit 0x2

        THREAD fffffa800dd68060 Cid 006c.1b18 Teb: 000000007efa1000 Win32Thread: 0000000000000000 WAIT: (UserRequest) UserMode Alertable

            fffffa800a1057d0 Mutant - owning thread fffffa800d51a060

        THREAD fffffa8007932b60 Cid 006c.0e68 Teb: 000000007ef9e000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Alertable

            fffffa800a112aa0 QueueObject

        THREAD fffffa8009a1e060 Cid 006c.1180 Teb: 000000007ef9b000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Alertable

            fffffa800d6f4060 QueueObject

        THREAD fffffa8007ab2060 Cid 006c.02e4 Teb: 000000007ef98000 Win32Thread: 0000000000000000 WAIT: (WrQueue) UserMode Alertable

            fffffa800d6f4060 QueueObject

So looking above we see that thread with cid (<pid>.<tid>) 006c.1b18 (in yellow) shows a Mutant (a.k.a Mutex) and the owner of this object is thread at address fffffa800d51a060 (in green). This thread as a Cid 006c.1894.

Back to our user mode session, break again in the debugger and list existing threads. Now just search for those cid in the output (below)

0:007> ~

   0 Id: 006c. 1974 Suspend: 1 Teb: 7efdb000 Unfrozen

   1 Id: 006c. 1b20 Suspend: 1 Teb: 7efd8000 Unfrozen

   2 Id: 006c. 04f4 Suspend: 1 Teb: 7efd5000 Unfrozen

   3 Id: 006c. 1b3c Suspend: 1 Teb: 7efad000 Unfrozen

   4 Id: 006c. 1b34 Suspend: 1 Teb: 7efaa000 Unfrozen

   5 Id: 006c. 0d10 Suspend: 1 Teb: 7efa7000 Unfrozen

   6 Id: 006c. 1894 Suspend: 1 Teb: 7efa4000 Unfrozen

. 7 Id: 006c. 1b18 Suspend: 1 Teb: 7efa1000 Unfrozen

(…)

So remember from the beginning that our thread waiting was thread 7. Cross the information you got from the kernel session and you see that the owner thread on this mutex is thread 6. Below is the managed stack from thread 6 (the one responsible for the block).

OS Thread Id: 0x1894 (6)

Child SP IP Call Site

0511f638 76fbfd31 [HelperMethodFrame: 0511f638] System.Threading.Thread.SleepInternal(Int32)

0511f684 6a5341b9 System.Threading.Thread.Sleep(Int32)

0511f688 002006e3 BlockingQueueWithManualResetEvents`1[[System.__Canon, mscorlib]].Enqueue(System.__Canon)

0511f6c0 00200616 ThreadsManaged.Program+<>c__DisplayClass4.<Main>b__2()

0511f6cc 6a4d0bea System.Threading.ThreadHelper.ThreadStart_Context(System.Object)

0511f6d8 6a500b6c System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

0511f6fc 6a500aac System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)

0511f718 6a4d0b74 System.Threading.ThreadHelper.ThreadStart()

0511f93c 6b1d1f02 [GCFrame: 0511f93c]

0511fc00 6b1d1f02 [DebuggerU2MCatchHandlerFrame: 0511fc00]

It´s not often the we have the need to look both in kernel and user mode and the case above was a simple example where probably a user mode debugging session along with some code review would be enough to find the thread owning the kernel object. But sometimes this is not the case and it comes in hand to know what more options and tools we have to help us.

Have fun!!

Bruno