Hung Window?, No Source?, No Problem!! Part 2


Written by Jeff Dailey


 


Hello, my name is Jeff. I’m an escalation engineer on the Microsoft CPR (Critical Problem Resolution) platforms team. This blog entry is part 2 of my Hung Window?, No Source?, No Problem!! series. In this lab we will debug a problem involving a multithreaded application and synchronization objects, look at the kinds of things that can go wrong, and see how to track them down. This process and training lab comes right out of our CPR training curriculum. To do the lab I have prepared for you, you will need to have downloaded dumphungwindow.exe and badwindow.exe from my earlier blog post. You will also need to install the Debugging Tools for Windows.


 


Debugging tools:


 http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx


Previous blog http://blogs.msdn.com/ntdebugging/archive/2007/05/29/detecting-and-automatically-dumping-hung-gui-based-windows-applications.aspx


 


After you have both of these installed we can get started.  We are going to debug and figure out why the window stops repainting and does not respond.


 


Step 1 start badwindow.exe


Step 2 run dumphungwindow.exe


Step 3 select Hang \ Hang Type 2 from  the BadWindow.exe menu.


You should see dumphungwindow detect that your window is not processing messages; as a result, it will dump the badwindow.exe process.


 


************ OUTPUT *************


C:\source\dumphungwindow\release>dumphungwindow.exe
Dumps will be saved in C:\Users\jeffda\AppData\Local\Temp\
scanning for hung windows




Hung Window found dumping process (12912) badwindow.exe
Dumping unresponsive process
C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_14_2007_Time7_34_5_Pid12912_badwindow.exe.dmp


Dump complete


 


Hung Window found dumping process (12912) badwindow.exe


Dumping unresponsive process
C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_14_2007_Time7_34_24_Pid12912_badwindow.exe.dmp


Dump complete


************ OUTPUT *************


 


Step 4 create a local symbol directory at C:\websymbols


Step 5 set your symbol path under file \ symbols in windbg to SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols


See http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx for details.



Step 6 start windbg, select file\open crash dump, and select the first dump file.


Your initial output should look like this.


 


Microsoft (R) Windows Debugger  Version 6.7.0001.0


Copyright (c) Microsoft Corporation. All rights reserved.


 


***** WARNING: Your debugger is probably out-of-date.


*****          Check http://dbg for updates.


 


Loading Dump File [C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_12_2007_Time9_53_34_Pid7924_badwindow.exe.dmp]


User Mini Dump File with Full Memory: Only application data is available


 


Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;srv


Executable search path is:


Windows Vista Version 6000 MP (2 procs) Free x86 compatible


Product: WinNt, suite: SingleUserTS


Debug session time: Tue Jun 12 09:53:35.000 2007 (GMT-4)


System Uptime: 11 days 18:41:43.089


Process Uptime: 0 days 0:00:32.000


………………………………


Loading unloaded module list


.


eax=00000000 ebx=00000002 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=777faec5 esp=0017faf4 ebp=0017fb8c iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll!ZwWaitForMultipleObjects+0x15:
777faec5 c21400          ret     14h


0:000> !reload


 


Step 7 locate the debugger prompt at the bottom of windbg (it has 0:000> next to it) and type:
~* k


 


 


You will most likely see output similar to this.


.  0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen


ChildEBP RetAddr 


0017faf0 76e4edb5 ntdll!ZwWaitForMultipleObjects+0x15


0017fb8c 76e430c3 kernel32!WaitForMultipleObjectsEx+0x11d


0017fba8 00401502 kernel32!WaitForMultipleObjects+0x18


0017fbc8 0040139b badwindow!hangtype2+0x42 [c:\source\badwindow\badwindow\badwindow.cpp @ 340]


0017fc24 772a87af badwindow!WndProc+0x17b [c:\source\badwindow\badwindow\badwindow.cpp @ 274]


0017fc50 772a8936 user32!InternalCallWinProc+0x23


0017fcc8 772a8a7d user32!UserCallWinProcCheckWow+0x109


0017fd2c 772a8ad0 user32!DispatchMessageWorker+0x380


0017fd3c 004010fb user32!DispatchMessageW+0xf


0017ff0c 00401817 badwindow!wWinMain+0xfb [c:\source\badwindow\badwindow\badwindow.cpp @ 124]


0017ffa0 76eb19f1 badwindow!__tmainCRTStartup+0x150 [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]


0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe


0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23


 


   1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen


ChildEBP RetAddr 


026ffebc 777ecfad ntdll!ZwWaitForSingleObject+0x15


026fff20 777ecf78 ntdll!RtlpWaitOnCriticalSection+0x154


026fff48 0040153c ntdll!RtlEnterCriticalSection+0x152


026fff64 757c2848 badwindow!hangtype2threada+0x2c [c:\source\badwindow\badwindow\badwindow.cpp @ 358]


026fff9c 757c28c8 msvcr80!_endthread+0x4b


026fffa0 76eb19f1 msvcr80!_endthread+0xcb


026fffac 7782d109 kernel32!BaseThreadInitThunk+0xe


026fffec 00000000 ntdll!_RtlUserThreadStart+0x23

Note: the number before Id: is the debugger’s thread number. To its right are the process ID and thread ID in hex, written as PROCESS.THREAD (3270.2b10 for thread 0), followed by the thread state (Suspend: 0 Teb: 7efdd000 Unfrozen).
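
If you want to match these IDs against tools that print decimal, the field is easy to split. Here is a minimal sketch in C; parse_pid_tid is a hypothetical helper name of mine, not something from the debugger or from badwindow.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: split WinDbg's "PID.TID" field (both values are hex)
   into two numbers. Returns 1 on success, 0 if the text does not match. */
int parse_pid_tid(const char *field, unsigned *pid, unsigned *tid)
{
    assert(field != NULL && pid != NULL && tid != NULL);
    return sscanf(field, "%x.%x", pid, tid) == 2;
}
```

Feeding it 3270.2b10 gives a process ID of 12912 decimal, which is the same PID dumphungwindow printed when it dumped badwindow.exe.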


 


Each of these listings is the call stack of one thread. The most recent call is at the TOP of the stack, and the stack grows as each call is made. Looking at thread 0, you will see that our WndProc appears to be blocked in a call to hangtype2, and hangtype2 is calling WaitForMultipleObjects. Let’s look more closely at WaitForMultipleObjects.


 


Docs for WaitForMultipleObjects


http://msdn2.microsoft.com/en-us/library/ms687025.aspx


 


DWORD WINAPI WaitForMultipleObjects( DWORD nCount, const HANDLE* lpHandles, BOOL bWaitAll, DWORD dwMilliseconds );


 


Let’s look at the parameters passed to WaitForMultipleObjects.


 


0:000> kv


ChildEBP RetAddr  Args to Child             


0017faf0 76e4edb5 00000002 0017fb40 00000000 ntdll!ZwWaitForMultipleObjects+0x15 (FPO: [5,0,0])


0017fb8c 76e430c3 0017fb40 0017fbc4 00000001 kernel32!WaitForMultipleObjectsEx+0x11d (FPO: [Non-Fpo])


0017fba8 00401502 00000002 0017fbc4 00000001 kernel32!WaitForMultipleObjects+0x18 (FPO: [Non-Fpo])


0017fbc8 0040139b 00401220 0017fbfc 00401220 badwindow!hangtype2+0x42 (FPO: [0,2,0]) (CONV: cdecl) [c:\source\badwindow\badwindow\badwindow.cpp @ 340]


0017fc24 772a87af 00063d36 00000111 00008004 badwindow!WndProc+0x17b (CONV: stdcall) [c:\source\badwindow\badwindow\badwindow.cpp @ 274]


0017fc50 772a8936 00401220 00063d36 00000111 user32!InternalCallWinProc+0x23


0017fcc8 772a8a7d 00000000 00401220 00063d36 user32!UserCallWinProcCheckWow+0x109 (FPO: [Non-Fpo])


0017fd2c 772a8ad0 00401220 00000000 0017ff0c user32!DispatchMessageWorker+0x380 (FPO: [Non-Fpo])


0017fd3c 004010fb 0017fd54 00403938 00000001 user32!DispatchMessageW+0xf (FPO: [Non-Fpo])


0017ff0c 00401817 00400000 00000000 00280f8c badwindow!wWinMain+0xfb (CONV: stdcall) [c:\source\badwindow\badwindow\badwindow.cpp @ 124]


0017ffa0 76eb19f1 7efde000 0017ffec 7782d109 badwindow!__tmainCRTStartup+0x150 (FPO: [Non-Fpo]) (CONV: cdecl) [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]


0017ffac 7782d109 7efde000 0017fb9e 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])


0017ffec 00000000 00401987 7efde000 00000000 ntdll!_RtlUserThreadStart+0x23 (FPO: [Non-Fpo])


 


The first parameter is 00000002; this is the number of objects we are waiting on.


The second parameter (0017fbc4) is the address of the array of object handles. Let’s dump it out and take a look at the objects.


 


0:000> dd 0017fbc4


0017fbc4  000000c4 000000c8 0040139b 00401220


0017fbd4  0017fbfc 00401220 00063d36 0017fc48


0017fbe4  772a8989 772a894d 53ca28e7 00000000


0017fbf4  00063d36 00401220 00000000 00000000


0017fc04  00000000 0017fca0 00000001 00000000


0017fc14  ffffffff 772a88e5 53c4f4b4 75c12459


0017fc24  0017fc50 772a87af 00063d36 00000111


0017fc34  00008004 00000000 00401220 dcbaabcd


 


0:000> !handle 000000c4


Handle 000000c4


  Type            Thread


0:000> !handle 000000c8


Handle 000000c8


  Type            <Error retrieving type>


 


Looking at the second handle, it appears that the information needed to retrieve the handle type is not in the dump. A handle is an index into the process’s handle table in the kernel, and it’s possible that not all of the handle information was included when the dump was created. (Another possibility, since these handles came from _beginthread, is that the thread has already exited and the CRT closed its handle.) Either way, that’s OK. We have a simple way to work around this and see what happened.


 


We can use the uf (unassemble function) command from part 1 of this blog on badwindow.exe. All we need to do is uf the return address of WaitForMultipleObjects. Let’s run through the assembly and see what we are waiting on.


 


0:000> uf 00401502


badwindow!hangtype2 [c:\source\badwindow\badwindow\badwindow.cpp @ 334]:


 


Reserving space on the stack by decrementing ESP (The stack pointer, remember the stack grows down in memory)


  334 004014c0 83ec08          sub     esp,8


 


Save the state of ESI so it can be restored later.


  334 004014c3 56              push    esi


 


Get the pointer to _beginthread from the import table and store it in ESI.
Docs on _beginthread: http://msdn2.microsoft.com/en-us/library/kdzttdcb(VS.80).aspx Its arguments are 1 (start address), 2 (stack size), 3 (arglist).


  337 004014c4 8b3580204000    mov     esi,dword ptr [badwindow!_imp___beginthread (00402080)]


 


 


Push the last arg to _beginthread on the stack. This is the arglist for _beginthread; in this case it is 0 because we are passing no args.


  337 004014ca 6a00            push    0



This is the stack size argument. Note that in the debugger you can do a ? 2EE0h and it will show the value in hex and decimal; this value is 12000 decimal.


  337 004014cc 68e02e0000      push    2EE0h
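
If you are reading a listing away from the debugger, the same conversion is easy to do yourself. This is a small sketch in C; masm_hex is a hypothetical helper name, and it only relies on strtoul stopping at the trailing h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: convert a hex literal as the disassembler prints it
   ("2EE0h", "0FAh") to a number. strtoul parses the hex digits and stops
   at the trailing 'h'. */
unsigned long masm_hex(const char *text)
{
    assert(text != NULL);
    return strtoul(text, NULL, 16);
}
```

The other constants in this lab convert the same way: 0FAh is 250, 14h is 20, and 18h is 24.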


 


This is the start address for our thread function, in this case hangtype2threada


  337 004014d1 6810154000      push    offset badwindow!hangtype2threada (00401510)


 


Here we call _beginthread, which starts the thread. The return value is a thread handle.


  337 004014d6 ffd6            call    esi


 


Push the last arg for the second _beginthread call on the stack. This is the arglist again; in this case it is 0 because we are passing no args.


  338 004014d8 6a00            push    0


 


This is the stack size arg for _beginthread


  338 004014da 68e02e0000      push    2EE0h


 


This is the start address for our thread function, in this case hangtype2threadb


  338 004014df 6870154000      push    offset badwindow!hangtype2threadb (00401570)


 


We are now storing EAX (the return value from the first _beginthread call) into ESP+1Ch (on our stack)


  338 004014e4 8944241c        mov     dword ptr [esp+1Ch],eax


 


Here we call _beginthread, which starts the second thread. The return value is a thread handle.


  338 004014e8 ffd6            call    esi


 


Any time we add to ESP we are shrinking or cleaning up the stack. Here 18h (24) bytes covers the six 4-byte arguments pushed for the two _beginthread calls.


  338 004014ea 83c418          add     esp,18h


 


We are pushing our wait time for WaitForMultipleObjects in this case 0FFFFFFFFh (-1) Wait forever.


  340 004014ed 6aff            push    0FFFFFFFFh


 


Storing EAX on the stack, this is the thread handle from our last _beginthread call.


  340 004014ef 8944240c        mov     dword ptr [esp+0Ch],eax


 


This is our wait logic (bWaitAll). In this case it is 1 (wait for all), so we will only unblock once all handles are signaled, that is, once both threads have finished running.


  340 004014f3 6a01            push    1


 


Here we are loading the pointer of the stack location that contains our handles that we will wait on into EAX.


  340 004014f5 8d44240c        lea     eax,[esp+0Ch]


 


Now we push the pointer to our handles / objects onto the stack.


  340 004014f9 50              push    eax


 


And this is the count of objects, 2 in this case both of them threads.


  340 004014fa 6a02            push    2


 


Now we call WaitForMultipleObjects to wait for hangtype2threada and hangtype2threadb to finish executing.


  340 004014fc ff1510204000    call    dword ptr [badwindow!_imp__WaitForMultipleObjects (00402010)]


 


Restore our ESI register. This is the return address, so it will execute when WaitForMultipleObjects returns.


  340 00401502 5e              pop     esi 


 


Add 8 to our stack pointer, releasing the space we reserved for our locals.


  342 00401503 83c408          add     esp,8


 


Return, we are done.


  342 00401506 c3              ret
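
If the ESP+1Ch and ESP+0Ch offsets above look arbitrary, it helps to track ESP by hand. Below is a rough sketch in C that replays the pushes and adds from the listing and records which stack slot each instruction touches. The names StackSlots, trace_hangtype2, and to_address are mine, purely for illustration. It assumes every push is 4 bytes and that _beginthread, being cdecl, leaves its arguments on the stack for the caller to clean up.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets are relative to ESP on entry to hangtype2, where the return
   address sits. Negative values are the function's own stack space. */
typedef struct {
    int handles0;   /* slot written by  mov [esp+1Ch],eax */
    int handles1;   /* slot written by  mov [esp+0Ch],eax */
    int array_ptr;  /* address taken by lea eax,[esp+0Ch] */
} StackSlots;

StackSlots trace_hangtype2(void)
{
    StackSlots s;
    int esp = 0;

    esp -= 8;                    /* sub esp,8 : room for two handles      */
    esp -= 4;                    /* push esi                              */
    esp -= 3 * 4;                /* three args for the first _beginthread */
    esp -= 3 * 4;                /* three args for the second one         */
    s.handles0 = esp + 0x1C;     /* first handle is stored here           */
    esp += 0x18;                 /* add esp,18h : drop all six args       */
    esp -= 4;                    /* push 0FFFFFFFFh                       */
    s.handles1 = esp + 0x0C;     /* second handle is stored here          */
    esp -= 4;                    /* push 1                                */
    s.array_ptr = esp + 0x0C;    /* pointer passed as lpHandles           */
    return s;
}

/* Turn a relative offset into an address, given ESP at function entry. */
uint32_t to_address(uint32_t entry_esp, int offset)
{
    assert(offset <= 0);
    return (uint32_t)((int64_t)entry_esp + offset);
}
```

In the dd output earlier, the return address 0040139b sits at 0017fbcc, so that is ESP on entry. The sketch puts the two handles at entry-8 and entry-4, which is 0017fbc4 and 0017fbc8, the same array address that was passed to WaitForMultipleObjects.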


 


Here is the source.


 


void hangtype2(void)


{


      HANDLE handles[2];


 


      handles[0] = (HANDLE)_beginthread(hangtype2threada, 12000, NULL);


      handles[1] = (HANDLE)_beginthread(hangtype2threadb, 12000, NULL);


     


      WaitForMultipleObjects(2,handles,1,INFINITE);


 


}


 


 


So what went wrong?  Let’s look at our threads again. 


 


We have our main message pump thread (thread 0) waiting on two threads. One, running badwindow!hangtype2threada, is still present; the other, hangtype2threadb, is gone or has completed.


 


0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen


ChildEBP RetAddr 


0017faf0 76e4edb5 ntdll!ZwWaitForMultipleObjects+0x15


0017fb8c 76e430c3 kernel32!WaitForMultipleObjectsEx+0x11d


0017fba8 00401502 kernel32!WaitForMultipleObjects+0x18


0017fbc8 0040139b badwindow!hangtype2+0x42 [c:\source\badwindow\badwindow\badwindow.cpp @ 340]


0017fc24 772a87af badwindow!WndProc+0x17b [c:\source\badwindow\badwindow\badwindow.cpp @ 274]


0017fc50 772a8936 user32!InternalCallWinProc+0x23


0017fcc8 772a8a7d user32!UserCallWinProcCheckWow+0x109


0017fd2c 772a8ad0 user32!DispatchMessageWorker+0x380


0017fd3c 004010fb user32!DispatchMessageW+0xf


0017ff0c 00401817 badwindow!wWinMain+0xfb [c:\source\badwindow\badwindow\badwindow.cpp @ 124]


0017ffa0 76eb19f1 badwindow!__tmainCRTStartup+0x150 [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]


0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe


0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23


 


Looking at hangtype2threada it would seem that it is blocked on RtlEnterCriticalSection.


 


   1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen


ChildEBP RetAddr 


026ffebc 777ecfad ntdll!ZwWaitForSingleObject+0x15


026fff20 777ecf78 ntdll!RtlpWaitOnCriticalSection+0x154


026fff48 0040153c ntdll!RtlEnterCriticalSection+0x152


026fff64 757c2848 badwindow!hangtype2threada+0x2c [c:\source\badwindow\badwindow\badwindow.cpp @ 358]


026fff9c 757c28c8 msvcr80!_endthread+0x4b


026fffa0 76eb19f1 msvcr80!_endthread+0xcb


026fffac 7782d109 kernel32!BaseThreadInitThunk+0xe


026fffec 00000000 ntdll!_RtlUserThreadStart+0x23


 


Let’s look and see what is happening with this critical section call.


 


First, let’s set our thread context to thread number 1.


 


0:000> ~1s


eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00403780 edi=00000000


eip=777fa69d esp=026ffec0 ebp=026fff20 iopl=0         nv up ei pl nz na po nc


cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202


ntdll!ZwWaitForSingleObject+0x15:


777fa69d c20c00          ret     0Ch


 


Let’s get our call stack and get the first and only arg to EnterCriticalSection (00403780, the first arg shown on the RtlEnterCriticalSection frame).


 


0:001> kv


ChildEBP RetAddr  Args to Child              


026ffebc 777ecfad 000000cc 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])


026fff20 777ecf78 00000000 00000000 76e61d5a ntdll!RtlpWaitOnCriticalSection+0x154 (FPO: [Non-Fpo])


026fff48 0040153c 00403780 00000000 00000000 ntdll!RtlEnterCriticalSection+0x152 (FPO: [Non-Fpo])


026fff64 757c2848 00000000 51b22bb2 00000000 badwindow!hangtype2threada+0x2c (FPO: [Uses EBP] [1,1,0]) (CONV: cdecl) [c:\source\badwindow\badwindow\badwindow.cpp @ 358]


026fff9c 757c28c8 76eb19f1 02274358 026fffec msvcr80!_endthread+0x4b (FPO: [Non-Fpo])


026fffa0 76eb19f1 02274358 026fffec 7782d109 msvcr80!_endthread+0xcb (FPO: [Non-Fpo])


026fffac 7782d109 02274358 026ffb9e 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])


026fffec 00000000 757c286e 02274358 00000000 ntdll!_RtlUserThreadStart+0x23 (FPO: [Non-Fpo])


 


We can do a couple of things at this point. First, let’s look at the CS (critical section).


 


0:001> !cs 00403780


—————————————–


Critical section   = 0x00403780 (badwindow!csCritSec1+0x0)


DebugInfo          = 0x0029bd40


LOCKED             < It’s LOCKED.


LockCount          = 0x1


WaiterWoken        = No


OwningThread       = 0x00002048  < This is the owning thread.


RecursionCount     = 0x14


LockSemaphore      = 0xCC


SpinCount          = 0x00000000


 


Are there any other locked critical sections? !locks will tell us, and no, this is the only one.


 


0:001> !locks


 


CritSec badwindow!csCritSec1+0 at 00403780


WaiterWoken        No


LockCount          1


RecursionCount     20


OwningThread       2048


EntryCount         0


ContentionCount    1


*** Locked


 


Scanned 156 critical sections


 


Which threads are running in our process, and what is thread 2048 doing?


 


0:001> ~


#  0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen


.  1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen


 


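OK, here is our problem. The owning thread, 2048, is not in the thread list, so it has already exited while still holding the critical section. Apparently both threads, hangtype2threada and hangtype2threadb, were using this same critical section; however, something happened to hangtype2threadb. We need to figure out what happened, so let’s go take a look at that function.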
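
The check we just did by eye is simple enough to write down. This is only a sketch of the reasoning in C; lock_is_orphaned is a hypothetical name, and the thread IDs you pass in are whatever ~ and !cs report for your own dump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if owner_tid is not among the live thread
   IDs, meaning the lock's owner is gone and nobody is left to release it. */
int lock_is_orphaned(unsigned owner_tid, const unsigned *live_tids, size_t count)
{
    assert(live_tids != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (live_tids[i] == owner_tid)
            return 0;
    }
    return 1;
}
```

With the values from this dump, owner 0x2048 against live threads 0x2b10 and 0x2cd0, the answer is 1: the critical section is orphaned.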


 


Looking back at where we unassembled badwindow!hangtype2, we got the address of hangtype2threadb (00401570). Let’s verify it with ln (list nearest symbols); we are lucky enough to have symbols in this case.


 


0:001> ln 00401570


c:\source\badwindow\badwindow\badwindow.cpp(370)


(00401570)   badwindow!hangtype2threadb   |  (004015e0)   badwindow!hangtype3thread


Exact matches:


    badwindow!hangtype2threadb (void *)


 


Looks like we have an exact match. Now let’s unassemble it and see what went wrong.


 


 


0:001> uf 00401570


badwindow!hangtype2threadb [c:\source\badwindow\badwindow\badwindow.cpp @ 370]:


 


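Push ECX. This looks like the compiler reserving a 4-byte stack slot rather than saving a value; the slot is used below as the local at ESP+10h.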


  370 00401570 51              push    ecx


 


Save EBX


  370 00401571 53              push    ebx


 


Move the pointer to sprintf into EBX


  371 00401572 8b1d7c204000    mov     ebx,dword ptr [badwindow!_imp__sprintf (0040207c)]


 


Save EBP


  371 00401578 55              push    ebp


 


Move the pointer to OutputDebugStringA into EBP


  371 00401579 8b2d14204000    mov     ebp,dword ptr [badwindow!_imp__OutputDebugStringA (00402014)]


 


Save ESI


  371 0040157f 56              push    esi


 


Move the pointer to EnterCriticalSection into ESI


  371 00401580 8b351c204000    mov     esi,dword ptr [badwindow!_imp__EnterCriticalSection (0040201c)]


 


Save EDI


  371 00401586 57              push    edi


 


Move the pointer for Sleep into EDI


  371 00401587 8b3d0c204000    mov     edi,dword ptr [badwindow!_imp__Sleep (0040200c)]


 


Store 14h (20 decimal) at ESP+10h (a local on the stack). Maybe this is a counter?


  371 0040158d c744241014000000 mov     dword ptr [esp+10h],14h


 


Push the address of the critical section csCritSec1 00403780 onto the stack.


  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)


 


Call EnterCriticalSection


  374 0040159a ffd6            call    esi


 


Push 0xFA, 250Dec on the stack


  376 0040159c 68fa000000      push    0FAh


Call Sleep (Wait for 250ms)


  376 004015a1 ffd7            call    edi


Push a pointer on the stack. What does it point to? When in doubt, dump it out.


0:001> db 004021e4


004021e4  57 65 20 61 72 65 20 69-6e 20 68 61 6e 67 74 79  We are in hangty


004021f4  70 65 32 74 68 72 65 61-64 62 00 00 48 00 00 00  pe2threadb..H…


00402204  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  …………….


00402214  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  …………….


  377 004015a3 68e4214000      push    offset badwindow!`string’ (004021e4)


 


 


Push another pointer on the stack. What is it?


0:001> db 00403380


00403380  57 65 20 61 72 65 20 69-6e 20 68 61 6e 67 74 79  We are in hangty


00403390  70 65 32 74 68 72 65 61-64 62 00 00 00 00 00 00  pe2threadb……


004033a0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  …………….


004033b0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  …………….


  377 004015a8 6880334000      push    offset badwindow!szTrace (00403380)


 


Call sprintf


  377 004015ad ffd3            call    ebx


 


Clean up the stack


  377 004015af 83c408          add     esp,8


 


This is our buffer that we just did the sprintf to


  378 004015b2 6880334000      push    offset badwindow!szTrace (00403380)



Call OutputDebugStringA


  378 004015b7 ffd5            call    ebp


 


Push csCritSec2’s address on the stack


  380 004015b9 6868374000      push    offset badwindow!csCritSec2 (00403768)


 


Call LeaveCriticalSection for csCritSec2


  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]


 


Decrement a counter on the stack this is a local counting down to zero..


  382 004015c4 836c241001      sub     dword ptr [esp+10h],1


 


Check the local counter that is counting down to zero. If we are not at ZERO yet, jump to the top of the loop.


  382 004015c9 75ca            jne     badwindow!hangtype2threadb+0x25 (00401595)


 


 


Restore all the registers and then return


  382 004015cb 5f              pop     edi


  382 004015cc 5e              pop     esi


  382 004015cd 5d              pop     ebp


  382 004015ce 5b              pop     ebx


  387 004015cf 59              pop     ecx


  387 004015d0 c3              ret


 


 


Did you see the BUG? Look closely. If you need it, here is the source.


void __cdecl hangtype2threadb(void *)


{


      int i=0;


      while(1)


      {


            EnterCriticalSection(&csCritSec1);


           


            Sleep(250);


            sprintf(szTrace, "We are in hangtype2threadb");


            OutputDebugStringA(szTrace);


           


            LeaveCriticalSection(&csCritSec2);


            i++;


            if(i==20)


            {


                  break;


            }


      }


}


 


We are entering one critical section and leaving another. Then, once our counter reaches zero, we drop out of the function and the thread terminates, leaving csCritSec1 entered but never left. The fix for this seems rather simple: we just need to leave csCritSec1 instead of csCritSec2. That should fix it. But if we don’t have the source, how can we verify that?
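
To see why !cs reported RecursionCount = 0x14 for a thread that no longer exists, here is a toy model in portable C. It is not the real RTL_CRITICAL_SECTION and it makes no Win32 calls: SimCs, sim_enter, sim_leave, and run_thread_b are hypothetical names, the model is single threaded, and it assumes that leaving a section you do not own simply does nothing (the Windows documentation treats that case as an error, so do not read the model as a description of real behavior).

```c
#include <assert.h>

/* Toy model of two of the fields !cs reports. */
typedef struct {
    unsigned owner;      /* 0 means unowned */
    int      recursion;  /* how many times the owner has entered */
} SimCs;

void sim_enter(SimCs *cs, unsigned tid)
{
    /* Single-threaded model: the section is free or already ours. */
    assert(cs->owner == 0 || cs->owner == tid);
    cs->owner = tid;
    cs->recursion++;
}

void sim_leave(SimCs *cs, unsigned tid)
{
    /* Assumption: leaving a section we do not own is modelled as a no-op. */
    if (cs->owner != tid)
        return;
    if (--cs->recursion == 0)
        cs->owner = 0;
}

/* Mirrors the compiled loop: a counter starts at 14h and counts down.
   'entered' and 'left' are the sections passed to enter and leave. */
void run_thread_b(SimCs *entered, SimCs *left, unsigned tid)
{
    for (int remaining = 0x14; remaining != 0; --remaining) {
        sim_enter(entered, tid);
        sim_leave(left, tid);
    }
}
```

Running it with two different sections, as the buggy code does, leaves the first one owned with a recursion count of 0x14, which matches the !cs output. Passing the same section for both arguments, which is the proposed fix, leaves it unowned with a count of 0.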


SIMPLE! We just modify the machine code in the debugger!   Often if we think we know how to fix something we will edit the code bytes to make the machine code do the right thing and let it run. 


 


To do this, from the command line in your debugger’s directory run windbg.exe C:\source\badwindow\release\badwindow.exe, assuming you have your badwindow sample in the same directory; I do. When the debugger fires up, make sure you have your symbol path set: .sympath SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols


 


Our bad function call was


 


Push csCritSec2’s address on the stack  << WRONG CRITICALSECTION


  380 004015b9 6868374000      push    offset badwindow!csCritSec2 (00403768)


 


Call LeaveCriticalSection for csCritSec2


  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]


 


Push the address of the critical section csCritSec1 (00403780) onto the stack.   << CORRECT CRITICALSECTION


  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)


 


Call entercriticalsection


  374 0040159a ffd6            call    esi


 


Remember, all we need to do is change which critical section is pushed on the stack for the LeaveCriticalSection call.


 


004015b9 6868374000  (BAD)   
00401595 6880374000  (GOOD)
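
Why do these two instructions differ in only one byte? A push of a 32-bit immediate is the opcode byte 68h followed by the operand with its low byte first. The sketch below, in C, rebuilds both encodings and finds the first byte that differs; encode_push_imm32 and first_difference are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode x86 "push imm32": opcode 68h, then the operand, low byte first. */
void encode_push_imm32(uint32_t imm, uint8_t out[5])
{
    out[0] = 0x68;
    for (int i = 0; i < 4; ++i)
        out[1 + i] = (uint8_t)(imm >> (8 * i));
}

/* Index of the first byte that differs, or -1 if the buffers match. */
int first_difference(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < len; ++i) {
        if (a[i] != b[i])
            return (int)i;
    }
    return -1;
}
```

Encoding 00403768 and 00403780 gives 68 68 37 40 00 and 68 80 37 40 00. The only difference is at index 1, so the single byte to change lives at 004015b9 + 1 = 004015ba.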


Now we just do an edit bytes (eb) on the bad instruction.


0:001> eb 004015b9


004015b9 68 68  << ENTER 68


68


004015ba 68 80  << We don’t want 68 enter 80


80


004015bb 37  << Now just hit enter to finish editing memory.


 


Here is the fixed code.


 


0:001> uf 00401570


badwindow!hangtype2threadb [c:\source\badwindow\badwindow\badwindow.cpp @ 370]:
  370 00401570 51              push    ecx
  370 00401571 53              push    ebx
  371 00401572 8b1d7c204000    mov     ebx,dword ptr [badwindow!_imp__sprintf (0040207c)]
  371 00401578 55              push    ebp
  371 00401579 8b2d14204000    mov     ebp,dword ptr [badwindow!_imp__OutputDebugStringA (00402014)]
  371 0040157f 56              push    esi
  371 00401580 8b351c204000    mov     esi,dword ptr [badwindow!_imp__EnterCriticalSection (0040201c)]
  371 00401586 57              push    edi
  371 00401587 8b3d0c204000    mov     edi,dword ptr [badwindow!_imp__Sleep (0040200c)]
  371 0040158d c744241014000000 mov     dword ptr [esp+10h],14h


ENTERING CORRECT CRITICAL SECTION csCritSec1


  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)
  374 0040159a ffd6            call    esi
  376 0040159c 68fa000000      push    0FAh
  376 004015a1 ffd7            call    edi
  377 004015a3 68e4214000      push    offset badwindow!`string’ (004021e4)
  377 004015a8 6880334000      push    offset badwindow!szTrace (00403380)
  377 004015ad ffd3            call    ebx
  377 004015af 83c408          add     esp,8
  378 004015b2 6880334000      push    offset badwindow!szTrace (00403380)
  378 004015b7 ffd5            call    ebp


LEAVING CORRECT CRITICAL SECTION csCritSec1


  380 004015b9 6880374000      push    offset badwindow!csCritSec1 (00403780) 
  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]


  382 004015c4 836c241001      sub     dword ptr [esp+10h],1
  382 004015c9 75ca            jne     badwindow!hangtype2threadb+0x25 (00401595)
  382 004015cb 5f              pop     edi
  382 004015cc 5e              pop     esi
  382 004015cd 5d              pop     ebp
  382 004015ce 5b              pop     ebx
  387 004015cf 59              pop     ecx
  387 004015d0 c3              ret


 


Then just run the code (type g and press Enter in the debugger). That’s it, it will work!


Once you have proven this, you can go to the developer of the application and recommend they change their code. Remember to provide your debug notes.


 


I hope you found this helpful and I welcome your feedback.


Thank you,  Jeff-


 


 

Comments (4)

  1. !analyze -v says:

    This document is a translation of the http://blogs.msdn.com/ntdebugging blog, and the original material may change without notice. This material carries no legal warranty. To give your feedback

  2. We’d like the thank everyone who attended the Windows NT Debugging Blog Live Chat two weeks ago. Here


  4. jz says:

    Isn't 'RecursionCount     20' a clue?

    [That just means the thread entered the critical section 20 times, it doesn't really indicate that the thread unlocked the wrong critsec.  There are legitimate reasons to enter the same critical section multiple times.]