Tracking Handle Misuse Using Application Verifier and Windbg

In a recent MSDN forum post Connection Pooling randomly throwing COM exceptions I uncovered an issue where some code inside the process was closing handles owned by our pooling code.  This caused our pooling code to get into a bad state triggering a SemaphoreFullException every single time a caller attempted to open a connection from the pool.

The root cause of this problem was code calling the Win32 API CloseHandle on a handle that it did not own.   Since Windows can aggresively reuse handle values, it’s possible to hit a situation like below:

Thread1: CreateFile –> Get handle #1.

Thread1: CloseHandle(#1)

Thread2: CreateFile –> Get handle #1, store in member variable.

Thread1: CloseHandle (#1) <- Mistake in code.

Thread1: CreateEvent –> Get handle #1.

Now my code on Thread2 thinks it has a file handle, but it has an event handle!   Thread2 did not make any coding mistake, the Thread1 code corrupted Thread2’s handle value.

This is an inherent issue when using handles, they are inherently dangerous you have to be very careful to only call CloseHandle 1 time.  Now the operating system could mitigate this by handling out random values for handles, but in general the operating system does not always do this for performance reasons.  The tradeoff is fast performance for all applications that behave correctly.

Now if you have a process where you suspect handles are getting misused (leaked or double closed), you can use Application Verifier to help track this down very quickly, this is what I want to talk about.

First download both Microsoft Application Verifier and Debugging Tools for Windows, both are free downloads from Microsoft.  I would provide links but these always change, get the latest versions of both.  Get the version that matches your operating system bitness as well for best results.

Install Microsoft Application Verifier and Debugging Tools for Windows on the machine where you have the problem.  Both are very clean installers and remove easily so don’t worry about installing on production machine, they are easy to remove and don’t require any reboot during removal.   If you are really worried you can install them on a separate machine and xcopy the files to the production machine, this works as well.

Next, figure out what process you want to debug.  If you are running IIS application with .NET, most likely this is the w3wp.exe process.  You don’t need the process ID yet.

First run Application Verifier to setup handle tracking, it looks like below:

image

Not too fancy.  :)  You want to create the entry for the process name next, select File | Add Application.   Let’s do the demonstration using notepad.exe for now.   Just type in notepad.exe and click Open.  You do not have to find and select the correct binary, just typing in the process name is sufficient, it does not need the full path:

image

Now that you’ve added the process, go to the right and select ONLY handles under the Basics options, like below:

image

The reason I only select handles here is to reduce the performance implications of all the other tracking inside the process.  I just want to track handles, this is all.

Once you’ve selected handles only, click Save, then close Application Verifier.  You will see this message:

image

This is fine, we will do this work later on (attach debugger).  Click OK and exit App Verifier.

Now, VERY IMPORTANT!  Re-open App Verifier.  Verify you see your process name correctly and you see only Handles is checked.  Do this step as a sanity check to ensure you selected the right thing AND it was saved correctly to registry.

Exit App Verifier again.

If your process is currently running, stop and restart the process.   If for example you are using IIS, you want to run iisreset at this point to restart IIS.  This step is required as the process reads the Application Verifier flags ONLY during startup.

In our case we are debugging notepad, so lets startup an instance of notepad.exe at this point (Start|Run –> Notepad.exe)

Now open Windbg.exe (from Debugging Tools For Windows install).  Select File | Attach To Process…

VERY IMPORTANT! First thing to do is check the Non-invasive check box on the Attach to Process dialog, as below:

image

You should ALWAYS check this when doing handle tracking work because this allows you to attach to the process in a special mode called non-invasive mode.  This allows the debugger to be later detached and the process will continue normally.  If you don’t check this checkbox, the debugger will fully attach to the process and you cannot shut down the debugger without terminating the process.

Next, locate your process, you will see here my process is notepad.exe, select this process:

image

Click OK to attach the debugger non-invasively.   Once you are attached, go to the command window (select View|Command) and run the following commands:

0:000> !htrace -enable
Handle tracing enabled.
Handle tracing information snapshot successfully taken.

This command is required to enable the tracing of handles.  Once this is set, you can exit the debugger by running the q (quit) command:

0:000:> q

At this point handle tracking is enabled for the process.  For this experiment, now go to notepad and open a single text file but do not exit notepad.exe yet.

Now we want to see what handle tracking can do for us.  Startup Windbg.exe again and attach to notepad again in non-invasive mode.  Run the following commands:

0:000> .dump /ma c:\temp\notepad.dmp
Creating c:\temp\notepad.exe - mini user dump
Dump successfully written

Note the command is .dump /ma <filename> where <filename> can be anything you want here.  I called my dump file c:\temp\notepad.dmp.

This creates a dump of the process.  Now again run q (quit) command to exit debugger.

Re-open Windbg again, we are next going to examine the dump file.

In Windbg select File | Open Crash Dump… and select c:\temp\notepad.dmp and click Open.

In the command window, run the following 2 commands to load symbols:

0:000> .sympath srv* https://msdl.microsoft.com/download/symbols
Symbol search path is: srv*https://msdl.microsoft.com/download/symbols
Expanded Symbol search path is: srv*https://msdl.microsoft.com/download/symbols
0:000> .reload /f
..........................................*** ERROR: Symbol file could not be found.  Defaulted to export symbols for AvCheck.dll -
.....
Loading unloaded module list

Note you may see errors here, this is fine for now.

Now we want to see a trace of our handle activity, run the !htrace command with no arguments:

0:000> !htrace

Next you will see tons of debugger spew like below:

--------------------------------------
Handle = 0x00000000000002f0 - OPEN
Thread ID = 0x0000000000001390, Process ID = 0x0000000000000b6c

0x0000000077835e3a: ntdll!ZwDuplicateObject+0x000000000000000a
0x00000000776148a8: kernel32!DuplicateHandle+0x0000000000000055
0x000007fefe6cb435: rpcrt4!THREAD::THREAD+0x00000000000000b5
0x000007fefe6cb35c: rpcrt4!ThreadSelfHelper+0x0000000000000028
0x000007fefe77444c: rpcrt4!NdrpClientCall2+0x0000000000000698
0x000007fefe774ddd: rpcrt4!NdrClientCall2+0x000000000000001d
0x000007fefe4c0e9a: ole32!BulkUpdateOIDs+0x000000000000008a
0x000007fefe4c0d89: ole32!CRpcResolver::BulkUpdateOIDs+0x0000000000000109
0x000007fefe4c0c36: ole32!CROIDTable::ClientBulkUpdateOIDWithPingServer+0x0000000000000224
0x000007fefe4c94c3: ole32!CROIDTable::WorkerThreadLoop+0x0000000000000057
0x000007fefe4c9406: ole32!CRpcThread::WorkerLoop+0x000000000000001e
0x000007fefe4c9682: ole32!CRpcThreadCache::RpcWorkerThreadEntry+0x000000000000001a
0x000000007761466d: kernel32!BaseThreadInitThunk+0x000000000000000d
0x0000000077818791: ntdll!RtlUserThreadStart+0x000000000000001d
--------------------------------------
Handle = 0x000000000000031c – OPEN

This shows you all the handle activity in the process.

You can locate a double close by searching for the CLOSE lines:

Handle = 0x000000000000033c - CLOSE

This will also have the stack trace of the code that closed the handle.

Now in my case what I did was gather a dump when some error case occurred in my process, then I examined the dump to determine what handle my code was using and then used !htrace to find out who closed it.  All of the information is kept in the crash dump nicely for you.

You can also pass a parameter to !htrace to dump activity for a single handle, like below:

0:000> !htrace 33c
--------------------------------------
Handle = 0x000000000000033c - CLOSE
Thread ID = 0x0000000000000744, Process ID = 0x0000000000000b6c

0x0000000077835faa: ntdll!ZwTerminateThread+0x000000000000000a
0x000000007780b508: ntdll!RtlExitUserThread+0x0000000000000048
0x000000007780b8a7: ntdll!TppWorkerThread+0x0000000000000a23
0x000000007761466d: kernel32!BaseThreadInitThunk+0x000000000000000d
0x0000000077818791: ntdll!RtlUserThreadStart+0x000000000000001d
--------------------------------------
Handle = 0x000000000000033c - OPEN
Thread ID = 0x0000000000000744, Process ID = 0x0000000000000b6c

0x000000007773d57a: user32!ZwUserCallOneParam+0x000000000000000a
0x000000007773d95f: user32!RealMsgWaitForMultipleObjectsEx+0x00000000000000b3
0x000000007773da2e: user32!MsgWaitForMultipleObjectsEx+0x0000000000000046
0x0000000077732af4: user32!MsgWaitForMultipleObjects+0x0000000000000020
0x000007fef5fbb198: browseui!CShellTaskScheduler::_TT_MsgWaitForMultipleObjects+0x0000000000000038
0x000007fef5fbb11d: browseui!CShellTaskScheduler::TT_TransitionThreadToRunningOrTerminating+0x00000000000000a1
0x000007fef5fbaea1: browseui!CShellTaskThread::ThreadProc+0x0000000000000076
--------------------------------------

You will notice a specific handle is closed and reopened many times over a period of time, this is normal.  What you want to look for is cases where CLOSE is called 2 times in a row to find cases of handle misuse.  Likewise if a handle is leaked you will not see a final CLOSE called, the last trace will be an OPEN.

Now, when you are finished debugging, BE ABSOLUTELY SURE to start App Verifier again and DELETE the process entry for your process!  You want to always do this since the App Verifier settings are persisted in the registry, they will survive even a server reboot.  Right click on the Image Name and select “Delete Application”.  Then click Save and close and re-open App Verifier to ensure that the entry is gone.  As well if you are debugging some service you want to restart the service to stop the tracking, run iisreset for example again to reset IIS to normal state if you are debugging IIS.

Hope this helps you debug the next handle misuse case you encounter, have fun!