Recently we have seen quite a few issues where the problem turned out to be an orphaned critical section. When you have a dump with this problem you can run the following commands to see this:
0:000> !locks CritSec +2b1bbac at 02b1bbac WaiterWoken No LockCount 1 RecursionCount 1 OwningThread 11dc EntryCount 0 ContentionCount 4 *** Locked Scanned 58 critical sections 0:000> ~ . 0 Id: 4f8.f30 Suspend: 1 Teb: 7ffdf000 Unfrozen 1 Id: 4f8.2d4 Suspend: 1 Teb: 7ffdd000 Unfrozen 2 Id: 4f8.d14 Suspend: 1 Teb: 7ffd4000 Unfrozen 3 Id: 4f8.13d0 Suspend: 1 Teb: 7ffac000 Unfrozen 4 Id: 4f8.1744 Suspend: 1 Teb: 7ffab000 Unfrozen 5 Id: 4f8.1700 Suspend: 1 Teb: 7ffaa000 Unfrozen 6 Id: 4f8.304 Suspend: 1 Teb: 7ffa9000 Unfrozen 7 Id: 4f8.3d8 Suspend: 1 Teb: 7ffa8000 Unfrozen 8 Id: 4f8.1288 Suspend: 1 Teb: 7ffa7000 Unfrozen 9 Id: 4f8.10dc Suspend: 1 Teb: 7ffa6000 Unfrozen 10 Id: 4f8.1c90 Suspend: 1 Teb: 7ffa5000 Unfrozen 11 Id: 4f8.f78 Suspend: 1 Teb: 7ffa3000 Unfrozen 12 Id: 4f8.1688 Suspend: 1 Teb: 7ffa2000 Unfrozen 13 Id: 4f8.1888 Suspend: 1 Teb: 7ffa1000 Unfrozen 14 Id: 4f8.d94 Suspend: 1 Teb: 7ff5d000 Unfrozen 15 Id: 4f8.8a8 Suspend: 1 Teb: 7ff59000 Unfrozen 16 Id: 4f8.7c0 Suspend: 1 Teb: 7ffde000 Unfrozen 17 Id: 4f8.ad4 Suspend: 1 Teb: 7ffad000 Unfrozen 18 Id: 4f8.1e8c Suspend: 1 Teb: 7ffa4000 Unfrozen 19 Id: 4f8.19f4 Suspend: 1 Teb: 7ff57000 Unfrozen 20 Id: 4f8.1a4c Suspend: 1 Teb: 7ff53000 Unfrozen 21 Id: 4f8.518 Suspend: 1 Teb: 7ff52000 Unfrozen 22 Id: 4f8.1fb0 Suspend: 1 Teb: 7ff51000 Unfrozen 23 Id: 4f8.1584 Suspend: 1 Teb: 7ff4f000 Unfrozen 24 Id: 4f8.1cb0 Suspend: 1 Teb: 7ff4e000 Unfrozen 25 Id: 4f8.edc Suspend: 1 Teb: 7ff4d000 Unfrozen 26 Id: 4f8.14ec Suspend: 1 Teb: 7ff4c000 Unfrozen 27 Id: 4f8.1710 Suspend: 1 Teb: 7ff5b000 Unfrozen 28 Id: 4f8.1b00 Suspend: 1 Teb: 7ff58000 Unfrozen 29 Id: 4f8.1a28 Suspend: 1 Teb: 7ff55000 Unfrozen 30 Id: 4f8.1d98 Suspend: 1 Teb: 7ffaf000 Unfrozen 31 Id: 4f8.1164 Suspend: 1 Teb: 7ffd8000 Unfrozen 32 Id: 4f8.be0 Suspend: 1 Teb: 7ffdb000 Unfrozen 33 Id: 4f8.1538 Suspend: 1 Teb: 7ffda000 Unfrozen
Here you can see that the OwningThread 11dc doesn't show up as an existing thread currently in the process. This is an orphaned critical section as the thread that owned it, didn't clean it up properly upon exit.
Tracking down the thread
So now we know we have an orphaned critical section, how do we go about figuring out what thread did this. Well there are a couple applications that can make this a lot easier to figure out. They are AppVerifier and DebugDiag. The steps to set these tools up is as follows:
Basically what this does is inject in the process and monitor Lock activity and will generate a BreakPoint exception when something exits while owning a lock. Before setting this up, if you have a current rule in DebugDiag that is currently creating memory dumps on First Chance exceptions, please remove it.
1. Setup Application Verifier
b) Once installed launch Application Verifier from Start > Programs > Application Verifier
c) From the menu choose File Add Application , browse to c:\windows\system32\inetsrv and select w3wp.exe .
d) Under Tests in the Right hand pane enable the Basics > Locks. If there is anything else selected it should be removed
e) Application Verifier is now configured and can be closed. For the settings to take affect you will need to restart IIS .
2. Setup the Debug Diagnostic tool to capture the exception that will be created by Application verifier.
a) Setup a Crash rule for " A specific IIS web Application Pool" , choose the name of the application we are debugging , Click > Next
b) Under Advanced configuration click Exceptions , then Add Exception
c) Under the Exception Code choose " 80000003 Break Point Exception". Set Action type to Full User Dump and Action Limit 10. Click Ok then Save & Close .
d) Click Next .. Choose the location where you want the memory dumps created, then click Next again.
e) "Activate the Rule now" should be selected already.. And click Finish
Viewing the thread and tracking down the reason
Now that we have the dump, we can look at the thread which is causing the orphan and see who is at fault and why. So the thread will look something like this:
0:029> kL ChildEBP RetAddr 05dcfa9c 00403933 ntdll!DbgBreakPoint WARNING: Stack unwind information not available. Following frames may be wrong. 05dcfca0 00332fbe vrfcore!VerifierStopMessageEx+0x4bd 05dcfcc4 00327a00 vfbasics!VfBasicsStopMessage+0x8e 05dcfd88 00327a8d vfbasics!AVrfpCheckCriticalSection+0x2d0 05dcfda8 00327693 vfbasics!AVrfpCheckCriticalSectionSplayNode+0x4d 05dcfed4 003275ae vfbasics!AVrfpCheckCriticalSectionTree+0xc3 05dcfeec 003255f2 vfbasics!AVrfCheckForOrphanedCriticalSections+0x7e 05dcff00 00325623 vfbasics!AVrfpCheckThreadTermination+0x12 05dcff10 00324ead vfbasics!AVrfpCheckCurrentThreadTermination+0x23 05dcff38 77bcb4b6 vfbasics!AVrfpExitThread+0x2d 05dcff44 77bcb535 msvcrt!_endthreadex+0x29 05dcff78 0032544f msvcrt!_threadstartex+0x79 05dcffb8 77e64829 vfbasics!AVrfpStandardThreadFunction+0x6f 05dcffec 00000000 kernel32!BaseThreadStart+0x34
If you look at the parameters to the VfBasicsStopMessage, you will see:
0:029> kpL 05dcfcc4 00327a00 vfbasics!VfBasicsStopMessage( unsigned long Code = 0x200, char * Message = 0x00322b50 "Thread is in a state in which it cannot own a critical section", unsigned long Param1 = 0x2494, char * Description1 = 0x00322b90 "Thread identifier", unsigned long Param2 = 0x174584, char * Description2 = 0x00322540 "Critical section address", unsigned long Param3 = 0x13bc30, char * Description3 = 0x00322568 "Critical section debug info address", unsigned long Param4 = 0x1021d40, char * Description4 = 0x003228c8 "Initialization stack trace. Use dds to dump it if non-NULL.")+0x8e
And if you dump out Param4, you will see:
0:029> dds 0x1021d40 01021d40 00000000 01021d44 0000001a 01021d48 00200281 01021d4c 00326936 vfbasics!AVrfpInitializeCriticalSectionCommon+0x136 01021d50 00326a94 vfbasics!AVrfpRtlInitializeCriticalSectionAndSpinCount+0x14 01021d54 77c7bc50 locking!MUTEX::CommonConstructor+0x1b 01021d58 77c7bc7c locking!MUTEX::MUTEX+0x13 01021dbc 11cd3116 CustomerCode!DllCanUnloadNow+0x97c 01021dc0 11cd2edc CustomerCode!DllCanUnloadNow+0x742
So you can see the functions that were involved in creating the critical section that is getting orphaned. Then you can track down what happened to that thread and why the critical section isn't getting cleaned up correctly. Most often, they will lead you to a point where you can talk to the company that created the control that is doing this or fix it yourself.
The thread that enters the critical section needs to make sure it cleans up the critical section, even if an exception happens or something like that. Usually this is done by using a try/finally block and releasing the critical section in the finally block.
Tess has some more information on other situation around a hang that you may run across in ASP.NET here.