ASP.NET Tips: How to identify a COM call that is blocking GC, causing a memory leak

Problem Description

So this is a customer situation we found were the customer was getting an System.OutOfMemoryException.  We were able to reproduce this with a console application that just ran:

             For Iterator = 0 To 99999999999
                Regex.IsMatch(CONNECTION_STRING, "/S")
                'Create a SQLConnection object then get rid of it
                Connection.Open()
                Connection.Close()
            Next Iterator

It turns out that the SQLConnection object uses some COM objects internally.  This objects need to get collected by the GC (Garbage Collector) but are unable to do so because the COM objects are created on this main STA thread and the finalizer thread cannot call back to the main thread due to this loop.

Debugging information

Looking at all the threads and look for the Finalizer thread

 0:000> !threads
ThreadCount: 3
UnstartedThread: 0
BackgroundThread: 2
PendingThread: 0
DeadThread: 0
                                 PreEmptive   GC Alloc                     Lock     
       ID   ThreadOBJ      State     GC       Context           Domain     Count APT Exception
  0 0x7a0 0x0014d358   0x2006020 Enabled  0x00000000:0x00000000 0x00147b18     0 STA
  2 0xf9c 0x00158010      0xb220 Enabled  0x00000000:0x00000000 0x00147b18     0 MTA (Finalizer)
  6 0xe30 0x0019f460   0x1800220 Enabled  0x00000000:0x00000000 0x00147b18     0 MTA (Threadpool Worker)

So the Finalizer thread is thread 2.  Looking at the callstack of that thread we see

 0:002> kbnL100
 # ChildEBP RetAddr  Args to Child              
00 00bbf74c 77f43741 77e41817 000001e0 00000000 SharedUserData!SystemCallStub+0x4
01 00bbf750 77e41817 000001e0 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
02 00bbf7c0 77e4168f 000001e0 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xac
03 00bbf7d0 771d3c39 000001e0 ffffffff 0016d53c kernel32!WaitForSingleObject+0xf
04 00bbf7ec 7726bef4 0018d7c0 00164538 0016d53c ole32!GetToSTA+0x6d
05 00bbf808 7726b087 00bbf8c4 0016d53c 001918f8 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xc8
06 00bbf8dc 771671c1 0016d53c 00bbf9c4 00bbf9b4 ole32!CRpcChannelBuffer::SendReceive2+0xb6
07 00bbf938 771ac725 0016d53c 00bbf9c4 00bbf9b4 ole32!CAptRpcChnl::SendReceive+0xab
08 00bbf98c 77ce6a4e 00000001 00bbf9c4 00bbf9b4 ole32!CCtxComChnl::SendReceive+0x8f
09 00bbf9a8 77ce6bdf 0018f9f4 00bbf9f0 06000169 rpcrt4!NdrProxySendReceive+0x41
0a 00bbfd78 77ce69cb 77164d98 771d1f38 00bbfdb0 rpcrt4!NdrClientCall2+0x1bc
0b 00bbfd98 77c64fcf 0000000c 00000008 00bbfe1c rpcrt4!ObjectStublessClient+0x89
0c 00bbfda8 771ac365 0018f9f4 00bbfdd0 00000000 rpcrt4!ObjectStubless+0xf
0d 00bbfe1c 772079fb 00000001 7921aa1d 00bbfea4 ole32!CObjectContext::InternalContextCallback+0x10b
0e 00bbfe5c 792190c6 00168018 7921aa1d 00bbfea4 ole32!CObjectContext::ContextCallback+0x6e
0f 00bbff08 79219404 7921aa54 00191058 00000000 mscorwks!CtxEntry::EnterContext+0xb7
10 00bbff30 792192ad 00000000 793e7e48 00000000 mscorwks!ComPlusApartmentCleanupGroup::CleanUpWrappers+0x63
11 00bbff68 7923d2c1 77e4b085 00000000 7923e791 mscorwks!ComPlusWrapperCleanupList::CleanUpWrappers+0x69
12 00bbff74 7923e791 791bbe2e 00000000 00000000 mscorwks!SyncBlockCache::CleanupSyncBlocks+0x85
13 00bbff78 791bbe2e 00000000 00000000 00000000 mscorwks!Thread::DoExtraWorkForFinalizer+0x23
14 00bbffb8 77e4a990 00000000 00000000 00000000 mscorwks!GCHeap::FinalizerThreadStart+0xbb
15 00bbffec 00000000 791ce2dc 00000000 00000000 kernel32!BaseThreadStart+0x34

In here, frame 0f shows us calling into a COM object to clean it up.  And the frames 04 and 05 show us switching to an STA apartment to make the COM call.  So now lets look at the process and thread that we are switching the COM call to

 0:002> dd 18d7c0+8 l2
0018d7c8  00000f58 000007a0

 

So now lets see which thread 7a0 is

 0:002> ~
#  0  Id: f58.7a0 Suspend: 1 Teb: 7ffde000 Unfrozen
   1  Id: f58.bf4 Suspend: 1 Teb: 7ffdd000 Unfrozen
.  2  Id: f58.f9c Suspend: 1 Teb: 7ffdc000 Unfrozen
   3  Id: f58.cb0 Suspend: 1 Teb: 7ffdb000 Unfrozen
   4  Id: f58.75c Suspend: 1 Teb: 7ffda000 Unfrozen
   5  Id: f58.698 Suspend: 1 Teb: 7ffd9000 Unfrozen
   6  Id: f58.e30 Suspend: 1 Teb: 7ffd8000 Unfrozen
   7  Id: f58.b38 Suspend: 1 Teb: 7ffd7000 Unfrozen

We can see that thread f58.7a0 is thread 0.  This tells us that the finalizer thread is trying to clean up an object and it is a COM object.  In order to clean up this STA object, we have to call back to it’s owner thread which is thread 0.

Looking at thread 0 we see

 0:000> kbn100
 # ChildEBP RetAddr  Args to Child              
00 0012f2c8 77f4372d 77e41bfa 00000003 0012f318 SharedUserData!SystemCallStub+0x4
01 0012f2cc 77e41bfa 00000003 0012f318 00000001 ntdll!NtWaitForMultipleObjects+0xc
02 0012f374 77161e96 00000003 0012f4f0 00000000 kernel32!WaitForMultipleObjectsEx+0x11a
03 0012f39c 771d36eb 0012f4f0 00000003 0012f3bc ole32!CCliModalLoop::BlockFn+0x2d
04 0012f404 7921a9e4 00162e20 00003a98 00000003 ole32!CoWaitForMultipleHandles+0xc5
05 0012f424 7921a96f 00000000 00003a98 00000003 mscorwks!NT5WaitRoutine+0x51
06 0012f44c 792982dc 00000003 0012f4f0 00000000 mscorwks!MsgWaitHelper+0x62
07 0012f47c 7929883d 00000003 0012f4f0 00000000 mscorwks!Thread::DoAppropriateWaitWorker+0xc1
08 0012f4d0 7933992b 00000003 0012f4f0 00000000 mscorwks!Thread::DoAppropriateWait+0x46
09 0012f534 00f80e8f 00000003 00000000 00000000 mscorwks!WaitHandleNative::CorWaitMultipleNative+0xff
WARNING: Frame IP not in any known module. Following frames may be wrong.
0a 0012f5a8 00febbbd 00000000 00000000 0504b1d4 0xf80e8f
0b 0012f5e8 00edff67 00000000 05041738 0012f6ec 0xfebbbd
0c 0012f62c 00edf44e 00000000 05046adc 0012f6ec 0xedff67
0d 0012f66c 00c51897 00000000 0012f6c8 0012f6ec 0xedf44e
0e 0012f6a8 791d7478 0012f7c0 791dcf66 0012f6fc 0xc51897
0f 0012f6b0 791dcf66 0012f6fc 00000000 0012f6d4 mscorwks!CallDescrWorker+0x30
10 0012f7c0 791dd2ca 008f507b 0015b9a8 0012f7e0 mscorwks!MethodDesc::CallDescr+0x1b6
11 0012f870 791dd336 008f507b 0015b9a8 110028f4 mscorwks!MethodDesc::CallDescr+0x43
12 0012f898 792363e9 0012f928 00000000 0014d358 mscorwks!MethodDesc::Call+0x97
13 0012f950 79238c4c 008f5080 00000001 00000000 mscorwks!ClassLoader::CanAccess+0x207
14 0012fa64 79238d49 0015b9a8 00000000 79041374 mscorwks!ClassLoader::ExecuteMainMethod+0x49d
15 0012fa7c 79238a36 00000000 0012fd70 00000000 mscorwks!Assembly::ExecuteMainMethod+0x21
16 0012fd60 791bcae7 00000001 0012fd90 0012ffe0 mscorwks!SystemDomain::ExecuteMainMethod+0x416
17 0012ffa0 791bc9ea 11000000 00000000 00000000 mscorwks!ExecuteEXE+0x1c0
18 0012ffb0 7917d0b8 00000000 791b0000 0012fff0 mscorwks!_CorExeMain+0x59
19 0012ffc0 77e4f38c 00000000 00000000 7ffdf000 mscoree!_CorExeMain+0x30
1a 0012fff0 00000000 7917d08c 00000000 78746341 kernel32!BaseProcessStart+0x23

Looking at the managed stack since frames 0a to 12 show what looks like managed code

 0:000> !clrstack
Thread 0
ESP         EIP       
0x0012f568  0x7ffe0304 [FRAME: ECallMethodFrame] [DEFAULT] I4 System.Threading.WaitHandle.WaitMultiple(SZArray Class System.Threading.WaitHandle,I4,Boolean,Boolean)
0x0012f580  0x00fec071 [DEFAULT] I4 System.Threading.WaitHandle.WaitAny(SZArray Class System.Threading.WaitHandle,I4,Boolean)
  at [+0xc9] [+0x41]
0x0012f5b4  0x00febbbd [DEFAULT] [hasThis] Class System.Data.SqlClient.SqlInternalConnection System.Data.SqlClient.ConnectionPool.GetConnection(ByRef Boolean)
  at [+0xad] [+0x47]
0x0012f5f0  0x00edff67 [DEFAULT] Class System.Data.SqlClient.SqlInternalConnection System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(Class System.Data.SqlClient.SqlConnectionString,ByRef Boolean)
  at [+0x1a7] [+0xc3]
0x0012f634  0x00edf44e [DEFAULT] [hasThis] Void System.Data.SqlClient.SqlConnection.Open()
  at [+0xce] [+0x44]
0x0012f674  0x00c51897 [DEFAULT] Void ConsoleApplication1.Module1.Main()
  at [+0x9f] [+0x37]
0x0012f9b0  0x791d7478 [FRAME: GCFrame] 
0x0012fa94  0x791d7478 [FRAME: GCFrame] 

Summary

So we can see that ConsoleApplication1.Module1.Main is calling into System.Data.SqlClient.SqlConnection.Open which is waiting for the connection to open.  In a real world scenario you would see it doing different things in a loop but never pumping the message loop.  In order for an STA component to receive calls the thread must pump messages (for more information see the Single-Threaded Apartments (STA) section of DCOM Architecture).

Note that any managed object could be using an STA COM object internally (in this case it was the Microsoft SqlConnection object) and there is no easy way to determine if it is doing so.  This means that you should always pump messages on any thread that creates managed objects if you don’t know the internal workings of the managed object (any object you did not develop).

The following are some methods to determine if a managed object uses COM internally:

  • Using a debugger and setting a breakpoint on COM activation functions in ole32
  • Use regmon and watch for the process looking up CLSIDs
  • Digging through the IL and looking for COM calls

Solutions

  1. The best solution is to pump messages on the STA thread.  From MSDN we can see easy ways to do this, from the “Blocking Issues” section of and Unmanaged Threading in Microsoft Windows

    WaitHandle.WaitOne, WaitAny, WaitAll, Monitor.Enter, Monitor.Block, Thread.Join, GC.WaitForPendingFinalizers, and so on are all responsive to Thread.Interrupt and to Thread.Abort. Also, if your thread is in a single-threaded apartment, all these managed blocking operations will correctly pump messages in your apartment while your thread is blocked

  2. Mark the main thread as an MTA thread

  3. If it is not easy to modify the main thread then you can create another thread that will handle the creation of the STA COM objects and will also pump messages.

kick it on DotNetKicks.com