How to troubleshoot leaked SqlConnection Objects (.NET 2.0) - Part 2

In the last post in this series, we looked at how we can determine that our Connection pool was exhausted.  In this post I'm going to go a little deeper into the Internal connection itself and how we can verify if this is a closed or active connection.

Dumping out the internal connection objects

A connection object in the System.Data.SqlClient namespace consists of two parts:

  • The SqlConnection class that is used by customers’ code
  • The SqlInternalConnectionTds internal class that is used by the pooling code. This class is not directly accessible to the user.

The SqlConnection class has a pointer to a SqlInternalConnectionTds object if it’s open (_innerConnection member variable). The _innerConnection member variable is null if the connection is closed. Whenever a connection is closed by the code, the internal object gets disassociated from the external object and the ownership of the internal object transfers to the pool object. This relationship allows us to identify SqlConnection objects that have not been closed.

The SqlInternalConnectionTds object has a weak reference back to the owning SqlConnection object.

Since there are typically multiple pools and not all of them are full, we want to start with the internal objects that we know belong to a full pool.

Going back to the pool in question, lets dump out the items within this pool.

0:000> !do 012bbe80
Name: System.Data.ProviderBase.DbConnectionPool
...
00000000 400152b 40 0 instance 012bbfb4 _objectList
79102290 400152c 54 System.Int32 1 instance 100 _totalObjects
...

0:000> !do 012bbfb4
Name: System.Collections.Generic.List`1[[System.Data.ProviderBase.DbConnectionInternal, System.Data]]
MethodTable: 654413c4
EEClass: 7912f680
Size: 24(0x18) bytes
(C:WINNTassemblyGAC_32mscorlib2.0.0.0__b77a5c561934e089mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
7912d8f8 40009c7 4 System.Object[] 0 instance 012bbfcc _items
79102290 40009c8 c System.Int32 1 instance 100 _size
79102290 40009c9 10 System.Int32 1 instance 100 _version
790fd0f0 40009ca 8 System.Object 0 instance 00000000 _syncRoot
7912d8f8 40009cb 0 System.Object[] 0 shared static _emptyArray
>> Domain:Value dynamic statics NYI
00155858:NotInit <<

0:000> !da 012bbfcc
Name: System.Data.ProviderBase.DbConnectionInternal[]
MethodTable: 7912d8f8
EEClass: 7912de6c
Size: 416(0x1a0) bytes
Array: Rank 1, Number of elements 100, Type CLASS
Element Methodtable: 654009f0
[0] 012be414
[1] 012bf3e4
[2] 012bf008
...
[98] 0148114c
[99] 01485fcc

At this point we want to save all these 100 internal connection addresses into a file and remove all the array indexes so that the file only contains:

012be414
012bf3e4
012bf008
...
0148114c
01485fcc

Visual Studio is handy for this since we can select using alt + mouse to select the first 3-4 columns and delete them all, then save the file.

Processing the internal connections

The goal at this point is to find any SqlConnection objects from these SqlInternalConnectionTds objects that are no longer referenced. If the SqlConnection still references the SqlInternalConnectionTds and cannot be reached through !gcroot, it has been abandoned by the code without closing it.

Using .foreach to dump out the connections is easiest since it avoid the manual work of processing each of the 100 connections in question:

.foreach /f ( place "c:tempInternalConnections.txt") { dd poi(poi( place +4)+4) l1}
(32 bit)

or

.foreach /f ( place "c:tempInternalConnections.txt") { dq poi(poi( place +8)+8) l1}
(64 bit)

Explanation of the .foreach command:

place – this is our placeholder, or variable name, that represents each of the addresses in the file
dd – this would be dq in a 64-bit dump. It dumps out a double word, or the address
place + 8 – the weak reference is at offset 8 from the SqlInternalConnectionTds(64 bit) or at offset 4(32 bit):

0:000> !do 012be414
Name: System.Data.SqlClient.SqlInternalConnectionTds
MethodTable: 65404744
EEClass: 6544d9e0
Size: 140(0x8c) bytes
(C:WINNTassemblyGAC_32System.Data2.0.0.0__b77a5c561934e089System.Data.dll)
Fields:
MT Field Offset Type VT Attr Value Name
79102290 4000f67 1c System.Int32 1 instance 4 _objectID
...
79104c38 4000f6d 4 System.WeakReference 0 instance 012be55c _owningObject
...

The WeakReference object has a handle at offset 8 that is the second +8 in the command (64 bit) or at offset 4 (32 bit):

0:000> !do 012be55c
Name: System.WeakReference
MethodTable: 79104c38
EEClass: 79104bd4
Size: 16(0x10) bytes
(C:WINNTassemblyGAC_32mscorlib2.0.0.0__b77a5c561934e089mscorlib.dll)
Fields:
MT Field Offset Type VT Attr Value Name
791016bc 40005a9 4 System.IntPtr 1 instance 3f1268 m_handle
7910be50 40005aa 8 System.Boolean 1 instance 0 m_IsLongReference

The value at that location is the owning object if it exists.

Non-Null and has an owning object:

0:000> dd 3f1268 l1
003f1268 01575138

Null and no owning object:

0:000> dd 3f1268 l1
003f1268 00000000

Output of the foreach command:

0:000> .foreach /f ( place "c:tempInternalConnections.txt") { dd poi(poi( place +4)+4) l1}
003f1268 01575138
003f127c 0157336c
003f1290 0157136c
003f1298 0156f138
003f1244 015809fc
...
003f2d34 014ac514
003f2d2c 014acbf4
003f2d1c 014ac7d4
003f2d14 015817cc

As we can see, the internal connections have an owning SqlConnection object. This either means that they are actively being used by the code (not likely) or they have been abandoned (more likely).

Finding out if a connection is actively used

To find out if a SqlConnection is still being used by the code, we can run the !gcroot command. This command will tell us if the object is reachable by the .NET Framework and if it is not, it is ready to be collected.

0:000> !gcroot 0157336c
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 590
DOMAIN(00155858):HANDLE(WeakSh):3f127c:Root:0157336c(System.Data.SqlClient.SqlConnection)

At this point in the application, we only have one thread running which is thread ID 0. 

Here the output indicates that the object is reachable from thread 0. However, this can be a false positive because thread references can be old. We still have to verify that the object actually exists on that thread:

0:000> kL
ChildEBP RetAddr
0012f31c 7739bf53 ntdll!KiFastSystemCallRet
0012f3b8 7b0831a5 user32!NtUserWaitMessage+0xc
0012f434 7b082fe3 System_Windows_Forms_ni+0xb31a5
0012f464 7b0692c2 System_Windows_Forms_ni+0xb2fe3
0012f490 79e7c6cc System_Windows_Forms_ni+0x992c2
0012f510 79e7c8e1 mscorwks!CallDescrWorkerWithHandler+0xa3
0012f64c 79e7c783 mscorwks!MethodDesc::CallDescr+0x19c
0012f668 79e7c90d mscorwks!MethodDesc::CallTargetWorker+0x1f
0012f67c 79eefb9e mscorwks!MethodDescCallSite::Call_RetArgSlot+0x18
0012f7e0 79eef830 mscorwks!ClassLoader::RunMain+0x263
0012fa48 79ef01da mscorwks!Assembly::ExecuteMainMethod+0xa6
0012ff18 79fb9793 mscorwks!SystemDomain::ExecuteMainMethod+0x43f
0012ff68 79fb96df mscorwks!ExecuteEXE+0x59
0012ffb0 7900b1b3 mscorwks!_CorExeMain+0x15c
0012ffc0 77e6f23b mscoree!_CorExeMain+0x2c
0012fff0 00000000 kernel32!BaseProcessStart+0x23

We can see that we have managed code on this thread.  Let's look at what the managed stack looks like:

0:000> !clrstack
OS Thread Id: 0x590 (0)
ESP EIP
0012f32c 7c8285ec [InlinedCallFrame: 0012f32c] System.Windows.Forms.UnsafeNativeMethods.WaitMessage()
0012f328 7b08374f System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)
0012f3c8 7b0831a5 System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)
0012f440 7b082fe3 System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)
0012f470 7b0692c2 System.Windows.Forms.Application.Run(System.Windows.Forms.Form)
0012f480 00e70097 SqlConnectionLeakWin.Program.Main()
0012f69c 79e7c74b [GCFrame: 0012f69c]

Doesn't appear to be doing anything with SQL here.  Let's look at the objects on the stack:

0:000> !dso
OS Thread Id: 0x590 (0)
ESP/REG Object Name
ebx 01253384 System.Windows.Forms.Application+ThreadContext
esi 015cc2e8 System.Collections.Hashtable+HashtableEnumerator
0012f354 01299fc4 System.Windows.Forms.NativeMethods+MSG[]
0012f358 01253384 System.Windows.Forms.Application+ThreadContext
0012f360 01299ad8 System.Windows.Forms.Application+ComponentManager
0012f3d8 01253384 System.Windows.Forms.Application+ThreadContext
0012f42c 01253384 System.Windows.Forms.Application+ThreadContext
0012f43c 01296b84 System.Windows.Forms.ApplicationContext
0012f444 0127fe4c System.ComponentModel.EventHandlerList
0012f458 01252a8c SqlConnectionLeakWin.Form1
0012f460 01253384 System.Windows.Forms.Application+ThreadContext
0012f474 01252a8c SqlConnectionLeakWin.Form1

We can conclude that this SqlConnection object is no longer being used and it has not been closed.  This proves that the  applications code did not close all connections and further code investigation needs to be performed to close all connections.

Reference:

Part 1

Adam W. Saxton | Microsoft SQL Server Escalation Services