Debugging managed code memory leak with memory dump using windbg


Debugging memory leak is a must-have skill for most server side app developers especially when you move to cloud.  Usually you don't have luxury to attach debugger to the running process on the server. Even if you have the luxury, it might not be as easy as you think. A good solution to this problem which many developers are using is getting memory dump for the process in which you suspect there is memory leak and then debugging it with windbg.

In summary, the below are reasons for using windbg to debug managed code memory leak with memory dump.
 1. It is needed for server side applications and especially cloud applications
 2. Creating memory dump is easy and you can analyze it offline as long as you want
 3. Windbg is powerful, relatively lightweight and free!
 
This task is not as hard as it sounds like. I just simply used the below 9 steps to debug one possible memory leak at work a while back.
 1. .symfix  and .reload -f
 2. .loadby sos mscorwks!
 3. !VerifyHeap
 4. !EEHeap
 5. !dumpheap -stat
 6. !dumpheap -type
 7. !do
 8. !threads
 9. kb
 
Let's see how I debug the memory leak easily one step by one step.  I will give brief description for each step.

 1. .symfix  and .reload -f
 The idea here is finding correct symbols. If .symfix doesn't work, you will need to set the symbol path manually from the menu or use .sympath. This is important and make sure to do it right. Debugging  without symbols is harder and very inconvenient.
 
 2. .loadby sos mscorwks (.net 3.5 or earlier) / .loadby sos clr (.net 4.0 or later)
 sos is the debugger extension for managed coding debugging. Type !help to see what command are available and how to use them. The help is really useful and do read it.  The commands are case insensitive and there are also shortcuts for some commands. 

 3. !VerifyHeap
 Run this command to make sure that your memory dump is good to use because when you take memory dump, the GC may be doing its work and the memory dump taken will not be good for analysis.
 
 4. !EEHeap
 Total LoaderHeap size: 0x2dd0000(48037888)bytes
 =======================================
 Number of GC Heaps: 1
 generation 0 starts at 0x00000000a8a7c8f8
 generation 1 starts at 0x00000000a7ff1000
 generation 2 starts at 0x0000000003841000
 ephemeral segment allocation context: (0x00000000a8a7c8f8, 0x00000000a8a7c910)
          segment            begin         allocated             size
 0000000003840000 0000000003841000  000000001381a838 0x000000000ffd9838(268277816)
 00000000359a0000 00000000359a1000  00000000459969e0 0x000000000fff59e0(268392928)
 000000007fff0000 000000007fff1000  000000008ffc16e0 0x000000000ffd06e0(268240608)
 000000004d9a0000 000000004d9a1000  000000005d995758 0x000000000fff4758(268388184)
 000000008fff0000 000000008fff1000  000000009e42ce50 0x000000000e43be50(239320656)
 00000000a7ff0000 00000000a7ff1000  00000000b0ca14b0 0x0000000008cb04b0(147522736)
 Large object heap starts at 0x0000000013841000
          segment            begin         allocated             size
 0000000013840000 0000000013841000  000000001b7cd460 0x0000000007f8c460(133743712)
 000000002be90000 000000002be91000  0000000033e52160 0x0000000007fc1160(133960032)
 00000000459a0000 00000000459a1000  000000004d47eb00 0x0000000007addb00(128834304)
 000000005db30000 000000005db31000  0000000065aeca18 0x0000000007fbba18(133937688)
 0000000065b30000 0000000065b31000  000000006dab1dd8 0x0000000007f80dd8(133696984)
 000000009fff0000 000000009fff1000  00000000a71ada98 0x00000000071bca98(119261848)
 00000000b7ff0000 00000000b7ff1000  00000000bc534d10 0x0000000004543d10(72629520)
 Total Size        0x8a0e83a8(2316207016)
 ------------------------------
 GC Heap Size        0x8a0e83a8(2316207016)
 
 Look at the size of the heaps to see if it is normal.  In this case GC heap is ~2GB and doesn’t look like normal.
 LoaderHeap, also known as high-frequency heap,  is used to store static objects and Type System. GC doesn't work on this heap. GC heap, also known as low-frequency heap,  is object heap and used to store non-static objects.
 Here is a great article explaining everything.  http://msdn.microsoft.com/en-us/magazine/cc163791.aspx#S5
 
 5. !dumpheap -stat
  MT                      Count      Total Size          Class Name

 000007ff00a473e0    36826     14730400 System.Net.HttpWebRequest
 000007fef58ee278    67741     33448512 System.Collections.Hashtable+bucket[]
 000007fef58d4548   393533     33956800 System.Object[]
 000007ff01907cb8      139     44941480 System.Collections.Generic.Dictionary`2+Entry[[System.Object, mscorlib],[Microsoft.IdentityModel.MruSecurityTokenCache+CacheEntry, Microsoft.IdentityModel]][]
 000007fef58e6758   643630    143906080 System.String
 000007fef58ee798   124388    487011696 System.Byte[]
 0000000001bfd530    15368    973670696      Free
 Total 4183274 objects
 
 By looking at the output, you will find that there are so many HttpWebRequest which looks fishy.  Let's take a closer look in next step.
 
 6. !dumpheap -type
 0:000> !dumpheap -type System.Net.HttpWebRequest
          Address               MT     Size
 00000000037fd6c0 000007ff00a473e0      400    
 000000000380bac0 000007ff00a473e0      400    
 0000000003812f88 000007ff00a473e0      400    
 0000000003816990 000007ff00a473e0      400    
 ...
 
 This command will list all the objects of type System.Net.HttpWebRequest. You can pick any object or object with larger size and look into it.  Let's pick the first object and take a look.
 
 7. !do
 
 0:000> !do 00000000037fd6c0
 Name: System.Net.HttpWebRequest
 MethodTable: 000007ff00a473e0
 EEClass: 000007ff00a36dd8
 Size: 400(0x190) bytes
  (D:\windows\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll)
 Fields:
               MT    Field   Offset                 Type VT     Attr            Value Name
 000007fef58e6048  400018a        8        System.Object  0 instance 0000000000000000 __identity
 000007ff00a45940  4001d73       28         System.Int32  1 instance                0 m_AuthenticationLevel
 000007fef607f5a8  4001d74       2c         System.Int32  1 instance                0 m_ImpersonationLevel
 000007ff00a4fb90  4001d75       10 ...equestCachePolicy  0 instance 000000000388ff20 m_CachePolicy
 000007ff00a72400  4001d76       18 ...uestCacheProtocol  0 instance 0000000000000000 m_CacheProtocol
 000007ff00a4fcb0  4001d77       20 ...questCacheBinding  0 instance 00000000037730d8 m_CacheBinding
 000007fef58ed640  4001d70      de8 ...ections.ArrayList  0   static               MT    Field   Offset                 Type VT     Attr            Value Name
 000007fef58e6048  400018a        8        System.Object  0 instance 0000000000000000 __identity
 000007ff00a45940  4001d73       28         System.Int32  1 instance                0 m_AuthenticationLevel
 000007ff00a4a370  4001f5b       80 ...Net.KnownHttpVerb  0 instance 00000000037736b0 _Verb
 000007ff00a4a370  4001f5c       88 ...Net.KnownHttpVerb  0 instance 00000000037736b0 _OriginVerb
 000007ff00a480e8  4001f5d       90 ...bHeaderCollection  0 instance 00000000037fd850 _HttpRequestHeaders
 000007fef58ee798  4001f5e       98        System.Byte[]  0 instance 0000000003f39650 _WriteBuffer
 000007ff00a47268  4001f5f      144         System.Int32  1 instance                1 _HttpWriteMode
 000007ff003d7690  4001f60       a0           System.Uri  0 instance 00000000041a5820 _Uri
 000007ff003d7690  4001f61       a8           System.Uri  0 instance 00000000041a5820 _OriginUri
 000007fef58e6758  4001f62       b0        System.String  0 instance 0000000000000000 _MediaType
 
 0:000> !do 00000000041a5820
               MT    Field   Offset                 Type VT     Attr            Value Name
 000007fef58e6758  4001b87        8        System.String  0 instance 00000000041a5758 m_String
 000007fef58e6758  4001b88       10        System.String  0 instance 0000000000000000 m_originalUnicodeString
 000007ff003d7bd0  4001b89       18     System.UriParser  0 instance 000000000364a9f0 m_Syntax
 000007fef58e6758  4001b8a       20        System.String  0 instance 0000000000000000 m_DnsSafeHost
 000007ff003d73e8  4001b8b       30        System.UInt64  1 instance 37615763976 m_Flags
 000007ff00914d80  4001b8c       28   System.Uri+UriInfo  0 instance 00000000041a5868 m_Info
 
 0:000> !do 00000000041a5758
 Name: System.String
 MethodTable: 000007fef58e6758
 EEClass: 000007fef54ee520
 Size: 194(0xc2) bytes
  (D:\windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
 String: https://*.windows.net:443/2/Management/
 
 By following the below steps I looked into a few more instances of HttpWebRequest and found out that all the requests are pointing to the same uri.  I know that what that uri is used for but I am not sure why there are so many requests to that uri. So I use !threads to see how many there are and what they are doing.
 
 8. !threads
 0:000> !threads
 ThreadCount: 497
 UnstartedThread: 0
 BackgroundThread: 203
 PendingThread: 0
 DeadThread: 292
 Hosted Runtime: no
                                               PreEmptive                                                Lock
        ID OSID        ThreadOBJ     State   GC     GC Alloc Context                  Domain           Count APT Exception
    9    1  b74 000000000184bd10      8220 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 Ukn
   22    2  664 000000000184fbe0      b220 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Finalizer)
   23    3  350 0000000001beec70    80a220 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Completion Port)
   24    4  3e4 0000000001befa40      1220 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 Ukn
   25    6  6b8 000000001cbb2350   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker)
    2    8  b84 000000001cc813d0       220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 Ukn
   26    9  82c 000000001cc83ac0   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker)
   27    a  394 000000001ccc2970   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (00000000972d9498)
   28    c  9e4 000000001ccd3220   200b020 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA System.ServiceModel.CommunicationException (000000009893f890)
   29    d  9e8 000000001ccd37f0   200b020 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA System.Threading.WaitHandleCannotBeOpenedException (0000000003a1efd8)
   30    b  834 000000001ccd2c50       220 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 Ukn
   31    e  4fc 000000001ccd4f30   a80b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Completion Port)
   32   2b  508 000000001edad0e0   a80b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Completion Port)
   33   4b  9d8 000000001ed4c000   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (0000000009c4d120)
   34   3a  a34 000000001cef75e0   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (0000000008e72090)
   35   30  2ec 000000001ed4d170   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (0000000008d2d420)
   36   2a  530 000000001cef7bb0   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (0000000008f35d68)
   37   4d  958 000000001eb17040   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (00000000097791c8)
 XXXX   52    0 000000001cef6470   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
   38   1c  ab4 000000001ce49790   380b220 Enabled  0000000000000000:0000000000000000 0000000001bf0010     0 MTA (Threadpool Worker) System.IO.IOException (000000000b5478d0)
 XXXX   43    0 000000001edad6b0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   45    0 000000001edaa260   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   15    0 000000001f0f4cb0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX    5    0 000000001ed4c5d0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   1f    0 000000001edab9a0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   42    0 000000001f0f29d0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   1b    0 000000001edabf70   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   3f    0 000000001ce4c040   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   40    0 000000001ce4aed0   1801820 Enabled  0000000000000000:0000000000000000 0000000001844fa0     0 MTA (Threadpool Worker)
 XXXX   49    0 000000001ed4ba30   18018
 ...
 
 As you can see, there are 497 thread which is crazy. You can pick one thread to look at what it is doing. You can also use ~*kb to see stacks for all threads
 
 9. kb
 Let's pick thread 72 and run the below command to see what we can get.
 
 Type ~72s to make thread 72 current and type Kb to show the stack.
 
 RetAddr           : Args to Child                                                           : Call Site
 000007fe`fd333756 : 00000000`00000000 000007ff`00d907e8 00000000`00000003 000007fe`f7a1df6e : ntdll!ZwWaitForSingleObject+0xa [o:\rd\rd_os_v1.obj.amd64fre\base\ntdll\daytona\objfre\amd64\usrstubs.asm @ 180]
 000007fe`fd3418ed : 00000000`00000000 00000000`3cfff3c0 00000000`3cfff3c0 00000000`2115c270 : mswsock!SockWaitForSingleObject+0x46 [d:\longhorn\net\sockets\winsock2\wsp\mswsock\msafdlib\proc.c @ 1723]
 000007fe`fe2216a7 : 00000000`00001754 00000000`018d8430 00000000`2115c278 00000000`03777b90 : mswsock!WSPSend+0x1050d
 000007fe`f7540d87 : 00000000`2115c250 00000000`2115c1b0 00000000`00000003 00000000`2115c3d8 : ws2_32!WSASend+0x167 [d:\longhorn\net\sockets\winsock2\ws2_32\src\send.cpp @ 297]
 000007ff`01848990 : 00000000`2115c250 00000000`2115c368 00000000`2115c280 000007ff`0184887c : mscorwks!DoNDirectCall__PatchGetThreadCall+0x7b [f:\dd\ndp\clr\src\vm\amd64\pinvokestubs.asm @ 286]
 000007ff`018484e6 RetAddr           : Args to Child                                                           : Call Site
 000007fe`fd333756 : 00000000`00000000 000007ff`00d907e8 00000000`00000003 000007fe`f7a1df6e : ntdll!ZwWaitForSingleObject+0xa [o:\rd\rd_os_v1.obj.amd64fre\base\ntdll\daytona\objfre\amd64\usrstubs.asm @ 180]
 000007fe`fd3418ed : 00000000`00000000 00000000`3cfff3c0 00000000`3cfff3c0 00000000`2115c270 : mswsock!SockWaitForSingleObject+0x46 [d:\longhorn\net\sockets\winsock2\wsp\mswsock\msafdlib\proc.c @ 1723]
 000007fe`fe2216a7 : 00000000`00001754 00000000`018d8430 00000000`2115c278 00000000`03777b90 : mswsock!WSPSend+0x1050d
 000007fe`f7540d87 : 00000000`2115c250 00000000`2115c1b0 00000000`00000003 00000000`2115c3d8 : ws2_32!WSASend+0x167 [d:\longhorn\net\sockets\winsock2\ws2_32\src\send.cpp @ 297]
 000007ff`01848990 : 00000000`2115c250 00000000`2115c368 00000000`2115c280 000007ff`0184887c : mscorwks!DoNDirectCall__PatchGetThreadCall+0x7b [f:\dd\ndp\clr\src\vm\amd64\pinvokestubs.asm @ 286]
 000007ff`018484e6 : 000007fe`f5548542 00000000`b126d3c0 00000000`00000000 00000000`2115c4d0 : DomainBoundILStubClass.IL_STUB(IntPtr, System.Net.WSABuffer[], Int32, Int32 ByRef, System.Net.Sockets.SocketFlags, IntPtr, IntPtr)+0x150
 000007ff`01847ded : 00000000`8a908070 00000000`2115c380 00000000`00000001 000007fe`f79686f7 : System!System.Net.Sockets.Socket.MultipleSend(System.Net.BufferOffsetSize[], System.Net.Sockets.SocketFlags)+0x1d6 [f:\dd\ndp\fx\src\Net\System\Net\Sockets\Socket.cs @ 7165]
 000007ff`01847529 : 00000000`8ab9ae78 00000000`00000000 00000000`00000000 000007ff`01847381 : System!System.Net.Sockets.NetworkStream.MultipleWrite(System.Net.BufferOffsetSize[])+0x4d [f:\dd\ndp\fx\src\Net\System\Net\Sockets\NetworkStream.cs @ 1104]
 000007ff`018470ae : 00000000`8ab9aba8 000007ff`00aa0c6b ffffffff`fffffffe 00000000`3cfff3c0 : System!System.Net.Security._SslStream.StartWriting(System.Net.SplitWritesState, SplitWriteAsyncProtocolRequest)+0x169 [f:\dd\ndp\fx\src\Net\System\Net\SecureProtocols\_SslStream.cs @ 392]
 : 000007fe`f5548542 00000000`b126d3c0 00000000`00000000 00000000`2115c4d0 : DomainBoundILStubClass.IL_STUB(IntPtr, System.Net.WSABuffer[], Int32, Int32 ByRef, System.Net.Sockets.SocketFlags, IntPtr, IntPtr)+0x150
 000007ff`01847ded : 00000000`8a908070 00000000`2115c380 00000000`00000001 000007fe`f79686f7 : System!System.Net.Sockets.Socket.MultipleSend(System.Net.BufferOffsetSize[], System.Net.Sockets.SocketFlags)+0x1d6 [f:\dd\ndp\fx\src\Net\System\Net\Sockets\Socket.cs @ 7165]
 …
 
I looked into a few more threads and they all had very similar call stacks. At this time point I knew where to look further into code. With all the information here it was easy for me to figure out what happened and fix the memory leak problem.

The above is just one real example about how I debugged memory leak by using windbg. You might need to use different commands in different cases. But  once you do it once, you will know how to learn more in this area and how to debug other managed code memory leak by using windbg+sos.
 
 

Comments (4)

  1. Li says:

    This is quite useful. Thanks for writing this up!!

  2. Scott Marlowe says:

    Very useful. Thanks for posting.

  3. Nick says:

    So good, thank you for the real world example.

  4. Easwaran says:

    Wonderful article.. Really appreciate your work

Skip to main content