Debugging a memory leak in managed code: Ping – SendAsync


Included in version 2.0 of the .NET framework is a Ping class (System.Net.NetworkInformation namespace) that can be used to monitor the up/down status of a network connection to a server.  We recently realized that we had a minor bug in the class that can present itself as a major problem – a memory leak.  Fortunately there is an easy workaround to this problem. I thought I would take this as an oportunity to demonstrate how to use the windbg debugger as I explain how we figured out what was going wrong.


Lets consider this simple console ping application that sends a series of ping requests to a server using the SendAsync method.






using System;
using System.Threading;
using System.Net.NetworkInformation;



class program {
 static int PingsOutstanding;


 static void Main(string[] args)
 {
  string targetServer = “www.contoso.com“;
  for (int i = 0; i < 1000; i++) {
   PingServer(targetServer);
  }


  int currentcount = PingsOutstanding;
  while (currentcount > 0) {
   Console.WriteLine(“waiting for {0} ping requests to complete”, currentcount);
   Thread.Sleep(1000);
   currentcount = PingsOutstanding;
  }


  //the calls to the GC are just for debugging purposes
  Console.WriteLine(“Calling GC.Collect()”);
  GC.Collect();
  GC.WaitForPendingFinalizers();


  Console.WriteLine(“Press <enter> to exit application”);
  Console.ReadLine();
 }


 static void PingServer(string hostnameoraddress)
 {
  try {
   Ping pingsender = new Ping();
   pingsender.PingCompleted += new PingCompletedEventHandler(OnPingCompleted);
   pingsender.SendAsync(hostnameoraddress, null);
   Interlocked.Increment(ref PingsOutstanding);
  } catch (Exception ex) {
   Console.WriteLine(ex);
  }
 }



 static void OnPingCompleted(object sender, PingCompletedEventArgs e)
 {
  Interlocked.Decrement(ref PingsOutstanding);


  try {
   Ping pingsender = (Ping)sender;


   pingsender.Dispose();
  } catch (Exception ex) {
   Console.WriteLine(ex);
  }
 }
}


This application will not consume a ton of memory under normal operation but if you increase the number of ping requests that are sent, you will see that memory consumption also climbs.  Given that the .Net framework uses a garbage collection approach to memory allocation and cleanup, you would expect that the memory would be cleaned up properly within a reasonable amount of time.  You should also notice that I am calling pingSender.Dispose() which we would hope would release the resources being held by this object.


For demonstration purposes I have added a Console.ReadLine() call at the end of the application so that it won’t exit.  I have also added a call to GC.Collect() and GC.WaitForPendingFinalizers() to make sure that when I go to debug this I am certain that the GC has recently run and cleaned up any objects that are available for cleanup.  It is not usually advisable to force GC collections except for debugging purposes.  See Maoni’s blog for great information on how the GC works and how to use it effectively.


Using Windbg.exe


I chose to use Windbg because I have found it to be the best debugger for memory leaks.  It is a very powerful GUI debugger that is used very frequently inside of Microsoft for debugging everything from application bugs to OS bugs.  You can download it from here: http://www.microsoft.com/whdc/devtools/debugging/default.mspx


A couple of quick notes before I begin:



  1. updates to these debugging tools are available regularly, so make sure to update periodically.
  2. I will color commands that are to be run inside the debugger in green
  3. I have access to all of the debugger symbols for the .Net framework, but you don’t.  However, most of the information you need should be accessible by using “.symfix” as noted below.  The principles discussed here apply to debugging a memory leak in any managed application.

    • public microsoft symbols can loaded into windbg using the .symfix  followed by a .reload command (requires internet connection to a microsoft http server)

  4. I may trim sample output (colored in purple) from the debugger in order to focus on the important information.

Now, Windbg is a native windows debugger and doesn’t really know much about the .Net framework, so you have to load an extension to the debugger that understands managed applications.  SOS.dll is the debugger extension we are looking for. In the command window this will load the sos extension:


   .loadby sos mscorwks


This command tells the debugger to load the sos dll by looking in the same directory as the mscorwks dll, which is a core dll that is always loaded in a managed application.  Note that Windbg breaks into the debugger immediately upon loading the application, so mscorwks will not be loaded at this point.  You will have to run the application long enough for the .NET framework code to be loaded.  You can actually tell the debugger to break when mscorwks is loaded by running this in the command window:


   sxe ld:mscorwks


There are two ways that you can run commands from the sos this extension dll:


   !<command> or !sos.<command>


Here are examples for the help function found in sos (it prints usage information):


   !help
   !sos.help


The second one is useful when you have multiple extensions that share common function names.


When debugging this leak, I actually let the app run until it blocked on the call to Console.ReadLine() so that GC should have cleaned up.  The first thing I do when dealing with a managed memory leak is to look at all the objects that have been created by the runtime and see if I see any that jump out at me.  Lets look at the heap and see what it contains. 


   !dumpheap -stat
total 20600 objects
Statistics:
      MT    Count    TotalSize Class Name
7a76eb14        1           12 System.Net.DnsPermission
7a755834        1           12 System.Diagnostics.PerformanceCounterCategoryType
7a753394        1           12 System.Diagnostics.TraceOptions
7a71a710        1           12 System.Net.TimeoutValidator
7912b908        1           12 System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]]
79112c90        1           12 System.Collections.Comparer
7910db30        1           12 System.Threading.SynchronizationContext
7910a718        1           12 System.DefaultBinder
79107f40        1           12 System.RuntimeTypeHandle
79107ac4        1           12 System.Reflection.__Filters
79102f48        1           12 System.__Filters
79102ef8        1           12 System.Reflection.Missing
79101ca8        1           12 System.RuntimeType+TypeCacheQueue
79100800        1           12 System.Text.DecoderExceptionFallback
…(output trimmed)
0105a514       77         4004 System.Configuration.FactoryRecord
791242ec       47         8136 System.Collections.Hashtable+bucket[]
79160a3c     1000        16000 System.Threading.RegisteredWaitHandle
79124228       58        19216 System.Object[]
7a779154     1000        20000 System.Net.SafeLocalFree
7a778ec0     1000        20000 System.Net.SafeCloseHandle
7a761bb0     1000        20000 System.ComponentModel.AsyncOperation
79160b84     1000        20000 System.Threading._ThreadPoolWaitOrTimerCallback
7910cf3c     1001        20020 Microsoft.Win32.SafeHandles.SafeWaitHandle
791609c8     1000        24000 System.Threading.RegisteredWaitHandleSafe
79115d0c     1000        24000 System.Threading.ManualResetEvent
7a7811f4     1000        32000 System.Net.NetworkInformation.PingCompletedEventHandler
79160aa8     1000        32000 System.Threading.WaitOrTimerCallback
7915ff38     1000        32000 System.Threading.SendOrPostCallback
7910d2f4     1001        36036 System.Threading.ExecutionContext
79124418     1004        44820 System.Byte[]
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
00150c90      421       108132      Free
790fa3e0     5262       304552 System.String
Total 20600 objects


Looking at the columns in the output, MT stands for Method Table and is basically a pointer to the table that describes that type of object.  Count is the number of objects that exist in the heap of the given type.  TotalSize is the amount of memory that is being consumed by any one type of object.  The last column is obviously the fully typed name of the object.


The first thing that jumps out at me in this case is the fact that we still have a large number of ping objects in the heap even though we aren’t holding onto any references in our code.  We also called dispose on these objects, so this is a huge red light telling me that something went wrong.  Also, the fact that we sent exacly 1000 ping requests and there are exactly 1000 ping objects still hanging around calls my attention.  Lets take a look at these ping objects and see why they aren’t getting cleaned up (Below I show two ways of doing this, you really only need to do one of them).


   !dumpheap -type Ping


This does a substring match on any object with Ping in it.  In this case it will list both System.Net.NetworkInformation.Ping objects and System.Net.NetworkInformation.PingCompletedEventHandler objects.  I really wanted just the Ping objects, so we could have run this command (using the MT for the ping object – obtained above)


   !dumpheap -mt 7a7812e0    


 Address       MT     Size
01271ff0 7a7812e0       88    
012720f8 7a7812e0       88    
012721d0 7a7812e0       88    
012722a8 7a7812e0       88    
01272380 7a7812e0       88    
01272458 7a7812e0       88    
01272530 7a7812e0       88    
01272608 7a7812e0       88    
012726e0 7a7812e0       88    
012727b8 7a7812e0       88    
01272890 7a7812e0       88    
01272968 7a7812e0       88    
…(output trimmed) 
012fada0 7a7812e0       88    
012faeec 7a7812e0       88    
012fb038 7a7812e0       88    
012fb184 7a7812e0       88    
012fb2d0 7a7812e0       88    
012fb41c 7a7812e0       88    
012fb568 7a7812e0       88    
012fb6b4 7a7812e0       88    
012fb800 7a7812e0       88    
012fb94c 7a7812e0       88    
012fba98 7a7812e0       88    
012fbbe4 7a7812e0       88    
012fbd30 7a7812e0       88    
012fbe7c 7a7812e0       88    
total 1000 objects
Statistics:
      MT    Count    TotalSize Class Name
7a7812e0     1000        88000 System.Net.NetworkInformation.Ping
Total 1000 objects


 Notice that the MT for all of the above objects matches that for the ping object.  The left hand column is the memory address where the object resides.  We can use this address to examine the actual ping object.  To take a look at the first object in the above list we would use this command:


   !dumpobject 01271ff0


I am not going to go into details at this point because this won’t lead us where the problem lies.  I thought I would list the dumpobject command just to point it out to you.  The problem in this case is that objects that we think should be cleaned up are not cleaned up as expected.  This means that there must be a reference to the object somewhere.  The following command will help you to see what objects hold references to an object in question.  We will use the same object address we used in the dumpobject example:


   !gcroot 01271ff0


Note: Roots found on stacks may be false positives. Run “!help gcroot” for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)


This tool scans the threads to find any object that holds a reference to the given memory address.  We can see that some threadpool object is holding onto us.  If we go back to the objects on the heap we can see that there are these _ThreadPoolWaitOrTimerCallback objects on the heap also.  Now, I don’t know a ton about the thread pool, but I would not expect those callback objects to still be on the heap either because they are most likely specific to a single usage of a threadpool thread. We could dump all of the _ThreadPoolWaitOrTimerCallback objects and check their roots (the objects holding onto them), but why dump the more objects from the heap?  The output from !gcroot gave us a pointer to an instance that we can quickly examine.


   !gcroot 0128d2c0


Note: Roots found on stacks may be false positives. Run “!help gcroot” for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)


We would have hoped that the Dispose() function would have cleaned these objects up, but it didn’t.  So, what about the GC?  Why didn’t GC take care of them when dispose didn’t? Notice how this gave us the same exact output from the gcroot of the ping class.  This must mean that we have a circular reference (ping has a reference to this thread pool object and visa versa) and that is why the GC didn’t clean it up.


Workaround (and reason why calling Dispose didn’t clean this up)


The simple workaround to this problem is to cast the object to the IDisposable interface and then call Dispose on that casted object.


   ((IDisposable)pingSender).Dispose();


Why does this solve the problem?  Take a look at the declaration of the class:


   public class Ping:Component,IDisposable


Notice that it inherits from Component and it also implementes the IDisposable interface.  Now lets look at the function declaration for Dispose on the Ping class:


   void IDisposable.Dispose ()


In the implementation, the Dispose function was defined as part of the IDisposable interface but the fact that the base class Component also implements a Dispose function was missed.  The Ping class doesn’t override the Component.Dispose method and so if you call Dispose directly on the Ping object without casting it to an IDisposable Interface, the runtime calls the base class implementation of Dispose instead of the one that was intended (the one for IDisposable). 


Note: There is another great blog on debugging managed memory leaks here: http://blogs.msdn.com/mvstanton/archive/2005/10/11/479861.aspx

Comments (20)

  1. Matthew says:

    >> In the implementation, the Dispose function was defined as part of the IDisposable interface but the fact that the base class Component also implements a Dispose function was missed.

    I love VB.NET. None of these issues. 😉

  2. Mahjayar says:

    Aah…The nuances of explicit and implicit implementations of a interface.

  3. wei says:

    Excellent analysis!

    thanks so much for such a wonderful article!

  4. Try Scitech Memory profiler – it is awesome

  5. Windbg is a very useful and powerful tool, especially for detecting managed memory leak.Here is the microsoft…

  6. Olivier Fermy says:

    Windbg isn’t a powerfull tool for detecting memory leaks. I have experienced Scitech Memory profiler and it’s such a great tool. This profiler is able to find the path to the root object keeping your object alive. This is what you need when you want to repair memory leaks, you don’t only want a profiler which displays the "references to" and the "references from".

  7. Dave Tucker says:

    I am trying to make a class that avoids the Ping memory leak, but it isn’t working correctly.  Does anybody know why the following does not work?

    public class Ping : System.Net.NetworkInformation.Ping

    {

    protected override void Dispose(bool disposing)

    {

    ((IDisposable)this).Dispose();

    }

    }

    If I actually call the Dispose function, it cleans it up properly now, but it still will not automatically dispose when the object goes out of scope as it should.

    Is there a way to extend the Ping class in such a way so as to not have to worry about the leak?

  8. joncole says:

    Having an object go out of scope is not the same as disposing of the object.  The Dispose method is meant to be called explicitly because it gives you deterministic releasing of resources.  If you wait for an object to go out of scope, then you are letting the GC take care of it whenever it gets a chance.

    As a side note, your overridden Dispose function should call

         base.Dispose(disposing);

    before exiting.

  9. James Harrison says:

    Jon this post is great. I have a serious memory leak in my current project which I’ve been completely unable to track down. Using your tips I’ve discovered System.Threading._ThreadPoolWaitOrTimerCallback hanging around from ThreadPool.RegisterWaitForSingleObject() calls. Unfortunately the RegisteredWaitHandle returned by RegisterWaitForSingleObject doesn’t have a dispose. I’ve called RegisteredWaitHandle.Unregister() and get a return of true, but the _ThreadPoolWaitOrTimerCallback are still there. Is this a bug in the ThreadPool class?

  10. joncole says:

    I am not a CLR guy and I haven’t really used that API much, but this is what I would do if I were in your shoes…

    I would try to find answers to the following questions:  

    1) Are you getting all of the WaitOrTimerCallbacks invoked?  

    2) Are most of them timing out or getting called before the timeout?  

    3) What timeout value are you using?  Is it too big?  I ask because if you have a large number of long running tasks that never set the wait handle you are waiting on and you have a large timeout, then you could be creating work items for the thread pool faster than you are consuming them…

    4) What is the state of the thread pool?  Is it keeping up with the workload?  Check out the output from !sos.threadpool.

    Hope that points you in the right direction…

  11. James Harrison says:

    Again, thanks a lot for this blog, it’s opened up a whole new debugging method for myself. And thanks for the suggestions. I’m sure I’ll track down the problem.

  12. Iftekhar Naim says:

    thanks a lot man…… Amazing articles. I was wasting to solve my memory leak about 3 hours. Finally i identified that ping is the problem but found no solve. But your fantastic article just saved my day…:)

    thanks

  13. Jim88 says:

    Please could you provide VB.NET code for the workaround.

    How do you "cast the object to the IDisposable interface and then call Dispose on that casted object" in VB.

    Thanks.

  14. joncole says:

    I believe it is done something like this…

    Dim iDisp As IDisposable

    iDisp = CType(myPingObject, IDisposable)

    iDisp.Dispose()

  15. Simon says:

    Just wanted to say thanks for a revealing article, I’ve never delved into debugging memory leaks/similar – This has piqued my interest so I intend to investigate further.

    Thanks for taking the time to blog 🙂

  16. Geoff says:

    Thanks, This was a great read and fixed my issue with Ping!

  17. Writing programs using .Net is very productive. One reason is because much of memory management is “managed”

  18. Mike says:

    sos rocks…great for reflection tricks as well… nice article

  19. Adao Ramalho says:

    Excellent analysis!

    Solve my problem! Congratulations!

  20. Nick says:

    OMG!!!  Turns out my top thing was

    000007fed9964e98  4056295    162251800 System.Windows.Documents.TextTreeInsertUndoUnit

    We used text boxes to display everything, so that a user can copy and paste…ooooops!!!!  Doesn't help when your text box text displays current time!!!

    And here we thought our memory issues were some super complicated thing.