Included in version 2.0 of the .NET framework is a Ping class (System.Net.NetworkInformation namespace) that can be used to monitor the up/down status of a network connection to a server. We recently realized that we had a minor bug in the class that can present itself as a major problem - a memory leak. Fortunately there is an easy workaround to this problem. I thought I would take this as an oportunity to demonstrate how to use the windbg debugger as I explain how we figured out what was going wrong.
Lets consider this simple console ping application that sends a series of ping requests to a server using the SendAsync method.
using System;
static void Main(string[] args) int currentcount = PingsOutstanding; //the calls to the GC are just for debugging purposes Console.WriteLine("Press <enter> to exit application"); static void PingServer(string hostnameoraddress)
try { pingsender.Dispose(); |
This application will not consume a ton of memory under normal operation but if you increase the number of ping requests that are sent, you will see that memory consumption also climbs. Given that the .Net framework uses a garbage collection approach to memory allocation and cleanup, you would expect that the memory would be cleaned up properly within a reasonable amount of time. You should also notice that I am calling pingSender.Dispose() which we would hope would release the resources being held by this object.
For demonstration purposes I have added a Console.ReadLine() call at the end of the application so that it won't exit. I have also added a call to GC.Collect() and GC.WaitForPendingFinalizers() to make sure that when I go to debug this I am certain that the GC has recently run and cleaned up any objects that are available for cleanup. It is not usually advisable to force GC collections except for debugging purposes. See Maoni's blog for great information on how the GC works and how to use it effectively.
Using Windbg.exe
I chose to use Windbg because I have found it to be the best debugger for memory leaks. It is a very powerful GUI debugger that is used very frequently inside of Microsoft for debugging everything from application bugs to OS bugs. You can download it from here: http://www.microsoft.com/whdc/devtools/debugging/default.mspx.
A couple of quick notes before I begin:
- updates to these debugging tools are available regularly, so make sure to update periodically.
- I will color commands that are to be run inside the debugger in green.
- I have access to all of the debugger symbols for the .Net framework, but you don't. However, most of the information you need should be accessible by using ".symfix" as noted below. The principles discussed here apply to debugging a memory leak in any managed application.
- public microsoft symbols can loaded into windbg using the .symfix followed by a .reload command (requires internet connection to a microsoft http server)
- I may trim sample output (colored in purple) from the debugger in order to focus on the important information.
Now, Windbg is a native windows debugger and doesn't really know much about the .Net framework, so you have to load an extension to the debugger that understands managed applications. SOS.dll is the debugger extension we are looking for. In the command window this will load the sos extension:
.loadby sos mscorwks
This command tells the debugger to load the sos dll by looking in the same directory as the mscorwks dll, which is a core dll that is always loaded in a managed application. Note that Windbg breaks into the debugger immediately upon loading the application, so mscorwks will not be loaded at this point. You will have to run the application long enough for the .NET framework code to be loaded. You can actually tell the debugger to break when mscorwks is loaded by running this in the command window:
sxe ld:mscorwks
There are two ways that you can run commands from the sos this extension dll:
!<command> or !sos.<command>
Here are examples for the help function found in sos (it prints usage information):
!help
!sos.help
The second one is useful when you have multiple extensions that share common function names.
When debugging this leak, I actually let the app run until it blocked on the call to Console.ReadLine() so that GC should have cleaned up. The first thing I do when dealing with a managed memory leak is to look at all the objects that have been created by the runtime and see if I see any that jump out at me. Lets look at the heap and see what it contains.
!dumpheap -stat
total 20600 objects
Statistics:
MT Count TotalSize Class Name
7a76eb14 1 12 System.Net.DnsPermission
7a755834 1 12 System.Diagnostics.PerformanceCounterCategoryType
7a753394 1 12 System.Diagnostics.TraceOptions
7a71a710 1 12 System.Net.TimeoutValidator
7912b908 1 12 System.Collections.Generic.GenericEqualityComparer`1[[System.String, mscorlib]]
79112c90 1 12 System.Collections.Comparer
7910db30 1 12 System.Threading.SynchronizationContext
7910a718 1 12 System.DefaultBinder
79107f40 1 12 System.RuntimeTypeHandle
79107ac4 1 12 System.Reflection.__Filters
79102f48 1 12 System.__Filters
79102ef8 1 12 System.Reflection.Missing
79101ca8 1 12 System.RuntimeType+TypeCacheQueue
79100800 1 12 System.Text.DecoderExceptionFallback
...(output trimmed)
0105a514 77 4004 System.Configuration.FactoryRecord
791242ec 47 8136 System.Collections.Hashtable+bucket[]
79160a3c 1000 16000 System.Threading.RegisteredWaitHandle
79124228 58 19216 System.Object[]
7a779154 1000 20000 System.Net.SafeLocalFree
7a778ec0 1000 20000 System.Net.SafeCloseHandle
7a761bb0 1000 20000 System.ComponentModel.AsyncOperation
79160b84 1000 20000 System.Threading._ThreadPoolWaitOrTimerCallback
7910cf3c 1001 20020 Microsoft.Win32.SafeHandles.SafeWaitHandle
791609c8 1000 24000 System.Threading.RegisteredWaitHandleSafe
79115d0c 1000 24000 System.Threading.ManualResetEvent
7a7811f4 1000 32000 System.Net.NetworkInformation.PingCompletedEventHandler
79160aa8 1000 32000 System.Threading.WaitOrTimerCallback
7915ff38 1000 32000 System.Threading.SendOrPostCallback
7910d2f4 1001 36036 System.Threading.ExecutionContext
79124418 1004 44820 System.Byte[]
7a7812e0 1000 88000 System.Net.NetworkInformation.Ping
00150c90 421 108132 Free
790fa3e0 5262 304552 System.String
Total 20600 objects
Looking at the columns in the output, MT stands for Method Table and is basically a pointer to the table that describes that type of object. Count is the number of objects that exist in the heap of the given type. TotalSize is the amount of memory that is being consumed by any one type of object. The last column is obviously the fully typed name of the object.
The first thing that jumps out at me in this case is the fact that we still have a large number of ping objects in the heap even though we aren't holding onto any references in our code. We also called dispose on these objects, so this is a huge red light telling me that something went wrong. Also, the fact that we sent exacly 1000 ping requests and there are exactly 1000 ping objects still hanging around calls my attention. Lets take a look at these ping objects and see why they aren't getting cleaned up (Below I show two ways of doing this, you really only need to do one of them).
!dumpheap -type Ping
This does a substring match on any object with Ping in it. In this case it will list both System.Net.NetworkInformation.Ping objects and System.Net.NetworkInformation.PingCompletedEventHandler objects. I really wanted just the Ping objects, so we could have run this command (using the MT for the ping object - obtained above)
!dumpheap -mt 7a7812e0
Address MT Size
01271ff0 7a7812e0 88
012720f8 7a7812e0 88
012721d0 7a7812e0 88
012722a8 7a7812e0 88
01272380 7a7812e0 88
01272458 7a7812e0 88
01272530 7a7812e0 88
01272608 7a7812e0 88
012726e0 7a7812e0 88
012727b8 7a7812e0 88
01272890 7a7812e0 88
01272968 7a7812e0 88
...(output trimmed)
012fada0 7a7812e0 88
012faeec 7a7812e0 88
012fb038 7a7812e0 88
012fb184 7a7812e0 88
012fb2d0 7a7812e0 88
012fb41c 7a7812e0 88
012fb568 7a7812e0 88
012fb6b4 7a7812e0 88
012fb800 7a7812e0 88
012fb94c 7a7812e0 88
012fba98 7a7812e0 88
012fbbe4 7a7812e0 88
012fbd30 7a7812e0 88
012fbe7c 7a7812e0 88
total 1000 objects
Statistics:
MT Count TotalSize Class Name
7a7812e0 1000 88000 System.Net.NetworkInformation.Ping
Total 1000 objects
Notice that the MT for all of the above objects matches that for the ping object. The left hand column is the memory address where the object resides. We can use this address to examine the actual ping object. To take a look at the first object in the above list we would use this command:
!dumpobject 01271ff0
I am not going to go into details at this point because this won't lead us where the problem lies. I thought I would list the dumpobject command just to point it out to you. The problem in this case is that objects that we think should be cleaned up are not cleaned up as expected. This means that there must be a reference to the object somewhere. The following command will help you to see what objects hold references to an object in question. We will use the same object address we used in the dumpobject example:
!gcroot 01271ff0
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)
This tool scans the threads to find any object that holds a reference to the given memory address. We can see that some threadpool object is holding onto us. If we go back to the objects on the heap we can see that there are these _ThreadPoolWaitOrTimerCallback objects on the heap also. Now, I don't know a ton about the thread pool, but I would not expect those callback objects to still be on the heap either because they are most likely specific to a single usage of a threadpool thread. We could dump all of the _ThreadPoolWaitOrTimerCallback objects and check their roots (the objects holding onto them), but why dump the more objects from the heap? The output from !gcroot gave us a pointer to an instance that we can quickly examine.
!gcroot 0128d2c0
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 2f8
Scan Thread 2 OSTHread fcc
Scan Thread 3 OSTHread de8
Scan Thread 4 OSTHread eb8
Scan Thread 8 OSTHread c8c
Scan Thread 12 OSTHread 464
Scan Thread 13 OSTHread 1e0
DOMAIN(0014BF60):HANDLE(Strong):8f15b8:Root:0128d2c0(System.Threading._ThreadPoolWaitOrTimerCallback)->
01271ff0(System.Net.NetworkInformation.Ping)
We would have hoped that the Dispose() function would have cleaned these objects up, but it didn't. So, what about the GC? Why didn't GC take care of them when dispose didn't? Notice how this gave us the same exact output from the gcroot of the ping class. This must mean that we have a circular reference (ping has a reference to this thread pool object and visa versa) and that is why the GC didn't clean it up.
Workaround (and reason why calling Dispose didn't clean this up)
The simple workaround to this problem is to cast the object to the IDisposable interface and then call Dispose on that casted object.
((IDisposable)pingSender).Dispose();
Why does this solve the problem? Take a look at the declaration of the class:
public class Ping:Component,IDisposable
Notice that it inherits from Component and it also implementes the IDisposable interface. Now lets look at the function declaration for Dispose on the Ping class:
void IDisposable.Dispose ()
In the implementation, the Dispose function was defined as part of the IDisposable interface but the fact that the base class Component also implements a Dispose function was missed. The Ping class doesn't override the Component.Dispose method and so if you call Dispose directly on the Ping object without casting it to an IDisposable Interface, the runtime calls the base class implementation of Dispose instead of the one that was intended (the one for IDisposable).
Note: There is another great blog on debugging managed memory leaks here: http://blogs.msdn.com/mvstanton/archive/2005/10/11/479861.aspx
>> In the implementation, the Dispose function was defined as part of the IDisposable interface but the fact that the base class Component also implements a Dispose function was missed.
I love VB.NET. None of these issues. 😉
Aah…The nuances of explicit and implicit implementations of a interface.
Excellent analysis!
thanks so much for such a wonderful article!
Try Scitech Memory profiler – it is awesome
Windbg is a very useful and powerful tool, especially for detecting managed memory leak.Here is the microsoft…
Windbg isn’t a powerfull tool for detecting memory leaks. I have experienced Scitech Memory profiler and it’s such a great tool. This profiler is able to find the path to the root object keeping your object alive. This is what you need when you want to repair memory leaks, you don’t only want a profiler which displays the "references to" and the "references from".
I am trying to make a class that avoids the Ping memory leak, but it isn’t working correctly. Does anybody know why the following does not work?
public class Ping : System.Net.NetworkInformation.Ping
{
protected override void Dispose(bool disposing)
{
((IDisposable)this).Dispose();
}
}
If I actually call the Dispose function, it cleans it up properly now, but it still will not automatically dispose when the object goes out of scope as it should.
Is there a way to extend the Ping class in such a way so as to not have to worry about the leak?
Having an object go out of scope is not the same as disposing of the object. The Dispose method is meant to be called explicitly because it gives you deterministic releasing of resources. If you wait for an object to go out of scope, then you are letting the GC take care of it whenever it gets a chance.
As a side note, your overridden Dispose function should call
base.Dispose(disposing);
before exiting.
Jon this post is great. I have a serious memory leak in my current project which I’ve been completely unable to track down. Using your tips I’ve discovered System.Threading._ThreadPoolWaitOrTimerCallback hanging around from ThreadPool.RegisterWaitForSingleObject() calls. Unfortunately the RegisteredWaitHandle returned by RegisterWaitForSingleObject doesn’t have a dispose. I’ve called RegisteredWaitHandle.Unregister() and get a return of true, but the _ThreadPoolWaitOrTimerCallback are still there. Is this a bug in the ThreadPool class?
I am not a CLR guy and I haven’t really used that API much, but this is what I would do if I were in your shoes…
I would try to find answers to the following questions:
1) Are you getting all of the WaitOrTimerCallbacks invoked?
2) Are most of them timing out or getting called before the timeout?
3) What timeout value are you using? Is it too big? I ask because if you have a large number of long running tasks that never set the wait handle you are waiting on and you have a large timeout, then you could be creating work items for the thread pool faster than you are consuming them…
4) What is the state of the thread pool? Is it keeping up with the workload? Check out the output from !sos.threadpool.
Hope that points you in the right direction…
Again, thanks a lot for this blog, it’s opened up a whole new debugging method for myself. And thanks for the suggestions. I’m sure I’ll track down the problem.
thanks a lot man…… Amazing articles. I was wasting to solve my memory leak about 3 hours. Finally i identified that ping is the problem but found no solve. But your fantastic article just saved my day…:)
thanks
Please could you provide VB.NET code for the workaround.
How do you "cast the object to the IDisposable interface and then call Dispose on that casted object" in VB.
Thanks.
I believe it is done something like this…
Dim iDisp As IDisposable
iDisp = CType(myPingObject, IDisposable)
iDisp.Dispose()
Just wanted to say thanks for a revealing article, I’ve never delved into debugging memory leaks/similar – This has piqued my interest so I intend to investigate further.
Thanks for taking the time to blog 🙂
Thanks, This was a great read and fixed my issue with Ping!
Writing programs using .Net is very productive. One reason is because much of memory management is “managed”
sos rocks…great for reflection tricks as well… nice article
Excellent analysis!
Solve my problem! Congratulations!
OMG!!! Turns out my top thing was
000007fed9964e98 4056295 162251800 System.Windows.Documents.TextTreeInsertUndoUnit
We used text boxes to display everything, so that a user can copy and paste…ooooops!!!! Doesn't help when your text box text displays current time!!!
And here we thought our memory issues were some super complicated thing.