TGIF, almost time for the weekend... but before you leave, here is lab 3.
Todays debugging puzzle will be a Memory investigation. This time we are going to stress the BuggyBits site to create a high memory usage situation and figure out what caused it. This lab is a bit lengthier because I wanted to show various aspects of a memory investigation. Once you know the relationship between the numbers in the dump and perfmon etc. you can often skip some parts, but if it is possible to gather I would still recommend gathering performance logs for completeness.
Previous demos and setup instructions
If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.
Set up a performance monitor log
1. Browse to http://localhost/BuggyBits/Links.aspx to start the w3wp.exe process
2. Open up performance monitor (perfmon.exe from start/run)
2. Right-click "performance counters and alerts/counter logs" and choose "new log settings", name it Lab3-Mem
3. Click on "Add Objects..."
4. Add the objects ".NET CLR Memory" and "Process" and click ok.
5. Set the sampling interval to 1 second (because our repro is really short, normally 15 seconds is ok)
6. Change the "Run As:" from "<Default>" to your domainusername and click on the "Set Password..." button to set the password. This is done because you have to run the counter log as an administrator or as someone with debug priviledges to the w3wp.exe process, otherwise you will not get any .net counters.
7. Click ok, this will start monitoring the process.
Reproduce the problem
1. Start a command prompt and move to c:program filesIIS ResourcesTinyget or wherever you installed tinyget
2. Run "tinyget -srv:localhost -uri:/BuggyBits/Links.aspx -loop:4000"
Note: You can lower the amount of loops slightly if you find that your process crashes because of OutOfMemory exceptions, I set it this high so that we would get some kind of output in perfmon.
3. Stop the performance monitor counterlog (but let it run for at least about 20-30 seconds after the tinyget run)
Examine the performance logs
1. In performance monitor, move to the "System Monitor" node
2. Press Ctrl+L or click on the "View Log Data" button (looks like a database icon) and open the log file, make sure to expand the time range as much as possible
3. Remove any counters that it already added at the bottom of the screen
4. Add the following counters (for the w3wp.exe process) using the + button or Ctrl+I
- .NET CLR Memory/# Bytes in all Heaps
- .NET CLR Memory/Large Object Heap Size
- .NET CLR Memory/Gen 2 heap size
- .NET CLR Memory/Gen 1 heap size
- .NET CLR Memory/Gen 0 heap size
- Process/Private Bytes
- Process/Virtual Bytes
The counters should now show up at the bottom of the screen...
5. Right-click any of them and choose properties and on the Data tab set the scale for all of them to 0,0000001 to make sure you can view the whole graph on the screen
Note: If you click the lightbulb Ctrl+H the selected counter will be highlighted so that it is easier to distinguish which line corresponds to which counter
Q: What are the latest values for these counters?
Q: Compare the curves for Virtual Bytes, Private Bytes and #Bytes in all Heaps, do they follow eachother or do they diverge?
Q: Can you tell from this if it appears to be a native "leak", a .net memory "leak" or a leak on the loader heap (dynamic assemblies)?
Q: At the end the log (after you finish your tinyget run) does memory stay flat or does it go down?
6. Add the counters for .NET CLR Memory/# Gen 0 Collections, .NET CLR Memory/# Gen 1 Collections, .NET CLR Memory/# Gen 2 Collections
Q: Do any collections occurr after the tinyget run is done? if not, why not?
7. Open task manager and under "View/Select Columns..." add the column "Virtual Memory Size". Compare the values for Mem Usage and VM Size to the values for Private Bytes and Virtual Bytes in perfmon. (Note: since they are given in K in task manager you need to multiply these values by 1024).
Q: What does Mem Usage show?
Q: What does VM Size show?
Get a memory dump
1. Open a command prompt and move to the debuggers directory and type in "adplus -hang -pn w3wp.exe -quiet" and hit enter
Open the dump in windbg
1. Open the dump file in windbg (the .dmp file should be located in your debuggers directory under a folder labeled hang_mode and todays date and time.
2. Set up the symbol path and load sos (see the setup instructions for more info)
Q: How big is the dump? Look at the size in the file explorer.
Q: How does this compare to Private Bytes, Virtual Bytes and # Bytes in all Heaps.
Examine the memory to get an idea of where the memory is going
1. Run !address -summary (this will give you an overview of the memory usage) and familiarize yourself with the output. Hint: check the windbg help files for !address
Q: Which values corresponds best to the following
- Private Bytes
- Virtual Bytes
Q: Where is most of the memory going (which RegionType)?
Q: What does Busy, Pct(Busy) and Pct(Tots) mean?
Q: What does MEM_IMAGE mean?
Q: Under what region does .net memory fit in and why?
From perfmon we could see that #Bytes in all Heaps follows Private bytes very closely meaning that the increase in memory is mostly an increase in .NET memory. In other words we are looking for an answer to why our .net GC heaps keep growing.
Examine the .NET GC Heaps
1. Run !eeheap -gc to examine the size of the .NET GC heaps
Q: How many heaps do you have? Why?
Q: How much memory is stored on the .net GC heaps? Compare to #Bytes in all Heaps
Q: How much memory is on the large object heap? Hint: add up the sizes for all the Large Object heap segments. Compare to Large Object Heap Size in perfmon.
Q: Looking at the 5-10 object types that use most memory, what do you think is leaking?
Q: What "size" does the size column show? I.e. what is included in this "size"? Hint: see the post (!dumpheap -stat explained) for more info.
Normally I wouldn't recommend looking at strings immediately since they will be at the bottom of the !dumpheap -stat output in pretty much every single dump you take, because of the following
- the "size" for strings is the actual size of the contents of the string. If you compare this to the size of a dataset for example, the size will only contain pointers to the rows and columns arrays but not include the memory for the rows and columns, so the size for a datasetobject will always be very small (and constant) while the size of a string varies in size.
- strings are leaf nodes in most objects, i.e. datasets contain strings, aspx pages contain strings, session vars contain strings, labels contain strings etc. etc. so you are bound to have mostly strings in your app.
However in this case the strings are so many and occupy so much memory that if we don't find anything else that sticks out, we might just want to follow the string lead...
3. Dump out stats for various size strings to find out if there is a pattern (this is a bit of trial and error so you have to try a few different sizes to figure out where the bulk of the strings are.
Get the method table (MT) for System.String (first column in !dumpheap -stat)
!dumpheap -mt <string MT> -min 85000 -stat
!dumpheap -mt <string MT> -min 10000 -stat
!dumpheap -mt <string MT> -min 20000 -stat
!dumpheap -mt <string MT> -min 30000 -stat
!dumpheap -mt <string MT> -min 25000 -stat
Q: In what range (between what sizes) do most of the strings exist?
4. Dump out the strings in that range
!dumpheap -mt <string MT> -min 20000 -max 25000
In this case most of them will even be the exact same size which is a clue to what is going on...
5. Dump out a few of them to find out what they contain
!do <address of string - first column in the !dumpheap -mt output>
Q: What do these strings contain?
6. Pick a few and find out where they are rooted (i.e. why they can't be collected) Note: You may want to try a couple different ones.
!gcroot <address of string>
Q: Where are they rooted? Why?
Examine the finalizer queue and the finalizer thread
1. Look at the finalizer queue
Q: What objects are listed in the finalizequeue output? Hint: run !help finalizequeue
Q: How many objects are "ready for finalization"? What does this mean?
2. Find the finalizerthread to determine what it is doing. Run !threads and look for the thread listed with (Finalizer)
3. Move to the finalizer thread and examine the managed and native callstack
~5s (5 should be substituted with the id of the finalizer thread)
Q: What object is it finalizing?
Q: What is it doing? Why is this causing high memory usage?
Examine the code for verification
1. Open Link.cs to find the destructor/finalizer for the Link class to see the problematic code
Have fun debugging,