.NET Debugging Demos Lab 3: Memory


 

TGIF, almost time for the weekend… but before you leave, here is lab 3.

Todays debugging puzzle will be a Memory investigation.  This time we are going to stress the BuggyBits site to create a high memory usage situation and figure out what caused it.   This lab is a bit lengthier because I wanted to show various aspects of a memory investigation.  Once you know the relationship between the numbers in the dump and perfmon etc. you can often skip some parts, but if it is possible to gather I would still recommend gathering performance logs for completeness.

Previous demos and setup instructions

If you are new to the debugging labs, here you can find information on how to set up the labs as well as links to the previous labs in the series.

Information and setup instructions
Lab 1: Hang
Lab 1: Hang – review 
Lab 2: Crash
Lab 2: Crash – review

Set up a performance monitor log

1. Browse to http://localhost/BuggyBits/Links.aspx to start the w3wp.exe process

2. Open up performance monitor (perfmon.exe from start/run)

2. Right-click “performance counters and alerts/counter logs” and choose “new log settings”, name it Lab3-Mem

3. Click on “Add Objects…” 

4. Add the objects “.NET CLR Memory” and “Process” and click ok.

5. Set the sampling interval to 1 second (because our repro is really short, normally 15 seconds is ok)

6. Change the “Run As:” from “<Default>” to your domainusername and click on the “Set Password…” button to set the password.  This is done because you have to run the counter log as an administrator or as someone with debug priviledges to the w3wp.exe process, otherwise you will not get any .net counters.

7. Click ok, this will start monitoring the process.

Reproduce the problem

1. Start a command prompt and move to c:program filesIIS ResourcesTinyget or wherever you installed tinyget

2. Run “tinyget -srv:localhost -uri:/BuggyBits/Links.aspx -loop:4000”

Note: You can lower the amount of loops slightly if you find that your process crashes because of OutOfMemory exceptions, I set it this high so that we would get some kind of output in perfmon.

3. Stop the performance monitor counterlog (but let it run for at least about 20-30 seconds after the tinyget run)

Examine the performance logs

1. In performance monitor, move to the “System Monitor” node

2. Press Ctrl+L or click on the “View Log Data” button (looks like a database icon) and open the log file, make sure to expand the time range as much as possible 

3. Remove any counters that it already added at the bottom of the screen

4. Add the following counters (for the w3wp.exe process) using the + button or Ctrl+I

  • .NET CLR Memory/# Bytes in all Heaps
  • .NET CLR Memory/Large Object Heap Size
  • .NET CLR Memory/Gen 2 heap size
  • .NET CLR Memory/Gen 1 heap size
  • .NET CLR Memory/Gen 0 heap size
  • Process/Private Bytes
  • Process/Virtual Bytes

The counters should now show up at the bottom of the screen…

5. Right-click any of them and choose properties and on the Data tab set the scale for all of them to 0,0000001 to make sure you can view the whole graph on the screen

Note: If you click the lightbulb Ctrl+H the selected counter will be highlighted so that it is easier to distinguish which line corresponds to which counter

Q: What are the latest values for these counters?

Q: Compare the curves for Virtual Bytes, Private Bytes and #Bytes in all Heaps, do they follow eachother or do they diverge?

Q: Can you tell from this if it appears to be a native “leak”, a .net memory “leak” or a leak on the loader heap (dynamic assemblies)?

Q: At the end the log (after you finish your tinyget run) does memory stay flat or does it go down?

6. Add the counters for .NET CLR Memory/# Gen 0 Collections, .NET CLR Memory/# Gen 1 Collections, .NET CLR Memory/# Gen 2 Collections

Q: Do any collections occurr after the tinyget run is done? if not, why not?

7. Open task manager and under “View/Select Columns…” add the column “Virtual Memory Size”. Compare the values for Mem Usage and VM Size to the values for Private Bytes and Virtual Bytes in perfmon. (Note: since they are given in K in task manager you need to multiply these values by 1024).

Q: What does Mem Usage show?

Q: What does VM Size show?

Get a memory dump

1. Open a command prompt and move to the debuggers directory and type in “adplus -hang -pn w3wp.exe -quiet” and hit enter

Open the dump in windbg

1. Open the dump file in windbg (the .dmp file should be located in your debuggers directory under a folder labeled hang_mode and todays date and time.

2. Set up the symbol path and load sos (see the setup instructions for more info)

Q: How big is the dump? Look at the size in the file explorer.

Q: How does this compare to Private Bytes, Virtual Bytes and # Bytes in all Heaps.

Examine the memory to get an idea of where the memory is going

1. Run !address -summary (this will give you an overview of the memory usage) and familiarize yourself with the output. Hint: check the windbg help files for !address

Q: Which values corresponds best to the following

    • Private Bytes
    • Virtual Bytes

Q: Where is most of the memory going (which RegionType)?

Q: What does Busy, Pct(Busy) and Pct(Tots) mean?

Q: What does MEM_IMAGE mean?

Q: Under what region does .net memory fit in and why?

From perfmon we could see that #Bytes in all Heaps follows Private bytes very closely meaning that the increase in memory is mostly an increase in .NET memory.  In other words we are looking for an answer to why our .net GC heaps keep growing.

Examine the .NET GC Heaps

1. Run !eeheap -gc to examine the size of the .NET GC heaps

Q: How many heaps do you have? Why?

Q: How much memory is stored on the .net GC heaps?  Compare to #Bytes in all Heaps

Q: How much memory is on the large object heap? Hint: add up the sizes for all the Large Object heap segments. Compare to Large Object Heap Size in perfmon. 

2. Run !dumpheap -stat to dump out all .net objects in a statistical fashion (Note: you can run !help DumpHeap to get help for the !dumpheap command or check out this post, this post or this post)

Q: Looking at the 5-10 object types that use most memory, what do you think is leaking?

Q: What “size” does the size column show? I.e. what is included in this “size”?  Hint: see the post (!dumpheap -stat explained) for more info.

 

Normally I wouldn’t recommend looking at strings immediately since they will be at the bottom of the !dumpheap -stat output in pretty much every single dump you take, because of the following 

    • the “size” for strings is the actual size of the contents of the string.  If you compare this to the size of a dataset for example, the size will only contain pointers to the rows and columns arrays but not include the memory for the rows and columns, so the size for a datasetobject will always be very small (and constant) while the size of a string varies in size.
    • strings are leaf nodes in most objects, i.e. datasets contain strings, aspx pages contain strings, session vars contain strings, labels contain strings etc. etc. so you are bound to have mostly strings in your app.

However in this case the strings are so many and occupy so much memory that if we don’t find anything else that sticks out, we might just want to follow the string lead…

3. Dump out stats for various size strings to find out if there is a pattern (this is a bit of trial and error so you have to try a few different sizes to figure out where the bulk of the strings are.

Get the method table (MT) for System.String (first column in !dumpheap -stat)

!dumpheap -mt <string MT> -min 85000 -stat

!dumpheap -mt <string MT> -min 10000 -stat

!dumpheap -mt <string MT> -min 20000 -stat

!dumpheap -mt <string MT> -min 30000 -stat

!dumpheap -mt <string MT> -min 25000 -stat

 

Q: In what range (between what sizes) do most of the strings exist?

 

4. Dump out the strings in that range

!dumpheap -mt <string MT> -min 20000 -max 25000 

In this case most of them will even be the exact same size which is a clue to what is going on…

5. Dump out a few of them to find out what they contain

!do <address of string – first column in the !dumpheap -mt output>

Q: What do these strings contain?

6. Pick a few and find out where they are rooted (i.e. why they can’t be collected)  Note: You may want to try a couple different ones.

!gcroot <address of string>

Q: Where are they rooted?  Why?

 

Examine the finalizer queue and the finalizer thread

1. Look at the finalizer queue

!finalizequeue

Q: What objects are listed in the finalizequeue output? Hint: run !help finalizequeue

Q: How many objects are “ready for finalization”?  What does this mean?

2. Find the finalizerthread to determine what it is doing. Run !threads and look for the thread listed with (Finalizer)

3. Move to the finalizer thread and examine the managed and native callstack

~5s   (5 should be substituted with the id of the finalizer thread)
kb 2000
!clrstack

Q: What object is it finalizing?

Q: What is it doing?  Why is this causing high memory usage?

 

Examine the code for verification

1. Open Link.cs to find the destructor/finalizer for the Link class to see the problematic code

 

Related posts

!dumpheap -stat explained 

ASP.NET Memory Investigation

.NET Memory usage – A restaurant analogy

.NET Memory leak: Unblock my finalizer

.NET Memory: My object is not rooted, why wasn’t it garbage collected?

I have a memory leak!!! What do i do? (defining the “where”)

Who is this OutOfMemory guy and why does he make my process crash when I have plenty of memory left?

Have fun debugging,

Tess

Comments (30)

  1. MSBuild The custom MSBuild task cookbook [Via: bart ] ASP.NET Simplifying ASP.NET ListView Control…

  2. Link Listing – February 15, 2008

  3. Aaron says:

    Say you want to find the contents of a string, so you !do. What if you get an error saying the string it too large to display ? How can you determine the contents of the string ?

    Cheers

    Aaron

  4. Tess says:

    if sos deems it too large to display you can do

    dc <address>+0xc  

    (which will dump out the characters at an offset 0xc from the string address, this is where the contents of the string start)

    or you can try da <address>+0xc or du <address>+0xc

    da = dump ascii, du = dump unicode

    You can also do

    dc <address>+0xc <address>+<size of string>

    if you want to dump out the whole string, you will get the size from the !do output

  5. baal says:

    on my server, there are 10 w3wp.exe processes, when i add counters,  "select instance from list box",in the list box there are  w3wp.exe#1, w3wp.exe#2…., i can’t figure out which is i want.

  6. Tess says:

    That’s quite a few w3wp.exe processes:)  you can run

    tlist -c from the debuggers directory to get a list of all processes and their command lines to try to identify the pid and then check which w3wp corresponds to which pid under the process counter…

    but you might as well just add the counters for all w3wp.exe processes and then based on the counters (like memory usage or cpu activity) you can probably figure out which one it is…

  7. baal says:

    add the counters for all w3wp.exe processes

    —-there are too many info got, i can’t analyze them,it is too dificult for me. your method can’t work.

    here i found a article: http://support.microsoft.com/kb/922775/en-us ,in last ,it write "Multiple CLR performance counters appear that have names that resemble "W3wp#1" " i follow this way, but i can’t understand which represent the PID of w3wp.exe.

    identify the pid of w3wp.exe between the .net counters and native counters are different.  for native counters , do some modification on key:  HKEY_LOCAL_MACHINESystemCurrentControlSetServicesPerfProcPerformance,

    this is what i found on msdn

  8. Tess says:

    then try tlist and if that doesn’t help then then next thing i would suggest is to either shut down the processes and just start one or do something like hit a page repetedly while watching task manager to see which one of them the CPU usage/memory usage moves for to find the pid.

  9. Hope you have had a chance to go through the high CPU lab… Some people already commented with the solution

  10. Lab 1: HangLab 2: CrashLab 3: MemoryLab 4: High CPU Hang

  11. baal says:

    " tlist -c  " is  useless for me!   this command output  is the same as the tab of processes in  Debug Diagnostic tool. i installed  Debug Diagnostic tool. I guess my server has native leak. on my server, there are a few w3wp.exe running. I can’t stop the process and hit a page repetadly. the serve is production server on internet, many people will acess the web page.

  12. baal says:

    i got private bytes, virtual bytes  for  w3wp_6824  and and #Bytes in all heaps for w3wp#5, w3wp#5 is the no. 6824 process.(i do as  http://support.microsoft.com/kb/922775/en-us ‘s say.)

    but there are curves for private bytes, virtual bytes. but there is no curve for #Bytes in all heaps of w3wp#5. is it correct? ( the max, min, average value of  #Bytes in all heaps of w3wp#5 are the same. no value for last ). the #Bytes in all heaps for _global_ has curve( the max, min, average,last value of  #Bytes in all heaps of _global_ are exist and different)

  13. Tess says:

    The _global_ counters are not reliable. If you don’t get values for #Bytes in all heaps for certain w3wp.exe processes it either means that you are not running the log as an admin (or someone with debug priviledges for the process) or the process is not serving .net content (most likely cause is probably a perms issue)

  14. baal says:

    yes,it is the priviledges problem, I set Change the "Run As:" from "<Default>" to administrator ,but no data collect, I set to "<Default>"  is ok.

  15. Michal says:

    Hi,

    Is there away to get the value of a .Net variable from a memory dump?

    I known it’s name, it is of type string and when running

    !dumpheap -stat

    I see that there are ~6000 strings on the heap…

    Thank’s.

  16. Tess says:

    Hi Michal,

    You can’t really search by variable name or even retreive the variable name from a memory dump in most cases, since once you compile the variable names are usually no longer in the process, they are only relevant for you as a coder.  I.e. to the process what is relevant is addresses and instructions.

    If however, you know that this is a member variable of another object, or you know that it is on a stack somewhere you might be able to get to it by dumping the stackobjects, dumping out all the statics, dumping out session variables or dumping out the parent class

  17. Justin says:

    .NET调试实例

    这是一个系列的调式实例,目的是为了帮助你在调式.NET应用程序中最常见的挂起(Hang)、性能(performance)、内存(memory)和系统崩溃(crash)方面获得一些…

  18. J.W. says:

    Hi, Tess,

    Get a memory dump

    1. Open a command prompt and move to the debuggers directory and type in "adplus -hang -pn w3wp.exe -quiet" and hit enter

    Do we run it after tinyget returns or before tinyget returns?

    Thanks,

  19. Tess says:

    in this one you can wait until tinyget returns

  20. JanIvan Qian says:

    Hi Tess, I’m using Vista Enterprise. But I couldn’t follow you after running perfmon.exe in section "Set up a performance monitor log". I guess there may be some difference in Vista.

    I have no idea how to move on cause I’m not familiar with performance counters. Could you please show me some light? Thanks 🙂

  21. Tess says:

    In Vista you would create a new data collector set instead.  In perfmon this is located under data collector sets/user defined,  right-click to create a new one and follow the wizard

    HTH

    Tess

  22. I have put together a quick and dirty debug diag script for troubleshooting .net memory leaks. (attached

  23. Chris says:

    Great site and great info.  My company brought in the win32 native debugging and the advanced .net debugging courses and I took them both.  

    I’m looking at a high mem utilization problem, and I’m seeing 181250 finalizable objects in gen 2.  I’ve been through a pretty fair number of dumps and have never seen this many objects in gen2, or any generation for that matter.  We just started using WF for our page navigation and the top 10 (by count and mem usage) of finalizable objects are System.Workflow objects

    38079      1675476 System.Workflow.Activities.IfElseBranchActivity

    29112      1048032 System.Workflow.Activities.Rules.RuleConditionReference

    Have you run across anything like this in Workflow scenarios?

  24. Sundar Paranthaman says:

    I am looking into the high number of SqlCeCommands in our application, when I dump the contents of SqlCeCommand in our applicaion, I see outputs like this. What does a HANDLE(WekSh) signify. It looks like that this could not cause memory leak, since the handle is weak it could be garbage collected based on the need.

    DOMAIN(00000000000E5970):HANDLE(WeakSh):a425a8:Root:00000000032aef20(System.Data.SqlServerCe.SqlCeCommand)

  25. Innggi says:

    This is good for detecting server side leaks in IIS however what tools can you use for diagnosing client side Javascript memory behaviour in IE

  26. Mike says:

    Hi Tess,

    Is there any chance of getting Server Unavailable issue  because of storing  huge data tables in session variable? Can you brief me  If there is a chance to get?

    Thanks

    Mike

  27. Tess says:

    Mike,  that depends on how big the datatables are and how much memory you use in general in the process, and if you are in a 64bit or 32 bit process….  basically the answer is yes, if you use enough memory to get an OOM you can get a crash resulting in a Server Unavailable message

  28. Mike says:

    Thanks for replay..

    In our application we are storing more than 234777 rows in a  Data set  and this Data set object declared as a private variable like "private dsTemp as New Dataset". and bind the rows to same Data set in every events . Is this create new object in heap for every event( button click events) or request? or Overwrite same object in heap?

    And Is there any issue if  object declared as a private(private dsTemp as New Dataset) ?

    Thanks

    Mike

  29. Dzmitry says:

    {quotation}

    Examine the .NET GC Heaps

    1. Run !eeheap -gc to examine the size of the .NET GC heaps

    Q: How many heaps do you have? Why?

    {quotation}

    Could you please help with the answer, why there are 2 heaps?