I get several emails every day (through the blog) with urgent requests to help troubleshoot memory issues or memory related performance issues. Some just say “we have a memory issue, how do I troubleshoot it?” and others are far far more detailed with links to dumps and everything:)
I love getting emails, but since I have a day job too (troubleshooting such issues:)) I unfortunately have to ignore most such requests through the blog:( to have time to help the customers that contact me through the normal support channels, and therefore I would recommend that if you do have a pressing issue, that you create a support case on http://support.microsoft.com/ to make sure that you get a dedicated support engineer and that your issue gets the attention it deserves.
Having said all that, I wanted to point out one email I got today, both because I could see that the person writing it had put some effort into try troubleshooting the issue before contacting me, and because I think and hope that my answer might be able to give some starting points to other people in his situation.
We have a week to go for production and are stuck in one performance issue. Please guide me in resolving this.
We are running an ASP .Net (2.0) App on Windows Server 2003 (SP2), dual-processor, 4GB RAM. During stess-testing our app with 60 users, we see a drop in throughput from 17th minute to 22nd minute.
It rises and drops again around 40th minute.
I took 4 memory dumps for 8 mins, ran the perfmon counters. Here are the details:
Dump1 Dump2 Dump3 Dump4
Size 1.16G 1.12G 1.04G 1.13G
GC 802MB 750MB 648MB 738MB
Loader 18MB 18MB 18MB 18MB Image 236MB 236MB 236MB 236MB
MEM_RESV 220MB 236MB 347MB 255MB
MEM_COMM 1.19GB 1.06GB 1021MB 1.08GB
Perfmon Counters (8 mins)
#Bytes in all heaps: 798MB,
%Time in GC: 49%(avg),
#Induced GC: 14 (avg).
So I can deduce the following:
1. We are stuck in GC for the 8 minutes.
2. Assembly leak is not the problem.
3. Am not sure if Finalizer thread is being blocked.
The kb reads as below:
ChildEBP RetAddr Args to Child
0e1afc74 7c827d0b 77e61d1e 00000334 00000000 ntdll!KiFastSystemCallRet
0e1afc78 77e61d1e 00000334 00000000 0e1afcbc ntdll!NtWaitForSingleObject+0xc
0e1afce8 79e8c5f9 00000334 000007d0 00000000 kernel32!WaitForSingleObjectEx+0xac
0e1afd2c 79e8c52f 00000334 000007d0 00000000 mscorwks!PEImage::LoadImage+0x1af
0e1afd7c 79e8c54e 000007d0 00000000 00000000 mscorwks!CLREvent::WaitEx+0x117
0e1afd90 79fc2397 000007d0 00000000 00000000 mscorwks!CLREvent::Wait+0x17
0e1afdac 79fc24b4 000f2b28 0e1afeb0 000f3328 mscorwks!SVR::WaitForFinalizerEvent+0x4a
0e1afdc0 79e9845f 0e1afeb0 00000000 00000000 mscorwks!SVR::GCHeap::FinalizerThreadWorker+0x49
0e1afdd4 79e983fb 0e1afeb0 0e1afe5c 79f7759b mscorwks!Thread::DoADCallBack+0x32a
0e1afe68 79e98321 0e1afeb0 9b91703f 00000000 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0e1afea4 79eef6cc 0e1afeb0 00000000 0010f770 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a
0e1afecc 79eef6dd 79fc2469 00000008 0e1aff14 mscorwks!ManagedThreadBase_NoADTransition+0x32
0e1afedc 79ef12ab 79fc2469 9b91718f 00000000 mscorwks!ManagedThreadBase::FinalizerBase+0xd
0e1aff14 79f92015 00000000 00000000 89244470 mscorwks!SVR::GCHeap::FinalizerThreadStart+0xbb
0e1affb8 77e64829 000f3328 00000000 00000000 mscorwks!Thread::intermediateThreadProc+0x49
0e1affec 00000000 79f91fcf 000f3328 00000000 kernel32!BaseThreadStart+0x34
4. !finalizequeue results
generation 0 has 158 finalizable objects (1d1d42c8->1d1d4540)
generation 1 has 269 finalizable objects (1d1d3e94->1d1d42c8)
generation 2 has 1961 finalizable objects (1d1d1ff0->1d1d3e94)
Ready for finalization 0 objects (1d1d4540->1d1d4540)
generation 0 has 830 finalizable objects (1d23d7c0->1d23e4b8)
generation 1 has 284 finalizable objects (1d23d350->1d23d7c0)
generation 2 has 1238 finalizable objects (1d23bff8->1d23d350)
Ready for finalization 0 objects (1d23e4b8->1d23e4b8)
How do I proceed from here to isolate the issue?
Looking at the above, these were the comments and suggestions for further troubleshooting that I sent him.
I don’t agree that you are stuck in a GC for 8 minutes, you can check with # Gen 0/Gen 1/Gen 2 collections and will probably see them increasing. The reason I say that you are not stuck is because the % time in GC counter will only update on completion of a GC so if you were stuck it would not update.
I do think however that your % Time in GC looks really high and I think there might be two reasons for this
- You are reaching the point where mem usage gets really close the OOM range (typically around 1 GB private bytes) so the process is probably working really hard to get rid of everything it can. # Induced GC here is not 0 which means that someone called GC.Collect, and unless it is your app that calls GC.Collect it is probably ASP.NET doing this to clean up at an OOM.
Each induced GC will be a full GC which can get pretty costly performance wise.
Recommendation: Look at general memory usage, i.e. look at !dumpheap –stat etc. and see what is using all this .net memory and try to find places where you can reduce the memory usage, or alternatively, if you need to use all this memory, then you would need to move to a 64-bit environment. (You might want to look at some of my posts on troubleshooting .net memory issues to narrow this down). A potential also, if you are running multiple sites here, would be to separate them into different app pools.
- Your LOH looks pretty large (307 MB), if you are continuously allocating on the LOH you will also trigger a lot of full GC’s.
Recommendation: have a look at !dumpheap –min 85000 and see if there are any unexpected large objects, or if there are any that you can reduce in size or in number of objects. (See http://blogs.msdn.com/b/tess/archive/2006/06/22/asp-net-case-study-high-cpu-in-gc-large-objects-and-high-allocation-rates.aspx for more details)
Regarding the finalizer, the finalizer is not stuck, it is sitting in mscorwks!SVR::WaitForFinalizerEvent+0x4a which means that it is just idling, waiting for the GC to tell it to finalize.
I have a post with common asp.net threads and how they look in the “normal” idling state here http://blogs.msdn.com/b/tess/archive/2007/04/02/things-to-ignore-when-debugging-an-asp-net-hang-update-for-net-2-0.aspx that might be useful for further debugging.
You can further see that finalization is not a problem because !finalizequeue reports no objects to ready for finalization in either heap “Ready for finalization 0 objects”. Here you can find an example of what it would look like if the finalizer was actually stuck.
Hope that helps,