Questions on application domains, application pools and unhandled exceptions


I got an email with some questions around application domains, application pools and unhandled exceptions from a developer who was frequently seeing his website crash and who also had some related issues with session loss in his application.

I have written before about unhandled exceptions and session loss due to appdomain restarts but I thought his questions would serve as a nice refresher.

From what I have read, my understanding is that a website has an app pool associated with it. This app pool leads to the creation of a w3wp.exe process. Inside this app pool/w3wp.exe process, an application domain is created.

Tess:  This is correct.  In IIS you can create different application pools that have different health monitoring settings, run under different user contexts etc., and when you create a web site you choose which application pool it will run under.  Each application pool will then spawn its own w3wp.exe process (or multiple processes if you have web gardening turned on) when the first request comes in.

The process (w3wp.exe) contains multiple application domains, typically a shared domain, a default domain, a system domain and one application domain per web application (virtual directory marked as application). 

An application domain recycling is different from an application pool/process (w3wp.exe) recycling, right?

Tess: Yes.  An appdomain can recycle without recycling the process.  Simplified, an appdomain is a process within the process with its own statics (cache/session etc.), but all appdomains in the process share the same GC, threadpool, finalizer thread etc.

An appdomain recycle is triggered by a few things like web.config changes, directory name changes etc.  You can find most of the appdomain recycle reasons in this post.  When an appdomain recycles the process stays up, however when the process goes down the appdomains in the process will of course also go down.

Can unhandled exceptions cause the application domain to recycle ?

Tess:  It depends on what you mean by unhandled exceptions.  In ASP.NET there is the concept of unhandled exceptions that are caught by the global error handler or page error handler, i.e. the ones that give you the yellow exception output when you view a page.  They will be listed as unhandled exceptions in eventvwr, but in reality they are handled by the page error handler, and they will crash neither the appdomain nor the application/w3wp.exe process.

If, on the other hand, you have an exception that occurs on a non-request thread, such as a timer thread or the finalizer thread, and you don’t have a try/catch block around it, it really is an unhandled exception, and such unhandled exceptions will cause the process to go down and take all the appdomains with it.
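
To make the timer case concrete, here is a minimal sketch of a timer callback that keeps exceptions from escaping on a non-request thread. The class name `BackgroundWork`, the 30 second interval and the `DoWork` method are made up for illustration; in a real ASP.NET application you would start the timer from somewhere like Application_Start rather than from `Main`.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class BackgroundWork
{
    // Hypothetical timer; kept in a static field so it is not garbage collected.
    private static Timer _timer;

    public static void Start()
    {
        // Fire immediately, then every 30 seconds (interval chosen arbitrarily).
        _timer = new Timer(OnTick, null, TimeSpan.Zero, TimeSpan.FromSeconds(30));
    }

    // Runs on a threadpool thread, not on a request thread, so ASP.NET's
    // page/global error handling never sees exceptions thrown here.
    private static void OnTick(object state)
    {
        try
        {
            DoWork();
        }
        catch (Exception ex)
        {
            // Log and swallow: without this catch the exception would be
            // truly unhandled and (in 2.0 and later) take the process down.
            Trace.TraceError("Timer callback failed: {0}", ex);
        }
    }

    // Placeholder for the real work; here it just simulates a failure.
    private static void DoWork()
    {
        throw new InvalidOperationException("simulated failure");
    }

    // Console entry point only so the sketch can be run on its own.
    public static void Main()
    {
        Start();
        Thread.Sleep(1000);
        Console.WriteLine("Process is still running.");
    }
}
```

Catching `Exception` this broadly is only reasonable at the very top of a background callback like this one, and only if you log what you caught; otherwise you hide the problem instead of fixing it.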

What exactly happens inside the application domain when an unhandled exception occurs ?

Tess:  The process shuts down and its appdomains are unloaded, so anything inside them is gone, including session variables etc.
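
If the session loss is the part that hurts, one option (it also comes up in the comments below) is out-of-process session state, which lives outside w3wp.exe. As a rough sketch, assuming the ASP.NET State Service is running on the local machine on its default port, the web.config fragment would look something like this; the timeout value is just an example:

```xml
<configuration>
  <system.web>
    <!-- Store session data in the ASP.NET State Service instead of in the appdomain. -->
    <sessionState mode="StateServer"
                  stateConnectionString="tcpip=127.0.0.1:42424"
                  timeout="20" />
  </system.web>
</configuration>
```

Keep in mind that anything you put in session then has to be serializable, and that this only protects the session data; it does not fix whatever is crashing the process.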

Is there a setting in any of the config files that will prevent the application domain from recycling ?

Tess: There is a legacyUnhandledExceptionPolicy setting (see this post for more info) that will cause a 2.0 process to behave as 1.1, i.e. not shut down the process and instead just quit processing the current thread of execution. I would seriously advise against using it, though, other than as a temporary measure while you troubleshoot the unhandled exception, as it may cause your process to behave erratically, since you don’t really know what has been processed and what hasn’t.  For example, if an exception occurs during finalization you will not know whether you have released all the native handles you were supposed to.
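
For reference, the setting is an element under the runtime section of the configuration file. The fragment below is a sketch of what it looks like; for ASP.NET I am assuming it goes in the aspnet.config file in the framework directory rather than in web.config, so double-check the location for your version before relying on it:

```xml
<configuration>
  <runtime>
    <!-- Temporary troubleshooting aid only: revert to the 1.1 behavior for unhandled exceptions. -->
    <legacyUnhandledExceptionPolicy enabled="true" />
  </runtime>
</configuration>
```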

Laters,

Tess 

Comments (34)

  1. Madhur says:

    That was very helpful .. Thanks

  2. viscious says:

    Thanks for clearing up some of those things.  One of the things I struggled with for a long time when I started reading your blogs and the blogs of your peers was trying to create a dump on an "unhandled exception".  I didn’t understand that these exceptions were really handled by the  global application handler.

  3. My latest in a series of the weekly, or more often, summary of interesting links I come across related to Visual Studio. The Web Developer Tools Team announced the release of the Dynamic Data Wizard Preview 0806 for VS 2008 SP1 . US ISV Developer Evangelism

  4. Link Listing – August 21, 2008

  5. A timely reminder! Thanks Tess.

  6. Tarique says:

    Tess,

    Thanks for answering my questions. I wasn’t expecting you to write a whole blog post about it 🙂

    I was able to solve the problems following the steps in your other blog post (http://blogs.msdn.com/tess/archive/2006/08/02/asp-net-case-study-lost-session-variables-and-appdomain-recycles.aspx).

    I was wondering if I could pick your brains a bit more 🙂

    I have a follow-up on Question 3, i.e. "Can unhandled exceptions cause the application domain to recycle ? "

    You say that in ASP.NET an unhandled exception on a normal request thread will be caught by the global error handler or page-level handler and will not cause the appdomain or the process to restart. But let’s say, hypothetically, that the developer hasn’t written any global error handler (in global.asax) or page-level handler. In this case, will the appdomain or the process (or both) crash, or does ASP.NET have something built in that will always catch it?

    In Question 4, "Is there a setting in any of the config files that will prevent the application domain from recycling ?", you mention a solution that will prevent the process from crashing, with only the worker thread being lost. What about the application domain? Will it survive or will it crash?

    Also, an additional question: could you elaborate a bit on the topic of request threads and non-request threads inside an application domain? Or could you point me to any good articles on how things work inside an application domain?

    Thanks again. You’re a life saver.

  8. Tess says:

    Hi Tarique,

    Even if the developer has not written a global or page-level error handler, such exceptions wouldn’t crash the system.  Instead the user would see an exception or error page.

    About Question 4,  no, the application domain would still stay up.

    And finally on the request threads and other threads…  by request threads I mean threads handling requests, and non-request threads would be timer threads and the finalizer for example…

  9. gus says:

    Hi Tess.

    I read this post and it hits the same spot as some others I read while trying to find a solution for a problem.

    Lost sessions are always related to session timeouts and appdomain/worker process recycling, but what if only some session variables are randomly lost, way before their defined timeout, without the process recycling (no application_start called)? Some sessions just vanish! :s

    Maybe I should just submit my problem to MS support…

    Great blog in a gray area for most of us developers.

    Best regards

  10. Tess says:

    If you are just losing a few session variables, that would mean either that the code that was supposed to set them never ran or that they are being overwritten.

    I would check for exceptions to see if any occurred that prevented the variables from being set, or whether you are populating them with content from shared/static variables that may be changed by other threads.

  11. Tess says:

    Just one more note on that topic: if you have a web farm, make sure that you are either using sticky sessions so all requests go to the same server, or that you use out-of-proc session state like SQL session state or a state server; otherwise the sessions would be available on one server but not the other…

  12. Alina says:

    Hi Tess,

    I have a big object (a wrapper around managed code) that I want to load somehow into the IIS process without it being subject to AppDomain recycling. Is it possible? If so, how?

    This object will need to be accessed by the current app pool that is processing the web requests. Once the app pool recycles, memory is not freed up fast enough for the new app pool to have enough memory to allocate this object. Yes, we have extreme memory fragmentation on the system, and that is the true source of the problem; we use ASP.NET 2.0. If we write something like an ISAPI filter, load the object there, and then marshal it across the domain boundary from the regular apps, is that reasonable? Will it even work? Any better ideas?

    Thanks,

    Alina

  13. Anil says:

    Hello Tess,

    This question is about app domains and sessions. Is it possible to have IIS run each user session in a separate app domain? If yes, could you please let me know the settings in the config file that affect this?

    Regards,

    Anil.

  14. Tess says:

    Hi Anil,

    I am assuming that you don’t mean one appdomain per user, as this would mean that you would have a lot of appdomains :)  so I am not really sure that I understand your question…

    If the question is about having each appdomain/website in a separate process, then you can do so by assigning them to different application pools.  

    If the question is about how you can save session state even over appdomain restarts, the answer is to use out of process session state, like asp.net state server or sql server session state.

  15. Anil says:

    Hello Tess,

    Thanks for your reply. In fact I was asking if it is possible to have a separate app domain per user. I agree with you that this could result in a lot of app domains.

    Regards,

    Anil.

  16. ilan says:

    Hi Tess.

    Regarding your phrase "but all appdomains in the process share the same GC, threadpool, finalizer thread etc": does that mean that the GC doesn’t know the boundaries of each appdomain’s memory limit?

    I suspect that’s the case, and that’s why the GC does not know it must collect NOW or the pool for that application will be recycled.

    If I’m right, that’s a big bug in the GC; if I’m wrong, then I must have a big, elusive bug in my system. You see, I found myself having to call GC.Collect() on every page_unload event of my website(s) just to stop the memory from increasing on almost every refresh of almost every page. Only the "Induced GC" stops that.

    For a few days now I’ve been trying to figure out if I have some unreleased unmanaged resources, either by going over my code path or by using the debug tools you teach us here so well, but I can’t seem to find any (I even panicked and did some "object=null" for managed objects like XmlDocument and such, even though I knew I don’t have to…).

    Love to hear your thoughts on the matter.

    Thanks a lot for all your great posts.

    Ilan.

  17. Tess says:

    I am not sure I follow your reasoning here.  When an appdomain is unloaded, the memory (statics etc.) associated with that application domain is released, in the sense that it is available for garbage collection.

    Garbage collections occur when you make an allocation that gets you over the limit for that generation, so that really has nothing to do with appdomains.

    You shouldn’t have to call GC.Collect on each page_unload, but you might want to read some of my posts on garbage collection to better understand when a garbage collection occurs and how memory management behaves in your application.

    Hope that helps,

    Tess

  18. ilan says:

    Hello Tess and thanks for replying.

    I read a lot about debugging in general and the GC in particular this week; in fact, it has become my job these days :-)

    What I neglected to mention in my question is the 60% default threshold of memory usage, which, according to the documentation, sets the GC in motion when it is reached.

    So my question was: is it 60% of the system memory or 60% for each appdomain?

    Let me explain a little more to make it clearer.

    I have a website that has its own app pool with a limit of 200MB of memory + 200MB of VM set on it. IIS is very strict about it, so if my app goes over the limit of 200MB, the appdomain is recycled almost instantly (IIS waits a minute to give the application a chance to release the memory before recycling), but I would expect the GC to collect way before that point. If the GC were aware of each app pool’s limits, it would know to collect when my app reaches 120-130MB of use.

    Now, I know that if the GC were collecting at that point the memory would be OK, since I see it is fine when I call GC.Collect() in page_unload() (I promise I won’t do it), so the only explanation is that the GC is not collecting all it could, and apparently SHOULD, when 60% of my app pool limit is reached.

    So from what you said ("but all appdomains in the process share the same GC"), it sounds like the only thing the GC is aware of is the total memory used by the process, not the memory used per appdomain. Thus, until the PROCESS reaches 60% of the SYSTEM total memory (which can be 4GB of RAM for all I know…), the GC thinks there’s no rush and keeps sipping its coffee :), but IIS, on the other hand, sees that my pool boundaries have "been breached" and BANG, it’s recycled.

    If that’s true then it’s a huge problem, isn’t it? And as I said, if it’s not, then I have a big problem…

    Tess, thanks so much for helping, I really appreciate it. If it is my misunderstanding, please refer me to the post that you think best answers my question.

    ilan.

  19. Tess says:

    Hi Ilan,

    I think you might be confusing appdomain and application pool.  An application pool can contain many asp.net applications, each with its own appdomain.  In other words, a w3wp.exe process will serve an application pool with one or more asp.net applications/appdomains.  

    The GC will be common to all appdomains in an application pool but you will have a separate GC per process, i.e. a separate GC per apppool.

    Now, if you set the limit to 200 MB (private bytes), ASP.NET will try to release the cache etc. once you get within about 10% of that limit.  The 60% is a separate setting set in machine.config, but it does not apply anymore in IIS6; instead it is overridden by the app pool settings of x MB.  The 60% used to be a percentage of total RAM.  So you can forget about the 60% for your app… the 200 MB will be what counts.  Now, 200 MB of VM is a pretty small amount, so you will probably have a hard time keeping under that limit, especially if you are running on a multiproc box, since the startup memory ((64+16)*#processors VM for the GC, plus whatever virtual memory you use for dlls etc.) will pretty much use that up…   If possible you should consider making the limit a little less restrictive…

    HTH

    Tess

  20. Ilan says:

    Hi Tess, thanks for the VERY quick reply.

    After reading your answer, which actually said that the problem is mine, I did some more reading and research, only to find out that my problem is probably not in the GC at all but elsewhere: a native leak, and mostly in the "working process". I just don’t know why; maybe you can help me?

    Here are some numbers on a sample page that caused me the most memory problems:

    When I first load the app after killing the process from Task Manager (the project was already compiled, so there is no compilation cost here, but we do have the first load of all the dlls), the numbers are:

    Process:

    Private Bytes: 46,022,656

    Virtual Bytes: 377,942,016

    Working Set: 63,397,888

    .NET CLR Memory:

    Bytes in all Heaps: 3,814,628

    Gen 0 Collections: 3

    Gen 1 Collections: 2

    Gen 2 Collections: 1

    # of Pinned Objects: 29

    # of Sink Blocks in use: 41

    Total committed Bytes: 16,994,072

    Total reserved Bytes: 201,592,600

    %Time in GC: 4.524

    Finalization Survivors: 47

    Gen 0 heap size: 32,541,512

    Gen 1 heap size: 2,533,028

    Gen 2 heap size: 1,205,392

    LOH size: 76,208

    Let me just say here that all the other counters, like Exceptions, Loading and Threads (besides the bug where it shows 4,294,967,263 physical threads out of the blue), are all OK (thanks for the 40,000 exceptions post, you saved me a few thousand of my own :)).

    So, after hitting the refresh button 10 times (I’ll skip the changes for each refresh; sometimes it jumps a lot, sometimes almost not at all), here are the new numbers (note the jump in Private Bytes and the "working process", while all the others are quite stable):

    Process:

    Private Bytes: 94,068,736 (*2, 48MB more)

    Virtual Bytes: 376,893,440 (little less!)

    Working Set: 112,365,568(almost *2, 50MB more)

    .NET CLR Memory:

    Bytes in all Heaps: 5,284,448(almost *2 but it’s actually only 2MB more)

    Gen 0 Collections: 9(*3, 6 more)

    Gen 1 Collections: 5(*2, 3 more)

    Gen 2 Collections: 4 (*4) (I never get the "1 Gen 2 for 10 Gen 1" good ratio)

    # of Pinned Objects: 32 (+3)

    # of Sink Blocks in use: 35 (-6)

    Total committed Bytes: 65,503,072(*4)

    Total reserved Bytes: 201,592,600 (almost no change here)

    %Time in GC: 0.924 (1/4)

    Finalization Survivors: 82 (almost *2)

    Gen 0 heap size: 60,004,888 (almost *2) (I know it’s only allocated space and it’s not on the heap, but still, why is it increasing?)

    Gen 1 heap size: 2,448,524 (a little less)

    Gen 2 heap size: 1,443,524 (a little more)

    LOH size: 1,392,400 (*18 !)

    So my problem is definitely not in the GC, since the sum of all the heaps is only 5MB. According to your post at http://blogs.msdn.com/tess/archive/2005/11/25/i-have-a-memory-leak-what-do-i-do-defining-the-where.aspx, an increase in Private Bytes but not in Bytes in all Heaps (and not in Virtual Bytes either) means that I have a native memory leak, but I DON’T use any COM components in my code!

    Also, I keep calling Dispose() or using a "using{}" block on any object that has a Dispose method, like SQL connections, TextWriter, StreamWriter etc., so where can this native leak come from?

    And what can you say about this: by the time I finished writing the last few paragraphs (about 10 minutes), the working process got to 178MB, an increase of 66MB for no obvious reason!!! And I’m on a local machine for the test; no one is doing anything in the application to use up memory or CPU! Any idea what’s going on?

    Why does the working process keep growing? What exactly is it?

    BTW, the private bytes did not grow (or any other counters for that matter).

    Tess, I feel really lost here, since I can’t seem to find anyone who talks about the "working process" as part of debugging memory issues. Am I missing something here?

    P.S. I’ve waited a long time without touching the application and the numbers didn’t grow (the working process stays around 178MB).

    Thanks again for your help,

    Ilan.

  21. Ilan says:

    Hi Tess,

    After posting my previous comment, I read your post about DebugDiag. I downloaded DebugDiag and ran it on my process and guess what… it claims that there’s no native leak either!

    So where am I leaking?! 🙁

    ilan.

  22. Tess says:

    DebugDiag can be a bit tricky in showing leaks… you need to make sure that you have leak tracking on while the leak occurs and, if the leak is fast, that you also enable it to gather stacks during the first 15 minutes (under tools/options).

    It’s a bit hard to give much more info about what you might be leaking without taking a deeper look at your specific scenario… if it is urgent for you, you might want to call support to get more help and dive deeper into the issue.

  23. Tess says:

    Just had a look at your numbers btw, and I am thinking that it might just be assemblies that you are loading that are taking up that memory…

    You have a lot of numbers and you need to look at the context a little bit.  For example, an 18* increase in LOH sounds like a lot, but when you look at the numbers you see that in reality it is just 1.3 MB, so it might just be one object.  Also, don’t worry so much about the .NET heap numbers, considering that they total 5 MB; otherwise you will spend a lot of time solving a non-existent issue.

    As a general rule, if you are looking for something that you suspect is a leak, you first need to establish a baseline, i.e. how does it look after all the pages have loaded once… then you should look to see if you leak over time, i.e. if private bytes keeps going up and up and up and never comes down.

  24. ilanmazuz says:

    Hello Tess and thanks for your reply.

    In DebugDiag I did trace the first 15 minutes, and it was definitely tracing when the leak occurred, but it still says nothing is leaking.

    As for support, how exactly do I do that? Do I send a dump to a general support email address, or should I call first, and if so, which international number?

    If I need to send a dump, then what kind of dump: an AD+ dump or a DebugDiag dump with leak tracking?

    Thanks again,

    Ilan.

  25. Tess says:

    Hi Ilan,

    You would have to create a support incident.  How you do it depends a little bit on where you are and if you have a contract or not.  The details should be up on http://support.microsoft.com/

    Thanks

    Tess

  26. Zoran says:

    Hi Tess,

    Thank you for your post, but how can I determine what changed my web.config file? I got "Event message: Application is shutting down. Reason: Configuration changed", and I understand that this causes the app domain to recycle. But what is the root cause; what changed the web.config? I tried Filemon, but all I can see is that w3wp.exe accesses the file to read it. We had a similar problem a few months ago: something caused the app domain to recycle very often, and we discovered that it was an antivirus program that was writing something to the bin folder. Now the reason is something else and I can’t figure out what it is. Thank you for helping,

    Zoran

  27. Tess says:

    If it isn’t you, the most probable causes are antivirus software or backup software.

  28. Zoran says:

    We disabled antivirus, and there is no backup software. Is it possible that we get Configuration changed message for any other reason except that something changed the web.config file of our application? Thank you.

  29. Tess says:

    It could also be the machine.config…

    And there was an issue a long while back, can’t remember if it was in 2.0 RTM or 1.1 (there should be a KB on it), where we got overwhelming change notifications due to a problem with the file monitor, but if you run on the latest bits that should not be an issue anymore.  In that case the message should be "overwhelming change notifications" though…

  30. Zoran says:

    It is frustrating; it just happened again right now. All I can see from Filemon is that, at the time the configuration change happened, w3wp.exe opens, queries and reads machine.config and web.config, but it does that all the time, and I can’t see anything unusual when something causes the configuration change. A few months ago, when we had a similar issue (same application, same server), you could see exactly that the antivirus wrote something to the bin folder of the application. Now nothing. Nrgh!

  31. Zoran says:

    It seems that the problem was that we created and deleted some files in the root directory. Yesterday I configured the application to write those files to a subfolder, and we haven’t gotten a "Configuration changed" event since then.

  32. Ali Raza says:

    Thanks, nice post… really very helpful.

    Regards,

    Ali Raza

    http://techgulf.blogspot.com/

  33. Rakesh says:

    Good questions and answers, helped me a lot.

  34. Arun says:

    Can you please give me the difference between appdomain and apppool .