Thoughts on IIS Memory Recycling for 3rd party Applications

Sigh... it seems that the Application Health Monitoring features added in IIS6 are being used by VARs to cover up their own mistakes, instead of being left in the user's control as a crutch to fall back on when 3rd party web applications fail.

Question:

We have a CRM application which runs on IIS6. The application has been crashing intermittently with the message "The server is temporary busy, please try again after a moment."

The manufacturer of the program had us adjust the default IIS AppPool properties as follows

[x] Recycle Worker processes (in minutes): 1440
[x] Recycle worker processes (number of requests): 35000

[x] Maximum virtual memory (in megabytes): 600
[x] Maximum used memory (in megabytes): 500

Changing these settings has tremendously improved the situation but we are still occasionally getting "The server is temporary busy" message.

I would go back to the manufacturer to ask for additional guidance but I got the impression they were just changing settings without knowing exactly what they were doing.

So my question is: given that the server has 2GB of RAM, can we tweak these values further to eliminate this problem? What settings are recommended and why?

Also, would adding additional RAM to the server help? Any assistance or documentation anyone could point me to regarding these settings would be greatly appreciated.

Thank You,

Answer:

Actually, if you want to eliminate this problem, you need to insist that your application manufacturer debug their CRM application to determine the cause of the "server is temporarily busy..." error. Only after determining the cause of the issue through debugging can one:

  • produce a fix for the bug 
    OR
  • determine a work-around for the bug

<soapbox>

In other words, without knowing the cause of the issue, you have no idea whether tweaking any server parameters will help, much less eliminate, the issue. This means that it does not help to ask for "recommended values" because there are never any generally recommended values for an application - it all depends on the application requirements, status, resource footprint, etc - and since the bug is "unknown", the recommendation is also "unknown". The same goes for changing system hardware like adding more RAM - once again, without understanding the issue, making changes is like gambling - and why gamble when the application manufacturer is supposed to support you in getting their CRM application working?

And if your concern is that the application manufacturer has no idea what they are doing with tuning/debugging their own application... then I wonder how you can trust their CRM application at all. The same group of people supporting the CRM application probably also wrote the CRM application - and if they cannot get you proper support for their application, then how can you rely on them long term?

In general, if someone tells you that "to run my application, you have to tweak THESE Application Pool Health Monitoring metrics", it means that their application is not written well enough to stay running.

</soapbox>

Right now, it sounds like this CRM application has some bug in it, either in the program logic or in its related configuration, and this bug eventually leads to the "server is temporarily busy..." error. The vendor either has no idea about the bug or knows about it but does not want to fix it.

There are two general ways to "resolve" a bug:

  1. Find a work-around to avoid the bug
  2. Fix the actual bug

Right now, it sounds like the vendor does not want to do #2 to truly eliminate the problem, so they distract you by telling you to tweak IIS AppPool Properties in an attempt to do #1. The problem with #1 is that it does NOT eliminate the problem - the bug is still there in the application you are running - and without debugging the issue, you cannot identify the problem and have no idea whether the setting changes actually work around it.

But... the vendor managed to get you distracted and wishfully thinking that YOU are actually empowered to resolve THEIR bug by simply tweaking IIS settings, adding more RAM, or otherwise tuning your system - when you have no idea what issue is being worked around. Meanwhile, the bug is still there in the application you are running...

Does this make any sense?

For example, adding more RAM does not solve logical bugs in an application - at best, it delays the problem. Suppose the issue is a memory leak. Adding more RAM simply means it takes longer for the application to consume all of your server's memory; it does not fix the leak such that the application never consumes more memory than it needs. Sure, you can recycle the process more frequently to periodically "free" up the memory, but recycling destroys in-process state and caches, which has downstream performance and reliability effects... it all depends on the application's architecture. You merely substitute one unknown problem for another and do not make progress towards eliminating problems.
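To put rough numbers on the "it only delays the problem" point, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the function name, the 200 MB baseline, the steady 20 MB/hour leak, and the memory ceilings are not measurements from the CRM application or anything IIS reports.

```python
def hours_until_limit(baseline_mb, leak_mb_per_hour, limit_mb):
    """Hours until a steadily leaking process grows from baseline_mb to limit_mb."""
    if leak_mb_per_hour <= 0:
        return float("inf")  # no leak: the limit is never reached
    return max(limit_mb - baseline_mb, 0) / leak_mb_per_hour


# Hypothetical numbers: 200 MB baseline, 20 MB leaked per hour.
# Doubling the memory ceiling roughly doubles the time to failure,
# but the failure still arrives.
for limit_mb in (2000, 4000):
    print(limit_mb, "MB ceiling ->", hours_until_limit(200, 20, limit_mb), "hours")

# Only fixing the leak (rate of 0) changes the outcome.
print("leak fixed ->", hours_until_limit(200, 0, 2000), "hours")
```

With these made-up inputs, the larger ceiling buys more hours and nothing else; the only input that removes the failure is a leak rate of zero, which is the vendor's job.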

Remember, AppPool Health Monitoring Metrics are just that - generic metrics to determine the health of the application, and if deemed unhealthy by those metrics, recycle the worker process to clear away stale application state and start afresh. It is like rebooting the application.
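As a rough illustration of how generic these metrics are, here is a toy Python model of the four thresholds from the question. It is only a sketch of the idea, not how IIS implements recycling; the class names, field names, and sample worker numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecycleLimits:
    # Threshold values taken from the AppPool settings quoted in the question.
    max_minutes: int = 1440
    max_requests: int = 35000
    max_virtual_mb: int = 600
    max_used_mb: int = 500


@dataclass(frozen=True)
class WorkerStats:
    # Hypothetical snapshot of one worker process.
    uptime_minutes: float
    requests_served: int
    virtual_mb: float
    used_mb: float


def recycle_reasons(stats, limits=RecycleLimits()):
    """Return the list of thresholds this worker has crossed (empty = healthy)."""
    reasons = []
    if stats.uptime_minutes >= limits.max_minutes:
        reasons.append("time")
    if stats.requests_served >= limits.max_requests:
        reasons.append("requests")
    if stats.virtual_mb >= limits.max_virtual_mb:
        reasons.append("virtual memory")
    if stats.used_mb >= limits.max_used_mb:
        reasons.append("used memory")
    return reasons


# A made-up worker that is five hours old and over the virtual memory limit.
print(recycle_reasons(WorkerStats(300, 12000, 610, 480)))
```

Notice that nothing in the check knows WHY the worker grew to 610 MB. The thresholds only say "restart now"; they carry no information about the bug that made the restart necessary.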

In other words, it is meant as a possible crutch to keep the application running while support personnel debug the issue and provide a fix. It is not meant as the "solution" to buggy software because it simply consumes one of YOUR defensive trump cards against buggy applications.

In general, the best solution to buggy software is to identify the issue and get it fixed, and this is what I suggest you insist your application manufacturer perform. Here are some useful blog entries for this endeavor:

//David