Thoughts on Application Pools running out of threads


Question:


“The application pool is running out of threads”: that’s what my vendor told me when their ASP.Net application had an intermittent performance problem.


I searched and read quite a lot of documentation, but had no luck. What did he mean by “running out of threads”? Where can I check or adjust the setting for this?


And I want to know the exact benefit of a web garden. Does more than one w3wp.exe achieve better performance than only one?


Answer:


If the vendor tells you their ASP.Net application is “running out of threads”, then the vendor is basically telling you that the application has a performance issue: it performs too many simultaneous synchronous operations, which tie up worker threads and prevent other requests from being serviced.


In other words, the vendor’s application has a problem; he is simply blaming Microsoft because it is easy and convenient, and you believed it because you had no other information.


Now, you may think “wait a minute; if the application is tying up worker threads, then shouldn’t IIS/ASP.Net simply detect this and create more worker threads to continue functioning – so isn’t this really an IIS/ASP.Net issue?”


Ahh… but that is a common misconception about threads (and similarly, processes). Increasing concurrency (i.e. having the system perform more work simultaneously by creating more threads or starting up additional processes such as Web Garden) is not synonymous with improving performance. Why?


Computation 101


You can view the CPU of your computer as something that can perform some number of operations in a second, and at any point in time, it can only perform one operation.


Since the CPU does not perform multiple operations at any point in time, it means that it is not possible to run multiple applications simultaneously. A computer gives the ILLUSION of running multiple applications simultaneously by quickly switching the CPU between various applications such that each application gets a “slice” of CPU time to perform useful work with its operations.


Note that switching the CPU between various applications is not free; it takes a small but non-zero amount of CPU time.


Example


How does this translate into performance? Suppose you have:



  1. CPU that performs 120 operations per second
  2. A request takes 10 operations to complete
  3. Switching between threads takes 1 operation
  4. CPU switches threads every 5 operations


  • If your worker process only has one thread, it can perform at most 120 / 10 = 12 requests/second.
  • If your worker process has two threads, it can perform at most 120 / ( 2 x ( 5 + 1 ) ) = 10 requests/second (remember, it takes two context switches to gather up the CPU slices to perform 10 operations of work to complete the request).

Notice that increasing concurrency (in this case, the number of threads) DECREASES the potential request/second throughput. This decrease comes from the cost of context switching (in this case, between threads), which consumes CPU time without performing useful work. In other words, increasing concurrency decreases potential throughput because the CPU must spend cycles maintaining that concurrency.
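
The arithmetic above can be reproduced with a small Python sketch. This is only a toy model built from the numbers in the example; the function name and parameters are made up for illustration, and it assumes a single thread never pays for a context switch while two or more threads pay one switch per time slice.

```python
import math


def max_requests_per_second(cpu_ops_per_sec, ops_per_request,
                            switch_cost, quantum, threads):
    """Toy model of the example: upper bound on requests/second.

    Assumption: with one thread there is no context switching; with
    more than one thread, every time slice of `quantum` operations
    costs an extra `switch_cost` operations.
    """
    if threads <= 1:
        return cpu_ops_per_sec / ops_per_request
    # Number of time slices needed to accumulate one request's work.
    slices = math.ceil(ops_per_request / quantum)
    # Useful work plus one context switch per slice.
    total_cost = ops_per_request + slices * switch_cost
    return cpu_ops_per_sec / total_cost


one_thread = max_requests_per_second(120, 10, 1, 5, threads=1)
two_threads = max_requests_per_second(120, 10, 1, 5, threads=2)
print(one_thread, two_threads)
```

With the numbers from the example this gives 12 requests/second for one thread and 10 for two. The model is deliberately crude: it does not distinguish two threads from twenty, and real schedulers are far more involved.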


At this point, the astute reader should wonder “if increasing concurrency decreases potential throughput (and hence max performance), then why in the world do we want concurrency? It’s just bad, right?”


Well, suppose that the request can monopolize and hang whatever thread it is running on, and it does this every 6 requests.



  • If your worker process only has one thread, you get 5 requests before your worker process is unable to service any more requests since it has no more threads.
  • If your worker process has two threads, you get 10 requests before your worker process is unable to service any more requests since it has no more threads.

Notice what concurrency brings:



  • If you are running blocking operations, concurrency allows you to run more of those operations before being totally blocked. In other words, it improves the appearance of availability.
  • However, concurrency does not solve blocking operations. It only delays the inevitable moment when all sources of concurrency get blocked.
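
The hanging-request scenario can be sketched the same way. The Python below is a hypothetical simulation, not anything IIS does internally: it assumes every Nth request hangs its thread permanently and counts how many requests complete before no threads are left.

```python
def requests_served_before_stall(threads, hang_every):
    """Count requests completed before every thread is hung.

    Assumption: every `hang_every`-th request hangs the thread it
    runs on forever; all other requests complete normally.
    """
    free_threads = threads
    served = 0
    request_number = 0
    while free_threads > 0:
        request_number += 1
        if request_number % hang_every == 0:
            free_threads -= 1  # this request never returns its thread
        else:
            served += 1
    return served


for n in (1, 2, 4):
    print(n, "thread(s):", requests_served_before_stall(n, hang_every=6))
```

One thread serves 5 requests and two threads serve 10, matching the bullets above. Adding threads raises the count linearly, but the loop always terminates: every configuration eventually stalls.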

Conclusion


Thus, I would refrain from asking a question like “does more than one w3wp.exe achieve better performance than only one” because the answer completely depends on the behavior of each request.



  • If you have performant code that does not block, then having one w3wp.exe is ideal since it maximizes request throughput by not having any loss due to context switching.
  • If you have code which blocks, you *may* want to consider increasing concurrency by adding threads or Web Garden… but realize that this only gives a short-term appearance of availability at the expense of decreased raw performance, does not fix the non-performant code, and only delays the inevitable.

In other words, Web Garden is NOT a performance feature. It is a reliability/availability feature, and it improves the appearance of reliability/availability at the cost of decreased performance. If you are running code that hangs and chews up the threads of a w3wp.exe, Web Garden gives you additional processes (and more threads) that can be chewed up by the same code. In other words, it delays the inevitable hang of a bad application. The cost of providing this concurrency is performance, because there are more context switches between the processes and threads, and a context switch is basically wasted CPU cycles that do no useful work.


//David

Comments (16)

  1. Sebastian says:

    David, how does this apply to multi-core processors?

    I am interested in the concepts of processor affinity and the ability (or inability) of threads within a single process to span multiple processors. Can you shed some light on IIS 6, and specifically ASP.Net performance characteristics, on multi-processor servers?

    At the moment I am having a "heated" debate with some IBM guys about IIS 6 performance.  

  2. Good says:

    Thanks David. Please ignore the previous comment of mine.

    You are right, because the vendor can’t even suggest a proper value for recycling, and they are still working on solving our aspx error. I don’t expect much from them.

    I understand your meaning, but since I can’t get the source code of that ASP.Net app, can you give me some clue about increasing the threads for an application pool, to satisfy my curiosity?

    Although I know that probably would not solve the problem…

  3. Good says:

    I’ve got only one worker process for that application pool. (A web garden causes that application to fail.)

    How many worker threads can this w3wp.exe launch (or use) then?

    <httpRuntime>

    minFreeThreads

    appRequestQueueLimit

    These are the only settings I think might be related (threads pick up requests from these queues).

  4. David.Wang says:

    Good – signs of an application "running out of threads" include observable "intermittent poor performance" as well as various types of queuing by ASP.Net since there are no threads to pick up requests from the queue. At some point, the queue will get full and start rejecting requests and you see them as error responses of various sorts.

    As for thread counts – you are not interested in the number of threads w3wp.exe can launch/use because those threads are not involved in the execution of your ASP.Net application. ASP.Net has its own set of queues and thread pools (values controlled via the <httpRuntime> settings). All the w3wp.exe threads do is take requests from the HTTP.SYS queue, process them, and hand them to ASPNET_ISAPI.DLL, which then deposits those requests into the ASP.Net request queue, and the ASP.Net threads service that queue.

    //David

  5. David.Wang says:

    Sebastian – Sorry. No special info here.

    IIS6 and ASP.Net do not do anything special for multi-core, processor affinity, or thread execution. They just do whatever Windows does for processes and threads.

    All you can do is define the SMPProcessorAffinityMask property at the per-ApplicationPool level. This allows you to map Application Pools (and hence applications) to logical processors.

    How the threads are scheduled amongst the processors – that is up to Windows.

    From IIS perspective, processors affect performance because they increase concurrent execution of threads through a given codebase and therefore bottleneck on "hotly contended locks".

    //David

  6. David Wang says:

    Sigh… it seems that the Application Health Monitoring features added in IIS6 are merely used by VARs…

  7. Hayden James says:

    This was an invaluable article! I’ve just reduced my worker processes from 8 to 2. Thanks!

  8. Peter says:

    I am interested in a 4 to 8 CPU model with web garden.

    If the ASP.Net application running under the pool is stateless, and there is a decent volume of requests, what is the guideline for starting to use Web Garden? Are there any performance counters, such as requests queued, that could show us the difference before and after?

    http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/659f2e2c-a58b-4770-833b-df96cabe569e.mspx?mfr=true

    Creating a Web garden for an application pool can also enhance performance in the following situations:

    • Robust processing of requests: When a worker process in an application pool is tied up (for example, when a script engine stops responding), other worker processes can accept and process requests for the application pool.

    • Reduced contention for resources: When a Web garden reaches a steady state, each new TCP/IP connection is assigned, according to a round-robin scheme, to a worker process in the Web garden. This helps smooth out workloads and reduce contention for resources that are bound to a worker process.

  9. David.Wang says:

    Peter – the general guideline to start using Web Garden is when you can articulate the exact reason(s) that you benefit from it. I am not trying to be curt, difficult, or facetious – literally, the PM and Dev who designed and implemented the Web Garden feature say the same thing! Really! :-)

    Basically, Web Garden is NOT a performance feature, but it is often misunderstood that way. Web Garden is at best an AVAILABILITY feature. Notice that what you quoted from microsoft.com exactly matches what I’m saying – when an application encounters process-level contention for resources, often guarded by locks or synchronous activity, Web Garden gives you additional processes that *may* work around the contention. I say *may* because if the contention is global, such as a handle to a log file, handle to shared memory, or semaphore, using Web Garden only *increases* contention by increasing the number of threads and processes fighting over the same resource.

    Hopefully, this reinforces my message that Web Garden is an AVAILABILITY and not performance (i.e. increase in throughput or decrease in latency) feature.

    In short, the ONLY reason one would use Web Garden is to work around contention bugs in server-side applications which you cannot fix (such as 3rd party applications). If the contention is in code you control, you want to fix it ASAP because it will only hamper future scalability attempts which you *may not* be able to work around with server configuration.

    Now that I have outlined the criteria for using Web Garden, let’s look at your situation:

    "Decent volume of requests" says nothing about contention, so no, it is not a sufficient reason for Web Garden. IIS can easily handle 10,000 asynchronous requests concurrently without contention and few threads in a process, so no need for Web Garden.

    "requests queued" indicates that the incoming rate of requests into the queue is greater than the rate at which worker threads complete requests from the queue. It is suspicious but does not necessarily mean contention, and even if there is contention, it does not mean the contended resource is non-global. You will need to profile the queuing to determine WHAT is under contention before being able to determine if Web Garden can help.

    I understand that my requirements to profile seem "high", and that most people would probably just flip the switch and see if it "works" for their application… which is perfectly fine for me since I realize that you gotta do what you gotta do. However, just realize that if it fails later on, I am simply going to point to the taking of this shortcut and point you back to doing the real profile work.

    Basically, there is no shortcut for performance/tuning. You have to know what is the exact bottleneck to be able to determine the correct configuration to ease the bottleneck. No more; no less; no magic.

    //David

  10. David.Wang says:

    Peter – for machines with 8+ Processors, you can investigate using Processor Affinity in IIS6, which would isolate one w3wp.exe per CPU to decrease CPU context-switch and TLB cache-flush caused by flipping w3wp.exe processes *between* CPUs.

    However, this still does not necessarily endorse Web Garden since it is not the only way to get more w3wp.exe running on the machine (you can add more Application Pools of single w3wp.exe). The decision for Web Garden is still tied to the behavior of the application hosted inside it.

    //David

  11. Bib says:

    How many threads are spawned by each web garden process – w3wp.exe? This article starts off talking about threads. Then, in the "Conclusion" section, it switches to processes (w3wp.exe). This appears to assume that a w3wp.exe process spawns only one worker thread. Is that correct?

    Otherwise, a very good article.

  12. David.Wang says:

    Bib – w3wp.exe does not spawn threads to handle requests. On startup, w3wp.exe creates a tiny pool with a pre-determined number of threads to handle incoming requests.

    I switch between threads and processes because while threads control the level of concurrency, IIS only gives you control of processes. Users have no direct way to control the number of threads executing other than influencing the number of processes, which is a good idea – dictating the number of threads usually indicates synchronous IO, which tends to bottleneck and scale worse than asynchronous IO.

    I have to bridge the gap between the original question and underlying reason. If I just answered in terms of threads, I have no recommended action. If I just answered in terms of processes, I appear to not answer the question.

    //David

  13. Clive says:

    Increased AspProcessorThreadMax from 25 to 50; it had no impact. On reflection, threads must be available because I can run a third page OK (that does nothing). So the problem appears to be around obtaining another ODBC connection from IIS whilst the long-running query is executing. A similar test works fine in VB6 using ODBC.

  14. Clive says:

    So adding a web garden fixes the issue, adding threads does not. But I can't use web garden due to session issues.

  15. Clive says:

    It appears my first entry didn't make it through… I have an availability issue in IIS6 with classic ASP where running a page with a long-running ODBC query will block a second page which just opens an ODBC connection.

  16. Prafulla says:

    I am posting it here because I found this most logical of all the stuff I read.

    I have a single user on IIS 7.5 using my website (on localhost).

    Initially the web requests are fulfilled in less than 5 seconds.

    But after 10 to 15 minutes of usage, any request takes a very long time: 2 minutes or even longer.

    On one of my pages I use Task<DataTable>.Factory.StartNew to get data for my 3 grids

    and in the end do a Task.WaitAll and then render the page. (This is a key page and comes up in all workflows.)

    The execution works without any threading errors or functional errors, but the time taken is very long.

    I checked that CPU usage does not go up, either overall or for w3wp.exe.

    The memory usage is also pretty low overall and for w3wp.exe.

    I am not using any caching. I disabled request logging in IIS and cleaned up the log.

    And there is no question of requests getting queued up (single user, single request, and a max of 3 threads).

    The machine is pretty high end with 8 CPUs… but I don't think this should matter at all.

    Please guide me as to what I can do about this.

    Thanks In advance.