Thoughts on Application Pools running out of threads

03/15/2006

Question:

"Application pool running out of threads": that is what my vendor told me when their ASP.NET application had intermittent performance problems.

I searched and read quite a lot of documentation, but had no luck. What did he mean by "running out of threads"? Where can I check or adjust the setting for this?

And I want to know the exact benefit of a web garden. Does more than one w3wp.exe achieve better performance than only one?

Answer:

If the vendor tells you their ASP.Net application is "running out of threads", then the vendor is basically telling you that the application has a performance issue: it is performing too many simultaneous synchronous operations, which tie up worker threads and prevent other requests from being serviced.

In other words, the vendor's application has a problem; he is simply blaming Microsoft because it is easy and convenient, and you believed it because you had no other information.
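To make "running out of threads" concrete, here is a minimal sketch in Python rather than ASP.Net. It assumes a fixed pool of two workers standing in for the worker threads of a worker process; the names slow_request, fast_request, and queue_delay_for_fast_request are hypothetical and exist only for this illustration. Two blocking calls occupy both workers, so a request that needs almost no CPU still sits in the queue until one of them returns.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_request(seconds):
    """Stand-in for a synchronous, blocking operation (hypothetical)."""
    time.sleep(seconds)  # the worker thread is tied up for the whole call
    return "slow done"


def fast_request(submitted_at):
    """Cheap request; returns how long it waited before a worker ran it."""
    return time.monotonic() - submitted_at


def queue_delay_for_fast_request(pool_size=2, blocking_seconds=0.3):
    """Fill every worker with a blocking call, then time one fast request."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for _ in range(pool_size):
            pool.submit(slow_request, blocking_seconds)
        submitted_at = time.monotonic()
        # No worker is free, so this request waits in the pool's queue.
        return pool.submit(fast_request, submitted_at).result()


if __name__ == "__main__":
    delay = queue_delay_for_fast_request()
    print(f"fast request waited {delay:.2f}s for a free worker")
```

The fast request waits roughly as long as the blocking calls last, even though the CPU is nearly idle the whole time. The shortage is of free threads, not of CPU.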

Now, you may think "wait a minute; if the application is tying up worker threads, then shouldn't IIS/ASP.Net simply detect this and create more worker threads to continue functioning? So isn't this really an IIS/ASP.Net issue?"

Ahh... but that is a common misconception about threads (and similarly, processes). Increasing concurrency (i.e. having the system perform more work simultaneously by creating more threads or starting up additional processes, as a Web Garden does) is not synonymous with improving performance. Why?

Computation 101

You can view the CPU of your computer (think of a single core) as something that can perform some number of operations per second, and at any point in time, it can perform only one operation.

Since such a CPU does not perform multiple operations at any point in time, it cannot truly run multiple applications simultaneously. A computer gives the ILLUSION of running multiple applications simultaneously by quickly switching the CPU between various applications such that each application gets a "slice" of CPU time to perform useful work with its operations.

Note that switching the CPU between various applications is not free; it takes a small but non-zero amount of CPU time.

Example

How does this translate into performance? Suppose you have:

CPU that performs 120 operations per second
A request takes 10 operations to complete
Switching between threads takes 1 operation
CPU switches threads every 5 operations

If your worker process only has one thread, it can perform at most 120 / 10 = 12 requests/second.
If your worker process has two threads, it can perform at most 120 / ( 2 x ( 5 + 1 ) ) = 10 requests/second (remember, it takes two context switches to gather up the CPU slices to perform 10 operations of work to complete the request).
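The arithmetic above can be checked with a short sketch. It is only the toy model from this example, not a description of any real scheduler: the function name max_requests_per_second and its parameters are made up for illustration, and it assumes that a single thread never switches while any larger thread count pays one switch per CPU slice.

```python
import math


def max_requests_per_second(ops_per_second, ops_per_request,
                            threads, slice_ops, switch_cost):
    """Upper bound on requests/second in the toy model (hypothetical helper)."""
    if threads == 1:
        # Assumption: one thread never switches, so all operations are useful.
        ops_needed = ops_per_request
    else:
        # The request is assembled from CPU slices, each paying one switch.
        slices = math.ceil(ops_per_request / slice_ops)
        ops_needed = ops_per_request + slices * switch_cost
    return ops_per_second / ops_needed


if __name__ == "__main__":
    print(max_requests_per_second(120, 10, threads=1, slice_ops=5, switch_cost=1))
    print(max_requests_per_second(120, 10, threads=2, slice_ops=5, switch_cost=1))
```

With the numbers from the example, the one-thread case needs 10 operations per request and the two-thread case needs 10 + 2 = 12, which matches 120 / ( 2 x ( 5 + 1 ) ). Note that this simple model charges the same overhead for any thread count above one.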

Notice that increasing concurrency (in this case, the number of threads) DECREASES the potential requests/second throughput. This decrease comes from the cost of context switching (in this case, between threads), which consumes CPU time without performing useful work. In other words, increasing concurrency decreases potential throughput because the CPU must waste cycles maintaining that concurrency.

At this point, the astute reader should wonder "if increasing concurrency decreases potential throughput (and hence max performance), then why in the world do we want concurrency? It's just bad, right?"

Well, suppose that a request can monopolize and hang whatever thread it is running on, and that this happens on every 6th request.

If your worker process only has one thread, you get 5 requests before your worker process is unable to service any more requests since it has no more threads.
If your worker process has two threads, you get 10 requests before your worker process is unable to service any more requests since it has no more threads.
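The same counting can be written as a tiny simulation. This is a sketch of the thought experiment only, with a hypothetical function name, and it assumes that a hung thread never comes back and that every 6th request overall is the one that hangs.

```python
def requests_served_before_exhaustion(threads, hang_every):
    """Count requests completed before every thread is hung (toy model)."""
    free_threads = threads
    served = 0
    request_number = 0
    while free_threads > 0:
        request_number += 1
        if request_number % hang_every == 0:
            # This request hangs its thread, which is lost for good.
            free_threads -= 1
        else:
            served += 1
    return served


if __name__ == "__main__":
    print(requests_served_before_exhaustion(threads=1, hang_every=6))
    print(requests_served_before_exhaustion(threads=2, hang_every=6))
```

Doubling the threads doubles the number of requests served before the hang, but the loop still terminates: the extra thread buys time and nothing more.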

Notice what concurrency brings:

If you are running blocking operations, concurrency allows you to run more of those operations before being totally blocked. In other words, it improves the appearance of availability.
However, concurrency does not solve blocking operations. It only delays the inevitable moment when all sources of concurrency get blocked.

Conclusion

Thus, I would refrain from asking a question like "does more than one w3wp.exe achieve better performance than only one" because the answer completely depends on the behavior of each request.

If you have performant code that does not block, then having one w3wp.exe is ideal since it maximizes request throughput by not having any loss due to context switching.
If you have code which blocks, you *may* want to consider increasing concurrency by adding threads or a Web Garden... but realize that this only gives a short-term appearance of availability at the expense of decreased raw performance, does not solve the non-performant code, and only delays the inevitable.

In other words, Web Garden is NOT a performance feature. It is a reliability/availability feature, and it improves the appearance of reliability/availability at the cost of decreased performance. If you are running code that hangs and chews up the threads of a w3wp.exe, a Web Garden gives you additional processes (and more threads) that can be chewed up by the same code. In other words, it delays the inevitable hang of a bad application. The cost of providing this concurrency is performance, because there are more context switches between the processes and threads, and a context switch is basically wasted CPU cycles that do no useful work.

//David