I'd like to briefly explain how ASP.NET uses threads when hosted on IIS 7.5, IIS 7.0 and IIS 6.0, as well as the configuration changes that you can make to alter the defaults. Please take a quick look at the “Threading Explained” section in Chapter 6 of “Improving .NET Application Performance and Scalability”. Prior to v2.0 of the .NET Framework, it was necessary to tweak the processModel/maxWorkerThreads, processModel/maxIoThreads, httpRuntime/minFreeThreads, httpRuntime/minLocalRequestFreeThreads, and connectionManagement/maxconnection configuration. The v2.0 .NET Framework attempted to simplify this by adding a new processModel/autoConfig configuration, which made the changes for you at runtime. With the introduction of IIS 7.0 and the ASP.NET integrated pipeline, we've introduced another element to the mix, a registry key named MaxConcurrentRequestsPerCPU. Let's start with a discussion of how things worked on IIS 6.0 before discussing the changes made in IIS 7.0.
When ASP.NET is hosted on IIS 6.0, the request is handed over to ASP.NET on an IIS I/O thread. ASP.NET immediately posts the request to the CLR ThreadPool and returns HSE_STATUS_PENDING to IIS. This frees up IIS threads, enabling IIS to serve other requests, such as static files. Posting the request to the CLR ThreadPool also acts as a queue. The CLR ThreadPool automatically adjusts the number of threads according to the workload, so that if the requests are high throughput there will only be 1 or 2 threads per CPU, and if the requests are high latency there will be potentially far more concurrently executing requests than 1 or 2 per CPU. The queuing provided by the CLR ThreadPool is very useful, because while the requests are in the queue there is only a very small amount of memory allocated for the request, and it is all native memory. It’s not until a thread picks up the request and begins to execute that we enter managed code and allocate managed memory.
The CLR ThreadPool is not the only queue used by ASP.NET when hosted in IIS 6.0. There are also queues at the application level, within each AppDomain. If there is a lot of latency, the CLR ThreadPool will grow and inject more active threads. At some point we would either run out of threads, not have enough threads left over for performing other tasks, or the memory associated with all the concurrently executing requests would be too much, so ASP.NET imposes a cap on the number of threads concurrently executing requests. This is controlled by the httpRuntime/minFreeThreads and httpRuntime/minLocalRequestFreeThreads settings. If the cap is exceeded, the request is queued in the application-level queue, and executed later when the concurrency falls back down below the limit. The performance of these application-level queues is really quite miserable. If you observe that the “ASP.NET Applications\Requests in Application Queue” performance counter is non-zero, you definitely have a performance problem. These queues were implemented to prevent thread exhaustion and contention related to web service requests. The problem was first described in KB 821268, which I published many years ago. The KB article has been re-written a few times since it was originally published, and I hope nothing has been lost during the translations.
For most usage scenarios, the changes recommended in the KB article are not necessary because v2.0 introduced processModel/autoConfig. However, the autoConfig setting may not work for everyone--it limits the number of concurrently executing requests per CPU to 12. An application with high latency may want to allow higher concurrency than this, in which case you can disable autoConfig and make the changes yourself. If you do allow higher concurrency, keep an eye on your working set. I believe the default works for about 90% of the applications out there. I do wish we had the foresight to name that setting maxConcurrentRequestsPerCPU, and allow it to be used to control concurrency, since that would be much easier to configure. I guess this is just another example of when business was just a little bit faster than the speed of thought.
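To make the limit of 12 per CPU concrete, here is a rough back-of-the-envelope sketch in Python. The per-CPU autoConfig values it uses (maxWorkerThreads of 100 and minFreeThreads of 88) are the ones commonly cited from KB 821268; they are not stated in this post, so treat them as assumptions. The function names are hypothetical, and the throughput ceiling is simply Little's law applied to synchronous requests that each hold a thread for their full latency.

```python
# Assumed autoConfig values per CPU (commonly cited from KB 821268; not stated in this post).
AUTOCONFIG_MAX_WORKER_THREADS_PER_CPU = 100
AUTOCONFIG_MIN_FREE_THREADS_PER_CPU = 88


def max_concurrent_requests(cpu_count: int) -> int:
    """Threads allowed to execute requests before the application-level queue is used."""
    per_cpu = (AUTOCONFIG_MAX_WORKER_THREADS_PER_CPU
               - AUTOCONFIG_MIN_FREE_THREADS_PER_CPU)
    return per_cpu * cpu_count


def throughput_ceiling(cpu_count: int, latency_seconds: float) -> float:
    """Little's law: requests/sec = concurrency / time each request holds a thread."""
    return max_concurrent_requests(cpu_count) / latency_seconds


if __name__ == "__main__":
    # Hypothetical quad-processor server with 100 ms of backend latency per request.
    print(max_concurrent_requests(4))
    print(throughput_ceiling(4, 0.1))
```

Under these assumptions, the ceiling falls in direct proportion to latency, which is why a high-latency application may want more than 12 concurrent requests per CPU.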
When ASP.NET is hosted on IIS 7.5 and 7.0 in integrated mode, the use of threads is a bit different. First of all, the application-level queues are no more. Their performance was always really bad, there was no hope in fixing this, and so we got rid of them. But perhaps the biggest difference is that in IIS 6.0, or ISAPI mode, ASP.NET restricts the number of threads concurrently executing requests, but in IIS 7.5 and 7.0 integrated mode, ASP.NET restricts the number of concurrently executing requests. The difference only matters when the requests are asynchronous (the request either has an asynchronous handler or a module in the pipeline completes asynchronously). Obviously if the requests are synchronous, then the number of concurrently executing requests is the same as the number of threads concurrently executing requests, but if the requests are asynchronous then these two numbers can be quite different as you could have far more requests than threads.
So how do things work, exactly, in integrated mode? Similar to IIS 6.0 (classic mode, a.k.a. ISAPI mode), the request is still handed over to ASP.NET on an IIS I/O thread. And ASP.NET immediately posts the request to the CLR ThreadPool and returns pending. We found this thread switch was still necessary to maintain optimal performance for static file requests. So although you will take a performance hit if you’re only executing ASP.NET requests, if you have a mix of dynamic and static files, as we see with many large corporate workloads, this thread switch will actually free up threads for retrieving the static files. Finally, once the request is picked up by a thread from the CLR ThreadPool, we check to see how many requests are currently executing. If the count is too high, the request is queued in a global (process-wide) queue. This global, native queue performs much better than the application-level queues used when we’re running in ISAPI mode (same as on IIS 6.0).
There is very little memory associated with a queued request, and we have not entered managed code yet so there is no managed memory associated with it. And we respect the FIFO aspect of a queue, something we didn’t do with the application-level queues--if there was more than one application, there was no simple way to globally manage the individual queues. We did however have a difficult time trying to come up with a good configuration story for the IIS 7.0 changes.
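The gating just described can be pictured with a toy model. This is only an illustrative Python sketch, not the actual ASP.NET implementation: the class and method names are hypothetical, and it ignores threads entirely. What it shows is that the limit counts executing requests (so an asynchronous request still occupies a slot while it waits) and that a single process-wide queue releases work in FIFO order.

```python
from collections import deque


class RequestGate:
    """Toy model of integrated-mode gating: cap executing requests, queue the rest FIFO."""

    def __init__(self, max_concurrent_requests_per_cpu: int, cpu_count: int):
        self.limit = max_concurrent_requests_per_cpu * cpu_count
        self.executing = 0
        self.queue = deque()  # one process-wide queue shared by every application

    def arrive(self, request_id: str) -> bool:
        """Return True if the request starts executing, False if it is queued."""
        if self.executing < self.limit:
            self.executing += 1
            return True
        self.queue.append(request_id)
        return False

    def complete(self):
        """Finish one request; return the id of the queued request that starts, if any."""
        self.executing -= 1
        if self.queue:
            self.executing += 1
            return self.queue.popleft()  # oldest queued request goes first
        return None


if __name__ == "__main__":
    gate = RequestGate(max_concurrent_requests_per_cpu=12, cpu_count=1)
    for i in range(14):
        gate.arrive(f"r{i}")
    print(gate.executing, list(gate.queue))
    print(gate.complete())
```

Because there is one queue for the whole process, requests from different applications are released in arrival order, which the per-application queues could not guarantee.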
When I discuss how to configure thread usage for ASP.NET/IIS 7.0 integrated mode, please remember that we have a lot of pre-existing code and configuration, and you can’t just create something new the way you would like to without introducing backward compatibility issues. In this new mode, the CLR ThreadPool is still controlled by the processModel configuration settings (autoConfig, maxWorkerThreads, maxIoThreads, minWorkerThreads, and minIoThreads). And autoConfig is still enabled, but its modifications to httpRuntime/minFreeThreads and httpRuntime/minLocalRequestFreeThreads do nothing, since the application-level queues do not exist. Perhaps we should have tried to use them to configure the global (process-wide) queue limits, but they have application scope (httpRuntime configuration is application specific), not process scope, not to mention being too difficult to understand. And because of some issues with using the configuration system that I won’t go into right now, we decided to use a registry key to control concurrency. So for IIS 7.0 integrated mode, a DWORD named MaxConcurrentRequestsPerCPU within HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0 determines the number of concurrent requests per CPU. By default, it does not exist and the number of requests per CPU is limited to 12. If you’re curious to see how much faster ASP.NET requests execute without the thread switch, you can set the value to 0. This will cause the request to execute on the IIS I/O thread, without switching to a CLR ThreadPool thread. I don’t recommend this primarily because dynamic requests take a long time to execute relative to static requests, and I believe the overall performance of the system is better with the thread switch. However, and this is important, if your application consists of primarily or entirely asynchronous requests, the default MaxConcurrentRequestsPerCPU limit of 12 will be too restrictive for you, especially if the requests are very long running.
In this case, I do recommend setting MaxConcurrentRequestsPerCPU to a very high number. In fact, in v4.0, we have changed the default for MaxConcurrentRequestsPerCPU to 5000. There's nothing special about 5000, other than it is a very large number, and will therefore allow plenty of async requests to execute concurrently. One thing to watch out for is that when concurrency increases, your application will use more memory simply because there are more requests executing in managed code. The CLR ThreadPool will still do a great job maintaining the number of threads in the ThreadPool, so there should be no concern about this adversely impacting synchronous requests. I know there are people using ASP.NET 2.0 and developing Comet or Comet-like applications on WS08 x64 servers, and they set MaxConcurrentRequestsPerCPU to 5000 and increase the HTTP.sys kernel queue limit to 10,000 (it has a default of 1000). The HTTP.sys kernel queue limit is controlled by IIS. You can change it by opening IIS Manager, opening the Advanced Settings for your application pool, and changing the value of "Queue Length".
As a final remark, please note that the processModel/requestQueueLimit configuration limits the maximum number of requests in the ASP.NET system for IIS 6.0, IIS 7.0, and IIS 7.5. This number is exposed by the "ASP.NET\Requests Current" performance counter, and when it exceeds the limit (default is 5000) we reject requests with a 503 status (Server Too Busy).
UPDATE (Aug-18-2008): .NET Framework v3.5 SP1 was released earlier this week and it includes an update to the v2.0 binaries that supports configuring IIS application pools via the aspnet.config file. The aspnet.config file is not very well known. It is the CLR Hosting configuration file, and ASP.NET/IIS pass it to the CLR when the CLR is loaded. The host configuration file (aspnet.config) applies configuration at the process level, as opposed to the application level like web.config. There is a new system.web/applicationPool configuration section which applies to integrated mode only (Classic/ISAPI mode ignores these settings). The new config section with default values is:
<applicationPool maxConcurrentRequestsPerCPU="12" maxConcurrentThreadsPerCPU="0" requestQueueLimit="5000"/>
There is a corresponding IIS 7.5 change (Windows Server 2008 R2 only) which allows different aspnet.config files to be specified for each application pool (this change has not been ported to IIS 7.0). With this, you can configure each application pool differently. The maxConcurrentRequestsPerCPU setting is the same as the registry key described above, except that the setting in aspnet.config will override the registry key value. The maxConcurrentThreadsPerCPU setting is new, and allows concurrency to be gated by the number of threads, similar to the way it was done in Classic/ISAPI mode. By default maxConcurrentThreadsPerCPU is disabled (has a value of 0), in favor of gating concurrency by the number of requests, primarily because maxConcurrentRequestsPerCPU performs better (gating the number of threads is more complicated/costly to implement). Normally you'll use request gating, but you now have the option of disabling it (set maxConcurrentRequestsPerCPU=0) and enabling maxConcurrentThreadsPerCPU instead. You can also enable both request and thread gating at the same time, and ASP.NET will ensure both requirements are met. The requestQueueLimit setting is the same as processModel/requestQueueLimit, except that the setting in aspnet.config will override the machine.config setting. All of this may be a little confusing, but for nearly everyone, my recommendation is that for ASP.NET 2.0 you should use the same settings as the defaults in ASP.NET v4.0; that is, set maxConcurrentRequestsPerCPU = "5000" and maxConcurrentThreadsPerCPU="0".
UPDATE (Sep-12-2011): The only relevant change to .NET Framework v4.0 (as compared to 3.5 or 2.0) is that the default for maxConcurrentRequestsPerCPU was increased to 5000. 5000 is also the value you should use in versions 2.0 and 3.5, which have a default of 12. Also, IIS 7.5 is identical to IIS 7.0 as far as threading is concerned. The only difference between IIS 7.5 and 7.0 that is relevant to this blog post is the support to configure different aspnet.config files for each application pool. You do this by setting the CLRConfigFile attribute for the application pool. You can then use the system.web applicationPool configuration mentioned above to set different values for maxConcurrentRequestsPerCPU, maxConcurrentThreadsPerCPU, and requestQueueLimit, if desired.
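The precedence between the three sources of this setting can be summarized in a few lines. The following Python sketch is only an illustration of the rules stated above (aspnet.config overrides the registry, and the built-in default applies otherwise); the function name and the dictionary of defaults are hypothetical, and it does not model where each framework version actually reads its registry value from.

```python
from typing import Optional

# Built-in defaults per framework version, as described in this post.
DEFAULT_PER_CPU = {"2.0": 12, "3.5": 12, "4.0": 5000}


def effective_max_concurrent_requests_per_cpu(
    framework_version: str,
    registry_value: Optional[int] = None,
    aspnet_config_value: Optional[int] = None,
) -> int:
    """Resolve the per-CPU request limit: aspnet.config, then registry, then default."""
    if aspnet_config_value is not None:
        return aspnet_config_value  # aspnet.config overrides the registry
    if registry_value is not None:
        return registry_value
    return DEFAULT_PER_CPU[framework_version]


if __name__ == "__main__":
    print(effective_max_concurrent_requests_per_cpu("3.5"))
    print(effective_max_concurrent_requests_per_cpu("3.5", registry_value=5000))
    print(effective_max_concurrent_requests_per_cpu("4.0"))
```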
In general, running with default configuration works best. However, applications that have measurable latency, say latency of 100 milliseconds when communicating with a backend web service, will perform better with a few configuration changes. Let me tell you what configuration changes you should make on IIS 7.0 and IIS 7.5 in integrated mode in order to handle a large number of concurrent requests to an application that has backend latency. By large number of concurrent requests, I mean between 12 and 5000 per CPU.
- For v2.0 and v3.5, set a DWORD registry value at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0\MaxConcurrentRequestsPerCPU = 5000, then restart IIS.
- For v3.5, you can alternatively set <system.web><applicationPool maxConcurrentRequestsPerCPU="5000"/></system.web> in the aspnet.config file. If the value is set in both places, the aspnet.config setting overrides the registry setting.
- For v4.0, the default maxConcurrentRequestsPerCPU is 5000, so you don't need to do anything.
- Increase the HTTP.sys queue limit, which has a default of 1000. If the operating system is x64 and you have 2 GB of RAM or more, setting it to 5000 should be fine. If it is too low, you may see HTTP.sys reject requests with a 503 status. Open IIS Manager and the Advanced Settings for your Application Pool, then change the value of "Queue Length".
- If your ASP.NET application is using web services (WCF or ASMX) or System.Net to communicate with a backend over HTTP, you may need to increase connectionManagement/maxconnection. For ASP.NET applications, this is limited to 12 * #CPUs by the autoConfig feature. This means that on a quad-proc, you can have at most 12 * 4 = 48 concurrent connections to an IP end point. Because this is tied to autoConfig, the easiest way to increase maxconnection in an ASP.NET application is to set System.Net.ServicePointManager.DefaultConnectionLimit programmatically, from Application_Start, for example. Set the value to the number of concurrent System.Net connections you expect your application to use. I've set this to Int32.MaxValue and not had any side effects, so you might try that--this is actually the default used in the native HTTP stack, WinHTTP. If you're not able to set System.Net.ServicePointManager.DefaultConnectionLimit programmatically, you'll need to disable autoConfig, but that means you also need to set maxWorkerThreads and maxIoThreads. You won't need to set minFreeThreads or minLocalRequestFreeThreads if you're not using classic/ISAPI mode.
- If your application sees a large number of concurrent requests at start-up or has a bursty load, where concurrency increases suddenly, you will need to make the application asynchronous because the CLR ThreadPool does not respond well to these loads. The CLR ThreadPool injects new threads at a rate of about 2 per second. This is true for all versions of the CLR (v1.0 through v4.0) at the time of this writing. If concurrency is bursty and the request thread blocks (e.g. on a backend with latency), the injection rate of 2 threads per second will make your application respond very poorly to this workload. The fix is to stop blocking on threads by using asynchronous I/O to communicate with the backend with latency. If you cannot make the application asynchronous, you will need to increase minWorkerThreads. I don't like to increase minWorkerThreads. It has a side effect on high-throughput synchronous requests that don't block on threads, because the thread count is artificially high.
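To get a feel for what an injection rate of about 2 threads per second means for a bursty load, here is a small worked estimate in Python. It is a deliberately simplified sketch: the function name is hypothetical, and it assumes that every request in the burst blocks its thread for the whole ramp-up, that no request completes in the meantime, and that the pool starts with a given number of threads (taken here, as an assumption, to be one per CPU).

```python
INJECTION_RATE_PER_SECOND = 2.0  # approximate rate quoted in the last bullet


def seconds_until_all_running(burst_size: int, initial_threads: int,
                              injection_rate: float = INJECTION_RATE_PER_SECOND) -> float:
    """Time for the pool to grow enough to give every blocked request its own thread."""
    missing_threads = max(0, burst_size - initial_threads)
    return missing_threads / injection_rate


if __name__ == "__main__":
    # Hypothetical burst: 100 blocking requests arrive at once on a 4-CPU server.
    print(seconds_until_all_running(burst_size=100, initial_threads=4))
```

Under these assumptions the last requests in the burst wait most of a minute before they even start, which is why asynchronous I/O (or, failing that, a higher minWorkerThreads) matters for this kind of load.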