CPU Throttling for ASP.NET Asynchronous Scenarios in .NET 4.5

Disclaimer: The article below discusses pre-release features. Some of these features could change in the RTM version.

The scenario

The sample scenario we consider here is a hypothetical page that needs to aggregate the results of five remote feeds from a second tier. Requesting each feed has some latency, which we set to 100ms in our test. Requesting the feeds asynchronously has a clear advantage: the five requests complete in roughly 100ms. Requesting them synchronously, in sequence, would take around 500ms. The figure below shows the scenario.
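
A rough sketch of the asynchronous version might look like the following. The class and method names and the use of HttpClient are illustrative, not the article's actual test code:

// Sketch only: feed URLs, class, and method names are illustrative.
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public class FeedAggregator
{
    private static readonly HttpClient Client = new HttpClient();

    // Starts all five feed requests at once and awaits them together, so the
    // total latency is roughly that of the slowest feed (~100ms) instead of
    // the ~500ms a sequential, synchronous version would take.
    public async Task<string[]> GetFeedsAsync(string[] feedUrls)
    {
        Task<string>[] pending = feedUrls.Select(url => Client.GetStringAsync(url)).ToArray();
        return await Task.WhenAll(pending);
    }
}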

The problem

Asynchronous requests in ASP.NET 4.0 have some scalability issues when a huge load (beyond what the hardware can handle) is put on such a scenario. The problem is due to the nature of allocation in asynchronous scenarios. Memory is allocated when the asynchronous operation starts, and it is only consumed when the operation completes. By that time, it is very likely the objects have been promoted to generation 1 or 2 by the garbage collector.

When this happens, increasing the load increases requests per second (rps) only up to a point. Past that point, the time spent in the garbage collector becomes a problem and the rps starts to dip, producing a negative scaling effect. The figure below shows the effect of increasing the load in our particular scenario. In the sample data, the peak is reached at 770 clients.

CPU throttling

To address this problem, we implemented a throttling mechanism in 4.5. Put simply, when CPU usage goes beyond a threshold, requests are sent to the ASP.NET native queue. This queue already existed (for IIS integrated pipeline mode only) and is also used to limit concurrency based on the maxConcurrentRequestsPerCPU and maxConcurrentThreadsPerCPU settings in aspnet.config (or through registry keys).
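
For reference, those existing limits live under the same <applicationPool> element in aspnet.config; the values below are purely illustrative, not recommendations:

<configuration>
    <system.web>
        <applicationPool
            maxConcurrentRequestsPerCPU="5000"
            maxConcurrentThreadsPerCPU="0" />
    </system.web>
</configuration>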

We added a new setting in aspnet.config, called percentCpuLimit, for the CPU usage percentage at which throttling starts. Whenever CPU usage goes beyond that point, we start sending some of the requests to the queue, effectively reducing the concurrency and the amount of managed buffers that get allocated and collected by the GC. In our tests this considerably improved the scalability of this type of scenario.

The default value of the setting is 99%. To disable the throttling, set percentCpuLimit to 0:

<configuration>
    <!-- ... -->
    <system.web>
        <applicationPool percentCpuLimit="0"/>
    </system.web>
</configuration>

Performance counters

To know whether throttling is happening, the existing “ASP.NET\Requests Queued” counter can’t be used, because it includes both the requests waiting for a CLR thread pool thread and the requests in the ASP.NET native queue. So we added the counter below:

Counter: “ASP.NET\Requests In Native Queue”

Purpose: Requests queued in ASP.NET because the concurrency limits have been exceeded.

Note that the new counter also includes requests queued by the other concurrency-limiting settings, maxConcurrentRequestsPerCPU and maxConcurrentThreadsPerCPU. Those checks are made in that order, and then percentCpuLimit is checked.
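
One way to watch the counter programmatically is with System.Diagnostics.PerformanceCounter. The category and counter names below simply follow the name quoted above; this sketch is not from the article, so verify the names in perfmon on your machine:

// Sketch: category/counter names follow "ASP.NET\Requests In Native Queue" as
// quoted above; verify them in perfmon before relying on this.
using System;
using System.Diagnostics;

class NativeQueueMonitor
{
    static void Main()
    {
        using (var nativeQueue = new PerformanceCounter("ASP.NET", "Requests In Native Queue", readOnly: true))
        {
            // A sustained non-zero value means requests are being held back by
            // maxConcurrentRequestsPerCPU, maxConcurrentThreadsPerCPU, or percentCpuLimit.
            Console.WriteLine("Requests In Native Queue: {0}", nativeQueue.NextValue());
        }
    }
}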

New defaults

There are a few settings that affect this type of scenario, some in the framework and some in IIS/http.sys. We reviewed them and changed the default values of some so that they work better out of the box.

System.Net.ServicePointManager.DefaultConnectionLimit: This setting controls how many concurrent outbound connections you can open with System.Net classes, such as HttpWebRequest. It can also be set in machine.config with <system.net/connectionManagement/add[@address='*'] maxconnection />, as long as <processModel autoConfig="false" />. In .NET 4.0, when autoConfig=true, we used to set this parameter to 12 times the number of logical CPUs. In 4.5 we have changed it to infinite (actually Int32.MaxValue).
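
The limit can also be raised explicitly in code; the snippet below is only an example (with autoConfig=true on 4.5 the new default already applies):

// For example, in Global.asax Application_Start (example value only).
protected void Application_Start()
{
    System.Net.ServicePointManager.DefaultConnectionLimit = int.MaxValue;
}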

requestQueueLimit: This setting can be set in <processModel requestQueueLimit /> in machine.config or in <system.web/applicationPool requestQueueLimit /> in aspnet.config. In 4.0 and below it worked as a fixed limit: once that many total requests were reached (5000 by default), we started returning a 503 status code. In ASP.NET 4.5 we changed this so that 503 is returned when the number of requests in the ASP.NET queue, rather than the total number of requests, exceeds this value.
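
For example, in aspnet.config (the value shown is just the 5000 default mentioned above):

<configuration>
    <system.web>
        <applicationPool requestQueueLimit="5000" />
    </system.web>
</configuration>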

Http.sys queueLength: The <system.applicationHost/applicationPools/applicationPoolDefaults.queueLength /> parameter can be used in IIS to set this. The default in IIS 7.5 is 1000, which can sometimes be insufficient for a managed application with heavy garbage collection. This is caused by some GC threads running at higher priority, which can starve the IIS threads that process the queue of CPU. However, the setting affects non-paged memory, so the default value was not changed. If your web site returns a 503 status code, you can try increasing this value.
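
For example, in applicationHost.config (the value below is purely illustrative; keep the non-paged memory cost in mind before raising it):

<system.applicationHost>
    <applicationPools>
        <applicationPoolDefaults queueLength="5000" />
    </applicationPools>
</system.applicationHost>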

System.Data MaxPoolSize: The number of database connections in the pool can also affect the scalability of these types of scenarios. The value can be set inside the connection string. The right value depends primarily on the SQL Server rather than the client machine, so the default was not changed in 4.5.
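
For example, the pool size can be set in the connection string; the server, database, and value below are placeholders:

// Placeholder server/database names; "Max Pool Size=200" is only an example value.
using System.Data.SqlClient;

class PoolSizeExample
{
    static void Main()
    {
        using (var connection = new SqlConnection(
            "Data Source=MyServer;Initial Catalog=MyDb;Integrated Security=True;Max Pool Size=200"))
        {
            connection.Open();
        }
    }
}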

Summary

To review, a new concurrency setting named percentCpuLimit was added to aspnet.config in 4.5 to alleviate the scalability issues of async scenarios. A new performance counter named “ASP.NET\Requests In Native Queue” was also added so you can tell when throttling is happening, and the default values of some .NET settings were changed. All of this applies only to IIS integrated pipeline mode.