WCF Request Throttling and Server Scalability

(Copied from https://blogs.msdn.com/wenlong/archive/2008/04/21/wcf-request-throttling-and-server-scalability.aspx)

Two Threads per Request

In .NET 3.0 and 3.5, IIS-hosted WCF services exhibit a special behavior. Whenever a request comes in, the system uses two threads to process it:

· One is a CLR ThreadPool thread, the worker thread that comes from ASP.NET.

· The other is an I/O thread managed by the WCF IOThreadScheduler (actually created by ThreadPool.UnsafeQueueNativeOverlapped).

When you have a high-latency request, you would see the following call stack in the debugger before the request is completed (with .NET 3.5):

0ee6ee8c 5094dad5 mscorlib_ni!System.Threading.WaitHandle.WaitOne()+0xa

0ee6ee8c 50951e3b System_ServiceModel_ni!System.ServiceModel.Activation.HostedHttpRequestAsyncResult.ExecuteSynchronous(System.Web.HttpApplication, Boolean)+0x8d

0ee6eeb4 65fe626d System_ServiceModel_ni!System.ServiceModel.Activation.HttpModule.ProcessRequest(System.Object, System.EventArgs)+0x143

0214e4f4 65fe3fd1 System_Web_ni!System.Web.HttpApplication+SyncEventExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()+0x5d

0ee6ef08 65fe804f System_Web_ni!System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)+0x41

0ee6efb8 65fe4df2 System_Web_ni!System.Web.HttpApplication+PipelineStepManager.ResumeSteps(System.Exception)+0x63b

060f05dc 66003a92 System_Web_ni!System.Web.HttpApplication.BeginProcessRequestNotification(System.Web.HttpContext, System.AsyncCallback)+0x56

0ee6f040 65fd8022 System_Web_ni!System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)+0x352

0ee6f0c4 65fd7e07 System_Web_ni!System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+0x1f2

0ee6f0f4 01782374 System_Web_ni!System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)+0x1b

0ee6f138 6a2abf3c webengine!MgdGetCurrentNotification+0x236

0ee6f19c 74a22ea0 webengine!MgdGetPreloadedSize+0x4d

0ee6f1b0 74a23696 iiscore+0x2ea0

0ee6f560 6a2ac222 webengine!MgdCanDisposeManagedContext+0xb5

0ee6f570 015f1311 webengine!MgdIndicateCompletion+0x22

System_Web_ni!System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)+0x289

This is the worker thread that comes from ASP.NET. Another thread is still processing the request in WCF:

01fbf0ac 0e9001c7 App_Web_rsfhd6l1!HelloWorld.SimpleService.Hello(System.String)+0x14

01fbf0ac 50b8d90b System_ServiceModel_ni!DynamicClass.SyncInvokeHello(System.Object, System.Object[], System.Object[])+0x3f

01fbf0ac 50b6d245 System_ServiceModel_ni!System.ServiceModel.Dispatcher.SyncMethodInvoker.Invoke(System.Object, System.Object[], System.Object[] ByRef)+0x1fb

01fbf108 509137ad System_ServiceModel_ni!System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0xd5

01fbf148 509136a6 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0xad

01fbf174 50913613 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0x76

062b75a8 50913459 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage3(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0x33

062b75a8 50912257 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage2(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0x59

062b75a8 50911f8f System_ServiceModel_ni!System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage1(System.ServiceModel.Dispatcher.MessageRpc ByRef)+0x127

01fbf1f0 509115ff System_ServiceModel_ni!System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean)+0xdf

01fbf394 5090f8c9 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(System.ServiceModel.Channels.RequestContext, Boolean, System.ServiceModel.OperationContext)+0x1ff

01fbf3dc 5090f35e System_ServiceModel_ni!System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(System.ServiceModel.Channels.RequestContext, System.ServiceModel.OperationContext)+0x169

01fbf42c 5090f2f1 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(System.IAsyncResult)+0x5e

01fbf42c 50232d68 System_ServiceModel_ni!System.ServiceModel.Dispatcher.ChannelHandler.OnAsyncReceiveComplete(System.IAsyncResult)+0x41

01fbf42c 50904501 SMDiagnostics_ni!System.ServiceModel.Diagnostics.Utility+AsyncThunk.UnhandledExceptionFrame(System.IAsyncResult)+0x28

01fbf468 50992b36 System_ServiceModel_ni!System.ServiceModel.AsyncResult.Complete(Boolean)+0xb1

01fbf4d8 50992215 System_ServiceModel_ni!System.ServiceModel.Channels.InputQueue`1+AsyncQueueReader[[System.__Canon, mscorlib]].Set(Item<System.__Canon>)+0x46

01fbf4d8 50991ffb System_ServiceModel_ni!System.ServiceModel.Channels.InputQueue`1[[System.__Canon, mscorlib]].EnqueueAndDispatch(Item<System.__Canon>, Boolean)+0x1f5

00000000 5091d7e5 System_ServiceModel_ni!System.ServiceModel.Channels.InputQueue`1[[System.__Canon, mscorlib]].EnqueueAndDispatch(System.__Canon, System.ServiceModel.Channels.ItemDequeuedCallback, Boolean)+0x6b

00000001 50977b7e System_ServiceModel_ni!System.ServiceModel.Channels.SingletonChannelAcceptor`3[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Enqueue(System.__Canon, System.ServiceModel.Channels.ItemDequeuedCallback, Boolean)+0x75

01fbf570 5094f396 System_ServiceModel_ni!System.ServiceModel.Channels.HttpChannelListener.HttpContextReceived(System.ServiceModel.Channels.HttpRequestContext, System.ServiceModel.Channels.ItemDequeuedCallback)+0x1ce

01fbf5b8 5094e4cf System_ServiceModel_ni!System.ServiceModel.Activation.HostedHttpTransportManager.HttpContextReceived(System.ServiceModel.Activation.HostedHttpRequestAsyncResult)+0xc6

0216803c 5094defd System_ServiceModel_ni!System.ServiceModel.Activation.HostedHttpRequestAsyncResult.HandleRequest()+0xdb

01fbf60c 5094dea5 System_ServiceModel_ni!System.ServiceModel.Activation.HostedHttpRequestAsyncResult.BeginRequest()+0x19

01fbf638 50903be6 System_ServiceModel_ni!System.ServiceModel.Activation.HostedHttpRequestAsyncResult.OnBeginRequest(System.Object)+0x29

01fbf674 50903b26 System_ServiceModel_ni!System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke2()+0x36

01fbf688 50903ab5 System_ServiceModel_ni!System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+WorkItem.Invoke()+0x4a

01fbf6bc 5090390f System_ServiceModel_ni!System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.ProcessCallbacks()+0x185

01fbf6e8 5090388b System_ServiceModel_ni!System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper.CompletionCallback(System.Object)+0x6f

01fbf720 50232e1f System_ServiceModel_ni!System.ServiceModel.Channels.IOThreadScheduler+CriticalHelper+ScheduledOverlapped.IOCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)+0xf

01fbf720 79405534 SMDiagnostics_ni!System.ServiceModel.Diagnostics.Utility+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)+0x2f

01fbf744 79e7c74b mscorlib_ni!System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)+0x60

01fbf758 79e7c6cc mscorwks!LogHelp_TerminateOnAssert+0x3433

01fbf7d8 79f00eca mscorwks!LogHelp_TerminateOnAssert+0x33b4

01fbf7f8 79f00e75 mscorwks!GetCLRFunction+0x1de16

Why is this? Is this a scalability issue for WCF? I will explain this below.

Asynchronous Request Processing

One of WCF's most significant characteristics is that it is asynchronous from bottom to top, across the whole channel stack and the programming model. The advantage is apparent: applications gain flexibility by issuing operations asynchronously, doing other work in parallel, and coming back to handle the operations once they complete. This also scales well when the asynchronous operations are I/O bound, because no extra threads are held waiting for operations to complete.

ASP.NET also supports asynchronous request processing through IHttpModule and IHttpAsyncHandler. With these interfaces, ASP.NET can push many concurrent requests up to layers that have asynchronous implementations. In this way, the whole stack needs only a small number of threads to process a large number of concurrent requests, given appropriate queuing for the requests.

However, this exposes a memory problem on IIS 6.0. When there is an unlimited number of incoming requests, ASP.NET would push all of them up to the upper layers, causing significant memory consumption. This could in turn leave the server unable to do its work correctly.

So in .NET 3.0 and 3.5, WCF implemented synchronous versions of the HTTP module and handler instead of asynchronous ones. The following sections give more details.

ASP.NET Threading on IIS 6.0

When ASP.NET is hosted in IIS 6.0, requests are handed over to ASP.NET on IIS I/O threads. ASP.NET uses CLR ThreadPool worker threads to handle requests. The CLR ThreadPool automatically adjusts the number of threads based on the server workload. To avoid using too many threads or too much memory, ASP.NET limits the number of threads concurrently executing requests. This is controlled by the httpRuntime/minFreeThreads and httpRuntime/minLocalRequestFreeThreads settings. If the limit is exceeded, the request is queued in the application-level queue and executed later, when the concurrency falls back below the limit. Thomas Marquardt has written an excellent blog post on this:

https://blogs.msdn.com/tmarq/archive/2007/07/21/asp-net-thread-usage-on-iis-7-0-and-6-0.aspx (TM0707)

In .NET 2.0, the magic config “processModel/autoConfig” in machine.config automatically configures the ASP.NET threading settings for most applications. Applications that require special settings need to be configured differently. The following KB article provides general guidance:

https://support.microsoft.com/kb/821268 (KB821268)

However, there is no throttling for concurrent requests in ASP.NET on IIS 6.0.

WCF Request Throttling and Tuning

Because ASP.NET lacks throttling for concurrent requests, WCF would receive unlimited concurrent requests pushed up from ASP.NET if it had implemented an asynchronous HTTP module/handler on IIS 6.0. This would cause a severe memory usage problem. Because of this concern, WCF implemented a synchronous HTTP module/handler in 3.0 and 3.5 instead.

Here I only describe how the HTTP module works for WCF. The logic for the HTTP handler is similar. When a request comes in from ASP.NET, WCF grabs the request if it is a WCF request (by checking whether the BuildProvider registered for the extension is the WCF service BuildProvider). After that, it immediately switches to a new I/O thread and holds the ASP.NET worker thread until the request is completed. You may ask: why doesn't WCF simply use the ASP.NET worker thread to handle the request until it is completed? Here are the reasons:

· As long as the ASP.NET worker thread is not returned to ASP.NET, the request cannot be completed, and the client stays blocked.

· When the WCF operation is one-way, the request has to be completed before the service operation is invoked, so the ASP.NET worker thread has to return earlier.

· WCF lets a service operation complete a two-way operation early by calling RequestContext.Close.

All of these require that WCF not hold the ASP.NET worker thread once the request is considered “completed”. At the same time, WCF has to hold the ASP.NET worker thread in a waiting state until the request is completed, because of the synchronous request processing logic. WCF relies on the ASP.NET thread throttling logic to limit concurrent requests.

Intuitively, this looks like a scalability issue. Fortunately, for most WCF services, especially high-throughput ones, the impact is quite low as long as the server is busy processing requests, which shows up in the CPU usage of the server.
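The two-thread pattern described above can be illustrated with a small console sketch. This is not WCF's implementation: the class and method names are hypothetical, and the CLR thread pool stands in for the WCF IOThreadScheduler. It only shows one thread handing the work to a second thread and then blocking on a wait handle until that work completes.

```csharp
using System;
using System.Threading;

public static class TwoThreadsPerRequest
{
    // Stand-in for the ASP.NET worker thread path: hand the operation to
    // another thread, then block until that thread signals completion.
    public static string ExecuteSynchronous(Func<string> serviceOperation)
    {
        string result = null;
        Exception error = null;

        using (ManualResetEvent completed = new ManualResetEvent(false))
        {
            // Second thread: runs the service operation.
            ThreadPool.QueueUserWorkItem(delegate
            {
                try { result = serviceOperation(); }
                catch (Exception ex) { error = ex; }
                finally { completed.Set(); }
            });

            // First thread: held in a waiting state until the request completes.
            completed.WaitOne();
        }

        if (error != null)
        {
            throw new InvalidOperationException("The operation failed.", error);
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(ExecuteSynchronous(() => "Hello, world"));
    }
}
```

While the operation runs, two threads are occupied for a single request, which is why the number of threads ASP.NET allows effectively becomes the limit on concurrent WCF requests.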

On multi-proc/multi-core servers, the default limit of concurrent requests may not be enough for the server to achieve best throughput. This would require larger limits for ASP.NET worker threads to be used to handle requests.

Throttling Tuning

To achieve high throughput on high-end servers, you would need to tune throttling as follows:

· Increase ASP.NET thread throttling to allow more concurrent worker threads to handle requests

· Increase WCF service throttling to allow more concurrent requests
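For the second item, WCF service throttling is normally set through the serviceThrottling service behavior. The fragment below is a sketch only: the behavior name is hypothetical and the values are illustrative, not recommendations.

```xml
<system.serviceModel>
  <behaviors>
    <serviceBehaviors>
      <!-- "ThrottledBehavior" is a hypothetical name; reference it from the
           service element's behaviorConfiguration attribute. -->
      <behavior name="ThrottledBehavior">
        <!-- Illustrative values only; tune them against measured load. -->
        <serviceThrottling maxConcurrentCalls="200"
                           maxConcurrentSessions="200"
                           maxConcurrentInstances="200" />
      </behavior>
    </serviceBehaviors>
  </behaviors>
</system.serviceModel>
```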

The second item means raising the well-known WCF throttling settings such as MaxConcurrentCalls, MaxConcurrentInstances, and MaxConcurrentSessions. The first item requires fine-tuning ASP.NET thread throttling as documented in KB821268, mentioned above. Basically, ASP.NET does not execute more than the following number of concurrent requests:

(maxWorkerThreads*number of CPUs)-minFreeThreads

This is 12 by default on a single-CPU machine. To get more threads to process requests, you need to increase maxWorkerThreads. Once the CPU can be fully loaded with these settings, setting “minWorkerThreads” to 2 or 3 would allow WCF to achieve the best throughput. Here is a sample setting in machine.config:

<processModel autoConfig="false" maxWorkerThreads="500" maxIoThreads="500" minWorkerThreads="2"/>

<httpRuntime minFreeThreads="250" minLocalRequestFreeThreads="250"/>
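As a worked example of the formula above, the following sketch evaluates it for the defaults and for the sample settings. The class and method names are hypothetical, and the default values used (maxWorkerThreads of 20 and minFreeThreads of 8) are the ones assumed from KB821268; they are consistent with the figure of 12 quoted above.

```csharp
using System;

public static class AspNetThreadLimit
{
    // (maxWorkerThreads * number of CPUs) - minFreeThreads
    public static int MaxConcurrentRequests(int maxWorkerThreads, int cpuCount, int minFreeThreads)
    {
        return maxWorkerThreads * cpuCount - minFreeThreads;
    }

    public static void Main()
    {
        // Assumed defaults on one CPU: 20 * 1 - 8
        Console.WriteLine(MaxConcurrentRequests(20, 1, 8));     // 12

        // Sample settings above on one CPU: 500 * 1 - 250
        Console.WriteLine(MaxConcurrentRequests(500, 1, 250));  // 250

        // Same sample settings on four CPUs: 500 * 4 - 250
        Console.WriteLine(MaxConcurrentRequests(500, 4, 250));  // 1750
    }
}
```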

The above settings work for most WCF applications. However, when the service has very slow, I/O-bound operations (for example, operations that connect to a backend database, the network, or the file system to perform high-latency work), you would need many outstanding ASP.NET worker threads to process requests in parallel. This is bad when the number of requests is huge, for example, several thousand or more. Each thread holds a fair amount of system resources, memory in particular, so the service would become very slow to respond. Solving this problem requires a truly asynchronous design. Because of this, ASP.NET introduced a new throttling logic for concurrent requests in IIS 7.0 to support this scenario for WCF.

ASP.NET Throttling on IIS 7.0 in Integrated Mode

As Thomas described in the blog entry mentioned above (TM0707), the following registry value was introduced to throttle concurrent requests:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ASP.NET\2.0.50727.0]

"MaxConcurrentRequestsPerCpu"=dword:0000000c

The default value is 12 per CPU, which is similar to the default setting for ASP.NET worker threads. This was introduced in ASP.NET 2.0 SP1 (shipped with .NET 3.5) and thus it is available in Windows Server 2008. It only applies to the IIS 7.0 Integrated mode, which is exactly where WCF is registered on Windows Server 2008.

Scalable Solutions for High Latency Services

Though the above throttling support was added to Windows Server 2008, WCF still uses a synchronous module/handler instead of asynchronous ones in the released versions due to timing constraints. However, the internal WCF design does allow asynchronous request processing, and it is just one step away from truly asynchronous logic. Here I provide a simple wrapper for that.

Custom HttpModule Using Reflection

The internal WCF types support an asynchronous HTTP module. Since they are internal, we have to rely on reflection to create a real one.

First, you would need to register the async events for your HttpModule as follows:

            context.AddOnPostAuthenticateRequestAsync(

                MyHttpModule.beginEventHandler,

                MyHttpModule.endEventHandler);

This is registered to the PostAuthenticateRequest for the same reasons as the default WCF HttpModule.

Secondly, you would need to use reflection to create the instance of the internal WCF type System.ServiceModel.Activation.HostedHttpRequestAsyncResult:

Type hostedHttpRequestAsyncResultType = typeof(ServiceHostingEnvironment).Assembly.GetType("System.ServiceModel.Activation.HostedHttpRequestAsyncResult");

ConstructorInfo hostedHttpRequestAsyncResult = hostedHttpRequestAsyncResultType.GetConstructor(BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.Public,

null, new Type[] { typeof(HttpApplication), typeof(bool), typeof(AsyncCallback), typeof(object) }, null);

return (IAsyncResult)hostedHttpRequestAsyncResult.Invoke(new object[] { application, false, cb, extraData });
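Putting the two fragments together, a module along these lines might look like the sketch below. It is not the attached sample. The constructor lookup follows the snippet above, but the static End(IAsyncResult) method on the internal type is an assumption that should be verified against the installed System.ServiceModel assembly. The sketch also does not reproduce the default module's check that a request actually targets a WCF service, and reflection on non-public members requires sufficient trust.

```csharp
using System;
using System.Reflection;
using System.ServiceModel;
using System.Web;

namespace AsyncWcfModule
{
    // Sketch of an asynchronous replacement for the default WCF HttpModule.
    public class MyHttpModule : IHttpModule
    {
        // Internal WCF type located via reflection; it is not a public API.
        static readonly Type asyncResultType =
            typeof(ServiceHostingEnvironment).Assembly.GetType(
                "System.ServiceModel.Activation.HostedHttpRequestAsyncResult", true);

        // Constructor signature taken from the snippet above.
        static readonly ConstructorInfo asyncResultCtor = asyncResultType.GetConstructor(
            BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance,
            null,
            new Type[] { typeof(HttpApplication), typeof(bool), typeof(AsyncCallback), typeof(object) },
            null);

        // ASSUMPTION: the internal type exposes a static End(IAsyncResult) method.
        static readonly MethodInfo asyncResultEnd = asyncResultType.GetMethod(
            "End",
            BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Static,
            null,
            new Type[] { typeof(IAsyncResult) },
            null);

        static readonly BeginEventHandler beginEventHandler = BeginProcessRequest;
        static readonly EndEventHandler endEventHandler = EndProcessRequest;

        public void Init(HttpApplication context)
        {
            // Same pipeline event as the default WCF HttpModule.
            context.AddOnPostAuthenticateRequestAsync(beginEventHandler, endEventHandler);
        }

        public void Dispose()
        {
        }

        static IAsyncResult BeginProcessRequest(
            object sender, EventArgs e, AsyncCallback cb, object extraData)
        {
            HttpApplication application = (HttpApplication)sender;

            // Returns immediately; the ASP.NET worker thread is not held.
            return (IAsyncResult)asyncResultCtor.Invoke(
                new object[] { application, false, cb, extraData });
        }

        static void EndProcessRequest(IAsyncResult ar)
        {
            // Exceptions thrown by the internal method surface here wrapped
            // in a TargetInvocationException.
            asyncResultEnd.Invoke(null, new object[] { ar });
        }
    }
}
```

The namespace and class name match the type string used in the web.config registration for this module.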

Once you have the custom HttpModule, you can use it to replace the default one, which is registered in the WAS configuration %windir%\system32\inetsrv\config\applicationhost.config:

  <system.webServer>

    <modules>

      <add name="ServiceModel" type="System.ServiceModel.Activation.HttpModule, System.ServiceModel, Version=3.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" preCondition="managedHandler" />

    </modules>

  </system.webServer>

To do that, you can also define local settings in web.config:

  <system.webServer>

    <modules>

      <remove name="ServiceModel"/>

      <add name="MyAsyncWCFHttpModule" type="AsyncWcfModule.MyHttpModule, AsyncWcfModule" preCondition="managedHandler" />

    </modules>

    <validation validateIntegratedModeConfiguration="false" />

  </system.webServer>

The sample code is attached.

Improvement in .NET 3.5 SP1

In .NET 3.5 SP1, we will provide the real asynchronous HTTP module/handler for WCF to better solve this problem.