Throttling, generally speaking, is tricky. Set the limits too low and you become prone to DoS, with clients timing out while trying to connect to your service in vain; set them too high and you may end up with an overloaded service that eats up machine resources until it crashes. There's a sweet spot in between that gives you optimum throughput and high availability at the same time.
The ServiceThrottlingBehavior in WCF lets you modify three important settings that you should consider tweaking to suit your application and resources: MaxConcurrentCalls, MaxConcurrentInstances, and MaxConcurrentSessions. Since many considerations go into choosing these values, and since they may vary from one machine to another in a production environment, it's recommended that you set these limits in your application's configuration file (app.config or web.config). Here's an example:
<service name="SampleService" behaviorConfiguration="Throttled">
  <endpoint address="" binding="wsHttpBinding" contract="ISampleService" />
  <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange" />
</service>
<!-- in the <serviceBehaviors> section: -->
<behavior name="Throttled">
  <serviceThrottling maxConcurrentCalls="200" maxConcurrentSessions="200" />
</behavior>
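If you prefer (or need) to apply the throttle in code, for example in a self-hosted service, the same behavior can be attached programmatically. A minimal sketch, assuming a self-hosted SampleService like the one in the configuration example:

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

class Program
{
    static void Main()
    {
        using (var host = new ServiceHost(typeof(SampleService)))
        {
            // Reuse the throttling behavior if it was already added via config;
            // otherwise create and attach a new one before opening the host.
            var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
            if (throttle == null)
            {
                throttle = new ServiceThrottlingBehavior();
                host.Description.Behaviors.Add(throttle);
            }

            throttle.MaxConcurrentCalls = 200;
            throttle.MaxConcurrentSessions = 200;

            host.Open();
            // ... keep the host running ...
        }
    }
}
```

Note that the behavior must be configured before host.Open() is called; after the host is open, the description is immutable.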
As you may have noticed, I didn't change the value of MaxConcurrentInstances and accepted the default, which is 26. That's because I set InstanceContextMode to Single, which means there will be only one service instance. All calls are handled by this single instance, and that can be a problem if ConcurrencyMode is left at Single (the default value for this property): the service is then single-threaded and can't handle more than one call at a time, so other calls have to wait their turn or time out. To avoid this problem, I set ConcurrencyMode to Multiple, which allows the service to be multithreaded. Multithreading comes with the usual responsibilities at design time (maintaining state consistency and avoiding synchronization problems) and at runtime (throttling the service correctly so that it doesn't create so many threads that it eats up machine resources).
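The combination described above can be sketched as follows (the service and contract names follow the hypothetical ones from the configuration example):

```csharp
using System.ServiceModel;

[ServiceContract]
public interface ISampleService
{
    [OperationContract]
    string Ping();
}

// A single shared instance serves all calls, and multiple threads may call
// into it concurrently — so any shared state inside the service must be
// synchronized by you.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class SampleService : ISampleService
{
    public string Ping()
    {
        return "pong";
    }
}
```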
A reasonable value for MaxConcurrentInstances depends on the InstanceContextMode of the service:
- Single: N/A (there is only one instance, so the setting is irrelevant)
- PerSession: equal to MaxConcurrentSessions
- PerCall: the estimated number of concurrent calls (average number of sessions * average number of calls per session)
Estimating the number of sessions, in turn, depends on the SessionMode of the service contract:
- Allowed: Specifies that the contract supports sessions if the incoming binding supports them. This one is tricky because it depends on the client: whether it establishes a "sessionful" binding or not. You need to estimate the mix of clients that use one session for multiple calls versus clients that create a new session for each call.
- Required: Specifies that the contract requires a sessionful binding; an exception is thrown if the binding is not configured to support sessions. This one is self-explanatory.
- NotAllowed: Specifies that the contract never supports bindings that initiate sessions. Clients effectively create a new session for each call; thus, the number of sessions equals the number of calls.
By default, all operations initiate sessions (according to the SessionMode of the service contract) but none terminates them; hence, the first call always initiates a session. If MaxConcurrentSessions is 100 and your clients don't terminate their sessions, your service will handle only 100 sessions and all subsequent messages will time out. A client can close the session in one of the following ways:
- Closing the client proxy or channel (calling Close()), which ends the session gracefully.
- Letting the session time out, after which the service aborts the channel.
- Calling a terminating operation specified by the service contract (using the IsTerminating property).
The client should be a good citizen and always close the connection, even if the operation is terminating. The advantage of a terminating operation is that the termination is enforced by the service contract: even if the client doesn't behave as expected, the service sets a timer and aborts the client's channel after a certain period. Note that setting the IsTerminating property to true on an operation contract requires the SessionMode of the service contract to be set to Required.
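A minimal sketch of such a contract (the interface name follows the earlier example; the operation names are hypothetical):

```csharp
using System.ServiceModel;

// SessionMode.Required is mandatory when any operation sets IsTerminating = true.
[ServiceContract(SessionMode = SessionMode.Required)]
public interface ISampleService
{
    [OperationContract(IsInitiating = true)]   // the default; shown for clarity
    void DoWork();

    // Calling this operation ends the session; the service refuses further
    // calls on the same channel afterwards.
    [OperationContract(IsTerminating = true)]
    void EndSession();
}
```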
The default values of MaxConcurrentSessions and MaxConcurrentCalls are 10 and 16 respectively (note that the default MaxConcurrentInstances, 26, is simply their sum). The higher you set these values, the higher the throughput will be. You will need to understand how resource consumption scales at higher throughput rates and the correlation between the two (exponential growth, for example, means you have a problem). The nature of the operations your service executes plays a big role too (whether they are I/O-intensive or CPU-intensive). On the other hand, leaving the values low makes your service prone to DoS attacks, or to mistakes like clients not closing their sessions. IMO, the following will help you find that sweet spot:
- Understanding the design and nature of the operations
- Stress testing
- Dogfooding and/or beta testing in a pre-production environment