Worker thread governance coming to Azure SQL Database

NOTE: As of 1/10, this change is now live in our datacenters.

Starting with the service update that went out recently, soft throttling on worker threads is changing. Over the next few months, soft throttling will be replaced by worker thread governance. In the meantime, users may see requests fail due to throttling on worker threads (error 40501) or worker thread governance (errors 10928 and 10929). The retry logic in your application should be modified to handle these errors. Please see https://go.microsoft.com/fwlink/?LinkId=267637 for more information on this topic.

While we roll out the new worker thread governance mechanism to all datacenters, users may see requests fail for one of two reasons: throttling on worker threads (40501) or worker thread governance (new error codes 10928 and 10929; see table below). During this time, we recommend modifying the retry logic in your application to handle both the throttling error code (40501) and the governance error codes (10928, 10929) for worker threads.
Please review the information below and modify your applications as required. Once worker thread governance is fully rolled out in all datacenters and soft throttling for worker threads has been disabled, we will notify users.

Please note that 40501 errors caused by hard throttling on worker threads, or by throttling on other resources, will continue to occur as before. Please ensure that your error-handling logic continues to handle these 40501 errors.
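As a rough illustration of retry logic that handles all three codes, here is a minimal Python sketch. It assumes your database driver raises an exception that exposes the numeric error code; the `SqlError` class, the `run_with_retry` helper, and its parameters are hypothetical names used only for this example and are not part of any driver API. The 10-second delay follows the back-off stated in the 40501 message.

```python
import time

# Error codes named in this post; this sketch treats all three as retryable.
THROTTLING_ERROR = 40501            # soft or hard throttling
GOVERNANCE_ERRORS = {10928, 10929}  # worker thread governance
RETRYABLE_ERRORS = {THROTTLING_ERROR} | GOVERNANCE_ERRORS


class SqlError(Exception):
    """Hypothetical stand-in for a driver exception that carries an error number."""

    def __init__(self, number, message=""):
        super().__init__(message)
        self.number = number


def run_with_retry(operation, max_attempts=4, delay_seconds=10, sleep=time.sleep):
    """Call operation(); on a retryable error, wait and then try again.

    Any other error, or a retryable error on the final attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SqlError as err:
            if err.number not in RETRYABLE_ERRORS or attempt == max_attempts:
                raise
            sleep(delay_seconds)
```

In a real application you would map your driver's own exception type onto this check. The `sleep` parameter is there only so the delay can be replaced in tests.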

Current mechanism: Worker thread throttling
New mechanism: Worker thread governance (coming soon)

Description

Current mechanism: When the soft throttling limit for worker threads on a machine is exceeded, the database with the highest requests per second is throttled. Existing connections to that database are terminated if new requests are made on those connections, and new connections to the database are denied, until the number of workers drops below the soft throttling limit. The soft throttling limit per back-end machine is currently 305 worker threads.

New mechanism: Every database will have a maximum worker thread concurrency limit. Please note that this limit is only a maximum cap, and there is no guarantee that a database will get threads up to this limit if the system is too busy. Requests can be denied for existing connections in the following cases:
1. If the maximum worker thread concurrency limit for the database is reached, the user will receive error code 10928.
2. If the system is too busy, it is possible that even fewer workers are available for the database, and the user will receive error code 10929. This is expected to be a rare occurrence.

Error returned

Current mechanism:
40501: The service is currently busy. Retry the request after 10 seconds. Incident ID: <ID>. Code: <code>.

New mechanism:
10928: Resource ID: %d. The %s limit for the database is %d and has been reached. See https://go.microsoft.com/fwlink/?LinkId=267637 for assistance.
10929: Resource ID: %d. The %s minimum guarantee is %d, maximum limit is %d and the current usage for the database is %d. However, the server is currently too busy to support requests greater than %d for this database. See https://go.microsoft.com/fwlink/?LinkId=267637 for assistance. Otherwise, please try again later.
Resource ID in both error messages indicates the resource for which the limit has been reached. For worker threads, Resource ID = 1.

Recommendation

Current mechanism: Back off and retry the request after 10 seconds; see best practices.

New mechanism:
10928: Check sys.dm_exec_requests to view which user requests are currently executing.
10929: Back off and retry the request after 10 seconds; see best practices.

Note: The hard throttling mechanism for worker threads is not being changed and will continue to return a 40501 error to user applications.
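To make the recommendations above concrete, the following Python sketch maps an error number and its message text to the suggested action, and reads the Resource ID from the governance messages to tell whether the limit concerns worker threads. The function name `describe_error` and the action strings are illustrative assumptions; only the error codes, the "Resource ID: %d" message prefix, and the meaning of Resource ID = 1 come from this post.

```python
import re

WORKER_THREADS_RESOURCE_ID = 1  # per this post: for worker threads, Resource ID = 1

# Both governance messages (10928 and 10929) begin with "Resource ID: %d."
RESOURCE_ID_PATTERN = re.compile(r"Resource ID: (\d+)\.")


def describe_error(number, message):
    """Return (about_worker_threads, action) for an error number and message.

    about_worker_threads is True or False when a Resource ID can be read from
    a governance message, and None otherwise (for example, for 40501).
    """
    about_worker_threads = None
    if number in (10928, 10929):
        match = RESOURCE_ID_PATTERN.search(message)
        if match:
            about_worker_threads = int(match.group(1)) == WORKER_THREADS_RESOURCE_ID

    if number == 10928:
        action = "check sys.dm_exec_requests for currently executing requests"
    elif number in (10929, 40501):
        action = "back off and retry after 10 seconds"
    else:
        action = "not covered by this post"
    return about_worker_threads, action
```

An application could log the returned action alongside the raw message, so that a 10928 error prompts someone to inspect the running requests rather than only retrying.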