BizTalk Server issue affecting the # of running orchestrations in R2

Recently I noticed a trend while working on several performance cases that I would like to share with the blogosphere. In one case in particular, the problem was that, under stress, many hundreds of BizTalk orchestrations were running at one time. BizTalk receive locations can typically bring messages into the MessageBox faster than orchestration instances can complete, so when several thousand messages were brought into the system, we were seeing several hundred running orchestrations. This was confirmed using the performance counter XLANG/s Orchestrations - Running Orchestrations. The net effect was that the large # of running orchestrations was causing problems downstream on the send hosts and also in the orchestration hosts.

In the old days of BizTalk 2004 on a dual processor machine, we'd never see more than 40 running orchestrations. Those 40 total orchestrations were capped by the highwatermark settings in the adm_serviceclass table in the management database. In fact, I used to tune this setting in BizTalk 2004 and it worked like a charm.

In this case, however, the BizTalk Server had 4 quad core processors, meaning that 16 * 20 or 320 orchestrations could potentially be running at one time. We wanted to control this, so we tuned the highwatermark settings down to 2, expecting that this would get us no more than 16 * 2 = 32 running orchestrations. Wrong. We had over 400 running orchestrations under stress.
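To make the expectation explicit, here is a minimal Python sketch of the arithmetic we were relying on. It assumes the cap is simply the number of processors times the per-processor highwatermark value. The function name `expected_cap` is made up for illustration and is not part of any BizTalk API.

```python
def expected_cap(processors: int, high_watermark: int) -> int:
    """Cap we expected under the assumed model: one allowance per processor."""
    return processors * high_watermark

# Dual processor BizTalk 2004 machine at 20 per processor
print(expected_cap(2, 20))    # 40
# 4 quad core processors = 16 procs at 20 per processor
print(expected_cap(16, 20))   # 320
# Same machine with the highwatermark tuned down to 2
print(expected_cap(16, 2))    # 32
```

The last figure is the one that did not hold: the counter showed over 400 running orchestrations instead of 32.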

So after some more research I found that the maximum # of running orchestrations in BizTalk 2006 and R2 is in fact controlled by the MaxWorkerThreads setting in the CLR Hosting key in the registry for each host instance service. That is, if the values are set.

 

So, following the documentation, we figured that with 16 procs, if we wanted to minimize the # of running orchestrations, we would set MaxWorkerThreads to 2 and get a maximum of 32 running orchestrations.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc$BiztalkServerApplication\CLR Hosting]

"MaxWorkerThreads"=dword:00000002

 

What happened on host instance restart? The host instance failed to start. That's right: the minimum value we could set was 16, which also happens to be the # of processors (procs * cores).
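One thing that is easy to misread: .reg files write dword values in hexadecimal, so 16 shows up as dword:00000010. Here is a small Python sketch for converting in both directions. The helper names `to_reg_dword` and `from_reg_dword` are made up for this post, and they only handle the simple `dword:xxxxxxxx` form used here.

```python
def to_reg_dword(value: int) -> str:
    """Format a non-negative integer the way a .reg file writes a dword."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a dword must fit in 32 bits")
    return f"dword:{value:08x}"

def from_reg_dword(text: str) -> int:
    """Parse a 'dword:xxxxxxxx' string back into an integer."""
    prefix = "dword:"
    if not text.startswith(prefix):
        raise ValueError("expected a value starting with 'dword:'")
    return int(text[len(prefix):], 16)

print(to_reg_dword(2))                   # dword:00000002
print(to_reg_dword(16))                  # dword:00000010
print(from_reg_dword("dword:00000032"))  # 50
print(from_reg_dword("dword:00000019"))  # 25
```

So the 00000032 and 00000019 values recommended at the end of this post are 50 and 25 in decimal.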

 

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc$BiztalkServerApplication\CLR Hosting]

"MaxWorkerThreads"=dword:00000010

So what do you think happened when we set MaxWorkerThreads to 16 on a machine with 16 processors? 16 * 16 = 256 running orchestrations, right?

No, we had 16 running orchestrations. That's as high as it got, and the system performed much better than it had with 400 running orchestrations. The important thing this confirmed is that the limit is not # of processors * MaxWorkerThreads, but simply MaxWorkerThreads, which tells the host instance how many orchestrations it can run.
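Put side by side, the two models look like this. This is only a sketch of what we saw on this one 16 processor machine: the function names are hypothetical, and the observed value comes from the Running Orchestrations performance counter, not from any API.

```python
PROCESSORS = 16
MAX_WORKER_THREADS = 16
OBSERVED_RUNNING = 16  # peak of the Running Orchestrations counter in our test

def per_processor_model(processors: int, max_worker_threads: int) -> int:
    """What we expected: the setting multiplied by the processor count."""
    return processors * max_worker_threads

def flat_model(max_worker_threads: int) -> int:
    """What we observed: the setting alone is the cap."""
    return max_worker_threads

print(per_processor_model(PROCESSORS, MAX_WORKER_THREADS))  # 256
print(flat_model(MAX_WORKER_THREADS))                       # 16
print(flat_model(MAX_WORKER_THREADS) == OBSERVED_RUNNING)   # True
```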

As it turns out, there is an issue with the 3.5 framework when installed on a BizTalk R2 server: the MaxWorkerThreads default value has been raised from 25 in previous versions to 250 in the 3.5 framework.

.NET 2.0:
https://msdn.microsoft.com/en-us/library/system.threading.threadpool(VS.80).aspx

.NET 3.5:
https://msdn.microsoft.com/en-us/library/system.threading.threadpool.aspx

The moral of the story:

On an orchestration host, make sure you set the values below so that the # of running orchestrations does not grow unbounded. Do it even if the 3.5 framework is not installed.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\BTSSvc$BiztalkServerApplication\CLR Hosting]
"MaxIOThreads"=dword:00000032
"MaxWorkerThreads"=dword:00000032
"MinIOThreads"=dword:00000019
"MinWorkerThreads"=dword:00000019