Thoughts on Orchestration Performance

My First Post

It’s been a while since I told Mike that I’d be putting up a post. I hesitated to write after reading all the deep technical content Mike has been writing. I have always envisioned my role as random ranting about daily routines in BizTalk CPR. I will work towards that, but one has to start somewhere.

 

 

============================================================================

Thoughts on Orchestration Performance

============================================================================

 

I’ve promised Mandi a KB on Orchestration Performance for a while now (Sorry Mandi). The draft was done until my HD crashed without a valid backup. Good thing I have the main points imprinted in my brain. I will post the ideas here first so I can get some feedback on the validity of my memory imprint.

 

First, why do we want to talk about Orchestration performance? From a support standpoint, we get a lot of cases on performance. Fine-tuning performance is difficult and time consuming once a solution is in production. Also, no matter how experienced the support engineer is, he/she will not know your design better than you do. The best thing for a BizTalk developer/administrator is to address these common issues before going live with any solution.

 

Emphasis on Test

Given the complexity of Orchestration designs and the permutations of hardware components, it is nearly impossible for anyone to predict performance. We do get cases from time to time from customers who want to hear our best-effort estimate. The keyword here is ‘estimate’. The only way to find out for certain how much load your hardware can handle, and whether your design has potential bottlenecks, is to run proper stress tests. Both the Microsoft BizTalk Server Operations Guide and the upcoming BizTalk Performance Optimization Guide outline the proper testing methodology.

 

Host Level Isolation

It is a good idea to isolate your Orchestrations in a separate host. Not only does the separate host instance provide you with dedicated process-level resources for your Orchestrations, it also allows more specific fine-tuning of the host-level throttling configuration. If your Orchestration host is experiencing high memory utilization, you can consider separating your Orchestrations further into more hosts. Of course, you should do this with an understanding of your hardware capacity. There is no point adding more host instances if you are reaching the limit of what your hardware can handle.

There are many ways to organize your BizTalk artifacts with regard to hosts. You can certainly do it by functionality or operation, so that if you have to temporarily bring down a host, it’ll affect only a single functionality, not all of your workflows. Also, if you know ahead of time that two Orchestrations are designed to run concurrently, they can benefit from being in different hosts instead of competing for resources in the same host instance.

 

Throttling

Whenever you feel your Orchestration processing is slowing down, you should first take a look at the Performance Counters to see if throttling is taking place. The counters you want to keep track of are under the BizTalk:Message Agent performance object. If publishing or delivery throttling is taking place for a host, you will get a non-zero state for the respective counter. You can then look up the state code in the BizTalk Documentation. Please consult Host Throttling Performance Counters for more information on these counters.
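
As a rough illustration of that first check, the sketch below takes throttling-state readings you have already collected (for example, exported from Performance Monitor) and lists the hosts reporting a non-zero state. The sample layout, the host names, and the function name are assumptions made up for this example. It does not read the counters itself and it does not interpret the state codes, which you still need to look up in the documentation.

```python
from typing import Dict, List, Tuple

# Hypothetical sample layout: host name -> (delivery state, publishing state).
# A value of 0 means no throttling; any other value is a state code to look
# up in the BizTalk documentation.
Samples = Dict[str, Tuple[int, int]]


def throttled_hosts(samples: Samples) -> List[str]:
    """Return one readable line per non-zero throttling state, sorted by host."""
    findings = []
    for host, (delivery, publishing) in sorted(samples.items()):
        if delivery != 0:
            findings.append(f"{host}: delivery throttling state {delivery}")
        if publishing != 0:
            findings.append(f"{host}: publishing throttling state {publishing}")
    return findings


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    example = {
        "OrchestrationHost": (0, 0),
        "SqlSendHost": (0, 4),
    }
    for line in throttled_hosts(example):
        print(line)
```

Checking every host, not just the Orchestration host, matters because a throttled adapter host can hold up your Orchestrations too.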

Please use the BizTalk Documentation (How to Modify Default Host Throttling Settings) as a guide when you modify throttling behavior. Process Memory Usage may be one setting you want to consider changing when you go from development to test/QA or production. By default, Process Memory Usage is set to 25%. When the address space of a host process is 2GB, throttling may kick in once process memory usage crosses roughly 512MB, which may be rare in development but not in production. For a 32-bit host, do not configure your host to use more than 75% (about 1536MB of a 2GB address space). If you have a 64-bit host, you can change the setting to 100% so the BizTalk host will not throttle until it uses more than 2GB of memory.
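
The arithmetic behind those numbers is simple enough to sketch. The function name is made up, and the sketch assumes the percentage is applied to the 2GB (2048MB) address space discussed above, which is a simplification of how the host actually evaluates memory pressure.

```python
def throttle_threshold_mb(address_space_mb: float, percent: float) -> float:
    """Memory level (in MB) at which process-memory throttling would start.

    Assumes the Process Memory Usage percentage is applied directly to the
    given address space size.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return address_space_mb * percent / 100.0


if __name__ == "__main__":
    two_gb = 2048  # MB
    for percent in (25, 75, 100):
        print(f"{percent}% of 2GB = {throttle_threshold_mb(two_gb, percent):.0f} MB")
```

Running the numbers for your own host before changing the setting makes it easier to compare the threshold against what you saw during stress testing.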

One thing to remember is that your Orchestration host may not be the one that is throttling, yet Orchestration performance is still affected. For example, your Orchestration may be bound to a SQL send port. If the host that is running the SQL send adapter is being throttled, messages cannot be published and your Orchestration may be held up as a result. As a general rule, if you have throttling, you should investigate the cause and determine if you can do something about it.

 

Messages/Variables scope

We often see cases where Orchestration instances use a large amount of memory. One thing you should watch during development is the scope of your messages and variables. Any message or variable can either be defined at the Orchestration level or bound to a scope. It is similar to a global variable versus a local variable in any programming language. While the Orchestration engine does have logic to invoke garbage collection on messages that are not used later on in the Orchestration, you don’t have absolute control over when GC kicks in. With large messages or object types, it makes sense to define them in the scope where they will be used. Furthermore, if you load large objects into variables, such as a large XML document into an XmlDocument, it is good practice to clean up as soon as you are done with them.

When you are running your tests, keep an eye on the memory footprint. These issues should be resolved during the design or test stage. Once you are in production, there is no easy way for us to know what is loaded in memory without getting a memory dump. That is time consuming and you’ll end up with downtime while someone gathers and analyzes the memory dump.

More information on how to troubleshoot out-of-memory exceptions with BizTalk Server:

https://support.microsoft.com/kb/918643

 

Persistence Points

For longer running Orchestrations to maintain state information, the Orchestration engine saves instance state to the BizTalkMsgBoxDB at various persistence points.

Guaranteed persistence points are:

· Send Shape (after a message is sent)

· Start Orchestration Shape

· Suspend Shape

· End of a Transactional Scope (atomic or long-running)

· An Orchestration Debugger breakpoint is hit

· Orchestration Engine determines that the instance needs to be dehydrated

· When the Orchestration Engine is shut down, whether through a controlled shutdown of the host or abnormal circumstances. In this case, the engine tries to persist, but if that fails, the Orchestration instance will resume from the last successful persistence point.

Receive Shape, Listen Branch, and Delay Shape are conditional persistence points. Receive Shape dehydration behavior is determined by the history of how long the Orchestration Engine waited for this subscription and the threshold value configured in the BTSNTSVC.EXE.config file. Listen Branch and Delay Shape dehydration behaviors are based on the configured timer value and the threshold value. Please refer to “How to Calculate Dehydration” in the BizTalk documentation for more details.

Since each persistence point requires your Orchestration instance and the associated data to be serialized to the database, it can impact performance if you have a large number of persistence points or a large amount of data to be serialized. During design time, you can use the above list of persistence points as a guide to adjust your workflow. During runtime, you can use the dehydration and persistence counters under the XLANG/s Orchestrations performance object to determine whether excessive persistence may be a factor in your performance issue.

To prevent persistence, you can enclose a potential persistence point within an Atomic Scope. Orchestration Engine won’t persist until it reaches the end of the transaction. Be careful when you do this. While it does keep the Orchestration instance from persisting, it also forces the Orchestration instance to remain in memory when it could have dehydrated. You may run out of system resources if you have a large number of active Orchestration instances.
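
To make the design-time review above a bit more concrete, here is a small sketch that counts guaranteed persistence points in a simplified description of a workflow. The shape names, the tuple layout for scopes, and the function name are all assumptions invented for this example. It only models the Send, Start Orchestration, Suspend, and end-of-scope rules from the list above, and ignores the conditional points, the debugger, dehydration, and shutdown.

```python
from typing import List, Tuple, Union

# A shape is either a plain name ("send", "receive", ...) or a scope written
# as (kind, [nested shapes]) where kind is "atomic_scope" or
# "long_running_scope". These names are made up for this sketch.
Shape = Union[str, Tuple[str, list]]

GUARANTEED = {"send", "start_orchestration", "suspend"}


def count_persistence_points(shapes: List[Shape]) -> int:
    """Count guaranteed persistence points in a simplified workflow model."""
    total = 0
    for shape in shapes:
        if isinstance(shape, tuple):
            kind, body = shape
            if kind == "atomic_scope":
                # Nothing inside persists on its own; the engine persists
                # once, at the end of the atomic transaction.
                total += 1
            elif kind == "long_running_scope":
                # Inner points still persist, plus one at the end of the scope.
                total += count_persistence_points(body) + 1
            else:
                raise ValueError(f"unknown scope kind: {kind}")
        elif shape in GUARANTEED:
            total += 1
    return total


if __name__ == "__main__":
    three_sends = ["receive", "send", "send", "send"]
    batched = ["receive", ("atomic_scope", ["send", "send", "send"])]
    print(count_persistence_points(three_sends))
    print(count_persistence_points(batched))
```

The count only tells you how often state is serialized. It says nothing about the memory cost of keeping instances inside an Atomic Scope from dehydrating, so treat a lower number as a trade-off rather than a pure win.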

 

Map

Transformations are more efficient when they are executed from a send or receive port. There are scenarios where it makes sense to have a map in the Orchestration. For example, you can execute a transformation dynamically within an Expression Shape, and maps in an Orchestration can accept multiple inputs. In general, unless you have an absolute requirement to use a map in an Orchestration, you should avoid doing so.

 

Distinguished Field/Promoted Property Field

When you promote a property, consider how you’ll be using it. If you plan to access the data only within an Orchestration, promote it as a Distinguished Field only. Promoted Property Fields can be used for routing, and they are more expensive. A Promoted Property Field is not necessary unless you plan to use the value outside of the Orchestration, for example, as a filter for a Send Port subscription.

 

External Components

Many people design Orchestrations that use external assemblies within an Expression Shape. This gives your workflow more flexibility. However, when you call a method from an external assembly or unmanaged code, you should realize that the Orchestration Engine has little control over what happens within the external code. If the code fails, your Orchestration design needs to handle the exception. If the method call does not return in a timely manner, the Orchestration Engine will have to wait on it. You should not count on the Orchestration Engine to force the termination of an external component through any timeout mechanism. If a timeout mechanism is desired, you should build it into the method you are calling.
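
The idea of building the timeout into the method itself can be sketched in a few lines. This is Python rather than the .NET code a real BizTalk component would use, and the names (call_with_timeout, ComponentTimeout, slow_lookup) are made up for illustration. Treat it as a sketch of the pattern, not a drop-in component.

```python
import concurrent.futures
import time


class ComponentTimeout(Exception):
    """Raised when the wrapped call does not finish within the time limit."""


def call_with_timeout(func, timeout_seconds, *args, **kwargs):
    """Run func(*args, **kwargs) and stop waiting after timeout_seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        name = getattr(func, "__name__", "call")
        raise ComponentTimeout(f"{name} exceeded {timeout_seconds}s") from None
    finally:
        # Do not block on the worker; it may still be running in the background.
        executor.shutdown(wait=False)


def slow_lookup(delay_seconds):
    """Hypothetical stand-in for a slow external call."""
    time.sleep(delay_seconds)
    return "done"


if __name__ == "__main__":
    print(call_with_timeout(slow_lookup, 1.0, 0.01))
    try:
        call_with_timeout(slow_lookup, 0.05, 0.3)
    except ComponentTimeout as error:
        print("timed out:", error)
```

Note that the caller only stops waiting. The slow work itself keeps running until it finishes, so a component that holds connections or locks still needs its own cancellation logic.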

With any custom code, you should implement tracing or debugging information. I have seen many cases where an Orchestration instance fails within custom code. However, since there was no tracing within the component, we spent time isolating the failure. Once the problem was isolated to the custom component, the issue still needed to be resolved by the original developer since we didn’t have any knowledge of the component. It is more efficient for you if we can quickly determine what is failing and who should be engaged to fix the issue.
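
As one way to picture the kind of tracing meant here, the sketch below wraps a helper so that every call logs entry, exit, and any exception with a stack trace. It uses Python's standard logging module purely for illustration; the logger name, the traced decorator, and parse_amount are hypothetical, and a real component would use whatever tracing facility its own platform provides.

```python
import functools
import logging

logger = logging.getLogger("custom_component")


def traced(func):
    """Log entry, exit, and failures of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the stack trace along with the message.
            logger.exception("failed in %s", func.__name__)
            raise
        logger.info("exit %s", func.__name__)
        return result
    return wrapper


@traced
def parse_amount(text: str) -> float:
    """Hypothetical helper that an Orchestration might call."""
    return float(text)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(parse_amount("12.5"))
```

The exception is re-raised after logging, so the Orchestration still sees the failure and can handle it. The trace just tells whoever picks up the case where to look first.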

 

Latency

You may experience slow Orchestration performance when the first message comes in after a long idle period. This may be caused by application domain unloading within the BizTalk host process. If an AppDomain is shut down after being idle, the next message that comes in needs to wait for the Orchestration to compile again. Depending on the complexity of your design, this can be a noticeable wait. To prevent this in scenarios that require low latency, you can modify the BTSNTSVC.EXE.config file and set the SecondsIdleBeforeShutdown property to -1. This will prevent AppDomain shutdown due to idling.
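
If you would rather script that change than edit the file by hand, something along these lines could work. The function name is made up, and the element layout in SAMPLE is an assumption written from memory as a trimmed stand-in, not a copy of the real BTSNTSVC.EXE.config, so check the documentation for the actual section. The function itself only relies on the SecondsIdleBeforeShutdown attribute name mentioned above and sets it wherever it finds it.

```python
import xml.etree.ElementTree as ET

ATTRIBUTE = "SecondsIdleBeforeShutdown"

# Assumed, trimmed-down fragment for illustration only.
SAMPLE = """<configuration>
  <xlangs>
    <Configuration>
      <AppDomains AssembliesPerDomain="10">
        <DefaultSpec SecondsIdleBeforeShutdown="1200" SecondsEmptyBeforeShutdown="1800" />
      </AppDomains>
    </Configuration>
  </xlangs>
</configuration>"""


def disable_idle_shutdown(config_xml: str) -> str:
    """Set SecondsIdleBeforeShutdown to -1 on every element that carries it."""
    root = ET.fromstring(config_xml)
    changed = 0
    for element in root.iter():
        if ATTRIBUTE in element.attrib:
            element.set(ATTRIBUTE, "-1")
            changed += 1
    if changed == 0:
        raise ValueError(f"no element with a {ATTRIBUTE} attribute was found")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(disable_idle_shutdown(SAMPLE))
```

ElementTree drops comments and the XML declaration when it rewrites a document, so back up the original file first and compare the result before restarting the host instance.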

 

Tracking – Orchestration Events

Document tracking can impact performance throughout BizTalk, not just in Orchestrations. For Orchestrations, you should realize that Orchestration Event Tracking is on by default. This is useful during development and testing since Orchestration Events are required for the Orchestration Debugger. However, if you do not intend to debug an Orchestration directly in production, you should turn Orchestration Event Tracking off. Orchestration Events are eventually moved to the DTA_DEBUGTRACE table in BizTalkDTADB by TDDS. We have seen slow reads and writes on this table once it gets large (several hundred thousand rows). What is considered large may vary depending on your SQL Server hardware. In the end, if TDDS cannot move data efficiently into BizTalkDTADB, data accumulates in BizTalkMsgBoxDB. A large MsgBoxDB can cause all of your hosts to slow down, and eventually lead to throttling.

If you must track Orchestration progress, it is better to implement a BAM solution. It has less performance impact.

 

Thread Pool Size

The Orchestration Engine and many BizTalk Adapters use worker threads from the managed thread pool. The default size for the thread pool, which is defined by MaxWorkerThreads in the registry, is 25. As Orchestration instances are dehydrated, threads are returned to the thread pool. While the default setting may be adequate for many scenarios, if you have a large number of Orchestration instances that need to be active concurrently, you may be limited by the thread pool size. You can usually detect this by using the Running Orchestrations performance counter under the XLANG/s Orchestrations performance object. You may notice that the number of running Orchestrations increases and then remains constant once it hits the thread pool limit. Of course, before you suspect that the thread pool is the limiting factor, you should first make sure no throttling is taking place and your system is not already under stress.

You can increase the thread pool size by modifying the MaxWorkerThreads value in the registry. For instructions on how to do this, please refer to https://support.microsoft.com/kb/900455.
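
The "rises, then stays flat" pattern can be checked on a series of Running Orchestrations samples you have already captured. The sketch below is only a hint detector: the function names and the min_run default are assumptions, and a flat line can have other causes, which is why throttling and general system stress need to be ruled out first.

```python
from typing import Sequence


def trailing_run_length(samples: Sequence[int]) -> int:
    """Number of consecutive samples at the end that equal the last sample."""
    if not samples:
        return 0
    last = samples[-1]
    run = 0
    for value in reversed(samples):
        if value != last:
            break
        run += 1
    return run


def looks_capped(samples: Sequence[int], min_run: int = 5) -> bool:
    """True if the series climbed and then sat at its maximum for min_run samples."""
    run = trailing_run_length(samples)
    climbed_to_peak = (
        run < len(samples)                 # there were earlier, different samples
        and samples[-1] == max(samples)    # the flat part is the highest value seen
        and samples[0] < samples[-1]       # the series started lower
    )
    return run >= min_run and climbed_to_peak


if __name__ == "__main__":
    # Made-up counter samples, for illustration only.
    print(looks_capped([5, 12, 20, 25, 25, 25, 25, 25]))
    print(looks_capped([5, 12, 20, 25, 24, 25]))
```

If the check fires, compare the plateau value against your configured thread pool size before changing anything.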

 

Orchestration Profiler

It is often useful to find out which shape in your Orchestration is taking up most of the time. With that information in hand, you can then proceed to troubleshoot the bottleneck. Orchestration Profiler (https://www.codeplex.com/BiztalkOrcProfiler) is a nice tool that shows you the maximum and average duration spent on each shape. In addition, it displays your test coverage to let you know if you’ve covered all branches of the workflow. It is not an official tool, so you may want to take that into consideration before you run it in production. But it is certainly a useful testing tool.

 

For more information about Orchestration in general, please reference the Orchestration FAQ:

https://msdn.microsoft.com/en-us/library/bb418739.aspx

 

More to come….