BizTalk 2004 Performance

I’ve been involved in a number of BizTalk 2004 performance and tuning engagements recently. It seems that customers have spent most of this year writing solutions using BizTalk and are now getting to the point of deploying, hence it’s been really busy here!

On the whole BizTalk has worked really well; there are of course the usual gotchas and a seemingly infinite number of configuration tweaks you can make, which either make things better or worse :-)

A few “common” things have come out of this work…

Disk Configuration

The first and most common is the Disk Configuration on the SQL Server hosting the BizTalk MessageBox. Typically customers test on standard IDE hard-drives or more commonly RAID5 which seems to be the default for server kit.

I’m by no means a storage or RAID expert, but from my discussions RAID5 is not very fast for writes (but good for reads). The BizTalk MessageBox database is effectively the “hard drive” for BizTalk and therefore it gets a lot of traffic (messages being received and sent, orchestrations persisting, tracking, etc.).

Most (but not all) of the systems I’ve been involved with have suffered from poor BizTalk performance due to high disk queuing on the SQL Server. This means the disk is too busy to process disk operations as they arrive, so it queues them up and BizTalk has to wait. When you combine this with BizTalk’s heavy usage of the MessageBox it causes major problems.

So, when you’re doing your performance testing make sure you’re monitoring the following PerfMon counters, held in the LogicalDisk performance object.

% Idle Time
% Idle Time reports the percentage of time during the sample interval that the disk was idle.

Avg. Disk Queue Length
Avg. Disk Queue Length is the average number of both read and write requests that were queued for the selected disk during the sample interval.

Disk queuing should typically be no more than 2 per spindle; if you’ve got problems you’ll probably see 40+ :-). Watch out, though: the counter reports the queue for the whole drive array, so you should divide this number by the number of spindles (i.e. disks) in the array before comparing it with that threshold. % Idle Time shows how often the disk is idle; if you see it sitting at or near zero for large periods of time, this also points to drive array problems and therefore performance issues.
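
As a rough illustration of the per-spindle arithmetic above, here is a minimal Python sketch. The function names and the sample readings are made up for illustration; only the rule-of-thumb threshold of 2 comes from the guidance above.

```python
def queue_length_per_spindle(avg_disk_queue_length, spindles):
    """Normalise an Avg. Disk Queue Length reading by the number of spindles."""
    if spindles < 1:
        raise ValueError("spindles must be at least 1")
    return avg_disk_queue_length / spindles


def is_disk_bottleneck(avg_disk_queue_length, spindles, threshold=2.0):
    """Return True when the per-spindle queue length exceeds the threshold.

    The default threshold of 2 is the rule of thumb quoted in the text.
    """
    return queue_length_per_spindle(avg_disk_queue_length, spindles) > threshold


if __name__ == "__main__":
    # Hypothetical readings: (description, Avg. Disk Queue Length, spindles)
    samples = [
        ("4-disk array, queue of 6", 6.0, 4),
        ("4-disk array, queue of 40", 40.0, 4),
    ]
    for label, queue, spindles in samples:
        per_spindle = queue_length_per_spindle(queue, spindles)
        verdict = "investigate" if is_disk_bottleneck(queue, spindles) else "ok"
        print(f"{label}: {per_spindle:.1f} per spindle -> {verdict}")
```

The readings themselves would come from PerfMon; this sketch only covers the division and comparison step, not the collection of counter data.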

To quote Kevin Smith almost word for word: “put your money into good disks” :-). They are a crucial piece of BizTalk performance.

Atomic Scopes & Debugging/Tracing

Another issue that came up recently, which I didn’t fully appreciate, is how “expensive” Atomic Scopes are. Don’t get me wrong, they’re great and need to be there, but I’m increasingly seeing customers using them where they don’t have to. For those of you with a COM+ background it’s analogous to running your code within a transaction: most customers didn’t need one but turned it on anyway, causing significant problems.

Atomic Scopes aren’t really “expensive” on their own, but if you have 10, 20, 30+ being hit per message then it has a cumulative effect which can cause you problems. BizTalk will checkpoint your orchestration before and after the Atomic Scope to enable it to “roll back” the state. This obviously incurs a serialization and a database roundtrip; again, on its own that’s not a problem, but if you multiply the cost it adds up!
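
To make the “multiply the cost” point concrete, here is a back-of-the-envelope Python sketch. The 5 ms cost per checkpoint is purely an assumed figure for illustration, not a measured BizTalk number, and the function name is hypothetical; the two checkpoints per scope follow the “before and after” description above.

```python
def persistence_overhead_ms(atomic_scopes_per_message,
                            cost_per_checkpoint_ms,
                            checkpoints_per_scope=2):
    """Estimate the extra time per message spent on atomic scope checkpoints.

    checkpoints_per_scope defaults to 2 (one before and one after the scope).
    cost_per_checkpoint_ms is an assumed cost for one serialization plus
    database roundtrip; measure your own system to get a real value.
    """
    return atomic_scopes_per_message * checkpoints_per_scope * cost_per_checkpoint_ms


if __name__ == "__main__":
    ASSUMED_CHECKPOINT_COST_MS = 5  # hypothetical figure, not a benchmark
    for scopes in (1, 10, 20, 30):
        overhead = persistence_overhead_ms(scopes, ASSUMED_CHECKPOINT_COST_MS)
        print(f"{scopes} atomic scopes -> {overhead} ms extra per message")
```

With these assumed numbers a single scope is barely noticeable, while 30 scopes add up to a sizeable delay on every message, and all of it is load on the MessageBox.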

The most common driver behind using them is a custom tracing/logging solution, mainly motivated by the fact that Health and Activity Tracking is hard to use. I agree with this sentiment, but writing your own custom logging solution where you save message bodies, etc. may help you in the short term but is likely to cause performance problems later down the line. Technologies like BAM and HAT can be scripted and configured in such a way as to solve every requirement I’ve seen to date. Use this technology, please don’t write your own!

The reason Atomic Scopes end up being used in these scenarios is that most customers use the System.Xml classes to pull bits out of the message to log, etc. Most of the types in System.Xml are non-serializable, hence you get “forced” into using Atomic Scopes. The same goes for your own components: if you can, mark them as Serializable to avoid the cost.

So please don’t write your own tracing/logging system :-). Spend some time with BAM and HAT and you should be able to do everything with these tried and tested technologies.

I’ll try and post some more stuff in this area soon!

The PSfD team here at Microsoft UK is looking to use a large amount of our Scaling Lab hardware early next year to conduct a BizTalk “What it costs” session, where we’ll test a bare-bones BizTalk solution (just messaging) on a variety of hardware to get some performance metrics. We’ll then add “features” such as Atomic Scopes and Correlation and get individual metrics for each area. Hopefully the output of this will be a deep understanding of the impact of various features and how to tune BizTalk for a number of scenarios, which we’re planning to write up and post via GotDotNet or MSDN.