Avoiding Duplicate Messages using the BizTalk DB2 Adapter

My BizTalk knowledge is slowly increasing as I take on more support work. A recent case required investigation into how the DB2 Adapter works when you have multiple instances of the host running so I thought I'd share my findings.

The objective was to have multiple instances running to provide fault tolerance and load balancing. To do this, a DB2 Receive Location was created that was available to all the BizTalk servers.

This TechNet article has a good overview of configuring the DB2 Adapter (amongst others) which I'll borrow content from:

Introduction to BizTalk Adapters for Host Systems

Scrolling down to:

Data Integration – DB2 and Host Files

BizTalk Host Adapter for DB2

Configuring the Receive-Side

there's a useful screenshot (although it could have been a little larger for clarity).

 

As you can see, there's a Connection String to set up access to the database and an SQL Command to collect the required data.

In the customer's case, unfortunately, duplicate messages were created from the records returned from the SQL Command. This was because each host instance polled the DB2 database as frequently as specified by the Polling Interval, independent of the other hosts. A new message was created by each host for every row of data returned by the SQL Command.

There is no locking of records. You could add transactional support to the connection string but this will only lock the record until the data has been saved as a message in the MessageBox database. As soon as the transaction commits (which shouldn't take long at all), the record is unlocked. The transaction scope does not extend over the lifetime of the message.

To assist, a Change Command was specified instead:

  • "Do Nothing" is the default and allows duplication to occur.
  • "Delete After Read" will prevent duplication but does mean each row is deleted from the database so might not be a viable choice.
  • "Update Command" allows any of the columns in the SELECTed table to be updated with static values. A good example would be setting an "already processed" field to true so that the record would not be retrieved again by the SQL Command when the next poll came around.

To summarise, BizTalk does not manage the various Receive Locations as a single unit or distribute work amongst them. If each record should only be processed once then configuration changes should be made to ensure this restruction is enforced.