More Poison Message Handling

02/13/2007

We saw the poison message handling strategies for MSMQ 3 and MSMQ 4 yesterday, but how many different strategies can we come up with? Let us count the ways. I've roughly ordered these by increasing complexity.

Discard. We could simply throw away any message that encounters a processing failure as soon as the failure occurs. This strategy completely solves the poison message retry problem. It is somewhat lacking in terms of practical utility for working with a queue. Some of the messages are not processed, but all of those that are processed are processed in order.
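As a rough illustration, here is a minimal in-memory sketch of the discard strategy. It uses a Python deque as a stand-in for a real queue rather than any MSMQ API, and the names process_with_discard and example_handler are hypothetical, chosen only for this example.

```python
from collections import deque


def example_handler(message):
    # Hypothetical handler: any message containing "poison" fails.
    if "poison" in message:
        raise ValueError(f"cannot process {message!r}")


def process_with_discard(queue, handler):
    """Drain the queue, dropping any message whose handler raises."""
    processed = []
    while queue:
        message = queue.popleft()
        try:
            handler(message)
        except Exception:
            # Discard: the failed message is never retried or saved.
            continue
        processed.append(message)
    return processed


if __name__ == "__main__":
    print(process_with_discard(deque(["a", "poison", "b"]), example_handler))
```

The survivors come out in their original order, which is the one guarantee this strategy keeps.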

Return to front. This is the default strategy that we were looking at for a queue before getting into MSMQ. Return to front has the problem that either a queue administrator regularly cleans out poison messages or we block forever trying to process a single bad message. All of the messages that we reach are processed in order. There's no guarantee of termination and no guarantee that a particular message will ever get processed.

Return to back. We can flip the previous strategy around by moving failed messages to the end of the queue instead of putting them back at the front. This strategy is a bit trickier to implement because return to front is the natural behavior when rolling back a transaction. Return to back cannot rely on that behavior. All of the messages are attempted but not in order. We'll eventually complete every non-poison message although there's still no guarantee of termination.
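A small in-memory sketch of return to back might look like the following. The deque stands in for a real queue, and the max_steps budget is an assumption added only so that the demonstration stops; the strategy itself has no such limit, which is exactly why it may never terminate. The names here are hypothetical.

```python
from collections import deque


def example_handler(message):
    # Hypothetical handler: any message containing "poison" fails.
    if "poison" in message:
        raise ValueError(f"cannot process {message!r}")


def process_return_to_back(queue, handler, max_steps):
    """Process messages, re-queuing failures at the back of the queue.

    max_steps is an artificial budget for the demo, not part of the strategy.
    """
    processed = []
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        message = queue.popleft()
        try:
            handler(message)
        except Exception:
            queue.append(message)  # failed message goes to the back
            continue
        processed.append(message)
    return processed


if __name__ == "__main__":
    q = deque(["a", "poison", "b"])
    print(process_return_to_back(q, example_handler, max_steps=10))
    print(list(q))  # whatever is still cycling when the budget runs out
```

Every good message gets through, but the poison message keeps circulating until something outside the loop removes it.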

Move to other queue. This is the MSMQ 3 strategy. We use a heuristic to decide when to give up on a message and stop processing it. All messages are either processed or saved for later. We'll eventually terminate but now somebody has to clean out the other queue.
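To make the idea concrete, here is a minimal in-memory sketch that uses the number of failed attempts as the heuristic. It is not the MSMQ 3 implementation; the deque stands in for a real queue, and process_with_other_queue, max_attempts, and example_handler are hypothetical names for this example.

```python
from collections import deque


def example_handler(message):
    # Hypothetical handler: any message containing "poison" fails.
    if "poison" in message:
        raise ValueError(f"cannot process {message!r}")


def process_with_other_queue(queue, handler, max_attempts=3):
    """Retry each message up to max_attempts times, then set it aside."""
    processed = []
    other_queue = deque()  # holds messages we gave up on
    while queue:
        message = queue.popleft()
        for _ in range(max_attempts):
            try:
                handler(message)
            except Exception:
                continue  # try the same message again
            processed.append(message)
            break
        else:
            # Every attempt failed: save the message for later inspection.
            other_queue.append(message)
    return processed, other_queue


if __name__ == "__main__":
    done, saved = process_with_other_queue(
        deque(["a", "poison", "b"]), example_handler
    )
    print(done, list(saved))
```

The loop always terminates because each message gets a bounded number of attempts, but the second queue only grows until someone deals with it.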

Shuttle between queues. This is the MSMQ 4 strategy. If we allow an infinite number of cycles, then this is equivalent to return to back. If we only allow a finite number of cycles, then this is equivalent to move to other queue. The advantage of this strategy is that we tend to spend a smaller proportion of time attempting poison messages.
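The shuttling behavior can be sketched in memory as follows. This is only an illustration under simplifying assumptions, not the MSMQ 4 mechanism: there is no delay between cycles, each message gets one attempt per cycle, and the names process_with_retry_cycles, max_cycles, and example_handler are hypothetical.

```python
from collections import deque


def example_handler(message):
    # Hypothetical handler: any message containing "poison" fails.
    if "poison" in message:
        raise ValueError(f"cannot process {message!r}")


def process_with_retry_cycles(queue, handler, max_cycles=2):
    """Shuttle failed messages through a retry queue for up to max_cycles."""
    processed = []
    poison = deque()
    # Pair each message with the number of retry cycles it has been through.
    main = deque((message, 0) for message in queue)
    retry = deque()
    while main or retry:
        if not main:
            # Main queue drained: move the retry queue back and start a cycle.
            main, retry = retry, deque()
        message, cycles = main.popleft()
        try:
            handler(message)
        except Exception:
            if cycles >= max_cycles:
                poison.append(message)  # give up on this message
            else:
                retry.append((message, cycles + 1))
            continue
        processed.append(message)
    return processed, poison


if __name__ == "__main__":
    done, bad = process_with_retry_cycles(
        deque(["a", "poison", "b"]), example_handler
    )
    print(done, list(bad))
```

Because failed messages wait in the retry queue until the main queue is empty, the good messages are handled first and the poison message is attempted only once per cycle.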

There are many other variations that we could start making on these patterns. For instance, we can vary the heuristic from number of times processed to time spent processing or some application-specific mechanism. We can change move to other queue to move to resource, regardless of whether that resource is a queue, database, web service, workflow, or other application. I'm not trying to limit you to thinking that there are only 20 or 30 ways that you can handle a poison message.

Next time: Stashing Data in Extensible Objects
