BizTalk Terminator Not Cleaning Up Caching Items?

 

UPDATE: See BizTalk Terminator STILL Not Cleaning Up Caching Items for additional info on this plus a new cache cleanup task that can simplify all of this.

 

I've been pinged a number of times on this so thought I should blog the workaround and an explanation. 

First, let's say MBV shows you something like the following in the Warning and Summary Report:

Or you just notice in MBV that there's a bunch of cache messages in one of the queue tables:

 

Well, according to my Using BizTalk Terminator to Resolve Issues article, you simply run the Terminate Caching Instances task: 

Issue Identified by MBV: Orphaned Cache Instances

Resolution Options: MBV Integration or Manual Task Selection

Terminator Resolution Task: Terminate Caching Instances (in Delete task category)

Terminator View Task: View Count of Cache Messages in All Host Queues, or View Count of Cache Instances in All Hosts

Root Cause: This is due to a known bug and there is a hotfix available.  See KBs 944426 & 936536 for details.

 

That should do it.  If it doesn't, make sure you have stopped all the BizTalk hosts (including the IIS app pool hosting the BizTalk isolated host, if the caching items are there) and try again.  (Hey, you shouldn't be running Terminator without stopping all the BTS hosts anyway.)

Now, that will definitely do it.

Well... like 99.9% of the time.

Let's say you do all of the above and then run one of those View tasks (or MBV)  and notice that Terminator left behind some of those caching instances (and their associated caching messages).  This happens even though Terminator claims to have terminated all of them successfully.  And rerunning the task doesn't help - Terminator will repeatedly say it successfully terminated those instances but either of the View tasks will show that they're still there.

Ok, so now you're most likely running into a very rare scenario that I've come across a few times. 

First, I should point out that the Terminate Caching Instances and Terminate Instances tasks use BizTalk's WMI API to interact with the messagebox - and that's key.  I had the opportunity to analyze some data from a customer environment that was running into this issue and it turns out that there are certain times when msgbox logic prevents the stored procs called by WMI from terminating “internal” instances – with caching being considered one of those "internal" types.  As far as when exactly the msgbox logic goes down this "rare" path, I've never had access to a repro environment where I could fully debug this so I don't have a good answer for that.

I was going to write code in Terminator's WMI class to catch this scenario and warn the user that they need to use the workaround to clean up the remaining items.  Unfortunately, the msgbox logic catches the failed call and doesn't pass that info on to the WMI caller, so there's no way for Terminator (or any WMI client) to know that some of the instances weren't deleted.  As far as the WMI client is concerned, the stored proc call completed successfully, so it assumes the instances it asked to be terminated were actually terminated.
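Since the caller is told "success" either way, the only dependable check is to count the items before and after the termination. Here is a minimal sketch of that idea in Python. Note that `count_instances` and `terminate` are hypothetical callables standing in for a View task and a termination task. They are not part of Terminator or of BizTalk's WMI API, and the "messagebox" below is simulated.

```python
from typing import Callable


def terminate_and_verify(count_instances: Callable[[], int],
                         terminate: Callable[[], bool]) -> dict:
    """Run a termination and judge it by recounting, not by the call's status."""
    before = count_instances()
    reported_ok = terminate()
    after = count_instances()
    return {
        "before": before,
        "after": after,
        "reported_ok": reported_ok,
        # Silent failure: the call claimed success but instances remain.
        "silent_failure": reported_ok and after > 0,
    }


# Simulated messagebox (assumption for illustration only): it holds three
# "internal" caching instances and ignores the terminate request.
remaining = {"count": 3}


def fake_count() -> int:
    return remaining["count"]


def fake_terminate() -> bool:
    # Reports success without removing anything, like the rare path described above.
    return True


result = terminate_and_verify(fake_count, fake_terminate)
print(result)
```

In practice this is just what running one of the View tasks (or MBV) after the Terminate task does for you by hand.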

So what's the workaround?  Well, don't use WMI for this particular situation.  Instead, use one of Terminator's Hard Termination tasks: Terminate Multiple Instances (Hard Termination) or Terminate Single Instance (Hard Termination).

While I was working on Terminator's WMI class and having BizTalk engineers use Terminator on an internal-only basis within Microsoft, we noticed that on very rare occasions the termination tasks would not terminate something.  This was not a limitation of Terminator; we could reproduce it with any WMI client (including the BTS Admin console).  I created the hard termination tasks specifically to handle those scenarios.  They bypass BizTalk's normal termination API and use SQL calls to directly interact with the BizTalk msgbox.  They can terminate anything (well, so far) - even internal instances.

Originally, we just had the Terminate Single Instance (Hard Termination) task to help terminate a one-off instance that just wouldn't terminate any other way.  That worked great for those one-off scenarios but I soon realized that sometimes there would be a larger number of instances that needed hard termination.  I left the single instance task to give users that functionality and wrote the Terminate Multiple Instances (Hard Termination) task.  It lets the user choose the Host, Class, Status, and the Max number of instances to terminate, and it does hard terminates on all items that fit the filter criteria.

The only painful thing about the Terminate Multiple Instances (Hard Termination) task is that if you have instances across multiple hosts with various statuses, you will need to run the task once for each permutation.  Unlike the WMI-based Terminate Instances and Terminate Caching Instances tasks, it can't handle multiple hosts and statuses (or classes) in one execution.  Since Terminator v2 supports PowerShell, I'm hoping at some point to create a PowerShell-based hard terminate task that can provide this functionality - just haven't had the time to do that yet.

  

So, in short:

Problem:

The Terminate Caching Instances task is not cleaning up all cache items.

Solution:

Use the Terminate Multiple Instances (Hard Termination) task and choose Caching as the Class parameter.

You will need to re-run this task for each Host and Status that applies to the cache items you're trying to terminate - MBV or the Terminator View Count of Cache tasks should give you some info in this regard. 

BTW, I've seen active, dehydrated, and suspended cache items so you may need to cycle through all of those Statuses.
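To keep track of which Host/Status combinations still need a run, it can help to write the list down before you start. The sketch below is only a planning aid in Python and does not call Terminator. It assumes you have already collected per-host, per-status counts of cache items from MBV or the View tasks. The host names and counts are made up, and using the count as the Max value is my assumption, not a requirement of the task.

```python
from typing import Dict, List, Tuple


def plan_hard_termination_runs(counts: Dict[Tuple[str, str], int],
                               instance_class: str = "Caching") -> List[dict]:
    """Return one set of task parameters per (host, status) pair that still has items."""
    runs = []
    for (host, status), n in sorted(counts.items()):
        if n > 0:  # skip combinations with nothing left to terminate
            runs.append({
                "Host": host,
                "Class": instance_class,
                "Status": status,
                "Max": n,  # assumption: cap each run at the observed count
            })
    return runs


# Hypothetical counts, keyed by (host, status).
view_counts = {
    ("HostA", "Active"): 0,
    ("HostA", "Dehydrated"): 12,
    ("HostB", "Suspended"): 4,
    ("HostB", "Active"): 2,
}

runs = plan_hard_termination_runs(view_counts)
for run in runs:
    print(run)
```

Each printed line corresponds to one manual execution of the Terminate Multiple Instances (Hard Termination) task. Re-run the View tasks afterwards to confirm the counts went to zero.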

 

In general, if you ever find that any of the termination tasks are not terminating what you want, the two Hard Termination tasks (Single and Multiple) are the workaround. 

 

Remember that the Hard Termination tasks bypass the normal BTS APIs, so they should only be used if the normal Terminate Instances or Terminate Caching Instances task is not working - and as always, be careful with Terminator - especially when doing any of the deletion tasks.