MSMQ prefers to be unique

If you want to quickly create a few test machines or roll out dozens of branch offices, DON'T clone MSMQ. The end result will be messages that just stick in the outgoing queue for no obvious reason and then, when you've got fed up waiting, get delivered to the destination just as inexplicably. In some situations the messages may just disappear without a trace.

If you look in the registry at HKLM\Software\Microsoft\MSMQ\Parameters\Machine Cache you will find a binary value called QMId. This value, as you'd guess, is the ID of the queue manager and MSMQ uses it to distinguish between different machines (IP address and computer name are not reliably unique enough). All message communication makes use of the value but, more importantly here, MSMQ also uses the QMId as a lookup key to improve performance.
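If you suspect you already have clones, one way to check is to collect the QMId value from each machine and look for repeats. The following is only a sketch: it assumes the value is a 16-byte GUID in the usual Windows byte layout, and the host names and the way you gather the raw bytes (for example with Python's winreg module on each machine) are hypothetical.

```python
import uuid
from collections import defaultdict
from typing import Dict, List


def format_qmid(raw: bytes) -> str:
    """Render a raw QMId as a GUID string.

    Assumption: the registry value is 16 bytes in Windows GUID layout.
    """
    if len(raw) != 16:
        raise ValueError("expected 16 bytes, got %d" % len(raw))
    return str(uuid.UUID(bytes_le=raw))


def find_cloned_qmids(qmids_by_host: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Return {qmid: [hosts]} for every QMId reported by more than one host."""
    hosts_by_qmid = defaultdict(list)
    for host, raw in qmids_by_host.items():
        hosts_by_qmid[format_qmid(raw)].append(host)
    return {
        qmid: sorted(hosts)
        for qmid, hosts in hosts_by_qmid.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    # Hypothetical inventory: two branch machines built from the same image.
    image_qmid = bytes(range(16))
    inventory = {
        "branch01": image_qmid,
        "branch02": image_qmid,
        "hq": bytes(range(1, 17)),
    }
    for qmid, hosts in find_cloned_qmids(inventory).items():
        print(qmid, "is shared by", ", ".join(hosts))
```

Any QMId that shows up against more than one host name is a candidate for the fixes described under "What to do".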

MSMQ maintains a temporary cache in memory of a received message's QMId property and the IP address of the sender. The queue manager uses the cache to find the IP address of the sender so it can correctly address the acknowledgement messages that underpin MSMQ's delivery system. The cache entries do have a lifetime and the queue manager will refresh or purge them at regular intervals.

Should a system have cloned machines running MSMQ then they will effectively all be sharing the same QMId row in the cache table. Every time a queue manager receives a message from one of these machines, it will acknowledge back to the IP address it found in the cache and there is a chance that this will mean a delivery to the wrong sender. The receiver of the misrouted message will simply discard it as the queue manager cannot find anything locally that the acknowledgement corresponds to.
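To make the failure easier to picture, here is a toy model of that cache. It is not MSMQ's actual implementation: the class name, the fixed lifetime and the rule that a cache hit does not refresh the entry are all assumptions made for illustration.

```python
class SenderCache:
    """Toy model of a receiver's QMId -> sender IP cache (not real MSMQ code)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds      # hypothetical entry lifetime
        self._entries = {}          # qmid -> (ip, time the entry was cached)

    def ack_address(self, qmid, sender_ip, now):
        """Return the IP address the acknowledgement would be sent to."""
        entry = self._entries.get(qmid)
        if entry is not None and now - entry[1] < self.ttl:
            # A live entry wins, even if this message came from another IP.
            return entry[0]
        # No entry, or it has expired: cache the real sender.
        self._entries[qmid] = (sender_ip, now)
        return sender_ip


if __name__ == "__main__":
    cache = SenderCache(ttl_seconds=600)
    cloned_qmid = "same-qmid-on-both-machines"

    # Clone A sends first, so its address is cached.
    print(cache.ack_address(cloned_qmid, "10.0.0.1", now=0))
    # Clone B sends 30 seconds later; the ack goes to clone A and is discarded.
    print(cache.ack_address(cloned_qmid, "10.0.0.2", now=30))
    # Once the entry has expired, clone B finally gets its acknowledgement.
    print(cache.ack_address(cloned_qmid, "10.0.0.2", now=700))
```

In this model the second sender is stuck until the first sender's entry ages out, which is the "stick in the outgoing queue, then suddenly deliver" behaviour described at the top.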

How this manifests itself will depend on how many clones there are and how busy the system is. For example, if there are only a handful of MSMQ machines and they just send a few times an hour then it is very unlikely that there will be a delay, because the cached entries will have timed out and been removed before the next clone machine sends a message. On the other hand, a busy system could end up with a queue (no pun intended) of client machines all waiting on the cached entry to expire before their outgoing messages can finally be delivered.

 

What to do

If your clients are Active Directory-integrated then the QMId is stored in the msmqConfiguration object and the recommended path is to reinstall Message Queuing to generate a unique value for that machine.

 

In workgroup mode, though, the fix is less drastic:

  1. Stop the MSMQ Service 
  2. Delete the QMId value completely
  3. Add a SysPrep DWORD value (under HKLM\Software\Microsoft\MSMQ\Parameters) and set it to 1
  4. Start the MSMQ Service
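If you have more than a couple of workgroup machines to fix, the four steps above could be scripted. This is a sketch rather than a tested tool: it assumes the service's short name is MSMQ, that the standard net.exe and reg.exe tools are on the path, and that no dependent services will make net stop wait for confirmation. The function names are made up for this example, and it defaults to a dry run that only prints the commands.

```python
import subprocess

PARAMS_KEY = r"HKLM\Software\Microsoft\MSMQ\Parameters"


def qmid_reset_commands():
    """Build the command lines for the four workgroup-mode steps, in order."""
    return [
        # 1. Stop the MSMQ service.
        ["net", "stop", "MSMQ"],
        # 2. Delete the QMId value completely.
        ["reg", "delete", PARAMS_KEY + r"\Machine Cache", "/v", "QMId", "/f"],
        # 3. Add a SysPrep DWORD under Parameters and set it to 1.
        ["reg", "add", PARAMS_KEY, "/v", "SysPrep",
         "/t", "REG_DWORD", "/d", "1", "/f"],
        # 4. Start the MSMQ service.
        ["net", "start", "MSMQ"],
    ]


def reset_qmid(dry_run=True):
    """Print the commands, or run them when dry_run is False (needs admin rights)."""
    for cmd in qmid_reset_commands():
        if dry_run:
            print(subprocess.list2cmdline(cmd))
        else:
            # check=True stops at the first failing step instead of carrying on.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    reset_qmid(dry_run=True)
```

Run it as a dry run first and compare the printed commands against the steps before letting it touch a real machine.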

 

To add a sense of risk to cloning MSMQ, there is also this KB article:

830639 Access Violation Occurs in the MSMQ Service

the hotfix for which is not currently included in any service packs but is part of Update Rollup 1 for Microsoft Windows 2000 Service Pack 4.

But, in all cases, prevention is better than cure so always install MSMQ (manually or scripted) after you have finished deploying the operating system to the machine and not before.