How to replace an MSMQ server without bringing down the AD roof

The scenario: you have a migration plan to move all the existing Windows 2000 servers in your domain to a newer Windows operating system, or simply to new hardware. The problems are the same either way.

If you use MSMQ in Active Directory Integration mode, you will face a few challenges.

  1. Each MSMQ client or server is identified by a unique GUID (the QMID), which is generated at setup time. If a replacement machine is created with a new QMID, any applications hard-coded to use the old identifier will break.
  2. Each public queue also has a unique GUID, which applications can use for sending messages. If the replacement machine does not have the same QMID, any queue recreated on it - even if the queue name is the same - will have a new GUID. Again, applications with the GUID hard-coded will fail.
  3. In a system that uses routing, such as an implementation of the MSMQ-MQSeries Bridge, a number of objects in Active Directory are associated with the original machine. Reinstalling a machine so that it has a new QMID means that objects such as Site Links will no longer work.
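
To judge how exposed you are to points 1 and 2, it can help to scan the queue format names your applications are configured with and flag the ones that embed a GUID. The following is a minimal Python sketch of that idea, not a complete parser: the function name and the sample values are hypothetical, and it only recognises the common PUBLIC=, PRIVATE= and MACHINE= forms (anything else, such as a DIRECT= name, is treated as not GUID-bound).

```python
import re

# A GUID in the usual 8-4-4-4-12 hexadecimal layout.
_GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"

# Format names that embed a queue GUID or a machine GUID (the QMID),
# optionally followed by a suffix such as ";JOURNAL" or ";DEADLETTER".
_GUID_BOUND = re.compile(
    r"^(?:PUBLIC=" + _GUID
    + r"|PRIVATE=" + _GUID + r"\\[0-9a-fA-F]+"
    + r"|MACHINE=" + _GUID
    + r")(?:;.*)?$",
    re.IGNORECASE,
)


def depends_on_guid(format_name):
    """Return True if the format name hard-codes a queue or machine GUID."""
    return _GUID_BOUND.match(format_name.strip()) is not None


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    samples = [
        "PUBLIC=228B7F89-EB76-11D2-8A55-0080C7E276C0",
        r"DIRECT=OS:BRIDGE1\orders",
    ]
    for name in samples:
        verdict = "GUID-bound" if depends_on_guid(name) else "name-based"
        print(name, "->", verdict)
```

Anything flagged as GUID-bound is a candidate for breaking if the replacement machine ends up with a new QMID.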

If you are not sure which mode your MSMQ machines are running in, check the installed components in the Control Panel. For MSMQ 4.0, go to "Programs and Features", then "Turn Windows features on or off"; for MSMQ 3.0, go to "Add or Remove Programs", then "Add/Remove Windows Components".

 

There are a few approaches that are worth covering:

The simple-but-tedious way

This method is basically "start from scratch" by recreating everything required to get the system back up and running.

  1. Uninstall MSMQ from the old machine
  2. Remove the old machine from the domain (so it is now in a workgroup)
  3. Power down old hardware
  4. Power up new hardware
  5. Install Windows 
  6. Add the machine to the domain
  7. Install MSMQ
  8. Recreate all the public queues (manually or with a script)
  9. Recreate all the site links, site gates, and so on
  10. Recode any applications that make use of hard-coded GUIDs
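
Step 8 is the easiest one to script. Below is a rough Python sketch, assuming you have kept an inventory of the queues as a list of dictionaries (the "path", "label" and "transactional" keys are my own invention, as is the "orders" queue name). The COM helper assumes a Windows machine with MSMQ and the pywin32 package installed; it is passed in as a parameter so the loop can be tried out with a stand-in first. Remember that queues recreated this way get new GUIDs, so step 10 still applies.

```python
def create_with_msmq_com(path_name, label, transactional):
    """Create one public queue via the MSMQ COM object (Windows + pywin32 only)."""
    import win32com.client  # imported lazily so the rest of the sketch runs anywhere

    info = win32com.client.Dispatch("MSMQ.MSMQQueueInfo")
    info.PathName = path_name      # for example r"BRIDGE1\orders"
    info.Label = label
    info.Create(transactional)     # first argument is IsTransactional


def recreate_queues(inventory, create=create_with_msmq_com):
    """Try to create every queue in the inventory and return the failures."""
    failures = []
    for queue in inventory:
        try:
            create(
                queue["path"],
                queue.get("label", ""),
                queue.get("transactional", False),
            )
        except Exception as error:  # keep going so one bad queue does not stop the run
            failures.append((queue["path"], str(error)))
    return failures


if __name__ == "__main__":
    # Dry run with a stand-in creator that only prints what it would do.
    inventory = [
        {"path": r"BRIDGE1\orders", "label": "Orders", "transactional": True},
    ]
    recreate_queues(inventory, create=lambda p, l, t: print("would create", p, l, t))
```

Returning the failures rather than stopping at the first one means a single permissions problem does not leave the rest of the queues uncreated.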

 

For the gamblers amongst you

In theory it is possible to take an existing full system backup and successfully apply it to new hardware. There may be a few drivers that need updating, but you may get away with it and have a fully-functional system without the need to recreate anything. In practice, this route may not be worth the pain.

  1. Perform a full system backup (which you would do anyway but you'll actually be using this one)
  2. Power down old hardware
  3. Power up new hardware
  4. Perform a full system restore (fingers crossed the machine will boot afterwards)
  5. If appropriate, upgrade the operating system (fingers still crossed)

 

For the people with less time on their hands

  1. Power down old hardware

  2. In "Active Directory Users and Computers" right-click the computer and choose "Reset Account"

    Note - Resetting the computer account has the potential to break any applications that rely on it, with unpredictable results. To avoid the risk of data or service loss, this change should only be performed in a test environment.

  3. Power up new hardware

  4. Install Windows

  5. Add the machine to the domain (machine inherits existing computer object)

  6. Install MSMQ (machine inherits existing MSMQ object and public queues)

Obviously the order of some of the steps can be changed, like building a replacement Windows server in advance, but you get the idea.

Note - Installing MSMQ will remove the machine from any extra sites it belongs to. This isn't a concern for independent clients, but it will break inter-site routing of messages until fixed. The fix is made through the properties of the machine's MSMQ object in Active Directory Users and Computers. As an example, take the routing server BRIDGE1. It had been a member of both the Bridge and WebSphere sites, but step 6 (Install MSMQ) removed it from the latter, leaving only Bridge, which is the site the machine matches based on its subnet. Just highlight the removed site and add it back again.
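
If you note down the site memberships before step 6, working out what to add back afterwards is a simple comparison. A tiny Python sketch of that bookkeeping follows; the function name is hypothetical and the site lists are typed in by hand from the MSMQ properties dialog rather than read from Active Directory.

```python
def sites_to_restore(before, after):
    """Return the sites present before the MSMQ install but missing afterwards."""
    remaining = {name.lower() for name in after}
    return sorted(name for name in set(before) if name.lower() not in remaining)


if __name__ == "__main__":
    # Site names taken from the BRIDGE1 example above.
    print(sites_to_restore(["Bridge", "WebSphere"], ["Bridge"]))
```

For the BRIDGE1 example this reports WebSphere as the one site to add back.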

Note - I haven't mentioned what to do with private queues. You could run a script to recreate them. Another option is to copy their configuration files from the MSMQ\Storage\LQS directory on the original machine to the new one. The main concern is that these text files contain the SIDs of the accounts that had access permissions to the queues. If those SIDs do not exist on the new machine (basically, if you used local rather than domain accounts), the queue may initially end up inaccessible. This is one of the reasons that just copying private queue configuration files isn't supported.
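
To make the local-versus-domain point concrete: a local account's SID is the old machine's own SID followed by a relative ID, so it means nothing on any other machine. The Python sketch below shows that check and nothing more. It assumes you already have the SIDs as strings and know the old machine's SID; pulling them out of the LQS files is not covered here, and the function name and sample SIDs are made up.

```python
def is_local_account_sid(sid, old_machine_sid):
    """Return True if the SID was issued by the old machine itself.

    Such SIDs will not resolve on the replacement machine, unlike domain
    SIDs or well-known SIDs such as S-1-5-32-544 (BUILTIN\\Administrators).
    """
    prefix = old_machine_sid.strip().upper() + "-"
    return sid.strip().upper().startswith(prefix)


if __name__ == "__main__":
    # Made-up SIDs, for illustration only.
    old_machine = "S-1-5-21-1004336348-1177238915-682003330"
    for sid in (old_machine + "-1001", "S-1-5-32-544"):
        status = "local to old machine" if is_local_account_sid(sid, old_machine) else "ok"
        print(sid, "->", status)
```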

[[Edited 6th November, 2009]] 

Note - Replacing a domain controller that is running MSMQ hasn't been covered here. I would recommend running DCPROMO to demote the machine to a member server before proceeding, as you can always promote it back afterwards. This shouldn't be a problem, as you should always have two domain controllers and no single point of failure.

[[Thanks to Trace Young (editorial review) and Jan Varsavsky (testing)]]