It’s always the Load Balancer


OK, this is probably an (intentionally) inflammatory title, but in this case it's absolutely true.

A common Exchange Online migration issue we see is the SourceMailboxAlreadyBeingMovedTransientException, which I previously blogged about. One of the most common causes of this error is a communications error (found using the command in the previous post) that looks like this:

CommunicationErrorTransientException: The call to 'https://MRSPROXYHOST/EWS/mrsproxy.svc SERVER.domain.com (14.3.178.0 caps:05FFFF)' failed.
Error details: The remote endpoint no longer recognizes this sequence. This is most likely due to an abort on the remote endpoint. The value of wsrm:Identifier is not a known Sequence identifier. The reliable session was faulted.

You see this because MRS Proxy uses WCF's implementation of reliable sessions. In this scenario each request is sequenced (numbered), and the current (expected) sequence number is held in memory in MRS Proxy (in Exchange 2010, on the Client Access Server). The error occurs when a request is sent to a different CAS than the one previously being used, so the new CAS doesn't know about the existing sequence.

In Exchange 2010, we require HTTP(S) sessions for MRS Proxy to be sticky to a particular CAS (also referred to as persistence or affinity). That persistence has to be based on client IP, because the MRS Proxy client component doesn't store and persist cookies, which renders all forms of cookie-based affinity useless. Additionally, since MRS Proxy connections use NTLM authentication, you can't do persistence on the Authorization HTTP header. If you see this error, it's almost certainly the load balancer not maintaining persistence. You'll see this primarily when migrating from Exchange 2010, because Exchange 2013 and later don't require load balancer persistence.
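To make the failure mode concrete, here is a minimal Python sketch of the idea. It is a toy model, not WCF or MRS Proxy code: the class names (CasServer, RoundRobinBalancer, SourceIpBalancer), the run_move helper, the sequence identifier, and the client IP are all made up for illustration. Each simulated CAS keeps its known sequence identifiers only in its own memory, so a balancer without affinity sends the second request to a server that has never heard of the sequence.

```python
import hashlib
from itertools import count


class UnknownSequenceError(Exception):
    """Stands in for the 'not a known Sequence identifier' fault."""


class CasServer:
    """Toy CAS: reliable-session state lives only in this server's memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # sequence identifier -> next expected message number

    def create_sequence(self, seq_id):
        self.sessions[seq_id] = 1

    def handle(self, seq_id, message_number):
        if seq_id not in self.sessions:
            raise UnknownSequenceError(
                f"{self.name} does not know sequence {seq_id}"
            )
        if message_number != self.sessions[seq_id]:
            raise ValueError(f"{self.name} got message {message_number} out of order")
        self.sessions[seq_id] += 1
        return self.name


class RoundRobinBalancer:
    """No affinity: every request goes to the next server in turn."""

    def __init__(self, servers):
        self.servers = servers
        self._counter = count()

    def pick(self, client_ip):
        return self.servers[next(self._counter) % len(self.servers)]


class SourceIpBalancer:
    """Client-IP affinity: the same client IP always maps to the same server."""

    def __init__(self, servers):
        self.servers = servers

    def pick(self, client_ip):
        digest = hashlib.sha256(client_ip.encode()).digest()
        return self.servers[digest[0] % len(self.servers)]


def run_move(balancer, client_ip, seq_id="urn:uuid:example", messages=4):
    """Open a sequence, then send numbered messages through the balancer."""
    balancer.pick(client_ip).create_sequence(seq_id)
    served_by = []
    for number in range(1, messages + 1):
        served_by.append(balancer.pick(client_ip).handle(seq_id, number))
    return served_by


if __name__ == "__main__":
    client = "203.0.113.10"  # documentation-range IP, purely illustrative

    sticky = SourceIpBalancer([CasServer("CAS1"), CasServer("CAS2")])
    print("with affinity:", run_move(sticky, client))

    try:
        run_move(RoundRobinBalancer([CasServer("CAS1"), CasServer("CAS2")]), client)
    except UnknownSequenceError as err:
        print("without affinity:", err)
```

With client-IP affinity every message in the sequence lands on the same simulated server and the move completes. With round robin, the sequence is created on one server and the very next message goes to the other, which raises the stand-in for the "not a known Sequence identifier" fault.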

Hope this helps.

Comments (4)

  1. Mike Crowley says:

    How is it “always” the load balancer if “primarily when migrating from Exchange 2010”? We see these errors a lot when migrating from a large 2013 environment. Unfortunately, the issues often persist for more than 2 hours, and the idea of setting a registry setting (previous post) on 20+ servers seems like a lot of work without a higher degree of certainty (which I had before this post said it's always the HLB!). 🙂

    1. Hi Mike, yes, it’s almost never the load balancer when using a 2013 server for MRS Proxy. The reason is that you no longer need session affinity or persistence, as the front-end component in 2013 and later will automatically route the traffic to the correct back-end. Really any kind of transient error can cause this problem; it’s just that in 2010 that “sequence identifier” error specifically is most commonly caused by the HLB.

  2. Jazier says:

    Hi, I have the same error. In my scenario, I have two CAS servers which are NATed through public IP addresses. Public DNS resolves the remote MRS Proxy server domain to these 2 public IP addresses. Each time I try to migrate a mailbox from on-premises to Office 365, the error you mention appears. What do you suggest I do? Thanks!

    1. We’d typically recommend using a network load balancing solution instead of DNS-based load balancing for servers in the same AD site. That being said, it usually works anyway. Due to DNS TTL values you are absolutely going to get this behavior every once in a while when moving mailboxes. Obviously the higher the TTL on your records, the less often it should happen. Honestly, for migration, it may just be simpler to drop down to a single CAS. You’d be surprised what kind of migration throughput you can get on a single CAS.
