A Workflow Manager farm may not work as expected if your load balancer is not set to sticky session mode

The problem:

We have a SharePoint 2013 farm connected to a three-node Workflow Manager/Service Bus farm. An F5 load balancer sits between SharePoint and the WF farm.

 

When we registered the SharePoint workflow service using the FQDN of a single WF endpoint, everything worked well. However, we lost high availability whenever that endpoint was down: https://blog.bugrapostaci.com/2015/05/15/do-we-really-need-a-load-balancer-between-workflow-manager-farm-and-sharepoint/

 

When we used the load-balanced URL without sticky sessions, we hit an intermittent 403 error from both SharePoint and the WFMQuickTest tool:

 

HTTP/1.1 403

<Error xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><Code>UnexpectedError</Code><Message>Exception thrown from the messaging layer. For more details, please see the server logs.</Message></Error>

 

Analysis:

 

We captured memory dumps and found that the HTTP 403 error occurred when a correlated request was routed to a different node than the requests that preceded it.

 

0:014> !wselect m_Uri.m_String,m_StatusCode from 02433230

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized    //NodeId: WFNODE4354

0:014> !wselect m_Uri.m_String,m_StatusCode from 0246b468

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized

0:014> !wselect m_Uri.m_String,m_StatusCode from 024ad06c

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = c9 (0n201) Created

0:014> !wselect m_Uri.m_String,m_StatusCode from 024c10c4

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Activities/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized    //NodeId: WFNODE4354

0:014> !wselect m_Uri.m_String,m_StatusCode from 024f7594

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Activities/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized

0:014> !wselect m_Uri.m_String,m_StatusCode from 0250c190

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Activities/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = c9 (0n201) Created

0:014> !wselect m_Uri.m_String,m_StatusCode from 0250de8c

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Workflows/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized    //NodeId: WFNODE4354

0:014> !wselect m_Uri.m_String,m_StatusCode from 02547ba8

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Workflows/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized

0:014> !wselect m_Uri.m_String,m_StatusCode from 025589e4

[System.Net.HttpWebResponse]

(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Workflows/WFMQuickTest

(int32)System.Net.HttpStatusCode m_StatusCode = 193 (0n403) Forbidden    //NodeId: WFNODE4352. This is the 403 error mentioned above; the request reached WFNODE4352 instead of WFNODE4354.
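
If you need to read a longer dump of this kind, a small script can decode the status codes for you. The following is only a sketch: it assumes the output has been saved as text in the same two-line shape shown above (a m_Uri.m_String line followed by a m_StatusCode line), and the names parse_responses and sample are made up for this illustration. The hexadecimal value (for example 191) and the decimal value after 0n (401) are the same number, which the script checks.

```python
import re
from http import HTTPStatus

# Matches lines such as: "... m_StatusCode = 191 (0n401) Unauthorized"
STATUS_RE = re.compile(r"m_StatusCode = ([0-9a-fA-F]+) \(0n(\d+)\)")
# Matches lines such as: "... m_Uri.m_String = https://host:port/path"
URI_RE = re.compile(r"m_Uri\.m_String = (\S+)")


def parse_responses(dump_text):
    """Return (uri, status_code) pairs in the order they appear in the dump."""
    results = []
    uri = None
    for line in dump_text.splitlines():
        match = URI_RE.search(line)
        if match:
            uri = match.group(1)
            continue
        match = STATUS_RE.search(line)
        if match:
            code = int(match.group(1), 16)  # the first value is hexadecimal
            if code != int(match.group(2)):  # the 0n value is decimal
                raise ValueError(f"hex/decimal mismatch in line: {line!r}")
            results.append((uri, code))
    return results


# Two responses copied from the dump above, used as sample input.
sample = """\
(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest
(int32)System.Net.HttpStatusCode m_StatusCode = 191 (0n401) Unauthorized
(string)System.String m_Uri.m_String = https://workflow.server.com:12290/WFMQuickTest/$Workflows/WFMQuickTest
(int32)System.Net.HttpStatusCode m_StatusCode = 193 (0n403) Forbidden
"""

for uri, code in parse_responses(sample):
    print(code, HTTPStatus(code).phrase, uri)
```

In the dump above, each URL shows two 401 responses before the request succeeds, and the 403 appears on the one request that was answered by a different node.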

 

Resolution:

 

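Configure the F5 load balancer to use sticky sessions (session persistence), so that correlated requests from the same client are always routed to the same WF node.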
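
To illustrate why sticky sessions help, here is a toy simulation. It is not how Workflow Manager or F5 is implemented; it only assumes, as a simplification of the finding above, that a node rejects a correlated request with 403 when the earlier handshake happened on another node. The node names, the Farm class, and the 200 status used for an accepted follow-up are all hypothetical.

```python
import zlib
from itertools import cycle

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


class Farm:
    """Toy farm: each node only remembers clients that handshook with it."""

    def __init__(self, nodes):
        self.sessions = {node: set() for node in nodes}

    def handshake(self, node, client):
        self.sessions[node].add(client)
        return 201

    def follow_up(self, node, client):
        # Assumption: only the node that holds the session accepts the request.
        return 200 if client in self.sessions[node] else 403


def round_robin(nodes):
    """Return a picker that ignores the client and rotates through nodes."""
    ring = cycle(nodes)
    return lambda client: next(ring)


def sticky(nodes):
    """Return a picker that always maps a given client to the same node."""
    return lambda client: nodes[zlib.crc32(client.encode()) % len(nodes)]


def run(pick, clients, follow_ups=3):
    """Handshake each client, send follow-ups, and collect their statuses."""
    farm = Farm(NODES)
    statuses = []
    for client in clients:
        farm.handshake(pick(client), client)
        for _ in range(follow_ups):
            statuses.append(farm.follow_up(pick(client), client))
    return statuses


clients = ["sharepoint-1", "sharepoint-2"]  # hypothetical client names
print("round robin:", run(round_robin(NODES), clients))
print("sticky:     ", run(sticky(NODES), clients))
```

In this model, round robin sends some follow-up requests to nodes that never saw the handshake, so 403 responses appear intermittently, while the sticky picker keeps every request from one client on one node and no 403 occurs.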

 

Best regards,

WenJun Zhang