SharePoint WorkflowQuotaExceededException caused by ServiceBus deferred messages

The problem:

A SharePoint 2013 farm using Workflow Manager 1.0 stopped working, and the following Microsoft.Workflow.Client.WorkflowQuotaExceededException failure appeared in the ULS log:

w3wp.exe         0x4AFC SharePoint Server              Workflow Services               Exception              Microsoft.Workflow.Client.WorkflowQuotaExceededException: Cannot start more instances because the size of the topic has exceeded the quota limit. HTTP headers received from the server - ActivityId: 895a6d56-92c1-4ce7-bb17-33b40607c372. NodeId: . Scope: /SharePoint/default/a67eb351-30b5-43f8-9be4-1ed1faf6647a/87c9068b-6ccc-4033-9102-5d27c43a476a. Client ActivityId : d9deb79d-724d-10a7-6483-11ea093bdd49. ---> System.Net.WebException: The remote server returned an error: (403) Forbidden.     at Microsoft.Workflow.Common.AsyncResult.End[TAsyncResult](IAsyncResult result)     at Microsoft.Workflow.Client.HttpGetResponseAsyncResult`1.End(IAsyncResult result)     at Microsoft.Workflow.Client.ClientHelpers.SendRequest[T](HttpWebRequest request, T content)     --- End of inner exception stack trace ---     at Microsoft.Workflow.Client.ClientHelpers.SendRequest[T](HttpWebRequest request, T content)     at Microsoft.Workflow.Client.WorkflowManager.StartInternal(String workflowName, WorkflowStartParameters startParameters)     at Microsoft.SharePoint.WorkflowServices.FabricWorkflowManagementClient.StartInstance(String serviceGroupName, String workflowName, String monitoringParam, String activationKey, IDictionary`2 payload)     at Microsoft.SharePoint.WorkflowServices.FabricWorkflowInstanceProvider.StartWorkflow(WorkflowSubscription subscription, IDictionary`2 payload) StackTrace:  at Microsoft.Office.Server.Native.dll: (sig=678c0f87-966f-4d99-9c94-b49e788d2672|2|microsoft.office.server.native.pdb, offset=131CE) at Microsoft.Office.Server.Native.dll: (offset=21BE5)            d9deb79d-724d-10a7-6483-11ea093bdd49
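
When many ULS entries have to be checked, the identifying fields of this error can be pulled out with a small script. The following is only a minimal sketch in Python: the helper name parse_quota_error and the regular expressions are illustrative assumptions based on the single entry shown above, not part of any SharePoint or Workflow Manager tooling.

```python
import re

QUOTA_ERROR = "Microsoft.Workflow.Client.WorkflowQuotaExceededException"

# Shortened copy of the ULS entry shown above, used as sample input.
SAMPLE_ENTRY = (
    "Microsoft.Workflow.Client.WorkflowQuotaExceededException: Cannot start more "
    "instances because the size of the topic has exceeded the quota limit. "
    "HTTP headers received from the server - "
    "ActivityId: 895a6d56-92c1-4ce7-bb17-33b40607c372. NodeId: . "
    "Scope: /SharePoint/default/a67eb351-30b5-43f8-9be4-1ed1faf6647a/"
    "87c9068b-6ccc-4033-9102-5d27c43a476a. "
    "Client ActivityId : d9deb79d-724d-10a7-6483-11ea093bdd49."
)


def parse_quota_error(entry):
    """Hypothetical helper: return the server ActivityId and Scope of a
    quota error entry, or None if the entry is a different message."""
    if QUOTA_ERROR not in entry:
        return None
    # "ActivityId: " (no space before the colon) is the server-side id;
    # the client id is written as "Client ActivityId : " and is skipped.
    activity = re.search(r"ActivityId: ([0-9a-fA-F-]{36})", entry)
    # The scope path ends at the first period that is followed by whitespace.
    scope = re.search(r"Scope: (\S+?)\.\s", entry)
    return {
        "activity_id": activity.group(1) if activity else None,
        "scope": scope.group(1) if scope else None,
    }


if __name__ == "__main__":
    print(parse_quota_error(SAMPLE_ENTRY))
```

The Scope value identifies the workflow scope that hit the quota, which helps confirm that repeated entries all point at the same scope.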

Analysis:

By running the following SQL query against the Service Bus message container database, we found more than 1 million deferred message records.

SELECT COUNT(*) AS totalDeferred FROM [MessageReferencesTable] WHERE state = 2

This is the reason why WFM reports that the topic quota has been exceeded. Next, use the following query to find which workflow is generating so many deferred messages.

SELECT T2.SessionId, T1.WorkflowName, T1.WorkflowStatus, T2.state, COUNT(*) AS total
FROM [WFInstanceManagementDB].[dbo].[Instances] T1
INNER JOIN [SBMessageContainer01].[dbo].[MessageReferencesTable] T2 ON T1.[SessionId] = T2.[SessionId]
WHERE T2.state = 2
GROUP BY T2.SessionId, T1.WorkflowName, T1.WorkflowStatus, T2.state
ORDER BY total DESC
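
The query above returns one row per workflow instance (session). If a single workflow has many instances, it can be easier to roll the counts up per workflow name. The sketch below illustrates that idea in Python against an in-memory SQLite database. The two tables are toy stand-ins that model only the columns used by the join, every row is made up, and the function name deferred_by_workflow is hypothetical. It demonstrates the shape of the aggregation and is not meant to be run against the real Service Bus or WFM databases.

```python
import sqlite3


def build_toy_db():
    """Create toy stand-ins for Instances and MessageReferencesTable.
    Only the columns used by the query are modelled; all rows are invented."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Instances (SessionId TEXT, WorkflowName TEXT, WorkflowStatus TEXT);
        CREATE TABLE MessageReferencesTable (SessionId TEXT, State INTEGER);
        INSERT INTO Instances VALUES
            ('s1', 'ApprovalWF', 'Started'),
            ('s2', 'ApprovalWF', 'Started'),
            ('s3', 'NotifyWF',   'Started');
        INSERT INTO MessageReferencesTable VALUES
            ('s1', 2), ('s1', 2), ('s2', 2), ('s3', 2), ('s3', 0);
    """)
    return conn


def deferred_by_workflow(conn):
    """Count deferred messages (State = 2, as in the article) per workflow name."""
    return conn.execute("""
        SELECT T1.WorkflowName, COUNT(*) AS total
        FROM Instances T1
        INNER JOIN MessageReferencesTable T2 ON T1.SessionId = T2.SessionId
        WHERE T2.State = 2              -- filter before grouping
        GROUP BY T1.WorkflowName
        ORDER BY total DESC
    """).fetchall()


if __name__ == "__main__":
    for name, total in deferred_by_workflow(build_toy_db()):
        print(name, total)
    # prints:
    # ApprovalWF 3
    # NotifyWF 1
```

Grouping by workflow name instead of SessionId trades per-instance detail for a shorter list, which makes the workflow with the most deferred messages stand out at the top.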

Solution:

  1. Undeploy the problematic workflow.
  2. Contact Microsoft Support to have the deferred messages cleaned up with our internal CleanupDeferredMessages tool.
  3. Make sure both WFM and SharePoint are updated with the latest patches. Otherwise, deferred messages cannot be completely avoided.

Best regards,

WenJun Zhang