WCF/MSMQ intermittent MQ_ERROR_TRANSACTION_SEQUENCE error

My customer has an application that ran fine until it was upgraded to WCF 4.0. After that, he intermittently received the following message. 0xc00e0051 means MQ_ERROR_TRANSACTION_SEQUENCE.

 

<Message>An error occurred while receiving a message from the queue: Unrecognized error -1072824239 (0xc00e0051) . Ensure that MSMQ is installed and running. Make sure the queue is available to receive from.</Message>

 

<StackTrace>

at System.ServiceModel.Channels.MsmqQueue.TryReceiveInternal(NativeMsmqMessage message, TimeSpan timeout, MsmqTransactionMode transactionMode, Int32 action)

at System.ServiceModel.Channels.MsmqQueue.TryReceive(NativeMsmqMessage message, TimeSpan timeout, MsmqTransactionMode transactionMode)

at System.ServiceModel.Channels.MsmqReceiveHelper.TryReceive(MsmqInputMessage msmqMessage, TimeSpan timeout, MsmqTransactionMode transactionMode, MsmqMessageProperty&amp; property)

at System.ServiceModel.Channels.MsmqInputChannelBase.TryReceive(TimeSpan timeout, Message&amp; message)

at System.ServiceModel.Dispatcher.InputChannelBinder.TryReceive(TimeSpan timeout, RequestContext&amp; requestContext)

at System.ServiceModel.Dispatcher.ErrorHandlingReceiver.TryReceive(TimeSpan timeout, RequestContext&amp; requestContext)

at System.ServiceModel.Dispatcher.ChannelHandler.TryTransactionalReceive(Transaction tx, RequestContext&amp; request)

at System.ServiceModel.Dispatcher.ChannelHandler.TransactedLoop()

at System.ServiceModel.Dispatcher.ChannelHandler.SyncTransactionalMessagePump()

at System.ServiceModel.Dispatcher.ChannelHandler.OnStartSyncMessagePump(Object state)

at System.Runtime.IOThreadScheduler.ScheduledOverlapped.IOCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)

at System.Runtime.Fx.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)

at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

This faults the WCF service host. After that, no more messages are processed. To recover, you have to restart the service. A workaround is to detect the Faulted event and restart the service host.
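The restart workaround can be sketched roughly as follows. This is only a minimal illustration, not the customer's code: the `RestartingHost` class is hypothetical, and it assumes the endpoints are defined in the application configuration file. It relies on the `Faulted` event and the `Abort` method that `ServiceHost` exposes.

```csharp
using System;
using System.ServiceModel;

// Hypothetical wrapper (not part of WCF): recreates the ServiceHost when it faults.
public sealed class RestartingHost
{
    private readonly Type serviceType;
    private readonly object sync = new object();
    private ServiceHost host;

    public RestartingHost(Type serviceType)
    {
        this.serviceType = serviceType;
    }

    public void Start()
    {
        lock (sync)
        {
            // Assumption: endpoints and behaviors come from app.config.
            host = new ServiceHost(serviceType);
            host.Faulted += OnFaulted;
            host.Open();
        }
    }

    private void OnFaulted(object sender, EventArgs e)
    {
        var faulted = (ServiceHost)sender;
        faulted.Faulted -= OnFaulted;
        // A faulted host cannot be reopened; Abort releases it immediately.
        faulted.Abort();
        Start();
    }

    public void Stop()
    {
        lock (sync)
        {
            if (host == null) return;
            host.Faulted -= OnFaulted;
            if (host.State == CommunicationState.Faulted) host.Abort();
            else host.Close();
            host = null;
        }
    }
}
```

A caller would create it with the service type, for example `new RestartingHost(typeof(OrderProcessorService)).Start();`. Real code would probably also log the fault and add a retry delay in case reopening fails.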

 

After enabling WCF tracing, we could see that the receive timed out after 1 minute.

 

 

 

We know this can happen when another application monitors the same queue. Another scenario is described here:

https://blog.jorgef.net/2011/07/msmqexception.html

 

Although the symptoms exactly match my customer’s problem, unfortunately neither of these is my customer’s scenario. In my customer’s case, no other application monitors the same queue, and only one endpoint is defined. Fortunately, my customer was very active in helping us reproduce the problem. With that help, we found out why this happens and a better solution.

 

Why this happens

The following code shows how WCF retrieves messages from a queue.

// System.ServiceModel.Dispatcher.ChannelHandler

private bool TransactedLoop()

{

    try

    {

        this.receiver.WaitForMessage(); // Calls MQReceiveMessage with an INFINITE timeout to "peek" at a message. It returns only when a message is available in the queue.

    }

    catch (Exception ex)

    {

    }

     while (true)

        {

            if (!this.TryTransactionalReceive(transaction, out requestContext)) // Calls MQReceiveMessage again to retrieve the message, this time with a timeout (1 minute by default).

            {

                break;

            }

 

If the message is retrieved by another application (or disappears for any other reason) between the two MQReceiveMessage calls above, that is, between WaitForMessage and TryTransactionalReceive, the second MQReceiveMessage call waits until the transaction times out and then returns 0xc00e0051.

 

Better Solution?

We already have a workaround: restart the service host after the error happens. However, this is definitely not a good option.

 

By live debugging my test WCF application, I found that MQReceiveMessage returns the 0xc00e0051 error whenever no message becomes available before the transaction times out.

 

Then, after reviewing the code and the WCF trace, I found that the timeout value passed to MQReceiveMessage is the smaller of these two values:

 

- System.Transactions timeout (default is 1 minute if not specified)

- WCF receiveTimeout (default is 10 minutes)

 

This is why the error occurs after 1 minute: the 1-minute transaction timeout is used by default. So another solution is to make the timeout passed to MQReceiveMessage smaller than the transaction timeout.
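To make the arithmetic concrete, here is a minimal C# sketch of the "smaller of the two" selection described above. The class and method names are hypothetical; the code only models the behavior observed in the debugger and the trace, it is not the WCF source.

```csharp
using System;

// Hypothetical helper: models the observed behavior, not actual WCF code.
public static class ReceiveTimeoutSketch
{
    // The timeout handed to MQReceiveMessage is the smaller of the two values.
    public static TimeSpan Effective(TimeSpan transactionTimeout, TimeSpan receiveTimeout)
    {
        return transactionTimeout < receiveTimeout ? transactionTimeout : receiveTimeout;
    }

    public static void Main()
    {
        // Defaults: 1-minute transaction timeout, 10-minute receiveTimeout.
        Console.WriteLine(Effective(TimeSpan.FromMinutes(1), TimeSpan.FromMinutes(10)));

        // receiveTimeout lowered to 30 seconds, transaction timeout still 1 minute.
        Console.WriteLine(Effective(TimeSpan.FromMinutes(1), TimeSpan.FromSeconds(30)));
    }
}
```

With the defaults, the receive waits for the full lifetime of the transaction, so the transaction expires while the call is still waiting. With a 30-second receiveTimeout, the receive gives up while the transaction is still alive.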

 

I then managed to reproduce the issue with the help of a debugger. Once we understood why it happens, we fixed the issue with the following configuration (receiveTimeout set to 30 seconds, transactionTimeout set to 1 minute).

 

  <system.serviceModel>

    <services>

      <service name="Microsoft.Samples.MSMQTransactedSample.OrderProcessorService" behaviorConfiguration="OrderProcessorServiceBehavior">

        <host>

          <baseAddresses>

            <add baseAddress="https://localhost:8000/ServiceModelSamples/service"/>

          </baseAddresses>

        </host>

        <!-- Define NetMsmqEndpoint -->

        <endpoint address="net.msmq://localhost/private/ServiceModelSamplesTransacted" binding="netMsmqBinding" bindingConfiguration="TransactedBinding" contract="Microsoft.Samples.MSMQTransactedSample.IOrderProcessor"/>

        <!-- the mex endpoint is exposed at https://localhost:8000/ServiceModelSamples/service/mex -->

        <endpoint address="mex" binding="mexHttpBinding" contract="IMetadataExchange"/>

      </service>

    </services>

    <behaviors>

      <serviceBehaviors>

        <behavior name="OrderProcessorServiceBehavior">

          <serviceTimeouts transactionTimeout="00:01:00"/>

          <serviceMetadata httpGetEnabled="True"/>

        </behavior>

      </serviceBehaviors>

    </behaviors>

    <bindings>

      <netMsmqBinding>

        <binding name="TransactedBinding" receiveTimeout="00:00:30">

          <security mode="None"/>

        </binding>

      </netMsmqBinding>

    </bindings>

  </system.serviceModel>

 

WCF only allows setting the transaction timeout to a value smaller than the default System.Transactions timeout (1 minute). If you need transactions longer than 1 minute, increase the default transaction timeout in your application configuration file (app.config or web.config).

 

<configuration>

<system.transactions>

  <defaultSettings timeout="00:10:00" />

</system.transactions>

</configuration>

 

Furthermore, there is a machine-wide maximum timeout, which is 10 minutes by default. If the default timeout specified in app.config or web.config is greater than the configured maximum, it is reduced to the maximum. The maximum timeout can be changed in machine.config as follows:

 

<system.transactions>

    <machineSettings maxTimeout="00:10:00" />

</system.transactions>

 

If your application requires a longer timeout, configure machine.config and app.config or web.config according to the following rule to avoid the 0xc00e0051 error:

 

Machine-level transaction timeout >= default transaction timeout >= WCF service transactionTimeout > WCF receiveTimeout.
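The rule can be checked mechanically. The following is a small sketch with hypothetical names. It uses >= for the first two comparisons, as in the scenarios later in this post, and it assumes, as described in this post, that a transactionTimeout of 00:00:00 falls back to the default transaction timeout.

```csharp
using System;

// Hypothetical checker for the ordering rule above; names are illustrative only.
public static class TimeoutRule
{
    public static bool IsSatisfied(TimeSpan machineMax, TimeSpan defaultTimeout,
                                   TimeSpan serviceTransactionTimeout, TimeSpan receiveTimeout)
    {
        // Assumption taken from this post: 00:00:00 means "use the default timeout".
        TimeSpan service = serviceTransactionTimeout == TimeSpan.Zero
            ? defaultTimeout
            : serviceTransactionTimeout;

        return machineMax >= defaultTimeout
            && defaultTimeout >= service
            && service > receiveTimeout;
    }

    public static void Main()
    {
        // 20 min >= 15 min >= (00:00:00, i.e. 15 min) > 10 min
        Console.WriteLine(IsSatisfied(TimeSpan.FromMinutes(20), TimeSpan.FromMinutes(15),
                                      TimeSpan.Zero, TimeSpan.FromMinutes(10)));

        // Out-of-the-box defaults: the 1-minute transaction is not longer than
        // the 10-minute receiveTimeout, so the rule is violated.
        Console.WriteLine(IsSatisfied(TimeSpan.FromMinutes(10), TimeSpan.FromMinutes(1),
                                      TimeSpan.Zero, TimeSpan.FromMinutes(10)));
    }
}
```

The first call should satisfy the rule and the second should not, which matches the failure seen with the default settings.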

 

Here is a sample configuration that keeps the WCF default settings and increases the DTC timeout. In this scenario: machine-level DTC timeout (20 minutes) >= default timeout (15 minutes) >= WCF service transactionTimeout (00:00:00, which now equals 15 minutes) > receiveTimeout (default is 10 minutes).

 

In the application configuration file:

 

<configuration>

<system.transactions>

    <defaultSettings timeout="00:15:00" />

</system.transactions>

</configuration>

 

And in machine.config:

 

<system.transactions>

    <machineSettings maxTimeout="00:20:00" />

</system.transactions>

 

Here is another scenario, in which we set the WCF receiveTimeout to 5 minutes. In this scenario: machine-level DTC timeout (10 minutes) >= default timeout (10 minutes) >= WCF service transactionTimeout (00:10:00, 10 minutes) > receiveTimeout (5 minutes).

 

<behaviors>

      <serviceBehaviors>

        <behavior name="OrderProcessorServiceBehavior">

          <serviceTimeouts transactionTimeout="00:10:00"/>

          <serviceMetadata httpGetEnabled="True"/>

        </behavior>

      </serviceBehaviors>

    </behaviors>

 

    <bindings>

      <netMsmqBinding>

        <binding name="TransactedBinding" receiveTimeout="00:05:00">

          <security mode="None"/>

        </binding>

      </netMsmqBinding>

</bindings>

 

<configuration>

<system.transactions>

    <defaultSettings timeout="00:10:00" />

</system.transactions>

</configuration>

 

References:

 

MQReceiveMessage

https://msdn.microsoft.com/en-us/library/windows/desktop/ms699825(v=vs.85).aspx

 

WCF transactionTimeout

https://msdn.microsoft.com/en-us/library/vstudio/ms789017.aspx

 

WCF receiveTimeout

https://msdn.microsoft.com/en-us/library/system.servicemodel.channels.binding.receivetimeout.aspx

 

Default DTC settings

https://msdn.microsoft.com/en-us/library/system.transactions.configuration.defaultsettingssection.timeout.aspx

 

Machine level DTC time out which is 10 minutes

https://msdn.microsoft.com/en-us/library/system.transactions.configuration.machinesettingssection.maxtimeout.aspx

 

See you next time.

Wei