Intermittent Error: MSDTC on server 'servername' is unavailable

MSDTC on server 'servername' is unavailable is a generic error which indicates the application cannot connect to the MSDTC service or determine the service state at the beginning of a transaction. This article addresses the cause and the solutions for the error when it is occurring intermittently.

Symptoms

When you try to execute a distributed transaction that depends on MSDTC in clustered SQL Server, you may receive an error message that resembles the following:

MSDTC on server 'servername' is unavailable

Sometimes the error goes away when the failed instance fails over to a different cluster node, and the error comes back when the instance fails back to the original cluster node. The MSDTC cluster resource is configured correctly. When the error happens, the MSDTC cluster resource is actually online and can be failed over without any errors, however the failed instance just isn’t able to determine that MSDTC is started. You may see the problem occurring on multiple SQL Server instances.

You may see messages that resemble the followings in the SQL error logs:

2009-01-20 13:12:03.35 Server Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
2009-01-20 13:12:04.07 Server The Microsoft Distributed Transaction Coordinator (MS DTC) service could not be contacted. If you would like distributed transaction functionality, please start this service.

You may see an error message that resembles the following in the application event log:

Event Type: Error
Event Source: MSDTC Client
Event Category: CM
Event ID: 4357
Date: 7/23/2008
Time: 8:28:52 PM
User: N/A
Computer: ComputerName
Description:
MS DTC is unable to communicate with MS DTC on a remote system. No common RPC protocol is supported between the two systems. Please ensure that one or more of the following RPC protocols are common to both systems: TCP/IP, SPX, or NetBEUI.
Error Specifics: d:\nt\com\complus\dtc\dtc\cm\src\iomgrclt.cpp:219, Pid: 5784

You may encounter this error when you try to start Cluster Administrator:

Error 1721 Not enough resources are available to complete this operation.

The Cluster service may not start and the following events may be seen in the system event log:

1009 ClusSvc “The Clustering Service could not join an existing cluster and could not form a new cluster. The Clustering Service has terminated."
7031 Service Control Manager "The Cluster Service service terminated unexpectedly. It has done this n time(s). The following corrective action will be taken in n milliseconds: Restart the service."
1070 ClusSvc "The node failed to begin the process of joining the cluster. The error code was 1721."

Other errors include:

Executed as user: User. MSDTC on server 'servername' is unavailable.
[SQLSTATE 42000] (Error 8501) The operation could not be performed because the OLE DB provider 'SQLOLEDB' was unable to begin a distributed transaction

SqlServerError:
Source: .Net SqlClient Data Provider
Number: 8501
State: 2
Class: 16
Server: ServerName
Message: MSDTC on server 'servername' is unavailable.
Procedure:
Line Number: 1

The SSIS Runtime has failed to start the distributed transaction due to error 0x8004D01B "The Transaction Manager is not available.". The DTC transaction failed to start. This could occur because the MSDTC Service is not running.

Began Executing 2007-07-10 19:35:00
Msg 50000, Sev 16, State 1, Line 1721 : MSDTC on server 'servername' is unavailable. [SQLSTATE 42000] 

[SQLSTATE 42000] (Error 8501) The operation could not be performed because the OLE DB provider 'SQLOLEDB' was unable to begin a distributed transaction

Cause

The issue occurs when SQL Server instances cannot connect to MSDTC using RPC calls because all of the RPC server ports are exhausted, which can happen if the range of available RPC server ports is restricted in the registry. By default, RPC dynamic port allocation randomly selects port numbers from 1025 to 5000 (Windows 2003 and earlier) or from 49152 to 65535 (Windows Vista and 2008, see KB929851). By adding the HKEY_LOCAL_MACHINE\Software\Microsoft\RPC\Internetsubkey, you can define the Internet Protocol (IP) ports that are available for RPC communication, allowing DTC to communicate across firewalls. Note that the available RPC server ports are used by all RPC applications and services running on the server. Most of time you see the problem on the Microsoft Cluster Server (MSCS) cluster because RPC is heavily used on the cluster server. The issue rarely happens on a non-clustered server because a much smaller number of PRC server ports are usually sufficient.

The port issue is further compounded by a limitation on the Windows Server 2003 cluster. The Component Services UI (Dcomcnfg.exe) may use more than 70 ports on the cluster. If you are running out of ports on the Windows Server 2003 cluster, don’t open more than one Component Services UI at a time. We have eliminated this limitation on Windows Server 2008. For more information, see

Error message when you use the Distributed COM Configuration Properties tool (Dcomcnfg.exe) in a cluster environment

Resolution

To resolve this issue, make a minimum of 200 ports available in the registry on the cluster if the HKEY_LOCAL_MACHINE\Software\Microsoft\RPC\Internet subkey is present in the registry. Please follow the steps in the article below for Windows 2003:

How to configure RPC dynamic port allocation to work with firewalls