SQL SERVER 2000 CLUSTER FAILS TO COME ONLINE ON CLUSTER AFTER APPLYING SECURITY PATCH oN WINDOWS SERVER

We have observed a number of cases where Sql Server resource fails to come online on Windows 2000 Cluster after applying a security patch.

As mentioned in my previous posts whenever a sql server resource fails to come online we should start by checking the Sql Server Error log to see if the sql server is getting started or not.

From the Sql Server Error log we see the following information

2009-02-27 21:25:46.23 spid3 SQL global counter collection task is created.
2009-02-27 21:25:50.48 server SQL server listening on TCP, Shared Memory, Named Pipes.
2009-02-27 21:25:50.48 server SQL Server is ready for client connections
2009-02-27 21:27:09.76 spid322 Starting up database 'test'.
2009-02-27 21:28:49.65 spid3 SQL Server is terminating due to 'stop' request from Service Control Manager.

The above information shows that sql server was started successfully but was terminated by the Service Control manager.

We checked the Cluster.log located in C:\WINNT\Cluster folder and found the following error information

SQL Server <SQL Server>: [sqsrvres] ODBC sqldriverconnect failed
SQL Server <SQL Server>: [sqsrvres] checkODBCConnectError: sqlstate = 28000; native error = 4814; message = [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '(null)'. Reason: Not associated with a trusted SQL Server connection.
SQL Server <SQL Server>: [sqsrvres] ODBC sqldriverconnect failed
SQL Server <SQL Server>: [sqsrvres] checkODBCConnectError: sqlstate = 28000; native error = 4814; message = [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '(null)'. Reason: Not associated with a trusted SQL Server connection.
SQL Server <SQL Server>: [sqsrvres] ODBC sqldriverconnect failed
SQL Server <SQL Server>: [sqsrvres] checkODBCConnectError: sqlstate = 28000; native error = 4814; message = [Microsoft][ODBC SQL Server Driver][SQL Server]Login failed for user '(null)'. Reason: Not associated with a trusted SQL Server connection.

From the Error Message we see that Cluster Service is unable to login into the Sql server to perform the "IsAlive" and hence it was unable to bring the sql server resource online on cluster. Cluster Service was failing to login in the sql server with Error message "Login failed for user '(null)'" so in other words Cluster Service was not able to pass on it's credentials to the sql server.

This problem occurs because in the security patch for Windows 2000 as well as with Windows Server 2003 SP1 we have a new security feature introduced which is called as loopback check functionality.

By default, loopback check functionality is turned on in Windows Server 2003 SP1, and the value of the DisableLoopbackCheck registry entry is set to 0 (zero).

Note The loopback check functionality is stored in the following registry subkey for x86 based system:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\DisableLoopbackCheck

IMPORTANT : This resolution contains information about modifying the registry. Before you modify the registry, make sure to back it up and make sure that you understand how to restore the registry if a problem occurs. For information about how to back up, restore, and edit the registry, click the following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows Registry

In order to resolve the issue we need to Disable the loop back check functionality by setting the above registry key value to 1

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\DisableLoopbackCheck ----- Set it to 1.

After Disabling the Loop back check when we try to bring the sql server resource online and it should online fine without any errors.

Note:

This post should not be treated as the Microsoft’s Recommendation or Resolution to the problem, it is only a workaround which worked in our environment and so we would like to share it.

Parikshit Savjani
SE, Microsoft Sql Server

Reviewed By Anurag Sharma, SQL Server Escalation Services

Technorati Tags: Cluster,fail,sql,online,resources,security patch,windows 2000,windows 2003,Loopback,Login failed for user '(null)',Not associated with a trusted connection