SQLDumper unable to generate mdmp files in SQL Server 2008 Failover clusters

SQLDumper utility is used by various components of the product to generate and save diagnostic information in the form of mini-dump and other log files. You will normally find the output files [with extension .mdmp, .txt and .log] created by sqldumper.exe in the LOG folder of the specific instance for the Database Engine. Recently while working with a customer we noticed a problem that prevents these important files from getting created and could affect some of you who are managing SQL Server 2008 failover cluster installations. There are 2 alternate options you can follow to install SQL Server failover cluster:

                Integrated Installation with Add Node

                Advanced/Enterprise Installation [Prepare and Complete]

 If you perform a “Prepare and Complete” option of install for a failover cluster, then sqldumper.exe may not have the capability to automatically generate dumps for critical server health conditions including exceptions and other errors.

This problem is caused by a combination of two factors:

                ErrorDumpDir registry value under the key [HKEY_LOCAL_MACHINESOFTWAREMicrosoftMicrosoft SQL Server<instance>CPE] is pointing to an incorrect location

                SQL Server service account and SQLDumper.exe does not have necessary permissions to write files here due to the ACL settings on this folder location

Under this condition, when SQLSERVR.EXE attempts to launch SQLDUMPER.EXE and generate dumps for sever conditions or events; you will notice the following in the SQL Server Error log.

2009-02-25 23:01:03.40 spid52 Using 'dbghelp.dll' version '4.0.5'

2009-02-25 23:01:03.40 spid52 **Dump thread - spid = 0, EC = 0x0000000080AA40F0

2009-02-25 23:01:03.40 spid52 *

2009-02-25 23:01:03.40 spid52 * User initiated stack dump. This is not a server exception dump.

2009-02-25 23:01:03.40 spid52 *

2009-02-25 23:01:03.40 spid52 * *******************************************************************************

2009-02-25 23:01:03.40 spid52 *

2009-02-25 23:01:03.40 spid52 * BEGIN STACK DUMP:

2009-02-25 23:01:03.41 spid52 * 02/25/09 23:01:03 spid 52

2009-02-25 23:01:03.41 spid52 *

2009-02-25 23:01:03.41 spid52 * Dump triggered by event 'sqldumper_check'.

2009-02-25 23:01:03.41 spid52 *

2009-02-25 23:01:03.41 spid52 * Input Buffer 64 bytes -

2009-02-25 23:01:03.41 spid52 * select * from notable

2009-02-25 23:01:03.41 spid52 *

2009-02-25 23:01:03.41 spid52 * *******************************************************************************

2009-02-25 23:01:03.41 spid52 * -------------------------------------------------------------------------------

2009-02-25 23:01:03.41 spid52 * Short Stack Dump

2009-02-25 23:01:03.43 spid52 Stack Signature for the dump is 0x000000000C2C26CB

2009-02-25 23:01:03.49 spid52 External dump process return code 0x20000003.

Error - Failed to create dump file.

 

If you monitor the sequence of operations performed by sqldumper.exe through Process Monitor tool, you can clearly see the problem:

11:01:03.5060825 PM SQLDUMPER.EXE 1704 CreateFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG SUCCESS Desired Access: Read Data/List Directory, Synchronize, Disposition: Open, Options: Directory, Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened

11:01:03.5061132 PM SQLDUMPER.EXE 1704 QueryDirectory C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDmpr*.mdmp NO SUCH FILE Filter: SQLDmpr*.mdmp

11:01:03.5061420 PM SQLDUMPER.EXE 1704 CloseFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG SUCCESS

11:01:03.5062758 PM SQLDUMPER.EXE 1704 CreateFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDMPR0001.MDMP NAME NOT FOUND Desired Access: Read Attributes, Disposition: Open, Options: Complete If Oplocked, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a

11:01:03.5063605 PM SQLDUMPER.EXE 1704 CreateFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDmpr0001.mdmp ACCESS DENIED Desired Access: Generic Read/Write, Disposition: Create, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

<< cut >>

11:01:03.5072883 PM SQLDUMPER.EXE 1704 CreateFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDUMPER_ERRORLOG.log ACCESS DENIED Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5073584 PM SQLDUMPER.EXE 1704 QueryNameInformationFile C:WindowsSystem32 SUCCESS Name: WindowsSystem32

11:01:03.5074458 PM SQLDUMPER.EXE 1704 CreateFile C:WindowsSystem32SQLDUMPER_ERRORLOG.log ACCESS DENIED Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5076866 PM SQLDUMPER.EXE 1704 CreateFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDUMPER_ERRORLOG.log ACCESS DENIED Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5077526 PM SQLDUMPER.EXE 1704 QueryNameInformationFile C:WindowsSystem32 SUCCESS Name: WindowsSystem32

11:01:03.5078319 PM SQLDUMPER.EXE 1704 CreateFile C:WindowsSystem32SQLDUMPER_ERRORLOG.log ACCESS DENIED Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5078995 PM SQLDUMPER.EXE 1704 Thread Exit SUCCESS Thread ID: 3256, User Time: 0.0000000, Kernel Time: 0.0156250

11:01:03.5080350 PM SQLDUMPER.EXE 1704 Process Exit SUCCESS Exit Status: 536870915, User Time: 0.0000000, Kernel Time: 0.0156250, Private Bytes: 2,822,144, Peak Private Bytes: 2,822,144, Working Set: 4,042,752, Peak Working Set: 4,042,752

After you perform a “Prepare and Complete” SQL Failover cluster setup, you will observe the following settings:

Registry key: HKEY_LOCAL_MACHINESOFTWAREMicrosoftMicrosoft SQL ServerMSSQL10.MYINSTANCENAMECPE

ErrorDumpDir = C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG

Folders:

 Directory of C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQL

02/25/2009 08:35 PM <DIR> .

02/25/2009 08:35 PM <DIR> ..

02/25/2009 08:35 PM <DIR> Binn

02/25/2009 08:35 PM <DIR> Install

02/25/2009 03:44 PM <DIR> LOG

02/25/2009 03:39 PM <DIR> Upgrade

01/08/2009 09:55 PM 14,168 sql_engine_core_inst_keyfile.dll

07/10/2008 04:40 AM 17,432 sql_fulltext_keyfile.dll

 Directory of H:MSSQL10.MYINSTANCENAMEMSSQL

02/25/2009 05:05 PM <DIR> .

02/25/2009 05:05 PM <DIR> ..

02/25/2009 05:05 PM <DIR> Backup

02/25/2009 05:05 PM <DIR> DATA

02/25/2009 08:37 PM <DIR> FTData

02/25/2009 05:05 PM <DIR> JOBS

02/25/2009 10:52 PM <DIR> Log

02/25/2009 05:05 PM <DIR> repldata

H drive is the shared cluster disk drive that is part of the cluster group configured for this sql failover cluster instance. C drive is the local drive of the node.

Notice that the registry key ErrorDumpDir is pointing to the LOG folder in the C drive. If you examine the LOG folder in both the drives, you will notice that the error logs are present in LOG folder of the H drive. The LOG folder in the C drive will be empty.

The Prepare phase of the setup creates this LOG folder in the C drive and configures the CPE registry key to this folder. This LOG folder of C drive is not setup with correct ACL’s so SQL Server service account can create files in this folder. The Complete phase does not remap this registry key value to the LOG folder in the shared cluster drive. That is how we end up with this scenario.

CURRENT SOLUTION