SQLDumper unable to generate mdmp files in SQL Server 2008 Failover clusters


SQLDumper utility is used by various components of the product to generate and save diagnostic information in the form of mini-dump and other log files. You will normally find the output files [with extension .mdmp, .txt and .log] created by sqldumper.exe in the LOG folder of the specific instance for the Database Engine. Recently while working with a customer we noticed a problem that prevents these important files from getting created and could affect some of you who are managing SQL Server 2008 failover cluster installations. There are 2 alternate options you can follow to install SQL Server failover cluster:

                Integrated Installation with Add Node

                Advanced/Enterprise Installation [Prepare and Complete]

 If you perform a “Prepare and Complete” option of install for a failover cluster, then sqldumper.exe may not have the capability to automatically generate dumps for critical server health conditions including exceptions and other errors.

 

This problem is caused by a combination of two factors:

                ErrorDumpDir registry value under the key [HKEY_LOCAL_MACHINESOFTWAREMicrosoftMicrosoft SQL Server<instance>CPE] is pointing to an incorrect location

                SQL Server service account and SQLDumper.exe does not have necessary permissions to write files here due to the ACL settings on this folder location

 

Under this condition, when SQLSERVR.EXE attempts to launch SQLDUMPER.EXE and generate dumps for sever conditions or events; you will notice the following in the SQL Server Error log.

2009-02-25 23:01:03.40 spid52      Using ‘dbghelp.dll’ version ‘4.0.5’

2009-02-25 23:01:03.40 spid52      **Dump thread – spid = 0, EC = 0x0000000080AA40F0

2009-02-25 23:01:03.40 spid52      *

2009-02-25 23:01:03.40 spid52      * User initiated stack dump.  This is not a server exception dump.

2009-02-25 23:01:03.40 spid52      *

2009-02-25 23:01:03.40 spid52      * *******************************************************************************

2009-02-25 23:01:03.40 spid52      *

2009-02-25 23:01:03.40 spid52      * BEGIN STACK DUMP:

2009-02-25 23:01:03.41 spid52      *   02/25/09 23:01:03 spid 52

2009-02-25 23:01:03.41 spid52      *

2009-02-25 23:01:03.41 spid52      * Dump triggered by event ‘sqldumper_check’.

2009-02-25 23:01:03.41 spid52      *

2009-02-25 23:01:03.41 spid52      * Input Buffer 64 bytes –

2009-02-25 23:01:03.41 spid52      *             select * from notable

2009-02-25 23:01:03.41 spid52      * 

2009-02-25 23:01:03.41 spid52      * *******************************************************************************

2009-02-25 23:01:03.41 spid52      * ——————————————————————————-

2009-02-25 23:01:03.41 spid52      * Short Stack Dump

2009-02-25 23:01:03.43 spid52      Stack Signature for the dump is 0x000000000C2C26CB

2009-02-25 23:01:03.49 spid52      External dump process return code 0x20000003.

Error – Failed to create dump file.

 

If you monitor the sequence of operations performed by sqldumper.exe through Process Monitor tool, you can clearly see the problem:

11:01:03.5060825 PM             SQLDUMPER.EXE    1704        CreateFile               C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG             SUCCESS Desired Access: Read Data/List Directory, Synchronize, Disposition: Open, Options: Directory, Synchronous IO Non-Alert, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened

11:01:03.5061132 PM             SQLDUMPER.EXE    1704        QueryDirectory        C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDmpr*.mdmp              NO SUCH FILE         Filter: SQLDmpr*.mdmp

11:01:03.5061420 PM             SQLDUMPER.EXE    1704        CloseFile C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG                SUCCESS

11:01:03.5062758 PM             SQLDUMPER.EXE    1704        CreateFile               C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDMPR0001.MDMP       NAME NOT FOUND                Desired Access: Read Attributes, Disposition: Open, Options: Complete If Oplocked, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a

11:01:03.5063605 PM             SQLDUMPER.EXE    1704        CreateFile               C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDmpr0001.mdmp      ACCESS DENIED      Desired Access: Generic Read/Write, Disposition: Create, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

<< cut >>

11:01:03.5072883 PM             SQLDUMPER.EXE    1704        CreateFile               C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDUMPER_ERRORLOG.log         ACCESS DENIED      Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5073584 PM             SQLDUMPER.EXE    1704        QueryNameInformationFile   C:WindowsSystem32          SUCCESS Name: WindowsSystem32

11:01:03.5074458 PM             SQLDUMPER.EXE    1704        CreateFile               C:WindowsSystem32SQLDUMPER_ERRORLOG.log         ACCESS DENIED    Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5076866 PM             SQLDUMPER.EXE    1704        CreateFile               C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOGSQLDUMPER_ERRORLOG.log            ACCESS DENIED      Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5077526 PM             SQLDUMPER.EXE    1704        QueryNameInformationFile   C:WindowsSystem32          SUCCESS Name: WindowsSystem32

11:01:03.5078319 PM             SQLDUMPER.EXE    1704        CreateFile               C:WindowsSystem32SQLDUMPER_ERRORLOG.log         ACCESS DENIED    Desired Access: Generic Write, Read Attributes, Disposition: OpenIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0

11:01:03.5078995 PM             SQLDUMPER.EXE    1704        Thread Exit                              SUCCESS Thread ID: 3256, User Time: 0.0000000, Kernel Time: 0.0156250

11:01:03.5080350 PM             SQLDUMPER.EXE    1704        Process Exit                             SUCCESS Exit Status: 536870915, User Time: 0.0000000, Kernel Time: 0.0156250, Private Bytes: 2,822,144, Peak Private Bytes: 2,822,144, Working Set: 4,042,752, Peak Working Set: 4,042,752

 

After you perform a “Prepare and Complete” SQL Failover cluster setup, you will observe the following settings:

 

Registry key: HKEY_LOCAL_MACHINESOFTWAREMicrosoftMicrosoft SQL ServerMSSQL10.MYINSTANCENAMECPE

ErrorDumpDir = C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQLLOG

 

Folders:

 Directory of C:Program FilesMicrosoft SQL ServerMSSQL10.MYINSTANCENAMEMSSQL

02/25/2009  08:35 PM    <DIR>          .

02/25/2009  08:35 PM    <DIR>          ..

02/25/2009  08:35 PM    <DIR>          Binn

02/25/2009  08:35 PM    <DIR>          Install

02/25/2009  03:44 PM    <DIR>          LOG

02/25/2009  03:39 PM    <DIR>          Upgrade

01/08/2009  09:55 PM            14,168 sql_engine_core_inst_keyfile.dll

07/10/2008  04:40 AM            17,432 sql_fulltext_keyfile.dll

 

 Directory of H:MSSQL10.MYINSTANCENAMEMSSQL

02/25/2009  05:05 PM    <DIR>          .

02/25/2009  05:05 PM    <DIR>          ..

02/25/2009  05:05 PM    <DIR>          Backup

02/25/2009  05:05 PM    <DIR>          DATA

02/25/2009  08:37 PM    <DIR>          FTData

02/25/2009  05:05 PM    <DIR>          JOBS

02/25/2009  10:52 PM    <DIR>          Log

02/25/2009  05:05 PM    <DIR>          repldata

 

H drive is the shared cluster disk drive that is part of the cluster group configured for this sql failover cluster instance. C drive is the local drive of the node.

Notice that the registry key ErrorDumpDir is pointing to the LOG folder in the C drive. If you examine the LOG folder in both the drives, you will notice that the error logs are present in LOG folder of the H drive. The LOG folder in the C drive will be empty.

The Prepare phase of the setup creates this LOG folder in the C drive and configures the CPE registry key to this folder. This LOG folder of C drive is not setup with correct ACL’s so SQL Server service account can create files in this folder. The Complete phase does not remap this registry key value to the LOG folder in the shared cluster drive. That is how we end up with this scenario.

 

CURRENT SOLUTION