How It Works: Controlling SQL Server memory dumps

I just completed a lengthy bit of research and thought I would share it with everyone. There are several trace flags and registry keys outlined in this post.

!!! As always USE WITH APPROPRIATE CAUTION !!!

From: Robert Dorr
Sent: Tuesday, November 17, 2009 3:23 PM
Subject: RE: SQL Server 2008 Trace Flag

 

It is a dump trigger that is enabled for error 5243. (You usually do this with dbcc dumptrigger(‘set’, 5243) or –y5243 on the command line)

However, 5243 is a special error in that it is automatically setup to capture a dump.

00000000`220ce290 00000000`020817bf sqlservr!stackTraceCallBack+0x49 [e:sql10_katmai_tsqlntdbmsstorengdfsstartupstack.cpp @ 4972]

00000000`220ce2d0 00000000`013a3d41 sqlservr!ex_raise2+0xcdd8bf

00000000`220ce630 00000000`02deb8ce sqlservr!ex_raise+0x51 [e:sql10_katmai_tsqlcommondksossrcsosexcept.cpp @ 1258]

00000000`220ce6c0 00000000`01e860fc sqlservr!RaiseInconsistencyError+0x33e [e:sql10_katmai_tsqlntdbmsstorengdrsrecord.cpp @ 1006]

00000000`220ce9c0 00000000`0158b06d sqlservr!PageRef::Fix+0xe6157c

00000000`220cea30 00000000`02a71cfb sqlservr!IAMPageRef::Fix+0x1d [e:sql10_katmai_tsqlntdbmsstorengincludeallocpageref.h @ 185]

00000000`220cea70 00000000`02cc0623 sqlservr!DeferredAllocUnitDrop::DeAllocateFirstIAMAndItsExtents+0x23b [e:sql10_katmai_tsqlntdbmsstorengdfsallocdealloc.cpp @ 914]

00000000`220cebe0 00000000`03404ede sqlservr!DeferredAllocUnitDrop::ProcessOneIAM+0xa3 [e:sql10_katmai_tsqlntdbmsstorengdfsallocdealloc.cpp @ 1078]

In this case it is a background task that wakes up and is attempting to deallocate and encountering the damaged ‘so called’ IAM page. I suspect it is like Paul mentioned that the pointer in the sysallocunits was some-how damaged or the IAM chain was damaged. The IAM page was re-used or the chain pointer was bad to the back-ground task keeps trying to access the IAM to deallocate and it can’t.

For this specific issue we might have been able to disable deferred deallocation but you would have to know that. (-T670)

· The error 5243 is an automatic dump trigger setup for corruption capture that you can’t control with dbcc dumptrigger(‘clear’, 5243)

· The –T2558 is not to disable all dumps it disables DBCC CHECKDB Watson integration and since you were not hitting CHECKDB errors in this code it won’t change things.

· You can generically use –T2542 (skip mini-dump)

· Trace flag –T3656 disables loading of dbghelp so the stack should not be put in the error log if this is enabled

· The dump logic also has some limitations in that if you had had full or filtered dumps enabled it will only take ## before it downgrades to mini-dump only as well. (See registry section below)

· Worse come to worse you could have renamed SQLDumper.exe for a bit until the problem was corrected. This would prevent .mdmp’s but not the error log growth.

· If error is severity 20 or higher the dump is automatically generated.

· Trace flag –T8026 tells dump trigger to remove the trigger after the first dump has been triggered. (Does not disable the automatic triggers)

· If you enable trace flag –T3628 it can include other errors based on a severity.

· If you enable trace flag –T3629 it will include messages marked to include with this trace flag enabled.

Automatic Dumping Control

A bunch of the automatic, message dumping information is controlled by the message formatting specifications. When combined with some of the trace flags I just outlined you can determine the behaviors. This information is contained in an internal array returning if the message should always log, always dump, or dump only if trace flag 3629 is enabled. The array is generated from a header file so it is not just something I can pull up and show you easily. It is actually part of the build that takes a header, parses comments and generates the array.

Here are some of the errors that are setup to dump.

ErrorGenerateDump: Always

5243

5242

8624

8644

8930

8967

9239

Registry

There are some registry keys to set the dump limits as well. By default it is set to (0) indicating INFINITE.

Under the Setup base key location: (EX: HKLMSoftwareMicrosoftMicrosoft SQL ServerMSSQL10.SQL2008Setup)

· MaxDumps (DWORD) value

· MaxFullDumps (DWORD) value

· SQLExceptionDumpPath (String) value

Display Dump Triggers

You can use dbcc dumptrigger(‘display’) to see the currently installed dump triggers on the system. We sure to enabled dbcc tracon(3604) to retrieve the output of the command.

------------------------------------------------------------------------

From: Paul Randal (Sysolutions Inc.)
Sent: Tuesday, November 17, 2009 11:04 AM
Subject: RE: SQL Server 2008 Trace Flag

 

That’s an allocation unit that doesn’t have a link to it in sysallocunits – repair should have deleted the IAM chain. Is it the same IAM chain in the subsequent 2576?

The 5243s in the dump/log are usually from record corruption (can’t see exactly where it’s throwing the error as I don’t have source code access any more) – Bob?

From the frequency, I’m guessing it’s a record in a system table.

Paul S. Randal

Managing Director, SQLskills.com

Microsoft Regional Director / SQL MVP

https://www.sqlskills.com/blogs/paul | https://twitter.com/PaulRandal

------------------------------------------------------------------------

Sent: Tuesday, November 17, 2009 8:49 AM
To: Paul Randal (Sysolutions Inc.); Robert Dorr
Subject: RE: SQL Server 2008 Trace Flag

 

Hi Paul and Robert,

Glad to have your help. Attachment are error log, one dump file, and one DBCC CHECKDB output. I’m in China so the DBCC output is Chinese. And in English it reads:

-----------------------------------------------------------------------------

Server: Msg 8906, Level 16, State 1, Line 1

Page (1:5882160) in database ID 8 is allocated in the SGAM (1:5623553) and PFS (1:5879976), but was not allocated in any IAM. PFS flags 'MIXED_EXT ALLOCATED 0_PCT_FULL'.

Msg 2576, Level 16, State 1, Line 1

The Index Allocation Map (IAM) page (0:0) is pointed to by the previous pointer of IAM page (1:5882163) in object ID 0, index ID -1, partition ID 0, alloc unit ID 72057599386320896 (type Unknown), but it was not detected in the scan.

-------------------------------------------------------------------------------

Then I run DBCC CHECKDB with ALLOW_DATA_LOSS again, there is no more Msg 8906 and Msg 2576.

Then I run DBCC CHECKDB to double check, no Msg 8906 but one Msg 2576.

Bob Dorr - Principal SQL Server Escalation Engineer