Easily Resolving an Event Viewer Error using a Process Memory Dump

My name is Ryan Mangipano (ryanman) and I am a Sr. Support Escalation Engineer at Microsoft. Today I will be blogging about how I used the SOS .NET Framework debugging extension (and !analyze -v) to troubleshoot a .NET Framework exception that was preventing Event Viewer from displaying properly. The error that Event Viewer returned provided very little information about what was actually causing the issue. I will demonstrate how, in this case, it was very easy to use windbg to find out what went wrong. I did not have to perform a live debug. Instead, I used a process dump, from which the debugger returned very exact information about the root cause. I was then able to use Process Monitor to identify the file that needed to be examined. These actions led me to the source of the problem, which was easily corrected. The issue discussed in this blog is also easy to reproduce on Windows Server 2008, so if you are interested you can practice this debug exercise on a non-production Windows Server 2008 SP1 system for learning purposes.

 

Issue Reported:   The following error was encountered when opening eventvwr.msc (Event Viewer) on a Windows 2008 Server system:

 

                       "MMC could not create the snap-in."

MMC could not create the snap-in.

The snap-in might not have been installed correctly

Name: Event Viewer

               CLSID: FX:{b05566ad-fe9c-4363-be05-7a4cbb7cb510}


First Step- Research & Data Gathering: Once I was sure I understood the problem reported, I searched for known issues and found that we have seen this error before. It may occur when the following registry key is deleted or corrupted:

 

                HKLM\software\Microsoft\MMC\SnapIns\FX:{b05566ad-fe9c-4363-be05-7a4cbb7cb510}

 

I had the customer export this key and found that it was not corrupted in any way. I verified that all of the data was as expected.

 

Next, a memory dump of the mmc.exe process was collected. The mmc.exe process is used to host the eventvwr.msc snap-in. The dump was easily obtained using the "Create Dump File" feature built into Windows Task Manager on Windows Server 2008. If you have several mmc console instances running on your system, you can use the Task Manager context menu shortcuts "Switch To" and "Go To Process" to help you identify the correct instance.


 

Note: We also collected a Process Monitor log file during the startup of eventvwr.msc. This log file later proved very helpful in resolving the issue (as I will show below). Process Monitor can be obtained at the following URL: https://technet.microsoft.com/en-us/sysinternals/bb896645.aspx

 

Now let's have a look at the debug.

 

1. First, I navigated Windows Explorer to the location of the dump file and double-clicked it to open it in windbg.exe.


It opened in windbg because I had previously run the command windbg -IA, which associates .dmp files with windbg. You can read more about the command line options in windbg in the help file that is included with the debugging tools.


2. I noticed the following output from the debugger after it loaded the dump file:

 

This dump file has an exception of interest stored in it.

The stored exception information can be accessed via .ecxr.

(ff8.a2c): CLR exception - code e0434f4d (first/second chance not available)

 

3. Next, I wanted to ensure my symbol path was set correctly. I could have set it using the .sympath command:

 

0:011> .sympath SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols

Symbol search path is: SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols

Expanded Symbol search path is: srv*c:\websymbols*https://msdl.microsoft.com/download/symbols

0:011> .sympath

Symbol search path is: SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols

Expanded Symbol search path is: srv*c:\websymbols*https://msdl.microsoft.com/download/symbols

 

However, when your goal is to simply point to the default symbol server, .symfix is a very nice shortcut. It prevents one from having to try to remember the URL. Here’s the syntax: 

 

0:011> .symfix c:\websymbols

0:011> .sympath

Symbol search path is: SRV*c:\websymbols*https://msdl.microsoft.com/download/symbols

4. To ensure that I didn't waste time reviewing the wrong data, I performed a quick check to ensure that we collected a dump of the requested snap-in.

 

0:005> !peb

PEB at 000007fffffdb000

....

   CommandLine: '"C:\Windows\system32\mmc.exe" "C:\Windows\system32\eventvwr.msc" '

 

Alternatively, you could dump the CommandLine from the nt!_PEB using the dt command:

 

0:005> dt nt!_PEB ProcessParameters->CommandLine 000007fffffdb000

ntdll!_PEB

   +0x020 ProcessParameters :

      +0x070 CommandLine : _UNICODE_STRING ""C:\Windows\system32\mmc.exe" "C:\Windows\system32\eventvwr.msc" "

5. Next, I dumped the stacks of all of the threads in this process and found that the following thread was raising a .NET Framework exception:

 

0:011> ~* kL

... (omitted the non-relevant threads)

# 11 Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen

ChildEBP RetAddr

0691f03c 7343a91c kernel32!RaiseException+0x58

0691f09c 7343d81a mscorwks!RaiseTheExceptionInternalOnly+0x2a8

*** WARNING: Unable to verify checksum for MMCEx.ni.dll

0691f140 6bfe0b5a mscorwks!JIT_Rethrow+0xbf

*** WARNING: Unable to verify checksum for mscorlib.ni.dll

0691f1e8 69926cf6 MMCEx_ni+0xd0b5a

0691f1f4 6993019f mscorlib_ni+0x216cf6

0691f208 69926c74 mscorlib_ni+0x22019f

0691f220 733d1b4c mscorlib_ni+0x216c74

0691f230 733e21b1 mscorwks!CallDescrWorker+0x33

0691f2b0 733f6501 mscorwks!CallDescrWorkerWithHandler+0xa3

0691f3e8 733f6534 mscorwks!MethodDesc::CallDescr+0x19c

0691f404 733f6552 mscorwks!MethodDesc::CallTargetWorker+0x1f

0691f41c 7349d803 mscorwks!MethodDescCallSite::CallWithValueTypes+0x1a

0691f604 733f845f mscorwks!ThreadNative::KickOffThread_Worker+0x192

0691f618 733f83fb mscorwks!Thread::DoADCallBack+0x32a

0691f6ac 733f8321 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3

0691f6e8 733f84ad mscorwks!Thread::ShouldChangeAbortToUnload+0x30a

0691f710 7349d5d4 mscorwks!Thread::ShouldChangeAbortToUnload+0x33e
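When a process has many threads, scanning the `~* kL` output by eye gets tedious. The following is a minimal Python sketch, not a debugger feature, that assumes you have saved the output to text and want the thread numbers whose stacks contain a given frame. The function name and the sample text are illustrative; the frame shown for thread 0 is a made-up placeholder, while the thread 11 lines are copied from the output above.

```python
import re

# Matches thread header lines such as "# 11 Id: ff8.a2c Suspend: 1 ..."
THREAD_HEADER = re.compile(r"^[.#\s]*(\d+)\s+Id:\s+([0-9a-f]+)\.([0-9a-f]+)", re.I)

SAMPLE = """\
   0 Id: ff8.c84 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
00000000 00000000 example!PlaceholderFrame
# 11 Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen
ChildEBP RetAddr
0691f03c 7343a91c kernel32!RaiseException+0x58
0691f09c 7343d81a mscorwks!RaiseTheExceptionInternalOnly+0x2a8
"""


def threads_with_frame(stack_dump: str, needle: str) -> list:
    """Return the debugger thread numbers whose stack text contains `needle`."""
    hits = []
    current = None
    for line in stack_dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            # A new thread section starts here.
            current = int(header.group(1))
        elif current is not None and needle in line and current not in hits:
            hits.append(current)
    return hits


print(threads_with_frame(SAMPLE, "mscorwks!RaiseTheExceptionInternalOnly"))
```

Searching for mscorwks!RaiseTheExceptionInternalOnly is a reasonable starting point for a .NET 2.0 process, since that frame appears on the stack of the thread shown above.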

6. Out of curiosity, I also ran the Get Last Error command:

 

0:011> !gle

LastErrorValue: (Win32) 0 (0) - The operation completed successfully.

LastStatusValue: (NTSTATUS) 0 - STATUS_WAIT_0

7. After this, I ran !analyze -v to see what helpful information the debugger would provide. The debugger did output exception information, but it informed me that I needed to use the x86 debugger instead.

 

0:011> !analyze -v

...

                        EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)

                        ExceptionAddress: 771a42eb (kernel32!RaiseException+0x00000058)

                          ExceptionCode: e0434f4d (CLR exception)

                          ExceptionFlags: 00000001

                        NumberParameters: 1

                        Parameter[0]: 80131604

                        MANAGED_BITNESS_MISMATCH:

                                                Managed code needs matching platform of sos.dll for proper analysis. Use 'x86' debugger.

                        FAULTING_THREAD: 00000a2c

                                                PRIMARY_PROBLEM_CLASS: CLR_EXCEPTION
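The two numbers in this exception record can be unpacked by hand. The sketch below, in Python, assumes the standard HRESULT bit layout (severity in bit 31, facility in bits 16 to 26, code in bits 0 to 15). The helper names are illustrative and are not part of windbg or SOS.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    return {
        "failure": bool((value >> 31) & 1),   # bit 31: severity (1 = failure)
        "facility": (value >> 16) & 0x7FF,    # bits 16-26: facility code
        "code": value & 0xFFFF,               # bits 0-15: error code
    }


def exception_code_tag(exception_code: int) -> str:
    """Return the low three bytes of an exception code as ASCII text."""
    return (exception_code & 0xFFFFFF).to_bytes(3, "big").decode("ascii")


print(exception_code_tag(0xE0434F4D))  # ExceptionCode from the record above
print(decode_hresult(0x80131604))      # Parameter[0] from the record above
```

The low three bytes of e0434f4d spell COM in ASCII, and Parameter[0] decodes to a failure HRESULT with facility 19 (FACILITY_URT, the facility used by .NET Framework HRESULTs) and code 0x1604.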

 

8. I fired up the x86 debugger and loaded the appropriate version of the SOS .NET Framework debugger extension. This extension ships in the operating system along with the .NET Framework. On most occasions, I would have loaded the extension using the following syntax:

 

 0:011> .load C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos.dll

 

OR

 

0:011> .load c:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos.dll

 

However, once you realize that managed debugging will be necessary and that you need the services of the SOS extension, it’s best to use the .loadby command rather than .load. This is due to the fact that the version of SOS must match the version of the CLR loaded into that process. Here’s the recommended syntax: 

 

0:011> .loadby sos mscorwks

 

I always verify that my extensions are loaded properly by using the .chain command.

 

0:011> .chain

... Extension DLL chain:

                        C:\Windows\Microsoft.NET\Framework\v2.0.50727\sos.dll: image 2.0.50727.1434, API 1.0.0, built Wed Dec 05 22:42:38 2007

 

9. Running !help printed out the following helpful information about the SOS extension since sos.dll was at the top of the .chain output:

 

0:011> !help

-------------------------------------------------------------------------------

SOS is a debugger extension DLL designed to aid in the debugging of managed

programs. Functions are listed by category, then roughly in order of

importance. Shortcut names for popular functions are listed in parenthesis.

Type "!help <functionname>" for detailed info on that function.

Object Inspection              Examining code and stacks
-----------------------------  -----------------------------
DumpObj (do)                   Threads
DumpArray (da)                 CLRStack
DumpStackObjects (dso)         IP2MD
DumpHeap                       U
DumpVC                         DumpStack
GCRoot                         EEStack
ObjSize                        GCInfo
FinalizeQueue                  EHInfo
PrintException (pe)            COMState
TraverseHeap                   BPMD

 

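10. I passed the exception address reported by !analyze -v (771a42eb) to the !pe command listed above. That address points into kernel32!RaiseException rather than at a managed exception object, so !pe reported "Invalid object". It also pointed me to the -nested option, which listed the exception objects I needed to obtain more information about the exception: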

               

0:011> !pe 771a42eb

Invalid object

There are nested exceptions on this thread. Run with -nested for details

0:011> !pe -nested 771a42eb

Invalid object

Nested exception -------------------------------------------------------------

Exception object: 040a676c

Exception type: System.Reflection.TargetInvocationException

Message: Exception has been thrown by the target of an invocation.

InnerException: System.Reflection.TargetInvocationException, use !PrintException 040a6a20 to see more

StackTrace (generated):

    SP IP Function

StackTraceString: <none>

HResult: 80131604

0:011> !PrintException 040a6a20

Exception object: 040a6a20

Exception type: System.Reflection.TargetInvocationException

Message: Exception has been thrown by the target of an invocation.

InnerException: System.Configuration.ConfigurationErrorsException, use !PrintException 040a6cf8 to see more

StackTrace (generated):

<none>

StackTraceString: <none>

HResult: 80131604

There are nested exceptions on this thread. Run with -nested for details

0:011> !PrintException 040a6cf8

Exception object: 040a6cf8

Exception type: System.Configuration.ConfigurationErrorsException

Message: Configuration system failed to initialize

InnerException: System.Configuration.ConfigurationErrorsException, use !PrintException 040a7174 to see more

StackTrace (generated):

<none>

StackTraceString: <none>

HResult: 80131902

There are nested exceptions on this thread. Run with -nested for details

0:011> !PrintException 040a7174

Exception object: 040a7174

Exception type: System.Configuration.ConfigurationErrorsException

Message: Unrecognized configuration section system.web/myInvalidData

InnerException: <none>

StackTrace (generated):

<none>

StackTraceString: <none>

HResult: 80131902

There are nested exceptions on this thread. Run with -nested for details
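Following the InnerException links one !PrintException at a time is easy for four levels, but the same walk can be sketched in code. The Python below is only an illustration that works on saved debugger text. It assumes the field labels shown above ("Exception object:", "Exception type:", "Message:", "InnerException:"), the function names are hypothetical, and the sample is the last two exception records from the output above.

```python
import re

SAMPLE = """\
Exception object: 040a6cf8
Exception type: System.Configuration.ConfigurationErrorsException
Message: Configuration system failed to initialize
InnerException: System.Configuration.ConfigurationErrorsException, use !PrintException 040a7174 to see more
Exception object: 040a7174
Exception type: System.Configuration.ConfigurationErrorsException
Message: Unrecognized configuration section system.web/myInvalidData
InnerException: <none>
"""


def parse_exceptions(log: str) -> dict:
    """Map each exception object address to its type, message and inner address."""
    records = {}
    current = None
    for line in log.splitlines():
        line = line.strip()
        if line.startswith("Exception object:"):
            current = {"type": None, "message": None, "inner": None}
            records[line.split(":", 1)[1].strip()] = current
        elif current is None:
            continue  # ignore text before the first exception record
        elif line.startswith("Exception type:"):
            current["type"] = line.split(":", 1)[1].strip()
        elif line.startswith("Message:"):
            current["message"] = line.split(":", 1)[1].strip()
        elif line.startswith("InnerException:"):
            match = re.search(r"!PrintException\s+([0-9a-fA-F]+)", line)
            current["inner"] = match.group(1) if match else None
    return records


def root_cause(records: dict, start: str) -> dict:
    """Follow InnerException links from `start` until no known link remains."""
    seen = set()
    address = start
    while records[address]["inner"] in records and address not in seen:
        seen.add(address)
        address = records[address]["inner"]
    return records[address]


records = parse_exceptions(SAMPLE)
print(root_cause(records, "040a6cf8")["message"])
```

The innermost exception is usually the one worth reading first, because the outer TargetInvocationException wrappers only say that something below them failed.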

11. Based on the exception information listed above, it appeared that a .NET Framework configuration section, system.web, contained an invalid configuration section named myInvalidData. I then ran !analyze -v against the dump again (now that I was using the x86 debugger) and found that !analyze -v loads the sos.dll extension and even runs the !pe extension automatically. It also displayed the exception record information for me. Notice that the thread listed by !analyze -v matches the thread I examined earlier.

 

            0:011> !analyze -v

                ...

                        EXCEPTION_MESSAGE: Unrecognized configuration section system.web/myInvalidData.

                                                MANAGED_OBJECT_NAME: System.Configuration.ConfigurationErrorsException

                                                FAULTING_THREAD: 00000a2c

0:011> ~

   0 Id: ff8.c84 Suspend: 1 Teb: 7ffdf000 Unfrozen

   1 Id: ff8.96c Suspend: 1 Teb: 7ffde000 Unfrozen

   2 Id: ff8.d10 Suspend: 1 Teb: 7ffdd000 Unfrozen

   3 Id: ff8.d94 Suspend: 1 Teb: 7ffdc000 Unfrozen

   4 Id: ff8.a14 Suspend: 1 Teb: 7ffda000 Unfrozen

   5 Id: ff8.fbc Suspend: 1 Teb: 7ffd9000 Unfrozen

   6 Id: ff8.f88 Suspend: 1 Teb: 7ffd8000 Unfrozen

   7 Id: ff8.a64 Suspend: 1 Teb: 7ffd6000 Unfrozen

   8 Id: ff8.bf8 Suspend: 1 Teb: 7ffd5000 Unfrozen

   9 Id: ff8.d24 Suspend: 1 Teb: 7ffd4000 Unfrozen

  10 Id: ff8.ff0 Suspend: 1 Teb: 7ffd7000 Unfrozen

. 11 Id: ff8.a2c Suspend: 1 Teb: 7ffd3000 Unfrozen

12. At this point I was interested in identifying the source of this unrecognized configuration. Instead of engaging our .Net support team, I started with a quick search using www.live.com for

"unrecognized configuration section" system.web site:microsoft.com

This returned the following results https://search.live.com/results.aspx?q=%22unrecognized+configuration+section%22+system.web+site%3Amicrosoft.com&form=QBRE

By quickly reviewing some of the hits returned, I found that others had encountered this exception in their own applications. This is due to invalid entries in the various .config files used in .Net. Looking through the posts, different configuration file names and paths were observed.

So, I opened the Process Monitor log file to see which configuration files we were reading data from. I added filter criteria to match entries from the mmc.exe process, the TID from the FAULTING_THREAD listed in the exception data, path data containing .config, and a successful status result. It's best to be as specific as possible.
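The same filter can be expressed against a Process Monitor trace saved as CSV. The Python sketch below makes several assumptions: the column names "Process Name", "TID", "Path" and "Result" match the export (the TID column is optional in Process Monitor and has to be enabled), and the TID is written in decimal, so the hexadecimal FAULTING_THREAD value a2c becomes 2604. The function name and the sample rows are made up for illustration and are not taken from the customer's log.

```python
import csv
import io

# Hypothetical sample rows in the assumed column layout.
SAMPLE = r'''"Process Name","PID","TID","Operation","Path","Result"
"mmc.exe","4088","2604","ReadFile","C:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config","SUCCESS"
"mmc.exe","4088","2604","CreateFile","C:\Windows\system32\mmc.exe.config","NAME NOT FOUND"
"mmc.exe","4088","3204","ReadFile","C:\Windows\system32\eventvwr.msc","SUCCESS"
"explorer.exe","1200","2604","ReadFile","C:\Temp\app.config","SUCCESS"
'''


def config_reads(csv_text: str, process: str = "mmc.exe", tid_hex: str = "a2c") -> list:
    """Return the distinct .config paths read successfully by one thread."""
    tid = str(int(tid_hex, 16))  # debugger shows hex; the CSV is assumed decimal
    rows = csv.DictReader(io.StringIO(csv_text))
    return sorted({
        row["Path"]
        for row in rows
        if row["Process Name"].lower() == process
        and row.get("TID") == tid
        and ".config" in row["Path"].lower()
        and row["Result"] == "SUCCESS"
    })


for path in config_reads(SAMPLE):
    print(path)
```

Collapsing the matches into a set of distinct paths is a deliberate choice here: the same file tends to be read many times, and the question at this stage is only which files were involved.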

 


I found that we were reading in a large number of settings over and over again from the .NET Framework global configuration file:

                                c:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config

(on x64 this would be C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG\machine.config)


Final Step- Putting it all together, reproducing the issue, & confirming resolution: Using Notepad, a quick search of the suspect XML file (C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG\machine.config) on my system revealed a <system.web> section. At this point, I suspected that on the customer's system this section contained an invalid entry that was related to the problem. To verify this, and since I like to break things, I added an invalid configuration setting, <myInvalidData/>, to my global configuration file. Doing so, I successfully reproduced the issue on my system. I then contacted the customer and asked if they had by any chance added any settings under the <system.web> section in the configuration file: c:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config.

 

The customer informed me that, per the request of their ASP.NET developer, they had in fact added settings to that section of the file. By researching https://msdn.microsoft.com/en-us/library/system.web.aspx and the schema documentation at https://msdn.microsoft.com/en-us/library/dayb112d.aspx, we were able to determine that the settings that were present in this file should not have been placed inside of <system.web>. The settings were moved to the proper location per the developer and the issue was resolved.
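A quick sanity check of this kind can also be scripted. The Python sketch below is a simplified model, not how the .NET configuration system is actually implemented: it assumes that every child element of <system.web> must be declared as a section (or nested sectionGroup) under the matching sectionGroup in <configSections> of the same file, and it reports any child that is not. The function name and the cut-down sample XML are illustrative only.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical stand-in for machine.config.
SAMPLE = """\
<configuration>
  <configSections>
    <sectionGroup name="system.web">
      <section name="membership" />
      <section name="processModel" />
    </sectionGroup>
  </configSections>
  <system.web>
    <myInvalidData/>
    <processModel autoConfig="true" />
  </system.web>
</configuration>
"""


def undeclared_sections(config_xml: str, group: str = "system.web") -> list:
    """List child elements of `group` that <configSections> does not declare."""
    root = ET.fromstring(config_xml)

    # Collect the names declared for this section group.
    declared = set()
    config_sections = root.find("configSections")
    if config_sections is not None:
        for section_group in config_sections:
            if section_group.tag == "sectionGroup" and section_group.get("name") == group:
                for child in section_group:
                    if child.tag in ("section", "sectionGroup"):
                        declared.add(child.get("name"))

    # Compare against what is actually present under the group element.
    unknown = []
    for element in root:
        if element.tag == group:
            unknown.extend(child.tag for child in element if child.tag not in declared)
    return unknown


print(undeclared_sections(SAMPLE))
```

Run against a copy of the real file (never the live one), a non-empty result would point at the same kind of entry that produced the "Unrecognized configuration section" message in the dump.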

 

Here are the steps I used to reproduce the issue, in case you want to replicate this at home:

 

A. Using notepad, open the following configuration file on a non-production Windows Server 2008 SP1 system:

    (please make a backup copy first in case you make a mistake)

   c:\Windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config

     OR (open the version that matches the architecture of your platform)

   C:\Windows\Microsoft.NET\Framework64\v2.0.50727\CONFIG\machine.config

 


B. Find the <system.web> section in this file (you can use the Find function in Notepad).

C. Add the following line directly after <system.web> as shown in the example below:

                   <myInvalidData/>    


D. Save the file, then open eventvwr.msc and verify that the "MMC could not create the snap-in." error shown at the beginning of this post is displayed.

Conclusion

Hopefully this blog has demonstrated how you can use the "Create Dump File" feature of Windows Server 2008, windbg, and other related tools to gain more specific data when an error message does not reveal the source of the problem. Feel free to post any questions, comments, or concerns.