Using WinDbg - Hunting Exceptions

Prerequisites

This post will require some basic knowledge of windbg and the sos extension. For this I recommend looking at the following posts:

For more information on Exceptions in general and why they should be avoided I'd like to recommend this post:

Introduction

I thought it was time to write another post on how to use windbg for troubleshooting. A lot of my time is spent locating exceptions in various web applications, so I thought this might be a good topic to cover. I've previously written a post specifically targeting OutOfMemoryExceptions, but I thought I should broaden the terms and make it a bit more general. There are two scenarios that are exceptionally common in my line of work:

  1. Clients are reporting second-chance exceptions displayed on screen with the classic "Server Error" page.
  2. Performance is generally bad, and when we investigate it turns out that there are tons of exceptions being thrown every second.

In this post I'll cover how to investigate what exceptions have been thrown by an application, as well as how to use windbg and adplus to automatically gather specific information for us.

Where to start

Okay, so you have a web application that you've been monitoring and you believe it is throwing a lot of exceptions. You've taken a dump of the process and you're ready to begin the investigation. Where do you start?

!dumpallexceptions (!dae)

If your application is running under the .NET Framework 1.1 you can use the !dumpallexceptions command (!dae) to get a list of all the exceptions still on the heap. Now, remember that an exception is a managed object, so it will eventually be garbage collected just like everything else. This means that when looking at the heap for exceptions you will only get the exceptions still in memory, not every exception thrown by the application since startup.

Anyway, if you run !dae you'll get a list of exceptions that looks like this:

0:000> !dae
Going to dump the .NET Exceptions found in the heap.
Loading the heap objects into our cache.
Number of exceptions of this type:        1
Exception object: 026200ec
Exception type: System.ExecutionEngineException
Message:
InnerException:
StackTrace (generated):

StackTraceString:
HResult: 80131506
The current thread is unmanaged
-----------------

Number of exceptions of this type:        1
Exception object: 026200a4
Exception type: System.StackOverflowException
Message:
InnerException:
StackTrace (generated):

StackTraceString:
HResult: 800703e9
The current thread is unmanaged
-----------------

Number of exceptions of this type:        1
Exception object: 0262005c
Exception type: System.OutOfMemoryException
Message:
InnerException:
StackTrace (generated):

StackTraceString:
HResult: 8007000e
The current thread is unmanaged
-----------------

Number of exceptions of this type:        1
Exception object: 0b62cf38
Exception type: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 96) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1DACECA8 1D0A4D4C Company.Database.AuditTrail.Write(System.String, System.Guid, System.String)
    1DACED6C 1D0A4B98 Company.Database.AuditTrail.AddEntry(EntryType, System.Guid, System.String)
    1DACED90 1D0A4B24 Company.Database.DataFunctions.SaveRow(System.String, System.Guid, Boolean)
    1DACEE20 1DDD7A06 FooBase.PersonApplicationOtherQualificationBase.PersonApplicationOtherQualificationBaseManager.BaseSave(System.Guid, Boolean)
    1DACEE34 1DDD79BF Foo.PersonApplicationOtherQualification.PersonApplicationOtherQualificationManager.Save(System.Guid, Boolean)
    1DACEE48 1DDD77F5 UserControls.PersonApplicationOtherQualificationDetail.OnSave()
    1DACEE70 1DDD2701 Company.Web.UI.Page.InvokeUsercontrolTransaction(System.Web.UI.Control, Company.Web.TransactionType)

StackTraceString:
HResult: 80131904
The current thread is unmanaged
-----------------

Number of exceptions of this type:        3
Exception object: 0b62d23c
Exception type: System.Exception
Message: Thrown while invoking Save on object PersonApplicationOtherQualificationDetail1(ASP.sys_modules_person_personapplicationotherqualificationdetail_ascx)
InnerException: System.Data.SqlClient.SqlException, use !PrintException 0b62cf38 to see more
StackTrace (generated):
    SP       IP       Function
    1DACEDF8 1DDD28C8 Company.Web.UI.Page.InvokeUsercontrolTransaction(System.Web.UI.Control, Company.Web.TransactionType)
    1DACEEC4 1DDD262D Company.Web.UI.Page.InvokeUsercontrolTransaction(System.Web.UI.Control, Company.Web.TransactionType)

StackTraceString:
HResult: 80131500
The current thread is unmanaged
-----------------

Number of exceptions of this type:        8
Exception object: 03c6fbb8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1F15E3E0 795FC73C System.Guid..ctor(System.String)
    1F15E43C 1D0AB97F Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)

StackTraceString:
HResult: 80004003
The current thread is unmanaged
-----------------

Number of exceptions of this type:       36
Exception object: 02620134
Exception type: System.Threading.ThreadAbortException
Message:
InnerException:
StackTrace (generated):

StackTraceString:
HResult: 80131530
The current thread is unmanaged
-----------------

Number of exceptions of this type:      116
Exception object: 0396692c
Exception type: System.Web.HttpUnhandledException
Message:
InnerException: System.NullReferenceException, use !PrintException 03966878 to see more
StackTrace (generated):
    SP       IP       Function
    1CE6DF58 6614FDB2 System.Web.UI.Page.HandleError(System.Exception)
    1CE6DFA0 6615681A System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)
    1CE6EF10 66154A8A System.Web.UI.Page.ProcessRequest(Boolean, Boolean)
    1CE6EF48 66154967 System.Web.UI.Page.ProcessRequest()
    1CE6EF80 66154887 System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)
    1CE6EF88 6615481A System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)
    1CE6EF9C 1C741EAE ASP.sys_pages_application_application_aspx.ProcessRequest(System.Web.HttpContext)
    1CE6EFA8 65FF27D4 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
    1CE6EFDC 65FC15B5 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)

StackTraceString:
HResult: 80004005
The current thread is unmanaged
-----------------

Number of exceptions of this type:      136
Exception object: 03966878
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1CE6EC2C 1C745B38 Pages.Application.Page_Load(System.Object, System.EventArgs)
    00000000 00000001 System.EventHandler.Invoke(System.Object, System.EventArgs)
    1CE6ED34 66143A84 System.Web.UI.Control.OnLoad(System.EventArgs)
    1CE6ED44 66143AD0 System.Web.UI.Control.LoadRecursive()
    1CE6ED58 66155106 System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)

StackTraceString:
HResult: 80004003
The current thread is unmanaged
-----------------

Total 303 exceptions

As you can see, the !dae command lists all exception types found on the heap, together with the number of instances of each type. If possible it also gives us a callstack for one exception of each type. Please note, however, that this doesn't mean that the callstack is the same for all exceptions of that type. In the sample above you might have ~20 different callstacks leading to the 136 NullReferenceExceptions you see.
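If the listing gets long, a small script can boil it down to one line per type. Here is a minimal Python sketch, assuming you have saved the debugger output to a text file (for example with .logopen before running !dae). The function name summarize_dae and the trimmed sample text are my own illustration, not part of sos.

```python
import re

COUNT_RE = re.compile(r"^Number of exceptions of this type:\s+(\d+)")
TYPE_RE = re.compile(r"^Exception type:\s+(\S+)")


def summarize_dae(log_text):
    """Return {exception type: instance count} from saved !dae output."""
    counts = {}
    pending = None  # a count we have read but not yet paired with a type
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        count_match = COUNT_RE.match(line)
        if count_match:
            pending = int(count_match.group(1))
            continue
        type_match = TYPE_RE.match(line)
        if type_match and pending is not None:
            name = type_match.group(1)
            counts[name] = counts.get(name, 0) + pending
            pending = None
    return counts


# Trimmed copy of two entries from the listing above.
sample = """\
Number of exceptions of this type:        8
Exception object: 03c6fbb8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
-----------------

Number of exceptions of this type:      136
Exception object: 03966878
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
-----------------
"""

# Print the most frequent exception types first.
for name, count in sorted(summarize_dae(sample).items(), key=lambda kv: -kv[1]):
    print(count, name)
```

Sorting by count puts the noisiest exception type at the top, which is usually where I would start digging.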

Unfortunately the !dae command is not available in version 2.0 of sos.dll. Still, it's quite easy to get (more or less) the same result by using the !dumpheap command. Typing "!dumpheap -type Exception" lists every object with the string "Exception" in its class name, and adding -stat reduces the output to a per-class summary. This is almost as good.

0:000> !dumpheap -type Exception -stat
------------------------------
Heap 0
total 79 objects
------------------------------
Heap 1
total 76 objects
------------------------------
Heap 2
total 91 objects
------------------------------
Heap 3
total 92 objects
------------------------------
total 338 objects
Statistics:
      MT    Count    TotalSize Class Name
790ff624        3           36 System.Text.DecoderExceptionFallback
790ff5d8        3           36 System.Text.EncoderExceptionFallback
790f9ad4        1           72 System.ExecutionEngineException
790f9a30        1           72 System.StackOverflowException
790f998c        1           72 System.OutOfMemoryException
653c8d04        1           76 System.Data.SqlClient.SqlException
790f984c        3          216 System.Exception
66414de0       18          216 System.Web.HttpApplication+CancelModuleException
7911bc7c       11          352 System.UnhandledExceptionEventHandler
7911a3b0        8          608 System.ArgumentNullException
790f9b78       36         2592 System.Threading.ThreadAbortException
663d9268      116         9744 System.Web.HttpUnhandledException
7915cf40      136         9792 System.NullReferenceException
Total 338 objects

As you can see, this gives us almost the same information except for the callstacks.
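Since -type matches on a substring, the summary also picks up classes that are not exceptions at all, such as System.UnhandledExceptionEventHandler. The Python sketch below parses the statistics table and keeps only class names that end in "Exception". That suffix test is a naming-convention heuristic of mine, not something sos guarantees, and the function names are hypothetical.

```python
def parse_dumpheap_stat(text):
    """Return (class name, count, total size) rows from !dumpheap -stat output."""
    rows = []
    in_statistics = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("MT") and stripped.endswith("Class Name"):
            in_statistics = True  # rows follow the column header
            continue
        parts = stripped.split(None, 3)
        if not in_statistics or len(parts) != 4:
            continue  # per-heap totals and the final "Total" line end up here
        _method_table, count, total_size, class_name = parts
        if count.isdigit() and total_size.isdigit():
            rows.append((class_name, int(count), int(total_size)))
    return rows


def only_exceptions(rows):
    """Heuristic: keep classes whose name ends in 'Exception'."""
    return [row for row in rows if row[0].endswith("Exception")]


# The statistics table from the listing above.
sample = """\
Statistics:
      MT    Count    TotalSize Class Name
790ff624        3           36 System.Text.DecoderExceptionFallback
790ff5d8        3           36 System.Text.EncoderExceptionFallback
790f9ad4        1           72 System.ExecutionEngineException
790f9a30        1           72 System.StackOverflowException
790f998c        1           72 System.OutOfMemoryException
653c8d04        1           76 System.Data.SqlClient.SqlException
790f984c        3          216 System.Exception
66414de0       18          216 System.Web.HttpApplication+CancelModuleException
7911bc7c       11          352 System.UnhandledExceptionEventHandler
7911a3b0        8          608 System.ArgumentNullException
790f9b78       36         2592 System.Threading.ThreadAbortException
663d9268      116         9744 System.Web.HttpUnhandledException
7915cf40      136         9792 System.NullReferenceException
Total 338 objects
"""

for class_name, count, _size in only_exceptions(parse_dumpheap_stat(sample)):
    print(count, class_name)
```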

Knowing what to ignore

When analyzing data it is always good to know how to filter the information.

The ever-present exceptions

There are three exceptions that are created as soon as the worker process starts. This means that you will always see them on the heap even if they haven't been thrown at all:

  • System.ExecutionEngineException
  • System.StackOverflowException
  • System.OutOfMemoryException

So why are they created if we haven't thrown them? - Any guesses?

The answer is quite simple: If you run into a situation where you need to throw any of these exceptions you will probably be in a state where you can't create them. For example, you've run out of memory and are no longer able to allocate even the tiniest string. How would you then be able to allocate enough memory to create a new exception?

So, provided that there's still only one of each on the heap, you can most likely ignore these three exceptions. When it comes to ExecutionEngineExceptions and OutOfMemoryExceptions you will probably have a pretty good idea that this is what you're looking for, and finding a StackOverflowException isn't that hard. If you run !clrstack and find a callstack of 200+ lines you can be more or less certain that this is your problem.
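The "only one of each" rule is easy to express in code. This is a minimal Python sketch that drops the three ever-present types when exactly one instance is on the heap. The dictionary of counts is typed in by hand from the listing earlier in this post, and the function name is my own.

```python
PREALLOCATED = {
    "System.ExecutionEngineException",
    "System.StackOverflowException",
    "System.OutOfMemoryException",
}


def worth_investigating(counts):
    """Drop the preallocated exception types that have a single instance."""
    return {
        name: count
        for name, count in counts.items()
        if not (name in PREALLOCATED and count == 1)
    }


# Counts copied from the heap listing earlier in this post.
counts = {
    "System.ExecutionEngineException": 1,
    "System.StackOverflowException": 1,
    "System.OutOfMemoryException": 1,
    "System.Data.SqlClient.SqlException": 1,
    "System.ArgumentNullException": 8,
    "System.NullReferenceException": 136,
}

for name, count in worth_investigating(counts).items():
    print(count, name)
```

Note that a type is only dropped when its count is exactly 1. If a second OutOfMemoryException shows up, it stays in the list and deserves a look.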

System.Threading.ThreadAbortException

Usually when you see a ThreadAbortException it is because you've called Response.Redirect.

Calling Response.Redirect with a single argument also results in a call to Response.End. Response.End terminates the request by aborting the current thread, which results in a System.Threading.ThreadAbortException. See the callstack below for an example.

    SP       IP       Function
    1ED6F37C 793D74D0 mscorlib_ni!System.Threading.Thread.Abort(System.Object)+0x2c
    1ED6F390 6600CA8C System_Web_ni!System.Web.HttpResponse.End()+0x5c
    1ED6F3A4 6600B8C3 System_Web_ni!System.Web.HttpResponse.Redirect(System.String, Boolean)+0x1f3
    1ED6F3B8 6600B6B7 System_Web_ni!System.Web.HttpResponse.Redirect(System.String)+0x7
    1ED6F3BC 1DDD2E1D Company_Web!Company.Web.UI.Page.RedirectToPreviousPage()+0x125

Obviously I'm not saying you should discard all System.Threading.ThreadAbortExceptions as irrelevant. Even if you have no reason to believe that ThreadAbortExceptions are a major concern it's always a good idea to investigate a few of them. Take a minute or two to confirm that there is an underlying call to Response.End caused by a Response.Redirect. Once you think that you have enough statistical data to imply that the ThreadAbortExceptions are caused by Redirects you can move on.
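If you have collected a handful of these callstacks as text, the check can be automated. The Python sketch below looks for the three frames in the order they appear in the callstack above (Thread.Abort, then HttpResponse.End, then HttpResponse.Redirect). The function name is hypothetical and the matching is plain substring search, so treat it as a rough filter rather than proof.

```python
# Frames we expect, top of the stack first, as in the callstack above.
MARKERS = (
    "System.Threading.Thread.Abort(",
    "System.Web.HttpResponse.End(",
    "System.Web.HttpResponse.Redirect(",
)


def is_redirect_abort(stack_text):
    """True if the stack shows Abort called from End called from Redirect."""
    position = 0
    for marker in MARKERS:
        position = stack_text.find(marker, position)
        if position == -1:
            return False  # marker missing, or present in the wrong order
        position += len(marker)
    return True


sample = """\
1ED6F37C 793D74D0 mscorlib_ni!System.Threading.Thread.Abort(System.Object)+0x2c
1ED6F390 6600CA8C System_Web_ni!System.Web.HttpResponse.End()+0x5c
1ED6F3A4 6600B8C3 System_Web_ni!System.Web.HttpResponse.Redirect(System.String, Boolean)+0x1f3
1ED6F3B8 6600B6B7 System_Web_ni!System.Web.HttpResponse.Redirect(System.String)+0x7
1ED6F3BC 1DDD2E1D Company_Web!Company.Web.UI.Page.RedirectToPreviousPage()+0x125
"""

print(is_redirect_abort(sample))
```

Any stack where this returns False is one I would open and read by hand.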

Examining the Exceptions

Okay, so say we want to look at the callstack for the System.Data.SqlClient.SqlException. First of all we need its address. As you might remember, this is easily obtained by using !dumpheap without the -stat option.

0:000> !dumpheap -type System.Data.SqlClient.SqlException
------------------------------
Heap 0
Address       MT     Size
total 0 objects
------------------------------
Heap 1
Address       MT     Size
total 0 objects
------------------------------
Heap 2
Address       MT     Size
0b62cf38 653c8d04       76    
total 1 objects
------------------------------
Heap 3
Address       MT     Size
total 0 objects
------------------------------
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
653c8d04        1           76 System.Data.SqlClient.SqlException
Total 1 objects

Now we have the address for the exception. In order to investigate the exception we could use !dumpobject, but there is another command I want to use first.
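With one object the address is easy to spot by eye. With a few hundred it helps to pull the addresses out of a saved log. A minimal Python sketch, assuming the three-column "Address MT Size" layout shown above (the function name is my own):

```python
import re

# An object row is: address, method table, size, and nothing else.
# Rows in the Statistics table have a class name as a fourth column,
# so they do not match.
OBJECT_ROW = re.compile(
    r"^\s*([0-9a-fA-F]{8,16})\s+([0-9a-fA-F]{8,16})\s+(\d+)\s*$"
)


def object_addresses(dumpheap_text):
    """Return the object addresses listed in !dumpheap output."""
    addresses = []
    for line in dumpheap_text.splitlines():
        match = OBJECT_ROW.match(line)
        if match:
            addresses.append(match.group(1))
    return addresses


# Trimmed copy of the output above.
sample = """\
Heap 2
Address       MT     Size
0b62cf38 653c8d04       76    
total 1 objects
------------------------------
total 1 objects
Statistics:
      MT    Count    TotalSize Class Name
653c8d04        1           76 System.Data.SqlClient.SqlException
Total 1 objects
"""

# Print ready-to-paste debugger commands, one per object.
for address in object_addresses(sample):
    print("!pe " + address)
```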

!printexception (!pe)

Running the !printexception command on the address of an exception will give us some neat information on the exception in question. Here's the result of running !printexception on the SqlException:

0:000> !pe 0b62cf38
Exception object: 0b62cf38
Exception type: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 96) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1DACECA8 1D0A4D4C Company_Database_1d3c0000!Company.Database.AuditTrail.Write(System.String, System.Guid, System.String)+0x194
    1DACED6C 1D0A4B98 Company_Database_1d3c0000!Company.Database.AuditTrail.AddEntry(EntryType, System.Guid, System.String)+0x48
    1DACED90 1D0A4B24 Company_Database_1d3c0000!Company.Database.DataFunctions.SaveRow(System.String, System.Guid, Boolean)+0x5e4
    1DACEE20 1DDD7A06 FooBase_1d560000!FooBase.PersonApplicationOtherQualificationBase.PersonApplicationOtherQualificationBaseManager.BaseSave(System.Guid, Boolean)+0x2e
    1DACEE34 1DDD79BF Foo_1cfd0000!Foo.PersonApplicationOtherQualification.PersonApplicationOtherQualificationManager.Save(System.Guid, Boolean)+0x27
    1DACEE48 1DDD77F5 Foo_1cfd0000!UserControls.PersonApplicationOtherQualificationDetail.OnSave()+0x65
    1DACEE70 1DDD2701 Company_Web_1d090000!Company.Web.UI.Page.InvokeUsercontrolTransaction(System.Web.UI.Control, Company.Web.TransactionType)+0x1d1

StackTraceString:
HResult: 80131904
The current thread is unmanaged

This is good stuff. The command was even able to generate a callstack for us. (This may not always be the case, since the callstack may very well have gone out of scope.)
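The output is regular enough to turn into a structured record when you want to compare many exceptions. Below is a minimal Python sketch that assumes the field labels shown above. The function name parse_pe and the dictionary keys are my own choices.

```python
import re

# A frame row is: stack pointer, instruction pointer, then the function.
FRAME_RE = re.compile(r"^\s*([0-9A-Fa-f]{8,16})\s+([0-9A-Fa-f]{8,16})\s+(\S.*)$")


def parse_pe(text):
    """Parse saved !pe output into a dict with type, message, hresult, frames."""
    record = {"type": None, "message": None, "hresult": None, "frames": []}
    in_stack = False
    for line in text.splitlines():
        if line.startswith("Exception type:"):
            record["type"] = line.split(":", 1)[1].strip()
        elif line.startswith("Message:"):
            record["message"] = line.split(":", 1)[1].strip() or None
        elif line.startswith("HResult:"):
            record["hresult"] = line.split(":", 1)[1].strip()
        elif line.startswith("StackTrace (generated):"):
            in_stack = True
        elif line.startswith("StackTraceString:"):
            in_stack = False
        elif in_stack:
            frame = FRAME_RE.match(line)
            if frame:  # the "SP IP Function" header does not match
                record["frames"].append(frame.group(3).strip())
    return record


# Trimmed copy of the output above (two of the seven frames).
sample = """\
Exception object: 0b62cf38
Exception type: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 96) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1DACECA8 1D0A4D4C Company_Database_1d3c0000!Company.Database.AuditTrail.Write(System.String, System.Guid, System.String)+0x194
    1DACED6C 1D0A4B98 Company_Database_1d3c0000!Company.Database.AuditTrail.AddEntry(EntryType, System.Guid, System.String)+0x48

StackTraceString:
HResult: 80131904
The current thread is unmanaged
"""

record = parse_pe(sample)
print(record["type"], record["hresult"], len(record["frames"]))
```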

!dumpobject (!do) still has its uses

I wouldn't say that !printexception is a complete replacement for !dumpobject when it comes to examining exceptions. !printexception fits the exception into a standard template, and since some exceptions contain more data than others we sometimes want to use !dumpobject as well. The SqlException has a field called _errors that contains a System.Data.SqlClient.SqlErrorCollection that we might want to look at. This is not in the listing above, so we need to use !dumpobject to look at it.

0:000> !do 0b62cf38
Name: System.Data.SqlClient.SqlException
MethodTable: 653c8d04
EEClass: 6540a0d0
Size: 76(0x4c) bytes
(C:\WINDOWS\assembly\GAC_32\System.Data\2.0.0.0__b77a5c561934e089\System.Data.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790f9244  40000b5        4        System.String  0 instance 00000000 _className
79107d4c  40000b6        8 ...ection.MethodBase  0 instance 00000000 _exceptionMethod
790f9244  40000b7        c        System.String  0 instance 00000000 _exceptionMethodString
790f9244  40000b8       10        System.String  0 instance 0b62cdfc _message
79112734  40000b9       14 ...tions.IDictionary  0 instance 0b62cf84 _data
790f984c  40000ba       18     System.Exception  0 instance 00000000 _innerException
790f9244  40000bb       1c        System.String  0 instance 00000000 _helpURL
790f8a7c  40000bc       20        System.Object  0 instance 0b62d030 _stackTrace
790f9244  40000bd       24        System.String  0 instance 00000000 _stackTraceString
790f9244  40000be       28        System.String  0 instance 00000000 _remoteStackTraceString
790fdb60  40000bf       34         System.Int32  1 instance        0 _remoteStackIndex
790f8a7c  40000c0       2c        System.Object  0 instance 00000000 _dynamicMethods
790fdb60  40000c1       38         System.Int32  1 instance -2146232060 _HResult
790f9244  40000c2       30        System.String  0 instance 00000000 _source
790fcfa4  40000c3       3c        System.IntPtr  1 instance        0 _xptrs
790fdb60  40000c4       40         System.Int32  1 instance -532459699 _xcode
653c8b28  40017e0       44 ...qlErrorCollection  0 instance 0b62cd90 _errors

There we have it. Now we can continue using !dumpobject to investigate it even further if we wish.
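One thing that can trip you up in the !dumpobject listing: _HResult and _xcode are printed as signed 32-bit integers, while !printexception shows HResult in hex. Converting between the two is a one-liner. A minimal Python sketch (the helper name to_hex32 is mine):

```python
def to_hex32(value):
    """Format a signed 32-bit integer as 8 hex digits."""
    # Masking with 0xFFFFFFFF reinterprets a negative value as unsigned.
    return format(value & 0xFFFFFFFF, "08x")


# Values copied from the !do listing above.
print(to_hex32(-2146232060))  # _HResult -> 80131904, as shown by !pe
print(to_hex32(-532459699))   # _xcode   -> e0434f4d
```

The first value matches the HResult line from !printexception, which is a handy sanity check that you are looking at the same object.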

Inner exceptions

If we take a look at one of the HttpUnhandledExceptions we find that it has an inner exception. It is even nice enough to let us know how to find out more about it.

0:000> !pe 10544f64
Exception object: 10544f64
Exception type: System.Web.HttpUnhandledException
Message:
InnerException: System.NullReferenceException, use !PrintException 10544df8 to see more
StackTrace (generated):
    SP       IP       Function
    1E3BE1D8 6614FDB2 System_Web_ni!System.Web.UI.Page.HandleError(System.Exception)+0x3e6
    1E3BE220 6615681A System_Web_ni!System.Web.UI.Page.ProcessRequestMain(Boolean, Boolean)+0x1b3a
    1E3BF190 66154A8A System_Web_ni!System.Web.UI.Page.ProcessRequest(Boolean, Boolean)+0xd6
    1E3BF1C8 66154967 System_Web_ni!System.Web.UI.Page.ProcessRequest()+0x57
    1E3BF200 66154887 System_Web_ni!System.Web.UI.Page.ProcessRequestWithNoAssert(System.Web.HttpContext)+0x13
    1E3BF208 6615481A System_Web_ni!System.Web.UI.Page.ProcessRequest(System.Web.HttpContext)+0x32
    1E3BF21C 1C741EAE App_Web__ekpvebx!ASP.sys_pages_application_application_aspx.ProcessRequest(System.Web.HttpContext)+0x1e
    1E3BF228 65FF27D4 System_Web_ni!System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()+0x130
    1E3BF25C 65FC15B5 System_Web_ni!System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)+0x41

This means that the System.NullReferenceException mentioned led to the HttpUnhandledException we're currently investigating. So if we want to find the root cause we'll need to investigate the inner exception as well.
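Following the chain by hand is fine for one or two levels. For deeper nesting, the "use !PrintException ..." hint can be picked out of saved output and followed automatically. This Python sketch assumes you have stored the !pe output for each address in a dictionary. That dictionary, and both function names, are hypothetical.

```python
import re

INNER_RE = re.compile(
    r"^InnerException:[ \t]*(\S+), use !PrintException ([0-9A-Fa-f]+) to see more",
    re.MULTILINE,
)


def inner_exception(pe_text):
    """Return (type, address) of the inner exception, or None if there is none."""
    match = INNER_RE.search(pe_text)
    return (match.group(1), match.group(2)) if match else None


def exception_chain(address, pe_outputs):
    """Follow inner exceptions from address; the last entry is the root cause."""
    chain = []
    seen = set()  # guards against looping forever on odd input
    while address in pe_outputs and address not in seen:
        seen.add(address)
        chain.append(address)
        inner = inner_exception(pe_outputs[address])
        if inner is None:
            break
        address = inner[1]
    return chain


# Saved !pe output keyed by object address (heavily trimmed).
pe_outputs = {
    "10544f64": (
        "Exception type: System.Web.HttpUnhandledException\n"
        "InnerException: System.NullReferenceException, "
        "use !PrintException 10544df8 to see more\n"
    ),
    "10544df8": (
        "Exception type: System.NullReferenceException\n"
        "InnerException:\n"
    ),
}

print(" -> ".join(exception_chain("10544f64", pe_outputs)))
```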

Extra credit

If you looked at my "Advanced commands" post a while back you saw some examples of the .foreach command. This is a great command to use, for example if you want to see the callstack for all System.ArgumentNullExceptions. Instead of manually stepping through the exceptions one at a time we can dump them all at once, check their callstacks, etc.

0:000> .foreach(myVariable {!dumpheap -type System.ArgumentNullException -short}){!pe myVariable;.echo *************}
Exception object: 03c6fbb8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1F15E3E0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F15E43C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 08378e24
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1CE6EC60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1CE6ECBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 084c0b30
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 08522f84
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 0c036bf8
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1DC7EB60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1DC7EBBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 105d7f60
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1F39F360 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F39F3BC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 106206fc
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1F39F360 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F39F3BC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Exception object: 1077a864
Exception type: System.ArgumentNullException
Message: Value cannot be null.
InnerException:
StackTrace (generated):
    SP       IP       Function
    1E3BEEE0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1E3BEF3C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77

StackTraceString:
HResult: 80004003
The current thread is unmanaged
*************
Unknown option: ------------------------------
*************
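All eight callstacks above turn out to be the same two frames, which you can see by reading, but that gets tedious with 136 NullReferenceExceptions. A minimal Python sketch that groups saved .foreach output by callstack, assuming the same "*************" separator used in the command above (the function name is my own):

```python
import re
from collections import Counter

# Capture the function of a frame row, without the trailing +0x... offset.
FRAME_RE = re.compile(
    r"^\s*[0-9A-Fa-f]{8,16}\s+[0-9A-Fa-f]{8,16}\s+(\S.*?)(?:\+0x[0-9A-Fa-f]+)?\s*$"
)


def group_by_stack(log_text, separator="*************"):
    """Count how many exceptions share each callstack."""
    groups = Counter()
    for chunk in log_text.split(separator):
        frames = tuple(
            match.group(1)
            for match in map(FRAME_RE.match, chunk.splitlines())
            if match
        )
        if frames:  # skips chunks without frames, e.g. "Unknown option"
            groups[frames] += 1
    return groups


# Trimmed copy of the first two exceptions and the tail of the output above.
sample = """\
Exception object: 03c6fbb8
Exception type: System.ArgumentNullException
StackTrace (generated):
    SP       IP       Function
    1F15E3E0 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1F15E43C 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77
*************
Exception object: 08378e24
Exception type: System.ArgumentNullException
StackTrace (generated):
    SP       IP       Function
    1CE6EC60 795FC73C mscorlib_ni!System.Guid..ctor(System.String)+0x2a14bc
    1CE6ECBC 1D0AB97F Foo_1cfd0000!Pages.PersonHomePage.Page_Load(System.Object, System.EventArgs)+0x77
*************
Unknown option: ------------------------------
*************
"""

for frames, count in group_by_stack(sample).most_common():
    print(count, "x", " <- ".join(frames))
```

I strip the SP column and the +0x offsets so that exceptions thrown on different threads, or from different lines of the same method, end up in one group. Keep the offset in the key if you want to tell the throw sites apart.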

Well this post was even longer than usual.

I hope you found it of value, and I'll gladly listen to any comments, feedback or wishes on future topics you might have.

/ Johan