.NET Debugging Demos Lab 2: Crash – Review


Hi all,

I realize that you probably haven’t all had a chance to look at the Crash lab yet, considering I just published it on Friday but here is the review for it.  I have to say that considering the amount of downloads for the buggy bits site I am a bit surprised with the lack of questions about the labs.  Perhaps they were too easy, or perhaps people just haven’t had the time to go through them yet… either case, feel free to ask questions or comment on the labs, even questions that aren’t directly related to the labs are welcome. My comments and answers are shown inline in red.

I will be publishing a Lab 2 which will be a memory lab later this week.

Reproduce the problem

1. Browse to the reviews page http://localhost/BuggyBits/Reviews.aspx, you should see a couple of bogus reviews for BuggyBits

2. Click on the Refresh button in the reviews page. This will crash the w3wp.exe process (or aspnet_wp.exe on IIS 5) 

    Note: If you have Visual Studio installed a Just-In-Time Debugger message may pop up (just click no for the purposes of this excercise).    
    However since this message box will sit there and wait for user input in order to shut down the app you may want to
disable JIT debugging if you have visual studio 
    installed on a test system.

Examine the eventlogs

1. Open the Application and System eventlogs, the information in the eventlogs will differ based on the OS and IIS version you are running. Among other events you may
    have a System Event looking something like this…

Event Type:	Warning
Event Source:	W3SVC
Event Category:	None
Event ID:	         1009
Date:		2008-02-08
Time:		10:12:06
User:		N/A
Computer:   	MYMACHINE
Description:
A process serving application pool 'DefaultAppPool' terminated unexpectedly. The process id was '4592'. 
The process exit code was '0xe0434f4d'.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

Q: What events do you see?

A: On my machine (Win2k3, .NET 2.0 SP1) i get the above message in the system eventlog and the following two messages in the application eventlog.

From the first one we can see that there was a nullreference exception of some kind…

Event Type:	Error
Event Source:	.NET Runtime 2.0 Error Reporting
Event Category:	None
Event ID:	         5000
Date:		2008-02-11
Time:		11:13:18
User:		N/A
Computer:	         MYMACHINE
Description:
EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 app_code.wbwztx_4, P5 0.0.0.0, P6 47a81aee, P7 13, P8 0, P9 system.nullreferenceexception, P10 NIL.

And the second one completely gives it away as a null reference exception that occurred in the finalizer/destructor for Review (Review.Finalize()). 
The good thing about this is that we don’t really even need to debug this on W2k3, .NET 2.0 SP1, but on some combinations of IIS/OS/Framework you will not get
the 2nd error message.

Event Type:	Error
Event Source:	ASP.NET 2.0.50727.0
Event Category:	None
Event ID:	1334
Date:		2008-02-11
Time:		11:13:15
User:		N/A
Computer:	PRATHER
Description:
An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/1/Root/BuggyBits

Process ID: 4944

Exception: System.NullReferenceException

Message: Object reference not set to an instance of an object.

StackTrace:    at Review.Finalize()

Q: What does the exit code 0xe0434f4d mean?

A: 0xe0434f4d is the native exception code for .NET exceptions, any .net exception will be translated into 0xe0434f4d, whether it is a SQLException, NullReferenceException or a FileNotFoundException.

Q: Can you tell from the eventlogs what it was that caused the crash? 

A: In this case as mentioned above, the 2nd application eventlog entry gives it away completely.  The process crashed because there was an unhandled exception on the finalizer thread when finalizing an object of type Review.

Get a memory dump

1. Browse to the reviews page http://localhost/BuggyBits/Reviews.aspx again, but don’t click refresh

2. Open a command prompt and move to the debuggers directory and type in “adplus -crash -pn w3wp.exe” and hit enter

Q: A new window should appear on the toolbar, what is it? 

A: This is the debugger, if you maximize it you will see a cmd window with cdb running.  Cdb is the commandline equivalent for windbg (i.e. windbg without a UI).     

Q: What is the debugger waiting for? Hint: Check the help files for ADplus/crash mode in windbg

A: Here is a table showing what events/exceptions adplus monitors for by default, and what it will do if any of these occurr

Abbreviations L=Log entry in debuglog, T=Record Time, S=Log stack, M=Mini dump, F=Full dump, E=log in the eventlog.

You can change how these events are handled by providing your own cfg file to adplus.

Code Exception/Event Explanation/Note 1st Chance 2nd Chance
av Access Violation An attempt to access to uninitialized memory.
This is also thrown for .net NullReferenceExceptions
LTSM LTSFE
ch Invalid Handle An attempt to access data using an invalid handle. Eg. window, file or other native handle. LTSM LTSFE
ii Illegal Instruction An attempt is made to execute an invalid instruction. LTSM LTSFE
dz Integer divide Integer divide by zero. Even in .net. LTSM LTSFE
c000008e Floating divide Floating point divide by zero. LTSM LTSFE
iov Integer overflow An attempt to store more than 32/64 bits in an integer. LTSM LTSFE
lsq Invalid Lock Sequence An attempt is made to execute an invalid lock sequence. LTSM LTSFE
sov Stack overflow The memory used for the stak (method pointers, parameters, local vars etc.) exceeds the amount allocated for the stack. Typically caused by recursion. LTSM LTSFE
eh C++ EH exception C++ EH Exception LTS LTSFE
* Unknown exception Any exception not specifically defined in this list LTS LTSFE
clr .NET Exception Any .net exception LTSFE
bpe Breakpoint A break point is hit or the debugger window is shut down. LTSME
wkd Wake Debugger Wake debugger. LTSME
ld Dll Load A dll is loaded in the process L L
ud Dll Unload A dll is unloaded from the process L L
epr Exit Process The process exits. LTSFE
aph Application Hang The Windows OS has determined that the process is no longer responding. LTSM LTSFE
cce Ctrl+C Console App Ctrl+C or Ctrl+Break is passed to a console app. LTSM LTSFE
dm Data Misaligned A unit of data is read from or written to an address that is not a multiple of the data size. LTSM LTSFE
gp Guard page violation An attempt to access data within a guard page. LTSM LTSFE
ip In Page IO Error I/O Error LTSM LTSFE
isc Invalid System Call Invalid System Call LTSM LTSFE
sbo Stack Buffer overflow Stack buffer overflow LTSM LTSFE

3. Reproduce the issue by clicking on the refresh button in the reviews page.

Q: What files got created in the dump folder?  Note: The dump folder will be located under your debuggers directory with the name crash_mode and todays date and time 

A: The following files were created in my crash mode folder (under the debuggers folder)

ADPlus_report.txt
<DIR> CDBScripts
PID-4628__W3WP.EXE__Date_02-11-2008__Time_13-14-0808.log
Process_List.txt
PID-4628__W3WP.EXE__1st_chance_AccessViolation__mini_17a4_2008-02-11_13-14-19-358_1214.dmp
PID-4628__W3WP.EXE__1st_chance_Process_Shut_Down__full_17a4_2008-02-11_13-14-31-889_1214.dmp
PID-4628__W3WP.EXE__2nd_chance_NET_CLR__full_17a4_2008-02-11_13-14-20-093_1214.dmp

So we have the normal log files, CDBScripts, ADPlus_report.txt and the process list which I described in the review for lab 1

Apart from this we have a minidump for a 1st chance access violation (probably our NullReferenceException), and then a 2nd chance CLR exception (the same nullreference exception, but now 2nd chance since it wasn’t handled), followed by a process shutdown dump.  You can see in which order they occurred by looking at the timestamps on them.

 

Open the dump in windbg

1. Open the dump file labeled 2nd Chance CLR Exception in windbg (file/open crash dump).  Note that this dump got created just before the 1st chance process shutdown.

Note: if you throw an exception (.net or other) you have a chance to handle it in a try/catch block.  The first time it is thrown it becomes a 1st chance exception and is non-fatal.  If you don’t handle the exception it will become a 2nd chance exception (unhandled exception) and any 2nd chance exceptions will terminate the process.

2. Set up the symbol path and load sos (see the setup instructions for more info)

 

In a crash dump, the active thread is the one that caused the exceptions (since the dump is triggered on an exception).

Q: Which thread is active when you open the dump? Hint: check the command bar at the bottom of the windbg window.

A: The active thread is thread #20, which if I compare it to the !threads output is the Finalizer thread

 

20 2 116c 001ac540 b220 Enabled 02f0d28c:02f0f130 00199400 0 MTA (Finalizer) System.NullReferenceException (02f0cb3c)

In the !threads output we can also see that this thread threw a NullReferenceException and that the address of the Exception is 0x02f0cb3c

Examine the callstacks and the exception

1. Examine the native and managed callstacks. 

kb 2000
!clrstack

Q: What type of thread is it?

A: We can see from kb that this is in fact the finalizer thread but !clrstack does not give us a whole lot of information about what the managed stack looks like because we are deep in the exception handler code. 

0:020> kL 2000
ChildEBP RetAddr  
02a0f9d8 79f071ac kernel32!RaiseException+0x53
02a0fa38 7a10733d mscorwks!RaiseTheExceptionInternalOnly+0x2a8
02a0fa50 7a1073b1 mscorwks!RaiseTheException+0x4e
02a0fa78 7a1073e2 mscorwks!RealCOMPlusThrowWorker+0x71
02a0faa4 7a1073f0 mscorwks!RealCOMPlusThrow+0x30
02a0fab4 7a14f0d4 mscorwks!RealCOMPlusThrow+0xd
02a0fbe4 7a025a5b mscorwks!Thread::RaiseCrossContextException+0x41f
02a0fc98 79f01747 mscorwks!Thread::DoADCallBack+0x293
02a0fcb8 79f04a71 mscorwks!ManagedThreadBase_DispatchInner+0x47
02a0fcc4 79f04b01 mscorwks!ManagedThreadBase::FinalizerAppDomain+0x25
02a0fd20 79efe5b4 mscorwks!SVR::DoOneFinalization+0x141
02a0fda8 79efe70a mscorwks!SVR::FinalizeAllObjects+0x56
02a0fdc0 79ef3207 mscorwks!SVR::GCHeap::FinalizerThreadWorker+0xe7
02a0fdd4 79ef31a3 mscorwks!ManagedThreadBase_DispatchInner+0x4f
02a0fe68 79ef30c3 mscorwks!ManagedThreadBase_DispatchMiddle+0xb1
02a0fea4 79fb9643 mscorwks!ManagedThreadBase_DispatchOuter+0x6d
02a0fecc 79fb960d mscorwks!ManagedThreadBase_NoADTransition+0x32
02a0fedc 79fd0c91 mscorwks!ManagedThreadBase::FinalizerBase+0xd
02a0ff14 79f95a2e mscorwks!SVR::GCHeap::FinalizerThreadStart+0xbb
02a0ffb8 7d4dfe21 mscorwks!Thread::intermediateThreadProc+0x49
02a0ffec 00000000 kernel32!BaseThreadStart+0x34
0:020> !clrstack
OS Thread Id: 0x116c (20)
ESP       EIP     
02a0fa8c 7d4e2366 [GCFrame: 02a0fa8c] 
02a0faf0 7d4e2366 [GCFrame: 02a0faf0] 
02a0fcec 7d4e2366 [GCFrame: 02a0fcec] 

Q: What is this thread doing?

A: It is trying to call the finalizer of some object and in doing so it threw an exception that is now being rethrown as a 2nd chance exception.

2. Examine the exception thrown

!pe

Note: !pe/!PrintException will print out the current exception being thrown on this stack if no parameters are given

Q: What type of exception is it?

A: A NullReferenceException 

0:020> !pe
Exception object: 02f0cb3c
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException: 
StackTrace (generated):

StackTraceString: 
HResult: 80004003

Note: In some cases, like this one where the exception has been rethrown, the original stacktrace may not be available in the exception.  In cases like this you may get more information if you find the original exception

3. Look at the objects on the stack to find the address of the original exception

!dso

Q: What is the address of the original exception

Hint: Look at your previous pe output to see the address of the rethrown exception.  Compare this to the addresses of the objects on the stack.  You should have multiple exceptions, a few with the rethrown exception address but one of the bottommost exceptions will be the original one (look for one with a different address).

A:  The only NullReferenceException with a different address than the one we listed before (0x2f0cb3c) is 0x6f1e5e4

0:020> !dso
OS Thread Id: 0x116c (20)
ESP/REG  Object   Name
02a0f9f4 02f0cb3c System.NullReferenceException
02a0f9f8 02f0cb3c System.NullReferenceException
02a0fa40 02f0cb3c System.NullReferenceException
02a0fa58 02f0cb3c System.NullReferenceException
02a0faac 02f0cb3c System.NullReferenceException
02a0fabc 02f0cb3c System.NullReferenceException
02a0fb58 06f1e5e4 System.NullReferenceException
02a0fb5c 02f0bd70 System.Byte[]
02a0fb78 02f0cb3c System.NullReferenceException
02a0fb7c 02f0bd70 System.Byte[]
02a0fb8c 02f0bd70 System.Byte[]
02a0fc58 06eb281c System.RuntimeType
02a0fcb0 02efd4b0 Review

4. Print out the original exception and look at the information and the callstack

!pe <original exception address> 

0:020> !pe 06f1e5e4 
Exception object: 06f1e5e4
Exception type: System.NullReferenceException
Message: Object reference not set to an instance of an object.
InnerException: 
StackTrace (generated):
    SP       IP       Function
    02A0F950 0FE90F24 App_Code_wbwztx_4!Review.Finalize()+0x14

StackTraceString: 
HResult: 80004003

Q: In what method is the exception thrown?

A: In the finalizer/destructor for the Review class

Q: What object is being finalized?

Note: you could actually have gotten this information by dumping out the _exceptionMethodString of the rethrown exception as well, but with !pe of the original exception you get the information in a cleaner way.

A: A Review object

Q: Normally exceptions thrown in ASP.NET are handled with the global exception handler and an error page is shown to the user.  Why did this not occurr here?  Why did it cause a crash?

A: The global exception handler for ASP.NET only handles exceptions thrown when processing a request.  In this case the exception is thrown on the finalizer which means that there is no user there to see the exception and therefore there is noone to show the error page to.

In 1.1. and if you use legacy exception handling in 2.0, exceptions on the finalizer are non-fatal, i.e. they will not crash the process, instead the finalizer would just stop finalizing the current object.  This is very dangerous since this means that you may not process code that releases native objects and thus you may end up with a memory leak that is very hard to troubleshoot or other resource leaks like connection leaks etc.  All unhandled exceptions on the finalizer thread should be investigated and taken seriously.

Examine the code for verification

1. Open Review.cs to find the destructor/finalizer for the Review class 

    ~Review()
    {
        if (quote.ToString() != string.Empty)
        {
            quote = null;
        }
    }

Q: which line or method could have caused the exception

A: In this case the code for the Review destructor is fairly short so it is pretty easy to pinpoint where the exception occurred.  The only instruction in here that could have caused a NullRef is the quote.ToString() and this would occurr if quote = null.

As an extra excercise you can also examine the disassembly of the function to try to pinpoint better where in the function the exception is caused

!u <IP shown in the exceptionstack>

0:020> !u 0FE90F24 
Normal JIT generated code
Review.Finalize()
Begin 0fe90f10, size 57
0fe90f10 55              push    ebp		
0fe90f11 8bec            mov     ebp,esp
0fe90f13 57              push    edi
0fe90f14 56              push    esi
0fe90f15 53              push    ebx
0fe90f16 83ec14          sub     esp,14h
0fe90f19 33c0            xor     eax,eax
0fe90f1b 8945e8          mov     dword ptr [ebp-18h],eax
0fe90f1e 8bf1            mov     esi,ecx
0fe90f20 8b4e04          mov     ecx,dword ptr [esi+4]		
0fe90f23 8b01            mov     eax,dword ptr [ecx]
0fe90f25 ff5028          call    dword ptr [eax+28h]
0fe90f28 8bc8            mov     ecx,eax
0fe90f2a 8b15ac46eb0a    mov     edx,dword ptr ds:[0AEB46ACh] ("")
0fe90f30 e8db2a4d69      call    mscorlib_ni+0x2a3a10 (79363a10) (System.String.Equals(System.String, System.String), mdToken: 06000143)
0fe90f35 85c0            test    eax,eax
0fe90f37 7505            jne     0fe90f3e			
0fe90f39 33d2            xor     edx,edx
0fe90f3b 895604          mov     dword ptr [esi+4],edx
0fe90f3e c745e400000000  mov     dword ptr [ebp-1Ch],0
0fe90f45 c745e8fc000000  mov     dword ptr [ebp-18h],0FCh
0fe90f4c 685e0fe90f      push    0FE90F5Eh
0fe90f51 eb00            jmp     0fe90f53
0fe90f53 58              pop     eax
0fe90f54 ffe0            jmp     eax
0fe90f56 8d65f4          lea     esp,[ebp-0Ch]
0fe90f59 5b              pop     ebx
0fe90f5a 5e              pop     esi
0fe90f5b 5f              pop     edi
0fe90f5c 5d              pop     ebp
0fe90f5d c3              ret
0fe90f5e c745e800000000  mov     dword ptr [ebp-18h],0
0fe90f65 ebef            jmp     0fe90f56

If we look at the disassembly below and compare it to the method above we find a few things.

1. The exception address 0x0FE90F24 falls on the bolded lines
2. Right before the bolded line we set EAX to ptr(ESI+0x4) and it is when we use EAX that we access violate/get the nullreference exception.  ESI is typically used to hold the this    
    pointer (in this case the pointer to our Review object) so the question is really what is at ESI+0x4

    Dumping out the Review object that we found with !dso we can see that at offset 0x4 we have the member variable quote and looking at the value we can see that it is null.

0:020> !do 02efd4b0 
Name: Review
MethodTable: 0fe11cc4
EEClass: 0fde5860
Size: 16(0x10) bytes
 (C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\buggybits\b27906cc\f5f91899\App_Code.wbwztx_4.dll)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
790fd8c4  4000008        4        System.String  0 instance 00000000 quote
790fd8c4  4000009        8        System.String  0 instance 00000000 source

3.  If that wasn’t enough proof we could also have seen approximately where we were at in the function by looking at the instructions below the point where we threw the exception. We have an instruction to move “” into edx, i.e. an empty string (string.Empty), and right after that there is a string compare operation which matches up to if(quote.ToString() != string.Empty)

Btw, the reason that quote was null is because ClearReview() sets it to null… The proper way to resolve this issue would have been to not have a finalizer (since we don’t really need one given that we don’t have to clean up any native resources) or at the very least we should have called GC.SuppressFinalize() in the ClearReview() method.

 

Until next time

Tess

Comments (45)

  1. Lucian Bargaoanu says:

    I just wanted to say that the labs are great. And because of the hints is kind of easy to go through them and this is good because I’m pretty ignorant about this type of debugging. I’m sure they’ll get more difficult later on.

  2. Essam says:

    Tess,

    What about a screen cast for the lab ?

  3. Sharepoint Announcing the DoD 5015.2 Resource Kit for SharePoint Server 2007 [Via: sptblog ] WPF The…

  4. Link Listing – February 11, 2008

  5. Tom says:

    I aggree with Lucian Bargaoanu: "The labs are great." I am also pleased wit the hint because of my ignorance with this kind of debugging.

    The lack of questions about the lab you notice, may be caused by the quality of the labs. So just keep on going as you are; I lik it.

  6. Tess says:

    Great to hear:)  I’ll post a memory lab today or tomorrow depending on if when i can get an hour or two freed up.

    As a teaser, the mem leak will be produced by running tinyget on 30 threads, 100 loops for links.aspx

  7. baal says:

    “examine the disassembly ” parts is difficult for me

  8. Tess says:

    agreed, I think that is a bit difficult overall… and not needed in most cases, which is why i just added it as an extra excercise.

    do you have any specific questions around it?

  9. baal says:

    Dear, questions will come, don’t worry!

    today i spent 8 hours to make 2 reviews articles to chinese, I have not run the code, just translate to catch up with you. i will investigate them, then ask you something.

    ——-

    tired,over here

  10. Kevin says:

    I’m curious, is there something you did to cause the disassembly to show that the memory contained an empty string?  I do not see the same results when disassembling.  My output does not show the ("") with the mov instruction.

    01bc0f20 8bc8            mov     ecx,eax

    01bc0f22 8b158c46550b    mov     edx,dword ptr ds:[0B55468Ch]

    01bc0f28 e8a3db7a77      call    mscorlib_ni+0x2aead0 (7936ead0) (System.String.Equals(System.String, System.String), mdToken: 06000142)

    Thanks,

    Kevin

  11. Tess says:

    interesting, no nothing special, it might be slightly different versions of sos.dll…

  12. charju says:

    A littel comments for exception object:

    0:018> kb

    ChildEBP RetAddr  Args to Child              

    01e9f9fc 79f97065 e0434f4d 00000001 00000001 kernel32!RaiseException+0x53

    01e9fa5c 7a05b941 02678b1c 00000000 00000000 mscorwks!RaiseTheExceptionInternalOnly+0x226

    Review the first args of the 1st line in callstack: kernel32!RaiseException+0x53, "e0434f3d", you can also found this strange number at eventvwr log just descripbed in tess’s lab2. This is a common CLR exception number.

    Second, the real CLR exception object is identified in the first arg of mscorwks!RaiseTheExceptionInternalOnly

    so, you can dumpobject to examine this exception object by !do 02678b1c or directly by !pe 02678b1c . If you type command !do, then you can found below info:

    790fa3e0  40000b7        c        System.String  0 instance 02679cc4 _exceptionMethodString

    we can continously !do 02679cc4  to see more information about this exception.

    Thanks for tess’s great post!!! I’d learned much more debugging skill and thoughts from you, thanks again!

  13. Raymond Hung says:

    Thank you for all the posting. This is the best place to learn managed debugging. 🙂

    I have one question on !dso.

    I am getting the following

    0:018> !dso

    OS Thread Id: 0xb90 (18)

    ESP/REG  Object   Name

    01e9fa18 0266174c System.NullReferenceException

    01e9fa1c 0266174c System.NullReferenceException

    01e9fa64 0266174c System.NullReferenceException

    01e9fa78 0266174c System.NullReferenceException

    01e9facc 0266174c System.NullReferenceException

    01e9fad8 0266174c System.NullReferenceException

    01e9fb74 02659b0c System.NullReferenceException

    01e9fb78 0266040c System.Byte[]

    01e9fba4 0266174c System.NullReferenceException

    01e9fba8 0266040c System.Byte[]

    01e9fbb8 0266040c System.Byte[]

    01e9fccc 025f75c0 System.CodeDom.Compiler.TempFileCollection

    The difference between this and the one shows in the lab is that the bottom piece. Your output shows

    0:020> !dso

    OS Thread Id: 0x116c (20)

    ESP/REG  Object   Name

    02a0f9f4 02f0cb3c System.NullReferenceException

    02a0f9f8 02f0cb3c System.NullReferenceException

    02a0fa40 02f0cb3c System.NullReferenceException

    02a0fa58 02f0cb3c System.NullReferenceException

    02a0faac 02f0cb3c System.NullReferenceException

    02a0fabc 02f0cb3c System.NullReferenceException

    02a0fb58 06f1e5e4 System.NullReferenceException

    02a0fb5c 02f0bd70 System.Byte[]

    02a0fb78 02f0cb3c System.NullReferenceException

    02a0fb7c 02f0bd70 System.Byte[]

    02a0fb8c 02f0bd70 System.Byte[]

    02a0fc58 06eb281c System.RuntimeType

    02a0fcb0 02efd4b0 Review <<–

    So in my case, I am not able to follow some piece of the lab. I am running IIS 6.0 with Win2K3 SP2.

    I am interested on getting the offset like the one in the lab

    "Dumping out the Review object that we found with !dso we can see that at offset 0x4 we have the member variable quote and looking at the value we can see that it is null."

    Is there another way to get the offset?

  14. Raymond Hung says:

    Found it. Sorry, should have tried a little harder before posting the question.

    0:018> !dumpheap -mt 01f98ba4        

    Address       MT     Size

    02644834 01f98ba4       16    

    02644a74 01f98ba4       16    

    total 2 objects

    Statistics:

         MT    Count    TotalSize Class Name

    01f98ba4        2           32 Review

    Total 2 objects

    0:018> !do 02644834

    Name: Review

    MethodTable: 01f98ba4

    EEClass: 024354ac

    Size: 16(0x10) bytes

    (C:WINDOWSMicrosoft.NETFrameworkv2.0.50727Temporary ASP.NET Filesdebug1b1d4288db06837fbApp_Code.jbom0q7o.dll)

    Fields:

         MT    Field   Offset                 Type VT     Attr    Value Name

    790fa3e0  4000008        4        System.String  0 instance 00000000 quote

    790fa3e0  4000009        8        System.String  0 instance 00000000 source

  15. I have the same problem as Raymond, the Review object doesn’t show when I run !dso

    I can get to it if I run !dumpheap -stat, find it "manually", then use !dumpheap -mt [address]

    but there’s gotta be an easier way than that

  16. Tess says:

    if you dont see the review object with !dso that probably means that you caught the dump when it was finalizing some other object, in which case you would need to run !dumpheap -stat to get any review objects on the heap.  Can you post your !dso output?

  17. is pretty much the same (other than the addresses) as the one Raymond posted

    0:018> !dso

    OS Thread Id: 0xb90 (18)

    ESP/REG  Object   Name

    01e9fa18 0266174c System.NullReferenceException

    01e9fa1c 0266174c System.NullReferenceException

    01e9fa64 0266174c System.NullReferenceException

    01e9fa78 0266174c System.NullReferenceException

    01e9facc 0266174c System.NullReferenceException

    01e9fad8 0266174c System.NullReferenceException

    01e9fb74 02659b0c System.NullReferenceException

    01e9fb78 0266040c System.Byte[]

    01e9fba4 0266174c System.NullReferenceException

    01e9fba8 0266040c System.Byte[]

    01e9fbb8 0266040c System.Byte[]

    01e9fccc 025f75c0 System.CodeDom.Compiler.TempFileCollection

    is there an easier way to locate a specific object though?

  18. Tess says:

    I’m not sure why the review object is not showing up, it is probably gone by the time you gathered the dump.  

    Looking at the review object is just an additional step to verify the issue but you can verify it through looking at the exception and through looking at the code as well so getting the review object is not crucial to the issue.

  19. KB says:

    Hey Tess,

    Another great one!  Thanks!  Is it beneficial to have a good understanding of assembly language to be proficient at this?  I’m assuming it does but wanted to get your take.

    And speaking of that, in the 0:020> !u 0FE90F24  command, I had a tough time seeing the lines in bold type.  Tried on two different monitors.  Screen res on both is cranked to the highest setting possible.  Could be other settings I guess.  Anyway, just thought I’d mention it.

    Take care.

    KB

  20. Tess says:

    Hi KB,

    No it is not essential to have a good understanding of assembly language to be proficient at debugging .net.  I just wanted to add it for completeness.

    I am not sure why the bolded lines are not showing up online.  The bolding is in the html but I can’t see it online either now that you mention it… either way it is these two lines that were bolded

    0fe90f23 8b01            mov     eax,dword ptr [ecx]

    0fe90f25 ff5028          call    dword ptr [eax+28h]

    Thanks

    Tess

  21. Thyagu says:

    Hi Tess,

    This is the best blog which i could find for debugging .net code. I have a strange problem reported by a  user. The problem is that a customer application crashes at a certain point of time. When we had a look at the event viewer and the dumps and saw a Null Reference exception in our code.

    SP       IP       Function

       1E57ED38 1B1F5023 com.uni.jellybeans.CommsManager.Run()

       1E57ED70 793B0D1F System.Threading.ThreadHelper.ThreadStart_Context(System.Object)

       1E57ED78 793740AB System.Threading.ExecutionContext.runTryCode(System.Object)

       00000000 00000001 System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)

       1E57F204 79373FF7 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)

       1E57F21C 79373EDE System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)

       1E57F234 793B0C68 System.Threading.ThreadHelper.ThreadStart()

    I followed your instructions and was successful in diassembling the code. But i could not find the exact string which is causing the failure in our code. i would need your assistance on this. The following is the dso of the dump

    !dso

    OS Thread Id: 0x1dc0 (58)

    ESP/REG  Object   Name

    1e57f528 10ebe164 System.NullReferenceException

    1e57f52c 10ebe164 System.NullReferenceException

    1e57f574 10ebe164 System.NullReferenceException

    1e57f58c 10ebe164 System.NullReferenceException

    1e57f5e0 10ebe164 System.NullReferenceException

    1e57f5f0 10ebe164 System.NullReferenceException

    1e57f68c 04ed5b00 System.NullReferenceException

    1e57f690 10ebd12c System.Byte[]

    1e57f6ac 10ebe164 System.NullReferenceException

    1e57f6b0 10ebd12c System.Byte[]

    1e57f6c0 10ebd12c System.Byte[]

    Please let me know if you need anything further to analyze

  22. Tess says:

    Hi Thyagu,

    From this we can see that the app is starting a new thread and runs com.uni.jellybeans.CommsManager.Run() on it. and in there there is a null reference exception.  If that is not handled then that would definitely lead to a crash.

    Its hard to say anything else about what is causing the null ref just based on the stack above but you can do

    !u 1B1F5023 which will unassemble the code for com.uni.jellybeans.CommsManager.Run() and then locate where in the !u output that 1B1F5023 falls to get a bit more info about where you are getting the nullref.  If that doesnt help then perhaps looking at the code and adding some debug.writelines would help

  23. Thyagu says:

    Hi Tess,

    Thanks a ton for your quick response. I would like to admit that  i am an amateur in analyzing this dumps. I have pasted below the failing disassembly code.

    1b1f5016 8b0d3cbd9612    mov     ecx,dword ptr ds:[1296BD3Ch]

    1b1f501c 8b45d8          mov     eax,dword ptr [ebp-28h]

    1b1f501f 8b5844          mov     ebx,dword ptr [eax+44h]

    1b1f5022 8b530c          mov     edx,dword ptr [ebx+0Ch]

    1b1f5025 e8c685175e      call    mscorlib_ni+0x2ad5f0 (7936d5f0) (System.String.Concat(System.String, System.String), mdToken: 060001c7)

    1b1f502a 50              push    eax

    1b1f502b 8bcf            mov     ecx,edi

    1b1f502d 8bd6            mov     edx,esi

    1b1f502f 8b01            mov     eax,dword ptr [ecx]

    1b1f5031 ff505c          call    dword ptr [eax+5Ch]

    1b1f5034 8d65f4          lea     esp,[ebp-0Ch]

    I would be pleased if you can help me or guide me on further action.

    Thanks a lot for your help.

    Cheers,

    Thyagu

  24. Tess says:

    Hi Thyagu

    The NullRef happens between these lines.  

    1b1f5022 8b530c          mov     edx,dword ptr [ebx+0Ch]

    1b1f5025 e8c685175e      call    mscorlib_ni+0x2ad5f0 (7936d5f0) (System.String.Concat(System.String, System.String), mdToken: 060001c7)

    The top line is taking ebx+0xc  (probably a membervariable of some object) and stores it in a temporary register

    The second line calls string.concat so it is doing something along the lines of str1 = str2 + str3

    I would look for the line in code that concatenates strings and see if any of the strings that it concatenates may be null.  and also look at the line of code above it to see if could cause a null ref.

    I am not exactly sure how the code looks but looking at the disassembly we have

    mov     ecx,dword ptr ds:[1296BD3Ch]

    => ecx = whatever is stored at 0x1296BD3C  

    (likely some static string)

    mov     eax,dword ptr [ebp-28h]

    => eax = what is stored at ebp-0x28

    (some parameter or local)  i’ll call it myparam for now

    ebx,dword ptr [eax+44h]

    => ebx = what is stored at eax+0x44

    (some membervariable of myparam) i’ll call it mymembervar

    edx,dword ptr [ebx+0Ch]

    => edx = ebx + 0xc

    (some membervar of mymembervar)

    so the code should look something like this

    str = somestring + myParam.myMemberVar.membervar

    and one of these is probably null

  25. Thyagu says:

    Hi Tess,

    Thanks for your quick response.

    I spent the whole day in figuring where the failure had happened, but i was not successful.

    If there is any alternative way to figure out which object is responsible for the Null Reference exception, it would be helpful.

    FYI : This problem is not reproducible in our environment, so we can’t debug the code and find out the reason.

    Your help will be very much appreciated.

    Cheers,

    Thyagu

  26. Tess says:

    if the above didnt help then the best would be to add tracing statements as i mentioned before to print out what objects are null etc. inside that method…

  27. PaulG says:

    Hello Tess,

    I agree with the other folks … nice work on these labs!

    I am a little late to the party as this Lab2 was posted a few months ago, but I had a question about your explanation of the assembly code.

    You made this comment:

    2. Right before the bolded line we set EAX to ptr(ESI+0x4) and it is when we use EAX that we access violate/get the nullreference exception.  

    Did you mean to say ECX here instead of EAX?

    Thanks again for sharing your knowledge.

    Cheers,

    PaulG

  28. Tess says:

    0fe90f20 8b4e04          mov     ecx,dword ptr [esi+4]

    0fe90f23 8b01            mov     eax,dword ptr [ecx]

    Well actually both,  first we set ecx to ptr[esi+4], and then we move the contents of this pointer to eax

  29. PaulG says:

    Thanks Tess … that makes sense.

    I have another question … I hope you don’t mind.

    I liked how you displayed the assembly code and then used the IP to show us which line of assembly code was causing the access violation / null reference.

    Is there a way to point us to the line of IL that is causing the null reference?

    I can find the IL by printing my exception:

    0:016> !pe 03cdbb0c

    Exception object: 03cdbb0c

    Exception type: System.NullReferenceException

    Message: Object reference not set to an instance of an object.

    InnerException: <none>

    StackTrace (generated):

       SP       IP       Function

       025BF520 027A0F1C App_Code_ie7jwbrp!Review.Finalize()+0x14

    StackTraceString: <none>

    HResult: 80004003

    Then I use my IP to find the corresponding method desc:

    0:016> !ip2md 027A0F1C

    MethodDesc: 029b0ae0

    Method Name: Review.Finalize()

    Class: 025c54ac

    MethodTable: 029b0b0c

    mdToken: 06000013

    Module: 00ee8670

    IsJitted: yes

    m_CodeOrIL: 027a0f08

    And then I dump the IL for this method desc:

    0:016> !DumpIL 029b0ae0

    ilAddr = 025227f4

    .try

    {

     IL_0000: ldarg.0

     IL_0001: ldfld Review::quote

     IL_0006: callvirt System.Object::ToString

     IL_000b: ldsfld System.String::Empty

     IL_0010: call System.String::op_Inequality

     IL_0015: brfalse.s IL_001e

     IL_0017: ldarg.0

     IL_0018: ldnull

     IL_0019: stfld Review::quote

     IL_001e: leave.s IL_0027

    } // end .try

    .finally

    {

     IL_0020: ldarg.0

     IL_0021: call System.Object::Finalize

     IL_0026: endfinally

    } // end .finally

    IL_0027: ret

    Because we know that we are calling ToString() on a null string we it is obvious that this line is the culprit.

     IL_0006: callvirt System.Object::ToString

    But is there some commnad that will point us exactly to this line of IL?

    Cheers again,

    PaulG

  30. Tess says:

    unfortunately there is no way (other than matching it up to the assembly) to find out which line is causing the issue

  31. Michael L. says:

    In step one, regarding the application event log entry for 2003, you stated: "The process crashed because there was an unhandled exception on the finalizer thread when finalizing an object of type Review."

    I’m assuming that you just know it was in the finalizer thread because you wrote it, right? Just looking at the event log entry, we can obviously tell what the exception was, but I’m missing how one would know where it occured without walking through the debugging.

    On that note,  do you know of a link that breaks down what the fields are in the event ID 5000 entry? I’ve not had any luck finding that reference…

    Thanks! These labs have been a big help!

  32. Tess says:

    _______________________

    Message: Object reference not set to an instance of an object.

    StackTrace:    at Review.Finalize()

    ______________________

    The stack for the exception shows that it is on the finalizer i.e. in the finalize method for the review object, so that is how I know…

    Regarding the breakdown of the 5000 entry, what specifically are you looking for?

  33. Michael L. says:

    For the 5000 entry, I’m looking for what are the P fields.

    So, for instance, for this entry:

    EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 app_code.wbwztx_4, P5 0.0.0.0, P6 47a81aee, P7 13, P8 0, P9 system.nullreferenceexception, P10 NIL

    P5 looks like an emtpy IP address. Is it? What about P10? What is Nil? What is the address that is being shows in P3?

    I’m not really worried about what the answers to those questions are for this specific example.. just the generic answer such as "P1 is the executable that generated the exception".  I just haven’t had any luck finding a reference for that, and was curious about it.

  34. robins@interknowlogy.com says:

    In the walkthrough you state "ESI is typically used to hold the this pointer".  Is there a compilation somewhere of this type of information when debugging .NET applications?

    Also with your help I found it very easy to identify which x86 assembler instruction the exception occurred on, however determining which .NET source line or as a minimum which IL instruction in the method the exception occurred on.  I can use WinDBG to produce the assembler version of the JITed method or the IL version of the method but I need an IL version of the method interspersed with the assembler instructions so that I can correlate.

  35. Tess says:

    Robin,

    The fact that ESI usually holds the this pointer is not .net specific.  I dont know off hand if there is a compilation of what registers are usually used for but the most common ones are that ESI holds this, and eax holds the return value of a function.

    Regarding the IL, you can run !dumpil on the method descriptor to get the il of the method.

  36. KR says:

    When I try this example, w3wp.exe doesn’t crash; I get the normal red and yellow exception page, with the message:

    Server Error in ‘/BuggyBits’ Application.

    ——————————————————————————–

    Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that <machineKey> configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster.

    What am I doing wrong (or right??)

  37. Tess says:

    That sounds like a config error.  Do any of the pages work?

  38. BenThomp says:

    Hey Tess,

    I really liked your Labs.

    But in this particular Lab, I just wanted kno what is the resolution??

    Because If i comment the below

    //  ~Review()

    //    {

    //        if (quote.ToString() != string.Empty)

    //        {

    //            quote = null;

    //        }

    //    }

    process crash does not happen. is it good to comment this function……or there some other solution??

    Thank You

    Ben

  39. BenThomp says:

    Hey Tess,

    I really liked your Labs.

    But in this particular Lab, I just wanted kno what is the resolution??

    Because If i comment the below

    //  ~Review()

    //    {

    //        if (quote.ToString() != string.Empty)

    //        {

    //            quote = null;

    //        }

    //    }

    process crash does not happen. is it good to comment this function……or there some other solution??

    Thank You

    Ben

  40. Tess says:

    BenThomp,

    The problem here is an unhandled nullreference exception in the finalizer when calling quote.ToString() so anything that you do to avoid or handle that exception will resolve the problem.   (including removing the finalizer if that is feasible)

  41. Sebastian Dau says:

    if ( string.IsNullOrEmpty(quote) )   should work, too 🙂

  42. xentelworker says:

    Tess,

    First off, great article! This actually solved a application crash on my IIS7 Farm. I will definitely be blogging about it and linking back to your site for more detailed instructions.

    Regards,

    xentelworker

  43. stanley says:

    Thanks for share. I run the command line adplus -crash -pn w3wp.exe as instruction and got an error message "An output direction was not defined.". Any idea  what should I do to fix this issue.

  44. stan.chen says:

    Thanks for the great article. I follow the instruction to run the command line adplus -crash -pn w3wp.exe but got an error message "An output directory was not define". Any idea how I can fix this issue?

    Thanks

    Stanley

  45. stan.chen says:

    This is great but I got an error "No export clrstack found" when I view my crash dump file in windbg. Can you give some idea how I can get this working?

    Regards

    Stanley