SOS Debugging of the CLR, Part 1





SOS Debugging of the CLR, Part 1

SOS Debugging with the CLR (Part 1)

In this entry I was planning to cover some debugging hints and
tricks primarily with the SOS extension dll.  A couple of things happened
after I started.  First, I did a quick search on the
net and was very pleasantly surprised to see a lot of material on this already. 
Since we created a lot of this stuff from scratch, we often forget to look for
this kind of stuff.  I’m actually going to send some links around to the
team since there is a lot of good data here, and we haven’t invested a huge
amount in our own documentation (I assume you’ve read sos.htm? <g>). 
Second, I wound up with more content than I expected.  I generally write
these things at the Bear Creek Starbucks on my tmobile account (what did
we do before wireless, terminal server, broadband, and Tivo?).  For some
reason, it’s like a common room at school: white noise facilitates volume. 
So I’ve decided to keep all the overview material, but then concentrate on the
Windows debugging stack tricks and background.  I’m going to split off and save
the trully SOS specific content for another post.

So given all that good stuff, I no longer need to write up the
seminal work on debugging, and will instead try and call out some things to augment that
material rather than repeat it.  Let’s start off with some important links and background
material you’ll want to read up on:

SOS Documentation There is some background material that ships with the
SDK.  These docs were literally written by our development team and
the PSS folks for the CLR.  Suffice it to say we didn’t get a big
Doc budget here <g>  The help files and samples live in the SDK
under the <SDK\v1.1\Tool Developers Guide\Samples\sos> directory.
SOS Source Code I’m not sure if you noticed it, but a version of this
code lives in the SSCLI distribution.  Look in the \sscli\clr\src\tools\sos
directory.  I think you’ll find most of the code is very similar to
the version in the full distribution.
John
Robbins MSDN Article
This is a pretty thorough article with a sample
program to try out your skills on.  To be honest, it is the entry I
originally set out to write (a sample application with some commands). 
But since John has done it already, read this article and then this one.

PAG Guide for Debugging
This is an impressive piece of documentation on doing
advanced debugging of ASP.NET including how to use SOS.DLL.  Make
sure you grab a scone and a grande lattee before sitting down to read
this one (wow!).

First let me get the trivia out of the way.  The original
name of the CLR team (chosen by team founder and former Microsoft Distinguished
Engineer Mike Toutonghi) was "Lighting".  Larry Sullivan’s dev
team created an ntsd extension dll to help facilitate the bootstrapping of v1.0. 
We called it strike.dll
(get it?  "Lightning Strike"?  yeah, I know, ba’dump bum).  PSS really needed this in order to give us information back to the
team when it was time to debug nasty stress failures, which are almost always
done with the Windows debugger stack.  But we didn’t want to hand out our full strike.dll, because it contained some "dangerous" commands that if you really
didn’t have our source code could cause you confusion and pain (even to other
Microsoft teams).  So I pushed the team to create "Son of Strike" (Simon from our dev takes
credit/blame for this), and we shipped it with the product starting with Everett (aka
V1.1).  Keep your eye on future developments in this space; I hadn’t
anticipated as much interest as we’ve gotten in the tool, and I think future
improvements in the Windows debuggers might have us ship the full meal deal next go
around.

Background on Debugger Stacks

There are two supported debugger technologies that come from
Microsoft, the Windows Debugger Stack and Visual Studio.Net.

Windows Debugger Stack

This debugger is (currently) strictly an unmanaged debugger. 
There is one core debugger engine that handles exceptions, enables a very
handy plug in model, and accepts commands.  This is then wrapped with a set
of debugger front ends.  cdb and ntsd are a command line version for user
mode debugging.  They are identical except that cdb inherits your
console and ntsd creates a brand new one (mostly useful for debugging services
where you need a new console to be created).  windbg is a GUI user mode
debugger that adds nice windows and integrated call stacks on top of the cdb
functionality.  It has a command line window, so anything you can do on the
console you can also do inside this GUI shell.  Finally there is KD, which
is the kernel debugger.  It has the command console/command line behavior,
only you are debugging the kernel itself and therefore there are some extra
commands at your disposal. 

The Windows debugger stack has a great facility called extension dlls.  This basically allows you to write your own native dll which exports
a set of entry points that can be invoked from inside any Windows debugger.  As
an extension dll author, you get to read and write process memory and do other
very basic things against the debuggee process.  This is really a killer
feature, and very widely used inside Microsoft.  The nice thing is you are
can compile using your normal product headers, and then add utility features
like "dump this tree of data structures with nice formatting to the console". 
Now you produce that extension dll right out of your normal build with the rest
of the product, and you’re ready to go.  Sos.dll and Strike.dll are just
extension dll’s.

Visual Studio.Net

I expect most people reading this entry are very familiar with
VS.Net and its capabilities.  The obvious thing VS does well is GUI
debugging and RAD development.  You won’t find any of the project system or
wizards in Windbg, but will find a lot of the same debugger windows (call
stacks, memory, registers, etc).  VS.Net also allows you to step into RPC
calls, Web Services, and TSQL.  It does not, however, do a good job at
extension dlls.  The team added some very basic support for sos.dll in
version 7.1 (Everett), but personally I never use it because it falls way short
of the basic commands you want outside sos.dll (who doesn’t want to get all
their call stacks using "~*kb"?).

Which to Use?

98% of you reading this should just use VS.Net. 
It is far easier and more approachable.  Most of our engineers at Microsoft
use VS.Net to author things like the Base Class Libraries.  However I do
personally feel that VS has the "last mile problem".  In particular, I find
it to be ill suited to stress debugging.  Most operations guys I know do
not want to install the remote components of VS (which currently includes DCOM!) on a
production server.  It changes your environment too much.  Remoting
the console of the Windows debuggers is far more widely used (all of the customer PSS
remotes we get to the engineering team come in this way). 

Beyond this problem, stress debugging often involves things
like trying to figure out which threads are holding which lock objects and in
what order.  Clicking through dialogs to get thread stacks to do this
sucks.  It is so much easier to just type in "~*kb", and you will get a
call stack for every thread in the process in one nice output.  And then
what really rocks:  there are default extension dll commands like !critsec
that dump the contents of a critical section.  You will find similar
commands for dumping lots of internal data structures, and doing some default
diagnosis.  Finally, it is a lot easier to extract a context record (a
struct with all of the registers from a thread) and reset the thread.  This
is very handy when dealing with AV’s in a process that have thrown an exception
and land in some nasty code.

So again, bottom line, plan to use VS.Net.  But if
you’re hard core, then you will want to learn the Windows debuggers and extension dlls.  Usage on
the CLR team (and similar teams like the Windows Shell) are about 50/50 each. 
I’ve got engineers on my team that I can’t make use VS.Net because their life
just involves too many stress remotes and nasty unmanaged code (aren’t you glad
we’re doing it for you <g>).

Sos.dll falls primarily into the stress debugging bucket. 
Internally in the CLR, we also use it to bootstrap a new system.  For
example, we would have been dead without it when bootstrapping the 64-bit
version of the CLR (coming end of this month to the PDC at LA; to answer the
FAQ, we are not yet shipping a 64-bit version but we have announced it and demo’d it working at numerous public forums).

Setting Up Your
Machine for Debugging

This section is the quick start for getting debugging working
on your machine.

  1. Make sure you already have the .NET Framework and the SDK
    installed.

  2. Install
    the Windows Debugger stack.  I usually install this to
    c:\debuggers.  Place this directory on
    your path, eg: "set path=c:\debuggers;%path%;"
    Note that I placed this path before the regular path on purpose, so that the
    newest debugging tools are found before anything else you might have on your
    machine.

  3. Run the VSVars32.bat file if you care about the SDK on
    your path (not necessary to run the debuggers)

  4. Set up the symbol path for Windows symbols using
    symbol
    server
    .  As an example, create a directory
    c:\websymbols, and then
    set _NT_SYMBOL_PATH=
    SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols.

Trying It Out

Example 1: Who Loaded These Dlls?

One thing I’ve always really liked about the Windows debuggers is
that you can do a certain amount of automated scripting while you are debugging. 
The best I’ve come up with allows you to log all of the call stacks which loaded every dll in a process.  First make sure you
set up your machine for
debugging
.  For my test, we’ll look at all the loaded dll’s in the
regasm utility by just showing its help text:

  c:\> ntsd
regasm.exe /?

Loaded ntsdexts
extension DLL

Symbol search path is: C:\Program Files\Microsoft Visual Studio .NET
2003\SDK\v1

.1\symbols

Executable search path is:

ModLoad: 79000000 7900a000 RegAsm.exe

ModLoad: 77f50000 77ff7000 ntdll.dll

ModLoad: 79170000 79196000 C:\WINDOWS\system32\mscoree.dll

ModLoad: 77e60000 77f46000 C:\WINDOWS\system32\KERNEL32.dll

ModLoad: 77dd0000 77e5d000 C:\WINDOWS\system32\ADVAPI32.dll

ModLoad: 78000000 78086000 C:\WINDOWS\system32\RPCRT4.dll

Break instruction exception – code 80000003 (first chance)

eax=00241eb4 ebx=7ffdf000 ecx=00000000 edx=77f51301 esi=00241eb4 edi=00241f48

eip=77f75a58 esp=0012fb38 ebp=0012fc2c iopl=0 nv up ei pl nz na pe nc

cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

*** ERROR: Symbol file could not be found. Defaulted to export symbols for
ntdll.dll –

ntdll!DbgBreakPoint:

77f75a58 cc int 3

0:000> .logopen load.log

Opened log file ‘load.log’

0:000> sxe -c"kb 10;g" ld

0:000> g

ModLoad: 63180000
631e5000 C:\WINDOWS\system32\SHLWAPI.dll

ChildEBP RetAddr Args to Child

0012ec6c 77f5bb04 77f67948 000007d8 ffffffff *SharedUserSystemCall+0xc

WARNING: Stack unwind information not available. Following frames may be
wrong.

0012ed6c 77f6133d 00143aa8 0012edd0 0012f2e8 ntdll!NtMapViewOfSection+0xc

0012f014 77f570e0 00000000 00143aa8 0012f2e8
ntdll!LdrGetProcedureAddress+0x3d5

*** ERROR: Symbol file could not be found. Defaulted to export symbols for
C:\WINDOWS\system32\KERNEL32.dll –

0012f2a8 77e7d901 00143aa8 0012f2e8 0012f2e0 ntdll!LdrLoadDll+0x1c5

0012f310 77e7d95e 7ffdec00 00000000 00000000 KERNEL32!LoadLibraryExW+0xc8

0012f398 7917b591 00000000 79190020 00320000 KERNEL32!LoadLibraryExA+0x1d

0012f3c8 7917bddb 00143570 0012f3dc 0012f91c mscoree!_tailMerge_SHLWAPI+0xd

0012f3e0 7917be4c 00143570 0012f428 0012f424
mscoree!XMLGetVersionWithSupported+0x22

0012f434 7917a8e7 0012f8e0 001434c0 00000000
mscoree!RuntimeRequest__GetRuntimeVersion+0x55

0012f4c4 79179078 00000001 00080000 00000000
mscoree!RuntimeRequest__ComputeVersionString+0x2ee

0012f8e4 7917c135 00000001 0012f8fc 00080000
mscoree!RuntimeRequest__FindVersionedRuntime+0xee

0012f900 79179386 00000001 00000000 00080000
mscoree!RuntimeRequest__RequestRuntimeDll+0x23

0012ffa4 7917d09e 00000001 0012ffbc 00000000 mscoree!GetInstallation+0x66

0012ffc0 77e814c7 00080000 0006f38c 7ffdf000 mscoree!_CorExeMain+0x12

0012fff0 00000000 7917d08c 00000000 78746341
KERNEL32!GetCurrentDirectoryW+0x44

 

<cut out full call stacks for some
brevity, here are the rest of the load strings>

ModLoad: 63180000
631e5000 C:\WINDOWS\system32\SHLWAPI.dll

ModLoad: 77c10000 77c63000 C:\WINDOWS\system32\msvcrt.dll

ModLoad: 77c70000 77cb0000 C:\WINDOWS\system32\GDI32.dll

ModLoad: 77d40000 77dc6000 C:\WINDOWS\system32\USER32.dll

ModLoad: 791b0000 79412000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll

ModLoad: 7c340000 7c396000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\MSVCR71.dll

ModLoad: 79040000 79085000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\fusion.dll

ModLoad: 773d0000 77bc2000 C:\WINDOWS\system32\SHELL32.dll

ModLoad: 71950000 71a34000 C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_6.0.10.0_x-ww_f7fb5805\comctl32.dll

ModLoad: 77340000 773cb000 C:\WINDOWS\system32\comctl32.dll

ModLoad: 79780000 79980000 c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll

ModLoad: 79980000 79ca6000 c:\windows\assembly\nativeimages1_v1.1.4322\mscorlib\1.0.5000.0__b77a5c561934e089_ae53b8c5\mscorlib.dll

ModLoad: 79510000 79523000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorsn.dll

ModLoad: 0ffd0000 0fff3000 C:\WINDOWS\System32\rsaenh.dll

ModLoad: 79430000 7947c000
C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\MSCORJIT.DLL



<and finally,
the last piece of user code>


ModLoad: 60050000 6005c000
c:\windows\assembly\gac\regcode\1.0.5000.0__b03f5f7f11d50a3a\regcode.dll

ChildEBP RetAddr Args to Child

0012c9a0 77f5bb04 77f67948 00000720 ffffffff *SharedUserSystemCall+0xc

WARNING: Stack unwind information not available. Following frames may be
wrong.

0012caa0 77f6133d 0017b008 0012cb04 0012d01c ntdll!NtMapViewOfSection+0xc

0012cd48 77f570e0 00000000 0017b008 0012d01c
ntdll!LdrGetProcedureAddress+0x3d5

0012cfdc 77e7d901 0017b008 0012d01c 0012d014 ntdll!LdrLoadDll+0x1c5

0012d044 791b7b63 00152d30 00000000 00000009 KERNEL32!LoadLibraryExW+0xc8

0012d060 791c19d0 00152d30 00000000 00000009 mscorwks!WszLoadLibraryEx+0x5f

0012d088 791c17e9 00152ddc 00154250 00008000
mscorwks!CorMap__BaseAddress+0x8b

0012d0a4 791c1855 00154040 00152ddc 00000000 mscorwks!PEFile__Setup+0x45

0012d0c0 791c564f 00152ddc 0012d1d4 00000000 mscorwks!PEFile__Create+0x38

0012d0d8 791ece29 00152ddc 00152ddc 0012d1d4
mscorwks!PEFile__CreateImageFile+0x3a

0012d14c 791eec92 00000001 00000000 00c09ff0
mscorwks!PEFile__VerifyModule+0x76

0012d190 791c4efe 0012d298 00000000 26000000 mscorwks!PEFile__Create+0x11a

0012d1d8 791eefa7 0012d298 00000000 26000000
mscorwks!SystemDomain__LoadFile+0x184

0012d4bc 791ed2f4 00156018 00859ab0 0012dcbc
mscorwks!AssemblySpec__GetAssemblyFromFusion+0x5d6

0012d73c 791ed195 0012d7e4 0012d7e8 0012d7ec
mscorwks!AssemblySpec__LowLevelLoadManifestFile+0x159

0012d75c 791ecfc7 0012dc8c 0012d7e4 0012d7e8
mscorwks!AppDomain__BindAssemblySpec+0x50

eax=0014db68 ebx=00000000 ecx=77e760de edx=00000000 esi=77f5c448 edi=00000000

eip=7ffe0304 esp=0012fe88 ebp=0012ff80 iopl=0 nv up ei pl nz na pe nc

cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202

*SharedUserSystemCall+c:

7ffe0304 c3 ret

Let’s take a look at the commands:

0:000>
.logopen load.log

This command will start logging all of the output from the
debugger to a file called "load.log".  This is handy for going back to
grep against or just review history when it scrolls off the screen.

0:000>
sxe -c"kb 10;g" ld
‘sxe ld’ tells the debugger that whenever a
dll is loaded, give an exception.  The -c is the command to run when
the event fires.  In this case I’m running two:  ‘kb 10’ gives us
the 10 top frames of the callstack where the load occurred, and ‘g’ says
restart execution.  So this key command sequence says "give me 10
frames whenever you load a dll and then keep going".
0:000>
g
Starts the program.

I’ve attached a couple of full call stack examples and the
abbreviated list of all dll’s loaded for reference ("findstr
ModLoad load.log"
)  The ModLoad tags in the output show you the name of
the dll, followed by the call stack which forced it to get loaded.  One
thing you’ll notice is a lot of extra engine goo in there.  You may need to
play a bit more with the stack depth to get down into JIT’d code frames. 

The regasm utility requires its own managed dll, regcode.dll,
to run.  Let’s take a closer look at that loaded call stack (with more
frames than just 10 this time):

  ChildEBP RetAddr Args to
Child

0012c9a0 77f5bb04 77f67948 00000718 ffffffff *SharedUserSystemCall+0xc

WARNING: Stack unwind information not available. Following frames may be
wrong.

0012caa0 77f6133d 00168ff0 0012cb04 0012d01c ntdll!NtMapViewOfSection+0xc

0012cd48 77f570e0 00000000 00168ff0 0012d01c
ntdll!LdrGetProcedureAddress+0x3d5

0012cfdc 77e7d901 00168ff0 0012d01c 0012d014 ntdll!LdrLoadDll+0x1c5

0012d044 791b7b63 00152da0 00000000 00000009 KERNEL32!LoadLibraryExW+0xc8

0012d060 791c19d0 00152da0 00000000 00000009 mscorwks!WszLoadLibraryEx+0x5f

0012d088 791c17e9 00152e4c 00154290 00008000
mscorwks!CorMap__BaseAddress+0x8b

0012d0a4 791c1855 00154080 00152e4c 00000000 mscorwks!PEFile__Setup+0x45

0012d0c0 791c564f 00152e4c 0012d1d4 00000000 mscorwks!PEFile__Create+0x38

0012d0d8 791ece29 00152e4c 00152e4c 0012d1d4
mscorwks!PEFile__CreateImageFile+0x3a

0012d14c 791eec92 00000001 00000000 00c09ff0
mscorwks!PEFile__VerifyModule+0x76

0012d190 791c4efe 0012d298 00000000 26000000 mscorwks!PEFile__Create+0x11a

0012d1d8 791eefa7 0012d298 00000000 26000000
mscorwks!SystemDomain__LoadFile+0x184

0012d4bc 791ed2f4 00156018 00859ab0 0012dcbc
mscorwks!AssemblySpec__GetAssemblyFromFusion+0x5d6

0012d73c 791ed195 0012d7e4 0012d7e8 0012d7ec
mscorwks!AssemblySpec__LowLevelLoadManifestFile+0x159

0012d75c 791ecfc7 0012dc8c 0012d7e4 0012d7e8
mscorwks!AppDomain__BindAssemblySpec+0x50

0012dc68 791eca5e 0012df24 0012e038 00000000
mscorwks!AssemblySpec__LoadAssembly+0x9c

0012def4 791ecaae 23000002 0014b980 0014d080
mscorwks!Assembly__LoadExternalAssembly+0x17e

0012df1c 791e54b5 00000000 23000002 0014b980
mscorwks!Assembly__FindExternalAssembly+0x3c

0012df50 791e551d 0012dfa8 0012df90 0012e038
mscorwks!Assembly__FindAssemblyByTypeRef+0xbe

0012df84 791b5404 0012e024 00000000 0012e038
mscorwks!ClassLoader__LoadTypeHandle+0x8c

0012dfcc 791df1a1 0012e024 00152080 0012e038
mscorwks!SigPointer__GetTypeHandle+0xd5

0012e02c 7944df48 0012e830 00000000 0012e488
mscorwks!CEEInfo__getFieldType+0x93

0012e548 79438daa 00bf1d60 00bf1e0c 00bf0010
MSCORJIT!Compiler__impImportBlockCode+0x2910

0012e5f0 7943b423 00bf1d60 0012e858 00bf0010
MSCORJIT!Compiler__impImportBlock+0x24a

0012e608 7943b35a 00bf1d60 00bf0010 794310c6
MSCORJIT!Compiler__impImport+0xe8

0012e614 794310c6 00bf0010 0012e66c 7943143f
MSCORJIT!Compiler__fgImport+0x41

0012e620 7943143f 0012e734 0012e8b0 0012e728
MSCORJIT!Compiler__compCompile+0xb

0012e66c 794314e2 00152080 0012e80c 0012e858
MSCORJIT!Compiler__compCompile+0x1e8

0012e6f4 7943156c 0012e80c 0012e858 0012e734 MSCORJIT!jitNativeCode+0x95

0012e738 791dd630 79478e1c 0012e80c 0012e858
MSCORJIT!CILJit__compileMethod+0xa2

0012e780 791de06e 001526d8 0012e80c 0012e858
mscorwks!CallCompileMethodWithSEHWrapper+0x52

0012e8e8 791de4ec 008f5110 0012eab4 0012e9f0 mscorwks!JITFunction+0x2c7

0012ea20 791de6cf 008f5110 0012eab4 00000000 mscorwks!MakeJitWorker+0x2bb

0012eb0c 791d6a7e 00000000 0012f5f4 0012eb48
mscorwks!MethodDesc__DoPrestub+0x498

0012eb24 0014832e 0012eb48 00000000 00000000 mscorwks!PreStubWorker+0x42

0012eb58 791dc106 0012eba4 00000000 0012eb7c 0x14832e

0012ebdc 791dc1b9 008f510b 00152080 009125ac
mscorwks!MethodDesc__CallDescr+0x155

0012ec04 791dc26b 00000000 009125ac 00152628 mscorwks!MethodDesc__Call+0x8e

0012ec80 791ba529 00152628 0012f654 00000000
mscorwks!EEClass__RunClassInit+0x100

0012f578 791b6a2c 0012f654 00000000 00000000
mscorwks!EEClass__DoRunClassInit+0x2a2

0012f58c 791d68e4 0012f654 0012f6a0 008f5080
mscorwks!MethodTable__CheckRunClassInit+0x1d

0012f664 791d6a7e 00000000 0012f8b0 0012f6a0
mscorwks!MethodDesc__DoPrestub+0x253

0012f67c 0014832e 0012f6a0 00000004 05051b7c mscorwks!PreStubWorker+0x42

0012f6b0 791da434 0012f6fc 00000000 0012f6d4 0x14832e

0012f7c0 791da58a 008f507b 00152080 0012f7e0
mscorwks!MethodDesc__CallDescr+0x1b6

0012f870 791da5f6 008f507b 00152080 790059a8
mscorwks!MethodDesc__CallDescr+0x43

0012f898 7923d587 0012f928 00000000 00147c80 mscorwks!MethodDesc__Call+0x97

0012f950 7923d342 008f5080 00000001 00000000
mscorwks!ClassLoader__CanAccess+0x1d6

0012fa64 7923d441 00152080 00000000 79041374
mscorwks!ClassLoader__ExecuteMainMethod+0x49d

0012fa7c 7923d92f 00000000 0012fd70 00000000
mscorwks!Assembly__ExecuteMainMethod+0x21

0012fd60 791c6e73 00000001 0012fd90 0012ffe0
mscorwks!SystemDomain__ExecuteMainMethod+0x416

0012ffa0 791c6ef3 79000000 00080000 00000000 mscorwks!ExecuteEXE+0x1c0

0012ffb0 7917d0b8 0006f38c 791b0000 0012fff0 mscorwks!_CorExeMain+0x59

0012ffc0 77e814c7 00080000 0006f38c 7ffdf000 mscoree!_CorExeMain+0x30

0012fff0 00000000 7917d08c 00000000 78746341
KERNEL32!GetCurrentDirectoryW+0x44

Notice the blue lines in the call stack which have no symbol
names.  These are managed frames.  mscorwks!PreStubWorker is the
routine in the CLR which owns invoking the JIT compiler and back patching call
sites.  You can use the !ClrStack command to get the managed call stack. 
I’m going to drill harder on this in Part 2.  The other pointers above have
a good tutorial so hopefully this is pretty obvious at this point.

Example 2:  What Processes Did My App Run?

The CLR Performance team did an analysis on an internal
Microsoft business application,
HeadTrax,
which is worth reading up on.  This application was produced by our
internal IT group using VB.Net, Web Services, SQL Server, etc.  It is the
typical kind of multi-tier application you might be writing.  One interesting
thing that came out of that analysis was the very interesting way in which the
XML Serializer class launches the C# compiler every time it starts up (16 times
in our internal application — ouch!). 

I’m going to cover a couple of ways that you can debug your
application for this kind of behavior.  First let’s use the Windows debuggers to watch every CreateProcess call:

  c:\> ntsd
headtrax.exe

0:000>
sxe -c ".loadby sos mscorwks;g" ld mscorwks.dll

0:000> bp kernel32!CreateProcessW "!ClrStack;du
poi(esp+8);g"

0:000> g

 

Loaded Son of Strike data table version
5 from "C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll"

Thread 7

ESP EIP

08fbf804 77e61b8e [FRAME: NDirectMethodFrameGeneric] [DEFAULT] Boolean
Microsoft.Win32.UnsafeNativeMethods.CreateProcess(String,Class
System.Text.StringBuilder,Class SECURITY_ATTRIBUTES,Class
SECURITY_ATTRIBUTES,Boolean,I4,ValueClass
System.Runtime.InteropServices.HandleRef,String,Class STARTUPINFO,Class
PROCESS_INFORMATION)

08fbf830 7b20b664 [DEFAULT] I4
System.CodeDom.Compiler.Executor.ExecWaitWithCaptureUnimpersonated(I,String,String,Class
System.CodeDom.Compiler.TempFileCollection,ByRef String,ByRef String,String)

08fbf95c 7b20b0a9 [DEFAULT] I4
System.CodeDom.Compiler.Executor.ExecWaitWithCapture(I,String,String,Class
System.CodeDom.Compiler.TempFileCollection,ByRef String,ByRef String,String)

08fbfa04 7b20af2d [DEFAULT] [hasThis] Void
System.CodeDom.Compiler.CodeCompiler.Compile(Class
System.CodeDom.Compiler.CompilerParameters,String,String,String,ByRef
String,ByRef I4,String)

08fbfa38 7b20a2c2 [DEFAULT] [hasThis] Class
System.CodeDom.Compiler.CompilerResults
System.CodeDom.Compiler.CodeCompiler.FromFileBatch(Class
System.CodeDom.Compiler.CompilerParameters,SZArray String)

08fbfa94 7b232e75 [DEFAULT] [hasThis] Class
System.CodeDom.Compiler.CompilerResults
System.CodeDom.Compiler.CodeCompiler.FromSourceBatch(Class
System.CodeDom.Compiler.CompilerParameters,SZArray String)

08fbfaec 7b2328ce [DEFAULT] [hasThis] Class
System.CodeDom.Compiler.CompilerResults
System.CodeDom.Compiler.CodeCompiler.FromSource(Class
System.CodeDom.Compiler.CompilerParameters,String)

08fbfb08 7b232596 [DEFAULT] [hasThis] Class
System.CodeDom.Compiler.CompilerResults
System.CodeDom.Compiler.CodeCompiler.System.CodeDom.Compiler.ICodeCompiler.CompileAssemblyFromSource(Class
System.CodeDom.Compiler.CompilerParameters,String)

08fbfb38 7be28b7d [DEFAULT] [hasThis] Class System.Reflection.Assembly
System.Xml.Serialization.Compiler.Compile()

08fbfb88 7be27c3a [DEFAULT] [hasThis] Void
System.Xml.Serialization.TempAssembly..ctor(SZArray Class
System.Xml.Serialization.XmlMapping)

08fbfbd4 7be725ee [DEFAULT] SZArray Class
System.Xml.Serialization.XmlSerializer
System.Xml.Serialization.XmlSerializer.FromMappings(SZArray Class
System.Xml.Serialization.XmlMapping)

08fbfbe8 083945c7 [DEFAULT] [hasThis] Void
System.Web.Services.Protocols.SoapClientType..ctor(Class System.Type)

08fbfc30 08393e68 [DEFAULT] [hasThis] Void
System.Web.Services.Protocols.SoapHttpClientProtocol..ctor()

08fbfc58 08393d7b [DEFAULT] [hasThis] Void
Microsoft.HeadTrax.HTXCommon.HTXFrameworkWS.HTXFramework..ctor()

08fbfc60 08393ca4 [DEFAULT] [hasThis] Class
Microsoft.HeadTrax.HTXCommon.HTXFrameworkWS.HTXFramework
Microsoft.HeadTrax.HTXCommon.HTXDataLayer.get_wsHTXFramework()

08fbfc74 08393b54 [DEFAULT] [hasThis] Class System.Data.DataSet
Microsoft.HeadTrax.HTXCommon.HTXDataLayer.AppLoadDataGet()

08fbfc8c 0839393b [DEFAULT] [hasThis] Void
Microsoft.HeadTrax.HTXCommon.HTXDataLayer.ThreadStage4()

08fbfed4 791da717 [FRAME: GCFrame]

084a8d98 ""c:\windows\microsoft.net\framew"

084a8dd8 "ork\v1.1.4322\csc.exe" /noconfig"

084a8e18 " @"C:\DOCUME~1\jasonz\LOCALS~1\T"

084a8e58 "emp\i50mbv3n.cmdline""

 

Let’s take a look at the commands:

sxe -c
".loadby sos mscorwks;g" ld mscorwks.dll

Here again we are firing a command we want to run when a
dll get’s loaded.  In this case, the command is fired when mscorwks.dll
is loaded.  The command, ".loadby sos mscorwks;g" executes a load of
the sos.dll from the same directory where mscorwks.dll lives (using the full
path of the already loaded dll).  It then continues execution. 
Many thanks to Drew Bliss of the Windows debugger team for adding the loadby
command; it has saved us a lot of typing since we have a unique version of
sos for each unique version of mscorwks in every side by side installation
of the CLR.

bp
kernel32!CreateProcessW "!ClrStack;du poi(esp+8);g"
This command sets a break point on kernel32!CreateProcessW.  I’m
running this on Windows XP, and so any create process call will go through
the Unicode (W) version of the API.  Whenever this bp fires, I want to
execute the !ClrStack command from sos, and then display the command line
argument passed to Create Process.  See the detailed explanation below. 
After displaying all this information, a "g" command is executed so the
application keeps running.

Scanning through the output from the debugger, you can see
when the first XML serializer is created using System.CodeDOM.Compiler. 
Once the source code for the serializer instance is cooked up, the C# compiler
is invoked to do the compilation so the code can be loaded and executed. 
In this case you can see that a config file is passed to the csc.exe compiler. 
An example of such a file in this application is:

 

/t:library /utf8output /R:"c:\windows\assembly\gac\system.xml\1.0.5000.0__b77a5c561934e089\system.xml.dll"
/R:"c:\windows\assembly\gac\system.data\1.0.5000.0__b77a5c561934e089\system.data.dll"
/R:"c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll" /R:"c:\windows\assembly\gac\system\1.0.5000.0__b77a5c561934e089\system.dll"
/out:"C:\DOCUME~1\jasonz\LOCALS~1\Temp\a2k-bbed.dll" /debug- /optimize+ /w:1
"C:\DOCUME~1\jasonz\LOCALS~1\Temp\a2k-bbed.0.cs"

No surprises here.  The generated code will utilize the
XML parser and various system classes to do your serialization work for your
unique type.  The first few lines of the generated application can then be
found by examining the temp file (hint: you’ll need to do this before the file
gets cleaned up, iow while the bp is waiting for you to continue):

 

[assembly:System.Security.AllowPartiallyTrustedCallers()]

    namespace Microsoft.Xml.Serialization.GeneratedAssembly {



    public class XmlSerializationWriter1 :
System.Xml.Serialization.XmlSerializationWriter {



        void Write1_Object(string n, string
ns, System.Object o, bool isNullable, bool needType) {

            if ((object)o
== null) {

               
if (isNullable) WriteNullTagLiteral(n, ns);

               
return;

            }

            if (!needType)
{

               
System.Type t = o.GetType();

               
if (t == typeof(System.Object))

                   
;

               
else if (t == typeof(System.Data.DataSet[])) {

                   
Writer.WriteStartElement(n, ns);

                   
WriteXsiType(@"ArrayOfDataSet", @"http://IChangedThis/UrlOnPurpose/");

<etc…>

One last thing to puzzle through.  What’s up with this
command again?

   
bp kernel32!CreateProcessW "!ClrStack;du poi(esp+8);g"

As I mentioned above, on an NT based Windows system all calls
ultimately go through the Unicode version of the system API’s.  In this
case I want to look for all calls which

create a process
.  When this breakpoint fires, the instruction pointer
(eip) will be pointing at the first byte of the method.  You can view the
code for this method using the unassemble ("u") command.

 

0:000> u
kernel32!CreateProcessW

KERNEL32!CreateProcessW:

77e61b8e 55 push ebp

77e61b8f 8bec mov ebp,esp

77e61b91 6a00 push 0x0

77e61b93 ff752c push dword ptr [ebp+0x2c]

77e61b96 ff7528 push dword ptr [ebp+0x28]

0:007> dd esp

08f1f090 791befaf
00000000
00167960 00000000

08f1f0a0 00000000 00000001 00000000 053d8028

08f1f0b0 053cae10 053ce08c 053cb2d8 791fbbd0

08f1f0c0 00000404 08f1f7e8 00150000 00000000



 

In this case all of the Windows win32 api’s use the stdcall
calling convention (ebp chaining).  The general layout of such a stack
frame is as follows:

esp ————–> | <current location 
|

                   
| <local #n>         |

                   
| <local #2>         |

                   
| <local #1>         |

ebp ————–> | <previous ebp>    
|

                   
| <return address>   | ebp + 4

                   
| param 1           
| ebp + 8

                   
| param 2           
| ebp + c

                   
| param 3           
| ebp + 0x10

At the point where the bp fires however, we have not yet set
up ebp (the prolog has not run).  That means that esp is pointing at the
return address that was just pushed before the call.  Here’s our picture
(before execution of the prolog):

                   
|                   
|

esp ————–> |
<return address>  
| esp + 0

                   
| param 1           
| esp + 4    ;
lpApplicationName

                   
| param 2           
| esp + 8    ;
lpCommandLine

                   
| param 3           
| esp + c    ; lpProcessAttributes, etc…

As you can see, you can now encode the parameters for the
method call relative to the current value of esp.  Now combine that with
the poi() operator, which when dumping memory means "take what this expressions
resolves to as a pointer to data I want to dump".  So one more time:

   
bp kernel32!CreateProcessW "!ClrStack;du poi(esp+8);g"

The command sets a breakpoint on CreateProcessW, executes a !ClrStack
command so you can see what caused it, and then finally dumps the unicode string
that esp+8 points to (lpCommandLine).  Pretty easy huh?  You can
experiment a bit with this technique and do more commands, such as adding a ‘kb
10’ to get the unmanaged frames as well as the managed ones.  This is a
pretty powerful debugging technique, and very handy for doing lot’s of automated
runs.

Example 3:  What Processes Did My App Run, Part 2

I mentioned above that there are many ways to solve a problem
like this.  In this case we know that the C# compiler, csc.exe, is going to
get invoked.  Let’s say you wanted to simply log every single invocation to
the C# compiler on your machine, no matter which process did it.  The best
way to do this is using the Image File Execution Options registry key.  The
easiest way to get access to this is running the gflags.exe program that came
with the Windows Debugger.  This is pretty easy:

  1. Start the application and set the Image File Name to
    "csc.exe"

  2. Click the radio button (*) Image File Options

  3. Give the following for the Debugger edit box (lower
    right): c:\debuggers\ntsd.exe

  4. Check [x] the Debugger check box

  5. Click Apply, then OK

This will create a key entry under "HKEY_LOCAL_MACHINE\SOFTWARE\Windows
NT\CurrentVersion\Image File Execution Options" for csc.exe.  A Debbuger
key value is created with your command line.  I find it quicker and easier
to simply use regedit to update this value at this point.  If you want to
skip gflags, run the following .reg file contents through regedit:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image
File Execution Options\csc.exe]

"Debugger"="c:\\debuggers\\ntsd.exe"

Using regedit (which doesn’t have a limit for size of text you
can enter), make the Debugger string the following:

c:\debuggers\ntsd.exe -c ".logappend
c:\csc.log;!peb;g" -G

Whenever csc.exe is executed, the ntsd.exe debugger will run. 
The command will open c:\csc.log in append mode
so that any commands you run under the debugger get recorded.  Next the
program execution block is dumped using !peb.  This will record the command
line and process launch information.  Then execution is continued using the
‘g’ command.  The final ‘-G’ tells ntsd that when the program terminates,
don’t bother to give me a break point.  Now launch your application, or
simply run csc.exe from the command line and examine the contents of
c:\csc.log.  Here’s mine:

 

Opened log
file ‘c:\csc.log’

PEB at 7ffdf000

InheritedAddressSpace: No

ReadImageFileExecOptions: No

BeingDebugged: Yes

ImageBaseAddress: 00400000

Ldr 00241e90

Ldr.Initialized: Yes

Ldr.InInitializationOrderModuleList: 00241f28 . 002427d0

Ldr.InLoadOrderModuleList: 00241ec0 . 002427c0

Ldr.InMemoryOrderModuleList: 00241ec8 . 002427c8

Base TimeStamp Module

400000 3e566dfb Feb 21 10:20:43 2003 c:\windows\microsoft.net\framework\v1.1.4322\csc.exe

77f50000 3eb1b41a May 01 16:56:10 2003 C:\WINDOWS\System32\ntdll.dll

77e60000 3d6dfa28 Aug 29 03:40:40 2002 C:\WINDOWS\system32\kernel32.dll

535b0000 3e566e08 Feb 21 10:20:56 2003 c:\windows\microsoft.net\framework\v1.1.4322\cscomp.dll

771b0000 3f4a5b36 Aug 25 11:53:42 2003 C:\WINDOWS\system32\ole32.dll

77c70000 3d6dfa27 Aug 29 03:40:39 2002 C:\WINDOWS\system32\GDI32.dll

77d40000 3f731c7d Sep 25 09:49:01 2003 C:\WINDOWS\system32\USER32.dll

77dd0000 3d6dfa28 Aug 29 03:40:40 2002 C:\WINDOWS\system32\ADVAPI32.dll

78000000 3f4a5b39 Aug 25 11:53:45 2003 C:\WINDOWS\system32\RPCRT4.dll

77120000 3d6dfa22 Aug 29 03:40:34 2002 C:\WINDOWS\system32\OLEAUT32.dll

77c10000 3d6dfa27 Aug 29 03:40:39 2002 C:\WINDOWS\system32\MSVCRT.DLL

77c00000 3b7dfe0f Aug 17 22:33:03 2001 C:\WINDOWS\system32\VERSION.dll

63180000 3f11dc57 Jul 13 15:25:27 2003 C:\WINDOWS\system32\SHLWAPI.dll

7c340000 3e561eac Feb 21 04:42:20 2003 c:\windows\microsoft.net\framework\v1.1.4322\MSVCR71.dll

79170000 3e5597ae Feb 20 19:06:22 2003 C:\WINDOWS\System32\mscoree.dll

SubSystemData: 00000000

ProcessHeap: 00140000

ProcessParameters: 00020000

WindowTitle: ‘c:\windows\microsoft.net\framework\v1.1.4322\csc.exe’

ImageFile: ‘c:\windows\microsoft.net\framework\v1.1.4322\csc.exe’

CommandLine: ‘"c:\windows\microsoft.net\framework\v1.1.4322\csc.exe" /noconfig
@"C:\DOCUME~1\jasonz\LOCALS~1\Temp\9fyenvyy.cmdline"’

DllPath: ‘c:\windows\microsoft.net\framework\v1.1.4322;C:\WINDOWS\System32;C:\WINDOWS\system;C:\WINDOWS;.;C:\Program
Files\Microsoft Visual Studio .NET 2003\Common7\IDE;C:\Program
Files\Microsoft Visual Studio .NET 2003\VC7\BIN;C:\Program Files\Microsoft
Visual Studio .NET 2003\Common7\Tools;C:\Program Files\Microsoft Visual
Studio .NET 2003\Common7\Tools\bin\prerelease;C:\Program Files\Microsoft
Visual Studio .NET 2003\Common7\Tools\bin;C:\Program Files\Microsoft Visual
Studio .NET 2003\SDK\v1.1\bin;C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322;c:\debuggers;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\PROGRA~1\CA\Common\SCANEN~1;C:\PROGRA~1\CA\eTrust\ANTIVI~1;’

Environment: 00010000

_ClrRestrictSecAttributes=1

_NT_SYMBOL_PATH=SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;

ALLUSERSPROFILE=C:\Documents and Settings\All Users

APPDATA=C:\Documents and Settings\jasonz\Application Data

 

<… rest of
env vars …>

ModLoad:
791b0000 79412000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorwks.dll

ModLoad: 79040000 79085000 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\fusion.dll

ModLoad: 773d0000 77bc2000 C:\WINDOWS\system32\SHELL32.dll

ModLoad: 71950000 71a34000 C:\WINDOWS\WinSxS\x86_Microsoft.Windows.Common-Controls_6595b64144ccf1df_6.0.10.0_x-ww_f7fb5805\comctl32.dll

ModLoad: 77340000 773cb000 C:\WINDOWS\system32\comctl32.dll

ModLoad: 56aa0000 56abb000 c:\windows\microsoft.net\framework\v1.1.4322\alink.dll

ModLoad: 76fd0000 77048000 C:\WINDOWS\System32\CLBCATQ.DLL

ModLoad: 77050000 77115000 C:\WINDOWS\System32\COMRes.dll

ModLoad: 794a0000 794b0000

C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\mscorpe.dll

In addition to this kind of automated logging, I’ve also used
this technique to debug complicated programs that launch many applications (like
a build process).  One warning:  make sure to delete the registry key
once you are done, otherwise it’ll kick in even when you are building from
VS.Net.

Wrap Up and Future Directions

In the future I expect to get
much tighter integration with the Windows Debugger Stack.  This is really
important now that managed code is such a core part of so many products (you are
coming to the PDC right?).  How much better would your life be if you could
just type ‘k’ and have a real call stack without having to do a hand merge using
sos.dll?  I’m definitely looking forward to that world, but we aren’t there
yet.

The XML Serializer behavior as
described in the Headtrax overview demonstrates how easy it is to write a simple
and powerful application that can have some pretty interesting performance
issues.  In this case it really is the design of the XML Serializer itself. 
The Web Services team is aware of the down sides of this kind of design and
hopefully will do something to clear this up.  To be clear, it is a
perfectly functional and powerful programming model, it just doesn’t optimize
for the startup performance of your application.  And I’ll bet if you are
using this technology you probably didn’t actually know you were spawning copies
of C# just to do so.  I recommend reading
Rico’s blog
for more details on performance related recommendations. 
In particular, you should avoid touching these types until you need them, so you
can at least amortize the cost over the run of the application.

That’s it for this post. 
By now hopefully you picked up a couple of interesting tidbits (installing the
Windows debuggers, getting symbol server going, the new .loadby command, and
some scripting capabilities).  What I’m planning to do for Part 2 is
digging more into sos specifics and hopefully filling in some blanks on some of
those commands.  Please do send me any suggestions for that content or
questions you’d like to hear answered and I’ll see what I can do.

October 21, 2003


Comments (25)

  1. Rob Walker says:

    Wow – thanks for a very informative and comprehensive article. I learnt a number of new ‘tricks’ and it’s only just 9am …

  2. Andreas Häber says:

    Yeah, I totally agree – many useful tricks to do when debugging. Think I’m gonna start using the console debugger a lot more then I’ve done before. One thing with the VS.NET debugger IMHO is that it isn’t good to find problems with multi-threaded processes (and especially when dealing with multi-processes with multi-threads :)).

    So, I’m looking very forward to part 2 (and also 3,4,5 if they’ll show up!) of this article 🙂

  3. Keith Hill says:

    I have used WinDbg w/SOS a good bit to debug managed minidumps of field failures. This was very helpful information. I look forward to part 2. BTW, perhaps you could answer how to get a call to MiniDumpWriteDump to work in managed code. I have tried but my stacks come out as garbage (even using Marshal.GetExceptionPointers). One other thing, I kept running into failures loading the SOS extension in WinDbg (something about a resource module not found) until I added this env var: _NT_DEBUGGER_EXTENSION_PATH=C:WINDOWSMicrosoft.NETFrameworkv1.1.4322. Perhaps this could be documented better or be more obvious.

  4. Stale DNA says:

    Probably beaten to death elsewhere but I always seem to end up having to search for these when I need…

  5. Stale DNA says:

    Probably beaten to death elsewhere but I always seem to end up having to search for these when I need…

  6. Stale DNA says:

    Probably beaten to death elsewhere but I always seem to end up having to search for these when I need…

  7. SoS is the WinDbg extension for analysing Managed Applications

    Keith

    Brown’s Bugslayer

    SOS It’s…

  8. this is related article

  9. .NET announced… PDC 2000

    A co-worker of mine attended PDC 2000 and was blown away with what Microsoft…

  10. Probably beaten to death elsewhere but I always seem to end up having to search for these when I need

  11. B# .NET Blog says:

    For demonstration purposes, I’ve been creating some little functions to print the address of a variable…

  12. tanliyoung says:

    用WinDbg探索CLR世界 [3] 跟踪方法的 JIT 过程

  13. roy ashbrook says:

    I&#39;m done with most of these, but not totally. Tons of great information! Here are some more great

  14. Que triste ser&#225; el d&#237;a en que yo sea abuelo, y cuando mis nietos me pidan que les cuente mis

  15. &#160; I just got this new book on advanced debugging in Windows by Mario Hewardt and Daniel Pravat.&#160;

  16. &#160; I just got this new book on advanced debugging in Windows by Mario Hewardt and Daniel Pravat.&#160;