Getting started with windbg – part I

Okay, I’ve previously written some random posts about how to set up windbg and how to troubleshoot OutOfMemoryExceptions. I thought I’d take a few steps back and review some of the basics in order to help you get started on using this fantastic tool.

Basic Configuration

  1. Copy sos.dll from the framework directory to the folder where you installed windbg. Make sure you copy it from the same Framework version as the one you wish to investigate. If you’ll be working with both 1.1 and 2.0 you can rename the SOS.dlls to SOS11.dll and SOS20.dll or put them in separate folders.
  2. Create a folder where you want to cache all the symbol files. For example: “C:\Symbols”.
  3. Start windbg and open the dialogue to configure the symbol path by clicking File -> Symbol File Path.
  4. Enter the path, as well as the address from which you’ll want to download missing symbols using the following syntax:
    srv*[cache path]*[symbols path]
    I’d recommend the following path:

You should now be all set, and ready to open a saved dump or attach to a process.
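The srv* syntax from step 4 can be assembled mechanically. The sketch below is only an illustration: `build_symbol_path` is a hypothetical helper name, and the server URL is an assumption (Microsoft's public symbol server), not necessarily the path this post originally recommended.

```python
def build_symbol_path(cache_dir: str, symbol_server: str) -> str:
    """Return a symbol path in the srv*[cache path]*[symbols path] form."""
    if "*" in cache_dir or "*" in symbol_server:
        # '*' separates the parts, so it cannot appear inside either one.
        raise ValueError("'*' is the separator and cannot appear in either part")
    return f"srv*{cache_dir}*{symbol_server}"


# Assumption: Microsoft's public symbol server; substitute your own if needed.
ASSUMED_SERVER = "https://msdl.microsoft.com/download/symbols"

print(build_symbol_path(r"C:\Symbols", ASSUMED_SERVER))
```

The resulting string is what you would paste into the File -> Symbol File Path dialogue.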

Useful commands

I’ll be using a dump from an IIS6-server to demonstrate some useful commands.

The first thing you’ll want to do is load SOS. You do this with the .load command, whose syntax is simple: .load [extension filename]. So if you want to load SOS and haven’t renamed the .dll, you’d simply write:

.load sos

You’ll now have all the cool commands from the SOS extension at your disposal, as well as the default windbg ones. Commands from extensions are always preceded by a “!”, so to run the help command for SOS you’d write !help.


If you happen to have two extensions with an identically named command, you can always tell them apart by typing ![extension name].[command], for example !sos.help.


Okay, now that we know how to run the commands from the extension, try running !help. It should give you the following result.

0:000> !help
SOS is a debugger extension DLL designed to aid in the debugging of managed
programs. Functions are listed by category, then roughly in order of
importance. Shortcut names for popular functions are listed in parenthesis.
Type “!help <functionname>” for detailed info on that function.

Object Inspection                  Examining code and stacks
-----------------------------      -----------------------------
DumpObj (do)                       Threads
DumpArray (da)                     CLRStack
DumpStackObjects (dso)             IP2MD
DumpHeap                           U
DumpVC                             DumpStack
GCRoot                             EEStack
ObjSize                            GCInfo
FinalizeQueue                      EHInfo
PrintException (pe)                COMState
TraverseHeap                       BPMD

Examining CLR data structures      Diagnostic Utilities
-----------------------------      -----------------------------
DumpDomain                         VerifyHeap
EEHeap                             DumpLog
Name2EE                            FindAppDomain
SyncBlk                            SaveModule
DumpMT                             GCHandles
DumpClass                          GCHandleLeaks
DumpMD                             VMMap
Token2EE                           VMStat
EEVersion                          ProcInfo
DumpModule                         StopOnException (soe)
ThreadPool                         MinidumpMode
DumpMethodSig                      Other
DumpRuntimeTypes                   -----------------------------
DumpSig                            FAQ

For more documentation on a specific command, type !help [name of command]


The .time command is not an SOS command, which is evident from the fact that it doesn’t begin with a “!”. Running .time shows you the debug session time, as well as system uptime, process uptime and the amount of time spent in kernel and user mode.

0:000> .time
Debug session time: Tue Oct 23 08:38:35.000 2007 (GMT+1)
System Uptime: 4 days 17:48:01.906
Process Uptime: 0 days 0:24:37.000
  Kernel time: 0 days 0:04:23.000
  User time: 0 days 0:03:28.000

As you can see, the system has been up for over 4 days. The process has been running for about 24½ minutes and has accumulated roughly 8 minutes of CPU time (kernel plus user). This gives us an average CPU usage for the process of around 32%.
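The arithmetic behind that average can be reproduced with a few lines of script. This is a rough sketch, assuming the .time output has been pasted into a string; `parse_duration` and `average_cpu_percent` are hypothetical helper names, not part of windbg.

```python
import re
from datetime import timedelta

# Matches durations as printed by .time, e.g. "0 days 0:24:37.000".
_DURATION = re.compile(r"(\d+) days (\d+):(\d+):(\d+)(?:\.(\d+))?")


def parse_duration(text: str) -> timedelta:
    match = _DURATION.search(text)
    if match is None:
        raise ValueError(f"no duration found in {text!r}")
    days, hours, minutes, seconds = (int(g) for g in match.groups()[:4])
    # The fractional part is padded or truncated to milliseconds.
    millis = int((match.group(5) or "0").ljust(3, "0")[:3])
    return timedelta(days=days, hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def average_cpu_percent(time_output: str) -> float:
    """(kernel time + user time) / process uptime, as a percentage."""
    wanted = ("Process Uptime", "Kernel time", "User time")
    fields = {}
    for line in time_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in wanted:
            fields[key.strip()] = parse_duration(value)
    busy = fields["Kernel time"] + fields["User time"]
    return 100.0 * busy.total_seconds() / fields["Process Uptime"].total_seconds()


SAMPLE_TIME = """\
Debug session time: Tue Oct 23 08:38:35.000 2007 (GMT+1)
System Uptime: 4 days 17:48:01.906
Process Uptime: 0 days 0:24:37.000
  Kernel time: 0 days 0:04:23.000
  User time: 0 days 0:03:28.000
"""

print(f"{average_cpu_percent(SAMPLE_TIME):.1f}%")  # prints 31.9%
```

Keep in mind that this is an average over the whole process lifetime, and on a machine with several CPUs the figure is relative to a single CPU.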


We can then use the !threadpool command to find out exactly what the CPU usage was at the time the dump was taken. We’ll also get some useful information such as the number of work requests in the queue, completion port threads and timers.

0:000> !threadpool
CPU utilization 100%
Worker Thread: Total: 5 Running: 4 Idle: 1 MaxLimit: 200 MinLimit: 2
Work Request in Queue: 16
Unknown Function: 6a2d945d  Context: 023ede30
Unknown Function: 6a2d945d  Context: 023ee1e8
AsyncTimerCallbackCompletion TimerInfo@11b53760
Unknown Function: 6a2d945d  Context: 023ee3a8
Unknown Function: 6a2d945d  Context: 023e3040
Unknown Function: 6a2d945d  Context: 023ee178
Unknown Function: 6a2d945d  Context: 023edfb0
AsyncTimerCallbackCompletion TimerInfo@11b36428
AsyncTimerCallbackCompletion TimerInfo@11b53868
Unknown Function: 6a2d945d  Context: 023ee060
Unknown Function: 6a2d945d  Context: 023ee290
Unknown Function: 6a2d945d  Context: 023eded0
Unknown Function: 6a2d945d  Context: 023edd88
Unknown Function: 6a2d945d  Context: 023ede98
Unknown Function: 6a2d945d  Context: 023ee258
Unknown Function: 6a2d945d  Context: 023edfe8
Number of Timers: 9
Completion Port Thread:Total: 3 Free: 3 MaxFree: 4 CurrentLimit: 2 MaxLimit: 200 MinLimit: 2

So we can see that currently we’re using 100% of the CPU, which leads us to the next command.
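If you collect many dumps, it can help to pull the headline numbers out of the !threadpool output automatically. The following is a minimal sketch, assuming the output format shown above; `summarize_threadpool` is a hypothetical helper, and the regular expressions are only as reliable as that assumption.

```python
import re


def summarize_threadpool(output: str) -> dict:
    """Extract CPU utilization, queue length and worker counts from !threadpool text."""
    summary = {}
    match = re.search(r"CPU utilization\s+(\d+)%", output)
    if match:
        summary["cpu_percent"] = int(match.group(1))
    match = re.search(r"Work Request in Queue:\s*(\d+)", output)
    if match:
        summary["queued_requests"] = int(match.group(1))
    match = re.search(r"Worker Thread: Total: (\d+) Running: (\d+) Idle: (\d+)", output)
    if match:
        total, running, idle = (int(g) for g in match.groups())
        summary["workers"] = {"total": total, "running": running, "idle": idle}
    return summary


# Trimmed copy of the listing above.
SAMPLE_THREADPOOL = """\
CPU utilization 100%
Worker Thread: Total: 5 Running: 4 Idle: 1 MaxLimit: 200 MinLimit: 2
Work Request in Queue: 16
Number of Timers: 9
"""

print(summarize_threadpool(SAMPLE_THREADPOOL))
```

A high CPU figure combined with a growing queue, as in this dump, is the cue to look at per-thread CPU time next.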


!runaway is a nice command that lists all running threads and the CPU time each has consumed. It’s your best friend when troubleshooting a high-CPU hang.

0:000> !runaway
User Mode Time
  Thread       Time
  25:1a94      0 days 0:00:39.937
  16:1bc0      0 days 0:00:38.390
  50:1e8c      0 days 0:00:08.859
  52:1e40      0 days 0:00:08.687
  20:1c2c      0 days 0:00:08.234
  51:1340      0 days 0:00:08.171
  21:1bcc      0 days 0:00:06.953
  26:13ec      0 days 0:00:06.671
  44:131c      0 days 0:00:03.906
  22:d8c       0 days 0:00:03.375
  33:78c       0 days 0:00:02.656
  34:1a8c      0 days 0:00:00.906
  29:1f5c      0 days 0:00:00.828
   6:e28       0 days 0:00:00.625
   5:1c78      0 days 0:00:00.546
  23:14a4      0 days 0:00:00.484
   4:5ac       0 days 0:00:00.437
  45:5dc       0 days 0:00:00.421
   3:13b4      0 days 0:00:00.421
  47:19c8      0 days 0:00:00.375
  28:1b6c      0 days 0:00:00.250
  46:1dac      0 days 0:00:00.156
   7:1dd8      0 days 0:00:00.109
  48:cdc       0 days 0:00:00.093
  49:1eac      0 days 0:00:00.062
  15:1a64      0 days 0:00:00.062
   0:1804      0 days 0:00:00.046
  36:4a4       0 days 0:00:00.031
  11:1eb4      0 days 0:00:00.031
   1:10b4      0 days 0:00:00.031
  31:16ac      0 days 0:00:00.015
  14:4ac       0 days 0:00:00.015
   2:186c      0 days 0:00:00.015
  59:590       0 days 0:00:00.000
  58:294       0 days 0:00:00.000
  57:16d0      0 days 0:00:00.000
  56:1578      0 days 0:00:00.000
  55:1428      0 days 0:00:00.000
  54:16d8      0 days 0:00:00.000
  53:fd8       0 days 0:00:00.000
  43:1b8c      0 days 0:00:00.000
  42:1c24      0 days 0:00:00.000
  41:1e2c      0 days 0:00:00.000
  40:11b0      0 days 0:00:00.000
  39:edc       0 days 0:00:00.000
  38:1a08      0 days 0:00:00.000
  37:171c      0 days 0:00:00.000
  35:1254      0 days 0:00:00.000
  32:1f9c      0 days 0:00:00.000
  30:1ae8      0 days 0:00:00.000
  27:190c      0 days 0:00:00.000
  24:1d2c      0 days 0:00:00.000
  19:1e38      0 days 0:00:00.000
  18:ee4       0 days 0:00:00.000
  17:fb8       0 days 0:00:00.000
  13:1b54      0 days 0:00:00.000
  12:1a48      0 days 0:00:00.000
  10:f64       0 days 0:00:00.000
   9:1024      0 days 0:00:00.000
   8:1b78      0 days 0:00:00.000

As you can see, the sum of these times does not match the total CPU time we got from the .time command. That’s simply because threads get reused and recycled. It also means that the total CPU time used by a thread may have been spread over several page requests. Note as well that this listing covers user-mode time only.
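To compare the !runaway listing with the .time figures without adding the rows up by hand, something like the sketch below can be used. It assumes the row format shown above (thread number, hex OS thread ID, then a duration with three fractional digits); `parse_runaway` is a hypothetical helper name.

```python
import re
from datetime import timedelta

# One row looks like "  25:1a94      0 days 0:00:39.937".
_ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d+):(\d+)\.(\d{3})\s*$"
)


def parse_runaway(output: str) -> list:
    """Return (debugger thread number, OS thread id, user-mode time) per row."""
    rows = []
    for line in output.splitlines():
        match = _ROW.match(line)
        if match is None:
            continue  # header lines and anything unexpected are skipped
        days, hours, minutes, seconds, millis = (int(match.group(i)) for i in range(3, 8))
        rows.append((
            int(match.group(1)),
            int(match.group(2), 16),
            timedelta(days=days, hours=hours, minutes=minutes,
                      seconds=seconds, milliseconds=millis),
        ))
    return rows


# First three rows of the listing above.
SAMPLE_RUNAWAY = """\
User Mode Time
  Thread       Time
  25:1a94      0 days 0:00:39.937
  16:1bc0      0 days 0:00:38.390
  50:1e8c      0 days 0:00:08.859
"""

rows = parse_runaway(SAMPLE_RUNAWAY)
total = sum((r[2] for r in rows), timedelta())
busiest = max(rows, key=lambda r: r[2])
print(f"total user-mode time: {total}, busiest thread: {busiest[0]}")
```

Feeding it the full listing gives the sum to hold against the user time reported by .time.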


To get more information about the running threads we can run the !Threads-command. This will list all managed threads in the application, what application domain the thread is currently executing under, etc. The output will look like this:

0:000> !threads
ThreadCount: 48
UnstartedThread: 0
BackgroundThread: 29
PendingThread: 0
DeadThread: 19
Hosted Runtime: no
                                      PreEmptive   GC Alloc           Lock
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
  16    1 1bc0 001fccd0   1808220 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  22    2  d8c 002016f0      b220 Enabled  00000000:00000000 0019daf0     0 MTA (Finalizer)
  14    4  4ac 00242e58   880a220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  23    5 14a4 11b39f18    80a220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  24    6 1d2c 11b41ad8      1220 Enabled  00000000:00000000 0019daf0     0 Ukn
  25    7 1a94 11b46c70   180b220 Enabled  27240c98:27241fd8 11b42540     1 MTA (Threadpool Worker)
  26    9 13ec 12ce2888   200b220 Enabled  2a9f1434:2a9f33c0 11b42540     0 MTA
  27    a 190c 12d85eb8   200b220 Enabled  00000000:00000000 11b42540     0 MTA
  29    b 1f5c 13df6a50   200b220 Enabled  2ab1da6c:2ab1f1c0 11b42540     0 MTA
  30    c 1ae8 12d44a58      b220 Enabled  00000000:00000000 11b42540     0 MTA
  31    d 16ac 12e2e008   200b220 Enabled  2a81348c:2a8153c0 11b42540     1 MTA
   5    e 1c78 12da2160       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  33    8  78c 11b674c8   200b220 Enabled  2707b818:2707c1d8 11b42540     0 MTA
  34   12 1a8c 13f163c8       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  36   13  4a4 13eef718   200b220 Enabled  2a7db4a4:2a7dd3c0 11b42540     0 MTA
   4   14  5ac 13ef2008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  42   10 1c24 13f0e950   880b220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
   6   11  e28 13f16008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
   3    f 13b4 13eba008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  43   15 1b8c 140db008   880b220 Enabled  00000000:00000000 0019daf0     0 MTA (Threadpool Completion Port)
  44   17 131c 140ceb28   200b220 Enabled  272288c8:27229fd8 11b42540     0 MTA
  45   1d  5dc 140cd0a0       220 Enabled  00000000:00000000 0019daf0     0 Ukn
  47   20 19c8 1651a008       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   24    0 16468880   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  46   1f 1dac 1650ab48       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   1a    0 140d5008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   16    0 140c5008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  50    3 1e8c 14064420   180b220 Enabled  27246f54:27247fd8 11b42540     1 MTA (Threadpool Worker)
XXXX   35    0 1406e800   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  51   36 1340 140df008   180b220 Enabled  2adec9cc:2aded1c0 11b42540     1 MTA (Threadpool Worker)
XXXX   37    0 16566868   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  48   38  cdc 16578840       220 Enabled  00000000:00000000 0019daf0     0 Ukn
XXXX   39    0 16566c28   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3b    0 1646b8b0   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3c    0 16674008   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3d    0 16676418   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3e    0 16676fb8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   3f    0 16674d48   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   40    0 1667de10   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   41    0 16680050   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   42    0 166812e8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   43    0 16683e60   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  52   44 1e40 165259e8   180b220 Enabled  2adf126c:2adf31c0 11b42540     1 MTA (Threadpool Worker)
XXXX   45    0 165b7c08   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   46    0 165aa3d8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   47    0 165242c8   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
XXXX   48    0 165e9500   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  49   3a 1eac 165676f0       220 Enabled  00000000:00000000 0019daf0     0 Ukn 

The threads with an ID of XXXX have ended and are waiting to be recycled. We can also see that the finalizer thread has an ID of 22. So if we’d seen an unhealthy amount of activity on thread 22 when we ran !runaway, we would now know that we had a finalizer issue on our hands.
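Spotting the finalizer thread and counting the ended threads can also be scripted. This is a sketch under the assumption that the rows look like the listing above (debugger ID or XXXX, managed ID in hex, OS ID in hex, and so on); `parse_threads` is a hypothetical helper name.

```python
def parse_threads(output: str) -> list:
    """Parse the data rows of !threads output into dictionaries."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have at least 10 columns, with the PreEmptive GC state in column 6.
        if len(parts) < 10 or parts[5] not in ("Enabled", "Disabled"):
            continue
        rows.append({
            # XXXX marks a thread that has ended and awaits recycling.
            "debugger_id": None if parts[0] == "XXXX" else int(parts[0]),
            "managed_id": int(parts[1], 16),
            "os_id": int(parts[2], 16),
            "is_finalizer": "(Finalizer)" in line,
        })
    return rows


# Three rows taken from the listing above.
SAMPLE_THREADS = """\
       ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception
  16    1 1bc0 001fccd0   1808220 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
  22    2  d8c 002016f0      b220 Enabled  00000000:00000000 0019daf0     0 MTA (Finalizer)
XXXX   24    0 16468880   1801820 Enabled  00000000:00000000 0019daf0     0 Ukn (Threadpool Worker)
"""

threads = parse_threads(SAMPLE_THREADS)
finalizer = next(t for t in threads if t["is_finalizer"])
dead = [t for t in threads if t["debugger_id"] is None]
print(f"finalizer is thread {finalizer['debugger_id']}, {len(dead)} ended thread(s)")
```

The debugger ID of the finalizer is the number to look for in the !runaway listing.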

Switching to a specific thread

To go to a specific thread we use the ~-command. The syntax is as follows: ~[number of thread]s. So to switch to thread 50 we would type the following:

0:000> ~50s

We have then switched to thread 50, and can use a lot of other useful commands.


The !clrstack command lists the managed call stack for the current thread. If you want additional information you can add the “-p” switch, which shows the parameters of each method as well; “-l” shows local variables, and “-a” shows both.

Below is a sample listing of the clrstack for thread 50.

0:050> !clrstack
OS Thread Id: 0x1e8c (50)
ESP       EIP    
17a9e750 7d61c828 [NDirectMethodFrameSlim: 17a9e750] System.DirectoryServices.Protocols.Wldap32.ldap_bind_s(IntPtr, System.String, System.DirectoryServices.Protocols.SEC_WINNT_AUTH_IDENTITY_EX, System.DirectoryServices.Protocols.BindMethod)
17a9e768 14df70f9 System.DirectoryServices.Protocols.LdapConnection.BindHelper(System.Net.NetworkCredential, Boolean)
17a9e794 14df6de0 System.DirectoryServices.Protocols.LdapConnection.Bind()
17a9e79c 14df59e9 System.DirectoryServices.Protocols.LdapConnection.SendRequestHelper(System.DirectoryServices.Protocols.DirectoryRequest, Int32 ByRef)
17a9e8b8 14df56e8 System.DirectoryServices.Protocols.LdapConnection.SendRequest(System.DirectoryServices.Protocols.DirectoryRequest, System.TimeSpan)
17a9e8bc 14df5657 [InlinedCallFrame: 17a9e8bc]

So, reading from the bottom up we can see that an LdapConnection called the SendRequest method, which in turn called the SendRequestHelper method, which called the Bind method, and so on.

If we run !clrstack -p we get the following:

0:050> !clrstack -p
OS Thread Id: 0x1e8c (50)
ESP       EIP    
17a9e750 7d61c828 [NDirectMethodFrameSlim: 17a9e750] System.DirectoryServices.Protocols.Wldap32.ldap_bind_s(IntPtr, System.String, System.DirectoryServices.Protocols.SEC_WINNT_AUTH_IDENTITY_EX, System.DirectoryServices.Protocols.BindMethod)
17a9e768 14df70f9 System.DirectoryServices.Protocols.LdapConnection.BindHelper(System.Net.NetworkCredential, Boolean)
        this = 0x271fdfe0
        newCredential =
        needSetCredential =

17a9e794 14df6de0 System.DirectoryServices.Protocols.LdapConnection.Bind()
        this =

17a9e79c 14df59e9 System.DirectoryServices.Protocols.LdapConnection.SendRequestHelper(System.DirectoryServices.Protocols.DirectoryRequest, Int32 ByRef)
        this = 0x271fdfe0
        request = 0x27246e38
        messageID = 0x17a9e8ec

17a9e8b8 14df56e8 System.DirectoryServices.Protocols.LdapConnection.SendRequest(System.DirectoryServices.Protocols.DirectoryRequest, System.TimeSpan)
        this = 0x271fdfe0
        request = 0x27246e38
        requestTimeout =

17a9e8bc 14df5657 [InlinedCallFrame: 17a9e8bc]

We can now look at the parameters, such as the DirectoryRequest that was passed to the SendRequest and SendRequestHelper methods. To do this we simply copy the address of the request (0x27246e38) and use it as an argument for our next command.

!dumpobj (!do)

This is another crucial command. !dumpobj dumps the object at the specified address, so if we pass the address of the request as a parameter, the request is dumped to the screen:

0:050> !do 0x27246e38
Name: System.DirectoryServices.Protocols.SearchRequest
MethodTable: 14b394c4
EEClass: 14d97ce0
Size: 52(0x34) bytes
GC Generation: 0
      MT    Field   Offset                 Type VT     Attr    Value Name
02c39310  4000102        4        System.String  0 instance 00000000 directoryRequestID
14b398bc  4000103        8 …ControlCollection  0 instance 27246e90 directoryControlCollection
02c39310  4000111        c        System.String  0 instance 27246d00 dn
12579f5c  4000112       10 ….StringCollection  0 instance 27246eb4 directoryAttributes
02c36ca0  4000113       14        System.Object  0 instance 27246ddc directoryFilter
14b39344  4000114       18         System.Int32  1 instance        1 directoryScope
14b393fc  4000115       1c         System.Int32  1 instance        0 directoryRefAlias
0fd3da00  4000116       20         System.Int32  1 instance        0 directorySizeLimit
1202af88  4000117       28      System.TimeSpan  1 instance 27246e60 directoryTimeLimit
120261c8  4000118       24       System.Boolean  1 instance        0 directoryTypesOnly

Okay, so what’s this? Well, it’s a System.DirectoryServices.Protocols.SearchRequest object, so it has the fields defined by the System.DirectoryServices.Protocols.SearchRequest class. If you want to know more about them, I suggest you look up the SearchRequest class on MSDN. We have the RequestId, the Scope, the DistinguishedName, etc.

So, let’s say we want to know what the distinguished name is for this particular request. We boldly assume that the dn field we see in the listing above is what MSDN calls DistinguishedName. Simply copy the value of the dn field (27246d00), which is the address of the string object, and use !dumpobj again. We can see that the field is a System.String, so the output should be pretty clear.

0:050> !do 27246d00
Name: System.String
MethodTable: 02c39310
EEClass: 0fb610ac
Size: 112(0x70) bytes
GC Generation: 0
String: CN=Dummy,CN=Accounts,CN=useradm,DC=dummy,DC=net
      MT    Field   Offset                 Type VT     Attr    Value Name
0fd3da00  4000096        4         System.Int32  1 instance       48 m_arrayLength
0fd3da00  4000097        8         System.Int32  1 instance       47 m_stringLength
0fb80010  4000098        c          System.Char  1 instance       43 m_firstChar
02c39310  4000099       10        System.String  0   shared   static Empty
    >> Domain:Value  0019daf0:03380310 11b42540:03380310 <<
0fb86d44  400009a       14        System.Char[]  0   shared   static WhitespaceChars
    >> Domain:Value  0019daf0:03380324 11b42540:033855bc <<

Apparently, the distinguished name used was “CN=Dummy,CN=Accounts,CN=useradm,DC=dummy,DC=net”. If we want to find out more, we simply continue using !dumpobj to examine the objects and their respective values.
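Walking an object graph this way means repeatedly copying a field value into the next !do command. The sketch below shows one way to automate the lookup, assuming the eight-column field table shown above; `instance_fields` and `followup_commands` are hypothetical helper names, and value-type fields are deliberately left out because they are not dumped with !do.

```python
def instance_fields(do_output: str) -> dict:
    """Map field name -> details for the instance fields in !do output."""
    fields = {}
    for line in do_output.splitlines():
        parts = line.split()
        # Columns: MT, Field, Offset, Type, VT, Attr, Value, Name.
        if len(parts) == 8 and parts[5] == "instance":
            fields[parts[7]] = {
                "type": parts[3],
                "is_value_type": parts[4] == "1",
                "value": parts[6],
            }
    return fields


def followup_commands(fields: dict) -> list:
    """Build a !do command for every non-null reference-type field."""
    return [
        f"!do {info['value']}"
        for info in fields.values()
        if not info["is_value_type"] and info["value"] != "00000000"
    ]


# Three rows from the SearchRequest dump above.
SAMPLE_DO = """\
      MT    Field   Offset                 Type VT     Attr    Value Name
02c39310  4000102        4        System.String  0 instance 00000000 directoryRequestID
02c39310  4000111        c        System.String  0 instance 27246d00 dn
14b39344  4000114       18         System.Int32  1 instance        1 directoryScope
"""

fields = instance_fields(SAMPLE_DO)
print(fields["dn"]["value"])
print(followup_commands(fields))
```

Static fields are skipped by the `instance` check, since their values are listed per application domain on a separate line.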

In my next post I thought I’d continue using the !dumpobj command to probe through the w3wp process, and introduce a couple of other great commands.


To be continued…

/ Johan

Comments

  2. Aaron Lerch says:

    Instead of copying SOS.dll and doing a ".load sos", you can just do ".loadby sos mscorwks" to load SOS from the relevant framework install location.

    Thanks for the good intro!

  3. Brian says:

    Great article. I’m just getting into debugging for the first time and finding it quite frustrating :) I suppose that comes with the territory.

    Perhaps someone can explain the confusion I’m having over something. I’m troubleshooting a problem with a .NET app that grows very large in memory usage and eats up CPU until it eventually becomes unresponsive. This happens randomly after the app has been up and working fine for hours. I suspect that garbage collection might be an issue (would GC really be the suspect for the hang if the problem was that the app grew too large and then triggered too much GC? I think I’m barking up the wrong tree and should be looking for a memory leak).

    Nevertheless, based on the output I get from running !threads, I want to take a look at thread 25, which appears like this:

    25   2d0 164a2dc0   1800220 Enabled  00000000:00000000 000f13e0     2 MTA (GC) (Threadpool Worker)

    so I do ~25s and then pull a !clrstack. Some data comes back with these being on top of the stack:

    17d9ea04  7c8285ec [FRAME: ECallMethodFrame] [DEFAULT] Void System.GC.nativeCollectGeneration(I4)

    17d9ea14  799dbec3 [DEFAULT] I8 System.GC.GetTotalMemory(Boolean)

    After all this rambling, one of my main questions is this: when I run !clrstack -p on this same thread, I get output identical to running it without the -p option. What would the reason for this be? I want to delve into this thread further by using !do but I cannot.

  4. JohanS says:

    Hi Brian,

    It sounds like you’re performing a lot of generation 2 garbage collections. This is not the GC’s fault as such. I’d take a long, good look at your memory usage to see what can be minimized.

    Here’s what I’d do:

    1. Remove ALL calls to GC.Collect() if any. They will mess up the "ecosystem" for the GC and should only be used for testing purposes.

    2. Take a look at my previous post. The title mentions out of memory exceptions, but it will show you how to look at general memory usage for your application and hopefully pinpoint the culprit.

    Regards / Johan

  5. Cory Foy says:

    @Aaron – Just to add – you can only do that if the framework has already been loaded. For example, you can’t do that at the beginning of a process because mscorwks hasn’t been loaded yet.

    If you need to get SOS loaded as soon as mscorwks is loaded, you can do:

    sxe ld mscorwks


    .loadby sos mscorwks

    Generally most of us do .loadby sos mscorwks, but there are circumstances where that won’t work.


  7. Tess says:

    A note for Brian here.  GC.GetTotalMemory() will call GC.Collect if you pass it true.  To get rid of that behaviour you should pass in false instead.

    The reason you don’t get any more data when you run !clrstack -p is probably that you are not building your DLLs in debug mode (which, by the way, is very good practice in production).

    With retail builds you may still get some more info if you run !clrstack -a


  9. MItch Wheat says:

    Excellent introduction to Windbg. Nice one, Johan!

  11. Paul says:

    Johan, great stuff there, bookmarked for reference! ;o)

  12. HS says:

    Great article!

    In your article you said "The threads with an ID of XXXX have ended and are waiting to be recycled.", but you didn’t say what is considered "ended". I thought that IIS will continue using the same threads over & over again from its thread pool and thus would not end any threads in the threadpool. Is having "DeadThreads" normal or does it require some investigation? Thanks.

  13. JohanS says:


    The !threads-command only lists the managed threads. If you want the native threads you should use ~ instead.

    Managed threads are occasionally ended when there’s nothing left to do. One thread may handle a couple of requests and then end. Having ended threads on the heap, like in the sample above, is completely normal.

    / Johan

  14. dannyR says:

    where is the sos.dll for .NET 3.5? I can’t find it anywhere.

  15. JohanS says:

    Hi Danny,

    No problem. Use the 2.0 version.

    / Johan

  16. Peter says:

    Hi Johan,

    Gr8 article. I came looking for help on debugging memory fragmentation for unmanaged applications. I have a simple C/C++ application running on WS2K3 SP1. I’m not sure loading SOS in my WinDbg will do me much good. Could you give me some information on WinDbg commands for unmanaged applications?

    Also, I’ve been using !heap -l to find leaked heaps, but can I be really sure all the heaps displayed are really leaking memory?

    Thanks !

  17. Rene Pally says:

    Nice Article,

    Can you tell me please what it means? :

    (a60.b5c): CLR exception – code e0434f4d (first chance)

    Thank you very much

    i would like to know the detail of the listed results of the SOS extern Commands..

    how and where can i find the detail documents??

    3ks so much

  20. JohanS says:

    I’m not sure I follow. Could you please elaborate a little?

  22. Matt G says:

    What can be done to investigate threads that have ended and are waiting to be recycled?

    I’ve got an intermittent ASP.NET worker process crash.  Running the !threads command on the dump shows a managed thread with a StackOverflowException.  Is there any way to get the stack trace on that thread?

    If not, do you have any suggestions for setting up adplus to do a memory dump that does have the stack information?


  23. JohanS says:

    Hi Matt,

    Are you sure you’ve really experienced the StackOverflowException?

    There is always a StackOverflowException created by default "just in case". The reason for this is that when a StackOverflow occurs the framework normally won’t be in a state to create an exception, so it has to create it proactively. See my post on exception hunting for further details.

    / Johan

  24. Matt G says:

    Based on your other blog post it appears I do not have a StackOverflowException.

    I’m not sure where to go from here.  I can get the exceptions from the heap per your post, but I don’t know how to use that information to determine what killed w3wp.exe.

  25. JohanS says:

    Hi Matt,

    Run ~*kb (this will show you the native stack for all threads) and look for kernel32!ExitProcess. This way you’ll find the thread that ended the process.

  26. Jun says:

    Hi Johan,

    About those threads with an ID of XXXX: should they go away after a certain amount of idle time, like 2 minutes? I am troubleshooting an application that should sit idle (because of no stimulation), yet it is still using quite some CPU time and threadpool threads, which it should not. So I created an adplus dump while it was sitting idle, and I found that there are many threads with ID XXXX. I created a dump file again after 15 minutes, and it still had the exact same XXXX threads. All those threads are completion port threads, and I wonder why they did not get recycled?

    Also, when doing a ‘~*e !clrstack’, most of the worker threads and completion port threads are showing "Failed to start stack walk: 80004005". Is there a way to show the stack for those threads? Those are the threads I am interested in.

    Thank you!

  27. JohanS says:

    The GC will be triggered when you’re allocating memory. If your application is inactive, then so is the Garbage Collector. Even if you’re making a few, random allocations they still may not be enough to trigger a GC, so this is not at all unexpected.

    The “Failed to start stack walk: 80004005”-error is displayed when the thread did contain a managed stack, but no longer does.


  29. Anish says:


       Great content.

    I am new to debugging .net programs using windebug.

    How can I set break point on Main method of a console application.

    Following are the steps i have done

    1.Open the executable in windbg.exe

    2. Set a breakpoint in mscorwks using the following command

       bp mscorwks!EEStartup "g @$ra"

    3. From here onwards i am able to use sos commands.

        I wanted to set breakpoint in Main method of


    I have tried using the !bpmd command as mentioned in the SOS help.

    "bl" command lists the breakpoint also. But program is not breaking at the Main

    what could be the reason

    Thank You,


  33. Kams says:

    I have been using !clrstack to display the stack trace, but I don’t see method names. In my case I get

    0:028> !clrstack

    OS Thread Id: 0xffc (28)

    ESP       EIP    

    04e9e1f4 7c90e4f4 [GCFrame: 04e9e1f4]

    04e9e210 7c90e4f4 [GCFrame: 04e9e210]

    04e9e3f4 7c90e4f4 [HelperMethodFrame_1OBJ: 04e9e3f4]

    04e9e464 792d5348

    04e9e4b4 792d514f

    04e9e4f0 792d4fde

    04e9e510 6384da66

    04e9e560 6384d666

    04e9e5bc 63a95573

    04e9e5d0 07df1c91

    04e9e600 064b6935

    04e9e644 07df1b7c

    04e9e684 07df15f1

    04e9e6d8 0623e9c7

    04e9e79c 056baf8c

    04e9e7e4 056b89cf

    04e9e864 056b7a1c

    04e9e948 056b6b51

    04e9ea94 056b5d0e

    04e9ead4 056b5bd0

    04e9ead8 056b123e

    04e9eba0 05e9b38c

    04e9ec10 062347a1

    04e9ec90 062344b9

    04e9f0c8 79e71b4c [HelperMethodFrame_PROTECTOBJ: 04e9f0c8]

    04e9f20c 792795b3

    04e9f230 7927038b

    04e9f28c 7927023c

    04e9f294 792701f5

    04e9f2a8 792700e1

    04e9f2dc 7926fe5f

    04e9f318 7985f9ba

    04e9f360 79864e7a

    04e9f390 066d8553

    04e9f484 066d7049

    04e9f4f4 066d5ccf

    04e9f550 066d5afc

    04e9f580 066d59d1

    04e9f588 066d58b7

    04e9f5b8 066d57e8

    04e9f5d0 7928cdc4

    04e9f770 79e71b4c [GCFrame: 04e9f770]

    Any clues on how I can get this right. I have set the symbol file path to the microsoft server as suggested above.

    when I do a !dso, I can see objects on the stack, but If !clrstack would work it would be great!

    Any help on this would be appreciated

  34. JohanS says:

    It seems likely that you’re loading the wrong version of SOS. Try loading SOS using the following command:

    .loadby sos mscorwks

    / Johan

  35. dudu says:

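    I’ve been reading up on this topic recently. The author’s write-up is quite good, so I worked through it myself and found it excellent. It was also a chance to exercise my translation skills (I haven’t worked with this kind of material for a long time, and my skills have dropped off sharply). Original article: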

  36. Thank you so much for a thorough introduction to WinDBG. It is something I have been looking for for months.

    I am having some real trouble with the introduction. Funnily, I have been able to do the later labs (5-7), but came back to the beginning to check I hadn’t missed anything.

    I followed your above instructions with the "Crash" lab, to see the instructions generated, but I am finding that a lot of the commands are not yielding any good results.

    For example


                                        PreEmptive   GC Alloc           Lock

          ID OSID ThreadOBJ    State     GC       Context       Domain   Count APT Exception

      1    1  c10 001a0a28   a808220 Enabled  00000000:00000000 001d04a8     1 MTA (Threadpool Completion Port)

      7    2  664 001a4190      b220 Enabled  01056390:0105652c 0016e3a8     0 MTA (Finalizer) System.NullReferenceException (01055ad0)

      8    3 12cc 001bd228    80a220 Enabled  00000000:00000000 0016e3a8     0 MTA (Threadpool Completion Port)

     11    4  9a4 001cff88      1220 Enabled  00000000:00000000 0016e3a8     0 Ukn

    Switch to Thread 2

    0:011> ~2

      2  Id: 158c.288 Suspend: 1 Teb: 7ffd9000 Unfrozen

         Start: msvcr80!_endthread+0x71 (7813286e)

         Priority: 0  Priority class: 32  Affinity: 3

    !clrstack it

    0:011> !clrstack

    OS Thread Id: 0x9a4 (11)

    Failed to start stack walk: 80004005

    Nothing? I also tried !pe to get:

    0:011> !pe

    There is no current managed exception on this thread

    But there must be? As it says on the !threads!

    I’m not sure what I’ve been doing wrong. Can you give me guidance please?

    Thanks again


  37. JohanS says:

    Hi Dominic.

    Here’s where it went wrong:

    ~2 will give you info about the thread.

    ~2s will switch to the thread in question. This is the command you wanted.

    As you can see you’re still on thread 11:

    0:011> !clrstack

    OS Thread Id: 0x9a4 (11)

    Failed to start stack walk: 80004005

    / Johan

  38. Thanks Johan!

    I always wondered what the difference was between those 2 commands.

  39. Caveman says:

    Thank you for the cool intro. Johan!

  40. JohnOpincar says:

    Thanks for the great post.  It’s been invaluable to me as I’ve researched an IIS deadlock problem.

  41. Sagar says:

    This is a great post Johan. It’s very useful. Thanks for sharing this information.