Dumping a hash table with external chaining from the debugger


I was doing some crash dump debugging, as I am often called upon to do, and one of the data structure I had to grovel through was something that operated basically like an atom table, so that's what I'll call it. Like an atom table, it manages a collection of strings. You can add a string to the table (getting a unique value back, which we will call an atom), and later you can hand it the atom and it will give you the string back. It looked something like this:

struct ENTRY
{
  ENTRY *next;
  UINT   atom;
  PCWSTR value;
};

#define ATOMTABLESIZE 19
struct ATOMTABLE
{
  ENTRY *buckets[ATOMTABLESIZE];
};

(It didn't actually look like this; I've reduced it to the smallest example that still illustrates my point.)

As part of my debugging, I had an atom and needed to look it up in the table. A program would do this by simply calling the "here is an atom, please give me the string" function, but since this was a crash dump, there's nothing around to execute anything. (Not that having a live machine would've made things much easier, because this was a kernel-mode crash, so you don't get any of this edit-and-continue froofy stuff. This is real debugging™.)

But even though the crashed system can't execute anything, the debugger can execute stuff, and the debugger engine used by kd comes with its own mini-programming language. Here's how I dumped the atom table:

// note: this was entered all on one line
// broken into two lines for readability
0: kd> .for (r $t0=0; @$t0 < 0n19; r $t0=@$t0+1)
         { dl poi (fffff8a0`06b69930+@$t0*8) 99 2 }
fffff8a0`06ad3b90  fffff8a0`037a3fc0 fffff8a0`0000000c \
fffff8a0`037a3fc0  fffff8a0`037a4ae0 00000000`00000025 | $t0=0
fffff8a0`037a4ae0  fffff8a0`02257580 00000000`00000036 |
fffff8a0`02257580  00000000`00000000 00000000`00000056 /
fffff8a0`06ad3b30  fffff8a0`06ad3ad0 a9e8a9d8`0000000d \
fffff8a0`06ad3ad0  fffff8a0`037a4700 000007fc`0000000e |
fffff8a0`037a4700  fffff8a0`01f96fb0 00000000`0000003f | $t0=1
fffff8a0`01f96fb0  fffff8a0`06cfa5d0 fffff8a0`00000044 |
fffff8a0`06cfa5d0  00000000`00000000 00181000`00000060 /
fffff8a0`033e7a70  fffff8a0`037a4770 00000020`00000023 \
fffff8a0`037a4770  fffff8a0`023b52f0 00000000`0000003e | $t0=2
fffff8a0`023b52f0  fffff8a0`03b6e020 006f0063`00000059 |
fffff8a0`03b6e020  00000000`00000000 00000000`00000075 /
fffff8a0`037a0670  fffff8a0`02b08870 fffff8a0`00000026 \ $t0=3
fffff8a0`03b9e390  00000000`00000000 00240000`00000071 /
...

Let's take that weirdo command apart one piece at a time.

The overall command is

.for (a; b; c) { d }

This operates the same as the C programming language. (Sorry, Delphi programmers, for yet another C-centric example.) In our case, we use the $t0 pseudo-register as our loop control.

  • r $t0=0 — this sets $t0 to zero

  • @$t0 < 0n19 — this stops once $t0 reaches 19.

  • r $t0=@$t0+1 — this increments $t0.

Note that here as well as in the loop body, I prefix the register name with the @ character when I want to obtain its value, in order to force it to be interpreted as a register. (Otherwise, the debugger will look for a symbol called $t0.)

The command being executed at each iteration is { dl poi (fffff8a0`06b69930+@$t0*8) 99 2 }. Let's break this down, too:

  • dl — this command dumps a singly-linked list.

  • poi (fffff8a0`06b69930+@$t0*8) — this is the head of the linked list. In this example, 0xfffff8a0`06b69930 is the address of the buckets array, so we add the loop counter times the size of a pointer (8, in this case) to get the address of the $t0'th entry, and then dereference it (poi) to get the address of the head of the linked list.

  • 99 — This is the maximum length of the linked list. I picked an arbitrary large-enough number. I like using 9's because it carries the most value per keypress. Other people like to use nice round numbers like 1000, but 999 saves you a whole keypress and is just one less. (On the other hand, I don't use fff because that runs the risk of being misinterpreted as a symbol.)

  • 2 — This is the number of pointer-sized objects to dump at the start of each node. For 32-bit code, I use 4, whereas for 64-bit code, I use 2. Why those values? Because those are the maximum number of pointer-sized objects that the debugger will print on one line.

Okay, so now I have that linked list dump. The value I'm looking for is the atom whose value is 0x3F, so I skim down the last column looking for 0000003f, and hey there it is.

fffff8a0`037a4700  fffff8a0`01f96fb0 00000000`0000003f

Now I have my ENTRY, and I can dump it to see what the corresponding value is.

0: kd> dt contoso!ENTRY fffff8a0`037a4700
    +0x000 next: 0xfffff8a0`01f96fb0
    +0x008 atom: 0x0000003f
    +0x010 value: 0xffff8a0`01f97e20 -> "foo"
Comments (17)
  1. Anonymous says:

    I was confused, because it took me a while to realize that there weren't 19 lines of response. I'd thought that each line represented the entire output of the {dl…} command for each @t0 value, but that's not so, is it? Or have I misunderstood completely?

    [It's a bunch of dl output concatenated. (I truncated the output in the article but neglected to add ellipses, so I put the ellipses in. I also annotated the output blocks. Hope that helps.) -Raymond]
  2. MItaly says:

    On an unrelated note, I find disturbing that every day some person comes here just to rate "one star" every post, without particular regard to its content.

  3. Anonymous says:

    "I like using 9's because it carries the most value per keypress."

    Priceless!

  4. WndSks says:

    I also tend to use 9's for throw away code: char buf[999]; partyonbuffer(buf);

  5. Anonymous says:

    @Matteo: I've noticed that as well.  I feel sorry for that person.

  6. Anonymous says:

    Is it me, or did Raymond just utterly slam Delphi programmers?

    And, can he do it again some time soon??

    JamesNT

    [It was an apology, not a slam. The default programming language for this blog is a subset of C++. If you prefer some other language, then you'll just have to tough it out. Sorry. -Raymond]
  7. Anonymous says:

    Neat trick.  For the curious, here's how you might do something similar in GDB (which AFAIK doesn't have a command for walking a linked list, so you have to implement that yourself):

    set $i=0

    while $i < 19

     set $p = table->buckets[$i]

     while $p != 0

       x/4wx $p

       set $p = $p->next

     end

     set $i++

    end

    If you don't have symbols available, then use expressions like (void*)(0xfffff8a006b69930 + 8*$i) instead of table->buckets[$i] and *(void**)$p instead of $p->next.

  8. Anonymous says:

    This post reminds me how weak I am at post-mortem (user mode) debugging.

    Are there any books or resources people can recommend?

    Currently I'm at a loss as soon as Visual Studio tooltips refuse to show a value.

  9. Anonymous says:

    I'm one of those Delphi programmers but instead of getting mad or anything I find this rather amusing. It's like the parallel in television. While NTSC used by US and to extent Japan because was imposed by US after WW2 the rest of the world is using PAL/SECAM which are far superior to NTSC. But NTSC was first, and hey, there is nothing better then the one you are used to it. Same as Microsoft at beginning, Delphi was not around so they used C and to extent C++. And now is like in that anecdote about monkeys getting down any new monkey thrown in the cage that tries to reach for the banana. No one knows why but they keep doing it in this way.

    On the other hand I also know Java/VB/C#/PHP/C/C++ like any programmer who is working in this field for more then two decades.

    And speaking of PHP, those $ in front of variables makes me wonder if the mini-programming language used by debugger is not PHP (or PERL / RUBY perhaps?) :)

  10. Anonymous says:

    Mr. Chen,

    No apology needed and yes, I knew you were not slamming Delphi programmers.  In my circle of friends, my hatred for Delphi is legendary for what I, and many others, consider to be good reasons.  I am a .Net programmer through-and-through with a touch of C/C++ and you are my programming god.

    You have my apologies for my lack of ability to control myself from taking advantage of the situation.

    JamesNT

  11. Anonymous says:

    I too did not take that as slamming Delphi programmers.

    Anyway, Delphi was a great language that I loved using for several years; until I made the move to c# for all my work around 2006. In any case, Delphi programmers generally understand C/C++ relatively well. You have to, to do anything less than trivial while referring to the MSDN. (The same as you end up using pinvoke in .Net)

    But I'm amazed that anybody was *still proudly programming in Delphi* in 2011. I would call that "proudly being a glutton for punishment".

  12. Medinoc says:

    This "dl" command, can it work for linked lists whose "next" pointer isn't the first element?

    [In those cases, use the !list command, which is more flexible. -Raymond]
  13. Anonymous says:

    Jerome: People still program in old Delphi probably for the same reasons that people still program in VB6.

  14. Anonymous says:

    It's called XE2, which has support for Mac OS X / iOS beside Windows 32/64…and latest acquisition these days is XE3 coming in Sep 2012 and will features support for W8, Mountain Lion, Android and Linux. Practically everything. So yes, as of 2012 I am very proud to still program in Delphi..and I have a lot of work actually.

    So dear Jerome, from 2006 till 2012 …you kinda of took a coma?

  15. Mike Dimmick says:

    @Jon: "Debugging Applications for Microsoft.NET and Microsoft Windows" by John Robbins. There was an earlier edition just called "Debugging Applications" that covered VB 6, if you need that, and I think its native content was slightly broader, from memory. The later "Debugging Microsoft.NET 2.0 Applications" doesn't appear to have much, if any, native content, so I'd go with the older book.

    @Danny: WinDBG, or at least the debugging engine under it, dates right back to the start of the NT project, circa 1987 (it merely wraps the same debugger engine as cdb and ntsd in a barely-good-enough GUI). PHP and Ruby are much newer, Perl is about the same vintage. $ was being used for variables much further back than that. en.wikipedia.org/…/Sigil_(computer_programming)

  16. @Danny: I can't speak much from the Delphi perspective, but I can from the C++ Builder perspective.  The compiler itself has aged very poorly.  No support yet for major C++11 features like lambda functions.  Visual C++ 2005 introduced 64-bit compiling, yet here we are 7 years later and there is STILL no 64-bit C++ Builder compiler.  The compiled code runs 2 or 3 times slower than equivalent Visual C++ code.  Not to mention there are countless small bugs and gotchas in the product – noticeably worse than, say, the .NET Framework.  The rumors I read online say that it's going to be replaced with a clang/LLVM-based compiler, but it's nowhere in sight.

    The VCL runtime for both Delphi and C++ Builder is forever tied to the Win32 API; they have no hope of "fixing" the VCL to be cross-platform without breaking compatibility in a major way.  So, the cross-platform support you mention requires the new FireMonkey framework, and existing applications have to be re-written.  It sounds nice in theory, but all I read about online is many negative complaints about how FireMonkey is buggy, incomplete, etc.  I installed XE2 Update 4 and ran a simple FireMonkey form with a label on it.  Immediately I noticed that the font looked horrible.  Took a screenshot and zoomed in: turns out there is some weird greyscale anti-aliasing going on – it's not ClearType and it looks terrible on my monitor.  This speaks volumes about the amount of time I'd likely have to spend fussing with the product to get it to just work and look decent – if a simple label looks like garbage, how much more would I be dealing with?  A major complaint is lack of documentation, so I looked into that and I'd have to agree.  There are zero (0) books about this framework; compare that to the dozens of books for WPF and Qt, let alone WinForms or even standard Delphi/VCL.  Now suppose we split documentation into two categories: (1) overview documentation with high-level documentation, and (2) function-level reference documentation.  The overview documentation, when fully expanded in the table of contents tree, only consumes a screen height and a half.  I get the feeling I'd be on my own with no documentation what-so-ever 90% of the time.  No thanks.  If I have to re-write, I'd rather re-write for Qt.

    The Windows 8 support you mention sounds like a joke.  Desktop apps that are made to look like Metro?  What's the point?!  Make it a Metro app!  Making it a desktop app will just confuse the user.

    The Mac OS X support in FireMonkey sounds sad, too.  Apparently they use the Carbon framework, which Apple already obsoleted several years ago.  And you know how Apple likes to do (or not do) backwards-compatibility!  WHY oh WHY would anyone write a new product that targets an obsolete API from Apple?!?!  I don't know why they couldn't have just used the newer Cocoa framework.  As a result, you can't compile 64-bit OS X applications (Delphi or C++), because Apple doesn't make a 64-bit version of Carbon.

    I could go on, but I'm done here.  Frankly I don't think the company has the money, resources, or community to remain competitive any more.  They might have a lot of good ideas, but they don't have the resources to properly execute these ideas.  The only reason I mess with any of those products is I get a paycheck for it for specific projects.  Otherwise, I happily stick to other products which cost less and just work.

  17. Anonymous says:

    @JamesJohnston

    This last project involved managing products for big retailers where a customer would get the app for their mobile phone, then chose it's target super-market location and fill in the desired products they would like to purchase. Customer drives to the supermarket and the app will show a map for their list in shortest path where all his desired products shelves are. The customer gets notified when is in the area of the/a product on the list, grabs it and mark it as "purchased" in the list. Goes to cashier and already knows how much money he has to pay, just in case the cashier swipes a product 2 times or other human mistakes.

    Now, the actual implementation is a 2 different apps, one app is the retailer and the other one is the customer app. The retailer uses his app to manage the products on the shelves, gets reports, notifications etc etc, all sort of implementation a database related app is suppose to do, including GPS location for each product so the built-in GPS of the customer phone would match when he gets closer.

    Took me like 6 months and is done in XE2, and yes with FireMonkey..and looks good. I do not know what you did but I got no problems with labels font or any other visual you ranted above. And we can't wait for September launch of XE3 to get Android support so the customer based to be doubled. And yes, beside normal perks and money from the contract, the customer (big retailer company) was so happy that throw in a nice fat bonus too (6 digits in USD).

    So, you tested XE2 once and rant and I got a fat check and a fat bonus..guess I'll go use XE2 and XE3 further more.

Comments are closed.