Identifying an object whose underlying DLL has been unloaded


Okay, so I gave it away in the title, but follow along anyway.

Your program chugs along and then suddenly it crashes like this:

eax=06bad8e8 ebx=00000000 ecx=1e1cfdf0 edx=00000000 esi=06b9a680 edi=01812950
eip=1180ab57 esp=001178b4 ebp=001178c0 iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
ABC!FunctionX+0x1f:
1180ab57 ff5108          call    dword ptr [ecx+8]    ds:0023:1e1cfdf8=????????
0:000>

Instantly you recognize the following:

  • This is a virtual method call. (Call indirect through register plus offset.) Confidence: very high.

  • The vtable is in ecx. (That is the base register of the indirect call.) Confidence: very high.

  • The underlying DLL for this object has been unloaded. (The memory that contains the vtable is not valid and its address is consistent with once having been in valid code.) Confidence: high.

  • This is an IUnknown::Release call. (Release is the third function of IUnknown and therefore resides at offset 8 on x86.) Confidence: high.

Of course, all of the above "instant conclusions" are merely "highly-educated guesses", but life is full of highly-educated guesses. (Every morning, I guess that my plates are still in the cupboard.)
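The offset-to-method guess is nothing more than a division by the pointer size. Here is a minimal sketch of that step in Python; the helper name vtable_slot is made up for illustration, and the only facts it relies on are the 4-byte pointers of x86 and the fixed method order of IUnknown.

```python
# IUnknown's methods in vtable order; COM fixes this order.
IUNKNOWN_METHODS = ["QueryInterface", "AddRef", "Release"]

def vtable_slot(offset, pointer_size=4):
    """Return the zero-based vtable slot for a byte offset (hypothetical helper)."""
    if offset % pointer_size != 0:
        raise ValueError("offset is not a multiple of the pointer size")
    return offset // pointer_size

slot = vtable_slot(0x8)               # the 8 in "call dword ptr [ecx+8]"
print(slot, IUNKNOWN_METHODS[slot])   # prints: 2 Release
```

On a 64-bit target the same slot would sit at offset 0x10, which is why the pointer size is a parameter rather than a constant.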

Let's run with our theory that the object was in an unloaded DLL and look for confirmation.

0:000> lm
start    end        module name
...
Unloaded modules:
10340000 10348000   DEF.DLL
1e1c0000 1e781000   GHI.DLL
25a90000 25a96000   JKL.DLL
0:000>
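Checking an address against that list is a simple range test. The sketch below does it in Python, assuming the end column is exclusive (the first address past the module); find_module is a name invented for this example, and the ranges are copied from the output above.

```python
# Unloaded-module ranges from the lm output: (start, end, name).
UNLOADED = [
    (0x10340000, 0x10348000, "DEF.DLL"),
    (0x1E1C0000, 0x1E781000, "GHI.DLL"),
    (0x25A90000, 0x25A96000, "JKL.DLL"),
]

def find_module(address, modules):
    """Return (name, offset) for the module containing address, or None."""
    for start, end, name in modules:
        if start <= address < end:   # assumption: end is exclusive
            return name, address - start
    return None

hit = find_module(0x1E1CFDF0, UNLOADED)   # ecx at the time of the crash
print(hit[0], hex(hit[1]))                # prints: GHI.DLL 0xfdf0
```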

Aha, our presumed vtable address lies right inside the address space where GHI.DLL used to be loaded. Let's see what used to be loaded at that address. For this, I borrow a trick from Doron, namely loading a module as a dump file. This "virtually loads" the library so you can poke around inside it.

C:\Program Files\ABC> ntsd -z GHI.DLL

Microsoft (R) Windows Debugger
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Program Files\ABC\GHI.DLL]
...
ModLoad: 15800000 15dc1000   C:\Program Files\ABC\GHI.DLL
eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=15807366 esp=00000000 ebp=00000000 iopl=0         nv up di pl nz na pe nc
cs=0000  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000000
GHI!_DllMainCRTStartup:
15807366 8bff             mov     edi,edi
0:000>

That module-load notification tells you where the DLL got virtually-loaded; in our case, it got loaded to 0x15800000. This isn't the same address as it was in our crashed process, so we'll have to do some mental arithmetic to account for the discrepancy.

Going back to the original register dump, we see that our putative vtable is at ecx=1e1cfdf0 relative to the load address 1e1c0000. Since our DLL-loaded-as-a-dump-file was loaded at 0x15800000, we need to adjust the address to be relative to the new location.

// working with the second copy of ntsd
0:000> ln 0x1580fdf0
(1580fdf0)   GHI!CAlphaStream::`vftable'

That magic number 0x1580fdf0 is just the result of some mental arithmetic. First:

  0x1e1cfdf0
- 0x1e1c0000
  ----------
  0x0000fdf0

This is the address of the vtable in the crashed process relative to the load address of the DLL in the crashed process. Next:

  0x15800000
+ 0x0000fdf0
  ----------
  0x1580fdf0

This is that same offset added back to the load address of the DLL-loaded-as-a-dump-file, which gives the address of the vtable in that copy. The math really isn't that hard, as you can see, since a lot of things cancel out. This happens a lot.
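If you would rather not do the subtraction in your head, the translation fits in one line. This is only a sketch in Python; rebase and the two constant names are invented for the example, and the numbers are the ones from the two debugger sessions.

```python
def rebase(address, old_base, new_base):
    """Translate an address from one load address of a module to another."""
    return address - old_base + new_base

CRASH_BASE = 0x1E1C0000   # GHI.DLL in the crashed process
DUMP_BASE = 0x15800000    # GHI.DLL loaded with ntsd -z

print(hex(rebase(0x1E1CFDF0, CRASH_BASE, DUMP_BASE)))   # prints: 0x1580fdf0
```

Swapping the two bases goes the other way, which helps when you find something interesting in the dump-file copy and want its address in the crashed process.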

When we asked the debugger to tell us what symbol is nearest to that address, we hit the jackpot: It is exactly a vtable for the CAlphaStream object. This confirms our original theory. We can even confirm the IUnknown::Release theory by dumping the vtable.

0:000> dds 1580fdf0
1580fdf0  159234b3 GHI!CAlphaStream::QueryInterface
1580fdf4  15810539 GHI!CBetaState::AddRef
1580fdf8  15923cfc GHI!CAlphaStream::Release
1580fdfc  15923d30 GHI!CAlphaStream::Read
...

Yup, that's a CAlphaStream vtable all right.

Since I'm not familiar with the GHI.DLL file, let's ask the debugger where the source code is so we can take a closer look:

0:000> .lines
Line number information will be loaded
0:000> dds 1580fdf0
1580fdf0  159234b3 GHI!CAlphaStream::QueryInterface
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2624]
1580fdf4  15810539 GHI!CBetaState::AddRef
                   [c:\dev\fabricam\leverage\paradigm\initiative.cpp @ 427]
1580fdf8  15923cfc GHI!CAlphaStream::Release
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2638]
1580fdfc  15923d30 GHI!CAlphaStream::Read
                   [c:\dev\fabricam\synergy\proactive\winwin.cpp @ 2649]

Now that we know where the source code to CAlphaStream is, we can hop on over to take a quick peek and confirm that, oh look, the object doesn't increment the DLL object count when it is constructed (or decrement it when it is destructed). As a result, when COM calls DllCanUnloadNow, the GHI.DLL says, "Sure, go ahead!" The DLL is unloaded even though ABC still has a reference to it, and then when ABC goes to release that reference, we crash because GHI is already gone.
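To make the bookkeeping concrete, here is a toy model of the object count in Python. It is not COM code and not the actual GHI.DLL source; Module, BuggyStream, and CountedStream are invented names, and release() stands in for the final Release that destroys the object. The only real-world facts used are that DllCanUnloadNow returns S_OK (0) when unloading is safe and S_FALSE (1) otherwise.

```python
S_OK, S_FALSE = 0, 1   # the two values DllCanUnloadNow returns

class Module:
    """Toy stand-in for a COM DLL and its count of live objects."""
    def __init__(self):
        self.object_count = 0

    def can_unload_now(self):
        # S_OK means "nothing is alive, go ahead and unload me".
        return S_OK if self.object_count == 0 else S_FALSE

class BuggyStream:
    """Like the object in the story: never touches the module count."""
    def __init__(self, module):
        self.module = module

    def release(self):
        pass

class CountedStream:
    """Fixed version: counts itself in on creation, out on destruction."""
    def __init__(self, module):
        self.module = module
        module.object_count += 1

    def release(self):
        self.module.object_count -= 1
```

With a BuggyStream outstanding, can_unload_now() still answers S_OK, which is exactly the "Sure, go ahead!" that gets the DLL unloaded too early. With a CountedStream outstanding, it answers S_FALSE until release() is called.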

After I wrote this up, I discovered that Tony Schreiner went through pretty much the same exercise with a third-party Internet Explorer toolbar, except he had the extra bonus challenge of not having source code for the plug-in!

Comments (10)
  1. Anonymous says:

    Are you running OS/2?

    My copy of ntsd says that the -z option is "reserved for OS/2 debugging".

  2. Rhomboid says:

    Rather than doing the address calculation manually, couldn’t you run rebase on a copy of GHI.DLL so that it loads at the same place it did in the app?

  3. Anonymous says:

    Richard…

    I checked the version of NTSD that ships with Windows XP SP2. -z is documented as:

    -z <CrashDmpFile> specifies the name of a crash dump file to debug

    -zp <CrashPageFile> specifies the name of a page.dmp file

    Are you using the Windows 2000 version? If it was reserved for OS/2, they might have recycled it for XP.

  4. Anonymous says:

    Yes, I am using Windows 2000. I realized after I posted that I had omitted some vital information (such as the version of the OS and tools I was running).

  5. Anonymous says:

    Often seen in C++ apps when you have a global COM smart pointer, so the destructor tries calling Release on the COM object after main has completed, and CoUninitialize has been called.

  6. Anonymous says:

    I think you can also make the debugger believe that GHI.DLL is still loaded by doing this:

    0:000> ? 1e781000 - 1e1c0000

    Evaluate expression: 6033408 = 005c1000

    0:000> .reload GHI.DLL=1e1c0000,005c1000

    This is convenient when you need to translate multiple addresses (such as when there is a stack trace with several return addresses from the unloaded DLL on the stack).

    (Richard/James – ntsd from system32 is very old and doesn’t have any extensions. You should download the latest version from http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx).

  7. Anonymous says:

    Use:

    .reload /unl ghi.dll

    It does all the dirty work for you.

  8. Kujo says:

    In this crash, my first thought is usually that the “this” pointer was bad, so the vtable in ecx isn’t a valid one (using a dangling pointer?)  I probably would have barked up that tree before considering an unloaded dll.

    I’ve noticed that good windbg debugging involves a lot of pattern recognition (being able to spot a float or a string, for example.) Unlike me, it sounds like unloaded dll was your first guess, and with high confidence no less! The heuristic you listed was “its address is consistent with once having been in valid code.”  How did you discern that instantly?  

    I can see that 0x1e1cfdf8 is well-aligned, but that’s hardly telling on its own.  0x10000000 is a very common dll base address, but 0x1e1cfdf8 doesn’t feel close enough to that (and indeed, the lucky dll was based at 0x1e1c0000.) Is it just that I don’t work with COM very often, so dlls aren’t my first guess?

    [True, if I had thought harder, 0x1Exxxxxx does seem a bit too high, but sometimes you get the right answer for the wrong reason. -Raymond]

  9. Eric C Brown says:

    In recent versions of windbg, the register display will often say <unloaded ghi.dll>+blah.  Not always, but often.

  10. Kujo says:

    Thanks for the tip on -z, it’s definitely nicer than using dumpbin or something.  

    Inspired by Eric’s comment, I see there’s a brand new version of the debugging tools released today! Thanks :)
