Resolving Symbols Manually on Windows CE (ADDRESS --> SYMBOL)

You know you've been there.  You are looking at a callstack, and the one function whose identity you want to know failed to resolve symbols or has bad symbols.  Dang!  If only you knew what function 0x12345678 refers to, you'd find your bug.

First things first: you have to understand the Windows CE virtual memory layout.  You have to understand what slots are, and recognize slot 0 and slot 1 addresses versus addresses from process slots.  That’s why I blogged about that first.  If you've internalized all of those details already, then you know that 0x12345678 is inside the process that’s running in the slot starting at 0x12000000.  In fact, it’s at offset 0x00345678 from the start of that process's slot.  Heck, you’re almost to the symbol already!

This shows that the two main steps to resolving a symbol are: figuring out what module the address lies within, and figuring out what symbol lies at that offset within the module.  If you need to get all the way down to the line of source code involved, you can figure that out too, by working out what line of source code is at that offset within the function.

Figuring out what module it is

To figure out what function or variable is at a particular address, first you need to know what module (DLL or EXE) is loaded into that chunk of RAM.  So first you eyeball the address and decide whether it’s a DLL or an EXE.  Since EXE code & data starts at the bottom of the slot, and DLL code & data starts at the top, it’s pretty easy to figure out.  0x12345678 is an EXE, 0x13987654 is a DLL.

Then you look for the DLL or EXE that is the closest match.  There are a bunch of ways to do this.  You can open up the Modules and Symbols Window, sort by address, and find the module that matches.  If you are looking up the symbol for a function, sort by “Image Address Range.”

Modules and Symbols Window
Module         Image Address Range     Relocated Data Address Range

lmemdebug.dll  0x01AC0000-0x01AC4FFF
kbdmouse.dll   0x01BA0000-0x01BC1FFF
devmgr.dll     0x03EE0000-0x03EE9FFF   0x01FE9000-0x01FE92A4
fsdmgr.dll     0x03F50000-0x03F60FFF   0x01FF7000-0x01FF73F0
coredll.dll    0x03F80000-0x03FD8FFF   0x01FFB000-0x01FFBD64
filesys.exe    0x04010000-0x04041FFF
shell.exe      0x06010000-0x06022FFF
device.exe     0x08010000-0x08012FFF
gwes.exe       0x0C010000-0x0C0CBFFF
services.exe   0x12010000-0x12017FFF
nk.exe         0x81200000-0x8123DFFF   0x81FA0000-0x81FFD7BF
hd.dll         0x8123E000-0x8123FFFF   0x82002000-0x82002954
osaxst0.dll    0x81240000-0x81246FFF   0x82003000-0x82006328
giisr.dll      0x81249000-0x8124AFFF   0x82008000-0x82008504
osaxst1.dll    0x82042000-0x82046FFF
kd.dll         0x82047000-0x82060FFF

If you’re looking up the symbol for a variable, then for a DLL or EXE with a “Relocated Data Address Range” the variable would fall inside that range; for a DLL or EXE that does not have relocated data, the variable would fall inside the image address range.  So you may need to sort on both in order to find the right address range.
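The rule above for variables can be sketched in a few lines of Python. This is only an illustration: the table is a hand-copied subset of the Modules and Symbols listing, and the names MODULES and module_for_variable are made up for this example.

```python
# Hypothetical table: (name, image range, relocated data range or None),
# hand-copied from a few rows of the Modules and Symbols listing.
MODULES = [
    ("devmgr.dll",   (0x03EE0000, 0x03EE9FFF), (0x01FE9000, 0x01FE92A4)),
    ("coredll.dll",  (0x03F80000, 0x03FD8FFF), (0x01FFB000, 0x01FFBD64)),
    ("kbdmouse.dll", (0x01BA0000, 0x01BC1FFF), None),
    ("nk.exe",       (0x81200000, 0x8123DFFF), (0x81FA0000, 0x81FFD7BF)),
]

def module_for_variable(address):
    """Return the module whose data area contains `address`, or None."""
    for name, image_range, data_range in MODULES:
        # A module with relocated data keeps its variables in that range;
        # a module without it keeps them inside the image range.
        low, high = data_range if data_range is not None else image_range
        if low <= address <= high:
            return name
    return None

print(module_for_variable(0x01FFB010))  # falls in coredll.dll's relocated data
print(module_for_variable(0x01BA1000))  # kbdmouse.dll has no relocated data
```

For function addresses you would check the image range only, which is the same loop with the data range ignored.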

Another option you have is that you can type “gi proc” or “gi mod” in the Target Control Window to get the location where the code is loaded.  The exception for this is the kernel, nk.exe, which reports a different value under “gi proc” than where its code is loaded.  (These lists don’t show relocated data addresses, so they’re only useful for looking up function symbols.)

Target Control Window

Windows CE>gi proc
PROC: Name         hProcess: CurAKY :dwVMBase:CurZone
P00:  NK.EXE       03fb4002 00000001 c2000000 00000000
P01:  filesys.exe  03f4b69e 00000002 04000000 00000020
P02:  shell.exe    03dcae86 00000004 06000000 00000000
P03:  device.exe   e3d60f6e 00000008 08000000 00000000
P05:  gwes.exe     4389f8aa 00000020 0c000000 00000000
P08:  services.exe 03497182 00000100 12000000 00000000

Windows CE>gi mod
MOD: Name          pModule :dwInUSE :dwVMBase:CurZone
M35: coredll.dll   83fb4794 7ffbffff 03f80000 00000000
M45: devmgr.dll    83d36250 00000008 03ee0000 00000000
M58: fsdmgr.dll    83fb1ea0 00000002 03f50000 00000000
M60: giisr.dll     8390c5dc 00000009 81249000 00000000
M65: hd.dll        83fb425c 00000001 8123e000 00000000
M75: kbdmouse.dll  8383be54 00000020 01ba0000 00000000
M76: kd.dll        82ce7e40 00000001 82047000 00000000
M80: lmemdebug.dll 83d60d30 0000003c 01ac0000 00000000
M98: osaxst0.dll   83fb44c0 00000001 81240000 00000000
M99: osaxst1.dll   83103d6c 00000001 82042000 00000000

Other possible options – if you have data from a Remote Kernel Tracker / CeLog output file (.clg), that file usually contains a “re-sync” (the output of the CeLogReSync API) where CeLog dumps a list of all the existing threads, processes and modules.  (This list doesn’t contain relocated data addresses either.)  Here is some sample output from the “readlog” command-line tool which parses CeLog output files, edited a little to fit this web format.

Readlog.exe Output File
--:--:--.---.--- : Marker, counter frequency=1193180 Hz, default thread quantum=100
0:18:33.578.716 : ProcessCreate, dwVMBase=0x81200000, NK.EXE
0:18:33.578.779 : ProcessCreate, dwVMBase=0x04000000, filesys.exe
0:18:33.578.870 : ProcessCreate, dwVMBase=0x06000000, shell.exe
0:18:33.578.901 : ProcessCreate, dwVMBase=0x08000000, device.exe
0:18:33.579.666 : ProcessCreate, dwVMBase=0x0C000000, gwes.exe
0:18:33.579.979 : ProcessCreate, dwVMBase=0x12000000, services.exe
0:18:33.581.421 : ModuleLoad, dwBase=0x82047000, kd.dll
0:18:33.581.450 : ModuleLoad, dwBase=0x82042000, osaxst1.dll
0:18:33.582.422 : ModuleLoad, dwBase=0x01BA0000, kbdmouse.dll
0:18:33.582.559 : ModuleLoad, dwBase=0x81249000, giisr.dll
0:18:33.583.481 : ModuleLoad, dwBase=0x03EE0000, devmgr.dll
0:18:33.583.494 : ModuleLoad, dwBase=0x01AC0000, lmemdebug.dll
0:18:33.583.673 : ModuleLoad, dwBase=0x03F50000, fsdmgr.dll
0:18:33.583.699 : ModuleLoad, dwBase=0x03F80000, coredll.dll
0:18:33.583.714 : ModuleLoad, dwBase=0x81240000, osaxst0.dll
0:18:33.583.727 : ModuleLoad, dwBase=0x8123E000, hd.dll
--:--:--.---.--- : Sync End Marker
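
If you have a lot of readlog output to dig through, a small script can pull the base addresses out for you. The Python sketch below assumes the lines look exactly like the edited sample above (real readlog output may be laid out differently), and the names LINE_RE and parse_resync are just made up for this example.

```python
import re

# Matches the "ProcessCreate, dwVMBase=0x..., name" and
# "ModuleLoad, dwBase=0x..., name" lines as they appear in the sample above.
LINE_RE = re.compile(
    r"(ProcessCreate|ModuleLoad), dw(?:VM)?Base=0x([0-9A-Fa-f]+), (\S+)"
)

def parse_resync(lines):
    """Return a list of (base address, name, event kind), sorted by base."""
    table = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # markers and other events are skipped
            kind, base, name = match.groups()
            table.append((int(base, 16), name, kind))
    return sorted(table)

sample = [
    "0:18:33.578.901 : ProcessCreate, dwVMBase=0x08000000, device.exe",
    "0:18:33.583.481 : ModuleLoad, dwBase=0x03EE0000, devmgr.dll",
    "--:--:--.---.--- : Sync End Marker",
]

for base, name, kind in parse_resync(sample):
    print(f"0x{base:08X} {name} ({kind})")
```

Note that the ProcessCreate lines give the slot base (dwVMBase), not the address where the EXE's code starts.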

Another option you have is to use the ToolHelp APIs (CreateToolhelp32Snapshot, etc) to programmatically gather this data.

It’s relatively straightforward to figure out, given an address, which module that address falls inside.  Use the “Price Is Right” matching: whichever module is closest to that address, without going over, is the matching module.  [“The Price is Right” is an old US TV game show.]  Address 0x8123FFFF is inside hd.dll; 0x81240000 is inside osaxst0.dll.
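
Here is the “Price Is Right” match as a rough Python sketch, using a handful of base addresses from the listings above. The names MODULE_BASES and closest_module are invented for illustration. It only compares against base addresses, so it will happily name the nearest module below an address even if that address is past the end of the module.

```python
import bisect

# (base address, name) pairs copied from the listings above, sorted by base.
MODULE_BASES = sorted([
    (0x8123E000, "hd.dll"),
    (0x81240000, "osaxst0.dll"),
    (0x81249000, "giisr.dll"),
    (0x82042000, "osaxst1.dll"),
    (0x82047000, "kd.dll"),
])

def closest_module(address):
    """Return the module with the highest base that does not exceed address."""
    bases = [base for base, _ in MODULE_BASES]
    # bisect_right finds the first base greater than address;
    # the entry just before it is the closest without going over.
    index = bisect.bisect_right(bases, address) - 1
    if index < 0:
        return None  # address is below every known module
    return MODULE_BASES[index][1]

print(closest_module(0x8123FFFF))
print(closest_module(0x81240000))
```

If you have the end addresses too (as in the Modules and Symbols Window), it is worth checking that the address is not past the end of the module you matched.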

The main thing you have to be careful about is recognizing addresses that are inside process slots, and converting them to slot 0 addresses.  Address 0x09EE5678 is inside devmgr.dll, not device.exe.  (ERRATA: See comments below for the error in this statement.  I've left the text here unchanged so that you know what I was talking about.  For the sake of the remainder of this write-up, just pretend devmgr.dll loads at address range 0x01EE0000-0x01EE9FFF instead of the range I listed in all the examples above.)
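
The slot conversion is just bit masking. The sketch below assumes the classic layout of 32 MB slots, with the process slots starting at 0x04000000 (where filesys.exe sits in the "gi proc" output) and ending below 0x42000000; those boundaries and the function names are my assumptions for illustration, so check them against your OS version.

```python
SLOT_SIZE = 0x02000000           # assumed: 32 MB per slot
SLOT_MASK = SLOT_SIZE - 1        # low 25 bits = offset within the slot
FIRST_PROCESS_SLOT = 0x04000000  # assumed: slots 0 and 1 sit below this
PROCESS_SLOT_LIMIT = 0x42000000  # assumed: end of the per-process slots

def slot_base(address):
    """Return the start of the slot that contains address."""
    return address & ~SLOT_MASK & 0xFFFFFFFF

def to_slot0(address):
    """Map a process-slot address to the equivalent slot 0 address."""
    if FIRST_PROCESS_SLOT <= address < PROCESS_SLOT_LIMIT:
        return address & SLOT_MASK
    return address  # slot 0/1 or kernel address: leave it alone

print(hex(slot_base(0x12345678)))  # slot where services.exe runs
print(hex(to_slot0(0x09EE5678)))   # the devmgr.dll example above
```

Once the address is in slot 0 form you can match it against the DLL ranges as usual.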

Figuring out what function or global variable it is

Once you know what DLL or EXE the address is within, you can simply subtract the DLL or EXE base address from the address in question, to get the offset of the address from the start of the DLL or EXE.  So address 0x09EE5678 is at offset 0x00005678 from the start of devmgr.dll.

Then you look up the offset in the .map file for the DLL or EXE.  You should already have the .map files for most modules you work with, but if you are missing one, it may be because the WINCEMAP environment variable must be set to 1 when the module is linked in order to produce the .map file.

Here’s an excerpt from devmgr.map.  The important things in the .map file are the function names and the addresses to the right of them (the column just before the “f”):

Preferred load address is 10000000

 0001:000044b0  _ConvertStringToGuid         100054b0 f  devcore:devpnp.obj
 0001:00004546  _FindDeviceInterface         10005546 f  devcore:devpnp.obj
 0001:0000456c  _I_AdvertiseDeviceInterface  1000556c f  devcore:devpnp.obj
 0001:00004774  _PnpAdvertiseInterfaces      10005774 f  devcore:devpnp.obj
 0001:00004a9b  _PnpDeadvertiseInterfaces    10005a9b f  devcore:devpnp.obj
 0001:00004ae5  _InitializePnPNotifications  10005ae5 f  devcore:devpnp.obj

The address on the right is saying where the function would live in RAM if the DLL was loaded at its PreferredLoadAddress – which is stored at the top of the .map file, but is almost always 0x10000000 for a DLL.  Which makes it easy, because you just chop off the top part of the address to find out where the function would live if the DLL was loaded starting at 0.  So offset 0x00005678 from the beginning of devmgr.dll is the symbol I_AdvertiseDeviceInterface.  Any offset between 0x0000556C and 0x00005773 inside devmgr.dll falls inside I_AdvertiseDeviceInterface.  And, since devmgr.dll is loaded at address 0x03EE0000, any address between 0x03EE556C and 0x03EE5773 corresponds to I_AdvertiseDeviceInterface.  So again you are basically using “Price is Right” matching.

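EXEs are a little different, but even easier.  Most EXEs will be linked with a PreferredLoadAddress of 0x00010000, and that is also the spot within the slot where the Windows CE kernel always loads an EXE.  So the base address and the preferred load address cancel out: once you have the offset of the address within its slot (0x00345678 for 0x12345678), there’s no further subtraction involved.  Just compare it against the addresses to the right of the symbols.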

Once you get used to doing this stuff, it’s not that hard, just a little manual labor.  In general the formula for going from ADDRESS --> SYMBOL is:

    (.map file offset) = (address) - (module base addr) + (module preferred load addr)

And remember to use “Price is Right” matching for the module and the symbol: whichever has the closest address without going over is the closest match.
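
To make the formula concrete, here is a rough Python sketch that runs it against part of the devmgr.map excerpt. The regular expression assumes symbol lines laid out like the excerpt, and load_symbols and resolve are names invented for this example, not part of any tool.

```python
import bisect
import re

MAP_EXCERPT = """\
 0001:000044b0 _ConvertStringToGuid 100054b0 f devcore:devpnp.obj
 0001:00004546 _FindDeviceInterface 10005546 f devcore:devpnp.obj
 0001:0000456c _I_AdvertiseDeviceInterface 1000556c f devcore:devpnp.obj
 0001:00004774 _PnpAdvertiseInterfaces 10005774 f devcore:devpnp.obj
"""

# section:offset, symbol name, then the address column we care about.
SYMBOL_RE = re.compile(
    r"^\s*[0-9a-fA-F]{4}:[0-9a-fA-F]{8}\s+(\S+)\s+([0-9a-fA-F]{8})\s"
)

def load_symbols(map_text):
    """Return a sorted list of (map address, symbol name)."""
    symbols = []
    for line in map_text.splitlines():
        match = SYMBOL_RE.match(line)
        if match:
            symbols.append((int(match.group(2), 16), match.group(1)))
    return sorted(symbols)

def resolve(address, module_base, preferred_base, symbols):
    """Apply the formula, then pick the closest symbol without going over."""
    map_address = address - module_base + preferred_base
    index = bisect.bisect_right([a for a, _ in symbols], map_address) - 1
    if index < 0:
        return None  # below the first symbol in the table
    start, name = symbols[index]
    return name, map_address - start  # symbol and offset into it

symbols = load_symbols(MAP_EXCERPT)
print(resolve(0x03EE5678, 0x03EE0000, 0x10000000, symbols))
```

The second value returned is the offset from the start of the function, which is what you need if you go on to look up the source line.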

Figuring out what source line it is

You can even get down to the source line if you have the COD files to look at.  I have rarely had to do this kind of lookup, but it is occasionally useful.  The two main uses I’ve had for it are: figuring out exactly which line of source/disassembly a crash or debug break occurred on, and figuring out which instance of a function call is involved in a callstack.  For example if function Foo calls function Bar twice, and I’m looking at a callstack which involves Foo calling Bar, I may want to know which invocation of Bar it was inside Foo.  The address given for Foo in the stack is the address where the jump to Bar occurred, so I can use the address given for Foo in the stack to figure out what line of code the jump occurred on.

COD files come from compiling the code with the WINCECOD environment variable set to 1.  By default WINCECOD is not set, so you will not have COD files by default.  And COD files are not shipped with the Platform Builder CDs or with the Windows CE source.  So unless you have buildable source for the OS, this work is only possible for your own code.

You may have noticed that the MAP file lists the OBJ file the function came from.  For example I_AdvertiseDeviceInterface came from the devpnp.obj file which was linked into devcore.lib.  This means the code was in “devpnp.c” or “devpnp.cpp.”  If you know where this source file is, you can go to the right directory, set WINCECOD=1, build -c, and then open the COD file from obj\%_TGTCPU%\%WINCEDEBUG%.  It’ll be named after the C or CPP file – in this case devpnp.cod.  The function name will appear several times in the COD file, but in only one case will the function name be followed by the word “PROC,” and this is the beginning of the assembly code for that function.  Here’s a sample that’s munged a little from reality.

_I_AdvertiseDeviceInterface PROC NEAR ; COMDAT

; 348 : {

  001e8 55              push ebp
  001e9 8b ec           mov ebp, esp
  001eb 81 ec 14 02 00
        00              sub esp, 532 ; 00000214H
  001f1 a1 00 00 00 00  mov eax, DWORD PTR ___cke
  001f6 53              push ebx
  001f7 56              push esi

; 349 : ULONG result = ERROR_SUCCESS;

  001f8 33 f6           xor esi, esi

As you can see, the source code and line numbers are included in the COD file next to the assembly code that implements the source.  Each assembly instruction is labeled with an address.  So the COD file provides the mapping from function offset to source line.  Sometimes the assembly for the function is labeled starting with 0, and sometimes not.  (I believe it may depend on which compiler is being used – meaning it varies by CPU type.)  If not, you have to take the function start label into account.  So in this example the function starts at label 0x001E8.

So the formula for the line of assembly to look for is:

    (assembly label) = (.map file offset) - (function start addr from .map) + (function start label from .cod)

And to complete my example, subtracting the start offset of I_AdvertiseDeviceInterface 0x0000556C from the address 0x00005678, I find that the address I'm interested in is at offset 0x0000010C from the beginning of the function.  Add that to the start label 0x001E8 of I_AdvertiseDeviceInterface from the COD file, and I am looking for the assembly line labeled 0x002F4.  This line will tell me exactly what assembly code is at that address, and what line of source code that corresponds to.
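
The same arithmetic as a tiny Python sketch, in case you want to script it. The function name assembly_label is made up for this example, and the five-digit formatting simply mimics the labels in the COD sample above.

```python
def assembly_label(map_offset, function_start, cod_start_label):
    """Return the COD label for an offset, per the formula above."""
    return map_offset - function_start + cod_start_label

# Values from the worked example: offset 0x5678 inside devmgr.dll,
# I_AdvertiseDeviceInterface starting at 0x556C, COD start label 0x001E8.
label = assembly_label(0x5678, 0x556C, 0x001E8)
print(format(label, "05x"))  # search the COD file for this label
```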