Understanding "Magic" Pointers and Offsets

In this blog post I explain how "magic" pointers and offsets work.

I borrowed the term "magic" to refer to pointer and offset expressions like these:

dd poi(0x129514 + 0x18) + 0x8 L2

du poi(0x0007de95)

du poi(poi(poi(0x129514 + 0x9c)) + 0x4)

dd poi(0x129514 + 0x34)

The analogy is the "magic number": a literal value written directly into a program instead of a named constant, like:

b = 3.14; // Magic number.

Instead of:

const float PI = 3.14;

b = PI;

I decided to blog about it because I sometimes run into people who look at these expressions and wonder how they can return valid data and how to create them.

Actually it's very simple!

Let me explain: The commands above display values or Unicode strings from specific locations.

These locations are selected based on pointers and offsets. The offsets usually come from structs (struct is short for structure and is a C/C++ reserved word) or classes. You can normally see the layout of a struct or class only with private symbols or source code access, so you need one of those to create a command that displays a specific field without using any symbol.
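To make the notation concrete, here is a minimal C++ sketch of what an expression such as poi(base + 0x18) + 0x8 evaluates to. The helper names poi and resolve are hypothetical stand-ins written for this illustration; they mimic the debugger operator but read memory in the current process, not in a debuggee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the debugger's poi() operator: returns the
// pointer-sized value stored at `address` in the current process.
inline std::uintptr_t poi(std::uintptr_t address) {
    assert(address != 0);
    std::uintptr_t value = 0;
    std::memcpy(&value, reinterpret_cast<const void*>(address), sizeof(value));
    return value;
}

// Equivalent of "poi(base + firstOffset) + secondOffset": follow the pointer
// stored firstOffset bytes into the object at base, then step secondOffset
// bytes into whatever that pointer refers to.
inline std::uintptr_t resolve(std::uintptr_t base,
                              std::size_t firstOffset,
                              std::size_t secondOffset) {
    return poi(base + firstOffset) + secondOffset;
}
```

The debugger then displays the memory at the resulting address: dd shows double words there, and du shows a Unicode string.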

Ok, at this point you may wonder why not use the variable names instead of using offsets.

The answer is that sometimes you just don't have private symbols.

Thus, when you use specific offsets from a specific address, you work around the lack of symbols.

This is great; therefore, I use this approach in some of my scripts. That's why they work for you with just public symbols or no symbols at all. The downside is that your command or script depends on specific offsets and pointers. In other words, if the internal class or struct changes, you might need to change your command, or you won't get the right information anymore.

This is funny: very often I see that inexperienced debuggers get impressed when they see a command using several layers of offsets.

They probably don't realize that the person who wrote the command only memorized it from using it over and over, and could write it in the first place because he or she had access to the source code!

Yes, this is how we can create these kinds of commands using these magic pointers! :)

A lot of engineers at Microsoft, including myself, have a cheat sheet file where we save many "magic" pointers and offsets.

Some of the commands I have:

- Extracts PerfMon counters from mscorsvr/mscorwks.

- Extracts ASP page number being executed from an ASP call.

- Extracts ASP template from an ASP call.

- Extracts ASP source code from an ASP call.

- Extracts SQL command from OLEDB call.

- Extracts Connection string from OLEDB call.

- Extracts HRESULT from call stack.

- Extracts COM Object from OLE32 call.

- Extracts Remote Server IP Address from Winsock call.

We do that because we might forget the specific offsets and then find ourselves without access to private symbols, for example when working from home without access to the office computer, or when working at a customer's site.

When we have this situation, we just open our cheat sheet file and use the command to get the information we want without needing to read the source code.

Whenever you see this kind of "magic" pointer or offset you should consider it might not work with newer product versions. Keep it in mind, and you won't be frustrated.

That said, let me demonstrate how a "magic" pointer works and what can happen when the struct or class it relies on changes.

Source code for the application MagicPointer:

#include "stdafx.h"

#include <conio.h>

struct stData

{

    TCHAR szName[21];

    int nVersion;

    TCHAR cLetter;

    int nCode;

    // int nCode2;

    TCHAR szSomeSentence[51];

};

void DoSomething(stData* pst);

                            

int _tmain(int argc, _TCHAR* argv[])

{

    stData* pstStruct = new stData();

         

    DoSomething(pstStruct);

   

    _getch();

    delete pstStruct;

   

    return 0;

}

void DoSomething(stData* pst)

{

    _tcscpy_s(pst->szName, L"Test");

    pst->nVersion = 5;

    pst->cLetter = L'D';

    pst->nCode = 3;

    _tcscpy_s(pst->szSomeSentence, L"Some Sentence");

}

I used Visual C++ 2005 and compiled the application above using a Debug build.

When debugging this application using symbols, I have the structure below after executing the last line from DoSomething():

Local var @ 0x17fe44 Type stData*

0x005f5600

   +0x000 szName : [21] "Test"

   +0x02c nVersion : 5

   +0x030 cLetter : 0x44 'D'

   +0x034 nCode : 3

   +0x038 szSomeSentence : [51] "Some Sentence"
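The offsets in this dump follow from the field sizes plus alignment padding. The sketch below recomputes them by hand, assuming a Unicode build where TCHAR is 2 bytes and int is 4 bytes, and assuming the common rule that each field starts at a multiple of its element size. The alignUp helper and the constant names are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>

// Rounds `offset` up to the next multiple of `alignment`.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Assumed sizes for a 32-bit Unicode build (not taken from the compiler).
constexpr std::size_t kTcharSize = 2;  // TCHAR == wchar_t on Windows
constexpr std::size_t kIntSize   = 4;

// Each field starts where the previous one ends, rounded up to its alignment.
constexpr std::size_t kNameOffset     = 0;
constexpr std::size_t kVersionOffset  = alignUp(kNameOffset + 21 * kTcharSize, kIntSize);  // 42 -> 44
constexpr std::size_t kLetterOffset   = alignUp(kVersionOffset + kIntSize, kTcharSize);    // 48
constexpr std::size_t kCodeOffset     = alignUp(kLetterOffset + kTcharSize, kIntSize);     // 50 -> 52
constexpr std::size_t kSentenceOffset = alignUp(kCodeOffset + kIntSize, kTcharSize);       // 56

// These match the offsets shown in the debugger dump.
static_assert(kVersionOffset  == 0x2c, "nVersion");
static_assert(kLetterOffset   == 0x30, "cLetter");
static_assert(kCodeOffset     == 0x34, "nCode");
static_assert(kSentenceOffset == 0x38, "szSomeSentence");
```

The two padding bytes after szName and after cLetter are why the offsets are not simply the running sum of the field sizes.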

Since I have symbols and source code access, I can see the offsets above for each field.

Let’s suppose I want to dump the szSomeSentence field.

I can use:

0:000> du @@c++(pst->szSomeSentence)

005f5638 "Some Sentence"

Or:

0:000> ?? (wchar_t*) pst->szSomeSentence

wchar_t * 0x005f5638

 "Some Sentence"

Cool, huh? I’m using the C++ syntax from WinDbg.

 

I also know the address of the DoSomething() method:

004114f0 MagicPointer!DoSomething (struct stData *)

Now, let’s try the same process but without using symbols.

I reloaded the application and broke into the debugger when the _getch() was called.

The approaches above no longer work, because without symbols the debugger cannot resolve the variable names.

 

 

How can I overcome this? Because I have seen the source code and private symbols, I know the offsets, so I can create a magic pointer.

First, I set a breakpoint where the function is about to return. At that point the data has already been assigned to the struct.

 

Then I dump the Unicode string, but this time using pointers and offsets, like:

bp 00411560          ← Breakpoint where the method is about to end.

0:000> kvn

 # ChildEBP RetAddr Args to Child

00 0017fe3c 00411478 008d5600 00000000 00000000 MagicPointer!DoSomething+0x70 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\development\my tools\personal blog\article #17\magicpointer\magicpointer\magicpointer.cpp @ 40]

01 0017ff48 00411bd6 00000001 008d1188 008d1278 MagicPointer!wmain+0x88 (FPO: [Non-Fpo]) (CONV: cdecl) [c:\development\my tools\personal blog\article #17\magicpointer\magicpointer\magicpointer.cpp @ 24]

02 0017ff98 00411a1d 0017ffac 776119f1 7efde000 MagicPointer!__tmainCRTStartup+0x1a6

03 0017ffa0 776119f1 7efde000 0017ffec 77c2d109 MagicPointer!wmainCRTStartup+0xd

04 0017ffac 77c2d109 7efde000 00178231 00000000 kernel32!BaseThreadInitThunk+0xe

05 0017ffec 00000000 0041108c 7efde000 00000000 ntdll!_RtlUserThreadStart+0x23

008d5600, the first argument passed to DoSomething(), is the address of our struct.

Then I use:

0:000> du 008d5600 + 0x38

008d5638 "Some Sentence"

Here it is!

Even without using symbols for the MagicPointer application I can get the information because I know the offsets.
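The same idea can be sketched in C++: a reader that knows nothing about stData and trusts only a base address and an offset, roughly what du base + 0x38 does. The function name readUnicodeAt and its maxChars guard are hypothetical, and char16_t stands in for the 2-byte TCHAR of a Unicode build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Reads a NUL-terminated UTF-16 string that starts `offset` bytes past
// `base`, stopping after at most `maxChars` code units. The function has no
// knowledge of the struct layout; a wrong offset silently yields wrong data.
inline std::u16string readUnicodeAt(const void* base,
                                    std::size_t offset,
                                    std::size_t maxChars) {
    assert(base != nullptr);
    const unsigned char* cursor = static_cast<const unsigned char*>(base) + offset;
    std::u16string text;
    for (std::size_t i = 0; i < maxChars; ++i) {
        char16_t unit = 0;
        // memcpy avoids alignment and aliasing problems on raw bytes.
        std::memcpy(&unit, cursor + i * sizeof(unit), sizeof(unit));
        if (unit == u'\0') {
            break;
        }
        text.push_back(unit);
    }
    return text;
}
```

Called with the struct address, 0x38, and 51, it would return the sentence for the layout dumped earlier, provided the offset still matches the build being inspected.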

Now, what happens when we change the struct? It might break our magic pointer.

Let’s see…

Remove the comment from the commented field nCode2 and rebuild the application.

Reload WinDbg and, without using symbols for the MagicPointer application, use the breakpoint above and the same instructions.

bp 00411560

kvn

Then use:

0:000> kvn

 # ChildEBP RetAddr Args to Child

WARNING: Stack unwind information not available. Following frames may be wrong.

00 0017fe3c 00411478 00365600 00000000 00000000 MagicPointer+0x11560

01 0017ff48 00411bd6 00000001 00361188 00361278 MagicPointer+0x11478

02 0017ff98 00411a1d 0017ffac 776119f1 7efde000 MagicPointer+0x11bd6

03 0017ffa0 776119f1 7efde000 0017ffec 77c2d109 MagicPointer+0x11a1d

04 0017ffac 77c2d109 7efde000 00177837 00000000 kernel32!BaseThreadInitThunk+0xe

05 0017ffec 00000000 0041108c 7efde000 00000000 ntdll!_RtlUserThreadStart+0x23

du 00365600 + 0x38

What happened?

 

Now we get garbage because the field szSomeSentence is no longer at offset 0x38.

However, if you use symbols and the instructions I used before referring to variable names, it works without problems!

So we need to change our offset. Of course, since we have access to the source code and symbols, we can do that very easily and create a command that again dumps the szSomeSentence value without symbols. You could also get the same information by reverse engineering the application if you don’t have access to private symbols or source code, but I’m not considering that approach here; there is no need for it when we have, or had, access to the source code.

 

Ok, so now we know there’s a new field. It’s an int, so it occupies a double word (4 bytes on Win32), and every field after it moves up by that amount.

What do we need to do? Shift our offset by those 4 bytes: 0x38 + 0x4 = 0x3c.

The new one is:

du <address> + 0x3c
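The shift can be checked with two mirrored layouts. This is a sketch under stated assumptions: StDataV1 and StDataV2 are hypothetical names for the struct before and after nCode2 is enabled, char16_t stands in for the 2-byte TCHAR, and std::int32_t stands in for the 4-byte int.

```cpp
#include <cstddef>
#include <cstdint>

// Layout before the change (nCode2 commented out).
struct StDataV1 {
    char16_t     szName[21];
    std::int32_t nVersion;
    char16_t     cLetter;
    std::int32_t nCode;
    char16_t     szSomeSentence[51];
};

// Layout after the change (nCode2 enabled).
struct StDataV2 {
    char16_t     szName[21];
    std::int32_t nVersion;
    char16_t     cLetter;
    std::int32_t nCode;
    std::int32_t nCode2;
    char16_t     szSomeSentence[51];
};

constexpr std::size_t kSentenceOffsetV1 = offsetof(StDataV1, szSomeSentence);
constexpr std::size_t kSentenceOffsetV2 = offsetof(StDataV2, szSomeSentence);

// The old magic offset and the new one.
static_assert(kSentenceOffsetV1 == 0x38, "offset before nCode2 was added");
static_assert(kSentenceOffsetV2 == 0x3c, "offset after nCode2 was added");

// In the new build the stale offset 0x38 lands on nCode2, not on the string.
static_assert(offsetof(StDataV2, nCode2) == 0x38, "stale offset hits nCode2");
```

This is why the stale command printed garbage: it interpreted the bytes of nCode2, and whatever followed, as a Unicode string.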

Let’s try it…

 

It worked!

Of course, this is a very simple example. Real code usually has classes whose members are pointers to structs, and so on. The expressions become more complex, but the principle is the same.
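As a rough illustration of that more complex case, the sketch below walks one level of indirection using nothing but addresses and offsets. Session, Details, and the helper names are hypothetical types invented for this example; the offsets come from offsetof here, whereas in a debugger session you would have copied them from the source code or private symbols.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical types: an outer object whose member points at a struct.
struct Details {
    std::int32_t nId;
    std::int32_t nCode;
};

struct Session {
    std::int32_t nState;
    Details*     pDetails;
};

// One "poi" step: read the pointer stored at base + offset.
inline const unsigned char* followPointer(const unsigned char* base,
                                          std::size_t offset) {
    assert(base != nullptr);
    std::uintptr_t raw = 0;
    std::memcpy(&raw, base + offset, sizeof(raw));
    return reinterpret_cast<const unsigned char*>(raw);
}

// Same shape as a debugger expression that dereferences the pDetails slot
// and then adds the offset of nCode, without naming either field.
inline std::int32_t readCodeViaOffsets(const void* session) {
    const unsigned char* base = static_cast<const unsigned char*>(session);
    const unsigned char* details =
        followPointer(base, offsetof(Session, pDetails));
    std::int32_t code = 0;
    std::memcpy(&code, details + offsetof(Details, nCode), sizeof(code));
    return code;
}
```

Each extra pointer member in the chain adds one more poi around the expression, which is how commands with several nested layers come about.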

Actually, it sounds complex but it’s simple.