Why does the debugger show me the wrong virtual function?


Pointers to virtual functions all look basically the same and therefore, as we learned last time, all end up merged into a single function. Here’s a contrived example:

class Class1
{
public:
 virtual int f1() { return 0; }
 virtual int f2() { return 1; }
};

class Class2
{
public:
 virtual int g1() { return 2; }
 virtual int g2() { return 3; }
};

int (Class1::*pfn1)() = &Class1::f2;
int (Class2::*pfn2)() = &Class2::g2;

If you take a look at pfn1 and pfn2, you’ll see that they point to the same function:

0:000> dd pfn1 l1
01002000  010010c8
0:000> dd pfn2 l1
01002004  010010c8
0:000> u 10010c8 l2
010010c8 8b01     mov     eax,[ecx]           ; first vtable
010010ca ff6004   jmp     dword ptr [eax+0x4] ; second function

That’s because the virtual functions Class1::f2 and Class2::g2 are both stored in the same location relative to the respective object pointer: They are the second entry in the first vtable. Therefore, the code to call those functions is identical and consequently has been merged by the linker.
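To make the merged code concrete, here is a hand-rolled model of what those two instructions do. It is only a sketch: the Object struct, the Slot typedef, and the free functions standing in for the member functions are all hypothetical, and a real compiler's vtable layout is an implementation detail. The point is that one routine, which knows nothing about Class1 or Class2, serves both.

```cpp
// Hand-rolled model of a vtable (hypothetical names; not the compiler's real layout).
struct Object;
typedef int (*Slot)(Object*);

struct Object
{
 const Slot* vtable; // first member, like the vtable pointer at offset 0
};

// Free functions standing in for the virtual member functions.
int Class1_f1(Object*) { return 0; }
int Class1_f2(Object*) { return 1; }
int Class2_g1(Object*) { return 2; }
int Class2_g2(Object*) { return 3; }

const Slot Class1_vtable[] = { Class1_f1, Class1_f2 };
const Slot Class2_vtable[] = { Class2_g1, Class2_g2 };

// Model of the shared two-instruction stub: "call whatever is in the second slot".
int CallSecondSlot(Object* self)
{
 const Slot* vtable = self->vtable; // mov eax,[ecx]
 return vtable[1](self);            // jmp dword ptr [eax+0x4]
}

// The same stub reaches Class1_f2 for one object and Class2_g2 for the other.
int DemoClass1() { Object o = { Class1_vtable }; return CallSecondSlot(&o); }
int DemoClass2() { Object o = { Class2_vtable }; return CallSecondSlot(&o); }
```

In this model, DemoClass1 returns 1 and DemoClass2 returns 3, even though both go through the identical CallSecondSlot routine.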

Notice that the function pointers are not direct pointers to the concrete implementations of Class1::f2 and Class2::g2, because the function pointer might be applied to a derived class that overrides the virtual function:

class Class3 : public Class1
{
public:
 virtual int f2() { return 9; }
};

Class3 c3;
(c3.*pfn1)(); // calls Class3::f2

Applying the function pointer invokes the function on the derived class, which is the whole point of declaring the function Class1::f2 as virtual in the first place.
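A small self-contained version of this behavior is sketched below. CallThrough, CallOnBase, and CallOnDerived are hypothetical helper names added for illustration; the classes are the ones from the example above.

```cpp
class Class1
{
public:
 virtual int f1() { return 0; }
 virtual int f2() { return 1; }
};

class Class3 : public Class1
{
public:
 virtual int f2() { return 9; }
};

// Hypothetical helper: applies a pointer to member function to whatever
// object it is handed. The call is dispatched through the object's vtable.
int CallThrough(Class1& obj, int (Class1::*pfn)())
{
 return (obj.*pfn)();
}

// The same pointer value, &Class1::f2, is used in both cases.
int CallOnBase()
{
 Class1 c1;
 return CallThrough(c1, &Class1::f2); // runs Class1::f2
}

int CallOnDerived()
{
 Class3 c3;
 return CallThrough(c3, &Class1::f2); // runs Class3::f2
}
```

CallOnBase returns 1 and CallOnDerived returns 9, because the pointer identifies a vtable slot rather than a specific function body.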

Note that the C++ language explicitly states that the result of comparing non-null pointers to virtual member functions is “unspecified”, which is language-standards speak for “the result not only depends on the implementation, but the implementation isn’t even required to document how it arrives at the result.”
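If a program needs to tell two such pointers apart, one possible approach (an assumption on my part, not something the article prescribes) is to carry an explicit tag next to the pointer and compare the tags. MethodId, TaggedMethod, kF1, kF2, and the helper functions below are hypothetical names.

```cpp
class Class1
{
public:
 virtual int f1() { return 0; }
 virtual int f2() { return 1; }
};

typedef int (Class1::*Class1Method)();

// Hypothetical tag that identifies the method independently of the pointer value.
enum MethodId { Method_f1, Method_f2 };

struct TaggedMethod
{
 MethodId id;
 Class1Method pfn;
};

const TaggedMethod kF1 = { Method_f1, &Class1::f1 };
const TaggedMethod kF2 = { Method_f2, &Class1::f2 };

// Compiles, but the result is unspecified because the operands
// are non-null pointers to virtual member functions.
bool SamePointer(Class1Method a, Class1Method b)
{
 return a == b;
}

// Well-defined: compares the tags, never the pointers.
bool SameMethod(const TaggedMethod& a, const TaggedMethod& b)
{
 return a.id == b.id;
}

// Calling through the stored pointer works as usual.
int Invoke(Class1& obj, const TaggedMethod& m)
{
 return (obj.*m.pfn)();
}
```

SamePointer is included only to show the comparison that should not be relied upon; SameMethod gives the same answer on every implementation.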

Comments (22)
  1. PM says:

    Slightly off topic, but is there any reliable way to get the index of a virtual function into the vtable in MSVC? On GCC, I take a member function pointer, which then contains the information… As you stated, MSVC on the other hand, generates code that knows the vtable offset. At the moment I’m extracting the offset from the code, but that is an even worse solution than what I am doing on GCC…

    Thanks in advance.

  2. A says:

    but is there any reliable way to get the index of a virtual function into the vtable in MSVC?

    This is a really dangerous operation. It’s totally compiler specific and may break in release vs debug modes. And just when you think you’ve got it right the optimizer can step in and kill what you’re doing in one specific location.

    I can think of several reasons you might want to do this, but I can’t think of one which can’t be replaced with an equally fast yet safe method.

    What are you actually trying to do? If you’re trying to fake a jump table – just use a jump table.

  3. Ben Hutchings says:

    In order to look up a virtual function in an object, not only do you need to know the vtable index, but you also need to know where the vtable pointer is. For example, in a typical 32-bit implementation:

    class B1 {
     virtual void foo() = 0; // offset 0, index 0
    };

    class B2 {
     virtual void bar() = 0; // offset 0, index 0
    };

    class D : public B1, // offset 0
              public B2  // offset 4
    {
     // D::foo has offset 0, index 0
     // D::bar has offset 4, index 0
    };

    Clearly the two vtable pointers "inherited" by D can’t be merged, so only one of them can be at offset 0.

    Even without m.i., there are other reasons why vtable pointers might not be at the start of an object:

    struct B {
     int i; // offset 0
    };

    class D : private B // offset 0
    {
     virtual void foo() {} // offset 4, index 0
    };

  4. PM says:

    Well, yes, I know that it’s dangerous, but I guess it’s necessary. What I am doing is hooking virtual functions. Until now, the index was hardcoded… But I don’t like hardcoded stuff so I thought there might be an easier way…

    Note that I can’t patch the function, because NX-bit / PaX would break that approach… So I am going with virtual function hooking…

  5. Ben Hutchings says:

    PM: AFAIK, NX and PaX don’t prevent you from changing page permissions, though you may be prevented from enabling write and execute at the same time.

  6. PM says:

    I am aware of the problems with multiple inheritance. My code relies on the fact that MSVC seems to be always placing the vtable pointer to the beginning of the object (so the this pointer points to a pointer to the vtable). I can get the this pointer the function expects to be called with using some member function pointer hackery.

    What PaX does is enforce that any page that ever had PROT_WRITE set may never get PROT_EXEC set anymore; So you can’t execute any code you have modified (or generated on runtime).

    http://en.wikipedia.org/wiki/PaX_(Linux)#Enforced_non-executable_pages <– Check the "restricted mprotect" part.

  7. igor1960 says:

    I maybe wrong here and kindly correct me: as I understand about PaX is that yes you can’t use PROT_EXEC|PROT_WRITE to inject and execute later code located on such page. However, your VTable points to a page that has just array of addresses to vtable functions, so it doesn’t have to be a code page and in most cases it’s not. Code of the function itself is located on the code page, no question. However, you are not modifying that code page — what you are modifying is the page with Vtables.

  8. igor1960 says:

    I forgot to mention one more thing: I think you may continue to rely "on the fact that MSVC seems to be always placing the vtable pointer to the beginning of the object" and again somebody please correct me if I’m wrong here. The reason you should strongly continue to rely on that fact is that MSVC is the ground for COM implementation and continues to be such a ground. The MSVC vtable implementation is used by so many of MSFT’s COM-related libraries (ATL, MFC, .NET now, etc.). According to the COM binary layout, however, that is the only way an interface is represented "(so the this pointer points to a pointer to the vtable)". At least in the 32-bit world. I have my own reservations and doubts about what will happen in 64 bits and whether the COM binary layout is supposed to be changed/extended, but I’m not going to speculate on this right now.

  9. Ben Hutchings says:

    Preventing writable pages from ever becoming executable is, erm, quite special. It will tend to break JIT compilers, though they could write code to files and then map them back in. Windows’s execution protection is not that restrictive.

  10. igor1960 says:

    Raymond, as I’m not going to argue with your statement about multiple inheritance and vtable placement in that case. But the only way you can put multiple inherited objects vtables in the middle of the object is due to MSFTs invented ATL_NO_VTABLE… In that case no full objects VTable is getting created.

    However, even in such case of enclosed each object VTable — respective VTable pointer of the enclosed object could be obtained through QueryInterface…

    As to your statement that "vtable can legally be in the middle of a COM object" – this is completely wrong, and I’m not going even to argue with that. Just to point you are out: presence of VTable in the middle of some object doesn’t mean that this is this objects Vtable — in fact according to COM binary layout pointer to vtable of any COM interface should be the first member if you want to implement it manually.

  11. Raymond Chen says:

    There is no requirement that the vtable be the first member of the C++ object. Notice for example that in the third diagram on http://blogs.msdn.com/oldnewthing/archive/2004/02/05/68017.aspx the vtable for "q" is in the middle of the larger object.

    The term "COM object" is somewhat imprecise. From COM’s point of view, a "COM object" is just a pointer to a vtable. How the enclosing C++ object chooses to use the memory that comes before and after that vtable pointer is not COM’s problem.

  12. asdf says:

    The standard doesn’t even say there needs to be a vtable and that code is wrong. The standard only says that static_casts (which your C style cast boils down to) to void* can only be converted back to the original pointer type (5.2.9#10) [except for NULL pointers of course].

  13. igor1960 says:

    Raymond: yes I know that "There is no requirement that the vtable be the first member of the C++ object…".

    My message was specifically targeted as a reply to PM, whose messages I interpreted to mean that he explicitly knows the type of his object and is explicitly hacking into this object’s vtable, and therefore this object has a vtable and probably this object’s base class has a vtable. So, my message was not about some universal case, but about the specific case used by PM. I hope I’ve interpreted his situation correctly.

    Michael Grier [MSFT]: Even though MSFT implements exactly what your sample shows, it’s not the standard that requires such an implementation. However, consider the following:

    assume that struct X has virtual function:

    struct X { int i; virtual void fooX(); };

    struct Y : public X { virtual void foo() { exit(0); } };

    Do you agree with me that the following layout for Y would be created:

    Y =>

    — vTableY1 pointer to Y vtable that has 2 entries [0] points to fooX and [1] points to foo;

    — int i

    —-vTableY2 pointer to Y vtable again that has 2 entries [0] points to fooX and [1] points to foo ;

    So, vTableY1 == vTableY2

    If we do the same and declare Y with ATL_NO_VTABLE then:

    struct ATL_NO_VTABLE Y : public X

    Then the layout of Y will look like this:

    Y =>

    — vTableX pointer to X vtable that has just 1 entry [0] points to fooX;

    — int i

    —-vTableY pointer to Y vtable again that has 2 entries [0] points to fooX and [1] points to foo ;

    Now vTableX != vTableY

    Am I right?

    asdf: as far as I know — you are correct.

  14. M Knight says:

    Ben Hutchings, Windows’s execution protection is more comprehensive. And you can always use VirtualProtect to change the page flags after you have done stuff.

    There is a difference between PAGE_EXECUTE, PAGE_EXECUTE_READWRITE & PAGE_READWRITE. Its only been with the NX & PaX bits that there is any functional difference.

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/memory_protection.asp

  15. Raymond Chen says:

    The vtable can legally be in the middle of a COM object, and in fact it *must* be so in multiple inheritance scenarios (only one vtable can be first; the other has to go somewhere else).

    The pointer to the interface is not the same as the pointer to the object. You can put the vtable anywhere you want.

  16. consider also:

    struct X { int i; };

    struct Y : public X { virtual void foo() { exit(0); } };

    I believe that the standard requires that the vtable for Y be after all the members of X. Otherwise this would not work:

    Y y;
    X *px;
    void *pv;

    y.i = 7;
    pv = (void *) &y;
    px = (X *) pv;
    printf("px->i = %d\n", px->i);

    The compiler is free to move around the "this" pointer on transitioning from a virtual function’s caller to the implementation but for basic single derivation, casting between the base type and derived type is required to be trivial.

  17. Two confusing features that explode when you combine them.

  18. igor1960 says:

    Ben: You are right — I’ve made a bubu on Y, but with ATL_NO_VTABLE — everything just fine in my example. As to your previous remark on .NET and JIT compiler — I wonder if MSFT realizes the danger of having JIT generated Vtables and code itself dynamicaly generated on the heap? Do they at all consider any measures to protect from intrusion? And if they do — what exactly is considered and is it possible at all: as any measure like that in case of .NET would be conflicting with the ground of .NET architecture?

  19. Ok, so I stopped trying to be a language lawyer about when C++ was going through the ISO process so I won’t pretend to be one.

    Maybe I’m confusing the fact that "C type" derivation:

    struct Base { int i; };

    struct Derived { struct Base base; int x; };

    You have to be able to cast a pointer to the base member in Derived to pvoid and that pvoid back to a pointer to a Derived.

    Mea culpa! I haven’t felt the need to go buy a copy of the ISO C++ standard since there are plenty of language lawyers around.

  20. Ben Hutchings says:

    igor1960: In your example, Y will only have one vtable pointer; there’s no need for more. Also the purpose of ATL_NO_VTABLE (which expands to __declspec(novtable)) is to tell the compiler that the class doesn’t need *any* vtables of its own. This is a somewhat useful space optimisation in an abstract class whose constructor and destructor don’t call virtual functions indirectly. (For any other class, it will probably result in disaster.)

  21. Ben Hutchings says:

    Michael Grier wrote: "The compiler is free to move around the "this" pointer on transitioning from a virtual function’s caller to the implementation but for basic single derivation, casting between the base type and derived type is required to be trivial."

    That’s not required by the C++ standard, though I suspect it is implied by the COM ABI. Casting from void * to another pointer type is equivalent to a static_cast, which is defined (in 5.2.9/10) only if the destination type is exactly the type of pointer that was originally converted to void *. However it is possible to reinterpret_cast a pointer to a POD struct to a pointer to its first member and vice versa (9.2/17) which supports pseudo-inheritance in struct definitions shared with C; perhaps that’s what you remember.

Comments are closed.