Pointers to member functions are very strange animals


Pointers to member functions are very strange animals.

Warning: The discussion that follows is specific to the way pointers to member functions are implemented by the Microsoft Visual C++ compiler. Other compilers may do things differently.

Well, okay, if you only use single inheritance, then a pointer to a member function is just a pointer to the start of the function, since all the base classes share the same "this" pointer:

class Simple { int s; void SimpleMethod(); };
class Simple2 : public Simple
  { int s2; void Simple2Method(); };
class Simple3 : public Simple2
  { int s3; void Simple3Method(); };

p -> +-------------+
     | Simple::s   |
     +-------------+
     | Simple2::s2 |
     +-------------+
     | Simple3::s3 |
     +-------------+

Since they all use the same "this" pointer (p), a pointer to a member function of Simple can be used as if it were a pointer to a member function of Simple3 without any adjustment necessary.

The size of a pointer-to-member-function of a class that uses only single inheritance is just the size of a pointer.
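
Here's a minimal sketch of the single-inheritance case in standard C++. The member bodies, the initial values, and the helper names SharesThisPointer and CallSimpleMethodOnSimple3 are mine, added only so there is something to call; struct is used instead of class so the members are accessible. The address comparison reflects what common compilers do, not something the language promises.

```cpp
#include <cassert>

// Stand-ins for the classes above, with bodies so they can be called.
struct Simple            { int s  = 1; int SimpleMethod()  { return s;  } };
struct Simple2 : Simple  { int s2 = 2; int Simple2Method() { return s2; } };
struct Simple3 : Simple2 { int s3 = 3; int Simple3Method() { return s3; } };

// Returns true when the Simple subobject starts at the same address as
// the complete Simple3 object, i.e. both use the same "this" pointer (p).
bool SharesThisPointer() {
    Simple3 obj;
    void* asDerived = &obj;
    void* asBase = static_cast<Simple*>(&obj);
    return asDerived == asBase;
}

// Converts a pointer to a member function of Simple into a pointer to a
// member function of Simple3 and calls through the converted pointer.
int CallSimpleMethodOnSimple3() {
    int (Simple::*pfnSimple)() = &Simple::SimpleMethod;
    int (Simple3::*pfnSimple3)() = pfnSimple;  // implicit base-to-derived conversion
    Simple3 obj;
    int result = (obj.*pfnSimple3)();
    assert(result == obj.s);  // SimpleMethod saw the right object
    return result;
}
```

The conversion compiles on any conforming compiler; whether it costs anything at run time is the implementation detail this entry is about.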

But if you have multiple base classes, then things get interesting.

class Base1 { int b1; void Base1Method(); };
class Base2 { int b2; void Base2Method(); };
class Derived : public Base1, public Base2
  { int d; void DerivedMethod(); };

p -> +------------+
     | Base1::b1  |
q -> +------------+
     | Base2::b2  |
     +------------+
     | Derived::d |
     +------------+

There are now two possible "this" pointers. The first (p) is used by both Derived and Base1, but the second (q) is used by Base2.

A pointer to a member function of Base1 can be used as a pointer to a member function of Derived, since they both use the same "this" pointer. But a pointer to a member function of Base2 cannot be used as-is as a pointer to a member function of Derived, since the "this" pointer needs to be adjusted.
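
You can watch the two "this" pointers diverge with a small sketch like the one below. The helper names Base1Offset and Base2Offset are hypothetical, and the members are made public for convenience. The layout is up to the implementation, so the specific offsets are an assumption about common compilers rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 { int b1 = 0; };
struct Base2 { int b2 = 0; };
struct Derived : Base1, Base2 { int d = 0; };

// Distance in bytes from the start of a Derived object (p) to its
// Base1 subobject. On common compilers this is zero.
std::ptrdiff_t Base1Offset() {
    Derived obj;
    char* p = reinterpret_cast<char*>(&obj);
    char* asBase1 = reinterpret_cast<char*>(static_cast<Base1*>(&obj));
    return asBase1 - p;
}

// Distance in bytes from the start of a Derived object (p) to its
// Base2 subobject (q). The static_cast performs the pointer adjustment.
std::ptrdiff_t Base2Offset() {
    Derived obj;
    char* p = reinterpret_cast<char*>(&obj);
    char* q = reinterpret_cast<char*>(static_cast<Base2*>(&obj));
    assert(q >= p);  // q lies inside the Derived object
    return q - p;
}
```

On the compilers I would expect most readers to use, Base1Offset() is zero and Base2Offset() equals sizeof(Base1).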

There are many ways of solving this. Here's how the Visual Studio compiler decides to handle it:

A pointer to a member function of a multiply-inherited class is really a structure.

+---------------------+
| Address of function |
+---------------------+
| Adjustor            |
+---------------------+

The size of a pointer-to-member-function of a class that uses multiple inheritance is the size of a pointer plus the size of a size_t.

Compare this to the case of a class that uses only single inheritance.

The size of a pointer-to-member-function can change depending on the class!
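
If you want to see what your own compiler does, a sketch along these lines reports the sizes. PmfSize and CheckPmfSizes are hypothetical helpers, not part of any library. Keep in mind that the results are compiler-specific: some compilers use one fixed-size representation for every class, in which case the two numbers come out equal.

```cpp
#include <cassert>
#include <cstddef>

struct Simple { int s; void SimpleMethod() {} };
struct Base1  { int b1; void Base1Method() {} };
struct Base2  { int b2; void Base2Method() {} };
struct Derived : Base1, Base2 { int d; void DerivedMethod() {} };

// Size in bytes of a pointer to a member function of T taking no
// arguments and returning void.
template <typename T>
constexpr std::size_t PmfSize() {
    return sizeof(void (T::*)());
}

// Sanity checks that should hold on any mainstream compiler: the
// pointer must at least be able to hold a code address, and the
// multiple-inheritance flavor is never the smaller one.
void CheckPmfSizes() {
    assert(PmfSize<Simple>() >= sizeof(void*));
    assert(PmfSize<Derived>() >= PmfSize<Simple>());
}
```

Printing PmfSize<Simple>() next to PmfSize<Derived>() is the quickest way to find out whether code that assumes a single size is safe on your compiler.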

Aside: Sadly, this means that Rich Hickey's wonderful technique of Callbacks in C++ Using Template Functors cannot be used as-is. You have to fix the place where he writes the comment

// Note: this code depends on all ptr-to-mem-funcs being same size

Okay, back to our story.

To call through a pointer to a member function, the "this" pointer is adjusted by the Adjustor, and then the function provided is called. A call through a function pointer might be compiled like this:

void (Derived::*pfn)();
Derived d;

(d.*pfn)();

  lea  ecx, d       ; ecx = "this"
  add  ecx, pfn[4]  ; add adjustor
  call pfn[0]       ; call

When would an adjustor be nonzero? Consider the case above. The function Derived::Base2Method() is really Base2::Base2Method() and therefore expects to receive "q" as its "this" pointer. In order to convert a "p" to a "q", the adjustor must have the value sizeof(Base1), so that when the first line of Base2::Base2Method() executes, it receives the expected "q" as its "this" pointer.
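
The two-field structure and the call sequence can be imitated by hand in portable C++. This is only an emulation to make the mechanics visible: EmulatedPmf, Base2MethodBody, Base2Adjustor, CallThrough and CallBase2MethodThroughEmulatedPointer are all names I made up, and the real compiler-generated structure is not accessible this way.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 { int b1 = 1; };
struct Base2 { int b2 = 2; };
struct Derived : Base1, Base2 { int d = 3; };

// Stand-in for the code of Base2::Base2Method: it takes its "this"
// pointer explicitly and expects it to point at a Base2 (q, not p).
int Base2MethodBody(void* self) {
    return static_cast<Base2*>(self)->b2;
}

// Hand-rolled imitation of the two-field pointer described above.
struct EmulatedPmf {
    int (*address)(void*);    // address of function
    std::ptrdiff_t adjustor;  // bytes to add to "this" before the call
};

// Computes the distance from p to q for a Derived object.
std::ptrdiff_t Base2Adjustor() {
    Derived probe;
    return reinterpret_cast<char*>(static_cast<Base2*>(&probe)) -
           reinterpret_cast<char*>(&probe);
}

// Mirrors the lea / add / call sequence: adjust "this", then call.
int CallThrough(Derived& obj, const EmulatedPmf& pfn) {
    char* self = reinterpret_cast<char*>(&obj) + pfn.adjustor;
    return pfn.address(self);
}

int CallBase2MethodThroughEmulatedPointer() {
    Derived obj;
    EmulatedPmf pfn = { &Base2MethodBody, Base2Adjustor() };
    int result = CallThrough(obj, pfn);
    assert(result == obj.b2);  // the callee saw the Base2 subobject
    return result;
}
```

Set the adjustor to zero instead and Base2MethodBody would read b1 through a pointer that is really "p", which is exactly the mistake the adjustor exists to prevent.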

"But why not just use a thunk instead of manually adding the adjustor?" In other words, why not just use a simple pointer to a thunk that goes like this:

Derived::Base2Method thunk:
    add ecx, sizeof(Base1)  ; convert "p" to "q"
    jmp Base2::Base2Method  ; continue

and use that thunk as the function pointer?

The reason: Function pointer casts.

Consider the following code:

void (Base2::*pfnBase2)();
void (Derived::*pfnDerived)();

pfnDerived = pfnBase2;

  mov  ecx, pfnBase2            ; ecx = address
  mov  pfnDerived[0], ecx

  mov  pfnDerived[4], sizeof(Base1) ; adjustor!

We start with a pointer to a member function of Base2, which is a class that uses only single inheritance, so it consists of just a pointer to the code. To assign it to a pointer to a member function of Derived, which uses multiple inheritance, we can re-use the function address, but we now need an adjustor so that the pointer "p" can properly be converted to a "q".

Notice that the code doesn't know what function pfnBase2 points to, so it can't just replace it with the matching thunk. It would have to generate a thunk at runtime and somehow use its psychic powers to decide when the memory can safely be freed. (This is C++. No garbage collector here.)
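
A sketch of that situation in standard C++: the function below receives the pointer as a parameter, so at the point of conversion it has no idea which member function it is dealing with. CallAsDerived and ConvertAndCallDemo are hypothetical names, and the method bodies are filled in only so the result can be checked.

```cpp
#include <cassert>

struct Base1 { int b1 = 1; int Base1Method() { return b1; } };
struct Base2 { int b2 = 2; int Base2Method() { return b2; } };
struct Derived : Base1, Base2 { int d = 3; };

// Converts whatever Base2 member function it is handed into a pointer
// to a member function of Derived, then calls it on a Derived object.
// The target is unknown here, so no per-function thunk could be chosen.
int CallAsDerived(Derived& obj, int (Base2::*pfnBase2)()) {
    int (Derived::*pfnDerived)() = pfnBase2;  // implicit conversion
    return (obj.*pfnDerived)();
}

int ConvertAndCallDemo() {
    Derived obj;
    int result = CallAsDerived(obj, &Base2::Base2Method);
    assert(result == obj.b2);  // "this" was adjusted from p to q
    return result;
}
```

The call returns b2 rather than b1, so the adjustment from "p" to "q" had to be carried along with the converted pointer somehow; how it is carried is the compiler's business.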

Notice also that when pfnBase2 got cast to a pointer to member function of Derived, its size changed, since it went from a pointer to a function in a class that uses only single inheritance to a pointer to a function in a class that uses multiple inheritance.

Casting a function pointer can change its size!

I bet that you didn't know that before reading this entry.

There's still an awful lot more to this topic, but I'm going to stop here before everybody's head explodes.

Exercise: Consider the classes

class Base3 { int b3; void Base3Method(); };
class Derived2 : public Base3, public Derived { };

How would the following code be compiled?

void (Derived::*pfnDerived)();
void (Derived2::*pfnDerived2)();

pfnDerived2 = pfnDerived;

Answer to appear tomorrow.

Comments (19)
  1. R says:

    ‘Casting a function pointer can change its size!’

    Bu..bu.. bu … my head hurts.

  2. Lonnie McCullough says:

    I was just wondering about this very subject and was going to post a comment asking you to discuss it. Wow! Thanks again, I always wondered why my pointers to member functions had a sizeof() == 8 sometimes and 4 other times.

  3. Henk Devos says:

    Long ago, when trying to cast a member function pointer to a DWORD, I couldn’t believe that the compiler gave me a size mismatch pointer…

  4. Curt says:

    The size of a pointer-to-member-function can change depending on the class!

    As if Mondays aren’t hard enough! I’m going back to bed. :)

  5. Jonathan O'Connor says:

    My friend and former colleague, Martin O’Riordain, came up with the idea of using thunks for ptr-to-member-fn about 1990. He wouldn’t tell us about them then, because he wanted to use this for his interview with Microsoft. It’s funny to see that they aren’t used anymore. But then I guess C++ has moved on a bit since then.

  6. Ben Hutchings says:

    The size of a pointer-to-member should not vary as you describe. The C++ standard says it’s OK to cast a pointer-to-member-of-derived to pointer-to-member-of-base (though it must be a non-virtual base) so long as you only use the result with pointers-to-base that really point to instances of the derived class. So the "optimisation" of dropping the adjustment field where it’s apparently not necessary is actually a bug. The compiler admits:

    "warning C4407: cast between different pointer to member representations, compiler may generate incorrect code". Thankfully there is an option that fixes this: "/vmg". However, the documentation for C4407 and for "/vmg" fails to acknowledge that this behaviour is a bug and basically blames the programmer for finding the edge cases where this "optimisation" doesn’t work.

  7. Raymond Chen says:

    Actually, Ben, if you’re referring to 4.11.2 (pointer to member conversions) the standard actually says the opposite. You can convert pointer-to-base-member into pointer-to-derived-member without loss of fidelity. And that’s what we’re doing here.

    I can’t find a place in the standard where it says that you can safely cast pointer-to-derived-member to pointer-to-base-member.

  8. Woon Kiat says:

    Hi Raymond, this is a very interesting topic. Is there any book or material you would recommend for further understanding on this topic?

  9. Raymond Chen says:

    This is all undocumented implementation details, so I doubt there’d be a book on it. You can read Ellis/Stroustrup "The Annotated C++ Reference Manual" to see the design of a completely different way of doing pointers to functions.

  10. Woon Kiat says:

    What about a reference on multiple inheritance? Is there any book you would recommend? Most books just touch the surface and never go deep into it. The concept of MI always confuses me when it comes to the vtable layout and implementation. And the architects of C# decided not to have MI in C#. Do you think MI is a good topic to blog on?

  11. Raymond Chen says:

    I’m not sure what there is to write about on the subject of multiple inheritance. It’s a language feature. The language specification leaves a lot of the details to the implementation, so you shouldn’t expect to find a book that details implementation since that’s all undocumented and subject to change at any time.

  12. Stefan says:

    There is an article in MSDN about this:

    "C++: Under the Hood", Jan Gray, March 1994

    http://msdn.microsoft.com/archive/default.asp?url=/archive/en-us/dnarvc/html/jangrayhood.asp

  13. Ben Hutchings says:

    Raymond, I’m talking about explicit conversions (aka casts) which are specified in 5.4/7. 4.11/2 is about standard conversions which can be done implicitly (because they’re generally safe).

  14. Raymond Chen says:

    Ben: Okay I read 5.4/7 and I think you’re right. Then again, I’m not a real language lawyer; I just play one on the Internet.

  15. Giovanni Bajo says:

    Why should anybody care about Rich Hickey’s ten-year-old techniques when we have boost::bind and boost::function, which work wonderfully as generalized binders and callbacks?
