dynamic_cast is slow on x64


A thread in an internal discussion group reveals that dynamic_cast is very slow on native x64 systems. One of the developers explains the reason:


From:
Sent: Tuesday, October 17, 2006 11:52 AM
To:
Subject: RE: dynamic_cast code runs faster in WOW mode than native x64


 


I haven’t looked at profiles or tried your testcase, but I think I know why you see this difference.


 


dynamic_cast works by looking at the RTTI (run-time type info) associated with a type.  That RTTI has a bunch of pointers in it.  On x86, these pointers are just that, raw 32-bit pointers.  But on 64-bit platforms, the pointers are actually 32-bit offsets which need to be added to the base address of the DLL or EXE in which the RTTI resides to compute a true 64-bit pointer.  That addition shouldn’t be causing any major perf problems, since it’s cheap and not too common.  But determining the module base address, which happens once per dynamic_cast, could be expensive.  It’s done via the API call RtlPcToFileHeader, which (in the fast case) takes the loader lock and walks the list of loaded modules to find where the RTTI data resides.
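To make the two steps in the email concrete, here is a minimal sketch in portable C++. It is not the CRT's or the loader's actual code: ModuleRange, ModuleList, find_image_base, and resolve_offset are hypothetical names invented for illustration, and the std::mutex only stands in for the loader lock. It shows why the offset addition is cheap while finding the module base is the costly part, since that step takes a lock and walks a list.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical record for one loaded module (EXE or DLL).
struct ModuleRange {
    const unsigned char* base;  // address where the image is mapped
    std::size_t size;           // size of the mapped image in bytes
};

// Hypothetical stand-in for the loader's module list and its lock.
struct ModuleList {
    std::mutex lock;
    std::vector<ModuleRange> modules;
};

// Rough analogue of the lookup the email attributes to RtlPcToFileHeader:
// take the lock, walk the list, and return the base of the module that
// contains addr (or nullptr if no module contains it).
inline const unsigned char* find_image_base(ModuleList& list, const void* addr) {
    std::lock_guard<std::mutex> guard(list.lock);
    const auto a = reinterpret_cast<std::uintptr_t>(addr);
    for (const ModuleRange& m : list.modules) {
        const auto b = reinterpret_cast<std::uintptr_t>(m.base);
        if (a >= b && a - b < m.size) {
            return m.base;
        }
    }
    return nullptr;
}

// The cheap step: turn a 32-bit image-relative offset into a full pointer.
inline const void* resolve_offset(const unsigned char* image_base,
                                  std::uint32_t offset) {
    return image_base + offset;
}
```

In this sketch, a cast would call find_image_base once with the address of the RTTI data and then call resolve_offset for each offset it follows. The lookup cost grows with the number of loaded modules and with contention on the lock, which fits the email's guess about where the time goes.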


 

Comments (5)

  1. Dean Harding says:

    > the pointers are actually 32-bit offsets which need to be added to the

    > base address of the DLL or EXE in which the RTTI resides to compute a true 64-bit pointer

    Do you know why it does that? Why doesn’t it just store raw 64-bit pointers? Is it some sort of binary compatibility issue?

  2. Adam says:

    That’s the *fast* case? What’s the slow case?!?

  3. Adam says:

    Also, shouldn’t the title be something like "dynamic_cast slow in WOW on x64"? The current title implies (to me) that it’s a problem with native x64, rather than a problem with the 32-on-64 back-compat hacks.

  4. junfeng says:

    Dean,

    I have no idea why it is implemented that way.

    Adam,

    The title is correct: the email says dynamic_cast runs faster in WOW than in native x64.

  5. Brian Tyler says:

    Any idea whether (a) this is fixed in Orcas, or (b) this could be fixed via a service pack (I don’t know how much of this logic is compiled in versus a runtime call)?