A single Interop assembly does not work for different architectures


One of the exciting things about managed code is that it is much easier to write code that is portable across architectures. There is no question about the size of a short, int, or long integer. In fact, it is possible to write assemblies that run unaltered on 32-bit and 64-bit platforms. Possible, that is, as long as they do not interoperate with native code.


 


As an example, consider the following IDL file:


 


import "oaidl.idl";


import "ocidl.idl";


 


[


    uuid(4FBDEE5B-607B-45b1-8B8F-A5779ECB5766),


    version(1.0),


    helpstring("align")


]


library align


{


importlib("stdole2.tlb");


 


typedef struct Simple


{


    int fourbytes;


    double dForce64bit;


} Simple;


 


typedef struct Complex


{


    int fourbytes;


    void * pv1;


} Complex;


 


};


 


I built this IDL file with:


 


rmdir /q /s simple


mkdir simple


midl align.idl /zp8 /tlb .\simple\align.tlb


tlbimp .\simple\align.tlb /out:.\simple\interop.align.dll


ildasm /text .\simple\interop.align.dll > .\simple\align.txt


 


Note that I am explicitly stating /zp8 (8 byte packing) even though it is the default for MIDL and the C compiler and does not need to be specified. The result is a type library (built for x86 – the default) and an Interop assembly. Looking at the disassembly we see:


 


.class public sequential ansi sealed beforefieldinit interop.align.Simple


       extends [mscorlib]System.ValueType


{


  .pack 8


  .size 0


  .field public int32 fourbytes


  .field public float64 dForce64bit


} // end of class interop.align.Simple


 


.class public sequential ansi sealed beforefieldinit interop.align.Complex


       extends [mscorlib]System.ValueType


{


  .pack 4   // <-- This will yield wrong results on 64 bit platforms


  .size 0


  .field public int32 fourbytes


  .field public native int pv1


} // end of class interop.align.Complex


 


Note that the Complex struct gets converted to “.pack 4” because we built on a 32-bit platform, even though the void* has been converted to the platform-sized “native int.” The /zp8 option is somewhat hard to understand. It doesn’t mean “align everything on 8 byte boundaries.” It means “add what padding is needed to make eight byte or smaller types have natural alignment.”


 


The simplest way to think of packing is to think of a simple struct:


 


struct Basic


{


    char c;


    INT32 myint;


};


 


/zp1 would pack things on 1 byte boundaries, so myint would directly follow c in memory. /zp2 would align on 2 byte boundaries, so there would be one byte of padding after c before myint. /zp4 would align on 4 byte boundaries, so there would be three bytes of padding after c before myint. /zp8 would actually behave like /zp4 in this case, since only three bytes of padding are needed to achieve natural alignment for a 32 bit integer. Note that /zp1 and /zp2 have the potential to produce unaligned fields, while /zp8 gives every type of eight bytes or smaller its natural alignment.
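The four cases above can be checked with a small C++ sketch. It uses `#pragma pack`, the per-struct counterpart of the /zp switch, instead of the command-line option; the struct names `BasicZp1` through `BasicZp8` and the function `check_basic_layouts` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One copy of Basic per packing value. #pragma pack(push, n) plays the role
// of /zpn for the structs declared before the matching pop.
#pragma pack(push, 1)
struct BasicZp1 { char c; std::int32_t myint; };
#pragma pack(pop)

#pragma pack(push, 2)
struct BasicZp2 { char c; std::int32_t myint; };
#pragma pack(pop)

#pragma pack(push, 4)
struct BasicZp4 { char c; std::int32_t myint; };
#pragma pack(pop)

#pragma pack(push, 8)
struct BasicZp8 { char c; std::int32_t myint; };
#pragma pack(pop)

// Hypothetical helper: asserts the offsets described in the text.
void check_basic_layouts() {
    assert(offsetof(BasicZp1, myint) == 1);  // no padding after c
    assert(offsetof(BasicZp2, myint) == 2);  // one byte of padding
    assert(offsetof(BasicZp4, myint) == 4);  // three bytes of padding
    assert(offsetof(BasicZp8, myint) == 4);  // same as /zp4: natural alignment reached
    assert(sizeof(BasicZp4) == sizeof(BasicZp8));
}
```

Calling `check_basic_layouts()` from a test or from `main` should pass silently on compilers that honor `#pragma pack` (MSVC, GCC, and Clang all do).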


 


The Simple struct contains a four byte integer followed by an eight byte floating point type. With eight byte packing, four bytes of padding are needed to get the floating point value to start on an eight byte boundary. This causes the .pack 8 to appear in the managed assembly.
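As a rough illustration of why the packing value matters for Simple, the sketch below declares the struct twice, once with 8 byte packing and once with 4 byte packing, and compares the layouts. It assumes a compiler on which double has a natural alignment of eight bytes (true for MSVC and for GCC/Clang on x86-64, but not for GCC on 32-bit x86 Linux). The names `SimplePack8`, `SimplePack4`, and `check_simple_layouts` are invented for this example.

```cpp
#include <cassert>
#include <cstddef>

#pragma pack(push, 8)
struct SimplePack8 { int fourbytes; double dForce64bit; };
#pragma pack(pop)

#pragma pack(push, 4)
struct SimplePack4 { int fourbytes; double dForce64bit; };
#pragma pack(pop)

// Hypothetical helper: asserts the two layouts.
// Assumption: alignof(double) is 8 on the target compiler.
void check_simple_layouts() {
    // 8 byte packing: four bytes of padding push the double to offset 8.
    assert(offsetof(SimplePack8, dForce64bit) == 8);
    assert(sizeof(SimplePack8) == 16);

    // 4 byte packing: the double follows the int directly at offset 4.
    assert(offsetof(SimplePack4, dForce64bit) == 4);
    assert(sizeof(SimplePack4) == 12);
}
```

Because the two packing values give different layouts for Simple, the managed definition has to carry .pack 8 to match the native one.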


 


The Complex struct contains a four byte integer followed by a “native int.” The size of the native int is platform dependent and was determined at build time. In this case, we built for 32-bit (the default), which meant the size was taken to be four bytes. Since both fields then align naturally on four byte boundaries, .pack 4 is put into the managed assembly.


 


The problem is that C++ code (ignoring managed C++) is platform dependent. On a 64-bit platform, the layout the C compiler produces for Complex has four bytes of padding between the four byte integer and the “native int” / void* value, while the managed definition built on x86, with its .pack 4, has none. This is one of the worst types of failures. x64 tolerates misaligned data access, so there will not be an immediate fault when the data is accessed, but reads and writes will silently happen at the wrong address. This is awful and one of the hardest types of issues to track down.
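The mismatch can be worked through with a little arithmetic. The sketch below is not how tlbimp or the marshaler is implemented; it only models the packing rule described earlier (a field is aligned to the smaller of its own size and the packing value) and assumes that a field's natural alignment equals its size. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Offset of a field of `field_size` bytes placed after `preceding_bytes`
// bytes of earlier fields, under packing value `pack`.
// Assumption: the field's natural alignment equals its size.
std::size_t field_offset(std::size_t preceding_bytes,
                         std::size_t field_size,
                         std::size_t pack) {
    const std::size_t align = field_size < pack ? field_size : pack;
    return (preceding_bytes + align - 1) / align * align;  // round up
}

// Offset of pv1 in Complex as native code compiled with /zp8 sees it.
std::size_t native_pv1_offset(std::size_t pointer_size) {
    return field_offset(4, pointer_size, 8);
}

// Offset of pv1 as implied by the managed definition's .pack value.
std::size_t managed_pv1_offset(std::size_t pointer_size, std::size_t pack) {
    return field_offset(4, pointer_size, pack);
}

// Hypothetical helper walking through the cases from the text.
void check_complex_offsets() {
    // 32-bit process, x86-built assembly (.pack 4): both sides agree.
    assert(native_pv1_offset(4) == 4);
    assert(managed_pv1_offset(4, 4) == 4);

    // 64-bit process, x86-built assembly (.pack 4): native code expects
    // offset 8, the managed layout uses offset 4.
    assert(native_pv1_offset(8) == 8);
    assert(managed_pv1_offset(8, 4) == 4);

    // 64-bit process, x64-built assembly (.pack 8): both sides agree again.
    assert(managed_pv1_offset(8, 8) == 8);
}
```

Under this model, the middle case is the silent failure: the two sides disagree by four bytes about where pv1 lives, and nothing faults.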


 


This is bad enough that I am going to look into having tlbimp-generated assemblies built on 32 bit platforms fail to load on 64 bit platforms as a potential future change (or at least an option).


 


The correct solution is to build separately with MIDL and tlbimp on all platforms. An x64 build would look like:


 


rmdir /q /s x64


mkdir x64


midl align.idl /zp8 /win64 /tlb .\x64\align.tlb


tlbimp .\x64\align.tlb /out:.\x64\interop.align.dll /machine:X64


ildasm /text .\x64\interop.align.dll > .\x64\align.txt


 


The disassembly of the resulting Interop assembly will look like:


 


.class public sequential ansi sealed beforefieldinit interop.align.Simple


       extends [mscorlib]System.ValueType


{


  .pack 8


  .size 0


  .field public int32 fourbytes


  .field public float64 dForce64bit


} // end of class interop.align.Simple


 


.class public sequential ansi sealed beforefieldinit interop.align.Complex


       extends [mscorlib]System.ValueType


{


  .pack 8   // <-- This is what we expect on 64 bit platforms


  .size 0


  .field public int32 fourbytes


  .field public native int pv1


} // end of class interop.align.Complex


 


Both structs are .pack 8 as expected and will match the output of the C++ compiler. Note that both Interop assemblies can coexist with the same name in the GAC and the loader will correctly bind to the right version for the executing platform.


 


I would like to thank Bill Evans for writing a detailed email to a customer that I scavenged for information. I’d like to thank my friend Kevin Frei for answering some C++ compiler questions about packing.

Comments (1)
  1. ladipro says:

    So why does tlbimp generate .pack 4? If it generated (the default) .pack 8, the resulting interop assembly would work fine on both 32 bit and 64 bit. You say it yourself, the packing means “add what padding is needed to make eight byte or smaller types have natural alignment” and this is decided at type load time.

    [StructLayout(LayoutKind.Sequential, Pack=8)]

    struct Complex

    {

       int fourbytes;

       IntPtr pv1;

    }

    This structure will be 8 bytes big with pv1 on offset 4 on 32 bit, and 16 bytes big with pv1 on offset 8 on 64 bit. 100% compatible with how the C++ compiler would lay out the fields on those platforms. So where is the problem? Isn’t the .pack 4 a tlbimp bug?
