Structures with embedded string fields

Tuesday's quiz used character arrays to represent string fields of the DEVMODE structure.  Why not define them as strings, which would be more convenient to manipulate in managed code?

If you define a structure field as a string, it is marshaled as a pointer to an unmanaged string by default (LPSTR/LPWSTR).  But with MarshalAsAttribute and UnmanagedType.ByValTStr, you can marshal a managed string field as an embedded string instead.  This must be used with MarshalAsAttribute's SizeConst named parameter to specify the size of the string.  The CharSet marking in the structure's StructLayoutAttribute determines whether the string is marshaled as ANSI or Unicode.

So it would seem natural to define DEVMODE's string fields like this:

  [MarshalAs(UnmanagedType.ByValTStr, SizeConst=32)]
  public string dmFormName;

instead of this:

  [MarshalAs(UnmanagedType.ByValArray, SizeConst=32)]
  public char [] dmFormName;
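To make the effect of SizeConst and CharSet concrete, here is a minimal sketch.  The three structs are hypothetical cut-down stand-ins, not the real DEVMODE definition; they keep only one 16-bit field on either side of dmFormName so the arithmetic is easy to follow.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in: string field embedded as 32 Unicode characters.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct FormNameAsString
{
    public short dmCollate;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
    public string dmFormName;
    public short dmLogPixels;
}

// Hypothetical stand-in: the same field declared as an embedded char array.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct FormNameAsArray
{
    public short dmCollate;
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 32)]
    public char[] dmFormName;
    public short dmLogPixels;
}

// Hypothetical stand-in: CharSet.Ansi makes each embedded character one byte.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
struct FormNameAsAnsiString
{
    public short dmCollate;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
    public string dmFormName;
    public short dmLogPixels;
}

static class LayoutDemo
{
    static void Main()
    {
        // 2 + 32*2 + 2 bytes for both Unicode variants.
        Console.WriteLine(Marshal.SizeOf(typeof(FormNameAsString)));     // 68
        Console.WriteLine(Marshal.SizeOf(typeof(FormNameAsArray)));      // 68
        // 2 + 32*1 + 2 bytes for the ANSI variant.
        Console.WriteLine(Marshal.SizeOf(typeof(FormNameAsAnsiString))); // 36
    }
}
```

Both Unicode declarations occupy the same unmanaged space, so switching between them never moves the neighboring fields; what differs is which characters end up in that space.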

The problem with ByValTStr is that the marshaler copies at most SizeConst-1 characters, plus a trailing null character, into the unmanaged buffer, so a string that fills the whole field loses its last character.  This almost certainly isn't the behavior you'd want, but you can get away with it if you know that you're never going to use every character of the field.  Or, if the location of the field and the padding of the structure let you extend the string by one character without shifting other fields, you could set SizeConst to one more than the unmanaged length (33 for a 32-character field) to work around the problem.  But this is rarely the case.  So to get the full length of the string, you often have to use ByValArray, as I did with my DEVMODE definition.
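The truncation can be observed without a debugger by marshaling a structure into a raw buffer and reading the characters back.  The following is a rough sketch under stated assumptions: the single-field structs and the MarshalAndReadBuffer helper are hypothetical names made up for this illustration, and the 33-character variant stands alone here, so it does not show the shifted-fields problem it would cause inside a real structure.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct WithString
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 32)]
    public string Name;
}

// Workaround variant: one extra character, which also grows the struct by 2 bytes.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct WithStringPlusOne
{
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 33)]
    public string Name;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct WithArray
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 32)]
    public char[] Name;
}

static class TruncationDemo
{
    // Marshals the struct into unmanaged memory and returns the first
    // 32 UTF-16 characters of that memory, embedded nulls included.
    public static string MarshalAndReadBuffer<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            // AllocHGlobal does not zero memory, so clear it first.
            for (int i = 0; i < size; i++) Marshal.WriteByte(buffer, i, 0);
            Marshal.StructureToPtr(value, buffer, false);
            return Marshal.PtrToStringUni(buffer, 32);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }

    static void Main()
    {
        string full = "abcdefghijklmnopqrstuvwxyzABCDEF"; // exactly 32 characters

        string viaString = MarshalAndReadBuffer(new WithString { Name = full });
        string viaPlusOne = MarshalAndReadBuffer(new WithStringPlusOne { Name = full });
        string viaArray = MarshalAndReadBuffer(new WithArray { Name = full.ToCharArray() });

        Console.WriteLine(viaString[31] == '\0'); // True: 'F' gave way to the terminator
        Console.WriteLine(viaPlusOne == full);    // True: the terminator lands in slot 33
        Console.WriteLine(viaArray == full);      // True: all 32 characters, no terminator
    }
}
```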

You can see exactly how structures get marshaled to unmanaged code by using the Visual Studio .NET memory window while debugging.  To quickly show you the unmanaged memory layout corresponding to each definition of the dmFormName field above, I dusted off the MarshaledStructInspector class from Chapter 19 of .NET and COM: The Complete Interoperability Guide.  With the character array version, the following code:

  DEVMODE d = new DEVMODE();
  d.dmCollate = 0x1234;
  d.dmFormName = new char[]{'a','b','c','d','e','f',
    'g','h','i','j','k','l','m','n','o','p','q','r',
    's','t','u','v','w','x','y','z','A','B','C','D',
    'E','F'};
  d.dmLogPixels = 0x6789;
  MarshaledStructInspector.DisplayStruct(d);

outputs:

  Total Bytes = 220
  ...
  34 12 61 00 4↕a.
  62 00 63 00 b.c.
  64 00 65 00 d.e.
  66 00 67 00 f.g.
  68 00 69 00 h.i.
  6A 00 6B 00 j.k.
  6C 00 6D 00 l.m.
  6E 00 6F 00 n.o.
  70 00 71 00 p.q.
  72 00 73 00 r.s.
  74 00 75 00 t.u.
  76 00 77 00 v.w.
  78 00 79 00 x.y.
  7A 00 41 00 z.A.
  42 00 43 00 B.C.
  44 00 45 00 D.E.
  46 00 89 67 F.?g
  ...

With the string version, the following code:

  DEVMODE d = new DEVMODE();
  d.dmCollate = 0x1234;
  d.dmFormName = "abcdefghijklmnopqrstuvwxyzABCDEF";
  d.dmLogPixels = 0x6789;
  MarshaledStructInspector.DisplayStruct(d);

outputs:

  Total Bytes = 220
  ...
  34 12 61 00 4↕a.
  62 00 63 00 b.c.
  64 00 65 00 d.e.
  66 00 67 00 f.g.
  68 00 69 00 h.i.
  6A 00 6B 00 j.k.
  6C 00 6D 00 l.m.
  6E 00 6F 00 n.o.
  70 00 71 00 p.q.
  72 00 73 00 r.s.
  74 00 75 00 t.u.
  76 00 77 00 v.w.
  78 00 79 00 x.y.
  7A 00 41 00 z.A.
  42 00 43 00 B.C.
  44 00 45 00 D.E.
  00 00 89 67 ..?g
  ...

Notice that the last character of the string is cut off.
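If you don't have the book's MarshaledStructInspector handy, a similar dump takes only a few lines.  The StructDump class below is a hypothetical stand-in written for this illustration, not the book's implementation, and the Sample struct is a made-up miniature with a 4-character field so the output stays short.  Unlike the dumps above, it prints every non-printable byte as a dot.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Made-up miniature: a 4-character embedded string between two 16-bit fields.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
struct Sample
{
    public short Before;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 4)]
    public string Name;
    public short After;
}

static class StructDump
{
    // Marshals the structure to unmanaged memory and formats the bytes
    // four per row, followed by their printable-ASCII rendering.
    public static string Dump(object structure)
    {
        int size = Marshal.SizeOf(structure);
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            // AllocHGlobal does not zero memory, so clear it first.
            for (int i = 0; i < size; i++) Marshal.WriteByte(buffer, i, 0);
            Marshal.StructureToPtr(structure, buffer, false);

            byte[] bytes = new byte[size];
            Marshal.Copy(buffer, bytes, 0, size);

            var result = new StringBuilder();
            result.AppendLine("Total Bytes = " + size);
            for (int row = 0; row < size; row += 4)
            {
                var hex = new StringBuilder();
                var text = new StringBuilder();
                for (int i = row; i < Math.Min(row + 4, size); i++)
                {
                    hex.Append(bytes[i].ToString("X2")).Append(' ');
                    text.Append(bytes[i] >= 0x20 && bytes[i] < 0x7F ? (char)bytes[i] : '.');
                }
                result.AppendLine(hex.ToString() + text.ToString());
            }
            return result.ToString();
        }
        finally
        {
            // Releases anything the marshaler allocated for pointer-typed fields.
            Marshal.DestroyStructure(buffer, structure.GetType());
            Marshal.FreeHGlobal(buffer);
        }
    }

    static void Main()
    {
        var s = new Sample { Before = 0x1234, Name = "abcd", After = 0x6789 };
        Console.Write(Dump(s));
        // Total Bytes = 12
        // 34 12 61 00 4.a.
        // 62 00 63 00 b.c.
        // 00 00 89 67 ...g
    }
}
```

Just as with dmFormName, the 'd' never reaches the buffer: the fourth slot holds the null terminator instead.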

As one final note, the Interop marshaler does not support StringBuilder fields.  So unlike the case with parameters, StringBuilders can't be used to represent unmanaged string fields.
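That leaves the character array as the only way to fill the whole field, but a pair of small helpers recovers most of the convenience of a string.  This is only a sketch: FixedChars and its two methods are hypothetical names made up for this illustration.  The padding matters because the marshaler rejects an embedded array that is shorter than its declared SizeConst.

```csharp
using System;

static class FixedChars
{
    // Builds a null-padded array of exactly 'size' characters, suitable
    // for a ByValArray field declared with SizeConst = size.
    public static char[] FromString(string value, int size)
    {
        if (value == null) throw new ArgumentNullException("value");
        if (value.Length > size)
            throw new ArgumentException("String does not fit in the field.", "value");

        char[] chars = new char[size]; // new arrays are zero-filled
        value.CopyTo(0, chars, 0, value.Length);
        return chars;
    }

    // Converts a field back to a string, stopping at the first null
    // character or at the end of the array if there is none.
    public static string ToManagedString(char[] chars)
    {
        int end = Array.IndexOf(chars, '\0');
        return new string(chars, 0, end < 0 ? chars.Length : end);
    }

    static void Main()
    {
        char[] field = FromString("Letter", 32);
        Console.WriteLine(field.Length);           // 32
        Console.WriteLine(ToManagedString(field)); // Letter

        // A string that fills the field round-trips intact.
        string full = "abcdefghijklmnopqrstuvwxyzABCDEF";
        Console.WriteLine(ToManagedString(FromString(full, 32)) == full); // True
    }
}
```

With these in place, an assignment such as d.dmFormName = FixedChars.FromString("Letter", 32) reads almost as naturally as the string version.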