Interop with PROPVARIANTs in .NET

04/11/2008

Introduction

When it comes to interop, .NET has solved most of the hard problems. However, if you've worked with COM interfaces that take or return PROPVARIANTs as parameters, you may have noticed that there isn't much support for this union'd struct type. I came across one such interface and needed a good way to interop with its PROPVARIANT parameters. In fact, the implementation I was working with returned BLOBs, which break all sorts of basic implementations of variant interop.

Of course, PROPVARIANTs are used in other places than just COM. You may find yourself using P/Invoke with PROPVARIANT params, as well.

Background

The PROPVARIANT reuses a lot of the definitions of the VARIANT/VARIANTARG types. In fact, they are nearly identical. I guess the difference lies in their intended use. That is, a VARIANTARG would be used to pass a set of self-describing arguments to a function, where the caller manages the memory before and after the call. A PROPVARIANT is ideal for out params, where the caller receives its own instance of the value and is responsible for freeing it, while the callee keeps managing the memory of its own copy.

.NET actually has some support for interop of VARIANT types. For example, if you define this method:

// C#
void GetProp(
[Out, MarshalAs(UnmanagedType.Struct)] out object pval
);

.NET will see the MarshalAs attribute set to use UnmanagedType.Struct in combination with the object type for the parameter and assume the variable you are marshaling is a VARIANT. This works for types like int, short, byte, float, double, but not for types like BLOB. In this case you'll get a runtime error, because .NET considers the VARIANT invalid. If you'd like to know more about how .NET decides what an invalid VARIANT is, take a look at Adam Nathan's post about debug probes.

The Union

The PROPVARIANT has an interesting structure. For the most part, it is a simple struct, with a field that identifies the type of data, plus a union field that contains the data itself. For those of you .NET developers who don't know what a union is, I'll give you a brief intro.

In a C/C++ type you can define a field as a union. Here is a simple example:

// C++
typedef struct tagMYSTRUCT
{
    union
    {
        int iVal;
        BYTE bVal;
    };
} MYSTRUCT;

In this struct I have defined two fields. Unlike a normal struct, though, I have pulled the two fields into a union. The union says that the two values share the same memory, so the size of the union member is the maximum of the sizes of the fields defined within it. In this case, sizeof(int) > sizeof(BYTE), so the union member's size equals sizeof(int).
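
To make the shared storage concrete, here is a minimal C++ sketch built on the MYSTRUCT example above. The BYTE typedef is a stand-in for the Windows one, and the helper names MembersShareAddress and FirstByteOf are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstring>

typedef unsigned char BYTE; // stand-in for the Windows BYTE typedef

struct MYSTRUCT
{
    union
    {
        int iVal;
        BYTE bVal;
    };
};

// The union, and therefore the struct, is as large as its largest member.
static_assert(sizeof(MYSTRUCT) == sizeof(int), "union takes the size of int");

// Hypothetical helper: true when both members begin at the same address.
inline bool MembersShareAddress(const MYSTRUCT& s)
{
    return static_cast<const void*>(&s.iVal) == static_cast<const void*>(&s.bVal);
}

// Hypothetical helper: stores an int, then copies out the first byte of the
// same storage, which is where bVal lives.
inline BYTE FirstByteOf(int value)
{
    MYSTRUCT s;
    s.iVal = value;
    BYTE first;
    std::memcpy(&first, &s, sizeof(first));
    return first;
}
```

Calling FirstByteOf(0x01010101) yields 1 regardless of byte order, because every byte of the int holds 0x01.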

The Native PROPVARIANT Structure

Let's take a look at the structure of the native PROPVARIANT:

// C++
struct tagPROPVARIANT
{
    union
    {
        struct tag_inner_PROPVARIANT
        {
            VARTYPE vt;
            WORD wReserved1;
            WORD wReserved2;
            WORD wReserved3;
            union
            {
                CHAR cVal;
                UCHAR bVal;
                SHORT iVal;
                USHORT uiVal;
                LONG lVal;
                ULONG ulVal;
                INT intVal;
                UINT uintVal;
                LARGE_INTEGER hVal;
                ULARGE_INTEGER uhVal;
                FLOAT fltVal;
                DOUBLE dblVal;
                VARIANT_BOOL boolVal;
                _VARIANT_BOOL bool;
                SCODE scode;
                CY cyVal;
                DATE date;
                FILETIME filetime;
                CLSID *puuid;
                CLIPDATA *pclipdata;
                BSTR bstrVal;
                BSTRBLOB bstrblobVal;
                BLOB blob;
                LPSTR pszVal;
                LPWSTR pwszVal;
                IUnknown *punkVal;
                IDispatch *pdispVal;
                IStream *pStream;
                IStorage *pStorage;
                /* .. snip .. */
            };
        };
        DECIMAL decVal;
    };
};

This structure has an extra level of confusion: it nests a union in a struct in a union. The reason is the DECIMAL member you see at the end, which overlaps the VARTYPE field and everything after it (via the outer union). The PROPVARIANT is already 16 bytes (on a 32-bit architecture) without the DECIMAL member, and a DECIMAL is 16 bytes by itself. If the DECIMAL member were inside the inner union, after the 8 bytes of vt and reserved fields, the PROPVARIANT would grow past 16 bytes (to 24 with 8-byte alignment) and would no longer be compatible with VARIANTs. I ignore the DECIMAL field in my implementation, because I've not run across a need for it.
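
The size argument can be checked with a small C++ sketch. The Mock* types below are hypothetical stand-ins built from fixed-width integers rather than the real Windows definitions, the nested members are named (data, inner) to stay within standard C++, and only a few union members are included.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-byte stand-in for DECIMAL.
struct MockDecimal
{
    uint16_t wReserved; // occupies the same bytes as vt in the outer union
    uint8_t  scale;
    uint8_t  sign;
    uint32_t Hi32;
    uint64_t Lo64;
};

// Type tag and reserved words (8 bytes) followed by an 8-byte union.
struct MockInner
{
    uint16_t vt;
    uint16_t wReserved1;
    uint16_t wReserved2;
    uint16_t wReserved3;
    union
    {
        int32_t lVal;
        int64_t hVal;
        double  dblVal;
    } data;
};

// Outer union: the decimal overlays the whole inner struct.
union MockPropVariant
{
    MockInner   inner;
    MockDecimal decVal;
};

// What the layout would look like with the decimal inside the inner union.
struct MockInnerWithDecimal
{
    uint16_t vt;
    uint16_t wReserved1;
    uint16_t wReserved2;
    uint16_t wReserved3;
    union
    {
        int64_t     hVal;
        MockDecimal decVal;
    } data;
};

static_assert(sizeof(MockDecimal) == 16, "decimal stand-in is 16 bytes");
static_assert(sizeof(MockInner) == 16, "tag plus union is 16 bytes");
static_assert(sizeof(MockPropVariant) == 16, "overlaying keeps the size at 16");
static_assert(sizeof(MockInnerWithDecimal) > 16, "nesting the decimal grows it");
static_assert(offsetof(MockInner, data) == 8, "union starts after 8 bytes");
```

All of the checks run at compile time, so the sketch verifies itself as soon as it builds.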

Most of the types in the PROPVARIANT are simple, including 1/2/4/8 byte integers, 4/8 byte floating point numbers, booleans, and pointers. There are a couple of structs to take note of. BSTRBLOB and BLOB are structs that have two members, a 4-byte integer, and a pointer. These will be considered when we come up with our .NET structure layout.

Creating a Managed PROPVARIANT

There have been some attempts at laying out a managed PROPVARIANT by using LayoutKind.Explicit and the FieldOffsetAttribute. These work for marshaling an unmanaged struct to a managed struct, but I've seen issues going the other way. Also, because the fields of a struct in .NET will never share memory, you may end up with hundreds of bytes in your managed struct, which is passed by value! Other implementations use an IntPtr for the union member, but this doesn't allow for accessing all 8 bytes of the union member, at least not on a 32-bit architecture.

My implementation takes a different approach. I use an IntPtr for the first 4 bytes of the union, and an extra int field to access the second 4 bytes. That gives me access to all 8 bytes of the union on a 32-bit architecture. On a 64-bit architecture it gives me 12 bytes, because the size of the IntPtr becomes 8 bytes. I counted the largest member of the union on a 64-bit architecture as 12 bytes (a 4-byte integer plus an 8-byte pointer), which is why I used one IntPtr and one int instead of two IntPtrs, which would result in 16 bytes for the union member. Be aware that native alignment normally pads such a member so that the pointer starts on an 8-byte boundary, so that count should be verified on a real 64-bit build.

The definition of my managed struct looks like this:

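// C#
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct PropVariant
{
    ushort vt;          // VARTYPE: identifies which union member is in use
    ushort wReserved1;
    ushort wReserved2;
    ushort wReserved3;
    IntPtr p;           // first pointer-sized chunk of the union
    int p2;             // next 4 bytes of the union
}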

NOTE: The vt member value is from the System.Runtime.InteropServices.ComTypes.VarEnum enumeration.

NOTE: I tried to code it to work on 64-bit architectures, but I don't have one to test on, so I'll leave it to my readers to tell me what does and doesn't work there. This struct has a size of 16 bytes on a 32-bit architecture. On a 64-bit architecture the fields add up to 20 bytes, which default sequential packing pads to 24 so that the IntPtr stays 8-byte aligned.
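
One way to sanity-check the size of this layout without a .NET runtime is to mirror the field order in C++. The sketch below assumes the marshaler follows the platform's natural alignment for a sequential layout. MirrorPropVariant and ExpectedMirrorSize are hypothetical names, and void* stands in for IntPtr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the managed struct: same field order, with void*
// standing in for IntPtr and int32_t for int.
struct MirrorPropVariant
{
    uint16_t vt;
    uint16_t wReserved1;
    uint16_t wReserved2;
    uint16_t wReserved3;
    void*    p;
    int32_t  p2;
};

// Sum of the fields, rounded up to the pointer alignment, which is what a
// naturally aligned layout produces: 16 with 4-byte pointers, 24 with 8-byte.
constexpr std::size_t ExpectedMirrorSize()
{
    return ((8 + sizeof(void*) + sizeof(int32_t) + alignof(void*) - 1)
            / alignof(void*)) * alignof(void*);
}

static_assert(offsetof(MirrorPropVariant, p) == 8, "pointer follows the 8-byte header");
static_assert(sizeof(MirrorPropVariant) == ExpectedMirrorSize(), "natural alignment padding");
```

On the managed side, Marshal.SizeOf(typeof(PropVariant)) is the call to compare these numbers against.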

This completes the implementation for marshaling purposes. You could use this struct to pass a PROPVARIANT to a method, or get a PROPVARIANT as an out parameter, etc. For example, an unmanaged method may look like this:

// C++
HRESULT GetProp(PROPVARIANT *pval);

The managed import might look like either of these, depending on whether the value only comes out or is passed in both directions:

// C#
void GetProp(out PropVariant pval);

// C#
void GetProp(ref PropVariant pval);

The standard .NET marshaler will handle these interop calls just fine without any special marshaling attributes.

However, this doesn't make for easy examination of the data value of the variant.

Accessing the Data of the Managed PropVariant

What I've done is take the union members of the native struct and recreate them as private properties of my managed struct. Each property does the appropriate bit twiddling to get its value from the IntPtr field of the struct, and from the trailing int field when necessary. Here are a couple of the properties:

    sbyte cVal // CHAR cVal;
    {
        get { return (sbyte)GetDataBytes()[0]; }
    }

    short iVal // SHORT iVal;
    {
        get { return BitConverter.ToInt16(GetDataBytes(), 0); }
    }

    int lVal // LONG lVal;
    {
        get { return BitConverter.ToInt32(GetDataBytes(), 0); }
    }

    long hVal // LARGE_INTEGER hVal;
    {
        get { return BitConverter.ToInt64(GetDataBytes(), 0); }
    }

    float fltVal // FLOAT fltVal;
    {
        get { return BitConverter.ToSingle(GetDataBytes(), 0); }
    }

The BitConverter class is part of the framework, and comes in handy when you want to reinterpret raw bits rather than convert by value (i.e., (float)1 results in a float of value 1, not a float with bits 0x00..0001).
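
The same distinction between converting a value and reinterpreting its bits can be sketched in C++, the language on the native side of the interop boundary. The helper names are hypothetical, and the sketch assumes float is a 32-bit IEEE 754 type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

// Value conversion: the number 1 becomes the float 1.0.
inline float ConvertValue(int32_t value)
{
    return static_cast<float>(value);
}

// Bit reinterpretation: the 32 bits are copied unchanged into a float,
// which is the role BitConverter.ToSingle plays in the managed code.
inline float ReinterpretBits(uint32_t bits)
{
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

ConvertValue(1) is 1.0f, while ReinterpretBits(1) is a tiny denormal value. The bit pattern that actually encodes 1.0f is 0x3F800000.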

I wrote a GetDataBytes function to help get the needed bytes from the union part of the structure:

    private byte[] GetDataBytes()
    {
        // Rebuild the union's raw bytes: the IntPtr field first, then the trailing int
        byte[] ret = new byte[IntPtr.Size + sizeof(int)];
        if (IntPtr.Size == 4)
            BitConverter.GetBytes(p.ToInt32()).CopyTo(ret, 0);
        else if (IntPtr.Size == 8)
            BitConverter.GetBytes(p.ToInt64()).CopyTo(ret, 0);
        BitConverter.GetBytes(p2).CopyTo(ret, IntPtr.Size);
        return ret;
    }

Now, based on the vt member, we can determine which property to use to get the real value. This can be simplified into a single public property:

    public object Value
    {
        get
        {
            switch ((VarEnum)vt)
            {
                case VarEnum.VT_I1:
                    return cVal;
                case VarEnum.VT_UI1:
                    return bVal;
                case VarEnum.VT_I2:
                    return iVal;
                case VarEnum.VT_UI2:
                    return uiVal;
                case VarEnum.VT_I4:
                case VarEnum.VT_INT:
                    return lVal;
                case VarEnum.VT_UI4:
                case VarEnum.VT_UINT:
                    return ulVal;
                case VarEnum.VT_I8:
                    return hVal;
                case VarEnum.VT_UI8:
                    return uhVal;
                case VarEnum.VT_R4:
                    return fltVal;
                case VarEnum.VT_R8:
                    return dblVal;
                case VarEnum.VT_BOOL:
                    return boolVal;
                case VarEnum.VT_ERROR:
                    return scode;
                case VarEnum.VT_CY:
                    return cyVal;
                case VarEnum.VT_DATE:
                    return date;
                case VarEnum.VT_FILETIME:
                    return DateTime.FromFileTime(hVal);
                case VarEnum.VT_BSTR:
                    return Marshal.PtrToStringBSTR(p);
                case VarEnum.VT_BLOB:
                    // lVal holds the blob size; the data pointer follows it
                    byte[] blobData = new byte[lVal];
                    IntPtr pBlobData;
                    if (IntPtr.Size == 4)
                    {
                        pBlobData = new IntPtr(p2);
                    }
                    else if (IntPtr.Size == 8)
                    {
                        // In this case, we need to derive a pointer at offset 12,
                        // because the size of the blob is represented as a 4-byte int
                        // but the pointer is immediately after that.
                        // NOTE: untested on 64-bit; this assumes there is no alignment
                        // padding between the size and the pointer.
                        pBlobData = new IntPtr(BitConverter.ToInt64(GetDataBytes(), sizeof(int)));
                    }
                    else
                        throw new NotSupportedException();
                    Marshal.Copy(pBlobData, blobData, 0, lVal);
                    return blobData;
                case VarEnum.VT_LPSTR:
                    return Marshal.PtrToStringAnsi(p);
                case VarEnum.VT_LPWSTR:
                    return Marshal.PtrToStringUni(p);
                case VarEnum.VT_UNKNOWN:
                    return Marshal.GetObjectForIUnknown(p);
                case VarEnum.VT_DISPATCH:
                    return p;
                default:
                    throw new NotSupportedException("The type of this variable is not supported ('" + vt.ToString() + "')");
            }
        }
    }

All I do here is determine the type of the data, then return the proper value, referencing my private properties for conversions. For some of the types, I do some special handling. For example, the string types need to allocate managed strings from the native strings, and for the VT_UNKNOWN type I get the COM object that the IUnknown pointer references. Also, the VT_BLOB type is handled so that it gets the size of the data, then allocates a managed byte array to copy the data into. For this type, we have to take into account the architecture.
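
Why the architecture matters for VT_BLOB can be seen from the native layout of a size-plus-pointer struct. The C++ sketch below uses a hypothetical MockBlob stand-in for BLOB and a hypothetical CopyBlob helper that does the same job as the Marshal.Copy call. It assumes natural alignment, under which the pointer sits at offset 4 with 4-byte pointers and at offset 8 with 8-byte pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for BLOB: a 4-byte count followed by a data pointer.
struct MockBlob
{
    uint32_t             cbSize;
    const unsigned char* pBlobData;
};

// Offset of the pointer inside the struct; alignment padding after cbSize
// makes this equal to the pointer size on common ABIs.
inline std::size_t BlobPointerOffset()
{
    return offsetof(MockBlob, pBlobData);
}

// Copies the bytes the blob points at into storage owned by the caller.
inline std::vector<unsigned char> CopyBlob(const MockBlob& blob)
{
    if (blob.pBlobData == nullptr || blob.cbSize == 0)
        return std::vector<unsigned char>();
    return std::vector<unsigned char>(blob.pBlobData, blob.pBlobData + blob.cbSize);
}
```

On a 64-bit build, the value returned by BlobPointerOffset is worth comparing against the offset from which the managed getter reads the pointer.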

Cleaning Things Up

Because I don't rely on the marshaler to clean up unmanaged memory, I need to provide a mechanism to make sure it gets cleaned up. Unfortunately, I can't add a finalizer to a struct to ensure its proper deallocation, but I can provide a method that the user is expected to call when finished with the value:

    [DllImport("ole32.dll")]
    private extern static int PropVariantClear(ref PropVariant pvar);

    public void Clear()
    {
        // Can't pass "this" by ref, so make a copy to call PropVariantClear with
        PropVariant var = this;
        PropVariantClear(ref var);

        // Since we couldn't pass "this" by ref, we need to clear the member fields manually
        // NOTE: PropVariantClear already freed heap data for us, so we are just setting
        //       our references to null.
        vt = (ushort)VarEnum.VT_EMPTY;
        wReserved1 = wReserved2 = wReserved3 = 0;
        p = IntPtr.Zero;
        p2 = 0;
    }

NOTE: I could have implemented the structure as a class in order to implement a finalizer and IDisposable, then require the MarshalAs attribute to marshal as an UnmanagedType.LPStruct, but that would require the user to initialize the class before use. The usage pattern would also be changed for reusing the object for multiple calls. If you were to reuse the object, the fields would be overwritten without being properly deallocated first. This can happen when using a struct, too, but by using the pattern I chose, you are always expected to call the Clear function, whereas with a class, you only have to call it when reusing the object. The inconsistency is what made me settle on using a struct.

Using the PropVariant struct

Using the struct is easy. The usage is similar to the Dispose pattern; however, we can't utilize the using keyword. Taking the GetProp example method from above, we can call it like this:

    PropVariant propVar;
    GetProp(out propVar);
    try
    {
        object value = propVar.Value;
    }
    finally
    {
        propVar.Clear();
    }

If you don't call Clear, and the PROPVARIANT contains a pointer to some data, you *will* have a memory leak.

For reusing the variable, you can wrap all your calls in one try..finally block. This is because calling the Clear method on an empty PropVariant does not throw an exception:

    PropVariant propVar = new PropVariant();
    object value;
    try
    {
        GetProp(out propVar);
        value = propVar.Value;
        propVar.Clear();
        // ... use value ...

        GetProp(out propVar);
        value = propVar.Value;
        propVar.Clear();
        // ... use value ...
    }
    finally
    {
        propVar.Clear();
    }

Source Code

I've provided source for the PropVariant struct as a single C# code file. I've also provided a project that has sample usage, using a C++ COM project, plus a test project.

File	Description
View PropVariant.cs	View the PropVariant implementation online.
	Download the PropVariant implementation as a single C# code file.
	A solution with the PropVariant project, a C++ COM project for usage testing, a C# project for usage testing, and a test project for unit tests.