C++/CLI and mixed mode programming

I had a very limited idea of how mixed mode programming on .NET works. In mixed mode, a binary can contain both native and managed code. Such binaries are generally written in a special variant of the C++ language called C++/CLI, and the sources need to be compiled with the /CLR switch.

For some recent work I am doing I had to ramp up on Managed C++ usage and how the .NET runtime supports the mixed mode assemblies generated by it. I wrote up some notes for myself and later thought that it might be helpful for others trying to understand the inner workings.

History

The initial foray of C++ into the managed world was via the Managed Extensions for C++, or MC++. It is deprecated now and was originally released on VS 2003. The MC++ syntax turned out to be too confusing and wasn't widely adopted, so it was soon replaced with C++/CLI. C++/CLI adds a limited set of extensions over C++ and is better designed, so the language feels more in sync with the general C++ language specification.

C++/CLI

C++/CLI code looks like this:

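#include <stdio.h>
using namespace System;

ref class CFoo
{
public:
    CFoo()
    {
        pI = new int;               // native allocation on the C++ heap
        *pI = 42;
        str = L"Hello";             // managed string on the GC heap
    }

    ~CFoo() { this->!CFoo(); }      // destructor: deterministic cleanup

    !CFoo()                         // finalizer: releases the native memory
    {
        delete pI;
        pI = nullptr;
    }

    void ShowFoo()
    {
        printf("%d\n", *pI);        // native CRT call
        Console::WriteLine(str);    // managed BCL call
    }

    int *pI;
    String^ str;
};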

In this code we are defining a reference type class CFoo. This class uses both managed (str) and native (pI) data types and seamlessly calls into managed and native code. There is no special code required to be written by the developer for the interop.

The managed type uses special handles denoted by ^ as in String^ and native pointers continue to use * as in int*. A nice comparison between C++/CLI and C# syntax is available at the end of https://msdn.microsoft.com/en-US/library/ms379617(v=VS.80).aspx. Junfeng also has a good post at https://blogs.msdn.com/b/junfeng/archive/2006/05/20/599434.aspx

The benefits of using mixed mode

  1. Easy to port over C++ code and take the benefit of integrating with other managed code
  2. Access to the extensive managed API surface area
  3. Seamless managed to native and native to managed calls
  4. Static-type checking is available (so no mismatched P/Invoke signatures)
  5. Performance of native code where required
  6. Predictable finalization of native code (e.g. stack based deterministic cleanup)
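
Item 6 deserves a quick illustration. The sketch below is plain, portable C++ rather than C++/CLI, and the names NativeBuffer, UseBuffer and g_live_buffers are made up for this example. It only shows the stack based pattern the list refers to: the destructor runs at the closing brace of the scope, not whenever a garbage collector gets around to it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter so the cleanup is observable from outside.
static int g_live_buffers = 0;

// Hypothetical owner of a native allocation.
class NativeBuffer {
public:
    explicit NativeBuffer(std::size_t n) : data_(new int[n]()), size_(n) {
        ++g_live_buffers;
    }

    ~NativeBuffer() {
        delete[] data_;     // native memory is released right here
        --g_live_buffers;
    }

    // Non-copyable, so the allocation has exactly one owner.
    NativeBuffer(const NativeBuffer&) = delete;
    NativeBuffer& operator=(const NativeBuffer&) = delete;

    int& at(std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    int* data_;
    std::size_t size_;
};

int UseBuffer() {
    NativeBuffer buf(4);    // lives on the stack
    buf.at(0) = 42;
    return buf.at(0);
}                           // ~NativeBuffer runs here, at scope exit
```

After UseBuffer returns, g_live_buffers is back to zero without any explicit cleanup call at the call site.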

 

Implicit Managed and Native Interop

Seamless, statically type-checked, implicit interop between managed and native code is the biggest draw of C++/CLI.

Calls from managed to native and vice versa are handled transparently and can be intermixed. For example, a managed --> unmanaged --> managed call chain works without the developer having to do anything special. This technology is called IJW (It Just Works). We will use the following code to understand the flow.

#include <stdio.h>
using namespace System;

#pragma managed
void ManagedAgain(int n)
{
    Console::WriteLine(L"Managed again {0}", n);
}

#pragma unmanaged
void NativePrint(int n)
{
    wprintf(L"Native Hello World %d\n\n", n);
    ManagedAgain(n);    // native code calling back into managed code
}

#pragma managed

void ManagedPrint(int n)
{
    Console::WriteLine(L"Managed {0}", n);
    NativePrint(n);     // managed code calling into native code
}

The call flow goes from ManagedPrint --> NativePrint --> ManagedAgain.

Native to Managed

For every managed method, the C++ compiler creates both a managed and an unmanaged entry point. The unmanaged entry point is a thunk (call forwarder): it sets up the right managed context and calls into the managed entry point. It is called the IJW thunk.

When a native function calls into a managed function, the compiler actually binds the call to the native forwarding entry point of the managed function. If we inspect the disassembly of NativePrint, we see the following code generated for the call into ManagedAgain:

00D41084  mov         ecx,dword ptr [n]         // Store NativePrint argument n to ECX
00D41087  push        ecx                       // Push n onto stack
00D41088  call        ManagedAgain (0D4105Dh)   // Call IJW Thunk

The native entry point lives at 0x0D4105D. It forwards the call to the actual managed implementation:

ManagedAgain:
00D4105D  jmp         dword ptr [__mep@?ManagedAgain@@$$FYAXH@Z (0D4D000h)]  
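
To make the two-step dispatch above a bit more tangible, here is a small model of it in plain, portable C++. It is only an analogy, not what the compiler emits: ManagedAgainImpl, ManagedAgainThunk, mep_slot and g_trace are hypothetical names, and the real IJW thunk also sets up the managed context, which this sketch does not attempt. What it does show is the shape of the call: the native caller binds to a fixed forwarding entry point, and that entry point jumps through a pointer slot (like the __mep slot in the disassembly) to reach the real implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records which steps ran, so the call path can be inspected.
static std::vector<std::string> g_trace;

// Stand-in for the real (managed) implementation.
void ManagedAgainImpl(int n) {
    g_trace.push_back("impl:" + std::to_string(n));
}

// Stand-in for the __mep slot: a patchable pointer to the implementation.
using EntryPoint = void (*)(int);
static EntryPoint mep_slot = &ManagedAgainImpl;

// Stand-in for the native entry point: it only forwards through the slot.
void ManagedAgainThunk(int n) {
    assert(mep_slot != nullptr);
    g_trace.push_back("thunk");
    mep_slot(n);
}

// The native caller is bound to the thunk, never to the implementation.
void NativePrint(int n) {
    g_trace.push_back("native");
    ManagedAgainThunk(n);
}
```

Calling NativePrint(42) leaves the trace as native, thunk, impl:42, which mirrors the call and jmp pair in the listing above.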

Managed to Native

When a managed function calls into a native function, standard P/Invoke is used. The compiler just defines a P/Invoke signature for the native function in MSIL:

.method assembly static pinvokeimpl(/* No map */)
        void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) 
        NativePrint(int32 A_0) native unmanaged preservesig
{
  .custom instance void [mscorlib]System.Security.SuppressUnmanagedCodeSecurityAttribute::.ctor() = ( 01 00 00 00 ) 
  // Embedded native code
  // Disassembly of native methods is not supported.
  //  Managed TargetRVA = 0x00001070
} // end of method 'Global Functions'::NativePrint

The managed to native call in IL looks like this:

Managed IL:
  IL_0010:  ldarg.0
  IL_0011:  call void modopt([mscorlib]System.Runtime.CompilerServices.CallConvCdecl) NativePrint(int32)

The virtual machine (CLR) at runtime generates the correct thunk to get the managed code to P/Invoke into native code. It also takes care of other things like marshaling the managed argument to native and vice-versa.

Managed to Managed

While it would seem this should be easy, it was a bit more convoluted. Essentially, the compiler always bound calls to the native entry point of a given managed method. So a managed to managed call degenerated to managed -> native -> managed and hence resulted in a suboptimal double P/Invoke (MSDN calls this double thunking). See https://msdn.microsoft.com/en-us/library/ms235292(v=VS.80).aspx

This was fixed in later versions by using dynamic checks to ensure that managed calls go directly to managed targets. However, in some cases managed to managed calls still degenerate to a double P/Invoke. So an additional knob was provided: the __clrcall calling convention keyword. It stops the native entry point from being generated at all. The pitfall is that such methods are not callable from native code. So if I stick a __clrcall in front of ManagedAgain, I get the following build error while compiling NativePrint.

Error    2   error C3642: 'void ManagedAgain(int)' : cannot call a function with __clrcall calling convention from native code   <filename>

/CLR:PURE

If a C++ file is compiled with this flag, a pure MSIL assembly is generated instead of a mixed mode assembly (one that has both native code and MSIL). All methods are __clrcall, and the C++ code is compiled into MSIL and NOT into native code.

This comes with some benefits: the assembly becomes a standard MSIL based assembly that is no different from any other managed-only assembly. It also comes with some limitations. Native code cannot call into the managed code in this assembly because there is no native entry point to call into. However, native data is supported, and the managed code can still transparently call into other native code. Let's see a sample.

I moved all the unmanaged code to a separate C++/CLI DLL:

void NativePrint(int n)
{
    wprintf(L"Native Hello World %d\n\n", n);
}

Then I moved my managed C++ code to a new project and compiled it with /CLR:PURE:

#include "stdafx.h"
#include <stdio.h>

#include "..\Unmanaged\Unmanaged.h"
using namespace System;

void ManagedPrint(int n)
{
    char str[30] = "some cool number";     // native data  
    str[5] = 'f';                          // modifying native data
    Console::WriteLine(L"Managed {0}", n); // call to BCL
    NativePrint(n);                        // call to my own native methods
    printf("%s %d\n\n", str, n);           // CRT
}

int main(array<System::String ^> ^args)
{
    ManagedPrint(42);
    return 0;
}

The above builds and works fine. So even with /CLR:PURE I was able to:

  1. Use native data like a char array and modify it
  2. Call into BCL (Console::WriteLine)
  3. Call transparently into other native code without having to hand generate P/Invoke signatures
  4. Use native CRT (printf)

However, no native code can call into ManagedPrint. Also note that even though pure MSIL is generated, the code is unverifiable (think C# unsafe). So it doesn't get the added safety that the managed runtime provides (e.g. I can just do str[200] = 0 and not get any bounds check error).
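
To make the missing bounds check concrete, here is a small sketch in plain, portable C++ (not C++/CLI, so treat it as an analogy rather than the actual runtime behaviour). The names Buffer, UncheckedWrite and CheckedWrite are made up for illustration. The unchecked version behaves like the raw char array above; the checked version uses std::array::at, which throws std::out_of_range, roughly the kind of guard that verifiable code gets from the runtime.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Same size as the char array in the sample above.
using Buffer = std::array<char, 30>;

// Raw indexing: nothing is checked, so an out-of-range index is
// undefined behaviour (this is the str[200] = 0 situation).
void UncheckedWrite(Buffer& buf, std::size_t index, char value) {
    buf[index] = value;
}

// Checked indexing: returns true if the write happened and false if
// the index was rejected.
bool CheckedWrite(Buffer& buf, std::size_t index, char value) {
    try {
        buf.at(index) = value;  // throws std::out_of_range when index >= 30
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With a 30 element buffer, CheckedWrite(buf, 200, 0) is rejected, while UncheckedWrite(buf, 200, 0) would silently scribble over whatever happens to live there.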

/CLR:Safe

The /CLR:Safe compiler switch generates MSIL-only assemblies whose IL is fully verifiable. The output is no different from anything generated by, say, the C# or VB.NET compilers. This provides more security to the code, but at the same time it loses several capabilities compared to the PURE variant:

  1. No support for CRT
  2. Only explicit P/Invokes

So for /CLR:Safe we need to do the following:

using namespace System;
using namespace System::Runtime::InteropServices;

[DllImport("Unmanaged.dll")]
void NativePrint(int i);

void ManagedPrint(int n)
{
    //char str[3000] = "some cool number"; // will fail to compile with  
    //str[5] = 'f';                        // "this type is not verifiable"

    Console::WriteLine(L"Managed {0}", n);

    NativePrint(n);                        // Hand coded P/Invoke
}

Migration

MSDN has some nice articles for people trying to migrate from /CLR:

  1. To /CLR:Pure https://msdn.microsoft.com/en-US/library/ms173253(v=vs.80).aspx
  2. To /CLR:Safe https://msdn.microsoft.com/en-US/library/ykbbt679(v=vs.80).aspx