Gotchas with Reverse Pinvoke (unmanaged to managed code callbacks)

One of the first things I had to do when I started working on Outlook Web Access (I'll call it OWA from now on) was integrate an unmanaged component that another group at Microsoft wrote. After spending some time with it, I decided the best way to bring it in would be to write a very small unmanaged library that wrapped its behavior and exposed a very simple interface to OWA. This interface would consist of only 3 or 4 APIs, and we would use them via PInvoke.

To avoid keeping unnecessary state around in the unmanaged wrapper DLL, one of the APIs would call back into OWA to report incremental results. This is something I hadn't done before in managed code. This is what it looks like:

C++ side:

typedef void (__stdcall *PFN_MYCALLBACK)();
int __stdcall MyUnmanagedApi(PFN_MYCALLBACK callback);

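C# side:

using System;
using System.Runtime.InteropServices;

public delegate void MyCallback();

public static class Program
{
    // The return type matches the int returned by the C++ declaration.
    [DllImport("MYDLL.DLL")]
    public static extern int MyUnmanagedApi(MyCallback callback);

    public static void Main()
    {
        MyUnmanagedApi(
            delegate()
            {
                Console.WriteLine("Called back by unmanaged side");
            }
        );
    }
}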

The CLR will do all the magic to marshal our anonymous delegate to an unmanaged pointer that can be passed out to the C++ side. However, when we look a bit more closely at things we may find some interesting problems:

What happens if my callback throws? What will my unmanaged code see? For example:

delegate()
{
    Console.WriteLine("Called back by unmanaged side");
    throw new ApplicationException("Let's see what happens");
}

If you try this out you will see that your C++ code just sees an SEH exception fly through... oops, my C++ code wasn't ready to deal with SEH stuff, so it may leak. What can I do? In my case I controlled the signature of the callback, so I changed it to return an HRESULT. That way I could morph exceptions into HRESULTs, which is what the CLR does for the COM interop case:

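delegate()
{
    // Constants.S_OK is this project's own constant for the S_OK HRESULT (0).
    int hresult = Constants.S_OK;
    try
    {
        DoWork();
    }
    catch (Exception e)
    {
        // Map the managed exception to the HRESULT that COM interop would report.
        hresult = Marshal.GetHRForException(e);
    }

    return hresult;
}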
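For completeness, here is a rough sketch of what the matching change on the C++ side might look like once the callback returns an HRESULT. ReportResults and its count parameter are made up for illustration, and the typedefs in the #else branch are stand-ins I added so the snippet also compiles off Windows; the real wrapper would just include windows.h.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins (assumption) so the sketch compiles without the Windows SDK.
typedef int32_t HRESULT;
#define __stdcall
#define S_OK ((HRESULT)0)
#define FAILED(hr) (((HRESULT)(hr)) < 0)
#endif

// The callback now returns an HRESULT instead of void.
typedef HRESULT (__stdcall *PFN_MYCALLBACK)();

// Hypothetical helper: reports `count` incremental results through the
// callback and stops at the first failure the managed side reports.
HRESULT ReportResults(PFN_MYCALLBACK callback, int count)
{
    assert(callback != nullptr);
    for (int i = 0; i < count; ++i)
    {
        HRESULT hr = callback();
        if (FAILED(hr))
        {
            return hr; // propagate the failure to our own caller
        }
    }
    return S_OK;
}
```

With this shape the native code never has to unwind through a managed exception in the normal failure case; it just sees a failed HRESULT and can clean up on its own terms.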

Are we done? Nope. What are we missing? For one, a thread abort could be induced at the 'return hresult' statement, so we would be back to square one: the unmanaged code seeing an unexpected SEH exception. Another interesting thing to look at is what the unmanaged to managed transition looks like. First, let's think about what an unmanaged to unmanaged callback would look like:

Let's assume a callback that adds its 2 arguments. This would be some code in the caller:

mov ECX, [x]        ; first argument
mov EDX, [y]        ; second argument
call [callback]     ; indirect call through the callback pointer

This is what the callback code could look like:

mov eax, ecx
add eax, edx
ret

What can go wrong here? Not much. We jump directly to the callee, and the only thing you could hit is a stack overflow, if you are running out of stack when the call instruction pushes the return address. Besides that, this code shouldn't fail.
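In C++ terms, the assembly above corresponds to something like the following sketch. PFN_ADD, Add and InvokeCallback are names I made up for illustration, and the calling convention is left to the compiler rather than matching the register usage shown above.

```cpp
#include <cassert>

// Hypothetical callback type: takes two ints and returns their sum.
typedef int (*PFN_ADD)(int x, int y);

// The callee: no allocations, no marshalling, no runtime transition.
static int Add(int x, int y)
{
    return x + y;
}

// The caller: a plain indirect call through the function pointer.
static int InvokeCallback(PFN_ADD callback, int x, int y)
{
    assert(callback != nullptr);
    return callback(x, y);
}
```

Calling InvokeCallback(Add, 2, 3) goes straight from caller to callee, which is why there is so little room for failure in the purely unmanaged case.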

Now, let's go back to our unmanaged to managed callback. It's definitely more complicated than the unmanaged to unmanaged one. What can go wrong? For example, marshalling an LPWSTR to a string object requires memory allocations, which may fail. Also, you may not be allowed to go into managed code at all because the appdomain the thread was running in is unloading, etc... So things can definitely fail before we have any chance to take any action, and again we are in the hands of the CLR. For COM interop, the CLR knows the signature has an HRESULT, so it will trap any of these problems and return an HRESULT. However, for our callback, what can it do if the callback returns void? In Whidbey, the answer is nothing: again, a rude SEH exception will be raised against us.

So basically, in order to be robust here, the best we can do right now is have the C++ code wrap its callback in a __try/__except block, which will handle these situations and also the thread abort on return I described above. Of course, in this case I owned the unmanaged code, so I was able to fix it, but there are a lot of these callbacks in Windows APIs that won't do the __try/__except, so you are basically always at risk of something bad happening. Don't you hate it when you can't write reliable code?

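// Assumes PFN_MYCALLBACK has been changed to return an HRESULT.
HRESULT __stdcall DoCallback(PFN_MYCALLBACK callback)
{
    HRESULT hr;
    __try
    {
        hr = callback();
    }
    __except(FilterCLRExceptions(GetExceptionCode(), GetExceptionInformation()))
    {
        // OOPS stands in for whatever failure HRESULT you choose to report.
        hr = OOPS;
    }

    return hr;
}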
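I haven't shown FilterCLRExceptions, and the following is not it. It is only a sketch of how such a filter could make its decision, written as a plain function so it compiles anywhere. ClassifyExceptionCode and the constant names are hypothetical; the two exception codes are, as far as I know, the ones the CLR uses when it raises a managed exception as SEH (0xE0434F4D in the 1.x/2.0 runtimes, 0xE0434352 in 4.0 and later), and the return values mirror EXCEPTION_EXECUTE_HANDLER and EXCEPTION_CONTINUE_SEARCH from the Windows SDK. Whether these codes cover every failure described above is not something this sketch establishes.

```cpp
#include <cstdint>

// Mirror the SEH filter results from the Windows SDK headers.
const int kExecuteHandler = 1;  // EXCEPTION_EXECUTE_HANDLER
const int kContinueSearch = 0;  // EXCEPTION_CONTINUE_SEARCH

// Exception codes used by the CLR for managed exceptions (assumption,
// see the text above).
const uint32_t kClrExceptionV2 = 0xE0434F4Du;  // CLR 1.x / 2.0
const uint32_t kClrExceptionV4 = 0xE0434352u;  // CLR 4.0 and later

// Hypothetical decision function: handle exceptions that came from the
// CLR, and let everything else (access violations, etc.) keep propagating.
int ClassifyExceptionCode(uint32_t code)
{
    if (code == kClrExceptionV2 || code == kClrExceptionV4)
    {
        return kExecuteHandler;
    }
    return kContinueSearch;
}
```

On Windows a function like this could be used as __except(ClassifyExceptionCode(GetExceptionCode())). Letting non-CLR exceptions continue the search is a design choice: swallowing an access violation in the wrapper would only hide a real bug.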

I talked with my ex-coworkers about this and they acknowledged that this is not something they are very happy about, but there just wasn't time to address it in Whidbey. Fixing this is on the TODO list for future versions of the CLR.

Note: The one thing that drove me to write this down was that not one of the tutorials or documents that explain 'reverse pinvoke' or unmanaged to managed callbacks mentions these problems, which is something I think we can do much better at (and we really need to do a better job, since 99.999% of people out there can't see our code to see what we really do).