Excel and Managed Code ... How does that work?

As well as C/C++ and VBA, I'll be blogging a lot about managed code. So I figured it would be useful to lay out the basics of how Excel and managed code interoperate today.  

If you write Excel addins you’ve probably seen articles like this one that show you how you can write your addin in C#. You will have used Visual Studio’s Extensibility Project template to build a simple “Hello, Excel” addin in C# with nothing more than a few clicks of the mouse. I remember the first time I did this too. I remember adding a Messagebox.Show(“Does this really work?”); line to the wizard generated OnConnection method then tentatively pressing F5 not quite sure what I expected to happen.

Well (of course) it all worked. Excel started; loaded my addin; my message box got displayed. I was amazed. How was this happening?

So I looked at the source code the wizard had generated for me and noticed there were a few things I didn’t understand. I mean, I was used to writing COM code and could recognise  ProgIDs and CLSIDs in the code and could make obvious statements like – “Oh, that’s the ProgID of my addin” while hoping that no one followed that up with “How does that ProgID get mapped to your managed code?”.

If you’re like me you like to understand what your code’s doing. I have nothing against wizard generated code (it can save a lot of time) but I still like to be able to read the generated code and know what it is doing and why it is doing it. So I asked myself:

How come Excel can use an addin written entirely in managed code?

COM Addin Basics

I guess a good place to start is by looking at what code gets generated for us. Once we've created our new shared addin project (File->New->Project->Other Project Type->Extensibility->Shared Addin) in Visual Studio we're presented with a solution containing two projects: the addin and a setup project. Let's ignore the setup project for now.

In the addin's project, there is a Connect.cs file containing a class called Connect. This class has some interesting attributes:

Now, if you've written COM addins before you'll likely recognise a few things here. If you're not familiar with COM then these attributes need explaining a bit.

The GuidAttribute contains, well, a Guid. I'm not going to explain what a guid is other than to say that it's a (g)lobally (u)nique (id)entifier. COM relies on guids to identify classes and interfaces. Every COM class and every COM interface has a guid. Additionally, COM uses another identifying value called a ProgID (programmatic identifier). Again, I'm not going to go into ProgIDs. There is endless information on the web covering this subject.

So all we need to understand about COM (for now) is that a COM class has a Guid and a ProgID. The Guid is a unique identifier for the class and the ProgID is what we use to create an instance of the class.

When we register our COM addin with Excel, Excel stores the ProgID of our addin in the registry (more on this later). When Excel starts, it reads the ProgID from the registry and uses it to create an instance of our addin.

But there is more to writing a COM addin than this. Excel needs a way to call our addin, for example to get the addin to initialise itself. So Excel requires that all COM addins implement specific methods. They can, of course, implement additional methods - but they need to implement at least these specific methods in order for Excel to be able to interact with it.

These methods are specified in the IDTExtensibility2 interface.

The IDTExtensibility2 is documented here.

You'll notice that the wizard automatically generated the code to implement IDTExtensibility2 for us.

This sounds like COM stuff ... but I want a Managed Addin 

If you hadn't guessed it by now, a managed addin is really a COM addin written in managed code. The managed addin needs to support the same things that COM addins do and Excel treats it in the same way. In fact, Excel doesn't even know your addin is written in managed code. It simply makes a call to the COM libraries for an instance of your addin's ProgID and gets back a pointer. All the magic that makes this happen isn't in Excel; its in the .NET CLR and is called COM Interop.

The CLR creates a COM Callable Wrapper (CCW) of your addin and it is this CCW that Excel thinks is your addin (CCW's are covered in this article).

The fact that an addin written in C# is really a COM addin is a subtle but important concept to grasp. Excel does not have a managed API - it has a COM API. This means that whenever you call Excel from managed code you are using the COM Interop facilities of the CLR.

Over time, I'll be covering why we need to care about COM Interop. In the meantime, here's a good article.

Calling Excel's API from Managed Code

Like I said, Excel's API is COM based but we can still call it from managed code. When managed code calls unmanaged COM code, the CLR uses a runtime callable wrapper (RCW). This wrapper handles the transitions between the managed world and the unmanaged world.

RCWs are implemented in .NET assembly files and these files are called Interop Assemblies

Excel ships Primary Interop Assemblies (PIAs). A PIA is a normal interop assembly that has been marked by its vendor as being the one everyone should use. So, if a COM library (like Excel) offers a PIA, you should use that rather than generating your own interop assemblies.

There is a little more to PIAs than this, but we don't need to care that much for now. More details on PIAs can be found here and details of the Office PIAs are here.

From managed code, PIAs look like any other managed code. To call the Excel PIAs (and hence call Excel via its COM API via its RCWs by way of its PIAs :o) ) you add a reference to the PIAs. This is done in Visual Studio using the Add Reference dialogue shown below.

Once we've done this, we can make calls to Excel from our managed addin. 

Excel's RCWs are in a namspace called: Microsoft.Office.Interop.Excel. This is a bit of a mouthful so sticking a:

using Excel = Microsoft.Office.Interop.Excel;

statement at the top of your source files means that you can access the RCWs via the Excel namespace i.e. instead of:

Microsoft.Office.Interop.Excel.Application

you can type:

Excel.Application.

The Visual Studio wizard doesn't reference the Excel PIAs for us so this is something we need to do manually.

COM Addins and the Registry

The last main area we should introduce is the registry. Excel stores the ProgIDs of all its COM addins under the HKCU\Software\Microsoft\Office\Excel\Addins. There is a key for each ProgID and subkeys which tell Excel when the addin should be loaded.

For your managed code addin, you need to have the same entries. That is, you need to have a key with a value of your addin's ProgID (i.e. MyAddin.Connect) along with the subkeys as detailed here.

Summary

briefly covered the main topics of writing an Excel addin in managed code. The following are the high level points we should remember about managed addins:

  • 'Managed Addins' are really COM addins;

  • All COM addins have to implement the IDTExtensibility2 interface, including those written in managed code;

  • COM Interop is the magic that lets managed code call Excel and Excel to call managed code;

  • COM Interop uses two types of wrappers/proxies: COM Callable Wrappers (CCW) and Runtime Callable Wrappers (RCW);

  • Every call between Excel and managed code (in either direction) goes via a proxy which marshals between managed code and unmanaged code;

  • Excel ships Primary Interop Assemblies (PIAs) which contain the RCWs of the COM API;

and finally:

  • Visual Studio contains a Shared Addin project template that generates (almost) all the plumbing code we need to get started (we need to add a reference to the Excel PIAs manually);