VS, DTE, COM, and COMException (:-P)

Recently, for testing, I wanted to manipulate Visual Studio using DTE. DTE is Visual Studio’s automation API, but it also tends to be used when writing VS extensions and so on. I hadn’t done much COM programming before, but I had written a little P/Invoke interop code, and I sort of knew that .NET also has really useful interop features for dealing with COM interfaces. I should also mention that in the Visual Studio SDK, which you can download, all the DTE COM interfaces are wrapped up in a nice managed assembly, so you do not need to do a lot of heavy lifting importing every interface definition.

In this case there was one extra thing which promised to make my life really easy. Someone had already written libraries I could use, which find the correct Visual Studio COM object for the process I want and provide wrapper functions for most of the useful interfaces. There’s just one tiny problem. I’ve started using this library, and something somewhere is going wrong: I am getting a 0xC0000005 access violation because code has branched to some illegal memory location, and I have no idea at all how it got there. Indeed, all I have is a sinking feeling in my stomach, a feeling that this library I am using has a bug of the nasty-to-debug kind.

Well, I figure that thanks to the amazing power of the internet, plus a little reverse engineering, it might not be hard to reimplement the DTE functionality that I need. And I would have just a slightly easier time if I am debugging my own code rather than someone else’s library. And of course it might be a fun and educational experience. Or one out of two.

 

Part 1: Getting a DTE object for a Visual Studio process (devenv.exe) that you have already launched (input: a process ID)

One possible way of getting a DTE handle for a Visual Studio process is to use the object creation APIs to create a new DTE object, using Activator.CreateInstance. My understanding is that this launches a new instance of VS, which is not what I want. So we fall back to the reference document, “How to: Get References to the DTE and DTE2 Objects”. This document does show how to attach to a running instance of VS, but unfortunately it doesn’t seem to provide a way to specify which particular instance you want to attach to when there are multiple instances, which is all too often the case. However, it does offer a lot of useful hints, such as the fact that you can get the object from the ‘Running Object Table’.

The Running Object Table is something I hadn’t heard of before, but it’s easy enough to understand. Any COM object can be registered in a global (machine-wide) table of objects, and thereby made browsable and reachable from other processes. With this new keyword, I stumbled across a helpful CodeProject article, “Automating a specific instance of Visual Studio.Net using C#”. Wait, that’s exactly what I want to do, right? Why did it take so long to get this far?

The article is fairly short, and for some reason, even though the author explains that the process ID of Visual Studio will be part of the running object’s name in the table, the code posted there doesn’t actually filter on this ID. I ended up with basically the same code, but with a minor variation inside the enumeration loop:

    // Depending on the VS version and process ID, we are looking for
    // something like "!VisualStudio.DTE.10.0:5656".
    string displayName;
    monikers[0].GetDisplayName(bindContext, null, out displayName);
    // Include the ':' separator so that process 656 does not match ":5656".
    string processIdSuffix = ":" + processId.ToString();

    if (displayName.StartsWith("!VisualStudio.DTE", StringComparison.Ordinal)
        && displayName.EndsWith(processIdSuffix, StringComparison.Ordinal))
    {
        // this is probably it
        object boundObject;
        runningObjectTable.GetObject(monikers[0], out boundObject);
        EnvDTE80.DTE2 dte2 = boundObject as EnvDTE80.DTE2;
        if (dte2 != null)
        {
            return dte2;
        }
    }
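For context, the fragment above sits inside a loop over the Running Object Table. Here is a rough sketch of what that surrounding loop can look like. It is not the CodeProject code: the names DteMoniker, RunningObjects, IsForProcess and Find are invented for illustration, and the display-name format is assumed from the example in the comment above. It only does anything useful on Windows.

```csharp
using System;
using System.Globalization;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

// Hypothetical helper: pure string check on a Running Object Table name.
public static class DteMoniker
{
    // Assumes names of the form "!VisualStudio.DTE.<version>:<processId>".
    public static bool IsForProcess(string displayName, int processId)
    {
        string suffix = ":" + processId.ToString(CultureInfo.InvariantCulture);
        return displayName != null
            && displayName.StartsWith("!VisualStudio.DTE", StringComparison.Ordinal)
            && displayName.EndsWith(suffix, StringComparison.Ordinal);
    }
}

// Hypothetical helper: walks the Running Object Table (Windows only).
public static class RunningObjects
{
    [DllImport("ole32.dll")]
    private static extern int GetRunningObjectTable(int reserved, out IRunningObjectTable table);

    [DllImport("ole32.dll")]
    private static extern int CreateBindCtx(int reserved, out IBindCtx bindContext);

    // Returns the first registered object whose display name satisfies the
    // predicate, or null if nothing matches.
    public static object Find(Predicate<string> nameMatches)
    {
        IRunningObjectTable runningObjectTable;
        Marshal.ThrowExceptionForHR(GetRunningObjectTable(0, out runningObjectTable));
        IBindCtx bindContext;
        Marshal.ThrowExceptionForHR(CreateBindCtx(0, out bindContext));

        IEnumMoniker enumerator;
        runningObjectTable.EnumRunning(out enumerator);

        IMoniker[] monikers = new IMoniker[1];
        // Next returns S_OK (0) for as long as it can fetch one more moniker.
        while (enumerator.Next(1, monikers, IntPtr.Zero) == 0)
        {
            string displayName;
            monikers[0].GetDisplayName(bindContext, null, out displayName);
            if (!nameMatches(displayName))
            {
                continue;
            }

            object boundObject;
            if (runningObjectTable.GetObject(monikers[0], out boundObject) == 0)
            {
                return boundObject;
            }
        }
        return null;
    }
}
```

A caller would do something like RunningObjects.Find(name => DteMoniker.IsForProcess(name, 5656)) and then cast the result with "as EnvDTE80.DTE2", exactly as in the fragment above.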

So now I can throw away that library, right?

Oh darn, I am still getting Access Violations… Maybe it’s because of that other code I wrote just before?

 

Part 2: Message Filtering

There is another COM concept I first heard about from teammate Anders Liu: a message filter. Anders did this great presentation on how we could adopt COM message filtering in our automation code and thereby solve one of the little pain points we had when calling DTE from another process: random COMExceptions. That needs a little explaining:

The Random COMException Problem

Random COMExceptions were, well, one of the many banes of our existence while testing Workflow Designer 4.0 in Visual Studio 2010. The scenario is basically that you would call some wonderful function in an automation helper library which you expect to do something really useful. And, for no apparent reason, this function would throw a COMException. We didn’t even realize we were calling COM when we were calling this function, but clearly somewhere it was… in fact it was probably using DTE. For a long time we avoided doing anything about this problem because it would not reproduce when we tried to debug it. But the more tests we wrote and the more other sporadic issues were ironed out, the worse, relatively, this particular issue became. Eventually one day someone realized there was a fairly simple workaround:

    while (true)
    {
        try
        {
            ThatWonderfulThing();
            break;
        }
        catch (COMException) { }
    }

Retrying the operation until it succeeded would nearly always work. (We didn’t actually loop forever; we timed out after several attempts.) But the sad thing is that it wasn’t just one function in the library we were calling which could throw COMException. It was dozens of them. And the way we would find out which dozens was to have dozens of tests fail from COMException, not all at once but over a period of weeks and months. So COMException became this ongoing, well-known problem on our team.
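The bounded version of that retry loop (give up after several attempts rather than spin forever) might look roughly like the sketch below. The helper name ComRetry, its parameters, and the pause between attempts are assumptions made for illustration, not the team’s actual code.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical helper: retry an action while it keeps throwing COMException.
public static class ComRetry
{
    public static void Run(Action action, int maxAttempts, TimeSpan delay)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return; // success, stop retrying
            }
            catch (COMException)
            {
                // Out of attempts: let the last COMException reach the caller.
                if (attempt >= maxAttempts) throw;
                // Give the other process a moment before trying again.
                Thread.Sleep(delay);
            }
        }
    }
}
```

With this, the loop above becomes something like ComRetry.Run(ThatWonderfulThing, 5, TimeSpan.FromMilliseconds(500)). Any exception other than COMException still propagates on the first attempt.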

At this point our team had not yet adopted DTE. But a few months later, I think someone said something like ‘hey, it would make our automation faster and more reliable if we used DTE.’ OK that’s probably not true at all, we may have adopted it because we desperately needed a workaround for something really bad happening when we tried to create projects using the UI. I don’t really remember. But at some point everyone started to think DTE would indeed be faster and more reliable, and we started using it more heavily. And we started getting a whole bunch more COMException issues, because basically anything you can do using the DTE interface wrappers can, randomly, throw COMException. Because it’s all COM. So all through our code we had to add extra code whose only purpose was to do automatic retry until we didn’t get COMException.

Why were we getting COMException anyway? The answer appears to be that Visual Studio is temporarily too busy to let us know whether it can process our DTE request or not. In more detail, VS is not pumping messages in the main message loop, so the COM request (which is probably a SendMessageTimeout* under the hood) times out, and comes back to us as an exception.

Which brings us up to the presentation. Anders showed us a lot of code which implemented a standard COM pattern he had found, one that exists partly to prevent these timed-out, message-failed error codes from causing every COM developer the same pain we were feeling: the IMessageFilter interface. In hindsight, I think we should have adopted Anders’s IMessageFilter approach straight after he presented it to us. It would have saved us many ‘add missing retry’ check-ins, where we added yet more catch/retry logic to functions we had recently started calling that we didn’t realize could sometimes throw.
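To give a feel for the pattern, here is a minimal sketch of a client-side message filter. This is not the code from the presentation, which isn’t reproduced here. The COM interface and CoRegisterMessageFilter are real, but the names IOleMessageFilter, RetryPolicy and RetryingMessageFilter, the 250 ms wait and the 30000-tick cutoff are all assumptions chosen for illustration. Registration has to happen on an STA thread, and only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

// COM's IMessageFilter, declared by hand for interop. Named IOleMessageFilter
// here only to avoid confusion with System.Windows.Forms.IMessageFilter.
[ComImport, Guid("00000016-0000-0000-C000-000000000046"),
 InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IOleMessageFilter
{
    [PreserveSig]
    int HandleInComingCall(int dwCallType, IntPtr hTaskCaller, int dwTickCount, IntPtr lpInterfaceInfo);

    [PreserveSig]
    int RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount, int dwRejectType);

    [PreserveSig]
    int MessagePending(IntPtr hTaskCallee, int dwTickCount, int dwPendingType);
}

// Hypothetical policy, kept separate so it can be exercised without COM.
public static class RetryPolicy
{
    public const int ServerCallRetryLater = 2; // SERVERCALL_RETRYLATER
    public const int CancelCall = -1;          // the caller sees the failure
    public const int WaitMilliseconds = 250;   // values >= 100: wait, then retry

    public static int Decide(int elapsedTicks, int rejectType, int giveUpAfterTicks)
    {
        if (rejectType != ServerCallRetryLater) return CancelCall;
        if (elapsedTicks >= giveUpAfterTicks) return CancelCall;
        return WaitMilliseconds;
    }
}

public sealed class RetryingMessageFilter : IOleMessageFilter
{
    [DllImport("ole32.dll")]
    private static extern int CoRegisterMessageFilter(IOleMessageFilter newFilter, out IOleMessageFilter oldFilter);

    // Call once on the STA thread that makes the DTE calls.
    public static void Register()
    {
        IOleMessageFilter previous;
        Marshal.ThrowExceptionForHR(CoRegisterMessageFilter(new RetryingMessageFilter(), out previous));
    }

    // Removes the filter again by registering null.
    public static void Revoke()
    {
        IOleMessageFilter previous;
        Marshal.ThrowExceptionForHR(CoRegisterMessageFilter(null, out previous));
    }

    public int HandleInComingCall(int dwCallType, IntPtr hTaskCaller, int dwTickCount, IntPtr lpInterfaceInfo)
    {
        return 0; // SERVERCALL_ISHANDLED: accept incoming calls
    }

    public int RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount, int dwRejectType)
    {
        // The callee said "busy, retry later": let COM retry instead of throwing.
        return RetryPolicy.Decide(dwTickCount, dwRejectType, 30000);
    }

    public int MessagePending(IntPtr hTaskCallee, int dwTickCount, int dwPendingType)
    {
        return 2; // PENDINGMSG_WAITDEFPROCESS: keep waiting for the reply
    }
}
```

The idea is that COM asks RetryRejectedCall what to do whenever the other process rejects a call as busy, so the retry happens in one place instead of in a try/catch around every DTE call.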

So, with the benefit of that hindsight, I thought ‘why don’t I start using message filters?’

[To be continued…]
[Footnote: *I didn’t research, I’m just guessing…]