The layout of a COM object


The Win32 COM calling convention specifies the layout of the virtual method table (vtable) of an object. If a language/compiler wants to support COM, it must lay out its objects in the specified manner so that other components can use them.

It is no coincidence that the Win32 COM object layout matches closely the C++ object layout. Even though COM was originally developed when C was the predominant programming language, the designers saw fit to "play friendly" with the up-and-coming new language C++.

The layout of a COM object is made explicit in the header files for the various interfaces. For example, here's IPersist from objidl.h, after cleaning up some macros.

typedef struct IPersistVtbl
{
    HRESULT ( STDMETHODCALLTYPE *QueryInterface )(
        IPersist * This,
        /* [in] */ REFIID riid,
        /* [iid_is][out] */ void **ppvObject);

    ULONG ( STDMETHODCALLTYPE *AddRef )(
        IPersist * This);

    ULONG ( STDMETHODCALLTYPE *Release )(
        IPersist * This);

    HRESULT ( STDMETHODCALLTYPE *GetClassID )(
        IPersist * This,
        /* [out] */ CLSID *pClassID);

} IPersistVtbl;

struct IPersist
{
    const struct IPersistVtbl *lpVtbl;
};

This corresponds to the following memory layout:

p -> lpVtbl -> QueryInterface
               AddRef
               Release
               GetClassID

What does this mean?

A COM interface pointer is a pointer to a structure that consists of just a vtable. The vtable is a structure that contains a bunch of function pointers. Each function in the list takes that interface pointer (p) as its first parameter ("this").
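To make that concrete, here is a minimal sketch of the same shape without any Windows headers. The names IThing, IThingVtbl, GetValue, and the helper functions are hypothetical stand-ins, not anything from objidl.h; the only point is that the caller fetches a function pointer out of the vtable and hands the interface pointer back in as the first argument.

```cpp
#include <cassert>

// Hypothetical miniature interface, modeled on the IPersist layout above.
struct IThing;

struct IThingVtbl {
    int (*GetValue)(IThing* This);   // every slot takes the interface pointer first
};

struct IThing {
    const IThingVtbl* lpVtbl;        // the interface is just a vtable pointer
};

// One implementation of the single method. It ignores This because
// this toy object has no state of its own.
static int Thing_GetValue(IThing* /*This*/) { return 42; }

static const IThingVtbl g_thingVtbl = { Thing_GetValue };

// Calls the method the way a C client would: look up the function pointer
// in the vtable and pass the interface pointer explicitly.
int CallGetValue(IThing* p) {
    assert(p != nullptr && p->lpVtbl != nullptr);
    return p->lpVtbl->GetValue(p);
}

// Builds an object on the stack and calls through its vtable.
int DemoCall() {
    IThing thing = { &g_thingVtbl };
    return CallGetValue(&thing);
}
```

In C++ the compiler writes the p->lpVtbl->Method(p, ...) dance for you when you say p->Method(...), which is the sense in which the two layouts match.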

The magic to all this is that since your function gets p as its first parameter, you can "hang" additional stuff onto that vtable:

p -> lpVtbl      -> QueryInterface
     ...            AddRef
     other stuff    Release
     ...            GetClassID

The functions in the vtable can use offsets relative to the interface pointer to access its other stuff.
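A rough sketch of that trick, again with made-up names (ICounter, CounterImpl, FromInterface are all assumptions for illustration): the implementation puts the vtable pointer first and its private state right behind it, and each method converts the incoming interface pointer back to the full object to reach that state.

```cpp
#include <cassert>

struct ICounter;

struct ICounterVtbl {
    unsigned long (*AddRef)(ICounter* This);
    unsigned long (*Release)(ICounter* This);
};

struct ICounter {
    const ICounterVtbl* lpVtbl;
};

// The implementation's private layout: vtable pointer first, then "other stuff".
struct CounterImpl {
    ICounter iface;            // must be the first member so both share one address
    unsigned long refCount;    // state the caller never sees
};

// Recovers the full object from the interface pointer. This is valid because
// iface is the first member of a standard-layout struct.
static CounterImpl* FromInterface(ICounter* This) {
    return reinterpret_cast<CounterImpl*>(This);
}

static unsigned long Counter_AddRef(ICounter* This) {
    return ++FromInterface(This)->refCount;
}

static unsigned long Counter_Release(ICounter* This) {
    // A real COM object would destroy itself when the count reaches zero.
    return --FromInterface(This)->refCount;
}

static const ICounterVtbl g_counterVtbl = { Counter_AddRef, Counter_Release };

// Starts with a count of 1, adds a reference, releases it, returns the final count.
unsigned long DemoRefCount() {
    CounterImpl impl = { { &g_counterVtbl }, 1 };
    ICounter* p = &impl.iface;
    unsigned long afterAddRef = p->lpVtbl->AddRef(p);
    assert(afterAddRef == 2);
    return p->lpVtbl->Release(p);
}
```

The caller only ever sees the ICounter pointer; the reference count sits at a fixed offset from it that only the implementation knows about.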

If an object implements multiple interfaces but they are all descendants of each other, then a single vtable can be used for all of them. For example, the object above is already set to be used either as an IUnknown or as an IPersist, since IUnknown is a subset of IPersist.
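Here is a small illustration of the single-vtable case using ordinary C++ abstract classes. IBaseLike and IDerivedLike are hypothetical stand-ins for IUnknown and IPersist, not the real declarations. The C++ standard does not promise that both pointers are the same address, but the mainstream ABIs lay out single inheritance this way, which is what the sketch checks.

```cpp
#include <cassert>

// Hypothetical stand-ins for IUnknown and IPersist.
struct IBaseLike {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};

struct IDerivedLike : IBaseLike {
    virtual int GetId() = 0;   // extra slot appended after the base slots
};

class Widget : public IDerivedLike {
public:
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override { return --refs_; }
    int GetId() override { return 7; }
private:
    unsigned long refs_ = 1;
};

// Returns true when the base and derived interface pointers are the same
// address, i.e. one vtable pointer serves both interfaces.
bool SamePointerForBothInterfaces() {
    Widget w;
    IDerivedLike* asDerived = &w;
    IBaseLike* asBase = asDerived;

    // The base slots work through either pointer.
    assert(asBase->AddRef() == 2);
    assert(asDerived->Release() == 1);

    return static_cast<void*>(asBase) == static_cast<void*>(asDerived);
}
```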

On the other hand, if an object implements multiple interfaces that are not descendants of each other, then you get multiple inheritance, in which case the object is typically laid out in memory like this:

p -> lpVtbl      -> QueryInterface (1)
q -> lpVtbl      -> QueryInterface (2)    AddRef (1)
     ...            AddRef (2)            Release (1)
     other stuff    Release (2)           ...
     ...            ...

If you are using an interface that comes from the first vtable, then the interface pointer is p. But if you're using an interface that comes from the second vtable, then the interface pointer is q.
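You can watch p and q come apart with plain C++ multiple inheritance. This is only a sketch with invented names (IFirst, ISecond, Both); it is not how any particular COM object is written, but the compiler produces the same kind of layout.

```cpp
#include <cassert>

// Two unrelated hypothetical interfaces.
struct IFirst  { virtual int First() = 0; };
struct ISecond { virtual int Second() = 0; };

// One object implementing both, so it carries two vtable pointers.
class Both : public IFirst, public ISecond {
public:
    int First() override  { return state_; }
    int Second() override { return state_ + 1; }
private:
    int state_ = 10;   // the "other stuff" shared by both interfaces
};

// Returns true when the two interface pointers to the same object differ.
bool DemoDistinctPointers() {
    Both obj;
    IFirst* p = &obj;    // points at the first vtable pointer
    ISecond* q = &obj;   // the compiler adjusts this to the second one

    // Both pointers still reach the same underlying state.
    assert(p->First() == 10);
    assert(q->Second() == 11);

    return static_cast<void*>(p) != static_cast<void*>(q);
}
```

Note that Second() is entered with q, not p, yet it still finds state_. Somebody has to turn q back into the start of the object, and that somebody is the subject of the next installment.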

Hang onto that diagram, because tomorrow we will learn about those mysterious "adjustor thunks".

Comments (39)
  1. Ian Hanschen says:

    Raymond,

    Good read.

    How are you creating your diagrams? Manually writing the HTML, using Word, or? I really dislike having to use photoshop to throw something together that’s going to sit in html that would look just fine using vml/tables.

    -Ian

  2. For a change of pace I have a genuine technical question as a followup, and it even reveals some of my incompetence.

    When a COM object is designed with a dual interface, access to the COM interface is pretty straightforward, callable from Visual Basic and JavaScript etc. The COM interface is also more or less accessible by C++ applications, depending on what fraction of the DLL’s Type Library is understood by Class Wizard. For example if the COM interface uses SAFEARRAYs and the .odl file imports "oaidl.idl" (which by the way is a different file than MSDN says to import for SAFEARRAY) then Visual Basic arrays map onto it perfectly but VC++ clients don’t get interfaces generated by Class Wizard.

    However, the purpose of a dual interface is that the VTBL interface should also be visible to VC++ clients, right? Then the DLL can export methods using unsafe arrays and VC++ clients can call those methods directly, right?

    But I’ve never figured out how to code a VC++ application to access the VTBL interface of classes/methods exported from a DLL. If I try to #include the relevant .h files of the DLL itself then those bring in all kinds of baggage related to the fact that the DLL is a COM server. If I use Class Wizard to generate a .h file from the Type Library then we’re back to the COM interface (and the limitations of Class Wizard). I think I have sufficient skills to hand-code a .h file that will result in compiling the client application, but I’m very suspicious of doing things this way. The purpose of a dual interface is to expose both interfaces to clients, VC++ wizards generate all sorts of code to assist developers, and I don’t think tedious hand-construction of one .h file fits this scenario. There must be something I’m missing.

  3. 2/5/2004 10:02 PM Ian Hanschen:

    > why not use the #import directive?

    Isn’t the effect the same? It interprets the DLL’s Type Library and produces new classes which mostly describe the COM object’s COM interface?

    I want to try accessing the COM object’s VTBL interface but can’t figure out how. I thought the purpose of a dual interface was that clients could access it either way, not being restricted to COM interface (in the case of C++, not being restricted to the portion of the COM interface that VC++ tools understand).

  4. Mike Dimmick says:

    Norman:

    A dual interface has the same structure whether you access it through the vtable or through IDispatch. The methods have the same types. If you want a more C++-friendly interface, use a custom interface.

    The #import statement produces vtable access code; ClassWizard always produces IDispatch access code.

  5. Nate says:

    Internet Explorer, it still can’t render transparent PNGs properly, but it can display VML…..

  6. Reuben Harris says:

    I thought the purpose of a dual interface was to allow clients to invoke methods either natively or by name (through IDispatch). Both involve calling through interface pointers…

    Were you hoping for a plain C++ class with normal methods corresponding to what’s in the typelib?

  7. Raymond Chen says:

    Reuben is correct. The point of dual interfaces is that instead of

    IDispatch *pd;
    CoCreateInstance(CLSID_Shell, NULL, CLSCTX_ALL, IID_IDispatch, (void**)&pd);
    LPOLESTR pszCmd = L"ControlPanelItem";
    DISPID dispid;
    pd->GetIDsOfNames(IID_NULL, &pszCmd, 1, LOCALE_SYSTEM_DEFAULT, &dispid);
    VARIANT vt;
    V_VT(&vt) = VT_BSTR;
    V_BSTR(&vt) = SysAllocString(L"keyboard");
    DISPPARAMS dp = { &vt, NULL, 1, 0 };
    pd->Invoke(dispid, IID_NULL, LOCALE_SYSTEM_DEFAULT, DISPATCH_METHOD, &dp, NULL, NULL, NULL);

    you can do this:

    #include <shldisp.h>

    IShellDispatch *psd;
    CoCreateInstance(CLSID_Shell, NULL, CLSCTX_ALL, IID_IShellDispatch, (void**)&psd);
    BSTR bs = SysAllocString(L"keyboard");
    psd->ControlPanelItem(bs);

  8. I also think that Norman thought dual interfaces would allow certain parameter types to be treated differently (he mentions SAFEARRAY and "unsafe" arrays, which I take to mean conformant, or counted, arrays as the size is needed by the marshalling code). Unfortunately this is not the case. Dual interfaces merely permit "normal" or IDispatch-based calling (a.k.a. early and late binding)

  9. 2/6/2004 6:07 AM Raymond Chen:

    > #include <shldisp.h>

    > IShellDispatch *psd;

    > CoCreateInstance(CLSID_Shell, NULL,

    > CLSCTX_ALL, IID_IShellDispatch, &psd);

    > BSTR bs = SysAllocString(L"keyboard");

    > psd->ControlPanelItem(bs);

    Thank you. For some reason I hadn’t heard of IShellDispatch before. I’m a bit disappointed that even this moderate degree of complexity is necessary. I was hoping for something resembling an ordinary DLL that could export methods of an ordinary class, and the DLL’s client can just include the .h file and make ordinary calls directly. The client didn’t even have to know that a vtbl is involved, though as C++ programmers we know about it. Starting now I will think of [dual] as permitting access through either IDispatch or IShellDispatch, but still not really directly through the vtbl.

    2/6/2004 7:11 AM Paul Bartlett:

    > I also think that Norman thought dual

    > interfaces would allow certain parameter

    > types to be treated differently (he mentions

    > SAFEARRAY and "unsafe" arrays

    Yes I was hoping for that. SAFEARRAYs do get marshalled between the COM DLL’s COM interface and callers in Visual Basic, JavaScript, etc. The DLL’s .odl file has to import "oaidl.idl" (which by the way is a different file than MSDN says to import for SAFEARRAY). But I don’t want to force an ordinary C++ client to use SAFEARRAYs just to access my DLL, so I wanted to export a method with pointer and count parameters through the vtbl. For a plain ordinary DLL without COM that would of course be most straightforward.

  10. Raymond Chen says:

    I guess I don’t understand what you mean by "not really directly through the vtbl". When you write

    psd->ControlPanelItem(bs);

    you’re calling the method through the vtbl, just like any other C++ object.

  11. 2/8/2004 10:37 PM Raymond Chen:

    > When you write

    > psd->ControlPanelItem(bs);

    > you’re calling the method through the vtbl

    You’re right, so that accomplishes what I said I was asking for. Mike Dimmick also explained part of it, saying that the #import statement yields knowledge of the vtbl instead of being a clone of Class Wizard’s knowledge of the COM interface. Now when I have time I need to experiment. I guess I was confused because it was still necessary to call CoCreateInstance() instead of just including a .h file.

  12. Raymond Chen says:

    CoCreateInstance creates the object. If not with CCI, how else would you be able to create the object? (I guess the .h file could have its own creation function, like DirectDrawCreate, but then you’d also need a .lib to link against.)

  13. Mo says:

    I think what Norman was getting at (to a point) is something which exports either a set of plain C functions, or a C++ class which effectively wraps a given CoClass in a typelib – i.e., such that the client doesn’t have to know anything about COM at all.

    Perhaps he’d like something like this:

    Shell *pshell = new Shell();

    pshell->ControlPanelItem("Keyboard");

    Letting the class constructor deal with the CoCreateInstance() for you.

    The problem with this is, of course, that the only sane way to do that is to have a stub class which deals with the construction and marshals parameters for you.

    Alternatively, you could play with deriving the interface structures and add static methods which constructed the COM object for you, giving you something like:

    ShellDispatch *psd = ShellDispatch::Create();

    or perhaps:

    ShellDispatch *psd = Shell::CreateDispatch();

    But that still leaves you with a few problems; most importantly, you have the fundamental problem of interfaces vs. classes. Do you try and roll all of a CoClass’s interfaces into a single C++ class? Or do you have a separate class for each interface? Perhaps you return IShellDispatch like normal, but construct it using a helper class? Whatever your answer, it hasn’t really got you very far (and isn’t much above calling CoCreateInstance(), besides looking prettier).

    I guess what would be really nice (and potentially what Norman *might* have been hankering after) was a way of using COM to, for want of a better term, marshal C++ method calls. We all know how to write COM servers in C++ – wouldn’t it be nice if the client side looked (from a programming interface point of view) like the servers?

    The answer to that is a definite maybe. In reality, COM (and I can only assume by design) went out of its way to strike a balance between convenience and extensibility; the clear division between clients and servers, and between interfaces and classes (which are opaque, save for IUnknown, of course), does you an awful lot of favours. It’s certainly true that the COM APIs aren’t the nicest in the world, but CoCreateInstance isn’t too bad, and once you have your instance there’s very little reason to touch them throughout the lifetime of the instance.

    My suggestion: if CoCreateInstance puts you off, wrap it in a macro call:

    #define CreateShellDispatch(inst) CoCreateInstance(CLSID_Shell, NULL, CLSCTX_ALL, IID_IShellDispatch, &(inst))

  14. Raymond Chen says:

    If the only quibble was having to use CoCreateInstance instead of "new", then – well – that’s what happens when you are designing a language-neutral interoperability system. A wrapper class or helper macro will have to do.

    If you use IDL to generate your interfaces, then the marshalling is done for you by the MIDL compiler. The catch is that the things you pass need to be MIDL-friendly, but that’s unavoidable since MIDL isn’t psychic.

  15. Mo says:

    Well, yes, you’re completely right (of course). In a system such as COM, you’re always going to get a divide between the "ideal" and the "sane".

    From experience, the two biggest hurdles for people using COM for the first time seem to be memory management (especially if delving into the shell interfaces) and parameter types. It’s not so much that there’s anything wrong with the way things are done, more that they’re just so different that it takes a little getting used to.

    Learning COM can often mean throwing out a mindset you’ve built up using various programming languages and trying to think of the bigger picture – a lot of people doing COM stuff will never ever do anything besides inproc servers, so a few of the things COM makes you do might seem like overkill. Even trying to get your head around why IUnknown even works can be a bit of a leap of faith to begin with :)

  16. Norman Diamond says:

    OK, I think I figured out what I wish for. It is something less than COM, it is only for in-process servers where marshalling would be trivial.

    As previously mentioned, the server provides an interface which is accessible from clients in Visual Basic, JavaScript, etc., using their ordinary function or method calls and member variable accesses and "new" operations.

    The same interface is accessible from clients in Visual C++ through the entire weighty COM client infrastructure. For out-of-process or distributed servers, of course all this stuff is necessary in order to do the marshalling. For in-process stuff, besides being overkill, it also overkills ordinary C++ client programmers. The Class Wizard operation builds helpers that map BSTR* to CString to help make some things easier for C++ clients, but it doesn’t understand SAFEARRAY. #import understands all data types used in the class library but it doesn’t build helpers. CString cannot be used in IDL.

    For in-process servers with trivial marshalling, C++ clients of ordinary DLLs can do ordinary function or method calls equally simply as Visual Basic programmers can call COM interfaces. Of course Visual Basic calling COM does have all the execution overhead, but client programmers can write simple code.

    To provide an in-process server with simple interface access from VC++ clients, it still looks like I have to make a separate DLL from the DLL that serves clients in other languages. It would be nice if a single DLL could provide an in-process server with trivial marshalling and simple calling from clients in all languages, instead of needing two DLLS.

  17. Raymond Chen says:

    All you have to do now is come up with a language-independent "new" operator – and then you find that you’ve reinvented CoCreateInstance.

    I don’t see why you need two DLLs for this. C++ clients can link to a DLL just as well as COM. After all, shell32 exposes some objects (CLSID_ShellLink) through COM and still can export functions (SHGetFileInfo) normally.

  18. Norman Diamond says:

    I put a second interface in the IDL file and corresponding code in the .h and .cpp files, intending to make this an interface for C++ clients. Up to a point it worked. But when I added a CString parameter, the MIDL compiler complained. If I use BSTRs then we’re halfway back to the situation where C++ clients have to be overkilled and there’s no point having the second interface any more. (Well, for arrays there might still be a purpose in having the second interface.)

    Or do you mean that the server’s .h and .cpp files can contain additional C++ methods that are not even mentioned in the IDL file? Then I could tell clients of C++ clients not to use either the Class Wizard or #import, ignore the difficult exported methods and just use the simple ones. If a C++ client uses the DLL directly and a VB client uses the COM mechanism, both clients will be served properly?

  19. Raymond Chen says:

    Right, MIDL doesn’t know how to marshal a CString. It does understand boring LPCWSTR though; that may be good enough.

    Or you can just put the stuff you don’t want MIDL to mess with inside a cpp_quote directive. Then MIDL will just emit it blindly without interpreting it.

    And yes then you can tell C++ clients to just #include the header file and make direct calls. VB clients can still use the COM mechanism. Shell32 does this.

  20. Norman Diamond says:

    OK, maybe I’ve figured it out now. One DLL can export two classes. A generic class serves C++ clients and serves an ATL class. The ATL class serves all other languages. The generic class must be declared in its own .h file so that C++ clients can #include that one without being forced to understand a COM interface.

    I still worry that the DLL might get unloaded when the COM usage count goes to 0 and COM doesn’t know that the DLL has non-COM clients.

    Meanwhile, back to the attempted use of #import from a COM interface,

    > MIDL […] does understand boring LPCWSTR

    I neglected to try LPCTSTR. It knows both of the possible underlying types but does it suit VC++ style to a _T()?

    > put the stuff you don’t want MIDL to mess

    > with inside a cpp_quote directive

    I can’t figure out what to use it for. For anything that I want to add to the server and/or make visible to C++ clients, I can just type it into a .h file and MIDL will never see it.

  21. Raymond Chen says:

    If the DLL has non-COM clients they will have done their own LoadLibrary of the target DLL (either by an explicit call to LoadLibrary or implicitly by the loader when it sees the link in the import table), so the DLL reference count will not drop to zero. Even if COM shows up, it will LoadLibrary you (bringing your DLL reference count up to 2), and then FreeLibrary you when it’s done, dropping the count to 1 (not zero, so the DLL is not freed).

    You can’t use LPCTSTR in a header file – that means that your function will get passed Unicode strings by Unicode callers and ANSI strings by ANSI callers – and you can’t tell which is which. If you want to support both Unicode and ANSI callers you need two functions, one W and one A.

    Using cpp_quote lets you reduce to two files (foo.idl and foo.h). Otherwise you need three (foo.idl, foo.h, and fooextra.h where fooextra.h contains the C++ interfaces).

  22. Norman Diamond says:

    > Even if COM shows up […]

    OK, thank you.

    > You can’t use LPCTSTR in a header file

    In an IDL file, for the reason you mentioned. You’re right, I still need to supply separate methods to C++ ANSI clients and C++ Unicode clients (besides the separate interface for all non-C languages).

    > Using cpp_quote lets you reduce to two files

    > (foo.idl and foo.h).

    That I don’t see. If foo.idl generates foo.h then foo.h declares a class with its COM interface declaration. If C++ clients #include foo.h then they will have to understand the entire COM interface declaration even if they don’t use it. I need fooextra.h for C++ clients regardless of whether foo.idl has any cpp_quote stuff.

  23. Raymond Chen says:

    The COM interface declaration is plain C/C++ once you unwrap the macros. Check out objidl.h, for example.

  24. Norman Diamond says:

    > The COM interface declaration is plain C/C++

    > once you unwrap the macros.

    Of course, but clients do not automatically include all the header files required for that unwrapping. I already got lost, trying to hunt down everything that would have to be included for it. Even though foo.cpp compiles while it’s including foo.h, it’s pretty hard to make fooclient.cpp compile when it’s including foo.h. It seems to be easier to let fooclient.cpp include fooextra.h instead of foo.h.

  25. Raymond Chen says:

    MIDL automatically sticks the necessary #include directives at the top of the generated .h file. But if you don’t like it, then fine, create two .h files.

  26. Norman Diamond says:

    > MIDL automatically sticks the necessary

    > #include directives at the top of the

    > generated .h file.

    I guess VC++ .NET did that for you. For me, VC++ 6 SP5 sticks some necessary #include directives at the top of foo.h, some buried further down in foo.h, some in StdAfx.h, and some at the top of foo.c. For a client (.exe) it didn’t generate as many #include directives as for the server (.dll), and didn’t generate all of the needed ones. It’s not a matter of liking it or not, but I kept getting lost when following the chains of #include directives and could not get a client to compile when the client had #include foo.h.

  27. Raymond Chen says:

    foo.c? stdafx.h? I was talking about foo.idl and the autogenerated foo.h. At the top of the foo.h that is produced by the MIDL compiler you’ll see

    #include "rpc.h"

    #include "rpcndr.h"

    #include "windows.h"

    #include "ole2.h"

    These define the macros used by the interface declarations.

  28. Norman Diamond says:

    OK, I wasn’t clear when mentioning which #include directives were generated by which parts of VC++ 6. The MFC DLL wizard generates foo.h and StdAfx.h. The MIDL compiler generates foo_i.h, foo_i.c, and foo_p.c.

    When a client’s .h file had #include foo.h, it didn’t compile. I kept getting lost while trying to track down other header files that it needed. From your latest reply I guess I should have tried to #include foo_i.h.

    Meanwhile I finished making a foo_extra.h and refactored the server. The COM interface is working for a VB client and the extra class is working for a VC++ client, so my real wish seems to have come true, it’s all one DLL.

    Whether development effort can be further optimized by telling VC++ clients to #include foo_i.h instead of foo_extra.h, I’ll experiment again when I have time.

    Hmm, now I see why cpp_quote can reduce the number of .h files. Stuff that is presently in foo.h can be fed into foo_i.h through cpp_quote instead. But then what happens if someone uses the VC++ 6 IDE to add a class member? Additions to Ifoo go in foo.idl, but additions to Cfoo ordinarily go into foo.h.

  29. Raymond Chen says:

    True, if you have other tools that munge header files then you can’t use cpp_quote. Personally I don’t use any of those wizardly things. If I want to edit something, I whip out "vi".
