Passing by address versus passing by reference, a puzzle


Commenter Mike Petry asked via the Suggestion Box:

Why can you dereference a COM interface pointer and pass it to a function that takes a COM interface reference?

The call.

OutputDebugString(_T("IntfByRef::Execute - Begin\n"));
BadBoy badone;
CComPtr<IDoer> Doer;
Doer.CoCreateInstance(CLSID_Doer, NULL, CLSCTX_INPROC_SERVER);

// created a raw pointer - maybe the
// smart pointer was affecting it somehow.
IDoer* Doer2;
Doer.CopyTo(&Doer2);

badone.stupid_method(*Doer2);
Doer2->Release();
// no still works.

The function called.

void stupid_method(IDoer& IDoerRef)
{
 IDoerRef.Do();
 CComQIPtr<IDispatch> WatchIt(&IDoerRef);

 if( WatchIt )
  OutputDebugString(_T("QI the address of the ")
                    _T("ref works - this is weird\n"));
 else
  OutputDebugString(_T("At least trying to QI the ")
                    _T("address of the ref fails\n"));
}

I found some code written like this during a code review. It is wrong but it seems to work.

You already know the answer to this question. You merely got distracted by the use of a COM interface. Let me rephrase the question, using an abstract C++ class instead of a COM interface. (The virtualness isn't important to the discussion.) Given this code:

class Doer {
 public: virtual void Do() = 0;
};

void stupid_method(Doer& ref); // declared first so that caller compiles

void caller(Doer *p)
{
 stupid_method(*p);
}

void stupid_method(Doer& ref)
{
 ref.Do();
}

How is this different from the pointer version?

void stupid_method2(Doer *p); // declared first so that caller2 compiles

void caller2(Doer *p)
{
 stupid_method2(p);
}

void stupid_method2(Doer *p)
{
 p->Do();
}

The answer: From the compiler's point of view, it's the same. I could prove this by going into what references mean, but you'd just find that boring, so instead I'll show you the generated code. First, the version that passes by reference:

; void caller(Doer *p) { stupid_method(*p); }

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 ff 75 08         push    DWORD PTR _p$[ebp]
  00006 e8 00 00 00 00   call    stupid_method
  0000b 5d               pop     ebp
  0000c c2 04 00         ret     4

; void stupid_method(Doer& ref) { ref.Do(); }

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 8b 4d 08         mov     ecx, DWORD PTR _ref$[ebp]
  00006 8b 01            mov     eax, DWORD PTR [ecx]
  00008 ff 10            call    DWORD PTR [eax]
  0000a 5d               pop     ebp
  0000b c2 04 00         ret     4

Now the version that passes by address:

; void caller2(Doer *p) { stupid_method2(p); }

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 ff 75 08         push    DWORD PTR _p$[ebp]
  00006 e8 00 00 00 00   call    stupid_method2
  0000b 5d               pop     ebp
  0000c c2 04 00         ret     4

; void stupid_method2(Doer *p) { p->Do(); }

  00000 55               push    ebp
  00001 8b ec            mov     ebp, esp
  00003 8b 4d 08         mov     ecx, DWORD PTR _p$[ebp]
  00006 8b 01            mov     eax, DWORD PTR [ecx]
  00008 ff 10            call    DWORD PTR [eax]
  0000a 5d               pop     ebp
  0000b c2 04 00         ret     4

Notice that the code generation is identical.
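The same point can be checked from inside C++ without reading assembly. The following is only a sketch: CountingDoer and the helper function names are made up for illustration and do not appear in the original question. It shows that taking the address of the reference parameter gives back the very pointer the caller dereferenced, which is presumably why the QueryInterface on &IDoerRef in the original code succeeds.

```cpp
#include <cassert>

// Doer mirrors the abstract class above; the virtual destructor is an addition.
class Doer {
public:
    virtual void Do() = 0;
    virtual ~Doer() = default;
};

// Hypothetical concrete implementation, used only to count calls.
class CountingDoer : public Doer {
public:
    int calls = 0;
    void Do() override { ++calls; }
};

// Takes the object by reference and reports the address the reference is bound to.
const Doer* address_seen_by_reference(Doer& ref) {
    ref.Do();
    return &ref;  // address of the referenced object, not of a copy
}

// Takes the object by address and reports that address.
const Doer* address_seen_by_pointer(Doer* p) {
    p->Do();
    return p;
}

// Both calling styles reach the same object through the same address.
void demo() {
    CountingDoer d;
    Doer* p = &d;
    assert(address_seen_by_reference(*p) == p);
    assert(address_seen_by_pointer(p) == p);
    assert(d.calls == 2);  // Do() was dispatched virtually both times
}
```

Calling demo() should pass all three assertions: no copy of the object is made when *p is bound to the reference parameter, so the abstractness of Doer never comes into play.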

If you're still baffled, go ask your local C++ expert.

Mind you, dereferencing an abstract object is highly unusual and will probably cause the people who read your code to scratch their heads, but it is nevertheless technically legal, in the same way it is technically legal to give a function that deletes an item the name add_item.

Comments (29)
  1. Stewart says:

    I use exactly this technique with C++ abstract classes all the time. This is because, to me and my team at least, passing by reference in this way does not imply a transfer of ownership, whereas passing by pointer typically does. This leaves the reader in no doubt that the method call will not delete the object. This is reinforced by the fact that "delete &ref" just looks wrong, so hopefully no-one would do it.

    I would never do it with COM interfaces though, although the same logic could be used.

  2. Thomas says:

    To Stewart:

    Transferring ownership using a "raw" pointer is normally a bug; to correctly transfer ownership, pass a "smart pointer object".

    To me, the decision between pointer and reference should be based on the question whether NULL should be able to be passed.

  3. josh says:

    I’ve argued with a number of people (including instructors) who somehow think that references are not pointers but rather just introduce another name for an object.  :(

    Meanwhile, even though this produces exactly the same code in both cases (here and in probably every other implementation), I believe it’s technically not guaranteed to work in C++ if the pointer is null.  (and I don’t just mean this particular example…  If stupid_method did not even touch its arguments, this would still be true.)

  4. Thomas says:

    My argument is not just technical.

    Using a pointer signals to the human reader that NULL may be passed. Using a reference signals her that NULL is not going to be passed.

  5. KristofU says:

    I don’t think dereferencing an abstract object is weird. Polymorphism works on pointers and on references. So why should this be a problem?

    It’s just that the C++ syntax forces you to write  ‘*pPointer’ to pass the object as a reference.

    References are just a way to introduce object semantics instead of value semantics, without using a pointer.

    And yes you can also do this on null pointers, which of course can result in disaster.

  6. SamK says:

    I’ll also put my vote in the "not weird" camp.  I think that using references in this context is often semantically superior, mostly due to the reasons already cited.

    I think Ray Trent’s comments, about the functions "contract" with the caller, are key.  Putting a burden on the caller, explicitly, is very useful.

    In complex environments, it may be hard to tell if a pointer has already been vetted.  This is so because there’s no way to contractually communicate to a callee that a pointer has been vetted.  This typically leads to duplicate checks for NULL, etc., throughout a call chain.  If using references, the caller is contractually obligated to provide valid data.  This removes the need for pointer checks in the callee.

    I’m also a big fan of explicit ownership/responsibility in function contracts.  References, like the "const" keyword, tighten up the contract, and I use both judiciously.

  7. Sebastian Redl says:

    > I’ve argued with a number of people (including instructors) who somehow think that references are not pointers but rather just introduce another name for an object.

    Conceptually, these people are right. Technically, of course, references are always implemented as pointers.

    > I believe it’s technically not guaranteed to work in C++ if the pointer is null.

    Indeed.

    int *p = 0;

    int &r = *p;

    This is undefined behaviour according to the standard. The case is even explicitly mentioned somewhere.

  8. Ray Trent says:

    To amplify that last slightly: Using a reference signals that, absent a compiler bug, stack overflow, etc., NULL *cannot* be passed.

    In fact, it also signals that (absent those conditions again) the reference will always "point to" a *valid* object (at the time of the call).

    Pointers have none of those guarantees, and thus are easier to misuse, but passing NULL is a valuable signal to a function that an object is not applicable or invalid. Of course you could always overload the function with one fewer parameter and have the two functions call a private pointer-taking function, and at least get a guarantee that the pointer will *either* be NULL or valid.

  9. Igor says:

    ^ Jesus… what people do and get paid, are there no quality standards today?

  10. Anony Moose says:

    I agree. Anyone who can’t tell the difference between a "null" reference and a reference to an object located at address zero has no standards at all.  ;)

    "Null reference" and "pointer to object at address zero" are not synonyms, and there are machines where an object at address 0 is valid and usable.

    But the typical use of that idea in code designed for an x86 machine and used to indicate a reference to an object that the developer knows is not valid is still a really bad idea.

  11. GregM says:

    Yeah, references should NEVER be NULL.  I scream inside each time I see the code like this in one of the components we use:

    extern void DoStuff(int a, int b, int c = 0, object &r = *(object *)NULL);

    No, I am not kidding.  They apparently REALLY wanted to add reference parameters to functions which already had optional parameters.  This then requires that they check the address of the parameter later to see if it’s NULL.

  12. josh says:

    "Conceptually, these people are right."

    It may be a valid way to think about references in some contexts, but I think it’s dangerous.  I could easily see it biting someone who doesn’t completely understand object lifetimes.

  13. Michael Fitzpatrick says:

    You forgot one very important and subtle difference:

    void stupid_method2(Doer * const p)

    {

    p->Do();

    }

    Using a ref denies the called routine from modifying the pointer

  14. Csaboka says:

    Anony Moose:

    Bjarne Stroustrup doesn’t agree with you in his FAQ (http://www.research.att.com/~bs/bs_faq2.html#null)

    "Should I use NULL or 0?

    In C++, the definition of NULL is 0, so there is only an aesthetic difference."

  15. Goran says:

    josh:

    "It may be a valid way to think about references in some contexts, but I think it’s dangerous.  I could easily see it biting someone who doesn’t completely understand object lifetimes."

    I disagree. Dangling pointer vs. dangling reference <=> potato vs. potato. Are you saying that someone will think that holding a reference will make underlying object alive? Well, that someone must know its craft. No excuse for that in my book ;-)

  16. Stewart says:

    To Thomas,

    Good point, and in an ideal world one would use a smart pointer for this. Sadly, the question of which is the hard part.

    Of the boost ones, only boost::shared_ptr allows transfer of ownership, and that's a big fat smart pointer for simple cases.

    Of the SCL ones, std::auto_ptr would be perfect if it wasn’t for the fact that it is useful for very little else. I have seen junior engineers copy its usage because it was used in this case and get it VERY wrong. Simply having it in the code can be dangerous if the uninitiated (most C++ programmers sadly) copy it.

  17. IMil says:

    To Michael Fitzpatrick:

    The same const modifier may be applied to pointer. There is indeed a subtle and somewhat confusing difference. If you write

    void ptr_method(const class SomeClass* c)

    you may not modify the object:

    c->ChangeMe(); //illegal

    c = someOtherPointer; //OK

    But

    void ptr_method(class SomeClass* const c)

    means that you may not modify the pointer:

    c->ChangeMe(); //OK

    c = someOtherPointer; //illegal

  18. Goran says:

    +1 in "not weird" camp.

    I’d say that "Dereferencing an abstract object" question shouldn’t even be asked. If it’s abstract, and *p was assigned to an actual object, a constructor must be called, at which point compiler will bark at abstract members. If *p goes to a reference, it shouldn’t matter, to a well-versed C++-er, if it’s abstract.

    Are you underestimating your audience, huh? I am collectively hurt! ;-)

  19. josh says:

    The token "0" is null, but it’s not necessarily a representation of address zero.  I don’t think the language even defines how absolute address values relate to pointers at all.  I’m not sure why Anony Moose is talking about address zero though.

    "Dangling pointer vs. dangling reference <=> potato vs. potato."

    Yes, exactly.  If you thought a reference is just another name for the object, you may not see that.

    "Are you saying that someone will think that holding a reference will make underlying object alive?"

    If they need a crutch to understand references because they don’t get pointers, I would not be surprised to see that happening.

  20. BryanK says:

    Csaboka:  It’s not just Bjarne that thinks that way.  The C virtual machine (yes, there is one, it’s just *very* similar to the underlying hardware most of the time) specifies that an all-bits-zero pointer is equivalent to NULL.  NULL in C is *ALWAYS* zero.

    See the comp.lang.c FAQ, questions 5.5 and 5.13 (and others in section 5):

    http://c-faq.com/null/machnon0.html

    http://c-faq.com/null/varieties.html

  21. asdf says:

    A "null pointer constant" is any constant integral expression that evaluates to 0. Any null pointer constant when converted (except via reinterpret_cast) to a pointer type yields the "null pointer value" of that type (i.e. what most people usually mean when they say NULL).

    What Anony Moose is talking about is the object representation of the pointer (which roughly means what address the pointer points to). The null pointer value isn’t guaranteed to point to address 0. So:

    reinterpret_cast<T*>(0)

    can do anything, but:

    static_cast<T*>(0)

    is guaranteed to evaluate to the null pointer value of type T* (and yes, the null pointer value for each pointer type isn’t guaranteed to point to the same address either).

    Note: before you tell me I’m wrong because the standard explicitly says reinterpret_cast<T*>(0) results in the null pointer value, that was an obvious defect: http://www.open-std.org/JTC1/sc22/wg21/docs/cwg_defects.html#463

  22. Norman Diamond says:

    Tuesday, March 27, 2007 9:27 AM by BryanK

    > The C virtual machine (yes, there is one,

    > it’s just *very* similar to the underlying

    > hardware most of the time) specifies that an

    > all-bits-zero pointer is equivalent to NULL.

    > NULL in C is *ALWAYS* zero.

    Wrong.  BryanK, see AND READ the comp.lang.c FAQ, questions 5.5 and 5.13, particularly 5.13:

    http://c-faq.com/null/machnon0.html

    http://c-faq.com/null/varieties.html

    0 in a source program’s syntax turns into a null pointer constant at compile time, which can turn into null pointers as needed at compile time.  The representations of null pointers in the execution environment don’t have to be all-bits-zero.

    In fact some computer architectures (e.g. Intel) can provide hardware assistance to debug some fraction of unintended attempts to dereference null pointers (e.g. a scalar object of length 2 or more bytes) if a null pointer is represented by all-bits-one.

    For practical purposes this fight was lost long ago, because only antisocial weirdo thermonuclear geeks were willing to learn that a 0 in syntax didn’t have to be all-bits-zero at execution time.

    By the way the title of this thread would have been better as "Passing by pointer versus passing by reference, a puzzle".  For practical purposes both pointers and references are "usually" addresses, but from the language’s point of view either or both could be represented differently.

    Another difference also arises from a choice to use a pointer vs. a reference.  In some cases use of a reference will automatically convert some argument to a temporary and use the temporary, but use of a pointer won’t.

  23. BryanK says:

    Aw, crap.  s/whose values is/whose value is/ — I hate it when I rewrite a sentence but don’t fix it properly.

  24. BryanK says:

    You probably got thrown off by the "all-bits-zero" part, and yes, that was poorly worded.  I should have said "a pointer whose values is the constant zero is equivalent to NULL."

    The rest is right though — NULL in C (and by extension, C++) is *ALWAYS* zero.  A NULL pointer will always compare equal to the constant zero (because in comparison context, the compiler can tell what kind of pointer it needs to use, so it can generate code that uses a nonzero bit pattern if it needs to), and assigning the constant value zero to a pointer will set its bits to whatever the real hardware uses for null pointers.

  25. Norman Diamond says:

    Wednesday, March 28, 2007 8:20 AM by BryanK

    > I should have said "a pointer whose values is

    > the constant zero is equivalent to NULL."

    The C and C++ standards only make that guarantee for certain specified forms and only when they’re known to be constant at compile time.

    For example:

    int *pf() = NULL;  // OK

    int *pf() = 0;  // OK

    int *pf() = 3.5 - 3.5;  // compilers can decide

    int *pf() = (void*) 0;  // OK

    int *pf() = (void*)(void*) 0;  // prohibited

    > NULL in C (and by extension, C++) is *ALWAYS* zero.

    If you mean at execution time, it is *NOT ALWAYS* (except when speaking practically as mentioned earlier, because of so many broken programs that have to be catered to).  If you mean at compile time, then NULL is always something that the implementor knew would be equivalent to a compile-time zero of some sort, but that says nothing about its execution-time representation.

  26. BryanK says:

    No, I didn’t mean at execution time, that was basically what I was trying to say in my previous post.

    A pointer variable which is currently holding a null pointer will always compare equal to an unadorned constant zero, because the constant zero is interpreted in a pointer context (because it’s being compared to a pointer).  But the actual bits of the pointer variable may not all be zero (if you print out an expression like *((int *)(&ptr)), or maybe even (int)ptr, you may not get zero).

    So in that sense, you’re right, it’s not "always" zero.  But if the programmer compares the pointer-variable-containing-a-null-pointer to a constant zero, the comparison will always succeed.

    (I’m not sure why the "(void *)(void *) 0" expression is prohibited: Is it just because you’re doing a double-cast to the same type?)

  27. Norman Diamond says:

    Thursday, March 29, 2007 8:19 AM by BryanK

    > (I’m not sure why the "(void *)(void *) 0"

    > expression is prohibited: Is it just because

    > you’re doing a double-cast to the same type?)

    The redundant cast is perfectly legal.  The standard doesn’t even mention the redundancy, there’s no problem with that.

    The result is a value which is a null pointer, which is a constant, and which has type (void*).  But that isn’t enough for the hypothetical usage which I put it to.  My examples require null pointer constants.  A null pointer constant has some magic features besides simply being a null pointer and constant.

    For comparison again:

    int *pi = (void*)(void*) 0;  // legal in C

    int *pf() = (void*)(void*) 0;  // illegal in C

    // (both are illegal in C++, I think)

    Around 10 years ago I posted in comp.std.c about (void*)(void*) 0 not being a null pointer.  dmr posted a followup saying he was planning to write something about nasal daemons, but he double-checked the standard before writing and he agreed.

  28. Norman Diamond says:

    Friday, March 30, 2007 4:49 AM by Norman Diamond

    Idiot.

    You know you’re mad when you start talking to yourself.  Well let me assure you, you were right to be mad.

  29. Norman Diamond says:

    Thursday, March 29, 2007 10:25 PM by Norman Diamond

    > Around 10 years ago I posted in comp.std.c

    > about (void*)(void*) 0 not being a null

    > pointer.

    Idiot.  You just finished explaining the difference between a value which only happens to be a null pointer and a constant, and a value which has the additional magic property of being a null pointer constant.  And then here you screwed it up already.

    Now get this.  (void*)(void*) 0 IS a _null_pointer_.  And it’s constant.  What it isn’t, is that it isn’t a _null_pointer_constant_.  That’s why dmr agreed with you.

    Now we can only wonder why no one else tore you to shreds on this before I did.  You sure deserve it.

Comments are closed.
