I have a Fit, but a lack of Focus.


Here’s a statement I read the other day about making comparisons between objects of reference type in C#:

Object.ReferenceEquals(x,y) returns true if and only if x and y refer to the same object.

True or false?

My wife Leah recently acquired a Honda Fit, thanks to the imminent failure of the automatic transmission solenoids in her aged Honda Civic. The back seats in the Fit fold down flat. You can fit a llama or a whole pile of hula hoops or whatever into that thing. It’s quite handy. Not what I would call a powerful engine by any means, but for quick trips around town, it certainly gets the job done.

Since we were married when she bought the car, and we continue to be married, what’s mine is hers and what’s hers is mine. So if x = Eric’s Honda Fit, and y = Leah’s Honda Fit, then x and y are “reference equals”. Those two things refer to the same object, viz, the shiny black object full of llamas and hula hoops in my driveway.

Now, we could have bought a different car. Say, a Ford Focus. But we did not. We own a total of zero Ford Foci. Suppose I said that x = Eric’s Ford Focus, and y = Leah’s Ford Focus. What’s the sensible way to characterize the nature of x and y? Do we say that x and y refer to the same Ford Focus, namely that they refer to the Ford Focus that does not exist? The mind boggles at the repugnant and paradoxical implication that there exists a Ford Focus that is the Ford Focus that does not exist! (*) Rather, the right way to characterize this is to say that neither x nor y refer to any object. They’re “null references” — references that do not have any referent, but rather, capture the notion of “a lack of referent”.

And that’s why it’s incorrect to say that Object.ReferenceEquals(x,y) returns true if and only if x and y refer to the same object. If x and y both do not refer to any object, then clearly they do not refer to the same object, because neither refers to an object in the first place. The correct way to characterize the behaviour of reference equality is:

Object.ReferenceEquals(x,y) returns true if and only if either x and y refer to the same object, or x and y are both null references.
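A minimal sketch of the four cases may help make this concrete. The `Car` class and the method names are hypothetical, invented for illustration; only `Object.ReferenceEquals` comes from the framework.

```csharp
using System;

public static class ReferenceEqualityDemo
{
    // Hypothetical stand-in for a car; any reference type would do.
    public sealed class Car
    {
        public string Model { get; }
        public Car(string model) { Model = model; }
    }

    // Two variables that refer to one object: reference-equal.
    public static bool SharedFit()
    {
        Car fit = new Car("Fit");
        Car erics = fit;
        Car leahs = fit;
        return Object.ReferenceEquals(erics, leahs);
    }

    // Two variables that refer to no object at all: also reference-equal.
    public static bool MissingFoci()
    {
        Car ericsFocus = null;
        Car leahsFocus = null;
        return Object.ReferenceEquals(ericsFocus, leahsFocus);
    }

    // Two distinct objects with identical contents: not reference-equal.
    public static bool TwoSeparateFits()
    {
        return Object.ReferenceEquals(new Car("Fit"), new Car("Fit"));
    }

    // One referent and one null reference: not reference-equal.
    public static bool FitVersusNothing()
    {
        Car fit = new Car("Fit");
        Car focus = null;
        return Object.ReferenceEquals(fit, focus);
    }
}
```

Under these assumptions, `SharedFit()` and `MissingFoci()` return true, while `TwoSeparateFits()` and `FitVersusNothing()` return false.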

***********

(*) And yet I am a fan of the “null object pattern”. Life is just full of these little contradictions.

[Eric is on vacation this week; this posting is pre-recorded.]

Comments (19)

  1. Michael says:

    as usual, great example (and funny too) to illustrate a point.

  2. There’s one other place where object.ReferenceEquals() won’t work as you expect: when you use value types.  Silly, but true!  (Of course the name makes it quite clear that this is intended for reference types, but if you’re in a generic method and haven’t used a "class" constraint, that might not be immediately obvious.)

    To wit:

    static bool Foo<T>(T value)

    {

       return object.ReferenceEquals(value, value);

    }

    Surely ‘value’ should compare equal to itself!

    Foo("value") == true; // expected

    Foo(42) == false; // Hmm…

    Oops.

    Of course, with a little thought we know that this is correct — ‘value’ is boxed twice, and thus two different objects are compared — but it could still be confusing to see that a value is not, in fact, reference equal to itself. :-)
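    A compilable version of the commenter’s experiment might look like the sketch below. The class and method names are made up for illustration; the second method shows the "class" constraint mentioned above, and the third is one possible alternative (not something the comment proposes) that compares by value through `EqualityComparer<T>.Default`.

```csharp
using System;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Unconstrained T: when T is a value type, each argument is boxed
    // separately, so two different box objects are compared.
    public static bool IsSelfReferenceEqual<T>(T value)
    {
        return Object.ReferenceEquals(value, value);
    }

    // With a "class" constraint, calling this with an int is a
    // compile-time error, so the surprise cannot happen.
    public static bool IsSelfReferenceEqualClassOnly<T>(T value) where T : class
    {
        return Object.ReferenceEquals(value, value);
    }

    // Value-aware comparison that behaves sensibly for both kinds of types.
    public static bool IsSelfEqual<T>(T value)
    {
        return EqualityComparer<T>.Default.Equals(value, value);
    }
}
```

    Here `BoxingDemo.IsSelfReferenceEqual("value")` is true and `BoxingDemo.IsSelfReferenceEqual(42)` is false, while `BoxingDemo.IsSelfEqual(42)` is true.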

  3. Aga says:

    that was fun to read, thanks

  4. Jim says:

    Analysis like this is why you belong in language development.  Most impressive.

    It’s amazing how straightforward this stuff is when you take the time to logic it out.

  5. Anand says:

    Interesting !!!! It’s been a while since I was in the language world; articles like these create interest in research

  6. keith says:

    The first statement is correct in ternary logic: Null = Null should not evaluate as True, but as Null (until you know, Schrodinger’s cat is both alive and dead).  If you know by a business rule that assets are joint, then you can project your current, future, and non-possessions will be the same.  But I understand that "null=null is true" is a convention of the language, not that you’re confirming joint ownership.  

    Maybe the example misleads me or my years of SQL simply ruin me for the OO mindset: did you and your wife reject the class of all Focuses without considering a specific one (I suspect this is the correct way to frame this, or that neither you or your wife know anything about any Focuses, and the question is being asked from outside your point of view), or is there a single one you both rejected?  If so, I’d agree you both refer to the same non-object.  But what if there is a set of specific Focuses that were compared in an evaluation function that the Fit won?  Perhaps you preferred the purchased Fit to the first Focus "returned by the universe to your query", say a blue hatchback Focus at Foo Dealership that you saw advertised on TV, while your wife preferred the Fit to a silver wagon Focus at Bar Dealership that she saw on her commute.  In this case, the Focuses in your minds are not the same, even if the evaluation was just in your mind.   These would not be the same.

  7. Joren says:

    You could easily define that by ‘null and x refer to the same object’ you mean ‘x is null’. This might not be intuitively obvious, but then at least the relation ‘refer to the same object’ is defined over all references, which is mathematically pleasing, like closure is for operators.

  8. rob says:

    @keith

    True they would not be the same, but neither would they be Eric’s Ford Focus and Leah’s Ford Focus.  Since Eric and Leah own no Foci those are both null.  But The Ford Focus Considered By Eric and The Ford Focus Considered By Leah are not null: they have real and different referents and hence are neither null nor equal.  

    In SQL terms you are following the Considered relationship rather than the Owns relationship between Persons and Ford Foci.  Or, more brutally, your query is: "SELECT * FROM FORD_FOCI WHERE ID IN (SELECT CONSIDERED FROM PERSONS WHERE NAME LIKE 'ERIC');" instead of "SELECT * FROM FORD_FOCI WHERE ID IN (SELECT OWNS FROM PERSONS WHERE NAME LIKE 'ERIC');".

    It’s a rookie mistake – try to keep your focus.

  9. estee says:

    Oh well, just a moment ago I learned what the billion-dollar mistake is, and now there’s your story =)

  10. Mike says:

    >> Since we were married when she bought the car, and we continue to be married, what’s mine is hers and what’s hers is mine

    Prior to getting married my father had a slightly different spin on this. He said: "What’s yours is hers and what is hers is hers."

    -Mike

  11. long time silent reader says:

    Hi Eric,

    Since you are a fan of the null object pattern, why did you (your team) not implement it with events in C#? I often wish it was the default implementation.

    I would like that too, but there are a few reasons to not do so. First off, doing so complicates initialization; it is a nice fact that initializing all memory associated with a new object to zeros gives default semantics; if the memory manager had to figure out what object reference to put into an event field, that would complicate the memory manager. I suppose we could have done it in the compiler to generate the initialization in the construction sequence. We could ensure that the field is never accessed before it is initialized.

    But then what to do for structs with events? Structs do not have a default constructor. And a nice thing about structs is that allocating and deallocating them is very cheap if they can stay on the stack; the singleton delegate would have to be allocated on the heap. It’s a nice property of structs that you can often use them in a manner that guarantees that they do not impose load on the garbage collector; this design choice would take away that ability from all users who want to have structs with events.

    In short, it’s easy enough for you to use this pattern yourself if you want to take on its costs in exchange for its benefits. It’s hard to say whether the increased flexibility of being able to do it both ways “pays for” the increase in potential for people doing it wrong. These things are judgment calls; that’s what makes language design so interesting. — Eric
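    For readers who want to try the pattern themselves, one common way to do it is sketched below. The class names are hypothetical, and this is only one possible shape of the idea: the event is initialized with a do-nothing handler so that the backing field is never null.

```csharp
using System;

// Opt-in null object pattern: the event always has at least one handler.
public sealed class Thermometer
{
    // The empty anonymous method acts as the "null object".
    public event EventHandler Changed = delegate { };

    public void Report()
    {
        // No null check needed; with no subscribers this calls the empty handler.
        Changed(this, EventArgs.Empty);
    }
}

// Default behaviour for comparison: the event field starts out null.
public sealed class PlainThermometer
{
    public event EventHandler Changed;

    public void Report()
    {
        // Copy to a local, then check for null before invoking.
        EventHandler handler = Changed;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}
```

    The trade is an extra delegate invocation on every raise in exchange for dropping the null check at each call site.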

  12. P Baughman says:

    I’ve always considered "null" to be a singleton object  and therefore (inside my brain) your second statement is redundant.

  13. Pavel Minaev [MSFT] says:

    @keith:

    In SQL, NULL really means "unknown". Of course, when you compare one value that is "unknown" to another value that is also "unknown", the result can only be "unknown".

    In C# (and Java, etc), null doesn’t mean "unknown". It means "there is no object here". The correct way to translate that to relational terms would be to treat every object reference as a reference to a relation which may have either 0 or 1 tuples. Null means 0 tuples. Obviously, two relations with 0 tuples each are definitely equal – "unknown" doesn’t enter into equation here.

    … well, except for nullable value types, where for some reason the existing meaning (which is implicit, but clearly defined by semantics of ReferenceEquals) was scrapped in C# 2.0 in favor of treating null as "unknown" for everything but comparisons.
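    The behaviour of nullable value types can be checked with a small sketch like the following. The class and method names are invented for illustration; the operators themselves are the standard lifted operators on `int?` and `bool?`.

```csharp
using System;

public static class LiftedOperatorsDemo
{
    // Arithmetic propagates null, much like "unknown" would.
    public static int? AddOne(int? x)
    {
        return x + 1;
    }

    // Equality treats two nulls as equal and returns a plain bool.
    public static bool AreEqual(int? x, int? y)
    {
        return x == y;
    }

    // Ordering comparisons return false when either operand is null.
    public static bool IsLessOrEqual(int? x, int? y)
    {
        return x <= y;
    }

    // Logical AND on bool? follows three-valued logic.
    public static bool? And(bool? x, bool? y)
    {
        return x & y;
    }
}
```

    With these definitions, `AddOne(null)` is null and `AreEqual(null, null)` is true, yet `IsLessOrEqual(null, null)` is false. `And(null, false)` is false, while `And(null, true)` is null.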

    Of course, in practice, you often see NULL being used in C# meaning in real-world SQL, and null being used in SQL meaning in real-world C#, so it’s probably pointless to try to draw the line now.

  14. Mark Knell says:

    @Pavel "In SQL, NULL really means ‘unknown’"

    If we’re splitting hairs (and why else are any of us here, really), null semantics are pretty close to meaning "unknown" when considered at row scope.  The aggregate operators often ignore nulls, though. To me, that doesn’t fit the metaphor.  

    2 + 2 + null = null  

    Sum({2, 2, null}) = 4  [not SQL notation, obviously]

    With null semantics, it’s not clear to me that anything "really means" anything. At best it is "really easier to understand" if you remember a certain few patterns and the scopes in which they apply.

  15. Denis says:

    @Pavel, @keith, @Eric, @and the rest,

    So what on Earth does the NULL mean: unknown or non-existent? The example in the post itself (about the Ford Focus that was never bought) obviously points at the latter, but the comments about the SQL NULL seem to speak in favour of the former…

    Also, from my practical experience with Assembler and C/C++ on "good old" (well, certainly old!) MS-DOS, the NULL pointer means, "can be anything – a random memory address (but set to zero by most implementations) – do not touch!" That’s more like "unknown", but that’s if you regard a NULL pointer as a memory address. If you wax all object-oriented and see it as an un-initialized reference to an object, whatever that object is, then the meaning is, "you don’t know what’s on ‘the other side’, but you know what isn’t there: there is almost certainly no object you are looking for, so don’t delude yourself by assuming there is one". This is still mostly about unknown, but has a tinge of non-existence to it, too. And from here, the SQL people seem to move in the direction of the "unknown", and the C#/.NET people say it’s "non-existent" – the Great Void, which is the same for all non-existent objects (so that Object.ReferenceEquals(x,y) returns true if both x and y are set to null)…

    Good fun, that. Ignoti et quasi occulti… :-)

  16. Denis says:

    Sorry, everyone: I’ve typed up some nonsense above. I totally confused an uninitialized and un-allocated memory in MS-DOS and a pointer set to NULL. The latter may not have always been set to 0000:0000 (what if you switch to the protected mode, for example?), but it was always a definite value, not an unknown address. My memory must be going… :-)

    But my main question still stands: what is NULL now? Does it mean "unknown" or "non-existent?"

  17. Gabe says:

    How often do you see structs with events in them? I can’t think of an instance where that is called for, but there may be some significant use case I’m not taking into account.

  18. Sam says:

    I'm confused, is there a typo?  These two statements (at the end of the post) seem to contradict each other:

    "If x and y both do not refer to any object, then clearly they do not  refer to the same object, because neither refers to an object in the first place."

    "Object.ReferenceEquals(x,y) returns true if and only if either x and y refer to the same object, or x and y are both null references."

  19. pgeerkens@hotmail.com says:

    As a clarification, I believe the correct semantics of the SQL null is "I don't know, yet.", which more closely highlights its relationship to the Halting Problem and is subtly different from "Unknown". It also highlights my belief that NULLs should not be allowed in base tables, as they are always a temporary convenience made use of during object construction.