The Revised C++ Language Design Supporting .NET – Part 1

Probably
the most conspicuous and eyebrow-lifting change between the original and revised design
of the dynamic programming support within C++ is the change in the declaration of
a .NET reference type:


original language

Object * obj = 0;


revised language

Object ^ obj = nullptr;


There is actually a great deal of significance to that change – not enough to justify a
book, certainly, but more than enough to warrant a first entry in a blog. So, here
we go.


There are two primary questions that get asked when people see this: Why the hat (as the
^ is called affectionately along the corridors here within Microsoft)? And, more fundamentally,
why any new syntax at all? Why couldn’t the original language design be cleaned up
with less invasiveness, rather than the admittedly in-your-face strangeness of the
revised language design, various aspects of which will be the topic of this blog?


The original design challenge of supporting .NET within C++ is that the object models
of the two are very, very, well, really very different. Really. It is kind of like
comparing Kansas with Oz. I suppose I need to support that, even in something as informal as a blog.
Ok, let’s see what I can do here.


C++ is built upon a machine-oriented systems view. Although it supports a high-level type
system, there is always an escape mechanism, and those mechanisms always lead down
into the bowels of the machine – you cast away type, you suppress the virtual mechanism,
you turn names into relatively fixed machine addresses. Think Neo and the Matrix,
if you are not into Dorothy and find Kansas a bit too, well, native. C++ supports a static (that is, compile-time) physics – even
when we push the envelope and poke into the run-time, it is all strictly set up at
compile-time: not just the virtual mechanism, but the exception-handling model as
well. When push comes to shove, and the user is hard-pressed to pull a rabbit out
of the hat, she tunnels under the program abstractions, picking apart types into addresses
and offsets. (Think of Luke turning the computer-aided target system off in the original
Star Wars and trusting to the force – that’s your essential C programmer, and those
that don’t believe, can’t believe you’re really doing that. Don’t you realize the
Universe is at stake?)


.NET is built atop a software component layer, in this case called the Common Language
Runtime (CLR). This is a fundamental design solution for really difficult problems.
Back when I began programming, the best advice I received – this was back at Bell
Labs – was, if you don’t know how to solve it, add a pointer – that is, add a level
of indirection, and that will give you enough slack to do something smart. Well, that
was ... gosh, that was a long time ago. (It was about the same time that making overheads for
a talk meant manually attaching blank acetate to a printed page of text and hand-feeding
it through a toaster that burned the text onto the acetate, which you caught before it
fell to the ground, separated, then began the process over again. Ok. I digress.)


The C++ equivalent of adding a pointer is defining a class – that is, adding a layer of
abstraction. This is what the STL does with iterators in order to transparently support
walking across contiguous vectors and non-contiguous lists. The trade-off is two-fold:
an additional layer of complexity and a possible loss of efficiency are the usual
suspects in terms of downside. The upside is increased flexibility and, at times,
elegance bordering on beauty. This is what .NET is about, but in this case, the level
of abstraction is a runtime software level. It supports a dynamic, run-time physics.
When push comes to shove, the user reflects upon
the execution environment, querying, coding, and creating objects literally out of
thin air. Instead of tunneling, one soars, but the experience can be unsettling to
those used to having both feet on the ground.
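
To make the iterator point concrete, here is a minimal sketch in ISO C++. The names sum_range and demo_iterators are hypothetical, invented for this illustration; the only claim is that one piece of code can walk both kinds of container because the iterator class hides how "the next element" is reached.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical helper (not from the post): sums any range described by a
// pair of iterators, without knowing how the container stores its elements.
template <typename Iterator>
int sum_range(Iterator first, Iterator last)
{
    int total = 0;
    for (; first != last; ++first)  // ++ means "next element", however that is reached
        total += *first;            // * yields the element the iterator refers to
    return total;
}

// Walks a contiguous vector and a node-based list with the same code.
inline void demo_iterators()
{
    std::vector<int> contiguous;
    std::list<int> linked;
    for (int i = 1; i <= 4; ++i) {
        contiguous.push_back(i);
        linked.push_back(i);
    }
    assert(sum_range(contiguous.begin(), contiguous.end()) == 10);
    assert(sum_range(linked.begin(), linked.end()) == 10);
}
```

Calling demo_iterators() from main exercises both walks. The extra layer costs a class per container, which is the complexity side of the trade-off described above.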


If we give the C++ static object model the code-name Kansas, and the .NET dynamic
object model the code-name Oz, then the work of the language
design is to bridge these two worlds both within the C++ language and within the C++
community at large. Neither of these was an easy task.


Let me engage in a bit of poetic license for a moment [I’m permitted that; I have an MFA
in writing packed in a box somewhere in a garage or basement here in the Northwest,
or back in Los Angeles, or even further back in New York City – this is called a biographical
aside [I’ve been told by Martyn Lovell that these are permitted in blogs [but only
with three levels of parenthetical nesting!]!]!]. So, I’m claiming, in a fit of poetical
musing, that C++ and .NET represent, respectively, the physics of Kansas
and Oz. For example, you can get from here to there under the Oz semantics by clasping
your hands together, squeezing your eyes tightly shut (no peeking, please, all addresses
are encapsulated), and just wishing. And poof. You are transported and all references
to your location are updated without you having to do a thing. In Kansas,
however, either you use public transportation, your own vehicle, or you cast all
that aside, and you walk. If you don’t physically update everyone with your new location,
you will be lost to them. That’s how it works down here in Kansas.


  • So, the question becomes, how do we get our Dorothy to become a successful resident of
    Oz while not offending her




  • Or, alternatively, how can we allow her to move between the two worlds, rather than feeling
    each to be the shadowy dream world of the other?


[End of poetic aside. You can start reading again.]


Let’s consider this from a more prosaic viewpoint. [After all, an MFA, while a mighty fine achievement,
is a terminal degree. Trust me on that.] So, what does it mean when we write the following?

T t;




In ISO C++, regardless of the nature of T, we are certain of the following characteristics:
(1) there is a compile-time memory commitment of bytes associated with t equal to sizeof(T),
(2) this memory associated with t is independent of all other objects within the program
during the extent of t, (3) the memory directly holds the state/values associated with t,
and (4) this memory and state persist for the extent of t.
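
A small sketch may help pin the first three characteristics down. Point is a hypothetical stand-in for T, and the function names are invented for this illustration; nothing here comes from the original design documents.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value type standing in for the post's T.
struct Point {
    int x;
    int y;
};

// (1) The storage committed to a Point object is known at compile time.
const std::size_t point_bytes = sizeof(Point);

// (2) and (3): each object owns its storage and holds its state directly,
// so a copy is a separate object and changing it leaves the original alone.
inline bool copies_are_independent()
{
    Point t = {1, 2};
    Point u = t;   // copies t's state into u's own storage
    u.x = 99;      // modifies u only
    return t.x == 1 && u.x == 99 && &t != &u;
}

inline void demo_value_object()
{
    Point t = {3, 4};
    assert(sizeof(t) == point_bytes);  // the object occupies exactly sizeof(Point) bytes
    assert(copies_are_independent());
}
```

Characteristic (4) is simply the scope rule: t inside demo_value_object exists from its declaration until the function returns.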


Here are some of the consequences of these characteristics. Item (1) tells us that
t cannot be polymorphic. That is, a polymorphic type cannot have a compile-time memory commitment
except in the trivial case in which derived instances do not impose additional memory
requirements. This is true regardless of whether T is
a primitive type or serves as a base class to a complex hierarchy. A polymorphic type
in C++ is possible only when the type is qualified with a token modifying its meaning
– either T* representing a pointer, or T& representing
a reference – such that the object of the declaration only indirectly refers to
an object of T. (Those hostile to C++ find this a huge laughing point, gleefully
pointing out the naïve slicing errors that occur when an object (that is, an unqualified
T) is used as a polymorphic target of an assignment or initialization.
[This observation is actually not an aside, but a set-up for motivating the introduction
of the hat (^) notation.])
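
For readers who have not been bitten by it, slicing looks roughly like this. Shape and Circle are a hypothetical hierarchy made up for the illustration; the sketch only contrasts an unqualified T with T& and T*.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy, used only for illustration.
struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    double radius;
    explicit Circle(double r) : radius(r) {}
    virtual std::string name() const { return "Circle"; }
};

// By value: the parameter is a complete Shape object of fixed size, so only
// the Shape part of the argument is copied and the Circle part is sliced away.
inline std::string name_by_value(Shape s) { return s.name(); }

// By reference or pointer: the parameter only refers to the caller's object,
// so the virtual call reaches the derived class.
inline std::string name_by_reference(const Shape& s) { return s.name(); }
inline std::string name_by_pointer(const Shape* s) { return s->name(); }

inline void demo_slicing()
{
    Circle c(1.0);
    assert(name_by_value(c) == "Shape");       // sliced
    assert(name_by_reference(c) == "Circle");  // polymorphic
    assert(name_by_pointer(&c) == "Circle");   // polymorphic
}
```

The by-value call compiles without complaint, which is exactly why the error is called naïve: the compile-time memory commitment for a Shape has no room for a Circle.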


This separation of value and reference within a single notational type system was a deliberate
design decision by Bjarne Stroustrup in the late 1970s, based on his doctoral experience
with using Simula-67, in which all objects are allocated on the runtime heap
and all object access is indirect through a transparent handle. At the time, the Simula-67
object model proved prohibitively expensive for the then-current machines and resource
availability. [By prohibitively expensive I mean that the required work could not
be carried out in the time allotted.]


[A brief digression on pointers and references that will later prove to be relevant to the
introduction of the hat (^) syntax for .NET reference types]


To delay resource commitment until run-time, two forms of indirection are explicitly supported
in C++:


  1. Pointers:
    T *pt = 0;
  2. References:
    T &rt = *pt; // oh, well ...

Neither form is well-behaved under the model supported by traditional OO languages (again,
to their followers’ great amusement).


Pointers conform to the C++ Object Model. In


T *pt = 0;


pt directly holds a value of type size_t that is of fixed size and extent. Lexical cues are used
to toggle between the direct use of the pointer and the indirect use of the object
addressed. It can be unclear at times which mode applies to what and when or how (and
is a form of celebrated obscurantism etched within code as a kind of tattoo proving
mental toughness):
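
A minimal sketch of that toggling, with hypothetical names and plain ints (this is an illustration written for this passage, not the post’s own example):

```cpp
#include <cassert>

// Each statement is labelled with the mode it uses: "direct" touches the
// pointer itself, "indirect" touches the object the pointer addresses.
inline int demo_pointer_modes()
{
    int a = 1;
    int b = 2;

    int* p = &a;   // direct: p itself is given a value, the address of a
    *p = 10;       // indirect: the int that p addresses (a) is written
    assert(a == 10);

    p = &b;        // direct: p is re-aimed at b; a and b are untouched
    assert(*p == 2);

    ++*p;          // indirect: increments b through p
    assert(b == 3);

    int* q = p;    // direct: copies the address, so q and p address the same int
    *q = 7;        // indirect: the change is visible through p as well
    assert(*p == 7);

    return b;      // b was last written through q
}
```

The only cue separating `p = &b` from `*p = 10` is one character, which is where the confusion (and the tattoo) comes from.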


References provide syntactic relief from the seeming lexical complexity of pointers while retaining
their efficiency:


Matrix operator+( const Matrix&, const Matrix& );

m3 = m1 + m2;


References do not toggle between a direct and
an indirect mode; rather, they phase-shift between
the two: (a) at initialization, they are directly manipulated, but (b) on all subsequent
uses, they are transparent.


In a sense, references represent a quantum anomaly in the physics of the C++ Object Model:
(a) they take up space but, except for temporary objects, they are immaterial, (b)
they exhibit deep copy on assignment and shallow copy on initialization, and (c) unlike
const, they really are immutable. While they are not all that useful within ISO C++,
except as function parameters, they turn out to be the inspirational pivot upon which
the language revision turns.
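
The phase shift can be watched in a few lines of ISO C++. The function name is hypothetical and ints stand in for any T; this is a sketch of the behavior described above, nothing more.

```cpp
#include <cassert>

inline int demo_reference_phases()
{
    int a = 1;
    int b = 2;

    int& r = a;         // initialization: r is bound to a; no value is copied
    assert(&r == &a);   // even taking r's address yields a's address

    r = b;              // assignment: b's value is copied into a; r is not rebound
    assert(a == 2);
    assert(&r == &a);   // the binding itself cannot be changed

    b = 5;              // so changing b afterwards does not affect a or r
    assert(r == 2);

    r = 9;              // every later use of r is simply a use of a
    return a;
}
```

After initialization there is no syntax that names the reference itself, only the object it refers to; that is the transparency the hat design will later lean on.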



C++.NET Design Challenge


For every aspect of the C++ extensions to support .NET, the question always reduces
to “How do we integrate this (or that) aspect of the Common Language Runtime (CLR)
into C++ so that it (a) feels natural to the C++ programmer, and (b) is easy to use
in its own right under .NET?” I like to call this the Janus face dilemma. (Janus is
a two-faced Roman deity, the one face turned towards what has just been, the other
towards what is to be.)



Reader Language Design Challenge


To give you a flavor of the process, here is the challenge: How should we declare
and use a .NET reference type? It differs significantly from the C++ Object Model:
a different memory model (garbage collected), different copy semantics (shallow copy),
and a different inheritance model (monolithic, rooted to Object, supporting single inheritance
only, with additional support for Interfaces).


In the traditional cliff-hanger, whiz-bang, multi-part installment style, I leave it to you,
until Part 2 [after the Turkey roosts], to think about, ok, just how we will integrate
support for the .NET reference type within ISO C++. [Hint: Version 1 chose to represent
it as a pointer. The general consensus is that it doesn’t offer a first-class
programming experience.]



Disclaimer: This posting is provided “AS IS” with no warranties, and
confers no rights.

Comments (11)

  1. Gherkin von Manon says:

    Please sir, get to the point! WHY is the ^ used rather than *?

  2. Tim Sweeney says:

    You poor, confused soul.

  3. Anon coward says:

    Why can’t you use existing C++ syntax, just like std::auto_ptr does? auto_ptr also has non-conventional C++ syntax. Yes, auto_ptr’s design was controversial, but I think the result works
    fairly well and it has spawned other kinds of references, like shared_ptr in Boost.

    Your example would look like:

    NETRef<Object> obj;

    I hope that renders OK; you can work it out for yourself otherwise. Obviously you need some magic when implementing NETRef, but you’re writing the compiler, right?

  4. AlisdairM says:

    C++ does very little magic in its libraries. There seems to be a general principle that if you
    can do it in a library, supply a library rather than extend the language.
    OTOH, if you cannot supply a feature through a library, add the minimum to the language
    to enable that library to be written. Then add the library.
    Rather than move the magic into a hidden library implementation, the hat syntax appears
    to be the minimum change necessary. .NET references are too similar/different to native
    pointers and references to overload the same syntax. I gather this was learned the hard
    way with managed C++ (although I haven’t dabbled myself yet)

  5. Frans Bouma says:

    "Probably the most conspicuous and eyebrow-lifting change between the original and revised design of the dynamic programming support within C++ is the change in the declaration of a .NET reference type: "

    I thought that the most conspicuous and eyebrow-lifting change was the absence of multiple inheritance and decent templates.

  6. Zorba The Geek says:

    "how do we get our Dorothy to become a successful resident of Oz"

    Well, _why_ do we need Dorothy in Oz?

    Should she not stay in Kansas, and leave Oz to C# ?

  7. midi says:

    Please sir, get to the point! WHY is the ^ used rather than *?

  8. 七里香 says:

    How to Pointers conform to the C++ Object Model. In ?