Why “gcnew” instead of just plain “new”?

Edward Diener asked:

  • I don’t see why you don’t use just ‘new’ rather than ‘gcnew’? The type of T will
    determine whether the object is allocated in GC collected memory or the C++ heap.

That’s one way to design it. Interestingly, the existing “Managed C++” does use just
plain old new for everything, with pretty much the semantics described
above: In MC++, the expression new T allocates a T object
on the CLI gc heap if T is a CLI type, and on the native
heap if T is a native type.

Briefly, this model was limiting, and it was confusing to users. Just to give a taste
of what happens when you go down this path, the next question you hit is “what is a T*?”
In particular, consider:

T* t = new T;

Is t a pointer to an object in gc’d memory, which means that the
object can move? Or is it a pointer to an object in native memory that doesn’t
move? “That’s easy!” one might say. “We could deduce that t points
into gc’d memory if and only if T is a CLI type.” In the above code
statement, that’s true, and Managed C++ used “defaulting rules” to make it mean exactly
that. In particular, in MC++ the T* above means the same as the following
longer version if T is a CLI type:

T __gc * t = __gc new T;

But it turns out that when going along this path you do need that __gc pointer
decoration (or its moral equivalent) sometimes, even if much of the time you can make
it optional by defaulting it based on the pointed-at type. In particular, you need
it for combinations of pointers, including the simplest case:

int** ppi;  // what is this?

Consider: Where is the int* that the int** is pointing
to? The answer cannot be deduced from the type, because an int* can
exist on the gc’d heap or on the native heap. Therefore both int* __gc * (a
pointer to an int* on the gc heap) and int** (a
pointer to an int* on the native heap) are valid pointer types, and therefore
you need to be able to distinguish between them.

In practice, this has been a great source of user confusion because programmers
are never really sure where to put the __gc. So what we have observed
many users do, time and again, is simply add __gc’s until the code
compiles, whether the __gc is needed (or correct) or not.

So you can use defaulting rules to hide the __gc some of the time,
but not all of the time. And the “some of the time” itself is at a cost, namely that
it arbitrarily restricts native types from ever being allocated on the gc heap (i.e.,
from being garbage collected), and restricts CLI types from ever being allocated on
the native heap.

He continued:

  • The only reason for disambiguating them is if, in the future, CLI will allow allocating
    a GC object on the C++ heap and/or allocating a non-GC object in the GC collected
    memory area. But it baffles me why one would ever change the CLI bindings to do that.

Bingo. In addition to the clarity and usability issue above, you hit on a primary
reason for not just using unadorned new: It conflates two ideas that
ought to be independent, namely the idea of what kind of type T is
and the idea of where T objects are allocated. In particular, in the
future we will definitely allow allocating an object of any type on the gc heap
or on the native heap.
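To make the "kind of type" versus "place of allocation" split concrete, here is a minimal sketch in standard C++ rather than C++/CLI. It is only an analogy: the hypothetical Arena below is a fixed buffer, not a garbage collector, and Widget, Arena, and demo_allocation_sites are names invented for illustration. The point is that the new-expression, not the type, picks where the object lives.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical type used only for illustration.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
};

// A tiny fixed-size buffer standing in for "some other heap".
// It is NOT a garbage collector; it only shows that the allocation
// site is selected by the new-expression, not by the type.
struct Arena {
    alignas(std::max_align_t) unsigned char storage[256];
    std::size_t used = 0;

    void* allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used + align - 1) / align * align;  // round up
        if (start + size > sizeof(storage)) throw std::bad_alloc();
        used = start + size;
        return storage + start;
    }

    // True if p points into this arena's buffer.
    bool owns(const void* p) const {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        std::less<const unsigned char*> lt;  // total order over pointers
        return !lt(b, storage) && lt(b, storage + sizeof(storage));
    }
};

// Placement form: "new (arena) T(...)" allocates from the arena.
void* operator new(std::size_t size, Arena& arena) {
    return arena.allocate(size, alignof(std::max_align_t));
}
// Matching placement delete, called only if a constructor throws.
void operator delete(void*, Arena&) noexcept {}

// Usage: the same Widget type ends up in two different places.
inline void demo_allocation_sites() {
    Arena arena;
    Widget* in_arena = new (arena) Widget(2);  // same type, arena storage
    Widget* on_heap = new Widget(1);           // same type, free store
    assert(arena.owns(in_arena));
    assert(!arena.owns(on_heap));
    in_arena->~Widget();  // arena memory is reclaimed with the arena itself
    delete on_heap;
}
```

In the same spirit, gcnew and new name the allocation site explicitly, so the choice does not have to be inferred from what kind of type T is.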

Why do we feel the need to support allocating an object of native type on the GC heap?
Because customers ask for it. “You’ve got a great garbage collector in there,” they
say, “so why can’t I use it to garbage-collect the objects I already have today?”
That’s a reasonable question.

Why do we feel the need to support allocating an object of CLI type on the native
heap? Because there are lots of native templated libraries out there today, many of
which internally allocate objects on the native heap (because, after all,
they know nothing about the GC heap). We ought to be able to leverage all those
existing libraries and use them unmodified also with the CLI types. True, for
some such libraries it may be enough simply to give a different meaning to new depending
on the type of T as originally proposed above. But note that
such libraries might not only allocate objects using new, but may
rely on the resulting pointers to support things like pointer arithmetic that MC++’s __gc*’s
and C++/CLI’s ^’s cannot support, and so they would still be broken
for CLI types.
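The pointer-arithmetic problem can be sketched in standard C++. This is an illustration under stated assumptions, not C++/CLI: Handle is a hypothetical stand-in for an opaque handle that can be dereferenced but has no arithmetic, and sum_first_n plays the role of existing native library code. The trait supports_arithmetic is also invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an opaque handle: dereferenceable,
// but deliberately without any arithmetic operators.
template <typename T>
class Handle {
public:
    explicit Handle(T* p) : p_(p) {}
    T& operator*() const { return *p_; }
private:
    T* p_;
};

// "Library" code written with native pointers in mind:
// it walks a buffer using first + i.
template <typename Ptr>
long sum_first_n(Ptr first, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += *(first + i);  // requires pointer arithmetic
    return total;
}

// Detects whether "p + 1" is well-formed for a pointer-like type P.
template <typename P, typename = void>
struct supports_arithmetic : std::false_type {};

template <typename P>
struct supports_arithmetic<
    P, decltype(void(std::declval<P>() + std::size_t(1)))>
    : std::true_type {};

static_assert(supports_arithmetic<int*>::value,
              "native pointers support p + 1");
static_assert(!supports_arithmetic<Handle<int>>::value,
              "the handle stand-in does not support p + 1");

// Usage: works for a native pointer into an array.
inline void demo_sum() {
    int data[] = {1, 2, 3};
    assert(sum_first_n(data, 3) == 6);
    // sum_first_n(Handle<int>(data), 3) would fail to compile,
    // because Handle<int> has no operator+.
}
```

Changing what new means for a CLI type would not rescue a library like this, because the breakage comes from how the returned pointer is used, not from how the object was allocated.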

Of the two, I see gc’ing native objects as the more compelling and mainstream use
