Why R^ instead of cli::handle?


Nicola Musatti asked the following excellent question:

  • The hat symbol and gcnew could be replaced with a template like syntax, e.g.

    cli::handle<R> r = cli::gcnew<R>();

I agree that those are alternatives. Everyone, including me, pushes hard for a
library-only (or at least library-like) solution when they first start out on this
problem. I think an argument can be made for it, and at one time I made it too.

To me, the killer argument in favor of a new declarator with usage R^ instead
of a library-like cli::handle<R> is its pervasiveness: It will
be by far the most widely used part of all these extensions, as it’s the common use
case the vast majority of the time for CLI types (as objects, as parameters, etc.).
This extremely wide use amplifies two particular negative consequences we’d like to
avoid: First, the long spelling (here “handle”) could in practice effectively become
a reserved word just because people are liable to widely apply using to
avoid being forced to write the qualification every time (this is worse if the name
chosen is a common name likely to be used for other identifiers or even macros, and
“handle” is a very common name). Second, and worse, the long spelling would also make
the language several times more verbose in a very common case than even the Managed
Extensions syntax was, and that in turn was already verbose compared to other CLI
languages.
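
To make the first consequence concrete, here is a minimal sketch in plain ISO C++. Everything in it is an assumption for illustration: the namespace cli_sketch, its handle and gcnew templates, and the legacy typedef are hypothetical names, not a real CLI library, and std::shared_ptr only imitates "something that keeps the object alive" (a real handle refers to a garbage-collected object that can move).

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the library-like spelling; NOT a real CLI API.
namespace cli_sketch {
    template<typename T>
    class handle {
    public:
        explicit handle(std::shared_ptr<T> p) : p_(std::move(p)) {}
        T* operator->() const { return p_.get(); }
    private:
        std::shared_ptr<T> p_;  // imitation only; not garbage collection
    };

    template<typename T, typename... Args>
    handle<T> gcnew(Args&&... args) {
        return handle<T>(std::make_shared<T>(std::forward<Args>(args)...));
    }
}

// Illustrative legacy declaration: "handle" is a name that existing code
// is likely to use already.
typedef void* handle;

struct R {
    int value() const { return 42; }
};

int use_qualified() {
    // Fully qualified spelling: compiles, but every declaration pays for it.
    cli_sketch::handle<R> r = cli_sketch::gcnew<R>();
    return r->value();
}

// Adding `using namespace cli_sketch;` at this scope would allow the shorter
// `handle<R> r = gcnew<R>();`, but an unqualified `handle` would then name
// two different things (the typedef and the template), which a conforming
// compiler is expected to reject as ambiguous.
```

In this sketch the qualified form is the only one that coexists safely with the legacy name, which is exactly the spelling people will want to avoid writing everywhere.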

Compare five alternatives side by side:

  cli::handle<R> r = cli::gcnew<R>(); // 1: above suggestion

  handle<R> r = gcnew<R>(); // 2: ditto, with “using”s

  R __gc* r = new R; // 3: original MC++ syntax

  R^ r = gcnew R; // 4: C++/CLI syntax

  R r = new R(); // 5: C#/Java syntax

I think you could make a case for any one of these, depending on your tradeoffs. But
I think a tradeoff that favors usability will favor the last few options.

There are also other issues where having ^ and % declarators/operators
that roughly correspond to * and & enables a
more elegant type calculus. I (or someone on the team) will have to write those up
someday, but consider at some future time when we have full mixed types too: When
we can have a type that inherits from both native and CLR base classes/interfaces,
we will want to be able to pass a pointer to such an object to existing ISO C++ APIs
that take a Base1* and a handle to the same object to existing CLI
APIs that take a Base2^. Both will be common operations and therefore
both should be distinctly expressible with a terse syntax:

  class NativeBase { };

  // a mixed type
  ref class R
    : public NativeBase
    , public System::Windows::Forms::Form
  { };

  void NativeFunc( NativeBase* );
  void CLIFunc( Object^ );

  R r;              // object on the stack
  NativeFunc( &r ); // “give me a *” is spelled “&” as usual
  CLIFunc( %r );    // “give me a ^” is spelled “%”

In this way, % is to ^ pretty much just as & is
to *. If R^ were instead spelled using a template-like
syntax, what would be the corresponding code to get at it?

Finally, consider the agnostic template case:

  template<typename T>
  void f( T t ) {
    SomeBase* b = &t;      // I have to have a way of saying “I want a *” without knowing the type of T
    SomeInterface^ i = %t; // I have to have a way of saying “I want a ^” without knowing the type of T
  }

I’ll write more about the full pointer system in the future. For other design considerations
about handles, I’ll point again to Brandon’s Behind the Design: Handles blog entry,
and to my own entry from earlier this week on why pointers aren’t enough by themselves.


Comments (4)

  1. Edward Diener says:

    I believe it is a mistake to invent new syntax of ‘^’ for pointer to gc type and ‘%’ for reference to gc type. Why not just use the normal C++ ‘*’ and ‘&’ respectively ? The compiler will know from the type itself whether this is a pointer/reference in GC memory or the C++ heap. I see no point in complicating the syntax unless there is a good technical reason, such as not being able to disambiguate between a pointer/reference to GC memory and a pointer/reference to non-GC memory based on the type.

    Saying that using ‘*’ pointer syntax is breaking the C++ standard because what a C++ pointer points to can not be moved around in memory is not a good reason, since CLI is not an attempt to create a C++ standard language. I would much rather the syntax remained as consistent with C++ as possible and the places where CLI diverged from C++ occur because GC types are different than C++ types in that they are automatically garbage collected, be the primary focus of a programmer’s understanding. Now C++ programmers must use a new syntax, ‘^’ and ‘%’, where they are already used to pointer and reference syntax. This only creates more headaches for C++ programmers using CLI.

  2. Herb Sutter says:

    >>I see no point in complicating the syntax unless there is a good technical reason, such as not being able to disambiguate between a pointer/reference to GC memory and a pointer/reference to non-GC memory based on the type.<<

    Yes, you got it — that’s one of the core reasons. The links above also describe additional ones.