Having the Right Type Vocabulary to Discuss Boxed Value Types

Comments from Stan Lippman's Blog: Jon Flanders
re: Value Type Redux
I think it is always important to point out, when having the “value types are boxed whenever they are treated as object” discussion, that if the value type overrides ToString() (or the method in question), boxing does not need to occur.

In the original Managed Extensions for C++ there is no support for implicit boxing. In the case of the ToString() example, this means that invoking a virtual function through an object of a value type differs depending on whether the function is merely inherited or is overridden by the value type. In the former case, the user must explicitly box the object, or else the invocation is flagged as an error at compile time. So, in the Managed Extensions for C++, Jon’s point is moot because the distinction is built into the language.

Within the original Managed Extensions for C++, this had the pedagogical effect of teaching the programmer the underlying complexity of the unified type system, and of providing a lexical incentive to introduce an overriding instance of the virtual function. Most users of a value type, however, have no authoring ability with regard to the type definition, and so found the lexical incentive a disincentive for using the language. In the revised language, currently under ECMA standardization as C++/CLI, implicit boxing is supported, and the general user is left blissfully unaware of the potential overhead of the call.

Which is why it is important in C# and VB.NET to call ToString() explicitly on value types: most of them *do* override ToString(), and calling it explicitly avoids the need to box. The compilers (C#/VB.NET) add the box instruction if you just write Console.WriteLine(v); whereas Console.WriteLine(v.ToString()); ends up as just a virtual method call. The box instruction is added in the first form even when the value type overrides ToString().

C# and VB.NET have, by choice, no type vocabulary for speaking about boxed value types on the managed heap. Console.WriteLine(), in this case, is just a special case of a larger issue -- the initialization or assignment of an Object^ with a value type. The unified type system requires that the value type be boxed in order to transform it into the handle/object pair that underlies the representation of a reference type.

While it is correct to state that invoking ToString() on those value types that have an overriding definition avoids boxing, that is a very special case, one for which a string representation makes sense. Were we using a Hashtable to count word occurrences, the invocation of ToString() would not be appropriate, and the user would have to live with the multiple boxings associated with reading and writing the boxed value types. However, the user might well never be aware of what is going on.
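To make that cost concrete, here is a minimal sketch in standard C++ (not C++/CLI) that models the word-count case. Everything in it is an assumption made for illustration: BoxedInt stands in for a boxed Int32 on the managed heap, box() and unbox() stand in for the corresponding runtime operations, and WordTable stands in for a Hashtable holding Object values. The model only counts box operations; it makes no claim about how the runtime actually allocates.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a boxed Int32: once created, its value never changes.
struct BoxedInt {
    const int value;
};

using ObjectRef = std::shared_ptr<BoxedInt>;                   // models an Object reference
using WordTable = std::unordered_map<std::string, ObjectRef>;  // models a Hashtable

// Model of "box": copy the value into a fresh heap object and count the operation.
ObjectRef box(int v, int& boxCount) {
    ++boxCount;
    return std::make_shared<BoxedInt>(BoxedInt{v});
}

// Model of "unbox": copy the value back out of the heap object.
int unbox(const ObjectRef& ref) {
    assert(ref != nullptr);
    return ref->value;
}

// Counts word occurrences and returns how many box operations were needed.
int countWordsReboxing(const std::vector<std::string>& words, WordTable& table) {
    int boxCount = 0;
    for (const std::string& w : words) {
        auto it = table.find(w);
        if (it == table.end()) {
            table.emplace(w, box(1, boxCount));                 // first sighting: box 1
        } else {
            // Read (unbox), add one, write back (box a brand-new object).
            it->second = box(unbox(it->second) + 1, boxCount);
        }
    }
    return boxCount;
}
```

Given the words a, b, a, a, this model performs four box operations, one per word read, even though only two distinct words end up in the table.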

In the Managed Extensions for C++, the type vocabulary for speaking of a boxed value type is __box V*. This has been simplified in C++/CLI to V^ [this is discussed in more detail in an earlier blog entry]. This permits a direct handle on the representation in the managed heap, and does not require multiple boxing operations back and forth when we repeatedly read and write a boxed value.
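The difference a handle makes can be sketched in standard C++ as well. This is again a model and not C++/CLI code: Boxed is a hypothetical stand-in for the boxed value on the managed heap, and Handle (a std::shared_ptr) plays the role of V^. The point of the sketch is only that, once we hold a handle to the box, an update writes through it instead of creating a new box.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the boxed value that a V^ handle refers to.
struct Boxed {
    int value;
};

using Handle = std::shared_ptr<Boxed>;                          // plays the role of V^
using HandleTable = std::unordered_map<std::string, Handle>;

// Counts word occurrences and returns how many boxes had to be created.
int countWordsInPlace(const std::vector<std::string>& words, HandleTable& table) {
    int boxCount = 0;
    for (const std::string& w : words) {
        Handle& h = table[w];           // null handle the first time a word is seen
        if (!h) {
            h = std::make_shared<Boxed>(Boxed{0});   // box once per distinct word
            ++boxCount;
        }
        assert(h != nullptr);
        ++h->value;                     // update through the handle, no new box
    }
    return boxCount;
}
```

For the words a, b, a, a, this version creates two boxes, one per distinct word, and every later update reuses the existing one.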