Virtual Function Behavior Between ISO C++ and C++/CLI Revisited

I seem to have explained the "virtual function invocation within a constructor" issue rather badly, for which I apologize, if one is to judge by the following question posted as a follow-up to my blog entry:

Sender: Indranil Banejree

=====================================

re: Making a Virtual Table Context-Sensitive

Do you mean that that virtual calls are not context sensitive at all with Managed C++? How about C++/CLI?

If not, the feature will be sorely missed. One of my favourite GoF design patterns, Template Method directly makes use of this feature. Where a non virtual base class method calls a bunch of virtual methods.

I've written plenty of native C++ and Java that works like this. I'd hate for this to break in .NET. I'll check my colleagues copy of C# Design Patterns to see how Template Method is handled there.

Conveying something never seems to be as clear-cut as I delude myself into believing. Before I attempt to answer Indranil’s question, let me provide some context.

An object model has two faces: one that is presented to the user of a programming language, and one that implements that model on the target platform. In the original implementation of C++ by Stroustrup, the target platform was the C language, which of course provides no direct support for (a) type encapsulation [a struct does not maintain scope or permission sections], (b) interface specification [C is a data abstraction language in which function and state are separate], or (c) inheritance. The following simple class hierarchy, which does not support polymorphism [it has been variously called implementation or value inheritance],

    class Point2d {
    public:
        Point2d( float x = 0.f, float y = 0.f );

        float x() const { return m_x; }
        void x( float new_x ) { m_x = new_x; }
        // …

    private:
        float m_x;
        float m_y;
    };

    class Point3d : public Point2d {
    public:
        Point3d( float x = 0.f, float y = 0.f, float z = 0.f );
        // …

    private:
        float m_z;
    };

has no direct mapping onto the C language, and could be translated in any number of ways. For example, one could choose a very simple object model in which each member of the class is assigned a slot. For Point2d, slot 0 is assigned to the constructor, slot 1 to the read function of the x coordinate member, slot 2 to the write function of the x coordinate member, and so on, with slot n-2 assigned to a static member such as ms_cnt [not shown in the declaration above], slot n-1 to the state member m_x, and slot n to the state member m_y. Internally, the slots representing methods would hold the addresses of those methods, while the slots representing state members could hold the actual values. This would be relatively simple to implement and would be most appropriate as a proof of concept rather than as a production model.
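To make the slot model concrete, here is a minimal sketch of how it might look if written by hand. Everything in it is an assumption made for illustration: the names SlotPoint2d, Slot, slot_read_x, slot_write_x, and make_slot_point2d are hypothetical, the slot numbering is arbitrary, and the static member is omitted. It is not how any real compiler represents Point2d.

```cpp
#include <cassert>

struct SlotPoint2d;  // forward declaration for the function-pointer types

typedef float (*ReadFn)(const SlotPoint2d&);
typedef void (*WriteFn)(SlotPoint2d&, float);

// One slot holds either a function address or a state value.
union Slot {
    ReadFn read;
    WriteFn write;
    float value;
};

// Assumed slot numbering, for illustration only.
enum { kReadX, kWriteX, kStateX, kStateY, kSlotCount };

struct SlotPoint2d {
    Slot slots[kSlotCount];
};

float slot_read_x(const SlotPoint2d& self) { return self.slots[kStateX].value; }
void slot_write_x(SlotPoint2d& self, float new_x) { self.slots[kStateX].value = new_x; }

// Plays the role of the constructor: fills every slot of a new object.
SlotPoint2d make_slot_point2d(float x = 0.f, float y = 0.f) {
    SlotPoint2d p;
    p.slots[kReadX].read = slot_read_x;
    p.slots[kWriteX].write = slot_write_x;
    p.slots[kStateX].value = x;
    p.slots[kStateY].value = y;
    return p;
}

// Every access, whether to a method or to state, goes through a slot.
void demo_slot_model() {
    SlotPoint2d p = make_slot_point2d(1.f, 2.f);
    assert(p.slots[kReadX].read(p) == 1.f);
    p.slots[kWriteX].write(p, 5.f);
    assert(p.slots[kReadX].read(p) == 5.f);
    assert(p.slots[kStateY].value == 2.f);
}
```

Note that every object carries its own copy of the method addresses, which is one reason such a model suits a proof of concept better than production use.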

An alternative model might keep the state members within the class object, but factor out the methods into a method table, maintaining a slot within the object that addresses that table. One could imagine variations on that, such as adding a member table as well. With each additional level of indirection, one gains further flexibility in terms of either substituting or extending the method or state table of the class.
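A sketch of this second model, again under assumptions of my own choosing: the names TablePoint2d, Point2dMethods, kDefaultMethods, and kClampedMethods are hypothetical, and the clamping writer exists only to show what substituting a table buys you.

```cpp
#include <cassert>

struct TablePoint2d;

// The methods live in a table shared by all objects of the class.
struct Point2dMethods {
    float (*get_x)(const TablePoint2d&);
    void (*set_x)(TablePoint2d&, float);
};

struct TablePoint2d {
    const Point2dMethods* methods;  // the single slot addressing the table
    float m_x;                      // state stays inside the object
    float m_y;
};

float table_get_x(const TablePoint2d& self) { return self.m_x; }
void table_set_x(TablePoint2d& self, float new_x) { self.m_x = new_x; }
const Point2dMethods kDefaultMethods = { table_get_x, table_set_x };

// A substituted table: same reader, but a writer that clamps negatives to 0.
void clamped_set_x(TablePoint2d& self, float new_x) {
    self.m_x = new_x < 0.f ? 0.f : new_x;
}
const Point2dMethods kClampedMethods = { table_get_x, clamped_set_x };

void demo_method_table() {
    TablePoint2d p = { &kDefaultMethods, 1.f, 2.f };
    p.methods->set_x(p, -3.f);
    assert(p.methods->get_x(p) == -3.f);

    // Swapping the table changes behavior without touching the object's state layout.
    p.methods = &kClampedMethods;
    p.methods->set_x(p, -3.f);
    assert(p.methods->get_x(p) == 0.f);
}
```

The price of the flexibility is one extra pointer per object and one extra indirection per call.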

The actual C++ object model chose to maintain the space and time efficiency of the C target platform: the implicit this pointer and internal name mangling distinguish a class method from an independent function; otherwise, there is no binding between a class object and the methods of that class. The non-static state members are stored by value within each class object, and so on.
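As a rough picture of that translation, the sketch below writes out by hand what the non-virtual part of Point2d amounts to: a struct holding only state, plus ordinary functions that take the object's address as an explicit first argument. The names Point2dData, Point2d_ctor, Point2d_x_get, and Point2d_x_set are hypothetical stand-ins; a real translator generates its own mangled names, and the parameter is called self here only because this is a keyword.

```cpp
#include <cassert>

// State only: no method addresses and no table pointer are stored in the object.
struct Point2dData {
    float m_x;
    float m_y;
};

// Stand-in for the constructor Point2d( float x, float y ).
void Point2d_ctor(Point2dData* self, float x, float y) {
    self->m_x = x;
    self->m_y = y;
}

// Stand-in for float Point2d::x() const.
float Point2d_x_get(const Point2dData* self) { return self->m_x; }

// Stand-in for void Point2d::x( float new_x ).
void Point2d_x_set(Point2dData* self, float new_x) { self->m_x = new_x; }

void demo_c_style_translation() {
    Point2dData pt;
    Point2d_ctor(&pt, 3.f, 4.f);          // Point2d pt( 3.f, 4.f );
    assert(Point2d_x_get(&pt) == 3.f);    // pt.x()
    Point2d_x_set(&pt, 6.f);              // pt.x( 6.f );
    assert(Point2d_x_get(&pt) == 6.f);
}
```

The call sites bind directly to a function at compile time, so there is no per-object or per-call overhead relative to hand-written C.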

In certain cases, such as pointers to class members or the virtual mechanisms of inheritance or runtime method invocation, there is no one-to-one mapping with the C target constructs, and so full-blown auxiliary abstractions are necessary, and these bring with them a space and time cost that hopefully is offset by the additional functionality. [The ideal in C++ has been that a programmer should not pay the cost of a facility unless the facility is used.]

This is a distinction worth emphasizing because too often people confuse the CLR platform with a constraint on the object model possible for a .NET language, and this is not true, or at least not true in general. For example, the CLR does not support value inheritance, and so internally a .NET language that chose to support value inheritance would have to translate that into a form that could be represented within the CLR. The same is true for multiple inheritance. It is not true to say that a .NET language cannot support multiple inheritance; only that the support of multiple inheritance requires that the programmer face of the object model be translated into a form able to be represented by the underlying CLR platform. Eiffel.NET, for example, has done just that.

On the other hand, there are some hard constraints that a language cannot reasonably get around. One such constraint is the different resolution algorithm for a virtual function invoked within a constructor or destructor. [One could imagine synthesizing a number of special sub-object constructors for a class to simulate the ISO C++ behavior, but this would put it at considerable semantic odds with the rest of the .NET object behavior and could result in serious runtime faults.] In these cases, the C++ programmer has to concede that she is working in a different semantic model that requires learning new habits. C++ did that to the C programmer with the elimination of tentative global definitions.

The difference in resolution algorithm between the ISO C++ and CLR object models is uncompromising, and no amelioration was provided within either the original or revised C++/CLI language design. The two-part blog first described what the difference is, illustrating it with a simple example, and then peeked under the covers to show the work necessary in ISO C++ to provide its resolution semantics. This is a singular case because it has to do with object identity; that is, when is an object of a class an actual instance of that type? In the ISO C++ Object Model, the object is not an actual instance of that type until the explicit code of its constructor begins to execute.
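The ISO C++ half of that difference can be reproduced in a few lines. In the sketch below [Base, Derived, and name() are hypothetical names chosen for illustration], the Base constructor makes a virtual call while a Derived object is being built. Under ISO C++ rules the call resolves to Base::name, because the object is not yet a Derived. Under the CLR rules, as I understand them, the equivalent call in a reference class would reach the Derived override instead, before the body of the Derived constructor has run.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    // Virtual call made while only the Base part of the object exists.
    Base() { m_seen = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }

    // Reports which override the constructor's call resolved to.
    const std::string& seen_in_constructor() const { return m_seen; }

private:
    std::string m_seen;
};

class Derived : public Base {
public:
    std::string name() const override { return "Derived"; }
};

void demo_constructor_dispatch() {
    Derived d;
    // Inside Base::Base(), ISO C++ resolved name() to Base::name.
    assert(d.seen_in_constructor() == "Base");

    // Once construction is complete, dispatch is on the actual type as usual.
    const Base& b = d;
    assert(b.name() == "Derived");
}
```

Some compilers warn about the virtual call in the constructor; the behavior itself is well defined in ISO C++.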

Apart from that, the virtual mechanism works the same. The order of constructors [and destructors] is the same. The run-time resolution of a method based on the actual type of the object referred to at each call point remains the same. The Template Method of the Gang of Four [GOF] Patterns book still works the same. While there be dragons, this is not one of their places of inhabitation.
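For completeness, here is a minimal sketch of the Template Method pattern that Indranil asked about, in ISO C++. The class names Report and SalesReport and their member functions are hypothetical. The point is only that render() is a non-virtual base class method invoked on a fully constructed object, so the virtual calls it makes resolve on the actual type of the object; nothing about that changes between the two object models.

```cpp
#include <cassert>
#include <string>

class Report {
public:
    virtual ~Report() {}

    // The template method: non-virtual, fixes the order of the steps.
    std::string render() const { return header() + body() + footer(); }

private:
    // The steps: virtual, so derived classes can supply or replace them.
    virtual std::string header() const { return "[begin]"; }
    virtual std::string body() const = 0;
    virtual std::string footer() const { return "[end]"; }
};

class SalesReport : public Report {
private:
    std::string body() const override { return "sales"; }
};

void demo_template_method() {
    SalesReport r;
    const Report& base = r;
    // Called through the base class, the steps still resolve to SalesReport.
    assert(base.render() == "[begin]sales[end]");
}
```

The only case that differs is a virtual call made from inside a constructor or destructor, which Template Method does not require.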