Factor, Don't Complicate

06/04/2004

A reader asks the following question:

Sender: bv

re: A Question about Copy Constructors in C++/CLI

Ok, i have a question.

What happens if: suppose you want to make a shallow copy. Ex:

ClassX obj(somePtr);//internally obj.m_ptr = SomePtr;

Now

ClassX obj2(obj);

What problems can occur in such cases?

because now neither obj nor obj2 owns the member ptr. so none should destroy them.

And i want that, only obj should be able to modify the ptr and not obj2. How will i do it?

The purpose of a copy constructor (and copy assignment operator) is to allow users to override the default behavior. In the C language, the default behavior when copying one aggregate object to another is bitwise copy. In the original C++ implementation, this was also the behavior when copying one class object to another. However, the greater functionality of a class over that of a C struct required changing the default behavior to memberwise copy, that is, to a copy that recognizes the integrity of member and base class sub-objects.
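To make the memberwise point concrete, here is a minimal sketch. The type Label and the function name are hypothetical, invented for illustration. The compiler-generated copy invokes the copy constructor of the std::string member, so each copy ends up with its own string rather than a bit-for-bit image of the original's internals.

```cpp
#include <cassert>
#include <string>

// Hypothetical type for illustration; not from the original post.
struct Label {
    std::string text;   // class-type member with its own copy constructor
    int         id;     // primitive member, copied by value
};

// Returns true if the default (memberwise) copy gives the copy its own string.
bool memberwise_copy_is_independent() {
    Label a{"first", 1};
    Label b(a);            // compiler-generated copy: text is copied by std::string's copy constructor
    b.text = "second";     // changes b's string only
    assert(a.id == b.id);  // the primitive member was copied by value
    return a.text == "first" && b.text == "second";
}
```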

The complexity within C++ of copying one class object to another falls into two general categories, or at least two that I wish to address:

When we use primitive members, such as pointers, whose default copy is shallow and which fall outside the constructor-as-resource-acquisition pattern. In effect, we have to provide our own deep-copy semantics.

When we decide to implement complex behavioral patterns such as copy on write, reference counting, and all sorts of neat abstract relationships that fall outside the default copy behavior.
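The first category can be sketched as follows. This is only an illustration under stated assumptions: the class Owner and its members are hypothetical, and it owns a single heap-allocated int. Because the default copy would duplicate only the pointer, the class supplies its own copy constructor and copy assignment operator so that each object gets its own pointee.

```cpp
#include <cassert>

// Hypothetical class for illustration: owns one heap-allocated int.
class Owner {
    int* m_ptr;
public:
    explicit Owner(int value) : m_ptr(new int(value)) {}

    // Deep copy: allocate a new int holding the same value.
    Owner(const Owner& rhs) : m_ptr(new int(*rhs.m_ptr)) {}

    // Deep assignment: copy the value, never the pointer.
    Owner& operator=(const Owner& rhs) {
        if (this != &rhs) *m_ptr = *rhs.m_ptr;
        return *this;
    }

    ~Owner() { delete m_ptr; }   // safe: no other Owner shares m_ptr

    int  get() const { return *m_ptr; }
    void set(int v)  { *m_ptr = v; }
};

// Returns true if writing through the copy leaves the original untouched.
bool deep_copy_is_independent() {
    Owner a(1);
    Owner b(a);   // invokes the user-defined copy constructor
    b.set(2);
    assert(a.get() != b.get());
    return a.get() == 1 && b.get() == 2;
}
```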

The default copy behavior of CLI reference types is shallow copy. That is, a reference type is a pair consisting of a named tracking handle and an object allocated on the CLI heap. Copying one tracking handle to another results in both handles addressing the same heap object. In a garbage-collected environment, the too-early destruction problem of ISO-C++ goes away.

So, with that background, let’s go to the reader’s question.

ClassX obj(somePtr); //internally obj.m_ptr = SomePtr;

ClassX obj2(obj);

What problems can occur in such cases?

Well, the first thing is, I will presume that this is an ISO-C++ class, not a C++/CLI class, given the emphasis on pointers. So we are talking about a classic solution, that is, one with memberwise default behavior. The minimal class we can extrapolate from this small code sample is:

class X
{
    T * m_ptr;

public:
    // ClassX obj(somePtr);
    X( T * somePtr ) : m_ptr( somePtr ){}
};

This is all that is required to support the examples provided by the writer. An initialization of one X object with another, such as

X x1( myT );

X x2( x1 );

by default results in the bitwise copy of x1 into x2, without the explicit invocation of a synthesized copy constructor, leaving x2.m_ptr holding the same address as x1.m_ptr.
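A small sketch makes the sharing visible. It assumes T is int, since the post leaves T unspecified, and adds a hypothetical get() accessor purely so the pointer can be inspected. After the copy, both objects hold the same address, and a write through one is seen through the other.

```cpp
#include <cassert>

typedef int T;   // assumption: the post leaves T unspecified

class X {
    T* m_ptr;
public:
    X(T* somePtr) : m_ptr(somePtr) {}
    T* get() const { return m_ptr; }   // hypothetical accessor, added to observe the pointer
};

// Returns true if the default copy leaves both objects addressing one T.
bool copies_share_pointee() {
    T  value = 1;
    T* myT   = &value;
    X x1(myT);
    X x2(x1);            // default copy: only the pointer value is copied
    *x2.get() = 42;      // a write through x2 ...
    assert(x1.get() == x2.get());
    return *x1.get() == 42;   // ... is visible through x1
}
```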

The reader then asks,

What problems can occur in such cases?

because now neither obj nor obj2 owns the member ptr. so none should destroy them.

And i want that, only obj should be able to modify the ptr and not obj2. How will i do it?

Well, because both objects manipulate the same object through their pointer member, any write may come as a surprise; moreover, if they are on separate threads, there is a need for locking on write operations.

As the reader notes, it is not a good idea for either x1 or x2 to destroy the object addressed without somehow synchronizing with the other. Again, this is exactly the problem that garbage collection solves, removing the burden from the user.

To synchronize the destruction, one has to come up with some form of reference counting mechanism. The first discussion of that in C++ was in James Coplien’s Advanced C++ Programming Styles and Idioms, and Bjarne Stroustrup provides an example in The C++ Programming Language. I’m not going to go into the actual implementation.
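For a rough picture of the idea without a hand-written implementation, the sketch below leans on std::shared_ptr, a standard library facility from C++11 that postdates this post. It is not the implementation from either book. The class SharedX and its members are hypothetical. Each copy bumps a shared count, and the pointee is destroyed only when the last owner goes away.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class for illustration: copies share one int, and the
// std::shared_ptr member keeps the reference count for us.
class SharedX {
    std::shared_ptr<int> m_ptr;
public:
    explicit SharedX(int value) : m_ptr(std::make_shared<int>(value)) {}
    long owners() const { return m_ptr.use_count(); }   // current number of owners
    int  value()  const { return *m_ptr; }
};

// Returns true if the count rises on copy and falls when the copy is destroyed.
bool last_owner_destroys() {
    SharedX x1(7);
    long before = x1.owners();   // only x1 owns the int
    long during = 0;
    {
        SharedX x2(x1);          // default memberwise copy bumps the count
        during = x1.owners();
    }                            // x2 destroyed: count drops, the int survives
    assert(x1.value() == 7);     // x1 can still use the shared object
    return before == 1 && during == 2 && x1.owners() == 1;
}
```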

The second question is how one should restrain changes to the object that is pointed to by multiple instances of class X, such that there is one ‘master’ and many, well, readers. The best way to do this, in my opinion, is to factor the design into two classes, and to allow the readers read-only access rather than have them hold a pointer to the object. Otherwise, you have a rather clunky design in which the object has to ask: am I allowed to modify this guy I point to? Does the writer, or master, still exist? [It probably can’t answer that.] And there are no notification semantics that one could employ, and so on.

So, the answer to the second question is to come up with a design in which the characteristics of one writer and many readers are built into the types; otherwise, the class is non-intuitive and will likely confuse users and be a cause of error and frustration.
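One possible reading of that factoring is sketched below. All names here (Writer, Reader, set, value, reader) are hypothetical, and int stands in for the unspecified T. The reader type stores a pointer-to-const and exposes no mutating member, so the read-only rule is enforced by the type rather than checked at run time. The writer is made non-copyable so that there is exactly one master. The sketch does not address lifetime: a Reader cannot tell whether the object it observes still exists.

```cpp
#include <cassert>

// Hypothetical read-only view: no member function can modify the int.
class Reader {
    const int* m_ptr;
public:
    explicit Reader(const int* p) : m_ptr(p) {}
    int value() const { return *m_ptr; }
};

// Hypothetical single writer: the only type with a mutating interface.
class Writer {
    int* m_ptr;
public:
    explicit Writer(int* p) : m_ptr(p) {}
    Writer(const Writer&) = delete;              // exactly one master
    Writer& operator=(const Writer&) = delete;

    void   set(int v)      { *m_ptr = v; }
    Reader reader() const  { return Reader(m_ptr); }   // hand out read-only views
};

// Returns true if every reader sees the writer's update.
bool readers_observe_but_cannot_modify() {
    int data = 1;
    Writer w(&data);
    Reader r1 = w.reader();
    Reader r2(r1);          // readers may be copied freely
    w.set(5);               // only the writer can change the value
    assert(r1.value() == r2.value());
    return r1.value() == 5;
}
```

A call such as r1.set(5) does not compile, which is the point: the question "am I allowed to modify this?" is answered by the type.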
