Beware the C++ implicit conversion


Today's topic was inspired by a question from a customer:

I am working on a stack overflow bug. To reduce the size of the stack frame, I removed as many local variables as I could, but there's still a lot of stack space that I can't account for. What else lives on the stack aside from local variables, parameters, saved registers, and the return address?

Well, there's also structured exception handling information, but that's typically not too much and therefore wouldn't be the source of "a lot" of mysterious stack usage.

My guess is that the code is generating lots of large C++ temporaries. Consider the following program fragment:

#include <string.h> // memset

class BigBuffer
{
public:
 BigBuffer(int initialValue)
   { memset(buffer, initialValue, sizeof(buffer)); }
private:
 char buffer[65536];
};

extern void Foo(const BigBuffer& o);

void oops()
{
 Foo(3);
}

"How does this code even compile? The function Foo wants a BigBuffer, not an integer!" Yet compile it does.

That's because the compiler is using the BigBuffer constructor as a converter. In other words, the compiler inserted the following temporary variable:

void oops()
{
 BigBuffer temp(3);
 Foo(temp);
}

It did this because a constructor that can be called with exactly one argument serves two purposes: It can be used as a traditional constructor (as we saw with BigBuffer temp(3)) or it can be used to provide an implicit conversion from the argument type to the constructed type. In this case, the BigBuffer(int) constructor is being used as a conversion from int to BigBuffer.
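
The two roles can be checked at compile time. The following is a minimal sketch, not part of the original program: it repeats the BigBuffer class and uses the standard type traits std::is_constructible and std::is_convertible to ask the compiler which uses it will accept. It also records how large each temporary is.

```cpp
#include <cstring>
#include <type_traits>

class BigBuffer
{
public:
    BigBuffer(int initialValue)   // not explicit, so it doubles as a conversion
        { std::memset(buffer, initialValue, sizeof(buffer)); }
private:
    char buffer[65536];
};

// Each temporary needs this many bytes of stack while it is alive.
static_assert(sizeof(BigBuffer) == 65536, "one BigBuffer is 64KB");

// Role 1: ordinary construction, as in BigBuffer temp(3).
static_assert(std::is_constructible<BigBuffer, int>::value,
              "BigBuffer can be constructed from an int");

// Role 2: implicit conversion, which is what lets Foo(3) compile.
static_assert(std::is_convertible<int, BigBuffer>::value,
              "an int converts to BigBuffer without any cast");
```

If all three assertions hold, the file compiles silently.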

To prevent this from happening, use the explicit keyword:

class BigBuffer
{
public:
 explicit BigBuffer(int initialValue)
   { memset(buffer, initialValue, sizeof(buffer)); }
private:
 char buffer[65536];
};

With this change, the call to Foo(3) raises a compiler error:

sample.cpp: error C2664: 'Foo' : cannot convert parameter 1 from
     'int' to 'const BigBuffer &'
     Reason: cannot convert from 'int' to 'const BigBuffer'
     Constructor for class 'BigBuffer' is declared 'explicit'
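
A caller who really does want the temporary can still have it by asking for it by name. The sketch below assumes an empty stand-in body for Foo (the article only declares it extern) and a hypothetical caller named not_oops; the type-trait checks are an illustration of what the explicit keyword changed.

```cpp
#include <cstring>
#include <type_traits>

class BigBuffer
{
public:
    explicit BigBuffer(int initialValue)
        { std::memset(buffer, initialValue, sizeof(buffer)); }
private:
    char buffer[65536];
};

// Direct construction still works...
static_assert(std::is_constructible<BigBuffer, int>::value,
              "BigBuffer temp(3) is still valid");

// ...but the silent int-to-BigBuffer conversion is gone.
static_assert(!std::is_convertible<int, BigBuffer>::value,
              "Foo(3) no longer compiles");

// Stand-in body for illustration only; the article's Foo is defined elsewhere.
void Foo(const BigBuffer&) {}

// Hypothetical caller: the 64KB temporary is now visible in the source.
void not_oops()
{
    Foo(BigBuffer(3));
}
```

The stack cost is the same as before; the difference is that the temporary can no longer appear without someone typing its name.
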
Comments (47)
  1. Peter Ritchie says:

    I believe explicit should be on every constructor (with parameters) when it is written.  Sort of like TDD (write your code to only do what it needs to do until it passes the current test), pull it off when you need to use that c’tor as a conversion constructor.

  2. Moi says:

    Thanks, you just ruined my favourite interview question :-)

    Peter – it is only necessary on constructors taking exactly one parameter (as Raymond said in his article, to be fair), so putting it on "every constructor (with parameters)" is going a bit far.

  3. BryanK says:

    Been a while since I’ve done any C++, but is the "explicit" keyword (and the compiler behavior that makes it necessary) part of the C++ standard?  It sounds like it is, but I’ve never heard of it before.  (Though since it’s been a while, that doesn’t mean much.)

  4. Yes, the "explicit" specifier is standard. Section 12.3.1.

    PMP

  5. Alyosha` says:

    For me, the red flag is not the lack of "explicit".  It’s the large buffer inside a class.  Most classes with large, fixed-sized buffers don’t need them … they probably would benefit from string or vector instead.

  6. Alyosha, you kind of missed the point.

    Raymond’s BigBuffer class is clearly as contrived as the oops() function, only there to serve as an example of "a class that has a fair amount of statically allocated memory". A representation of the situations where large, fixed-size buffers are actually useful.

    You might as well have pointed out that the oops() function is suspicious, since all it does is call another function with a hardcoded parameter ;)

  7. Duke says:

    Damn, I love Java.

  8. Soren says:

    Stuff like this is a major reason I don’t like C++. To use even the simplest features like constructors, you need to know and understand all minds of crap, like automatic conversions. I’ll take plain old C with a good utility library any day.

    There are very few useful C++ features that don’t drag in lots of weird stuff, so the argument that ‘you can just use the parts of C++ that you understand’ is bunk.

  9. boxmonkey says:

    I don’t know about you Soren, but I will never understand minds of crap. ;)

    I appreciate this post from Raymond, I didn’t know C++ behaved this way.

  10. Some guy says:

    The funny thing is that with "plain old C with a good utility library", you still actually need to understand all of C and all of the utility library.

    Hell, even with Java or C# you can write code that will work but run like a dead snail if you don’t understand what the memory manager is doing behind your back. (Java code written by people who fell for the lie that you don’t need to understand memory management is not pretty. C# code written by the same people isn’t any better.)

    The only language where you can do high quality development work without understanding the intricate details of the compiler is "english", as used by a manager when telling the coders what to do. (Well, perhaps…)

    However, people who know C++ intimately assume that it’s perfect, while C experts assume that C is naturally "easier" when actually it’s just what they know. A good X programmer will beat a poor Y programmer for virtually any pair of languages you care to name. And that even includes your personal least favorite version of Basic.

  11. Gabe says:

    Does anybody know why C++ has this feature in the first place?

  12. rsclient says:

    Not only are the odd bits of C++ pretty weird, but they often aren’t documented very well.

    The Microsoft C++ documentation for ‘explicit’ has some fairly useless verbiage, plus an example of how having an ‘explicit’ in a constructor will cause a compiler error.  It then says: "To resolve the error, remove the ‘explicit’ keywords…".

    It would be better, of course, to fix the code to use the class correctly.

    (And of my four C++ books, only Stroustrup mentions explicit at all)

    Thank you, Raymond, for adding to my C++ knowledge!

  13. rsclient: Which feature are you asking about? The "using a constructor as a converter" feature or the "explicit" feature?

    I’m surprised nobody has yet commented, "If Microsoft hadn’t made X stupid decision back in 19XX, we wouldn’t have this problem today," since that’s the typical reaction to most of my "compatibility pitfalls" articles…

  14. Some guy says:

    Even the idiots amongst us know that Microsoft didn’t invent C++? Well, I bet that wasn’t your prediction.  ;)

    Still, I am kinda surprised that so many people allegedly didn’t know this detail – I must have gotten lucky with my C++ reading because I’ve known it for years. It’s odd the things that occasionally turn out to be an obscure corner of a language. It’s good to know that C++ is the only language where that applies, though. Or something.

  15. Arlie Davis says:

    If Microsoft hadn’t made the stupid decision to support C and C++ in the 1980s, we wouldn’t have this problem today…

    …and we probably wouldn’t have anything, at all…

  16. Bart says:

    Found this out very early on while learning C++.

    This little ‘feature’ can cause all sorts of basically broken code to compile if you’re not aware of it.

    Although I can see the reason for including it: it allows the same syntax to be used for implicit conversion in the way that it’s supported for native types.

    That a constructor becomes a conversion operator without some kind of keyword, however, is not something I like, and I hope new languages will invert the use of something like ‘explicit’.

  17. Kzinti says:

    It would be nice if the default was "explicit" and you had to write "implicit" to get the current behavior. A bit late to fix this.

  18. Arlie Davis says:

    Bart, Kzinti: You’ll be happy to know that C# does things the way you want.  You can provide conversions, but they must be marked "explicit" or "implicit", and constructors are not default conversions.

  19. "Obviously, the C++ committee must fix this bug in the next version of the language specification. Programs that relied on the bug will have to be rewritten. That’s the price of progress. Backwards compatibility is a drain on resources."

  20. rsclient says:

    Raymond: That was ‘Gabe’ who was asking about the feature.  I was just commenting that it wasn’t documented very well — many books don’t mention it, and at least one major manufacturer has, for its documentation, the implicit notion that it should just be removed from code if it causes problems.

    Gabe: My guess is: because it’s darn useful.  Stroustrup gives the example

       string s = 'a'; // make s a string with int('a') elements

    which is clearly useless and non-intuitive.

  21. Norman Diamond says:

    > My guess is that the code is generating lots

    > of large C++ temporaries.

    That’s what killed memory performance, but in fact you could delete the word "large" and describe a lot more performance killers.  Think of how much code creates and deletes temporaries that need heap allocations, even just 4 bytes at a time, but invoke the memory manager twice for each one.

    Wednesday, May 24, 2006 5:56 PM by Arlie Davis

    > If Microsoft hadn’t made the stupid decision

    > to support C and C++ in the 1980s, we

    > wouldn’t have this problem today…

    > …and we probably wouldn’t have anything,

    > at all…

    Sure we would.  We wouldn’t have a language which started out as a replacement for assembly language, and we wouldn’t have a language which started out as a replacement for object oriented assembly language, but we’d have others.  Maybe Microsoft might not have made a decision to drop support for Fortran?  Microsoft still would have bought VB, then killed it off and started selling VB#.  Same for something starting with a J ^_^

    Once upon a time Fortran benefited from the addition of a statement "IMPLICIT NONE" which careful programmers started using.  Later VB got its Option Explicit and some VC++ headers started allowing programmers to use Strict to catch some of their own bugs in VC++ programs.  It really isn’t too late for the C++ standard to be augmented with a similar optional kind of declaration, or maybe a #pragma.  Political problems yes, timing problems no.

  22. Norman: I’m not sure why you’re bringing performance into the picture. The original problem wasn’t a performance issue, it was a crashing bug! Lots of small temporaries wouldn’t have caused a stack overflow crash.

  23. Peter Ritchie says:

    Moi, true, it’s not "needed" on anything other than a constructor with a single parameter; but, I’ve run into many situations where a constructor with multiple parameters was refactored to a constructor with a single parameter and introduced the problem outlined here.

    Adding it to all c’tors does not add any code and therefore does not add any performance implications; it merely makes the code safer.

  24. Peter Ritchie says:

    In the original C++ specifications there was no explicit keyword; the default behaviour was that c’tors with a single parameter were always potentially implicitly used as conversion constructors.  This, obviously, wasn’t realized until after much code had been written that already used this default behaviour and adding an "implicit" keyword would have potentially broken much code.  Thus, the "explicit" keyword was added.

  25. Norman Diamond says:

    > I’m not sure why you’re bringing performance

    > into the picture. The original problem

    > wasn’t a performance issue, it was a

    > crashing bug!

    I misinterpreted the original complaint, sorry.  I thought the original complaint was that the program was using too much memory.  In fact the original complaint was a more specific version of this, which you stated but I overlooked.  Sorry.

    By the way it’s interesting that one of your colleagues blogged about essentially the same topic on the same day.  A different design decision in the C programming language was involved but it also leads to bugs.  The programmer made a mistake which was unintended, received no error message because the program had a valid meaning, and the program behaved in an undesired manner.

  26. Neil says:

    Don’t forget that this particular conversion only happens for const parameters. If you have extern void Foo(BigBuffer& o); then a call to Foo(3) won’t compile.
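
The point about const references can be checked without calling anything. This is a minimal sketch under standard C++ rules, using hypothetical names throughout (Buffer, FooConst, FooMutable, CanCall): a small detector asks whether a call expression is well-formed, once for a const-reference parameter and once for a non-const one.

```cpp
#include <type_traits>
#include <utility>

struct Buffer            // hypothetical small stand-in for BigBuffer
{
    Buffer(int) {}       // non-explicit converting constructor
};

void FooConst(const Buffer&) {}
void FooMutable(Buffer&) {}

// Detects whether f(arg) is a well-formed call, without making the call.
template <typename F, typename Arg>
auto CanCall(int)
    -> decltype(std::declval<F>()(std::declval<Arg>()), std::true_type());
template <typename F, typename Arg>
std::false_type CanCall(...);

// The temporary built from the int may bind to a const reference...
static_assert(decltype(CanCall<decltype(&FooConst), int>(0))::value,
              "FooConst(3) compiles");

// ...but not to a non-const reference.
static_assert(!decltype(CanCall<decltype(&FooMutable), int>(0))::value,
              "FooMutable(3) is rejected");
```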

  27. Phylyp says:

    Raymond,

    Wow, this was a really interesting (and concise) example!

  28. Leo Davidson says:

    While we’re at it, I wish C++ didn’t automatically produce (usually wrong) copy constructors and assignment operators. Like the conversion constructors they are useful features that should be left in, but they should also not do things by default (because they’re usually wrong!), invisibly, without the programmer asking for them explicitly (no pun intended).

    There’s clearly a willingness to change C++ as the hundreds of warnings I got when moving to VS2005 are testament to. Okay, this was mainly Secure CRT stuff rather than the language itself, but small parts of the language have also been changing. I think it’s good that things are not set in stone and the language is progressing.

    I hope implicit conversion/copy/assignment constructors/operators are next against the wall! The effort required to update old code which relies on them is trivial compared to the effort that can be spent tracing the problems they cause (not to mention the fact that I’m sick of declaring private constructors/assignment operators, just in case, for classes that nobody in their right mind would apply those semantics to), and there can always be a compiler switch or pragma to get the old behaviour, just like there is/was for the (stupid) for loop variable scoping.

  29. Anders Dalvander says:

    Gabe: The implicit conversion is rather handy when creating objects on the stack.

    std::string s = "Hello World!";

    may be a bit easier to read than either of:

    std::string s("Hello World!");

    std::string s = std::string("Hello World!");

    Using the heap is another thing as you’ll always use new and the name of the class you want to instantiate:

    std::string* p = new std::string("Hello World!");

    Another interesting C++ "feature" is the following:

    std::string s();

    What does that line do? Anyone? ;)

  30. meh says:

    Nothing really. :P It’s a function declaration.

  31. Me says:

    There’s a reason why Lint issues a warning "constructor … can be used for implicit conversions" !

  32. Dave Harris says:

    My favourite example of this bug is like:

       CPoint p = (5, 10);

    which looks reasonable but doesn’t do what a naive user would expect. It’s equivalent to:

       CPoint p = CPoint(10);

    Microsoft in its wisdom having given MFC CPoint a constructor that takes a single int.
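
The comma pitfall can be reproduced without MFC. The sketch below assumes a hypothetical Point type that is not the real CPoint and does not copy how CPoint interprets its single argument; it only has the same shape, a one-argument constructor next to a two-argument one. A compiler may warn that the left operand of the comma has no effect, which is exactly the bug.

```cpp
struct Point                    // hypothetical stand-in, not the MFC CPoint
{
    constexpr Point(int packed) : x(packed), y(0) {}    // converting constructor
    constexpr Point(int px, int py) : x(px), y(py) {}
    int x, y;
};

// Parentheses plus '=' make this a comma expression: 5 is discarded,
// and 10 is converted through the one-argument constructor.
constexpr Point surprising = (5, 10);

// What the author presumably meant.
constexpr Point intended(5, 10);

static_assert(surprising.x == 10 && surprising.y == 0,
              "only the 10 reached the constructor");
static_assert(intended.x == 5 && intended.y == 10,
              "both values reached the constructor");
```

Marking the one-argument constructor explicit would turn the surprising line into a compile error.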

  33. Randolpho says:

    Dave Harris:

    You used the words "wisdom" and "MFC" in the same sentence. SHAME!!! :D

  34. Alex Blekhman says:

    <flame intensity=100%>

    I’m sick of all those people who lament about C++ complexity. If you’re not smart enough to code in C++, then just leave this niche and learn a language you can cope with. One doesn’t pursue tenure as a nuclear physics professor or attend world chess championships if he/she isn’t qualified/smart (or both) enough to do these things. So why do people claim they program in C++ just to cry a moment later about how difficult C++ is?

    No one proposes to remove half of pieces in chess with excuse that there are too many freaking combinations. That’s the game and that’s the rules. You don’t like it, you don’t play it. That simple.

    </flame>

  35. Jay B says:

    >>I hope implicit conversion/copy/assignment constructors/operators are next against the wall! The effort required to update old code which relies on them is trivial compared to the effort that can be spent tracing the problems they cause <clip> <<<

    Now, imagine you have multiple million lines of code spread out through many hundred different compiled modules as part of your "product".  Still think it’s trivial?  I can tell you without a doubt, it is not.

    Sure, you can say in hindsight, fixing a large problem would have resulted in less investment than finding and fixing a bunch of little ones, but what if you’ve already invested the time in finding and fixing those little ones?  Now you’re talking about fixing the large one after the fact, meaning not only was there the original investment but now there is a new investment.  And what’s the benefit?  You’re likely going to introduce more bugs, cost development dollars and gain nothing in the marketplace.  Clients and customers don’t ultimately care how proper your code is.  You can’t sell new versions of your product if nothing of substance changed.

  36. J. says:

    Moi, I may be mistaken, but my understanding is that the "explicit" keyword does have meaning outside of the single-argument constructor scenario described here.  For instance:

    class Money {

     public:

       Money(int dollars, int cents) { … }

    };

    Money roundsForYou() { return Money(10.5, 50); }

    Here, an implicit conversion will take place from the double, 10.5, to an int.  If the constructor was declared with the explicit keyword, I don’t think this would be allowed.

  37. Alex Blekhman says:

    J, you are mistaken. An ‘explicit’ constructor prevents implicit creation of an object. For instance:

    struct X

    {

       explicit X(int) {}

    };

    X foo() { return 3; }

    In this example foo() won’t compile. You need to explicitly state that you want to create an object:

    X foo() { return X(3); }
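
Both halves of this exchange can be checked with type traits. In the sketch below, Money is J.'s class with explicit added to test the hypothesis, and the parameter names d and c and the variable amount are assumptions made for the example. The checks suggest that explicit leaves conversions of the arguments alone and only stops the object itself from being created implicitly.

```cpp
#include <type_traits>

struct Money
{
    explicit Money(int d, int c) : dollars(d), cents(c) {}
    int dollars, cents;
};

struct X
{
    explicit X(int) {}
};

// explicit does not vet the arguments: a double is still converted to int.
static_assert(std::is_constructible<Money, double, int>::value,
              "Money(10.5, 50) still compiles");

// What explicit blocks is building the object implicitly.
static_assert(std::is_constructible<X, int>::value,
              "X(3) is accepted");
static_assert(!std::is_convertible<int, X>::value,
              "'return 3;' from a function returning X is rejected");

Money roundsForYou()
{
    double amount = 10.5;
    return Money(amount, 50);   // dollars becomes 10: the fraction is dropped
}
```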

  38. Tim Smith says:

    Over the years of dealing with the compiler generated methods, I have run into more issues where people created such things as assignment operators and copy constructors when the compiler generated methods would have worked just fine.

    Removing the compiler generated methods might remove one source of bugs, but you end up just trading those bugs for another type of bug.

  39. Norman Diamond says:

    Thursday, May 25, 2006 12:29 PM by Alex Blekhman

    > <flame intensity=100%>

    > If you’re not smart enough to code in C++,

    > then just leave this niche and learn a

    > language you can cope with.

    Oh I agree.  If you’re smart enough to code in C++ for 5 minutes per day while really maximizing your concentration and avoiding making any coding errors, then you should code in C++ for 5 minutes per day and code in Eiffel for the rest of the day.  If you have a cold that day then you should do 0 minutes in C++ and stick to Eiffel.

    This is almost exactly parallel to the use of other dangerous powers.  If you’re smart enough to run as root (or Administrator) for 5 minutes per day while really maximizing your concentration and avoiding making any typing or mousing errors, then you should run as Administrator for 5 minutes per day and run as a limited user for the rest of the day.  If you have a cold that day then you shouldn’t use an account that is capable of having administrative privileges enabled that day.

    Now what happens if you want to hibernate a PC instead of pulling the plug, or if you need to fix some C++ code that was written by someone less skilled than you, but it’s going to take more than 5 minutes?

    If you’re a mere mortal who is aware that sometimes you make mistakes, and you want to make a bit of effort to protect yourself and your customers from mistakes (e.g. by getting informed of them as quickly as possible), then you might wish for better tools.  Let me know when a C++ to Eiffel converter is available, and let me know when it works.

  40. Luther Baker says:

    <slightly-ot>

    How does one look at the size of the stack in something like an MSVC 2005 environment? Do you insert a breakpoint and somehow look at the stack trace or debug output?

    </slightly-ot>

    Thanks,

    -Luther

  41. Norman Diamond says:

    Ooooops.  This one’s too explicit for me:

    http://thedailywtf.com/forums/permalink/74461/74486/ShowThread.aspx#74486

  42. David Conrad says:

    "If you’re not smart enough to code in C++, then just leave this niche and learn a language you can cope with."

    Absolutely! In fact, I hope that future changes will greatly increase the complexity of C++ to weed out more people who think they can program. Thin the herd! We shouldn’t design programming languages for the people who are going to use them; instead, make the people adapt to the programming language. With a little work we can turn C++ into a language where almost any construction is a subtle, hard-to-find, difficult-to-understand error.

    "Now, imagine you have multiple million lines of code spread out through many hundred different compiled modules as part of your "product".  Still think it’s trivial?"

    Yup. Just turn on the backwards-compatibility compiler switch. Then add the copy ctors and assignment ctors to the classes one by one, over time, occasionally turning off the compiler switch to see what classes still need them. (This could be a long range refactoring project over a period of months, while other development continues unimpeded.)

    Eventually, when you turn off the compiler switch you get no errors, and you know you are done. You may have millions of SLOC, but you don’t have millions of classes in your product, do you?

    Worst case, you just turn on the backwards-compatibility compiler switch and leave it that way, but new projects can take advantage of the change. Now, whether this is a good idea, I am undecided on, but I think it would definitely be doable.

  43. Oliver says:

    >>>Don’t forget that this particular conversion only happens for const parameters. If you have extern void Foo(BigBuffer& o); then a call to Foo(3) won’t compile.<<<

    And consider this (actually somewhat made-up) example with built-in types (which will generate a warning, but compile):

    class CTest

    {

    public:

    /*explicit*/ CTest(int i) : m_i(i) { }

    int GetInt() const { return m_i; }

    private:

    int m_i;

    };

    CTest foo()

    {

    return 7.9;

    }

    ;)

  44. Leo Davidson:

    I agree that copy constructors and assignment operators should not be implicitly created, especially when they are almost always wrong.  When you don’t want them, it forces you to override them yourself, introducing all kinds of problems — duplicate code, error handling, dependency problems.  I ran into this recently on a large project, and it’s just a complete mess.  Luckily I can make them private (which doesn’t solve everything), but why am I forced to do even this?  Why does the language create them when they are useless for anything but the most basic classes?

    I also agree that a compiler switch / pragma can solve the problem, to allow the old code of yesteryear to compile without the expense of upgrading, but also allow newer code to be written with less bugs.  Both worlds can co-exist.

    Jay B:

    Yes, it is costly to fix millions of lines of code, even if the fixes are trivial.  But, doesn’t a backward compatibility switch allow both worlds to co-exist with no additional effort from either?  And is this too much to ask?

  45. Per Vognsen says:

    Jason Doucette,

    "I agree that copy constructors and assignment operators should not be implicitly created, especially when they are almost always wrong."

    The synthesized copy constructor and assignment operator often do exactly what you want in a well-designed system where (1) most classes have proper value semantics; and (2) manual memory management is confined to as few classes as possible.

    "Luckily I can make them private (which doesn’t solve everything)"

    What problem are you trying to solve?

    If you don’t want to allow copying or assignment, the usual idiom is to make the copy constructor and assignment operator private and–this is important–leave them unimplemented. If you do this correctly, attempts at copying or assignment (even by member or friend functions) will result in a compile-time error.

    Even better would be to use boost::noncopyable or the like.

  46. I’d like to see the C++ defaults changed too: an "implicit" keyword that can be used everywhere "explicit" is, and then slowly make "explicit" the default.

    Same goes for "mutable" and "const", I want to explicitly write "mutable" functions or arguments just like I do for "const" now.  And eventually, make "const" the default instead of an implicit "mutable" (or get rid of an implicit default altogether).

  47. J. says:

    Thanks Alex, I’ll shut up now :/
