A nifty little preprocessor trick for C++


We found out something a little surprising about C++ enums a while back.  Suppose you have this code:

 

class Foo {
public:
      enum Color {
            Red   = 1 << 0,
            Green = 1 << 1,
            Blue  = 1 << 2
      };

      void Bar() {
            Color purple = Color::Red | Color::Blue;
      }
};

 

It turns out that this code is illegal for a couple of reasons.  First, you can’t refer to the elements of a nested enum through the enum name.  Instead you need to refer to them through the outer type’s name, i.e. instead of Color::Red you’d say Foo::Red, Foo::Green or Foo::Blue.  Second, while enums are implicitly convertible to integers, the reverse is not true.  So applying the binary operator | to two enum values produces an int, which then needs to be cast back to the enum type.  So the code would actually have to look like:

 

      void Bar() {
            Color purple = (Color)(Red | Blue);
      }

 

Or, if you’re external to the class it’ll look like

 

      void Bar() {
            Foo::Color purple = (Foo::Color)(Foo::Red | Foo::Blue);
      }

 

Neither is particularly nice to use in practice.  The casts mean that even operators like |= are unpleasant to use, and referring to the values through the outer type name is just confusing.  One way to solve the latter problem is to make the enum top level, i.e. have something like:

 

enum Color {
      Red   = 1 << 0,
      Green = 1 << 1,
      Blue  = 1 << 2
};

 

That makes things a little nicer, since you can now refer to the elements without going through an outer type’s name.  However, we’re also now polluting the top level namespace with the enum member names.  This means that I now can’t add an enum like:

 

enum StopLightState {
      Red,
      Yellow,
      Green
};

 

because the “Red” value will conflict.   So in our codebase we had these enums defined using ugly prefixes in order to not pollute the namespace (i.e. Color_Red, Color_Green, Color_Blue), and we would also just store those values into DWORDs so that we wouldn’t have to be casting on every operation.
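For concreteness, here is a rough sketch of what that older style looks like.  The DWORD typedef is a stand-in for the Windows one, the StopLight_ prefix is made up, and MakePurple and AddFlag are hypothetical helpers for illustration, not code from our codebase:

```cpp
// Stand-in for the Windows DWORD typedef so the sketch builds anywhere.
typedef unsigned long DWORD;

// Prefixes keep the member names unique at namespace scope.
enum Color {
    Color_Red   = 1 << 0,
    Color_Green = 1 << 1,
    Color_Blue  = 1 << 2
};

enum StopLightState {
    StopLight_Red,
    StopLight_Yellow,
    StopLight_Green
};

// Hypothetical helper: no casts are needed because the flags are
// stored in a plain integer.
inline DWORD MakePurple() {
    DWORD purple = Color_Red | Color_Blue;
    return purple;
}

// Hypothetical helper: the parameter is just a DWORD, so a value from
// an unrelated enum is accepted without any complaint.
inline DWORD AddFlag(DWORD colors, DWORD flag) {
    return colors | flag;
}
```

Calling AddFlag(MakePurple(), StopLight_Green) compiles cleanly even though a stop light state is not a color, which is the cost of storing everything in DWORDs.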

 

Clearly this was pretty suboptimal.  The naming is redundant and we didn’t like that duplication (not to mention the ugliness of the prefixes), and by using DWORDs we were losing type safety.  So what can we do about this?  Well, thanks to a few C++ tricks it’s easily remedied.  Instead of defining the enums as I did above, try using the following pattern:

 

struct Color {
private:
      Color() {}

public:
      enum _Enum {
            Red   = 1 << 0,
            Green = 1 << 1,
            Blue  = 1 << 2
      };
};
typedef Color::_Enum ColorEnum;

// Cast to int before applying the built-in operator; otherwise the
// overload would call itself.
inline ColorEnum operator &(ColorEnum e1, ColorEnum e2) {
      return (ColorEnum)((int)e1 & (int)e2);
}
inline ColorEnum operator <<(ColorEnum e, int shift) {
      return (ColorEnum)((int)e << shift);
}
//…define the rest of the operators that you’d like

 

Now I can write code just like this:

 

      void Bar() {
            ColorEnum purple = Color::Red | Color::Blue;
      }

 

It’s now typesafe (no more DWORDs), and it allows convenient naming of the enum values without polluting the main namespace.
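To make that concrete, here is a minimal self-contained sketch of the pattern with a few more operators filled in.  It is one possible way to write them, not necessarily what our real headers contain.  Two things are assumptions of this sketch: the nested enum is called Enum rather than _Enum (names starting with an underscore and a capital letter are reserved for the implementation), and HasFlag and CheckColors are hypothetical helpers added for illustration.  Each operator casts to int first so that it uses the built-in operator instead of calling itself:

```cpp
#include <cassert>

struct Color {
private:
    Color() {}  // private constructor: Color only serves as a scope for the names
public:
    enum Enum {
        Red   = 1 << 0,
        Green = 1 << 1,
        Blue  = 1 << 2
    };
};
typedef Color::Enum ColorEnum;

inline ColorEnum operator|(ColorEnum a, ColorEnum b) {
    return static_cast<ColorEnum>(static_cast<int>(a) | static_cast<int>(b));
}

inline ColorEnum operator&(ColorEnum a, ColorEnum b) {
    return static_cast<ColorEnum>(static_cast<int>(a) & static_cast<int>(b));
}

inline ColorEnum& operator|=(ColorEnum& a, ColorEnum b) {
    a = a | b;  // uses the overload above, which does the cast
    return a;
}

// Hypothetical helper: true when every bit of `flag` is set in `value`.
inline bool HasFlag(ColorEnum value, ColorEnum flag) {
    return (value & flag) == flag;
}

// Hypothetical self-check showing typical usage.
inline void CheckColors() {
    ColorEnum purple = Color::Red | Color::Blue;  // no cast at the call site
    assert(HasFlag(purple, Color::Red));
    assert(!HasFlag(purple, Color::Green));
    purple |= Color::Green;
    assert(HasFlag(purple, Color::Green));
    // ColorEnum bad = 5;  // would not compile: no implicit int -> enum conversion
}
```

The commented-out line at the end is the type safety that the DWORD version gave up.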

 

It’s turned out to be so useful that we now have macros to do the work for us.  Specifically:

 

#define DECLARE_ENUM(name) \
      struct name { \
      private: \
            name() {} \
      public: \
            enum _Enum {

#define END_ENUM(name) \
            }; \
      }; \
      typedef name::_Enum name##Enum;

#define FLAGS_ENUM(name) \
      inline name::_Enum operator &(name::_Enum e1, name::_Enum e2) { \
            return (name::_Enum)((int)e1 & (int)e2); \
      } \
      // … rest of the operators you care about go here.

 

Once you’ve done this you can now just write:

 

DECLARE_ENUM(Color)
      Red   = 1 << 0,
      Green = 1 << 1,
      Blue  = 1 << 2
END_ENUM(Color)
FLAGS_ENUM(Color)

 

And now all of that boilerplate code will be written for you. 
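Here is a sketch of how the whole thing might fit together in a single file.  It assumes the nested enum is named Enum instead of _Enum (to stay clear of reserved names), fills in operator | and operator |= as examples of the remaining operators, and casts to int inside each one so the operators don’t call themselves.  The StopLightState declaration is only there to show that the member names no longer collide:

```cpp
#define DECLARE_ENUM(name) \
    struct name { \
    private: \
        name() {} \
    public: \
        enum Enum {

#define END_ENUM(name) \
        }; \
    }; \
    typedef name::Enum name##Enum;

// Only a subset of the operators, as an illustration.
#define FLAGS_ENUM(name) \
    inline name::Enum operator&(name::Enum e1, name::Enum e2) { \
        return static_cast<name::Enum>(static_cast<int>(e1) & static_cast<int>(e2)); \
    } \
    inline name::Enum operator|(name::Enum e1, name::Enum e2) { \
        return static_cast<name::Enum>(static_cast<int>(e1) | static_cast<int>(e2)); \
    } \
    inline name::Enum& operator|=(name::Enum& e1, name::Enum e2) { \
        e1 = e1 | e2; \
        return e1; \
    }

DECLARE_ENUM(Color)
    Red   = 1 << 0,
    Green = 1 << 1,
    Blue  = 1 << 2
END_ENUM(Color)
FLAGS_ENUM(Color)

// Not a flags enum, so FLAGS_ENUM is left out. Red and Green do not
// clash with Color::Red and Color::Green because each lives in its own struct.
DECLARE_ENUM(StopLightState)
    Red,
    Yellow,
    Green
END_ENUM(StopLightState)

// Hypothetical usage helper.
inline ColorEnum MakePurple() {
    return Color::Red | Color::Blue;
}
```

Because StopLightState skips FLAGS_ENUM, an expression like StopLightState::Red | StopLightState::Green still yields a plain int and cannot be assigned back to a StopLightStateEnum without a cast.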

Useless for you?  Quite possibly :)   But very helpful for me.

 


Edit: I’ve added in the values that the enum members are initialized to so that they can be used effectively as named flags.  However, it’s somewhat annoying to force you to remember to write in (1 << n) for every member.  Anyone have any ideas for how one might get that automatically?


Comments (27)

  1. This is a misuse of enums IMO. Why not just use constants (or even #defines) instead?

  2. DrPizza says:

    Personally, I find that enumerated values shouldn’t allow combinations that aren’t also enumerated values. Being able to come up with an answer that’s not in the enumeration seems extremely silly to me. It makes sense for *flags*, but flags aren’t really enumerations (unless you want to list and name all the combinations for any particular family of flags, which you probably don’t….). For your traffic light example, for instance, Red | Green is meaningless; the traffic light can’t show that value, so it should be prohibited. Red | Amber, OTOH, should be a value within the enumeration, as it’s a possible combination.

    If you’re going to allow non-enumerated values to appear you probably shouldn’t be using an enumeration at all.

  3. Nick says:

    And they say that MACROs in C# would be useless… imagine what you could do there too!

  4. Ryan Lamansky (Kardax) says:

    Nick: This isn’t a good case for macros in C# because C# enums are way nicer to work with than C++ enums.

  5. Dan says:

    This still won’t work unless you explicitly give each item in the enumeration a power-of-two value. In your example, Red | Blue is just going to be ( 0 | 2 ) = 2 = Blue.

  6. Pavel: Constants give me no type safety.

    DrPizza: You’re right. And that’s why the FLAGS_ENUM portion of the macro is optional. If your enum is meaningless as a flag then you can leave it out.

    Dan: You’re correct. I was leaving that out for brevity, but i can see how confusing that made it and I’ll add it back in.

  7. Andy Pennell says:

    But what about the code generation Cyrus? Does it generate the same code as the "ugly" source? If it generates a load more code, it’s not worth it IMHO.

  8. Andy: You’d rather write out all the operators by hand? Right now i have

    operator |

    operator &

    operator ~

    operator <<

    operator >>

    operator |=

    operator &=

    operator <<=

    operator >>=

    This macro gives me C# like usage of enums. They’re typesafe, and can be used as named flags handily.

  9. DrPizza says:

    "DrPizza: You’re right. And that’s why the FLAGS_ENUM portion of the macro is optional. If your enum is meaningless as a flag then you can leave it out. "

    But that still isn’t correct, because it doesn’t force combinations that are legal.

    For a traffic light, for example, Red | Amber should be legal (as it’s one of the states that a traffic light can have) and should have a member in the actual enumeration, but Red | Green should not (because it’s not a state a traffic light can actually have).

  10. DrPizza: That’s why StopLightState wasn’t a flag enum. It was just there to show an example of namespace pollution.

  11. DrPizza says:

    The StopLightState didn’t use the macros at all, so who was to know one way or the other.

    But the issue remains. Flags shouldn’t be enums because the results of combining flags aren’t enumerated values.

  12. David says:

    I think this is great! I’ve got several places where this will really clean-up some code…

  13. Zdeslav says:

    Actually, this is a flawed usage of enums. as Stroustrup says: "An enumeration is a type that can hold a set of values specified by user". so, if you define an enum Color to hold values Red, Blue and Green, then the Purple is not a member of the Color, and declaring it as such is breaking type-safety. IMO, the closest match to flags semantics is a struct with constants defined in it.

  14. Zdeslav says:

    here is a quick hack which should solve automatic assignment of power-of-2 values. it probably doesn’t fulfill all the requirements you had, but could be a starting point:

    #include <iostream>

    #define DECLARE_FLAGS(Name) struct Name { private: static const int __x = __LINE__ + 1; public:
    #define FLAG_VALUE(Name) static const int Name = 1 << (__LINE__ - __x);
    #define END_FLAGS() };

    // Each FLAG_VALUE must sit on its own line, directly after
    // DECLARE_FLAGS, with no blank lines in between.
    DECLARE_FLAGS(Color)
    FLAG_VALUE(Red)
    FLAG_VALUE(Green)
    FLAG_VALUE(Blue)
    FLAG_VALUE(Yellow)
    END_FLAGS()

    /* declaration above is preprocessed into this:
    struct Color
    {
    private: static const int __x = __LINE__ + 2; public:
    static const int Red = 1 << (__LINE__ - __x);
    static const int Green = 1 << (__LINE__ - __x);
    static const int Blue = 1 << (__LINE__ - __x);
    static const int Yellow = 1 << (__LINE__ - __x);
    }; */

    int main()
    {
    std::cout << "Red = " << Color::Red << "\n";
    std::cout << "Green = " << Color::Green << "\n";
    std::cout << "Blue = " << Color::Blue << "\n";
    std::cout << "Yellow = " << Color::Yellow << "\n";
    std::cout << "Purple = " << (Color::Red | Color::Blue) << "\n";
    return 0;
    }

  15. Zdeslav: I really couldn’t care less what Stroustrup says :)

    Also, in your example i lose typesafety when it’s a constant inside a struct.

    Oh, i love the enum definition which automatically increments the values. Very very cute :)

  16. Zdeslav says:

    it doesn’t matter what Stroustrup says, i agree, but that’s also what the C++ standard says. Now, i understand that adherence to standards is not always the top priority in MS, but they have their place (don’t get me wrong, i’m not one of the linux zealots, i use MS tools almost exclusively and i think they’re great).

    Besides, when you cast any number into an enum, you also lose type-safety. There is something similar in COM: if you have a method which receives a parameter which is an enum, and you try to pass a value which is not defined in the enum, there is no guarantee that the call will be successfully marshaled, and there is a good reason for that. Enum does not mean any number, it means exactly what it says: an enumeration. If you just want to pass any integer to a function, use integers.

    No offense, but for me, this is just a bad coding practice.

  17. Zdeslav says:

    sorry to be nitpicking here, but i would like to correct myself: this is not a bad coding practice, it’s an error. C++ standard from 1998 (ISO/IEC 14882) states this (5.2.9): "A value of integral type can be explicitly converted to an enumeration type. The value is unchanged if the integral value is within the range of the enumerated values (7.2). Otherwise, the resulting enumeration value is unspecified."

    Therefore, combining Red and Blue values and casting it into Color, as in your example, results in undefined behavior.

  18. Radu Grigore says:

    Here is another idea for assigning powers-of-2 to enum values:

    #define LCONS(a) 1, a = 1 << 0
    #define CONS(a, b) PRIV_CONS(a, b)
    #define TRIM(x) PRIV_TRIM(x)
    #define PRIV_CONS(a, x, ...) (x+1), a = 1 << x, __VA_ARGS__
    #define PRIV_TRIM(x, ...) __VA_ARGS__
    #define A b, c
    #define P(b, c) c, b

    enum X { TRIM(
    CONS(RED,
    CONS(ORANGE,
    CONS(YELLOW,
    CONS(GREEN,
    CONS(BLUE,
    LCONS(MAGENTA) ) ) ) ) )
    )};

    It’s just a "cute" use of macros. i’m not sure if the idea can be tweaked enough to turn it into one usable in practice.

  19. Radu Grigore says:

    Ooops.. the A and P function-like macros are just leftovers from where I have copied the code: it has nothing to do with the solution.

  20. James says:

    The operators you’ve written are incomplete and will recurse to their doom. You need functions more like:

    E& operator|= (E& a, E b) { return static_cast<E>(a+0|b); }

    E operator| (E a, E b) { return a |= b; }

    Your macro stunt breaks down if you want to provide a good implementation of operator ~ (unless you’re willing to set bits that have no corresponding enumerator).

    As others have pointed out, your example is quite questionable and using flags to represent traffic lights is not a very sensible thing to do. Shifting flags is something else to avoid under normal circumstances.

    Don’t worry about what Zdeslav said; if he had read the section referenced from the one he quoted (specifically 7.2 paragraphs 5 and 6) he’d understand that the enum can hold all the necessary values; even if it couldn’t, the behaviour wouldn’t be "undefined" – the sentence he quoted explains that it’s *unspecified*, which is different (it can truncate, saturate or whatever, but not crash).

    You should also be aware that names starting with an underscore and a capital letter are reserved for the implementation and shouldn’t be used by diligent programmers: see section 17.4.3.1.2 in the C++ standard. That means your _Enum member should be renamed. In the C standard, names beginning with a capital E and an upper-case letter are also reserved as macro names for constants like EINVAL… do you remember the errno variable from <errno.h>? That would discount your END_ENUM macro, but I’m not sure whether the restriction applies to C++.

    While you’re looking at clause 17, you might as well read section 17.3.2.1.2 (Bitmask types), because it describes exactly what you’re doing here (sans macros).

    Finally, can I point out that if you really want to pretend that C++ has scoping rules for enums that are similar to C#, you have some work to do. With the definition above, the following compiles:

    Color c(c);

    struct MoreColourful: Color {};

    …etc. The proper tool for the job would be namespaces:

    namespace Colour { enum T { Red, Green, Blue }; }

    Thanks.

  21. James says:

    I should have read that before I posted it…

    E& operator|= (E& a, E b) { return static_cast<E>(a+0|b); }

    should be

    E& operator|= (E& a, E b) { return a = static_cast<E>(a+0|b); }

  22. Zdeslav says:

    James:

    Yes, i understand that enum can hold any value of its underlying type, but that doesn’t mean that unspecified behavior is something that should be exploited in the code. Also, you can’t be sure that it wouldn’t result in a crash, because there is other code which relies on the values of such constructs which can crash when it faces a value. What should the client code do with something which is declared as enum but contains something completely different?

    Paragraphs 5 and 6 of 7.2 state that the implementation defines which underlying type shall be used for an enumeration, and that it must be big enough to hold the values defined by the enum. So if you have an enum like this:

    enum TraficLight
    {
    Red = 1,
    Yellow = 2,
    Green = 255 // doesn’t have flag semantics, I know
    };

    the implementation can choose unsigned char as the underlying type. In that case, Red + Green would result in an unspecified value (anything from 0-255, but who knows). I know that this is a contrived example, but in order to have a construct like “Red + Green” work correctly, you have to know the exact values of the enum, which really defeats the purpose of having an enum in the first place and breaks the encapsulation. Besides, you must know that it carries the flags semantics, and there is nothing in enum which can enforce that.

  23. Mark Vabulas says:

    Last time i checked, c++ (yes, even MSVC++) had a preprocessor macro defined that increments automatically each time it is used. That means you could modify your code slightly to use it, something like:

    1 << _COUNT

    or whatever it is, don’t have my C++ book handy. That would give you the ability to automatically increment it each time, no need to manually do it.

  24. Llama says:

    How do you reset the _COUNT?
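
A possible sketch of the counter idea, assuming the macro meant above is __COUNTER__, a non-standard extension that MSVC, GCC and Clang all provide.  The counter cannot be reset, so this version records its value when the declaration starts and subtracts that from each later use.  BEGIN_FLAGS, FLAG and END_FLAGS are hypothetical names, and the flags end up as plain unnamed-enum constants, so the typed operator overloads from the post are not part of it:

```cpp
// Relies on __COUNTER__, which is a compiler extension and not part of
// standard C++. Every expansion of it yields the next integer.
#define BEGIN_FLAGS(name) \
    struct name { \
        enum { Base = __COUNTER__ + 1 };

// The first FLAG after BEGIN_FLAGS sees a counter equal to Base, so it
// gets 1 << 0; the next one gets 1 << 1, and so on.
#define FLAG(id) enum { id = 1 << (__COUNTER__ - Base) };

#define END_FLAGS() };

BEGIN_FLAGS(Color)
    FLAG(Red)
    FLAG(Green)
    FLAG(Blue)
END_FLAGS()

// A second declaration starts again from 1 << 0 because its own Base
// captures wherever the counter happens to be by then.
BEGIN_FLAGS(Style)
    FLAG(Bold)
    FLAG(Italic)
END_FLAGS()
```

Unlike the __LINE__ version, this does not care about blank lines between the flags, but any other use of __COUNTER__ between BEGIN_FLAGS and END_FLAGS would leave a gap in the bit positions.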