Intro
The enum type in C++ is fairly basic; it basically just creates a bunch of compile-time values for labels (potentially with proper scoping with enum class).
It's very attractive for grouping related compile-time constants together:
enum class Animal{
DOG,
CAT,
COW,
...
};
// ...
Animal myAnimal = Animal::DOG;
However it has a variety of perceived shortcomings, including:
No standard way to get the number of possible elements
No iteration over elements
No easy association of enum with string
In this post, I seek to create a type that addresses those perceived shortcomings.
An ideal solution takes the notion of compile-time knowledge of constants and their associations with strings and groups them together into an scoped-enum-like object that is searchable both by enum id and enum string name. Finally, the resulting type would use syntax that is as close to enum syntax as possible.
In this post I'll first outline what others have attempted for the individual pieces, and then walk through two approaches, one that accomplishes the above but has undefined behavior due to order of initialization of static members, and another solution that has less-pretty syntax but no undefined behavior due to order of initialization.
Prior work
There are plenty of questions on SO about getting the number of items in an enum (1 2 3) and plenty of other questions on the web asking the same thing (4 5 6) etc. And the general consensus is that there's no sure-fire way to do it.
The N'th element trick
The following pattern only works if you enforce that enum values are positive and increasing:
enum Foo{A=0, B, C, D, FOOCOUNT}; // FOOCOUNT is 4
But is easily broken if you're trying to encode some sort of business logic that requires arbitrary values:
enum Foo{A=-1, B=120, C=42, D=6, FOOCOUNT}; // ????
Boost Enum
And so the developers at Boost attempted to solve the issue with Boost.Enum which uses some fairly complicated macros to expand into some code that will at least give you the size.
Iterable Enums
There have been a few attempts at iterable enums; enum-like objects that one can iterate over, theoretically allowing for implicit size computation, or even explicitly in the case of [7] (7 8 9, ...)
Enum to String conversion
Attempts to implement this usually result in free-floating functions and the use of macros to call them appropriately. (8 9 10)
This also covers searching enums by string (13)
Additional Constraints
No macros
Yes, this means no Boost.Enum or similar approach
Need int->Enum and Enum-int Conversion
A rather unique problem when you start moving away from actual enums;
Need to be able to find enum by int (or string)
Also a problem one runs into when they move away from actual enums. The list of enums is considered a collection, and the user wants to interrogate it for specific, known-at-compile-time values. (See iterable enums and Enum to String conversion)
At this point it becomes pretty clear that we cannot really use an enum anymore. However, I'd still like an enum-like interface for the user
Approach
Let's say I think that I'm super clever and realize that if I have some class A:
struct A
{
static int myInt;
};
int A::myInt;
Then I can access myInt by saying A::myInt.
Which is the same way I'd access an enum:
enum A{myInt};
// ...
// A::myInt
I say to myself: well I know all my enum values ahead of time, so an enum is basically like this:
struct MyEnum
{
static const int A;
static const int B;
// ...
};
const int MyEnum::A = 0;
const int MyEnum::B = 1;
// ...
Next, I want to get fancier; let's address the constraint where we need std::string and int conversions:
struct EnumValue
{
EnumValue(std::string _name): name(std::move(_name)), id(gid){++gid;}
std::string name;
int id;
operator std::string() const
{
return name;
}
operator int() const
{
return id;
}
private:
static int gid;
};
int EnumValue::gid = 0;
And then I can declare some containing class with static EnumValues:
MyEnum v1
class MyEnum
{
public:
static const EnumValue Alpha;
static const EnumValue Beta;
static const EnumValue Gamma;
};
const EnumValue MyEnum::Alpha = EnumValue("Alpha")
const EnumValue MyEnum::Beta = EnumValue("Beta")
const EnumValue MyEnum::Gamma = EnumValue("Gamma")
Great! That solves some of our constraints, but how about searching the collection? Hm, well if we now add a static container like unordered_map, then things get even cooler! Throw in some #defines to alleviate string typos, too:
MyEnum v2
#define ALPHA "Alpha"
#define BETA "Beta"
#define GAMMA "Gamma"
// ...
class MyEnum
{
public:
static const EnumValue& Alpha;
static const EnumValue& Beta;
static const EnumValue& Gamma;
static const EnumValue& StringToEnumeration(std::string _in)
{
return enumerations.find(_in)->second;
}
static const EnumValue& IDToEnumeration(int _id)
{
auto iter = std::find_if(enumerations.cbegin(), enumerations.cend(),
[_id](const map_value_type& vt)
{
return vt.second.id == _id;
});
return iter->second;
}
static const size_t size()
{
return enumerations.size();
}
private:
typedef std::unordered_map<std::string, EnumValue> map_type ;
typedef map_type::value_type map_value_type ;
static const map_type enumerations;
};
const std::unordered_map<std::string, EnumValue> MyEnum::enumerations =
{
{ALPHA, EnumValue(ALPHA)},
{BETA, EnumValue(BETA)},
{GAMMA, EnumValue(GAMMA)}
};
const EnumValue& MyEnum::Alpha = enumerations.find(ALPHA)->second;
const EnumValue& MyEnum::Beta = enumerations.find(BETA)->second;
const EnumValue& MyEnum::Gamma = enumerations.find(GAMMA)->second;
Full working demo HERE!
Now I get the added benefit of searching the container of enums by name or id:
std::cout << MyEnum::StringToEnumeration(ALPHA).id << std::endl; //should give 0
std::cout << MyEnum::IDToEnumeration(0).name << std::endl; //should give "Alpha"
BUT
This all feels very wrong. We're initializing a LOT of static data. I mean, it wasn't until recently that we could populate a map at compile time! (11)
Then there's the issue of the static-initialization order fiasco:
A subtle way to crash your program.
The static initialization order fiasco is a very subtle and commonly
misunderstood aspect of C++. Unfortunately it’s very hard to detect —
the errors often occur before main() begins.
In short, suppose you have two static objects x and y which exist in
separate source files, say x.cpp and y.cpp. Suppose further that the
initialization for the y object (typically the y object’s constructor)
calls some method on the x object.
That’s it. It’s that simple.
The tragedy is that you have a 50%-50% chance of dying. If the
compilation unit for x.cpp happens to get initialized first, all is
well. But if the compilation unit for y.cpp get initialized first,
then y’s initialization will get run before x’s initialization, and
you’re toast. E.g., y’s constructor could call a method on the x
object, yet the x object hasn’t yet been constructed.
I hear they’re hiring down at McDonalds. Enjoy your new job flipping
burgers.
If you think it’s “exciting” to play Russian Roulette with live rounds
in half the chambers, you can stop reading here. On the other hand if
you like to improve your chances of survival by preventing disasters
in a systematic way, you probably want to read the next FAQ.
Note: The static initialization order fiasco can also, in some cases,
apply to built-in/intrinsic types.
Which can be mediated with a getter function that initializes your static data and returns it (12):
Fred& GetFred()
{
static Fred* ans = new Fred();
return *ans;
}
But if I do that, now I have to call a function to initialize my static data, and I lose the pretty syntax you see above!
#Questions#
So, now I finally get around to my questions:
Be honest, how bad is the above approach? In terms of initialization order safety and maintainability?
What kind of alternatives do I have that are still pretty for the end user?
EDIT
The comments on this post seem to indicate a strong preference for static accessor functions to get around the static order initialization problem:
public:
typedef std::unordered_map<std::string, EnumValue> map_type ;
typedef map_type::value_type map_value_type ;
static const map_type& Enumerations()
{
static map_type enumerations {
{ALPHA, EnumValue(ALPHA)},
{BETA, EnumValue(BETA)},
{GAMMA, EnumValue(GAMMA)}
};
return enumerations;
}
static const EnumValue& Alpha()
{
return Enumerations().find(ALPHA)->second;
}
static const EnumValue& Beta()
{
return Enumerations().find(BETA)->second;
}
static const EnumValue& Gamma()
{
return Enumerations().find(GAMMA)->second;
}
Full working demo v2 HERE
Questions
My Updated questions are as follows:
Is there another way around the static order initialization problem?
Is there a way to perhaps only use the accessor function to initialize the unordered_map, but still (safely) be able to access the "enum" values with enum-like syntax? e.g.:
MyEnum::Enumerations()::Alpha
or
MyEnum::Alpha
Instead of what I currently have:
MyEnum::Alpha()
Regarding the bounty:
I believe an answer to this question will also solve the issues surrounding enums I've elaborated in the post (Enum is in quotes because the resulting type will not be an enum, but we want enum-like behavior):
getting the size of an "enum"
string to "enum" conversion
a searchable "enum".
Specifically, if we could do what I've already done, but somehow accomplish syntax that is enum-like while enforcing static initialization order, I think that would be acceptable
Sometimes when you want to do something that isn't supported by the language, you should look external to the language to support it. In this case, code-generation seems like the best option.
Start with a file with your enumeration. I'll pick XML completely arbitrarily, but really any reasonable format is fine:
<enum name="MyEnum">
<item name="ALPHA" />
<item name="BETA" />
<item name="GAMMA" />
</enum>
It's easy enough to add whatever optional fields you need in there (do you need a value? Should the enum be unscoped? Have a specified type?).
Then you write a code generator in the language of your choice that turns that file into a C++ header (or header/source) file a la:
enum class MyEnum {
ALPHA,
BETA,
GAMMA,
};
std::string to_string(MyEnum e) {
switch (e) {
case MyEnum::ALPHA: return "ALPHA";
case MyEnum::BETA: return "BETA";
case MyEnum::GAMMA: return "GAMMA";
}
}
MyEnum to_enum(const std::string& s) {
static std::unordered_map<std::string, MyEnum> m{
{"ALPHA", MyEnum::ALPHA},
...
};
auto it = m.find(s);
if (it != m.end()) {
return it->second;
}
else {
/* up to you */
}
}
The advantage of the code generation approach is that it's easy to generate whatever arbitrary complex code you want for your enums. Basically just side-step all the problems you're currently having.
I usually prefer non-macro code but in this case, I don't see what's wrong with macros.
IMHO, for this task macros are a much better fit as they are simpler and shorter to write and to read, and the same goes for the generated code. Simplicity is a goal in its own right.
These 2 macro calls:
#define Animal_Members(LAMBDA) \
LAMBDA(DOG) \
LAMBDA(CAT) \
LAMBDA(COW) \
CREATE_ENUM(Animal,None);
Generate this:
struct Animal {
enum Id {
None,
DOG,
CAT,
COW
};
static Id fromString( const char* s ) {
if( !s ) return None;
if( strcmp(s,"DOG")==0 ) return DOG;
if( strcmp(s,"CAT")==0 ) return CAT;
if( strcmp(s,"COW")==0 ) return COW;
return None;
}
static const char* toString( Id id ) {
switch( id ) {
case DOG: return "DOG";
case CAT: return "CAT";
case COW: return "COW";
default: return nullptr;
}
}
static size_t count() {
static Id all[] = { None, DOG, CAT, COW };
return sizeof(all) / sizeof(Id);
}
};
You could wrap them into a single macro using BOOST_PP and have a sequence for the members. This would make it a lot less readable, though.
You can easily change it to your preferences of default return values, or remove the default altogether, add a specific member value and string name, etc.
There's no loose functions, no init order hell, and only a bit of macro code that looks very much like the final result:
#define ENUM_MEMBER(MEMBER) \
, MEMBER
#define ENUM_FROM_STRING(MEMBER) \
if( strcmp(s,#MEMBER)==0 ) return MEMBER;
#define ENUM_TO_STRING(MEMBER) \
case MEMBER: return #MEMBER;
#define CREATE_ENUM_1(NAME,MACRO,DEFAULT) \
struct NAME { \
enum Id { \
DEFAULT \
MACRO(ENUM_MEMBER) \
}; \
static Id fromString( const char* s ) { \
if( !s ) return DEFAULT; \
MACRO(ENUM_FROM_STRING) \
return DEFAULT; \
} \
static const char* toString( Id id ) { \
switch( id ) { \
MACRO(ENUM_TO_STRING) \
default: return nullptr; \
} \
} \
static size_t count() { \
static Id all[] = { DEFAULT \
MACRO(ENUM_MEMBER) }; \
return sizeof(all) / sizeof(Id); \
} \
};
#define CREATE_ENUM_2(NAME,DEFAULT) \
CREATE_ENUM_1(NAME,NAME##_Members,DEFAULT)
#define CREATE_ENUM(NAME,DEFAULT) \
CREATE_ENUM_2(NAME,DEFAULT)
Hope this helps.
Related
I am trying to make it possible for a programmer (who uses my library) to create nameable instances of type X that are stored inside an instance of class C (or at least are exclusive to that instance).
These are the only two (ugly) solutions I have managed to come up with (needless to say, I am just picking up C++)
1)
class C
{
public:
class XofC
{
public:
XofC() = delete;
XofC(C& mom)
{
mom.Xlist.emplace_front();
ref = Xlist.front();
}
X& access()
{
return ref;
}
private:
X& ref;
};
//etc
private:
std::forward_list<X> Xlist;
friend class XofC;
//etc
}
Problem:
Having to pass everywhere XofC instances.
2)
class C
{
public:
void newX(std::string);
X& getX(std::string);
//etc.
private:
/*possible run-time mapping implementation
std::vector<X> Xvec;
std::unordered_map<std::string, decltype(Xvec.size())> NameMap;
*/
//etc
}
Problem:
This does the job, but since all names of X (std::string) are known at compilation, the overhead of using run-time std::unordered_map<std::string, decltype(Xvec.size())> kind-of bugs me for something this simple.
Possible(?) solution: compile-time replacing of std::string with automatic index (int). Then I could use:
class C
{
public:
void newX(int); //int: unique index calculated at compile time from std::string
X& getX(int); //int: unique index calculated at compile time from std::string
//etc.
private:
std::vector<X> Xvec;
}
Questions:
Is there a 3)?
Is a compile time solution possible for 2)?
This is the real-life situation: I was starting my first C++ "project" and I thought I could use the practice and utility from an awesome user-friendly, simple and fast argument management library. I plan to make an ArgMan class which can parse the argV based on some specified switches. Switches would be named by the programmer descriptively and the trigger strings be specified (e.g. a switch named recurse could have "-r" and "-recursive" as triggers). When necessary, you should be easily able to get the setting of the switch. Implementation detail: ArgMan would have a std::unordered_map<std::string/*a trigger*/, ??/*something linking to the switch to set on*/>. This ensures an almost linear parse of argV relative to argC. How should I approach this?
You could 'abuse' non-type template arguments to get compiletime named instances:
Live on Coliru
Assume we have a data class X:
#include <string>
struct X
{
int has_some_properties;
std::string data;
};
Now, for our named instances, we define some name constants. The trick is, to give them external linkage, so we can use the address as a non-type template argument.
// define some character arrays **with external linkage**
namespace Names
{
extern const char Vanilla[] = "Vanilla";
extern const char Banana [] = "Banana";
extern const char Coconut[] = "Coconut";
extern const char Shoarma[] = "Shoarma";
}
Now, we make a NamedX wrapper that takes a const char* non-type template argument. The wrapper holds a static instance of X (the value).
// now we can "adorn" a `namedX` with the name constants (above)
template <const char* Name>
struct NamedX
{
static X value;
};
template <const char* Name> X NamedX<Name>::value;
Now you can use it like this:
int main()
{
X& vanilla = NamedX<Names::Vanilla>::value;
vanilla = { 42, "Woot!" };
return vanilla.has_some_properties;
}
Note that due to the fact that the template arguments are addresses, no actual string comparison is done. You cannot, e.g. use
X& vanilla = NamedX<"Vanilla">::value;
becuase "Vanilla" is a prvalue without external linkage. So, in fact you could do without some of the complexity and use tag structs instead: Live on Coliru
While Neil's solution did what I asked for, it was too gimmicky to use in my library. Also, sehe's trick is surely useful, but, if I understood correctly, but doesn't seem related to my question. I have decided to emulate the desired behavior using method 1), here is a less broken attempt at it:
class C
{
private:
class X
{
//std::string member;
//etc
};
public:
class XofC
{
public:
XofC(C & _mom) : mom(_mom)
{
mom.Xlist.emplace_front();
tehX = &(Xlist.front());
}
X & get(maybe)
{
if (&maybe != &mom) throw std::/*etc*/;
return &tehX;
}
private:
X * tehX;
C & mom;
};
private:
//etc
std::forward_list<X> Xlist;
friend class XofC;
//etc
};
Usage:
C foo;
bar = C::XofC(foo); //acts like an instance of X, but stored in C, but you have to use:
bar.get(foo)/*reference to the actual X*/.member = "_1_";
Of course, the downside is you have to make sure you pass bar everywhere you need it, but works decently.
This is how it looks like in my tiny argument manager library:
https://raw.github.com/vuplea/arg_manager.h/master/arg_manager.h
Perhaps something similar has already been asked, and sure, it's a nitpick...
I have a bunch of constant std::maps to switch between enum (class) values and their std::string representations (both ways). Someone here pointed out to me that these maps will be initialized at runtime, when other initialization code is run, before my program executes all the good stuff. This would mean constant expressions would impact runtime performance, as the maps are built up from their enum-string pairs.
As an illustrative example, here is an example of one of these maps:
enum class os
{
Windows,
Linux,
MacOSX
};
const map<string, os> os_map =
{ {"windows", os::Windows},
{"linux", os::Linux},
{"mac", os::MacOSX} };
const map<os, string> os_map_inverse =
{ {os::Windows, "windows"},
{os::Linux, "linux"},
{os::MacOSX, "mac"} };
Would the C++11 constexpr have any influence on performance, or is my assumption of a runtime initialization penalty false? I would think a compiler can embed a constant STL container as pure data in the executable, but apparently that may not be as easy as I make it sound?
It's not so much the performance of initialization that is a problem, but the order of initialization. If someone uses your map before main has started (for example at initialization for a namespace scope variable), then you are SOL, because you are not guaranteed that your map has been initialized before the user's initialization uses it.
However you can do this thing at compile time. String literals are constant expressions, as are enumerators. A simple linear-time complexity structure
struct entry {
char const *name;
os value;
};
constexpr entry map[] = {
{ "windows", os::Windows },
{ "linux", os::Linux },
{ "mac", os::Mac }
};
constexpr bool same(char const *x, char const *y) {
return !*x && !*y ? true : (*x == *y && same(x+1, y+1));
}
constexpr os value(char const *name, entry const *entries) {
return same(entries->name, name) ? entries->value : value(name, entries+1);
}
If you use value(a, b) in a constant expression context, and the name you specify doesn't exist, you will get a compile time error because the function call would become non-constant.
To use value(a, b) in a non-constant expression context, you better add safety features, like adding an end marker to your array and throwing an exception in value if you hit the end marker (the function call will still be a constant expression as long as you never hit the end marker).
constexpr does not work on arbitrary expressions, and especially not on things that will use the freestore. map/string will use the freestore, and thus constexpr will not work for initializing them at compile time, and have no code run at runtime.
As for the runtime penalty: Depending on the storage duration of those variables (and I assume static here, which means initialization before main), you will not even be able to measure the penalty, especially not in the code that is using them, assuming you will use them many times, where the lookup has much more "overhead" than the initialization.
But as for everything, remember rule one: Make things work. Profile. Make things fast. In this order.
Ah yes, it's a typical issue.
The only alternative I've found to avoid this runtime initialization is to use plain C structures. If you're willing to go the extra-mile and store the values in a plain C array, you can get static initialization (and reduced memory footprint).
It's one of the reason LLVM/Clang actually use the tblgen utility: plain tables are statically initialized, and you can generate them sorted.
Another solution would be to create a dedicated function instead: for the enum to string conversion it's easy to use a switch and let the compiler optimize it into a plain table, for the string to enum it's a bit more involved (you need if/else branches organized right to get the O(log N) behavior) but for small enums a linear search is as good anyway, in which case a single macro hackery (based off Boost Preprocessor goodness) can get you everything you need.
I have recently found my own best way to furnish an enum. See the code:
enum State {
INIT, OK, ENoFou, EWroTy, EWroVa
};
struct StateToString {
State state;
const char *name;
const char *description;
public:
constexpr StateToString(State const& st)
: state(st)
, name("Unknown")
, description("Unknown")
{
switch(st){
case INIT : name = "INIT" , description="INIT"; break;
case OK : name = "OK" , description="OK"; break;
case ENoFou: name = "ENoFou", description="Error: Not found"; break;
case EWroTy: name = "EWroTy", description="Error: Wrong type"; break;
case EWroVa: name = "EWroVa", description="Error: Wrong value, setter rejected"; break;
// Do not use default to receive a comile-time warning about not all values of enumeration are handled. Need `-Wswitch-enum` in GCC
}
}
};
Extending this example to some universal code, let's name it lib code:
/// Concept of Compile time meta information about (enum) value.
/// This is the best implementation because:
/// - compile-time resolvable using `constexpr name = CtMetaInfo<T>(value)`
/// - enum type can be implemented anywhere
/// - compile-time check for errors, no hidden run-time surprises allowed - guarantied by design.
/// - with GCC's `-Wswitch` or `-Werror=switch` - you will never forget to handle every enumerated value
/// - nice and simple syntaxis `CtMetaInfo(enumValue).name`
/// - designed for enums suitable for everything
/// - no dependencies
/// Recommendations:
/// - write `using TypeToString = CtMetaInfo<Type>;` to simplify code reading
/// - always check constexpr functions assigning their return results to constexpr values
/**\code
template< typename T >
concept CtMetaInfo_CONCEPT {
T value;
const char *name;
// Here could add variables for any other meta information. All vars must be filled either by default values or in ctor init list
public:
///\param defaultName will be stored to `name` if constructor body will not reassign it
constexpr CtMetaInfoBase( T const& val, const char* defaultName="Unknown" );
};
\endcode
*/
/// Pre-declare struct template. Specializations must be defined.
template< typename T >
struct CtMetaInfo;
/// Template specialization gives flexibility, i.e. to define such function templates
template <typename T>
constexpr const char* GetNameOfValue(T const& ty)
{ return CtMetaInfo<T>(ty).name; }
User must define specialization for user's type:
/// Specialization for `rapidjson::Type`
template<>
struct CtMetaInfo<Type> {
using T = Type;
T value;
const char *name;
public:
constexpr CtMetaInfo(T const& val)
: value(val)
, name("Unknown")
{
switch(val){
case kNullType : name = "Null" ; break;
case kFalseType: case kTrueType: name = "Bool" ; break;
case kObjectType : name = "Object"; break;
case kArrayType : name = "Array" ; break;
case kStringType : name = "String"; break;
case kNumberType : name = "Number"; break;
}
}
static constexpr const char* Name(T const& val){
return CtMetaInfo<Type>(val).name;
}
};
/// TEST
constexpr const char *test_constexprvalue = CtMetaInfo<Type>(kStringType).name;
constexpr const char *test_constexprvalu2 = GetNameOfValue(kNullType);
I'm writing some code which could really do with some simple compile time metaprogramming. It is common practise to use empty-struct tags as compile time symbols. I need to decorate the tags with some run-time config elements. static variables seem the only way to go (to enable meta-programming), however static variables require global declarations. to side step this Scott Myers suggestion (from the third edition of Effective C++), about sequencing the initialization of static variables by declaring them inside a function instead of as class variables, came to mind.
So I came up with the following code, my hypothesis is that it will let me have a compile-time symbol with string literals use-able at runtime. I'm not missing anything I hope, and that this will work correctly, as long as I populate the runtime fields before I Initialize the depending templates classes ? .
#include <string>
template<class Instance>
class TheBestThing {
public:
static void set_name(const char * name_in) {
get_name() = std::string(name_in);
}
static void set_fs_location(const char * fs_location_in) {
get_fs_location() = std::string(fs_location_in);
}
static std::string & get_fs_location() {
static std::string fs_location;
return fs_location;
}
static std::string & get_name() {
static std::string name;
return name;
}
};
struct tag {};
typedef TheBestThing<tag> tbt;
int main()
{
tbt::set_name("xyz");
tbt::set_fs_location("/etc/lala");
ImportantObject<tbt> SinceSlicedBread;
}
edit:
Made community wiki.
I've finally understood what the problem was... and your solution does not solve much, if any.
The goal of using local static variable is to provide initialization on first use, thus being safe from the "Initialization Order Fiasco" (by the way, it does not solve the "Destruction Order Fiasco").
But with your design, if you effectively prevent the crash you do not however prevent the issue of using a variable before its value is used.
ImportantObject<tbt> SinceSliceBread; // using an empty string
tbt::set_name("xyz");
Compare with the following use:
std::string& tbt::get_name() { static std::string MName = "xyz"; return MName; }
Here the name is not only created but also initialized on first use. What's the point of using a non initialized name ?
Well, now that we know your solution does not work, let's think a bit. In fact we would like to automate this:
struct tag
{
static const std::string& get_name();
static const std::string& get_fs_location();
};
(with possibly some accessors to modify them)
My first (and easy) solution would be to use a macro (bouh not typesafe):
#define DEFINE_NEW_TAG(Tag_, Name_, FsLocation_) \
struct Tag_ \
{ \
static const std::string& get_name() { \
static const std::string name = #Name_; \
return name; \
} \
static const std::string& get_fs_location() { \
static const std::string fs_location = #FsLocation_; \
return fs_location; \
} \
};
The other solution, in your case, could be to use boost::optional to detect that the value has not been initialized yet, and postpone initialization of the values that depend on it.
The problem:
I have a C++ class with gajillion (>100) members that behave nearly identically:
same type
in a function, each member has the same exact code done to it as other members, e.g. assignment from a map in a constructor where map key is same as member key
This identicality of behavior is repeated across many-many functions (>20), of course the behavior in each function is different so there's no way to factor things out.
The list of members is very fluid, with constant additions and sometimes deletions, some (but not all) driven by changing columns in a DB table.
As you can imagine, this presents a big pain-in-the-behind as far as code creation and maintenance, since to add a new member you have to add code to every function
where analogous members are used.
Example of a solution I'd like
Actual C++ code I need (say, in constructor):
MyClass::MyClass(SomeMap & map) { // construct an object from a map
intMember1 = map["intMember1"];
intMember2 = map["intMember2"];
... // Up to
intMemberN = map["intMemberN"];
}
C++ code I want to be able to write:
MyClass::MyClass(SomeMap & map) { // construct an object from a map
#FOR_EACH_WORD Label ("intMember1", "intMember2", ... "intMemberN")
$Label = map["$Label"];
#END_FOR_EACH_WORD
}
Requirements
The solution must be compatible with GCC (with Nmake as make system, if that matters).
Don't care about other compilers.
The solution can be on a pre-processor level, or something compilable. I'm fine with either one; but so far, all of my research pointed me to the conclusion that the latter is just plain out impossible in C++ (I so miss Perl now that I'm forced to do C++ !)
The solution must be to at least some extent "industry standard" (e.g. Boost is great, but a custom Perl script that Joe-Quick-Fingers created once and posted on his blog is not. Heck, I can easily write that Perl script, being much more of a Perl expert than a C++ one - I just can't get bigwigs in Software Engineering at my BigCompany to buy into using it :) )
The solution should allow me to declare a list of IDs (ideally, in only one header file instead of in every "#FOR_EACH_WORD" directive as I did in the example above)
The solution must not be limited to "create an object from a DB table" constructor. There are many functions, most of them not constructors, that need this.
A solution of "Make them all values in a single vector, and then run a 'for' loop across the vector" is an obvious one, and can not be used - the code's in a library used by many apps, the members are public, and re-writing those apps to use vector members instead of named members is out of the question, sadly.
Boost includes a great preprocessor library that you can use to generate such code:
#include <boost/preprocessor/repetition.hpp>
#include <boost/preprocessor/stringize.hpp>
#include <boost/preprocessor/cat.hpp>
typedef std::map<std::string, int> SomeMap;
class MyClass
{
public:
int intMember1, intMember2, intMember3;
MyClass(SomeMap & map)
{
#define ASSIGN(z,n,_) BOOST_PP_CAT(intMember, n) = map[ BOOST_PP_STRINGIZE(BOOST_PP_CAT(intMember, n))];
BOOST_PP_REPEAT_FROM_TO(1, 4, ASSIGN, nil)
}
};
Boost.Preprocessor proposes many convenient macros to perform such operations. Bojan Resnik already provided a solution using this library, but it assumes that every member name is constructed the same way.
Since you explicitely required the possibily to declare a list of IDs, here is a solution that should better fulfill your needs.
#include <boost/preprocessor/seq/for_each.hpp>
#include <boost/preprocessor/stringize.hpp>
// sequence of member names (can be declared in a separate header file)
#define MEMBERS (foo)(bar)
// macro for the map example
#define GET_FROM_MAP(r, map, member) member = map[BOOST_PP_STRINGIZE(member)];
BOOST_PP_SEQ_FOR_EACH(GET_FROM_MAP, mymap, MEMBERS)
// generates
// foo = mymap["foo"]; bar = mymap["bar];
-------
//Somewhere else, we need to print all the values on the standard output:
#define PRINT(r, ostream, member) ostream << member << std::endl;
BOOST_PP_SEQ_FOR_EACH(PRINT, std::cout, MEMBERS)
As you can see, you just need to write a macro representing the pattern you want to repeat, and pass it to the BOOST_PP_SEQ_FOR_EACH macro.
You could do something like this: create an adapter class or modify the existing class to have a vector of pointers to those fields, add the addresses of all member variables in question to that vector in the class constructor, then when needed run the for-loop on that vector. This way you don't (or almost don't) change the class for external users and have a nice for-loop capability.
Of course, the obvious question is: Why do you have a class with 100 members? It doesn't really seem sane.
Assuming it is sane nevertheless -- have you looked at boost preprocessor library? I have never used it myself (as one friend used to say: doing so leads to the dark side), but from what I heard it should be the tool for the job.
Surreptitiously use perl on your own machine to create the constructor. Then ask to increase your salary since you're succesfully maintaining such a huge chunk of code.
You could use the preprocessor to define the members, and later use the same definition to access them:
#define MEMBERS\
MEMBER( int, value )\
SEP MEMBER( double, value2 )\
SEP MEMBER( std::string, value3 )\
struct FluctuatingMembers {
#define SEP ;
#define MEMBER( type, name ) type name
MEMBERS
#undef MEMBER
#undef SEP
};
.. client code:
FluctuatingMembers f = { 1,2., "valuesofstringtype" };
std::cout <<
#define SEP <<
#define MEMBER( type, name ) #name << ":" << f.##name
MEMBERS;
#undef MEMBER
#undef SEP
It worked for me, but is hard to debug.
You can also implement a visitor pattern based on pointer-to-members. After the preprocessor solution, this one turns out way more debuggeable.
struct FluctuatingMembers {
int v1;
double v2;
std::string v3;
template<typename Visitor> static void each_member( Visitor& v );
};
template<typename Visitor> void FluctuatingMembers::each_member( Visitor& v ) {
v.accept( &FluctuatingMembers::v1 );
v.accept( &FluctuatingMembers::v2 );
v.accept( &FluctuatingMembers::v3 );
}
struct Printer {
FluctuatingMembers& f;
template< typename pt_member > void accept( pt_member m ) const {
std::cout << (f::*m) << "\n";
}
};
// you can even use this approach for visiting
// multiple objects simultaneously
struct MemberComparer {
FluctuatingMembers& f1, &f2;
bool different;
MemberComparer( FluctuatingMembers& f1, FluctuatingMembers& f2 )
: f1(f1),f2(f2)
,different(false)
{}
template< typename pt_member > void accept( pt_member m ) {
if( (f1::*m) != (f2::*m) ) different = true;
}
};
... client code:
FluctuatingMembers object1 = { 1, 2.2, "value2" }
, object2 = { 1, 2.2, "valuetoo" };
Comparer compare( object1, object2 );
FluctuatingMembers::each_member( compare );
Printer pr = { object1 };
FluctuatingMembers::each_member( pr );
Why not do it at run time? (I really hate macro hackery)
What you really are asking for, in some sense, is class metadata.
So I would try something like:
class AMember{
......
};
class YourClass{
AMember member1;
AMember member2;
....
AMember memberN;
typedef AMember YourClass::* pMember_t;
struct MetaData : public std::vector<std::pair<std::string,pMember_t>>{
MetaData(){
push_back(std::make_pair(std::string("member1"),&YourClass::member1));
...
push_back(std::make_pair(std::string("memberN"),&YourClass::memberN));
}
};
static const MetaData& myMetaData() {
static const MetaData m;//initialized once
return m;
}
YourClass(const std::map<std::string,AMember>& m){
const MetaData& md = myMetaData();
for(MetaData::const_iterator i = md.begin();i!= md.end();++i){
this->*(i->second) = m[i->first];
}
}
YourClass(const std::vector<std::pair<std::string,pMember_t>>& m){
const MetaData& md = myMetaData();
for(MetaData::const_iterator i = md.begin();i!= md.end();++i){
this->*(i->second) = m[i->first];
}
}
};
(pretty sure I've got the syntax right but this is a machinery post not a code post)
RE:
in a function, each member has the same exact code done to it as other members, e.g. assignment from a map in a constructor where map key is same as member key
this is handled above.
RE:
The list of members is very fluid, with constant additions and sometimes deletions, some (but not all) driven by changing columns in a DB table.
When you add a new AMember, say newMember, all you have to do is update the MetaData constructor with an:
push_back(make_pair(std::string("newMember"),&YourClass::newMember));
RE:
This identicality of behavior is repeated across many-many functions (>20), of course the behavior in each function is different so there's no way to factor things out.
You have the machinery to apply this same idiom to build the functions
eg: setAllValuesTo(const AMember& value)
YourClass::setAllValuesTo(const AMember& value){
const MetaData& md = myMetaData();
for(MetaData::const_iterator i = md.begin();i!= md.end();++i){
this->*(i->second) = value;
}
}
If you are a tiny bit creative with function pointers or template functionals you can factor out the mutating operation and do just about anything you want to YourClass' AMember's on a collection basis. Wrap these general functions (that may take a functional or function pointer) to implement your current set of 20 public methods in the interface.
If you need more metadata just augment the codomain of the MetaData map beyond a pointer to member. (Of course the i->second above would change then)
Hope this helps.
You can do something like his:
#define DOTHAT(m) m = map[#m]
DOTHAT(member1); DOTHAT(member2);
#undef DOTHAT
That doesn't fully fit your description, but closest to it that saves you typing.
Probably what I'd look to do would be to make use of runtime polymorphism (dynamic dispatch). Make a parent class for those members with a method that does the common stuff. The members derive their class from that parent class. The ones that need a different implementation of the method implement their own. If they need the common stuff done too, then inside the method they can downcast to the base class and call its version of the method.
Then all you have to do inside your original class is call the member for each method.
I would recommend a small command-line app, written in whatever language you or your team are most proficient in.
Add some kind of template language to your source files. For something like this, you don't need to implement a full-fledged parser or anything fancy like that. Just look for an easily-identified character at the beginning of a line, and some keywords to replace.
Use the command-line app to convert the templated source files into real source files. In most build systems, this should be pretty easy to do automatically by adding a build phase, or simply telling the build system: "use MyParser.exe to handle files of type *.tmp"
Here's an example of what I'm talking about:
MyClass.tmp
MyClass::MyClass(SomeMap & map) { // construct an object from a map
▐REPLACE_EACH, LABEL, "intMember1", "intMember2, ... , "intMemberN"
▐ LABEL = map["$Label"];
}
I've used "▐" as an example, but any character that would otherwise never appear as the first character on a line is perfectly acceptable.
Now, you would treat these .tmp files as your source files, and have the actual C++ code generated automatically.
If you've ever heard the phrase "write code that writes code", this is what it means :)
There are already a lot of good answers and ideas here, but for the sake of diversity I'll present another.
In the code file for MyClass would be:
struct MemberData
{
size_t Offset;
const char* ID;
};
static const MemberData MyClassMembers[] =
{
{ offsetof(MyClass, Member1), "Member1" },
{ offsetof(MyClass, Member2), "Member2" },
{ offsetof(MyClass, Member3), "Member3" },
};
size_t GetMemberCount(void)
{
return sizeof(MyClassMembers)/sizeof(MyClassMembers[0]);
}
const char* GetMemberID(size_t i)
{
return MyClassMembers[i].ID;
}
int* GetMemberPtr(MyClass* p, size_t i) const
{
return (int*)(((char*)p) + MyClassMembers[i].Offset);
}
Which then makes it possible to write the desired constructor as:
MyClass::MyClass(SomeMap& Map)
{
for(size_t i=0; i<GetMemberCount(); ++i)
{
*GetMemberPtr(i) = Map[GetMemberID(i)];
}
}
And of course, for any other functions operating on all the members you would write similar loops.
Now there are a few issues with this technique:
Operations on members use a runtime loop as opposed to other solutions which would yield an unrolled sequence of operations.
This absolutely depends on each member having the same type. While that was allowed by OP, one should still evaluate whether or not that might change in the future. Some of the other solutions don't have this restriction.
If I remember correctly, offsetof is only defined to work on POD types by the C++ standard. In practice, I've never seen it fail. However I haven't used all the C++ compilers out there. In particular, I've never used GCC. So you would need to test this in your environment to ensure it actually works as intended.
Whether or not any of these are problems is something you'll have to evaluate against your own situation.
Now, assuming this technique is usable, there is one nice advantage. Those GetMemberX functions can be turned into public static/member functions of your class, thus providing this generic member access to more places in your code.
class MyClass
{
public:
MyClass(SomeMap& Map);
int Member1;
int Member2;
int Member3;
static size_t GetMemberCount(void);
static const char* GetMemberID(size_t i);
int* GetMemberPtr(size_t i) const;
};
And if useful, you could also add a GetMemberPtrByID function to search for a given string ID and return a pointer to the corresponding member.
One disadvantage with this idea so far is that there is a risk that a member could be added to the class but not to the MyClassMembers array. However, this technique could be combined with xtofl's macro solution so that a single list could populate both the class and the array.
changes in the header:
#define MEMBERS\
MEMBER( Member1 )\
SEP MEMBER( Member2 )\
SEP MEMBER( Member3 )\
class MyClass
{
public:
#define SEP ;
#define MEMBER( name ) int name
MEMBERS;
#undef MEMBER
#undef SEP
// other stuff, member functions, etc
};
and changes in the code file:
const MemberData MyClassMembers[] =
{
#define SEP ,
#define MEMBER( name ) { offsetof(MyClass, name), #name }
MEMBERS
#undef MEMBER
#undef SEP
};
Note: I have left error checking out of my examples here. Depending on how this would be used, you might want to ensure the array bounds are not overrun with debug mode asserts and/or release mode checks that would return NULL pointers for bad indexes. Or some use of exceptions if appropriate.
Of course, if you aren't worried about error checking the array bounds, then GetMemberPtr could actually be changed into something else that would return a reference to the member.
Is there a way in C++ to extend/"inherit" enums?
I.E:
enum Enum {A,B,C};
enum EnumEx : public Enum {D,E,F};
or at least define a conversion between them?
No, there is not.
enum are really the poor thing in C++, and that's unfortunate of course.
Even the class enum introduced in C++0x does not address this extensibility issue (though they do some things for type safety at least).
The only advantage of enum is that they do not exist: they offer some type safety while not imposing any runtime overhead as they are substituted by the compiler directly.
If you want such a beast, you'll have to work yourself:
create a class MyEnum, that contains an int (basically)
create named constructors for each of the interesting values
you may now extend your class (adding named constructors) at will...
That's a workaround though, I have never found a satistifying way of dealing with an enumeration...
I've solved in this way:
typedef enum
{
#include "NetProtocols.def"
} eNetProtocols, eNP;
Of course, if you add a new net protocol in the NetProtocols.def file, you have to recompile, but at least it's expandable.
"NetProtocols.def" will contain only the field names:
HTTP,
HTTPS,
FTP
I had this problem in some projects that ran on small hardware devices I design. There is a common project that holds a number of services. Some of these services use enums as parameters to get additional type checking and safety. I needed to be able to extend these enums in the projects that use these services.
As other people have mentioned c++ doesn't allow you to extend enums. You can however emulate enums using a namespace and a template that has all the benefits of enum class.
enum class has the following benefits:
Converts to a known integer type.
Is a value type
Is constexpr by default and takes up no valuable RAM on small processors
Is scoped and accessible by enum::value
Works in case statements
Provides type safety when used as a parameter and needs to be explicitly cast
Now if you define a class as an enum you can't create constexpr instances of the enum in the class declaration, because the class is not yet complete and it leads to a compile error. Also even if this worked you could not extend the value set of enums easily later in another file/sub project .
Now namespaces have no such problem but they don't provide type safety.
The answer is to first create a templated base class which allows enums of different base sizes so we don't waste what we don't use.
template <typename TYPE>
class EnumClass {
private:
TYPE value_;
public:
explicit constexpr EnumClass(TYPE value) :
value_(value){
}
constexpr EnumClass() = default;
~EnumClass() = default;
constexpr explicit EnumClass(const EnumClass &) = default;
constexpr EnumClass &operator=(const EnumClass &) = default;
constexpr operator TYPE() const {return value_;}
constexpr TYPE value() const {return value_;}
};
Then for each enum class we want to extend and emulate we create a namespace and a Type like this:
namespace EnumName {
class Type :public Enum<uint8_t> {
public:
explicit constexpr Type(uint8_t value): Enum<uint8_t>(value){}
constexpr Enum() = default;
}
constexpr auto Value1 = Type(1);
constexpr auto Value2 = Type(2);
constexpr auto Value3 = Type(3);
}
Then later in your code if you have included the original EnumName you can do this:
namespace EnumName {
constexpr auto Value4 = Type(4U);
constexpr auto Value5 = Type(5U);
constexpr auto Value6 = Type(6U);
constexpr std::array<Type, 6U> Set = {Value1, Value2, Value3, Value4, Value5, Value6};
}
now you can use the Enum like this:
#include <iostream>
void fn(EnumName::Type val){
if( val != EnumName::Value1 ){
std::cout << val;
}
}
int main(){
for( auto e :EnumName::Set){
switch(e){
case EnumName::Value1:
std::cout << "a";
break;
case EnumName::Value4:
std::cout << "b";
break;
default:
fn(e);
}
}
}
So we have a case statement, enum comparisons, parameter type safety and its all extensible. Note the set is constexpr and wont end up using valuable RAM on a small micro (placement verified on Godbolt.org. :-). As a bonus we have the ability to iterate over a set of enum values.
If you were able to create a subclass of an enum it'd have to work the other way around.
The set of instances in a sub-class is a subset of the instances in the super-class. Think about the standard "Shape" example. The Shape class represents the set of all Shapes. The Circle class, its subclass, represents the subset of Shapes that are Circles.
So to be consistent, a subclass of an enum would have to contain a subset of the elements in the enum it inherits from.
(And no, C++ doesn't support this.)
A simple, but useful workaround for this c++ gap could be as follows:
#define ENUM_BASE_VALS A,B,C
enum Enum {ENUM_BASE_VALS};
enum EnumEx {ENUM_BASE_VALS, D,E,F};
Actually you can extend enums in a round about way.
The C++ standard defines the valid enum values to be all the valid values of the underlying type so the following is valid C++ (11+). Its not Undefined Behaviour, but it is very nasty - you have been warned.
#include <cstdint>
enum Test1:unit8_t {
Value1 =0,
Value2 =1
};
constexpr auto Value3 = static_cast<Test1>(3);
constexpr auto Value4 = static_cast<Test1>(4);
constexpr auto Value5 = static_cast<Test1>(5);
Test1 fn(Test1 val){
switch(val){
case Value1:
case Value2:
case Value3:
case Value4:
return Value1;
case Value5:
return Value5;
}
}
int main(){
return static_cast<uint8_t>(fn(Value5));
}
Note that most of the compilers don't consider the additional values as part of the set for generating warnings about missing enums values in switch statements.So
clang and gcc will warn if Value2 is missing but will do nothing if Value4 is missing in the above switch statement.
One might try this :)
struct EnumType
{
enum
{
None = 0,
Value_1 = 1,
Value_2 = 2,
Value_3 = 3
};
//For when using the EnumType as a variable type
int Value { None };
};
struct EnumTypeEx : EnumType
{
enum
{
ExValue_1 = 3,
ExValue_2 = 4,
//override the value of Value_3
Value_3 = 3000
};
};
Pros:
Class like extensibility
Can override base labels value
Cons:
Tedious to write
You have to explicitly set the value for each label(*)
(*) you can start the auto value increment from the last base value by just writing it explicitly eg. ExValue_Start = LastBaseValue.
I do this
enum OPC_t // frame Operation Codes
{
OPC_CVSND = 0 // Send CV value
, OPC_CVREQ = 1 // Request CV (only valid for master app)
, OPC_COMND = 2 // Command
, OPC_HRTBT = 3 // Heart Beat
};
enum rxStatus_t // this extends OPC_t
{
RX_CVSND = OPC_CVSND // Send CV value
, RX_CVREQ = OPC_CVREQ // Request CV
, RX_COMND = OPC_COMND // Command
, RX_HRTBT = OPC_HRTBT // Heart Beat
, RX_NONE // No new Rx
, RX_NEWCHIP // new chip detected
};
http://www.codeproject.com/KB/cpp/InheritEnum.aspx goes over a method to created an expanded enum.
Create InheritEnum.h:
// -- InheritEnum.h
template <typename EnumT, typename BaseEnumT>
class InheritEnum
{
public:
InheritEnum() {}
InheritEnum(EnumT e)
: enum_(e)
{}
InheritEnum(BaseEnumT e)
: baseEnum_(e)
{}
explicit InheritEnum( int val )
: enum_(static_cast<EnumT>(val))
{}
operator EnumT() const { return enum_; }
private:
// Note - the value is declared as a union mainly for as a debugging aid. If
// the union is undesired and you have other methods of debugging, change it
// to either of EnumT and do a cast for the constructor that accepts BaseEnumT.
union
{
EnumT enum_;
BaseEnumT baseEnum_;
};
};
And then to use:
enum Fruit { Orange, Mango, Banana };
enum NewFruits { Apple, Pear };
typedef InheritEnum< NewFruit, Fruit > MyFruit;
void consume(MyFruit myfruit);
YMMV.
Just an idea:
You could try to create an empty class for each constant (maybe put them all in the same file to reduce clutter), create one instance of each class and use the pointers to these instances as the "constants". That way, the compiler will understand inheritance and will perform any ChildPointer-to-ParentPointer conversion necessary when using function calls, AND you still get type-safety checks by the compiler to ensure no one passes an invalid int value to functions (which would have to be used if you use the LAST value method to "extend" the enum).
Haven't fully thought this through though so any comments on this approach are welcome.
And I'll try to post an example of what I mean as soon as I have some time.
My approach slightly differs from that chosen by Marco.
I have added a macro as the last value of the enum I want to extend. In a separate header, I have defined that macro with the enum entries I want to add.
// error.h
#ifndef __APP_DEFINED__
#define __APP_DEFINED__
#endif
typedef enum { a, b, c, __APP_DEFINED__ } Error;
To add more values, I have created
// app_error.h
#define __APP_DEFINED__ d, e, f,
Lastly, I have added -include app_error.h to the compiler flags.
I prefer this approach compared to placing an #include "app_error.h" directive within the enum, mainly because I need the freedom not to have the "app_error.h" file at all in some projects.
The following code works well.
enum Enum {A,B,C};
enum EnumEx {D=C+1,E,F};