I am working on an application with a message-based, asynchronous, agent-like architecture.
There will be a few dozen distinct message types, each represented by a C++ type.
class message_a
{
long long identifier;
double some_value;
class something_else;
...//many more data members
};
Is it possible to write a macro/meta-program that would allow calculating the number of data members within the class at compile time?
//eg:
class message_b
{
long long identifier;
char foobar;
};
bitset<message_b::count_members> thebits;
I am not familiar with C++ meta programming, but could boost::mpl::vector allow me to accomplish this type of calculation?
As others have already suggested, you need Boost.Fusion and its BOOST_FUSION_DEFINE_STRUCT. You define your struct once, using an unusual but simple syntax. In return you get the required count_members (usually named size) and much more flexibility besides.
Your examples:
Definition:
BOOST_FUSION_DEFINE_STRUCT(
    (), message_a,
    (long long, identifier)
    (double, some_value)
)
usage:
message_a a;
// Boost.Fusion reports the member count through result_of::size
size_t count_members = boost::fusion::result_of::size<message_a>::value;
No, there is no way in C++ to know the names of all members or how many members are actually there.
You could store all the types in an mpl::vector alongside your classes, but then you face the problem of how to turn them into members with appropriate names (which you cannot achieve without some macro hackery).
Using std::tuple instead of PODs is a solution that generally works, but it makes for incredibly messy code when you actually work with the tuple (no named variables), unless you convert it at some point or have a wrapper that forwards accessors onto the tuple members.
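class message {
public:
    // ctors
    const int& foo() const { return std::get<0>(data); }
    // continue boilerplate with const overloads etc.
    // tuple_size needs the tuple's type, not the member itself
    static std::size_t num_members() { return std::tuple_size<decltype(data)>::value; }
private:
    // something_else stands for any other member type (a type named foo would clash with foo())
    std::tuple<int, long long, something_else> data;
};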
A solution with Boost.PP and MPL:
#include <boost/mpl/vector.hpp>
#include <boost/mpl/at.hpp>
#include <boost/mpl/int.hpp>
#include <boost/preprocessor.hpp>
#include <boost/preprocessor/arithmetic/inc.hpp>
struct Foo {
typedef boost::mpl::vector<int, double, long long> types;
// corresponding type names here
#define SEQ (foo)(bar)(baz)
#define MACRO(r, data, i, elem) boost::mpl::at< types, boost::mpl::int_<i> >::type elem;
BOOST_PP_SEQ_FOR_EACH_I(MACRO, 0, SEQ)
};
int main() {
Foo a;
a.foo;
}
I didn't test it so there could be bugs.
There are several answers simply saying that it is not possible, and if you hadn't linked to magic_get I would've agreed with them. But magic_get shows, to my amazement, that it actually is possible in some cases. This goes to show that proving that something is not possible is harder than proving that something is possible!
The short answer to your question would be to use the facilities in magic_get directly rather than reimplement them yourself. After all, even looking at the pre-Boost version of the code, it's not exactly clear how it works. At one point in the comments it mentions something about constructor arguments; I suspect this is the key, because it is possible to count the arguments to a regular function, so perhaps it is counting the number of arguments needed to brace-initialise the struct. This indicates that it may only be possible with plain old structs rather than objects with your own methods.
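To make the brace-initialisation guess above concrete, here is a minimal sketch of that counting idea. It is not the actual magic_get/Boost.PFR code; the names any_type, braces_ok and count_members are made up for illustration. It assumes C++17 and a plain aggregate with simple members (no base classes, no user-declared constructors, no array members).

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical helper: claims to convert to anything. It is only ever used
// inside decltype, so the conversion operator needs no definition.
struct any_type {
    template <class T>
    operator T() const;
};

// Primary template: T cannot be brace-initialised from that many arguments.
template <class T, class Seq, class = void>
struct braces_ok : std::false_type {};

// Chosen when T{any_type{}, any_type{}, ...} (one per index) is well-formed.
template <class T, std::size_t... I>
struct braces_ok<T, std::index_sequence<I...>,
                 std::void_t<decltype(T{(static_cast<void>(I), any_type{})...})>>
    : std::true_type {};

// Keep adding initialisers until the aggregate rejects one more.
template <class T, std::size_t N = 0>
constexpr std::size_t count_members() {
    if constexpr (braces_ok<T, std::make_index_sequence<N + 1>>::value)
        return count_members<T, N + 1>();
    else
        return N;
}

// The struct from the question, written as a plain aggregate.
struct message_b {
    long long identifier;
    char foobar;
};

static_assert(count_members<message_b>() == 2, "message_b has two members");
```

As soon as the type gets a user-declared constructor or private members it stops being an aggregate and the count no longer means anything, which matches the "plain old structs only" suspicion above.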
Despite all this, I would suggest using a reflection library as others have suggested. A good one that I often recommend is Google's protobuf library, which has reflection and serialisation along with multi-language support. However, it is intended only for data-only objects (like plain old structs but with vectors and strings).
Plain structs do not support counting members, but boost::fusion offers a good way to declare a struct that is both countable and iterable.
Something like this might get you closer:
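struct Foo {
    typedef boost::tuple<int, float> values_t;
    values_t values;  // declared first, so it is constructed before the references bind to it
    int &a;
    float &b;
    // note: an implicitly generated copy would alias the source object's values
    Foo() : a(boost::get<0>(values)), b(boost::get<1>(values)) {}
};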
If your types satisfy certain requirements ("SimpleAggregate"), you can use magic_get, which is now Boost.PFR (available from C++14/C++17).
So you will have something like:
class message_b
{
public:
long long identifier;
char foobar;
};
static_assert(boost::pfr::tuple_size<message_b>::value == 2);
Related
I need to parse and store a somewhat (but not too) complex stream and need to store the parsed result somehow. The stream essentially contains name-value pairs with values possibly being of different type for different names. Basically, I end up with a map of key (always string) to a pair <type, value>.
I started with something like this:
typedef enum ValidType {STRING, INT, FLOAT, BINARY} ValidType;
map<string, pair<ValidType, void*>> Data;
However I really dislike void* and storing pointers. Of course, I can always store the value as binary data (vector<char> for example), in which case the map would end up being
map<string, pair<ValidType, vector<char>>> Data;
Yet, in this case I would have to parse the binary data every time I need the actual value, which would be quite expensive in terms of performance.
Considering that I am not too worried about memory footprint (the amount of data is not large), but I am concerned about performance, what would be the right way to store such data?
Ideally, I'd like to avoid using boost, as that would increase the size of the final app by a factor of 3 if not more and I need to minimise that.
You're looking for a discriminated (or tagged) union.
Boost.Variant is one example, and Boost.Any is another. Are you so sure Boost will increase your final app size by a factor of 3? I would have thought variant was header-only, in which case you don't need to link any libraries.
If you really can't use Boost, implementing a simple discriminated union isn't so hard (a general and fully-correct one is another matter), and at least you know what to search for now.
For completeness, a naive discriminated union might look like:
class DU
{
public:
enum TypeTag { None, Int, Double };
class DUTypeError {};
private:
TypeTag type_;
union {
int i;
double d;
} data_;
void typecheck(TypeTag tt) const { if(type_ != tt) throw DUTypeError(); }
public:
DU() : type_(None) {}
DU(DU const &other) : type_(other.type_), data_(other.data_) {}
DU& operator= (DU const &other) {
type_=other.type_; data_=other.data_; return *this;
}
TypeTag type() const { return type_; }
bool istype(TypeTag tt) const { return type_ == tt; }
#define CONVERSIONS(TYPE, ENUM, MEMBER) \
explicit DU(TYPE val) : type_(ENUM) { data_.MEMBER = val; } \
operator TYPE & () { typecheck(ENUM); return data_.MEMBER; } \
operator TYPE const & () const { typecheck(ENUM); return data_.MEMBER; } \
DU& operator=(TYPE val) { type_ = ENUM; data_.MEMBER = val; return *this; }
CONVERSIONS(int, Int, i)
CONVERSIONS(double, Double, d)
};
Now, there are several drawbacks:
you can't store non-POD types in the union
adding a type means modifying the enum, and the union, and remembering to add a new CONVERSIONS line (it would be even worse without the macro)
you can't use the visitor pattern with this (or, you'd have to write your own dispatcher for it), which means lots of switch statements in the client code
every one of these switches may also need updating if you add a type
if you did write a visitor dispatch, that needs updating if you add a type, and so may every visitor
you need to manually reproduce something like the built-in C++ type-conversion rules if you want to do anything like arithmetic with these (ie, operator double could promote an Int instead of only handling Double ... but only if you hand-roll every operator)
I haven't implemented operator== precisely because it needs a switch. You can't just memcmp the two unions if the types match, because identical 32-bit integers could still compare different if the extra space required for the double holds a different bit pattern
Some of these issues can be addressed if you care about them, but it's all more work. Hence my preference for not re-inventing this particular wheel if it can be avoided.
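If a C++17 compiler is an option (the question doesn't say, so that is an assumption), std::variant is a ready-made discriminated union in the standard library, with no Boost involved. A rough sketch of the name-to-value map from the question; the alias names Value and Data and the helpers make_sample and find_int are made up for illustration:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>
#include <vector>

// One alternative per ValidType from the question: STRING, INT, FLOAT, BINARY.
using Value = std::variant<std::string, int, float, std::vector<char>>;
using Data = std::map<std::string, Value>;

// Builds a small sample map; assigning a value selects the matching alternative.
inline Data make_sample() {
    Data d;
    d["name"] = std::string("widget");
    d["count"] = 42;
    d["ratio"] = 0.5f;
    d["blob"] = std::vector<char>{'\x01', '\x02'};
    return d;
}

// Returns a pointer to the stored int, or nullptr if the key is missing
// or currently holds a different alternative.
inline const int* find_int(const Data& d, const std::string& key) {
    auto it = d.find(key);
    return it == d.end() ? nullptr : std::get_if<int>(&it->second);
}
```

The variant keeps track of the tag itself, can hold non-POD types such as std::string, and std::visit covers the dispatch that would otherwise be hand-written switch statements.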
Since your data types are fixed what about something like this...
Have something like a std::vector for each type of value.
And your map would have as the second value of the pair the index to the data.
std::vector<int> vInt;
std::vector<float> vFloat;
// ... one vector per remaining value type
map<std::string, std::pair<ValidType, int>> Data;
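Fleshed out a little, the idea might look like the sketch below. The Store struct and its put/find_int members are hypothetical names, only two of the value types are shown, and the index is stored as std::size_t rather than int.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum ValidType { STRING, INT, FLOAT, BINARY };

struct Store {
    std::vector<int> vInt;
    std::vector<float> vFloat;
    // key -> (type tag, index into the vector for that type)
    std::map<std::string, std::pair<ValidType, std::size_t>> index;

    // Note: overwriting an existing key leaves the old slot unused.
    void put(const std::string& key, int v) {
        index[key] = std::make_pair(INT, vInt.size());
        vInt.push_back(v);
    }
    void put(const std::string& key, float v) {
        index[key] = std::make_pair(FLOAT, vFloat.size());
        vFloat.push_back(v);
    }
    // Returns nullptr when the key is missing or does not refer to an int.
    const int* find_int(const std::string& key) const {
        auto it = index.find(key);
        if (it == index.end() || it->second.first != INT) return nullptr;
        return &vInt[it->second.second];
    }
};
```

The values stay typed and there is no void*, at the cost of one lookup function per type.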
You can implement a multi-type map by leveraging the nifty features of std::tuple in C++11, which allows access by a type key. You can wrap this to create access by arbitrary keys. An in-depth explanation of this (and quite an interesting read) is available here:
https://jguegant.github.io/blogs/tech/thread-safe-multi-type-map.html
Modern C++ features provide creative ways to solve old problems.
As programs become more complex, the need to perform operations on struct data becomes apparent. Is there a conversion method for converting a struct type into an array of its members, such that:
struct Foo {
int num_Foo;
int num_Bar;
int GreenFoo;
};
can be represented by:
int Bar[3];
Or better, dynamically as:
vector<int> Bar;
The goal is to convert or re-represent the data struct in an iterable form, without excessive use of the assignment operator.
You could use unnamed structs to make a hybrid struct where its member could be treated as an array:
struct Foo {
union {
struct {
int x;
int y;
int z;
};
struct {
int array[3];
};
};
};
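Note, however, that unnamed structs come from C11 and are not a standard C++ feature. They are supported as an extension by GCC as well as Clang. Reading a union member other than the one last written is also formally undefined behaviour in C++, so this relies on compiler-specific guarantees.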
If your structs are POD then you might consider using std::tuple instead of structs. You could then use various template facilities to work through the members of the tuple.
Here is a simple example that prints the elements of a tuple - using boost::fusion::tuple instead of the std::tuple since it has many more tuple-manipulating facilities available:
#include <boost/fusion/tuple.hpp>
#include <boost/fusion/include/for_each.hpp>
#include <iostream>
struct Printer {
template<typename T>
void operator()(const T &t) const {
std::cout << t << std::endl;
}
};
int main(int argc, const char * argv[])
{
boost::fusion::tuple<int, int, int, int, float> t =
boost::fusion::make_tuple(3, 5, 1, 9, 7.6f);
boost::fusion::for_each(t, Printer());
return 0;
}
You could include these in unions with the struct but you'd want to do some testing to ensure proper alignment agreement.
The upside is that these manipulations are very fast, since most of the work is done at compile time. The downside is that you can't use normal control structures like indexing with runtime indices. You'd have to build an abstraction layer around that, as the normal get<i>(tuple) accessor requires that i be a compile-time constant. Whether this is worth the complexity depends strongly on the application.
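A minimal sketch of such an abstraction layer, written against std::tuple and C++17 rather than boost::fusion (that swap is an assumption on my part; visit_at and visit_at_impl are made-up names). It compares the runtime index with each compile-time index and calls the functor on the one that matches:

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Expands to one comparison per element; only the matching element is visited.
template <class Tuple, class F, std::size_t... I>
void visit_at_impl(Tuple& t, std::size_t i, F& f, std::index_sequence<I...>) {
    ((i == I ? static_cast<void>(f(std::get<I>(t))) : void()), ...);
}

// Calls f(element) for the element at runtime index i; does nothing if i is out of range.
template <class... Ts, class F>
void visit_at(std::tuple<Ts...>& t, std::size_t i, F f) {
    visit_at_impl(t, i, f, std::index_sequence_for<Ts...>{});
}
```

The functor has to accept every element type (a generic lambda does), because the call is instantiated for each position even though only one runs.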
How about:
vector <Foo> Bar;
You can then add instances of your struct and then access each element as desired, using an array-like format.
See this related question for further information:
Vector of structs initialization
Upon re-reading your question a few times, I think I mis-understood your intent and answered the "wrong question". You can make an array of your struct as mentioned above and index it as an array, but I don't believe it is quite as simple as that to make each struct element a different element of an array. If you are looking to make an array of structs, my answer should help. If you are looking to make each element of your struct an element of your array, 40two's answer should help you out.
In my company, I'm working on providing a faster SSE path for some hot code. I'm using the intrinsic approach which keeps to C++ and really shows impressive results.
All code only has to work on float and double, so I created a templated SSE operations class that I specialized for both. What I really don't like is that these two classes look almost identical except for the number type (float/double), the SSE type used (__m128/__m128d) and the intrinsics suffix (_ps/_pd), like so:
template<>
struct SseOperations<float> : public Sse<float>
{
typedef __m128 vector;
vector load(float const * const from) const
{
return _mm_loadu_ps(from);
}
vector add(vector const & a, vector const & b) const
{
return _mm_add_ps(a, b);
}
// etc.
};
and
template<>
struct SseOperations<double> : public Sse<double>
{
typedef __m128d vector;
vector load(double const * const from) const
{
return _mm_loadu_pd(from);
}
vector add(vector const & a, vector const & b) const
{
return _mm_add_pd(a, b);
}
// etc.
};
I wouldn't know how to unify this using template magic, because of the different intrinsics suffix.
Then the ## capability of macros came to my mind, which would lend itself to that purpose. So I managed to put the complete specialized class into a macro that I could use to generate both classes with:
SSE_OPERATIONS(float, __m128, _ps);
SSE_OPERATIONS(double, __m128d, _pd);
I know macros are evil and all, but at least in this case I don't see any of the typical dangers and it gets the job done.
What bothers me now is that the second and third macro parameters are redundant; they could be deduced from the first one, only I have absolutely no idea how. #if and its friends aren't supposed to work because sizeof() doesn't work during pre-processing.
Searching for solutions is unexpectedly hard, because of #if topics polluting the results heavily. Can anyone tell me how to do a macro level decision for this problem?
PS: I heard of Boost Preprocessor but I'm not allowed to use it.
Update: Although I'm asking for a macro solution, I would also accept a nice template solution. For that, know that I'm encapsulating at least 7 intrinsics—just in case that would bloat template code.
You can get rid of the second parameter by a trait:
template <class Scalar>
struct Vector;
template <>
struct Vector<float>
{
typedef __m128 type;
};
template <>
struct Vector<double>
{
typedef __m128d type;
};
As for the third one, you can use a really ugly preprocessor hack:
#define SUFFIX_float ps
#define SUFFIX_double pd
and use ## on SUFFIX_ and the outermost macro parameter to arrive at the correct version. Of course, it would require some levels of indirection to get the macros to expand at the correct time. Using Boost.Preprocessor, in particular BOOST_PP_CAT and possibly BOOST_PP_EXPAND, might make this slightly easier.
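Since Boost.Preprocessor is off the table for you, here is a rough sketch of the indirection done by hand. To keep it compilable anywhere, add_ps and add_pd are plain stand-in functions rather than the real _mm_add_ps/_mm_add_pd intrinsics, and CAT, SUFFIX, INTRINSIC, Ops and DEFINE_OPS are made-up names.

```cpp
#include <cassert>

// Stand-ins for suffix-named intrinsics (hypothetical, not the real SSE calls).
inline float  add_ps(float a, float b)   { return a + b; }
inline double add_pd(double a, double b) { return a + b; }

#define SUFFIX_float  ps
#define SUFFIX_double pd

// Two-level paste: the extra level lets the arguments expand before ## is applied.
#define CAT_IMPL(a, b) a##b
#define CAT(a, b) CAT_IMPL(a, b)

// SUFFIX(float) -> SUFFIX_float -> ps
#define SUFFIX(type) CAT(SUFFIX_, type)
// INTRINSIC(add_, float) -> add_ps
#define INTRINSIC(name, type) CAT(name, SUFFIX(type))

template <class T> struct Ops;

#define DEFINE_OPS(type)                           \
    template <> struct Ops<type> {                 \
        static type add(type a, type b) {          \
            return INTRINSIC(add_, type)(a, b);    \
        }                                          \
    };

DEFINE_OPS(float)
DEFINE_OPS(double)
```

With the real intrinsics you would pass _mm_add_ as the name prefix and take the vector type from the Vector<Scalar> trait above, so the macro needs only the scalar type as its parameter.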
Like most programmers, I admire and try to follow the principles of literate programming, but in C++ I routinely find myself using std::pair for a gazillion common tasks. But std::pair is, IMHO, a vile enemy of literate programming...
My point is when I come back to code I've written a day or two ago, and I see manipulations of a std::pair (typically as an iterator) I wonder to myself "what did iter->first and iter->second mean???".
I'm guessing others have the same doubts when looking at their std::pair code, so I was wondering, has anyone come up with some good solutions to recover literacy when using std::pair?
std::pair is a good way to make a "local" and essentially anonymous type with essentially anonymous columns; if you're using a certain pair over so large a lexical space that you need to name the type and columns, I'd use a plain struct instead.
How about this:
struct MyPair : public std::pair < int, std::string >
{
    // getters are const so they also work through const references and const_iterators
    const int& keyInt() const { return first; }
    void keyInt( const int& keyInt ) { first = keyInt; }
    const std::string& valueString() const { return second; }
    void valueString( const std::string& valueString ) { second = valueString; }
};
It's a bit verbose, however using this in your code might make things a little easier to read, eg:
std::vector < MyPair > listPairs;
std::vector < MyPair >::iterator iterPair( listPairs.begin() );
if ( iterPair->keyInt() == 123 )
iterPair->valueString( "hello" );
Other than this, I can't see any silver bullet that's going to make things much clearer.
typedef std::pair<bool, int> IsPresent_Value;
typedef std::pair<double, int> Price_Quantity;
...you get the point.
You can create two pairs of getters (const and non) that will merely return a reference to first and second, but will be much more readable. For instance:
string& GetField(pair<string, int>& p) { return p.first; }
int& GetValue(pair<string, int>& p) { return p.second; }
Will let you get the field and value members from a given pair without having to remember which member holds what.
If you expect to use this a lot, you could also create a macro that will generate those getters for you, given the names and types: MAKE_PAIR_GETTERS(Field, string, Value, int) or so. Making the getters straightforward will probably allow the compiler to optimize them away, so they'll add no overhead at runtime; and using the macro will make it a snap to create those getters for whatever use you make of pairs.
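One possible shape for such a macro, as a sketch only (the exact form of MAKE_PAIR_GETTERS is my assumption; the answer above only names it). It generates the const and non-const getters for one pair type:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Generates Get<FirstName> and Get<SecondName>, each in a const and a
// non-const flavour, for std::pair<FirstType, SecondType>.
#define MAKE_PAIR_GETTERS(FirstName, FirstType, SecondName, SecondType)        \
    inline FirstType& Get##FirstName(std::pair<FirstType, SecondType>& p)      \
    { return p.first; }                                                        \
    inline const FirstType& Get##FirstName(                                    \
        const std::pair<FirstType, SecondType>& p)                             \
    { return p.first; }                                                        \
    inline SecondType& Get##SecondName(std::pair<FirstType, SecondType>& p)    \
    { return p.second; }                                                       \
    inline const SecondType& Get##SecondName(                                  \
        const std::pair<FirstType, SecondType>& p)                             \
    { return p.second; }

MAKE_PAIR_GETTERS(Field, std::string, Value, int)
```

Types whose names contain a comma (std::map<int, int>, say) would need a typedef first, because the preprocessor would otherwise split them into separate macro arguments.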
You could use boost tuples, but they don't really alter the underlying issue: do you really want to access each part of the pair/tuple with a small integral type, or do you want more 'literate' code? See this question I posted a while back.
However, boost::optional is a useful tool which I've found replaces quite a few of the cases where pairs/tuples are touted as the answer.
Recently I've found myself using boost::tuple as a replacement for std::pair. You can define enumerators for each member and so it's obvious what each member is:
typedef boost::tuple<int, int> KeyValueTuple;
enum {
KEY
, VALUE
};
void foo (KeyValueTuple & p) {
p.get<KEY> () = 0;
p.get<VALUE> () = 0;
}
void bar (int key, int value)
{
    // boost::tie yields a temporary tuple<int&, int&>, which cannot bind to KeyValueTuple&
    KeyValueTuple p (key, value);
    foo (p);
}
BTW, comments welcome on if there is a hidden cost to using this approach.
EDIT: Remove names from global scope.
Just a quick comment regarding global namespace. In general I would use:
struct KeyValueTraits
{
typedef boost::tuple<int, int> Type;
enum {
KEY
, VALUE
};
};
void foo (KeyValueTraits::Type & p) {
p.get<KeyValueTraits::KEY> () = 0;
p.get<KeyValueTraits::VALUE> () = 0;
}
It does appear that boost::fusion ties the identity and the value closer together.
As Alex mentioned, std::pair is very convenient, but when it gets confusing, create a structure and use it in the same way. Have a look at the std::pair code; it's not that complex.
I don't like std::pair as used in std::map either, map entries should have had members key and value.
I even used boost::MIC to avoid this. However, boost::MIC also comes with a cost.
Also, returning a std::pair results in less than readable code:
if (cntnr.insert(newEntry).second) { ... }
???
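If C++17 is available (an assumption, nothing above says which standard is in use), structured bindings let you name both halves of the pair at the point of use. A small sketch; add_entry and total are made-up helper names:

```cpp
#include <cassert>
#include <map>
#include <string>

// insert() returns pair<iterator, bool>; give both parts readable names.
inline bool add_entry(std::map<std::string, int>& m,
                      const std::string& key, int value) {
    const auto [position, inserted] = m.insert({key, value});
    static_cast<void>(position);  // unused here
    return inserted;
}

// Map entries are pair<const Key, T>; bind them as key and value.
inline int total(const std::map<std::string, int>& m) {
    int sum = 0;
    for (const auto& [key, value] : m) {
        static_cast<void>(key);  // unused here
        sum += value;
    }
    return sum;
}
```

The types are still std::pair underneath, so nothing changes for the containers; only the call sites get names instead of first and second.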
I have also found that std::pair is commonly used by lazy programmers who needed two values but didn't think about why these values were needed together.