Initialise a std::vector with data from an array of unions - c++

How might I initialise a std::vector from an array of structs, where the struct contains a union of different types. In other words, the array is used to store a number of values of a specific type, which can be int, char* etc.
This is my solution so far but I'm looking for a better approach:
The convert function returns a vector<int> if it stores ints or a vector<std::string> if it stores char*.
The Value type below is a struct containing a union called value. The Container class below points to a buffer of such Values.
// union member getter
class Getter
{
public:
void operator()(int8_t& i, const Value& value)
{
i = value.value.i;
}
void operator()(std::string& s, const Value& value)
{
s = std::string(value.value.s);
}
...
};
template<class T>
std::vector<T> convert(Container* container)
{
std::vector<T> c;
c.reserve(container->nrOfValues);
Getter g;
for(int i=0;i<container->nrOfValues;i++)
{
T value;
g(value, container->values[i]);
c.push_back(value);
}
return c;
}

Your problem is the union gives a different name to each value, which causes the need for a function that converts a name to a type, such as Getter::operator() returning a type and getting a named member of the union.
There isn't much you can do with this. You can save a variable declaration and a copy/string constructor on each item, but that's about it.
If you can't modify the original struct, you could initialize the vector with a length set of default value (which must be passed in), then iterate through using the getter as:
vector<T> v(length, defaultValue);
typename vector<T>::iterator iter = vec.begin();
for(int index = 0; *iter != vec.end() && index < length; ++iter, ++index) {
converter(*iter, array[index]);
}
Notice that this starts getting cumbersome in iterating the index and the iterator and verifying both are still valid in case of an accident...
If you can modify the original struct:
class Ugly { // or struct, it doesn't matter
public:
union {
char* s;
int i;
} value;
Ugly(char* s) {
value.s = s;
}
Ugly (const int& i) {
value.i = i;
}
operator std::string() const {
return std::string(value.s);
}
operator int() const {
return value.i;
}
};
Then your for loop becomes:
for(int i=0;i<container->nrOfValues;i++)
{
c.push_back(container->values[i]);
}
Note: You might create the vector and pass it as an argument to the copy function since it involves copying the data over during the return.

If you like some template magic, you could do it slightly different way:
// Source union to get data from
union U
{
int i;
char* s;
double d;
};
// Conversion type template function (declared only)
template <class T> T convert(const U& i_u);
// Macro for template specializations definition
#define FIELD_CONV(SrcType, DestField)\
template <> SrcType convert(const U& i_u)\
{ auto p = &DestField; return i_u.*p; }
// Defining conversions: source type -> union field to get data from
FIELD_CONV(int, U::i)
FIELD_CONV(std::string, U::s)
FIELD_CONV(double, U::d)
// Get rid of macro that not needed any more - just for macro haters ;-)
#undef FIELD_CONV
// Usage
template<class T> std::vector<T> convert(Container* container)
{
std::vector<T> c;
c.reserve(container->nrOfValues);
for(int i = 0; i < container->nrOfValues; ++i)
c.push_back(convert<T>(container->values[i]));
return c;
}
The advantage of this approach - it is short, simple and easy to extend. When you add new field to union you just write another FIELD_CONV() definition.
Compiled example is here.

Related

Get byte representation of C++ class

I have objects that I need to hash with SHA256. The object has several fields as follows:
class Foo {
// some methods
protected:
std::array<32,int> x;
char y[32];
long z;
}
Is there a way I can directly access the bytes representing the 3 member variables in memory as I would a struct ? These hashes need to be computed as quickly as possible so I want to avoid malloc'ing a new set of bytes and copying to a heap allocated array. Or is the answer to simply embed a struct within the class?
It is critical that I get the exact binary representation of these variables so that the SHA256 comes out exactly the same given that the 3 variables are equal (so I can't have any extra padding bytes etc included going into the hash function)
Most Hash classes are able to take multiple regions before returning the hash, e.g. as in:
class Hash {
public:
void update(const void *data, size_t size) = 0;
std::vector<uint8_t> digest() = 0;
}
So your hash method could look like this:
std::vector<uint8_t> Foo::hash(Hash *hash) const {
hash->update(&x, sizeof(x));
hash->update(&y, sizeof(y));
hash->update(&z, sizeof(z));
return hash->digest();
}
You can solve this by making an iterator that knows the layout of your member variables. Make Foo::begin() and Foo::end() functions and you can even take advantage of range-based for loops.
If you can increment it and dereference it, you can use it any other place you're able to use a LegacyForwardIterator.
Add in comparison functions to get access to the common it = X.begin(); it != X.end(); ++it idiom.
Some downsides include: ugly library code, poor maintainability, and (in this current form) no regard for endianess.
The solution to the latter downside is left as an exercise to the reader.
#include <array>
#include <iostream>
class Foo {
friend class FooByteIter;
public:
FooByteIter begin() const;
FooByteIter end() const;
Foo(const std::array<int, 2>& x, const char (&y)[2], long z)
: x_{x}
, y_{y[0], y[1]}
, z_{z}
{}
protected:
std::array<int, 2> x_;
char y_[2];
long z_;
};
class FooByteIter {
public:
FooByteIter(const Foo& foo)
: ptr_{reinterpret_cast<const char*>(&(foo.x_))}
, x_end_{reinterpret_cast<const char*>(&(foo.x_)) + sizeof(foo.x_)}
, y_begin_{reinterpret_cast<const char*>(&(foo.y_))}
, y_end_{reinterpret_cast<const char*>(&(foo.y_)) + sizeof(foo.y_)}
, z_begin_{reinterpret_cast<const char*>(&(foo.z_))}
{}
static FooByteIter end(const Foo& foo) {
FooByteIter fbi{foo};
fbi.ptr_ = reinterpret_cast<const char*>(&foo.z_) + sizeof(foo.z_);
return fbi;
}
bool operator==(const FooByteIter& other) const { return ptr_ == other.ptr_; }
bool operator!=(const FooByteIter& other) const { return ! (*this == other); }
FooByteIter& operator++() {
ptr_++;
if (ptr_ == x_end_) {
ptr_ = y_begin_;
}
else if (ptr_ == y_end_) {
ptr_ = z_begin_;
}
return *this;
}
FooByteIter operator++(int) {
FooByteIter pre = *this;
(*this)++;
return pre;
}
char operator*() const {
return *ptr_;
}
private:
const char* ptr_;
const char* const x_end_;
const char* const y_begin_;
const char* const y_end_;
const char* const z_begin_;
};
FooByteIter Foo::begin() const {
return FooByteIter(*this);
}
FooByteIter Foo::end() const {
return FooByteIter::end(*this);
}
template <typename InputIt>
char checksum(InputIt first, InputIt last) {
char check = 0;
while (first != last) {
check += (*first);
++first;
}
return check;
}
int main() {
Foo f{{1, 2}, {3, 4}, 5};
for (const auto b : f) {
std::cout << (int)b << ' ';
}
std::cout << std::endl;
std::cout << "Checksum is: " << (int)checksum(f.begin(), f.end()) << std::endl;
}
You can generalize this further by making serialization functions for all data types you might care about, allowing serialization of classes that aren't plain-old-data types.
Warning
This code assumes that the underlying types being serialized have no internal padding, themselves. This answer works for this datatype because it is made of types which themselves do not pad. To make this work for datatypes that have datatypes that have padding, this method would need to be recursed all the way down.
Just cast a pointer to object to a pointer to char. You can iterate through the bytes by increment. Use sizeof(foo) to check overflow.
As long as you're able to make your class an aggregate, i.e. std::is_aggregate_v<T> == true, you can actually sort-of reflect the members of the structure.
This allows you to easily hash the members without actually having to name them. (also you don't have to remember updating your hash function every time you add a new member)
Step 1: Getting the number of members inside the aggregate
First we need to know how many members a given aggregate type has.
We can check this by (ab-)using aggregate initialization.
Example:
Given struct Foo { int i; int j; };:
Foo a{}; // ok
Foo b{{}}; // ok
Foo c{{}, {}}; // ok
Foo d{{}, {}, {}}; // error: too many initializers for 'Foo'
We can use this to get the number of members inside the struct, by trying to add more initializers until we get an error:
template<class T>
concept aggregate = std::is_aggregate_v<T>;
struct any_type {
template<class T>
operator T() {}
};
template<aggregate T>
consteval std::size_t count_members(auto ...members) {
if constexpr (requires { T{ {members}... }; } == false)
return sizeof...(members) - 1;
else
return count_members<T>(members..., any_type{});
}
Notice that i used {members}... instead of members....
This is because of arrays - a structure like struct Bar{int i[2];}; could be initialized with 2 elements, e.g. Bar b{1, 2}, so our function would have returned 2 for Bar if we had used members....
Step 2: Extracting the members
Now that we know how many members our structure has, we can use structured bindings to extract them.
Unfortunately there is no way in the current standard to create a structured binding expression with a variable amount of expressions, so we have to add a few extra lines of code for each additional member we want to support.
For this example i've only added a max of 4 members, but you can add as many as you like / need:
template<aggregate T>
constexpr auto tie_struct(T const& data) {
constexpr std::size_t fieldCount = count_members<T>();
if constexpr(fieldCount == 0) {
return std::tie();
} else if constexpr (fieldCount == 1) {
auto const& [m1] = data;
return std::tie(m1);
} else if constexpr (fieldCount == 2) {
auto const& [m1, m2] = data;
return std::tie(m1, m2);
} else if constexpr (fieldCount == 3) {
auto const& [m1, m2, m3] = data;
return std::tie(m1, m2, m3);
} else if constexpr (fieldCount == 4) {
auto const& [m1, m2, m3, m4] = data;
return std::tie(m1, m2, m3, m4);
} else {
static_assert(fieldCount!=fieldCount, "Too many fields for tie_struct! add more if statements!");
}
}
The fieldCount!=fieldCount in the static_assert is intentional, this prevents the compiler from evaluating it prematurely (it only complains if the else case is actually hit)
Now we have a function that can give us references to each member of an arbitrary aggregate.
Example:
struct Foo {int i; float j; std::string s; };
Foo f{1, 2, "miau"};
// tup is of type std::tuple<int const&, float const&, std::string const&>
auto tup = tie_struct(f);
// this will output "12miau"
std::cout << std::get<0>(tup) << std::get<1>(tup) << std::get<2>(tup) << std::endl;
Step 3: hashing the members
Now that we can convert any aggregate into a tuple of its members, hashing it shouldn't be a big problem.
You can basically hash the individual types like you want and then combine the individual hashes:
// for merging two hash values
std::size_t hash_combine(std::size_t h1, std::size_t h2)
{
return (h2 + 0x9e3779b9 + (h1<<6) + (h1>>2)) ^ h1;
}
// Handling primitives
template <class T, class = void>
struct is_std_hashable : std::false_type { };
template <class T>
struct is_std_hashable<T, std::void_t<decltype(std::declval<std::hash<T>>()(std::declval<T>()))>> : std::true_type { };
template <class T>
concept std_hashable = is_std_hashable<T>::value;
template<std_hashable T>
std::size_t hash(T value) {
return std::hash<T>{}(value);
}
// Handling tuples
template<class... Members>
std::size_t hash(std::tuple<Members...> const& tuple) {
return std::apply([](auto const&... members) {
std::size_t result = 0;
((result = hash_combine(result, hash(members))), ...);
return result;
}, tuple);
}
template<class T, std::size_t I>
using Arr = T[I];
// Handling arrays
template<class T, std::size_t I>
std::size_t hash(Arr<T, I> const& arr) {
std::size_t result = 0;
for(T const& elem : arr) {
std::size_t h = hash(elem);
result = hash_combine(result, h);
}
return result;
};
// Handling structs
template<aggregate T>
std::size_t hash(T const& agg) {
return hash(tie_struct(agg));
}
This allows you to hash basically any aggregate struct, even with arrays and nested structs:
struct Foo{ int i; double d; std::string s; };
struct Bar { Foo k[10]; float f; };
std::cout << hash(Foo{1, 1.2f, "miau"}) << std::endl;
std::cout << hash(Bar{}) << std::endl;
full example on godbolt
Footnotes
This only works with aggregates
No need to worry about padding because we access the members directly.
You have to add a few more ifs into tie_struct if you need more than 4 members
The provided hash() function doesn't handle all types - if you need e.g. std::array, std::pair, etc... you need to add overloads for those.
It's a lot of boilerplate code, but it's insanely powerful.
You can also use Boost.PFR for the aggregate-to-tuple part, if you are allowed to use boost

How do I call template array operator overloading function?

I need to create an adapter C++ class, which accepts an integer index, and retrieves some types of data from a C module by the index, and then returns it to the C++ module.
The data retrieving functions in the C module are like:
int getInt(int index);
double getDouble(int index);
const char* getString(int index);
// ...and etc.
I want to implement an array-like interface for the C++ module, so I created the following class:
class Arguments {
public:
template<typename T> T operator[] (int index);
};
template<> int Arguments::operator[] (int index) { return getInt(index); }
template<> double Arguments::operator[] (int index) { return getdouble(index); }
template<> std::string Arguments::operator[] (int index) { return getString(index); }
(Template class doesn't help in this case, but only template member functions)
The adapter class is no biggie, but calling the Arguments::operator[] is a problem!
I found out that I can only call it in this way:
Arguments a;
int i = a.operator[]<int>(0); // OK
double d = a.operator[]<double>(1); // OK
int x = a[0]; // doesn't compile! it doesn't deduce.
But it looks like a joke, doesn't it?
If this is the case, I would rather create normal member functions, like template<T> T get(int index).
So here comes the question: if I create array-operator-overloading function T operator[]() and its specializations, is it possible to call it like accessing an array?
Thank you!
The simple answer is: No, not possible. You cannot overload a function based on its return type. See here for a similar quesiton: overload operator[] on return type
However, there is a trick that lets you deduce a type from the lhs of an assignment:
#include <iostream>
#include <type_traits>
struct container;
struct helper {
container& c;
size_t index;
template <typename T> operator T();
};
struct container {
helper operator[](size_t i){
return {*this,i};
}
template <typename T>
T get_value(size_t i){
if constexpr (std::is_same_v<T,int>) {
return 42;
} else {
return 0.42;
}
}
};
template <typename T>
helper::operator T(){
return c.get_value<T>(index);
}
int main() {
container c;
int x = c[0];
std::cout << x << "\n";
double y = c[1];
std::cout << y ;
}
Output is:
42
0.42
The line int x = c[0]; goes via container::get_value<int> where the int is deduced from the type of x. Similarly double y = c[1]; uses container::get_value<double> because y is double.
The price you pay is lots of boilerplate and using auto like this
auto x = c[1];
will get you a helper, not the desired value which might be a bit unexpected.

Virtually turn vector of struct into vector of struct members

I have a function that takes a vector-like input. To simplify things, let's use this print_in_order function:
#include <iostream>
#include <vector>
template <typename vectorlike>
void print_in_order(std::vector<int> const & order,
vectorlike const & printme) {
for (int i : order)
std::cout << printme[i] << std::endl;
}
int main() {
std::vector<int> printme = {100, 200, 300};
std::vector<int> order = {2,0,1};
print_in_order(order, printme);
}
Now I have a vector<Elem> and want to print a single integer member, Elem.a, for each Elem in the vector. I could do this by creating a new vector<int> (copying a for all Elems) and pass this to the print function - however, I feel like there must be a way to pass a "virtual" vector that, when operator[] is used on it, returns this only the member a. Note that I don't want to change the print_in_order function to access the member, it should remain general.
Is this possible, maybe with a lambda expression?
Full code below.
#include <iostream>
#include <vector>
struct Elem {
int a,b;
Elem(int a, int b) : a(a),b(b) {}
};
template <typename vectorlike>
void print_in_order(std::vector<int> const & order,
vectorlike const & printme) {
for (int i : order)
std::cout << printme[i] << std::endl;
}
int main() {
std::vector<Elem> printme = {Elem(1,100), Elem(2,200), Elem(3,300)};
std::vector<int> order = {2,0,1};
// how to do this?
virtual_vector X(printme) // behaves like a std::vector<Elem.a>
print_in_order(order, X);
}
It's not really possible to directly do what you want. Instead you might want to take a hint from the standard algorithm library, for example std::for_each where you take an extra argument that is a function-like object that you call for each element. Then you could easily pass a lambda function that prints only the wanted element.
Perhaps something like
template<typename vectorlike, typename functionlike>
void print_in_order(std::vector<int> const & order,
vectorlike const & printme,
functionlike func) {
for (int i : order)
func(printme[i]);
}
Then call it like
print_in_order(order, printme, [](Elem const& elem) {
std::cout << elem.a;
});
Since C++ have function overloading you can still keep the old print_in_order function for plain vectors.
Using member pointers you can implement a proxy type that will allow you view a container of objects by substituting each object by one of it's members (see pointer to data member) or by one of it's getters (see pointer to member function). The first solution addresses only data members, the second accounts for both.
The container will necessarily need to know which container to use and which member to map, which will be provided at construction. The type of a pointer to member depends on the type of that member so it will have to be considered as an additional template argument.
template<class Container, class MemberPtr>
class virtual_vector
{
public:
virtual_vector(const Container & p_container, MemberPtr p_member_ptr) :
m_container(&p_container),
m_member(p_member_ptr)
{}
private:
const Container * m_container;
MemberPtr m_member;
};
Next, implement the operator[] operator, since you mentioned that it's how you wanted to access your elements. The syntax for dereferencing a member pointer can be surprising at first.
template<class Container, class MemberPtr>
class virtual_vector
{
public:
virtual_vector(const Container & p_container, MemberPtr p_member_ptr) :
m_container(&p_container),
m_member(p_member_ptr)
{}
// Dispatch to the right get method
auto operator[](const size_t p_index) const
{
return (*m_container)[p_index].*m_member;
}
private:
const Container * m_container;
MemberPtr m_member;
};
To use this implementation, you would write something like this :
int main() {
std::vector<Elem> printme = { Elem(1,100), Elem(2,200), Elem(3,300) };
std::vector<int> order = { 2,0,1 };
virtual_vector<decltype(printme), decltype(&Elem::a)> X(printme, &Elem::a);
print_in_order(order, X);
}
This is a bit cumbersome since there is no template argument deduction happening. So lets add a free function to deduce the template arguments.
template<class Container, class MemberPtr>
virtual_vector<Container, MemberPtr>
make_virtual_vector(const Container & p_container, MemberPtr p_member_ptr)
{
return{ p_container, p_member_ptr };
}
The usage becomes :
int main() {
std::vector<Elem> printme = { Elem(1,100), Elem(2,200), Elem(3,300) };
std::vector<int> order = { 2,0,1 };
auto X = make_virtual_vector(printme, &Elem::a);
print_in_order(order, X);
}
If you want to support member functions, it's a little bit more complicated. First, the syntax to dereference a data member pointer is slightly different from calling a function member pointer. You have to implement two versions of the operator[] and enable the correct one based on the member pointer type. Luckily the standard provides std::enable_if and std::is_member_function_pointer (both in the <type_trait> header) which allow us to do just that. The member function pointer requires you to specify the arguments to pass to the function (non in this case) and an extra set of parentheses around the expression that would evaluate to the function to call (everything before the list of arguments).
template<class Container, class MemberPtr>
class virtual_vector
{
public:
virtual_vector(const Container & p_container, MemberPtr p_member_ptr) :
m_container(&p_container),
m_member(p_member_ptr)
{}
// For mapping to a method
template<class T = MemberPtr>
auto operator[](std::enable_if_t<std::is_member_function_pointer<T>::value == true, const size_t> p_index) const
{
return ((*m_container)[p_index].*m_member)();
}
// For mapping to a member
template<class T = MemberPtr>
auto operator[](std::enable_if_t<std::is_member_function_pointer<T>::value == false, const size_t> p_index) const
{
return (*m_container)[p_index].*m_member;
}
private:
const Container * m_container;
MemberPtr m_member;
};
To test this, I've added a getter to the Elem class, for illustrative purposes.
struct Elem {
int a, b;
int foo() const { return a; }
Elem(int a, int b) : a(a), b(b) {}
};
And here is how it would be used :
int main() {
std::vector<Elem> printme = { Elem(1,100), Elem(2,200), Elem(3,300) };
std::vector<int> order = { 2,0,1 };
{ // print member
auto X = make_virtual_vector(printme, &Elem::a);
print_in_order(order, X);
}
{ // print method
auto X = make_virtual_vector(printme, &Elem::foo);
print_in_order(order, X);
}
}
You've got a choice of two data structures
struct Employee
{
std::string name;
double salary;
long payrollid;
};
std::vector<Employee> employees;
Or alternatively
struct Employees
{
std::vector<std::string> names;
std::vector<double> salaries;
std::vector<long> payrollids;
};
C++ is designed with the first option as the default. Other languages such as Javascript tend to encourage the second option.
If you want to find mean salary, option 2 is more convenient. If you want to sort the employees by salary, option 1 is easier to work with.
However you can use lamdas to partially interconvert between the two. The lambda is a trivial little function which takes an Employee and returns a salary for him - so effectively providing a flat vector of doubles we can take the mean of - or takes an index and an Employees and returns an employee, doing a little bit of trivial data reformatting.
template<class F>
struct index_fake_t{
F f;
decltype(auto) operator[](std::size_t i)const{
return f(i);
}
};
template<class F>
index_fake_t<F> index_fake( F f ){
return{std::move(f)};
}
template<class F>
auto reindexer(F f){
return [f=std::move(f)](auto&& v)mutable{
return index_fake([f=std::move(f),&v](auto i)->decltype(auto){
return v[f(i)];
});
};
}
template<class F>
auto indexer_mapper(F f){
return [f=std::move(f)](auto&& v)mutable{
return index_fake([f=std::move(f),&v](auto i)->decltype(auto){
return f(v[i]);
});
};
}
Now, print in order can be rewritten as:
template <typename vectorlike>
void print(vectorlike const & printme) {
for (auto&& x:printme)
std::cout << x << std::endl;
}
template <typename vectorlike>
void print_in_order(std::vector<int> const& reorder, vectorlike const & printme) {
print(reindexer([&](auto i){return reorder[i];})(printme));
}
and printing .a as:
print_in_order( reorder, indexer_mapper([](auto&&x){return x.a;})(printme) );
there may be some typos.

c++ send arguments to union (variable union)

well i cant find how do this, basically its a variable union with params, basic idea, (writed as function)
Ex1
union Some (int le)
{
int i[le];
float f[le];
};
Ex2
union Some
{
int le;
int i[le];
float f[le];
};
obs this don't works D:
maybe a way to use an internal variable to set the lenght but don't works too.
Thx.
No, this is not possible: le would need to be known at compile-time.
One solution would be to use a templated union:
template <int N> union Some
{
int i[N];
float f[N];
};
N, of course, is compile-time evaluable.
Another solution is the arguably more succinct
typedef std::vector<std::pair<int, float>> Some;
or a similar solution based on std::array.
Depending on your use case you could try to simulate a union.
struct Some
{
//Order is important
private:
char* pData;
public:
int* const i;
float* const f;
public:
Some(size_t len)
:pData(new char[sizeof(int) < sizeof(float) ? sizeof(float) : sizeof(int)])
,i ((int*)pData)
,f ((float*)pData)
{
}
~Some()
{
delete[] pData;
}
Some(const Some&) = delete;
Some& operator=(const Some&) = delete;
};
Alternative solution using templates, unique_ptr and explicit casts:
//max_size_of<>: a recursive template struct to evaluate the
// maximum value of the sizeof function of all types passed as
// parameter
//The recursion is done by using the "value" of another
// specialization of max_size_of<> with less parameter types
template <typename T, typename...Args>
struct max_size_of
{
static const std::size_t value = std::max(sizeof(T), max_size_of<Args...>::value);
};
//Specialication for max_size_of<> as recursion stop
template <typename T>
struct max_size_of<T>
{
static const std::size_t value = sizeof(T);
};
//dataptr_auto_cast<>: a recursive template struct that
// introduces a virtual function "char* const data_ptr()"
// and an additional explicit cast operator for a pointer
// of the first type. Due to the recursion a cast operator
// for every type passed to the struct is created.
//Attention: types are not allowed to be duplicate
//The recursion is done by inheriting from of another
// specialization of dataptr_auto_cast<> with less parameter types
template <typename T, typename...Args>
struct dataptr_auto_cast : public dataptr_auto_cast<Args...>
{
virtual char* const data_ptr() const = 0; //This is needed by the cast operator
explicit operator T* const() const { return (T*)data_ptr(); } //make it explicit to avoid unwanted side effects (manual cast needed)
};
//Specialization of dataptr_auto_cast<> as recursion stop
template <typename T>
struct dataptr_auto_cast<T>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
//union_array<>: inherits from dataptr_auto_cast<> with the same
// template parameters. Also has a static const member "blockSize"
// that indicates the size of the largest datatype passed as parameter
// "blockSize" is used to determine the space needed to store "size"
// elements.
template <typename...Args>
struct union_array : public dataptr_auto_cast<Args...>
{
static const size_t blockSize = max_size_of<Args...>::value;
private:
std::unique_ptr<char[]> m_pData; //std::unique_ptr automatically deletes the memory it points to on destruction
size_t m_size; //The size/no. of elements
public:
//Create a new array to store "size" elements
union_array(size_t size)
:m_pData(new char[size*blockSize])
,m_size(size)
{
}
//Copy constructor
union_array(const union_array<Args...>& other)
:m_pData(new char[other.m_size*blockSize])
,m_size(other.m_size)
{
memcpy(m_pData.get(), other.m_pData.get(), other.m_size);
}
//Move constructor
union_array(union_array<Args...>&& other)
:m_pData(std::move(other.m_pData))
,m_size(std::move(other.m_size))
{
}
union_array& operator=(const union_array<Args...>& other)
{
m_pData = new char[other.m_size*blockSize];
m_size = other.m_size;
memcpy(m_pData.get(), other.m_pData.get(), other.m_size);
}
union_array& operator=(union_array<Args...>&& other)
{
m_pData = std::move(other.m_pData);
m_size = std::move(other.m_size);
}
~union_array() = default;
size_t size() const
{
return m_size;
}
//Implementation of dataptr_auto_cast<>::data_ptr
virtual char* const data_ptr() const override
{
return m_pData.get();
}
};
int main()
{
auto a = union_array<int, char, float, double>(5); //Create a new union_array object with enough space to store either 5 int, 5 char, 5 float or 5 double values.
((int*)a)[3] = 3; //Use as int array
auto b = a; //copy
((int*)b)[3] = 1; //Change a value
auto c = std::move(a);// move a to c, a is invalid beyond this point
// std::cout << ((int*)a)[3] << std::endl; //This will crash as a is invalid due to the move
std::cout << ((int*)b)[3] << std::endl; //prints "1"
std::cout << ((int*)c)[3] << std::endl; //prints "3"
}
Explanation
template <typename T, typename...Args>
struct max_size_of
{
static const std::size_t value = std::max(sizeof(T), max_size_of<Args...>::value);
};
template <typename T>
struct max_size_of<T>
{
static const std::size_t value = sizeof(T);
};
max_size_of<> is used to get the largest sizeof() value of all types passed as template paremeters.
Let's have a look at the simple case first.
- max_size_of<char>::value: value will be set to sizeof(char).
- max_size_of<int>::value: value will be set to sizeof(int).
- and so on
If you put in more than one type it will evaluate to the maximum of the sizeof of these types.
For 2 types this would look like this: max_size_of<char, int>::value: value will be set to std::max(sizeof(char), max_size_of<int>::value).
As described above max_size_of<int>::value is the same as sizeof(int), so max_size_of<char, int>::value is the same as std::max(sizeof(char), sizeof(int)) which is the same as sizeof(int).
template <typename T, typename...Args>
struct dataptr_auto_cast : public dataptr_auto_cast<Args...>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
template <typename T>
struct dataptr_auto_cast<T>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
dataptr_auto_cast<> is what we use as a simple abstract base class.
It forces us to implement a function char* const data_ptr() const in the final class (which will be union_array).
Let's just assume that the class is not abstract and use the simple version dataptr_auto_cast<T>:
The class implements a operator function that returns a pointer of the type of the passed template parameter.
dataptr_auto_cast<int> has a function explicit operator int* const() const;
The function provides access to data provided by the derived class through the data_ptr()function and casts it to type T* const.
The const is so that the pointer isn't altered accidentially and the explicit keyword is used to avoid unwanted implicit casts.
As you can see there are 2 versions of dataptr_auto_cast<>. One with 1 template paremeter (which we just looked at) and one with multiple template paremeters.
The definition is quite similar with the exception that the multiple parameters one inherits dataptr_auto_cast with one (the first) template parameter less.
So dataptr_auto_cast<int, char> has a function explicit operator int* const() const; and inherits dataptr_auto_cast<char> which has a function explicit operator char* const() const;.
As you can see there is one cast operator function implemented with each type you pass.
There is only one exception and that is passing the same template parameter twice.
This would lead in the same operator function being defined twice within the same class which doesn't work.
For this use case, using this as a base class for the union_array, this shouldn't matter.
Now that these two are clear let's look at the actual code for union_array:
template <typename...Args>
struct union_array : public dataptr_auto_cast<Args...>
{
static const size_t blockSize = max_size_of<Args...>::value;
private:
std::unique_ptr<char[]> m_pData;
size_t m_size;
public:
//Create a new array to store "size" elements
union_array(size_t size)
:m_pData(new char[size*blockSize])
,m_size(size)
{
}
//Copy constructor
union_array(const union_array<Args...>& other)
:m_pData(new char[other.m_size*blockSize])
,m_size(other.m_size)
{
memcpy(m_pData.get(), other.m_pData.get(), other.m_size);
}
//Move constructor
union_array(union_array<Args...>&& other)
:m_pData(std::move(other.m_pData))
,m_size(std::move(other.m_size))
{
}
union_array& operator=(const union_array<Args...>& other)
{
m_pData = new char[other.m_size*blockSize];
m_size = other.m_size;
memcpy(m_pData.get(), other.m_pData.get(), other.m_size);
}
union_array& operator=(union_array<Args...>&& other)
{
m_pData = std::move(other.m_pData);
m_size = std::move(other.m_size);
}
~union_array() = default;
size_t size() const
{
return m_size;
}
virtual char* const data_ptr() const override
{
return m_pData.get();
}
};
As you can see union_array<> inherits from dataptr_auto_cast<> using the same template arguments.
So this gives us a cast operator for every type passed as template paremeter to union_array<>.
Also at the end of union_array<> you can see that the char* const data_ptr() const function is implemented (the abstract function from dataptr_auto_cast<>).
The next interesting thing to see is static const size_t blockSize which is initilialized with the maximum sizeof value of the template paremeters to union_array<>.
To get this value the max_size_of is used as described above.
The class uses std::unique_ptr<char[]> as data storage, as std::unique_ptr automatically will delete the space for us, once the class is destroyed.
Also std::unique_ptr is capable of move semantics, which is used in the move assign operator function and the move constructor.
A "normal" copy assign operator function and a copy constructor are also included and copy the memory accordingly.
The class has a constructor union_array(size_t size) which takes the number of elements the union_array should be able to hold.
Multiplying this value with blockSize gives us the space needed to store exactly size elements of the largest template type.
Last but not least there is an access method to ask for the size() if needed.
C++ requires that the size of a type be known at compile time.
The size of a block of data need not be known, but all types have known sizes.
There are three ways around it.
I'll ignore the union part for now. Imagine if you wanted:
struct some (int how_many) {
int data[how_many];
};
as the union part adds complexity which can be dealt with separately.
First, instead of storing the data as part of the type, you can store pointers/references/etc to the data.
struct some {
std::vector<int> data;
explicit some( size_t how_many ):data(how_many) {};
some( some&& ) = default;
some& operator=( some&& ) = default;
some( some const& ) = default;
some& operator=( some const& ) = default;
some() = default;
~some() = default;
};
here we store the data in a std::vector -- a dynamic array. We default copy/move/construct/destruct operations (explicitly -- because it makes it clearer), and the right thing happens.
Instead of a vector we can use a unique_ptr:
struct some {
std::unique_ptr<int[]> data;
explicit some( size_t how_many ):data(new int[how_many]) {};
some( some&& ) = default;
some& operator=( some&& ) = default;
some() = default;
~some() = default;
};
this blocks copying of the structure, but the structure goes from being size of 3 pointers to being size of 1 in a typical std implementation. We lose the ability to easily resize after the fact, and copy without writing the code ourselves.
The next approach is to template it.
template<std::size_t N>
struct some {
int data[N];
};
this, however, requires that the size of the structure be known at compile-time, and some<2> and some<3> are 'unrelated types' (barring template pattern matching). So it has downsides.
A final approach is C-like. Here we rely on the fact that data can be variable in size, even if types are not.
struct some {
int data[1]; // or `0` in some compilers as an extension
};
some* make_some( std::size_t n ) {
Assert(n >= 1); // unless we did `data[0]` above
char* buff = new some[(n-1)*sizeof(int) + sizeof(some)]; // note: alignment issues on some platforms?
return new(buff) some(); // placement new
};
where we allocate a buffer for some of variable size. Access to the buffer via data[13] is practically legal, and probably actually so as well.
This technique is used in C to create structures of variable size.
For the union part, you'll want to create a buffer of char with the right size std::max(sizeof(float), sizeof(int))*N, and expose functions:
char* data(); // returns a pointer to the start of the buffer
int* i() { return reinterpret_cast<int*>(data()); }
float* f() { return reinterpret_cast<float*>(data()); }
you may also need to properly initialize the data as the proper type; in theory, a char buffer of '\0's may not correspond to defined float values or ints that are zero.
I would like to suggest a different approach: Instead of tying the number of elements to the union, tie it outside:
union Some
{
int i;
float f;
};
Some *get_Some(int le) { return new Some[le]; }
Don't forget to delete[] the return value of get_Some... Or use smart pointers:
std::unique_ptr<Some[]> get_Some(int le)
{ return std::make_unique<Some[]>(le); }
You can even create a Some_Manager:
struct Some_Manager
{
union Some
{
int i;
float f;
};
Some_Manager(int le) :
m_le{le},
m_some{std::make_unique<Some[]>(le)}
{}
// ... getters and setters...
int count() const { return m_le; }
Some &operator[](int le) { return m_some[le]; }
private:
int m_le{};
std::unique_ptr<Some[]> m_some;
};
Take a look at the Live example.
It's not possible to declare a structure with dynamic sizes as you are trying to do, the size must be specified at run time or you will have to use higher-level abstractions to manage a dynamic pool of memory at run time.
Also, in your second example, you include le in the union. If what you were trying to do were possible, it would cause le to overlap with the first value of i and f.
As was mentioned before, you could do this with templating if the size is known at compile time:
#include <cstdlib>
template<size_t Sz>
union U {
int i[Sz];
float f[Sz];
};
int main() {
U<30> u;
u.i[0] = 0;
u.f[1] = 1.0;
}
http://ideone.com/gG9egD
If you want dynamic size, you're beginning to reach the realm where it would be better to use something like std::vector.
#include <vector>
#include <iostream>
union U {
int i;
float f;
};
int main() {
std::vector<U> vec;
vec.resize(32);
vec[0].i = 0;
vec[1].f = 42.0;
// But there is no way to tell whether a given element is
// supposed to be an int or a float:
// vec[1] was populated via the 'f' option of the union:
std::cout << "vec[1].i = " << vec[1].i << '\n';
}
http://ideone.com/gjTCuZ

compare function in lower bound

I have following structure
enum quality { good = 0, bad, uncertain };
struct Value {
int time;
int value;
quality qual;
};
class MyClass {
public:
MyClass() {
InsertValues();
}
void InsertValues();
int GetLocationForTime(int time);
private:
vector<Value> valueContainer;
};
void MyClass::InsertValues() {
for(int num = 0; num < 5; num++) {
Value temp;
temp.time = num;
temp.value = num+1;
temp.qual = num % 2;
valueContainer.push_back(temp);
}
}
int MyClass::GetLocationForTime(int time)
{
// How to use lower bound here.
return 0;
}
In above code I have been thrown with lot of compile errors. I think I am doing wrong here I am new to STL programming and can you please correct me where is the error? Is there better to do this?
Thanks!
The predicate needs to take two parameters and return bool.
As your function is a member function it has the wrong signature.
In addition, you may need to be able to compare Value to int, Value to Value, int to Value and int to int using your functor.
struct CompareValueAndTime
{
bool operator()( const Value& v, int time ) const
{
return v.time < time;
}
bool operator()( const Value& v1, const Value& v2 ) const
{
return v1.time < v2.time;
}
bool operator()( int time1, int time2 ) const
{
return time1 < time2;
}
bool operator()( int time, const Value& v ) const
{
return time < v.time;
}
};
That is rather cumbersome, so let's reduce it:
struct CompareValueAndTime
{
int asTime( const Value& v ) const // or static
{
return v.time;
}
int asTime( int t ) const // or static
{
return t;
}
template< typename T1, typename T2 >
bool operator()( T1 const& t1, T2 const& t2 ) const
{
return asTime(t1) < asTime(t2);
}
};
then:
std::lower_bound(valueContainer.begin(), valueContainer.end(), time,
CompareValueAndTime() );
There are a couple of other errors too, e.g. no semicolon at the end of the class declaration, plus the fact that members of a class are private by default which makes your whole class private in this case. Did you miss a public: before the constructor?
Your function GetLocationForTime doesn't return a value. You need to take the result of lower_bound and subtract begin() from it. The function should also be const.
If the intention of this call is to insert here, then consider the fact that inserting in the middle of a vector is an O(N) operation and therefore vector may be the wrong collection type here.
Note that the lower_bound algorithm only works on pre-sorted collections. If you want to be able to look up on different members without continually resorting, you will want to create indexes on these fields, possibly using boost's multi_index
One error is that the fourth argument to lower_bound (compareValue in your code) cannot be a member function. It can be a functor or a free function. Making it a free function which is a friend of MyClass seems to be the simplest in your case. Also you are missing the return keyword.
class MyClass {
MyClass() { InsertValues(); }
void InsertValues();
int GetLocationForTime(int time);
friend bool compareValue(const Value& lhs, const Value& rhs)
{
return lhs.time < rhs.time;
}
Class keyword must start from lower c - class.
struct Value has wrong type qualtiy instead of quality
I dont see using namespace std to use STL types without it.
vector<value> - wrong type value instead of Value
Etc.
You have to check it first before posting here with such simple errors i think.
And main problem here that comparison function cant be member of class. Use it as free function:
bool compareValue(const Value lhs, const int time) {
return lhs.time < time ;
}
class is the keyword and not "Class":
class MyClass {
And its body should be followed by semicolon ;.
There can be other errors, but you may have to paste them in the question for further help.
You just want to make compareValue() a normal function. The way you have implemented it right now, you need an object of type MyClass around. The way std::lower_bound() will try to call it, it will just pass in two argument, no extra object. If you really want it the function to be a member, you can make it a static member.
That said, there is a performance penalty for using functions directly. You might want to have comparator type with an inline function call operator:
struct MyClassComparator {
bool operator()(MyClass const& m0, MyClass const& m1) const {
return m0.time < m1.time;
}
};
... and use MyClassComparator() as comparator.