So I'm currently refactoring a giant function:
int giant_function(size_t n, size_t m, /*... other parameters */) {
int x[n]{};
float y[n]{};
int z[m]{};
/* ... more array definitions */
And when I find a group of related definitions with discrete functionality, I group them into a class definition:
class V0 {
std::unique_ptr<int[]> x;
std::unique_ptr<float[]> y;
std::unique_ptr<int[]> z;
public:
V0(size_t n, size_t m)
: x{new int[n]{}}
, y{new float[n]{}}
, z{new int[m]{}}
{}
// methods...
};
The refactored version is inevitably more readable, but one thing I find less-than-satisfying is the increase in the number of allocations.
Allocating all those (potentially very large) arrays on the stack is arguably a problem waiting to happen in the unrefactored version, but there's no reason that we couldn't get by with just one larger allocation:
class V1 {
int* x;
float* y;
int* z;
public:
V1(size_t n, size_t m) {
char *buf = new char[n*sizeof(int)+n*sizeof(float)+m*sizeof(int)];
x = (int*) buf;
buf += n*sizeof(int);
y = (float*) buf;
buf += n*sizeof(float);
z = (int*) buf;
}
// methods...
~V1() { delete[] ((char *) x); }
};
Not only does this approach involve a lot of manual (read:error-prone) bookkeeping, but its greater sin is that it's not composable.
If I want to have a V1 value and a W1 value on the stack, then that's
one allocation each for their behind-the-scenes resources. Even simpler, I'd want the ability to allocate a V1 and the resources it points to in a single allocation, and I can't do that with this approach.
What this initially led me to was a two-pass approach - one pass to calculate how much space was needed, then make one giant allocation, then another pass to parcel out the allocation and initialize the data structures.
class V2 {
int* x;
float* y;
int* z;
public:
static size_t size(size_t n, size_t m) {
return sizeof(V2) + n*sizeof(int) + n*sizeof(float) + m*sizeof(int);
}
V2(size_t n, size_t m, char** buf) {
x = (int*) *buf;
*buf += n*sizeof(int);
y = (float*) *buf;
*buf += n*sizeof(float);
z = (int*) *buf;
*buf += m*sizeof(int);
}
};
// ...
size_t total = ... + V2::size(n,m) + ...
char* buf = new char[total];
// ...
void* here = buf;
buf += sizeof(V2);
V2* v2 = new (here) V2{n, m, &buf};
However that approach had a lot of repetition at a distance, which is asking for trouble in the long run. Returning a factory got rid of that:
class V3 {
int* const x;
float* const y;
int* const z;
V3(int* x, float* y, int* z) : x{x}, y{y}, z{z} {}
public:
class Factory {
size_t const n;
size_t const m;
public:
Factory(size_t n, size_t m) : n{n}, m{m} {}
size_t size() {
return sizeof(V3) + sizeof(int)*n + sizeof(float)*n + sizeof(int)*m;
}
V3* build(char** buf) {
void * here = *buf;
*buf += sizeof(V3);
int* x = (int*) *buf;
*buf += n*sizeof(int);
float* y = (float*) *buf;
*buf += n*sizeof(float);
int* z = (int*) *buf;
*buf += m*sizeof(int);
return new (here) V3{x,y,z};
}
};
};
// ...
V3::Factory v3factory{n,m};
// ...
size_t total = ... + v3factory.size() + ...
char* buf = new char[total];
// ..
V3* v3 = v3factory.build(&buf);
Still some repetition, but the params only get input once. And still a lot of manual bookkeeping. It'd be nice if I could build this factory out of smaller factories...
And then my Haskell brain hit me. I was implementing an Applicative Functor. This could totally be nicer!
All I needed to do was write some tooling to automatically sum sizes and run build functions side-by-side:
namespace plan {
template <typename A, typename B>
struct Apply {
A const a;
B const b;
Apply(A const a, B const b) : a{a}, b{b} {};
template<typename ... Args>
auto build(char* buf, Args ... args) const {
return a.build(buf, b.build(buf + a.size()), args...);
}
size_t size() const {
return a.size() + b.size();
}
Apply(Apply<A,B> const & plan) : a{plan.a}, b{plan.b} {}
Apply(Apply<A,B> const && plan) : a{plan.a}, b{plan.b} {}
template<typename U, typename ... Vs>
auto operator()(U const u, Vs const ... vs) const {
return Apply<decltype(*this),U>{*this,u}(vs...);
}
auto operator()() const {
return *this;
}
};
template<typename T>
struct Lift {
template<typename ... Args>
T* build(char* buf, Args ... args) const {
return new (buf) T{args...};
}
size_t size() const {
return sizeof(T);
}
Lift() {}
Lift(Lift<T> const &) {}
Lift(Lift<T> const &&) {}
template<typename U, typename ... Vs>
auto operator()(U const u, Vs const ... vs) const {
return Apply<decltype(*this),U>{*this,u}(vs...);
}
auto operator()() const {
return *this;
}
};
template<typename T>
struct Array {
size_t const length;
Array(size_t length) : length{length} {}
T* build(char* buf) const {
return new (buf) T[length]{};
}
size_t size() const {
return sizeof(T) * length;
}
};
template <typename P>
auto heap_allocate(P plan) {
return plan.build(new char[plan.size()]);
}
}
Now I could state my class quite simply:
class V4 {
int* const x;
float* const y;
int* const z;
public:
V4(int* x, float* y, int* z) : x{x}, y{y}, z{z} {}
static auto plan(size_t n, size_t m) {
return plan::Lift<V4>{}(
plan::Array<int>{n},
plan::Array<float>{n},
plan::Array<int>{m}
);
}
};
And use it in a single pass:
V4* v4;
W4* w4;
std::tie( ..., v4, w4, .... ) = *plan::heap_allocate(
plan::Lift<std::tie>{}(
// ...
V4::plan(n,m),
W4::plan(m,p,2*m+1),
// ...
)
);
It's not perfect (among other issues, I need to add code to track destructors, and have heap_allocate return a std::unique_ptr that calls all of them), but before I went further down the rabbit hole, I thought I should check for pre-existing art.
For all I know, modern compilers may be smart enough to recognize that the memory in V0 always gets allocated/deallocated together and batch the allocation for me.
If not, is there a preexisting implementation of this idea (or a variation thereof) for batching allocations with an applicative functor?
First, I'd like to provide feedback on problems with your solutions:
You ignore alignment. Relying on the assumption that int and float share the same alignment on your system, your particular use case might be "fine". But try to add some double into the mix and there will be UB. You might find your program crashing on ARM chips due to unaligned access.
new (buf) T[length]{}; is unfortunately bad and non-portable. In short: the Standard allows the compiler to reserve an initial y bytes of the given storage for internal use. Your program fails to allocate these y bytes on systems where y > 0 (and yes, such systems apparently exist; VC++ allegedly does this).
Having to allocate for y is bad, but what makes array-placement-new unusable is the inability to find out how big y is until placement new is actually called. There's really no way to use it for this case.
You're already aware of this, but for completeness: You don't destroy the sub-buffers, so if you ever use a non-trivially-destructible type, then there will be UB.
Solutions:
Allocate extra alignof(T) - 1 bytes for each buffer. Align start of each buffer with std::align.
You need to loop and use non-array placement new. Technically, doing non-array placement new means that using pointer arithmetic on these objects has UB, but the standard is just silly in this regard and I choose to ignore it. Here's a language-lawyerish discussion about that. As I understand it, the p0593r2 proposal includes a resolution to this technicality.
Add destructor calls that correspond to the placement-new calls (or static_assert that only trivially destructible types shall be used). Note that support for non-trivial destruction raises the need for exception safety. If construction of one buffer throws an exception, then the sub-buffers that were constructed earlier need to be destroyed. The same care needs to be taken when the constructor of a single element throws after some elements have already been constructed.
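To illustrate the first two points, here is a minimal sketch (names are illustrative, not a drop-in implementation) that carves one aligned sub-buffer out of a raw allocation with std::align and constructs its elements with non-array placement new:
#include <cstddef>
#include <memory>
#include <new>
// Carve an aligned array of `count` T objects out of a raw buffer.
// `cursor` is advanced past the space consumed; the extra alignof(T) - 1
// bytes must already have been budgeted for when sizing the buffer.
template <class T>
T* carve(char*& cursor, std::size_t count, std::size_t& space_left) {
    void* p = cursor;
    if (!std::align(alignof(T), sizeof(T) * count, p, space_left))
        return nullptr;                      // not enough room left
    T* first = static_cast<T*>(p);
    for (std::size_t i = 0; i < count; ++i)
        new (first + i) T{};                 // non-array placement new, one element at a time
    cursor = reinterpret_cast<char*>(first + count);
    space_left -= sizeof(T) * count;
    return first;
}
Exception safety and the matching destructor calls from the third point are left out here; the code below handles them.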
I don't know of prior art, but how about some subsequent art? I decided to take a stab at this from a slightly different angle. Be warned though, this lacks testing and may contain bugs.
buffer_clump template for constructing / destructing objects into an external raw storage, and calculating aligned boundaries of each sub-buffer:
#include <cstddef>
#include <memory>
#include <vector>
#include <tuple>
#include <cassert>
#include <type_traits>
#include <utility>
// recursion base
template <class... Args>
class buffer_clump {
protected:
constexpr std::size_t buffer_size() const noexcept { return 0; }
constexpr std::tuple<> buffers(char*) const noexcept { return {}; }
constexpr void construct(char*) const noexcept { }
constexpr void destroy(const char*) const noexcept {}
};
template<class Head, class... Tail>
class buffer_clump<Head, Tail...> : buffer_clump<Tail...> {
using tail = buffer_clump<Tail...>;
const std::size_t length;
constexpr std::size_t size() const noexcept
{
return sizeof(Head) * length + alignof(Head) - 1;
}
constexpr Head* align(char* buf) const noexcept
{
void* aligned = buf;
std::size_t space = size();
// call std::align outside of assert so the adjustment still happens under NDEBUG
void* const ok = std::align(
alignof(Head),
sizeof(Head) * length,
aligned,
space
);
assert(ok);
(void)ok;
return (Head*)aligned;
}
constexpr char* next(char* buf) const noexcept
{
return buf + size();
}
static constexpr void
destroy_head(Head* head_ptr, std::size_t last)
noexcept(std::is_nothrow_destructible<Head>::value)
{
if constexpr (!std::is_trivially_destructible<Head>::value)
while (last--)
head_ptr[last].~Head();
}
public:
template<class... Size_t>
constexpr buffer_clump(std::size_t length, Size_t... tail_lengths) noexcept
: tail(tail_lengths...), length(length) {}
constexpr std::size_t
buffer_size() const noexcept
{
return size() + tail::buffer_size();
}
constexpr auto
buffers(char* buf) const noexcept
{
return std::tuple_cat(
std::make_tuple(align(buf)),
tail::buffers(next(buf))
);
}
void
construct(char* buf) const
noexcept(std::conjunction<std::is_nothrow_default_constructible<Head>, std::is_nothrow_default_constructible<Tail>...>::value)
{
Head* aligned = align(buf);
std::size_t i;
try {
for (i = 0; i < length; i++)
new (&aligned[i]) Head;
tail::construct(next(buf));
} catch (...) {
destroy_head(aligned, i);
throw;
}
}
constexpr void
destroy(char* buf) const
noexcept(std::conjunction<std::is_nothrow_destructible<Head>, std::is_nothrow_destructible<Tail>...>::value)
{
tail::destroy(next(buf));
destroy_head(align(buf), length);
}
};
A buffer_clump_storage template that leverages buffer_clump to construct sub-buffers into a RAII container.
template <class... Args>
class buffer_clump_storage {
const buffer_clump<Args...> clump;
std::vector<char> storage;
public:
constexpr auto buffers() noexcept {
return clump.buffers(storage.data());
}
template<class... Size_t>
buffer_clump_storage(Size_t... lengths)
: clump(lengths...), storage(clump.buffer_size())
{
clump.construct(storage.data());
}
~buffer_clump_storage()
noexcept(noexcept(clump.destroy(nullptr)))
{
if (storage.size())
clump.destroy(storage.data());
}
buffer_clump_storage(buffer_clump_storage&& other) noexcept
: clump(other.clump), storage(std::move(other.storage))
{
other.storage.clear();
}
};
Finally, a class that can be allocated as automatic variable and provides named pointers to sub-buffers of buffer_clump_storage:
class V5 {
// macro tricks or boost mpl magic could be used to avoid repetitive boilerplate
buffer_clump_storage<int, float, int> storage;
public:
int* x;
float* y;
int* z;
V5(std::size_t xs, std::size_t ys, std::size_t zs)
: storage(xs, ys, zs)
{
std::tie(x, y, z) = storage.buffers();
}
};
And usage:
int giant_function(size_t n, size_t m, /*... other parameters */) {
V5 v(n, n, m);
for(std::size_t i = 0; i < n; i++)
v.x[i] = i;
In case you need only the clumped allocation and not so much the ability to name the group, this direct usage avoids pretty much all boilerplate:
int giant_function(size_t n, size_t m, /*... other parameters */) {
buffer_clump_storage<int, float, int> v(n, n, m);
auto [x, y, z] = v.buffers();
Criticism of my own work:
I didn't bother making V5 members const which would have arguably been nice, but I found it involved more boilerplate than I would prefer.
Compilers will warn that there is a throw in a function that is declared noexcept when the constructor cannot throw. Neither g++ nor clang++ were smart enough to understand that the throw will never happen when the function is noexcept. I guess that can be worked around by using partial specialization, or I could just add (non-standard) directives to disable the warning.
buffer_clump_storage could be made copyable and assignable. This involves loads more code, and I wouldn't expect to have need for them. The move constructor might be superfluous as well, but at least it's efficient and concise to implement.
Related
I have objects that I need to hash with SHA256. The object has several fields as follows:
class Foo {
// some methods
protected:
std::array<int, 32> x;
char y[32];
long z;
};
Is there a way I can directly access the bytes representing the 3 member variables in memory as I would a struct? These hashes need to be computed as quickly as possible so I want to avoid malloc'ing a new set of bytes and copying to a heap-allocated array. Or is the answer to simply embed a struct within the class?
It is critical that I get the exact binary representation of these variables so that the SHA256 comes out exactly the same given that the 3 variables are equal (so I can't have any extra padding bytes etc included going into the hash function)
Most Hash classes are able to take multiple regions before returning the hash, e.g. as in:
class Hash {
public:
virtual void update(const void *data, size_t size) = 0;
virtual std::vector<uint8_t> digest() = 0;
};
So your hash method could look like this:
std::vector<uint8_t> Foo::hash(Hash *hash) const {
hash->update(&x, sizeof(x));
hash->update(&y, sizeof(y));
hash->update(&z, sizeof(z));
return hash->digest();
}
You can solve this by making an iterator that knows the layout of your member variables. Make Foo::begin() and Foo::end() functions and you can even take advantage of range-based for loops.
If you can increment it and dereference it, you can use it any other place you're able to use a LegacyForwardIterator.
Add in comparison functions to get access to the common it = X.begin(); it != X.end(); ++it idiom.
Some downsides include: ugly library code, poor maintainability, and (in this current form) no regard for endianness.
The solution to the latter downside is left as an exercise to the reader.
#include <array>
#include <iostream>
class Foo {
friend class FooByteIter;
public:
FooByteIter begin() const;
FooByteIter end() const;
Foo(const std::array<int, 2>& x, const char (&y)[2], long z)
: x_{x}
, y_{y[0], y[1]}
, z_{z}
{}
protected:
std::array<int, 2> x_;
char y_[2];
long z_;
};
class FooByteIter {
public:
FooByteIter(const Foo& foo)
: ptr_{reinterpret_cast<const char*>(&(foo.x_))}
, x_end_{reinterpret_cast<const char*>(&(foo.x_)) + sizeof(foo.x_)}
, y_begin_{reinterpret_cast<const char*>(&(foo.y_))}
, y_end_{reinterpret_cast<const char*>(&(foo.y_)) + sizeof(foo.y_)}
, z_begin_{reinterpret_cast<const char*>(&(foo.z_))}
{}
static FooByteIter end(const Foo& foo) {
FooByteIter fbi{foo};
fbi.ptr_ = reinterpret_cast<const char*>(&foo.z_) + sizeof(foo.z_);
return fbi;
}
bool operator==(const FooByteIter& other) const { return ptr_ == other.ptr_; }
bool operator!=(const FooByteIter& other) const { return ! (*this == other); }
FooByteIter& operator++() {
ptr_++;
if (ptr_ == x_end_) {
ptr_ = y_begin_;
}
else if (ptr_ == y_end_) {
ptr_ = z_begin_;
}
return *this;
}
FooByteIter operator++(int) {
FooByteIter pre = *this;
++(*this);
return pre;
}
char operator*() const {
return *ptr_;
}
private:
const char* ptr_;
const char* const x_end_;
const char* const y_begin_;
const char* const y_end_;
const char* const z_begin_;
};
FooByteIter Foo::begin() const {
return FooByteIter(*this);
}
FooByteIter Foo::end() const {
return FooByteIter::end(*this);
}
template <typename InputIt>
char checksum(InputIt first, InputIt last) {
char check = 0;
while (first != last) {
check += (*first);
++first;
}
return check;
}
int main() {
Foo f{{1, 2}, {3, 4}, 5};
for (const auto b : f) {
std::cout << (int)b << ' ';
}
std::cout << std::endl;
std::cout << "Checksum is: " << (int)checksum(f.begin(), f.end()) << std::endl;
}
You can generalize this further by making serialization functions for all data types you might care about, allowing serialization of classes that aren't plain-old-data types.
Warning
This code assumes that the underlying types being serialized have no internal padding, themselves. This answer works for this datatype because it is made of types which themselves do not pad. To make this work for datatypes that have datatypes that have padding, this method would need to be recursed all the way down.
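If C++17 is available, one way to guard against such hidden padding (a hedged suggestion, not part of the answer above) is to static_assert that each serialized member type has a unique object representation:
#include <array>
#include <type_traits>
// Fails to compile if a member type contains padding bits (or otherwise has
// several object representations for the same value), which would make the
// byte-wise hash unreliable.
static_assert(std::has_unique_object_representations_v<std::array<int, 2>>,
              "x_ has padding or non-unique representations");
static_assert(std::has_unique_object_representations_v<long>,
              "z_ has padding or non-unique representations");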
Just cast a pointer to the object to a pointer to char. You can iterate through the bytes by incrementing. Use sizeof(foo) to check for overflow.
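A minimal sketch of that suggestion (note that, unlike the iterator above, this walks the entire object representation, so any padding bytes between members end up in the hash input):
#include <cstddef>
// Visit every byte of an object's representation; T should be trivially copyable.
template <class T, class Fn>
void for_each_byte(const T& obj, Fn fn) {
    const char* p = reinterpret_cast<const char*>(&obj);
    for (std::size_t i = 0; i < sizeof(obj); ++i)   // sizeof bounds the iteration
        fn(p[i]);
}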
As long as you're able to make your class an aggregate, i.e. std::is_aggregate_v<T> == true, you can actually sort-of reflect the members of the structure.
This allows you to easily hash the members without actually having to name them. (Also, you don't have to remember to update your hash function every time you add a new member.)
Step 1: Getting the number of members inside the aggregate
First we need to know how many members a given aggregate type has.
We can check this by (ab-)using aggregate initialization.
Example:
Given struct Foo { int i; int j; };:
Foo a{}; // ok
Foo b{{}}; // ok
Foo c{{}, {}}; // ok
Foo d{{}, {}, {}}; // error: too many initializers for 'Foo'
We can use this to get the number of members inside the struct, by trying to add more initializers until we get an error:
template<class T>
concept aggregate = std::is_aggregate_v<T>;
struct any_type {
template<class T>
operator T() {}
};
template<aggregate T>
consteval std::size_t count_members(auto ...members) {
if constexpr (requires { T{ {members}... }; } == false)
return sizeof...(members) - 1;
else
return count_members<T>(members..., any_type{});
}
Notice that I used {members}... instead of members....
This is because of arrays - a structure like struct Bar{int i[2];}; could be initialized with 2 elements, e.g. Bar b{1, 2}, so our function would have returned 2 for Bar if we had used members....
Step 2: Extracting the members
Now that we know how many members our structure has, we can use structured bindings to extract them.
Unfortunately there is no way in the current standard to create a structured binding expression with a variable amount of expressions, so we have to add a few extra lines of code for each additional member we want to support.
For this example I've only added a max of 4 members, but you can add as many as you like / need:
template<aggregate T>
constexpr auto tie_struct(T const& data) {
constexpr std::size_t fieldCount = count_members<T>();
if constexpr(fieldCount == 0) {
return std::tie();
} else if constexpr (fieldCount == 1) {
auto const& [m1] = data;
return std::tie(m1);
} else if constexpr (fieldCount == 2) {
auto const& [m1, m2] = data;
return std::tie(m1, m2);
} else if constexpr (fieldCount == 3) {
auto const& [m1, m2, m3] = data;
return std::tie(m1, m2, m3);
} else if constexpr (fieldCount == 4) {
auto const& [m1, m2, m3, m4] = data;
return std::tie(m1, m2, m3, m4);
} else {
static_assert(fieldCount!=fieldCount, "Too many fields for tie_struct! add more if statements!");
}
}
The fieldCount!=fieldCount in the static_assert is intentional; it prevents the compiler from evaluating it prematurely (it only complains if the else case is actually hit).
Now we have a function that can give us references to each member of an arbitrary aggregate.
Example:
struct Foo {int i; float j; std::string s; };
Foo f{1, 2, "miau"};
// tup is of type std::tuple<int const&, float const&, std::string const&>
auto tup = tie_struct(f);
// this will output "12miau"
std::cout << std::get<0>(tup) << std::get<1>(tup) << std::get<2>(tup) << std::endl;
Step 3: hashing the members
Now that we can convert any aggregate into a tuple of its members, hashing it shouldn't be a big problem.
You can basically hash the individual types like you want and then combine the individual hashes:
// for merging two hash values
std::size_t hash_combine(std::size_t h1, std::size_t h2)
{
return (h2 + 0x9e3779b9 + (h1<<6) + (h1>>2)) ^ h1;
}
// Handling primitives
template <class T, class = void>
struct is_std_hashable : std::false_type { };
template <class T>
struct is_std_hashable<T, std::void_t<decltype(std::declval<std::hash<T>>()(std::declval<T>()))>> : std::true_type { };
template <class T>
concept std_hashable = is_std_hashable<T>::value;
template<std_hashable T>
std::size_t hash(T value) {
return std::hash<T>{}(value);
}
// Handling tuples
template<class... Members>
std::size_t hash(std::tuple<Members...> const& tuple) {
return std::apply([](auto const&... members) {
std::size_t result = 0;
((result = hash_combine(result, hash(members))), ...);
return result;
}, tuple);
}
template<class T, std::size_t I>
using Arr = T[I];
// Handling arrays
template<class T, std::size_t I>
std::size_t hash(Arr<T, I> const& arr) {
std::size_t result = 0;
for(T const& elem : arr) {
std::size_t h = hash(elem);
result = hash_combine(result, h);
}
return result;
};
// Handling structs
template<aggregate T>
std::size_t hash(T const& agg) {
return hash(tie_struct(agg));
}
This allows you to hash basically any aggregate struct, even with arrays and nested structs:
struct Foo{ int i; double d; std::string s; };
struct Bar { Foo k[10]; float f; };
std::cout << hash(Foo{1, 1.2f, "miau"}) << std::endl;
std::cout << hash(Bar{}) << std::endl;
full example on godbolt
Footnotes
This only works with aggregates
No need to worry about padding because we access the members directly.
You have to add a few more ifs into tie_struct if you need more than 4 members
The provided hash() function doesn't handle all types - if you need e.g. std::array, std::pair, etc... you need to add overloads for those.
It's a lot of boilerplate code, but it's insanely powerful.
You can also use Boost.PFR for the aggregate-to-tuple part, if you are allowed to use boost
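If you need those, overloads in the same style as the handlers above could look like this (a hypothetical addition that reuses the hash/hash_combine helpers from this answer):
// Handling std::pair (hypothetical overload)
template<class A, class B>
std::size_t hash(std::pair<A, B> const& p) {
    return hash_combine(hash(p.first), hash(p.second));
}
// Handling std::array (hypothetical overload)
template<class T, std::size_t I>
std::size_t hash(std::array<T, I> const& arr) {
    std::size_t result = 0;
    for (T const& elem : arr)
        result = hash_combine(result, hash(elem));
    return result;
}
And for the aggregate-to-tuple step itself, boost::pfr::structure_tie could stand in for tie_struct when Boost is available.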
So I have the following available:
struct data_t {
char field1[10];
char field2[20];
char field3[30];
};
const char *getData(const char *key);
const char *field_keys[] = { "key1", "key2", "key3" };
This code is given to me and I cannot modify it in any way. It comes from some old C project.
I need to fill in the struct using the getData function with the different keys, something like the following:
struct data_t my_data;
strncpy(my_data.field1, getData(field_keys[0]), sizeof(my_data.field1));
strncpy(my_data.field2, getData(field_keys[1]), sizeof(my_data.field2));
strncpy(my_data.field3, getData(field_keys[2]), sizeof(my_data.field3));
Of course, this is a simplification, and more things are going on in each assignment. The point is that I would like to represent the mapping between keys and struct member in a constant structure, and use that to transform the last code in a loop. I am looking for something like the following:
struct data_t {
char field1[10];
char field2[20];
char field3[30];
};
typedef char *(data_t:: *my_struct_member);
const std::vector<std::pair<const char *, my_struct_member>> mapping = {
{ "FIRST_KEY" , &my_struct_t::field1},
{ "SECOND_KEY", &my_struct_t::field2},
{ "THIRD_KEY", &my_struct_t::field3},
};
int main()
{
data_t data;
for (auto const& it : mapping) {
strcpy(data.*(it.second), getData(it.first));
// Ideally, I would like to do
// strlcpy(data.*(it.second), getData(it.first), <the right sizeof here>);
}
}
This, however, has two problems:
It does not compile :) But I believe that should be easy to solve.
I am not sure about how to get the sizeof() argument for using strncpy/strlcpy instead of strcpy. I am using char * as the type of the members, so I am losing the type information about how long each array is. On the other hand, I am not sure how to use the specific char[T] types of each member, because if each struct member pointer has a different type I don't think I will be able to have them in a std::vector<T>.
As explained in my comment, if you can store enough information to process a field in a mapping, then you can write a function that does the same.
Therefore, write a function to do so, using array references to ensure what you do is safe, e.g.:
template <std::size_t N>
void process_field(char (&dest)[N], const char * src)
{
strlcpy(dest, getData(src), N);
// more work with the field...
};
And then simply, instead of your for loop:
process_field(data.field1, "foo");
process_field(data.field2, "bar");
// ...
Note that the amount of lines is the same as with a mapping (one per field), so this is not worse than a mapping solution in terms of repetition.
Now, the advantages:
Easier to understand.
Faster: no memory needed to keep the mapping, more easily optimizable, etc.
Allows you to write different functions for different fields, easily, if needed.
Further, if both of your strings are known at compile-time, you can even do:
template <std::size_t N, std::size_t M>
void process_field(char (&dest)[N], const char (&src)[M])
{
static_assert(N >= M);
std::memcpy(dest, src, M);
// more work with the field...
};
Which will be always safe, e.g.:
process_field(data.field1, "123456789"); // just fits!
process_field(data.field1, "1234567890"); // error
Which has even more pros:
Way faster than any strcpy variant (if the call is done in run-time).
Guaranteed to be safe at compile-time instead of run-time.
A variadic templates based solution:
struct my_struct_t {
char one_field[30];
char another_field[40];
};
template<typename T1, typename T2>
void do_mapping(T1& a, T2& b) {
std::cout << sizeof(b) << std::endl;
strncpy(b, a, sizeof(b));
}
template<typename T1, typename T2, typename... Args>
void do_mapping(T1& a, T2& b, Args&... args) {
do_mapping(a, b);
do_mapping(args...);
}
int main()
{
my_struct_t ms;
do_mapping(
"FIRST_MAPPING", ms.one_field,
"SECOND_MAPPING", ms.another_field
);
return 0;
}
Since data_t is a POD structure, you can use offsetof() for this.
const std::vector<std::pair<const char *, std::size_t>> mapping = {
{ "FIRST_FIELD" , offsetof(data_t, field1},
{ "SECOND_FIELD", offsetof(data_t, field2)}
};
Then the loop would be:
for (auto const& it : mapping) {
strcpy(reinterpret_cast<char*>(&data) + it.second, getData(it.first));
}
I don't think there's any way to get the size of the member similarly. You can subtract the offset of the current member from the next member, but this will include padding bytes. You'd also have to special-case the last member, subtracting the offset from the size of the structure itself, since there's no next member.
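For completeness, a sketch of that offset-subtraction idea (with the caveats just mentioned: the result may include padding bytes, and the last member is special-cased against the struct size):
// Spans derived from offsets; these may include trailing padding.
constexpr std::size_t field1_span = offsetof(data_t, field2) - offsetof(data_t, field1);
constexpr std::size_t field2_span = offsetof(data_t, field3) - offsetof(data_t, field2);
// Last member: subtract its offset from the size of the whole structure.
constexpr std::size_t field3_span = sizeof(data_t) - offsetof(data_t, field3);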
The mapping can be a function to write the data into the appropriate member
struct mapping_t
{
const char * name;
std::function<void(data_t &, const char *)> write;
};
const std::vector<mapping_t> mapping = {
{ "FIRST_KEY", [](data_t & data, const char * str) { strlcpy(data.field1, str, sizeof(data.field1); } }
{ "SECOND_KEY", [](data_t & data, const char * str) { strlcpy(data.field2, str, sizeof(data.field2); } },
{ "THIRD_KEY", [](data_t & data, const char * str) { strlcpy(data.field3, str, sizeof(data.field3); } },
};
int main()
{
data_t data;
for (auto const& it : mapping) {
it.write(data, getData(it.name));
}
}
To iterate over struct members you need:
offset / pointer to the beginning of that member
size of that member
struct Map {
const char *key;
std::size_t offset;
std::size_t size;
};
std::vector<Map> map = {
{ field_keys[0], offsetof(data_t, field1), sizeof(data_t::field1), },
{ field_keys[1], offsetof(data_t, field2), sizeof(data_t::field2), },
{ field_keys[2], offsetof(data_t, field3), sizeof(data_t::field3), },
};
once we have that we need strlcpy:
std::size_t mystrlcpy(char *to, const char *from, std::size_t max)
{
char * const to0 = to;
if (max == 0)
return 0;
while (--max != 0 && *from) {
*to++ = *from++;
}
*to = '\0';
return to - to0;
}
After having that, we can just:
data_t data;
for (auto const& it : map) {
mystrlcpy(reinterpret_cast<char*>(&data) + it.offset, getData(it.key), it.size);
}
That reinterpret_cast looks a bit ugly, but it just shifts the &data pointer to the needed field.
We can also create a smarter container which takes a variable pointer on construction, and is thus bound to an existing variable; it needs a little bit of writing:
struct Map2 {
static constexpr std::size_t max = sizeof(field_keys)/sizeof(*field_keys);
Map2(data_t* pnt) : mpnt(pnt) {}
char* getDest(std::size_t num) {
std::array<char*, max> arr = {
mpnt->field1,
mpnt->field2,
mpnt->field3,
};
return arr[num];
}
const char* getKey(std::size_t num) {
return field_keys[num];
}
std::size_t getSize(std::size_t num) {
std::array<std::size_t, max> arr = {
sizeof(mpnt->field1),
sizeof(mpnt->field2),
sizeof(mpnt->field3),
};
return arr[num];
}
private:
data_t* mpnt;
};
But it probably makes the iteration more readable:
Map2 m(&data);
for (std::size_t i = 0; i < m.max; ++i) {
mystrlcpy(m.getDest(i), getData(m.getKey(i)), m.getSize(i));
}
Live code available at onlinegdb.
Background
I'm working with an embedded platform with the following restrictions:
No heap
No Boost libraries
C++11 is supported
I've dealt with the following problem a few times in the past:
Create an array of class type T, where T has no default constructor
The project only recently added C++11 support, and up until now I've been using ad-hoc solutions every time I had to deal with this. Now that C++11 is available, I thought I'd try to make a more general solution.
Solution Attempt
I copied an example of std::aligned_storage to come up with the framework for my array type. The result looks like this:
#include <cstddef>
#include <new>
#include <type_traits>
template<class T, size_t N>
class Array {
// Provide aligned storage for N objects of type T
typename std::aligned_storage<sizeof(T), alignof(T)>::type data[N];
public:
// Build N objects of type T in the aligned storage using default CTORs
Array()
{
for(auto index = 0; index < N; ++index)
new(data + index) T();
}
const T& operator[](size_t pos) const
{
return *reinterpret_cast<const T*>(data + pos);
}
// Other methods consistent with std::array API go here
};
This is a basic type - Array<T,N> only compiles if T is default-constructible. I'm not very familiar with template parameter packing, but looking at some examples led me to the following:
template<typename ...Args>
Array(Args&&... args)
{
for(auto index = 0; index < N; ++index)
new(data + index) T(args...);
}
This was definitely a step in the right direction. Array<T,N> now compiles if passed arguments matching a constructor of T.
My only remaining problem is constructing an Array<T,N> where different elements in the array have different constructor arguments. I figured I could split this into two cases:
1 - User Specifies Arguments
Here's my stab at the CTOR:
template<typename U>
Array(std::initializer_list<U> initializers)
{
// Need to handle mismatch in size between arg and array
size_t index = 0;
for(auto arg : initializers) {
new(data + index) T(arg);
index++;
}
}
This seems to work fine, aside from needing to handle a dimension mismatch between the array and initializer list, but there are a number of ways to deal with that that aren't important. Here's an example:
struct Foo {
explicit Foo(int i) {}
};
void bar() {
// foos[0] == Foo(0)
// foos[1] == Foo(1)
// ..etc
Array<Foo,10> foos {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
}
2 - Arguments Follow Pattern
In my previous example, foos is initialized with an incrementing list, similar to std::iota. Ideally I'd like to support something like the following, where range(int) returns SOMETHING that can initialize the array.
// One of these should initialize foos with parameters returned by range(10)
Array<Foo,10> foosA = range(10);
Array<Foo,10> foosB {range(10)};
Array<Foo,10> foosC = {range(10)};
Array<Foo,10> foosD(range(10));
Googling has shown me that std::initializer_list isn't a "normal" container, so I don't think there's any way for me to make range(int) return a std::initializer_list depending on the function parameter.
Again, there are a few options here:
Parameters specified at run-time (function return?)
Parameters specified at compile-time (constexpr function return? templates?)
Questions
Are there any issues with this solution so far?
Does anyone have a suggestion to generate constructor parameters? I can't think of a solution at runtime or compile-time other than hard-coding an std::initializer_list, so any ideas are welcome.
If I understand your problem correctly, I've also stumbled across std::array's total inflexibility regarding element construction in favor of aggregate initialization (and the absence of a statically-allocated container with flexible element construction options). The best approach I came up with was creating a custom array-like container which accepts an iterator to construct its elements.
This is a totally flexible solution:
Works for both fixed-size and dynamic-sized containers
Can pass different or same parameters to element constructors
Can call constructors with one or multiple (tuple piecewise construction) arguments, or even different constructors for different elements (with inversion of control)
For your example it would be like:
const size_t SIZE = 10;
std::array<int, SIZE> params;
for (size_t c = 0; c < SIZE; c++) {
params[c] = c;
}
Array<Foo, SIZE> foos(iterator_construct, &params[0]); //iterator_construct is a special tag to call specific constructor
// also, we are able to pass a pointer as iterator, since it has both increment and dereference operators
Note: you can totally skip the parameters array allocation here by using a custom iterator class which calculates its value from its position on-the-fly.
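For instance, a hypothetical counting iterator along those lines (my sketch, not part of the answer's actual container) could be as small as:
#include <cstddef>
// Dereferencing yields the current position, so no params array is needed:
// element i is simply constructed from the value i.
struct counting_iterator {
    std::size_t pos = 0;
    std::size_t operator*() const { return pos; }
    counting_iterator& operator++() { ++pos; return *this; }
};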
For multiple-argument constructor that would be:
const size_t SIZE = 10;
std::array<std::tuple<int, float>, SIZE> params; // will call Foo(int, float)
for (size_t c = 0; c < SIZE; c++) {
params[c] = std::make_tuple(c, 1.0f);
}
Array<Foo, SIZE> foos(iterator_construct, piecewise_construct, &params[0]);
A concrete implementation example is a kinda big piece of code, so please let me know if you want more insight into the implementation details beyond the general idea - I will update my answer then.
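Since the implementation itself isn't shown, here is a rough sketch of what such a constructor could look like; the tag type and all names are my own guesses, not the answerer's actual code:
#include <cstddef>
#include <new>
#include <type_traits>
struct iterator_construct_t {};                  // hypothetical disambiguation tag
constexpr iterator_construct_t iterator_construct{};
template <class T, std::size_t N>
class Array {
    typename std::aligned_storage<sizeof(T), alignof(T)>::type data[N];
public:
    // Construct element i from *it, advancing the iterator once per element.
    template <class It>
    Array(iterator_construct_t, It it) {
        for (std::size_t i = 0; i < N; ++i, ++it)
            new (data + i) T(*it);
    }
    // destructor, element access and the piecewise (tuple) overload omitted
};
The piecewise variant would additionally unpack each tuple into the constructor call, e.g. via an index-sequence expansion.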
I'd use a factory lambda.
The lambda takes a pointer to where to construct and an index, and is responsible for constructing.
This makes copy/move easy to write as well, which is a good sign.
template<class T, std::size_t N>
struct my_array {
T* data() { return (T*)&buffer; }
T const* data() const { return (T const*)&buffer; }
// basic random-access container operations:
T* begin() { return data(); }
T const* begin() const { return data(); }
T* end() { return data()+N; }
T const* end() const { return data()+N; }
T& operator[](std::size_t i){ return *(begin()+i); }
T const& operator[](std::size_t i)const{ return *(begin()+i); }
// useful utility:
bool empty() const { return N==0; }
T& front() { return *begin(); }
T const& front() const { return *begin(); }
T& back() { return *(end()-1); }
T const& back() const { return *(end()-1); }
std::size_t size() const { return N; }
// construct from function object:
template<class Factory,
typename std::enable_if<!std::is_same<typename std::decay<Factory>::type, my_array>::value, int>::type = 0
>
my_array( Factory&& factory ) {
std::size_t i = 0;
try {
for(; i < N; ++i) {
factory( (void*)(data()+i), i );
}
} catch(...) {
// throw during construction. Unroll creation, and rethrow:
for(std::size_t j = 0; j < i; ++j) {
(data()+i-j-1)->~T();
}
throw;
}
}
// other constructors, in terms of above naturally:
my_array():
my_array( [](void* ptr, std::size_t) {
new(ptr) T();
} )
{}
my_array(my_array&& o):
my_array( [&](void* ptr, std::size_t i) {
new(ptr) T( std::move(o[i]) );
} )
{}
my_array(my_array const& o):
my_array( [&](void* ptr, std::size_t i) {
new(ptr) T( o[i] );
} )
{}
my_array& operator=(my_array&& o) {
for (std::size_t i = 0; i < N; ++i)
(*this)[i] = std::move(o[i]);
return *this;
}
my_array& operator=(my_array const& o) {
for (std::size_t i = 0; i < N; ++i)
(*this)[i] = o[i];
return *this;
}
private:
using storage = typename std::aligned_storage< sizeof(T)*N, alignof(T) >::type;
storage buffer;
};
it defines my_array(), but that is only compiled if you actually use it.
Supporting initializer list is relatively easy. Deciding what to do when the il isn't long enough, or too long, is hard. I think you might want:
template<class Fail>
my_array( std::initializer_list<T> il, Fail&& fail ):
my_array( [&](void* ptr, std::size_t i) {
if (i < il.size()) new(ptr) T(il.begin()[i]);
else fail(ptr, i);
} )
{}
which requires you pass in a "what to do on fail". We could default to throw by adding:
template<class WhatToThrow>
struct throw_when_called {
template<class...Args>
void operator()(Args&&...)const {
throw WhatToThrow{};
}
};
struct list_too_short:std::length_error {
list_too_short():std::length_error("list too short") {}
};
template<class Fail=throw_when_called<list_too_short>>
my_array( std::initializer_list<T> il, Fail&& fail={} ):
my_array( [&](void* ptr, std::size_t i) {
if (i < il.size()) new(ptr) T(il.begin()[i]);
else fail(ptr, i);
} )
{}
which if I wrote it right, makes a too-short initializer list cause a meaningful throw message. On your platform, you could just exit(-1) if you don't have exceptions.
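To tie this back to the range(10) example from the question, a usage sketch with the factory constructor (assuming the Foo type from the question) might be:
// Each element is constructed in place from its own index,
// giving the iota-style initialization the question asked for.
my_array<Foo, 10> foos( [](void* ptr, std::size_t i) {
    new (ptr) Foo(static_cast<int>(i));
} );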
Well, I can't find how to do this. Basically it's a variable-length union with parameters. The basic idea (written as a function):
Ex1
union Some (int le)
{
int i[le];
float f[le];
};
Ex2
union Some
{
int le;
int i[le];
float f[le];
};
Note: this doesn't work D:
Maybe there is a way to use an internal variable to set the length, but that doesn't work either.
Thanks.
No, this is not possible: le would need to be known at compile-time.
One solution would be to use a templated union:
template <int N> union Some
{
int i[N];
float f[N];
};
N, of course, is compile-time evaluable.
Another solution is the arguably more succinct
typedef std::vector<std::pair<int, float>> Some;
or a similar solution based on std::array.
Depending on your use case you could try to simulate a union.
struct Some
{
//Order is important
private:
char* pData;
public:
int* const i;
float* const f;
public:
Some(size_t len)
:pData(new char[len * (sizeof(int) < sizeof(float) ? sizeof(float) : sizeof(int))])
,i ((int*)pData)
,f ((float*)pData)
{
}
~Some()
{
delete[] pData;
}
Some(const Some&) = delete;
Some& operator=(const Some&) = delete;
};
Alternative solution using templates, unique_ptr and explicit casts:
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <iostream>
#include <memory>
#include <utility>
//max_size_of<>: a recursive template struct to evaluate the
// maximum value of the sizeof function of all types passed as
// parameter
//The recursion is done by using the "value" of another
// specialization of max_size_of<> with less parameter types
template <typename T, typename...Args>
struct max_size_of
{
static const std::size_t value = std::max(sizeof(T), max_size_of<Args...>::value);
};
//Specialication for max_size_of<> as recursion stop
template <typename T>
struct max_size_of<T>
{
static const std::size_t value = sizeof(T);
};
//dataptr_auto_cast<>: a recursive template struct that
// introduces a virtual function "char* const data_ptr()"
// and an additional explicit cast operator for a pointer
// of the first type. Due to the recursion a cast operator
// for every type passed to the struct is created.
//Attention: types are not allowed to be duplicate
//The recursion is done by inheriting from of another
// specialization of dataptr_auto_cast<> with less parameter types
template <typename T, typename...Args>
struct dataptr_auto_cast : public dataptr_auto_cast<Args...>
{
virtual char* const data_ptr() const = 0; //This is needed by the cast operator
explicit operator T* const() const { return (T*)data_ptr(); } //make it explicit to avoid unwanted side effects (manual cast needed)
};
//Specialization of dataptr_auto_cast<> as recursion stop
template <typename T>
struct dataptr_auto_cast<T>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
//union_array<>: inherits from dataptr_auto_cast<> with the same
// template parameters. Also has a static const member "blockSize"
// that indicates the size of the largest datatype passed as parameter
// "blockSize" is used to determine the space needed to store "size"
// elements.
template <typename...Args>
struct union_array : public dataptr_auto_cast<Args...>
{
static const size_t blockSize = max_size_of<Args...>::value;
private:
std::unique_ptr<char[]> m_pData; //std::unique_ptr automatically deletes the memory it points to on destruction
size_t m_size; //The size/no. of elements
public:
//Create a new array to store "size" elements
union_array(size_t size)
:m_pData(new char[size*blockSize])
,m_size(size)
{
}
//Copy constructor
union_array(const union_array<Args...>& other)
:m_pData(new char[other.m_size*blockSize])
,m_size(other.m_size)
{
memcpy(m_pData.get(), other.m_pData.get(), other.m_size*blockSize);
}
//Move constructor
union_array(union_array<Args...>&& other)
:m_pData(std::move(other.m_pData))
,m_size(std::move(other.m_size))
{
}
union_array& operator=(const union_array<Args...>& other)
{
m_pData.reset(new char[other.m_size*blockSize]);
m_size = other.m_size;
memcpy(m_pData.get(), other.m_pData.get(), other.m_size*blockSize);
return *this;
}
union_array& operator=(union_array<Args...>&& other)
{
m_pData = std::move(other.m_pData);
m_size = std::move(other.m_size);
return *this;
}
~union_array() = default;
size_t size() const
{
return m_size;
}
//Implementation of dataptr_auto_cast<>::data_ptr
virtual char* const data_ptr() const override
{
return m_pData.get();
}
};
int main()
{
auto a = union_array<int, char, float, double>(5); //Create a new union_array object with enough space to store either 5 int, 5 char, 5 float or 5 double values.
((int*)a)[3] = 3; //Use as int array
auto b = a; //copy
((int*)b)[3] = 1; //Change a value
auto c = std::move(a);// move a to c, a is invalid beyond this point
// std::cout << ((int*)a)[3] << std::endl; //This will crash as a is invalid due to the move
std::cout << ((int*)b)[3] << std::endl; //prints "1"
std::cout << ((int*)c)[3] << std::endl; //prints "3"
}
Explanation
template <typename T, typename...Args>
struct max_size_of
{
static const std::size_t value = std::max(sizeof(T), max_size_of<Args...>::value);
};
template <typename T>
struct max_size_of<T>
{
static const std::size_t value = sizeof(T);
};
max_size_of<> is used to get the largest sizeof() value of all types passed as template parameters.
Let's have a look at the simple case first.
- max_size_of<char>::value: value will be set to sizeof(char).
- max_size_of<int>::value: value will be set to sizeof(int).
- and so on
If you put in more than one type it will evaluate to the maximum of the sizeof of these types.
For 2 types this would look like this: max_size_of<char, int>::value: value will be set to std::max(sizeof(char), max_size_of<int>::value).
As described above max_size_of<int>::value is the same as sizeof(int), so max_size_of<char, int>::value is the same as std::max(sizeof(char), sizeof(int)) which is the same as sizeof(int).
template <typename T, typename...Args>
struct dataptr_auto_cast : public dataptr_auto_cast<Args...>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
template <typename T>
struct dataptr_auto_cast<T>
{
virtual char* const data_ptr() const = 0;
explicit operator T* const() const { return (T*)data_ptr(); }
};
dataptr_auto_cast<> is what we use as a simple abstract base class.
It forces us to implement a function char* const data_ptr() const in the final class (which will be union_array).
Let's just assume that the class is not abstract and use the simple version dataptr_auto_cast<T>:
The class implements an operator function that returns a pointer of the type of the passed template parameter.
dataptr_auto_cast<int> has a function explicit operator int* const() const;
The function provides access to the data provided by the derived class through the data_ptr() function and casts it to type T* const.
The const is so that the pointer isn't altered accidentally, and the explicit keyword is used to avoid unwanted implicit casts.
As you can see there are 2 versions of dataptr_auto_cast<>. One with 1 template parameter (which we just looked at) and one with multiple template parameters.
The definition is quite similar, with the exception that the multiple-parameter one inherits from dataptr_auto_cast with one (the first) template parameter fewer.
So dataptr_auto_cast<int, char> has a function explicit operator int* const() const; and inherits dataptr_auto_cast<char> which has a function explicit operator char* const() const;.
As you can see there is one cast operator function implemented with each type you pass.
There is only one exception and that is passing the same template parameter twice.
This would result in the same operator function being defined twice within the same class, which doesn't work.
For this use case, using this as a base class for the union_array, this shouldn't matter.
Now that these two are clear let's look at the actual code for union_array:
template <typename...Args>
struct union_array : public dataptr_auto_cast<Args...>
{
static const size_t blockSize = max_size_of<Args...>::value;
private:
std::unique_ptr<char[]> m_pData;
size_t m_size;
public:
//Create a new array to store "size" elements
union_array(size_t size)
:m_pData(new char[size*blockSize])
,m_size(size)
{
}
//Copy constructor
union_array(const union_array<Args...>& other)
:m_pData(new char[other.m_size*blockSize])
,m_size(other.m_size)
{
memcpy(m_pData.get(), other.m_pData.get(), other.m_size*blockSize);
}
//Move constructor
union_array(union_array<Args...>&& other)
:m_pData(std::move(other.m_pData))
,m_size(std::move(other.m_size))
{
}
union_array& operator=(const union_array<Args...>& other)
{
m_pData.reset(new char[other.m_size*blockSize]);
m_size = other.m_size;
memcpy(m_pData.get(), other.m_pData.get(), other.m_size*blockSize);
return *this;
}
union_array& operator=(union_array<Args...>&& other)
{
m_pData = std::move(other.m_pData);
m_size = std::move(other.m_size);
return *this;
}
~union_array() = default;
size_t size() const
{
return m_size;
}
virtual char* const data_ptr() const override
{
return m_pData.get();
}
};
As you can see union_array<> inherits from dataptr_auto_cast<> using the same template arguments.
So this gives us a cast operator for every type passed as a template parameter to union_array<>.
Also at the end of union_array<> you can see that the char* const data_ptr() const function is implemented (the abstract function from dataptr_auto_cast<>).
The next interesting thing to see is static const size_t blockSize, which is initialized with the maximum sizeof value of the template parameters to union_array<>.
To get this value the max_size_of is used as described above.
The class uses std::unique_ptr<char[]> as data storage, as std::unique_ptr automatically will delete the space for us, once the class is destroyed.
Also std::unique_ptr is capable of move semantics, which is used in the move assign operator function and the move constructor.
A "normal" copy assign operator function and a copy constructor are also included and copy the memory accordingly.
The class has a constructor union_array(size_t size) which takes the number of elements the union_array should be able to hold.
Multiplying this value with blockSize gives us the space needed to store exactly size elements of the largest template type.
Last but not least there is an access method to ask for the size() if needed.
C++ requires that the size of a type be known at compile time.
The size of a block of data need not be known, but all types have known sizes.
There are three ways around it.
I'll ignore the union part for now. Imagine if you wanted:
struct some (int how_many) {
int data[how_many];
};
as the union part adds complexity which can be dealt with separately.
First, instead of storing the data as part of the type, you can store pointers/references/etc to the data.
struct some {
std::vector<int> data;
explicit some( size_t how_many ):data(how_many) {};
some( some&& ) = default;
some& operator=( some&& ) = default;
some( some const& ) = default;
some& operator=( some const& ) = default;
some() = default;
~some() = default;
};
here we store the data in a std::vector -- a dynamic array. We default copy/move/construct/destruct operations (explicitly -- because it makes it clearer), and the right thing happens.
Instead of a vector we can use a unique_ptr:
struct some {
std::unique_ptr<int[]> data;
explicit some( size_t how_many ):data(new int[how_many]) {};
some( some&& ) = default;
some& operator=( some&& ) = default;
some() = default;
~some() = default;
};
this blocks copying of the structure, but the structure goes from being size of 3 pointers to being size of 1 in a typical std implementation. We lose the ability to easily resize after the fact, and copy without writing the code ourselves.
The next approach is to template it.
template<std::size_t N>
struct some {
int data[N];
};
this, however, requires that the size of the structure be known at compile-time, and some<2> and some<3> are 'unrelated types' (barring template pattern matching). So it has downsides.
A final approach is C-like. Here we rely on the fact that data can be variable in size, even if types are not.
struct some {
int data[1]; // or `0` in some compilers as an extension
};
some* make_some( std::size_t n ) {
Assert(n >= 1); // unless we did `data[0]` above
char* buff = new char[(n-1)*sizeof(int) + sizeof(some)]; // note: alignment issues on some platforms?
return new(buff) some(); // placement new
};
where we allocate a buffer for some of variable size. Access to the buffer via data[13] is practically legal, and probably actually so as well.
This technique is used in C to create structures of variable size.
For the union part, you'll want to create a buffer of char with the right size std::max(sizeof(float), sizeof(int))*N, and expose functions:
char* data(); // returns a pointer to the start of the buffer
int* i() { return reinterpret_cast<int*>(data()); }
float* f() { return reinterpret_cast<float*>(data()); }
you may also need to properly initialize the data as the proper type; in theory, a char buffer of '\0's may not correspond to defined float values or ints that are zero.
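A small sketch of that union part (names are illustrative):
#include <algorithm>
#include <cstddef>
// Variable-length "union" of int and float arrays sharing a single char buffer.
struct some_union {
    explicit some_union(std::size_t n)
        : buf(new char[std::max(sizeof(int), sizeof(float)) * n]) {}
    ~some_union() { delete[] buf; }
    char* data() { return buf; }                              // start of the buffer
    int* i() { return reinterpret_cast<int*>(data()); }
    float* f() { return reinterpret_cast<float*>(data()); }
private:
    some_union(some_union const&) = delete;
    some_union& operator=(some_union const&) = delete;
    char* buf;
};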
I would like to suggest a different approach: Instead of tying the number of elements to the union, tie it outside:
union Some
{
int i;
float f;
};
Some *get_Some(int le) { return new Some[le]; }
Don't forget to delete[] the return value of get_Some... Or use smart pointers:
std::unique_ptr<Some[]> get_Some(int le)
{ return std::make_unique<Some[]>(le); }
You can even create a Some_Manager:
struct Some_Manager
{
union Some
{
int i;
float f;
};
Some_Manager(int le) :
m_le{le},
m_some{std::make_unique<Some[]>(le)}
{}
// ... getters and setters...
int count() const { return m_le; }
Some &operator[](int le) { return m_some[le]; }
private:
int m_le{};
std::unique_ptr<Some[]> m_some;
};
Take a look at the Live example.
It's not possible to declare a structure with dynamic sizes as you are trying to do; the size must be known at compile time, or you will have to use higher-level abstractions to manage a dynamic pool of memory at run time.
Also, in your second example, you include le in the union. If what you were trying to do were possible, it would cause le to overlap with the first value of i and f.
As was mentioned before, you could do this with templating if the size is known at compile time:
#include <cstdlib>
template<size_t Sz>
union U {
int i[Sz];
float f[Sz];
};
int main() {
U<30> u;
u.i[0] = 0;
u.f[1] = 1.0;
}
http://ideone.com/gG9egD
If you want dynamic size, you're beginning to reach the realm where it would be better to use something like std::vector.
#include <vector>
#include <iostream>
union U {
int i;
float f;
};
int main() {
std::vector<U> vec;
vec.resize(32);
vec[0].i = 0;
vec[1].f = 42.0;
// But there is no way to tell whether a given element is
// supposed to be an int or a float:
// vec[1] was populated via the 'f' option of the union:
std::cout << "vec[1].i = " << vec[1].i << '\n';
}
http://ideone.com/gjTCuZ
I've been trying to create a variable length multidimensional array. As I understand, you can't create variable length arrays on the stack, but you can create 1D variable length arrays in C++ using dynamic allocation. Correct me if this is a compiler extension, but it seems to work fine on clang and gcc with --pedantic set.
int size = 10;
int *ary = new int[size]();
I tried to extend the concept to multidimensional arrays. Here are my results. The problem with possiblity 1 and 2 is that they require a constexpr and do not work with variable sizes. Is it possible to make either of them accept a variable as its size? I put possibility 3 as I am aware of it, but it lacks [][] access, which is what I'm looking for.
constexpr int constSize = 10;
//Possibility 1: Only works in C++11
//Creates CONTIGUOUS 2D array equivalent to array[n*n], but with [][] access
int (*ary1)[constSize] = new int[constSize][constSize]();
delete [] ary1;
//Possibility 2:
//Really horrible as it does NOT create a contiguous 2D array
//Instead creates n seperate arrays that are each themselves contiguous
//Also requires a lot of deletes, quite messy
int **ary2 = new int*[constSize];
for (int i = 0; i < constSize; ++i)
ary2[i] = new int[constSize];
for (int i = 0; i < constSize; ++i)
delete [] ary2[i];
delete [] ary2;
//Possibility 3:
//This DOES work with non-constexpr variable
//However it does not offer [][] access, need to access element using ary[i*n+j]
int *ary3 = new int[size*size];
delete [] ary3;
This will create a dynamically allocated 2D variable-length array, with dimensions w and h:
std::vector<std::vector<int>> ary4(w, std::vector<int>(h));
It can be accessed with [][]:
ary4[x][y] = 0;
However, it's not contiguously allocated. To get a contiguous array, here's one solution:
template<typename E>
class Contiguous2DArray
{
public:
Contiguous2DArray(std::size_t width, std::size_t height)
: array(width * height), width(width) {}
E& operator()(std::size_t x, std::size_t y)
{ return array[x + width * y]; }
private:
std::vector<E> array;
std::size_t width;
};
It can be used like this:
Contiguous2DArray<int> ary5(w, h);
ary5(x, y) = 0;
The number of dimensions is fixed, because the type of what [] returns changes based on the number of dimensions. Access is through both repeated [] and (...). The first mimics C-style array lookup. The (...) syntax must be complete (it must pass N args to an N dimensional array). There is a modest efficiency cost to support both.
Uses C++14 features to save on verbosity. None are essential.
Using an n_dim_array with 0 dimensions will give bad results.
#include <array>
#include <cstddef>
#include <type_traits>
#include <vector>
template<class T, size_t N>
struct array_index {
size_t const* dimensions;
size_t offset;
T* buffer;
array_index<T,N-1> operator[](size_t i)&&{
return {dimensions+1, (offset+i)*dimensions[1], buffer}; // stride by the next (inner) dimension for row-major layout
}
template<class...Is, std::enable_if_t<sizeof...(Is) == N, int> = 0>
T& operator()(size_t i, Is...is)&&{
return std::move(*this)[i](is...);
}
};
template<class T>
struct array_index<T,0> {
size_t const* dimension;
size_t offset;
T* buffer;
T& operator[](size_t i)&&{
return buffer[i+offset];
}
T& operator()(size_t i)&&{
return std::move(*this)[i];
}
};
template<class T, size_t N>
struct n_dim_array {
template<class...Szs, class=std::enable_if_t<sizeof...(Szs)==N>>
explicit n_dim_array( Szs... sizes ):
szs{ { static_cast<size_t>(sizes)... } }
{
size_t sz = 1;
for( size_t s : szs )
sz *= s;
buffer.resize(sz);
}
n_dim_array( n_dim_array const& o ) = default;
n_dim_array& operator=( n_dim_array const& o ) = default;
using top_level_index = array_index<T,N-1>;
top_level_index index(){return {szs.data(),0,buffer.data()};}
auto operator[]( size_t i ) {
return index()[i];
}
using const_top_level_index = array_index<const T,N-1>;
const_top_level_index index()const{return {szs.data(),0,buffer.data()};}
auto operator[]( size_t i ) const {
return index()[i];
}
template<class...Is,class=std::enable_if_t<sizeof...(Is)==N>>
T& operator()(Is...is){
return index()(is...);
}
template<class...Is,class=std::enable_if_t<sizeof...(Is)==N>>
T const& operator()(Is...is) const {
return index()(is...);
}
private:
n_dim_array() = delete;
std::array<size_t,N> szs;
std::vector<T> buffer;
};
live example
Does not support for(:) loop iteration. Writing an iterator isn't that hard: I'd do it in array_index.
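A short usage sketch of the two access styles (my own illustration):
int main() {
    n_dim_array<int, 2> grid(3, 4);   // 3 x 4 elements in one contiguous buffer
    grid[1][2] = 42;                  // C-style chained [] access
    grid(2, 3) = 7;                   // (...) access; must pass all N indices
    return grid[1][2] + grid(2, 3);   // 49
}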