Recasting a container of void pointers - c++

Short version
Can I reinterpret_cast a std::vector<void*>* to a std::vector<double*>*?
What about with other STL containers?
Long version
I have a function to recast a vector of void pointers to a datatype specified by a template argument:
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
std::vector<T*> y(x.size());
std::transform(x.begin(), x.end(), y.begin(),
[](void *a) { return static_cast<T*>(a); } );
return y;
}
But I was thinking that copying the vector contents isn't really necessary, since we're really just reinterpreting what's being pointed to.
After some tinkering, I came up with this:
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
return std::vector<T*>(std::move(*xPtr));
}
So my questions are:
Is it safe to reinterpret_cast an entire vector like this?
What if it was a different kind of container (like a std::list or std::map)? To be clear, I mean casting a std::list<void*> to std::list<T*>, not casting between STL container types.
I'm still trying to wrap my head around move semantics. Am I doing it right?
And one follow-up question: What would be the best way to generate a const version without code duplication? i.e. to define
std::vector<T const*> recastPtrs(std::vector<void const*> const&);
std::vector<T const*> recastPtrs(std::vector<void const*>&&);
MWE
#include <vector>
#include <algorithm>
#include <iostream>
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
std::vector<T*> y(x.size());
std::transform(x.begin(), x.end(), y.begin(),
[](void *a) { return static_cast<T*>(a); } );
return y;
}
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
return std::vector<T*>(std::move(*xPtr));
}
template <typename T>
void printVectorAddr(std::vector<T> const& vec) {
std::cout<<" vector object at "<<&vec<<", data()="<<vec.data()<<std::endl;
}
int main(void) {
std::cout<<"Original void pointers"<<std::endl;
std::vector<void*> voidPtrs(100);
printVectorAddr(voidPtrs);
std::cout<<"Elementwise static_cast"<<std::endl;
auto dblPtrs = recastPtrs<double>(voidPtrs);
printVectorAddr(dblPtrs);
std::cout<<"reintepret_cast entire vector, then move ctor"<<std::endl;
auto dblPtrs2 = recastPtrs<double>(std::move(voidPtrs));
printVectorAddr(dblPtrs2);
}
Example output:
Original void pointers
vector object at 0x7ffe230b1cb0, data()=0x21de030
Elementwise static_cast
vector object at 0x7ffe230b1cd0, data()=0x21de360
reintepret_cast entire vector, then move ctor
vector object at 0x7ffe230b1cf0, data()=0x21de030
Note that the reinterpret_cast version reuses the underlying data structure.
Previously-asked questions that didn't seem relevant
These are the questions that come up when I tried to search this:
reinterpret_cast vector of class A to vector of class B
reinterpret_cast vector of derived class to vector of base class
reinterpret_cast-ing vector of one type to a vector of another type which is of the same type
And the answer to these was a unanimous NO, with reference to the strict aliasing rule. But I figure that doesn't apply to my case, since the vector being recast is an rvalue, so there's no opportunity for aliasing.
Why I'm trying to do this
I'm interfacing with a MATLAB library that gives me data pointers as void* along with a variable indicating the datatype. I have one function that validates the inputs and collects these pointers into a vector:
void parseInputs(int argc, mxArray* inputs[], std::vector<void*> &dataPtrs, mxClassID &numericType);
I can't templatize this part since the type is not known until runtime. On the other side, I have numeric routines to operate on vectors of a known datatype:
template <typename T>
void processData(std::vector<T*> const& dataPtrs);
So I'm just trying to connect one to the other:
void processData(std::vector<void*>&& voidPtrs, mxClassID numericType) {
switch (numericType) {
case mxDOUBLE_CLASS:
processData(recastPtrs<double>(std::move(voidPtrs)));
break;
case mxSINGLE_CLASS:
processData(recastPtrs<float>(std::move(voidPtrs)));
break;
default:
assert(0 && "Unsupported datatype");
break;
}
}

Given the comment that you're receiving the void * from a C library (something like malloc), it seems like we can probably narrow the problem down quite a bit.
In particular, I'd guess you're really dealing with something that's more like an array_view than a vector. That is, you want something that lets you access some data cleanly. You might change individual items in that collection, but you'll never change the collection as a whole (e.g., you won't try to do a push_back that could need to expand the memory allocation).
For such a case, you can pretty easily create a wrapper of your own that gives you vector-like access to the data--defines an iterator type, has a begin() and end() (and if you want, the others like rbegin()/rend(), cbegin()/cend() and crbegin()/crend()), as well as an at() that does range-checked indexing, and so on.
So a fairly minimal version could look something like this:
#pragma once
#include <cstddef>
#include <stdexcept>
#include <cstdlib>
#include <functional>
#include <iterator>
template <class T> // note: no allocator, since we don't do allocation
class array_view {
T *data;
std::size_t size_;
public:
array_view(void *data, std::size_t size_) : data(static_cast<T *>(data)), size_(size_) {}
T &operator[](std::size_t index) { return data[index]; }
T &at(std::size_t index) {
if (index >= size_) throw std::out_of_range("Index out of range");
return data[index];
}
std::size_t size() const { return size_; }
typedef T *iterator;
typedef T const *const_iterator;
typedef T value_type;
typedef T &reference;
iterator begin() { return data; }
iterator end() { return data + size_; }
const_iterator cbegin() const { return data; }
const_iterator cend() const { return data + size_; }
class reverse_iterator {
T *it;
public:
reverse_iterator(T *it) : it(it) {}
using iterator_category = std::random_access_iterator_tag;
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T *;
using reference = T &;
reverse_iterator &operator++() {
--it;
return *this;
}
reverse_iterator &operator--() {
++it;
return *this;
}
reverse_iterator operator+(size_t size) const {
return reverse_iterator(it - size);
}
reverse_iterator operator-(size_t size) const {
return reverse_iterator(it + size);
}
difference_type operator-(reverse_iterator const &r) const {
return r.it - it; // reversed: moving forward decreases the underlying pointer
}
bool operator==(reverse_iterator const &r) const { return it == r.it; }
bool operator!=(reverse_iterator const &r) const { return it != r.it; }
bool operator<(reverse_iterator const &r) const { return std::less<T*>()(r.it, it); }
bool operator>(reverse_iterator const &r) const { return std::less<T*>()(it, r.it); }
T &operator *() { return *(it-1); }
};
reverse_iterator rbegin() { return data + size_; }
reverse_iterator rend() { return data; }
};
I've tried to show enough that it should be fairly apparent how to add most of the missing functionality (e.g., crbegin()/crend()), but I haven't worked really hard at including everything here, since much of what's left is more repetitive and tedious than educational.
This is enough to use the array_view in most of the typical vector-like ways. For example:
#include "array_view"
#include <iostream>
#include <iterator>
int main() {
void *raw = malloc(16 * sizeof(int));
array_view<int> data(raw, 16);
std::cout << "Range based:\n";
for (auto & i : data)
i = rand();
for (auto const &i : data)
std::cout << i << '\n';
std::cout << "\niterator-based, reverse:\n";
auto end = data.rend();
for (auto d = data.rbegin(); d != end; ++d)
std::cout << *d << '\n';
std::cout << "Forward, counted:\n";
for (std::size_t i=0; i<data.size(); i++) {
data[i] += 10;
std::cout << data[i] << '\n';
}
free(raw); // array_view does not own the buffer, so release it here
}
Note that this doesn't attempt to deal with copy/move construction at all, nor with destruction. At least as I've formulated it, the array_view is a non-owning view into some existing data. It's up to you (or at least something outside of the array_view) to destroy the data when appropriate. Since we're not destroying the data, we can use the compiler-generated copy and move constructors without any problem. We won't get a double-delete from doing a shallow copy of the pointer, because we don't do any delete when the array_view is destroyed.
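To make the shallow-copy point concrete, here is a minimal sketch. It uses a stripped-down, hypothetical tiny_view rather than the array_view above (the names tiny_view, double_all and tiny_view_example are mine, purely for illustration). Copying the view copies only the pointer and the size, so writes through the copy land in the original buffer, and nothing is freed when either view goes away.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal non-owning view (illustration only, not the class above).
template <class T>
struct tiny_view {
    T *data;
    std::size_t size;
    T *begin() const { return data; }
    T *end() const { return data + size; }
};

// Takes the view by value: only the pointer and the size are copied.
inline void double_all(tiny_view<int> v) {
    for (int &x : v) x *= 2;
}

inline void tiny_view_example() {
    int buffer[3] = {1, 2, 3};
    tiny_view<int> a{buffer, 3};
    tiny_view<int> b = a;   // shallow copy: both views alias buffer
    double_all(b);          // writes go through to buffer
    assert(a.data == b.data);
    assert(buffer[0] == 2 && buffer[1] == 4 && buffer[2] == 6);
}   // nothing is freed here: neither view owns buffer
```

Because the view has no destructor logic, the compiler-generated copy and move operations are all it needs.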

No, you cannot do anything like this in Standard C++.
The strict aliasing rule says that to access an object of type T, you must use an expression of type T; with a very short list of exceptions to that.
Accessing a double * via a void * expression is not such an exception; let alone a vector of each. Nor is it an exception if you accessed the object of type T via an rvalue.
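For what it's worth, the element-wise static_cast from the question stays within the rules, and one template can cover both the const and the non-const case. The following is only a sketch under my own assumptions: match_const_t is a hypothetical helper name, and the function still copies the pointers (it does not reuse the vector's storage).

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: T with the const-ness of Void copied onto it.
template <typename T, typename Void>
using match_const_t =
    typename std::conditional<std::is_const<Void>::value, T const, T>::type;

// T is given explicitly; Void is deduced as `void` or `void const`.
template <typename T, typename Void>
std::vector<match_const_t<T, Void> *> recastPtrs(std::vector<Void *> const &x) {
    static_assert(std::is_void<Void>::value,
                  "expects a vector of (const) void pointers");
    typedef match_const_t<T, Void> *target_ptr;
    std::vector<target_ptr> y(x.size());
    // Each pointer is converted individually; no vector is reinterpreted.
    std::transform(x.begin(), x.end(), y.begin(),
                   [](Void *a) { return static_cast<target_ptr>(a); });
    return y;
}

inline void recast_example() {
    double d = 1.5;
    std::vector<void *> ptrs{&d};
    std::vector<void const *> constPtrs{&d};
    std::vector<double *> a = recastPtrs<double>(ptrs);
    std::vector<double const *> b = recastPtrs<double>(constPtrs);
    assert(a[0] == &d && b[0] == &d);
}
```

The copy costs one pointer per element, which is usually negligible next to the numeric work done on the data afterwards.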

Related

Is there a standard C++ class for arrays with fixed run-time-determined size?

I need a container whose size is known at run time and never needs to change. std::unique_ptr<T[]> would be useful, but it has no encapsulated size member. At the same time, std::array is for compile-time sizes only. Hence I need some combination of these classes with no or minimal overhead.
Is there a standard class for my needs, maybe something in upcoming C++20?
Use std::vector. This is the class for runtime-sized arrays in the STL.
It lets you resize it or push elements into it:
auto vec = std::vector<int>{};
vec.resize(10); // now vector has 10 ints 0 initialized
vec.push_back(1); // now 11 ints
Some problems stated in the comments:
vector has an excessive interface
So is std::array. You have more than 20 functions in std::array, including operators.
Just don't use what you don't need. You don't pay for the function you won't use. It won't even increase your binary size.
vector will force initialize items on resize. As far as I know, it is not allowed to use operator[] for indexes >= size (despite calling reserve).
This is not how it is meant to be used. After reserving, you should then grow the vector with resize or by pushing elements into it. You say vector will force-initialize its elements, but the problem is that you cannot call operator= on unconstructed objects, including ints.
Here's an example using reserve:
auto vec = std::vector<int>{};
vec.reserve(10); // capacity of at least 10
vec.resize(3); // Contains 3 zero initialized ints.
// If you don't want to `force` initialize elements
// you should push or emplace element into it:
vec.emplace_back(1); // no reallocation for the three operations.
vec.emplace_back(2); // no default initialize either.
vec.emplace_back(3); // ints constructed with arguments in emplace_back
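The difference between reserve and resize can also be checked directly through size(), capacity() and data(). This is just a small sketch of my own (the function name is made up), relying only on the guarantee that no reallocation happens while the size stays within the reserved capacity:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

inline void reserve_vs_resize_example() {
    std::vector<int> vec;
    vec.reserve(10);                 // storage only: no elements exist yet
    assert(vec.size() == 0);
    assert(vec.capacity() >= 10);

    vec.push_back(0);                // first element goes into the reserved storage
    int const *storage = vec.data();
    for (int i = 1; i < 10; ++i)
        vec.push_back(i);            // stays within the reserved capacity
    assert(vec.size() == 10);
    assert(vec.data() == storage);   // so no reallocation happened

    vec.resize(12);                  // the two new elements are value-initialized
    assert(vec[10] == 0 && vec[11] == 0);
}
```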
Keep in mind that for this kind of allocation and usage pattern, there is a good chance that the compiler elides the construction of the vector's elements entirely. There may be no overhead in your code.
I would suggest measuring and profiling if your code is subject to a very precise performance specification. If you have no such specification, this is most likely premature optimization. The cost of the memory allocation far outweighs the time it takes to initialize the elements one by one.
Other parts of your program may be refactored to gain much more performance than trivial initialization can offer you. In fact, getting in the way of it may hinder optimization and make your program slower.
Allocate the memory using an std::unique_ptr<T[]> like you suggested, but to use it - construct an std::span (in C++20; gsl::span before C++20) from the raw pointer and the number of elements, and pass the span around (by value; spans are reference-types, sort of). The span will give you all the bells and whistles of a container: size, iterators, ranged-for, the works.
#include <cstddef>
#include <memory>
#include <span>
// or:
// #include <gsl/span>
int main() {
// ... etc. ...
{
std::size_t size = 1000000;
auto uptr { std::make_unique<double[]>(size) };
std::span<double> my_span { uptr.get(), size };
do_stuff_with_the_doubles(my_span); // your own function taking a std::span<double>
}
// ... etc. ...
}
For more information about spans, see:
What is a "span" and when should I use one?
Use std::vector. If you want to remove the possibility of changing its size, wrap it.
template <typename T>
struct single_allocation_vector : private std::vector<T>, public gsl::span<T>
{
single_allocation_vector(size_t n, T t = {}) : std::vector<T>(n, t), gsl::span<T>(std::vector<T>::data(), n) {}
// other constructors to taste
};
Something called std::dynarray was proposed for C++14:
std::dynarray is a sequence container that encapsulates arrays with a size that is fixed at construction and does not change throughout the lifetime of the object.
But there were too many issues and it didn't become part of the standard.
So there exists no such container currently in the STL. You can keep using vectors with an initial size.
Unfortunately, no new containers were added in C++ 20 (at least none that I'd be aware of). I would agree, however, that such a container would be very useful. While just using std::vector<T> with reserve() and emplace_back() will usually do OK, it does often generate inferior code compared to using a plain new T[] as the use of emplace_back() seems to inhibit vectorization. If we use an std::vector<T> with an initial size instead, compilers seem to have trouble optimizing away the value initialization of elements, even if the entire vector is going to be overwritten right afterwards. Play with an example here.
You could use, for example, a wrapper like
template <typename T>
struct default_init_wrapper
{
T t;
public:
default_init_wrapper() {}
template <typename... Args>
default_init_wrapper(Args&&... args) : t(std::forward<Args>(args)...) {}
operator const T&() const { return t; }
operator T&() { return t; }
};
and
std::vector<default_init_wrapper<T>> buffer(N);
to avoid the useless initialization for trivial types. Doing so seems to lead to code about as good as the plain std::unique_ptr version. I wouldn't recommend this though, as it's quite ugly and cumbersome to use, since you then have to work with a vector of wrapped elements.
I guess the best option for now is to just roll your own container. This may serve as a starting point (beware of bugs):
template <typename T>
class dynamic_array
{
public:
using value_type = T;
using reference = T&;
using const_reference = const T&;
using pointer = T*;
using const_pointer = const T*;
using iterator = T*;
using const_iterator = const T*;
using reverse_iterator = std::reverse_iterator<iterator>;
using const_reverse_iterator = std::reverse_iterator<const_iterator>;
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
private:
std::unique_ptr<T[]> elements;
size_type num_elements = 0U;
friend void swap(dynamic_array& a, dynamic_array& b)
{
using std::swap;
swap(a.elements, b.elements);
swap(a.num_elements, b.num_elements);
}
static auto alloc(size_type size)
{
return std::unique_ptr<T[]> { new T[size] };
}
void checkRange(size_type i) const
{
if (!(i < num_elements))
throw std::out_of_range("dynamic_array index out of range");
}
public:
const_pointer data() const { return &elements[0]; }
pointer data() { return &elements[0]; }
const_iterator begin() const { return data(); }
iterator begin() { return data(); }
const_iterator end() const { return data() + num_elements; }
iterator end() { return data() + num_elements; }
const_reverse_iterator rbegin() const { return std::make_reverse_iterator(end()); }
reverse_iterator rbegin() { return std::make_reverse_iterator(end()); }
const_reverse_iterator rend() const { return std::make_reverse_iterator(begin()); }
reverse_iterator rend() { return std::make_reverse_iterator(begin()); }
const_reference operator [](size_type i) const { return elements[i]; }
reference operator [](size_type i) { return elements[i]; }
const_reference at(size_type i) const { return checkRange(i), elements[i]; }
reference at(size_type i) { return checkRange(i), elements[i]; }
size_type size() const { return num_elements; }
constexpr size_type max_size() const { return std::numeric_limits<size_type>::max(); }
bool empty() const { return std::size(*this) == 0U; }
dynamic_array() = default;
dynamic_array(size_type size)
: elements(alloc(size)), num_elements(size)
{
}
dynamic_array(std::initializer_list<T> elements)
: elements(alloc(std::size(elements))), num_elements(std::size(elements))
{
std::copy(std::begin(elements), std::end(elements), std::begin(*this));
}
dynamic_array(const dynamic_array& arr)
{
auto new_elements = alloc(std::size(arr));
std::copy(std::begin(arr), std::end(arr), &new_elements[0]);
elements = std::move(new_elements);
num_elements = std::size(arr);
}
dynamic_array(dynamic_array&&) = default;
dynamic_array& operator =(const dynamic_array& arr)
{
return *this = dynamic_array(arr);
}
dynamic_array& operator =(dynamic_array&&) = default;
void swap(dynamic_array& arr)
{
void swap(dynamic_array& a, dynamic_array& b);
swap(*this, arr);
}
friend bool operator ==(const dynamic_array& a, const dynamic_array& b)
{
return std::equal(std::begin(a), std::end(a), std::begin(b), std::end(b));
}
friend bool operator !=(const dynamic_array& a, const dynamic_array& b)
{
return !(a == b);
}
};

How to implement operator-> for an iterator that constructs its values on-demand?

I have a C++ class that acts like a container: it has size() and operator[] member functions. The values stored "in" the container are std::tuple objects. However, the container doesn't actually hold the tuples in memory; instead, it constructs them on-demand based on underlying data stored in a different form.
std::tuple<int, int, int>
MyContainer::operator[](std::size_t n) const {
// Example: draw corresponding elements from parallel arrays
return { underlying_data_a[n], underlying_data_b[n], underlying_data_c[n] };
}
Hence, the return type of operator[] is a temporary object, not a reference. (This means it's not an lvalue, so the container is read-only; that's OK.)
Now I'm writing an iterator class that can be used to traverse the tuples in this container. I'd like to model RandomAccessIterator, which depends on InputIterator, but InputIterator requires support for the expression i->m (where i is an iterator instance), and as far as I can tell, an operator-> function is required to return a pointer.
Naturally, I can't return a pointer to a temporary tuple that's constructed on-demand. One possibility that comes to mind is to put a tuple instance into the iterator as a member variable, and use it to store a copy of whichever value the iterator is currently positioned on:
class Iterator {
private:
MyContainer *container;
std::size_t current_index;
// Copy of (*container)[current_index]
std::tuple<int, int, int> current_value;
// ...
};
However, updating the stored value will require the iterator to check whether its current index is less than the container's size, so that a past-the-end iterator doesn't cause undefined behavior by accessing past the end of the underlying arrays. That adds (a small amount of) runtime overhead — not enough to make the solution impractical, of course, but it feels a little inelegant. The iterator shouldn't really need to store anything but a pointer to the container it's iterating and the current position within it.
Is there a clean, well-established way to support operator-> for iterator types that construct their values on-demand? How would other developers do this sort of thing?
(Note that I don't really need to support operator-> at all — I'm implementing the iterator mainly so that the container can be traversed with a C++11 "range for" loop, and std::tuple doesn't have any members that one would typically want to access via -> anyway. But I'd like to model the iterator concepts properly nonetheless; it feels like I'm cutting corners otherwise. Or should I just not bother?)
template<class T>
struct pseudo_ptr {
T t;
T operator*()&&{return t;}
T* operator->(){ return &t; }
};
then
struct bar { int x,y; };
struct bar_iterator:std::iterator< blah, blah >{
// ...
pseudo_ptr<bar> operator->() const { return {**this}; }
// ...
};
This relies on how -> works.
ptr->b for pointer ptr is simply (*ptr).b.
Otherwise it is defined as (ptr.operator->())->b. This evaluates recursively if operator-> does not return a pointer.
The pseudo_ptr<T> above gives you a wrapper around a copy of T.
Note, however, that lifetime extension doesn't really work. The result is fragile.
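Here is a complete sketch of that proxy technique, under my own assumptions: point, arrow_proxy and square_iterator are hypothetical names, and the iterator simply generates point{i, i*i} on demand. The proxy returned by operator-> is a temporary that lives until the end of the full expression, which is long enough for it->x but not for keeping the resulting pointer around.

```cpp
#include <cassert>
#include <cstddef>

struct point { int x, y; };

// Holds a copy of the on-demand value so operator-> has something to point at.
template <class T>
struct arrow_proxy {
    T value;
    T *operator->() { return &value; }
};

// Hypothetical iterator that generates point{i, i*i} on demand.
class square_iterator {
    int i_;
public:
    explicit square_iterator(int i) : i_(i) {}
    point operator*() const { return point{i_, i_ * i_}; }
    // Returns a non-pointer, so the compiler applies -> again to the proxy.
    arrow_proxy<point> operator->() const { return arrow_proxy<point>{**this}; }
    square_iterator &operator++() { ++i_; return *this; }
    bool operator!=(square_iterator const &o) const { return i_ != o.i_; }
};

inline void arrow_proxy_example() {
    square_iterator it(3);
    assert(it->x == 3 && it->y == 9);  // proxy lives until the end of each expression
    ++it;
    assert((*it).y == 16);
}
```

The iterator itself stores nothing but its position, which is what the question was after.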
Here's an example relying on the fact that operator-> is applied repeatedly until a pointer is returned. We make Iterator::operator-> return the Contained object as a temporary. This causes the compiler to reapply operator->. We then make Contained::operator-> simply return a pointer to itself. Note that if we don't want to put operator-> in the Contained on-the-fly object, we can wrap it in a helper object that returns a pointer to the internal Contained object.
#include <cstddef>
#include <iostream>
class Contained {
public:
Contained(int a_, int b_) : a(a_), b(b_) {}
const Contained *operator->() {
return this;
}
const int a, b;
};
class MyContainer {
public:
class Iterator {
friend class MyContainer;
public:
friend bool operator!=(const Iterator &it1, const Iterator &it2) {
return it1.current_index != it2.current_index;
}
private:
Iterator(const MyContainer *c, std::size_t ind) : container(c), current_index(ind) {}
public:
Iterator &operator++() {
++current_index;
return *this;
}
// -> is reapplied, since this returns a non-pointer.
Contained operator->() {
return Contained(container->underlying_data_a[current_index], container->underlying_data_b[current_index]);
}
Contained operator*() {
return Contained(container->underlying_data_a[current_index], container->underlying_data_b[current_index]);
}
private:
const MyContainer *const container;
std::size_t current_index;
};
public:
MyContainer() {
for (int i = 0; i < 10; i++) {
underlying_data_a[i] = underlying_data_b[i] = i;
}
}
Iterator begin() const {
return Iterator(this, 0);
}
Iterator end() const {
return Iterator(this, 10);
}
private:
int underlying_data_a[10];
int underlying_data_b[10];
};
int
main() {
MyContainer c;
for (const auto &e : c) {
std::cout << e.a << ", " << e.b << std::endl;
}
}

Any alternative to std::dynarray presently available?

C++11 gave us great std::array, which requires size to be known at compile time:
std::array<int, 3> myarray = {1, 2, 3};
Now, I happen to have some old short* buffers to wrap, whose size will only be known at runtime.
C++14 will define std::dynarray to cover this case, but dynarray is not available yet in GCC 4.7 nor in Clang 3.2.
So, does anyone know a container which is comparable to std::array (in terms of efficiency) but does not require the size to be specified at compile time? I suspect Boost has something ready for me, although I couldn't find anything.
I think std::vector is what you're looking for before dynarray becomes available. Just use the allocating constructor or reserve and you'll avoid reallocation overhead.
I’ll put in a vote for std::unique_ptr<short[]>(new short[n]) if you don’t need the range-checked access provided by std::dynarray<T>::at(). You can even use an initializer list:
#include <iostream>
#include <memory>
int main(int argc, char** argv) {
const size_t n = 3;
std::unique_ptr<short[]> myarray(new short[n]{ 1, 2, 3 });
for (size_t i = 0; i < n; ++i)
std::cout << myarray[i] << '\n';
}
You could (ab)use a std::valarray<short>.
#include <cstdlib>
#include <iostream>
#include <valarray>
int main() {
short* raw_array = (short*) malloc(12 * sizeof(short));
size_t length = 12;
for (size_t i = 0; i < length; ++ i) {
raw_array[i] = (short) i;
}
// ...
std::valarray<short> dyn_array (raw_array, length);
for (short elem : dyn_array) {
std::cout << elem << std::endl;
}
// ...
free(raw_array);
}
valarray supports most features of a dynarray, except:
allocator
reverse iterator
.at()
.data()
Note that the standard (as of n3690) does not require valarray storage to be contiguous, although there's no reason not to do so :).
(For some implementation detail, in libstdc++ it is implemented as a (length, data) pair, and in libc++ it is implemented as (begin, end).)
A buffer and a size, plus some basic methods, give you most of what you want.
Lots of boilerplate, but something like this:
template<typename T>
struct fixed_buffer {
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* iterator;
typedef const T* const_iterator;
typedef std::reverse_iterator<iterator> reverse_iterator;
typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
typedef size_t size_type;
typedef ptrdiff_t difference_type;
std::size_t length;
std::unique_ptr<T[]> buffer;
std::size_t size() const { return length; }
iterator begin() { return data(); }
const_iterator begin() const { return data(); }
const_iterator cbegin() const { return data(); }
iterator end() { return data()+size(); }
const_iterator end() const { return data()+size(); }
const_iterator cend() const { return data()+size(); }
reverse_iterator rbegin() { return {end()}; }
const_reverse_iterator rbegin() const { return {end()}; }
const_reverse_iterator crbegin() const { return {end()}; }
reverse_iterator rend() { return {begin()}; }
const_reverse_iterator rend() const { return {begin()}; }
const_reverse_iterator crend() const { return {begin()}; }
T& front() { return *begin(); }
T const& front() const { return *begin(); }
T& back() { return *(begin()+size()-1); }
T const& back() const { return *(begin()+size()-1); }
T* data() { return buffer.get(); }
T const* data() const { return buffer.get(); }
T& operator[]( std::size_t i ) { return data()[i]; }
T const& operator[]( std::size_t i ) const { return data()[i]; }
fixed_buffer& operator=(fixed_buffer &&) = default;
fixed_buffer(fixed_buffer &&) = default;
explicit fixed_buffer(std::size_t N):length(N), buffer( new T[length] ) {}
fixed_buffer():length(0), buffer() {}
fixed_buffer(fixed_buffer const& o):length(o.length), buffer( new T[length] )
{
std::copy( o.begin(), o.end(), begin() );
}
fixed_buffer& operator=(fixed_buffer const& o)
{
std::unique_ptr<T[]> tmp( new T[o.length] );
std::copy( o.begin(), o.end(), tmp.get() );
length = o.length;
buffer = std::move(tmp);
return *this;
}
};
at() is missing, as are allocators.
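If you want range-checked access without touching the class, one option is a free function. This is only a sketch: checked_at is a hypothetical name, and it assumes nothing beyond size() and operator[], which fixed_buffer provides. The example uses a std::vector as a stand-in so that it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical free-function stand-in for a member at().
// Works with any type that offers size() and operator[].
template <class Buffer>
auto checked_at(Buffer &b, std::size_t i) -> decltype(b[i]) {
    if (i >= b.size())
        throw std::out_of_range("checked_at: index out of range");
    return b[i];
}

inline void checked_at_example() {
    std::vector<short> v(3);      // stands in for a fixed_buffer<short>
    checked_at(v, 2) = 9;         // returns a reference, so it is assignable
    assert(v[2] == 9);
    bool threw = false;
    try { checked_at(v, 3); } catch (std::out_of_range const &) { threw = true; }
    assert(threw);                // index == size is rejected
}
```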
operator= differs from the dynarray proposal: the proposal blocks operator=, whereas I give it value semantics. A few methods are less efficient (like copy construction). I also allow an empty fixed_buffer.
Value-semantic assignment would probably prevent a dynarray from being stored on the stack, which is probably why the proposal doesn't allow it. Simply delete my operator= and trivial constructor if you want closer-to-dynarray behavior.
Variable-length arrays, similar to those in C99, were also proposed for C++14. They did not end up in the standard, but some compilers already support them as an extension:
void foo(int n) {
int data[n];
// ...
}
It's not a container, as it doesn't support begin() and end() etc. but might be a workable solution.
dynarray is very easy to implement oneself without the stack-allocation component -which apparently isn't possible to do until perhaps C++14 anyway- so I just rolled a dynarray inverse-backport (forwardport?) as part of my library and started using it ever since. Works in C++03 without any "void in Nebraska" clauses so far, as it doesn't absolutely depend on any C++11-specific capability, and it's neat to have
That way when C++1y/2z dynarray comes along my code is still for the most part compatible.
(It's also one of those many obvious "why didn't C++ have this sooner?" things so it's good to have it around).
This was before I learned that apparently C++1y-dynarray and C++1y-runtime-size-arrays are the exact same proposal (one is just syntactic sugar for the other) and not two different-but-complementary proposals as I first thought. So if I had to solve the same question nowadays I'd probably switch to something based off @Yakk's solution for correctness.

pointer to vector at index Vs iterator

I have a vector< Object > myvec which I use in my code to hold a list of objects in memory. I keep a pointer to the current object in that vector in the "normal" C fashion by using
Object* pObj = &myvec[index];
This all works fine as long as myvec doesn't grow so much that it gets moved around during a push_back, at which point pObj becomes invalid: vectors guarantee that their data is contiguous, hence they make no effort to keep it at the same memory location.
I can reserve enough space for myvec to prevent this, but I don't like that solution.
I could keep the index of the selected myvec position and when I need to use it just access it directly, but it's a costly modification to my code.
I'm wondering if iterators keep their references intact as a vector is reallocated/moved and if so can I just replace
Object* pObj = &myvec[index];
by something like
vector<Object>::iterator it = myvec.begin()+index;
What are the implications of this?
Is this doable?
What is the standard pattern to save pointers to vector positions?
Cheers
No... using an iterator you would have the same exact problem. If a vector reallocation is performed then all iterators are invalidated and using them is Undefined Behavior.
The only solution that is reallocation-resistant with an std::vector is using the integer index.
With std::list, for example, things are different, but there are also different efficiency trade-offs, so it really depends on what you need to do.
Another option would be to create your own "smart index" class, that stores a reference to the vector and the index. This way you could keep just passing around one "pointer" (and you could implement pointer semantic for it) but the code wouldn't suffer from reallocation risks.
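A minimal sketch of such a "smart index" might look like this. The name indexed_ref and its interface are my own invention for illustration; it stores a pointer to the vector plus the position and looks the element up again on every access.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "smart index": remembers the vector and a position in it.
template <class T>
class indexed_ref {
    std::vector<T> *vec_;
    std::size_t index_;
public:
    indexed_ref(std::vector<T> &vec, std::size_t index) : vec_(&vec), index_(index) {}
    // The element is looked up on every access, so reallocation is harmless.
    T &operator*() const { return (*vec_)[index_]; }
    T *operator->() const { return &(*vec_)[index_]; }
    std::size_t index() const { return index_; }
};

inline void indexed_ref_example() {
    std::vector<int> v{10, 20, 30};
    indexed_ref<int> current(v, 1);
    for (int i = 0; i < 10000; ++i)
        v.push_back(i);        // almost certainly reallocates
    assert(*current == 20);    // still valid after the growth
    *current = 99;
    assert(v[1] == 99);
}
```

Note that it tracks a position, not an object: inserting or erasing before that position makes it refer to a different element, and the vector itself must outlive the indexed_ref.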
Iterators are (potentially) invalidated by anything that could resize the vector (e.g., push_back).
You could, however, create your own iterator class that stored the vector and an index, which would be stable across operations that resized the vector:
#include <iterator>
#include <algorithm>
#include <iostream>
#include <vector>
namespace stable {
template <class T, class Dist=ptrdiff_t, class Ptr = T*, class Ref = T&>
class iterator : public std::iterator<std::random_access_iterator_tag, T, Dist, Ptr, Ref>
{
T &container_;
size_t index_;
public:
iterator(T &container, size_t index) : container_(container), index_(index) {}
iterator &operator++() { ++index_; return *this; }
iterator operator++(int) { iterator temp(*this); ++index_; return temp; }
iterator &operator--() { --index_; return *this; }
iterator operator--(int) { iterator temp(*this); --index_; return temp; }
iterator operator+(Dist offset) { return iterator(container_, index_ + offset); }
iterator operator-(Dist offset) { return iterator(container_, index_ - offset); }
bool operator!=(iterator const &other) const { return index_ != other.index_; }
bool operator==(iterator const &other) const { return index_ == other.index_; }
bool operator<(iterator const &other) const { return index_ < other.index_; }
bool operator>(iterator const &other) const { return index_ > other.index_; }
typename T::value_type &operator *() { return container_[index_]; }
typename T::value_type &operator[](size_t index) { return container_[index_ + index]; }
};
template <class T>
iterator<T> begin(T &container) { return iterator<T>(container, 0); }
template <class T>
iterator<T> end(T &container) { return iterator<T>(container, container.size()); }
}
#ifdef TEST
int main() {
std::vector<int> data;
// add some data to the container:
for (int i=0; i<10; i++)
data.push_back(i);
// get iterators to the beginning/end:
stable::iterator<std::vector<int> > b = stable::begin(data);
stable::iterator<std::vector<int> > e = stable::end(data);
// add enough more data that the container will (probably) be resized:
for (int i=10; i<10000; i++)
data.push_back(i);
// Use the previously-obtained iterators:
std::copy(b, e, std::ostream_iterator<int>(std::cout, "\n"));
// These iterators also support most pointer-like operations:
std::cout << *(b+125) << "\n";
std::cout << b[150] << "\n";
return 0;
}
#endif
Since we can't embed this as a nested class inside of the container like a normal iterator class, this requires a slightly different syntax to declare/define an object of this type; instead of the usual std::vector<int>::iterator whatever;, we have to use stable::iterator<std::vector<int> > whatever;. Likewise, to obtain the beginning of a container, we use stable::begin(container).
There is one point that may be a bit surprising (at least at first): when you obtain a stable::end(container), that gets you the end of the container at that time. As shown in the test code above, if you later add more items to the container, the iterator you obtained previously is not adjusted to reflect the new end of the container -- it retains the position it had when you obtained it (i.e., the position that was the end of the container at that time, but isn't any more).
No, iterators are invalidated after vector growth.
The way to get around this problem is to keep the index to the item, not a pointer or iterator to it. This is because the item stays at its index, even if the vector grows, assuming of course that you don't insert any items before it (thus changing its index).

C++ container question

I was looking for some suitable 2D element container. What I want is the ability to iterate through every element of the container using, for example BOOST_FOREACH and I also would like to have an ability to construct subview (slices / subranges) of my container and, probably iterate through them too.
Right now I am using boost::numeric::ublas::matrix for these purposes, but it doesn't look like a good solution to me because, well, it's a BLAS matrix, although it behaves very well as a plain 2d element container (custom unbounded / bounded storages are also very sweet).
Another boost alternative, boost::multi_array is bad, because you can't iterate through every element using one BOOST_FOREACH statement and because constructing views has extremely obfuscated syntax.
Any alternatives?
Thank you.
I do the following (array type is container/iterator range concept):
ublas::matrix<double> A;
foreach (double & element, A.data())
{
}
However, this will not work for slices: your best solution is to write an iterator for them.
Here is an example of using multi_array to provide storage of a custom class.
Perhaps you could do the same:
template<size_t N, typename T>
struct tensor_array : boost::multi_array_ref<T,N> {
typedef boost::multi_array_ref<T,N> base_type;
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
tensor_array() : base_type(NULL, extents())
{
// std::cout << "create" << std::endl;
}
template<class A>
tensor_array(const A &dims)
: base_type(NULL, extents())
{
//std::cout << "create" << std::endl;
resize(dims);
}
template<typename U>
void resize(const U (&dims)[N]) {
boost::array<U,N> dims_;
std::copy(dims, dims + N, dims_.begin());
resize(dims_);
}
template<typename U>
void resize(const boost::array<U,N> &dims) {
size_t size = 1;
boost::array<size_t,N> shape;
for (size_t i = 0; i < N; ++i) {
size *= dims[i];
shape[N-(i+1)] = dims[i];
}
data_.clear();
data_.resize(size, 0);
// update base_type parent
set_base_ptr(&data_[0]);
this->num_elements_ = size;
reshape(shape);
}
size_t size() const { return data_.size(); }
size_t size(size_t i) const { return this->shape()[N-(i+1)]; }
tensor_array& fill(const T &value) {
std::fill(data_.begin(), data_.end(), value);
return *this;
}
private:
typedef boost::detail::multi_array::extent_gen<N> extents;
std::vector<T> data_;
};
Define your own type (trivial), give it an iterator and const_iterator (trivial), and BOOST_FOREACH will work with it.
http://beta.boost.org/doc/libs/1_39_0/doc/html/foreach.html
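As a rough sketch of what "define your own type" could look like: grid2d and row_view below are hypothetical names of mine, with row-major storage in a std::vector. The type exposes iterator/const_iterator typedefs and begin()/end(), which is what range-based iteration needs. To keep the example free of a Boost dependency it is exercised with a C++11 range-for rather than BOOST_FOREACH.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major 2D container; all names are illustrative.
template <class T>
class grid2d {
    std::size_t rows_, cols_;
    std::vector<T> data_;
public:
    typedef typename std::vector<T>::iterator iterator;
    typedef typename std::vector<T>::const_iterator const_iterator;

    grid2d(std::size_t rows, std::size_t cols, T const &init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    T &operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

    // Whole-container iteration visits every element once.
    iterator begin() { return data_.begin(); }
    iterator end() { return data_.end(); }
    const_iterator begin() const { return data_.begin(); }
    const_iterator end() const { return data_.end(); }

    // Sub-range covering a single row; it is itself iterable.
    struct row_view {
        iterator first, last;
        iterator begin() const { return first; }
        iterator end() const { return last; }
    };
    row_view row(std::size_t r) {
        iterator start = data_.begin() + static_cast<std::ptrdiff_t>(r * cols_);
        return row_view{start, start + static_cast<std::ptrdiff_t>(cols_)};
    }
};

inline void grid2d_example() {
    grid2d<int> g(2, 3);
    int n = 0;
    for (int &x : g) x = n++;          // rows become 0 1 2 and 3 4 5
    assert(g(1, 2) == 5);
    int sum = 0;
    for (int x : g.row(1)) sum += x;   // iterate over one slice only
    assert(sum == 12);
}
```

Column slices would need a strided iterator, which is more work than the contiguous row case shown here.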