C++11 gave us great std::array, which requires size to be known at compile time:
std::array<int, 3> myarray = {1, 2, 3};
Now, I happen to have some old short* buffers to wrap, whose size will be known (and it will be, of course) at runtime only.
C++14 will define std::dynarray to cover this case, but dynarray is not available yet in GCC 4.7 nor in Clang 3.2.
So, does anyone know a container which is comparable to std::array (in terms of efficiency) but does not require to specify size at compile time? I suspect Boost has something ready for me, although I couldn't find anything.
I think std::vector is what you're looking for before dynarray becomes available. Just use the allocating constructor or reserve and you'll avoid reallocation overhead.
I’ll put in a vote for std::unique_ptr<short[]>(new short[n]) if you don’t need the range-checked access provided by std::dynarray<T>::at(). You can even use an initializer list:
#include <iostream>
#include <memory>
int main(int argc, char** argv) {
const size_t n = 3;
std::unique_ptr<short[]> myarray(new short[n]{ 1, 2, 3 });
for (size_t i = 0; i < n; ++i)
std::cout << myarray[i] << '\n';
}
You could (ab)use a std::valarray<short>.
int main() {
short* raw_array = (short*) malloc(12 * sizeof(short));
size_t length = 12;
for (size_t i = 0; i < length; ++ i) {
raw_array[i] = (short) i;
}
// ...
std::valarray<short> dyn_array (raw_array, length);
for (short elem : dyn_array) {
std::cout << elem << std::endl;
}
// ...
free(raw_array);
}
valarray supports most features of a dynarray, except:
allocator
reverse iterator
.at()
.data()
Note that the standard (as of n3690) does not require valarray storage be continuous, although there's no reason not to do so :).
(For some implementation detail, in libstdc++ it is implemented as a (length, data) pair, and in libc++ it is implemented as (begin, end).)
A buffer and a size, plus some basic methods, give you most of what you want.
Lots of boilerplate, but something like this:
template<typename T>
struct fixed_buffer {
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* iterator;
typedef const T* const_iterator;
typedef std::reverse_iterator<iterator> reverse_iterator;
typedef std::reverse_iterator<const_iterator> const_reverse_iterator;
typedef size_t size_type;
typedef ptrdiff_t difference_type;
std::size_t length;
std::unique_ptr<T[]> buffer;
std::size_t size() const { return length; }
iterator begin() { return data(); }
const_iterator begin() const { return data(); }
const_iterator cbegin() const { return data(); }
iterator end() { return data()+size(); }
const_iterator end() const { return data()+size(); }
const_iterator cend() const { return data()+size(); }
reverse_iterator rbegin() { return {end()}; }
const_reverse_iterator rbegin() const { return {end()}; }
const_reverse_iterator crbegin() const { return {end()}; }
reverse_iterator rend() { return {begin()}; }
const_reverse_iterator rend() const { return {begin()}; }
const_reverse_iterator crend() const { return {begin()}; }
T& front() { return *begin(); }
T const& front() const { return *begin(); }
T& back() { return *(begin()+size()-1); }
T const& back() const { return *(begin()+size()-1); }
T* data() { return buffer.get(); }
T const* data() const { return buffer.get(); }
T& operator[]( std::size_t i ) { return data()[i]; }
T const& operator[]( std::size_t i ) const { return data()[i]; }
fixed_buffer& operator=(fixed_buffer &&) = default;
fixed_buffer(fixed_buffer &&) = default;
explicit fixed_buffer(std::size_t N):length(N), buffer( new T[length] ) {}
fixed_buffer():length(0), buffer() {}
fixed_buffer(fixed_buffer const& o):length(o.N), buffer( new T[length] )
{
std::copy( o.begin(), o.end(), begin() );
}
fixed_buffer& operator=(fixed_buffer const& o)
{
std::unique_ptr<T[]> tmp( new T[o.length] );
std::copy( o.begin(), o.end(), tmp.get() );
length = o.length;
buffer = std::move(tmp);
return *this;
}
};
at() is missing, as are allocators.
operator= is different than dyn_array proposal -- the proposal blocks operator=, I give it value semantics. A few methods are less efficient (like copy construction). I allow empty fixed_buffer.
This would probably block being able to use the stack to store a dyn_array, which is probably why it doesn't allow it. Simply delete my operator= and trivial constructor if you want closer-to-dyn_array behavior.
C++14 also adds variable length arrays, similar to those in C99, and that's supported by some compilers already:
void foo(int n) {
int data[n];
// ...
}
It's not a container, as it doesn't support begin() and end() etc. but might be a workable solution.
dynarray is very easy to implement oneself without the stack-allocation component -which apparently isn't possible to do until perhaps C++14 anyway- so I just rolled a dynarray inverse-backport (forwardport?) as part of my library and started using it ever since. Works in C++03 without any "void in Nebraska" clauses so far, as it doesn't absolutely depend on any C++11-specific capability, and it's neat to have
That way when C++1y/2z dynarray comes along my code is still for the most part compatible.
(It's also one of those many obvious "why didn't C++ have this sooner?" things so it's good to have it around).
This was before I learned that apparently C++1y-dynarray and C++1y-runtime-size-arrays are the exact same proposal (one is just syntactic sugar for the other) and not two different-but-complementary proposals as I first thought. So if I had to solve the same question nowadays I'd probably switch to something based off #Yakk's solution for correctness.
Related
I need a container with run-time known size with no need to resizing. std::unique_ptr<T[]> would be a useful, but there is no encapsulated size member. In the same time std::array is for compile type size only. Hence I need some combination of these classes with no/minimal overhead.
Is there a standard class for my needs, maybe something in upcoming C++20?
Use std::vector. This is the class for runtime sized array in the STL.
It let you resize it or pushing elements into it:
auto vec = std::vector<int>{};
vec.resize(10); // now vector has 10 ints 0 initialized
vec.push_back(1); // now 11 ints
Some problems stated in the comments:
vector has an excessive interface
So is std::array. You have more than 20 function in std::array including operators.
Just don't use what you don't need. You don't pay for the function you won't use. It won't even increase your binary size.
vector will force initialize items on resize. As far as I know, it is not allowed to use operator[] for indexes >= size (despite calling reserve).
This is not how it is meant to be used. When reserving you should then resize the vector with resize or by pushing elements into it. You say vector will force initialize elements into it, but the problem is that you cannot call operator= on unconstructed objects, including ints.
Here's an example using reserve:
auto vec = std::vector<int>{};
vec.reserve(10); // capacity of at least 10
vec.resize(3); // Contains 3 zero initialized ints.
// If you don't want to `force` initialize elements
// you should push or emplace element into it:
vec.emplace_back(1); // no reallocation for the three operations.
vec.emplace_back(2); // no default initialize either.
vec.emplace_back(3); // ints constructed with arguments in emplace_back
Keep in mind that there is a high chance for such allocation and use case, the compiler may completely elide construction of elements in the vector. There may be no overhead in your code.
I would suggest to measure and profile if your code is subject to very precise performance specification. If you do not have such specification, most likely this is premature optimization. The cost of memory allocation completely out measure the time it takes to initialize elements one by one.
Other parts of your program may be refactored to gain much more performance than trivial initialization can offer you. In fact, getting in the way of it may hinder optimization and make your program slower.
Allocate the memory using an std::unique_ptr<T[]> like you suggested, but to use it - construct an std::span (in C++20; gsl::span before C++20) from the raw pointer and the number of elements, and pass the span around (by value; spans are reference-types, sort of). The span will give you all the bells and whistles of a container: size, iterators, ranged-for, the works.
#include <span>
// or:
// #include <gsl/span>
int main() {
// ... etc. ...
{
size_t size = 10e5;
auto uptr { std::make_unique<double[]>(size) };
std::span<int> my_span { uptr.get(), size };
do_stuff_with_the_doubles(my_span);
}
// ... etc. ...
}
For more information about spans, see:
What is a "span" and when should I use one?
Use std::vector. If you want to remove the possibility of changing it's size, wrap it.
template <typename T>
single_allocation_vector : private std::vector<T>, public gsl::span<T>
{
single_allocation_vector(size_t n, T t = {}) : vector(n, t), span(vector::data(), n) {}
// other constructors to taste
};
Something called std::dynarray was proposed for C++14:
std::dynarray is a sequence container that encapsulates arrays with a size that is fixed at construction and does not change throughout the lifetime of the object.
But there were too many issues and it didn't become part of the standard.
So there exists no such container currently in the STL. You can keep using vectors with an initial size.
Unfortunately, no new containers were added in C++ 20 (at least none that I'd be aware of). I would agree, however, that such a container would be very useful. While just using std::vector<T> with reserve() and emplace_back() will usually do OK, it does often generate inferior code compared to using a plain new T[] as the use of emplace_back() seems to inhibit vectorization. If we use an std::vector<T> with an initial size instead, compilers seem to have trouble optimizing away the value initialization of elements, even if the entire vector is going to be overwritten right afterwards. Play with an example here.
You could use, for example, a wrapper like
template <typename T>
struct default_init_wrapper
{
T t;
public:
default_init_wrapper() {}
template <typename... Args>
default_init_wrapper(Args&&... args) : t(std::forward<Args>(args)...) {}
operator const T&() const { return t; }
operator T&() { return t; }
};
and
std::vector<no_init_wrapper<T>> buffer(N);
to avoid the useless initialization for trivial types. Doing so seems to lead to code similarly good as the plain std::unique_ptr version. I wouldn't recommend this though, as it's quite ugly and cubmersome to use, since you then have to work with a vector of wrapped elements.
I guess the best option for now is to just roll your own container. This may serve as a starting point (beware of bugs):
template <typename T>
class dynamic_array
{
public:
using value_type = T;
using reference = T&;
using const_reference = T&;
using pointer = T*;
using const_pointer = const T*;
using iterator = T*;
using const_iterator = const T*;
using reverse_iterator = std::reverse_iterator<iterator>;
using const_reverse_iterator = std::reverse_iterator<const_iterator>;
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
private:
std::unique_ptr<T[]> elements;
size_type num_elements = 0U;
friend void swap(dynamic_array& a, dynamic_array& b)
{
using std::swap;
swap(a.elements, b.elements);
swap(a.num_elements, b.num_elements);
}
static auto alloc(size_type size)
{
return std::unique_ptr<T[]> { new T[size] };
}
void checkRange(size_type i) const
{
if (!(i < num_elements))
throw std::out_of_range("dynamic_array index out of range");
}
public:
const_pointer data() const { return &elements[0]; }
pointer data() { return &elements[0]; }
const_iterator begin() const { return data(); }
iterator begin() { return data(); }
const_iterator end() const { return data() + num_elements; }
iterator end() { return data() + num_elements; }
const_reverse_iterator rbegin() const { return std::make_reverse_iterator(end()); }
reverse_iterator rbegin() { return std::make_reverse_iterator(end()); }
const_reverse_iterator rend() const { return std::make_reverse_iterator(begin()); }
reverse_iterator rend() { return std::make_reverse_iterator(begin()); }
const_reference operator [](size_type i) const { return elements[i]; }
reference operator [](size_type i) { return elements[i]; }
const_reference at(size_type i) const { return checkRange(i), elements[i]; }
reference at(size_type i) { return checkRange(i), elements[i]; }
size_type size() const { return num_elements; }
constexpr size_type max_size() const { return std::numeric_limits<size_type>::max(); }
bool empty() const { return std::size(*this) == 0U; }
dynamic_array() = default;
dynamic_array(size_type size)
: elements(alloc(size)), num_elements(size)
{
}
dynamic_array(std::initializer_list<T> elements)
: elements(alloc(std::size(elements))), num_elements(std::size(elements))
{
std::copy(std::begin(elements), std::end(elements), std::begin(*this));
}
dynamic_array(const dynamic_array& arr)
{
auto new_elements = alloc(std::size(arr));
std::copy(std::begin(arr), std::end(arr), &new_elements[0]);
elements = std::move(new_elements);
num_elements = std::size(arr);
}
dynamic_array(dynamic_array&&) = default;
dynamic_array& operator =(const dynamic_array& arr)
{
return *this = dynamic_array(arr);
}
dynamic_array& operator =(dynamic_array&&) = default;
void swap(dynamic_array& arr)
{
void swap(dynamic_array& a, dynamic_array& b);
swap(*this, arr);
}
friend bool operator ==(const dynamic_array& a, const dynamic_array& b)
{
return std::equal(std::begin(a), std::end(a), std::begin(b));
}
friend bool operator !=(const dynamic_array& a, const dynamic_array& b)
{
return !(a == b);
}
};
Short version
Can I reinterpret_cast a std::vector<void*>* to a std::vector<double*>*?
What about with other STL containers?
Long version
I have a function to recast a vector of void pointers to a datatype specified by a template argument:
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
std::vector<T*> y(x.size());
std::transform(x.begin(), x.end(), y.begin(),
[](void *a) { return static_cast<T*>(a); } );
return y;
}
But I was thinking that copying the vector contents isn't really necessary, since we're really just reinterpreting what's being pointed to.
After some tinkering, I came up with this:
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
return std::vector<T*>(std::move(*xPtr));
}
So my questions are:
Is it safe to reinterpret_cast an entire vector like this?
What if it was a different kind of container (like a std::list or std::map)? To be clear, I mean casting a std::list<void*> to std::list<T*>, not casting between STL container types.
I'm still trying to wrap my head around move semantics. Am I doing it right?
And one follow-up question: What would be the best way to generate a const version without code duplication? i.e. to define
std::vector<T const*> recastPtrs(std::vector<void const*> const&);
std::vector<T const*> recastPtrs(std::vector<void const*>&&);
MWE
#include <vector>
#include <algorithm>
#include <iostream>
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*> const& x) {
std::vector<T*> y(x.size());
std::transform(x.begin(), x.end(), y.begin(),
[](void *a) { return static_cast<T*>(a); } );
return y;
}
template <typename T>
std::vector<T*> recastPtrs(std::vector<void*>&& x) {
auto xPtr = reinterpret_cast<std::vector<T*>*>(&x);
return std::vector<T*>(std::move(*xPtr));
}
template <typename T>
void printVectorAddr(std::vector<T> const& vec) {
std::cout<<" vector object at "<<&vec<<", data()="<<vec.data()<<std::endl;
}
int main(void) {
std::cout<<"Original void pointers"<<std::endl;
std::vector<void*> voidPtrs(100);
printVectorAddr(voidPtrs);
std::cout<<"Elementwise static_cast"<<std::endl;
auto dblPtrs = recastPtrs<double>(voidPtrs);
printVectorAddr(dblPtrs);
std::cout<<"reintepret_cast entire vector, then move ctor"<<std::endl;
auto dblPtrs2 = recastPtrs<double>(std::move(voidPtrs));
printVectorAddr(dblPtrs2);
}
Example output:
Original void pointers
vector object at 0x7ffe230b1cb0, data()=0x21de030
Elementwise static_cast
vector object at 0x7ffe230b1cd0, data()=0x21de360
reintepret_cast entire vector, then move ctor
vector object at 0x7ffe230b1cf0, data()=0x21de030
Note that the reinterpret_cast version reuses the underlying data structure.
Previously-asked questions that didn't seem relevant
These are the questions that come up when I tried to search this:
reinterpret_cast vector of class A to vector of class B
reinterpret_cast vector of derived class to vector of base class
reinterpret_cast-ing vector of one type to a vector of another type which is of the same type
And the answer to these was a unanimous NO, with reference to the strict aliasing rule. But I figure that doesn't apply to my case, since the vector being recast is an rvalue, so there's no opportunity for aliasing.
Why I'm trying to do this
I'm interfacing with a MATLAB library that gives me data pointers as void* along with a variable indicating the datatype. I have one function that validates the inputs and collects these pointers into a vector:
void parseInputs(int argc, mxArray* inputs[], std::vector<void*> &dataPtrs, mxClassID &numericType);
I can't templatize this part since the type is not known until runtime. On the other side, I have numeric routines to operate on vectors of a known datatype:
template <typename T>
void processData(std::vector<T*> const& dataPtrs);
So I'm just trying to connect one to the other:
void processData(std::vector<void*>&& voidPtrs, mxClassID numericType) {
switch (numericType) {
case mxDOUBLE_CLASS:
processData(recastPtrs<double>(std::move(voidPtrs)));
break;
case mxSINGLE_CLASS:
processData(recastPtrs<float>(std::move(voidPtrs)));
break;
default:
assert(0 && "Unsupported datatype");
break;
}
}
Given the comment that you're receiving the void * from a C library (something like malloc), it seems like we can probably narrow the problem down quite a bit.
In particular, I'd guess you're really dealing with something that's more like an array_view than a vector. That is, you want something that lets you access some data cleanly. You might change individual items in that collection, but you'll never change the collection as a whole (e.g., you won't try to do a push_back that could need to expand the memory allocation).
For such a case, you can pretty easily create a wrapper of your own that gives you vector-like access to the data--defines an iterator type, has a begin() and end() (and if you want, the others like rbegin()/rend(), cbegin()/cend() and crbegin()/crend()), as well as an at() that does range-checked indexing, and so on.
So a fairly minimal version could look something like this:
#pragma once
#include <cstddef>
#include <stdexcept>
#include <cstdlib>
#include <iterator>
template <class T> // note: no allocator, since we don't do allocation
class array_view {
T *data;
std::size_t size_;
public:
array_view(void *data, std::size_t size_) : data(reinterpret_cast<T *>(data)), size_(size_) {}
T &operator[](std::size_t index) { return data[index]; }
T &at(std::size_t index) {
if (index > size_) throw std::out_of_range("Index out of range");
return data[index];
}
std::size_t size() const { return size_; }
typedef T *iterator;
typedef T const &const_iterator;
typedef T value_type;
typedef T &reference;
iterator begin() { return data; }
iterator end() { return data + size_; }
const_iterator cbegin() { return data; }
const_iterator cend() { return data + size_; }
class reverse_iterator {
T *it;
public:
reverse_iterator(T *it) : it(it) {}
using iterator_category = std::random_access_iterator_tag;
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T *;
using reference = T &;
reverse_iterator &operator++() {
--it;
return *this;
}
reverse_iterator &operator--() {
++it;
return *this;
}
reverse_iterator operator+(size_t size) const {
return reverse_iterator(it - size);
}
reverse_iterator operator-(size_t size) const {
return reverse_iterator(it + size);
}
difference_type operator-(reverse_iterator const &r) const {
return it - r.it;
}
bool operator==(reverse_iterator const &r) const { return it == r.it; }
bool operator!=(reverse_iterator const &r) const { return it != r.it; }
bool operator<(reverse_iterator const &r) const { return std::less<T*>(r.it, it); }
bool operator>(reverse_iterator const &r) const { return std::less<T*>(it, r.it); }
T &operator *() { return *(it-1); }
};
reverse_iterator rbegin() { return data + size_; }
reverse_iterator rend() { return data; }
};
I've tried to show enough that it should be fairly apparent how to add most of the missing functionality (e.g., crbegin()/crend()), but I haven't worked really hard at including everything here, since much of what's left is more repetitive and tedious than educational.
This is enough to use the array_view in most of the typical vector-like ways. For example:
#include "array_view"
#include <iostream>
#include <iterator>
int main() {
void *raw = malloc(16 * sizeof(int));
array_view<int> data(raw, 16);
std::cout << "Range based:\n";
for (auto & i : data)
i = rand();
for (auto const &i : data)
std::cout << i << '\n';
std::cout << "\niterator-based, reverse:\n";
auto end = data.rend();
for (auto d = data.rbegin(); d != end; ++d)
std::cout << *d << '\n';
std::cout << "Forward, counted:\n";
for (int i=0; i<data.size(); i++) {
data[i] += 10;
std::cout << data[i] << '\n';
}
}
Note that this doesn't attempt to deal with copy/move construction at all, nor with destruction. At least as I've formulated it, the array_view is a non-owning view into some existing data. It's up to you (or at least something outside of the array_view) to destroy the data when appropriate. Since we're not destroying the data, we can use the compiler-generated copy and move constructors without any problem. We won't get a double-delete from doing a shallow copy of the pointer, because we don't do any delete when the array_view is destroyed.
No, you cannot do anything like this in Standard C++.
The strict aliasing rule says that to access an object of type T, you must use an expression of type T; with a very short list of exceptions to that.
Accessing a double * via a void * expression is not such an exception; let alone a vector of each. Nor is it an exception if you accessed the object of type T via an rvalue.
Multi-dimensional initializers can be created by nesting brace-enclosed lists, as in {{1,2,3}, {4,5,6}}. A function accepting this can be written using nested std::initializer_lists.
Are the data elements guaranteed to be contiguous?
Here's an example:
void f(std::initializer_list<std::initializer_list<int>> a)
{
for(auto const & p: a)
for(auto const & q: p)
std::cout << &q << std::endl;
}
int main()
{
f({{1,2,3}, {4,5,6}});
return 0;
}
The above code outputs consecutive addresses on my machine.
0x400c60
0x400c64
0x400c68
0x400c6c
0x400c70
0x400c74
It is guaranteed?
Update
The answer must be no.
void g(std::initializer_list<int> a, std::initializer_list<int> b)
{
f({b,a});
}
int main()
{
g({1,2,3}, {4,5,6});
return 0;
}
The output is:
0x400cdc
0x400ce0
0x400ce4
0x400cd0
0x400cd4
0x400cd8
C++11 § [support.initlist] 18.9/1 specifies that for std::initializer_list<T> iterator must be T*, so you are guaranteed that sequential elements in a single initializer_list are contiguous.
In the case of nested lists, e.g., std::initializer_list<std::initializer_list<int>>, there is no requirement that all elements are contiguous. The top-level list's begin() must return a pointer to a contiguous array of std::initializer_list<int>, and each of those lists' begin() must return a pointer to a contiguous array of int. Those second-tier int arrays could be scattered all over memory or stored precisely in total order exactly as you observed. Both ways are compliant.
From here (std::initializer_list) :-
Initializer lists may be implemented as a pair of pointers or pointer
and length.
A possible implementation can be (taken & edited from Mingw 4.8.1 initializer_list header ) :
template<class T>
class initializer_list
{
public:
typedef T value_type;
typedef const T& reference;
typedef const T& const_reference;
typedef size_t size_type;
typedef const T* iterator;
typedef const T* const_iterator;
private:
iterator arr;
size_type arr_len;
// The compiler can call a private constructor.
constexpr initializer_list(const_iterator it, size_type len)
: arr(it), arr_len(len) { }
public:
constexpr initializer_list() noexcept
: arr(0), arr_len(0) { }
// Number of elements.
constexpr size_type
size() const noexcept { return arr_len; }
// First element.
constexpr const_iterator
begin() const noexcept { return arr; }
// One past the last element.
constexpr const_iterator
end() const noexcept { return begin() + size(); }
};
The iterators are not magic, its just playing around with pointers. So the answer is yes, its guaranteed.
In second example of yours, it does show the contiguous memory location, you just flipped a and b
I made a collection for which I want to provide an STL-style, random-access iterator. I was searching around for an example implementation of an iterator but I didn't find any. I know about the need for const overloads of [] and * operators. What are the requirements for an iterator to be "STL-style" and what are some other pitfalls to avoid (if any)?
Additional context: This is for a library and I don't want to introduce any dependency on it unless I really need to. I write my own collection to be able to provide binary compatibility between C++03 and C++11 with the same compiler (so no STL which would probably break).
https://cplusplus.com/reference/iterator/ has a handy chart that details the specs of § 24.2.2 of the C++11 standard. Basically, the iterators have tags that describe the valid operations, and the tags have a hierarchy. Below is purely symbolic, these classes don't actually exist as such.
iterator {
iterator(const iterator&);
~iterator();
iterator& operator=(const iterator&);
iterator& operator++(); //prefix increment
reference operator*() const;
friend void swap(iterator& lhs, iterator& rhs); //C++11 I think
};
input_iterator : public virtual iterator {
iterator operator++(int); //postfix increment
value_type operator*() const;
pointer operator->() const;
friend bool operator==(const iterator&, const iterator&);
friend bool operator!=(const iterator&, const iterator&);
};
//once an input iterator has been dereferenced, it is
//undefined to dereference one before that.
output_iterator : public virtual iterator {
reference operator*() const;
iterator operator++(int); //postfix increment
};
//dereferences may only be on the left side of an assignment
//once an output iterator has been dereferenced, it is
//undefined to dereference one before that.
forward_iterator : input_iterator, output_iterator {
forward_iterator();
};
//multiple passes allowed
bidirectional_iterator : forward_iterator {
iterator& operator--(); //prefix decrement
iterator operator--(int); //postfix decrement
};
random_access_iterator : bidirectional_iterator {
friend bool operator<(const iterator&, const iterator&);
friend bool operator>(const iterator&, const iterator&);
friend bool operator<=(const iterator&, const iterator&);
friend bool operator>=(const iterator&, const iterator&);
iterator& operator+=(size_type);
friend iterator operator+(const iterator&, size_type);
friend iterator operator+(size_type, const iterator&);
iterator& operator-=(size_type);
friend iterator operator-(const iterator&, size_type);
friend difference_type operator-(iterator, iterator);
reference operator[](size_type) const;
};
contiguous_iterator : random_access_iterator { //C++17
}; //elements are stored contiguously in memory.
You can either specialize std::iterator_traits<youriterator>, or put the same typedefs in the iterator itself, or inherit from std::iterator (which has these typedefs). I prefer the second option, to avoid changing things in the std namespace, and for readability, but most people inherit from std::iterator.
struct std::iterator_traits<youriterator> {
typedef ???? difference_type; //almost always ptrdiff_t
typedef ???? value_type; //almost always T
typedef ???? reference; //almost always T& or const T&
typedef ???? pointer; //almost always T* or const T*
typedef ???? iterator_category; //usually std::forward_iterator_tag or similar
};
Note the iterator_category should be one of std::input_iterator_tag, std::output_iterator_tag, std::forward_iterator_tag, std::bidirectional_iterator_tag, or std::random_access_iterator_tag, depending on which requirements your iterator satisfies. Depending on your iterator, you may choose to specialize std::next, std::prev, std::advance, and std::distance as well, but this is rarely needed. In extremely rare cases you may wish to specialize std::begin and std::end.
Your container should probably also have a const_iterator, which is a (possibly mutable) iterator to constant data that is similar to your iterator except it should be implicitly constructable from a iterator and users should be unable to modify the data. It is common for its internal pointer to be a pointer to non-constant data, and have iterator inherit from const_iterator so as to minimize code duplication.
My post at Writing your own STL Container has a more complete container/iterator prototype.
The iterator_facade documentation from Boost.Iterator provides what looks like a nice tutorial on implementing iterators for a linked list. Could you use that as a starting point for building a random-access iterator over your container?
If nothing else, you can take a look at the member functions and typedefs provided by iterator_facade and use it as a starting point for building your own.
Here is sample of raw pointer iterator.
You shouldn't use iterator class to work with raw pointers!
#include <iostream>
#include <vector>
#include <list>
#include <iterator>
#include <assert.h>
template<typename T>
class ptr_iterator
: public std::iterator<std::forward_iterator_tag, T>
{
typedef ptr_iterator<T> iterator;
pointer pos_;
public:
ptr_iterator() : pos_(nullptr) {}
ptr_iterator(T* v) : pos_(v) {}
~ptr_iterator() {}
iterator operator++(int) /* postfix */ { return pos_++; }
iterator& operator++() /* prefix */ { ++pos_; return *this; }
reference operator* () const { return *pos_; }
pointer operator->() const { return pos_; }
iterator operator+ (difference_type v) const { return pos_ + v; }
bool operator==(const iterator& rhs) const { return pos_ == rhs.pos_; }
bool operator!=(const iterator& rhs) const { return pos_ != rhs.pos_; }
};
template<typename T>
ptr_iterator<T> begin(T *val) { return ptr_iterator<T>(val); }
template<typename T, typename Tsize>
ptr_iterator<T> end(T *val, Tsize size) { return ptr_iterator<T>(val) + size; }
Raw pointer range based loop workaround. Please, correct me, if there is better way to make range based loop from raw pointer.
template<typename T>
class ptr_range
{
T* begin_;
T* end_;
public:
ptr_range(T* ptr, size_t length) : begin_(ptr), end_(ptr + length) { assert(begin_ <= end_); }
T* begin() const { return begin_; }
T* end() const { return end_; }
};
template<typename T>
ptr_range<T> range(T* ptr, size_t length) { return ptr_range<T>(ptr, length); }
And simple test
void DoIteratorTest()
{
const static size_t size = 10;
uint8_t *data = new uint8_t[size];
{
// Only for iterator test
uint8_t n = '0';
auto first = begin(data);
auto last = end(data, size);
for (auto it = first; it != last; ++it)
{
*it = n++;
}
// It's prefer to use the following way:
for (const auto& n : range(data, size))
{
std::cout << " char: " << static_cast<char>(n) << std::endl;
}
}
{
// Only for iterator test
ptr_iterator<uint8_t> first(data);
ptr_iterator<uint8_t> last(first + size);
std::vector<uint8_t> v1(first, last);
// It's prefer to use the following way:
std::vector<uint8_t> v2(data, data + size);
}
{
std::list<std::vector<uint8_t>> queue_;
queue_.emplace_back(begin(data), end(data, size));
queue_.emplace_back(data, data + size);
}
}
Thomas Becker wrote a useful article on the subject here.
There was also this (perhaps simpler) approach that appeared previously on SO: How to correctly implement custom iterators and const_iterators?
First of all you can look here for a list of the various operations the individual iterator types need to support.
Next, when you have made your iterator class you need to either specialize std::iterator_traits for it and provide some necessary typedefs (like iterator_category or value_type) or alternatively derive it from std::iterator, which defines the needed typedefs for you and can therefore be used with the default std::iterator_traits.
disclaimer: I know some people don't like cplusplus.com that much, but they provide some really useful information on this.
I was/am in the same boat as you for different reasons (partly educational, partly constraints). I had to re-write all the containers of the standard library and the containers had to conform to the standard. That means, if I swap out my container with the stl version, the code would work the same. Which also meant that I had to re-write the iterators.
Anyway, I looked at EASTL. Apart from learning a ton about containers that I never learned all this time using the stl containers or through my undergraduate courses. The main reason is that EASTL is more readable than the stl counterpart (I found this is simply because of the lack of all the macros and straight forward coding style). There are some icky things in there (like #ifdefs for exceptions) but nothing to overwhelm you.
As others mentioned, look at cplusplus.com's reference on iterators and containers.
I was trying to solve the problem of being able to iterate over several different text arrays all of which are stored within a memory resident database that is a large struct.
The following was worked out using Visual Studio 2017 Community Edition on an MFC test application. I am including this as an example as this posting was one of several that I ran across that provided some help yet were still insufficient for my needs.
The struct containing the memory resident data looked something like the following. I have removed most of the elements for the sake of brevity and have also not included the Preprocessor defines used (the SDK in use is for C as well as C++ and is old).
What I was interested in doing is having iterators for the various WCHAR two dimensional arrays which contained text strings for mnemonics.
typedef struct tagUNINTRAM {
// stuff deleted ...
WCHAR ParaTransMnemo[MAX_TRANSM_NO][PARA_TRANSMNEMO_LEN]; /* prog #20 */
WCHAR ParaLeadThru[MAX_LEAD_NO][PARA_LEADTHRU_LEN]; /* prog #21 */
WCHAR ParaReportName[MAX_REPO_NO][PARA_REPORTNAME_LEN]; /* prog #22 */
WCHAR ParaSpeMnemo[MAX_SPEM_NO][PARA_SPEMNEMO_LEN]; /* prog #23 */
WCHAR ParaPCIF[MAX_PCIF_SIZE]; /* prog #39 */
WCHAR ParaAdjMnemo[MAX_ADJM_NO][PARA_ADJMNEMO_LEN]; /* prog #46 */
WCHAR ParaPrtModi[MAX_PRTMODI_NO][PARA_PRTMODI_LEN]; /* prog #47 */
WCHAR ParaMajorDEPT[MAX_MDEPT_NO][PARA_MAJORDEPT_LEN]; /* prog #48 */
// ... stuff deleted
} UNINIRAM;
The current approach is to use a template to define a proxy class for each of the arrays and then to have a single iterator class that can be used to iterate over a particular array by using a proxy object representing the array.
A copy of the memory resident data is stored in an object that handles reading and writing the memory resident data from/to disk. This class, CFilePara contains the templated proxy class (MnemonicIteratorDimSize and the sub class from which is it is derived, MnemonicIteratorDimSizeBase) and the iterator class, MnemonicIterator.
The created proxy object is attached to an iterator object which accesses the necessary information through an interface described by a base class from which all of the proxy classes are derived. The result is to have a single type of iterator class which can be used with several different proxy classes because the different proxy classes all expose the same interface, the interface of the proxy base class.
The first thing was to create a set of identifiers which would be provided to a class factory to generate the specific proxy object for that type of mnemonic. These identifiers are used as part of the user interface to identify the particular provisioning data the user is interested in seeing and possibly modifying.
const static DWORD_PTR dwId_TransactionMnemonic = 1;
const static DWORD_PTR dwId_ReportMnemonic = 2;
const static DWORD_PTR dwId_SpecialMnemonic = 3;
const static DWORD_PTR dwId_LeadThroughMnemonic = 4;
The Proxy Class
The templated proxy class and its base class are as follows. I needed to accommodate several different kinds of wchar_t text string arrays. The two dimensional arrays had different numbers of mnemonics, depending on the type (purpose) of the mnemonic and the different types of mnemonics were of different maximum lengths, varying between five text characters and twenty text characters. Templates for the derived proxy class was a natural fit with the template requiring the maximum number of characters in each mnemonic. After the proxy object is created, we then use the SetRange() method to specify the actual mnemonic array and its range.
// proxy object which represents a particular subsection of the
// memory resident database each of which is an array of wchar_t
// text arrays though the number of array elements may vary.
class MnemonicIteratorDimSizeBase
{
DWORD_PTR m_Type;
public:
MnemonicIteratorDimSizeBase(DWORD_PTR x) { }
virtual ~MnemonicIteratorDimSizeBase() { }
virtual wchar_t *begin() = 0;
virtual wchar_t *end() = 0;
virtual wchar_t *get(int i) = 0;
virtual int ItemSize() = 0;
virtual int ItemCount() = 0;
virtual DWORD_PTR ItemType() { return m_Type; }
};
template <size_t sDimSize>
class MnemonicIteratorDimSize : public MnemonicIteratorDimSizeBase
{
wchar_t (*m_begin)[sDimSize];
wchar_t (*m_end)[sDimSize];
public:
MnemonicIteratorDimSize(DWORD_PTR x) : MnemonicIteratorDimSizeBase(x), m_begin(0), m_end(0) { }
virtual ~MnemonicIteratorDimSize() { }
virtual wchar_t *begin() { return m_begin[0]; }
virtual wchar_t *end() { return m_end[0]; }
virtual wchar_t *get(int i) { return m_begin[i]; }
virtual int ItemSize() { return sDimSize; }
virtual int ItemCount() { return m_end - m_begin; }
void SetRange(wchar_t (*begin)[sDimSize], wchar_t (*end)[sDimSize]) {
m_begin = begin; m_end = end;
}
};
The Iterator Class
The iterator class itself is as follows. This class provides just basic forward iterator functionality which is all that is needed at this time. However I expect that this will change or be extended when I need something additional from it.
class MnemonicIterator
{
private:
MnemonicIteratorDimSizeBase *m_p; // we do not own this pointer. we just use it to access current item.
int m_index; // zero based index of item.
wchar_t *m_item; // value to be returned.
public:
MnemonicIterator(MnemonicIteratorDimSizeBase *p) : m_p(p) { }
~MnemonicIterator() { }
// a ranged for needs begin() and end() to determine the range.
// the range is up to but not including what end() returns.
MnemonicIterator & begin() { m_item = m_p->get(m_index = 0); return *this; } // begining of range of values for ranged for. first item
MnemonicIterator & end() { m_item = m_p->get(m_index = m_p->ItemCount()); return *this; } // end of range of values for ranged for. item after last item.
MnemonicIterator & operator ++ () { m_item = m_p->get(++m_index); return *this; } // prefix increment, ++p
MnemonicIterator & operator ++ (int i) { m_item = m_p->get(m_index++); return *this; } // postfix increment, p++
bool operator != (MnemonicIterator &p) { return **this != *p; } // minimum logical operator is not equal to
wchar_t * operator *() const { return m_item; } // dereference iterator to get what is pointed to
};
The proxy object factory determines which object to created based on the mnemonic identifier. The proxy object is created and the pointer returned is the standard base class type so as to have a uniform interface regardless of which of the different mnemonic sections are being accessed. The SetRange() method is used to specify to the proxy object the specific array elements the proxy represents and the range of the array elements.
CFilePara::MnemonicIteratorDimSizeBase * CFilePara::MakeIterator(DWORD_PTR x)
{
CFilePara::MnemonicIteratorDimSizeBase *mi = nullptr;
switch (x) {
case dwId_TransactionMnemonic:
{
CFilePara::MnemonicIteratorDimSize<PARA_TRANSMNEMO_LEN> *mk = new CFilePara::MnemonicIteratorDimSize<PARA_TRANSMNEMO_LEN>(x);
mk->SetRange(&m_Para.ParaTransMnemo[0], &m_Para.ParaTransMnemo[MAX_TRANSM_NO]);
mi = mk;
}
break;
case dwId_ReportMnemonic:
{
CFilePara::MnemonicIteratorDimSize<PARA_REPORTNAME_LEN> *mk = new CFilePara::MnemonicIteratorDimSize<PARA_REPORTNAME_LEN>(x);
mk->SetRange(&m_Para.ParaReportName[0], &m_Para.ParaReportName[MAX_REPO_NO]);
mi = mk;
}
break;
case dwId_SpecialMnemonic:
{
CFilePara::MnemonicIteratorDimSize<PARA_SPEMNEMO_LEN> *mk = new CFilePara::MnemonicIteratorDimSize<PARA_SPEMNEMO_LEN>(x);
mk->SetRange(&m_Para.ParaSpeMnemo[0], &m_Para.ParaSpeMnemo[MAX_SPEM_NO]);
mi = mk;
}
break;
case dwId_LeadThroughMnemonic:
{
CFilePara::MnemonicIteratorDimSize<PARA_LEADTHRU_LEN> *mk = new CFilePara::MnemonicIteratorDimSize<PARA_LEADTHRU_LEN>(x);
mk->SetRange(&m_Para.ParaLeadThru[0], &m_Para.ParaLeadThru[MAX_LEAD_NO]);
mi = mk;
}
break;
}
return mi;
}
Using the Proxy Class and Iterator
The proxy class and its iterator are used as shown in the following loop to fill in a CListCtrl object with a list of mnemonics. I am using std::unique_ptr so that when the proxy class i not longer needed and the std::unique_ptr goes out of scope, the memory will be cleaned up.
What this source code does is to create a proxy object for the array within the struct which corresponds to the specified mnemonic identifier. It then creates an iterator for that object, uses a ranged for to fill in the CListCtrl control and then cleans up. These are all raw wchar_t text strings which may be exactly the number of array elements so we copy the string into a temporary buffer in order to ensure that the text is zero terminated.
std::unique_ptr<CFilePara::MnemonicIteratorDimSizeBase> pObj(pFile->MakeIterator(m_IteratorType));
CFilePara::MnemonicIterator pIter(pObj.get()); // provide the raw pointer to the iterator who doesn't own it.
int i = 0; // CListCtrl index for zero based position to insert mnemonic.
for (auto x : pIter)
{
WCHAR szText[32] = { 0 }; // Temporary buffer.
wcsncpy_s(szText, 32, x, pObj->ItemSize());
m_mnemonicList.InsertItem(i, szText); i++;
}
And now a keys iterator for range-based for loop.
template<typename C>
class keys_it
{
typename C::const_iterator it_;
public:
using key_type = typename C::key_type;
using pointer = typename C::key_type*;
using difference_type = std::ptrdiff_t;
keys_it(const typename C::const_iterator & it) : it_(it) {}
keys_it operator++(int ) /* postfix */ { return it_++ ; }
keys_it& operator++( ) /* prefix */ { ++it_; return *this ; }
const key_type& operator* ( ) const { return it_->first ; }
const key_type& operator->( ) const { return it_->first ; }
keys_it operator+ (difference_type v ) const { return it_ + v ; }
bool operator==(const keys_it& rhs) const { return it_ == rhs.it_; }
bool operator!=(const keys_it& rhs) const { return it_ != rhs.it_; }
};
template<typename C>
class keys_impl
{
const C & c;
public:
keys_impl(const C & container) : c(container) {}
const keys_it<C> begin() const { return keys_it<C>(std::begin(c)); }
const keys_it<C> end () const { return keys_it<C>(std::end (c)); }
};
template<typename C>
keys_impl<C> keys(const C & container) { return keys_impl<C>(container); }
Usage:
std::map<std::string,int> my_map;
// fill my_map
for (const std::string & k : keys(my_map))
{
// do things
}
That's what i was looking for. But nobody had it, it seems.
You get my OCD code alignment as a bonus.
As an exercise, write your own for values(my_map)
I have a vector< Object > myvec which I use in my code to hold a list of objects in memory. I keep a pointer to the current object in that vector in the "normal" C fashion by using
Object* pObj = &myvec[index];
This all works fine if... myvec doesn't grow big enough that it is moved around during a push_back at which time pObj becomes invalid - vectors guarantee data is sequential, hence they make no effort to keep the vector at the same memory location.
I can reserve enough space for myvec to prevent this, but I dnt' like that solution.
I could keep the index of the selected myvec position and when I need to use it just access it directly, but it's a costly modification to my code.
I'm wondering if iterators keep the their references intact as a vector is reallocated/moved and if so can I just replace
Object* pObj = &myvec[index];
by something like
vector<Object>::iterator = myvec.begin()+index;
What are the implication of this?
Is this doable?
What is the standard pattern to save pointers to vector positions?
Cheers
No... using an iterator you would have the same exact problem. If a vector reallocation is performed then all iterators are invalidated and using them is Undefined Behavior.
The only solution that is reallocation-resistant with an std::vector is using the integer index.
Using for example std::list things are different, but also the are different efficiency compromises, so it really depends on what you need to do.
Another option would be to create your own "smart index" class, that stores a reference to the vector and the index. This way you could keep just passing around one "pointer" (and you could implement pointer semantic for it) but the code wouldn't suffer from reallocation risks.
Iterators are (potentially) invalidated by anything that could resize the vector (e.g., push_back).
You could, however, create your own iterator class that stored the vector and an index, which would be stable across operations that resized the vector:
#include <iterator>
#include <algorithm>
#include <iostream>
#include <vector>
namespace stable {
template <class T, class Dist=ptrdiff_t, class Ptr = T*, class Ref = T&>
class iterator : public std::iterator<std::random_access_iterator_tag, T, Dist, Ptr, Ref>
{
T &container_;
size_t index_;
public:
iterator(T &container, size_t index) : container_(container), index_(index) {}
iterator operator++() { ++index_; return *this; }
iterator operator++(int) { iterator temp(*this); ++index_; return temp; }
iterator operator--() { --index_; return *this; }
iterator operator--(int) { stable_itertor temp(*this); --index_; return temp; }
iterator operator+(Dist offset) { return iterator(container_, index_ + offset); }
iterator operator-(Dist offset) { return iterator(container_, index_ - offset); }
bool operator!=(iterator const &other) const { return index_ != other.index_; }
bool operator==(iterator const &other) const { return index_ == other.index_; }
bool operator<(iterator const &other) const { return index_ < other.index_; }
bool operator>(iterator const &other) const { return index_ > other.index_; }
typename T::value_type &operator *() { return container_[index_]; }
typename T::value_type &operator[](size_t index) { return container_[index_ + index]; }
};
template <class T>
iterator<T> begin(T &container) { return iterator<T>(container, 0); }
template <class T>
iterator<T> end(T &container) { return iterator<T>(container, container.size()); }
}
#ifdef TEST
int main() {
std::vector<int> data;
// add some data to the container:
for (int i=0; i<10; i++)
data.push_back(i);
// get iterators to the beginning/end:
stable::iterator<std::vector<int> > b = stable::begin(data);
stable::iterator<std::vector<int> > e = stable::end(data);
// add enough more data that the container will (probably) be resized:
for (int i=10; i<10000; i++)
data.push_back(i);
// Use the previously-obtained iterators:
std::copy(b, e, std::ostream_iterator<int>(std::cout, "\n"));
// These iterators also support most pointer-like operations:
std::cout << *(b+125) << "\n";
std::cout << b[150] << "\n";
return 0;
}
#endif
Since we can't embed this as a nested class inside of the container like a normal iterator class, this requires a slightly different syntax to declare/define an object of this type; instead of the usual std::vector<int>::iterator whatever;, we have to use stable::iterator<std::vector<int> > whatever;. Likewise, to obtain the beginning of a container, we use stable::begin(container).
There is one point that may be a bit surprising (at least at first): when you obtain a stable::end(container), that gets you the end of the container at that time. As shown in the test code above, if you later add more items to the container, the iterator your obtained previously is not adjusted to reflect the new end of the container -- it retains the position it had when you obtained it (i.e., the position that was the end of the container at that time, but isn't any more).
No, iterators are invalidated after vector growth.
The way to get around this problem is to keep the index to the item, not a pointer or iterator to it. This is because the item stays at its index, even if the vector grows, assuming of course that you don't insert any items before it (thus changing its index).