I have some code that enumerates some data, something like this:
int count;
InitDataEnumeration(/* some init params */, &count);
for (int i = 0; i < count; i++)
{
EnumGetData(i, &data);
// process data ...
}
I'd like to convert this code into a form suitable for C++11's range-based for.
I was thinking of defining a DataEnumerator wrapper class, whose constructor would call the above InitDataEnumeration() function.
The idea would be to use this wrapper class like this:
DataEnumerator enumerator{/* init params*/};
for (const auto& data : enumerator)
{
// process data ...
}
How could the former int-indexed for loop be refactored in the latter range-based form?
I was thinking of exposing begin() and end() methods from the enumerator wrapper class, but I don't know what kind of iterators they should return, and how to define such iterators.
Note that the iteration process is forward-only.
What you are looking for can be done with boost::irange. It will construct a lazy range of integers in the range [first, last) and you can just drop it right in like you use i in your for loop.
for (int i = 0; i < count; i++)
{
EnumGetData(i, &data);
// process data ...
}
Becomes
for (auto i : boost::irange(0, count))
{
EnumGetData(i, &data);
// process data ...
}
You need an input iterator. This example is copied verbatim from http://en.cppreference.com/w/cpp/iterator/iterator :
#include <iostream>
#include <algorithm>
#include <iterator> // std::iterator, std::input_iterator_tag
template<long FROM, long TO>
class Range {
public:
// member typedefs provided through inheriting from std::iterator
class iterator: public std::iterator<
std::input_iterator_tag, // iterator_category
long, // value_type
long, // difference_type
const long*, // pointer
long // reference
>{
long num = FROM;
public:
explicit iterator(long _num = 0) : num(_num) {}
iterator& operator++() {num = TO >= FROM ? num + 1: num - 1; return *this;}
iterator operator++(int) {iterator retval = *this; ++(*this); return retval;}
bool operator==(iterator other) const {return num == other.num;}
bool operator!=(iterator other) const {return !(*this == other);}
reference operator*() const {return num;}
};
iterator begin() {return iterator(FROM);}
iterator end() {return iterator(TO >= FROM? TO+1 : TO-1);}
};
int main() {
// std::find requires a input iterator
auto range = Range<15, 25>();
auto itr = std::find(range.begin(), range.end(), 18);
std::cout << *itr << '\n'; // 18
// Range::iterator also satisfies range-based for requirements
for(long l : Range<3, 5>()) {
std::cout << l << ' '; // 3 4 5
}
std::cout << '\n';
}
You are right about begin() and end(), whatever they return should supply:
operator++ (prefix only is enough)
operator!=
operator*
All pretty self-explanatory.
Notice that no traits or categories are required, as would be for iterators intended for some standard library algorithms - just bare minimum.
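Applied to the question, those three operators are enough for a wrapper along the following lines. This is only a sketch: the Data struct, the g_store vector and the simplified signatures of InitDataEnumeration() and EnumGetData() are hypothetical stand-ins so that the example compiles, and the real API will differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the question's C-style API (not the real one).
struct Data { int value; };
static std::vector<Data> g_store{{10}, {20}, {30}};

inline void InitDataEnumeration(int* count) {
    *count = static_cast<int>(g_store.size());
}
inline void EnumGetData(int index, Data* out) {
    *out = g_store[static_cast<std::size_t>(index)];
}

class DataEnumerator {
public:
    class iterator {
    public:
        explicit iterator(int index) : index_(index) {}
        // Fetch on demand; returned by value because the API copies the item out.
        Data operator*() const {
            Data d;
            EnumGetData(index_, &d);
            return d;
        }
        iterator& operator++() { ++index_; return *this; }
        bool operator!=(const iterator& other) const { return index_ != other.index_; }
    private:
        int index_;
    };

    DataEnumerator() { InitDataEnumeration(&count_); }
    iterator begin() const { return iterator{0}; }
    iterator end() const { return iterator{count_}; }

private:
    int count_ = 0;
};

// Usage: adds up the values visited by a range-based for.
inline int sum_all() {
    int sum = 0;
    DataEnumerator enumerator;
    for (const auto& data : enumerator) sum += data.value;
    return sum;
}
```

Because operator* returns by value, const auto& binds to a temporary whose lifetime lasts for the loop body, which is fine for read-only processing.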
for(auto x : y) Here, y must be an object of a class that has a begin() method and an end() method, each returning an object that implements the concept of an iterator. The iterator must be incrementable (++iter), must be able to accurately determine whether it differs from another iterator of the same kind (via !=), and must dereference to whatever x needs to be.
This is something you should consider doing if you either A) are bored or otherwise have nothing better to do or B) have a legit need to. While this is not difficult to do, neither is it trivial.
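To see why only those operations matter, it can help to write out by hand roughly what the compiler generates for a range-based for. The sketch below does that for a std::vector<int>; the function name sum_desugared is made up for illustration.

```cpp
#include <vector>

// Roughly what `for (int x : v) total += x;` expands to.
inline int sum_desugared(const std::vector<int>& v) {
    int total = 0;
    auto first = v.begin();
    auto last = v.end();
    for (; first != last; ++first) {  // only !=, prefix ++ and * are used
        int x = *first;
        total += x;
    }
    return total;
}
```

Any type whose begin() and end() results support these three expressions can take the place of the vector.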
This is a code example using std::reverse_iterator:
template<typename T, size_t SIZE>
class Stack {
T arr[SIZE];
size_t pos = 0;
public:
T pop() {
return arr[--pos];
}
Stack& push(const T& t) {
arr[pos++] = t;
return *this;
}
auto begin() {
return std::reverse_iterator(arr+pos);
}
auto end() {
return std::reverse_iterator(arr);
// ^ does reverse_iterator take this `one back`? how?
}
};
int main() {
Stack<int, 4> s;
s.push(5).push(15).push(25).push(35);
for(int val: s) {
std::cout << val << ' ';
}
}
// output is as expected: 35 25 15 5
When std::reverse_iterator is used as an adaptor for another iterator, the newly adapted end should logically be one before the original begin. However, calling std::prev on begin is UB.
How does std::reverse_iterator hold one before begin?
Initialization of std::reverse_iterator from an iterator does not decrease the iterator upon initialization, as it would then be UB when sending begin to it (one cannot assume that std::prev(begin) is a valid call).
The trick is simple: std::reverse_iterator holds the original iterator passed to it, without modifying it. Only when it is dereferenced does it peek back to the actual value. So in a way the iterator points at the next element, from which it can get the current one.
It would look something like:
// partial possible implementation of reverse_iterator for demo purpose
template<typename Itr>
class reverse_iterator {
Itr itr;
public:
constexpr explicit reverse_iterator(Itr itr): itr(itr) {}
constexpr auto& operator*() {
return *std::prev(itr); // <== only here we peek back
}
constexpr auto& operator++() {
--itr;
return *this;
}
friend bool operator!=(reverse_iterator<Itr> a, reverse_iterator<Itr> b) {
return a.itr != b.itr;
}
};
This is however an internal implementation detail (and can be in fact implemented in other similar manners). The user of std::reverse_iterator shall not be concerned with how it is implemented.
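Although the implementation is internal, the relationship it produces is observable through the public base() member, which returns the underlying iterator. The following sketch checks that relationship on a small vector; the function name is made up for illustration.

```cpp
#include <iterator>
#include <vector>

// Returns true if a reverse iterator dereferences to the element just
// before its underlying (base) iterator.
inline bool reverse_iterator_peeks_back() {
    std::vector<int> v{5, 15, 25, 35};
    auto rfirst = v.rbegin();  // wraps v.end()
    auto rlast = v.rend();     // wraps v.begin(), never stepped before it

    bool ok = rfirst.base() == v.end() && rlast.base() == v.begin();
    // Dereferencing yields the element one position before base().
    ok = ok && (&*rfirst == &*std::prev(rfirst.base()));
    ok = ok && (*rfirst == 35);
    return ok;
}
```

Note that rend() is never dereferenced here, so nothing ever has to step before begin().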
I want to create a range-like construct in c++, that will be used like this:
for (auto i: range(5,9))
cout << i << ' '; // prints 5 6 7 8
for (auto i: range(5.1,9.2))
cout << i << ' '; // prints 5.1 6.1 7.1 8.1 9.1
Handling the integer case is relatively easy:
template<typename T>
struct range
{
T from, to;
range(T from, T to) : from(from), to(to) {}
struct iterator
{
T current;
T operator*() { return current; }
iterator& operator++()
{
++current;
return *this;
}
bool operator==(const iterator& other) { return current == other.current; }
bool operator!=(const iterator& other) { return current != other.current; }
};
iterator begin() const { return iterator{ from }; }
iterator end() const { return iterator{ to }; }
};
However, this does not work in the float case, since the standard range-based for loop in C++ checks whether iter != end and not whether iter <= end as you would do in an ordinary for loop.
Is there a simple way to create an iterable object that will behave like a correct range based for-loop on floats?
Here is my attempt, which does not impair the semantics of iterators. Now, each iterator knows its stopping value. The iterator will set itself to this value upon exceeding it. All end iterators of ranges with an equal 'to' therefore compare equal.
template <typename T>
struct range {
T from, to;
range(T from, T to): from(from), to(to) {}
struct iterator {
const T to; // iterator knows its bounds
T current;
T operator*() { return current; }
iterator& operator++() {
++current;
if(current > to)
// make it an end iterator
// (current being exactly equal to 'current' of other end iterators)
current = to;
return *this;
}
bool operator==(const iterator& other) const // OT: note the const
{ return current == other.current; }
// OT: this is how we do !=
bool operator!=(const iterator& other) const { return !(*this == other); }
};
iterator begin() const { return iterator{to, from}; }
iterator end() const { return iterator{to, to}; }
};
Why is this better?
The solution by @JeJo relies on the order in which you compare those iterators, i.e. it != end versus end != it. In the case of range-based for, that order is defined, so it works there. Should you use this contraption in some other context, I advise the above approach.
Alternatively, if sizeof(T) > sizeof(void*), it makes sense to store a pointer to the originating range instance (which in the case of the range-for persists until the end) and use that to refer to a single T value:
template <typename T>
struct range {
T from, to;
range(T from, T to): from(from), to(to) {}
struct iterator {
range const* range;
T current;
iterator& operator++() {
++current;
if(current > range->to)
current = range->to;
return *this;
}
...
};
iterator begin() const { return iterator{this, from}; }
iterator end() const { return iterator{this, to}; }
};
Or it could be T const* const pointing directly to that value, it is up to you.
OT: Do not forget to make the internals private for both classes.
Instead of a range object you could use a generator (a coroutine using co_yield). Although coroutines are not yet in the standard (they are planned for C++20), some compilers already implement them.
See: https://en.cppreference.com/w/cpp/language/coroutines
With MSVC it would be:
#include <iostream>
#include <experimental/generator>
std::experimental::generator<double> rangeGenerator(double from, double to) {
for (double x=from;x <= to;x++)
{
co_yield x;
}
}
int main()
{
for (auto i : rangeGenerator(5.1, 9.2))
std::cout << i << ' '; // prints 5.1 6.1 7.1 8.1 9.1
}
Is there a simple way to create an iterable object that will behave
like a correct for loop on floats?
The simplest hack† would be to use the trait std::is_floating_point to return a different result (i.e. iter <= end) from the operator!= overload for floating-point types.
#include <type_traits>
bool operator!=(const iterator& other)
{
if constexpr (std::is_floating_point_v<T>) return current <= other.current;
return !(*this == other);
}
†Warning: Even though that does the job, it breaks the meaning of operator!= overload.
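The broken meaning shows up as soon as the two operands are swapped. The sketch below uses a stripped-down, hypothetical hacked_iterator with only the hacked comparison to make that visible.

```cpp
// Hypothetical reduced iterator carrying only the hacked comparison.
struct hacked_iterator {
    double current;
    // The hack: "not equal" really means "has not passed the other one".
    bool operator!=(const hacked_iterator& other) const {
        return current <= other.current;
    }
};

// A genuine inequality is symmetric; this one is not.
inline bool comparison_is_symmetric() {
    hacked_iterator it{5.1};
    hacked_iterator end{9.2};
    return (it != end) == (end != it);  // true vs. false, so this returns false
}
```

A range-based for only ever evaluates it != end in that order, which is why the hack appears to work there.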
Alternative Solution
The entire range class can be replaced by a simple function in which the values of the range are populated with the help of std::iota in the standard container std::vector.
Use SFINAE to restrict the function to the valid types only.
This way, you can rely on standard implementations and avoid reinventing them.
#include <iostream>
#include <type_traits>
#include <vector> // std::vector
#include <numeric> // std::iota
#include <cstddef> // std::size_t
#include <cmath> // std::modf
// traits for valid template types(integers and floating points)
template<typename Type>
using is_integers_and_floats = std::conjunction<
std::is_arithmetic<Type>,
std::negation<std::is_same<Type, bool>>,
std::negation<std::is_same<Type, char>>,
std::negation<std::is_same<Type, char16_t>>,
std::negation<std::is_same<Type, char32_t>>,
std::negation<std::is_same<Type, wchar_t>>
/*, std::negation<std::is_same<char8_t, Type>> */ // since C++20
>;
template <typename T>
auto ragesof(const T begin, const T end)
-> std::enable_if_t<is_integers_and_floats<T>::value, std::vector<T>>
{
if (begin >= end) return std::vector<T>{}; // edge case to be considered
// find the number of elements between the range
const std::size_t size = [begin, end]() -> std::size_t
{
const std::size_t diffWhole
= static_cast<std::size_t>(end) - static_cast<std::size_t>(begin);
if constexpr (std::is_floating_point_v<T>) {
double whole; // get the decimal parts of begin and end
const double decimalBegin = std::modf(static_cast<double>(begin), &whole);
const double decimalEnd = std::modf(static_cast<double>(end), &whole);
return decimalBegin <= decimalEnd ? diffWhole + 1 : diffWhole;
}
return diffWhole;
}();
// construct and initialize the `std::vector` with size
std::vector<T> vec(size);
// populates the range from [first, end)
std::iota(std::begin(vec), std::end(vec), begin);
return vec;
}
int main()
{
for (auto i : ragesof( 5, 9 ))
std::cout << i << ' '; // prints 5 6 7 8
std::cout << '\n';
for (auto i : ragesof(5.1, 9.2))
std::cout << i << ' '; // prints 5.1 6.1 7.1 8.1 9.1
}
A floating-point loop or iterator should typically use integer types to hold the total number of iterations and the number of the current iteration, and then compute the "loop index" value used within the loop based upon those and loop-invariant floating-point values.
For example:
for (int i=-10; i<=10; i++)
{
double x = i/10.0; // Substituting i*0.1 would be faster but less accurate
}
or
for (int i=0; i<=16; i++)
{
double x = ((startValue*(16-i))+(endValue*i))*(1.0/16); // 1.0/16, since integer 1/16 would be 0
}
Note that there is no possibility of rounding errors affecting the number of iterations. The latter calculation is guaranteed to yield a correctly-rounded result at the endpoints; computing startValue+i*((endValue-startValue)/16) would likely be faster (since the loop-invariant (endValue-startValue)/16 can be hoisted) but may be less accurate.
Using an integer iterator along with a function to convert an integer to a floating-point value is probably the most robust way to iterate over a range of floating-point values. Trying to iterate over floating-point values directly is far more likely to yield "off-by-one" errors.
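Packaged as a range for the range-based for, that idea could look like the following sketch. The class name stepped_range, its (start, step, count) constructor and the collect() helper are assumptions made for illustration; the caller supplies the number of elements explicitly so that no floating-point comparison decides when the loop ends.

```cpp
#include <cstddef>
#include <vector>

// Yields start, start + step, ... for exactly `count` elements.
class stepped_range {
public:
    stepped_range(double start, double step, std::size_t count)
        : start_(start), step_(step), count_(count) {}

    class iterator {
    public:
        iterator(const stepped_range* owner, std::size_t index)
            : owner_(owner), index_(index) {}
        // The value is recomputed from the integer index, so errors do not accumulate.
        double operator*() const {
            return owner_->start_ + owner_->step_ * static_cast<double>(index_);
        }
        iterator& operator++() { ++index_; return *this; }
        // Comparisons look only at the integer index.
        bool operator==(const iterator& o) const { return index_ == o.index_; }
        bool operator!=(const iterator& o) const { return index_ != o.index_; }
    private:
        const stepped_range* owner_;
        std::size_t index_;
    };

    iterator begin() const { return iterator{this, 0}; }
    iterator end() const { return iterator{this, count_}; }

private:
    double start_;
    double step_;
    std::size_t count_;
};

// Usage: gathers the values produced by a range-based for.
inline std::vector<double> collect(const stepped_range& r) {
    std::vector<double> out;
    for (double x : r) out.push_back(x);
    return out;
}
```

For example, collect(stepped_range(5.1, 1.0, 5)) visits five values starting at 5.1, and operator!= keeps its usual meaning.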
I have a class that contains the vector of elements of the specific class. The main idea is to generate periodic sequence of the elements, based on the one period of the sequence (elems_) and the number of the periods (nperiod_) so I do not need to store all elements, but just one period.
class PeriodicContainer
{
private:
std::vector<Class> elems_; // elements
size_t nperiod_; // period of repetition of elems_
public:
PeriodicContainer();
PeriodicContainer(const std::vector<Class>& elems, size_t nperiod);
/*...*/
};
Is it possible to implement custom iterator for the PeriodicContainer so that I can do things like (semi-pseudo-code):
PeriodicContainer container({Class(1), Class(2)}, 4);
for (auto it : container)
std::cout << it << '\n';
and the output will be
Class(1)
Class(2)
Class(1)
Class(2)
Class(1)
Class(2)
Class(1)
Class(2)
If you can use range-v3, you can do:
namespace rv = ranges::views;
std::vector<Class> Container { Class(1), Class(2) };
for (auto it : rv::repeat_n(Container, 4) | rv::join)
std::cout << it;
and not have to write any additional code yourself. This will also work for any contiguous container, not just std::vector.
If your underlying container is simply a std::vector, then you know that it's a contiguous container -- which actually makes this quite easy.
You can form an iterator from the following:
A pointer (or reference) to the container being iterated, and
The current iteration count (note: not 'index'). This will be used as "index" into the underlying container's operator[] after wrapping around the container's size().
The behavior of this iterator would be simply:
Each increment just increments the current count
Each dereference returns (*elems_)[current_ % elems_->size()], which will account for the loop-around for the "period".
The begin() would simply return an iterator with a 0 count, and
The end() would return an iterator with a count of elems_.size() * nperiod_
An example of what this could look like as a LegacyForwardIterator is the following:
template <typename T>
class PeriodicContainerIterator
{
public:
using value_type = T;
using reference = T&;
using pointer = T*;
using difference_type = std::ptrdiff_t;
using iterator_category = std::forward_iterator_tag;
PeriodicContainerIterator(std::vector<T>* elems, std::size_t current)
: elems_{elems},
current_{current}
{}
reference operator*() {
return (*elems_)[current_ % elems_->size()];
}
pointer operator->() {
return &(*elems_)[current_ % elems_->size()];
}
// increment modifies current_, so these cannot be const
PeriodicContainerIterator& operator++() {
++current_;
return (*this);
}
PeriodicContainerIterator operator++(int) {
auto copy = (*this);
++current_;
return copy;
}
bool operator==(const PeriodicContainerIterator& other) const {
return current_ == other.current_;
}
bool operator!=(const PeriodicContainerIterator& other) const {
return current_ != other.current_;
}
private:
std::vector<T>* elems_;
std::size_t current_;
};
The container would then define begin() and end() as:
PeriodicContainerIterator<Class> begin() {
return PeriodicContainerIterator<Class>{&elems_, 0};
}
PeriodicContainerIterator<Class> end() {
return PeriodicContainerIterator<Class>{&elems_, elems_.size() * nperiod_};
}
You could easily make this all the way up to a LegacyRandomAccessIterator, but this requires a lot of extra functions which will bulk this answer.
If you don't specifically need this as an iterator but just want a simple way to visit each element in the periodic sequence, it might be easier to read / understand if you were to make this into a for_each-like call that expects a callback instead. For example:
template <typename Fn>
void forEach(Fn&& fn)
{
for (std::size_t i = 0; i < nperiod_; ++i) {
for (auto& e : elems_) {
fn(e);
}
}
}
Which allows for use like:
container.forEach([&](auto& e){
// 'e' is each visited element
});
I have a big vector of items that belong to a certain class.
struct item {
int class_id;
//some other data...
};
The same class_id can appear multiple times in the vector, and the vector is constructed once and then sorted by class_id. So all elements of the same class are next to each other in the vector.
I later have to process the items per class, i.e. I update all items of the same class but do not modify any item of a different class. Since I have to do this for all items and the code is trivially parallelizable, I wanted to use Microsoft PPL with Concurrency::parallel_for_each(). I therefore needed an iterator and came up with a forward iterator that returns the range of all items with a certain class_id as a proxy object. The proxy is simply a std::pair, and the proxy is the iterator's value type.
using item_iterator = std::vector<item>::iterator;
using class_range = std::pair<item_iterator, item_iterator>;
//iterator definition
class per_class_iterator : public std::iterator<std::forward_iterator_tag, class_range> { /* ... */ };
By now I was able to loop over all my classes and update the items like this.
std::vector<item> items;
//per_class_* returns a per_class_iterator
std::for_each(items.per_class_begin(), items.per_class_end(),
[](class_range r)
{
//do something for all items in r
std::for_each(r.first, r.second, /* some work */);
});
When replacing std::for_each with Concurrency::parallel_for_each the code crashed. After debugging I found the problem to be the following code in _Parallel_for_each_helper in ppl.h at line 2772 ff.
// Add a batch of work items to this functor's array
for (unsigned int _Index=0; (_Index < _Size) && (_First != _Last); _Index++)
{
_M_element[_M_len++] = &(*_First++);
}
It uses postincrement (so a temporary iterator is returned), dereferences that temporary iterator and takes the address of the dereferenced item. This only works if the item returned by dereferencing a temporary object survives, i.e. basically if it points directly into the container. So fixing this is easy, albeit the per-class std::for_each work loop has to be replaced with a for loop.
//it := iterator somewhere into the vector of items (item_iterator)
for(const auto cur_class = it->class_id; cur_class == it->class_id; ++it)
{
/* some work */
}
My question is if returning proxy objects the way I did is violating the standard or if the assumption that every iterator dereferences into permanent data has been made by Microsoft for their library, but is not documented. At least I could not find any documentation on the iterator requirements for parallel_for_each() except that either a random access or a forward iterator are expected. I have seen the question about forward iterators and vector but since my iterator's reference type is const value_type& I still think my iterator is ok by the standard. So is a forward iterator returning a proxy object still a valid forward iterator? Or put another way, is it ok for an iterator to have a value type different from a type that is actually stored somewhere in a container?
Compilable example:
#include <vector>
#include <utility>
#include <cassert>
#include <iterator>
#include <memory>
#include <algorithm>
#include <iostream>
#include <ppl.h>
using identifier = int;
struct item
{
identifier class_id;
// other data members
// ...
bool operator<(const item &rhs) const
{
return class_id < rhs.class_id;
}
bool operator==(const item &rhs) const
{
return class_id == rhs.class_id;
}
//inverse operators omitted
};
using container = std::vector<item>;
using item_iterator = typename container::iterator;
using class_range = std::pair<item_iterator, item_iterator>;
class per_class_iterator : public std::iterator<std::forward_iterator_tag, class_range>
{
public:
per_class_iterator() = default;
per_class_iterator(const per_class_iterator&) = default;
per_class_iterator& operator=(const per_class_iterator&) = default;
explicit per_class_iterator(container &data) :
data_(std::addressof(data)),
class_(equal_range(data_->front())), //this would crash for an empty container. assume it's not.
next_(class_.second)
{
assert(!data_->empty()); //a little late here
assert(std::is_sorted(std::cbegin(*data_), std::cend(*data_)));
}
reference operator*()
{
//if data_ is unset the iterator is an end iterator. dereferencing end iterators is bad.
assert(data_ != nullptr);
return class_;
}
per_class_iterator& operator++()
{
assert(data_ != nullptr);
//if we are at the end of our data
if(next_ == data_->end())
{
//reset the data pointer, ie. make iterator an end iterator
data_ = nullptr;
}
else
{
//set to the class of the next element
class_ = equal_range(*next_);
//and update the next_ iterator
next_ = class_.second;
}
return *this;
}
per_class_iterator operator++(int)
{
per_class_iterator tmp{*this};
++(*this);
return tmp;
}
bool operator!=(const per_class_iterator &rhs) const noexcept
{
return (data_ != rhs.data_) ||
(data_ != nullptr && rhs.data_ != nullptr && next_ != rhs.next_);
}
bool operator==(const per_class_iterator &rhs) const noexcept
{
return !(*this != rhs);
}
private:
class_range equal_range(const item &i) const
{
return std::equal_range(data_->begin(), data_->end(), i);
}
container* data_ = nullptr;
class_range class_;
item_iterator next_;
};
per_class_iterator per_class_begin(container &c)
{
return per_class_iterator{c};
}
per_class_iterator per_class_end()
{
return per_class_iterator{};
}
int main()
{
std::vector<item> items;
items.push_back({1});
items.push_back({1});
items.push_back({3});
items.push_back({3});
items.push_back({3});
items.push_back({5});
//items are already sorted
//#define USE_PPL
#ifdef USE_PPL
Concurrency::parallel_for_each(per_class_begin(items), per_class_end(),
#else
std::for_each(per_class_begin(items), per_class_end(),
#endif
[](class_range r)
{
//this loop *cannot* be parallelized trivially
std::for_each(r.first, r.second,
[](item &i)
{
//update item (by evaluating all other items of the same class) ...
//building big temporary data structure for all items of same class ...
//i.processed = true;
std::cout << "item: " << i.class_id << '\n';
});
});
return 0;
}
When you're writing a proxy iterator, the reference type should be a class type, precisely because it can outlive the iterator it is derived from. So, for a proxy iterator, when instantiating the std::iterator base you should specify the Reference template parameter as a class type, typically the same as the value type:
class per_class_iterator : public std::iterator<
std::forward_iterator_tag, class_range, std::ptrdiff_t, class_range*, class_range>
~~~~~~~~~~~
Unfortunately, PPL is not keen on proxy iterators and will break compilation:
ppl.h(2775): error C2338: lvalue required for forward iterator operator *
ppl.h(2772): note: while compiling class template member function 'Concurrency::_Parallel_for_each_helper<_Forward_iterator,_Function,1024>::_Parallel_for_each_helper(_Forward_iterator &,const _Forward_iterator &,const _Function &)'
with
[
_Forward_iterator=per_class_iterator,
_Function=main::<lambda_051d98a8248e9970abb917607d5bafc6>
]
This is actually a static_assert:
static_assert(std::is_lvalue_reference<decltype(*_First)>::value, "lvalue required for forward iterator operator *");
This is because the enclosing class _Parallel_for_each_helper stores an array of pointers and expects to be able to indirect them later:
typename std::iterator_traits<_Forward_iterator>::pointer _M_element[_Size];
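The property that this static_assert tests can be checked in isolation for any iterator type. The sketch below wraps the same std::is_lvalue_reference test in a small trait; the name derefs_to_lvalue is made up for illustration, and std::vector<bool> serves as a well-known example of a proxy iterator from the standard library.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// True when dereferencing an It yields an lvalue reference (T& or const T&).
template <typename It>
struct derefs_to_lvalue
    : std::is_lvalue_reference<decltype(*std::declval<It&>())> {};

// An ordinary container iterator refers to stored elements.
static_assert(derefs_to_lvalue<std::vector<int>::iterator>::value,
              "vector<int>::iterator yields int&");

// std::vector<bool> returns a proxy object by value from operator*.
static_assert(!derefs_to_lvalue<std::vector<bool>::iterator>::value,
              "vector<bool>::iterator yields a proxy by value");
```

An iterator that fails this check would be rejected in the same way as the proxy iterator above.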
Since PPL doesn't check that pointer is actually a pointer, we can exploit this by supplying a proxy pointer with an operator* and overloading class_range::operator&:
struct class_range_ptr;
struct class_range : std::pair<item_iterator, item_iterator> {
using std::pair<item_iterator, item_iterator>::pair;
class_range_ptr operator&();
};
struct class_range_ptr {
class_range range;
class_range& operator*() { return range; }
class_range const& operator*() const { return range; }
};
inline class_range_ptr class_range::operator&() { return{*this}; }
class per_class_iterator : public std::iterator<
std::forward_iterator_tag, class_range, std::ptrdiff_t, class_range_ptr, class_range&>
{
// ...
This works great:
item: item: 5
1
item: 3item: 1
item: 3
item: 3
Press any key to continue . . .
To answer your direct question: no, an iterator does not have to be related to any kind of container. About the only requirements for an iterator are that it must:
be copy-constructible, copy-assignable and destructible
support equality/inequality
be dereferenceable
An iterator does not necessarily have to be tied to a particular container (see generators), so it cannot be said that "it has to have the same type as the container", because in the generic case there is no container.
It seems, however, that a custom iterator class may actually be overkill in your case. Here's why:
In C++, an array/vector end iterator is an iterator pointing just past the last item.
Given a vector of objects of "classes" (in your definition) A,B,C, etc., filled like following:
AAAAAAABBBBBBBBBBBBCCCCCCCD.......
You can just take regular vector iterators that will act as your range starts and ends:
AAAAAAABBBBBBBBBBBBCCCCCCCD......Z
^ ^ ^ ^ ^
i1 i2 i3 i4 iN
For the iterators shown here, the following is true:
i1 is begin iterator for class A
i2 is end iterator for class A and begin iterator for class B
i3 is end iterator for class B and begin iterator for class C etc.
Hence, for each class you can have a pair of iterators which are start and end of the respective class range.
Hence, your processing is as trivial as:
for(auto it = i1; it != i2; ++it) processA(*it);
for(auto it = i2; it != i3; ++it) processB(*it);
for(auto it = i3; it != i4; ++it) processC(*it);
Each loop being trivially parallelizable.
parallel_for_each(i1, i2, processA);
parallel_for_each(i2, i3, processB);
parallel_for_each(i3, i4, processC);
To use a range-based for, you can introduce a substitute range class:
template <typename T>
class vector_range {
public:
typename std::vector<T>::const_iterator begin() const { return _begin; }
typename std::vector<T>::const_iterator end() const { return _end; }
// Trivial constructor filling _begin and _end fields
};
That is to say, you don't really need proxy iterators to parallelize the loops; the way C++ iterators work already covers your case.
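The boundary iterators i1, i2, i3, ... can be collected in one pass over the sorted vector. The following is a minimal sketch, assuming the items are already sorted by class_id as in the question; the item struct is cut down to its key and the helper name split_by_class is made up for illustration.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

struct item { int class_id; };  // reduced to the key for this sketch

using item_iterator = std::vector<item>::iterator;
using class_range = std::pair<item_iterator, item_iterator>;

// Splits a vector sorted by class_id into one [begin, end) pair per class.
inline std::vector<class_range> split_by_class(std::vector<item>& items) {
    std::vector<class_range> ranges;
    auto first = items.begin();
    while (first != items.end()) {
        const int id = first->class_id;
        // First element of a different class; valid because equal ids are adjacent.
        auto last = std::find_if(first, items.end(),
                                 [id](const item& i) { return i.class_id != id; });
        ranges.emplace_back(first, last);
        first = last;
    }
    return ranges;
}
```

The result is an ordinary std::vector<class_range>, so its iterators dereference to stored objects rather than temporaries, and each range can then be processed independently.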