I have a loop like this (where mySet is a std::set):
for(auto iter=mySet.begin(); iter!=mySet.end(); ++iter){
if (someCondition){mySet.insert(newElement);}
if (someotherCondition){mySet.insert(anothernewElement);}
}
I am experiencing some strange behavior, and I am asking myself if this could be due to the inserted element being inserted "before" the current iterator position in the loop. Namely, I have an iteration where both conditions are true, but still the distance
distance(iter, mySet.end())
is only 1, not 2 as I would expect. Is my guess about set behavior right? And more importantly, can I still do what I want to do?
What I'm trying to do is build "chains" on a hexagonal board between fields of the same color. I have a set containing all fields of my color; the conditions check the color of neighboring fields and, if they are of the same color, copy the neighboring field to mySet, that is, to the chain.
I am trying to use std::set for this because it does not allow a field to be in the chain more than once. Reading the comments so far, I fear I need to switch to std::vector, where push_back() will surely add the element at the end, but then I will run into new problems because I have to find a way to prevent duplicate elements. I am therefore hoping for advice on how best to solve this.
Depending on the new element's value, it may be inserted before or after current iterator value. Below is an example of inserting before and after an iterator.
#include <iostream>
#include <set>
int main()
{
std::set<int> s;
s.insert(3);
auto it = s.begin();
std::cout << std::distance(it, s.end()) << std::endl; // prints 1
s.insert(2); // 2 will be inserted before it
std::cout << std::distance(it, s.end()) << std::endl; // prints 1
s.insert(5); // 5 will be inserted after it
std::cout << std::distance(it, s.end()) << std::endl; // prints 2
}
Regarding your question in the comments ("In my particular case, modifying it while iterating is basically exactly what I want, but of course I need to add everything after the current position"): no, you cannot manually arrange the order of the elements. A new value's position is determined by comparing it with the existing elements. Below is the quote from cppreference.
std::set is an associative container that contains a sorted set of unique objects of type Key. Sorting is done using the key comparison function Compare. Search, removal, and insertion operations have logarithmic complexity. Sets are usually implemented as red-black trees.
Thus, the implementation of the set will decide where exactly it will be placed.
If you really need to add values after current position, you need to use a different container. For example, simply a vector would be suitable:
it = myvector.insert ( it+1 , 200 ); // +1 to add after it
If you have a small number of items, a brute-force check to see whether they are already inside a vector can actually be faster than checking whether they are in a set. This is because vectors store their elements contiguously and so tend to have better cache locality than node-based containers such as std::set.
We can write a function to do this pretty easily:
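#include <algorithm> // std::find
#include <vector>
template<class T>
void insert_unique(std::vector<T>& vect, T const& elem) {
    // Append only if the element is not already present
    if(std::find(vect.begin(), vect.end(), elem) == vect.end()) {
        vect.push_back(elem);
    }
}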
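For the chain-building use case from the question, one possible way to combine both ideas is to keep the chain itself in a std::vector, so that new fields always land after the current position, and to use a std::set only to reject duplicates. The following is a minimal sketch, not a definitive implementation. It assumes that a field can be identified by an int and that a hypothetical callback neighbors_of returns the same-colored neighbors of a field; neither name comes from the question.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

// Assumptions for this sketch: a field is identified by an int, and
// neighbors_of(field) returns the same-colored neighbors of that field.
using Field = int;
using NeighborFn = std::function<std::vector<Field>(Field)>;

// Builds the chain that starts at `start`. The vector keeps insertion order,
// the set only answers "has this field been added already?".
std::vector<Field> build_chain(Field start, const NeighborFn& neighbors_of) {
    std::vector<Field> chain{start};
    std::set<Field> seen{start};
    // Index-based loop: push_back may reallocate and invalidate iterators,
    // but an index into the vector stays valid.
    for (std::size_t i = 0; i < chain.size(); ++i) {
        for (Field next : neighbors_of(chain[i])) {
            if (seen.insert(next).second) {  // .second is true only for new fields
                chain.push_back(next);       // always appended after position i
            }
        }
    }
    return chain;
}
```

The loop ends on its own once no unseen neighbor is left, because every field can be appended at most once.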
I am trying to have multiple iterators to a bit more complex range (using range-v3 library) -- manually implementing a cartesian product, using filter, for_each and yield. However, when I tried to hold multiple iterators to such range, they share a common value. For example:
#include <vector>
#include <iostream>
#include <range/v3/view/for_each.hpp>
#include <range/v3/view/filter.hpp>
int main() {
std::vector<int> data1{1,5,2,7,6};
std::vector<int> data2{1,5,2,7,6};
auto range =
data1
| ranges::v3::view::filter([](int v) { return v%2; })
| ranges::v3::view::for_each([&data2](int v) {
return data2 | ranges::v3::view::for_each([v](int v2) {
return ranges::v3::yield(std::make_pair(v,v2));
});
});
auto it1 = range.begin();
for (auto it2 = range.begin(); it2 != range.end(); ++it2) {
std::cout << "[" << it1->first << "," << it1->second << "] [" << it2->first << "," << it2->second << "]\n";
}
return 0;
}
I expected the iterator it1 to keep pointing at the beginning of the range, while the iterator it2 goes through the whole sequence. To my surprise, it1 is incremented as well! I get the following output:
[1,1] [1,1]
[1,5] [1,5]
[1,2] [1,2]
[1,7] [1,7]
[1,6] [1,6]
[5,1] [5,1]
[5,5] [5,5]
[5,2] [5,2]
[5,7] [5,7]
[5,6] [5,6]
[7,1] [7,1]
[7,5] [7,5]
[7,2] [7,2]
[7,7] [7,7]
[7,6] [7,6]
Why is that?
How can I avoid this?
How can I keep multiple, independent iterators pointing in various locations of the range?
Should I implement a cartesian product in a different way? (that's my previous question)
While it is not reflected in the MCVE above, consider a use case where someone tries to implement something similar to std::max_element - trying to return an iterator to the highest-valued pair in the cross product. While looking for the highest value you need to store an iterator to the current best candidate. It cannot alter while you search, and it would be cumbersome to manage the iterators if you need a copy of the range (as suggested in one of the answers).
Materialising the whole cross product is not an option either, as it requires a lot of memory. After all, the whole point of using ranges with filters and other on-the-fly transformations is to avoid such materialisation.
It seems that the resulting view stores state such that it turns out to be single pass. You can work around that by simply making as many copies of the view as you need:
int main() {
std::vector<int> data1{1,5,2,7,6};
std::vector<int> data2{1,5,2,7,6};
auto range =
data1
| ranges::v3::view::filter([](int v) { return v%2; })
| ranges::v3::view::for_each([&data2](int v) {
return data2 | ranges::v3::view::for_each([v](int v2) {
return ranges::v3::yield(std::make_pair(v,v2));
});
});
auto range1= range; // Copy the view adaptor
auto it1 = range1.begin();
for (auto it2 = range.begin(); it2 != range.end(); ++it2) {
std::cout << "[" << it1->first << "," << it1->second << "] [" << it2->first << "," << it2->second << "]\n";
}
std::cout << '\n';
for (; it1 != range1.end(); ++it1) { // Consume the copied view
std::cout << "[" << it1->first << "," << it1->second << "]\n";
}
return 0;
}
Another option would be materializing the view into a container as mentioned in the comments.
Keeping in mind the aforementioned limitation of single-pass views, it is not really hard to implement a max_element function that returns an iterator, with the important drawback of having to compute the sequence one time and a half.
Here's a possible implementation:
template <typename InputRange,typename BinaryPred = std::greater<>>
auto my_max_element(InputRange &range1,BinaryPred &&pred = {}) -> decltype(range1.begin()) {
auto range2 = range1;
auto it1 = range1.begin();
std::ptrdiff_t pos = 0L;
for (auto it2 = range2.begin(); it2 != range2.end(); ++it2) {
if (pred(*it2,*it1)) {
ranges::advance(it1,pos); // Computing again the sequence as the iterator advances!
pos = 0L;
}
++pos;
}
return it1;
}
What is going on here?
The entire problem here originates in the fact that std::max_element requires its arguments to be LegacyForwardIterators while the ranges created by ranges::v3::yield apparently (obviously?) only provide LegacyInputIterators. Unfortunately, the range-v3 docs do not explicitly mention the iterator categories one can expect (at least I haven't found it being mentioned). This would indeed be a huge enhancement as all standard library algorithms do explicitly state what iterator categories they require.
In the particular case of std::max_element you are not the first one to stumble over this counterintuitive requirement of ForwardIterator rather than just InputIterator, see Why does std::max_element require a ForwardIterator? for example. In summary, it does make sense, though, because std::max_element does not (despite the name suggesting it) return the max element, but an iterator to the max element. Hence, it is in particular the multipass guarantee that is missing on InputIterator in order to make std::max_element work with it.
For this reason, many other standard library iterators do not work with std::max_element either, e.g. std::istreambuf_iterator, which really is a pity: you just cannot get the max element from a file with the existing standard library! You either have to load the entire file into memory first, or you have to use your own max algorithm.
The standard library is simply missing an algorithm that really returns the max element rather than an iterator pointing to the max element. Such an algorithm could work with InputIterators as well. Of course, this can very easily be implemented manually, but still it would be handy to have this given by the standard library. I can only speculate why it doesn't exist. Maybe one reason is that it would require the value_type to be copy constructible, because InputIterator is not required to return references to the elements, and it might in turn be counterintuitive for a max algorithm to make a copy...
So, now regarding your actual questions:
Why is this? (i.e. why does your range only return InputIterators?)
Obviously, yield creates the values on the fly. This is by design, it's the very reason why one would want to use yield: to not have to create (and thus store) the range upfront. Hence, I do not see how yield could be implemented in a way that it fulfills the multipass guarantee, especially the second bullet is giving me headaches:
If a and b compare equal (a == b is contextually convertible to true) then either they are both non-dereferenceable or *a and *b are references bound to the same object
Technically, I could imagine that one could implement yield in a way that all iterators created from one range share a common internal storage that is filled on the fly during the first traversal. Then it would be possible for different iterators to give you the same references to underlying objects. But then std::max_element would silently consume O(n²) memory (all elements of your cartesian product). So, in my opinion it's definitely better to not do this and instead make the users materialize the range themselves, so that they are aware of it happening.
How can I avoid this?
Well, as already said by metalfox, you can copy your view which would result in different ranges and thus independent iterators. Still, that wouldn't make std::max_element work. So, given the nature of yield the answer to this question, unfortunately, is: you simply cannot avoid this with yield or any other technique that creates values on the fly.
How can I keep multiple, independent iterators pointing in various locations of the range?
This is related to the previous question. Basically, this question answers itself: If you want to point independent iterators in various locations, these locations have to exist somewhere in memory. So, you need to materialize at least those elements that did once have an iterator pointing to them, which in case of std::max_element means that you have to materialize all of them.
Should I implement a cartesian product in a different way?
I can imagine many different implementations. But none of them will be able to provide both of these properties all together:
return ForwardIterators
require less than O(n²) memory
Technically, it could be possible to implement an iterator that is specialized for the usage with std::max_element, meaning that it keeps only the current max element in memory so that it can be referenced... But this would be somewhat ridiculous, wouldn't it? We cannot expect a general purpose library like range-v3 to come up with such highly specialized iterator categories.
Summary
You are saying
After all, I don't think my use case is such a rare outlier and ranges
are planned to be added to the C++20 standard - so there should be
some reasonable way to achieve this without traps...
I definitely agree that "this is not a rare outlier"! However, that doesn't necessarily imply that "there should be some reasonable way to achieve this without traps". Consider e.g. NP-hard problems. It is not a rare outlier to be facing one. Still, it is impossible (unless P=NP) to solve them in polynomial time. And in your case it is simply not possible to use std::max_element without ForwardIterators. And it is not possible to implement a ForwardIterator (as defined by the standard library) on a cartesian product without consuming O(n²) memory.
For the particular case of std::max_element I would suggest to just implement your own version that returns the max element rather than an iterator pointing to it.
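Such a value-returning version could look roughly like the sketch below. The names max_value and max_int_in are made up for illustration, and returning a caller-supplied fallback for an empty range is just one possible design; it also assumes the value type is copyable and comparable with operator<.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Returns the largest value in [first, last) by value, or `fallback` if the
// range is empty. Only a single pass is needed, so input iterators suffice.
template <class InputIt, class T>
T max_value(InputIt first, InputIt last, T fallback) {
    if (first == last) return fallback;
    T best = *first;
    for (++first; first != last; ++first) {
        T current = *first;  // dereference each position exactly once
        if (best < current) best = current;
    }
    return best;
}

// Usage with a single-pass source: the largest int in a whitespace-separated
// text, or -1 (an arbitrary fallback chosen for this example) if there is none.
int max_int_in(const std::string& text) {
    std::istringstream in(text);
    return max_value(std::istream_iterator<int>(in),
                     std::istream_iterator<int>(), -1);
}
```

Because the result is a copy of the element, nothing has to stay alive in the underlying range after the pass is over.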
However, if I understand your question correctly your concern is more general and std::max_element is just an example. So, I have to disappoint you. Even with the existing standard library some trivial things are impossible due to incompatible iterator categories (again, std::istreambuf_iterator is an existing example). So, if range-v3 happens to be added, there will just be some more of such examples.
So, finally, my recommendation is to just go with your own algorithms, if possible, and swallow the pill of materializing a view otherwise.
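As one illustration of such a hand-written algorithm, plain nested loops can visit the filtered cross product on the fly and remember the best candidate as a pair of indices rather than as an iterator. This is only a sketch under assumptions: ProductPosition and max_product_position are hypothetical names, both inputs are assumed to be random-access containers, and "best" is defined as the largest sum purely for the sake of the example.

```cpp
#include <cstddef>
#include <vector>

// Position of one element of the cross product, stored as two indices.
struct ProductPosition {
    std::size_t i;
    std::size_t j;
    bool valid;  // false if the filtered product is empty
};

// Finds the pair (a[i], b[j]) with the largest sum, considering only odd a[i]
// (the same filter as in the question). No pair is ever stored in memory.
ProductPosition max_product_position(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    ProductPosition best{0, 0, false};
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] % 2 == 0) continue;  // skip even values of the first range
        for (std::size_t j = 0; j < b.size(); ++j) {
            if (!best.valid || a[best.i] + b[best.j] < a[i] + b[j]) {
                best = ProductPosition{i, j, true};
            }
        }
    }
    return best;
}
```

The index pair plays the role of the stored iterator: it stays meaningful while the search continues and costs constant memory.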
An iterator is a pointer to an element in the vector, in this case, it1 points to the beginning of the vector. And hence, if you are trying to point the iterator to the same location of the vector, they will be the same. However, you can have multiple iterators pointing to different locations of the vector. Hope this answers your question.
I have a C++11 list of complex elements that are defined by a structure node_info. A node_info element, in particular, contains a field time and is inserted into the list in an ordered fashion according to its time field value. That is, the list contains various node_info elements that are time ordered. I want to remove from this list all the nodes that verify some specific condition specified by coincidence_detect, which I am currently implementing as a predicate for a remove_if operation.
Since my list can be very large (order of 100k -- 10M elements), and for the way I am building my list this coincidence_detect condition is only verified by few (thousands) elements closer to the "lower" end of the list -- that is the one that contains elements whose time value is less than some t_xv, I thought that to improve speed of my code I don't need to run remove_if through the whole list, but just restrict it to all those elements in the list whose time < t_xv.
However, remove_if() does not seem to let the user control how far it iterates through the list.
My current code.
The list elements:
struct node_info {
const char *type = "x"; // a string literal must not bind to a non-const char* in C++11
int ID = -1;
double time = 0.0;
bool spk = true;
};
The predicate/condition for remove_if:
// Remove all events occurring at t_event
class coincident_events {
double t_event; // Event time
bool spk; // Spike condition
public:
coincident_events(double time,bool spk_) : t_event(time), spk(spk_){}
bool operator()(const node_info& node_event) const { // take the node by const reference to avoid a copy per element
return ((node_event.time==t_event)&&(node_event.spk==spk)&&(strcmp(node_event.type,"x")!=0));
}
};
The actual removing from the list:
void remove_from_list(double t_event, bool spk_){
// Remove all events occurring at t_event
coincident_events coincidence(t_event,spk_);
event_heap.remove_if(coincidence);
}
Pseudo main:
int main(){
// My list
std::list<node_info> event_heap;
...
// Populate list with elements with random time values, yet ordered in ascending order
...
remove_from_list(0.5, true);
return 0;
}
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for cycle as suggested for example in this post?
It seems that remove_if may not be ideal in this context. Should I consider instead instantiating an iterator and run an explicit for loop?
Yes and yes. Don't fight to use code that is preventing you from reaching your goals. Keep it simple. Loops are nothing to be ashamed of in C++.
First thing, comparing double exactly is not a good idea as you are subject to floating point errors.
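A common alternative is to compare with a tolerance. The helper below is a minimal sketch; the name nearly_equal and the default tolerance are assumptions, and a suitable tolerance depends on the time scale of your events.

```cpp
#include <cmath>

// Returns true if a and b differ by at most `tolerance` (absolute difference).
// The default of 1e-9 is an arbitrary choice for this sketch.
inline bool nearly_equal(double a, double b, double tolerance = 1e-9) {
    return std::fabs(a - b) <= tolerance;
}
```

In the predicate, node_event.time == t_event would then become nearly_equal(node_event.time, t_event).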
You could always find the point up to which you want to search using lower_bound (I assume your list is properly sorted).
Then you could use the free function std::remove_if followed by the list's erase member function to remove the items between the iterator returned by remove_if and the one returned by lower_bound.
However, doing that you would make multiple passes over the data and move elements around, so it would affect performance.
See also: https://en.cppreference.com/w/cpp/algorithm/remove
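For reference, the lower_bound plus remove_if combination could be sketched as follows. Event is a simplified stand-in for node_info, and remove_before is a hypothetical name; the sketch assumes the list is sorted by ascending time.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>

// Simplified stand-in for node_info from the question.
struct Event {
    double time;
    bool spk;
};

// Removes every event with time < t_limit that satisfies pred and returns
// how many were removed. Assumes `events` is sorted by ascending time.
template <class Pred>
std::size_t remove_before(std::list<Event>& events, double t_limit, Pred pred) {
    // First element whose time is not less than t_limit.
    auto bound = std::lower_bound(events.begin(), events.end(), t_limit,
        [](const Event& e, double t) { return e.time < t; });
    // Shift the elements to keep to the front of [begin, bound).
    auto new_end = std::remove_if(events.begin(), bound, pred);
    std::size_t removed =
        static_cast<std::size_t>(std::distance(new_end, bound));
    // Erase the leftover tail between new_end and bound.
    events.erase(new_end, bound);
    return removed;
}
```

Note that on a std::list both lower_bound and distance walk the nodes one by one, which is part of the extra cost of this approach.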
So in the end, it is probably preferable to write your own loop over the container and, for each element, check whether it needs to be removed. If not, check whether you should break out of the loop.
for (auto it = event_heap.begin(); it != event_heap.end(); )
{
if (coincidence(*it))
{
auto itErase = it;
++it;
event_heap.erase(itErase);
}
else if (it->time < t_xv)
{
++it;
}
else
{
break;
}
}
As you can see, code can easily become quite long for something that should be simple. Thus, if you need that kind of algorithm often, consider writing your own generic algorithm.
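One possible shape for such a generic algorithm is sketched below. The names erase_if_while and erase_small_evens are made up for illustration, and the sketch assumes a container whose erase member returns the iterator following the erased element, as std::list does.

```cpp
#include <cstddef>
#include <list>

// Walks c from the front while in_range(element) holds and erases every
// visited element that satisfies pred. Returns the number of erased elements.
template <class Container, class InRange, class Pred>
std::size_t erase_if_while(Container& c, InRange in_range, Pred pred) {
    std::size_t erased = 0;
    for (auto it = c.begin(); it != c.end() && in_range(*it); ) {
        if (pred(*it)) {
            it = c.erase(it);  // erase returns the iterator after the erased one
            ++erased;
        } else {
            ++it;
        }
    }
    return erased;
}

// Example use: erase even numbers, but only among the leading values
// that are smaller than `limit`.
inline std::size_t erase_small_evens(std::list<int>& values, int limit) {
    return erase_if_while(values,
        [limit](int v) { return v < limit; },
        [](int v) { return v % 2 == 0; });
}
```

For the event list, in_range would test the time against t_xv and pred would be the coincidence predicate.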
Also, in practice you might not need to do a complete search for the end using the first solution if you process your data in increasing time order.
Finally, you might consider using an std::set instead. It could lead to simpler and more optimized code.
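To give an idea of what an ordered associative container buys you, here is a sketch that uses std::multimap keyed by time instead of std::set, on the assumption that several events may share the same time. NodeData, EventQueue and remove_coincident are hypothetical names, and the lookup matches the key exactly, so the floating-point caveat above still applies.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Simplified stand-in for node_info; the time is used as the map key.
struct NodeData {
    std::string type;
    int ID;
    bool spk;
};

using EventQueue = std::multimap<double, NodeData>;

// Removes all events stored under exactly t_event that have the given spk
// flag and whose type is not "x". Returns the number of removed events.
std::size_t remove_coincident(EventQueue& events, double t_event, bool spk) {
    std::size_t removed = 0;
    auto range = events.equal_range(t_event);  // logarithmic lookup, no scan
    for (auto it = range.first; it != range.second; ) {
        const NodeData& node = it->second;
        if (node.spk == spk && node.type != "x") {
            it = events.erase(it);  // other iterators, including range.second, stay valid
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Only the events stored under the requested time are visited, so the cost no longer depends on how many earlier events the container holds.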
Thanks. I used your comments and came up with this solution, which seemingly increases speed by a factor of 5-to-10.
void remove_from_list(double t_event,bool spk_){
coincident_events coincidence(t_event,spk_);
for(auto it=event_heap.begin();it!=event_heap.end();){
if(t_event>=it->time){
if(coincidence(*it)) {
it = event_heap.erase(it);
}
else
++it;
}
else
break;
}
}
The idea to make erase return it (as already ++it) was suggested by this other post. Note that in this implementation I am actually erasing all list elements up to t_event value (meaning, I pass whatever I want for t_xv).
I am looking for a BOOST_FOREACH variant that tolerates removing the currently processed element from the container, for containers where removing an element does not invalidate iterators (apart from the one pointing to the removed element, that is, the one the foreach is holding).
Linked lists are a typical example of such containers, and since we use Boost intrusive lists a lot, for loops like the one below are becoming too frequent.
//NumberList is boost intrusive list of structures, containing number property
void removeEvenNumbers(NumberList& numbers)
{
NumberList::iterator next = numbers.begin();
for (NumberList::iterator i = numbers.begin(); i != numbers.end(); i = next)
{
++next;
if (i->number % 2 == 0)
i->unlink();
}
}
Edit: please note, that the example is solvable by remove_if but in the real scenarios, it is often not usable, or practical.
I'm looking for foreach variant that would allow me to write much more elegant source code.
void removeEvenNumbers(NumberList& numbers)
{
BOOST_FOREACH_RESISTANT(NumberList::value_type& item, numbers)
if (item.number % 2 == 0)
item.unlink();
}
What is the simplest way to create this kind of macro from the existing components used to create the original FOREACH in boost?
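Not a macro built from the BOOST_FOREACH internals, but for comparison, the same "advance first, then process" idea can be wrapped in a small function template. The sketch below uses std::list as a stand-in for the intrusive list, and for_each_resistant is a hypothetical name. It is only safe for containers where removing one element leaves iterators to all other elements valid, and the callback must not remove the element that follows the current one.

```cpp
#include <list>

// Calls fn(it) for every element of c. The loop steps to the next element
// before calling fn, so fn may remove the element that `it` refers to.
template <class Container, class Fn>
void for_each_resistant(Container& c, Fn fn) {
    for (auto it = c.begin(); it != c.end(); ) {
        auto current = it++;  // advance first, then hand out the old position
        fn(current);
    }
}

// Example with std::list standing in for the intrusive list of the question.
inline void remove_even_numbers(std::list<int>& numbers) {
    for_each_resistant(numbers, [&numbers](std::list<int>::iterator it) {
        if (*it % 2 == 0) numbers.erase(it);
    });
}
```

With an intrusive list, the callback would call unlink() on the element instead of erase on the container.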
I want to compare the current and next element of a set of addresses. I tried the following code:
struct Address{
string state;
string city;
};
if((*it).state == (*(it+1)).state){
}
But the compiler gave an error that no match for operator+ in "it+1". On cplusplus.com I found that + operator is not supported for set containers. So I am unable to figure out a way to access both the current and the next element of a set in the same if statement.
But ++ is provided, so you can write:
?::iterator next = it;
next++;
Just create a copy of the iterator, advance it (++), then compare. Or, if your standard library has it, you can use the C++11 std::next function from the <iterator> header.
if(it->state == std::next(it)->state)
As you already found out the operator + is not supported for std::set iterators, since those are only bidirectional iterators and not random access iterators. So if you want to access the next element at the same time as the current one you have to make a copy and increment that one:
std::set<Address>::iterator next_it = it;
++next_it;
if(it->state == (next_it)->state)
If you are using C++11, this code can be simplified using the std::next function found in <iterator> (which basically does the same thing):
if(it->state == std::next(it)->state)
Of course, writing that function is pretty trivial, so you could always write your own next when coding pre-C++11.
Also: Remember to make sure that the next iterator isn't equal to set.end()
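Put together, a loop that compares each element with its successor and stops before the successor would be end() might look like the sketch below. The ordering of Address (by state, then by city) and the function name count_adjacent_same_state are assumptions made for the example, not part of the question.

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <tuple>

struct Address {
    std::string state;
    std::string city;
};

// Ordering assumed for this sketch: by state, then by city.
inline bool operator<(const Address& a, const Address& b) {
    return std::tie(a.state, a.city) < std::tie(b.state, b.city);
}

// Counts neighboring pairs in the set that share the same state.
std::size_t count_adjacent_same_state(const std::set<Address>& addresses) {
    std::size_t count = 0;
    if (addresses.empty()) return count;  // begin() == end(), nothing to advance
    // Stop as soon as the successor would be end(), so std::next(it)
    // is always dereferenceable inside the loop body.
    for (auto it = addresses.begin(); std::next(it) != addresses.end(); ++it) {
        if (it->state == std::next(it)->state) ++count;
    }
    return count;
}
```

The empty-set check matters because incrementing end() is undefined behavior.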