fast deletion from vector with a reverse iterator - c++

I want to delete vector entries that meet some condition, after calling a function on them. I don't care about stable ordering, so I'd normally, in effect, move the final array element to replace the one I'm examining.
Question: what is the slickest idiom for doing this with iterators?
(Yes, erase-remove is the standard idiom if you want to preserve ordering, but in my case it's not needed and I think will be slower, thanks to those moves, than my version given here.)
With an int subscript I'd do it thusly, and this code works:
for ( int i = (int) apc.size() - 1; i >= 0; i-- )
if ( apc[i]->blah ) {
MyFunc( apc[i] );
apc[i] = apc.back();
apc.pop_back();
}
I try the same with a reverse iterator and it blows up in the ++ of the for loop after it's gone into the if block the first time. I can't figure out why. If were actually calling erase() on *it I know that would render it undefined, but I'm not doing that. I suppose that pop_back() would undefine rbegin(). I should check if it is going into the if block on first iteration and if it only crashes in that situation.
for ( auto it = apc.rbegin(); it != apc.rend(); it++ )
if ( (*it)->blah ) {
MyFunc( *it );
*it = apc.back();
apc.pop_back();
}
With forward iterator it seems to work, though I dislike the stutter effect of having the loop stop when it's finding elements with blah true. A reverse loop is a little ugly, but at least its a real loop, not this centaur-like half-loop-half-while concoction:
for ( auto it = apc.begin(); it != apc.end(); )
if ( (*it)->blah ) {
MyFunc( *it );
*it = apc.back();
apc.pop_back();
} else
it++;

pop_back should normally only invalidate back() and end(). But you can be caught by a corner case if the last element of the array has to be deleted. With indices, no problem, you try to move an element on itself which should be a no-op and proceed on previous index. But with iterators, the current value is back() so it should be invalidated.
Beware, it may also be a problem in your implementation so it could make sense to provide that information so that others could try to reproduce with this or other implementations.

I think the old and tested erase-remove idiom covers this nicely.
apc.erase(std::remove_if(apc.begin(), apc.end(), [](auto& v) {
if (v->blah) {
MyFunc(v);
return true;
}
return false;
}), apc.end());
This idiom moves all the elements to be removed to the end of the container by std::remove_if, and then we remove all of them in one go with erase.
Edit: As pointed out by Marshall, the algorithm will move the elements to be kept to the front, which makes sense considering that it promises to preserve the relative ordering of the kept elements.
If the lambda needs to act on this or any variable other then the passed in v it needs to be captured. In this case we don't need to worry about lifetime, so a default capture by reference is a good choice.
[&](auto& v) {
if (v->blah < x) { //captures x by reference
MyFunc(v, member_variable); //current object is captured by reference, and can access member variables
return true;
}
return false;
})
If MyFunc potentially could modify member_variable we additionally need to make the lambda mutable.
By default the lambda creates a function object with a operator() const but mutable removes the const.
[&](auto& v) mutable { ... }

Related

std::list::erase not working

I am trying to delete an element from a list of objects if one of the object's properties matches a condition. This is my function to do so, however, after performing this operation and then printing the contents, the erase() seems to have no effect. What am I doing wrong here?
void FileReader::DeleteProcess(int id, list<Process> listToDeleteFrom)
{
list<Process>::iterator process;
for(process = listToDeleteFrom.begin(); process != listToDeleteFrom.end(); process++)
{
if (process -> ID == id)
{
listToDeleteFrom.erase(process);
}
}
}
First, you need to pass the list by reference; your code is working on a copy, so changes it makes won't affect the caller's list:
void FileReader::DeleteProcess(int id, list<Process> & listToDeleteFrom)
^
Second, erasing a list element invalidates any iterator that refers to that element, so attempting to carry on iterating afterwards will cause undefined behaviour. If there will only be one element to remove, then return from the function straight after the call to erase; otherwise, the loop needs to be structured something like:
for (auto it = list.begin(); it != list.end(); /* don't increment here */) {
if (it->ID == id) {
it = list.erase(it);
} else {
++it;
}
}
Calling erase() when an iterator is iterating over the list invalidates the iterator. Add the elements to erase to a second list then remove them afterwards.
Also note that you are passing the list by value rather than using a reference or a pointer. Did you mean to use list<Process>& listToDeleteFrom or list<Process>* listToDeleteFrom?
The reason you see no changes reflected is that your list is not being passed by reference, so you are only removing elements from a copy of the list.
Change it to this:
void FileReader::DeleteProcess(int id, list<Process> &listToDeleteFrom) //note &
This will keep the same syntax in the function and modify the original.
However, the way you're deleting the elements is a bit sub-optimal. If you have C++11, the following will remove your invalidation problem, and is more idiomatic, using an existing algorithm designed for the job:
listToDeleteFrom.erase ( //erase matching elements returned from remove_if
std::remove_if(
std::begin(listToDeleteFrom),
std::end(listToDeleteFrom),
[](const Process &p) { //lambda that matches based on id
return p->ID == id;
}
),
std::end(listToDeleteFrom) //to the end of the list
);
Note the keeping of std::list<>::erase in there to actually erase the elements that match. This is known as the erase-remove idiom.

How to delete an element from a vector while looping over it?

I am looping through a vector with a loop such as for(int i = 0; i < vec.size(); i++). Within this loop, I check a condition on the element at that vector index, and if a certain condition is true, I want to delete that element.
How do I delete a vector element while looping over it without crashing?
The idiomatic way to remove all elements from an STL container which satisfy a given predicate is to use the remove-erase idiom. The idea is to move the predicate (that's the function which yields true or false for some element) into a given function, say pred and then:
static bool pred( const std::string &s ) {
// ...
}
std::vector<std::string> v;
v.erase( std::remove_if( v.begin(), v.end(), pred ), v.end() );
If you insist on using indices, you should not increment the index for every element, but only for those which didn't get removed:
std::vector<std::string>::size_type i = 0;
while ( i < v.size() ) {
if ( shouldBeRemoved( v[i] ) ) {
v.erase( v.begin() + i );
} else {
++i;
}
}
However, this is not only more code and less idiomatic (read: C++ programmers actually have to look at the code whereas the 'erase & remove' idiom immediately gives some idea what's going on), but also much less efficient because vectors store their elements in one contiguous block of memory, so erasing on positions other than the vector end also moves all the elements after the segment erased to their new positions.
If you cannot use remove/erase (e.g. because you don't want to use lambdas or write a predicate), use the standard idiom for sequence container element removal:
for (auto it = v.cbegin(); it != v.cend() /* not hoisted */; /* no increment */)
{
if (delete_condition)
{
it = v.erase(it);
}
else
{
++it;
}
}
If possible, though, prefer remove/erase:
#include <algorithm>
v.erase(std::remove_if(v.begin(), v.end(),
[](T const & x) -> bool { /* decide */ }),
v.end());
Use the Erase-Remove Idiom, using remove_if with a predicate to specify your condition.
if(vector_name.empty() == false) {
for(int i = vector_name.size() - 1; i >= 0; i--)
{
if(condition)
vector_name.erase(vector_name.at(i));
}
}
This works for me. And Don't need to think about indexes have already erased.
Iterate over the vector backwards. That way, you don't nuke the ability to get to the elements you haven't visited yet.
I realize you are asking specifically about removing from vector, but just wanted to point out that it is costly to remove items from a std::vector since all items after the removed item must be copied to new location. If you are going to remove items from the container you should use a std::list. The std::list::erase(item) method even returns the iterator pointing to the value after the one just erased, so it's easy to use in a for or while loop. Nice thing too with std::list is that iterators pointing to non-erased items remain valid throughout list existence. See for instance the docs at cplusplus.com.
That said, if you have no choice, a trick that can work is simply to create a new empty vector and add items to it from the first vector, then use std::swap(oldVec, newVec), which is very efficient (no copy, just changes internal pointers).

C++ iterators problem

I'm working with iterators on C++ and I'm having some trouble here. It says "Debug Assertion Failed" on expression (this->_Has_container()) on line interIterator++.
Distance list is a vector< vector< DistanceNode > >. What I'm I doing wrong?
vector< vector<DistanceNode> >::iterator externIterator = distanceList.begin();
while (externIterator != distanceList.end()) {
vector<DistanceNode>::iterator interIterator = externIterator->begin();
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
//remove element pointed by interIterator
externIterator->erase(interIterator);
} // if
interIterator++;
} // while
externIterator++;
} // while
vector's erase() returns a new iterator to the next element. All iterators to the erased element and to elements after it become invalidated. Your loop ignores this, however, and continues to use interIterator.
Your code should look something like this:
if (condition)
interIterator = externIterator->erase(interIterator);
else
++interIterator; // (generally better practice to use pre-increment)
You can't remove elements from a sequence container while iterating over it — at least not the way you are doing it — because calling erase invalidates the iterator. You should assign the return value from erase to the iterator and suppress the increment:
while (interIterator != externIterator->end()){
if (interIterator->getReference() == tmp){
interIterator = externIterator->erase(interIterator);
} else {
++interIterator;
}
}
Also, never use post-increment (i++) when pre-increment (++i) will do.
I'll take the liberty to rewrite the code:
class ByReference: public std::unary_function<bool, DistanceNode>
{
public:
explicit ByReference(const Reference& r): mReference(r) {}
bool operator()(const DistanceNode& node) const
{
return node.getReference() == r;
}
private:
Reference mReference;
};
typedef std::vector< std::vector< DistanceNode > >::iterator iterator_t;
for (iterator_t it = dl.begin(), end = dl.end(); it != end; ++it)
{
it->erase(
std::remove_if(it->begin(), it->end(), ByReference(tmp)),
it->end()
);
}
Why ?
The first loop (externIterator) iterates over a full range of elements without ever modifying the range itself, it's what a for is for, this way you won't forget to increment (admittedly a for_each would be better, but the syntax can be awkward)
The second loop is tricky: simply speaking you're actually cutting the branch you're sitting on when you call erase, which requires jumping around (using the value returned). In this case the operation you want to accomplish (purging the list according to a certain criteria) is exactly what the remove-erase idiom is tailored for.
Note that the code could be tidied up if we had true lambda support at our disposal. In C++0x we would write:
std::for_each(distanceList.begin(), distanceList.end(),
[const& tmp](std::vector<DistanceNode>& vec)
{
vec.erase(
std::remove_if(vec.begin(), vec.end(),
[const& tmp](const DistanceNode& dn) { return dn.getReference() == tmp; }
),
vec.end()
);
}
);
As you can see, we don't see any iterator incrementing / dereferencing taking place any longer, it's all wrapped in dedicated algorithms which ensure that everything is handled appropriately.
I'll grant you the syntax looks strange, but I guess it's because we are not used to it yet.

update map value

I have a map like this:
map<prmNode,vector<prmEdge>,prmNodeComparator> nodo2archi;
When I have to update the value (vector), I take the key and his value, I update the vector of values, I erase the old key and value then I insert the key and the new vector. The code is this:
bool prmPlanner::insert_edgemap(int from,int to) {
prmEdge e;
e.setFrom(from);
e.setTo(to);
map<prmNode,vector<prmEdge> >::iterator it;
for (it=nodo2archi.begin(); it!=nodo2archi.end(); it++){
vector<prmEdge> appo;
prmNode n;
n=(*it).first;
int indice=n.getIndex();
if (indice==f || indice==t){
appo.clear();
vector<prmEdge> incArchi;
incArchi=(*it).second;
appo=(incArchi);
appo.push_back(e);
nodo2archi.erase(it);
nodo2archi.insert(make_pair(n,appo) );
}
}
return true;
}
The problem is that for the first 40-50 iterations everything go weel and the map is updated well, while with more iterations it goes sometimes in segmentation fault, sometimes in an infinite idle. I don't know why. Somebody can help me please??
Thank you very much.
You are iterating through nodo2archi and at the sametime changing its size by doing nodo2archi.erase(it); and nodo2archi.insert(make_pair(n,appo) );. If you do that your iterator may become invalid and your it++ might crash.
Are you simply trying to append data to some of the mapped vectors? In this case you don't need to erase and insert anything:
for (MapType::iterator it = map.begin(); it != map.end(); ++it) {
if (some_condition) {
it->second.push_back(some_value);
}
}
The problem is that after erasing the iterator it you are trying to perform operations on it (increment) which is Undefined Behavior. Some of the answers state that modifying the container while you are iterating over it is UB, which is not true, but you must know when your iterators become invalidated.
For sequence containers, the erase operation will return a new valid iterator into the next element in the container, so this would be a correct and idiomatic way of erasing from such a container:
for ( SequenceContainer::iterator it = c.begin(); it != c.end(); )
// note: no iterator increment here
// note: no caching of the end iterator
{
if ( condition(*it) ) {
it = c.erase(it);
} else {
++it;
}
}
But sadly enough, in the current standard, associative containers erase does not return an iterator (this is fixed in the new standard draft), so you must manually fake it
for ( AssociativeContainer::iterator it = c.begin(); it != c.end(); )
// again, no increment in the loop and no caching of the end iterator
{
if ( condition(*it) ) {
AssociativeContainer::iterator del = it++; // increment while still valid
c.erase(del); // erase previous position
} else {
++it;
}
}
And even more sadly, the second approach, correct for associative containers, is not valid for some sequence containers (std::vector in particular), so there is no single solution for the problem and you must know what you are iterating over. At least until the next standard is published and compilers catch up.
Yo do modify collection while iterating over it.
You are erasing nodes while iterating through your map. This is asking for trouble :)
You must not modify a collection itself while iterating over it. C++ will allow it, but it still results in undefined behavior. Other languages like Java have fail-fast iterators that immediately break when the collection has been modified.

How to filter items from a std::map? [duplicate]

This question already has answers here:
What happens if you call erase() on a map element while iterating from begin to end?
(3 answers)
Closed 6 years ago.
I have roughly the following code. Could this be made nicer or more efficient? Perhaps using std::remove_if? Can you remove items from the map while traversing it? Can we avoid using the temporary map?
typedef std::map<Action, What> Actions;
static Actions _actions;
bool expired(const Actions::value_type &action)
{
return <something>;
}
void bar(const Actions::value_type &action)
{
// do some stuff
}
void foo()
{
// loop the actions finding expired items
Actions actions;
BOOST_FOREACH(Actions::value_type &action, _actions)
{
if (expired(action))
bar(action);
else
actions[action.first]=action.second;
}
}
actions.swap(_actions);
}
A variation of Mark Ransom algorithm but without the need for a temporary.
for(Actions::iterator it = _actions.begin();it != _actions.end();)
{
if (expired(*it))
{
bar(*it);
_actions.erase(it++); // Note the post increment here.
// This increments 'it' and returns a copy of
// the original 'it' to be used by erase()
}
else
{
++it; // Use Pre-Increment here as it is more effecient
// Because no copy of it is required.
}
}
You could use erase(), but I don't know how BOOST_FOREACH will handle the invalidated iterator. The documentation for map::erase states that only the erased iterator will be invalidated, the others should be OK. Here's how I would restructure the inner loop:
Actions::iterator it = _actions.begin();
while (it != _actions.end())
{
if (expired(*it))
{
bar(*it);
Actions::iterator toerase = it;
++it;
_actions.erase(toerase);
}
else
++it;
}
Something that no one ever seems to know is that erase returns a new, guaranteed-to-be-valid iterator, when used on any container.
Actions::iterator it = _actions.begin();
while (it != _actions.end())
{
if (expired(*it))
{
bar(*it);
it = _actions::erase(it);
}
else
++it;
}
Storing actions.end() is probably not a good plan in this case since iterator stability is not guaranteed, I believe.
If the idea is to remove expired items, why not use map::erase? This way you only have to remove elements you don't need anymore, not rebuild an entire copy with all the elements that you want to keep.
The way you would do this is to save off the iterators pointing to the elements you want to erase, then erase them all after the iteration is over.
Or, you can save off the element that you visited, move to the next element, and then erase the temporary. The loop bounds get messed up in your case though, so you have to fine tune the iteration yourself.
Depending on how expired() is implemented, there may be other better ways. For example if you are keeping track of a timestamp as the key to the map (as expired() implies?), you can do upper_bound on the current timestamp, and all elements in the range [ begin(), upper_bound() ) need to be processed and erased.