STL remove first element that matches a predicate from a vector - c++

What is an efficient way to erase the first element in a vector that matches a predicate? I am storing unique values in a vector so I wouldn't want the algorithm to search the whole container.
Currently I am doing:
auto it = std::find_if(container.begin(), container.end(),
[&](const Type& elem){ return elem == value; });
if (it != container.end())
{
container.erase(it);
}
Thanks, in advance.

Only a minor improvement:
container.erase(
std::remove(container.begin(), container.end(), value),
container.end()
);
or, if you want to use a unary predicate my_predicate:
container.erase(
std::remove_if(container.begin(), container.end(), my_predicate),
container.end()
);
This has essentially the same performance characteristics (find+erase together will touch all elements as well), but elegantly avoids special cases (!= container.end()) because it is range-based. Note that it removes every match rather than only the first, which is equivalent here only because the stored values are unique.
If you don't care about keeping the vector stable (or sorted!) you can also swap the found element with back() and then pop_back(), which will slightly improve the average (but not asymptotic!) runtime.
Not to forget, this is also a commonly accepted C++ idiom, so it is more easily recognized.
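A minimal sketch of the two single-element variants discussed above, assuming a std::vector and a caller-supplied predicate. The helper names erase_first_if and erase_first_if_unordered are made up for illustration and are not standard library functions. The second one moves the last element into the hole, so it does not preserve order.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: erase the first element matching pred and keep the
// order of the remaining elements. Returns true if something was erased.
template <class T, class Pred>
bool erase_first_if(std::vector<T>& v, Pred pred) {
    auto it = std::find_if(v.begin(), v.end(), pred);  // stops at the first match
    if (it == v.end()) return false;
    v.erase(it);  // shifts every later element down by one
    return true;
}

// Hypothetical helper: same search, but the last element is moved into the
// hole instead of shifting the tail. Order is NOT preserved.
template <class T, class Pred>
bool erase_first_if_unordered(std::vector<T>& v, Pred pred) {
    auto it = std::find_if(v.begin(), v.end(), pred);
    if (it == v.end()) return false;
    assert(!v.empty());  // a match implies at least one element
    if (it != v.end() - 1) *it = std::move(v.back());  // skip self-move-assignment
    v.pop_back();
    return true;
}
```

Both helpers stop searching at the first match; they differ only in how much data is moved afterwards.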

Related

Is there an even faster approach than swap-and-pop for erasing from std::vector?

I am asking this as the other relevant questions on SO seem to be either for older versions of the C++ standard, do not mention any form of parallelization, or are focused on keeping the ordering/indexing the same as elements are removed.
I have a vector of potentially hundreds of thousands or millions of elements (which are fairly light structures, around ~20 bytes assuming they're compacted down).
Due to other restrictions, it must be a std::vector and other containers would not work (like std::forward_list), or be even less optimal in other uses.
I recently switched from a simple it = myVec.erase(it) approach to swap-and-pop, using something like this:
for(int i = 0; i < myVec.size();) {
// Do calculations to determine if element must be removed
// ...
// Remove if needed
if(elementMustBeRemoved) {
myVec[i] = myVec.back();
myVec.pop_back();
} else {
i++;
}
}
This works, and was a significant improvement. It cut the runtime of the method down to ~61% of what it was previously. But I would like to improve this further.
Does C++ have a method to remove many non-consecutive elements from a std::vector efficiently? Like passing a vector of indices to erase() and have C++ do some magic under the hood to minimize movement of data?
If so, I could have threads individually gather indices that must be removed in parallel, and then combine them and pass them to erase().
Take a look at std::remove_if algorithm. You could use it like this:
auto firstToErase = std::remove_if(myVec.begin(), myVec.end(),
[](const T& x){
// Do calculations to determine if element must be removed
// ...
return elementMustBeRemoved;});
myVec.erase(firstToErase, myVec.end());
cppreference says that following code is a possible implementation for remove_if:
template<class ForwardIt, class UnaryPredicate>
ForwardIt remove_if(ForwardIt first, ForwardIt last, UnaryPredicate p)
{
first = std::find_if(first, last, p);
if (first != last)
for(ForwardIt i = first; ++i != last; )
if (!p(*i))
*first++ = std::move(*i);
return first;
}
Instead of swapping with the last element, it makes a single pass through the container, moving each kept element forward so that the elements to be erased end up in a range at the very end of the vector. This looks like a more cache-friendly solution, and you might notice some performance improvement on a very big vector.
If you want to experiment with a parallel version, there is a version (4) that lets you specify an execution policy.
Or, since C++20, you can type slightly less and use std::erase_if.
However, in that case you lose the option to choose an execution policy.
Is there an even faster approach than swap-and-pop for erasing from std::vector?
Ever since C++11, the optimal removal of single element from vector without preserving order has been move-and-pop rather than swap-and-pop.
Does C++ have a method to remove many non-consecutive elements from a std::vector efficiently?
The remove-erase idiom (std::erase and std::erase_if in C++20) is the most efficient that the standard library provides. std::remove_if preserves order; if you don't care about that, a more efficient algorithm may be possible, but the standard library does not come with an unstable remove out of the box. The algorithm goes as follows:
Find first element to be removed (a)
Find last element to not be removed (b)
Move b to a.
Repeat between a and b until iterators meet.
There is a proposal P0048 to add such algorithm to the standard library, and there is a demo implementation in https://github.com/WG21-SG14/SG14/blob/6c5edd5c34e1adf42e69b25ddc57c17d99224bb4/SG14/algorithm_ext.h#L84
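The four steps above can be sketched roughly as follows. This is an illustrative implementation written for this answer, not the code from the proposal or the SG14 repository; the names unstable_remove_if and erase_unordered_if are assumptions, and bidirectional iterators are required.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of an unstable remove: returns the new logical end. Elements in
// [returned iterator, last) are left in a valid but unspecified state.
template <class BidirIt, class Pred>
BidirIt unstable_remove_if(BidirIt first, BidirIt last, Pred pred) {
    while (true) {
        // (a) find the first element to be removed
        first = std::find_if(first, last, pred);
        if (first == last) return first;
        // (b) find the last element to be kept, scanning backwards
        do {
            --last;
            if (first == last) return first;  // iterators met: done
        } while (pred(*last));
        // move b into a, then continue between the two positions
        *first = std::move(*last);
        ++first;
    }
}

// Hypothetical convenience wrapper for std::vector; returns the number of
// erased elements.
template <class T, class Pred>
std::size_t erase_unordered_if(std::vector<T>& v, Pred pred) {
    auto new_end = unstable_remove_if(v.begin(), v.end(), pred);
    const auto removed = static_cast<std::size_t>(v.end() - new_end);
    assert(removed <= v.size());
    v.erase(new_end, v.end());
    return removed;
}
```

Each removed element costs at most one move, whereas std::remove_if moves every kept element that follows the first removed one.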

How to iterate an unordered_set from the end to the begin

I want to iterate an unordered_set from the end to the begin:
unordered_set<Expression*> BlocExpressions;
for(auto it=BlocExpressions.end(); it != BlocExpressions.begin(); it--)
{
//do some work
}
But there is no operator-- declared.
So, should I code the operator--, or is there a way to do that?
For std::unordered_set, the iteration order carries no meaning; you can think of it as arbitrary. You get no particular order whether you iterate forward or backward. That is why it provides neither reverse iterators nor an operator-- for its iterators. Forward and backward iteration would have the same semantics here: visiting the elements in an unspecified order.
I can't understand why you use the words "end" and "begin" for unordered_set. unordered_set does not have a particular order. You can iterate over all elements by using an iterator.
If you need order in the set, you should use another container, for example std::set.
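As a small sketch of that suggestion: std::set keeps its elements ordered and provides rbegin()/rend(), so a reverse traversal is well defined. The function name descending and the use of int elements are assumptions made only for this example.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>
#include <vector>

// Copy the elements into an ordered std::set, then walk it backwards with
// reverse iterators, collecting the values from largest to smallest.
std::vector<int> descending(const std::unordered_set<int>& values) {
    std::set<int> ordered(values.begin(), values.end());
    std::vector<int> out;
    out.reserve(ordered.size());
    for (auto it = ordered.rbegin(); it != ordered.rend(); ++it) {
        out.push_back(*it);
    }
    assert(out.size() == values.size());  // both containers hold unique values
    return out;
}
```

If the data can live in a std::set from the start, the copy is unnecessary and only the reverse loop is needed.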
I also find it curious why there is no rbegin() and rend(). I am using unordered_set to add random numbers (unordered) to represent a certain path.
That would be great.
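The solution I found, which might help someone else, is to put the decrement in the first clause of the for loop. Note that the standard only guarantees forward iterators for unordered_set, so -- is not portable there; this compiles only where the implementation's iterators happen to be bidirectional, and the set must not be empty:
unordered_set<Expression*> BlocExpressions;
for(auto it = --BlocExpressions.end(); ; it--){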
//do some work
// This will include the last item (which is BlocExpressions.begin())
if(it == BlocExpressions.begin()){
break;
}
}

How to partially sort in a stable way

Is std::partial_sort stable and if not, is there a stable partial sort provided by the standard library or e.g. boost?
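std::partial_sort is not stable: equal elements are not guaranteed to keep their original relative order. partial_sort is efficient and easy to provide because it can be implemented as a quicksort where recursions that aren't necessary for the desired range are skipped (common implementations use heap selection instead, which is not stable either). There is no equivalent efficient partial stable sort algorithm; stable_sort is usually implemented as a merge sort, and merge sort's recursion works the wrong way.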
If you want a partial sort to be stable, you need to associate position information with each element. If you have a modifiable zip range you can do that by zipping together the elements and an iota vector, but modifiable zip ranges are actually impossible to build within the current iterator concepts, so it's easier to sort indirectly via iterators and rely on the iterators' ordering. In other words, you can do this:
using MyThingV = std::vector<MyThing>;
using MyThingIt = typename MyThingV::iterator;
MyThingV things;
// Set up a vector of iterators. We'll sort that.
std::vector<MyThingIt> sorted; sorted.reserve(things.size());
for (auto it = things.begin(); it != things.end(); ++it) sorted.push_back(it);
std::partial_sort(sorted.begin(), sorted.begin() + upto_index, sorted.end(),
[](MyThingIt lhs, MyThingIt rhs) {
// First see if the underlying elements differ.
if (*lhs < *rhs) return true;
if (*rhs < *lhs) return false;
// Underlying elements are the same, so compare iterators; these represent
// position in original vector.
return lhs < rhs;
});
Now your base vector is still unsorted, but the vector of iterators is sorted the way you want.
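For reference, here is a self-contained variant of the same idea that compiles on its own. The Item struct, its key/label fields, and the function name stable_partial_order are placeholders invented for this sketch; ties on key are broken by position in the original vector, which is what makes the result stable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Placeholder element type: sorted by key, label only identifies the item.
struct Item {
    int key;
    std::string label;
};

using ItemIt = std::vector<Item>::const_iterator;

// Returns iterators to the `upto` smallest items, in stable order.
// The underlying vector is left untouched.
std::vector<ItemIt> stable_partial_order(const std::vector<Item>& things,
                                         std::size_t upto) {
    assert(upto <= things.size());
    std::vector<ItemIt> order;
    order.reserve(things.size());
    for (auto it = things.begin(); it != things.end(); ++it) order.push_back(it);

    const auto middle = order.begin() + static_cast<std::ptrdiff_t>(upto);
    std::partial_sort(order.begin(), middle, order.end(),
        [](ItemIt lhs, ItemIt rhs) {
            if (lhs->key != rhs->key) return lhs->key < rhs->key;
            return lhs < rhs;  // equal keys: earlier position wins
        });
    order.erase(middle, order.end());  // keep only the sorted prefix
    return order;
}
```

Comparing the iterators with < is valid here because they all point into the same vector.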

Erasing item in a for(-each) auto loop

Is there a way to erase specific elements when using an auto variable in a for loop like this?
for(auto a: m_Connections)
{
if(something)
{
//Erase this element
}
}
I know I can either do say
for(auto it=m_map.begin() ...
or
for(map<int,int>::iterator it=m_map.begin() ...
and manually increment the iterator (and erase), but if I could do it with fewer lines of code I'd be happier.
Thanks!
You can't. A range-based loop makes a simple iteration over a range simpler, but doesn't support anything that invalidates either the range, or the iterator it uses. Of course, even if that were supported, you couldn't efficiently erase an element without access to the iterator.
You'll need an old-school loop, along the lines of
for (auto it = container.begin(); it != container.end();) {
if (something) {
it = container.erase(it);
} else {
++it;
}
}
or a combination of container.erase() and std::remove_if, if you like that sort of thing.
No, there isn't. Range based for loop is used to access each element of a container once.
Every time an element is erased from the container, some iterators are invalidated: for a vector, those at or after the erased element; for a map, the one pointing to the erased element. Either way, the iterator hidden inside the range-based for may be one of them, which is a problem.
You should use the normal for loop (or a while) if you need to modify the container as you go along.
If you want to erase elements for which a predicate returns true, a good way is:
m_Connections.erase(
std::remove_if(m_Connections.begin(),
m_Connections.end(),
[](Type elem) { return predicate(elem); }),
m_Connections.end());
std::remove_if doesn't mix iteration logic with the predicate.
You need the iterator if you want to erase an element from a container.
And you can't get the iterator from the element itself -- and even if you could, for instance with vector, the iterator that range-based for internally uses would be invalidated in the next step causing undefined behavior.
So the answer is: no, in its classic usage you can't. Range-based for was designed solely for convenient iteration over all elements in a range.
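Since the question also mentions a map: std::remove_if cannot be applied to std::map, because its elements have const keys and cannot be reassigned, so the iterator loop is the usual choice there. A minimal sketch, where the function name erase_negative and the "negative value" condition are stand-ins for whatever test the real code needs:

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Erase every entry whose mapped value is negative and return how many
// entries were erased.
std::size_t erase_negative(std::map<int, int>& m) {
    const std::size_t before = m.size();
    std::size_t erased = 0;
    for (auto it = m.begin(); it != m.end();) {
        if (it->second < 0) {
            it = m.erase(it);  // since C++11, erase returns the next iterator
            ++erased;
        } else {
            ++it;
        }
    }
    assert(m.size() + erased == before);
    return erased;
}
```

Only the iterator to the erased entry is invalidated in a map, which is why continuing from the iterator returned by erase() is safe.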
Push all the elements into an array, then do a pop operation to remove the item.

Iterating over std::vector with lambda does not want to remove with remove_if

I have a small problem with a lambda expression while using remove_if on a std::vector.
I have the following piece of code:
std::remove_if( openList.begin(), openList.end(),
[&](BoardNode& i){
std::cout<< i.getCoordinates() << std::endl;
std::cout<< currentNode.getCoordinates() << std::endl;
return i.getCoordinates() == currentNode.getCoordinates(); }
);
There is no compiler error with this, but the elements for which the lambda returns true are not removed from the vector.
I get printed on the screen e.g.
[5,5]
[5,5]
but the openList remains as it was.
std::remove_if doesn't erase anything from the vector, since it doesn't have access to it. Instead, it moves the elements you want to keep to the start of the range, leaving the remaining elements in a valid but unspecified state, and returns the new end.
You can use the "erase-remove" idiom to actually erase them from the vector:
openList.erase(
std::remove_if(
openList.begin(),
openList.end(),
[&](BoardNode& i){return i.getCoordinates() == currentNode.getCoordinates();}),
openList.end());
I think you intend to remove items from the vector, but what you do does not actually remove them, which makes you think that the lambda doesn't work. You need to use the erase() member function in conjunction with std::remove_if.
In other words, you have to use the erase-remove idiom:
v.erase(std::remove_if(v.begin(), v.end(), your-lambda-goes-here), v.end());
Removing is done by shifting the elements in the range in such a way
that elements to be erased are overwritten. The elements between the
old and the new ends of the range have unspecified values. An iterator
to the new end of the range is returned. Relative order of the
elements that remain is preserved.
http://en.cppreference.com/w/cpp/algorithm/remove
Also, check the example on that link.
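To make the two steps visible, here is a small sketch that separates them. The function name drop_even and the "is even" predicate are illustrative only; the point is that the vector's size changes only when erase() is called.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns a copy of v without its even elements.
std::vector<int> drop_even(std::vector<int> v) {
    const auto old_size = v.size();

    // Step 1: shift the kept elements to the front; new_end marks where
    // the leftover (unspecified) values begin.
    auto new_end = std::remove_if(v.begin(), v.end(),
                                  [](int x) { return x % 2 == 0; });
    assert(v.size() == old_size);  // remove_if alone never changes the size

    // Step 2: actually shrink the container.
    v.erase(new_end, v.end());
    return v;
}
```

Skipping step 2 is exactly the situation in the question: the predicate runs, but the container keeps its original size.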