Keeping track of removed elements using std::remove_if

Keeping track of removed elements using std::remove_if - c++

I want to remove some elements from a vector and am using remove_if algorithm to do this. But I want to keep track of the removed elements so that I can perform some operation on them later. I tried this with the following code:
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
struct IsEven
{
bool operator()(int n)
{
if(n % 2 == 0)
{
evens.push_back(n);
return true;
}
return false;
}
vector<int> evens;
};
int main(int argc, char **argv)
{
vector<int> v;
for(int i = 0; i < 10; ++i)
{
v.push_back(i);
}
IsEven f;
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(), f);
for(vector<int>::iterator it = f.evens.begin(); it != f.evens.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());
return 0;
}
But this doesn't work as remove_if accepts the copy of my functor object, so the the stored evens vector is not accessible. What is the correct way of achieving this?
P.S. : The example, with even and odds is just for example sake, my real code is somethinf different. So don't suggest a way to identify even or odds differently.

The solution is not remove_if, but it's cousin partial_sort partition. The difference is that remove_if only guarantees that [begin, middle) contains the matching elements, but partition also guarantees that [middle, end) contains the elements which didn't match the predicate.
So, your example becomes just (note that evens is no longer needed):
vector<int>::iterator newEnd = partition(v.begin(), v.end(), f);
for(vector<int>::iterator it = newEnd; it != v.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());

Your best bet is std::partition() which will rearrange all elts in the sequence such as all elts for which your predicate return true will precede those for which it returns false.
Exemple:
vector<int>::iterator bound = partition (v.begin(), v.end(), IsEven);
std::cout << "Even numbers:" << std::endl;
for (vector<int>::iterator it = v.begin(); it != bound; ++it)
std::cout << *it << " ";
std::cout << "Odd numbers:" << std::endl;
for (vector<int>::iterator it = bound; it != v.end(); ++it)
std::cout << *it << " ";

You can avoid copying your functor (i.e. pass by value) if you pass ist by reference like this:
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(),
boost::bind<int>(boost::ref(f), _1));
If you can't use boost, the same is possible with std::ref. I tested the code above and it works as expected.

An additional level of indirection. Declare the vector locally, and
have IsEven contain a copy to it. It's also possible for IsEven to
own the vector, provided that it is dynamically allocated and managed by
a shared_ptr. In practice, I've generally found the local variable
plus pointer solution more convenient. Something like:
class IsEven
{
std::vector<int>* myEliminated;
public:
IsEven( std::vector<int>* eliminated = NULL )
: myEliminated( eliminated )
{
}
bool
operator()( int n ) const
{
bool results = n % 2 == 0;
if ( results && myEliminated != NULL ) {
myEliminated->push_back( n );
}
return results;
}
}
Note that this also allows the operator()() function to be const. I
think this is formally required (although I'm not sure).

The problem that I see with the code is that the evens vector that you create inside the struct gets created everytime the remove_if algorithm calls it. So no matter if you pass in a functor to remove_if it will create a new vector each time. So once the last element is removed and when the function call ends and comes out of the function the f.evens will always fetch an empty vector. This could be sorted in two ways,
Replace the struct with a class and declare evens as static (if that is what you wanted)
Or you could make evens global. I wouldn't personally recommend that( it makes the code bad, say no to globals unless you really need them).
Edit:
As suggested by nabulke you could also std::ref something likke this, std::ref(f). This prevents you from making the vector global and avoids for unnecessary statics.
A sample of making it global is as follows,
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
vector<int> evens;
struct IsEven
{
bool operator()(int n)
{
if(n % 2 == 0)
{
evens.push_back(n);
return true;
}
return false;
}
};
int main(int argc, char **argv)
{
vector<int> v;
for(int i = 0; i < 10; ++i)
{
v.push_back(i);
}
IsEven f;
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(), f);
for(vector<int>::iterator it = evens.begin(); it != evens.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());
return 0;
}
This code seems to work just fine for me. Let me know if this is not what you wanted.

You may have another solution; only if you don't need to remove elts in the same time (do you?). With std::for_each() which returns a copy of your functor. Exemple:
IsEven result = std::for_each(v.begin(), v.end(), IsEven());
// Display the even numbers.
std::copy(result.evens.begin(), result.evens.end(), std::ostream_iterator<int> (cout, "\n"));
Take note that it is always better to create unnamed variables in c++ when possible. Here that solution does not exactly answer your primary issue (removing elts from the source container), but it reminds everyone that std::for_each() returns a copy of your functor. :-)

Related

How to remove an item from a list of tuples in c++?

I'm iterating over my list of tuples : list<tuple<int,int>> edges, and want to remove some elements in it. This is necessary for me to reduce the total overhead as I am working with huge data.
std::list<tuple<int, int>>::iterator it;
for (it = edges.begin(); it != edges.end(); ++it)
{
if (get<0>(*it) == 0 || get<1>(*it) == 0){
edges.remove(*it);
}
}
As I know, remove(element) works, but here edges.remove(*it) does not. How can I do this correctly?

In C++20, you can simply use a specialization of std::erase_if for std::list to do this.
#include <list>
#include <tuple>
int main() {
std::list<std::tuple<int, int>> l;
std::erase_if(l, [](const auto& elem) {
auto& [first, second] = elem;
return first == 0 || second == 0; });
}
Demo
However, since std::list itself has a remove_if member function, it is more appropriate to use it directly, since it applies to any C++ standard.

You can use erase() to specify an element to remove by an iterator.
It returns an iterator for the next element, so don't forget to catch that.
std::list<tuple<int, int>>::iterator it;
for (it = edges.begin(); it != edges.end(); ) // don't increment it here
{
if (get<0>(*it) == 0 || get<1>(*it) == 0){
it = edges.erase(it);
} else {
++it;
}
}

In my opionion remove_if which is a dedicated and optimized function for a std::list should be used. This will avoid unnecessary indirections.
Please read here about it.
The result will be an efficient one liner.
Please see one of many potential solutions:
#include <iostream>
#include <list>
#include <tuple>
using MyTuple = std::tuple<int,int>;
using MyList = std::list<MyTuple>;
int main() {
// Define some demo data
MyList myList{{0,1},{2,3},{4,5},{6,0},{7,8},{9,10},{0,0}};
// Predicate function. Define whatever you want
auto unwanted = [](const MyTuple& t) {return std::get<0>(t)==0 or std::get<1>(t)==0;};
// Remove all unwanted stuff
myList.remove_if(unwanted);
// Some debug output
for (const auto&[l,r] : myList)
std::cout << l << ' ' << r << '\n';
}

How to repair SigSegV [duplicate]

I want to clear a element from a vector using the erase method. But the problem here is that the element is not guaranteed to occur only once in the vector. It may be present multiple times and I need to clear all of them. My code is something like this:
void erase(std::vector<int>& myNumbers_in, int number_in)
{
std::vector<int>::iterator iter = myNumbers_in.begin();
std::vector<int>::iterator endIter = myNumbers_in.end();
for(; iter != endIter; ++iter)
{
if(*iter == number_in)
{
myNumbers_in.erase(iter);
}
}
}
int main(int argc, char* argv[])
{
std::vector<int> myNmbers;
for(int i = 0; i < 2; ++i)
{
myNmbers.push_back(i);
myNmbers.push_back(i);
}
erase(myNmbers, 1);
return 0;
}
This code obviously crashes because I am changing the end of the vector while iterating through it. What is the best way to achieve this? I.e. is there any way to do this without iterating through the vector multiple times or creating one more copy of the vector?

Use the remove/erase idiom:
std::vector<int>& vec = myNumbers; // use shorter name
vec.erase(std::remove(vec.begin(), vec.end(), number_in), vec.end());
What happens is that remove compacts the elements that differ from the value to be removed (number_in) in the beginning of the vector and returns the iterator to the first element after that range. Then erase removes these elements (whose value is unspecified).
Edit: While updating a dead link I discovered that starting in C++20 there are freestanding std::erase and std::erase_if functions that work on containers and simplify things considerably.

Calling erase will invalidate iterators, you could use:
void erase(std::vector<int>& myNumbers_in, int number_in)
{
std::vector<int>::iterator iter = myNumbers_in.begin();
while (iter != myNumbers_in.end())
{
if (*iter == number_in)
{
iter = myNumbers_in.erase(iter);
}
else
{
++iter;
}
}
}
Or you could use std::remove_if together with a functor and std::vector::erase:
struct Eraser
{
Eraser(int number_in) : number_in(number_in) {}
int number_in;
bool operator()(int i) const
{
return i == number_in;
}
};
std::vector<int> myNumbers;
myNumbers.erase(std::remove_if(myNumbers.begin(), myNumbers.end(), Eraser(number_in)), myNumbers.end());
Instead of writing your own functor in this case you could use std::remove:
std::vector<int> myNumbers;
myNumbers.erase(std::remove(myNumbers.begin(), myNumbers.end(), number_in), myNumbers.end());
In C++11 you could use a lambda instead of a functor:
std::vector<int> myNumbers;
myNumbers.erase(std::remove_if(myNumbers.begin(), myNumbers.end(), [number_in](int number){ return number == number_in; }), myNumbers.end());
In C++17 std::experimental::erase and std::experimental::erase_if are also available, in C++20 these are (finally) renamed to std::erase and std::erase_if (note: in Visual Studio 2019 you'll need to change your C++ language version to the latest experimental version for support):
std::vector<int> myNumbers;
std::erase_if(myNumbers, Eraser(number_in)); // or use lambda
or:
std::vector<int> myNumbers;
std::erase(myNumbers, number_in);

You can iterate using the index access,
To avoid O(n^2) complexity
you can use two indices, i - current testing index, j - index to
store next item and at the end of the cycle new size of the vector.
code:
void erase(std::vector<int>& v, int num)
{
size_t j = 0;
for (size_t i = 0; i < v.size(); ++i) {
if (v[i] != num) v[j++] = v[i];
}
// trim vector to new size
v.resize(j);
}
In such case you have no invalidating of iterators, complexity is O(n), and code is very concise and you don't need to write some helper classes, although in some case using helper classes can benefit in more flexible code.
This code does not use erase method, but solves your task.
Using pure stl you can do this in the following way (this is similar to the Motti's answer):
#include <algorithm>
void erase(std::vector<int>& v, int num) {
vector<int>::iterator it = remove(v.begin(), v.end(), num);
v.erase(it, v.end());
}

Depending on why you are doing this, using a std::set might be a better idea than std::vector.
It allows each element to occur only once. If you add it multiple times, there will only be one instance to erase anyway. This will make the erase operation trivial.
The erase operation will also have lower time complexity than on the vector, however, adding elements is slower on the set so it might not be much of an advantage.
This of course won't work if you are interested in how many times an element has been added to your vector or the order the elements were added.

There are std::erase and std::erase_if since C++20 which combines the remove-erase idiom.
std::vector<int> nums;
...
std::erase(nums, targetNumber);
or
std::vector<int> nums;
...
std::erase_if(nums, [](int x) { return x % 2 == 0; });

If you change your code as follows, you can do stable deletion.
void atest(vector<int>& container,int number_in){
for (auto it = container.begin(); it != container.end();) {
if (*it == number_in) {
it = container.erase(it);
} else {
++it;
}
}
}
However, a method such as the following can also be used.
void btest(vector<int>& container,int number_in){
container.erase(std::remove(container.begin(), container.end(), number_in),container.end());
}
If we must preserve our sequence’s order (say, if we’re keeping it sorted by some interesting property), then we can use one of the above. But if the sequence is just a bag of values whose order we don’t care about at all, then we might consider moving single elements from the end of the sequence to fill each new gap as it’s created:
void ctest(vector<int>& container,int number_in){
for (auto it = container.begin(); it != container.end(); ) {
if (*it == number_in) {
*it = std::move(container.back());
container.pop_back();
} else {
++it;
}
}
}
Below are their benchmark results:
CLang 15.0:
Gcc 12.2:

Copy subset of a std::map

I am trying to copy a subset of a std::map structure to a new map structure in the most efficient way. I can only think to a plain vanilla solution like this:
// Example program
#include <iostream>
#include <string>
#include <map>
int main()
{
// build map
std::map<int, int> mymap;
size_t num_el = 10;
for(size_t i = 0; i < num_el; ++i)
{
mymap.insert(std::pair<int,int>(i,i));
}
// copy submap
int start_index = 5;
std::map<int,int> output_map;
std::map<int,int>::const_iterator it;
for(it = mymap.find(start_index); it != mymap.end(); ++it)
{
output_map.insert(*it);
}
//print result
std::map<int,int>::const_iterator pit;
for(pit = output_map.begin(); pit != output_map.end(); ++pit)
{
std::cout << pit->second << " , ";
}
std::cout << std::endl;
}
Is there a better way to do this?

The insert method allows you to specify a range like so:
auto range_start = mymap.find(5);
auto range_end = mymap.end();
output_map.insert(range_start, range_end);

If you want additional flexibility on choice of the copied elements, copy_if will provide it:
std::copy_if(
begin(mymap),
end(mymap),
std::inserter(output_map, begin(output_map)),
[&start_index](std::pair<int,int> p) { return p.first >= start_index; }
);
It's a bit tricky to use it for std::map (and also e.g. std::set), but std::inserter works nicely.
Of course the simpler std::copy could be used similarly to the range constructor.

You can use map range constructor (c++11):
std::map<int,int> output_map{mymap.find(start_index), mymap.end()};
and if you need older standard:
std::map<int,int> output_map( mymap.find(start_index), mymap.end() );
And insert works too

Converting const auto & to iterator

A number of posts I've read lately claim for(const auto &it : vec) is the same as using the longer iterator syntax for(std::vector<Type*>::const_iterator it = vec.begin(); it != vec.end(); it++). But, I came upon this post that says they're not the same.
Currently, I'm trying to erase an element in a for loop, after it is used, and wondering if there is any way to convert const auto &it : nodes to std::vector<txml::XMLElement*>::iterator?
Code in question:
std::vector<txml2::XMLElement *> nodes;
//...
for (const auto &it : nodes)
{
//...
nodes.erase(it);
}
I pretty sure I could just rewrite std::vector<txml2::XMLElement*> as a const pointer, but would prefer not to since this code is just for debugging in the moment.

You should not be attempting to convert the range declaration in your range based for loop to an iterator and then deleting it whilst iterating. Even adjusting iterators while iterating is dangerous, and you should instead rely on algorithms.
You should use the Erase-remove idom.
You can use it with remove_if.
It would look something like:
nodes.erase( std::remove_if(nodes.begin(), nodes.end(), [](auto it){
//decide if the element should be deleted
return true || false;
}), nodes.end() );
Currently in the technical specifications, is erase_if.
This is a cleaner version of the same behaviour shown above:
std::erase_if(nodes,[](auto it){
//decide if the element should be deleted
return true || false;
});

You don't get an iterator but a reference to the element. Unless you want to do a std::find with it, it's pretty hard to get an iterator out of it.
Vectors are nice, so you could increase a counter per element and do nodes.begin() + counter to get the iterator, but it'd sort of defeat the point.
Also erasing the iterator in the for loop will result in you iterating after the end of the vector, you can test this code:
#include <iostream>
#include <vector>
using namespace std;
int main() {
vector<int> v = {0,1,2,3,4,5,6};
for (int x : v) {
cout << x << endl;
if (x == 2) {
v.erase(v.begin() + 2);
}
}
return 0;
}
If you want to use iterators, just do a loop with them, if in addition you want to erase one mid-loop you have to follow this answer:
for (auto it = res.begin() ; it != res.end(); ) {
const auto &value = *it;
if (condition) {
it = res.erase(it);
} else {
++it;
}
}
Note that you don't need to specify the whole type of the iterator, auto works just as well.

How do iterators map/know their current position or element

Consider the following code example :
#include <vector>
#include <numeric>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <functional>
int main()
{
std::vector<int> v(10, 2);
std::partial_sum(v.cbegin(), v.cend(), v.begin());
std::cout << "Among the numbers: ";
std::copy(v.cbegin(), v.cend(), std::ostream_iterator<int>(std::cout, " "));
std::cout << '\n';
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
if (std::none_of(v.cbegin(), v.cend(), std::bind(std::modulus<int>(),
std::placeholders::_1, 2))) {
std::cout << "None of them are odd\n";
}
struct DivisibleBy
{
const int d;
DivisibleBy(int n) : d(n) {}
bool operator()(int n) const { return n % d == 0; }
};
if (std::any_of(v.cbegin(), v.cend(), DivisibleBy(7))) {
std::cout << "At least one number is divisible by 7\n";
}
}
If we look at this part of the code :
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
which is fairly easy to understand. It iterates over those vector elements , and finds out i%2==0 , whether they are completely divisible by 2 or not , hence finds out they're even or not.
Its for loop counterpart could be something like this :
for(int i = 0; i<v.size();++i){
if(v[i] % 2 == 0) areEven = true; //just for readablity
else areEven = false;
}
In this for loop example , it is quiet clear that the current element we're processing is i since we're actually accessing v[i]. But how come in iterator version of same code , it maps i or knows what its current element is that we're accessing?
How does [](int i){ return i % 2 == 0; }) ensures/knows that i is the current element which iterator is pointing to.
I'm not able to makeout that without use of any v.currently_i_am_at_this_posiition() , how is iterating done. I know what iterators are but I'm having a hard time grasping them. Thanks :)

Iterators are modeled after pointers, and that's it really. How they work internally is of no interest, but a possible implementation is to actually have a pointer inside which points to the current element.

Iterating is done by using an iterator object
An iterator is any object that, pointing to some element in a range of
elements (such as an array or a container), has the ability to iterate
through the elements of that range using a set of operators (with at
least the increment (++) and dereference (*) operators).
The most obvious form of iterator is a pointer: A pointer can point to
elements in an array, and can iterate through them using the increment
operator (++).
and advancing it through the set of elements. The std::all_of function in your code is roughly equivalent to the following code
template< class InputIt, class UnaryPredicate >
bool c_all_of(InputIt first, InputIt last, UnaryPredicate p)
{
for (; first != last; ++first) {
if (!p(*first)) {
return false; // Found an odd element!
}
}
return true; // All elements are even
}
An iterator, when incremented, keeps track of the currently pointed element, and when dereferenced it returns the value of the currently pointed element.
For teaching's and clarity's sake, you might also think of the operation as follows (don't try this at home)
bool c_all_of(int* firstElement, size_t numberOfElements, std::function<bool(int)> evenTest)
{
for (size_t i = 0; i < numberOfElements; ++i)
if (!evenTest(*(firstElement + i)))
return false;
return true;
}
Notice that iterators are a powerful abstraction since they allow consistent elements access in different containers (e.g. std::map).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Keeping track of removed elements using std::remove_if - c++

Related

How to remove an item from a list of tuples in c++?

How to repair SigSegV [duplicate]

Copy subset of a std::map

Converting const auto & to iterator

How do iterators map/know their current position or element

Categories

Resources