remove_if based on vector index with a functor - c++

This question showed how to use erase/remove_if based on vector indices using a function predicate. This works well the first time the function is called but because the local static variable maintains state, on the next call to a different vector I'd be out of luck. So I thought I could use a functor with a private variable that would be re-usable. It mostly works except for the first element. There's something specific to the way remove_if uses the functor that messes up initialization of the private variable
#include <vector>
#include <algorithm>
#include <iostream>
#include <iterator>
using namespace std;
class is_IndexEven_Functor {
public:
is_IndexEven_Functor() : k(0) {}
bool operator()(const int &i) {
cout << "DEBUG: isIndexEvenFunctor: k " << k << "\ti " << i << endl; ////
if(k++ % 2 == 0) {
return true;
} else {
return false;
}
}
private:
int k;
};
int main() {
is_IndexEven_Functor a;
a(0);
a(1);
a(2);
a(3);
vector<int> v;
v.push_back(0);
v.push_back(1);
v.push_back(2);
v.push_back(3);
cout << "\nBefore\n";
copy(v.begin(), v.end(), ostream_iterator<int>(cout, " ")); cout << endl;
is_IndexEven_Functor b;
v.erase( remove_if(v.begin(), v.end(), b), v.end() );
cout << "\nAfter\n";
copy(v.begin(), v.end(), ostream_iterator<int>(cout, " ")); cout << endl;
return 0;
}
Here's the output:
DEBUG: isIndexEvenFunctor: k 0 i 0
DEBUG: isIndexEvenFunctor: k 1 i 1
DEBUG: isIndexEvenFunctor: k 2 i 2
DEBUG: isIndexEvenFunctor: k 3 i 3
Before
0 1 2 3
DEBUG: isIndexEvenFunctor: k 0 i 0
DEBUG: isIndexEvenFunctor: k 0 i 1 // why is k == 0 here ???
DEBUG: isIndexEvenFunctor: k 1 i 2
DEBUG: isIndexEvenFunctor: k 2 i 3
After
2
The crux of the question is why is the value of k equal to 0 on the second call to the functor (and how do I fix it)?
I'm guessing this has something to do with remove_if using it as temporary object or something but I don't really understand what that means.
EDIT: it would be great if I could avoid c++11

Yes, the implementation is allowed to copy the function, so you can get confusing behavior if the functional has mutable state. There are a couple of ways to work around this. The easiest is probably to use a std::reference_wrapper, which can be created using the std::ref function:
is_IndexEven_Functor b;
v.erase( remove_if(v.begin(), v.end(), std::ref(b)), v.end() );
Now, instead of making a copy of the function object, the implementation copies a wrapper, so there is only one instance of the original function object.
Another option is to keep the index separate from the function:
class is_IndexEven_Functor {
public:
is_IndexEven_Functor(int &index) : k(index) {}
.
.
.
private:
int &k;
};
and use it like this:
int index = 0;
is_IndexEven_Functor b(index);
v.erase( remove_if(v.begin(), v.end(), b), v.end() );

Generally, functors with state are not so great for STL algorithms, as they do not guarantee anything about the handling of functor. For all it cares, it could create a new copy of the functor (from the original!) for every iteration to perform checks. STL assumes functors are stateless.
To overcome this, you should use references to state inside your functors (in your case, have your k a reference to int, which is initialized before) or pass functors themselves in reference wrappers. To answer particular question (that is, what's happening in the remove_if), let's take a look into the implementation (one of implementations, of course):
__remove_if(_ForwardIterator __first, _ForwardIterator __last,
_Predicate __pred)
{
__first = std::__find_if(__first, __last, __pred);
if (__first == __last)
return __first;
_ForwardIterator __result = __first;
++__first;
for (; __first != __last; ++__first)
if (!__pred(__first))
{
*__result = _GLIBCXX_MOVE(*__first);
++__result;
}
return __result;
}
As you see, it is using a COPY of a predicate for find_if, and than continues with original predicate from found position - but original knows nothing about new position, reflected in the copy.

Related

Can we use heterogeneous lookup comparator to perform a "partial-match" search on STL associative containers?

So I was looking at support for heterogeneous lookup in STL's associative containers (since C++14) and got a bit confused about what we can do and what we shouldn't.
The following snippet
#include <algorithm>
#include <iostream>
#include <set>
struct partial_compare : std::less<>
{
//"full" key_type comparison done by std::less
using less<>::operator();
//"sequence-partitioning" comparison: only check pair's first member
bool operator ()(std::pair<int, int> const &lhs, int rhs) const
{
return lhs.first < rhs;
}
bool operator ()(int lhs, std::pair<int, int> const &rhs) const
{
return lhs < rhs.first;
}
};
int main()
{
//Using std::set's lookup
{
std::cout << "std::set's equal_range:\n";
std::set <std::pair<int, int>, partial_compare> s{{1,0},{1,1},{1,2},{1,3},{2,0}};
auto r = s.equal_range (1);
for (auto it = r.first; it != r.second; ++it)
{
std::cout << it->first << ", " << it->second << '\n';
}
std::cout << "std::set's lower_bound + iteration on equivalent keys:\n";
auto lb = s.lower_bound(1);
while (lb != std::end(s) && !s.key_comp()(lb->first, 1) && !s.key_comp()(1, lb->first))
{
std::cout << lb->first << ", " << lb->second << '\n';
++lb;
}
}
//Using algorithms on std::set
{
std::cout << "std::equal_range\n";
std::set <std::pair<int, int>> s{{1,0},{1,1},{1,2},{1,3},{2,0}};
auto r = std::equal_range (std::begin(s), std::end(s), 1, partial_compare{});
for (auto it = r.first; it != r.second; ++it)
{
std::cout << it->first << ", " << it->second << '\n';
}
}
return 0;
}
Produces the following output when compiled with clang-5/libc++:
std::set's equal_range:
1, 1
std::set's lower_bound + iteration on equivalent keys:
1, 0
1, 1
1, 2
1, 3
std::equal_range
1, 0
1, 1
1, 2
1, 3
And the following when compiled with gcc-7.1.0:
std::set's equal_range:
1, 0
1, 1
1, 2
1, 3
std::set's lower_bound + iteration on equivalent keys:
1, 0
1, 1
1, 2
1, 3
std::equal_range
1, 0
1, 1
1, 2
1, 3
By reading the initial N3465 proposal, I think that what I'm doing here should be fine and conceptually identical to what's in the proposal's initial example: a "partial-match" during lookup, relying on "the notion of sequence partitioning".
Now, if I'm understanding correctly, what actually ended up in the standard is N3657, but that doesn't seem to change the concept as it "just" focuses on ensuring heterogeneous lookup member templates are only made available when the provided comparator "is_transparent".
So, I can't really understand why using std::set's equal_range member template in clang/libc++ doesn't produce the same results of gcc's or the equivalent "lower_bound + scan".
Am I missing something and using heterogeneous lookup in this way is actually violating the standard (and then clang is right and the difference between equal_range and lower_bound + scan might be due to UB), or is clang/libc++ wrong?
EDIT: After reading the now accepted answer, I was able to find the relevant bug report for libc++.
There's also a specific question about the difference in behavior of equal_range template members between libc++ and libstdc++ here on SO that I didn't find while doing my search.
Not sure if this should be deleted or closed, as the one I've linked doesn't have an accepted answer.
This is a bug in libc++: its equal_range implementation (at r315179) immediately returns both iterators upon finding an "equal" element even with heterogeneous comparison.

How do iterators map/know their current position or element

Consider the following code example :
#include <vector>
#include <numeric>
#include <algorithm>
#include <iterator>
#include <iostream>
#include <functional>
int main()
{
std::vector<int> v(10, 2);
std::partial_sum(v.cbegin(), v.cend(), v.begin());
std::cout << "Among the numbers: ";
std::copy(v.cbegin(), v.cend(), std::ostream_iterator<int>(std::cout, " "));
std::cout << '\n';
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
if (std::none_of(v.cbegin(), v.cend(), std::bind(std::modulus<int>(),
std::placeholders::_1, 2))) {
std::cout << "None of them are odd\n";
}
struct DivisibleBy
{
const int d;
DivisibleBy(int n) : d(n) {}
bool operator()(int n) const { return n % d == 0; }
};
if (std::any_of(v.cbegin(), v.cend(), DivisibleBy(7))) {
std::cout << "At least one number is divisible by 7\n";
}
}
If we look at this part of the code :
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; })) {
std::cout << "All numbers are even\n";
}
which is fairly easy to understand. It iterates over those vector elements , and finds out i%2==0 , whether they are completely divisible by 2 or not , hence finds out they're even or not.
Its for loop counterpart could be something like this :
for(int i = 0; i<v.size();++i){
if(v[i] % 2 == 0) areEven = true; //just for readablity
else areEven = false;
}
In this for loop example , it is quiet clear that the current element we're processing is i since we're actually accessing v[i]. But how come in iterator version of same code , it maps i or knows what its current element is that we're accessing?
How does [](int i){ return i % 2 == 0; }) ensures/knows that i is the current element which iterator is pointing to.
I'm not able to makeout that without use of any v.currently_i_am_at_this_posiition() , how is iterating done. I know what iterators are but I'm having a hard time grasping them. Thanks :)
Iterators are modeled after pointers, and that's it really. How they work internally is of no interest, but a possible implementation is to actually have a pointer inside which points to the current element.
Iterating is done by using an iterator object
An iterator is any object that, pointing to some element in a range of
elements (such as an array or a container), has the ability to iterate
through the elements of that range using a set of operators (with at
least the increment (++) and dereference (*) operators).
The most obvious form of iterator is a pointer: A pointer can point to
elements in an array, and can iterate through them using the increment
operator (++).
and advancing it through the set of elements. The std::all_of function in your code is roughly equivalent to the following code
template< class InputIt, class UnaryPredicate >
bool c_all_of(InputIt first, InputIt last, UnaryPredicate p)
{
for (; first != last; ++first) {
if (!p(*first)) {
return false; // Found an odd element!
}
}
return true; // All elements are even
}
An iterator, when incremented, keeps track of the currently pointed element, and when dereferenced it returns the value of the currently pointed element.
For teaching's and clarity's sake, you might also think of the operation as follows (don't try this at home)
bool c_all_of(int* firstElement, size_t numberOfElements, std::function<bool(int)> evenTest)
{
for (size_t i = 0; i < numberOfElements; ++i)
if (!evenTest(*(firstElement + i)))
return false;
return true;
}
Notice that iterators are a powerful abstraction since they allow consistent elements access in different containers (e.g. std::map).

Weird (?) behavior when comparing iterators

I'm new to C++ so this might be a simple problem but I'm working my way through the C++ book by Stanley Lippman and there's this exercise where you're supposed to write a very basic search function for a vector of ints. Basically just incrementing the iterator until you find what you're looking for and then return an iterator to the element.
My first question is, in the book it says "Don't forget to handle the case where the element can't be found" - what would you do in a case like that? In java I would return a null but I guess that's not okay in C++ (a nullptr?)?
Second question is, why doesn't it work? I thought that if I can't find it, I'll just return the end()-iterator as it's one element behind the last one (thus, not pointing to an element in the vector) but I can't get the comparing to work, it says "Found!" on every number when I try it.
#include <vector>
#include <iterator>
#include <iostream>
const std::vector<int>::iterator
search (std::vector<int> v, const int find) {
auto beg = v.begin();
const auto end = v.end();
while (beg != end) {
if (*beg == find) {
return beg;
}
++beg;
}
return beg; // This can only be reached if beg = v.end()?
}
int
main () {
std::vector<int> v;
v.insert(v.end(), 2);
v.insert(v.end(), 5);
v.insert(v.end(), 10);
v.insert(v.end(), 7);
v.insert(v.end(), 12);
for (int i = 0; i < 16; ++i) {
std::vector<int>::iterator b = search(v, i);
std::cout << i;
if (std::distance(b, v.end()) == 0) {
std::cout << " not found!";
} else {
std::cout << " found!";
}
std::cout << std::endl;
}
return 0;
}
with output as follows:
$ ./a.exe
0 found!
1 found!
2 found!
3 found!
4 found!
5 found!
6 found!
7 found!
8 found!
9 found!
10 found!
11 found!
12 found!
13 found!
14 found!
15 found!
When you call the function, you are passing the vector by value, so it makes a copy. The iterators for this copy will not be the same as the ones in the original vector so the comparison fails. To fix this, pass the vector by constant reference:
search( const std::vector<int>& v, const int find )
To answer your first question, yes, returning the end() iterator is how you indicate the value was not found. This is how std::find() works:
If no such element is found, the function returns last.

Keeping track of removed elements using std::remove_if

I want to remove some elements from a vector and am using remove_if algorithm to do this. But I want to keep track of the removed elements so that I can perform some operation on them later. I tried this with the following code:
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
struct IsEven
{
bool operator()(int n)
{
if(n % 2 == 0)
{
evens.push_back(n);
return true;
}
return false;
}
vector<int> evens;
};
int main(int argc, char **argv)
{
vector<int> v;
for(int i = 0; i < 10; ++i)
{
v.push_back(i);
}
IsEven f;
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(), f);
for(vector<int>::iterator it = f.evens.begin(); it != f.evens.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());
return 0;
}
But this doesn't work as remove_if accepts the copy of my functor object, so the the stored evens vector is not accessible. What is the correct way of achieving this?
P.S. : The example, with even and odds is just for example sake, my real code is somethinf different. So don't suggest a way to identify even or odds differently.
The solution is not remove_if, but it's cousin partial_sort partition. The difference is that remove_if only guarantees that [begin, middle) contains the matching elements, but partition also guarantees that [middle, end) contains the elements which didn't match the predicate.
So, your example becomes just (note that evens is no longer needed):
vector<int>::iterator newEnd = partition(v.begin(), v.end(), f);
for(vector<int>::iterator it = newEnd; it != v.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());
Your best bet is std::partition() which will rearrange all elts in the sequence such as all elts for which your predicate return true will precede those for which it returns false.
Exemple:
vector<int>::iterator bound = partition (v.begin(), v.end(), IsEven);
std::cout << "Even numbers:" << std::endl;
for (vector<int>::iterator it = v.begin(); it != bound; ++it)
std::cout << *it << " ";
std::cout << "Odd numbers:" << std::endl;
for (vector<int>::iterator it = bound; it != v.end(); ++it)
std::cout << *it << " ";
You can avoid copying your functor (i.e. pass by value) if you pass ist by reference like this:
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(),
boost::bind<int>(boost::ref(f), _1));
If you can't use boost, the same is possible with std::ref. I tested the code above and it works as expected.
An additional level of indirection. Declare the vector locally, and
have IsEven contain a copy to it. It's also possible for IsEven to
own the vector, provided that it is dynamically allocated and managed by
a shared_ptr. In practice, I've generally found the local variable
plus pointer solution more convenient. Something like:
class IsEven
{
std::vector<int>* myEliminated;
public:
IsEven( std::vector<int>* eliminated = NULL )
: myEliminated( eliminated )
{
}
bool
operator()( int n ) const
{
bool results = n % 2 == 0;
if ( results && myEliminated != NULL ) {
myEliminated->push_back( n );
}
return results;
}
}
Note that this also allows the operator()() function to be const. I
think this is formally required (although I'm not sure).
The problem that I see with the code is that the evens vector that you create inside the struct gets created everytime the remove_if algorithm calls it. So no matter if you pass in a functor to remove_if it will create a new vector each time. So once the last element is removed and when the function call ends and comes out of the function the f.evens will always fetch an empty vector. This could be sorted in two ways,
Replace the struct with a class and declare evens as static (if that is what you wanted)
Or you could make evens global. I wouldn't personally recommend that( it makes the code bad, say no to globals unless you really need them).
Edit:
As suggested by nabulke you could also std::ref something likke this, std::ref(f). This prevents you from making the vector global and avoids for unnecessary statics.
A sample of making it global is as follows,
#include <vector>
#include <algorithm>
#include <iostream>
using namespace std;
vector<int> evens;
struct IsEven
{
bool operator()(int n)
{
if(n % 2 == 0)
{
evens.push_back(n);
return true;
}
return false;
}
};
int main(int argc, char **argv)
{
vector<int> v;
for(int i = 0; i < 10; ++i)
{
v.push_back(i);
}
IsEven f;
vector<int>::iterator newEnd = remove_if(v.begin(), v.end(), f);
for(vector<int>::iterator it = evens.begin(); it != evens.end(); ++it)
{
cout<<*it<<"\n";
}
v.erase(newEnd, v.end());
return 0;
}
This code seems to work just fine for me. Let me know if this is not what you wanted.
You may have another solution; only if you don't need to remove elts in the same time (do you?). With std::for_each() which returns a copy of your functor. Exemple:
IsEven result = std::for_each(v.begin(), v.end(), IsEven());
// Display the even numbers.
std::copy(result.evens.begin(), result.evens.end(), std::ostream_iterator<int> (cout, "\n"));
Take note that it is always better to create unnamed variables in c++ when possible. Here that solution does not exactly answer your primary issue (removing elts from the source container), but it reminds everyone that std::for_each() returns a copy of your functor. :-)

Remove an element from a vector by value - C++

If I have
vector<T> list
Where each element in the list is unique, what's the easiest way of deleting an element provided that I don't know if it's in the list or not? I don't know the index of the element and I don't care if it's not on the list.
You could use the Erase-remove idiom for std::vector
Quote:
std::vector<int> v;
// fill it up somehow
v.erase(std::remove(v.begin(), v.end(), 99), v.end());
// really remove all elements with value 99
Or, if you're sure, that it is unique, just iterate through the vector and erase the found element. Something like:
for( std::vector<T>::iterator iter = v.begin(); iter != v.end(); ++iter )
{
if( *iter == VALUE )
{
v.erase( iter );
break;
}
}
Based on Kiril's answer, you can use this function in your code :
template<typename T>
inline void remove(vector<T> & v, const T & item)
{
v.erase(std::remove(v.begin(), v.end(), item), v.end());
}
And use it like this
remove(myVector, anItem);
If occurrences are unique, then you should be using std::set<T>, not std::vector<T>.
This has the added benefit of an erase member function, which does what you want.
See how using the correct container for the job provides you with more expressive tools?
#include <set>
#include <iostream>
int main()
{
std::set<int> notAList{1,2,3,4,5};
for (auto el : notAList)
std::cout << el << ' ';
std::cout << '\n';
notAList.erase(4);
for (auto el : notAList)
std::cout << el << ' ';
std::cout << '\n';
}
// 1 2 3 4 5
// 1 2 3 5
Live demo
From c++20
//LIKE YOU MENTIONED EACH ELEMENT IS UNIQUE
std::vector<int> v = { 2,4,6,8,10 };
//C++20 UNIFORM ERASE FUNCTION (REMOVE_ERASE IDIOM IN ONE FUNCTION)
std::erase(v, 8); //REMOVES 8 FROM VECTOR
Now try
std::erase(v, 12);
Nothing will happen, the vector remains intact.