I have a source container of strings. I want to remove any strings from the source container that match a predicate and add them to the destination container.
remove_copy_if and other algorithms can only reorder the elements in the container, and therefore have to be followed up by the erase member function. My book (Josuttis) says that remove_copy_if returns an iterator after the last position in the destination container. Therefore if I only have an iterator into the destination container, how can I call erase on the source container? I have tried using the size of the destination to determine how far back from the end of the source container to erase from, but had no luck. I have only come up with the following code, but it makes two calls (remove_if and remove_copy_if).
Can someone let me know the correct way to do this? I'm sure that two linear calls is not the way to do this.
#include <iostream>
#include <iterator>
#include <vector>
#include <string>
#include <algorithm>
#include <functional>
using namespace std;
class CPred : public unary_function<string, bool>
{
public:
CPred(const string& arString)
:mString(arString)
{
}
bool operator()(const string& arString) const
{
return (arString.find(mString) == std::string::npos);
}
private:
string mString;
};
int main()
{
vector<string> Strings;
vector<string> Container;
Strings.push_back("123");
Strings.push_back("145");
Strings.push_back("ABC");
Strings.push_back("167");
Strings.push_back("DEF");
cout << "Original list" << endl;
copy(Strings.begin(), Strings.end(),ostream_iterator<string>(cout,"\n"));
CPred Pred("1");
remove_copy_if(Strings.begin(), Strings.end(),
back_inserter(Container),
Pred);
Strings.erase(remove_if(Strings.begin(), Strings.end(),
not1(Pred)), Strings.end());
cout << "Elements beginning with 1 removed" << endl;
copy(Strings.begin(), Strings.end(),ostream_iterator<string>(cout,"\n"));
cout << "Elements beginning with 1" << endl;
copy(Container.begin(), Container.end(),ostream_iterator<string>(cout,"\n"));
return 0;
}
With all due respect to Fred's hard work, let me add this: the move_if is no different than remove_copy_if at an abstract level. The only implementation level change is the end() iterator. You are still not getting any erase(). The accepted answer does not erase() the matched elements -- part of the OP's problem statement.
As for the OP's question: what you want is an in-place splice. This is possible for lists. However, with vectors this will not work. Read about when and how and why iterators are invalidated. You will have to take a two pass algorithm.
remove_copy_if and other algorithms can only reorder the elements in the container,
From SGI's documentation on remove_copy_if:
This operation is stable, meaning that the relative order of the elements that are copied is the same as in the range [first, last).
So no relative reordering takes place. Moreover, this is a copy, which means the elements of the source vector (Strings in your case) are copied to the Container vector.
how can I call erase on the source container?
You need to use a different algorithm, called remove_if:
remove_if removes from the range [first, last) every element x such that pred(x) is true. That is, remove_if returns an iterator new_last such that the range [first, new_last) contains no elements for which pred is true. The iterators in the range [new_last, last) are all still dereferenceable, but the elements that they point to are unspecified. Remove_if is stable, meaning that the relative order of elements that are not removed is unchanged.
So, just change that remove_copy_if call to:
vector<string>::iterator new_last = remove_if(Strings.begin(),
Strings.end(),
Pred);
and you're all set. Just keep in mind, your Strings vector's valid range is no longer the one defined by the iterators [begin(), end()) but rather [begin(), new_last).
You can, if you want to, remove the remaining [new_last, end()) by the following:
Strings.erase(new_last, Strings.end());
Now, your vector has been shortened and your end() and new_last are the same (one past the last element), so you can use as always:
copy(Strings.begin(), Strings.end(), ostream_iterator<string>(cout, "\n"));
to get a print of the strings on your console (stdout).
I see your point, that you'd like to avoid doing two passes over your source container. Unfortunately, I don't believe there's a standard algorithm that will do this. It would be possible to create your own algorithm that would copy elements to a new container and remove from the source container (in the same sense as remove_if; you'd have to do an erase afterward) in one pass. Your container size and performance requirements would dictate whether the effort of creating such an algorithm would be better than making two passes.
Edit: I came up with a quick implementation:
template<typename F_ITER, typename O_ITER, typename FTOR>
F_ITER move_if(F_ITER begin, F_ITER end, O_ITER dest, FTOR match)
{
F_ITER result = begin;
for(; begin != end; ++begin)
{
if (match(*begin))
{
*dest++ = *begin;
}
else
{
*result++ = *begin;
}
}
return result;
}
Edit:
Maybe there is confusion in what is meant by a "pass". In the OP's solution, there is a call to remove_copy_if() and a call to remove_if(). Each of these will traverse the entirety of the original container. Then there is a call to erase(). This will traverse any elements that were removed from the original container.
If my algorithm is used to copy the removed elements to a new container (using begin() of the original container as the output iterator will not work, as dirkgently demonstrated), it will perform one pass, copying the removed elements to the new container by means of a back_inserter or some such mechanism. An erase will still be required, just as with remove_if(). One pass over the original container is eliminated, which I believe is what the OP was after.
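A minimal usage sketch of the single-pass idea, restated so the block is self-contained. The `sketch` namespace and the `extract_containing` helper are hypothetical names, and this variant moves elements (C++11) rather than copying them, which is an assumption beyond the code above.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Single pass: elements matching `match` go to `dest`, the rest are
// compacted toward the front. Returns the new logical end of the source.
template <typename FwdIt, typename OutIt, typename Pred>
FwdIt move_if(FwdIt first, FwdIt last, OutIt dest, Pred match)
{
    FwdIt result = first;
    for (; first != last; ++first) {
        if (match(*first)) {
            *dest++ = std::move(*first);
        } else {
            if (result != first)              // avoid self-move-assignment
                *result = std::move(*first);
            ++result;
        }
    }
    return result;
}

} // namespace sketch

// Hypothetical helper: extracts every string containing `needle` from `source`.
std::vector<std::string> extract_containing(std::vector<std::string>& source,
                                            const std::string& needle)
{
    std::vector<std::string> extracted;
    auto new_end = sketch::move_if(
        source.begin(), source.end(), std::back_inserter(extracted),
        [&needle](const std::string& s) {
            return s.find(needle) != std::string::npos;
        });
    source.erase(new_end, source.end());      // still required, as with remove_if
    return extracted;
}
```

The erase at the end only touches the leftover tail, so the source is traversed once by the algorithm itself.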
Use copy_if followed by remove_if:
copy_if( Strings.begin(), Strings.end(),
back_inserter(Container), not1(Pred) );
Strings.erase( remove_if( Strings.begin(), Strings.end(), not1(Pred) ),
Strings.end() );
The code is easier to understand when the predicate class returns "true" if something is present. In that case you won't need not1 twice.
Because std::string::find looks for the substring anywhere in the string, not only at the beginning, you should change "beginning with 1" to "with 1" to avoid future misunderstanding of your code.
The whole reason why the remove_* algorithms do not erase elements is that it is impossible to "erase" an element given the iterator alone: you can't get from an iterator to its container.
This point is explained in more detail in the book "Effective STL".
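One way to see this, as a small sketch: std::remove also works on a plain array, where there is no container object that could be shrunk at all. The helper name `remove_from_array` is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Returns how many elements logically remain after removing `value`.
// The array itself keeps its size; only the returned count changes.
template <std::size_t N>
std::size_t remove_from_array(int (&arr)[N], int value)
{
    int* new_end = std::remove(std::begin(arr), std::end(arr), value);
    return static_cast<std::size_t>(new_end - std::begin(arr));
}
```

The algorithm only sees two iterators, so all it can do is report where the new logical end is.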
Use 'copy_if', followed by 'remove_if'. remove_copy_if does not modify the source.
On lists you can do better - reordering followed by splice.
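A rough sketch of what "reordering followed by splice" could look like for std::list, assuming C++11. The helper name `splice_if` is hypothetical, and std::stable_partition is one possible choice for the reordering step.

```cpp
#include <algorithm>
#include <list>
#include <string>

// Hypothetical helper: moves every element for which `pred` is true
// from `src` to the end of `dst`, preserving relative order.
template <typename T, typename Pred>
void splice_if(std::list<T>& src, std::list<T>& dst, Pred pred)
{
    // Put the elements to keep first; `mid` marks the start of the matches.
    auto mid = std::stable_partition(src.begin(), src.end(),
                                     [&pred](const T& x) { return !pred(x); });
    // Relink the matching nodes into dst; splice itself copies no elements.
    dst.splice(dst.end(), src, mid, src.end());
}
```

No erase is needed afterwards, because splice transfers the nodes out of the source list.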
If you don't mind having your strings in the same container, and having just an iterator to separate them, this code works.
#include <iostream>
#include <iterator>
#include <vector>
#include <string>
#include <algorithm>
#include <functional>
using namespace std;
class CPred : public unary_function<string, bool>
{
public:
CPred(const string& arString)
:mString(arString)
{
}
bool operator()(const string& arString) const
{
return (arString.find(mString) == std::string::npos);
}
private:
string mString;
};
int main()
{
vector<string> Strings;
Strings.push_back("213");
Strings.push_back("145");
Strings.push_back("ABC");
Strings.push_back("167");
Strings.push_back("DEF");
cout << "Original list" << endl;
copy(Strings.begin(), Strings.end(),ostream_iterator<string>(cout,"\n"));
CPred Pred("1");
vector<string>::iterator end1 =
partition(Strings.begin(), Strings.end(), Pred);
cout << "Elements matching with 1" << endl;
copy(end1, Strings.end(), ostream_iterator<string>(cout,"\n"));
cout << "Elements not matching with 1" << endl;
copy(Strings.begin(), end1, ostream_iterator<string>(cout,"\n"));
return 0;
}
remove*() doesn't really remove elements. It shifts the elements to keep toward the front of the range and returns a new_end iterator into the same container indicating where the new end is; the elements from new_end onward are left in an unspecified state. You then need to call erase to remove that range from the vector.
source.erase(std::remove(source.begin(), source.end(), element), source.end());
remove_if() does the same but with a predicate.
source.erase(std::remove_if(source.begin(), source.end(), predicate), source.end());
remove_copy_if() will only copy the elements NOT matching the predicate, leaving the source vector intact and providing you with the end iterator on the target vector, so that you can shrink it.
// target must already be large enough to accommodate the copy
target.erase(std::remove_copy_if(source.begin(), source.end(), target.begin(), predicate), target.end());
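If presizing the target is inconvenient, a back_inserter avoids it. This is only a sketch; the helper name `copy_unless` is invented for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copies the elements of `source` that do NOT satisfy `pred`;
// `source` itself is left untouched.
template <typename T, typename Pred>
std::vector<T> copy_unless(const std::vector<T>& source, Pred pred)
{
    std::vector<T> target;
    std::remove_copy_if(source.begin(), source.end(),
                        std::back_inserter(target), pred);
    return target;
}
```

With this form the target grows as needed, so no trailing erase on the target is required.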
In other words, what I mean to ask is: Is itr+=2 a valid argument in c++? (Here itr is an iterator to the first element of the set.) If so, then the following piece of code should work.
In this piece of code, the part inside the /* comment */ works well, while the part outside the comment does not. Please help me iterate over alternate elements.
#include <iostream>
#include <set>
using namespace std;
int main()
{
set<int> s;
s.insert(5);
s.insert(7);
s.insert(8);
auto it=s.begin();
cout<<*it<<'\n';
it+=2;
cout<<*it<<'\n';
/*for(auto it=s.begin();it!=s.end();it++)
cout<<*it<<" ";*/
return 0;
}
Is itr+=2 a valid argument in c++?
It depends on the container type. For example, it would be perfectly valid for std::vector or std::array, but not for std::set. Each container, due to its nature, provides a different type of iterator. std::set only provides a BidirectionalIterator, which does not support jumping over an arbitrary number of elements, only incrementing and decrementing.
However, you can use std::advance() from <iterator> library (or just increment the iterator twice). Beware that you must never increment end() iterator, so you need to take it into account in loop condition.
for(auto it=s.begin(); it != s.end(); it = std::next(it, std::next(it) == s.end() ? 1 : 2))
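A minimal sketch of visiting every other element of a std::set without ever stepping past end(). The function name `every_other` is hypothetical; it uses two guarded increments instead of std::advance.

```cpp
#include <set>
#include <vector>

// Collects the 1st, 3rd, 5th, ... element of the set.
std::vector<int> every_other(const std::set<int>& s)
{
    std::vector<int> out;
    for (auto it = s.begin(); it != s.end(); ) {
        out.push_back(*it);
        ++it;                       // step once...
        if (it != s.end()) ++it;    // ...and again only if not already at the end
    }
    return out;
}
```

For the set {5, 7, 8} from the question this yields 5 and 8.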
Consider the following code (taken from cppreference.com, slightly adapted):
#include <algorithm>
#include <string>
#include <iostream>
#include <cctype>
int main()
{
std::string str1 = " Text with some spaces";
str1.erase(std::remove(str1.begin(), str1.end(), ' '), str1.end());
std::cout << str1 << '\n';
return 0;
}
Why is the second parameter to erase necessary? (I.e. str1.end() in this case.)
Why can't I just supply the iterators which are returned by remove to erase? Why do I have to tell it also about the last element of the container from which to erase?
The pitfall here is that you can also call erase without the second parameter but that produces the wrong result, obviously.
Are there use cases where I would not want to pass the end of the container as a second parameter to erase?
Is omitting the second parameter of erase for the erase-remove idiom always an error or could that be a valid thing to do?
std::remove returns one iterator; it's the new past-the-end iterator for the sequence. But when the sequence is managed by a container, the size of the container hasn't changed; std::remove shuffles the order of the elements in the sequence, but doesn't actually remove any of them.
To get rid of the elements in the container that are not part of the new sequence you call, of course, container.erase(). But the goal is to remove all the extra elements; calling container.erase() with only one iterator tells it to remove that element. To tell container.erase() to erase everything "from here to the end" you have to tell it both where "here" is and where the end is. So that means two iterators.
If it helps, think of the "remove/erase" idiom as two separate steps:
auto new_end = std::remove(str1.begin(), str1.end(), ' ');
str1.erase(new_end, str1.end());
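To make the difference concrete, here is a small sketch contrasting the two forms. The function names are made up; the guard in the second function is there because calling the single-iterator erase with end() is undefined behavior, which happens whenever std::remove finds nothing to remove.

```cpp
#include <algorithm>
#include <string>

// Erase-remove with both iterators: drops every space.
std::string strip_spaces(std::string s)
{
    s.erase(std::remove(s.begin(), s.end(), ' '), s.end());
    return s;
}

// Single-iterator erase: removes only the one character at new_end,
// so the rest of the leftover tail stays in the string.
std::string strip_spaces_wrong(std::string s)
{
    auto new_end = std::remove(s.begin(), s.end(), ' ');
    if (new_end != s.end())   // erase(end()) with one iterator is undefined
        s.erase(new_end);
    return s;
}
```

With the input "a b c", the first function returns "abc", while the second returns a string of length 4 whose last character is unspecified leftover.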
I tried writing a generic, in place, intersperse function. The function should intersperse a given element into a sequence of elements.
#include <vector>
#include <list>
#include <algorithm>
#include <iostream>
#include <iterator>
template<typename ForwardIterator, typename InserterFunc>
void intersperse(ForwardIterator begin, ForwardIterator end, InserterFunc ins,
// we cannot use rvalue references here,
// maybe taking by value and letting users feed in std::ref would be smarter
const typename ForwardIterator::value_type& elem) {
if(begin == end) return;
while(++begin != end) {
// bugfix would be something like:
// begin = (ins(begin) = elem); // insert_iterator is convertible to a normal iterator
// or
// begin = (ins(begin) = elem).iterator(); // get the iterator to the last inserted element
// begin now points to the inserted element and we need to
// increment the iterator once again, which is safe
// ++begin;
ins(begin) = elem;
}
}
int main()
{
typedef std::list<int> container;
// as expected tumbles, falls over and goes up in flames with:
// typedef std::vector<int> container;
typedef container::iterator iterator;
container v{1,2,3,4};
intersperse(v.begin(), v.end(),
[&v](iterator it) { return std::inserter(v, it); },
23);
for(auto x : v)
std::cout << x << std::endl;
return 0;
}
The example works only for containers that do not invalidate their iterators on insertion. Should I simply get rid of the iterators and accept a container as the argument or am I missing something about insert_iterator that makes this kind of usage possible?
The example works only for containers that do not invalidate their iterators on insertion.
Exactly.
Should I simply get rid of the iterators and accept a container as the argument
That would be one possibility. Another would be not making the algorithm in-place (ie. output to a different container/output-iterator).
am I missing something about insert_iterator that makes this kind of usage possible?
No. insert_iterator is meant for repeated inserts to a single place of a container eg. by a transform algorithm.
The problems with your implementation have absolutely nothing to do with the properties of insert_iterator. All kinds of insert iterators in the C++ standard library are guaranteed to remain valid, even if you perform insertion into a container that potentially invalidates iterators on insert. This is, of course, true only if all insertions are performed only through the insert iterator.
In other words, the implementation of insert iterators guarantees that the iterator will automatically "heal" itself, even if the insertion leads to a potentially iterator-invalidating event in the container.
The problem with your code is that begin and end iterators can potentially get invalidated by insertion into certain container types. It is begin and end that you need to worry about in your code, not the insert iterator.
Meanwhile, you do it completely backwards for some reason. You seem to care about refreshing the insert iterator (which is completely unnecessary), while completely ignoring begin and end.
I am trying to understand STL algorithms.
Copy is defined as :
template<class InputIterator, class OutputIterator>
OutputIterator copy ( InputIterator first, InputIterator last, OutputIterator result )
Can someone please explain why the following works when vectors and deques are mixed but fails when vectors and sets are mixed?
#include <iostream>
#include <algorithm>
#include <vector>
#include <deque>
#include <set>
using namespace std;
int main () {
int myints[]={10,20,30,40,50,60,70};
vector<int> myvector;
vector<int>::iterator it;
set<int> mset(myints,myints+7);
set<int>::iterator setItr = mset.begin();
deque<int> deq;
deq.resize(10);
deque<int>::iterator deqItr = deq.begin();
myvector.resize(7); // allocate space for 7 elements
copy ( myints, myints+7, myvector.begin() );
copy ( myvector.begin(), myvector.end(), deqItr );
cout << "deque contains:";
for (deque<int>::iterator dit=deq.begin(); dit!=deq.end(); ++dit)
cout << " " << *dit;
cout << endl;
//copy ( myvector.begin(), myvector.end(), setItr );
return 0;
}
I understand that vectors and deques have random access iterators, whereas sets have bidirectional iterators. I fail to understand why compilation fails when only input/output iterators are required.
PS : This is just an experiment to increase my understanding :)
Associative containers (in plain C++03) are special containers that keep their elements sorted at all times, commonly implemented as a Red Black Tree. To maintain the order invariant, the set and map iterators provide constant references into the key object, and as such you cannot modify it.
In particular for std::set<T>, the iterator will usually be such that std::iterator_traits< std::set<T>::iterator >::reference is const T&, and as such the assignment implicit in the std::copy operation will fail.
If what you want to do is insert the elements into a set, you can use iterators from the <iterator> header that will perform insert operations in the set:
std::copy( v.begin(), v.end(), std::inserter( s, s.end() ) ); // s is the set
std::vector and std::deque have a way to preallocate space. std::set doesn't. Without preallocating the space, attempting to dereference the iterator you pass to copy produces undefined behavior.
The obvious alternative is to use insert iterators instead -- though, unfortunately, you nearly always still need to use different code for a set than for a deque or vector:
std::copy(myvector.begin(), myvector.end(), std::back_inserter(mydeque));
std::copy(myvector.begin(), myvector.end(), std::inserter(mySet, mySet.end()));
It works for vector and deque because you can allocate space beforehand. With other containers, like map, you need an iterator adapter to do that for you. Look at insert_iterator, for example.
Currently, I plan to remove all items from a vector that are not found in a set.
For example :
#include <vector>
#include <set>
#include <string>
#include <iostream>
using namespace std;
int main() {
std::set<string> erase_if_not_found;
erase_if_not_found.insert("a");
erase_if_not_found.insert("b");
erase_if_not_found.insert("c");
std::vector<string> orders;
orders.push_back("a");
orders.push_back("A");
orders.push_back("A");
orders.push_back("b");
orders.push_back("c");
orders.push_back("D");
// Expect all "A" and "D" to be removed.
for (std::vector<std::string>::iterator itr = orders.begin(); itr != orders.end();) {
if (erase_if_not_found.find(*itr) == erase_if_not_found.end()) {
orders.erase(itr);
// Begin from start point again? Do we have a better way?
itr = orders.begin();
} else {
++itr;
}
}
for (std::vector<std::string>::iterator itr = orders.begin(); itr != orders.end(); ++itr) {
std::cout << *itr << std::endl;
}
getchar();
}
Although the above code works, it is not efficient, as I start again from the beginning of the vector each time I delete an item.
Is there a better way?
Yes; you can use the erase/remove idiom with a custom predicate:
template <typename SetT>
struct not_contained_in_set_impl
{
not_contained_in_set_impl(const SetT& s) : set_(s) { }
template <typename T>
bool operator()(const T& v) const
{
return set_.find(v) == set_.end();
}
const SetT& set_;
};
template <typename SetT>
not_contained_in_set_impl<SetT> not_contained_in_set(const SetT& s)
{
return not_contained_in_set_impl<SetT>(s);
}
Used as:
orders.erase(
std::remove_if(orders.begin(),
orders.end(),
not_contained_in_set(erase_if_not_found)),
orders.end());
[compiled in my head on the fly]
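With C++11 or later, the same erase/remove idiom can be sketched with a lambda instead of a hand-written functor. The function name `keep_only` is made up for illustration.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Removes from `orders` every string that is not present in `allowed`.
void keep_only(std::vector<std::string>& orders,
               const std::set<std::string>& allowed)
{
    orders.erase(
        std::remove_if(orders.begin(), orders.end(),
                       [&allowed](const std::string& s) {
                           return allowed.find(s) == allowed.end();
                       }),
        orders.end());
}
```

This makes one pass over the vector with a logarithmic set lookup per element, and it keeps the surviving elements in their original order.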
If you are willing to sort the range first, you have other options that may perform better (std::set_intersection, for example).
Yes, there is a better way - you can move the items that are to be removed to the end of the vector. Then just cut off the end of the vector after the loop ends.
I would suggest copying the elements you want to keep into another vector instead of scanning the vector again from the beginning after each removal.
Also, you should store the iterator returned by end() outside the loop if the collection is no longer modified inside the loop, as calling end() is costly for some STL implementations. Some compilers optimize that, but not always.
It may help to sort the vector first, as the set is itself ordered.
A variant could be to order the vector by existence in the set, then chop all items at once.
I'm not sure if what you ask for is the intersection of two vectors, but if so, you might take a look at std::set_intersection.
It requires sorted vectors though.
The algorithm remove_if() will do this but you need a predicate to determine if the item is not in your set.
You can also use remove_copy_if() to copy your items into a new vector.
If your vector is sorted you can use set_intersection. That would also only allow one copy of each found element.
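A sketch of the set_intersection route, under the assumption that losing the original order and collapsing duplicates is acceptable. The function name `intersect_sorted` is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns the strings of `orders` that also occur in `allowed`, in sorted
// order. `orders` is taken by value so the caller's vector is not reordered.
std::vector<std::string> intersect_sorted(std::vector<std::string> orders,
                                          const std::set<std::string>& allowed)
{
    std::sort(orders.begin(), orders.end());   // set_intersection needs sorted input
    std::vector<std::string> result;
    std::set_intersection(orders.begin(), orders.end(),
                          allowed.begin(), allowed.end(),
                          std::back_inserter(result));
    return result;
}
```

Because a std::set holds each key once, at most one copy of each matching string ends up in the result, unlike the remove_if approach.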