C++ std::set::erase with std::remove_if - c++

This code has the Visual Studio error C3892. If I change std::set to std::vector - it works.
std::set<int> a;
a.erase(std::remove_if(a.begin(), a.end(), [](int item)
{
return item == 10;
}), a.end());
What's wrong? Why can't I use std::remove_if with std::set?

You cannot use std::remove_if() with sequences which have const parts. The sequence of std::set<T> elements are made up of T const objects. We actually discussed this question just yesterday at the standard C++ committee and there is some support to create algorithms dealing specifically with the erase()ing objects from containers. It would look something like this (see also N4009):
template <class T, class Comp, class Alloc, class Predicate>
void discard_if(std::set<T, Comp, Alloc>& c, Predicate pred) {
for (auto it{c.begin()}, end{c.end()}; it != end; ) {
if (pred(*it)) {
it = c.erase(it);
}
else {
++it;
}
}
}
(it would probably actually delegate to an algorithm dispatching to the logic above as the same logic is the same for other node-based container).
For you specific use, you can use
a.erase(10);
but this only works if you want to remove a key while the algorithm above works with arbitrary predicates. On the other hand, a.erase(10) can take advantage of std::set<int>'s structure and will be O(log N) while the algorithm is O(N) (with N == s.size()).

Starting with C++20, you can use std::erase_if for containers with an erase() method, just as Kühl explained.
// C++20 example:
std::erase_if(setUserSelection, [](auto& pObject) {
return !pObject->isSelectable();
});
Notice that this also includes std::vector, as it has an erase method. No more chaining a.erase(std::remove_if(... :)

std::remove_if re-orders elements, so it cannot be used with std::set. But you can use std::set::erase:
std::set<int> a;
a.erase(10);

Related

Time complexity for std::find_if() using set in C++

What is time complexity for std::find_if() function using std::set in C++?
Consider we have the following example:
auto cmp = [&](const pair<int, set<int>>& a , const pair<int, set<int>>& b) -> bool {
if (a.second.size() == b.second.size()) {
return a.first < b.first;
}
return a.second.size() < b.second.size();
};
set<pair<int, set<int>>, decltype(cmp)> tree(cmp);
...
int value = ...;
auto it = find_if(tree.begin(), tree.end(), [](const pair<int, int> &p) {
return p.first == value;
});
I know it takes std::find_if O(n) to work with std::vector. Can't see any problems for this function not to work with std::set with time complexity O(log(n)).
The complexity for std::find, std::find_if, and std::find_if_not is O(N). It does not matter what type of container you are using as the function is basically implemented like
template<class InputIt, class UnaryPredicate>
constexpr InputIt find_if(InputIt first, InputIt last, UnaryPredicate p)
{
for (; first != last; ++first) {
if (p(*first)) {
return first;
}
}
return last;
}
Which you can see just does a linear scan from first to last
If you want to take advantage of std::set being a sorted container, then you can use std::set::find which has O(logN) complexity. You can get a find_if type of behavior by using a comparator that is transparent.
std::find_if is equal opportunity because it can't take advantage of any special magic from the container. It just iterates the container looking for a match, so worst case is always O(N).
See documentation for std::find_if starting at the section on Complexity and pay special attention to the possible implementations.
Note that std::find_if's actual performance will likely be MUCH worse with a set than that of a vector of the same size because of the lack of locality in the data stored.

Is there a better way to split a container of a non-movable type based on a predicate

I want to ask for an alternative solution to this problem. I am dealing with this C/C++ style interface that has this non-moveable type NonMovableType defined roughly as follows:
union union_type {
int index;
const char* name;
};
struct NonMovableType
{
std::initializer_list<union_type> data;
};
This is something I cannot change, despite the unfortunate use of unions and initializer lists.
We then have some container of this type, say
std::vector<NonMovableType> container
and we want to split container based on some predicate for each of its members. Now, if it was a movable type i'd do
std::vector<NonMovableType> container;
std::vector<NonMovableType> result;
auto iter = std::partition(container.begin(), container.end(), [](const NonMovableType& element){
return element.data.size(); // the predicate
});
std::move(iter, container.end(), std::back_inserter(result));
container.erase(iter, container.end());
I could then trust container and result would contain the elements split by the predicate, that way I could then iterate over each one individually and do the necessary processing on them.
This wont work however because std::move and std::partition both require a movable type. Instead I have to result to the rather slow:
std::vector<NonMovableType> container;
std::vector<NonMovableType> result_a;
std::vector<NonMovableType> result_b;
std::copy_if(container.begin(), container.end(), std::back_inserter(result_a), [](const NonMovableType& element){
return element.data.size();
});
std::copy_if(container.begin(), container.end(), std::back_inserter(result_b), [](const NonMovableType& element){
return !element.data.size();
});
container.clear();
And so, my question is, is there any better way to do this? I suppose calling it a 'non movable type' may be wrong, its only the union and the initializer list which are giving me problems, so really the question becomes is there a way to move this type safely, and do so without having to change the initial class. Could it also be possible to wrap NonMovableType into another class and then use pointers as opposed to a direct type?
Is it really a performance problem or are you trying to optimize in advance?
As for a general answer: it really depends. I would probably try to achieve everything in a single pass (especially if the original container has a lot of elements), e.g.
for (const auto& el : container) {
if (el.data.size()) out1.push_back(el);
else out2.push_back(el);
}
which can be easily generalized into:
template<typename ForwardIt, typename OutputIt1, typename OutputIt2, typename Pred>
void split_copy(ForwardIt b, ForwardIt e, OutputIt1 out1, OutputIt2 out2, Pred f)
{
for(; b != e; ++b) {
if (f(*b)) {
*out1 = *b;
++out1;
} else {
*out2 = *b;
++out2;
}
}
}
Is this going to be faster than partitioning first and copying later?
I can't tell, maybe. Both solutions are imho ok in terms of readability, as for their performance - please measure and get back with the numbers. :)
Demo:
https://godbolt.org/z/axMsKGq7d
EDIT: operating on heap-allocated objects and vectors of pointers to them, as well as operating on lists is sth. to be verified in practice, for your particular use case. It might help, of course, but again, measure first, optimize later.

Any way to trick std::transform into operating on the iterator themselves?

So I wrote this code which won't compile. I think the reason is because std::transform, when given an iterator range such as this will operate on the type pointed to by the iterator, not the iterator itself. Is there any simple wrapper, standard lib tool, etc. to make this code work i.e. to store all the iterators of the original map into a new vector, with minimum changes required? Thanks!
#include <map>
#include <iostream>
#include <vector>
using MT = std::multimap<char, int>;
using MTI = MT::iterator;
int main()
{
MT m;
m.emplace('a', 1); m.emplace('a', 2); m.emplace('a', 3);
m.emplace('b', 101);
std::vector<MTI> itrs;
std::transform(m.begin(), m.end(), std::back_inserter(itrs), [](MTI itr){
return itr;
});
}
EDIT 1: Failed to compile with gcc11 and clang13, C++17/20
EDIT 2: The purpose of the question is mostly out of curiosity. I want to see what's a good way to manipulate existing standard algorithm to work on the level that I want. The sample code and problem are entirely made up for demonstration but they are not related to any real problem that requires a solution
Is there such a wrapper? Not in the standard. But it doesn't mean you can't write one, even fairly simply.
template<typename It>
struct PassIt : It {
It& operator*() { return *this; }
It const& operator*() const { return *this; }
PassIt & operator++() { ++static_cast<It&>(*this); return *this; }
PassIt operator++(int) const { return PassIt{static_cast<It&>(*this)++}; }
};
template<typename It>
PassIt(It) -> PassIt<It>;
That is just an example1 of wrapper that is a iterator of the specified template parameter type. It delegates to its base for the bookkeeping, while ensuring the the return types conform to returning the wrapped iterator itself when dereferencing.
You can use it in your example to simply copy the iterators
std::copy(PassIt{m.begin()}, PassIt{m.end()}, std::back_inserter(itrs));
See it live
(1) - It relies on std::iterator_traits deducing the correct things. As written in this example, it may not conform to all the requirements of the prescribed iterator type (in this case, we aimed at a forward iterator). If that happens, more boiler-plate will be required.
The function you pass to std::transform and algorithms in general are supposed to use elements not iterators. You could use the key to find the iterator in the map, though thats neither efficient nor simple. Instead use a plain loop:
for (auto it = m.begin(); it != m.end(); ++it) itrs.push_back(it);

What's wrong with my vector<T>::erase here?

I have two vector<T> in my program, called active and non_active respectively. This refers to the objects it contains, as to whether they are in use or not.
I have some code that loops the active vector and checks for any objects that might have gone non active. I add these to a temp_list inside the loop.
Then after the loop, I take my temp_list and do non_active.insert of all elements in the temp_list.
After that, I do call erase on my active vector and pass it the temp_list to erase.
For some reason, however, the erase crashes.
This is the code:
non_active.insert(non_active.begin(), temp_list.begin(), temp_list.end());
active.erase(temp_list.begin(), temp_list.end());
I get this assertion:
Expression:("_Pvector == NULL || (((_Myvec*)_Pvector)->_Myfirst <= _Ptr && _Ptr <= ((_Myvect*)_Pvector)->_Mylast)",0)
I've looked online and seen that there is a erase-remove idiom, however not sure how I'd apply that to a removing a range of elements from a vector<T>
I'm not using C++11.
erase expects a range of iterators passed to it that lie within the current vector. You cannot pass iterators obtained from a different vector to erase.
Here is a possible, but inefficient, C++11 solution supported by lambdas:
active.erase(std::remove_if(active.begin(), active.end(), [](const T& x)
{
return std::find(temp_list.begin(), temp_list.end(), x) != temp_list.end();
}), active.end());
And here is the equivalent C++03 solution without the lambda:
template<typename Container>
class element_of
{
Container& container;
element_of(Container& container) : container(container) {}
public:
template<typename T>
bool operator()(const T& x) const
{
return std::find(container.begin(), container.end(), x)
!= container.end();
}
};
// ...
active.erase(std::remove_if(active.begin(), active.end(),
element_of<std::vector<T> >(temp_list)),
active.end());
If you replace temp_list with a std::set and the std::find_if with a find member function call on the set, the performance should be acceptable.
The erase method is intended to accept iterators to the same container object. You're trying to pass in iterators to temp_list to use to erase elements from active which is not allowed for good reasons, as a Sequence's range erase method is intended to specify a range in that Sequence to remove. It's important that the iterators are in that sequence because otherwise we're specifying a range of values to erase rather than a range within the same container which is a much more costly operation.
The type of logic you're trying to perform suggests to me that a set or list might be better suited for the purpose. That is, you're trying to erase various elements from the middle of a container that match a certain condition and transfer them to another container, and you could eliminate the need for temp_list this way.
With list, for example, it could be as easy as this:
for (ActiveList::iterator it = active.begin(); it != active.end();)
{
if (it->no_longer_active())
{
inactive.push_back(*it);
it = active.erase(it);
}
else
++it;
}
However, sometimes vector can outperform these solutions, and maybe you have need for vector for other reasons (like ensuring contiguous memory). In that case, std::remove_if is your best bet.
Example:
bool not_active(const YourObjectType& obj);
active_list.erase(
remove_if(active_list.begin(), active_list.end(), not_active),
active_list.end());
More info on this can be found under the topic, 'erase-remove idiom' and you may need predicate function objects depending on what external states are required to determine if an object is no longer active.
You can actually make the erase/remove idiom usable for your case. You just need to move the value over to the other container before std::remove_if possibly shuffles it around: in the predicate.
template<class OutIt, class Pred>
struct copy_if_predicate{
copy_if_predicate(OutIt dest, Pred p)
: dest(dest), pred(p) {}
template<class T>
bool operator()(T const& v){
if(pred(v)){
*dest++ = v;
return true;
}
return false;
}
OutIt dest;
Pred pred;
};
template<class OutIt, class Pred>
copy_if_predicate<OutIt,Pred> copy_if_pred(OutIt dest, Pred pred){
return copy_if_predicate<OutIt,Pred>(dest,pred);
}
Live example on Ideone. (I directly used bools to make the code shorter, not bothering with output and the likes.)
The function std::vector::erase requires the iterators to be iterators into this vector, but you are passing iterators from temp_list. You cannot erase elements from a container that are in a completely different container.
active.erase(temp_list.begin(), temp_list.end());
You try to erase elements from one list, but you use iterators for second list. First list iterators aren't the same, like in second list.
I would like to suggest that this is an example of where std::list should be used. You can splice members from one list to another. Look at std::list::splice()for this.
Do you need random access? If not then you don't need a std::vector.
Note that with list, when you splice, your iterators, and references to the objects in the list remain valid.
If you don't mind making the implementation "intrusive", your objects can contain their own iterator value, so they know where they are. Then when they change state, they can automate their own "moving" from one list to the other, and you don't need to transverse the whole list for them. (If you want this sweep to happen later, you can get them to "register" themselves for later moving).
I will write an algorithm here now to run through one collection and if a condition exists, it will effect a std::remove_if but at the same time will copy the element into your "inserter".
//fwd iterator must be writable
template< typename FwdIterator, typename InputIterator, typename Pred >
FwdIterator copy_and_remove_if( FwdIterator inp, FwdIterator end, InputIterator outp, Pred pred )
{
for( FwdIterator test = inp; test != end; ++test )
{
if( pred(*test) ) // insert
{
*outp = *test;
++outp;
}
else // keep
{
if( test != inp )
{
*inp = *test;
}
++inp;
}
}
return inp;
}
This is a bit like std::remove_if but will copy the ones being removed into an alternative collection. You would invoke it like this (for a vector) where isInactive is a valid predicate that indicates it should be moved.
active.erase( copy_and_remove_if( active.begin(), active.end(), std::back_inserter(inactive), isInactive ), active.end() );
The iterators you pass to erase() should point into the vector itself; the assertion is telling you that they don't. This version of erase() is for erasing a range out of the vector.
You need to iterate over temp_list yourself and call active.erase() on the result of dereferencing the iterator at each step.

using STL to find all elements in a vector

I have a collection of elements that I need to operate over, calling member functions on the collection:
std::vector<MyType> v;
... // vector is populated
For calling functions with no arguments it's pretty straight-forward:
std::for_each(v.begin(), v.end(), std::mem_fun(&MyType::myfunc));
A similar thing can be done if there's one argument to the function I wish to call.
My problem is that I want to call a function on elements in the vector if it meets some condition. std::find_if returns an iterator to the first element meeting the conditions of the predicate.
std::vector<MyType>::iterator it =
std::find_if(v.begin(), v.end(), MyPred());
I wish to find all elements meeting the predicate and operate over them.
I've been looking at the STL algorithms for a "find_all" or "do_if" equivalent, or a way I can do this with the existing STL (such that I only need to iterate once), rather than rolling my own or simply do a standard iteration using a for loop and comparisons.
Boost Lambda makes this easy.
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
#include <boost/lambda/if.hpp>
std::for_each( v.begin(), v.end(),
if_( MyPred() )[ std::mem_fun(&MyType::myfunc) ]
);
You could even do away with defining MyPred(), if it is simple. This is where lambda really shines. E.g., if MyPred meant "is divisible by 2":
std::for_each( v.begin(), v.end(),
if_( _1 % 2 == 0 )[ std::mem_fun( &MyType::myfunc ) ]
);
Update:
Doing this with the C++0x lambda syntax is also very nice (continuing with the predicate as modulo 2):
std::for_each( v.begin(), v.end(),
[](MyType& mt ) mutable
{
if( mt % 2 == 0)
{
mt.myfunc();
}
} );
At first glance this looks like a step backwards from boost::lambda syntax, however, it is better because more complex functor logic is trivial to implement with c++0x syntax... where anything very complicated in boost::lambda gets tricky quickly. Microsoft Visual Studio 2010 beta 2 currently implements this functionality.
I wrote a for_each_if() and a for_each_equal() which do what I think you're looking for.
for_each_if() takes a predicate functor to evaluate equality, and for_each_equal() takes a value of any type and does a direct comparison using operator ==. In both cases, the function you pass in is called on each element that passes the equality test.
/* ---
For each
25.1.1
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first, InputIterator last, const T& value, Function f)
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first, InputIterator last, Predicate pred, Function f)
Requires:
T is of type EqualityComparable (20.1.1)
Effects:
Applies f to each dereferenced iterator i in the range [first, last) where one of the following conditions hold:
1: *i == value
2: pred(*i) != false
Returns:
f
Complexity:
At most last - first applications of f
--- */
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first,
InputIterator last,
Predicate pred,
Function f)
{
for( ; first != last; ++first)
{
if( pred(*first) )
f(*first);
}
return f;
};
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first,
InputIterator last,
const T& value,
Function f)
{
for( ; first != last; ++first)
{
if( *first == value )
f(*first);
}
return f;
};
Is it ok to change the vector? You may want to look at the partition algorithm.
Partition algorithm
Another option would be to change your MyType::myfunc to either check the element, or to take a predicate as a parameter and use it to test the element it's operating on.
std::vector<int> v, matches;
std::vector<int>::iterator i = v.begin();
MyPred my_pred;
while(true) {
i = std::find_if(i, v.end(), my_pred);
if (i == v.end())
break;
matches.push_back(*i);
}
For the record, while I have seen an implementation where calling end() on a list was O(n), I haven't seen any STL implementations where calling end() on a vector was anything other than O(1) -- mainly because vectors are guaranteed to have random-access iterators.
Even so, if you are worried about an inefficient end(), you can use this code:
std::vector<int> v, matches;
std::vector<int>::iterator i = v.begin(), end = v.end();
MyPred my_pred;
while(true) {
i = std::find_if(i, v.end(), my_pred);
if (i == end)
break;
matches.push_back(*i);
}
For what its worth for_each_if is being considered as an eventual addition to boost. It isn't hard to implement your own.
Lamda functions - the idea is to do something like this
for_each(v.begin(), v.end(), [](MyType& x){ if (Check(x) DoSuff(x); })
Origial post here.
You can use Boost.Foreach:
BOOST_FOREACH (vector<...>& x, v)
{
if (Check(x)
DoStuff(x);
}