C++ algorithm to advance iterator while condition is true - c++

Having a pair of iterators [begin, end), I want to advance begin while a condition is true and I haven't reached end. As I don't know of any "direct" algorithm from the standard library to do this, I'm using:
std::find_if_not(begin, end, condition);
but my problem is that the name of the function doesn't clearly express my intention to advance begin while the condition is true.
Is there any algorithm in the C++ standard library to advance an iterator while a condition is true?

C++14:
template<class...Args>
auto advance_while_true( Args&&... args ) {
return std::find_if_not( std::forward<Args>(args)... );
}
but really, just use find_if_not. The name might not match your description of the problem, but as a std library algorithm, it is relatively famous.
If the condition is common, write a wrapper that takes two (templatized) iterators and includes the condition inside itself.
template<class Iterator>
std::decay_t<Iterator> advance_while_foo( Iterator first, Iterator last ) {
return std::find_if_not( std::forward<Iterator>(first), std::forward<Iterator>(last),
[](auto&& x) {
return foo(x);
}
);
}
which both uses the std algorithm for the guts (meaning it will be better written than if you write it yourself probably), and gives it a name (foo) that in theory should be appropriate.
(forward and decay_t are probably overkill. Replace auto&& with the stored type const& and std::decay_t<?> with typename std::decay<?>::type if you aren't using C++14.)
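As a quick illustration of the advice above, here is a minimal sketch that skips leading even numbers with std::find_if_not. The helper name skip_leading_evens and the evenness test are made up for this example; they are not part of the original answer.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns an iterator to the first element that is
// not even, or last if every element in [first, last) is even.
inline std::vector<int>::const_iterator
skip_leading_evens(std::vector<int>::const_iterator first,
                   std::vector<int>::const_iterator last) {
    return std::find_if_not(first, last, [](int x) { return x % 2 == 0; });
}
```

Called on a vector holding 2, 4, 6, 7, 8, the returned iterator points at 7; if every element is even, it equals the end iterator.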

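I think this is most easily expressed by a plain loop...
while (condition(*begin)) ++begin;
And if you want to check against an end iterator, just add that to the conditions...
while (begin != end && condition(*begin)) ++begin;
(Folding the increment into the test, as in condition(*begin++), is shorter but leaves begin one past the first element that fails the condition.)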
It's a nice little trick (that goes back to C) because it works for things that aren't even technically iterators like...
// Consume leading whitespace
while (isspace(ch = getchar()));

Related

Is there a std::unique-style library algorithm that has user-defined collision handler?

I have a basic std::vector of key/value pairs. It is sorted by key. I would like to reduce all of the adjacent duplicate key entries using a user-defined binary operator while compacting the vector.
This is basically a std::unique application where the user can decide how to handle the collision rather than just keeping the first entry.
Is there a library algorithm that satisfies this requirement? I can write my own but I would prefer to rely on something that an expert has written.
The map-as-sorted-vector is core to other parts of the algorithm and can't be changed. I am limited to C++14.
I can't think of a standard algo for this. std::unique almost satisfies the requirement, but unfortunately the BinaryPredicate you supply to compare elements isn't allowed to modify them ("binary_pred shall not apply any non-constant function through the dereferenced iterators." - [algorithms.requirements] paragraph 7 in the C++17 Standard) - a requirement that lets the implementation optimise more freely (e.g. parallel processing of different parts of the vector).
An implementation's not too hard though...
template <typename Iterator, typename BinaryPredicate, typename Compaction>
Iterator compact(Iterator begin, Iterator end, BinaryPredicate equals, Compaction compaction)
{
if (begin == end) return begin;
Iterator compact_to = begin;
while (++begin != end)
if (equals(*begin, *compact_to))
compaction(*compact_to, *begin);
else
*++compact_to = *begin;
return ++compact_to;
}
The return value will be the new "end" for the compacted vector - you can erase therefrom like you would for remove_if.
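To show how the algorithm above might be called, here is a small sketch that sums the values of adjacent entries with equal keys. The Entry alias and the sum_duplicate_keys helper are illustrative assumptions; compact is restated from the answer so the example is self-contained.

```cpp
#include <utility>
#include <vector>

// Restated from the answer above so this example compiles on its own.
template <typename Iterator, typename BinaryPredicate, typename Compaction>
Iterator compact(Iterator begin, Iterator end,
                 BinaryPredicate equals, Compaction compaction) {
    if (begin == end) return begin;
    Iterator compact_to = begin;
    while (++begin != end) {
        if (equals(*begin, *compact_to))
            compaction(*compact_to, *begin);  // fold duplicate into kept entry
        else
            *++compact_to = *begin;           // start a new kept entry
    }
    return ++compact_to;
}

// Hypothetical key/value type for illustration.
using Entry = std::pair<int, int>;

// Hypothetical collision policy: add up the values of equal keys.
inline void sum_duplicate_keys(std::vector<Entry>& v) {
    auto new_end = compact(
        v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.first == b.first; },
        [](Entry& kept, const Entry& dup) { kept.second += dup.second; });
    v.erase(new_end, v.end());  // drop the leftover tail, as with remove_if
}
```

With the input (1,1), (1,2), (2,3), (3,4), (3,5), the vector ends up as (1,3), (2,3), (3,9).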

Inplace versions of set_difference, set_intersection and set_union

I implemented versions of set_union, set_intersection and set_difference that take a sorted container and a sorted range (that must not be within the container), and write the result of the operation into the container.
template<class Container, class Iter>
void assign_difference(Container& cont, Iter first, Iter last)
{
auto new_end = std::set_difference( // (1)
cont.begin(), cont.end(), first, last, cont.begin());
cont.erase(new_end, cont.end());
}
template<class Container, class Iter>
void assign_intersection(Container& cont, Iter first, Iter last)
{
auto new_end = std::set_intersection( // (2)
cont.begin(), cont.end(), first, last, cont.begin());
cont.erase(new_end, cont.end());
}
template<class Container, class Iter>
void assign_union(Container& cont, Iter first, Iter last)
{
auto insert_count = last - first;
cont.resize(cont.size() + insert_count); // T must be default-constructible
auto rfirst1 = cont.rbegin() + insert_count, rlast1 = cont.rend();
auto rfirst2 = std::make_reverse_iterator(last);
auto rlast2 = std::make_reverse_iterator(first);
rlast1 = std::set_union( // (3)
rfirst1, rlast1, rfirst2, rlast2, cont.rbegin(), std::greater<>());
cont.erase(std::copy(rlast1.base(), cont.end(), cont.begin()), cont.end());
}
The goal was:
No allocation is performed if the container has enough capacity to hold the result.
Otherwise exactly one allocation is performed to give the container the capacity to hold the result.
As you can see in the lines marked (1), (2) and (3), the same container is used as input and output for those STL algorithms. Assuming a usual implementation of those STL algorithms, this code works, since it only writes to parts of the container that have already been processed.
As pointed out in the comments, it's not guaranteed by the standard that this works. set_union, set_intersection and set_difference require that the resulting range doesn't overlap with one of the input ranges.
However, can there be an STL implementation that breaks the code?
If your answer is yes, please provide a conforming implementation of one of the three STL algorithms used that breaks the code.
A conforming implementation could check whether arguments 1 and 5 of set_intersection are equal and, if they are, format your hard drive.
If you violate the requirements, the standard places no constraints on your program's behaviour; this is undefined behaviour.
There are situations where UB may be worth the risk and cost (auditing all compiler changes and assembly output). I do not see the point here; write your own. Any fancy optimizations that the std library comes up with could cause problems when you violate requirements as you are doing, and as you have noted the naive implementation is simple.
As a rule of thumb, I do not write to a container that I am iterating over. Anything can happen. In general it's odd.
As @Yakk said, it sounds ill-formed. That's it. It's something to remove from your code base so you can sleep peacefully.
If you really need those functions, I would suggest to write by yourself the inner loop (eg: the inner of std::set_intersection) in order to handle the constraint you need for your algorithm to work.
I don't think that looking for an STL implementation on which it doesn't work is the right approach. It doesn't sound like a long-term solution. For the long term, the standard should be your reference, and as someone already pointed out, your solution doesn't seem to comply with it.
My 2 cents
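As a rough sketch of the "write the inner loop yourself" suggestion, an in-place intersection could look like the following. The name assign_intersection_inplace is made up here, and the code assumes that cont is sorted ascending by operator<, that [first, last) is sorted the same way, and that the range does not point into cont.

```cpp
#include <utility>
#include <vector>

// Sketch only: keeps in cont the elements that also occur in [first, last).
// The write position never overtakes the read position, and this loop is
// our own code, so no library precondition about overlapping ranges applies.
template <class Container, class Iter>
void assign_intersection_inplace(Container& cont, Iter first, Iter last) {
    auto out = cont.begin();  // next slot to write a kept element into
    auto in = cont.begin();   // next element of cont to examine
    while (in != cont.end() && first != last) {
        if (*in < *first) {
            ++in;             // only in cont: drop it
        } else if (*first < *in) {
            ++first;          // only in the other range: skip it
        } else {
            if (out != in) *out = std::move(*in);  // equivalent: keep it
            ++out;
            ++in;
            ++first;
        }
    }
    cont.erase(out, cont.end());
}
```

For a container holding 1, 2, 3, 4, 5 and a range holding 2, 4, 6, the container ends up holding 2, 4, and no allocation is needed because elements are only shifted towards the front.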

Why is there no std::inplace_merge_unique?

I tried looking for an algorithm that would do what std::inplace_merge followed by std::unique would do. It seems more efficient to do it in one pass than in two.
I could not find it in the standard library or by googling.
So is there an implementation somewhere in Boost, maybe under a different name?
Is such an algorithm possible (in the sense that it has the same complexity guarantees as the normal inplace_merge)?
It doesn't operate in-place, but assuming that neither range contains duplicates beforehand, std::set_union will find the same result as merge followed by unique.
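A minimal sketch of that observation, assuming both inputs are sorted and free of duplicates; the function name merge_unique is made up for illustration, and the result goes into a new vector rather than being built in place.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Sketch: for sorted, duplicate-free inputs, std::set_union yields the same
// sequence as a merge followed by std::unique, written to a new vector.
inline std::vector<int> merge_unique(const std::vector<int>& a,
                                     const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());  // upper bound on the result size
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::back_inserter(out));
    return out;
}
```

For the inputs 1, 3, 5 and 3, 4, 5, 6 the result is 1, 3, 4, 5, 6.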
There are many interesting algorithms missing from the algorithms section. The original submission of STL was incomplete from Stepanov's view and some algorithms were even removed. The proposal by Alexander Stepanov and Meng Lee doesn't seem to include an algorithm inplace_merge_unique() or any variation thereof.
One of the potential reasons why there is no such algorithm is that it isn't clear which of the elements should be dropped: since the comparison is only a strict weak ordering, the choice of element matters. One approach to implement inplace_merge_unique() is to
Use std::remove_if() to remove any element which is a duplicate from the second range.
Use inplace_merge() to do the actual merge.
The predicate to std::remove_if() would track the current position in the first part of the sequence to be merged. The code below isn't tested but something like that should work:
template <typename BiDirIt, typename Comp>
BiDirIt inplace_merge_unique(BiDirIt begin, BiDirIt middle, BiDirIt end, Comp comp) {
using reference = typename std::iterator_traits<BiDirIt>::reference;
BiDirIt result = std::remove_if(middle, end, [=](reference other) mutable -> bool {
begin = std::find_if(begin, middle, [=](reference arg)->bool {
return !comp(arg, other);
});
return begin != middle && !comp(other, *begin);
});
std::inplace_merge(begin, middle, result, comp);
return result;
}

What's wrong with my vector<T>::erase here?

I have two vector<T> in my program, called active and non_active respectively. This refers to the objects it contains, as to whether they are in use or not.
I have some code that loops the active vector and checks for any objects that might have gone non active. I add these to a temp_list inside the loop.
Then after the loop, I take my temp_list and do non_active.insert of all elements in the temp_list.
After that, I do call erase on my active vector and pass it the temp_list to erase.
For some reason, however, the erase crashes.
This is the code:
non_active.insert(non_active.begin(), temp_list.begin(), temp_list.end());
active.erase(temp_list.begin(), temp_list.end());
I get this assertion:
Expression:("_Pvector == NULL || (((_Myvec*)_Pvector)->_Myfirst <= _Ptr && _Ptr <= ((_Myvect*)_Pvector)->_Mylast)",0)
I've looked online and seen that there is an erase-remove idiom; however, I'm not sure how I'd apply that to removing a range of elements from a vector<T>.
I'm not using C++11.
erase expects a range of iterators passed to it that lie within the current vector. You cannot pass iterators obtained from a different vector to erase.
Here is a possible, but inefficient, C++11 solution supported by lambdas:
active.erase(std::remove_if(active.begin(), active.end(), [&](const T& x)
{
return std::find(temp_list.begin(), temp_list.end(), x) != temp_list.end();
}), active.end());
And here is the equivalent C++03 solution without the lambda:
template<typename Container>
class element_of
{
const Container& container;
public:
explicit element_of(const Container& container) : container(container) {}
template<typename T>
bool operator()(const T& x) const
{
return std::find(container.begin(), container.end(), x)
!= container.end();
}
};
// ...
active.erase(std::remove_if(active.begin(), active.end(),
element_of<std::vector<T> >(temp_list)),
active.end());
If you replace temp_list with a std::set and the std::find with a find member function call on the set, the performance should be acceptable.
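A sketch of that std::set variant, using int elements for brevity. The function name erase_all_in is hypothetical, and the lambda requires C++11; in C++03 the same idea works with a functor like element_of that calls the set's find member.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Sketch: remove from `active` every element that is present in `to_remove`.
// Each lookup is O(log n) in the set instead of a linear scan with std::find.
inline void erase_all_in(std::vector<int>& active,
                         const std::set<int>& to_remove) {
    active.erase(
        std::remove_if(active.begin(), active.end(),
                       [&to_remove](int x) {
                           return to_remove.find(x) != to_remove.end();
                       }),
        active.end());
}
```

With active holding 1, 2, 3, 4, 5 and to_remove holding 2 and 4, active is left holding 1, 3, 5.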
The erase method is intended to accept iterators to the same container object. You're trying to pass in iterators to temp_list to use to erase elements from active which is not allowed for good reasons, as a Sequence's range erase method is intended to specify a range in that Sequence to remove. It's important that the iterators are in that sequence because otherwise we're specifying a range of values to erase rather than a range within the same container which is a much more costly operation.
The type of logic you're trying to perform suggests to me that a set or list might be better suited for the purpose. That is, you're trying to erase various elements from the middle of a container that match a certain condition and transfer them to another container, and you could eliminate the need for temp_list this way.
With list, for example, it could be as easy as this:
for (ActiveList::iterator it = active.begin(); it != active.end();)
{
if (it->no_longer_active())
{
inactive.push_back(*it);
it = active.erase(it);
}
else
++it;
}
However, sometimes vector can outperform these solutions, and maybe you have need for vector for other reasons (like ensuring contiguous memory). In that case, std::remove_if is your best bet.
Example:
bool not_active(const YourObjectType& obj);
active_list.erase(
remove_if(active_list.begin(), active_list.end(), not_active),
active_list.end());
More info on this can be found under the topic, 'erase-remove idiom' and you may need predicate function objects depending on what external states are required to determine if an object is no longer active.
You can actually make the erase/remove idiom usable for your case. You just need to move the value over to the other container before std::remove_if possibly shuffles it around: in the predicate.
template<class OutIt, class Pred>
struct copy_if_predicate{
copy_if_predicate(OutIt dest, Pred p)
: dest(dest), pred(p) {}
template<class T>
bool operator()(T const& v){
if(pred(v)){
*dest++ = v;
return true;
}
return false;
}
OutIt dest;
Pred pred;
};
template<class OutIt, class Pred>
copy_if_predicate<OutIt,Pred> copy_if_pred(OutIt dest, Pred pred){
return copy_if_predicate<OutIt,Pred>(dest,pred);
}
Live example on Ideone. (I directly used bools to make the code shorter, not bothering with output and the likes.)
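A possible way to call this is sketched below, with int elements and a made-up is_even predicate standing in for "no longer active". It relies on std::back_inserter: std::remove_if may copy the predicate, and copies of a back_insert_iterator all append to the same container. The struct is restated so the example compiles on its own.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Restated from the answer above.
template <class OutIt, class Pred>
struct copy_if_predicate {
    copy_if_predicate(OutIt dest, Pred p) : dest(dest), pred(p) {}

    template <class T>
    bool operator()(T const& v) {
        if (pred(v)) {
            *dest++ = v;  // copy the element out before it gets overwritten
            return true;  // and tell remove_if to drop it
        }
        return false;
    }

    OutIt dest;
    Pred pred;
};

template <class OutIt, class Pred>
copy_if_predicate<OutIt, Pred> copy_if_pred(OutIt dest, Pred pred) {
    return copy_if_predicate<OutIt, Pred>(dest, pred);
}

// Hypothetical stand-in for the "has gone inactive" check.
inline bool is_even(int x) { return x % 2 == 0; }

// Moves every matching element from `active` to the back of `inactive`.
inline void transfer_matching(std::vector<int>& active,
                              std::vector<int>& inactive) {
    active.erase(
        std::remove_if(active.begin(), active.end(),
                       copy_if_pred(std::back_inserter(inactive), &is_even)),
        active.end());
}
```

Starting from active holding 1 to 6 and an empty inactive, this leaves 1, 3, 5 in active and 2, 4, 6 in inactive.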
The function std::vector::erase requires the iterators to be iterators into this vector, but you are passing iterators from temp_list. You cannot erase elements from a container that are in a completely different container.
active.erase(temp_list.begin(), temp_list.end());
You are trying to erase elements from one vector using iterators that belong to a second one. Iterators into the first vector are not interchangeable with iterators into the second.
I would like to suggest that this is an example of where std::list should be used. You can splice members from one list to another. Look at std::list::splice() for this.
Do you need random access? If not then you don't need a std::vector.
Note that with list, when you splice, your iterators, and references to the objects in the list remain valid.
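A short sketch of the splice approach follows. The Item type, its fields, and the function name move_inactive are assumptions made for this example only.

```cpp
#include <iterator>
#include <list>

// Hypothetical element type for illustration.
struct Item {
    int id;
    bool is_active;
};

// Sketch: relink every inactive node from `active` to the end of `inactive`.
// splice moves the node itself, so no element is copied and pointers or
// references to the moved elements stay valid.
inline void move_inactive(std::list<Item>& active, std::list<Item>& inactive) {
    for (auto it = active.begin(); it != active.end();) {
        auto next = std::next(it);  // remember the successor before relinking
        if (!it->is_active) {
            inactive.splice(inactive.end(), active, it);
        }
        it = next;
    }
}
```

Because the nodes are relinked rather than copied, the address of an element is the same before and after it changes lists.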
If you don't mind making the implementation "intrusive", your objects can contain their own iterator value, so they know where they are. Then when they change state, they can automate their own "moving" from one list to the other, and you don't need to traverse the whole list for them. (If you want this sweep to happen later, you can get them to "register" themselves for later moving).
I will write an algorithm here now to run through one collection and if a condition exists, it will effect a std::remove_if but at the same time will copy the element into your "inserter".
//fwd iterator must be writable
template< typename FwdIterator, typename OutputIterator, typename Pred >
FwdIterator copy_and_remove_if( FwdIterator inp, FwdIterator end, OutputIterator outp, Pred pred )
{
for( FwdIterator test = inp; test != end; ++test )
{
if( pred(*test) ) // insert
{
*outp = *test;
++outp;
}
else // keep
{
if( test != inp )
{
*inp = *test;
}
++inp;
}
}
return inp;
}
This is a bit like std::remove_if but will copy the ones being removed into an alternative collection. You would invoke it like this (for a vector) where isInactive is a valid predicate that indicates it should be moved.
active.erase( copy_and_remove_if( active.begin(), active.end(), std::back_inserter(inactive), isInactive ), active.end() );
The iterators you pass to erase() should point into the vector itself; the assertion is telling you that they don't. This version of erase() is for erasing a range out of the vector.
You need to iterate over temp_list yourself and, at each step, locate the matching element in active (for example with std::find) and call active.erase() on the iterator you get back.

Breaking in std::for_each loop

While using std::for_each algorithm how do I break when a certain condition is satisfied?
You can use std::any_of (or std::all_of or std::none_of) e.g. like this:
std::vector<int> a;
// ...
std::all_of(a.begin(), a.end(), [&](int val) {
// return false if you want to break, true otherwise
});
However, this is a wasteful solution (return values are not really used for anything), and you're better off writing your own loop.
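To make the snippet above concrete, here is a sketch in which the lambda does its work and returns false to "break". The function name take_until and the stop value are illustrative assumptions, and the early exit relies on std::all_of returning as soon as the predicate yields false, which is how the mainstream implementations behave.

```cpp
#include <algorithm>
#include <vector>

// Sketch: visit elements in order and stop at the first one equal to stop_at.
inline std::vector<int> take_until(const std::vector<int>& input, int stop_at) {
    std::vector<int> visited;
    // The result of all_of is deliberately discarded; only the early exit matters.
    (void)std::all_of(input.begin(), input.end(), [&](int val) {
        if (val == stop_at) return false;  // acts like "break"
        visited.push_back(val);            // the per-element work
        return true;                       // acts like "continue"
    });
    return visited;
}
```

For the input 1, 2, 3, 4, 5 and a stop value of 4, the visited elements are 1, 2, 3.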
You can use the std::find_if algorithm, which stops and returns an iterator to the first element for which the predicate returns true. So your predicate should be changed to return a boolean that acts as the continue/break condition.
However, this is a hack whose only purpose is to let you keep using the algorithms.
Another way is to use BOOST_FOREACH.
You can break from the for_each() by throwing an exception from your functor. This is often not a good idea however, and there are alternatives.
You can retain state in your functor. If you detect the 'break' condition, simply set a flag in your functor and then for each subsequent iteration simply return without doing your functor's thing. Obviously this won't stop the iteration, which might be expensive for large collections, but it will at least stop the work from being performed.
If your collection is sorted, you can find() the element that you want to break at, then do for_each from begin() to the element find() returned.
Finally, you can implement a for_each_if(). This will again not stop the iteration but will not evaluate your functor which does the work if the predicate evaluates to false. Here are 2 flavors of for_each_xxx(), one which takes a value and performs the work if operator==() evaluates to true, and another which takes two functors; one which performs a comparison ala find_if(), and the other which performs the work if the comparison operator evaluates to true.
/* ---
For each
25.1.1
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first, InputIterator last, const T& value, Function f)
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first, InputIterator last, Predicate pred, Function f)
Requires:
T is of type EqualityComparable (20.1.1)
Effects:
Applies f to each dereferenced iterator i in the range [first, last) where one of the following conditions hold:
1: *i == value
2: pred(*i) != false
Returns:
f
Complexity:
At most last - first applications of f
--- */
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first,
InputIterator last,
Predicate pred,
Function f)
{
for( ; first != last; ++first)
{
if( pred(*first) )
f(*first);
}
return f;
}
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first,
InputIterator last,
const T& value,
Function f)
{
for( ; first != last; ++first)
{
if( *first == value )
f(*first);
}
return f;
}
If you want to perform some action only while a condition is not satisfied, maybe you need to switch to a different algorithm, something like std::find_if?
As already shown by others it is only achievable with workarounds that IMHO obfuscate the code.
So my suggestion is to change the for_each into a regular for loop. This will make it more visible to others that you are using break (and maybe even continue).
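For comparison, a plain loop with break might look like this sketch. The function name and the "stop at the first negative value" rule are invented for the example, and the range-based for needs C++11; an iterator-based for loop works the same way.

```cpp
#include <vector>

// Sketch: add up values until the first negative one, then stop.
inline int sum_until_negative(const std::vector<int>& values) {
    int sum = 0;
    for (int v : values) {
        if (v < 0) break;  // the early exit is explicit and easy to spot
        sum += v;
    }
    return sum;
}
```

For the input 1, 2, 3, -1, 10 the function returns 6, because the loop exits before reaching 10.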
You can't do it, unless you throw an exception, which is not a good idea because you don't do flow control with exceptions.
Update: apparently Boost has a for_each_if that might help, but you're not using Boost.
You throw an exception. Whether or not it's a good idea is sort of a style question, pace @Dan, but may be more of an issue with your design. for_each is intended for a sort of functional-programming style, which implicitly assumes that your function can be applied uniformly across the set. So, if you do need to break, that could be considered an unusual condition, and therefore worthy of an exception.
The other solution, and a more "functional" solution, is to write your function so that if it shouldn't have an effect on some applications, then write it to have no effect. So, for example, if you had a summing function, have it add 0 in the cases you would have "broken" from.
You can use std::find_if instead of std::for_each:
int aaa[]{ 1, 2, 3, 4, 5, 6, 7, 8 };
std::find_if(aaa, std::next(aaa, sizeof(aaa) / sizeof(int)), [](const auto &i) {
if (i == 5)
return true;
std::cout << i << std::endl;
return false;
});
Output:
1
2
3
4