Efficiency of STL's copy function - c++

I'm trying to construct a set of unique words from a list of entries, each of which has a vector of strings.
So I made a function called Insert, which gets called for each of the entries like this:
for( auto & e : _Entries )
_Dictionary.Insert( begin( e.getNameWords( ) ), end( e.getNameWords( ) ) );
The class _Dictionary internally has a set (the STL container) and I wrote the function Insert as follows:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
for( auto it = first ; it != last ; ++it )
_AllWords.insert( *it );
}
In my case, calling Insert for all entries in _Entries took an average of 570 milliseconds.
Then I thought that I should use the functions that the STL already has to do the same thing that the for loop in Insert does, so I changed the function Insert to the following:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
copy( first, last, inserter( _AllWords, begin( _AllWords ) ) );
}
I was expecting this to be more correct and at least as fast, if not faster (guided by the philosophy of letting the STL do as much for you as it can). However, I was surprised to notice that this implementation actually took longer: not much more, but a consistent 200 milliseconds more than the previous for-loop based implementation.
I know this is an essentially trivial speed difference, but I'm still surprised.
So my question is: why is my implementation faster?
Note: I am compiling this with clang's version 3.5.2 with the libc++ standard library and with the -O3 flag, under Ubuntu 14.04.

The problem is this:
copy( first, last, inserter( _AllWords, begin( _AllWords ) ) );
ends up calling this version of insert:
iterator insert( iterator hint, const value_type& value );
with begin() as the hint. That is, typically, not where you're going to want to insert each value. As a result, you're just making the container do more work trying to figure out where to add your values since your hint is as bad as possible.
But note that there is also this overload of insert:
template< class InputIt >
void insert( InputIt first, InputIt last );
which you should just use†:
template< typename InputIterator >
void Insert( InputIterator first, InputIterator last )
{
_AllWords.insert(first, last);
}
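As a side note, _AllWords is a reserved identifier: names that begin with an underscore followed by an uppercase letter are reserved for the implementation.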
†Although based on this note:
The overloads (5-6) are often implemented as a loop that calls the overload (3) with end() as the hint; they are optimized for appending a sorted sequence (such as another set) whose smallest element is greater than the last element in *this
That seems like a really specific goal to optimize against, which you may or may not satisfy, so probably you shouldn't use this overload, and your initial loop is just fine.
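A minimal sketch of the two variants discussed above, side by side. The Dictionary class, its words_ member and the size() helper are hypothetical names chosen for illustration (they also avoid the reserved leading-underscore spelling); the question's real class is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's dictionary class.
class Dictionary {
public:
    // Variant 1: the hand-written loop from the question.
    template <typename InputIterator>
    void InsertLoop(InputIterator first, InputIterator last) {
        for (auto it = first; it != last; ++it)
            words_.insert(*it);  // unhinted insert, the set finds the position itself
    }

    // Variant 2: the range overload of std::set::insert.
    template <typename InputIterator>
    void Insert(InputIterator first, InputIterator last) {
        words_.insert(first, last);
    }

    std::size_t size() const { return words_.size(); }

private:
    std::set<std::string> words_;  // duplicates are dropped automatically
};
```

Both variants produce the same set; which one is faster depends on the implementation and on the order of the input, so it is worth measuring on your own data.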

Related

c++11: Erase multiple occurrences from vector. Which is best practice?

I understand that erase moves the iterator forward automatically, so when removing multiple occurrences I need to avoid this so I can compare contiguous elements. That's why I usually do:
auto i = vect.begin();
while (i!=vect.end())
if (*i==someValue)
vect.erase(i);
else
++i;
But I was wondering if I could also do it with a for loop, like this:
for (auto i=vec.begin(); i!=vec.end(); ++i)
if (*i==someValue){
vec.erase(i);
--i;
}
The --i part looks a bit weird but it works. Would that be bad practice? Bad code? Be prone to errors? Or it's just right to use either option?
Thanks.
Use the erase-remove idiom:
auto new_end = std::remove(v.begin(), v.end(), some_value);
v.erase(new_end, v.end());
The code above has O(n) complexity. Since C++17 it can also be executed in parallel, provided there are no data races, by using this overload:
template< class ExecutionPolicy, class ForwardIt, class T >
ForwardIt remove( ExecutionPolicy&& policy, ForwardIt first, ForwardIt last, const T& value );
or with the Parallelism TS.
Your code has a problem because, according to vector.modifiers#3:
Effects: Invalidates iterators and references at or after the point of the erase
So the standard says that the iterator you keep using has been invalidated.
However, in practice, most implementations leave the iterator pointing at the same position, which now holds the next element (or is the end if the last element was erased). Even then, your code has O(n^2) complexity, because it loops n times and each erase may shift up to n elements. It also can't be executed in parallel.
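As a hedged illustration of the points above, here is a small sketch with two helpers. The names erase_all and erase_all_loop are made up for this example. The first wraps the erase-remove idiom; the second shows how a hand-written loop can stay valid by using the iterator that erase returns, although it is still quadratic in the worst case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Erase-remove idiom: one O(n) pass, then a single erase of the tail.
// Returns the number of elements removed.
template <typename T>
std::size_t erase_all(std::vector<T>& v, const T& value) {
    const std::size_t old_size = v.size();
    v.erase(std::remove(v.begin(), v.end(), value), v.end());
    return old_size - v.size();
}

// Hand-written loop that avoids using an invalidated iterator:
// vector::erase returns an iterator to the element after the erased one.
// Each erase shifts the tail, so the worst case is O(n^2).
template <typename T>
void erase_all_loop(std::vector<T>& v, const T& value) {
    for (auto it = v.begin(); it != v.end(); ) {
        if (*it == value)
            it = v.erase(it);
        else
            ++it;
    }
}
```

With a C++20 compiler, std::erase(v, value) does the same job as erase_all in a single call.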

Why does std::binary_search return bool?

According to draft N4431, the function std::binary_search in the algorithms library returns a bool, [binary.search]:
template<class ForwardIterator, class T>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value);
template<class ForwardIterator, class T, class Compare>
bool binary_search(ForwardIterator first, ForwardIterator last,
const T& value, Compare comp);
Requires: The elements e of [first,last) are partitioned with respect to the expressions e < value and !(value < e) or comp(e, value) and !comp(value, e). Also, for all elements e of [first,last), e < value implies !(value < e) or comp(e, value) implies !comp(value, e).
Returns: true if there is an iterator i in the range [first,last) that satisfies the corresponding conditions:
!(*i < value) && !(value < *i) or comp(*i, value) == false && comp(value, *i) == false.
Complexity: At most log2(last - first) + O(1) comparisons.
Does anyone know why this is the case?
Most other generic algorithms either return an iterator to the element or an iterator that is equivalent to the iterator denoting the end of the sequence of elements (i.e., one after the last element to be considered in the sequence), which is what I would have expected.
The name of this function in 1994 version of STL was isMember. I think you'd agree that a function with that name should return bool
http://www.stepanovpapers.com/Stepanov-The_Standard_Template_Library-1994.pdf
In C++ this is split into several functions. As for the reasoning, it's nearly impossible to tell why someone designed something one way or another. binary_search tells you whether such an element exists. If you need to know where the matching elements are, use lower_bound and upper_bound, which give the begin and end iterators of the matching range respectively. There's also equal_range, which gives you both the begin and the end at once.
Since others seem to think that it's obvious why it was created that way I'll argue my points why it's hard/impossible to answer if you aren't Alexander Stepanov or someone who worked with him.
Sadly the SGI STL FAQ doesn't mention binary_search at all. It explains reasoning for list<>::size being linear time or pop returning void. It doesn't seem like they deemed binary_search special enough to document it.
Let's look at the possible performance improvement mentioned by @user2899162:
You can find the original implementation of the SGI STL algorithm binary_search here. Looking at it one can pretty much simplify it (we all know how awful the internal names in the standard library are) to:
template <class ForwardIter, class V>
bool binary_search(ForwardIter first, ForwardIter last, const V& value) {
ForwardIter it = lower_bound(first, last, value);
return it != last && !(value < *it);
}
As you can see it was implemented in terms of lower_bound and got the same exact performance. If they really wanted it to take advantage of possible performance improvements they wouldn't have implemented it in terms of the slower one, so it doesn't seem like that was the reason they did it that way.
Now let's look at it simply being a convenience function
It being simply a convenience function seems more likely, but looking through the STL you'll find numerous other algorithms where this could have been possible. Looking at the above implementation you'll see that it's only trivially more to do than a std::find(begin, end, value) != end; yet we have to write that all the time and don't have a convenience function that returns a bool. Why exactly here and not all the other algorithms too? It's not really obvious and can't simply be explained.
In conclusion I find it far from obvious and don't really know if I could confidently and honestly answer it.
The binary search algorithm relies on strict weak ordering. Meaning that the elements are supposed to be partitioned according to the operator < or according to a custom comparator that has the same guarantees. This means that there isn't necessarily only one element that could be found for a given query. Thus you need the lower_bound, upper_bound and equal_range functions to retrieve iterators.
The standard library contains variants of binary search algorithm that return iterators. They are called std::lower_bound and std::upper_bound. I think the rationale behind std::binary_search returning bool is that it wouldn't be clear what iterator to return in case of equivalent elements, while in case of std::lower_bound and std::upper_bound it is clear.
There might have been performance considerations as well, because in theory std::binary_search could be implemented to perform better in case of multiple equivalent elements and certain types. However, at least one popular implementation of the standard library (libstdc++) implements std::binary_search using std::lower_bound and, moreover, they have the same theoretical complexity.
If you want to get an iterator to a value, you can use std::equal_range, which returns two iterators: the lower bound and the upper bound of the range of values equivalent to the one you're looking for.
Since the only requirement is that the values are sorted, not that they are unique, there is no simple "find" that would return an iterator to the one element you're looking for. If there's only one element equal to the value you're looking for, the two iterators will differ by exactly 1.
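To make the lower_bound / equal_range approach concrete, here is a minimal sketch of an iterator-returning lookup. The name binary_find is an assumption for this example (it is not a standard function), and the convention of returning the first of several equivalent elements is a choice made here, not something the standard prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns an iterator to the first element equivalent to value,
// or last if there is none. The range must be sorted (partitioned)
// with respect to operator<, exactly as for std::binary_search.
template <typename ForwardIt, typename T>
ForwardIt binary_find(ForwardIt first, ForwardIt last, const T& value) {
    ForwardIt it = std::lower_bound(first, last, value);
    // lower_bound gives the first element not less than value;
    // it is a match only if value is not less than it either.
    return (it != last && !(value < *it)) ? it : last;
}
```

If you need every equivalent element rather than just the first, std::equal_range returns the whole matching range as a pair of iterators.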
Here's a C++20 binary-search alternative that returns an iterator:
template<typename RandomIt, typename T, typename Pred>
inline
RandomIt xbinary_search( RandomIt begin, RandomIt end, T const &key, Pred pred )
requires std::random_access_iterator<RandomIt>
&&
requires( Pred pred, typename std::iterator_traits<RandomIt>::value_type &elem, T const &key )
{
{ pred( elem, key ) } -> std::convertible_to<std::strong_ordering>;
}
{
using namespace std;
size_t lower = 0, upper = end - begin, mid;
strong_ordering so;
while( lower != upper )
{
mid = (lower + upper) / 2;
so = pred( begin[mid], key );
if( so == 0 )
{
assert(mid == 0 || pred( begin[mid - 1], key ) < 0);
assert(begin + mid + 1 == end || pred( begin[mid + 1], key ) > 0);
return begin + mid;
}
if( so > 0 )
upper = mid;
else
lower = mid + 1;
}
return end;
}
This code only works correctly if at most one element between begin and end matches the key. In a debug build (NDEBUG not defined), the asserts fire and stop in your debugger when a neighbouring element does not compare strictly less or strictly greater than the key.

STL Algorithm that Takes a Test and Mutate Function

What I want is this behavior: void change_if( ForwardIterator first, ForwardIterator last, UnaryPredicate test, UnaryOperation op )
Is the best way to achieve that just with a for loop? Or is there some STL magic I don't yet know?
This can be done without Boost by applying the standard algorithm std::for_each.
I do not advise using Boost for such simple tasks. It is simply silly to add Boost to your project just to perform such a simple task. You may use Boost for such tasks provided that it is already included in your project.
std::for_each( first, last, [&]( const T &x ) { if ( test( x ) ) op( x ); } );
Or you can remove the const qualifier if you are going to change elements of the sequence:
std::for_each( first, last, [&]( T &x ) { if ( test( x ) ) op( x ); } );
Sometimes, when the whole range of a sequence is used, it is simpler to use the range-based for statement instead of an algorithm, because using algorithms with lambda expressions sometimes makes code less readable:
for ( auto &x : sequence )
{
if ( test( x ) ) op( x );
}
Or
for ( auto &x : sequence )
{
if ( test( x ) ) x = op( x );
}
The solution by Vlad from Moscow is the recommended approach for its simplicity.
The "seemingly obvious" use of the std::transform standard algorithm with a lambda:
std::transform(first, last, first, [&](auto elem) {
return test(elem) ? op(elem) : elem;
});
actually leads to performance degradation because all elements will be assigned to, not just those satisfying the predicate. To only modify the predicated elements, one would also need something like boost::filter_iterator as mentioned in the answer by kiwi.
Note that I used C++14 syntax with the auto inside the lambda. For C++11, you would need something like decltype(*first) or iterator_traits<ForwardIterator>::value_type. And in C++98/03 you would need both that and a hand-made function object.
Yet another Boost solution:
http://www.boost.org/doc/libs/1_55_0/libs/iterator/doc/filter_iterator.html
Just call std::transform on your filtered iterator.
std::for_each( first, last, []( T &x ) { if ( test( x ) ) op( x ); } );
or using boost lambda:
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
#include <boost/lambda/if.hpp>
std::for_each( v.begin(), v.end(),
if_( test() )[ op() ]
);
alternatively:
std::vector<int>::iterator it = v.begin();
while ( it != v.end()) {
if ( test( *it)) op(*it);
++it;
}
You want change_if as a simple loop?
template<typename ForwardIterator, typename UnaryPredicate, typename UnaryOperation>
void change_if( ForwardIterator first, ForwardIterator last, UnaryPredicate test, UnaryOperation op ) {
for(; first!=last; ++first)
if (test(*first)) *first=op(std::move(*first));
}
or just write the above loop. I would advise actually writing change_if and calling it, because while the above code is short I would find the change_if call to be more, not less, clear than just dropping the above code in.
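For reference, a self-contained sketch of how such a change_if might be called. It repeats the definition with all three template parameters declared so that the block compiles on its own; halve_evens is a hypothetical helper added only to show a call site.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Applies op to every element for which test returns true and
// assigns the result back; other elements are left untouched.
template <typename ForwardIterator, typename UnaryPredicate, typename UnaryOperation>
void change_if(ForwardIterator first, ForwardIterator last,
               UnaryPredicate test, UnaryOperation op) {
    for (; first != last; ++first)
        if (test(*first))
            *first = op(std::move(*first));
}

// Hypothetical usage: halve every even number in place.
inline void halve_evens(std::vector<int>& v) {
    change_if(v.begin(), v.end(),
              [](int x) { return x % 2 == 0; },
              [](int x) { return x / 2; });
}
```

Unlike the std::transform version, only the elements that satisfy the predicate are assigned to.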
I also like writing container-based overloads:
template<typename Container, typename UnaryPredicate, typename UnaryOperation>
void change_if( Container&& c, UnaryPredicate test, UnaryOperation op ) {
for(auto& v : std::forward<Container>(c))
if (test(v)) v=op(std::move(v));
}
but I also have this:
template<typename Iterator>
struct range {
Iterator b, e;
Iterator begin() const { return b; }
Iterator end() const { return e; }
};
template<typename Iterator0, typename Iterator1>
range<typename std::decay<Iterator0>::type> make_range(Iterator0&& b, Iterator1&& e) {
static_assert(
std::is_convertible< Iterator1, typename std::decay<Iterator0>::type >::value,
"end must be compatible with begin iterator type"
);
return { std::forward<Iterator0>(b), std::forward<Iterator1>(e) };
}
which lets me use such container-based algorithms with iterators.
You'll see I have a Container based change_if? It is really a range-based change_if.
It is called like:
change_if( myVect, [](int x){return (x%2)==0;}, [](int x){return x/2;} );
on a container, not a pair of iterators. However, if you only want to change the first half of a container, it doesn't work: so at first glance, container-based (well, range-based) algorithms are less useful.
But make_range turns iterators into a range. So you can:
change_if( make_range( myVec.begin(), myVec.begin()+myVec.size()/2 ), [](int x){return (x%2)==0;}, [](int x){return x/2;} );
make_range fills in the inability to directly pass 2 iterators to range-based algorithms by bundling two iterators into one range<> object. This corner case is more verbose, but the typical case (of processing an entire container) becomes less verbose.
Plus, a common kind of error (naming a different container for begin and end) is made far less frequent.
All of this ends up being as efficient, or more so, than the iterator-based version. And, if you replace your ranges with iterables (ranges that have dissimilar begin and end iterator types), my change_if just works.

STL: set_union, includes, mismatch, find_if but is there no includes_any?

From the title you'd almost assuredly think use set_union to create a list and then check if it's empty. However, the objects I'm comparing are "expensive" to copy. I've looked at includes but that only works if all the items of one list are found in another. I've also looked at mismatch but rejected it for obvious reasons.
I can and have written my own function which assumes both lists are sorted but I'm wondering if an efficient function already exists in the STL. (Project is forbidden to use third-party libraries including Boost and TR1, don't ask.)
If the sets are unsorted, then you can use find_first_of for an O(N*M) algorithm.
If they are sorted (which would be required for set_intersection anyway), then you can iterate over one set calling equal_range in the other for every element. If every returned range is empty, there is no intersection. Performance is O(N log M).
However, there is no excuse not to have O(N+M) performance, right? Nothing is copied by set_intersection if it's passed a dummy iterator.
struct found {};
template< class T > // satisfy language requirement
struct throwing_iterator : std::iterator< std::output_iterator_tag, T > {
T &operator*() { throw found(); }
throwing_iterator &operator++() { return *this; }
throwing_iterator operator++(int) { return *this; }
};
template< class I, class J >
bool any_intersection( I first1, I last1, J first2, J last2 ) {
try {
throwing_iterator< typename std::iterator_traits<I>::value_type > ti;
set_intersection( first1, last1, first2, last2, ti );
return false;
} catch ( found const& ) {
return true;
}
}
This provides for early exit. You could alternately avoid the exception and just have the iterator remember how many times it was incremented, and no-op the assignment.
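If exceptions are not wanted for control flow, one alternative is to write the merge step by hand. This is a different technique from the dummy output iterator above, sketched here under the assumption that both ranges are sorted by operator< (the same precondition as set_intersection). The name any_intersection_sorted is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Walks both sorted ranges in lockstep, like set_intersection does,
// and stops at the first common element. O(N+M) comparisons, no copies.
template <class I, class J>
bool any_intersection_sorted(I first1, I last1, J first2, J last2) {
    while (first1 != last1 && first2 != last2) {
        if (*first1 < *first2)
            ++first1;            // element of range 1 is too small, advance it
        else if (*first2 < *first1)
            ++first2;            // element of range 2 is too small, advance it
        else
            return true;         // neither is less: equivalent elements found
    }
    return false;                // one range ran out without a match
}
```

Only operator< between the two element types is required, and elements are never copied, which matters when they are expensive to copy.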
Is find_first_of() what you're after?

using STL to find all elements in a vector

I have a collection of elements that I need to operate over, calling member functions on the collection:
std::vector<MyType> v;
... // vector is populated
For calling functions with no arguments it's pretty straight-forward:
std::for_each(v.begin(), v.end(), std::mem_fun_ref(&MyType::myfunc));
A similar thing can be done if there's one argument to the function I wish to call.
My problem is that I want to call a function on elements in the vector if it meets some condition. std::find_if returns an iterator to the first element meeting the conditions of the predicate.
std::vector<MyType>::iterator it =
std::find_if(v.begin(), v.end(), MyPred());
I wish to find all elements meeting the predicate and operate over them.
I've been looking at the STL algorithms for a "find_all" or "do_if" equivalent, or a way I can do this with the existing STL (such that I only need to iterate once), rather than rolling my own or simply do a standard iteration using a for loop and comparisons.
Boost Lambda makes this easy.
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>
#include <boost/lambda/if.hpp>
std::for_each( v.begin(), v.end(),
if_( MyPred() )[ std::mem_fun(&MyType::myfunc) ]
);
You could even do away with defining MyPred(), if it is simple. This is where lambda really shines. E.g., if MyPred meant "is divisible by 2":
std::for_each( v.begin(), v.end(),
if_( _1 % 2 == 0 )[ std::mem_fun( &MyType::myfunc ) ]
);
Update:
Doing this with the C++0x lambda syntax is also very nice (continuing with the predicate as modulo 2):
std::for_each( v.begin(), v.end(),
[](MyType& mt ) mutable
{
if( mt % 2 == 0)
{
mt.myfunc();
}
} );
At first glance this looks like a step backwards from boost::lambda syntax, however, it is better because more complex functor logic is trivial to implement with c++0x syntax... where anything very complicated in boost::lambda gets tricky quickly. Microsoft Visual Studio 2010 beta 2 currently implements this functionality.
I wrote a for_each_if() and a for_each_equal() which do what I think you're looking for.
for_each_if() takes a predicate functor to evaluate equality, and for_each_equal() takes a value of any type and does a direct comparison using operator ==. In both cases, the function you pass in is called on each element that passes the equality test.
/* ---
For each
25.1.1
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first, InputIterator last, const T& value, Function f)
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first, InputIterator last, Predicate pred, Function f)
Requires:
T is of type EqualityComparable (20.1.1)
Effects:
Applies f to each dereferenced iterator i in the range [first, last) where one of the following conditions hold:
1: *i == value
2: pred(*i) != false
Returns:
f
Complexity:
At most last - first applications of f
--- */
template< class InputIterator, class Function, class Predicate >
Function for_each_if(InputIterator first,
InputIterator last,
Predicate pred,
Function f)
{
for( ; first != last; ++first)
{
if( pred(*first) )
f(*first);
}
return f;
}
template< class InputIterator, class Function, class T>
Function for_each_equal(InputIterator first,
InputIterator last,
const T& value,
Function f)
{
for( ; first != last; ++first)
{
if( *first == value )
f(*first);
}
return f;
}
Is it OK to change the vector? If so, you may want to look at the std::partition algorithm.
Another option would be to change your MyType::myfunc to either check the element, or to take a predicate as a parameter and use it to test the element it's operating on.
std::vector<int> v, matches;
std::vector<int>::iterator i = v.begin();
MyPred my_pred;
while(true) {
i = std::find_if(i, v.end(), my_pred);
if (i == v.end())
break;
matches.push_back(*i);
++i; // step past the match, otherwise find_if returns the same element forever
}
For the record, while I have seen an implementation where calling end() on a list was O(n), I haven't seen any STL implementations where calling end() on a vector was anything other than O(1) -- mainly because vectors are guaranteed to have random-access iterators.
Even so, if you are worried about an inefficient end(), you can use this code:
std::vector<int> v, matches;
std::vector<int>::iterator i = v.begin(), end = v.end();
MyPred my_pred;
while(true) {
i = std::find_if(i, end, my_pred); // use the cached end iterator
if (i == end)
break;
matches.push_back(*i);
++i; // step past the match before searching again
}
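When the goal is only to collect the matching elements, the loop above can also be written with std::copy_if (available since C++11). A minimal sketch, where find_all is a hypothetical helper name and the predicate is any callable returning bool:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Copies every element of v that satisfies pred into a new vector,
// visiting each element exactly once.
template <typename T, typename Predicate>
std::vector<T> find_all(const std::vector<T>& v, Predicate pred) {
    std::vector<T> matches;
    std::copy_if(v.begin(), v.end(), std::back_inserter(matches), pred);
    return matches;
}
```

This copies the matches; if the elements should be operated on in place instead, a for_each_if style loop as shown earlier is the better fit.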
For what it's worth, for_each_if is being considered as an eventual addition to Boost. It isn't hard to implement your own.
Lambda functions: the idea is to do something like this
for_each(v.begin(), v.end(), [](MyType& x){ if (Check(x)) DoStuff(x); });
Original post here.
You can use Boost.Foreach:
BOOST_FOREACH (vector<...>& x, v)
{
if (Check(x))
DoStuff(x);
}