In terms of space-time complexity, which of the following is the best way to iterate over a std::vector, and why?
Way 1:
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
Way 2:
for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}
Way 3:
for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}
Way 4:
for(auto const& value: a) {
/* std::cout << value; ... */
}
First of all, Way 2 and Way 3 are identical in practically all standard library implementations.
Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize away the repeated calls to v.end() and v.size(). If that assumption is correct, there is no performance difference between the loops.
If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to
{
auto && __range = range_expression ;
auto __begin = begin_expr ;
auto __end = end_expr ;
for ( ; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.
for (auto value: a) { /* ... */ }
this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.
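To make the copy cost visible, here is a minimal sketch. Tracked and the two helper functions are hypothetical names introduced only for this illustration; the type simply counts how often its copy constructor runs.

```cpp
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Tracked {
    static int copies;
    int payload = 0;
    Tracked() = default;
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    Tracked& operator=(const Tracked&) = default;
};
int Tracked::copies = 0;

// Iterates by value: each element is copied into the loop variable.
inline int copies_when_iterating_by_value(const std::vector<Tracked>& a) {
    Tracked::copies = 0;
    long sum = 0;
    for (auto value : a) { sum += value.payload; }
    (void)sum;
    return Tracked::copies;
}

// Iterates by const reference: no element is copied.
inline int copies_when_iterating_by_reference(const std::vector<Tracked>& a) {
    Tracked::copies = 0;
    long sum = 0;
    for (const auto& value : a) { sum += value.payload; }
    (void)sum;
    return Tracked::copies;
}
```

For a vector of five elements the first function should report five copies and the second none. Whether that matters in practice depends on how expensive the element type is to copy.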
Note that with the parallel algorithm facilities in C++17, you can also try out
#include <algorithm>
#include <execution>
std::for_each(std::par_unseq, a.cbegin(), a.cend(),
[](const auto& e) { /* do stuff... */ });
but whether this is faster than an ordinary loop depends on many circumstantial details.
Prefer iterators over indices/keys.
While for vector or array there should be no difference between either form (see note 1), it is a good habit to get into for other containers.
1 As long as you use [] instead of .at() for accessing by index, of course.
Memorize the end-bound.
Recomputing the end-bound at each iteration is inefficient for two reasons:
In general: a local variable is not aliased, which is more optimizer-friendly.
On containers other than vector: computing the end/size could be a bit more expensive.
You can do so as a one-liner:
for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }
(This is an exception to the general rule of declaring a single variable at a time.)
Use the for-each loop form.
The for-each loop form will automatically:
Use iterators.
Memorize the end-bound.
Thus:
for (/*...*/ value : vec) { ... }
Take built-in types by values, other types by reference.
There is a non-obvious trade-off between taking an element by value and taking an element by reference:
Taking an element by reference avoids a copy, which can be an expensive operation.
Taking an element by value is more optimizer-friendly (see note 1).
At the extremes, the choice should be obvious:
Built-in types (int, std::int64_t, void*, ...) should be taken by value.
Potentially allocating types (std::string, ...) should be taken by reference.
In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.
Thus, the general form is:
for (auto& element : vec) { ... }
And if you are dealing with a built-in:
for (int element : vec) { ... }
1 This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.
Addition to lubgr's answer:
Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.
Indices can be useful if you have steps greater than 1 (for whatever reason you might need them):
for(size_t i = 0; i < v.size(); i += 2) { ... }
While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size, because you increment past the one-past-the-end position! (Generally speaking: if you increment by n, you get UB if the size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...
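The "additional code" needed for the iterator variant could look roughly like this. It is only a sketch: sum_every_nth is a hypothetical helper, and the guard checks how many elements remain before advancing so the iterator never moves past end().

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Sums the elements at indices 0, step, 2*step, ... using iterators.
// The iterator is advanced only while more than `step` elements remain,
// so it never moves beyond v.end(). A step of 0 is treated as "nothing to do".
inline long sum_every_nth(const std::vector<int>& v, std::size_t step) {
    if (step == 0) return 0;
    long sum = 0;
    auto it = v.begin();
    const auto end = v.end();
    while (it != end) {
        sum += *it;
        const auto remaining = static_cast<std::size_t>(std::distance(it, end));
        if (remaining <= step) break;  // advancing would reach or pass end()
        std::advance(it, static_cast<std::ptrdiff_t>(step));
    }
    return sum;
}
```

The index-based loop gets the same safety for free from the i < v.size() condition, which is the point made above.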
The lazy answer: The complexities are equivalent.
The time complexity of all solutions is Θ(n).
The space complexity of all solutions is Θ(1).
The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.
It may help to store v.size() or v.end(), respectively, although these are usually inlined, so such optimizations may not be needed, or may be performed automatically.
Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.
Prefer method 4, std::for_each (if you really must), or method 5/6:
void method5(std::vector<float>& v) {
for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
*it *= *it;
}
}
void method6(std::vector<float>& v) {
auto ptr = v.data();
for(std::size_t i = 0, n = v.size(); i != n; i++) {
ptr[i] *= ptr[i];
}
}
The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate operator[], end() and size() in each iteration. This will prevent all SIMD optimisations.
You can see proof here:
https://godbolt.org/z/BchhmU
You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, whereas 1/2/3 only ever use the non-SIMD vmulss instruction.
Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.
For completeness, I wanted to mention that your loop might want to change the size of the vector.
std::vector<int> v = get_some_data();
for (std::size_t i=0; i<v.size(); ++i)
{
int x = some_function(v[i]);
if(x) v.push_back(x);
}
In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.
If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.
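A self-contained variant of the index loop above, as a sketch only: halve_and_append is a hypothetical stand-in for get_some_data() and some_function(), chosen so that the loop is guaranteed to terminate.

```cpp
#include <cstddef>
#include <vector>

// Appends half of each element (integer division) while iterating.
// The loop terminates because the appended values shrink towards zero.
inline std::vector<int> halve_and_append(std::vector<int> v) {
    // v.size() is re-read on every iteration because the body grows the vector.
    // Indices stay valid across push_back; iterators might not.
    for (std::size_t i = 0; i < v.size(); ++i) {
        const int x = v[i] / 2;
        if (x) v.push_back(x);
    }
    return v;
}
```

Starting from a vector containing only 4, the loop appends 2 and then 1 before it stops.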
By the way, I prefer to use while-loops for such cases over for-loops but that's another story.
It depends to a large extent on what you mean by "effective".
Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.
From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.
Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?
for (auto it = v1.cbegin(); it != v1.end(); ++it)
{
std::cout << v1[i];
}
for (auto i = 0u; i < v2.size(); ++i)
{
std::cout << v1[i];
}
for (auto const i: v3)
{
std::cout << i;
}
Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!
¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...
All of the ways you listed have identical time complexity and identical space complexity (no surprise there).
Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).
We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p
On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.
The complexity is the same for all of them; the last one is in theory faster because the end of the container is evaluated only once.
The last one is also the nicest to read and to write, but has the drawback that it doesn't give you the index (which is quite often important).
You are however ignoring what I think is a good alternative (it's my preferred one when I need the index and cannot use for (auto& x : v) {...}):
for (int i=0,n=v.size(); i<n; i++) {
... use v[i] ...
}
Note that I used int rather than size_t, and that the end is computed only once and is also available in the body as a local variable.
Often when the index and the size are needed then math computations are also performed on them and size_t behaves "strangely" when used for math (for example a+1 < b and a < b-1 are different things).
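The a+1 < b versus a < b-1 remark can be checked with a small sketch. The four helper names are made up for this illustration; the interesting case is b == 0, where b - 1 wraps around for an unsigned type.

```cpp
#include <cstddef>

// Unsigned arithmetic: when b == 0, b - 1 wraps to the largest size_t value,
// so these two comparisons can give different answers.
inline bool less_with_add(std::size_t a, std::size_t b) { return a + 1 < b; }
inline bool less_with_sub(std::size_t a, std::size_t b) { return a < b - 1; }

// Signed arithmetic: for small values both forms agree.
inline bool signed_less_with_add(int a, int b) { return a + 1 < b; }
inline bool signed_less_with_sub(int a, int b) { return a < b - 1; }
```

With a == 0 and b == 0, less_with_add is false while less_with_sub is true; the signed versions are both false.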
When testing my code I noticed a significant difference in execution time depending on whether an empty range-based for loop was present. Normally I would expect the compiler to notice that the loop serves no purpose and remove it. I am compiling with -O3 (gcc 5.4). I also tested it with a vector instead of a set, and there the execution time is the same in both cases. It seems that incrementing the iterator costs all the extra time.
First case with the ranged for loop still present (slow):
#include <iostream>
#include <set>
int main () {
long result = 0;
std::set<double> results;
for (int i = 2; i <= 10000; ++i) {
results.insert(i);
for (auto element : results) {
// no operation
}
}
std::cout << "Result: " << result << "\n";
}
Second case with the ranged for loop deleted (fast):
#include <iostream>
#include <set>
int main () {
long result = 0;
std::set<double> results;
for (int i = 2; i <= 10000; ++i) {
results.insert(i);
}
std::cout << "Result: " << result << "\n";
}
Internally, the std::set iterator uses some kind of pointer chain. This seems to be the issue.
Here is a minimal setup similar to your issue:
struct S
{
S* next;
};
void f (S* s) {
while (s)
s = s->next;
}
It's not a problem with complex collection implementations or overhead of iterators but simply this pointer chain pattern that the optimizer can't optimize.
I don't know the precise reason why optimizers fail on this pattern though.
Also, note that this variant is optimized away:
void f (S* s) {
// Copy paste as many times as you wish the following two lines
if(s)
s = s->next;
}
Edit
As suggested by @hvd, this might have to do with the compiler not being able to prove the loop is not infinite. And if we write the OP's loop like so:
void f(std::set<double>& s)
{
auto it = s.begin();
for (size_t i = 0; i < s.size() && it != s.end(); ++i, ++it)
{
// Do nothing
}
}
The compiler optimizes everything away.
The range based for loop is not as trivial as it looks. It is translated to an iterator-based loop internally in the compiler and if the iterator is complex enough the compiler may not even be allowed by the standard to remove these iterator operations.
You could play around with clang's optimization report. Compile your code with -fsave-optimization-record, so that the optimization report is dumped to main.opt.yaml.
clang++ -std=c++11 main.cpp -O2 -fsave-optimization-record
You will see that there are several problems with the loop:
Clang thinks that a value is modified in this loop.
- String: value that could not be identified as reduction is used outside the loop
Also, the compiler can't compute the number of loop iterations.
- String: could not determine number of loop iterations
Note that the compiler successfully inlined begin, end, operator++ and operator=.
Range-for is "syntactic sugar", meaning what it does is simply provide short-hand notation for something that can be expressed in more verbose manner.
For example, range-for transforms into something like this.
for (Type obj : container) ->
auto endpos = container.end();
for ( auto iter=container.begin(); iter != endpos; ++iter)
{
Type obj(*iter);
// your code here
}
Now the problem is that begin/end/*iter/++iter/(obj = ) are function-calls.
In order to optimize them out, the compiler needs to know that they have no side-effects, (changes to global state).
Whether the compiler can do this or not is implementation defined, and will depend on the container type.
What I can say, though, is that in most cases you do not need the (obj =) function, so prefer
for (const auto& X: cont)
or ...
for (auto& X: cont)
to ...
for (auto X : cont)
You might find that simplifies it enough for optimizations to kick in.
Let's consider a template function written in C++11 which iterates over a container.
Please exclude from consideration the range loop syntax because it is not yet supported by the compiler I'm working with.
template <typename Container>
void DoSomething(const Container& i_container)
{
// Option #1
for (auto it = std::begin(i_container); it != std::end(i_container); ++it)
{
// do something with *it
}
// Option #2
std::for_each(std::begin(i_container), std::end(i_container),
[] (typename Container::const_reference element)
{
// do something with element
});
}
What are pros/cons of for loop vs std::for_each in terms of:
a) performance? (I don't expect any difference)
b) readability and maintainability?
Here I see many disadvantages of for_each. It wouldn't accept a c-style array while the loop would. The declaration of the lambda formal parameter is so verbose, not possible to use auto there. It is not possible to break out of for_each.
In pre-C++11 days, arguments against for were the need to specify the type of the iterator (which doesn't hold any more) and the ease of mistyping the loop condition (I've never made such a mistake in 10 years).
As a conclusion, my thoughts about for_each contradict the common opinion. What am I missing here?
I think there are some other differences not yet covered by the answers so far.
a for_each can accept any appropriate callable object, allowing one to 'recycle' the loop body for different for loops. For example (pseudo code)
for( range_1 ) { lengthy_loop_body } // many lines of code
for( range_2 ) { lengthy_loop_body } // the same many lines of code again
becomes
auto loop_body = some_lambda; // many lines of code here only
std::for_each( range_1 , loop_body ); // a single line of code
std::for_each( range_2 , loop_body ); // another single line of code
thus avoiding duplication and simplifying code maintenance. (Of course, in a funny mix of styles one could also use a similar approach with the for loop.)
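A compilable version of the "recycled loop body" idea might look like the following sketch. sum_of_squares is a hypothetical example function; the shared body is a lambda that captures its accumulator by reference, so the copies that std::for_each makes of the lambda all update the same variable.

```cpp
#include <algorithm>
#include <vector>

// Applies one shared loop body to two different ranges and
// returns the sum of the squares of all elements of both vectors.
inline long sum_of_squares(const std::vector<int>& first,
                           const std::vector<int>& second) {
    long total = 0;
    // The loop body is written once...
    auto loop_body = [&total](int x) { total += static_cast<long>(x) * x; };
    // ...and reused for each range.
    std::for_each(first.begin(), first.end(), loop_body);
    std::for_each(second.begin(), second.end(), loop_body);
    return total;
}
```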
another difference regards breaking out of the loop (with break or return in the for loop). As far as I know, in a for_each loop this can only be done by throwing an exception. For example
for( range )
{
some code;
if(condition_1) return x; // or break
more code;
if(condition_2) continue;
yet more code;
}
becomes
try {
std::for_each( range , [] (const_reference x)
{
some code;
if(condition_1) throw x;
more code;
if(condition_2) return;
yet more code;
} );
} catch(const_reference r) { return r; }
with the same effects regarding calling of destructors for objects with scope of the loop body and the function body (around the loop).
the main benefit of for_each is, IMHO, that one can overload it for certain container types, when plain iteration is not as efficient. For example, consider a container that holds a linked list of data blocks, each block containing a contiguous array of elements, similar to (omitting irrelevant code)
namespace my {
template<typename data_type, unsigned block_size>
struct Container
{
struct block
{
const block*NEXT;
data_type DATA[block_size];
block() : NEXT(0) {}
} *HEAD;
};
}
then an appropriate forward iterator for this type would need to check for the end of the block at each increment, and the comparison operator needs to compare both the block pointer and the index within each block (omitting irrelevant code):
namespace my {
template<typename data_type, unsigned block_size>
struct Container
{
struct iterator
{
const block*B;
unsigned I;
iterator() = default;
iterator&operator=(iterator const&) = default;
iterator(const block*b, unsigned i) : B(b), I(i) {}
iterator& operator++()
{
if(++I==block_size) { B=B->NEXT; I=0; } // one comparison and branch
return*this;
}
bool operator==(const iterator&i) const
{ return B==i.B && I==i.I; } // one or two comparisons
bool operator!=(const iterator&i) const
{ return B!=i.B || I!=i.I; } // one or two comparisons
const data_type& operator*() const
{ return B->DATA[I]; }
};
iterator begin() const
{ return iterator(HEAD,0); }
iterator end() const
{ return iterator(0,0); }
};
}
this type of iterator works correctly with for and for_each, for example
my::Container<int,5> C;
for(auto i=C.begin();
i!=C.end(); // one or two comparisons here
++i) // one comparison here and a branch
f(*i);
but requires two to three comparisons per iteration as well as a branch. A more efficient way is to overload the for_each() function to loop on the block pointer and index separately:
namespace my {
template<typename data_type, unsigned block_size, typename FuncOfDataType>
FuncOfDataType
for_each(typename my::Container<data_type,block_size>::iterator i,
typename my::Container<data_type,block_size>::iterator const&e,
FuncOfDataType f)
{
for(; i.B != e.B; i.B=i.B->NEXT,i.I=0) // follow the linked list of blocks
for(; i.I != block_size; i.I++)
f(*i);
for(; i.I != e.I; i.I++)
f(*i);
return f; // returned by value, as std::for_each does
}
}
using my::for_each; // ensures that the appropriate
using std::for_each; // version of for_each() is used
which requires only one comparison for most iterations and avoids the end-of-block branch in every increment (note that branches can have a nasty impact on performance). Note that we don't need to define this in namespace std (adding overloads there is not allowed), but can ensure that the correct version is used by appropriate using declarations. This is equivalent to writing using std::swap; when providing swap() for certain user-defined types.
Regarding performance, your for loop calls std::end repeatedly, while std::for_each will not. This might or might not result in a performance difference depending on the container used.
The std::for_each version will visit each element exactly once. Somebody reading the code can know that as soon as they see std::for_each, as there's nothing that can be done in the lambda to mess with the iterator. In the traditional for loop, you have to study the body of the loop for unusual control flow (continue, break, return) and dinking with the iterator (e.g., in this case, skip the next element with ++it).
You can trivially change the algorithm in the lambda solution. For example, you could make an algorithm that visits every nth element. In many cases, you didn't really want a for loop anyway, but a different algorithm like copy_if. Using an algorithm+lambda, is often more amenable to change and is a bit more concise.
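As a small illustration of reaching for a more specific algorithm than a hand-written loop, here is a sketch using std::copy_if. The function keep_even and its predicate are invented for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copies only the even values of `input` into a new vector.
// The algorithm name states the intent; the lambda holds the only custom logic.
inline std::vector<int> keep_even(const std::vector<int>& input) {
    std::vector<int> result;
    std::copy_if(input.begin(), input.end(), std::back_inserter(result),
                 [](int x) { return x % 2 == 0; });
    return result;
}
```

Changing the selection rule means changing only the lambda; changing what is done with the range means swapping the algorithm name.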
On the flip side, programmers are much more used to traditional for loops, so they may find algorithm+lambda to be harder to read.
First, I cannot see much difference between these two, because for_each is implemented using a for loop. But note that for_each is a function which has a return value.
Second, I will use range loop syntax once available in this case since this day would come soon anyway.
Indeed; in the case of using a Lambda expression, you have to declare the parameter type and name, so nothing is won.
But it will be awesome as soon as you want to call one (named) function or function-object with this. (Remember that you can combine function-like things via std::bind.)
The books by Scott Meyers (I believe it was Effective STL) describe such programming styles very well and clearly.
I know that there are similar questions to this one, but I didn't manage to fix my code with their help. I merely want to remove an element of a vector by checking an attribute of this element inside a loop. How can I do that? I tried the following code, but I receive this vague error message:
'operator =' function is unavailable in 'Player'.
for (vector<Player>::iterator it = allPlayers.begin(); it != allPlayers.end(); it++)
{
if(it->getpMoney()<=0)
it = allPlayers.erase(it);
else
++it;
}
What should I do?
Update: Do you think that the question vector::erase with pointer member pertains to the same problem? Do I need hence an assignment operator? Why?
You should not increment it in the for loop:
for (vector<Player>::iterator it=allPlayers.begin();
it!=allPlayers.end();
/* it++ */) // the increment is commented out here
{
if(it->getpMoney()<=0)
it = allPlayers.erase(it);
else
++it;
}
Notice the commented-out part: it++ is not needed there, because it is incremented in the loop body itself.
As for the error "'operator =' function is unavailable in 'Player'", it comes from the usage of erase(), which internally uses operator= to move elements in the vector. In order to use erase(), objects of class Player must be assignable, which means you need to implement operator= for the Player class.
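One common way to end up with an unassignable class is a const (or reference) data member, which suppresses the implicitly generated assignment operators. The sketch below assumes that cause for illustration only; the question does not show the Player class, and FixedIdPlayer and AssignablePlayer are hypothetical stand-ins.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical type with a const member: the implicit assignment operators are deleted.
struct FixedIdPlayer {
    const int id;
    int money;
};

// Hypothetical type without const members: the implicit assignment operators work.
struct AssignablePlayer {
    int id;
    int money;
};

static_assert(!std::is_move_assignable<FixedIdPlayer>::value,
              "a const member makes the type unassignable");
static_assert(std::is_move_assignable<AssignablePlayer>::value,
              "plain members keep the type assignable");

// Erases the first element (the vector must not be empty) and returns the new size.
// erase() shifts the remaining elements down using assignment.
inline std::size_t remove_first(std::vector<AssignablePlayer>& players) {
    players.erase(players.begin());
    return players.size();
}
```

Calling erase() on a std::vector<FixedIdPlayer> would fail to compile for the same reason the question's code does.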
Anyway, you should avoid raw loops (see note 1) as much as possible and prefer algorithms instead. In this case, the popular Erase-Remove Idiom can simplify what you're doing.
allPlayers.erase(
std::remove_if(
allPlayers.begin(),
allPlayers.end(),
[](Player const & p) { return p.getpMoney() <= 0; }
),
allPlayers.end()
);
1. It's one of the best talks by Sean Parent that I've ever watched.
if(allPlayers.empty() == false) {
for(int i = allPlayers.size() - 1; i >= 0; i--) {
if(allPlayers.at(i).getpMoney() <= 0) {
allPlayers.erase( allPlayers.begin() + i );
}
}
}
This is my way to remove elements in vector.
It's easy to understand and doesn't need any tricks.
Forget the loop and use the std or Boost range algorithms.
Using Boost.Range and Lambda it would look like this:
boost::remove_erase_if( allPlayers, bind(&Player::getpMoney, _1)<=0 );
Your specific problem is that your Player class does not have an assignment operator. You must make Player either copy-assignable or move-assignable in order to remove it from a vector. This is because a vector must be contiguous and therefore has to shift elements to fill the gaps created when you remove elements.
Also:
Use std algorithm
allPlayers.erase(std::remove_if(allPlayers.begin(), allPlayers.end(), [](const Player& player)
{
return player.getpMoney() <= 0;
}), allPlayers.end());
or even simpler if you have boost:
boost::remove_erase_if(allPlayers, [](const Player& player)
{
return player.getpMoney() <= 0;
});
See TimW's answer if you don't have support for C++11 lambdas.
Or do the loop backwards.
for (vector<Player>::iterator it = allPlayers.end(); it != allPlayers.begin(); )
{
--it; // step back first, so begin() - 1 is never formed
if(it->getpMoney()<=0)
it = allPlayers.erase(it);
}
C++11 has introduced a new collection of functions that will be of use here.
allPlayers.erase(
std::remove_if(allPlayers.begin(), allPlayers.end(),
[](const Player& x) { return x.getpMoney() <= 0; } ),
allPlayers.end());
And then you get the advantage of not having to do quite so much shifting of end elements.
Late answer, but having seen inefficient variants:
std::remove or std::remove_if is the way to go.
If for any reason those are not available or cannot be used for whatever other reason, do what these hide away from you.
Code for removing elements efficiently:
auto pos = container.begin();
for(auto i = container.begin(); i != container.end(); ++i)
{
if(isKeepElement(*i)) // whatever condition...
{
if(i != pos)
{
*pos = std::move(*i); // moves if move assignment is available, otherwise copies
}
++pos;
}
}
// well, std::remove(_if) stops here...
container.erase(pos, container.end());
You might need to write such a loop explicitly, e.g. if you need the iterator itself to determine whether the element is to be removed (the condition parameter needs to accept a reference to the element, remember?), e.g. due to a specific relationship to its successor/predecessor (if this relationship is equality, though, there is std::unique).
Are there any advantages of std::for_each over for loop? To me, std::for_each only seems to hinder the readability of code. Why do then some coding standards recommend its use?
The nice thing with C++11 (previously called C++0x), is that this tiresome debate will be settled.
I mean, no one in their right mind, who wants to iterate over a whole collection, will still use this
for(auto it = collection.begin(); it != collection.end() ; ++it)
{
foo(*it);
}
Or this
for_each(collection.begin(), collection.end(), [](Element& e)
{
foo(e);
});
when the range-based for loop syntax is available:
for(Element& e : collection)
{
foo(e);
}
This kind of syntax has been available in Java and C# for some time now, and actually there are way more foreach loops than classical for loops in every recent Java or C# code I saw.
Here are some reasons:
It seems to hinder readability just because you're not used to it and/or not using the right tools around it to make it really easy. (see boost::range and boost::bind/boost::lambda for helpers. Many of these will go into C++0x and make for_each and related functions more useful.)
It allows you to write an algorithm on top of for_each that works with any iterator.
It reduces the chance of stupid typing bugs.
It also opens your mind to the rest of the STL-algorithms, like find_if, sort, replace, etc and these won't look so strange anymore. This can be a huge win.
Update 1:
Most importantly, it helps you go beyond for_each vs. for-loops like that's all there is, and look at the other STL algorithms, like find / sort / partition / replace_copy_if, parallel execution ... or whatever.
A lot of processing can be written very concisely using "the rest" of for_each's siblings, but if all you do is to write a for-loop with various internal logic, then you'll never learn how to use those, and you'll end up inventing the wheel over and over.
And (the soon-to-be available range-style for_each) + lambdas:
for_each(monsters, [](auto& m) { m.think(); });
is IMO more readable than:
for (auto i = monsters.begin(); i != monsters.end(); ++i) {
i->think();
}
Also this:
for_each(bananas, [&](auto& b) { my_monkey.eat(b); });
Is more concise than:
for (auto i = bananas.begin(); i != bananas.end(); ++i) {
my_monkey.eat(*i);
}
But new range based for is probably the best:
for (auto& b : bananas)
my_monkey.eat(b);
But the for_each could be useful, especially if you have several functions to call in order but need to run each method for all objects before next... but maybe that's just me. ;)
Update 2: I've written my own one-liner wrappers of stl-algos that work with ranges instead of pair of iterators. boost::range_ex, once released, will include that and maybe it will be there in C++0x too?
for_each is more generic. You can use it to iterate over any type of container (by passing in the begin/end iterators). You can potentially swap out containers underneath a function which uses for_each without having to update the iteration code. You need to consider that there are other containers in the world than std::vector and plain old C arrays to see the advantages of for_each.
The major drawback of for_each is that it takes a functor, so the syntax is clunky. This is fixed in C++11 (formerly C++0x) with the introduction of lambdas:
std::vector<int> container;
...
std::for_each(container.begin(), container.end(), [](int& i){
i+= 10;
});
This will not look weird to you in 3 years.
Personally, any time I'd need to go out of my way to use std::for_each (write special-purpose functors / complicated boost::lambdas), I find BOOST_FOREACH and C++0x's range-based for clearer:
BOOST_FOREACH(Monster* m, monsters) {
if (m->has_plan())
m->act();
}
vs
std::for_each(monsters.begin(), monsters.end(),
if_then(bind(&Monster::has_plan, _1),
bind(&Monster::act, _1)));
It's very subjective; some will say that using for_each makes the code more readable, as it allows treating different collections with the same conventions.
for_each itself is implemented as a loop
template<class InputIterator, class Function>
Function for_each(InputIterator first, InputIterator last, Function f)
{
for ( ; first!=last; ++first ) f(*first);
return f;
}
So it's up to you to choose what is right for you.
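One consequence of the return f; in the implementation above is that a stateful function object can carry a result out of the loop. A minimal sketch, with Accumulator and sum_with_for_each as hypothetical names:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stateful function object that keeps a running total.
struct Accumulator {
    long total = 0;
    void operator()(int x) { total += x; }
};

// for_each works on its own copy of the functor and returns that copy,
// so the accumulated state has to be read from the return value.
inline long sum_with_for_each(const std::vector<int>& values) {
    const Accumulator result =
        std::for_each(values.begin(), values.end(), Accumulator{});
    return result.total;
}
```

For a plain sum std::accumulate is the more direct choice; the sketch only shows what the return value is for.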
You're mostly correct: most of the time, std::for_each is a net loss. I'd go so far as to compare for_each to goto. goto provides the most versatile flow-control possible -- you can use it to implement virtually any other control structure you can imagine. That very versatility, however, means that seeing a goto in isolation tells you virtually nothing about what it's intended to do in this situation. As a result, almost nobody in their right mind uses goto except as a last resort.
Among the standard algorithms, for_each is much the same way -- it can be used to implement virtually anything, which means that seeing for_each tells you virtually nothing about what it's being used for in this situation. Unfortunately, people's attitude toward for_each is about where their attitude toward goto was in (say) 1970 or so -- a few people had caught onto the fact that it should be used only as a last resort, but many still consider it the primary algorithm, and rarely if ever use any other. The vast majority of the time, even a quick glance would reveal that one of the alternatives was drastically superior.
Just for example, I'm pretty sure I've lost track of how many times I've seen people writing code to print out the contents of a collection using for_each. Based on posts I've seen, this may well be the single most common use of for_each. They end up with something like:
class XXX {
// ...
public:
std::ostream &print(std::ostream &os) const { return os << "my data\n"; }
};
And their post is asking about what combination of bind1st, mem_fun, etc. they need to make something like:
std::vector<XXX> coll;
std::for_each(coll.begin(), coll.end(), XXX::print);
work, and print out the elements of coll. If it really did work exactly as I've written it there, it would be mediocre, but it doesn't -- and by the time you've gotten it to work, it's difficult to find those few bits of code related to what's going on among the pieces that hold it together.
Fortunately, there is a much better way. Add a normal stream inserter overload for XXX:
std::ostream &operator<<(std::ostream &os, XXX const &x) {
return x.print(os);
}
and use std::copy:
std::copy(coll.begin(), coll.end(), std::ostream_iterator<XXX>(std::cout, "\n"));
That does work -- and takes virtually no work at all to figure out that it prints the contents of coll to std::cout.
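Put together as one compilable unit, the approach might look like the sketch below. Item and render are hypothetical names, and the output goes to a std::ostringstream instead of std::cout so the result can be inspected; print() is declared const so it can be called through the const reference that the inserter receives.

```cpp
#include <algorithm>
#include <iterator>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical element type with a const print() member.
class Item {
public:
    explicit Item(int id) : id_(id) {}
    std::ostream& print(std::ostream& os) const { return os << "item " << id_; }
private:
    int id_;
};

// Normal stream inserter that forwards to print().
inline std::ostream& operator<<(std::ostream& os, const Item& item) {
    return item.print(os);
}

// Writes every element followed by a newline, using std::copy
// with an ostream_iterator instead of for_each.
inline std::string render(const std::vector<Item>& coll) {
    std::ostringstream out;
    std::copy(coll.begin(), coll.end(), std::ostream_iterator<Item>(out, "\n"));
    return out.str();
}
```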
Like many of the algorithm functions, an initial reaction is to think it's more unreadable to use foreach than a loop. It's been a topic of many flame wars.
Once you get used to the idiom you may find it useful. One obvious advantage is that it forces the coder to separate the inner contents of the loop from the actual iteration functionality. (OK, I think it's an advantage. Others say you're just chopping up the code with no real benefit).
One other advantage is that when I see foreach, I know that either every item will be processed or an exception will be thrown.
A for loop allows several options for terminating the loop. You can let the loop run its full course, or you can use the break keyword to explicitly jump out of the loop, or use the return keyword to exit the entire function mid-loop. In contrast, foreach does not allow these options, and this makes it more readable. You can just glance at the function name and you know the full nature of the iteration.
Here's an example of a confusing for loop:
for(std::vector<widget>::iterator i = v.begin(); i != v.end(); ++i)
{
/////////////////////////////////////////////////////////////////////
// Imagine a page of code here by programmers who don't refactor
///////////////////////////////////////////////////////////////////////
if(i->Cost < calculatedAmountSofar)
{
break;
}
////////////////////////////////////////////////////////////////////////
// And then some more code added by a stressed out junior developer
// *#&$*)#$&#(#)$#(*$&#(&*^$#(*$#)($*#(&$^#($*&#)$(#&*$&#*$#*)$(#*
/////////////////////////////////////////////////////////////////////////
for(std::vector<widgetPart>::iterator ip = i->GetParts().begin(); ip != i->GetParts().end(); ++ip)
{
if(ip->IsBroken())
{
return false;
}
}
}
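One way a loop like that might be untangled with named algorithms is sketched below. Widget, WidgetPart and both function names are hypothetical, and the sketch assumes the cost threshold stays fixed for the whole iteration, which the original loop does not guarantee.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types standing in for widget and widgetPart.
struct WidgetPart {
    bool broken;
    bool IsBroken() const { return broken; }
};

struct Widget {
    double Cost;
    std::vector<WidgetPart> parts;
    std::vector<WidgetPart> const &GetParts() const { return parts; }
};

// True if any part of w reports itself as broken.
bool HasBrokenPart(Widget const &w) {
    std::vector<WidgetPart> const &parts = w.GetParts();
    return std::any_of(parts.begin(), parts.end(),
                       [](WidgetPart const &p) { return p.IsBroken(); });
}

// Checks widgets in order, stopping at the first one cheaper than limit.
// Returns false as soon as a checked widget has a broken part.
bool AllCheckedWidgetsIntact(std::vector<Widget> const &v, double limit) {
    // The early "break" becomes an explicit end-of-range computation.
    auto stop = std::find_if(v.begin(), v.end(),
                             [limit](Widget const &w) { return w.Cost < limit; });
    // The nested loop with "return false" becomes a named predicate.
    return std::none_of(v.begin(), stop, HasBrokenPart);
}
```

Each exit condition now has a name, so a reader can see where iteration stops without scanning the body for break or return.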
The readability advantage of a functional style might not show up when you only compare for(...) with for_each(...).
If you use the full set of algorithms in <algorithm> instead of hand-written for loops, the code gets a lot more readable:
iterator longest_tree = std::max_element(forest.begin(), forest.end(), ...);
iterator first_leaf_tree = std::find_if(forest.begin(), forest.end(), ...);
std::transform(forest.begin(), forest.end(), firewood.begin(), ...);
std::for_each(forest.begin(), forest.end(), make_plywood);
is much more readable than:
Forest::iterator longest_tree = forest.begin();
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
if (*it > *longest_tree) {
longest_tree = it;
}
}
Forest::iterator first_leaf_tree = forest.end();
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
if (it->type() == LEAF_TREE) {
first_leaf_tree = it;
break;
}
}
// Firewood is assumed to be the container type of firewood
Firewood::iterator jt = firewood.begin();
for (Forest::const_iterator it = forest.begin(); it != forest.end(); ++it, ++jt) {
*jt = boost::transformtowood(*it);
}
for (Forest::iterator it = forest.begin(); it != forest.end(); ++it) {
make_plywood(*it);
}
And that is what I think is so nice: the for loops are generalized into one-line function calls =)
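To make the one-liners concrete, here is a small sketch with the elided arguments filled in. Tree, TreeType, the three helper functions and the firewood conversion are all made up for illustration; the answer never defines them.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical types: the answer never defines Forest or its elements.
enum TreeType { LEAF_TREE, NEEDLE_TREE };

struct Tree {
    int height;
    TreeType kind;
    TreeType type() const { return kind; }
};

typedef std::vector<Tree> Forest;

bool shorter(Tree const &a, Tree const &b) { return a.height < b.height; }
bool is_leaf_tree(Tree const &t) { return t.type() == LEAF_TREE; }
int to_firewood(Tree const &t) { return t.height * 2; }  // made-up conversion

// Height of the tallest tree, or -1 for an empty forest.
int tallest_height(Forest const &forest) {
    Forest::const_iterator it =
        std::max_element(forest.begin(), forest.end(), shorter);
    return it == forest.end() ? -1 : it->height;
}

// Index of the first leaf tree, or forest.size() if there is none.
std::size_t first_leaf_index(Forest const &forest) {
    return static_cast<std::size_t>(
        std::find_if(forest.begin(), forest.end(), is_leaf_tree) - forest.begin());
}

// One firewood value per tree; the output is sized before transform writes to it.
std::vector<int> make_firewood(Forest const &forest) {
    std::vector<int> firewood(forest.size());
    std::transform(forest.begin(), forest.end(), firewood.begin(), to_firewood);
    return firewood;
}
```

Each function body is a single algorithm call whose name already says what the loop would have done.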
Easy: for_each is useful when you already have a function to handle every array item, so you don't have to write a lambda. Certainly, this
for_each(a.begin(), a.end(), a_item_handler);
is better than
for(auto& item: a) {
a_item_handler(item);
}
Also, a range-based for loop only iterates over a whole container from start to end, whilst for_each is more flexible.
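As a rough illustration of that flexibility, the sketch below applies for_each to just a slice of a vector. The function name double_slice and the clamping behaviour are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Doubles the elements in the half-open index range [first, last) of v.
// first and last are clamped so the call stays within bounds.
void double_slice(std::vector<int> &v, std::size_t first, std::size_t last) {
    last = std::min(last, v.size());
    first = std::min(first, last);
    std::for_each(v.begin() + static_cast<std::ptrdiff_t>(first),
                  v.begin() + static_cast<std::ptrdiff_t>(last),
                  [](int &x) { x *= 2; });
}
```

A range-based for loop would need some wrapper object to express the same sub-range.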
The for_each loop is meant to hide the iterators (detail of how a loop is implemented) from the user code and define clear semantics on the operation: each element will be iterated exactly once.
The problem with readability in the current standard is that it requires a functor as the last argument instead of a block of code, so in many cases you must write a specific functor type for it. That makes the code less readable, because functor objects cannot be defined in place (local classes defined within a function cannot be used as template arguments) and the implementation of the loop body must be moved away from the actual loop.
struct myfunctor {
void operator()( int arg1 ) { /* code */ }
};
void apply( std::vector<int> const & v ) {
// code
std::for_each( v.begin(), v.end(), myfunctor() );
// more code
}
Note that if you want to perform a specific operation on each object, you can use std::mem_fn, or boost::bind (std::bind in the next standard), or boost::lambda (lambdas in the next standard) to make it simpler:
void function( int value );
void apply( std::vector<int> const & v ) {
// code
std::for_each( v.begin(), v.end(), boost::bind( function, _1 ) );
// code
}
This is no less readable, and more compact, than the hand-rolled version if you already have a function or method to call. An implementation could also provide other implementations of the for_each loop (think parallel processing).
The upcoming standard takes care of some of the shortcomings in different ways. It will allow locally defined classes as arguments to templates:
void apply( std::vector<int> const & v ) {
// code
struct myfunctor {
void operator()( int ) { /* code */ }
};
std::for_each( v.begin(), v.end(), myfunctor() );
// code
}
Improving the locality of code: when you browse you see what it is doing right there. As a matter of fact, you don't even need to use the class syntax to define the functor, but use a lambda right there:
void apply( std::vector<int> const & v ) {
// code
std::for_each( v.begin(), v.end(),
[]( int ) { /* code */ } );
// code
}
Even so, for the for_each case there will be a specific construct that makes it more natural:
void apply( std::vector<int> const & v ) {
// code
for ( int i : v ) {
// code
}
// code
}
I tend to mix the for_each construct with hand-rolled loops. When only a call to an existing function or method is what I need (for_each( v.begin(), v.end(), boost::bind( &Type::update, _1 ) )) I go for the for_each construct, which takes a lot of boilerplate iterator stuff out of the code. When I need something more complex and I cannot implement a functor just a couple of lines above the actual use, I roll my own loop (which keeps the operation in place). In non-critical sections of code I might go with BOOST_FOREACH (a co-worker got me into it).
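For the "call an existing method on every element" case, a sketch using only the standard library might look like this. It uses std::mem_fn from C++11 instead of boost::bind; Counter and update_all are hypothetical names.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical element type with an update() method.
struct Counter {
    int value;
    void update() { ++value; }
};

// Calls update() on every element; std::mem_fn wraps the member function
// so it can be invoked with the element as its argument.
void update_all(std::vector<Counter> &v) {
    std::for_each(v.begin(), v.end(), std::mem_fn(&Counter::update));
}
```

There is no iterator declaration and no loop body to read, only the range and the operation.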
Aside from readability and performance, one aspect commonly overlooked is consistency. There are many ways to implement a for (or while) loop over iterators, from:
for (C::iterator iter = c.begin(); iter != c.end(); iter++) {
do_something(*iter);
}
to:
C::iterator iter = c.begin();
C::iterator end = c.end();
while (iter != end) {
do_something(*iter);
++iter;
}
with many examples in between at varying levels of efficiency and bug potential.
Using for_each, however, enforces consistency by abstracting away the loop:
for_each(c.begin(), c.end(), do_something);
The only thing you have to worry about now is: do you implement the loop body as a function, a functor, or a lambda using Boost or C++0x features? Personally, I'd rather worry about that than how to implement or read a random for/while loop.
I used to dislike std::for_each and thought that without lambda, it was done utterly wrong. However I did change my mind some time ago, and now I actually love it. And I think it even improves readability, and makes it easier to test your code in a TDD way.
The std::for_each algorithm can be read as do something with all elements in range, which can improve readability. Say the action that you want to perform is 20 lines long, and the function where the action is performed is also about 20 lines long. That would make a function 40 lines long with a conventional for loop, and only about 20 with std::for_each, thus likely easier to comprehend.
Functors for std::for_each are more likely to be more generic, and thus reusable, e.g:
struct DeleteElement
{
template <typename T>
void operator()(const T *ptr)
{
delete ptr;
}
};
And in the code you'd only have a one-liner like std::for_each(v.begin(), v.end(), DeleteElement()) which is slightly better IMO than an explicit loop.
All of those functors are normally easier to get under unit tests than an explicit for loop in the middle of a long function, and that alone is already a big win for me.
std::for_each is also generally more reliable, as you're less likely to make a mistake with range.
And lastly, the compiler might produce slightly better code for std::for_each than for certain kinds of hand-crafted for loop, since for_each always looks the same to the compiler, and compiler writers can put all of their knowledge into making it as good as they can.
The same applies to other std algorithms like find_if, transform, etc.
If you frequently use other algorithms from the STL, there are several advantages to for_each:
It will often be simpler and less error prone than a for loop, partly because you'll be used to functions with this interface, and partly because it actually is a little more concise in many cases.
Although a range-based for loop can be even simpler, it is less flexible (as noted by Adrian McCarthy, it iterates over a whole container).
Unlike a traditional for loop, for_each forces you to write code that will work for any input iterator. Being restricted in this way can actually be a good thing because:
You might actually need to adapt the code to work for a different container later.
At the beginning, it might teach you something and/or change your habits for the better.
Even if you would always write for loops which are perfectly equivalent, other people that modify the same code might not do this without being prompted to use for_each.
Using for_each sometimes makes it more obvious that you can use a more specific STL function to do the same thing. (As in Jerry Coffin's example; it's not necessarily the case that for_each is the best option, but a for loop is not the only alternative.)
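As a small illustration of that last point, here is a sum written first with for_each and then with the more specific algorithm it hints at. Both function names are made up for the example.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// for_each version: the lambda mutates captured state to build the result.
int sum_with_for_each(std::vector<int> const &v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int x) { total += x; });
    return total;
}

// The same computation with the algorithm that names the intent directly.
int sum_with_accumulate(std::vector<int> const &v) {
    return std::accumulate(v.begin(), v.end(), 0);
}
```

Once the loop is phrased as for_each over a range, the step to accumulate is mostly a matter of swapping the algorithm name.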
With C++11 and two simple templates, you can write
for ( auto x: range(v1+4,v1+6) ) {
x*=2;
cout<< x <<' ';
}
as a replacement for for_each or a loop. Why choose it boils down to brevity and safety: there's no chance of error in an expression that's not there.
For me, for_each was always better on the same grounds when the loop body is already a functor, and I'll take any advantage I can get.
You still use the three-expression for, but now when you see one you know there's something to understand there, it's not boilerplate. I hate boilerplate. I resent its existence. It's not real code, there's nothing to learn by reading it, it's just one more thing that needs checking. The mental effort can be measured by how easy it is to get rusty at checking it.
The templates are
template<typename iter>
struct range_ {
// names containing a double underscore are reserved, so use beg_/end_
iter begin() const {return beg_;} iter end() const {return end_;}
range_(iter const&beg,iter const&end) : beg_(beg),end_(end) {}
iter beg_, end_;
};
template<typename iter>
range_<iter> range(iter const &begin, iter const &end)
{ return range_<iter>(begin,end); }
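Put together, a self-contained sketch of the helper and one use could look like the following. The member names beg_ and end_ and the function double_in_place are my own choices. Note that the loop variable must be a reference (auto &x) if the elements themselves are to be modified; with plain auto x only a copy is doubled.

```cpp
// Minimal iterator-pair wrapper usable in a range-based for loop.
template <typename Iter>
struct range_ {
    range_(Iter const &beg, Iter const &end) : beg_(beg), end_(end) {}
    Iter begin() const { return beg_; }
    Iter end() const { return end_; }
    Iter beg_, end_;
};

// Deduces the iterator type so callers can write range(first, last).
template <typename Iter>
range_<Iter> range(Iter const &begin, Iter const &end) {
    return range_<Iter>(begin, end);
}

// Doubles the elements in [first, last) of a plain array in place.
void double_in_place(int *first, int *last) {
    for (auto &x : range(first, last)) {
        x *= 2;
    }
}
```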
A for loop can iterate over each element, or every third element, and so on. for_each only ever iterates over each element, as its name makes clear. So it is clearer what you intend to do in your code.
for_each allows us to implement the fork-join pattern. Other than that, it supports a fluent interface.
fork-join pattern
We can add an implementation, gpu::for_each, that uses CUDA/GPU for heterogeneous parallel computing by running the lambda task on multiple workers.
gpu::for_each(users.begin(),users.end(),update_summary);
// all summary is complete now
// go access the user-summary here.
And gpu::for_each may wait for the workers to finish all the lambda tasks before executing the next statements.
fluent-interface
It allows us to write human-readable code in a concise manner.
accounts.erase(std::remove_if(accounts.begin(),accounts.end(),used_this_year),accounts.end());
std::for_each(accounts.begin(),accounts.end(),mark_dormant);
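A compilable sketch of those two statements follows, assuming accounts is a std::vector. The Account struct, its fields and the wrapper retire_unused are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical account record.
struct Account {
    bool used;
    bool dormant;
};

bool used_this_year(Account const &a) { return a.used; }
void mark_dormant(Account &a) { a.dormant = true; }

// Drops accounts used this year, then marks the remaining ones dormant.
void retire_unused(std::vector<Account> &accounts) {
    // erase-remove idiom: remove_if only shifts elements, erase shrinks the vector.
    accounts.erase(
        std::remove_if(accounts.begin(), accounts.end(), used_this_year),
        accounts.end());
    std::for_each(accounts.begin(), accounts.end(), mark_dormant);
}
```

The second argument to erase matters: without it only a single element would be erased.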
Mostly you'll have to iterate over the whole collection. Therefore I suggest you write your own for_each() variant, taking only 2 parameters. This will allow you to rewrite Terry Mahaffey's example as:
for_each(container, [](int& i) {
i += 10;
});
I think this is indeed more readable than a for loop. However, this requires the C++0x compiler extensions.
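Such a two-parameter variant might be sketched as below. The helper is named for_each_in here purely to keep it visibly distinct from std::for_each; that name and add_ten are assumptions, not something the answer specifies.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: applies f to every element of the whole container.
template <typename Container, typename Function>
Function for_each_in(Container &c, Function f) {
    return std::for_each(std::begin(c), std::end(c), f);
}

// Example use: add 10 to every element.
void add_ten(std::vector<int> &v) {
    for_each_in(v, [](int &i) { i += 10; });
}
```

Because it goes through std::begin and std::end, the same helper also accepts built-in arrays.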
I find for_each to be bad for readability. The concept is a good one, but C++ makes it very hard to write readably, at least for me. C++0x lambda expressions will help. I really like the idea of lambdas. However, at first glance I think the syntax is very ugly and I'm not 100% sure I'll ever get used to it. Maybe in 5 years I'll have got used to it and not give it a second thought, but maybe not. Time will tell :)
I prefer to use
vector<thing>::iterator istart = container.begin();
vector<thing>::iterator iend = container.end();
for(vector<thing>::iterator i = istart; i != iend; ++i) {
// Do stuff
}
I find an explicit for loop clearer to read, and explicitly using named variables for the start and end iterators reduces the clutter in the for loop.
Of course cases vary, this is just what I usually find best.
There are a lot of good reasons in other answers but all seem to forget that
for_each allows you to use reverse iterators or pretty much any custom iterator, whereas a range-based for loop always runs from begin() to end().
Example with reverse iterator:
std::list<int> l {1,2,3};
std::for_each(l.rbegin(), l.rend(), [](auto o){std::cout<<o;});
Example with some custom tree iterator:
SomeCustomTree<int> a{1,2,3,4,5,6,7};
auto node = a.find(4);
std::for_each(node.breadthFirstBegin(), node.breadthFirstEnd(), [](auto o){std::cout<<o;});
You can have the iterator be a call to a function that is performed on each iteration through the loop.
See here:
http://www.cplusplus.com/reference/algorithm/for_each/
For loop can break;
I don't want to be a parrot for Herb Sutter so here is the link to his presentation:
http://channel9.msdn.com/Events/BUILD/BUILD2011/TOOL-835T
Be sure to read the comments also :)
std::for_each is great when you don't have a range.
For example, consider std::istream_iterator:
using Iter = std::istream_iterator<int>;
for (Iter i(str); i != Iter(); ++i) {
f(*i);
}
It has no container, so you can't easily use a for (auto &&item: ...) loop, but you can do:
std::for_each(Iter(str), Iter(), [](int item) {
// ...
});
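A runnable sketch of the same idea, with a std::istringstream standing in for str. The function sum_ints is made up for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Sums the whitespace-separated integers found in text.
int sum_ints(std::string const &text) {
    std::istringstream str(text);
    using Iter = std::istream_iterator<int>;
    int total = 0;
    // Iter(str) reads from the stream; a default-constructed Iter marks the end.
    std::for_each(Iter(str), Iter(), [&total](int item) { total += item; });
    return total;
}
```

Reading stops at end of input or at the first token that does not parse as an int.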