++it or it++ when iterating over a map? - c++

Examples showing how to iterate over a std::map are often like that:
MapType::const_iterator end = data.end();
for (MapType::const_iterator it = data.begin(); it != end; ++it)
i.e. it uses ++it instead of it++. Is there any reason why? Could there be any problem if I use it++ instead?

it++ returns a copy of the previous iterator. Since this iterator is not used, this is wasteful. ++it returns a reference to the incremented iterator, avoiding the copy.
Please see Question 13.15 for a fuller explanation.

Putting it to the test, I made three source files:
#include <map>
struct Foo { int a; double b; char c; };
typedef std::map<int, Foo> FMap;
### File 1 only ###
void Set(FMap & m, const Foo & f)
{
for (FMap::iterator it = m.begin(), end = m.end(); it != end; ++it)
it->second = f;
}
### File 2 only ###
void Set(FMap & m, const Foo & f)
{
for (FMap::iterator it = m.begin(); it != m.end(); ++it)
it->second = f;
}
### File 3 only ###
void Set(FMap & m, const Foo & f)
{
for (FMap::iterator it = m.begin(); it != m.end(); it++)
it->second = f;
}
### end ###
After compiling with g++ -S -O3, GCC 4.6.1, I find that version 2 and 3 produce identical assembly, and version 1 differs only in one instruction, cmpl %eax, %esi vs cmpl %esi, %eax.
So, take your pick and use whatever suits your style. Prefix increment ++it is probably best because it expresses your requirements most accurately, but don't get hung up about it.

There’s a slight performance advantage in using pre-increment operators versus post-increment operators. In setting up loops that use iterators, you should opt to use pre-increments:
for (list<string>::const_iterator it = tokens.begin();
it != tokens.end();
++it) { // Don't use it++
...
}
The reason comes to light when you think about how both operators would typically be implemented.The pre-increment is quite straightforward. However, in order for post-increment to work, you need to first make a copy of the object, do the actual increment on the original object and then return the copy:
class MyInteger {
private:
int m_nValue;
public:
MyInteger(int i) {
m_nValue = i;
}
// Pre-increment
const MyInteger &operator++() {
++m_nValue;
return *this;
}
// Post-increment
MyInteger operator++(int) {
MyInteger clone = *this; // Copy operation 1
++m_nValue;
return clone; // Copy operation 2
}
}
As you can see, the post-increment implementation involves two extra copy operations. This can be quite expensive if the object in question is bulky. Having said that, some compilers may be smart enough to get away with a single copy operation, through optimization. The point is that a post-increment will typically involve more work than a pre-increment and therefore it’s wise to get used to putting your “++”s before your iterators rather than after.
(1) Credit to linked website.

At logical point of view - it's the same and it doesn't matter here.
Why the prefix one is used - because it's faster - it changes the iterator and returns its value, while the postfix creates temp object, increments the current iterator and then returns the temp object (copy of the same iterator, before the incrementing ). As no one watches this temp object here (the return value), it's the same (logically).
There's a pretty big chance, that the compiler will optimize this.
In Addition - actually, this is supposed to be like this for any types at all. But it's just supposed to be. As anyone can overload operator++ - the postfix and prefix, they can have side effects and different behavior.
Well, this is a horrible thing to do, but still possible.

It won't cause any problems, but using ++it is more correct. With small types it doesn't really matter to use ++i or i++, but for "big" classes:
operator++(type x,int){
type tmp=x; //need copy
++x;
return tmp;
}
The compiler may optimize out some of them, but it's hard to be sure.

As other answers have said, prefer ++it unless it won't work in context. For iterating over containers of small types it really makes little difference (or no difference if the compiler optimizes it away), but for containers of large types it can make a difference due to saving the cost of making a copy.
True, you might know in your specific context that the type is small enough so you don't worry about it. But later, someone else on your team might change the contents of the container to where it would matter. Plus, I think it is better to get yourself into a good habit, and only post-increment when you know you must.

Related

What is the most effective way of iterating a std::vector and why?

In terms of space-time complexity which of the following is best way to iterate over a std::vector and why?
Way 1:
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
Way 2:
for(std::vector<int>::size_type i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}
Way 3:
for(size_t i = 0; i != v.size(); i++) {
/* std::cout << v[i]; ... */
}
Way 4:
for(auto const& value: a) {
/* std::cout << value; ... */
First of all, Way 2 and Way 3 are identical in practically all standard library implementations.
Apart from that, the options you posted are almost equivalent. The only notable difference is that in Way 1 and Way 2/3, you rely on the compiler to optimize the call to v.end() and v.size() out. If that assumption is correct, there is no performance difference between the loops.
If it's not, Way 4 is the most efficient. Recall how a range based for loop expands to
{
auto && __range = range_expression ;
auto __begin = begin_expr ;
auto __end = end_expr ;
for ( ; __begin != __end; ++__begin) {
range_declaration = *__begin;
loop_statement
}
}
The important part here is that this guarantees the end_expr to be evaluated only once. Also note that for the range based for loop to be the most efficient iteration, you must not change how the dereferencing of the iterator is handled, e.g.
for (auto value: a) { /* ... */ }
this copies each element of the vector into the loop variable value, which is likely to be slower than for (const auto& value : a), depending on the size of the elements in the vector.
Note that with the parallel algorithm facilities in C++17, you can also try out
#include <algorithm>
#include <execution>
std::for_each(std::par_unseq, a.cbegin(), a.cend(),
[](const auto& e) { /* do stuff... */ });
but whether this is faster than an ordinary loop depends on may circumstantial details.
Prefer iterators over indices/keys.
While for vector or array there should be no difference between either form1, it is a good habit to get into for other containers.
1 As long as you use [] instead of .at() for accesssing by index, of course.
Memorize the end-bound.
Recomputing the end-bound at each iteration is inefficient for two reasons:
In general: a local variable is not aliased, which is more optimizer-friendly.
On containers other than vector: computing the end/size could be a bit more expensive.
You can do so as a one-liner:
for (auto it = vec.begin(), end = vec.end(); it != end; ++it) { ... }
(This is an exception to the general prohibition on declaring a single variable at a time.)
Use the for-each loop form.
The for-each loop form will automatically:
Use iterators.
Memorize the end-bound.
Thus:
for (/*...*/ value : vec) { ... }
Take built-in types by values, other types by reference.
There is a non-obvious trade-off between taking an element by value and taking an element by reference:
Taking an element by reference avoids a copy, which can be an expensive operation.
Taking an element by value is more optimizer-friendly1.
At the extremes, the choice should be obvious:
Built-in types (int, std::int64_t, void*, ...) should be taken by value.
Potentially allocating types (std::string, ...) should be taken by reference.
In the middle, or when faced with generic code, I would recommend starting with references: it's better to avoid a performance cliff than attempting to squeeze out the last cycle.
Thus, the general form is:
for (auto& element : vec) { ... }
And if you are dealing with a built-in:
for (int element : vec) { ... }
1 This is a general principle of optimization, actually: local variables are friendlier than pointers/references because the optimizer knows all the potential aliases (or absence, thereof) of the local variable.
Addition to lubgr's answer:
Unless you discover via profiling the code in question to be a bottleneck, efficiency (which you probably meant instead of 'effectivity') shouldn't be your first concern, at least not on this level of code. Much more important are code readability and maintainability! So you should select the loop variant that reads best, which usually is way 4.
Indices can be useful if you have steps greater than 1 (whyever you would need to...):
for(size_t i = 0; i < v.size(); i += 2) { ... }
While += 2 per se is legal on iterators, too, you risk undefined behaviour at loop end if the vector has odd size because you increment past the one past the end position! (Generally spoken: If you increment by n, you get UB if size is not an exact multiple of n.) So you need additional code to catch this, while you don't with the index variant...
The lazy answer: The complexities are equivalent.
The time complexity of all solutions is Θ(n).
The space complexity of all solutions is Θ(1).
The constant factors involved in the various solutions are implementation details. If you need numbers, you're probably best off benchmarking the different solutions on your particular target system.
It may help to store v.size() rsp. v.end(), although these are usually inlined, so such optimizations may not be needed, or performed automatically.
Note that indexing (without memoizing v.size()) is the only way to correctly deal with a loop body that may add additional elements (using push_back()). However, most use cases do not need this extra flexibility.
Prefer method 4, std::for_each (if you really must), or method 5/6:
void method5(std::vector<float>& v) {
for(std::vector<float>::iterator it = v.begin(), e = v.end(); it != e; ++it) {
*it *= *it;
}
}
void method6(std::vector<float>& v) {
auto ptr = v.data();
for(std::size_t i = 0, n = v.size(); i != n; i++) {
ptr[i] *= ptr[i];
}
}
The first 3 methods can suffer from issues of pointer aliasing (as alluded to in previous answers), but are all equally bad. Given that it's possible another thread may be accessing the vector, most compilers will play it safe, and re-evaluate [] end() and size() in each iteration. This will prevent all SIMD optimisations.
You can see proof here:
https://godbolt.org/z/BchhmU
You'll notice that only 4/5/6 make use of the vmulps SIMD instructions, where as 1/2/3 only ever use the non-SIMD vmulss instructiuon.
Note: I'm using VC++ in the godbolt link because it demonstrates the problem nicely. The same problem does occur with gcc/clang, but it's not easy to demonstrate it with godbolt - you usually need to disassemble your DSO to see this happening.
For completeness, I wanted to mention that your loop might want to change the size of the vector.
std::vector<int> v = get_some_data();
for (std::size_t i=0; i<v.size(); ++i)
{
int x = some_function(v[i]);
if(x) v.push_back(x);
}
In such an example you have to use indices and you have to re-evaluate v.size() in every iteration.
If you do the same with a range-based for loop or with iterators, you might end up with undefined behavior since adding new elements to a vector might invalidate your iterators.
By the way, I prefer to use while-loops for such cases over for-loops but that's another story.
It depends to a large extent on what you mean by "effective".
Other answers have mentioned efficiency, but I'm going to focus on the (IMO) most important purpose of C++ code: to convey your intent to other programmers¹.
From this perspective, method 4 is clearly the most effective. Not just because there are fewer characters to read, but mainly because there's less cognitive load: we don't need to check whether the bounds or step size are unusual, whether the loop iteration variable (i or it) is used or modified anywhere else, whether there's a typo or copy/paste error such as for (auto i = 0u; i < v1.size(); ++i) { std::cout << v2[i]; }, or dozens of other possibilities.
Quick quiz: Given std::vector<int> v1, v2, v3;, how many of the following loops are correct?
for (auto it = v1.cbegin(); it != v1.end(); ++it)
{
std::cout << v1[i];
}
for (auto i = 0u; i < v2.size(); ++i)
{
std::cout << v1[i];
}
for (auto const i: v3)
{
std::cout << i;
}
Expressing the loop control as clearly as possible allows the developer's mind to hold more understanding of the high-level logic, rather than being cluttered with implementation details - after all, this is why we're using C++ in the first place!
¹ To be clear, when I'm writing code, I consider the most important "other programmer" to be Future Me, trying to understand, "Who wrote this rubbish?"...
All of the ways you listed have identical time complexity and identical space complexity (no surprise there).
Using the for(auto& value : v) syntax is marginally more efficient, because with the other methods, the compiler may re-load v.size() and v.end() from memory every time you do the test, whereas with for(auto& value : v) this never occurs (it only loads the begin() and end() iterators once).
We can observe a comparison of the assembly produced by each method here: https://godbolt.org/z/LnJF6p
On a somewhat funny note, the compiler implements method3 as a jmp instruction to method2.
The complexity is the same for all except the last one that is in theory faster because the end of the container is evaluated only once.
Last one is also the nicest to read and to write, but has the drawback that doesn't give you the index (that is quite often important).
You are however ignoring what I think is a good alternative (it's my preferred one when I need the index and cannot use for (auto& x : v) {...}):
for (int i=0,n=v.size(); i<n; i++) {
... use v[i] ...
}
note that I used int and not size_t and that the end is computed only once and also available in the body as a local variable.
Often when the index and the size are needed then math computations are also performed on them and size_t behaves "strangely" when used for math (for example a+1 < b and a < b-1 are different things).

Why use the prefix increment form for iterators? [duplicate]

This question already has answers here:
Incrementing iterators: Is ++it more efficient than it++? [duplicate]
(7 answers)
Closed 6 years ago.
Johannes Schaub claims here
always use the prefix increment form for iterators whose definitions
you don't know. That will ensure your code runs as generic as
possible.
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
The iteration statement is always executed at the end of each iteration. That is regardless of the type of increment operator you use, or whether you use an increment operator at all.
The result of the iteration statement expression is not used, so it has no effect on how the loop behaves. The statement:
++it;
Is functionally equivalent to the statement:
it++;
Postfix and prefix increment expressions have different behaviour only when the result of the expression is used.
Why use the prefix increment form for iterators?
Because the postfix operation implies a copy. Copying an iterator is generally at least as slow, but potentially slower than not copying an iterator.
A typical implementation of postfix increment:
iterator tmp(*this); // copy
++(*this); // prefix increment
return tmp; // return copy of the temporary
// (this copy can be elided by NRVO)
When the result is not used, even the first copy can be optimized away but only if the operation is expanded inline. But that is not guaranteed.
I wouldn't blindly use the rule "always use prefix increment with itrators". Some algorithms are clearer to express with postfix, although that is just my opinion. An example of an algorithm suitable for postfix increment:
template<class InIter, class OutIter>
OutIter copy(InIter first, InIter last, OutIter out) {
while(first != last)
*out++ = *first++;
return out;
}
Note that your code is equivalent to
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ) {
/* std::cout << *it; ... */
++it;
}
and it should be readily apparent that it doesn't matter if you write ++it; or it++;. (This also addresses your final point.)
But conceptually it++ needs to store, in its implementation, a copy of the unincremented value, as that is what the expression evaluates to.
it might be a big heavy object of which taking a value copy is computationally expensive, and your compiler might not be able to optimise away that implicit value copy taken by it++.
These days, for most containers, a compiler will optimise the arguably clearer it++ to ++it if the value of the expression is not used; i.e. the generated code will be identical.
I follow the author's advice and always use the pre-increment whenever possible, but I am (i) old fashioned and (ii) aware that plenty of expert programmers don't, so it's largely down to personal choice.
Why doesn't this first iterate it, then start the loop (at v.begin() + 1)?
Because the for loop will be parsed as:
{
init_statement
while ( condition ) {
statement
iteration_expression ;
}
}
So
for(std::vector<T>::iterator it = v.begin(); it != v.end(); ++it) {
/* std::cout << *it; ... */
}
is equivalent to
{
std::vector<T>::iterator it = v.begin();
while ( it != v.end() ) {
/* std::cout << *it; ... */
++it ;
}
}
That means it would do the loop at v.begin() at first, then step forward it. Prefix increment means increase the value and then return the reference of the increased object; As you can seen the returned object is not used at all for this case, then ++it and it++ will lead to the same result.

what is the better way to write iterators for a loop in C++

For a very simple thing, like for example to print each element in a vector, what is the better way to use in C++?
I have been using this:
for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)
before, but in one of the Boost::filesystem examples I have seen this way:
for (vec::const_iterator it(v.begin()), it_end(v.end()); it != it_end; ++it)
For me it looks more complicated and I don't understand why is it better then the one I have been using.
Can you tell me why is this version better? Or it doesn't matter for simple things like printing elements of a vector?
Does i != values.end() make the iterating slower?
Or is it const_iterator vs iterator? Is const_iterator faster in a loop like this?
Foo x = y; and Foo x(y); are equivalent, so use whichever you prefer.
Hoisting the end out of the loop may or may not be something the compiler would do anyway, in any event, it makes it explicit that the container end isn't changing.
Use const-iterators if you aren't going to modify the elements, because that's what they mean.
for (MyVec::const_iterator it = v.begin(), end = v.end(); it != end; ++it)
{
/* ... */
}
In C++0x, use auto+cbegin():
for (auto it = v.cbegin(), end = v.cend(); it != end; ++it)
(Perhaps you'd like to use a ready-made container pretty-printer?)
for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)
...vs...
for (vec::const_iterator it(v.begin()), it_end(v.end()); it != it_end; ++it)
For me [the latter, seen in boost] looks more complicated and I don't understand why is it better then the one I have been using.
I'd say it would look more complicated to anybody who hasn't got some specific reason for liking the latter to the extent that it distorts perception. But let's move on to why it might be better....
Can you tell me why is this version better? Or it doesn't matter for simple things like printing elements of a vector?
Does i != values.end() make the iterating slower?
it_end
Performance: it_end gets the end() value just once as the start of the loop. For any container where calculating end() was vaguely expensive, calling it only once may save CPU time. For any halfway decent real-world C++ Standard library, all the end() functions perform no calculations and can be inlined for equivalent performance. In practice, unless there's some chance you may need to drop in a non-Standard container that's got a more expensive end() function, there's no benefit to explicitly "caching" end() in optimised code.This is interesting, as it means for vector that size() may require a small calculation - conceptually subtracting begin() from end() then dividing by sizeof(value_type) (compilers scale by size implicitly during pointer arithmetic), e.g. GCC 4.5.2:
size_type size() const
{ return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }
Maintenance: if the code evolves to insert or erase elements inside the loop (obvious in such a way that the iterator itself isn't invalidated - plausible for maps / sets / lists etc.) it's one more point of maintenance (and hence error-proneness) if the cached end() value also needs to be explicitly recalculated.
A small detail, but here vec must be a typedef, and IMHO it's often best to use typedefs for containers as it loosens the coupling of container type with access to the iterator types.
type identifier(expr)
Style and documentary emphasis: type identifier(expr) is more directly indicative of a constructor call than type identifier = expr, which is the main reason some people prefer the form. I generally prefer the latter, as I like to emphasise the sense of assignment... it's visually unambiguous whereas function call notation is used for many things.
Near equivalence: For most classes, both invoke the same constructor anyway, but if type has an explicit constructor from the type of expr, it will be passed over if = is used. Worse still, some other conversion may allow a less ideal constructor be used instead. For example, X x = 3.14;, would pass over explicit X::X(double); to match X::X(int) - you could get a less precise (or just plain wrong) result - but I'm yet to be bitten by such an issue so it's pretty theoretical!
Or is it const_iterator vs iterator? Is const_iterator faster in a loop like this?
For Standard containers, const_iterator and iterator perform identically, but the latter implies you want the ability to modify the elements as you iterate. Using const_iterator documents that you don't intend to do that, and the compiler will catch any contradictory uses of the iterator that attempt modification. For example, you won't be able to accidentally increment the value the iterator addresses when you intend to increment the iterator itself.
Given C++0x has been mentioned in other answers - but only the incremental benefit of auto and cbegin/cend - there's also a new notation supported:
for (const Foo& foo: container)
// use foo...
To print the items in a vector, you shouldn't be using any of the above (at least IMO).
I'd recommend something like this:
std::copy(values.begin(), values.end(),
std::ostream_iterator<T>(std::cout, "\n"));
You could just access them by index
int main(int argc, char* argv[])
{
std::vector<int> test;
test.push_back(10);
test.push_back(11);
test.push_back(12);
for(int i = 0; i < test.size(); i++)
printf("%d\n", test[i]);
}
prints out:
10
11
12
I don't think it matters. Internally, they do the same thing, so you compiler should optimise it anyway. I would personally use the first version as I find it much clearer as it closely follows the for-loop strucutre.
for (vector<int>::iterator i = values.begin(); i != values.end(); ++i)

c++ best way to use for loop

I have this question that runs in my mind...
I have a std::vector to iterate:
which is the best way (the faster) to iterate?
here is the code using an iterator:
// using the iterator
for( std::vector <myClass*>::iterator it = myObject.begin( ); it != myObject.end( ); it++ )
{
(*it)->someFunction( );
}
and here is 'normal' mode...
// normal loop
for( int i = 0; i < myObject.Size( ); i++ )
{
myObject[i]->someFunction( );
}
thanks for your suggestions!
None of the two will be any faster really, because on most implementations a vector<T>::iterator is just a typedef for T* and size is cached.
But doing ++it instead of it++ is a good habit. The latter involves creating a temporary.
for(std::vector <myClass*>::iterator it = myObject.begin( );
it != myObject.end( );
++it)
^^^^
On other containers such as map, list etc. with nontrivial iterators the difference between postincrement and preincrement might become noticable.
If you really care, you can find out: Just make a single source file with one function with that loop and look at the optimized assembly:
g++ -O2 -S -o ver1.s ver1.cpp
g++ -O2 -S -o ver2.s ver2.cpp
You can directly see the differences! I bet there are none.
That said, you should use the iterator pattern because it's idomatic, generic C++ and it gets you in the right mood -- plus, it works in far more general cases than just vectors! Write it like this:
typedef std::vector<MyClass*> myVec;
for (myVec::const_iterator it = v.begin(), end = v.end(); it != end; ++it)
{
const MyClass & x = **it;
/* ... */
}
In case you're curious, a vector iterator is most likely just going to be a native, raw pointer, so there's really nothing to fear in terms of efficiency, and a lot to be enjoyed from the self-explanatory, algorithmic style!
PS If you have C++0x, say it like this:
for (auto it = v.cbegin(), end = v.cend(); it != end; ++it)
// or
for (const MyClass * & i : v)
The first code will decompose into incrementing a pointer. The second one will increment an index, and index into the array. The first one can use slightly smaller instructions (and thus potentially be faster) assuming the compiler doesn't optimize the second into the first already. But it will be a trivial difference.
Iterators should be preferred, however, not because of speed but because you can then easily move the code to iterate any standard C++ container, not just vector.
However, you've got a few things to improve.
Don't use it++, but ++it. This can be very important in C++ because iterators can end up doing a little more work in post-increment which won't be optimized out as if the type were an int.
Don't constantly call end() or size(). For some iterator types and collections this might not be optimized out and can be very sub-optimal.
Use vector::size_type when you need an index into a vector. int is not guaranteed to be big enough, while size_type was made specifically for that.
So, the better ways to write these are:
// using the iterator
for(std::vector <myClass*>::iterator it = myObject.begin( ), end = myObject.end(); it != end; ++it)
{
(*it)->someFunction( );
}
// normal loop
for(std::vector <myClass*>::size_type i = 0, size = myObject.size(); i < size; ++i)
{
myObject[i]->someFunction( );
}
Here's what I use:
int s = vec.size();
for(int i=0;i<s;i++)
{
T &o = vec[i];
...
}
This loop has the following advantages over other approaches:
it always looks the same (unlike the hacks that change from i++ to ++i)
it's relatively short to write (compared to the iterator version)
the int indexes are useful in your public interface (unlike iterators)
it still always looks the same (unlike stdlib algorithms, where you need documentation to remember the parameters)
it's very old and thus widely used. (originally comes from book "The C programming language" aka K&R)
It doesn't give warnings on most compilers (unlike the loop that was used in the question)
It does have some disadvantages too:
some newer programmers doesn't like it because they think C ways are too old
in a const function, you might need to change it slightly to const T &o = vec[i]; or change the data member mutable
type T changes depending on the use
You can do this:
#include <iostream>
using namespace std;
int main() {
for (int i = 3; i > 0; i--) {
cout << i << "\n";
}
return 0;
}
source https://mockstacks.com/Cpp_For_Loop

C++: Proper way to iterate over STL containers

In my game engine project, I make extensive use of the STL, mostly of the std::string and std::vector classes.
In many cases, I have to iterate through them. Right now, the way I'm doing it is:
for( unsigned int i = 0; i < theContainer.size(); i ++ )
{
}
Am I doing it the right way?
If not, why, and what should I do instead?
Is size() really executed every loop cycle with this implementation? Would the performance loss be negligible?
C++11 has a new container aware for loop syntax that can be used if your compiler supports the new standard.
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main()
{
vector<string> vs;
vs.push_back("One");
vs.push_back("Two");
vs.push_back("Three");
for (const auto &s : vs)
{
cout << s << endl;
}
return 0;
}
You might want to look at the standard algorithms.
For example
vector<mylass> myvec;
// some code where you add elements to your vector
for_each(myvec.begin(), myvec.end(), do_something_with_a_vector_element);
where do_something_with_a_vector_element is a function that does what goes in your loop
for example
void
do_something_with_a_vector_element(const myclass& element)
{
// I use my element here
}
The are lots of standard algorithms - see http://www.cplusplus.com/reference/algorithm/ - so most things are supported
STL containers support Iterators
vector<int> v;
for (vector<int>::iterator it = v.begin(); it!=v.end(); ++it) {
cout << *it << endl;
}
size() would be re-computed every iteration.
For random-access containers, it's not wrong.
But you can use iterators instead.
for (string::const_iterator it = theContainer.begin();
it != theContainer.end(); ++it) {
// do something with *it
}
There are some circumstances under which a compiler may optimize away the .size() (or .end() in the iterator case) calls (e.g. only const access, function is pure). But do not depend on it.
Usually the right way to "iterate" over a container is using "iterators". Something like
string myStr = "hello";
for(string::iterator i = myStr.begin(); i != myStr.end(); ++i){
cout << "Current character: " << *i << endl;
}
Of course, if you aren't going to modify each element, it's best to use string::const_iterator.
And yes, size() gets called every time, and it's O(n), so in many cases the performance loss will be noticeable and it's O(1), but it's a good practice to calculate the size prior to the loop than calling size every time.
No, this is not the correct way to do it. For a ::std::vector or a ::std::string it works fine, but the problem is that if you ever use anything else, it won't work so well. Additionally, it isn't idiomatic.
And, to answer your other question... The size function is probably inline. This means it likely just fetches a value from the internals of ::std::string or ::std::vector. The compiler will optimize this away and only fetch it once in most cases.
But, here is the idiomatic way:
for (::std::vector<Foo>::iterator i = theContainer.begin();
i != theContainer.end();
++i)
{
Foo &cur_element = *i;
// Do stuff
}
The ++i is very important. Again, for ::std:vector or ::std::string where the iterator is basically a pointer, it's not so important. But for more complicated data structures it is. i++ has to make a copy and create a temporary because the old value needs to stick around. ++i has no such issue. Get into the habit of always using ++i unless you have a compelling reason not to.
Lastly, theContainer.end() will also be generally optimized out of existence. But you can force things to be a little better by doing this:
const ::std::vector<Foo>::iterator theEnd = theContainer.end();
for (::std::vector<Foo>::iterator i = theContainer.begin(); i != theEnd; ++i)
{
Foo &cur_element = *i;
// Do stuff
}
Of course, C++0x simplifies all of this considerably with a new syntax for for loops:
for (Foo &i: theContainer)
{
// Do stuff with i
}
These will work on standard fix-sized arrays as well as any type that defines begin and end to return iterator-like things.
Native for-loop (especially index-based) - it's C-way, not C++-way.
Use BOOST_FOREACH for loops.
Compare, for container of integers:
typedef theContainer::const_iterator It;
for( It it = theContainer.begin(); it != theContainer.end(); ++it ) {
std::cout << *it << std::endl;
}
and
BOOST_FOREACH ( int i, theContainer ) {
std::cout << i << std::endl;
}
But this is not perfect way. If you can do your work without loop - you MUST do it without loop. For example, with algorithms and Boost.Phoenix:
boost::range::for_each( theContainer, std::cout << arg1 << std::endl );
I understand that these solutions bring additional dependencies in your code, but Boost is 'must-have' for modern C++.
You're doing it OK for vectors, although that doesn't translate into the right way for other containers.
The more general way is
for(std::vector<foo>::const_iterator i = theContainer.begin(); i != theContainer.end; ++i)
which is more typing than I really like, but will become a lot more reasonable with the redefinition of auto in the forthcoming Standard. This will work on all standard containers. Note that you refer to the individual foo as *i, and use &*i if you want its address.
In your loop, .size() is executed every time. However, it's constant time (Standard, 23.1/5) for all standard containers, so it won't slow you down much if at all. Addition: the Standard says "should" have constant complexity, so a particularly bad implementation could make it not constant. If you're using such a bad implementation, you've got other performance issues to worry about.