I am just wondering why I would do this in c++
for(int i=0, n=something.size(); i<n; ++i)
vs
for(int i=0; i<something.size(); ++i)
Assuming syntactically correct versions of both samples, if the call to something.size() were expensive, the first sample would potentially be more efficient because it saves one call per loop iteration. Even so, you should measure whether it actually makes a difference.
Note that the two would have different semantics if the size of something were to change inside of the loop.
The loop condition is evaluated before every loop round, so if the operand of the comparison doesn't change (i.e. you don't mutate the sequence during its iteration), you don't need to recompute the operand each time and can hoist it out of the loop instead.
Whether that makes a difference depends on how much the compiler can see of the size() call. For instance, if it can prove that the result cannot change during the iteration, then it may already do the hoisting for you. If in doubt, compile both versions and compare the machine code.
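For illustration, a minimal sketch of the manual hoisting, assuming something is not modified inside the loop:
const std::size_t n = something.size(); // evaluated exactly once
for (std::size_t i = 0; i < n; ++i) {
    // use something[i]
}
If the optimizer can already prove size() is invariant, both versions should compile to essentially the same machine code anyway.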
If you do
for(int i=0; i<something.size(); ++i)
it will be correct.
You should check a C++ reference for the exact for-loop syntax; your second example, as originally posted, was invalid C++ code.
The two examples are not the same.
for(int i=0, n=something.size(); i<n; ++i)
{
// ....
}
evaluates something.size() only once.
for(int i=0; i<something.size(); ++i) // Syntax corrected
{
// ....
}
evaluates something.size() on each iteration.
So the two could behave very differently if something.size() changes while the loop is running.
If you know something.size() will not change, you should go for the first solution for performance reasons (i.e. only one call to something.size()).
If something.size() can change (e.g. in the body of the for-loop) the second option is the way to go.
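As a sketch of the mutating case (the vector v and the filter are just illustrative), re-evaluating size() on every iteration is what keeps the bound honest; a cached n would be stale after the first erase:
std::vector<int> v{1, 2, 3, 4, 5};
for (std::size_t i = 0; i < v.size(); ++i) { // bound re-checked each pass
    if (v[i] % 2 == 0) {
        v.erase(v.begin() + i); // the vector shrinks here
        --i;                    // re-examine the element that slid into slot i
    }
}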
Related
Given a vector v, I want to loop through each element in the vector and perform an operation that requires the current index.
I've seen a basic for loop written both of these ways:
// Using "<" in the terminating condition
for (auto i = 0; i < v.size(); ++i)
{
// Do something with v[i]
}
// Using "!=" in the terminating condition
for (auto i = 0; i != v.size(); ++i)
{
// Do something with v[i]
}
Is there any practical reason to prefer one over the other? I've seen it written using < much more often, but is there a performance benefit to using !=?
There is one, albeit kinda convoluted, reason to prefer < over !=.
If for whatever reason the body of your loop modifies i and skips over the threshold, < still terminates the loop, whereas != will keep iterating.
It might make no difference in the vast majority of cases, but getting used to < might prevent bugs that != won't. There's also an additional, ridiculously minor advantage: < takes fewer characters to write, so it makes the source file smaller.
Again the above argument is borderline a joke, but if you need an argument to use one over the other, there you have it.
You should probably prefer using a range-for instead.
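For example, a minimal range-for sketch; if the operation also needs the index, you can still carry a counter alongside:
std::size_t i = 0;
for (const auto& elem : v) {
    // Do something with elem (and i, if the index is needed)
    ++i;
}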
Is there any practical reason to prefer one over the other?
There are no strong reasons to prefer one way or another given that both have equivalent behaviour.
Using the less-than operator works even when the end condition is not one of the values the counter actually takes on, which can happen if the increment is greater than 1 or the initial value is greater than the end condition. As such, it is more general, which means that using it consistently may be the better habit. Consistency is a weak argument, but a weak argument is better than no argument if you agree with it.
There is no difference in performance.
There is a huge difference if you decided to change ++i in such a way that the loop would step over the desired end.
For clarity for other programmers reading your code, use the convention they expect to see, namely <. When I see i != ..., I think "he is skipping over that one value".
There is an obscure bug that < does not catch: suppose you are iterating over all 256 chars using a byte-sized i. The value will overflow and never reach 256, so the loop will never stop. And != won't fix the bug either.
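A sketch of that bug, assuming an 8-bit unsigned char:
for (unsigned char i = 0; i < 256; ++i) { // i wraps from 255 back to 0
    // never terminates: i can never reach 256, and i != 256 is no better
}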
I was told that a while loop was more efficient than a for loop. (c/c++)
This seemed reasonable but I wanted to find a way to prove or disprove it.
I have tried three tests using analogous snippets of code, each containing nothing but a for or while loop with the same output:
Compile time - roughly the same
Run time - Same
Compiled to Intel assembly code and compared - same number of lines and virtually the same code
Should I have tried anything else, or can anyone confirm one way or the other?
All loops follow the same template:
{
// Initialize
LOOP:
if(!(/* Condition */) ) {
goto END;
}
// Loop body
// Loop increment/decrement
goto LOOP;
}
END:
Therefore the two loops are the same:
// A
for(int i=0; i<10; i++) {
// Do stuff
}
// B
int i=0;
while(i < 10) {
// Do stuff
i++;
}
// Or even
int i=0;
while(true) {
if(!(i < 10) ) {
break;
}
// Do stuff
i++;
}
Both are converted to something similar to:
{
int i=0;
LOOP:
if(!(i < 10) ) {
goto END;
}
// Do stuff
i++;
goto LOOP;
}
END:
Unused/unreachable code will be removed from the final executable/library.
Do-while loops skip the first conditional check and are left as an exercise for the reader. :)
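(Spoiling the exercise slightly, a sketch in the same template, with the check moved after the body:)
{
    // Initialize
LOOP:
    // Loop body
    // Loop increment/decrement
    if (/* Condition */) {
        goto LOOP;
    }
}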
Certainly LLVM will convert ALL types of loops to a consistent form (to the extent possible, of course). So as long as you have the same functionality, it doesn't really matter whether you use for, while, do-while or goto to form the loop: if it has the same initialization, exit condition, update statement and body, it will produce the exact same machine code.
This is not terribly hard to do in a compiler if it's done early enough during the optimisation (so the compiler still understands what is actually being written). The purpose of such "make all loops equal" is that you then only need one way to optimise loops, rather than having one for while-loops, one for for-loops, one for do-while loops and one for "any other loops".
It's not guaranteed for ALL compilers, but I know that gcc/g++ will also generate nearly identical code whatever loop construct you use, and from what I've seen Microsoft also does the same.
C and C++ compilers actually convert high-level C or C++ code to assembly, and in assembly there are no while or for loops; we can only check a condition and jump to another location.
So the performance of a for or while loop depends heavily on how well the compiler can optimize the code.
This is a good paper on code optimization:
http://www.linux-kongress.org/2009/slides/compiler_survey_felix_von_leitner.pdf.
I was trying to compile the following code:
#pragma omp parallel shared (j)
{
#pragma omp for schedule(dynamic)
for(i = 0; i != j; i++)
{
// do something
}
}
but I got the following error: error: invalid controlling predicate.
The OpenMP standard states that for the parallel for construct it "only" allows one of the following operators: <, <=, >, >=.
I do not understand the rationale for not allowing i != j. I could understand, in the case of the static schedule, since the compiler needs to pre-compute the number of iterations assigned to each thread. But I can't understand why this limitation in such case for example. Any clues?
EDIT: the same error occurs even with a constant bound, e.g. for(i = 0; i != 100; i++), although there I could just have used "<" or "<=".
I sent an email to OpenMP developers about this subject, the answer I got:
For signed int, the wrap-around behavior is undefined. If we allowed !=, programmers might get an unexpected trip count. The problem is whether the compiler can generate code to compute a trip count for the loop.
For a simple loop, like:
for( i = 0; i < n; ++i )
the compiler can determine that there are 'n' iterations, if n>=0, and zero iterations if n < 0.
For a loop like:
for( i = 0; i != n; ++i )
again, a compiler should be able to determine that there are 'n' iterations, if n >= 0; if n < 0, we don't know how many iterations it has.
For a loop like:
for( i = 0; i < n; i += 2 )
the compiler can generate code to compute the trip count (loop iteration count) as floor((n+1)/2) if n >= 0, and 0 if n < 0.
For a loop like:
for( i = 0; i != n; i += 2 )
the compiler can't determine whether 'i' will ever hit 'n'. What if 'n' is an odd number?
For a loop like:
for( i = 0; i < n; i += k )
the compiler can generate code to compute the trip count as floor((n+k-1)/k) if n >= 0, and 0 if n < 0, because the compiler knows that the loop must count up; in this case, if k < 0, it's not a legal OpenMP program.
For a loop like:
for( i = 0; i != n; i += k )
the compiler doesn't even know if i is counting up or down. It doesn't know if 'i' will ever hit 'n'. It may be an infinite loop.
Credits: OpenMP ARB
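As an illustrative sketch of my own (not part of the ARB reply), the trip count for a counting-up loop with a positive step k could be precomputed like this:
// Trip count of "for (i = 0; i < n; i += k)" with k > 0:
// floor((n + k - 1) / k) iterations if n > 0, otherwise 0.
long trip_count(long n, long k) {
    if (n <= 0) return 0;    // loop body is never entered
    return (n + k - 1) / k;  // integer division rounds down
}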
Contrary to what it may look like, schedule(dynamic) does not work with a dynamic number of elements. Rather, the assignment of iteration blocks to threads is what is dynamic. With static scheduling this assignment is precomputed at the beginning of the worksharing construct. With dynamic scheduling, iteration blocks are handed out to threads on a first-come, first-served basis.
The OpenMP standard is pretty clear that the number of iterations is precomputed when the worksharing construct is encountered, hence the loop counter may not be modified inside the body of the loop (OpenMP 3.1 specification, §2.5.1 - Loop Construct):
The iteration count for each associated loop is computed before entry to the outermost
loop. If execution of any associated loop changes any of the values used to compute any
of the iteration counts, then the behavior is unspecified.
The integer type (or kind, for Fortran) used to compute the iteration count for the
collapsed loop is implementation defined.
A worksharing loop has logical iterations numbered 0,1,...,N-1 where N is the number of
loop iterations, and the logical numbering denotes the sequence in which the iterations
would be executed if the associated loop(s) were executed by a single thread. The
schedule clause specifies how iterations of the associated loops are divided into
contiguous non-empty subsets, called chunks, and how these chunks are distributed
among threads of the team. Each thread executes its assigned chunk(s) in the context of
its implicit task. The chunk_size expression is evaluated using the original list items of any variables that are made private in the loop construct. It is unspecified whether, in what order, or how many times, any side-effects of the evaluation of this expression occur. The use of a variable in a schedule clause expression of a loop construct causes an implicit reference to the variable in all enclosing constructs.
The rationale behind this relational operator restriction is quite simple: it gives a clear indication of the direction of the loop, it allows easy computation of the number of iterations, and it gives the OpenMP worksharing directive similar semantics in C/C++ and Fortran. Also, other relational operators would require close inspection of the loop body to understand how the loop progresses, which would be unacceptable in many cases and would make the implementation cumbersome.
OpenMP 3.0 introduced the explicit task construct, which allows for parallelisation of loops with an unknown number of iterations. There is a catch, though: tasks introduce significant overhead, and one task per loop iteration only makes sense if these iterations take quite some time to execute. Otherwise the overhead would dominate the execution time.
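A hedged sketch of that task-based approach (node, head and process are hypothetical names); note that the trip count need not be known up front:
#pragma omp parallel
{
    #pragma omp single
    for (node* p = head; p != nullptr; p = p->next) {
        #pragma omp task firstprivate(p)
        process(p); // each iteration becomes one task
    }
}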
The answer is simple.
OpenMP does not allow premature termination of a team of threads.
With == or !=, OpenMP has no way of determining when the loop stops.
1. One or more threads could hit the termination condition, which might not be unique.
2. OpenMP has no way to shut down the other threads that might never detect the condition.
If I were to see the statement
for(i = 0; i != j; i++)
used instead of the statement
for(i = 0; i < j; i++)
I would be left wondering why the programmer had made that choice, never mind that it can mean the same thing. It may be that OpenMP is making a hard syntactic choice in order to force a certain clarity of code.
Here's code which raises challenges for the use of != and may help explain why it isn't allowed.
#include <cstdio>
int main(){
int j=10;
#pragma omp parallel for
for(int i = 0; i < j; i++){
printf("%d\n",i++);
}
}
Notice that i is incremented both in the for statement and within the loop itself, leading to the possibility (but not the guarantee) of an infinite loop.
If the predicate is < then the loop's behavior can still be well-defined in a parallel context without the compiler having to check within the loop for changes to i and determining how those changes will affect the loop's bounds.
If the predicate is != then the loop's behavior is no longer well-defined and it may be infinite in extent, preventing easy parallel subdivision.
I think there is perhaps no good reason other than having extended existing functionality to get this far.
IIRC originally these had to be static so that it could determine at compile time how to generate the loop code... it could just be a hangover from that.
If I have myVector, which is an STL vector, and execute a loop like this:
for(int i=0;i<myVector.size();++i) { ... }
does the C++ compiler play some trick to call size() only once, or will it be called size()+1 times?
I am a little confused; can anyone help?
Logically, myVector.size() will be called each time the loop is iterated - or at least the compiler must produce code as if it's called each time.
If the optimizer can determine that the size of the vector will not change in the body of the loop, it could hoist the call to size() outside the loop. Note that usually, vector::size() is an inline that's just a simple difference between pointers to the end and beginning of the vector (or something similar - maybe a simple load of a member that keeps track of the number of elements).
So there's actually probably little reason for concern about what happens for vector::size().
Note that list::size() could be a different story - the C++03 standard permits it to be linear complexity (though I think this is rare, and the C++0x standard changes list::size() requirements to be constant complexity).
I'm assuming that the vector doesn't change size in the loop. If it does change size, it's impossible to tell without knowing how it changes size.
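To make the "simple difference between pointers" concrete, here is a toy sketch (not actual library source) of a typical inline size():
#include <cstddef>

// Toy model: size() is just a pointer difference, which the
// optimizer can keep in a register once the call is inlined.
struct toy_vector {
    int* begin_;
    int* end_;
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
};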
On the C++ abstract machine it will be called exactly size()+1 times. And a concrete implementation will have observable behaviour equivalent to it having been called size()+1 times (this is called the as-if rule).
This means that the compiler can choose to call it just once, because the observable behaviour is the same. In fact, by following the as if rule, if the body of the loop is empty, the compiler can even choose to not call it at all and just skip the whole thing altogether. The observable behaviour is the same, because making your code run faster is not considered different observable behaviour.
It will be called size + 1 times. Changing the size of the vector will affect the number of iterations.
It may be called once, may be called size+1 times, or it may never be called at all. Assuming that the vector size doesn't change, your program will behave as if it had been called size+1 times.
There are two optimizations at play here: first, std::vector::size() is probably inlined, so it may never be "called" at all in the traditional sense. Second, the compiler may determine that it needs to evaluate size() only once, or perhaps never:
For example, this code might never evaluate std::vector::size():
for(int i = 0; i < myVector.size(); ++i) { ; }
Either of these loops might evaluate std::vector::size() only once:
for(int i = 0; i < myVector.size(); ++i) { std::cout << "Hello, world.\n"; }
for(int i = 0; i < myVector.size(); ++i) { sum += myVector[i]; }
While this loop might evaluate std::vector::size() many times:
for(int i = 0; i < myVector.size(); ++i) { ExternalFunction(&myVector); }
In the final analysis, the key questions are:
Why do you care?, and
How would you know?
Why do you care how many times size() is invoked? Are you trying to make your program go faster?
How would you even know? Since size() has no visible side effects, how would you even know how many times it was called (or otherwise evaluated)?
It will be called size() + 1 times (it may be that the compiler can recognize it as invariant in the loop, but you shouldn't count on it)
It will be evaluated until the condition becomes false (size() could change each time, for example). If size() remains constant, that's size() + 1 times.
From the MSDN page about for:
for ( init-expression ; cond-expression ; loop-expression )
statement
cond-expression
An expression that evaluates to an integral type or a class type that has an unambiguous conversion to an integral type. Evaluated before execution of each iteration of statement, including the first iteration; statement is executed only if cond-expression evaluates to true (nonzero). Normally used to test for loop-termination criteria.
It will be called size + 1 times, as Ernest has mentioned. However, if you are sure the size is not changing, you can apply an optimisation and make your code look like this:
for (unsigned int i = 0, e = myVector.size (); i < e; ++i)
... in which case size () will be called only once.
Actually myVector.size() will be inlined, so there will be no call at all - just a comparison of a register value with a memory location. Of course, I am talking about a release build.
In a debug build it will be called size()+1 times.
EDIT: No doubt there exists some compiler or STL implementation that cannot optimize myVector.size(), but the chances of encountering one are very low.
I would like to increment two variables in a for-loop condition instead of one.
So something like:
for (int i = 0; i != 5; ++i and ++j)
do_something(i, j);
What is the syntax for this?
A common idiom is to use the comma operator which evaluates both operands, and returns the second operand. Thus:
for(int i = 0; i != 5; ++i,++j)
do_something(i,j);
But is it really a comma operator?
Having written that, a commenter suggested it was actually some special syntactic sugar in the for statement, and not a comma operator at all. I checked that in GCC as follows:
int i=0;
int a=5;
int x=0;
for(i; i<5; x=i++,a++){
printf("i=%d a=%d x=%d\n",i,a,x);
}
I was expecting x to pick up the original value of a, so it should have displayed 5, 6, 7... for x. What I got was this:
i=0 a=5 x=0
i=1 a=6 x=0
i=2 a=7 x=1
i=3 a=8 x=2
i=4 a=9 x=3
However, if I parenthesized the expression to force the parser into really seeing a comma operator, I got this:
int main(){
int i=0;
int a=5;
int x=0;
for(i=0; i<5; x=(i++,a++)){
printf("i=%d a=%d x=%d\n",i,a,x);
}
}
i=0 a=5 x=0
i=1 a=6 x=5
i=2 a=7 x=6
i=3 a=8 x=7
i=4 a=9 x=8
Initially I thought this showed it wasn't behaving as a comma operator at all, but as it turns out, it is simply a precedence issue: the comma operator has the lowest possible precedence, so the expression x=i++,a++ is effectively parsed as (x=i++),a++.
Thanks for all the comments, it was an interesting learning experience, and I've been using C for many years!
Try this
for(int i = 0; i != 5; ++i, ++j)
do_something(i,j);
Try not to do it!
From http://www.research.att.com/~bs/JSF-AV-rules.pdf:
AV Rule 199
The increment expression in a for loop will perform no action other than to change a single
loop parameter to the next value for the loop.
Rationale: Readability.
for (int i = 0; i != 5; ++i, ++j)
do_something(i, j);
I came here to remind myself how to code a second index into the increment clause of a FOR loop. I knew it could be done, mainly from observing it in a sample that I incorporated into another project, which was written in C++.
Today I am working in C#, but I felt sure it would obey the same rules in this regard, since the FOR statement is one of the oldest control structures in all of programming. Thankfully, I had recently spent several days precisely documenting the behavior of a FOR loop in one of my older C programs, and I quickly realized that those studies held lessons that applied to today's C# problem, in particular to the behavior of the second index variable.
For the unwary, following is a summary of my observations. Everything I saw happening today, by carefully observing variables in the Locals window, confirmed my expectation that a C# FOR statement behaves exactly like a C or C++ FOR statement.
The first time a FOR loop executes, the increment clause (the third of its three clauses) is skipped. In Visual C and C++, the increment is generated as three machine instructions in the middle of the block that implements the loop, so that the initial pass runs the initialization code once only, then jumps over the increment block to execute the termination test. This implements the feature that a FOR loop executes zero or more times, depending on the state of its index and limit variables.
If the body of the loop executes, its last statement is a jump to the first of the three increment instructions that were skipped by the first iteration. After these execute, control falls naturally into the limit test code that implements the middle clause. The outcome of that test determines whether the body of the FOR loop executes, or whether control transfers to the next instruction past the jump at the bottom of its scope.
Since control transfers from the bottom of the FOR loop block to the increment block, the index variable is incremented before the test is executed. Not only does this behavior explain why you must code your limit clauses the way you learned, it also affects any secondary increment that you add via the comma operator, because it becomes part of the third clause. Hence, the secondary variable is not changed before the first iteration, but it is changed one final time after the last iteration whose body executes.
If either of your index variables remains in scope when the loop ends, the true index variable's value will be one step past the threshold that stopped the loop. Likewise, if, for example, the second variable is initialized to zero before the loop is entered, its value at the end will be the iteration count, assuming that it is incremented (++), not decremented, and that nothing in the body of the loop changes its value.
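A small C++ sketch of those end-of-loop values (the bound 5 is arbitrary):
int j = 0;
int i;
for (i = 0; i < 5; ++i, ++j) {
    // body runs five times
}
// Here i == 5 (one step past the threshold) and j == 5 (the iteration count).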
I agree with squelart. Incrementing two variables is bug prone, especially if you only test for one of them.
This is the readable way to do this:
int j = 0;
for(int i = 0; i < 5; ++i) {
do_something(i, j);
++j;
}
For loops are meant for cases where your loop runs on one increasing/decreasing variable. For any other variable, change it in the loop.
If you need j to be tied to i, why not leave the original variable as is and add i?
for(int i = 0; i < 5; ++i) {
do_something(i,a+i);
}
If your logic is more complex (for example, you need to actually monitor more than one variable), I'd use a while loop.
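For instance, a minimal while-loop sketch where each variable follows its own rule (limit and the update steps are illustrative):
int i = 0, j = 0;
while (i < 5 && j < limit) {
    do_something(i, j);
    ++i;
    j += 2; // j advances independently of i
}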
#include <stdio.h>

int main(){
    int i = 0;
    int a = 0;
    for(; i < 5; i++, a++){ // both counters advance together
        printf("%d %d\n", a, i);
    }
}
Use Maths. If the two operations mathematically depend on the loop iteration, why not do the math?
int i, j; // these already hold some meaningful values
for( int counter = 0; counter < count_max; ++counter )
do_something (counter+i, counter+j);
Or, more specifically referring to the OP's example:
for(int i = 0; i != 5; ++i)
do_something(i, j+i);
Especially if you're passing by value into a function, this should get you exactly what you want.