Hay Dear!
i know that if statement is an expensive statement in c++. I remember that once my teacher said that if statement is an expensive statement in the sense of computer time.
Now we can do every thing by using if statement in c++ so this is very powerful statement in programming perspective but its expensive in computer time perspective.
i am a beginner and i am studying data structure course after introduction to c++ course.
my Question to you is
Is it better for me to use if statement extensivly?
If statements are compiled into a conditional branch. This means the processor must jump (or not) to another line of code, depending on a condition. In a simple processor, this can cause a pipeline stall, which in layman's terms means the processor has to throw away work it did early, which wastes time on the assembly line. However, modern processors use branch prediction to avoid stalls, so if statements become less costly.
In summary, yes they can be expensive. No, you generally shouldn't worry about it. But Mykola brings up a separate (though equally valid) point. Polymorphic code is often preferable (for maintainability) to if or case statements
I'm not sure how you can generalise that the if statement is expensive.
If you have
if ( true ) { ... }
then this the if will most likele be optimised away by your compiler.
If, on the other hand, you have..
if ( veryConvolutedMethodTogGetAnswer() ) { .. }
and the method veryConvolutedMethodTogGetAnswer() does lots of work then, you could argue tha this is an expensive if statement but not because of the if, but because of the work you're doing in the decision making process.
"if"'s themselves are not usually "expensive" in terms of clock cycles.
Premature optimization is a bad idea. Use if statements where they make sense. When you discover a part of your code where its performance needs improvement, then possibly work on removing if statements from that part of your code.
If statements can be expensive because they force the compiler to generate branch instructions. If you can figure out a way to code the same logic in such a way that the compiler does not have to branch at all the code will likely be a lot faster, even if there are more total instructions. I remember being incredibly surprised at how recoding a short snippet of code to use various bit manipulations rather than doing any branching sped it up by a factor of 10-20%.
But that is not a reason to avoid them in general. It's just something to keep in mind when you're trying to wring the last bit of speed out of a section of code you know is performance critical because you've already run a profiler and various other tools to prove it to yourself.
Another reason if statements can be expensive is because they can increase the complexity of your program which makes it harder to read and maintain. Keep the complexity of your individual functions low. Don't use too many statements that create different control paths through your function.
I would say a lot of if statement is expensive from the maintainability perspective.
An if-statement implies a conditional branch which might be a bit more expensive that a code that doesn't branch.
As an example, counting how many times a condition is true (e.g how many numbers in a vector are greater than 10000):
for (std::vector<int>::const_iterator it = v.begin(), end = v.end(); it != end; ++it) {
//if (*it > 10000) ++count;
count += *it > 10000;
}
The version which simply adds 1 or 0 to the running total may be a tiny amount faster (I tried with 100 million numbers before I could discern a difference).
However, with MinGW 3.4.5, using a dedicated standard algorithm turns out to be noticeably faster:
count = std::count_if(v.begin(), v.end(), std::bind2nd(std::greater<int>(), 10000));
So the lesson is that before starting to optimize prematurely, using some tricks you've learnt off the internets, you might try out recommended practices for the language. (And naturally make sure first, that that part of the program is unreasonably slow in the first place.)
Another place where you can often avoid evaluating complicated conditions is using look-up tables (a rule of thumb: algorithms can often be made faster if you let them use more memory). For example, counting vowels (aeiou) in a word-list, where you can avoid branching and evaluating multiple conditions:
unsigned char letters[256] = {0};
letters['a'] = letters['e'] = letters['i'] = letters['o'] = letters['u'] = 1;
for (std::vector<std::string>::const_iterator it = words.begin(), end = words.end(); it != end; ++it) {
for (std::string::const_iterator w_it = it->begin(), w_end = it->end(); w_it != w_end; ++w_it) {
unsigned char c = *w_it;
/*if (c == 'e' || c == 'a' || c == 'i' || c == 'o' || c == 'u') {
++count;
}*/
count += letters[c];
}
}
You should write your code to be correct, easy to understand, and easy to maintain. If that means using if statements, use them! I would find it hard to believe that someone suggested you to not use the if statement.
Maybe your instructor meant that you should avoid something like this:
if (i == 0) {
...
} else if (i == 1) {
...
} else if (i == 2) {
...
} ...
In that case, it might be more logical to rethink your data structure and/or algorithm, or at the very least, use switch/case:
switch (i) {
case 1: ...; break;
case 2: ...; break;
...;
default: ...; break;
}
But even then, the above is better more because of improved readability rather than efficiency. If you really need efficiency, things such as eliminating if conditions are probably a bad way to start. You should profile your code instead, and find out where the bottleneck is.
Short answer: use if if and only if it makes sense!
In terms of computer time the "if" statement by itself is one of the cheapest statements there is.
Just don't put twenty of them in a row when there is a better way like a switch or a hash table, and you'll do fine.
You can use Switch in stead which makes the more readable, but I don't if it is any faster. If you have something like :
if (condition1) {
// do something
} else if (condition2) {
// do something
} else if (condition3) {
// do something
} else if (condition4) {
// do something
}
I am not what can be done to speed it up. if condition4 is occurs more frequently, you might move it to the top.
In a data structures course, the performance of an if statement doesn't matter. The small difference between an if statement and any of the obscure alternatives is totally swamped by the difference between data structures. For instance, in the following pseudocode
FOR EACH i IN container
IF i < 100
i = 100
container.NEXT(i)
END FOR
the performance is most determined by container.NEXT(i); this is far more expensive for linked lists then it is for contiguous arrays. For linked lists this takes an extra memory access, which depending on the cache(s) may take somewhere between 2.5 ns and 250 ns. The cost of the if statement would be measured in fractions of a nanosecond.
I did confront with performance issues due to to many if statements called inside loops in batch scripts, so instead I used integer math to emulate if statement, and it dramatically improved the performance.
if [%var1%] gtr [1543]
set var=1
else
set var=0
equivalent to
set /a var=%var1%/1543
I even used much more longer expressions, with many / and % operations, and it still was preferable over an if statement.
I know this is not C++, but I guess is the same principle. So whenever you need performance, then avoid conditional statements as much as you can.
Related
I have done my best and read a lot of Q&As on SO.SE, but I haven't found an answer to my particular question. Most for-loop and break related question refer to nested loops, while I am concerned with performance.
I want to know if using a break inside a for-loop has an impact on the performance of my C++ code (assuming the break gets almost never called). And if it has, I would also like to know tentatively how big the penalization is.
I am quite suspicions that it does indeed impact performance (although I do not know how much). So I wanted to ask you. My reasoning goes as follows:
Independently of the extra code for the conditional statements that
trigger the break (like an if), it necessarily ads additional
instructions to my loop.
Further, it probably also messes around when my compiler tries to
unfold the for-loop, as it no longer knows the number of iterations
that will run at compile time, effectively rendering it into a
while-loop.
Therefore, I suspect it does have a performance impact, which could be
considerable for very fast and tight loops.
So this takes me to a follow-up question. Is a for-loop & break performance-wise equal to a while-loop? Like in the following snippet, where we assume that checkCondition() evaluates 99.9% of the time as true. Do I loose the performance advantage of the for-loop?
// USING WHILE
int i = 100;
while( i-- && checkCondition())
{
// do stuff
}
// USING FOR
for(int i=100; i; --i)
{
if(checkCondition()) {
// do stuff
} else {
break;
}
}
I have tried it on my computer, but I get the same execution time. And being wary of the compiler and its optimization voodoo, I wanted to know the conceptual answer.
EDIT:
Note that I have measured the execution time of both versions in my complete code, without any real difference. Also, I do not trust compiling with -s (which I usually do) for this matter, as I am not interested in the particular result of my compiler. I am rather interested in the concept itself (in an academic sense) as I am not sure if I got this completely right :)
The principal answer is to avoid spending time on similar micro optimizations until you have verified that such condition evaluation is a bottleneck.
The real answer is that CPU have powerful branch prediction circuits which empirically work really well.
What will happen is that your CPU will choose if the branch is going to be taken or not and execute the code as if the if condition is not even present. Of course this relies on multiple assumptions, like not having side effects on the condition calculation (so that part of the body loop depends on it) and that that condition will always evaluate to false up to a certain point in which it will become true and stop the loop.
Some compilers also allow you to specify the likeliness of an evaluation as a hint the branch predictor.
If you want to see the semantic difference between the two code versions just compile them with -S and examinate the generated asm code, there's no other magic way to do it.
The only sensible answer to "what is the performance impact of ...", is "measure it". There are very few generic answers.
In the particular case you show, it would be rather surprising if an optimising compiler generated significantly different code for the two examples. On the other hand, I can believe that a loop like:
unsigned sum = 0;
unsigned stop = -1;
for (int i = 0; i<32; i++)
{
stop &= checkcondition(); // returns 0 or all-bits-set;
sum += (stop & x[i]);
}
might be faster than:
unsigned sum = 0;
for (int i = 0; i<32; i++)
{
if (!checkcondition())
break;
sum += x[i];
}
for a particular compiler, for a particular platform, with the right optimization levels set, and for a particular pattern of "checkcondition" results.
... but the only way to tell would be to measure.
I was wondering, if we have if-else condition, then what is computationally more efficient to check: using the equal to operator or the not equal to operator? Is there any difference at all?
E.g., which one of the following is computationally efficient, both cases below will do same thing, but which one is better (if there's any difference)?
Case1:
if (a == x)
{
// execute Set1 of statements
}
else
{
// execute Set2 of statements
}
Case 2:
if (a != x)
{
// execute Set2 of statements
}
else
{
// execute Set1 of statements
}
Here assumptions are most of the time (say 90% of the cases) a will be equal to x. a and x both are of unsigned integer type.
Generally it shouldn't matter for performance which operator you use. However it is recommended for branching that the most likely outcome of the if-statement comes first.
Usually what you should consider is; what is the simplest and clearest way to write this code? IMHO, the first, positive is the simplest (not requiring a !)
In terms of performance there is no differences as the code is likely to compile to the same thing. (Certainly in the JIT for Java it should)
For Java, the JIT can optimise the code so the most common branch is preferred by the branch prediction.
In this simple case, it makes no difference. (assuming a and x are basic types) If they're class-types with overloaded operator == or operator != they might be different, but I wouldn't worry about it.
For subsequent loops:
if ( c1 ) { }
else if ( c2 ) { }
else ...
the most likely condition should be put first, to prevent useless evaluations of the others. (again, not applicable here since you only have one else).
GCC provides a way to inform the compiler about the likely outcome of an expression:
if (__builtin_expect(expression, 1))
…
This built-in evaluates to the value of expression, but it informs the compiler that the likely result is 1 (true for Booleans). To use this, you should write expression as clearly as possible (for humans), then set the second parameter to whichever value is most likely to be the result.
There is no difference.
The x86 CPU architecture has two opcodes for conditional jumps
JNE (jump if not equal)
JE (jump if equal)
Usually they both take the same number of CPU cycles.
And even when they wouldn't, you could expect the compiler to do such trivial optimizations for you. Write what's most readable and what makes your intention more clear instead of worrying about microseconds.
If you ever manage to write a piece of Java code that can be proven to be significantly more efficient one way than the other, you should publish your result and raise an issue against whatever implementation you observed the difference on.
More to the point, just asking this kind of question should be a sign of something amiss: it is an indication that you are focusing your attention and efforts on a wrong aspect of your code. Real-life application performance always suffers from inadequate architecture; never from concerns such as this.
Early optimization is the root of all evil
Even for branch prediction, I think you should not care too much about this, until it is really necessary.
Just as Peter said, use the simplest way.
Let the compiler/optimizer do its work.
It's a general rule of thumb (most nowadays) that the source code should express your intention in the most readable way. You are writing it to another human (and not to the computer), the one year later yourself or your team mate who will need to understand your code with the less effort.
It shouldn't make any difference performance wise but you consider what is easiest to read. Then when you are looking back on your code or if someone is looking at it, you want it to be easy to understand.
it has a little advantage (from point of readability) if the first condition is the one that is true in most cases.
Write the conditions that way that you can read them best. You will not benefit from speed by negating a condition
Most processors use an electrical gate for equality/inequality checks, this means all bits are checked at once. Therefore it should make no difference, but you want to truly optimise your code it is always better to benchmark things yourself and check the results.
If you are wondering whether it's worth it to optimise like that, imagine you would have this check multiple times for every pixel in your screen, or scenarios like that. Imho, it is alwasy worth it to optimise, even if it's only to teach yourself good habits ;)
Only the non-negative approach which you have used at the first seems to be the best .
The only way to know for sure is to code up both versions and measure their performance. If the difference is only a percent or so, use the version that more clearly conveys the intent.
It's very unlikely that you're going to see a significant difference between the two.
Performance difference between them is negligible. So, just think about readability of the code. For readability I prefer the one which has a more lines of code in the If statement.
if (a == x) {
// x lines of code
} else {
// y lines of code where y < x
}
while(some_condition){
if(FIRST)
{
do_this;
}
else
{
do_that;
}
}
In my program the possibility of if(FIRST) succeeding is about 1 in 10000. Can there be any alternative in C/C++ such that we can avoid checking the condition on every iteration inside the while loop with the hope of seeing a better performance in this case.
Ok! Let me put in some more detail.
i am writing a code for a signal acquisiton and tracking scheme where the state of my system will remain in TRACKING mode more often that ACQUISITION mode.
while(signal_present)
{
if(ACQUISITION_SUCCEEDED)
{
do_tracking(); // this functions can change the state from TRACKING to ACQUISITION
}
else
{
do_acquisition(); // this function can change the state from ACQUISITION to TRACKING
}
}
So what happens here is that the system usually remains in tracking mode but it can enter acquisition mode when tracking fails but is not a common occurrence.( Assume the incoming data to be infinite in number. )
The performance cost of a single branch is not going to be a big deal. The only thing you really can do is put the most likely code first, save on some instruction cache. Maybe. This is really deep into micro-optimization.
There is no particularly good reason to try to optimize this. Almost all modern architectures incorporate branch predictors. These speculate that a branch (an if or else) will be taken essentially the way it has been in the past. In your case, the speculation will always succeed, eliminating all overhead. There are non-portable ways to hint that a condition is taken one way or another, but any branch predictor will work just as well.
One thing you might want to do to improve instruction-cache locality is to move do_that out of the while loop (unless it is a function call).
The GCC has a __builtin_expect “function” that you can use to indicate to the compiler which branch will likely be taken. You could use it like this:
if(__builtin_expect(FIRST, 1)) …
Is this useful? I have no idea. I have never used it, never seen it used (except allegedly in the Linux kernel). The GCC documentation actually discourages its usage in favour of using profiling information to achieve a more reliable metric.
On recent x86 processor systems, final execution speed will barely rely on source code implementation.
You can have a look at this page http://igoro.com/archive/fast-and-slow-if-statements-branch-prediction-in-modern-processors/ to see amount the optimization that occurs inside the processor.
If this test is really consuming significant time compared to the implementation of do_aquisition, then you might get a boost by having a function table:
typedef void (*trackfunc)(void);
trackfunc tracking_action[] = {do_acquisition, do_tracking};
while (signal_present)
{
tracking_action[ACQUISITION_STATE]();
}
The effects of these kinds of manual optimizations are very dependent on the platform, the compiler, and the optimization settings.
You will most likely get a much greater performance gain by spending your time measuring and tuning the do_aquisition and do_tracking algorithms.
If you don't know when "FIRST" will be true, then no.
The issue is whether FIRST is time consuming or not; maybe you could evaluate FIRST before the loop (or part of it) and just test the boolean.
I'd change moonshadow's code a little bit to
while( some_condition )
{
do_that;
if( FIRST )
{
do_this; // overwrite what you did earlier.
}
}
Based on your new information, I'd say something like the following:
while(some_condition)
{
while(ACQUISITION_SUCCEEDED)
{
do_tracking();
}
if (some_condition)
while(!ACQUISITION_SUCCEEDED)
{
do_acquisition();
}
}
The point is that the ACQUISITION_SUCCEEDED state must include the some_condition information to a certain extent (i.e. it will break out of the inner loops if some_condition is false - hence there is a chance to break out of the outer loop)
This is a classic in optimization. You should avoid putting conditionals within loops if you can. This code:
while(...)
{
if( a )
{
foo();
}
else
{
bar();
}
}
is often better to rewrite as:
if( a )
{
while(...)
{
foo();
}
}
else
{
while(...)
{
bar();
}
}
It's not always possible though, and you should always when you try to optimize something measure the performance before and after.
There is not much more useful optimizing you can do with your example.
The call / branch to the do_this and do_that may negate any savings you earned by optimizing an if-then-else statement.
One of the rules of performance optimizing is to reduce branches. Most processors prefer to execute sequential code. They can take a chunk of sequential code and haul it into their caches. Branching interrupts this pleasantry and may cause a complete reload of the instruction cache (which loses valuable execution time).
Before you micro-optimize at this level, review your design to see if you can:
Eliminate unnecessary branching.
Split up code so it fits into the
cache.
Organize the data to reduce fetches
from memory or hard drive.
I'm sure that the above steps will gain you more performance than optimizing your posted loop.
Or is it all about semantics?
Short answer: no, they are exactly the same.
Guess it could in theory depend on the compiler; a really broken one might do something slightly different but I'd be surprised.
Just for fun here are two variants that compile down to exactly the same assembly code for me using x86 gcc version 4.3.3 as shipped with Ubuntu. You can check the assembly produced on the final binary with objdump on linux.
int main()
{
#if 1
int i = 10;
do { printf("%d\n", i); } while(--i);
#else
int i = 10;
for (; i; --i) printf("%d\n", i);
#endif
}
EDIT: Here is an "oranges with oranges" while loop example that also compiles down to the same thing:
while(i) { printf("%d\n", i); --i; }
If your for and while loops do the same things, the machine code generated by the compiler should be (nearly) the same.
For instance in some testing I did a few years ago,
for (int i = 0; i < 10; i++)
{
...
}
and
int i = 0;
do
{
...
i++;
}
while (i < 10);
would generate exactly the same code, or (and Neil pointed out in the comments) with one extra jmp, which won't make a big enough difference in performance to worry about.
There is no semantic difference, there need not be any compiled difference. But it depends on the compiler. So I tried with with g++ 4.3.2, CC 5.5, and xlc6.
g++, CC were identical, xlc WAS NOT
The difference in xlc was in the initial loop entry.
extern int doit( int );
void loop1( ) {
for ( int ii = 0; ii < 10; ii++ ) {
doit( ii );
}
}
void loop2() {
int ii = 0;
while ( ii < 10 ) {
doit( ii );
ii++;
}
}
XLC OUTPUT
.loop2: # 0x00000000 (H.10.NO_SYMBOL)
mfspr r0,LR
stu SP,-80(SP)
st r0,88(SP)
cal r3,0(r0)
st r3,64(SP)
l r3,64(SP) ### DIFFERENCE ###
cmpi 0,r3,10
bc BO_IF_NOT,CR0_LT,__L40
...
enter code here
.loop1: # 0x0000006c (H.10.NO_SYMBOL+0x6c)
mfspr r0,LR
stu SP,-80(SP)
st r0,88(SP)
cal r3,0(r0)
cmpi 0,r3,10 ### DIFFERENCE ###
st r3,64(SP)
bc BO_IF_NOT,CR0_LT,__La8
...
The scope of the variable in the test of the while loop is wider than the scope of variables declared in the header of the for loop.
Therefore, if there are performance implications as a side-effect of keeping a variable alive longer, then there will be performance implications in choosing between a while and a for loop ( and not wrapping the while up in {} to reduce the scope of its variables ).
An example might be a concurrent collection which counts the number of iterators referring to it, and if more than one iterator exists, it applies locking to prevent concurrent modification, but as an optimisation elides the locking if only one iterator refers to it. If you then had two for loops in a function using differently named iterators on the same container, the fast path would be taken, but with two while loops the slow path would be taken. Similarly there may be performance implications if the objects are large (more cache traffic), or use system resources. But I can't think of a real example that I've ever seen where it would make a difference.
Compilers that optimize using loop unrolling will probably only do so in the for-loop case.
Both are equivalent. It's a matter of semantics.
The only difference may lie in the do... while construct, where you postpone the evaluation of the condition until after the body, and thus may save 1 evaluation.
i = 1; do { ... i--; } while( i > 0 );
as opposed to
for( i = 1; i > 0; --i )
{ ....
}
I write compilers. We compile all "structured" control flow (if, while, for, switch, do...while) into conditional and unconditional branches. Then we analyze the control-flow graph. Since a C compiler has to deal with general goto anyway, it is easiest to reduce everything to branch and conditional-branch instructions, then be sure to handle that case well. (A C compiler has to do a good job not just on handwritten code but also on automatically generated code, which may have many, many goto statements.)
No. If they're doing equivalent things, they'll compile to the same code - as you say, it's about semantics. Choose the one that best represents what you're trying to express.
Ideally it should be the same, but eventually it depends on your compiler/interpreter. To be sure, you must measure or examine the generated assembly code.
Proof that there may be a difference: These lines produce different assembly code using cc65.
for (; i < 1000; ++i);
while (i < 1000) ++i;
On Atmel ATMega while() is faster than for(). Why is this is explained in AVR035: Efficient C Coding for AVR.
P.S. Original platform was not mentioned in question.
continue behaves differently in for and while: in for, it alters the counter, in while, it usually doesn't
To add another answer: In my experience, optimizing software is like a big, bushy beard being shaved off a man.
First you lop it off in big chunks with scissors (prune whole limbs off the call tree).
Then you make it short with an electric clipper (tweak algorithms).
Finally you shave it with a razor to get rid of the last little bit (low-level optimization).
The last is where the difference between for() and while() might, but probably won't, make a difference.
P.S. The programmers I know (who are all very good, and I suspect are a representative sample) basically go at it from the other direction.
They are the same as far as performance goes. I tend to use while when waiting for a state change (such as waiting for a buffer to be filled) and for when processing a number of discrete objects (such as going through each item in a collection).
There is a difference in some cases.
If you are at the point where that difference matters, you either need to pick a better algorithm or begin coding in assembly language. Trust me, coding in assembly is preferable to fixing your compiler version.
Is while() faster/slower than for()? Let's review a few things about optimization:
Compiler-writers work very hard to shave cycles by having fewer calls to jump, compare, increment, and the other kinds of instructions that they generate.
Call instructions, on the other hand, consume many magnitudes more cycles, but the compiler is nearly powerless to do anything to remove those.
As programmers, we write lots of function calls, some because we mean to, some because we're lazy, and some because the compiler slips them in without being obvious.
Most of the time, it doesn't matter, because the hardware is so fast, and our jobs are so small, that the computer is like a beagle dog who wolfes her food and begs for more.
Sometimes, however, the job is big enough that performance is an issue.
What do we do then? Where's the bigger payoff?
Getting the compiler to shave a few cycles off loops & such?
Finding function calls that don't -really- need to be done so much?
The compiler can't do the latter. Only we the programmers can.
We need to learn or be taught how to do this. It doesn't come naturally.
We are congenitally inclined to make wrong guesses and then bet on them.
Getting better algorithms is a start, but only a start. Our teachers need to teach this, if indeed they know how.
Profilers are a start. I do this.
The apocryphal quote of Willie Sutton when asked Why do you rob banks?:
Because that's where the money is.
If you want to save cycles, find out where they are.
Probably only coding style.
for if you know the number of iterations.
while if you do not know the number of iterations.
Disclaimer: I tried to search for similar question, however this returned about every C++ question... Also I would be grateful to anyone that could suggest a better title.
There are two eminent loop structure in C++: while and for.
I purposefully ignore the do ... while construct, it is kind of unparalleled
I know of std::for_each and BOOST_FOREACH, but not every loop is a for each
Now, I may be a bit tight, but it always itches me to correct code like this:
int i = 0;
while ( i < 5)
{
// do stuff
++i; // I'm kind and use prefix notation... though I usually witness postfix
}
And transform it in:
for (int i = 0; i < 5; ++i)
{
// do stuff
}
The advantages of for in this example are multiple, in my opinion:
Locality: the variable i only lives in the scope of the loop
Pack: the loop 'control' is packed, so with only looking at the loop declaration I can figure if it is correctly formed (and will terminate...), assuming of course that the loop variable is not further modified within the body
It may be inlined, though I would not always advised it (that makes for tricky bugs)
I have a tendency therefore not to use while, except perhaps for the while(true) idiom but that's not something I have used in a while (pun intended). Even for complicated conditions I tend to stick to a for construct, though on multiple lines:
// I am also a fan of typedefs
for (const_iterator it = myVector.begin(), end = myVector.end();
it != end && isValid(*it);
++it)
{ /* do stuff */ }
You could do this with a while, of course, but then (1) and (2) would not be verified.
I would like to avoid 'subjective' remarks (of the kind "I like for/while better") and I am definitely interested to references to existing coding guidelines / coding standards.
EDIT:
I tend to really stick to (1) and (2) as far as possible, (1) because locality is recommended >> C++ Coding Standards: Item 18, and (2) because it makes maintenance easier if I don't have to scan a whole body loop to look for possible alterations of the control variable (which I takes for granted using a for when the 3rd expression references the loop variables).
However, as gf showed below, while do have its use:
while (obj.advance()) {}
Note that this is not a rant against while but rather an attempt to find which one of while or for use depending on the case at hand (and for sound reasons, not merely liking).
Not all loops are for iteration:
while(condition) // read e.g.: while condition holds
{
}
is ok, while this feels forced:
for(;condition;)
{
}
You often see this for any input sources.
You might also have implicit iteration:
while(obj.advance())
{
}
Again, it looks forced with for.
Additionally, when forcing for instead of while, people tend to misuse it:
for(A a(0); foo.valid(); b+=x); // a and b don't relate to loop-control
Functionally, they're the same thing, of course. The only reason to differentiate is to impart some meaning to a maintainer or to some human reader/reviewer of the code.
I think the while idiom is useful for communicating to the reader of the code that a non-linear test is controlling the loop, whereas a for idiom generally implies some kind of sequence. My brain also kind of "expects" that for loops are controlled only by the counting expression section of the for statement arguments, and I'm surprised (and disappointed) when I find someone conditionally messing with the index variable inside the execution block.
You could put it in your coding standard that "for" loops should be used only when the full for loop construct is followed: the index must be initialized in the initializer section, the index must be tested in the loop-test section, and the value of the index must only be altered in the counting expression section. Any code that wants to alter the index in the executing block should use a while construct instead. The rationale would be "you can trust a for loop to execute using only the conditions you can see without having to hunt for hidden statements that alter the index, but you can't assume anything is true in a while loop."
I'm sure there are people who would argue and find plenty of counter examples to demonstrate valid uses of for statements that don't fit my model above. That's fine, but consider that your code can be "surprising" to a maintainer who may not have your insight or brilliance. And surprises are best avoided.
i does not automatically increase within a while loop.
while (i < 5) {
// do something with X
if (X) {
i++;
}
}
One of the most beautiful stuff in C++ is the algorithms part of STL. When reading code written properly using STL, the programmer would be reading high-level loops instead of low-level loops.
I don't believe that compilers can optimize significantly better if you chose to express your loop one way or the other. In the end it all boils down to readability, and that's a somewhat subjective matter (even though most people probably agree on most examples' readability factor).
As you have noticed, a for loop is just a more compact way of saying
type counter = 0;
while ( counter != end_value )
{
// do stuff
++counter;
}
While its syntax is flexible enough to allow you to do other things with it, I try to restrict my usage of for to examples that aren't much more complicated than the above. OTOH, I wouldn't use a while loop for the above.
I tend to use for loops when there is some kind of counting going on and the loop ends when the counting ends. Obviously you have your standard for( i=0; i < maxvalue; i++ ), but also things like for( iterator.first(); !iterator.is_done(); iterator.next() ).
I use while loops when it's not clear how many times the loop might iterate, i.e. "loop until some condition that cannot be pre-computed holds (or fails to hold)".
// I am also a fan of typedefs
for (const_iterator it = myVector.begin(), end = myVector.end();
it != end && isValid(*it);
++it)
{ /* do stuff */ }
It seems to me that the above code, is rather less readable than the code below.
// I am also a fan of typedefs
const_iterator it = myVector.begin();
end = myVector.end();
while(it != end && isValid(*it))
{
/* do stuff */
++it}
Personally, I think legibility trumps these kind of formatting standards. If another programmer can't easily read your code, that leads to mistakes at worst, and at best it results in wasted time which costs the company money.
In Ye Olde C, for and while loops were not the same.
The difference was that in for loops, the compiler was free to assign a variable to a CPU register and reclaim the register after the loop. Thus, code like this had non-defined behaviour:
int i;
for (i = 0; i < N; i++) {
if (f(i)) break;
}
printf("%d", i); /* Non-defined behaviour, unexpected results ! */
I'm not 100% sure, but I believe this is described in K&R
This is fine:
int i = 0;
while (i < N) {
if (f(i)) break;
i++;
}
printf("%d", i);
Of course, this is compiler-dependent. Also, with time, compilers stopped making use of that freedom, so if you run the first code in a modern C compiler, you should get the expected results.
I wouldn't be so quick to throw away do-while loops. They are useful if you know your loop body will run at least once. Consider some code which creates one thread per CPU core. With a for loop it might appear:
for (int i = 0; i < number_of_cores; ++i)
start_thread(i);
Uentering the loop, the first thing that is checked is the condition, in case number_of_cores is 0, in which case the loop body should never run. Hang on, though - this is a totally redundant check! The user will always have at least one core, otherwise how is the current code running? The compiler can't eliminate the first redundant comparison, as far as it knows, number_of_cores could be 0. But the programmer knows better. So, eliminating the first comparison:
int i = 0;
do {
start_thread(i++);
} while (i < number_of_cores);
Now every time this loop is hit there is only one comparison instead of two on a one-core machine (with a for loop the condition is true for i = 0, false for i = 1, whereas the do while is false all the time). The first comparison is omitted, so it's faster. With less potential branching, there is less potential for branch mispredicts, so it is faster. Because the while condition is now more predictable (always false on 1-core), the branch predictor can do a better job, which is faster.
Minor nitpick really, but not something to be thrown away - if you know the body will always run at least once, it's premature pessimization to use a for loop.