How to use [[(un)likely]] with a do-while loop in C++20? - c++

do [[unlikely]]
{...}
while(a == 0);
This code compiles.
But is this the correct way to tell the compiler that a is usually non-zero?

Structurally, this is a correct way to say what you're trying to say. The attribute is placed in a location that tags the path of execution that is likely/unlikely to be executed. Applying it to the block statement of the do/while loop works adequately. It would also work within the block.
That having been said, it's unclear what good this would do practically. It might prevent some unrolling of the loop or inhibit prefetching. But it can't really change the structure of the compiled code, since the block has to be executed at least once and the conditional branch has to come after the block.
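As a rough illustration of the placement discussed above, here is a minimal sketch. The function name and the vector input are made up for the example, and a compiler is free to ignore the hint entirely.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: consumes values until it sees a non-zero one and
// returns how many values it consumed (always at least one).
// Precondition: `values` is not empty.
std::size_t consume_until_nonzero(const std::vector<int>& values) {
    std::size_t i = 0;
    int a = 0;
    do [[unlikely]] {
        // The attribute is attached to the loop body. The intent is to say
        // that going around again is rare, i.e. `a` is usually non-zero.
        a = values[i];
        ++i;
    } while (a == 0 && i < values.size());
    return i;
}
```

The body still runs once on entry no matter what the attribute says; the hint can only influence how the compiler treats the repeat path.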

Related

C++ Writing an Interpreter - determining the loop's target for a break statement c++

I am writing a simple program interpreter in C++. When I am building the internal representation of the program and I get a break statement, how do I determine the enclosing loop's target location?
void Imp::whilestmt()
{
    Expr *pExpr;
    accept(Token::WHILE);
    expr(pExpr);
    WhileStmt *pwhilestmt = new WhileStmt(pExpr,vm.getLocationCounter);
    vm.add(pwhilestmt);
    accept(Token::LOOP);
    stmtlist();
    pwhilestmt->setTarget(vm.getLocationCounter);
    accept(Token::END);
    accept(Token::LOOP);
    vm.add(new EndLoopStmt);
}
My break statement object is going to take the while statement's target as a parameter. How can I determine this?
I'd consider building a kind of execution tree/pipeline. Every LOOP/WHILE would be a new branch (similarly to every function), so when you encounter an END/BREAK instruction you just revert to the branch's origin point and continue down the line.
I think the solution is to add a forward reference that is resolved (by looking up the location of the end of the loop) when all the code for that level of loop has been produced.
In other words, when generating the code for the loop, you need to form a "jump" instruction, which has its target set to somewhere you don't know where it is yet. The solution is to have a jump with an unknown destination (set the "destination" to instruction 0 or -1 or 0xdeaddead or something else that can be easily identified for debugging purposes later - because the best way to avoid getting bugs of "I didn't fix it up properly" is to make it easy to identify those places - bugs only occur in things that are hard to identify, just like it never rains when you carry an umbrella), and keep a fixup list of such jumps until you have generated the entire loop, then work your way through that fixup list, and fill in the relevant address that you now know is "here" (the next instruction after the loop). I suspect you also need something similar for the condition of the loop itself - if that's false, then you need to continue "after" the loop.
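The fixup-list idea above can be sketched roughly like this. Everything here (Op, Instr, CodeGen, kUnresolved and the method names) is hypothetical and not taken from the question's Imp/vm classes; it only shows the bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction set for a toy interpreter.
enum class Op { Nop, Jump, JumpIfFalse };

// Easy-to-spot placeholder for "target not known yet".
constexpr std::size_t kUnresolved = static_cast<std::size_t>(-1);

struct Instr {
    Op op;
    std::size_t target;  // only meaningful for jumps
};

class CodeGen {
public:
    // Index the next emitted instruction will get.
    std::size_t here() const { return code_.size(); }

    std::size_t emit(Op op, std::size_t target = kUnresolved) {
        code_.push_back({op, target});
        return code_.size() - 1;
    }

    // Call when the parser enters a loop: opens a new fixup list.
    void beginLoop() { pendingExits_.emplace_back(); }

    // Emit a jump to "after the loop" (a break, or the failed-condition
    // exit) with a placeholder target, and remember where it is.
    void emitExit(Op op) {
        assert(!pendingExits_.empty() && "exit jump outside of a loop");
        pendingExits_.back().push_back(emit(op));
    }

    // Call once the whole loop has been emitted: patch every recorded jump.
    void endLoop() {
        assert(!pendingExits_.empty());
        for (std::size_t index : pendingExits_.back()) {
            code_[index].target = here();
        }
        pendingExits_.pop_back();
    }

    const std::vector<Instr>& code() const { return code_; }

private:
    std::vector<Instr> code_;
    std::vector<std::vector<std::size_t>> pendingExits_;  // one list per open loop
};
```

Keeping one fixup list per open loop (a stack of lists) means a break inside a nested loop is patched to the end of the innermost loop only.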
I added setTarget as a virtual function of Stmt.
I stored the start location in the part that handles the if statements and then checked if I had any break stmts from the start location to the current location, and if I did I set the target to the current location.
It's a really messy way to do it, but it works for now.

When should I use do-while instead of while loops? [duplicate]

When I was taking CS in college (mid 80's), one of the ideas that was constantly repeated was to always write loops which test at the top (while...) rather than at the bottom (do ... while) of the loop. These notions were often backed up with references to studies which showed that loops which tested at the top were statistically much more likely to be correct than their bottom-testing counterparts.
As a result, I almost always write loops which test at the top. I don't do it if it introduces extra complexity in the code, but that case seems rare. I notice that some programmers tend to almost exclusively write loops that test at the bottom. When I see constructs like:
if (condition)
{
    do
    {
        ...
    } while (same condition);
}
or the inverse (if inside the while), it makes me wonder if they actually wrote it that way or if they added the if statement when they realized the loop didn't handle the null case.
I've done some googling, but haven't been able to find any literature on this subject. How do you guys (and gals) write your loops?
I always follow the rule that if it should run zero or more times, test at the beginning, if it must run once or more, test at the end. I do not see any logical reason to use the code you listed in your example. It only adds complexity.
Use while loops when you want to test a condition before the first iteration of the loop.
Use do-while loops when you want to test a condition after running the first iteration of the loop.
For example, if you find yourself doing something like either of these snippets:
func();
while (condition) {
func();
}
//or:
while (true){
func();
if (!condition) break;
}
You should rewrite it as:
do{
func();
} while(condition);
The difference is that the do loop executes "do something" once and then checks the condition to see if it should repeat the "do something", while the while loop checks the condition before doing anything.
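A small sketch that makes the difference countable. The function names are made up for the example; each function returns how many times its loop body ran.

```cpp
// Runs the body while n is positive; the test comes before the first pass.
int runsWithWhile(int n) {
    int runs = 0;
    while (n > 0) {
        --n;
        ++runs;
    }
    return runs;
}

// Same body and condition, but the test comes after each pass.
int runsWithDoWhile(int n) {
    int runs = 0;
    do {
        --n;
        ++runs;
    } while (n > 0);
    return runs;
}
```

For a positive starting value the two agree; for a starting value of 0 the first body never runs and the second runs once.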
Does avoiding do/while really help make my code more readable?
No.
If it makes more sense to use a do/while loop, then do so. If you need to execute the body of a loop once before testing the condition, then a do/while loop is probably the most straightforward implementation.
The first one may not execute at all if the condition is false. The other one will execute at least once, then check the condition.
For the sake of readability it seems sensible to test at the top. The fact it is a loop is important; the person reading the code should be aware of the loop conditions before trying to comprehend the body of the loop.
Here's a good real-world example I came across recently. Suppose you have a number of processing tasks (like processing elements in an array) and you wish to split the work between one thread per CPU core present. There must be at least one core to be running the current code! So you can use a do... while something like:
do {
get_tasks_for_core();
launch_thread();
} while (cores_remaining());
It's almost negligible, but it might be worth considering the performance benefit: it could equally be written as a standard while loop, but that would always make an unnecessary initial comparison that would always evaluate true - and on single-core, the do-while condition branches more predictably (always false, versus alternating true/false for a standard while).
Yes, it's true: do-while will run at least one time.
That's the only difference. Nothing else to debate here.
The first tests the condition before performing so it's possible your code won't ever enter the code underneath. The second will perform the code within before testing the condition.
The while loop will check "condition" first; if it's false, it will never "do something." But the do...while loop will "do something" first, then check "condition".
Yes, just like using for instead of while, or foreach instead of for improves readability. That said some circumstances need do while and I agree you would be silly to force those situations into a while loop.
It's more helpful to think in terms of common usage. The vast majority of loops work quite naturally with while, even if they could be made to work with do...while, so basically you should use while when the difference doesn't matter. I would thus use do...while for the rare scenarios where it provides a noticeable improvement in readability.
The use cases are different for the two. This isn't a "best practices" question.
If you want a loop to execute based on the condition exclusively, then use
for or while
If you want to do something once regardless of the condition and then continue doing it based on the condition evaluation, use
do..while
For anyone who can't think of a reason to have a one-or-more times loop:
try {
    someOperation();
} catch (Exception e) {
    do {
        if (e instanceof ExceptionIHandleInAWierdWay) {
            HandleWierdException((ExceptionIHandleInAWierdWay)e);
        }
    } while ((e = e.getInnerException())!= null);
}
The same could be used for any sort of hierarchical structure.
in class Node:
public Node findSelfOrParentWithText(string text) {
Node node = this;
do {
if(node.containsText(text)) {
break;
}
} while((node = node.getParent()) != null);
return node;
}
A while() checks the condition before each execution of the loop body and a do...while() checks the condition after each execution of the loop body.
Thus, a do...while() will always execute the loop body at least once.
Functionally, a while() is equivalent to
startOfLoop:
if (!condition)
goto endOfLoop;
//loop body goes here
goto startOfLoop;
endOfLoop:
and a do...while() is equivalent to
startOfLoop:
//loop body
//goes here
if (condition)
goto startOfLoop;
Note that the implementation is probably more efficient than this. However, a do...while() does involve one less comparison than a while() so it is slightly faster. Use a do...while() if:
you know that the condition will always be true the first time around, or
you want the loop to execute once even if the condition is false to begin with.
Here is the translation:
do { y; } while(x);
Same as
{ y; } while(x) { y; }
Note the extra set of braces are for the case you have variable definitions in y. The scope of those must be kept local like in the do-loop case. So, a do-while loop just executes its body at least once. Apart from that, the two loops are identical. So if we apply this rule to your code
do {
// do something
} while (condition is true);
The corresponding while loop for your do-loop looks like
{
// do something
}
while (condition is true) {
// do something
}
Yes, you see the corresponding while for your do loop differs from your while :)
As noted by Piemasons, the difference is whether the loop executes once before doing the test, or if the test is done first so that the body of the loop might never execute.
The key question is which makes sense for your application.
To take two simple examples:
Say you're looping through the elements of an array. If the array has no elements, you don't want to process number one of zero. So you should use WHILE.
You want to display a message, accept a response, and if the response is invalid, ask again until you get a valid response. So you always want to ask once. You can't test if the response is valid until you get a response, so you have to go through the body of the loop once before you can test the condition. You should use DO/WHILE.
I tend to prefer do-while loops, myself. If the condition will always be true at the start of the loop, I prefer to test it at the end. To my eye, the whole point of testing conditions (other than assertions) is that one doesn't know the result of the test. If I see a while loop with the condition test at the top, my inclination is to consider the case that the loop executes zero times. If that can never happen, why not code in a way that clearly shows that?
They're actually meant for different things. In C, you can use the do - while construct to achieve both scenarios (runs at least once and runs while true). But PASCAL has repeat - until and while, one for each scenario, and if I remember correctly, ADA has another construct that lets you quit in the middle, but of course that's not what you're asking.
My answer to your question : I like my loop with testing on top.
Both conventions are correct if you know how to write the code correctly :)
Usually the use of second convention ( do {} while() ) is meant to avoid have a duplicated statement outside the loop. Consider the following (over simplified) example:
a++;
while (a < n) {
a++;
}
can be written more concisely using
do {
a++;
} while (a < n);
Of course, this particular example can be written in an even more concise way as (assuming C syntax)
while (++a < n) {}
But I think you can see the point here.
while( someConditionMayBeFalse ){
// this will never run...
}
// then the alternative
do{
// this will run once even if the condition is false
} while( someConditionMayBeFalse );
The difference is obvious and allows you to have code run and then evaluate the result to see if you have to "Do it again" and the other method of while allows you to have a block of script ignored if the conditional is not met.
I write mine pretty much exclusively testing at the top. It's less code, so for me at least, it's less potential to screw something up (e.g., copy-pasting the condition makes two places you always have to update it)
It really depends there are situations when you want to test at the top, others when you want to test at the bottom, and still others when you want to test in the middle.
However the example given seems absurd. If you are going to test at the top, don't use an if statement and test at the bottom, just use a while statement, that's what it is made for.
You should first think of the test as part of the loop code. If the test logically belongs at the start of the loop processing, then it's a top-of-the-loop test. If the test logically belongs at the end of the loop (i.e. it decides if the loop should continue to run), then it's probably a bottom-of-the-loop test.
You will have to do something fancy if the test logically belongs in the middle. :-)
I guess some people test at the bottom because you could save one or a few machine cycles by doing that 30 years ago.
To write code that is correct, one basically needs to perform a mental, perhaps informal proof of correctness.
To prove a loop correct, the standard way is to choose a loop invariant, and an induction proof. But skip the complicated words: what you do, informally, is figure out something that is true on each iteration of the loop and that, once the loop is done, implies that what you wanted accomplished is now true. The loop condition is false at the end, which is what lets the loop terminate; the invariant itself still holds.
If the loop conditions map fairly easily to the invariant, and the invariant is at the top of the loop, and one infers that the invariant is true at the next iteration of the loop by working through the code of the loop, then it is easy to figure out that the loop is correct.
However, if the invariant is at the bottom of the loop, then unless you have an assertion just prior to the loop (a good practice) then it becomes more difficult because you have to essentially infer what that invariant should be, and that any code that ran before the loop makes the loop invariant true (since there is no loop precondition, code will execute in the loop). It just becomes that more difficult to prove correct, even if it is an informal in-your-head proof.
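To make the informal proof idea concrete, here is a small sketch with the invariant written as an assertion at the top of a top-tested loop. The function name and the choice of summing a vector are only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sums a vector, checking the loop invariant on every pass.
int sumOf(const std::vector<int>& v) {
    int total = 0;
    std::size_t i = 0;
    while (i < v.size()) {
        // Invariant: total equals the sum of the first i elements.
        assert(total == std::accumulate(
                            v.begin(),
                            v.begin() + static_cast<std::ptrdiff_t>(i), 0));
        total += v[i];
        ++i;
    }
    // Here the loop condition is false (i == v.size()) and the invariant
    // still holds, so total is the sum of all elements.
    return total;
}
```

The assertion recomputes the sum each time, so it is only sensible as a teaching aid or in a debug build.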
This isn't really an answer but a reiteration of something one of my lecturers said and it interested me at the time.
The two types of loop while..do and do..while are actually instances of a third more generic loop, which has the test somewhere in the middle.
begin loop
<Code block A>
loop condition
<Code block B>
end loop
Code block A is executed at least once and B is executed zero or more times, but isn't run on the very last (failing) iteration. A while loop is when code block A is empty and a do..while is when code block B is empty. But if you're writing a compiler, you might be interested in generalizing both cases to a loop like this.
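C++ has no dedicated syntax for this general form, but it can be approximated with an endless loop and a break in the middle. A minimal sketch, with a made-up function that joins strings with a separator:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins the items with ", " between them (no trailing separator).
std::string joinWithCommas(const std::vector<std::string>& items) {
    std::string result;
    if (items.empty()) {
        return result;  // block A needs at least one item
    }
    std::size_t i = 0;
    for (;;) {
        result += items[i];  // code block A: runs one or more times
        ++i;
        if (i == items.size()) {
            break;           // loop condition, tested in the middle
        }
        result += ", ";      // code block B: skipped on the last pass
    }
    return result;
}
```

Writing the same thing with a plain while or do..while would need either a duplicated statement or an extra check to avoid the trailing separator.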
In a typical Discrete Structures class in computer science, it's an easy proof that there is an equivalence mapping between the two.
Stylistically, I prefer while (easy-expr) { } when easy-expr is known up front and ready to go, and the loop doesn't have a lot of repeated overhead/initialization. I prefer do { } while (somewhat-less-easy-expr); when there is more repeated overhead and the condition may not be quite so simple to set up ahead of time. If I write an infinite loop, I always use while (true) { }. I can't explain why, but I just don't like writing for (;;) { }.
I would say it is bad practice to write if..do..while loops, for the simple reason that this increases the size of the code and causes code duplication. Code duplication is error prone and should be avoided, as any change to one part must be performed on the duplicate as well, which doesn't always happen. Also, bigger code means a harder time for the CPU cache. Finally, a plain while loop handles the null case on its own, which saves headaches.
Only when the first iteration is fundamentally different should one use do..while, say, if the code that makes you pass the loop condition (like initialization) is performed in the loop. Otherwise, if it is certain that the loop will never fail on the first iteration, then yes, a do..while is appropriate.
From my limited knowledge of code generation I think it may be a good idea to write bottom test loops since they enable the compiler to perform loop optimizations better. For bottom test loops it is guaranteed that the loop executes at least once. This means loop invariant code "dominates" the exit node. And thus can be safely moved just before the loop starts.

Could replacing statements by expressions using the C++ comma operator allow more compiler optimizations?

The C++ comma operator is used to chain individual expressions, yielding the value of the last executed expression as the result.
For example the skeleton code (6 statements, 6 expressions):
step1;
step2;
if (condition) {
    step3;
    return step4;
} else {
    return step5;
}
May be rewritten to: (1 statement, 6 expressions)
return step1,
step2,
condition?
step3, step4 :
step5;
I noticed that it is not possible to perform step-by-step debugging of such code, as the expression chain seems to be executed as a whole. Does it mean that the compiler is able to perform special optimizations which are not possible with the traditional statement approach (especially if the steps are const or inline)?
Note: I'm not talking about the coding style merit of that way of expressing sequence of expressions! Just about the possible optimisations allowed by replacing statements by expressions.
Most compilers will break your code down into "basic blocks", which are stretches of code with no jumps/branches in or out. Optimisations will be performed on a graph of these blocks: that graph captures all the control flow in the function. The basic blocks are equivalent in your two versions of the code, so I doubt that you'd get different optimisations. That the basic blocks are the same isn't entirely obvious: it relies on the fact that the control flow between the steps is the same in both cases, and so are the sequence points. The most plausible difference is that you might find in the second case there is only one block including a "return", and in the first case there are two. The blocks are still equivalent, since the optimiser can replace two blocks that "do the same thing" with one block that is jumped to from two different places. That's a very common optimisation.
It's possible, of course, that a particular compiler doesn't ignore or eliminate the differences between your two functions when optimising. But there's really no way of saying whether any differences would make the result faster or slower, without examining what that compiler is doing. In short there's no difference between the possible optimisations, but it doesn't necessarily follow that there's no difference between the actual optimisations.
The reason you can't single-step your second version of the code is just down to how the debugger works, not the compiler. Single-step usually means, "run to the next statement", so if you break your code into multiple statements, you can more easily debug each one. Otherwise, if your debugger has an assembly view, then in the second case you could switch to that and single-step the assembly, allowing you to see how it progresses. Or if any of your steps involve function calls, then you may be able to "do the hokey-cokey", by repeatedly doing "step in, step out" of the functions, and separate them that way.
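To see that the two forms do the same work in the same order, here is a small sketch. The step function and its logging are invented for the example so that the evaluation order becomes observable.

```cpp
#include <string>

// Hypothetical "step": records its name in a log and returns a value.
int step(std::string& log, char name, int value) {
    log += name;
    return value;
}

// Statement version of the skeleton.
int withStatements(std::string& log, bool condition) {
    step(log, '1', 0);
    step(log, '2', 0);
    if (condition) {
        step(log, '3', 0);
        return step(log, '4', 40);
    } else {
        return step(log, '5', 50);
    }
}

// Single-expression version using the comma operator.
int withCommas(std::string& log, bool condition) {
    return step(log, '1', 0),
           step(log, '2', 0),
           condition ? (step(log, '3', 0), step(log, '4', 40))
                     : step(log, '5', 50);
}
```

Both functions log the same steps in the same order and return the same value, which is consistent with the point that the control flow, and therefore the basic blocks, are equivalent. Whether a given compiler emits identical machine code for them is something only the disassembly can tell you.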
Using the comma operator neither promotes nor hinders optimization in any circumstances I'm aware of, because the C++ standard guarantee is only that evaluation will be in left-to-right order, not that statement execution necessarily will be. (This is the same guarantee you get with statement line order.)
What it is likely to do, though, is turn your code into a confusing mess, since many programmers are unaware that the comma-as-operator even exists, and are apt to confuse it with commas used as parameter separators. (Want to really make your code unreadable? Call a function like my_func((++i, y), x).)
The "best" use of the comma operator I've seen is to work with multiple variables in the iteration statement of a for loop:
for (int i = 0, j = 0;
i < 10 && j < 12;
i += j, ++j) // each time through the loop we're tinkering with BOTH i and j
{
}
Very unlikely IMHO. The thing gets compiled down to assembler/machine code, then further low-level optimizations are done, so it probably turns out to be the same thing.
OTOH, if the comma operator is overloaded, the game changes completely. But I'm sure you know that. ;)
The obligatory list:
Don't worry about rewriting almost equivalent code to gain performance
If you have a perf-problem, profile to see what the problem is
If you can't get it faster by algorithmic ops, look at the disassembly and see that the compiler does what you intended
If not, ask here and post source and disassembly for both versions. :)

Any reason to replace while(condition) with for(;condition;) in C++?

Looks like
while( condition ) {
//do stuff
}
is completely equivalent to
for( ; condition; ) {
//do stuff
}
Is there any reason to use the latter instead of the former?
There's no good reason as far as I know. You're intentionally misleading people by using a for-loop that doesn't increment anything.
Update:
Based on the OP's comment to the question, I can speculate on how you might see such a construct in real code. I've seen (and used) this before:
lots::of::namespaces::container::iterator iter = foo.begin();
for (; iter != foo.end(); ++iter)
{
// do stuff
}
But that's as far as I'll go with leaving things out of a for-loop. Perhaps your project had a loop that looked like that at one time. If you add code that removes elements of a container in the middle of the loop, you likely have to control carefully how iter is incremented. That could lead to code that looks like this:
for (; iter != foo.end(); )
{
    // do stuff
    if (condition)
    {
        iter = foo.erase(iter);
    }
    else
    {
        ++iter;
    }
}
However, that's no excuse for not taking the five seconds needed to change it into a while-loop.
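For comparison, the same erase-while-iterating pattern written as a while loop might look like the sketch below. The container type and the "remove negatives" rule are just stand-ins for foo and condition.

```cpp
#include <list>

// Removes every negative value from the list in a single pass.
void removeNegatives(std::list<int>& values) {
    auto iter = values.begin();
    while (iter != values.end()) {
        if (*iter < 0) {
            // erase() returns an iterator to the element after the erased one
            iter = values.erase(iter);
        } else {
            ++iter;
        }
    }
}
```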
Some compilers warn about constant loop conditions:
while (true) { /* ... */ } /* Warning! */
for (;;) { /* ... */ } /* No warning */
In the specific case of an infinite loop, I might choose a for loop over a while loop for that reason. But if the condition is not empty, I don't really see any benefit. My guess as to why it appeared in the mentioned project is that the code somehow evolved through maintenance, but was written in a more conventional way originally.
No. No. No.
Even if there were a microscopic performance difference, you'd have to be an end-stage Jedi performance tuner to have it matter enough to care.
Is there any reason to use the latter instead of the former?
A misguided effort to impress your colleagues that you know that those two forms are equivalent.
A foolish maneuver to ensure "job security" by making your code as confusing as possible so that no one will ever want to change it.
The "w" key on your keyboard is broken.
It started life as a for loop with initializers and incrementing condition, and when the logic changed, the developer was too busy to change it.
It's possible to compile
for(INIT; CONDITION; UPDATE)
{
BODY
}
into
{
INIT
while(CONDITION)
{
BODY
UPDATE
}
}
UPDATE: The seemingly redundant extra scope is to cage any variable definitions in INIT, i.e. from for(int i = 0; ...). Thanks!
It's basically just a reordering of the expressions. So there's no reason to prefer one over the other, for performance reasons. I would recommend while() if possible, since it's simpler. If a simpler construct expresses what you want to do, I think that's the one to use.
As far as I know the two statements are optimized by the compiler into the same assembler code anyway, so no, there's no reason to do so; it's just personal preference.
I think "while" and "for" loops are meant for different idioms. The idiom of using "while" is "do something, while certain conditions are true". The idiom for "for" is "iterate over a certain range of elements"...
Whenever I read code, I expect these idioms (and I think I am not alone). When I see "for", I understand that someone is iterating over a certain range and I do not go into details. When I see a for loop used for another idiom (not the one I expect), I get confused and have to go into details.
Anyway, it is very subjective...
In this case, I personally prefer the first loop as it is easier to write and read.
But if I have a loop that needs some post statement, I'd use a for loop like this:
for (; i < 10; i += 2)
There might be small compiler-dependent differences on the assembly level, but ideally both should behave exactly the same, and the former is more readable. So no, no reason to use the latter version other than nonconformism.
Compile both and check the resulting disassembly, if they are the same (which they probably are). Choose the one you find most readable.
If you want to do something a limited number of times, then "for" lets you specify the constraint without jumbling it in with the logic inside your loop.
Keeping readability aside for a small while, there is usually no performance difference between the different loops. At least there is no significant difference.
For desktop applications you can choose based on readability criteria. Refer to the other posts, e.g. looking at a for loop, someone may think the incrementer is declared within the loop.
It seems for web applications e.g. client side scripting there might be a difference.
Check this site: http://www.websiteoptimization.com/speed/10/10-2.html
Run your own experiments and go by the results else stick by readability rules.
I can see 2 reasons, none of which I'd consider:
Only have 1 loop construct, but then Kristo's objection stands
Write "for (; EVER;)", but then prefer a LOOP_FOREVER macro if you really want this.
There really is no difference in C-ish languages between a for (;cond;) loop and a while loop. Generally what I do in C-ish languages is start off writing the loop as a "for" and change it into a "while" if I end up with that form. It is kinda rare though, as you are always iterating through something, and C lets you put any code you want in that last area.
It would be different if C had real (pre-computed iteration) for loops.
You might want to use a do-while loop instead of a for loop so the code is processed at least once before conditions are checked and met (or not).
I used to write some pretty cryptic C/C++ code. Looking back, I would probably do this in a while loop:
ifstream f("file.txt");
char c;
for(f.get(c); !f.eof(); f.get(c)) {
// ...
}
I guess my point is that for loops are usually shorter but less readable, if they're not used in the traditional sense of looping over a range.
This question has been answered - the language has a more natural construct for expressing what you want - you should use it. For example, I can certainly write this:
for (bool b = condition(); b; b = !b) {
/* more code */
}
or:
while (condition()) {
/* more code */
break;
}
instead of the more conventional:
if (condition()) {
/* more code */
}
But why? C (and all languages) have idioms and most of them make rational sense in terms of expressivity and expectation of meaning. When you tamper with the idiom, you mess with the sensibilities of the person who has to read your code.

for(;true;) different from while(true)?

If my understanding is correct, they do exactly the same thing. Why would anyone use the "for" variant? Is it just taste?
Edit: I suppose I was also thinking of for (;;).
for (;;)
is often used to prevent a compiler warning:
while(1)
or
while(true)
usually throws a compiler warning about a conditional expression being constant (at least at the highest warning level).
Yes, it is just taste.
I've never seen for (;true;). I have seen for (;;), and the only difference seems to be one of taste. I've found that C programmers slightly prefer for (;;) over while (1), but it's still just preference.
Not an answer but a note: Sometimes remembering that for(;x;) is identical to while(x) (in other words, just saying "while" as I examine the center expression of a for statement) helps me analyze nasty for statements...
For instance, it makes it obvious that the center expression is always evaluated at the beginning of the first pass of the loop, something you may forget, but is completely unambiguous when you look at it in the while() format.
Sometimes it also comes in handy to remember that
a;
while(b) {
...
c;
}
is almost (see comments) the same as
for(a;b;c) {
...
}
I know it's obvious, but being actively aware of this relationship really helps you to quickly convert between one form and the other to clarify confusing code.
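One place where the two forms are known to differ is continue: in a for loop it still runs the update expression c, while in the hand-written while form it jumps straight back to the test. (The other difference is the scope of anything declared in a.) Whether that is what the comments mentioned above were about is a guess on my part. A small sketch with made-up function names:

```cpp
// Counts the odd numbers in [0, n) with a for loop.
int countOddsFor(int n) {
    int odds = 0;
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) {
            continue;  // ++i still runs before the next test
        }
        ++odds;
    }
    return odds;
}

// The same loop converted to a while loop.
int countOddsWhile(int n) {
    int odds = 0;
    int i = 0;
    while (i < n) {
        if (i % 2 == 0) {
            ++i;       // must be repeated by hand, or the loop never advances
            continue;
        }
        ++odds;
        ++i;
    }
    return odds;
}
```

Leaving out the extra ++i in the while version would make it spin forever on the first even number, which is exactly the kind of bug a mechanical conversion can introduce.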
Some compilers (with warnings turned all the way up) will complain that while(true) is a conditional statement that can never fail, whereas they are happy with for (;;).
For this reason I prefer the use of for (;;) as the infinite loop idiom, but don't think it is a big deal.
It's in case they plan to use a real for() loop later. If you see for(;true;), it's probably code meant to be debugged.
An optimizing compiler should generate the same assembly for both of them -- an infinite loop.
The compiler warning has already been discussed, so I'll approach it from a semantics stand-point. I use while(TRUE) rather than for(;;) because in my mind, while(TRUE) sounds like it makes more sense than for(;;). I read while(TRUE) as "while TRUE is always TRUE". Personally, this is an improvement in the readability of code.
So, Zeus forbid I don't document my code (this -NEVER- happens, of course) it stays just a little bit more readable than the alternative.
But, overall, this is such a nit-picky thing that it comes down to personal preference 99% of the time.