how does a C-like compiler interpret the if statement


In C-like languages, we are used to having if statements similar to the following:
if(x == 5) {
//do something
}
else if(x == 7) {
//do something else
}
else if(x == 9) {
//do something else
} else {
//do something else
}
My question is, does the compiler see that if statement that way, or does it end up being interpreted like:
if(x == 5) {
//do something
}
else {
if(x == 7) {
//do something
}
else {
if(x == 9) {
//do something
}
else {
//do something else
}
}
}
EDIT: I realized that while the question made sense in my head, it probably sounded rather stupid to the rest of the general populace. I was more referring to how the AST would look and if there was any special AST cases for 'else-if' statements or if it would be compiled as a cascading if/else block.

They are equivalent to a C compiler. There is no special syntax else if in C. The second if is just another if statement.
To make it clearer: according to the C99 standard, a selection-statement (which includes the if statement) is defined as
selection-statement:
if (expression) statement
if (expression) statement else statement
switch (expression) statement
and a compound-statement is defined as
compound-statement:
{block-item-list(opt) }
block-item-list:
block-item
block-item-list block-item
block-item:
declaration
statement
When a compiler front end processes a source file, it typically follows these steps:
Lexical analysis: turn the plain-text source code into a list of 'tokens'
Syntax analysis (parsing): parse the token list and generate an abstract syntax tree (AST)
The tree is then passed to the compiler middle end (to optimize) or back end (to generate machine code)
In your case this if statement
if(x == 7) {
//do something else
} else if(x == 9) {
//do something else
} else {
//do something else
}
is parsed as a selection-statement inside a selection-statement,
selection-stmt
/ | \
exp stmt stmt
| | |
... ... selection-stmt
/ | \
exp stmt stmt
| | |
... ... ...
and this one
if(x == 7) {
//do something else
} else {
if(x == 9) {
//do something else
} else {
//do something else
}
}
is the same selection-statement inside a compound-statement inside a selection-statement:
selection-stmt
/ | \
exp stmt stmt
| | |
... ... compound-stmt
|
block-item-list
|
block-item
|
stmt
|
selection-stmt
/ | \
exp stmt stmt
| | |
... ... ...
So they have different ASTs, but it makes no difference to the compiler back end: as you can see in the AST, the only change is an extra compound-statement wrapper around the same nested selection-statement.
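To make that concrete, here is a rough sketch with a toy AST. The Node type and the action, if_eq, compound and run helpers are all invented for this illustration; no real compiler represents statements this way. It builds both trees from the answer above and walks them, and the compound node simply forwards to its single child:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy AST, invented for this sketch only.
struct Node {
    enum Kind { Action, If, Compound };
    Kind kind;
    int value;                                // Action: result; If: constant compared with x
    std::vector<std::unique_ptr<Node>> kids;  // If: {then, else}; Compound: block items
    Node(Kind k, int v) : kind(k), value(v) {}
};
using NodePtr = std::unique_ptr<Node>;

NodePtr action(int result) { return NodePtr(new Node(Node::Action, result)); }

NodePtr if_eq(int constant, NodePtr then_branch, NodePtr else_branch) {
    NodePtr n(new Node(Node::If, constant));
    n->kids.push_back(std::move(then_branch));
    n->kids.push_back(std::move(else_branch));
    return n;
}

NodePtr compound(NodePtr only_item) {
    NodePtr n(new Node(Node::Compound, 0));
    n->kids.push_back(std::move(only_item));
    return n;
}

// Walks the tree for a given x and returns the result of the action reached.
int run(const Node& n, int x) {
    switch (n.kind) {
    case Node::Action:   return n.value;
    case Node::If:       return run(x == n.value ? *n.kids[0] : *n.kids[1], x);
    case Node::Compound: return run(*n.kids[0], x);  // a block just runs its contents
    }
    return -1;  // not reached for well-formed trees
}

// if (x == 7) A else if (x == 9) B else C
NodePtr chained() { return if_eq(7, action(1), if_eq(9, action(2), action(3))); }

// if (x == 7) A else { if (x == 9) B else C }
NodePtr braced() { return if_eq(7, action(1), compound(if_eq(9, action(2), action(3)))); }

// Both trees pick the same action for every input tried here.
void check_same_behaviour() {
    const int inputs[] = {7, 9, 0};
    for (int x : inputs) {
        assert(run(*chained(), x) == run(*braced(), x));
    }
}
```

The two trees differ by exactly one Compound node, yet run gives the same answer for both.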

In both C and C++ enclosing a statement into a redundant pair of {} does not change the semantics of the program. This statement
a = b;
is equivalent to this one
{ a = b; }
is equivalent to this one
{{ a = b; }}
and to this one
{{{{{ a = b; }}}}}
Redundant {} make absolutely no difference to the compiler.
In your example, the only difference between the first version and the second version is a bunch of redundant {} you added to the latter, just like I did in my a = b example above. Your redundant {} change absolutely nothing. There's no appreciable difference between the two versions of code you presented, which makes your question essentially meaningless.
Either clarify your question, or correct the code, if you meant to ask about something else.

The two snippets of code are, in fact, identical. You can see why this is true by realizing that the syntax of the "if" statement is as follows:
if <expression>
<block>
else
<block>
NOTE that <block> may be surrounded by curly braces if necessary.
So, your code breaks down as follows.
// if <expression>
if (x == 5)
// <block> begin
{
//do something
}
// <block> end
// else
else
// <block> begin
if(x == 7) {
//do something else
}
else if(x == 9) {
//do something else
} else {
//do something else
}
// <block> end
Now if you put curly braces around the block for the "else", as is allowed by the language, you end up with your second form.
// if <expression>
if (x == 5)
// <block> begin
{
//do something
}
// <block> end
// else
else
// <block> begin
{
if(x == 7) {
//do something else
}
else if(x == 9) {
//do something else
} else {
//do something else
}
}
// <block> end
And if you do this repeatedly for all "if else" clauses, you end up with exactly your second form. The two pieces of code are exactly identical, and seen exactly the same way by the compiler.

Closer to the first one, but the question doesn't exactly fit.
When a program is compiled, it goes through a few stages. The first stage is lexical analysis, and the second stage is syntactic analysis. Lexical analysis analyses the text, separating it into tokens. Then syntactic analysis looks at the structure of the program and constructs an abstract syntax tree (AST). This is the underlying syntactic structure that's created during a compilation.
So basically, if, if-else and if-elseif-else statements are all eventually structured into an abstract syntax tree (AST) by the compiler.
Here's the wikipedia page on ASTs: https://en.wikipedia.org/wiki/Abstract_syntax_tree
edit:
And actually, an if/else-if statement probably forms something closer to the second one inside the AST. I'm not quite sure, but I wouldn't be surprised if it's represented at an underlying level as a binary tree-like conditional branching structure. If you're interested in learning more about it in depth, you can do some research on the parsing aspect of compiler theory.

Note that although your first statement is indented according to the if-else "ladder" convention, actually the "correct" indentation for it which reveals the true nesting is this:
if(x == 5) {
//do something
} else
if(x == 7) { // <- this is all one big statement
//do something else
} else
if(x == 9) { // <- so is this
//do something else
} else {
//do something else
}
Indentation is whitespace; it means nothing to the compiler. What you have after the first else is one big if statement. Since it is just one statement, it does not require braces around it. When you ask, "does the compiler read it that way", you have to remember that most space is insignificant; the syntax determines the true nesting of the syntax tree.

Related

Should I prefer two if statements over an if-else statement if the conditions aren't related?

So I know that, generally speaking, I should prefer an else-if over a second if. But what if the two conditions aren't related? For example, these would be considered "related" conditionals:
if (line[a] == '{'){
openCurly = true;
}
else if (line[a] == '}'){
closeCurly = false;
}
Notice how the two conditionals in the if-statements are related in a way such that when one is true, the other must be false. This is because line[a] can either be { or } but not both.
Here is another example:
if (line[a] == '{')
{
openCurly = true;
}
else if ((line[a] == ';' && !openCurly) || (line[a] == '}' && openCurly))
{
DoSomething(line);
line = "";
}
The second condition will never evaluate to true if the first condition is true, so it makes sense to have an else-if. However, those two conditionals look vastly different.
So, should I prefer something like this?
if (line[a] == '{')
{
openCurly = true;
}
if ((line[a] == ';' && !openCurly) || (line[a] == '}' && openCurly))
{
DoSomething(line);
line = "";
}

You should use an else-if statement. This is because an if-else construct only checks the second condition if the first one doesn't evaluate to true.
In the example you give,
if (line[a] == '{')
{
openCurly = true;
}
else if ((line[a] == ';' && !openCurly) || (line[a] == '}' && openCurly))
{
DoSomething(line);
line = "";
}
replacing the else if with an if statement would result in the second condition being checked even if the first one is true, which is completely pointless and would also lose you some time.
In the future, make decisions to use else-if statements based on whether the conditions are mutually exclusive or not.
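As a small sketch of the "second condition is skipped" point (the counter and the helper names is_open, is_close, with_else_if and with_two_ifs are made up for this example and are not from the question), you can count how often the second condition actually runs:

```cpp
#include <cassert>

// Hypothetical counter, only here to observe how often the second condition runs.
static int second_checks = 0;

bool is_open(char c)  { return c == '{'; }
bool is_close(char c) { ++second_checks; return c == '}'; }

// else-if: the second condition is not evaluated when the first one is true.
int with_else_if(char c) {
    int kind = 0;
    if (is_open(c)) { kind = 1; }
    else if (is_close(c)) { kind = 2; }
    return kind;
}

// Two separate ifs: the second condition is evaluated every time.
int with_two_ifs(char c) {
    int kind = 0;
    if (is_open(c)) { kind = 1; }
    if (is_close(c)) { kind = 2; }
    return kind;
}

void check_second_condition_counts() {
    second_checks = 0;
    assert(with_else_if('{') == 1);
    assert(second_checks == 0);   // skipped by the else
    assert(with_two_ifs('{') == 1);
    assert(second_checks == 1);   // evaluated even though the first if matched
}
```

Both versions return the same result here because the conditions are mutually exclusive. The only difference is the extra, pointless evaluation.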

You could do something like this:
#include <stdint.h>
#define COMBINATION(x, y) ((uint16_t(x) << 8) | (uint16_t(y) << 0))
...
switch (COMBINATION(line[a], openCurly))
{
case COMBINATION('{', false):
...
break;
case COMBINATION(';', false):
case COMBINATION('}', true):
...;
break;
}
Some may say it's a bit of overkill, but I think that it may actually help to split the logic of your program into a set of distinct cases, thus making it easier to handle each case precisely as desired.

About the exclusiveness of the cases of an if block

I have a question about good coding practices. I understand the differences between doing an if-else if and multiple ifs (that is, when a condition is met in an if-else if, the rest of the checks are skipped). I've found a piece of code along these lines:
if (A == 5) {
do_something();
} else if (B == 7) {
do_something_else();
}
I understand that this code won't check B == 7 if A == 5. The code works, so that means that B is only 7, if A is not 5, but I think this is just waiting to break when the code changes. What I would do is:
if (A == 5) {
do_something();
return or continue or break;
}
if (B == 7) {
do_something_else();
return or continue or break;
}
My question is, when I have multiple exclusive cases that depend on different, exclusive variables, what's the best way to tackle the flow control? I have the impression that the first code (with else ifs) depends a lot on other pieces of code to work, and that changes in other areas might break it. The second one seems to be a bit clunky. A switch could be a third option, but I would need to create another structure to hold the case and the logic to assign its value, and I think that it would be a bit clunky and counter-intuitive.

You asked about "exclusive" cases, but the issue with the conditions A == 5 and B == 7 is that they are not exclusive; they are independent.
For full generality you may need to test and handle all four cases:
if(A == 5) {
if(B == 7) {
/* case 1 */
} else {
/* case 2 */
}
} else {
if(B == 7) {
/* case 3 */
} else {
/* case 4 */
}
}
This is the notorious "bushy" if/else block. It's notorious because it can almost immediately become nearly impossible for a reader to follow, especially if the cases are involved, or more levels are introduced. (I think most style guides will tell you never to use an if/else tree that's 3 or more levels deep. I'd certainly say that.)
I have occasionally used these two alternatives:
(1) Fully decouple the cases:
if(A == 5 && B == 7) {
/* case 1 */
} else if(A == 5 && B != 7) {
/* case 2 */
} else if(A != 5 && B == 7) {
/* case 3 */
} else if(A != 5 && B != 7) {
/* case 4 */
} else {
/* can't happen */
}
The point here is to make it maximally clear to a later reader exactly which conditions go with cases 1, 2, 3, and 4. For this reason, you might as well list the last, else if(A != 5 && B != 7) case explicitly (as I've shown), even though by that point it's basically an "else".
(2) Contrive a "two level" switch. I can't say this is a common technique; it has a whiff of being "too clever", but it's robust and readable, in its way:
#define PAIR(b1, b2) (((b1) << 8) | (b2))
switch(PAIR(A == 5, B == 7)) {
case PAIR(TRUE, TRUE):
/* case 1 */
break;
case PAIR(TRUE, FALSE):
/* case 2 */
break;
case PAIR(FALSE, TRUE):
/* case 3 */
break;
case PAIR(FALSE, FALSE):
/* case 4 */
break;
}
I wouldn't recommend this when the conditions are A == 5 and B == 7, because when you're down in the switch, it's not obvious what "TRUE" and "FALSE" mean, but sometimes, this sort of thing can read cleanly. It's also cleanly amenable to 3 or more levels of nesting, unlike "bushy" if/else trees, which as I said are notoriously unreadable.
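For what it's worth, here is a compilable C++ sketch of the same "two level" switch idea. It swaps the macro for a constexpr function (my substitution, not part of the technique as described above), and the names pair_of and which_case are hypothetical:

```cpp
#include <cassert>

// Packs two booleans into one small integer so that a single switch can
// dispatch on both. Being constexpr, it can also be used in case labels.
constexpr int pair_of(bool first, bool second) {
    return (static_cast<int>(first) << 1) | static_cast<int>(second);
}

// Returns the case number (1 to 4) matching the four-way split above.
int which_case(int A, int B) {
    switch (pair_of(A == 5, B == 7)) {
    case pair_of(true,  true):  return 1;
    case pair_of(true,  false): return 2;
    case pair_of(false, true):  return 3;
    case pair_of(false, false): return 4;
    }
    return 0;  // not reached: the four labels cover every combination
}

void check_which_case() {
    assert(which_case(5, 7) == 1);
    assert(which_case(5, 0) == 2);
    assert(which_case(0, 7) == 3);
    assert(which_case(0, 0) == 4);
}
```

Using true and false directly in the labels avoids depending on TRUE/FALSE macros being defined somewhere.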

The most robust way of programming this,
while avoiding the assumption that either A==5 or B==7 is to consider all the four cases:
if ((A == 5) && (B == 7))
{
do_somethingAB();
/* or */
do_somethingA();
do_somethingB();
} else if (A == 5)
{
do_somethingA();
} else if (B == 7)
{
do_somethingB();
} else
{
do_somethingNeither();
/* or
do nothing */
}

As I think you know, the two pieces of code are not equivalent. (They're equivalent IF they both contain "return or continue or break", which makes the question more interesting, but that's a different answer.)
In general, which one you choose (or how you choose to rewrite it) has to depend on precisely what you want the program to do.
When you write
if (A == 5) {
do_something();
} else if (B == 7) {
do_something_else();
}
you're additionally saying you want to do_something_else only if A is not equal to 5. That might be just what you want, or it might be a bug. If you wanted to achieve the same effect without an else, it would have to look like this:
if (A == 5) {
do_something();
}
if (A != 5 && B == 7) {
do_something_else();
}
The second piece of code you wrote in your question, on the other hand, has the potential to execute both do_something and do_something_else.
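Here is a small sketch of that difference, assuming the separate ifs have no return/continue/break in them. The functions chained, guarded and separate are hypothetical and just record which bodies ran:

```cpp
#include <cassert>
#include <string>

// if / else if: at most one body runs.
std::string chained(int A, int B) {
    std::string log;
    if (A == 5) { log += "something;"; }
    else if (B == 7) { log += "something_else;"; }
    return log;
}

// Same effect as the chain, without an else: the second test repeats A != 5.
std::string guarded(int A, int B) {
    std::string log;
    if (A == 5) { log += "something;"; }
    if (A != 5 && B == 7) { log += "something_else;"; }
    return log;
}

// Two independent ifs with no early exit: both bodies can run.
std::string separate(int A, int B) {
    std::string log;
    if (A == 5) { log += "something;"; }
    if (B == 7) { log += "something_else;"; }
    return log;
}

void check_chain_variants() {
    assert(chained(5, 7) == "something;");
    assert(guarded(5, 7) == "something;");
    assert(separate(5, 7) == "something;something_else;");
    assert(chained(0, 7) == separate(0, 7));  // they only differ when both conditions hold
}
```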
In general, it's best (clearest and least confusing) if all the conditions in an if/else chain test variations on the same condition, not some unusual mixture involving, for example, both A and B.
You use an if/else block when the alternatives are truly and deliberately exclusive, and when you want to emphasize this fact. You might choose to use separate if blocks (not chained with else) when the alternatives are not exclusive, or when they're only coincidentally or accidentally exclusive. For example, I have deliberately written code like
if(A == 5) {
do_something();
}
if(A != 5) {
do_some_unrelated_thing();
}
I might do this when the two things have nothing to do with each other, meaning that in some future revision of the program's logic, they might not be exclusive after all. Or, I might do this if do_something is not a single line, but is a long, elaborate block, at the end of which I'm concerned that the reader might not have remembered why we were or weren't doing something, and that on the other hand we might want to do something else. For similar reasons, I've occasionally written
if(A == 5) {
do_something();
}
if(A == 5) {
do_some_unrelated_thing();
}
in the case that, again, the two things to be done had nothing to do with each other, and the reasons for doing them might diverge.

[This is now my third answer. The fact that I keep misreading your question, and failing to grasp the essential point you're asking about, suggests that maybe I shouldn't be answering at all.]
I think the essential point you're asking about concerns the case where the cases are independent, but you get the effect of an else due to the fact that each clause contains a control-flow statement which "goes out": a break, or a continue, or a return, or something like that.
In this specific case, my preference today would be not to use the else. When we write
if(A == 5) {
do_something();
return or continue or break;
}
if(B == 7) {
do_something_else();
return or continue or break;
}
it's clear that the two conditions have nothing to do with each other, other than that they're both cases that do something to "finish" the subtask being done, and leave the block of code that's responsible for performing that subtask.
When we write the two cases separately (without an else), we make clear not only that they're independent, but that they could be reordered, or that another case could be introduced in between them, etc.
But then again, could they be reordered? How likely is it that both cases A == 5 and B == 7 will both be true? And in that case, how important is it that do_something be done, as opposed to do_something_else? If the two cases can't be reordered, if it would be wrong to test B first and maybe do do_something_else, I suppose the explicit else is preferable, to tie the two cases together and make even more clear the requirement that A be tested first.
Like any question of style, the arguments for and against this sort of thing end up being pretty subjective. You're not likely to find a single, overwhelmingly convincing answer one way or the other.

One way to handle this is to use a do { ... } while (0); technique.
Here is your original code:
if (A == 5) {
do_something();
} else if (B == 7) {
do_something_else();
}
Doing else if on the same line is [IMO] a bit of a hack because it hides the true indentation:
if (A == 5) {
do_something();
}
else
if (B == 7) {
do_something_else();
}
Using the aforementioned technique, which I've used quite a lot, this becomes:
do {
if (A == 5) {
do_something();
break;
}
if (B == 7) {
do_something_else();
break;
}
} while (0);
This becomes even more evident when we increase the number of levels in the if/else ladder:
if (A == 5) {
do_something();
} else if (B == 7) {
do_something_else();
} else if (C == 9) {
do_something_else_again();
} else if (D == 3) {
do_something_for_D();
}
Once again, this is indented to:
if (A == 5) {
do_something();
}
else
if (B == 7) {
do_something_else();
}
else
if (C == 9) {
do_something_else_again();
}
else
if (D == 3) {
do_something_for_D();
}
Using the do/while/0 block, we get something that is simpler/cleaner:
do {
if (A == 5) {
do_something();
break;
}
if (B == 7) {
do_something_else();
break;
}
if (C == 9) {
do_something_else_again();
break;
}
if (D == 3) {
do_something_for_D();
break;
}
} while (0);
Note: I've been programming in C for 35+ years, and I've yet to find a case where a more standard use of do/while (e.g. do { ... } while (<cond>)) can't be replaced more cleanly/effectively with either a standard for or while loop. Some languages don't even have a do/while loop. Thus, I consider the do loop to be available for reuse.
Another use of do/while/0 is to allow things defined by a preprocessor macro to appear as a single block:
#define ABORTME(msg_) \
do { \
fprintf(stderr,"ABORT: %s (at line %d)\n",msg_,__LINE__); \
dump_some_state_data(); \
exit(1); \
} while (0)
if (some_error_condition)
ABORTME("some_error_condition");
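A quick sketch of why the do/while/0 wrapper matters for macros. The RECORD macro and record_if_positive function are made up for this example. Because the macro expands to exactly one statement, the trailing semicolon and the else that follows still parse as expected. With plain braces instead, the "};" would end the if and the else would be a syntax error.

```cpp
#include <cassert>

// Hypothetical two-statement macro wrapped in do { ... } while (0) so that
// `RECORD(...);` expands to exactly one statement.
#define RECORD(counter, last, value) \
    do {                             \
        ++(counter);                 \
        (last) = (value);            \
    } while (0)

// Records positive values; returns the recorded value, or -1 if nothing was recorded.
int record_if_positive(int v, int& count) {
    int last = 0;
    if (v > 0)
        RECORD(count, last, v);   // the whole macro body is governed by the if
    else
        last = -1;                // this else still pairs with the if above
    return last;
}

void check_record_macro() {
    int count = 0;
    assert(record_if_positive(3, count) == 3);
    assert(count == 1);
    assert(record_if_positive(-2, count) == -1);
    assert(count == 1);           // nothing recorded for the negative value
}
```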

Which are the most common pitfalls with conditional (if) statements in C++? [closed]

Preface
This question is meant as a canonical collection of the most frequent (beginner) mistakes using conditional statements like if() ... else or similar.
Answers are meant to describe unexpected behaviors at runtime. Syntactical flaws and misconceptions like
if(x) {}
else (y) {}
should not be addressed here.
Addressed issues
Misconceptions of conditional expressions
Formatting and scoping errors

Misconceptions of conditional expressions
if(x = 1) // ...
Equality comparisons are expressed using ==. A single = is an assignment, and its result is converted to bool, i.e. any value != 0 results in true. As a prevention mechanism, consider the expressions below:
if(1 = x) // invalid assignment compilation error!
if(1 == x) // valid equality comparison
A wrong use of an assignment operator can be avoided by always placing the constant on the left hand side of the expression. The compiler will flag any mistake triggering an invalid assignment error.
if(answer == 'y' || 'Y')
Variations: if(answer == 'y','Y')
Conditions must be tested with separate comparisons. The || operator binding doesn't do what's expected here. Use if(answer == 'y' || answer == 'Y') instead.
if (0 < x < 42)
This is valid syntax in Python, with the expected behaviour. The syntax is also valid in C++, but it is parsed as if ((0 < x) < 42): the false/true result is converted to 0/1 and then tested against < 42, which is always true.
Condition must be tested with separate comparisons: if (0 < x && x < 42)
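A small sketch to make the parses explicit. The function names are invented here, and the "as parsed" versions spell out the implicit conversions by hand rather than relying on the misleading spelling:

```cpp
#include <cassert>

// What `0 < x < 42` really computes: (0 < x) yields a bool, which is then
// converted to 0 or 1 and compared with 42.
bool chained_as_parsed(int x) { return static_cast<int>(0 < x) < 42; }

// The intended range check.
bool range_check(int x) { return 0 < x && x < 42; }

// What `answer == 'y' || 'Y'` really computes: 'Y' is a non-zero value,
// so the right-hand operand is always true.
bool yes_as_parsed(char answer) { return (answer == 'y') || ('Y' != 0); }

// The intended test.
bool yes_check(char answer) { return answer == 'y' || answer == 'Y'; }

void check_condition_pitfalls() {
    assert(chained_as_parsed(100));   // "in range" although 100 >= 42
    assert(chained_as_parsed(-5));    // "in range" although -5 <= 0
    assert(!range_check(100));
    assert(!range_check(-5));
    assert(yes_as_parsed('n'));       // any answer counts as yes
    assert(!yes_check('n'));
    assert(yes_check('Y'));
}
```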

Formatting and scoping errors
if(mycondition);
{
// Why this code is always executed ???
}
There's a superfluous ; after the if() statement.
if(mycondition)
statement1();
statement2(); // Why this code is always executed ???
The code is equivalent to
if(mycondition) {
statement1();
}
statement2();
statement2(); is outside the scope of the conditional block. Add {} to group statements.
if (mycondition)
if (mycondition2)
statement1();
else
statement2();
The code is equivalent to
if(mycondition) {
if (mycondition2)
statement1();
else
statement2();
}
The else applies to the nearest preceding if. Add {} to pair it with the outer one:
if (mycondition) {
if (mycondition2)
statement1();
}
else
statement2();
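As a sketch of the difference (as_parsed and as_intended are hypothetical names, and both versions are fully braced so the pairing is visible), the two forms disagree whenever the outer condition is false, or the outer one is true and the inner one false:

```cpp
#include <cassert>

// How the unbraced version is parsed: the else pairs with the inner if.
int as_parsed(bool c1, bool c2) {
    int result = 0;
    if (c1) {
        if (c2) result = 1;
        else    result = 2;
    }
    return result;
}

// Braces force the else to pair with the outer if.
int as_intended(bool c1, bool c2) {
    int result = 0;
    if (c1) {
        if (c2) result = 1;
    } else {
        result = 2;
    }
    return result;
}

void check_dangling_else() {
    assert(as_parsed(true, true) == 1 && as_intended(true, true) == 1);
    assert(as_parsed(true, false) == 2);    // else ran with the inner if
    assert(as_intended(true, false) == 0);  // nothing ran
    assert(as_parsed(false, true) == 0);    // nothing ran
    assert(as_intended(false, true) == 2);  // else ran with the outer if
}
```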
The same applies for any wrongly placed ; in loop statements like
for(int x = 0;x < 5;++x);
// ^
{
// statements executed only once
}
or
while(x < 5);
// ^
{
// statements executed only once
}

if (north) {
} else if (south) {
} else if (west) {
} else if (east) {
}
There is potentially a missing else clause to handle an invalid direction.
Another pitfall is the odd placement of {}: people add them, misplace them and then delete the wrong ones.
if (a) {
}
if (b) {
}
if (c) {
}
Here the else is missing, if only one of the blocks should run even when more than one condition is true.

Qt String Comparison

Suppose I have:
QString x;
Is the following code fragment:
if(x.compare("abcdefg") == 0){
doSomething();
}
else{
doSomethingElse();
}
... functionally equivalent to:
if(x == "abcdefg"){
doSomething();
}
else{
doSomethingElse();
}
I could prove this for myself by writing a fairly trivial program and executing it, but I was surprised I couldn't find the question / answer here, so I thought I'd ask it for the sake of future me / others.

QString::compare will only return zero if the string passed to it and the string it is called on are equal.
QString::operator== returns true if the strings are equal; otherwise, false.
Since compare only returns zero when the strings are equal, the two tests always agree:
(qstring_variable.compare("text") == 0) == (qstring_variable == "text")
Both sides evaluate to true if qstring_variable contains "text" in the above example. If qstring_variable contains something else, then both sides evaluate to false.
Also note that std::string has the same behavior.
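Here is a quick sketch of the std::string side of that claim. It deliberately uses std::string rather than QString so it builds without Qt, which means it does not exercise the Qt API itself. The helper names are made up:

```cpp
#include <cassert>
#include <string>

// Equality expressed through compare(): zero means "equal".
bool equal_via_compare(const std::string& s, const char* text) {
    return s.compare(text) == 0;
}

// Equality expressed through operator==.
bool equal_via_operator(const std::string& s, const char* text) {
    return s == text;
}

void check_string_equality() {
    const std::string samples[] = {"text", "Text", "tex", "texts", ""};
    for (const std::string& s : samples) {
        // Both forms agree for every sample, equal or not.
        assert(equal_via_compare(s, "text") == equal_via_operator(s, "text"));
    }
    assert(equal_via_compare("text", "text"));
    assert(!equal_via_operator("Text", "text"));  // comparison is case sensitive
}
```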

Why does a false statement still execute?

I have this code...
void drawMap(void)
{
if (false)
return;
for(auto iter = this->m_layers.begin(); iter != m_layers.end(); ++iter)
{
if ((*iter)->get() == NULL)
continue;
PN::draw((*iter)->get(), b2Vec2(0,0), true, 0);
}
}
If I'm not mistaken it should NEVER execute...but it does...and when I change
if (false)
return;
to
if (false)
return;
else
return;
it doesn't execute at all now, but how can that first statement NOT be false? grabs headache pills
P.S. I only did this 'cause I was debugging and noticed my code was drawing to the screen when it wasn't supposed to.

if (false) will never execute its body... because the value of the condition is never true. So in the code you've given, the remainder of drawMap will always execute because it will never return at the start.
Consider if (x == 5) - that will only execute if the expression x == 5 is true. Now substitute false for x == 5...
If you want an if statement which will always execute, you want
if (true)
instead.

Count me in with the crowd that didn't actually read the problem well enough, or couldn't believe that the OP didn't understand the problem if it were so simple :)
Jon Skeet's answer, of course, was spot on :)
Two thoughts:
If you're in a debugger, lines can appear to be executed out of order, not at all, or at unexpected lines when the code is compiled with optimizations. This is because some machine instructions get 'attributed' to different source lines. Compile without optimization to eliminate this source of confusion. It is only confusing: barring compiler bugs, optimizations should not alter the effective behaviour.
It could be that you're picking up an evil #define for false that you cannot trust. Rule this out by running the code through the preprocessor only: g++ -E will do that, and MSVC++ has an option to 'keep preprocessed' source.

if (false)
is analogous to
if (1 == 2)
and will therefore never execute the next statement (or block).
In your context consider the following comments I made:
void drawMap(void)
{
if (false) return; //Not gonna happen.
//The following will always happen
for(auto iter = this->m_layers.begin(); iter != m_layers.end(); ++iter)
{
if ((*iter)->get() == NULL)
continue;
PN::draw((*iter)->get(), b2Vec2(0,0), true, 0);
}
}

I have seen if (false) used in a switch/case-like construction like this:
int ret = doSomeThingFunction();
if (false) {}
else if (ret < 0 ) {
}
else if (ret == 0) {
}
else if (ret > 0) {
}
