In C++ is there any difference in the assembly code generated by a statement such as:
if (*expr*) { }
vs.
if (*expr*) return;
Basically, I want to know whether delimiting the body of an if statement with braces generates different underlying code than a simple return statement, as above.
If there's going to be only one statement within the block then both are identical.
if (e)
{
stmt;
}
if (e) stmt;
are the same. However, when you've more than one statement to be executed, it's mandatory to wrap them in {}.
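A minimal sketch of the point above. The function names clamp_braced, clamp_unbraced and reset_both are made up for illustration. The first two differ only in the braces and behave identically; the third needs braces because two statements depend on the condition.

```cpp
#include <cassert>

// Single statement, with braces.
int clamp_braced(int v) {
    if (v < 0) {
        v = 0;
    }
    return v;
}

// The same single statement without braces: identical meaning.
int clamp_unbraced(int v) {
    if (v < 0) v = 0;
    return v;
}

// Two statements depend on the condition, so braces are required.
void reset_both(bool condition, int& a, int& b) {
    if (condition) {
        a = 0;
        b = 0;
    }
}

// Usage example.
void example_usage() {
    assert(clamp_braced(-5) == clamp_unbraced(-5));
    int a = 1, b = 2;
    reset_both(false, a, b);
    assert(a == 1 && b == 2);  // condition was false, nothing was reset
}
```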
No, there is no difference between the two examples. You can inspect the assembly the compiler emits (for example with the -S option of GCC or Clang) and see for yourself.
Aside from the return statement, there is no difference.
A compiler may optimise out the first case altogether if the expression is either compile-time evaluable (e.g. sizeof) or has no side-effects.
Similarly, the second case might be optimised out to a simple return;
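To illustrate the side-effect caveat, here is a small sketch with a hypothetical next_value() function (not from the question). The if has an empty body, but because the condition has an observable side effect, the call still has to happen; only the branch itself can be dropped.

```cpp
#include <cassert>

int calls = 0;  // counts how often next_value() ran

// Hypothetical function with a visible side effect.
int next_value() {
    ++calls;
    return calls;
}

void empty_body() {
    if (next_value()) { }  // body is empty, but the call must still be made
}

// Usage example.
void example_usage() {
    calls = 0;
    empty_body();
    assert(calls == 1);
}
```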
This function …
void f1(int e) {
if (e) {}
}
… compiles to …
f1:
rep ret
… while this function …
void f2(int e) {
if (e) return;
}
… compiles to …
f2:
rep ret
… when optimization is enabled using the -O2 option.
If there is only one statement, then there is no difference in using "block" and "writing inline".
Braces are used only to enclose a sequence of statements that are intended to be seen as a single process.
General syntax for if:
if ( condition_in_the_diamond )
statement_to_execute_if_condition_is_true;
So to make multiple lines of code as a single process, we use braces.
Hence, if you have only one statement to execute in the if statement, the two forms are equivalent.
Using braces is better because it reduces the chance of error. Suppose you comment out a line of code in a hurry while debugging:
if(condition)
// statement1;
statement2; //this will bring statement2 in if clause which was not intended
or while adding a line of code:
if(condition)
statement1;
statement3; // won't be included in if
statement2;
But if you are using inline statement as:
if(condition) statement1;
then it might protect you from the errors above, but it limits the length of statement1 (assuming an 80-character line width). It does keep the code short and simple to read, though.
Hence, unless you are using the inline form, using braces is recommended.
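The commented-out-line pitfall can be reproduced in a few lines. This is only a sketch: count_unbraced and count_braced are made-up names, and the commented line stands in for whatever statement was disabled during debugging.

```cpp
#include <cassert>

// Intended: do something extra when condition is true, and always count.
// The extra line was commented out, so the increment silently became
// the body of the if.
int count_unbraced(bool condition) {
    int counted = 0;
    if (condition)
        // logged = 1;
    ++counted;  // now controlled by the if
    return counted;
}

// With braces, commenting out the line leaves an empty block and the
// increment stays unconditional.
int count_braced(bool condition) {
    int counted = 0;
    if (condition) {
        // logged = 1;
    }
    ++counted;
    return counted;
}

// Usage example.
void example_usage() {
    assert(count_unbraced(false) == 0);  // the increment was skipped
    assert(count_braced(false) == 1);    // the increment always runs
}
```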
With poorly-written macros nasty things can happen.
// c-style antipattern. Better use scoped_ptr
#define CLEANUPANDRETURN delete p; return
if (*expr*) CLEANUPANDRETURN;
// preprocesses to: (with added linebreaks)
if (*expr*)
delete p;
return;
Many functions of the C library may actually be macros. See
Why use apparently meaningless do-while and if-else statements in macros? for a trick that makes macros somewhat safer.
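A sketch of the problem and of the do-while trick mentioned above. The macro names are hypothetical, and a counter stands in for the delete p of the original macro so the effect can be observed.

```cpp
#include <cassert>

int cleanups = 0;  // stands in for the real cleanup work

// Unsafe: expands to two statements; only the first is guarded by an if.
#define CLEANUP_AND_RETURN_UNSAFE ++cleanups; return

// Safer: do { ... } while (0) turns the expansion into a single statement.
#define CLEANUP_AND_RETURN_SAFE do { ++cleanups; return; } while (0)

void unsafe_version(bool failed, bool& reached_end) {
    reached_end = false;
    if (failed) CLEANUP_AND_RETURN_UNSAFE;  // the return runs unconditionally
    reached_end = true;                     // never reached
}

void safe_version(bool failed, bool& reached_end) {
    reached_end = false;
    if (failed) CLEANUP_AND_RETURN_SAFE;
    reached_end = true;
}

// Usage example.
void example_usage() {
    bool reached = false;
    unsafe_version(false, reached);
    assert(!reached);  // returned early although nothing failed
    safe_version(false, reached);
    assert(reached);
}
```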
I just found out that no matter how many semicolons I write (as long as there is at least one), the compiler compiles the code without error:
#include <iostream>
int main()
{
int x;
x = 5;;
std::cout << x;;;
}
This works just fine. Why?
It's not an error because the language standard says so. It's OK to have empty statements, they do nothing, and are harmless.
There are times when it's useful:
#ifdef DEBUG
#include <iostream>
#define DEBUG_LOG(X) std::cout << X << std::endl;
#else
#define DEBUG_LOG(X)
#endif
int main()
{
DEBUG_LOG(1);
}
When DEBUG is not defined this will expand to:
int main()
{
;
}
If you couldn't have empty statements that would not compile.
The semicolon is a terminal, a token that terminates something. What exactly it terminates depends on the context.
For example, a semicolon character is at the end of the following parts of the C++ grammar (not necessarily a complete list):
an expression-statement
a do/while iteration-statement
the various jump-statements
the simple-declaration
Note that in an expression-statement, the expression is optional. That's why a 'run' of semicolons, ;;;;, is valid in many (but not all) places where a single one is.
A semicolon terminates a statement; consecutive semicolons represent empty statements (no operation).
No code is generated for an empty statement.
If you have two consecutive semicolons, there is an empty statement between them (just another way of saying: there is no statement between them). So, why are empty statements allowed?
Sometimes you need to use some construct where the language expects a statement, but you don't want to supply one. For example, a common way to write an infinite loop is like this:
for (;;) {
// do something
// ...and break somewhere
}
If C++ didn't allow empty statements, we would have to put some dummy statements instead of the naked ;; just to make this work.
When you have ;;; in between you have empty statements. Remember C++ doesn't care about white space.
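Two small illustrations of where an empty statement or an empty loop header is useful in practice. The function names are made up for this sketch. In the first, the loop header does all the work and the body is a lone semicolon; in the second, for (;;) leaves out all three header parts and relies on break.

```cpp
#include <cassert>
#include <cstddef>

// The loop header does all the work, so the body is an empty statement.
std::size_t length_of(const char* s) {
    const char* p = s;
    for (; *p != '\0'; ++p)
        ;  // empty statement as the loop body
    return static_cast<std::size_t>(p - s);
}

// for (;;) omits initialization, condition and increment; break ends it.
int first_power_of_two_at_least(int n) {
    int value = 1;
    for (;;) {
        if (value >= n) break;
        value *= 2;
    }
    return value;
}

// Usage example.
void example_usage() {
    assert(length_of("abc") == 3);
    assert(first_power_of_two_at_least(5) == 8);
}
```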
One of my university colleagues, who has started programming this year, sometimes writes if statements like this:
if(something) doA();
else
if(something2) doC();
else doD();
He is convinced that the second if-else pair is treated as a single entity, and that it is in fact nested under the first else.
I'm, however, sure that his code is equivalent to:
if(something) doA();
else if(something2) doC();
else doD();
This shows that the second else is not actually nested, but on the same level as the first if. I told him he needs to use curly braces to achieve what he wants to.
"But my code works as intended!"
And indeed, it worked as intended. Turns out the behavior of the code was the same, even if the else was not nested.
Surprisingly, I have found myself unable to write a clear and concise example that shows different behavior between
if(something) doA();
else
if(something2) doC();
else doD();
and
if(something) doA();
else {
if(something2) doC();
else doD();
}
Can you help me find an example that will show my colleague the difference between using/not using curly braces?
Or is the incorrect-looking version always equivalent to the one with curly braces, in terms of behavior?
Per C 2011 6.8.4 1, the grammar for a selection-statement includes this production:
selection-statement: if ( expression ) statement else statement
Per 6.8 1, a production for statement is:
statement: selection-statement
Thus, in:
if(something) doA();
else
if(something2) doC();
else doD();
the indented if and else form a selection-statement that is the statement that appears in the else clause of the preceding selection-statement.
The productions I have shown show that this is a possible interpretation in the C grammar. To see that it is the only interpretation, we observe that the text in the else clause of the initial selection-statement must be a statement, because there is no other production in the C grammar that produces an else keyword. (This is most easily seen by searching the grammar in clause A.2. Due to its size, I will not reproduce it here.) So we know the else is followed by a statement. We can easily see that the statement is a selection-statement, since it begins with if. Then the only question remaining is whether the next else is part of that if statement or not. Per 6.8.4.1 3, “An else is associated with the lexically nearest preceding if that is allowed by the syntax.”
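The nearest-if rule quoted above does produce an observable difference when the inner if sits in the if branch rather than in the else branch. The following sketch uses made-up function names; the misleading indentation in unbraced is deliberate.

```cpp
#include <cassert>
#include <string>

// The else pairs with the nearest if (the inner one), despite the indentation.
std::string unbraced(bool a, bool b) {
    std::string result = "none";
    if (a)
        if (b) result = "inner-if";
    else
        result = "else";
    return result;
}

// Braces attach the else to the outer if instead.
std::string braced(bool a, bool b) {
    std::string result = "none";
    if (a) {
        if (b) result = "inner-if";
    } else {
        result = "else";
    }
    return result;
}

// Usage example: the two versions disagree whenever b is false.
void example_usage() {
    assert(unbraced(true, false) == "else");
    assert(braced(true, false) == "none");
}
```

Compilers such as GCC can warn about the unbraced form (its -Wdangling-else warning), which is one more reason to add the braces.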
Both structures come out to the same thing. The compiler
effectively sees the code as:
if ( something ) {
doA();
} else {
if ( something2 ) {
doC();
} else {
doD();
}
}
In practice, however, there is no difference between this and:
if ( something ) {
doA();
} else if ( something2 ) {
doC();
} else {
doD();
}
The extra braces encapsulate a single statement, and you don't
actually need the braces when the if or the else controls
a single statement. (My first example puts every statement
except the encompassing if in braces.)
Logically, programmers tend to think along the lines of the
second; languages where some sort of bracing ({},
BEGIN/END or indentation) is required almost always add an
elsif or elif keyword in order to permit this second form.
C and C++ (and Java, and C#, and...) don't, because the second
form works out without the extra keyword.
In the end, you don't want the extra indentation. (I've cases
with fifteen or twenty successive else if. That would make
for some serious indentation.) On the other hand, you do want
the controlled statement on a separate line. (Bracing is
optional: if your coding standard puts the brace on a separate
line, it's also conventional to suppress it if it only contains
a single statement.)
I recently just lost some time figuring out a bug in my code which was caused by a typo:
if (a=b)
instead of:
if (a==b)
I was wondering if there is any particular case where you would want to assign a value to a variable in an if statement, or if not, why doesn't the compiler throw a warning or an error?
if (Derived* derived = dynamic_cast<Derived*>(base)) {
// do stuff with `derived`
}
Though this is oft cited as an anti-pattern ("use virtual dispatch!"), sometimes the Derived type has functionality that the Base simply does not (and, consequently, distinct functions), and this is a good way to switch on that semantic difference.
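A self-contained version of the pattern, as a sketch: the Base and Derived types and the extra() member are invented for illustration.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;  // polymorphic base, required for dynamic_cast
};

struct Derived : Base {
    std::string extra() const { return "derived-only"; }
};

// Returns the Derived-only data when available, otherwise a fallback.
std::string describe(Base* base) {
    if (Derived* derived = dynamic_cast<Derived*>(base)) {
        return derived->extra();  // derived is non-null and in scope here
    }
    return "plain base";          // derived is out of scope here
}

// Usage example.
void example_usage() {
    Base b;
    Derived d;
    assert(describe(&b) == "plain base");
    assert(describe(&d) == "derived-only");
}
```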
Here is some history on the syntax in question.
In classical C, error handling was frequently done by writing something like:
int error;
...
if(error = foo()) {
printf("An error occurred: %s\nBailing out.\n", strerror(error));
abort();
}
Or, whenever there was a function call that might return a null pointer, the idiom was used the other way round:
Bar* myBar;
... // In old C variables had to be declared at the start of the scope
if(myBar = getBar()) {
// Do something with myBar
}
However, this syntax is dangerously close to
if(myValue == bar()) ...
which is why many people consider the assignment inside a condition bad style, and compilers started to warn about it (at least with -Wall). Some compilers allow avoiding this warning by adding an extra set of parentheses:
if((myBar = getBar())) { // It tells the compiler: Yes, I really want to do that assignment!
However, this is ugly and somewhat of a hack, so it's better to avoid writing such code.
Then C99 came around, allowing you to mix definitions and statements, so many developers would frequently write something like
Bar* myBar = getBar();
if(myBar) {
which does feel awkward. This is why C++ allows definitions inside conditions, to provide a short, elegant way to do this:
if(Bar* myBar = getBar()) {
There isn't any danger in this statement anymore. You explicitly give the variable a type, obviously wanting it to be initialized. It also avoids the extra line to define the variable, which is nice. But most importantly, the compiler can now easily catch this sort of bug:
if(Bar* myBar = getBar()) {
...
}
foo(myBar->baz); // Compiler error
myBar->foo(); // Compiler error
Without the variable definition inside the if statement, this condition would not be detectable.
To make a long answer short: The syntax in your question is the product of old C's simplicity and power, but it is evil, so compilers can warn about it. Since it is also a very useful way to express a common problem, there is now a very concise, bug-robust way to achieve the same behavior. And there are a lot of good, possible uses for it.
An assignment expression yields the value that was assigned. So, I might use it in a situation like this:
if (x = getMyNumber())
I assign x to be the value returned by getMyNumber and I check if it's not zero.
Avoid doing that. I gave you an example just to help you understand this.
To avoid such bugs to some extent, one can write the if condition as if (NULL == ptr) instead of if (ptr == NULL). If you mistype the equality operator == as =, the compiler rejects if (NULL = ptr) with an lvalue error, whereas if (ptr = NULL) is accepted by the compiler (which is not what you mean) and remains a bug in the code until runtime.
One should also read Criticism regarding this kind of code.
why doesn't the compiler throw a warning
Some compilers will generate warnings for suspicious assignments in a conditional expression, though you usually have to enable the warning explicitly.
For example, in Visual C++, you have to enable C4706 (or level 4 warnings in general). I generally turn on as many warnings as I can and make the code more explicit in order to avoid false positives. For example, if I really wanted to do this:
if (x = Foo()) { ... }
Then I'd write it as:
if ((x = Foo()) != 0) { ... }
The compiler sees the explicit test and assumes that the assignment was intentional, so you don't get a false positive warning here.
The only drawback with this approach is that you can't use it when the variable is declared in the condition. That is, you cannot rewrite:
if (int x = Foo()) { ... }
as
if ((int x = Foo()) != 0) { ... }
Syntactically, that doesn't work. So you either have to disable the warning, or compromise on how tightly you scope x.
C++17 added the ability to have an init-statement in the condition for an if statement (p0305r1), which solves this problem nicely (for any kind of comparison, not just != 0).
if (x = Foo(); x != 0) { ... }
Furthermore, if you want, you can limit the scope of x to just the if statement:
if (int x = Foo(); x != 0) { /* x in scope */ ... }
// x out of scope
In C++17, one can use:
if (<initialize> ; <conditional_expression>) { <body> }
Similar to a for loop iterator initializer.
Here is an example:
if (Employee employee = GetEmployee(); employee.salary > 100) { ... }
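A runnable variant of the same idea, as a sketch that assumes a C++17 compiler. The function name and the map of salaries are invented for illustration; the point is that the iterator is initialized, tested, and scoped in one if.

```cpp
#include <cassert>
#include <map>
#include <string>

// Looks up a name; the iterator exists only inside the if statement.
int salary_or_default(const std::map<std::string, int>& salaries,
                      const std::string& name, int fallback) {
    if (auto it = salaries.find(name); it != salaries.end()) {
        return it->second;
    }
    return fallback;  // 'it' is out of scope here
}

// Usage example.
void example_usage() {
    std::map<std::string, int> salaries{{"ann", 120}};
    assert(salary_or_default(salaries, "ann", 0) == 120);
    assert(salary_or_default(salaries, "bob", -1) == -1);
}
```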
It depends on whether you want to write clean code or not. When
C was first being developed, the importance of clean code
wasn't fully recognized, and compilers were very simplistic:
using nested assignment like this could often result in faster
code. Today, I can't think of any case where a good programmer
would do it. It just makes the code less readable and more
difficult to maintain.
Suppose you want to check several conditions in a single if, and if any one of the conditions is true, you'd like to generate an error message. If you want to include in your error message which specific condition caused the error, you could do the following:
std::string e;
if( myMap[e = "ab"].isNotValid() ||
myMap[e = "cd"].isNotValid() ||
myMap[e = "ef"].isNotValid() )
{
// Here, 'e' has the key for which the validation failed
}
So if the second condition is the one that evaluates to true, e will be equal to "cd". This is due to the short-circuit behaviour of || which is mandated by the standard (unless overloaded). See this answer for more details on short-circuiting.
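To make the snippet above runnable, here is a sketch with a hypothetical Entry type that provides isNotValid(); everything except the condition itself is an assumption made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical value type; entries are valid unless marked otherwise.
struct Entry {
    bool valid = true;
    bool isNotValid() const { return !valid; }
};

// Returns the first key whose entry fails validation, or "" if all pass.
std::string first_invalid_key(std::map<std::string, Entry>& myMap) {
    std::string e;
    if (myMap[e = "ab"].isNotValid() ||
        myMap[e = "cd"].isNotValid() ||
        myMap[e = "ef"].isNotValid()) {
        return e;  // || stopped at the first failing check
    }
    return "";
}

// Usage example.
void example_usage() {
    std::map<std::string, Entry> entries;
    entries["cd"].valid = false;
    assert(first_invalid_key(entries) == "cd");
}
```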
Doing assignment in an if is a fairly common thing, though it's also common that people do it by accident.
The usual pattern is:
if (int x = expensive_function_call())
{
// ...do things with x
}
The anti-pattern is where you're mistakenly assigning to things:
if (x = 1)
{
// Always true
}
else
{
// Never happens
}
You can avoid this to a degree by putting your constants or const values first, so your compiler will throw an error:
if (1 = x)
{
// Compiler error, can't assign to 1
}
= vs. == is something you'll need to develop an eye for. I usually put whitespace around the operator so it's more obvious which operation is being performed, as longname=longername looks a lot like longname==longername at a glance, but = and == on their own are obviously different.
This is quite a common case. Use
if (0 == exit_value)
instead of
if (exit_value == 0)
so that this kind of typo (= instead of ==) causes a compile error.
I have a somewhat unusual situation: I want to use a goto statement to jump into a loop, not out of it.
There are strong reasons to do so - this code must be part of some function which makes some calculations after the first call, returns with request for new data and needs one more call to continue. Function pointers (obvious solution) can't be used because we need interoperability with code which does not support function pointers.
I want to know whether code below is safe, i.e. it will be correctly compiled by all standard-compliant C/C++ compilers (we need both C and C++).
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
if( not_a_first_call )
goto request_handler;
for(i=0; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
}
I've studied standards, but there isn't much information about such use case. I also wonder whether replacing for by equivalent while will be beneficial from the portability point of view.
Thanks in advance.
UPD: Thanks to all who've commented!
to all commenters :) yes, I understand that I can't jump over initializers of local variables and that I have to save/restore i on each call.
about strong reasons :) This code must implement a reverse communication interface. Reverse communication is a coding pattern which tries to avoid using function pointers. Sometimes it has to be used because of legacy code which expects that you will use it.
Unfortunately, r-comm-interface can't be implemented in a nice way. You can't use function pointers and you can't easily split work into several functions.
Seems perfectly legal.
From a draft of the C99 standard http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n843.htm in the section on the goto statement:
[#3] EXAMPLE 1 It is sometimes convenient to jump into the
middle of a complicated set of statements. The following
outline presents one possible approach to a problem based on
these three assumptions:
1. The general initialization code accesses objects only
visible to the current function.
2. The general initialization code is too large to
warrant duplication.
3. The code to determine the next operation is at the
head of the loop. (To allow it to be reached by
continue statements, for example.)
/* ... */
goto first_time;
for (;;) {
// determine next operation
/* ... */
if (need to reinitialize) {
// reinitialize-only code
/* ... */
first_time:
// general initialization code
/* ... */
continue;
}
// handle other operations
/* ... */
}
Next, we look at the for loop statement:
[#1] Except for the behavior of a continue statement in the
loop body, the statement
for ( clause-1 ; expr-2 ; expr-3 ) statement
and the sequence of statements
{
clause-1 ;
while ( expr-2 ) {
statement
expr-3 ;
}
}
Putting the two together with your problem tells you that you are jumping past
i=0;
into the middle of a while loop. You will execute
...process data...
and then
i++;
before flow of control jumps to the test in the while/for loop
i<n;
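The control flow described above can be checked with a small C++ sketch. The function name and parameters are made up; i is declared before the goto without an initializer, so the jump does not bypass any initialization.

```cpp
#include <cassert>

// Sums loop indices, optionally entering the loop body at index 'start'.
int sum_resuming_at(int start, int n) {
    int i;          // declared before the goto, nothing is bypassed
    int total = 0;
    if (start > 0) {
        i = start;  // the for-init (i = 0) is skipped, so set i by hand
        goto resume;
    }
    for (i = 0; i < n; i++) {
resume:
        total += i; // after a jump this runs once before i < n is tested
    }
    return total;
}

// Usage example.
void example_usage() {
    assert(sum_resuming_at(0, 5) == 10);  // 0 + 1 + 2 + 3 + 4
    assert(sum_resuming_at(2, 5) == 9);   // 2 + 3 + 4
    assert(sum_resuming_at(7, 5) == 7);   // body ran once, then the test failed
}
```

The last case shows the point made above: after the jump, the body and the increment execute before the loop condition is evaluated for the first time.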
Yes, that's legal.
What you're doing is nowhere near as ugly as e.g. Duff's Device, which also is standard-compliant.
As @Alexandre says, don't use goto to skip over variable declarations with non-trivial constructors.
I'm sure you're not expecting local variables to be preserved across calls, since automatic variable lifetime is so fundamental. If you need some state to be preserved, functors (function objects) would be a good choice (in C++). C++0x lambda syntax makes them even easier to build. In C you'll have no choice but to store state into some state block passed in by pointer by the caller.
First, I need to say that you should reconsider doing this some other way. I've rarely seen anyone use goto these days except for error management.
But if you really want to stick with it, there are a few things you'll need to keep in mind:
Jumping from outside the loop to the middle won't make your code loop. (check the comments below for more info)
Be careful and don't use variables that are set before the label, for instance, referring to *data_to_request. This includes i, which is set in the for statement and is not initialized when you jump to the label.
Personally, I think in this case I would rather duplicate the code for ...process data... than use goto. And if you pay close attention, you'll notice the return statement inside your for loop, meaning that the code of the label will never get executed unless there's a goto in the code to jump to it.
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
int i = 0;
if( not_a_first_call )
{
...process data...
*data_to_request = i;
return;
}
for (i=0; i<n; i++)
{
*data_to_request = i;
return;
}
}
No, you can't do this. I don't know what this will do exactly, but I do know that as soon as you return, your call stack is unwound and the variable i doesn't exist anymore.
I suggest refactoring. It looks like you're pretty much trying to build an iterator function similar to yield return in C#. Perhaps you could actually write a C++ iterator to do this?
It seems to me that you didn't declare i. Whether what you are doing is legal depends entirely on where it is declared; see below for the initialization.
In C you may declare it before the loop or as loop variable. But if it is declared as loop variable its value will not be initialized when you use it, so this is undefined behavior. And if you declare it before the for the assignment of 0 to it will not be performed.
In C++ you can't jump across the constructor of the variable, so you must declare it before the goto.
In both languages you have a more important problem: whether the value of i is well defined and, if it is initialized, whether that value makes sense.
Really if there is any way to avoid this, don't do it. Or if this is really, really, performance critical check the assembler if it really does what you want.
If I understand correctly, you're trying to do something on the order of:
The first time foo is called, it needs to request some data from somewhere else, so it sets up that request and immediately returns;
On each subsequent call to foo, it processes the data from the previous request and sets up a new request;
This continues until foo has processed all the data.
I don't understand why you need the for loop at all in this case; you're only iterating through the loop once per call (if I understand the use case here). Unless i has been declared static, you lose its value each time through.
Why not define a type to maintain all the state (such as the current value of i) between function calls, and then define an interface around it to set/query whatever parameters you need:
typedef ... FooState;
void foo(FooState *state, ...)
{
if (FirstCall(state))
{
SetRequest(state, 1);
}
else if (!Done(state))
{
// process data;
SetRequest(state, GetRequest(state) + 1);
}
}
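One way the state-block idea could look in compilable form. This is only a sketch under assumptions: the fields of FooState, the foo signature, and the run driver are all invented here, and adding the reply to a total stands in for the real "process data" step.

```cpp
#include <cassert>

// Hypothetical state block owned by the caller.
struct FooState {
    int next = 0;          // index of the next request to issue
    int total = 0;         // stands in for the result of "process data"
    bool started = false;  // false until the first call has happened
};

// Returns true while another call is needed; *data_to_request holds the request.
bool foo(FooState& state, int n, int reply, int* data_to_request) {
    if (state.started) {
        state.total += reply;  // process the data for the previous request
        ++state.next;
    }
    state.started = true;
    if (state.next >= n) return false;  // all requests handled
    *data_to_request = state.next;
    return true;
}

// Driver standing in for the caller: answers request i with i * 10.
int run(int n) {
    FooState state;
    int request = 0;
    int reply = 0;
    while (foo(state, n, reply, &request)) {
        reply = request * 10;
    }
    return state.total;
}

// Usage example.
void example_usage() {
    assert(run(3) == 30);  // replies 0, 10 and 20 were processed
}
```

Because all state lives in the struct, no goto and no function pointer is needed, and the same shape works in C with a plain struct passed by pointer.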
The initialisation part of the for loop will not occur, which makes it somewhat redundant. You need to initialise i before the goto.
int i = 0 ;
if( not_a_first_call )
goto request_handler;
for( ; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
However, this is really not a good idea!
The code is flawed in any case: the return statement circumvents the loop. As it stands it is equivalent to:
int i = 0 ;
if( not_a_first_call )
{
// ...process data...
i++ ;
}
if( i < n )
{
*data_to_request = i;
}
In the end, if you think you need to do this then your design is flawed, and from the fragment posted your logic also.
My guess is it just made parsing easier, but I can't see exactly why.
So what does this have ...
do
{
some stuff
}
while(test);
more stuff
that's better than ...
do
{
some stuff
}
while(test)
more stuff
Because you're ending the statement. A statement ends either with a block (delimited by curly braces), or with a semicolon. "do this while this" is a single statement, and can't end with a block (because it ends with the "while"), so it needs a semicolon just like any other statement.
If you take a look at C++ grammar, you'll see that the iteration statements are defined as
while ( condition ) statement
for ( for-init-statement condition-opt ; expression-opt ) statement
do statement while ( expression ) ;
Note that only do-while statement has an ; at the end. So, the question is why the do-while is so different from the rest that it needs that extra ;.
Let's take a closer look: both for and regular while end with a statement. But do-while ends with a controlling expression enclosed in (). The presence of that enclosing () already allows the compiler to unambiguously find the end of the controlling expression: the outer closing ) designates where the expression ends and, therefore, where the entire do-while statement ends. In other words, the terminating ; is indeed redundant.
However, in practice that would mean that, for example, the following code
do
{
/* whatever */
} while (i + 2) * j > 0;
while valid from the grammar point of view, would really be parsed as
do
{
/* whatever */
} while (i + 2)
*j > 0;
This is formally sound, but it is not really intuitive. I'd guess that for such reasons it was decided to add a more explicit terminator to the do-while statement - a semicolon. Of course, per @Joe White's answer there are also considerations of plain and simple consistency: all ordinary (non-compound) statements in C end with a ;.
It's because while statements are valid within a do-while loop.
Consider the different behaviors if the semicolon weren't required:
int x = 10;
int y = 10;
do
while(x > 0)
x--;
while(x = y--);
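With the semicolon required, the snippet above has exactly one reading: the first while is the body of the do, and the second while is its terminating condition. The sketch below counts how often the body runs. The counter, the braces, and the extra parentheses around the assignment are additions made for the example.

```cpp
#include <cassert>

// Counts how many times the do body runs for the snippet above.
int outer_iterations() {
    int x = 10;
    int y = 10;
    int count = 0;
    do {
        ++count;
        while (x > 0)  // inner loop: the body of the do
            x--;
    } while ((x = y--));  // assignment intended; extra parentheses say so
    return count;
}

// Usage example.
void example_usage() {
    // The condition yields 10, 9, ..., 1 (true ten times) and then 0,
    // so the body runs eleven times in total.
    assert(outer_iterations() == 11);
}
```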
While I don't know the answer, consistency seems like the best argument. Every statement group in C/C++ is either terminated by
A semicolon
A brace
Why create a construct which does neither?
Flow control statement consistency
Considering consistency...
if (expr) statement;
do statement; while (expr);
for (expr; expr; expr) statement;
while (expr) statement;
...all these flow-control constructs end with a semicolon.
But, countering that we can note that of the block-statement forms, only do while is semicolon delimited:
if (expr) { ... }
do { ... } while (expr);
for (expr; expr; expr) { }
while (expr) { }
So, we have ';' or '}', but never a "bare" ')'.
Consistency of statement delimiters
We can at least say that every statement must be delimited by ; or }, and visually that helps us distinguish statements.
If no semicolon were required, consider:
do statement1; while (expr1) statement2; do ; while (expr2) statement3; while (expr3) statement4;
It's very difficult to visually resolve that to the distinct statements:
do statement1; while (expr1)
statement2;
do ; while (expr2)
statement3;
while (expr3) statement4;
By way of contrast, the following is more easily resolved as a ; immediately after a while condition tells you to seek backwards for the do, and that the next statement is unconnected to that while:
do statement1; while (expr1); statement2; do ; while (expr2); statement3; while (expr3) statement4;
Does it matter, given people indent their code to make the flow understandable? Yes, because:
people sometimes make mistakes (or have them transiently as the code's massaged) and if it visually stands out that means it will be fixed more easily, and
macro substitutions can throw together lots of statements on one line, and we occasionally need to visually verify the preprocessor output while troubleshooting or doing QA.
Implications to preprocessor use
It's also worth noting the famous preprocessor do-while idiom:
#define F(X) do { fn(X); } while (false)
This can be substituted as follows:
if (expr)
F(x);
else
x = 10;
...yields...
if (expr)
do { fn(x); } while (false);
else
x = 10;
If the semicolon wasn't part of the do while statement, then the if statement would be interpreted as:
if (expr)
do-while-statement
; // empty statement
else
x = 10;
...and, because there are two statements after the if, it's considered complete, which leaves the else statement unmatched.
C is semicolon-terminated (whereas Pascal is semicolon-separated). It would be inconsistent to drop the semicolon there.
I, frankly, hate the reuse of the while for the do loop. I think repeat-until would have been less confusing. But it is what it is.
In C/C++, whitespace doesn't contribute to structure (unlike in, e.g., Python). Statements must be terminated with a semicolon. This is allowed:
do
{
some stuff; more stuff; even more stuff;
}
while(test);
My answer is that the compiler could get confused if we didn't put a semicolon at the end of a do...while(); loop. Without it, it is not clear:
when the do ends;
whether the while is a separate loop that immediately follows the do loop.
That's why we put a semicolon at the end of a do...while loop: to indicate that the loop terminates here if the condition is false.