C++ compiler optimizations and short-circuit evaluation [duplicate] - c++

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 7 years ago.
Here is my code :
b = f() || b;
The function f() has side effect and it must be always executed. Normally, only the right operand can be short-circuited and this code should work. But I am afraid some compilators reverse the two operands, since it's more efficient to short-circuit a function evaluation rather than a simple variable evaluation. I know that g++ -O3 can break some specifications, but I don't know if this code can be affected.
So, is my code risk-free?
I knew Is short-circuiting logical operators mandated? And evaluation order? but my question was about compilers optimizations, I didn't know that they can't break the standards (even if this would be strange).

But I am afraid some compilators reverse the two operands
These expressions must be evaluated left-to-right. This is covered in the standard about the operators &&, ||, ?, and ,. They specifically mention the order, as well as enforced sequence points.
§5.14.1 (Logical AND)
The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
§5.15.1 (Logical OR)
The || operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
§5.16.1 (Conditional operator)
Conditional expressions group right-to-left. The first expression is contextually converted to bool (Clause 4). It is evaluated and if it is true, the result of the conditional expression is the value of the second expression,
otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value computation and side effect associated with the first expression is sequenced before every value computation
and side effect associated with the second or third expression.
§5.19.1 (Comma operator)
The comma operator groups left-to-right. A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded value
expression (Clause 5). Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category
as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field. If the value of the right operand is a temporary (12.2), the result is that temporary.
Regarding your concern about optimizations violating this order, no compilers are not allowed to change the order. Compilers must first and foremost (try to) follow the standard. Then they can try to make your code faster. They may not violate the standard just for the sake of performance. That undermines the entire premise of having a standard.

It's explicitly stated by the standard that optimized code should behave "as-if" it was exactly the code that was written, as long as you only rely on standard behavior.
Since the standard requires the boolean statements to be evaluated from left to right, no (compliant) optimization can change the order of evaluation.

Related

How can a built-in operator (except new, delete and function call) return?

I have found next built-in operators which can return in Clause 5:
new, delete, function call, logical OR, all assignments operators.
It's all clear about the first three operators:
new operator usually calls allocation function which returns the
result.
delete operator usually calls deallocation function which
returns the result.
The result of a function call is usually the
result of the operand of the evaluated return statement in the called
function.
Other built-in operators didn't call functions. So how they can return something?
We can say only that such operators evaluate the result value or produce side effects like any other expressions.
Are there two inaccuracies in the Standard about built-in logical AND and built-in assignment operators?
Let's look at logical OR (§5.15/1):
The || operator groups left-to-right. The operands are both
contextually converted to bool (Clause 4). It returns true if either
of its operands is true, and false otherwise.
Compare with the technically correct definition of logical AND (§5.14/1):
The && operator groups left-to-right. The operands are both
contextually converted to bool (Clause 4). The result is true if both
operands are true and false otherwise.
Why they used "returns" in case of || operator and "the result is" in case of logical AND?
Next look at the built-in assignment operators (§5.18/1).
All require a modifiable lvalue as their left operand and return an
lvalue referring to the left operand.
Again we see "return" instead of "the result is".
P.S. I didn't find any phrases like "expression returns" in the Standard so it seems that such phrases are technically incorrect. Using them may cause strange questions
From the discussion above (thanks to Matteo Italia) we can conclude that there is absolutely NO difference between concepts "return the result" and "yield/evaluate the result value" for built-in operators (at least for built-in types which are the only ones that are interesting for the rules of built in operators).

C++ Order of Evaluation of Subexpressions with Logical Operators

There are lots of questions on concepts of precedence and order of evaluation but I failed to find one that refers to my special case.
Consider the following statement:
if(f(0) && g(0)) {};
Is it guaranteed that f(0) will be evaluated first? Notice that the operator is &&.
My confusion stems from what I've read in "The C++ Programming Language, (Stroustrup, 4ed, 2013)".
In section 10.3.2 of the book, it says:
The order of evaluation of subexpressions within an expression is undefined. In particular, you cannot assume that the expression is evaluated left-to-right. For example:
int x = f(2)+g(3); // undefined whether f() or g() is called first
This seems to apply to all operators including && operator, but in a following paragraph it says:
The operators , (comma), && (logical and), and || (logical or) guarantee that their left-hand operand is evaluated before their right-hand operand.
There is also another mention of this in section 11.1.1:
The && and || operators evaluate their second argument only if necessary, so they can be used to control evaluation order (§10.3.2). For example:
while (p && !whitespace(p)) ++p;
Here, p is not dereferenced if it is the nullptr.
This last quote implies that && and || evaluate their 1st argument first, so it seems to reinforce my assumption that operators mentioned in 2nd quote are exceptions to 1st quote, but I cannot draw a definitive conclusion from this last example either, as the expression contains only one subexpression as opposed to my example, which contains two.
The special sequencing behavior of &&, ||, and , is well-established in C and C++. The first sentence you quoted should say "The order of evaluation of subexpressions within an expression is generally unspecified" or "With a few specific exceptions, the order of evaluation of subexpressions within an expression is unspecified".
You asked about C++, but this question in the C FAQ list is pertinent.
Addendum: I just realized that "unspecified" is a better word in these rules than "undefined". Writing something like f() + g() doesn't give you undefined behavior. You just have no way of knowing whether f or g might be called first.
Yes, it is guaranteed that f(0) will be completely evaluated first.
This is to support behaviour known as short-circuiting, by which we don't need to call the second function at all if the first returns false.

Does GCC do lazy evaluation by default? [duplicate]

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 8 years ago.
I need to check if the node->next != NULL and I do not know if I can write:
if(node!= NULL && node->next!= NULL)
Yes, this is a guaranteed behaviour of C/C++, as nicely described in this question.
Just as an addition: In C/C++, this is refered to as short circuit evaluation, not lazy evaluation.
This has got nothing to do with GCC; the evaluation order and the short-circuiting is guaranteed by the language standard. Here's the relevant part from the C++11 standard, draft N3337 (emphasis mine):
5.14 Logical AND operator
The && operator groups left-to-right. The operands are both contextually converted to type bool. The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
5.15 Logical OR Operator
The || operator groups left-to-right. The operands are both contextually converted to bool. It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
Any standards-conformant compiler should do so. So, yes, GCC does so too.

Do parentheses force order of evaluation and make an undefined expression defined?

I was just going though my text book when I came across this question:
What would be the value of a after the following expression? Assume the initial value of a = 5. Mention the steps.
a+=(a++)+(++a)
At first I thought this is undefined behaviour because a has been modified more than once. Then I read the question and it said "Mention the steps" so I probably thought this question is right.
Does applying parentheses make an undefined behaviour defined?
Is a sequence point created after evaluating a parentheses expression?
If it is defined,how do the parentheses matter since ++ and () have the same precedence?
No, applying parentheses doesn't make it a defined behaviour. It's still undefined. The C99 standard §6.5 ¶2 says
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be read only to determine the value
to be stored.
Putting a sub-expression in parentheses may force the order of evaluation of sub-expressions but it does not create a sequence point. Therefore, it does not guarantee when the side effects of the sub-expressions, if they produce any, will take place. Quoting the C99 standard again §5.1.2.3¶2
Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and no side
effects of subsequent evaluations shall have taken place.
For the sake of completeness, following are sequence points laid down by the C99 standard in Annex C.
The call to a function, after the arguments have been evaluated.
The end of the first operand of the following operators: logical AND &&; logical OR ||; conditional ?; comma ,.
The end of a full declarator.
The end of a full expression; the expression in an expression statement; the controlling expression of a selection statement (if
or switch); the controlling expression of a while or do
statement; each of the expressions of a for statement; the
expression in a return statement.
Immediately before a library function returns.
After the actions associated with each formatted input/output function conversion specifier.
Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any
movement of the objects passed as arguments to that call.
Adding parenthesis does not create a sequence point and in the more modern standards it does not create a sequenced before relationship with respect to side effects which is the problem with the expression that you have unless noted the rest of this will be with respect to C++11. Parenthesis are a primary expression covered in section 5.1 Primary expressions, which has the following grammar (emphasis mine going forward):
primary-expression:
literal
this
( expression )
[...]
and in paragraph 6 it says:
A parenthesized expression is a primary expression whose type and value are identical to those of the enclosed expression. The presence of parentheses does not affect whether the expression is an lvalue. The parenthesized expression can be used in exactly the same contexts as those where the enclosed expression can be used, and with the same meaning, except as otherwise indicated.
The postfix ++ is problematic since we can not determine when the side effect of updating a will happen pre C++11 and in C this applies to both the postfix ++ and prefix ++ operations. With respect to how undefined behavior changed for prefix ++ in C++11 see Assignment operator sequencing in C11 expressions.
The += operation is problematic since:
[...]E1 op = E2 is equivalent to E1 = E1 op E2 except that E1 is
evaluated only once[...]
So in C++11 the following went from undefined to defined:
a = ++a + 1 ;
but this remains undefined:
a = a++ + 1 ;
and both of the above are undefined pre C++11 and in both C99 and C11.
From the draft C++11 standard section 1.9 Program execution paragraph 15 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

C++ ternary operator execution conditions

I am unsure about the guarantees of execution for the C / C++ ternary operator.
For instance if I am given an address and a boolean that tells if that address is good for reading I can easily avoid bad reads using if/else:
int foo(const bool addressGood, const int* ptr) {
if (addressGood) { return ptr[0]; }
else { return 0; }
}
However can a ternary operator (?:) guarantee that ptr won't be accessed unless addressGood is true? Or could an optimizing compiler generate code that accesses ptr in any case (possibly crashing the program), stores the value in an intermediate register and use conditional assignment to implement the ternary operator?
int foo(const bool addressGood, const int* ptr) {
// Not sure about ptr access conditions here.
return (addressGood) ? ptr[0] : 0;
}
Thanks.
Yes, the standard guarantees that ptr is only accessed if addressGood is true. See this answer on the subject, which quotes the standard:
Conditional expressions group right-to-left. The first expression is contextually converted to bool (Clause 4). It is evaluated and if it is true, the result of the conditional expression is the value of the second expression, otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second or third expression.
(C++11 standard, paragraph 5.16/1)
I would say, in addition to the answer that "yes, it's guaranteed by the C++ standard":
Please use the first form. It's MUCH clearer what you are trying to achieve.
I can almost guarantee that any sane compiler (with minimal amount of optimisation) generates exactly the same code for both examples anyway.
So whilst it's useful to know that both of these forms achieve the same "protection", it is definitely preferred to use the form that is most readable.
It also means you don't need to write a comment explaining that it is safe because of paragraph such and such in the C++ standard, thus making both take up the same amount of code-space - because if you didn't know it before, then you can rely on someone else ALSO not knowing that this is safe, and then spending the next half hour finding the answer via google, and either running into this thread, or asking the question again!
The conditional (ternary) operator guarantees to only evaluate the second operand if the first operand compares unequal to 0, and only evaluate the third operand if the first operand compares equal to 0. This means that your code is safe.
There is also a sequence point after the evaluation of the first operand.
By the way, you don't need the parantheses - addressGood ? ptr[0] : 0 is fine too.
c++11/[expr.cond]/1
Conditional expressions group right-to-left. The first expression is
contextually converted to bool (Clause 4).
It is evaluated and if it is true, the result of the conditional expression is the value of the second expression,
otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value
computation and side effect associated with the first expression is sequenced before every value computation
and side effect associated with the second or third expression.