Do parentheses force order of evaluation and make an undefined expression defined? - c++

I was just going though my text book when I came across this question:
What would be the value of a after the following expression? Assume the initial value of a = 5. Mention the steps.
a+=(a++)+(++a)
At first I thought this is undefined behaviour because a has been modified more than once. Then I read the question and it said "Mention the steps" so I probably thought this question is right.
Does applying parentheses make an undefined behaviour defined?
Is a sequence point created after evaluating a parentheses expression?
If it is defined,how do the parentheses matter since ++ and () have the same precedence?

No, applying parentheses doesn't make it a defined behaviour. It's still undefined. The C99 standard §6.5 ¶2 says
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be read only to determine the value
to be stored.
Putting a sub-expression in parentheses may force the order of evaluation of sub-expressions but it does not create a sequence point. Therefore, it does not guarantee when the side effects of the sub-expressions, if they produce any, will take place. Quoting the C99 standard again §5.1.2.3¶2
Evaluation of an expression may produce side effects. At certain
specified points in the execution sequence called sequence points, all
side effects of previous evaluations shall be complete and no side
effects of subsequent evaluations shall have taken place.
For the sake of completeness, following are sequence points laid down by the C99 standard in Annex C.
The call to a function, after the arguments have been evaluated.
The end of the first operand of the following operators: logical AND &&; logical OR ||; conditional ?; comma ,.
The end of a full declarator.
The end of a full expression; the expression in an expression statement; the controlling expression of a selection statement (if
or switch); the controlling expression of a while or do
statement; each of the expressions of a for statement; the
expression in a return statement.
Immediately before a library function returns.
After the actions associated with each formatted input/output function conversion specifier.
Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any
movement of the objects passed as arguments to that call.

Adding parenthesis does not create a sequence point and in the more modern standards it does not create a sequenced before relationship with respect to side effects which is the problem with the expression that you have unless noted the rest of this will be with respect to C++11. Parenthesis are a primary expression covered in section 5.1 Primary expressions, which has the following grammar (emphasis mine going forward):
primary-expression:
literal
this
( expression )
[...]
and in paragraph 6 it says:
A parenthesized expression is a primary expression whose type and value are identical to those of the enclosed expression. The presence of parentheses does not affect whether the expression is an lvalue. The parenthesized expression can be used in exactly the same contexts as those where the enclosed expression can be used, and with the same meaning, except as otherwise indicated.
The postfix ++ is problematic since we can not determine when the side effect of updating a will happen pre C++11 and in C this applies to both the postfix ++ and prefix ++ operations. With respect to how undefined behavior changed for prefix ++ in C++11 see Assignment operator sequencing in C11 expressions.
The += operation is problematic since:
[...]E1 op = E2 is equivalent to E1 = E1 op E2 except that E1 is
evaluated only once[...]
So in C++11 the following went from undefined to defined:
a = ++a + 1 ;
but this remains undefined:
a = a++ + 1 ;
and both of the above are undefined pre C++11 and in both C99 and C11.
From the draft C++11 standard section 1.9 Program execution paragraph 15 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

Related

C++ Order of Evaluation of Subexpressions with Logical Operators

There are lots of questions on concepts of precedence and order of evaluation but I failed to find one that refers to my special case.
Consider the following statement:
if(f(0) && g(0)) {};
Is it guaranteed that f(0) will be evaluated first? Notice that the operator is &&.
My confusion stems from what I've read in "The C++ Programming Language, (Stroustrup, 4ed, 2013)".
In section 10.3.2 of the book, it says:
The order of evaluation of subexpressions within an expression is undefined. In particular, you cannot assume that the expression is evaluated left-to-right. For example:
int x = f(2)+g(3); // undefined whether f() or g() is called first
This seems to apply to all operators including && operator, but in a following paragraph it says:
The operators , (comma), && (logical and), and || (logical or) guarantee that their left-hand operand is evaluated before their right-hand operand.
There is also another mention of this in section 11.1.1:
The && and || operators evaluate their second argument only if necessary, so they can be used to control evaluation order (§10.3.2). For example:
while (p && !whitespace(p)) ++p;
Here, p is not dereferenced if it is the nullptr.
This last quote implies that && and || evaluate their 1st argument first, so it seems to reinforce my assumption that operators mentioned in 2nd quote are exceptions to 1st quote, but I cannot draw a definitive conclusion from this last example either, as the expression contains only one subexpression as opposed to my example, which contains two.
The special sequencing behavior of &&, ||, and , is well-established in C and C++. The first sentence you quoted should say "The order of evaluation of subexpressions within an expression is generally unspecified" or "With a few specific exceptions, the order of evaluation of subexpressions within an expression is unspecified".
You asked about C++, but this question in the C FAQ list is pertinent.
Addendum: I just realized that "unspecified" is a better word in these rules than "undefined". Writing something like f() + g() doesn't give you undefined behavior. You just have no way of knowing whether f or g might be called first.
Yes, it is guaranteed that f(0) will be completely evaluated first.
This is to support behaviour known as short-circuiting, by which we don't need to call the second function at all if the first returns false.

C++ compiler optimizations and short-circuit evaluation [duplicate]

This question already has answers here:
Is short-circuiting logical operators mandated? And evaluation order?
(7 answers)
Closed 7 years ago.
Here is my code :
b = f() || b;
The function f() has side effect and it must be always executed. Normally, only the right operand can be short-circuited and this code should work. But I am afraid some compilators reverse the two operands, since it's more efficient to short-circuit a function evaluation rather than a simple variable evaluation. I know that g++ -O3 can break some specifications, but I don't know if this code can be affected.
So, is my code risk-free?
I knew Is short-circuiting logical operators mandated? And evaluation order? but my question was about compilers optimizations, I didn't know that they can't break the standards (even if this would be strange).
But I am afraid some compilators reverse the two operands
These expressions must be evaluated left-to-right. This is covered in the standard about the operators &&, ||, ?, and ,. They specifically mention the order, as well as enforced sequence points.
§5.14.1 (Logical AND)
The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
§5.15.1 (Logical OR)
The || operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
§5.16.1 (Conditional operator)
Conditional expressions group right-to-left. The first expression is contextually converted to bool (Clause 4). It is evaluated and if it is true, the result of the conditional expression is the value of the second expression,
otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value computation and side effect associated with the first expression is sequenced before every value computation
and side effect associated with the second or third expression.
§5.19.1 (Comma operator)
The comma operator groups left-to-right. A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded value
expression (Clause 5). Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category
as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field. If the value of the right operand is a temporary (12.2), the result is that temporary.
Regarding your concern about optimizations violating this order, no compilers are not allowed to change the order. Compilers must first and foremost (try to) follow the standard. Then they can try to make your code faster. They may not violate the standard just for the sake of performance. That undermines the entire premise of having a standard.
It's explicitly stated by the standard that optimized code should behave "as-if" it was exactly the code that was written, as long as you only rely on standard behavior.
Since the standard requires the boolean statements to be evaluated from left to right, no (compliant) optimization can change the order of evaluation.

issue with cout in C++ [duplicate]

In the expression a + b, is a guaranteed to be evaluated before b, or is the order of evaluation unspecified? I think it is the latter, but I struggle to find a definite answer in the standard.
Since I don't know whether C handles this different from C++, or if evaluation order rules were simplified in C++11, I'm gonna tag the question as all three.
In C++, for user-defined types a + b is a function call, and the standard says:
§5.2.2.8 - [...] The order of evaluation of function arguments is unspecified. [...]
For normal operators, the standard says:
§5.4 - Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. [...]
These haven't been changed for C++11. However, the wording changes in the second one to say that the order is "unsequenced" rather than unspecified, but it is essentially the same.
I don't have a copy of the C standard, but I imagine that it is the same there as well.
It is Unspecified.
Reference - C++03 Standard:
Section 5: Expressions, Para 4:
except where noted [e.g. special rules for && and ||], the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is Unspecified.
C++0x FDIS section 1.9 "Program Execution" §15 is similar to the corresponding paragraph in C++03, just reworded to accommodate the conceptual change from "sequence points" to "being sequenced":
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
It is unspecified. C and C++ follow the same logic in selecting sequence points:
https://en.wikipedia.org/wiki/Sequence_point
For C: "Order of operations is not defined by the language. The compiler is free to evaluate such expressions in any order, if the compiler can guarantee a consistent result." [...] "Only the sequential-evaluation (,), logical-AND (&&), logical-OR (||), conditional-expression (? :), and function-call operators constitute sequence points and therefore guarantee a particular order of evaluation for their operands."
Source: http://msdn.microsoft.com/en-us/library/2bxt6kc4.aspx
The way this is organized in that article, it seems to indicate this applies to C++ too, which is confirmed by the answer to this question: Operator Precedence vs Order of Evaluation.
According to the current C standard, C11, it also specifies that the order of evaluation of subexpressions (a and b in this case) is indeterminate. In fact, this order doesn't even have to be the same if the same expression is evaluated multiple times.
From section 6.5:
The grouping of operators and operands is indicated by the
syntax.85) Except as specified later, side effects and
value computations of subexpressions are unsequenced.86)
86) In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed
consistently in different evaluations.

Order of evaluation and undefined behaviour

Speaking in the context of the C++11 standard (which no longer has a concept of sequence points, as you know) I want to understand how two simplest examples are defined.
int i = 0;
i = i++; // #0
i = ++i; // #1
There are two topics on SO which explain those examples within the C++11 context. Here it was said that #0 invokes UB and #1 is well-defined. Here it was said that both examples are undefined. This ambiguity confuses me much. I've read this well-structured reference three times already but the topic seems to be way too complicated for me.
.
Let's analyze the example #0: i = i++;.
Corresponding quotes are:
The value computation of the built-in postincrement and postdecrement
operators is sequenced before its side-effect.
The side effect (modification of the left argument) of the built-in
assignment operator and of all built-in compound assignment operators
is sequenced after the value computation (but not the side effects) of
both left and right arguments, and is sequenced before the value
computation of the assignment expression (that is, before returning
the reference to the modified object)
If a side effect on a scalar object is unsequenced relative to another
side effect on the same scalar object, the behavior is undefined.
As I get it, the side effect of the assignment operator is not sequenced with side effects of it's left and right arguments. Thus the side effect of the assignment operator is not sequenced with the side effects of i++. So #0 invokes an UB.
.
Let's analyze the example #1: i = ++i;.
Corresponding quotes are:
The side effect of the built-in preincrement and predecrement
operators is sequenced before its value computation (implicit rule due
to definition as compound assignment)
The side effect (modification of the left argument) of the built-in
assignment operator and of all built-in compound assignment operators
is sequenced after the value computation (but not the side effects) of
both left and right arguments, and is sequenced before the value
computation of the assignment expression (that is, before returning
the reference to the modified object)
If a side effect on a scalar object is unsequenced relative to another
side effect on the same scalar object, the behavior is undefined.
I can not see, how this example is different from the #0. This seems to be an UB for me for the very same reason as #0. The side effect of assignment is not sequenced with the side effect of ++i. It seems to be an UB. The topic liked above says it is well-defined. Why?
.
Question: how can I apply quoted rules to determine the UB of the examples. An as simple as possible explanation would be greatly appreciated. Thank you!
Since your quotes are not directly from the standard, I will try to give a detailed answer quoting the relevant parts of the standard. The definitions of "side effects" and "evaluation" is found in paragraph 1.9/12:
Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.
The next relevant part is paragraph 1.9/15:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Now let's see, how to apply this to the two examples.
i = i++;
This is the postfix form of increment and you find its definition in paragraph 5.2.6. The most relevant sentence reads:
The value computation of the ++ expression is sequenced before the modification
of the operand object.
For the assignment expression see paragraph 5.17. The relevant part states:
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Using all the information from above, the evaluation of the whole expression is (this order is not guaranteed by the standard!):
value computation of i++ (right hand side)
value computation of i (left hand side)
modification of i (side effect of ++)
modification of i (side effect of =)
All the standard guarantees is that the value computations of the two operands is sequenced before the value computation of the assignment expression. But the value computation of the right hand side is only "reading the value of i" and not modifying i, the two modifications (side effects) are not sequenced with respect to each other and we get undefined behavior.
What about the second example?
i = ++i;
The situation is quite different here. You find the definition of prefix increment in paragraph 5.3.2. The relevant part is:
If x is not of type bool, the expression ++x is equivalent to x+=1.
Substituting that, our expression is equivalent to
i = (i += 1)
Looking up the compound assignment operator += in 5.17/7 we get that i += 1 is equivalent to i = i + 1 except that i is only evaluated once. Hence, the expression in question finally becomes
i = ( i = (i + 1))
But we already know from above that the value computation of the = is sequenced after the value computation of the operands and the side effects are sequenced before the value computations of =. So we get a well-defined order of evaluation:
compute value of i + 1 (and i - left hand side of inner expression)(#1)
initiate side effect of inner =, i.e. modify "inner" i
compute value of (i = i + 1), which is the "new" value of i
initiate side effect of outer =, i.e. modify "outer" i
compute value of full expression.
(#1): Here, i is only evaluated once, since i += 1 is equivalent to i = i + 1 except that i is only evaluated once (5.17/7).
The key difference is that ++i is defined as i += 1, so
i = ++i;
is the same as:
i = (i += 1);
Since the side effects of the += operator are sequenced before
the value computation of the operator, the actual modification
of i in ++i is sequenced before the outer assignment. This
follows directly from the sections you quote: "The side effect
(modification of the left argument) of the built-in assignment
operator and of all built-in compound assignment operators is
sequenced after the value computation (but not the side effects)
of both left and right arguments, and is sequenced before the
value computation of the assignment expression (that is, before
returning the reference to the modified object)"
This is due to the nested assignment operator; the (outer)
assignment operator only imposes sequenced before on the value
computation of its operands, not on their side effects. (But of
course, it doesn't undo sequencing imposed otherwise.)
And as you indirectly point out, this is new to C++11;
previously, both were undefined. The older versions of C++
used sequence points, rather than sequenced before, and there
was no sequence point in any of the assignment operators. (I
have the impression that the intent was that operators which
result in an lvalue have a value which is sequenced after any
side effects. In earlier C++, the expression *&++i was
undefined behavior; in C++11, it is guaranteed to be the same as
++i.)

order of evaluation of operands

In the expression a + b, is a guaranteed to be evaluated before b, or is the order of evaluation unspecified? I think it is the latter, but I struggle to find a definite answer in the standard.
Since I don't know whether C handles this different from C++, or if evaluation order rules were simplified in C++11, I'm gonna tag the question as all three.
In C++, for user-defined types a + b is a function call, and the standard says:
§5.2.2.8 - [...] The order of evaluation of function arguments is unspecified. [...]
For normal operators, the standard says:
§5.4 - Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. [...]
These haven't been changed for C++11. However, the wording changes in the second one to say that the order is "unsequenced" rather than unspecified, but it is essentially the same.
I don't have a copy of the C standard, but I imagine that it is the same there as well.
It is Unspecified.
Reference - C++03 Standard:
Section 5: Expressions, Para 4:
except where noted [e.g. special rules for && and ||], the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is Unspecified.
C++0x FDIS section 1.9 "Program Execution" §15 is similar to the corresponding paragraph in C++03, just reworded to accommodate the conceptual change from "sequence points" to "being sequenced":
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
It is unspecified. C and C++ follow the same logic in selecting sequence points:
https://en.wikipedia.org/wiki/Sequence_point
For C: "Order of operations is not defined by the language. The compiler is free to evaluate such expressions in any order, if the compiler can guarantee a consistent result." [...] "Only the sequential-evaluation (,), logical-AND (&&), logical-OR (||), conditional-expression (? :), and function-call operators constitute sequence points and therefore guarantee a particular order of evaluation for their operands."
Source: http://msdn.microsoft.com/en-us/library/2bxt6kc4.aspx
The way this is organized in that article, it seems to indicate this applies to C++ too, which is confirmed by the answer to this question: Operator Precedence vs Order of Evaluation.
According to the current C standard, C11, it also specifies that the order of evaluation of subexpressions (a and b in this case) is indeterminate. In fact, this order doesn't even have to be the same if the same expression is evaluated multiple times.
From section 6.5:
The grouping of operators and operands is indicated by the
syntax.85) Except as specified later, side effects and
value computations of subexpressions are unsequenced.86)
86) In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed
consistently in different evaluations.