New Sequence Points in C++11 - c++

With the new college year upon us.
We have started to receive the standard why does ++ i ++ not work as expected questions.
After just answering one of these type of questions I was told that the new C++11 standard has changed and this is no longer undefined behavior. I have heard that sequence points have been replaced by sequenced before and sequenced after but have not read deep (or at all) into the subject.
So the question I was just answering had:
int i = 12;
k = ++ (++ i);
So the question is:
How has the sequence points changes in C++11 and how does it affect questions like the above. Is it still undefined behavior or is this now well defined?

The UB in those cases is based on [intro.execution]/15
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
For ++(++i): [expr.pre.incr]/1 states that ++i is defined as i+=1. This leads to [expr.ass]/1, which says
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Therefore, for ++(++i), equivalent to (i+=1)+=1, the inner assignment is sequenced before the outer assignment, and we have no UB.
[intro.execution]/15 has an example of UB:
i = i++ + 1; // the behavior is undefined
The case here is a bit different (thanks to Oktalist for pointing out a pre/postfix mistake here). [expr.post.incr]/1 describes the effects of postfix increment. It states:
The value computation of the ++ expression is sequenced before the modification of the operand object.
However, there is no requirement on the sequencing of the side effect (the modification of i). Such a requirement could also be imposed by the assignment-expression. But the assignment-expression only requires the value computations (but not the side effects) of the operands to be sequenced before the assignment. Therefore, the two modifications via i = .. and i++ are unsequenced, and we get undefined behaviour.
N.B. i = (i = 1); does not have the same problem: The inner assignment guarantees the side effect of i = 1 is sequenced before the value computation of the same expression. And the value is required for the outer assignment, which guarantees that it (the value computation of the right operand (i = 1)) is sequenced before the side effect of the outer assignment. Similarly, i = ++i + 1; (equivalent to i = (i+=1) + 1;) has defined behaviour.
The comma operator is an example where the side effects are sequenced; [expr.comma]/1
Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression.
[intro.execution]/15 includes the example i = 7, i++, i++; (read: (i=7), i++, i++;), which is defined behaviour (i becomes 9).

I don't think sequencing is relevant to your situation. The expression ++i++ is grouped as ++(i++), so:
If i is a built-in type, then this is invalid, since i++ is an rvalue.
If i is of user-defined type and the operators are overloaded, this is a nested function call, such as T::operator++(T::operator++(i), 0), and function arguments are evaluated before the function call is evaluated.

Related

Pre-increment and post-increment in C++ not working correctly when used as arguments [duplicate]

What are "sequence points"?
What is the relation between undefined behaviour and sequence points?
I often use funny and convoluted expressions like a[++i] = i;, to make myself feel better. Why should I stop using them?
If you've read this, be sure to visit the follow-up question Undefined behavior and sequence points reloaded.
(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)
C++98 and C++03
This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.
Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.
Pre-requisites : An elementary knowledge of C++ Standard
What are Sequence Points?
The Standard says
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)
Side effects? What are side effects?
Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).
For example:
int x = y++; //where y is also an int
In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.
So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit:
Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.
What are the common sequence points listed in the C++ Standard?
Those are:
at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1
Example :
int a = 5; // ; is a sequence point here
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2
a && b (§5.14)
a || b (§5.15)
a ? b : c (§5.16)
a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
takes place before execution of any expressions or statements in the function body (§1.9/17).
1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument
2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
What is Undefined Behaviour?
The Standard defines Undefined Behaviour in Section §1.3.12 as
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.
Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.
3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.
What is the relation between Undefined Behaviour and Sequence Points?
Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.
You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.
For example:
int x = 5, y = 6;
int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
Another example here.
Now the Standard in §5/4 says
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
What does it mean?
Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.
From the above sentence the following expressions invoke Undefined Behaviour:
i++ * ++i; // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))
i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)
But the following expressions are fine:
i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined
Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
For example in i = i + 1 all the access of i (in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.
This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.
Example 1:
std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2
Example 2:
a[i] = i++ // or a[++i] = i or a[i++] = ++i etc
is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
Example 3 :
int x = i + i++ ;// Similar to above
Follow up answer for C++11 here.
This is a follow up to my previous answer and contains C++11 related material..
Pre-requisites : An elementary knowledge of Relations (Mathematics).
Is it true that there are no Sequence Points in C++11?
Yes! This is very true.
Sequence Points have been replaced by Sequenced Before and Sequenced After (and Unsequenced and Indeterminately Sequenced) relations in C++11.
What exactly is this 'Sequenced before' thing?
Sequenced Before(§1.9/13) is a relation which is:
Asymmetric
Transitive
between evaluations executed by a single thread and induces a strict partial order1
Formally it means given any two evaluations(See below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced 2.
Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which3.
[NOTES]
1 : A strict partial order is a binary relation "<" over a set P which is asymmetric, and transitive, i.e., for all a, b, and c in P, we have that:
........(i). if a < b then ¬ (b < a) (asymmetry);
........(ii). if a < b and b < c then a < c (transitivity).
2 : The execution of unsequenced evaluations can overlap.
3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.
What is the meaning of the word 'evaluation' in context of C++11?
In C++11, evaluation of an expression (or a sub-expression) in general includes:
value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and
initiation of side effects.
Now (§1.9/14) says:
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
Trivial example:
int x;
x = 10;
++x;
Value computation and side effect associated with ++x is sequenced after the value computation and side effect of x = 10;
So there must be some relation between Undefined Behaviour and the above-mentioned things, right?
Yes! Right.
In (§1.9/15) it has been mentioned that
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced4.
For example :
int main()
{
int num = 19 ;
num = (num << 3) + (num >> 3);
}
Evaluation of operands of + operator are unsequenced relative to each other.
Evaluation of operands of << and >> operators are unsequenced relative to each other.
4: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.
(§1.9/15)
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator.
That means in x + y the value computation of x and y are sequenced before the value computation of (x + y).
More importantly
(§1.9/15) If a side effect on a scalar object is unsequenced relative to either
(a) another side effect on the same scalar object
or
(b) a value computation using the value of the same scalar object.
the behaviour is undefined.
Examples:
int i = 5, v[10] = { };
void f(int, int);
i = i++ * ++i; // Undefined Behaviour
i = ++i + i++; // Undefined Behaviour
i = ++i + ++i; // Undefined Behaviour
i = v[i++]; // Undefined Behaviour
i = v[++i]: // Well-defined Behavior
i = i++ + 1; // Undefined Behaviour
i = ++i + 1; // Well-defined Behaviour
++++i; // Well-defined Behaviour
f(i = -1, i = -1); // Undefined Behaviour (see below)
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note]
Expressions (5), (7) and (8) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.
Multiple preincrement operations on a variable in C++0x
Unsequenced Value Computations
Final Note :
If you find any flaw in the post please leave a comment. Power-users (With rep >20000) please do not hesitate to edit the post for correcting typos and other mistakes.
C++17 (N4659) includes a proposal Refining Expression Evaluation Order for Idiomatic C++
which defines a stricter order of expression evaluation.
In particular, the following sentence
8.18 Assignment and compound assignment operators:....
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand.
together with the following clarification
An expression X is said to be sequenced before an expression Y if every
value computation and every side effect associated with the expression X is sequenced before every value
computation and every side effect associated with the expression Y.
make several cases of previously undefined behavior valid, including the one in question:
a[++i] = i;
However several other similar cases still lead to undefined behavior.
In N4140:
i = i++ + 1; // the behavior is undefined
But in N4659
i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined
Of course, using a C++17 compliant compiler does not necessarily mean that one should start writing such expressions.
I am guessing there is a fundamental reason for the change, it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. Unspecified order of elaboration is merely selection of one of several possible serial orderings, this is quite different to before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in:
f (a,b)
previously either a then b, or, b then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.
In C99(ISO/IEC 9899:TC3) which seems absent from this discussion thus far the following steteents are made regarding order of evaluaiton.
[...]the order of evaluation of subexpressions and the order in which
side effects take place are both unspecified. (Section 6.5 pp 67)
The order of evaluation of the operands is unspecified. If an attempt
is made to modify the result of an assignment operator or to access it
after the next sequence point, the behavior[sic] is undefined.(Section
6.5.16 pp 91)

Operator precedence in C/C++ [duplicate]

What are "sequence points"?
What is the relation between undefined behaviour and sequence points?
I often use funny and convoluted expressions like a[++i] = i;, to make myself feel better. Why should I stop using them?
If you've read this, be sure to visit the follow-up question Undefined behavior and sequence points reloaded.
(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)
C++98 and C++03
This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.
Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.
Pre-requisites : An elementary knowledge of C++ Standard
What are Sequence Points?
The Standard says
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)
Side effects? What are side effects?
Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).
For example:
int x = y++; //where y is also an int
In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.
So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit:
Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.
What are the common sequence points listed in the C++ Standard?
Those are:
at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1
Example :
int a = 5; // ; is a sequence point here
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2
a && b (§5.14)
a || b (§5.15)
a ? b : c (§5.16)
a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
takes place before execution of any expressions or statements in the function body (§1.9/17).
1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument
2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
What is Undefined Behaviour?
The Standard defines Undefined Behaviour in Section §1.3.12 as
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.
Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.
3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.
What is the relation between Undefined Behaviour and Sequence Points?
Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.
You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.
For example:
int x = 5, y = 6;
int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
Another example here.
Now the Standard in §5/4 says
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
What does it mean?
Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.
From the above sentence the following expressions invoke Undefined Behaviour:
i++ * ++i; // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))
i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)
But the following expressions are fine:
i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined
Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
For example in i = i + 1 all the access of i (in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.
This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.
Example 1:
std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2
Example 2:
a[i] = i++ // or a[++i] = i or a[i++] = ++i etc
is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
Example 3 :
int x = i + i++ ;// Similar to above
Follow up answer for C++11 here.
This is a follow up to my previous answer and contains C++11 related material..
Pre-requisites : An elementary knowledge of Relations (Mathematics).
Is it true that there are no Sequence Points in C++11?
Yes! This is very true.
Sequence Points have been replaced by Sequenced Before and Sequenced After (and Unsequenced and Indeterminately Sequenced) relations in C++11.
What exactly is this 'Sequenced before' thing?
Sequenced Before(§1.9/13) is a relation which is:
Asymmetric
Transitive
between evaluations executed by a single thread and induces a strict partial order1
Formally it means given any two evaluations(See below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced 2.
Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which3.
[NOTES]
1 : A strict partial order is a binary relation "<" over a set P which is asymmetric, and transitive, i.e., for all a, b, and c in P, we have that:
........(i). if a < b then ¬ (b < a) (asymmetry);
........(ii). if a < b and b < c then a < c (transitivity).
2 : The execution of unsequenced evaluations can overlap.
3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.
What is the meaning of the word 'evaluation' in context of C++11?
In C++11, evaluation of an expression (or a sub-expression) in general includes:
value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and
initiation of side effects.
Now (§1.9/14) says:
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
Trivial example:
int x;
x = 10;
++x;
Value computation and side effect associated with ++x is sequenced after the value computation and side effect of x = 10;
So there must be some relation between Undefined Behaviour and the above-mentioned things, right?
Yes! Right.
In (§1.9/15) it has been mentioned that
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced4.
For example :
int main()
{
int num = 19 ;
num = (num << 3) + (num >> 3);
}
Evaluation of operands of + operator are unsequenced relative to each other.
Evaluation of operands of << and >> operators are unsequenced relative to each other.
4: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.
(§1.9/15)
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator.
That means in x + y the value computation of x and y are sequenced before the value computation of (x + y).
More importantly
(§1.9/15) If a side effect on a scalar object is unsequenced relative to either
(a) another side effect on the same scalar object
or
(b) a value computation using the value of the same scalar object.
the behaviour is undefined.
Examples:
int i = 5, v[10] = { };
void f(int, int);
i = i++ * ++i; // Undefined Behaviour
i = ++i + i++; // Undefined Behaviour
i = ++i + ++i; // Undefined Behaviour
i = v[i++]; // Undefined Behaviour
i = v[++i]: // Well-defined Behavior
i = i++ + 1; // Undefined Behaviour
i = ++i + 1; // Well-defined Behaviour
++++i; // Well-defined Behaviour
f(i = -1, i = -1); // Undefined Behaviour (see below)
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note]
Expressions (5), (7) and (8) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.
Multiple preincrement operations on a variable in C++0x
Unsequenced Value Computations
Final Note :
If you find any flaw in the post please leave a comment. Power-users (With rep >20000) please do not hesitate to edit the post for correcting typos and other mistakes.
C++17 (N4659) includes a proposal Refining Expression Evaluation Order for Idiomatic C++
which defines a stricter order of expression evaluation.
In particular, the following sentence
8.18 Assignment and compound assignment operators:....
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand.
together with the following clarification
An expression X is said to be sequenced before an expression Y if every
value computation and every side effect associated with the expression X is sequenced before every value
computation and every side effect associated with the expression Y.
make several cases of previously undefined behavior valid, including the one in question:
a[++i] = i;
However several other similar cases still lead to undefined behavior.
In N4140:
i = i++ + 1; // the behavior is undefined
But in N4659
i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined
Of course, using a C++17 compliant compiler does not necessarily mean that one should start writing such expressions.
I am guessing there is a fundamental reason for the change, it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. Unspecified order of elaboration is merely selection of one of several possible serial orderings, this is quite different to before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in:
f (a,b)
previously either a then b, or, b then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.
In C99(ISO/IEC 9899:TC3) which seems absent from this discussion thus far the following steteents are made regarding order of evaluaiton.
[...]the order of evaluation of subexpressions and the order in which
side effects take place are both unspecified. (Section 6.5 pp 67)
The order of evaluation of the operands is unspecified. If an attempt
is made to modify the result of an assignment operator or to access it
after the next sequence point, the behavior[sic] is undefined.(Section
6.5.16 pp 91)

In C++11, does `i += ++i + 1` exhibit undefined behavior?

This question came up while I was reading (the answers to) So why is i = ++i + 1 well-defined in C++11?
I gather that the subtle explanation is that (1) the expression ++i returns an lvalue but + takes prvalues as operands, so a conversion from lvalue to prvalue must be performed; this involves obtaining the current value of that lvalue (rather than one more than the old value of i) and must therefore be sequenced after the side effect from the increment (i.e., updating i) (2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i; while this value computation is unsequenced w.r.t. the value computation of the RHS, this poses no problem (3) the value computation of the assignment itself involves updating i (again), but is sequenced after the value computation of its RHS, and hence after the prvious update to i; no problem.
Fine, so there is no UB there. Now my question is what if one changed the assigment operator from = to += (or a similar operator).
Does the evaluation of the expression i += ++i + 1 lead to undefined behavior?
As I see it, the standard seems to contradict itself here. Since the LHS of += is still an lvalue (and its RHS still a prvalue), the same reasoning as above applies as far as (1) and (2) are concerned; there is no undefined behavior in the evalutation of the operands on +=. As for (3), the operation of the compound assignment += (more precisely the side effect of that operation; its value computation, if needed, is in any case sequenced after its side effect) now must both fetch the current value of i, and then (obviously sequenced after it, even if the standard does not say so explicitly, or otherwise the evaluation of such operators would always invoke undefined behavior) add the RHS and store the result back into i. Both these operations would have given undefined behavior if they were unsequenced w.r.t. the side effect of the ++, but as argued above (the side effect of the ++ is sequenced before the value computation of + giving the RHS of the += operator, which value computation is sequenced before the operation of that compound assignment), that is not the case.
But on the other hand the standard also says that E += F is equivalent to E = E + F, except that (the lvalue) E is evaluated only once. Now in our example the value computation of i (which is what E is here) as lvalue does not involve anything that needs to be sequenced w.r.t. other actions, so doing it once or twice makes no difference; our expression should be strictly equivalent to E = E + F. But here's the problem; it is pretty obvious that evaluating i = i + (++i + 1) would give undefined behaviour! What gives? Or is this a defect of the standard?
Added. I have slightly modified my discussion above, to do more justice to the proper distinction between side effects and value computations, and using "evaluation" (as does the standard) of an expression to encompass both. I think my main interrogation is not just about whether behavior is defined or not in this example, but how one must read the standard in order to decide this. Notably, should one take the equivalence of E op= F to E = E op F as the ultimate authority for the semantics of the compound assignment operation (in which case the example clearly has UB), or merely as an indication of what mathematical operation is involved in determining the value to be assigned (namely the one identified by op, with the lvalue-to-rvalue converted LHS of the compound assignment operator as left operand and its RHS as right operand). The latter option makes it much harder to argue for UB in this example, as I have tried to explain. I admit that it is tempting to make the equivalence authoritative (so that compound assignments become a kind of second-class primitives, whose meaning is given by rewriting in term of first-class primitives; thus the language definition would be simplified), but there are rather strong arguments against this:
The equivalence is not absolute, because of the "E is evaluated only once" exception. Note that this exception is essential to avoid making any use where the evaluation of E involves a side effect undefined behavior, for instance in the fairly common a[i++] += b; usage. If fact I think no absolutely equivalent rewriting to eliminate compound assignments is possible; using a fictive ||| operator to designate unsequenced evaluations, one might try to define E op= F; (with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }, but then the example no longer has UB. In any case the standard gives us no rewriitng recipe.
The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary. For instance in 5.17 (emphasis mine)
The assignment operator (=) and the compound assignment operators all group right-to-left. [...] In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.
If the intention were to let compound assignments be mere shorthands for simple assignments, there would be no reason to include them explicitly in this description. The final phrase even directly contradicts what would be the case if the equivalence was taken to be authoritative.
If one admits that compound assignments have a semantics of their own, then the point arises that their evaluation involves (apart from the mathematical operation) more than just a side effect (the assignment) and a value evaluation (sequenced after the assignment), but also an unnamed operation of fetching the (previous) value of the LHS. This would normally be dealt with under the heading of "lvalue-to-rvalue conversion", but doing so here is hard to justify, since there is no operator present that takes the LHS as an rvalue operand (though there is one in the expanded "equivalent" form). It is precisely this unnamed operation whose potential unsequenced relation with the side effect of ++ would cause UB, but this unsequenced relation is nowhere explicitly stated in the standard, because the unnamed operation is not. It is hard to justify UB using an operation whose very existence is only implicit in the standard.
About the description of i = ++i + 1
I gather that the subtle explanation is that
(1) the expression ++i returns an lvalue but + takes prvalues as operands, so a conversion from lvalue to prvalue must be performed;
Probably, see CWG active issue 1642.
this involves obtaining the
current value of that lvalue (rather than one more than the old value
of i) and must therefore be sequenced after the side effect from the
increment (i.e., updating i)
The sequencing here is defined for the increment (indirectly, via +=, see (a)):
The side effect of ++ (the modification of i) is sequenced before the value computation of the whole expression ++i. The latter refers to computing the result of ++i, not to loading the value of i.
(2) the LHS of the assignment is also an
lvalue, so its value evaluation does not involve fetching the current
value of i;
while this value computation is unsequenced w.r.t. the
value computation of the RHS, this poses no problem
I don't think that's properly defined in the Standard, but I'd agree.
(3) the value
computation of the assignment itself involves updating i (again),
The value computation of i = expr is only required when you use the result, e.g. int x = (i = expr); or (i = expr) = 42;. The value computation itself does not modify i.
The modification of i in the expression i = expr that happens because of the = is called the side effect of =. This side effect is sequenced before value computation of i = expr -- or rather the value computation of i = expr is sequenced after the side effect of the assignment in i = expr.
In general, the value computation of the operands of an expression are sequenced before the side effect of that expression, of course.
but is sequenced after the value computation of its RHS, and hence after
the previous update to i; no problem.
The side effect of the assignment i = expr is sequenced after the value computation of the operands i (A) and expr of the assignment.
The expr in this case is a +-expression: expr1 + 1. The value computation of this expression is sequenced after the value computations of its operands expr1 and 1.
The expr1 here is ++i. The value computation of ++i is sequenced after the side effect of ++i (the modification of i) (B)
That's why i = ++i + 1 is safe: There's a chain of sequenced before between the value computation in (A) and the side effect on the same variable in (B).
(a) The Standard defines ++expr in terms of expr += 1, which is defined as expr = expr + 1 with expr being evaluated only once.
For this expr = expr + 1, we therefore have only one value computation of expr. The side effect of = is sequenced before the value computation of the whole expr = expr + 1, and it's sequenced after the value computation of the operands expr (LHS) and expr + 1 (RHS).
This corresponds to my claim that for ++expr, the side effect is sequenced before the value computation of ++expr.
About i += ++i + 1
Does the value computation of i += ++i + 1 involve undefined behavior?
Since the
LHS of += is still an lvalue (and its RHS still a prvalue), the same
reasoning as above applies as far as (1) and (2) are concerned;
as for
(3) the value computation of the += operator now must both fetch the
current value of i, and then (obviously sequenced after it, even if
the standard does not say so explicitly, or otherwise the execution of
such operators would always invoke undefined behavior) perform the
addition of the RHS and store the result back into i.
I think here's the problem: The addition of i in the LHS of i += to the result of ++i + 1 requires knowing the value of i - a value computation (which can mean loading the value of i). This value computation is unsequenced with respect to the modification performed by ++i. This is essentially what you say in your alternative description, following the rewrite mandated by the Standard i += expr -> i = i + expr. Here, the value computation of i within i + expr is unsequenced with respect to the value computation of expr. That's where you get UB.
Please note that a value computation can have two results: The "address" of an object, or the value of an object. In an expression i = 42, the value computation of the lhs "produces the address" of i; that is, the compiler needs to figure out where to store the rhs (under the rules of observable behaviour of the abstract machine). In an expression i + 42, the value computation of i produces the value. In the above paragraph, I was referring to the second kind, hence [intro.execution]p15 applies:
If a side effect on a scalar object is unsequenced relative to either
another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
Another approach for i += ++i + 1
the value computation of the += operator now must both fetch the
current value of i, and then [...] perform the addition of the RHS
The RHS being ++i + 1. Computing the result of this expression (the value computation) is unsequenced with respect to the value computation of i from the LHS. So the word then in this sentence is misleading: Of course, it must first load i and then add the result of the RHS to it. But there's no order between the side-effect of the RHS and the value computation to get the value of the LHS. For example, you could get for the LHS either the old or the new value of i, as modified by the RHS.
In general a store and a "concurrent" load is a data race, which leads to Undefined Behaviour.
Addressing the addendum
using a fictive ||| operator to designate unsequenced evaluations, one might try to define E op= F; (with int operands for simplicity) as equivalent to { int& L=E ||| int R=F; L = L + R; }, but then the example no longer has UB.
Let E be i and F be ++i (we don't need the + 1). Then, for i = ++i
int* lhs_address;
int lhs_value;
int* rhs_address;
int rhs_value;
( lhs_address = &i)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
*lhs_address = rhs_value;
On the other hand, for i += ++i
( lhs_address = &i, lhs_value = *lhs_address)
||| (i = i+1, rhs_address = &i, rhs_value = *rhs_address);
int total_value = lhs_value + rhs_value;
*lhs_address = total_value;
This is intended to represent my understanding of the sequencing guarantees. Note that the , operator sequences all value computations and side effects of the LHS before those of the RHS. Parentheses do not affect sequencing. In the second case, i += ++i, we have a modification of i unsequenced wrt an lvalue-to-rvalue conversion of i => UB.
The standard does not treat compound assignments as second-class primitives for which no separate definition of semantics is necessary.
I would say that's a redundancy. The rewrite from E1 op = E2 to E1 = E1 op E2 also includes which expression types and value categories are required (on the rhs, 5.17/1 says something about the lhs), what happens to pointer types, the required conversions etc. The sad thing is that the sentence about "With respect to an.." in 5.17/1 is not in 5.17/7 as an exception of that equivalence.
In any way, I think we should compare the guarantees and requirements for compound assignment vs. simple assignment plus the operator, and see if there's any contradiction.
Once we put that "With respect to an.." also in the list of exceptions in 5.17/7, I don't think there's a contradiction.
As it turns out, as you can see in the discussion of Marc van Leeuwen's answer, this sentence leads to the following interesting observation:
int i; // global
int& f() { return ++i; }
int main() {
i = i + f(); // (A)
i += f(); // (B)
}
It seems that (A) has an two possible outcomes, since the evaluation of the body of f is indeterminately sequenced with the value computation of the i in i + f().
In (B), on the other hand, the evaluation of the body of f() is sequenced before the value computation of i, since += must be seen as a single operation, and f() certainly needs to be evaluated before the assignment of +=.
The expression:
i += ++i + 1
does invoke undefined behavior. The language lawyer method requires us to go back to the defect report that results in:
i = ++i + 1 ;
becoming well defined in C++11, which is defect report 637. Sequencing rules and example disagree , it starts out saying:
In 1.9 [intro.execution] paragraph 16, the following expression is
still listed as an example of undefined behavior:
i = ++i + 1;
However, it appears that the new sequencing rules make this expression
well-defined
The logic used in the report is as follows:
The assignment side-effect is required to be sequenced after the value computations of both its LHS and RHS (5.17 [expr.ass] paragraph 1).
The LHS (i) is an lvalue, so its value computation involves computing the address of i.
In order to value-compute the RHS (++i + 1), it is necessary to first value-compute the lvalue expression ++i and then do an lvalue-to-rvalue conversion on the result. This guarantees that the incrementation side-effect is sequenced before the computation of the addition operation, which in turn is sequenced before the assignment side effect. In other words, it yields a well-defined order and final value for this expression.
So in this question our problem changes the RHS which goes from:
++i + 1
to:
i + ++i + 1
due to draft C++11 standard section 5.17 Assignment and compound assignment operators which says:
The behavior of an expression of the form E1 op = E2 is equivalent to
E1 = E1 op E2 except that E1 is evaluated only once. [...]
So now we have a situation where the computation of i in the RHS is not sequenced relative to ++i and so we then have undefined behavior. This follows from section 1.9 paragraph 15 which says:
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced. [
Note: In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed consistently
in different evaluations. —end note ] The value computations of the
operands of an operator are sequenced before the value computation of
the result of the operator. If a side effect on a scalar object is
unsequenced relative to either another side effect on the same scalar
object or a value computation using the value of the same scalar
object, the behavior is undefined.
The pragmatic way to show this would be to use clang to test the code, which generates the following warning (see it live):
warning: unsequenced modification and access to 'i' [-Wunsequenced]
i += ++i + 1 ;
~~ ^
for this code:
int main()
{
int i = 0 ;
i += ++i + 1 ;
}
This is further bolstered by this explicit test example in clang's test suite for -Wunsequenced:
a += ++a;
Yes, it is UB!
The evaluation of your expression
i += ++i + 1
proceeds in the following steps:
5.17p1 (C++11) states (emphases mine):
The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
What does "value computation" mean?
1.9p12 gives the answer:
Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.
Since your code uses a compound assignment operator, 5.17p7 tells us, how this operator behaves:
The behavior of an expression of the form E1 op= E2 is equivalent to E1 = E1 op E2 except that E1 is evaluated only once.
Hence the evaluation of the expression E1 ( == i) involves both, determining the identity of the object designated by i and an lvalue-to-rvalue conversion to fetch the value stored in that object. But the evaluation of the two operands E1 and E2 are not sequenced with respect to each other. Thus we get undefined behavior since the evaluation of E2 ( == ++i + 1) initiates a side effect (updating i).
1.9p15:
... If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
The following statements in your question/comments seem to be the root of your misunderstanding:
(2) the LHS of the assignment is also an lvalue, so its value evaluation does not involve fetching the current value of i
fetching a value can be part of a prvalue evaluation. But in E += F the only prvalue is F so fetching the value of E is not part of the evaluation of the (lvalue) subexpression E
If an expression is an lvalue or rvalue doesn't tell anything about how this expression is to be evaluated. Some operators require lvalues as their operands some others require rvalues.
Clause 5p8:
Whenever a glvalue expression appears as an operand of an operator that expects a prvalue for that operand, the lvalue-to-rvalue (4.1), array-to-pointer (4.2), or function-to-pointer (4.3) standard conversions are applied to convert the expression to a prvalue.
In a simple assignment the evaluation of of the LHS only requires determining the identity of the object. But in a compound assignment such as += the LHS must be a modifiable lvalue, but the evaluation of the LHS in this case consists of determining the identity of the object and an lvalue-to-rvalue conversion. It is the result of this conversion (which is a prvalue) that is added to the result (also a prvalue) of the evaluation of the RHS.
"But in E += F the only prvalue is F so fetching the value of E is not part of the evaluation of the (lvalue) subexpression E"
That's not true as I explained above. In your example F is a prvalue expression, but F may as well be an lvalue expression. In that case, the lvalue-to-rvalue conversion is also applied to F. 5.17p7 as cited above tells us, what the semantics of the compound assignment operators are. The standard states that the behavior of E += F is the same as of E = E + F but E is only evaluated once. Here, the evaluation of E includes the lvalue-to-rvalue conversion, because the binary operator + requires it operands to be rvalues.
There is no clear case for Undefined Behavior here
Sure, an argument leading to UB can be given, as I indicated in the question, and which has been repeated in the answers given so far. However this involves a strict reading of 5.17:7 that is both self-contradictory and in contradiction with explicit statements in 5.17:1 about compound assignment. With a weaker reading of 5.17:7 the contradictions disappear, as does the argument for UB. Whence my conclusion is neither that there is UB here, nor that there is clearly defined behaviour, but the the text of the standard is inconsistent, and should be modified to make clear which reading prevails (and I suppose this means a defect report should be written). Of course one might invoke here the fall-back clause in the standard (the note in 1.3.24) that evaluations for which the standard fails to define the behavior [unambiguously and self-consistently] are Undefined Behavior, but that would make any use of compound assignments (including prefix increment/decrement operators) into UB, something that might appeal to certain implementors, but certainly not to programmers.
Instead of arguing for the given problem, let me present a slightly modified example that brings out the inconsistency more clearly. Assume one has defined
int& f (int& a) { return a; }
a function that does nothing and returns its (lvalue) argument. Now modify the example to
n += f(++n) + 1;
Note that while some extra conditions about sequencing of function calls are given in the standard, this would at first glance not seem to effect the example, since there are no side effect at all from the function call (not even locally inside the function), as the incrementation happens in the argument expression for f, whose evaluation is not subject to those extra conditions. Indeed, let us apply the Crucial Argument for Undefined Behavior (CAUB), namely 5.17:7 which says that the behavior of such a compound assignment is equivalent to that of (in this case)
n = n + f(++n) + 1;
except that n is evaluated only once (an exception that makes no difference here). The evaluation of the statement I just wrote clearly has UB (the value computation of the first (prvalue) n in the RHS is unsequenced w.r.t. the side effect of the ++ operation, which involves the same scalar object (1.9:15) and you're dead).
So the evaluation of n += f(++n) + 1 has undefined behavior, right? Wrong! Read in 5.17:1 that
With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single compound assignment operator. — end note ]
This language is far from as precise as I would like it to be, but I don't think it is a stretch to assume that "indeterminately-sequenced" should mean "with respect to that operation of a compound assignment". The (non normative, I know) note makes it clear that the lvalue-to-rvalue conversion is part of the operation of the compound assignment. Now is the call of f indeterminately-sequenced with respect to the operation of the compound assignment of +=? I'm unsure, because the 'sequenced' relation is defined for individual value computations and side effects, not complete evaluations of operators, which may involve both. In fact the evaluation of a compound assignment operator involves three items: the lvalue-to-rvalue conversion of its left operand, the side effect (the assignment proper), and the value computation of the compound assignment (which is sequenced after the side effect, and returns the original left operand as lvalue). Note that the existence of the lvalue-to-rvalue conversion is never explicitly mentioned in the standard except in the note cited above; in particular, the standard makes no (other) statement at all regarding its sequencing relative to other evaluations. It is pretty clear that in the example the call of f is sequenced before the side effect and value computation of += (since the call occurs in the value computation of the right operand to +=), but it might be indeterminately-sequenced with respect to the lvalue-to-rvalue conversion part. I recall from my question that since the left operand of += is an lvalue (and necessarily so), one cannot construe the lvalue-to-rvalue conversion to have occurred as part of the value computation of the left operand.
However, by the principle of the excluded middle, the call to f must either be indeterminately-sequenced with respect to the operation of the compound assignment of +=, or not indeterminately-sequenced with respect to it; in the latter case it must be sequenced before it because it cannot possibly be sequenced after it (the call of f being sequenced before the side effect of +=, and the relation being anti-symmetric). So first assume it is indeterminately-sequenced with respect to the operation. Then the cited clause says that w.r.t. the call of f the evaluation of += is a single operation, and the note explains that it means the call should not intervene between the lvalue-to-rvalue conversion and the side effect associated with +=; it should either be sequenced before both, or after both. But being sequenced after the side effect is not possible, so it should be before both. This makes (by transitivity) the side effect of ++ sequenced before the lvalue-to-rvalue conversion, exit UB. Next assume the call of f is sequenced before the operation of +=. Then it is in particular sequenced before the lvalue-to-rvalue conversion, and again by transitivity so is the side effect of ++; no UB in this branch either.
Conclusion: 5.17:1 contradicts 5.17:7 if the latter is taken (CAUB) to be normative for questions of UB resulting from unsequenced evaluations by 1.9:15. As I said CAUB is self-contradictory as well (by arguments indicated in the question), but this answer is getting to long, so I'll leave it at this for now.
Three problems, and two proposals for resolving them
Trying to understand what the standard writes about these matters, I distinguish three aspects in which the text is hard to interpret; they all are of a nature that the text is insufficiently clear about what model its statements are referring to. (I cite the texts at the end of the numbered items, since I do not know the markup to resume a numbered item after a quote)
The text of 5.17:7 is of an apparent simplicity that, although the intention is easy to grasp, gives us little hold when applied to difficult situations. It makes a sweeping claim (equivalent behavior, apparently in all aspects) but whose application is thwarted by the exception clause. What if the behavior of E1 = E1 op E2 is undefined? Well then that of E1 op = E2 should be as well. But what if the UB was due to E1 being evaluated twice in E1 = E1 op E2? Then evaluating E1 op = E2 should presumably not be UB, but if so, then defined as what? This is like saying "the youth of the second twin was exactly like that of the first, except that he did not die at childbirth." Frankly, I think this text, which has little evolved since the C version "A compound assignment of the the form E1 op = E2 differs from the simple assignment expression E1 = E1 op E2 only in that the lvalue E1 is evaluated only once." might be adapted to match the changes in the standard.
(5.17) 7 The behavior of an expression of the form E1 op = E2 is equivalent to
E1 = E1 op E2 except that E1 is evaluated only once.[...]
It is not so clear what precisely the actions (evaluations) are between which the 'sequenced' relation is defined. It is said (1.9:12) that evaluation of an expression includes value computations and initiation of side effects. Though this appears to say that an evaluation may have multiple (atomic) components, the sequenced relation is actually mostly defined (e.g. in 1.9:14,15) for individual components, so that it might be better to read this as that the notion of "evaluation" encompasses both value computations and (initiation of) side effects. However in some cases the 'sequenced' relation is defined for the (entire) execution of an expression of statement (1.9:15) or for a function call (5.17:1), even though a passage in 1.9:15 avoids the latter by referring directly to executions in the body of a called function.
(1.9) 12 Evaluation of an expression (or a sub-expression) in general includes
both value computations (...) and initiation of side effects. [...] 13 Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread [...] 14 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated. [...] 15 When calling a function (whether or not the function is inline), every value computation and side effect
associated with any argument expression, or with the postfix expression designating the called function, is
sequenced before execution of every expression or statement in the body of the called function. [...] Every evaluation in the calling function (including other function calls) ... is indeterminately sequenced with
respect to the execution of the called function [...] (5.2.6, 5.17) 1 ... With respect to an indeterminately-sequenced function call, ...
The text should more clearly acknowledge that a compound assignment involves, in contrast to a simple assignment, the action of fetching the value previously assigned to its left operand; this action is like lvalue-to-rvalue conversion, but does not happen as part of the value computation of that left operand, since it is not a prvalue; indeed it is a problem that 1.9:12 only acknowledges such action for prvalue evaluation. In particular the text should be more clear about which 'sequenced' relations are given for that action, if any.
(1.9) 12 Evaluation of an expression... includes... value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation)
The second point is the least directly related to our concrete question, and I think it can be solved simply by choosing a clear point of view and reformulating pasages that seem to indicate a different point of view. Given that one of the main purposes of the old sequence points, and now the 'sequenced' relation, was to make clear that the side effect of postfix-increment operators is unsequenced w.r.t. to actions sequenced after the value computation of that operator (thus giving e.g. i = i++ UB), the point of view must be that individual value computations and (initiation of) individual side effects are "evaluations" for which "sequenced before" may be defined. For pragmatic reasons I would also include two more kinds of (trivial) "evaluations": function entry (so that the language of 1.9:15 may be simplified to: "When calling a function..., every value computation and side effect associated with any of its argument expressions, or with the postfix expression designating the called function, is sequenced before entry of that function") and function exit (so that any action in the function body gets by transitivity sequenced before anything that requires the function value; this used to be guaranteed by a sequence point, but the C++11 standard seems to have lost such guarantee; this might make calling a function ending with return i++; potentially UB where this is not intended, and used to be safe). Then one can also be clear about the "indeterminately sequenced" relation of functions calls: for every function call, and every evaluation that is not (directly or indirectly) part of evaluating that call, that evaluation shall be sequenced (either before or after) w.r.t. both entry and exit of that function call, and it shall have the same relation in both cases (so that in particular such external actions cannot be sequenced after function entry but before function exit, as is clearly desirable within a single thread).
Now to resolve points 1. and 3., I can see two paths (each affecting both points), which have different consequences for the defined or not behavior of our example:
Compound assignments with two operands, and three evaluations
Compound operations have thier two usual operands, an lvalue left operand and a prvalue right operand. To settle the unclarity of 3., it is included in 1.9:12 that fetching the value previously assigned to an object also may occur in compound assignments (rather than only for prvalue evaluation). The semantics of compount assignments are defined by changing 5.17:7 to
In a compound assignment op=, the value previously assigned to the object referred to by the left operand is fetched, the operator op is applied with this value as left operand and the right operand of op= as right operand, and the resulting value replaces that of the object referred to by the left operand.
(That gives two evaluations, the fetch and the side effect; a third evaluation is the trivial value computation of the compound operator, sequenced after both other evaluations.)
For clarity, state clearly in 1.9:15 that value computations in operands are sequenced before all value computations associated with the operator (rather than just those for the result of the operator), which ensures that evaluating the lvalue left operand is sequenced before fetching its value (one can hardly imagine otherwise), and also sequences the value computation of the right operand before that fetch, thus excluding UB in our example. While at it, I see no reason not to also sequence value computations in operands before any side effects associated with the operator (as they clearly must); this would make mentioning this explicitly for (compound) assignments in 5.17:1 superfluous. On the other hand do mention there that the value fetching in a compound assignment is sequenced before its side effect.
Compound assignments with three operands, and two evaluations
In order to obtain that the fetch in a compount assignment will be unsequenced with respect to the value computation of the right operand, making our example UB, the clearest way seems to be to give compound operators an implicit third (middle) operand, a prvalue, not represented by a separate expression, but obtained by lvalue-to-rvalue conversion from the left operand (this three-operand nature corresponds to the expanded form of compound assignments, but by obtaining the middle operand from the left operand, it is ensured that the value is fetched from the same object to which the result will be stored, a crucial guarantee that is only vaguely and implicitly given in the current formulation through the "except that E1 is evaluated only once" clause). The difference with the previous solution is that the fetch is now a genuine lvalue-to-rvalue conversion (since the middle operand is a prvalue) and is performed as part of the value computation of the operands to the compound assignment, which makes it naturally unsequenced with the value computation of the right operand. It should be stated somewhere (in a new clause that describes this implicit operand) that the value computation of the left operand is sequenced before this lvalue-to-rvalue conversion (it clearly must). Now 1.9:12 can be left as it is, and in place of 5.17:7 I propose
In a compound assignment op= with left operand a (an lvalue), and midlle and right operands brespectively c (both prvalues), the operator op is applied with b as left operand and c as right operand, and the resulting value replaces that of the object referred to by a.
(That gives one evaluation, the side effect, with as second evaluation the trivial value computation of the compound operator, sequenced after it.)
The still applicable changes to 1.9:15 and 5.17:1 suggested in the previous solution could still apply, but would not give our original example defined behavior. However the modified example at the top of this answer would still have defined behavior, unless the part 5.17:1 "compound assignment is a single operation" is scrapped or modified (there is a similar passage in 5.2.6 for postfix increment/decrement). The existence of those passages would suggest that detaching the fecth and store operations within a single compound assignement or postfix increment/decrement was not the intention of those who wrote the current standard (and by extension making our example UB), but this of course is mere guesswork.
From the compiler writer's perspective, they don't care about "i += ++i + 1", because whatever the compiler does, the programmer may not get the correct result, but they surely get what they deserve. And nobody writes code like that. What the compiler writer cares about is
*p += ++(*q) + 1;
The code must read *p and *q, increase *q by 1, and increase *p by some amount that is calculated. Here the compiler writer cares about restrictions on the order of read and write operations. Obviously if p and q point to different objects, the order makes no difference, but if p == q then it will make a difference. Again, p will be different from q unless the programmer writing the code is insane.
By making the code undefined, the language allows the compiler to produce the fastest possible code without caring for insane programmers. By making the code defined, the language forces the compiler to produce code that conforms to the standard even in insane cases, which may make it run slower. Both compiler writers and sane programmers don't like that.
So even if the behaviour is defined in C++11, it would be very dangerous to use it, because (a) a compiler might not be changed from C++03 behaviour, and (b) it might be undefined behaviour in C++14, for the reasons above.

Pointer arithmetics in two dimensional array (C++) [duplicate]

What are "sequence points"?
What is the relation between undefined behaviour and sequence points?
I often use funny and convoluted expressions like a[++i] = i;, to make myself feel better. Why should I stop using them?
If you've read this, be sure to visit the follow-up question Undefined behavior and sequence points reloaded.
(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)
C++98 and C++03
This answer is for the older versions of the C++ standard. The C++11 and C++14 versions of the standard do not formally contain 'sequence points'; operations are 'sequenced before' or 'unsequenced' or 'indeterminately sequenced' instead. The net effect is essentially the same, but the terminology is different.
Disclaimer : Okay. This answer is a bit long. So have patience while reading it. If you already know these things, reading them again won't make you crazy.
Pre-requisites : An elementary knowledge of C++ Standard
What are Sequence Points?
The Standard says
At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)
Side effects? What are side effects?
Evaluation of an expression produces something and if in addition there is a change in the state of the execution environment it is said that the expression (its evaluation) has some side effect(s).
For example:
int x = y++; //where y is also an int
In addition to the initialization operation the value of y gets changed due to the side effect of ++ operator.
So far so good. Moving on to sequence points. An alternation definition of seq-points given by the comp.lang.c author Steve Summit:
Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.
What are the common sequence points listed in the C++ Standard?
Those are:
at the end of the evaluation of full expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.)1
Example :
int a = 5; // ; is a sequence point here
in the evaluation of each of the following expressions after the evaluation of the first expression (§1.9/18) 2
a && b (§5.14)
a || b (§5.15)
a ? b : c (§5.16)
a , b (§5.18) (here a , b is a comma operator; in func(a,a++) , is not a comma operator, it's merely a separator between the arguments a and a++. Thus the behaviour is undefined in that case (if a is considered to be a primitive type))
at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any) which
takes place before execution of any expressions or statements in the function body (§1.9/17).
1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically
part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument
2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
What is Undefined Behaviour?
The Standard defines Undefined Behaviour in Section §1.3.12 as
behavior, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements 3.
Undefined behavior may also be expected when this
International Standard omits the description of any explicit definition of behavior.
3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or with-
out the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.
What is the relation between Undefined Behaviour and Sequence Points?
Before I get into that you must know the difference(s) between Undefined Behaviour, Unspecified Behaviour and Implementation Defined Behaviour.
You must also know that the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.
For example:
int x = 5, y = 6;
int z = x++ + y++; //it is unspecified whether x++ or y++ will be evaluated first.
Another example here.
Now the Standard in §5/4 says
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
What does it mean?
Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.
From the above sentence the following expressions invoke Undefined Behaviour:
i++ * ++i; // UB, i is modified more than once btw two SPs
i = ++i; // UB, same as above
++i = 2; // UB, same as above
i = ++i + 1; // UB, same as above
++++++i; // UB, parsed as (++(++(++i)))
i = (i, ++i, ++i); // UB, there's no SP between `++i` (right most) and assignment to `i` (`i` is modified more than once btw two SPs)
But the following expressions are fine:
i = (i, ++i, 1) + 1; // well defined (AFAIK)
i = (++i, i++, i); // well defined
int j = i;
j = (++i, i++, j*i); // well defined
Furthermore, the prior value shall be accessed only to determine the value to be stored.
What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.
For example in i = i + 1 all the access of i (in L.H.S and in R.H.S) are directly involved in computation of the value to be written. So it is fine.
This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.
Example 1:
std::printf("%d %d", i,++i); // invokes Undefined Behaviour because of Rule no 2
Example 2:
a[i] = i++ // or a[++i] = i or a[i++] = ++i etc
is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
Example 3 :
int x = i + i++ ;// Similar to above
Follow up answer for C++11 here.
This is a follow up to my previous answer and contains C++11 related material..
Pre-requisites : An elementary knowledge of Relations (Mathematics).
Is it true that there are no Sequence Points in C++11?
Yes! This is very true.
Sequence Points have been replaced by Sequenced Before and Sequenced After (and Unsequenced and Indeterminately Sequenced) relations in C++11.
What exactly is this 'Sequenced before' thing?
Sequenced Before(§1.9/13) is a relation which is:
Asymmetric
Transitive
between evaluations executed by a single thread and induces a strict partial order1
Formally it means given any two evaluations(See below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced 2.
Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which3.
[NOTES]
1 : A strict partial order is a binary relation "<" over a set P which is asymmetric, and transitive, i.e., for all a, b, and c in P, we have that:
........(i). if a < b then ¬ (b < a) (asymmetry);
........(ii). if a < b and b < c then a < c (transitivity).
2 : The execution of unsequenced evaluations can overlap.
3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.
What is the meaning of the word 'evaluation' in context of C++11?
In C++11, evaluation of an expression (or a sub-expression) in general includes:
value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and
initiation of side effects.
Now (§1.9/14) says:
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.
Trivial example:
int x;
x = 10;
++x;
Value computation and side effect associated with ++x is sequenced after the value computation and side effect of x = 10;
So there must be some relation between Undefined Behaviour and the above-mentioned things, right?
Yes! Right.
In (§1.9/15) it has been mentioned that
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced4.
For example :
int main()
{
int num = 19 ;
num = (num << 3) + (num >> 3);
}
Evaluation of operands of + operator are unsequenced relative to each other.
Evaluation of operands of << and >> operators are unsequenced relative to each other.
4: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.
(§1.9/15)
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator.
That means in x + y the value computation of x and y are sequenced before the value computation of (x + y).
More importantly
(§1.9/15) If a side effect on a scalar object is unsequenced relative to either
(a) another side effect on the same scalar object
or
(b) a value computation using the value of the same scalar object.
the behaviour is undefined.
Examples:
int i = 5, v[10] = { };
void f(int, int);
i = i++ * ++i; // Undefined Behaviour
i = ++i + i++; // Undefined Behaviour
i = ++i + ++i; // Undefined Behaviour
i = v[i++]; // Undefined Behaviour
i = v[++i]: // Well-defined Behavior
i = i++ + 1; // Undefined Behaviour
i = ++i + 1; // Well-defined Behaviour
++++i; // Well-defined Behaviour
f(i = -1, i = -1); // Undefined Behaviour (see below)
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note]
Expressions (5), (7) and (8) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.
Multiple preincrement operations on a variable in C++0x
Unsequenced Value Computations
Final Note :
If you find any flaw in the post please leave a comment. Power-users (With rep >20000) please do not hesitate to edit the post for correcting typos and other mistakes.
C++17 (N4659) includes a proposal Refining Expression Evaluation Order for Idiomatic C++
which defines a stricter order of expression evaluation.
In particular, the following sentence
8.18 Assignment and compound assignment operators:....
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand.
together with the following clarification
An expression X is said to be sequenced before an expression Y if every
value computation and every side effect associated with the expression X is sequenced before every value
computation and every side effect associated with the expression Y.
make several cases of previously undefined behavior valid, including the one in question:
a[++i] = i;
However several other similar cases still lead to undefined behavior.
In N4140:
i = i++ + 1; // the behavior is undefined
But in N4659
i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined
Of course, using a C++17 compliant compiler does not necessarily mean that one should start writing such expressions.
I am guessing there is a fundamental reason for the change, it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. Unspecified order of elaboration is merely selection of one of several possible serial orderings, this is quite different to before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in:
f (a,b)
previously either a then b, or, b then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.
In C99(ISO/IEC 9899:TC3) which seems absent from this discussion thus far the following steteents are made regarding order of evaluaiton.
[...]the order of evaluation of subexpressions and the order in which
side effects take place are both unspecified. (Section 6.5 pp 67)
The order of evaluation of the operands is unspecified. If an attempt
is made to modify the result of an assignment operator or to access it
after the next sequence point, the behavior[sic] is undefined.(Section
6.5.16 pp 91)

Unsequenced value computations (a.k.a sequence points)

Sorry for opening this topic again, but thinking about this topic itself has started giving me an Undefined Behavior. Want to move into the zone of well-defined behavior.
Given
int i = 0;
int v[10];
i = ++i; //Expr1
i = i++; //Expr2
++ ++i; //Expr3
i = v[i++]; //Expr4
I think of the above expressions (in that order) as
operator=(i, operator++(i)) ; //Expr1 equivalent
operator=(i, operator++(i, 0)) ; //Expr2 equivalent
operator++(operator++(i)) ; //Expr3 equivalent
operator=(i, operator[](operator++(i, 0)); //Expr4 equivalent
Now coming to behaviors here are the important quotes from C++ 0x.
$1.9/12- "Evaluation of an expression
(or a sub-expression) in general
includes both value computations
(including determining the identity of
an object for lvalue evaluation and
fetchinga value previously assigned to
an object for rvalue evaluation) and
initiation of side effects."
$1.9/15- "If a side effect on a scalar
object is unsequenced relative to
either another side effect on the same
scalar object or a value
computation using the value of the
same scalar object, the behavior is
undefined."
[ Note: Value computations and side
effects associated with different
argument expressions are unsequenced.
—end note ]
$3.9/9- "Arithmetic types (3.9.1),
enumeration types, pointer types,
pointer to member types (3.9.2),
std::nullptr_t, and cv-qualified
versions of these types (3.9.3) are
collectively called scalar types."
In Expr1, the evaluation of the expression i (first argument), is unsequenced with respect to the evaluation of the expession operator++(i) (which has a side effect).
Hence Expr1 has undefined behavior.
In Expr2, the evaluation of the expression i (first argument), is unsequenced with respect to the evaluation of the expession operator++(i, 0) (which has a side effect)'.
Hence Expr2 has undefined behavior.
In Expr3, the evaluation of the lone argument operator++(i) is required to be complete before the outer operator++ is called.
Hence Expr3 has well defined behavior.
In Expr4, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the operator[](operator++(i, 0) (which has a side effect).
Hence Expr4 has undefined behavior.
Is this understanding correct?
P.S. The method of analyzing the expressions as in OP is not correct. This is because, as #Potatoswatter, notes - "clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators."
Native operator expressions are not equivalent to overloaded operator expressions. There is a sequence point at the binding of values to function arguments, which makes the operator++() versions well-defined. But that doesn't exist for the native-type case.
In all four cases, i changes twice within the full-expression. Since no ,, ||, or && appear in the expressions, that's instant UB.
§5/4:
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
Edit for C++0x (updated)
§1.9/15:
The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Note however that a value computation and a side effect are two distinct things. If ++i is equivalent to i = i+1, then + is the value computation and = is the side effect. From 1.9/12:
Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.
So although the value computations are more strongly sequenced in C++0x than C++03, the side effects are not. Two side effects in the same expression, unless otherwise sequenced, produce UB.
Value computations are ordered by their data dependencies anyway and, side effects absent, their order of evaluation is unobservable, so I'm not sure why C++0x goes to the trouble of saying anything, but that just means I need to read more of the papers by Boehm and friends wrote.
Edit #3:
Thanks Johannes for coping with my laziness to type "sequenced" into my PDF reader search bar. I was going to bed and getting up on the last two edits anyway… right ;v) .
§5.17/1 defining the assignment operators says
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Also §5.3.2/1 on the preincrement operator says
If x is not of type bool, the expression ++x is equivalent to x+=1 [Note: see … addition (5.7) and assignment operators (5.17) …].
By this identity, ++ ++ x is shorthand for (x +=1) +=1. So, let's interpret that.
Evaluate the 1 on the far RHS and descend into the parens.
Evaluate the inner 1 and the value (prvalue) and address (glvalue) of x.
Now we need the value of the += subexpression.
We're done with the value computations for that subexpression.
The assignment side effect must be sequenced before the value of assignment is available!
Assign the new value to x, which is identical to the glvalue and prvalue result of the subexpression.
We're out of the woods now. The whole expression has now been reduced to x +=1.
So, then 1 and 3 are well-defined and 2 and 4 are undefined behavior, which you would expect.
The only other surprise I found by searching for "sequenced" in N3126 was 5.3.4/16, where the implementation is allowed to call operator new before evaluating constructor arguments. That's cool.
Edit #4: (Oh, what a tangled web we weave)
Johannes notes again that in i == ++i; the glvalue (a.k.a. the address) of i is ambiguously dependent on ++i. The glvalue is certainly a value of i, but I don't think 1.9/15 is intended to include it for the simple reason that the glvalue of a named object is constant, and cannot actually have dependencies.
For an informative strawman, consider
( i % 2? i : j ) = ++ i; // certainly undefined
Here, the glvalue of the LHS of = is dependent on a side-effect on the prvalue of i. The address of i is not in question; the outcome of the ?: is.
Perhaps a good counterexample is
int i = 3, &j = i;
j = ++ i;
Here j has a glvalue distinct from (but identical to) i. Is this well-defined, yet i = ++i is not? This represents a trivial transformation that a compiler could apply to any case.
1.9/15 should say
If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the prvalue of the same scalar object, the behavior is undefined.
In thinking about expressions like those mentioned, I find it useful to imagine a machine where memory has interlocks so that reading a memory location as part of a read-modify-write sequence will cause any attempted read or write, other than the concluding write of the sequence, to be stalled until the sequence completes. Such a machine would hardly be an absurd concept; indeed, such a design could simplify many multi-threaded code scenarios. On the other hand, an expression like "x=y++;" could fail on such a machine if 'x' and 'y' were references to the same variable, and the compiler's generated code did something like read-and-lock reg1=y; reg2=reg1+1; write x=reg1; write-and-unlock y=reg2. That would be a very reasonable code sequence on processors where writing a newly-computed value would impose a pipeline delay, but the write to x would lock up the processor if y were aliased to the same variable.