Why is the first expression allowed, but the second not:
void test()
{
int a;
++a = getSomeInt();
a++ = getSomeInt();
}
I mean, why is it forbidden for the second one to be an lvalue? The second one makes sense and the first does not. In the first one we increment the variable and immediately afterwards give it a new value, so the increment is lost. That's not the case in the second expression: it makes sense to assign some value and increment the variable after that.
The result of postfix increment is a prvalue, which means pure rvalue, so it is not modifiable. This is per the draft C++ standard, under postfix expressions in section 5.2.6 Increment and decrement, which says (emphasis mine):
The value of a postfix ++ expression is the value of its operand. [ Note: the value obtained is a copy of the original value —end note ] [...] The result is a prvalue. [...]
This makes sense if you think about it: since postfix increment has to return the previous value of a, the result has to be a temporary value.
For completeness' sake, the language for prefix increment in section 5.3.2 Increment and decrement says (emphasis mine):
The operand of prefix ++ is modified by adding 1, or set to true if it is bool (this use is deprecated). The operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic type or a pointer to a completely-defined object type. The result is the updated operand; it is an lvalue [...]
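As a rough compile-time check of the two quoted rules, decltype can report the value category of each expression. This is only a sketch: the helper names are made up for illustration and are not part of the standard or the original post.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helpers (illustrative names only).
// decltype applied to a non-identifier expression yields T& for an lvalue
// and plain T for a prvalue, so is_lvalue_reference tells the two apart.
template <typename T>
constexpr bool prefix_increment_is_lvalue() {
    return std::is_lvalue_reference<decltype(++std::declval<T&>())>::value;
}

template <typename T>
constexpr bool postfix_increment_is_lvalue() {
    return std::is_lvalue_reference<decltype(std::declval<T&>()++)>::value;
}

static_assert(prefix_increment_is_lvalue<int>(), "++a is an lvalue");
static_assert(!postfix_increment_is_lvalue<int>(), "a++ is a prvalue");
```

Because a++ is a prvalue, it cannot appear on the left of a built-in assignment, which is why the second line in the question is rejected.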
Update
I realized that:
++a = getSomeInt();
invokes undefined behavior in C++03. We can see that by looking at the relevant section in an older draft standard, section 5 Expressions paragraph 4, which says:
[...]Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
So since you are modifying a more than once between sequence points, it is undefined. As far as I can tell this is well defined in C++11, which in section 1.9 Program execution paragraph 15 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
and section 5.17 Assignment and compound assignment operators paragraph 1 says:
[...] In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. [...]
But regardless, even if they are well defined, expressions like this:
++a = getSomeInt();
are difficult to read and maintain and should be eschewed in favor of simpler code.
Update 2
Not sure how I missed this before, but you are not initializing a here:
int a;
and so it will have an indeterminate value. We don't know what its initial value will be, and performing a pre-increment on a would also be undefined behavior.
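A minimal sketch of the simpler form, assuming a stand-in getSomeInt() (the question never shows its definition, so the body and return value here are hypothetical). The variable is initialized and each statement modifies it once, so none of the concerns above apply in any C++ standard.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's getSomeInt(); the value is arbitrary.
int getSomeInt() { return 42; }

// Same net effect as "++a = getSomeInt();" but with an initialized variable
// and one modification per statement.
int incrementThenAssign() {
    int a = 0;         // initialized, so reading a below is well defined
    ++a;               // a == 1
    a = getSomeInt();  // the increment is overwritten here
    return a;
}

void demoIncrementThenAssign() {
    assert(incrementThenAssign() == 42);
}
```

Written this way it is also obvious that the increment has no lasting effect, which is the point the question raised.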
Related
map<int, int> mp;
printf("%d ", mp.size());
mp[10]=mp.size();
printf("%d\n", mp[10]);
This code yields an answer that is not very intuitive:
0 1
I understand why it happens - the left side of the assignment returns reference to mp[10]'s underlying value and at the same time creates aforementioned value, and only then is the right side evaluated, using the newly computed size() of the map.
Is this behaviour stated anywhere in C++ standard? Or is the order of evaluation undefined?
Result was obtained using g++ 5.2.1.
Yes, this is covered by the standard and it is unspecified behavior. This particular case is covered in a recent C++ standards proposal: N4228: Refining Expression Evaluation Order for Idiomatic C++ which seeks to refine the order of evaluation rules to make it well specified for certain cases.
It describes this problem as follows:
Expression evaluation order is a recurring discussion topic in the C++ community. In a nutshell, given an expression such as f(a, b, c), the order in which the sub-expressions f, a, b, c are evaluated is left unspecified by the standard. If any two of these sub-expressions happen to modify the same object without intervening sequence points, the behavior of the program is undefined. For instance, the expression f(i++, i) where i is an integer variable leads to undefined behavior, as does v[i] = i++. Even when the behavior is not undefined, the result of evaluating an expression can still be anybody’s guess. Consider the following program fragment:
#include <map>
int main() {
std::map<int, int> m;
m[0] = m.size(); // #1
}
What should the map object m look like after evaluation of the statement marked #1? {{0, 0}} or {{0, 1}}?
We know that, unless specified otherwise, the evaluations of sub-expressions are unsequenced. This is from the draft C++11 standard, section 1.9 Program execution, which says:
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced.[...]
and all that section 5.17 Assignment and compound assignment operators [expr.ass] says is:
[...]In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.[...]
So this section does not nail down the order of evaluation, but we know this is not undefined behavior since both operator[] and size() are function calls, and section 1.9 tells us (emphasis mine):
[...] When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function. [...]
Note, I cover the second interesting example from the N4228 proposal in the question Does this code from “The C++ Programming Language” 4th edition section 36.3.6 have well-defined behavior?.
Update
It seems like a revised version of N4228 was accepted by the Evolution Working Group at the last WG21 meeting, but the paper (P0145R0) is not yet available. So this could possibly no longer be unspecified in C++17.
Update 2
Revision 3 of P0145 made this specified and updated [expr.ass]p1:
The assignment operator (=) and the compound assignment operators all group right-to-left.
All require a modifiable lvalue as their left operand; their result is an lvalue referring to the left operand.
The result in all cases is a bit-field if the left operand is a bit-field.
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand. ...
From the C++11 standard (emphasis mine):
5.17 Assignment and compound assignment operators
1 The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand. The result in all cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Whether the left operand is evaluated first or the right operand is evaluated first is not specified by the language. A compiler is free to choose to evaluate either operand first. Since the final result of your code depends on the order of evaluation of the operands, I would say it is unspecified behavior rather than undefined behavior.
1.3.25 unspecified behavior
behavior, for a well-formed program construct and correct data, that depends on the implementation
I'm sure that the C++ standard does not specify, for an expression x = y;, in which order x and y are evaluated (this is the reason why you can't do *p++ = *p++, for example, because the two p++ are not done in a defined order).
In other words, to guarantee that x = y; is evaluated in a defined order, you need to break it up into two statements, with a sequence point between them.
T tmp = y;
x = tmp;
(Of course, in this particular case, one might presume the compiler prefers to do operator[] before size() because it can then store the value directly into the result of operator[] instead of keeping it in a temporary place, to store it later after operator[] has been evaluated - but I'm pretty sure the compiler doesn't NEED to do it in that order)
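Applied to the map from the question, the two-statement approach might look like the sketch below. The function names are illustrative only, and the mapped type is changed to std::size_t (an assumption, to avoid the implicit conversion from size() to int). Each function pins down one of the two possible orders, so the result no longer depends on the compiler or the language version.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Size is read first, then the element is created: stores 0.
std::size_t sizeBeforeInsert() {
    std::map<int, std::size_t> mp;
    const std::size_t n = mp.size();  // 0, the map is still empty
    mp[10] = n;                       // operator[] inserts key 10 here
    return mp[10];
}

// Element is created first, then the size is read: stores 1.
std::size_t sizeAfterInsert() {
    std::map<int, std::size_t> mp;
    std::size_t& slot = mp[10];       // operator[] inserts key 10 here
    slot = mp.size();                 // 1, the map now has one element
    return mp[10];
}

void demoExplicitOrder() {
    assert(sizeBeforeInsert() == 0);
    assert(sizeAfterInsert() == 1);
}
```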
Let's take a look at what your code breaks down to:
mp.operator[](10).operator=(mp.size());
which pretty much tells the story that in the first part an entry to 10 is created and in the second part the size of the container is assigned to the integer reference in position of 10.
But now you get into the order of evaluation problem, which is unspecified. Here is a much simpler example.
When should map::size() get called, before or after map::operator[](int const &)?
Nobody really knows.
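One way to see which order a given compiler picks is to log the calls. This is a sketch with hypothetical helper names; before C++17 either order is allowed for the built-in assignment below, while the C++17 wording quoted earlier puts the right operand first. Either way both calls happen exactly once and the stored value is the same.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical logging helpers; names are illustrative only.
std::vector<std::string> callLog;
int storage = 0;

int& leftOperand() {
    callLog.push_back("left");
    return storage;
}

int rightOperand() {
    callLog.push_back("right");
    return 7;
}

// Performs the assignment and returns the order in which
// the two operands were evaluated on this implementation.
std::vector<std::string> assignAndReportOrder() {
    callLog.clear();
    leftOperand() = rightOperand();
    return callLog;
}

void demoOrderLog() {
    assert(assignAndReportOrder().size() == 2);
    assert(storage == 7);
}
```

Printing the returned vector on different compilers and with different -std= settings shows whether the left or the right call came first.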
Consider the following expression (with declaration for exposition):
int n = 42;
--n &= 0x01;
Does this fall foul of sequencing rules?
In my opinion, the pre-decrement is needed as part of the "value computation" of the left-hand operand. If this is true, there's no UB here since C++11 (and, since C++17, both value computations and side effects are sequenced relative to the assignment).
If it were a post-decrement, then the modification of n would be merely a side-effect and we'd not have good sequencing (until C++17).
I suppose you are right, here's what the standard says:
8.5.18 Assignment and compound assignment operators
All require a modifiable lvalue as their left operand; their result is
an lvalue referring to the left operand. [...]
In all cases, the assignment is sequenced after the value computation
of the right and left operands, and before the value computation of
the assignment expression.
So from the above it seems that both the left and right operands of the assignment are evaluated before the assignment itself.
From the standard, about pre-increment:
8.5.2.2 Increment and decrement
The result is the updated operand; it is an lvalue, and it is a
bit-field if the operand is a bit-field. The expression
++x is equivalent to x+=1.
Which means that even before C++17 its side effect is sequenced before its value computation.
As far as I can tell the wording in C++11 doesn't mention the "value computation" of preincrement and predecrement in relation to the update:
[expr.pre.incr]
1 The operand of prefix ++ is modified by adding 1, or set to true if it
is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely-defined object type. The result is the updated
operand; it is an lvalue, and it is a bit-field if the operand is a
bit-field. If x is not of type bool, the expression ++x is equivalent
to x+=1.
There is nothing in the above paragraph from which I'd conclude the modification has to come first. An implementation may very well compute the updated value (and use it) prior to writing it into the object by the next sequence point.
In which case we will have ourselves a side-effect that is indeterminately sequenced with another modification. So I'd say since the standard doesn't specify if there is a side-effect, nor how such potential side-effect is to be sequenced, the whole thing is undefined by omission.
With C++17, we of course get well defined sequencing with or without this potential side-effect.
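For code that has to be safe regardless of how the pre-C++17 wording is read, the expression can be split into two statements. A minimal sketch (the function name is hypothetical):

```cpp
#include <cassert>

// Splits "--n &= 0x01;" into two full-expressions so that the
// decrement is complete before the compound assignment starts.
int decrementThenMask(int n) {
    --n;        // first statement: the decrement completes here
    n &= 0x01;  // second statement: mask the already-updated value
    return n;
}

void demoDecrementThenMask() {
    assert(decrementThenMask(42) == 1);  // 41 & 1
    assert(decrementThenMask(3) == 0);   // 2 & 1
}
```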
In C++,
i = ++++j;
works fine in the code but when I use,
i = j++++;
I receive the following error:
Operand for operator "++" must be an lvalue.
Why am I getting this error?
Post-increment requires that the operand be a modifiable lvalue, but the result of post-increment is a prvalue ("pure" rvalue), which is not modifiable. This diagram shows what is going on:
    i = (j++)++ ;
         ^  ^
         |  |
         |  Result is a prvalue, not a valid operand for subsequent post-increment
         Modifiable lvalue
Understanding lvalues and rvalues in C and C++ is a good place to start if you need to understand the difference between lvalues and rvalues.
The draft C++ standard, section 5.2.6 Increment and decrement [expr.post.incr] paragraph 1, says (emphasis is mine in this and subsequent quotes):
The value of a postfix ++ expression is the value of its operand. [ Note: the value obtained is a copy of the original value —end note ] The operand shall be a modifiable lvalue. [..] The result is a prvalue.
Update
I reworked my language on undefined behavior since there is a difference here with respect to C++03 and C++11.
Although the first expression shown:
i = ++++j ;
does not generate an error, if this is C++03 and j is a fundamental type this is undefined behavior, since modifying its value more than once between two sequence points is undefined. The relevant section in an older draft standard would be section 5 Expressions paragraph 4, which says:
[...]Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
and it gives some examples, one of which is as follows:
i = ++i + 1; // the behavior is undefined
In C++11 the language changes: the behavior is undefined only if a side effect on a scalar object is unsequenced relative to another side effect on the same object. So this is actually well defined in C++11; section 1.9 Program execution paragraph 15 says:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Using post- and pre-increment in this way does not lead to readable (maintainable) code. In both cases, using j += 2 either before or after the assignment statement would have sufficed.
You are getting this error because the postfix operator returns a value and not a reference. But for clarity you should probably not write i = j++++; it might be clearer to say i = j += 2; or to separate the lines into i = j+2; j+=2; or j+=2; i = j;.
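The suggested rewrites can be sketched as two small functions (hypothetical names). The second one is an assumption about what i = j++++; was meant to do, namely give i the old value and then advance j by two; the original post does not say.

```cpp
#include <cassert>
#include <utility>

// Returns {i, j} after the readable equivalent of "i = ++++j;".
std::pair<int, int> addTwoThenAssign(int j) {
    j += 2;     // both increments as a single modification
    int i = j;  // i sees the updated value
    return std::make_pair(i, j);
}

// Returns {i, j} for one plausible reading of "i = j++++;" (an assumption):
// i receives the old value, then j advances by two.
std::pair<int, int> assignThenAddTwo(int j) {
    int i = j;
    j += 2;
    return std::make_pair(i, j);
}

void demoAddTwo() {
    assert(addTwoThenAssign(5) == std::make_pair(7, 7));
    assert(assignThenAddTwo(5) == std::make_pair(5, 7));
}
```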
Sorry for opening this topic again, but thinking about this topic itself has started giving me an Undefined Behavior. Want to move into the zone of well-defined behavior.
Given
int i = 0;
int v[10];
i = ++i; //Expr1
i = i++; //Expr2
++ ++i; //Expr3
i = v[i++]; //Expr4
I think of the above expressions (in that order) as
operator=(i, operator++(i)) ; //Expr1 equivalent
operator=(i, operator++(i, 0)) ; //Expr2 equivalent
operator++(operator++(i)) ; //Expr3 equivalent
operator=(i, operator[](v, operator++(i, 0))); //Expr4 equivalent
Now, coming to behaviors, here are the important quotes from C++0x.
$1.9/12- "Evaluation of an expression
(or a sub-expression) in general
includes both value computations
(including determining the identity of
an object for lvalue evaluation and
fetching a value previously assigned to
an object for rvalue evaluation) and
initiation of side effects."
$1.9/15- "If a side effect on a scalar
object is unsequenced relative to
either another side effect on the same
scalar object or a value
computation using the value of the
same scalar object, the behavior is
undefined."
[ Note: Value computations and side
effects associated with different
argument expressions are unsequenced.
—end note ]
$3.9/9- "Arithmetic types (3.9.1),
enumeration types, pointer types,
pointer to member types (3.9.2),
std::nullptr_t, and cv-qualified
versions of these types (3.9.3) are
collectively called scalar types."
In Expr1, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i) (which has a side effect).
Hence Expr1 has undefined behavior.
In Expr2, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i, 0) (which has a side effect).
Hence Expr2 has undefined behavior.
In Expr3, the evaluation of the lone argument operator++(i) is required to be complete before the outer operator++ is called.
Hence Expr3 has well defined behavior.
In Expr4, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of operator[](v, operator++(i, 0)) (which has a side effect).
Hence Expr4 has undefined behavior.
Is this understanding correct?
P.S. The method of analyzing the expressions as in the OP is not correct. This is because, as @Potatoswatter notes, "clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators."
Native operator expressions are not equivalent to overloaded operator expressions. There is a sequence point at the binding of values to function arguments, which makes the operator++() versions well-defined. But that doesn't exist for the native-type case.
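To make the contrast concrete, here is a sketch with a made-up class (Counter is hypothetical, not from the question). With overloaded operators every ++ is a function call, so each one finishes before the next begins, and even (c++)++ compiles, although the second increment only touches the temporary copy.

```cpp
#include <cassert>

// Hypothetical class used only to contrast overloaded operators with built-in ones.
struct Counter {
    int value;

    Counter& operator++() {    // prefix: returns the object itself
        ++value;
        return *this;
    }
    Counter operator++(int) {  // postfix: returns a copy of the old state
        Counter old = *this;
        ++value;
        return old;
    }
};

int prefixTwice() {
    Counter c = {0};
    ++ ++c;          // two function calls; the inner one completes first
    return c.value;  // 2
}

int postfixTwice() {
    Counter c = {0};
    (c++)++;         // accepted for class types; the second ++ hits the temporary
    return c.value;  // 1
}

void demoCounter() {
    assert(prefixTwice() == 2);
    assert(postfixTwice() == 1);
}
```

None of this carries over to int, where no function call is involved.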
In all four cases, i changes twice within the full-expression. Since none of the comma, ||, or && operators appear in the expressions, that's instant UB.
§5/4:
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
Edit for C++0x (updated)
§1.9/15:
The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Note however that a value computation and a side effect are two distinct things. If ++i is equivalent to i = i+1, then + is the value computation and = is the side effect. From 1.9/12:
Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.
So although the value computations are more strongly sequenced in C++0x than C++03, the side effects are not. Two side effects in the same expression, unless otherwise sequenced, produce UB.
Value computations are ordered by their data dependencies anyway and, side effects absent, their order of evaluation is unobservable, so I'm not sure why C++0x goes to the trouble of saying anything, but that just means I need to read more of the papers that Boehm and friends wrote.
Edit #3:
Thanks Johannes for coping with my laziness to type "sequenced" into my PDF reader search bar. I was going to bed and getting up on the last two edits anyway… right ;v) .
§5.17/1 defining the assignment operators says
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Also §5.3.2/1 on the preincrement operator says
If x is not of type bool, the expression ++x is equivalent to x+=1 [Note: see … addition (5.7) and assignment operators (5.17) …].
By this identity, ++ ++ x is shorthand for (x +=1) +=1. So, let's interpret that.
Evaluate the 1 on the far RHS and descend into the parens.
Evaluate the inner 1 and the value (prvalue) and address (glvalue) of x.
Now we need the value of the += subexpression.
We're done with the value computations for that subexpression.
The assignment side effect must be sequenced before the value of assignment is available!
Assign the new value to x, which is identical to the glvalue and prvalue result of the subexpression.
We're out of the woods now. The whole expression has now been reduced to x +=1.
So, then 1 and 3 are well-defined and 2 and 4 are undefined behavior, which you would expect.
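The expansion used in the walkthrough can be written out directly. This sketch assumes C++11 or later, where the reasoning above applies; under the C++03 sequence-point rule the first function has undefined behavior. The function names are hypothetical.

```cpp
#include <cassert>

// Assumes C++11 or later; undefined under the C++03 sequence-point rule.
int incrementTwice(int x) {
    (x += 1) += 1;  // the expansion of ++ ++x used in the walkthrough
    return x;
}

// The plain form, valid in every standard.
int addTwo(int x) {
    x += 2;
    return x;
}

void demoIncrementTwice() {
    assert(incrementTwice(5) == 7);
    assert(addTwo(5) == 7);
}
```

Some compilers may still warn about the first form, which is one more reason to prefer the second.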
The only other surprise I found by searching for "sequenced" in N3126 was 5.3.4/16, where the implementation is allowed to call operator new before evaluating constructor arguments. That's cool.
Edit #4: (Oh, what a tangled web we weave)
Johannes notes again that in i == ++i; the glvalue (a.k.a. the address) of i is ambiguously dependent on ++i. The glvalue is certainly a value of i, but I don't think 1.9/15 is intended to include it for the simple reason that the glvalue of a named object is constant, and cannot actually have dependencies.
For an informative strawman, consider
( i % 2? i : j ) = ++ i; // certainly undefined
Here, the glvalue of the LHS of = is dependent on a side-effect on the prvalue of i. The address of i is not in question; the outcome of the ?: is.
Perhaps a good counterexample is
int i = 3, &j = i;
j = ++ i;
Here j has a glvalue distinct from (but identical to) i. Is this well-defined, yet i = ++i is not? This represents a trivial transformation that a compiler could apply to any case.
1.9/15 should say
If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the prvalue of the same scalar object, the behavior is undefined.
In thinking about expressions like those mentioned, I find it useful to imagine a machine where memory has interlocks so that reading a memory location as part of a read-modify-write sequence will cause any attempted read or write, other than the concluding write of the sequence, to be stalled until the sequence completes. Such a machine would hardly be an absurd concept; indeed, such a design could simplify many multi-threaded code scenarios. On the other hand, an expression like "x=y++;" could fail on such a machine if 'x' and 'y' were references to the same variable, and the compiler's generated code did something like read-and-lock reg1=y; reg2=reg1+1; write x=reg1; write-and-unlock y=reg2. That would be a very reasonable code sequence on processors where writing a newly-computed value would impose a pipeline delay, but the write to x would lock up the processor if y were aliased to the same variable.