Chained compound assignments with C++17 sequencing are still undefined behaviour? - c++

Originally, I presented a more complicated example, this one was proposed by #n. 'pronouns' m. in a now-deleted answer. But the question became too long, see edit history if you are interested.
Does the following program have well-defined behaviour in C++17?
int main()
{
int a=5;
(a += 1) += a;
return a;
}
I believe this expression is well-defined and evaluated like this:
The right side a is evaluated to 5.
There are no side-effects of the right side.
The left side is evaluated to a reference to a, a += 1 is well-defined for sure.
The left-side side-effect is executed, making a==6.
The assignment is evaluated, adding 5 to the current value of a, making it 11.
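The order I have in mind can be written out as separate statements. This is only a sketch of that reading, not a claim about what any compiler emits; the function name chained_by_steps and the locals rhs and lhs are made up for illustration.

```cpp
// Hypothetical helper: the order described above, one step per statement.
int chained_by_steps() {
    int a = 5;
    const int rhs = a;    // right operand read first: 5
    int& lhs = (a += 1);  // left operand evaluated: a becomes 6, lhs refers to a
    lhs += rhs;           // outer compound assignment: 6 + 5
    return a;             // 11 under this reading
}
```

Written this way there is a single possible order, so the function returns 11 on any conforming compiler.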
The relevant sections of the standard:
[intro.execution]/8:
An expression X is said to be sequenced before an expression Y if
every value computation and every side effect associated with the
expression X is sequenced before every value computation and every
side effect associated with the expression Y.
[expr.ass]/1 (emphasis mine):
The assignment operator (=) and the compound assignment operators all
group right-to-left. All require a modifiable lvalue as their left
operand; their result is an lvalue referring to the left operand. The
result in all cases is a bit-field if the left operand is a bit-field.
In all cases, the assignment is sequenced after the value computation
of the right and left operands, and before the value computation of
the assignment expression. The right operand is sequenced before the
left operand. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.
The wording originally comes from the accepted paper P0145R3.
Now, I feel there is some ambiguity, even contradiction, in this second section.
The right operand is sequenced before the left operand.
Together with the definition of sequenced before, this strongly implies an ordering of the side-effects, yet the previous sentence:
In all cases, the assignment is sequenced after the value computation
of the right and left operands, and before the value computation of
the assignment expression
only explicitly sequences the assignment after the value computations of the operands, not after their side-effects, which would allow this behaviour:
The right side a is evaluated to 5.
The left side is evaluated to a reference of a, a += 1 is well-defined for sure.
The assignment is evaluated, adding 5 to the current value of a, making it 10.
The left-side side-effect is executed, making a==11 or maybe even 6 if the old value was used even for the side-effect.
But this ordering clearly violates the definition of sequenced before, since the side-effects of the left operand happened after the value computation of the right operand. Thus the left operand was not sequenced after the right operand, which violates the above-mentioned sentence. No, I done goofed. This is allowed behaviour, right? I.e. the assignment can interleave with the right-to-left evaluation. Or it can be done after both full evaluations.
Running the code, gcc outputs 12 and clang outputs 11. Furthermore, gcc warns:
<source>: In function 'int main()':
<source>:4:8: warning: operation on 'a' may be undefined [-Wsequence-point]
4 | (a += 1) += a;
| ~~~^~~~~
I am terrible at reading assembly, maybe someone can at least rewrite how gcc got to 12? (a += 1), a+=a works but that seems extra wrong.
Well, thinking more about it, the right side also does evaluate to a reference to a, not just to a value 5. So Gcc could still be right, in that case clang could be wrong.
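For comparison, here is a sketch of an ordering that would produce gcc's 12: the right operand stays a reference to a and is only read once the inner += has already run. This is my guess at the order, not something read off gcc's assembly, and the function name is hypothetical.

```cpp
// Hypothetical helper: the right operand is kept as a reference and read late.
int chained_with_late_read() {
    int a = 5;
    int& rhs = a;         // right operand kept as a reference, not yet read
    int& lhs = (a += 1);  // a becomes 6
    lhs += rhs;           // rhs is read only now: 6 + 6
    return a;             // 12 under this reading
}
```

The only difference from the 11 case is when the value of the right operand is read.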

In order to follow better what is actually performed, let's try to mimic the same with our own type and add some printouts:
#include <iostream>
class Number {
    int num = 0;
public:
    Number(int n): num(n) {}
    // must return a reference, otherwise the outer += would modify a temporary
    Number& operator+=(int i) {
        std::cout << "+=(int) for *this = " << num
                  << " and int = " << i << std::endl;
        num += i;
        return *this;
    }
    Number& operator+=(Number n) {
        std::cout << "+=(Number) for *this = " << num
                  << " and Number = " << n << std::endl;
        num += n.num;
        return *this;
    }
    operator int() const {
        return num;
    }
};
Then when we run:
Number a {5};
(a += 1) += a;
std::cout << "result: " << a << std::endl;
We get different results with gcc and clang (and without any warning!).
gcc:
+=(int) for *this = 5 and int = 1
+=(Number) for *this = 6 and Number = 6
result: 12
clang:
+=(int) for *this = 5 and int = 1
+=(Number) for *this = 6 and Number = 5
result: 11
This is the same result as for the ints in the question. It is not exactly the same story, since built-in assignment has its own sequencing rules whereas an overloaded operator is a function call, but the similarity is still interesting.
It seems that gcc keeps the right side as a reference and turns it into a value only on the call to +=, whereas clang turns the right side into a value first.
The next step is to add a copy constructor to our Number class, to follow exactly when the reference is turned into a value. With that in place, both clang and gcc call the copy constructor as the first operation, and the result is the same for both: 11.
It seems that gcc delays the reference-to-value conversion (both in the built-in assignment and with a user-defined type that has no user-defined copy constructor). Is this consistent with the sequencing defined by C++17? To me it looks like a gcc bug, at least for the built-in assignment in the question, since the conversion from reference to value sounds like part of the "value computation" that shall be sequenced before the assignment.
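One way to take the compiler's choice out of the picture is to make the copy explicit before the chained call. A minimal sketch, assuming a trimmed-down Number with a logging copy constructor; the events log, the value() accessor and run_with_explicit_snapshot are illustrative names that are not part of the class above.

```cpp
#include <string>
#include <vector>

std::vector<std::string> events;  // illustrative log of what ran, in order

class Number {
    int num = 0;
public:
    Number(int n) : num(n) {}
    Number(const Number& other) : num(other.num) {
        events.push_back("copy " + std::to_string(num));
    }
    Number& operator+=(int i) {
        events.push_back("+=(int) " + std::to_string(num) + " " + std::to_string(i));
        num += i;
        return *this;
    }
    Number& operator+=(const Number& n) {
        events.push_back("+=(Number) " + std::to_string(num) + " " + std::to_string(n.num));
        num += n.num;
        return *this;
    }
    int value() const { return num; }
};

int run_with_explicit_snapshot() {
    events.clear();
    Number a{5};
    Number rhs = a;   // the copy happens here, while a is still 5
    (a += 1) += rhs;  // the right-hand side no longer reads a
    return a.value();
}
```

With the snapshot taken in its own statement, the result is 11 regardless of how the operands of the outer += are ordered.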
As for the strange clang behavior reported in a previous version of the original post, where the result differs between an assert and printing:
#include <cassert>
#include <iostream>
constexpr int foo() {
    int res = 0;
    (res = 5) |= (res *= 2);
    return res;
}
int main() {
    std::cout << foo() << std::endl; // prints 5
    assert(foo() == 5); // fails in clang 11.0 - constexpr foo() is 10
                        // fixed in clang 11.x - correct value is 5
}
This is due to a bug in clang. The assert fails wrongly because clang uses the wrong evaluation order for this expression during constant evaluation at compile time. The value should be 5. The bug is already fixed in clang trunk.
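The expected value of 5 can be checked by writing the C++17 order (right operand first, then left, then the assignment) out by hand. A small sketch, assuming C++14 or later so that a constexpr function may modify locals; foo_by_steps is a made-up name.

```cpp
// Hypothetical helper: the C++17 operand order of (res = 5) |= (res *= 2),
// one step per statement.
constexpr int foo_by_steps() {
    int res = 0;
    res *= 2;             // right operand first: res stays 0
    const int rhs = res;  // value of the right operand: 0
    res = 5;              // then the left operand
    res |= rhs;           // 5 | 0
    return res;
}

static_assert(foo_by_steps() == 5, "right-to-left order yields 5");
```

Evaluating the left operand first instead would give res = 5, then res *= 2 making it 10, which matches the wrong value the assert reported.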

Related

Is this operation properly sequenced?

In the following code excerpt from a larger piece of code presented
void func(int* usedNum, int wher) {
*usedNum = *usedNum + 1 > wher ? ++(*usedNum) : wher + 1;
}
int main(void) {
int a = 11, b = 2;
func(&a, b);
}
a warning is emitted
warning: operation on '* usedNum' may be undefined [-Wsequence-point]
*usedNum = *usedNum + 1 > wher ? ++(*usedNum) : wher + 1;
Is there a problem with the code?
My source of doubt was this and the part where it says
The sequence points in the logical expressions such as && and || and ternary operator ?: and the comma operator mean that the left hand side operand is evaluated before the right hand side operand. These few operands are the only operands in C++ that introduce sequence points.
tl;dr
For those who find it torturous to read through the comments: the initial question was not properly posed and it would be unfair to create misconceptions. My view on the topic had two sides:
The ternary operator does not mess up (in an unexpected way) the sequence points (which holds, the two branches are sequenced in every version of C and C++ - see the link provided)
Is x = ++x the problem? As seen in the coliru link, we compile for C++14. There the operation is well defined (references in the comments), but older versions of C++ and C view this as undefined. So why is there a warning?
Answers focus on both C and C++; this is a good link. Lastly, the C tag was there initially (my bad) and can't be removed because existing upvoted answers refer to it.
When the condition is true, it is the equivalent of saying x = ++x. In C, and versions of C++ prior to C++11, this constitutes a modification and a read of x without an intervening sequence point and therefore is undefined behaviour if the truthy branch is followed. From C++11 onwards, x = ++x is sequenced and well defined.
Edit To clarify some issues from comments.
1) this would be well defined in all C and C++ standards:
x = (++x, x); // RHS evaluates to x after increment
because the expression in the parentheses involves the comma operator, which introduces a sequence point between the evaluation of its operands. So the whole expression on the RHS evaluates to x after an increment. But the code in your question does not involve the comma operator.
2) The ternary operator introduces a sequence point
It is a sequence point between the condition and the two branches. But this doesn't introduce a sequence point between either branch and the assignment.
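If the goal is simply to avoid the double modification (and the warning), the truthy branch can produce the incremented value without writing to *usedNum itself. A sketch of such a rewrite; the name func_single_write is hypothetical and this is only one way to do it.

```cpp
// Hypothetical rewrite of func from the question.
void func_single_write(int* usedNum, int wher) {
    // Both branches are plain values, so *usedNum is written exactly once,
    // by the assignment itself.
    *usedNum = (*usedNum + 1 > wher) ? *usedNum + 1 : wher + 1;
}
```

This stores the same value as the original in both branches and is well defined in every version of C and C++.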
The warning you are getting is probably due to the fact that you are compiling your code in c++03 mode or older. In C99 and C++03, expression
x = ++x;
invokes undefined behavior. The reason is that between two sequence points an object can't be modified more than once.
This rule is changed in C11 and C++11. According to C11, the rule is as follows:
C11:6.5 Expressions:
If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
When *usedNum + 1 > wher will be true, then
*usedNum = *usedNum + 1 > wher ? ++(*usedNum) : wher + 1;
would be equivalent to
*usedNum = ++(*usedNum);
and according to the new rule this is well defined in C++11, because the side effect of the pre-increment ++ is sequenced before the side effect of the = operator. Read this answer for a more detailed explanation.
But the same expression *usedNum = ++(*usedNum); invokes undefined behavior in C11. The reason is that there is no guarantee that side effect by = operator is sequenced after the side effect of pre ++ operator.
Note: In the expression
a = x++ ? x++ : 0;
there is sequence point after the first x++ and hence behavior is well defined. Same is true for
x = (++x, x);
because there is a sequence point between the evaluation of left and right operand and hence side effect is sequenced.
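The two well-defined expressions from the note can be traced with concrete values. A small sketch; the function names and the starting value 1 are chosen only for illustration.

```cpp
#include <utility>

int comma_case() {
    int x = 1;
    x = (++x, x);  // comma: ++x is complete before x is read
    return x;      // 2
}

// Returns {a, x} after the conditional expression.
std::pair<int, int> ternary_case() {
    int x = 1;
    int a = x++ ? x++ : 0;  // condition sees 1, x becomes 2; branch yields 2, x becomes 3
    return {a, x};
}
```

In the second function the object being assigned (a) is different from the one being incremented (x), which is why no modification is left unsequenced.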

Order of evaluation of assignment subexpressions

The C++11 standard (5.17, expr.ass) states that
In all cases, the assignment is sequenced after the value computation
of the right and left operands, and before the value computation of
the assignment expression. With respect to an
indeterminately-sequenced function call, the operation of a compound
assignment is a single evaluation
Does this mean, that the expression:
int a = 1, b = 10;
int c = (a+=1) + (b+=1);
if ( c == 10+1+1+1 ) {
printf("this is guaranteed");
} else {
printf("not guaranteed");
}
will always evaluate to c==13?
The expression
int c = (a+=1) + (b+=1);
(edit: added the missing brackets, I think this is what you intended)
has the following subexpressions
(1) a+=1
(2) b+=1
(3) (1)+(2)
(4) c = (3)
The order in which (1) and (2) are evaluated is unspecified, the compiler is free to choose any order it likes.
Both (1) and (2) must be evaluated before the compiler can evaluate (3).
(3) must be evaluated before the compiler can evaluate (4).
Now, as the order of evaluation of (1) and (2) does not matter, the overall result is well defined: your code will always yield 13 and print "this is guaranteed". Note that it has always been this way; this is not new with C++11.
This has always been guaranteed, and the sequenced before rules
(or the sequence point rules in pre-C++11) aren't needed to
determine this. In C++, each (sub-)expression has two important
effects in the generated code: it has a value (unless it is of
type void), and it may have side effects. The sequenced
before/sequence point rules affect when the side effects are
guaranteed to have taken place; they have no effect on the value
of the sub-expressions. In your case, for example, the value
of (a += 1) is the value a will have after the assignment,
regardless of when the actual assignment takes place.
In C++11, the actual modification of a is guaranteed to take
place before the modification of c; in pre C++11, there was no
guarantee concerning the order. In this case, however, there is
no way a conforming program could see this difference, so it
doesn't matter. (It would matter in cases like c = (c += 1),
which would be undefined behavior in pre-C++11.)
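As a quick check of the values involved, here is a sketch that returns all three variables after the expression has run. The Result struct and the function name are made up for the example.

```cpp
struct Result {
    int a;
    int b;
    int c;
};

Result sum_of_compound_assignments() {
    int a = 1, b = 10;
    // The value of each parenthesised sub-expression is the value its
    // variable has after the assignment, whichever of the two runs first.
    int c = (a += 1) + (b += 1);
    return Result{a, b, c};
}
```

Since a and b are distinct objects, either evaluation order gives a == 2, b == 11 and c == 13.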
In your example the compiler shall issue an error because the priority of the addition operator is higher than priority of the assignment operator. So at first 1 + b will be calculated and then there will be an attempt to assign 1 to expression ( 1 + b ) but ( 1 + b ) is not an lvalue.

Is this undefined behaviour and why?

Please, explain why this code is correct or why not:
In my opinion, the line ++*p1 = *p2++ has undefined behaviour, because p1 is dereferenced first and then incremented.
#include <iostream>
int main()
{
    char a[] = "Hello";
    char b[] = "World";
    char* p1 = a;
    char* p2 = b;
    //*++p1 = *p2++; // is this OK?
    ++*p1 = *p2++; // is this OK? Or this is UB?
    std::cout << a << "\n" << b;
    return 0;
}
The first is ok
*++p1 = *p2++ // p1++; *p1 = *p2; p2++;
the second is UB with C++ because you are modifying what is pointed by p1 twice (once because of increment and once because of assignment) and there are no sequence points separating the two side effects.
With C++0x rules things are different and more complex to explain and to understand. If you write intentionally expressions like the second one, if it's not for a code golf competition and if you are working for me then consider yourself fired (even if that is legal in C++0x).
I don't know if it is legal in C++0x and I don't want to know. I've too few neurons to waste them this way.
In modern C++ (at least C++ 2011 and later) neither is undefined behavior. Nor is either of them implementation-defined or unspecified. (All three terms are different things.)
These two lines are both well defined (but they do different things).
When you have pointers p1 and p2 to scalar types then
*++p1 = *p2++;
is equivalent to
p1 = p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^this is also true for C++ 1998/2003)
and
++*p1 = *p2++;
is equivalent to
*p1 = *p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^maybe also in C++ 1998/2003 or maybe not - as explained below)
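Applying the two expansions to the "Hello"/"World" arrays from the question shows what each line ends up doing. This sketch runs only the expanded, statement-by-statement forms, so it is well defined under every standard; the function names are hypothetical.

```cpp
#include <string>

// Expanded form of: *++p1 = *p2++;
std::string first_form() {
    char a[] = "Hello";
    char b[] = "World";
    char* p1 = a;
    char* p2 = b;
    p1 = p1 + 1;  // p1 now points at 'e'
    *p1 = *p2;    // 'e' is replaced by 'W'
    p2 = p2 + 1;
    return a;     // "HWllo"
}

// Expanded form of: ++*p1 = *p2++;
std::string second_form() {
    char a[] = "Hello";
    char b[] = "World";
    char* p1 = a;
    char* p2 = b;
    *p1 = *p1 + 1;  // 'H' becomes 'I'
    *p1 = *p2;      // and is immediately overwritten with 'W'
    p2 = p2 + 1;
    return a;       // "Wello"
}
```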
Obviously in case 2 incrementing value and then assigning to it (thus overwriting just incremented value) is pointless - but there may be similar examples that make sense (e.g. += instead of =).
BUT like many people point out - just don't write the code that looks ambiguous or unreasonably complex. Write the code that is clear to you and supposed to be clear to the readers.
Old C++ 1998/2003 case for second expression is a strange matter:
At first after reading the description of prefix increment operator:
ISO/IEC 14882-2003 5.3.2:
The operand of prefix ++ is modified by adding 1, or set to true if it
is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely-defined object type. The value is the new
value of the operand; it is an lvalue. If x is not of type bool, the
expression ++x is equivalent to x+=1.
I personally have a strong feeling that everything is perfectly defined and obvious and the same as above for C++ 2011 and later.
At least in the sense that every reasonable C++ implementation will behave in exact same well defined way (including old ones).
Why it should be otherwise if we always intuitively rely on a general rule that in any simple operator evaluation within a complex expression we evaluate its operands first and after that apply the operator to the values of those operands. Right? Breaking this intuitive expectation would be extremely stupid for any programming language.
So for the full expression ++*p1 = *p2++; we have operands: 1 - ++*p1 evaluated as already incremented lvalue (as defined in the above quote from C++ 2003) and 2 - *p2++ that is an rvalue stored at pointer p2 before its increment. It doesn't look ambiguous at all. Of course in this case - no reason to increment a value you are overwriting anyway BUT if there was double increment instead - ++(++*p1); OR other kind of assignment like +=/-=/&=/*=/etc instead of simple assignment THAT would not be unreasonable at all.
Unfortunately all the intuition and logic is messed up by this:
ISO/IEC 14882-2003 - 5 Expressions:
Except where noted, the order of evaluation of operands
of individual operators and subexpressions of individual
expressions, and the order in which side effects
take place, is unspecified. Between the previous
and next sequence point a scalar object shall have its
stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each
allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Example:
i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
—end example]
So this wording, if interpreted in a paranoid way, seems to imply that modifying a value stored in a specific location more than once without an intervening sequence point is explicitly forbidden by this rule, and the last sentence declares that failing to comply with every requirement is Undefined Behavior. AND our expression seems to modify the same location more than once (?) with no sequence point until the full expression is evaluated. (This arbitrary and unreasonable limitation is reinforced further by example 3 - i = ++i + 1; though it says // the behavior is unspecified - not undefined as in the wording before - which only adds more confusion.)
BUT on the other hand... If we ignore example 3. (Maybe i = ++i + 1; is a typo and there should have been a postfix increment instead - i = i++ + 1;? Who knows... Anyway, examples are not part of the formal specification.) If we interpret this wording in the most permissive way, we can see that in each allowed order of evaluation of the subexpressions of the whole expression, the preincrement ++*p1 must be evaluated to an LVALUE (which is something that allows further modification) BEFORE the assignment operator is applied, so the only valid final value at that location is the one stored by the assignment operator. ALSO NOTE that a conforming C++ implementation has no obligation to actually modify that location more than once and may instead store only the final result - that is both a reasonable optimization allowed by the standard and may be the actual demand of this article.
Which one of those interpretations is correct? Paranoid or permissive? Universally applicable logic or some suspicious and ambiguous words in a document almost nobody really ever read? Blue pill or Red pill?
Who knows... It looks like a gray area that requires less ambiguous explanation.
If we interpret the quote from C++ 2003 standard above in a paranoid way then it looks like this code may be Undefined Behavior:
#include <iostream>
#define INC(x) (++(x))
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
while this code is perfectly legitimate and well defined:
#include <iostream>
template<class T> T& INC(T& x) // sequence point after evaluation of the arguments
{ // and before execution of the function body
return ++x;
}
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
Really?
All this looks very much like a defect of the old C++ standard.
Fortunately this has been addressed in newer C++ standards (starting with C++ 2011) as there is no such concept as sequence point anymore. Instead there is a relation - something sequenced before something. And of course the natural guarantee that evaluation of the argument expressions of any operator is sequenced before evaluation of the result of the operator is there.
ISO/IEC 14882-2011 - 1.9 Program execution
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces
a partial order among those evaluations. Given any two evaluations A
and B, if A is sequenced before B, then the execution of A shall
precede the execution of B. If A is not sequenced before B and B is
not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. — end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. — end note ]
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated.
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced. [
Note: In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed consistently
in different evaluations. — end note ] The value computations of the
operands of an operator are sequenced before the value computation of
the result of the operator. If a side effect on a scalar object is
unsequenced relative to either another side effect on the same scalar
object or a value computation using the value of the same scalar
object, the behavior is undefined.
[ Example:
void f(int, int);
void g(int i, int* v) {
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
i = i + 1; // the value of i is incremented
f(i = -1, i = -1); // the behavior is undefined
}
— end example ]
(Also NOTE how C++ 2003 prefix increment example i = ++i + 1; is replaced by postfix increment example i = i++ + 1; in this C++ 2011 quote. :) )

Is this code behavior defined?

What does the following code print to the console?
map<int,int> m;
m[0] = m.size();
printf("%d", m[0]);
Possible answers:
The behavior of the code is not defined since it is not defined which statement m[0] or m.size() is being executed first by the compiler. So it could print 1 as well as 0.
It prints 0 because the right hand side of the assignment operator is executed first.
It prints 1 because the operator[] has the highest priority of the complete statement m[0] = m.size(). Because of this the following sequence of events occurs:
m[0] creates a new element in the map
m.size() gets called which is now 1
m[0] gets assigned the previously returned (by m.size()) 1
The real answer?, which is unknown to me^^
I believe it's unspecified whether 0 or 1 is stored in m[0], but it's not undefined behavior.
The LHS and the RHS can occur in either order, but they're both function calls, so they both have a sequence point at the start and end. There's no danger of the two of them, collectively, accessing the same object without an intervening sequence point.
The assignment is actual int assignment, not a function call with associated sequence points, since operator[] returns T&. That's briefly worrying, but it's not modifying an object that is accessed anywhere else in this statement, so that's safe too. It's accessed within operator[], of course, where it is initialized, but that occurs before the sequence point on return from operator[], so that's OK. If it wasn't, m[0] = 0; would be undefined too!
However, the order of evaluation of the operands of operator= is not specified by the standard, so the actual result of the call to size() might be 0 or 1 depending which order occurs.
The following would be undefined behavior, though. It doesn't make function calls and so there's nothing to prevent size being accessed (on the RHS) and modified (on the LHS) without an intervening sequence point:
int values[1];
int size = 0;
(++size, values[0] = 0) = size;
/*      fake m[0]      */ /* fake m.size() */
It does print 1, and without raising a warning(!) using gcc. It should raise a warning because it is undefined.
The precedence class of both operator[] and operator. is 2 whereas the precedence class of operator= is 16.
This means that it is well-defined that m[0] and m.size() will be executed before the assignment. However, it is not defined which one executes first.
There is no sequence point between the call to operator [] and the call to size in this statement. Consequently, the behaviour should be undefined.
Given that C++17 is pretty much here, I think it's worth mentioning that this code now exhibits well defined behavior under the new standard. For this case of = being the built-in assignment to an integer:
[expr.ass]/1:
The assignment operator (=) and the compound assignment operators all
group right-to-left. All require a modifiable lvalue as their left
operand and return an lvalue referring to the left operand. The result
in all cases is a bit-field if the left operand is a bit-field. In all
cases, the assignment is sequenced after the value computation of the
right and left operands, and before the value computation of the
assignment expression. The right operand is sequenced before the left
operand. With respect to an indeterminately-sequenced function call,
the operation of a compound assignment is a single evaluation.
Which leaves us with only one option, and that is #2.
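For code that has to behave the same before and after C++17, the order can be made explicit by splitting the statement. A minimal sketch; the two function names are made up, and each picks one of the two orders discussed above.

```cpp
#include <cstddef>
#include <map>

// Right-hand side first: size() is read before the element exists.
int size_read_first() {
    std::map<int, int> m;
    const std::size_t n = m.size();  // 0
    m[0] = static_cast<int>(n);
    return m[0];                     // 0
}

// Left-hand side first: operator[] creates the element, then size() is read.
int element_created_first() {
    std::map<int, int> m;
    int& slot = m[0];                   // element 0 now exists
    slot = static_cast<int>(m.size());  // 1
    return m[0];
}
```

The first function matches the C++17 answer (#2); the second reproduces the 1 that some compilers printed under the older rules.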

Is it legal to use the increment operator in a C++ function call?

There's been some debate going on in this question about whether the following code is legal C++:
std::list<item*>::iterator i = items.begin();
while (i != items.end())
{
bool isActive = (*i)->update();
if (!isActive)
{
items.erase(i++); // *** Is this undefined behavior? ***
}
else
{
other_code_involving(*i);
++i;
}
}
The problem here is that erase() will invalidate the iterator in question. If that happens before i++ is evaluated, then incrementing i like that is technically undefined behavior, even if it appears to work with a particular compiler. One side of the debate says that all function arguments are fully evaluated before the function is called. The other side says, "the only guarantees are that i++ will happen before the next statement and after i++ is used. Whether that is before erase(i++) is invoked or afterwards is compiler dependent."
I opened this question to hopefully settle that debate.
Quoth the C++ standard 1.9.16:
When calling a function (whether or
not the function is inline), every
value computation and side effect
associated with any argument
expression, or with the postfix
expression designating the called
function, is sequenced before
execution of every expression or
statement in the body of the called
function. (Note: Value computations
and side effects associated with the
different argument expressions are
unsequenced.)
So it would seem to me that this code:
foo(i++);
is perfectly legal. It will increment i and then call foo with the previous value of i. However, this code:
foo(i++, i++);
yields undefined behavior because paragraph 1.9.16 also says:
If a side effect on a scalar object is
unsequenced relative to either another
side effect on the same scalar object
or a value computation using the value
of the same scalar object, the
behavior is undefined.
To build on Kristo's answer,
foo(i++, i++);
yields undefined behavior because the order in which function arguments are evaluated is unspecified (and, in the more general case, because if you read a variable twice in an expression where you also write it, the result is undefined). You don't know which argument will be incremented first.
int i = 1;
foo(i++, i++);
might result in a function call of
foo(2, 1);
or
foo(1, 2);
or even
foo(1, 1);
Run the following to see what happens on your platform:
#include <iostream>
using namespace std;
void foo(int a, int b)
{
cout << "a: " << a << endl;
cout << "b: " << b << endl;
}
int main()
{
int i = 1;
foo(i++, i++);
}
On my machine I get
$ ./a.out
a: 2
b: 1
every time, but this code is not portable, so I would expect to see different results with different compilers.
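If you need a particular order, a portable alternative is to evaluate each argument in its own statement before the call. A sketch under that assumption; the received variable and the function call_with_named_arguments are hypothetical and exist only to record what foo was given.

```cpp
#include <utility>

std::pair<int, int> received;  // hypothetical record of foo's arguments

void foo(int a, int b) { received = {a, b}; }

int call_with_named_arguments() {
    int i = 1;
    const int first = i++;   // 1, i becomes 2
    const int second = i++;  // 2, i becomes 3
    foo(first, second);      // always foo(1, 2)
    return i;                // 3
}
```

Each i++ is now a full expression of its own, so nothing is left unsequenced.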
The standard says the side effect happens before the call, so the code is the same as:
std::list<item*>::iterator i_before = i;
++i; // list iterators are bidirectional, so there is no i_before + 1
items.erase(i_before);
rather than being:
std::list<item*>::iterator i_before = i;
items.erase(i);
i = ++i_before; // increments an iterator that erase has already invalidated
So it is safe in this case, because list.erase() specifically doesn't invalidate any iterators other than the one erased.
That said, it's bad style: the erase function of every sequence container returns the next iterator precisely so that you don't have to worry about iterators invalidated by reallocation, so the idiomatic code:
i = items.erase(i);
will be safe for lists, and will also be safe for vectors, deques and any other sequence container should you want to change your storage.
You also wouldn't get the original code to compile without warnings - you'd have to write
(void)items.erase(i++);
to avoid a warning about an unused return, which would be a big clue that you're doing something odd.
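The idiomatic form fits into the loop from the question like this. A minimal sketch, assuming a std::list<int> and a stand-in condition (even values are treated as inactive) in place of item::update(); erase_even is a made-up name.

```cpp
#include <list>

// Hypothetical stand-in for the loop in the question: drops even values.
void erase_even(std::list<int>& items) {
    std::list<int>::iterator i = items.begin();
    while (i != items.end()) {
        if (*i % 2 == 0) {
            i = items.erase(i);  // erase returns the iterator after the erased element
        } else {
            ++i;
        }
    }
}
```

The same loop body works unchanged if the container is later switched to a std::vector or std::deque.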
It's perfectly OK.
The value passed would be the value of "i" before the increment.
++Kristo!
The C++ standard 1.9.16 makes a lot of sense with respect to how one implements operator++(postfix) for a class. When that operator++(int) method is called, it increments itself and returns a copy of the original value. Exactly as the C++ spec says.
It's nice to see standards improving!
However, I distinctly remember using older (pre-ANSI) C compilers wherein:
foo -> bar(i++) -> charlie(i++);
Did not do what you think! Instead it compiled equivalent to:
foo -> bar(i) -> charlie(i); ++i; ++i;
And this behavior was compiler-implementation dependent. (Making porting fun.)
It's easy enough to test and verify that modern compilers now behave correctly:
#include <iostream>
using namespace std;
#define SHOW(S,X) cout << S << ": " # X " = " << (X) << endl
struct Foo
{
    Foo & bar(const char * theString, int theI)
    { SHOW(theString, theI); return *this; }
};
int
main()
{
    Foo f;
    int i = 0;
    f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
    SHOW("END ",i);
}
Responding to comment in thread...
...And building on pretty much EVERYONE's answers... (Thanks guys!)
I think we need to spell this out a bit better:
Given:
baz(g(),h());
Then we don't know whether g() will be invoked before or after h(). It is "unspecified".
But we do know that both g() and h() will be invoked before baz().
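Both guarantees can be observed by logging the calls. A small sketch; the calls log and the bodies of g, h and baz are invented for the demonstration.

```cpp
#include <string>
#include <vector>

std::vector<std::string> calls;  // illustrative call log

int g() { calls.push_back("g"); return 1; }
int h() { calls.push_back("h"); return 2; }
int baz(int x, int y) { calls.push_back("baz"); return x + y; }

int run_baz() {
    calls.clear();
    return baz(g(), h());  // g and h may run in either order, both before baz
}
```

The first two log entries may be "g", "h" or "h", "g" depending on the compiler, but "baz" is always last.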
Given:
bar(i++,i++);
Again, we don't know which i++ will be evaluated first, and perhaps not even whether i will be incremented once or twice before bar() is called. The results are undefined! (Given i=0, this could be bar(0,0) or bar(1,0) or bar(0,1) or something really weird!)
Given:
foo(i++);
We now know that i will be incremented before foo() is invoked. As Kristo pointed out from the C++ standard section 1.9.16:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. -- end note ]
Though I think section 5.2.6 says it better:
The value of a postfix ++ expression is the value of its operand. [ Note: the value obtained is a copy of the original value -- end note ] The operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic type or a pointer to a complete effective object type. The value of the operand object is modified by adding 1 to it, unless the object is of type bool, in which case it is set to true. [ Note: this use is deprecated, see Annex D. -- end note ] The value computation of the ++ expression is sequenced before the modification of the operand object. With respect to an indeterminately-sequenced function call, the operation of postfix ++ is a single evaluation. [ Note: Therefore, a function call shall not intervene between the lvalue-to-rvalue conversion and the side effect associated with any single postfix ++ operator. -- end note ] The result is an rvalue. The type of the result is the cv-unqualified version of the type of the operand. See also 5.7 and 5.17.
The standard, in section 1.9.16, also lists (as part of its examples):
i = 7, i++, i++; // i becomes 9 (valid)
f(i = -1, i = -1); // the behavior is undefined
And we can trivially demonstrate this with:
#include <iostream>
using namespace std;
#define SHOW(X) cout << # X " = " << (X) << endl
int i = 0; /* Yes, it's global! */
void foo(int theI) { SHOW(theI); SHOW(i); }
int main() { foo(i++); }
This prints theI = 0 followed by i = 1. So, yes, i is incremented before foo() is invoked.
All this makes a lot of sense from the perspective of:
class Foo
{
public:
Foo operator++(int) {...} /* Postfix variant */
};
int main() { Foo f; delta( f++ ); }
Here Foo::operator++(int) must be invoked prior to delta(). And the increment operation must be completed during that invocation.
In my (perhaps overly complex) example:
f . bar("A",i) . bar("B",i++) . bar("C",i) . bar("D",i);
f.bar("A",i) must be executed to obtain the object used for object.bar("B",i++), and so on for "C" and "D".
So we know that i++ increments i prior to calling bar("B",i++) (even though bar("B",...) is invoked with the old value of i), and therefore i is incremented prior to bar("C",i) and bar("D",i).
Getting back to j_random_hacker's comment:
j_random_hacker writes: +1, but I had to read the standard carefully to convince myself that this was OK. Am I right in thinking that, if bar() was instead a global function returning say int, f was an int, and those invocations were connected by say "^" instead of ".", then any of A, C and D could report "0"?
This question is a lot more complicated than you might think...
Rewriting your question as code...
int bar(const char * theString, int theI) { SHOW(...); return i; }
bar("A",i) ^ bar("B",i++) ^ bar("C",i) ^ bar("D",i);
Now we have only ONE expression. According to the standard (section 1.9, page 8, pdf page 20):
Note: operators can be regrouped according to the usual mathematical rules only where the operators really are associative or commutative.(7) For example, in the following fragment: a=a+32760+b+5; the expression statement behaves exactly the same as: a=(((a+32760)+b)+5); due to the associativity and precedence of these operators. Thus, the result of the sum (a+32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an exception and in which the range of values representable by an int is [-32768,+32767], the implementation cannot rewrite this expression as a=((a+b)+32765); since if the values for a and b were, respectively, -32754 and -15, the sum a+b would produce an exception while the original expression would not; nor can the expression be rewritten either as a=((a+32765)+b); or a=(a+(b+32765)); since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However on a machine in which overflows do not produce an exception and in which the results of overflows are reversible, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur. -- end note ]
So we might think that, due to precedence, our expression would be the same as:
(
(
( bar("A",i) ^ bar("B",i++)
)
^ bar("C",i)
)
^ bar("D",i)
);
One might object that, because (a^b)^c==a^(b^c) without any possible overflow situations, it could be rewritten in any order...
However, because bar() is being invoked, and could hypothetically involve side effects, this expression cannot be rewritten in just any order. Rules of precedence still apply.
Which nicely determines the order of evaluation of the bar()'s.
Now, when does that i+=1 occur? Well it still has to occur before bar("B",...) is invoked. (Even though bar("B",....) is invoked with the old value.)
So it's deterministically occurring before bar(C) and bar(D), and after bar(A).
Answer: NO. We will always have "A=0, B=0, C=1, D=1", if the compiler is standards-compliant.
But consider another problem:
i = 0;
int & j = i;
R = i ^ i++ ^ j;
What is the value of R?
If the i+=1 occurred before j, we'd have 0^0^1=1. But if the i+=1 occurred after the whole expression, we'd have 0^0^0=0.
In fact, R is zero. The i+=1 does not occur until after the expression has been evaluated.
Which I reckon is why:
i = 7, i++, i++; // i becomes 9 (valid)
Is legal... It has three expressions:
i = 7
i++
i++
And in each case, the value of i is changed at the conclusion of each expression. (Before any subsequent expressions are evaluated.)
PS: Consider:
int foo(int theI) { SHOW(theI); SHOW(i); return theI; }
i = 0;
int & j = i;
R = i ^ i++ ^ foo(j);
In this case, i+=1 has to be evaluated before foo(j). theI is 1. And R is 0^0^1=1.
To build on MarkusQ's answer: ;)
Or rather, Bill's comment to it:
(Edit: Aw, the comment is gone again... Oh well)
They're allowed to be evaluated in parallel. Whether or not it happens in practice is technically speaking irrelevant.
You don't need thread parallelism for this to occur though, just evaluate the first step of both (take the value of i) before the second (increment i). Perfectly legal, and some compilers may consider it more efficient than fully evaluating one i++ before starting on the second.
In fact, I'd expect it to be a common optimization. Look at it from an instruction scheduling point of view. You have the following you need to evaluate:
Take the value of i for the right argument
Increment i in the right argument
Take the value of i for the left argument
Increment i in the left argument
But there's really no dependency between the left and the right argument. Argument evaluation happens in an unspecified order, and need not be done sequentially either (which is why a raw new expression in a function argument can leak memory if another argument throws, even when its result is passed straight to a smart pointer constructor).
It's also undefined what happens when you modify the same variable twice in the same expression.
We do have a dependency between 1 and 2, however, and between 3 and 4.
So why would the compiler wait for 2 to complete before computing 3? That introduces added latency, and it'll take even longer than necessary before 4 becomes available.
Assuming there's a 1 cycle latency between each, it'll take 3 cycles from the point where 1 is complete until the result of 4 is ready and we can call the function.
But if we reorder them and evaluate in the order 1, 3, 2, 4, we can do it in 2 cycles. 1 and 3 can be started in the same cycle (or even merged into one instruction, since it's the same expression), and in the following, 2 and 4 can be evaluated.
Most modern CPUs can execute several instructions per cycle, and a good compiler should try to exploit that.
Sutter's Guru of the Week #55 (and the corresponding piece in "More Exceptional C++") discusses this exact case as an example.
According to him, it is perfectly valid code, and in fact a case where trying to transform the statement into two lines:
items.erase(i);
i++;
does not produce code that is semantically equivalent to the original statement.
To build on Bill the Lizard's answer:
int i = 1;
foo(i++, i++);
might also result in a function call of
foo(1, 1);
(meaning that the actuals are evaluated in parallel, and then the postops are applied).
-- MarkusQ