Is i += ++i undefined behavior in C++0x?

I'm convinced by the explanation I found that i = ++i is not undefined as far as C++0x is concerned, but I'm unable to judge whether the behavior of i += ++i is well-defined or not. Any takers?

The reasoning that makes i = ++i well-defined can equally be used to prove that i += ++i must also be well-defined.
i += ++i is equivalent to i += (i += 1) and the new sequencing rules require that the assignment takes place before the value-computation of the i += 1 sub-expression.
This means that the result of the expression i += ++i must be the same as for i = 2 * i + 1.
Edit: I have to revise my answer, because the behaviour is undefined after all.
The behaviour of i += ++i is undefined, because the value-computations of the sub-expressions i (left-hand side argument) and ++i are unsequenced in relation to each other and one of them contains an update of the object i.
This is not a problem for the expression i = ++i, because there the i on the left-hand side does not undergo an lvalue-to-rvalue conversion, which does happen in the i += ++i case.
On a side-note: Don't write such code in any serious project. It relies too much on exactly knowing the sequencing rules and there will be many people who either don't properly understand the sequencing rules, are unaware of the change in the rules that is the result of DR 637 or get tripped up by missing some important aspects of the expression in question (as happened to me when composing the first revision of this answer).
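To make the ambiguity concrete, here is a minimal sketch that spells out the two possible evaluation orders of i += ++i as separate, fully sequenced statements. The function names are hypothetical and only serve the illustration; each function is well-defined on its own, yet the two disagree on the result.

```cpp
// Hypothetical helpers: each one fixes a single possible ordering of
// `i += ++i` by splitting it into separate statements.

// Ordering A: the left-hand i is read first, then the increment happens.
int lhs_read_first(int i) {
    int lhs = i;      // lvalue-to-rvalue conversion of the left operand
    ++i;              // side effect (and value) of ++i
    i = lhs + i;      // old value + incremented value
    return i;         // 2 * old + 1
}

// Ordering B: the increment happens first, then the left-hand i is read.
int increment_first(int i) {
    ++i;              // side effect (and value) of ++i
    i = i + i;        // incremented value + incremented value
    return i;         // 2 * (old + 1)
}
```

Starting from 3, lhs_read_first yields 7 and increment_first yields 8. Writing whichever of the two is actually intended avoids the question altogether.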

Related

Why is "i++ + 1" itself not undefined? What ensures that the postfix's side-effect occurs after the computation of +?

I know this question is often asked in the form "i = i++ + 1", in which i appears twice, but my question differs in that it is specifically ONLY about the right-hand side of this expression, the definedness of which is not obvious to me. I am only referring to:
i++ + 1;
cppreference.com states here that:
2) The value computations (but not the side-effects) of the operands to any operator are sequenced before the value computation of the result of the operator (but not its side-effects).
I understand this to mean that the value computation is sequenced but no statement is made about the side-effect.
[...]
4) The value computation of the built-in post-increment and post-decrement operators is sequenced before its side-effect.
It does not, however, specify that the side-effect of (in this case) the left operand is sequenced in relation to the value computation of the expression.
It further states:
If a side effect on a scalar object is unsequenced relative to a value computation using the value of the same scalar object, the behavior is undefined.
Is this not the case here? The post-inc-operator's side effect on i is unsequenced relative to the value computation of the addition operator, which uses the same i.
Why is this expression not usually said to be undefined?
Is it because the addition operator is thought to incur a function call for which stricter sequencing guarantees are given?
What ensures that the postfix's side-effect occurs after the computation of +?
There is no such assurance. The postfix's side effect may occur either before or after the value computation of + .
The post-inc-operator's side effect on i is unsequenced relative to the value computation of the addition operator, which uses the same i.
No, the value computation of the addition operator uses the result of value computation of its operands. The operands of + are i++ (not i), and 1. As you covered in the question, the read of i is sequenced-before the value computation of i++, and therefore (transitivity) sequenced before the value computation of +.
The following things are guaranteed to happen in the following order:
Read of i.
Value computation of ++ (operand: result-of-step-1)
Value computation of + (operands result-of-step-2 and 1)
And the side-effect of i++ must occur after step 1, but it could happen at any point subject to that constraint.
i++ + 1 is not undefined on account of the use of the postfix operator because it perpetrates only one side effect on one object, and that object's value is only referenced in that place. The i++ expression unambiguously produces the prior value of i, and that value is what is added to 1, no matter when i is actually updated.
(We don't know that i++ + 1 is well-defined, because things can go wrong for various other reasons: i being uninitialized or otherwise indeterminate or invalid, or numeric overflow or pointer overrun being perpetrated.)
Undefined behavior occurs if in the same evaluation phase we try to modify the same object twice: i++ + i++. This can be convoluted with pointers, because (*p)++ + (*q)++ increments the same object twice only if p and q point to the same location; otherwise it is fine.
Undefined behavior also occurs if in the same evaluation phase, we try to observe the value of an object that is modified elsewhere in the expression, like i++ + i. The right hand side of the + accesses i, but that is not sequenced with regard to the side effect of i++ on the left; the + operator doesn't impose a sequence point. In i++ + 1, the 1 doesn't try to access i, needless to say.
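One way to sidestep the aliasing hazard of (*p)++ + (*q)++ is to do the reads and the increments in separate statements. The sketch below uses a made-up helper name and assumes both pointers are valid; it stays well-defined whether or not p and q point to the same object.

```cpp
// Hypothetical helper: returns *p + *q using the values held before any
// increment, then increments each pointee. Every write sits in its own
// full expression, so nothing is unsequenced even when p == q.
int sum_then_increment(int* p, int* q) {
    int sum = *p + *q;  // two reads, no side effects
    ++*p;               // first increment
    ++*q;               // second increment, sequenced after the first
    return sum;
}
```

With two distinct objects holding 1 and 2 this returns 3 and leaves them at 2 and 3. With both pointers aimed at a single object holding 5 it returns 10 and leaves the object at 7.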
Here's what happens when i++ + 1 is evaluated:
The subexpression i++ is evaluated. It yields the previous value of i.
Evaluating i++ also has the side effect of incrementing the stored value of i -- but note that that incremented value is not used.
The subexpression 1 is evaluated, yielding the obvious value.
The + operator is evaluated, yielding the result of i++ plus the result of 1. This can happen only after the values of the left and right subexpressions are determined (but it can consistently happen before or after the side effect occurs).
The side effect of the ++ operator is only guaranteed to happen some time before the next sequence point. (That's in C99 terms. The C11 standard presents the same rules in a different way.) But since nothing else in the expression depends on that side effect, it doesn't matter when it occurs. There is no conflict, so there's no undefined behavior.
In i++ + i, the evaluation of i on the RHS will yield different results depending on whether the side effect has happened yet or not. And since the ordering is undefined, the standard throws up its hands and says the behavior is undefined. But in i++ + 1, that problem doesn't occur.
"What ensures that the postfix's side-effect occurs after the computation of +?"
Nothing makes that specific guarantee. You must act as if you're using the original value of i, and at some point it needs to perform the side-effect, but as long as everything behaves properly, it doesn't matter how the compiler implements this or in what order. It can (and for certain scenarios, would) implement it as roughly equivalent to either:
auto tmp = i;
i = tmp + 1; // Could be done here, or after the next expression, doesn't matter since i isn't read again
tmp + 1; // produces actual value of i++ + 1
or
auto tmp = i + 1;
i = tmp; // Could be done here, or after the next expression, doesn't matter since tmp isn't changed again
(tmp - 1) + 1; // produces actual value of i++ + 1
or (for primitives or inlined operator overloads where it has enough information) optimize the expression to just:
++i; // Usually the same as i++ + 1 if compiler has enough knowledge
because postfix increment followed by adding one could be treated as prefix increment without adding one after.
Point is, it's up to the compiler to ensure the side-effect occurs sometime, which might be before or after the computation of +; the compiler just needs to make sure it has stored, or can recover, the original value of i.
The various contortions here might seem pointless (clearly ++i is the best if you can swing it, and i + 1; followed by ++i is simplest otherwise), but they're often necessary to work with the hardware atomics on a given architecture; if the architecture offers a fetch_then_add instruction, you'd want to implement it as:
auto tmp = fetch_then_add(i, 1); // Returns original value of i, while atomically adding 1
tmp + 1;
but if it only offers an add_then_fetch instruction, you'd want:
auto tmp = add_then_fetch(i, 1); // Returns incremented value of i
(tmp - 1) + 1;
As with many things, the C++ standard doesn't impose a preferred order because real hardware doesn't always cooperate; if it gets the job done and behaves as documented, it doesn't really matter what order it used.
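In standard C++ the fetch-then-add flavour is exposed as std::atomic<T>::fetch_add, which returns the value held before the addition. A minimal sketch of i++ + 1 on an atomic counter follows; the function name is made up for illustration.

```cpp
#include <atomic>

// Hypothetical helper: computes the value of `i++ + 1` for an atomic i.
int post_increment_plus_one(std::atomic<int>& i) {
    int old = i.fetch_add(1);  // atomically adds 1, returns the prior value
    return old + 1;            // the prior value is what postfix ++ yields
}
```

Called on a counter holding 5, this returns 6 and leaves the counter at 6.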

Preincrement vs postincrement in terms of sequence points

In this answer there're some examples of well-defined and undefined expressions. I'm particularly interested in two of them:
(6) i = i++ + 1; // Undefined Behaviour
(7) i = ++i + 1; // Well-defined Behaviour
This means that there's a difference between pre-increment and post-increment in terms of sequence points and well defined /unspecified/undefined behavior, but I don't understand where this difference comes from.
In standard draft (N4618) there's an example of code ([intro.execution], pt 18)
i = i++ + 1; // the value of i is incremented
i = i++ + i; // the behavior is undefined
Which, as far as I understand, means that the expression i = i++ + 1 should be well-defined and the value of the variable i should increase by 1 as a result of this expression. However, this code, when run in MSVS 2015, increases i by 2.
So, what happens in the expression i = i++ + 1? Is it well-defined, undefined, implementation-defined or unspecified behavior? And is there any difference between pre-increment and post-increment in this and similar expressions in terms of sequence points and UB, as stated in the original answer? And why does Visual Studio show behavior that differs from what is written in the standard?
Please also note that I'm primarily interested in modern c++ (14/17).
What happens in the expression i = i++ + 1? Is it well-defined, undefined, implementation defined or unspecified behaviour?
This exact example is given in the standard, how lucky are we?
N4296 1.9.15 [intro.execution]
i = i++ + 1; // the behavior is undefined
Of course, we'd like to know why too. The following standard quote appears to be relevant here:
N4296 1.9.15 [intro.execution]
[ ... ] The value computations of the operands of an operator are sequenced
before the value computation of the result of the operator. [ ... ]
This tells us that the sum will occur before the assignment (duh, how else does it know what to assign!), but it doesn't guarantee that the increment will occur before or after the assignment, now we're in murky water...
N4296 1.9.15 [intro.execution]
[ ... ] If a side effect on a scalar object is unsequenced relative to either
another side effect on the same scalar object or a value computation
using the value of the same scalar object, and they are not
potentially concurrent (1.10), the behavior is undefined. [ ... ]
The assignment operator has a side effect on the value of i, which means we have two side effects (the other is the assignment performed by i++) on the same scalar object, which are unsequenced, which is undefined.
Why does Visual Studio show the behavior which is different from written in standard?
It doesn't. The standard says it's undefined, which means it can do anything from what you wanted to something completely different, it just so happens that this is the behaviour that got spat out by the compiler!
i = ++i + 1; // Well-defined Behaviour
This means that there's a difference between preincrement and postincrement in terms of sequence points and well defined / unspecified / undefined behavior, but I don't understand where this difference comes from.
I believe that post is incorrect in several ways. Quoting the same section as they do, emphasis mine:
C++11 1.9/15
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
Then the assignment operator:
C++11 5.17
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Notably, the value computations of the right and left operands are not sequenced relative to each other. (This has been explicitly spelled out in C11 1), which otherwise has completely identical text to C++11.)
Meaning that in the expression i = ++i + 1; the side effect of ++i is unsequenced in relation to the value computation of the left operand i. Thus it is undefined behavior as per 1.9/15. And the UB has nothing to do with the assignment side-effect at all.
As for the expression i = i++ + 1;, the side-effect of assignment is, as per C++11, explicitly sequenced after the value computations, but before the value computation of the expression as a whole. The value computation of i++ is not an issue as per 5.2.6 "The value computation of the ++ expression is sequenced before the modification of the operand object". As per the nature of the postfix ++, the side effect of updating i must be sequenced after the value computation of the whole expression. It is well-defined behavior, as far as I can tell.
Correct text should therefore be
(6) i = i++ + 1; // Well-defined Behaviour
(7) i = ++i + 1; // Undefined Behaviour
Apparently there was an incorrect example in C++11 1.9/15 i = i++ + 1; // the behavior is undefined which has been corrected in later versions of the standard.
NOTE: None of this has the slightest thing to do with the change of wording about sequence points!
1) C11 6.5.16/3
The side effect of updating the stored value of the
left operand is sequenced after the value computations of the left and
right operands. The evaluations of the operands are unsequenced.
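Whichever reading of the standard applies to a given compiler and language version, the dispute disappears once each modification of i gets its own statement. The sketch below uses hypothetical function names, and the comments state which intent each function assumes; the intents themselves are assumptions, not something the standard text dictates.

```cpp
// Assumed intent of `i = ++i + 1`: increment i, then store that value plus one.
int pre_increment_form(int i) {
    ++i;
    i = i + 1;
    return i;          // original value + 2
}

// Assumed intent of `i = i++ + 1`: take the old value, let the increment
// happen, and perform the assignment of old value plus one last. This is
// the outcome the N4618 example comment describes ("the value of i is
// incremented").
int post_increment_form(int i) {
    int old = i;
    ++i;               // this write is overwritten by the assignment below
    i = old + 1;
    return i;          // original value + 1
}
```

Starting from 3, the first form gives 5 and the second gives 4. Compiler diagnostics such as GCC's -Wsequence-point and Clang's -Wunsequenced can flag some of the single-expression spellings, though they should not be relied on to catch every case.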

Is the behaviour of i = i++ really undefined?

Possible Duplicate:
Could anyone explain these undefined behaviors (i = i++ + ++i , i = i++, etc…)
According to c++ standard,
i = 3;
i = i++;
will result in undefined behavior.
We use the term "undefined behavior" if it can lead to more than one result. But here, the final value of i will be 4 no matter what the order of evaluation, so shouldn't this really be called "unspecified behavior"?
The phrase, "…the final value of i will be 4 no matter what the order of evaluation…" is incorrect. The compiler could emit the equivalent of this:
i = 3;
int tmp = i;
++i;
i = tmp;
or this:
i = 3;
++i;
i = i - 1;
or this:
i = 3;
i = i;
++i;
As to the definitions of terms, if the answer was guaranteed to be 4, that wouldn't be unspecified or undefined behavior, it would be defined behavior.
As it stands, it is undefined behaviour according to the standard (Wikipedia), so it's even free to do this:
i = 3;
system("sudo rm -rf /"); // DO NOT TRY THIS AT HOME … OR AT WORK … OR ANYWHERE.
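The three candidate translations of i = i++ listed above can be turned into small functions to show that they do not agree. This is only a sketch with made-up function names: each function is well-defined by itself, and the list is not exhaustive, since undefined behavior is not limited to these arithmetic outcomes.

```cpp
// Each function mirrors one candidate translation of
//   i = 3; i = i++;
// written as separate statements.

int save_old_then_restore() {
    int i = 3;
    int tmp = i;   // remember the old value
    ++i;           // increment
    i = tmp;       // assignment overwrites the increment
    return i;      // 3
}

int increment_then_undo() {
    int i = 3;
    ++i;           // increment first
    i = i - 1;     // recover the old value and assign it
    return i;      // 3
}

int assign_then_increment() {
    int i = 3;
    int same = i;  // value of the right-hand side
    i = same;      // assignment happens first (no visible effect)
    ++i;           // increment lands last
    return i;      // 4
}
```

Two of the orderings leave i at 3 and one leaves it at 4, so the claim that the result is always 4 does not hold even among these benign possibilities.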
No, we don't use the term "undefined behavior" when it can simply lead to more than one arithmetical result. When the behavior is limited to different arithmetical results (or, more generally, to some set of predictable results), it is typically referred to as unspecified behavior.
Undefined behavior means completely unpredictable and unlimited consequences, like formatting the hard drive on your computer or simply making your program crash. And i = i++ is undefined behavior.
Where you got the idea that i should be 4 in this case is not clear. There's absolutely nothing in C++ language that would let you come to that conclusion.
In C and also in C++, the order of any operation between two sequence points is completely up to the compiler and cannot be depended on. The standard defines a list of things that make up sequence points; from memory, this is
the semicolon after a statement
the comma operator
evaluation of all function arguments before the call to the function
the && and || operators
Looking up the page on Wikipedia, the list there is more complete and described in more detail. Sequence points are an extremely important concept, and if you do not already know what they mean, you will benefit greatly by learning about them right away.
1. No, the result will be different depending on the order of evaluation. There is no evaluation boundary between the increment and the assignment, so the increment can be performed before or after the assignment. Consider this behaviour:
load i into CX
copy CX to DX
increase DX
store DX in i
store CX in i
The result is that i contains 3, not 4.
As a comparison, in C# there is an evaluation boundary between the evaluation of the expression and the assignment, so the result will always be 3.
2. Even if the exact behaviour isn't specified, the specification is very clear on what it covers and what it doesn't cover. The behaviour is specified as undefined; it's not unspecified.
i=, and i++ are both side effects that modify i.
i++ does not imply that i is only incremented after the entire statement is evaluated, merely that the current value of i has been read.
As such, the assignment, and the increment, could happen in any order.
This question is old, but still appears to be referenced frequently, so it deserves a new answer in light of changes to the standard, from C++17.
expr.ass Subclause 1 explains
... the assignment is sequenced after the value computation of the right and left operands ...
and
The right operand is sequenced before the left operand.
The implication here is that the side-effects of the right operand are sequenced before the assignment, which means that the expression is not addressed by the provision in [basic.exec] Subclause 10:
If a side effect on a memory location ([intro.memory]) is unsequenced relative to either another side effect on the same memory location or a value computation using the value of any object in the same memory location, and they are not potentially concurrent ([intro.multithread]), the behavior is undefined
The behavior is defined, as explained in the example which immediately follows.
See also: What made i = i++ + 1; legal in C++17?
To answer your questions:
I think "undefined behavior" means that the compiler/language implementer is free to do whatever it thinks best, and not that it could lead to more than one result.
Because it's not unspecified. It's clearly specified that its behavior is undefined.
It's not worth it to type i=i++ when you could simply type i++.
I saw a question like this on an OCAJP practice test.
IntelliJ's IDEA decompiler turns this
public static int iplus(){
int i=0;
return i=i++;
}
into this
public static int iplus() {
int i = 0;
byte var10000 = i;
int var1 = i + 1;
return var10000;
}
Create JAR from module, then import as library & inspect.

Error in the standards?

§5/4 C++ standard
i = 7, i++, i++; // i becomes 9
i = ++i + 1; //the behavior is unspecified
That should be changed to
i = 7, i++, i++; // the behavior is undefined
i = ++i + 1; //the behavior is undefined
right?
Yes, please see this defect report: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#351 .
Clarification: the example is wrong but your 'fix' is incorrect for the first statement. The first statement is correctly commented in the standard. It is only the second comment that is inaccurate.
i = 7, i++, i++; // i becomes 9
is perfectly fine. Operator = has higher precedence than , so the expression is equivalent to
(i = 7), i++, i++;
which is perfectly well defined behaviour because , is a sequence point.
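A small sketch of the comma-operator case, with hypothetical function names. The first function is the statement from the standard's example; the second parenthesises the comma expression to show that its value is that of the last operand.

```cpp
// The comma operator sequences its left operand (value and side effects)
// before its right operand, so both functions are well-defined.

int comma_statement() {
    int i = 0;
    i = 7, i++, i++;   // parsed as (i = 7), i++, i++
    return i;          // 9
}

// Here the whole comma expression is the initializer. Its value is the
// value of the last operand: the second i++ yields 8, then i becomes 9.
int comma_value(int& i) {
    int last = (i = 7, i++, i++);
    return last;       // 8
}
```

comma_statement returns 9. comma_value returns 8 and leaves the referenced variable at 9.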
As far as
i = ++i + 1; //the behavior is unspecified
is concerned the behaviour is undefined in C++03 but well defined in C++0x. If you have the C++0x draft you can check out in section 1.9/15
i = i++ + 1; // the behavior is undefined
No, the standard is right. The comma operator guarantees that any side effects of previous operands are completed before evaluating the next.
These guarantees are provided by sequence points, which the comma operator (as well as && and ||) are.
Note that you are correct on the wording change for the second statement. It is undefined, not unspecified.
That should be changed to
i = 7, i++, i++; // the behavior is undefined
Why? The standard is correct, this shouldn’t be changed, and this behaviour is well-defined.
A comma (,) introduces a sequence point into the calculation so the order of execution is defined.

Difference between i = ++i and ++i [duplicate]

What is the difference between i = ++i; and ++i; where i is an integer with value 10?
As I see it, both do the same job of incrementing i, i.e. after completion of either expression, i = 11.
i = ++i; invokes Undefined Behaviour whereas ++i; does not.
C++03 [Section 5/4] says Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
In i = ++i, i is being modified twice (pre-increment and assignment) without any intervening sequence point, so the behaviour is undefined in C as well as in C++.
However i = ++i is well defined in C++0x :)
Writing i = ++i; writes to variable i twice (one for the increment, one for the assignment) without a sequence point between the two. This, according to the C language standard causes undefined behavior.
This means the compiler is free to implement i = ++i as identical to i = i + 1, as i = i + 2 (this actually makes sense in certain pipeline- and cache-related circumstances), or as format C:\ (silly, but technically allowed by the standard).
i = ++i will often, but not necessarily, give the result of
i = i;
i +1;
which gives i = 10
As pointed out by the comments, this is undefined behaviour and should never be relied on
while ++i will ALWAYS give
i = i+1;
which gives i = 11;
And is therefore the correct way of doing it
If i is of scalar type, then i = ++i is UB, and ++i is equivalent to i+=1.
if i is of class type and there's an operator++ overloaded for that class then
i = ++i is equivalent to i.operator=(operator++(i)), which is NOT UB, and ++i just executes the ++ operator, with whichever semantics you put in it.
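A minimal sketch of the class-type case. Counter and assign_pre_increment are made-up names, and the overloaded ++ is written as a member function rather than the non-member form mentioned above. Because both the increment and the assignment are function calls, the increment completes before the assignment runs.

```cpp
// Hypothetical class type with an overloaded prefix ++.
struct Counter {
    int value;

    Counter& operator++() {   // prefix increment
        ++value;
        return *this;
    }
    // The implicitly generated copy assignment handles `c = ++c`.
};

int assign_pre_increment(int start) {
    Counter c{start};
    c = ++c;   // c.operator=(c.operator++()): function calls, not built-in operators
    return c.value;
}
```

With a starting value of 10 the function returns 11: operator++ runs to completion, and the following self-assignment changes nothing.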
The result for the first one is undefined.
These expressions are related to sequence points and, most importantly, the first one results in undefined behavior.