When exactly the built-in postfix increment operator returns the result? - c++

I know two things about the built-in postfix increment operator:
Firstly the result value is evaluated (i.e. prvalue copy of the operand is created).
Only after that the side effect (increment) is applied to the original object.
So when exactly this operator returns the result? I see three options here:
a) immediately after its evaluation
(i.e. after 1. In this case "returning the result" is equivalent to "evaluate the result")
b) in some moment between 1. and 2.
c) after completion of the side effect (after 2.)?
Which option is technically correct?
Edit: Maybe this question refers to more general question:
"returning the result of operator/expression" is the same thing as "evaluating the result of operator/expression" or not?
P.S. I know this is a stupid question but I didn't find answer to it. It bothers me because there are other operators (and expressions) with side effects which are completed after evaluating result.

Related

All possible results of a C++ expression?

According to 5/1 (the Standard):
An expression can result in a value and can cause side effects.
So obviously we have two possible options:
1) Expression results in a value and cause side effects
2) Expression results in a value and doesn't cause side effects
What are the other possible options? (For example are there any expressions which don't result in a value?)
I thought about throw-expressions and functions with void return type. Can we refer them to the first or second category (value of void type with possible side effects)?
What are the other possible options?
Expression doesn't result in a value and causes side effects
Expression doesn't result in a value and does not cause side effects
Expressions with void return type do not result in a value. Expressions in 4. do not affect the behaviour of the program.
Given that exit(0) is an expression, we must include the possibility that evaluating an expression ends the program.

C++ ternary operator execution conditions

I am unsure about the guarantees of execution for the C / C++ ternary operator.
For instance if I am given an address and a boolean that tells if that address is good for reading I can easily avoid bad reads using if/else:
int foo(const bool addressGood, const int* ptr) {
if (addressGood) { return ptr[0]; }
else { return 0; }
}
However can a ternary operator (?:) guarantee that ptr won't be accessed unless addressGood is true? Or could an optimizing compiler generate code that accesses ptr in any case (possibly crashing the program), stores the value in an intermediate register and use conditional assignment to implement the ternary operator?
int foo(const bool addressGood, const int* ptr) {
// Not sure about ptr access conditions here.
return (addressGood) ? ptr[0] : 0;
}
Thanks.
Yes, the standard guarantees that ptr is only accessed if addressGood is true. See this answer on the subject, which quotes the standard:
Conditional expressions group right-to-left. The first expression is contextually converted to bool (Clause 4). It is evaluated and if it is true, the result of the conditional expression is the value of the second expression, otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second or third expression.
(C++11 standard, paragraph 5.16/1)
I would say, in addition to the answer that "yes, it's guaranteed by the C++ standard":
Please use the first form. It's MUCH clearer what you are trying to achieve.
I can almost guarantee that any sane compiler (with minimal amount of optimisation) generates exactly the same code for both examples anyway.
So whilst it's useful to know that both of these forms achieve the same "protection", it is definitely preferred to use the form that is most readable.
It also means you don't need to write a comment explaining that it is safe because of paragraph such and such in the C++ standard, thus making both take up the same amount of code-space - because if you didn't know it before, then you can rely on someone else ALSO not knowing that this is safe, and then spending the next half hour finding the answer via google, and either running into this thread, or asking the question again!
The conditional (ternary) operator guarantees to only evaluate the second operand if the first operand compares unequal to 0, and only evaluate the third operand if the first operand compares equal to 0. This means that your code is safe.
There is also a sequence point after the evaluation of the first operand.
By the way, you don't need the parantheses - addressGood ? ptr[0] : 0 is fine too.
c++11/[expr.cond]/1
Conditional expressions group right-to-left. The first expression is
contextually converted to bool (Clause 4).
It is evaluated and if it is true, the result of the conditional expression is the value of the second expression,
otherwise that of the third expression. Only one of the second and third expressions is evaluated. Every value
computation and side effect associated with the first expression is sequenced before every value computation
and side effect associated with the second or third expression.

Order of evaluation and undefined behaviour

Speaking in the context of the C++11 standard (which no longer has a concept of sequence points, as you know) I want to understand how two simplest examples are defined.
int i = 0;
i = i++; // #0
i = ++i; // #1
There are two topics on SO which explain those examples within the C++11 context. Here it was said that #0 invokes UB and #1 is well-defined. Here it was said that both examples are undefined. This ambiguity confuses me much. I've read this well-structured reference three times already but the topic seems to be way too complicated for me.
.
Let's analyze the example #0: i = i++;.
Corresponding quotes are:
The value computation of the built-in postincrement and postdecrement
operators is sequenced before its side-effect.
The side effect (modification of the left argument) of the built-in
assignment operator and of all built-in compound assignment operators
is sequenced after the value computation (but not the side effects) of
both left and right arguments, and is sequenced before the value
computation of the assignment expression (that is, before returning
the reference to the modified object)
If a side effect on a scalar object is unsequenced relative to another
side effect on the same scalar object, the behavior is undefined.
As I get it, the side effect of the assignment operator is not sequenced with side effects of it's left and right arguments. Thus the side effect of the assignment operator is not sequenced with the side effects of i++. So #0 invokes an UB.
.
Let's analyze the example #1: i = ++i;.
Corresponding quotes are:
The side effect of the built-in preincrement and predecrement
operators is sequenced before its value computation (implicit rule due
to definition as compound assignment)
The side effect (modification of the left argument) of the built-in
assignment operator and of all built-in compound assignment operators
is sequenced after the value computation (but not the side effects) of
both left and right arguments, and is sequenced before the value
computation of the assignment expression (that is, before returning
the reference to the modified object)
If a side effect on a scalar object is unsequenced relative to another
side effect on the same scalar object, the behavior is undefined.
I can not see, how this example is different from the #0. This seems to be an UB for me for the very same reason as #0. The side effect of assignment is not sequenced with the side effect of ++i. It seems to be an UB. The topic liked above says it is well-defined. Why?
.
Question: how can I apply quoted rules to determine the UB of the examples. An as simple as possible explanation would be greatly appreciated. Thank you!
Since your quotes are not directly from the standard, I will try to give a detailed answer quoting the relevant parts of the standard. The definitions of "side effects" and "evaluation" is found in paragraph 1.9/12:
Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.
The next relevant part is paragraph 1.9/15:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Now let's see, how to apply this to the two examples.
i = i++;
This is the postfix form of increment and you find its definition in paragraph 5.2.6. The most relevant sentence reads:
The value computation of the ++ expression is sequenced before the modification
of the operand object.
For the assignment expression see paragraph 5.17. The relevant part states:
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.
Using all the information from above, the evaluation of the whole expression is (this order is not guaranteed by the standard!):
value computation of i++ (right hand side)
value computation of i (left hand side)
modification of i (side effect of ++)
modification of i (side effect of =)
All the standard guarantees is that the value computations of the two operands is sequenced before the value computation of the assignment expression. But the value computation of the right hand side is only "reading the value of i" and not modifying i, the two modifications (side effects) are not sequenced with respect to each other and we get undefined behavior.
What about the second example?
i = ++i;
The situation is quite different here. You find the definition of prefix increment in paragraph 5.3.2. The relevant part is:
If x is not of type bool, the expression ++x is equivalent to x+=1.
Substituting that, our expression is equivalent to
i = (i += 1)
Looking up the compound assignment operator += in 5.17/7 we get that i += 1 is equivalent to i = i + 1 except that i is only evaluated once. Hence, the expression in question finally becomes
i = ( i = (i + 1))
But we already know from above that the value computation of the = is sequenced after the value computation of the operands and the side effects are sequenced before the value computations of =. So we get a well-defined order of evaluation:
compute value of i + 1 (and i - left hand side of inner expression)(#1)
initiate side effect of inner =, i.e. modify "inner" i
compute value of (i = i + 1), which is the "new" value of i
initiate side effect of outer =, i.e. modify "outer" i
compute value of full expression.
(#1): Here, i is only evaluated once, since i += 1 is equivalent to i = i + 1 except that i is only evaluated once (5.17/7).
The key difference is that ++i is defined as i += 1, so
i = ++i;
is the same as:
i = (i += 1);
Since the side effects of the += operator are sequenced before
the value computation of the operator, the actual modification
of i in ++i is sequenced before the outer assignment. This
follows directly from the sections you quote: "The side effect
(modification of the left argument) of the built-in assignment
operator and of all built-in compound assignment operators is
sequenced after the value computation (but not the side effects)
of both left and right arguments, and is sequenced before the
value computation of the assignment expression (that is, before
returning the reference to the modified object)"
This is due to the nested assignment operator; the (outer)
assignment operator only imposes sequenced before on the value
computation of its operands, not on their side effects. (But of
course, it doesn't undo sequencing imposed otherwise.)
And as you indirectly point out, this is new to C++11;
previously, both were undefined. The older versions of C++
used sequence points, rather than sequenced before, and there
was no sequence point in any of the assignment operators. (I
have the impression that the intent was that operators which
result in an lvalue have a value which is sequenced after any
side effects. In earlier C++, the expression *&++i was
undefined behavior; in C++11, it is guaranteed to be the same as
++i.)

Difference between sequence points and operator precedence? 0_o

Let me present a example :
a = ++a;
The above statement is said to have undefined behaviors ( I already read the article on UB on SO)
but according precedence rule operator prefix ++ has higher precedence than assignment operator =
so a should be incremented first then assigned back to a. so every evaluation is known, so why it is UB ?
The important thing to understand here is that operators can produce values and can also have side effects.
For example ++a produces (evaluates to) a + 1, but it also has the side effect of incrementing a. The same goes for a = 5 (evaluates to 5, also sets the value of a to 5).
So what you have here is two side effects which change the value of a, both happening between sequence points (the visible semicolon and the end of the previous statement).
It does not matter that due to operator precedence the order in which the two operators are evaluated is well-defined, because the order in which their side effects are processed is still undefined.
Hence the UB.
Precedence is a consequence of the grammar rules for parsing expressions. The fact that ++ has higher precedence than = only means that ++ binds to its operand "tighter" than =. In fact, in your example, there is only one way to parse the expression because of the order in which the operators appear. In an example such as a = b++ the grammar rules or precedence guarantee that this means the same as a = (b++) and not (a = b)++.
Precedence has very little to do with the order of evaluation of expression or the order in which the side-effects of expressions are applied. (Obviously, if an operator operates on another expression according to the grammar rules - or precedence - then the value of that expression has to be calculated before the operator can be applied but most independent sub-expressions can be calculated in any order and side-effects also processed in any order.)
why it is UB ?
Because it is an attempt to change the variable a two times before one sequence point:
++a
operator=
Sequence point evaluation #6: At the end of an initializer; for example, after the evaluation of 5 in the declaration int a = 5;. from Wikipedia.
You're trying to change the same variable, a, twice. ++a changes it, and assignment (=) changes it. But the sequence point isn't complete until the end of the assignment. So, while it makes complete sense to us - it's not guaranteed by the standard to give the right behavior as the standard says not to change something more than once in a sequence point (to put it simply).
It's kind of subtle, but it could be interpreted as one of the following (and the compiler doesn't know which:
a=(a+1);a++;
a++;a=a;
This is because of some ambiguity in the grammar.

Sequence points and partial order

A few days back there was a discussion here about whether the expression
i = ++i + 1
invokes UB
(Undefined Behavior) or not.
Finally the conclusion was made that it invokes UB as the value of 'i' is changing more than once between two sequence points.
I was involved in a discussion with Johannes Schaub in that same thread. According to him
i=(i,i++,i)+1 ------ (1) /* invokes UB as well */
I said (1) does not invoke UB because the side effects of the previous subexpressions are cleared by the comma operator ',' between i and i++ and between i++ and i.
Then he gave the following explanation:
"Yes the sequence point after i++ completes all side effects before it, but there is nothing that stops the assignment side effect overlapping with the side effect of i++.The underlying problem is that the side effect of an assignment is not specified to happen after or before the evaluation of both operands of the assignment, and so sequence points cannot do anything with regard to protecting this: Sequence points induce a partial order: Just because there is a sequence point after and before i++ doesn't mean all side effects are sequenced with regard to i.
Also, notice that merely a sequence point means nothing: The order of evaluations isn't dictated by the form of code. It's dictated by semantic rules. In this case, there is no semantic rule saying when the assignment side effect happens with regard to evaluating both of its operands or subexpressions of those operands".
The statement written in "bold" confused me. As far as I know:
"At certain specified points in the execution sequence called sequence points,all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place."
Since,comma operators also specify execution order the side effect of i++ have been cancelled when we reach the last i.He(Johannes) would have been right had the order of evaluation been not specified(but in case of comma operator it is well specified).
So I just want to know whether (1) invokes UB or not?. Can someone give another valid explanation?
Thanks!
The C standard says this about assignment operators (C90 6.3.16 or C99 6.5.16 Assignment operators):
The side effect of updating the stored value of the left operand shall occur between the previous and the next sequence point.
It seems to me that in the statement:
i=(i,i++,i)+1;
the sequence point 'previous' to the assignment operator would be the second comma operator and the 'next' sequence point would be the end of the expression. So I'd say that the expression doesn't invoke undefined behavior.
However, this expression:
*(some_ptr + i) = (i,i++,i)+1;
would have undefined behavior because the order of evaluation of the 2 operands of the assignment operator is undefined, and in this case instead of the problem being when the assignment operator's side effect takes place, the problem is you don't know whether the value of i used in the left handle operand will be evaluated before or after the right hand side. This order of evaluation problem doesn't occur in the first example because in that expression the value of i isn't actually used in the left-hand side - all that the assignment operator is interested in is the "lvalue-ness" of i.
But I also think that all this is sketchy enough (and my understanding of the nuances involved are sketchy enough) that I wouldn't be surprised if someone can convince me otherwise (on either count).
i=(i,i++,i)+1 ------ (1) /* invokes UB as well */
It does not invoke undefined behaviour. The side effect of i++ will take place before the evaluation of the next sequence point, which is denoted by the comma following it, and also before the assignment.
Nice language sudoku, though. :-)
edit: There's a more elaborate explanation here.
I believe that the following expression definitely has undefined behaviour.
i + ((i, i++, i) + 1)
The reason is that the comma operator specifies sequence points between the subexpressions in parentheses but does not specify where in that sequence the evaluation of the left hand operand of + occurs. One possibility is between the sequence points surrounding i++ and this violates the 5/4 as i is written to between two sequence points but is also read twice between the same sequence points and not just to determine the value to be stored but also to determine the value of the first operand to the + operator.
This also has undefined behaviour.
i += (i, i++, i) + 1;
Now, I am not so sure about this statement.
i = (i, i++, i) + 1;
Although the same principals apply, i must be "evaluated" as a modifiable lvalue and can be done so at any time, but I'm not convinced that its value is ever read as part of this. (Or is there another restriction that the expression violates to cause UB?)
The sub-expression (i, i++, i) happens as part of determining the value to be stored and that sub-expression contains a sequence point after the storage of a value to i. I don't see any way that this wouldn't require the side effect of i++ to be complete before the determination of the value to be stored and hence the earliest possible point that the assignment side effect could occur.
After this sequnce point i's value is read at most once and only to determine the value that will be stored back to i, so this last part is fine.