At least two operators have similar text in their descriptions:
With respect to an indeterminately-sequenced function call, the
operation of an < operator > is a single evaluation.
Where < operator > might be either "compound assignment" ([expr.ass]) or "postfix ++" ([expr.post.incr]). That rule, if I understand correctly, essentially states that any interleaving (overlap) between the indeterminately-sequenced function call and one of the aforementioned operators is forbidden. But the Standard already forbids it in [intro.execution]p15:
<...> Evaluations A and B are indeterminately sequenced when either A
is sequenced before B or B is sequenced before A, but it is
unspecified which. [ Note: Indeterminately sequenced evaluations cannot overlap, but either could be executed first. — end note ] <...>
So to the question: is the wording in the operator description redundant and might as well be removed completely? And if it is not redundant please describe a situation when the text from the operators apply but the general rule does not.
The execution of the postfix increment occurs in two steps: first, an lvalue-to-rvalue conversion is applied, then, at some point before the end of the full-expression, the value stored in the object is incremented. The latter step is sequenced after the former.
Let's say A is sequenced before B, and both A and B are indeterminately sequenced with C. In that case, no two of A, B, and C may overlap. However, there are three possible execution orders: ABC, ACB, and CAB.
In the case of the postfix increment, the additional guarantee provided by the language is that the order can only be ABC or CAB. A function call will never occur between the lvalue-to-rvalue conversion and the side effect.
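To make the excluded ACB order concrete, here is a minimal sketch. The names x, set_to_ten and increment_around_call are made up for illustration. The called function writes to the same object that the postfix increment modifies, so a call landing between the read and the write of x++ would be observable if it were allowed.

```cpp
#include <cassert>

int x = 0;

// Hypothetical helper: overwrites x from inside a function body.
int set_to_ten() {
    x = 10;
    return 0;
}

// The call is indeterminately sequenced with x++, so only two orders are allowed:
//   x++ completely first: x becomes 1, then the call stores 10  -> x == 10
//   call completely first: x becomes 10, then x++ makes it 11   -> x == 11
// Splitting x++ around the call (read 0, call stores 10, write 1) is ruled out.
int increment_around_call() {
    x = 0;
    int unused = x++ + set_to_ten();
    (void)unused;
    assert(x == 10 || x == 11);
    return x;
}
```

Which of the two values you get is unspecified, but a final value of 1 would indicate the interleaving that the extra wording forbids.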
The Problem
The C++17 standard specifies in [intro.execution]/17 that it is undefined behavior if you have a
side effect on a memory location unsequenced with a value computation using the value of any object in the same memory location.
I am currently trying to trace a -Wsequence-point warning emitted by GCC and I assume that this is
what's happening here, though I don't know if I read the standard right.
This is an example of the code that makes GCC 5 to 12 emit the warning [1]:
struct Int {
    int x;
};
int main()
{
    Int i{0};
    i = i = Int{1};
    // This is the same, but it's easier to talk about this:
    i.operator=(i.operator=(Int{1}));
}
The warning is:
<source>:9:7: warning: operation on '* & i' may be undefined [-Wsequence-point]
9 | i = i = Int{1};
| ~~^~~~~~~~~~~~
(See here at Compiler Explorer)
Clang and MSVC don't warn about this. I'm trying to verify what happens here. You can see my
detailed, rather tedious analysis below, but here's what I think I found:
In i.operator=(i.operator=(Int{1}));, the outer operator= evaluates two expressions: i (its
left-hand-side operand) and i.operator=(Int{1}) (the right-hand side). I think that the
value-computation part of evaluating i [2] is unsequenced
relative to the side-effect part of i.operator=(Int{1}) [3].
However, that would only be UB if the value-computation of the expression i is, in the words of
the standard, a value computation using the value of any object in the same memory location. Since
i is a local variable, the expression i is a glvalue, and its value-computation should only
compute its identity, not its value, according to [basic.lval]/1.1:
A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.
So I would argue that the value-computation of i in this case does not use the value of any object […] — but then I don't see why GCC emits the warning.
So: Is my assumption wrong? Is it the value-computation of i and the side-effect of
i.operator=(Int{1}) that clash here?
Detailed Analysis
Here I'll try to list all side-effects and value-computations (I'll call both 'events' from now on)
that happen during evaluation of i.operator=(i.operator=(Int{1}));, show the standard rules that
govern their sequencing and analyze which pairs of potentially problematic events are sequenced how.
All events
As far as I can see, ten events are happening here (not necessarily in this order):
Av Value-computation for Int{1}
As Side-effects for the Int{1}
Bv Value-computation for the first operand of i.operator=(Int{1}), i.e., i
Bs Side-effects for the first operand of i.operator=(Int{1}), i.e., i
Cv Value-computation for the expression i.operator=(Int{1})
Cs Side-effects for the expression i.operator=(Int{1})
Dv Value-computation for the first operand of i.operator=(i.operator=(Int{1})), i.e., i
Ds Side-Effects for the first operand of i.operator=(i.operator=(Int{1})), i.e., i
Ev Value-computation for i.operator=(i.operator=(Int{1}))
Es Side-effects for i.operator=(i.operator=(Int{1}))
There are two things that can lead to UB in this context:
Two unsequenced side-effects "on the same object"
In this example: Es and Cs being unsequenced (all other side-effects are "empty")
An unsequenced value-computation "using the value of the object" and side-effect "on the same object"
In this example: Any of Cs or Es being unsequenced vs any of Bv, Cv, Dv or Ev.
The standard has many rules on what is sequenced after what, but I think the relevant rules here are (paraphrasing, ordering arbitrary):
Rule 1: The value-computation (not the side-effects!) of operands are sequenced before the value-computation (not the side-effect!) of the operator
Rule 2a: The side effect (modification of the left argument) of the built-in assignment operator […] is sequenced after the value computation (but not the side effects) of both left and right arguments,
Rule 2b: The side effect (modification of the left argument) of the built-in assignment operator […] is sequenced before the value computation of the assignment expression (that is, before returning the reference to the modified object)
With that, we get:
(I) Rule 1 gives us (Av, Bv) => Cv (read: Av and Bv are sequenced before Cv), and (Cv, Dv) => Ev.
(II) Rule 2a gives us (Av, Bv) => Cs and (Cv, Dv) => Es
(III) Rule 2b gives us Cs => Cv and Es => Ev
Combining Cv => Es from (II) and Cs => Cv from (III), we get Cs => Es, so we can rule out the "unsequenced side-effects" UB possibility from above.
Let's check if Cs can be unsequenced vs either of Bv, Cv, Dv or Ev:
(II) rules out Bv
(III) rules out Cv
(II) + (III) gives us Cs => Cv => Es => Ev, thereby ruling out Ev
I cannot find a chain from Cs to Dv!
Finally, let's check if Es can be unsequenced vs either of Bv, Cv, Dv or Ev:
(II) rules out Dv and Cv
(III) rules out Ev
(I) + (II) give us Bv => Cv => Es, thus ruling out Bv
Only Dv vs Cs remains as a potentially unsequenced pair!
Footnotes
[1] Note that I could have just used int instead of my wrapped-int Int. However, then I could not have made the .operator=() calls explicit.
[2] Denoted Dv in the analysis below
[3] Denoted Cs in the analysis below
(Answer considers only C++11 and later. It may be different for pre-C++11.)
This is an example of the code that makes GCC 5 to 12 emit the warning
The warning is a false positive. The behavior will never be undefined.
// This is the same, but it's easier to talk about this:
i.operator=(i.operator=(Int{1}));
The sequencing rules for function calls are not the same as those for operators, so that is a risky simplification.
Note that I could have just used int instead of my wrapped-int Int. However, then I could not have made the .operator=() calls explicit.
The rules are also different if calls to a function are involved (directly or indirectly) rather than direct access to a scalar. So again this is a risky change.
I think that the value-computation part of evaluating i is unsequenced relative to the side-effect part of i.operator=(Int{1}).
Since C++17 the left-hand i.operator= naming the function is sequenced (including all side-effects) before all argument expressions (i.e. i.operator=(Int{1})). (rule 14 at cppreference)
A side effect can only exist on a scalar object. The side effect of i.operator=(Int{1}) is on the scalar object i.x and happens not immediately in the expression, but inside the function call. All evaluations part of a function call are indeterminately sequenced with all other evaluations in the caller if they are not otherwise sequenced. (rule 11 at cppreference)
Therefore before C++17 it would still not be unsequenced. It would just be unspecified which of the two is evaluated first.
This is already one reason that the GCC warning is bogus.
So I would argue that the value-computation of i in this case does not use the value of any object […] — but then I don't see why GCC emits the warning.
Yes, that is also true and a second reason that the warning is
bogus.
In your detailed analysis the listed rule 1 applies only to i = i = Int{1};, not to i.operator=(i.operator=(Int{1}));, because the latter isn't using any = operator, just function calls.
Rules 2a and 2b don't apply at all, because you are not using a built-in assignment operator. In either case you are calling (explicitly or implicitly) an operator= overload. The overload is implicitly defined, but that doesn't make it built-in. The built-in assignment operators are the ones that apply directly to scalar types without a function call.
Even if you had not wrapped the int in a struct, so that you really are using a builtin assignment in i = i = Int{1};, these would be the pre-C++17 rules.
Since C++17 there is another sequencing rule that guarantees that the right-hand side of the assignment is sequenced before the left-hand one, including all side-effects. (rule 20 at cppreference)
Note that this is reversed from the ordering I mentioned above for i.operator=(i.operator=(Int{1})) since C++17!
The relevant rules are those for function calls which you are not considering. In particular the rule for indeterminate sequencing with respect to bodies of called functions is missing.
If you had not wrapped the int in a struct and used i = i = int(1);, then it would still not be undefined, even before C++17. In this case the rules you listed, paired with the evaluation of the i glvalue not counting itself as an access to i, will show that there is no violation of the sequencing rules as well. (Admittedly I didn't go through your deduction to check whether it is showing this.)
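As a rough illustration of the function-call view, here is a sketch that swaps the implicitly-defined operator= for a user-provided one that records each call. The log assigned_values and the function chained_assignment are assumptions made for this example and are not part of the question's code.

```cpp
#include <cassert>
#include <vector>

// Illustrative log of every value passed to Int::operator=.
std::vector<int> assigned_values;

struct Int {
    int x;
    // User-provided so the call is visible; the store to x happens
    // inside this function body, not directly in the calling expression.
    Int& operator=(const Int& rhs) {
        assigned_values.push_back(rhs.x);
        x = rhs.x;
        return *this;
    }
};

int chained_assignment() {
    assigned_values.clear();
    Int i{0};
    // The result of the inner call is the argument of the outer call,
    // so the inner body has finished before the outer body starts.
    i = i = Int{1};
    return i.x;
}
```

Both calls run to completion one after the other, so chained_assignment() returns 1 and the log holds two entries. Nothing here relies on the rules for the built-in assignment operator.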
Given a function call func(E1, E2, E3) where E1 etc are arbitrary expressions, is it true that each expression is indeterminately sequenced with respect to each other expression, or are all the expressions unsequenced (i.e. the evaluations can overlap)?
I've looked at the cppreference page on this and it, in rule 15, uses the sentence
In a function call, value computations and side effects of the
initialization of every parameter are indeterminately sequenced with
respect to value computations and side effects of any other parameter.
which I don't think is quite the same as what I'm asking as the initialisation of the parameter is just the last step in evaluation of the parameter's expression.
But rule 21 which is talking about something else seems to imply that each sub-expression in a function call is indeterminately sequenced
Every expression in a comma-separated list of expressions in a
parenthesized initializer is evaluated as if for a function call
(indeterminately-sequenced)
So I'm a bit confused and any guidance is appreciated.
C++17 states in
8.2.2 Function call [expr.call]
4 ... The initialization and destruction of each parameter occurs within
the context of the calling function.
5 ... Note: All side effects of argument evaluations are sequenced before the
function is entered
5 ... The initialization of a parameter, including every associated value
computation and side effect, is indeterminately sequenced with respect
to that of any other parameter.
I hope this (my bolding) is clear enough.
(ref: n4659, final C++17 draft)
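A small sketch can make the consequence visible. The helper tag and the log evaluation_order are hypothetical names introduced here; func only mirrors the func(E1, E2, E3) shape from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative log of the order in which argument expressions ran.
std::vector<int> evaluation_order;

// Records its id when evaluated, then passes the id through.
int tag(int id) {
    evaluation_order.push_back(id);
    return id;
}

int func(int e1, int e2, int e3) { return e1 + e2 + e3; }

std::vector<int> observe_argument_order() {
    evaluation_order.clear();
    // Each tag(...) call runs as a whole; they cannot interleave,
    // but the order among them is left to the implementation.
    int sum = func(tag(1), tag(2), tag(3));
    assert(sum == 6);
    return evaluation_order;
}
```

The returned vector is always some permutation of 1, 2, 3. Which permutation you see depends on the compiler, and portable code should not rely on it.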
So I understand that re-usage of a variable that has been post incremented is undefined behavior in a function call. My understanding is this is not a problem in constructors. My question is about tie which is oddly halfway between each.
Given: pair<int, int> func() can I do:
tie(*it++, *it) = func();
Or is that undefined behavior?
Since C++17, this code has unspecified behavior. There are two possible outcomes:
the first argument is the result of dereferencing the original iterator, the second argument is the result of dereferencing the incremented iterator; or
the first argument and the second argument are both the result of dereferencing the original iterator.
Per [expr.call]/8:
[...] The initialization of a parameter, including every associated
value computation and side effect, is indeterminately sequenced with
respect to that of any other parameter. [...]
So the second argument to tie may be either the result of dereferencing the incremented iterator or the original iterator.
Prior to C++17, the situation was a bit complicated:
if both ++ and * invoke a function (e.g., when the type of it is a sophisticated class), then the behavior was unspecified, similar to the case since C++17;
otherwise, the behavior was undefined.
Per N4140 (C++14 draft) [expr.call]/8:
[ Note: The evaluations of the postfix expression and of the
arguments are all unsequenced relative to one another. All side
effects of argument evaluations are sequenced before the function is
entered (see [intro.execution]). — end note ]
Thus, the code was undefined behavior because the evaluation of one argument was unsequenced with the other. The evaluations of the two arguments may overlap, so the modification of it by ++ is unsequenced relative to the read of it in the other argument. Unless it is specified otherwise ...
Per N4140 [intro.execution]/15:
When calling a function (whether or not the function is inline), every
value computation and side effect associated with any argument
expression, or with the postfix expression designating the called
function, is sequenced before execution of every expression or
statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument
expressions are unsequenced. — end note ] Every evaluation
in the calling function (including other function calls) that is not
otherwise specifically sequenced before or after the execution of the
body of the called function is indeterminately sequenced with respect
to the execution of the called function.(9) Several
contexts in C++ cause evaluation of a function call, even though no
corresponding function call syntax appears in the translation unit.
[ Example: Evaluation of a new expression invokes one or more allocation and constructor functions; see [expr.new]. For another
example, invocation of a conversion function ([class.conv.fct]) can
arise in contexts in which no function call syntax appears. —
end example ] The sequencing constraints on the execution of the called function (as described above) are features of the function
calls as evaluated, whatever the syntax of the expression that calls
the function might be.
9)
In other words, function executions do not interleave with each
other.
Thus, if the operators are actually function calls, then the behavior is similarly unspecified.
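If the intent is to store the pair into two consecutive elements, one way to sidestep the question is to perform the increment in its own statement. The following is a sketch under that assumption; assign_first_two and the sample func body are invented for illustration, and std::list is used only so that the iterator is certainly a class type.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <tuple>
#include <utility>

// Stand-in for the question's func(); the values are arbitrary.
std::pair<int, int> func() { return {1, 2}; }

std::list<int> assign_first_two() {
    std::list<int> values{0, 0, 0};
    auto it = values.begin();
    // The increment is a separate full-expression, so it is complete
    // before either argument of std::tie is evaluated.
    auto first = it++;
    std::tie(*first, *it) = func();
    assert(it == std::next(values.begin()));
    return values;
}
```

With the increment pulled out, the first element receives 1 and the second receives 2 regardless of the order in which the compiler evaluates the arguments of tie.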
In cppref, the following holds until C++17:
code such as f(std::shared_ptr<int>(new int(42)), g()) can cause a
memory leak if g gets called after new int(42) and throws an
exception, while f(std::make_shared<int>(42), g()) is safe, since
two function calls are never interleaved.
I'm wondering which change introduced in C++17 renders this no longer applicable.
The evaluation order of function arguments was changed by P0400R0.
Before the change, the evaluations of function arguments were unsequenced relative to one another. This means the evaluation of g() could be inserted into the evaluation of std::shared_ptr<int>(new int(42)), which causes the situation described in your quoted context.
After the change, the evaluations of function arguments are indeterminately sequenced with no interleaving, which means all side effects of std::shared_ptr<int>(new int(42)) take place either before or after those of g(). Now consider the case where g() may throw.
If all side effects of std::shared_ptr<int>(new int(42)) take place before those of g(), the allocated memory will be deallocated by the destructor of std::shared_ptr<int>.
If all side effects of std::shared_ptr<int>(new int(42)) take place after those of g(), the memory is never allocated in the first place.
In either case, there is no memory leak.
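The two cases can be checked with an instance counter. This is only a sketch: Tracked and live_objects_after_failed_call are hypothetical names, f and g mirror the cppreference snippet, and the result is stated under the assumption that the code is compiled as C++17 or later.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Illustrative type that counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Always throws, standing in for a g() that fails.
int g() { throw std::runtime_error("g failed"); }

void f(std::shared_ptr<Tracked>, int) {}

int live_objects_after_failed_call() {
    try {
        // Under the C++17 rules, either g() throws before anything is allocated,
        // or the shared_ptr already owns the object when g() throws.
        f(std::shared_ptr<Tracked>(new Tracked), g());
    } catch (const std::runtime_error&) {
        // The exception is expected; only the leak check matters here.
    }
    return Tracked::live;
}
```

Under those rules the function returns 0 in both orders. std::make_shared remains a reasonable default for other reasons, such as performing a single allocation.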
The P0145R3 paper (which was accepted into C++17) refines the order of evaluation of several C++ constructs, including
Postfix expressions are evaluated from left to right. This includes functions calls and member
selection expressions
Specifically, the paper adds the following text to 5.2.2/4 paragraph of the standard:
The postfix-expression is sequenced before each expression in the
expression-list and any default argument. Every value computation and
side effect associated with the initialization of a parameter, and the
initialization itself, is sequenced before every value computation and
side effect associated with the initialization of any subsequent
parameter.
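The first sentence of that wording, about the postfix-expression, can be observed with a short trace. This is a sketch with made-up names (Widget, widget, argument, call_chain); it only looks at the ordering between the postfix-expression and the argument, not at the ordering among several parameters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative trace of which piece of the call expression ran when.
std::vector<std::string> trace;

struct Widget {
    void use(int) { trace.push_back("use"); }
};

// Evaluated as part of the postfix-expression widget().use
Widget& widget() {
    static Widget instance;
    trace.push_back("widget");
    return instance;
}

// Evaluated as part of the expression-list
int argument() {
    trace.push_back("argument");
    return 0;
}

std::vector<std::string> call_chain() {
    trace.clear();
    // Since C++17, widget() is sequenced before argument().
    // In every standard version the body of use() runs last.
    widget().use(argument());
    return trace;
}
```

Compiled as C++17 or later, the trace reads widget, argument, use. Under earlier standards the first two entries could appear in either order.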
I am confused about the output of the code.
It depends on which compiler I use to run the code. Why is that?
#include <iostream>
using namespace std;
int f(int &n)
{
    n--;
    return n;
}
int main()
{
    int n=10;
    n=n-f(n);
    cout<<n;
    return 0;
}
Running it on the Ubuntu terminal with g++, the output is 1, whereas running it on Turbo C++ (the compiler we used in school) gives 0.
In C++03, modifying a variable and also using its value in the same expression, without an intervening C++03 sequence point, was Undefined Behavior.
C++03 §5/4:
” Between the previous
and next sequence point a scalar object shall have its stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
Undefined Behavior, UB, provides the compiler with an opportunity to optimize, because it can assume that UB does not occur in a valid program.
However, with all the myriad UB rules of C++ it's difficult to reason about source code.
In C++11 sequence points were replaced with sequenced before, indeterminately sequenced and unsequenced relations:
C++11 §1.9/3
” Given any two evaluations A and B, if
A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before
B and B is not sequenced before A, then A and B are unsequenced. [Note: The execution of unsequenced
evaluations can overlap. —end note ] Evaluations A and B are indeterminately sequenced when either A
is sequenced before B or B is sequenced before A, but it is unspecified which.
And with the new C++11 sequence relationship rules the modification in the function in the code in question is indeterminately sequenced with respect to the use of the variable, and so the code has unspecified behavior rather than Undefined Behavior, as noted by Eric M Schmidt in a comment to (the first version of) this answer. Essentially that means that there is no danger of nasal daemons or other possible UB effects, and that the behavior is a reasonable one. The two possible behaviors here are that the modification via the function call is done before the use of the value, or that it's done after the use of the value.
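The two outcomes can be pinned down by splitting the expression into separate statements. This is only a sketch: read_then_call and call_then_read are names invented here, while f is the function from the question.

```cpp
#include <cassert>

int f(int& n) {
    n--;
    return n;
}

// The left operand n is read before f modifies it: 10 - 9.
int read_then_call() {
    int n = 10;
    int before = n;
    int after = f(n);
    return before - after;
}

// f modifies n before the left operand is read: 9 - 9.
int call_then_read() {
    int n = 10;
    int after = f(n);
    int current = n;
    return current - after;
}
```

read_then_call() returns 1 and call_then_read() returns 0, matching the two results reported in the question. Writing the order explicitly like this removes the dependence on the compiler.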
Why it's unspecified behavior:
C++11 §1.9/15:
” Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
What “unspecified behavior” means:
C++11 §1.3.25:
” unspecified behavior
Behavior, for a well-formed program construct and correct data, that depends on the implementation
[Note: The implementation is not required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. —end note ]
Why the modification effected by the assignment is not problematic:
C++11 §5.17/1
” In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
This is also quite different from C++03.
As the rather drastic edit of this answer shows, following Eric's comment, this kind of issue is not simple! The main advice I can give is to, as much as possible, just Say No™ to effects governed by subtle or very complex rules, the corners of the language. Simple code has a better chance of being correct, while so-called clever code does not have a good chance of being significantly faster.