Is the following valid?
template <typename T>
std::pair<T, T> foo(T one, T two) { ... }
std::tie(one, two) = foo(std::move(one), std::move(two));
(Assuming that the classes involved handle assigning to a moved-from object in a valid manner).
From reading the updated evaluation order proposal, my assumption was that this was fixed, but I can't find an exact reference in the standard that verifies this. Could someone help provide that?
The relevant section of the standard is [expr.ass]/1, which says:
In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.
So, according to this, foo(std::move(one), std::move(two)) is evaluated first, leaving one and two as moved-from objects by the time std::tie(one, two) is evaluated. std::tie only creates references, so the moved-from objects are not accessed there. Then the assignment actually happens, meaning one and two are assigned to via std::tuple::operator= and receive whatever values foo returned. This is legal and well defined.
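To make the sequencing concrete, here is a minimal runnable sketch of the pattern. The body of foo here is made up (it just swaps its arguments); any function returning a std::pair<T, T> would do:

#include <string>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the question's foo: swaps its arguments.
template <typename T>
std::pair<T, T> foo(T one, T two) {
    return {std::move(two), std::move(one)};
}

int main() {
    std::string one = "first", two = "second";
    // C++17: the RHS (the call, including the moves out of one/two) is
    // evaluated first, then std::tie binds references, then
    // tuple::operator= assigns through those references into the
    // moved-from strings.
    std::tie(one, two) = foo(std::move(one), std::move(two));
    // one == "second", two == "first"
}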
The code is valid, regardless of C++17 changes to evaluation order of =.
Since the LHS doesn't read from one or two, but merely takes their addresses (std::tie just binds references), the fact that it is unsequenced (pre-C++17) relative to the RHS doesn't cause UB.
Related
Say I have code:
struct A { int val; std::string name; };
A a{5, "Hello"};
fn(a.val, std::move(a));
Now it looks like I'm reading from a and also moving from a on the same line, which looks bad, as the order in which function arguments are evaluated is generally unspecified. However, since std::move is just a cast, that should be OK - I'm not actually reading from a moved-from value.
What happens though if fn actually takes the second parameter by value:
void fn(int first, A second);
In this case a new A must be constructed from the moved-from value a. Would this start causing issues in my first example? I assume the constructor of the parameter second may be called before the first argument a.val is read?
The problem that you are going to run into with this is [expr.call]/8:
The postfix-expression is sequenced before each expression in the expression-list and any default argument. The initialization of a parameter, including every associated value computation and side effect, is indeterminately sequenced with respect to that of any other parameter.
emphasis mine
So, what could happen is that second is initialized first; it will be move-constructed from a, stealing its guts. That means a is in a valid but unspecified state when you go to initialize first, so first may not receive the value you expect. If the order is reversed you are fine, but since the order is not specified you are not guaranteed any consistent behavior.
You are right that std::move isn't actually moving anything, but initializing second with an object that was passed through std::move will cause second to use its move constructor, and that is what actually moves from a.
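A sketch of how this can bite. The move constructor here is deliberately written to visibly "steal the guts" (an implicitly generated move would merely copy the int), and fn is hypothetical:

#include <cstdio>
#include <string>
#include <utility>

struct A {
    int val;
    std::string name;
    A(int v, std::string n) : val(v), name(std::move(n)) {}
    A(A&& other) : val(other.val), name(std::move(other.name)) {
        other.val = 0;  // the moved-from object visibly loses its value
    }
};

void fn(int first, A second) {        // hypothetical: takes A by value
    std::printf("first = %d\n", first); // may print 5 or 0
    (void)second;
}

int main() {
    A a{5, "Hello"};
    // The initializations of first and second are indeterminately
    // sequenced (C++17; unsequenced before that). If second is
    // move-constructed first, a.val is already 0 when a.val is read.
    fn(a.val, std::move(a));
}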
On cppreference, the following note applies until C++17:
code such as f(std::shared_ptr<int>(new int(42)), g()) can cause a
memory leak if g gets called after new int(42) and throws an
exception, while f(std::make_shared<int>(42), g()) is safe, since
two function calls are never interleaved.
I'm wondering which change introduced in C++17 renders this no longer applicable.
The evaluation order of function arguments was changed by P0400R0.
Before the change, evaluations of different function arguments were unsequenced relative to one another. This means the evaluation of g() could be interleaved with the evaluation of std::shared_ptr<int>(new int(42)), which causes the situation described in the quote.
After the change, evaluations of different function arguments are indeterminately sequenced with no interleaving, which means all side effects of std::shared_ptr<int>(new int(42)) take place either before or after those of g(). Now consider the case where g() throws.
If all side effects of std::shared_ptr<int>(new int(42)) take place before those of g(), the allocated memory will be deallocated by the destructor of the already-constructed std::shared_ptr<int> during stack unwinding.
If all side effects of std::shared_ptr<int>(new int(42)) take place after those of g(), the allocation never happens in the first place.
In either case, there is no memory leak.
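A sketch of both calls side by side; g and f here are hypothetical stand-ins for the quoted code:

#include <memory>

int g() { return 0; }                    // hypothetical: imagine it may throw
void f(std::shared_ptr<int>, int) {}     // hypothetical

int main() {
    // Pre-C++17, one conforming order was: new int(42), then g() (if it
    // throws, the shared_ptr constructor never runs and the allocation
    // leaks), then the shared_ptr constructor.
    f(std::shared_ptr<int>(new int(42)), g());

    // Safe in any standard: the allocation is immediately wrapped, and
    // since C++17 argument evaluations cannot interleave anyway.
    f(std::make_shared<int>(42), g());
}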
The P0145R3 paper (which was accepted into C++17) refines the order of evaluation of several C++ constructs, including
Postfix expressions are evaluated from left to right. This includes function calls and member selection expressions
Specifically, the paper adds the following text to paragraph 5.2.2/4 of the standard:
The postfix-expression is sequenced before each expression in the
expression-list and any default argument. Every value computation and
side effect associated with the initialization of a parameter, and the
initialization itself, is sequenced before every value computation and
side effect associated with the initialization of any subsequent
parameter.
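A small sketch of the "postfix-expression is sequenced before the arguments" guarantee; pick and arg are made-up names:

#include <cstdio>

using Fn = void (*)(int);

Fn pick() { std::puts("pick"); return [](int) {}; }
int arg()  { std::puts("arg");  return 0; }

int main() {
    // Since C++17 the postfix-expression pick() is sequenced before the
    // argument expression arg(): output is "pick" then "arg".
    // Pre-C++17 the order was unspecified.
    pick()(arg());
}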
Simple question, but surprisingly hard to search for.
For the expression A && B, I know there is a sequence point between the evaluation of A and B, and I know that the order of evaluation is left-to-right, but what is a compiler allowed to do when it can prove that B is always false (perhaps even explicitly so)?
Namely, for function_with_side_effects() && false is the compiler allowed to optimize away the function call?
A compiler is allowed to optimise out anything, as long as it doesn't break the as-if rule. The as-if rule states that with respect to observable behaviour, a program must behave as if it was executed by the exact rules of the C++ abstract machine (basically normal, unoptimised semantics of code).
Observable behaviour is:
Access to volatile objects
Writing to files
Input & output on interactive devices
As long as the program does the three things above in correct order, it is allowed to deviate from other source code functionality as much as it wants.
Of course, in practice, the number of operations which must be left intact by the compiler is much larger than the above, simply because the compiler has to assume that any function whose code it cannot see can, potentially, have an observable effect.
So, in your case, unless the compiler can prove that no action inside function_with_side_effects can ever affect observable behaviour (directly or indirectly by e.g. setting a flag tested later), it has to execute a call of function_with_side_effects, because it could violate the as-if rule if it didn't.
As #T.C. correctly pointed out in comments, there are a few exceptions to the as-if rule, when a compiler is allowed to perform optimisations which change observable behaviour; the most commonly encountered among these exceptions being copy elision. However, none of the exceptions come into play in the code in question.
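A sketch of how this plays out; the function names are hypothetical, and whether the second call is removed depends on what the compiler can see (the first function is assumed to live in another translation unit):

bool function_with_side_effects();  // defined in another TU: opaque here

inline bool visibly_pure() { return true; }  // whole body visible, no effects

bool demo() {
    // Must be executed: the compiler cannot prove the call has no
    // observable behaviour, so the as-if rule forbids removing it.
    bool a = function_with_side_effects() && false;

    // May be folded to `false`: the compiler can see there are no
    // observable effects anywhere in this expression.
    bool b = visibly_pure() && false;

    return a || b;
}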
No.
In general, the C++ Standard specifies the result of the computation in terms of observable effects and as long as your code is written in a Standard-compliant way (avoid Undefined Behavior, Unspecified Behavior and Implementation-Defined Behavior) then a compliant compiler has to produce the observable effects in the order they are specified.
The main caveats in the Standard are the copy elision rules: when returning a value or initializing from a temporary, the compiler may omit a call to the copy constructor or to the move constructor without a care for their (potential) observable effects.
The compiler is otherwise only allowed to optimize non-observable behavior, such as for example using less CPU registers or not writing a value in a memory location you never read afterward.
Note: in C++, the address of an object can be observed, and is thus considered observable; it's low-level like that.
In your particular case, let's refer to the Standard:
[expr.log.and] Logical AND operator
The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
The key here is the second paragraph: sequenced after/sequenced before is standardese-speak for ordering the observable events.
According to the standard:
5.14 Logical AND operator:
1 The && operator groups left-to-right. The operands are both contextually converted to bool.
The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
2 The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the
second expression.
So, according to these rules, the compiler will generate code in which function_with_side_effects() is evaluated.
The expression function_with_side_effects() && false is the same as function_with_side_effects() except that the result value is unconditionally false. The function call cannot be eliminated. The left operand of && is always evaluated; it is only the right operand whose evaluation is conditional.
Perhaps you're really asking about false && function_with_side_effects()?
In this expression, the function call must not happen. This is obvious statically, from the false. Since it must not happen, the translation of the code can be such that the function isn't referenced in the generated code.
However, suppose that we have false && nonexistent_function() which is the only reference to nonexistent_function, and that function isn't defined anywhere. If this is completely optimized away, then the program will link. Thereby, the implementation will fail to diagnose a violation of the One Definition Rule.
So I suspect that, for conformance, the symbol still has to be referenced, even if it isn't used in the code.
In particular, the prefix operators' return-by-reference makes sense to me - it's useful, in case one wants to do further operations on the object.
However, I can't get my head around why the postfix operator was designed to return by value.
Is it solely a convention, or was there a good reason why it was designed this way (like a return-by-value does not make sense for postfix, but makes sense for prefix)?
Can someone explain?
ANSWER
Thanks to the answers below, it appears that the postfix operator doesn't necessarily have to return by value (according to the standard).
However, due to the semantic requirements of the postfix operator (return the original value, while the object itself is incremented), in conjunction with the standard requirement that:
Operator overloads are functions, and thus all side effects must take place before the function completes.
as explained clearly by David Rodriguez below, copying the value seems to be a necessary consequence of the semantic requirements.
In this context, since we are returning the old value (not the original object, which will have changed by the closing brace of the function), returning that old value by value makes the most sense.
Postfix operators are expressions that yield the original value and then modify the object. At the same time, operator overloads are functions, and thus all side effects must take place before the function completes. The only way of attaining the required semantics is by copying the initial state of the object before applying the change. The original state must be returned by value (if a reference were returned, the evaluation of the expression would yield the state of the object after the function completes, and thus would have the semantics of prefix operators, not postfix ones).
It is the convention because a typical implementation of postfix increment involves creating a temporary object local to the function, incrementing the originally passed object using the prefix operator, and then returning the temporary object. You cannot return a reference to that local object, hence it needs to be returned by value.
You cannot return a reference because the local object is guaranteed to be alive only within the scope of the function; any access to it beyond that scope results in undefined behavior.
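That is the canonical idiom; a sketch for a hypothetical Counter type:

struct Counter {
    int value = 0;

    // Prefix: modify first, then return *this by reference.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: copy the original state, modify *this via the prefix
    // form, and return the copy. It must be returned by value: `old`
    // is a local object that dies when the function returns.
    Counter operator++(int) {
        Counter old = *this;
        ++*this;
        return old;
    }
};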
The following code is well-defined in C and C++:
int i = 7;
printf("%i\n", i++ + 2);
This will print 9 to the console, while i will be 8. Guaranteed.
The postfix increment/decrement operators modify i, but they return the old value of i. The only way to do that with a class object is to save the current value in a copy, increment *this, and return the saved copy.
Sure, there was a good reason for that.
The post-increment operator does two things:
it increments the variable, and
it returns the variable's old value.
There is no way of returning a reference to the 'old value' of the variable. It's gone.
Is it a standard term which is well defined, or just a term coined by developers to explain a concept (and if so, what is the concept)? As I understand it, this has something to do with the all-confusing sequence points, but I'm not sure.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
For reference, some questions talking about side effects:
Is comma operator free from side effect?
Force compiler to not optimize side-effect-less statements
Side effects when passing objects to function in C++
A "side effect" is defined by the C++ standard in [intro.execution], by:
Reading an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
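Each clause of that definition is easy to exhibit in a short sketch (hw_register is a made-up volatile object standing in for, say, a memory-mapped register):

#include <cstdio>

volatile int hw_register;  // hypothetical memory-mapped register

void demo(int& x) {
    int r = hw_register;     // side effect: reading a volatile glvalue
    x = 42;                  // side effect: modifying an object
    std::printf("%d\n", r);  // side effect: calling a library I/O function
}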
The term "side-effect" arises from the distinction between imperative languages and pure functional languages. A C++ expression can do three things:
compute a result (or compute "no result" in the case of a void expression),
raise an exception instead of evaluating to a result,
in addition to 1 or 2, otherwise alter the state of the abstract machine on which the program is nominally running.
(3) are side-effects, the "main effect" being to evaluate the result of the expression. Exceptions are a slightly awkward special case, in that altering the flow of control does change the state of the abstract machine (by changing the current point of execution), but isn't a side-effect. The code to construct, handle and destroy the exception may have its own side-effects, of course.
The same principles apply to functions, with the return value in place of the result of the expression.
So, int foo(int a, int b) { return a + b; } just computes a return value, it doesn't alter anything else. Therefore it has no side-effects, which sometimes is an interesting property of a function when it comes to reasoning about your program (e.g. to prove that it is correct, or by the compiler when it optimizes). int bar(int &a, int &b) { return ++a + b; } does have a side-effect, since modifying the caller's object a is an additional effect of the function beyond simply computing a return value. It would not be permitted in a pure functional language.
The stuff in your quote about "has finished being evaluated" refers to the fact that the result of an expression (or return value of a function) can be a "temporary object", which is destroyed at the end of the full expression in which it occurs. So creating a temporary isn't a "side-effect" by that definition: other changes are.
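For example (make_name is a hypothetical function), the temporary below is the expression's result, not a side effect:

#include <string>

std::string make_name() { return "example"; }  // hypothetical

void demo() {
    // The temporary returned by make_name() is the "main effect" of the
    // call expression; it lives until the end of the full expression,
    // so indexing it here is fine. Any I/O or mutation done inside
    // make_name would be a side effect.
    char first = make_name()[0];
    // The temporary was destroyed at the end of that full expression.
    (void)first;
}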
What exactly is a 'side-effect' in C++? Is it a standard term which is well defined...
c++11 draft - 1.9.12: Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access to a volatile object is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
The significance is that, as expressions are evaluated, they can modify the program state and/or perform I/O. Expressions are allowed in myriad places in C++: variable assignments, if/else/while conditions, for loop setup/test/modify steps, function arguments, etc. A couple of examples: ++x and strcat(buffer, "append this").
In a C++ program, the Standard grants the optimiser the right to generate code representing the program operations, but requires that all the operations associated with steps before a sequence point appear before any operations related to steps after the sequence point.
The reason C++ programmers tend to have to care about sequence points and side effects is that there aren't as many sequence points as you might expect. For example: given x = 1; f(++x, ++x);, you may expect a call to f(2, 3), but it's actually undefined behaviour. This behaviour is left undefined so the compiler's optimiser has more freedom to arrange operations with side effects to run in the most efficient order possible - perhaps even in parallel. It also avoids burdening compiler writers with detecting such conditions.
1.Is comma operator free from side effect?
Yes - the comma operator introduces a sequence point: the steps on the left must be complete before those on the right execute (see the sketch below). There is a list of sequence points at http://en.wikipedia.org/wiki/Sequence_point - you should read it! (If you have to ask about side effects, then be careful in interpreting this answer - the "comma operator" is NOT what appears between function arguments, array initialisation elements etc. The comma operator is relatively rarely used and somewhat obscure. Do some reading if you're not sure what the comma operator really is.)
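A sketch of that distinction (f is hypothetical):

int f(int a, int b) { return a + b; }  // hypothetical

int main() {
    int a = 0;

    // Comma OPERATOR: sequence point between the sides; ++a completes
    // before a * 10 is evaluated, so x == 10 and a == 1.
    int x = (++a, a * 10);

    // NOT the comma operator: these commas merely separate arguments.
    // Pre-C++17, two side effects on a in the argument list would be
    // unsequenced and therefore UB, e.g. f(++a, ++a).
    return f(x, a);
}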
2.Force compiler to not optimize side-effect-less statements
I assume you mean "side-effect-ful" statements. Compilers are not obliged to support any such option. What behaviour would they exhibit if they tried? The Standard doesn't define what they should do in such situations. Sometimes a majority of programmers might share an intuitive expectation, but other times it's really arbitrary.
3.Side effects when passing objects to function in C++
When calling a function, all the arguments must have been completely evaluated - and their side effects triggered - before the function call takes place. BUT, there are no restrictions requiring the compiler to evaluate one argument expression completely before starting on any other: the evaluations can overlap, run in parallel, etc. So, in f(expr1, expr2), some of the steps in evaluating expr2 might run before anything from expr1, while expr1 might still complete first - the order is unspecified.
1.9.6
The observable behavior of the abstract machine is its sequence of
reads and writes to volatile data and calls to library I/O
functions.
A side-effect is anything that affects observable behavior.
Note that there are exceptions specified by the standard, where observable behavior doesn't have to conform to that of the abstract machine - see return value optimization, temporary copy elision.