In cppref, the following holds until C++17:
code such as f(std::shared_ptr<int>(new int(42)), g()) can cause a
memory leak if g gets called after new int(42) and throws an
exception, while f(std::make_shared<int>(42), g()) is safe, since
two function calls are never interleaved.
I'm wondering which change introduced in C++17 renders this no longer applicable.
The evaluation order of function arguments was changed by P0400R0.
Before the change, the evaluations of function arguments were unsequenced relative to one another. This means the evaluation of g() could be inserted into the evaluation of std::shared_ptr<int>(new int(42)), which causes the situation described in your quoted context.
After the change, the evaluations of function arguments are indeterminately sequenced with no interleaving, which means all side effects of std::shared_ptr<int>(new int(42)) take place either before or after those of g(). Now consider the case where g() may throw.
If all side effects of std::shared_ptr<int>(new int(42)) take place before those of g(), the memory allocated will be deallocated by the destructor of std::shared_ptr<int>.
If all side effects of std::shared_ptr<int>(new int(42)) take place after those of g(), no memory is allocated at all.
In either case, there is no memory leak.
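The two cases can be made concrete with a minimal sketch, assuming a C++17 compiler. The names Tracked, live_objects, throwing_g and leaked_after_call are hypothetical and not part of the question; Tracked only counts live instances so that a leak would be visible.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical counter: number of Tracked objects currently alive.
int live_objects = 0;

struct Tracked {
    Tracked()  { ++live_objects; }
    ~Tracked() { --live_objects; }
};

// Stand-in for g(): always throws.
int throwing_g() { throw std::runtime_error("g failed"); }

// Stand-in for f(): its body is never reached, because g() throws first.
void f(std::shared_ptr<Tracked>, int) {}

// Returns the number of Tracked objects still alive after the failed call.
int leaked_after_call() {
    try {
        f(std::shared_ptr<Tracked>(new Tracked), throwing_g());
    } catch (const std::runtime_error&) {
        // Either the shared_ptr was fully built and then destroyed during
        // stack unwinding, or throwing_g() ran first and nothing was allocated.
    }
    return live_objects;
}
```

Under the C++17 rules leaked_after_call() returns 0 whichever argument is evaluated first.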
The P0145R3 paper (which was accepted into C++17) refines the order of evaluation of several C++ constructs, including
Postfix expressions are evaluated from left to right. This includes functions calls and member
selection expressions
Specifically, the paper adds the following text to 5.2.2/4 paragraph of the standard:
The postfix-expression is sequenced before each expression in the
expression-list and any default argument. Every value computation and
side effect associated with the initialization of a parameter, and the
initialization itself, is sequenced before every value computation and
side effect associated with the initialization of any subsequent
parameter.
So I understand that re-usage of a variable that has been post incremented is undefined behavior in a function call. My understanding is this is not a problem in constructors. My question is about tie which is oddly halfway between each.
Given: pair<int, int> func() can I do:
tie(*it++, *it) = func();
Or is that undefined behavior?
Since C++17, this code has unspecified behavior. There are two possible outcomes:
the first argument is the result of dereferencing the original iterator, the second argument is the result of dereferencing the incremented iterator; or
the first argument and the second argument are both the result of dereferencing the original iterator.
Per [expr.call]/8:
[...] The initialization of a parameter, including every associated
value computation and side effect, is indeterminately sequenced with
respect to that of any other parameter. [...]
So the second argument to tie may be either the result of dereferencing the incremented iterator or the original iterator.
Prior to C++17, the situation was a bit complicated:
if both ++ and * invoke a function (e.g., when the type of it is a sophisticated class), then the behavior was unspecified, similar to the case since C++17;
otherwise, the behavior was undefined.
Per N4140 (C++14 draft) [expr.call]/8:
[ Note: The evaluations of the postfix expression and of the
arguments are all unsequenced relative to one another. All side
effects of argument evaluations are sequenced before the function is
entered (see [intro.execution]). — end note ]
Thus, the code had undefined behavior because the evaluation of one argument was unsequenced relative to the other. The evaluations of the two arguments could overlap, so the modification of it by it++ was unsequenced relative to the read of it in *it. Unless it is specified otherwise ...
Per N4140 [intro.execution]/15:
When calling a function (whether or not the function is inline), every
value computation and side effect associated with any argument
expression, or with the postfix expression designating the called
function, is sequenced before execution of every expression or
statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument
expressions are unsequenced. — end note ] Every evaluation
in the calling function (including other function calls) that is not
otherwise specifically sequenced before or after the execution of the
body of the called function is indeterminately sequenced with respect
to the execution of the called function. [footnote 9] Several
contexts in C++ cause evaluation of a function call, even though no
corresponding function call syntax appears in the translation unit.
[ Example: Evaluation of a new expression invokes one or more
allocation and constructor functions; see [expr.new]. For another
example, invocation of a conversion function ([class.conv.fct]) can
arise in contexts in which no function call syntax appears. — end
example ] The sequencing constraints on the execution of the called
function (as described above) are features of the function
calls as evaluated, whatever the syntax of the expression that calls
the function might be.
[footnote 9] In other words, function executions do not interleave with each
other.
Thus, if the operators are actually function calls, then the behavior is similarly unspecified.
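If the intent is to assign the pair to two consecutive elements, one way to sidestep the question in every language version is to put the iterator operations into separate statements. A minimal sketch, where the values returned by func and the helper name assign_two are assumptions made for illustration:

```cpp
#include <cassert>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for the func() in the question.
std::pair<int, int> func() { return {1, 2}; }

// Writes the pair into *it and *(it + 1); returns the iterator past both.
std::vector<int>::iterator assign_two(std::vector<int>::iterator it) {
    auto first = it++;   // the increment is complete at the end of this statement
    auto second = it++;  // so this reads the already-incremented iterator
    std::tie(*first, *second) = func();
    return it;
}
```

Each declaration ends a full-expression, so the order of the two increments no longer depends on how function arguments are sequenced.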
Look at this simple function call:
f(a(), b());
According to the standard, call order of a() and b() is unspecified. C++17 has the additional rule which doesn't allow a() and b() to be interleaved. Before C++17, there was no such rule, as far as I know.
Now, look at this simple code:
int v = 0;
int fn() {
int t = v+1;
v = t;
return 0;
}
void foo(int, int) { }
int main() {
foo(fn(), fn());
}
With C++17 rules, v surely will have the value of 2 after the call of foo. But, it makes me wonder, with pre-C++17, is the same guaranteed? Or could it be that v ends up 1? Does it make a difference, if instead of int t = v+1; v = t;, we just have v++?
Function calls were not allowed to interleave in previous versions either.
Quoting from C++11 final draft (n3337)
1.9 Program execution [intro.execution]
...
15. ...
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ]
Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function. [footnote 9]
[footnote 9] In other words, function executions do not interleave with each other.
Similar wording can be found in the final draft of the C++14 version as well.
v must be 2 in all versions of C++ (and C). The function fn() must be executed twice, and clearly each time it executes it will increment v. There is no multithreading here, no data race, and no possibility for fn() to be executed only partially and then interrupted while the other invocation of fn() proceeds.
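A minimal sketch of the question's program with both variants of the increment. The helper count_after_two_calls and the name fn_inc are hypothetical; they exist only so the final value of v can be checked for each variant.

```cpp
#include <cassert>

int v = 0;

// Variant from the question: read, add, write back in separate steps.
int fn() {
    int t = v + 1;
    v = t;
    return 0;
}

// Variant using a single post-increment.
int fn_inc() {
    v++;
    return 0;
}

void foo(int, int) {}

// Resets v, passes two calls of the given function to foo, returns v.
int count_after_two_calls(int (*f)()) {
    v = 0;
    foo(f(), f());
    return v;
}
```

Both count_after_two_calls(fn) and count_after_two_calls(fn_inc) yield 2, because each call runs to completion before the other starts.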
C++17 has the additional rule which doesn't allow a() and b() to be interleaved. Before C++17, there was no such rule, as far as I know.
There were rules that applied here, though the wording and some details have grown more exact.
C++03 [intro.execution]/8:
Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed. [footnote 8]
[footnote 8]: In other words, function executions do not interleave with each other.
Though you could argue this doesn't actually say anything about other functions called from the calling function in the text, and footnotes are officially not normative.
C++11 changed the wording, largely because it introduced multithreading semantics. C++11 and C++14 [intro.execution]/15, emphasis mine:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [Note: Value computations and side effects associated with different argument expressions are unsequenced. - end note] Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function. [footnote 9]
[footnote 9] In other words, function executions do not interleave with each other.
This wording, I think, leaves no doubt, at least in most cases.
C++17 [intro.execution]/18:
When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. For each function invocation F, for every evaluation A that occurs within F and every evaluation B that does not occur within F but is evaluated on the same thread and as part of the same signal handler (if any), either A is sequenced before B or B is sequenced before A. [footnote 10] [Note: If A and B would not otherwise be sequenced then they are indeterminately sequenced. - end note]
[footnote 10] In other words, function executions do not interleave with each other.
This is a much more general statement about all evaluations in separate functions, not just arguments in a function call. But as far as I know, this more precise wording just clarifies some possibly ambiguous cases, but doesn't really change semantic behavior much.
I suspect we are confusing two concepts. C++17 fixed the evaluation order only for certain kinds of expressions. Interleaving of the statements of two function calls has always been disallowed, though once the optimiser gets hold of your code, only the effect is guaranteed.
If you write a(); b() then a() will be fully executed before b() is then fully executed. The same holds if you write a(), b() or a() && b() (in the latter case b() runs only if a() yields true).
If you write a() + b() then there is no guarantee that a() will be executed before b() (C++17 did not change this for +), but there is a guarantee that whichever is executed first will be fully completed before the other is executed.
If you write c(a(), b()) it [is my understanding] that again there is no guarantee that a() will be executed before b(), but there /is/ the guarantee that the effects of the first executed function (whichever that is) will complete before the second is started, and of course there is the guarantee that c() will not be executed until both a() and b() have completed.
The optimiser should honour the intent of that guarantee, though it may take the statements of a() and b(), and even c(), and intermingle them, the effect of running the code should be the same as if they were run in sequence. Could the -O0 code do a() then b() and the -O3 code do b() then a()? Perhaps, though that would make the debugging quite difficult, so I would hope not.
The compiler only has to give the effect in the final result, and it is possible in multithreading for another thread to observe the effects of consecutive lines of code occur out of order unless specific thread-aware constructs are used.
It is my understanding that the optimiser can't allow the specific effects of a(), b(), and c() to appear to occur out of order between each function, though the order of a() and b() as arguments is not fixed, even in C++17: all effects of b() might occur before all effects of a().
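The guarantee for c(a(), b()) can be sketched by recording each step in a trace. The names trace, run_trace and is_permitted are hypothetical; the two accepted strings are the only orders in which the calls may appear to run.

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the order in which the steps happen

int a() { trace += "a1 "; trace += "a2 "; return 1; }
int b() { trace += "b1 "; trace += "b2 "; return 2; }
int c(int x, int y) { trace += "c"; return x + y; }

// Calls c(a(), b()) and returns the recorded trace.
std::string run_trace() {
    trace.clear();
    c(a(), b());
    return trace;
}

// True if the trace is one of the two orders the standard permits:
// a() entirely before b(), or b() entirely before a(), and c() last.
bool is_permitted(const std::string& t) {
    return t == "a1 a2 b1 b2 c" || t == "b1 b2 a1 a2 c";
}
```

A trace such as "a1 b1 a2 b2 c" would mean the two calls interleaved, which is exactly what the quoted wording rules out.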
I am very much thrilled with how scoped_lock works and was wondering whether a similar implementation could be used to time the execution of a particular piece of code.
Say I implement a simple class scoped_timer which starts a timer on construction and, on destruction, stops it and reports the time elapsed. Would this sample code then be timed correctly?
void func()
{
    // some code
    {
        scoped_timer a;
        // some code that does not include a
    }
    // some code
}
In practice, am I guaranteed that scoped_timer a is constructed at the beginning and destructed exactly when it goes out of scope? Can the compiler decide to reorder the code in such a way as not to destruct it exactly at the end of scope or construct it at the beginning, since there is no dependence on the object a? Are there guarantees from the C++ standard?
Thanks
Daniel
The code is guaranteed to behave as you would like.
This guarantee is important in C++, because C++ is not a functional programming language, due to the fact that almost any function in C++ can have side effects (either from the flow of execution of the current thread, or from other threads or even other processes, whether or not the data is declared as volatile). Because of this, the language specification makes guarantees about the sequencing of full expressions.
To piece this together from the C++11 standard, there are a number of clauses that must be considered together.
The most important clause is §1.9:
§1.9 Program execution [intro.execution]
1 The semantic descriptions in this International Standard define a
parameterized nondeterministic abstract machine. This International
Standard places no requirement on the structure of conforming
implementations. In particular, they need not copy or emulate the
structure of the abstract machine. Rather, conforming implementations
are required to emulate (only) the observable behavior of the abstract
machine as explained below. * (<-- the footnote is in the standard itself)
* This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this
International Standard as long as the result is as if the requirement
had been obeyed, as far as can be determined from the observable
behavior of the program. For instance, an actual implementation need
not evaluate part of an expression if it can deduce that its value is
not used and that no side effects affecting the observable behavior of
the program are produced.
(The bolding of the text is mine.)
This clause imposes two important requirements that are relevant for this question.
If an expression may have side effects, it will be evaluated. In your case, the expression scoped_timer a; may have side effects, so it will be evaluated.
"...conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.", where "below" includes Clauses 13 and 14 of the same section:
§1.9.13 Sequenced before is an asymmetric, transitive, pair-wise
relation between evaluations executed by a single thread (1.10), which
induces a partial order among those evaluations. Given any two
evaluations A and B, if A is sequenced before B, then the execution of
A shall precede the execution of B. If A is not sequenced before B and
B is not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. —end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. —end note ]
§1.9.14 Every value computation and side effect associated with a
full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated. * (<-- the footnote here is not relevant)
Therefore, your expression scoped_timer a; (which is a full expression) may have side effects and will be evaluated; so the computation of the value of a will be sequenced before any of the following statements in the block.
Regarding destruction of the object a, that is simpler.
§3.7.3.3 If a variable with automatic storage duration has initialization or a destructor with side effects, it shall not be
destroyed before the end of its block, nor shall it be eliminated as
an optimization even if it appears to be unused, except that a class
object or its copy/move may be eliminated as specified in 12.8.
This makes clear that the destructor will not be called until the block exits.
ADDENDUM And to confirm that all block-level variables are destroyed (and their destructor called) at the end of block scope, here it is in the C++11 standard:
§3.7.3.1 Block-scope variables explicitly declared register or not explicitly declared static or extern have automatic storage duration.
The storage for these entities lasts until the block in which they are
created exits.
§3.7.3.2 [ Note: These variables are initialized and destroyed as described in 6.7. —end note ]
... and the above-mentioned §6.7:
§6.7.2 Variables with automatic storage duration (3.7.3) are initialized each time their declaration-statement is executed.
Variables with automatic storage duration declared in the block are
destroyed on exit from the block (6.6).
The block is defined as all code between a pair of curly braces {} here:
§6.3.1 So that several statements can be used where one is expected, the compound statement (also, and equivalently, called “block”) is provided.
compound-statement:
{ statement-seq }
statement-seq:
statement
statement-seq statement
A compound statement defines a block scope (3.3).
Note: The compound-statement (etc) section takes a while to get used to, but the important point is that here, the open curly brace { and close curly brace } actually mean a literal open curly brace and close curly brace in the code. This is the exact place in the C++11 standard where block scope is defined as the sequence of statements between curly braces.
Putting the pieces together: Because the standard, as quoted above, says The storage for these entities lasts until the block in which they are created exits and that Variables with automatic storage duration declared in the block are destroyed on exit from the block, you are assured that the object a in your question (and ANY block-level object) will last until the end of the block, and will be destroyed and have its destructor called when the block exits.
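For illustration, a minimal sketch of what such a scoped_timer might look like. The question does not show its implementation, so the constructor taking an output reference, the use of std::chrono::steady_clock and the helper time_block are all assumptions.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical scoped_timer: starts timing on construction and stores the
// elapsed time into `out` on destruction.
class scoped_timer {
public:
    explicit scoped_timer(std::chrono::nanoseconds& out)
        : out_(out), start_(std::chrono::steady_clock::now()) {}

    ~scoped_timer() {
        out_ = std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now() - start_);
    }

    scoped_timer(const scoped_timer&) = delete;
    scoped_timer& operator=(const scoped_timer&) = delete;

private:
    std::chrono::nanoseconds& out_;
    std::chrono::steady_clock::time_point start_;
};

// Times a callable by running it inside a block that owns a scoped_timer.
template <typename Work>
std::chrono::nanoseconds time_block(Work work) {
    std::chrono::nanoseconds elapsed{-1};  // sentinel: nothing reported yet
    {
        scoped_timer a(elapsed);
        work();  // code that does not mention `a`
    }            // `a` is destroyed here and writes the elapsed time
    return elapsed;
}
```

Because the destructor has a side effect (it writes through out_), the quoted §3.7.3.3 applies: the object may not be destroyed before the closing brace of the inner block, so the sentinel is always overwritten by the time time_block returns.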
Given the following function call:
f(g(), h())
since the order of evaluation of function arguments is unspecified (still the case in C++11 as far as I'm aware), could an implementation theoretically execute g() and h() in parallel?
Such a parallelisation could only kick in were g and h known to be fairly trivial (in the most obvious case, accessing only data local to their bodies) so as not to introduce concurrency issues but, beyond that restriction I can't see anything to prohibit it.
So, does the standard allow it? Even if only by the as-if rule?
(In this answer, Mankarse asserts otherwise; however, he does not cite the standard, and my read-through of [expr.call] hasn't revealed any obvious wording.)
The requirement comes from [intro.execution]/15:
... When calling a function ... Every evaluation in the calling function (including other function calls) that is not otherwise specifically sequenced before or after the execution of the body of the called function is indeterminately sequenced with respect to the execution of the called function [Footnote: In other words, function executions do not interleave with each other.].
So any execution of the body of g() must be indeterminately sequenced with (that is, not overlapping with) the evaluation of h() (because h() is an expression in the calling function).
The critical point here is that g() and h() are both function calls.
(Of course, the as-if rule means that the possibility cannot be entirely ruled out, but it should never happen in a way that could affect the observable behaviour of a program. At most, such an implementation would just change the performance characteristics of the code.)
As long as you can't tell, whatever the compiler does to evaluate these functions is entirely up to the compiler. Clearly, the evaluation of the functions cannot involve any access to shared, mutable data as this would introduce data races. The basic guiding principle is the "as if"-rule and the fundamental observable operations, i.e., access to volatile data, I/O operations, access to atomic data, etc. The relevant section is 1.9 [intro.execution].
Not unless the compiler knew exactly what g(), h(), and anything they call does.
The two expressions are function calls, which may have unknown side effects. Therefore, parallelizing them could cause a data-race on those side effects. Since the C++ standard does not allow argument evaluation to cause a data-race on any side effects of the expressions, the compiler can only parallelize them if it knows that no such data race is possible.
That means walking through each function and looking at exactly what it does and/or calls, then tracking through those functions, etc. In the general case, it's not feasible.
Easy answer: when the functions are sequenced, even if indeterminately, there is no possibility for a race condition between the two, which is not true if they are parallelized. Even a pair of one line "trivial" functions could do it.
void g()
{
*p = *p + 1;
}
void h()
{
*p = *p - 1;
}
If p is a name shared by g and h, then a sequential calling of g and h in any order will result in the value pointed to by p not changing. If they are parallelized, the reading of *p and the assigning of it could be interleaved arbitrarily between the two:
g reads *p and finds the value 1.
h reads *p and also finds the value 1.
g writes 2 to *p.
h, still using the value 1 it read before, will write 0 to *p.
Thus, the behavior is different when they are parallelized.
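The difference can be reproduced deterministically without any threads. In this sketch the interleaving is written out by hand rather than produced by real parallel execution; shared_value, run_sequenced and run_interleaved are hypothetical names.

```cpp
#include <cassert>

int shared_value = 1;
int* p = &shared_value;

void g() { *p = *p + 1; }
void h() { *p = *p - 1; }

// Sequenced execution: g() and h() each run to completion, in either order.
int run_sequenced(bool g_first) {
    *p = 1;
    if (g_first) { g(); h(); } else { h(); g(); }
    return *p;
}

// Hand-written simulation of the interleaving described above:
// both functions read before either one writes.
int run_interleaved() {
    *p = 1;
    int g_read = *p;   // g reads 1
    int h_read = *p;   // h reads 1
    *p = g_read + 1;   // g writes 2
    *p = h_read - 1;   // h writes 0, overwriting g's update
    return *p;
}
```

Both sequenced orders leave the value at 1, while the simulated interleaving ends at 0.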
Is it a standard term which is well defined, or just a term coined by developers to explain a concept (.. and what is the concept)? As I understand this has something to do with the all-confusing sequence points, but am not sure.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
For reference, some questions talking about side effects:
Is comma operator free from side effect?
Force compiler to not optimize side-effect-less statements
Side effects when passing objects to function in C++
A "side effect" is defined by the C++ standard in [intro.execution], by:
Reading an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.
The term "side-effect" arises from the distinction between imperative languages and pure functional languages. A C++ expression can do three things:
1. compute a result (or compute "no result" in the case of a void expression),
2. raise an exception instead of evaluating to a result,
3. in addition to 1 or 2, otherwise alter the state of the abstract machine on which the program is nominally running.
(3) are side-effects, the "main effect" being to evaluate the result of the expression. Exceptions are a slightly awkward special case, in that altering the flow of control does change the state of the abstract machine (by changing the current point of execution), but isn't a side-effect. The code to construct, handle and destroy the exception may have its own side-effects, of course.
The same principles apply to functions, with the return value in place of the result of the expression.
So, int foo(int a, int b) { return a + b; } just computes a return value, it doesn't alter anything else. Therefore it has no side-effects, which sometimes is an interesting property of a function when it comes to reasoning about your program (e.g. to prove that it is correct, or by the compiler when it optimizes). int bar(int &a, int &b) { return ++a + b; } does have a side-effect, since modifying the caller's object a is an additional effect of the function beyond simply computing a return value. It would not be permitted in a pure functional language.
The stuff in your quote about "has finished being evaluated" refers to the fact that the result of an expression (or return value of a function) can be a "temporary object", which is destroyed at the end of the full expression in which it occurs. So creating a temporary isn't a "side-effect" by that definition: other changes are.
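The foo and bar functions mentioned above can be put side by side in a short sketch. The helper a_after_bar is hypothetical and only makes the side effect observable as a return value.

```cpp
#include <cassert>

// No side effects: only computes a return value.
int foo(int a, int b) { return a + b; }

// Side effect: modifies the caller's object `a` in addition to returning.
int bar(int& a, int& b) { return ++a + b; }

// Hypothetical helper: returns the value of `a` after one call to bar.
int a_after_bar(int a, int b) {
    bar(a, b);
    return a;
}
```

Calling foo(1, 2) any number of times gives 3 and changes nothing else, whereas bar changes its first argument on every call, so a_after_bar(1, 2) returns 2.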
What exactly is a 'side-effect' in C++? Is it a standard term which is well defined...
c++11 draft - 1.9.12: Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access to a volatile object is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.
I found one definition here, but doesn't this make each and every statement of code a side effect?
A side effect is a result of an operator, expression, statement, or function that persists even after the operator, expression, statement, or function has finished being evaluated.
Can someone please explain what the term 'side effect' formally means in C++, and what is its significance?
The significance is that, as expressions are being evaluated they can modify the program state and/or perform I/O. Expressions are allowed in myriad places in C++: variable assignments, if/else/while conditions, for loop setup/test/modify steps, function parameters etc.... A couple examples: ++x and strcat(buffer, "append this").
In a C++ program, the Standard grants the optimiser the right to generate code representing the program operations, but requires that all the operations associated with steps before a sequence point appear before any operations related to steps after the sequence point.
The reason C++ programmers tend to have to care about sequence points and side effects is that there aren't as many sequence points as you might expect. For example: given x = 1; f(++x, ++x);, you may expect a call to f(2, 3) but it's actually undefined behaviour. This behaviour is left undefined so the compiler's optimiser has more freedom to arrange operations with side effects to run in the most efficient order possible - perhaps even in parallel. It also avoids burdening compiler writers with detecting such conditions.
1.Is comma operator free from side effect?
Yes - a comma operator introduces a sequence point: the steps on the left must be complete before those on the right execute. There is a list of sequence points at http://en.wikipedia.org/wiki/Sequence_point - you should read this! (If you have to ask about side effects, then be careful in interpreting this answer - the "comma operator" is NOT invoked between function arguments, array initialisation elements etc.. The comma operator is relatively rarely used and somewhat obscure. Do some reading if you're not sure what the comma operator really is.)
2.Force compiler to not optimize side-effect-less statements
I assume you mean "side-effect-ful" statements. Compilers are not obliged to support any such option. What behaviour would they exhibit if they tried? - the Standard doesn't define what they should do in such situations. Sometimes a majority of programmers might share an intuitive expectation, but other times it's really arbitrary.
3.Side effects when passing objects to function in C++
When calling a function, all the arguments must have been completely evaluated - and their side effects triggered - before the function call takes place. BUT, there are no restrictions on the compiler related to evaluating specific argument expressions before any other. They can be overlapping, in parallel etc.. So, in f(expr1, expr2) - some of the steps in evaluating expr2 might run before anything from expr1, but expr1 might still complete first - the order is unspecified.
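The difference between the comma operator and the comma that separates function arguments can be sketched as follows. The function names are hypothetical, and the second example deliberately increments two different variables so that its result does not depend on the evaluation order.

```cpp
#include <cassert>

// Comma operator: the left operand is fully evaluated (side effects included)
// before the right operand, so this is well defined.
int comma_operator_demo() {
    int x = 1;
    int result = (++x, x * 10);  // ++x completes first, then 2 * 10
    return result;
}

// Hypothetical two-argument function used only to show the separator case.
int sum(int a, int b) { return a + b; }

// Here the comma separates arguments; it is not the comma operator.
// The arguments touch different variables, so either order gives 2 + 2.
int argument_separator_demo() {
    int x = 1, y = 1;
    return sum(++x, ++y);
}
```

Writing sum(++x, ++x) instead would run into exactly the problem described above, because both arguments would modify the same object.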
1.9.6
The observable behavior of the abstract machine is its sequence of
reads and writes to volatile data and calls to library I/O
functions.
A side-effect is anything that affects observable behavior.
Note that there are exceptions specified by the standard, where observable behavior doesn't have to conform to that of the abstract machine - see return value optimization, temporary copy elision.