This question already has answers here:
Undefined behavior and sequence points
(5 answers)
Closed 6 years ago.
I am confused about the output of the code.
It depends on what compiler i run the code. Why is it so?
#include <iostream>
using namespace std;
int f(int &n)
{
n--;
return n;
}
int main()
{
int n=10;
n=n-f(n);
cout<<n;
return 0;
}
Running it on the Ubuntu terminal with g++, the output is 1 whereas running it on Turbo C++ ( the compiler we used in school) gives output as 0.
In C++03, modifying a variable and also using its value in the same expression, without an intervening C++03 sequence point, was Undefined Behavior.
C++03 §5/4:
” Between the previous
and next sequence point a scalar object shall have its stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
Undefined Behavior, UB, provides the compiler with an opportunity to optimize, because it can assume that UB does not occur in a valid program.
However, with all the myriad UB rules of C++ it's difficult to reason about source code.
In C++11 sequence points were replaced with sequenced before, indeterminately sequenced and unsequenced relations:
C++11 §1.9/3
” Given any two evaluations A and B, if
A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before
B and B is not sequenced before A, then A and B are unsequenced. [Note: The execution of unsequenced
evaluations can overlap. —end note ] Evaluations A and B are indeterminately sequenced when either A
is sequenced before B or B is sequenced before A, but it is unspecified which.
And with the new C++11 sequence relationship rules the modification in the function in the code in question is indeterminately sequenced with respect to the use of the variable, and so the code has unspecified behavior rather than Undefined Behavior, as noted by Eric M Schmidt in a comment to (the first version of) this answer. Essentially that means that there is no danger of nasal daemons or other possible UB effects, and that the behavior is a reasonable one. The two possible behaviors here are that the modification via the function call is done before the use of the value, or that it's done after the use of the value.
Why it's unspecified behavior:
C++11 §1.9/15:
” Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
What “unspecified behavior” means:
C++11 §1.3.25:
” unspecified behavior
Behavior, for a well-formed program construct and correct data, that depends on the implementation
[Note: The implementation is not required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. —end note ]
Why the modification effected by the assignment is not problematic:
C++11 §5.17/1
” In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
This is also quite different from C++03.
As the rather drastic edit of this answer shows, following Eric's comment, this kind of issue is not simple! The main advice I can give is to as much as possible just Say No™ to effects governed by subtle or very complex rules, the corners of the language. Simple code has a better chance of being correct, while so called clever code does not have a good chance of being significantly faster.
Related
So I understand that re-usage of a variable that has been post incremented is undefined behavior in a function call. My understanding is this is not a problem in constructors. My question is about tie which is oddly halfway between each.
Given: pair<int, int> func() can I do:
tie(*it++, *it) = func();
Or is that undefined behavior?
Since C++17, this code has unspecified behavior. There are two possible outcomes:
the first argument is the result of dereferencing the original iterator, the second argument is the result of dereferencing the incremented iterator; or
the first argument and the second argument are both the result of dereferencing the original iterator.
Per [expr.call]/8:
[...] The initialization of a parameter, including every associated
value computation and side effect, is indeterminately sequenced with
respect to that of any other parameter. [...]
So the second argument to tie may be either the result of dereferencing the incremented iterator or the original iterator.
Prior to C++17, the situation was a bit complicated:
if both ++ and * invoke a function (e.g., when the type of it is a sophisticated class), then the behavior was unspecified, similar to the case since C++17;
otherwise, the behavior was undefined.
Per N4140 (C++14 draft) [expr.call]/8:
[ Note: The evaluations of the postfix expression and of the
arguments are all unsequenced relative to one another. All side
effects of argument evaluations are sequenced before the function is
entered (see [intro.execution]). — end note ]
Thus, the code was undefined behavior because the evaluation of one argument was unsequenced with the other. The evaluation of the two arguments may overlap, resulting in a data race. Unless it is specified otherwise ...
Per N4140 [intro.execution]/15:
When calling a function (whether or not the function is inline), every
value computation and side effect associated with any argument
expression, or with the postfix expression designating the called
function, is sequenced before execution of every expression or
statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument
expressions are unsequenced. — end note ] Every evaluation
in the calling function (including other function calls) that is not
otherwise specifically sequenced before or after the execution of the
body of the called function is indeterminately sequenced with respect
to the execution of the called function.9 Several
contexts in C++ cause evaluation of a function call, even though no
corresponding function call syntax appears in the translation unit.
[
Example: Evaluation of a new expression invokes one or more allocation and constructor functions; see [expr.new]. For another
example, invocation of a conversion function ([class.conv.fct]) can
arise in contexts in which no function call syntax appears. —
end example ] The sequencing constraints on the execution of the called function (as described above) are features of the function
calls as evaluated, whatever the syntax of the expression that calls
the function might be.
9)
In other words, function executions do not interleave with each
other.
Thus, if the operators are actually function calls, then the behavior is similarly unspecified.
Simple question, but surprisingly hard to search for.
For the statement A && B I know there is a sequence point between the evaluation of A and B, and I know that the order of evaluation is left-to-right, but what is a compiler allowed to do when it can prove that B is always false (perhaps even explicitly so)?
Namely, for function_with_side_effects() && false is the compiler allowed to optimize away the function call?
A compiler is allowed to optimise out anything, as long as it doesn't break the as-if rule. The as-if rule states that with respect to observable behaviour, a program must behave as if it was executed by the exact rules of the C++ abstract machine (basically normal, unoptimised semantics of code).
Observable behaviour is:
Access to volatile objects
Writing to files
Input & output on interactive devices
As long as the program does the three things above in correct order, it is allowed to deviate from other source code functionality as much as it wants.
Of course, in practice, the number of operations which must be left intact by the compiler is much larger than the above, simply because the compiler has to assume that any function whose code it cannot see can, potentially, have an observable effect.
So, in your case, unless the compiler can prove that no action inside function_with_side_effects can ever affect observable behaviour (directly or indirectly by e.g. setting a flag tested later), it has to execute a call of function_with_side_effects, because it could violate the as-if rule if it didn't.
As #T.C. correctly pointed out in comments, there are a few exceptions to the as-if rule, when a compiler is allowed to perform optimisations which change observable behaviour; the most commonly encountered among these exceptions being copy elision. However, none of the exceptions come into play in the code in question.
No.
In general, the C++ Standard specifies the result of the computation in terms of observable effects and as long as your code is written in a Standard-compliant way (avoid Undefined Behavior, Unspecified Behavior and Implementation-Defined Behavior) then a compliant compiler has to produce the observable effects in the order they are specified.
There are only two caveats in the Standard: Copy Elision when returning a value allows the compiler to omit a call to the Copy Constructor or to the Move Constructor without a care for their (potential) observable effects.
The compiler is otherwise only allowed to optimize non-observable behavior, such as for example using less CPU registers or not writing a value in a memory location you never read afterward.
Note: in C++, the address of an object can be observed, and is thus considered observable; it's low-level like that.
In your particular case, let's refer to the Standard:
[expr.log.and] Logical AND operator
The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
The key here is (2): sequenced after/sequenced before is standardese speak for ordering the observable events.
According to standard:
5.14 Logical AND operator:
1 The && operator groups left-to-right. The operands are both contextually converted to bool.
The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
2 The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the
second expression.
So, according to these rules compiler will generate code where function_with_side_effects() will be evaluated.
The expression function_with_side_effects() && false is the same as function_with_side_effects() except that the result value is unconditionally false. The function call cannot be eliminated. The left operand of && is always evaluated; it is only the right operand whose evaluation is conditional.
Perhaps you're really asking about false && function_with_side_effects()?
In this expression, the function call must not happen. This is obvious statically, from the false. Since it must not happen, the translation of the code can be such that the function isn't referenced in the generated code.
However, suppose that we have false && nonexistent_function() which is the only reference to nonexistent_function, and that function isn't defined anywhere. If this is completely optimized away, then the program will link. Thereby, the implementation will fail to diagnose a violation of the One Definition Rule.
So I suspect that, for conformance, the symbol still has to be referenced, even if it isn't used in the code.
Here's the sample code:
X * makeX(int index) { return new X(index); }
struct Tmp {
mutable int count;
Tmp() : count(0) {}
const X ** getX() const {
static const X* x[] = { makeX(count++), makeX(count++) };
return x;
}
};
This reports Undefined Behavior on CLang build 500 in the static array construction.
For sake of simplification for this post, the count is not static, but it does not change anything. The error I am receiving is as follows:
test.cpp:8:44: warning: multiple unsequenced modifications to 'count' [-Wunsequenced]
In C++11, this is fine; each clause of an initialiser list is sequenced before the next one, so the evaluation is well-defined.
Historically, the clauses might have been unsequenced, so the two unsequenced modifications of count would give undefined behaviour.
(Although, as noted in the comments, it might have been well-defined even then - you can probably interpret the standard as implying that each clause is a full-expression, and there's a seqeuence point at the end of each full-expression. I'll leave it to historians to debate the finer points of obsolete languages.)
Update 2
So after some research I realized this was actually well defined although the evaluation order is unspecified. It was a pretty interesting putting the pieces together and although there is a more general question covering this for the C++11 case there was not a general question covering the pre C++11 case so I ended up creating a self answer question, Are multiple mutations of the same variable within initializer lists undefined behavior pre C++11 that covers all the details.
Basically, the instinct when seeing makeX(count++), makeX(count++) is to see the whole thing as a full-expression but it is not and therefore each intializer has a sequence point.
Update
As James points out it may not be undefined pre-C++11, which would seem to rely on interpreting the initialization of each element as a full expression but it is not clear you can definitely make that claim.
Original
Pre-C++11 it is undefined behavior to modify a variable more than once within a sequence point, we can see that by looking at the relevant section in an older draft standard would be section 5 Expressions paragraph 4 which says (emphasis mine):
[...]Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.
In the C++11 draft standard this changes and to the following wording from section 1.9 Program execution paragraph 15 says (emphasis mine):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
and we can see that for initializer lists from section 8.5.4 List-initialization paragraph 4 says:
Within the initializer-list of a braced-init-list, the initializer-clauses, including any that result from pack expansions (14.5.3), are evaluated in the order in which they appear. That is, every value computation and side effect associated with a given initializer-clause is sequenced before every value computation and side effect associated with any initializer-clause that follows it in the comma-separated list of the initializer-list.
Because it this case, the , is NOT a sequence point, but acts more like a delimiter in the initialization of the elements of the array.
In other words, you're modifying the same variable twice in a statement without sequence points (between the modifications).
EDIT: thanks to #MikeSeymour: this is an issue in C++03 an before. It seems like in C++11, the order of evaluation is defined for this case.
I am very much trilled with the how a scoped_lock works and was wondering weather a similar implementation can be done as to time a particular code of execution
If say I implement a simple class scoped_timer which on construction initiates a timer and on deletion it stops and report the time elapsed, then would this sample code be timed correctly
func()
{
//some code
{
scoped_timer a;
//some code that does not include a
}
//some code
}
In practice am I guaranteed that scoped_time a is constructed at the beginning and destructed exactly when it is out of scope. Can the compiler decide to reorder the code in such a way as not to destruct it exactly at the end of scope or construct it at the beginning since there is no dependence on the object a? Are there guarantees from C++ standard?
Thanks
Daniel
The code is guaranteed to behave as you would like.
This guarantee is important in C++, because C++ is not a functional programming language, due to the fact that almost any function in C++ can have side effects (either from the flow of execution of the current thread, or from other threads or even other processes, whether or not the data is declared as volatile). Because of this, the language specification makes guarantees about the sequencing of full expressions.
To piece this together from the C++11 standard, there are a number of clauses that must be considered together.
The most important clause is §1.9:
§1.9 Program execution [intro.execution]
1 The semantic descriptions in this International Standard define a
parameterized nondeterministic abstract machine. This International
Standard places no requirement on the structure of conforming
implementations. In particular, they need not copy or emulate the
structure of the abstract machine. Rather, conforming implementations
are required to emulate (only) the observable behavior of the abstract
machine as explained below. * (<-- the footnote is in the standard itself)
* This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this
International Standard as long as the result is as if the requirement
had been obeyed, as far as can be determined from the observable
behavior of the program. For instance, an actual implementation need
not evaluate part of an expression if it can deduce that its value is
not used and that no side effects affecting the observable behavior of
the program are produced.
(The bolding of the text is mine.)
This clause imposes two important requirements that are relevant for this question.
If an expression may have side effects, it will be evaluated. In your case, the expression scoped_timer a; may have side effects, so it will be evaluated.
"...conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.", where "below" includes Clauses 13 and 14 of the same section:
§1.9.13 Sequenced before is an asymmetric, transitive, pair-wise
relation between evaluations executed by a single thread (1.10), which
induces a partial order among those evaluations. Given any two
evaluations A and B, if A is sequenced before B, then the execution of
A shall precede the execution of B. If A is not sequenced before B and
B is not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. —end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. —end note ]
§1.9.14 Every value computation and side effect associated with a
full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated. * (<-- the footnote here is not relevant)
Therefore, your expression scoped_timer a; (which is a full expression) may have side effects and will be evaluated; so the computation of the value of a will be sequenced before any of the following statements in the block.
Regarding destruction of the object a, that is simpler.
§3.7.3.3 If a variable with automatic storage duration has initialization or a destructor with side effects, it shall not be
destroyed before the end of its block, nor shall it be eliminated as
an optimization even if it appears to be unused, except that a class
object or its copy/move may be eliminated as specified in 12.8.
This makes clear that the destructor will not be called until the block exits.
ADDENDUM And to confirm that all block-level variables are destroyed (and their destructor called) at the end of block scope, here it is in the C++11 standard:
§3.7.3.1 Block-scope variables explicitly declared register or not explicitly declared static or extern have automatic storage duration.
The storage for these entities lasts until the block in which they are
created exits.
§3.7.3.2 [ Note: These variables are initialized and destroyed as described in 6.7. —end note ]
... and the above-mentioned §6.7:
§6.7.2 Variables with automatic storage duration (3.7.3) are initialized each time their declaration-statement is executed.
Variables with automatic storage duration declared in the block are
destroyed on exit from the block (6.6).
The block is defined as all code between a pair of curly braces {} here:
§6.3.1 So that several statements can be used where one is expected, the compound statement (also, and equivalently, called “block”) is provided.
compound-statement:
{ statement-seq }
statement-seq:
statement
statement-seq statement
A compound statement defines a block scope (3.3).
Note: The compount-statement (etc) section takes a while to get used to, but the important point is that here, the open curly brace { and close curly brace } actually mean a literal open curly brace and close curly brace in the code. This is the exact place in the C++11 standard where block scope is defined as the sequence of statements between curly braces.
Putting the pieces together: Because the standard, as quoted above, says The storage for these entities lasts until the block in which they are created exits and that Variables with automatic storage duration declared in the block are destroyed on exit from the block, you are assured that the object a in your question (and ANY block-level object) will last until the end of the block, and will be destroyed and have its destructor called when the block exits.
When I today read the C Standard, it says about side effects
Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects
and the C++ Standard says
Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects
Hence because both forbid unsequenced side effects to occur on the same scalar object, C allows the following, but C++ makes it undefined behavior
int a = 0;
volatile int *pa = &a;
int b = *pa + *pa;
Am I reading the specifications correctly? And what is the reason for the discrepancy, if so?
I don't believe there is an effective variation between C and C++ in this regards. Though the wording on sequencing varies the end result is the same: both result in undefined behaviour (though C seems to indicate the evaluation will suceed but with an undefined result).
In C99 (sorry, don't have C11 handy) paragraph 5.1.2.3.5 specifies:
— At sequence points, volatile objects are stable in the sense that previous accesses are
complete and subsequent accesses have not yet occurred.
Combined with your quote from 5.1.2.3.2 would indicate the value of pa would not be in a stable state for at least one of the accesses to pa. This makes logical sense since the compiler would be allowed to evaluate them in any order, just once, or at the same time (if possible). It doesn't actually define what stable means however.
In C++11 there is explicit reference to unsequenced oeprations at 1.9.13. Then point 15 indicates such unsequenced operations on the same operand is undefined. Since undefined behaviour can mean anything happens it is perhaps strong than C's unstable behaviour. However, in both cases there is no guaranteed result of your expression.