I am confused as to why this would result in undefined behavior. Let me first copy the explanation from the textbook, then show my own program, which runs perfectly.
Precedence specifies how the operands are grouped. It says nothing
about the order in which the operands are evaluated. In most cases,
the order is largely unspecified. In the following expression
int i = f1() * f2();
we know that f1 and f2 must be called before the multiplication can be done. After all, it is their
results that are multiplied. However, we have no way of knowing
whether f1 will be called before f2 or vice versa. For operators that
do not specify evaluation order, it is an error for an expression to
refer to and change the same object. Expressions that do so have
undefined behavior (§ 2.1.2, p. 36). As a simple example, the <<
operator makes no guarantees about when or how its operands are
evaluated. As a result, the following output expression is undefined.
-- C++ Primer - Page 193 by Stanley B. Lippman
So, I tried to apply this by writing my own code, and I never get undefined behavior. Can someone please explain what this means?
#include <iostream>
using std::cout;
using std::endl;
int f1() { return (5 + 5 * 4 / 2 - 3); } // 12
int f2() { return (10 + 2 * 10 / 2 - 5); } // 15
int main()
{
int i = f1() * f2();
cout << i << endl;
return 0;
}
You're getting it wrong. The author means IF the order matters, it's unspecified. In your case, the order of evaluation doesn't matter. In fact, the function might as well be constexpr. But if you had something like this:
int i = 0;
int f1() { return (i++) * 3; }
int f2() { return (i++) * 4; }
int main() {
int a = f1() + f2();
}
Now, if f1 is called first, the result is 4. If f2 is called first, the result is 3. Thus, it's unspecified.
I never get an undefined behavior?
You can't really know that by simply running the program.
Your code is fine.
it is an error for an expression to refer to and change the same object
(bold mine)
You don't change any objects in your expressions, so the rule doesn't apply.
Here's an example of when the rule would apply:
int a = 42;
int i = a++ * a++;
Note that it would not apply if the change happened in a function:
int a = 42;
int foo() {return a++;}
int i = foo() * foo();
That's because the UB only happens when two accesses to an object are unsequenced relative to each other, i.e. can happen in any order including in parallel. This doesn't necessarily mean "in parallel threads", but can also mean "single thread, but processor instructions performing the tasks may be interleaved".
But two function calls on the same thread can't happen in parallel (and can't have their instructions interleaved). Rather, in this case, they are indeterminately sequenced, i.e. one strictly after the other, but it's unspecified which one is first.
Also note that
the << operator makes no guarantees about when or how its operands are evaluated
is no longer true starting from C++17.
From C++ draft standard:
3.64 [defns.undefined] undefined behavior
behavior for which this document imposes no requirements [Note 1: Undefined behavior may be
expected when this document omits any explicit definition of behavior
or when a program uses an erroneous construct or erroneous data.
Permissible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message). Many erroneous program constructs do not engender
undefined behavior; they are required to be diagnosed. Evaluation of a
constant expression ([expr.const]) never exhibits behavior explicitly
specified as undefined in [intro] through [cpp]. — end note]
Thus, almost everything is possible, even predictable behavior for a given implementation.
But your program does not exhibit any UB. f1 and f2 do not have any side effects, so the order of their evaluation has no impact.
Take the following example:
int g_X;
int func() {
return ++g_X; // using g_X++ is not an option
}
int main() {
g_X = 0;
int a = g_X + func();
g_X = 0;
int b = func() + g_X;
g_X = 0; int temp = g_X;
int c = temp + func();
return 0;
}
a and b are both 2, c has the expected value 1 (tested with Visual Studio 2010 to 2015). I read that sums are not associated with a sequence point (e.g. https://en.wikipedia.org/wiki/Sequence_point) and therefore the order is not fixed. Yet I thought that the value of g_X gets captured before the function call.
Am I really in undefined behavior territory, or is there some way to avoid the temp var?
Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
([intro.execution]/15)
Therefore, in an expression of the form g_X + func() or func() + g_X, it's true that the + operator doesn't introduce any sequencing constraint, but nevertheless the access to g_X within main either happens before or after the body of func(), you just can't predict which. This implies that the behaviour is defined, but whether a is 1 or 2 is unpredictable. Likewise, whether b is 1 or 2 is unpredictable.
At first sight this looks like an undefined behavior because, quoting §1.9/15:
If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
and your example seemingly satisfies these conditions:
Evaluation of operands of + operator are unsequenced relative to each other (but see below).
There is a side effect on a scalar object (func increments g_X).
There is a computation using the value of the same scalar object (g_X).
On the other hand, as pointed out in Brian's answer, a function call does introduce sequencing, although indeterminate. According to this argument, this is not undefined behavior but merely unspecified behavior.
#include <iostream>
using namespace std;
int main(){
int i = 5;
cout << i++ << i--<< ++i << --i << i << endl;
}
The above program, compiled with g++, gives the output:
45555
While the following program:
int x=20,y=35;
x =y++ + y + x++ + y++;
cout << x<< endl << y;
gives result as
126
37
Can anyone please explain the output?
cout << i++ << i--
is semantically equivalent to
operator<<(operator<<(cout, i++), i--);
<------arg1--------->, <-arg2->
$1.9/15- "When calling a function
(whether or not the function is
inline), every value computation and
side effect associated with any
argument expression, or with the
postfix expression designating the
called function, is sequenced before
execution of every expression or
statement in the body of the called
function. [ Note: Value computations
and side effects associated with
different argument expressions are
unsequenced. —end note ]
C++0x:
This means that the evaluations of the arguments arg1 and arg2 are unsequenced (neither of them is sequenced before the other).
The same section in the draft Standard also states,
If a side effect on a scalar object is
unsequenced relative to either another
side effect on the same scalar object
or a value computation using the value
of the same scalar object, the
behavior is undefined.
Now there is a sequence point at the semicolon at the end of the full expression below
operator<<(operator<<(cout, i++), i--);
^ the interesting sequence point is right here
As is clear, the evaluations of both arg1 and arg2 have a side effect on the scalar variable 'i', and as we saw above, those side effects are unsequenced.
Therefore the code has undefined behavior. So what does that mean?
Here is how 'undefined behavior' is defined :) in the Standard.
Permissible undefined behavior ranges
from ignoring the situation completely
with unpredictable results, to
behaving during translation or program
execution in a documented manner
characteristic of the environment
(with or without the issuance of a
diagnostic message), to terminating a
translation or execution (with the
issuance of a diagnostic message).
Many erroneous program constructs do
not engender undefined behavior; they
are required to be diagnosed.
Do you see the correlation with @DarkDust's response, 'The compiler is even allowed to set your computer on fire :-)'?
So any output you get from such code is really in the dreaded realm of undefined behavior.
Don't do it.
The only thing that is defined about such code is that it helps the OP and many of us get lots of votes (if answered correctly) :)
The result of the second program's expression is undefined. The compiler is even allowed to set your computer on fire :-) You're not allowed to modify a variable twice between two sequence points (in this case: from = to ;).
Edit:
For detailed explanations, see the C FAQ, specifically question 3.2.
Adding to the other answers:
If you are using g++, the -Wsequence-point option reports:
$ g++ -Wsequence-point a.cpp
a.cpp: In function ‘int main()’:
a.cpp:8: warning: operation on ‘i’ may be undefined
^^^^^^^^^
Undefined behaviour, so anything could happen
I only know that i = i++; is undefined behavior. But what if two or more functions are called in one expression, and they are all the same function? Is that undefined? For example:
int func(int a)
{
std::cout << a << std::endl;
return 0;
}
int main()
{
std::cout << func(0) + func(1) << std::endl;
return 0;
}
The behavior of the expression func(0) + func(1) is defined in that the result will be the sum of the results obtained by calling func with a parameter of 0 and func with a parameter of 1.
However, the order in which the functions are called is unspecified. That is, the compiler could generate code equivalent to:
int a = func(0);
int b = func(1);
int result = a + b;
Or it could generate:
int a = func(1);
int b = func(0);
int result = a + b;
This normally won't be a problem unless func has side effects that depend on the order of calls.
std::cout << func(0) + func(1) << std::endl;
Whether the function call func(0) or func(1) executes first is unspecified; it is up to the implementation. After that, there is a sequence point, and func(0) + func(1) is output.
But by definition, it's not called undefined behavior.
The behavior of this program is not undefined, but it is unspecified. Section 1.9 Program execution, paragraph 15, of the draft C++ standard says (emphasis mine):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
and if we check section 5.7 Additive operators, which covers + and -, that section does not specify an ordering, so the operand evaluations are unsequenced.
In this case func has a side effect since it is outputting to stdout and so the order of the output is going to depend on the implementation and it even could change for subsequent evaluations.
Note that the ; ends an expression statement and section 6.2 Expression statement says:
[...]All side effects from an expression statement are completed before the next statement is executed.[...]
so although the order of the function calls is unspecified, the side effects of each statement are completed before the next.
Please, explain why this code is correct or why not:
In my opinion, the line ++*p1 = *p2++ has undefined behaviour, because p1 is dereferenced first and then incremented.
int main()
{
char a[] = "Hello";
char b[] = "World";
char* p1 = a;
char* p2 = b;
//*++p1 = *p2++; // is this OK?
++*p1 = *p2++; // is this OK? Or this is UB?
std::cout << a << "\n" << b;
return 0;
}
The first is ok
*++p1 = *p2++ // p1++; *p1 = *p2; p2++;
the second is UB with C++ because you are modifying what is pointed to by p1 twice (once because of the increment and once because of the assignment) and there are no sequence points separating the two side effects.
With C++0x rules things are different and more complex to explain and to understand. If you intentionally write expressions like the second one, and it's not for a code golf competition, and you are working for me, then consider yourself fired (even if that is legal in C++0x).
I don't know if it is legal in C++0x and I don't want to know. I've too few neurons to waste them this way.
In modern C++ (at least C++ 2011 and later) neither is undefined behavior. Nor is either implementation-defined or unspecified. (All three terms are different things.)
These two lines are both well defined (but they do different things).
When you have pointers p1 and p2 to scalar types then
*++p1 = *p2++;
is equivalent to
p1 = p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^this is also true for C++ 1998/2003)
and
++*p1 = *p2++;
is equivalent to
*p1 = *p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^maybe also in C++ 1998/2003 or maybe not - as explained below)
Obviously in case 2 incrementing value and then assigning to it (thus overwriting just incremented value) is pointless - but there may be similar examples that make sense (e.g. += instead of =).
BUT, as many people point out, just don't write code that looks ambiguous or unreasonably complex. Write code that is clear to you and should be clear to its readers.
The old C++ 1998/2003 case for the second expression is a strange matter.
At first, after reading the description of the prefix increment operator:
ISO/IEC 14882-2003 5.3.2:
The operand of prefix ++ is modified by adding 1, or set to true if it
is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely-defined object type. The value is the new
value of the operand; it is an lvalue. If x is not of type bool, the
expression ++x is equivalent to x+=1.
I personally have a strong feeling that everything is perfectly defined and obvious and the same as above for C++ 2011 and later.
At least in the sense that every reasonable C++ implementation will behave in exactly the same well-defined way (including old ones).
Why should it be otherwise, if we always intuitively rely on the general rule that, when evaluating any simple operator within a complex expression, we evaluate its operands first and then apply the operator to their values? Right? Breaking this intuitive expectation would be extremely stupid for any programming language.
So for the full expression ++*p1 = *p2++; we have two operands: (1) ++*p1, evaluated as an already incremented lvalue (as defined in the above quote from C++ 2003), and (2) *p2++, an rvalue holding the value that p2 pointed to before its increment. It doesn't look ambiguous at all. Of course, in this case there is no reason to increment a value you are overwriting anyway, BUT if there were a double increment instead, ++(++*p1);, OR another kind of assignment like +=, -=, &=, *=, etc. instead of simple assignment, THAT would not be unreasonable at all.
Unfortunately all the intuition and logic is messed up by this:
ISO/IEC 14882-2003 - 5 Expressions:
Except where noted, the order of evaluation of operands
of individual operators and subexpressions of individual
expressions, and the order in which side effects
take place, is unspecified. Between the previous
and next sequence point a scalar object shall have its
stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each
allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Example:
i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
—end example]
So this wording, if interpreted in a paranoid way, seems to imply that modifying a value stored in a specific location more than once without an intervening sequence point is explicitly forbidden by this rule, and the last sentence declares that failing to comply with every requirement is Undefined Behavior. AND our expression seems to modify the same location more than once (?) with no sequence point until the full expression has been evaluated. (This arbitrary and unreasonable limitation is reinforced further by example 3, i = ++i + 1;, though it says // the behavior is unspecified, not undefined as in the wording before, which only adds more confusion.)
BUT on the other hand... suppose we ignore example 3. (Maybe i = ++i + 1; is a typo and there should have been a postfix increment instead, i = i++ + 1;? Who knows... Anyway, examples are not part of the formal specification.) If we interpret this wording in the most permissive way, we can see that in each allowed order of evaluation of the subexpressions of the whole expression, the preincrement ++*p1 must be evaluated to an LVALUE (which is something that allows further modification) BEFORE the assignment operator is applied, so the only valid final value at that location is the one stored by the assignment operator. ALSO NOTE that a conforming C++ implementation has no obligation to actually modify that location more than once and may instead store only the final result; that is both a reasonable optimization allowed by the standard and possibly the actual demand of this clause.
Which one of those interpretations is correct? Paranoid or permissive? Universally applicable logic or some suspicious and ambiguous words in a document almost nobody has ever really read? Blue pill or Red pill?
Who knows... It looks like a gray area that requires a less ambiguous explanation.
If we interpret the quote from the C++ 2003 standard above in a paranoid way, then it looks like this code may be Undefined Behavior:
#include <iostream>
#define INC(x) (++(x))
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
while this code is perfectly legitimate and well defined:
#include <iostream>
template<class T> T& INC(T& x) // sequence point after evaluation of the arguments
{ // and before execution of the function body
return ++x;
}
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
Really?
All this looks very much like a defect of the old C++ standard.
Fortunately this has been addressed in newer C++ standards (starting with C++ 2011), as there is no such concept as a sequence point anymore. Instead there is a relation: something is sequenced before something else. And of course the natural guarantee is there: the value computations of the operands of any operator are sequenced before the value computation of the result of the operator.
ISO/IEC 14882-2011 - 1.9 Program execution
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces
a partial order among those evaluations. Given any two evaluations A
and B, if A is sequenced before B, then the execution of A shall
precede the execution of B. If A is not sequenced before B and B is
not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. — end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. — end note ]
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated.
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced. [
Note: In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed consistently
in different evaluations. — end note ] The value computations of the
operands of an operator are sequenced before the value computation of
the result of the operator. If a side effect on a scalar object is
unsequenced relative to either another side effect on the same scalar
object or a value computation using the value of the same scalar
object, the behavior is undefined.
[ Example:
void f(int, int);
void g(int i, int* v) {
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
i = i + 1; // the value of i is incremented
f(i = -1, i = -1); // the behavior is undefined
}
— end example ]
(Also NOTE how C++ 2003 prefix increment example i = ++i + 1; is replaced by postfix increment example i = i++ + 1; in this C++ 2011 quote. :) )