I only know that i = i++; is undefined behavior. But what if two or more functions are called in one expression, and all the calls are to the same function? Is that undefined? For example:
int func(int a)
{
std::cout << a << std::endl;
return 0;
}
int main()
{
std::cout << func(0) + func(1) << std::endl;
return 0;
}
The behavior of the expression func(0) + func(1) is defined in that the result will be the sum of the results obtained by calling func with a parameter of 0 and func with a parameter of 1.
However, the order in which the functions are called is unspecified: the implementation may pick either order and does not have to document its choice. That is, the compiler could generate code equivalent to:
int a = func(0);
int b = func(1);
int result = a + b;
Or it could generate:
int a = func(1);
int b = func(0);
int result = a + b;
This normally won't be a problem unless func has side effects that depend on the order of calls.
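If the order of the calls does matter, one way to pin it down is to give each call its own statement. The following is a minimal sketch rather than anything from the question: logged, call_log and the check function are hypothetical names introduced only to make the call order observable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical log that makes the call order observable.
std::vector<int> call_log;

// Hypothetical stand-in for func: records its argument, returns ten times it.
int logged(int a)
{
    call_log.push_back(a);
    return a * 10;
}

// The order of the two calls is unspecified:
// call_log may end up as {0, 1} or as {1, 0}.
int sum_unspecified_order()
{
    call_log.clear();
    return logged(0) + logged(1);
}

// Each declaration is its own full-expression, so logged(0) always runs first.
int sum_fixed_order()
{
    call_log.clear();
    int first = logged(0);
    int second = logged(1);
    return first + second;
}

// Self-check: the sum is the same either way; only the logged order can differ.
void check_call_order()
{
    assert(sum_unspecified_order() == 10);
    assert(call_log.size() == 2);
    assert(sum_fixed_order() == 10);
    assert(call_log[0] == 0 && call_log[1] == 1);
}
```

Calling check_call_order() from main should pass on any conforming compiler, because it only relies on the order in the version that uses separate statements.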
std::cout << func(0) + func(1) << std::endl;
Whether func(0) or func(1) executes first is unspecified. After that, there is a sequence point, and func(0) + func(1) is output.
By definition, though, that is not undefined behavior.
The behavior of this program is not undefined, but it is unspecified. The draft C++ standard, section 1.9 Program execution, paragraph 15, says (emphasis mine):
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. —end note ] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
and if we check section 5.7 Additive operators, which covers + and -, that section does not specify an ordering, so the evaluations of the operands are unsequenced.
In this case func has a side effect, since it writes to stdout, so the order of the output depends on the implementation and could even change between evaluations.
Note that the ; ends an expression statement and section 6.2 Expression statement says:
[...]All side effects from an expression statement are completed before the next statement is executed.[...]
so although the order of the function calls is unspecified, the side effects of each statement are completed before the next.
Take the following example:
int g_X;
int func() {
return ++g_X; // using g_X++ is not an option
}
int main() {
g_X = 0;
int a = g_X + func();
g_X = 0;
int b = func() + g_X;
g_X = 0; int temp = g_X;
int c = temp + func();
return 0;
}
a and b are both 2, c has the expected value 1 (tested with Visual Studio 2010 to 2015). I read that sums are not associated with a sequence point (e.g. https://en.wikipedia.org/wiki/Sequence_point) and therefore the order is not fixed. Yet I thought that the value of g_X gets captured before the function call.
Am I really in undefined behavior territory, or is there some way to avoid the temp variable?
Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
([intro.execution]/15)
Therefore, in an expression of the form g_X + func() or func() + g_X, it's true that the + operator doesn't introduce any sequencing constraint, but nevertheless the access to g_X within main either happens before or after the body of func(), you just can't predict which. This implies that the behaviour is defined, but whether a is 1 or 2 is unpredictable. Likewise, whether b is 1 or 2 is unpredictable.
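A minimal sketch of that conclusion, assuming a global counter in the role of g_X (g_counter, bump and the other function names are hypothetical). The first function may legitimately return either value; the second takes the snapshot in its own statement, which is what the temp variable in the question does.

```cpp
#include <cassert>

int g_counter = 0;  // plays the role of g_X

int bump() { return ++g_counter; }

// The read of g_counter and the body of bump() are indeterminately sequenced,
// so this returns 1 (read first) or 2 (bump first).
int read_plus_bump()
{
    g_counter = 0;
    return g_counter + bump();
}

// Copying the value in a separate statement fixes the order: always 0 + 1.
int snapshot_plus_bump()
{
    g_counter = 0;
    const int snapshot = g_counter;
    return snapshot + bump();
}

// Self-check: accept both outcomes where the standard allows both.
void check_sequencing()
{
    const int r = read_plus_bump();
    assert(r == 1 || r == 2);
    assert(snapshot_plus_bump() == 1);
}
```

The check deliberately accepts both 1 and 2 for the first function, since portable code cannot rely on either.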
At first sight this looks like undefined behavior because, quoting §1.9/15:
If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
and your example seemingly satisfies these conditions:
Evaluations of the operands of the + operator are unsequenced relative to each other (but see below).
There is a side effect on a scalar object (func increments g_X).
There is a computation using the value of the same scalar object (g_X).
On the other hand, as pointed out in Brian's answer, a function call does introduce sequencing, although it is indeterminate. According to this argument, this is not undefined behavior, but merely unspecified.
For example:
int foo(int i) { return i; }
int main()
{
int i = 0;
i = i++; // Undefined
i = foo(i++); // ?
return 0;
}
What would the current ISO C++ standard specify for this case?
EDIT:
Here's where I get confused:
Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.
If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, and they are not potentially concurrent (1.10), the behavior is
undefined.
In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression
Every evaluation in the calling function (including other function calls) that is not otherwise specifically
sequenced before or after the execution of the body of the called function is indeterminately sequenced with
respect to the execution of the called function.
So it seems you could have a value computation on the left side of the assignment (just i), and a side effect on the right side (the modification of i from i++) which aren't sequenced with respect to each other.
EDIT2:
For anyone who finds themselves here, there is a really great explanation about sequencing that I found here.
The last sentence in your quote says "that is not otherwise specifically sequenced before or after the execution of the body of the called function" so the question is whether the increment and the assignment are "otherwise specifically sequenced before or after" the function body.
1.9 [intro.execution] p15 has the answer:
When calling a function (whether or not the function is inline), every value computation and side effect
associated with any argument expression, or with the postfix expression designating the called function, is
sequenced before execution of every expression or statement in the body of the called function. [ Note: Value
computations and side effects associated with different argument expressions are unsequenced. — end note ]
So the increment of i happens before the function body, and the assignment to i happens after the function returns, so it is perfectly well-defined.
In pre-C++11 terminology, the function call introduces a sequence point between the increment and the assignment.
i = foo(i++); is fine, because i++ is executed before foo() is called. A copy of i is made, i is then incremented, then the copy is passed to foo(). It is the same as doing this explicitly:
int tmp = i++;
i = foo(tmp);
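The ordering can be made visible with a small sketch. The names g_value, observe and identity are hypothetical; observe reads a global inside its body to show that the side effect of the argument expression has already happened by then.

```cpp
#include <cassert>

int g_value = 0;

// The body runs after the argument's side effect,
// so it already sees the incremented g_value.
int observe(int arg) { return g_value * 10 + arg; }

int call_with_increment()
{
    g_value = 0;
    // arg receives the old value 0; inside the body g_value is already 1.
    return observe(g_value++);
}

int identity(int i) { return i; }  // same shape as foo in the question

int assign_from_call()
{
    int i = 0;
    i = identity(i++);  // increment completes before the call, assignment after it
    return i;
}

int assign_with_temporary()
{
    int i = 0;
    int tmp = i++;      // the explicit form of the same steps
    i = identity(tmp);
    return i;
}

void check_argument_sequencing()
{
    assert(call_with_increment() == 10);
    assert(assign_from_call() == 0);
    assert(assign_with_temporary() == 0);
}
```

Both assignment variants end with i equal to 0, because the assignment of the returned old value is the last thing to happen.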
Is calling f(a, a) in the following code undefined behavior?
#include <iostream>
int f(int &m, int &n) {
m++;
n++;
return m + n;
}
int main() {
int a = 1;
int b = f(a, a);
}
There is no undefined behavior with respect to modifying m and n since the modifications of both variables are sequenced. The modification of m will happen before the modification of n since they are both full expressions and all side effects of a full expression are sequenced before the side effects of the next full expression.
The relevant section of the draft C++ standard is section 1.9 Program execution which says:
Every value computation and side effect associated with a
full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated.
and:
If a side effect on a scalar object is unsequenced relative to either
another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined
on the other hand the following:
m++ + n++ ;
is undefined behavior when m and n refer to the same object, since the evaluations of the two subexpressions are unsequenced relative to each other.
Jonathan brings up the issue of strict aliasing, but I don't see how the compiler can assume that n and m are not aliasing each other, and my experiments on godbolt do not indicate any unexpected aliasing assumptions.
Note, a full expression is:
[...]an expression that is not a subexpression of another expression[...]
Usually the ; denotes the end of a full expression.
There is no undefined behavior here. a will end up being 3 and b will be 6, consistently.
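A quick way to confirm those numbers is the sketch below. f is copied from the question; Outcome and the two wrapper functions are hypothetical helpers added so the results can be checked.

```cpp
#include <cassert>

int f(int& m, int& n)
{
    m++;            // a full-expression: finished before the next statement starts
    n++;
    return m + n;
}

struct Outcome { int a; int b; };

// Both references alias the same int: a goes 1 -> 2 -> 3, and 3 + 3 is returned.
Outcome call_with_alias()
{
    int a = 1;
    int b = f(a, a);
    return Outcome{a, b};
}

// For comparison, two distinct objects: each goes 1 -> 2, and 2 + 2 is returned.
Outcome call_without_alias()
{
    int a = 1;
    int c = 1;
    int b = f(a, c);
    return Outcome{a, b};
}

void check_alias_results()
{
    assert(call_with_alias().a == 3 && call_with_alias().b == 6);
    assert(call_without_alias().a == 2 && call_without_alias().b == 4);
}
```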
Please, explain why this code is correct or why not:
In my opinion, the line ++*p1 = *p2++ has undefined behaviour, because p1 is dereferenced first and the result is then incremented.
int main()
{
char a[] = "Hello";
char b[] = "World";
char* p1 = a;
char* p2 = b;
//*++p1 = *p2++; // is this OK?
++*p1 = *p2++; // is this OK? Or this is UB?
std::cout << a << "\n" << b;
return 0;
}
The first is ok
*++p1 = *p2++ // p1++; *p1 = *p2; p2++;
the second is UB with C++ because you are modifying what is pointed to by p1 twice (once because of the increment and once because of the assignment) and there are no sequence points separating the two side effects.
With C++0x rules things are different and more complex to explain and to understand. If you intentionally write expressions like the second one, it's not for a code golf competition, and you are working for me, then consider yourself fired (even if that is legal in C++0x).
I don't know if it is legal in C++0x and I don't want to know. I've too few neurons to waste them this way.
In modern C++ (at least C++ 2011 and later) neither is undefined behavior. Nor is either of them implementation-defined or unspecified. (All three terms mean different things.)
These two lines are both well defined (but they do different things).
When you have pointers p1 and p2 to scalar types then
*++p1 = *p2++;
is equivalent to
p1 = p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^this is also true for C++ 1998/2003)
and
++*p1 = *p2++;
is equivalent to
*p1 = *p1 + 1;
*p1 = *p2;
p2 = p2 + 1;
(^^^maybe also in C++ 1998/2003 or maybe not - as explained below)
Obviously in case 2 incrementing value and then assigning to it (thus overwriting just incremented value) is pointless - but there may be similar examples that make sense (e.g. += instead of =).
BUT, as many people point out, just don't write code that looks ambiguous or unreasonably complex. Write code that is clear to you and will be clear to its readers.
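The two expansions can be checked on the arrays from the question. This is only a sketch: the function names are hypothetical, the first function uses the one-line form that nobody disputes, and the second uses the statement-by-statement form so that it does not depend on which standard's rules apply.

```cpp
#include <cassert>
#include <string>

// Case 1: advance p1, then store b[0] into the new position a[1].
std::string preincrement_pointer()
{
    char a[] = "Hello";
    char b[] = "World";
    char* p1 = a;
    char* p2 = b;
    *++p1 = *p2++;
    return a;
}

// Case 2, written as separate statements: increment a[0], then overwrite it.
std::string preincrement_pointee()
{
    char a[] = "Hello";
    char b[] = "World";
    char* p1 = a;
    char* p2 = b;
    *p1 = static_cast<char>(*p1 + 1);  // the increment, about to be overwritten
    *p1 = *p2;                         // the assignment wins
    p2 = p2 + 1;
    (void)p2;                          // p2 is not used again in this sketch
    return a;
}

void check_pointer_cases()
{
    assert(preincrement_pointer() == "HWllo");
    assert(preincrement_pointee() == "Wello");
}
```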
Old C++ 1998/2003 case for second expression is a strange matter:
At first after reading the description of prefix increment operator:
ISO/IEC 14882-2003 5.3.2:
The operand of prefix ++ is modified by adding 1, or set to true if it
is bool (this use is deprecated). The operand shall be a modifiable
lvalue. The type of the operand shall be an arithmetic type or a
pointer to a completely-defined object type. The value is the new
value of the operand; it is an lvalue. If x is not of type bool, the
expression ++x is equivalent to x+=1.
I personally have a strong feeling that everything is perfectly defined and obvious and the same as above for C++ 2011 and later.
At least in the sense that every reasonable C++ implementation will behave in exact same well defined way (including old ones).
Why should it be otherwise, if we always intuitively rely on the general rule that, for any operator within a complex expression, we evaluate its operands first and then apply the operator to their values? Right? Breaking this intuitive expectation would be extremely stupid for any programming language.
So for the full expression ++*p1 = *p2++; we have two operands: 1 - ++*p1, evaluated as an already incremented lvalue (as defined in the above quote from C++ 2003), and 2 - *p2++, an rvalue holding the value p2 pointed to before its increment. It doesn't look ambiguous at all. Of course, in this case there is no reason to increment a value you are overwriting anyway, BUT if there were a double increment instead - ++(++*p1); - OR another kind of assignment like +=/-=/&=/*=/etc. instead of simple assignment, THAT would not be unreasonable at all.
Unfortunately all the intuition and logic is messed up by this:
ISO/IEC 14882-2003 - 5 Expressions:
Except where noted, the order of evaluation of operands
of individual operators and subexpressions of individual
expressions, and the order in which side effects
take place, is unspecified. Between the previous
and next sequence point a scalar object shall have its
stored value modified at most once by the evaluation
of an expression. Furthermore, the prior value shall be
accessed only to determine the value to be stored.
The requirements of this paragraph shall be met for each
allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Example:
i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented
—end example]
So this wording, if interpreted in a paranoid way, seems to imply that modifying a value stored in a specific location more than once without an intervening sequence point is explicitly forbidden by this rule, and the last sentence declares that failing to comply with every requirement is Undefined Behavior. AND our expression seems to modify the same location more than once (?) with no sequence point until the full expression has been evaluated. (This arbitrary and unreasonable limitation is reinforced further by example 3 - i = ++i + 1; though it says // the behavior is unspecified - not undefined as in the wording before - which only adds more confusion.)
BUT on the other hand... If we ignore example 3. (Maybe i = ++i + 1; is a typo and there should have been a postfix increment instead - i = i++ + 1;? Who knows... Anyway, examples are not part of the formal specification.) If we interpret this wording in the most permissive way, we can see that in each allowed order of evaluation of the subexpressions of the whole expression, the preincrement ++*p1 must be evaluated to an LVALUE (which is something that allows further modification) BEFORE the assignment operator is applied, so the only valid final value at that location is the one stored by the assignment operator. ALSO NOTE that a conforming C++ implementation has no obligation to actually modify that location more than once and may instead store only the final result - that is both a reasonable optimization allowed by the standard and possibly what this paragraph of the standard actually demands.
Which one of those interpretations is correct? Paranoid or permissive? Universally applicable logic or some suspicious and ambiguous words in a document almost nobody really ever read? Blue pill or Red pill?
Who knows... It looks like a gray area that requires less ambiguous explanation.
If we interpret the quote from C++ 2003 standard above in a paranoid way then it looks like this code may be Undefined Behavior:
#include <iostream>
#define INC(x) (++(x))
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
while this code is perfectly legitimate and well defined:
#include <iostream>
template<class T> T& INC(T& x) // sequence point after evaluation of the arguments
{ // and before execution of the function body
return ++x;
}
int main()
{
int a = 5;
INC(INC(a));
std::cout << a;
return 0;
}
Really?
All this looks very much like a defect of the old C++ standard.
Fortunately this has been addressed in newer C++ standards (starting with C++ 2011) as there is no such concept as sequence point anymore. Instead there is a relation - something sequenced before something. And of course the natural guarantee that evaluation of the argument expressions of any operator is sequenced before evaluation of the result of the operator is there.
ISO/IEC 14882-2011 - 1.9 Program execution
Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread (1.10), which induces
a partial order among those evaluations. Given any two evaluations A
and B, if A is sequenced before B, then the execution of A shall
precede the execution of B. If A is not sequenced before B and B is
not sequenced before A, then A and B are unsequenced. [ Note: The
execution of unsequenced evaluations can overlap. — end note ]
Evaluations A and B are indeterminately sequenced when either A is
sequenced before B or B is sequenced before A, but it is unspecified
which. [ Note: Indeterminately sequenced evaluations cannot overlap,
but either could be executed first. — end note ]
Every value computation and side effect associated with a full-expression is sequenced before every value computation and side
effect associated with the next full-expression to be evaluated.
Except where noted, evaluations of operands of individual operators
and of subexpressions of individual expressions are unsequenced. [
Note: In an expression that is evaluated more than once during the
execution of a program, unsequenced and indeterminately sequenced
evaluations of its subexpressions need not be performed consistently
in different evaluations. — end note ] The value computations of the
operands of an operator are sequenced before the value computation of
the result of the operator. If a side effect on a scalar object is
unsequenced relative to either another side effect on the same scalar
object or a value computation using the value of the same scalar
object, the behavior is undefined.
[ Example:
void f(int, int);
void g(int i, int* v) {
i = v[i++]; // the behavior is undefined
i = 7, i++, i++; // i becomes 9
i = i++ + 1; // the behavior is undefined
i = i + 1; // the value of i is incremented
f(i = -1, i = -1); // the behavior is undefined
}
— end example ]
(Also NOTE how C++ 2003 prefix increment example i = ++i + 1; is replaced by postfix increment example i = i++ + 1; in this C++ 2011 quote. :) )
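The two well-defined lines of that example can be run as they stand. This sketch wraps them in hypothetical functions so the results can be checked, and leaves out the lines the standard marks as undefined.

```cpp
#include <cassert>

// The comma operator sequences its left operand before its right operand,
// so the three modifications of i happen strictly one after another.
int comma_sequenced()
{
    int i = 0;
    i = 7, i++, i++;
    return i;
}

// A single modification whose new value is computed from the old one.
int plain_increment()
{
    int i = 0;
    i = i + 1;
    return i;
}

void check_standard_examples()
{
    assert(comma_sequenced() == 9);
    assert(plain_increment() == 1);
}
```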
I'm not sure if this is a gcc bug or not, so I'll ask:
unsigned int n = 0;
std::cout << n++ << n << ++n;
gcc gives the extremely strange result:
"122" which AFAICT is impossible. Because << is left associative, it should be the same as:
operator<<(operator<<(operator<<(std::cout, n++), n), ++n)
and because there is a sequence point before and after evaluating arguments, n is never modified twice (or even accessed) between two sequence points -- so it shouldn't be undefined behaviour, just the order of evaluation unspecified.
So AFAICT valid results would be:
111
012
002
101
and nothing else
There is a sequence point between evaluating arguments and calling a function. There is no sequence point between evaluating different arguments.
Let's look at the outermost function call:
operator<<(operator<<(operator<<(std::cout, n++), n), ++n)
The arguments are
operator<<(operator<<(std::cout, n++), n)
and
++n
It is unspecified which of these is evaluated first. It's also allowed that the first argument is partially evaluated when the second argument is evaluated.
From the standard, section [intro.execution] (wording from draft 3225):
If A is not sequenced before
B and B is not sequenced before A, then A and B are unsequenced. [ Note: The execution of unsequenced
evaluations can overlap. — end note ]
Except where noted, evaluations of operands of individual operators and of subexpressions of individual
expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be
performed consistently in different evaluations. — end note ] The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
Because you have multiple operations with side effects on the same scalar object which are unsequenced with respect to each other, you're in that realm of undefined behavior, and even 999 would be a permissible output.
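One way to stay out of that realm is to give every modification of n its own statement. In the sketch below a std::ostringstream stands in for std::cout so that the result can be inspected; the function names are hypothetical. As far as I know, C++17 additionally sequences the operands of << from left to right, but the split form does not depend on that.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::string print_one_step_per_statement()
{
    unsigned int n = 0;
    std::ostringstream out;  // stands in for std::cout
    out << n++;              // writes 0, n becomes 1
    out << n;                // writes 1
    out << ++n;              // n becomes 2, writes 2
    return out.str();
}

void check_output()
{
    assert(print_one_step_per_statement() == "012");
}
```

Each statement is a full-expression, so every side effect on n is complete before the next statement reads or modifies it.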
The first rule of compiler bugs: it's probably not a compiler bug but a misunderstanding on your part. Modifying n with both the postfix and prefix operators, and reading it, within the same expression without any sequencing between them results in undefined behavior. Try the -Wall option to get more warnings about potential pitfalls in your code.
Let's see what GCC 4.2.1 tells us when we ask for warnings about test.cpp:
#include <iostream>
int main() {
unsigned int n = 0;
std::cout << n++ << n << ++n << std::endl;
return 0;
}
When we compile:
$ g++ -Wall test.cpp -o test
test.cpp: In function ‘int main()’:
test.cpp:5: warning: operation on ‘n’ may be undefined
Your code is an example of why some books remark that experienced programmers dislike the ++ and -- operators; some other languages (Ruby, for example) do not implement ++ or -- at all.