C++ Pre and Post increment compiler behaviour [duplicate] - c++

This question already has answers here:
Why are these constructs using pre and post-increment undefined behavior?
(14 answers)
Closed 8 years ago.
#include <iostream>
using namespace std;
int main(){
int i = 5;
cout << i++ << i--<< ++i << --i << i << endl;
}
The above program compiled with g++ gives output :
45555
While the following program:
int x=20,y=35;
x =y++ + y + x++ + y++;
cout << x<< endl << y;
gives result as
126
37
Can anyone please explain the output.

cout << i++ << i--
is semantically equivalent to
operator<<(operator<<(cout, i++), i--);
<------arg1--------->, <-arg2->
$1.9/15- "When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ]"
C++0x:
This means that the evaluation of the arguments arg1/arg2 are unsequenced (neither of them is sequenced before the other).
The same section in the draft Standard also states,
If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
Now there is a sequence point at the semicolon at the end of the full expression below
operator<<(operator<<(cout, i++), i--);
^ the interesting sequence point is right here
As is clear, evaluating arg1 and evaluating arg2 each produce a side effect on the scalar variable 'i', and as we saw above, those side effects are unsequenced.
Therefore the code has undefined behavior. So what does that mean?
Here is how 'undefined behavior' is defined :) in the Standard.
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.
Do you see the correlation with @DarkDust's response, 'The compiler is even allowed to set your computer on fire :-)'?
So any output you get from such code is really in the dreaded realm of undefined behavior.
Don't do it.
The only thing that is defined about such code is that it helps the OP and many of us get lots of votes (if answered correctly) :)
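For comparison, here is a minimal sketch of the same operations with every change to i moved into a statement of its own, so the order is fixed by the language. The helper name sequenced_output and the use of std::ostringstream are assumptions made for illustration, not part of the question; the digits it produces are just what this particular ordering gives, not what the original one-liner "should" print.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: the operations from the question's one-liner, but each
// modification of i is a separate full expression, so nothing is unsequenced.
std::string sequenced_output() {
    std::ostringstream out;
    int i = 5;
    out << i;  // i++ : use the old value 5 ...
    ++i;       // ... then increment, i == 6
    out << i;  // i-- : use the old value 6 ...
    --i;       // ... then decrement, i == 5
    ++i;       // ++i : increment first, i == 6 ...
    out << i;  // ... then use the new value 6
    --i;       // --i : decrement first, i == 5 ...
    out << i;  // ... then use the new value 5
    out << i;  // plain i, still 5
    return out.str();
}
```

Calling sequenced_output() gives "56655" on any conforming compiler, because there is no longer any unsequenced access to i.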

The result of the second program's expression is undefined. The compiler is even allowed to set your computer on fire :-) You're not allowed to modify a variable twice without an intervening sequence point (in this case there is none from = to ;).
Edit:
For detailed explanations, see the C FAQ, specifically question 3.2.
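As a rough sketch, this is one way to spell out a single, explicit left-to-right reading of x = y++ + y + x++ + y++; using temporaries. The function name left_to_right and the temporaries a, b, c, d are hypothetical. This is only one possible reading; the original expression does not promise this (or any) result.

```cpp
#include <utility>

// Hypothetical helper: one explicit left-to-right reading of
//   x = y++ + y + x++ + y++;
// Every increment is its own statement, so the behavior is well-defined.
std::pair<int, int> left_to_right(int x, int y) {
    int a = y; ++y;     // y++ : take y, then increment it
    int b = y;          // y   : plain read
    int c = x; ++x;     // x++ : take x, then increment it
    int d = y; ++y;     // y++ : take y, then increment it
    x = a + b + c + d;  // the assignment overwrites the incremented x
    return {x, y};
}
```

With x = 20 and y = 35 this reading yields 127 and 37, which differs from the 126 reported in the question; that difference is exactly why nothing should be inferred from the original expression's output.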

Adding to others' answers:
If you are using g++, the -Wsequence-point option reports this:
$ g++ -Wsequence-point a.cpp
a.cpp: In function ‘int main()’:
a.cpp:8: warning: operation on ‘i’ may be undefined
^^^^^^^^^

Undefined behaviour, so anything could happen

Related

undefined behavior for int i = f1() * f2()

I am confused as to why would this result in an undefined behavior. Let me copy and paste the explanation from the textbook first and then show my own code and program which runs perfectly.
Precedence specifies how the operands are grouped. It says nothing
about the order in which the operands are evaluated. In most cases,
the order is largely unspecified. In the following expression: int i = f1() * f2(); we know that f1 and f2 must be called before the multiplication can be done. After all, it is their
results that are multiplied. However, we have no way of knowing
whether f1 will be called before f2 or vice versa. For operators that
do not specify evaluation order, it is an error for an expression to
refer to and change the same object. Expressions that do so have
undefined behavior (§ 2.1.2, p. 36). As a simple example, the <<
operator makes no guarantees about when or how its operands are
evaluated. As a result, the following output expression is undefined.
-- C++ Primer - Page 193 by Stanley B. Lippman
So I tried to apply this by writing my own code, and I never get undefined behavior. Can someone please explain what this means?
#include <iostream>
using std::cout;
using std::endl;
int f1() { return (5 + 5 * 4 / 2 - 3); } // 12
int f2() { return (10 + 2 * 10 / 2 - 5); } // 15
int main()
{
int i = f1() * f2();
cout << i << endl;
return 0;
}
You're getting it wrong. The author means IF the order matters, it's unspecified. In your case, the order of evaluation doesn't matter. In fact, the function might as well be constexpr. But if you had something like this:
int i = 0;
int f1() { return (i++) * 3; }
int f2() { return (i++) * 4; }
int main() {
int a = f1() + f2();
}
Now, if f1 is called first, the result is 4. If f2 is called first, the result is 3. Thus, it's unspecified.
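If the order does matter, a minimal sketch of the fix is to call the functions in separate statements so that you choose the order yourself. The names counter, f1_then_f2 and f2_then_f1 are made up for this illustration (counter plays the role of the global i above).

```cpp
int counter = 0;  // plays the role of the global i in the example above

int f1() { return (counter++) * 3; }
int f2() { return (counter++) * 4; }

// Hypothetical helper: forces f1 to run before f2.
int f1_then_f2() {
    counter = 0;
    int a = f1();  // 0 * 3 == 0, counter becomes 1
    int b = f2();  // 1 * 4 == 4, counter becomes 2
    return a + b;  // 4
}

// Hypothetical helper: forces f2 to run before f1.
int f2_then_f1() {
    counter = 0;
    int b = f2();  // 0 * 4 == 0, counter becomes 1
    int a = f1();  // 1 * 3 == 3, counter becomes 2
    return a + b;  // 3
}
```

Both helpers are fully specified; only the single expression f1() + f2() leaves the choice to the compiler.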
I never get an undefined behavior?
You can't really know that by simply running the program.
Your code is fine.
it is an error for an expression to refer to and change the same object
(bold mine)
You don't change any objects in your expressions, so the rule doesn't apply.
Here's an example of when the rule would apply:
int a = 42;
int i = a++ * a++;
Note that it would not apply if the change happened in a function:
int a = 42;
int foo() {return a++;}
int i = foo() * foo();
That's because the UB only happens when two accesses to an object are unsequenced relative to each other, i.e. can happen in any order including in parallel. This doesn't necessarily mean "in parallel threads", but can also mean "single thread, but processor instructions performing the tasks may be interleaved".
But two function calls on the same thread can't happen in parallel (and can't have their instructions interleaved). Rather, in this case, they are indeterminately sequenced, i.e. one strictly after the other, but it's unspecified which one is first.
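For the a++ * a++ case, a small sketch of a well-defined alternative is to give each increment its own statement. The function name product_of_two_post_increments and the locals first and second are hypothetical.

```cpp
// Hypothetical helper: a well-defined replacement for `a++ * a++`.
// Each post-increment is a complete statement, so the two side effects on a
// are sequenced one after the other.
int product_of_two_post_increments(int& a) {
    int first = a++;   // yields the old value, then a is incremented
    int second = a++;  // yields the value after the first increment
    return first * second;
}
```

Starting from a == 42 this returns 42 * 43 == 1806 and leaves a at 44.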
Also note that
the << operator makes no guarantees about when or how its operands are evaluated
is no longer true starting from C++17.
From C++ draft standard:
3.64 [defns.undefined] undefined behavior
behavior for which this document imposes no requirements [Note 1: Undefined behavior may be
expected when this document omits any explicit definition of behavior
or when a program uses an erroneous construct or erroneous data.
Permissible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message). Many erroneous program constructs do not engender
undefined behavior; they are required to be diagnosed. Evaluation of a
constant expression ([expr.const]) never exhibits behavior explicitly
specified as undefined in [intro] through [cpp]. — end note]
Thus, almost everything is possible, even predictable behavior for a given implementation.
But your program does not exhibit any UB. f1 and f2 do not have any side effects, thus the order of their evaluation has no impact.
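Since f1 and f2 only compute constants, one way to make the "order cannot matter" point visible is to mark them constexpr, as suggested in an earlier answer. This is only a sketch; the wrapper function product and the static_assert message are additions for illustration.

```cpp
// The question's functions, marked constexpr: they read and write no shared
// state, so the order in which they are called cannot change the result.
constexpr int f1() { return (5 + 5 * 4 / 2 - 3); }    // 12
constexpr int f2() { return (10 + 2 * 10 / 2 - 5); }  // 15

// Checked at compile time; no evaluation order is involved at all.
static_assert(f1() * f2() == 180, "f1() * f2() is a constant");

// Hypothetical wrapper so the value can also be used at run time.
int product() { return f1() * f2(); }
```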

Is modifying an object more than once in an expression through its name and through its reference well-defined?

Hello, I have a simple question: is it Undefined Behavior to modify an object more than once in the same expression, once through its identifier (name) and a second time through a reference to it or a pointer that points at it?
int i = 1;
std::cout << i << ", " << ++i << std::endl; //1- Error. Undefined Behavior
int& refI = i;
std::cout << i << ", " << ++refI << std::endl; //2- Is this OK?
int* ptrI = &refI; // ptrI point to the referred-to object (&i)
std::cout << i << ", " << ++*ptrI << std::endl; // 3 is this also OK?
The second statement seems to work fine, but I am confused about it because, from what I've learned, a reference is just an alias for an already existing object, and any change to it will affect the referred-to object. So what I see here is that i and refI are the same, which means the same object (i) is modified more than once in the same expression.
But why do all the compilers treat statement 2 as well-defined behavior?
What about statement 3 (ptrI)?
All of them have undefined behavior before C++17 and all have well-defined behavior since C++17.
Note that you are not modifying i more than once in either example. You are modifying it only with the increment.
However, it is also undefined behavior to have a side effect on one scalar (here the increment of i) be unsequenced with a value computation (here the left-hand use of i). Whether the side effect is produced by directly acting on the variable or through a reference or pointer does not matter.
Before C++17, the << operator did not imply any sequencing of its operands, so the behavior is undefined in all your examples.
Since C++17, the << operator is guaranteed to evaluate its operands from left-to-right. C++17 also extended the sequencing rules for operators to overloaded operators when called with the operator notation. So in all your examples the behavior is well-defined and the left-hand use of i is evaluated first, before i's value is incremented.
Note, however, that some compilers were slow to implement these changes to the evaluation rules, so even if you use the -std=c++17 flag, older (and some current) compiler versions might unfortunately still violate the expected behavior.
In addition, at least in the case of GCC, the -Wsequence-point warning is explicitly documented to warn even for behavior that became well-defined in C++17, to help the user to avoid writing code that would have undefined behavior in C and earlier C++ versions, see GCC documentation.
The compiler is not required (and not able) to diagnose all cases of undefined behavior. In some simple situations it will be able to give you a warning (which you can turn into an error using -Werror or similar), but in more complex cases it will not. Still, your program will lose any guarantee on its behavior if you have undefined behavior, whether diagnosed or not.
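If the code has to behave the same under pre-C++17 and C++17 rules, a minimal sketch is to read the value in a statement of its own before incrementing. The function name print_then_increment and the use of std::ostringstream instead of std::cout are assumptions for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats "old value, new value" without relying on the
// C++17 left-to-right guarantee for <<.
std::string print_then_increment(int& i) {
    std::ostringstream out;
    const int before = i;  // value computation, finished at the semicolon
    ++i;                   // side effect, sequenced after the read above
    out << before << ", " << i;
    return out.str();
}
```

The same function works unchanged whether the caller passes i itself, a reference such as refI, or *ptrI, since all of them denote the same object.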
The order of evaluation rules are defined in terms of objects, not references or pointers, or whatever method you take to obtain the object.
That said, your three examples are exactly equivalent in terms of the order of evaluation rules (if we are only considering the object defined as i).
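A quick sketch of why the three spellings are equivalent: the name, the reference and the dereferenced pointer all designate one object at one address. The function name all_name_the_same_object is hypothetical.

```cpp
// Hypothetical check: i, refI and *ptrI are three ways to reach one object.
bool all_name_the_same_object() {
    int i = 1;
    int& refI = i;      // alias for i, not a new object
    int* ptrI = &refI;  // address of the referred-to object, i.e. &i
    ++refI;             // i == 2
    ++*ptrI;            // i == 3
    return &refI == &i && ptrI == &i && i == 3;
}
```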
Hence let's just look at your first example:
std::cout << i << ", " << ++i << std::endl;
For simplicity we can ignore the ", " and std::endl, hence:
std::cout << i << ++i;
VC(X) = Value computation of X
SE(X) = Side effects of X
Exec(X) = The execution of the function body of X
X <--- Y = X is sequenced after Y
Since C++11 and until C++17, this is undefined behavior because the side effect of D (the ++i operand in the graph) is unsequenced relative to the value computation of C (the plain i operand), and both involve the object i.
Since C++17, there is an extra guarantee (on the << and >> expressions) that both the value computation and side effects of C will be sequenced before the value computation and side effects of D (marked by the dotted lines), therefore the code becomes well-defined.
All of these result in Undefined Behaviour. Just because a compiler doesn't give a warning doesn't make it not so. Anything can happen with UB: it works, it works sometimes, it crashes, it blows up your computer, &c.

Using the post-increment in function arguments

When I run this code, the output is 11, 10.
Why on earth would that be? Can someone give me an explanation of this that will hopefully enlighten me?
Thanks
#include <iostream>
using namespace std;
void print(int x, int y)
{
cout << x << endl;
cout << y << endl;
}
int main()
{
int x = 10;
print(x, x++);
}
The C++ standard states (A note in section 1.9.16):
Value computations and side effects associated with the different argument expressions are unsequenced.
In other words, it's undefined and/or compiler-dependent which order the arguments are evaluated in before their value is passed into the function. So on some compilers (which evaluate the left argument first) that code would output 10, 10 and on others (which evaluate the right argument first) it will output 11, 10. In general you should never rely on undefined behaviour.
To help you understand this, imagine that each argument expression is evaluated before the function is called like so (not that this is exactly how it actually works, it's just an easy way to think of it that will help you understand the sequencing):
int arg1 = x; // This line
int arg2 = x++; // And this line can be swapped.
print(arg1, arg2);
The C++ Standard says that the two argument expressions are unsequenced. So, if we write out the argument expressions on separate lines like this, their order should not be significant, because the standard says they can be evaluated in any order. Some compilers might evaluate them in the order above, others might swap them:
int arg2 = x++; // And this line can be swapped.
int arg1 = x; // This line
print(arg1, arg2);
That makes it pretty obvious how arg2 can hold the value 10, while arg1 holds the value 11.
You should always avoid this undefined behaviour in your code.
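To get a predictable result, a minimal sketch is to evaluate the arguments in separate statements before the call. The helper capture below is a hypothetical stand-in for print() that returns its arguments instead of printing them, and the two function names are made up for illustration.

```cpp
#include <utility>

// Hypothetical stand-in for print(): returns what would have been printed.
std::pair<int, int> capture(int x, int y) { return {x, y}; }

// Explicit "left argument first" ordering.
std::pair<int, int> left_argument_first() {
    int x = 10;
    int arg1 = x;    // reads 10
    int arg2 = x++;  // yields 10, then x becomes 11
    return capture(arg1, arg2);
}

// Explicit "right argument first" ordering.
std::pair<int, int> right_argument_first() {
    int x = 10;
    int arg2 = x++;  // yields 10, then x becomes 11
    int arg1 = x;    // reads 11
    return capture(arg1, arg2);
}
```

The first version always passes 10, 10 and the second always passes 11, 10; pick whichever you actually mean and write it that way.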
On a whole the statement:
print(x, x++);
results in Undefined Behavior. Once a program has Undefined Behavior it ceases to be a valid C++ program and literally any behavior is possible. So it is pointless to look for reasoning behind the output of such a program.
Why is this Undefined Behavior?
Let's evaluate the program step by step to the point where we can prove beyond any doubt that it causes Undefined Behavior.
The order of evaluation of arguments to a function is Unspecified[Ref 1].
Unspecified means that an implementation is allowed to implement this particular functionality as it desires and it is not required to document the detail about it.
Applying the above rule to your function call:
print(x, x++);
An implementation might evaluate this as:
Left to Right or
Right to Left or
Any Magical order(in case of more than two function arguments)
In short you cannot rely on an implementation to follow any specific order because it is not required to as per the C++ Standard.
In C/C++ you cannot modify a variable more than once, or modify it and separately read it, without an intervening sequence point [Ref 2]. If you do so it results in Undefined Behavior. Irrespective of which of the arguments gets evaluated first in the said function call, there is no sequence point between them; a sequence point exists only after evaluation of all function arguments [Ref 3].
In this case x is being accessed without an intervening sequence point and hence it results in an Undefined Behavior.
Simply put it is best to write any code which does not invoke such Undefined Behaviors because once you do so you cannot expect any specific behavior from such a program.
[Ref 1] C++03 Standard §5.2.2.8
Para 8:
[...] The order of evaluation of function arguments is unspecified. [...]
[Ref 2]C++03 5 Expressions [expr]:
Para 4:
....
Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined.
[Ref 3]C++03 1.9 Program execution [intro.execution]:
Para 17:
When calling a function (whether or not the function is inline), there is a sequence point after the evaluation of all function arguments (if any) which takes place before execution of any expressions or statements in the function body.
x++ is a function argument, and arguments may be evaluated in an unspecified order, which means the behavior is undefined and not portable (or legal).
I believe this has to do with the function call stack where the last argument goes in first. So x++ is your y and x is the local x in print().
Late answer. Ignoring the issue of the order of evaluation, note how post-increment and post-decrement operate, as described on cppreference: "Post-increment and post-decrement creates a copy of the object, increments or decrements the value of the object and returns the copy from before the increment or decrement."
https://en.cppreference.com/w/cpp/language/operator_incdec
As an example where the difference in outcome is significant, consider std::list::splice, such as:
mylist.splice(where, mylist, iter++);
This will move the node pointed to by iter to just before the node pointed to by where. The sequence will be: make a copy of iter to be passed to splice, increment iter, then call splice using the copy of iter from before it was incremented. After splice returns, iter will point to the node that followed the node iter originally pointed to, as opposed to the node that follows the moved node at its new location in the list.
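A small sketch of that splice call on a concrete list; the function name splice_demo, the list contents and the starting positions of where and iter are assumptions chosen for illustration.

```cpp
#include <iterator>
#include <list>
#include <utility>
#include <vector>

// Hypothetical demo: returns the list contents after the splice together with
// the element iter refers to afterwards.
std::pair<std::vector<int>, int> splice_demo() {
    std::list<int> mylist{1, 2, 3, 4};
    auto where = mylist.begin();               // points at 1
    auto iter = std::next(mylist.begin(), 2);  // points at 3

    // iter++ hands splice a copy that still points at 3 and advances iter
    // to 4 before splice runs.
    mylist.splice(where, mylist, iter++);

    return {std::vector<int>(mylist.begin(), mylist.end()), *iter};
}
```

The list ends up as 3, 1, 2, 4 and iter refers to 4, the node that originally followed 3, rather than to 1, which now follows 3.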

Unexpected order of evaluation (compiler bug?) [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Undefined Behavior and Sequence Points
I'm not sure if this is a gcc bug or not, so I'll ask:
unsigned int n = 0;
std::cout << n++ << n << ++n;
gcc gives the extremely strange result:
"122" which AFAICT is impossible. Because << is left associative, it should be the same as:
operator<<(operator<<(operator<<(std::cout, n++), n), ++n)
and because there is a sequence point before and after evaluating arguments, n is never modified twice (or even accessed) between two sequence points -- so it shouldn't be undefined behaviour, just the order of evaluation unspecified.
So AFAICT valid results would be:
111
012
002
101
and nothing else
There is a sequence point between evaluating arguments and calling a function. There is no sequence point between evaluating different arguments.
Let's look at the outermost function call:
operator<<(operator<<(operator<<(std::cout, n++), n), ++n)
The arguments are
operator<<(operator<<(std::cout, n++), n)
and
++n
It is unspecified which of these is evaluated first. It's also allowed that the first argument is partially evaluated when the second argument is evaluated.
From the standard, section [intro.execution] (wording from draft 3225):
If A is not sequenced before
B and B is not sequenced before A, then A and B are unsequenced. [ Note: The execution of unsequenced
evaluations can overlap. — end note ]
Except where noted, evaluations of operands of individual operators and of subexpressions of individual
expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution
of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be
performed consistently in different evaluations. — end note ] The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, the behavior is undefined.
Because you have multiple operations with side effects on the same scalar object which are unsequenced with respect to each other, you're in that realm of undefined behavior, and even 999 would be a permissible output.
The first rule of compiler bugs: it's probably not a compiler bug but a misunderstanding on your part. Using the postfix and prefix operators in the same statement results in undefined behavior. Try using the -Wall option to give you more warnings and show you the potential pitfalls in your code.
Let's see what GCC 4.2.1 tells us when we ask for warnings about test.cpp:
#include <iostream>
int main() {
unsigned int n = 0;
std::cout << n++ << n << ++n << std::endl;
return 0;
}
When we compile:
$ g++ -Wall test.cpp -o test
test.cpp: In function ‘int main()’:
test.cpp:5: warning: operation on ‘n’ may be undefined
Your code is an example of why some books remark that experienced programmers dislike the ++ and -- operators; some other languages (Ruby, for example) do not implement ++ or -- at all.
