main.cpp
const int& f(int& i ) { return (++i);}
int main(){
int i = 10;
int a = i++ + i++; //undefined behavior
int b = f(i) + f(i); //but this is not
}
compile
$ g++ main.cpp -Wsequence-point
The statement int a = i++ + i++; is undefined behaviour.
The statement int b = f(i) + f(i); is not undefined.
Why?
No, the second statement results in unspecified behavior. You can confirm this with a demo: gcc gives the output 23 while msvc gives 24 for the same program.
Pre-C++11, we use the sequence point rules:
Pre-C++11 Undefined behavior
Between the previous and next sequence point, the value of any object in a memory location must be modified at most once by the evaluation of an expression, otherwise the behavior is undefined.
Pre-C++11 Rules
3) There is a sequence point after the copying of a returned value of a function and before the execution of any expressions outside the function.
In int a = i++ + i++;, i is modified twice between sequence points, so the behavior is undefined.
In int b = f(i) + f(i);, each function call introduces sequence points, and i is modified only once between consecutive sequence points, so there is no UB.
Note, though, that the order of evaluation is unspecified, so the evaluation of the result of the first f(i) might happen before or after the second f(i), which might lead to different results depending on the compiler and optimization level, and even between calls.
Since C++11, we use the "sequenced before" rules, which in a similar way disallow i++ + i++ but allow f(i) + f(i).
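One way to remove the dependence on evaluation order is to give each call its own full statement and copy the result before the next call runs. The sketch below assumes a left-to-right order is what was intended; f is taken from the question, and the helper name sum_of_two_increments is hypothetical.

```cpp
// f is copied from the question: it increments i and returns a reference to it.
const int& f(int& i) { return ++i; }

// Hypothetical helper: forces a left-to-right order by putting each call
// in its own statement and copying the value before the next call.
int sum_of_two_increments(int& i) {
    const int first = f(i);   // copy of i after the first increment
    const int second = f(i);  // copy of i after the second increment
    return first + second;
}
```

Starting from i = 10, this returns 11 + 12 = 23 on every conforming compiler, because each copy is taken before i changes again.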
In the legacy code base I'm working on, I discovered the line
n = ++n % size;
that is just a bad phrasing of the intended
n = (n+1) % size;
as deduced from the surrounding code and runtime-proved. (The latter now replaces the former.)
But since this code was marked as an error by Cppcheck, and caused a warning in GCC, without ever having caused any malfunction, I didn't stop thinking here. I reduced the line to
n = ++n;
still getting the original error/warning messages:
Cppcheck 1.80:
Id: unknownEvaluationOrder
Summary: Expression 'n=++n' depends on order of evaluation of side effects
Message: Expression 'n=++n' depends on order of evaluation of side effects
GCC (mingw32-g++.exe, version 4.9.2, C++98):
warning: operation on 'n' may be undefined [-Wsequence-point]|
I already learned that assignment expressions in C/C++ can be heavily affected by undefined evaluation order, but in this very case I just can't imagine how.
Can the undefined evaluation order of n = ++n; really be relevant for the resulting program, especially for the intended value of n? This is what I imagine may happen:
Scenario #1
++n;
n=n;
Scenario #2
n=n;
++n;
I know that the meaning and implications of relying on undefined behaviour in C++ are hard to understand and hard to teach.
I know that the behaviour of n=++n; is undefined by C++ standards before C++11. But it has a defined behaviour from C++11 on, and this (now standard-defined behaviour) is exactly the same I'm observing with several compilers[1] for this small demo program
#include <iostream>
using namespace std;
int main()
{
int n = 0;
cout << "n before: " << n << endl;
n=++n;
cout << "n after: " << n << endl;
return 0;
}
that has the output
n before: 0
n after: 1
Is it reasonable to expect that the behaviour is actually the same for all compilers regardless of being defined or not by standards? Can you (a) show one counter example or (b) give an easy to understand explanation how this code could produce wrong results?
[1] the compilers I used
Borland-C++ 5.3.0 (pre-C++98)
Borland-C++ 5.6.4 (C++98)
C++ (vc++)
C++ (gcc 6.3)
C++14 (gcc 6.3)
C++14 clang
The increment order is precisely defined. It is stated there that
i = ++i + 2; // undefined behavior until C++11
Since you use a C++11 compiler, you can leave your code as is. Nevertheless, I think that the expressiveness of
n = (n+1) % size;
is higher. You can more easily figure out what was intended by the writer of this code.
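As a sketch of that clearer form, the wrap-around step can be pulled into a small function. The name next_index and the use of std::size_t are assumptions for illustration; the original code base's types are not shown in the question.

```cpp
#include <cstddef>

// Hypothetical helper: advances an index through [0, size) and wraps to 0.
// Assumes size > 0 and n < size.
std::size_t next_index(std::size_t n, std::size_t size) {
    return (n + 1) % size;
}
```

The call site then reads n = next_index(n, size);, which modifies n exactly once and is well defined under every C++ standard.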
According to cppreference:
If a side effect on a scalar object is unsequenced relative to another side effect on the same scalar object, the behavior is undefined:
i = ++i + 2; // undefined behavior until C++11
i = i++ + 2; // undefined behavior until C++17
f(i = -2, i = -2); // undefined behavior until C++17
f(++i, ++i); // undefined behavior until C++17, unspecified after C++17
i = ++i + i++; // undefined behavior
For the case n = ++n; the behavior would be undefined before C++11, but in practice it does not matter which happens first, the assignment n = or the increment ++n.
Consider the following code in strange.cpp:
#include <vector>
using namespace std;
int i = 0;
int *bar()
{
++i;
return &i;
}
int main()
{
for(size_t j = 0; j < 99999999999; ++j) // (*)
{
const auto p = bar();
if(!p) // (**)
return -1;
}
}
Compiling this with g++ gives a warning:
$ g++ --std=c++11 -O3 strange.cpp
strange.cpp: In function ‘int main()’:
strange.cpp:12:12: warning: iteration 4294967296ul invokes undefined behavior [-Waggressive-loop-optimizations]
++i;
^
strange.cpp:19:9: note: containing loop
for(size_t j = 0; j < 99999999999; ++j) // (*)
^
I don't understand why the increment invokes undefined behavior. Moreover, there are two changes, each of which makes the warning disappear:
changing the line (*) to for(int j...
changing the line (**) to if(!*p)
What is the meaning of this warning, and why are the changes relevant to it?
Note
$ g++ --version
g++ (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4
The increment is undefined because once i reaches std::numeric_limits<int>::max() (2^31 - 1 on a 32-bit, LP64 or LLP64 platform), incrementing it will overflow, which is undefined behavior for signed integral types.
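If the counter really can reach the maximum, one option is to test for it before incrementing. This is only a sketch of a guard, not something from the question; the helper name checked_increment is hypothetical.

```cpp
#include <limits>

// Hypothetical helper: increments value only when that cannot overflow.
// Returns false and leaves value untouched if it is already the maximum int.
bool checked_increment(int& value) {
    if (value == std::numeric_limits<int>::max()) {
        return false;
    }
    ++value;
    return true;
}
```

The comparison happens before the increment, so the overflowing ++ is never executed.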
gcc is warning on iteration 4294967296ul (2^32) rather than iteration 2147483646u (2^31) as you might expect, because it doesn't know the initial value of i; some other code might have run before main to set i to something other than 0. But once main is entered, no other code can run to alter i, and so once 2^32 iterations have completed it will have at some point reached 2^31 - 1 and overflowed.
Changing the line (*) to for(int j... "fixes" it by turning the controlling condition of the loop into a tautologically true expression; this makes the loop an infinite loop, since the if inside the loop will never execute, as &i cannot be a null pointer. Infinite loops can be optimized away, so gcc eliminates the body of the loop and the integer overflow of i does not occur.
Changing the line (**) to if(!*p) "fixes" it by allowing gcc an out from the undefined behavior of integer overflow. The only way to prevent integer overflow is for i to have an initial value that is negative, such that at some point i reaches zero. This is possible (see above), and the only alternative is undefined behavior, so it must happen. So i reaches zero, the if inside the loop executes, and main returns -1.
see simple example:
int a = 0;
int b = (a ++ , a + 1); // result of b is UB or well defined ? (c++03).
This was changed in c++11/c++14 ?
The result is well defined and has been since C++98. The comma operator introduces a sequence point (or a "sequenced before" relationship in later C++s) between the write and the second read of a and I don't see any other potential reasons for undefined behavior.
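A small sketch of the same expression wrapped in a function, to make the guaranteed result easy to check. The helper name comma_example is hypothetical.

```cpp
// Hypothetical helper wrapping the expression from the question.
int comma_example(int& a) {
    // The comma operator fully evaluates a++ (including its side effect)
    // before a + 1 reads a.
    int b = (a++, a + 1);
    return b;
}
```

Starting from a = 0, the increment makes a equal to 1, and then a + 1 gives b = 2.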
I vaguely remember reading somewhere that it is undefined behaviour if multiple operands in a compound expression modify the same object.
I believe an example of this UB is shown in the code below however I've compiled on g++, clang++ and visual studio and all of them print out the same values and can't seem to produce unpredictable values in different compilers.
#include <iostream>
int a( int& lhs ) { lhs -= 4; return lhs; }
int b( int& lhs ) { lhs *= 7; return lhs; }
int c( int& lhs ) { lhs += 1; return lhs; }
int d( int& lhs ) { lhs += 2; return lhs; }
int e( int& lhs ) { lhs *= 3; return lhs; }
int main( int argc, char **argv )
{
int i = 100;
int j = ( b( i ) + c( i ) ) * e( i ) / a( i ) * d( i );
std::cout << i << ", " << j << std::endl;
return 0;
}
Is this behaviour undefined or have I somehow conjured up a description of supposed UB that is not actually undefined?
I would be grateful if someone could post an example of this UB and maybe even point me to where in the C++ standard that it says it is UB.
No, it is not. Undefined behavior is out of the question here (assuming the int arithmetic does not overflow): all modifications of i are isolated by sequence points (using C++03 terminology). There's a sequence point at the entrance to each function and there's a sequence point at the exit.
The behavior is unspecified here.
Your code actually follows the same pattern as the classic example often used to illustrate the difference between undefined and unspecified behavior. Consider this
int i = 1;
int j = ++i * ++i;
People will often claim that in this example the "result does not depend on the order of evaluation and therefore j must always be 6". This is an invalid claim, since the behavior is undefined.
However in this example
int inc(int &i) { return ++i; }
int i = 1;
int j = inc(i) * inc(i);
the behavior is formally only unspecified. Namely, the order of evaluation is unspecified. However, since the result of the expression does not depend on the order of evaluation at all, j is guaranteed to always end up as 6. This is an example of how a generally dangerous combination of unspecified behaviors can still lead to a perfectly defined result.
In your case the result of your expression does critically depend on the order of evaluation, which means that the result will be unpredictable. Yet, there's no undefined behavior here, i.e. the program is not allowed to format your hard drive. It is only allowed to produce unpredictable result in j.
P.S. Again, it might turn out that some of the evaluation scenarios for your expression lead to signed integer overflow (I haven't analyzed them all), which by itself triggers undefined behavior. So, there's still a potential for unspecified behavior leading to undefined behavior in your expression. But this is probably not what your question is about.
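The inc example from this answer can be checked with a short sketch. inc is copied from the answer; the wrapper name product_of_two_increments is hypothetical.

```cpp
// inc is copied from the answer: it returns the incremented value by copy.
int inc(int& i) { return ++i; }

// Hypothetical wrapper: the two calls may run in either order, but one
// returns 2 and the other 3, so the product does not depend on the order.
int product_of_two_increments(int& i) {
    return inc(i) * inc(i);
}
```

Starting from i = 1, the result is 6 and i ends up as 3, whichever call the compiler chooses to run first.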
No, it's not undefined behavior.
But it does invoke unspecified behavior.
This is because the order that sub-expressions are evaluated is unspecified.
int j = ( b( i ) + c( i ) ) * e( i ) / a( i ) * d( i );
In the above expression the sub expressions:
b(i)
c(i)
e(i)
a(i)
d(i)
Can be evaluated in any order. Because they all have side-effects the results will depend on this order.
If you divide up the expression into all sub-expressions (this is pseudo-code)
Then you can see the ordering that is required. Not only can the above expressions be done in any order, potentially they can be interleaved with the higher-level sub-expressions (with only a few constraints).
tmp_1 = b(i) // A
tmp_2 = c(i) // B
tmp_3 = e(i) // C
tmp_4 = a(i) // D
tmp_5 = d(i) // E
tmp_6 = tmp_1 + tmp_2 // F (Happens after A and B)
tmp_7 = tmp_6 * tmp_3 // G (Happens after C and F)
tmp_8 = tmp_7 / tmp_4 // H (Happens after D and G)
tmp_9 = tmp_8 * tmp_5 // I (Happens after E and H)
int j = tmp_9; // J (Happens after I)
It isn't undefined behavior but it has unspecified results: The only modified object is i through the references passed to the functions. However, the call to the functions introduce sequence points (I don't have the C++ 2011 with me: they are called something different there), i.e. there is no problem of multiple changes within an expression causing undefined behavior.
However, the order in which the expression is evaluated isn't specified. As a result you may get different results if the order of the evaluation changes. This isn't undefined behavior: The result is one of all possible orders of evaluation. Undefined behavior means that the program can behave in any way it wants, including producing the "expected" (expected by the programmer) results for the expression in question while corrupting all other data.
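To see how much the result depends on the order, the calls can be pinned down by writing each one as its own statement. This is a sketch of two of the many permitted orders, not a claim about what any particular compiler picks; a to e are copied from the question, and the two evaluate_* helper names are hypothetical.

```cpp
// a..e are copied from the question.
int a(int& lhs) { lhs -= 4; return lhs; }
int b(int& lhs) { lhs *= 7; return lhs; }
int c(int& lhs) { lhs += 1; return lhs; }
int d(int& lhs) { lhs += 2; return lhs; }
int e(int& lhs) { lhs *= 3; return lhs; }

// Hypothetical helper: calls in the order b, c, e, a, d.
int evaluate_left_to_right(int& i) {
    const int vb = b(i);
    const int vc = c(i);
    const int ve = e(i);
    const int va = a(i);
    const int vd = d(i);
    return (vb + vc) * ve / va * vd;
}

// Hypothetical helper: calls in the order d, a, e, c, b.
int evaluate_right_to_left(int& i) {
    const int vd = d(i);
    const int va = a(i);
    const int ve = e(i);
    const int vc = c(i);
    const int vb = b(i);
    return (vb + vc) * ve / va * vd;
}
```

Starting from i = 100, the first order leaves i at 2101 and yields 2947703, while the second leaves i at 2065 and yields 722160. Both are valid outcomes of the original single-expression form.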
int main() {
int a = 10;
int b = a * a++;
printf("%i %i", a, b);
return 0;
}
Is the output of the above code undefined behavior?
No. In
int b = a * a++;
the behavior is undefined, so the result can be anything - that's not what "implementation dependent" means.
You might wonder why it's UB here since a is modified only once. The reason is there's also a requirement in 5/4 paragraph of the Standard that the prior value shall be accessed only to determine the value to be stored. a shall only be read to determine the new value of a, but here a is read twice - once to compute the first multiplier and once again to compute the result of a++ that has a side-effect of writing a new value into a. So even though a is modified once here it is undefined behavior.
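A well-defined rewrite separates the reads from the write. The sketch below assumes the intent was "multiply the old value by itself, then increment", which the question does not actually state; the helper name square_then_increment is hypothetical.

```cpp
// Hypothetical helper: one possible intended meaning of "a * a++".
int square_then_increment(int& a) {
    const int b = a * a;  // both reads happen before any modification
    ++a;                  // the single write is its own full statement
    return b;
}
```

Starting from a = 10, this returns 100 and leaves a at 11, with no unsequenced read and write of a.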