Without -O2 this code prints 84 84, with O2 flag the output is 84 42. The code was compiled using gcc 4.4.3. on 64-bit Linux platform. Why the output for the following code is different?
Note that when compiled with -Os the output is 0 42
#include <iostream>
using namespace std;
int main() {
long long n = 42;
int *p = (int *)&n;
*p <<= 1;
cout << *p << " " << n << endl;
return 0;
}
When you use optimization with gcc, it can use certain assumptions based on the type of expressions to avoid repeating unnecessary reads and to allow retaining variables in memory.
Your code has undefined behaviour because you cast a pointer to a long long (which gcc allows as an extenstion) to a pointer to an int and then manipulate the pointed-to-object as if it were an int. A pointer-to-int cannot normally point to an object of type long long so gcc is allowed to assume that an operation that writes to an int (via a pointer) won't affect an object that has type long long.
It is therefore legitimate of it to cache the value of n between the time it was originally assigned and the time at which it is subsequently printed. No valid write operation could have changed its value.
The particular switch and documentation to read is -fstrict-aliasing.
You're breaking strict aliasing. Compiling with -Wall should give you a dereferencing type-punned pointer warning. See e.g. http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
I get the same results with GCC 4.4.4 on Linux/i386.
The program's behavior is undefined, since it violates the strict aliasing rule.
Related
I have a question regarding conversion of integers:
#include <iostream>
#include <cstdint>
using namespace std;
int main()
{
int N,R,W,H,D;
uint64_t sum = 0;
uint64_t sum_2 = 0;
cin >> W >> H >> D;
sum += static_cast<uint64_t>(W) * H * D * 100;
sum_2 += W * H * D * 100;
cout << sum << endl;
cout << sum_2 << endl;
return 0;
}
I thought, that sum should be equal to sum_2, because uint64_t type is bigger than int type and during arithmetic operations compiler chooses bigger type(which is uint64_t). So by my understanding, sum_2 must have uint64_t type. But it has int type.
Can you explain my why sum_2 was converted to int? Why didn't it stay uint64_t?
Undefined behavior signed-integer overflow/underflow, and well-defined behavior unsigned-integer overflow/underflow, in C and C++
If I enter 200, 300, and 358 for W, H, and D, I get the following output, which makes perfect sense for my gcc compiler on a 64-bit Linux machine:
2148000000
18446744071562584320
Why does this make perfect sense?
Well, the default type is int, which is int32_t for the gcc compiler on a 64-bit Linux machine, and its max value is 2^32/2-1 = 2147483647, and its min value is -2147483648. The line sum_2 += W * H * D * 100; does int arithmetic since that's the type of each variable there, 100 included, and no explicit cast is used. So, after doing int arithmetic, it then implicitly casts the int result into a uint64_t as it stores the result into the uint64_t sum_2 variable. The int arithmetic on the right-hand side prior to that point, however, results in 2148000000, which has undefined behavior signed integer overflow over the top of the max int value and back down to the min int value and up again.
Even though according to the C and C++ standards, signed integer overflow or underflow is undefined behavior, in the gcc compiler, I know that signed integer overflow happens to roll over to negative values if it is not optimized out. This, by default, is still "undefined behavior", and a bug, however, and must not be relied upon by default. See notes below for details and information on how to make this well-defined behavior via a gcc extension. Anyway, 2148000000 - 2147483647 = 516353 up-counts, the first of which causes roll-over. The first count up rolls over to the min int32_t value of -2147483648, and the next (516353 - 1 = 516352) counts go up to -2147483648 + 516352 = -2146967296. So, the result of W * H * D * 100 for the inputs above is now -2146967296, based on undefined behavior. Next, that value is implicitly cast from an int (int32_t in this case) to a uint64_t in order to store it from an int (int32_t in this case) into the uint64_t sum_2 variable, resulting in well-defined behavior unsigned integer underflow. You start with -2146967296. The first down-count underflows down to uint64_t max, which is 2^64-1 = 18446744073709551615. Now subtract the remaining 2146967296 - 1 = 2146967295 counts from that and you get 18446744073709551615 - 2146967295 = 18446744071562584320, just as shown above!
Voila! With a little compiler and hardware architecture understanding, and some expected but undefined behavior, the result is perfectly explainable and makes sense!
To easily see the negative value, add this to your code:
int sum_3 = W*H*D*100;
cout << sum_3 << endl; // output: -2146967296
Notes
Never intentionally leave undefined behavior in your code. That is known as a bug. You do not have to write ISO C++, however! If you can find compiler documentation indicating a certain behavior is well-defined, that's ok, so long as you know you are writing in the g++ language and not the C++ language, and don't expect your code to work the same across compilers. Here is an example where I do that: Using Unions for "type punning" is fine in C, and fine in gcc's C++ as well (as a gcc [g++] extension). I'm generally okay with relying on compiler extensions like this. Just be aware of what you're doing is all.
#user17732522 makes a great point in the comments here:
"in the gcc compiler, I know that signed integer overflow happens to roll over to negative values.": That is not correct by-default. By-default GCC assumes that signed overflow does not happen and applies optimizations based on that. There is the -fwrapv and/or -fno-strict-overflow flag to enforce wrapping behavior. See https://gcc.gnu.org/onlinedocs/gcc-12.1.0/gcc/Code-Gen-Options.html#Code-Gen-Options.
Take a look at that link above (or even better, this one, to always point to the latest gcc documentation instead of the documentation for just one version: https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#Code-Gen-Options). Even though signed-integer overflow and underflow is undefined behavior (a bug!) according to the C and C++ standards, gcc allows, by extension, to make it well-defined behavior (not a bug!) so long as you use the proper gcc build flags. Using -fwrapv makes signed-integer overflow/underflow well-defined behavior as a gcc extension. Additionally, -fwrapv-pointer allows pointers to safely overflow and underflow when used in pointer arithmetic, and -fno-strict-overflow applies both -fwrapv and -fwrapv-pointer. The relevant documentation is here: https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html#Code-Gen-Options (emphasis added):
These machine-independent options control the interface conventions used in code generation.
Most of them have both positive and negative forms; the negative form of -ffoo is -fno-foo.
...
-fwrapv
This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using twos-complement representation. This flag enables some optimizations and disables others. The options -ftrapv and -fwrapv override each other, so using -ftrapv -fwrapv on the command-line results in -fwrapv being effective. Note that only active options override, so using -ftrapv -fwrapv -fno-wrapv on the command-line results in -ftrapv being effective.
-fwrapv-pointer
This option instructs the compiler to assume that pointer arithmetic overflow on addition and subtraction wraps around using twos-complement representation. This flag disables some optimizations which assume pointer overflow is invalid.
-fstrict-overflow
This option implies -fno-wrapv -fno-wrapv-pointer and when negated [as -fno-strict-overflow] implies -fwrapv -fwrapv-pointer.
So, relying on signed-integer overflow or underflow withOUT using the proper gcc extension flags above is undefined behavior, and therefore a bug, and can not be safely relied upon! It may be optimized out by the compiler and not work reliably as intended without the gcc extension flags above.
My test code
Here is my total code I used for some quick checks to write this answer. I ran it with the gcc/g++ compiler on a 64-bit Linux machine. I did not use the -fwrapv or -fno-strict-overflow flags, so all signed integer overflow or underflow demonstrated below is undefined behavior, a bug, and cannot be relied upon safely without those gcc extension flags. The fact that it works is circumstantial, as the compiler could, by default, choose to optimize out the overflows in unexpected ways.
If you run this on an 8-bit microcontroller such as an Arduino Uno, you'd get different results since an int is a 2-byte int16_t by default, instead! But, now that you understand the principles, you could figure out the expected result. (Also, I think 64-bit values don't exist on that architecture, so they become 32-bit values).
#include <iostream>
#include <cstdint>
using namespace std;
int main()
{
int N,R,W,H,D;
uint64_t sum = 0;
uint64_t sum_2 = 0;
// cin >> W >> H >> D;
W = 200;
H = 300;
D = 358;
sum += static_cast<uint64_t>(W) * H * D * 100;
sum_2 += W * H * D * 100;
cout << sum << endl;
cout << sum_2 << endl;
int sum_3 = W*H*D*100;
cout << sum_3 << endl;
sum_2 = -1; // underflow to uint64_t max
cout << sum_2 << endl;
sum_2 = 18446744073709551615ULL - 2146967295;
cout << sum_2 << endl;
return 0;
}
Just a short version of #Gabriel Staples good answer.
"and during arithmetic operations compiler chooses bigger type(which is uint64_t)"
There is no uin64_t in W * H * D * 100, just four int. After this multiplication, the int product (which overflowed and is UB) is assigned to an uint64_t.
Instead, use 100LLU * W * H * D to perform a wider unsigned multiplication.
Consider the following program:
#include <iostream>
int main()
{
unsigned int a = 3;
unsigned int b = 7;
std::cout << (a - b) << std::endl; // underflow here!
return 0;
}
In the line starting with std::cout an underflow is happening because a is lesser than b so a-b is less than 0, but since a and b are unsigend so is a-b.
Is there a compiler flag (for G++) that gives me a warning when I try to calculate the difference of two unsigend integers?
Now, one could argue that an overflow/underflow can happen in any calculation using any operator. But I think it is more dangerous to apply operator - to unsigend ints because with unsigned integers this error may happen with quite low (to me: "more common") numbers.
A (static analysis) tool that finds such things would also be great but I much prefer a compiler flag and warning.
GCC does not (afaict) support it, but Clang's UBSanitizer has the following option [emphasis mine]:
-fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where the result of an unsigned integer computation cannot be represented in its type. Unlike signed integer overflow, this is not undefined behavior, but it is often unintentional. This sanitizer does not check for lossy implicit conversions performed before such a computation
The following code running compiler options -O3 vs -O0 results different output:
#include <stdlib.h>
#include <stdio.h>
int main(){
int *p = (int*)malloc(sizeof(int));
int *q = (int*)realloc(p, sizeof(int));
*p = 1;
*q = 2;
if (p == q)
printf("%d %d", *p, *q);
return 0;
}
I was very surprised with the outcome.
Compiling with clang 3.4, 3.5 (http://goo.gl/sDLvrq)
using compiler options -O0 — output: 2 2
using compiler options -O3 — output: 1 2
Is it a bug?
Interestingly if I modify the code slightly
(http://goo.gl/QwrozF) it behaves as expected.
int *p = (int*)malloc(sizeof(int));
*p = 1;
Testing it on gcc seems to work fine.
After the realloc, p is no longer valid.
Assuming both of the allocations are successful, q points to an allocated region of memory and p is an invalid pointer. The standard treats realloc and free as deallocation routines, and if successful, the address the pointer held can no longer be used. If the call to realloc fails for some reason, the original memory is still valid (but of course q isn't, it's NULL).
Although you compare p and q, you've already written to an invalid pointer, so all bets are off.
What's probably happening here is that the O3 setting is causing the compiler to ignore the pointers and just substitute numbers inline. High optimisation means a compiler can take all sorts of short cuts and ignore statements so long as it guarantees the same result - the condition being that all of the code is well defined.
Suppose you wrote a function in c++, but absentmindedly forgot to type the word return. What would happen in that case? I was hoping that the compiler would complain, or at least a segmentation fault would be raised once the program got to that point. However, what actually happens is far worse: the program spews out rubbish. Not only that, but the actual output depends on the level of optimization! Here's some code that demonstrate this problem:
#include <iostream>
#include <vector>
using namespace std;
double max_1(double n1,
double n2)
{
if(n1>n2)
n1;
else
n2;
}
int max_2(const int n1,
const int n2)
{
if(n1>n2)
n1;
else
n2;
}
size_t max_length(const vector<int>& v1,
const vector<int>& v2)
{
if(v1.size()>v2.size())
v1.size();
else
v2.size();
}
int main(void)
{
cout << max_1(3,4) << endl;
cout << max_1(4,3) << endl;
cout << max_2(3,4) << endl;
cout << max_2(4,3) << endl;
cout << max_length(vector<int>(3,1),vector<int>(4,1)) << endl;
cout << max_length(vector<int>(4,1),vector<int>(3,1)) << endl;
return 0;
}
And here's what I get when I compile it at different optimization levels:
$ rm ./a.out; g++ -O0 ./test.cpp && ./a.out
nan
nan
134525024
134525024
4
4
$ rm ./a.out; g++ -O1 ./test.cpp && ./a.out
0
0
0
0
0
0
$ rm ./a.out; g++ -O2 ./test.cpp && ./a.out
0
0
0
0
0
0
$ rm ./a.out; g++ -O3 ./test.cpp && ./a.out
0
0
0
0
0
0
Now imagine that you're trying to debug the function max_length. In production mode you get the wrong answer, so you recompile in debug mode, and now when you run it everything works fine.
I know there are ways to avoid such cases altogether by adding the appropriate warning flags (-Wreturn-type), but I'm still have two questions
Why does the compiler even agree to compile a function without a return statement? Is this feature required for legacy code?
Why does the output depend on the optimization level?
This is undefined behavior to drop off the end of the value returning function, this is covered in the draft C++ standard section `6.6.31 The return statement which says:
Flowing off the end of a function is equivalent to a return with no
value; this results in undefined behavior in a value-returning
function.
The compiler is not required to issue a diagnostic, we can see this from section 1.4 Implementation compliance which says:
The set of diagnosable rules consists of all syntactic and semantic
rules in this International Standard except for those rules containing
an explicit notation that “no diagnostic is required” or which are
described as resulting in “undefined behavior.”
although compiler in general do try and catch a wide range of undefined behaviors and produce warnings, although usually you need to use the right set of flags. For gcc and clang I find the following set of flags to be useful:
-Wall -Wextra -Wconversion -pedantic
and in general I would encourage you to turn warnings into errors using -Werror.
Compiler are notorious for taking advantage of undefined behavior during the optimization stages, see Finding Undefined Behavior Bugs by Finding Dead Code for some good examples including the infamous Linux kernel null pointer check removal where in processing this code:
struct foo *s = ...;
int x = s->f;
if (!s) return ERROR;
gcc inferred that since s was deferenced in s->f; and since dereferencing a null pointer is undefined behavior then s must not be null and therefore optimizes away the if (!s) check on the next line (copied from my answer here).
Since undefined behavior is unpredictable, then at more aggressive settings the compiler in many cases will do more aggressive optimizations many of them may not make much intuitive sense but, hey it is undefined behavior so you should have no expectations anyway.
Note, that although there are many cases the compiler can determine a function is not properly returning in the general case this is the halting problem. Doing this at run-time automatically would carry a cost which violates the don't pay for what you don't use philosophy. Although both gcc and clang implement sanitizers to check for things like undefined behavior, for example using the -fsanitize=undefined flag would check for undefined behavior at run-time.
You may want to check out this answer here
The just of it is that the compiler allows you to not have a return statement since there are potentially many different execution paths, ensuring each will exit with a return can be tricky at compile time, so the compiler will take care of it for you.
Things to remember:
if main ends without a return it will always return 0.
if another function ends without a return it will always return the last value in the eax register, usually the last statement
optimization changes the code on the assembly level. This is why you are getting the weird behavior, the compiler is "fixing" your code for you changing when things are executed giving a different last value, and thus return value.
Hope this helped!
The following snippet of C++ code computes Fibonacci numbers. With 32-bit integers, we overflow when n=47 and b becomes negative.
int a=0, b=1, n=2;
do {
a += b; int c = a; a = b; b = c;
++n;
} while ( b>0 and n<50 );
std::cout << "Ended at n=" << n << " b=" << b << std::endl;
Compiling on g++ 4.9.1 all is well; unless I use -O2 or -O3 in which case the loop runs until n=50. Is the compiler perhaps assuming that, as a and b start out positive and only additions are applied, they must stay positive? A look at the assembly output confirms that the condition b>0 isn't even being checked.
Same happens for g++ 4.7.0. So I suspect this is deliberate behaviour...?
First of all: This code invokes undefined behavior according to [expr]/4. Therefore a compiler can deduce that since no negative value can be assigned to b without UB (which case the compiler doesn't consider any further), b can't get negative and thus that condition doesn't need to be checked.
As noted in the comments, the flag -fwrapv instructs the compiler to
[…] assume that signed arithmetic overflow of addition, subtraction and
multiplication wraps around using twos-complement representation. This
flag enables some optimizations and disables others.
With it GCC should produce the same program as with optimization levels lower than O2.