gcc and clang produce different outputs while left-shifting with unsigned values - c++

According to this interesting paper about undefined behavior optimization in C, the expression (x<<n)|(x>>32-n) "performs undefined behavior in C when n = 0". This Stack Overflow discussion confirms that the behavior is undefined for negative integers, and discusses some other potential pitfalls with left-shifting values.
Consider the following code:
#include <stdio.h>
#include <stdint.h>

uint32_t rotl(uint32_t x, uint32_t n)
{
    return (x << n) | (x >> (32 - n));
}

int main()
{
    uint32_t y = rotl(10, 0);
    printf("%u\n", y);
    return 0;
}
Compile using the following parameters: -O3 -std=c11 -pedantic -Wall -Wextra
In gcc >5.1.0 the output of the program is 10.
In clang >3.7.0 the output is 4294967295.
Interestingly, this is still true when compiling as C++: gcc results, clang results.
Therefore, my questions are as follows:
It is my understanding from the language in the standard that this should not invoke undefined / implementation-defined behavior, since both of the parameters are unsigned integers and none of the values are negative. Is this correct? If not, what is the relevant section of the standard for C11 and C++11?
If the previous statement is true, which compiler is producing the correct output according to the C/C++ standard? Intuitively, shifting left by zero bits should give you back the value, i.e. what gcc outputs.
If the above is not the case, why are there no warnings that this code may invoke undefined behavior due to left-shift overflow?

From [expr.shift], emphasis mine:
The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.
You are doing:
(x >> (32 - n))
with n == 0, so you're right-shifting a 32-bit number by 32. Hence, UB.
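For reference, the usual way to write the rotate without UB is to mask the shift count so both shifts stay in the range 0..31 (a minimal sketch; rotl_safe is just an illustrative name, and compilers generally recognize this pattern and emit a single rotate instruction):

uint32_t rotl_safe(uint32_t x, uint32_t n)
{
    n &= 31;                                   // reduce the rotate count mod 32
    return (x << n) | (x >> ((32 - n) & 31));  // both shift counts are now 0..31
}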

Your n is 0, so x >> (32 - n) performs x >> 32, which is undefined behavior: shifting a uint32_t by 32 bits or more is undefined.

If n is 0, 32-n is 32, and since x has 32 bits, x>>(32-n) is UB.
The issue in the linked SO post is different. This one has nothing to do with signedness.

A part of the post not fully answered:
why are there no warnings that this code may invoke undefined behavior due to left-shift overflow?
Looking at the add() code below, what should the compiler warn about? The sum is UB if it lies outside the range INT_MIN ... INT_MAX, and the code takes no precautions to prevent overflow, so should it warn? If you think so, then so much code would be warning about potential this and that, that programmers would quickly turn the warning off.
int add(int a, int b) {
    return a + b;
}
The situation is not much different here. If n > 0 && n < 32, there is no problem.
uint32_t rotl(uint32_t x, uint32_t n) {
    return (x << n) | (x >> (32 - n));
}
C creates fast code primarily because it lacks lots of run-time error checking, and compilers are able to generate very well-optimized code. If one needs lots of run-time checks, there are other languages suitable for those programmers.
C is coding without a net.

When the C Standard was written, some implementations would behave weirdly when trying to perform a shift by extremely large or negative amounts, e.g. left-shifting by -1 might tie up a CPU with interrupts disabled while its microcode shifts a value four billion times, and disabling interrupts for that long might cause other system faults. Further, while few if any implementations would do anything particularly weird when shifting by exactly the word size, implementations weren't consistent about the value returned. Some would treat it as a shift by zero, while others would yield the same result as shifting by one, word-size times, and some would sometimes do one and sometimes the other.
If the authors of the Standard had specified that shifting by precisely the word size may select in Unspecified fashion between those two possible behaviors, that would have been useful, but the authors of the Standard weren't interested in specifying all the things that compilers would naturally do with or without a mandate. I don't think they considered the idea that implementations for commonplace platforms wouldn't naturally yield the commonplace behavior for expressions like the "rotate" given above, and didn't want to clutter the Standard with such details.
Today, however, some compiler writers think it's more important to exploit all forms of UB for "optimization" than to support useful natural behaviors which had previously been supported by essentially all commonplace implementations. Whether or not making the "rotate" expression malfunction when n==0 would allow a compiler to generate a useful program which is smaller than would otherwise be possible is irrelevant.

Related

Compiler optimizations may cause integer overflow. Is that okay?

I have an int x. For simplicity, say ints occupy the range -2^31 to 2^31-1. I want to compute 2*x-1. I allow x to be any value 0 <= x <= 2^30. If I compute 2*(2^30), I get 2^31, which is an integer overflow.
One solution is to compute 2*(x-1)+1. There's one more subtraction than I want, but this shouldn't overflow. However, the compiler will optimize this to 2*x-1. Is this a problem for the source code? Is this a problem for the executable?
Here is the godbolt output for 2*x-1:

func(int):                # @func(int)
    lea eax, [rdi + rdi]
    dec eax
    ret

Here is the godbolt output for 2*(x-1)+1:

func(int):                # @func(int)
    lea eax, [rdi + rdi]
    dec eax
    ret
As Miles hinted: the C++ source text is bound by the rules of the C++ language (integer overflow = bad), but the compiler is only bound by the rules of the CPU (overflow = ok). It is allowed to make optimizations that the code isn't allowed to.
But don't take this as an excuse to get lazy. If you write undefined behavior, the compiler will take that as a hint and do other optimizations that result in your program doing the wrong thing.
Just because signed integer overflow isn't well-defined at the C++ language level doesn't mean that's the case at the assembly level. It's up to the compiler to emit assembly code that is well-defined on the CPU architecture you're targeting.
I'm pretty sure every CPU made in this century has used two's complement signed integers, and overflow is perfectly well defined for those. That means there is no problem simply calculating 2*x, letting the result overflow, then subtracting 1 and letting the result underflow back around.
Many such C++ language-level rules exist to paper over different CPU architectures. In this case, signed integer overflow was made undefined so that compilers targeting CPUs that use e.g. one's complement or sign/magnitude representations of signed integers aren't forced to add extra instructions to conform to the overflow behavior of two's complement.
Don't assume, however, that you can use a construct that is well-defined on your target CPU but undefined in C++ and get the answer you expect. C++ compilers assume undefined behavior cannot happen when performing optimization, and so they can and will emit different code from what you were expecting if your code isn't well-defined C++.
The ISO C++ rules apply to your source code (always, regardless of the target machine). Not to the asm the compiler chooses to make, especially for targets where signed integer wrapping just works.
The "as if" rules requires that the asm implementation of the function produce the same result as the C++ abstract machine, for every input value where the abstract machine doesn't encounter signed integer overflow (or other undefined behaviour). It doesn't matter how the asm produces those results, that's the entire point of the as-if rule. In some cases, like yours, the most efficient implementation would wrap and unwrap for some values that the abstract machine wouldn't. (Or in general, not wrap where the abstract machine does for unsigned or gcc -fwrapv.)
One effect of signed integer overflow being UB in the C++ abstract machine is that it lets the compiler optimize an int loop counter to pointer width, not redoing sign-extension every time through the loop or things like that. Also, compilers can infer value-range restrictions. But that's totally separate from how they implement the logic into asm for some target machine. UB doesn't mean "required to fail", in fact just the opposite, unless you compile with -fsanitize=undefined. It's extra freedom for the optimizer to make asm that doesn't match the source if you interpreted the source with more guarantees than ISO C++ actually gives (plus any guarantees the implementation makes beyond that, like if you use gcc -fwrapv.)
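A hedged sketch of the loop-counter point (my example, not from the question):

// Because signed overflow is UB, the compiler may assume i never wraps, so
// it can keep i in a pointer-width register for the a[i] indexing instead of
// redoing sign extension every iteration. With a uint32_t counter, wraparound
// is defined and that assumption is unavailable.
long long sum(const int *a, int n)
{
    long long s = 0;
    for (int i = 0; i < n; ++i)
        s += a[i];
    return s;
}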
For an expression like x/2, every possible int x has well-defined behaviour. For 2*x, the compiler can assume that x >= INT_MIN/2 and x <= INT_MAX/2, because larger magnitudes would involve UB.
2*(x-1)+1 implies a legal value-range for x from (INT_MIN+1)/2 to (INT_MAX+1)/2. e.g. on a 32-bit 2's complement target, -1073741823 (0xc0000001) to 1073741824 (0x40000000). On the positive side, 2*0x3fffffff doesn't overflow, and doesn't wrap on increment because 2*(x-1) is even.
2*x - 1 implies a legal value-range for x from INT_MIN/2 + 1 to INT_MAX/2. e.g. on a 32-bit 2's complement target, -1073741823 (0xc0000001) to 1073741823 (0x3fffffff). So the largest value the expression can produce is 2^n - 3, because INT_MAX will be odd.
In this case, the more complicated expression's legal value-range is a superset of the simpler expression, but in general that's not always the case.
They produce the same result for every x that's a well-defined input for both of them. And x86 asm (where wrapping is well-defined) that works like one or the other can implement either, producing correct results for all non-UB cases. So the compiler would be doing a bad job if it didn't make the same efficient asm for both.
In general, 2's complement and unsigned binary integer math is commutative and associative (for operations where that's mathematically true, like + and *), and compilers can and should take full advantage. e.g. rearranging a+b+c+d to (a+b)+(c+d) to shorten dependency chains. (See an answer on Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)? for an example of GCC doing it with integer, but not FP.)
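A minimal sketch of that reassociation (my example, not from the linked answer):

// The source nominally evaluates ((a + b) + c) + d, a dependency chain of
// length 3. Since 2's complement addition is associative, the compiler may
// compute (a + b) + (c + d) instead, shortening the chain to length 2, even
// though the intermediate sums overflow for different inputs.
int sum4(int a, int b, int c, int d)
{
    return a + b + c + d;
}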
Unfortunately, GCC has sometimes been reluctant to do signed-int optimizations like that because its internals were treating signed integer math as non-associative, perhaps because of a misguided application of C++ UB rules to optimizing asm for the target machine. That's a GCC missed optimization; Clang didn't have that problem.
Further reading:
Is there some meaningful statistical data to justify keeping signed integer arithmetic overflow undefined? re: some useful loop optimizations it allows.
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
Does undefined behavior apply to asm code? (no)
Is integer overflow undefined in inline x86 assembly?
The whole situation is basically a mess, and the designers of C didn't anticipate the current sophistication of optimizing compilers. Languages like Rust are better suited to it: if you want wrapping, you can (and must) tell the compiler about it on a per-operation basis, for both signed and unsigned types. Like x.wrapping_add(1).
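The closest portable C++ analogue is to do the arithmetic on unsigned, where wraparound is fully defined (a sketch of mine, assuming two's-complement wrapping is what you want):

int wrapping_add(int a, int b)
{
    // unsigned addition wraps modulo 2^N by definition; the conversion back
    // to int is implementation-defined before C++20 and defined as two's
    // complement from C++20 on.
    return static_cast<int>(static_cast<unsigned>(a) + static_cast<unsigned>(b));
}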
Re: why does clang split up the 2*x and the -1 with lea/dec
Clang is optimizing for latency on Intel CPUs before Ice Lake, saving one cycle of latency at the cost of an extra uop of throughput cost. (Compilers often favour latency since modern CPUs are often wide enough to chew through the throughput costs, although it does eat up space in the out-of-order exec window for hiding cache miss latency.)
lea eax, [rdi + rdi - 1] has 3 cycle latency on Skylake, vs. 1 for the LEA it used. (See Why does C++ code for testing the Collatz conjecture run faster than hand-written assembly? for some details). On AMD Zen family, it's break-even for latency (a complex LEA only has 2c latency) while still costing an extra uop. On Ice Lake and later Intel, even a 3-component LEA is still only 1 cycle so it's pure downside there. See https://uops.info/, the entry for LEA_B_I_D8 (R32) (Base, Index, 8-bit displacement, with scale-factor = 1.)
This tuning decision is unrelated to integer overflow.
Signed integer overflow/underflow is undefined behavior precisely so that compilers may make optimizations such as this. Because the compiler is allowed to do anything in the case of overflow/underflow, it can do this, or whatever else is more optimal for the use cases it is required to care about.
If the behavior on signed overflow had been specified as “What the DEC PDP-8 did back in 1973,” compilers for other targets would need to insert instructions to check for overflow and, if it occurs, produce that result instead of whatever the CPU does natively.

Compiler optimizations allowed via "int", "least" and "fast" non-fixed width types C/C++

Clearly, fixed-width integral types should be used when the size is important.
However, I read (Insomniac Games style guide) that "int" should be preferred for loop counters / function args / return codes / etc. when the size isn't important - the rationale given was that fixed-width types can preclude certain compiler optimizations.
Now, I'd like to make a distinction between "compiler optimization" and "a more suitable typedef for the target architecture". The latter has global scope, and my guess is that it has very limited impact unless the compiler can somehow reason about the global performance of the program parameterized by this typedef. The former has local scope, where the compiler would have the freedom to optimize the number of bytes used, and the operations, based on local register pressure / usage, among other things.
Does the standard permit "compiler optimizations" (as we've defined) for non-fixed-width types? Any good examples of this?
If not, and assuming the CPU can operate on smaller types at least as fast as larger types, then I see no harm, from a performance standpoint, in using fixed-width integers sized according to local context. At least that gives the possibility of relieving register pressure, and I'd argue it couldn't be worse.
The reason that the rule of thumb is to use an int is that the standard defines this integral type as the natural data type of the CPU (provided that it is sufficiently wide for the range INT_MIN to INT_MAX). That's where the best performance stems from.
There are many things wrong with int_fast types - most notably that they can be slower than int!
#include <stdio.h>
#include <inttypes.h>

int main(void) {
    printf("%zu\n", sizeof (int_fast32_t));
}
Run this on x86-64 and it prints 8... but it makes no sense: using 64-bit registers often requires instruction prefixes in x86-64, and since "behaviour on overflow is undefined", with a 32-bit int it doesn't matter if the upper 32 bits of the 64-bit register are set after arithmetic - the behaviour is "still correct".
What is even worse, however, than using the signed fast or least types, is using a small unsigned integer instead of size_t or a signed integer for a loop counter - now the compiler must generate extra code to "ensure the correct wraparound behaviour".
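A hedged sketch of that situation (my example):

#include <stdint.h>

// Because uint16_t wraps at 65536 (and i != last may only be reached after a
// wrap), the compiler has to keep the counter reduced to 16 bits, typically a
// zero-extension or masking operation per iteration that a size_t counter
// would not need.
void clear_range(char *a, uint16_t first, uint16_t last)
{
    for (uint16_t i = first; i != last; ++i)
        a[i] = 0;
}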
I'm not very familiar with the x86 instruction set, but unless you can guarantee that practically every arithmetic and move instruction also allows additional shifts and (sign) extensions, the assumption that smaller types are "at least as fast" as larger ones is not true.
The complexity of x86 makes it pretty hard to come up with simple examples, so let's consider an ARM microcontroller instead.
Let's define two addition functions which only differ by return type: "add32", which returns an integer of full register width, and "add8", which only returns a single byte.
int32_t add32(int32_t a, int32_t b) { return a + b; }
int8_t add8(int32_t a, int32_t b) { return a + b; }
Compiling those functions with -Os gives the following assembly:
add32(int, int):
    add r0, r0, r1
    bx lr
add8(int, int):
    add r0, r0, r1
    sxtb r0, r0 // Sign-extend single byte
    bx lr
Notice how the function which only returns a byte is one instruction longer. It has to truncate the 32-bit addition to a single byte.
Here is a link to the code # compiler explorer:
https://godbolt.org/z/ABFQKe
However, I read (Insomniac Games style guide), that "int" should be preferred for loop counters
You should rather be using size_t, whenever iterating over an array. int has other problems than performance, such as being signed and also problematic when porting.
From a standards point of view, when "n" is the width of an int, there exists no case where int_fastn_t should perform worse than int; if it does, the compiler/standard lib/ABI/system has a fault.
Does the standard permit "compiler optimizations" (as we've defined) for non-fixed-width types? Any good examples of this?
Sure, the compiler might optimize the use of integer types quite wildly, as long as it doesn't affect the outcome of the result - no matter whether they are int or int32_t.
For example, an 8 bit CPU compiler might optimize int a=1; int b=1; ... c = a + b; to be performed on 8 bit arithmetic, ignoring integer promotions and the actual size of int. It will however most likely have to allocate 16 bits of memory to store the result.
But if we give it some rotten code like char a = 0x80; int b = a >> 1;, it will have to do the optimization so that the side effects of integer promotion are taken into account. That is, the result could be 0xFFC0 rather than 0x40 as one might have expected (assuming signed char, 2's complement, arithmetic shift). The a >> 1 part isn't possible to optimize to an 8-bit type because of this - it has to be carried out with 16-bit arithmetic.
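A runnable version of that example (assuming signed char, two's complement and an arithmetic right shift; with a 32-bit int the result is 0xFFFFFFC0, i.e. -64, rather than the 16-bit 0xFFC0):

#include <stdio.h>

int main(void)
{
    signed char a = 0x80;   // becomes -128 (implementation-defined conversion)
    int b = a >> 1;         // a is promoted to int before the shift: -128 >> 1
    printf("%d\n", b);      // prints -64 with an arithmetic right shift
    return 0;
}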
I think the question you are trying to ask is:
Is the compiler allowed to make additional optimizations for a non-fixed-width type such as int beyond what it would be allowed for a fixed width type like int32_t that happens to have the same length on the current platform?
That is, you are not interested in the part where the size of the non-fixed width type is allowed to be chosen appropriately for the hardware - you are aware of that and are asking if beyond that additional optimizations are available?
The answer, as far as I am aware or have seen, is no. No, both in the sense that compilers do not actually optimize int differently than int32_t (on platforms where int is 32 bits), and also no in the sense that there are no optimizations allowed by the standard for int which are not also allowed for int32_t1 (this second part is wrong - see comments).
The easiest way to see this is that the various fixed-width integers are all typedefs for various underlying primitive integer types - so on a platform with 32-bit integers, int32_t will probably be a typedef (perhaps indirect) of int. So from a behavioral and optimization point of view, the types are identical, and as soon as you are in the IR world of the compiler, the original type probably isn't even really available without jumping through hoops (i.e., int and int32_t will generate the same IR).
So I think the advice you received was wrong, or at best misleading.
1 Of course the answer to the question "Is it allowed for a compiler to optimize int better than int32_t?" is yes, since there are no particular requirements on optimization, so a compiler could do something weird like that - or the reverse, such as optimizing int32_t better than int. I think that's not very interesting though.

C++ while loop optimization not working properly

I have this code segment:
#include <stdio.h>

int main(int argc, const char** argv)
{
    int a = argv[0][0];
    int b = argv[0][1];
    while ((a >= 0) &&
           (a < b))
    {
        printf("a = %d\n", a);
        a++;
    }
    return 0;
}
and I'm compiling it with gcc-4.5 -O2 -Wstrict-overflow=5.
The compiler yells at me
warning: assuming signed overflow does not occur when changing X +- C1 cmp C2 to X cmp C1 +- C2
What does this mean exactly?
If I am correct, this loop will never cause an overflow, because for a to be incremented, it must be smaller than another integer. If it is bigger, the loop is terminated.
Can anyone explain this behavior to me?
The compiler is making an optimisation to convert a + 1 < b to a < b - 1.
However, if b is INT_MIN then this will underflow, which is a change in behaviour.
That's what it's warning about.
Of course, you can probably tell that this is impossible, but the compiler has limited resources to work things out and generally won't do in-depth analysis on data paths.
Adding a check that b >= 0 may solve the problem.
Edit: Another possibility is that it's moving a >= 0 to outside the loop, as (assuming no overflow) it can never change. Again, the assumption may not be valid for all inputs (i.e. if b is negative). You would need check the final assembly to see what it actually did.
The C++ standard says that if a signed integer calculation produces a result outside the representable range for the type then the behaviour is undefined. Integer overflow is UB. Once UB has happened, the implementation is free to do whatever it likes.
Many compilers apply optimisations on the explicit assumption that UB does not happen. [Or if it does, the code could be wrong but it's your problem!]
This compiler is notifying you that it is applying such an optimisation to a calculation where it is unable to determine from analysing the code that UB does not happen.
Your choices in general are:
Satisfy yourself that UB cannot happen, and ignore the warning.
Allow UB to happen and live with the consequences.
Rewrite the code so UB really cannot happen and the compiler knows it cannot happen, and the warning should go away.
I would recommend the last option. Simple range tests on a and b should be good enough.
My guess is that the compiler emits this error because the loop deals with completely unknown values, and it is unable to analyse the data flow well enough to work out whether UB can happen or not.
We with our superior reasoning power can convince ourselves that UB cannot happen, so we can ignore the error. In fact a careful reading of the error message might leave us asking whether it is relevant at all. Where are these two constant values C1 and C2?
We might also note that a can never go negative, so why is that test in the loop? I would probably rewrite the code to suppress the error, (but from experience that can be a self-defeating exercise). Try this and see what happens (and avoid unneeded parenthetic clutter):
if (a >= 0) {
    while (a < b) {
        ...
        ++a;
    }
}
What the compiler is warning you about is that it is assuming that signed overflow does not take place in the original code.
The warning does not mean "I'm about to write an optimization which potentially introduces overflows."
In other words, if your program depends on overflow (i.e. is not highly portable), then the optimization the compiler is doing may change its behavior. (So please verify for yourself that this code doesn't depend on overflow).
For instance, if you have "a + b > c", and you're depending on this test failing when a + b arithmetic wraps around (typical two's complement behavior), then if this happens to be algebraically transformed to "a > c - b", then it might succeed, because c - b may happen not to overflow, and produce a value smaller than a.
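Concrete numbers for that transform (a sketch of mine; the wrapping outcome assumes 32-bit two's complement, which the abstract machine does not guarantee):

#include <stdio.h>
#include <limits.h>

int main(void)
{
    int a = INT_MAX, b = 1, c = 0;
    // a + b > c would overflow: under typical wrapping, a + b yields INT_MIN,
    // making that test false, while the algebraic rewrite below is true.
    // Since the overflow is UB, a compiler may legitimately produce either
    // answer for the original test.
    printf("%d\n", a > c - b);   // well-defined: prints 1
    return 0;
}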
Notes: Only C programs can invoke undefined behavior. When compilers are incorrect, they are "nonconforming". A compiler can only be non-conforming (to the C standard) if it does the wrong thing with (C standard) conforming code. An optimization which alters correct, portable behavior is a nonconforming optimization.

Why are certain implicit type conversions safe on one machine and not on another? How can I prevent these cross-platform issues?

I recently found a bug in my code that took me a few hours to debug.
The problem was in a function defined as:
unsigned int foo(unsigned int i){
    long int v[]={i-1,i,i+1} ;
    .
    .
    .
    return x ; // evaluated by the function but not essential how for this problem.
}
The definition of v didn't cause any issue on my development machine (Ubuntu 12.04 32-bit, g++ compiler), where the unsigned int was implicitly converted to long int, and as such the negative values were correctly handled.
On a different machine (Ubuntu 12.04 64-bit, g++ compiler), however, this operation was not safe. When i=0, v[0] was not set to -1, but to some weird big value (as often happens when trying to make an unsigned int negative).
I could solve the issue by casting the value of i to long int:

    long int v[]={(long int) i - 1, (long int) i, (long int) i + 1};

and everything worked fine (on both machines).
I can't figure out why the first version works fine on one machine and doesn't work on the other.
Can you help me understand this, so that I can avoid this or other issues in the future?
For unsigned values, addition/subtraction is well-defined as modulo arithmetic, so 0U-1 will work out to something like std::numeric_limits<unsigned>::max().
When converting from unsigned to signed, if the destination type is large enough to hold all the values of the unsigned value then it simply does a straight data copy into the destination type. If the destination type is not large enough to hold all the unsigned values I believe that it's implementation defined (will try to find standard reference).
So when long is 64-bit (presumably the case on your 64-bit machine) the unsigned fits and is copied straight.
When long is 32-bits on the 32-bit machine, again it most likely just interprets the bit pattern as a signed value which is -1 in this case.
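A small demo of the two behaviours described above (the output depends on the width of long, as noted):

#include <stdio.h>

int main(void)
{
    unsigned int i = 0;
    long v = i - 1;       // i - 1 is computed in unsigned arithmetic: 0xFFFFFFFF
    printf("%ld\n", v);   // 64-bit long: 4294967295 (value preserved)
                          // 32-bit long: -1 (typical implementation-defined result)
    return 0;
}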
EDIT: The simplest way to avoid these problems is to avoid mixing signed and unsigned types. What does it mean to subtract one from a value whose concept doesn't allow for negative numbers? I'm going to argue that the function parameter should be a signed value in your example.
That said, g++ (at least version 4.5) provides a handy -Wsign-conversion flag that detects this issue in your particular code.
You can also have a specialized cast catching all overflowing casts:
#include <cassert>
#include <limits>
#include <type_traits>

template<typename O, typename I>
O architecture_cast(I x) {
    // make sure the input type I is unsigned
    static_assert(std::is_unsigned<I>::value, "Input value to architecture_cast has to be unsigned");
    // in debug builds, verify that the value fits into the destination type O
    assert(x <= static_cast<typename std::make_unsigned<O>::type>(std::numeric_limits<O>::max()));
    return static_cast<O>(x);
}
Using this will catch, in debug builds, all casts of values bigger than the resulting type can accommodate. This includes your case of an unsigned int being 0 and having 1 subtracted from it, which wraps around to the biggest unsigned int.
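For example, applied to the code from the question (hypothetical usage):

unsigned int i = 0;
long v = architecture_cast<long>(i - 1);  // fires the assert in a debug build
                                          // when long is 32 bits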
Integer promotion rules in the C++ Standard are inherited from those in the C Standard, which were chosen not to describe how a language should most usefully behave, but rather to offer a behavioral description that was as consistent as practical with the ways many existing implementations had extended earlier dialects of C to add unsigned types.
Things get further complicated by an apparent desire to have the Standard specify behavioral aspects that were thought to be consistent among 100% of existing implementations, without regard for whether some other compatible behavior might be more broadly useful. At the same time, the Standard avoids imposing any behavioral requirements on actions if, on some plausible implementations, it might be expensive to guarantee any behavior consistent with sequential program execution, but impossible to guarantee any behavior that would actually be useful.
I think it's pretty clear that the Committee wanted to unambiguously specify that long1 = uint1+1; uint2 = long1; must set uint2 in a manner consistent with wraparound behavior in all cases, and did not want to forbid implementations from using wraparound behavior when setting long1. Although the Standard could have upheld the first requirement while allowing implementations to promote to long on quiet-wraparound two's-complement platforms, where the assignments to uint2 would yield results consistent with using wraparound behavior throughout, doing so would have meant including a rule specifically for quiet-wraparound two's-complement platforms, which is something C89 and, to an even greater extent, C99 were exceptionally keen to avoid doing.

gcc optimization? bug? and its practical implication to a project

My questions are divided into three parts
Question 1
Consider the below code,
#include <iostream>
using namespace std;

int main( int argc, char *argv[])
{
    const int v = 50;
    int i = 0X7FFFFFFF;

    cout<<(i + v)<<endl;
    if ( i + v < i )
    {
        cout<<"Number is negative"<<endl;
    }
    else
    {
        cout<<"Number is positive"<<endl;
    }
    return 0;
}
No specific compiler optimisation options (such as -O flags) are used; the executable is built with the basic command g++ -o test main.cpp.
This seemingly very simple code has odd behaviour on SUSE 64-bit, gcc version 4.1.2. The expected output is "Number is negative"; instead, only on SUSE 64-bit, the output is "Number is positive".
After some amount of analysis and doing a 'disass' of the code, I find that the compiler optimises as follows:
Since i is the same on both sides of the comparison and cannot be changed within the same expression, remove 'i' from the equation.
Now the comparison reduces to if ( v < 0 ), where v is a positive constant, so during compilation itself the address of the cout call in the else branch is placed in the register. No cmp/jmp instructions can be found.
I see this behaviour only in gcc 4.1.2 on SUSE 10. When tried on AIX 5.1/5.3 and HP IA64, the result is as expected.
Is the above optimisation valid?
Or, is using the overflow mechanism for int not a valid use case?
Question 2
Now, when I change the conditional statement from if (i + v < i) to if ( (i + v) < i ), the behaviour is still the same. With this, at least, I would personally disagree: since additional parentheses are provided, I expect the compiler to create a temporary built-in-type variable, then compare, thus nullifying the optimisation.
Question 3
Suppose I have a huge code base and I migrate to a new compiler version; such a bug/optimisation can cause havoc in my system's behaviour. Of course, from a business perspective, it is very ineffective to test all lines of code again just because of a compiler upgrade.
I think for all practical purposes, these kinds of errors are very difficult to catch (during an upgrade) and will invariably leak through to the production site.
Can anyone suggest any possible way to ensure that these kinds of bugs/optimizations do not have any impact on my existing system/code base?
PS :
When the const for v is removed from the code, the optimization is not done by the compiler.
I believe it is perfectly fine to use the overflow mechanism to find out whether the variable is within 50 of the MAX value (in my case).
Update(1)
What would I want to achieve? The variable i would be a counter (a kind of syncID). If I do offline operations (50 operations), then during startup I would like to reset my counter. For this I am checking the boundary value (to reset it) rather than adding blindly.
I am not sure that I am relying on the hardware implementation. I know that 0X7FFFFFFF is the max positive value. All I am doing is adding a value to this and expecting the return value to be negative. I don't think this logic has anything to do with the hardware implementation.
Anyways, all thanks for your input.
Update(2)
Most of the input states that I am relying on the lower-level behaviour of overflow checking. I have one question regarding the same:
If that is the case, for an unsigned int how do I validate and reset the value on underflow or overflow? E.g. if v=10 and i=0X7FFFFFFE, I want to reset i = 9. Similarly for underflow?
I would not be able to do that unless I check for the negativity of the number. So my claim is that an int must return a negative number when a value is added to +MAX_INT.
Please let me know your inputs.
It's a known problem, and I don't think it's considered a bug in the compiler. When I compile with gcc 4.5 with -Wall -O2 it warns
warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false
Although your code does overflow.
You can pass the -fno-strict-overflow flag to turn that particular optimization off.
Your code produces undefined behavior. The C and C++ languages have no "overflow mechanism" for signed integer arithmetic. Your calculations overflow signed integers - the behavior is immediately undefined. Considering it from a "bug in the compiler or not" position is no different from attempting to analyze the i = i++ + ++i examples.
The GCC compiler has an optimization based on that part of the specification of the C/C++ languages. It is called "strict overflow semantics" or something like that. It is based on the fact that adding a positive value to a signed integer in C++ always produces a larger value or results in undefined behavior. This immediately means that the compiler is perfectly free to assume that the sum is always larger. The general nature of that optimization is very similar to the "strict aliasing" optimizations also present in GCC. They both resulted in some complaints from the more "hackerish" parts of the GCC user community, many of whom didn't even suspect that the tricks they were relying on in their C/C++ programs were simply illegal hacks.
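If you need the boundary check, a sketch of a rewrite that stays within defined behaviour is to test the distance to INT_MAX before adding, instead of adding first and inspecting the sign of an (undefined) result:

#include <limits.h>

// hypothetical helper; assumes v > 0
int add_would_overflow(int i, int v)
{
    return i > INT_MAX - v;   // no overflow can occur in this comparison
}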
Q1: Perhaps the number is indeed positive in a 64-bit implementation? Who knows? Before debugging the code I'd just printf("%d", i+v);
Q2: The parentheses are only there to tell the compiler how to parse an expression. This is usually done in the form of a tree, so the optimizer does not see any parentheses at all. And it is free to transform the expression.
Q3: That's why, as a C/C++ programmer, you must not write code that assumes particular properties of the underlying hardware, such as, for example, that an int is a 32-bit quantity in two's complement form.
What does the line

    cout<<(i + v)<<endl;

output in the SUSE example? Are you sure you don't have 64-bit ints?
OK, so this was almost six years ago and the question is answered. Still I feel that there are some bits that have not been addressed to my satisfaction, so I add a few comments, hopefully for the good of future readers of this discussion. (Such as myself when I got a search hit for it.)
The OP specified using gcc 4.1.2 without any special flags. I assume the absence of the -O flag is equivalent to -O0. With no optimization requested, why did gcc optimize away code in the reported way? That does seem to me like a compiler bug. I also assume this has been fixed in later versions (for example, one answer mentions gcc 4.5 and the -fno-strict-overflow optimization flag). The current gcc man page states that -fstrict-overflow is enabled at -O2 and above.
In current versions of gcc, there is an option -fwrapv that enables you to use the sort of code that caused trouble for the OP. Provided of course that you make sure you know the bit sizes of your integer types. From gcc man page:
-fstrict-overflow
    .....
    See also the -fwrapv option. Using -fwrapv means that integer signed overflow
    is fully defined: it wraps. ... With -fwrapv certain types of overflow are
    permitted. For example, if the compiler gets an overflow when doing arithmetic
    on constants, the overflowed value can still be used with -fwrapv, but not otherwise.
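Applied to the code from Question 1, a sketch (the exact behaviour of a given gcc version is an assumption on my part):

    g++ -fwrapv -o test main.cpp
    ./test        # i + v wraps to a negative value: "Number is negative"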