On the advice of a high-rep SO user, I've recently started compiling my codebase with the -Wconversion flag. This has generated quite a few warnings, some of which are legitimate (needlessly adding signed and unsigned types together, for instance), but also some head scratchers, demonstrated below:
#include <cstdint>
int main()
{
    uint16_t a = 4;
    uint16_t b = 5;
    b += a;
    return 0;
}
When I compile with g++ -Wconversion -std=c++11 -O0 myFile.cpp, I get
warning: conversion to 'uint16_t {aka short unsigned int}' from 'int' may alter its value [-Wconversion]
b += a;
^
I've perused some similar questions on SO (dealing with | and << operators), taken a look here, and have read the Numeric Promotions and Numeric Conversions sections here. My understanding is, in order to do the math, a and b are promoted to int (since that's the first type that can fit the entire uint16_t value range), math is performed, the result is written back... except the result of the math is an int, and writing that back to uint16_t generates the warning. The consensus of the other questions was basically to cast away the warning, and the only way I've figured out how to do that is b = (uint16_t)(b + a); (or the equivalent b = static_cast<uint16_t>(b + a);).
Don't want this question to get too broad, but assuming my understanding of integer promotions is correct...
What's the best way to handle this moving forward? Should I avoid performing math on types narrower than int? It seems quite odd to me that I have to cast an arithmetic result which is the same type as all the operands (guess I would expect the compiler to recognize that and suppress the warning). Historically, I've liked to use no more bits than I need, and just let the compiler handle the promotions/conversions/padding as necessary.
Anyone use -Wconversion flag frequently? Just after a couple of days of using it myself, I'm starting to think its best use case is to turn it on, look at what it complains about, fix the legitimate complaints, then turn it back off. Or perhaps my definition of "legitimate complaint" needs readjusting. Replacing all of my += operators with spelled out casts seems like a nuisance more than anything.
I'm tempted to tag this as c as well, since an equivalent c code compiled with gcc -Wconversion -std=c11 -O0 myFile.c produces the exact same warning. But as is, I'm using g++ version 5.3.1 on an x86_64 Fedora 23 box. Please point me to the dupe if I've missed it; if the only answer/advice here is to cast away the warning, then this is a dupe.
What's the best way to handle this moving forward?
-Wno-conversion
Or just leave it unspecified. This is just an opinion, though.
In my experience, the need for narrow integer arithmetic tends to be quite rare, so you could still keep it on for the project, and disable for the few cases where this useless warning occurs. However, this probably depends highly on the type of your project, so your experience may vary.
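For the case-by-case disabling, one possibility (a sketch only, using GCC's diagnostic pragmas, which exist in reasonably recent GCC versions) is to silence -Wconversion around just the offending statement, reusing the a and b from the question:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wconversion"
b += a;                     // no -Wconversion warning inside this region
#pragma GCC diagnostic pop  // the warning is active again from here on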
Should I avoid performing math on types narrower than int?
Usually yes; unless you have a specific reason to use them. "I don't need the extra bits" isn't a specific enough reason in my opinion. Arithmetic operands are promoted to int anyway and it's usually faster and less error prone to use int.
Just after a couple of days of using it myself, I'm starting to think its best use case is to turn it on, look at what it complains about, fix the legitimate complaints, then turn it back off.
This is quite often a useful approach to warning flags that are included in neither -Wall nor -Wextra such as the ones with -Wsuggest- prefix. There is a reason why they aren't included in "all warnings".
I think this can be considered a shortcoming of gcc.
As this code doesn't generate any warning:
int a = ..., b = ...;
a += b;
This code should not generate a warning either, because semantically the two are the same (two same-type numbers are added, and the result is put into a same-type variable):
short a = ..., b = ...;
a += b;
But GCC generates a warning, because, as you say, the shorts get promoted to ints. However, the short version isn't more dangerous than the int one, in the sense that if the addition overflows, the behavior is implementation-defined for the short case (the out-of-range result has to be converted back to short), and undefined for the int case (or, if unsigned numbers are used, truncation can happen in both cases).
Clang handles this more intelligently and doesn't warn here. I think it's because it actually tracks the possible bit-width (or maybe the range?) of the result. So, for example, this warns:
int a = ...;
short b = a;
But this doesn't (but GCC warns for this):
int a = ...;
short b = a&0xf; // there is a conversion here, but clang knows that only 4 bits are used, so it doesn't warn
So, until GCC has a more intelligent -Wconversion, your options are:
don't use -Wconversion
fix all the warnings it prints
use clang instead (maybe for GCC: turn off this warning; and for clang: turn it on)
But don't hold your breath until it's fixed: there is a bug about this, opened in 2009.
A note:
Historically, I've liked to use no more bits than I need, and just let the compiler handle the promotions/conversions/padding as necessary.
If you use shorter types for storage, that's fine. But there's usually no reason to use types shorter than int for arithmetic: it gives no speedup (it can even be slower, because of the extra masking).
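As a hedged sketch of that storage-vs-arithmetic split (the function name here is made up for illustration), you can keep uint16_t at the storage boundary, let the math happen at int width as it would anyway, and pay for exactly one explicit narrowing:

#include <cstdint>

uint16_t add_and_store(uint16_t a, uint16_t b)
{
    int sum = a + b;                    // operands are promoted to int anyway
    return static_cast<uint16_t>(sum);  // the single, visible narrowing conversion
}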
Related
Consider the following example:
unsigned short c = // ...
if (c > 0xfffful)
c = 0xfffful;
Since unsigned short can actually be larger than 16 bits, I want to limit the value before printing it in hex format with snprintf into a fixed-size buffer.
However, GCC (but not clang) gives a warning: comparison is always false due to limited range of data type [-Wtype-limits].
Is it a bug in GCC or I missed something? I understand that on my machine unsigned short is exactly 16 bits, but it's not guaranteed to be so on other platforms.
I'd say it is not a bug. GCC is claiming if (c > 0xfffful) will always be false, which, on your machine, is true. GCC was smart enough to catch this, clang wasn't. Good job GCC!
On the other hand, GCC was not smart enough to notice that while it is always false on your machine, it's not necessarily always false on someone else's machine. Come on GCC!
Note that in C++11, the *_least##_t types appear (I reserve the right to be proven wrong!) to be implemented by typedef. By the time GCC is running its warning checks it likely has no clue that the original data type was uint_least16_t. If that is the case, the compiler has no way of inferring that the comparison might be true on other systems. Changing GCC to remember what the original data type was might be extremely difficult. I'm not defending GCC's naive warning, but suggesting why it might be hard to fix.
I'd be curious to see what the GCC guys say about this. Have you considered filing an issue with them?
This doesn't seem like a bug (maybe it could be deemed a slightly naive feature), but I can see why you'd want this code there for portability.
In the absence of any standard macros to tell you what the size of the type is on your platform (and there aren't any), I would probably have a step in my build process that works that out and passes it to your program as a -D definition.
e.g. in Make:
if ...
CFLAGS += -DTRUNCATE_UINT16_LEAST_T
endif
then:
#ifdef TRUNCATE_UINT16_LEAST_T
if (c > 0xfffful)
c = 0xfffful;
#endif
with the Makefile conditional predicated on output from a step in configure, or the execution of some other C++ program that simply prints out sizeofs. Sadly that rules out cross-compiling.
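As a sketch of that probe step (the file name and the exact wiring into configure are made up here), the "some other C++ program" can be as small as printing whether unsigned short is wider than 16 bits; the build then adds -DTRUNCATE_UINT16_LEAST_T only when it prints 1:

// probe_ushort.cpp - hypothetical configure-time probe, run on the target system
#include <iostream>
#include <limits>

int main()
{
    // Prints 1 if unsigned short can hold values above 0xffff, 0 otherwise.
    std::cout << (std::numeric_limits<unsigned short>::max() > 0xfffful ? 1 : 0) << '\n';
    return 0;
}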
Longer-term I propose suggesting more intelligent behaviour to the GCC guys, for when these particular type aliases are in use.
We had a bug in our code coming from the line
unsigned int i = -1;
When the code was originally written, it was i = 0 and thus correct.
Using -Wall -Wextra, I was a bit surprised that gcc didn't warn me here because -1 does not fit in an unsigned int.
Only when turning on -Wsign-conversion does this line become a warning - but with it come many, many false positives. I am using a third-party library which does array-like operations with signed ints (although they cannot be < 0), so whenever I mix that with e.g. vector, I get warnings - and I don't see the point in adding millions of casts (and even the third-party headers produce a lot of warnings). So it is too many warnings for me. All these warnings say that the conversion "may change the sign". That's fine, because I know it doesn't in almost all of the cases.
But with the assignment mentioned above, I get the same "may change" warning. Shouldn't this be "Will definitely change sign!" instead of "may change"? Is there any way to emit warnings only for these "will change" cases, not for the maybe cases?
Initialize it with curly braces:
unsigned int i{-1};
GCC outputs:
main.cpp:3:22: error: narrowing conversion of '-1'
from 'int' to 'unsigned int' inside { } [-Wnarrowing]
unsigned int i{-1};
Note that it does not always cause an error, it might be a warning or disabled altogether. You should try it with your actual toolchain.
But with the assignment mentioned above, I get the same "may change" warning. Shouldn't this be "Will definitely change sign!" instead of "may change"?
That's odd. I tested a few versions of gcc in the range of (4.6 - 5.2) and they did give a different warning for unsigned int i = -1;
warning: negative integer implicitly converted to unsigned type [-Wsign-conversion]
That said, they are indeed controlled by the same option as the may change sign warnings, so...
Is there any way to emit warnings only for these "will change" cases, not for the maybe cases?
As far as I know, that's not possible. I'm sure it would be possible to implement in the compiler, so if you want a separate option to enable the warning for assigning a negative number - known at compile time - to an unsigned variable, then you can submit a feature request. However, because assigning -1 to an unsigned variable is such a common and usually perfectly valid thing to do, I doubt such feature would be considered very important.
I recently found a bug in my code that took me a few hours to debug.
The problem was in a function defined as:
unsigned int foo(unsigned int i){
    long int v[] = {i - 1, i, i + 1};
    // ...
    return x; // x is computed by the function; how is not essential to this problem
}
The definition of v didn't cause any issue on my development machine (Ubuntu 12.04 32-bit, g++ compiler), where the unsigned int was implicitly converted to long int and, as such, the negative values were correctly handled.
On a different machine (Ubuntu 12.04 64-bit, g++ compiler), however, this operation was not safe. When i=0, v[0] was not set to -1 but to some weird big value (as often happens when trying to make an unsigned int negative).
I could solve the issue by casting the value of i to long int
long int v[]={(long int) i - 1, (long int) i, (long int) i + 1};
and everything worked fine (on both machines).
I can't figure out why the first version works fine on one machine and doesn't work on the other.
Can you help me understanding this, so that I can avoid this or other issues in the future?
For unsigned values, addition/subtraction is well-defined as modulo arithmetic, so 0U - 1 works out to exactly std::numeric_limits<unsigned>::max().
When converting from unsigned to signed, if the destination type is large enough to hold all the values of the unsigned type, then it simply does a straight copy of the value into the destination type. If the destination type is not large enough to hold all the unsigned values, I believe the result is implementation-defined (will try to find a standard reference).
So when long is 64-bit (presumably the case on your 64-bit machine) the unsigned fits and is copied straight.
When long is 32-bits on the 32-bit machine, again it most likely just interprets the bit pattern as a signed value which is -1 in this case.
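To make the two cases concrete, here is a minimal sketch (assuming the usual 32-bit unsigned int) that prints the converted value; the comments describe the typical outcome on each machine:

#include <climits>
#include <iostream>

int main()
{
    unsigned int i = 0;
    long v0 = i - 1;  // i - 1 wraps around to UINT_MAX as an unsigned int first
    std::cout << sizeof(long) * CHAR_BIT << "-bit long: " << v0 << '\n';
    // 64-bit long: prints 4294967295 (UINT_MAX fits and is copied as-is).
    // 32-bit long: typically prints -1 (implementation-defined conversion).
    return 0;
}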
EDIT: The simplest way to avoid these problems is to avoid mixing signed and unsigned types. What does it mean to subtract one from a value whose concept doesn't allow for negative numbers? I'm going to argue that the function parameter should be a signed value in your example.
That said, g++ (at least version 4.5) provides a handy -Wsign-conversion option that detects this issue in your particular code.
You can also have a specialized cast that catches all overflowing casts:
#include <cassert>
#include <limits>
#include <type_traits>

template<typename O, typename I>
O architecture_cast(I x) {
    // Make sure I is an unsigned type.
    static_assert(std::is_unsigned<I>::value, "Input value to architecture_cast has to be unsigned");
    // In debug builds, check that x fits into the output type O.
    assert(x <= static_cast<typename std::make_unsigned<O>::type>(std::numeric_limits<O>::max()));
    return static_cast<O>(x);
}
Using this will catch, in debug builds, all casts from numbers bigger than the resulting type can accommodate. This includes your case of an unsigned int that is 0 having 1 subtracted from it, which results in the biggest unsigned int.
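As a hypothetical usage example of the template above, casting the wrapped-around value down to a type that cannot represent it trips the assert in a debug build:

unsigned int i = 0;
int narrowed = architecture_cast<int>(i - 1); // UINT_MAX does not fit in int, so the assert fires

Note that with a 64-bit long as the output type the wrapped-around value does fit, so the assert stays silent; the check guards against values that cannot be represented, not against the wraparound itself.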
Integer promotion rules in the C++ Standard are inherited from those in the C Standard, which were chosen not to describe how a language should most usefully behave, but rather to offer a behavioral description that was as consistent as practical with the ways many existing implementations had extended earlier dialects of C to add unsigned types.
Things get further complicated by an apparent desire to have the Standard specify behavioral aspects that were thought to be consistent among 100% of existing implementations, without regard for whether some other compatible behavior might be more broadly useful. At the same time, the Standard avoids imposing any behavioral requirements on actions if, on some plausible implementations, it might be expensive to guarantee any behavior consistent with sequential program execution, but impossible to guarantee any behavior that would actually be useful.
I think it's pretty clear that the Committee wanted to unambiguously specify that long1 = uint1+1; uint2 = long1; must set uint2 in a manner consistent with wraparound behavior in all cases, and did not want to forbid implementations from using wraparound behavior when setting long1. Although the Standard could have upheld the first requirement while allowing implementations to promote to long on quiet-wraparound two's-complement platforms (where the assignments to uint2 would yield results consistent with using wraparound behavior throughout), doing so would have meant including a rule specifically for quiet-wraparound two's-complement platforms, which is something C89, and to an even greater extent C99, were exceptionally keen to avoid doing.
My questions are divided into three parts
Question 1
Consider the below code,
#include <iostream>
using namespace std;
int main( int argc, char *argv[])
{
    const int v = 50;
    int i = 0X7FFFFFFF;
    cout<<(i + v)<<endl;
    if ( i + v < i )
    {
        cout<<"Number is negative"<<endl;
    }
    else
    {
        cout<<"Number is positive"<<endl;
    }
    return 0;
}
No specific compiler optimisation options are used, nor any -O flags. The basic compilation command g++ -o test main.cpp is used to produce the executable.
This seemingly very simple code has odd behaviour on the SUSE 64-bit OS with gcc version 4.1.2. The expected output is "Number is negative"; instead, only on the SUSE 64-bit OS, the output is "Number is positive".
After some analysis and a 'disass' of the code, I find that the compiler optimises it as follows:
Since i is the same on both sides of the comparison and cannot be changed within the same expression, it removes 'i' from the equation.
The comparison then reduces to if ( v < 0 ), where v is a positive constant. So during compilation itself, the address of the else-branch cout call is loaded into the register; no cmp/jmp instructions can be found.
I see that the behaviour is only in gcc 4.1.2 SUSE 10. When tried in AIX 5.1/5.3 and HP IA64, the result is as expected.
Is the above optimisation valid?
Or, is using the overflow mechanism for int not a valid use case?
Question 2
Now, when I change the conditional statement from if (i + v < i) to if ( (i + v) < i ), the behaviour is still the same. With this, at least, I would personally disagree: since additional parentheses are provided, I expect the compiler to create a temporary of the built-in type and then compare, thus nullifying the optimisation.
Question 3
Suppose I have a huge code base and I migrate to a new compiler version; such a bug/optimisation can cause havoc in my system's behaviour. Of course, from a business perspective, it is very ineffective to test all lines of code again just because of a compiler upgrade.
I think for all practical purposes, these kinds of errors are very difficult to catch (during an upgrade) and will invariably leak into the production site.
Can anyone suggest any possible way to ensure that these kinds of bugs/optimizations do not have any impact on my existing system/code base?
PS:
When the const on v is removed from the code, the optimization is not done by the compiler.
I believe it is perfectly fine to use the overflow mechanism to find whether the variable is within 50 of the MAX value (in my case).
Update(1)
What would I want to achieve? The variable i would be a counter (a kind of syncID). If I do offline operations (50 operations), then during startup I would like to reset my counter. For this I am checking the boundary value (to reset it) rather than adding to it blindly.
I am not sure that I am relying on the hardware implementation. I know that 0X7FFFFFFF is the max positive value. All I am doing is adding a value to this and expecting the result to be negative. I don't think this logic has anything to do with the hardware implementation.
Anyways, all thanks for your input.
Update(2)
Most of the input states that I am relying on the lower-level behaviour of overflow checking. I have one question regarding the same:
If that is the case, for an unsigned int how do I validate and reset the value during underflow or overflow? For example, if v=10 and i=0X7FFFFFFE, I want to reset i to 9. Similarly for underflow?
I would not be able to do that unless I check for the negativity of the number. So my claim is that int must return a negative number when a value is added to +MAX_INT.
Please let me know your inputs.
It's a known problem, and I don't think it's considered a bug in the compiler. When I compile with gcc 4.5 with -Wall -O2 it warns
warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false
This is despite the fact that your code does overflow.
You can pass the -fno-strict-overflow flag to turn that particular optimization off.
Your code produces undefined behavior. The C and C++ languages have no "overflow mechanism" for signed integer arithmetic. Your calculations overflow signed integers, so the behavior is immediately undefined. Considering it from a "bug in the compiler or not" position is no different than attempting to analyze the i = i++ + ++i examples.
The GCC compiler has an optimization based on that part of the specification of the C/C++ languages. It is called "strict overflow semantics" or something like that. It is based on the fact that adding a positive value to a signed integer in C++ always produces a larger value or results in undefined behavior. This immediately means that the compiler is perfectly free to assume that the sum is always larger. The general nature of that optimization is very similar to the "strict aliasing" optimizations also present in GCC. They both resulted in some complaints from the more "hackerish" parts of the GCC user community, many of whom didn't even suspect that the tricks they were relying on in their C/C++ programs were simply illegal hacks.
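If the goal is merely to detect that adding v would overflow, a hedged alternative (a sketch of the usual idiom, not the OP's original code) is to compare against the remaining headroom before performing the addition, which is well-defined for any values:

#include <limits>

bool addition_would_overflow(int i, int v) // assumes v >= 0, as in the question
{
    // No overflowing addition is evaluated; the comparison itself cannot overflow.
    return i > std::numeric_limits<int>::max() - v;
}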
Q1: Perhaps, the number is indeed positive in a 64bit implementation? Who knows? Before debugging the code I'd just printf("%d", i+v);
Q2: The parentheses are only there to tell the compiler how to parse an expression. This is usually done in the form of a tree, so the optimizer does not see any parentheses at all. And it is free to transform the expression.
Q3: That's why, as a C/C++ programmer, you must not write code that assumes particular properties of the underlying hardware, such as, for example, that an int is a 32-bit quantity in two's complement form.
What does the line:
cout<<(i + v)<<endl;
Output in the SUSE example? You're sure you don't have 64bit ints?
OK, so this was almost six years ago and the question is answered. Still, I feel that there are some bits that have not been addressed to my satisfaction, so I'll add a few comments, hopefully for the good of future readers of this discussion. (Such as myself when I got a search hit for it.)
The OP specified using gcc 4.1.2 without any special flags. I assume the absence of the -O flag is equivalent to -O0. With no optimization requested, why did gcc optimize away code in the reported way? That does seem to me like a compiler bug. I also assume this has been fixed in later versions (for example, one answer mentions gcc 4.5 and the -fno-strict-overflow optimization flag). The current gcc man page states that -fstrict-overflow is included with -O2 or more.
In current versions of gcc, there is an option -fwrapv that enables you to use the sort of code that caused trouble for the OP. Provided of course that you make sure you know the bit sizes of your integer types. From gcc man page:
-fstrict-overflow
.....
See also the -fwrapv option. Using -fwrapv means that integer signed overflow
is fully defined: it wraps. ... With -fwrapv certain types of overflow are
permitted. For example, if the compiler gets an overflow when doing arithmetic
on constants, the overflowed value can still be used with -fwrapv, but not otherwise.
[This question is related to but not the same as this one.]
My compiler warns about implicitly converting or casting certain types to bool whereas explicit conversions do not produce a warning:
long t = 0;
bool b = false;
b = t; // performance warning: forcing long to bool
b = (bool)t; // performance warning
b = bool(t); // performance warning
b = static_cast<bool>(t); // performance warning
b = t ? true : false; // ok, no warning
b = t != 0; // ok
b = !!t; // ok
This is with Visual C++ 2008 but I suspect other compilers may have similar warnings.
So my question is: what is the performance implication of casting/converting to bool? Does explicit conversion have better performance in some circumstance (e.g., for certain target architectures or processors)? Does implicit conversion somehow confuse the optimizer?
Microsoft's explanation of their warning is not particularly helpful. They imply that there is a good reason but they don't explain it.
I was puzzled by this behaviour, until I found this link:
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=99633
Apparently, coming from the Microsoft Developer who "owns" this warning:
This warning is surprisingly helpful, and found a bug in my code just yesterday. I think Martin is taking "performance warning" out of context.
It's not about the generated code, it's about whether or not the programmer has signalled an intent to change a value from int to bool. There is a penalty for that, and the user has the choice to use "int" instead of "bool" consistently (or more likely vice versa) to avoid the "boolifying" codegen. [...]
It is an old warning, and may have outlived its purpose, but it's behaving as designed here.
So it seems to me the warning is more about style and avoiding some mistakes than anything else.
Hope this will answer your question...
:-p
The performance is identical across the board. It involves a couple of instructions on x86, maybe 3 on some other architectures.
On x86 / VC++, they all do
cmp DWORD PTR [whatever], 0
setne al
GCC generates the same thing, but without the warnings (at any warning-level).
The performance warning does actually make a little bit of sense. I've had it as well, and my curiosity led me to investigate with the disassembler. It is trying to tell you that the compiler has to generate some code to coerce the value to either 0 or 1. Because you are insisting on a bool, the old-school C idea of "0 or anything else" doesn't apply.
You can avoid that tiny performance hit if you really want to. The best way is to avoid the cast altogether and use a bool from the start. If you must have an int, you could just use if( int ) instead of if( bool ). The code generated will simply check whether the int is 0 or not. No extra code to make sure the value is 1 if it's not 0 will be generated.
Sounds like premature optimization to me. Are you expecting the performance of the cast to seriously affect the performance of your app? Maybe if you are writing kernel code or device drivers, but in most cases it should be fine.
As far as I know, there is no warning on any other compiler for this. The only way I can think this would cause a performance loss is that the compiler has to compare the entire integer to 0 and then assign the bool appropriately. That is unlike a conversion such as char to bool, where the result can simply be copied over, because a bool is one byte and so they are effectively the same; or an integral conversion, which involves copying some or all of the source to the destination, possibly after zeroing the destination if it's bigger than the source (in terms of memory).
It's yet another one of Microsoft's useless and unhelpful ideas as to what constitutes good code, and leads us to have to put up with stupid definitions like this:
template <typename T>
inline bool to_bool (const T& t)
{ return t ? true : false; }
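A hedged usage sketch, reusing the t and b from the question at the top:

b = to_bool(t); // same effect as b = (t ? true : false), but without the warning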
long t;
bool b;
int i;
signed char c;
...
You get a warning when you do anything that would be "free" if bool wasn't required to be 0 or 1. b = !!t is effectively assigning the result of the (language built-in, non-overrideable) bool operator!(long)
You shouldn't expect the ! or != operators to cost zero asm instructions, even with an optimizing compiler. It is usually true that int i = t is optimized away completely. Or even signed char c = t; (on x86/amd64, if t is in the %eax register, then after c = t, using c just means using %al. amd64 has byte addressing for every register, BTW. IIRC, on x86 some registers don't have byte addressing.)
Anyway, b = t; i = b; isn't the same as c = t; i = c; it's i = !!t; instead of i = t & 0xff;
Err, I guess everyone already knows all that from the previous replies. My point was, the warning made sense to me, since it caught cases where the compiler had to do things you didn't really tell it to, like !!BOOL on return, because you declared the function bool but are returning an integral value that could be true and != 1. E.g. a lot of Windows stuff returns BOOL (int).
This is one of MSVC's few warnings that G++ doesn't have. I'm a lot more used to g++, and it definitely warns about stuff MSVC doesn't, but that I'm glad it told me about. I wrote a portab.h header file with stubs for the MFC/Win32 classes/macros/functions I used. This got the MFC app I'm working on to compile on my GNU/Linux machine at home (and with cygwin). I mainly wanted to be able to compile-test what I was working on at home, but I ended up finding g++'s warnings very useful. It's also a lot stricter about e.g. templates...
On bool in general, I'm not sure it makes for better code when used as a return value or for parameter passing. Even for locals, g++ 4.3 doesn't seem to figure out that it doesn't have to coerce the value to 0 or 1 before branching on it. If it's a local variable and you never take its address, the compiler should keep it in whatever size is fastest. If it has to spill it from registers to the stack, it could just as well keep it in 4 bytes, since that may be slightly faster. (It uses a lot of movsx (sign-extension) instructions when loading/storing (non-local) bools, but I don't really remember what it did for automatic (local stack) variables. I do remember seeing it reserve an odd amount of stack space (not a multiple of 4) in functions that had some bool locals.)
Using bool flags was slower than int with the Digital Mars D compiler as of last year:
http://www.digitalmars.com/d/archives/digitalmars/D/opEquals_needs_to_return_bool_71813.html
(D is a lot like C++, but abandons full C backwards compat to define some nice new semantics, and good support for template metaprogramming. e.g. "static if" or "static assert" instead of template hacks or cpp macros. I'd really like to give D a try sometime. :)
For data structures, it can make sense, e.g. if you want to pack a couple flags before an int and then some doubles in a struct you're going to have quite a lot of.
Based on your link to MS's explanation, it appears that if the value is merely 1 or 0 there is no performance hit, but if it's any other non-zero value then a comparison has to be generated by the compiler?
In C++ a bool is essentially an int with only two values: 0 = false, 1 = true. The compiler only has to check one bit. To be perfectly clear, true != 0, so any int can be converted to a bool; it just costs processing cycles to do so.
By using a long as in the code sample, you are forcing a lot more bit checks, which will cause a performance hit.
No, this is not premature optimization; it is quite crazy to use code that takes more processing time across the board. This is simply good coding practice.
Unless you're writing code for a really critical inner loop (simulator core, ray-tracer, etc.) there is no point in worrying about any performance hits in this case. There are other more important things to worry about in your code (and other more significant performance traps lurking, I'm sure).
What Microsoft seems to be trying to say is:
Hey, if you're using an int, but are only storing true or false information in it, make it a bool!
I'm skeptical about how much would be gained performance-wise, but MS may have found that there was some gain (for their use, anyway). Microsoft's code does tend to run an awful lot, so maybe they've found the micro-optimization to be worthwhile. I believe that a fair bit of what goes into the MS compiler is to support stuff they find useful themselves (only makes sense, right?).
And you get rid of some dirty, little casts to boot.
I don't think performance is the issue here. The reason you get a warning is that information is lost during conversion from int to bool.