Signed/unsigned comparisons - c++

I'm trying to understand why the following code doesn't issue a warning at the indicated place.
//from limits.h
#define UINT_MAX 0xffffffff /* maximum unsigned int value */
#define INT_MAX 2147483647 /* maximum (signed) int value */
/* = 0x7fffffff */
int a = INT_MAX;
//_int64 a = INT_MAX; // makes all warnings go away
unsigned int b = UINT_MAX;
bool c = false;
if(a < b) // warning C4018: '<' : signed/unsigned mismatch
c = true;
if(a > b) // warning C4018: '<' : signed/unsigned mismatch
c = true;
if(a <= b) // warning C4018: '<' : signed/unsigned mismatch
c = true;
if(a >= b) // warning C4018: '<' : signed/unsigned mismatch
c = true;
if(a == b) // no warning <--- warning expected here
c = true;
if(((unsigned int)a) == b) // no warning (as expected)
c = true;
if(a == ((int)b)) // no warning (as expected)
c = true;
I thought it was to do with background promotion, but the last two seem to say otherwise.
To my mind, the first == comparison is just as much a signed/unsigned mismatch as the others?

When comparing signed with unsigned, the compiler converts the signed value to unsigned. For equality, this doesn't matter, -1 == (unsigned) -1. For other comparisons it matters, e.g. the following is true: -1 > 2U.
EDIT: References:
5/9: (Expressions)
Many binary operators that expect
operands of arithmetic or enumeration
type cause conversions and yield
result types in a similar way. The
purpose is to yield a common type,
which is also the type of the result.
This pattern is called the usual
arithmetic conversions, which are
defined as follows:
If either
operand is of type long double, the
other shall be converted to long
double.
Otherwise, if either operand
is double, the other shall be
converted to double.
Otherwise, if
either operand is float, the other
shall be converted to float.
Otherwise, the integral promotions
(4.5) shall be performed on both
operands.54)
Then, if either operand
is unsigned long the other shall be
converted to unsigned long.
Otherwise, if one operand is a long
int and the other unsigned int, then
if a long int can represent all the
values of an unsigned int, the
unsigned int shall be converted to a
long int; otherwise both operands
shall be converted to unsigned long
int.
Otherwise, if either operand is
long, the other shall be converted to
long.
Otherwise, if either operand
is unsigned, the other shall be
converted to unsigned.
4.7/2: (Integral conversions)
If the destination type is unsigned,
the resulting value is the least
unsigned integer congruent to the
source integer (modulo 2n where n is
the number of bits used to represent
the unsigned type). [Note: In a two’s
complement representation, this
conversion is conceptual and there is
no change in the bit pattern (if there
is no truncation). ]
EDIT2: MSVC warning levels
What is warned about on the different warning levels of MSVC is, of course, choices made by the developers. As I see it, their choices in relation to signed/unsigned equality vs greater/less comparisons make sense, this is entirely subjective of course:
-1 == -1 means the same as -1 == (unsigned) -1 - I find that an intuitive result.
-1 < 2 does not mean the same as -1 < (unsigned) 2 - This is less intuitive at first glance, and IMO deserves an "earlier" warning.

Why signed/unsigned warnings are important and programmers must pay heed to them, is demonstrated by the following example.
Guess the output of this code?
#include <iostream>
int main() {
int i = -1;
unsigned int j = 1;
if ( i < j )
std::cout << " i is less than j";
else
std::cout << " i is greater than j";
return 0;
}
Output:
i is greater than j
Surprised? Online Demo : http://www.ideone.com/5iCxY
Bottomline: in comparison, if one operand is unsigned, then the other operand is implicitly converted into unsigned if its type is signed!

The == operator just does a bitwise comparison (by simple division to see if it is 0).
The smaller/bigger than comparisons rely much more on the sign of the number.
4 bit Example:
1111 = 15 ? or -1 ?
so if you have 1111 < 0001 ... it's ambiguous...
but if you have 1111 == 1111 ... It's the same thing although you didn't mean it to be.

Starting from C++20 we have special functions for correct comparing signed-unsigned values
https://en.cppreference.com/w/cpp/utility/intcmp

In a system that represents the values using 2-complement (most modern processors) they are equal even in their binary form. This may be why compiler doesn't complain about a == b.
And to me it's strange compiler doesn't warn you on a == ((int)b). I think it should give you an integer truncation warning or something.

The line of code in question does not generate a C4018 warning because Microsoft have used a different warning number (i.e. C4389) to handle that case, and C4389 is not enabled by default (i.e. at level 3).
From the Microsoft docs for C4389:
// C4389.cpp
// compile with: /W4
#pragma warning(default: 4389)
int main()
{
int a = 9;
unsigned int b = 10;
if (a == b) // C4389
return 0;
else
return 0;
};
The other answers have explained quite well why Microsoft might have decided to make a special case out of the equality operator, but I find those answers are not super helpful without mentioning C4389 or how to enable it in Visual Studio.
I should also mention that if you are going to enable C4389, you might also consider enabling C4388. Unfortunately there is no official documentation for C4388 but it seems to pop up in expressions like the following:
int a = 9;
unsigned int b = 10;
bool equal = (a == b); // C4388

Related

Vector size comparison with integer for non-zero vector size [duplicate]

See this code snippet
int main()
{
unsigned int a = 1000;
int b = -1;
if (a>b) printf("A is BIG! %d\n", a-b);
else printf("a is SMALL! %d\n", a-b);
return 0;
}
This gives the output: a is SMALL: 1001
I don't understand what's happening here. How does the > operator work here? Why is "a" smaller than "b"? If it is indeed smaller, why do i get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by so called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such value cannot be printed with %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to C language specification, the common part of the range of the corresponding signed and unsigned integer type shall have identical representation (implying, according to the footnote 31, "interchangeability as arguments to functions"). So, the result of a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you treat the subtraction in an unsigned world, 1000 - (4,294,967,295) = -4,294,966,295 = 1,001 which is what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid them and do casts when in doubt.
#include<stdio.h>
int main()
{
int a = 1000;
signed int b = -1, c = -2;
printf("%d",(unsigned int)b);
printf("%d\n",(unsigned int)c);
printf("%d\n",(unsigned int)a);
if(1000>-1){
printf("\ntrue");
}
else
printf("\nfalse");
return 0;
}
For this you need to understand the precedence of operators
Relational Operators works left to right ...
so when it comes
if(1000>-1)
then first of all it will change -1 to unsigned integer because int is by default treated as unsigned number and it range it greater than the signed number
-1 will change into the unsigned number ,it changes into a very big number
Find a easy way to compare, maybe useful when you can not get rid of unsigned declaration, (for example, [NSArray count]), just force the "unsigned int" to an "int".
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler wil assume that the comparison is really between unsigned values.
And -1 is represented as 1111..1111, so it a very big quantity ... The biggest ... When interpreted as unsigned.
while comparing a>b where a is unsigned int type and b is int type, b is type casted to unsigned int so, signed int value -1 is converted into MAX value of unsigned**(range: 0 to (2^32)-1 )**
Thus, a>b i.e., (1000>4294967296) becomes false. Hence else loop printf("a is SMALL! %d\n", a-b); executed.

Bit shift leads to strange type conversion

The following code compiles without warning:
std::uint16_t a = 12;
std::uint16_t b = a & 0x003f;
However, performing a bit shift along with the bitwise and leads to an 'implicit cast warning':
std::uint16_t b = (a & 0x003f) << 10; // Warning generated.
Both gcc and clang complain that there is an implicit conversion from int to uint16_t, yet I fail to see why introducing the bit shift would cause the right hand expression to suddenly evaluate to an int.
EDIT: For clang, I compiled with the -std=c++14 -Weverything flags; for gcc, I compiled with the -std=c++14 -Wall -Wconversion flags.
From cppreference.com: "If the operand passed to an arithmetic operator is integral or unscoped enumeration type, then before any other action (but after lvalue-to-rvalue conversion, if applicable), the operand undergoes integral promotion."
For instance:
byte a = 1;
byte b = a << byte(1);
a and 1 are promoted to int: int(a) and int(byte(1)).
a is shifted one position to left: int result = int(a) << int(byte(1)) (the result is an int).
result is stored in b. Since int is wider than byte an warning will be issued.
If the operands are constant expressions, a compiler might be able to compute the result at compile time and issue an warning iff the result does not fit in destination:
byte b = 1 << 1; // no warning: does not exceed 8 bits
byte b = 1 << 8; // warning: exceeds 8 bits
Or, using constexpr:
constexpr byte a = 1;
byte b = a << 1; // no warning: it fits in 8 bits
byte b = a << 8; // warning: it does not fit in 8 bits
Doing any arithmetic with integer types always is preceded by promotion to at least (sometimes, but not in this case using gcc, unsigned) int. As you can see by this example, this applies to you first, warning-less variant too.
The probably best way to get around these (admittedly often surprising) integer promotion rules would be to use an unsigned int (or uint32_t on common platforms) from the get go.
If you cannot or don't want to use the bigger type, you can just static_cast the result of the whole expression back to std::uint16_t:
std::uint16_t b = static_cast<std::uint16_t>((a & 0x003f) << 10);
This will correctly result in the RHS-value mod 2^16.
yet I fail to see why introducing the bit shift would cause the right hand expression to suddenly evaluate to an int.
I think you misinterpret the warning. In both cases expression evaluate to int but in the first case result will always fit in uint16_t, in the second case it will not. Looks like the compiler is smart enough to detect that and generates warning only in the second case.

Why is there no underflow warning for unsigned type?

Consider the following code:
unsigned int n = 0;
unsigned int m = n - 1; // no warning here?
if (n > -1) {
std::cout << "n > -1.\n";
} else {
std::cout << "yes, 0 is not > -1.\n";
}
The code above produces a warning on the if condition if (m > -1) for comparing signed and unsigned integer expressions. I have no contest with that. What bothers me is the first two assignment statements.
unsigned int n = 0;
unsigned int m = n - 1;
My thinking is that the compiler should have given me a warning on the second assignment because it knows that the variable n is unsigned with a value of 0 from the first line and that there was an attempt to subtract from a zero value and assigning it to an unsigned type.
If the next line after the second assignment happened to be different than an if statement or something similar, then the concerned code might have slipped through.
Yes, there is a narrowing conversion before the assignment to m there and yes the compiler do not complain about it which was also mentioned by Marshall Clow in his C++Now 2017 Lightning Talk (Fighting Compiler Warnings).
short s = 3 * 6;
short s = integer * integer;
short s = integer;
So, why can't the compiler tell me about the possible underflow in that code?
Compilers:
Clang 3.7/4.0 (-Wall -Wextra)
GCC 5.3/7.1.1 (-Wall -Wextra -pedantic)
Microsoft C/C++ 19.00.23506
The reason is because if (n > -1) can never be false, but unsigned int m = n - 1; is an actual legal expression you may have wanted to write. From 5/9 there are a bunch of rules about how to get your signed an unsigned types to have a consistent type, and all of them fail except the final default condition
Otherwise, both operands shall be converted to the unsigned integer
type corresponding to the type of the operand with signed integer
type.
Since unsigned arithmetic is well-defined to use modulo operations the entire expression is legal and well defined. They could yet decide to emit a warning but there may be enough legacy code using tricks like this that it would cause too many false positives.

Negative integral constant converted to unsigned type [duplicate]

-2147483648 is the smallest integer for integer type with 32 bits, but it seems that it will overflow in the if(...) sentence:
if (-2147483648 > 0)
std::cout << "true";
else
std::cout << "false";
This will print true in my testing. However, if we cast -2147483648 to integer, the result will be different:
if (int(-2147483648) > 0)
std::cout << "true";
else
std::cout << "false";
This will print false.
I'm confused. Can anyone give an explanation on this?
Update 02-05-2012:
Thanks for your comments, in my compiler, the size of int is 4 bytes. I'm using VC for some simple testing. I've changed the description in my question.
That's a lot of very good replys in this post, AndreyT gave a very detailed explanation on how the compiler will behave on such input, and how this minimum integer was implemented. qPCR4vir on the other hand gave some related "curiosities" and how integers are represented. So impressive!
-2147483648 is not a "number". C++ language does not support negative literal values.
-2147483648 is actually an expression: a positive literal value 2147483648 with unary - operator in front of it. Value 2147483648 is apparently too large for the positive side of int range on your platform. If type long int had greater range on your platform, the compiler would have to automatically assume that 2147483648 has long int type. (In C++11 the compiler would also have to consider long long int type.) This would make the compiler to evaluate -2147483648 in the domain of larger type and the result would be negative, as one would expect.
However, apparently in your case the range of long int is the same as range of int, and in general there's no integer type with greater range than int on your platform. This formally means that positive constant 2147483648 overflows all available signed integer types, which in turn means that the behavior of your program is undefined. (It is a bit strange that the language specification opts for undefined behavior in such cases, instead of requiring a diagnostic message, but that's the way it is.)
In practice, taking into account that the behavior is undefined, 2147483648 might get interpreted as some implementation-dependent negative value which happens to turn positive after having unary - applied to it. Alternatively, some implementations might decide to attempt using unsigned types to represent the value (for example, in C89/90 compilers were required to use unsigned long int, but not in C99 or C++). Implementations are allowed to do anything, since the behavior is undefined anyway.
As a side note, this is the reason why constants like INT_MIN are typically defined as
#define INT_MIN (-2147483647 - 1)
instead of the seemingly more straightforward
#define INT_MIN -2147483648
The latter would not work as intended.
The compiler (VC2012) promote to the "minimum" integers that can hold the values. In the first case, signed int (and long int) cannot (before the sign is applied), but unsigned int can: 2147483648 has unsigned int ???? type.
In the second you force int from the unsigned.
const bool i= (-2147483648 > 0) ; // --> true
warning C4146: unary minus operator applied to unsigned type, result still unsigned
Here are related "curiosities":
const bool b= (-2147483647 > 0) ; // false
const bool i= (-2147483648 > 0) ; // true : result still unsigned
const bool c= ( INT_MIN-1 > 0) ; // true :'-' int constant overflow
const bool f= ( 2147483647 > 0) ; // true
const bool g= ( 2147483648 > 0) ; // true
const bool d= ( INT_MAX+1 > 0) ; // false:'+' int constant overflow
const bool j= ( int(-2147483648)> 0) ; // false :
const bool h= ( int(2147483648) > 0) ; // false
const bool m= (-2147483648L > 0) ; // true
const bool o= (-2147483648LL > 0) ; // false
C++11 standard:
2.14.2 Integer literals [lex.icon]
…
An integer literal is a sequence of digits that has no period or
exponent part. An integer literal may have a prefix that specifies its
base and a suffix that specifies its type.
…
The type of an integer literal is the first of the corresponding list
in which its value can be represented.
If an integer literal cannot be represented by any type in its list
and an extended integer type (3.9.1) can represent its value, it may
have that extended integer type. If all of the types in the list for
the literal are signed, the extended integer type shall be signed. If
all of the types in the list for the literal are unsigned, the
extended integer type shall be unsigned. If the list contains both
signed and unsigned types, the extended integer type may be signed or
unsigned. A program is ill-formed if one of its translation units
contains an integer literal that cannot be represented by any of the
allowed types.
And these are the promotions rules for integers in the standard.
4.5 Integral promotions [conv.prom]
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
In Short, 2147483648 overflows to -2147483648, and (-(-2147483648) > 0) is true.
This is how 2147483648 looks like in binary.
In addition, in the case of signed binary calculations, the most significant bit ("MSB") is the sign bit. This question may help explain why.
Because -2147483648 is actually 2147483648 with negation (-) applied to it, the number isn't what you'd expect. It is actually the equivalent of this pseudocode: operator -(2147483648)
Now, assuming your compiler has sizeof(int) equal to 4 and CHAR_BIT is defined as 8, that would make 2147483648 overflow the maximum signed value of an integer (2147483647). So what is the maximum plus one? Lets work that out with a 4 bit, 2s compliment integer.
Wait! 8 overflows the integer! What do we do? Use its unsigned representation of 1000 and interpret the bits as a signed integer. This representation leaves us with -8 being applied the 2s complement negation resulting in 8, which, as we all know, is greater than 0.
This is why <limits.h> (and <climits>) commonly define INT_MIN as ((-2147483647) - 1) - so that the maximum signed integer (0x7FFFFFFF) is negated (0x80000001), then decremented (0x80000000).

(-2147483648> 0) returns true in C++?

-2147483648 is the smallest integer for integer type with 32 bits, but it seems that it will overflow in the if(...) sentence:
if (-2147483648 > 0)
std::cout << "true";
else
std::cout << "false";
This will print true in my testing. However, if we cast -2147483648 to integer, the result will be different:
if (int(-2147483648) > 0)
std::cout << "true";
else
std::cout << "false";
This will print false.
I'm confused. Can anyone give an explanation on this?
Update 02-05-2012:
Thanks for your comments, in my compiler, the size of int is 4 bytes. I'm using VC for some simple testing. I've changed the description in my question.
That's a lot of very good replys in this post, AndreyT gave a very detailed explanation on how the compiler will behave on such input, and how this minimum integer was implemented. qPCR4vir on the other hand gave some related "curiosities" and how integers are represented. So impressive!
-2147483648 is not a "number". C++ language does not support negative literal values.
-2147483648 is actually an expression: a positive literal value 2147483648 with unary - operator in front of it. Value 2147483648 is apparently too large for the positive side of int range on your platform. If type long int had greater range on your platform, the compiler would have to automatically assume that 2147483648 has long int type. (In C++11 the compiler would also have to consider long long int type.) This would make the compiler to evaluate -2147483648 in the domain of larger type and the result would be negative, as one would expect.
However, apparently in your case the range of long int is the same as range of int, and in general there's no integer type with greater range than int on your platform. This formally means that positive constant 2147483648 overflows all available signed integer types, which in turn means that the behavior of your program is undefined. (It is a bit strange that the language specification opts for undefined behavior in such cases, instead of requiring a diagnostic message, but that's the way it is.)
In practice, taking into account that the behavior is undefined, 2147483648 might get interpreted as some implementation-dependent negative value which happens to turn positive after having unary - applied to it. Alternatively, some implementations might decide to attempt using unsigned types to represent the value (for example, in C89/90 compilers were required to use unsigned long int, but not in C99 or C++). Implementations are allowed to do anything, since the behavior is undefined anyway.
As a side note, this is the reason why constants like INT_MIN are typically defined as
#define INT_MIN (-2147483647 - 1)
instead of the seemingly more straightforward
#define INT_MIN -2147483648
The latter would not work as intended.
The compiler (VC2012) promote to the "minimum" integers that can hold the values. In the first case, signed int (and long int) cannot (before the sign is applied), but unsigned int can: 2147483648 has unsigned int ???? type.
In the second you force int from the unsigned.
const bool i= (-2147483648 > 0) ; // --> true
warning C4146: unary minus operator applied to unsigned type, result still unsigned
Here are related "curiosities":
const bool b= (-2147483647 > 0) ; // false
const bool i= (-2147483648 > 0) ; // true : result still unsigned
const bool c= ( INT_MIN-1 > 0) ; // true :'-' int constant overflow
const bool f= ( 2147483647 > 0) ; // true
const bool g= ( 2147483648 > 0) ; // true
const bool d= ( INT_MAX+1 > 0) ; // false:'+' int constant overflow
const bool j= ( int(-2147483648)> 0) ; // false :
const bool h= ( int(2147483648) > 0) ; // false
const bool m= (-2147483648L > 0) ; // true
const bool o= (-2147483648LL > 0) ; // false
C++11 standard:
2.14.2 Integer literals [lex.icon]
…
An integer literal is a sequence of digits that has no period or
exponent part. An integer literal may have a prefix that specifies its
base and a suffix that specifies its type.
…
The type of an integer literal is the first of the corresponding list
in which its value can be represented.
If an integer literal cannot be represented by any type in its list
and an extended integer type (3.9.1) can represent its value, it may
have that extended integer type. If all of the types in the list for
the literal are signed, the extended integer type shall be signed. If
all of the types in the list for the literal are unsigned, the
extended integer type shall be unsigned. If the list contains both
signed and unsigned types, the extended integer type may be signed or
unsigned. A program is ill-formed if one of its translation units
contains an integer literal that cannot be represented by any of the
allowed types.
And these are the promotions rules for integers in the standard.
4.5 Integral promotions [conv.prom]
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
In Short, 2147483648 overflows to -2147483648, and (-(-2147483648) > 0) is true.
This is how 2147483648 looks like in binary.
In addition, in the case of signed binary calculations, the most significant bit ("MSB") is the sign bit. This question may help explain why.
Because -2147483648 is actually 2147483648 with negation (-) applied to it, the number isn't what you'd expect. It is actually the equivalent of this pseudocode: operator -(2147483648)
Now, assuming your compiler has sizeof(int) equal to 4 and CHAR_BIT is defined as 8, that would make 2147483648 overflow the maximum signed value of an integer (2147483647). So what is the maximum plus one? Lets work that out with a 4 bit, 2s compliment integer.
Wait! 8 overflows the integer! What do we do? Use its unsigned representation of 1000 and interpret the bits as a signed integer. This representation leaves us with -8 being applied the 2s complement negation resulting in 8, which, as we all know, is greater than 0.
This is why <limits.h> (and <climits>) commonly define INT_MIN as ((-2147483647) - 1) - so that the maximum signed integer (0x7FFFFFFF) is negated (0x80000001), then decremented (0x80000000).