In C++, why is long l = 0x80000000; positive?
C++:
long l = 0x80000000; // l is positive. Why??
int i = 0x80000000;
long l = i; // l is negative
According to this site: https://en.cppreference.com/w/cpp/language/integer_literal, 0x80000000 should be a signed int, but that doesn't appear to be the case, because sign extension doesn't occur when it is assigned to l.
Java:
long l = 0x80000000; // l is negative
int i = 0x80000000;
long l = i; // l is negative
On the other hand, Java has a more consistent behavior.
C++ Test code:
#include <stdio.h>
#include <string.h>
void print_sign(long l) {
if (l < 0) {
printf("Negative\n");
} else if (l > 0) {
printf("Positive\n");
} else {
printf("Zero\n");
}
}
int main() {
long l = -0x80000000;
print_sign(l); // Positive
long l2 = 0x80000000;
print_sign(l2); // Positive
int i = 0x80000000;
long l3 = i;
print_sign(l3); // Negative
int i2 = -0x80000000;
long l4 = i2;
print_sign(l4); // Negative
}
From your link: "The type of the integer literal is the first type in which the value can fit, from the list of types which depends on which numeric base and which integer-suffix was used." and for hexadecimal values lists int, unsigned int...
Your compiler uses 32-bit ints, so the largest (signed) int is 0x7FFFFFFF. A signed int cannot represent 0x80000000...0xFFFFFFFF because it needs half of the 2^32 possible values of its 32 bits to represent negative numbers. However, 0x80000000 fits in a 32-bit unsigned int, so that is the type of the literal. Your compiler uses 64-bit longs, which can hold up to 0x7FFF FFFF FFFF FFFF, so 0x80000000 also fits in a signed long, and so the long l is the positive value 0x80000000.
On the other hand, int i is a signed int and cannot hold 0x80000000, so the value has to be converted. The result of converting an out-of-range value to a signed type is implementation-defined before C++20 and is defined to wrap modulo 2^32 from C++20 on. In practice, compilers for two's complement machines keep the bit pattern, so the number wraps round to a large negative number; that is what has happened here, resulting in i being negative. (Signed overflow in arithmetic, as opposed to conversion, is undefined behaviour. Do not rely on it wrapping; optimisations have been known to break this.)
In your example code you use both 0x80000000 and -0x80000000, and in each case they give the same result. In fact, they are the same value. Recall that 0x80000000 is an unsigned int. The 2003 C++ standard says in 5.3.1 paragraph 7: "The negative of an unsigned quantity is computed by subtracting its value from 2^n, where n is the number of bits in the promoted operand." 0x80000000 is precisely 2^31, and so -0x80000000 is 2^32 - 2^31 = 2^31. To get the expected behaviour we would have to use -(long)0x80000000 instead.
With the help of the awesome people on SO, I think I can answer my own question now:
Just to correct the notion that 0x80000000 can't fit in an int:
It is possible to store, without loss or undefined behavior, the bit pattern 0x80000000 in an int (assuming sizeof(int) == 4). The following code demonstrates this:
#include <limits.h>
#include <stdio.h>
int main() {
int i = INT_MIN;
printf("%X\n", (unsigned)i); // %X expects an unsigned int
return 0;
}
Assigning the literal 0x80000000 to a variable is a little more nuanced, though.
What the other answers failed to mention (except @Daniel Langr's) is the fact that C++ doesn't have a concept of negative literals.
There are no negative integer literals. Expressions such as -1 apply the unary minus operator to the value represented by the literal, which may involve implicit type conversions.
With this in mind, the literal 0x80000000 is always treated as a positive number. Negation comes after the size and signedness have been determined. This is important: negation doesn't affect the signedness of the literal; only the base, the suffix and the value do. 0x80000000 is too big to fit in a signed int, so C++ tries the next applicable type, unsigned int, which succeeds. The order of types C++ tries depends on the base of the literal plus any suffix it may have.
The table is listed here: https://en.cppreference.com/w/cpp/language/integer_literal
So with this rule in mind let's work out some examples:
-2147483648: Treated as a long int (assuming a 64-bit long) because 2147483648 can't fit in an int.
2147483648: Treated as a long int because C++ doesn't consider unsigned int as a candidate for decimal literals.
0x80000000: Treated as an unsigned int because C++ considers unsigned int as a candidate for non-decimal literals.
(-2147483647 - 1): Treated as an int. This is typically how INT_MIN is defined to preserve the type of the literal as an int. This is the type-safe way of saying -2147483648 as an int.
-0x80000000: Treated as an unsigned int even though there's a negation. Negating an unsigned value is well defined: the result wraps modulo 2^32, so the value is still 0x80000000.
-0x80000000l: Treated as a long int (assuming a 64-bit long) and the sign is properly negated.
Related
See this code snippet
#include <stdio.h>
int main()
{
unsigned int a = 1000;
int b = -1;
if (a>b) printf("A is BIG! %d\n", a-b);
else printf("a is SMALL! %d\n", a-b);
return 0;
}
This gives the output: a is SMALL! 1001
I don't understand what's happening here. How does the > operator work here? Why is "a" smaller than "b"? If it is indeed smaller, why do I get a positive number (1001) as the difference?
Binary operations between different integral types are performed within a "common" type defined by the so-called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that the int operand (your b) will get converted to unsigned int before the comparison, as well as for the purpose of performing the subtraction.
When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should resolve to the else branch, which is what you observed in your experiment.
The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such a value cannot be printed with the %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of the C language.
Edit: Actually, I could be wrong about the undefined behavior part. According to the C language specification, the common part of the range of the corresponding signed and unsigned integer types shall have identical representation (implying, according to footnote 31, "interchangeability as arguments to functions"). So, the result of the a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with the %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
On a typical implementation where int is 32-bit, -1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.
Even if you treat the subtraction in an unsigned world, 1000 - (4,294,967,295) = -4,294,966,295, which wraps modulo 2^32 (4,294,967,296) to 1,001, which is what you get.
That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
You are doing unsigned comparison, i.e. comparing 1000 to 2^32 - 1.
The output is signed because of %d in printf.
N.B. the result of mixing signed and unsigned operands depends on the widths of the types involved, which vary between platforms. I think it's best to avoid mixing them and to cast explicitly when in doubt.
#include<stdio.h>
int main()
{
int a = 1000;
signed int b = -1, c = -2;
printf("%u\n",(unsigned int)b); // %u matches the unsigned argument
printf("%u\n",(unsigned int)c);
printf("%u\n",(unsigned int)a);
if(1000>-1){
printf("\ntrue");
}
else
printf("\nfalse");
return 0;
}
For this you need to understand how the operands are converted before the comparison.
When both operands are plain int, as in
if(1000>-1)
no conversion takes place and the comparison is true, as you would expect.
When one operand is unsigned int, as in a>b from the question, the int operand is first converted to unsigned int, because unsigned int wins over int in the usual arithmetic conversions.
-1 then changes into an unsigned number, which is a very big number, so the comparison is false.
An easy way to compare, which may be useful when you cannot get rid of the unsigned declaration (for example, [NSArray count]), is to cast the "unsigned int" to an "int". This only works as long as the unsigned value fits in an int.
Please correct me if I am wrong.
if (((int)a)>b) {
....
}
The hardware is designed to compare signed to signed and unsigned to unsigned.
If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler will assume that the comparison is really between unsigned values.
And -1 is represented as 1111..1111, so it is a very big quantity (the biggest, in fact) when interpreted as unsigned.
When comparing a>b, where a is of type unsigned int and b is of type int, b is converted to unsigned int, so the signed int value -1 is converted into the maximum value of unsigned int (range: 0 to 2^32 - 1).
Thus a>b, i.e. (1000>4294967295), is false. Hence the else branch, printf("a is SMALL! %d\n", a-b);, is executed.
-2147483648 is the smallest integer for a 32-bit integer type, but it seems that it overflows in the if(...) statement:
if (-2147483648 > 0)
std::cout << "true";
else
std::cout << "false";
This will print true in my testing. However, if we cast -2147483648 to integer, the result will be different:
if (int(-2147483648) > 0)
std::cout << "true";
else
std::cout << "false";
This will print false.
I'm confused. Can anyone give an explanation on this?
Update 02-05-2012:
Thanks for your comments, in my compiler, the size of int is 4 bytes. I'm using VC for some simple testing. I've changed the description in my question.
There are a lot of very good replies in this post. AndreyT gave a very detailed explanation of how the compiler behaves on such input, and how this minimum integer is implemented. qPCR4vir, on the other hand, gave some related "curiosities" and showed how integers are represented. So impressive!
-2147483648 is not a "number". The C++ language does not support negative literal values.
-2147483648 is actually an expression: a positive literal value 2147483648 with the unary - operator in front of it. The value 2147483648 is apparently too large for the positive side of the int range on your platform. If type long int had greater range on your platform, the compiler would have to automatically assume that 2147483648 has long int type. (In C++11 the compiler would also have to consider long long int type.) This would make the compiler evaluate -2147483648 in the domain of the larger type and the result would be negative, as one would expect.
However, apparently in your case the range of long int is the same as the range of int, and in general there's no integer type with greater range than int on your platform. This formally means that the positive constant 2147483648 cannot be represented by any of the signed integer types allowed for an unsuffixed decimal literal, which in turn means that the program is ill-formed. (A conforming compiler has to issue a diagnostic, but it may be only a warning.)
In practice, 2147483648 might get interpreted as some implementation-dependent negative value which happens to turn positive after having unary - applied to it. Alternatively, some implementations might decide to use unsigned types to represent the value (for example, in C89/90 compilers were required to use unsigned long int, but not in C99 or C++). The language does not specify what such a program does.
As a side note, this is the reason why constants like INT_MIN are typically defined as
#define INT_MIN (-2147483647 - 1)
instead of the seemingly more straightforward
#define INT_MIN -2147483648
The latter would not work as intended.
The compiler (VC2012) promotes to the "minimum" integer type that can hold the value. In the first case, signed int (and long int) cannot hold 2147483648 (before the sign is applied), but unsigned int can, so 2147483648 gets an unsigned type.
In the second case you force the unsigned value to int.
const bool i= (-2147483648 > 0) ; // --> true
warning C4146: unary minus operator applied to unsigned type, result still unsigned
Here are related "curiosities":
const bool b= (-2147483647 > 0) ; // false
const bool i= (-2147483648 > 0) ; // true : result still unsigned
const bool c= ( INT_MIN-1 > 0) ; // true :'-' int constant overflow
const bool f= ( 2147483647 > 0) ; // true
const bool g= ( 2147483648 > 0) ; // true
const bool d= ( INT_MAX+1 > 0) ; // false:'+' int constant overflow
const bool j= ( int(-2147483648)> 0) ; // false :
const bool h= ( int(2147483648) > 0) ; // false
const bool m= (-2147483648L > 0) ; // true
const bool o= (-2147483648LL > 0) ; // false
C++11 standard:
2.14.2 Integer literals [lex.icon]
…
An integer literal is a sequence of digits that has no period or
exponent part. An integer literal may have a prefix that specifies its
base and a suffix that specifies its type.
…
The type of an integer literal is the first of the corresponding list
in which its value can be represented.
If an integer literal cannot be represented by any type in its list
and an extended integer type (3.9.1) can represent its value, it may
have that extended integer type. If all of the types in the list for
the literal are signed, the extended integer type shall be signed. If
all of the types in the list for the literal are unsigned, the
extended integer type shall be unsigned. If the list contains both
signed and unsigned types, the extended integer type may be signed or
unsigned. A program is ill-formed if one of its translation units
contains an integer literal that cannot be represented by any of the
allowed types.
And these are the promotion rules for integers in the standard.
4.5 Integral promotions [conv.prom]
A prvalue of an integer type other than bool, char16_t, char32_t, or
wchar_t whose integer conversion rank (4.13) is less than the rank of
int can be converted to a prvalue of type int if int can represent all
the values of the source type; otherwise, the source prvalue can be
converted to a prvalue of type unsigned int.
In short, 2147483648 overflows to -2147483648, and (-(-2147483648) > 0) is true.
This is what 2147483648 looks like in binary: 10000000 00000000 00000000 00000000 (a 1 followed by 31 zeros).
In addition, in the case of signed binary calculations, the most significant bit ("MSB") is the sign bit. This question may help explain why.
Because -2147483648 is actually 2147483648 with negation (-) applied to it, the number isn't what you'd expect. It is actually the equivalent of this pseudocode: operator -(2147483648)
Now, assuming your compiler has sizeof(int) equal to 4 and CHAR_BIT is defined as 8, that would make 2147483648 overflow the maximum signed value of an integer (2147483647). So what is the maximum plus one? Let's work that out with a 4-bit, two's complement integer: the maximum is 0111 (7), and 0111 + 1 = 1000 (8).
Wait! 8 overflows the integer! What do we do? Use its unsigned representation of 1000 and interpret the bits as a signed integer. This interpretation leaves us with -8, and applying the two's complement negation to it results in 8, which, as we all know, is greater than 0.
This is why <limits.h> (and <climits>) commonly define INT_MIN as ((-2147483647) - 1) - so that the maximum signed integer (0x7FFFFFFF) is negated (0x80000001), then decremented (0x80000000).
I am new to C++ and I am confused by its behavior for the code below:
#include <iostream>
void hello(unsigned int x, unsigned int y){
std::cout<<x<<std::endl;
std::cout<<y<<std::endl;
std::cout<<x+y<<std::endl;
}
int main(){
int a = -1;
int b = 3;
hello(a,b);
return 1;
}
The x in the output is a very large integer: 4294967295. I know that a negative integer converted to unsigned behaves like this. But why is x+y in the output 2?
Contrary to what is sometimes claimed, there is no undefined behavior here, and there is no overflow. Unsigned integers use modulo 2^n arithmetic.
Section 4.7 paragraph 2 of the standard says "If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type)." This dictates that -1 is equal to the largest possible unsigned int (modulo 2^n).
Section 3.9.1 paragraph 4 says "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." To make it clear what this means, the footnote to this clause says "This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type."
In other words, converting -1 to 4294967295 is not just defined behavior, it is required behavior (assuming 32 bit integers). Similarly, adding 3 to that value and yielding 2 as a result is also required behavior. In this case, the value of n is irrelevant. The third value printed by hello() must be 2 or the implementation is not compliant with the standard.
Because unsigned ints wrap around. In other words, a = -1 (signed) is converted to the maximum value for unsigned ints, 4294967295.
Then you add 3; the value wraps past the maximum and starts again at 0, so -1 + 3 = 2.
Passing a negative number as an unsigned int parameter converts it to a large positive value; the conversion itself is well defined. The default int is signed. With 32 bits it has a range of -2,147,483,648 to 2,147,483,647. The range of unsigned int is 0 to 4,294,967,295.
This isn't so much having to do with C++ but with how computers represent signed and unsigned numbers.
This is a good source on that. Basically, signed numbers are (usually) represented using two's complement, in which the most significant bit of an n-bit number has a value of -2^(n-1). In effect, what this means is that the positive numbers are represented the same in two's complement as they are in regular unsigned binary.
-1 is represented as all ones, which when interpreted as an unsigned integer will be the largest integer that can be represented (4294967295, when dealing with 32 bits).
One of the great things about using two's complement to represent signed numbers is that you can perform addition and subtraction in exactly the same way as with unsigned numbers and it will work out correctly, so long as the number does not exceed the bounds that can be represented. This isn't as easy with other forms such as sign-magnitude.
So, what this means is that because the result of -1 + 3 = 2, and because 2 is positive, it will be interpreted the same as if it were unsigned. Thus, it prints 2.
There's no real need for a solution to this, I just want to know why.
Let's take two numbers:
#include <iostream>
using namespace std;
int main()
{
unsigned long long int a = 17446744073709551615;
signed long long int b = -30000000003;
signed int c;
c = a/b;
cout << "\n\n\n" << c << endl;
}
Now, lately the answer I've been getting is zero. The size of my long long is 8 bytes, so it is more than enough to hold the value with the unsigned label. The variable c should also be big enough to handle the answer. (It should be -581 558 136, according to Google). So...
Edit I'd like to point out that on my machine...
Using numeric_limits, a falls well within the maximum of 18446744073709551615 and b falls within the minimum limit of -9223372036854775808.
You have a number of implicit conversions happening, most of them unnecessary.
unsigned long long int a = 17446744073709551615;
An unsuffixed decimal integer literal is of type int, long int, or long long int; it's never of an unsigned type. That particular value almost certainly exceeds the maximum value of a long long int (2^63-1). Unless your compiler has a signed integer type wider than 64 bits, that makes your program ill-formed.
Add a ULL suffix to ensure that the literal is of the correct type:
unsigned long long int a = 17446744073709551615ULL;
The value happens to be between 2^63-1 and 2^64-1, so it fits in a 64-bit unsigned type but not in a 64-bit signed type.
(Actually just the U would suffice, but it doesn't hurt to be explicit.)
signed long long int b = -30000000003;
This shouldn't be a problem. 30000000003 is of some signed integer type; if your compiler supports long long, which is at least 64 bits wide, there's no overflow. Still, as long as you need a suffix on the value of a, it wouldn't hurt to be explicit:
signed long long int b = -30000000003LL;
Now we have:
signed int c;
c = a/b;
Dividing an unsigned long long by a signed long long causes the signed operand to be converted to unsigned long long. In this case, the value being converted is negative, so it's converted to a large positive value. Converting -30000000003 to unsigned long long yields 18446744043709551613. Dividing 17446744073709551615 by 18446744043709551613 yields zero.
Unless your compiler supports integers wider than 64 bits (most don't), you won't be able to directly divide 17446744073709551615 by -30000000003 and get a mathematically correct answer, since there's no integer type that can represent both values. All arithmetic operators (other than the shift operators) require operands of the same type, with implicit conversions applied as necessary.
In this particular case, you can divide 17446744073709551615ULL by 30000000003ULL and then account for the sign. (Check the language rules for division of negative integers.)
If you really need to do this in general, you can resort to floating-point (which means you'll probably lose some precision) or use some arbitrary width integer arithmetic package like GMP.
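b is converted to an unsigned number which is larger than a. Hence you are getting 0 as the answer.
Try dividing by the magnitude of b and restoring the sign afterwards:
unsigned long long mag = (b < 0) ? 0ULL - b : b; // |b| as an unsigned value
c = a / mag; // a is unsigned, so only b can be negative
if (b < 0)
c = -c;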