How can I portably find out the smallest of INT_MAX and abs(INT_MIN)? (That's the mathematical absolute value of INT_MIN, not a call to the abs function.)
It should be the same as INT_MAX on most systems, but I'm looking for a more portable way.
While the typical value of INT_MIN is -2147483648, and the typical value of INT_MAX is 2147483647, it is not guaranteed by the standard. TL;DR: The value you're searching for is INT_MAX in a conforming implementation. But calculating min(INT_MAX, abs(INT_MIN)) isn't portable.
The possible values of INT_MIN and INT_MAX
INT_MIN and INT_MAX are defined by Annex E (Implementation limits) of the C standard (C++ inherits these definitions):
The contents of the header <limits.h> are given below, in alphabetical
order. The minimum magnitudes shown shall be replaced by
implementation-defined magnitudes with the same sign. The values shall
all be constant expressions suitable for use in #if preprocessing
directives. The components are described further in 5.2.4.2.1.
[...]
#define INT_MAX +32767
#define INT_MIN -32767
[...]
The standard requires the type int to be an integer type that can represent the range [INT_MIN, INT_MAX] (section 5.2.4.2.1.).
Then 6.2.6.2 (Integer types, again part of the C standard) comes into play and further restricts this to what we know as two's complement, ones' complement, or sign and magnitude:
For signed integer types, the bits of the object representation shall be divided into three
groups: value bits, padding bits, and the sign bit. There need not be any padding bits;
signed char shall not have any padding bits. There shall be exactly one sign bit.
Each bit that is a value bit shall have the same value as the same bit in the object
representation of the corresponding unsigned type (if there are M value bits in the signed
type and N in the unsigned type, then M ≤ N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, the value shall be modified in one of the
following ways:
— the corresponding value with sign bit 0 is negated (sign and magnitude);
— the sign bit has the value −(2^M) (two’s complement);
— the sign bit has the value −(2^M − 1) (ones’ complement).
Section 6.2.6.2. is also very important to relate the value representation of the signed integer types with the value representation of its unsigned siblings.
This means, you either get the range [-(2^n - 1), (2^n - 1)] or [-2^n, (2^n - 1)], where n is typically 15 or 31.
Operations on signed integer types
Now for the second thing: if an operation on signed integer types results in a value that is not within the range [INT_MIN, INT_MAX], the behavior is undefined. This is explicitly mandated in C++ by Paragraph 5/4:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined.
For C, 6.5/5 offers a very similar passage:
If an exceptional condition occurs during the evaluation of an expression (that is, if the
result is not mathematically defined or not in the range of representable values for its
type), the behavior is undefined.
So what happens if the value of INT_MIN happens to be less than the negative of INT_MAX (e.g. -32768 and 32767 respectively)? Calculating -(INT_MIN) will be undefined, the same as INT_MAX + 1.
So we need to avoid ever calculating a value that may not be in the range [INT_MIN, INT_MAX]. Luckily, INT_MAX + INT_MIN is always in that range, as INT_MAX is a strictly positive value and INT_MIN a strictly negative value. Hence INT_MIN < INT_MAX + INT_MIN < INT_MAX.
Now we can check whether INT_MAX + INT_MIN is equal to, less than, or greater than 0.
`INT_MAX + INT_MIN` | value of -INT_MIN | value of -INT_MAX
------------------------------------------------------------------
< 0 | undefined | -INT_MAX
= 0 | INT_MAX = -INT_MIN | -INT_MAX = INT_MIN
> 0 | cannot occur according to 6.2.6.2. of the C standard
Hence, to determine the minimum of INT_MAX and -INT_MIN (in the mathematical sense), the following code is sufficient:
if ( INT_MAX + INT_MIN == 0 )
{
return INT_MAX; // or -INT_MIN, it doesn't matter
}
else if ( INT_MAX + INT_MIN < 0 )
{
return INT_MAX; // INT_MAX is smaller, -INT_MIN cannot be represented.
}
else // ( INT_MAX + INT_MIN > 0 )
{
return -INT_MIN; // -INT_MIN is actually smaller than INT_MAX, may not occur in a conforming implementation.
}
Or, to simplify:
return (INT_MAX + INT_MIN <= 0) ? INT_MAX : -INT_MIN;
The values in a ternary operator will only be evaluated if necessary. Hence, -INT_MIN is either left unevaluated (therefore cannot produce UB), or is a well-defined value.
Or, if you want an assertion:
assert(INT_MAX + INT_MIN <= 0);
return INT_MAX;
Or, if you want that at compile time:
static_assert(INT_MAX + INT_MIN <= 0, "non-conforming implementation");
return INT_MAX;
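The same check can be packaged as a small constexpr function. This is only a sketch: the name smaller_magnitude and its parameters are hypothetical, and it assumes lowest < 0 < highest so that the sum can never overflow. Taking the limits as parameters also lets the rare "> 0" branch be exercised with made-up values.

```cpp
#include <climits>

// Hypothetical helper (not from the answer above): returns the smaller of
// `highest` and |lowest|, assuming lowest < 0 < highest.
// highest + lowest always lies between lowest and highest, so the sum is safe.
// -lowest is only evaluated in the branch where it is known to be representable.
constexpr int smaller_magnitude(int lowest, int highest) {
    return (highest + lowest <= 0) ? highest : -lowest;
}

// For the real limits the result is INT_MAX on a conforming implementation.
static_assert(smaller_magnitude(INT_MIN, INT_MAX) == INT_MAX,
              "INT_MAX is the smaller magnitude");
```

Because everything is evaluated at compile time, a non-conforming implementation would be caught by the static_assert rather than at run time.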
Getting integer operations right (i.e. if correctness matters)
If you're interested in safe integer arithmetic, have a look at my implementation of safe integer operations. If you want to see the patterns (rather than this lengthy text output) on which operations fail and which succeed, choose this demo.
Depending on the architecture, there may be other options to ensure correctness, such as gcc's option -ftrapv.
INT_MAX + INT_MIN < 0 ? INT_MAX : -INT_MIN
Edited to add explanation: Of course the difficulty is that -INT_MIN or abs(INT_MIN) will be undefined if -INT_MIN is too big to fit in an int. So we need some way of checking whether this is the case. The condition INT_MAX + INT_MIN < 0 tests whether -INT_MIN is greater than INT_MAX. If it is, then INT_MAX is the smaller of the two absolute values. If not, then INT_MAX is at least as large as -INT_MIN, so -INT_MIN is representable and is the correct answer.
In C99 and above, INT_MAX.
Quoth the spec:
For signed integer types, the bits of the object representation shall be divided into three
groups: value bits, padding bits, and the sign bit. There need not be any padding bits;
signed char shall not have any padding bits. There shall be exactly one sign bit.
Each bit that is a value bit shall have the same value as the same bit in the object
representation of the corresponding unsigned type (if there are M value bits in the signed
type and N in the unsigned type, then M ≤ N). If the sign bit is zero, it shall not affect
the resulting value. If the sign bit is one, the value shall be modified in one of the
following ways:
the corresponding value with sign bit 0 is negated (sign and magnitude);
the sign bit has the value −(2^M) (two’s complement);
the sign bit has the value −(2^M − 1) (ones’ complement).
(Section 6.2.6.2 of http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf)
On most systems, abs (INT_MIN) is not defined. For example, on typical 32 bit machines, INT_MAX = 2^31 - 1, INT_MIN = - 2^31, and abs (INT_MIN) cannot be 2^31.
-INT_MAX is representable as an int in all C and C++ dialects, as far as I know. Therefore:
-INT_MAX <= INT_MIN ? -INT_MIN : INT_MAX
abs(INT_MIN) will invoke undefined behavior. Standard says
7.22.6.1 The abs, labs and llabs functions:
The abs, labs, and llabs functions compute the absolute value of an integer j. If the result cannot be represented, the behavior is undefined.
Try this instead :
Convert INT_MIN to unsigned int. Since negative numbers can't be represented as an unsigned int, INT_MIN will be converted to UINT_MAX + 1 + INT_MIN.
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
unsigned min(unsigned a, unsigned b)
{
return a < b ? a : b;
}
int main(void)
{
printf("%u\n", min(INT_MAX, INT_MIN));
}
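To make the conversion rule concrete, here is a rough sketch that computes UINT_MAX + 1 + v in a wider type and compares it with what the cast produces. The helper name expected_unsigned is made up for illustration, and the sketch assumes long long is wider than int, which is common but not guaranteed.

```cpp
#include <climits>

// Hypothetical helper: the value that converting `v` to unsigned must yield,
// computed in long long so that the intermediate sum does not wrap.
constexpr long long expected_unsigned(int v) {
    return v < 0 ? static_cast<long long>(UINT_MAX) + 1 + v : v;
}

static_assert(sizeof(long long) > sizeof(int),
              "this sketch assumes long long is wider than int");
// The cast follows the modulo rule regardless of how int is represented.
static_assert(static_cast<unsigned>(INT_MIN) == expected_unsigned(INT_MIN),
              "INT_MIN converts to UINT_MAX + 1 + INT_MIN");
static_assert(static_cast<unsigned>(-1) == UINT_MAX,
              "-1 converts to UINT_MAX");
```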
I realize that there is a rule by which numbers with a width smaller than int can be promoted to a wider type for the addition operation. But I cannot fully explain how only one permutation of the following print_unsafe_minus will fail. How is it that only the <unsigned, long> example fails, and what is the take-away for programmers with regards to best practices?
#include <cstdint>
#include <fmt/core.h>
template<typename M, typename N>
void print_unsafe_minus() {
M a = 3, b = 4;
N c = a - b;
fmt::print("{}\n", c);
}
int main() {
// storing result of unsigned 3 minus 4 to a signed type
print_unsafe_minus<uint8_t, int8_t>(); // -1
print_unsafe_minus<uint16_t, int8_t>(); // -1
print_unsafe_minus<uint32_t, int8_t>(); // -1
print_unsafe_minus<uint64_t, int8_t>(); // -1
print_unsafe_minus<uint8_t, int16_t>(); // -1
print_unsafe_minus<uint16_t, int16_t>(); // -1
print_unsafe_minus<uint32_t, int16_t>(); // -1
print_unsafe_minus<uint64_t, int16_t>(); // -1
print_unsafe_minus<uint8_t, int32_t>(); // -1
print_unsafe_minus<uint16_t, int32_t>(); // -1
print_unsafe_minus<uint32_t, int32_t>(); // -1
print_unsafe_minus<uint64_t, int32_t>(); // -1
print_unsafe_minus<uint8_t, int64_t>(); // -1
print_unsafe_minus<uint16_t, int64_t>(); // -1
print_unsafe_minus<uint32_t, int64_t>(); // 4294967295
print_unsafe_minus<uint64_t, int64_t>(); // -1
}
(edit) Also worth noting-- if we extend the example to include 128-bit integers, then the following two permutations fail as well:
print_unsafe_minus<uint32_t, __int128>(); // 4294967295
print_unsafe_minus<uint64_t, __int128>(); // 18446744073709551615
Before we start, let us assume OP is using an implementation with 32-bit int type. That is, int32_t is equivalent to int.
Let X be the width of M, and Y be the width of N.
Let us divide your test cases into three categories:
First Category: X <= 16
Integer promotion applies here, which is always done before invoking an arithmetic operator.
uint8_t and uint16_t have their whole value ranges representable by int, hence they are promoted to int before doing the subtraction. Then you get a signed value of -1 from doing 3 - 4, which is then used to initialize a signed integer type, which regardless of its width can hold -1. Thus you get -1 as output.
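One way to see the promotion is to ask the compiler for the type of the subtraction. The following is a small sketch under the same 32-bit int assumption; the aliases narrow_diff and wide_diff and the function narrow_result are names invented here.

```cpp
#include <cstdint>
#include <type_traits>

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

// decltype shows the type of a - b after the integer promotions.
using narrow_diff = decltype(std::uint16_t{3} - std::uint16_t{4});
using wide_diff   = decltype(std::uint32_t{3} - std::uint32_t{4});

static_assert(std::is_same<narrow_diff, int>::value,
              "uint16_t operands are promoted to int");
static_assert(std::is_unsigned<wide_diff>::value,
              "uint32_t operands stay unsigned");

// With promoted operands the subtraction is ordinary signed arithmetic.
constexpr int narrow_result() {
    return std::uint16_t{3} - std::uint16_t{4};
}
```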
Second Category: (X >= 32) and (X >= Y)
No promotion happens before doing the subtraction.
The rule that applies here is that unsigned integer arithmetic is always modulo 2^X, where X is the width of the integer.
Hence a - b always gives you 2^X - 1, since this is the value in the range of M that is congruent to -1 modulo 2^X.
Now you assign it to a signed type. Let us assume C++20 (before C++20, the result is implementation-defined when an unsigned value that cannot be represented by the destination signed type is assigned).
Here the result of a - b (i.e. 2^X - 1) is converted to the unique value that is congruent to itself modulo 2^Y in the destination range (i.e. from -2^(Y-1) to 2^(Y-1) - 1). Since X >= Y, this is always going to be -1.
So you get -1 as output.
Third Category: (X >= 32) and (X < Y)
There is only one case in this category, namely the case where M = uint32_t, N = int64_t.
The subtraction is the same as in category 2, where you get 2^32 - 1.
The rule to convert to the signed type is still the same. However, this time, 2^32 - 1 is equal to itself modulo 2^64, so the value remains unchanged.
Note: 4294967295 == 2^32 - 1
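The second and third categories can be checked side by side. This sketch assumes a 32-bit int, and the names wrapped, widened and narrowed are illustrative only. The same-width conversion is guaranteed from C++20 on; earlier standards leave it implementation-defined, although mainstream compilers give the same result.

```cpp
#include <cstdint>

// Unsigned subtraction wraps modulo 2^32: 3 - 4 becomes 2^32 - 1.
constexpr std::uint32_t wrapped = std::uint32_t{3} - std::uint32_t{4};
static_assert(wrapped == 4294967295u, "3 - 4 wraps to 2^32 - 1");

// Third category: int64_t can hold 2^32 - 1, so the value is unchanged.
constexpr std::int64_t widened = wrapped;
static_assert(widened == 4294967295LL, "value survives the widening conversion");

// Second category: same width, so the value is reduced modulo 2^32 into the
// signed range (guaranteed since C++20, implementation-defined before).
constexpr std::int32_t narrowed = static_cast<std::int32_t>(wrapped);
```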
Take Away
This is probably a surprising aspect of C++, and as suggested by @NathanOliver, you should avoid mixing signed and unsigned types, and take extreme care when you do want to mix them.
You can tell the compiler to generate warnings for such conversion by turning on -Wconversion. Your code gets a lot of warnings when this is turned on.
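If the mathematical difference is what you want, one possible approach is to convert both operands to a wider signed type before subtracting. This is a sketch rather than a general recipe: signed_difference is a hypothetical helper, and it only covers 32-bit unsigned operands, whose difference always fits in int64_t.

```cpp
#include <cstdint>

// Hypothetical helper: converts first, subtracts second, so the subtraction
// happens in int64_t and never wraps for 32-bit unsigned inputs.
constexpr std::int64_t signed_difference(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int64_t>(a) - static_cast<std::int64_t>(b);
}

static_assert(signed_difference(3, 4) == -1, "3 - 4 is -1, not 4294967295");
static_assert(signed_difference(4, 3) == 1, "ordinary case is unaffected");
```

The explicit casts also keep -Wconversion quiet, since no implicit signedness change remains.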
Let's assume a sane two's-complement platform where int has 32 bits and uint32_t is the same as unsigned.
uint32_t a = 3, b = 4;
int64_t c = a - b;
Operands of the - operator undergo integral promotions*. int cannot represent all values of uint32_t, but a 32-bit unsigned can, so the operands stay unsigned. The result type of - is the common type of the operands after promotion, which here is unsigned. a - b is mathematically -1, but unsigned cannot represent negative numbers, so the result "wraps around" to UINT_MAX, which is equal to UINT32_MAX because unsigned has 32 bits. This value is representable in int64_t, so the conversion leaves it unchanged and c is assigned the value UINT32_MAX.
In contrast let's take for example <uint16_t, int64_t>. A 32-bit int can represent all values of an uint16_t, so uint16_t is promoted to int, so the result of a - b is just an (int)-1. There is no conversion from (int)-1 to an unsigned number. Then int64_t can represent -1, so the value -1 is just assigned to a variable with type int64_t.
* It's called integer promotions in C language...
When I was working on string::npos I noticed something and I couldn't find any explanation for it on the web.
(string::npos == ULONG_MAX)
and
(string::npos == -1)
are true.
So I tried this:
(18446744073709551615 == -1)
which is also true.
How can it be possible? Is it because of binary conversion?
18,446,744,073,709,551,615
The number mentioned, 18,446,744,073,709,551,615, is actually 2^64 − 1. The important thing here is that an unsigned 64-bit integer has 2^64 distinct values counted from 0, not from 1, so its maximum is 2^64 - 1. By the same logic, if the maximum value is 1, there are two possible values: 0 and 1.
Let's look at 2^64 - 1 in 64bit binary, all the bits are on.
1111111111111111111111111111111111111111111111111111111111111111b
The -1
Let's look at +1 in 64bit binary.
0000000000000000000000000000000000000000000000000000000000000001b
To make it negative in One's Complement (OCP) we invert the bits.
1111111111111111111111111111111111111111111111111111111111111110b
Computers seldom use OCP, they use Two's Complement (TCP). To get TCP, you add one to OCP.
1111111111111111111111111111111111111111111111111111111111111110b (-1 in OCP)
+ 1b (1)
-----------------------------------------------------------------
1111111111111111111111111111111111111111111111111111111111111111b (-1 in TCP)
"But, wait" you ask, if in Twos Complement -1 is,
1111111111111111111111111111111111111111111111111111111111111111b
And, if in binary 2^64 - 1 is
1111111111111111111111111111111111111111111111111111111111111111b
Then they're equal! And, that's what you're seeing. You're comparing a signed 64 bit integer to an unsigned 64bit integer. In C++ that means convert the signed value to unsigned, which the compiler does.
Update
For a technical correction, thanks to davmac in the comments: the conversion from -1, which is signed, to an unsigned type of the same size is actually specified in the language, and is not a function of the architecture. That said, you may find the answer above useful for understanding the architectures and languages that support two's complement but lack a specification to ensure results you can depend on.
string::npos is defined as constexpr static std::string::size_type string::npos = -1; (or if it's defined inside the class definition that would be constexpr static size_type npos = -1; but that's really irrelevant).
The wraparound of negative numbers converted to unsigned types (std::string::size_type is basically std::size_t, which is unsigned) is perfectly well-defined by the Standard. -1 wraps to the largest representable value of the unsigned type, which in your case is 18446744073709551615. Note that the exact value is implementation-defined because the size of std::size_t is implementation-defined (but capable of holding the size of the largest possible array on the system in question).
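A compile-time check along these lines makes the point without relying on any particular width of size_type. It is only a sketch; the alias size_type and the constant minus_one_as_size are names introduced here.

```cpp
#include <limits>
#include <string>

using size_type = std::string::size_type;

// Converting -1 to the unsigned size_type yields its largest value.
constexpr size_type minus_one_as_size = static_cast<size_type>(-1);

static_assert(std::string::npos == minus_one_as_size,
              "npos is -1 converted to size_type");
static_assert(std::string::npos == std::numeric_limits<size_type>::max(),
              "npos is the largest size_type value");
```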
According to the C++ Standard (Document Number: N3337 or Document Number: N4296) std::string::npos is defined the following way
static const size_type npos = -1;
where std::string::size_type is some unsigned integer type. So there is nothing surprising in std::string::npos being equal to -1. The initializer is converted to the type of std::string::npos.
As for this equation
(string::npos == ULONG_MAX) is true,
then it means that std::string::npos has the type unsigned long in the implementation used. This type usually corresponds to the type size_t.
In this equation
(18446744073709551615 == -1)
The left literal has some unsigned integral type that is large enough to store such a big literal. Thus the right operand is also converted to this unsigned type by propagating the sign bit. As the left operand is itself the maximum value of the type, they are equal.
This is all about integer wraparound and the fact that negative numbers are stored as two's complement. This means that to get the absolute value of a negative number, you invert all the bits and add one. So in an 8-bit comparison, 255 and -1 have the same binary value of 11111111. The same applies to bigger integers.
https://en.m.wikipedia.org/wiki/Two%27s_complement
In C or C++ it is said that the maximum number a size_t (an unsigned int data type) can hold is the same as casting -1 to that data type. for example see Invalid Value for size_t
Why?
I mean, (talking about 32 bit ints) AFAIK the most significant bit holds the sign in a signed data type (that is, bit 0x80000000 to form a negative number). then, 1 is 0x00000001.. 0x7FFFFFFF is the greatest positive number an int data type can hold.
Then, AFAIK the binary representation of -1 int should be 0x80000001 (perhaps I'm wrong). why/how this binary value is converted to anything completely different (0xFFFFFFFF) when casting ints to unsigned?? or.. how is it possible to form a binary -1 out of 0xFFFFFFFF?
I have no doubt that in C: ((unsigned int)-1) == 0xFFFFFFFF or ((int)0xFFFFFFFF) == -1 is just as true as 1 + 1 == 2, I'm just wondering why.
C and C++ can run on many different architectures, and machine types. Consequently, they can have different representations of numbers: Two's complement, and Ones' complement being the most common. In general you should not rely on a particular representation in your program.
For unsigned integer types (size_t being one of those), the C standard (and the C++ standard too, I think) specifies precise overflow rules. In short, if SIZE_MAX is the maximum value of the type size_t, then the expression
(size_t) (SIZE_MAX + 1)
is guaranteed to be 0, and therefore, you can be sure that (size_t) -1 is equal to SIZE_MAX. The same holds true for other unsigned types.
Note that the above holds true:
for all unsigned types,
even if the underlying machine doesn't represent numbers in Two's complement. In this case, the compiler has to make sure the identity holds true.
Also, the above means that you can't rely on specific representations for signed types.
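The identity can be spot-checked for several unsigned types at compile time. The template name minus_one_is_max is made up for this sketch; nothing in it depends on how the machine represents signed numbers.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: true when converting -1 to T yields T's maximum value.
template <typename T>
constexpr bool minus_one_is_max() {
    return static_cast<T>(-1) == std::numeric_limits<T>::max();
}

static_assert(minus_one_is_max<unsigned char>(), "holds for unsigned char");
static_assert(minus_one_is_max<unsigned short>(), "holds for unsigned short");
static_assert(minus_one_is_max<unsigned int>(), "holds for unsigned int");
static_assert(minus_one_is_max<unsigned long long>(), "holds for unsigned long long");
static_assert(minus_one_is_max<std::size_t>(), "holds for size_t");

// Unsigned arithmetic is modulo 2^n, so one past the maximum is 0.
static_assert(static_cast<std::size_t>(SIZE_MAX + 1) == 0, "SIZE_MAX + 1 wraps to 0");
```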
Edit: In order to answer some of the comments:
Let's say we have a code snippet like:
int i = -1;
long j = i;
There is a type conversion in the assignment to j. Assuming that int and long have different sizes (most [all?] 64-bit systems), the bit-patterns at memory locations for i and j are going to be different, because they have different sizes. The compiler makes sure that the values of i and j are -1.
Similarly, when we do:
size_t s = (size_t) -1;
There is a type conversion going on. The -1 is of type int. It has a bit-pattern, but that is irrelevant for this example because when the conversion to size_t takes place due to the cast, the compiler will translate the value according to the rules for the type (size_t in this case). Thus, even if int and size_t have different sizes, the standard guarantees that the value stored in s above will be the maximum value that size_t can take.
If we do:
long j = LONG_MAX;
int i = j;
If LONG_MAX is greater than INT_MAX, then the value in i is implementation-defined (C89, section 3.2.1.2).
It's called two's complement. To make a negative number, invert all the bits then add 1. So to convert 1 to -1, invert it to 0xFFFFFFFE, then add 1 to make 0xFFFFFFFF.
As to why it's done this way, Wikipedia says:
The two's-complement system has the advantage of not requiring that the addition and subtraction circuitry examine the signs of the operands to determine whether to add or subtract. This property makes the system both simpler to implement and capable of easily handling higher precision arithmetic.
Your first question, about why (unsigned)-1 gives the largest possible unsigned value is only accidentally related to two's complement. The reason -1 cast to an unsigned type gives the largest value possible for that type is because the standard says the unsigned types "follow the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer."
Now, for 2's complement, the representation of the largest possible unsigned value and -1 happen to be the same -- but even if the hardware uses another representation (e.g. 1's complement or sign/magnitude), converting -1 to an unsigned type must still produce the largest possible value for that type.
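To see that the result follows from arithmetic on values rather than from bit patterns, the reduction modulo 2^32 can be written out by hand and compared with the cast. This is an illustrative sketch; kModulus and reduce_mod_2_32 are hypothetical names, and 32 bits is just an example width.

```cpp
#include <cstdint>

constexpr std::int64_t kModulus = std::int64_t{1} << 32;  // 2^32

// Hypothetical helper: reduces v modulo 2^32 using only value arithmetic.
// v % kModulus lies strictly between -2^32 and 2^32; adding 2^32 and reducing
// again gives the representative in [0, 2^32).
constexpr std::uint32_t reduce_mod_2_32(std::int64_t v) {
    return static_cast<std::uint32_t>((v % kModulus + kModulus) % kModulus);
}

static_assert(reduce_mod_2_32(-1) == 4294967295u, "-1 is congruent to 2^32 - 1");
static_assert(reduce_mod_2_32(-1) == static_cast<std::uint32_t>(-1),
              "the cast gives the same value as the hand-written reduction");
```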
Two's complement is very nice for doing subtraction just like addition :)
11111110 (254 or -2)
+00000001 ( 1)
---------
11111111 (255 or -1)
11111111 (255 or -1)
+00000001 ( 1)
---------
100000000 ( 0 + 256)
That is two's complement encoding.
The main bonus is that you get the same encoding whether you are using an unsigned or signed int. If you subtract 1 from 0 the integer simply wraps around. Therefore 1 less than 0 is 0xFFFFFFFF.
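The two 8-bit additions shown above can be reproduced in code. In this sketch, add8 keeps only the low 8 bits of the sum and as_signed8 reads the same bits as a two's-complement value; both names are invented for the example.

```cpp
#include <cstdint>

// Hypothetical helper: a + b is computed in int, then reduced modulo 256 by
// the conversion back to uint8_t, like an 8-bit adder dropping the carry.
constexpr std::uint8_t add8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Hypothetical helper: interprets 8 bits as a two's-complement value.
constexpr int as_signed8(std::uint8_t bits) {
    return bits < 128 ? bits : bits - 256;
}

static_assert(add8(254, 1) == 255, "unsigned view: 254 + 1 == 255");
static_assert(as_signed8(add8(254, 1)) == -1, "signed view: -2 + 1 == -1");
static_assert(add8(255, 1) == 0, "the ninth bit (256) is dropped");
```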
Because the bit pattern for an int
-1 is FFFFFFFF in hexadecimal unsigned.
11111111111111111111111111111111 binary unsigned.
But in int the first bit signifies whether it is negative.
But in unsigned int the first bit is just an extra value bit, because an unsigned int cannot be negative. So the extra bit makes an unsigned int able to store bigger numbers.
As with an unsigned int 11111111111111111111111111111111 (binary) or FFFFFFFF (hexadecimal) is the biggest number a uint can store.
Unsigned Ints are not recommended because if they go negative then it overflows and goes to the biggest number.
I am new to C++, I am confused with C++'s behavior for the code below:
#include <iostream>
void hello(unsigned int x, unsigned int y){
std::cout<<x<<std::endl;
std::cout<<y<<std::endl;
std::cout<<x+y<<std::endl;
}
int main(){
int a = -1;
int b = 3;
hello(a,b);
return 1;
}
The x in the output is a very large integer:4294967295, I know that negative integer convert to unsigned will behave like this. But why x+y in the output is 2?
Contrary to the other answers, there is no undefined behavior here, and there is no overflow. Unsigned integers use modulo 2^n arithmetic.
Section 4.7 paragraph 2 of the standard says "If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type)." This dictates that -1 is equal to the largest possible unsigned int (modulo 2^n).
Section 3.9.1 paragraph 4 says "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." To make it clear what this means, the footnote to this clause says "This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type."
In other words, converting -1 to 4294967295 is not just defined behavior, it is required behavior (assuming 32 bit integers). Similarly, adding 3 to that value and yielding 2 as a result is also required behavior. In this case, the value of n is irrelevant. The third value printed by hello() must be 2 or the implementation is not compliant with the standard.
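Both required results can be verified at compile time. The helper names as_unsigned and sum_as_unsigned are hypothetical; they mirror what happens when int arguments are passed to the unsigned parameters of hello().

```cpp
#include <climits>

// Hypothetical helper: the conversion applied to each argument of hello().
constexpr unsigned as_unsigned(int v) {
    return static_cast<unsigned>(v);
}

// Hypothetical helper: the x + y computed inside hello(), modulo 2^n.
constexpr unsigned sum_as_unsigned(int a, int b) {
    return as_unsigned(a) + as_unsigned(b);
}

static_assert(as_unsigned(-1) == UINT_MAX, "-1 converts to the largest unsigned value");
static_assert(sum_as_unsigned(-1, 3) == 2u, "UINT_MAX + 3 wraps around to 2");
```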
Because unsigned ints will wrap around. In other words, a = -1 (signed) is converted to the maximum value for unsigned ints, 4294967295.
Then you add 3, the value wraps past the maximum and starts again at 0, so -1 + 3 = 2.
Passing negative number to unsigned int as parameter gives you an undefined behavior. The default int is a signed. It has a range of –2,147,483,648 to 2,147,483,647. Unsigned int range is from 0 to 4,294,967,295.
This isn't so much having to do with C++ but with how computers represent signed and unsigned numbers.
This is a good source on that. Basically, signed numbers are (usually) represented using two's complement, in which the most significant bit has a value of -2^n. In effect, what this means is that the positive numbers are represented the same in two's complement as they are in regular unsigned binary.
-1 is represented as all ones, which when interpreted as an unsigned integer will be the largest integer that can be represented (4294967295, when dealing with 32 bits).
One of the great things about using two's complement to represent signed numbers is that you can perform addition and subtraction in the exact same way as with unsigned numbers and it will work out correctly, so long as the number does not exceed the bounds that can be represented. This isn't as easy with other forms such as sign-magnitude.
So, what this means is that because the result of -1 + 3 = 2, and because 2 is positive, it will be interpreted the same as if it were unsigned. Thus, it prints 2.