binary representation in turbo c - c++

I have a very basic query.
For the following code:
int i = -1;
unsigned int j = (unsigned int) i;
printf("%d",j); // I get 65535 the int limit in turbo C 16 bit compiler
Shouldn't it simply drop the negative sign or does it wrap around the limits ?

The scheme most commonly used to represent negative numbers is called Two's Complement. In this scheme -1 has the bit pattern of all 1 bits: 1111 1111 1111 1111. When interpreted as an unsigned integer, which is what your cast means, that's 65535.
In other words, it does "simply drop the negative sign", except that doing so has a different effect than what you expect.

When converting an int to unsigned int, if the value is negative the result is as if that negative value has been added to the number one larger than what an unsigned int can hold. This is specified by the C standard since C90, and holds regardless of whether the machine uses two's complement, one's complement or signed magnitude representations.

Related

How typecast works on initialization "unsigned int i = -100"? [duplicate]

I was curious to know what would happen if I assign a negative value to an unsigned variable.
The code will look somewhat like this.
unsigned int nVal = 0;
nVal = -5;
It didn't give me any compiler error. When I ran the program the nVal was assigned a strange value! Could it be that some 2's complement value gets assigned to nVal?
For the official answer - Section 4.7 conv.integral
"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
This essentially means that if the underlying architecture stores in a method that is not Two's Complement (like Signed Magnitude, or One's Complement), that the conversion to unsigned must behave as if it was Two's Complement.
It will assign the bit pattern representing -5 (in 2's complement) to the unsigned int. Which will be a large unsigned value. For 32 bit ints this will be 2^32 - 5 or 4294967291
You're right, the signed integer is stored in 2's complement form, and the unsigned integer is stored in the unsigned binary representation. C (and C++) doesn't distinguish between the two, so the value you end up with is simply the unsigned binary value of the 2's complement binary representation.
It will show as a positive integer of value of max unsigned integer - 4 (value depends on computer architecture and compiler).
BTW
You can check this by writing a simple C++ "hello world" type program and see for yourself
Yes, you're correct. The actual value assigned is something like all bits set except the third. -1 is all bits set (hex: 0xFFFFFFFF), -2 is all bits except the first and so on. What you would see is probably the hex value 0xFFFFFFFB which in decimal corresponds to 4294967291.
When you assign a negative value to an unsigned variable then it uses the 2's complement method to process it and in this method it flips all 0s to 1s and all 1s to 0s and then adds 1 to it. In your case, you are dealing with int which is of 4 byte(32 bits) so it tries to use 2's complement method on 32 bit number which causes the higher bit to flip. For example:
┌─[student#pc]─[~]
└──╼ $pcalc 0y00000000000000000000000000000101 # 5 in binary
5 0x5 0y101
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 # flip all bits
4294967290 0xfffffffa 0y11111111111111111111111111111010
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 + 1 # add 1 to that flipped binarry
4294967291 0xfffffffb 0y11111111111111111111111111111011
In Windows and Ubuntu Linux that I have checked assigning any negative number (not just -1) to an unsigned integer in C and C++ results in the assignment of the value UINT_MAX to that unsigned integer.
Compiled example link.

Why does (unsigned int = -1) show the largest value that it can store? [duplicate]

In C or C++ it is said that the maximum number a size_t (an unsigned int data type) can hold is the same as casting -1 to that data type. for example see Invalid Value for size_t
Why?
I mean, (talking about 32 bit ints) AFAIK the most significant bit holds the sign in a signed data type (that is, bit 0x80000000 to form a negative number). then, 1 is 0x00000001.. 0x7FFFFFFFF is the greatest positive number a int data type can hold.
Then, AFAIK the binary representation of -1 int should be 0x80000001 (perhaps I'm wrong). why/how this binary value is converted to anything completely different (0xFFFFFFFF) when casting ints to unsigned?? or.. how is it possible to form a binary -1 out of 0xFFFFFFFF?
I have no doubt that in C: ((unsigned int)-1) == 0xFFFFFFFF or ((int)0xFFFFFFFF) == -1 is equally true than 1 + 1 == 2, I'm just wondering why.
C and C++ can run on many different architectures, and machine types. Consequently, they can have different representations of numbers: Two's complement, and Ones' complement being the most common. In general you should not rely on a particular representation in your program.
For unsigned integer types (size_t being one of those), the C standard (and the C++ standard too, I think) specifies precise overflow rules. In short, if SIZE_MAX is the maximum value of the type size_t, then the expression
(size_t) (SIZE_MAX + 1)
is guaranteed to be 0, and therefore, you can be sure that (size_t) -1 is equal to SIZE_MAX. The same holds true for other unsigned types.
Note that the above holds true:
for all unsigned types,
even if the underlying machine doesn't represent numbers in Two's complement. In this case, the compiler has to make sure the identity holds true.
Also, the above means that you can't rely on specific representations for signed types.
Edit: In order to answer some of the comments:
Let's say we have a code snippet like:
int i = -1;
long j = i;
There is a type conversion in the assignment to j. Assuming that int and long have different sizes (most [all?] 64-bit systems), the bit-patterns at memory locations for i and j are going to be different, because they have different sizes. The compiler makes sure that the values of i and j are -1.
Similarly, when we do:
size_t s = (size_t) -1
There is a type conversion going on. The -1 is of type int. It has a bit-pattern, but that is irrelevant for this example because when the conversion to size_t takes place due to the cast, the compiler will translate the value according to the rules for the type (size_t in this case). Thus, even if int and size_t have different sizes, the standard guarantees that the value stored in s above will be the maximum value that size_t can take.
If we do:
long j = LONG_MAX;
int i = j;
If LONG_MAX is greater than INT_MAX, then the value in i is implementation-defined (C89, section 3.2.1.2).
It's called two's complement. To make a negative number, invert all the bits then add 1. So to convert 1 to -1, invert it to 0xFFFFFFFE, then add 1 to make 0xFFFFFFFF.
As to why it's done this way, Wikipedia says:
The two's-complement system has the advantage of not requiring that the addition and subtraction circuitry examine the signs of the operands to determine whether to add or subtract. This property makes the system both simpler to implement and capable of easily handling higher precision arithmetic.
Your first question, about why (unsigned)-1 gives the largest possible unsigned value is only accidentally related to two's complement. The reason -1 cast to an unsigned type gives the largest value possible for that type is because the standard says the unsigned types "follow the laws of arithmetic modulo 2n where n is the number of bits in the value representation of that particular size of integer."
Now, for 2's complement, the representation of the largest possible unsigned value and -1 happen to be the same -- but even if the hardware uses another representation (e.g. 1's complement or sign/magnitude), converting -1 to an unsigned type must still produce the largest possible value for that type.
Two's complement is very nice for doing subtraction just like addition :)
11111110 (254 or -2)
+00000001 ( 1)
---------
11111111 (255 or -1)
11111111 (255 or -1)
+00000001 ( 1)
---------
100000000 ( 0 + 256)
That is two's complement encoding.
The main bonus is that you get the same encoding whether you are using an unsigned or signed int. If you subtract 1 from 0 the integer simply wraps around. Therefore 1 less than 0 is 0xFFFFFFFF.
Because the bit pattern for an int
-1 is FFFFFFFF in hexadecimal unsigned.
11111111111111111111111111111111 binary unsigned.
But in int the first bit signifies whether it is negative.
But in unsigned int the first bit is just extra number because a unsigned int cannot be negative. So the extra bit makes an unsigned int able to store bigger numbers.
As with an unsigned int 11111111111111111111111111111111 (binary) or FFFFFFFF (hexadecimal) is the biggest number a uint can store.
Unsigned Ints are not recommended because if they go negative then it overflows and goes to the biggest number.

C++ function with unsigned int parameter gets strange result where call it with negitive

I am new to C++, I am confused with C++'s behavior for the code below:
#include <iostream>
void hello(unsigned int x, unsigned int y){
std::cout<<x<<std::endl;
std::cout<<y<<std::endl;
std::cout<<x+y<<std::endl;
}
int main(){
int a = -1;
int b = 3;
hello(a,b);
return 1;
}
The x in the output is a very large integer:4294967295, I know that negative integer convert to unsigned will behave like this. But why x+y in the output is 2?
Contrary to the other answers, there is no undefined behavior here, and there is no overflow. Unsigned integers use modulo 2n arithmetic.
Section 4.7 paragraph 2 of the standard says "If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type)." This dictates that -1 is equal to the largest possible unsigned int (modulo 2n).
Section 3.9.1 paragraph 4 says "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2n where n is the number of bits in the value representation of that particular size of integer." To make it clear what this means, the footnote to this clause says "This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type."
In other words, converting -1 to 4294967295 is not just defined behavior, it is required behavior (assuming 32 bit integers). Similarly, adding 3 to that value and yielding 2 as a result is also required behavior. In this case, the value of n is irrelevant. The third value printed by hello() must be 2 or the implementation is not compliant with the standard.
Because unsigned int's will overflow. In other words, a=-1 (signed) which is 1 value below the maximum value for unsigned int's, 4294967295.
Then you add 3, the int will overflow and start at 0, so -1+3 =2
Passing negative number to unsigned int as parameter gives you an undefined behavior. The default int is a signed. It has a range of –2,147,483,648 to 2,147,483,647. Unsigned int range is from 0 to 4,294,967,295.
This isn't so much having to do with C++ but with how computers represent signed and unsigned numbers.
This is a good source on that. Basically, signed numbers are (usually) represented using two's complement, in which the most significant bit has a value of -2^n. In effect, what his means is that the positive numbers are represented the same in two's complement as they are in regular unsigned binary.
-1 is represented as all ones, which when interpreted as an unsigned integer will be the largest integer that can be represented (4294967295, when dealing with 32 bits).
One of the great things about using two complement to represent signed numbers is that you can perform addition and subtraction in the exact same way as with unsigned numbers and it will work out correctly, so long as the number does not exceed the bounds that can be represented. This isn't as easy with other forms such as signed-magnitude.
So, what this means is that because the result of -1 + 3 = 2, and because 2 is positive, it will be interpreted the same as if it were unsigned. Thus, it prints 2.

What does ~0 mean in this code?

What's the meaning of ~0 in this code?
Can somebody analyze this code for me?
unsigned int Order(unsigned int maxPeriod = ~0) const
{
Point r = *this;
unsigned int n = 0;
while( r.x_ != 0 && r.y_ != 0 )
{
++n;
r += *this;
if ( n > maxPeriod ) break;
}
return n;
}
~0 is the bitwise complement of 0, which is a number with all bits filled. For an unsigned 32-bit int, that's 0xffffffff. The exact number of fs will depend on the size of the value that you assign ~0 to.
It's the one complement, which inverts all bits.
~ 0101 => 1010
~ 0000 => 1111
~ 1111 => 0000
As others have mentioned, the ~ operator performs bitwise complement. However, the result of performing the operation on a signed value is not defined by the standard.
In particular, the value of ~0 need not be -1, which is probably the value intended. Setting the default argument to
unsigned int maxPeriod = -1
would make maxPeriod contain the highest possible value (signed to unsigned conversion is defined as an assignment modulo 2**n, where n is a characteristic number of the given unsigned type (the number of bits of representation)).
Also note that default arguments are not valid in C.
It's a binary complement function.
Basically it means flip each bit.
It is the bitwise complement of 0 which would be, in this example, an int with all the bits set to 1. If sizeof(int) is 4, then the number is 0xffffffff.
Basically, it's saying that maxPeriod has a default value of UINT_MAX. Rather than writing it as UINT_MAX, the author used his knowledge of complements to calculate the value.
If you want to make the code a bit more readable in the future, include
#include <limits>
and change the call to read
unsigned int Order(unsigned int maxPeriod = UINT_MAX) const
Now to explain why ~0 is UINT_MAX. Since we are dealing with an int, in which 0 is represented with all zero bits (00000000). Adding one would give (00000001), adding one more would give (00000010), and one more would give (00000011). Finally one more addition would give (00000100) because the 1's carry.
For unsigned ints, if you repeat the process ad-infiniteum, eventually you have all one bits (11111111), and adding another one will overflow the buffer setting all the bits back to zero. This means that all one bits in an unsigned number is the maximum that data type (int in your case) can hold.
The "~" operation flips all bits from 0 to 1 or 1 to 0, flipping a zero integer (which has all zero bits) effectively gives you UINT_MAX. So he basically the previous coded opted to computer UINT_MAX instead of using the system defined copy located in #include <limits.h>
In the example it is probably an attempt to generate the UINT_MAX value. The technique is possibly flawed for reasons already stated.
The expression does however does have legitimate use to generate a bit mask with all bits set using a literal constant that is type-width independent; but that is not how it is being used in your example.
As others have said, ~ is the bitwise complement operator (sometimes also referred to as bitwise not). It's a unary operator which means that it takes a single input.
Bitwise operators treat the input as a bit pattern and perform their respective operations on each individual bit then return the resulting pattern. Applying the ~ operator to the bit pattern will negate each bit (each zero becomes a one, each one becomes a zero).
In the example you gave, the bit representation of the integer 0 is all zeros. Thus, ~0 will produce a bit pattern of all ones. Even though 0 is an int, it is the bit pattern ~0 that is assigned to maxPeriod (not the int value that would be represented by said bit pattern). Since maxPeriod is an unsigned int, it is assigned the unsigned int value represented by ~0 (a pattern of all ones), which is in fact the highest value that an unsigned int can store before wrapping around back to 0.

Why unsigned int 0xFFFFFFFF is equal to int -1?

In C or C++ it is said that the maximum number a size_t (an unsigned int data type) can hold is the same as casting -1 to that data type. for example see Invalid Value for size_t
Why?
I mean, (talking about 32 bit ints) AFAIK the most significant bit holds the sign in a signed data type (that is, bit 0x80000000 to form a negative number). then, 1 is 0x00000001.. 0x7FFFFFFFF is the greatest positive number a int data type can hold.
Then, AFAIK the binary representation of -1 int should be 0x80000001 (perhaps I'm wrong). why/how this binary value is converted to anything completely different (0xFFFFFFFF) when casting ints to unsigned?? or.. how is it possible to form a binary -1 out of 0xFFFFFFFF?
I have no doubt that in C: ((unsigned int)-1) == 0xFFFFFFFF or ((int)0xFFFFFFFF) == -1 is equally true than 1 + 1 == 2, I'm just wondering why.
C and C++ can run on many different architectures, and machine types. Consequently, they can have different representations of numbers: Two's complement, and Ones' complement being the most common. In general you should not rely on a particular representation in your program.
For unsigned integer types (size_t being one of those), the C standard (and the C++ standard too, I think) specifies precise overflow rules. In short, if SIZE_MAX is the maximum value of the type size_t, then the expression
(size_t) (SIZE_MAX + 1)
is guaranteed to be 0, and therefore, you can be sure that (size_t) -1 is equal to SIZE_MAX. The same holds true for other unsigned types.
Note that the above holds true:
for all unsigned types,
even if the underlying machine doesn't represent numbers in Two's complement. In this case, the compiler has to make sure the identity holds true.
Also, the above means that you can't rely on specific representations for signed types.
Edit: In order to answer some of the comments:
Let's say we have a code snippet like:
int i = -1;
long j = i;
There is a type conversion in the assignment to j. Assuming that int and long have different sizes (most [all?] 64-bit systems), the bit-patterns at memory locations for i and j are going to be different, because they have different sizes. The compiler makes sure that the values of i and j are -1.
Similarly, when we do:
size_t s = (size_t) -1
There is a type conversion going on. The -1 is of type int. It has a bit-pattern, but that is irrelevant for this example because when the conversion to size_t takes place due to the cast, the compiler will translate the value according to the rules for the type (size_t in this case). Thus, even if int and size_t have different sizes, the standard guarantees that the value stored in s above will be the maximum value that size_t can take.
If we do:
long j = LONG_MAX;
int i = j;
If LONG_MAX is greater than INT_MAX, then the value in i is implementation-defined (C89, section 3.2.1.2).
It's called two's complement. To make a negative number, invert all the bits then add 1. So to convert 1 to -1, invert it to 0xFFFFFFFE, then add 1 to make 0xFFFFFFFF.
As to why it's done this way, Wikipedia says:
The two's-complement system has the advantage of not requiring that the addition and subtraction circuitry examine the signs of the operands to determine whether to add or subtract. This property makes the system both simpler to implement and capable of easily handling higher precision arithmetic.
Your first question, about why (unsigned)-1 gives the largest possible unsigned value is only accidentally related to two's complement. The reason -1 cast to an unsigned type gives the largest value possible for that type is because the standard says the unsigned types "follow the laws of arithmetic modulo 2n where n is the number of bits in the value representation of that particular size of integer."
Now, for 2's complement, the representation of the largest possible unsigned value and -1 happen to be the same -- but even if the hardware uses another representation (e.g. 1's complement or sign/magnitude), converting -1 to an unsigned type must still produce the largest possible value for that type.
Two's complement is very nice for doing subtraction just like addition :)
11111110 (254 or -2)
+00000001 ( 1)
---------
11111111 (255 or -1)
11111111 (255 or -1)
+00000001 ( 1)
---------
100000000 ( 0 + 256)
That is two's complement encoding.
The main bonus is that you get the same encoding whether you are using an unsigned or signed int. If you subtract 1 from 0 the integer simply wraps around. Therefore 1 less than 0 is 0xFFFFFFFF.
Because the bit pattern for an int
-1 is FFFFFFFF in hexadecimal unsigned.
11111111111111111111111111111111 binary unsigned.
But in int the first bit signifies whether it is negative.
But in unsigned int the first bit is just extra number because a unsigned int cannot be negative. So the extra bit makes an unsigned int able to store bigger numbers.
As with an unsigned int 11111111111111111111111111111111 (binary) or FFFFFFFF (hexadecimal) is the biggest number a uint can store.
Unsigned Ints are not recommended because if they go negative then it overflows and goes to the biggest number.