printf tilde operator in C/C++

I know that the ~ operator is bitwise NOT, so it inverts the bits in a binary number:
unsigned int a = ~0, b = ~7;
printf("%d\n",a);
printf("%d\n",b);
printf("%u\n",a);
printf("%u\n",b);
I guessed ~0 would be 1 and ~7 (0111) would be 8 (1000), but the output was
-1
-8
4294967295
4294967288
How did ~0 and ~7 become -1 and -8? Also, why is %u printing such a large number?

The ~ operator simply inverts all bits in a number.
On most modern compilers, int is 32 bits in size, and a signed int uses 2's complement representation, which means, among other things, that the high bit acts as the sign bit: if that bit is 1, the number is negative.
0 and 7 are int literals. Assuming the above, we get these results:
0 is bits 00000000000000000000000000000000b
= 0 when interpreted as either signed int or unsigned int
~0 is bits 11111111111111111111111111111111b
= -1 when interpreted as signed int
= 4294967295 when interpreted as unsigned int
7 is bits 00000000000000000000000000000111b
= 7 when interpreted as either signed int or unsigned int
~7 is bits 11111111111111111111111111111000b
= -8 when interpreted as signed int
= 4294967288 when interpreted as unsigned int
In your printf() statements, %d interprets its input as a signed int, and %u interprets as an unsigned int. This is why you are seeing the results you get.
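A quick way to convince yourself that %d and %u are looking at the same bit pattern is to also print the values in hex. A minimal sketch, assuming a 32-bit int as above:
#include <cstdio>

int main()
{
    unsigned int a = ~0u;   // all 32 bits set
    unsigned int b = ~7u;   // all bits set except the low three

    // Same bits, three interpretations. The cast back to int is what %d
    // effectively does; on two's complement platforms it yields -1 and -8.
    std::printf("a = 0x%08X  as unsigned: %u  as signed: %d\n", a, a, static_cast<int>(a));
    std::printf("b = 0x%08X  as unsigned: %u  as signed: %d\n", b, b, static_cast<int>(b));
}
On such a platform this prints 0xFFFFFFFF / 4294967295 / -1 for a and 0xFFFFFFF8 / 4294967288 / -8 for b.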

The ~ operator inverts all bits of the integer operand. So, for example, where int is 32 bits, 1 is 0x00000001 in hex and its complement is 0xFFFFFFFE. When interpreted as unsigned, that is 4 294 967 294, and as two's complement signed, -2.

Related

Changed c++ unsigned int from 0 to -2, but prints 4294967294 [duplicate]

I was curious to know what would happen if I assign a negative value to an unsigned variable.
The code will look somewhat like this.
unsigned int nVal = 0;
nVal = -5;
It didn't give me any compiler error. When I ran the program, nVal was assigned a strange value! Could it be that some 2's complement value gets assigned to nVal?
For the official answer, see Section 4.7 [conv.integral]:
"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]
This essentially means that if the underlying architecture stores signed integers in a representation other than two's complement (such as sign-magnitude or ones' complement), the conversion to unsigned must behave as if the representation were two's complement.
It will assign the bit pattern representing -5 (in 2's complement) to the unsigned int, which will be a large unsigned value. For 32-bit ints this will be 2^32 - 5, or 4294967291.
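A minimal sketch of that conversion, assuming a 32-bit unsigned int:
#include <cstdio>

int main()
{
    unsigned int nVal = 0;
    nVal = -5;                     // converted modulo 2^32, per [conv.integral]

    std::printf("%u\n", nVal);     // 4294967291, i.e. 2^32 - 5
    std::printf("%u\n", 0u - 5u);  // same value, computed directly with unsigned wraparound
}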
You're right, the signed integer is stored in 2's complement form, and the unsigned integer is stored in the unsigned binary representation. C (and C++) doesn't distinguish between the two, so the value you end up with is simply the unsigned binary value of the 2's complement binary representation.
It will show up as a positive integer with value equal to the maximum unsigned integer minus 4 (the exact value depends on computer architecture and compiler).
BTW
You can check this by writing a simple C++ "hello world" type program and seeing for yourself.
Yes, you're correct. The actual value assigned is something like all bits set except the third. -1 is all bits set (hex: 0xFFFFFFFF), -2 is all bits set except the first, and so on. What you would see is probably the hex value 0xFFFFFFFB, which in decimal corresponds to 4294967291.
When you assign a negative value to an unsigned variable, the 2's complement method is used: flip all 0s to 1s and all 1s to 0s, then add 1. In your case you are dealing with an int, which is 4 bytes (32 bits), so the 2's complement is taken over a 32-bit number, which sets the high bit. For example:
┌─[student#pc]─[~]
└──╼ $pcalc 0y00000000000000000000000000000101 # 5 in binary
5 0x5 0y101
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 # flip all bits
4294967290 0xfffffffa 0y11111111111111111111111111111010
┌─[student#pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 + 1 # add 1 to that flipped binary
4294967291 0xfffffffb 0y11111111111111111111111111111011
On the Windows and Ubuntu Linux systems I have checked, assigning -1 to an unsigned integer in C and C++ results in the value UINT_MAX; other negative values wrap the same way, so -5 gives UINT_MAX - 4.

Binary File Reads Negative Integers After Writing

I came from this question where I wanted to write 2 integers to a single byte that were guaranteed to be between 0-16 (4 bits each).
Now if I close the file, and run a different program that reads....
for (int i = 0; i < 2; ++i)
{
    char byteToRead;
    file.seekg(i, std::ios::beg);
    file.read(&byteToRead, sizeof(char));
    bool correct = file.bad();
    unsigned int num1 = (byteToRead >> 4);
    unsigned int num2 = (byteToRead & 0x0F);
}
The issue is, sometimes this works but other times I'm having the first number come out negative and the second number is something like 10 or 9 all the time and they were most certainly not the numbers I wrote!
So here the first two numbers work, but the next number does not. For example, the output of the read above would be:
At byte 0, num1 = 5 and num2 = 6
At byte 1, num1 = 4294967289 and num2 = 12
At byte 1, num1 should be 9. It seems the 12 writes fine, but the 9 << 4 isn't working. The byteToWrite on my end, as shown in the debugger, is -100 ('œ').
I checked out this question, which I think has a similar problem, but I feel like my endianness is right here.
The right-shift operator preserves the value of the left-most bit. If the left-most bit is 0 before the shift, it will still be 0 after the shift; if it is 1, it will still be 1 after the shift. This allows the value's sign to be preserved.
In your case, you combine 9 (0b1001) with 12 (0b1100), so you write 0b10011100 (0x9C). Bit #7 is 1.
When byteToRead is right-shifted, you get 0b11111001 (0xF9), but it is implicitly converted to an int. The conversion from char to int also preserves the value's sign, so it produces 0xFFFFFFF9. That int is then implicitly converted to an unsigned int. So num1 contains 0xFFFFFFF9, which is 4294967289.
There are two solutions (a sketch of both is shown below):
cast byteToRead to an unsigned char when doing the right shift;
apply a mask to the shift's result to keep only the 4 bits you want.
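A minimal sketch of both fixes, using a packed byte like the one from the question (9 in the high nibble, 12 in the low nibble); the variable names are only illustrative:
#include <cstdio>

int main()
{
    char byteToRead = static_cast<char>(0x9C);   // 9 in the high nibble, 12 in the low nibble

    // Fix 1: do the shift on an unsigned char so no sign bit is dragged in.
    unsigned int num1 = static_cast<unsigned char>(byteToRead) >> 4;

    // Fix 2: shift as before, but mask off everything except the low 4 bits.
    unsigned int num1b = (byteToRead >> 4) & 0x0F;

    unsigned int num2 = byteToRead & 0x0F;

    std::printf("%u %u %u\n", num1, num1b, num2);   // expected: 9 9 12
}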
The problem originates with byteToRead >> 4 . In C, any arithmetic operations are performed in at least int precision. So the first thing that happens is that byteToRead is promoted to int.
These promotions are value-preserving. Your system has plain char as signed, i.e. having range -128 through to 127. Your char might have been initially -112 (bit pattern 10010000), and then after promotion to int it retains its value of -112 (bit pattern 11111...1110010000).
The right-shift of a negative value is implementation-defined but a common implementation is to do an "arithmetic shift", i.e. perform division by two; so you end up with the result of byteToRead >> 4 being -7 (bit pattern 11111....111001).
Converting -7 to unsigned int results in UINT_MAX - 6, which is 4294967289, because unsigned arithmetic is defined as wrapping around mod UINT_MAX + 1.
To fix this you need to convert to unsigned before performing the arithmetic. You could cast (or alias) byteToRead to unsigned char, e.g.:
unsigned char byteToRead;
file.read( (char *)&byteToRead, 1 );

Negation of -2147483648 not possible in C/C++?

#include <iostream>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    int num = -2147483648;
    int positivenum = -num;
    int absval = abs(num);
    std::cout << positivenum << "\n";
    std::cout << absval << "\n";
    return 0;
}
Hi, I am quite curious why the output of the above code is
-2147483648
-2147483648
Now I know that -2147483648 is the smallest representable number among signed ints (assuming an int is 32 bits). I would have assumed that one would get garbage answers only after we went below this number. But in this case, +2147483648 IS covered by the 32-bit system of integers. So why the negative answer in both cases?
But in this case, +2147483648 IS covered by the 32 bit system of integers.
Not quite correct. It only goes up to +2147483647. So your assumption isn't right.
Negating -2147483648 will indeed produce 2147483648, but it will overflow back to -2147483648.
Furthermore, signed integer overflow is technically undefined behavior.
The value -(-2147483648) is not possible in 32-bit signed int. The range of signed 32-bit int is -2147483648 to 2147483647.
Ahhh, but it's not... remember 0; the largest signed value is actually 2147483647.
Because the 2's complement representation of signed integers isn't symmetric and the minimum 32-bit signed integer is -2147483648 while the maximum is +2147483647. That -2147483648 is its own counterpart, just as 0 is (in the 2's complement representation there is only one 0; there are no distinct +0 and -0).
Here's some explanation.
A negative number -X, when represented as N-bit 2's complement, is effectively represented as an unsigned number equal to 2^N - X. So, for 32-bit integers:
if X = 1, then -X = 2^32 - 1 = 4294967295
if X = 2147483647, then -X = 2^32 - 2147483647 = 2147483649
if X = 2147483648, then -X = 2^32 - 2147483648 = 2147483648
if X = -2147483648, then -X = 2^32 + 2147483648 = 2147483648 (because we only keep the low 32 bits)
So, -2147483648 = +2147483648. Welcome to the world of 2's complement values.
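The same wraparound can be observed without invoking signed-overflow undefined behaviour by doing the negation in unsigned arithmetic. A small sketch, assuming a 32-bit int:
#include <cstdio>
#include <cstdint>

int main()
{
    std::uint32_t x = 2147483648u;   // the bit pattern of -2147483648
    std::uint32_t negated = 0u - x;  // 2^32 - 2147483648 = 2147483648 (mod 2^32)

    std::printf("%u\n", negated);                    // prints 2147483648
    std::printf("%d\n", static_cast<int>(negated));  // prints -2147483648 on two's complement platforms
}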
The previous answers have all pointed out that the result is UB (Undefined Behaviour) because 2147483648 is not a valid int32_t value. And we all know, UB means anything can happen, including demons flying out of your nose. The question is, why does cout print out a negative value, which seems like the worst value it could have chosen randomly?
I'll try to justify it on a two's complement system. Negation on a CPU is actually somewhat of a tricky operation; you can't do it in one step. One way of implementing negation, i.e. int32_t positivenum = -num, is to do a bit inversion followed by adding 1, i.e. int32_t positivenum = ~num + 1, where ~ is the bitwise negation operator and the +1 fixes the off-by-one error. For example, the negation of 0x00000000 is 0xFFFFFFFF + 1, which is 0x00000000 (after rollover, which is what most CPUs do). You can verify that this works for most integers... except for -2147483648, which is stored as 0x80000000 in two's complement. When you invert and add one, you get
- (min) = -(0x80000000)
= ~(0x80000000) + 1
= 0x7FFFFFFF + 1
= 0x80000000
= min
So magically, the unary operator - operating on min gives you back min!
One thing that is not obvious is that two's complement CPU arithmetic has no concept of positive or negative numbers! It treats all numbers as unsigned under the hood. There is just one adder circuit and one multiplier circuit, and each works for positive and negative numbers alike.
Example: -1 * -1
= (cast both to uint32_t)
= 0xFFFFFFFF * 0xFFFFFFFF
= 0xFFFFFFFE00000001 // if you compute the result to 64-bit precision
= 0x00000001 // after you truncate to 32-bit precision
= 1
The only time you care about signed vs unsigned is for comparisons, like < or >.
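To check that multiplication directly, the same computation can be carried out on the raw 32-bit patterns. A small sketch using fixed-width unsigned types:
#include <cstdio>
#include <cstdint>

int main()
{
    std::uint32_t a = 0xFFFFFFFFu;                               // bit pattern of -1
    std::uint64_t wide = std::uint64_t{a} * a;                   // full 64-bit product
    std::uint32_t truncated = static_cast<std::uint32_t>(wide);  // keep only the low 32 bits

    std::printf("%016llX\n", static_cast<unsigned long long>(wide));  // FFFFFFFE00000001
    std::printf("%u\n", truncated);                                    // 1, i.e. (-1) * (-1)
}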

Why, if a char is initialized to 1 and then left shifted 7 times and the value printed using %d, does it show -128?

I am aware of the 2s complement representation of signed values. But how does binary '10000000' become -128 in decimal (using %d)?
For +64 the binary rep is '01000000'; for -64 it is '11000000', which is the 2's complement of '01000000'.
Can someone please explain?
Program:
#include <stdio.h>

int main()
{
    char ch = 1;
    int count = 0;
    while (count != 8)
    {
        printf("Before shift val of ch = %d,count=%d\n", ch, count);
        ch = ch << 1;
        printf("After shift val of ch = %d,count=%d\n", ch, count);
        //printBinPattern(ch);
        printf("*************************************\n");
        count++;
    }
    return 0;
}
Output:
Before shift val of ch = 1, count=0
After shift val of ch = 2, count=0
*************************************
...
... /* Output not shown */
Before shift val of ch = 32, count=5
After shift val of ch = 64, count=5
*************************************
Before shift val of ch = 64, count=6
After shift val of ch = -128, count=6
*************************************
Before shift val of ch = -128, count=7
After shift val of ch = 0, count=7
*************************************
Before shift val of ch = 0, count=8
After shift val of ch = 0, count=8
*************************************
Because on your compiler, char means signed char.
Char is just a tiny integer, generally in the range of 0...255 (for unsigned char) or -128...127 (for signed char).
The way to convert a number to its 2's complement negative is to "invert the bits and add 1".
128 = "1000 0000". Inverting the bits gives "0111 1111". Adding 1 yields "1000 0000" again, so -128 has the same bit pattern as unsigned 128.
I am aware of the 2s complement representation of signed values.
Well, obviously you aren't. A 1 followed by all 0s is always the smallest negative number.
The answer is implementation-defined, as the signedness of a plain char is implementation-defined.
$3.9.1/1
Objects declared as characters (char) shall be large enough to store any member of the implementation’s basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types.
$5.8/1 -
"The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand."
So when the value of char becomes negative, left shift from thereon has undefined behavior.
That's how it works.
-1 = 1111 1111
-2 = 1111 1110
-3 = 1111 1101
-4 = 1111 1100
...
-126 = 1000 0010
-127 = 1000 0001
-128 = 1000 0000
Two's complement is exactly like unsigned binary representation with one slight change:
The MSB (bit n-1) is redefined to have a value of -2^(n-1) instead of 2^(n-1).
That's why the addition logic is unchanged: because all the other bits still have the same place value.
This also explains the underflow/overflow detection method, which involves checking the carry from bit (n-2) into bit (n-1).
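A small sketch that applies that place-value rule to the 1000 0000 pattern from the question (bit 7 is worth -2^7; bits 6 through 0 keep their usual positive weights):
#include <cstdio>

int main()
{
    unsigned char bits = 0x80;   // 1000 0000, the pattern produced by 1 << 7

    // Bit 7 contributes -2^7; bits 6..0 contribute their ordinary positive weights.
    int value = -128 * ((bits >> 7) & 1);
    for (int i = 0; i < 7; ++i)
        value += ((bits >> i) & 1) << i;

    std::printf("%d\n", value);   // prints -128
}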
There is a pretty simple process for converting a negative two's complement integer value to its positive equivalent.
0000 0001 ; The x = 1
1000 0000 ; x <<= 7
The two's complement process has two steps... first, if the high bit is 1, invert all bits
0111 1111 ; (-) 127
then add 1
1000 0000 ; (-) 128
Supplying a char to a %d format specifier that expects an int is probably unwise.
Whether an unadorned char is signed or unsigned is implementation-defined. In this case not only is it apparently signed, but also the char argument has been pushed onto the stack as an int-sized object and sign-extended, so that the higher-order bits are all set to the same value as the high-order bit of the original char.
I am not sure whether this is defined behaviour or not without looking it up, but personally I'd have cast the char to an int when formatting it with %d. Not least because some compilers and static analysis tools will trap that error and issue a warning. GCC will do so when -Wformat is used for example.
That is the explanation; if you want a solution (i.e. one that prints 128 rather than -128), then you need to cast to unsigned and mask off the sign-extension bits, as well as use a correctly matching format specifier:
printf("%u", (unsigned)ch & 0xff );

What range of values can integer types store in C++?

Can unsigned long int hold a ten-digit number (1,000,000,000 - 9,999,999,999) on a 32-bit computer?
Additionally, what are the ranges of unsigned long int , long int, unsigned int, short int, short unsigned int, and int?
The minimum ranges you can rely on are:
short int and int: -32,767 to 32,767
unsigned short int and unsigned int: 0 to 65,535
long int: -2,147,483,647 to 2,147,483,647
unsigned long int: 0 to 4,294,967,295
This means that no, neither long int nor unsigned long int can be relied upon to store any 10-digit number. However, a larger type, long long int, was introduced to C in C99 and to C++ in C++11 (this type is also often supported as an extension by compilers built for older standards that did not include it). The minimum range for this type, if your compiler supports it, is:
long long int: -9,223,372,036,854,775,807 to 9,223,372,036,854,775,807
unsigned long long int: 0 to 18,446,744,073,709,551,615
So that type will be big enough (again, if you have it available).
A note for those who believe I've made a mistake with these lower bounds: the C requirements for the ranges are written to allow for ones' complement or sign-magnitude integer representations, where the lowest representable value and the highest representable value differ only in sign. It is also allowed to have a two's complement representation where the value with sign bit 1 and all value bits 0 is a trap representation rather than a legal value. In other words, int is not required to be able to represent the value -32,768.
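A quick way to check on your own platform, using the numeric_limits facility described in the answers below; a small sketch:
#include <iostream>
#include <limits>

int main()
{
    // 9,999,999,999 is the largest ten-digit number from the question.
    const unsigned long long target = 9999999999ULL;

    std::cout << "unsigned long max:          "
              << std::numeric_limits<unsigned long>::max() << '\n';
    std::cout << "fits in unsigned long?      "    // prints 1 (yes) or 0 (no)
              << (target <= std::numeric_limits<unsigned long>::max()) << '\n';
    std::cout << "fits in unsigned long long? "
              << (target <= std::numeric_limits<unsigned long long>::max()) << '\n';
}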
The size of the numerical types is not defined in the C++ standard, although the minimum sizes are. The way to tell what size they are on your platform is to use std::numeric_limits.
For example, the maximum value for a int can be found by:
std::numeric_limits<int>::max();
Computers don't work in base 10, which means that the maximum value will be of the form 2^n - 1, because of how numbers are represented in memory. Take for example eight bits (1 byte):
0100 1000
The right-most bit, when set to 1, represents 2^0, the next bit 2^1, then 2^2, and so on, until we get to the left-most bit, which, if the number is unsigned, represents 2^7.
So the number represents 2^6 + 2^3 = 64 + 8 = 72, because the 4th bit and the 7th bit from the right are set.
If we set all values to 1:
11111111
The number is now (assuming unsigned)
128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255 = 2^8 - 1
And as we can see, that is the largest possible value that can be represented with 8 bits.
On my machine, an int and a long are the same, each able to hold between -2^31 and 2^31 - 1. In my experience that is the most common size on a modern 32-bit desktop machine.
To find out the limits on your system:
#include <iostream>
#include <limits>
int main(int, char **) {
    std::cout
        << static_cast< int >(std::numeric_limits< char >::max()) << "\n"
        << static_cast< int >(std::numeric_limits< unsigned char >::max()) << "\n"
        << std::numeric_limits< short >::max() << "\n"
        << std::numeric_limits< unsigned short >::max() << "\n"
        << std::numeric_limits< int >::max() << "\n"
        << std::numeric_limits< unsigned int >::max() << "\n"
        << std::numeric_limits< long >::max() << "\n"
        << std::numeric_limits< unsigned long >::max() << "\n"
        << std::numeric_limits< long long >::max() << "\n"
        << std::numeric_limits< unsigned long long >::max() << "\n";
}
Note that long long is only standard as of C99 and C++11.
Other folks here will post links to data_sizes and precisions, etc.
I'm going to tell you how to figure it out yourself.
Write a small application that will do the following.
unsigned int ui;
std::cout << sizeof(ui);
This will (depending on compiler and architecture) print 2, 4 or 8, saying 2 bytes long, 4 bytes long, etc.
Let’s assume it's 4.
You now want the maximum value 4 bytes can store. The maximum value for one byte is (in hexadecimal) 0xFF. The maximum value of four bytes is 0x followed by 8 f's (one pair of f's for each byte; the 0x tells the compiler that the following string is a hex number). Now change your program to assign that value and print the result:
unsigned int ui = 0xFFFFFFFF;
std::cout << ui;
That's the maximum value an unsigned int can hold, shown in base-10 representation.
Now do that for longs, shorts and any other INTEGER type you're curious about.
NB: This approach will not work for floating point numbers (i.e. double or float).
In C++, int and other integer data is now stored using the two's complement method.
That means the range is:
-2147483648 to 2147483647
or -2^31 to 2^31-1.
One bit pattern on the non-negative side is taken by 0, so the largest positive value is one less than 2^31.
You can use the numeric_limits<data_type>::min() and numeric_limits<data_type>::max() functions present in the <limits> header file to find the limits of each data type.
#include <iostream>
#include <limits>
using namespace std;
int main()
{
    cout<<"Limits of Data types:\n";
    cout<<"char\t\t\t: "<<static_cast<int>(numeric_limits<char>::min())<<" to "<<static_cast<int>(numeric_limits<char>::max())<<endl;
    cout<<"unsigned char\t\t: "<<static_cast<int>(numeric_limits<unsigned char>::min())<<" to "<<static_cast<int>(numeric_limits<unsigned char>::max())<<endl;
    cout<<"short\t\t\t: "<<numeric_limits<short>::min()<<" to "<<numeric_limits<short>::max()<<endl;
    cout<<"unsigned short\t\t: "<<numeric_limits<unsigned short>::min()<<" to "<<numeric_limits<unsigned short>::max()<<endl;
    cout<<"int\t\t\t: "<<numeric_limits<int>::min()<<" to "<<numeric_limits<int>::max()<<endl;
    cout<<"unsigned int\t\t: "<<numeric_limits<unsigned int>::min()<<" to "<<numeric_limits<unsigned int>::max()<<endl;
    cout<<"long\t\t\t: "<<numeric_limits<long>::min()<<" to "<<numeric_limits<long>::max()<<endl;
    cout<<"unsigned long\t\t: "<<numeric_limits<unsigned long>::min()<<" to "<<numeric_limits<unsigned long>::max()<<endl;
    cout<<"long long\t\t: "<<numeric_limits<long long>::min()<<" to "<<numeric_limits<long long>::max()<<endl;
    cout<<"unsigned long long\t: "<<numeric_limits<unsigned long long>::min()<<" to "<<numeric_limits<unsigned long long>::max()<<endl;
    cout<<"float\t\t\t: "<<numeric_limits<float>::min()<<" to "<<numeric_limits<float>::max()<<endl;
    cout<<"double\t\t\t: "<<numeric_limits<double>::min()<<" to "<<numeric_limits<double>::max()<<endl;
    cout<<"long double\t\t: "<<numeric_limits<long double>::min()<<" to "<<numeric_limits<long double>::max()<<endl;
}
The output will be:
Limits of Data types:
char : -128 to 127
unsigned char : 0 to 255
short : -32768 to 32767
unsigned short : 0 to 65535
int : -2147483648 to 2147483647
unsigned int : 0 to 4294967295
long : -2147483648 to 2147483647
unsigned long : 0 to 4294967295
long long : -9223372036854775808 to 9223372036854775807
unsigned long long : 0 to 18446744073709551615
float : 1.17549e-038 to 3.40282e+038
double : 2.22507e-308 to 1.79769e+308
long double : 3.3621e-4932 to 1.18973e+4932
For an unsigned data type there isn't any sign bit and all bits are for data, whereas for a signed data type the MSB is a sign bit and the remaining bits are for data.
To find the range, do the following things:
Step 1: Find out number of bytes for the given data type.
Step 2: Apply the following calculations.
Let n = number of bits in data type
For signed data type ::
Lower Range = -(2^(n-1))
Upper Range = (2^(n-1)) - 1
For unsigned data type ::
Lower Range = 0
Upper Range = (2^(n)) - 1
For example,
For unsigned int size = 4 bytes (32 bits) → Range [0, (2^(32)) - 1]
For signed int size = 4 bytes (32 bits) → Range [-(2^(32-1)), (2^(32-1)) - 1]
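A small sketch applying those formulas, assuming a 32-bit int and two's complement:
#include <cstdio>

int main()
{
    const int n = 8 * static_cast<int>(sizeof(int));   // number of bits in an int (32 here)

    const long long lower = -(1LL << (n - 1));          // -(2^(n-1))
    const long long upper =  (1LL << (n - 1)) - 1;      //  (2^(n-1)) - 1
    const unsigned long long umax = (1ULL << n) - 1;    //  (2^n) - 1

    std::printf("signed int   : %lld to %lld\n", lower, upper);
    std::printf("unsigned int : 0 to %llu\n", umax);
}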
No, only part of the ten-digit range can be stored in an unsigned long int, whose valid range is 0 to 4,294,967,295.
You can refer to this:
http://msdn.microsoft.com/en-us/library/s3f49ktz(VS.80).aspx
Can unsigned long int hold a ten-digit number (1,000,000,000 - 9,999,999,999) on a 32-bit computer?
No
You should look at the specialisations of the numeric_limits<> template for a given type. It’s in the <limits> header.