C++ copying integer to char[] or unsigned char[] error

So I'm using the following code to put an integer into a char[] or an unsigned char[]
(unsigned???) char test[12];
test[0] = (i >> 24) & 0xFF;
test[1] = (i >> 16) & 0xFF;
test[2] = (i >> 8) & 0xFF;
test[3] = (i >> 0) & 0xFF;
int j = test[3] + (test[2] << 8) + (test[1] << 16) + (test[0] << 24);
printf("Its value is...... %d", j);
When I use type unsigned char and value 1000000000 it prints correctly.
When I use type char (same value) I get 983157248 printed?
So, the question really is can anyone explain what the hell is going on??
Upon examining the binary for the two different numbers I still can't work out what's going on. I thought signed meant the MSB is set to 1 to indicate a negative value (but a negative char? wth?)
I'm explicitly telling the buffer what to insert into it, and how to interpret the contents, so don't see why this could be happening.
I have included binary/hex below for clarity in what I examined.
0011 1010 1001 1001 1100 1010 0000 0000 // Binary for 983157248
0011 1011 1001 1010 1100 1010 0000 0000 // Binary for 1000000000
3A99CA00 // Hex for 983157248
3B9ACA00 // Hex for 1000000000

In addition to the answer by Kerrek SB please consider the following:
Computers (almost always) use something called two's-complement notation for negative numbers, with the high bit functioning as a 'negative' indicator. Ask yourself what happens when you perform shifts on a signed type, considering that the computer will handle the sign bit specially.
You may want to read Why does left shift operation invoke Undefined Behaviour when the left side operand has negative value? right here on StackOverflow for a hint.
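To make that concrete, here is a minimal sketch of the sign extension at work, assuming a platform where plain char is signed and two's complement is used; 0xCA is byte 2 of 1000000000 (0x3B9ACA00) from the question:
#include <cstdio>

int main() {
    unsigned char u = 0xCA;           // always 202
    char s = static_cast<char>(0xCA); // typically -54 where plain char is signed

    // Both are promoted to int before shifting. The unsigned byte gives the
    // intended value; the signed byte drags its sign along (and left-shifting
    // a negative value is undefined behaviour in C++ before C++20).
    printf("%d\n", u << 8);           // 51712
    printf("%d\n", s << 8);           // typically -13824 on such platforms
    return 0;
}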

When you say i & 0xFF etc., you're creating values in the range [0, 256). But (your) char has a range of [-128, +128), and so you cannot actually store those values sensibly (i.e. the behaviour is implementation-defined and tedious to reason about).
Use unsigned char for unsigned values. The clue is in the name.
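A minimal sketch of the fixed round trip, assuming a 32-bit int and the same layout as the question's code:
#include <cstdio>

int main() {
    int i = 1000000000;
    unsigned char test[4]; // unsigned, as the answer advises

    test[0] = (i >> 24) & 0xFF;
    test[1] = (i >> 16) & 0xFF;
    test[2] = (i >> 8) & 0xFF;
    test[3] = (i >> 0) & 0xFF;

    // Each element now holds a value in [0, 255], so the promotions to int
    // below cannot sign-extend and the reassembly is exact.
    int j = test[3] + (test[2] << 8) + (test[1] << 16) + (test[0] << 24);
    printf("Its value is...... %d\n", j); // prints 1000000000
    return 0;
}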

This all has to do with internal representation and the way each type interprets that data. In the internal representation of a signed character, the top bit of the byte holds the sign and the remaining bits the value. When the top bit is 1, the number is negative, and the bit pattern is the two's complement of its positive value. For example:
unsigned char c; // whose internal representation we will set to 1100 1011
c = (1 << 7) + (1 << 6) + (1 << 3) + (1 << 1) + (1 << 0); // 128 + 64 + 8 + 2 + 1
cout << (int)c; // will give 203
// inversely:
char d = c; // not unsigned
cout << (int)d; // will typically print -53 (the conversion is implementation-defined)
// since the top bit is 1, d is negative, and the pattern is read as
// the two's complement of its positive value:
// 1100 1011 -> invert the bits: 0011 0100, then add 1: 0011 0101 = 53, so d = -53
// furthermore:
char e; // whose internal representation we will set to 0011 0101
e = (1 << 5) + (1 << 4) + (1 << 2) + (1 << 0); // 32 + 16 + 4 + 1
cout << (int)e; // will print 53

Related

I need to understand the logic behind this cpp code

int x = 25;
unsigned int g = x & 0x80000000;
How did this code read the most significant bit of x? Did the mask 0x80000000, binary 1000 0000 0000 0000 0000 0000 0000 0000, accomplish that task, or was it something else?
For a char, the most significant bit is typically the sign bit, as per two's complement, so this should be:
char x = 25;
unsigned int msb = x & (1 << 6);
Where (1 << 6) means bit 6, counting from 0 (the 7th counting from 1). It's the second-to-top bit of an 8-bit char and equivalent to 0x40.
Since 25 is 0b00011001 you won't get a bit set. You'll need a value >= 64.
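If the goal is the actual top bit of the char, a minimal sketch of one way to read it without signedness surprises (the variable names are just illustrative):
#include <cstdio>

int main() {
    char x = 25;

    // Reinterpret the byte as unsigned first so the bit pattern is read
    // as-is, then test bit 7 (the sign bit where plain char is signed).
    unsigned int msb = (static_cast<unsigned char>(x) >> 7) & 1u;
    printf("msb of %d is %u\n", x, msb); // 25 = 0b00011001, so prints 0
    return 0;
}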

C++ Implicit conversion

#include <iostream>
#include <cstdio>
int main() {
    char x = 'a';
    char y = 'b';
    char z = x + y;
    printf("%d\n", z);
    return 0;
}
Why is the output of this code -61?
a)
Every char is, to the computer, a number. (A computer can only handle numbers; characters are "only" pictures to a computer.)
So: look up an ASCII table, e.g. http://www.asciitable.com/
-> 'a' = 97, 'b' = 98
a + b = 97 + 98 = 195
b)
char is defined as 8 bits (= 256 numbers) with a sign: it can hold the numbers between -128 and 127. So 195 does not fit; the value wraps around, and such a number appears.
Edit:
To show what happens internally, the calculation in binary:
 0110 0001 ('a' = 97)
+0110 0010 ('b' = 98)
----------
 1100 0011
Because this is a signed type, the result is read as two's complement.
The first bit is set -> negative. The easiest way (for me) to calculate it:
the first bit has the value -128, the second 64,
and the lowest bits 2 and 1. So: -128 + 64 + 2 + 1 = -61.
Hope this helps more than it confuses...
Edit 2:
As a result of the discussion: this is what happens on your CPU, because of that CPU's technical characteristics. But you cannot assume it happens on every CPU! C++ compiles on/for every CPU, but overflowing a signed type is not defined in C/C++, so on other CPUs there can be other results.
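A small sketch of the whole story in code, assuming the common case of an 8-bit signed char (the conversion of an out-of-range value is implementation-defined):
#include <cstdio>

int main() {
    int sum = 'a' + 'b'; // both chars promote to int: 97 + 98 = 195
    char z = sum;        // 195 does not fit; typically wraps to 195 - 256 = -61
    printf("%d %d\n", sum, z); // prints "195 -61" on such platforms
    return 0;
}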

C++ char arithmetic overflow

#include <stdio.h>
int main()
{
    char a = 30;
    char b = 40;
    char c = 10;
    printf("%d ", char(a * b));
    char d = (a * b) / c;
    printf("%d ", d);
    return 0;
}
The above code yields a normal int value if 127 >= x >= -128,
and an overflowed value otherwise. I can't understand how the overflow value is calculated, -80 in this case.
Thanks
The trick here is how numbers are represented. Look into 2's complement. 30 * 40 is 1200, or 10010110000 in base 2. But our char is only 8 bits, so we chop off the leading 100 (and all the implied 0s before that). This leaves us with 10110000.
Note the leading 1. In 2's complement, which is how your computer probably stores the values, this indicates a negative number. 11111111 is -1, 11111110 is -2 and so on. If we go down to 10110000 we get to -80.
That is, if we read 10110000 as 2's complement we're left with -80.
You can do 2's complement by hand. Take the value and invert all the bits: 10110000 turns into 01001111, which in binary is 79. Negate it and subtract one more (because the negative range starts one below zero) and we're at -80.
char has only 1 byte. In this case 1200 is 0100 1011 0000 (binary).
One byte can hold only 8 bits, in your case: 1011 0000 (the first 4 bits are discarded). Now you have -80 (the first bit shows whether the value is negative (1) or positive (0)).
Try it with your calculator in programmer mode: type 1200 in decimal and switch from Qword to Byte, and you can see what happens to your number.
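A sketch tying both answers together; note that a * b itself is computed in int (1200), and only storing or casting back to char chops the bits, which is why d below is a clean 120:
#include <cstdio>

int main() {
    char a = 30, b = 40, c = 10;
    printf("%d ", (char)(a * b)); // 1200 = 100 1011 0000, low byte 1011 0000, typically -80
    char d = (a * b) / c;         // 1200 / 10 = 120, which fits in a char
    printf("%d\n", d);            // prints 120
    return 0;
}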

1's complement using ~ in C/C++

I am using Visual Studio 2013.
Recently I tried the ~ operator for 1's complement:
int a = 10;
cout << ~a << endl;
Output is -11
But for
unsigned int a = 10;
cout << ~a << endl;
the output is 4294967296
I don't get why the output is -11 in the case of signed int.
Please help me with this confusion.
When you put number 10 into 32-bit signed or unsigned integer, you get
0000 0000 0000 0000 0000 0000 0000 1010
When you negate it, you get
1111 1111 1111 1111 1111 1111 1111 0101
These 32 bits mean 4294967285 as an unsigned integer, or -11 as a signed integer (your computer represents negative integers in two's complement). They could also mean a 32-bit floating-point number or four 8-bit characters.
Bits don't have any "absolute" meaning. They can represent anything, depending on how you "look" at them (which type they have).
The ~ operator performs a one's complement on its argument, and it does not matter whether the argument is a signed or unsigned integer. It merely flips all the bits, so
0000 0000 0000 1010 (bin) / 10 (dec)
becomes
1111 1111 1111 0101 (bin)
(where, presumably, these numbers are 32 bits wide -- I omitted 16 more 0's and 1's.)
How will cout display the result? It looks at the original type. For a signed integer, the most significant bit is its sign. Thus, the result here is going to be negative (the most significant bit of 10 is 0, so flipping it gives 1). To display a negative number as sign and magnitude, take the two's complement: invert all bits, then add 1. For example, -1, binary 111..111, displays as (inverting) 000..000, then +1: 000..001, magnitude 1. Result: -1.
Applying this to the one's complement of 10 you get 111..110101 -> inverting to 000...001010, then add 1. Result: -11.
For an unsigned number, cout doesn't do this (naturally), and so you get a large number: the largest possible unsigned value minus the original number.
In memory, 4294967285 is stored in both cases (the 4294967296 in the question is presumably a typo; that value would need 33 bits). The meaning of this number depends on which signedness you use:
if it's signed, the number is -11;
if it's unsigned, it's 4294967285:
different interpretations of the same bits.
You can reinterpret it as unsigned by casting it, same result:
int a = 10;
cout << (unsigned int) ~a << endl;
Try this; it complements only the significant bits of the value, so e.g. 10 = 1010 becomes 0101 = 5:
unsigned int getOnesComplement(unsigned int number) {
    unsigned int onesComplement = 1;
    if (number < 1)
        return onesComplement; // the complement of 0 is taken to be 1 here

    // Index of the top bit of the word (31 for a 32-bit unsigned int).
    unsigned size = sizeof(unsigned int) * 8 - 1;
    unsigned int oneShiftedToMSB = 1u << size; // 1u: shifting a signed 1 into the sign bit is undefined

    // Shift the value left until its highest set bit reaches the MSB,
    // complement it there, then shift back so the leading zeros stay zero.
    for (unsigned bitsToBeShifted = 0; bitsToBeShifted <= size; bitsToBeShifted++) {
        unsigned int shiftedNumber = number << bitsToBeShifted;
        if (shiftedNumber & oneShiftedToMSB) {
            onesComplement = ~shiftedNumber >> bitsToBeShifted;
            break;
        }
    }
    return onesComplement;
}
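For example, a minimal usage sketch (the main wrapper is just for illustration, assuming the function above is in scope):
#include <cstdio>

int main() {
    // 10 = 1010 in binary; complementing just its significant bits gives 0101 = 5.
    printf("%u\n", getOnesComplement(10)); // prints 5
    return 0;
}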

Why, if a char is initialized to 1 and then left shifted 7 times and the value printed using %d, does it show -128?

I am aware of the 2s complement representation of signed values. But how does binary '10000000' become -128 in decimal(using %d).
for +64 binary rep = '01000000' for -64 binary rep = '11000000' which is 2's complement of '01000000'
can some one please explain?
Program:
#include <stdio.h>
int main()
{
    char ch = 1;
    int count = 0;
    while (count != 8)
    {
        printf("Before shift val of ch = %d,count=%d\n", ch, count);
        ch = ch << 1;
        printf("After shift val of ch = %d,count=%d\n", ch, count);
        //printBinPattern(ch);
        printf("*************************************\n");
        count++;
    }
    return 0;
}
Output:
Before shift val of ch = 1, count=0
After shift val of ch = 2, count=0
*************************************
...
... /* Output not shown */
Before shift val of ch = 32, count=5
After shift val of ch = 64, count=5
*************************************
Before shift val of ch = 64, count=6
After shift val of ch = -128, count=6
*************************************
Before shift val of **ch = -128**, count=7
After shift val of ch = 0, count=7
*************************************
Before shift val of ch = 0, count=8
After shift val of ch = 0, count=8
*************************************
Because on your compiler, char means signed char.
Char is just a tiny integer, generally in the range of 0...255 (for unsigned char) or -128...127 (for signed char).
The way to negate a number in 2's complement is to "invert the bits and add 1":
128 = "1000 0000". Inverting the bits gives "0111 1111". Adding 1 yields "1000 0000" again, the same 8-bit pattern, which is why -128 fits in a char but +128 does not.
I am aware of the 2s complement representation of signed values.
Well, obviously you aren't. In two's complement, a 1 followed by all 0s is always the smallest (most negative) number.
The answer is implementation-defined, as the type of a 'default char' is implementation-defined.
$3.9.1/1:
"Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types."
$5.8/1:
"The operands shall be of integral or enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand."
So once the value of the char becomes negative, any left shift from then on has undefined behavior.
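A sketch of the same loop rewritten with unsigned char, which keeps the value non-negative and the shift well defined (assuming 8-bit chars):
#include <cstdio>

int main() {
    unsigned char ch = 1;
    for (int count = 0; count != 8; count++) {
        ch = ch << 1; // promoted to int, shifted, then truncated back to 8 bits
        printf("After shift val of ch = %d, count=%d\n", ch, count);
    }
    return 0;         // the values run 2, 4, ..., 128, then 0: no negatives appear
}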
That's how it works.
-1 = 1111 1111
-2 = 1111 1110
-3 = 1111 1101
-4 = 1111 1100
...
-126 = 1000 0010
-127 = 1000 0001
-128 = 1000 0000
Two's complement is exactly like unsigned binary representation with one slight change:
The MSB (bit n-1) is redefined to have a value of -2^(n-1) instead of 2^(n-1).
That's why the addition logic is unchanged: all the other bits still have the same place value.
This also explains the underflow/overflow detection method, which involves checking the carry from bit (n-2) into bit (n-1).
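A worked check of that redefined weight on the 8-bit pattern 1100 0011 used earlier on this page (-128 + 64 + 2 + 1 = -61):
#include <cstdio>

int main() {
    unsigned bits = 0xC3;                     // 1100 0011
    int value = ((bits >> 7) & 1) ? -128 : 0; // bit 7 carries weight -2^7
    for (int i = 0; i < 7; i++) {
        int bit = (bits >> i) & 1;            // bits 0..6 keep their usual weights +2^i
        value += bit << i;
    }
    printf("%d\n", value);                    // prints -61
    return 0;
}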
There is a pretty simple process for converting a negative two's complement integer value to its positive equivalent.
0000 0001 ; x = 1
1000 0000 ; x <<= 7
The two's complement process is two steps... first, if the high bit is 1, invert all the bits:
0111 1111 ; magnitude so far: 127
then add 1:
1000 0000 ; magnitude 128, so the value is -128
Supplying a char to a %d format specifier that expects an int is probably unwise.
Whether an unadorned char is signed or unsigned is implementation-defined. In this case not only is it apparently signed, but the char argument has also been pushed onto the stack as an int-sized object and sign-extended, so that the higher-order bits are all set to the same value as the high-order bit of the original char.
I am not sure whether this is defined behaviour or not without looking it up, but personally I'd have cast the char to an int when formatting it with %d. Not least because some compilers and static analysis tools will trap that error and issue a warning. GCC will do so when -Wformat is used for example.
That is the explanation, if you want a solution (i.e. one that prints 128 rather than -128) then you need to cast to unsigned and mask-off the sign extension bits as well as using a correctly matching format specifier:
printf("%u", (unsigned)ch & 0xff );