Bit manipulation- for negative numbers - c++

Let the size of the integer i = -5 be 2 bytes. The sign bit, the leftmost bit, is '1' (which signifies that it is a negative number).
When I do a right shift, shouldn't I expect the '1' at bit position 15 to move to position 14 and give me a large but positive value?
What I tried:
int i = 5;
i >> 1; // gives 2 (I understand this)
i = -5;
i >> 1; // gives -3 (mind = blown)

Right shifts of negative values are implementation-defined, [expr.shift]/3
The value of E1 >> E2 is E1 right-shifted E2 bit positions.
[..]. If E1 has a signed type and a negative value, the resulting
value is implementation-defined.
Most implementations use the so-called arithmetic shift though, which preserves and extends the sign-bit:
Shifting right by n bits on a two's complement signed binary number
has the effect of dividing it by 2^n, but it always rounds down
(towards negative infinity). This is different from the way rounding
is usually done in signed integer division (which rounds towards 0).
This discrepancy has led to bugs in more than one compiler.
So, shortened down to 8 bits, the following happens. In two's complement, -5 is
1111 1011
After the arithmetic right shift:
1111 1101
Now flip and add one to get the positive value for comparison:
0000 0011
Looks like a three to me, so the shifted pattern represents -3.
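To see the rounding difference concretely, here is a minimal C++ sketch. The names floor_div_pow2 and print_shift_demo are hypothetical, introduced only for illustration. The helper computes the round-toward-negative-infinity quotient that an arithmetic shift produces, without relying on the implementation-defined >> (from C++20 on, >> on a negative value is defined as an arithmetic shift).

```cpp
#include <iostream>

// Hypothetical helper (not from the answer above): floor division by 2^n,
// i.e. the value an arithmetic right shift by n produces.
constexpr int floor_div_pow2(int value, unsigned n) {
    const int divisor = 1 << n;
    int quotient = value / divisor;            // C++ division truncates toward zero
    if (value % divisor != 0 && value < 0) {   // a negative result was truncated upward
        --quotient;                            // step down to the floor
    }
    return quotient;
}

static_assert(floor_div_pow2(5, 1) == 2, "5 / 2 rounds down to 2");
static_assert(floor_div_pow2(-5, 1) == -3, "-5 / 2 rounds down to -3");
static_assert(-5 / 2 == -2, "plain signed division truncates toward zero");

// Prints the result of the real shift; an arithmetic shift gives -3,
// which is guaranteed only from C++20 on.
void print_shift_demo() {
    std::cout << (-5 >> 1) << '\n';
}
```

On the common compilers that use an arithmetic shift, calling print_shift_demo() shows the same -3 the question observed.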

Related

Negative decimal to fixed-point binary

Is -28.91 = 00100.0111?
28 -> 11100, then flip and add 1
-28 -> 00100
.91 -> 0111, with an accuracy of 4 places
I have looked in a lot of places to check whether my conversion is correct, but without success. So I would like to ask people here whether I am correct.
For addition / subtraction and other operations to work normally (by using binary addition on the whole bit-pattern), the whole thing (integer and fractional parts combined) as an integer has to be x * 2^4.
i.e. the actual value represented by 0b00100.0111 is 0b001000111 / 16.
That means you have to do 2's complement negation (binary subtraction from 0, or use the invert and add 1 identity) for the whole and fractional bits together.
Also, your value for 28 has its MSB set, so it's already negative, i.e. you've overflowed 5-bit signed 2's complement. Presumably you actually have a wider integer part.
For 16-bit 12.4 fixed-point, 28.91:
28.91 * 16 = 462.56, which rounds up to 463.
+463 = 0b0000000111001111
-463 = 0b1111111000110001
As 12.4 fixed-point, this 0b111111100011.0001 bit-pattern represents -463/16 = -28.9375, the nearest representable value to -28.91
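The conversion above can be sketched in C++ as follows. The helper names (to_fixed_12_4, from_fixed_12_4, bit_pattern) are hypothetical, and the sketch assumes the input fits the 12.4 range; it does no overflow checking.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helpers for a 16-bit format with 4 fractional bits (12.4).
// Assumption: the input is within range, so no saturation is needed.

// Scale by 2^4 and round to the nearest integer; the sign is handled
// on the whole value, integer and fractional bits together.
inline std::int16_t to_fixed_12_4(double value) {
    return static_cast<std::int16_t>(std::lround(value * 16.0));
}

// The represented value is the raw integer divided by 2^4.
inline double from_fixed_12_4(std::int16_t raw) {
    return raw / 16.0;
}

// Reinterpret the raw value as an unsigned 16-bit pattern for display.
inline std::uint16_t bit_pattern(std::int16_t raw) {
    return static_cast<std::uint16_t>(raw);
}
```

With these, to_fixed_12_4(-28.91) gives -463, whose bit pattern is 0xFE31 (1111 1110 0011 0001), and from_fixed_12_4(-463) gives back -28.9375.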

C/C++ Bitwise Operations not resulting in expected output?

I'm currently working on bitwise operations, but I am confused right now. Here's the scoop and why.
I have a byte 0xCD, which in bits is 1100 1101.
I am shifting the bits left by 7, then applying & 0xFF, since 0xFF in bits is 1111 1111.
unsigned int bit = (0xCD << 7) & 0xFF<<7;
I would assume that both 0xCD and 0xFF get shifted left 7 times and that the remaining bit would be 1 & 1 = 1, but that is not the output I get. I would also assume that shifting by 6 would give me 0 & 1 = 0, but again I get a number above 1, like 205. Is there something incorrect about the way I am trying to process bit shifting in my head? If so, what is it that I am doing wrong?
Code Below:
unsigned char byte_now = 0xCD;
printf("Bits for byte_now: 0x%02x: ", byte_now);
/*
* We want to get the first bit in a byte.
* To do this we will shift the bits over 7 places for the last bit
* we will compare it to 0xFF since it's (1111 1111) if bit&1 then the bit is one
*/
unsigned int bit_flag = 0;
int bit_pos = 7;
bit_flag = (byte_now << bit_pos) & 0xFF;
printf("%d", bit_flag);
Is there something incorrect about the way I am trying to process bit shifting in my head?
There seems to be.
If so what is it that I am doing wrong?
That's unclear, so I offer a reasonably full explanation.
In the first place, it is important to understand that C does not perform any arithmetic directly on integers smaller than int. Consider, then, your expression byte_now << bit_pos. The integer promotions are performed on the operands, resulting in the left operand being converted to the int value 0xCD. The result has the same pattern of least-significant value bits as byte_now, but also a bunch of leading zero bits.
Left shifting the result by 7 bits produces the bit pattern 110 0110 1000 0000, equivalent to 0x6680. You then perform a bitwise and operation on the result, masking off all but the least-significant 8 bits, thus yielding 0x80. What happens when you assign that to bit_flag depends on the type of that variable, but if it is an integer type that is either unsigned or has more than 7 value bits then the assignment is well-defined and value-preserving. Note that it is bit 7 that is nonzero, not bit 0.
The type of bit_flag is more important when you pass it to printf(). You've paired it with a %d field descriptor, which is correct if bit_flag has type int and incorrect otherwise; for the unsigned int declared in the code shown, %u is the matching descriptor. Either way, I would expect the program to print 128.
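If the goal is to read a single bit, a common alternative is to shift right and mask with 1. Here is a minimal sketch; bit_at is a hypothetical helper name, and bit 0 is taken to be the least-significant bit.

```cpp
#include <cstdint>

// Hypothetical helper: value (0 or 1) of bit `pos` in a byte,
// where pos 0 is the least-significant bit.
constexpr unsigned bit_at(std::uint8_t byte, unsigned pos) {
    return (byte >> pos) & 1u;  // move the wanted bit to position 0, drop the rest
}

// The expression from the question: 0xCD is promoted to int before the shift,
// so the shift yields 0x6680 and the mask keeps only 0x80.
static_assert(((0xCD << 7) & 0xFF) == 0x80, "bit 7 is set, not bit 0");

// 0xCD is 1100 1101.
static_assert(bit_at(0xCD, 7) == 1, "most-significant bit is 1");
static_assert(bit_at(0xCD, 1) == 0, "bit 1 is 0");
static_assert(bit_at(0xCD, 0) == 1, "least-significant bit is 1");
```

Shifting right first means the result is always 0 or 1, regardless of which position was asked for.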

fixed point subtraction for two's complement data

I have some real data, for example +2 and -3. The data are represented in two's complement fixed point as 4-bit binary values, where the MSB is the sign bit and the number of fractional bits is zero.
So +2 = 0010
-3 = 1101
Addition of these two numbers is (+2) + (-3) = -1
(0010) + (1101) = (1111)
But in the case of subtraction, (+2) - (-3), what should I do?
Do I need to take the two's complement of 1101 (-3) again and add it to 0010?
You can evaluate -(-3) in binary and then simply add it to the other value.
With two's complement, negating a number is pretty simple: apply the binary NOT operation to every bit, then add 1. (Equivalently, keep every bit up to and including the lowest 1 bit and invert all the bits to its left.) Using the tilde to represent bitwise NOT, and assuming integers represented by n bits (n = 4 in your example): -x = (~x + 1) mod 2^n
In your example (with an informal notation): -(-3) = -(1101) = 0010 + 1 = 0011, and therefore (+2) - (-3) = 0010 + 0011 = 0101 = +5
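A minimal C++ sketch of the same 4-bit arithmetic follows. The helper names (negate4, add4, sub4) and the constant kMask4 are hypothetical; the 4-bit values are stored in the low bits of a byte and masked after every operation.

```cpp
#include <cstdint>

// Keeps only the low 4 bits, i.e. reduces a result modulo 2^4.
constexpr std::uint8_t kMask4 = 0x0F;

// Two's complement negation: invert every bit, add 1, keep 4 bits.
constexpr std::uint8_t negate4(std::uint8_t x) {
    return static_cast<std::uint8_t>((~x + 1) & kMask4);
}

// 4-bit addition: any carry out of bit 3 is discarded.
constexpr std::uint8_t add4(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a + b) & kMask4);
}

// Subtraction is addition of the negated operand.
constexpr std::uint8_t sub4(std::uint8_t a, std::uint8_t b) {
    return add4(a, negate4(b));
}

static_assert(negate4(0b1101) == 0b0011, "-(-3) = +3");
static_assert(add4(0b0010, 0b1101) == 0b1111, "(+2) + (-3) = -1");
static_assert(sub4(0b0010, 0b1101) == 0b0101, "(+2) - (-3) = +5");
```

Note that the sketch does not detect overflow: with 4 bits only -8 to +7 are representable, so a result outside that range wraps around.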

Why does a right shift on a signed integer causes an overflow?

Given any 8-bit negative integer (signed, so between -1 and -128), a right shift in HLA causes an overflow and I don't understand why. If shifted once, it should basically divide the value by 2. This is true for positive numbers but obviously not for negative ones. Why? For example, if -10 is entered, the result is +123.
Program cpy;
#include ("stdlib.hhf")
#include ("hla.hhf")
static
i:int8;
begin cpy;
stdout.put("Enter value to divide by 2: ");
stdin.geti8();
mov(al,i);
shr(1,i); //shift bits one position right
if(#o)then // if overflow
stdout.put("overflow");
endif;
end cpy;
Negative numbers are represented in two's complement, where the leftmost bit acts as the sign bit.
The 2's complement of 10 coded on 7 bits is 1110110, and the sign bit value for negative numbers is 1.
-10: 1111 0110
^
|
sign bit
Then you shift it to the right (shr is a logical shift, so zeroes get added on the left):
-10 >> 1: 0111 1011
^
|
sign bit
Your sign bit is worth 0 (positive), and 1111011 is 123 in decimal.
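The difference between the logical shift used here and an arithmetic shift (on x86, the sar instruction rather than shr) can be sketched in C++. The helper names are hypothetical. As a side note, for a one-bit shr x86 sets the overflow flag to the original most-significant bit, which is why every negative input reports an overflow.

```cpp
#include <cstdint>

// Logical shift: treat the byte as unsigned, so a 0 enters from the left.
constexpr std::uint8_t logical_shr1(std::int8_t value) {
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(value) >> 1);
}

// Arithmetic shift, written portably: for negative input, invert to get a
// non-negative number, shift that, and invert back, so 1 bits enter from
// the left and the sign is preserved.
constexpr std::int8_t arithmetic_shr1(std::int8_t value) {
    return static_cast<std::int8_t>(value < 0 ? ~(~value >> 1) : value >> 1);
}

static_assert(logical_shr1(-10) == 123, "1111 0110 becomes 0111 1011");
static_assert(arithmetic_shr1(-10) == -5, "1111 0110 becomes 1111 1011");
static_assert(arithmetic_shr1(10) == 5, "positive values behave the same either way");
```

So the arithmetic variant is the one that behaves like "divide by 2" for negative values (rounding toward negative infinity).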

the idea behind unsigned integer [duplicate]

I'm new to C++ and I want to know how to use unsigned types. For the unsigned int type, I know that it can take values from 0 to 4294967295. But when I initialize an unsigned int as follows:
unsigned int x = -10;
cout << x;
The output is 4294967286
I got this output, which looks like max value - 10. So I want to learn what is happening in memory. What kind of processing is done during this calculation? Thanks for your answers.
You're encountering wrap around behavior.
Unsigned types are cyclic (signed types, on the other hand, may or may not wrap in practice, but signed overflow is undefined behavior that you shouldn't rely on). That is to say, one less than the minimum possible value is the maximum possible value. You can demonstrate this yourself with the following snippet:
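#include <iostream>
using namespace std;

int main()
{
    unsigned int x = 5;
    // Counts 5, 4, ..., 0 and then wraps around to the maximum value.
    for (int i = 0; i < 10; ++i) cout << x-- << endl;
    return 0;
}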
You'll notice that after reaching zero, the value of x jumps to 2^32-1, the maximum representable value. Subtracting further acts as expected.
When you subtract 1 from unsigned 0, the bit pattern changes in the following way:
0000 0000 0000 0000 0000 0000 0000 0000 // before (0)
1111 1111 1111 1111 1111 1111 1111 1111 // after (2^32 - 1)
With unsigned numbers, negative numbers are treated like positive numbers subtracted from zero. So (unsigned int) -10 will equal ((unsigned int) 0) - ((unsigned int) 10).
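The equivalence can be checked at compile time. This is a minimal sketch that uses std::uint32_t so the 32-bit width is explicit rather than assumed; to_u32 is a hypothetical helper name.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: convert a signed value to a 32-bit unsigned one.
// The conversion is defined as reduction modulo 2^32.
constexpr std::uint32_t to_u32(std::int64_t value) {
    return static_cast<std::uint32_t>(value);
}

static_assert(to_u32(-10) == 4294967286u, "-10 becomes 2^32 - 10");
static_assert(to_u32(-10) ==
                  static_cast<std::uint32_t>(std::uint32_t{0} - std::uint32_t{10}),
              "same as subtracting 10 from unsigned zero");
static_assert(to_u32(-1) == std::numeric_limits<std::uint32_t>::max(),
              "-1 becomes the maximum value, 2^32 - 1");
```

The first assertion reproduces the 4294967286 from the question: it is 2^32 - 10, which is one more than "max value - 10".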
I like to think about it as an unsigned int being the lowest 32 bits of a higher-precision arbitrary value. Like this:
v imaginary high order bit
1 0000 0000 0000 0000 0000 0000 0000 0000 // before (2^32)
0 1111 1111 1111 1111 1111 1111 1111 1111 // after (2^32 - 1)
The behavior of the unsigned int in these overflow cases is exactly the same as the behavior of the low 8 bits of an unsigned int when you subtract 1 from 256. It makes more sense to look at an unsigned char (1 byte) like this, because the values 0 and 256 are equal if cast to unsigned char, since the limited precision discards the extra bits.
0 0000 0000 0000 0000 0000 0001 0000 0000 // before (256)
0 0000 0000 0000 0000 0000 0000 1111 1111 // after (255)
As others have pointed out, this is called modulo arithmetic. Using higher precision values to help visualize the transitions made when wrapping around works because you mask off the high order bits. It doesn't matter what they were, so they can be anything; they just get discarded. 32-bit unsigned integers are values modulo 2^32, so any multiple of 2^32 equals zero in the space of such an integer. That's why I can get away with pretending there's an extra bit on the end.
Modulus operations have their own dedicated operator in case you need to compute them for numbers other than 2^32 in your programs, as used in this statement:
int forty_mod_twelve = 40 % 12;
// value is 4: 4 + n * 12 == 40 for some whole number n
Modulus operations on powers of two (like 2^32) simplify directly to masking off high order bits, and if you take a 64 bit integer and compute it modulo 2^32, the value will be exactly the same as if you had converted it to an unsigned int.
01011010 01011100 10000001 00001101 11111111 11111111 11111111 11111111 // before
00000000 00000000 00000000 00000000 11111111 11111111 11111111 11111111 // after
Programmers like to use this property to speed up programs, because it's easy to chop off some number of bits, but performing a modulus operation is much harder (it's about as hard as doing a division).
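Here is a small sketch of that equivalence, using the 64-bit value from the bit patterns above (0x5A5C810DFFFFFFFF). The three helper names are hypothetical; each computes the low 32 bits a different way.

```cpp
#include <cstdint>

constexpr std::uint64_t kTwoPow32 = std::uint64_t{1} << 32;

// Explicit modulus (conceptually a division).
constexpr std::uint32_t low32_by_modulo(std::uint64_t value) {
    return static_cast<std::uint32_t>(value % kTwoPow32);
}

// Masking off the high order bits.
constexpr std::uint32_t low32_by_mask(std::uint64_t value) {
    return static_cast<std::uint32_t>(value & 0xFFFFFFFFu);
}

// Plain conversion to a 32-bit unsigned type.
constexpr std::uint32_t low32_by_cast(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}

constexpr std::uint64_t kSample = 0x5A5C810DFFFFFFFFull;

static_assert(low32_by_modulo(kSample) == 0xFFFFFFFFu, "modulo 2^32");
static_assert(low32_by_mask(kSample) == 0xFFFFFFFFu, "mask gives the same value");
static_assert(low32_by_cast(kSample) == 0xFFFFFFFFu, "conversion gives the same value");
```

Compilers typically turn a modulus by a constant power of two on unsigned values into the mask on their own, so the explicit mask mostly matters for readability or when the divisor is not a compile-time constant.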
Does that make sense?
This involves the standard integral conversions. Here's the applicable rule. We start with the type of the literal 10:
2.14.2 Integer literals [lex.icon]
An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have
a prefix that specifies its base and a suffix that specifies its type. The lexically first digit of the sequence
of digits is the most significant. A decimal integer literal (base ten) begins with a digit other than 0 and
consists of a sequence of decimal digits. An octal integer literal (base eight) begins with the digit 0 and
consists of a sequence of octal digits. A hexadecimal integer literal (base sixteen) begins with 0x or 0X and
consists of a sequence of hexadecimal digits, which include the decimal digits and the letters a through f and A through F with decimal values ten through fifteen. [ Example: the number twelve can be written 12, 014, or 0XC. — end example ]
The type of an integer literal is the first of the corresponding list in Table 6 in which its value can be
represented.
A table follows, the first type is int and it fits. So the literal's type is int.
The unary minus operator is applied, which doesn't change the type. Then the following rule is applied:
4.7 Integral conversions [conv.integral]
A prvalue of an integer type can be converted to a prvalue of another integer type. A prvalue of an unscoped enumeration type can be converted to a prvalue of an integer type.
If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). — end note ]
Instead of printing the value the way you are, print it in hexadecimal format (with cout, insert std::hex into the stream before the value). You'll see that the representation is the same for both values.
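One way to do the hexadecimal printing is sketched below. hex32 is a hypothetical helper name; it formats a 32-bit pattern as eight hex digits using the standard std::hex, std::setw and std::setfill manipulators, and assumes the 32-bit width discussed in this thread.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a 32-bit pattern as "0x" plus 8 hex digits.
std::string hex32(std::uint32_t bits) {
    std::ostringstream out;
    out << "0x" << std::hex << std::setw(8) << std::setfill('0') << bits;
    return out.str();
}
```

Both hex32(static_cast<std::uint32_t>(-10)) and hex32(4294967286u) return "0xfffffff6", which shows that the two values share one bit pattern.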
From your context, an integer is 32 bits (this is not always the case). When using a signed integer, the most significant bit acts as the sign bit (in two's complement it carries a negative weight). When using an unsigned integer, the most significant bit is an ordinary value bit.