Bitwise Operators NOT - bit-manipulation

I encountered a problem with bit arithmetic. It is bitwise NOT.
if A = 5; then ~A = ?
The binary of 5 is 101, the inverse is 010, and then converted to decimal is 0 * 2^2 + 1 * 2^1 + 0 * 2^0 = 2
But when I test in the IDE, the output is as follows:
System.out.println( ~5 );
Output:
-6
I don't know why. Thanks!!!

If you using a standard int, then after assignment your A to 5:
int A = 5;
Then your "A" would be not 101b, but 00000000000000000000000000000101b - all 32 bits.
After NEG operation, which inverse all bits, you will get:
A = 11111111111111111111111111111010
And this int-value is -6, in the 2-complement representation, used int the most of computers.

Related

c++, binary number calculations

I have question that asks how values such as c are computed in terms of binary numbers. Im researching it but now but figured id ask here if anyone has somewhere they can send me or explain how this works.
int main()
{
int a 10, int b = 12, int c, int d;
int c = a << 2; //output 40
}
Well, I'm not answering with C++ code, as the question is not really related to the language.
The integer ten is written 10 in base 10 as it's equal to 1 * 10^1 + 0 * 10^0.
Binary is base 2, so let's try to write ten as a sum of powers of 2.
10 = 8 + 2
That is 2^3 + 2^1.
Let's switch to binary (using only two digits : 0 and 1).
2^3 is written 1000
2^1 is written 10
Their sum is 1010 in binary.
"<<" is the operation that shift left binary digits by a certain amount (beware of overflow).
So 1010 << 2 is 101000
That is in decimal 2^5 + 2^3 = 32 + 8 = 40
You can also think of "<< N" as being a multiplication by 2^N of an integer.

C++ using AND operator in integer expression

I'm reading some source code for designing an Octree and found this in the code. I have removed some elements for simplication, but can anyone explain what i&4 is supposed to evaluate to?
for (int i = 0; i < 8; i++)
{
float j = i&4 ? .5f : -.5f;
}
& is the bitwise AND Operator.
It just does a bitwise operation of the value stored in i AND 0x4.
It exactly just isolates the third bit as 2^2 = 4.
Your expression in the loop checks if third bit is set in i and assigns to j (which must be a float!) 0.5 or if not set -0.5
I am not sure, but it may evaluate the bitwise and operation of i and 4 (100), so any number which has a '1' in its third bit will be evaluted to true, otherwise false.
Ex:
5 (101) & 4 (100) = 100 (4) which is different from 0 so its true
8 (1000) & 4 (100) = 0000 (0) which is false
The & operator in this case is a bitwise AND. Since the second operand is 4, a power of 2, it evaluates to 4 when i has its second least-significant bit set, and to 0 otherwise.
The for loop takes i from 0 to 7, inclusive. Consider bit representations of i in this range:
0000 - 0
0001 - 1
0010 - 2
0011 - 3
0100 - 4
0101 - 5
0110 - 6
0111 - 7
^
|
This bit determines the result of i & 4
Therefore, the end result of the conditional is as follows: if the second bit is set (i.e. when i is 4, 5, 6, or 7), the result is 0.5f; otherwise, it is -0.5f.
For the given range of values, this expression can be rewritten as
float j = (i >= 4) ? .5f : -.5f;
i & 4 just evaluate to true when the value-4 bit is set. In your case this only happens in the second half of the loop. So the code could actually be rewritten:
for (int i = 0; i < 4; i++)
{
float j = -.5f;
}
for (int i = 4; i < 8; i++)
{
float j = .5f;
}

In binary notation, what is the meaning of the digits after the radix point "."?

I have this example on how to convert from a base 10 number to IEEE 754 float representation
Number: 45.25 (base 10) = 101101.01 (base 2) Sign: 0
Normalized form N = 1.0110101 * 2^5
Exponent esp = 5 E = 5 + 127 = 132 (base 10) = 10000100 (base 2)
IEEE 754: 0 10000100 01101010000000000000000
This makes sense to me except one passage:
45.25 (base 10) = 101101.01 (base 2)
45 is 101101 in binary and that's okay.. but how did they obtain the 0.25 as .01 ?
Simple place value. In base 10, you have these places:
... 103 102 101 100 . 10-1 10-2 10-3 ...
... thousands, hundreds, tens, ones . tenths, hundredths, thousandths ...
Similarly, in binary (base 2) you have:
... 23 22 21 20 . 2-1 2-2 2-3 ...
... eights, fours, twos, ones . halves, quarters, eighths ...
So the second place after the . in binary is units of 2-2, well known to you as units of 1/4 (or alternately, 0.25).
You can convert the part after the decimal point to another base by repeatedly multiplying by the new base (in this case the new base is 2), like this:
0.25 * 2 = 0.5
-> The first binary digit is 0 (take the integral part, i.e. the part before the decimal point).
Continue multiplying with the part after the decimal point:
0.5 * 2 = 1.0
-> The second binary digit is 1 (again, take the integral part).
This is also where we stop because the part after the decimal point is now zero, so there is nothing more to multiply.
Therefore the final binary representation of the fractional part is: 0.012.
Edit:
Might also be worth noting that it's quite often that the binary representation is infinite even when starting with a finite fractional part in base 10. Example: converting 0.210 to binary:
0.2 * 2 = 0.4 -> 0
0.4 * 2 = 0.8 -> 0
0.8 * 2 = 1.6 -> 1
0.6 * 2 = 1.2 -> 1
0.2 * 2 = ...
So we end up with: 0.001100110011...2.
Using this method you see quite easily if the binary representation ends up being infinite.
"Decimals" (fractional bits) in other bases are surprisingly unintuitive considering they work in exactly the same way as integers.
base 10
scinot 10e2 10e1 10e0 10e-1 10e-2 10e-3
weight 100.0 10.0 1.0 0.1 0.01 0.001
value 0 4 5 .2 5 0
base 2
scinot 2e6 2e5 2e4 2e3 2e2 2e1 2e0 2e-1 2e-2 2e-3
weight 64 32 16 8 4 2 1 .5 .25 .125
value 0 1 0 1 1 0 1 .0 1 0
If we start with 45.25, that's bigger/equal than 32, so we add a binary 1, and subtract 32.
We're left with 13.25, which is smaller than 16, so we add a binary 0.
We're left with 13.25, which is bigger/equal than 8, so we add a binary 1, and subtract 8.
We're left with 05.25, which is bigger/equal than 4, so we add a binary 1, and subtract 4.
We're left with 01.25, which is smaller than 2, so we add a binary 0.
We're left with 01.25, which is bigger/equal than 1, so we add a binary 1, and subtract 1.
With integers, we'd have zero left, so we stop. But:
We're left with 00.25, which is smaller than 0.5, so we add a binary 0.
We're left with 00.25, which is bigger/equal to 0.25, so we add a binary 1, and subtract 0.25.
Now we have zero, so we stop (or not, you can keep going and calculating zeros forever if you want)
Note that not all "easy" numbers in decimal always reach that zero stopping point. 0.1 (decimal) converted into base 2, is infinitely repeating: 0.0001100110011001100110011... However, all "easy" numbers in binary will always convert nicely into base 10.
You can also do this same process with fractional (2.5), irrational (pi), or even imaginary(2i) bases, except the base cannot be between -1 and 1 inclusive .
2.00010 = 2+1 = 10.0002
1.00010 = 2+0 = 01.0002
0.50010 = 2-1 = 00.1002
0.25010 = 2-2 = 00.0102
0.12510 = 2-3 = 00.0012
The fractions base 2 are .1 = 1/2, .01 = 1/4. ...
Think of it this way
(dot) 2^-1 2^-2 2^-3 etc
so
. 0/2 + 1/4 + 0/8 + 0/16 etc
See http://floating-point-gui.de/formats/binary/
You can think of 0.25 as 1/4.
Dividing by 2 in (base 2) moves the decimal point one step left, the same way dividing by 10 in (base 10) moves the decimal point one step left. Generally dividing by M in (base M) moves the decimal point one step left.
so
base 10 base 2
--------------------------------------
1 => 1
1/2 = 0.5 => 0.1
0.5/2 = 1/4 = 0.25 => 0.01
0.25/2 = 1/8 = 0.125 => 0.001
.
.
.
etc.

represent negative number with 2' complement technique?

I am using 2' complement to represent a negative number in binary form
Case 1:number -5
According to the 2' complement technique:
Convert 5 to the binary form:
00000101, then flip the bits
11111010, then add 1
00000001
=> result: 11111011
To make sure this is correct, I re-calculate to decimal:
-128 + 64 + 32 + 16 + 8 + 2 + 1 = -5
Case 2: number -240
The same steps are taken:
11110000
00001111
00000001
00010000 => recalculate this I got 16, not -240
I am misunderstanding something?
The problem is that you are trying to represent 240 with only 8 bits. The range of an 8 bit signed number is -128 to 127.
If you instead represent it with 9 bits, you'll see you get the correct answer:
011110000 (240)
100001111 (flip the signs)
+
000000001 (1)
=
100010000
=
-256 + 16 = -240
Did you forget that -240 cannot be represented with 8 bits when it is signed ?
The lowest negative number you can express with 8 bits is -128, which is 10000000.
Using 2's complement:
128 = 10000000
(flip) = 01111111
(add 1) = 10000000
The lowest negative number you can express with N bits (with signed integers of course) is always - 2 ^ (N - 1).

How to optimize a cycle?

I have the following bottleneck function.
typedef unsigned char byte;
void CompareArrays(const byte * p1Start, const byte * p1End, const byte * p2, byte * p3)
{
const byte b1 = 128-30;
const byte b2 = 128+30;
for (const byte * p1 = p1Start; p1 != p1End; ++p1, ++p2, ++p3) {
*p3 = (*p1 < *p2 ) ? b1 : b2;
}
}
I want to replace C++ code with SSE2 intinsic functions. I have tried _mm_cmpgt_epi8 but it used signed compare. I need unsigned compare.
Is there any trick (SSE, SSE2, SSSE3) to solve my problem?
Note:
I do not want to use multi-threading in this case.
Instead of offsetting your signed values to make them unsigned, a slightly more efficient way would be to do the following:
use _mm_min_epu8 to get the unsigned min of p1, p2
compare this min for equality with p2 using _mm_cmpeq_epi8
the resulting mask will now be 0x00 for elements where p1 < p2 and 0xff for elements where p1 >= p2
you can now use this mask with _mm_or_si128 and _mm_andc_si128 to select the appropriate b1/b2 values
Note that this is 4 instructions in total, compared with 5 using the offset + signed comparison approach.
You can subtract 127 from your numbers, and then use _mm_cmpgt_epi8
Yes, this can be done in SIMD, but it will take a few steps to make the mask.
Ruslik got it right, I think. You want to xor each component with 0x80 to flip the sense of the signed and unsigned comparison. _mm_xor_si128 (PXOR) gets you that -- you'll need to create the mask as a static char array somewhere before loading it into a SIMD register. Then _mm_cmpgt_epi8 gets you a mask and you can use a bitwise AND (eg _mm_and_si128) to perform a masked-move.
Yes, SSE will not work here.
You can improve this code performance on multi-core computer by using OpenMP:
void CompareArrays(const byte * p1Start, const byte * p1End, const byte * p2, byte * p3)
{
const byte b1 = 128-30;
const byte b2 = 128+30;
int n = p1End - p1Start;
#pragma omp parallel for
for (int i = 0; i < n; ++p1, ++i)
{
p3[i] = (p1[i] < p2[i]) ? b1 : b2;
}
}
Unfortunately, many of the answers above are incorrect. Let's assume a 3-bit word:
unsigned: 4 5 6 7 0 1 2 3 == signed: -4 -3 -2 -1 0 1 2 3 (bits: 100 101 110 111 000 001 010 011)
The method by Paul R is incorrect. Suppose we want to know if 3 > 2. min(3,2) == 2, which suggests yes, so the method works here. Now suppose we want to know if if 7 > 2. The value 7 is -1 in signed representation, so min(-1,2) == -1, which suggests wrongly that 7 is not greater than 2 unsigned.
The method by Andrey is also incorrect. Suppose we want to know if 7 > 2, or a = 7, and b = 2. The value 7 is -1 in signed representation, so the first term (a > b) fails, and the method suggests that 7 is not greater than 2.
However, the method by BJobnh, as corrected by Alexey, is correct. Just subtract 2^(n-1) from the values, where n is the number of bits. In this case, we would subtract 4 to obtain new corresponding values:
old signed: -4 -3 -2 -1 0 1 2 3 => new signed: 0 1 2 3 -4 -3 -2 -1 == new unsigned 0 1 2 3 4 5 6 7.
In other words, unsigned_greater_than(a,b) is equivalent to signed_greater_than(a - 2^(n-1), b - 2^(n-1)).
use pcmpeqb and be the Power with you.