Are bitwise OR and AND NOT the same as addition and subtraction when the set is known? - bit-manipulation

If a = 2 and b = 4, where a OR b = 6 and (a|b) AND NOT b = a, then is a bitwise AND NOT equivalent to subtraction when the value is a set of flags which is known to include the flag being removed?
Is it the same for addition as well?
Note that this is in a situation where the flags are known to exist in the set. No addition or subtraction would occur if the flag is not present.

If I understood correctly what you're asking, yes. So:
If (a & b) == 0, then (a | b) == (a + b), and
If (a | b) == a, then (a & ~b) == (a - b)
As a sort of proof, take that addition can be written as a + b == (a ^ b) + ((a & b) << 1) (which is doing all the sums-without-carry, and then adding the carries separately). So if a & b is zero, the carries disappear and it becomes just a ^ b, and that in turn becomes a | b. A similar thing happens with the subtraction where we know there are no borrows.
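As a quick check, here is a minimal sketch in C using the question's own values of a = 2 and b = 4:

#include <assert.h>
#include <stdint.h>

int main(void) {
    uint32_t a = 2, b = 4;           // b's bit is not in a, so a & b == 0

    assert((a & b) == 0);
    assert((a | b) == a + b);        // OR behaves as addition: no carries

    uint32_t set = a | b;            // the set is known to contain flag b
    assert((set & ~b) == set - b);   // AND NOT behaves as subtraction: no borrows
    return 0;
}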

Only if you're ORing with values whose 1-bits are definitely not in the first operand, or ANDing with the bit-negations of values whose 1-bits definitely are in the first operand.

Related

Modulo Multiplication Function: Multiplying two integers under a modulus

I came across this modulo multiplication function in code for the Miller-Rabin primality test. It is supposed to eliminate the integer overflow that occurs when calculating (a * b) % m.
I need some help in understanding what is going on here. Why does this work? And what is the significance of the number literal 0x8000000000000000ULL?
unsigned long long mul_mod(unsigned long long a, unsigned long long b, unsigned long long m) {
    unsigned long long d = 0, mp2 = m >> 1;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    for (int i = 0; i < 64; i++)
    {
        d = (d > mp2) ? (d << 1) - m : d << 1;
        if (a & 0x8000000000000000ULL)
            d += b;
        if (d >= m) d -= m;
        a <<= 1;
    }
    return d;
}
This code, which currently appears on the modular arithmetic Wikipedia page, only works for arguments of up to 63 bits -- see bottom.
Overview
One way to compute an ordinary multiplication a * b is to add left-shifted copies of b -- one for each 1-bit in a. This is similar to how most of us did long multiplication in school, but simplified: Since we only ever need to "multiply" each copy of b by 1 or 0, all we need to do is either add the shifted copy of b (when the corresponding bit of a is 1) or do nothing (when it's 0).
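As a sketch (my own illustration, not part of the original code), that plain shift-and-add multiplication, ignoring overflow and without any modulus, might look like:

#include <stdint.h>

uint64_t mul_shift_add(uint64_t a, uint64_t b) {
    uint64_t total = 0;
    while (a) {
        if (a & 1)        // this bit of a is 1: add the current copy of b
            total += b;
        b <<= 1;          // the next copy of b is shifted one bit further left
        a >>= 1;
    }
    return total;
}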
This code does something similar. However, to avoid overflow (mostly; see below), instead of shifting each copy of b and then adding it to the total, it adds an unshifted copy of b to the total, and relies on later left-shifts performed on the total to shift it into the correct place. You can think of these shifts "acting on" all the summands added to the total so far. For example, the first loop iteration checks whether the highest bit of a, namely bit 63, is 1 (that's what a & 0x8000000000000000ULL does), and if so adds an unshifted copy of b to the total; by the time the loop completes, the previous line of code will have shifted the total d left 1 bit 63 more times.
The main advantage of doing it this way is that we are always adding two numbers (namely b and d) that we already know are less than m, so handling the modulo wraparound is cheap: We know that b + d < 2 * m, so to ensure that our total so far remains less than m, it suffices to check whether b + d < m, and if not, subtract m. If we were to use the shift-then-add approach instead, we would need a % modulo operation per bit, which is as expensive as division -- and usually much more expensive than subtraction.
One of the properties of modulo arithmetic is that, whenever we want to perform a sequence of arithmetic operations modulo some number m, performing them all in usual arithmetic and taking the remainder modulo m at the end always yields the same result as taking remainders modulo m for each intermediate result (provided no overflows occur).
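For example, (7 * 8) % 5 = 56 % 5 = 1, and reducing the intermediate results early gives the same answer: ((7 % 5) * (8 % 5)) % 5 = (2 * 3) % 5 = 6 % 5 = 1.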
Code
Before the first line of the loop body, we have the invariants d < m and b < m.
The line
d = (d > mp2) ? (d << 1) - m : d << 1;
is a careful way of shifting the total d left by 1 bit, while keeping it in the range 0 .. m-1 and avoiding overflow. Instead of first shifting it and then testing whether the result is m or greater, we test whether it is currently strictly above RoundDown(m/2) -- because if so, after doubling, it will surely be strictly above 2 * RoundDown(m/2) >= m - 1, and so require a subtraction of m to get back in range. Note that even though the (d << 1) in (d << 1) - m may overflow and lose the top bit of d, this does no harm as it does not affect the lowest 64 bits of the subtraction result, which are the only ones we are interested in. (Also note that if d == m/2 exactly, we wind up with d == m afterward, which is slightly out of range -- but changing the test from d > mp2 to d >= mp2 to fix this would break the case where m is odd and d == RoundDown(m/2), so we have to live with this. It doesn't matter, because it will be fixed up by the if (d >= m) d -= m below.)
Why not simply write d <<= 1; if (d >= m) d -= m; instead? Suppose that, in infinite-precision arithmetic, d << 1 >= m, so we should perform the subtraction -- but the high bit of d is on and the rest of d << 1 is less than m: In this case, the initial shift will lose the high bit and the if will fail to execute.
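To make this concrete, here is a sketch of an 8-bit analogue (the values m = 0xE1 and d = 0x90 are made up; the casts simulate the truncation that happens naturally at 64 bits):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t m = 0xE1, mp2 = m >> 1;     // m = 225
    uint8_t d = 0x90;                   // d = 144 < m, and d's high bit is on

    uint8_t naive = (uint8_t)(d << 1);  // true value 288 truncates to 32
    if (naive >= m) naive -= m;         // 32 < 225: subtraction wrongly skipped

    uint8_t careful = (d > mp2) ? (uint8_t)((d << 1) - m)
                                : (uint8_t)(d << 1);

    printf("naive = %d, careful = %d, expected = %d\n",
           naive, careful, (2 * 144) % 225);   // prints 32, 63, 63
    return 0;
}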
Restriction to inputs of 63 bits or fewer
The above edge case can only occur when d's high bit is on, which can only occur when m's high bit is also on (since we maintain the invariant d < m). So it looks like the code is taking pains to work correctly even with very high values of m. Unfortunately, it turns out that it can still overflow elsewhere, resulting in incorrect answers for some inputs that set the top bit. For example, when a = 3, b = 0x7FFFFFFFFFFFFFFFULL and m = 0xFFFFFFFFFFFFFFFFULL, the correct answer should be 0x7FFFFFFFFFFFFFFEULL, but the code will return 0x7FFFFFFFFFFFFFFDULL (an easy way to see the correct answer is to rerun with the values of a and b swapped). Specifically, this behaviour occurs whenever the line d += b overflows and leaves the truncated d less than m, causing a subtraction to be erroneously skipped.
Provided this behaviour is documented (as it is on the Wikipedia page), this is just a limitation, not a bug.
Removing the restriction
If we replace the lines
if (a & 0x8000000000000000ULL)
d += b;
if (d >= m) d -= m;
with
unsigned long long x = -(a >> 63) & b;
if (d >= m - x) d -= m;
d += x;
the code will work for all inputs, including those with top bits set. The cryptic first line is just a conditional-free (and thus usually faster) way of writing
unsigned long long x = (a & 0x8000000000000000ULL) ? b : 0;
The test d >= m - x operates on d before it has been modified -- it's like the old d >= m test, but b (when the top bit of a is on) or 0 (otherwise) has been subtracted from both sides. This tests whether d would be m or larger once x is added to it. We know that the RHS m - x never underflows, because the largest x can be is b and we have established that b < m at the top of the function.
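For reference, here is the whole function with that replacement applied (a sketch; mul_mod_fixed is my name for it):

unsigned long long mul_mod_fixed(unsigned long long a, unsigned long long b, unsigned long long m) {
    unsigned long long d = 0, mp2 = m >> 1;
    if (a >= m) a %= m;
    if (b >= m) b %= m;
    for (int i = 0; i < 64; i++)
    {
        d = (d > mp2) ? (d << 1) - m : d << 1;
        // x is b when the top bit of a is set, 0 otherwise
        unsigned long long x = -(a >> 63) & b;
        if (d >= m - x) d -= m;
        d += x;
        a <<= 1;
    }
    return d;
}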

Bitwise operations for comparing numbers?

I've spent too many brain cycles on this over the last day.
I'm trying to come up with a set of bitwise operations that may re-implement the following condition:
uint8_t a, b;
uint8_t c, d;
uint8_t e, f;
...
bool result = (a == 0xff || a == b) && (c == 0xff || c == d) && (e == 0xff || e == f);
Code I'm looking at has four of these expressions, short-circuit &&ed together (as above).
I know this is an esoteric question, but the short-circuit nature of this and the timing of the above code in a tight loop make the lack of predictable timing a royal pain, and quite frankly, it seems to really suck on architectures where branch prediction isn't available or isn't so well implemented.
Is there such a beast that would be concise?
So, if you really want to do bit-twiddling to make this "fast" (which you really should only do after profiling your code to make sure this is a bottleneck), what you want to do is vectorize this by packing all the values together into a wider word so you can do all the comparisons at once (one instruction), and then extract the answer from a few bits.
There are a few tricks to this. To compare two values for equality, you can XOR (^) them and test whether the result is zero. To test a field of a wider word for zero, you can 'pack' it with a 1 bit above, then subtract one and see if the extra bit you added is still 1 -- if it is now 0, the value of the field was zero.
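Here is that guard-bit trick on a single 8-bit field, as a sketch (the field values 0x5A and 0x3C are made up):

#include <stdint.h>
#include <assert.h>

int main(void) {
    uint16_t equal   = (0x5A ^ 0x5A) | 0x100;   // field is zero, guard bit above it
    uint16_t unequal = (0x5A ^ 0x3C) | 0x100;   // field is nonzero

    assert(((equal   - 1) & 0x100) == 0);   // guard borrowed away: field was zero
    assert(((unequal - 1) & 0x100) != 0);   // guard intact: field was nonzero
    return 0;
}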
Putting all this together, you want to do 6 8-bit compares at once. You can pack these values into 9-bit fields in a 64-bit word (9 bits to get that extra guard bit you're going to test after the subtraction). You can fit up to 7 such 9-bit fields in a 64-bit int, so there's no problem.
// pack 6 9-bit values into a word
#define VEC6x9(A,B,C,D,E,F) (((uint64_t)(A) << 45) | ((uint64_t)(B) << 36) | ((uint64_t)(C) << 27) | ((uint64_t)(D) << 18) | ((uint64_t)(E) << 9) | (uint64_t)(F))
// the two values to compare
uint64_t v1 = VEC6x9(a, a, c, c, e, e);
uint64_t v2 = VEC6x9(b, 0xff, d, 0xff, f, 0xff);
uint64_t guard_bits = VEC6x9(0x100, 0x100, 0x100, 0x100, 0x100, 0x100);
uint64_t ones = VEC6x9(1, 1, 1, 1, 1, 1);
uint64_t alt_guard_bits = VEC6x9(0, 0x100, 0, 0x100, 0, 0x100);
// do the comparisons in parallel
uint64_t res_vec = ((v1 ^ v2) | guard_bits) - ones;
// keep only the guard bits; mask off the bits we'll ignore (optional for clarity, not needed for correctness)
res_vec &= guard_bits;
// do the 3 OR ops in parallel
res_vec &= res_vec >> 9;
// get the result
bool result = (res_vec & alt_guard_bits) == 0;
The ORs and ANDs at the end are 'backwards' because the result bit for each comparison is 0 if the comparison was true (values were equal) and 1 if it was false (values were not equal).
All of the above is mostly of interest if you are writing a compiler -- it's how you end up implementing a vector comparison -- and it may well be the case that a vectorizing compiler will do it all for you automatically.
This can be much more efficient if you can arrange to have your initial values pre-packed into vectors. This may in turn influence your choice of data structures and allowable values -- if you arrange for your values to be 7-bit or 15-bit (instead of 8-bit), they may pack more nicely when you add the guard bits...
You could modify how you store and interpret the data:
When a is 0xFF, do you need the value of b? If not, then make b equal to 0xFF and simplify the expression by removing the part that tests for 0xFF.
Also, you might combine a, c and e into a single variable, and b, d and f into another.
uint32_t abc;
uint32_t def;
bool result = abc == def;
Other operations might be slower but that loop should be much faster (single comparison instead of up to 6 comparisons).
You might want to use a union to be able to access the bytes individually or as a group. In that case, make sure that the fourth byte is always 0.
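A sketch of that layout might look like this (the type and field names are mine, and it assumes the 0xFF cases were already folded in at store time as described above):

#include <stdint.h>
#include <stdbool.h>

typedef union {
    uint8_t  byte[4];   // byte[3] must always stay 0
    uint32_t word;
} vec3;

// compare a, c, e against b, d, f in a single 32-bit comparison
bool all_equal(vec3 ace, vec3 bdf) {
    return ace.word == bdf.word;
}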
To remove timing variations with &&, ||, use &, |, as @molbdnilo suggested. Possibly faster, maybe not. Certainly easier to parallelize.
// bool result = (a == 0xff || a == b) && (c == 0xff || c == d)
// && (e == 0xff || e == f);
bool result = ((a == 0xff) | (a == b)) & ((c == 0xff) | (c == d))
& ((e == 0xff) | (e == f));

How is XOR applied when determining carry?

I'm working on a Game Boy emulator. One of the CPU operations I need to implement is the addition of a byte n to the stack pointer sp (opcode E8). The carry flag needs to be set if there is a carry from bit 7. I've looked at two implementations of this operation and they both follow the same carry-detection logic. The code for this is roughly as follows:
int result = (sp + n) & 0xFFFF;
boolean carry = ((sp ^ n ^ result) & 0x100) != 0;
I have worked through this logic with a few examples and it does work, but I simply don't get how it works. I understand how XOR works, but what's the logic behind its application here? Thanks.
Addition can be written as:
a + b = a ^ b ^ (c << 1)
Where c is the carry-out for every bit (c << 1 is the carry-in). This can also be used as a way to implement addition.
Therefore if the a ^ b part is XORed out of the sum again, we're left with c << 1. Bit 8 of that is the carry-out of bit 7.
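A quick sketch of that identity in C (the operand values are made up):

#include <stdint.h>
#include <assert.h>

int main(void) {
    uint32_t sp = 0x00FF, n = 0x0001;
    uint32_t result = (sp + n) & 0xFFFF;            // 0x0100
    int carry = ((sp ^ n ^ result) & 0x100) != 0;   // XOR out a ^ b, keep bit 8
    assert(carry == 1);                             // 0xFF + 0x01 carries out of bit 7
    return 0;
}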

Difference between | and ||, or & and && [duplicate]

This question already has answers here:
Why can't we use bitwise operators on float & double data types
(2 answers)
Closed 7 years ago.
These are two simple samples in C++, written in Dev-C++ 5.4.2:
float a, b, c;
if (a | b & a | c)
printf("x = %.2f\tF = %.0f\n", x, F);
else
printf("x = %.2f\tF = %.2f\n", x, F);
and this code :
float a, b, c;
if (a || b && a || c)
printf("x = %.2f\tF = %.0f\n", x, F);
else
printf("x = %.2f\tF = %.2f\n", x, F);
Can somebody tell me the difference between || and |, and between & and &&? The second code works, but the first does not.
And the compiler gives an error message:
[Error] invalid operands of types 'float' and 'float' to binary 'operator&'.
The operators |, &, and ~ act on individual bits in parallel. They can be used only on integer types. a | b does an independent OR operation of each bit of a with the corresponding bit of b to generate that bit of the result.
The operators ||, &&, and ! act on each entire operand as a single true/false value. Any data type can be used that implicitly converts to bool. Many data types, including float, implicitly convert to bool with an implied != 0 operation.
|| and && also "short circuit". That means whenever the value of the result can be determined by just the first operand, the second is not evaluated. Example:
ptr && (*ptr == 7) -- if ptr is zero, the result is false without any risk of seg faulting by dereferencing zero.
You could contrast that with (int)ptr & (*ptr). Ignoring the fact that this would be a bizarre operation to even want, if (int)ptr were zero, the entire result would be zero, so a human might think you don't need the second operand in that case. But the program will likely compute both anyway.
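A minimal sketch of the short-circuit case from above:

#include <stdio.h>

int main(void) {
    int *ptr = 0;
    if (ptr && *ptr == 7)   // *ptr is never evaluated, so no crash
        puts("points to seven");
    else
        puts("null or not seven");
    return 0;
}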
You seem to be confused by the symbols of the operators. These symbols are actually split into two different categories, which are bit-wise operators and logical operators. Although they use similar symbols, you should regard them as different operators. The truth tables for both categories are similar, but the meanings are different. Maybe that's why people use similar symbols for the operators.
bit-wise operators
~ // NOT
& // AND
| // OR
^ // XOR
The bit-wise operators will regard all its operands as binary numerals and act according to the bit-wise truth tables on every bit of the operands.
Bit-wise Truth Table
x y   x&y  x|y  x^y
0 0    0    0    0
1 0    0    1    1
0 1    0    1    1
1 1    1    1    0

x   ~x
0    1
1    0
logical operators
! // Logical NOT (negation)
&& // Logical AND (conjunction)
|| // Logical OR (disjunction)
The logical operators regard all their operands as bools and act according to the truth tables below. Any number that is not equal to 0 is treated as true; 0 is false.
Logical Truth Table
x y   x&&y  x||y
F F    F     F
T F    F     T
F T    F     T
T T    T     T

x   !x
F    T
T    F
For example:
int a = 10; // a = 0000 .... 0000 1010 <-- a 32-bit integer
            // a is not zero -> true
int b = 7;  // b = 0000 .... 0000 0111 <-- a 32-bit integer
            // b is not zero -> true
Then for the bit-wise operator (the parentheses are needed, since == binds more tightly than &):
assert((a & b) == 2); // 2 = 0000 .... 0000 0010 <-- every bit is &'d separately
For the logical operator:
assert((a && b) == true); // true && true -> true
The bitwise operators, which are | (OR), & (AND), ^ (XOR), and ~ (complement), do what you expect them to do: they perform the aforementioned operations on bits.
And regarding your compilation issue, there are no bitwise operations for floating point numbers.
The logical operators, which are || (OR), && (AND), and ! (NOT) only know the values true and false.
An expression is true if its value is not 0. It is false if its value equals 0.
The logical operators do this operation first. Then they perform their corresponding operation:
||: true if at least one the operands is true
&&: true if both operands are true
!: true if the operand is false
Note that all logical operators are short-circuit operators.
Bitwise operations are not supported for floating-point types.
Alternatively, if you really need to, you can cast to an integer type before you use them (highly discouraged).
See here for how to convert a float to an integer: https://www.cs.tut.fi/~jkorpela/round.html

Invert a bitwise left shift and OR assignment

What would be the inverse function for this?
A = (B << 3) | 0x07;
How can I get a B when I already have the corresponding A?
You can't ever recover all the bits fully.
B << 3 shifts B three bits to the left, and it doesn't loop around. This means the state of the top three bits of B is erased -- unless you know those, you wouldn't be able to recover B.
Example:
10101101 << 3

Turns:  10101101
        ^^^
Into:   01101000
             ^^^
The top three bits are lost, and the bottom three are filled with zeroes. Deleted data is deleted.
The | 0x07 fills the bottom three bits (with 111), so even if you didn't shift, you'd be erasing the lowest three bits with 111, making those bits irrecoverable.
Now if it was XOR'd instead of OR'd, it'd be recoverable with another XOR:
A ^ same-value can be undone with another A ^ same-value because ((A ^ B) ^ B) == A
A | same-value cannot be undone with another A | same-value
A | same-value also cannot be undone with an AND: A & same-value
But the shift still would cause problems, even if it was XOR'd (which it isn't).
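A sketch of that contrast (the value of B is made up):

#include <stdint.h>
#include <assert.h>

int main(void) {
    uint8_t B = 0x2B;

    uint8_t x = B ^ 0x07;       // XOR with a known constant...
    assert((x ^ 0x07) == B);    // ...is undone by the same XOR

    uint8_t o = B | 0x07;       // OR forces the low three bits to 111
    assert(o != B);             // B's original low bits are unrecoverable
    return 0;
}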
Given (using an 8-bit B as an example, with 0b denoting binary form, demonstration only):
B = 0b00000000
B = 0b00100000
//...
B = 0b11100000
you can get the same A, so I don't think you can reverse the calculation; the leftmost 3 bits are lost.