Using bit manipulation to calculate the mean value of two numbers? - bit-manipulation

I found this code:
int mid = (l & r) + ((l ^ r) >> 1)
which is the same as mid = (l + r) / 2,
but I can't figure out why.
Any help? Thanks!

It's not quite the same, and not being the same is the point. It is mostly the same, but without overflow trouble: if you input two positive numbers, the result will never be negative. That is not true of mid = (l + r) / 2: with l = 0x7fffffff and r = 1, for example, the true midpoint is 0x40000000, but the naive calculation overflows and yields 0xc0000000, a large negative number.
Addition can be decomposed into:
x + y = (x ^ y) + ((x & y) << 1)
That's just a simple "calculate per-digit sum, then apply the carries separately" decomposition. Now shift the whole thing right by 1, but keep the bits that would "fall off the end": instead of shifting the carries left and then right again, don't shift them at all, and shift only the per-digit sum to the right:
(x + y) >> 1 = ((x ^ y) >> 1) + (x & y)
Which is that midpoint calculation. Note that it rounds down, not towards zero, which matters for negative results. I would not call the result wrong, it's still halfway in between the endpoints, but it does not match the result from a normal signed division by 2 (usually rounds towards zero, though opinions about how it should round differ).
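As a rough C++ sketch of the identity above: the function name `midpoint_floor` is made up for illustration, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift (guaranteed since C++20, and what mainstream compilers did before that).

```cpp
#include <cstdint>

// Floor of (l + r) / 2 without ever forming l + r, so nothing can overflow.
//   l & r        : bits set in both inputs; each such bit counts in full
//   (l ^ r) >> 1 : bits set in exactly one input, halved (rounding down)
std::int32_t midpoint_floor(std::int32_t l, std::int32_t r) {
    return (l & r) + ((l ^ r) >> 1);
}
```

For example, `midpoint_floor(-3, 0)` gives -2 (rounding down), whereas `(-3 + 0) / 2` in C++ gives -1 (rounding towards zero).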
You can change it to work for all unsigned integers by using an unsigned right shift:
// unsigned midpoint without wrapping/overflow
int mid = (l & r) + ((l ^ r) >>> 1);
Of course being the unsigned midpoint, negative inputs are implicitly treated as very large positive numbers, that's the point.
If you're working with signed-but-non-negative numbers (as is usually the case for midpoint calculation), you can use the significantly simpler
int mid = (x + y) >>> 1
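C and C++ have no `>>>` operator; the usual equivalent is to do the shift on an unsigned type. A minimal sketch under that assumption (the names `midpoint_unsigned` and `midpoint_nonneg` are hypothetical, not from the answer):

```cpp
#include <cstdint>

// Unsigned midpoint: >> on an unsigned type is always a logical shift.
std::uint32_t midpoint_unsigned(std::uint32_t l, std::uint32_t r) {
    return (l & r) + ((l ^ r) >> 1);
}

// Shortcut for signed inputs that are known to be non-negative:
// the sum of two values below 2^31 always fits in 32 unsigned bits.
std::int32_t midpoint_nonneg(std::int32_t x, std::int32_t y) {
    std::uint32_t sum = static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(y);
    return static_cast<std::int32_t>(sum >> 1);
}
```

The second function is only meaningful when both arguments are non-negative; with a negative argument it silently returns the unsigned midpoint reinterpreted as signed.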

Related

A bitwise shortcut for calculating the signed result of `(x - y) / z`, given unsigned operands

I'm looking for a neat way (most likely, a "bitwise shortcut") for calculating the signed value of the expression (x - y) / z, given unsigned operands x, y and z.
Here is a "kinda real kinda pseudo" code illustrating what I am currently doing (please don't mind the actual syntax being "100% perfect C or C++"):
int64 func(uint64 x, uint64 y, uint64 z)
{
    if (x >= y) {
        uint64 result = (x - y) / z;
        if (int64(result) >= 0)
            return int64(result);
    }
    else {
        uint64 result = (y - x) / z;
        if (int64(result) >= 0)
            return -int64(result);
    }
    throwSomeError();
}
Please assume that I don't have a larger type at hand.
I'd be happy to read any idea of how to make this simpler/shorter/neater.
There is a shortcut, by using a bitwise trick for conditional-negation twice (once for the absolute difference, and then again to restore the sign).
I'll use some similar non-perfect C-ish syntax I guess, to match the question.
First get a mask that has all bits set iff x < y:
uint64 m = -uint64(x < y);
(x - y) and -(y - x) are actually the same, even in unsigned arithmetic, and conditional negation can be done by using the definition of two's complement: -a = ~(a - 1) = (a + (-1)) ^ (-1). (a + 0) ^ 0 is of course equal to a again, so when m is -1, (a + m) ^ m = -a, and when m is zero, it is a. So it's a conditional negation.
uint64 absdiff = (x - y + m) ^ m;
Then divide as usual, and restore the sign by doing another conditional negation:
return int64((absdiff / z + m) ^ m);
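Put together in real C++, the steps above might look like the sketch below. The name `signed_quotient` is invented here, and like the snippets above it does no range check: it assumes z is non-zero and that the true quotient fits in `int64_t`.

```cpp
#include <cstdint>

// Signed value of (x - y) / z for unsigned inputs, truncated toward zero.
// Assumption: z != 0 and the true quotient fits in int64_t (not checked).
std::int64_t signed_quotient(std::uint64_t x, std::uint64_t y, std::uint64_t z) {
    // m is all ones when x < y, otherwise zero.
    std::uint64_t m = 0 - static_cast<std::uint64_t>(x < y);
    // (a + m) ^ m negates a when m is all ones and leaves it alone when m is zero.
    std::uint64_t absdiff = (x - y + m) ^ m;
    std::uint64_t q = absdiff / z;
    return static_cast<std::int64_t>((q + m) ^ m);
}
```

If the error case from the question matters, a check that `q` does not exceed the signed range would still have to be added before the final conditional negation.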

bit twiddling : checking non-negative integers as difference of powers of 2

Problem : To check if a non-negative integer is of form 2^j - 2^k where j>=k>=0 i.e. difference of powers of 2.
My solution: such a number n can be represented as a contiguous sequence of 1s, e.g. 00011110. I turn off the rightmost contiguous sequence of 1s and do a zero check on the result.
The steps of my solution are:
00011110
00011111(turn on trailing 0's)
00000000(then turn off trailing 1's).
Using this formula (x | (x - 1)) & ((x | (x - 1)) + 1).
But there is a more efficient formula (maybe because it uses fewer operations) which does not use literals: ((x & -x) + x) & x, followed by a zero check. It is said to do the same thing, but I can't understand it and can't derive it from my own formula. Can someone explain this to me?
EDIT : 32-bit word, 2's complement
Given that -x is ~x + 1, if a number is of the form 2^j - 2^k then:
-x = 2^k plus all 1s >= 2^j, as carry will ripple up until it hits 2^k, then stop;
hence x & -x= 2^k;
hence (x & -x) + x = 2^j; and
hence ((x & -x) + x) & x = 0.
And you can work backwards along that logic:
((x & -x) + x) & x = 0 => no common bits between ((x & -x) + x) and x;
no common bits between x and ((x & -x) + x) implies that for consecutive group of 1s in x, (x & -x) must have the lowest of those bits set and none of the others;
... and the only way to achieve that given the way that carry ripples is if there is only one consecutive group of 1s.
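The reasoning above translates fairly directly into code. A small sketch for 32-bit words (the function name is made up; `0u - x` is used instead of `-x` only to avoid warnings about negating an unsigned value):

```cpp
#include <cstdint>

// True when x == 2^j - 2^k for some j >= k >= 0 (with j up to 32, thanks to wrap-around),
// i.e. when the set bits of x form at most one contiguous run.
bool is_power_of_two_difference(std::uint32_t x) {
    std::uint32_t lowest = x & (0u - x);   // 2^k: lowest set bit (0 if x == 0)
    std::uint32_t carried = lowest + x;    // the lowest run of 1s collapses into 2^j
    return (carried & x) == 0;             // any bit left over belongs to a second run
}
```

With 00011110 the lowest set bit is 00000010, the sum is 00100000, and the final AND is zero. With 00001010 the sum is 00001100, which still shares bit 3 with x, so the check fails.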
You asked for an algebraic proof connecting the two expressions, so here is one, but with some non-simple steps
((x | (x - 1)) + 1) & (x | (x - 1))
// rename x | (x - 1) into blsfill(x)
(blsfill(x) + 1) & blsfill(x)
// the trailing zeroes that get filled on the right side of the & don't matter,
// they end up being reset by the & anyway
(blsfill(x) + 1) & x
// filling the trailing zeroes and adding 1,
// is the same thing as skipping the trailing zeroes and adding the least-set-bit
(x + blsi(x)) & x
// rewrite blsi into elementary operations
(x + (x & -x)) & x
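If the algebra feels slippery, a brute-force comparison is a cheap sanity check. The sketch below (all names are hypothetical) evaluates both expressions and reports whether they agree over a range of inputs. It checks the full values, not just whether they are zero, since the derivation says the two expressions are equal.

```cpp
#include <cstdint>

// The questioner's formula: fill the trailing zeros, then clear the trailing ones.
std::uint32_t via_fill(std::uint32_t x) {
    std::uint32_t filled = x | (x - 1);
    return filled & (filled + 1);
}

// The shorter formula: add the lowest set bit, then mask with x.
std::uint32_t via_lowest_bit(std::uint32_t x) {
    return ((x & (0u - x)) + x) & x;
}

// Returns true if both formulas give identical results for every x in [0, limit).
bool formulas_agree(std::uint32_t limit) {
    for (std::uint32_t x = 0; x < limit; ++x) {
        if (via_fill(x) != via_lowest_bit(x)) return false;
    }
    return true;
}
```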

compute midpoint in floating point

Given two floating point numbers (IEEE single or double precision), I would like to find the number that lies half-way between them, but not in the sense of (x+y)/2 but with respect to actually representable numbers.
if both x and y are positive, the following works
float ieeeMidpoint(float x, float y)
{
    assert(x >= 0 && y >= 0);
    int xi = *(int*)&x;
    int yi = *(int*)&y;
    int zi = (xi+yi)/2;
    return *(float*)&zi;
}
The reason this works is that positive IEEE floating point numbers (including subnormals and infinity) keep their order when doing a reinterpreting cast. (This is not true for the 80-bit extended format, but I don't need that anyway.)
Now I am looking for an elegant way to do the same that includes the case when one or both of the numbers are negative. Of course it is easy to do with a bunch of ifs, but I was wondering if there is some nice bit-magic, preferably without any branching.
Figured it out myself. The order of negative numbers is reversed when doing the reinterpreting cast, so that is the only thing one needs to fix. This version is longer than I hoped it would be, but it's only some bit-shuffling, so it should be fast.
float ieeeMidpoint(float x, float y)
{
    // check for NaNs (note that subnormals and infinity work fine)
    assert(x == x && y == y);
    // re-interpreting cast
    int xi = *(int*)&x;
    int yi = *(int*)&y;
    // reverse negative numbers by flipping all 31 non-sign bits
    // (would look cleaner with an 'if', but I like not branching)
    xi = xi ^ ((xi >> 31) & 0x7FFFFFFF);
    yi = yi ^ ((yi >> 31) & 0x7FFFFFFF);
    // compute average of xi,yi (overflow-safe)
    int zi = (xi >> 1) + (yi >> 1) + (xi & 1);
    // reverse negative numbers back
    zi = zi ^ ((zi >> 31) & 0x7FFFFFFF);
    // re-interpreting back to float
    return *(float*)&zi;
}
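For what it's worth, here is a variant sketch of the same idea that swaps two details, so it is not the code above verbatim: it uses `std::memcpy` instead of pointer casts (the casts technically violate strict aliasing), and it averages with `(a & b) + ((a ^ b) >> 1)`, which rounds down consistently. It assumes `float` is a 32-bit IEEE 754 type and that `>>` on a negative `int32_t` is an arithmetic shift. The helper names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Map a float to an integer key whose signed order matches the float order.
std::int32_t float_to_ordered(float f) {
    std::int32_t i;
    std::memcpy(&i, &f, sizeof i);
    return i ^ ((i >> 31) & 0x7FFFFFFF);  // flip the magnitude bits of negatives
}

// Inverse of float_to_ordered (the XOR mapping is its own inverse).
float ordered_to_float(std::int32_t i) {
    i ^= (i >> 31) & 0x7FFFFFFF;
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}

// The representable float halfway between x and y, counted in representable values.
float ieee_midpoint(float x, float y) {
    assert(x == x && y == y);  // NaN has no place in the ordering
    std::int32_t xi = float_to_ordered(x);
    std::int32_t yi = float_to_ordered(y);
    std::int32_t zi = (xi & yi) + ((xi ^ yi) >> 1);  // floor average, no overflow
    return ordered_to_float(zi);
}
```

As a quick illustration, the midpoint of 1.0f and 4.0f in this sense is 2.0f rather than 2.5f, because there are as many floats between 1 and 2 as between 2 and 4.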

Find smallest integer greater or equal than x (positive integer) multiple of z (positive integer, probably power of 2)

Probably a very easy question, yet I came up with this implementation that looks far too complicated...
unsigned int x;
unsigned int z;
unsigned int makeXMultipleOfZ(const unsigned x, const unsigned z) {
    return x + (z - x % z) % z;
    //or
    //return x + (z - (x - 1) % z - 1); //This generates shorter assembly,
    //6 against 8 instructions
}
I would like to avoid if-statements.
If this helps, we can safely say that z will be a power of 2.
In my case z=4 (I know I could replace the modulo operation with an & bit operator), and I was wondering if I could come up with an implementation that involves fewer steps.
If z is a power of two, the modulo operation can be reduced to this bitwise operation:
return (x + z - 1) & ~(z - 1);
This logic is very common for data structure boundary alignment, for example. More info here: https://en.wikipedia.org/wiki/Data_structure_alignment
If z is a power of two and the integers are unsigned, the following will work:
x + (z - 1) & ~(z - 1)
I cannot think of a solution using bit-twiddling if z is an arbitrary number.
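A compact sketch of both cases, with made-up function names. The first is the mask trick from the answers and needs z to be a power of two. The second is not bit-twiddling at all, just the plain divide-and-multiply fallback for arbitrary z > 0. Both assume x + z - 1 does not wrap around.

```cpp
#include <cstdint>

// Round x up to a multiple of z, where z must be a power of two.
// z - 1 is a mask of the low bits; adding it and then clearing those bits rounds up.
std::uint32_t round_up_pow2(std::uint32_t x, std::uint32_t z) {
    return (x + (z - 1)) & ~(z - 1);
}

// Same result for any z > 0, using a division and a multiplication instead of a mask.
std::uint32_t round_up_any(std::uint32_t x, std::uint32_t z) {
    return (x + (z - 1)) / z * z;
}
```

For z = 4, `round_up_pow2(5, 4)` and `round_up_any(5, 4)` both give 8, and an x that is already a multiple of 4 is returned unchanged.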

Fast ceiling of an integer division in C / C++

Given integer values x and y, C and C++ both return as the quotient q = x/y the floor of the floating point equivalent. I'm interested in a method of returning the ceiling instead. For example, ceil(10/5)=2 and ceil(11/5)=3.
The obvious approach involves something like:
q = x / y;
if (q * y < x) ++q;
This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?
For positive numbers where you want to find the ceiling (q) of x when divided by y.
unsigned int x, y, q;
To round up ...
q = (x + y - 1) / y;
or (avoiding overflow in x+y)
q = 1 + ((x - 1) / y); // if x != 0
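To make the overflow remark concrete, here is a small sketch of the two variants as functions (names invented for the example, 32-bit unsigned operands assumed, y must be non-zero):

```cpp
#include <cstdint>

// Ceiling of x / y via (x + y - 1) / y; wraps around if x + y - 1 exceeds the type's range.
std::uint32_t ceil_div_add(std::uint32_t x, std::uint32_t y) {
    return (x + y - 1) / y;
}

// Ceiling of x / y via 1 + (x - 1) / y, with x == 0 handled separately.
std::uint32_t ceil_div_safe(std::uint32_t x, std::uint32_t y) {
    return x == 0 ? 0 : 1 + (x - 1) / y;
}
```

Both agree on small inputs such as 10/5 and 11/5. For x = 0xFFFFFFFF and y = 2, the first form wraps and returns 0, while the second returns 0x80000000.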
For positive numbers:
q = x/y + (x % y != 0);
Sparky's answer is one standard way to solve this problem, but as I also wrote in my comment, you run the risk of overflows. This can be solved by using a wider type, but what if you want to divide long longs?
Nathan Ernst's answer provides one solution, but it involves a function call, a variable declaration and a conditional, which makes it no shorter than the OP's code and probably even slower, because it is harder to optimize.
My solution is this:
q = (x % y) ? x / y + 1 : x / y;
It will be slightly faster than the OP's code, because the modulo and the division are performed by the same processor instruction once the compiler sees that they use the same operands. At least gcc 4.4.1 performs this optimization with the -O2 flag on x86.
In theory the compiler might inline the function call in Nathan Ernst's code and emit the same thing, but gcc didn't do that when I tested it. This might be because it would tie the compiled code to a single version of the standard library.
As a final note, none of this matters on a modern machine, except if you are in an extremely tight loop and all your data is in registers or the L1-cache. Otherwise all of these solutions will be equally fast, except for possibly Nathan Ernst's, which might be significantly slower if the function has to be fetched from main memory.
You could use the div function in cstdlib to get the quotient and remainder in a single call and then handle the ceiling separately, as shown below:
#include <cstdlib>
#include <iostream>
int div_ceil(int numerator, int denominator)
{
    std::div_t res = std::div(numerator, denominator);
    return res.rem ? (res.quot + 1) : res.quot;
}
int main(int, const char**)
{
    std::cout << "10 / 5 = " << div_ceil(10, 5) << std::endl;
    std::cout << "11 / 5 = " << div_ceil(11, 5) << std::endl;
    return 0;
}
There's a solution for both positive and negative x but only for positive y with just 1 division and without branches:
int div_ceil(int x, int y) {
    return x / y + (x % y > 0);
}
Note: if x is positive, the division rounds towards zero (that is, down), so we should add 1 if the remainder is not zero.
If x is negative, the division rounds towards zero (that is, up), which is what we need, and nothing is added because x % y is not positive.
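One way to convince yourself is to compare the expression against a floating-point reference over a range of small values. This is only a sketch with hypothetical names, and the reference is only trustworthy for operands small enough to be exact in a `double`.

```cpp
#include <cmath>

// The branch-free form from above: valid for any x, but only for y > 0.
int div_ceil_signed_x(int x, int y) {
    return x / y + (x % y > 0);
}

// Reference implementation via floating point; exact for the small values used here.
int div_ceil_reference(int x, int y) {
    return static_cast<int>(std::ceil(static_cast<double>(x) / y));
}

// True if both agree for every x in [lo, hi] and every y in [1, max_y].
bool matches_reference(int lo, int hi, int max_y) {
    for (int x = lo; x <= hi; ++x)
        for (int y = 1; y <= max_y; ++y)
            if (div_ceil_signed_x(x, y) != div_ceil_reference(x, y)) return false;
    return true;
}
```

For instance, -7 / 2 truncates to -3 with remainder -1, so nothing is added and the result is -3, which is the ceiling of -3.5.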
How about this? (requires y non-negative, so don't use this in the rare case where y is a variable with no non-negativity guarantee)
q = (x > 0)? 1 + (x - 1)/y: (x / y);
I reduced y/y to one, eliminating the term x + y - 1 and with it any chance of overflow.
I avoid x - 1 wrapping around when x is an unsigned type and contains zero.
For signed x, negative and zero still combine into a single case.
Probably not a huge benefit on a modern general-purpose CPU, but this would be far faster in an embedded system than any of the other correct answers.
I would have rather commented but I don't have a high enough rep.
As far as I am aware, for positive arguments and a divisor which is a power of 2, this is the fastest way (tested in CUDA):
//example y=8
q = (x >> 3) + !!(x & 7);
For generic positive arguments only, I tend to do it like so:
q = x/y + !!(x % y);
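The y = 8 example generalizes to any power-of-two divisor 2^k. A minimal sketch (the function name and the parameter `k` are assumptions for illustration, and k must be less than 32):

```cpp
#include <cstdint>

// Ceiling of x / 2^k for unsigned x, with 0 <= k < 32.
std::uint32_t ceil_div_pow2(std::uint32_t x, unsigned k) {
    std::uint32_t mask = (std::uint32_t{1} << k) - 1;  // the low k bits hold the remainder
    return (x >> k) + ((x & mask) != 0);               // add 1 only if the remainder is non-zero
}
```

With k = 3 this is exactly the `(x >> 3) + !!(x & 7)` line above.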
This works for positive or negative numbers:
q = x / y + ((x % y != 0) ? !((x > 0) ^ (y > 0)) : 0);
If there is a remainder, checks to see if x and y are of the same sign and adds 1 accordingly.
simplified generic form,
int div_up(int n, int d) {
    return n / d + (((n < 0) ^ (d > 0)) && (n % d));
} //i.e. +1 iff (not exact int && positive result)
For a more generic answer, C++ functions for integer division with well defined rounding strategy
For signed or unsigned integers.
q = x / y + !(((x < 0) != (y < 0)) || !(x % y));
For signed dividends and unsigned divisors.
q = x / y + !((x < 0) || !(x % y));
For unsigned dividends and signed divisors.
q = x / y + !((y < 0) || !(x % y));
For unsigned integers.
q = x / y + !!(x % y);
Zero divisor fails (as with a native operation). Cannot cause overflow.
Corresponding floored and modulo constexpr implementations here, along with templates to select the necessary overloads (as full optimization and to prevent mismatched sign comparison warnings):
https://github.com/libbitcoin/libbitcoin-system/wiki/Integer-Division-Unraveled
Compile with -O3; the compiler optimizes this well.
q = x / y;
if (x % y) ++q;