Find smallest integer greater or equal than x (positive integer) multiple of z (positive integer, probably power of 2) - c++

Probably very easy question, yet I came out with this implementation that looks far too complicated...
unsigned int x;
unsigned int z;
unsigned int makeXMultipleOfZ(const unsigned x, const unsigned z) {
return x + (z - x % z) % z;
//or
//return x + (z - (x + 1) % z - 1); //This generates shorter assembly,
//6 against 8 instructions
}
I would like to avoid if-statements
If this can help we can safely say that z will be a power of 2
In my case z=4 (I know I could replace the modulo operation with a & bit operator), and I was wondering if could come with an implementation that involves less steps.

If z is a power of two, the modulo operation can be reduced to this bitwise operation:
return (x + z - 1) & ~(z - 1);
This logic is very common for data structure boundary alignment, for example. More info here: https://en.wikipedia.org/wiki/Data_structure_alignment

If z is a power of two and the integers are unsigned, the following will work:
x + (z - 1) & ~(z - 1)
I cannot think of a solution using bit-twiddling if z is an arbitrary number.

Related

A bitwise shortcut for calculating the signed result of `(x - y) / z`, given unsigned operands

I'm looking for a neat way (most likely, a "bitwise shortcut") for calculating the signed value of the expression (x - y) / z, given unsigned operands x, y and z.
Here is a "kinda real kinda pseudo" code illustrating what I am currently doing (please don't mind the actual syntax being "100% perfect C or C++"):
int64 func(uint64 x, uint64 y, uint64 z)
{
if (x >= y) {
uint64 result = (x - y) / z;
if (int64(result) >= 0)
return int64(result);
}
else {
uint64 result = (y - x) / z;
if (int64(result) >= 0)
return -int64(result);
}
throwSomeError();
}
Please assume that I don't have a larger type at hand.
I'd be happy to read any idea of how to make this simpler/shorter/neater.
There is a shortcut, by using a bitwise trick for conditional-negation twice (once for the absolute difference, and then again to restore the sign).
I'll use some similar non-perfect C-ish syntax I guess, to match the question.
First get a mask that has all bits set iff x < y:
uint64 m = -uint64(x < y);
(x - y) and -(y - x) are actually the same, even in unsigned arithmetic, and conditional negation can be done by using the definition of two's complement: -a = ~(a - 1) = (a + (-1) ^ -1). (a + 0) ^ 0 is of course equal to a again, so when m is -1, (a + m) ^ m = -a and when m is zero, it is a. So it's a conditional negation.
uint64 absdiff = (x - y + m) ^ m;
Then divide as usual, and restore the sign by doing another conditional negation:
return int64((absdiff / z + m) ^ m);

Using bit manipulation to calculate the mean value of two number?

I find this code :
int mid = (l & r) + ((l ^ r) >> 1)
which is the same as mid=(l+r)/2
but i can't figure why?
Any help? Thanks!
It's not quite the same, the point of it is not being the same. It is mostly the same, but without overflow trouble: if you input two positive numbers, the result will never be negative. That is not true of mid = (l + r) / 2, if you have for example l = 0x7fffffff, r = 1 then the true midpoint is 0x40000000 but the naive midpoint calculation says it is 0xc0000000, a large negative number.
Addition can be decomposed into:
x + y = (x ^ y) + ((x & y) << 1)
That's just a simple "calculate per-digit sum, then apply the carries separately" decomposition. Then shift the whole thing right by 1 while restoring the bits that "fell off the end" by just not shifting left to begin with and shifting the other thing to the right,
x + y = ((x ^ y) >> 1) + (x & y)
Which is that midpoint calculation. Note that it rounds down, not towards zero, which matters for negative results. I would not call the result wrong, it's still halfway in between the endpoints, but it does not match the result from a normal signed division by 2 (usually rounds towards zero, though opinions about how it should round differ).
You can change it to work for all unsigned integers by using an unsigned right shift:
// unsigned midpoint without wrapping/overflow
int mid = (l & r) + ((l ^ r) >>> 1);
Of course being the unsigned midpoint, negative inputs are implicitly treated as very large positive numbers, that's the point.
If you're working with signed-but-non-negative numbers (as is usually the case for midpoint calculation), you can use the significantly simpler
int mid = (x + y) >>> 1

compute midpoint in floating point

Given two floating point numbers (IEEE single or double precision), I would like to find the number that lies half-way between them, but not in the sense of (x+y)/2 but with respect to actually representable numbers.
if both x and y are positive, the following works
float ieeeMidpoint(float x, float y)
{
assert(x >= 0 && y >= 0);
int xi = *(int*)&x;
int yi = *(int*)&y;
int zi = (xi+yi)/2;
return *(float*)&zi;
}
The reason this works is that positive ieee floating point numbers (including subnormals and infinity) keep their order when doing a reinterpreting cast. (this is not true for the 80-bit extended format, but I don't need that anyway).
Now I am looking for an elegant way to do the same that includes the case when one or both of the numbers are negative. Of course it is easy to do with a bunch of if's, but I was wondering if there is some nice bit-magic, prefarably without any branching.
Figured it out myself. the order of negative number is reversed when doing the reinterpreting cast, so that is the only thing one needs to fix. This version is longer than I hoped it would be, but its only some bit-shuffling, so it should be fast.
float ieeeMidpoint(float x, float y)
{
// check for NaN's (Note that subnormals and infinity work fine)
assert(x ==x && y == y);
// re-interpreting cast
int xi = *(int*)&x;
int yi = *(int*)&y;
// reverse negative numbers
// (would look cleaner with an 'if', but I like not branching)
xi = xi ^ ((xi >> 31) & 0x3FFFFFFF);
yi = yi ^ ((yi >> 31) & 0x3FFFFFFF);
// compute average of xi,yi (overflow-safe)
int zi = (xi >> 1) + (yi >> 1) + (xi & 1);
// reverse negative numbers back
zi = zi ^ ((zi >> 31) & 0x3FFFFFFF);
// re-interpreting back to float
return *(float*)&zi;
}

How can you calculate a factor if you have the other factor and the product with overflows?

a * x = b
I have a seemingly rather complicated multiplication / imul problem: if I have a and I have b, how can I calculate x if they're all 32-bit dwords (e.g. 0-1 = FFFFFFFF, FFFFFFFF+1 = 0)?
For example:
0xcb9102df * x = 0x4d243a5d
In that case, x is 0x1908c643. I found a similar question but the premises were different and I'm hoping there's a simpler solution than those given.
Numbers have a modular multiplicative inverse modulo a power of two precisely iff they are odd. Everything else is a bit-shifted odd number (even zero, which might be anything, with all bits shifted out). So there are a couple of cases:
Given a * x = b
tzcnt(a) > tzcnt(b) no solution
tzcnt(a) <= tzcnt(b) solvable, with 2tzcnt(a) solutions
The second case has a special case with 1 solution, for odd a, namely x = inverse(a) * b
More generally, x = inverse(a >> tzcnt(a)) * (b >> tzcnt(a)) is a solution, because you write a as (a >> tzcnt(a)) * (1 << tzcnt(a)), so we cancel the left factor with its inverse, we leave the right factor as part of the result (cannot be cancelled anyway) and then multiply by what remains to get it up to b. Still only works in the second case, obviously. If you wanted, you could enumerate all solutions by filling in all possibilities for the top tzcnt(a) bits.
The only thing that remains is getting the inverse, you've probably seen it in the other answer, whatever it was, but for completeness you can compute it as follows: (not tested)
; input x
dword y = (x * x) + x - 1;
dword t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
; result y

Fast ceiling of an integer division in C / C++

Given integer values x and y, C and C++ both return as the quotient q = x/y the floor of the floating point equivalent. I'm interested in a method of returning the ceiling instead. For example, ceil(10/5)=2 and ceil(11/5)=3.
The obvious approach involves something like:
q = x / y;
if (q * y < x) ++q;
This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?
For positive numbers where you want to find the ceiling (q) of x when divided by y.
unsigned int x, y, q;
To round up ...
q = (x + y - 1) / y;
or (avoiding overflow in x+y)
q = 1 + ((x - 1) / y); // if x != 0
For positive numbers:
q = x/y + (x % y != 0);
Sparky's answer is one standard way to solve this problem, but as I also wrote in my comment, you run the risk of overflows. This can be solved by using a wider type, but what if you want to divide long longs?
Nathan Ernst's answer provides one solution, but it involves a function call, a variable declaration and a conditional, which makes it no shorter than the OPs code and probably even slower, because it is harder to optimize.
My solution is this:
q = (x % y) ? x / y + 1 : x / y;
It will be slightly faster than the OPs code, because the modulo and the division is performed using the same instruction on the processor, because the compiler can see that they are equivalent. At least gcc 4.4.1 performs this optimization with -O2 flag on x86.
In theory the compiler might inline the function call in Nathan Ernst's code and emit the same thing, but gcc didn't do that when I tested it. This might be because it would tie the compiled code to a single version of the standard library.
As a final note, none of this matters on a modern machine, except if you are in an extremely tight loop and all your data is in registers or the L1-cache. Otherwise all of these solutions will be equally fast, except for possibly Nathan Ernst's, which might be significantly slower if the function has to be fetched from main memory.
You could use the div function in cstdlib to get the quotient & remainder in a single call and then handle the ceiling separately, like in the below
#include <cstdlib>
#include <iostream>
int div_ceil(int numerator, int denominator)
{
std::div_t res = std::div(numerator, denominator);
return res.rem ? (res.quot + 1) : res.quot;
}
int main(int, const char**)
{
std::cout << "10 / 5 = " << div_ceil(10, 5) << std::endl;
std::cout << "11 / 5 = " << div_ceil(11, 5) << std::endl;
return 0;
}
There's a solution for both positive and negative x but only for positive y with just 1 division and without branches:
int div_ceil(int x, int y) {
return x / y + (x % y > 0);
}
Note, if x is positive then division is towards zero, and we should add 1 if reminder is not zero.
If x is negative then division is towards zero, that's what we need, and we will not add anything because x % y is not positive
How about this? (requires y non-negative, so don't use this in the rare case where y is a variable with no non-negativity guarantee)
q = (x > 0)? 1 + (x - 1)/y: (x / y);
I reduced y/y to one, eliminating the term x + y - 1 and with it any chance of overflow.
I avoid x - 1 wrapping around when x is an unsigned type and contains zero.
For signed x, negative and zero still combine into a single case.
Probably not a huge benefit on a modern general-purpose CPU, but this would be far faster in an embedded system than any of the other correct answers.
I would have rather commented but I don't have a high enough rep.
As far as I am aware, for positive arguments and a divisor which is a power of 2, this is the fastest way (tested in CUDA):
//example y=8
q = (x >> 3) + !!(x & 7);
For generic positive arguments only, I tend to do it like so:
q = x/y + !!(x % y);
This works for positive or negative numbers:
q = x / y + ((x % y != 0) ? !((x > 0) ^ (y > 0)) : 0);
If there is a remainder, checks to see if x and y are of the same sign and adds 1 accordingly.
simplified generic form,
int div_up(int n, int d) {
return n / d + (((n < 0) ^ (d > 0)) && (n % d));
} //i.e. +1 iff (not exact int && positive result)
For a more generic answer, C++ functions for integer division with well defined rounding strategy
For signed or unsigned integers.
q = x / y + !(((x < 0) != (y < 0)) || !(x % y));
For signed dividends and unsigned divisors.
q = x / y + !((x < 0) || !(x % y));
For unsigned dividends and signed divisors.
q = x / y + !((y < 0) || !(x % y));
For unsigned integers.
q = x / y + !!(x % y);
Zero divisor fails (as with a native operation). Cannot cause overflow.
Corresponding floored and modulo constexpr implementations here, along with templates to select the necessary overloads (as full optimization and to prevent mismatched sign comparison warnings):
https://github.com/libbitcoin/libbitcoin-system/wiki/Integer-Division-Unraveled
Compile with O3, The compiler performs optimization well.
q = x / y;
if (x % y) ++q;