Fast ceiling of an integer division in C / C++ - c++

Given integer values x and y, C and C++ both return as the quotient q = x/y the floor of the floating point equivalent. I'm interested in a method of returning the ceiling instead. For example, ceil(10/5)=2 and ceil(11/5)=3.
The obvious approach involves something like:
q = x / y;
if (q * y < x) ++q;
This requires an extra comparison and multiplication; and other methods I've seen (used in fact) involve casting as a float or double. Is there a more direct method that avoids the additional multiplication (or a second division) and branch, and that also avoids casting as a floating point number?

For positive numbers where you want to find the ceiling (q) of x when divided by y.
unsigned int x, y, q;
To round up ...
q = (x + y - 1) / y;
or (avoiding overflow in x+y)
q = 1 + ((x - 1) / y); // if x != 0

For positive numbers:
q = x/y + (x % y != 0);

Sparky's answer is one standard way to solve this problem, but as I also wrote in my comment, you run the risk of overflows. This can be solved by using a wider type, but what if you want to divide long longs?
Nathan Ernst's answer provides one solution, but it involves a function call, a variable declaration and a conditional, which makes it no shorter than the OPs code and probably even slower, because it is harder to optimize.
My solution is this:
q = (x % y) ? x / y + 1 : x / y;
It will be slightly faster than the OPs code, because the modulo and the division is performed using the same instruction on the processor, because the compiler can see that they are equivalent. At least gcc 4.4.1 performs this optimization with -O2 flag on x86.
In theory the compiler might inline the function call in Nathan Ernst's code and emit the same thing, but gcc didn't do that when I tested it. This might be because it would tie the compiled code to a single version of the standard library.
As a final note, none of this matters on a modern machine, except if you are in an extremely tight loop and all your data is in registers or the L1-cache. Otherwise all of these solutions will be equally fast, except for possibly Nathan Ernst's, which might be significantly slower if the function has to be fetched from main memory.

You could use the div function in cstdlib to get the quotient & remainder in a single call and then handle the ceiling separately, like in the below
#include <cstdlib>
#include <iostream>
int div_ceil(int numerator, int denominator)
{
std::div_t res = std::div(numerator, denominator);
return res.rem ? (res.quot + 1) : res.quot;
}
int main(int, const char**)
{
std::cout << "10 / 5 = " << div_ceil(10, 5) << std::endl;
std::cout << "11 / 5 = " << div_ceil(11, 5) << std::endl;
return 0;
}

There's a solution for both positive and negative x but only for positive y with just 1 division and without branches:
int div_ceil(int x, int y) {
return x / y + (x % y > 0);
}
Note, if x is positive then division is towards zero, and we should add 1 if reminder is not zero.
If x is negative then division is towards zero, that's what we need, and we will not add anything because x % y is not positive

How about this? (requires y non-negative, so don't use this in the rare case where y is a variable with no non-negativity guarantee)
q = (x > 0)? 1 + (x - 1)/y: (x / y);
I reduced y/y to one, eliminating the term x + y - 1 and with it any chance of overflow.
I avoid x - 1 wrapping around when x is an unsigned type and contains zero.
For signed x, negative and zero still combine into a single case.
Probably not a huge benefit on a modern general-purpose CPU, but this would be far faster in an embedded system than any of the other correct answers.

I would have rather commented but I don't have a high enough rep.
As far as I am aware, for positive arguments and a divisor which is a power of 2, this is the fastest way (tested in CUDA):
//example y=8
q = (x >> 3) + !!(x & 7);
For generic positive arguments only, I tend to do it like so:
q = x/y + !!(x % y);

This works for positive or negative numbers:
q = x / y + ((x % y != 0) ? !((x > 0) ^ (y > 0)) : 0);
If there is a remainder, checks to see if x and y are of the same sign and adds 1 accordingly.

simplified generic form,
int div_up(int n, int d) {
return n / d + (((n < 0) ^ (d > 0)) && (n % d));
} //i.e. +1 iff (not exact int && positive result)
For a more generic answer, C++ functions for integer division with well defined rounding strategy

For signed or unsigned integers.
q = x / y + !(((x < 0) != (y < 0)) || !(x % y));
For signed dividends and unsigned divisors.
q = x / y + !((x < 0) || !(x % y));
For unsigned dividends and signed divisors.
q = x / y + !((y < 0) || !(x % y));
For unsigned integers.
q = x / y + !!(x % y);
Zero divisor fails (as with a native operation). Cannot cause overflow.
Corresponding floored and modulo constexpr implementations here, along with templates to select the necessary overloads (as full optimization and to prevent mismatched sign comparison warnings):
https://github.com/libbitcoin/libbitcoin-system/wiki/Integer-Division-Unraveled

Compile with O3, The compiler performs optimization well.
q = x / y;
if (x % y) ++q;

Related

A bitwise shortcut for calculating the signed result of `(x - y) / z`, given unsigned operands

I'm looking for a neat way (most likely, a "bitwise shortcut") for calculating the signed value of the expression (x - y) / z, given unsigned operands x, y and z.
Here is a "kinda real kinda pseudo" code illustrating what I am currently doing (please don't mind the actual syntax being "100% perfect C or C++"):
int64 func(uint64 x, uint64 y, uint64 z)
{
if (x >= y) {
uint64 result = (x - y) / z;
if (int64(result) >= 0)
return int64(result);
}
else {
uint64 result = (y - x) / z;
if (int64(result) >= 0)
return -int64(result);
}
throwSomeError();
}
Please assume that I don't have a larger type at hand.
I'd be happy to read any idea of how to make this simpler/shorter/neater.
There is a shortcut, by using a bitwise trick for conditional-negation twice (once for the absolute difference, and then again to restore the sign).
I'll use some similar non-perfect C-ish syntax I guess, to match the question.
First get a mask that has all bits set iff x < y:
uint64 m = -uint64(x < y);
(x - y) and -(y - x) are actually the same, even in unsigned arithmetic, and conditional negation can be done by using the definition of two's complement: -a = ~(a - 1) = (a + (-1) ^ -1). (a + 0) ^ 0 is of course equal to a again, so when m is -1, (a + m) ^ m = -a and when m is zero, it is a. So it's a conditional negation.
uint64 absdiff = (x - y + m) ^ m;
Then divide as usual, and restore the sign by doing another conditional negation:
return int64((absdiff / z + m) ^ m);

Using bit manipulation to calculate the mean value of two number?

I find this code :
int mid = (l & r) + ((l ^ r) >> 1)
which is the same as mid=(l+r)/2
but i can't figure why?
Any help? Thanks!
It's not quite the same, the point of it is not being the same. It is mostly the same, but without overflow trouble: if you input two positive numbers, the result will never be negative. That is not true of mid = (l + r) / 2, if you have for example l = 0x7fffffff, r = 1 then the true midpoint is 0x40000000 but the naive midpoint calculation says it is 0xc0000000, a large negative number.
Addition can be decomposed into:
x + y = (x ^ y) + ((x & y) << 1)
That's just a simple "calculate per-digit sum, then apply the carries separately" decomposition. Then shift the whole thing right by 1 while restoring the bits that "fell off the end" by just not shifting left to begin with and shifting the other thing to the right,
x + y = ((x ^ y) >> 1) + (x & y)
Which is that midpoint calculation. Note that it rounds down, not towards zero, which matters for negative results. I would not call the result wrong, it's still halfway in between the endpoints, but it does not match the result from a normal signed division by 2 (usually rounds towards zero, though opinions about how it should round differ).
You can change it to work for all unsigned integers by using an unsigned right shift:
// unsigned midpoint without wrapping/overflow
int mid = (l & r) + ((l ^ r) >>> 1);
Of course being the unsigned midpoint, negative inputs are implicitly treated as very large positive numbers, that's the point.
If you're working with signed-but-non-negative numbers (as is usually the case for midpoint calculation), you can use the significantly simpler
int mid = (x + y) >>> 1

How can you calculate a factor if you have the other factor and the product with overflows?

a * x = b
I have a seemingly rather complicated multiplication / imul problem: if I have a and I have b, how can I calculate x if they're all 32-bit dwords (e.g. 0-1 = FFFFFFFF, FFFFFFFF+1 = 0)?
For example:
0xcb9102df * x = 0x4d243a5d
In that case, x is 0x1908c643. I found a similar question but the premises were different and I'm hoping there's a simpler solution than those given.
Numbers have a modular multiplicative inverse modulo a power of two precisely iff they are odd. Everything else is a bit-shifted odd number (even zero, which might be anything, with all bits shifted out). So there are a couple of cases:
Given a * x = b
tzcnt(a) > tzcnt(b) no solution
tzcnt(a) <= tzcnt(b) solvable, with 2tzcnt(a) solutions
The second case has a special case with 1 solution, for odd a, namely x = inverse(a) * b
More generally, x = inverse(a >> tzcnt(a)) * (b >> tzcnt(a)) is a solution, because you write a as (a >> tzcnt(a)) * (1 << tzcnt(a)), so we cancel the left factor with its inverse, we leave the right factor as part of the result (cannot be cancelled anyway) and then multiply by what remains to get it up to b. Still only works in the second case, obviously. If you wanted, you could enumerate all solutions by filling in all possibilities for the top tzcnt(a) bits.
The only thing that remains is getting the inverse, you've probably seen it in the other answer, whatever it was, but for completeness you can compute it as follows: (not tested)
; input x
dword y = (x * x) + x - 1;
dword t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
t = y * x;
y *= 2 - t;
; result y

Find smallest integer greater or equal than x (positive integer) multiple of z (positive integer, probably power of 2)

Probably very easy question, yet I came out with this implementation that looks far too complicated...
unsigned int x;
unsigned int z;
unsigned int makeXMultipleOfZ(const unsigned x, const unsigned z) {
return x + (z - x % z) % z;
//or
//return x + (z - (x + 1) % z - 1); //This generates shorter assembly,
//6 against 8 instructions
}
I would like to avoid if-statements
If this can help we can safely say that z will be a power of 2
In my case z=4 (I know I could replace the modulo operation with a & bit operator), and I was wondering if could come with an implementation that involves less steps.
If z is a power of two, the modulo operation can be reduced to this bitwise operation:
return (x + z - 1) & ~(z - 1);
This logic is very common for data structure boundary alignment, for example. More info here: https://en.wikipedia.org/wiki/Data_structure_alignment
If z is a power of two and the integers are unsigned, the following will work:
x + (z - 1) & ~(z - 1)
I cannot think of a solution using bit-twiddling if z is an arbitrary number.

finding cube root in C++?

Strange things happen when i try to find the cube root of a number.
The following code returns me undefined. In cmd : -1.#IND
cout<<pow(( double )(20.0*(-3.2) + 30.0),( double )1/3)
While this one works perfectly fine. In cmd : 4.93242414866094
cout<<pow(( double )(20.0*4.5 + 30.0),( double )1/3)
From mathematical way it must work since we can have the cube root from a negative number.
Pow is from Visual C++ 2010 math.h library. Any ideas?
pow(x, y) from <cmath> does NOT work if x is negative and y is non-integral.
This is a limitation of std::pow, as documented in the C standard and on cppreference:
Error handling
Errors are reported as specified in math_errhandling
If base is finite and negative and exp is finite and non-integer, a domain error occurs and a range error may occur.
If base is zero and exp is zero, a domain error may occur.
If base is zero and exp is negative, a domain error or a pole error may occur.
There are a couple ways around this limitation:
Cube-rooting is the same as taking something to the 1/3 power, so you could do std::pow(x, 1/3.).
In C++11, you can use std::cbrt. C++11 introduced both square-root and cube-root functions, but no generic n-th root function that overcomes the limitations of std::pow.
The power 1/3 is a special case. In general, non-integral powers of negative numbers are complex. It wouldn't be practical for pow to check for special cases like integer roots, and besides, 1/3 as a double is not exactly 1/3!
I don't know about the visual C++ pow, but my man page says under errors:
EDOM The argument x is negative and y is not an integral value. This would result in a complex number.
You'll have to use a more specialized cube root function if you want cube roots of negative numbers - or cut corners and take absolute value, then take cube root, then multiply the sign back on.
Note that depending on context, a negative number x to the 1/3 power is not necessarily the negative cube root you're expecting. It could just as easily be the first complex root, x^(1/3) * e^(pi*i/3). This is the convention mathematica uses; it's also reasonable to just say it's undefined.
While (-1)^3 = -1, you can't simply take a rational power of a negative number and expect a real response. This is because there are other solutions to this rational exponent that are imaginary in nature.
http://www.wolframalpha.com/input/?i=x^(1/3),+x+from+-5+to+0
Similarily, plot x^x. For x = -1/3, this should have a solution. However, this function is deemed undefined in R for x < 0.
Therefore, don't expect math.h to do magic that would make it inefficient, just change the signs yourself.
Guess you gotta take the negative out and put it in afterwards. You can have a wrapper do this for you if you really want to.
function yourPow(double x, double y)
{
if (x < 0)
return -1.0 * pow(-1.0*x, y);
else
return pow(x, y);
}
Don't cast to double by using (double), use a double numeric constant instead:
double thingToCubeRoot = -20.*3.2+30;
cout<< thingToCubeRoot/fabs(thingToCubeRoot) * pow( fabs(thingToCubeRoot), 1./3. );
Should do the trick!
Also: don't include <math.h> in C++ projects, but use <cmath> instead.
Alternatively, use pow from the <complex> header for the reasons stated by buddhabrot
pow( x, y ) is the same as (i.e. equivalent to) exp( y * log( x ) )
if log(x) is invalid then pow(x,y) is also.
Similarly you cannot perform 0 to the power of anything, although mathematically it should be 0.
C++11 has the cbrt function (see for example http://en.cppreference.com/w/cpp/numeric/math/cbrt) so you can write something like
#include <iostream>
#include <cmath>
int main(int argc, char* argv[])
{
const double arg = 20.0*(-3.2) + 30.0;
std::cout << cbrt(arg) << "\n";
std::cout << cbrt(-arg) << "\n";
return 0;
}
I do not have access to the C++ standard so I do not know how the negative argument is handled... a test on ideone http://ideone.com/bFlXYs seems to confirm that C++ (gcc-4.8.1) extends the cube root with this rule cbrt(x)=-cbrt(-x) when x<0; for this extension you can see http://mathworld.wolfram.com/CubeRoot.html
I was looking for cubit root and found this thread and it occurs to me that the following code might work:
#include <cmath>
using namespace std;
function double nth-root(double x, double n){
if (!(n%2) || x<0){
throw FAILEXCEPTION(); // even root from negative is fail
}
bool sign = (x >= 0);
x = exp(log(abs(x))/n);
return sign ? x : -x;
}
I think you should not confuse exponentiation with the nth-root of a number. See the good old Wikipedia
because the 1/3 will always return 0 as it will be considered as integer...
try with 1.0/3.0...
it is what i think but try and implement...
and do not forget to declare variables containing 1.0 and 3.0 as double...
Here's a little function I knocked up.
#define uniform() (rand()/(1.0 + RAND_MAX))
double CBRT(double Z)
{
double guess = Z;
double x, dx;
int loopbreaker;
retry:
x = guess * guess * guess;
loopbreaker = 0;
while (fabs(x - Z) > FLT_EPSILON)
{
dx = 3 * guess*guess;
loopbreaker++;
if (fabs(dx) < DBL_EPSILON || loopbreaker > 53)
{
guess += uniform() * 2 - 1.0;
goto retry;
}
guess -= (x - Z) / dx;
x = guess*guess*guess;
}
return guess;
}
It uses Newton-Raphson to find a cube root.
Sometime Newton -Raphson gets stuck, if the root is very close to 0 then the derivative can
get large and it can oscillate. So I've clamped and forced it to restart if that happens.
If you need more accuracy you can change the FLT_EPSILONs.
If you ever have no math library you can use this way to compute the cubic root:
cubic root
double curt(double x) {
if (x == 0) {
// would otherwise return something like 4.257959840008151e-109
return 0;
}
double b = 1; // use any value except 0
double last_b_1 = 0;
double last_b_2 = 0;
while (last_b_1 != b && last_b_2 != b) {
last_b_1 = b;
// use (2 * b + x / b / b) / 3 for small numbers, as suggested by willywonka_dailyblah
b = (b + x / b / b) / 2;
last_b_2 = b;
// use (2 * b + x / b / b) / 3 for small numbers, as suggested by willywonka_dailyblah
b = (b + x / b / b) / 2;
}
return b;
}
It is derives from the sqrt algorithm below. The idea is that b and x / b / b bigger and smaller from the cubic root of x. So, the average of both lies closer to the cubic root of x.
Square Root And Cubic Root (in Python)
def sqrt_2(a):
if a == 0:
return 0
b = 1
last_b = 0
while last_b != b:
last_b = b
b = (b + a / b) / 2
return b
def curt_2(a):
if a == 0:
return 0
b = a
last_b_1 = 0;
last_b_2 = 0;
while (last_b_1 != b and last_b_2 != b):
last_b_1 = b;
b = (b + a / b / b) / 2;
last_b_2 = b;
b = (b + a / b / b) / 2;
return b
In contrast to the square root, last_b_1 and last_b_2 are required in the cubic root because b flickers. You can modify these algorithms to compute the fourth root, fifth root and so on.
Thanks to my math teacher Herr Brenner in 11th grade who told me this algorithm for sqrt.
Performance
I tested it on an Arduino with 16mhz clock frequency:
0.3525ms for yourPow
0.3853ms for nth-root
2.3426ms for curt