Rounding integer division without logical operators - c++

I want a function
int rounded_division(const int a, const int b) {
return round(1.0 * a/b);
}
So we have, for example,
rounded_division(3, 2) // = 2
rounded_division(2, 2) // = 1
rounded_division(1, 2) // = 1
rounded_division(0, 2) // = 0
rounded_division(-1, 2) // = -1
rounded_division(-2, 2) // = -1
rounded_division(-3, -2) // = 2
Or in code, where a and b are 32 bit signed integers:
int rounded_division(const int a, const int b) {
return ((a < 0) ^ (b < 0)) ? ((a - b / 2) / b) : ((a + b / 2) / b);
}
And here comes the tricky part: how can this be implemented efficiently (without using larger 64-bit values) and without logical operators such as ?:, &&, ...? Is it possible at all?
The reason I want to avoid logical operators is that the processor I have to implement this function for has no conditional instructions (more about missing conditional instructions on ARM).

a/b + a%b/(b/2 + b%2) works quite well: it has not failed in a billion+ test cases. It meets all of OP's goals: no overflow, no long long, no branching, and it works over the entire range of int whenever a/b is defined.
No 32-bit dependency. If using C99 or later, there are no implementation-defined behavior restrictions.
int rounded_division(int a, int b) {
int q = a / b;
int r = a % b;
return q + r/(b/2 + b%2);
}
This works with two's complement, ones' complement and sign-magnitude, as all operations are arithmetic ones.
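As a small cross-check of the formula above, here is a sketch that compares it against floating-point rounding over a small range. The helper name count_mismatches and the tested range are assumptions made for illustration; std::lround rounds halfway cases away from zero, which matches the behavior requested in the question.

```cpp
#include <cassert>
#include <cmath>

// Formula from the answer above: truncated quotient plus a remainder-based correction.
int rounded_division(int a, int b) {
    int q = a / b;
    int r = a % b;
    return q + r / (b / 2 + b % 2);
}

// Hypothetical checker: counts disagreements with std::lround for |a|, |b| <= limit.
int count_mismatches(int limit) {
    int mismatches = 0;
    for (int a = -limit; a <= limit; ++a) {
        for (int b = -limit; b <= limit; ++b) {
            if (b == 0) continue; // division by zero is undefined
            long expected = std::lround(static_cast<double>(a) / b);
            if (rounded_division(a, b) != expected) ++mismatches;
        }
    }
    return mismatches;
}
```

Calling count_mismatches(200) exercises every sign combination and every halfway case in that range. It does not cover the extremes near INT_MIN and INT_MAX, which need a separate test.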

How about this:
int rounded_division(const int a, const int b) {
return (a + b/2 + b * ((a^b) >> 31))/b;
}
(a ^ b) >> 31 should evaluate to -1 if a and b have different signs and 0 otherwise, assuming int has 32 bits and the leftmost is the sign bit.
EDIT
As pointed out by @chux in his comments, this method is wrong due to integer division. This new version evaluates the same as OP's example, but uses a few more operations.
int rounded_division(const int a, const int b) {
return (a + b * (1 + 2 * ((a^b) >> 31)) / 2)/b;
}
However, this version still does not take the overflow problem into account.
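To make the mask trick concrete, here is a minimal sketch of the two building blocks used above. The function names are hypothetical. It assumes a 32-bit int and an arithmetic right shift of negative values, which is implementation-defined before C++20 but is what mainstream compilers do.

```cpp
#include <cassert>
#include <cstdint>

// -1 when a and b have different signs, 0 otherwise.
// Assumption: >> on a negative value is an arithmetic shift.
std::int32_t sign_differs_mask(std::int32_t a, std::int32_t b) {
    return (a ^ b) >> 31;
}

// +1 when the signs agree, -1 when they differ; this is the factor
// (1 + 2 * ((a^b) >> 31)) that selects between adding and subtracting b/2.
std::int32_t rounding_direction(std::int32_t a, std::int32_t b) {
    return 1 + 2 * sign_differs_mask(a, b);
}
```

Note that zero counts as non-negative here, so rounding_direction(0, b) follows the sign of b alone.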

What about
...
return ((a + (a*b)/abs(a*b) * b / 2) / b);
}
Without overflow:
...
return ((a + ((a/abs(a))*(b/abs(b))) * b / 2) / b);
}

This is a rough approach that you may use: a mask applies a correction when a*b < 0.
Please note that I did not test this thoroughly.
int function(int a, int b){
int tmp = float(a)/b + 0.5;
int mask = (a*b) >> 31; // shift sign bit to set rest of the bits
return tmp - (1 & mask);//minus one if a*b was < 0
}

The following rounded_division_test1() meets OP's requirement of no branching - if one counts sign(int a), nabs(int a), and cmp_le(int a, int b) as non-branching. See here for ideas of how to do sign() without compare operators. These helper functions could be rolled into rounded_division_test1() without explicit calls.
The code demonstrates the correct functionality and is useful for testing various answers. When a/b is defined, this answer does not overflow.
#include <limits.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
int nabs(int a) {
return (a < 0) * a - (a >= 0) * a;
}
int sign(int a) {
return (a > 0) - (a < 0);
}
int cmp_le(int a, int b) {
return (a <= b);
}
int rounded_division_test1(int a, int b) {
int q = a / b;
int r = a % b;
int flag = cmp_le(nabs(r), (nabs(b) / 2 + nabs(b % 2)));
return q + flag * sign(b) * sign(r);
}
// Alternative that uses long long
int rounded_division_test1LL(int a, int b) {
int c = (a^b)>>31;
return (a + (c*2 + 1)*1LL*b/2)/b;
}
// Reference code
int rounded_division(int a, int b) {
return round(1.0*a/b);
}
int test(int a, int b) {
int q0 = rounded_division(a, b);
//int q1 = function(a,b);
int q1 = rounded_division_test1(a, b);
if (q0 != q1) {
printf("%d %d --> %d %d\n", a, b, q0, q1);
fflush(stdout);
}
return q0 != q1;
}
void tests(void) {
int err = 0;
int const a[] = { INT_MIN, INT_MIN + 1, INT_MIN + 1, -3, -2, -1, 0, 1, 2, 3,
INT_MAX - 1, INT_MAX };
for (unsigned i = 0; i < sizeof a / sizeof a[0]; i++) {
for (unsigned j = 0; j < sizeof a / sizeof a[0]; j++) {
if (a[j] == 0) continue;
if (a[i] == INT_MIN && a[j] == -1) continue;
err += test(a[i], a[j]);
}
}
printf("Err %d\n", err);
}
int main(void) {
tests();
return 0;
}

Let me give my contribution:
What about:
int rounded_division(const int a, const int b) {
return a/b + (2*(a%b))/b;
}
No branches, no logical operators, only arithmetic operators. But it can fail if b is greater than INT_MAX/2 or less than INT_MIN/2.
If 64-bit intermediates are allowed when rounding 32-bit values, it will not fail:
int rounded_division(const int a, const int b) {
return a/b + (2LL*(a%b))/b;
}

Code that I came up with for use on ARM M0 (no floating point, slow divide).
It only uses one divide operation and no conditionals, but will overflow if numerator + (denominator/2) > INT_MAX.
Cycle count on ARM M0 = 7 cycles + the divide (M0 has no divide instruction, so it is toolchain dependent).
int32_t Int32_SignOf(int32_t val)
{
return (+1 | (val >> 31)); // if v < 0 then -1, else +1
}
uint32_t Int32_Abs(int32_t val)
{
int32_t tmp = val ^ (val >> 31);
return (tmp - (val >> 31));
// the following code looks like it should be faster, using subexpression elimination
// except on arm a bitshift is free when performed with another operation,
// so it would actually end up being slower
// tmp = val >> 31;
// dst = val ^ (tmp);
// dst -= tmp;
// return dst;
}
int32_t Int32_DivRound(int32_t numerator, int32_t denominator)
{
// use the absolute (unsigned) denominator in the fudge value
// as the divide by 2 then becomes a bitshift
int32_t sign_num = Int32_SignOf(numerator);
uint32_t abs_denom = Int32_Abs(denominator);
return (numerator + sign_num * ((int32_t)(abs_denom / 2u))) / denominator;
}

Since the function seems to be symmetric, how about sign(a/b)*floor(abs(a/b)+0.5)?

Related

I encountered the 10^9+7 problem but I can't understand the relation between the distributive properties of mod and my problem

Given 3 numbers a, b, c, compute a^b, b^a and c^x, where x is the absolute difference between b and a, and print each one mod 10^9+7 in ascending order.
I searched the web for how to use the distributive property of mod, but I didn't understand it since I am a beginner.
I use very simple for loops, so understanding this problem is a bit hard for me. How can I relate these mod rules to powers in loops? If anyone can help me I would be so happy.
Note that the time limit is 1 second, which makes it harder.
I tried to reduce the result on every loop iteration and then multiply it by the original number.
For example, for 2^3, after cin>>a the variable a would be 2 and num = a, and each iteration would look like this:
a = (a % 10^9 + 7) * num. This works for very small inputs, but for large ones it exceeds the time limit.
#include <iostream>
#include <cmath>
using namespace std;
int main ()
{
long long a,b,c,one,two,thr;
long long x;
long long mod = 1e9+7;
cin>>a>>b>>c;
one = a;
two = b;
thr = c;
if (a>=b)
x = a - b;
else
x = b - a;
for(int i = 0; i < b-1;i++)
{
a = ((a % mod) * (one%mod))%mod;
}
for(int j = 0; j < a-1;j++)
{
b = ((b % mod) * (two%mod))%mod;
}
for(int k = 0; k < x-1;k++)
{
c = ((c % mod) * (thr%mod))%mod;
}
}
I use very simple for loops [...] this works for very small inputs, but large ones it exceeds time.
There is an algorithm called "exponentiation by squaring" that has logarithmic time complexity, rather than linear.
It works by breaking down the exponent while increasing the base.
Consider, e.g., x^355. Instead of multiplying x 354 times, we can observe that
x^355 = x·x^354 = x·(x^2)^177 = x·x^2·(x^2)^176 = x·x^2·(x^4)^88 = x·x^2·(x^8)^44 = x·x^2·(x^16)^22 = x·x^2·(x^32)^11 = x·x^2·x^32·(x^32)^10 = x·x^2·x^32·(x^64)^5 = x·x^2·x^32·x^64·(x^64)^4 = x·x^2·x^32·x^64·(x^128)^2 = x^1·x^2·x^32·x^64·x^256
That took "only" 12 steps.
To implement it, we only need to be able to perform modular multiplications safely, without overflowing. Given the value of the modulus, a type like std::int64_t is wide enough.
#include <iostream>
#include <cstdint>
#include <limits>
#include <cassert>
namespace modular
{
auto exponentiation(std::int64_t base, std::int64_t exponent) -> std::int64_t;
}
int main()
{
std::int64_t a, b, c;
std::cin >> a >> b >> c;
auto const x{ b < a ? a - b : b - a };
std::cout << modular::exponentiation(a, b) << '\n'
<< modular::exponentiation(b, a) << '\n'
<< modular::exponentiation(c, x) << '\n';
return 0;
}
namespace modular
{
constexpr std::int64_t M{ 1'000'000'007 };
// We need the mathematical modulo
auto from(std::int64_t x)
{
static_assert(M > 0);
x %= M;
return x < 0 ? x + M : x;
}
// It assumes that both a and b are already mod M
auto multiplication_(std::int64_t a, std::int64_t b)
{
assert( 0 <= a and a < M and 0 <= b and b < M );
assert( b == 0 or a <= std::numeric_limits<int64_t>::max() / b );
return (a * b) % M;
}
// Implements exponentiation by squaring
auto exponentiation(std::int64_t base, std::int64_t exponent) -> std::int64_t
{
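assert( exponent >= 0 );
if ( exponent == 0 ) return from(1); // x^0 == 1; without this the loop below would return base % M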
auto b{ from(base) };
std::int64_t x{ 1 };
while ( exponent > 1 )
{
if ( exponent % 2 != 0 )
{
x = multiplication_(x, b);
--exponent;
}
b = multiplication_(b, b);
exponent /= 2;
}
return multiplication_(b, x);
}
}
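The count of 12 steps for x^355 can be reproduced with a short sketch. The helper name count_multiplications is hypothetical. It mirrors the square-and-multiply loop but only counts operations: one for each squaring and one for each time an odd exponent forces a factor to be peeled off.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of multiplications that exponentiation by
// squaring needs for a given exponent >= 1 (squarings plus peeled-off factors).
int count_multiplications(std::uint64_t exponent) {
    assert(exponent >= 1);
    int squarings = 0;
    int peeled = 0;
    while (exponent > 1) {
        if (exponent % 2 != 0) ++peeled; // odd exponent: set one factor aside
        ++squarings;                     // square the base, halve the exponent
        exponent /= 2;
    }
    return squarings + peeled;
}
```

For 355 this gives 8 squarings plus 4 peeled-off factors, compared with 354 multiplications for the naive loop.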

How to modulo this formula?

I would like to write this formula in C++ language:
(2<=n<=1e5), (1<=k<=n), (2<=M<=1e9).
I would like to do this without using special structures.
Unfortunately, this formula has many cases that make reducing it modulo M difficult. Example: ((n-k)!) mod M can be equal to 0, or ((n-1)(n-2))/4 may not be an integer. I will be very grateful for any help.
(n−1)!/(n−k)! can be handled by computing the product (n−k+1)…(n−1).
(n−1)! (n−1)(n−2)/4 can be handled by handling n ≤ 2 (0) and n ≥ 3
(3…(n−1) (n−1)(n−2)/2) separately.
Untested C++:
#include <cassert>
#include <cstdint>
class Residue {
public:
// Accept int64_t for convenience.
explicit Residue(int64_t rep, int32_t modulus) : modulus_(modulus) {
assert(modulus > 0);
rep_ = rep % modulus;
if (rep_ < 0)
rep_ += modulus;
}
// Return int64_t for convenience.
int64_t rep() const { return rep_; }
int32_t modulus() const { return modulus_; }
private:
int32_t rep_;
int32_t modulus_;
};
Residue operator+(Residue a, Residue b) {
assert(a.modulus() == b.modulus());
return Residue(a.rep() + b.rep(), a.modulus());
}
Residue operator-(Residue a, Residue b) {
assert(a.modulus() == b.modulus());
return Residue(a.rep() - b.rep(), a.modulus());
}
Residue operator*(Residue a, Residue b) {
assert(a.modulus() == b.modulus());
return Residue(a.rep() * b.rep(), a.modulus());
}
Residue QuotientOfFactorialsMod(int32_t a, int32_t b, int32_t modulus) {
assert(modulus > 0);
assert(b >= 0);
assert(a >= b);
Residue result(1, modulus);
// Don't initialize with b + 1 because it could overflow.
for (int32_t i = b; i < a; i++) {
result = result * Residue(i + 1, modulus);
}
return result;
}
Residue FactorialMod(int32_t a, int32_t modulus) {
assert(modulus > 0);
assert(a >= 0);
return QuotientOfFactorialsMod(a, 0, modulus);
}
Residue Triangular(int32_t a, int32_t modulus) {
assert(modulus > 0);
return Residue((static_cast<int64_t>(a) + 1) * a / 2, modulus);
}
Residue F(int32_t n, int32_t k, int32_t m) {
assert(n >= 2);
assert(n <= 100000);
assert(k >= 1);
assert(k <= n);
assert(m >= 2);
assert(m <= 1000000000);
Residue n_res(n, m);
Residue n_minus_1(n - 1, m);
Residue n_minus_2(n - 2, m);
Residue k_res(k, m);
Residue q = QuotientOfFactorialsMod(n - 1, n - k, m);
return q * (k_res - n_res) * n_minus_1 +
(FactorialMod(n - 1, m) - q) * k_res * n_minus_1 +
(n > 2 ? QuotientOfFactorialsMod(n - 1, 2, m) *
(n_res * n_minus_1 + Triangular(n - 2, m))
: Residue(1, m));
}
As mentioned in the other answer, the quotient of factorials can be evaluated directly without division. You also need 64-bit arithmetic to store intermediate results, and you should apply the modulo after each multiplication; otherwise you would need huge numbers that take forever to compute.
You also mention that ((n-1)(n-2))/4 may not be an integer. How to deal with that is questionable, as we do not have any context for what you are doing. However, you can move the /2 in front of the brackets (apply it to (n-1)!, i.e. call modpi starting from 3 so the factor 2 is skipped; beware not to divide the already reduced factorial!). Then there is no remainder, as (n-1)*(n-2)/4 becomes (n-1)*(n-2)/2 and (n-1)*(n-2) is always even (divisible by 2). The only "problem" is n=2, where n*(n-1)/2 is 1 but the /2 moved in front of the brackets rounds (n-1)! down, so you should handle it as a special case by not moving the /2 (not included in the code below).
I see it like this:
typedef unsigned __int64 u64;
u64 modpi(u64 x0,u64 x1,u64 p) // ( x0*(x0+1)*(x0+2)*...*x1 ) mod p
{
u64 x,y;
if (x0>x1){ x=x0; x0=x1; x1=x; }
for (y=1,x=x0;x<=x1;x++){ y*=x; y%=p; }
return y;
}
void main()
{
u64 n=100,k=20,m=123456789,a,b,b2,c,y;
a =modpi(n-k+1,n-1,m); // (n-1)!/(n-k)!
b =modpi(1,n-1,m); // (n-1)! mod m
b2=modpi(3,n-1,m); // (n-1)!/2 mod m
c =((n*(n-1)))%m; // 2*( n*(n-1)/2 + (n-1)*(n-2)/4 ) mod m
c+=(((n-1)*(n-2))/2)%m;
y =(((a*(k-n))%m)*(n-1))%m; // ((n-1)!/(n-k)!)*(k-1)*(n-1) mod m
y+=b; // (n-1)! mod m
y-=(((a*k)%m)*(n-1))%m; // ((n-1)!/(n-k)!)*k*(n-1) mod m
y+=(b2*c)%m; // (n-1)!*( n*(n-1)/2 + (n-1)*(n-2)/4 ) mod m
// here y should hold your answer
}
However, be careful: older compilers do not fully support 64-bit integers and can produce wrong results or even fail to compile. In that case, use a big-integer library, compute using two 32-bit variables, or look for a 32-bit modmul implementation.
The expression implies the use of a floating point type. Therefore, use the function fmod to get the remainder of the division.

C++ combination function always resulting 0

Can anybody tell me why my Combination function always results in 0?
I also tried to make it calculate the combination using only the factorial function, without the permutation function, but the result is still 0.
#include <iostream>
#include <cmath>
using namespace std;
int factorial(int& n)
{
if (n <= 1)
{
return 1;
}
else
{
n = n-1;
return (n+1) * factorial(n);
}
}
int permutation(int& a, int& b)
{
int x = a-b;
return factorial(a) / factorial(x);
}
int Combination(int& a, int& b)
{
return permutation(a,b) / factorial(b);
}
int main()
{
int f, s;
cin >> f >> s;
cout << permutation(f,s) << endl;
cout << Combination(f,s);
return 0;
}
Your immediate problem is that you pass a modifiable reference to your function. This means that the result depends on an unspecified order of evaluation here:
return (n+1) * factorial(n);
// ^^^ ^^^
because factorial(n) modifies n, and is indeterminately sequenced with (n+1). A similar problem exists in Combination(), where permutation(a,b) reads b and factorial(b) modifies it in the same expression:
return permutation(a,b) / factorial(b);
// ^^^ ^^^
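The references also leak changes back to the caller: once permutation(f,s) has run in main(), f has been counted down to 1 (for any f greater than 1), so the following Combination(f,s) call no longer sees the value you entered. You will get correct results if you pass n, a and b by value, like this: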
int factorial(int n)
Now, factorial() gets its own copy of n, and doesn't affect the n+1 you're multiplying it with.
While we're here, I should point out some other flaws in the code.
Avoid using namespace std; - it has traps for the unwary (and even for the wary!).
You can write factorial() without modifying n once you pass by value (rather than by reference):
int factorial(const int n)
{
if (n <= 1) {
return 1;
} else {
return n * factorial(n-1);
}
}
Consider using iterative code to compute factorial.
We should probably be using unsigned int, since the operations are meaningless for negative numbers. You might consider unsigned long or unsigned long long for greater range.
Computing one factorial and dividing by another is not only inefficient, it also risks unnecessary overflow (when a is as low as 13, with 32-bit int). Instead, we can multiply just down to the other number:
unsigned int permutation(const unsigned int a, const unsigned int b)
{
if (a < b) return 0;
unsigned int permutations = 1;
for (unsigned int i = a; i > a-b; --i) {
permutations *= i;
}
return permutations;
}
This works with much higher a, when b is small.
We didn't need the <cmath> header for anything.
Suggested fixed code:
unsigned int factorial(const unsigned int n)
{
unsigned int result = 1;
for (unsigned int i = 2; i <= n; ++i) {
result *= i;
}
return result;
}
unsigned int permutation(const unsigned int a, const unsigned int b)
{
if (a < b) return 0;
unsigned int result = 1;
for (unsigned int i = a; i > a-b; --i) {
result *= i;
}
return result;
}
unsigned int combination(const unsigned int a, const unsigned int b)
{
// C(a, b) == C(a, a - b), but it's faster to compute with small b
if (b > a - b) {
return combination(a, a - b);
}
return permutation(a,b) / factorial(b);
}
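The overflow concern raised above also applies to the final division by factorial(b). One way around it is to interleave multiplication and division so that every intermediate value is itself a binomial coefficient. This is a sketch under that idea, not the answer's own code; the name binomial is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: C(n, k) without computing any factorial.
unsigned long long binomial(unsigned int n, unsigned int k) {
    if (k > n) return 0;
    if (k > n - k) k = n - k; // C(n, k) == C(n, n - k); use the smaller one
    unsigned long long result = 1;
    for (unsigned int i = 1; i <= k; ++i) {
        // After this step result == C(n - k + i, i), so the division is exact.
        result = result * (n - k + i) / i;
    }
    return result;
}
```

The intermediate product can still overflow for large n, but only when the result is within a factor of k of the type's limit, which is far later than with separate factorials.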
You don't calculate with the pointer value; you calculate with the pointer address.

To find combination value of large numbers

I want to find (n choose r) for large integers, and I also have to find out the mod of that number.
long long int choose(int a,int b)
{
if (b > a)
return (-1);
if(b==0 || a==1 || b==a)
return(1);
else
{
long long int r = ((choose(a-1,b))%10000007+(choose(a-1,b- 1))%10000007)%10000007;
return r;
}
}
I am using this piece of code, but I am getting TLE. If there is some other method to do that please tell me.
I don't have the reputation to comment yet, but I wanted to point out that the answer by rock321987 works pretty well:
It is fast and correct up to and including C(62, 31)
but cannot handle all inputs that have an output that fits in a uint64_t. As proof, try:
C(67, 33) = 14,226,520,737,620,288,370 (verify correctness and size)
Unfortunately, the other implementation spits out 8,829,174,638,479,413 which is incorrect. There are other ways to calculate nCr which won't break like this, however the real problem here is that there is no attempt to take advantage of the modulus.
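When the modulus p is prime, we can leverage the fact that every integer not divisible by p has an inverse mod p, and that inverse is unique. Furthermore, we can find that inverse quite quickly. (Be aware that the 10000007 used in the question is not prime, since 10000007 = 941 × 10627, so the method below needs a prime modulus such as 1000000007.) Another question has an answer on how to do that here, which I've replicated below.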
This is handy since:
x/y mod p == x*(y inverse) mod p; and
xy mod p == (x mod p)(y mod p) mod p
Modifying the other code a bit, and generalizing the problem we have the following:
#include <iostream>
#include <cstdint>
#include <assert.h>
// p MUST be prime and less than 2^32, so that products of two residues fit in 64 bits
uint64_t inverseModp(uint64_t a, uint64_t p) {
assert(p < (1ull << 32));
assert(a < p);
assert(a != 0);
uint64_t ex = p-2, result = 1;
while (ex > 0) {
if (ex % 2 == 1) {
result = (result*a) % p;
}
a = (a*a) % p;
ex /= 2;
}
return result;
}
// p MUST be prime
uint32_t nCrModp(uint32_t n, uint32_t r, uint32_t p)
{
assert(r <= n);
if (r > n-r) r = n-r;
if (r == 0) return 1;
if(n/p - (n-r)/p > r/p) return 0;
uint64_t result = 1; //intermediary results may overflow 32 bits
for (uint32_t i = n, x = 1; i > r; --i, ++x) {
if( i % p != 0) {
result *= i % p;
result %= p;
}
if( x % p != 0) {
result *= inverseModp(x % p, p);
result %= p;
}
}
return result;
}
int main() {
uint32_t smallPrime = 17;
uint32_t medNum = 3001;
uint32_t halfMedNum = medNum >> 1;
std::cout << nCrModp(medNum, halfMedNum, smallPrime) << std::endl;
uint32_t bigPrime = 4294967291ul; // 2^32-5 is largest prime < 2^32
uint32_t bigNum = 1ul << 24;
uint32_t halfBigNum = bigNum >> 1;
std::cout << nCrModp(bigNum, halfBigNum, bigPrime) << std::endl;
}
Which should produce results for any set of 32-bit inputs if you are willing to wait. To prove a point, I've included the calculation for a 24-bit n, and the maximum 32-bit prime. My modest PC took ~13 seconds to calculate this. Check the answer against wolfram alpha, but beware that it may exceed the 'standard computation time' there.
There is still room for improvement if p is much smaller than (n-r) where r <= n-r. For example, we could precalculate all the inverses mod p instead of doing it on demand several times over.
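As a sketch of the precalculation idea, all inverses mod a prime p can be filled in with one pass using the identity inv[i] = -(p / i) * inv[p % i] mod p. The function name inverse_table is hypothetical, and the table needs p entries, so this only makes sense when p is small enough to hold in memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: inv[i] * i % p == 1 for every 1 <= i < p.
// Assumption: p is prime, so p % i is never 0 for 2 <= i < p.
std::vector<std::uint64_t> inverse_table(std::uint32_t p) {
    std::vector<std::uint64_t> inv(p, 0);
    if (p > 1) inv[1] = 1;
    for (std::uint32_t i = 2; i < p; ++i) {
        // p % i < i, so inv[p % i] has already been computed.
        inv[i] = (p - (p / i) * inv[p % i] % p) % p;
    }
    return inv;
}
```

With such a table, the call to inverseModp(x % p, p) inside the loop could be replaced by a lookup, trading O(p) memory for the repeated exponentiations.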
nCr = n! / (r! * (n-r)!) {! = factorial}
Now pick whichever of r and n - r is smaller:
#include <cstdio>
#include <cmath>
#define MOD 10000007
int main()
{
int n, r, i, x = 1;
long long int res = 1;
scanf("%d%d", &n, &r);
int mini = fmin(r, (n - r));//minimum of r,n-r
for (i = n;i > mini;i--) {
res = (res * i) / x;
x++;
}
printf("%lld\n", res % MOD);
return 0;
}
It will work for most cases required by programming competitions, as long as n and r are not too high.
Time complexity: O(max(r, n - r)) as written, since the loop runs from n down to min(r, n - r) + 1.
Limitation: in languages like C/C++ there will be overflow if
n > 60 (approximately),
as no built-in data type can store the final value.
The expansion of nCr can always be reduced to product of integers. This is done by canceling out terms in denominator. This approach is applied in the function given below.
This function has time complexity of O(n^2 * log(n)). This will calculate nCr % m for n<=10000 under 1 sec.
#include <numeric>
#include <algorithm>
using namespace std; // for min and iota; __gcd is a GNU libstdc++ extension
int M=1e7+7;
int ncr(int n, int r)
{
r=min(r,n-r);
int A[r],i,j,B[r];
iota(A,A+r,n-r+1); //initializing A starting from n-r+1 to n
iota(B,B+r,1); //initializing B starting from 1 to r
int g;
for(i=0;i<r;i++)
for(j=0;j<r;j++)
{
if(B[i]==1)
break;
g=__gcd(B[i], A[j] );
A[j]/=g;
B[i]/=g;
}
long long ans=1;
for(i=0;i<r;i++)
ans=(ans*A[i])%M;
return ans;
}
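The recurrence from the question can also be evaluated bottom-up, which avoids the exponential recursion and needs no division, so it works for any modulus, prime or not. The following is a sketch of that idea; the name choose_mod is hypothetical, and the running time is O(n * r), so it suits moderate n only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: C(n, r) mod m via one rolling row of Pascal's triangle.
std::uint32_t choose_mod(std::uint32_t n, std::uint32_t r, std::uint32_t m) {
    assert(m > 0);
    if (r > n) return 0;
    std::vector<std::uint32_t> row(r + 1, 0);
    row[0] = 1 % m;
    for (std::uint32_t i = 1; i <= n; ++i) {
        std::uint32_t top = i < r ? i : r;
        // Walk right to left so row[j - 1] still holds the previous row's value.
        for (std::uint32_t j = top; j >= 1; --j) {
            std::uint64_t sum = static_cast<std::uint64_t>(row[j]) + row[j - 1];
            row[j] = static_cast<std::uint32_t>(sum % m);
        }
    }
    return row[r];
}
```

For example, choose_mod(n, r, 10000007) uses the modulus from the question directly, without requiring it to be prime.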

Pow() calculates wrong?

I need to use pow in my C++ program. If I call the pow() function this way:
long long test = pow(7, e);
where e is an integer with the value 23,
I always get 821077879 as a result. If I calculate it with the Windows calculator I get 27368747340080916343. What's wrong here?
I tried to cast to different types but nothing helped. What could be the reason for this? How can I use pow() correctly?
Thanks!
The result doesn't fit in long long.
If you want to deal with very big numbers then use a library like GMP
Or store it as a floating point (which won't be as precise).
Applying modulo:
const unsigned int b = 5; // base
const unsigned int e = 27; // exponent
const unsigned int m = 7; // modulo
unsigned int r = 1; // remainder
for (int i = 0; i < e; ++i)
r = (r * b) % m;
// r is now (pow(5,27) % 7)
7^23 is too big to fit into a long long (assuming it's 64 bits). The value is getting truncated.
Edit: Oh, why didn't you say that you wanted pow(b, e) % m instead of just pow(b, e)? That makes things a whole lot simpler, because you don't need bigints after all. Just do all your arithmetic mod m. Pubby's solution works, but here's a faster one (O(log e) instead of O(e)).
unsigned int powmod(unsigned int b, unsigned int e, unsigned int m)
{
assert(m != 0);
if (e == 0)
{
return 1;
}
else if (e % 2 == 0)
{
unsigned int squareRoot = powmod(b, e / 2, m);
return (squareRoot * squareRoot) % m;
}
else
{
return (powmod(b, e - 1, m) * b) % m;
}
}
See it live: https://ideone.com/YsG7V
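One caveat with the recursive version above is that squareRoot * squareRoot is computed in unsigned int, so it can wrap around once m exceeds 65536 (assuming a 32-bit unsigned int). Below is a sketch of an iterative variant that keeps intermediates in 64 bits; the name powmod_u32 is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: (b^e) mod m with 64-bit intermediates,
// so products of two values below 2^32 cannot overflow.
std::uint32_t powmod_u32(std::uint32_t b, std::uint32_t e, std::uint32_t m) {
    assert(m != 0);
    std::uint64_t result = 1 % m; // handles m == 1
    std::uint64_t base = b % m;
    while (e > 0) {
        if (e & 1u) result = (result * base) % m; // use this bit of the exponent
        base = (base * base) % m;                 // square for the next bit
        e >>= 1;
    }
    return static_cast<std::uint32_t>(result);
}
```

It performs the same O(log e) number of multiplications as the recursive version, without the recursion depth.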
#include<iostream>
#include<cmath>
int main()
{
long double ldbl = pow(7, 23);
double dbl = pow(7, 23);
std::cout << ldbl << ", " << dbl << std::endl;
}
Output: 2.73687e+19, 2.73687e+19
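To see exactly where integer arithmetic stops being enough, here is a sketch of an integer power that reports overflow instead of silently wrapping. The name checked_pow_u64 is hypothetical; it uses a plain loop because the point is the overflow check, not speed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: writes base^exponent to out and returns true,
// or returns false (leaving out untouched) if the value needs more than 64 bits.
bool checked_pow_u64(std::uint64_t base, unsigned exponent, std::uint64_t& out) {
    std::uint64_t result = 1;
    for (unsigned i = 0; i < exponent; ++i) {
        if (base != 0 && result > std::numeric_limits<std::uint64_t>::max() / base) {
            return false; // the next multiplication would exceed 64 bits
        }
        result *= base;
    }
    out = result;
    return true;
}
```

With this helper, 7^22 still fits in an unsigned 64-bit integer while 7^23 does not, which is consistent with the truncated value seen in the question.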