How to calculate a!/(b1! b2! ... bm!) modulo p, where p is a prime number? The factorial of a and b can be very big (long long int is not sufficient) so I need to pass to modulo.
If a, bs and p are fairly small, prefer @KellyBundy's approach of cancelling factors, or counting prime factors.
Multiplication and modular arithmetic
Given integers m and n and some other integer k:
(m * n) modulo k = ((m modulo k) * (n mod k)) modulo k
This allows a large product to be calculated modulo k without worrying about overflow, since we can always keep the arguments in the range [0, k).
For example to compute the factorial a! modulo k, in python:
def fact(a, k):
    if a == 0:
        return 1
    else:
        return ((a % k) * fact(a - 1, k)) % k
Division and modular arithmetic
If p is a prime then for any integer n that is not divisible by p, we can find an integer which I'll call inv(n) such that:
(n * inv(n)) modulo p = 1
This number is called the modular inverse of n. There are various algorithms to find modular inverses, which I won't describe here (but see e.g. here).
Now, given integers n and m, and assuming that m / n is an integer, we can apply the rule:
(m / n) modulo p = (m * inv(n)) modulo p
So provided we can calculate modular inverses, we can convert division to multiplication, and then apply the previous rule.
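The two rules above can be combined into a short C++ sketch for the original question. It is only an illustration under stated assumptions: p is prime, p is below 2^32 so that products fit in 64 bits, and every b is smaller than p so that each b! is invertible. The inverse is taken with Fermat's little theorem (x^(p-2) mod p), which is one of several possible methods, and the names pow_mod, fact_mod and multinomial_mod are made up for this example.

```cpp
#include <cstdint>
#include <vector>

// Modular exponentiation by squaring; assumes mod < 2^32 so products fit in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// a! mod p, reducing after every multiplication.
std::uint64_t fact_mod(std::uint64_t a, std::uint64_t p) {
    std::uint64_t f = 1 % p;
    for (std::uint64_t i = 2; i <= a; ++i) f = f * (i % p) % p;
    return f;
}

// a! / (b1! b2! ... bm!) mod p, assuming p is prime and every b is below p.
std::uint64_t multinomial_mod(std::uint64_t a,
                              const std::vector<std::uint64_t>& bs,
                              std::uint64_t p) {
    std::uint64_t den = 1 % p;
    for (std::uint64_t b : bs) den = den * fact_mod(b, p) % p;
    // Division becomes multiplication by the inverse den^(p-2) mod p.
    return fact_mod(a, p) * pow_mod(den, p - 2, p) % p;
}
```

For example, multinomial_mod(5, {2, 3}, 7) evaluates 10 mod 7 and gives 3.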
Another way, listing the factors 1 to a, then canceling with all divisors, then multiplying modulo p:
#include <iostream>
#include <vector>
int gcd(int a, int b) {
return b ? gcd(b, a % b) : a;
}
int main() {
int a = 60;
std::vector<int> bs = {13, 7, 19};
int p = 10007;
std::vector<int> factors(a);
for (int i=0; i<a; i++)
factors[i] = i + 1;
for (int b : bs) {
while (b > 1) {
int d = b--;
for (int& f : factors) {
int g = gcd(f, d);
f /= g;
d /= g;
}
}
}
int result = 1;
for (int f : factors)
result = result * f % p;
std::cout << result;
}
Prints 5744, same as this Python code:
from math import factorial, prod
a = 60
bs = [13, 7, 19]
p = 10007
num = factorial(a)
den = prod(map(factorial, bs))
print(num // den % p)
I am trying to find
(a^b) % mod
where b and mod are up to 10^9, while a can be really large. I have tested inputs of up to 48 digits with success
using this relation
(a^b) % mod = (a%mod)^b % mod
#include <string>
using std::string;
#define ll long long int
ll powerLL(ll x, ll n,ll MOD)
{
ll result = 1;
while (n) {
if (n & 1)
result = result * x % MOD;
n = n / 2;
x = x * x % MOD;
}
return result;
}
ll powerStrings(string sa, string sb,ll MOD)
{
ll a = 0, b = 0;
for (size_t i = 0; i < sa.length(); i++)
a = (a * 10 + (sa[i] - '0')) % MOD;
for (size_t i = 0; i < sb.length(); i++)
b = (b * 10 + (sb[i] - '0')) % (MOD - 1);
return powerLL(a, b,MOD);
}
powerStrings("5109109785634228366587086207094636370893763284000","362323789",354252525) returns 208624800 but it should return 323419500. In this case a is 49 digits
powerStrings("300510498717329829809207642824818434714870652000","362323489",354255221) returns 282740484 , which is correct. In this case a is 48 digits
Is something wrong with the code, or will I have to use another method to do the same?
It does not work because it is not mathematically correct.
In general, we have that pow(a, n, m) = pow(a, n % λ(m), m) (with a coprime to m) where λ is the Carmichael function. As a special case, when m is a prime number, then λ(m) = m - 1. That situation is also covered by Fermat's little theorem. That's only a special case, it does not always work.
λ(354252525) = 2146980, if I hack that in then the right result comes out. (the base is not actually coprime to the modulus though)
In general you would need to compute the Carmichael function for the modulus, which is non-trivial, but feasible for small moduli.
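For small moduli, the Carmichael function can be computed by factoring m with trial division. The sketch below is one way to do that, not the only one. The names gcd_u64 and carmichael are invented for this example, and it assumes m is small enough for trial division to finish quickly.

```cpp
#include <cstdint>

// Greatest common divisor (helper for the least common multiple below).
std::uint64_t gcd_u64(std::uint64_t x, std::uint64_t y) {
    while (y != 0) {
        std::uint64_t t = x % y;
        x = y;
        y = t;
    }
    return x;
}

// Carmichael function lambda(m), computed from the prime factorization of m.
std::uint64_t carmichael(std::uint64_t m) {
    std::uint64_t result = 1;
    for (std::uint64_t q = 2; q * q <= m; ++q) {
        if (m % q != 0) continue;
        std::uint64_t pk = 1;  // q^e, the full power of q dividing m
        int e = 0;
        while (m % q == 0) { m /= q; pk *= q; ++e; }
        // lambda(q^e) = q^(e-1) * (q-1), except lambda(2^e) = 2^(e-2) for e >= 3.
        std::uint64_t lam = pk / q * (q - 1);
        if (q == 2 && e >= 3) lam /= 2;
        result = result / gcd_u64(result, lam) * lam;  // lcm(result, lam)
    }
    if (m > 1) {  // one prime factor larger than sqrt(m) is left
        std::uint64_t lam = m - 1;
        result = result / gcd_u64(result, lam) * lam;
    }
    return result;
}
```

With this, the exponent can be reduced modulo carmichael(MOD) instead of MOD - 1, keeping in mind that the identity is only guaranteed when the base is coprime to the modulus.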
This problem's answer turns out to be calculating large binomial coefficients modulo prime number using Lucas' theorem. Here's the solution to that problem using this technique: here.
Now my questions are:
My code seems to fail on larger inputs because variables overflow. Is there any way to handle this?
Are there any ways to do this without using this theorem?
EDIT: note that as this is an OI or ACM problem, external libs other than original ones are not permitted.
Code below:
#include <iostream>
#include <string.h>
#include <stdio.h>
using namespace std;
#define N 100010
long long mod_pow(int a,int n,int p)
{
long long ret=1;
long long A=a;
while(n)
{
if (n & 1)
ret=(ret*A)%p;
A=(A*A)%p;
n>>=1;
}
return ret;
}
long long factorial[N];
void init(long long p)
{
factorial[0] = 1;
for(int i = 1;i <= p;i++)
factorial[i] = factorial[i-1]*i%p;
//for(int i = 0;i < p;i++)
//ni[i] = mod_pow(factorial[i],p-2,p);
}
long long Lucas(long long a,long long k,long long p)
{
long long re = 1;
while(a && k)
{
long long aa = a%p;long long bb = k%p;
if(aa < bb) return 0;
re = re*factorial[aa]*mod_pow(factorial[bb]*factorial[aa-bb]%p,p-2,p)%p;
a /= p;
k /= p;
}
return re;
}
int main()
{
int t;
cin >> t;
while(t--)
{
long long n,m,p;
cin >> n >> m >> p;
init(p);
cout << Lucas(n+m,m,p) << "\n";
}
return 0;
}
This solution assumes that p^2 fits into an unsigned long long. Since an unsigned long long has at least 64 bits as per the standard, this works at least for p up to 4 billion, much more than the question specifies.
typedef unsigned long long num;
/* x such that a*x = 1 mod p */
num modinv(num a, num p)
{
/* implement this one on your own */
/* you can use the extended Euclidean algorithm */
}
/* n choose m mod p */
/* computed with the theorem of Lucas */
num modbinom(num n, num m, num p)
{
num i, result, divisor, n_, m_;
if (m == 0)
return 1;
/* check for the likely case that the result is zero */
if (n < m)
return 0;
for (n_ = n, m_ = m; m_ > 0; n_ /= p, m_ /= p)
if (n_ % p < m_ % p)
return 0;
for (result = 1; n >= p || m >= p; n /= p, m /= p) {
result *= modbinom(n % p, m % p, p);
result %= p;
}
/* avoid unnecessary computations */
if (m > n - m)
m = n - m;
divisor = 1;
for (i = 0; i < m; i++) {
result *= n - i;
result %= p;
divisor *= i + 1;
divisor %= p;
}
result *= modinv(divisor, p);
result %= p;
return result;
}
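The modinv stub above is left as an exercise. One possible way to fill it in with the extended Euclidean algorithm is sketched here. It assumes gcd(a, p) == 1 and p < 2^63 so the intermediate values fit in a signed 64-bit integer, and it returns 0 when a is a multiple of p, in which case no inverse exists.

```cpp
typedef unsigned long long num;

/* x such that a*x = 1 mod p, assuming gcd(a, p) == 1 and p < 2^63 */
num modinv(num a, num p)
{
    /* invariant: old_s * a = old_r (mod p) and s * a = r (mod p) */
    long long old_r = (long long)(a % p), r = (long long)p;
    long long old_s = 1, s = 0;

    while (r != 0) {
        long long q = old_r / r;
        long long t = old_r - q * r;
        old_r = r;
        r = t;
        t = old_s - q * s;
        old_s = s;
        s = t;
    }

    /* old_r is now gcd(a, p); old_s may be negative, so shift it into [0, p) */
    return (num)(old_s < 0 ? old_s + (long long)p : old_s);
}
```

For instance, modinv(3, 7) yields 5, since 3 * 5 = 15 = 1 mod 7.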
An infinite precision integer seems like the way to go.
If you are in C++,
the PicklingTools library has an "infinite precision" integer (similar to
Python's LONG type). Someone else suggested Python, which is a reasonable
answer if you know Python. If you want to do it in C++, you can
use the int_n type:
#include "ocval.h"
int_n n="012345678910227836478627843";
n = n + 1; // Can combine with other plain ints as well
Take a look at the documentation at:
http://www.picklingtools.com/html/usersguide.html#c-int-n-and-the-python-arbitrary-size-ints-long
and
http://www.picklingtools.com/html/faq.html#c-and-otab-tup-int-un-int-n-new-in-picklingtools-1-2-0
The download for the C++ PicklingTools is here.
You want a bignum (a.k.a. arbitrary precision arithmetic) library.
First, don't write your own bignum (or bigint) library, because efficient algorithms (more efficient than the naive ones you learned at school) are difficult to design and implement.
Then, I would recommend GMPlib. It is free software, well documented, often used, quite efficient, and well designed (with perhaps some imperfections, in particular the inability to plugin your own memory allocator in replacement of the system malloc; but you probably don't care, unless you want to catch the rare out-of-memory condition ...). It has an easy C++ interface. It is packaged in most Linux distributions.
If it is a homework assignment, perhaps your teacher is expecting you to think more on the math, and find, with some proof, a way of solving the problem without any bignums.
Let's suppose that we need to compute the value of (a / b) mod p, where p is a prime number and b divides a. Since p is prime, every b that is not divisible by p has an inverse mod p. So (a / b) mod p = ((a mod p) * (b mod p)^-1) mod p. We can use the extended Euclidean algorithm to compute the inverse.
To get (n over k) we need to compute n! mod p, (k!)^-1, ((n - k)!)^-1. Total time complexity is O(n).
UPDATE: Here is the code in c++. I didn't test it extensively though.
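#include <cassert>
#include <cstdint>
#include <vector>
// Note: the products below overflow int64_t once mod exceeds about 3 * 10^9
int64_t fastPow(int64_t a, int64_t exp, int64_t mod)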
{
int64_t res = 1;
while (exp)
{
if (exp % 2 == 1)
{
res *= a;
res %= mod;
}
a *= a;
a %= mod;
exp >>= 1;
}
return res;
}
// This inverse works only for primes p, it uses Fermat's little theorem
int64_t inverse(int64_t a, int64_t p)
{
assert(p >= 2);
return fastPow(a, p - 2, p);
}
int64_t binomial(int64_t n, int64_t k, int64_t p)
{
std::vector<int64_t> fact(n + 1);
fact[0] = 1;
for (auto i = 1; i <= n; ++i)
fact[i] = (fact[i - 1] * i) % p;
return ((((fact[n] * inverse(fact[k], p)) % p) * inverse(fact[n - k], p)) % p);
}
I want to find (n choose r) for large integers, and I also have to find out the mod of that number.
long long int choose(int a,int b)
{
if (b > a)
return (-1);
if(b==0 || a==1 || b==a)
return(1);
else
{
long long int r = ((choose(a-1,b))%10000007+(choose(a-1,b- 1))%10000007)%10000007;
return r;
}
}
I am using this piece of code, but I am getting TLE. If there is some other method to do that please tell me.
I don't have the reputation to comment yet, but I wanted to point out that the answer by rock321987 works pretty well:
It is fast and correct up to and including C(62, 31)
but cannot handle all inputs that have an output that fits in a uint64_t. As proof, try:
C(67, 33) = 14,226,520,737,620,288,370 (verify correctness and size)
Unfortunately, the other implementation spits out 8,829,174,638,479,413 which is incorrect. There are other ways to calculate nCr which won't break like this, however the real problem here is that there is no attempt to take advantage of the modulus.
Notice that when p is prime, every integer that is not divisible by p has an inverse mod p, and that inverse is unique. (The 10000007 used in the question is actually 941 * 10627, so it is not prime; the code below requires a genuinely prime modulus such as 1000000007.) Furthermore, we can find that inverse quite quickly. Another question has an answer on how to do that here, which I've replicated below.
This is handy since:
x/y mod p == x*(y inverse) mod p; and
xy mod p == ((x mod p)(y mod p)) mod p
Modifying the other code a bit, and generalizing the problem we have the following:
#include <iostream>
#include <assert.h>
#include <stdint.h>
// p MUST be prime and less than 2^32, so that a*a below cannot overflow 64 bits
uint64_t inverseModp(uint64_t a, uint64_t p) {
assert(p < (1ull << 32));
assert(a < p);
assert(a != 0);
uint64_t ex = p-2, result = 1;
while (ex > 0) {
if (ex % 2 == 1) {
result = (result*a) % p;
}
a = (a*a) % p;
ex /= 2;
}
return result;
}
// p MUST be prime
uint32_t nCrModp(uint32_t n, uint32_t r, uint32_t p)
{
assert(r <= n);
if (r > n-r) r = n-r;
if (r == 0) return 1;
if(n/p - (n-r)/p > r/p) return 0;
uint64_t result = 1; //intermediary results may overflow 32 bits
for (uint32_t i = n, x = 1; i > r; --i, ++x) {
if( i % p != 0) {
result *= i % p;
result %= p;
}
if( x % p != 0) {
result *= inverseModp(x % p, p);
result %= p;
}
}
return result;
}
int main() {
uint32_t smallPrime = 17;
uint32_t medNum = 3001;
uint32_t halfMedNum = medNum >> 1;
std::cout << nCrModp(medNum, halfMedNum, smallPrime) << std::endl;
uint32_t bigPrime = 4294967291ul; // 2^32-5 is largest prime < 2^32
uint32_t bigNum = 1ul << 24;
uint32_t halfBigNum = bigNum >> 1;
std::cout << nCrModp(bigNum, halfBigNum, bigPrime) << std::endl;
}
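This produces results for 32-bit inputs if you are willing to wait, but the results are only reliable when n < p: the loop simply skips multiples of p, so for n >= p the cofactors of those multiples are lost (for example, nCrModp(6, 3, 3) returns 1, although C(6, 3) = 20 and 20 mod 3 = 2). For n >= p, reduce the problem with Lucas' theorem first. To prove a point, I've included the calculation for a 24-bit n and the largest 32-bit prime. My modest PC took ~13 seconds to calculate this. Check the answer against Wolfram Alpha, but beware that it may exceed the 'standard computation time' there.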
There is still room for improvement if p is much smaller than (n-r) where r <= n-r. For example, we could precalculate all the inverses mod p instead of doing it on demand several times over.
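As a rough illustration of the precalculation idea, the inverses of 1..limit can be tabulated in linear time with the recurrence inv[i] = -(p / i) * inv[p % i] mod p, which follows from writing p = (p / i) * i + (p % i). This is a sketch, not part of the code above: the name inverse_table is made up, p is assumed prime, and limit is assumed to be smaller than p and small enough for the table to fit in memory.

```cpp
#include <cstdint>
#include <vector>

// inv[i] is the modular inverse of i modulo the prime p, for 1 <= i <= limit < p.
// inv[0] is unused and left at 0.
std::vector<std::uint64_t> inverse_table(std::uint32_t limit, std::uint32_t p) {
    std::vector<std::uint64_t> inv(static_cast<std::size_t>(limit) + 1, 0);
    if (limit >= 1) inv[1] = 1;
    for (std::uint32_t i = 2; i <= limit; ++i) {
        // p % i is nonzero (p is prime, i < p) and smaller than i,
        // so inv[p % i] has already been filled in.
        inv[i] = (p - (p / i) * inv[p % i] % p) % p;
    }
    return inv;
}
```

A loop such as the one in nCrModp could then look up inv[x % p] instead of calling inverseModp each time, at the cost of O(limit) memory.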
nCr = n! / (r! * (n-r)!), where ! denotes the factorial.
Now use whichever of r and n - r is smaller, since nCr = nC(n-r).
#include <cstdio>
#include <cmath>
#define MOD 10000007
int main()
{
int n, r, i, x = 1;
long long int res = 1;
scanf("%d%d", &n, &r);
int mini = fmin(r, (n - r)); // minimum of r, n-r
for (i = n; i > n - mini; i--) { // multiply the mini largest factors n, n-1, ..., n-mini+1
res = (res * i) / x;
x++;
}
printf("%lld\n", res % MOD);
return 0;
}
This works for most cases in programming competitions, provided n and r are not too large.
Time complexity: O(min(r, n - r))
Limitation: in languages like C and C++, the computation overflows if
n > 60 (approximately),
since no built-in data type can store the value before the final reduction.
The expansion of nCr can always be reduced to product of integers. This is done by canceling out terms in denominator. This approach is applied in the function given below.
This function has time complexity of O(n^2 * log(n)). This will calculate nCr % m for n<=10000 under 1 sec.
#include <numeric>
#include <algorithm>
using namespace std; // for min, iota and __gcd (a GCC extension)
int M=1e7+7;
int ncr(int n, int r)
{
r=min(r,n-r);
int A[r],i,j,B[r];
iota(A,A+r,n-r+1); //initializing A starting from n-r+1 to n
iota(B,B+r,1); //initializing B starting from 1 to r
int g;
for(i=0;i<r;i++)
for(j=0;j<r;j++)
{
if(B[i]==1)
break;
g=__gcd(B[i], A[j] );
A[j]/=g;
B[i]/=g;
}
long long ans=1;
for(i=0;i<r;i++)
ans=(ans*A[i])%M;
return ans;
}
I know you can get the amount of combinations with the following formula (without repetition and order is not important):
// Choose r from n
n! / (r! * (n - r)!)
However, I don't know how to implement this in C++, since for instance with
n = 52
n! = 8,0658175170943878571660636856404e+67
the number gets way too big even for unsigned __int64 (or unsigned long long). Is there some workaround to implement the formula without any third-party "bigint" -libraries?
Here's an ancient algorithm which is exact and doesn't overflow unless the result is too big for a long long
unsigned long long
choose(unsigned long long n, unsigned long long k) {
if (k > n) {
return 0;
}
unsigned long long r = 1;
for (unsigned long long d = 1; d <= k; ++d) {
r *= n--;
r /= d;
}
return r;
}
This algorithm is also in Knuth's "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms" I think.
UPDATE: There's a small possibility that the algorithm will overflow on the line:
r *= n--;
for very large n. A naive upper bound is sqrt(std::numeric_limits<long long>::max()) which means an n less than roughly 4,000,000,000.
From Andreas' answer:
Here's an ancient algorithm which is exact and doesn't overflow unless the result is too big for a long long
unsigned long long
choose(unsigned long long n, unsigned long long k) {
if (k > n) {
return 0;
}
unsigned long long r = 1;
for (unsigned long long d = 1; d <= k; ++d) {
r *= n--;
r /= d;
}
return r;
}
This algorithm is also in Knuth's "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms" I think.
UPDATE: There's a small possibility that the algorithm will overflow on the line:
r *= n--;
for very large n. A naive upper bound is sqrt(std::numeric_limits<long long>::max()) which means an n less than roughly 4,000,000,000.
Consider n == 67 and k == 33. The above algorithm overflows with a 64 bit unsigned long long. And yet the correct answer is representable in 64 bits: 14,226,520,737,620,288,370. And the above algorithm is silent about its overflow, choose(67, 33) returns:
8,829,174,638,479,413
A believable but incorrect answer.
However the above algorithm can be slightly modified to never overflow as long as the final answer is representable.
The trick is in recognizing that at each iteration, the division r/d is exact. Temporarily rewriting:
r = r * n / d;
--n;
For this to be exact, it means if you expanded r, n and d into their prime factorizations, then one could easily cancel out d, and be left with a modified value for n, call it t, and then the computation of r is simply:
// compute t from r, n and d
r = r * t;
--n;
A fast and easy way to do this is to find the greatest common divisor of r and d, call it g:
unsigned long long g = gcd(r, d);
// now one can divide both r and d by g without truncation
r /= g;
unsigned long long d_temp = d / g;
--n;
Now we can do the same thing with d_temp and n (find the greatest common divisor). However since we know a-priori that r * n / d is exact, then we also know that gcd(d_temp, n) == d_temp, and therefore we don't need to compute it. So we can divide n by d_temp:
unsigned long long g = gcd(r, d);
// now one can divide both r and d by g without truncation
r /= g;
unsigned long long d_temp = d / g;
// now one can divide n by d/g without truncation
unsigned long long t = n / d_temp;
r = r * t;
--n;
Cleaning up:
#include <limits>
#include <stdexcept>
unsigned long long
gcd(unsigned long long x, unsigned long long y)
{
while (y != 0)
{
unsigned long long t = x % y;
x = y;
y = t;
}
return x;
}
unsigned long long
choose(unsigned long long n, unsigned long long k)
{
if (k > n)
throw std::invalid_argument("invalid argument in choose");
unsigned long long r = 1;
for (unsigned long long d = 1; d <= k; ++d, --n)
{
unsigned long long g = gcd(r, d);
r /= g;
unsigned long long t = n / (d / g);
if (r > std::numeric_limits<unsigned long long>::max() / t)
throw std::overflow_error("overflow in choose");
r *= t;
}
return r;
}
Now you can compute choose(67, 33) without overflow. And if you try choose(68, 33), you'll get an exception instead of a wrong answer.
The following routine will compute the n-choose-k, using the recursive definition and memoization. The routine is extremely fast and accurate:
inline unsigned long long n_choose_k(const unsigned long long& n,
const unsigned long long& k)
{
if (n < k) return 0;
if (0 == n) return 0;
if (0 == k) return 1;
if (n == k) return 1;
if (1 == k) return n;
typedef unsigned long long value_type;
value_type* table = new value_type[static_cast<std::size_t>(n * n)];
std::fill_n(table,n * n,0);
class n_choose_k_impl
{
public:
n_choose_k_impl(value_type* table,const value_type& dimension)
: table_(table),
dimension_(dimension)
{}
inline value_type& lookup(const value_type& n, const value_type& k)
{
return table_[dimension_ * n + k];
}
inline value_type compute(const value_type& n, const value_type& k)
{
if ((0 == k) || (k == n))
return 1;
value_type v1 = lookup(n - 1,k - 1);
if (0 == v1)
v1 = lookup(n - 1,k - 1) = compute(n - 1,k - 1);
value_type v2 = lookup(n - 1,k);
if (0 == v2)
v2 = lookup(n - 1,k) = compute(n - 1,k);
return v1 + v2;
}
value_type* table_;
value_type dimension_;
};
value_type result = n_choose_k_impl(table,n).compute(n,k);
delete [] table;
return result;
}
Remember that
n! / ( n - r )! = n * ( n - 1) * .. * (n - r + 1 )
so it's way smaller than n!. So the solution is to evaluate n* ( n - 1 ) * ... * ( n - r + 1) instead of first calculating n! and then dividing it .
Of course it all depends on the relative magnitude of n and r - if r is relatively big compared to n, then it still won't fit.
Well, I have to answer to my own question. I was reading about Pascal's triangle and by accident noticed that we can calculate the amount of combinations with it:
#include <iostream>
#include <boost/cstdint.hpp>
boost::uint64_t Combinations(unsigned int n, unsigned int r)
{
if (r > n)
return 0;
/** We can use Pascal's triangle to determine the amount
* of combinations. To calculate a single line:
*
* v(r) = (n - r) / r
*
* Since the triangle is symmetrical, we only need to calculate
* until r -column.
*/
boost::uint64_t v = n--;
for (unsigned int i = 2; i < r + 1; ++i, --n)
v = v * n / i;
return v;
}
int main()
{
std::cout << Combinations(52, 5) << std::endl;
}
Getting the prime factorization of the binomial coefficient is probably the most efficient way to calculate it, especially if multiplication is expensive. This is certainly true of the related problem of calculating the factorial (see here for an example).
Here is a simple algorithm based on the Sieve of Eratosthenes that calculates the prime factorization. The idea is basically to go through the primes as you find them using the sieve, but then also to calculate how many of their multiples fall in the ranges [1, k] and [n-k+1, n]. The Sieve is essentially an O(n log log n) algorithm, but there is no multiplication done. The actual number of multiplications necessary once the prime factorization is found is at worst O(n log log n / log n) and there are probably faster ways than that.
prime_factors = []

n = 20
k = 10

# composite[q] is set once q is known to be a multiple of a smaller prime
composite = [True] * 2 + [False] * n

for p in range(n + 1):
    if composite[p]:
        continue

    q = p
    m = 1
    total_prime_power = 0
    prime_power = [0] * (n + 1)

    while True:
        # q = p * m, so the exponent of p in q is one more than in m
        prime_power[q] = prime_power[m] + 1

        if q <= k:
            # q is a factor of the denominator k!
            total_prime_power -= prime_power[q]

        if q > n - k:
            # q is a factor of the numerator (n-k+1) * ... * n
            total_prime_power += prime_power[q]

        m += 1
        q += p

        if q > n:
            break

        composite[q] = True

    prime_factors.append([p, total_prime_power])

print(prime_factors)
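The same idea can be sketched in C++, with one change that should be stated plainly: instead of tracking exponents while sieving, this version uses Legendre's formula (the exponent of p in x! is the sum of floor(x / p^i)) to get each prime's exponent directly. Because no division is involved, it works for any modulus m, prime or not. The name binom_mod_any is invented for this example, and m is assumed to be below 2^32 so that products fit in 64 bits.

```cpp
#include <cstdint>
#include <vector>

// C(n, k) mod m for an arbitrary modulus m (prime or not), assuming m < 2^32.
std::uint64_t binom_mod_any(std::uint32_t n, std::uint32_t k, std::uint64_t m) {
    if (k > n) return 0;
    std::vector<bool> composite(static_cast<std::size_t>(n) + 1, false);
    std::uint64_t result = 1 % m;
    for (std::uint64_t p = 2; p <= n; ++p) {
        if (composite[p]) continue;
        // Sieve of Eratosthenes: mark the multiples of the prime p.
        for (std::uint64_t q = p * p; q <= n; q += p) composite[q] = true;
        // Legendre's formula: exponent of p in n! / (k! (n-k)!).
        std::uint64_t e = 0;
        for (std::uint64_t pw = p; pw <= n; pw *= p)
            e += n / pw - k / pw - (n - k) / pw;
        // Multiply the result by p^e mod m using exponentiation by squaring.
        std::uint64_t base = p % m;
        for (; e > 0; e >>= 1) {
            if (e & 1) result = result * base % m;
            base = base * base % m;
        }
    }
    return result;
}
```

For example, binom_mod_any(20, 10, 1000) reduces C(20, 10) = 184756 to 756.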
Using a dirty trick with a long double, it is possible to get the same accuracy as Howard Hinnant (and probably more):
unsigned long long n_choose_k(int n, int k)
{
long double f = n;
for (int i = 1; i<k+1; i++)
f /= i;
for (int i=1; i<k; i++)
f *= n - i;
unsigned long long f_2 = std::round(f);
return f_2;
}
The idea is to divide first by k! and then to multiply by n(n-1)...(n-k+1). The approximation through the double can be avoided by inverting the order of the for loop.
This improves on Howard Hinnant's answer (in this question) a little bit:
Calling gcd() per loop seems a bit slow.
We could aggregate the gcd() call into the last one, while making the most use of the standard algorithm from Knuth's book "The Art of Computer Programming, 3rd Edition, Volume 2: Seminumerical Algorithms":
const uint64_t u64max = std::numeric_limits<uint64_t>::max();
uint64_t choose(uint64_t n, uint64_t k)
{
if (k > n)
throw std::invalid_argument(std::string("invalid argument in ") + __func__);
if (k > n - k)
k = n - k;
uint64_t r = 1;
uint64_t d;
for (d = 1; d <= k; ++d) {
if (r > u64max / n)
break;
r *= n--;
r /= d;
}
if (d > k)
return r;
// Let N be the original n,
// n is the current n (when we reach here)
// We want to calculate C(N,k),
// Currently we already calculated the r value so far:
// r = C(N, n) = C(N, N-n) = C(N, d-1)
// Note that N-n = d-1
// In addition we know the following identity formula:
// C(N,k) = C(N,d-1) * C(N-d+1, k-d+1) / C(k, k-d+1)
// = C(N,d-1) * C(n, k-d+1) / C(k, k-d+1)
// Using this formula, we effectively reduce the calculation,
// while recursively use the same function.
uint64_t b = choose(n, k-d+1);
if (b == u64max) {
return u64max; // overflow
}
uint64_t c = choose(k, k-d+1);
if (c == u64max) {
return u64max; // overflow
}
// Now, the combinatorial should be r * b / c
// We can use gcd() to calculate this:
// We Pick b for gcd: b < r almost (if not always) in all cases
uint64_t g = gcd(b, c);
b /= g;
c /= g;
r /= c;
if (r > u64max / b)
return u64max; // overflow
return r * b;
}
Note that the recursion depth is normally 2 (I have not seen a case that goes to 3; the reduction is quite effective), i.e. choose() is called 3 times in non-overflow cases.
Replace uint64_t with unsigned long long if you prefer it.
One of the shortest ways (note that the plain recursion takes exponential time, so it is only practical for small n):
int nChoosek(int n, int k){
if (k > n) return 0;
if (k == 0) return 1;
return nChoosek(n - 1, k) + nChoosek(n - 1, k - 1);
}
If you want to be 100% sure that no overflows occur so long as the final result is within the numeric limit, you can sum up Pascal's Triangle row-by-row:
for (int i=0; i<=n; i++) { // rows 0..n, so the last row built is row n
    current_row.assign(i + 1, 1); // both ends of every row are 1
    for (int j=1; j<i; j++)
        current_row[j] = prev_row[j] + prev_row[j-1];
    prev_row = current_row; // assume they are vectors
}
// the result C(n, r) is now in current_row[r]
However, this algorithm is much slower than the multiplication one. So perhaps you could use multiplication to generate all the cases you know that are 'safe' and then use addition from there. (.. or you could just use a BigInt library).