Computing large combinations - c++

How would you compute a combination such as (100,000 choose 50,000)?
I have tried three different approaches thus far but for obvious reasons each has failed:
1) Dynamic Programming- The size of the array just gets to be so ridiculous it seg faults
unsigned long long int grid[p+1][q+1];
//Initialise x boundary conditions
for (long int i = 0; i < q; ++i) {
grid[p][i] = 1;
}
//Initialise y boundary conditions
for (long int i = 0; i < p; ++i) {
grid[i][q] = 1;
}
for (long int i = p - 1; i >= 0; --i) {
for (long int j = q - 1; j >= 0; --j) {
grid[i][j] = grid[i+1][j] + grid[i][j+1];
}
}
2) Brute Force - Obviously calculating even 100! isn't realistic
unsigned long long int factorial(long int n)
{
return (n == 1 || n == 0) ? 1 : factorial(n - 1) * n;
}
3) Multiplicative Formula- I'm unable to store the values they are just so large
const int gridSize = 100000; //say 100,000
unsigned long long int paths = 1;
for (int i = 0; i < gridSize; i++) {
paths *= (2 * gridSize) - i;
paths /= i + 1;
}
// code from (http://www.mathblog.dk/project-euler-15/)
If it helps for context, the aim of this is to solve the "How many routes are there through an m×n grid" problem for large inputs. Maybe I am attacking the problem the wrong way?

C(100000, 50000) is a huge number with 30101 decimal digits: http://www.wolframalpha.com/input/?i=C%28100000%2C+50000%29
Obviously unsigned long long will not be enough to store it. You need some arbitrary large integers library, like GMP: http://en.wikipedia.org/wiki/GNU_Multiple_Precision_Arithmetic_Library
With such a library, the multiplicative formula should be good enough.
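As a rough illustration of the multiplicative formula on top of arbitrary-precision storage, here is a minimal sketch that does not use GMP. Instead it swaps in a hand-rolled number stored as base-10^9 limbs. The names Limbs, mul_small, div_small and binomial_decimal are made up for this example, and it assumes n fits in 32 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Little-endian limbs in base 1e9 (hypothetical minimal big integer).
using Limbs = std::vector<std::uint32_t>;
constexpr std::uint64_t kBase = 1000000000ULL;

// Multiply the big number in place by a small factor.
void mul_small(Limbs& a, std::uint32_t m) {
    std::uint64_t carry = 0;
    for (std::uint32_t& limb : a) {
        std::uint64_t cur = static_cast<std::uint64_t>(limb) * m + carry;
        limb = static_cast<std::uint32_t>(cur % kBase);
        carry = cur / kBase;
    }
    while (carry != 0) {
        a.push_back(static_cast<std::uint32_t>(carry % kBase));
        carry /= kBase;
    }
}

// Divide the big number in place by a small divisor.
// The caller guarantees that the division is exact.
void div_small(Limbs& a, std::uint32_t d) {
    std::uint64_t rem = 0;
    for (std::size_t i = a.size(); i-- > 0;) {
        std::uint64_t cur = rem * kBase + a[i];
        a[i] = static_cast<std::uint32_t>(cur / d);
        rem = cur % d;
    }
    assert(rem == 0);
    while (a.size() > 1 && a.back() == 0) a.pop_back();
}

// C(n, k) as a decimal string, using the multiplicative formula.
// After step i the accumulator holds C(n - k + i, i), so every
// division is exact.
std::string binomial_decimal(std::uint32_t n, std::uint32_t k) {
    assert(k <= n);
    if (k > n - k) k = n - k;  // C(n, k) == C(n, n - k)
    Limbs acc{1};
    for (std::uint32_t i = 1; i <= k; ++i) {
        mul_small(acc, n - k + i);
        div_small(acc, i);
    }
    std::string out = std::to_string(acc.back());
    for (std::size_t i = acc.size() - 1; i-- > 0;) {
        std::string part = std::to_string(acc[i]);
        out += std::string(9 - part.size(), '0') + part;  // zero-pad each limb
    }
    return out;
}
```

Calling binomial_decimal(100000, 50000) returns the full decimal expansion. Each of the 50,000 steps walks the whole limb vector, so the run time grows roughly quadratically with n.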

"How would you compute ..." depends very much on the desired accuracy. Precise results can only be computed with arbitrary-precision numbers (e.g. GMP), but it is rather likely that you don't really need the exact result.
In that case I would use the Stirling Approximation for factorials ( http://en.wikipedia.org/wiki/Stirling%27s_approximation ) and calculate with doubles. The number of summands in the expansion can be used to regulate the error. The wikipedia page will also give you an error estimate.
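A minimal sketch of the approximate route, working with logarithms in doubles. Note that it does not sum the Stirling series by hand; it swaps in std::lgamma from <cmath>, which returns ln Γ(x), so ln n! = lgamma(n + 1). The function names are made up for this example.

```cpp
#include <cmath>

// log10 of C(n, k), computed from log-gamma values so that no huge
// intermediate number is ever formed.
double log10_binomial(double n, double k) {
    double ln_value = std::lgamma(n + 1.0)
                    - std::lgamma(k + 1.0)
                    - std::lgamma(n - k + 1.0);
    return ln_value / std::log(10.0);
}

// Number of decimal digits of C(n, k), derived from its logarithm.
long decimal_digits_of_binomial(double n, double k) {
    return static_cast<long>(std::floor(log10_binomial(n, k))) + 1;
}
```

For C(100000, 50000) this gives the digit count and the leading digits (via the fractional part of the logarithm), but not the exact value.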

Here is a recursive formula that might help:
C(N, K) = C(N-1, K-1) * N / K
Make the recursive call for C(N-1, K-1) first, then multiply the result by N and divide by K.
As your numbers will be very large, use one of the following alternatives:
GMP
Your own implementation, storing each number as a sequence of binary digits in an array and using Booth's algorithm for multiplication and shift-and-subtract for division.

Related

Calculating Modulo between 2 large numbers represented as strings

I am trying to compute % of large numbers. I managed so far with the dividend to be large and the mod represented as integer. I have no idea how to do otherwise. I need this exactly for Large numbers division that is why I need this. No, I will not use BigInts library, as it is not accepted by online judges. Moreover, I'd like to know how to do it myself.
This is what I have written for just one large number.
int mod(string num, int a)
{
int res = 0;
for (int i = 0; i < num.length(); i++)
res = (res*10 + (int)num[i] - '0') %a;
return res;
}

Floating point error in C++ code

I am trying to solve a question in which I need to find the number of possible ways to make a team of two members. (Note: a team can have at most two people.)
After writing this code, it works properly, but in some test cases it shows a floating point error and I can't find out what exactly it is.
Input: 1st line : Number of test cases
2nd line: number of total person
Thank you
#include<iostream>
using namespace std;
long C(long n, long r)
{
long f[n + 1];
f[0] = 1;
for (long i = 1; i <= n; i++)
{
f[i] = i * f[i - 1];
}
return f[n] / f[r] / f[n - r];
}
int main()
{
long n, r, m,t;
cin>>t;
while(t--)
{
cin>>n;
r=1;
cout<<C(n, min(r, n - r))+1<<endl;
}
return 0;
}
The message says "floating point", but no floating-point arithmetic is involved. What you are hitting is an integer division by zero, which many platforms report as a floating point exception (SIGFPE on Linux, for example).
When you invoke C(100, 1), the loop that fills the f array inside C computes factorials, which grow faster than exponentially. Eventually i * f[i-1] evaluates to zero because of overflow. Every subsequent f[i] is then zero as well, and the division that follows the loop is a division by zero.
Although purists on these forums will say this is undefined, here's what's really happening on most 2's complement architectures. Or at least on my computer....
At i==21:
f[20] is already equal to 2432902008176640000
21 * 2432902008176640000 overflows a 64-bit signed integer and will typically become -4249290049419214848. So at this point your program is already in undefined behavior.
At i==66
f[65] is equal to 0x8000000000000000. So 66 * f[65] gets calculated as zero for reasons that make sense to me, but should be understood as undefined behavior.
With f[66] assigned to 0, all subsequent assignments of f[i] become zero as well. After the main loop inside C is over, the f[n-r] is zero. Hence, divide by zero error.
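The wraparound described above can be reproduced without undefined behaviour by using an unsigned 64-bit type, whose arithmetic is defined to wrap modulo 2^64. This is only a sketch; the function names are made up for this example.

```cpp
#include <cstdint>

// n! reduced modulo 2^64. Unsigned overflow wraps, so this is well defined.
std::uint64_t wrapped_factorial(unsigned n) {
    std::uint64_t f = 1;
    for (unsigned i = 2; i <= n; ++i) f *= i;
    return f;
}

// Smallest n for which the wrapped n! is exactly zero, i.e. the first
// factorial that contains at least 64 factors of two.
unsigned first_zero_factorial() {
    std::uint64_t f = 1;
    for (unsigned i = 1;; ++i) {
        f *= i;
        if (f == 0) return i;
    }
}
```

65! contains exactly 63 factors of two, so its low 64 bits are 0x8000000000000000. Multiplying by 66 adds one more factor of two, which clears all 64 bits.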
Update
I went back and reverse engineered your problem. It seems like your C function is just trying to compute this expression:
N!
-------------
R! * (N-R)!
Which is the "number of unique sorted combinations"
In which case, instead of computing the large factorial N!, we can reduce that expression to this:
(N-R+1) * (N-R+2) * ... * N
-----------------------------
R!
This won't eliminate overflow, but will allow your C function to be able to take on larger values of N and R to compute the number of combinations without error.
But we can also take advantage of simple reduction before trying to do a big long factorial expression
For example, let's say we were trying to compute C(15,5). Mathematically that is:
15!
--------
10! 5!
Or as we expressed above:
1*2*3*4*5*6*7*8*9*10*11*12*13*14*15
-----------------------------------
1*2*3*4*5*6*7*8*9*10 * 1*2*3*4*5
The first 10 factors of the numerator and denominator cancel each other out:
11*12*13*14*15
-----------------------------------
1*2*3*4*5
But intuitively, you can see that "12" in the numerator is already evenly divisible by denominators 2 and 3. And that 15 in the numerator is evenly divisible by 5 in the denominator. So simple reduction can be applied:
11*2*13*14*3
-----------------------------------
1 * 4
There's even more room for greatest common divisor reduction, but this is a great start.
Let's start with a helper function that computes the product of all the values in a list.
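long long multiply_vector(const std::vector<int>& values)
{
    long long result = 1;
    for (long long i : values)
    {
        result = result * i;
        // Heuristic only: signed overflow is undefined behaviour and
        // does not always show up as a negative value.
        if (result < 0)
        {
            std::cout << "ERROR - multiply_vector hit overflow" << std::endl;
            return 0;
        }
    }
    return result;
}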
Now let's implement C using the above function after doing the reduction operation:
long long C(int n, int r)
{
if ((r > n) || (n < 0) || (r < 0))
{
std::cout << "invalid parameters passed to C" << std::endl;
return 0;
}
// compute
// n!
// -------------
// r! * (n-r)!
//
// assume (r <= n)
// Which maps to
// (n-r+1) * (n-r+2) * ... * n
// ---------------------------
// r!
int end = n;
int start = n - r + 1;
std::vector<int> numerators;
std::vector<int> denominators;
long long numerator = 1;
long long denominator = 1;
for (int i = start; i <= end; i++)
{
numerators.push_back(i);
}
for (int i = 2; i <= r; i++)
{
denominators.push_back(i);
}
size_t n_length = numerators.size();
size_t d_length = denominators.size();
for (size_t n = 0; n < n_length; n++)
{
for (size_t d = 0; d < d_length; d++)
{
// Re-read the numerator each time: an earlier denominator may already have reduced it
int nval = numerators[n];
int dval = denominators[d];
if ((dval > 1) && ((nval % dval) == 0))
{
denominators[d] = 1;
numerators[n] = nval / dval;
}
}
}
numerator = multiply_vector(numerators);
denominator = multiply_vector(denominators);
if ((numerator == 0) || (denominator == 0))
{
std::cout << "Giving up. Can't resolve overflow" << std::endl;
return 0;
}
long long result = numerator / denominator;
return result;
}
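The "greatest common divisor reduction" mentioned above can be sketched as follows. This is an illustrative variant, not the code from the answer: binomial_gcd is a made-up name, std::gcd requires C++17, and the final product can still overflow for large results. It also uses C(n, r) = C(n, n - r) to keep the factor lists short.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// C(n, r), cancelling every denominator completely against the
// numerators with gcd before anything is multiplied.
unsigned long long binomial_gcd(unsigned long long n, unsigned long long r) {
    assert(r <= n);
    r = std::min(r, n - r);  // symmetry: C(n, r) == C(n, n - r)

    std::vector<unsigned long long> numerators;
    for (unsigned long long i = n - r + 1; i <= n; ++i) numerators.push_back(i);

    for (unsigned long long d = 2; d <= r; ++d) {
        unsigned long long rest = d;  // part of d not yet cancelled
        for (unsigned long long& num : numerators) {
            if (rest == 1) break;
            unsigned long long g = std::gcd(num, rest);
            num /= g;
            rest /= g;
        }
        assert(rest == 1);  // always holds because C(n, r) is an integer
    }

    unsigned long long result = 1;
    for (unsigned long long num : numerators) result *= num;
    return result;
}
```

Because every denominator is reduced to 1, no division is left at the end, and an input such as (1000000000, 999999999) collapses to a single numerator.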
You are not using floating-point. And you seem to be using variable sized arrays, which is a C feature and possibly a C++ extension but not standard.
Anyway, you will get overflow and therefore undefined behaviour even for rather small values of n.
In practice the overflow will lead to array elements becoming zero for not much larger values of n.
Your code will then divide by zero and crash.
They also might have a test case like (1000000000, 999999999) which is trivial to solve, but not for your code which I bet will crash.
You don't specify what you mean by "floating point error" - I reckon you are referring to the fact that you are doing an integer division rather than a floating point one so that you will always get integers rather than floats.
int a, b;
a = 7;
b = 2;
std::cout << a / b << std::endl;
this will result in 3, not 3.5! If you want floating point result you should use floats instead like this:
float a, b;
a = 7;
b = 2;
std::cout << a / b << std::endl;
So the solution to your problem would simply be to use float instead of long long int.
Note also that you are using variable-sized arrays, which are not part of standard C++ (some compilers accept them as an extension). Why not use std::vector instead?
Array syntax:
type name[size]
Note: size must be a constant, not a variable.
Example #1:
int name[10];
Example #2:
const int asize = 10;
int name[asize];

C++ function to calculate factorial returns negative value

I've written a C++ function to calculate factorial and used it to calculate 22C11 (Combination). I have declared a variable ans and set it to 0. I tried to calculate
22C11 = fact(2*n)/(fact(n)*fact(n))
where I set n to 11. For some reason, I'm getting a negative value stored in ans. How can I fix this?
long int fact(long int n) {
if(n==1||n==0)
return 1;
long int x=1;
if(n>1)
x=n*fact(n-1);
return x;
}
The following lines are included in the main function:
long int ans=0;
ans=ans+(fact(2*n)/(fact(n)*fact(n)));
cout<<ans;
The answer I'm getting is -784
The correct answer should be 705432
NOTE: This function is working perfectly fine for n<=10. I have tried long long int instead of long int but it still isn't working.
It is unwise to actually calculate factorials - they grow extremely fast. Generally, with combinatorial formulae it's a good idea to look for a way to re-order operations to keep intermediate results somewhat constrained.
For example, let's look at (2*n)!/(n!*n!). It can be rewritten as ((n+1)*(n+2)*...*(2*n)) / (1*2*...*n) == (n+1)/1 * (n+2)/2 * (n+3)/3 ... * (2*n)/n. By interleaving multiplication and division, the rate of growth of intermediate result is reduced.
So, something like this:
int f(int n) {
int ret = 1;
for (int i = 1; i <= n; ++i) {
ret *= (n + i);
ret /= i;
}
return ret;
}
22! = 1,124,000,727,777,607,680,000
2^64 - 1 = 18,446,744,073,709,551,615
So unless you have 128-bit integers for unsigned long long you have integer overflow.
You are triggering integer overflow, which causes undefined behaviour for signed types. You could in fact use long long int, or unsigned long long int, to get a little more range, e.g.:
unsigned long long fact(int n)
{
if(n < 2)
return 1;
return fact(n-1) * n;
}
You say you tried this and it didn't work but I'm guessing you forgot to also update the type of x or something. (In my version I removed x as it is redundant). And/or your calculation still was so big that it overflowed unsigned long long int.
You may be interested in this thread which shows an algorithm for working out nCr that doesn't require so much intermediate storage.
You can increase your chances of success by avoiding the brute force method.
COMB(N1, N2) = FACT(N1)/(FACT(N1-N2)*FACT(N2))
You can take advantage of the fact that the numerator and the denominator have a lot of common terms.
COMB(N1, N2) = (N1-N2+1)*(N1-N2+2)*...*N1/FACT(N2)
Here's an implementation that makes use of that knowledge and computes COMB(22,11) with much less risk of integer overflow.
unsigned long long comb(int n1, int n2)
{
unsigned long long res = 1;
for (int i = (n1-n2)+1; i<= n1; ++i )
{
res *= i;
}
for (int i = 2; i<= n2; ++i )
{
res /= i;
}
return res;
}

C++ Finding the minimum quotient from two different integer arrays of the same size

I have 2 different arrays numerator[ ], and denominator[ ] and int size which is 9. They both consist of 9 different integers, and I need to find the lowest quotient of 2 ints
(the percentage - (numerator[ ])/(denominator[ ]) ) in the two arrays. How would I go about doing this?
Do you want to return the percentage or the quotient(with no remainder)?
Following code returns the percentage. Change double to int, if you want the quotient.
#include <cfloat>
double lowestQuotient(const int *numerator, const int *denominator)
{
    double min = DBL_MAX; // stays DBL_MAX if every denominator is zero
    for (int i = 0; i < 9; i++)
    {
        if (denominator[i] == 0)
            continue; // skip pairs that would divide by zero
        double quotient = (double)numerator[i] / denominator[i];
        if (quotient < min)
            min = quotient;
    }
    return min;
}
Edit: This answer was written before the problem statement was changed to clarify that the intention was not to compare every combination, but instead to only take pair-wise quotients. That simplifies the problem quite a bit and makes my lengthy solution here overkill. This was also written before a solution involving floating point values was indicated; I assumed that the questioner was interested in the mathematical definition of the quotient of two integers, which is itself necessarily an integer. All the same I'll leave this here for posterity...
Edit 2: Fixed the compilation error -- thanks James Root for pointing out the error.
This is a math problem first and a programming problem second.
The naive implementation is to compute every combination of numerators from the first array divided by denominators from the second array, track the minimum quotient as we go, and compute the result.
This would look something like the following:
#include <climits>
#include <algorithm>
int minimum_quotient(int numerator[], int denominator[], int size)
{
int minimum = INT_MAX; // minimum quotient
for (int i = 0; i < size; ++i)
for (int j = 0; j < size; ++j)
if (denominator[j] != 0) // avoid division by 0
minimum = std::min(minimum, numerator[i] / denominator[j]);
return minimum;
}
With size being a known, small number, this should be sufficient. However, if we are concerned about the case in which size becomes very large, we may want to avoid the above written solution, which scales proportionate to the square of the size of the input.
Here is an idea for a solution that scales better with larger sizes. Specifically it scales linearly with the size of the input. We can take advantage of the following facts:
If the numerators and denominators both have the same sign, then the smallest quotient will be from the numerator with the smallest absolute value and the denominator with the largest absolute value.
If the numerators and denominators have opposite signs, then the opposite is true: for the smallest quotient we want the numerator with the largest absolute value and the denominator with the smallest absolute value.
We can iterate through both lists once, accumulating the largest and smallest numerators and denominators, and then compare these at the end to find the smallest quotient:
#include <climits>
#include <algorithm>
int minimum_quotient(int numerator[], int denominator[], int size)
{
int min_num = INT_MAX, min_den = INT_MAX;
int max_num = INT_MIN, max_den = INT_MIN;
for (int i = 0; i < size; ++i)
{
min_num = std::min(min_num, numerator[i]);
max_num = std::max(max_num, numerator[i]);
min_den = std::min(min_den, denominator[i]);
max_den = std::max(max_den, denominator[i]);
}
int minimum = INT_MAX;
if (min_den != 0)
{
minimum = std::min(minimum, min_num / min_den);
minimum = std::min(minimum, max_num / min_den);
}
if (max_den != 0)
{
minimum = std::min(minimum, min_num / max_den);
minimum = std::min(minimum, max_num / max_den);
}
return minimum;
}

Adding Integers without longs

I am writing a class to model big integers. I am storing my data as unsigned ints in a vector pointer called data. What this function does is add n to the current big integer. What seems to be slowing down my performance is having to use long long values. Do any of you know a way around this? I currently have to make sum a long long or else it will overflow.
void Integer::u_add(const Integer & n)
{
std::vector<unsigned int> & tRef = *data;
std::vector<unsigned int> & nRef = *n.data;
const int thisSize = tRef.size();
const int nSize = nRef.size();
int carry = 0;
for(int i = 0; i < nSize || carry; ++i)
{
bool readThis = i < thisSize;
long long sum = (readThis ? (long long)tRef[i] + carry : (long long)carry) + (i < nSize ? nRef[i] : 0);
if(readThis)
tRef[i] = sum % BASE; //Base is 2^32
else
tRef.push_back(sum % BASE);
carry = (sum >= BASE ? 1 : 0);
}
}
Also just wondering if there is any benefit to using references to the pointers over just using the pointers themselves? I mean should I use tRef[i] or (*data)[i] to access the data.
Instead of using base 2^32, use base 2^30. Then when you add two values the largest sum will be 2^31-1 which fits into an ordinary long (signed or unsigned).
Or better still, use base 10^9 (roughly equal to 2^30), then you don't need a lot of effort to print the large numbers in decimal format.
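A minimal sketch of the base 10^9 suggestion, using free functions on a plain limb vector rather than the Integer class from the question. The names add_in_place and to_decimal are made up for this example. No 64-bit type is needed, because two limbs plus a carry add up to at most 1,999,999,999, which fits in 32 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint32_t kBase = 1000000000u;  // 10^9, every limb is below this

// a += b, with limbs stored least significant first.
void add_in_place(std::vector<std::uint32_t>& a,
                  const std::vector<std::uint32_t>& b) {
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < b.size() || carry != 0; ++i) {
        if (i == a.size()) a.push_back(0);
        std::uint32_t sum = a[i] + carry + (i < b.size() ? b[i] : 0u);
        carry = sum >= kBase ? 1u : 0u;
        a[i] = carry ? sum - kBase : sum;
    }
}

// Decimal printing is just concatenating zero-padded 9-digit limbs.
std::string to_decimal(const std::vector<std::uint32_t>& a) {
    if (a.empty()) return "0";
    std::string out = std::to_string(a.back());
    for (std::size_t i = a.size() - 1; i-- > 0;) {
        std::string part = std::to_string(a[i]);
        out += std::string(9 - part.size(), '0') + part;
    }
    return out;
}
```

The trade-off is that each limb wastes a little over two bits of storage compared with base 2^32.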
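If you really need to work in base 2^32, you can detect the carry from the wrapped sum itself. In C++ this is safe, because unsigned arithmetic is defined to wrap around rather than overflow (this assumes a 32-bit unsigned int):
unsigned int sum = term1 + term2; // wraps modulo 2^32
unsigned int carry = 0;
if (sum < term1) // the sum wrapped around, so a carry occurred
carry = 1;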
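Extending that carry test to a whole limb vector, including the incoming carry from the previous limb, might look like the sketch below. It uses a free function on plain vectors instead of the Integer class from the question, and add_base_2_32 is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// a += b in base 2^32, limbs stored least significant first.
// Only 32-bit arithmetic is used; carries are detected from wraparound.
void add_base_2_32(std::vector<std::uint32_t>& a,
                   const std::vector<std::uint32_t>& b) {
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < b.size() || carry != 0; ++i) {
        if (i == a.size()) a.push_back(0);
        std::uint32_t term = i < b.size() ? b[i] : 0u;

        std::uint32_t sum = a[i] + term;              // wraps modulo 2^32
        std::uint32_t carry_out = sum < term ? 1u : 0u;

        sum += carry;                                 // add the incoming carry
        if (sum < carry) carry_out = 1u;              // wrapped on this step

        a[i] = sum;
        carry = carry_out;
    }
}
```

The two wraparounds cannot both happen for the same limb: if the first addition wraps, the result is at most 2^32 - 2, so adding a carry of 1 cannot wrap again.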