I am trying to calculate a massive number which unsigned long long can't hold. To calculate it, I need to multiply a bunch of numbers and divide them by another bunch of numbers, and the numerator and denominator themselves also can't be held by unsigned long long. To handle this, I wrote some code to cancel out factors from both the numerator and the denominator. This is part of a larger coding problem I'm trying to solve; my program works for small test cases but fails for large ones, and I've deduced that the issue probably lies in my algorithm for canceling out the factors. I'm unable to use the debugger for large test cases because I'm calculating thousands of different large numbers, and I'm not sure where my algorithm fails in canceling out the factors.
The number I'm trying to calculate looks like n!/(a! * b! * ... * x!), where a+b+...+x=n. I used a std::map to store the numerator: each key is a unique factor occurring in the numerator, and it maps to the number of times that factor occurs. For example, if I had 3! in the numerator, I would have
1->1
2->1
3->1
In the denominator, I have a multiset of the different factors. For example, if the denominator was 2! * 1!, it would look like
1, 1, 2
To cancel out the numbers from the numerator and denominator, I start from the greatest number in the denominator and search for the greatest number in the numerator that can divide it. Using the above two examples, I would start from 2 in the denominator and cancel out the 2 in the numerator. I keep repeating this process until I finish canceling out everything in the denominator.
Here's my code
for (auto l = denominatorMultipliers2.rbegin(); l != denominatorMultipliers2.rend(); ++l) { // denominatorMultipliers2 holds the factors in the denominator
    auto m = numeratorMultipliers.rbegin(); // search for the greatest number in the numerator that can divide the factor in the denominator
    while (m->first % *l != 0) {
        ++m;
    }
    --m->second; // once we find the factor in the numerator, we subtract the number of times it occurs by one
    ++numeratorMultipliers[m->first / *l]; // sometimes, dividing a factor by another factor yields another number in the numerator, e.g. 4/2->2, so there's going to be another 2 in the numerator. numeratorMultipliers is the std::map that holds the factors in the numerator
    if (m->second == 0) { // if the factor in the numerator occurs 0 times, we delete it from the container
        numeratorMultipliers.erase(std::next(m).base());
    }
}
After canceling out the factors, all I'm left with are the factors of the numerator. A property of the number I'm calculating is that every factor in the denominator cancels out with factors in the numerator.
Does anyone see what is wrong with my algorithm?
My guess is that for test cases with large values, your algorithm fails because it can't cancel out all the values in the denominator.
That is expected, because the numbers you are tracking are not prime. To fully cancel out the denominator against the numerator, you'd need to cancel numbers partially (i.e., cancel their common factors).
One way to solve this:
1. Run your algorithm until it cannot proceed (i.e., some values in both numerator and denominator cannot be cancelled).
2. Pick a remaining number in the denominator, run the Euclidean algorithm against each of the remaining numbers in the numerator, and cancel out their GCD.
3. Repeat the above until all numbers in the denominator are cancelled out.
NOTE: I'd advise against direct factorization, because for large numbers the complexity may be high. You don't really care about fully factorizing the numerator numbers (which is the expensive part); you just need their GCD with the denominator numbers.
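Here's a minimal sketch of step 2 (identifiers are mine, not the asker's; std::gcd needs C++17's <numeric>):

#include <map>
#include <numeric>  // std::gcd

// Partially cancel one remaining denominator value d against the
// numerator map (factor -> multiplicity) using GCDs.
void cancelWithGcd(unsigned long long d,
                   std::map<unsigned long long, unsigned long long>& numerator)
{
    while (d > 1) {
        for (auto it = numerator.rbegin(); it != numerator.rend(); ++it) {
            unsigned long long g = std::gcd(d, it->first);
            if (g > 1) {
                ++numerator[it->first / g];  // keep the cofactor around
                if (--it->second == 0)
                    numerator.erase(std::next(it).base());
                d /= g;
                break;  // rescan with the reduced d
            }
        }
    }
}

This relies on the asker's stated property that the denominator always cancels completely; in general you would add a guard for the case where no common factor is found.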
If a user enters a number n (an integer) not equal to 0, my program should check whether the fraction 1/n has an infinite or finite number of digits after the decimal point. For example: for n=2 we have 1/2=0.5, therefore we have 1 digit after the decimal point. My first solution to this problem was this:
int n = 1;
cin >> n;
if ((1.0 / n) * n == 1)
{
    cout << "fixed number of digits after decimal point";
}
else cout << "infinite number of digits after decimal point";
Since the computer can't store infinite expansions like 1/3, I expected that (1/3)*3 wouldn't be equal to 1. The first time I ran the program, the result was what I expected, but when I ran the program today, for n=3 I got the output (1/3)*3=1. I was surprised by this result and tried
double fraction = 1.0/n;
cout<< fraction*n;
which also returned 1. Why is the behaviour different, and can I make my algorithm work? If I can't make it work, I will have to check whether n's divisors are only 1, 2 and 5, which, I think, would be harder to program and compute.
My IDE is Visual Studio, therefore my C++ compiler is VC.
Your code tries to make use of the fact that 1.0/n is not computed with perfect precision, which is true. Multiplying the result by n should then, in theory, get you something not equal to 1, also true.
Sadly, the multiplication by n in your code is ALSO not done with perfect precision.
The fact which trips your concept up is that the two imperfections can cancel each other out, and you get a seemingly perfect 1 in the end.
So, yes. Go with the divisor check.
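To see the cancellation in action, here's a small standalone demo (my own example, not from the question); n = 49 is a classic case where the two errors do not cancel in IEEE double precision:

#include <iostream>

// Demo: the rounding error of 1.0/n and the rounding error of the
// multiplication by n sometimes cancel exactly, sometimes not.
int main()
{
    for (int n : {2, 3, 7, 49})
    {
        double product = (1.0 / n) * n;
        std::cout << "n = " << n << ": (1.0/n)*n == 1 is "
                  << std::boolalpha << (product == 1.0) << '\n';
    }
}

On a typical IEEE machine the first three print true and n = 49 prints false, which is exactly the unreliability described above.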
Binary vs. decimal
Your assignment asks you whether the fraction 1/n can be represented with a finite number of digits in decimal representation. Floating-point numbers in C++ (such as double) are represented using binary, which has some similarities and some differences with decimal:
if a rational number can be represented in binary with a finite number of bits, then it can also be represented in decimal with a finite number of digits;
some numbers can be represented in decimal with a finite number of digits, but require an infinite number of bits in binary.
This is because 10 = 2 * 5; for any integer p, p / 2**k == (p * 5**k) / 10**k. So 1/2==5/10 and 1/4 == 25/100 and 1/8 == 125/1000 can be represented with finitely many digits or bits. But 1/5 can be represented with finitely many digits in decimal, yet requires infinitely many bits in binary.
Floating-point arithmetic and test for equality
See also: Is floating-point math broken? and What every programmer should know about floating-point arithmetic (pdf paper).
The computation (1.0 / n) * n results in an approximation; there is generally no way to know in advance whether checking for equality with 1.0 will return true or false. Compilers can warn when two floating-point numbers are tested for equality (with GCC and Clang this warning is enabled or disabled with the option -Wfloat-equal).
A different logic for your algorithm
You can't rely on floating-point arithmetic to decide your problem. But it's not needed. A number can be represented with finitely many decimal digits if and only if it can be written in the form p / 10**k with p and k integers. So your program should examine n to find out whether there exist j and k such that 1 / n == 1 / (2**j * 5**k), without using floating-point arithmetic.
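A minimal sketch of that check (function name is mine):

#include <cstdlib>   // std::abs
#include <iostream>

// 1/n has a terminating decimal expansion iff |n| == 2^j * 5^k.
bool oneOverNTerminates(int n)
{
    n = std::abs(n);
    if (n == 0) return false;  // the assignment guarantees n != 0
    while (n % 2 == 0) n /= 2; // strip all factors of 2
    while (n % 5 == 0) n /= 5; // strip all factors of 5
    return n == 1;             // nothing else may remain
}

int main()
{
    int n;
    std::cin >> n;
    if (oneOverNTerminates(n))
        std::cout << "fixed number of digits after decimal point";
    else
        std::cout << "infinite number of digits after decimal point";
}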
Doing one of my first homeworks at uni, I have run into this problem:
Task: Find the sum of all numbers with n digits, where n is given (n=1 means 1, 2, 3, ..., 8, 9, and the answer is 45).
Problem: The code I wrote gets all the test answers correct up to 10 to the power of 9, but when it reaches 10 to the power of 10 territory the answers start being wrong. They are really close to what I should be getting, but not quite there (for example, my output = 49499999995499995136, expected result = 49499999995500000000).
Would really appreciate some help/insights. I'm guessing it's something to do with the variable types, but I'm not sure of a possible solution.
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;

int main()
{
    int n;
    double ats = 0, maxi, mini;
    cin >> n;
    maxi = pow(10, n) - 1;
    mini = pow(10, n - 1) - 1;
    ats = (maxi * (maxi + 1)) / 2 - (mini * (mini + 1)) / 2;
    cout << setprecision(0) << fixed << ats;
}
The main cause of the problem is the pow() function. It works with double, not int; loss of accuracy is the price of representing huge numbers.
There are 3 ways to solve the problem:
1. For small n you can write your own long long int pow(int x, int p) function. But there is a problem: even long long int can overflow.
2. Use long-arithmetic (bignum) routines, as #rustyx said. You can write your own with vector, or find and include a library.
3. There is a math solution specific to this task, which avoids the big-number problem entirely.
You can write your formula like
((10^n - 1) * 10^n - (10^m - 1) * 10^m) / 2, where m = n-1
Then multiply out the numbers in the numerator, regroup them, and factor out the common multiple 10^(n-1). You can then see that the answer has the structure
X9...9Y0...0 for big enough n, where X and Y are constant digit groups.
So you can just print the answer as a string, without calculating anything big.
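For illustration, here's a sketch of that idea (the exact digit pattern is my own derivation from the formula, spot-checked for n = 1..5 and against the expected value for n = 10 above, so verify it before relying on it):

#include <iostream>
#include <string>

// Sum of all n-digit numbers, printed as a string. For n >= 3 the
// assumed pattern is "494", then n-3 nines, then "55", then n-2 zeros.
std::string sumOfNDigitNumbers(int n)
{
    if (n == 1) return "45";
    if (n == 2) return "4905";
    return "494" + std::string(n - 3, '9') + "55" + std::string(n - 2, '0');
}

int main()
{
    int n;
    std::cin >> n;
    std::cout << sumOfNDigitNumbers(n) << '\n';  // n = 10 -> 49499999995500000000
}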
I think you're stretching floating point beyond its precision. Let me explain:
The C pow() function takes doubles as arguments. You're passing ints, so the compiler adds code to convert them to doubles before they reach pow(). (And in any case you're storing the return value in a double, since you declared it that way.)
Floating-point numbers are called that way precisely because the point "floats". Inside a double there's a sign bit, some bits for the mantissa and some bits for the exponent. In binary, multiplying by a power of two is equivalent to moving the fractional point to the right (or to the left when the exponent is negative), so the exponent field essentially says where the fractional point is. The great advantage of this in-memory representation is that you get a lot of precision for numbers close to 0, and gradually lose absolute precision as numbers become bigger.
That last part is exactly what's happening to you. Your number is too large to be stored exactly, so it's being rounded to the closest representable value: a 53-bit significand times a power of two.
Quick experiment: press F12 in your browser, open the JavaScript console and type 49499999995499995136 (JavaScript numbers are doubles too). In my case, in Chrome, I reproduce the same problem.
If you really, really want precision with such big numbers you can try an arbitrary-precision ("bignum") library, but that's overkill for a student program; you don't need it. Just add an if block and print an error message if the number the user typed is too big (professors love that, and it's actually quite correct).
Here is my code:
#include <iostream>
#include <cmath>
using namespace std;

int factorial(int);

int main()
{
    for (int k = 0; k < 100000; k++)
    {
        static double sum = 0.0;
        double term;
        term = (double)pow(-1.0, k) * (double)pow(4.0, 2*k+1) / factorial(2*k+1);
        sum = sum + term;
        cout << sum << '\n';
    }
}

int factorial(int n)
{
    if (n == 0)
    {
        return 1;
    }
    return n * factorial(n - 1);
}
I'm just trying to calculate the value of sin(4) using the Maclaurin series expansion of sine. For each console output, the value reads 'nan'. The console gives an error and shuts down after about 10 seconds. I don't get any errors in the IDE.
There are multiple problems with your approach.
Your factorial function can't return an int. The return value becomes way too big, very quickly.
Using pow(-1, k) to get an alternating sign is very inefficient and will yield an incorrect value pretty quickly. You should pick 1.0 or -1.0 depending on k's parity.
When you sum a long series of terms, you want to sum the terms with the smallest magnitude first. Otherwise the large partial sum limits how many low-order bits of each small term survive, and you lose precision. In your case, the power of four is eventually dominated by the factorial, so you sum the highest-magnitude values first; you'd probably get better precision starting from the other end.
Algorithmically, if you're going to raise 4 to the power 2k+1 and then divide by (2k+1)!, you should keep both lists of factors (4, 4, 4, 4, ...) and (2, 3, 4, 5, 6, 7, 8, 9, ...) and simplify both sides; there are plenty of fours to cancel between the numerators and denominators. A simpler incremental alternative is sketched below.
Even with those fixes, I'm not sure you can get anywhere close to the 100000 target you set without specialized code.
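For illustration, here's a minimal standalone sketch of that incremental alternative (my own variant, not the asker's code): each term is derived from the previous one via term_{k+1} = -term_k * x^2 / ((2k+2)(2k+3)), so no factorial or pow is ever computed and nothing overflows.

#include <iostream>

int main()
{
    const double x = 4.0;
    double term = x;   // k = 0 term: x^1 / 1!
    double sum = 0.0;
    for (int k = 0; k < 30; ++k)  // ~30 terms exhaust double precision here
    {
        sum += term;
        term *= -x * x / ((2 * k + 2) * (2 * k + 3));  // next Maclaurin term
    }
    std::cout << sum << '\n';  // approximately -0.756802, i.e. sin(4)
}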
As already stated by others, the intermediate results you get for large k are orders of magnitude too large to fit into a double. From a certain k on, pow as well as a floating-point factorial will return infinity; this is simply what happens for very large doubles. And as you then divide one infinity by another, you get NaN.
One common trick to deal with too large numbers is using logarithms for intermediate results and only in the end apply the exponential function once.
Some mathematical knowledge of logarithms is required here. To understand what I am doing here you need to know exp(log(x)) == x, log(a^b) == b*log(a), and log(a/b) == log(a) - log(b).
In your case you can rewrite
pow(4, 2*k+1)
to
exp((2*k+1)*log(4))
Then there is still the factorial. The lgamma function can help with factorial(n) == gamma(n+1) and log(factorial(n)) == lgamma(n+1). In short, lgamma gives you the log of a factorial without huge intermediate results.
So summing up, replace
pow(4, 2*k+1) / factorial(2*k+1)
with
exp((2*k+1)*log(4) - lgamma(2*k+2))
This should fix your NaNs. It should also increase performance, as lgamma operates in O(1) whereas your factorial is in O(k).
Note, however, that I still have very little confidence that your result will be numerically accurate.
A double has a limited precision of roughly 16 decimal digits, so your 100000 iterations are very likely worthless, probably even harmful.
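Putting it together, a minimal sketch of the suggested rewrite (my own code, with the sign handled separately since logarithms only work for positive values):

#include <cmath>
#include <iostream>

int main()
{
    double sum = 0.0;
    for (int k = 0; k < 100; ++k)
    {
        // log of 4^(2k+1) / (2k+1)!, computed without huge intermediates
        double logTerm = (2 * k + 1) * std::log(4.0) - std::lgamma(2 * k + 2);
        double sign = (k % 2 == 0) ? 1.0 : -1.0;  // (-1)^k without pow
        sum += sign * std::exp(logTerm);
    }
    std::cout << sum << '\n';  // approximately -0.756802, i.e. sin(4)
}

For large k the logTerm becomes very negative and exp underflows harmlessly to zero, so the extra iterations just waste time instead of producing NaN.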
The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
That part works flawlessly: the program creates a random number between 0 and 1, where 0 is considered heads and a success.
Then the program is supposed to output the expected probabilities of getting x heads. For example, if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the results come out as weird values; it happens when 15 or 20 are given as input. I have been getting 0's and negative values where xxx should be.
Debugging, I have noticed that nCk comes out negative and incorrect for the upper values, and I believe this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
long fact(int x)
{
    int e; // local counter
    factor = 1;
    for (e = x; e != 0; e--)
    {
        factor = factor * e;
    }
    return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.
You haven't mentioned how factor is declared. I think you are getting integer overflows, so I suggest you use double: since you are calculating expected values and probabilities, you shouldn't be concerned much about exactness.
Try changing your fact function to:
double fact(double x)
{
    int e; // local counter
    double factor = 1;
    for (e = x; e != 0; e--)
    {
        factor = factor * e;
    }
    return factor;
}
EDIT:
Also, to calculate nCk you need not compute three factorials. You can calculate the value in the following way: if k > n/2, first set k = n-k (since nCk == nC(n-k)); then

          n(n-1)(n-2)...(n-k+1)
    nCk = ---------------------
              factorial(k)
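A minimal sketch of that computation (function name is my own):

// nCk in double precision; multiplying and dividing alternately keeps
// intermediate values small. Each partial result is the binomial
// coefficient C(n-k+i, i), so no huge factorial is ever formed.
double nChooseK(int n, int k)
{
    if (k > n - k)
        k = n - k;                      // nCk == nC(n-k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i)
        result = result * (n - k + i) / i;
    return result;
}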
You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number; what type that is will depend on what values you need.
long is a signed integer, and as soon as you pass 2^31 - 1 (on platforms where long is 32 bits), the value becomes negative (it's using 2's complement math).
Using an unsigned long buys you a little time (one more bit), but for factorial it's probably not worth it. If your compiler supports long long, then try unsigned long long; that will (usually, depending on compiler and CPU) double the number of bits you're using.
You can also try switching to double. The problem you'll face there is that you'll lose accuracy as the numbers increase: a double is a floating-point number with a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't.
If none of these solutions works for you, you may need to resort to an "infinite precision" math package, which you should be able to search for. You didn't say whether you're using C or C++; this will be a lot more pleasant in C++, since a library can provide a class that acts like a number and works with the standard arithmetic operators.
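A small demo of the trade-off (my own illustration): 25! is far beyond the 64-bit range, so the integer result silently wraps around, while the double keeps the right magnitude but only about 16 significant digits.

#include <iostream>

int main()
{
    unsigned long long u = 1;
    double d = 1.0;
    for (int i = 1; i <= 25; ++i) { u *= i; d *= i; }
    // 25! == 15511210043330985984000000, far beyond 2^64 - 1,
    // so the unsigned result has wrapped around (modulo 2^64).
    std::cout << "unsigned long long: " << u << '\n';
    // Approximately 1.55112e+25: right magnitude, rounded mantissa.
    std::cout << "double:             " << d << '\n';
}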
I was trying to code an algorithm to count the number of different possible ways to make a certain amount with the given denominations.
Assume the US dollar is available in denominations of $100, $50, $20, $10, $5, $1, $0.25, $0.10, $0.05 and $0.01. The function below works great for an int amount and int denominations:
/* Count number of ways of making different combinations */
int Count_Num_Ways(double amt, int numDenom, double S[]) {
    cout << amt << endl; //getchar();
    /* combination leads to the amount */
    if (amt == 0.00)
        return 1;
    /* no combination can lead to a negative sum */
    if (amt < 0.00)
        return 0;
    /* all denominations have been exhausted and we have not
       reached the required sum */
    if (numDenom < 0 && amt >= 0.00)
        return 0;
    /* either we pick this denomination, reducing the sum for the
       subproblem by the picked denomination, or we skip this
       denomination and try a different one */
    return Count_Num_Ways(amt, numDenom - 1, S) +
           Count_Num_Ways(amt - S[numDenom], numDenom, S);
}
but when I change my logic from int to float, it goes into an infinite loop. I suspect this is because of the floating-point comparisons in the code, but I'm not able to figure out the exact cause of the infinite-loop behavior.
Any help in this regard would be appreciated.
When dealing with such "small" currency amounts, and no interest calculations, it is much easier to work with cents and integer amounts only, avoiding floating point entirely.
So just change your formula to use cents rather than dollars and keep using integers. When you need to display the amounts, divide by 100 to get the dollars and take the remainder modulo 100 to get the cents.
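A minimal sketch of that change applied to the function above (the array contents and the example amount are mine):

#include <iostream>

/* Same recursion, but everything is in integer cents, so every
   comparison is exact and the recursion terminates reliably. */
int Count_Num_Ways(int amtCents, int numDenom, const int S[]) {
    if (amtCents == 0) return 1;   // exact match: one valid combination
    if (amtCents < 0)  return 0;   // overshot: dead end
    if (numDenom < 0)  return 0;   // denominations exhausted
    return Count_Num_Ways(amtCents, numDenom - 1, S) +
           Count_Num_Ways(amtCents - S[numDenom], numDenom, S);
}

int main()
{
    const int S[] = {1, 5, 10, 25, 100, 500, 1000, 2000, 5000, 10000};
    std::cout << Count_Num_Ways(100, 9, S) << '\n';  // ways to make $1.00
}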
Floating-point operations are not exact, because of the finite representation, so you cannot count on ever ending up with exactly 0.0. That's why you always test against an interval, like so:
if (fabs(amt - 0.0) < TOL)
with a given tolerance TOL. TOL is chosen appropriately for the application; in this case, half a cent should already be fine.
EDIT: Of course, for this kind of thing, Daemin's answer is much more suitable.