I was trying to code an algorithm to count the number of different possible ways to make a certain amount with the given denominations.
Assume the US dollar is available in denominations of $100, $50, $20, $10, $5, $1, $0.25, $0.10, $0.05 and $0.01. The function below works great for an int amount and int denominations:
/* Count number of ways of making different combination */
int Count_Num_Ways(double amt, int numDenom, double S[]){
cout << amt << endl; //getchar();
/* combination leads to the amount */
if(amt == 0.00)
return 1;
/* No combination can lead to negative sum*/
if(amt < 0.00)
return 0;
/* All denominations have been exhausted and we have not reached
the required sum */
if(numDenom < 0 && amt >= 0.00)
return 0;
/* either we pick this denomination, this causes a reduction of
picked denomination from the sum for further subproblem, or
we choose to not pick this denomination and
try a different denomination */
return Count_Num_Ways(amt, numDenom - 1, S) +
Count_Num_Ways(amt - S[numDenom], numDenom, S);
}
But when I change my logic from int to float, it goes into an infinite loop. I suspect that this is because of the floating-point comparisons in the code, but I am not able to figure out the exact cause of the infinite-loop behavior.
Any help in this regard would be appreciated.
When dealing with such "small" currency amounts and not dealing with interest it will be much easier to just deal with cents and integer amounts only, not using floating point.
So just change your formula to use cents rather than dollars and keep using integers. Then when you need to display the amounts just divide them by 100 to get the dollars and modulo 100 to get the cents.
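As a rough sketch of the cents-only idea, the count can be done entirely in integers. Note that this uses a bottom-up table rather than the recursion from the question, and that the function name countWays and its parameters are made up for illustration:

```cpp
#include <cstddef>
#include <vector>

// Counts the ways to form amountCents from the given denominations,
// all expressed in whole cents. Order does not matter, so {5, 1} and
// {1, 5} count as a single combination.
long long countWays(int amountCents, const std::vector<int>& denomCents) {
    if (amountCents < 0) return 0;  // no combination reaches a negative sum
    std::vector<long long> ways(static_cast<std::size_t>(amountCents) + 1, 0);
    ways[0] = 1;  // exactly one way to make zero: pick nothing
    for (int coin : denomCents) {
        if (coin <= 0) continue;  // ignore nonsensical denominations
        for (int a = coin; a <= amountCents; ++a) {
            ways[a] += ways[a - coin];  // add the combinations that use this coin
        }
    }
    return ways[amountCents];
}
```

For example, countWays(100, {1, 5, 10, 25}) counts the ways to make one dollar from pennies, nickels, dimes and quarters. Since every comparison is between integers, the exact-zero test that fails with floating point never comes up.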
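Floating-point operations are generally not exact, because of the finite binary representation: a value such as 0.10 or 0.01 is stored only approximately, and the error accumulates with every subtraction. As a result, you may never end up with exactly 0.0. That's why you test against an interval instead, like so: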
if (fabs(amt - 0.0) < TOL)
with a given tolerance of TOL. TOL is chosen appropriately for the application, in this case, 1/2 cent should already be fine.
EDIT: Of course, for this kind of thing, Daemin's answer is much more suitable.
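A minimal sketch of the tolerance test, assuming a tolerance of half a cent as suggested above. The helper name nearlyZero is hypothetical:

```cpp
#include <cmath>

// Returns true when amt is within tol of zero.
// The default tolerance of 0.005 (half a cent) is an assumption that
// suits dollar amounts; other applications need a different value.
bool nearlyZero(double amt, double tol = 0.005) {
    return std::fabs(amt) < tol;
}
```

With this, an amount such as 0.30 that has had 0.10 subtracted three times is treated as zero, even if the stored result is a tiny nonzero value.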
Here is my code:
#include <iostream>
#include <cmath>
using namespace std;
int factorial(int);
int main()
{
for(int k = 0; k < 100000; k++)
{
static double sum = 0.0;
double term;
term = (double)pow(-1.0, k) * (double)pow(4.0, 2*k+1) / factorial(2*k+1);
sum = sum + term;
cout << sum << '\n';
}
}
int factorial(int n)
{
if(n == 0)
{
return 1;
}
return n*factorial(n-1);
}
I'm just trying to calculate the value of sine(4) using the Maclaurin expansion of sine. For each console output, the value reads 'nan'. The console gives an error and shuts down after about 10 seconds. I don't get any errors in the IDE.
There are multiple problems with your approach.
Your factorial function can't return an int. The return value will be way too big, very quickly: 13! already exceeds the range of a 32-bit int.
Using pow(-1, value) to get an alternating positive/negative one is very inefficient. You should pick 1.0 or -1.0 depending on k's parity.
When you sum a long series of terms, you want to sum the terms with the least magnitude first. Otherwise, you lose precision, because the limited number of mantissa bits cannot hold both a large running sum and a much smaller term. In your case, the power of four is dominated by the factorial, so you sum the highest-magnitude values first. You'd probably get better precision starting from the other end.
Algorithmically, if you're going to raise 4 to the 2k+1 power and then divide by (2k+1)!, you should keep both lists of factors (4, 4, 4, 4...) and (2,3,4,5,6,7,8,9,....) and simplify both sides. There are plenty of fours to remove from the numerators and denominators at the same time.
Even with those four fixes, I'm not sure you can get anywhere close to the 100000 target you set without specialized code.
As already stated by others, the intermediate results you will get for large k are magnitudes too large to fit into a double. From a certain k on pow as well as factorial will return infinity. This is simply what happens for very large doubles. And as you then divide one infinity by another you get NaN.
One common trick to deal with too large numbers is using logarithms for intermediate results and only in the end apply the exponential function once.
Some mathematical knowledge of logarithms is required here. To understand what I am doing here you need to know exp(log(x)) == x, log(a^b) == b*log(a), and log(a/b) == log(a) - log(b).
In your case you can rewrite
pow(4, 2*k+1)
to
exp((2*k+1)*log(4))
Then there is still the factorial. The lgamma function can help with factorial(n) == gamma(n+1) and log(factorial(n)) == lgamma(n+1). In short, lgamma gives you the log of a factorial without huge intermediate results.
So summing up, replace
pow(4, 2*k+1) / factorial(2*k+1)
With
exp((2*k+1)*log(4) - lgamma(2*k+2))
This should help you with your NaNs. Also, this should increase performance as lgamma operates in O(1) whereas your factorial is in O(k).
Note, however, that I still have very little confidence that your result will be numerically accurate.
A double still has a limited precision of roughly 16 decimal digits. Your 100000 iterations are very likely worthless, probably even harmful.
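Put together, the log-space replacement could look roughly like this. The function names are made up for illustration, and the sketch assumes x > 0 so that log(x) is defined:

```cpp
#include <cmath>

// Magnitude of the k-th Maclaurin term of sin(x), computed in log space:
//   x^(2k+1) / (2k+1)!  ==  exp((2k+1)*log(x) - lgamma(2k+2))
// Assumes x > 0.
double termMagnitude(double x, int k) {
    const double n = 2.0 * k + 1.0;
    return std::exp(n * std::log(x) - std::lgamma(n + 1.0));
}

// Sums the first `terms` terms, choosing the sign from the parity of k.
double sineViaLgamma(double x, int terms) {
    double sum = 0.0;
    for (int k = 0; k < terms; ++k) {
        const double sign = (k % 2 == 0) ? 1.0 : -1.0;
        sum += sign * termMagnitude(x, k);
    }
    return sum;
}
```

For large k the exponent becomes hugely negative and exp simply returns 0, so no infinity and no NaN appears, even with 100000 terms.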
I need to find the remainder of an average taken from an array so that I can properly round up or down. Basically, if the average of the array has a decimal part that is greater than or equal to 0.5, I need to round up, otherwise I need to round down.
long arrayaverage;
long average;
long avremain;
long output;
average = ((array1[0]+array1[1]+array1[2]+array1[3]+array1[4])/5);
avremain=average%long(1.0);
if (avremain>=0.5)
{
output=((average-avremain)+1);
}
else if (avremain<0.5)
{
output=(average-avremain);
}
cout<<"The average of the array is: "<<output<<endl;
There is my code, but the issue I think is with the modulo operator. Whenever I try to run my code, I get the average but it always rounds up. Any help would be much appreciated!
(side note: in an earlier segment of the code, 5 values are collected from the user to form the array, and I know there is no issue with that since an earlier part of the code runs just fine.)
It is not necessary to round up or down, since you're doing all your work with integral types.
long sum = array1[0]+array1[1]+array1[2]+array1[3]+array1[4];
long remainder = sum % 5L; // 5L for consistency working with longs
if (2*remainder > 5L) // this will effectively round up
sum += 5L;
sum -= remainder; // sum will be a multiple of 5 now
average = sum/5L;
All I've done is do the adjustments to the sum before dividing by the number, rather than trying to adjust the average (in which case, rounding is toward zero, and remainder information is lost).
I have assumed that the sum and therefore the average is positive. The adjustment for negatives is trivial, and I'll leave that as an exercise.
BTW: in your code, in the expression avremain=average%long(1.0), long(1.0) is equal to 1. The remainder of dividing any integer by 1 is always zero, so your approach achieves (literally) nothing.
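The same adjust-before-dividing idea can be folded into a single expression. This is a slightly different formulation from the code above, the name roundedAverage is hypothetical, and it assumes a non-negative sum and a positive count:

```cpp
// Rounds sum / count to the nearest integer, with halves rounding up.
// Adding count before dividing by 2 * count is the integer equivalent
// of adding 0.5 before truncating.
// Assumes sum >= 0 and count > 0.
long roundedAverage(long sum, long count) {
    return (2 * sum + count) / (2 * count);
}
```

For a sum of 13 over 5 values (2.6) this gives 3; for a sum of 12 (2.4) it gives 2.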
You have declared everything as long, and that is why the decimal part is not stored. The avremain variable will always be 0, so the result is always the floor of the average.
Use double instead of long:
double arrayaverage;
double average;
double avremain;
double output;
average = (array1[0]+array1[1]+array1[2]+array1[3]+array1[4])/5.0; // 5.0 keeps the fraction even if array1 holds integers
avremain=average - floor(average);
if (avremain>=0.5)
{
output=((average-avremain)+1);
}
else if (avremain<0.5)
{
output=(average-avremain);
}
cout<<"The average of the array is: "<<output<<endl;
To use floor, write #include <cmath>
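If the floating-point route is taken, the standard library can also do the rounding itself. A small sketch, assuming the values sit in a non-empty std::vector; the name roundedMean is made up:

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Averages the values in double precision and rounds to the nearest
// integer. std::lround rounds halfway cases away from zero.
// Assumes values is not empty.
long roundedMean(const std::vector<long>& values) {
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return std::lround(sum / static_cast<double>(values.size()));
}
```

Starting the accumulation from 0.0 rather than 0 is what makes the sum, and therefore the division, happen in double.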
The background:
I have been working on the following problem, "The Trip" from "Programming Challenges: The Programming Contest Training Manual" by S. Skiena:
A group of students are members of a club that travels annually to
different locations. Their destinations in the past have included
Indianapolis, Phoenix, Nashville, Philadelphia, San Jose, and Atlanta.
This spring they are planning a trip to Eindhoven.
The group agrees in advance to share expenses equally, but it is not
practical to share every expense as it occurs. Thus individuals in the
group pay for particular things, such as meals, hotels, taxi rides,
and plane tickets. After the trip, each student's expenses are tallied
and money is exchanged so that the net cost to each is the same, to
within one cent. In the past, this money exchange has been tedious and
time consuming. Your job is to compute, from a list of expenses, the
minimum amount of money that must change hands in order to equalize
(within one cent) all the students' costs.
Input
Standard input will contain the information for several trips. Each
trip consists of a line containing a positive integer n denoting the
number of students on the trip. This is followed by n lines of input,
each containing the amount spent by a student in dollars and cents.
There are no more than 1000 students and no student spent more than
$10,000.00. A single line containing 0 follows the information for the
last trip.
Output
For each trip, output a line stating the total amount of money, in
dollars and cents, that must be exchanged to equalize the students'
costs.
(Bold is mine, book here, site here)
I solved the problem with the following code:
/*
* the-trip.cpp
*/
#include <iostream>
#include <iomanip>
#include <cmath>
int main( int argc, char * argv[] )
{
int students_number, transaction_cents;
double expenses[1000], total, average, given_change, taken_change, minimum_change;
while (std::cin >> students_number) {
if (students_number == 0) {
return 0;
}
total = 0;
for (int i=0; i<students_number; i++) {
std::cin >> expenses[i];
total += expenses[i];
}
average = total / students_number;
given_change = 0;
taken_change = 0;
for (int i=0; i<students_number; i++) {
if (average > expenses[i]) {
given_change += std::floor((average - expenses[i]) * 100) / 100;
}
if (average < expenses[i]) {
taken_change += std::floor((expenses[i] - average) * 100) / 100;
}
}
minimum_change = given_change > taken_change ? given_change : taken_change;
std::cout << "$" << std::setprecision(2) << std::fixed << minimum_change << std::endl;
}
return 0;
}
My original implementation had float instead of double. It was working with the small problem instances provided with the description and I spent a lot of time trying to figure out what was wrong.
In the end I figured out that I had to use double precision; apparently some big input in the programming challenge tests made my algorithm fail with float.
The question:
Given that the input can have 1000 students and each student can spend up to $10,000, my total variable has to store a number of up to
10,000,000.
How should I decide which precision is needed?
Is there something that should have given me a hint that float wasn't enough for this task?
I later realized that in this case I could have avoided floating point altogether, since my numbers fit into integer types, but I'm still interested in understanding if there was a way to foresee that float wasn't precise enough in this case.
Is there something that should have given me a hint that float wasn't enough for this task?
The fact that 0.10 is not representable at all in binary floating-point (which both float and double are if you use an ordinary computer) should have been the hint. Binary floating-point is perfect for physical quantities that arrive inaccurate to begin with, or for computations that will be inaccurate anyway whatever the reasonable numerical system with decidable equality. Exact computations of monetary amounts are not a good application of binary floating-point.
How should I decide which precision is needed? … my total variable has to store a number of the maximum size of 10,000,000.
Use an integer type to represent numbers of cents. By your own reasoning, you shouldn't have to deal with amounts of more than 1,000,000,000 cents, so long should be enough, but just use long long and save yourself the risk of trouble with corner cases.
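As an illustration of the integer approach, the core of the computation can work on cents throughout. This is a sketch, not the asker's program: the function name minimumExchangeCents is hypothetical, input parsing is left out, and the amounts are assumed to have been converted to whole cents already:

```cpp
#include <algorithm>
#include <vector>

// Minimum total, in cents, that must change hands so that all
// expenses are equal to within one cent.
// Assumes expensesCents is not empty.
long long minimumExchangeCents(const std::vector<long long>& expensesCents) {
    const long long n = static_cast<long long>(expensesCents.size());
    long long total = 0;
    for (long long e : expensesCents) total += e;

    const long long low = total / n;                        // share rounded down
    const long long high = low + (total % n != 0 ? 1 : 0);  // share rounded up

    long long given = 0;  // owed by those who spent less than the low share
    long long taken = 0;  // due to those who spent more than the high share
    for (long long e : expensesCents) {
        if (e < low) given += low - e;
        if (e > high) taken += e - high;
    }
    return std::max(given, taken);
}
```

For expenses of $15.00, $15.01, $3.00 and $3.01 (1500, 1501, 300 and 301 cents) this yields 1199 cents, that is $11.99. No rounding question arises at any step, because integer division and remainder are exact.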
As you said: never use floating-point variables to represent money. Use an integer representation instead, either as one large number of cents (or whatever the smallest fraction of the local currency is), or as two numbers (which makes the math a bit more awkward, but makes it easier to see, read and write the value as two units).
The motivation for not using floating point is that it's "often not accurate". Just as 1/3 can't be written as an exact value in decimal representation (no matter how many threes you write, the actual answer has more threes), binary floating-point values cannot precisely describe some decimal values. You then get "Your value of 0.20 does not match the 0.20 that the customer owes", which doesn't make sense to a human, but "0.200000000001" and "0.19999999999" aren't the same thing according to the computer. Eventually, those little rounding errors will cause some big problem in one way or another, regardless of whether it's float, double or extra_super_long_double.
However, if you have a question like this: "if I have to represent a value of 10 million, with a precision of 1/100th of the unit, how big a floating-point variable do I need?", your calculation becomes:
float bigNumber = 10000000;
float smallNumber = 0.01;
float bits = log2(bigNumber/smallNumber);
cout << "Bits in mantissa needed: " << ceil(bits) << endl;
So, in this case, we get bits as 29.897, so you need 30 bits (in other words, float, with its 24-bit mantissa, is not good enough).
Of course, if you do not need fractions of a dollar (or whatever), you can get away with fewer bits. Namely log2(10000000) = 23.25, so 24 bits of mantissa: exactly what a float offers, with nothing to spare for rounding in intermediate results.
10,000,000>2^23 so you need at least 24 bits of mantissa, which is what single precision provides. Because of intermediate rounding, the last bits can err.
1 digit ~ 3.321928 bits.
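The bit-count estimate can be compared directly against what the types provide, since std::numeric_limits reports the width of the significand. A sketch with made-up helper names:

```cpp
#include <cmath>
#include <limits>

// Bits of significand needed to distinguish values up to maxValue
// in steps of resolution.
int bitsNeeded(double maxValue, double resolution) {
    return static_cast<int>(std::ceil(std::log2(maxValue / resolution)));
}

// True when a float's significand is wide enough for that range.
// numeric_limits<float>::digits is 24 on IEEE 754 platforms
// (53 for double).
bool fitsInFloat(double maxValue, double resolution) {
    return bitsNeeded(maxValue, resolution) <= std::numeric_limits<float>::digits;
}
```

Here fitsInFloat(1e7, 0.01) is false, because 30 bits are needed, while fitsInFloat(1e7, 1.0) is only just true at 24 bits. Even a positive answer only says the final value is representable; it says nothing about rounding in intermediate results.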
The program asks the user for the number of times to flip a coin (n; the number of trials).
A success is considered a heads.
Flawlessly, the program creates a random number between 0 and 1. 0's are considered heads and success.
Then, the program is supposed to output the expected values of getting x amount of heads. For example if the coin was flipped 4 times, what are the following probabilities using the formula
nCk * p^k * (1-p)^(n-k)
Expected 0 heads with n flips: xxx
Expected 1 heads with n flips: xxx
...
Expected n heads with n flips: xxx
When doing this with "larger" numbers, the numbers come out to weird values. It happens if 15 or 20 is given as input. I have been getting 0's and negative values for the value that should be xxx.
Debugging, I have noticed that nCk comes out negative and incorrect for the larger values, and I believe this is the issue. I use this formula for my combination:
double combo = fact(n)/fact(r)/fact(n-r);
Here is the pseudocode for my fact function:
long fact(int x)
{
int e; // local counter
factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
Any thoughts? My guess is my factorial or combo functions are exceeding the max values or something.
You haven't mentioned how factor is declared. I think you are getting integer overflows. I suggest you use double: since you are calculating expected values and probabilities, you shouldn't need to be too concerned about exact precision.
Try changing your fact function to:
double fact(double x)
{
int e; // local counter
double factor = 1;
for (e = x; e != 0; e--)
{
factor = factor * e;
}
return factor;
}
EDIT:
Also to calculate nCk, you need not calculate factorials 3 times. You can simply calculate this value in the following way.
if k > n/2, k = n-k.
n(n-1)(n-2)...(n-k+1)
nCk = -----------------------
factorial(k)
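The shortcut above can be written as a running product, which avoids forming any full factorial. This is a sketch in double, in line with the suggestion to use double; the names binomial and binomialPmf are made up for illustration:

```cpp
#include <cmath>

// nCk as a running product: after step i, result holds C(n-k+i, i).
// Returns 0 for k outside [0, n].
double binomial(int n, int k) {
    if (k < 0 || k > n) return 0.0;
    if (k > n - k) k = n - k;  // symmetry: nCk == nC(n-k)
    double result = 1.0;
    for (int i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;
    }
    return result;
}

// Probability of exactly k successes in n trials:
//   nCk * p^k * (1-p)^(n-k)
double binomialPmf(int n, int k, double p) {
    return binomial(n, k) * std::pow(p, k) * std::pow(1.0 - p, n - k);
}
```

With n = 20 the intermediate values stay tiny compared with 20!, so inputs such as 15 or 20 no longer overflow anything.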
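You're exceeding the maximum value of a long. Factorial grows so quickly that you need the right type of number; what type that is will depend on what values you need.
long is a signed integer, and where it is 32 bits wide, a result past 2^31 - 1 no longer fits. In practice the value typically wraps around and becomes negative (two's complement math), although signed overflow is formally undefined behavior in C and C++. 13! is already too large for 32 bits.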
Using an unsigned long will buy you a little time (one more bit), but for factorial, it's probably not worth it. If your compiler supports long long, then try an "unsigned long long". That will (usually, depends on compiler and CPU) double the number of bits you're using.
You can also try switching to use double. The problem you'll face there is that you'll lose accuracy as the numbers increase. A double is a floating point number, so you'll have a fixed number of significant digits. If your end result is an approximation, this may work okay, but if you need exact values, it won't work.
If none of these solutions will work for you, you may need to resort to using an "infinite precision" math package, which you should be able to search for. You didn't say if you were using C or C++; this is going to be a lot more pleasant with C++ as it will provide a class that acts like a number and that would use standard arithmetic operators.
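To see how far the wider integer type actually gets, the multiplication can be guarded so that overflow is reported instead of silently wrapping. A sketch with a hypothetical helper name, assuming unsigned long long is 64 bits wide (the usual case):

```cpp
#include <limits>

// Computes n! in unsigned long long arithmetic.
// Returns false, leaving out unchanged, if the result would not fit.
bool checkedFactorial(unsigned n, unsigned long long& out) {
    unsigned long long result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        // result * i would overflow exactly when result > max / i
        if (result > std::numeric_limits<unsigned long long>::max() / i) {
            return false;
        }
        result *= i;
    }
    out = result;
    return true;
}
```

With 64 bits, 20! still fits and 21! does not, so the wider type only postpones the problem by a few steps.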
I'm doing some homework for a C++ class, and I'm pretty new to C++. I've run into some issues with my if statement... What I'm doing is having the user input a time between 0.00 and 23.59 (the : is replaced by a period, by the way). That part works. I am then separating the hour and the minute, and checking them to make sure that they are within valid limits. Checking the hour works, but not the minute... Here's my code:
minute= startTime - static_cast<int>(startTime);
hour= static_cast<int>(startTime);
//check validity
if (minute > 0.59) {
cout << "ERROR! ENTERED INVALID TIME! SHUTTING DOWN..." << endl;;
return(0);
}
if (hour > 23) {
cout << "ERROR! ENTERED INVALID TIME! SHUTTING DOWN..." << endl;;
return(0);
}
Again, the hour works if I enter 23, but if I enter 23.59, I get the error, while if I enter 23.01 I do not. Also, if I enter 0.59 it gives the error too, but not 0.58. I tried switching the if(minute > 0.59) to if(minute > 0.6), but for some reason that caused problems elsewhere. I am at a complete loss as to what to do, so any help would be great! Thanks a million!
EDIT: I just entered 0.58, and it didn't give me the error... but if I make it 1.59 it gives an error again... also, upvotes would be nice :D
Floating-point arithmetic (float and double) is inherently inexact for decimal fractions. There are often some digits behind the decimal point that you don't see in the rounded representation that is sent to your stream, and comparisons can give surprising results because the representation you are used to (decimal) is not the one the computer uses (binary).
Represent a time as int hours and int minutes, and your problems will fade away. Most libraries measure time in ticks (usually seconds or microseconds) and do not offer sub-tick resolution. You do well to emulate them.
Comparison of floating-point numbers is prone to failure, because very few decimal fractions can be represented exactly in base 2. There's always going to be some possibility that two values that should be equal have been rounded in different directions.
The simplest fix is to add a very tiny fudge factor to the number you're comparing:
if (minute > 0.59 + epsilon)
See What Every Computer Scientist Should Know About Floating-Point Arithmetic.
Don't ever use double (or float) to store two integer values with the integer part and the decimal part acting as separate fields. The floating-point types cannot represent most decimal fractions exactly, so you may get wrong results (whenever a value gets rounded).
A good solution in your case is to use either a struct (or a pair) or even a single integer.
With an integer, you can use division and modulo by multiples of ten to extract the real values. Example:
int startTime = 2359;
int hour = startTime / 100;
int minute = startTime % 100;
Although, with struct the code would look simpler and easier to understand (and to maintain).
There's no way you can compare to exactly 0.59; this value
cannot be represented in the machine. You're comparing
something which is very close to 0.59 with something else that
is very close to 0.59.
Personally, I wouldn't input time this way at all. A much
better solution would be to learn how to use std::istream to
read the value. But if you insist, and you have startTime as
your double, you need to multiply it by 100.0 (so that all
of the values that interest you are integers, and can be
represented exactly), then round it, then convert it to an
integer, and use the modulo and division operators on the
integer to extract your values. Something along the lines of:
assert( startTime >= 0.0 && startTime <= 24.0 );
int tmpTime = 100.0 * startTime + 0.5;
int minute = tmpTime % 100;
int hour = tmpTime / 100;
When dealing with integral values, it's much simpler to use
integral types.
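Reading the two fields directly as integers from a stream could look roughly like this. It is only a sketch: the struct TimeOfDay and the function parseTime are made-up names, and accepting both '.' and ':' as separator is an assumption:

```cpp
#include <sstream>
#include <string>

struct TimeOfDay {
    int hour;
    int minute;
};

// Parses text such as "23.59" or "23:59" into two integers.
// Returns false if the text does not start with hour, separator and
// minute, or if a field is out of range. Trailing characters are ignored.
bool parseTime(const std::string& text, TimeOfDay& out) {
    std::istringstream in(text);
    int hour = 0;
    int minute = 0;
    char sep = 0;
    if (!(in >> hour >> sep >> minute)) return false;
    if (sep != '.' && sep != ':') return false;
    if (hour < 0 || hour > 23 || minute < 0 || minute > 59) return false;
    out = TimeOfDay{hour, minute};
    return true;
}
```

Because the hour and minute are extracted as int, the comparisons against 23 and 59 are exact. One caveat: an input such as "0.5" is read as 5 minutes, not 50.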