What is wrong with this while-loop (C++)?

I was pretty sure I was doing this the right way, but apparently I'm not. This while loop keeps running infinitely once it reaches 0. It keeps outputting "Monthly Payment: $0.00" and "Your loan after that payment is: $0.00" over and over again. What am I doing wrong?
while (loan_balance ! = 0)
{
    monthly_payment = loan_balance / 20;
    loan_balance = loan_balance - monthly_payment;
    cout << "Monthly Payment: $" << monthly_payment << ends;
    cout << "Your loan balance after that payment is: $" << loan_balance << endl;
}

If loan_balance is a floating-point type (float or double), then loan_balance != 0 will likely never be false unless the variable is explicitly set to exactly zero. So it should be compared against a small threshold instead, e.g.

while (loan_balance >= 1e-4)

Also, the not-equal operator is !=; written with a space, ! = doesn't compile.
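A minimal corrected version along those lines (a sketch; the starting balance of 1000.0 is just an assumed example value):

#include <iostream>
using std::cout;

int main()
{
    double loan_balance = 1000.0;   // assumed example starting balance
    double monthly_payment;

    // Compare against a small threshold instead of testing
    // a floating-point value for exact equality with zero.
    while (loan_balance >= 1e-4)
    {
        monthly_payment = loan_balance / 20;
        loan_balance = loan_balance - monthly_payment;
        cout << "Monthly Payment: $" << monthly_payment << '\n';
        cout << "Your loan balance after that payment is: $" << loan_balance << '\n';
    }
    return 0;
}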

loan_balance is probably a float or double; it might decrease to something close to, but not quite, 0. Change your comparison to "> 0".

0 is a literal of type int.
0.0f is a literal of type float (and 0.0 is a double).
So you're saying

while (loan_balance != 0)
    do stuff

The compiler in return says: "loan_balance will never be exactly 0 if it is a double/float, so I'll just keep doing stuff."
Keep in mind: integers aren't floats/doubles.

Your loan_balance is most likely never going to be exactly 0. I am pretty sure you want loan_balance to be a float or double.

double loan_balance = 23000.00;
double monthly_payment;
while (loan_balance > 0.000)
{
    monthly_payment = loan_balance / 20.0;
    loan_balance = loan_balance - monthly_payment;
    cout << "Monthly Payment: $" << monthly_payment << ends;
    cout << "Your loan balance after that payment is: $" << loan_balance << endl;
}

You're facing a problem with both precision and rounding.
When you exponentially decrease a number, it converges to zero. If it is a limited-precision floating point number, there will be a value so small that it cannot be distinguished from zero in any possible representation. So, a loop like

double d = 1.0;   // or some other finite value
while (d != 0.0)
    d *= ratio;   // ratio is some finite value with fabs(ratio) < 1.0

would finish in a finite number of iterations.
However, depending on the values of d and ratio, especially when d approaches zero in the subnormal range (implying fewer significant bits) and fabs(ratio) is close to 1.0 (slow convergence), the selected rounding mode can cause d*ratio to be rounded towards d. When that happens, the loop above will never end.
In machines/compilers that support IEC 60559, you should be able to test and set the floating point rounding mode, using fegetround() and fesetround() (declared in <fenv.h>). It is likely that the default rounding on your system is to the nearest. For the loop above to converge more quickly (or at all), it would be better to cause rounding to be toward 0.
Note however that it comes with a price: depending on the application, changing the rounding mode might be undesirable on the grounds of precision/accuracy (OTOH, if you're already working in the subnormal range, your precision is probably not that good anymore anyway).
But the original question's convergence problem is still a little more complicated, because it is done with a two-step operation: there is one rounding in the division and another one in the subtraction. To increase the chance and speed of convergence, the subtraction should take away as much of loan_balance's value as possible, so rounding up might be better in this case. In fact, when I added fesetround(FE_UPWARD) to the OP's original code (together with setting the initial value of loan_balance to 1.0), it converged after 14466 iterations. (Note however that this is just a guess, and the effect on the original code might have been just a special condition. A more in-depth analysis would be necessary to qualify it, and such an analysis would have to take into account different values of ratio, and the relative magnitudes of the minuend and subtrahend.)
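For illustration, a sketch of that experiment (FE_UPWARD is the standard macro, declared in <cfenv>; the iteration count, and even termination, depends on your platform, and strict conformance also wants #pragma STDC FENV_ACCESS ON, which some compilers ignore):

#include <cfenv>
#include <iostream>

int main()
{
    // Default is usually FE_TONEAREST; switch to rounding toward +infinity.
    std::fesetround(FE_UPWARD);

    double loan_balance = 1.0;
    long iterations = 0;
    while (loan_balance != 0.0)
    {
        loan_balance = loan_balance - loan_balance / 20;
        ++iterations;
    }
    // Converged after 14466 iterations on one system (see above);
    // with the default rounding mode this loop may never end.
    std::cout << "converged after " << iterations << " iterations\n";
    return 0;
}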


How to add two large double precision numbers in c++

I have the following piece of code
#include <iostream>
#include <iomanip>

int main()
{
    double x = 7033753.49999141693115234375;
    double y = 7033753.499991415999829769134521484375;
    double z = (x + y) / 2.0;
    std::cout << "y is " << std::setprecision(40) << y << "\n";
    std::cout << "x is " << std::setprecision(40) << x << "\n";
    std::cout << "z is " << std::setprecision(40) << z << "\n";
    return 0;
}
When the above code is run I get,
y is 7033753.499991415999829769134521484375
x is 7033753.49999141693115234375
z is 7033753.49999141693115234375
When I do the same in Wolfram Alpha the value of z is completely different
z = 7033753.4999914164654910564422607421875 #Wolfram answer
I am familiar with floating point precision and that large numbers away from zero cannot be exactly represented. Is that what is happening here? Is there any way in C++ where I can get the same answer as Wolfram without any performance penalty?
large numbers away from zero cannot be exactly represented. Is that what is happening here?
Yes.
Note that there are infinitely many rational numbers that cannot be represented near zero as well. But the distance between representable values does grow exponentially in larger value ranges.
Is there any way in C++ where I can get the same answer as Wolfram ...
You can potentially get the same answer by using long double. My system produces exactly the same result as Wolfram. Note that the precision of long double varies between systems, even among systems that conform to the IEEE 754 standard.
More generally though, if you need results that are accurate to many significant digits, then don't use finite precision math.
... without any performance penalty?
No. Precision comes with a cost.
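For example, a sketch of the same computation in long double (whether this reproduces the Wolfram result depends on how wide long double is on your platform; on x86 it is typically 80-bit extended precision):

#include <iostream>
#include <iomanip>

int main()
{
    // The L suffix keeps the literals from being rounded to double first.
    long double x = 7033753.49999141693115234375L;
    long double y = 7033753.499991415999829769134521484375L;
    long double z = (x + y) / 2.0L;
    std::cout << "z is " << std::setprecision(40) << z << "\n";
    return 0;
}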
Just telling IOStreams to print 40 significant decimal figures doesn't mean that the value you're outputting actually has that much precision.
A typical double takes you up to 17 significant decimal figures (ish); beyond that, what you see is completely arbitrary.
Per eerorika's answer, it looks like the Wolfram Alpha answer is also falling foul of this, albeit possibly with some different precision limit than yours.
You can try a different approach like a "bignum" library, or limit yourself to the precision afforded by the types that you've chosen.

Why does it show nan?

OK, so I am writing a program where I am trying to get the result of the right side to be equivalent to the left side with 0.0001% accuracy:
sin x = x - (x^3)/3! + (x^5)/5! - (x^7)/7! + ...
#include <iostream>
#include <iomanip>
#include <math.h>
using namespace std;

long int fact(long int n)
{
    if (n == 1 || n == 0)
        return 1;
    else
        return n * fact(n - 1);
}

int main()
{
    int n = 1, counts = 0; // for sin
    cout << "Enter value for sin" << endl;
    long double x, value, next = 0, accuracy = 0.0001;
    cin >> x;
    value = sin(x);
    do
    {
        if (counts % 2 == 0)
            next = next + (pow(x, n) / fact(n));
        else
            next = next - (pow(x, n) / fact(n));
        counts++;
        n = n + 2;
    } while ((fabs(next - value)) > 0);
    cout << "The value of sin " << x << " is " << next << endl;
}
and let's say I enter 45 for x.
I get the result
The value of sin 45 is nan.
Can anyone help me figure out where I went wrong?
First, your while condition should be

while ((fabs(next - value)) > accuracy)

and fact should return long double.
When you change that, it still won't work for a value of 45. The reason is that this Taylor series converges too slowly for large values.
The error term of a Taylor approximation is the Lagrange remainder:

R_k(x) = f^(k+1)(c) * (x - a)^(k+1) / (k+1)!

Here k is the number of iterations, a = 0, and the function is sin. In order for the condition to become false, 45^(k+1)/(k+1)! times the absolute value of some sin or cos (depending on what the k-th derivative is; it's between 0 and 1) has to be less than 0.0001.
Well, in this formula, at k = 50 the term is still very large (we should expect an error of around 1.3*10^18, which means we will do more than 50 iterations for sure).
45^50 and 50! will overflow, and then dividing them gives you infinity/infinity = NaN.
In your original version, the return value of fact doesn't fit in a long int (it eventually overflows to 0), and then division by 0 gives you infinity, which after subtracting another infinity gives you NaN.
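Both failure modes are easy to reproduce in isolation (a small demonstration, assuming IEEE 754 arithmetic with floating-point traps disabled, which is the usual default):

#include <iostream>

int main()
{
    double big = 1e308;
    double zero = 0.0;

    double inf = big * 10.0;          // overflow produces +infinity
    std::cout << inf << "\n";         // inf
    std::cout << 1.0 / zero << "\n";  // finite / 0.0 is also +infinity
    std::cout << inf / inf << "\n";   // nan
    std::cout << inf - inf << "\n";   // nan
    return 0;
}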
I quote from here in regard to pow:
Return value
If no errors occur, base raised to the power of exp (or iexp) (base^exp) is returned.
If a domain error occurs, an implementation-defined value is returned (NaN where supported).
If a pole error or a range error due to overflow occurs, ±HUGE_VAL, ±HUGE_VALF, or ±HUGE_VALL is returned.
If a range error occurs due to underflow, the correct result (after rounding) is returned.
Reading further:
Error handling
...
except where specified above, if any argument is NaN, NaN is returned
So basically, since n is increasing and you have many loops, pow returns NaN (the compiler you use obviously supports that). The rest is arithmetic: you calculate with overflowing values.
I believe you are trying to approximate sin(x) by using its Taylor series. I am not sure if that is the way to go.
Maybe you can try to stop the loop as soon as you hit NaN and not update the variable next and simply output that. That's the closest you can get I believe with your algorithm.
If the choice of 45 implies you think the input is in degrees, you should rethink that, and you likely should reduce modulo 2*pi.
First fix two bugs:

long double fact(long int n)
...
} while ((fabs(next - value)) > accuracy);

The return value of fact will overflow quickly if it is long int. (It will overflow eventually even for long double.) When you compare to 0 instead of accuracy, the answer is never correct enough, so only a NaN can stop the while.
Because of rounding error, you still never converge: while pow is giving values bigger than fact, you are computing differences between big numbers, which accumulates significant rounding error that is then never removed. So you might instead stop by computing long double m = pow(x,n)/fact(n); before increasing n in each step of the loop, and use:

} while (m > accuracy * .5);

At that point, either the answer has the specified accuracy or the remaining error is dominated by rounding error and iterating further won't help.
If you had compiled with any reasonable level of warnings enabled, you would have immediately seen that you are not using the variable accuracy. This, and the fact that your fact function returns a long int, are but a small part of your problem. You will never get a good result for sin(45) using your algorithm even if you correct those issues.
The problem is that with x = 45, the terms in the Taylor expansion of sin(x) won't start decreasing until n = 45. This is a big problem because 45^45/45! is a very large number, 2428380447472097974305091567498407675884664058685302734375 / 1171023117375434566685446533210657783808, or roughly 2*10^18. Your algorithm initially adds and subtracts huge numbers that only start decreasing after 20+ additions/subtractions, with the eventual hope that the result will be somewhere between -1 and +1. That is an unrealizable hope given an input value of 45 and using a native floating point type.
You could use some BigNum type (the internet is chock-full of them) with your algorithm, but that's extreme overkill when you only want four-place accuracy. Alternatively, you could take advantage of the cyclical nature of sin(x): sin(x+2*pi) = sin(x). An input value of 45 is equivalent to 1.017702849742894661522992634... (modulo 2*pi). Your algorithm works quite nicely for an input of 1.017702849742894661522992634.
You can do much better than that, but taking the input value modulo 2*pi is the first step toward a reasonable algorithm for computing sine and cosine. Even better, you can use the fact that sin(x+pi) = -sin(x). This lets you reduce the range from -infinity to +infinity down to 0 to pi. Even better, you can use the fact that between 0 and pi, sin(x) is symmetric about pi/2. You can do even better than that. The implementations of the trigonometric functions take extreme advantage of these behaviors, but they typically do not use Taylor approximations.
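A sketch that combines the earlier fixes (fact returning long double, a term-based stopping test) with the range reduction described here; the value of 2*pi is written out as a literal, since M_PI is not guaranteed by the standard:

#include <iostream>
#include <cmath>

// Factorial in long double, so moderate n does not overflow an integer type.
long double fact(int n)
{
    return (n <= 1) ? 1.0L : n * fact(n - 1);
}

int main()
{
    long double x;
    std::cout << "Enter value for sin" << std::endl;
    std::cin >> x;

    // Range reduction: sin(x + 2*pi) == sin(x), so sum the series for x mod 2*pi.
    const long double two_pi = 6.283185307179586476925L;
    long double r = std::fmod(x, two_pi);

    const long double accuracy = 0.0001L;
    long double next = 0.0L, term;
    int n = 1, counts = 0;
    do
    {
        term = std::pow(r, n) / fact(n);
        next += (counts % 2 == 0) ? term : -term;
        counts++;
        n += 2;
    } while (std::fabs(term) > accuracy);

    std::cout << "The value of sin " << x << " is " << next << std::endl;
    return 0;
}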

Choosing between float and double

The background:
I have been working on the following problem, "The Trip" from "Programming Challenges: The Programming Contest Training Manual" by S. Skiena:
A group of students are members of a club that travels annually to
different locations. Their destinations in the past have included
Indianapolis, Phoenix, Nashville, Philadelphia, San Jose, and Atlanta.
This spring they are planning a trip to Eindhoven.
The group agrees in advance to share expenses equally, but it is not
practical to share every expense as it occurs. Thus individuals in the
group pay for particular things, such as meals, hotels, taxi rides,
and plane tickets. After the trip, each student's expenses are tallied
and money is exchanged so that the net cost to each is the same, to
within one cent. In the past, this money exchange has been tedious and
time consuming. Your job is to compute, from a list of expenses, the
minimum amount of money that must change hands in order to equalize
(within one cent) all the students' costs.
Input
Standard input will contain the information for several trips. Each
trip consists of a line containing a positive integer n denoting the
number of students on the trip. This is followed by n lines of input,
each containing the amount spent by a student in dollars and cents.
There are no more than 1000 students and no student spent more than
$10,000.00. A single line containing 0 follows the information for the
last trip.
Output
For each trip, output a line stating the total amount of money, in
dollars and cents, that must be exchanged to equalize the students'
costs.
(Bold is mine.)
I solved the problem with the following code:
/*
 * the-trip.cpp
 */
#include <iostream>
#include <iomanip>
#include <cmath>

int main(int argc, char * argv[])
{
    int students_number;
    double expenses[1000], total, average, given_change, taken_change, minimum_change;
    while (std::cin >> students_number) {
        if (students_number == 0) {
            return 0;
        }
        total = 0;
        for (int i = 0; i < students_number; i++) {
            std::cin >> expenses[i];
            total += expenses[i];
        }
        average = total / students_number;
        given_change = 0;
        taken_change = 0;
        for (int i = 0; i < students_number; i++) {
            if (average > expenses[i]) {
                given_change += std::floor((average - expenses[i]) * 100) / 100;
            }
            if (average < expenses[i]) {
                taken_change += std::floor((expenses[i] - average) * 100) / 100;
            }
        }
        minimum_change = given_change > taken_change ? given_change : taken_change;
        std::cout << "$" << std::setprecision(2) << std::fixed << minimum_change << std::endl;
    }
    return 0;
}
My original implementation had float instead of double. It worked with the small problem instances provided with the description, and I spent a lot of time trying to figure out what was wrong.
In the end I figured out that I had to use double precision; apparently some big input in the programming challenge tests made my float-based algorithm fail.
The question:
Given that the input can have 1000 students and each student can spend up to $10,000.00, my total variable has to store a number as large as 10,000,000.
How should I decide which precision is needed?
Is there something that should have given me a hint that float wasn't enough for this task?
I later realized that in this case I could have avoided floating point entirely, since my numbers fit into integer types, but I'm still interested in understanding whether there was a way to foresee that float wasn't precise enough in this case.
Is there something that should have given me a hint that float wasn't enough for this task?
The fact that 0.10 is not representable at all in binary floating-point (which both float and double are on an ordinary computer) should have been the hint. Binary floating-point is perfect for physical quantities that arrive inaccurate to begin with, or for computations that would be inaccurate anyway in any reasonable numerical system. Exact computations of monetary amounts are not a good application of binary floating-point.
How should I decide which precision is needed? … my total variable has to store a number as large as 10,000,000.
Use an integer type to represent numbers of cents. By your own reasoning, you shouldn't have to deal with amounts of more than 1,000,000,000 cents, so long should be enough, but just use long long and save yourself the risk of trouble with corner cases.
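For illustration, a sketch of the same algorithm redone in whole cents (reading the amount as a double and rounding once with llround at the input boundary is an assumed parsing strategy, not part of the problem statement):

#include <iostream>
#include <cmath>

int main()
{
    int n;
    while (std::cin >> n && n != 0)
    {
        long long cents[1000];
        long long total = 0;
        for (int i = 0; i < n; ++i)
        {
            double amount;
            std::cin >> amount;
            // Convert to whole cents immediately; llround avoids
            // truncating a value like 10.10 * 100 down to 1009.
            cents[i] = std::llround(amount * 100.0);
            total += cents[i];
        }

        long long given = 0, taken = 0;
        for (int i = 0; i < n; ++i)
        {
            // average - expenses[i] in cents is (total - n*cents[i]) / n;
            // integer division truncates, which equals floor() for positives.
            long long diff = total - n * cents[i];
            if (diff > 0)      given += diff / n;
            else if (diff < 0) taken += (-diff) / n;
        }

        long long change = given > taken ? given : taken;
        std::cout << "$" << change / 100 << "."
                  << (change % 100 < 10 ? "0" : "") << change % 100 << "\n";
    }
    return 0;
}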
As you said: never use floating point variables to represent money. Use an integer representation instead: either one large number counting cents (or whatever the smallest fraction of the local currency is), or two numbers [which makes the math a bit more awkward, but makes it easier to read/write the value as two units].
The motivation for not using floating point is that it's often not accurate. Just like 1/3 can't be written as an exact value using decimal representation (no matter how many threes you write, the actual answer would have more threes), binary floating point values cannot precisely describe some decimal values, and you get "Your value of 0.20 does not match the 0.20 that the customer owes", which doesn't make sense, but that's because "0.200000000001" and "0.19999999999" aren't exactly the same thing according to the computer. And eventually those little rounding errors will cause some big problem in one way or another, regardless of whether it's float, double or extra_super_long_double.
However, if you have a question like this: if I have to represent a value of 10 million with a precision of 1/100th of a unit, how big a floating point type do I need, your calculation becomes:

float bigNumber = 10000000;
float smallNumber = 0.01;
float bits = log2(bigNumber / smallNumber);
cout << "Bits in mantissa needed: " << ceil(bits) << endl;

So in this case we get bits as 29.897, meaning you need 30 bits of mantissa (in other words, float is not good enough).
Of course, if you do not need fractions of a dollar (or whatever), you can get away with fewer bits. Namely log2(10000000) = 23.2, so 24 bits of mantissa: right at the limit of what a float provides.
10,000,000 > 2^23, so you need at least 24 bits of mantissa, which is what single precision provides. Because of intermediate rounding, the last bits can err.
1 digit ~ 3.321928 bits.

Divide by zero prevention

What is 1.#INF, and why does casting to a float or double prevent a division by 0 from crashing?
Also, any great ideas on how to prevent division by 0 (like a macro or template)?
int nQuota = 0;

int nZero = 3 / nQuota;    // crash
cout << nZero << endl;

float fZero = 2 / nQuota;  // crash
cout << fZero << endl;

If I use instead:

int nZero = 3 / (float)nQuota;
cout << nZero << endl;
// Output = -2147483648

float fZero = 2 / (float)nQuota;
cout << fZero << endl;
// Output = 1.#INF
1.#INF is positive infinity. You will get it when you divide a positive float by zero (if you divide the float zero itself by zero, then the result will be "not a number").
On the other hand, if you divide an integer by zero, the program will crash.
The reason float fZero = 2 / nQuota; crashes is because both operands of the / operator are integers, so the division is performed on integers. It doesn't matter that you then store the result in a float; C++ has no notion of target typing.
Why positive infinity cast to an integer is the smallest integer, I have no idea.
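To make the division itself happen in floating point (per the previous paragraph), it is enough to promote one operand before dividing, for example:

float fZero = 2.0f / nQuota;                   // float division: +infinity when nQuota == 0
float gZero = static_cast<float>(2) / nQuota;  // the same, with an explicit cast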
Why does using (float) or (double) prevent a division by 0 from crashing?
It doesn't necessarily. The standard is amazingly sparse when it comes to floating point. Most systems use the IEEE floating point standard nowadays, and that says that the default action for division by zero is to return ±infinity rather than crash. You can make it crash by enabling the appropriate floating point exceptions.
Note well: The only thing the floating point exception model and the C++ exception model have in common is the word "exception". On every machine I work on, a floating point exception does not throw a C++ exception.
Also, any great ideas of how to prevent division by 0?

Option #1: Don't do it.
Simple answer: make sure that the divisor is not zero. This is one of those "Doctor, doctor, it hurts when I do this!" kinds of situations. So don't do it!

Option #2: Do sanity checks on divisors that are user inputs.
Always filter your user inputs for sanity. A user input value of zero when the number is supposed to be in the millions will cause all kinds of havoc besides overflow. Do sanity checks on intermediate values, too.

Option #3: Enable floating point exceptions.
Making the default behavior to allow errors (and these are almost always errors) to go through unchecked was IMHO a big mistake on the part of the standards committee. Use the default and those infinities and not-a-numbers will eventually turn everything into an Inf or a NaN. The default should have been to stop floating point errors in their tracks, with an option to allow things like 1.0/0.0 and 0.0/0.0 to take place. That isn't the case, so you have to enable those traps. Do that and you can oftentimes find the cause of the problem in short order.

Option #4: Write custom divide, custom multiply, custom square root, custom sine, ... functions.
This unfortunately is the route that many safety-critical software systems must take. It is a royal pain. Option #1 is out because it's just wishful thinking. Option #3 is out because the system cannot be allowed to crash. Option #2 is still a good idea, but it doesn't always work because bad data always has a way of sneaking in. It's Murphy's law.
BTW, the problem is a bit worse than just division by zero. 10^200/10^-200 will also overflow.
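As a concrete illustration of Option #3, glibc on Linux provides the nonstandard feenableexcept() (an assumption about the platform; MSVC exposes similar control through _controlfp_s):

#include <fenv.h>     // feenableexcept is a GNU extension
#include <iostream>

int main()
{
    // Trap division by zero, invalid operations, and overflow:
    // the process now receives SIGFPE instead of silently
    // producing infinities and NaNs.
    feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);

    double zero = 0.0;
    std::cout << 1.0 / zero << std::endl;   // raises SIGFPE here
    return 0;
}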
You usually check to make sure you aren't dividing by zero. The code below isn't particularly useful unless nQuota has a legitimate value, but it does prevent crashes:
int nQuota = 0;
int nZero = 0;
float fZero = 0;

if (nQuota)
    nZero = 3 / nQuota;
cout << nZero << endl;

if (nQuota)
    fZero = 2 / nQuota;
cout << fZero << endl;
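As for the macro-or-template part of the question, one possible shape is a checked divide (a sketch; throwing is just one error strategy, not a standard API):

#include <stdexcept>

// Returns numerator / denominator, or throws if the denominator is zero.
template <typename T>
T safe_divide(T numerator, T denominator)
{
    if (denominator == T(0))
        throw std::domain_error("division by zero");
    return numerator / denominator;
}

Used as int nZero = safe_divide(3, nQuota);, a zero divisor becomes a catchable exception instead of a crash.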

C++ internal representation of double/float

I am unable to understand why C++ division behaves the way it does. I have a simple program which divides 1 by 10 (using VS 2003)
double dResult = 0.0;
dResult = 1.0/10.0;
I expect dResult to be 0.1; however, I get 0.10000000000000001.
Why do I get this value? What's the problem with the internal representation of double/float?
How can I get the correct value?
Thanks.
Because almost all modern processors use binary floating-point, which cannot exactly represent 0.1 (there is no way to represent 0.1 as m * 2^e with integer m and e).
If you want to see the "correct value", you can print it out with e.g.:
printf("%.1f\n", dResult);
double and float are not identical to the real numbers: there are infinitely many real values, but only a finite number of bits to represent them in a double/float.
You can read further: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
The ubiquitous IEEE 754 floating point format expresses floating point numbers in scientific notation base 2, with a finite mantissa. Since a fraction like 1/5 (and hence 1/10) does not have a representation with finitely many digits in binary scientific notation, you cannot represent the value 0.1 exactly. More generally, the only values that can be represented exactly are those that fit precisely into binary scientific notation with a mantissa of a few (e.g. 24 or 53 or 64) binary digits, and a suitably small exponent.
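You can see the effect by asking the stream for more digits than a double meaningfully holds (a small demonstration; the digits are what an IEEE 754 double actually stores for 0.1):

#include <iostream>
#include <iomanip>

int main()
{
    double d = 0.1;
    // The default precision of 6 hides the error; 20 digits expose it.
    std::cout << std::setprecision(20) << d << "\n";  // 0.10000000000000000555
    return 0;
}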
Working with integers, floats, and doubles can be tricky. It depends on your purpose. If you only want to display numbers in a nice format, you can play with the C++ I/O manipulators: precision, showpoint, noshowpoint. If you are trying to do precise computation with numeric methods, you may have to use a library for accurate representation. If you are multiplying lots of small and large numbers, you may have to resort to log transformations. Here is a small test:
float x=1.0000001;
cout << x << endl;
float y=9.9999999999999;
cout << "using default io format " << y/x << endl;
cout << showpoint << "using showpoint " << y/x << endl;
y=9.9999;
cout << "fewer 9 default C++ " << y/x << endl;
cout << showpoint << "fewer 9 showpoint" << y/x << endl;
1
using default io format 10
using showpoint 10.0000
fewer 9 default C++ 9.99990
fewer 9 showpoint9.99990
In special cases where you want to use a double (which may be the result of some complicated algorithm) to represent integer values, you have to figure out the proper conversion method. Once I had a situation where I wanted a single double value to store two kinds of values, -1, +1, or a fraction in (0, 1), to make my code more memory-efficient (and faster; heavy memory use tends to reduce performance). It is a little tricky to distinguish between +1 and values < 1. In this case I knew that the values < 1 had a resolution of only 1/500, so I could safely use floor(val + 0.000001) to get back the 1 value that I initially stored.
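A small illustration of that trick (assuming, as described, that the stored fractions have a resolution no finer than 1/500):

#include <cmath>
#include <iostream>

int main()
{
    double flag = 1.0;                // the stored flag value
    double fraction = 499.0 / 500.0;  // the largest legitimate fraction

    // The small epsilon absorbs representation error in a stored 1.0
    // without pushing a real fraction like 0.998 up to 1.
    std::cout << std::floor(flag + 0.000001) << "\n";      // 1
    std::cout << std::floor(fraction + 0.000001) << "\n";  // 0
    return 0;
}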