Dividing a float by 10 [duplicate] - c++

Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
I am developing a fairly simple mathematical algorithm in C++.
I have a floating point variable named "step", and each time a while loop finishes an iteration, step needs to be divided by 10.
So my code looks something like this:
float step = 1;
while ( ... ){
    //the codes
    step /= 10;
}
In my simple-minded logic, that should end up fine: step gets divided by 10, going from 1 to 0.1, from 0.1 to 0.01, and so on.
But it didn't. Instead, something like 0.100000000001 appeared, and I was left thinking "what the hell?"
Can someone please help me with this? It's probably something about the data type itself that I don't fully understand, so if someone could explain further, it would be appreciated.

It is a numerical issue. The problem is that 1/10 has an infinitely long representation in binary, so repeatedly dividing by 10 accumulates the rounding error at each step. For a more stable version, multiply the divisor instead. But take care: the result is still not exact! You may also want to replace the float with a double to reduce the error.
unsigned int div = 1;
while(...)
{
    double step = 1.0 / (double)div;  // recomputed from an exact integer each iteration
    ....
    div *= 10;
}
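For illustration, a complete little program along these lines might look like the following sketch (the eight iterations and the printing are only there to show the values):
#include <cstdio>

int main()
{
    // Recompute each step from an exact integer divisor instead of
    // repeatedly dividing an already-rounded value by 10.
    unsigned int div = 1;  // fine for 10^0 .. 10^9; use a wider type for more
    for (int i = 0; i < 8; ++i)
    {
        double step = 1.0 / (double)div;
        std::printf("%.17g\n", step);  // each value is rounded only once, so errors do not pile up
        div *= 10;
    }
    return 0;
}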

The division by ten cannot be exact in binary floating point arithmetic, so you see results that look slightly off from what you expect.
Binary floating point numbers are represented as an integer ratio whose denominator is a power of two. Since there is no binary fraction exactly equal to one-tenth, you get the nearest representable number instead of the one you expected.
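You can see that nearest representable number by printing more digits than a double normally shows. A small sketch (the exact tail of digits can vary between platforms, but typical IEEE 754 doubles give something like the comment below):
#include <cstdio>

int main()
{
    double d = 0.1;             // stored as the closest representable double
    std::printf("%.20f\n", d);  // prints roughly 0.10000000000000000555
    return 0;
}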

Related

Is there an easy way in C++ to round to the nearest 1.75 increment

I tried looking this up on Stack Overflow; close this question if it has already been answered. Is there a function in C++ that can round a value to the nearest multiple of some decimal increment? Please generalize your answer to any decimal increment - I would also like to know, for example, how to round to the nearest 1.78 increment.
There is no specific function to do that, but you can easily make one. Just divide the value by 1.75, then call std::round and then multiply by 1.75.
e.g.
#include <cmath>

double round_to_multiple(double val, double step)
{
    // Round val to the nearest multiple of step.
    return step * std::round(val / step);
}
Where "step" is the step-size, or multiple. In your case, 1.75.
Just be wary of floating-point rounding error. If you are always dealing with 2 decimal places and the numbers you work with are in a useful range, you might want to consider using integers instead (multiplied by 100), which would change the technique a little.
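For instance, if the values always carry two decimal places, a sketch of that integer variant (the names here are made up for illustration) might be:
#include <cmath>
#include <cstdio>

// Work in hundredths: 1.75 becomes 175, 3.99 becomes 399.
long round_to_multiple_hundredths(long value, long step)
{
    // Round to the nearest integer multiple of step; the division is the only
    // floating-point operation left, and its result is rounded immediately.
    return step * std::lround((double)value / (double)step);
}

int main()
{
    std::printf("%ld\n", round_to_multiple_hundredths(399, 175));  // prints 350, i.e. 3.50
    return 0;
}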

How make a good calculator with floating point arithmetic [duplicate]

This question already has answers here:
Is floating point math broken?
Here is how my calculator should work:
There is a JSON value where I can write the first multiplier - something like this:
{
"value1": 1.4
}
On the calculator I can enter the second multiplier - only powers of ten (10, 100, ..., 10000000). My calc should return an integer, because I know that the people who use it will always write fewer digits after the decimal point in the first multiplier than there are zeros in the second multiplier. Yes, my calc is a very, very strange one.
Here are valid inputs:
v1=1.4; v2=100;
v1=1.414; v2=100000;
v1=1.1; v2=100;
Here is what happens: for example, for value1=1.4 and value2=10000 I get 13900. Since a float cannot hold every number exactly, it sometimes stores a slightly different one; for 1.4 it internally stores 1.399999 on my machine. I know why, but the QA engineer who tests my app tells me I need to get 14000 and that my calc does not work. How can I make my calc print the correct number?
P.S. Of course I have cut my real problem out of its context, but the point is that I have a float in a file and a 10^n number in my program as user input. How do I get the correct result?
EDIT1: I'm not asking why float works that way. I know why. I'm asking how to solve the problem given that float works that way.
EDIT2: I use RapidJson to read the JSON file, which already returns the imprecise number as a double precision value. I can't use libraries that provide higher-precision floating point.
Round the result when you format it for display. A double precision value is correct to about 15 significant digits, so if you round the result to 12 significant digits you're not going to surprise the user.
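A minimal sketch of that, assuming the two inputs have already been read (v1 and v2 are placeholder names):
#include <cmath>
#include <cstdio>

int main()
{
    double v1 = 1.4;      // first multiplier, e.g. read from the JSON file
    double v2 = 10000.0;  // the 10^n multiplier
    // The product is within a tiny fraction of the intended integer,
    // so rounding to the nearest integer recovers it exactly.
    long long result = std::llround(v1 * v2);
    std::printf("%lld\n", result);  // prints 14000
    return 0;
}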

For with minor or equal condition in C++ [duplicate]

This question already has answers here:
Is floating point math broken?
I am doing a loop to perform some calculations from t=0 to t=1 (included).
That should be easy like this:
for (double t = 0; t <= 1; t = t + 0.05)
{
    DEBUG_LOG1(LVL1, t);
    //DoMaths
}
But for some reason, t is being logged from 0 to 0.95, not including t=1, as it if was t<1 instead of t<=1.
Where is the problem in my code?
This is a simple problem with types. Because 0.05 is a floating point number, you will likely never get precisely 0.05 or 1.00.
Rather, the variable will aim for 0.05 but really hold something like 0.050000000000000012, which added together 20 times is not 1 but more like 1.00000000000000024, and will therefore not compare as equal to 1.
There is no problem with your code per se, and using <= instead of == already avoids the worst of it.
You can read more about floating point numbers at http://www.learncpp.com/cpp-tutorial/25-floating-point-numbers/
I think it may be because 0.05 is not exactly representable as a floating point value. It is only approximate. Try running this program.
#include <stdio.h>

int main()
{
    double x = 0.05;
    /* Print far more digits than a double actually holds. */
    printf("%.50lf\n", x);
    return 0;
}
Here I tell printf to give me a lot of excess precision. This prints out the value
0.05000000000000000277555756156289135105907917022705.
Now if I take that value and add 0.05 to it another 19 times in a loop I get...
1.00000000000000022204460492503130808472633361816406
See how it is not exactly 1 but slightly greater. This is why comparing floats for equality leads to strange results. You can get around it by adding a small epsilon to 1; for instance, compare against 1.001 in your loop.
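For the loop in the question, that idea might look like this sketch (1e-9 is just an arbitrary small tolerance, not a magic number):
// Allow for a little accumulated rounding error at the upper bound.
for (double t = 0; t <= 1.0 + 1e-9; t += 0.05)
{
    //DoMaths
}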
Most decimal fractions can't be represented exactly in binary floating point types. For example, 0.05 can't be represented exactly in type double; depending on the platform it might be stored as 0.050000000000000003 or similar. So that tiny bit extra gets added on every iteration of your loop. By the time you think t is 0.95 it is actually 0.95000000000000029 or similar, and adding 0.05 makes it greater than 1, hence the observed result. More on the subject in this SO post:
Is floating point math broken?

Avoiding erratic tiny numbers when working with floating point numbers

It sometimes happens that when I use floating point numbers in C++ and use multiples of, say, 0.1 as the increment in a for loop, the loop variable is not exactly a multiple of 0.1 but has unpredictable tiny amounts, on the order of 10^-17, added or subtracted. How can I avoid that?
Use integers for the iteration and multiply by the floating-point increment before using.
Alternatively find a decimal math package and use it instead of floating point.
Don't iterate over floating-point numbers.
The problem is that 0.1 can't be exactly represented in floating-point. So instead, you should do something like:
for (int i = 0; i < N; i++)
{
    float f = i * 0.1f;  // compute the value from an exact integer index
    ...
}
Here is an excellent article on the subject of working with floats. There is a discussion covering precisely your example - an increment of 0.1:
for (double r=0.0; r!=1.0; r+=0.1) printf("*");
How many stars is it going to print? Ten? Run it and be surprised. The code just keeps on printing the stars until we break it.
Where's the problem? As we already know, doubles are not infinitely precise. The problem we encountered here is the following: In binary, the representation of 0.1 is not finite (as it is in base 10). Decimal 0.1 is equivalent to binary 0.0(0011), where the part in the parentheses is repeated forever. When 0.1 is stored in a double variable, it gets rounded to the closest representable value. Thus if we add it 10 times the result is not exactly equal to one.
I highly recommend reading the whole article if you work a lot with floating point numbers.

Representing probability in C++

I'm trying to represent a simple set of 3 probabilities in C++. For example:
a = 0.1
b = 0.2
c = 0.7
(As far as I know probabilities must add up to 1)
My problem is that when I try to represent 0.7 in C++ as a float I end up with 0.69999999, which won't help when I am doing my calculations later. The same for 0.8, 0.80000001.
Is there a better way of representing numbers between 0.0 and 1.0 in C++?
Bear in mind that this relates to how the numbers are stored in memory so that when it comes to doing tests on the values they are correct, I'm not concerned with how they are display/printed out.
This has nothing to do with C++ and everything to do with how floating point numbers are represented in memory. You should never use the equality operator to compare floating point values; see here for better methods: http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
My problem is that when I try to represent 0.7 in C++ as a float I end up with 0.69999999, which won't help when I am doing my calculations later. The same for 0.8, 0.80000001.
Is it really a problem? If you just need more precision, use a double instead of a float. That should get you about 15 digits of precision, more than enough for most work.
Consider your source data. Is 0.7 really significantly more correct than 0.69999999?
If so, you could use a rational number library such as:
http://www.boost.org/doc/libs/1_40_0/libs/rational/index.html
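To see how much the double suggestion above buys you, here is a quick sketch (the printed tails are typical IEEE 754 values and may vary slightly by platform):
#include <cstdio>

int main()
{
    float  f = 0.7f;
    double d = 0.7;
    std::printf("%.10f\n", f);   // roughly 0.6999999881
    std::printf("%.17f\n", d);   // roughly 0.69999999999999996
    return 0;
}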
If the problem is that probabilities add up to 1 by definition, then store them as a collection of numbers, omitting the last one. Infer the last value by subtracting the sum of the others from 1.
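A sketch of that idea for the three probabilities in the question (the sum comes out as exactly 1.0 by construction, given left-to-right evaluation):
double a = 0.1;
double b = 0.2;
double c = 1.0 - (a + b);  // stands in for "0.7"; derived rather than stored
double sum = a + b + c;    // evaluates to exactly 1.0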
How much precision do you need? You might consider scaling the values and quantizing them in a fixed-point representation.
The tests you want to do with your numbers will be incorrect.
There is no exact floating point representation in a base-2 number system for a number like 0.1, because it is an infinitely repeating fraction in binary. Consider one third: it is exactly representable as 0.1 in base 3, but is 0.333... in base 10.
So any test you do against the value 0.1 in floating point is prone to be flawed.
One solution is to use rational numbers (Boost has a rational library), which are always exact for, erm, rationals; another is to build a self-made base-10 representation by multiplying the numbers by a power of ten.
If you really need the precision and are sticking with rational numbers, I suppose you could go with fixed point arithmetic. I've not done this before so I can't recommend any libraries.
Alternatively, you can set a threshold when comparing fp numbers, but you'd have to err on one side or another -- say
bool fp_cmp(float a, float b) {
    const float epsilon = 1e-6f;  // tolerance; pick something suited to your value range
    return (a < b + epsilon);
}
Note that excess precision is automatically truncated in each calculation, so you should take care when operating at many different orders of magnitude in your algorithm. A contrived example to illustrate:
a = 15434355e10 + 22543634e10
b = a / 1e20 + 1.1534634
c = b * 1e20

versus

c = a + 1.1534634e20
The two results will be very different. Using the first method a lot of the precision of the first two numbers will be lost in the divide by 1e20. Assuming that the final value you want is on the order of 1e20, the second method will give you more precision.
If you only need a few digits of precision then just use an integer. If you need better precision then you'll have to look to different libraries that provide guarantees on precision.
The issue here is that floating point numbers are stored in base 2. You can not exactly represent a decimal in base 10 with a floating point number in base 2.
Let's step back a second. What does .1 mean? Or .7? They mean 1×10^-1 and 7×10^-1. If you're using binary for your number, instead of base 10 as we normally do, .1 means 1×2^-1, or 1/2. .11 means 1×2^-1 + 1×2^-2, or 1/2 + 1/4, or 3/4.
Note how in this system, the denominator is always a power of 2. You cannot represent a number whose denominator is not a power of 2 in a finite number of digits. For instance, .1 (in decimal) means 1/10, but in binary that is an infinite repeating fraction, 0.000110011... (with the 0011 pattern repeating forever). This is similar to how, in base 10, 1/3 is an infinite fraction, 0.3333...; base 10 can only exactly represent numbers whose denominator is a product of powers of 2 and 5. (As an aside, base 12 and base 60 are actually really convenient bases, since 12 is divisible by 2, 3, and 4, and 60 is divisible by 2, 3, 4, and 5; but for some reason we use decimal anyhow, and we use binary in computers.)
Since floating point numbers (or fixed point numbers) always have a finite number of digits, they cannot represent these infinite repeating fractions exactly. So, they either truncate or round the values to be as close as possible to the real value, but are not equal to the real value exactly. Once you start adding up these rounded values, you start getting more error. In decimal, if your representation of 1/3 is .333, then three copies of that will add up to .999, not 1.
There are four possible solutions. If all you care about is exactly representing decimal fractions like .1 and .7 (as in, you don't care that 1/3 will have the same problem you mention), then you can represent your numbers in decimal, for instance using binary coded decimal, and manipulate those. This is a common solution in finance, where many operations are defined in terms of decimal. The downside is that you will need to implement all of the arithmetic operations yourself, without the benefits of the computer's FPU, or find a decimal arithmetic library. This also, as mentioned, does not help with fractions that can't be represented exactly in decimal.
Another solution is to use fractions to represent your numbers. If you use fractions, with bignums (arbitrarily large numbers) for your numerators and denominators, you can represent any rational number that will fit in the memory of your computer. Again, the downside is that arithmetic will be slower, and you'll need to implement arithmetic yourself or use an existing library. This will solve your problem for all rational numbers, but if you wind up with a probability that is computed based on π or √2, you will still have the same issues with not being able to represent them exactly, and need to also use one of the later solutions.
A third solution, if all you care about is getting your numbers to add up to 1 exactly, is for events where you have n possibilities, to only store the values of n-1 of those probabilities, and compute the probability of the last as 1 minus the sum of the rest of the probabilities.
And a fourth solution is to do what you always need to remember when working with floating point numbers (or any inexact numbers, such as fractions being used to represent irrational numbers), and never compare two numbers for equality. Again in base 10, if you add up 3 copies of 1/3, you will wind up with .999. When you want to compare that number to 1, you have to instead compare to see if it is close enough to 1; check that the absolute value of the difference, 1-.999, is less than a threshold, such as .01.
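In code, that "close enough" comparison is usually a small helper along these lines (the tolerance of 1e-9 is an arbitrary illustrative choice):
#include <cmath>

// Compare two doubles for approximate equality instead of using ==.
bool approximately_equal(double a, double b, double tolerance = 1e-9)
{
    return std::fabs(a - b) < tolerance;
}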
Binary machines always round decimal fractions (except .0 and .5, .25, .75, etc) to values that don't have an exact representation in floating point. This has nothing to do with the language C++. There is no real way around it except to deal with it from a numerical perspective within your code.
As for actually producing the probabilities you seek:
#include <cstdlib>  // for rand() and RAND_MAX

int pick_index()  // wrapped in a function here purely for illustration
{
    float pr[3] = {0.1f, 0.2f, 0.7f};
    float accPr[3];  // cumulative probabilities
    float prev = 0.0f;
    int i = 0;
    for (i = 0; i < 3; i++) {
        accPr[i] = prev + pr[i];
        prev = accPr[i];
    }
    // Uniform random number in [0, 1); note the floating-point division.
    float frand = rand() / ((float)RAND_MAX + 1.0f);
    for (i = 0; i < 2; i++) {
        if (frand < accPr[i]) break;
    }
    return i;
}
I'm sorry to say there's not really an easy answer to your problem.
It falls into a field of study called "Numerical Analysis" that deals with these types of problems (which goes far beyond just making sure you don't check for equality between 2 floating point values). And by field of study, I mean there are a slew of books, journal articles, courses etc. dealing with it. There are people who do their PhD thesis on it.
All I can say is that I'm thankful I don't have to deal with these issues very much, because the problems and the solutions are often very non-intuitive.
How you should represent your numbers and calculations depends very much on exactly what operations you're doing, the order of those operations, and the range of values you expect to deal with.
Depending on the requirements of your application, any one of several solutions could be best:
1. You live with the inherent lack of precision and use floats or doubles. You cannot test either for equality, and this implies that you cannot test the sum of your probabilities for equality with 1.0.
2. As proposed before, you can use integers if you require a fixed precision. You represent 0.7 as 7, 0.1 as 1, 0.2 as 2, and they will add up perfectly to 10, i.e. 1.0 (see the sketch after this list). If you have to calculate with your probabilities, especially if you do division and multiplication, you need to round the results correctly. This will introduce imprecision again.
3. Represent your numbers as fractions with a pair of integers: (1,2) = 1/2 = 0.5. Precise and more flexible than 2), but you don't want to calculate with those.
4. You can go all the way and use a library that implements rational numbers (e.g. GMP). Precise, with arbitrary precision, and you can calculate with it, but slow.
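A sketch of option 2 for the probabilities in the question, keeping everything in integer tenths (the scale of 10 is just for this example):
// 0.1 -> 1, 0.2 -> 2, 0.7 -> 7, all stored as integer tenths.
int a = 1, b = 2, c = 7;
int total = a + b + c;             // exactly 10, i.e. 1.0
bool sums_to_one = (total == 10);  // exact integer comparison, no epsilon needed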
Yeah, I'd scale the numbers to (0-100), (0-1000), or whatever fixed range you need if you're worried about such things. It also makes for faster math in most cases. Back in the bad old days, we'd define entire cos/sine tables and other such things in integer form to reduce floating point fuzz and increase computation speed.
I do find it a bit interesting that a "0.7" fuzzes like that on storage.