Unexpected behavior of if statement in OCaml

I am learning OCaml so maybe I am writing this if-statement wrong, but for this statement:
# if 0.3 -. 0.2 = 0.1 then 'a' else 'b';;
the output is:
- : char = 'b'
Shouldn't the output be 'a', since 0.3 - 0.2 = 0.1?
The behavior is also the same when I write == instead of =.

Binary floating point values can represent most decimal fractions only approximately. So 0.3 -. 0.2 is very close to 0.1, but not exactly equal.
# 0.3 -. 0.2;;
- : float = 0.0999999999999999778
Understanding this is a rite of passage for programmers. Here's a site I found with some discussion: What Every Programmer Should Know about Floating Point.
As a side comment, you should never be using the == (physical equality) operator in ordinary computations. Example:
# 1.0 == 1.0;;
- : bool = false
This is a different problem. Floating values aren't that approximate :-)
The everyday, workhorse equality comparison operator is =. That's what you should use unless you have a specific reason not to.
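A minimal sketch of the same effect, written in Python rather than OCaml purely for illustration (Python floats are IEEE 754 doubles on common platforms). `decimal.Decimal` shows the exact stored values, and `math.isclose` is one possible way to compare with a tolerance; its default tolerance is an assumption that may need tuning for a given application.

```python
import math
from decimal import Decimal

difference = 0.3 - 0.2
exactly_equal = difference == 0.1             # False: the two doubles differ
close_enough = math.isclose(difference, 0.1)  # True: equal within the default relative tolerance

# Decimal(float) converts without rounding, so these are the stored values.
print(Decimal(difference))
print(Decimal(0.1))
print(exactly_equal, close_enough)
```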

Related

Why isn't to!int() working properly?

Why does this assertion fail?
import std.conv;
void main()
{
auto y = 0.6, delta=0.1;
auto r = to!int(y/delta);
assert(r == 6);
}
r's value should be 6 and yet it's 5. Why?
This is probably because 0.6 can't be represented exactly as a floating point number. You write 0.6, but that's not exactly what you get - you get something like 0.599999999. When you divide that by 0.1, you get something like 5.99999999, which converts to an integer of 5 (the conversion truncates toward zero).
Examples in other languages:
C#: Why is (double)0.6f > (double)(6/10f)?
Java: Can someone please explain me that in java why 0.6 is <0.6f but 0.7 is >=0.7f
Computers represent floating point numbers in binary. The decimal numbers 0.6 and 0.1 have no exact binary representation, and only a finite number of bits is available to store them. As a result both values are rounded, and the effect shows up in the division: the result is not exactly 6.00000000 but slightly less, perhaps 5.99999999, which is then truncated to 5.
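The truncation described above can be reproduced with a short sketch. It is in Python rather than D, on the assumption that both use IEEE 754 doubles here; `int` stands in for `to!int`, and `round` is shown as one possible alternative when the nearest integer is what is wanted.

```python
y, delta = 0.6, 0.1
quotient = y / delta          # slightly below 6 in double precision

truncated = int(quotient)     # drops the fractional part, like to!int in the question
nearest = round(quotient)     # rounds to the nearest integer instead

print(repr(quotient), truncated, nearest)
```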

0.1 float is greater than 0.1 double. I expected it to be false [duplicate]

Let:
double d = 0.1;
float f = 0.1;
should the expression
(f > d)
return true or false?
Empirically, the answer is true. However, I expected it to be false.
My reasoning: 0.1 cannot be represented exactly in binary; double has 15 to 16 decimal digits of precision, and float has only about 7. So I assumed both are less than 0.1, with the double being closer to 0.1.
I need an exact explanation for the true.
I'd say the answer depends on the rounding mode when converting the double to float. float has 24 binary bits of precision, and double has 53. In binary, 0.1 is:
0.1₁₀ = 0.0001100110011001100110011001100110011001100110011…₂
             ^        ^         ^   ^
             1        10        20  24
So if we round up at the 24th digit, we'll get
0.1₁₀ ~ 0.000110011001100110011001101
which is greater than the exact value and the more precise approximation at 53 digits.
The number 0.1 will be rounded to the closest floating-point representation with the given precision. This approximation might be either greater than or less than 0.1, so without looking at the actual values, you can't predict whether the single precision or double precision approximation is greater.
Here's what the double precision value gets rounded to (using a Python interpreter):
>>> "%.55f" % 0.1
'0.1000000000000000055511151231257827021181583404541015625'
And here's the single precision value:
>>> "%.55f" % numpy.float32("0.1")
'0.1000000014901161193847656250000000000000000000000000000'
So you can see that the single precision approximation is greater.
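The same comparison can be sketched without NumPy. The helper `to_float32` below is a hypothetical name introduced for this example; it relies on the standard `struct` module to round a double to the nearest single-precision value.

```python
import struct
from decimal import Decimal

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

d = 0.1              # double precision
f = to_float32(0.1)  # single precision, widened back to double without change

print(Decimal(d))    # exact value stored in the double
print(Decimal(f))    # exact value stored in the float
print(f > d)
```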
If you convert .1 to binary you get:
0.000110011001100110011001100110011001100110011001100...
repeating forever
Mapping to data types, you get:
float(.1) = %.000110011001100110011001101
^--- note rounding
double(.1) = %.00011001100110011001100110011001100110011001100110011010
Convert that to base 10:
float(.1) = .100000001490116119384765625
double(.1) = .1000000000000000055511151231257827021181583404541015625
This was taken from an article written by Bruce Dawson. It can be found here:
Doubles are not floats, so don’t compare them
I think Eric Lippert's comment on the question is actually the clearest explanation, so I'll repost it as an answer:
Suppose you are computing 1/9 in 3-digit decimal and 6-digit decimal. 0.111 < 0.111111, right?
Now suppose you are computing 6/9. 0.667 > 0.666667, right?
You can't have it that 6/9 in three digit decimal is 0.666 because that is not the closest 3-digit decimal to 6/9!
Since it can't be exactly represented, comparing 1/10 in base 2 is like comparing 1/7 in base 10.
1/7 = 0.142857142857... but comparing at different base 10 precisions (3 versus 6 decimal places) we have 0.143 > 0.142857.
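The decimal analogy can be checked directly with Python's `decimal` module, which lets the working precision be chosen. This is only a sketch of the analogy; the helper name `divide_at_both_precisions` is made up for the example.

```python
from decimal import Context, Decimal

three_digits = Context(prec=3)
six_digits = Context(prec=6)

def divide_at_both_precisions(numerator, denominator):
    """Return the quotient rounded to 3 and to 6 significant digits."""
    n, d = Decimal(numerator), Decimal(denominator)
    return three_digits.divide(n, d), six_digits.divide(n, d)

for numerator, denominator in [(1, 9), (6, 9), (1, 7)]:
    low, high = divide_at_both_precisions(numerator, denominator)
    relation = "<" if low < high else ">"
    print(f"{numerator}/{denominator}: {low} {relation} {high}")
```

Whether the shorter approximation ends up above or below the longer one depends only on which way the rounding happens to go at each precision.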
Just to add to the other answers talking about IEEE-754 and x86: the issue is even more complicated than they make it seem. In practice there is not "one" representation of 0.1: depending on the precision used for a computation, the last digit may end up rounded down or up. This difference can and does actually occur, because the x87 floating-point unit on x86 does not use 64 bits for its internal computations; it uses 80 bits. This is called double extended precision.
So, even among just x86 compilers, it sometimes happens that the same number is represented in two different ways, because some compute its binary representation with 64 bits, while others use 80.
In fact, it can happen even with the same compiler, even on the same machine!
#include <iostream>
#include <cmath>
void foo(double x, double y)
{
if (std::cos(x) != std::cos(y)) {
std::cout << "Huh?!?\n"; //← you might end up here when x == y!!
}
}
int main()
{
foo(1.0, 1.0);
return 0;
}
See Why is cos(x) != cos(y) even though x == y? for more info.
The rank of double is greater than that of float, so in the comparison f is converted to double before the two values are compared. Unsuffixed floating literals are double; the f suffix makes the literal a float. Note that the result of f > d is an int, so it has to be printed with %d. Passing it to %f is undefined behavior and can print a misleading 0.000000. With %d, this prints 1 on typical IEEE 754 implementations:
#include <stdio.h>
#include <float.h>
int main()
{
double d = 0.1;
float f = 0.1f;
printf("%d\n", (f > d)); /* the comparison yields an int */
return 0;
}

Can someone explain to me why float x = 0.1 * 7 does not result in x == 0.7 to be true? [duplicate]

EDIT: There are a lot of disgruntled members here because this question had a duplicate on the site. In my defense, I tried searching for the answer FIRST, and maybe I was using poor searching keywords, but I could not find a direct, clear answer to this specific code example. Little did I know there was one out there from **2009** that would then be linked to from here.
Here's a coded example:
#include <iostream>
using namespace std;
int main() {
float x = 0.1 * 7;
if (x == 0.7)
cout << "TRUE. \n";
else
cout << "FALSE. \n";
return 0;
}
This results in FALSE. However, when I output x, it does indeed output as 0.7. Explanation?
Please read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
First of all, 0.1 is a literal of type double. The closest representable value to 0.1 in IEEE 754 double-precision is:
0.1000000000000000055511151231257827021181583404541015625
If you multiply that by 7, the closest representable value in IEEE 754 single-precision (since you're storing it in a float) is:
0.699999988079071044921875
Which, as you can see, is almost 0.7, but not quite. This then gets converted to a double for the comparison, and you end up comparing the following two values:
0.699999988079071044921875 == 0.6999999999999999555910790149937383830547332763671875
Which of course evaluates to false.
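The values quoted above can be reproduced with a small sketch. It is written in Python, whose floats are doubles; the hypothetical helper `to_float32` emulates the store into a C++ `float` by rounding through the `struct` module.

```python
import struct
from decimal import Decimal

def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

product = 0.1 * 7          # computed in double precision
x = to_float32(product)    # emulates `float x = 0.1 * 7;`

print(Decimal(x))          # value held in the float
print(Decimal(0.7))        # value of the double literal 0.7
print(x == 0.7)            # float widened to double vs. double literal
print(x == to_float32(0.7))
```

The last line shows that in this particular case the stored float matches the single-precision rounding of 0.7. That is a property of these specific values, not a general fix for comparing floats.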
This is because numbers are stored in binary. In binary, you cannot exactly represent the fraction .1 or .7 with finitely many places, because these have repeating expansions in binary. Something like 1/2 can be represented exactly as .1 in binary, but .1 in decimal, for instance, is .0001100110011... in binary. So, when you cut off this number, you're bound to have roundoff error.
Doubles and floats that result from computations should not be compared with the == operator. Many decimal values (0.1, for example) have no finite binary representation, so they are stored only approximately.
You will see it here:
#include <iostream>
using namespace std;
int main() {
float x = 0.1 * 7;
cout << x-0.7;
return 0;
}
The difference is NOT zero, but something very very close to zero.
Like every data type, a float is stored as a binary number. For the exact representation see here: http://en.wikipedia.org/wiki/IEEE_floating_point
When converting a decimal number to a floating point number by hand, you first have to convert it to a fixed point number.
Converting 0.7 to base 2 (binary):
0.7 = 0.101100110011...
As you see, it has infinitely many digits after the binary point, so when representing it as a float data type, some digits will get cut off. This results in the number not being EXACTLY 0.7 when converting it back to decimal.
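The by-hand conversion can be sketched as repeated doubling. The function name `binary_expansion` is made up for this example, and `fractions.Fraction` is used so that the arithmetic in the sketch itself is exact.

```python
from fractions import Fraction

def binary_expansion(value: Fraction, digits: int) -> str:
    """Return up to `digits` binary digits of a fraction in [0, 1)."""
    bits = []
    remainder = value
    for _ in range(digits):
        remainder *= 2
        bit = int(remainder)      # 1 if doubling reached 1 or more, else 0
        bits.append(str(bit))
        remainder -= bit
        if remainder == 0:        # the expansion terminates, e.g. 1/2 -> 0.1
            break
    return "0." + "".join(bits)

print(binary_expansion(Fraction(7, 10), 12))
print(binary_expansion(Fraction(1, 10), 13))
print(binary_expansion(Fraction(1, 2), 12))
```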
In your example the multiplication results in a different number than the literal "0.7".
To fix this, use an epsilon (a small tolerance) when comparing floats for equality:
if (x < 0.71f && x > 0.69f)
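Hard-coded bounds such as 0.69 and 0.71 are quite loose. As a sketch of a tighter alternative, a relative tolerance can be tied to the precision of the type. The example is in Python; `to_float32` is a hypothetical helper that emulates a C++ `float`, and the tolerance of 1e-6 is an assumption based on single precision carrying roughly 7 significant decimal digits.

```python
import math
import struct

def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = to_float32(0.1 * 7)

# Assumed tolerance: a little coarser than single-precision rounding error.
matches_loose = math.isclose(x, 0.7, rel_tol=1e-6)
# The default rel_tol (1e-09) is tighter than the float32 rounding error here.
matches_default = math.isclose(x, 0.7)

print(matches_loose, matches_default)
```

The tolerance has to match the least precise value involved; a tolerance suited to doubles rejects a value that was rounded to single precision.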

How to solve floating point number getting wrong in list [haskell] [duplicate]

For example, when I type
[0.1, 0.3 ..1]
I get this:
[0.1,0.3,0.5,0.7,0.8999999999999999,1.0999999999999999]
I expected:
[0.1,0.3,0.5,0.7,0.9]
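Try
map (/10) [1, 3 .. 9]
instead. Note that the upper bound is 9 rather than 10: the division makes the list elements Doubles, so with a bound of 10 the enumeration would also produce 11 (and therefore 1.1).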
The problem is that floating point numbers use binary fractions, and binary fractions can't exactly represent decimal fractions. So you get errors and the errors build up.
A binary fraction cannot exactly represent 1/5 in the same way that a decimal fraction cannot exactly represent 1/3 --- the best we can do is 0.33333....
[0.1, 0.3 .. 1] is shorthand for
[0.1,
0.1 + 0.2,
0.1 + 0.2 + 0.2,
0.1 + 0.2 + 0.2 + 0.2,
0.1 + 0.2 + 0.2 + 0.2 + 0.2,
0.1 + 0.2 + 0.2 + 0.2 + 0.2 + 0.2]
The other problem is that a floating-point range does not stop at the limit itself: it keeps producing elements until one would be more than half a step past the limit, which is why you have the 1.0999999999999999 element.
This is an issue with how floating point numbers are represented in the computer. Simple arithmetic with floating point numbers often does not behave as expected. Repeated arithmetic can accumulate "rounding error", which means the result can get progressively worse as you repeatedly add numbers (for example).
You can avoid these problems in some cases by using a different numerical representation. If you only care about rational numbers, for example, you could use the Rational type. So you could do:
[0.1,0.3..1] :: [Rational]
which results in:
[1 % 10,3 % 10,1 % 2,7 % 10,9 % 10,11 % 10]
This is the correct answer with no rounding error; each number is just represented as the ratio of two Integers. Depending on your particular situation, this may be a better option than using floating point numbers.
This does still go over the upper bound, but that is much easier to deal with than the rounding error you get from floating point numbers.
Note that for something performance critical floating point numbers are probably going to be faster.
The expression [e1, e2 .. e3] is evaluated as enumFromThenTo e1 e2 e3, which for floating point numbers means (from The Haskell 98 Report):
For Float and Double, the semantics of the enumFrom family is given by the rules for Int above, except that the list terminates when the elements become greater than e3+i/2 for positive increment i, or when they become less than e3+i/2 for negative i.
This means that with floating point numbers the last element of [e1, e2 .. e3] is often greater than e3, and can be up to e3+(e2-e1)/2 - ε.
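The termination rule quoted from the Report can be sketched as follows. This is an illustration in Python, not GHC's implementation: the name `enum_from_then_to` simply mirrors the Haskell function, exact `Fraction` values are used so that only the stopping rule is visible, and a zero step (an infinite list in Haskell) is rejected in this sketch.

```python
from fractions import Fraction

def enum_from_then_to(e1, e2, e3):
    """List e1, e1+i, e1+2i, ... up to the limit e3 + i/2, where i = e2 - e1."""
    i = e2 - e1
    if i == 0:
        raise ValueError("step must be non-zero in this sketch")
    limit = e3 + i / 2
    result = []
    x = e1
    # Keep going while the element has not passed the limit.
    while (x <= limit) if i > 0 else (x >= limit):
        result.append(x)
        x += i
    return result

tenths = enum_from_then_to(Fraction(1, 10), Fraction(3, 10), Fraction(1))
print([str(q) for q in tenths])
```

With a step of 1/5 the limit is 1 + 1/10, so 11/10 is still included, which matches the Rational result shown earlier.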

Error subtracting floating point numbers when passing through 0.0

The following program:
#include <stdio.h>
int main()
{
double val = 1.0;
int i;
for (i = 0; i < 10; i++)
{
val -= 0.2;
printf("%g %s\n", val, (val == 0.0 ? "zero" : "non-zero"));
}
return 0;
}
Produces this output:
0.8 non-zero
0.6 non-zero
0.4 non-zero
0.2 non-zero
5.55112e-17 non-zero
-0.2 non-zero
-0.4 non-zero
-0.6 non-zero
-0.8 non-zero
-1 non-zero
Can anyone tell me what is causing the error when subtracting 0.2 from 0.2? Is this a rounding error or something else? Most importantly, how do I avoid this error?
EDIT: It looks like the conclusion is to not worry about it, given 5.55112e-17 is extremely close to zero (thanks to @therefromhere for that information).
It's because most floating point values cannot be stored in memory exactly. So it is not safe to use == on floating point values that come out of a computation. Using double will increase the precision, but again that will not be exact. The correct way to compare a floating point value is to do something like this:
val == target; // not safe
// instead do this
// where EPS is some suitable low value like 1e-7
fabs(val - target) < EPS;
EDIT: As pointed out in the comments, the main reason for the problem is that 0.2 can't be stored exactly. So every time you subtract it from some value, you introduce a small error. If you do this kind of floating point calculation repeatedly, then at some point the error will become noticeable. What I am trying to say is that not all values can be stored as floating point numbers, as there are infinitely many of them. A slightly wrong value is not generally noticeable, but using it in successive computations will lead to a higher cumulative error.
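Applied to the loop from the question, the tolerance check might look like the sketch below (in Python, whose floats are doubles). The value of `EPS` is an assumption; an absolute tolerance is used because the target is zero, where a relative tolerance would not help.

```python
EPS = 1e-9  # assumed tolerance; choose one that suits the magnitudes involved

val = 1.0
labels = []
for _ in range(10):
    val -= 0.2
    # Treat anything within EPS of zero as zero.
    label = "zero" if abs(val) < EPS else "non-zero"
    labels.append(label)
    print("%g %s" % (val, label))
```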
0.2 is not exactly representable as a double precision floating-point number, so it is rounded to the nearest double precision number, which is:
0.200000000000000011102230246251565404236316680908203125
That's rather unwieldy, so let's look at it in hex instead:
0x0.33333333333334
Now, let's follow what happens when this value is repeatedly subtracted from 1.0:
0x1.00000000000000
- 0x0.33333333333334
--------------------
0x0.cccccccccccccc
The exact result is not representable in double precision, so it is rounded, which gives:
0x0.ccccccccccccd
In decimal, this is exactly:
0.8000000000000000444089209850062616169452667236328125
Now we repeat the process:
0x0.ccccccccccccd
- 0x0.33333333333334
--------------------
0x0.9999999999999c
rounds to 0x0.999999999999a
(0.600000000000000088817841970012523233890533447265625 in decimal)
0x0.999999999999a
- 0x0.33333333333334
--------------------
0x0.6666666666666c
rounds to 0x0.6666666666666c
(0.400000000000000077715611723760957829654216766357421875 in decimal)
0x0.6666666666666c
- 0x0.33333333333334
--------------------
0x0.33333333333338
rounds to 0x0.33333333333338
(0.20000000000000006661338147750939242541790008544921875 in decimal)
0x0.33333333333338
- 0x0.33333333333334
--------------------
0x0.00000000000004
rounds to 0x0.00000000000004
(0.000000000000000055511151231257827021181583404541015625 in decimal)
Thus, we see that the accumulated rounding that is required by floating-point arithmetic produces the very small non-zero result that you are observing. Rounding is subtle, but it is deterministic, not magic, and not a bug. It's worth taking the time to learn about.
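The walk-through can be checked mechanically. The sketch below, in Python, repeats the subtraction and compares each result with the hexadecimal value given above, relying on `float.fromhex` accepting that notation; it assumes the platform uses IEEE 754 doubles with round-to-nearest.

```python
# Values after each subtraction, copied from the walk-through above.
expected = [
    "0x0.ccccccccccccd",
    "0x0.999999999999a",
    "0x0.6666666666666c",
    "0x0.33333333333338",
    "0x0.00000000000004",
]

val = 1.0
steps = []
for hex_text in expected:
    val -= 0.2
    steps.append(val)
    # float.hex() prints a normalized form; the comparison checks the value.
    print(val.hex(), val == float.fromhex(hex_text))
```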
Floating point arithmetic cannot represent all numbers exactly. Thus rounding errors like you observe are inevitable.
One possible strategy is to use a fixed point format, e.g. a decimal or currency data type. Such types still can't represent all numbers but would behave as you expect for this example.
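As a sketch of that strategy, Python's `decimal.Decimal` (a decimal floating-point type, used here as a stand-in for whatever decimal type a given language offers) stores 0.2 exactly when it is built from a string, so the loop reaches zero.

```python
from decimal import Decimal

val = Decimal("1.0")
step = Decimal("0.2")   # built from a string, so it is exactly 0.2
values = []
for _ in range(10):
    val -= step
    values.append(val)
    print(val, "zero" if val == 0 else "non-zero")
```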
To elaborate a bit: if the mantissa of the floating point number is encoded in binary (as is the case in most contemporary FPUs), then only sums of the numbers 1/2, 1/4, 1/8, 1/16, ... can be represented exactly in the mantissa. The value 0.2 is approximated with 1/8 + 1/16 + ... some even smaller numbers, yet the exact value of 0.2 cannot be reached with a finite mantissa.
You can try the following:
printf("%.20f", 0.2);
and you'll (probably) see that what you think is 0.2 is not 0.2 but a number that is a tiny amount different (actually, on my computer it prints 0.20000000000000001110). Now you understand why you can never reach 0.
But if you let val = 12.5 and subtract 0.125 in your loop, you could reach zero.
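A quick sketch of that last point, in Python: 0.125 is a power of two and 12.5 is a multiple of it, so every intermediate value is exactly representable and the loop lands on zero.

```python
val = 12.5
steps = 0
while val > 0.0:
    val -= 0.125   # 1/8 is a power of two, so every intermediate value is exact
    steps += 1

print(steps, val, val == 0.0)
```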