Comparing floats in c++ - c++

I'm learning C++ from a tutorial, and there I was told that comparing floats in C++ can be very confusing. For example, in the code below:
#include <iostream>
using namespace std;

int main()
{
    float decimalNumber = 1.2;
    if (decimalNumber == 1.2) {
        cout << "Equal" << endl;
    } else {
        cout << "Not equal" << endl;
    }
    return 0;
}
I would get "Not equal". I agree on this point. The tutor said that if we need to compare floats, we can compare with > against the nearest number (which would not be very precise). I searched Google for different ways to compare floats and found many complicated approaches.
Then I created a program myself:
#include <iostream>
using namespace std;

int main()
{
    float decimalNumber = 1.2;
    if (decimalNumber == (float)1.2) {
        cout << "Equal" << endl;
    } else {
        cout << "Not equal" << endl;
    }
    return 0;
}
After type-casting like above, I got "Equal".
What I want to know is: should I use the above way to compare floats in all of my programs? Does this have any cons?
Note: I know how a number is represented in memory and why 0.1 + 0.2 != 0.3, as described in another SO question. I just want to know whether I can check the equality of two floats in the above way.

What I want to know is: should I use the above way to compare floats in all of my programs?
It depends on context. Equality comparison is very rarely useful with floats.
However, yes: whether you compare for equality or with relational operators, you should compare floating-point objects of the same type instead of mixing float and double.
Does this have any cons?
Floating-point calculations potentially carry error, so the result might not be what you expect, and when there is error, equality comparison is meaningless.
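For instance, a minimal sketch of the same-type comparison described here, using a float literal (1.2f) instead of a cast:
#include <iostream>

int main()
{
    float decimalNumber = 1.2f;

    // Both sides are float, built from the same literal, so they hold the
    // same (rounded) value and compare equal.
    if (decimalNumber == 1.2f)
        std::cout << "Equal\n";
    else
        std::cout << "Not equal\n";
}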

The reason the second example works is, in a way, pure chance.
Some factors to consider:
When both sides of the comparison are of type float, the compiler is probably more likely to "optimise out" the comparison so that it just happens during the compilation process. It can look at the literals and realise that the two numbers are logically the same, even though at runtime they may differ at lower levels of precision.
When both sides of the comparison are of type float, they have the same precision, so if you've created the values in the same way (here, by a literal), any error in the lower levels of precision can be identical. When one of them is a double, you have additional error bits at the end that throw off the comparison. And if you'd created one of the values via 0.6 + 0.6, the result could also differ, because the errors propagate differently.
In general, do not rely on this. You can't really predict how "accurate" your floats will be when they contain numbers not representable exactly in binary. You should stick to epsilon-range compares (where appropriate) if you need loose value comparison, even if direct comparison appears to "work" occasionally without it.
A good approach to take, if you don't actually need floating point, is to use fixed point instead. There are no built-in fixed-point types in the language, but that's okay because you can simulate them trivially with a little arithmetic. It's hard to know what your use case is here, but if you only need one decimal place, instead you can store int decimalNumber = 12 (i.e. shift the decimal point by one) and just divide by ten whenever you need to display it. Two ints of value 12 always compare nicely.
Money is a great example of this: count in pennies (or tenths of pennies), not in pounds, to avoid errors creeping in that scam your customers out of their cash. 😉
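For illustration, a minimal fixed-point sketch along those lines (the prices and names are made up):
#include <iostream>

int main()
{
    // Fixed-point sketch: store money as whole pennies in an int and only
    // convert to pounds when displaying.
    int priceInPennies    = 1999;   // 19.99 pounds
    int discountInPennies = 250;    //  2.50 pounds

    int totalInPennies = priceInPennies - discountInPennies;

    // Integer comparison is exact; no epsilon needed.
    if (totalInPennies == 1749)
        std::cout << "Total: " << totalInPennies / 100 << "."
                  << totalInPennies % 100 << " pounds\n";   // Total: 17.49 pounds
}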

Related

Is hardcoding least significant byte of a double a good rounding strategy?

I have a function doing some mathematical computation and returning a double. It ends up with different results under Windows and Android due to the std::exp implementation being different (Why do I get platform-specific result for std::exp?). The e-17 rounding difference gets propagated, and in the end it's not just a rounding difference that I get (results can change from 2.36 to 2.47 in the end). As I compare the result to some expected values, I want this function to return the same result on all platforms.
So I need to round my result. The simplest solution, apparently (as far as I could find on the web), is to do std::ceil(d*std::pow<double>(10,precision))/std::pow<double>(10,precision). However, I feel this could still end up with different results depending on the platform (and moreover, it's hard to decide what the precision should be).
I was wondering if hard-coding the least significant byte of the double could be a good rounding strategy.
This quick test seems to show that "yes":
#include <iostream>
#include <iomanip>
#include <cmath>    // for std::abs(double)

double roundByCast( double d )
{
    double rounded = d;
    unsigned char* temp = (unsigned char*) &rounded;
    // changing least significant byte to be always the same
    temp[0] = 128;
    return rounded;
}

void showRoundInfo( double d, double rounded )
{
    double diff = std::abs(d - rounded);
    std::cout << "cast: " << d << " rounded to " << rounded << " (diff=" << diff << ")" << std::endl;
}

void roundIt( double d )
{
    showRoundInfo( d, roundByCast(d) );
}

int main( int argc, char* argv[] )
{
    roundIt( 7.87234042553191493141184764681 );
    roundIt( 0.000000000000000000000184764681 );
    roundIt( 78723404.2553191493141184764681 );
}
This outputs:
cast: 7.87234 rounded to 7.87234 (diff=2.66454e-14)
cast: 1.84765e-22 rounded to 1.84765e-22 (diff=9.87415e-37)
cast: 7.87234e+07 rounded to 7.87234e+07 (diff=4.47035e-07)
My question is:
Is unsigned char* temp = (unsigned char*) &rounded safe or is there an undefined behaviour here, and why?
If there is no UB (or if there is a better way to do this without UB), is such a round function safe and accurate for all input?
Note: I know floating point numbers are inaccurate. Please don't mark this as a duplicate of Is floating point math broken? or Why Are Floating Point Numbers Inaccurate?. I understand why the results are different; I'm just looking for a way to make them identical on all targeted platforms.
Edit: let me reformulate my question, as people are asking why I have different values and why I want them to be the same.
Let's say you get a double from a computation that could end up with a different value due to platform-specific implementations (like std::exp). If you want to fix those differing doubles so they end up having the exact same memory representation (1) on all platforms, and you want to lose as little precision as possible, then is fixing the least significant byte a good approach? (Because I feel that rounding to an arbitrary given precision is likely to lose more information than this trick.)
(1) By "same representation", I mean that if you transform it to a std::bitset, you want to see the same bit sequence on all platforms.
No, rounding is not a strategy for removing small errors, or guaranteeing agreement with calculations performed with errors.
For any slicing of the number line into ranges, you will successfully eliminate most slight deviations (by placing them in the same bucket and clamping to the same value), but you greatly increase the deviation if your original pair of values straddle a boundary.
In your particular case of hardcoding the least significant byte, the very near values
0x1.mmmmmmm100
and
0x1.mmmmmmm0ff
have a deviation of only one ULP... but after your rounding, they differ by 256 ULP. Oops!
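To make the straddling concrete, here is a small sketch that reuses roundByCast() from the question and builds such a pair with std::nextafter; like the original code, it assumes a little-endian layout:
#include <cmath>
#include <iostream>

// roundByCast() from the question, repeated so the sketch is self-contained.
double roundByCast( double d )
{
    double rounded = d;
    unsigned char* temp = (unsigned char*) &rounded;
    temp[0] = 128;
    return rounded;
}

int main()
{
    // Build two adjacent doubles that straddle a "bucket" boundary:
    // a gets least significant byte 0x00, b is one ULP below it (byte 0xff).
    double a = 1.2;
    unsigned char* p = (unsigned char*) &a;
    p[0] = 0x00;                        // assumes byte 0 is the low mantissa byte
    double b = std::nextafter(a, 0.0);  // one ULP away from a

    // They started 1 ULP apart, but after "rounding" they are 256 ULPs apart.
    std::cout << (roundByCast(a) == roundByCast(b)) << '\n';   // prints 0
}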
Is unsigned char* temp = (unsigned char*) &rounded safe or is there an undefined behaviour here, and why?
It is well defined, as aliasing through unsigned char is allowed.
is such a round function safe and accurate for all input?
No. You cannot perfectly fix this problem with truncating/rounding. Consider that one implementation gives 0x.....0ff and the other 0x.....100. Setting the LSB to a fixed value turns the original 1-ULP difference into a 256-ULP difference.
No rounding algorithm can fix this.
You have two options:
don't use floating point, use some other way (for example, fixed point)
embed a floating point library into your application which only uses basic floating point arithmetic (+, -, *, /, sqrt), and don't use -ffast-math or any equivalent option. This way, if you're on an IEEE-754 compatible platform, floating point results should be the same, as IEEE-754 mandates that basic operations be calculated "perfectly": as if the operation were calculated at infinite precision and then rounded to the destination representation.
Btw, if an input 1e-17 difference means a huge output difference, then your problem/algorithm is ill-conditioned, which generally should be avoided, as it usually doesn't give you meaningful results.
What you are doing is totally, totally misguided.
Your problem is not that you are getting different results (2.36 vs. 2.47). Your problem is that at least one of these results, and likely both, have massive errors. Your Windows and Android results are not just different, they are WRONG. (At least one of them, and you have no idea which one).
Find out why you get these massive errors and change your algorithms to not increase tiny rounding errors massively. Or you have a problem that is inherently chaotic, in which case the difference between results is actually very useful information.
What you are trying just makes the rounding errors 256 times bigger, and if two different results end in ....1ff and ....200 hexadecimal, then you change these to ....180 and ....280, so even the difference between slightly different numbers can grow by a factor of 256.
And on a big-endian machine your code will just go kaboom!!!
Your function won't work because of aliasing.
double roundByCast( double d )
{
    double rounded = d;
    unsigned char* temp = (unsigned char*) &rounded;
    // changing least significant byte to be always the same
    temp[0] = 128;
    return rounded;
}
Casting to unsigned char* for temp is allowed, because char* casts are the exception to the aliasing rules. That's necessary for functions like read, write, memcpy, etc, so that they can copy values to and from byte representations.
However, you aren't allowed to write to temp[0] and then assume that rounded changed. You must create a new double variable (on the stack is fine) and memcpy temp back to it.
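A minimal sketch of that memcpy-based variant (the function name is mine, not from the question):
#include <cstring>

double roundByCopy( double d )
{
    unsigned char bytes[sizeof(double)];
    std::memcpy(bytes, &d, sizeof bytes);           // copy the object representation out
    bytes[0] = 128;                                 // overwrite the first byte, as before
    double rounded;
    std::memcpy(&rounded, bytes, sizeof rounded);   // copy the modified bytes back
    return rounded;
}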

Compensating for double/float inaccuracy

I've written a math calculator that takes in a string from the user and parses it. It uses doubles to hold all values involved when calculating. Once solved, I print the result and use std::setprecision() to make sure it is output correctly (for instance, 0.9999999 will become 1 on the printout).
Returning the string that will be output:
//return true or false if this is in the returnString
if (returnString.compare("True") == 0 || returnString.compare("False") == 0)
    return returnString;
//create stringstream and put the answer into the returnString
std::stringstream stream;
returnString = std::to_string(temp_answer.answer);
//write into the stream with precision set correctly
stream << std::fixed << std::setprecision(5) << temp_answer.answer;
return stream.str();
I am aware of the accuracy issues when using doubles and floats. Today I started working on code so that the user can compare two mathematical strings. For instance, 1=1 will evaluate to true, 2>3 to false, etc. This works by running my math expression parser for each side of the comparison operator, then comparing the answers.
The issue I'm facing right now is when the user enters something like 1/3*3=1. Of course, because I'm using doubles, the parser returns 0.999999... as the answer. Usually, when just solving a non-comparison problem, this is compensated for at printing time with std::setprecision() as mentioned before. However, when comparing two doubles it's going to return false, as 0.999999 != 1. How can I get it so that this inaccuracy is compensated for when comparing the doubles, and the answer is returned correctly? Here's the code that I use to compare the numbers themselves.
bool math_comparisons::equal_to(std::string lhs, std::string rhs)
{
    auto lhs_ret = std::async(process_question, lhs);
    auto rhs_ret = std::async(process_question, rhs);
    bool ReturnVal = false;
    if (lhs_ret.get().answer == rhs_ret.get().answer)
    {
        ReturnVal = true;
    }
    return ReturnVal;
}
I'm thinking some kind of rounding needs to occur, but I'm not 100% sure how to accomplish it properly. Please forgive me if this has already been addressed - I couldn't find much with a search. Thanks!
Assuming that answer is a double, replace this
lhs_ret.get().answer == rhs_ret.get().answer
with
std::fabs(lhs_ret.get().answer - rhs_ret.get().answer) < TOL
where TOL is an appropriate tolerance value.
Floating point numbers should never be compared with ==; instead, check whether the absolute difference is less than a given tolerance.
There is one difficulty that needs to be mentioned: the accuracy of doubles is about 16 decimal digits, so you might set TOL = 1.0e-16. This only works if your numbers are less than 1; for a number with 16 digits before the decimal point, the tolerance would have to be as large as 1.
So either you assume that your numbers are smaller than, say, 10e8 and use a relatively large tolerance like 10e-8, or you need to do something much more complicated.
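One common version of the "much more complicated" approach is to combine an absolute tolerance (for values near zero) with a relative tolerance that scales with the magnitude of the inputs; a sketch, with illustrative names and constants:
#include <algorithm>
#include <cmath>

bool nearlyEqual(double a, double b,
                 double absTol = 1e-12, double relTol = 1e-9)
{
    double diff  = std::fabs(a - b);
    // Scale the relative tolerance by the larger of the two magnitudes.
    double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(absTol, relTol * scale);
}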
First consider:
As a basic rule of thumb, a double gives you roughly 16 decimal places of precision, i.e. about 1.0e-16. Be aware that this only applies to numbers less than one; for a value like 10.n you effectively only have 15 decimal places (1.0e-15), and so on. Because computers count in base 2 and people count in base 10, some values can never be expressed exactly in the bit patterns that most modern systems use.
This can be highlighted by the fact that expressing 0.1 in binary or base 2 is infinitely long.
Therefore you should never compare fractional values via the == operator. Instead, the conventional "go to" solution is:
You implement a "close enough" comparison. That is, you define epsilon as some value, e.g. epsilon = 0.000001, and you state that if abs(a - b) < epsilon, the two values are considered equal. What we are saying is that if a and b are within epsilon of each other, then for all intents and purposes of our program they are close enough to be regarded as the same.
Now, choosing a value for epsilon depends entirely on how accurate you need to be for your purposes. For example, one can assume you need a higher level of accuracy for structural engineering than for a 2D side-scrolling platform game.
The solution in your case would be to replace line 7 of your code:
(lhs_ret.get().answer == rhs_ret.get().answer)
with
abs(lhs_ret.get().answer - rhs_ret.get().answer) < epsilon, where abs is the absolute value, i.e. ignoring the sign of the difference.
For more context I highly recommend this lecture on MIT OpenCourseWare, which explains it in an easy-to-digest manner.
http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-00sc-introduction-to-computer-science-and-programming-spring-2011/unit-1/lecture-7-debugging/

Truncate floor into three decimal point C++

I want to truncate a floating-point number to three decimal places. Example:
input : x = 0.363954;
output: 0.364
I used
double myCeil(float v, int p)
{
    return int(v * pow(float(10), p)) / pow(float(10), p);
}
but the output was 0.3630001.
I tried to use trunc from <cmath> but it doesn't exist.
Floating-point math typically uses a binary representation; as a result, there are decimal values that cannot be represented exactly as floating-point values. Trying to fiddle with internal precision runs into exactly this problem. But mostly, when someone is trying to do this, they really want to display a value using a particular precision, and that's simple:
double x = 0.363954;
std::cout.precision(3);
std::cout << x << '\n';
The function you're looking for is std::ceil, not std::trunc:
#include <cmath>

double myCeil(double v, int p)
{
    return std::ceil(v * std::pow(10, p)) / std::pow(10, p);
}
Substitute std::floor or std::round to make a myFloor or myRound as desired. (Note that std::round appeared in C++11, which you will have to enable if it isn't already.)
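For instance, a rounding variant along those lines (assuming C++11 for std::round), which matches the 0.363954 -> 0.364 example from the question:
#include <cmath>

double myRound(double v, int p)
{
    return std::round(v * std::pow(10, p)) / std::pow(10, p);
}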
It is just impossible to get 0.364 exactly. There is no way you can store the number 0.364 (364/1000) exactly as a float, in the same way that you would need an infinite number of decimals to write 1/3 as 0.3333333333...
You did it correctly, except that you probably want to use std::round(), to round to the closest number, instead of int(), which truncates.
Comparing floating point numbers is tricky business. Typically the best you can do is check that the numbers are sufficiently close to each other.
Are you doing your rounding for comparison purposes? In that case, it seems you are happy with 3 decimals (this depends on the problem in question...), so why not just
bool are_equal_to_three_decimals(double a, double b)
{
    return std::abs(a - b) < 0.001;
}
Note that the results obtained via comparing the rounded numbers and the function I suggested are not equivalent!
This is an old post, but what you are asking for is decimal precision with binary mathematics. The conversion between the two is giving you an apparent distinction.
The main point you are making, I think, has to do with identity, so that you can use equality/inequality comparisons between two numbers.
Because of the fact that there is a discrepancy between what we humans use (decimal) and what computers use (binary), we have four choices:
1. We use a decimal library. This is computationally costly, because we are using maths that differ from how computers work. There are several, and one day they may be adopted into std. See e.g. "ISO/IEC JTC1 SC22 WG21 N2849".
2. We learn to do our maths in binary. This is mentally costly, because it's not how we do our maths normally.
3. We change our algorithm to include an identity test.
4. We change our algorithm to use a difference test.
Option 3 is where we make a decision as to just how close one number needs to be to another number to be considered 'the same number'.
One simple way of doing this (as given by @SirGuy above) is to use ceiling or floor as a test - this is good, because it allows us to choose the number of significant digits we are interested in. It is domain specific, and the solution he gives might be a bit more optimal if using a power of 2 rather than of 10.
You definitely would only want to do the calculation when using equality/inequality tests.
So now, our equality test would be (for 10 binary places, roughly 3 decimal places):
// Normal identity test for floats.
// Quick but fails eg 1.0000023 == 1.0000024
return (a == b);
Becomes (with 2^10 = 1024).
// Modified identity test for floats.
// Works with 1.0000023 == 1.0000024
return (std::floor(a * 1024) == std::floor(b * 1024));
But this isn't great.
I would go for option 4.
Say you consider any difference less than 0.0001 to be insignificant, such that 1.00012 = 1.00011.
This does an additional subtraction and a sign removal, which is far cheaper (and more reliable) than bit shifts.
// Modified equality test for floats.
// Returns true if the delta is less than 1/10000.
// Works with 1.0000023 == 1.0000024
return abs(a - b) < 0.0001;
This boils down to your comment about calculating circularity: I am suggesting that you calculate the delta (difference) between two circles rather than testing for equivalence. But that isn't exactly what you asked in the question...

Converting variable type (or workaround)

The class below is supposed to represent a musical note. I want to be able to store the length of the note (e.g. 1/2 note, 1/4 note, 3/8 note, etc.) using only integers. However, I also want to be able to store the length using a floating point number for the rare case that I deal with notes of irregular lengths.
#include <string>
using std::string;

class note {
    string tone;
    int length_numerator;
    int length_denominator;
public:
    void set_length(int numerator, int denominator) {
        length_numerator = numerator;
        length_denominator = denominator;
    }
    void set_length(double d) {
        length_numerator = d; // unfortunately truncates everything past the decimal point
        length_denominator = 1;
    }
};
The reason it is important for me to be able to use integers rather than doubles to store the length is that in my past experience with floating point numbers, sometimes the values are unexpectedly inaccurate. For example, a number that is supposed to be 16 occasionally gets mysteriously stored as 16.0000000001 or 15.99999999999 (usually after enduring some operations) with floating point, and this could cause problems when testing for equality (because 16!=15.99999999999).
Is it possible to convert a variable from int to double (the variable, not just its value)? If not, then what else can I do to be able to store the note's length using either an integer or a double, depending on what I need the type to be?
If your only problem is comparing floats for equality, then I'd say to use floats, but read "Comparing floating point numbers" by Bruce Dawson first. It's not long, and it explains how to compare two floating-point numbers correctly (by checking the absolute and relative difference).
When you have more time, you should also look at "What Every Computer Scientist Should Know About Floating Point Arithmetic" to understand why 16 occasionally gets "mysteriously" stored as 16.0000000001 or 15.99999999999.
Attempts to use integers for rational numbers (or for fixed point arithmetic) are rarely as simple as they look.
I see several possible solutions: the first is just to use double. It's true that extended computations may result in inaccurate results, but in this case, your divisors are normally powers of 2, which will give exact results (at least on all of the machines I've seen); you only risk running into problems when dividing by some unusual value (which is the case where you'll have to use double anyway).
You could also scale the results, e.g. representing the notes as multiples of, say, 64th notes. This will mean that most values will be small integers, which are guaranteed exact in double (again, at least in the usual representations). A number that is supposed to be 16 does not get stored as 16.000000001 or 15.99999999 (but a number that is supposed to be .16 might get stored as .1600000001 or .1599999999). Before the appearance of long long, decimal arithmetic classes often used double as a 52-bit integral type, ensuring at each step that the actual value was exactly an integer. (Only division might cause a problem.)
Or you could use some sort of class representing rational numbers. (Boost has one, for example, and I'm sure there are others.) This would allow any strange values (5th notes, anyone?) to remain exact; it could also be advantageous for human-readable output, e.g. you could test the denominator and then output something like "3 quarter notes", or the like. Even something like "a 3/4 note" would be more readable to a musician than "a .75 note".
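As an illustration of the "multiples of a 64th note" idea, a small sketch (the class and member names here are made up):
class Note {
public:
    void set_length(int num, int den)       // e.g. 3, 8 for a 3/8 note
    {
        length_in_64ths = num * 64 / den;   // assumes den divides 64 evenly
    }

    bool same_length_as(const Note& other) const
    {
        return length_in_64ths == other.length_in_64ths;   // exact integer compare
    }

private:
    int length_in_64ths = 0;    // quarter note == 16, half note == 32, ...
};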
It is not possible to convert a variable from int to double; it is possible to convert a value from int to double. I'm not completely certain which you are asking for, but maybe you are looking for a union:
union DoubleOrInt
{
    double d;
    int i;
};

DoubleOrInt length_numerator;
DoubleOrInt length_denominator;
Then you can write
void set_length(int numerator, int denominator) {
    length_numerator.i = numerator;
    length_denominator.i = denominator;
}
void set_length(double d) {
    length_numerator.d = d;
    length_denominator.d = 1.0;
}
The problem with this approach is that you absolutely must keep track of whether you are currently storing ints or doubles in your unions. Bad things will happen if you store an int and then try to access it as a double. Preferably you would do this inside your class.
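For example, a sketch of doing that bookkeeping inside the class, with a flag recording which union member is currently valid (the names are illustrative, not from the question):
class NoteLength {
public:
    void set_length(int numerator, int denominator)
    {
        holdsDouble = false;
        num.i = numerator;
        den.i = denominator;
    }

    void set_length(double d)
    {
        holdsDouble = true;
        num.d = d;
        den.d = 1.0;
    }

    bool holds_double() const { return holdsDouble; }

private:
    union DoubleOrInt { double d; int i; };
    DoubleOrInt num{}, den{};
    bool holdsDouble = false;   // records which union member was last written
};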
This is normal behavior for floating point variables. They are always rounded, and the last digits may change value depending on the operations you do. I suggest reading up on floating point somewhere (e.g. http://floating-point-gui.de/) - especially about comparing floating-point values.
I normally subtract them, take the absolute value and compare this against an epsilon, e.g. if (abs(x - y) < eps).
Given you have a set_length(double d), my guess is that you actually need doubles. Note that the conversion from a double to a fraction of integers is fragile and complex, and will most probably not solve your equality problems (is 0.24999999 equal to 1/4?). It would be better for you to either always use fractions, or always use doubles. Then, just learn how to use them. I must say, for music, it makes sense to have fractions, as that is even how notes are described.
If it were me, I would just use an enum. To turn something into a note would be pretty simple using this system also. Here's a way you could do it:
#include <iostream>

class Note {
public:
    enum Type {
        // In this case, 16 represents a whole note, but it could be larger
        // if demisemiquavers were used or something.
        Semiquaver = 1,
        Quaver     = 2,
        Crotchet   = 4,
        Minim      = 8,
        Semibreve  = 16
    };

    static float GetNoteLength(const Type &note)
    { return static_cast<float>(note) / 16.0f; }

    static float TieNotes(const Type &note1, const Type &note2)
    { return GetNoteLength(note1) + GetNoteLength(note2); }
};

int main()
{
    // Make a semiquaver
    Note::Type sq = Note::Semiquaver;
    // Make a quaver
    Note::Type q = Note::Quaver;
    // Dot it with the semiquaver from before
    float dottedQuaver = Note::TieNotes(sq, q);

    std::cout << "Semiquaver is equivalent to: " << Note::GetNoteLength(sq) << " beats\n";
    std::cout << "Dotted quaver is equivalent to: " << dottedQuaver << " beats\n";
    return 0;
}
Those 'irregular' notes you speak of can be retrieved using TieNotes.

double precision C++

I think the precision of double is causing this problem, as described in similar posts, but I would like to know if there is a way to achieve the correct result. I'm using a function template which compares two parameters and returns true if they are equal.
template <class T>
bool eq(T one, T two)
{
    if (one == two)
        return true;
    else
        return false;
}
It works with eq(0.8, 0.8), but it doesn't work with eq(0.8*0.2, 0.16). As I mentioned, I assume it has to do with double precision, as it also works fine with ints: eq(8*2, 16).
First you should read one (or both) of these articles: What Every Computer Scientist Should Know About Floating-Point Arithmetic and The Perils of Floating Point.
If you are looking for a solution for your template, I would suggest using template specialization for the cases where T==double and T==float.
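A sketch of that suggestion - keep the exact comparison in the primary template and specialize for double and float to use a tolerance (the tolerance values here are illustrative):
#include <cmath>

template <class T>
bool eq(T one, T two)
{
    return one == two;                      // exact comparison for integral types
}

template <>
bool eq<double>(double one, double two)
{
    return std::fabs(one - two) < 1e-9;     // tolerance chosen for illustration
}

template <>
bool eq<float>(float one, float two)
{
    return std::fabs(one - two) < 1e-5f;    // tolerance chosen for illustration
}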
You should rarely try to compare doubles for equality. Instead you need to decide on a margin of error that you are willing to accept:
return fabs(one - two) <= 0.000001;
You will want to overload your function and then do a comparison that is appropriate for floating point:
bool eq(double one, double two)
{
    // You'll want to choose a delta that is appropriate for your usage
    return fabs(one - two) < DELTA;
}
You'll also want to do another overload for floats.
Here is another article about the problems with comparing floating point numbers.
Comparing for equality
Floating point math is not exact. Simple values like 0.2 cannot be precisely represented using binary floating point numbers, and the limited precision of floating point numbers means that slight changes in the order of operations can change the result. Different compilers and CPU architectures store temporary results at different precisions, so results will differ depending on the details of your environment. If you do a calculation and then compare the results against some expected value it is highly unlikely that you will get exactly the result you intended.
You should never compare floating point without specifying precision.
This will return false:
cout << (0.8 * 0.2 == 0.16 ? true : false) << endl;
But this will return true:
cout << ((0.8 * 0.2 - 0.16) < 0.001 ? true : false) << endl;
You can always use rounding functions to be sure.
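For instance, rounding both sides to a fixed number of decimal places before comparing (three decimals here, chosen arbitrarily):
#include <cmath>
#include <iostream>

int main()
{
    double a = 0.8 * 0.2;
    double b = 0.16;

    // Round both values to three decimal places, then compare the results.
    bool equal = std::round(a * 1000.0) == std::round(b * 1000.0);
    std::cout << equal << '\n';   // prints 1
}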