What is the most effective way for float and double comparison? - c++
What would be the most efficient way to compare two double or two float values?
Simply doing this is not correct:
bool CompareDoubles1 (double A, double B)
{
return A == B;
}
But something like:
bool CompareDoubles2 (double A, double B)
{
diff = A - B;
return (diff < EPSILON) && (-diff < EPSILON);
}
Seems to waste processing.
Does anyone know a smarter float comparer?
Be extremely careful using any of the other suggestions. It all depends on context.
I have spent a long time tracing bugs in a system that presumed a==b if |a-b|<epsilon. The underlying problems were:
The implicit presumption in an algorithm that if a==b and b==c then a==c.
Using the same epsilon for lines measured in inches and lines measured in mils (.001 inch). That is a==b but 1000a!=1000b. (This is why AlmostEqual2sComplement asks for the epsilon or max ULPS).
The use of the same epsilon for both the cosine of angles and the length of lines!
Using such a compare function to sort items in a collection. (In this case using the builtin C++ operator == for doubles produced correct results.)
Like I said: it all depends on context and the expected size of a and b.
By the way, std::numeric_limits<double>::epsilon() is the "machine epsilon". It is the difference between 1.0 and the next value representable by a double. I guess that it could be used in the compare function but only if the expected values are less than 1. (This is in response to #cdv's answer...)
Also, if you basically have int arithmetic in doubles (here we use doubles to hold int values in certain cases) your arithmetic will be correct. For example 4.0/2.0 will be the same as 1.0+1.0. This is as long as you do not do things that result in fractions (4.0/3.0) or do not go outside of the size of an int.
The comparison with an epsilon value is what most people do (even in game programming).
You should change your implementation a little though:
bool AreSame(double a, double b)
{
return fabs(a - b) < EPSILON;
}
Edit: Christer has added a stack of great info on this topic on a recent blog post. Enjoy.
Comparing floating point numbers for depends on the context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be.
Comparing floating point numbers by Bruce Dawson is a good place to start when looking at floating point comparison.
The following definitions are from The art of computer programming by Knuth:
bool approximatelyEqual(float a, float b, float epsilon)
{
return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool essentiallyEqual(float a, float b, float epsilon)
{
return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool definitelyGreaterThan(float a, float b, float epsilon)
{
return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
bool definitelyLessThan(float a, float b, float epsilon)
{
return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}
Of course, choosing epsilon depends on the context, and determines how equal you want the numbers to be.
Another method of comparing floating point numbers is to look at the ULP (units in last place) of the numbers. While not dealing specifically with comparisons, the paper What every computer scientist should know about floating point numbers is a good resource for understanding how floating point works and what the pitfalls are, including what ULP is.
I found that the Google C++ Testing Framework contains a nice cross-platform template-based implementation of AlmostEqual2sComplement which works on both doubles and floats. Given that it is released under the BSD license, using it in your own code should be no problem, as long as you retain the license. I extracted the below code from http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h https://github.com/google/googletest/blob/master/googletest/include/gtest/internal/gtest-internal.h and added the license on top.
Be sure to #define GTEST_OS_WINDOWS to some value (or to change the code where it's used to something that fits your codebase - it's BSD licensed after all).
Usage example:
double left = // something
double right = // something
const FloatingPoint<double> lhs(left), rhs(right);
if (lhs.AlmostEquals(rhs)) {
//they're equal!
}
Here's the code:
// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
// * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: wan#google.com (Zhanyong Wan), eefacm#gmail.com (Sean Mcafee)
//
// The Google C++ Testing Framework (Google Test)
// This template class serves as a compile-time function from size to
// type. It maps a size in bytes to a primitive type with that
// size. e.g.
//
// TypeWithSize<4>::UInt
//
// is typedef-ed to be unsigned int (unsigned integer made up of 4
// bytes).
//
// Such functionality should belong to STL, but I cannot find it
// there.
//
// Google Test uses this class in the implementation of floating-point
// comparison.
//
// For now it only handles UInt (unsigned int) as that's all Google Test
// needs. Other types can be easily added in the future if need
// arises.
template <size_t size>
class TypeWithSize {
public:
// This prevents the user from using TypeWithSize<N> with incorrect
// values of N.
typedef void UInt;
};
// The specialization for size 4.
template <>
class TypeWithSize<4> {
public:
// unsigned int has size 4 in both gcc and MSVC.
//
// As base/basictypes.h doesn't compile on Windows, we cannot use
// uint32, uint64, and etc here.
typedef int Int;
typedef unsigned int UInt;
};
// The specialization for size 8.
template <>
class TypeWithSize<8> {
public:
#if GTEST_OS_WINDOWS
typedef __int64 Int;
typedef unsigned __int64 UInt;
#else
typedef long long Int; // NOLINT
typedef unsigned long long UInt; // NOLINT
#endif // GTEST_OS_WINDOWS
};
// This template class represents an IEEE floating-point number
// (either single-precision or double-precision, depending on the
// template parameters).
//
// The purpose of this class is to do more sophisticated number
// comparison. (Due to round-off error, etc, it's very unlikely that
// two floating-points will be equal exactly. Hence a naive
// comparison by the == operation often doesn't work.)
//
// Format of IEEE floating-point:
//
// The most-significant bit being the leftmost, an IEEE
// floating-point looks like
//
// sign_bit exponent_bits fraction_bits
//
// Here, sign_bit is a single bit that designates the sign of the
// number.
//
// For float, there are 8 exponent bits and 23 fraction bits.
//
// For double, there are 11 exponent bits and 52 fraction bits.
//
// More details can be found at
// http://en.wikipedia.org/wiki/IEEE_floating-point_standard.
//
// Template parameter:
//
// RawType: the raw floating-point type (either float or double)
template <typename RawType>
class FloatingPoint {
public:
// Defines the unsigned integer type that has the same size as the
// floating point number.
typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;
// Constants.
// # of bits in a number.
static const size_t kBitCount = 8*sizeof(RawType);
// # of fraction bits in a number.
static const size_t kFractionBitCount =
std::numeric_limits<RawType>::digits - 1;
// # of exponent bits in a number.
static const size_t kExponentBitCount = kBitCount - 1 - kFractionBitCount;
// The mask for the sign bit.
static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);
// The mask for the fraction bits.
static const Bits kFractionBitMask =
~static_cast<Bits>(0) >> (kExponentBitCount + 1);
// The mask for the exponent bits.
static const Bits kExponentBitMask = ~(kSignBitMask | kFractionBitMask);
// How many ULP's (Units in the Last Place) we want to tolerate when
// comparing two numbers. The larger the value, the more error we
// allow. A 0 value means that two numbers must be exactly the same
// to be considered equal.
//
// The maximum error of a single floating-point operation is 0.5
// units in the last place. On Intel CPU's, all floating-point
// calculations are done with 80-bit precision, while double has 64
// bits. Therefore, 4 should be enough for ordinary use.
//
// See the following article for more details on ULP:
// http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm.
static const size_t kMaxUlps = 4;
// Constructs a FloatingPoint from a raw floating-point number.
//
// On an Intel CPU, passing a non-normalized NAN (Not a Number)
// around may change its bits, although the new value is guaranteed
// to be also a NAN. Therefore, don't expect this constructor to
// preserve the bits in x when x is a NAN.
explicit FloatingPoint(const RawType& x) { u_.value_ = x; }
// Static methods
// Reinterprets a bit pattern as a floating-point number.
//
// This function is needed to test the AlmostEquals() method.
static RawType ReinterpretBits(const Bits bits) {
FloatingPoint fp(0);
fp.u_.bits_ = bits;
return fp.u_.value_;
}
// Returns the floating-point number that represent positive infinity.
static RawType Infinity() {
return ReinterpretBits(kExponentBitMask);
}
// Non-static methods
// Returns the bits that represents this number.
const Bits &bits() const { return u_.bits_; }
// Returns the exponent bits of this number.
Bits exponent_bits() const { return kExponentBitMask & u_.bits_; }
// Returns the fraction bits of this number.
Bits fraction_bits() const { return kFractionBitMask & u_.bits_; }
// Returns the sign bit of this number.
Bits sign_bit() const { return kSignBitMask & u_.bits_; }
// Returns true iff this is NAN (not a number).
bool is_nan() const {
// It's a NAN if the exponent bits are all ones and the fraction
// bits are not entirely zeros.
return (exponent_bits() == kExponentBitMask) && (fraction_bits() != 0);
}
// Returns true iff this number is at most kMaxUlps ULP's away from
// rhs. In particular, this function:
//
// - returns false if either number is (or both are) NAN.
// - treats really large numbers as almost equal to infinity.
// - thinks +0.0 and -0.0 are 0 DLP's apart.
bool AlmostEquals(const FloatingPoint& rhs) const {
// The IEEE standard says that any comparison operation involving
// a NAN must return false.
if (is_nan() || rhs.is_nan()) return false;
return DistanceBetweenSignAndMagnitudeNumbers(u_.bits_, rhs.u_.bits_)
<= kMaxUlps;
}
private:
// The data type used to store the actual floating-point number.
union FloatingPointUnion {
RawType value_; // The raw floating-point number.
Bits bits_; // The bits that represent the number.
};
// Converts an integer from the sign-and-magnitude representation to
// the biased representation. More precisely, let N be 2 to the
// power of (kBitCount - 1), an integer x is represented by the
// unsigned number x + N.
//
// For instance,
//
// -N + 1 (the most negative number representable using
// sign-and-magnitude) is represented by 1;
// 0 is represented by N; and
// N - 1 (the biggest number representable using
// sign-and-magnitude) is represented by 2N - 1.
//
// Read http://en.wikipedia.org/wiki/Signed_number_representations
// for more details on signed number representations.
static Bits SignAndMagnitudeToBiased(const Bits &sam) {
if (kSignBitMask & sam) {
// sam represents a negative number.
return ~sam + 1;
} else {
// sam represents a positive number.
return kSignBitMask | sam;
}
}
// Given two numbers in the sign-and-magnitude representation,
// returns the distance between them as an unsigned number.
static Bits DistanceBetweenSignAndMagnitudeNumbers(const Bits &sam1,
const Bits &sam2) {
const Bits biased1 = SignAndMagnitudeToBiased(sam1);
const Bits biased2 = SignAndMagnitudeToBiased(sam2);
return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
}
FloatingPointUnion u_;
};
EDIT: This post is 4 years old. It's probably still valid, and the code is nice, but some people found improvements. Best go get the latest version of AlmostEquals right from the Google Test source code, and not the one I pasted up here.
For a more in depth approach read Comparing floating point numbers. Here is the code snippet from that link:
// Usable AlmostEqual function
bool AlmostEqual2sComplement(float A, float B, int maxUlps)
{
// Make sure maxUlps is non-negative and small enough that the
// default NAN won't compare as equal to anything.
assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024);
int aInt = *(int*)&A;
// Make aInt lexicographically ordered as a twos-complement int
if (aInt < 0)
aInt = 0x80000000 - aInt;
// Make bInt lexicographically ordered as a twos-complement int
int bInt = *(int*)&B;
if (bInt < 0)
bInt = 0x80000000 - bInt;
int intDiff = abs(aInt - bInt);
if (intDiff <= maxUlps)
return true;
return false;
}
Realizing this is an old thread but this article is one of the most straight forward ones I have found on comparing floating point numbers and if you want to explore more it has more detailed references as well and it the main site covers a complete range of issues dealing with floating point numbers The Floating-Point Guide :Comparison.
We can find a somewhat more practical article in Floating-point tolerances revisited and notes there is absolute tolerance test, which boils down to this in C++:
bool absoluteToleranceCompare(double x, double y)
{
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon() ;
}
and relative tolerance test:
bool relativeToleranceCompare(double x, double y)
{
double maxXY = std::max( std::fabs(x) , std::fabs(y) ) ;
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXY ;
}
The article notes that the absolute test fails when x and y are large and fails in the relative case when they are small. Assuming he absolute and relative tolerance is the same a combined test would look like this:
bool combinedToleranceCompare(double x, double y)
{
double maxXYOne = std::max( { 1.0, std::fabs(x) , std::fabs(y) } ) ;
return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXYOne ;
}
I ended up spending quite some time going through material in this great thread. I doubt everyone wants to spend so much time so I would highlight the summary of what I learned and the solution I implemented.
Quick Summary
Is 1e-8 approximately same as 1e-16? If you are looking at noisy sensor data then probably yes but if you are doing molecular simulation then may be not! Bottom line: You always need to think of tolerance value in context of specific function call and not just make it generic app-wide hard-coded constant.
For general library functions, it's still nice to have parameter with default tolerance. A typical choice is numeric_limits::epsilon() which is same as FLT_EPSILON in float.h. This is however problematic because epsilon for comparing values like 1.0 is not same as epsilon for values like 1E9. The FLT_EPSILON is defined for 1.0.
The obvious implementation to check if number is within tolerance is fabs(a-b) <= epsilon however this doesn't work because default epsilon is defined for 1.0. We need to scale epsilon up or down in terms of a and b.
There are two solution to this problem: either you set epsilon proportional to max(a,b) or you can get next representable numbers around a and then see if b falls into that range. The former is called "relative" method and later is called ULP method.
Both methods actually fails anyway when comparing with 0. In this case, application must supply correct tolerance.
Utility Functions Implementation (C++11)
//implements relative method - do not use for comparing with zero
//use this most of the time, tolerance needs to be meaningful in your context
template<typename TReal>
static bool isApproximatelyEqual(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = std::fabs(a - b);
if (diff <= tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//supply tolerance that is meaningful in your context
//for example, default tolerance may not work if you are comparing double with float
template<typename TReal>
static bool isApproximatelyZero(TReal a, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
if (std::fabs(a) <= tolerance)
return true;
return false;
}
//use this when you want to be on safe side
//for example, don't start rover unless signal is above 1
template<typename TReal>
static bool isDefinitelyLessThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff < tolerance)
return true;
if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
template<typename TReal>
static bool isDefinitelyGreaterThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon())
{
TReal diff = a - b;
if (diff > tolerance)
return true;
if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance)
return true;
return false;
}
//implements ULP method
//use this when you are only concerned about floating point precision issue
//for example, if you want to see if a is 1.0 by checking if its within
//10 closest representable floating point numbers around 1.0.
template<typename TReal>
static bool isWithinPrecisionInterval(TReal a, TReal b, unsigned int interval_size = 1)
{
TReal min_a = a - (a - std::nextafter(a, std::numeric_limits<TReal>::lowest())) * interval_size;
TReal max_a = a + (std::nextafter(a, std::numeric_limits<TReal>::max()) - a) * interval_size;
return min_a <= b && max_a >= b;
}
The portable way to get epsilon in C++ is
#include <limits>
std::numeric_limits<double>::epsilon()
Then the comparison function becomes
#include <cmath>
#include <limits>
bool AreSame(double a, double b) {
return std::fabs(a - b) < std::numeric_limits<double>::epsilon();
}
The code you wrote is bugged :
return (diff < EPSILON) && (-diff > EPSILON);
The correct code would be :
return (diff < EPSILON) && (diff > -EPSILON);
(...and yes this is different)
I wonder if fabs wouldn't make you lose lazy evaluation in some case. I would say it depends on the compiler. You might want to try both. If they are equivalent in average, take the implementation with fabs.
If you have some info on which of the two float is more likely to be bigger than then other, you can play on the order of the comparison to take better advantage of the lazy evaluation.
Finally you might get better result by inlining this function. Not likely to improve much though...
Edit: OJ, thanks for correcting your code. I erased my comment accordingly
`return fabs(a - b) < EPSILON;
This is fine if:
the order of magnitude of your inputs don't change much
very small numbers of opposite signs can be treated as equal
But otherwise it'll lead you into trouble. Double precision numbers have a resolution of about 16 decimal places. If the two numbers you are comparing are larger in magnitude than EPSILON*1.0E16, then you might as well be saying:
return a==b;
I'll examine a different approach that assumes you need to worry about the first issue and assume the second is fine your application. A solution would be something like:
#define VERYSMALL (1.0E-150)
#define EPSILON (1.0E-8)
bool AreSame(double a, double b)
{
double absDiff = fabs(a - b);
if (absDiff < VERYSMALL)
{
return true;
}
double maxAbs = max(fabs(a) - fabs(b));
return (absDiff/maxAbs) < EPSILON;
}
This is expensive computationally, but it is sometimes what is called for. This is what we have to do at my company because we deal with an engineering library and inputs can vary by a few dozen orders of magnitude.
Anyway, the point is this (and applies to practically every programming problem): Evaluate what your needs are, then come up with a solution to address your needs -- don't assume the easy answer will address your needs. If after your evaluation you find that fabs(a-b) < EPSILON will suffice, perfect -- use it! But be aware of its shortcomings and other possible solutions too.
As others have pointed out, using a fixed-exponent epsilon (such as 0.0000001) will be useless for values away from the epsilon value. For example, if your two values are 10000.000977 and 10000, then there are NO 32-bit floating-point values between these two numbers -- 10000 and 10000.000977 are as close as you can possibly get without being bit-for-bit identical. Here, an epsilon of less than 0.0009 is meaningless; you might as well use the straight equality operator.
Likewise, as the two values approach epsilon in size, the relative error grows to 100%.
Thus, trying to mix a fixed point number such as 0.00001 with floating-point values (where the exponent is arbitrary) is a pointless exercise. This will only ever work if you can be assured that the operand values lie within a narrow domain (that is, close to some specific exponent), and if you properly select an epsilon value for that specific test. If you pull a number out of the air ("Hey! 0.00001 is small, so that must be good!"), you're doomed to numerical errors. I've spent plenty of time debugging bad numerical code where some poor schmuck tosses in random epsilon values to make yet another test case work.
If you do numerical programming of any kind and believe you need to reach for fixed-point epsilons, READ BRUCE'S ARTICLE ON COMPARING FLOATING-POINT NUMBERS.
Comparing Floating Point Numbers
Here's proof that using std::numeric_limits::epsilon() is not the answer — it fails for values greater than one:
Proof of my comment above:
#include <stdio.h>
#include <limits>
double ItoD (__int64 x) {
// Return double from 64-bit hexadecimal representation.
return *(reinterpret_cast<double*>(&x));
}
void test (__int64 ai, __int64 bi) {
double a = ItoD(ai), b = ItoD(bi);
bool close = std::fabs(a-b) < std::numeric_limits<double>::epsilon();
printf ("%.16f and %.16f %s close.\n", a, b, close ? "are " : "are not");
}
int main()
{
test (0x3fe0000000000000L,
0x3fe0000000000001L);
test (0x3ff0000000000000L,
0x3ff0000000000001L);
}
Running yields this output:
0.5000000000000000 and 0.5000000000000001 are close.
1.0000000000000000 and 1.0000000000000002 are not close.
Note that in the second case (one and just larger than one), the two input values are as close as they can possibly be, and still compare as not close. Thus, for values greater than 1.0, you might as well just use an equality test. Fixed epsilons will not save you when comparing floating-point values.
Qt implements two functions, maybe you can learn from them:
static inline bool qFuzzyCompare(double p1, double p2)
{
return (qAbs(p1 - p2) <= 0.000000000001 * qMin(qAbs(p1), qAbs(p2)));
}
static inline bool qFuzzyCompare(float p1, float p2)
{
return (qAbs(p1 - p2) <= 0.00001f * qMin(qAbs(p1), qAbs(p2)));
}
And you may need the following functions, since
Note that comparing values where either p1 or p2 is 0.0 will not work,
nor does comparing values where one of the values is NaN or infinity.
If one of the values is always 0.0, use qFuzzyIsNull instead. If one
of the values is likely to be 0.0, one solution is to add 1.0 to both
values.
static inline bool qFuzzyIsNull(double d)
{
return qAbs(d) <= 0.000000000001;
}
static inline bool qFuzzyIsNull(float f)
{
return qAbs(f) <= 0.00001f;
}
Unfortunately, even your "wasteful" code is incorrect. EPSILON is the smallest value that could be added to 1.0 and change its value. The value 1.0 is very important — larger numbers do not change when added to EPSILON. Now, you can scale this value to the numbers you are comparing to tell whether they are different or not. The correct expression for comparing two doubles is:
if (fabs(a - b) <= DBL_EPSILON * fmax(fabs(a), fabs(b)))
{
// ...
}
This is at a minimum. In general, though, you would want to account for noise in your calculations and ignore a few of the least significant bits, so a more realistic comparison would look like:
if (fabs(a - b) <= 16 * DBL_EPSILON * fmax(fabs(a), fabs(b)))
{
// ...
}
If comparison performance is very important to you and you know the range of your values, then you should use fixed-point numbers instead.
General-purpose comparison of floating-point numbers is generally meaningless. How to compare really depends on a problem at hand. In many problems, numbers are sufficiently discretized to allow comparing them within a given tolerance. Unfortunately, there are just as many problems, where such trick doesn't really work. For one example, consider working with a Heaviside (step) function of a number in question (digital stock options come to mind) when your observations are very close to the barrier. Performing tolerance-based comparison wouldn't do much good, as it would effectively shift the issue from the original barrier to two new ones. Again, there is no general-purpose solution for such problems and the particular solution might require going as far as changing the numerical method in order to achieve stability.
You have to do this processing for floating point comparison, since float's can't be perfectly compared like integer types. Here are functions for the various comparison operators.
Floating Point Equal to (==)
I also prefer the subtraction technique rather than relying on fabs() or abs(), but I'd have to speed profile it on various architectures from 64-bit PC to ATMega328 microcontroller (Arduino) to really see if it makes much of a performance difference.
So, let's forget about all this absolute value stuff and just do some subtraction and comparison!
Modified from Microsoft's example here:
/// #brief See if two floating point numbers are approximately equal.
/// #param[in] a number 1
/// #param[in] b number 2
/// #param[in] epsilon A small value such that if the difference between the two numbers is
/// smaller than this they can safely be considered to be equal.
/// #return true if the two numbers are approximately equal, and false otherwise
bool is_float_eq(float a, float b, float epsilon) {
return ((a - b) < epsilon) && ((b - a) < epsilon);
}
bool is_double_eq(double a, double b, double epsilon) {
return ((a - b) < epsilon) && ((b - a) < epsilon);
}
Example usage:
constexpr float EPSILON = 0.0001; // 1e-4
is_float_eq(1.0001, 0.99998, EPSILON);
I'm not entirely sure, but it seems to me some of the criticisms of the epsilon-based approach, as described in the comments below this highly-upvoted answer, can be resolved by using a variable epsilon, scaled according to the floating point values being compared, like this:
float a = 1.0001;
float b = 0.99998;
float epsilon = std::max(std::fabs(a), std::fabs(b)) * 1e-4;
is_float_eq(a, b, epsilon);
This way, the epsilon value scales with the floating point values and is therefore never so small of a value that it becomes insignificant.
For completeness, let's add the rest:
Greater than (>), and less than (<):
/// #brief See if floating point number `a` is > `b`
/// #param[in] a number 1
/// #param[in] b number 2
/// #param[in] epsilon a small value such that if `a` is > `b` by this amount, `a` is considered
/// to be definitively > `b`
/// #return true if `a` is definitively > `b`, and false otherwise
bool is_float_gt(float a, float b, float epsilon) {
return a > b + epsilon;
}
bool is_double_gt(double a, double b, double epsilon) {
return a > b + epsilon;
}
/// #brief See if floating point number `a` is < `b`
/// #param[in] a number 1
/// #param[in] b number 2
/// #param[in] epsilon a small value such that if `a` is < `b` by this amount, `a` is considered
/// to be definitively < `b`
/// #return true if `a` is definitively < `b`, and false otherwise
bool is_float_lt(float a, float b, float epsilon) {
return a < b - epsilon;
}
bool is_double_lt(double a, double b, double epsilon) {
return a < b - epsilon;
}
Greater than or equal to (>=), and less than or equal to (<=)
/// #brief Returns true if `a` is definitively >= `b`, and false otherwise
bool is_float_ge(float a, float b, float epsilon) {
return a > b - epsilon;
}
bool is_double_ge(double a, double b, double epsilon) {
return a > b - epsilon;
}
/// #brief Returns true if `a` is definitively <= `b`, and false otherwise
bool is_float_le(float a, float b, float epsilon) {
return a < b + epsilon;
}
bool is_double_le(double a, double b, double epsilon) {
return a < b + epsilon;
}
Additional improvements:
A good default value for epsilon in C++ is std::numeric_limits<T>::epsilon(), which evaluates to either 0 or FLT_EPSILON, DBL_EPSILON, or LDBL_EPSILON. See here: https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon. You can also see the float.h header for FLT_EPSILON, DBL_EPSILON, and LDBL_EPSILON.
See https://en.cppreference.com/w/cpp/header/cfloat and
https://www.cplusplus.com/reference/cfloat/
You could template the functions instead, to handle all floating point types: float, double, and long double, with type checks for these types via a static_assert() inside the template.
Scaling the epsilon value is a good idea to ensure it works for really large and really small a and b values. This article recommends and explains it: http://realtimecollisiondetection.net/blog/?p=89. So, you should scale epsilon by a scaling value equal to max(1.0, abs(a), abs(b)), as that article explains. Otherwise, as a and/or b increase in magnitude, the epsilon would eventually become so small relative to those values that it becomes lost in the floating point error. So, we scale it to become larger in magnitude like they are. However, using 1.0 as the smallest allowed scaling factor for epsilon also ensures that for really small-magnitude a and b values, epsilon itself doesn't get scaled so small that it also becomes lost in the floating point error. So, we limit the minimum scaling factor to 1.0.
If you want to "encapsulate" the above functions into a class, don't. Instead, wrap them up in a namespace if you like in order to namespace them. Ex: if you put all of the stand-alone functions into a namespace called float_comparison, then you could access the is_eq() function like this, for instance: float_comparison::is_eq(1.0, 1.5);.
It might also be nice to add comparisons against zero, not just comparisons between two values.
So, here is a better type of solution with the above improvements in place:
namespace float_comparison {
/// Scale the epsilon value to become large for large-magnitude a or b,
/// but no smaller than 1.0, per the explanation above, to ensure that
/// epsilon doesn't ever fall out in floating point error as a and/or b
/// increase in magnitude.
template<typename T>
static constexpr T scale_epsilon(T a, T b, T epsilon =
std::numeric_limits<T>::epsilon()) noexcept
{
static_assert(std::is_floating_point_v<T>, "Floating point comparisons "
"require type float, double, or long double.");
T scaling_factor;
// Special case for when a or b is infinity
if (std::isinf(a) || std::isinf(b))
{
scaling_factor = 0;
}
else
{
scaling_factor = std::max({(T)1.0, std::abs(a), std::abs(b)});
}
T epsilon_scaled = scaling_factor * std::abs(epsilon);
return epsilon_scaled;
}
// Compare two values
/// Equal: returns true if a is approximately == b, and false otherwise
template<typename T>
static constexpr bool is_eq(T a, T b, T epsilon =
std::numeric_limits<T>::epsilon()) noexcept
{
static_assert(std::is_floating_point_v<T>, "Floating point comparisons "
"require type float, double, or long double.");
// test `a == b` first to see if both a and b are either infinity
// or -infinity
return a == b || std::abs(a - b) <= scale_epsilon(a, b, epsilon);
}
/*
etc. etc.:
is_eq()
is_ne()
is_lt()
is_le()
is_gt()
is_ge()
*/
// Compare against zero
/// Equal: returns true if a is approximately == 0, and false otherwise
template<typename T>
static constexpr bool is_eq_zero(T a, T epsilon =
std::numeric_limits<T>::epsilon()) noexcept
{
static_assert(std::is_floating_point_v<T>, "Floating point comparisons "
"require type float, double, or long double.");
return is_eq(a, (T)0.0, epsilon);
}
/*
etc. etc.:
is_eq_zero()
is_ne_zero()
is_lt_zero()
is_le_zero()
is_gt_zero()
is_ge_zero()
*/
} // namespace float_comparison
See also:
The macro forms of some of the functions above in my repo here: utilities.h.
UPDATE 29 NOV 2020: it's a work-in-progress, and I'm going to make it a separate answer when ready, but I've produced a better, scaled-epsilon version of all of the functions in C in this file here: utilities.c. Take a look.
ADDITIONAL READING I need to do now have done: Floating-point tolerances revisited, by Christer Ericson. VERY USEFUL ARTICLE! It talks about scaling epsilon in order to ensure it never falls out in floating point error, even for really large-magnitude a and/or b values!
My class based on previously posted answers. Very similar to Google's code but I use a bias which pushes all NaN values above 0xFF000000. That allows a faster check for NaN.
This code is meant to demonstrate the concept, not be a general solution. Google's code already shows how to compute all the platform specific values and I didn't want to duplicate all that. I've done limited testing on this code.
typedef unsigned int U32;
// Float Memory Bias (unsigned)
// ----- ------ ---------------
// NaN 0xFFFFFFFF 0xFF800001
// NaN 0xFF800001 0xFFFFFFFF
// -Infinity 0xFF800000 0x00000000 ---
// -3.40282e+038 0xFF7FFFFF 0x00000001 |
// -1.40130e-045 0x80000001 0x7F7FFFFF |
// -0.0 0x80000000 0x7F800000 |--- Valid <= 0xFF000000.
// 0.0 0x00000000 0x7F800000 | NaN > 0xFF000000
// 1.40130e-045 0x00000001 0x7F800001 |
// 3.40282e+038 0x7F7FFFFF 0xFEFFFFFF |
// Infinity 0x7F800000 0xFF000000 ---
// NaN 0x7F800001 0xFF000001
// NaN 0x7FFFFFFF 0xFF7FFFFF
//
// Either value of NaN returns false.
// -Infinity and +Infinity are not "close".
// -0 and +0 are equal.
//
class CompareFloat{
public:
union{
float m_f32;
U32 m_u32;
};
static bool CompareFloat::IsClose( float A, float B, U32 unitsDelta = 4 )
{
U32 a = CompareFloat::GetBiased( A );
U32 b = CompareFloat::GetBiased( B );
if ( (a > 0xFF000000) || (b > 0xFF000000) )
{
return( false );
}
return( (static_cast<U32>(abs( a - b ))) < unitsDelta );
}
protected:
static U32 CompareFloat::GetBiased( float f )
{
U32 r = ((CompareFloat*)&f)->m_u32;
if ( r & 0x80000000 )
{
return( ~r - 0x007FFFFF );
}
return( r + 0x7F800000 );
}
};
I'd be very wary of any of these answers that involves floating point subtraction (e.g., fabs(a-b) < epsilon). First, the floating point numbers become more sparse at greater magnitudes and at high enough magnitudes where the spacing is greater than epsilon, you might as well just be doing a == b. Second, subtracting two very close floating point numbers (as these will tend to be, given that you're looking for near equality) is exactly how you get catastrophic cancellation.
While not portable, I think grom's answer does the best job of avoiding these issues.
There are actually cases in numerical software where you want to check whether two floating point numbers are exactly equal. I posted this on a similar question
https://stackoverflow.com/a/10973098/1447411
So you can not say that "CompareDoubles1" is wrong in general.
In terms of the scale of quantities:
If epsilon is the small fraction of the magnitude of quantity (i.e. relative value) in some certain physical sense and A and B types is comparable in the same sense, than I think, that the following is quite correct:
#include <limits>
#include <iomanip>
#include <iostream>
#include <cmath>
#include <cstdlib>
#include <cassert>
template< typename A, typename B >
inline
bool close_enough(A const & a, B const & b,
typename std::common_type< A, B >::type const & epsilon)
{
using std::isless;
assert(isless(0, epsilon)); // epsilon is a part of the whole quantity
assert(isless(epsilon, 1));
using std::abs;
auto const delta = abs(a - b);
auto const x = abs(a);
auto const y = abs(b);
// comparable generally and |a - b| < eps * (|a| + |b|) / 2
return isless(epsilon * y, x) && isless(epsilon * x, y) && isless((delta + delta) / (x + y), epsilon);
}
int main()
{
std::cout << std::boolalpha << close_enough(0.9, 1.0, 0.1) << std::endl;
std::cout << std::boolalpha << close_enough(1.0, 1.1, 0.1) << std::endl;
std::cout << std::boolalpha << close_enough(1.1, 1.2, 0.01) << std::endl;
std::cout << std::boolalpha << close_enough(1.0001, 1.0002, 0.01) << std::endl;
std::cout << std::boolalpha << close_enough(1.0, 0.01, 0.1) << std::endl;
return EXIT_SUCCESS;
}
I use this code:
bool AlmostEqual(double v1, double v2)
{
return (std::fabs(v1 - v2) < std::fabs(std::min(v1, v2)) * std::numeric_limits<double>::epsilon());
}
Found another interesting implementation on: https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon
#include <cmath>
#include <limits>
#include <iomanip>
#include <iostream>
#include <type_traits>
#include <algorithm>
template<class T>
typename std::enable_if<!std::numeric_limits<T>::is_integer, bool>::type
almost_equal(T x, T y, int ulp)
{
// the machine epsilon has to be scaled to the magnitude of the values used
// and multiplied by the desired precision in ULPs (units in the last place)
return std::fabs(x-y) <= std::numeric_limits<T>::epsilon() * std::fabs(x+y) * ulp
// unless the result is subnormal
|| std::fabs(x-y) < std::numeric_limits<T>::min();
}
int main()
{
double d1 = 0.2;
double d2 = 1 / std::sqrt(5) / std::sqrt(5);
std::cout << std::fixed << std::setprecision(20)
<< "d1=" << d1 << "\nd2=" << d2 << '\n';
if(d1 == d2)
std::cout << "d1 == d2\n";
else
std::cout << "d1 != d2\n";
if(almost_equal(d1, d2, 2))
std::cout << "d1 almost equals d2\n";
else
std::cout << "d1 does not almost equal d2\n";
}
In a more generic way:
template <typename T>
bool compareNumber(const T& a, const T& b) {
return std::abs(a - b) < std::numeric_limits<T>::epsilon();
}
Note:
As pointed out by #SirGuy, this approach is flawed.
I am leaving this answer here as an example not to follow.
I use this code. Unlike the above answers this allows one to
give a abs_relative_error that is explained in the comments of the code.
The first version compares complex numbers, so that the error
can be explained in terms of the angle between two "vectors"
of the same length in the complex plane (which gives a little
insight). Then from there the correct formula for two real
numbers follows.
https://github.com/CarloWood/ai-utils/blob/master/almost_equal.h
The latter then is
template<class T>
typename std::enable_if<std::is_floating_point<T>::value, bool>::type
almost_equal(T x, T y, T const abs_relative_error)
{
return 2 * std::abs(x - y) <= abs_relative_error * std::abs(x + y);
}
where abs_relative_error is basically (twice) the absolute value of what comes closest to being defined in the literature: a relative error. But that is just the choice of the name.
What it really is seen most clearly in the complex plane I think. If |x| = 1, and y lays in a circle around x with diameter abs_relative_error, then the two are considered equal.
I use the following function for floating-point numbers comparison:
bool approximatelyEqual(double a, double b)
{
return fabs(a - b) <= ((fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * std::numeric_limits<double>::epsilon());
}
It depends on how precise you want the comparison to be. If you want to compare for exactly the same number, then just go with ==. (You almost never want to do this unless you actually want exactly the same number.) On any decent platform you can also do the following:
diff= a - b; return fabs(diff)<EPSILON;
as fabs tends to be pretty fast. By pretty fast I mean it is basically a bitwise AND, so it better be fast.
And integer tricks for comparing doubles and floats are nice but tend to make it more difficult for the various CPU pipelines to handle effectively. And it's definitely not faster on certain in-order architectures these days due to using the stack as a temporary storage area for values that are being used frequently. (Load-hit-store for those who care.)
/// testing whether two doubles are almost equal. We consider two doubles
/// equal if the difference is within the range [0, epsilon).
///
/// epsilon: a positive number (supposed to be small)
///
/// if either x or y is 0, then we are comparing the absolute difference to
/// epsilon.
/// if both x and y are non-zero, then we are comparing the relative difference
/// to epsilon.
bool almost_equal(double x, double y, double epsilon)
{
double diff = x - y;
if (x != 0 && y != 0){
diff = diff/y;
}
if (diff < epsilon && -1.0*diff < epsilon){
return true;
}
return false;
}
I used this function for my small project and it works, but note the following:
Double precision error can create a surprise for you. Let's say epsilon = 1.0e-6, then 1.0 and 1.000001 should NOT be considered equal according to the above code, but on my machine the function considers them to be equal, this is because 1.000001 can not be precisely translated to a binary format, it is probably 1.0000009xxx. I test it with 1.0 and 1.0000011 and this time I get the expected result.
You cannot compare two double with a fixed EPSILON. Depending on the value of double, EPSILON varies.
A better double comparison would be:
bool same(double a, double b)
{
return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
&& std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}
My way may not be correct but useful
Convert both float to strings and then do string compare
bool IsFlaotEqual(float a, float b, int decimal)
{
TCHAR form[50] = _T("");
_stprintf(form, _T("%%.%df"), decimal);
TCHAR a1[30] = _T(""), a2[30] = _T("");
_stprintf(a1, form, a);
_stprintf(a2, form, b);
if( _tcscmp(a1, a2) == 0 )
return true;
return false;
}
operator overlaoding can also be done
I write this for java, but maybe you find it useful. It uses longs instead of doubles, but takes care of NaNs, subnormals, etc.
public static boolean equal(double a, double b) {
final long fm = 0xFFFFFFFFFFFFFL; // fraction mask
final long sm = 0x8000000000000000L; // sign mask
final long cm = 0x8000000000000L; // most significant decimal bit mask
long c = Double.doubleToLongBits(a), d = Double.doubleToLongBits(b);
int ea = (int) (c >> 52 & 2047), eb = (int) (d >> 52 & 2047);
if (ea == 2047 && (c & fm) != 0 || eb == 2047 && (d & fm) != 0) return false; // NaN
if (c == d) return true; // identical - fast check
if (ea == 0 && eb == 0) return true; // ±0 or subnormals
if ((c & sm) != (d & sm)) return false; // different signs
if (abs(ea - eb) > 1) return false; // b > 2*a or a > 2*b
d <<= 12; c <<= 12;
if (ea < eb) c = c >> 1 | sm;
else if (ea > eb) d = d >> 1 | sm;
c -= d;
return c < 65536 && c > -65536; // don't use abs(), because:
// There is a posibility c=0x8000000000000000 which cannot be converted to positive
}
public static boolean zero(double a) { return (Double.doubleToLongBits(a) >> 52 & 2047) < 3; }
Keep in mind that after a number of floating-point operations, number can be very different from what we expect. There is no code to fix that.
Related
Rounding off floating numbers in cpp
For a particular question, I need to perform calculations on a floating number, round it off to 2 digits after the decimal place, and assign it to a variable for comparison purposes. I tried to find a solution to this but all I keep finding is how to print those rounded numbers (using printf or setprecision) instead of assigning them to a variable. Please help.
I usually do something like that: #include <cmath> // 'std::floor' #include <limits> // 'std::numeric_limits' // Round value to granularity template<typename T> inline T round(const T x, const T gran) { //static_assert(gran!=0); return gran * std::floor( x/gran + std::numeric_limits<T>::round_error() ); } double rounded_to_cent = round(1.23456, 0.01); // Gives something near 1.23 Be sure you know how floating point types work though. Addendum: I know that this topic has already been extensively covered in other questions, but let me put this small paragraph here. Given a real number, you can represent it with -almost- arbitrary accuracy with a (base10) literal like 1.2345, that's a string that you can type with your keyboard. When you store that value in a floating point type, let's say a double, you -almost- always loose accuracy because probably your number won't have an exact representation in the finite set of the numbers representable by that type. Nowadays double uses 64 bits, so it has 2^64 symbols to represent the not numerable infinity of real numbers: that's a H2O molecule in an infinity of infinite oceans. The representation error is relative to the value; for example in a IEEE 754 double, over 2^53 not all the integer values can be represented. So when someone tells that the "result is wrong" they're technically right; the "acceptable" result is application dependent.
round it off to 2 digits after the decimal place, and assign it to a variable for comparison purposes To avoid errors that creep in when using binary floating point in a decimal problem, consider alternatives. Direct approach has corner errors due to double rounding and overflow. These errors may be tolerable for OP larger goals // Errors: // 1) x*100.0, round(x*100.0)/100.0 inexact. // Select `x` values near "dddd.dd5" form an inexact product `x*100.0` // and may lead to a double rounding error and then incorrect result when comparing. // 2) x*100.0 may overflow. int compare_hundredth1(double x, double ref) { x = round(x*100.0)/100.0; return (x > ref) - (x < ref); } We can do better. When a wider floating point type exist: int compare_hundredth2(double x, double ref) { auto x_rounded = math::round(x*100.0L); auto ref_rounded = ref*100.0L; return (x_rounded > ref_rounded) - (x_rounded < ref_rounded); } To use the same width floating point type takes more work: All finite large larges of x, ref are whole numbers and need no rounding to the nearest 0.01. int compare_hundredth3(double x, double ref) { double x_whole; auto x_fraction = modf(x, &x_whole); // If rounding needed ... if (x_fraction != 0.0) { if (x - 0.01 > ref) return 1; // x much more than ref if (x + 0.01 < ref) return -1; // x much less than ref // x, ref nearly the same double ref_whole; auto ref_fraction = modf(x, &ref_whole); x -= ref_whole; auto x100 = (x - ref_whole)*100; // subtraction expected to be exact here. auto ref100 = ref_fraction*100; return (x100 > ref100) - (x100 < ref100); } return (x > ref) - (x < ref); } The above assume ref is without error. If this is not so, consider using a scaled ref. Note: The above sets aside not-a-number concerns. More clean-up later.
Here's an example with a custom function that rounds up the floating number f to n decimal places. Basically, it multiplies the floating number by 10 to the power of N to separate the decimal places, then uses roundf to round the decimal places up or down, and finally divides back the floating number by 10 to the power of N (N is the amount of decimal places). Works for C and C++: #include <stdio.h> #include <math.h> float my_round(float f, unsigned int n) { float p = powf(10.0f, (float)n); f *= p; f = roundf(f); f /= p; return f; } int main() { float f = 0.78901f; printf("%f\n", f); f = my_round(f, 2); /* Round with 2 decimal places */ printf("%f\n", f); return 0; } Output: 0.789010 0.790000
Iterate though all possible floating-point values, starting from lowest
I am writing a unit test for a math function and I would like to be able to "walk" all possible floats/doubles. Due to IEEE shenanigans, floating types cannot be incremented (++) at their extremities. See this question for more details. That answer states : one can only add multiples of 2^(n-N) But never mentions what little n is. A solution to iterate all possible values from +0.0 to +infinity is given in this great blog post. The technique involves using a union with an int to walk the different values of a float. This works due to the following properties explained in the post, though they are only valid for positive numbers. Adjacent floats have adjacent integer representations Incrementing the integer representation of a float moves to the next representable float, moving away from zero His solution for +0.0 to +infinity (0.f to std::numeric_limits<float>::max()) : union Float_t { int32_t RawExponent() const { return (i >> 23) & 0xFF; } int32_t i; float f; }; Float_t allFloats; allFloats.f = 0.0f; while (allFloats.RawExponent() < 255) { allFloats.i += 1; } Is there a solution for -infinity to +0.0 (std::numeric_limits<float>::lowest() to 0.f)? I've tested std::nextafter and std::nexttoward and couldn't get them to work. Maybe this is an MSVC issue? I would be ok with any sort of hack since this is a unit test. Thanks!
You can walk all 32-bit bit representations by using all values of a 32-bit unsigned int. Then you will walk really all representations, positive and negative, including both nulls (there are two) and also all the not a number representations (NaN). You may or may not want to filter out the NaN representations, or just filter out the signaling ones and leave the non signaling ones in. This depends on your use case. Example: for (uint32_t i = 0;;) { float f; // Type punning: Force the bit representation of i into f. // Type punning is hard because mostly undefined in C/C++. // Using memcpy() usually avoids any type punning warning. memcpy(&f, &i, sizeof(f)); // Use f here. // Warning: Using signaling NaNs may throw exceptions or raise signals. i++; if (i == 0) break; } Instead you can also walk a 32-bit int from -2**31 to +(2**31-1). This makes no difference.
Pascal Cuoq correctly points out std::nextafter is the right solution. I had a problem elsewhere in my code. Sorry for the unnecessary question. #include <cassert> #include <cmath> #include <limits> float i = std::numeric_limits<float>::lowest(); float hi = std::numeric_limits<float>::max(); float new_i = std::nextafterf(i, hi); assert(i != new_i); double d = std::numeric_limits<double>::lowest(); double hi_d = std::numeric_limits<double>::max(); double new_d = std::nextafter(d, hi_d); assert(d != new_d); long double ld = std::numeric_limits<long double>::lowest(); long double hi_ld = std::numeric_limits<long double>::max(); long double new_ld = std::nextafterl(ld, hi_ld); assert(ld != new_ld); for (float d = std::numeric_limits<float>::lowest(); d < std::numeric_limits<float>::max(); d = std::nextafterf( d, std::numeric_limits<float>::max())) { // Wait a lifetime? }
Iterating through all the float values can be done with simple understanding of the floating-point representation: The distance between consecutive subnormal values is the minimum normal times the “epsilon”. Simply iterate through all the subnormals using this distance as an increment. The distance between the normal values at the lowest exponent is the same. Step through them with the same increment. For each exponent, the distance increases according to the floating-point radix. Simply multiply the increment by the radix and step through all the values for the next exponent. Repeat until infinity is reached. Observe that the inner loop in the code below is simply: for (; x < Limit; x += Increment) Test(x); This has the advantage that only normal floating-point arithmetic is used. The inner loop contains only one addition and one comparison (plus any tests you want to perform with each number). No library functions are called in the loop, no representations are dissected or copied to general registers or otherwise manipulated. There is nothing to impede performance. This code steps through only the non-negative numbers. The negative numbers can be tested separately in the same way or can share this code by inserting a call Test(-x). #include <limits> static void Test(float x) { // Insert unit test for value x here. } int main(void) { typedef float T; static const int Radix = std::numeric_limits<T>::radix; static const T Infinity = std::numeric_limits<T>::infinity(); /* Increment is the current distance between floating-point numbers. We start it at distance between subnormal numbers. */ T Increment = std::numeric_limits<T>::min() * std::numeric_limits<T>::epsilon(); /* Limit is the next boundary where the distance between floating-point numbers changes. We will increment up to that limit and then adjust the limit and increment. We start it at the top of the first set of normals, which allows the first loop to increment first through the subnormals and then through the normals with the lowest exponent. (These two sets have the same step size between adjacent values.) */ T Limit = std::numeric_limits<T>::min() * Radix; /* Start with zero and continue until we reach infinity. We execute an inner loop that iterates through all the significands of one floating-point exponent. Each time it completes, we step up the limit and increment. */ for (T x = 0; x < Infinity; Limit *= Radix, Increment *= Radix) // Increment x through all the significands with the current exponent. for (; x < Limit; x += Increment) // Test with the current value of x. Test(x); // Also test infinity. Test(Infinity); } (This code assumes the floating-point type has subnormals, and that they are not flushed to zero. The code can be readily adjusted to support these alternatives as well.)
float comparsion (operator !=) returns false, but numbers are equal although the numbers appear to be the same [duplicate]
What would be the most efficient way to compare two double or two float values? Simply doing this is not correct: bool CompareDoubles1 (double A, double B) { return A == B; } But something like: bool CompareDoubles2 (double A, double B) { diff = A - B; return (diff < EPSILON) && (-diff < EPSILON); } Seems to waste processing. Does anyone know a smarter float comparer?
Be extremely careful using any of the other suggestions. It all depends on context. I have spent a long time tracing bugs in a system that presumed a==b if |a-b|<epsilon. The underlying problems were: The implicit presumption in an algorithm that if a==b and b==c then a==c. Using the same epsilon for lines measured in inches and lines measured in mils (.001 inch). That is a==b but 1000a!=1000b. (This is why AlmostEqual2sComplement asks for the epsilon or max ULPS). The use of the same epsilon for both the cosine of angles and the length of lines! Using such a compare function to sort items in a collection. (In this case using the builtin C++ operator == for doubles produced correct results.) Like I said: it all depends on context and the expected size of a and b. By the way, std::numeric_limits<double>::epsilon() is the "machine epsilon". It is the difference between 1.0 and the next value representable by a double. I guess that it could be used in the compare function but only if the expected values are less than 1. (This is in response to #cdv's answer...) Also, if you basically have int arithmetic in doubles (here we use doubles to hold int values in certain cases) your arithmetic will be correct. For example 4.0/2.0 will be the same as 1.0+1.0. This is as long as you do not do things that result in fractions (4.0/3.0) or do not go outside of the size of an int.
The comparison with an epsilon value is what most people do (even in game programming). You should change your implementation a little though: bool AreSame(double a, double b) { return fabs(a - b) < EPSILON; } Edit: Christer has added a stack of great info on this topic on a recent blog post. Enjoy.
Comparing floating point numbers for depends on the context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be. Comparing floating point numbers by Bruce Dawson is a good place to start when looking at floating point comparison. The following definitions are from The art of computer programming by Knuth: bool approximatelyEqual(float a, float b, float epsilon) { return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon); } bool essentiallyEqual(float a, float b, float epsilon) { return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon); } bool definitelyGreaterThan(float a, float b, float epsilon) { return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon); } bool definitelyLessThan(float a, float b, float epsilon) { return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon); } Of course, choosing epsilon depends on the context, and determines how equal you want the numbers to be. Another method of comparing floating point numbers is to look at the ULP (units in last place) of the numbers. While not dealing specifically with comparisons, the paper What every computer scientist should know about floating point numbers is a good resource for understanding how floating point works and what the pitfalls are, including what ULP is.
I found that the Google C++ Testing Framework contains a nice cross-platform template-based implementation of AlmostEqual2sComplement which works on both doubles and floats. Given that it is released under the BSD license, using it in your own code should be no problem, as long as you retain the license. I extracted the below code from http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h https://github.com/google/googletest/blob/master/googletest/include/gtest/internal/gtest-internal.h and added the license on top. Be sure to #define GTEST_OS_WINDOWS to some value (or to change the code where it's used to something that fits your codebase - it's BSD licensed after all). Usage example: double left = // something double right = // something const FloatingPoint<double> lhs(left), rhs(right); if (lhs.AlmostEquals(rhs)) { //they're equal! } Here's the code: // Copyright 2005, Google Inc. // All rights reserved. // // Redistribution and use in source and binary forms, with or without // modification, are permitted provided that the following conditions are // met: // // * Redistributions of source code must retain the above copyright // notice, this list of conditions and the following disclaimer. // * Redistributions in binary form must reproduce the above // copyright notice, this list of conditions and the following disclaimer // in the documentation and/or other materials provided with the // distribution. // * Neither the name of Google Inc. nor the names of its // contributors may be used to endorse or promote products derived from // this software without specific prior written permission. // // THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS // "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT // LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR // A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT // OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, // SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT // LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, // DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY // THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT // (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE // OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // // Authors: wan#google.com (Zhanyong Wan), eefacm#gmail.com (Sean Mcafee) // // The Google C++ Testing Framework (Google Test) // This template class serves as a compile-time function from size to // type. It maps a size in bytes to a primitive type with that // size. e.g. // // TypeWithSize<4>::UInt // // is typedef-ed to be unsigned int (unsigned integer made up of 4 // bytes). // // Such functionality should belong to STL, but I cannot find it // there. // // Google Test uses this class in the implementation of floating-point // comparison. // // For now it only handles UInt (unsigned int) as that's all Google Test // needs. Other types can be easily added in the future if need // arises. template <size_t size> class TypeWithSize { public: // This prevents the user from using TypeWithSize<N> with incorrect // values of N. typedef void UInt; }; // The specialization for size 4. template <> class TypeWithSize<4> { public: // unsigned int has size 4 in both gcc and MSVC. // // As base/basictypes.h doesn't compile on Windows, we cannot use // uint32, uint64, and etc here. typedef int Int; typedef unsigned int UInt; }; // The specialization for size 8. template <> class TypeWithSize<8> { public: #if GTEST_OS_WINDOWS typedef __int64 Int; typedef unsigned __int64 UInt; #else typedef long long Int; // NOLINT typedef unsigned long long UInt; // NOLINT #endif // GTEST_OS_WINDOWS }; // This template class represents an IEEE floating-point number // (either single-precision or double-precision, depending on the // template parameters). // // The purpose of this class is to do more sophisticated number // comparison. (Due to round-off error, etc, it's very unlikely that // two floating-points will be equal exactly. Hence a naive // comparison by the == operation often doesn't work.) // // Format of IEEE floating-point: // // The most-significant bit being the leftmost, an IEEE // floating-point looks like // // sign_bit exponent_bits fraction_bits // // Here, sign_bit is a single bit that designates the sign of the // number. // // For float, there are 8 exponent bits and 23 fraction bits. // // For double, there are 11 exponent bits and 52 fraction bits. // // More details can be found at // http://en.wikipedia.org/wiki/IEEE_floating-point_standard. // // Template parameter: // // RawType: the raw floating-point type (either float or double) template <typename RawType> class FloatingPoint { public: // Defines the unsigned integer type that has the same size as the // floating point number. typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits; // Constants. // # of bits in a number. static const size_t kBitCount = 8*sizeof(RawType); // # of fraction bits in a number. static const size_t kFractionBitCount = std::numeric_limits<RawType>::digits - 1; // # of exponent bits in a number. static const size_t kExponentBitCount = kBitCount - 1 - kFractionBitCount; // The mask for the sign bit. static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1); // The mask for the fraction bits. static const Bits kFractionBitMask = ~static_cast<Bits>(0) >> (kExponentBitCount + 1); // The mask for the exponent bits. static const Bits kExponentBitMask = ~(kSignBitMask | kFractionBitMask); // How many ULP's (Units in the Last Place) we want to tolerate when // comparing two numbers. The larger the value, the more error we // allow. A 0 value means that two numbers must be exactly the same // to be considered equal. // // The maximum error of a single floating-point operation is 0.5 // units in the last place. On Intel CPU's, all floating-point // calculations are done with 80-bit precision, while double has 64 // bits. Therefore, 4 should be enough for ordinary use. // // See the following article for more details on ULP: // http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm. static const size_t kMaxUlps = 4; // Constructs a FloatingPoint from a raw floating-point number. // // On an Intel CPU, passing a non-normalized NAN (Not a Number) // around may change its bits, although the new value is guaranteed // to be also a NAN. Therefore, don't expect this constructor to // preserve the bits in x when x is a NAN. explicit FloatingPoint(const RawType& x) { u_.value_ = x; } // Static methods // Reinterprets a bit pattern as a floating-point number. // // This function is needed to test the AlmostEquals() method. static RawType ReinterpretBits(const Bits bits) { FloatingPoint fp(0); fp.u_.bits_ = bits; return fp.u_.value_; } // Returns the floating-point number that represent positive infinity. static RawType Infinity() { return ReinterpretBits(kExponentBitMask); } // Non-static methods // Returns the bits that represents this number. const Bits &bits() const { return u_.bits_; } // Returns the exponent bits of this number. Bits exponent_bits() const { return kExponentBitMask & u_.bits_; } // Returns the fraction bits of this number. Bits fraction_bits() const { return kFractionBitMask & u_.bits_; } // Returns the sign bit of this number. Bits sign_bit() const { return kSignBitMask & u_.bits_; } // Returns true iff this is NAN (not a number). bool is_nan() const { // It's a NAN if the exponent bits are all ones and the fraction // bits are not entirely zeros. return (exponent_bits() == kExponentBitMask) && (fraction_bits() != 0); } // Returns true iff this number is at most kMaxUlps ULP's away from // rhs. In particular, this function: // // - returns false if either number is (or both are) NAN. // - treats really large numbers as almost equal to infinity. // - thinks +0.0 and -0.0 are 0 DLP's apart. bool AlmostEquals(const FloatingPoint& rhs) const { // The IEEE standard says that any comparison operation involving // a NAN must return false. if (is_nan() || rhs.is_nan()) return false; return DistanceBetweenSignAndMagnitudeNumbers(u_.bits_, rhs.u_.bits_) <= kMaxUlps; } private: // The data type used to store the actual floating-point number. union FloatingPointUnion { RawType value_; // The raw floating-point number. Bits bits_; // The bits that represent the number. }; // Converts an integer from the sign-and-magnitude representation to // the biased representation. More precisely, let N be 2 to the // power of (kBitCount - 1), an integer x is represented by the // unsigned number x + N. // // For instance, // // -N + 1 (the most negative number representable using // sign-and-magnitude) is represented by 1; // 0 is represented by N; and // N - 1 (the biggest number representable using // sign-and-magnitude) is represented by 2N - 1. // // Read http://en.wikipedia.org/wiki/Signed_number_representations // for more details on signed number representations. static Bits SignAndMagnitudeToBiased(const Bits &sam) { if (kSignBitMask & sam) { // sam represents a negative number. return ~sam + 1; } else { // sam represents a positive number. return kSignBitMask | sam; } } // Given two numbers in the sign-and-magnitude representation, // returns the distance between them as an unsigned number. static Bits DistanceBetweenSignAndMagnitudeNumbers(const Bits &sam1, const Bits &sam2) { const Bits biased1 = SignAndMagnitudeToBiased(sam1); const Bits biased2 = SignAndMagnitudeToBiased(sam2); return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1); } FloatingPointUnion u_; }; EDIT: This post is 4 years old. It's probably still valid, and the code is nice, but some people found improvements. Best go get the latest version of AlmostEquals right from the Google Test source code, and not the one I pasted up here.
For a more in depth approach read Comparing floating point numbers. Here is the code snippet from that link: // Usable AlmostEqual function bool AlmostEqual2sComplement(float A, float B, int maxUlps) { // Make sure maxUlps is non-negative and small enough that the // default NAN won't compare as equal to anything. assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024); int aInt = *(int*)&A; // Make aInt lexicographically ordered as a twos-complement int if (aInt < 0) aInt = 0x80000000 - aInt; // Make bInt lexicographically ordered as a twos-complement int int bInt = *(int*)&B; if (bInt < 0) bInt = 0x80000000 - bInt; int intDiff = abs(aInt - bInt); if (intDiff <= maxUlps) return true; return false; }
Realizing this is an old thread but this article is one of the most straight forward ones I have found on comparing floating point numbers and if you want to explore more it has more detailed references as well and it the main site covers a complete range of issues dealing with floating point numbers The Floating-Point Guide :Comparison. We can find a somewhat more practical article in Floating-point tolerances revisited and notes there is absolute tolerance test, which boils down to this in C++: bool absoluteToleranceCompare(double x, double y) { return std::fabs(x - y) <= std::numeric_limits<double>::epsilon() ; } and relative tolerance test: bool relativeToleranceCompare(double x, double y) { double maxXY = std::max( std::fabs(x) , std::fabs(y) ) ; return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXY ; } The article notes that the absolute test fails when x and y are large and fails in the relative case when they are small. Assuming he absolute and relative tolerance is the same a combined test would look like this: bool combinedToleranceCompare(double x, double y) { double maxXYOne = std::max( { 1.0, std::fabs(x) , std::fabs(y) } ) ; return std::fabs(x - y) <= std::numeric_limits<double>::epsilon()*maxXYOne ; }
I ended up spending quite some time going through material in this great thread. I doubt everyone wants to spend so much time so I would highlight the summary of what I learned and the solution I implemented. Quick Summary Is 1e-8 approximately same as 1e-16? If you are looking at noisy sensor data then probably yes but if you are doing molecular simulation then may be not! Bottom line: You always need to think of tolerance value in context of specific function call and not just make it generic app-wide hard-coded constant. For general library functions, it's still nice to have parameter with default tolerance. A typical choice is numeric_limits::epsilon() which is same as FLT_EPSILON in float.h. This is however problematic because epsilon for comparing values like 1.0 is not same as epsilon for values like 1E9. The FLT_EPSILON is defined for 1.0. The obvious implementation to check if number is within tolerance is fabs(a-b) <= epsilon however this doesn't work because default epsilon is defined for 1.0. We need to scale epsilon up or down in terms of a and b. There are two solution to this problem: either you set epsilon proportional to max(a,b) or you can get next representable numbers around a and then see if b falls into that range. The former is called "relative" method and later is called ULP method. Both methods actually fails anyway when comparing with 0. In this case, application must supply correct tolerance. Utility Functions Implementation (C++11) //implements relative method - do not use for comparing with zero //use this most of the time, tolerance needs to be meaningful in your context template<typename TReal> static bool isApproximatelyEqual(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon()) { TReal diff = std::fabs(a - b); if (diff <= tolerance) return true; if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance) return true; return false; } //supply tolerance that is meaningful in your context //for example, default tolerance may not work if you are comparing double with float template<typename TReal> static bool isApproximatelyZero(TReal a, TReal tolerance = std::numeric_limits<TReal>::epsilon()) { if (std::fabs(a) <= tolerance) return true; return false; } //use this when you want to be on safe side //for example, don't start rover unless signal is above 1 template<typename TReal> static bool isDefinitelyLessThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon()) { TReal diff = a - b; if (diff < tolerance) return true; if (diff < std::fmax(std::fabs(a), std::fabs(b)) * tolerance) return true; return false; } template<typename TReal> static bool isDefinitelyGreaterThan(TReal a, TReal b, TReal tolerance = std::numeric_limits<TReal>::epsilon()) { TReal diff = a - b; if (diff > tolerance) return true; if (diff > std::fmax(std::fabs(a), std::fabs(b)) * tolerance) return true; return false; } //implements ULP method //use this when you are only concerned about floating point precision issue //for example, if you want to see if a is 1.0 by checking if its within //10 closest representable floating point numbers around 1.0. template<typename TReal> static bool isWithinPrecisionInterval(TReal a, TReal b, unsigned int interval_size = 1) { TReal min_a = a - (a - std::nextafter(a, std::numeric_limits<TReal>::lowest())) * interval_size; TReal max_a = a + (std::nextafter(a, std::numeric_limits<TReal>::max()) - a) * interval_size; return min_a <= b && max_a >= b; }
The portable way to get epsilon in C++ is #include <limits> std::numeric_limits<double>::epsilon() Then the comparison function becomes #include <cmath> #include <limits> bool AreSame(double a, double b) { return std::fabs(a - b) < std::numeric_limits<double>::epsilon(); }
The code you wrote is bugged : return (diff < EPSILON) && (-diff > EPSILON); The correct code would be : return (diff < EPSILON) && (diff > -EPSILON); (...and yes this is different) I wonder if fabs wouldn't make you lose lazy evaluation in some case. I would say it depends on the compiler. You might want to try both. If they are equivalent in average, take the implementation with fabs. If you have some info on which of the two float is more likely to be bigger than then other, you can play on the order of the comparison to take better advantage of the lazy evaluation. Finally you might get better result by inlining this function. Not likely to improve much though... Edit: OJ, thanks for correcting your code. I erased my comment accordingly
`return fabs(a - b) < EPSILON; This is fine if: the order of magnitude of your inputs don't change much very small numbers of opposite signs can be treated as equal But otherwise it'll lead you into trouble. Double precision numbers have a resolution of about 16 decimal places. If the two numbers you are comparing are larger in magnitude than EPSILON*1.0E16, then you might as well be saying: return a==b; I'll examine a different approach that assumes you need to worry about the first issue and assume the second is fine your application. A solution would be something like: #define VERYSMALL (1.0E-150) #define EPSILON (1.0E-8) bool AreSame(double a, double b) { double absDiff = fabs(a - b); if (absDiff < VERYSMALL) { return true; } double maxAbs = max(fabs(a) - fabs(b)); return (absDiff/maxAbs) < EPSILON; } This is expensive computationally, but it is sometimes what is called for. This is what we have to do at my company because we deal with an engineering library and inputs can vary by a few dozen orders of magnitude. Anyway, the point is this (and applies to practically every programming problem): Evaluate what your needs are, then come up with a solution to address your needs -- don't assume the easy answer will address your needs. If after your evaluation you find that fabs(a-b) < EPSILON will suffice, perfect -- use it! But be aware of its shortcomings and other possible solutions too.
As others have pointed out, using a fixed-exponent epsilon (such as 0.0000001) will be useless for values away from the epsilon value. For example, if your two values are 10000.000977 and 10000, then there are NO 32-bit floating-point values between these two numbers -- 10000 and 10000.000977 are as close as you can possibly get without being bit-for-bit identical. Here, an epsilon of less than 0.0009 is meaningless; you might as well use the straight equality operator. Likewise, as the two values approach epsilon in size, the relative error grows to 100%. Thus, trying to mix a fixed point number such as 0.00001 with floating-point values (where the exponent is arbitrary) is a pointless exercise. This will only ever work if you can be assured that the operand values lie within a narrow domain (that is, close to some specific exponent), and if you properly select an epsilon value for that specific test. If you pull a number out of the air ("Hey! 0.00001 is small, so that must be good!"), you're doomed to numerical errors. I've spent plenty of time debugging bad numerical code where some poor schmuck tosses in random epsilon values to make yet another test case work. If you do numerical programming of any kind and believe you need to reach for fixed-point epsilons, READ BRUCE'S ARTICLE ON COMPARING FLOATING-POINT NUMBERS. Comparing Floating Point Numbers
Here's proof that using std::numeric_limits::epsilon() is not the answer — it fails for values greater than one: Proof of my comment above: #include <stdio.h> #include <limits> double ItoD (__int64 x) { // Return double from 64-bit hexadecimal representation. return *(reinterpret_cast<double*>(&x)); } void test (__int64 ai, __int64 bi) { double a = ItoD(ai), b = ItoD(bi); bool close = std::fabs(a-b) < std::numeric_limits<double>::epsilon(); printf ("%.16f and %.16f %s close.\n", a, b, close ? "are " : "are not"); } int main() { test (0x3fe0000000000000L, 0x3fe0000000000001L); test (0x3ff0000000000000L, 0x3ff0000000000001L); } Running yields this output: 0.5000000000000000 and 0.5000000000000001 are close. 1.0000000000000000 and 1.0000000000000002 are not close. Note that in the second case (one and just larger than one), the two input values are as close as they can possibly be, and still compare as not close. Thus, for values greater than 1.0, you might as well just use an equality test. Fixed epsilons will not save you when comparing floating-point values.
Qt implements two functions, maybe you can learn from them: static inline bool qFuzzyCompare(double p1, double p2) { return (qAbs(p1 - p2) <= 0.000000000001 * qMin(qAbs(p1), qAbs(p2))); } static inline bool qFuzzyCompare(float p1, float p2) { return (qAbs(p1 - p2) <= 0.00001f * qMin(qAbs(p1), qAbs(p2))); } And you may need the following functions, since Note that comparing values where either p1 or p2 is 0.0 will not work, nor does comparing values where one of the values is NaN or infinity. If one of the values is always 0.0, use qFuzzyIsNull instead. If one of the values is likely to be 0.0, one solution is to add 1.0 to both values. static inline bool qFuzzyIsNull(double d) { return qAbs(d) <= 0.000000000001; } static inline bool qFuzzyIsNull(float f) { return qAbs(f) <= 0.00001f; }
Unfortunately, even your "wasteful" code is incorrect. EPSILON is the smallest value that could be added to 1.0 and change its value. The value 1.0 is very important — larger numbers do not change when added to EPSILON. Now, you can scale this value to the numbers you are comparing to tell whether they are different or not. The correct expression for comparing two doubles is: if (fabs(a - b) <= DBL_EPSILON * fmax(fabs(a), fabs(b))) { // ... } This is at a minimum. In general, though, you would want to account for noise in your calculations and ignore a few of the least significant bits, so a more realistic comparison would look like: if (fabs(a - b) <= 16 * DBL_EPSILON * fmax(fabs(a), fabs(b))) { // ... } If comparison performance is very important to you and you know the range of your values, then you should use fixed-point numbers instead.
General-purpose comparison of floating-point numbers is generally meaningless. How to compare really depends on a problem at hand. In many problems, numbers are sufficiently discretized to allow comparing them within a given tolerance. Unfortunately, there are just as many problems, where such trick doesn't really work. For one example, consider working with a Heaviside (step) function of a number in question (digital stock options come to mind) when your observations are very close to the barrier. Performing tolerance-based comparison wouldn't do much good, as it would effectively shift the issue from the original barrier to two new ones. Again, there is no general-purpose solution for such problems and the particular solution might require going as far as changing the numerical method in order to achieve stability.
You have to do this processing for floating point comparison, since float's can't be perfectly compared like integer types. Here are functions for the various comparison operators. Floating Point Equal to (==) I also prefer the subtraction technique rather than relying on fabs() or abs(), but I'd have to speed profile it on various architectures from 64-bit PC to ATMega328 microcontroller (Arduino) to really see if it makes much of a performance difference. So, let's forget about all this absolute value stuff and just do some subtraction and comparison! Modified from Microsoft's example here: /// #brief See if two floating point numbers are approximately equal. /// #param[in] a number 1 /// #param[in] b number 2 /// #param[in] epsilon A small value such that if the difference between the two numbers is /// smaller than this they can safely be considered to be equal. /// #return true if the two numbers are approximately equal, and false otherwise bool is_float_eq(float a, float b, float epsilon) { return ((a - b) < epsilon) && ((b - a) < epsilon); } bool is_double_eq(double a, double b, double epsilon) { return ((a - b) < epsilon) && ((b - a) < epsilon); } Example usage: constexpr float EPSILON = 0.0001; // 1e-4 is_float_eq(1.0001, 0.99998, EPSILON); I'm not entirely sure, but it seems to me some of the criticisms of the epsilon-based approach, as described in the comments below this highly-upvoted answer, can be resolved by using a variable epsilon, scaled according to the floating point values being compared, like this: float a = 1.0001; float b = 0.99998; float epsilon = std::max(std::fabs(a), std::fabs(b)) * 1e-4; is_float_eq(a, b, epsilon); This way, the epsilon value scales with the floating point values and is therefore never so small of a value that it becomes insignificant. For completeness, let's add the rest: Greater than (>), and less than (<): /// #brief See if floating point number `a` is > `b` /// #param[in] a number 1 /// #param[in] b number 2 /// #param[in] epsilon a small value such that if `a` is > `b` by this amount, `a` is considered /// to be definitively > `b` /// #return true if `a` is definitively > `b`, and false otherwise bool is_float_gt(float a, float b, float epsilon) { return a > b + epsilon; } bool is_double_gt(double a, double b, double epsilon) { return a > b + epsilon; } /// #brief See if floating point number `a` is < `b` /// #param[in] a number 1 /// #param[in] b number 2 /// #param[in] epsilon a small value such that if `a` is < `b` by this amount, `a` is considered /// to be definitively < `b` /// #return true if `a` is definitively < `b`, and false otherwise bool is_float_lt(float a, float b, float epsilon) { return a < b - epsilon; } bool is_double_lt(double a, double b, double epsilon) { return a < b - epsilon; } Greater than or equal to (>=), and less than or equal to (<=) /// #brief Returns true if `a` is definitively >= `b`, and false otherwise bool is_float_ge(float a, float b, float epsilon) { return a > b - epsilon; } bool is_double_ge(double a, double b, double epsilon) { return a > b - epsilon; } /// #brief Returns true if `a` is definitively <= `b`, and false otherwise bool is_float_le(float a, float b, float epsilon) { return a < b + epsilon; } bool is_double_le(double a, double b, double epsilon) { return a < b + epsilon; } Additional improvements: A good default value for epsilon in C++ is std::numeric_limits<T>::epsilon(), which evaluates to either 0 or FLT_EPSILON, DBL_EPSILON, or LDBL_EPSILON. See here: https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon. You can also see the float.h header for FLT_EPSILON, DBL_EPSILON, and LDBL_EPSILON. See https://en.cppreference.com/w/cpp/header/cfloat and https://www.cplusplus.com/reference/cfloat/ You could template the functions instead, to handle all floating point types: float, double, and long double, with type checks for these types via a static_assert() inside the template. Scaling the epsilon value is a good idea to ensure it works for really large and really small a and b values. This article recommends and explains it: http://realtimecollisiondetection.net/blog/?p=89. So, you should scale epsilon by a scaling value equal to max(1.0, abs(a), abs(b)), as that article explains. Otherwise, as a and/or b increase in magnitude, the epsilon would eventually become so small relative to those values that it becomes lost in the floating point error. So, we scale it to become larger in magnitude like they are. However, using 1.0 as the smallest allowed scaling factor for epsilon also ensures that for really small-magnitude a and b values, epsilon itself doesn't get scaled so small that it also becomes lost in the floating point error. So, we limit the minimum scaling factor to 1.0. If you want to "encapsulate" the above functions into a class, don't. Instead, wrap them up in a namespace if you like in order to namespace them. Ex: if you put all of the stand-alone functions into a namespace called float_comparison, then you could access the is_eq() function like this, for instance: float_comparison::is_eq(1.0, 1.5);. It might also be nice to add comparisons against zero, not just comparisons between two values. So, here is a better type of solution with the above improvements in place: namespace float_comparison { /// Scale the epsilon value to become large for large-magnitude a or b, /// but no smaller than 1.0, per the explanation above, to ensure that /// epsilon doesn't ever fall out in floating point error as a and/or b /// increase in magnitude. template<typename T> static constexpr T scale_epsilon(T a, T b, T epsilon = std::numeric_limits<T>::epsilon()) noexcept { static_assert(std::is_floating_point_v<T>, "Floating point comparisons " "require type float, double, or long double."); T scaling_factor; // Special case for when a or b is infinity if (std::isinf(a) || std::isinf(b)) { scaling_factor = 0; } else { scaling_factor = std::max({(T)1.0, std::abs(a), std::abs(b)}); } T epsilon_scaled = scaling_factor * std::abs(epsilon); return epsilon_scaled; } // Compare two values /// Equal: returns true if a is approximately == b, and false otherwise template<typename T> static constexpr bool is_eq(T a, T b, T epsilon = std::numeric_limits<T>::epsilon()) noexcept { static_assert(std::is_floating_point_v<T>, "Floating point comparisons " "require type float, double, or long double."); // test `a == b` first to see if both a and b are either infinity // or -infinity return a == b || std::abs(a - b) <= scale_epsilon(a, b, epsilon); } /* etc. etc.: is_eq() is_ne() is_lt() is_le() is_gt() is_ge() */ // Compare against zero /// Equal: returns true if a is approximately == 0, and false otherwise template<typename T> static constexpr bool is_eq_zero(T a, T epsilon = std::numeric_limits<T>::epsilon()) noexcept { static_assert(std::is_floating_point_v<T>, "Floating point comparisons " "require type float, double, or long double."); return is_eq(a, (T)0.0, epsilon); } /* etc. etc.: is_eq_zero() is_ne_zero() is_lt_zero() is_le_zero() is_gt_zero() is_ge_zero() */ } // namespace float_comparison See also: The macro forms of some of the functions above in my repo here: utilities.h. UPDATE 29 NOV 2020: it's a work-in-progress, and I'm going to make it a separate answer when ready, but I've produced a better, scaled-epsilon version of all of the functions in C in this file here: utilities.c. Take a look. ADDITIONAL READING I need to do now have done: Floating-point tolerances revisited, by Christer Ericson. VERY USEFUL ARTICLE! It talks about scaling epsilon in order to ensure it never falls out in floating point error, even for really large-magnitude a and/or b values!
My class based on previously posted answers. Very similar to Google's code but I use a bias which pushes all NaN values above 0xFF000000. That allows a faster check for NaN. This code is meant to demonstrate the concept, not be a general solution. Google's code already shows how to compute all the platform specific values and I didn't want to duplicate all that. I've done limited testing on this code. typedef unsigned int U32; // Float Memory Bias (unsigned) // ----- ------ --------------- // NaN 0xFFFFFFFF 0xFF800001 // NaN 0xFF800001 0xFFFFFFFF // -Infinity 0xFF800000 0x00000000 --- // -3.40282e+038 0xFF7FFFFF 0x00000001 | // -1.40130e-045 0x80000001 0x7F7FFFFF | // -0.0 0x80000000 0x7F800000 |--- Valid <= 0xFF000000. // 0.0 0x00000000 0x7F800000 | NaN > 0xFF000000 // 1.40130e-045 0x00000001 0x7F800001 | // 3.40282e+038 0x7F7FFFFF 0xFEFFFFFF | // Infinity 0x7F800000 0xFF000000 --- // NaN 0x7F800001 0xFF000001 // NaN 0x7FFFFFFF 0xFF7FFFFF // // Either value of NaN returns false. // -Infinity and +Infinity are not "close". // -0 and +0 are equal. // class CompareFloat{ public: union{ float m_f32; U32 m_u32; }; static bool CompareFloat::IsClose( float A, float B, U32 unitsDelta = 4 ) { U32 a = CompareFloat::GetBiased( A ); U32 b = CompareFloat::GetBiased( B ); if ( (a > 0xFF000000) || (b > 0xFF000000) ) { return( false ); } return( (static_cast<U32>(abs( a - b ))) < unitsDelta ); } protected: static U32 CompareFloat::GetBiased( float f ) { U32 r = ((CompareFloat*)&f)->m_u32; if ( r & 0x80000000 ) { return( ~r - 0x007FFFFF ); } return( r + 0x7F800000 ); } };
I'd be very wary of any of these answers that involves floating point subtraction (e.g., fabs(a-b) < epsilon). First, the floating point numbers become more sparse at greater magnitudes and at high enough magnitudes where the spacing is greater than epsilon, you might as well just be doing a == b. Second, subtracting two very close floating point numbers (as these will tend to be, given that you're looking for near equality) is exactly how you get catastrophic cancellation. While not portable, I think grom's answer does the best job of avoiding these issues.
There are actually cases in numerical software where you want to check whether two floating point numbers are exactly equal. I posted this on a similar question https://stackoverflow.com/a/10973098/1447411 So you can not say that "CompareDoubles1" is wrong in general.
In terms of the scale of quantities: If epsilon is the small fraction of the magnitude of quantity (i.e. relative value) in some certain physical sense and A and B types is comparable in the same sense, than I think, that the following is quite correct: #include <limits> #include <iomanip> #include <iostream> #include <cmath> #include <cstdlib> #include <cassert> template< typename A, typename B > inline bool close_enough(A const & a, B const & b, typename std::common_type< A, B >::type const & epsilon) { using std::isless; assert(isless(0, epsilon)); // epsilon is a part of the whole quantity assert(isless(epsilon, 1)); using std::abs; auto const delta = abs(a - b); auto const x = abs(a); auto const y = abs(b); // comparable generally and |a - b| < eps * (|a| + |b|) / 2 return isless(epsilon * y, x) && isless(epsilon * x, y) && isless((delta + delta) / (x + y), epsilon); } int main() { std::cout << std::boolalpha << close_enough(0.9, 1.0, 0.1) << std::endl; std::cout << std::boolalpha << close_enough(1.0, 1.1, 0.1) << std::endl; std::cout << std::boolalpha << close_enough(1.1, 1.2, 0.01) << std::endl; std::cout << std::boolalpha << close_enough(1.0001, 1.0002, 0.01) << std::endl; std::cout << std::boolalpha << close_enough(1.0, 0.01, 0.1) << std::endl; return EXIT_SUCCESS; }
I use this code: bool AlmostEqual(double v1, double v2) { return (std::fabs(v1 - v2) < std::fabs(std::min(v1, v2)) * std::numeric_limits<double>::epsilon()); }
Found another interesting implementation on: https://en.cppreference.com/w/cpp/types/numeric_limits/epsilon #include <cmath> #include <limits> #include <iomanip> #include <iostream> #include <type_traits> #include <algorithm> template<class T> typename std::enable_if<!std::numeric_limits<T>::is_integer, bool>::type almost_equal(T x, T y, int ulp) { // the machine epsilon has to be scaled to the magnitude of the values used // and multiplied by the desired precision in ULPs (units in the last place) return std::fabs(x-y) <= std::numeric_limits<T>::epsilon() * std::fabs(x+y) * ulp // unless the result is subnormal || std::fabs(x-y) < std::numeric_limits<T>::min(); } int main() { double d1 = 0.2; double d2 = 1 / std::sqrt(5) / std::sqrt(5); std::cout << std::fixed << std::setprecision(20) << "d1=" << d1 << "\nd2=" << d2 << '\n'; if(d1 == d2) std::cout << "d1 == d2\n"; else std::cout << "d1 != d2\n"; if(almost_equal(d1, d2, 2)) std::cout << "d1 almost equals d2\n"; else std::cout << "d1 does not almost equal d2\n"; }
In a more generic way: template <typename T> bool compareNumber(const T& a, const T& b) { return std::abs(a - b) < std::numeric_limits<T>::epsilon(); } Note: As pointed out by #SirGuy, this approach is flawed. I am leaving this answer here as an example not to follow.
I use this code. Unlike the above answers this allows one to give a abs_relative_error that is explained in the comments of the code. The first version compares complex numbers, so that the error can be explained in terms of the angle between two "vectors" of the same length in the complex plane (which gives a little insight). Then from there the correct formula for two real numbers follows. https://github.com/CarloWood/ai-utils/blob/master/almost_equal.h The latter then is template<class T> typename std::enable_if<std::is_floating_point<T>::value, bool>::type almost_equal(T x, T y, T const abs_relative_error) { return 2 * std::abs(x - y) <= abs_relative_error * std::abs(x + y); } where abs_relative_error is basically (twice) the absolute value of what comes closest to being defined in the literature: a relative error. But that is just the choice of the name. What it really is seen most clearly in the complex plane I think. If |x| = 1, and y lays in a circle around x with diameter abs_relative_error, then the two are considered equal.
I use the following function for floating-point numbers comparison: bool approximatelyEqual(double a, double b) { return fabs(a - b) <= ((fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * std::numeric_limits<double>::epsilon()); }
It depends on how precise you want the comparison to be. If you want to compare for exactly the same number, then just go with ==. (You almost never want to do this unless you actually want exactly the same number.) On any decent platform you can also do the following: diff= a - b; return fabs(diff)<EPSILON; as fabs tends to be pretty fast. By pretty fast I mean it is basically a bitwise AND, so it better be fast. And integer tricks for comparing doubles and floats are nice but tend to make it more difficult for the various CPU pipelines to handle effectively. And it's definitely not faster on certain in-order architectures these days due to using the stack as a temporary storage area for values that are being used frequently. (Load-hit-store for those who care.)
/// testing whether two doubles are almost equal. We consider two doubles /// equal if the difference is within the range [0, epsilon). /// /// epsilon: a positive number (supposed to be small) /// /// if either x or y is 0, then we are comparing the absolute difference to /// epsilon. /// if both x and y are non-zero, then we are comparing the relative difference /// to epsilon. bool almost_equal(double x, double y, double epsilon) { double diff = x - y; if (x != 0 && y != 0){ diff = diff/y; } if (diff < epsilon && -1.0*diff < epsilon){ return true; } return false; } I used this function for my small project and it works, but note the following: Double precision error can create a surprise for you. Let's say epsilon = 1.0e-6, then 1.0 and 1.000001 should NOT be considered equal according to the above code, but on my machine the function considers them to be equal, this is because 1.000001 can not be precisely translated to a binary format, it is probably 1.0000009xxx. I test it with 1.0 and 1.0000011 and this time I get the expected result.
You cannot compare two double with a fixed EPSILON. Depending on the value of double, EPSILON varies. A better double comparison would be: bool same(double a, double b) { return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b && std::nextafter(a, std::numeric_limits<double>::max()) >= b; }
My way may not be correct but useful Convert both float to strings and then do string compare bool IsFlaotEqual(float a, float b, int decimal) { TCHAR form[50] = _T(""); _stprintf(form, _T("%%.%df"), decimal); TCHAR a1[30] = _T(""), a2[30] = _T(""); _stprintf(a1, form, a); _stprintf(a2, form, b); if( _tcscmp(a1, a2) == 0 ) return true; return false; } operator overlaoding can also be done
I write this for java, but maybe you find it useful. It uses longs instead of doubles, but takes care of NaNs, subnormals, etc. public static boolean equal(double a, double b) { final long fm = 0xFFFFFFFFFFFFFL; // fraction mask final long sm = 0x8000000000000000L; // sign mask final long cm = 0x8000000000000L; // most significant decimal bit mask long c = Double.doubleToLongBits(a), d = Double.doubleToLongBits(b); int ea = (int) (c >> 52 & 2047), eb = (int) (d >> 52 & 2047); if (ea == 2047 && (c & fm) != 0 || eb == 2047 && (d & fm) != 0) return false; // NaN if (c == d) return true; // identical - fast check if (ea == 0 && eb == 0) return true; // ±0 or subnormals if ((c & sm) != (d & sm)) return false; // different signs if (abs(ea - eb) > 1) return false; // b > 2*a or a > 2*b d <<= 12; c <<= 12; if (ea < eb) c = c >> 1 | sm; else if (ea > eb) d = d >> 1 | sm; c -= d; return c < 65536 && c > -65536; // don't use abs(), because: // There is a posibility c=0x8000000000000000 which cannot be converted to positive } public static boolean zero(double a) { return (Double.doubleToLongBits(a) >> 52 & 2047) < 3; } Keep in mind that after a number of floating-point operations, number can be very different from what we expect. There is no code to fix that.
comparing double values in C++
I have following code for double comparision. Why I am getting not equal when I execute? #include <iostream> #include <cmath> #include <limits> bool AreDoubleSame(double dFirstVal, double dSecondVal) { return std::fabs(dFirstVal - dSecondVal) < std::numeric_limits<double>::epsilon(); } int main() { double dFirstDouble = 11.304; double dSecondDouble = 11.3043; if(AreDoubleSame(dFirstDouble , dSecondDouble ) ) { std::cout << "equal" << std::endl; } else { std::cout << "not equal" << std::endl; } }
The epsilon for 2 doubles is 2.22045e-016 By definition, epsilon is the difference between 1 and the smallest value greater than 1 that is representable for the data type. These differ by more than that and hence, it returns false (Reference)
The are not equal (according to your function) because they differ by more than epsilon. Epsilon is defined as "Machine epsilon (the difference between 1 and the least value greater than 1 that is representable)" - source http://www.cplusplus.com/reference/std/limits/numeric_limits/. This is approximately 2.22045e-016 (source http://msdn.microsoft.com/en-us/library/6x7575x3(v=vs.71).aspx) If you want to change the fudging factor, compare to another small double, for example: bool AreDoubleSame(double dFirstVal, double dSecondVal) { return std::fabs(dFirstVal - dSecondVal) < 1E-3; }
The difference between your two doubles is 0.0003. std::numeric_limits::epsilon() is much smaller than that.
epsilon() is only the difference between 1.0 and the next value representable after 1.0, the real min. The library function std::nextafter can be used to scale the equality precision test for numbers of any magnitude. For example using std::nextafter to test double equality, by testing that b is both <= next number lower than a && >= next number higher than a: bool nearly_equal(double a, double b) { return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b && std::nextafter(a, std::numeric_limits<double>::max()) >= b; } This of course, would only be true if the bit patterns for a & b are the same. So it is an inefficient way of doing the (incorrect) naive direct a == b comparrison, therefore : To test two double for equality within some factor scaled to the representable difference, you might use: bool nearly_equal(double a, double b, int factor /* a factor of epsilon */) { double min_a = a - (a - std::nextafter(a, std::numeric_limits<double>::lowest())) * factor; double max_a = a + (std::nextafter(a, std::numeric_limits<double>::max()) - a) * factor; return min_a <= b && max_a >= b; } Of course working with floating point, analysis of the calculation precision would be required to determine how representation errors build up, to determine the correct minimum factor.
Epsilon is much smaller than 0.0003, so they are clearly not equal. If you want to see where it works check http://ideone.com/blcmB
Comparing Same Float Values In C [duplicate]
This question already has answers here: Closed 10 years ago. Possible Duplicate: strange output in comparison of float with float literal When I am trying to compare 2 same float values it doesn't print "equal values" in the following code : void main() { float a = 0.7; clrscr(); if (a < 0.7) printf("value : %f",a); else if (a == 0.7) printf("equal values"); else printf("hello"); getch(); } Thanks in advance.
While many people will tell you to always compare floating point numbers with an epsilon (and it's usually a good idea, though it should be a percentage of the values being compared rather than a fixed value), that's not actually necessary here since you're using constants. Your specific problem here is that: float a = 0.7; uses the double constant 0.7 to create a single precision number (losing some precision) while: if (a == 0.7) will compare two double precision numbers (a is promoted first). The precision that was lost when turning the double 0.7 into the float a is not regained when promoting a back to a double. If you change all those 0.7 values to 0.7f (to force float rather than double), or if you just make a a double, it will work fine - I rarely use float nowadays unless I have a massive array of them and need to save space. You can see this in action with: #include <stdio.h> int main (void){ float f = 0.7; // double converted to float double d1 = 0.7; // double kept as double double d2 = f; // float converted back to double printf ("double: %.30f\n", d1); printf ("double from float: %.30f\n", d2); return 0; } which will output something like (slightly modified to show difference): double: 0.6999999|99999999955591079014994 double from float: 0.6999999|88079071044921875000000 \_ different beyond here.
Floating point number are not what you think they are: here are two sources with more information: What Every Computer Scientist Should Know About Floating-Point Arithmetic and The Floating-Point Guide. The short answer is that due to the way floating point numbers are represented, you cannot do basic comparison or arithmetic and expect it to work.
You are comparing a single-precision approximation of 0.7 with a double-precision approximation. To get the expected output you should use: if(a == 0.7f) // check a is exactly 0.7f Note that due to representation and rounding errors it may be very unlikely to ever get exactly 0.7f from any operation. In general you should check if fabs(a-0.7) is sufficiently close to 0. Don't forget that the exact value of 0.7f is not really 0.7, but slightly lower: 0.7f = 0.699999988079071044921875 The exact value of the double precision representation of 0.7 is a better approximation, but still not exactly 0.7: 0.7d = 0.6999999999999999555910790149937383830547332763671875
a is a float; 0.7 is a value of type double. The comparison between the two requires a conversion. The compiler will convert the float value to a double value ... and the value resulting from converting a float to a double is not the same as the value resulting from the compiler converting a string of text (the source code) to a double. But don't ever compare floating point values (float, double, or long double) with ==. You might like to read "What Every Programmer Should Know About Floating-Point Arithmetic".
Floating point numbers must not be compared with the "==" operator. Instead of comparing float numbers with the "==" operator, you can use a function like this one : //compares if the float f1 is equal with f2 and returns 1 if true and 0 if false int compare_float(float f1, float f2) { float precision = 0.00001; if (((f1 - precision) < f2) && ((f1 + precision) > f2)) { return 1; } else { return 0; } }
The lack of absolute precision in floats makes it more difficult to do trivial comparisons than for integers. See this page on comparing floats in C. In particular, one code snippet lifted from there exhibits a 'workaround' to this issue: bool AlmostEqual2sComplement(float A, float B, int maxUlps) { // Make sure maxUlps is non-negative and small enough that the // default NAN won't compare as equal to anything. assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024); int aInt = *(int*)&A; // Make aInt lexicographically ordered as a twos-complement int if (aInt < 0) aInt = 0x80000000 - aInt; // Make bInt lexicographically ordered as a twos-complement int int bInt = *(int*)&B; if (bInt < 0) bInt = 0x80000000 - bInt; int intDiff = abs(aInt - bInt); if (intDiff <= maxUlps) return true; return false; } A simple and common workaround is to provide an epsilon with code like so: if (fabs(result - expectedResult) < 0.00001) This essentially checks the difference between the values is within a threshold. See the linked article as to why this is not always optimal though :) Another article is pretty much the de facto standard of what is linked to when people ask about floats on SO.
if you need to compare a with 0.7 than if( fabs(a-0.7) < 0.00001 ) //your code here 0.00001 can be changed to less (like 0.00000001) or more (like 0.0001) > It depends on the precision you need.