I am now trying to use the features of package:test.
I can write something like:
expect(areaUnderCurveWithRectangleRule(f1, 0,1,1000), equals(2));
But as we know, in float/double calculation, there is no such thing as precise equal. So I am wondering if there is a roughly equal testing method? It will return true for two double values, if their difference is within a certain epsilon (say, 1E-6) or certain percentage?
If not, will this make a good feature request to Dart team?
package:test provides a closeTo matcher for this purpose:
expect(areaUnderCurveWithRectangleRule(f1, 0,1,1000), closeTo(2, epsilon));
Note that closeTo uses an absolute delta, so a single threshold might not be appropriate for floating-point values that have very different magnitudes.
If you instead want a version that compares based on a percentage, it should be easy to wrap closeTo with your own function, e.g.:
Matcher closeToPercentage(num value, double fraction) {
  // Use the magnitude so the delta stays non-negative for negative values.
  final delta = (value * fraction).abs();
  return closeTo(value, delta);
}
As far as I know there is no standard implementation for this. But you can use the following:
expect((x - y).abs() < epsilon, isTrue);
for some epsilon you defined earlier.
This version gives helpful error messages instead of just "false":
void near(double a, double b, {double eps = 1e-12, bool relative = false}) {
  var bound = relative ? eps * b.abs() : eps;
  expect(a, greaterThanOrEqualTo(b - bound));
  expect(a, lessThanOrEqualTo(b + bound));
}
Related
I have two vectors:
std::vector<double> calculatedValues = calculateValues();
std::vector<double> expectedValues = {1.1, 1.2};
I am using cpputest to verify if these vectors are equal:
CHECK_TRUE(calculatedValues == expectedValues)
This is working. However, I am wondering whether I shouldn't use some tolerance, because after all I am comparing doubles.
Instead of operator== you can use std::equal() with a custom epsilon:
// The four-iterator overload (C++14) also returns false when the sizes differ.
bool equal = std::equal(calculatedValues.begin(), calculatedValues.end(),
                        expectedValues.begin(), expectedValues.end(),
    [](double value1, double value2)
    {
        constexpr double epsilon = 0.01; // Choose whatever you need here
        return std::fabs(value1 - value2) < epsilon;
    });
CHECK_TRUE(equal);
To compare floating point values you should do something like this:
bool is_equal(double a, double b) {
    // 0.0001 is a relative tolerance; multiplying by |b| avoids dividing by zero or by a negative b
    return std::fabs(a - b) <= 0.0001 * std::fabs(b);
}
You can adapt it to compare your vectors.
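As a rough sketch of that adaptation, a relative check could be applied element by element. The names is_close and vectors_close are made up for illustration, and the tolerance is measured against the expected value, which is an assumption about what the test wants:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: true when actual is within rel_tol of expected,
// measured relative to |expected|. An expected value of exactly 0 therefore
// only matches an actual value of exactly 0.
bool is_close(double actual, double expected, double rel_tol) {
    return std::fabs(actual - expected) <= rel_tol * std::fabs(expected);
}

// Hypothetical helper: element-wise comparison of two vectors.
// Vectors of different length never compare equal.
bool vectors_close(const std::vector<double>& actual,
                   const std::vector<double>& expected,
                   double rel_tol = 1e-4) {
    return actual.size() == expected.size() &&
           std::equal(actual.begin(), actual.end(), expected.begin(),
                      [rel_tol](double a, double e) { return is_close(a, e, rel_tol); });
}
```

In the test this would be used as CHECK_TRUE(vectors_close(calculatedValues, expectedValues)).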
Yes, you should use some tolerance because floating point operations are not guaranteed to yield the exact same results on different CPUs. There can be e.g. roundoff errors.
However, the SSE/SSE2 standards do provide reproducible floating point math, so you may consider using the MSVC compile flag /arch:SSE2 as an alternative. That said, it is difficult to ensure that no x87 math is used anywhere in the app, so be careful!
I have the following expression:
A = cos(5x),
where x is a letter indicating a generic parameter.
In my program I have to work with A, and after some calculations the result must still be an explicit function of x.
In order to do that, what kind of variable should A (and I guess all the other variables that I use for my calculations) be?
Many thanks to anyone who answers.
I'm guessing you need precision. In which case, double is probably what you want.
You can also use float if you need to operate on a lot of floating-point numbers (think in the order of thousands or more) and analysis of the algorithm has shown that the reduced range and accuracy don't pose a problem.
If you need more range or accuracy than double, long double can also be used.
To define function A(x) = cos(5 * x)
You may do:
Regular function:
double A(double x) { return std::cos(5 * x); }
Lambda:
auto A = [](double x) { return std::cos(5 * x); };
And then just call it as any callable object.
A(4.); // cos(20.)
It sounds like you're trying to do a symbolic calculation, i.e.
A = magic(cos(5 x))
B = acos(A)
print B
> 5 x
If so, there isn't a simple datatype that will do this for you, unless you're programming in Mathematica.
The most general answer is "A will be an Expression in some AST representation for which you have a general algebraic solver."
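To make "an Expression in some AST representation" a little more concrete, here is a minimal sketch. All type names (Expr, Var, Scale, Cos) and the helper make_A are invented for this illustration, and there is no algebraic solver here, only evaluation and printing:

```cpp
#include <cmath>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal expression tree: just enough to represent cos(5 * x).
struct Expr {
    virtual ~Expr() = default;
    virtual double eval(double x) const = 0;  // numeric value at a given x
    virtual std::string str() const = 0;      // symbolic form, still in terms of x
};
using ExprPtr = std::shared_ptr<const Expr>;

// The free parameter x.
struct Var : Expr {
    double eval(double x) const override { return x; }
    std::string str() const override { return "x"; }
};

// k * inner, with an integer factor to keep printing simple.
struct Scale : Expr {
    int k;
    ExprPtr inner;
    Scale(int k_, ExprPtr e) : k(k_), inner(std::move(e)) {}
    double eval(double x) const override { return k * inner->eval(x); }
    std::string str() const override { return std::to_string(k) + "*" + inner->str(); }
};

// cos(inner)
struct Cos : Expr {
    ExprPtr inner;
    explicit Cos(ExprPtr e) : inner(std::move(e)) {}
    double eval(double x) const override { return std::cos(inner->eval(x)); }
    std::string str() const override { return "cos(" + inner->str() + ")"; }
};

// Builds A = cos(5 * x) as a tree rather than as a number.
ExprPtr make_A() {
    return std::make_shared<Cos>(std::make_shared<Scale>(5, std::make_shared<Var>()));
}
```

Simplifying something like acos(A) back to 5 x would need rewrite rules on top of such a tree, which is the part a computer algebra system provides.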
However, if you really want to end up with a C++ function you can call (instead of a symbolic representation you can print as well as evaluating), you can just use function composition. In that case, A would be a
std::function<double (double )>
or something similar.
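A minimal sketch of the function-composition approach follows. The alias Fn and the helper compose are names chosen for this example, not anything from the standard library:

```cpp
#include <cmath>
#include <functional>

using Fn = std::function<double(double)>;

// Hypothetical helper: returns the function x -> outer(inner(x)).
Fn compose(Fn outer, Fn inner) {
    return [outer, inner](double x) { return outer(inner(x)); };
}

// A(x) = cos(5 * x), as in the question.
const Fn A = [](double x) { return std::cos(5 * x); };

// B(x) = acos(A(x)): still a function of x, but only numerically.
// It cannot be printed or simplified as "5 x".
const Fn B = compose([](double y) { return std::acos(y); }, A);
```

Calling B(0.1) evaluates acos(cos(0.5)), so the dependence on x is kept until the final call.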
With this question as a basis: it is well known that we should not apply the equality operator to floating-point variables, due to numeric errors (this is not bound to any programming language):
bool CompareDoubles1 (double A, double B)
{
    return A == B;
}
The above code is not right.
My questions are:
Is it right to round both numbers and then compare?
Is it more efficient?
For instance:
bool CompareDoubles1 (double A, double B)
{
    double a = round(A,4);
    double b = round(B,4);
    return a == b;
}
Is this correct?
EDIT
I'm assuming round is a function that takes a double (number) and an int (precision):
double round (double number, int precision);
EDIT
I think a better idea of what I mean by this question is expressed by this compare method:
bool CompareDoubles1 (double A, double B, int precision)
{
    // precision could be the error expected when rounding
    double a = round(A,precision);
    double b = round(B,precision);
    return a == b;
}
Usually, if you really have to compare floating values, you'd specify a tolerance:
bool CompareDoubles1 (double A, double B, double tolerance)
{
    return std::abs(A - B) < tolerance;
}
Choosing an appropriate tolerance will depend on the nature of the values and the calculations that produce them.
Rounding is not appropriate: two very close values, which you'd want to compare equal, might round in different directions and appear unequal. For example, when rounding to the nearest integer, 0.3 and 0.4 would compare equal, but 0.499999 and 0.500001 wouldn't.
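The example values above can be checked with a small sketch. The function names equal_by_rounding and equal_by_tolerance are made up here, and rounding to the nearest integer stands in for the two-argument round from the question:

```cpp
#include <cmath>

// Rounds both values to the nearest integer before comparing
// (the approach under discussion).
bool equal_by_rounding(double a, double b) {
    return std::round(a) == std::round(b);
}

// Compares the absolute difference against a caller-chosen tolerance.
bool equal_by_tolerance(double a, double b, double tolerance) {
    return std::abs(a - b) < tolerance;
}
```

With these, equal_by_rounding(0.3, 0.4) is true while equal_by_rounding(0.499999, 0.500001) is false, whereas a tolerance of 1e-5 treats the second pair as equal and the first pair as different.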
A common comparison for doubles is implemented as
bool CompareDoubles2 (double A, double B)
{
    return std::abs(A - B) < 1e-6; // small magic constant here
}
It is clearly not as efficient as the check A == B, because it involves more steps, namely subtraction, calling std::abs and finally comparison with a constant.
The same argument about efficiency holds for your proposed solution:
bool CompareDoubles1 (double A, double B)
{
    double a = round(A,4); // the magic constant hides in the 4
    double b = round(B,4); // and here again
    return a == b;
}
Again, this won't be as efficient as direct comparison, but -- again -- it doesn't even try to do the same.
Whether CompareDoubles2 or CompareDoubles1 is faster depends on your machine and the choice of magic constants. Just measure it. You need to make sure to supply matching magic constants, otherwise you are checking for equality with a different trust region which yields different results.
I think comparing the difference with a fixed tolerance is a bad idea.
Say what happens if you set the tolerance to 1e-6, but the two numbers you compare are
1.11e-9 and 1.19e-9?
These would be considered equal, even though they differ after the second significant digit. This may not be what you want.
I think a better way to do the comparison is
equal = ( fabs(A - B) <= tol*max(fabs(A), fabs(B)) )
Note, the <= (and not <), because the above must also work for 0==0. If you set tol=1e-14, two numbers will be considered equal when they are equal up to 14 significant digits.
Sidenote: When you want to test if a number is zero, then the above test might not be ideal and then one indeed should use an absolute threshold.
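One way to combine the relative test with the absolute threshold from the sidenote is sketched below. The name approx_equal and the default tolerances are assumptions for illustration; suitable values depend on the computation being tested:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: relative tolerance for ordinary magnitudes, plus an
// optional absolute threshold for values that should be treated as zero.
bool approx_equal(double a, double b, double rel_tol = 1e-14, double abs_tol = 0.0) {
    const double diff = std::fabs(a - b);
    if (diff <= abs_tol) {
        return true;  // covers 0 == 0 and "close enough to zero"
    }
    return diff <= rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

With rel_tol = 1e-6 the pair 1.11e-9 and 1.19e-9 is reported as different, which is the behaviour the fixed absolute tolerance could not give.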
If the round function used in your example means to round to 4th decimal digit, this is not correct at all. For example, if A and B are 0.000003 and 0.000004 they would be rounded to 0.0 and would therefore be compared to be equal.
A general-purpose comparison function must not work with a constant tolerance but with a relative one. But it is all explained in the post you cite in your question.
There is no 'correct' way to compare floating point values (Even a f == 0.0 might be correct). Different comparison may be suitable. Have a look at http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/
Similar to other posts, but introducing scale-invariance: If you are doing something like adding two sets of numbers together and then you want to know if the two set sums are equal, you can take the absolute value of the log-ratio (difference of logarithms) and test to see if this is less than your prescribed tolerance. That way, e.g. if you multiply all your numbers by 10 or 100 in summation calculations, it won't affect the result about whether the answers are equal or not. You should have a separate test to determine if two numbers are equal because they are close enough to 0.
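The log-ratio idea might be sketched like this. The name log_ratio_equal is hypothetical, and the sketch assumes both values are strictly positive; zero, negative, or NaN inputs are simply reported as not equal and need the separate near-zero test mentioned above:

```cpp
#include <cmath>

// Hypothetical helper: scale-invariant comparison via |log(a) - log(b)|.
// Only meaningful for strictly positive inputs.
bool log_ratio_equal(double a, double b, double tolerance) {
    if (!(a > 0.0) || !(b > 0.0)) {
        return false;  // the logarithm is undefined here; handle separately
    }
    return std::fabs(std::log(a) - std::log(b)) <= tolerance;
}
```

Because log(k*a) - log(k*b) equals log(a) - log(b) mathematically, scaling both inputs by the same positive factor leaves the outcome unchanged apart from rounding.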
Suppose I have three numbers. Two of them form a range between them. The last number, I want to check to see if it falls within that range. It's a simple caveat: the numbers that define the range's start and end, may be greater than or less than the other. This is for a physics algorithm whose performance I'm working to improve, so I also want to avoid using conditional statements.
double inRange(double point, double rangeStart, double rangeEnd){
    // returns true if the 'point' lies within the range
    // the 'range' is every number between 'rangeStart' and 'rangeEnd'
    // rangeStart can be greater than or less than rangeEnd
    // conditional branches should be avoided
    return ?; // return values [0.0 - 1.0] are considered 'in range'
}
Is there a mathematical equation to accomplish this, without using condition logic?
edit:
The reason it returns a double instead of a bool, is because I need to know the ratio too; 0.0 is closest to one edge while 1.0 is closest to the other.
The original algorithm I have is this:
double inRange(double point, double rangeStart, double rangeEnd){
    if(rangeStart > rangeEnd){
        double temp = rangeStart;
        rangeStart = rangeEnd;
        rangeEnd = temp;
    }
    return (point - rangeStart) / (rangeEnd - rangeStart);
}
My profiler shows about 16% of the time the program is running, is spent in this function, with optimizations enabled. It's called pretty frequently. Not sure if the condition statement is entirely to blame, but I would like to try a function that doesn't have one and see.
To answer your specification ("it should return zero when close to the start and 1 when close to the end"), given that you don't want conditionals and that start and end might be swapped:
return (point-std::min(rangeStart, rangeEnd))/std::abs(rangeStart - rangeEnd);
Note that although I don't know about the particular STL implementation, min does not necessarily require conditionals to be implemented. For instance, min(a,b) = (a+b-abs(b-a))/2.
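Putting the two pieces together, a branch-free version could look like the sketch below. The helper name branchless_min is made up, and whether this is actually faster than std::min or the original swap is something only a profiler can tell; the arithmetic identity can also lose precision or overflow for very large magnitudes:

```cpp
#include <cmath>

// min(a, b) written with the identity (a + b - |b - a|) / 2.
double branchless_min(double a, double b) {
    return (a + b - std::fabs(b - a)) / 2.0;
}

// Position of point within the range, as a ratio measured from the lower end.
// Assumes rangeStart != rangeEnd; an empty range divides by zero.
double inRange(double point, double rangeStart, double rangeEnd) {
    return (point - branchless_min(rangeStart, rangeEnd)) /
           std::fabs(rangeStart - rangeEnd);
}
```

Like the original function, this gives the same ratio regardless of the order in which the two range ends are passed.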
If the start is larger than the end, then swap those.
The following code is supposed to find the key 3.0 in a std::map, where it exists. But due to floating point precision it won't be found.
map<double, double> mymap;
mymap[3.0] = 1.0;
double t = 0.0;
for(int i = 0; i < 31; i++)
{
    t += 0.1;
    bool contains = (mymap.count(t) > 0);
}
In the above example, contains will always be false.
My current workaround is to compute t as 0.1 * i instead of repeatedly adding 0.1, like this:
for(int i = 0; i < 31; i++)
{
    t = 0.1 * i;
    bool contains = (mymap.count(t) > 0);
}
Now the question:
Is there a way to introduce a fuzzyCompare to the std::map if I use double keys?
The common solution for floating point number comparison is usually something like abs(a-b) < epsilon. But I don't see a straightforward way to do this with std::map.
Do I really have to encapsulate the double type in a class and overwrite operator<(...) to implement this functionality?
So there are a few issues with using doubles as keys in a std::map.
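First, NaN is a problem: every ordered comparison involving NaN is false, so under operator< it appears equivalent to every other key, which breaks the strict weak ordering the container requires. If there is any chance of NaN being inserted, use this: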
struct safe_double_less {
    bool operator()(double left, double right) const {
        bool leftNaN = std::isnan(left);
        bool rightNaN = std::isnan(right);
        if (leftNaN != rightNaN)
            return leftNaN<rightNaN;
        return left<right;
    }
};
but that may be overly paranoid. Do not, I repeat do not, include an epsilon threshold in your comparison operator you pass to a std::set or the like: this will violate the ordering requirements of the container, and result in unpredictable undefined behavior.
(I placed NaN as greater than all doubles, including +inf, in my ordering, for no good reason. Less than all doubles would also work).
So either use the default operator<, or the above safe_double_less, or something similar.
Next, I would advise using a std::multimap or std::multiset, because you should be expecting multiple values for each lookup. You might as well make content management an everyday thing, instead of a corner case, to increase the test coverage of your code. (I would rarely recommend these containers) Plus this blocks operator[], which is not advised to be used when you are using floating point keys.
The point where you want to use an epsilon is when you query the container. Instead of using the direct interface, create a helper function like this:
// works on both `const` and non-`const` associative containers:
template<class Container>
auto my_equal_range( Container&& container, double target, double epsilon = 0.00001 )
    -> decltype( container.equal_range(target) )
{
    auto lower = container.lower_bound( target-epsilon );
    auto upper = container.upper_bound( target+epsilon );
    return std::make_pair(lower, upper);
}
which works on both std::map and std::set (and multi versions).
(In a more modern code base, I'd expect a range<?> object that is a better thing to return from an equal_range function. But for now, I'll make it compatible with equal_range).
This finds a range of things whose keys are "sufficiently close" to the one you are asking for, while the container maintains its ordering guarantees internally and doesn't execute undefined behavior.
To test for existence of a key, do this:
template<typename Container>
bool key_exists( Container const& container, double target, double epsilon = 0.00001 ) {
    auto range = my_equal_range(container, target, epsilon);
    return range.first != range.second;
}
and if you want to delete/replace entries, you should deal with the possibility that there might be more than one entry hit.
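Applied to the loop from the question, the query-time epsilon might be used as in the sketch below. To keep it self-contained it is specialised to std::map<double, double>; the names Map, keys_near and count_hits are made up for this example:

```cpp
#include <map>
#include <utility>

using Map = std::map<double, double>;

// Same idea as my_equal_range: all entries with a key in
// [target - epsilon, target + epsilon]. The map itself keeps the exact
// default ordering; the epsilon is applied only to the query.
std::pair<Map::const_iterator, Map::const_iterator>
keys_near(const Map& m, double target, double epsilon) {
    return { m.lower_bound(target - epsilon), m.upper_bound(target + epsilon) };
}

// Repeats the accumulation loop from the question and counts how many
// iterations find at least one key near t.
int count_hits(const Map& m, double epsilon) {
    int hits = 0;
    double t = 0.0;
    for (int i = 0; i < 31; i++) {
        t += 0.1;
        auto range = keys_near(m, t, epsilon);
        if (range.first != range.second) {
            ++hits;
        }
    }
    return hits;
}
```

For a map holding only the key 3.0 and an epsilon such as 1e-9, the accumulated t matches on the thirtieth step and nowhere else, since the neighbouring values of t are about 0.1 away.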
The shorter answer is "don't use floating point values as keys for std::set and std::map", because it is a bit of a hassle.
If you do use floating point keys for std::set or std::map, almost certainly never do a .find or a [] on them, as that is highly highly likely to be a source of bugs. You can use it for an automatically sorted collection of stuff, so long as exact order doesn't matter (ie, that one particular 1.0 is ahead or behind or exactly on the same spot as another 1.0). Even then, I'd go with a multimap/multiset, as relying on collisions or lack thereof is not something I'd rely upon.
Reasoning about the exact value of IEEE floating point values is difficult, and fragility of code relying on it is common.
Here's a simplified example of how using soft-compare (aka epsilon or almost equal) can lead to problems.
Let epsilon = 2 for simplicity. Put 1 and 4 into your map. It now might look like this:
1
 \
  4
So 1 is the tree root.
Now put in the numbers 2, 3, 4 in that order. Each will replace the root, because it compares equal to it. So then you have
4
 \
  4
which is already broken. (Assume no attempt to rebalance the tree is made.) We can keep going with 5, 6, 7:
7
 \
  4
and this is even more broken, because now if we ask whether 4 is in there, it will say "no", and if we ask for an iterator for values less than 7, it won't include 4.
Though I must say that I've used maps based on this flawed fuzzy compare operator numerous times in the past, and whenever I dug up a bug, it was never due to this. This is because datasets in my application areas never actually amount to stress-testing this problem.
As Naszta says, you can implement your own comparison function. What he leaves out is the key to making it work - you must make sure that the function always returns false for any values that are within your tolerance for equivalence.
return (std::abs(left - right) > epsilon) && (left < right);
Edit: as pointed out in many comments to this answer and others, there is a possibility for this to turn out badly if the values you feed it are arbitrarily distributed, because you can't guarantee that !(a<b) and !(b<c) results in !(a<c). This would not be a problem in the question as asked, because the numbers in question are clustered around 0.1 increments; as long as your epsilon is large enough to account for all possible rounding errors but is less than 0.05, it will be reliable. It is vitally important that the keys to the map are never closer than 2*epsilon apart.
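The caveat in the edit can be checked without touching a container at all. In this sketch (function names are made up, and epsilon = 2 is borrowed from the simplified tree example rather than being a realistic value) two keys count as equivalent when neither is less than the other:

```cpp
#include <cmath>

// The fuzzy comparator from the answers, with epsilon passed explicitly.
bool fuzzy_less(double left, double right, double epsilon) {
    return (std::abs(left - right) > epsilon) && (left < right);
}

// Two keys are equivalent for an ordered container when neither is
// less than the other.
bool equivalent(double a, double b, double epsilon) {
    return !fuzzy_less(a, b, epsilon) && !fuzzy_less(b, a, epsilon);
}
```

With epsilon = 2, the keys 1 and 3 are equivalent and so are 3 and 5, yet 1 and 5 are not. Equivalence is not transitive, which is exactly what a strict weak ordering forbids, and why the keys must stay at least 2*epsilon apart for this comparator to be safe.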
You could implement your own comparison function.
#include <cmath>
#include <map>
// Note: std::binary_function was removed in C++17, so it is not used as a base class here.
class own_double_less
{
public:
    own_double_less( double arg_ = 1e-7 ) : epsilon(arg_) {}
    bool operator()( const double &left, const double &right ) const
    {
        // you can choose another way to make the decision
        // (The original version is: return left < right;)
        return (std::abs(left - right) > epsilon) && (left < right);
    }
    double epsilon;
};
// your map:
std::map<double,double,own_double_less> mymap;
Updated: see Item 40 in Effective STL!
Updated based on suggestions.
Using doubles as keys is not useful. As soon as you do any arithmetic on the keys, you can no longer be sure what exact values they have and hence cannot use them for indexing the map. The only sensible usage would be that the keys are constant.