Rephrasing question :
The following code (Not C++ - written in an in-house scripting language)
if(A*B != 0.0)
{
D = (C/(A*B))*100.0;
}
else
{
D = 0.0;
}
yields a value of
90989373681853939930449659398190196007605312719045829137102976436641398782862768335320454041881784565022989668056715169480294533394160442876108458546952155914634268552157701346144299391656459840294022732906509880379702822420494744472135997630178480287638496793549447363202959411986592330337536848282003701760.000000
for D. We are 100% sure that A != 0.0, and we are almost 100% sure that B == 0.0. We never use values as tiny (close to 0.0 but not 0.0) as the value of B that this value of D suggests. It is impossible that B acquired that value from our data. Can A*B yield anything that is not equal to 0.0 when B is 0?
The number you divided by was not in fact 0, just very, very close.
Assuming you are using IEEE floating point numbers, it is not a good idea to use equal or not equal with them in this case. The comparison is exact: two values that differ only by accumulated rounding error compare as different (while -0.0 and +0.0, which differ bitwise, compare as equal). Even if using other float formats, equal and not equal are discouraged.
Instead put some sort of range on it: e=a*b; if ((e<-0.0002)||(e>0.0002)) then...
This looks like you are accruing error from previous calculations, so your division is by a really small number, but not zero. You should add a margin of error if you want to catch something like this, pseudocode: if(num < margin_of_error) ret inf;, or use the epsilon method to be even safer.
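A minimal sketch of the margin-of-error idea from the answers above, written in C++ rather than the in-house language. The name percent_or_zero and the default tolerance of 1e-12 are assumptions made for illustration; a sensible tolerance depends on the scale of your own data.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: computes (c / (a * b)) * 100, but treats any product
// whose magnitude is at or below `tolerance` as if it were zero.
double percent_or_zero(double a, double b, double c, double tolerance = 1e-12) {
    assert(tolerance >= 0.0);
    const double product = a * b;
    if (std::fabs(product) <= tolerance) {
        return 0.0;  // zero, or too close to zero to divide by safely
    }
    return (c / product) * 100.0;
}
```

With this guard, a product such as 2.0 * 1e-300 falls back to 0.0 instead of producing an enormous quotient.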
I know it is incorrect to compare doubles for equality and that the best approach is to use an epsilon factor, as described in Knuth's book (The Art of Computer Programming). Nevertheless, I am working on legacy code (C++) in which there are a lot of divisions like:
// b,c double from previous computation
if( b == 50.0)
b += 0.001;
double a = c/(b - 50.0);
Is the conditional statement (b == 50) performed on the "bit representation" (mantissa-exponent) or on the decimal one? I cannot find this information in my C++ book. If it is the decimal one, I think I can throw away the conditional statement.
The == operator is applied to the run-time representation of the floating-point value, ideally with exactly the exponent and significand numbers of bits implied by the type, but unfortunately, sometimes in a wider format, as allowed by the standard.
In b == 50.0, the decimal representation 50.0 is converted to such a floating-point representation at compile-time once and for all. That value is then used (or the program behaves as if it was used) each time this expression 50.0 is involved. In the case of 50.0, it does not make a difference because the number 50 can be represented exactly as a binary floating-point value.
As an example, b == 50.0000000000000000000001 is likely to behave exactly as b == 50.0 because 50.0000000000000000000001 represents the same floating-point value as 50.0.
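As a small illustration of the point about literals, the sketch below compares the two spellings directly. The function name literals_collapse is made up for this example, and it assumes double is an IEEE 754 64-bit type, where neighbouring values near 50 are about 7e-15 apart.

```cpp
#include <cassert>

// Returns true if the long decimal literal converts to the same double as 50.0.
bool literals_collapse() {
    const double plain = 50.0;
    const double padded = 50.0000000000000000000001;  // differs from 50 by 1e-22
    return plain == padded;
}
```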
For the specific piece of code, the use of exact comparison is correct:
// b,c double from previous computation
if( b == 50.0)
b += 0.001;
double a = c/(b - 50.0);
The purpose seems to be to ensure that the division will not be a division by zero. The code may have been written to be compatible with systems in which division by 0 causes failure, rather than infinity. Subtracting 50 from any double that is not exactly 50 will have a non-zero result, so the 0.001 fudge factor only needs to be added in the case of exact equality.
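To make that argument concrete, here is a sketch that wraps the legacy snippet in a function and probes it with the double closest to 50.0. The names guarded_quotient and next_above_fifty are hypothetical, and IEEE 754 doubles are assumed.

```cpp
#include <cassert>
#include <cmath>

// Mirrors the legacy snippet: nudge b only when it is exactly 50.0.
double guarded_quotient(double b, double c) {
    if (b == 50.0) {
        b += 0.001;
    }
    return c / (b - 50.0);
}

// The closest double above 50.0. Subtracting 50.0 from it is still non-zero.
double next_above_fifty() {
    return std::nextafter(50.0, 51.0);
}
```

Even for that nearest neighbour the divisor is non-zero, so the quotient is large but finite.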
I want to avoid dividing by zero so I have an if statement:
float number;
//........
if (number > 0.000000000000001)
number = 1/number;
How small of a value can I safely use in place of 0.000000000000001?
Just use:
if(number > 0)
number = 1/number;
Note the difference between > and >=. If number > 0, then it definitely is not 0.
If number can be negative you can also use:
if(number != 0)
number = 1/number;
Note that, as others have mentioned in the comments, checking that number is not 0 will not prevent your result from being Inf or -Inf.
The number in the if condition depends on what you want to do with the result. In IEEE 754, which is used by (almost?) all C implementations, dividing by 0 is OK: you get positive or negative infinity.
If your goal is to avoid +/- Infinity, then the number in the if condition will depend upon the numerator. When the numerator is 1, you can use DBL_MIN or FLT_MIN from float.h.
If your goal is to avoid huge numbers after the division, you can do the division and then check if fabs(number) is bigger than certain value after the division, and then take whatever action as needed.
There is no single correct answer to your question.
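For the numerator-is-1 case described above, a sketch could look like the following. The function name reciprocal_or_unchanged is an assumption, as is the decision to leave small, zero, and negative inputs untouched, which simply mirrors the if statement in the question.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Inverts `number` only when the reciprocal is guaranteed to be finite.
// FLT_MIN is the smallest positive normal float, and 1/FLT_MIN is about 8.5e37,
// which is still below FLT_MAX.
float reciprocal_or_unchanged(float number) {
    if (number >= FLT_MIN) {
        return 1.0f / number;
    }
    return number;  // zero, negative, or subnormal: left as is
}
```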
You can simply check:
if (number > 0)
I can't understand why you need the lower limit.
For a numeric type T, std::numeric_limits<T> gives you anything you need. For example you could do this to make sure that anything above min_invertible has a finite reciprocal:
float max_float = std::numeric_limits<float>::max();
float min_float = std::numeric_limits<float>::min(); // or denorm_min()
float min_invertible = (max_float*min_float > 1.0f )? min_float : 1.0f/max_float;
You can't decently check up front. DBL_MAX / 0.5 effectively is a division by zero; the result is the same infinity you'd get from any other division by (almost) zero.
There is a simple solution: just check the result. std::isinf(result) will tell you whether the result overflowed, and IEEE754 tells you that division cannot produce infinity in other cases. (Well, except for INF/x. That's not really producing infinity but merely preserving it.)
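The check-the-result approach can be sketched as below. The struct Quotient and the function checked_divide are names invented for this example. It uses std::isfinite rather than std::isinf, so a NaN result (such as 0/0) is rejected as well, which goes slightly beyond what the answer describes. IEEE 754 arithmetic is assumed.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Result of a checked division: `value` is meaningful only when `ok` is true.
struct Quotient {
    bool ok;
    double value;
};

// Divides first, then rejects an infinite or NaN quotient.
Quotient checked_divide(double num, double den) {
    const double result = num / den;
    if (!std::isfinite(result)) {
        return {false, 0.0};
    }
    return {true, result};
}
```

Note that checked_divide(DBL_MAX, 0.5) is rejected even though the denominator is nowhere near zero.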
Your risk of producing an unhelpful result through overflow or underflow depends on both numerator and denominator.
A safety check which takes that into consideration is:
if (den == 0.0 || log2(num) - log2(den) >= log2(FLT_MAX))
/* expect overflow */ ;
else
return num / den;
but you might want to shave a small amount off log2(FLT_MAX) to leave wiggle-room for subsequent arithmetic and round-off.
You can do something similar with frexp, which would work for negative values as well:
int max;
int n, d;
frexp(FLT_MAX, &max);
frexp(num, &n);
frexp(den, &d);
if (den == 0.0 || n - d > max)
/* might overflow */ ;
else
return num / den;
This avoids the work of computing the logarithm, which might be more efficient if the compiler can find a suitable way of doing it, but it's not as accurate.
With IEEE 32-bit floats, the smallest possible value greater than 0 is 2^-149.
If you're using IEEE 64-bit, the smallest possible value is 2^-1074.
That said, (x > 0) is probably the better test.
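If you want those two limits in code rather than as literals, std::numeric_limits exposes them as denorm_min(). The sketch below assumes IEEE 754 float and double with subnormal support enabled (no flush-to-zero mode). The two wrapper function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Smallest positive (subnormal) float: 2^-149 on IEEE 754 systems.
float smallest_positive_float() {
    return std::numeric_limits<float>::denorm_min();
}

// Smallest positive (subnormal) double: 2^-1074 on IEEE 754 systems.
double smallest_positive_double() {
    return std::numeric_limits<double>::denorm_min();
}
```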
Well this is a strange one. Ordinarily the following if( 4.0 == 4.0 ){return true} will always return true. In a simple little opengl 3d 'shooter' program I have, when I am trying to add 'jumping' effects, this is not the case.
The idea is fairly simple, I have a terrain of triangle strips, as the 'character' moves/walks you move along a 2d array of heights, hence you walk up and down the various elevations in the hills/troughs.
Outside of the drawScene() function (or if you know opengl, the glutDisplayFunc()), I have an update() function which raises the character up and down as he 'jumps', this is called in drawScene(). In words, as high level as I can explain it, the jumping algorithm is below:
parameters:
double currentJumpingHeight
double maximumJumpingHeight = 4.0
double ypos;
const double jumpingIncrement = 0.1; //THE PROBLEM!!!! HAS TO BE A MULTIPLE OF 0.5!!
bool isJumping = false;
bool ascending = true;
the algorithm:
(when space is pressed) isJumping = true.
if ascending = true and isJumping = true,
currentJumpingHeight += jumpingIncrement (and the same for ypos)
if currentJumpingHeight == maximumJumpingHeight, //THE PROBLEM
ascending = false
(we decrement currentJumpingHeight and start to fall until we hit the ground.)
It's very simple, BUT IT ONLY WORKS WHEN jumpingIncrement IS A MULTIPLE of 0.5!!
If jumpingIncrement is, say, 0.1, then currentJumpingHeight will never equal maximumJumpingHeight. The character takes off like a rocket and never returns to the ground. Even when the two variables are printed to the standard output and are the same, the condition is never true. This is the problem I would like to solve, it is absurd.
I don't want any feedback on the jumping algorithm - just the above paragraph please.
please help.
It's just a typical floating point precision problem. In particular 0.1 cannot be represented exactly using binary floating points. Numbers like 0.5 and 0.25 on the other hand can be represented exactly, and thus will probably work. I believe the compiler is still free to make it not work, even in that case.
In your case a solution is using >=:
if ascending && (currentJumpingHeight >= maximumJumpingHeight)
ascending = false
currentJumpingHeight = maximumJumpingHeight
You could also use epsilon comparisons, but I avoid them where possible. In your case a simple >= seems cleaner than epsilon equality.
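Here is a sketch of the >= version as a small, testable simulation. It is not the asker's actual update() function: the struct JumpResult, the function simulate_jump, and the frame counter are assumptions added so that the behaviour can be checked outside of an OpenGL program.

```cpp
#include <cassert>

// Summary of one simulated jump.
struct JumpResult {
    double peak;         // highest value reached by currentJumpingHeight
    double finalHeight;  // height when the character lands
    int frames;          // number of update steps taken
};

JumpResult simulate_jump(double jumpingIncrement, double maximumJumpingHeight) {
    assert(jumpingIncrement > 0.0 && maximumJumpingHeight > 0.0);
    double currentJumpingHeight = 0.0;
    double peak = 0.0;
    bool ascending = true;
    int frames = 0;
    while (true) {
        ++frames;
        if (ascending) {
            currentJumpingHeight += jumpingIncrement;
            // >= instead of ==: overshooting by a rounding error still counts.
            if (currentJumpingHeight >= maximumJumpingHeight) {
                currentJumpingHeight = maximumJumpingHeight;  // clamp to the top
                ascending = false;
            }
            if (currentJumpingHeight > peak) peak = currentJumpingHeight;
        } else {
            currentJumpingHeight -= jumpingIncrement;
            if (currentJumpingHeight <= 0.0) {
                currentJumpingHeight = 0.0;  // clamp to the ground
                break;
            }
        }
    }
    return {peak, currentJumpingHeight, frames};
}
```

With an increment of 0.1 and a maximum of 4.0 the jump now peaks at exactly 4.0 and ends on the ground, whether or not the running sum ever hits 4.0 exactly.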
As people said, it's a typical floating point precision problem.
Basically, in the same way that you can't write 1/3 with a finite number of digits in base 10
1/3 = 0.3333333333.....
you can't write 1/10 with a finite number of digits in base 2.
It will be
0.1 = 0.0001100110011001100110011001100110011001100110011... (binary)
You should never compare floats for equality; instead check that their difference is smaller than some value, e.g.
if (fabs(a-b) < 1e-6)
instead of
if(a==b)
Of course in your case, just use '>='.
It works with your 0.5 increment because 0.5 is 1/2, or 0.1 in binary, so it is represented exactly.
That will work for any power of 2, such as 0.5 or 0.125, but never with 0.1.
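The difference between exact and tolerance-based comparison can be sketched as follows. The helper names exactly_equal, nearly_equal, and accumulate are invented for this example, the 1e-6 tolerance is the one from the answer, and IEEE 754 doubles are assumed.

```cpp
#include <cassert>
#include <cmath>

// Plain == comparison, shown only for contrast.
bool exactly_equal(double a, double b) {
    return a == b;
}

// Tolerance-based comparison: true when a and b differ by less than `tolerance`.
bool nearly_equal(double a, double b, double tolerance = 1e-6) {
    assert(tolerance >= 0.0);
    return std::fabs(a - b) < tolerance;
}

// Adds `step` to zero `count` times, the way a per-frame increment would.
double accumulate(double step, int count) {
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += step;
    }
    return total;
}
```

Eight steps of 0.5 land exactly on 4.0 because every intermediate value is representable. Forty steps of 0.1 are only guaranteed to land near 4.0, which is why the tolerance (or >=) is the safer test.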
Suppose I have some code such as:
float a, b = ...; // both positive
int s1 = ceil(sqrt(a/b));
int s2 = ceil(sqrt(a/b)) + 0.1;
Is it ever possible that s1 != s2? My concern is when a/b is a perfect square. For example, perhaps a=100.0 and b=4.0, then the output of ceil should be 5.00000 but what if instead it is 4.99999?
Similar question: is there a chance that sqrt(100.0/4.0) evaluates to say 5.00001 and then ceil will round it up to 6.00000?
I'd prefer to do this in integer math but the sqrt kinda screws that plan.
EDIT: suggestions on how to better implement this would be appreciated too! The a and b values are integer values, so actual code is more like: ceil(sqrt(float(a)/b))
EDIT: Based on levis501's answer, I think I will do this:
float a, b = ...; // both positive
int s = sqrt(a/b);
while (s*s*b < a) ++s;
Thank you all!
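Since a and b are really integers, the loop from the last edit can be written so that the final decision is made entirely in integer arithmetic. This is a sketch, not the asker's code: the name ceil_sqrt_ratio is made up, it adds a downward correction in case the floating point estimate is one too high, and it assumes s*s*b fits in a long long.

```cpp
#include <cassert>
#include <cmath>

// Smallest non-negative integer s with s*s*b >= a, for positive integers a and b.
// sqrt only provides the starting guess; the loops settle the answer exactly.
long long ceil_sqrt_ratio(long long a, long long b) {
    assert(a > 0 && b > 0);
    long long s = static_cast<long long>(
        std::sqrt(static_cast<double>(a) / static_cast<double>(b)));
    while (s > 0 && (s - 1) * (s - 1) * b >= a) --s;  // guess was too high
    while (s * s * b < a) ++s;                          // guess was too low
    return s;
}
```

For a=100 and b=4 this yields 5, and for a=101 and b=4 it yields 6, regardless of how sqrt rounds.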
I don't think it's possible. Regardless of the value of sqrt(a/b), what it produces is some value N that we use as:
int s1 = ceil(N);
int s2 = ceil(N) + 0.1;
Since ceil always produces an integer value (albeit represented as a double), we will always have some value X, for which the first produces X.0 and the second X.1. Conversion to int will always truncate that .1, so both will result in X.
It might seem like there would be an exception if X was so large that X.1 overflowed the range of double. I don't see where this could be possible though. Except close to 0 (where overflow isn't a concern) the square root of a number will always be smaller than the input number. Therefore, before ceil(N)+0.1 could overflow, the a/b being used as an input in sqrt(a/b) would have to have overflowed already.
You may want to write an explicit function for your case. e.g.:
/* return the smallest positive integer whose square is at least x */
int isqrt(double x) {
int y1 = ceil(sqrt(x));
int y2 = y1 - 1;
if ((y2 * y2) >= x) return y2;
return y1;
}
This will handle the odd case where sqrt overshoots the exact square root of your ratio a/b by a rounding error, so that ceil lands one too high.
Equality of floating point numbers is indeed an issue, but IMHO not if we deal with integer numbers.
If you have the case of 100.0/4.0, it should evaluate to exactly 25.0, as 25.0 is exactly representable as a float, as opposed to e.g. 25.1.
Yes, it's entirely possible that s1 != s2. Why is that a problem, though?
It seems natural enough that s1 != (s1 + 0.1).
BTW, if you would prefer to have 5.00001 rounded to 5.00000 instead of 6.00000, use rint instead of ceil.
And to answer the actual question (in your comment) - you can use sqrt to get a starting point and then just find the correct square using integer arithmetic.
int min_dimension_greater_than(int items, int buckets)
{
double target = double(items) / buckets;
int min_square = ceil(target);
int dim = floor(sqrt(target));
int square = dim * dim;
while (square < min_square) {
dim += 1;
square = dim * dim;
}
return dim;
}
And yes, this can be improved a lot, it's just a quick sketch.
s1 will always equal s2.
The C and C++ standards do not say much about the accuracy of math routines. Taken literally, it is impossible for the standard to be implemented, since the C standard says sqrt(x) returns the square root of x, but the square root of two cannot be exactly represented in floating point.
Implementing routines with good performance that always return a correctly rounded result (in round-to-nearest mode, this means the result is the representable floating-point number that is nearest to the exact result, with ties resolved in favor of a low zero bit) is a difficult research problem. Good math libraries target accuracy less than 1 ULP (so one of the two nearest representable numbers is returned), perhaps something slightly more than .5 ULP. (An ULP is the Unit of Least Precision, the value of the low bit given a particular value in the exponent field.) Some math libraries may be significantly worse than this. You would have to ask your vendor or check the documentation for more information.
So sqrt may be slightly off. If the exact square root is an integer (within the range in which integers are exactly representable in floating-point) and the library guarantees errors are less than 1 ULP, then the result of sqrt must be exactly correct, because any result other than the exact result is at least 1 ULP away.
Similarly, if the library guarantees errors are less than 1 ULP, then ceil must return the exact result, again because the exact result is representable and any other result would be at least 1 ULP away. Additionally, the nature of ceil is such that I would expect any reasonable math library to always return an integer, even if the rest of the library were not high quality.
As for overflow cases, if ceil(x) were beyond the range where all integers are exactly representable, then ceil(x)+.1 is closer to ceil(x) than it is to any other representable number, so the rounded result of adding .1 to ceil(x) should be ceil(x) in any system implementing the floating-point standard (IEEE 754). That is provided you are in the default rounding mode, which is round-to-nearest. It is possible to change the rounding mode to something like round-toward-infinity, which could cause ceil(x)+.1 to be an integer higher than ceil(x).
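The two claims above, that s1 equals s2 for ordinary inputs and that adding .1 is absorbed for very large values, can be sketched as follows. The function names s1_of, s2_of, and absorbs_tenth are assumptions for illustration, and the default round-to-nearest mode of IEEE 754 is assumed throughout.

```cpp
#include <cassert>
#include <cmath>

// The two expressions from the question, wrapped as functions.
int s1_of(float a, float b) {
    return static_cast<int>(std::ceil(std::sqrt(a / b)));
}

int s2_of(float a, float b) {
    return static_cast<int>(std::ceil(std::sqrt(a / b)) + 0.1);
}

// True when adding 0.1 to x rounds straight back to x.
bool absorbs_tenth(double x) {
    return x + 0.1 == x;
}
```

For a value such as 1e30, far beyond the range where every integer is representable, the added 0.1 is smaller than half the gap to the next double and disappears in rounding.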
I have two values(floats) I am attempting to add together and average. The issue I have is that occasionally these values would add up to zero, thus not requiring them to be averaged.
The situation I am in specifically contains the values "-1" and "1", yet when added together I am given the value "-1.19209e-007" which is clearly not 0. Any information on this?
I'm sorry but this doesn't make sense to me.
Two floating point values that are exactly the same but with opposite sign always produce exactly 0 when added together. This is how floating point operations work.
float a = 0.2f;
float b = -0.2f;
float f = (a + b) / 2;
printf("%f %d\n", f, f != 0); // will print out 0.000000 0
The result will always be 0, even if the compiler doesn't optimize the code.
There is no rounding error to take into account if a and b have the same value but opposite sign! That is, if the sign (highest) bit of a is 0, the sign bit of b is 1 and all other bits are the same, the result cannot be other than 0.
But if a and b are slightly different, of course, the result can be non-zero.
One possible solution to avoid this can be using a tolerance...
float f = (a + b) / 2;
if (fabsf(f) < 0.000001f)
f = 0;
We are using a simple tolerance to see if our value is near to zero.
A nice example code to show this is...
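#include <stdio.h>

int main(int argc, char **argv)
{
    (void)argv; /* unused */
    for (int i = -10000000; i <= 10000000 * argc; ++i)
    {
        if (i != 0)
        {
            float a = 3.14159265f / i;
            float b = -a + (argc - 1); /* argc is 1 with no arguments, so b == -a */
            float f = (a + b) / 2;
            if (f != 0)
                printf("%f %g\n", a, f); /* not reached when b == -a */
        }
    }
    printf("completed\n");
    return 0;
}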
I'm using "argc" here as a trick to force the compiler to not optimize out our code.
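The tolerance check from this answer can also be wrapped in a small helper, sketched here. The name average_snapped is invented for this example, the default tolerance is the 0.000001f used above, and std::fabs is used so that the float overload is picked rather than the integer abs.

```cpp
#include <cassert>
#include <cmath>

// Averages two floats and snaps a result that is within `tolerance` of zero
// to exactly zero.
float average_snapped(float a, float b, float tolerance = 0.000001f) {
    assert(tolerance >= 0.0f);
    float f = (a + b) / 2;
    if (std::fabs(f) < tolerance) {
        f = 0;
    }
    return f;
}
```

An input pair such as 1.0f and -0.9999999f, which is the kind of near miss that produces a sum around 1.19209e-07, then averages to exactly 0.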
At least right off, this sounds like typical floating point imprecision.
The usual way to deal with it is to round your numbers to a sensible number of decimal places. In this case, your sum came out as -1.19209e-07 (i.e., -0.000000119209). To (say) six decimal places, that is zero.
Take the sum of all your numbers and divide by your count. Round off your answer to something reasonable before you do prints, reports, comparisons, or whatever you're doing.
again, do some searching on this but here is the basic explanation ...
the computer approximates floating point numbers by base 2 instead of base 10. this means that , for example, 0.2 (when converted to binary) is actually 0.001100110011 ... on forever. since the computer cannot add these on forever, it must approximate it.
because of these approximations, we lose "precision" of calculations. hence "single" and "double" precision floating point numbers. this is why you never test for a float to be exactly 0. instead, you test whether it is below some threshold which you want to use as zero.