What checks can I perform to identify what differences there are in the floating point behaviour of two hardware platforms?
Verifying IEEE-754 compliance or checking for known bugs may be sufficient (to explain a difference in output that I've observed).
I have looked at the CPU flags via /proc/cpuinfo and both claim to support SSE2.
I looked at:
https://www.vinc17.net/research/fptest.en.html
http://www.jhauser.us/arithmetic/TestFloat.html
but they look challenging to use.
I've built TestFloat but I'm not sure what to do with it. The home page says:
"Unfortunately, TestFloat’s output is not easily interpreted. Detailed
knowledge of the IEEE Standard is required to use TestFloat
responsibly."
Ideally I just want one or two programs or some simple configure style checks I can run and compare the output between two platforms.
Ideally I would then convert this into configure checks to ensure that an attempt to compile the non-portable code on a platform that behaves abnormally is detected at configure time rather than at run time.
Background
I have found a difference in behaviour for a C++ application on two different platforms:
Intel(R) Xeon(R) CPU E5504
Intel(R) Core(TM) i5-3470 CPU
Code compiled natively on either machine runs on the other but
for one test the behaviour depends on which machine the code is run on.
Clarification
The executable compiled on machine A behaves like the executable compiled on machine B when copied to run on machine B, and vice versa.
It could be an uninitialised variable (though nothing showed up in valgrind) or many other things, but
I suspected that the cause could be non-portable use of floating point.
Perhaps one machine is interpreting the floating point assembly differently from the other?
The implementers have confirmed they know about this.
It's not my code and I have no desire to completely rewrite it to test this. Recompiling is fine though.
I want to test my hypothesis.
In the related question I am looking at how to enable software floating point. This question is tackling the problem from the other side.
Update
I've gone down the configure check road and tried the following, based on @chux's hints.
#include <iostream>
#include <cfloat>

int main(int /*argc*/, const char* /*argv*/[])
{
    std::cout << "FLT_EVAL_METHOD=" << FLT_EVAL_METHOD << "\n";
    std::cout << "FLT_ROUNDS=" << FLT_ROUNDS << "\n";
#ifdef __STDC_IEC_559__
    std::cout << "__STDC_IEC_559__ is defined\n";
#endif
#ifdef __GCC_IEC_559__
    std::cout << "__GCC_IEC_559__ is defined\n";
#endif
    std::cout << "FLT_MIN=" << FLT_MIN << "\n";
    std::cout << "FLT_MAX=" << FLT_MAX << "\n";
    std::cout << "FLT_EPSILON=" << FLT_EPSILON << "\n";
    std::cout << "FLT_RADIX=" << FLT_RADIX << "\n";
    return 0;
}
Giving identical output on both platforms:
./floattest
FLT_EVAL_METHOD=0
FLT_ROUNDS=1
__STDC_IEC_559__ is defined
FLT_MIN=1.17549e-38
FLT_MAX=3.40282e+38
FLT_EPSILON=1.19209e-07
FLT_RADIX=2
I'm still looking for something that might be different.
OP has 2 goals that conflict a bit.
How to detect differences in floating point behaviour across platforms (?)
I just want one or two programs or some simple configure style checks I can run and compare the output between two platforms.
Yes, some differences are easy to detect, but others can be exceedingly subtle.
Sample: Can the floating-point status flag FE_UNDERFLOW be set when the result is not sub-normal?
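As an illustration of how such a corner case might be probed, here is a minimal sketch (mine, not from the answer or the linked question) using <cfenv>. IEEE 754 allows tininess to be detected either before or after rounding, so two otherwise-compliant platforms may legitimately print different answers; build it without aggressive optimization (and with FLT_EVAL_METHOD == 0) so the status flags are meaningful.

#include <cfenv>
#include <cfloat>
#include <cmath>
#include <iostream>

int main()
{
    std::feclearexcept(FE_ALL_EXCEPT);

    // Largest subnormal double; the multiplication below has an exact result
    // just under DBL_MIN that rounds up to DBL_MIN (a normal number).
    volatile double x = std::nextafter(DBL_MIN, 0.0);
    volatile double y = x * (1.0 + DBL_EPSILON);

    // Platforms that detect tininess before rounding raise FE_UNDERFLOW here;
    // platforms that detect it after rounding do not. Both are IEEE-conformant.
    std::cout << "result == DBL_MIN : " << (y == DBL_MIN) << "\n";
    std::cout << "FE_UNDERFLOW raised: "
              << (std::fetestexcept(FE_UNDERFLOW) != 0) << "\n";
    return 0;
}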
There are no simple tests for the general problem.
Recommend either:
Revamp the coding goal to allow for nominal differences.
See if __STDC_IEC_559__ is defined and hope that is sufficient for your application. Given various other factors like FLT_EVAL_METHOD, FLT_ROUNDS, and optimization levels, code can still be compliant yet produce different results, but the differences will be more manageable.
If super high consistency is needed, do not use floating point.
I found a program called esparanoia that does some checks of floating point behaviour. This is based on William Kahan's original paranoia program, which found the infamous Pentium division bug.
While it did not detect any problems with my test systems (and thus is not sufficient to answer the question) it might be of interest to someone else.
Related
Description
I'm trying to switch over from the classic Intel compiler in the Intel oneAPI toolkit to the next-generation DPC++/C++ compiler (ICX), but the default handling of floating point operations appears broken, or at least different: comparison with infinity always evaluates to false in fast floating point modes. That is both the text of a compiler warning and the behaviour I now experience with ICX, but not a behaviour I saw with the classic compiler (for the same minimal set of compiler flags).
Minimally reproducible example
#include <iostream>
#include <cmath>

int main()
{
    double a = 1.0 / 0.0;
    if (std::isinf(a))
        std::cout << "is infinite";
    else
        std::cout << "is not infinite;";
}
Compiler Flags:
-O3 -Wall -fp-model=fast
ICC 2021.5.0 Output:
is infinite
(also tested on several older versions)
ICX 2022.0.0 Output:
is not infinite
(also tested on 2022.0.1)
Live demo on compiler-explorer:
https://godbolt.org/z/vzeYj1Wa3
By default -fp-model=fast is enabled on both compilers. If I manually specify -fp-model=precise I can recover the behaviour but not the performance.
Does anyone know of a potential solution to both maintain the previous behaviour & performance of the fast floating point model using the next-gen compiler?
If you add -fp-speculation=safe to -fp-model=fast, you will still get the warning that you shouldn't use -fp-model=fast if you want to check for infinity, but the condition will evaluate correctly: godbolt.
In the Intel Porting Guide for ICC Users to DPCPP or ICX it is stated that:
FP Strictness: Nothing stricter than the default is supported. There is no support for -fp-model strict, -fp-speculation=safe, #pragma fenv_access, etc. Implementing support for these is a work-in-progress in the open source community.
Even though it works for the current version of the tested compiler (icx 2022.0.0), there is a discrepancy: either the documentation is outdated (more probable), or this feature is working by accident (less probable).
This question already has answers here:
Is floating point math broken?
(31 answers)
Math precision requirements of C and C++ standard
(1 answer)
Closed 4 years ago.
I have a program that was giving slightly different results under Android and Windows. As I validate the output data against a binary file containing the expected result, the difference, even if very small (a rounding issue), is annoying and I must find a way to fix it.
Here is a sample program:
#include <iostream>
#include <iomanip>
#include <cmath>

int main( int argc, char* argv[] )
{
    // this value was identified as producing different results when used as a parameter to std::exp
    unsigned char val[] = {158, 141, 250, 206, 70, 125, 31, 192};
    double var = *((double*)val);

    std::cout << std::setprecision(30);
    std::cout << "var is " << var << std::endl;
    double exp_var = std::exp(var);
    std::cout << "std::exp(var) is " << exp_var << std::endl;
}
Under Windows, compiled with Visual 2015, I get the output:
var is -7.87234042553191493141184764681
std::exp(var) is 0.00038114128472300899284561093161
Under Android/armv7, compiled with g++ NDK r11b, I get the output:
var is -7.87234042553191493141184764681
std::exp(var) is 0.000381141284723008938635502307335
So the results differ from about the 20th decimal place onward:
PC: 0.00038114128472300899284561093161
Android: 0.000381141284723008938635502307335
Note that my program does a lot of math operations and I only noticed std::exp producing different results for the same input... and only for some specific input values (I did not investigate whether those values share a similar property); for most of them, the results are identical.
Is this behaviour kind of "expected"? Is there no guarantee of getting the same result in some situations?
Is there some compiler flag that could fix that?
Or do I need to round my result so that it ends up the same on both platforms? Then what would be a good strategy for rounding? Rounding arbitrarily at 1e-20 would lose too much information if the input var is very small.
Edit: I consider my question not to be a duplicate of Is floating point math broken?. I get exactly the same values on both platforms; only std::exp, for some specific input values, produces different results.
The standard does not define how the exp function (or any other math library function1) should be implemented, thus each library implementation may use a different computing method.
For instance, the Android C library (bionic) uses an approximation of exp(r) by a special rational function on the interval [0,0.34658] and scales back the result.
Probably the Microsoft library uses a different computing method (I could not find information about it), thus resulting in different results.
Also, the libraries could take a dynamic loading strategy (i.e. load a .dll containing the actual implementation) in order to leverage hardware-specific features, making the result even more unpredictable, even when using the same compiler.
In order to get the same implementation in both (all) platforms, you could use your own implementation of the exp function, thus not relying on the different implementations of the different libraries.
Take into account that the processors might also be taking different rounding approaches, which would likewise yield a different result.
1 There are some exceptions to this, for instance the sqrt function, std::fma, and some rounding functions and basic arithmetic operations.
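As a side note (not part of the original answer), when chasing this kind of divergence it can help to compare the raw bit pattern of the result rather than a rounded decimal printout: two platforms agree exactly if and only if the 16 hex digits match. A minimal sketch, reading the bytes with memcpy to avoid the aliasing cast used in the question:

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    // Same input bytes as in the question (assumes a little-endian double
    // layout, as on the platforms being compared).
    unsigned char val[] = {158, 141, 250, 206, 70, 125, 31, 192};
    double var;
    std::memcpy(&var, val, sizeof var);

    double r = std::exp(var);

    // Print the exact bit pattern of the result alongside the decimal value.
    std::uint64_t bits;
    std::memcpy(&bits, &r, sizeof bits);
    std::printf("var  = %.17g\nexp  = %.17g\nbits = %016llx\n",
                var, r, static_cast<unsigned long long>(bits));
    return 0;
}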
Sample code:
#include <iostream>
#include <cmath>
#include <stdint.h>
using namespace std;
static bool my_isnan(double val) {
    union { double f; uint64_t x; } u = { val };
    return (u.x << 1) > (0x7ff0000000000000u << 1);
}

int main() {
    cout << std::isinf(std::log(0.0)) << endl;
    cout << std::isnan(std::sqrt(-1.0)) << endl;
    cout << my_isnan(std::sqrt(-1.0)) << endl;
    cout << __isnan(std::sqrt(-1.0)) << endl;
    return 0;
}
Online compiler.
With -ffast-math, that code prints "0, 0, 1, 1" -- without, it prints "1, 1, 1, 1".
Is that correct? I thought that std::isinf/std::isnan should still work with -ffast-math in these cases.
Also, how can I check for infinity/NaN with -ffast-math? You can see the my_isnan doing this, and it actually works, but that solution is of course very architecture dependent. Also, why does my_isnan work here and std::isnan does not? What about __isnan and __isinf. Do they always work?
With -ffast-math, what is the result of std::sqrt(-1.0) and std::log(0.0)? Does it become undefined, or should it be NaN / -Inf?
Related discussions: (GCC) [Bug libstdc++/50724] New: isnan broken by -ffinite-math-only in g++, (Mozilla) Bug 416287 - performance improvement opportunity with isNaN
Note that -ffast-math may make the compiler ignore/violate IEEE specifications, see http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options :
This option is not turned on by any -O option besides -Ofast since it
can result in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math functions.
It may, however, yield faster code for programs that do not require
the guarantees of these specifications.
Thus, using -ffast-math you are not guaranteed to see infinity where you should.
In particular, -ffast-math turns on -ffinite-math-only, see http://gcc.gnu.org/wiki/FloatingPointMath which means (from http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options )
[...] optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs
This means, by enabling the -ffast-math you make a promise to the compiler that your code will never use infinity or NaN, which in turn allows the compiler to optimize the code by, e.g., replacing any calls to isinf or isnan by the constant false (and further optimize from there). If you break your promise to the compiler, the compiler is not required to create correct programs.
Thus the answer is quite simple: if your code may have infinities or NaNs (which is strongly implied by the fact that you use isinf and isnan), you cannot enable -ffast-math, as otherwise you might get incorrect code.
Your implementation of my_isnan works (on some systems) because it directly checks the binary representation of the floating point number. Of course, the processor still might do (some) actual calculations (depending on which optimizations the compiler does), and thus actual NaNs might appear in memory and you can check their binary representation, but as explained above, std::isnan might have been replaced by the constant false. It might equally well happen that the compiler replaces, e.g., sqrt, by some version that doesn't even produce a NaN for input -1. In order to see which optimisations your compiler does, compile to assembler and look at that code.
To make a (not completely unrelated) analogy, if you're telling your compiler your code is in C++ you can not expect it to compile C code correctly and vice-versa (there are actual examples for this, e.g. Can code that is valid in both C and C++ produce different behavior when compiled in each language? ).
It is a bad idea to enable -ffast-math and use my_isnan, because this makes everything very machine- and compiler-dependent: you don't know what optimizations the compiler does overall, so there might be other hidden problems caused by using non-finite maths while telling the compiler otherwise.
A simple fix is to use -ffast-math -fno-finite-math-only which would still give some optimizations.
It also might be that your code looks something like this:
filter out all infinities and NaNs
do some finite maths on the filtered values (by this I mean maths that is guaranteed to never create infinities or NaNs, this has to be very, very carefully checked)
In this case, you could split up your code and either use optimize #pragma or __attribute__ to turn -ffast-math (respectively -ffinite-math-only and -fno-finite-math-only) on and off selectively for the given pieces of code (however, I remember there being some trouble with some version of GCC related to this) or just split your code into separate files and compile them with different flags. Of course, this also works in more general settings if you can isolate the parts where infinities and NaNs might occur. If you can not isolate these parts, this is a strong indication that you can not use -ffinite-math-only for this code.
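For concreteness, a minimal sketch of that "split it up" idea (my own, assuming GCC; the function names are made up). As noted above, this attribute has been unreliable in some GCC versions, so check the generated assembly before relying on it:

#include <cmath>
#include <iostream>
#include <limits>
#include <vector>

// Compiled as if with -fno-finite-math-only, so std::isnan/std::isinf are
// not folded to constant false even when the file is built with -ffast-math.
__attribute__((optimize("no-finite-math-only")))
bool is_usable(double x)
{
    return !std::isnan(x) && !std::isinf(x);
}

// Plain finite maths on the filtered values; fine under -ffast-math.
double sum_of_squares(const std::vector<double>& values)
{
    double sum = 0.0;
    for (double x : values)
        if (is_usable(x))
            sum += x * x;
    return sum;
}

int main()
{
    std::vector<double> v{1.0, 2.0, std::numeric_limits<double>::quiet_NaN()};
    std::cout << sum_of_squares(v) << "\n";   // 5: the NaN is filtered out
    return 0;
}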
Finally, it's important to understand that -ffast-math is not a harmless optimization that simply makes your program faster. It affects not only the performance of your code but also its correctness (and this on top of all the issues already surrounding floating point numbers; if I remember right, William Kahan has a collection of horror stories on his homepage, see also What every programmer should know about floating point arithmetic). In short, you might get faster code, but also wrong or unexpected results (see below for an example). Hence, you should only use such optimizations when you really know what you are doing and have made absolutely sure that either
the optimizations don't affect the correctness of that particular code, or
the errors introduced by the optimization are not critical to the code.
Program code can actually behave quite differently depending on whether this optimization is used or not. In particular it can behave wrong (or at least very contrary to your expectations) when optimizations such as -ffast-math are enabled. Take the following program for example:
#include <iostream>
#include <limits>

int main() {
    double d = 1.0;
    double max = std::numeric_limits<double>::max();
    d /= max;
    d *= max;
    std::cout << d << std::endl;
    return 0;
}
This will produce the output 1 as expected when compiled without any optimization flag, but with -ffast-math, it will output 0.
I have a very strange bug in my program. I was not able to isolate the error in reproducible code, but at a certain place in my code there is:
double distance, criticalDistance;
...
if (distance > criticalDistance)
{
    std::cout << "first branch" << std::endl;
}
if (distance == criticalDistance)
{
    std::cout << "second branch" << std::endl;
}
In debug build everything is fine. Only one branch gets executed.
But in release build all hell breaks loose and sometimes both branches get executed.
This is very strange, since if I add the else conditional:
if (distance > criticalDistance)
{
    std::cout << "first branch" << std::endl;
}
else if (distance == criticalDistance)
{
    std::cout << "second branch" << std::endl;
}
This does not happen.
Please, what can be the cause of this? I am using gcc 4.8.1 on Ubuntu 13.10 on a 32 bit computer.
EDIT1:
I am using the compiler flags
-std=gnu++11
-gdwarf-3
EDIT2:
I do not think this is caused by a memory leak. I analyzed both release and debug builds with the valgrind memory analyzer, with tracking of uninitialized memory and detection of self-modifying code, and I found no errors.
EDIT3:
Changing the declaration to
volatile double distance, criticalDistance;
makes the problem go away. Does this confirm woolstar's answer? Is this a compiler bug?
EDIT4:
using the gcc option -ffloat-store also fixes the problem. If I understand this correctly this is caused by gcc.
if (distance > criticalDistance)
// true
if (distance == criticalDistance)
// also true
I have seen this behavior before in my own code. It is due to the mismatch between the standard 64-bit value stored in memory and the 80-bit internal values that Intel processors use for floating point calculation.
Basically, when truncated to 64 bits your values are equal, but at 80-bit precision one is slightly larger than the other. In a DEBUG build the values are always stored to memory and then reloaded, so they are always truncated. In an optimized build the compiler reuses the value in the floating point register and it doesn't get truncated.
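A minimal sketch (mine, not from this answer; round_to_double is a made-up helper) of a targeted workaround based on this explanation: force both operands through a 64-bit memory slot before the comparison, which is what the question's EDIT3 (volatile) and EDIT4 (-ffloat-store) do globally. On x86, -mfpmath=sse -msse2 avoids the 80-bit x87 registers altogether.

#include <iostream>

// Writing to and reading back from a volatile forces a 64-bit store/reload,
// discarding any extra x87 precision held in a register.
static double round_to_double(double x)
{
    volatile double tmp = x;
    return tmp;
}

int main()
{
    double distance = 0.1 * 3.0;        // stand-in for the real computation
    double criticalDistance = 0.3;

    double d = round_to_double(distance);
    double c = round_to_double(criticalDistance);

    // Compared on the truncated 64-bit values, at most one branch can fire.
    if (d > c)
        std::cout << "first branch\n";
    if (d == c)
        std::cout << "second branch\n";
    return 0;
}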
Please, what can be the cause of this?
Undefined behavior, aka. bugs in your code.
There is no IEEE floating point value which exhibits this behavior. So what's happening is that you are doing something wrong, which violates an assumption made by your compiler.
When optimizing your code, the compiler assumes that your code can be described by the C++ standard. If you do anything that is left undefined by the C++ standard, then these assumptions are violated, resulting in "weird" execution. It could be something "simple" like an uninitialized variable or a buffer overrun resulting in parts of the stack or heap being overwritten with garbage data, or it could be something more subtle, where you rely on a specific ordering between two operations, which is not guaranteed by the standard.
That is probably why you were not able to reproduce the problem in a small test case (the smaller test code does not contain the erroneous code), and why you only see the error in optimized builds.
Of course, it is also possible that you've stumbled across a compiler bug, but a bug in your code is quite a bit more likely. :)
And best of all, it means that we don't really have a chance to debug the problem from the code snippet you've shown. We can say "the code shouldn't behave like that", but that's about all.
You are not initializing your doubles; are you sure that they always get a value?
I have found that uninitialized variables in debug builds are always 0, but in release builds they can be pretty much anything.
I just wonder if there is some convenient way to detect whether overflow happens to any variable of any default data type used in a C++ program during runtime. By convenient, I mean no need to write code to check each variable against the range of its data type every time its value changes. Or, if that is impossible to achieve, how would you do it?
For example,
float f1=FLT_MAX+1;
cout << f1 << endl;
doesn't give any error or warning either when compiled with "gcc -W -Wall" or when run.
Thanks and regards!
Consider using Boost's numeric conversion library, which gives you negative_overflow and positive_overflow exceptions (examples).
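For illustration, a minimal sketch of what that looks like (assuming Boost is available; not taken from the linked examples):

#include <boost/numeric/conversion/cast.hpp>
#include <iostream>

int main()
{
    try {
        double big = 1e40;                          // too large for a float
        float f = boost::numeric_cast<float>(big);  // should throw positive_overflow
        std::cout << f << "\n";
    } catch (const boost::numeric::bad_numeric_cast& e) {
        std::cout << "overflow detected: " << e.what() << "\n";
    }
    return 0;
}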
Your example doesn't actually overflow in the default floating-point environment on an IEEE-754 compliant system.
On such a system, where float is 32 bit binary floating point, FLT_MAX is 0x1.fffffep127 in C99 hexadecimal floating point notation. Writing it out as an integer in hex, it looks like this:
0xffffff00000000000000000000000000
Adding one (without rounding, as though the values were arbitrary precision integers), gives:
0xffffff00000000000000000000000001
But in the default floating-point environment on an IEEE-754 compliant system, any value between
0xfffffe80000000000000000000000000
and
0xffffff80000000000000000000000000
(which includes the value you have specified) is rounded to FLT_MAX. No overflow occurs.
Compounding the matter, your expression (FLT_MAX + 1) is likely to be evaluated at compile time, not runtime, since it has no side effects visible to your program.
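To make the distinction concrete, here is a minimal sketch (not part of this answer) contrasting FLT_MAX + 1, which merely rounds back to FLT_MAX, with an operation that really does overflow, observed through the FE_OVERFLOW status flag; the volatile keeps the arithmetic at run time so it is not folded away as described above.

#include <cfenv>
#include <cfloat>
#include <iostream>

int main()
{
    volatile float f = FLT_MAX;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float g = f + 1.0f;                    // rounds back to FLT_MAX
    bool add_overflows = std::fetestexcept(FE_OVERFLOW) != 0;

    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float h = f * 2.0f;                    // rounds to +infinity
    bool mul_overflows = std::fetestexcept(FE_OVERFLOW) != 0;

    std::cout << "FLT_MAX + 1 overflows: " << add_overflows << "\n";   // expect 0
    std::cout << "FLT_MAX * 2 overflows: " << mul_overflows << "\n";   // expect 1
    (void)g;
    (void)h;
    return 0;
}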
In situations where I need to detect overflow, I use SafeInt<T>. It's a cross platform solution which throws an exception in overflow situations.
SafeInt<float> f1 = FLT_MAX;
f1 += 1; // throws
It is available on codeplex
http://www.codeplex.com/SafeInt/
Back in the old days when I was developing C++ (199x), we used a tool called Purify. Back then it was a tool that instrumented the object code and logged everything 'bad' during a test run.
I did a quick Google search and I'm not quite sure whether it still exists.
As far as I know, several open source tools nowadays do more or less the same.
Check out electricfence and valgrind.
Clang provides -fsanitize=signed-integer-overflow and -fsanitize=unsigned-integer-overflow.
http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
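A tiny example (my own, not from the linked manual) of the kind of bug these options catch; built with clang++ -fsanitize=signed-integer-overflow -O0, the multiplication below is reported at run time instead of silently invoking undefined behaviour:

#include <iostream>
#include <limits>

int main()
{
    int big = std::numeric_limits<int>::max();
    int result = big * 2;   // signed integer overflow: flagged by the sanitizer
    std::cout << result << "\n";
    return 0;
}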