What checks can I perform to identify what differences they are in the floating point behaviour of two hardware platforms?
Verifying IEE-754 compliance or checking for known bugs may be sufficient (to explain a difference in output that I've observed).
I have looked at the CPU flags via /proc/cpu and both claim to support SSE2
I looked at:
https://www.vinc17.net/research/fptest.en.html
http://www.jhauser.us/arithmetic/TestFloat.html
but they look challenging to use.
I've built TestFloat but I'm not sure what to do with it. The home page says:
"Unfortunately, TestFloat’s output is not easily interpreted. Detailed
knowledge of the IEEE Standard is required to use TestFloat
responsibly."
Ideally I just want one or two programs or some simple configure style checks I can run and compare the output between two platforms.
Ideally I would then convert this into configure checks to ensure that
the an attempt to compile the non-portable code on a platform that behaves abnormally its detected at configure time rather than run time.
Background
I have found a difference in behaviour for a C++ application on two different platforms:
Intel(R) Xeon(R) CPU E5504
Intel(R) Core(TM) i5-3470 CPU
Code compiled natively on either machine runs on the other but
for one test the behaviour depends on which machine the code is run on.
Clarification
The executable compiled on machine A behaves like the executable compiled on machine B when copied to run on machine B and visa versa.
It could an uninitialised variable (though nothing showed up in valgrind) or many other things but
I suspected that the cause could be non-portable use of floating point.
Perhaps one machine is interpreting the float point assembly differently from the other?
The implementers have confirmed they know about this.
Its not my code and I have no desire to completely rewrite it to test this. Recompiling is fine though.
I want to test my hypothesis.
In the related question I am looking at how to enable software floating point. This question is tackling the problem from the other side.
Update
I've gone down the configure check road tried the following based on #chux's hints.
#include <iostream>
#include <cfloat>
int main(int /*argc*/, const char* /*argv*/[])
{
std::cout << "FLT_EVAL_METHOD=" << FLT_EVAL_METHOD << "\n";
std::cout << "FLT_ROUNDS=" << FLT_ROUNDS << "\n";
#ifdef __STDC_IEC_559__
std::cout << "__STDC_IEC_559__ is defined\n";
#endif
#ifdef __GCC_IEC_559__
std::cout << "__GCC_IEC_559__ is defined\n";
#endif
std::cout << "FLT_MIN=" << FLT_MIN << "\n";
std::cout << "FLT_MAX=" << FLT_MAX << "\n";
std::cout << "FLT_EPSILON=" << FLT_EPSILON << "\n";
std::cout << "FLT_RADIX=" << FLT_RADIX << "\n";
return 0;
}
Giving identical output on both platforms:
./floattest
FLT_EVAL_METHOD=0
FLT_ROUNDS=1
__STDC_IEC_559__ is defined
FLT_MIN=1.17549e-38
FLT_MAX=3.40282e+38
FLT_EPSILON=1.19209e-07
FLT_RADIX=2
I'm still looking for something that might be different.
OP has 2 goals that conflict a bit.
How to detect differences in floating point behaviour across platforms (?)
I just want one or two programs or some simple configure style checks I can run and compare the output between two platforms.
Yes some differences are easy to detect, but some differences can be exceedingly subtle.
Sample Can the floating-point status flag FE_UNDERFLOW set when the result is not sub-normal?
There are no simple tests for the general problem.
Recommend either:
Revamp the coding goal to allow for nominal differences.
See if _STDC_IEC_559__ is defined and hope that is sufficient for you application. Given various other factors like FLT_EVAL_METHOD and FLT_ROUNDS and optimization levels, code can still be compliant yet provide different results, yet the degree will be more manageable.
If super high consistency is needed, do not use floating point.
I found a program called esparanoia that does some checks of floating point behaviour. This is based on William Kahan's original paranoid program found the infamous Pentium division bug.
While it did not detect any problems with my test systems (and thus is not sufficient to answer the question) it might be of interest to someone else.
I have just found a problem and I have no idea what it could be. I started learning programming a few weeks ago and I am learning about pointers.
I compiled exactly the same code in 2 different PC's. In the first, the program runs perfectly. In the second, it stops working when it reaches a certain line.
I use 2 PC's.
The one at my workplace runs Windows XP SP3. In this one, the program worked fine.
The one at my home runs Windows 7 SP1. It compiled the code, but the program did not work.
I am writing and compiling using DEV C++ and TDM GCC 5.1.0 in both systems.
#include<iostream>
using namespace std;
int main (void) {
int* pointer;
cout << "pointer == " << pointer << "\n";
cout << "*pointer == " << *pointer << "\n"; // this is the line where the program stops.
cout << "&pointer == " << &pointer << "\n";
return 0;}
The output in the first computer was something like:
pointer == 0x000001234
*pointer == some garbage value
&pointer == 0x000007865
In the second computer, it stops at second line.
pointer == 0x1
I do understand that the pointer have not been assigned to a variable. Therefore, it does not store any correct address. Even so, it should at least show the garbage value inside it, or a "0" to indicate it has not yet an address to point to. I know the code is right because it worked fine in the first PC. But I do not understand why it failed in other computer.
I know the code is right because it worked fine in the first PC
You know no such thing.
You have undefined behaviour, and one entirely valid consequence is a program that always works. Or always works except on Saturdays, or always works until after you finished testing and shipped it to a paying customer, or always works on one machine and always fails on another.
The behaviour is undefined, not "defined to some specific consistent observable mode of failure".
Specifically, the real risk of undefined behaviour isn't simply that the result of some operation has an unspecified value, but that it may have undefined and unpredictable side-effects - on apparently-unrelated areas of your program, or on the system as a whole.
Even so, it should at least show the garbage value inside it
It did. But then you asked it to dereference that garbage value.
Reading any variable with an unspecified value is itself Undefined Behaviour, so the first piece of UB is reading the value of the pointer.
Following (dereferencing) a pointer which doesn't point to a valid object is also undefined behaviour, because you don't know whether the unspecified value you illegally interpreted as an address is correctly aligned for the type, or is mapped in your process' address space.
If you successfully load some integer from that address, that is a third piece of undefined behaviour, because again its value is unspecified.
So, the worst-case immediate pitfalls (with hardware trap values and restrictive alignment) are:
read the unspecified pointer value, get a trap representation, die with a hardware trap
OR read the unspecified pointer value, interpret it as an address which is misaligned, die with a bus error
OR follow the unspecified pointer to an unmapped address, die with a segment violation
OR survive all the previous steps - by pure chance - load some random value from some location in memory. Then die because that value is a trap representation.
But your if your process just dies, reproducibly, you can easily debug and fix it with no ill effects. In that sense, crashing at the point of invoking UB is actually the best possible outcome. The alternatives are worse, less predictable, and harder to debug.
I do understand that the pointer have not been assigned to a variable. Therefore, it does not store any correct address. Even so, it should at least show the garbage value inside it, or a "0" to indicate it has not yet an address to point to.
It did! That was the 0x000001234.
Unfortunately you then tried to dereference this invalid pointer, and print the value of an int that does not exist. You cannot do that.
If you hadn't done that, we'd have made it to the third line, where the 0x000007865 would correctly represent the address of the pointer, which is an object with name pointer and type int* that does indeed exist.
I know the code is right because it worked fine in the first PC.
One of the things you'll have to get used to with C++ is that "it appears to work on one computer" is very far from proof that the code is correct. Read about undefined behaviour and weep slow tears.
But I do not understand why it failed in other computer.
Because the code isn't right, and you didn't get "lucky" this time.
We could analyse a few reasons why it appeared to work on one system and not the other, and there are reasons for that. But it's late, and you're just starting out, and since this is undefined behaviour it doesn't matter. :)
Sample code:
#include <iostream>
#include <cmath>
#include <stdint.h>
using namespace std;
static bool my_isnan(double val) {
union { double f; uint64_t x; } u = { val };
return (u.x << 1) > (0x7ff0000000000000u << 1);
}
int main() {
cout << std::isinf(std::log(0.0)) << endl;
cout << std::isnan(std::sqrt(-1.0)) << endl;
cout << my_isnan(std::sqrt(-1.0)) << endl;
cout << __isnan(std::sqrt(-1.0)) << endl;
return 0;
}
Online compiler.
With -ffast-math, that code prints "0, 0, 1, 1" -- without, it prints "1, 1, 1, 1".
Is that correct? I thought that std::isinf/std::isnan should still work with -ffast-math in these cases.
Also, how can I check for infinity/NaN with -ffast-math? You can see the my_isnan doing this, and it actually works, but that solution is of course very architecture dependent. Also, why does my_isnan work here and std::isnan does not? What about __isnan and __isinf. Do they always work?
With -ffast-math, what is the result of std::sqrt(-1.0) and std::log(0.0). Does it become undefined, or should it be NaN / -Inf?
Related discussions: (GCC) [Bug libstdc++/50724] New: isnan broken by -ffinite-math-only in g++, (Mozilla) Bug 416287 - performance improvement opportunity with isNaN
Note that -ffast-math may make the compiler ignore/violate IEEE specifications, see http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options :
This option is not turned on by any -O option besides -Ofast since it
can result in incorrect output for programs that depend on an exact
implementation of IEEE or ISO rules/specifications for math functions.
It may, however, yield faster code for programs that do not require
the guarantees of these specifications.
Thus, using -ffast-math you are not guaranteed to see infinity where you should.
In particular, -ffast-math turns on -ffinite-math-only, see http://gcc.gnu.org/wiki/FloatingPointMath which means (from http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Optimize-Options.html#Optimize-Options )
[...] optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs
This means, by enabling the -ffast-math you make a promise to the compiler that your code will never use infinity or NaN, which in turn allows the compiler to optimize the code by, e.g., replacing any calls to isinf or isnan by the constant false (and further optimize from there). If you break your promise to the compiler, the compiler is not required to create correct programs.
Thus the answer quite simple, if your code may have infinities or NaN (which is strongly implied by the fact that you use isinf and isnan), you cannot enable -ffast-math as else you might get incorrect code.
Your implementation of my_isnan works (on some systems) because it directly checks the binary representation of the floating point number. Of course, the processor still might do (some) actual calculations (depending on which optimizations the compiler does), and thus actual NaNs might appear in memory and you can check their binary representation, but as explained above, std::isnan might have been replaced by the constant false. It might equally well happen that the compiler replaces, e.g., sqrt, by some version that doesn't even produce a NaN for input -1. In order to see which optimisations your compiler does, compile to assembler and look at that code.
To make a (not completely unrelated) analogy, if you're telling your compiler your code is in C++ you can not expect it to compile C code correctly and vice-versa (there are actual examples for this, e.g. Can code that is valid in both C and C++ produce different behavior when compiled in each language? ).
It is a bad idea to enable -ffast-math and use my_isnan because this will make everything very machine- and compiler-dependent you don't know what optimizations the compiler does overall, so there might be other hidden problems related to the fact that you are using non-finite maths but tell the compiler otherwise.
A simple fix is to use -ffast-math -fno-finite-math-only which would still give some optimizations.
It also might be that your code looks something like this:
filter out all infinities and NaNs
do some finite maths on the filtered values (by this I mean maths that is guaranteed to never create infinities or NaNs, this has to be very, very carefully checked)
In this case, you could split up your code and either use optimize #pragma or __attribute__ to turn -ffast-math (respectively -ffinite-math-only and -fno-finite-math-only) on and off selectively for the given pieces of code (however, I remember there being some trouble with some version of GCC related to this) or just split your code into separate files and compile them with different flags. Of course, this also works in more general settings if you can isolate the parts where infinities and NaNs might occur. If you can not isolate these parts, this is a strong indication that you can not use -ffinite-math-only for this code.
Finally, it's important to understand that -ffast-math is not a harmless optimization that simply makes your program faster. It does not only affect the performance of your code but also its correctness (and this on top of all the issues surrounding floating point numbers already, if I remember right William Kahan has a collection of horror stories on his homepage, see also What every programmer should know about floating point arithmetic). In short, you might get faster code, but also wrong or unexpected results (see below for an example). Hence, you should only use such optimizations when you really know what you are doing and you have made absolutely sure, that either
the optimizations don't affect the correctness of that particular code, or
the errors introduced by the optimization are not critical to the code.
Program code can actually behave quite differently depending on whether this optimization is used or not. In particular it can behave wrong (or at least very contrary to your expectations) when optimizations such as -ffast-math are enabled. Take the following program for example:
#include <iostream>
#include <limits>
int main() {
double d = 1.0;
double max = std::numeric_limits<double>::max();
d /= max;
d *= max;
std::cout << d << std::endl;
return 0;
}
will produce output 1 as expected when compiled without any optimization flag, but using -ffast-math, it will output 0.
I am using VS2010. My debug version works fine however my Release version kept on crashing. So In the release version mode I right clicked the project chose Debug and then chose start new instance. At this point I saw that an array that I had declared as such
int ma[4]= { 1,2,8,4};
Never gets initialized. Any suggestions on what might be going on.
When you build in Release, the compiler performs many optimizations on your code. Many of the optimizations include replacing variables with hard-coded values, when it is possible and correct to do so. For example, if you have something like:
int n = 42;
cout << "The answer is: " << n;
By the time the optimizer gets done with it, it will often look more like:
cout << "The answer is: " << 42;
...and the variable n is eliminated from your program completely. If you are stepping through a release version of this program and trying to examine the value of n, you may see very odd values or the debugger may report that n doesn't exist at all.
There are many other optimizations that can be applied which make debugging an optimized program quite difficult. Placing a breakpoint after the initialization of an array could yield very misleading information if the array was eliminated, or if the initialization of it was moved to someplace else.
Another common optimization is to eliminate unused variables, such as with:
int a = ma[0];
If there is no code in your program which actually uses a, the compiler will see that a is unneeded, and optimize it away so that it no longer exists.
In order to see the values that ma has been initialized with, the simplest somewhat reliable approach is to use so-called sprintf debugging:
cout << "ma values: ";
copy (ma, ma+4, ostream_iterator <int> (cout, ", "));
And see what is actually there.
If you debug release build the debugger will report bogus values or will not be able to display any values for most of your variables. The safest way to check that the value of a variable is in Release build is to use logging.
So most probably your array is initialized in Release just as in Debug build but you are not able to see that through the debugger. It seems you have some other problem that is causing the code to crash in Release. Look for some other uninitialized variable or some stack corruption/index out of bounds access.
I'm writing a C++ program that doesn't work (I get a segmentation fault) when I compile it with optimizations (options -O1, -O2, -O3, etc.), but it works just fine when I compile it without optimizations.
Is there any chance that the error is in my code? or should I assume that this is a bug in GCC?
My GCC version is 3.4.6.
Is there any known workaround for this kind of problem?
There is a big difference in speed between the optimized and unoptimized version of my program, so I really need to use optimizations.
This is my original functor. The one that works fine with no levels of optimizations and throws a segmentation fault with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
return distance(point,p1) < distance(point,p2) ;
}
} ;
And this one works flawlessly with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
float d1=distance(point,p1) ;
float d2=distance(point,p2) ;
std::cout << "" ; //without this line, I get a segmentation fault anyways
return d1 < d2 ;
}
} ;
Unfortunately, this problem is hard to reproduce because it happens with some specific values. I get the segmentation fault upon sorting just one out of more than a thousand vectors, so it really depends on the specific combination of values each vector has.
Now that you posted the code fragment and a working workaround was found (#Windows programmer's answer), I can say that perhaps what you are looking for is -ffloat-store.
-ffloat-store
Do not store floating point variables in registers, and inhibit other options that might change whether a floating point value is taken from a register or memory.
This option prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. Similarly for the x86 architecture. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use -ffloat-store for such programs, after modifying them to store all pertinent intermediate computations into variables.
Source: http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Optimize-Options.html
I would assume your code is wrong first.
Though it is hard to tell.
Does your code compile with 0 warnings?
g++ -Wall -Wextra -pedantic -ansi
Here's some code that seems to work, until you hit -O3...
#include <stdio.h>
int main()
{
int i = 0, j = 1, k = 2;
printf("%d %d %d\n", *(&j-1), *(&j), *(&j+1));
return 0;
}
Without optimisations, I get "2 1 0"; with optimisations I get "40 1 2293680". Why? Because i and k got optimised out!
But I was taking the address of j and going out of the memory region allocated to j. That's not allowed by the standard. It's most likely that your problem is caused by a similar deviation from the standard.
I find valgrind is often helpful at times like these.
EDIT: Some commenters are under the impression that the standard allows arbitrary pointer arithmetic. It does not. Remember that some architectures have funny addressing schemes, alignment may be important, and you may get problems if you overflow certain registers!
The words of the [draft] standard, on adding/subtracting an integer to/from a pointer (emphasis added):
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
Seeing as &j doesn't even point to an array object, &j-1 and &j+1 can hardly point to part of the same array object. So simply evaluating &j+1 (let alone dereferencing it) is undefined behaviour.
On x86 we can be pretty confident that adding one to a pointer is fairly safe and just takes us to the next memory location. In the code above, the problem occurs when we make assumptions about what that memory contains, which of course the standard doesn't go near.
As an experiment, try to see if this will force the compiler to round everything consistently.
volatile float d1=distance(point,p1) ;
volatile float d2=distance(point,p2) ;
return d1 < d2 ;
The error is in your code. It's likely you're doing something that invokes undefined behavior according to the C standard which just happens to work with no optimizations, but when GCC makes certain assumptions for performing its optimizations, the code breaks when those assumptions aren't true. Make sure to compile with the -Wall option, and the -Wextra might also be a good idea, and see if you get any warnings. You could also try -ansi or -pedantic, but those are likely to result in false positives.
You may be running into an aliasing problem (or it could be a million other things). Look up the -fstrict-aliasing option.
This kind of question is impossible to answer properly without more information.
It is very seldom the compiler fault, but compiler do have bugs in them, and them often manifest themselves at different optimization levels (if there is a bug in an optimization pass, for example).
In general when reporting programming problems: provide a minimal code sample to demonstrate the issue, such that people can just save the code to a file, compile and run it. Make it as easy as possible to reproduce your problem.
Also, try different versions of GCC (compiling your own GCC is very easy, especially on Linux). If possible, try with another compiler. Intel C has a compiler which is more or less GCC compatible (and free for non-commercial use, I think). This will help pinpointing the problem.
It's almost (almost) never the compiler.
First, make sure you're compiling warning-free, with -Wall.
If that didn't give you a "eureka" moment, attach a debugger to the least optimized version of your executable that crashes and see what it's doing and where it goes.
5 will get you 10 that you've fixed the problem by this point.
Ran into the same problem a few days ago, in my case it was aliasing. And GCC does it differently, but not wrongly, when compared to other compilers. GCC has become what some might call a rules-lawyer of the C++ standard, and their implementation is correct, but you also have to be really correct in you C++, or it'll over optimize somethings, which is a pain. But you get speed, so can't complain.
I expect to get some downvotes here after reading some of the comments, but in the console game programming world, it's rather common knowledge that the higher optimization levels can sometimes generate incorrect code in weird edge cases. It might very well be that edge cases can be fixed with subtle changes to the code, though.
Alright...
This is one of the weirdest problems I've ever had.
I dont think I have enough proof to state it's a GCC bug, but honestly... It really looks like one.
This is my original functor. The one that works fine with no levels of optimizations and throws a segmentation fault with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
return distance(point,p1) < distance(point,p2) ;
}
} ;
And this one works flawlessly with any level of optimization:
struct distanceToPointSort{
indexedDocument* point ;
distanceToPointSort(indexedDocument* p): point(p) {}
bool operator() (indexedDocument* p1,indexedDocument* p2){
float d1=distance(point,p1) ;
float d2=distance(point,p2) ;
std::cout << "" ; //without this line, I get a segmentation fault anyways
return d1 < d2 ;
}
} ;
Unfortunately, this problem is hard to reproduce because it happens with some specific values. I get the segmentation fault upon sorting just one out of more than a thousand vectors, so it really depends on the specific combination of values each vector has.
Wow, I didn't expect answers so quicly, and so many...
The error occurs upon sorting a std::vector of pointers using std::sort()
I provide the strict-weak-ordering functor.
But I know the functor I provide is correct because I've used it a lot and it works fine.
Plus, the error cannot be some invalid pointer in the vector becasue the error occurs just when I sort the vector. If I iterate through the vector without applying std::sort first, the program works fine.
I just used GDB to try to find out what's going on. The error occurs when std::sort invoke my functor. Aparently std::sort is passing an invalid pointer to my functor. (of course this happens with the optimized version only, any level of optimization -O, -O2, -O3)
as other have pointed out, probably strict aliasing.
turn it of in o3 and try again. My guess is that you are doing some pointer tricks in your functor (fast float as int compare? object type in lower 2 bits?) that fail across inlining template functions.
warnings do not help to catch this case. "if the compiler could detect all strict aliasing problems it could just as well avoid them" just changing an unrelated line of code may make the problem appear or go away as it changes register allocation.
As the updated question will show ;) , the problem exists with a std::vector<T*>. One common error with vectors is reserve()ing what should have been resize()d. As a result, you'd be writing outside array bounds. An optimizer may discard those writes.
post the code in distance! it probably does some pointer magic, see my previous post. doing an intermediate assignment just hides the bug in your code by changing register allocation. even more telling of this is the output changing things!
The true answer is hidden somewhere inside all the comments in this thread. First of all: it is not a bug in the compiler.
The problem has to do with floating point precision. distanceToPointSort should be a function that should never return true for both the arguments (a,b) and (b,a), but that is exactly what can happen when the compiler decides to use higher precision for some data paths. The problem is especially likely on, but by no means limited to, x86 without -mfpmath=sse. If the comparator behaves that way, the sort function can become confused, and the segmentation fault is not surprising.
I consider -ffloat-store the best solution here (already suggested by CesarB).