Same code using floats on two computers gives two different results - c++

I've got some image processing code in C++ which calculates gradients and finds straight lines in them with the Hough transform. The program does most of the calculations with floats.
When I run this code on the same image on two different computers, one a Pentium IV running the latest Fedora, the other a Core i5 running the latest Ubuntu, both 32 bit, I get slightly different results. E.g. after some lengthy calculation I have 1.3456f for some variable on one machine and 1.3457f on the other. Is this expected behavior or should I search for errors in my program?
My first guess was that I'm accessing some uninitialized or out-of-bounds memory, but I ran the program through valgrind and it can't find any errors. Also, running multiple times on the same machine always gives the same results.

This is not uncommon and it will depend on your compiler, optimisation settings, math libraries, CPU, and of course the numerical stability of the algorithms that you are using.
You need to have a good idea of your accuracy requirements and if you are not meeting these then you may need to look at your algorithms and e.g. consider using double rather than float where needed.

For background on why given source code might not result in the same output on different computers, see What Every Computer Scientist Should Know About Floating-Point Arithmetic. I doubt this is due to any deficiency of your code unless it performs aggregation in a non-deterministic way, e.g. by centrally collating calculation results from multiple threads.
Floating point behaviour is often tunable per compiler options, even to the level of different CPUs. Check your compiler docs to see if you can reduce or eliminate the discrepancy. On Visual C++ (for example) this is done via /fp.

Is it due to the phenomenon called machine epsilon?
http://en.wikipedia.org/wiki/Machine_epsilon
There are limitations on floating-point numbers. The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.
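As a small illustration of the finite precision described above, here is a sketch that assumes IEEE 754 float and double with the default round-to-nearest mode. The function names are hypothetical; only std::numeric_limits comes from the standard library.

```cpp
#include <limits>

// True if 0.1 + 0.2 compares equal to 0.3 in double arithmetic.
// On typical IEEE 754 platforms this is false, because none of the
// three constants is exactly representable in binary.
bool sum_is_exact() {
    volatile double a = 0.1;   // volatile keeps the compiler from folding the sum
    volatile double b = 0.2;
    volatile double sum = a + b;
    return sum == 0.3;
}

// Machine epsilon for float: the gap between 1.0f and the next larger float.
float float_epsilon() {
    return std::numeric_limits<float>::epsilon();
}

// Adding half an epsilon to 1.0f is rounded away again.
bool half_epsilon_is_lost() {
    volatile float one = 1.0f;
    volatile float result = one + float_epsilon() / 2.0f;
    return result == 1.0f;
}
```

Machine epsilon only bounds the rounding error of a single operation; how such errors accumulate over a long calculation depends on the algorithm.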

Basically, the same C++ instructions can be compiled to different machine instructions (even on the same CPU and certainly on different CPUs) depending on a large number of factors, and the same machine instructions can lead to different low-level CPU actions depending on a large number of factors. In theory, these are supposed to be semantically equivalent, but with floating-point numbers, there are edge cases where they aren't.
Read "The pitfalls of verifying floating-point computations" by David Monniaux for details.

I will also say that this is very common, and probably not your fault.
I spent a lot of time in the past trying to figure out the same problem.
I would suggest using a decimal type instead of float and double as long as your numbers do not refer to scientific calculations but to values like prices, quantities, exchange rates, etc.

This is totally normal, unfortunately.
There are libraries which can produce identical results everywhere--see http://www.mpfr.org/ for an example. But the performance cost is substantial and it's probably not worth it unless exact identical results are the most important criterion.
I've actually written a closed-source library which implemented floating-point math in the integer unit, in order to make floats provide identical results on multiple platforms (Intel, AMD, PowerPC) across different compilers. We had an app which simply could not function if floating-point results varied. It was quite a challenge, though. If we could do it again we'd have just designed the original app in fixed-point, but at the time it was too much code to rewrite.

Either this is a difference between the internal representation of the float, making slightly different results, or perhaps it is a difference in the way the float is printed to the screen? I doubt that it is your fault...

Related

Standard math functions reproducibility on different CPU's

I am working on a project with a lot of math calculations. After switching to a new test machine, I noticed that a lot of tests failed. It is also important to note that the tests also failed on my development machine and on some machines of other developers. After tracing values and comparing them with values from the old machine, I found that some functions from math.h (so far I have found only cosine) sometimes return slightly different values (for example: 40965.8966304650828827e-01 and 40965.8966304650828816e-01, -3.3088623618085204e-08 and -3.3088623618085197e-08).
New CPU: Intel Xeon Gold 6230R (Intel64 Family 6 Model 85 Stepping 7)
Old CPU: Exact model is unknown (Intel64 Family 6 Model 42 Stepping 7)
My CPU: Intel Core i7-4790K
Test results don't depend on the Windows version (7 and 10 were tested).
I have tried testing with a binary that was statically linked with the standard library, to rule out different libraries being loaded for different processes and Windows versions, but all results were the same.
Project compiled with /fp:precise, switching to /fp:strict changed nothing.
MSVC from Visual Studio 15 is used: 19.00.24215.1 for x64.
How to make calculations fully reproducible?
Since you are on Windows, I am pretty sure the different results are because the UCRT detects at runtime whether FMA3 (fused multiply-add) instructions are available on the CPU and, if so, uses them in transcendental functions such as cosine. This gives slightly different results. The solution is to place the call _set_FMA3_enable(0); at the very start of your main() or WinMain() function, as described here.
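A minimal sketch of that call, assuming an x64 build with MSVC, where the CRT declares _set_FMA3_enable in math.h. The wrapper name disable_crt_fma3 is hypothetical, and the preprocessor guard is only there so the snippet still compiles on other toolchains.

```cpp
#include <cmath>
#include <math.h>   // on MSVC x64 this declares _set_FMA3_enable

// Hypothetical helper: call it first thing in main() or WinMain(),
// before any math function is used.
// Returns true if the CRT call was actually made.
bool disable_crt_fma3() {
#if defined(_MSC_VER) && defined(_M_X64)
    _set_FMA3_enable(0);   // 0 = do not use FMA3 code paths in CRT math functions
    return true;
#else
    return false;          // not an MSVC x64 build, nothing to switch off
#endif
}
```

This only affects the CRT's own math routines. Whether the compiler emits FMA instructions for your own expressions is governed separately by compiler options.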
If you want to have reproducibility also between different operating systems, things become harder or even impossible. See e.g. this blog post.
In response also to the comments stating that you should just use some tolerance, I do not agree with this as a general statement. Certainly, there are many applications where this is the way to go. But I do think that it can be a sensible requirement to get exactly the same floating point results for some applications, at least when staying on the same OS (Windows, in this case). In fact, we had the very same issue with _set_FMA3_enable a while ago. I am a software developer for a traffic simulation, and minor differences such as 10^-16 often build up and lead to entirely different simulation results eventually. Naturally, one is supposed to run many simulations with different seeds and average over all of them, making the different behavior irrelevant for the final result. But: Sometimes customers have a problem at a specific simulation second for a specific seed (e.g. an application crash or incorrect behavior of an entity), and not being able to reproduce it on our developer machines due to a different CPU makes it much harder to diagnose and fix the issue. Moreover, if the test system consists of a mixture of older and newer CPUs and test cases are not bound to specific resources, tests can sometimes deviate seemingly without reason (flaky tests). This is certainly not desired. Requiring exact reproducibility also makes writing the tests much easier because you do not require heuristic thresholds (e.g. a tolerance or some guessed value for the number of samples). Moreover, our customers expect the results to remain stable for a specific version of the program since they calibrated (more or less...) their traffic networks to real data. This is somewhat questionable, since (again) one should actually look at averages, but the naive expectation in reality usually wins.
IEEE 754 double precision binary floating point provides no more than 15 decimal significant digits of precision. You are looking at the "noise" of different library implementations and possibly different FPU implementations.
How to make calculations fully reproducible?
That is an X-Y problem. The answer is you can't. But it is the wrong question. You would do better to ask how you can implement valid and robust tests that are sympathetic to this well-known and unavoidable technical issue with floating-point representation. Without providing the test code you are trying to use, it is not possible to answer that directly.
Generally you should avoid comparing floating point values for exact equality, and rather subtract the result from the desired value, and test for some acceptable discrepancy within the supported precision of the FP type used. For example:
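#include <cmath>
#define EXPECTED_RESULT 40965.8966304650
#define RESULT_PRECISION 0.0000000001
double actual_result = test();
// Report an error only if the result is further from the expected value than the tolerance.
bool error = std::fabs(actual_result - EXPECTED_RESULT) > RESULT_PRECISION;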
First of all, 40965.8966304650828827e-01 cannot be a result from cos() function, as cos(x) is a function that, for real valued arguments always returns a value in the interval [-1.0, 1.0] so the result shown cannot be the output from it.
Second, you have probably read somewhere that double values have a precision of roughly 17 digits in the significand, while you are trying to show 21 digits. You cannot get correct data past the ...508, as you are trying to push the result beyond the 17-digit limit.
The reason you get different results on different computers is that the digits shown beyond the type's precision carry no reliable information, so it is normal that they differ between machines and library implementations.
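The 15-digit and 17-digit figures quoted in this thread correspond to two different constants in std::numeric_limits: digits10 (decimal digits that survive a round trip through double) and max_digits10 (digits needed to print a double so that it reads back exactly). The sketch below assumes an IEEE 754 double; the helper names are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Print a double with max_digits10 significant digits (17 for IEEE 754 double),
// which is enough to recover the exact same value when parsed again.
std::string to_round_trip_string(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*g",
                  std::numeric_limits<double>::max_digits10, value);
    return buffer;
}

// Parse the text back into a double.
double from_string(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

Printing more digits than max_digits10 does not add information about the computed result, which is why comparing values at 21 digits mostly compares rounding noise.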

C++ DLL floating-point determinism

Can the same compilation of a C++ DLL exhibit different floating-point results on different machines?
We have some code in our DLL which performs a < comparison of two doubles. For a particular set of inputs those doubles are expected to be equal. Of course, the < comparison is dubious in this case, but what we didn't expect was to see different results from the comparison in our test versus the client's machine.
The same DLL on 2 different computers even though both are running Windows XP could conceivably produce the different results you're seeing. These are the reasons that occur to me:
They could use different versions of the C++ runtime (since that is likely dynamically linked) or of other system DLLs.
I don't know how likely this is but I would believe that the floating point operations on different CPUs could produce different enough results for 2 series of calculations of a and b such that a < b == true on one machine and a < b == false on another.
What I've used in the past to find out what DLLs are being used by an application is Dependency Walker.
Yes, there can be differences in floating point implementations that are significant enough to cause equality comparisons to fail.
You can attribute it to failure to implement the IEEE standards properly, but I can see situations where, for example, a different number of guard digits might be used in different implementations, and so the round-off errors might be different. It should be noted, however that the IEEE standards are rather strict.
Comparisons between floating point numbers should never use exact equality. Favor an approach where you can test for numbers being within a small range of error, rather than exact equality.
Further Reading
What Every Computer Scientist Should Know About Floating Point Arithmetic
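The range-based comparison recommended in this answer could look like the following sketch. The function names and the default tolerances are assumptions chosen for illustration; suitable tolerances depend on the application.

```cpp
#include <algorithm>
#include <cmath>

// True if a and b differ by no more than an absolute tolerance (useful near
// zero) or a relative tolerance (useful for large magnitudes).
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12) {
    if (a == b) return true;                          // exact match, including equal infinities
    if (std::isinf(a) || std::isinf(b)) return false; // an infinity never matches anything else
    const double diff  = std::fabs(a - b);            // NaN inputs make this NaN, so we return false
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}

// A "<" that treats nearly equal values as not less, so two machines that
// disagree in the last bits still take the same branch in most cases.
bool definitely_less(double a, double b,
                     double rel_tol = 1e-9, double abs_tol = 1e-12) {
    return a < b && !nearly_equal(a, b, rel_tol, abs_tol);
}
```

A tolerance moves the point of disagreement rather than removing it: two machines can still differ when the values sit right at the edge of the tolerance band, but that is far less likely than a disagreement at exact equality.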
In VS 2003, the MS C++ compiler introduced a new model for floating-point optimization. It provides you with 3 compiler options: fp:fast; fp:precise; fp:strict.
Under the fp:strict mode, the compiler never performs any optimizations that perturb the accuracy of floating-point computations, so if you want accuracy over speed, you should use this one. The default one is fp:precise. You can change in the project properties->C++->Code generation.
Please read this: Microsoft Visual C++ Floating-Point Optimization

Fast trigonometric functions using only integer in c++ for arm target

I am writing code for an ARM-Target which uses a lot of floating point operations and trigonometric functions. AFAIK floating point calculations are MUCH slower than int (especially on ARM). Accuracy is not crucial.
I thought about implementing my own trigonometric functions using a scaling factor (e.g. the range of 0*pi to 2*pi becomes int 0 to 1024) and lookup tables. Is that a good approach?
Are there any alternatives?
Target platform is an Odroid U2 (Exynos4412) running ubuntu and lots of other stuff (webserver etc...).
(c++11 and boost/libraries allowed)
If your target platform has a math library, use it. If it is any good, it was written by experts who were considering speed. You should not base code design on guesses about what is fast or slow. If you do not have actual measurements or processor specifications, and you do not know trigonometric functions in your application are consuming a lot of time, then you do not have good reason for replacing the math libraries.
Floating-point instructions typically have longer latencies than integer instructions, but they are pipelined so that throughput may be comparable. (E.g., a floating-point unit might have four stages to do the work, so an instruction takes four cycles to work through all the stages, but you can push a new instruction into the first stage in each cycle.) Whether the pipelining is sufficient to provide performance on a par with an integer implementation depends greatly on the target processor, the algorithm being used, and the skill of the implementor.
If it is beneficial in your case to use custom implementations of the math routines, then how they should be designed is hugely dependent on circumstances. Proper advice depends on the domain to support (Just 0 to 2π? –2π to +2π? Possibly larger values, which have to be folded to -π to π?), what special cases needed to be supported (Propagate NaNs?), the accuracy required, what else is happening in the processor (Is a lot of memory in use or can we rely on a lookup table remaining in cache?), and more.
A significant part of the trigonometric routines is handling various cases (NaNs, infinities, small values) and reducing arguments modulo 2π. It may be possible to implement stripped-down routines that do not handle special cases or perform argument reduction but still use floating-point.
Exynos 4412 uses the Cortex-A9 core[1], which has fully pipelined single- and double-precision floating-point. There is no reason to resort to integer operations, as there was with some older ARM cores.
Depending on your specific accuracy requirements (and especially if you can guarantee that the inputs fall into a limited range), you may be able to use approximations that are significantly faster than the implementations available in the standard library. More information about your exact usage would be necessary to give sound advice.
[1] http://en.wikipedia.org/wiki/Exynos_(system_on_chip)
One possible alternative is trigint:
trigint download
trigint doxygen
You should use "fixed point" math rather than floating point.
Most ARM processors (7 and above) allow for 32 bits of resolution in the fixed point. So you could go to 1E-3 radians quite easily. But the real question is how much accuracy do you need in the results?
Whether to use lookup tables, lookup tables with interpolation or functions depends on how much data space you have on your system. Lookup tables are fastest execution, but use the most data space. Functions use the least amount of data but require the most execution time. Interpolation may be a mitigation that allows smaller tables and some extra processing.
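For reference, the lookup-table-with-interpolation variant might be sketched as follows. The angle scale of 1024 units per full turn comes from the question; the table size, the output scale of 2^14 and all names are assumptions made for this example, and the table is filled once using the floating-point library.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

constexpr int kAngleUnits = 1024;                     // one full turn, as in the question
constexpr int kTableSize  = 256;                      // table entries per full turn (assumed)
constexpr int kStep       = kAngleUnits / kTableSize; // angle units between entries
constexpr int kScale      = 1 << 14;                  // sin() == 1.0 maps to 16384 (assumed)

// Build the table once; one extra entry so index + 1 is always valid.
static std::array<std::int16_t, kTableSize + 1> make_sine_table() {
    std::array<std::int16_t, kTableSize + 1> table{};
    const double two_pi = 6.283185307179586;
    for (int i = 0; i <= kTableSize; ++i) {
        table[i] = static_cast<std::int16_t>(
            std::lround(std::sin(two_pi * i / kTableSize) * kScale));
    }
    return table;
}

// Integer-only sine: angle in 1/1024 turns, result scaled by kScale.
int fixed_sin(int angle) {
    static const auto table = make_sine_table();
    angle &= kAngleUnits - 1;            // wrap into [0, 1024); assumes two's complement
    const int index = angle / kStep;     // table entry at or below the angle
    const int frac  = angle % kStep;     // position between two entries
    const int lo = table[index];
    const int hi = table[index + 1];
    return lo + (hi - lo) * frac / kStep;  // linear interpolation
}
```

As the other answers point out, it is worth measuring this against the platform's floating-point sinf before adopting it, since the Cortex-A9 has a pipelined FPU.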

What should I know when using floats/doubles between different machines?

I've heard that there are many problems with floats/doubles on different CPUs.
If I want to make a game that uses floats for everything, how can I be sure the float calculations are exactly the same on every machine so that my simulation will look exactly the same on every machine?
I am also concerned about writing/reading files or sending/receiving the float values to different computers. What conversions must be done, if any?
I need to be 100% sure that my float values are computed exactly the same, because even a slight difference in the calculations will result in a totally different future. Is this even possible?
Standard C++ does not prescribe any details about floating point types other than range constraints, and possibly that some of the maths functions (like sine and exponential) have to be correct up to a certain level of accuracy.
Other than that, at that level of generality, there's really nothing else you can rely on!
That said, it is quite possible that you will not actually require binarily identical computations on every platform, and that the precision and accuracy guarantees of the float or double types will in fact be sufficient for simulation purposes.
Note that you cannot even produce a reliable result of an algebraic expression inside your own program when you modify the order of evaluation of subexpressions, so asking for the sort of reproducibility that you want may be a bit unrealistic anyway. If you need real floating point precision and accuracy guarantees, you might be better off with an arbitrary precision library with correct rounding, like MPFR - but that seems unrealistic for a game.
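The effect of evaluation order can be seen with a three-term sum. This is a sketch assuming IEEE 754 doubles; the function name is hypothetical.

```cpp
// True if (a + b) + c and a + (b + c) give the same double.
// volatile keeps the compiler from simplifying either expression.
bool addition_is_associative_for(double a, double b, double c) {
    volatile double left  = (a + b) + c;
    volatile double right = a + (b + c);
    return left == right;
}
```

With a = 1e20, b = -1e20 and c = 1.0, the left grouping yields 1.0 while the right grouping yields 0.0, because 1.0 is far below the spacing of doubles near 1e20. A compiler or library that groups operations differently therefore changes the result.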
Serializing floats is an entirely different story, and you'll have to have some idea of the representations used by your target platforms. If all platforms were in fact to use IEEE 754 floats of 32 or 64 bit size, you could probably just exchange the binary representation directly (modulo endianness). If you have other platforms, you'll have to think up your own serialization scheme.
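For the case where every platform uses 32-bit IEEE 754 floats, exchanging the binary representation with a fixed byte order might look like this sketch. The function names and the choice of little-endian on the wire are assumptions; the code also assumes that floats and integers share the same byte order on the host, which holds on mainstream platforms.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes 32-bit IEEE 754 floats");

// Store the float's bit pattern as 4 bytes, least significant byte first.
void write_float_le(float value, unsigned char out[4]) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // well-defined way to reinterpret the bits
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFFu);
    }
}

// Rebuild the float from 4 little-endian bytes.
float read_float_le(const unsigned char in[4]) {
    std::uint32_t bits = 0;
    for (int i = 0; i < 4; ++i) {
        bits |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    }
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the bytes are assembled with shifts rather than copied directly, the same code produces the same stream on little-endian and big-endian hosts.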
What every programmer should know: http://docs.sun.com/source/806-3568/ncg_goldberg.html

What claims, if any, can be made about the accuracy/precision of floating-point calculations?

I'm working on an application that does a lot of floating-point calculations. We use VC++ on Intel x86 with double precision floating-point values. We make claims that our calculations are accurate to n decimal digits (right now 7, but trying to claim 15).
We go to a lot of effort validating our results against other sources when our results change slightly (due to code refactoring, cleanup, etc.). I know that many, many factors play into the overall precision, such as the FPU control state, the compiler/optimizer, floating-point model, and the overall order of operations themselves (i.e., the algorithm itself), but given the inherent uncertainty in FP calculations (e.g., 0.1 cannot be represented), it seems invalid to claim any specific degree of precision for all calculations.
My question is this: is it valid to make any claims about the accuracy of FP calculations in general without doing any sort of analysis (such as interval analysis)? If so, what claims can be made and why?
EDIT:
So given that the input data is accurate to, say, n decimal places, can any guarantee be made about the result of any arbitrary calculations, given that double precision is being used? E.g., if the input data has 8 significant decimal digits, the output will have at least 5 significant decimal digits... ?
We are using math libraries and are unaware of any guarantees they may or may not make. The algorithms we use are not necessarily analyzed for precision in any way. But even given a specific algorithm, the implementation will affect the results (just changing the order of two addition operations, for example). Is there any inherent guarantee whatsoever when using, say, double precision?
ANOTHER EDIT:
We do empirically validate our results against other sources. So are we just getting lucky when we achieve, say, 10-digit accuracy?
As with all such questions, I have to just simply answer with the article What Every Computer Scientist Should Know About Floating-Point Arithmetic. It's absolutely indispensable for the type of work you are talking about.
Short answer: No.
Reason: Have you proved (yes proved) that you aren't losing any precision as you go along? Are you sure? Do you understand the intrinsic precision of any library functions you're using for transcendental functions? Have you computed the limits of additive errors? If you are using an iterative algorithm, do you know how well it has converged when you quit? This stuff is hard.
Unless your code uses only the basic operations specified in IEEE 754 (+, -, *, / and square root), you do not even know how much precision loss each call to library functions outside your control (trigonometric functions, exp/log, ...) introduces. Functions outside the basic 5 are not guaranteed to be, and are usually not, precise to 1 ULP.
You can do empirical checks, but that's what they remain... empirical. Don't forget the part about there being no warranty in the EULA of your software!
If your software was safety-critical, and did not call library-implemented mathematical functions, you could consider http://www-list.cea.fr/labos/gb/LSL/fluctuat/index.html . But only critical software is worth the effort and has a chance to fit in the analysis constraints of this tool.
You seem, after your edit, mostly concerned about your compiler doing things behind your back. It is a natural fear to have (because, as with the mathematical functions, you are not in control). But it's rather unlikely to be the problem. Your compiler may compute with a higher precision than you asked for (80-bit extendeds when you asked for 64-bit doubles, or 64-bit doubles when you asked for 32-bit floats). This is allowed by the C99 standard. In round-to-nearest, this may introduce double-rounding errors. But it's only 1 ULP you are losing, and so infrequently that you needn't worry. This can cause puzzling behaviors, as in:
float x=1.0;
float y=7.0;
float z=x/y;
if (z == x/y)
...
else
... /* the else branch is taken */
but you were looking for trouble when you used == between floating-point numbers.
When you have code that does cancelations on purpose, such as in Kahan's summation algorithm:
d = (a+b)-a-b;
and the compiler optimizes that into d=0;, you have a problem. And yes, this optimization "as if floats operation were associative" has been seen in general compilers. It is not allowed by C99. But the situation has gotten better, I think. Compiler authors have become more aware of the dangers of floating-point and no longer try to optimize so aggressively. Plus, if you were doing this in your code you would not be asking this question.
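For context, a compensated (Kahan) summation in which such a deliberate cancellation appears could be sketched as below. The function names are hypothetical, and the code only works as intended when the compiler does not reassociate floating-point operations (so not under options such as GCC's -ffast-math or MSVC's /fp:fast).

```cpp
#include <vector>

// Kahan summation: carries the rounding error of each addition forward
// in a separate compensation term.
double kahan_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double compensation = 0.0;              // running estimate of lost low-order bits
    for (double v : values) {
        const double y = v - compensation;  // apply the correction from the last step
        const double t = sum + y;           // low-order bits of y may be lost here
        compensation = (t - sum) - y;       // algebraically zero, numerically the lost part
        sum = t;
    }
    return sum;
}

// Plain left-to-right summation for comparison.
double naive_sum(const std::vector<double>& values) {
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum;
}
```

Summing 1.0 followed by ten values of 1e-16 shows the difference: each small term is individually rounded away by the plain loop, while the compensated version accumulates them.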
Given that your vendors of machines, compilers, run-time libraries, and operating systems don't make any such claim about floating point accuracy, you should take that to be a warning that your group should be leery of making claims that could come under harsh scrutiny if clients ever took you to court.
Without doing formal verification of the entire system, I would avoid such claims. I work on scientific software that has indirect human safety implications, so we have considered such things in the past, and we do not make these sorts of claims.
You could make useless claims about precision of double (length) floating point calculations, but it would be basically worthless.
Ref: The pitfalls of verifying floating-point computations from ACM Transactions on Programming Languages and Systems 30, 3 (2008) 12
No, you cannot make any such claim. If you wanted to do so, you would need to do the following:
Hire an expert in numerical computing to analyze your algorithms.
Either get your library and compiler vendors to open their sources to said expert for analysis, or get them to sign off on hard semantics and error bounds.
Double-precision floating-point typically carries about 15 digits of decimal accuracy, but there are far too many ways for some or all of that accuracy to be lost, that are far too subtle for a non-expert to diagnose, to make any claim like what you would like to claim.
There are relatively simpler ways to keep running error bounds that would let you make accuracy claims about any specific computation, but making claims about the accuracy of all computations performed with your software is a much taller order.
A double precision number on an Intel CPU has slightly better than 15 significant digits (decimal).
The potential error for a simple computation is in the ballpark of n/1.0e15, where n is the order of magnitude of the number(s) you are working with. I suspect that Intel has specs for the accuracy of CPU-based FP computations.
The potential error for library functions (like cos and log) is usually documented. If not, you can look at the source code (e.g. the GNU source) and calculate it.
You would calculate error bars for your calculations just as you would for manual calculations.
Once you do that, you may be able to reduce the error by judicious ordering of the computations.
Since you seem to be concerned about accuracy of arbitrary calculations, here is an approach you can try: run your code with different rounding modes for floating-point calculations. If the results are pretty close to each other, you are probably okay. If the results are not close, you need to start worrying.
The maximum difference in the results will give you a lower bound on the accuracy of the calculations.
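A sketch of that experiment using the standard <cfenv> interface follows. The harmonic sum is only a stand-in for "your code", and the function names are hypothetical. Compilers differ in how strictly they honor runtime rounding-mode changes (GCC, for instance, documents -frounding-math for this), so the volatile variables are a precaution against compile-time evaluation, not a guarantee.

```cpp
#include <cfenv>

// Stand-in computation: sum of 1/k for k = 1..n, evaluated under the
// given rounding mode (FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO).
double harmonic_sum(int n, int rounding_mode) {
    const int saved_mode = std::fegetround();
    std::fesetround(rounding_mode);
    volatile double sum = 0.0;            // volatile forces each step to happen at run time
    for (int k = 1; k <= n; ++k) {
        volatile double term = 1.0 / k;
        sum = sum + term;
    }
    std::fesetround(saved_mode);          // restore the caller's rounding mode
    return sum;
}

// Difference between rounding everything up and rounding everything down.
double rounding_spread(int n) {
    return harmonic_sum(n, FE_UPWARD) - harmonic_sum(n, FE_DOWNWARD);
}
```

A spread that is large relative to the result suggests the computation is sensitive to rounding. Library functions such as cos may not respect the rounding mode, so the experiment is most informative for code built from basic arithmetic.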