As part of developing a library of numerical methods, I routinely add unit tests. Most of the Assert.AreEqual-type tests currently check at most 6 decimal places, since precision beyond that is not required for this type of library. The tests have served their purpose well so far.
Recently, while working on a 64-bit version of the same library, I found that the results of quite a few unit tests are the same through 6 decimal places but change at the 10th decimal place or beyond. How I found out about this in the first place is a completely different story, but chasing a few of those differences resolved some subtle bugs I would never have known existed.
Which brings me to the question: there seems to be value in having unit tests check numbers at high precision (e.g., 10-12 decimal places), or maybe even at full precision (I am not even sure how to do that), even though the precision requirement of the library is not that high. What does the community here typically do when unit testing numerical values from scientific/numerical code? Any recommendations/suggestions/pointers?
More information: the library deals with double values. It is not meant to be a financial library etc., so I do not use decimal.
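For what it is worth, a "full precision" comparison amounts to requiring identical bit patterns (or a zero difference). Here is a minimal sketch of the idea in C++, since the rest of this page is C/C++ oriented; the helper name is mine, and a .NET test would express the same thing through its own assertion API:

#include <cstdint>
#include <cstring>

// Two doubles are equal "at full precision" exactly when their bit patterns match.
// (+0.0/-0.0 compare equal with == but have different bits; NaNs are the opposite.
// Handling those special cases is omitted here.)
bool equal_full_precision(double a, double b)
{
    std::uint64_t ua, ub;
    std::memcpy(&ua, &a, sizeof a);   // reinterpret the 64-bit object representation
    std::memcpy(&ub, &b, sizeof b);
    return ua == ub;
}

// A "10-12 decimal places" check is instead a tolerance test, e.g.
//     fabs(actual - expected) < 1e-12
// for absolute decimal places, or 1e-12 * fabs(expected) for significant digits.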
I don't think there is any general solution to your problem (short of quantum computers). There are two possibilities I know of.
1. You are working with numbers that have a finite representation (rationals and everything based on them). In this case you can simply compare/test those representations at full precision.
2. The representation is not finite (irrationals and everything based on them). In this case you simply can't do any operation (including testing) at full precision, because you only have a finite amount of memory. You have to choose a precision for your operations, and then you can test only up to that precision.
There is also something 'in between':
You are using algebraic/symbolic calculation rather than value calculation. In this case, even though you are working with (say) a real number, you are in fact using a finite representation of it such as 'pi', 'e' or 'sqrt(3)', and that brings you back to #1.
You do operations like calculating sqrt(4), where the input and output are from #1 but the calculation process leads through #2 and therefore has all of its limitations.
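To make the distinction concrete, here is a small hedged illustration (the values are mine): a result built only from exactly representable numbers can be tested with full-precision equality, while one that passes through an irrational intermediate has to be tested against a chosen precision.

#include <cassert>
#include <cmath>

int main()
{
    // Case 1: every value is exactly representable in binary floating point,
    // so an exact (full-precision) comparison is legitimate.
    assert(0.5 + 0.25 == 0.75);

    // Case 2: the computation passes through an irrational intermediate, sqrt(2),
    // which must be rounded; exact equality with 2.0 typically fails on IEEE-754
    // hardware, so the test must settle for a chosen precision instead.
    double two = std::sqrt(2.0) * std::sqrt(2.0);
    assert(std::fabs(two - 2.0) < 1e-15);
    return 0;
}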
I am working on a project with a lot of math calculations. After switching to a new test machine, I noticed that a lot of tests failed. It is also important to note that the tests failed on my development machine and on some other developers' machines as well. After tracing values and comparing them with the values from the old machine, I found that some functions from math.h (so far I have only found cosine) sometimes return slightly different values (for example: 40965.8966304650828827e-01 vs 40965.8966304650828816e-01, and -3.3088623618085204e-08 vs -3.3088623618085197e-08).
New CPU: Intel Xeon Gold 6230R (Intel64 Family 6 Model 85 Stepping 7)
Old CPU: Exact model is unknown (Intel64 Family 6 Model 42 Stepping 7)
My CPU: Intel Core i7-4790K
Test results do not depend on the Windows version (7 and 10 were tested).
I tried testing with a binary that was statically linked against the standard library, to rule out different libraries being loaded for different processes and Windows versions, but all results were the same.
The project is compiled with /fp:precise; switching to /fp:strict changed nothing.
MSVC from Visual Studio 2015 is used: 19.00.24215.1 for x64.
How to make calculations fully reproducible?
Since you are on Windows, I am pretty sure the different results occur because the UCRT detects at runtime whether FMA3 (fused multiply-add) instructions are available on the CPU and, if so, uses them in transcendental functions such as cosine. This gives slightly different results. The solution is to place the call _set_FMA3_enable(0); at the very start of your main() or WinMain() function, as described here.
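A minimal sketch of that call (assuming the Universal CRT on MSVC x64, where _set_FMA3_enable is exposed via <math.h>; availability and the exact declaration depend on the CRT version):

#include <math.h>

int main()
{
    // Disable the FMA3 code paths in the UCRT's transcendental functions so that
    // cos(), sin(), exp(), ... take the same code path on CPUs with and without FMA3.
    // Must run before the first call to any of those functions.
    _set_FMA3_enable(0);

    double c = cos(0.25);   // should now take the non-FMA3 code path regardless of CPU
    (void)c;
    return 0;
}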
If you want to have reproducibility also between different operating systems, things become harder or even impossible. See e.g. this blog post.
In response also to the comments stating that you should just use some tolerance: I do not agree with this as a general statement. Certainly, there are many applications where this is the way to go. But I do think that it can be a sensible requirement to get exactly the same floating-point results for some applications, at least when staying on the same OS (Windows, in this case). In fact, we had the very same issue with _set_FMA3_enable a while ago.
I am a software developer for a traffic simulation, and minor differences such as 10^-16 often build up and eventually lead to entirely different simulation results. Naturally, one is supposed to run many simulations with different seeds and average over all of them, making the different behavior irrelevant for the final result. But sometimes customers have a problem at a specific simulation second for a specific seed (e.g. an application crash or incorrect behavior of an entity), and not being able to reproduce it on our developer machines due to a different CPU makes it much harder to diagnose and fix the issue.
Moreover, if the test system consists of a mixture of older and newer CPUs and test cases are not bound to specific resources, tests can sometimes deviate seemingly without reason (flaky tests). This is certainly not desired. Requiring exact reproducibility also makes writing the tests much easier, because you do not need heuristic thresholds (e.g. a tolerance or some guessed number of samples). Finally, our customers expect the results to remain stable for a specific version of the program, since they calibrated (more or less...) their traffic networks to real data. This is somewhat questionable, since (again) one should actually look at averages, but in practice the naive expectation usually wins.
IEEE-754 double-precision binary floating point provides no more than 15 significant decimal digits of precision. You are looking at the "noise" of different library implementations and possibly different FPU implementations.
How to make calculations fully reproducible?
That is an X-Y problem. The answer is that you can't. But it is the wrong question. You would do better to ask how you can implement valid and robust tests that are sympathetic to this well-known and unavoidable technical issue with floating-point representation. Without seeing the test code you are trying to use, it is not possible to answer that directly.
Generally you should avoid comparing floating-point values for exact equality; instead, subtract the result from the expected value and test that the discrepancy is within the supported precision of the FP type used. For example:
#include <math.h>   /* for fabs() */

#define EXPECTED_RESULT  40965.8966304650
#define RESULT_PRECISION 00000.0000000001

double actual_result = test() ;
/* flag an error when the result deviates from the expected value by more than
   the precision we are prepared to rely on for the double type */
bool error = fabs( actual_result - EXPECTED_RESULT ) > RESULT_PRECISION ;
First of all, 40965.8966304650828827e-01 cannot be a result of the cos() function: for real-valued arguments, cos(x) always returns a value in the interval [-1.0, 1.0], so the value shown cannot be its output.
Second, you have probably read somewhere that double values have a precision of roughly 17 digits in the significand, yet you are trying to show 21 digits. You cannot get correct data past the ...508, as you are trying to push the result beyond the 17-digit limit.
The reason you get different results on different computers is that whatever appears beyond those precise digits is essentially noise, so it is normal to get different values there (you could even get different values on different runs of the same program on the same machine).
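To see the digit limit directly, here is a small sketch (the values are mine): two adjacent doubles, one ULP apart like the pairs in the question, agree through roughly the first 16 significant digits and differ only in the trailing, non-meaningful positions.

#include <cmath>
#include <cstdio>

int main()
{
    double a = 4096.58966304650828827;      // rounds to the nearest representable double
    double b = std::nextafter(a, 5000.0);   // the adjacent double, one ULP larger
    // Printing 21 digits forces output well past the ~16-17 digits a double carries;
    // the two values differ only in those trailing digits.
    std::printf("%.21g\n%.21g\n", a, b);
    return 0;
}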
I want to do some calculations in C++ using several scientific constants like,
effective mass of electron(m) 9.109e-31 kg
charge of electron 1.602e-19 C
Boltzmann constant (k) 1.38e-23 J/K
Time 8.92e-13
And I have calculations like sqrt((2kT)/m).
Is it safe to use double for these constants and for results?
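For concreteness, here is a hedged sketch of the calculation in plain double precision (constant values as listed above; the value labelled "Time" is used as the T in the formula purely for illustration, and the variable names are mine):

#include <cmath>
#include <cstdio>

int main()
{
    const double m = 9.109e-31;   // effective mass of electron, kg
    const double k = 1.38e-23;    // Boltzmann constant, J/K
    const double T = 8.92e-13;    // the value listed as "Time" above

    double v = std::sqrt((2.0 * k * T) / m);
    // double carries ~15-16 significant digits, far more than the 4 significant
    // digits present in the inputs; no intermediate here is near overflow/underflow.
    std::printf("sqrt(2kT/m) = %.6g\n", v);
    return 0;
}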
Floating-point arithmetic and accuracy is a very tricky subject. Absolutely read the floating-point-gui.de site.
Errors of many floating-point operations can accumulate to the point of giving meaningless results. Several catastrophic events (loss of life, crashes costing billions of dollars) have happened because of this. More will happen in the future.
There are some static source analyzers dedicated to detecting such errors, for example Fluctuat (by my CEA colleagues, several now at Ecole Polytechnique, Palaiseau, France) and others. But Rice's theorem applies, so the static-analysis problem is unsolvable in general.
(but static analysis of floating-point accuracy can sometimes work in practice on small programs of a few thousand lines; it does not scale well to large programs)
There are also some programs that instrument calculations, for example CADNA from LIP6 in Paris, France.
(but instrumentation may give a huge over-approximation of the error)
You could design your numerical algorithms to be less sensitive to floating point errors. This is very difficult (and you'll need years of work to acquire the relevant skills and expertise).
(you need numerical, mathematical, and computer-science skills, at PhD level)
You could also use arbitrary-precision arithmetic, or extended-precision arithmetic (e.g. 128-bit floats or quad precision). This slows down the computations.
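A tiny hedged illustration of why extra precision helps (the example is mine, not from this answer): the same naive summation in float and in double, where the exact answer is 1,000,000.

#include <cstdio>

int main()
{
    const int n = 10000000;   // add 0.1 ten million times
    float  fsum = 0.0f;
    double dsum = 0.0;
    for (int i = 0; i < n; ++i) {
        fsum += 0.1f;
        dsum += 0.1;
    }
    // The float sum is wrong already in its leading digits; the double sum is off
    // only around the 10th significant digit or later. Quad or arbitrary precision
    // (e.g. a library such as MPFR) pushes the error down further, at a cost in speed.
    std::printf("float : %.8g\ndouble: %.17g\n", fsum, dsum);
    return 0;
}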
An important consideration is how much effort (time and money) you can allocate to hunting floating-point errors, and how much they matter to your particular problem. But there is No Silver Bullet, and the question of floating-point accuracy remains a very difficult issue (you could work your entire life on it).
PS. I am not a floating point expert. I just happen to know some.
With the particular example you gave (constants and calculations): YES
You didn't define 'safe' in your problem. I will assume that you want to keep the same number of correct significant digits.
doubles are correct to 15 significant digits
you have constants that have 4 significant digits
the operations involved are multiplication, division, and one square root
it doesn't seem that your results are going to the 'edge' cases of doubles (very small or very large exponent values, where the mantissa loses precision)
In this particular case, the result would be correct to 4 significant digits.
In the general case, however, you have to be careful (the answer is probably not, and it depends on your definition of 'safe', of course).
This is a large and complicated subject. In particular, your result might not be correct to the same number of significant digits if you have:
a lot more operations,
subtractions of numbers close to each other (see the sketch after this list),
other problematic operations.
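A hedged sketch (example mine) of the "subtractions of numbers close to each other" item: the tiny rounding error committed when forming 1.0 + x becomes a large relative error once the leading digits cancel.

#include <cstdio>

int main()
{
    double x = 1.0e-8;
    double computed = (1.0 + x) - 1.0;   // the leading 1.0 cancels out
    // The absolute rounding error in forming (1.0 + x) is about 1e-16, but relative
    // to the true answer 1e-8 that is a relative error of about 1e-8: roughly half
    // of the significant digits are lost in the subtraction.
    std::printf("exact   : %.17g\ncomputed: %.17g\n", x, computed);
    return 0;
}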
Obligatory reading: What Every Computer Scientist Should Know About Floating-Point Arithmetic
See the good answer by Basile Starynkevitch for other references.
Also, for complex calculations, it is relevant to have some notion of the condition number of a problem.
If you need a yes or no answer, No.
I am developing a program for a couple of rather poorly documented MCUs. So far, the most pressing problem is that I have to get all of these MCUs to constantly communicate (send/receive) floating-point data, and I have no idea exactly what the specifications are for their floating-point types. In other words, I cannot make sure that a floating-point value will stay the same if I send it along a serial/parallel connection to another MCU. Nothing I have found gives me specifics on how they handle floating point (precision, mantissa, location of the sign bit, etc...).
I have the standard fixed-point integer types like int and long figured out; this question applies specifically to floating-point types like float and double.
The worst part is that I do not have access to the standard library for every MCU. That means I cannot use std::numeric_limits or other stuff like that.
As a last resort, I can create my own struct, class, or other type and use some well-placed logic operators to get each data type to do what I want, but this is ultimately undesirable for my project. The same goes for trial-and-error of what the bit structure of every floating point type is for every MCU.
So, is it possible to not only see the specifics of the floating point types, but also possibly change them if they don't follow the standard? Or, is it as simple as "You need to get a better MCU"?
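One hedged way to start, without relying on the standard library: write values whose IEEE-754 bit patterns are known into memory and dump the raw bytes over whatever serial output each MCU has. A sketch (the helper name and the byte-order comments are mine):

typedef unsigned char byte_t;

/* Copy the object representation of a float into a caller-supplied buffer,
   byte by byte, so no standard-library call is needed. */
void dump_float_bytes(float f, byte_t out[])
{
    const byte_t *p = (const byte_t *)&f;
    for (unsigned i = 0; i < sizeof(float); ++i)
        out[i] = p[i];
}

/* Usage idea: on an IEEE-754 single-precision, little-endian part, 1.0f should
   come back as 00 00 80 3F and -2.0f as 00 00 00 C0; anything else hints at a
   different endianness, bit layout, or a non-IEEE format altogether. */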
EDIT #1:
I have now tested my 12 MCUs: only 5 of them support the IEEE standard for single and double precision. The other seven all use unique formats for both single and double precision.
EDIT #2:
I ran Kahan's paranoia test script, as suggested by Simon Byrne:
You could try running Kahan's paranoia test script, which is available in several different languages from Netlib. This tries to figure out the floating point characteristics by test computations.
This worked well for two of my MCUs. The remaining five do not have enough memory to run the test. However, the two I did manage to decode have extremely weird ways of handling the sign bit and endianness, and I'll have to look into some odd logical operations to build a foolproof compatibility layer.
You could try running Kahan's paranoia test script, which is available in several different languages from Netlib. This tries to figure out the floating point characteristics by test computations.
I am trying to come up with a good tolerance when comparing doubles in unit tests.
If I allow a fixed tolerance, as I've seen mentioned on this site (e.g. return abs(actual-expected) < 0.00001;), this will frequently fail when numbers are very big, due to the nature of floating-point representation.
If I use a relative tolerance in terms of the % error allowed (e.g. return abs(actual-expected) < abs(actual * 0.001);), this fails too often for small numbers (and for very small numbers, the computation itself can introduce rounding error). Additionally, it allows too much tolerance in certain ranges (e.g. comparing 2000 and 2001 would pass).
I'm wondering if there's any standard algorithm for allowing tolerance that will work for both small and large numbers. Should I try for some kind of base 2 logarithmic tolerance to mirror floating point storage? Should I do a hybrid approach based on the size of the inputs?
Since this is in unit test code, performance is not a big factor.
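One common hybrid (similar in spirit to what Python's math.isclose or numpy.isclose do) combines an absolute floor for values near zero with a relative tolerance for everything else. A hedged C++ sketch; the default thresholds are placeholders to tune, not recommendations:

#include <algorithm>
#include <cmath>

bool nearly_equal(double actual, double expected,
                  double rel_tol = 1e-9, double abs_tol = 1e-12)
{
    double diff  = std::fabs(actual - expected);
    double scale = std::max(std::fabs(actual), std::fabs(expected));
    // abs_tol governs comparisons near zero; rel_tol governs large magnitudes,
    // so 2000 vs 2001 fails while 1e12 vs 1e12 * (1 + 1e-10) passes.
    return diff <= std::max(abs_tol, rel_tol * scale);
}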
The specification of tolerance is a business function. There aren't standards that say "all tolerance must be within +/- .001%". So you have to go to your requirements and figure out what's going to work for you.
Look at your application. Let's say it's for some kind of cutting machine. Are they metal machining tolerances? .005 inches is common. Are they wood cabinet sawing tolerances? 1/32" is sloppy, 1/64" is better. Are they house framing tolerances? Don't expect a carpenter to come closer than 1/4". Hand cutting with a welding torch? Hope for about an inch. The point is simply that every application depends on something different, even when they're doing equivalent things.
If you're just talking "doubles" in general, they're usually good to no better than 15 digits of precision. Floats are good to 7 digits. I round those down by one when I'm thinking about the problem (I don't rely on a double being accurate to more than 14 digits and I stop with floats at six digits); however, if I'm worried about more than the 12th digit of precision I'm generally working with large dollar amounts that have to balance precisely, and I'd be a fool to use non-integer math for them. Business people want their stuff to balance to the penny, and wouldn't approve of rounding off addition operations!
If you're looking at math library operations such as the trig functions, read the library's documentation on each function.
I'm working on an application that does a lot of floating-point calculations. We use VC++ on Intel x86 with double precision floating-point values. We make claims that our calculations are accurate to n decimal digits (right now 7, but trying to claim 15).
We go to a lot of effort to validate our results against other sources when our results change slightly (due to code refactoring, cleanup, etc.). I know that many, many factors play into the overall precision, such as the FPU control state, the compiler/optimizer, the floating-point model, and the overall order of operations (i.e., the algorithm itself), but given the inherent uncertainty in FP calculations (e.g., 0.1 cannot be represented), it seems invalid to claim any specific degree of precision for all calculations.
My question is this: is it valid to make any claims about the accuracy of FP calculations in general without doing any sort of analysis (such as interval analysis)? If so, what claims can be made and why?
EDIT:
So given that the input data is accurate to, say, n decimal places, can any guarantee be made about the result of any arbitrary calculations, given that double precision is being used? E.g., if the input data has 8 significant decimal digits, the output will have at least 5 significant decimal digits... ?
We are using math libraries and are unaware of any guarantees they may or may not make. The algorithms we use are not necessarily analyzed for precision in any way. But even given a specific algorithm, the implementation will affect the results (just changing the order of two addition operations, for example). Is there any inherent guarantee whatsoever when using, say, double precision?
ANOTHER EDIT:
We do empirically validate our results against other sources. So are we just getting lucky when we achieve, say, 10-digit accuracy?
As with all such questions, I have to just simply answer with the article What Every Computer Scientist Should Know About Floating-Point Arithmetic. It's absolutely indispensable for the type of work you are talking about.
Short answer: No.
Reason: Have you proved (yes proved) that you aren't losing any precision as you go along? Are you sure? Do you understand the intrinsic precision of any library functions you're using for transcendental functions? Have you computed the limits of additive errors? If you are using an iterative algorithm, do you know how well it has converged when you quit? This stuff is hard.
Unless your code uses only the basic operations specified in IEEE 754 (+, -, *, / and square root), you do not even know how much precision loss each call to a library function outside your control (trigonometric functions, exp/log, ...) introduces. Functions outside the basic five are not guaranteed to be, and usually are not, accurate to 1 ULP.
You can do empirical checks, but that's what they remain... empirical. Don't forget the part about there being no warranty in the EULA of your software!
If your software was safety-critical, and did not call library-implemented mathematical functions, you could consider http://www-list.cea.fr/labos/gb/LSL/fluctuat/index.html . But only critical software is worth the effort and has a chance to fit in the analysis constraints of this tool.
You seem, after your edit, mostly concerned about your compiler doing things behind your back. It is a natural fear to have (because, as with the mathematical functions, you are not in control). But it's rather unlikely to be the problem. Your compiler may compute with a higher precision than you asked for (80-bit extended when you asked for 64-bit doubles, or 64-bit doubles when you asked for 32-bit floats). This is allowed by the C99 standard. In round-to-nearest, this may introduce double-rounding errors. But it's only 1 ULP you are losing, and so infrequently that you needn't worry. This can cause puzzling behaviors, as in:
float x = 1.0;
float y = 7.0;
float z = x / y;
if (z == x / y)
  ...
else
  ...  /* the else branch is taken */
but you were looking for trouble when you used == between floating-point numbers.
When you have code that does cancelations on purpose, such as in Kahan's summation algorithm:
d = (a+b)-a-b;
and the compiler optimizes that into d = 0;, you have a problem. And yes, this optimization "as if floating-point operations were associative" has been seen in real compilers. It is not allowed by C99. But the situation has gotten better, I think. Compiler authors have become more aware of the dangers of floating-point and no longer try to optimize so aggressively. Plus, if you were doing this in your code you would not be asking this question.
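For reference, here is a hedged sketch of compensated (Kahan) summation; the whole point of the line that computes c is the cancellation that an "as if associative" optimizer would fold away to zero.

#include <cstddef>

double kahan_sum(const double *x, std::size_t n)
{
    double sum = 0.0;
    double c   = 0.0;               // running compensation for lost low-order bits
    for (std::size_t i = 0; i < n; ++i) {
        double y = x[i] - c;        // apply the correction carried from the last step
        double t = sum + y;         // low-order bits of y are lost here...
        c = (t - sum) - y;          // ...and recovered here (algebraically zero!)
        sum = t;
    }
    return sum;
}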
Given that your vendors of machines, compilers, run-time libraries, and operating systems don't make any such claim about floating-point accuracy, you should take that as a warning that your group should be leery of making claims that could come under harsh scrutiny if clients ever took you to court.
Without doing formal verification of the entire system, I would avoid such claims. I work on scientific software that has indirect human-safety implications, so we have considered such things in the past, and we do not make these sorts of claims.
You could make vacuous claims about the precision of double-precision floating-point calculations, but they would be basically worthless.
Ref: "The pitfalls of verifying floating-point computations", ACM Transactions on Programming Languages and Systems 30, 3 (2008), article 12.
No, you cannot make any such claim. If you wanted to do so, you would need to do the following:
Hire an expert in numerical computing to analyze your algorithms.
Either get your library and compiler vendors to open their sources to said expert for analysis, or get them to sign off on hard semantics and error bounds.
Double-precision floating-point typically carries about 15 digits of decimal accuracy, but there are far too many ways for some or all of that accuracy to be lost, that are far too subtle for a non-expert to diagnose, to make any claim like what you would like to claim.
There are relatively simpler ways to keep running error bounds that would let you make accuracy claims about any specific computation, but making claims about the accuracy of all computations performed with your software is a much taller order.
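As an example of the "relatively simpler ways" mentioned above, here is a hedged sketch of a running error bound for plain summation (in the style of Higham's running error analysis; the function and type names are mine):

#include <cfloat>
#include <cmath>
#include <cstddef>

struct SumWithBound {
    double value;
    double error_bound;   // |computed - exact| <= error_bound, to first order
};

SumWithBound sum_with_running_bound(const double *x, std::size_t n)
{
    const double u = DBL_EPSILON / 2.0;   // unit roundoff of double
    double s = 0.0, e = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        s += x[i];
        e += u * std::fabs(s);            // each addition errs by at most u * |intermediate sum|
    }
    return { s, e };
}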
A double precision number on an Intel CPU has slightly better than 15 significant digits (decimal).
The potential error for a simple computation is in the ballpark of n/1.0e15, where n is the order of magnitude of the number(s) you are working with. I suspect that Intel has specs for the accuracy of CPU-based FP computations.
The potential error for library functions (like cos and log) is usually documented. If not, you can look at the source code (e.g. the GNU source) and calculate it.
You would calculate error bars for your calculations just as you would for manual calculations.
Once you do that, you may be able to reduce the error by judicious ordering of the computations.
Since you seem to be concerned about accuracy of arbitrary calculations, here is an approach you can try: run your code with different rounding modes for floating-point calculations. If the results are pretty close to each other, you are probably okay. If the results are not close, you need to start worrying.
The maximum difference in the results will give you a lower bound on the accuracy of the calculations.
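A hedged sketch of that approach using the C/C++ <cfenv> rounding-mode controls (the stand-in computation is mine; strictly conforming code also needs #pragma STDC FENV_ACCESS ON, and compiler support for that varies):

#include <cfenv>
#include <cstdio>

// Hypothetical stand-in for the real calculation under test.
static double computation()
{
    double s = 0.0;
    for (int i = 1; i <= 1000000; ++i)
        s += 1.0 / i;
    return s;
}

int main()
{
    const int modes[] = { FE_TONEAREST, FE_DOWNWARD, FE_UPWARD, FE_TOWARDZERO };
    double lo = 0.0, hi = 0.0;
    for (int m = 0; m < 4; ++m) {
        std::fesetround(modes[m]);          // affects subsequent IEEE operations
        double r = computation();
        std::printf("mode %d: %.17g\n", modes[m], r);
        if (m == 0) { lo = r; hi = r; }
        else        { if (r < lo) lo = r; if (r > hi) hi = r; }
    }
    std::fesetround(FE_TONEAREST);
    // The spread between the extreme results is a rough indicator of how
    // sensitive the computation is to rounding; a wide spread is a warning sign.
    std::printf("spread: %.3g\n", hi - lo);
    return 0;
}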