Is it safe to use double for scientific constants in C++? - c++

I want to do some calculations in C++ using several scientific constants like,
effective mass of electron(m) 9.109e-31 kg
charge of electron 1.602e-19 C
Boltzman constant(k) 1.38×10−23
Time 8.92e-13
And I have calculations like, sqrt((2kT)/m)
Is it safe to use double for these constants and for results?

floating point arithmetic and accuracy is a very tricky subject. Read absolutely the floating-point-gui.de site.
Errors of many floating point operations can accumulate to the point of giving meaningless results. Several catastrophic events (loss of life, billions of dollars crashes) happened because of this. More will happen in the future.
There are some static source analyzers dedicated to detect them, for example Fluctuat (by my CEA colleagues, several now at Ecole Polytechnique, Palaiseau, France) and others. But Rice's theorem applies so that static analysis problem is unsolvable in general.
(but static analysis of floating point accuracy could sometimes practically work on some small programs of a few thousand lines, and do not scale well to large programs)
There are also some programs instrumenting calculations, for example CADNA from LIP6 in Paris, France.
(but instrumention may give a huge over-approximation of the error)
You could design your numerical algorithms to be less sensitive to floating point errors. This is very difficult (and you'll need years of work to acquire the relevant skills and expertise).
(you need both numerical, mathematical, and computer science skills, PhD-level)
You could also use arbitrary-precision arithmetic, or extended precision one (e.g. 128 bit floats or quad-precision). This slows down the computations.
An important consideration is how much effort (time and money) you can allocate to hunt floating point errors, and how much do they matter to your particular problem. But there is No Silver Bullet, and the question of floating point accurary remains a very difficult issue (you could work your entire life on it).
PS. I am not a floating point expert. I just happen to know some.

With the particular example you gave (constants and calculations) : YES
You didn't define 'safe' in your problem. I will assume that you want to keep the same number of correct significant digits.
doubles are correct to 15 significant digits
you have constants that have 4 significant digits
the operations involves use multiplication, division, and one square root
it doesn't seem that your results are going to the 'edge' cases of doubles (for very small or large exponent value, where mantissa loses precision)
In this particular order, the result would be correct to 4 significant digits.
In the general case, however, you have to be careful. (probably not, and this depend on your definition of 'safe' of course).
This is a large and complicated subject. In particular, your result might not be correct to the same number of significant digits if you have :
a lot more operations,
if you have substractions of numbers close to each other
other problematic operations
Obligatory reading : What Every Computer Scientist Should Know About Floating-Point Arithmetic
See the good answer of #Basile Starynkevitch for other references.
Also, for complex calculations, it is relevant to have some notion of the Condition number of a problem.

If you need a yes or no answer, No.

Related

How to force 32bits floating point calculation consistency across different platforms?

I have a simple piece of code that operates with floating points.
Few multiplications, divisions, exp(), subtraction and additions in a loop.
When I run the same piece of code on different platforms (like PC, Android phones, iPhones) I get slightly different results.
The result is pretty much equal on all the platforms but has a very small discrepancy - typically 1/1000000 of the floating point value.
I suppose the reason is that some phones don't have floating point registers and just simulate those calculations with integers, some do have floating point registers but have different implementations.
There are proofs to that here: http://christian-seiler.de/projekte/fpmath/
Is there a way to force all the platform to produce a consistent results?
For example a good & fast open-source library that implements floating point mechanics with integers (in software), thus I can avoid hardware implementation differences.
The reason I need an exact consistency is to avoid compound errors among layers of calculations.
Currently those compound errors do produce a significantly different result.
In other words, I don't care so much which platform has a more correct result, but rather want to force consistency to be able to reproduce equal behavior. For example a bug which was discovered on a mobile phone is much easier to debug on PC, but I need to reproduce this exact behavior
One relatively widely used and high quality software FP implementation is MPFR. It is a lot slower than hardware FP, though.
Of course, this won't solve the actual problems your algorithm has with compound errors, it will just make it produce the same errors on all platforms. Probably a better approach would be to design an algorithm which isn't as sensitive to small differences in FP arithmetic, if feasible. Or if you go the MPFR route, you can use a higher precision FP type and see if that helps, no need to limit yourself to emulating the hardware single/double precision.
32-bit floating point math, for a given calculation will, at best, have a precision of 1 in 16777216 (1 in 224). Functions such as exp are often implemented as a sequence of calculations, so may have a larger error due to this. If you do several calculations in a row, the errors will add and multiply up. In general float has about 6-7 digits of precision.
As one comment says, check the rounding mode is the same. Most FPU's have a "round to nearest" (rtn), "round to zero" (rtz) and "round to even" (rte) mode that you can choose. The default on different platforms MAY vary.
If you perform additions or subtractions of fairly small numbers to fairly large numbers, since the number has to be normalized you will have a greater error from these sort of operations.
Normalized means shifted such that both numbers have the decimal place lined up - just like if you do that on paper, you have to fill in extra zeros to line up the two numbers you are adding - but of course on paper you can add 12419818.0 with 0.000000001 and end up with 12419818.000000001 because paper has as much precision as you can be bothered with. Doing this in float or double will result in the same number as before.
There are indeed libraries that do floating point math - the most popular being MPFR - but it is a "multiprecision" library, but it will be fairly slow - because they are not really built to be "plugin replacement of float", but a tool for when you want to calculate pi with 1000s of digits, or when you want to calculate prime numbers in the ranges much larger than 64 or 128 bits, for example.
It MAY solve the problem to use such a library, but it will be slow.
A better choice would, moving from float to double should have a similar effect (double has 53 bits of mantissa, compared to the 23 in a 32-bit float, so more than twice as many bits in the mantissa). And should still be available as hardware instructions in any reasonably recent ARM processor, and as such relatively fast, but not as fast as float (FPU is available from ARMv7 - which certainly what you find in iPhone - at least from iPhone 3 and the middle to high end Android devices - I managed to find that Samsung Galaxy ACE has an ARM9 processor [first introduced in 1997] - so has no floating point hardware).

Good tolerance for double comparison

I am trying to come up with a good tolerance when comparing doubles in unit tests.
If I allow a fixed tolerance as I've seen mentioned on this site (eg return abs(actual-expected) < 0.00001;), this will frequently fail when numbers are very big due to the nature of floating point representation.
If I use a relative tolerance in terms of % error allowed (eg return abs(actual-expected) < abs(actual * 0.001); this fails too often for small numbers (and for very small numbers, the computation itself can introduce rounding error). Additionally, it allows too much tolerance in certain ranges (eg comparing 2000 and 2001 would pass).
I'm wondering if there's any standard algorithm for allowing tolerance that will work for both small and large numbers. Should I try for some kind of base 2 logarithmic tolerance to mirror floating point storage? Should I do a hybrid approach based on the size of the inputs?
Since this is in unit test code, performance is not a big factor.
The specification of tolerance is a business function. There aren't standards that say "all tolerance must be within +/- .001%". So you have to go to your requirements and figure out what's going to work for you.
Look at your application. Let's say it's for some kind of cutting machine. Are they metal machining tolerances? .005 inches is common. Are they wood cabinet sawing tolerances? 1/32" is sloppy, 1/64" is better. Are they house framing tolerances? Don't expect a carpenter to come closer than 1/4". Hand cutting with a welding torch? Hope for about an inch. The point is simply that every application depends on something different, even when they're doing equivalent things.
If you're just talking "doubles" in general, they're usually good to no better than 15 digits of precision. Floats are good to 7 digits. I round those down by one when I'm thinking about the problem (I don't rely on a double being accurate to more than 14 digits and I stop with floats at six digits); however, if I'm worried about more than the 12th digit of precision I'm generally working with large dollar amounts that have to balance precisely, and I'd be a fool to use non-integer math for them. Business people want their stuff to balance to the penny, and wouldn't approve of rounding off addition operations!
If you're looking at math library operations such as the trig functions, read the library's documentation on each function.

Checking Numerical Precision in Algorithms

what is the best practice to check for numerical precision in algorithms?
Is there any suggested technique to resolve the problem "how do we know the result we calculated is correct"?
If possible: are there some example of numerical precision enhancement in C++?
Thank you for any suggestion!
Math::BigFloat / Math::BigInt will help. I must say there are many libraries that do this, I don't know which would be best. Maybe someone else has a nice answer for you.
In general though, you can write it twice: once with unlimited precision, and one without then verify the two. That's what I do with the scientific software I write. Then I'll write a third that does fancier speed enhancements. This way I can verify all three. Mind you, I know the three won't be exactly equal, but they should have enough significant figures of corroboration.
To actually know how much error is difficult to obtain accurately --remember order of operations of floating point numbers can cause large differences. It's really problem specific but if you know the relative magnitude of certain numbers you can change the order of operations to get accuracy (multiply a list in sorted order for example). Two places to look for investigating this is,
Handbook of Floating-Point Arithmetic
What Every Computer Scientist needs to know about Floating Point Arithmetic.
Anatomy of a Floating Point Number
Have a look at interval arithmetic, for example
http://www.boost.org/doc/libs/1_53_0/libs/numeric/interval/doc/interval.htm
It will produce upper and lower bounds on results
PS: also have a look at http://www.cs.cmu.edu/~quake/robust.html

What should i know when using floats/doubles between different machines?

I've heard that there are many problems with floats/doubles on different CPU's.
If i want to make a game that uses floats for everything, how can i be sure the float calculations are exactly the same on every machine so that my simulation will look exactly same on every machine?
I am also concerned about writing/reading files or sending/receiving the float values to different computers. What conversions there must be done, if any?
I need to be 100% sure that my float values are computed exactly the same, because even a slight difference in the calculations will result in a totally different future. Is this even possible ?
Standard C++ does not prescribe any details about floating point types other than range constraints, and possibly that some of the maths functions (like sine and exponential) have to be correct up to a certain level of accuracy.
Other than that, at that level of generality, there's really nothing else you can rely on!
That said, it is quite possible that you will not actually require binarily identical computations on every platform, and that the precision and accuracy guarantees of the float or double types will in fact be sufficient for simulation purposes.
Note that you cannot even produce a reliable result of an algebraic expression inside your own program when you modify the order of evaluation of subexpressions, so asking for the sort of reproducibility that you want may be a bit unrealistic anyway. If you need real floating point precision and accuracy guarantees, you might be better off with an arbitrary precision library with correct rounding, like MPFR - but that seems unrealistic for a game.
Serializing floats is an entirely different story, and you'll have to have some idea of the representations used by your target platforms. If all platforms were in fact to use IEEE 754 floats of 32 or 64 bit size, you could probably just exchange the binary representation directly (modulo endianness). If you have other platforms, you'll have to think up your own serialization scheme.
What every programmer should know: http://docs.sun.com/source/806-3568/ncg_goldberg.html

What claims, if any, can be made about the accuracy/precision of floating-point calculations?

I'm working on an application that does a lot of floating-point calculations. We use VC++ on Intel x86 with double precision floating-point values. We make claims that our calculations are accurate to n decimal digits (right now 7, but trying to claim 15).
We go to a lot of effort of validating our results against other sources when our results change slightly (due to code refactoring, cleanup, etc.). I know that many many factors play in to the overall precision, such as the FPU control state, the compiler/optimizer, floating-point model, and the overall order of operations themselves (i.e., the algorithm itself), but given the inherent uncertainty in FP calculations (e.g., 0.1 cannot be represented), it seems invalid to claim any specific degree of precision for all calulations.
My question is this: is it valid to make any claims about the accuracy of FP calculations in general without doing any sort of analysis (such as interval analysis)? If so, what claims can be made and why?
EDIT:
So given that the input data is accurate to, say, n decimal places, can any guarantee be made about the result of any arbitrary calculations, given that double precision is being used? E.g., if the input data has 8 significant decimal digits, the output will have at least 5 significant decimal digits... ?
We are using math libraries and are unaware of any guarantees they may or may not make. The algorithms we use are not necessarily analyzed for precision in any way. But even given a specific algorithm, the implementation will affect the results (just changing the order of two addition operations, for example). Is there any inherent guarantee whatsoever when using, say, double precision?
ANOTHER EDIT:
We do empirically validate our results against other sources. So are we just getting lucky when we achieve, say, 10-digit accuracy?
As with all such questions, I have to just simply answer with the article What Every Computer Scientist Should Know About Floating-Point Arithmetic. It's absolutely indispensable for the type of work you are talking about.
Short answer: No.
Reason: Have you proved (yes proved) that you aren't losing any precision as you go along? Are you sure? Do you understand the intrinsic precision of any library functions you're using for transcendental functions? Have you computed the limits of additive errors? If you are using an iterative algorithm, do you know how well it has converged when you quit? This stuff is hard.
Unless your code uses only the basic operations specified in IEEE 754 (+, -, *, / and square root), you do not even know how much precision loss each call to library functions outside your control (trigonometric functions, exp/log, ...) introduce. Functions outside the basic 5 are not guaranteed to be, and are usually not, precise at 1ULP.
You can do empirical checks, but that's what they remain... empirical. Don't forget the part about there being no warranty in the EULA of your software!
If your software was safety-critical, and did not call library-implemented mathematical functions, you could consider http://www-list.cea.fr/labos/gb/LSL/fluctuat/index.html . But only critical software is worth the effort and has a chance to fit in the analysis constraints of this tool.
You seem, after your edit, mostly concerned about your compiler doing things in your back. It is a natural fear to have (because like for the mathematical functions, you are not in control). But it's rather unlikely to be the problem. Your compiler may compute with a higher precision than you asked for (80-bit extendeds when you asked for 64-bit doubles or 64-bit doubles when you asked for 32-bit floats). This is allowed by the C99 standard. In round-to-nearest, this may introduce double-rounding errors. But it's only 1ULP you are losing, and so infrequently that you needn't worry. This can cause puzzling behaviors, as in:
float x=1.0;
float y=7.0;
float z=x/y;
if (z == x/y)
...
else
... /* the else branch is taken */
but you were looking for trouble when you used == between floating-point numbers.
When you have code that does cancelations on purpose, such as in Kahan's summation algorithm:
d = (a+b)-a-b;
and the compiler optimizes that into d=0;, you have a problem. And yes, this optimization "as if floats operation were associative" has been seen in general compilers. It is not allowed by C99. But the situation has gotten better, I think. Compiler authors have become more aware of the dangers of floating-point and no longer try to optimize so aggressively. Plus, if you were doing this in your code you would not be asking this question.
Given that your vendors of machines, compilers, run-time libraries, and operation systems don't make any such claim about floating point accuracy, you should take that to be a warning that your group should be leery of making claims that could come under harsh scrutiny if clients ever took you to court.
Without doing formal verification of the entire system, I would avoid such claims. I work on scientific software that has indirect human safety implications, so we have consider such things in the past, and we do not make these sort of claims.
You could make useless claims about precision of double (length) floating point calculations, but it would be basically worthless.
Ref: The pitfalls of verifying floating-point computations from ACM Transactions on Programming Languages and Systems 30, 3 (2008) 12
No, you cannot make any such claim. If you wanted to do so, you would need to do the following:
Hire an expert in numerical computing to analyze your algorithms.
Either get your library and compiler vendors to open their sources to said expert for analysis, or get them to sign off on hard semantics and error bounds.
Double-precision floating-point typically carries about 15 digits of decimal accuracy, but there are far too many ways for some or all of that accuracy to be lost, that are far too subtle for a non-expert to diagnose, to make any claim like what you would like to claim.
There are relatively simpler ways to keep running error bounds that would let you make accuracy claims about any specific computation, but making claims about the accuracy of all computations performed with your software is a much taller order.
A double precision number on an Intel CPU has slightly better than 15 significant digits (decimal).
The potrntial error for a simple computation is in the ballparl of n/1.0e15, where n is the order of magnitude of the number(s) you are working with. I suspect that Intel has specs for the accuracy of CPU-based FP computations.
The potential error for library functions (like cos and log) is usually documented. If not, you can look at the source code (e.g. thr GNU source) and calculate it.
You would calculate error bars for your calculations just as you would for manual calculations.
Once you do that, you may be able to reduce the error by judicious ordering of the computations.
Since you seem to be concerned about accuracy of arbitrary calculations, here is an approach you can try: run your code with different rounding modes for floating-point calculations. If the results are pretty close to each other, you are probably okay. If the results are not close, you need to start worrying.
The maximum difference in the results will give you a lower bound on the accuracy of the calculations.