I've heard that there are many problems with floats/doubles on different CPUs.
If I want to make a game that uses floats for everything, how can I be sure the float calculations are exactly the same on every machine, so that my simulation will look exactly the same everywhere?
I am also concerned about writing/reading files or sending/receiving float values between different computers. What conversions, if any, must be done?
I need to be 100% sure that my float values are computed exactly the same, because even a slight difference in the calculations will result in a totally different future. Is this even possible?
Standard C++ does not prescribe any details about floating point types other than range constraints, and possibly that some of the maths functions (like sine and exponential) have to be correct up to a certain level of accuracy.
Other than that, at that level of generality, there's really nothing else you can rely on!
That said, it is quite possible that you will not actually require binarily identical computations on every platform, and that the precision and accuracy guarantees of the float or double types will in fact be sufficient for simulation purposes.
Note that you cannot even produce a reliable result of an algebraic expression inside your own program when you modify the order of evaluation of subexpressions, so asking for the sort of reproducibility that you want may be a bit unrealistic anyway. If you need real floating point precision and accuracy guarantees, you might be better off with an arbitrary precision library with correct rounding, like MPFR - but that seems unrealistic for a game.
Serializing floats is an entirely different story, and you'll have to have some idea of the representations used by your target platforms. If all platforms were in fact to use IEEE 754 floats of 32 or 64 bit size, you could probably just exchange the binary representation directly (modulo endianness). If you have other platforms, you'll have to think up your own serialization scheme.
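For illustration, a minimal sketch of the direct-exchange idea, assuming both ends use 32-bit IEEE 754 floats and you fix one byte order for the wire format (the helper names here are invented for the example):

#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this scheme assumes 32-bit IEEE 754 floats");

// Copy the raw bit pattern into an integer; the integer can then be written
// out in whatever byte order the protocol fixes (swap on mismatched hosts).
uint32_t float_to_bits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

float bits_to_float(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

The same idea works for double with uint64_t.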
What Every Computer Scientist Should Know About Floating-Point Arithmetic: http://docs.sun.com/source/806-3568/ncg_goldberg.html
Different platforms have varying FP capabilities with different parameters and behaviors. As a result, there is a degree of variance in the calculation results they produce, and those differences cascade and amplify with each intermediate step.
I am in a situation where it is critical for (+-*/ only) calculations to produce identical results on each and every different target platform, using different compiler vendors, so I wonder if there is a standard way to do that. I am not asking about arbitrary high precision floating point numbers but standard 64 bit IEEE double, and a performance hit is expected and tolerable.
Even if you have a 64 bit IEEE754 double, there are a few extra things you need to check.
1. Make sure you have strict floating point. Don't allow your compiler to use, for example, 80 bits for intermediate calculations.
2. Various operations (all the arithmetic operations such as the ones you mention, std::sqrt, &c.) are required by IEEE754 to return the best number possible. (Should you need others, make sure that all your operations are mentioned in the IEEE754 standard and that your platform obeys it faithfully - it might not.)
3. Shy away from other functions (such as trigonometric functions), for which there is no guarantee of precision, even under IEEE754.
In your specific case it appears that (1) is sufficient, along with perhaps (for C++)
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 floating point required");
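A small additional compile-time check, assuming C++11 and the <cfloat> macro FLT_EVAL_METHOD, which reports whether intermediate results are evaluated in a wider format (point 1 above); this only rejects a non-conforming build rather than fixing it:

#include <cfloat>

// FLT_EVAL_METHOD == 0 means float and double expressions are evaluated in
// their own precision - no 80-bit x87 intermediates behind your back.
static_assert(FLT_EVAL_METHOD == 0,
              "intermediate results may use extended precision on this platform");

On x86 you typically get this by compiling for SSE (e.g. -mfpmath=sse on GCC/Clang, or /fp:strict on MSVC).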
I am writing a marshaling layer to automatically convert values between different domains. When it comes to floating point values this potentially means converting values from one floating point format to another. However, it seems that almost every modern system is using IEEE754, so I'm wondering whether it's actually worth generalising to allow other formats, or just manage marshaling between different IEEE754 formats.
Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider (perhaps on ARM processors or mainframes)? If so, a reference to the format specification would be extremely helpful.
Virtually all relatively modern (within the last 15 years) general purpose computers use IEEE 754. In the very unlikely event that you find a system you need to support which uses a non-IEEE 754 floating point format, there will probably be a library available to convert to/from IEEE 754.
Some non-ancient systems which did not natively use IEEE 754 were the Cray SV1 (1998-2003) and IBM System 360, 370, and 390 prior to Generation 5 (ended 2002). IBM implemented IEEE 754 emulation around 2001 in a software release for prior S/390 hardware.
As of now, what systems do you actually want this to work on? If you come across one down the line that doesn't use IEEE754 (which, as @JohnZwinick says, is vanishingly unlikely), you should be able to code for it at that point.
To put it another way, what you are designing here is, in effect, a communications protocol and you obviously seek to make a sensible choice for how you will represent a floating point number (both single precision and double precision, I guess) in the bytes that travel between domains.
I think @SomeProgrammerDude was trying to imply that representing these as text strings (while they are in transit) might offer the most portability, and if so I would agree, but it's obviously not the most efficient way to do it.
So, if you do decide to plump for IEEE754 as your interchange format (as I would), the worst that can happen is that you might need to find a way to convert to and from the native format used on some antique architecture that you are almost certainly never going to encounter, and if that does happen, the problem would not be difficult to solve.
Also, floats and doubles can be big-endian or little-endian, so you need to decide what you're going to use in your byte stream and convert when marshalling if necessary. Little-endian is much more common these days so I'd go with that.
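A rough sketch of what that byte-order normalization could look like, assuming 64-bit IEEE 754 doubles and little-endian as the wire format (the function names are invented for the example):

#include <algorithm>
#include <cstdint>
#include <cstring>

// Runtime check: does this host store integers least-significant byte first?
bool host_is_little_endian() {
    const uint16_t probe = 1;
    uint8_t first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Serialize a double as 8 bytes in little-endian order (the chosen wire format).
void double_to_wire(double d, uint8_t out[8]) {
    std::memcpy(out, &d, sizeof d);
    if (!host_is_little_endian())
        std::reverse(out, out + 8);
}

Note that this assumes the floating point byte order matches the integer byte order on the host - true on common platforms, but, as pointed out below, not something IEEE 754 itself guarantees.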
Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider ...?
CCSI uses a variation on binary32 for select processors.
it seems that almost every modern system is using IEEE754,
Yes, but... various implementations fudge on the particulars around edge values such as subnormals, negative zero (in Visual Studio), infinity, and not-a-number.
It is this second issue that is more lethal and harder to pin down: whether a given implementation has completely implemented IEEE754. See __STDC_IEC_559__.
The OP says "I am writing a marshaling layer". It is in that code that trouble most likely remains for the edge cases. Also, IEEE754 does not specify endianness, so that marshaling issue remains. Recall that integer endianness may not match FP endianness.
I have a simple piece of code that operates with floating points.
A few multiplications, divisions, exp(), subtractions and additions in a loop.
When I run the same piece of code on different platforms (like PC, Android phones, iPhones) I get slightly different results.
The result is pretty much equal on all the platforms but has a very small discrepancy - typically 1/1000000 of the floating point value.
I suppose the reason is that some phones don't have floating point registers and simply simulate those calculations with integers, while others do have floating point registers but with different implementations.
There is supporting evidence here: http://christian-seiler.de/projekte/fpmath/
Is there a way to force all the platform to produce a consistent results?
For example a good & fast open-source library that implements floating point mechanics with integers (in software), thus I can avoid hardware implementation differences.
The reason I need an exact consistency is to avoid compound errors among layers of calculations.
Currently those compound errors do produce a significantly different result.
In other words, I don't care so much which platform has the more correct result; I want to force consistency so that I can reproduce equal behavior. For example, a bug that was discovered on a mobile phone is much easier to debug on a PC, but I need to reproduce that exact behavior.
One relatively widely used and high quality software FP implementation is MPFR. It is a lot slower than hardware FP, though.
Of course, this won't solve the actual problems your algorithm has with compound errors, it will just make it produce the same errors on all platforms. Probably a better approach would be to design an algorithm which isn't as sensitive to small differences in FP arithmetic, if feasible. Or if you go the MPFR route, you can use a higher precision FP type and see if that helps, no need to limit yourself to emulating the hardware single/double precision.
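As a rough illustration of the MPFR route (a sketch only; the precision and the rounding mode are explicit on every operation, which is what makes results bit-for-bit reproducible across platforms - link with -lmpfr -lgmp):

#include <mpfr.h>
#include <cstdio>

int main() {
    mpfr_t x, r;
    mpfr_init2(x, 53);               // 53 bits of mantissa: mimic IEEE 754 double
    mpfr_init2(r, 53);

    mpfr_set_d(x, 1.5, MPFR_RNDN);   // round to nearest, like the hardware default
    mpfr_exp(r, x, MPFR_RNDN);       // exp() is correctly rounded, unlike most libm's
    mpfr_mul(r, r, x, MPFR_RNDN);

    mpfr_printf("%.17Rg\n", r);      // same digits on every platform
    mpfr_clear(x);
    mpfr_clear(r);
    return 0;
}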
32-bit floating point math, for a given calculation, will at best have a precision of 1 in 16777216 (1 in 2^24). Functions such as exp are often implemented as a sequence of calculations, so they may have a larger error due to this. If you do several calculations in a row, the errors will add and multiply up. In general, float has about 6-7 digits of precision.
As one comment says, check that the rounding mode is the same. Most FPUs have a "round to nearest" (rtn), "round to zero" (rtz) and "round to even" (rte) mode that you can choose. The default on different platforms MAY vary.
If you perform additions or subtractions of fairly small numbers with fairly large numbers, you will get a greater error from these sorts of operations, because the numbers have to be normalized.
Normalized means shifted so that both numbers have the decimal point lined up - just like when you do it on paper, you have to fill in extra zeros to line up the two numbers you are adding - but of course on paper you can add 12419818.0 to 0.000000001 and end up with 12419818.000000001, because paper has as much precision as you can be bothered with. Doing this in float will give you the same number as before, because the small value falls below the precision of the larger one.
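A tiny illustration of that absorption effect, using the same numbers in float:

#include <cassert>

int main() {
    float big  = 12419818.0f;    // the ulp of this value in float is 1.0
    float tiny = 0.000000001f;   // far below half an ulp of 'big'
    float sum  = big + tiny;
    assert(sum == big);          // the small addend is absorbed completely
    return 0;
}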
There are indeed libraries that do floating point math in software - the most popular being MPFR - but it is a "multiprecision" library and will be fairly slow, because such libraries are not really built as drop-in replacements for float, but as tools for when you want to calculate pi to thousands of digits, or find prime numbers in ranges much larger than 64 or 128 bits, for example.
It MAY solve the problem to use such a library, but it will be slow.
A better choice would be to move from float to double, which should have a similar effect (double has 53 bits of mantissa, compared to 24 in a 32-bit float, so more than twice as many bits in the mantissa). Double is still available as hardware instructions in any reasonably recent ARM processor, and as such is relatively fast, though not as fast as float (an FPU is available from ARMv7 onwards - which is certainly what you find in an iPhone, at least from the iPhone 3 - and in middle to high end Android devices; however, I found that the Samsung Galaxy ACE has an ARM9 processor [first introduced in 1997], so it has no floating point hardware).
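To see what the extra mantissa bits buy you, a small illustrative accumulation test:

#include <cstdio>

int main() {
    float  fsum = 0.0f;
    double dsum = 0.0;
    for (int i = 0; i < 10000000; ++i) {   // add 0.1 ten million times
        fsum += 0.1f;
        dsum += 0.1;
    }
    // The exact answer is 1,000,000. The float total drifts visibly away from
    // it as rounding errors pile up; the double total stays much closer.
    std::printf("float:  %f\ndouble: %f\n", fsum, dsum);
    return 0;
}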
I'm working on an application that does a lot of floating-point calculations. We use VC++ on Intel x86 with double precision floating-point values. We make claims that our calculations are accurate to n decimal digits (right now 7, but trying to claim 15).
We go to a lot of effort to validate our results against other sources when our results change slightly (due to code refactoring, cleanup, etc.). I know that many, many factors play into the overall precision, such as the FPU control state, the compiler/optimizer, the floating-point model, and the overall order of operations (i.e., the algorithm itself), but given the inherent uncertainty in FP calculations (e.g., 0.1 cannot be represented exactly), it seems invalid to claim any specific degree of precision for all calculations.
My question is this: is it valid to make any claims about the accuracy of FP calculations in general without doing any sort of analysis (such as interval analysis)? If so, what claims can be made and why?
EDIT:
So given that the input data is accurate to, say, n decimal places, can any guarantee be made about the result of any arbitrary calculations, given that double precision is being used? E.g., if the input data has 8 significant decimal digits, the output will have at least 5 significant decimal digits... ?
We are using math libraries and are unaware of any guarantees they may or may not make. The algorithms we use are not necessarily analyzed for precision in any way. But even given a specific algorithm, the implementation will affect the results (just changing the order of two addition operations, for example). Is there any inherent guarantee whatsoever when using, say, double precision?
ANOTHER EDIT:
We do empirically validate our results against other sources. So are we just getting lucky when we achieve, say, 10-digit accuracy?
As with all such questions, I have to just simply answer with the article What Every Computer Scientist Should Know About Floating-Point Arithmetic. It's absolutely indispensable for the type of work you are talking about.
Short answer: No.
Reason: Have you proved (yes proved) that you aren't losing any precision as you go along? Are you sure? Do you understand the intrinsic precision of any library functions you're using for transcendental functions? Have you computed the limits of additive errors? If you are using an iterative algorithm, do you know how well it has converged when you quit? This stuff is hard.
Unless your code uses only the basic operations specified in IEEE 754 (+, -, *, / and square root), you do not even know how much precision loss each call to library functions outside your control (trigonometric functions, exp/log, ...) introduces. Functions outside the basic five are not guaranteed to be, and usually are not, accurate to 1 ULP.
You can do empirical checks, but that's what they remain... empirical. Don't forget the part about there being no warranty in the EULA of your software!
If your software was safety-critical, and did not call library-implemented mathematical functions, you could consider http://www-list.cea.fr/labos/gb/LSL/fluctuat/index.html . But only critical software is worth the effort and has a chance to fit in the analysis constraints of this tool.
You seem, after your edit, mostly concerned about your compiler doing things behind your back. It is a natural fear to have (because, as with the mathematical functions, you are not in control). But it's rather unlikely to be the problem. Your compiler may compute with a higher precision than you asked for (80-bit extended when you asked for 64-bit doubles, or 64-bit doubles when you asked for 32-bit floats). This is allowed by the C99 standard. In round-to-nearest, this may introduce double-rounding errors. But you are only losing 1 ULP this way, and so infrequently that you needn't worry. This can cause puzzling behaviors, as in:
float x = 1.0;
float y = 7.0;
float z = x / y;
if (z == x / y)
    ...
else
    ...  /* the else branch is taken */
but you were looking for trouble when you used == between floating-point numbers.
When you have code that does cancelations on purpose, such as in Kahan's summation algorithm:
d = (a+b)-a-b;
and the compiler optimizes that into d = 0;, you have a problem. And yes, this optimization "as if float operations were associative" has been seen in general-purpose compilers. It is not allowed by C99. But the situation has gotten better, I think. Compiler authors have become more aware of the dangers of floating-point and no longer try to optimize so aggressively. Plus, if you were doing this in your code you would not be asking this question.
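For reference, a sketch of Kahan's compensated summation, which is exactly the kind of code such an "associativity" optimization silently breaks:

#include <vector>

// Kahan (compensated) summation: 'c' carries the low-order bits lost by each
// addition. An optimizer that treats FP addition as associative can reduce the
// compensation term to zero and defeat the whole point.
double kahan_sum(const std::vector<double>& values) {
    double sum = 0.0;
    double c   = 0.0;                // running compensation
    for (double x : values) {
        double y = x - c;
        double t = sum + y;
        c   = (t - sum) - y;         // algebraically zero, numerically not
        sum = t;
    }
    return sum;
}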
Given that your vendors of machines, compilers, run-time libraries, and operating systems don't make any such claims about floating point accuracy, you should take that as a warning that your group should be leery of making claims that could come under harsh scrutiny if clients ever took you to court.
Without doing formal verification of the entire system, I would avoid such claims. I work on scientific software that has indirect human safety implications, so we have had to consider such things in the past, and we do not make these sorts of claims.
You could make blanket claims about the precision of double-precision floating point calculations, but they would be basically worthless.
Ref: The pitfalls of verifying floating-point computations from ACM Transactions on Programming Languages and Systems 30, 3 (2008) 12
No, you cannot make any such claim. If you wanted to do so, you would need to do the following:
1. Hire an expert in numerical computing to analyze your algorithms.
2. Either get your library and compiler vendors to open their sources to said expert for analysis, or get them to sign off on hard semantics and error bounds.
Double-precision floating-point typically carries about 15 decimal digits of accuracy, but there are far too many ways for some or all of that accuracy to be lost - ways far too subtle for a non-expert to diagnose - to make the kind of claim you would like to make.
There are relatively simpler ways to keep running error bounds that would let you make accuracy claims about any specific computation, but making claims about the accuracy of all computations performed with your software is a much taller order.
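One such simple scheme, sketched here for plain recursive summation (a first-order running error bound; u is the unit roundoff of double, and the names are invented for the example):

#include <cmath>
#include <limits>
#include <vector>

struct SumWithBound {
    double sum;     // the computed sum
    double bound;   // upper bound on |computed sum - exact sum| (to first order)
};

SumWithBound sum_with_running_bound(const std::vector<double>& v) {
    const double u = std::numeric_limits<double>::epsilon() / 2;  // unit roundoff
    double sum = 0.0;
    double acc_abs = 0.0;
    for (double x : v) {
        sum += x;
        acc_abs += std::fabs(sum);   // each addition errs by at most u * |new partial sum|
    }
    return { sum, u * acc_abs };
}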
A double precision number on an Intel CPU has slightly better than 15 significant digits (decimal).
The potential error for a simple computation is in the ballpark of n/1.0e15, where n is the order of magnitude of the number(s) you are working with. I suspect that Intel has specs for the accuracy of CPU-based FP computations.
The potential error for library functions (like cos and log) is usually documented. If not, you can look at the source code (e.g. the GNU source) and calculate it.
You would calculate error bars for your calculations just as you would for manual calculations.
Once you do that, you may be able to reduce the error by judicious ordering of the computations.
Since you seem to be concerned about accuracy of arbitrary calculations, here is an approach you can try: run your code with different rounding modes for floating-point calculations. If the results are pretty close to each other, you are probably okay. If the results are not close, you need to start worrying.
The maximum difference in the results will give you a lower bound on the accuracy of the calculations.
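A sketch of that experiment using the standard <cfenv> facilities; the computation shown is just a stand-in for your own, and some compilers need a flag such as -frounding-math or /fp:strict before fesetround actually affects the generated code:

#include <cfenv>
#include <cstdio>

// Stand-in computation; substitute the calculation you actually care about.
double compute() {
    double s = 0.0;
    for (int i = 1; i <= 1000000; ++i)
        s += 1.0 / i;
    return s;
}

int main() {
    const int modes[] = { FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO };
    for (int mode : modes) {
        std::fesetround(mode);
        std::printf("mode %d: %.17g\n", mode, compute());
    }
    std::fesetround(FE_TONEAREST);   // restore the default rounding mode
    return 0;
}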
Say you're writing a C++ application doing lots of floating point arithmetic. Say this application needs to be portable across a reasonable range of hardware and OS platforms (say 32- and 64-bit hardware, Windows and Linux, both in 32- and 64-bit flavors...).
How would you make sure that your floating point arithmetic is the same on all platforms? For instance, how can you be sure that a 32-bit floating point value will really be 32 bits on all platforms?
For integers we have stdint.h but there doesn't seem to exist a floating point equivalent.
[EDIT]
I got very interesting answers but I'd like to add some precision to the question.
For integers, I can write:
#include <stdint>
[...]
int32_t myInt;
and be sure that, whatever the (C99-compatible) platform I'm on, myInt is a 32-bit integer.
If I write:
double myDouble;
float myFloat;
am I certain that this will compile to 64-bit and 32-bit floating point numbers, respectively, on all platforms?
Non-IEEE 754
Generally, you cannot. There's always a trade-off between consistency and performance, and C++ hands that trade-off to you.
For platforms that don't have hardware floating point operations (such as embedded and signal processing processors), you cannot use C++ "native" floating point operations, at least not portably. While a software layer would be possible, it is certainly not feasible for this type of device.
For these, you could use 16-bit or 32-bit fixed point arithmetic (but you might even discover that long is only rudimentarily supported, and frequently div is very expensive). However, this will be much slower than built-in fixed point arithmetic, and it becomes painful after the basic four operations.
I haven't come across devices that support floating point in a different format than IEEE 754. From my experience, your best bet is to hope for the standard, because otherwise you usually end up building algorithms and code around the capabilities of the device. When sin(x) suddenly costs 1000 times as much, you better pick an algorithm that doesn't need it.
IEEE 754 - Consistency
The only non-portability I found here is when you expect bit-identical results across platforms. The biggest influence is the optimizer. Again, you can trade accuracy and speed for consistency. Most compilers have an option for that - e.g. "floating point consistency" in Visual C++. But note that this always concerns accuracy beyond the guarantees of the standard.
Why results become inconsistent?
First, FPU registers often have higher resolution than a double (e.g. 80 bits), so as long as the code generator doesn't store the value back, intermediate values are held with higher accuracy.
Second, equivalences like a*(b+c) = a*b + a*c are not exact due to the limited precision. Nonetheless the optimizer, if allowed, may make use of them.
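A tiny demonstration that such identities do not hold exactly (here with associativity of addition, which the optimizer is likewise not allowed to assume):

#include <cstdio>

int main() {
    double a = 0.1, b = 0.2, c = 0.3;
    double left  = (a + b) + c;      // 0.6000000000000001
    double right = a + (b + c);      // 0.6
    std::printf("%.17g\n%.17g\nequal: %d\n", left, right, left == right);
    return 0;
}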
Also - what I learned the hard way - printing and parsing functions are not necessarily consistent across platforms, probably due to numeric inaccuracies, too.
float
It is a common misconception that float operations are intrinsically faster than double. Working on large float arrays is usually faster, but mostly because of fewer cache misses.
Be careful with float accuracy. It can be "good enough" for a long time, but I've often seen it fail sooner than expected. Float-based FFTs can be much faster due to SIMD support, but they generate noticeable artefacts quite early for audio processing.
Use fixed point.
However, if you want to approach the realm of possibly making portable floating point operations, you at least need to use controlfp to ensure consistent FPU behavior, and you also need to make sure the compiler enforces ANSI conformance with respect to floating point operations. Why ANSI? Because it's a standard.
And even then you aren't guaranteeing that you can generate identical floating point behavior; that also depends on the CPU/FPU you are running on.
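A hedged sketch of what the controlfp part can look like on Windows/MSVC, using _controlfp_s from <float.h> (the precision-control mask only applies to x86 builds that use the x87 FPU; x64 builds use SSE and don't need it):

#include <float.h>   // _controlfp_s (MSVC-specific)

// Pin down the FPU state so intermediate results behave consistently.
void set_consistent_fpu_state() {
    unsigned int current = 0;
#if defined(_M_IX86)
    _controlfp_s(&current, _PC_53, _MCW_PC);    // 53-bit (double) intermediates on x87
#endif
    _controlfp_s(&current, _RC_NEAR, _MCW_RC);  // round to nearest
}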
It shouldn't be an issue, IEEE 754 already defines all details of the layout of floats.
The maximum and minimum values storable in a float are defined in float.h (FLT_MAX and friends).
Portable is one thing, generating consistent results on different platforms is another. Depending on what you are trying to do then writing portable code shouldn't be too difficult, but getting consistent results on ANY platform is practically impossible.
I believe "limits.h" will include the C library constants INT_MAX and its brethren. However, it is preferable to use "limits" and the classes it defines:
std::numeric_limits<float>, std::numeric_limits<double>, std::numberic_limits<int>, etc...
If you're assuming that you will get the same results on another system, read What could cause a deterministic process to generate floating point errors first. You might be surprised to learn that your floating point arithmetic isn't even the same across different runs on the very same machine!