Floating point precision in Visual C++ - c++

Hi,
I am trying to use the robust predicates for computational geometry from Jonathan Richard Shewchuk.
I am not a programmer, so I am not even sure of what I am saying; I may be making some basic mistake.
The point is that the predicates should allow for precise arithmetic with adaptive floating-point precision. On my computer, an Asus Pro31/S (Core Duo Centrino processor), they do not work. The problem may lie in the fact that my computer uses some floating-point precision feature that conflicts with the assumptions made by Shewchuk.
The author says:
/* On some machines, the exact arithmetic routines might be defeated by the */
/* use of internal extended precision floating-point registers. Sometimes */
/* this problem can be fixed by defining certain values to be volatile, */
/* thus forcing them to be stored to memory and rounded off. This isn't */
/* a great solution, though, as it slows the arithmetic down. */
Now what I would like to know is whether there is a way, maybe some compiler option, to turn off the internal extended-precision floating-point registers.
I really appreciate your help.

The compiler option you want for Visual Studio is /fp:strict, which is exposed in the IDE as Project->Properties->C/C++->Code Generation->Floating Point Model.

Yes, you'll have to change the FPU control word to avoid this. It is explained well for most popular compilers on this web page. Beware that this is dramatically incompatible with what most libraries expect the FPU to do, so don't mix and match. Always restore the FPU control word after you're done.

_control87(_PC_53, _MCW_PC) or _control87(_PC_24, _MCW_PC) will do the trick. Those set the precision to double and single respectively with MSVC. You might want to use _controlfp_s(...), as that allows you to retrieve the current control word explicitly after setting it.
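A minimal sketch of that approach (MSVC-specific; the function name and the placement of the predicate calls are illustrative):

#include <float.h>   // _controlfp_s, _PC_53, _MCW_PC (MSVC CRT)

void call_predicates_with_53bit_rounding()
{
    unsigned int previous = 0, unused = 0;
    _controlfp_s(&previous, 0, 0);             // read the current control word without changing it
    _controlfp_s(&unused, _PC_53, _MCW_PC);    // round x87 results to 53-bit doubles

    // ... initialize and call the predicates here ...

    _controlfp_s(&unused, previous, _MCW_PC);  // restore the previous precision bits
}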

As others have noted, you can deal with this by setting the x87 control word to limit floating point precision. However, a better way would be to get MSVC to generate SSE/SSE2 code for the floating-point operations; I'm surprised that it doesn't do that by default in this day and age, given the performance advantages (and the fact that it prevents one from running into annoying bugs like what you're seeing), but there's no accounting for MSVC's idiosyncrasies.
Ranting about MSVC aside, I believe that the /arch:SSE2 flag will cause MSVC to use SSE and SSE2 instructions for single- and double-precision arithmetic, which should resolve the issue.
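For a 32-bit build the command line might look something like this (file names are placeholders; /arch:SSE2 only applies to x86 targets, since the x64 compiler uses SSE2 by default):

cl /O2 /arch:SSE2 predicates.c main.c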

If you're using GCC, the SO answer here might help:
GCC problem with raw double type comparisons
If you're using another compiler, you might be able to find some clues in that example (or maybe post a comment on that answer to see if Mike Dinsdale might know).

Related

Rounding differences on Windows vs Unix based system in sprintf

I have a problem on UNIX-based systems: sprintf does not round a value up properly.
For example
double tmp = 88888888888885.875;
char out[512];
That's 88,888,888,888,885.875, just to be easier on the eyes.
I am giving such a specific and big example because it seems to work fine on smaller numbers.
I am trying to use it in following way
sprintf(out, "%021.2f", tmp);
printf("out = %s\n", tmp);
On Windows this results in:
out = 000088888888888885.88
On AIX, for example (but it shows up on Linux as well):
out = 000088888888888885.87
Why is this happening?
Any ideas on how to make it behave the same way on Win/Unix?
Thanks
There is a bug report for glibc with a problem very similar to yours. The main conclusion (in comment 46) here is that a double is not a 15-decimal-digit number and you should not expect it to work like that.
As a workaround you can add something small to your numbers to make them round better. But this solution is not general, because it depends on the ranges of the numbers you deal with.
Another workaround is to multiply the numbers to prepare them for rounding, then round (e.g. 2597.625*100 = 259762.5 -> 259763 = 2597.63*100).
However, I think there must be smarter workarounds.
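A minimal sketch of that scale-then-round workaround (the helper name and the two-decimal scale are illustrative; note that the scaled value itself may not always be exactly representable, which is why this is not a general fix):

#include <cmath>
#include <cstdio>

void print_rounded_2dp(double value)
{
    double scaled = std::round(value * 100.0);   // e.g. 2597.625 -> 259762.5 -> 259763
    std::printf("%.2f\n", scaled / 100.0);       // prints 2597.63
}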
What floating point representations are used by your processor and your compiler?
Not all processors use the same way of representing floating-point values, and even compilers may choose different floating-point representation methods (I think the Microsoft C++ compiler even has options to choose the representation).
The page http://www.quadibloc.com/comp/cp0201.htm gives an overview of some of the floating-point representations (although they seem to be rather old architectures shown there).
http://msdn.microsoft.com/en-us/library/0b34tf65.aspx describes how Microsoft Visual C++ stores floating-point values. I couldn't find immediately what representation is used by AIX or Linux.
Additionally, every compiler has options that let you indicate how you want it to handle floating-point operations. Do you want them to be as correct as possible (but possibly somewhat slower)? Or do you want floating-point operations to be as fast as possible (but possibly less correct)?
That's because you're using double, which has accuracy limitations, meaning your 88888888888885.875 is probably being rounded to something else internally.
See more info in a similar question, in blogs or in wikipedia.
On an IEEE 754 conformant implementation, it should print 88888888888885.88 in the default rounding mode. This has nothing to do with floating point precision since the value is exactly representable; it's simply a matter of printf's rounding to 2 places after the decimal point. No idea why you're seeing 88888888888885.87 on some systems.
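A quick way to check this on any given machine (assuming IEEE-754 doubles) is to print the stored value with more digits and compare it against the formatted output:

#include <cstdio>

int main()
{
    double tmp = 88888888888885.875;
    std::printf("%.6f\n", tmp);      // 88888888888885.875000: the value is stored exactly
    std::printf("%021.2f\n", tmp);   // the formatting from the question
    return 0;
}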

Are there any modern platforms with non-IEEE C/C++ float formats?

I am writing a video game, Humm and Strumm, which requires a network component in its game engine. I can deal with differences in endianness easily, but I have hit a wall in attempting to deal with possible float memory formats. I know that modern computers all have a standard integer format, but I have heard that they may not all use the IEEE standard for floating-point numbers. Is this true?
While certainly I could just output it as a character string into each packet, I would still have to convert to a "well-known format" of each client, regardless of the platform. The standard printf() and atod() would be inadequate.
Please note, because this game is a Free/Open Source Software program that will run on GNU/Linux, *BSD, and Microsoft Windows, I cannot use any proprietary solutions, nor any single-platform solutions.
Cheers,
Patrick
If you properly abstract your network interface, you can have functions/objects that serialize and deserialize the float datatypes. On every system I can think of, these are the IEEE standard, so you'd just have them pass the data through unchanged (the compiler will probably even optimize it out, so you don't lose any performance). If you do encounter some system with a different format, you can conditionally-compile in some code in these functions to do bit hacks to convert from the IEEE standard to the native format. You'll only need to change it in one place. You probably will never need to do so, however, unless you get into consoles/handhelds/etc.
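A minimal sketch of such a serialization pair, assuming IEEE-754 floats on both ends (function names are illustrative; the endian swap is left to the caller via the usual htonl()/ntohl() integer routines):

#include <cstdint>
#include <cstring>

uint32_t serialize_float(float value)
{
    uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);   // copy the IEEE-754 bit pattern unchanged
    return bits;                               // caller applies htonl() before sending
}

float deserialize_float(uint32_t bits)
{
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);  // caller applies ntohl() after receiving
    return value;
}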
I think it is safe to assume that each platform has an implementation of the IEEE-754 spec that you can rely on. However, even if they all implement the same spec, there is no guarantee that each platform has the exact same implementation, has the same FP control flags set, does the same optimizations, or implements the same non-standard extensions. This makes floating-point determinism very hard to control and somewhat unreliable to use for this kind of thing (where you'll be communicating FP values over the network).
For more information on that; read http://gafferongames.com/networking-for-game-programmers/floating-point-determinism/
Another problem to tackle is handling clients that don't have a floating-point unit; most of the time these will be low-end CPUs, consoles or embedded devices. Make sure to take this into account if you want to target them. FP emulation can be done, but it tends to be very slow on these devices, so you'll have to get the hang of doing fixed-point calculations. Be advised, though: writing elaborate classes that abstract floating-point and fixed-point calculations behind the same code sounds like a plan, but on most devices it isn't a good one. It doesn't allow you to squeeze out the maximum precision and performance when dealing with fixed-point values.
Yet another problem is handling the endianness of the floating-point values, because you cannot just swap the bytes and stick them back in a floating-point register (the bytes might get a different meaning; see http://www.dmh2000.com/cpp/dswap.shtml on that).
My advice would be to convert the floats to fixed-point intermediate values, do an endian correction if needed, and transmit that. Also, don't assume that two floating-point calculations on different machines will yield the same results; they don't. However, floating-point implementations other than IEEE-754 are rare. For example, GPUs tended to use fixed point, but these days they are more likely to support a subset of IEEE-754 (because they don't want to deal with division-by-zero exceptions), and they have extensions for half-floats that fit in 16 bits.
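As a rough illustration of that fixed-point intermediate (the 16.16 scale is an arbitrary choice for this sketch, not something from the answer above):

#include <cstdint>

int32_t to_fixed_16_16(float value)   { return static_cast<int32_t>(value * 65536.0f); }
float   from_fixed_16_16(int32_t raw) { return static_cast<float>(raw) / 65536.0f; }
// The int32_t payload can then be byte-swapped and transmitted like any other integer.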
Also, realize that there are libraries out there that have already solved this problem (sending low-level data formats in a gaming context) for you. One such library is RakNet: specifically, its BitStream class is designed to send these kinds of data reliably to different platforms while keeping the overhead to a minimum. For example, RakNet goes through quite some trouble not to waste any bandwidth on sending strings or vectors.
Some embedded processors do not include any floating-point hardware at all. For desktop computers, I do not see any reason to worry too much, apart from details that only really annoy specialists (sqrt being incorrectly rounded on the Alpha, that kind of thing; the differences that annoy them are in the implementation of the operations, not in the format, anyway).
One variation between platforms is related to the handling of denormals. I asked a question about those a while back. Even that was not as bad as I expected.

What claims, if any, can be made about the accuracy/precision of floating-point calculations?

I'm working on an application that does a lot of floating-point calculations. We use VC++ on Intel x86 with double-precision floating-point values. We make claims that our calculations are accurate to n decimal digits (right now 7, but we are trying to claim 15).
We go to a lot of effort to validate our results against other sources when our results change slightly (due to code refactoring, cleanup, etc.). I know that many, many factors play into the overall precision, such as the FPU control state, the compiler/optimizer, the floating-point model, and the overall order of operations (i.e., the algorithm itself), but given the inherent uncertainty in FP calculations (e.g., 0.1 cannot be represented exactly), it seems invalid to claim any specific degree of precision for all calculations.
My question is this: is it valid to make any claims about the accuracy of FP calculations in general without doing any sort of analysis (such as interval analysis)? If so, what claims can be made and why?
EDIT:
So given that the input data is accurate to, say, n decimal places, can any guarantee be made about the result of arbitrary calculations, given that double precision is being used? E.g., if the input data has 8 significant decimal digits, will the output have at least 5 significant decimal digits?
We are using math libraries and are unaware of any guarantees they may or may not make. The algorithms we use are not necessarily analyzed for precision in any way. But even given a specific algorithm, the implementation will affect the results (just changing the order of two addition operations, for example). Is there any inherent guarantee whatsoever when using, say, double precision?
ANOTHER EDIT:
We do empirically validate our results against other sources. So are we just getting lucky when we achieve, say, 10-digit accuracy?
As with all such questions, I have to just simply answer with the article What Every Computer Scientist Should Know About Floating-Point Arithmetic. It's absolutely indispensable for the type of work you are talking about.
Short answer: No.
Reason: Have you proved (yes proved) that you aren't losing any precision as you go along? Are you sure? Do you understand the intrinsic precision of any library functions you're using for transcendental functions? Have you computed the limits of additive errors? If you are using an iterative algorithm, do you know how well it has converged when you quit? This stuff is hard.
Unless your code uses only the basic operations specified in IEEE 754 (+, -, *, / and square root), you do not even know how much precision loss each call to a library function outside your control (trigonometric functions, exp/log, ...) introduces. Functions outside the basic five are not guaranteed to be, and usually are not, accurate to 1 ULP.
You can do empirical checks, but that's what they remain... empirical. Don't forget the part about there being no warranty in the EULA of your software!
If your software was safety-critical, and did not call library-implemented mathematical functions, you could consider http://www-list.cea.fr/labos/gb/LSL/fluctuat/index.html . But only critical software is worth the effort and has a chance to fit in the analysis constraints of this tool.
You seem, after your edit, mostly concerned about your compiler doing things behind your back. It is a natural fear to have (because, as with the mathematical functions, you are not in control). But it's rather unlikely to be the problem. Your compiler may compute with a higher precision than you asked for (80-bit extended when you asked for 64-bit doubles, or 64-bit doubles when you asked for 32-bit floats). This is allowed by the C99 standard. In round-to-nearest, this may introduce double-rounding errors. But it's only 1 ULP you are losing, and so infrequently that you needn't worry. This can cause puzzling behaviors, as in:
float x=1.0;
float y=7.0;
float z=x/y;
if (z == x/y)
...
else
... /* the else branch is taken */
but you were looking for trouble when you used == between floating-point numbers.
When you have code that does cancellations on purpose, such as in Kahan's summation algorithm:
d = (a+b)-a-b;
and the compiler optimizes that into d = 0;, you have a problem. And yes, this optimization "as if floating-point operations were associative" has been seen in real compilers. It is not allowed by C99. But the situation has gotten better, I think: compiler authors have become more aware of the dangers of floating-point and no longer try to optimize so aggressively. Plus, if you were doing this in your code, you would not be asking this question.
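For context, here is a sketch of Kahan's compensated summation, the kind of deliberate cancellation being discussed (the volatile qualifiers are one common way of discouraging the "associative" optimization; whether they are needed depends on your compiler and flags):

#include <cstddef>

double kahan_sum(const double* values, std::size_t count)
{
    double sum = 0.0;
    double compensation = 0.0;                 // running estimate of the lost low-order bits
    for (std::size_t i = 0; i < count; ++i) {
        double y = values[i] - compensation;
        volatile double t = sum + y;           // volatile discourages re-association
        volatile double lost = (t - sum) - y;  // recovers what was rounded away
        compensation = lost;
        sum = t;
    }
    return sum;
}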
Given that your vendors of machines, compilers, run-time libraries, and operating systems don't make any such claims about floating-point accuracy, you should take that as a warning that your group should be leery of making claims that could come under harsh scrutiny if clients ever took you to court.
Without doing formal verification of the entire system, I would avoid such claims. I work on scientific software that has indirect human-safety implications, so we have considered such things in the past, and we do not make this sort of claim.
You could make vague claims about the precision of double-precision floating-point calculations, but they would be basically worthless.
Ref: The pitfalls of verifying floating-point computations from ACM Transactions on Programming Languages and Systems 30, 3 (2008) 12
No, you cannot make any such claim. If you wanted to do so, you would need to do the following:
Hire an expert in numerical computing to analyze your algorithms.
Either get your library and compiler vendors to open their sources to said expert for analysis, or get them to sign off on hard semantics and error bounds.
Double-precision floating-point typically carries about 15 digits of decimal accuracy, but there are far too many ways for some or all of that accuracy to be lost, that are far too subtle for a non-expert to diagnose, to make any claim like what you would like to claim.
There are relatively simple ways to keep running error bounds that would let you make accuracy claims about any specific computation, but making claims about the accuracy of all computations performed with your software is a much taller order.
A double precision number on an Intel CPU has slightly better than 15 significant digits (decimal).
The potential error for a simple computation is in the ballpark of n/1.0e15, where n is the order of magnitude of the number(s) you are working with. I suspect that Intel has specs for the accuracy of CPU-based FP computations.
The potential error for library functions (like cos and log) is usually documented. If not, you can look at the source code (e.g. the GNU source) and calculate it.
You would calculate error bars for your calculations just as you would for manual calculations.
Once you do that, you may be able to reduce the error by judicious ordering of the computations.
Since you seem to be concerned about accuracy of arbitrary calculations, here is an approach you can try: run your code with different rounding modes for floating-point calculations. If the results are pretty close to each other, you are probably okay. If the results are not close, you need to start worrying.
The maximum difference in the results will give you a lower bound on the accuracy of the calculations.
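One hedged way of setting up that experiment is to rerun the same calculation under each of the C99 rounding modes from <cfenv> and compare the spread (compute_result() below is just a stand-in for your own calculation; support for changing the mode varies by compiler and platform):

#include <cfenv>
#include <cstdio>

double compute_result()                    // stand-in for your own calculation
{
    double x = 1.0;
    for (int i = 0; i < 1000; ++i)
        x = x / 3.0 * 3.0 + 0.1;
    return x;
}

int main()
{
    const int modes[] = { FE_TONEAREST, FE_UPWARD, FE_DOWNWARD, FE_TOWARDZERO };
    for (int mode : modes) {
        std::fesetround(mode);
        std::printf("mode %d -> %.17g\n", mode, compute_result());
    }
    std::fesetround(FE_TONEAREST);         // restore the default mode
    return 0;
}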

Platform independent math library

Is there a publicly available library that will produce the exact same results for sin, cos, floor, ceil, exp and log on 32-bit and 64-bit Linux, Solaris and possibly other platforms?
I am considering the following alternatives:
a) cephes compiled with gcc -mfpmath=sse and the same optimization levels on each platform ... but it's not clear that this would work.
b) MPFR, but I am worried that this would be too slow.
Regarding precision (edited): For this particular application I don't really need something that produces the value that is numerically closest to the exact value. I just need the answers to be exactly the same on all platforms, OSes and "bitnesses". That being said, the values need to be reasonable (5 digits would probably be enough). I apologize for not having made this clear in my initial question.
I guess MAPM or MPFR with a low enough precision setting might do the trick, but I was hoping to find something that did not have the "multiple precision" machinery/flavor to it. In any case, I will try this out.
Would something like http://lipforge.ens-lyon.fr/www/crlibm/index.html be what you are searching for? It is a library whose aim is to be able to replace the standard math library of C99 -- so it keeps good enough performance in the normal cases -- while ensuring correctly rounded results according to the IEEE 754 rounding modes.
crlibm is the correct tool for this. An earlier poster linked to it. Because it is correctly rounded, it will deliver bit-identical results on all platforms with IEEE-754 compliant hardware if compiled properly. It is much, much faster than MPFR.
You shouldn't need one. floor and ceil will be exact since their computation is straightforward.
What you are concerned with is rounding of the last bit for transcendentals like sin, cos, and exp. But these are native to the CPU microcode and can be done consistently with high quality regardless of library. However, the rounding does vary from chip architecture to architecture.
So, if exact answers for the transcendentals are indeed your goal, you do need a portable library, and you will also be giving up huge efficiencies by doing so.
You could use a portable library like MAPM, which gives you not only consistent ULP results but, as a side benefit, lets you define arbitrary precision.
You can check your math precision with tools like this one and this one.
You mention using SSE. If you're planning on only running on x86 chips, then what exactly are the inconsistencies you're expecting?
As for MPFR, don't worry - test it! By the way, if it's good enough to be included in GCC, it's probably good enough for you.
You want to use MPFR. That library has been around for years and has been ported to every platform under the sun and optimized by tons of people.
If MPFR isn't sufficient for your needs, we're talking about fully custom ASM implementations, in which case it might be more efficient to consider implementing it in dedicated hardware.

Different math rounding behaviour between Linux, Mac OS X and Windows

Hi,
I developed some mixed C/C++ code with some intensive numerical calculations. When compiled on Linux and Mac OS X I get very similar results after the simulation ends. On Windows the program compiles as well, but I get very different results and sometimes the program does not seem to work.
I used GNU compilers on all systems. A friend recommended adding -frounding-math, and now the Windows version seems to work more stably, but the Linux and OS X results do not change at all.
Could you recommend any other options to get better agreement between the Windows and Linux/OS X versions?
Thanks
P.S. I also tried -O0 (no optimizations) and specified -m32.
I can't speak to the implementation on Windows, but Intel chips contain 80-bit floating-point registers and can give greater precision than that specified in the IEEE-754 double format. You can try calling this routine in the main() of your application (on Intel chip platforms):
#include <fpu_control.h>   // glibc header providing _FPU_GETCW/_FPU_SETCW and the _FPU_* masks

inline void fpu_round_to_IEEE_double()
{
    unsigned short cw = 0;
    _FPU_GETCW(cw);         // Get the FPU control word
    cw &= ~_FPU_EXTENDED;   // Mask out 80-bit (extended) register precision
    cw |= _FPU_DOUBLE;      // Mask in 64-bit (double) register precision
    _FPU_SETCW(cw);         // Set the FPU control word
}
I think this is distinct from the rounding modes discussed by @Alok.
There are four different rounding modes for floating-point numbers: round toward zero, round up, round down, and round to nearest. Depending upon the compiler/operating system, the default may be different on different systems. For programmatically changing the rounding mode, see fesetround. It is specified by the C99 standard, so it may be available to you.
You can also try -ffloat-store gcc option. This will try to prevent gcc from using 80-bit floating-point values in registers.
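For reference, illustrative command lines for the GCC options mentioned in this thread (the file name is a placeholder; -mfpmath=sse comes up in the library question above and requires SSE-capable hardware):

g++ -O2 -msse2 -mfpmath=sse sim.cpp -o sim    # do the arithmetic in SSE registers, avoiding 80-bit x87
g++ -O2 -ffloat-store sim.cpp -o sim          # or: spill intermediates to memory so they get rounded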
Also, if your results change depending upon the rounding method, and the differences are significant, it means that your calculations may not be stable. Please consider doing interval analysis, or using some other method to find the problem. For more information, see How Futile are Mindless Assessments of Roundoff in Floating-Point Computation? (pdf) and The pitfalls of verifying floating-point computations (ACM link, but you can get PDF from a lot of places if that doesn't work for you).
In addition to the runtime rounding settings that people mentioned, you can control the Visual Studio compiler settings in Properties > C++ > Code Generation > Floating Point Model. I've seen cases where setting this to "Fast" may cause some bad numerical behavior (e.g. iterative methods fail to converge).
The settings are explained here:
http://msdn.microsoft.com/en-us/library/e7s85ffb%28VS.80%29.aspx
The IEEE and C/C++ standards leave some aspects of floating-point math unspecified. Yes, the precise result of adding two floats is determined, but any more complicated calculation is not. For instance, if you add three floats then the compiler can do the evaluation at float precision, double precision, or higher. Similarly, if you add three doubles then the compiler may do the evaluation at double precision or higher.
VC++ defaults to setting the x87 FPU's precision to double. I believe that gcc leaves it at 80-bit precision. Neither is clearly better, but they can easily give different results, especially if there is any instability in your calculations. In particular, 'tiny + large - large' may give very different results if you have extra bits of precision (or if the order of evaluation changes). The implications of varying intermediate precision are discussed here:
http://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/
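A small demonstration of that 'tiny + large - large' sensitivity (the values are chosen purely for illustration): with pure float intermediates the 1.0f is absorbed and f ends up 0, while with double or 80-bit intermediates it survives and f ends up 1; the explicit double version yields 1 either way.

#include <cstdio>

int main()
{
    float tiny  = 1.0f;
    float large = 1.0e10f;                     // exactly representable as a float
    float  f = tiny + large - large;           // 0 or 1, depending on intermediate precision
    double d = (double)tiny + large - large;   // 1: double precision is enough to keep the 1.0
    std::printf("f = %g, d = %g\n", f, d);
    return 0;
}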
The challenges of deterministic floating-point are discussed here:
http://randomascii.wordpress.com/2013/07/16/floating-point-determinism/
Floating-point math is tricky. You need to find out when your calculations diverge and examine the generated code to understand why. Only then can you decide what actions to take.