When do you use fp:strict as opposed to fp:precise? Is it better to use the former if I want "more precise" calculations and to avoid rounding errors? What is the heuristic for choosing between them?
The IEEE 754 standard specifies how floating point calculations are performed and how floating point values are stored in memory.
Using fp:strict means that all the rules of IEEE 754 are respected. fp:strict is used to preserve bitwise compatibility between different compilers and platforms.
fp:precise relaxes some of those rules; however, it guarantees that the precision of the calculations will not be lost.
fp:fast allows compiler-specific optimizations and transformations of expressions containing floating point calculations. It is the fastest mode, but the results will differ between compilers and platforms.
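To see the kind of transformation fp:fast allows, here is a minimal C++ sketch (the spellings /fp:strict, /fp:precise and /fp:fast are the MSVC flag names; other compilers have analogous options). The two expressions are algebraically identical, but under IEEE 754 rounding they give different answers, which is exactly why a mode that is free to reassociate can change your results:

#include <cstdio>

int main() {
    double big = 1e16;    // the spacing between adjacent doubles here is 2.0
    double small = 1.0;

    // Evaluated as written: 1e16 + 1.0 rounds back to 1e16, so the result is 0.0.
    double as_written = (big + small) - big;

    // Algebraically identical, but the cancellation happens first: the result is 1.0.
    double reassociated = small + (big - big);

    // /fp:strict and /fp:precise preserve the written evaluation order;
    // /fp:fast permits the compiler to rewrite one form into the other.
    std::printf("as written:   %g\n", as_written);
    std::printf("reassociated: %g\n", reassociated);
}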
Related
What determines the representation of floating point numbers in memory? Is it the compiler or the FPU?
If the representation depends on the FPU, how does the compiler store constants such as 1.337f in a binary file? Are the floating point values perhaps unpacked when the application starts?
I have long been interested in this question because I do network programming.
The C and C++ standards do not require any particular floating point representations, although recent standards have included some specific support for IEEE (i.e. facilities that are available IF the implementation uses IEEE floating point). For particular implementations (aka toolchains) the representation of floating point depends on the host system and, to some extent, on decisions by compiler vendors.
For older microprocessors (and other processing hardware such as microcontrollers, Digital Signal Processors [DSPs], etc), the implementation is often in hardware - for example, a set of specialised electronic circuits that implement registers that represent floating point values, and circuits which perform operations on such registers.
In modern processing hardware (microcontrollers, DSPs, graphics processing units, floating point units, etc) the implementation is in microcode - over-simplistically, a layer of hardware instructions that implement machine code instructions and a state machine (a basis for how the processor appears to work, as far as programs and operating systems are concerned). So higher level instruction sets (x86, etc) are used by executables and operating systems, and microcode is the intermediary between the operating system and the hardware (which often implements a very simple set of instructions). The term "modern" in this description is relative - the first microcode-based processors date from the 1970s.
Historically, processing hardware has implemented floating point in a wide variety of ways - some proprietary, and some standardised. A number of processors have supported multiple distinct representations. In some cases, software layers have emulated floating point on top of hardware that does not support floating point at all. Most compilers will use hardware-supplied floating point if available (and some compilers have options to select different floating point representations, reflecting their target platforms), but compilers targeting hardware with no floating point support literally emulate the representations and operations in software.
The IEEE floating point specification (first version released in 1985, most recent version IEEE 754-2008), which has been adopted as the international standard ISO/IEC/IEEE 60559:2011, defines a number of things, including arithmetic formats (how values, infinities, NaNs, etc are represented in floating point variables), interchange formats (encodings for exchange of floating point values between systems), operations (for arithmetic, etc), rounding rules during operations, and exception handling (dealing with things like division by zero). The IEEE specification has evolved over time, and is becoming increasingly common in modern hardware and software.
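Since the question mentions network programming, the interchange-format idea can be sketched: if both ends use IEEE 754 binary64 for double, you can exchange the raw bit pattern in an agreed byte order. This is only a minimal C++ sketch under that assumption, not a complete wire protocol, and pack_double_be / unpack_double_be are made-up helper names:

#include <cstdint>
#include <cstdio>
#include <cstring>

// Pack a double into 8 bytes, most significant byte first, assuming the host
// uses IEEE 754 binary64 for double.
void pack_double_be(double value, unsigned char out[8]) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
    }
}

double unpack_double_be(const unsigned char in[8]) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | in[i];
    }
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

int main() {
    unsigned char wire[8];
    pack_double_be(1.337, wire);                 // bytes you would send
    std::printf("round trip: %.17g\n", unpack_double_be(wire));
}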
The representation is determined by the FPU, which adheres to a standard that the compiler supports.
The current standard is the IEEE 754. It describes how floating-point computation and data should be represented (see this article for a detailed description).
The data is always represented by a fixed number of bits, such as 32-bit, 64-bit, 128-bit, or 80-bit (a.k.a. x86 extended precision). In memory, they are all just bits. But each group of bits represents a component of the floating-point data: one bit (the most significant, though where it sits in memory depends on the endianness) is the sign, one set of bits is the exponent, and another is the significand.
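As a small illustration of that layout, here is a hedged C++ sketch that pulls the sign, exponent and significand fields out of a 32-bit float. It assumes the common case of an IEEE 754 binary32 float and uses memcpy to avoid aliasing problems:

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    float value = 0.15625f;

    // Copy the object representation into an integer of the same size.
    std::uint32_t bits;
    static_assert(sizeof(bits) == sizeof(value), "expects a 32-bit float");
    std::memcpy(&bits, &value, sizeof(bits));

    // IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 significand bits.
    unsigned long sign        = bits >> 31;
    unsigned long exponent    = (bits >> 23) & 0xFF;    // biased by 127
    unsigned long significand = bits & 0x7FFFFFUL;      // fraction, implicit leading 1

    std::printf("bits: 0x%08lX  sign: %lu  exponent: %lu  significand: 0x%06lX\n",
                static_cast<unsigned long>(bits), sign, exponent, significand);
}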
The compiler's support for the standard (IEEE 754) then generates code specific to that representation.
So, user2079303's answer is right: it is the compiler, targeting the standard, that determines the representation your code uses; however, it wouldn't work if the standard were not in charge.
EDIT: Peter's answer is quite detailed and covers many other cases.
What determines the representation of floating point numbers in the memory
The compiler determines which representation it uses. But if it targets an FPU, then it must use the representation used by the FPU.
how the compiler stores constants such as 1.337f in a binary file?
Typically, in the same binary representation as it uses in memory.
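A quick way to convince yourself of this is to print the bit pattern the compiler produced for the constant; you would then expect to find the same bytes (modulo endianness) in the object file with a tool such as objdump or xxd. A minimal sketch, assuming an IEEE 754 binary32 float:

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    // The literal 1.337f is converted to its binary32 encoding at compile time
    // and emitted into the binary; at run time we just reinterpret those bits.
    float constant = 1.337f;

    std::uint32_t bits;
    std::memcpy(&bits, &constant, sizeof(bits));

    std::printf("1.337f is stored as 0x%08lX\n", static_cast<unsigned long>(bits));
}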
Different platforms have varying FP capabilities with varying parameters and behaviors; as a result, there is a degree of variance between the calculation results they produce, which cascades and amplifies with each intermediate step.
I am in a situation where it is critical for (+-*/ only) calculations to produce identical results on each and every different target platform, using different compiler vendors, so I wonder if there is a standard way to do that. I am not asking about arbitrary high precision floating point numbers but standard 64 bit IEEE double, and a performance hit is expected and tolerable.
Even if you have a 64 bit IEEE754 double, there are a few extra things you need to check.
1. Make sure you have strict floating point. Don't allow your compiler to use, for example, 80 bits for intermediate calculations.
2. Various operations (all the arithmetic operations such as the ones you mention, std::sqrt, etc.) are required by IEEE754 to return the best number possible. (Should you need others, make sure that all your operations are mentioned in the IEEE754 standard and that your platform obeys it faithfully - it might not.)
3. Shy away from other functions (such as trigonometric functions), for which there is no guarantee of precision, even under IEEE754.
In your specific case it appears that (1) is sufficient, along with perhaps (for C++)
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 floating point required");
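Expanding on point (1), here is a hedged sketch of the compile-time checks you might add. FLT_EVAL_METHOD from <cfloat> reports whether intermediate results are evaluated at a wider precision (for example, 2 means long double, as on an x87 FPU); on mainstream compilers it expands to a constant usable in static_assert:

#include <cfloat>
#include <limits>

// IEEE 754 (IEC 60559) layout and semantics for double.
static_assert(std::numeric_limits<double>::is_iec559,
              "IEEE 754 floating point required");

// 0 means float and double expressions are evaluated at their own precision,
// with no hidden extended-precision intermediates.
static_assert(FLT_EVAL_METHOD == 0,
              "intermediate results must not use extended precision");

int main() {}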
I am trying to implement a program with floating point numbers, using two or more programming languages. The program does, say, 50k iterations to finally bring the error down to a very small value.
To ensure that my results are comparable, I want to make sure I use data types of the same precision in the different languages. Could you please tell me whether there is a correspondence between float/double in C/C++ and the equivalent types in D and Go? I expect C/C++ and D to be quite close in this regard, but I am not sure. Thanks a lot.
Generally, for compiled languages, floating point format and precision come down to two things:
The library used to implement the floating point functions that aren't directly supported in hardware.
The hardware the system is running on.
It may also depend on what compiler options you give (and how sophisticated the compiler is in general) - many modern processors have vector instructions, and the result may be subtly different than if you use "regular" floating point instructions (e.g. FPU vs. SSE on x86 processors). You may also see differences, sometimes, because the internal calculations on an x86 FPU are 80 bits wide, and are stored as 64 bits when the computation is completed.
But generally, given the same hardware, and similar type of compilers, I'd expect to get the same result [and roughly the same performance] from two different [sufficiently similar] languages.
Most languages have either only "double" (typically 64-bit) or "single and double" (e.g. float - typically 32-bit and double - typically 64-bit in C/C++ - and probably D as well, but I'm not that into D).
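If you want to verify what a particular toolchain actually gives you before comparing results across languages, a small C++ sketch like the one below prints the properties that matter; D and Go expose similar information through their own type properties and specifications, though the spellings differ:

#include <cstdio>
#include <limits>

int main() {
    using fl = std::numeric_limits<float>;
    using db = std::numeric_limits<double>;

    // Size, significand width and IEEE 754 (IEC 559) conformance for each type.
    std::printf("float : %zu bytes, %d significand bits, IEC 559: %s\n",
                sizeof(float), fl::digits, fl::is_iec559 ? "yes" : "no");
    std::printf("double: %zu bytes, %d significand bits, IEC 559: %s\n",
                sizeof(double), db::digits, db::is_iec559 ? "yes" : "no");
}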
In Go, floating point types follow the IEEE-754 standard.
Straight from the spec (http://golang.org/ref/spec#Numeric_types)
float32 the set of all IEEE-754 32-bit floating-point numbers
float64 the set of all IEEE-754 64-bit floating-point numbers
I'm not familiar with D, but this page might be of interest: http://dlang.org/float.html.
For C/C++, the standard doesn't require IEEE-754, but in C++ you can check std::numeric_limits<double>::is_iec559 to see whether your compiler is using IEEE-754. See this question: How to check if C++ compiler uses IEEE 754 floating point standard
According to the following site:
http://en.cppreference.com/w/cpp/language/types
"double - double precision floating point type. Usually IEEE-754 64 bit floating point type".
It says "usually". What other possible formats/standard could C++ double use? What compiler uses an alternative to the IEEE format? Or architecture?
VAXen, Crays, and IBM mainframes, to name just a few that are still in reasonably wide use. Most (all?) of those can also do IEEE floating point now, but sometimes only with a special add-on. In other cases (IBM), IEEE arithmetic can carry a significant speed penalty.
As for older machines, most mainframes (Unisys, Control Data, etc.) used unique floating point formats, most of which weren't even much like IEEE, not to mention actually conforming.
For a short history lesson, you can check out the Intel Floating Point Case Study.
Intel compilers have an option that is on by default when optimizing that enables a so-called fast-math feature. This makes the math much faster but drops strict compliance with IEEE standards. One can enforce strict standard compliance with the fp-model option.
I believe the CUDA language for NVIDIA GPUs also has a significantly faster math library if one is willing to give up strict compliance with the IEEE standard. This not only makes the math faster, but it reduces the number of registers used, for transcendental functions in particular.
Whether compliance is needed varies on a case-by-case basis. We've experienced problems with the Intel optimizations and have had to turn on the fp-model strict option to ensure correct results with double precision math.
It seems most computers today use IEEE-754, but alternatives have been available in the past. Formats like excess-128 and packed BCD have been used before (http://aplawrence.com/Basics/floatingpoint.html). The Wikipedia entry also lists a few: http://en.wikipedia.org/wiki/Floating_point
It is probably worth adding, in answer to "What other possible formats/standard could C++ double use?", that gcc for Atmel AVR (8-bit CPUs used in some Arduinos) does not implement double as 64 bits.
See the GCC wiki's avr-gcc page, specifically the 'double' subsection of 'Deviations from the Standard', where it says:
double is only 32 bits wide and implemented in the same way as float
I believe other CPUs have similar implementations, but I couldn't find them.
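If your code silently assumes a 64-bit double, a compile-time guard along these lines (just a sketch) will catch targets such as classic avr-gcc where that assumption breaks:

#include <limits>

// Fail the build rather than silently computing with a 32-bit "double".
static_assert(sizeof(double) * 8 >= 64,
              "this code assumes double is at least 64 bits wide");
static_assert(std::numeric_limits<double>::digits >= 53,
              "this code assumes a binary64-style 53-bit significand");

int main() {}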