What determines the representation of floating point numbers in memory: the compiler or the FPU?
If the representation depends on the FPU, how does the compiler store constants such as 1.337f in a binary file? Is there perhaps some unpacking of the floating point values when the application starts?
I have long been interested in this question because I do network programming.
The C and C++ standards do not require any particular floating point representations, although recent standards have included some specific support for IEEE (i.e. facilities that are available IF the implementation uses IEEE floating point). For particular implementations (aka toolchains) the representation of floating point depends on the host system and, to some extent, on decisions by compiler vendors.
For older microprocessors (and other processing hardware such as microcontrollers, Digital Signal Processors [DSPs], etc.), the implementation is often in hardware - for example, a set of specialised electronic circuits that implement registers representing floating point values, and circuits which perform operations on such registers.
In modern processing hardware (microcontrollers, DSPs, graphics processing units, floating point units, etc.) the implementation is in microcode - over-simplistically, a layer of hardware instructions that implement machine code instructions and a state machine (a basis for how the processor appears to work, as far as programs and operating systems are concerned). So higher-level instruction sets (x86, etc.) are used by executables and operating systems, and microcode is the intermediary between the operating system and the hardware (which often implements a very simple set of instructions). The term "modern" in this description is relative - the first microcode-based processors date from the 1970s.
Historically, processing hardware has implemented floating point in a wide number of ways - some proprietary, and some standardised. There are a number of processors which supported multiple distinct representations. In some cases, software layers have emulated floating point on top of hardware that does not support floating point at all. Most compilers will use hardware-supplied floating point if available (and some compilers have options to select different floating point representations, reflecting their target platforms), but a number of compilers targeting hardware with no floating point support literally emulate the representations and operations in software.
The IEEE floating point specification (first version released in 1985, most recent version IEEE 754-2008), which has been adopted as an international standard ISO/IEC/IEEE 60559:2011, defines a bunch of things, including arithmetic formats (how values, infinities, NaNs, etc. are represented in floating point variables), interchange formats (encodings for exchange of floating point values between systems), operations (for arithmetic, etc.), rounding rules during operations, and exception handling (dealing with things like division by zero). The IEEE specification has evolved over some time, and is becoming increasingly common in modern hardware and software.
The representation is determined by the FPU, which adheres to a standard that the compiler supports.
The current standard is the IEEE 754. It describes how floating-point computation and data should be represented (see this article for a detailed description).
The data is always represented by a fixed number of bits, such as 32-bit, 64-bit, 128-bit, or 80-bit (a.k.a. x86 extended precision). In memory they are all just bits. But each group of bits represents a component of the floating-point value: the most significant bit (depending on the endianness) is the sign, another set of bits is the exponent, and another the significand.
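As a rough illustration, here is a minimal C++ sketch (assuming float is a 32-bit IEEE 754 binary32 value with the usual 1/8/23 sign/exponent/fraction layout) that pulls those three fields out of a float by copying its bits into an integer:

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    float f = 1.337f;

    // Copy the raw bits of the float into a 32-bit unsigned integer.
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);

    // IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
    std::uint32_t sign     = bits >> 31;
    std::uint32_t exponent = (bits >> 23) & 0xFF;
    std::uint32_t fraction = bits & 0x7FFFFF;

    std::printf("bits=0x%08X sign=%u exponent=%u fraction=0x%06X\n",
                bits, sign, exponent, fraction);
}
```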
Then the compiler, supporting the standard (IEEE 754), generates code specific to that representation.
So, user2079303's answer is right: what determines the representation used by your code is the compiler, which targets the standard; but it wouldn't work if the hardware didn't follow that same standard.
EDIT: Peter's answer is quite detailed and covers many other cases.
What determines the representation of floating point numbers in memory
The compiler determines which representation it uses. But if it targets an FPU, it must use the representation used by that FPU.
how does the compiler store constants such as 1.337f in a binary file?
Typically, in the same binary representation as it uses in memory.
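To make that concrete, here is a minimal C++20 sketch (assuming std::bit_cast is available and float is IEEE binary32) showing that the literal 1.337f is already reduced to a fixed bit pattern at compile time, which is what ends up in the binary:

```cpp
#include <bit>
#include <cstdint>
#include <cstdio>

// The conversion from the decimal literal to its binary32 bit pattern
// happens at compile time; the object file simply contains these bits.
constexpr float value = 1.337f;
constexpr std::uint32_t stored_bits = std::bit_cast<std::uint32_t>(value);

int main() {
    std::printf("1.337f is stored as 0x%08X\n", stored_bits);
}
```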
The standard says that interchange formats support the exchange of floating-point data between implementations. What is meant by 'implementations' here? And how might that exchange occur?
The standard, like many programming standards, specifies a set of rules. Anything that follows those rules is an implementation of the standard. Per the IEEE 754-2008 abstract, “An implementation of a floating-point system conforming to this standard may be realized entirely in software, entirely in hardware, or in any combination of software and hardware.”
For example, one could, in theory, design a computer processor that natively supports floating-point formats and that has one instruction for each IEEE 754 operation. Or you could write a C compiler that targets a processor which has some floating-point operations, like addition and multiplication, but uses software routines for square root and other operations. Or you could write software that implements all the IEEE 754 operations using purely C integer operations, without any reliance on any specific hardware. Each of these could be an IEEE 754 implementation.
The floating-point behaviors are specified largely in terms of the mathematical values represented. For arithmetic formats, the standard specifies what numbers are represented, what the results of performing arithmetic operations on them are, and so on. Interchange formats go further than this; for interchange formats, the standard specifies precisely what bit patterns represent which values.
Because of this, if one IEEE 754 implementation puts a value in a floating-point object and then transmits the bits to another implementation, possibly running on completely different hardware, and that destination implementation restores those bits to a floating-point object of the same interchange format, then the destination object will have the same value and behavior as the original source object. In other words, the interchange formats make data portable.
This transmission can occur over a network, by storage of the bits onto a disk that is then physically moved to another computer, by printing the bits (or a representation of them, such as hexadecimal) on paper that is then typed in by a human, or other means.
The standard specifies only the bit pattern in a sequence of bits from most significant to least significant. It does not specify how those bits are grouped into bytes or how those bytes are ordered, so sending and receiving systems must be sure to send and receive the bits in the correct order.
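As a hedged sketch of what such an exchange might look like in practice (assuming both sides use IEEE 754 binary64 for double and agree to send the 8 bytes most-significant-byte first), here is one way to serialize and deserialize a double in C++:

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Pack a double into 8 bytes, most significant byte first.
// Assumes double is IEEE 754 binary64 on both ends.
std::array<unsigned char, 8> pack_double(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    std::array<unsigned char, 8> out{};
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>(bits >> (56 - 8 * i));
    }
    return out;
}

// Rebuild the double from 8 big-endian bytes.
double unpack_double(const std::array<unsigned char, 8>& in) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | in[i];
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Because the byte order is fixed by the agreed format rather than by the host, the same code works on both little-endian and big-endian machines.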
I am trying to implement a program with floating point numbers, using two or more programming languages. The program does, say, 50k iterations to finally bring the error down to a very small value.
To ensure that my results are comparable, I wanted to make sure I use data types of the same precision in the different languages. Could you please tell me whether there is a correspondence between float/double in C/C++ and the types in D and Go? I expect C/C++ and D to be quite close in this regard, but I'm not sure. Thanks a lot.
Generally, for compiled languages, floating point format and precision comes down to two things:
The library used to implement the floating point functions that aren't directly supported in hardware.
The hardware the system is running on.
It may also depend on what compiler options you give (and how sophisticated the compiler is in general) - many modern processors have vector instructions, and the result may be subtly different than if you use "regular" floating point instructions (e.g. FPU vs. SSE on x86 processors). You may also see differences, sometimes, because the internal calculations on an x86 FPU are 80-bit, stored as 64-bit when the computation is completed.
But generally, given the same hardware and similar types of compilers, I'd expect to get the same result [and roughly the same performance] from two different [sufficiently similar] languages.
Most languages have either only "double" (typically 64-bit) or "single and double" (e.g. float - typically 32-bit and double - typically 64-bit in C/C++ - and probably D as well, but I'm not that into D).
In Go, floating point types follow the IEEE-754 standard.
Straight from the spec (http://golang.org/ref/spec#Numeric_types)
float32 the set of all IEEE-754 32-bit floating-point numbers
float64 the set of all IEEE-754 64-bit floating-point numbers
I'm not familiar with D, but this page might be of interest: http://dlang.org/float.html.
For C/C++, the standard doesn't require IEEE-754, but in C++ you could check std::numeric_limits<T>::is_iec559 to see if your compiler is using IEEE-754. See this question: How to check if C++ compiler uses IEEE 754 floating point standard
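For example, a minimal C++ check (this only reports what the implementation's numeric_limits claims about the types, nothing more) could be:

```cpp
#include <iostream>
#include <limits>

int main() {
    // is_iec559 is true when the type conforms to IEC 559 / IEEE 754.
    std::cout << std::boolalpha
              << "float  is IEEE 754: " << std::numeric_limits<float>::is_iec559  << '\n'
              << "double is IEEE 754: " << std::numeric_limits<double>::is_iec559 << '\n';
}
```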
I know that the C and C++ standards leave many aspects of the language implementation-defined just because, if there were an architecture with other characteristics, a standard-conforming compiler for that architecture would need to emulate those parts of the language, resulting in inefficient machine code.
Surely, 40 years ago every computer had its own unique specification. However, I don't know of any architectures used today where:
CHAR_BIT != 8
signed is not two's complement (I heard Java had problems with this one).
Floating point is not IEEE 754 compliant (Edit: I meant "not in IEEE 754 binary encoding").
The reason I'm asking is that I often explain to people that it's good that C++ doesn't mandate any other low-level aspects like fixed sized types†. It's good because unlike 'other languages' it makes your code portable when used correctly (Edit: because it can be ported to more architectures without requiring emulation of low-level aspects of the machine, like e.g. two's complement arithmetic on sign+magnitude architecture). But I feel bad that I cannot point to any specific architecture myself.
So the question is: what architectures exhibit the above properties?
† uint*_ts are optional.
Take a look at this one
Unisys ClearPath Dorado Servers
offering backward compatibility for people who have not yet migrated all their Univac software.
Key points:
36-bit words
CHAR_BIT == 9
one's complement
72-bit non-IEEE floating point
separate address space for code and data
word-addressed
no dedicated stack pointer
Don't know if they offer a C++ compiler, though they could.
And now a link to a recent edition of their C manual has surfaced:
Unisys C Compiler Programming Reference Manual
Section 4.5 has a table of data types with 9, 18, 36, and 72 bits.
None of your assumptions hold for mainframes. For starters, I don't know of a mainframe which uses IEEE 754: IBM uses base 16 floating point, and both of the Unisys mainframes use base 8. The Unisys machines are a bit special in many other respects: Bo has mentioned the 2200 architecture, but the MCP architecture is even stranger: 48 bit tagged words. (Whether the word is a pointer or not depends on a bit in the word.)
And the numeric representations are designed so that there is no real distinction between floating point and integral arithmetic: the floating point is base 8; it doesn't require normalization, and unlike every other floating point I've seen, it puts the decimal to the right of the mantissa, rather than the left, and uses signed magnitude for the exponent (in addition to the mantissa). With the result that an integral floating point value has (or can have) exactly the same bit representation as a signed magnitude integer. And there are no floating point arithmetic instructions: if the exponents of the two values are both 0, the instruction does integral arithmetic, otherwise, it does floating point arithmetic. (A continuation of the tagging philosophy in the architecture.) Which means that while int may occupy 48 bits, 8 of them must be 0, or the value won't be treated as an integer.
Full IEEE 754 compliance is rare in floating-point implementations. And weakening the specification in that regard allows lots of optimizations.
For example, subnormal support differs between x87 and SSE.
Optimizations like fusing a multiplication and an addition that were separate in the source code slightly change the results too, but fusing is a nice optimization on some architectures.
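As a small, hedged illustration in C++ (assuming IEEE 754 doubles and C++17 hex float literals; whether the two results actually differ also depends on compiler flags), std::fma performs the multiply and add with a single rounding, while the separate expression rounds twice:

```cpp
#include <cmath>
#include <cstdio>

int main() {
    // a and b are exactly representable, but their exact product
    // 1 + 2^-29 + 2^-60 is not: it rounds to 1 + 2^-29.
    double a = 1.0 + 0x1p-30;    // 1 + 2^-30
    double b = a;
    double c = -(1.0 + 0x1p-29); // -(1 + 2^-29)

    double separate = a * b + c;         // product rounded first, then the add
    double fused    = std::fma(a, b, c); // single rounding of a*b + c

    // With strict IEEE evaluation (no automatic contraction):
    // separate == 0, while fused == 2^-60.
    std::printf("separate = %g\nfused    = %g\n", separate, fused);
}
```

Note that some compilers will contract a * b + c into an fma on their own unless told not to, which is exactly the kind of result-changing optimization described above.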
Or on x86, strict IEEE compliance might require certain flags to be set, or additional transfers between floating point registers and normal memory to force it to use the specified floating point type instead of its internal 80-bit floats.
And some platforms have no hardware floats at all and thus need to emulate them in software. And some of the requirements of IEEE 754 might be expensive to implement in software. In particular the rounding rules might be a problem.
My conclusion is that you don't need exotic architectures in order to get into situations where you don't always want to guarantee strict IEEE compliance. For this reason, few programming languages guarantee strict IEEE compliance.
I found this link listing some systems where CHAR_BIT != 8. They include
some TI DSPs have CHAR_BIT == 16
the BlueCore-5 chip (a Bluetooth chip from Cambridge Silicon Radio), which has CHAR_BIT == 16.
And of course there is a question on Stack Overflow: What platforms have something other than 8-bit char
As for non-two's-complement systems, there is an interesting read on comp.lang.c++.moderated. Summarized: there are platforms having ones' complement or sign-and-magnitude representation.
I'm fairly sure that VAX systems are still in use. They don't support IEEE floating-point; they use their own formats. Alpha supports both VAX and IEEE floating-point formats.
Cray vector machines, like the T90, also have their own floating-point format, though newer Cray systems use IEEE. (The T90 I used was decommissioned some years ago; I don't know whether any are still in active use.)
The T90 also had/has some interesting representations for pointers and integers. A native address can only point to a 64-bit word. The C and C++ compilers had CHAR_BIT==8 (necessary because it ran Unicos, a flavor of Unix, and had to interoperate with other systems), so all byte-level operations were synthesized by the compiler, and a void* or char* stored a byte offset in the high-order 3 bits of the word. And I think some integer types had padding bits.
IBM mainframes are another example.
On the other hand, these particular systems needn't necessarily preclude changes to the language standard. Cray didn't show any particular interest in upgrading its C compiler to C99; presumably the same thing applied to the C++ compiler. It might be reasonable to tighten the requirements for hosted implementations, such as requiring CHAR_BIT==8, IEEE format floating-point if not the full semantics, and 2's-complement without padding bits for signed integers. Old systems could continue to support earlier language standards (C90 didn't die when C99 came out), and the requirements could be looser for freestanding implementations (embedded systems) such as DSPs.
On the other other hand, there might be good reasons for future systems to do things that would be considered exotic today.
CHAR_BIT
According to gcc source code:
CHAR_BIT is 16 bits for 1750a, dsp16xx architectures.
CHAR_BIT is 24 bits for dsp56k architecture.
CHAR_BIT is 32 bits for c4x architecture.
You can easily find more by doing:
find $GCC_SOURCE_TREE -type f | xargs grep "#define CHAR_TYPE_SIZE"
or
find $GCC_SOURCE_TREE -type f | xargs grep "#define BITS_PER_UNIT"
if CHAR_TYPE_SIZE is appropriately defined.
IEEE 754 compliance
If the target architecture doesn't support floating point instructions, gcc may generate a software fallback which is not standard-compliant by default. Moreover, special options (like -funsafe-math-optimizations, which also disables sign preservation for zeros) can be used.
IEEE 754 binary representation was uncommon on GPUs until recently, see GPU Floating-Point Paranoia.
EDIT: a question has been raised in the comments whether GPU floating point is relevant to usual computer programming, unrelated to graphics. Hell, yes! Most high-performance industrial computation today is done on GPUs; the list includes AI, data mining, neural networks, physical simulations, weather forecasting, and much, much more. One of the links in the comments shows why: GPUs have an order-of-magnitude advantage in floating point performance.
Another thing I'd like to add, which is more relevant to the OP question: what did people do 10-15 years ago when GPU floating point was not IEEE and when there was no API such as today's OpenCL or CUDA to program GPUs? Believe it or not, early GPU computing pioneers managed to program GPUs without an API to do that! I met one of them in my company. Here's what he did: he encoded the data he needed to compute as an image with pixels representing the values he was working on, then used OpenGL to perform the operations he needed (such as "gaussian blur" to represent a convolution with a normal distribution, etc), and decoded the resulting image back into an array of results. And this still was faster than using CPU!
Things like that are what prompted NVidia to finally make their internal data binary-compatible with IEEE and to introduce an API oriented toward computation rather than image manipulation.
I am interested to learn about the binary format for a single or a double type used by C++ on Intel based systems.
I have avoided the use of floating point numbers in cases where the data needs to potentially be read or written by another system (i.e. files or networking). I do realise that I could use fixed point numbers instead, and that fixed point is more accurate, but I am interested to learn about the floating point format.
Wikipedia has a reasonable summary - see http://en.wikipedia.org/wiki/IEEE_754.
But if you want to transfer numbers between systems you should avoid doing it in binary format. Either use middleware like CORBA (only joking, folks), Tibco etc., or fall back on that old favourite, textual representation.
This should get you started : http://docs.sun.com/source/806-3568/ncg_goldberg.html. (:
Floating-point format is determined by the processor, not the language or compiler. These days almost all processors (including all Intel desktop machines) either have no floating-point unit or have one that complies with IEEE 754. You get two or three different sizes (Intel with SSE offers 32, 64, and 80 bits) and each one has a sign bit, an exponent, and a significand. The number represented is usually given by this formula:
(-1)^sign * 2^(E - bias) * (1 + S / 2^k')
where k' is the number of bits in the significand, S is the stored significand bits, E is the stored exponent, and bias is a constant around the middle of the exponent range. There are special representations for zero (plus and minus zero) as well as infinities and other "not a number" (NaN) values.
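As a hedged worked example (assuming the usual IEEE 754 binary32 layout with an 8-bit exponent, a 23-bit significand, and a bias of 127), this small C++ program decodes the bit pattern 0x3FC00000 using that formula:

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

int main() {
    std::uint32_t bits = 0x3FC00000u;

    std::uint32_t sign = bits >> 31;          // 0
    std::uint32_t E    = (bits >> 23) & 0xFF; // 127
    std::uint32_t S    = bits & 0x7FFFFF;     // 0x400000

    // value = (-1)^sign * 2^(E - 127) * (1 + S / 2^23)
    // (normal numbers only; zeros, subnormals, infinities and NaNs
    //  are encoded specially)
    double value = (sign ? -1.0 : 1.0)
                 * std::ldexp(1.0 + S / 8388608.0, static_cast<int>(E) - 127);

    std::printf("0x%08X decodes to %g\n", bits, value); // prints 1.5
}
```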
There are definite quirks; for example, the fraction 1/10 cannot be represented exactly as a binary IEEE standard floating-point number. For this reason the IEEE standard also provides for a decimal representation, but this is used primarily by handheld calculators and not by general-purpose computers.
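A quick C++ illustration of that quirk (assuming double is IEEE 754 binary64): printing 0.1 with more digits than usual shows the nearest representable value rather than exactly one tenth:

```cpp
#include <cstdio>

int main() {
    double tenth = 0.1;
    // The stored value is the binary64 number closest to 1/10, whose exact
    // decimal expansion begins 0.1000000000000000055511151...
    std::printf("%.25f\n", tenth);
}
```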
Recommended reading: David Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic
As other posters have noted, there is plenty of information about the IEEE format used by every modern processor, but that is not where your problems will arise.
You can rely on any modern system using IEEE format, but you will need to watch for byte ordering. Look up "endianness" on Wikipedia (or somewhere else). Intel systems are little-endian, a lot of RISC processors are big-endian. Swapping between the two is trivial, but you need to know what type you have.
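If you need to find out which byte order the host uses, a minimal C++20 sketch (assuming the <bit> header and std::endian are available) is:

```cpp
#include <bit>
#include <cstdio>

int main() {
    if constexpr (std::endian::native == std::endian::little) {
        std::puts("this host is little-endian (e.g. x86)");
    } else if constexpr (std::endian::native == std::endian::big) {
        std::puts("this host is big-endian");
    } else {
        std::puts("this host has mixed endianness");
    }
}
```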
Traditionally, people use big-endian formats for transmission. Sometimes people include a header indicating the byte order they are using.
If you want absolute portability, the simplest thing is to use a text representation. However that can get pretty verbose for floating point numbers if you want to capture the full precision. 0.1234567890123456e+123.
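For instance, a hedged C++ sketch of a round-trip-safe text representation (it relies on max_digits10, the number of decimal digits needed to recover the exact binary value):

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>

int main() {
    double original = 0.1 * 3.0;

    // max_digits10 (17 for binary64) decimal digits are enough to
    // reproduce the exact same double when the text is parsed back.
    char text[64];
    std::snprintf(text, sizeof text, "%.*g",
                  std::numeric_limits<double>::max_digits10, original);

    double restored = std::strtod(text, nullptr);
    std::printf("text: %s, round-trip exact: %s\n",
                text, original == restored ? "yes" : "no");
}
```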
Intel's representation is IEEE 754 compliant.
You can find the details at http://download.intel.com/technology/itj/q41999/pdf/ia64fpbf.pdf .
Note that decimal floating-point constants may convert to different floating-point binary values on different systems (even with different compilers on the same system). The difference would be slight -- maybe only as large as 2^-54 for a double -- but is a difference nonetheless.
Use hexadecimal constants if you want to guarantee the same floating-point binary value on any platform.
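As a brief sketch of what that looks like in C++17 (and C99), hexadecimal floating literals spell out the significand and binary exponent directly, so there is no decimal-to-binary rounding step:

```cpp
int main() {
    // 0x1.8p+0 means (1 + 8/16) * 2^0, i.e. exactly 1.5.
    double a = 0x1.8p+0;

    // 0x1.0p-4 means 1 * 2^-4, i.e. exactly 0.0625.
    double b = 0x1.0p-4;

    return (a == 1.5 && b == 0.0625) ? 0 : 1;
}
```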