Why dealing with floating point endianness conversion in C/C++ is difficult? - c++

Swapping a bunch of bytes seems an easy job, but I always prefer to use libraries when they are available. So I stumble across Boost.Endian and I was surprised when I discover that it deals with intergers but not floating points. What is so special about floating point endianness? (I read Boost.Endian exaplantion but it doesn't fulfill my curiosity).
In particular I would like to understand this: if Boost.Endian already provides functions for swapping 4 and 8 bytes integers why not simply apply those functions to floats and doubles?

So I stumble across Boost.Endian and I was surprised when I discover that it deals with intergers but not floating points. What is so special about floating point endianness?
The documentation explains why:
An attempt was made to support four-byte floats and eight-byte doubles, limited to IEEE 754 (also known as ISO/IEC/IEEE 60559) floating point and further limited to systems where floating point endianness does not differ from integer endianness. Even with those limitations, support for floating point types was not reliable and was removed. For example, simply reversing the endianness of a floating point number can result in a signaling-NAN. For all practical purposes, binary serialization and endianness for integers are one and the same problem. That is not true for floating point numbers, so binary serialization interfaces and formats for floating point does not fit well in an endian-based library.
So, the problems are not related to the language, but to the underlying floating point format that the system uses.
The signaling-NAN mentioned as an example for example causes on POSIX systems the process to abort by default. This is typically an undesirable outcome when changing endianness.

Related

Are there any commonly used floating point formats besides IEEE754?

I am writing a marshaling layer to automatically convert values between different domains. When it comes to floating point values this potentially means converting values from one floating point format to another. However, it seems that almost every modern system is using IEEE754, so I'm wondering whether it's actually worth generalising to allow other formats, or just manage marshaling between different IEEE754 formats.
Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider (perhaps on ARM processors or mainframes)? If so, a reference to the format specification would be extremely helpful.
Virtually all relatively modern (within the last 15 years) general purpose computers use IEEE 754. In the very unlikely event that you find system that you need to support which uses a non-IEEE 754 floating point format, there will probably be a library available to convert to/from IEEE 754.
Some non-ancient systems which did not natively use IEEE 754 were the Cray SV1 (1998-2003) and IBM System 360, 370, and 390 prior to Generation 5 (ended 2002). IBM implemented IEEE 754 emulation around 2001 in a software release for prior S/390 hardware.
As of now, what systems do you actually want this to work on? If you come across one down the line that doesn't use IEEE754 (which as #JohnZwinick says, is vanishingly unlikely) then you should be able to code for that then.
To put it another way, what you are designing here is, in effect, a communications protocol and you obviously seek to make a sensible choice for how you will represent a floating point number (both single precision and double precision, I guess) in the bytes that travel between domains.
I think #SomeProgrammerDude was trying to imply that representing these as text strings (while they are in transit) might offer the most portability, and if so I would agree, but it's obviously not the most efficient way to do it.
So, if you do decide to plump for IEEE754 as your interchange format (as I would) then the worst that can happen is that you might need to find a way to convert these to and from the native format used on some antique architecture that you are almost certainly never going to encounter, and if that does happen then that problem would not be not difficult to solve.
Also, floats and doubles can be big-endian or little-endian, so you need to decide what you're going to use in your byte stream and convert when marshalling if necessary. Little-endian is much more common these days so I'd go with that.
Does anyone know of any commonly used floating point formats other than IEEE754 that I should consider ...?
CCSI uses a variation on binary32 for select processors.
it seems that almost every modern system is using IEEE754,
Yes, but... various implementations fudge on the particulars with edge values like subnormals, negative zero in visual studio, infinity and not-a-number.
It is this second issue that is more lethal and harder to discern that a given implementation has completely coded IEEE754. See __STDC_IEC_559__
OP has "I am writing a marshaling layer". It is in this coding that likely troubles remain for edge cases. Also IEEE754 does not specify endian so that marshaling issues remains. Recall integer endian may not match FP endian.

data structures for floating point operation in fixed point processor

I need to program a fixed point processor that was used for a legacy application. New features are requested and these features need large dynamic range , most likely beyond the fixed point range even after scaling. As the processor will not be changed due to several reasons, I am planning to incorporate the floating point operation based on fixed point arithmetic-- basically software based approach. I want to define few data structures to represent a floating point numbers in C for the underlying fixed point processor. Is it possible to do at all? I am planning to use the IEEE floating point representation . What kind of data structures would be good for achieving basic operation like multiplication, division, add and sub . Are there already some open source libraries available in C /C++?
Most C development tools for microcontrollers without native floating-point support provide software libraries for floating-point operations. Even if you do your programming in assembler, you could probably use those libraries. (Just curious, which processor and which development tools are you using?)
However, if you are serious about writing your own floating-point libraries, the most efficient way is to treat the routines as routines operating on integers. You can use unions, or code like to following to convert between floating-point and integer representation.
uint32_t integer_representation = *(uint32_t *)&fvalue;
Note that this is inherently undefined behavior, as the format of a floating-point number is not specified in the C standard.
The problem is much easier if you stick to floating-point and integer types that match (typically 32 or 64 bit types), that way you can see the routines as plain integer routines, as, for example, addition takes two 32 bit integer representations of floating-point values and return a 32 bit integer representation of the result.
Also, if you use the routines yourself, you could probably get away with leaving parts of the standard unimplemented, such as exception flags, subnormal numbers, NaN and Inf etc.
You don’t really need any data structures to do this. You simply use integers to represent floating-point encodings and integers to represent unpacked sign, exponent, and significand fields.

Floating point Endianness?

I'm writing a client and a server for a realtime offshore simulator, and, as I have to send a lot of data through a socket, I'm using binary data to maximize the amount of data I can send. I already know about integers endianness, and how to use htonl and ntohl to circumvent endianness issues, but my application, as almost all simulation software, deals with a lot of floats.
My question is: Is there some issue of endianness when dealing with binary formats of floating point numbers? I know that all the machines where my code will run use IEEE implementation of floating points, but is there some endianness issue when dealing with floats?
Since I only have access to machines with the same endian, I cannot test this by myself. So, I'll be glad if someone can help me with this.
According to Wikipedia,
Floating-point and endianness
On some machines, while integers were
represented in little-endian form,
floating point numbers were
represented in big-endian form.
Because there are many floating point
formats, and a lack of a standard
"network" representation, no standard
for transferring floating point values
has been made. This means that
floating point data written on one
machine may not be readable on
another, and this is the case even if
both use IEEE 754 floating point
arithmetic since the endianness of the
memory representation is not part of
the IEEE specification.
Yes, floating point can be endianess dependent. See Converting float values from big endian to little endian for info, be sure to read the comments.
EDIT: THE FOLLOWING IS WRONG ANSWER (leaving so that people know that this 'somewhat popular' view is wrong, please read the accepted answer and comments on this answer)
--WRONG ANSWER BEGIN--
There is no such thing as floating point endianness or integer endianness etc. Its just binary representation endianness.
Either a machine is little-endian, or its big-endian. Which means that it will either represent the MSb/MSB in the binary representation of any datatype as its first bit/byte or last bit/byte.
Thats it.
--WRONG ANSWER END---

How to write portable floating point arithmetic in c++?

Say you're writing a C++ application doing lots of floating point arithmetic. Say this application needs to be portable accross a reasonable range of hardware and OS platforms (say 32 and 64 bits hardware, Windows and Linux both in 32 and 64 bits flavors...).
How would you make sure that your floating point arithmetic is the same on all platforms ? For instance, how to be sure that a 32 bits floating point value will really be 32 bits on all platforms ?
For integers we have stdint.h but there doesn't seem to exist a floating point equivalent.
[EDIT]
I got very interesting answers but I'd like to add some precision to the question.
For integers, I can write:
#include <stdint>
[...]
int32_t myInt;
and be sure that whatever the (C99 compatible) platform I'm on, myInt is a 32 bits integer.
If I write:
double myDouble;
float myFloat;
am I certain that this will compile to, respectively, 64 bits and 32 bits floating point numbers on all platforms ?
Non-IEEE 754
Generally, you cannot. There's always a trade-off between consistency and performance, and C++ hands that to you.
For platforms that don't have floating point operations (like embedded and signal processing processors), you cannot use C++ "native" floating point operations, at least not portably so. While a software layer would be possible, that's certainly not feasible for this type of devices.
For these, you could use 16 bit or 32 bit fixed point arithmetic (but you might even discover that long is supported only rudimentary - and frequently, div is very expensive). However, this will be much slower than built-in fixed-point arithmetic, and becomes painful after the basic four operations.
I haven't come across devices that support floating point in a different format than IEEE 754. From my experience, your best bet is to hope for the standard, because otherwise you usually end up building algorithms and code around the capabilities of the device. When sin(x) suddenly costs 1000 times as much, you better pick an algorithm that doesn't need it.
IEEE 754 - Consistency
The only non-portability I found here is when you expect bit-identical results across platforms. The biggest influence is the optimizer. Again, you can trade accuracy and speed for consistency. Most compilers have a option for that - e.g. "floating point consistency" in Visual C++. But note that this is always accuracy beyond the guarantees of the standard.
Why results become inconsistent?
First, FPU registers often have higher resolution than double's (e.g. 80 bit), so as long as the code generator doesn't store the value back, intermediate values are held with higher accuracy.
Second, the equivalences like a*(b+c) = a*b + a*c are not exact due to the limited precision. Nonetheless the optimizer, if allowed, may make use of them.
Also - what I learned the hard way - printing and parsing functions are not necessarily consistent across platforms, probably due to numeric inaccuracies, too.
float
It is a common misconception that float operations are intrinsically faster than double. working on large float arrays is faster usually through less cache misses alone.
Be careful with float accuracy. it can be "good enough" for a long time, but I've often seen it fail faster than expected. Float-based FFT's can be much faster due to SIMD support, but generate notable artefacts quite early for audio processing.
Use fixed point.
However, if you want to approach the realm of possibly making portable floating point operations, you at least need to use controlfp to ensure consistent FPU behavior as well as ensuring that the compiler enforces ANSI conformance with respect to floating point operations. Why ANSI? Because it's a standard.
And even then you aren't guaranteeing that you can generate identical floating point behavior; that also depends on the CPU/FPU you are running on.
It shouldn't be an issue, IEEE 754 already defines all details of the layout of floats.
The maximum and minimum values storable should be defined in limits.h
Portable is one thing, generating consistent results on different platforms is another. Depending on what you are trying to do then writing portable code shouldn't be too difficult, but getting consistent results on ANY platform is practically impossible.
I believe "limits.h" will include the C library constants INT_MAX and its brethren. However, it is preferable to use "limits" and the classes it defines:
std::numeric_limits<float>, std::numeric_limits<double>, std::numberic_limits<int>, etc...
If you're assuming that you will get the same results on another system, read What could cause a deterministic process to generate floating point errors first. You might be surprised to learn that your floating point arithmetic isn't even the same across different runs on the very same machine!

What is the binary format of a floating point number used by C++ on Intel based systems?

I am interested to learn about the binary format for a single or a double type used by C++ on Intel based systems.
I have avoided the use of floating point numbers in cases where the data needs to potentially be read or written by another system (i.e. files or networking). I do realise that I could use fixed point numbers instead, and that fixed point is more accurate, but I am interested to learn about the floating point format.
Wikipedia has a reasonable summary - see http://en.wikipedia.org/wiki/IEEE_754.
Burt if you want to transfer numbers betwen systems you should avoid doing it in binary format. Either use middleware like CORBA (only joking, folks), Tibco etc. or fall back on that old favourite, textual representation.
This should get you started : http://docs.sun.com/source/806-3568/ncg_goldberg.html. (:
Floating-point format is determined by the processor, not the language or compiler. These days almost all processors (including all Intel desktop machines) either have no floating-point unit or have one that complies with IEEE 754. You get two or three different sizes (Intel with SSE offers 32, 64, and 80 bits) and each one has a sign bit, an exponent, and a significand. The number represented is usually given by this formula:
sign * (2**(E-k)) * (1 + S / (2**k'))
where k' is the number of bits in the significand and k is a constant around the middle range of exponents. There are special representations for zero (plus and minus zero) as well as infinities and other "not a number" (NaN) values.
There are definite quirks; for example, the fraction 1/10 cannot be represented exactly as a binary IEEE standard floating-point number. For this reason the IEEE standard also provides for a decimal representation, but this is used primarily by handheld calculators and not by general-purpose computers.
Recommended reading: David Golberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic
As other posters have noted, there is plenty of information about on the IEEE format used by every modern processor, but that is not where your problems will arise.
You can rely on any modern system using IEEE format, but you will need to watch for byte ordering. Look up "endianness" on Wikipedia (or somewhere else). Intel systems are little-endian, a lot of RISC processors are big-endian. Swapping between the two is trivial, but you need to know what type you have.
Traditionally, people use big-endian formats for transmission. Sometimes people include a header indicating the byte order they are using.
If you want absolute portability, the simplest thing is to use a text representation. However that can get pretty verbose for floating point numbers if you want to capture the full precision. 0.1234567890123456e+123.
Intel's representation is IEEE 754 compliant.
You can find the details at http://download.intel.com/technology/itj/q41999/pdf/ia64fpbf.pdf .
Note that decimal floating-point constants may convert to different floating-point binary values on different systems (even with different compilers on the same system). The difference would be slight -- maybe only as large as 2^-54 for a double -- but is a difference nonetheless.
Use hexadecimal constants if you want to guarantee the same floating-point binary value on any platform.