If I test my code with the following:
#ifndef __STDC_IEC_559__
#error Warning: __STDC_IEC_559__ not defined. The code assumes we're using the IEEE 754 floating point for binary serialization of floats and doubles.
#endif
...such as is described here, am I guaranteed that this:
float myFloat = ...;
unsigned char *data = reinterpret_cast<unsigned char*>(&myFloat);
unsigned char buffer[4];
std::memcpy(&buffer[0], data, sizeof(float));
...would safely serialize the float for writing to a file or network packet?
If not, how can I safely serialize floats and doubles?
Also, who's responsible for byte ordering - my code or the Operating System?
To clarify my question: Can I cast floats to 4 bytes and doubles to 8 bytes, and safely serialize to and from files or across networks, if I:
Assert that we're using IEC 559
Convert the resulting bytes to/from a standard byte order (such as network byte order).
__STDC_IEC_559__ is a macro defined by C99/C11; I found no statement that C++ guarantees to support it.
A better solution is to use std::numeric_limits<float>::is_iec559 or std::numeric_limits<double>::is_iec559.
C++11 18.2.1.1 Class template numeric_limits
static const bool is_iec559 ;
52 True if and only if the type adheres to IEC 559 standard.210)
53 Meaningful for all floating point types.
In the footnote:
210) International Electrotechnical Commission standard 559 is the same as IEEE 754.
About your second assumption: I don't think any byte order can be called "standard", but if the byte order is the same between machines (little- or big-endian), then yes, you can serialize like that.
How about considering a standard serialization format like XDR [used in Unix RPC] or CDR?
http://en.wikipedia.org/wiki/External_Data_Representation
for example :
bool_t xdr_float(XDR *xdrs, float *fp); from linux.die.net/man/3/xdr
or a c++ library
http://xstream.sourceforge.net/
You might also be interested in CDR [used by CORBA]; ACE [adaptive communication environment] has CDR classes [but it's a very heavy library].
I've built a custom version of frexp:
auto frexp(float f) noexcept
{
static_assert(std::numeric_limits<float>::is_iec559);
constexpr uint32_t ExpMask = 0xff;
constexpr int32_t ExpOffset = 126;
constexpr int MantBits = 23;
uint32_t u;
std::memcpy(&u, &f, sizeof(float)); // well defined bit transformation from float to int
int exp = ((u >> MantBits) & ExpMask) - ExpOffset; // extract the 8 bits of the exponent (it has an offset of 126)
// divide by 2^exp (leaving mantissa intact while placing "0" into the exponent)
u &= ~(ExpMask << MantBits); // zero out the exponent bits
u |= ExpOffset << MantBits; // place 126 into exponent bits (representing 0)
std::memcpy(&f, &u, sizeof(float)); // copy back to f
return std::make_pair(exp, f);
}
By checking is_iec559 I'm making sure that float fulfills
the requirements of IEC 559 (IEEE 754) standard.
My question is: Does this mean that the bit operations I'm doing are well defined and do what I want? If not, is there a way to fix it?
I tested it for some random values and it seems to be correct, at least on Windows 10 compiled with msvc and on wandbox. Note however, that (on purpose) I'm not handling the edge cases of subnormals, NaN, and inf.
If anyone wonders why I'm doing this: In benchmarks I found that this version of frexp is up to 15 times faster than std::frexp on Windows 10. I haven't tested other platforms yet. But I want to make sure that this doesn't just work by coincidence and may break in the future.
Edit:
As mentioned in the comments, endianness could be an issue. Does anybody know?
"Does this mean that the bit operations I'm doing are well defined..."
The TL;DR, by the strict definition of "well defined": no.
Your assumptions are likely correct but not well defined, because there are no guarantees about the bit width or the implementation of float. From § 3.9.1:
there are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double. The value representation of floating-point types is implementation-defined.
The is_iec559 clause only qualifies with:
True if and only if the type adheres to IEC 559 standard
If a literal genie wrote you a terrible compiler, and made float = binary16, double = binary32, and long double = binary64, and made is_iec559 true for all the types, it would still adhere to the standard.
does that mean that I can extract exponent and mantissa in a well defined way?
The TL;DR, by the limited guarantees of the C++ standard: no.
Assume you use float32_t and is_iec559 is true, that you logically deduced from all the rules that it could only be binary32 with no trap representations, and that you correctly argued that memcpy is well defined for conversion between arithmetic types of the same width and won't break strict aliasing. Even with all those assumptions, it's only likely, not guaranteed, that you can extract the mantissa this way.
The IEEE 754 standard (and 2's complement) are specified in terms of bit strings, while the behavior of memcpy is described in terms of bytes. It's plausible to assume the bit strings of uint32_t and float32_t are encoded the same way (e.g. same endianness), but the standard gives no such guarantee. If the bit strings are stored differently, shifting and masking the copied integer representation will give the wrong mantissa, even though the memcpy itself is well defined.
As mentioned in the comments, endianness could be an issue. Does anybody know?
At least a few architectures have used different endianness for floating point registers and integer registers. The same link says that except for small embedded processors, this isn't a concern. I trust Wikipedia entirely for all topics and refuse to do any further research.
I have a working software, which currently runs on a little-endian architecture. I would like to make it run in big-endian mode too. I would like to write little-endian data into files, regardless of the endianness of the underlying system.
To achieve this, I decided to use the boost endian library. It can convert integers efficiently. But it cannot handle floats (and doubles).
It states in the documentation, that "Floating point types will be supported in the Boost 1.59.0". But they are still not supported in 1.62.
I can assume that the floats are valid IEEE 754 floats (or doubles), but their endianness may vary according to the underlying system. As far as I know, using the htonl and ntohl functions on floats is not recommended. How is it possible, then? Is there any header-only library which can handle floats too? I was not able to find any.
I could convert the floats to strings and write those into a file, but I would like to avoid that method for many reasons (performance, disk space, ...).
Here:
float f = 1.2f;
auto it = reinterpret_cast<uint8_t*>(&f);
std::reverse(it, it + sizeof(f)); //f is now in the reversed endianness
No need for anything fancy.
Unheilig: you are correct, but
#include <boost/endian/conversion.hpp>
template <typename T>
inline T endian_cast(const T & t)
{
#ifdef BOOST_LITTLE_ENDIAN
return boost::endian::endian_reverse(t);
#else
return t;
#endif
}
or, when you are using pointers and want to reverse in place, use:
template <typename T>
inline void endian_cast(T *t)
{
#ifdef BOOST_LITTLE_ENDIAN
boost::endian::endian_reverse_inplace(*t); // modifies *t in place; returns void
#endif
}
and use it instead of manually (and possibly error-prone) reversing the content.
example:
std::uint16_t start_address() const
{
std::uint16_t address;
std::memcpy(&address, &data()[1], 2);
return endian_cast(address);
}
void start_address(std::uint16_t i)
{
endian_cast(&i);
std::memcpy(&data()[1], &i, 2);
}
Good luck.
When serializing float/double values, I make the following three assumptions:
The machine representation follows IEEE 754
The endianness of float/double matches the endianness of integers
The behavior of reinterpret_cast-ing between double&/int64_t& or float&/int32_t& is well-defined (E.g., the cast behaves as if the types are similar).
None of these assumptions is guaranteed by the standard. Under these assumptions, the following code will ensure doubles are written in little-endian:
std::ostream &out = ...; // any stream opened in binary mode
double someVal;
...
static_assert(sizeof(someVal) == sizeof(int64_t),
"Endian conversion requires 8-byte doubles");
native_to_little_inplace(reinterpret_cast<int64_t&>(someVal));
out.write(reinterpret_cast<char*>(&someVal), sizeof(someVal));
We have special functions like std::nanl to make a NaN with a payload. Currently here's what I have to do to print it back:
#include <iostream>
#include <cmath>
#include <cstring>
#include <cstdint>
int main()
{
const auto x=std::nanl("1311768467463790325");
std::uint64_t y;
std::memcpy(&y,&x,sizeof y);
std::cout << (y&~(3ull<<62)) << "\n";
}
This relies on the particular representation of long double, namely on it being the 80-bit x87 extended type. Is there any standard way to achieve this without relying on such an implementation detail?
C++ imports nan* functions from ISO C. ISO C states in 7.22.1.3:
the meaning of the n-char sequence is implementation-defined
with a comment
An implementation may use the n-char sequence to determine extra information to be represented in the NaN’s significand.
There is no method to get the stored information.
I stumbled across this one here in 2023. Things haven't improved much.
C11 supports nan*() functions (if QNaN is supported on your target processor), but
MSVC 2022 does not actually implement the payload (the call compiles, but the argument is ignored),
the payload must be specified as a string anyway, and
there is still no Standard way to get the data back.
(C23 proposes the GNU extension getPayload(), but it returns yet another double, which is far less interesting than an integer would have been.)
However
It has always been possible to get a QNaN payload, assuming you have a proper IEEE 754 QNaN with payload data. It has been put to good use in things like JavaScript and Lua, for example.[citation needed]
According to Wikipedia, after discussing some dinosaurs: [link]
It may therefore appear strange that the widespread IEEE 754 floating-point standard does not specify endianness.[3] Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type. Small embedded systems using special floating-point formats may be another matter, however. (Emphasis added.)
So as long as you aren’t leaking abstractions outside of internal use or playing with specialized (or ancient) hardware then you should be good to play with stuffing stuff in your QNaNs.
As this question is tagged C++ we will have to resort to slightly uglier code than strictly necessary in C, as type-punning with a union is (probably) UB in C++.[more link] The following should work in both C and C++ and produce just as well-optimized code either way.
Da codez or go home
qnan.h
#ifndef QNAN_H
#define QNAN_H
// Copyright stackoverflow.com
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at
// https://www.boost.org/LICENSE_1_0.txt )
#include <assert.h>
#include <math.h>
#include <stdint.h>
#ifndef NAN
#error "IEEE 754 Quiet NaN is required."
#endif
#ifndef UINT64_MAX
#error "uint64_t required."
#endif
static_assert( sizeof(double) == 8, "IEEE 754 64-bit double-precision is required" );
double qnan ( unsigned long long payload );
unsigned long long qnan_payload ( double qnan );
#endif
qnan.c
// Copyright stackoverflow.com
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at
// https://www.boost.org/LICENSE_1_0.txt )
#include <string.h>
#include "qnan.h"
double
qnan( unsigned long long payload )
{
double qnan = NAN;
uint64_t n;
memcpy( &n, &qnan, 8 );
n |= payload & 0x7FFFFFFFFFFFFULL;
memcpy( &qnan, &n, 8 );
return qnan;
}
unsigned long long
qnan_payload( double qnan )
{
uint64_t n;
memcpy( &n, &qnan, 8 );
return n & 0x7FFFFFFFFFFFFULL;
}
These two functions allow you access to all 51 bits of payload data as an unsigned integer.
Note, however, that unlike the weird-o getPayload() function the qnan_payload() function does not bother to fact-check you about your choice of input — it assumes you have given it an actual QNaN.
If you are unsure what kind of double you have, the isnan() function from <math.h> works just fine to check for QNaN-ness.
Similar code will give you access to a four-byte float or an N-byte long double (which is probably just an 8-byte double, unless it isn't, and is probably more trouble to support than it's worth).
Having read this: http://commandcenter.blogspot.fi/2012/04/byte-order-fallacy.html
The method in the article is this:
Read from big endian:
int i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);
Read from little endian:
int i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
Is there any way to apply this idiom to floating point numbers?
So is there any way to avoid the if(swap_needed) swap(data);
One thought I had was to read the sign bit, mantissa and exponent individually from the data calculate the floating point value based on them.
Sebastian Redl's answer is correct if you stay with plain IEEE-754 float or double, but it will fail with Intel's special 80-bit long double representation and all the other vendors' special ideas for their long double formats. Only very few architectures use nothing but the standard IEEE-754 floating point formats.
Even plain MIPS, which can run big- or little-endian at will, has a special MIPS64 16-byte long double format.
So there's no correct and easy way to do a fast byteswap for floats. However I wrote code to read floats from various architectures into the current architecture, which is a herculean task. https://github.com/parrot/parrot/blob/native_pbc2/src/packfile/pf_items.c#L553
Note: The intel speciality is the extra normalization bit (the highest bit 63 of the mantissa) marked with i in https://github.com/parrot/parrot/blob/native_pbc2/src/packfile/pf_items.c#L605
I.e. I convert between those, BE and LE:
Floattype 0 = IEEE-754 8 byte double (binary64)
Floattype 1 = Intel 80-bit long double stored in 12 byte (i386) or aligned to 16 byte (x86_64/ia64)
Floattype 2 = IEEE-754 128 bit quad precision stored in 16 byte, Sparc64 quad-float or __float128, gcc since 4.3 (binary128)
Floattype 3 = IEEE-754 4 byte float (binary32)
Floattype 4 = PowerPC 16 byte double-double (-mlong-double-128)
not yet:
Floattype 5 = IEEE-754 2 byte half-precision float (binary16)
Floattype 6 = MIPS64 16 byte long double
Floattype 7 = AIX 16 byte long double
CRAY and more craziness
Since there was no big need, I never made a proper library for this float-conversion code.
Btw. I use much faster native byteswap functions, see https://github.com/parrot/parrot/blob/native_pbc2/include/parrot/bswap.h
Usually you print with max. precision to a string and read this string back. The only remaining problem is determining your max. precision.
You just grab the underlying bytes and work with that.
unsigned char underlying[sizeof(float)];
// Writing
std::memcpy(underlying, &my_float, sizeof(float));
if (platform_endian != target_endian)
std::reverse(std::begin(underlying), std::end(underlying));
write(underlying, sizeof(float));
// Reading
read(underlying, sizeof(float));
if (platform_endian != target_endian)
std::reverse(std::begin(underlying), std::end(underlying));
std::memcpy(&my_float, underlying, sizeof(float));
You can of course optimize the reverse to something super-special if you feel so inclined.
You will usually see people cast to an unsigned 64-bit integer and then call the classic BSD functions to convert to/from network byte order. I once worked on a project where I received doubles over the network from a Java machine, so I knew they were sent big-endian, and read them on an Intel machine in C++. I simply read the data as a char[8], called std::reverse, and cast the result to a double:
double read_double()
{
char buffer[8];
// read from network into buffer;
std::reverse(std::begin(buffer), std::end(buffer));
return *static_cast<double*>(static_cast<void*>(buffer));
}
Today I would do things differently. For one, the bit-shifting code you posted isn't that hard to follow. For another, I agree with @NeilKirk and the article you linked to: the code to read/write a particular endianness is identical regardless of the machine's actual endianness, so just write code that reads big-endian/little-endian data using the article's approach (and convert to a double after you've read and manipulated the bytes as a 64-bit unsigned integer type).
I know the integer format would be different between big-endian machine and little-endian machine, is it the same for float point format (IEEE 754)?
The IEEE 754 specification for floating point numbers simply doesn't cover the endianness problem. Floating point numbers therefore may use different representations on different machines and, in theory, it's even possible that two processors share the same integer endianness but differ in floating point endianness, or vice versa.
See this wikipedia article for more information.
If you have a Linux box, you'll probably have /usr/include/ieee754.h... (note the #ifs)
...
union ieee754_float
{
float f;
/* This is the IEEE 754 single-precision format. */
struct
{
#if __BYTE_ORDER == __BIG_ENDIAN
unsigned int negative:1;
unsigned int exponent:8;
unsigned int mantissa:23;
#endif /* Big endian. */
#if __BYTE_ORDER == __LITTLE_ENDIAN
unsigned int mantissa:23;
unsigned int exponent:8;
unsigned int negative:1;
#endif /* Little endian. */
} ieee;
...
Endianness issues arise as soon as you consider something as made up of smaller units. The way the smaller units are arranged may change.
Then, if you care about variations in FP format, you must be aware that IEEE 754 doesn't prescribe a bit-level representation (even if one of its diagrams suggests one), and there is at least one more variation beyond endianness: the bit distinguishing quiet from signaling NaNs is interpreted differently by different implementations.