reinterpret_cast usage to manipulate bytes - c++

I was reading here how to use the byteswap function. I don't understand why bit_cast is actually needed instead of a reinterpret_cast to char*. As I understand it, using that cast does not violate the strict aliasing rule. I read that the second version below could be wrong because it accesses unaligned memory. Maybe so, but at this point I'm a bit confused: if the access is UB due to unaligned memory, when is it ever possible to manipulate bytes with reinterpret_cast? According to the standard, the cast should allow accessing (reading and writing) the memory.
template<std::integral T>
constexpr T byteswap(T value) noexcept
{
    static_assert(std::has_unique_object_representations_v<T>,
                  "T may not have padding bits");
    auto value_representation = std::bit_cast<std::array<std::byte, sizeof(T)>>(value);
    std::ranges::reverse(value_representation);
    return std::bit_cast<T>(value_representation);
}

template<std::integral T>
void byteswap(T& value) noexcept
{
    static_assert(std::has_unique_object_representations_v<T>,
                  "T may not have padding bits");
    char* value_representation = reinterpret_cast<char*>(&value);
    std::reverse(value_representation, value_representation + sizeof(T));
}

The primary reason is that reinterpret_cast cannot be used in constant expression evaluation, while std::bit_cast can. And std::byteswap is specified to be constexpr.
If you added constexpr to the declaration in your implementation, it would be ill-formed, no diagnostic required, because there is no specialization of it that could be called as a subexpression of a constant expression.
Without the constexpr it is not ill-formed, but it cannot be called as a subexpression of a constant expression, which std::byteswap is supposed to allow.
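To see the difference concretely, the std::bit_cast-based overload from the question can be exercised entirely at compile time (a minimal sketch, assuming that overload is in scope; the test value is just illustrative):

#include <cstdint>

// Uses the std::bit_cast-based byteswap(T value) overload shown above.
static_assert(byteswap(std::uint32_t{0x11223344}) == std::uint32_t{0x44332211});

// A reinterpret_cast-based implementation could not appear here:
// reinterpret_cast is not permitted during constant expression evaluation.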
Furthermore, there is a defect in the standard:
The standard technically does not allow doing pointer arithmetic on the reinterpret_cast<char*>(&value) pointer (and also doesn't really specify a meaning for reading and writing through such a pointer).
The intention is that the char* pointer should be a pointer into the object representation of the object, considered as an array of characters. But currently the standard just says that the reinterpret_cast<char*>(&value) pointer still points to the original object, not to its object representation. See P1839 for a paper proposing to correct the specification to be more in line with the usual assumptions.
The implementation from cppreference also makes an assumption that is not clearly guaranteed: that std::array<std::byte, sizeof(T)> has the same size as T. That should hold in practice, and std::bit_cast will fail to compile if it doesn't.
If you want to read some discussion on whether or not it is guaranteed in theory, see the questions std::bit_cast with std::array, Is the size of std::array defined by standard and What is the sizeof std::array<char, N>?

Related

std::bit_cast with std::array

In his recent talk “Type punning in modern C++” Timur Doumler said that std::bit_cast cannot be used to bit cast a float into an unsigned char[4] because C-style arrays cannot be returned from a function. We should either use std::memcpy or wait until C++23 (or later) when something like reinterpret_cast<unsigned char*>(&f)[i] will become well defined.
In C++20, can we use an std::array with std::bit_cast,
float f = /* some value */;
auto bits = std::bit_cast<std::array<unsigned char, sizeof(float)>>(f);
instead of a C-style array to get bytes of a float?
Yes, this works on all major compilers, and as far as I can tell from looking at the standard, it is portable and guaranteed to work.
First of all, std::array<unsigned char, sizeof(float)> is guaranteed to be an aggregate (https://eel.is/c++draft/array#overview-2). From this it follows that it holds exactly sizeof(float) unsigned char elements inside (typically as a char[], although as far as I can see the standard doesn't mandate this particular implementation - but it does say the elements must be contiguous) and cannot have any additional non-static data members.
It is therefore trivially copyable, and its size matches that of float as well.
Those two properties allow you to bit_cast between them.
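For illustration, here is roughly what that usage looks like end to end (a minimal sketch; the byte order printed depends on the machine's endianness):

#include <array>
#include <bit>
#include <cstdio>

int main() {
    float f = 5.0f;
    auto bits = std::bit_cast<std::array<unsigned char, sizeof(float)>>(f);
    for (unsigned char b : bits)
        std::printf("%02x ", static_cast<unsigned>(b));   // e.g. "00 00 a0 40" on a little-endian machine
    std::printf("\n");
}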
The accepted answer is incorrect because it fails to consider possible padding and layout issues.
Per [array]/1-3:
The header <array> defines a class template for storing fixed-size sequences of objects. An array is a contiguous container. An instance of array<T, N> stores N elements of type T, so that size() == N is an invariant.
An array is an aggregate that can be list-initialized with up to N elements whose types are convertible to T.
An array meets all of the requirements of a container and of a reversible container ([container.requirements]), except that a default constructed array object is not empty and that swap does not have constant complexity. An array meets some of the requirements of a sequence container. Descriptions are provided here only for operations on array that are not described in one of these tables and for operations where there is additional semantic information.
The standard does not actually require std::array to have exactly one public data member of type T[N], so in theory it is possible that sizeof(To) != sizeof(From), or that is_trivially_copyable_v<To> is false.
I will be surprised if this doesn't work in practice, though.
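If you want that worry to be visible in code, the two properties std::bit_cast relies on can be spelled out as static_asserts (a small sketch; the FloatBytes alias is just for the example, and both assertions pass on every mainstream implementation):

#include <array>
#include <type_traits>

using FloatBytes = std::array<unsigned char, sizeof(float)>;

// The two preconditions of std::bit_cast, stated explicitly:
static_assert(sizeof(FloatBytes) == sizeof(float),
              "std::array is assumed to add no padding here");
static_assert(std::is_trivially_copyable_v<FloatBytes>,
              "std::array of unsigned char is assumed to be trivially copyable");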
Yes.
According to the paper that describes the behaviour of std::bit_cast and its proposed implementation, the cast should succeed as long as both types have the same size and are trivially copyable.
A simplified implementation of std::bit_cast should be something like:
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

template <class Dest, class Source>
inline Dest bit_cast(Source const& source) {
    static_assert(sizeof(Dest) == sizeof(Source));
    static_assert(std::is_trivially_copyable<Dest>::value);
    static_assert(std::is_trivially_copyable<Source>::value);

    Dest dest;
    std::memcpy(&dest, &source, sizeof(dest));
    return dest;
}
Since a float (4 bytes) and an array of unsigned char with sizeof(float) elements satisfy all those asserts, the underlying std::memcpy will be carried out. Therefore, each element in the resulting array will be one consecutive byte of the float.
In order to prove this behaviour, I wrote a small example in Compiler Explorer that you can try here: https://godbolt.org/z/4G21zS. The float 5.0 is properly stored as an array of bytes (0x40a00000), which is the hexadecimal representation of that float number in big-endian order.

Should std::byte pointers be used for pointer arithmetic?

It seems std::byte has become the way (in C++17) to work with buffers holding object representations, but it's unclear whether this intent still allows for performing pointer arithmetic.
The question in the title is intentionally phrased as should because I'm looking for a recommendation. For example, void* can be used for pointer arithmetic as a gcc extension, but that is not standard (at least this is true for C), so it is a possibility rather than a recommendation.
I know the motivation for std::byte is to detach the character and the numeric aspects from the concept of a byte. But at the same time, does pointer arithmetic stay?
EDIT: adjusted to clarify that I'm looking to do "pointer arithmetic" using std::byte*, not with the numerical value of pointers stored in std::bytes.
Yes, std::byte* can be used for pointer arithmetic.
And you can even do things like
#include <cstddef>   // std::byte, offsetof

struct foo { int x, y; };
foo f;
int* ptr_to_y = reinterpret_cast<int*>(
    reinterpret_cast<std::byte*>(&f) + offsetof(foo, y));
You do have to be careful that your locations are reachable through your operations. Just because pointers-as-integers gets the right result doesn't mean that the C++ code is doing defined behavior. There are a number of quirks in C++ around permitting the optimizer to "know" that a certain value cannot be modified.
struct loc {
    int x, y;
};

void f( int* );

loc work( loc l ) {
    l.x = 3;
    f(&l.y);
    return l;
}
In the above case, someone who used the &l.y pointer to do pointer arithmetic (within f) and modify l.x, regardless of whether they went through std::byte* or not, would be doing undefined behavior. The compiler is allowed to assume the returned l will have an .x value of 3.
These are not new pitfalls introduced by std::byte*.
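As a small illustration of routine std::byte* arithmetic, here is a sketch of walking an object's bytes (the checksum function and its name are just for this example):

#include <cstddef>   // std::byte, std::size_t, std::to_integer
#include <cstdint>

std::uint8_t checksum(const void* data, std::size_t n) {
    const std::byte* p = static_cast<const std::byte*>(data);
    std::uint8_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += std::to_integer<std::uint8_t>(p[i]);   // p[i] is pointer arithmetic on std::byte*
    return sum;
}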

Over-aligned types with std::aligned_storage

The C++ standard states, regarding the std::aligned_storage template, that
Align shall be equal to alignof(T) for some type T or to default-alignment.
Does that mean that there must be such a type in the program, or that it must be possible to make such a type? In particular, the possible implementation suggested on cppreference is
template<std::size_t Len, std::size_t Align /* default alignment not implemented */>
struct aligned_storage {
    typedef struct {
        alignas(Align) unsigned char data[Len];
    } type;
};
It seems like this makes a type with that alignment, if possible (that is, if Align is a valid alignment). Is that behavior required, or is it undefined behavior to specify an Align if such a type does not already exist?
And, perhaps more importantly, is it plausible in practice that the compiler or standard library would fail to do the right thing in this case, assuming that Align is at least a legal alignment for a type to have?
You can always attempt to make a type with arbitrary (valid) alignment N:
template <std::size_t N> struct X { alignas(N) char c; };
When N is greater than the default alignment, X<N> has extended alignment. Support for extended alignments is implementation-defined, and [dcl.align] says:
if the constant expression does not evaluate to an alignment value (6.11), or evaluates to an extended alignment and the implementation does not support that alignment in the context of the declaration, the program is ill-formed.
Therefore, when you attempt to say X<N> for an extended alignment that is not supported, you will face a diagnostic. You can now use the existence (or otherwise) of X<N> to justify the validity of the specialization aligned_storage<Len, N> (whose condition is now met with T = X<N>).
Since aligned_storage will effectively use something like X internally, you don't even have to actually define X. It's just a mental aid in the explanation. The aligned_storage will be ill-formed if the requested alignment is not supported.
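A quick sketch of how this plays out (the choice of 64 as an extended alignment is purely illustrative; on an implementation that doesn't support it, the instantiation is ill-formed and diagnosed):

#include <cstddef>
#include <type_traits>

template <std::size_t N> struct X { alignas(N) char c; };

// If 64-byte alignment is supported, both of these compile:
static_assert(alignof(X<64>) == 64);
static_assert(alignof(std::aligned_storage<128, 64>::type) >= 64);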

Endianness in constexpr

I want to create a constexpr function that returns the endianness of the system, like so:
constexpr bool IsBigEndian()
{
    constexpr int32_t one = 1;
    return (reinterpret_cast<const int8_t&>(one) == 0);
}
Now, since the function will get executed at compile time rather than on the actual target machine, what guarantee does the C++ spec give to make sure that the correct result is returned?
None. In fact, the program is ill-formed. From [expr.const]:
A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
— [...]
— a reinterpret_cast.
— [...]
And, from [dcl.constexpr]:
For a constexpr function or constexpr constructor that is neither defaulted nor a template, if no argument values exist such that an invocation of the function or constructor could be an evaluated subexpression of a core constant expression (5.20), or, for a constructor, a constant initializer for some object (3.6.2), the program is ill-formed; no diagnostic required.
The way to do this is just to hope that your compiler is nice enough to provide macros for the endianness of your machine. For instance, on gcc, I could use __BYTE_ORDER__:
constexpr bool IsBigEndian() {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return false;
#else
    return true;
#endif
}
As stated by Barry, your code is not legal C++. However, even if you took away the constexpr part, it would still not be legal C++. Your code violates strict aliasing rules and therefore represents undefined behavior.
Indeed, there is no way in C++ to detect the endianness of an object without invoking undefined behavior. Casting it to a char* doesn't work, because the standard doesn't require big- or little-endian order. So while you could read the data byte by byte, you would not be able to legally infer anything from those values.
And type punning through a union fails because you're not allowed to type pun through a union in C++ at all. And even if you did... again, C++ does not restrict implementations to big or little endian order.
So as far as C++ as a standard is concerned, there is no way to detect this, whether at compile-time or runtime.
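(Worth noting as a postscript: since C++20 the standard does expose this directly via std::endian in the <bit> header, which is usable in constant expressions; the answers above appear to predate it. A minimal sketch:)

#include <bit>

constexpr bool IsBigEndian() {
    return std::endian::native == std::endian::big;   // false on little- or mixed-endian targets
}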

Comparison semantics with std::atomic types

I'm trying to find where the comparison semantics for the type T used with std::atomic are defined.
I know that besides the built-in specializations for integral types, T can be any TriviallyCopyable type. But how do operations like compare_exchange_X know how to compare an instance of T?
I imagine they must simply do a byte-by-byte comparison of the user-defined object (like a memcmp), but I don't see where in the standard this is explicitly mentioned.
So, suppose I have:
struct foo
{
    std::uint64_t x;
    std::uint64_t y;
};
How does the compiler know how to compare two std::atomic<foo> instances when I call std::atomic<foo>::compare_exchange_weak()?
In draft n3936, memcmp semantics are explicitly described in section 29.6.5.
Note: For example, the effect of atomic_compare_exchange_strong is
if (memcmp(object, expected, sizeof(*object)) == 0)
    memcpy(object, &desired, sizeof(*object));
else
    memcpy(expected, object, sizeof(*object));
and
Note: The memcpy and memcmp semantics of the compare-and-exchange operations may result in failed comparisons for values that compare equal with operator== if the underlying type has padding bits, trap bits, or alternate representations of the same value.
That wording has been present at least since n3485.
Note that only memcmp(p1, p2, sizeof(T)) != 0 is meaningful to compare_exchange_weak (failure is guaranteed). memcmp(p1, p2, sizeof(T)) == 0 allows, but does not guarantee, success.
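Putting that together for the foo from the question, usage looks roughly like this (a sketch; bump_x is just an illustrative name, and the static_assert makes the no-padding assumption behind the bytewise comparison explicit):

#include <atomic>
#include <cstdint>
#include <type_traits>

struct foo
{
    std::uint64_t x;
    std::uint64_t y;
};

// The memcmp-style comparison is only trouble-free if foo has no padding bits:
static_assert(std::has_unique_object_representations_v<foo>);

bool bump_x(std::atomic<foo>& a)
{
    foo expected = a.load();
    foo desired  = { expected.x + 1, expected.y };
    // Bytewise (memcmp-like) compare of the stored value against expected;
    // on success, desired is memcpy'd in.
    return a.compare_exchange_weak(expected, desired);
}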
It's implementation-defined. It could just be using a mutex lock, or it could be using some intrinsics on memory blobs. The standard simply defines it such that the latter might work as an implementation strategy.
The compiler doesn't know anything here. It'll all be in the library. Since it's a template, you can go read how your implementation does it.