In his recent talk “Type punning in modern C++” Timur Doumler said that std::bit_cast cannot be used to bit cast a float into an unsigned char[4] because C-style arrays cannot be returned from a function. We should either use std::memcpy or wait until C++23 (or later) when something like reinterpret_cast<unsigned char*>(&f)[i] will become well defined.
In C++20, can we use an std::array with std::bit_cast,
float f = /* some value */;
auto bits = std::bit_cast<std::array<unsigned char, sizeof(float)>>(f);
instead of a C-style array to get bytes of a float?
Yes, this works on all major compilers, and as far as I can tell from looking at the standard, it is portable and guaranteed to work.
First of all, std::array<unsigned char, sizeof(float)> is guaranteed to be an aggregate (https://eel.is/c++draft/array#overview-2). It follows that it holds exactly sizeof(float) unsigned chars inside (typically as an unsigned char[], although as far as I can see the standard doesn't mandate this particular implementation - but it does say the elements must be contiguous) and cannot have any additional non-static data members.
It is therefore trivially copyable, and its size matches that of float as well.
Those two properties allow you to bit_cast between them.
The accepted answer is incorrect because it fails to consider alignment and padding issues.
Per [array]/1-3:
The header <array> defines a class template for storing fixed-size sequences of objects. An array is a contiguous container. An instance of array<T, N> stores N elements of type T, so that size() == N is an invariant.
An array is an aggregate that can be list-initialized with up to N elements whose types are convertible to T.
An array meets all of the requirements of a container and of a reversible container ([container.requirements]), except that a default constructed array object is not empty and that swap does not have constant complexity. An array meets some of the requirements of a sequence container. Descriptions are provided here only for operations on array that are not described in one of these tables and for operations where there is additional semantic information.
The standard does not actually require std::array to have exactly one public data member of type T[N], so in theory it is possible that sizeof(To) != sizeof(From), or that is_trivially_copyable_v<To> is false.
I will be surprised if this doesn't work in practice, though.
Yes.
According to the paper that describes the behaviour of std::bit_cast and its proposed implementation: as long as both types have the same size and are trivially copyable, the cast should succeed.
A simplified implementation of std::bit_cast should be something like:
template <class Dest, class Source>
inline Dest bit_cast(Source const &source) {
    static_assert(sizeof(Dest) == sizeof(Source));
    static_assert(std::is_trivially_copyable<Dest>::value);
    static_assert(std::is_trivially_copyable<Source>::value);

    Dest dest;
    std::memcpy(&dest, &source, sizeof(dest));
    return dest;
}
Since a float (4 bytes) and an array of unsigned char with sizeof(float) elements satisfy all those asserts, the underlying std::memcpy will be carried out. Therefore, each element in the resulting array will be one consecutive byte of the float.
In order to demonstrate this behaviour, I wrote a small example in Compiler Explorer that you can try here: https://godbolt.org/z/4G21zS. The float 5.0 is properly stored as an array of bytes (0x40a00000), which corresponds to the hexadecimal representation of that float in big-endian order.
Related
I was reading here about how to use the byteswap function. I don't understand why bit_cast is actually needed instead of a reinterpret_cast to char*. As I understand it, using that cast we are not violating the strict aliasing rule. I read that the second version below could be wrong because we access unaligned memory. Maybe it could, but at this point I'm a bit confused: if the access is UB due to unaligned memory, when is it possible to manipulate bytes with reinterpret_cast at all? According to the standard, the cast should allow access (read/write) to the memory.
template<std::integral T>
constexpr T byteswap(T value) noexcept
{
    static_assert(std::has_unique_object_representations_v<T>,
                  "T may not have padding bits");
    auto value_representation = std::bit_cast<std::array<std::byte, sizeof(T)>>(value);
    std::ranges::reverse(value_representation);
    return std::bit_cast<T>(value_representation);
}
template<std::integral T>
void byteswap(T& value) noexcept
{
    static_assert(std::has_unique_object_representations_v<T>,
                  "T may not have padding bits");
    char* value_representation = reinterpret_cast<char*>(&value);
    std::reverse(value_representation, value_representation + sizeof(T));
}
The primary reason is that reinterpret_cast can not be used in constant expression evaluation, while std::bit_cast can. And std::byteswap is specified to be constexpr.
If you added constexpr to the declaration in your implementation, it would be ill-formed, no diagnostic required, because there is no specialization of it that could be called as a subexpression of a constant expression.
Without the constexpr it is not ill-formed, but it cannot be called as a subexpression of a constant expression, which std::byteswap is supposed to allow.
Furthermore, there is a defect in the standard:
The standard technically does not allow doing pointer arithmetic on the reinterpret_cast<char*>(&value) pointer (and also doesn't really specify a meaning for reading and writing through such a pointer).
The intention is that the char* pointer should be a pointer into the object representation of the object, considered as an array of characters. But currently the standard just says that the reinterpret_cast<char*>(&value) pointer still points to the original object, not its object representation. See P1839 for a paper proposing to correct the specification to be more in line with the usual assumptions.
The implementation from cppreference also makes an assumption that might not be guaranteed to be true: that std::array<std::byte, sizeof(T)> has the same size as T. Of course, that should hold in practice, and std::bit_cast will fail to compile if it doesn't.
If you want to read some discussion on whether or not it is guaranteed in theory, see the questions std::bit_cast with std::array, Is the size of std::array defined by standard, and What is the sizeof std::array<char, N>?
I recently discovered the vreinterpret{q}_dsttype_srctype casting operator. However, it doesn't seem to support conversion to the array-of-vector data types described at this link (bottom of the page):
Some intrinsics use an array of vector types of the form:
<type><size>x<number of lanes>x<length of array>_t
These types are treated as ordinary C structures containing a single element named val.
An example structure definition is:
struct int16x4x2_t
{
    int16x4_t val[2];
};
Do you know how to convert from uint8x16_t to uint8x8x2_t?
Note that the problem cannot be reliably addressed using a union (reading from an inactive member leads to undefined behaviour. Edit: that's only the case for C++; it turns out that C allows this kind of type punning), nor by casting pointers (which breaks the strict aliasing rule).
It's completely legal in C++ to type pun via pointer casting, as long as you're only doing it to char*. This, not coincidentally, is what memcpy is defined as working on (technically unsigned char* which is good enough).
Kindly observe the following passage:
For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value. [Example:
#define N sizeof(T)
char buf[N];
T obj;
// obj initialized to its original value
std::memcpy(buf, &obj, N);
// between these two calls to std::memcpy,
// obj might be modified
std::memcpy(&obj, buf, N);
// at this point, each subobject of obj of scalar type
// holds its original value
— end example ]
Put simply, copying like this is the intended function of std::memcpy. As long as the types you're dealing with meet the necessary triviality requirements, it's totally legit.
Strict aliasing does not include char* or unsigned char*- you are free to alias any type with these.
Note that for unsigned ints specifically, you have some very explicit leeway here. The C++ Standard requires that they meet the requirements of the C Standard, and the C Standard mandates the format. The only way that trap representations or anything like that can be involved is if your implementation has padding bits, but ARM does not have any: it has 8-bit bytes and integer types without padding bits. So for unsigned integers on implementations with zero padding bits, any byte pattern is a valid unsigned integer.
For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.
Based on your comments, it seems you want to perform a bona fide conversion -- that is, to produce a distinct, new, separate value of a different type. This is a very different thing than a reinterpretation, such as the lead-in to your question suggests you wanted. In particular, you posit variables declared like this:
uint8x16_t a;
uint8x8x2_t b;
// code to set the value of a ...
and you want to know how to set the value of b so that it is in some sense equivalent to the value of a.
Speaking to the C language:
The strict aliasing rule (C2011 6.5/7) says,
An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
a type compatible with the effective type of the object, [...]
an aggregate or union type that includes one of the aforementioned types among its members [...], or
a character type.
(Emphasis added. Other enumerated options involve differently-qualified and differently-signed versions of the effective type of the object or compatible types; these are not relevant here.)
Note that these provisions never interfere with accessing a's value, including its val member, via variable a, and similarly for b. But don't overlook the usage of the term "effective type" -- this is where things can get bollixed up under slightly different circumstances. More on that later.
Using a union
C certainly permits you to perform a conversion via an intermediate union, or you could rely on b being a union member in the first place so as to remove the "intermediate" part:
union {
    uint8x16_t x1;
    uint8x8x2_t x2;
} temp;

temp.x1 = a;
b = temp.x2;
Using a typecast pointer (to produce UB)
However, although it's not so uncommon to see it, C does not permit you to type-pun via a pointer:
// UNDEFINED BEHAVIOR - strict-aliasing violation
b = *(uint8x8x2_t *)&a;
// DON'T DO THAT
There, you are accessing the value of a, whose effective type is uint8x16_t, via an lvalue of type uint8x8x2_t. Note that it is not the cast that is forbidden, nor even, I'd argue, the dereferencing -- it is reading the dereferenced value so as to apply the side effect of the = operator.
Using memcpy()
Now, what about memcpy()? This is where it gets interesting. C permits the stored values of a and b to be accessed via lvalues of character type, and although its arguments are declared to have type void *, this is the only plausible interpretation of how memcpy() works. Certainly its description characterizes it as copying characters. There is therefore nothing wrong with performing a
memcpy(&b, &a, sizeof a);
Having done so, you may freely access the value of b via variable b, as already mentioned. There are aspects of doing so that could be problematic in a more general context, but there's no UB here.
However, contrast this with the superficially similar situation in which you want to put the converted value into dynamically-allocated space:
uint8x8x2_t *c = malloc(sizeof(*c));
memcpy(c, &a, sizeof a);
What could be wrong with that? Nothing is wrong with it, as far as it goes, but here you have UB if afterward you try to access the value of *c. Why? Because the memory to which c points does not have a declared type, so its effective type is the effective type of whatever was last stored in it (if that has an effective type), including if that value was copied into it via memcpy() (C2011 6.5/6). As a result, the object to which c points has effective type uint8x16_t after the copy, whereas the expression *c has type uint8x8x2_t; the strict aliasing rule says that accessing that object via that lvalue produces UB.
So there are a bunch of gotchas here. This answer reflects C++.
First you can convert trivially copyable data to char* or unsigned char* or c++17 std::byte*, then copy it from one location to another. The result is defined behavior. The values of the bytes are unspecified.
If you do this from a value of one type to another via something like memcpy, this can result in undefined behaviour upon access of the target type, unless the target type has valid values for all byte representations, or the layout of the two types is specified by your compiler.
There is the possibility of "trap representations" in the target type -- byte combinations that result in machine exceptions or something similar if interpreted as a value of that type. Imagine a system that doesn't use IEEE floats and where doing math on NaN or INF or the like causes a segfault.
There are also alignment concerns.
In C, I believe that type punning via unions is legal, with similar qualifications.
Finally, note that under a strict reading of the c++ standard, foo* pf = (foo*)malloc(sizeof(foo)); is not a pointer to a foo even if foo was plain old data. You must create an object before interacting with it, and the only way to create an object outside of automatic storage is via new or placement new. This means you must have data of the target type before you memcpy into it.
Do you know how to convert from uint8x16_t to uint8x8x2_t?
uint8x16_t input = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
uint8x8x2_t output = { vget_low_u8(input), vget_high_u8(input) };
One must understand that with neon intrinsics, uint8x16_t represents a 16-byte register; while uint8x8x2_t represents two adjacent 8-byte registers. For ARMv7 these may be the same thing (q0 == {d0, d1}) but for ARMv8 the register layout is different. It's necessary to get (extract) the low 8 bytes and the high 8 bytes of the single 16-byte register using two functions. The clang compiler will determine which instruction(s) are necessary based on the context.
I have recently learned that size_t was introduced to help future-proof code against increases in native bit counts and in available memory. Its specific use seems to be storing the size of something, generally an array.
I now must wonder how far this future-proofing should be taken. Surely it is pointless to have an array length defined using the future-proof and appropriately sized size_t if the very next task, iterating over the array, uses say an unsigned int as the array index:
void processVector(double* vector, size_t vectorLength) {
    for (unsigned int i = 0; i < vectorLength; i++) {
        //...
    }
}
In fact, in this case I might expect that the language should strictly up-convert the unsigned int to a size_t for the relational operator.
Does this imply the iterator variable i should simply be a size_t?
Does this imply that any integer in any program must become functionally identified as to whether it will ever be used as an array index?
Does it imply any code using logic that develops the index programmatically should then create a new result value of type size_t, particularly if the logic relies on potentially signed integer values? i.e.
double foo[100];
//...
int a = 4;
int b = -10;
int c = 50;
int index = a + b + c;
double d = foo[(size_t)index];
Surely though, since my code logic creates a fixed bound, up-converting to size_t provides no additional protection.
You should keep in mind the automatic conversion rules of the language.
Does this imply the iterator variable i should simply be a size_t?
Yes it does, because if size_t is larger than unsigned int and your array is actually larger than can be indexed with an unsigned int, then your variable (i) can never reach the size of the array.
Does this imply that any integer in any program must become functionally identified as to whether it will ever be used as an array index?
You try to make it sound drastic, while it's not. Why do you choose a variable as double and not float? Why would you make one variable unsigned and another not? Why would you make one variable short while another is int? Of course, you always know what your variables are going to be used for, so you decide what types they should get. The choice of size_t is one among many, and it's decided similarly.
In other words, every variable in a program should be functionally identified and given the correct type.
Does it imply any code using logic that develops the index programmatically should then create a new result value of type size_t, particularly if the logic relies on potentially signed integer values?
Not at all. First, if the variable can never have negative values, then it could have been unsigned int or size_t in the first place. Second, if the variable can have negative values during computation, then you should definitely make sure that in the end it's non-negative, because you shouldn't index an array with a negative number.
That said, if you are sure your index is non-negative, then casting it to size_t doesn't make any difference. C11 at 6.5.2.1 says (emphasis mine):
A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2th element of E1 (counting from zero).
Which means whatever type of index for which some_pointer + index makes sense, is allowed to be used as index. In other words, if you know your int has enough space to contain the index you are computing, there is absolutely no need to cast it to a different type.
Surely it is pointless to have an array length defined using the future-proof and appropriately sized size_t if the very next task of iterating over the array uses say an unsigned int as the index array
Yes it is. So don't do it.
In fact in this case I might expect the syntax strictly should up-convert the unsigned int to a size_t for the relation operator.
It will only be promoted in that particular < operation. The upper limit of your unsigned int variable will not change, so the ++ operation will always work on an unsigned int rather than a size_t.
Does this imply the iterator variable i should simply be a size_t?
Does this imply that any integer in any program must become functionally identified as to whether it will ever be used as an array index?
Yeah well, it is better than int... But there is a smarter way to write programs: use common sense. Whenever you declare an array, you can actually stop and consider in advance how many items the array would possibly need to store. If it will never contain more than 100 items, there is absolutely no reason for you to use int nor to use size_t to index it.
In the 100 items case, simply use uint_fast8_t. Then the program is optimized for size as well as speed, and 100% portable.
Whenever declaring a variable, a good programmer will activate their brain and consider the following:
What is the range of the values that I will store inside this variable?
Do I actually need to store negative numbers in it?
In the case of an array, how many values will I need in the worst-case? (If unknown, do I have to use dynamic memory?)
Are there any compatibility issues with this variable if I decide to port this program?
As opposed to a bad programmer, who does not activate their brain but simply types int all over the place.
As discussed by Neil Kirk, iterators are a future-proof counterpart of size_t.
An additional point in your question is the computation of a position, and this typically includes an absolute position (e.g. a in your example) and possibly one or more relative quantities (e.g. b or c), potentially signed.
The signed counterpart of size_t is ptrdiff_t, and the analogue for an iterator type I is typename I::difference_type.
As you describe in your question, it is best to use the appropriate types everywhere in your code, so that no conversions are needed. For memory efficiency, if you have e.g. an array of one million positions into other arrays and you know these positions are in the range 0-255, then you can use unsigned char; but then a conversion is necessary at some point.
In such cases, it is best to name this type, e.g.
using pos = unsigned char;
and make all conversions explicit. Then the code will be easier to maintain, should the range 0-255 increase in the future.
Yep, if you use int to index an array, you defeat the point of using size_t in other places. This is why you can use iterators with STL. They are future proof. For C arrays, you can use either size_t, pointers, or algorithms and lambdas or range-based for loops (C++11). If you need to store the size or index in variables, they will need to be size_t or other appropriate types, as will anything else they interact with, unless you know the size will be small. (For example, if you store the distance between two elements which will always be in a small range, you can use int).
double my_array[10];

for (double *it = my_array, *end_it = my_array + 10; it != end_it; ++it)
{
    // use *it
}

std::for_each(std::begin(my_array), std::end(my_array), [](double& x)
{
    // use x
});

for (auto& x : my_array)
{
    // use x
}
Does this imply that any integer in any program must become functionally identified as to whether it will ever be used as an array index?
I'll pick that point, and say clearly Yes. Besides, in most cases a variable used as an array index is only used as that (or something related to it).
And this rule does not only apply here, but also in other circumstances: There are many use cases where nowadays a special type exists: ptrdiff_t, off_t (which even may change depeding on the configuration we use!), pid_t and a lot of others.
At first one might think std::numeric_limits<size_t>::max(), but if there was an object that huge, could it still offer a one-past-the-end pointer? I guess not. Does that imply the largest value sizeof(T) could yield is std::numeric_limits<size_t>::max()-1? Am I right, or am I missing something?
Q: What is the largest value sizeof(T) can yield?
A: std::numeric_limits<size_t>::max()
Clearly, sizeof cannot return a value larger than std::numeric_limits<size_t>::max(), since it wouldn't fit. The only question is, can it return ...::max()?
Yes. Here is a valid program, that violates no constraints of the C++03 standard, which demonstrates a proof-by-example. In particular, this program does not violate any constraint listed in §5.3.3 [expr.sizeof], nor in §8.3.4 [dcl.array]:
#include <limits>
#include <iostream>

int main () {
    typedef char T[std::numeric_limits<size_t>::max()];
    std::cout << sizeof(T) << "\n";
}
If std::numeric_limits<ptrdiff_t>::max() > std::numeric_limits<size_t>::max() you can compute the size of an object of size std::numeric_limits<size_t>::max() by subtracting a pointer to it from a one-past-the-end pointer.
If sizeof(T*) > sizeof(size_t) you can have enough distinct pointers to address each and every single byte inside that object (in case you have an array of char, for example) plus one for one-past-the-end.
So, it's possible to write an implementation where sizeof can return std::numeric_limits<size_t>::max(), and where you can get pointer to one-past-the-end of an object that large.
It's not exactly well-defined, but to stay within the safe limits of the standard, the max object size is std::numeric_limits<ptrdiff_t>::max().
That's because when you subtract two pointers, you get a ptrdiff_t, which is a signed integer type.
The requirement to be able to point beyond the end of an array has nothing to do with the range of size_t. Given an object x, it's quite possible for (&x)+1 to be a valid pointer, even if the number of bytes separating the two pointers can't be represented by size_t.
You could argue that the requirement does imply an upper bound on object size of the maximum range of pointers, minus the alignment of the object. However, I don't believe the standard says anywhere that such a type can't be defined; it would just be impossible to instantiate one and still remain conformant.
If this was a test, I'd say (size_t) -1
A sizeof() expression yields a value of type size_t. From C99 standard 6.5.3.4:
The value of the result is implementation-defined, and its type (an unsigned integer type) is size_t, defined in <stddef.h> (and other headers).
Therefore, the maximum value that sizeof() can yield is SIZE_MAX.
You can have a standard compliant compiler that allows for object sizes that cause pointer arithmetic to overflow; however, the result is undefined. From the C++ standard, 5.7 [expr.add]:
When two pointers to elements of the same array object are subtracted, the result is the difference of the subscripts of the two array elements. The type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header (18.2). As with any other arithmetic overflow, if the result does not fit in the space provided, the behavior is undefined.
Is the following code 100% portable?
int a=10;
size_t size_of_int = (char *)(&a+1)-(char*)(&a); // No problem here?
std::cout<<size_of_int;// or printf("%zu",size_of_int);
P.S: The question is only for learning purpose. So please don't give answers like Use sizeof() etc
From ANSI-ISO-IEC 14882:2003, p.87 (C++03):
"75) Another way to approach pointer arithmetic is first to convert the pointer(s) to character pointer(s): In this scheme the integral value of the expression added to or subtracted from the converted pointer is first multiplied by the size of the object originally pointed to, and the resulting pointer is converted back to the original type. For pointer subtraction, the result of the difference between the character pointers is similarly divided by the size of the object originally pointed to."
This seems to suggest that the pointer difference equals to the object size.
If we remove the UB'ness from incrementing a pointer to a scalar a and turn a into an array:
int a[1];
size_t size_of_int = (char*)(a+1) - (char*)(a);
std::cout<<size_of_int;// or printf("%zu",size_of_int);
Then this looks OK. The clauses about alignment requirements are consistent with the footnote, provided object sizes are always divisible by their alignment requirements.
UPDATE: Interesting. As most of you probably know, GCC allows specifying an explicit alignment for types as an extension. But I can't break the OP's "sizeof" method with it, because GCC refuses to compile this:
#include <stdio.h>

typedef int a8_int __attribute__((aligned(8)));

int main()
{
    a8_int v[2];
    printf("=>%d\n", (char*)&v[1] - (char*)&v[0]);
}
The message is error: alignment of array elements is greater than element size.
At first glance, &a+1 might seem to lead to undefined behavior according to the C++ Standard 5.7/5:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of
the pointer operand. <...> If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
&a+1 is OK according to 5.7/4:
For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first
element of an array of length one with the type of the object as its element type.
That means that 5.7/5 applies without UB. And finally, remark 75 from 5.7/6, as Luther Blissett noted in his answer, says that the code in the question is valid.
In the production code you should use sizeof instead. But the C++ Standard doesn't guarantee that sizeof(int) will result in 4 on every 32-bit platform.
No. This code won't work as you expect on every platform. At least in theory, there might be a platform with e.g. 24-bit integers (= 3 bytes) but 32-bit alignment. Such alignments are not untypical for (older or simpler) platforms. Then your code would return 4, but sizeof(int) would return 3.
But I am not aware of real hardware that behaves that way. In practice, your code will work on most or all platforms.
It's not 100% portable, for the following reasons:
&a invokes undefined behaviour on objects of register storage class.
In case of alignment restrictions that are larger than or equal to the size of the int type, size_of_int will not contain the correct answer.
Edit: You'd best use int a[1]; then a+1 becomes definitively valid.
Disclaimer:
I am uncertain if the above hold for C++.
Why not just:
size_t size_of_int = sizeof(int);
It is probably implementation-defined.
I can imagine a (hypothetical) system where sizeof(int) is smaller than the default alignment.
It seems only safe to say that size_of_int >= sizeof(int).
The code above will portably compute sizeof(int) on a target platform, but the latter is implementation-defined - you will get different results on different platforms.
Yes, it gives you the equivalent of sizeof(a), but the pointer subtraction itself yields a ptrdiff_t rather than a size_t.
There was a debate on a similar question.
See the comments on my answer to that question for some pointers at why this is not only non-portable, but also is undefined behaviour by the standard.