I was scrolling through some posts and read about something called the strict aliasing rule. It sounded awfully close to some code I've seen in a club project; the relevant snippet is below.
LibSerial::DataBuffer dataBuffer;
size_t BUFFER_SIZE = sizeof(WrappedPacket);
while(true) {
    serial_port.Read(dataBuffer, sizeof(WrappedPacket));
    uint8_t *rawDataBuffer = dataBuffer.data();
    // this part
    auto *wrappedPacket = (WrappedPacket *) rawDataBuffer;
    ...
the struct definitions are:
typedef struct __attribute__((__packed__)) TeensyData {
    int16_t adc0, adc1, adc2, adc3, adc4, adc5, adc6, adc7, adc8, adc9, adc10, adc11;
    int32_t loadCell0;
    double tc0, tc1, tc2, tc3, tc4, tc5, tc6, tc7;
} TeensyData;

typedef struct __attribute__((__packed__)) WrappedPacket {
    TeensyData dataPacket;
    uint16_t packetCRC;
} WrappedPacket;
Hopefully it's pretty obvious that I'm new to C++. So 1) is this a violation of the rule? and 2) if it is, what alternative solutions are there?
Yeah, it's a violation. The rule lets you use raw byte access (char*, unsigned char*, std::byte*) to access other data types. It doesn't let you use other data types to access arrays of raw bytes.
The solution is memcpy:
WrappedPacket wpkt;
std::memcpy(&wpkt, dataBuffer.data(), sizeof wpkt);
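One compile-time check worth pairing with this (my addition, not something the snippet strictly requires): memcpy-based deserialization only makes sense when the destination type is trivially copyable, and that can be asserted once near the struct definitions:
#include <type_traits>
static_assert(std::is_trivially_copyable_v<WrappedPacket>,
              "WrappedPacket must be trivially copyable to deserialize with memcpy");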
Yes, this is a strict aliasing violation and UB. But I wouldn't worry too much about it.
I wouldn't want to memcpy a big struct solely for formal correctness, in hope that the compiler optimizes it out. But I would add std::launder just in case¹:
auto *wrappedPacket = std::launder(reinterpret_cast<WrappedPacket *>(rawDataBuffer));
Here's my reasoning:
.Read most probably boils down to an opaque library call, so the compiler must optimize under the assumption that it does something that makes the code legal. For example, it could apply placement-new to the provided buffer, with the right type.
If we assume that it's the case, then std::launder would "bless" the pointer to point to the said imaginary object, fixing the UB.
We, programmers, know that this assumption is false, but the compiler doesn't.
I'm not sure if C++20 implicit lifetimes relax the rules here and make launder unnecessary. I would keep it.
¹ launder doesn't fix UB here. But it does reduce the probability of it blowing up in your face.
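For illustration only (my own sketch, not part of the answer above): if you did control the read path, the placement-new "blessing" that this reasoning imagines could look roughly like this, assuming WrappedPacket is trivially copyable and the buffer is suitably aligned (trivially true here, since the packed struct has alignment 1):
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical helper: start the lifetime of a WrappedPacket inside the
// received byte buffer without losing the received bytes.
WrappedPacket *bless(uint8_t *rawDataBuffer) {
    alignas(WrappedPacket) uint8_t tmp[sizeof(WrappedPacket)];
    std::memcpy(tmp, rawDataBuffer, sizeof(WrappedPacket)); // save the bytes
    auto *p = new (rawDataBuffer) WrappedPacket;            // create the object in place
    std::memcpy(p, tmp, sizeof(WrappedPacket));             // restore the bytes
    return p;                                               // pointer to a real object
}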
(Disclaimer: At this point, this is mostly academic interest.)
Imagine I have the following external interface, that is, I do not control its code:
// Provided externally: Cannot (easily) change this:
// fill buffer with n floats:
void data_source_external(float* pDataOut, size_t n);
// send n data words from pDataIn:
void data_sink_external(const uint32_t* pDataIn, size_t n);
Is it possible within standard C++ to "move" / "stream" data between these two interfaces without copying?
That is, is there any way to make the following be non-UB, without copying of the data between two correctly typed buffers?
int main()
{
    constexpr size_t n = 64;
    float fbuffer[n];
    data_source_external(fbuffer, n);

    // These hold and can be checked statically:
    static_assert(sizeof(float) == sizeof(uint32_t), "same size");
    static_assert(alignof(float) == alignof(uint32_t), "same alignment");
    static_assert(std::numeric_limits<float>::is_iec559 == true, "IEEE 754");

    // This is clearly UB. Any way to make this work without copying the data?
    const uint32_t* buffer_alias = static_cast<uint32_t*>(static_cast<void*>(fbuffer));
    // **Note**:
    // + reinterpret_cast would also be UB.
    data_sink_external(buffer_alias, n);
    // ...
As far as I can tell the following would be defined behavior, at least with regard to strict aliasing:
...
uint32_t ibuffer[n];
std::memcpy(ibuffer, fbuffer, n * sizeof(uint32_t));
data_sink_external(ibuffer, n);
but given that the ibuffer will have exactly the same bits as the fbuffer this seems quite insane.
Or would we expect optimizing compilers to optimize even this copy away? (In a now deleted comment-like answer a user posted a godbolt link that seems to indicate, at least on first glance, that clang 11 indeed would be able to optimize out the memcpy.)
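(For what it's worth, here is my own C++20 sketch of the same copying approach written as a std::bit_cast; it is formally still a copy, and it additionally assumes sizeof(std::array<float, n>) == sizeof(std::array<uint32_t, n>), which holds on common implementations:)
#include <array>
#include <bit>
#include <cstddef>
#include <cstdint>

void forward_via_bit_cast()
{
    constexpr std::size_t n = 64;
    std::array<float, n> fbuffer{};
    data_source_external(fbuffer.data(), n);

    // Formally still a copy, but one the optimizer can usually see through.
    auto ibuffer = std::bit_cast<std::array<std::uint32_t, n>>(fbuffer);
    data_sink_external(ibuffer.data(), n);
}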
I didn't test and can't comment yet (because I don't have enough reputation), but reinterpret_cast may help in this situation (see its documentation).
Basically it tells the compiler: treat this pointer as if it were the specified type in the cast.
There was a similar question here, but the user in that question seemed to have a much larger array, or vector. If I have:
bool boolArray[4];
And I want to check if all elements are false, I can check [ 0 ], [ 1 ] , [ 2 ] and [ 3 ] either separately, or I can loop through it. Since (as far as I know) false should have value 0 and anything other than 0 is true, I thought about simply doing:
if ( *(int*) boolArray) { }
This works, but I realize that it relies on bool being one byte and int being four bytes. If I cast to (std::uint32_t*) instead, would it be OK, or is it still a bad idea? I just happen to have 3 or 4 bools in an array and was wondering if this is safe, and if not, whether there is a better way to do it.
Also, in the case I end up with more than 4 bools but less than 8 can I do the same thing with a std::uint64_t or unsigned long long or something?
As πάντα ῥεῖ noticed in comments, std::bitset is probably the best way to deal with that in UB-free manner.
std::bitset<4> boolArray {};
if(boolArray.any()) {
    // do the thing
}
If you want to stick to arrays, you could use std::any_of, but this requires a (possibly peculiar-looking) functor which just returns its argument:
bool boolArray[4];

if(std::any_of(std::begin(boolArray), std::end(boolArray), [](bool b){return b;})) {
    // do the thing
}
Type-punning 4 bools to int might be a bad idea - you cannot be sure of the size of each of the types. It probably will work on most architectures, but std::bitset is guaranteed to work everywhere, under any circumstances.
Several answers have already explained good alternatives, particularly std::bitset and std::any_of(). I am writing separately to point out that, unless you know something we don't, it is not safe to type pun between bool and int in this fashion, for several reasons:
int might not be four bytes, as multiple answers have pointed out.
M.M points out in the comments that bool might not be one byte. I'm not aware of any real-world architectures in which this has ever been the case, but it is nevertheless spec-legal. It (probably) can't be smaller than a byte unless the compiler is doing some very elaborate hide-the-ball chicanery with its memory model, and a multi-byte bool seems rather useless. Note however that a byte need not be 8 bits in the first place.
int can have trap representations. That is, it is legal for certain bit patterns to cause undefined behavior when they are cast to int. This is rare on modern architectures, but might arise on (for example) ia64, or any system with signed zeros.
Regardless of whether you have to worry about any of the above, your code violates the strict aliasing rule, so compilers are free to "optimize" it under the assumption that the bools and the int are entirely separate objects with non-overlapping lifetimes. For example, the compiler might decide that the code which initializes the bool array is a dead store and eliminate it, because the bools "must have" ceased to exist* at some point before you dereferenced the pointer. More complicated situations can also arise relating to register reuse and load/store reordering. All of these infelicities are expressly permitted by the C++ standard, which says the behavior is undefined when you engage in this kind of type punning.
You should use one of the alternative solutions provided by the other answers.
* It is legal (with some qualifications, particularly regarding alignment) to reuse the memory pointed to by boolArray by casting it to int and storing an integer, although if you actually want to do this, you must then pass boolArray through std::launder if you want to read the resulting int later. Regardless, the compiler is entitled to assume that you have done this once it sees the read, even if you don't call launder.
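A sketch of what that footnote describes (my illustration, not part of the answer), assuming suitably sized and aligned storage:
#include <new>

void reuse_bool_storage()
{
    alignas(int) bool boolArray[sizeof(int)] = {};

    new (boolArray) int{42};  // reuses the storage; the bools' lifetimes end here
    // Re-deriving a pointer from boolArray afterwards requires std::launder:
    int value = *std::launder(reinterpret_cast<int *>(boolArray));
    (void)value;
}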
You can use std::bitset<N>::any:
any() returns true if any of the bits are set to true, otherwise false.
#include <iostream>
#include <bitset>
int main ()
{
    std::bitset<4> foo;

    // modify foo here

    if (foo.any())
        std::cout << foo << " has " << foo.count() << " bits set.\n";
    else
        std::cout << foo << " has no bits set.\n";

    return 0;
}
If you want to check whether all of the bits are set, or none of them are, you can use std::bitset<N>::all or std::bitset<N>::none respectively.
The standard library has what you need in the form of the std::all_of, std::any_of, std::none_of algorithms.
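For the original "are they all false?" check, that could look like this (a short sketch using the array from the question):
#include <algorithm>
#include <iterator>

bool all_false(const bool (&bs)[4])
{
    // none_of: true when no element satisfies the predicate, i.e. every bool is false
    return std::none_of(std::begin(bs), std::end(bs), [](bool b) { return b; });
}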
...And for the obligatory "roll your own" answer, we can provide a simple "or"-like function for any array bool[N], like so:
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
    for (bool b : bs) {
        if (b) { return b; }
    }
    return false;
}
Or more concisely,
template<size_t N>
constexpr bool or_all(const bool (&bs)[N]) {
    for (bool b : bs) { if (b) { return b; } }
    return false;
}
This also has the benefit of both short-circuiting like ||, and being optimised out entirely if calculable at compile time.
Apart from that, if you want to examine the original idea of type-punning bool[N] to some other type to simplify observation, I would very much recommend that you don't do that; view it as char[N2] instead, where N2 == (sizeof(bool) * N). This would allow you to provide a simple representation viewer that can automatically scale to the viewed object's actual size, allow iteration over its individual bytes, and allow you to more easily determine whether the representation matches specific values (such as, e.g., zero or non-zero). I'm not entirely sure off the top of my head whether such examination would invoke any UB, but I can say for certain that any such type's construction cannot be a viable constant expression, due to requiring a reinterpret_cast to char* or unsigned char* or similar (either explicitly, or in std::memcpy()), and thus couldn't as easily be optimised out.
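A minimal sketch of that byte-view idea (my own illustration; reading an object's representation through unsigned char is one of the accesses the aliasing rules explicitly allow):
#include <cstddef>
#include <cstdio>

template <typename T>
void dump_bytes(const T &obj)
{
    // View obj as a sequence of bytes and print them in hex.
    const unsigned char *p = reinterpret_cast<const unsigned char *>(&obj);
    for (std::size_t i = 0; i < sizeof obj; ++i)
        std::printf("%02x ", p[i]);
    std::printf("\n");
}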
There have previously been some great answers on memory alignment, but I feel they don't completely answer some questions.
E.g.:
What is data alignment? Why and when should I be worried when typecasting pointers in C?
What is aligned memory allocation?
I have an example program:
#include <iostream>
#include <vector>
#include <cstring>
#include <cstdint>

int32_t cast_1(int offset) {
    std::vector<char> x = {1,2,3,4,5};
    return reinterpret_cast<int32_t*>(x.data()+offset)[0];
}

int32_t cast_2(int offset) {
    std::vector<char> x = {1,2,3,4,5};
    int32_t y;
    std::memcpy(reinterpret_cast<char*>(&y), x.data() + offset, 4);
    return y;
}

int main() {
    std::cout << cast_1(1) << std::endl;
    std::cout << cast_2(1) << std::endl;
    return 0;
}
The cast_1 function outputs a ubsan alignment error (as expected) but cast_2 does not. However, cast_2 looks much less readable to me (requires 3 lines). cast_1 looks perfectly clear on the intent, even though it is UB.
Questions:
1) Why is cast_1 UB, when the intent is perfectly clear? I understand that there may be performance issues with alignment.
2) Is cast_2 a correct approach to fixing the UB of cast_1?
1) Why is cast_1 UB?
Because the language rules say so. Multiple rules in fact.
The offset where you access the object does not meet the alignment requirements of int32_t (except on systems where the alignment requirement is 1). No objects can be created without conforming to the alignment requirement of the type.
An object of type char may not be accessed through an int32_t pointer (the other direction, accessing an int32_t through a char pointer, is allowed).
2) Is cast_2 a correct approach to fixing the UB of cast_1?
cast_2 has well-defined behaviour. The reinterpret_cast in that function is redundant, and it is bad to use magic constants (use sizeof instead of 4).
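A sketch of cast_2 with both points addressed (no cast, sizeof instead of the magic 4):
int32_t cast_2(int offset) {
    std::vector<char> x = {1,2,3,4,5};
    int32_t y;
    std::memcpy(&y, x.data() + offset, sizeof y);
    return y;
}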
WRT the first question, it would be trivial for the compiler to handle that for you, true. All it would have to do is pessimize every other non-char load in the program.
The alignment rules were written precisely so the compiler can generate code that performs well on the many platforms where aligned memory access is a fast native op, and misaligned access is the equivalent of your memcpy. Except where it could prove alignment, the compiler would have to handle every load the slow & safe way.
I want to split large variables like floats into byte segments and send these serially byte by byte via UART. I'm using C/C++.
One method could be to deep-copy the value I want to send into a union and then send it. I think that would be 100% safe but slow. The union would look like this:
union mySendUnion
{
    mySendType sendVal;
    char sendArray[sizeof(mySendType)];
};
Another option could be to cast the pointer to the value I want to send, into a pointer to a particular union. Is this still safe?
The third option could be to cast the pointer to the value I want to send to a char*, and then increment that pointer like this:
sendType myValue = 443.2;
char* sendChar = (char*)myValue;
for(char i = 0; i < sizeof(sendType); i++)
{
    Serial.write(*(sendChar + i)); // send one byte per iteration
}
I've had success with the above pointer arithmetic, but I'm not sure if it's safe under all circumstances. My concern is: what if we, for instance, are using a 32-bit processor and want to send a float? Suppose the compiler chooses to store this 32-bit float in one memory cell, but stores only a single char in each 32-bit cell.
Each counter increment would then advance the pointer by one whole memory cell, and we would miss the float.
Is there something in the C standard that prevents this, or could this be an issue with a certain compiler?
First off, you can't write your code in "C/C++". There's no such language as "C/C++", as they are fundamentally different languages. As such, the answer regarding unions differs radically.
As to the title:
Are casts as safe as unions?
No, generally they aren't, because of the strict aliasing rule. That is, if you type-pun a pointer of one certain type with a pointer to an incompatible type, it will result in undefined behavior. The only exception to this rule is when you read or manipulate the byte-wise representation of an object by aliasing it through a pointer to (signed or unsigned) char. As in your case.
Unions, however, are quite different bastards. Type punning by writing to one union member and reading from another is permitted in C99 and later, but results in undefined behavior in C89 and all versions of C++.
In one direction, you can also safely type pun (in C99 and later) using a pointer to union, if you have the original union as an actual object. Like this:
union p {
    char c[sizeof(float)];
    float f;
} pun;

union p *punPtr = &pun;
punPtr->f = 3.14;
send_bytes(punPtr->c, sizeof(float));
Because "a pointer to a union points to all of its members and vice versa" (C99, I don't remember the exact pargraph, it's around 6.2.5, IIRC). This isn't true in the other direction, though:
float f = 3.14;
union p *punPtr = (union p *)&f;
send_bytes(punPtr->c, sizeof(float)); // triggers UB!
To sum up: the following code snippet is valid in C89, C99, C11 and C++:
float f = 3.14;
char *p = (char *)&f;
size_t i;
for (i = 0; i < sizeof f; i++) {
    send_byte(p[i]); // hypothetical function
}
The following is only valid in C99 and later:
union {
    char c[sizeof(float)];
    float f;
} pun;

pun.f = 3.14;
send_bytes(pun.c, sizeof(float)); // another hypothetical function
The following, however, would not be valid:
float f = 3.14;
unsigned *u = (unsigned *)&f;
printf("%u\n", *u); // undefined behavior triggered!
Another solution that is always guaranteed to work is memcpy(). The memcpy() function does a bytewise copy between two objects. (Don't get me started on it being "slow" -- in most modern compilers and stdlib implementations, it's an intrinsic function.)
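For example (a sketch; send_bytes is the same hypothetical function as above):
float f = 3.14;
unsigned char buf[sizeof f];
memcpy(buf, &f, sizeof f); // bytewise copy of the object representation
send_bytes(buf, sizeof f);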
A general advice when sending floating point data on a byte stream would be to use some serialization technology, to ensure that the data format is well defined (and preferably architecture neutral, beware of endianness issues!).
You could use XDR (or perhaps ASN.1), which is a binary format (see xdr(3) for more). For C++, see also libs11n.
Unless speed or data size is very critical, I would suggest instead a textual format like JSON or perhaps YAML (textual formats are more verbose, but easier to debug and to document). There are several good libraries supporting it (e.g. jsoncpp for C++ or jansson for C).
Notice that serial ports are quite slow (relative to the CPU), so the serialization processing time is negligible.
Whatever you do, please document the serialization format (even for an internal project).
The cast to [[un]signed] char [const] * is legal and it won't cause issues when reading, so that is a fine option (that is, after fixing char *sendChar = reinterpret_cast<char*>(&myValue);, and while you are at it, make it const).
Now the next problem comes on the other side, when reading, as you cannot safely use the same approach for reading. In general, the cost of copying the variables is much less than the cost of sending over the UART, so I would just use the union when reading out of the serial.
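For reference, a sketch of the sending loop with that fix applied (sendType, myValue and Serial.write are taken from the question):
sendType myValue = 443.2;
const char *sendChar = reinterpret_cast<const char *>(&myValue);
for (size_t i = 0; i < sizeof myValue; i++)
{
    Serial.write(sendChar[i]); // one byte per iteration
}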
I had a discussion this morning with a colleague regarding the correctness of a "coding trick" to detect endianness.
The trick was:
bool is_big_endian()
{
    union
    {
        int i;
        char c[sizeof(int)];
    } foo;

    foo.i = 1;
    return (foo.c[0] == 1);
}
To me, it seems that this usage of a union is incorrect because setting one member of the union and reading another is not well-defined. But I have to admit that this is just a feeling and I lack actual proof to strengthen my point.
Is this trick correct? Who is right here?
Your code is not portable. It might work on some compilers or it might not.
You are right about the behaviour being undefined when you try to access the inactive member of the union [as it is in the case of the code given]
§9.5/1
In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.
So foo.c[0] == 1 is incorrect because c is not active at that moment. Feel free to correct me if you think I am wrong.
Don't do this, better use something like the following:
#include <arpa/inet.h>
//#include <winsock2.h> // <-- for Windows use this instead
#include <stdint.h>
bool is_big_endian() {
    uint32_t i = 1;
    return i == htonl(i);
}
Explanation:
The htonl function converts a u_long from host to TCP/IP network byte order (which is big-endian).
References:
http://linux.die.net/man/3/htonl
http://msdn.microsoft.com/de-de/library/ms738556%28v=vs.85%29.aspx
You're correct that that code doesn't have well-defined behavior. Here's how to do it portably:
#include <cstring>
bool is_big_endian()
{
    static unsigned const i = 1u;
    char c[sizeof(unsigned)] = { };
    std::memcpy(c, &i, sizeof(c));
    return !c[0]; // on a big-endian machine the first byte of 1u is 0
}

// or, alternatively
bool is_big_endian()
{
    static unsigned const i = 1u;
    return !*static_cast<char const*>(static_cast<void const*>(&i));
}
The function should be named is_little_endian. I think you can use this union trick, or alternatively a cast to char.
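A sketch of the "cast to char" variant (reading the object representation through unsigned char is explicitly allowed by the aliasing rules):
bool is_little_endian()
{
    unsigned int i = 1u;
    // The first byte of the representation is 1 exactly on little-endian targets.
    return *reinterpret_cast<const unsigned char *>(&i) == 1;
}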
The code has undefined behavior, although some (most?) compilers will define it, at least in limited cases.
The intent of the standard is that reinterpret_cast be used for this. This intent isn't well expressed, however, since the standard can't really define the behavior; there is no desire to define it when the hardware won't support it (e.g. because of alignment issues). And it's also clear that you can't just reinterpret_cast between two arbitrary types and expect it to work.
From a quality of implementation point of view, I would expect both the union trick and reinterpret_cast to work, if the union or the reinterpret_cast is in the same functional block; the union should work as long as the compiler can see that the ultimate type is a union (although I've used compilers where this wasn't the case).