I am trying to understand the strict aliasing rule. By reading the answer to What is the strict aliasing rule , I have some further questions. Here I borrow the example from that post:
struct Msg { uint32_t a, b; }
void SendWord(uint32_t);
void SendMessage(uint32_t* buff, uint32_t buffSize)
{
for (uint32_t i = 0; i < buffSize; ++i) SendWord(buff[i]);
}
Now consider the following code (A):
uint32_t *buff = (uint32_t*)malloc(sizeof(Msg));
std::memset(buff, 0, sizeof(Msg));
Msg *msg = (Msg*)buff; // Undefined behavior.
for (uint32_t i = 0; i < 10; ++i)
{
msg->a = i;
msg->b = i + 1;
SendMessage(buff, 2);
}
free(buff);
For the above code, the author explained what might happen due to the UB. The message sent could contain all 0s: during optimization, the compiler may assume msg and buff point to disjoint memory blocks, and then decide the writes to msg do not affect buff.
But how about the following code (B):
uint32_t *buff = (uint32_t*)malloc(sizeof(Msg));
std::memset(buff, 0, sizeof(Msg));
unsigned char *msg = (unsigned char*)buff; // Compliant.
for (uint32_t i = 0; i < sizeof(Msg); ++i)
{
msg[i] = i + 1;
SendMessage(buff, 2);
}
free(buff);
Is the sent message guaranteed to be as intended (as if the strict aliasing complier flag is off) ? If so, is it simply because a *char (msg) points to the same place as buff, the compiler should and will notice it, and refrain from the possible, aforementioned optimization in (A) ?
Yet then I read another comment under the post Strict aliasing rule and 'char *' pointers, saying that it is UB to use the *char pointer to write the referenced object. So again, code (B) could still result in similar, unexpected behavior ?
First of all , the answer https://stackoverflow.com/a/99010/1505939 is applying to C (and is in fact completely wrong, but that's another story). You ask about C++ and the strict aliasing rule is set up differently in C than C++. So that answer has nothing to do with this question.
Prior to C++20, both versions of your code cause undefined behaviour (by omission) as the behaviour of the assignment operator is only defined for the case of writing to an object. The malloc function in C++ allocates space but does not create objects within that space. The task should be approached using various other constructs such as new, or higher level containers, which do both allocate space and create objects within the space ready for writing.
Trying to analyze this code (pre-C++20) in context of the C++ strict aliasing rule is not possible because the definition of the rule is about accessing the stored value of an object but in this code it is not accessing any object, since no object was created.
Since C++20 there is a new provision (N4860 [intro.object]/10) that objects can be implicitly created by the assignment operator, if there exists such a possible combination of objects that would make the code well-defined. (Otherwise the behaviour remains undefined).
Under this change to the object model, both of your code samples are well-defined. In (A) there can be implicitly-created uint32_t objects in the space, and in (B) there can be implicitly-created unsigned char objects in the space. Since your code does not write as one type and then read as a different type, there is no possibility of an aliasing violation.
The existence of intermediate pointers of various types such as buff has no bearing on strict aliasing (in either language) -- the rule is strictly about how the space is read and written; not about how we got to the space.
Related
Say we declared a char* buffer:
char *buf = new char[sizeof(int)*4]
//to delete:
delete [] buf;
or a void* buffer:
void *buf = operator new(sizeof(int)*4);
//to delete:
operator delete(buf);
How would they differ if they where used exclusively with the purpose of serving as pre-allocated memory?- always casting them to other types(not dereferencing them on their own):
int *intarr = static_cast<int*>(buf);
intarr[1] = 1;
Please also answer if the code above is incorrect and the following should be prefered(only considering the cases where the final types are primitives like int):
int *intarr = static_cast<int*>(buf);
for(size_t i = 0; i<4; i++){
new(&intarr[i]) int;
}
intarr[1] = 1;
Finally, answer if it is safe to delete the original buffer of type void*/char* once it is used to create other types in it with the latter aproach of placement new.
It is worth clarifying that this question is a matter of curiosity. I firmly believe that by knowing the bases of what is and isnt possible in a programming language, I can use these as building blocks and come up with solutions suitable for every specific use case when I need to in the future. This is not an XY question, as I dont have a specific implementation of code in mind.
In any case, I can name a few things I can relate to this question off the top of my head(pre-allocated buffers specifically):
Sometimes you want to make memory buffers for custom allocation. Sometimes even you want to align these buffers to cache line boundaries or other memory boundaries. Almost always in the name of more performance and sometimes by requirement(e.g. SIMD, if im not mistaken). Note that for alignment you could use std::aligned_alloc()
Due to a technicality, delete [] buf; may be considered to be UB after buf has been invalidated due to the array of characters being destroyed due to the reuse of the memory.
I wouldn't expect it to cause problems in practice, but using raw memory from operator new doesn't suffer from this technicality and there is no reason to not prefer it.
Even better is to use this:
T* memory = std::allocator<T>::allocate(n);
Because this will also work for overaligned types (unlike your suggestions) and it is easy to replace with a custom allocator. Or, simply use std::vector
for(size_t i = 0; i<4; i++){
new(&intarr[i]) int;
}
There is a standard function for this:
std::uninitialized_fill_n(intarr, 4, 0);
OK, it's slightly different that it initialises with an actual value, but that's probably better thing to do anyway.
For the code(Full demo) like:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
test->ch[0] = 'a';
test->ch[1] = 'b';
test->ch[2] = 'c';
test->ch[3] = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
I need to ignore the compilation warning like
warning: array subscript 1 is above array bounds of 'volatile char 1' [-Warray-bounds]
which is raised by gcc8.2 compiler:
g++ -O2 -Warray-bounds=2 main.cpp
A method to ignore this warning is to use pointer to operate the four bytes characters like:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
// Use pointer to avoid the warning
volatile char *ptr = test->ch;
*ptr = 'a';
*(ptr + 1) = 'b';
*(ptr + 2) = 'c';
*(ptr + 3) = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
But I can not figure out why that works to use pointer instead of subscript array. Is it because pointer do not have boundary checking for which it point to? Can anyone explain that?
Thanks.
Background:
Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow from memory view
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
Due to padding and alignment of memory for struct, though ch[1]
– ch[3] in struct A is out of declared array boundary, it is
still not overflow for memory view, so we want to ignore this warning.
C++ does not work the way you think it does. You are triggering undefined behavior. When your code triggers undefined behavior, the C++ standard places no requirement on its behavior. A version of GCC attempts to start some video games when certain kind of undefined behavior is encountered. Anthony Williams also knows at least one case where a particular instance of undefined behavior caused someone's monitor to catch on fire. (C++ Concurrency in Action, page 106) Your code may appear to be working at this very time and situation, but that is just an instance of undefined behavior and you cannot count on it. See Undefined, unspecified and implementation-defined behavior.
The correct way to suppress this warning is to write correct C++ code with well-defined behavior. In your case, declaring ch as char ch[4]; solves the problem.
The standard specifies this as undefined behavior in [expr.add]/4:
When an expression J that has integral type is added to or
subtracted from an expression P of pointer type, the result has the
type of P.
If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]),78 the expressions P +
J and J + P (where J has the value j) point to the
(possibly-hypothetical) array element i + j of x if
0 ≤ i + j ≤ n and the
expression P - J points to the (possibly-hypothetical) array element
i − j of x if 0 ≤ i − j ≤ n.
Otherwise, the behavior is undefined.
78) An object that is not an array element is
considered to belong to a single-element array for this purpose; see
[expr.unary.op]. A pointer past the last element of an array x of
n elements is considered to be equivalent to a pointer to a hypothetical array element n for this purpose; see
[basic.compound].
I want to avoid the warning like
warning: array subscript 1 is above array bounds of 'volatile char 1' [-Warray-bounds]
Well, it is probably better to fix the warning, not just avoid it.
The warning is actually telling you something: what you are doing is undefined behavior. Undefined behavior is really bad (it allows your program to literally anything!) and should be fixed.
Let's look at your struct again:
struct A
{
int a;
char ch[1];
};
In C++, your array has only one element in it. The standard only guarantees array elements of 0 through N-1, where N is the size of the array:
[dcl.array]
...If the value of the constant expression is N, the array
has N elements numbered 0 to N-1...
So ch only has the elements 0 through 1-1, or elements 0 through 0, which is just element 0. That means accessing ch[1], ch[2] overruns the buffer, which is undefined behavior.
Due to padding and alignment of memory for struct, though ch1-ch3 in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.
Umm, if you say so. The example you gave only allocated 1 A, so as far as we know, there is still only space for the 1 character. If you do allocate more than 1 A at a time in your real program, then I suppose this is possible. But that's still probably not a good thing to do. Especially since you might run into int a of the next A if you're not careful.
A solution to ignore this warning is to use pointer...But I can not figure out why that works. Is it because pointer do not have boundary checking for which it point?
Probably. That would be my guess too. Pointers can point to anything (including destroyed data or even nothing at all!), so the compiler probably won't check it for you. The compiler may not even have a way of knowing whether the memory you point to is valid or not (or may just not care), and, thus, may not even have a way to warn you, much less will warn you. Its only choice is to trust you, so I'm guessing that's why there's no warning.
Why don't we just declare the ch to ch4 in struct A to avoid this warning?
Side issue: actually std::string is probably a better choice here if you don't know how many characters you want to store in here ahead of time--assuming it's different for every instance of A. Anyway, moving on:
Why don't we just declare the ch to ch4 in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
I'm not sure I understand your design principle completely, but it sounds like std::vector might be a better option. Then, size is kept track of automatically by the std::vector, and you know that everything is stored in ch. To access it, it would be something like:
myVec[i].ch[0]
I don't know all your constraints for your situation, but it sounds like a better solution instead of walking the line around undefined behavior. But that's just me.
Finally, I should mention that if you are still really interested in ignoring our advice, then I should mention that you still have the option to turn off the warning, but again, I'd advise not doing that. It'd be better to fix A if you can, or get a better use strategy if you can't.
There really is no way to work with this cleanly in C++ and iirc the type (a dynamically sized struct) isn't actually properly formed in C++. But you can work with it because compilers still try to preserve compatibility with C. So it works in practice.
You can't have a value of the struct, only references or pointers to it. And they must be allocated by malloc() and released by free(). You can't use new and delete. Below I show you a way that only allows you to allocate pointers to variable sized structs given the desired payload size. This is the tricky bit as sizeof(Buf) will be 16 (and not 8) because Buf::buf must have a unique address. So here we go:
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <new>
#include <iostream>
#include <memory>
struct Buf {
size_t size {0};
char buf[];
[[nodiscard]]
static Buf * alloc(size_t size) {
void *mem = malloc(offsetof(Buf, buf) + size);
if (!mem) throw std::bad_alloc();
return std::construct_at(reinterpret_cast<Buf*>(mem), AllocGuard{}, size);
}
private:
class AllocGuard {};
public:
Buf(AllocGuard, size_t size_) noexcept : size(size_) {}
};
int main() {
Buf *buf = Buf::alloc(13);
std::cout << "buffer has size " << buf->size << std::endl;
}
You should delete or implement the assign/copy/move constructors and operators as desired. A another good idea would be to use std::uniq_ptr or std::shared_ptr with a Deleter that calls free() instead of returning a naked pointer. But I leave that as exercise to the reader.
There are problems, where we need to fill buffers with mixed types. Two examples:
programming OpenGL/DirectX, we need to fill vertex buffers, which can have mixed types (which is basically an array of struct, but the struct maybe described by a run-time data)
creating a memory allocator: putting header/trailer information to the buffer (size, flags, next/prev pointer, sentinels, etc.)
The problem can be described like this:
there is an allocation function, which gives back some memory (new, malloc, OS dependent allocation function, like mmap or VirtualAlloc)
there is a need to put mixed types into an allocated buffer, at various offsets
A solution can be this, for example writing an int to an offset:
void *buffer = <allocate>;
int offset = <some_offset>;
char *ptr = static_cast<char*>(buffer);
*reinterpret_cast<int*>(ptr+offset) = int_value;
However, this is inconvenient, and has UB at least two places:
ptr+offset is UB, as there is no char array at ptr
writing to the result of reinterpret_cast is UB, as there is no int there
To solve the inconvenience problem, this solution is often used:
union Pointer {
void *asVoid;
bool *asBool;
byte *asByte;
char *asChar;
short *asShort;
int *asInt;
Pointer(void *p) : asVoid(p) { }
};
So, with this union, we can do this:
Pointer p = <allocate>;
p.asChar += offset;
*p.asInt++ = int_value; // write an int to offset
*p.asShort++ = short_value; // then a short afterwards
// other writes here
This solution is convenient for filling buffers, but has further UB, as the solution uses non-active union members.
So, my question is: how can one solve this problem in a strictly standard conformant, and most convenient way? I mean, I'd like to have the functionality which the union solution gives me, but in a standard conformant way.
(Note: suppose, that we have no alignment issues here, alignment is taken care of by using proper offsets)
A simple (and conformant) way to handle these things is leveraging std::memcpy to move whatever values you need into the correct offsets in your storage area, e.g.
std::int32_t value;
char *ptr;
int offset;
// ...
std::memcpy(ptr+offset, &value, sizeof(value));
Do not worry about performance, since your compiler will not actually perform std::memcpy calls in many cases (e.g. small values). Of course, check the assembly output (and profile!), but it should be fine in general.
My goal is to allocate a single chunk of memory and then partition it into smaller arrays of different types. I have a few questions about the code I've written here:
#include <iostream>
#include <cstdint>
#include <cstdlib>
int main() {
constexpr std::size_t array_count = 5;
constexpr std::size_t elements_size = sizeof(std::uint32_t) + sizeof(std::uint16_t);
void* const pv = std::calloc(array_count, elements_size);
//Partition the memory. p32 array starts at pv, p16 array starts after the 20 byte buffer for the p32 array.
std::uint32_t* const p32 = (std::uint32_t *) pv;
std::uint16_t* const p16 = (std::uint16_t *)((char *) pv + sizeof(std::uint32_t) * array_count);
//Initialize values.
for(std::size_t i = 0; i < array_count; ++i) {
p32[i] = i;
p16[i] = i * 2;
}
//Read them back.
for(std::size_t i = 0; i < array_count; ++i) {
std::cout << p32[i] << std::endl;
std::cout << p16[i] << std::endl;
std::cout << std::endl;
}
std::free(pv);
}
Does this code violate c++'s strict aliasing rules? I'm having trouble finding resources on aliasing when casting pointers from a malloc or calloc call. The p32 and p16 pointers should never overlap.
If I reverse the positioning of the two arrays where p16 started at pv, and p32 had a 10 byte offset from pv this could cause a segfault because uint32_t is aligned to the 4 byte boundary pv + 10 could be on the 2 byte boundary, right?
Is this program unsafe, or introduce any undefined behavior that I'm missing in general? I get the expected output on my local machine, but of course that doesn't mean my code is correct.
Yes, the program is UB. When you do this:
for(std::size_t i = 0; i < array_count; ++i) {
p32[i] = i;
p16[i] = i * 2;
}
There are no uint32_t or uint16_t objects that p32 or p16 point to. calloc just gives you bytes, not objects. You can't just reinterpret_cast objects into existence. On top of that, indexing is only defined for arrays, and p32 does not point to an array.
To make it well defined, you'd have to create an array object. However, placement-new for arrays is broken, so you're left with manually initializing a bunch of uint32_ts like:
auto p32 = reinterpret_cast<uint32_t*>(pv);
for (int i = 0; i < array_count; ++i) {
new (p32+i) uint32_t; // NB: this does no initialization, but it does satisfy
// [intro.object] in actually creating an object
}
This would then run into a separate issue: CWG 2182. Now we have array_count uint32_ts, but we don't have a uint32_t[array_count] so indexing is still UB. Basically, there's just no way in purely-by-the-letter-of-the-standard C++ to write this. See also my similar question on the topic.
That said, the amount of code that does this in the wild is tremendous and every implementation will allow you to do it.
I am only going to address Strict Aliasing part of the question.
C++ standard talks very little about malloc - mostly mentions it has semantic defined in C. In strict reading of C++ standard, there is no aliasing rule violation because there is no object which is aliased - in C++, lifetime of the object begins after it has been constructed, and no object has been constructed by malloc call.
As a result, this is something which is simply unspecified by Standard (as opposed by undefined).
No it's just normal cast of malloc return value and handling allocated memory. You can reinterpret these bytes to whatever datatype you want. Until you touch memory behind borders of allocated block.
No
I know you are just asking and trying, but you should allocate memory typically.
std::uint32_t* const p32 = std::calloc(array_count, sizeof(std::uint32_t));
std::uint16_t* const p16 = std::calloc(array_count, sizeof(std::uint16_t));
Is this program unsafe?
In your example, I think there is no problem with exceptions at all, but I suppose you intend to use it in another context. If your code throws an exception, you probably are going to leak some memory.
Do you really need to use calloc/free? In the cppcore guidelines you can find some guidelines about resource management in C++:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#S-resource
For instance, R10 and R11 say "not to call explicitly malloc/calloc/free, new/delete": https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rr-mallocfree
Only if you are doing something that really needs very low level code, you need to call new and delete explicitly, but the C++ way is to encapsulate all this calls inside a class, so the user of the class doesn't need to call explicitly new and delete.
But if you are not doing some low level stuff, have you consider to use std::array<>? or std::vector?
Let us not discuss the badness of the following code, it's not mine, and I fully
agree with you in advance that it's not pretty, rather C-ish and potentially
very dangerous:
void * buf = std::malloc(24 + sizeof(int[3]));
char * name = reinterpret_cast<char *>(buf);
std::strcpy(name, "some name");
int * values = reinterpret_cast<int *>(name + 24);
values[0] = 0; values[1] = 13; values[2] = 42;
Its intent is quite clear; it's a "byte block" storing two arrays of different
types. To access elements not in front of the block, it interprets the block as
char * and increments the pointer by sizeof(type[len]).
My question is however, is it legal C++(11), as in "is it guaranteed that it
will work on every standard conforming compiler"? My intuition says it's not, however g++ and clang seem to be fine
with it.
I would appreciate a standard quote on this one; unfortunately I
was not able to find a related passage myself.
This is perfectly valid C++ code (not nice though, as you noted yourself). As long as the string is not longer than 23 characters, it is not even in conflict with the strict aliasing rules, because you never access the same byte in memory through differently typed pointers. However, if the string exceeds the fixed limit, you have undefined behaviour like any other out of bounds bug.
Still, I'd recommend to use a structure, at the very least:
typedef struct whatever {
char name[24];
int [3];
} whatever;
whatever* myData = new myData;
std::strcpy(myData->name, "some name");
myData->values[0] = 0; myData->values[1] = 13; myData->values[2] = 42;
This is 100% equivalent to the code you gave, except for a bit more overhead in the new operator as opposed to directly calling malloc(). If you are worried about performance, you can still do whatever* myData = (whatever*)std::malloc(sizeof(*myData)); instead of using new.