My goal is to allocate a single chunk of memory and then partition it into smaller arrays of different types. I have a few questions about the code I've written here:
#include <iostream>
#include <cstdint>
#include <cstdlib>
int main() {
constexpr std::size_t array_count = 5;
constexpr std::size_t elements_size = sizeof(std::uint32_t) + sizeof(std::uint16_t);
void* const pv = std::calloc(array_count, elements_size);
//Partition the memory. p32 array starts at pv, p16 array starts after the 20 byte buffer for the p32 array.
std::uint32_t* const p32 = (std::uint32_t *) pv;
std::uint16_t* const p16 = (std::uint16_t *)((char *) pv + sizeof(std::uint32_t) * array_count);
//Initialize values.
for(std::size_t i = 0; i < array_count; ++i) {
p32[i] = i;
p16[i] = i * 2;
}
//Read them back.
for(std::size_t i = 0; i < array_count; ++i) {
std::cout << p32[i] << std::endl;
std::cout << p16[i] << std::endl;
std::cout << std::endl;
}
std::free(pv);
}
Does this code violate C++'s strict aliasing rules? I'm having trouble finding resources on aliasing when casting pointers returned by a malloc or calloc call. The p32 and p16 pointers should never overlap.
If I reversed the positions of the two arrays, so that p16 started at pv and p32 started at a 10-byte offset from pv, could this cause a segfault? uint32_t is aligned to a 4-byte boundary, and pv + 10 might only be on a 2-byte boundary, right?
Is this program unsafe, or does it introduce any undefined behavior that I'm missing in general? I get the expected output on my local machine, but of course that doesn't mean my code is correct.
Yes, the program is UB. When you do this:
for(std::size_t i = 0; i < array_count; ++i) {
p32[i] = i;
p16[i] = i * 2;
}
There are no uint32_t or uint16_t objects that p32 or p16 point to. calloc just gives you bytes, not objects. You can't just reinterpret_cast objects into existence. On top of that, indexing is only defined for arrays, and p32 does not point to an array.
To make it well defined, you'd have to create an array object. However, placement-new for arrays is broken, so you're left with manually initializing a bunch of uint32_ts like:
auto p32 = reinterpret_cast<uint32_t*>(pv);
for (int i = 0; i < array_count; ++i) {
new (p32+i) uint32_t; // NB: this does no initialization, but it does satisfy
// [intro.object] in actually creating an object
}
This would then run into a separate issue: CWG 2182. Now we have array_count uint32_ts, but we don't have a uint32_t[array_count] so indexing is still UB. Basically, there's just no way in purely-by-the-letter-of-the-standard C++ to write this. See also my similar question on the topic.
That said, the amount of code that does this in the wild is tremendous and every implementation will allow you to do it.
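As a rough sketch of the "create the objects explicitly" idea above, here is one way to partition a single block. The names align_up, Partition and make_partition are made up for illustration. It places the uint16_t array first and rounds the offset of the uint32_t array up to its alignment, which also covers the alignment question. It assumes count > 0 (malloc(0) may return a null pointer). Before C++20 the array-indexing objection from CWG 2182 still applies on paper. My understanding is that C++20's implicit object creation removes it, so treat this as a practical sketch, not a by-the-letter proof.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

// Round `offset` up to the next multiple of `alignment` (must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical aggregate: one malloc'd block viewed as two arrays.
struct Partition {
    void* block;         // pass this pointer, and only this one, to std::free
    std::uint16_t* p16;  // `count` elements at the start of the block
    std::uint32_t* p32;  // `count` elements at the next suitably aligned offset
};

inline Partition make_partition(std::size_t count) {
    // Offset of the uint32_t array, rounded up so it is correctly aligned.
    const std::size_t off32 =
        align_up(count * sizeof(std::uint16_t), alignof(std::uint32_t));
    void* block = std::malloc(off32 + count * sizeof(std::uint32_t));
    if (block == nullptr) throw std::bad_alloc();

    auto* bytes = static_cast<unsigned char*>(block);
    auto* p16 = reinterpret_cast<std::uint16_t*>(bytes);
    auto* p32 = reinterpret_cast<std::uint32_t*>(bytes + off32);
    assert(reinterpret_cast<std::uintptr_t>(p32) % alignof(std::uint32_t) == 0);

    // Explicitly start the lifetime of every element (value-initialized to 0).
    std::uninitialized_value_construct_n(p16, count);
    std::uninitialized_value_construct_n(p32, count);
    return Partition{block, p16, p32};
}
```

With count = 5, the uint16_t array occupies bytes 0 to 9 and the uint32_t array starts at offset 12 instead of 10. The caller releases everything with a single std::free(p.block).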
I am only going to address Strict Aliasing part of the question.
The C++ standard says very little about malloc; it mostly states that its semantics are those defined in C. On a strict reading of the C++ standard, there is no aliasing rule violation because there is no object being aliased: in C++, the lifetime of an object begins once it has been constructed, and no object has been constructed by the malloc call.
As a result, this is something that is simply unspecified by the Standard (as opposed to undefined).
No, it's just a normal cast of the malloc return value and normal handling of allocated memory. You can reinterpret these bytes as whatever data type you want, as long as you don't touch memory outside the bounds of the allocated block.
No
I know you are just asking and experimenting, but normally you would allocate each array separately:
std::uint32_t* const p32 = static_cast<std::uint32_t*>(std::calloc(array_count, sizeof(std::uint32_t)));
std::uint16_t* const p16 = static_cast<std::uint16_t*>(std::calloc(array_count, sizeof(std::uint16_t)));
Is this program unsafe?
In your example, I think there is no problem with exceptions at all, but I suppose you intend to use it in another context. If your code throws an exception, you probably are going to leak some memory.
Do you really need to use calloc/free? In the cppcore guidelines you can find some guidelines about resource management in C++:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#S-resource
For instance, R10 and R11 say "not to call explicitly malloc/calloc/free, new/delete": https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rr-mallocfree
You only need to call new and delete explicitly if you are doing something that really needs very low-level code, and even then the C++ way is to encapsulate all these calls inside a class, so that the user of the class doesn't need to call new and delete explicitly.
But if you are not doing low-level stuff, have you considered using std::array or std::vector?
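For comparison, a minimal sketch of the std::vector route suggested above. The names Arrays and make_arrays are invented for this example. It gives up the single-allocation layout from the question in exchange for automatic cleanup, including when an exception is thrown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for the two arrays; each vector owns its own storage.
struct Arrays {
    std::vector<std::uint32_t> v32;
    std::vector<std::uint16_t> v16;
};

inline Arrays make_arrays(std::size_t count) {
    // Both vectors are value-initialized, so every element starts at 0.
    Arrays a{std::vector<std::uint32_t>(count), std::vector<std::uint16_t>(count)};
    for (std::size_t i = 0; i < count; ++i) {
        a.v32[i] = static_cast<std::uint32_t>(i);
        a.v16[i] = static_cast<std::uint16_t>(i * 2);
    }
    assert(a.v32.size() == count && a.v16.size() == count);
    return a;  // no free() needed: the destructors release the memory
}
```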
Related
I am trying to understand the strict aliasing rule. By reading the answer to What is the strict aliasing rule , I have some further questions. Here I borrow the example from that post:
struct Msg { uint32_t a, b; };
void SendWord(uint32_t);
void SendMessage(uint32_t* buff, uint32_t buffSize)
{
for (uint32_t i = 0; i < buffSize; ++i) SendWord(buff[i]);
}
Now consider the following code (A):
uint32_t *buff = (uint32_t*)malloc(sizeof(Msg));
std::memset(buff, 0, sizeof(Msg));
Msg *msg = (Msg*)buff; // Undefined behavior.
for (uint32_t i = 0; i < 10; ++i)
{
msg->a = i;
msg->b = i + 1;
SendMessage(buff, 2);
}
free(buff);
For the above code, the author explained what might happen due to the UB. The message sent could contain all 0s: during optimization, the compiler may assume msg and buff point to disjoint memory blocks, and then decide the writes to msg do not affect buff.
But how about the following code (B):
uint32_t *buff = (uint32_t*)malloc(sizeof(Msg));
std::memset(buff, 0, sizeof(Msg));
unsigned char *msg = (unsigned char*)buff; // Compliant.
for (uint32_t i = 0; i < sizeof(Msg); ++i)
{
msg[i] = i + 1;
SendMessage(buff, 2);
}
free(buff);
Is the sent message guaranteed to be as intended (as if the strict aliasing compiler flag were off)? If so, is it simply because a char* (msg) points to the same place as buff, so the compiler should and will notice it and refrain from the optimization described for (A)?
Yet I then read another comment under the post Strict aliasing rule and 'char *' pointers, saying that it is UB to use a char* pointer to write to the referenced object. So could code (B) still result in similar, unexpected behavior?
First of all, the answer https://stackoverflow.com/a/99010/1505939 applies to C (and is in fact completely wrong, but that's another story). You ask about C++, and the strict aliasing rule is set up differently in C than in C++. So that answer has nothing to do with this question.
Prior to C++20, both versions of your code cause undefined behaviour (by omission) as the behaviour of the assignment operator is only defined for the case of writing to an object. The malloc function in C++ allocates space but does not create objects within that space. The task should be approached using various other constructs such as new, or higher level containers, which do both allocate space and create objects within the space ready for writing.
Trying to analyze this code (pre-C++20) in context of the C++ strict aliasing rule is not possible because the definition of the rule is about accessing the stored value of an object but in this code it is not accessing any object, since no object was created.
Since C++20 there is a new provision (N4860 [intro.object]/10) that objects can be implicitly created by certain operations, malloc among them, if there exists a possible combination of objects that would make the code well-defined. (Otherwise the behaviour remains undefined.)
Under this change to the object model, both of your code samples are well-defined. In (A) there can be implicitly-created uint32_t objects in the space, and in (B) there can be implicitly-created unsigned char objects in the space. Since your code does not write as one type and then read as a different type, there is no possibility of an aliasing violation.
The existence of intermediate pointers of various types such as buff has no bearing on strict aliasing (in either language) -- the rule is strictly about how the space is read and written; not about how we got to the space.
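To illustrate "the rule is about how the space is read and written", here is a small sketch of two access patterns that, as far as I know, are fine in both C and C++: inspecting an object's bytes through unsigned char, and copying a representation with memcpy instead of reading it through a pointer of another type. The function names are made up, and float_bits assumes a 32-bit float.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Sum the bytes of the object representation. Reading any object through an
// unsigned char pointer is one of the accesses the aliasing rule permits.
inline unsigned byte_sum(const std::uint32_t& value) {
    const auto* bytes = reinterpret_cast<const unsigned char*>(&value);
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof value; ++i) sum += bytes[i];
    return sum;
}

// Copy the bits of a float into an integer with memcpy, never by
// dereferencing a uint32_t* that actually points at a float.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Usage sketch: the byte sum does not depend on endianness.
inline void aliasing_usage() {
    assert(byte_sum(0x01020304u) == 1u + 2u + 3u + 4u);
}
```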
Say we declared a char* buffer:
char *buf = new char[sizeof(int)*4];
//to delete:
delete [] buf;
or a void* buffer:
void *buf = operator new(sizeof(int)*4);
//to delete:
operator delete(buf);
How would they differ if they were used exclusively as pre-allocated memory, always cast to other types and never dereferenced on their own?
int *intarr = static_cast<int*>(buf);
intarr[1] = 1;
Please also say whether the code above is incorrect and the following should be preferred (only considering cases where the final types are primitives like int):
int *intarr = static_cast<int*>(buf);
for(size_t i = 0; i<4; i++){
new(&intarr[i]) int;
}
intarr[1] = 1;
Finally, please say whether it is safe to delete the original buffer of type void*/char* once it has been used to create other types in it with the latter approach of placement new.
It is worth clarifying that this question is a matter of curiosity. I firmly believe that by knowing the basics of what is and isn't possible in a programming language, I can use these as building blocks and come up with solutions suitable for each specific use case when I need to in the future. This is not an XY question, as I don't have a specific implementation in mind.
In any case, I can name a few things related to this question off the top of my head (pre-allocated buffers specifically):
Sometimes you want to make memory buffers for custom allocation. Sometimes you even want to align these buffers to cache line boundaries or other memory boundaries, almost always in the name of performance and sometimes by requirement (e.g. SIMD, if I'm not mistaken). Note that for alignment you could use std::aligned_alloc()
Due to a technicality, delete [] buf; may be considered UB: reusing the memory for other objects ends the lifetime of the array of characters, so buf no longer points to the array that new[] created.
I wouldn't expect it to cause problems in practice, but using raw memory from operator new doesn't suffer from this technicality and there is no reason to not prefer it.
Even better is to use this:
T* memory = std::allocator<T>{}.allocate(n);
Because this will also work for overaligned types (unlike your suggestions) and it is easy to replace with a custom allocator. Or, simply use std::vector
for(size_t i = 0; i<4; i++){
new(&intarr[i]) int;
}
There is a standard function for this:
std::uninitialized_fill_n(intarr, 4, 0);
OK, it's slightly different in that it initialises with an actual value, but that's probably the better thing to do anyway.
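Putting the two suggestions together, a minimal sketch of the whole lifecycle with std::allocator: allocate, construct, use, destroy, deallocate. The function name allocator_roundtrip is made up for this example, and std::destroy_n needs C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocate storage for n ints, construct each with `value`, sum them,
// then destroy the elements and release the storage.
inline int allocator_roundtrip(std::size_t n, int value) {
    std::allocator<int> alloc;
    int* memory = alloc.allocate(n);              // uninitialized, suitably aligned storage
    std::uninitialized_fill_n(memory, n, value);  // constructs n ints in place

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += memory[i];

    std::destroy_n(memory, n);    // trivial for int, but required for class types
    alloc.deallocate(memory, n);  // must be given the same n as allocate
    return sum;
}

// Usage sketch.
inline void allocator_usage() {
    assert(allocator_roundtrip(4, 3) == 12);
}
```

Swapping std::allocator<int> for a custom allocator type only changes the first line of the function, which is the flexibility the answer refers to.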
I am working with legacy C++ code that reserves a block of memory using malloc and divides it up into parts that are freed separately. Something like this:
const int N_floats_per_buffer = 100;
const int N_buffers = 2;
//reserve buffers en bloc
float * buffer = (float*) malloc(N_floats_per_buffer * N_buffers * sizeof(float));
//then this block memory is divided into sub-blocks
float * sub_buffer[N_buffers];
for(int j = 0; j < N_buffers; ++j)
{
sub_buffer[j] = buffer + j*N_floats_per_buffer;
}
//do something with the buffers...
//...
//finally: memory is freed for the individual buffers
for(int j = 0; j < N_buffers; ++j)
{
if(sub_buffer[j]!=NULL) free(sub_buffer[j]);
}
It is actually even more confusing in the real code but I think I have captured the essence of it.
My question is: is that a memory leak?
It is not a memory leak. It is worse, undefined behavior.
You can only call free on a pointer returned from malloc (or calloc, etc.). You are not allowed to call free on a pointer pointing somewhere else in the returned storage block. Doing that causes undefined behavior.
Also on a more pedantic side, malloc does not create any objects and pointer arithmetic requires the pointer to point to an element of an array object for it to have well-defined behavior in C++. Therefore you technically already have undefined behavior when you do
buffer + j*N_floats_per_buffer
although probably all compilers behave as expected (even though the standard made no guarantees). This was only resolved recently for C++20, where the required array will be created implicitly.
One should almost always only use new/delete, not malloc/free in C++.
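One possible shape for a fix, sketched with std::vector instead of malloc (my substitution, not what the legacy code does): a single owner holds the block, and the sub-buffers are plain non-owning pointers into it that are never freed individually. Buffers and make_buffers are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for the malloc'd block plus sub_buffer array.
struct Buffers {
    std::vector<float> storage;      // the single owner of the memory
    std::vector<float*> sub_buffer;  // non-owning views into storage

    Buffers() = default;
    Buffers(Buffers&&) = default;
    Buffers(const Buffers&) = delete;  // a copy's views would point into the original
};

inline Buffers make_buffers(std::size_t n_buffers, std::size_t floats_per_buffer) {
    Buffers b;
    b.storage.assign(n_buffers * floats_per_buffer, 0.0f);
    b.sub_buffer.reserve(n_buffers);
    for (std::size_t j = 0; j < n_buffers; ++j)
        b.sub_buffer.push_back(b.storage.data() + j * floats_per_buffer);
    return b;  // moving a vector keeps its heap block, so the views stay valid
}

// Usage sketch: write through a view, read through the owner.
inline void buffers_usage() {
    Buffers b = make_buffers(2, 100);
    b.sub_buffer[1][0] = 2.5f;
    assert(b.storage[100] == 2.5f);
}
```

If the code has to stay with malloc, the equivalent rule is to call free exactly once, on the original buffer pointer.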
For code like the following:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
test->ch[0] = 'a';
test->ch[1] = 'b';
test->ch[2] = 'c';
test->ch[3] = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
I need to ignore the compilation warning like
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
which is raised by the GCC 8.2 compiler:
g++ -O2 -Warray-bounds=2 main.cpp
One way to make the warning go away is to access the four characters through a pointer:
#include <iostream>
struct A
{
int a;
char ch[1];
};
int main()
{
volatile A *test = new A;
test->a = 1;
// Use pointer to avoid the warning
volatile char *ptr = test->ch;
*ptr = 'a';
*(ptr + 1) = 'b';
*(ptr + 2) = 'c';
*(ptr + 3) = '\0';
std::cout << sizeof(*test) << std::endl
<< test->ch[0] << std::endl;
}
But I cannot figure out why using a pointer instead of an array subscript works. Is it because the compiler does no bounds checking on what a pointer points to? Can anyone explain that?
Thanks.
Background:
Due to padding and alignment of memory for the struct, ch[1] to ch[3] in struct A are outside the declared array bounds, but from the point of view of memory they do not overflow the struct.
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
Due to padding and alignment of memory for the struct, ch[1] to ch[3] in struct A are outside the declared array bounds, but from the point of view of memory they do not overflow, so we want to ignore this warning.
C++ does not work the way you think it does. You are triggering undefined behavior. When your code triggers undefined behavior, the C++ standard places no requirement on its behavior. A version of GCC attempts to start some video games when certain kind of undefined behavior is encountered. Anthony Williams also knows at least one case where a particular instance of undefined behavior caused someone's monitor to catch on fire. (C++ Concurrency in Action, page 106) Your code may appear to be working at this very time and situation, but that is just an instance of undefined behavior and you cannot count on it. See Undefined, unspecified and implementation-defined behavior.
The correct way to suppress this warning is to write correct C++ code with well-defined behavior. In your case, declaring ch as char ch[4]; solves the problem.
The standard specifies this as undefined behavior in [expr.add]/4:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
Otherwise, if P points to an array element i of an array object x with n elements ([dcl.array]) [78], the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n, and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
Otherwise, the behavior is undefined.
[78] An object that is not an array element is considered to belong to a single-element array for this purpose; see [expr.unary.op]. A pointer past the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n for this purpose; see [basic.compound].
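A minimal sketch of the char ch[4] fix suggested above, with every index inside the declared bounds. The helper make_a is a made-up name. On a platform where int is 4 bytes with 4-byte alignment, I would expect sizeof(A) to be 8 for both ch[1] and ch[4], so the fix should not cost any space there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct A {
    int a;
    char ch[4];  // room for 'a', 'b', 'c' and the terminating '\0'
};

// Hypothetical helper that fills the struct the same way the question does.
inline A make_a() {
    A test{};
    test.a = 1;
    test.ch[0] = 'a';
    test.ch[1] = 'b';
    test.ch[2] = 'c';
    test.ch[3] = '\0';  // highest valid index is 3
    return test;
}

// Usage sketch: the array now holds a proper NUL-terminated string.
inline void struct_usage() {
    const A t = make_a();
    assert(std::strcmp(t.ch, "abc") == 0);
}
```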
I want to avoid the warning like
warning: array subscript 1 is above array bounds of 'volatile char [1]' [-Warray-bounds]
Well, it is probably better to fix the warning, not just avoid it.
The warning is actually telling you something: what you are doing is undefined behavior. Undefined behavior is really bad (it allows your program to do literally anything!) and should be fixed.
Let's look at your struct again:
struct A
{
int a;
char ch[1];
};
In C++, your array has only one element in it. The standard only guarantees array elements of 0 through N-1, where N is the size of the array:
[dcl.array]
...If the value of the constant expression is N, the array
has N elements numbered 0 to N-1...
So ch only has the elements 0 through 1-1, or elements 0 through 0, which is just element 0. That means accessing ch[1], ch[2] overruns the buffer, which is undefined behavior.
Due to padding and alignment of memory for struct, though ch[1]-ch[3] in struct A is out of declared array boundary, it is still not overflow for memory view, so we want to ignore this warning.
Umm, if you say so. The example you gave only allocated 1 A, so as far as we know, there is still only space for the 1 character. If you do allocate more than 1 A at a time in your real program, then I suppose this is possible. But that's still probably not a good thing to do. Especially since you might run into int a of the next A if you're not careful.
A solution to ignore this warning is to use pointer...But I can not figure out why that works. Is it because pointer do not have boundary checking for which it point?
Probably. That would be my guess too. Pointers can point to anything (including destroyed data or even nothing at all!), so the compiler probably won't check it for you. The compiler may not even have a way of knowing whether the memory you point to is valid or not (or may just not care), and, thus, may not even have a way to warn you, much less will warn you. Its only choice is to trust you, so I'm guessing that's why there's no warning.
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Side issue: actually std::string is probably a better choice here if you don't know how many characters you want to store in here ahead of time--assuming it's different for every instance of A. Anyway, moving on:
Why don't we just declare the ch to ch[4] in struct A to avoid this warning?
Answer:
struct A in our app code is generated by other script while compiling. The design rule for struct in our app is that if we do not know the length of an array, we declare it with one member, place it at the end of the struct, and use another member like int a in struct A to control the array length.
I'm not sure I understand your design principle completely, but it sounds like std::vector might be a better option. Then, size is kept track of automatically by the std::vector, and you know that everything is stored in ch. To access it, it would be something like:
myVec[i].ch[0]
I don't know all your constraints for your situation, but it sounds like a better solution instead of walking the line around undefined behavior. But that's just me.
Finally, if you still want to ignore our advice, you do have the option of turning off the warning, but again, I'd advise against it. It'd be better to fix A if you can, or find a better usage strategy if you can't.
There really is no way to work with this cleanly in C++ and iirc the type (a dynamically sized struct) isn't actually properly formed in C++. But you can work with it because compilers still try to preserve compatibility with C. So it works in practice.
You can't have a value of the struct, only references or pointers to it. And they must be allocated by malloc() and released by free(). You can't use new and delete. Below I show you a way that only allows you to allocate pointers to variable-sized structs given the desired payload size. The tricky bit is the allocation size: compute it as offsetof(Buf, buf) + size, not sizeof(Buf) + size, because sizeof(Buf) may include padding and need not equal the offset of Buf::buf. So here we go:
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <new>
#include <iostream>
#include <memory>
struct Buf {
size_t size {0};
char buf[];
[[nodiscard]]
static Buf * alloc(size_t size) {
void *mem = malloc(offsetof(Buf, buf) + size);
if (!mem) throw std::bad_alloc();
return std::construct_at(reinterpret_cast<Buf*>(mem), AllocGuard{}, size);
}
private:
class AllocGuard {};
public:
Buf(AllocGuard, size_t size_) noexcept : size(size_) {}
};
int main() {
Buf *buf = Buf::alloc(13);
std::cout << "buffer has size " << buf->size << std::endl;
}
You should delete or implement the copy/move constructors and assignment operators as desired. Another good idea would be to use std::unique_ptr or std::shared_ptr with a deleter that calls free() instead of returning a naked pointer. But I leave that as an exercise for the reader.
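A possible starting point for that exercise, sketched with a plain header struct instead of the flexible array member so that it does not depend on a compiler extension. FreeDeleter, malloc_ptr, Header and alloc_header are all made-up names, and the sketch only covers ownership, not access to the payload bytes behind the header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <type_traits>

// Deleter that releases malloc'd memory with free() instead of delete.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

template <class T>
using malloc_ptr = std::unique_ptr<T, FreeDeleter>;

// Hypothetical stand-in for the fixed-size part of Buf.
struct Header {
    std::size_t size;
};
// free() runs no destructor, so this sketch only suits trivially destructible types.
static_assert(std::is_trivially_destructible_v<Header>,
              "FreeDeleter skips the destructor");

inline malloc_ptr<Header> alloc_header(std::size_t payload_size) {
    void* mem = std::malloc(sizeof(Header) + payload_size);
    if (mem == nullptr) throw std::bad_alloc();
    // Construct the header at the start of the block and hand over ownership.
    return malloc_ptr<Header>(::new (mem) Header{payload_size});
}

// Usage sketch: the block is freed when `h` goes out of scope.
inline void header_usage() {
    malloc_ptr<Header> h = alloc_header(13);
    assert(h->size == 13);
}
```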
I'm in the unfortunate position of having to write my own vector implementation (no, using a standard implementation isn't possible, very unfortunately). The one in use at the moment uses raw byte buffers and in-place construction and destruction of objects, but as a side effect, I can't look into the actual elements. So I decided to write a variant implementation which internally uses true arrays.
While working on it I noticed that allocating the arrays causes additional constructor and destructor calls compared to the raw buffer version. Is this overhead somehow avoidable without losing the array access? It would be nice to have it as fast as the raw buffer version, so that it could replace it.
I'd appreciate as well if someone knows a good implementation which I could base my own on, or the very least get some ideas from. The work is quite tricky after all. :)
Edit:
Some code to explain it better.
T* data = new T[4]; // Allocation of "num" elements
data[0] = T(1);
data[1] = T(2);
delete[] data;
Now the default constructor has been called for each of the 4 elements of the array, and then 2 assignment operators are called. So instead of just 2 constructor calls we have 4, and later 4 destructor calls instead of just 2.
as a side-effect, I can't look into the actual elements.
Why not?
void* buffer = ...
T* elements = static_cast<T*>(buffer);
std::cout << elements[0] << std::endl;
Using true arrays means constructors will be called. You'll need to go to raw byte buffers - but it's not too bad. Say you have a buffer:
void *buffer;
Change that to a T *:
T *buffer;
When allocating, treat it as a raw memory buffer:
buffer = (T *) malloc(sizeof(T) * nelems);
And call constructors as necessary:
new(&buffer[x]) T();
Your debugger should be able to look into elements of the buffer as with a true array. When it comes time to free the array, of course, it's your responsibility to destroy the constructed elements first, then pass the buffer to free():
for (int i = 0; i < nInUse; i++)
buffer[i].~T();
free((void*)buffer);
Note that I would not use new char[] and delete[] to allocate this array - I don't know if new char[] will give proper alignment, and in any case you'd need to be careful to cast back to char* before delete[]ing the array.
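The steps above can be pulled together into a tiny fixed-capacity container, sketched here under a few assumptions: FixedVec is a made-up name, it uses ::operator new instead of malloc, it never grows, and T must not be over-aligned. Only elements that were actually emplaced are constructed and later destroyed, which avoids the extra constructor and destructor calls from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

template <class T>
class FixedVec {
public:
    explicit FixedVec(std::size_t capacity)
        : data_(static_cast<T*>(::operator new(capacity * sizeof(T)))),
          cap_(capacity) {}  // raw storage only, no T constructed yet

    FixedVec(const FixedVec&) = delete;
    FixedVec& operator=(const FixedVec&) = delete;

    ~FixedVec() {
        std::destroy_n(data_, size_);  // destroy only the live elements
        ::operator delete(data_);
    }

    // Construct one element in place at the end of the used range.
    template <class... Args>
    T& emplace_back(Args&&... args) {
        assert(size_ < cap_);
        T* p = ::new (static_cast<void*>(data_ + size_)) T(std::forward<Args>(args)...);
        ++size_;
        return *p;
    }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];  // data_ is a T*, so a debugger shows real elements
    }

    std::size_t size() const { return size_; }

private:
    T* data_;
    std::size_t cap_;
    std::size_t size_ = 0;
};
```

With a capacity of 4 and two emplace_back calls, exactly two constructors and two destructors run.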
I find the following implementation quite interesting: C Array vs. C++ Vector
Besides the performance comparison, his vector implementation also includes push/pop operations on the vector.
The code also has an example that shows how to use the macros:
#include "kvec.h"
int main() {
kvec_t(int) array;
kv_init(array);
kv_push(int, array, 10); // append
kv_a(int, array, 20) = 5; // dynamic
kv_A(array, 20) = 4; // static
kv_destroy(array);
return 0;
}