How to invoke aligned new/delete properly? - c++

How do I call new operator with alignment?
auto foo = new(std::align_val_t(32)) Foo; //?
and then, how to delete it properly?
delete(std::align_val_t(32), foo); //?
If this is the right way to use these overloads, why is valgrind complaining about mismatched free()/delete/delete[]?

There is a very basic principle: the routine that frees memory must always match the routine that allocated it. If allocation and deallocation are mismatched, the behavior is undefined: everything may appear to work, or the program may crash, leak memory, or corrupt the heap.
If we allocate memory with the aligned version of operator new
void* operator new ( std::size_t count, std::align_val_t al);
we must use the corresponding aligned version of operator delete
void operator delete ( void* ptr, std::align_val_t al );
Calling void operator delete ( void* ptr ); here instead is a mismatch. It is undefined behavior and, with an allocator like the one discussed here, it fails at run time. A simple test:
std::align_val_t al = (std::align_val_t)256;
if (void* pv = operator new(8, al))
{
operator delete(pv, al);
//operator delete(pv); // mismatched: this line crashes or silently corrupts the heap
}
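To make the question's case concrete, here is a minimal sketch (C++17) of one way to pair the calls for a single object: allocate with the placement form, then run the destructor explicitly and call the matching aligned operator delete. The Foo type, its destroyed counter and the aligned_round_trip helper are made up for illustration. As far as I can tell, delete(std::align_val_t(32), foo) from the question parses as a delete-expression applied to a comma expression, so it deletes foo through the ordinary, unaligned path, which would explain the valgrind report.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical type used only for this illustration.
struct Foo {
    int value = 42;
    ~Foo() { ++destroyed; }
    static inline int destroyed = 0;  // counts destructor calls
};

// Allocates one Foo with 32-byte alignment, checks the address,
// then destroys and frees it through the matching aligned overload.
bool aligned_round_trip() {
    constexpr auto al = std::align_val_t(32);
    Foo* foo = new (al) Foo;  // calls operator new(sizeof(Foo), al)
    const bool ok =
        reinterpret_cast<std::uintptr_t>(foo) % 32 == 0 && foo->value == 42;
    foo->~Foo();                 // the destructor has to be run by hand
    ::operator delete(foo, al);  // matches the aligned operator new above
    return ok;
}
```

Calling aligned_round_trip() should return true and leave Foo::destroyed at 1.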
Why are the aligned and unaligned versions of operator delete incompatible? Consider how memory aligned to some value can be allocated at all. We always start by allocating some block of memory. To return an aligned pointer, we have to adjust the allocated pointer up to a multiple of the alignment, which is possible if we allocate more memory than requested.
But then how do we free this block? In general the user gets a pointer that is not the beginning of the allocated block, and without additional information there is no way to get from the user pointer back to the beginning of the block. So we need to store the pointer to the actual allocation just before the pointer returned to the user.
This may be easier to see in code. A typical implementation of aligned new and delete with the Microsoft CRT uses _aligned_malloc and _aligned_free:
void* operator new(size_t size, std::align_val_t al)
{
return _aligned_malloc(size, static_cast<size_t>(al));
}
void operator delete (void * p, std::align_val_t al)
{
_aligned_free(p);
}
while the unaligned new and delete use malloc and free:
void* operator new(size_t size)
{
return malloc(size);
}
void operator delete (void * p)
{
free(p);
}
Now let us look at how _aligned_malloc and _aligned_free can be implemented internally (simplified):
void* __cdecl _aligned_malloc(size_t size, size_t alignment)
{
if (!alignment || ((alignment - 1) & alignment))
{
// alignment is not a power of 2 or is zero
return 0;
}
union {
void* pv;
void** ppv;
uintptr_t up;
};
if (void* buf = malloc(size + sizeof(void*) + --alignment))
{
pv = buf;
up = (up + sizeof(void*) + alignment) & ~alignment;
ppv[-1] = buf;
return pv;
}
return 0;
}
void __cdecl _aligned_free(void * pv)
{
if (pv)
{
free(((void**)pv)[-1]);
}
}
In other words, _aligned_malloc allocates size + sizeof(void*) + alignment - 1 bytes instead of the size requested by the caller, adjusts the allocated pointer to fit the alignment, and stores the originally allocated pointer just before the pointer returned to the caller.
_aligned_free(pv) therefore does not call free(pv) but free(((void**)pv)[-1]), which is always a different pointer. Because of this, the effect of _aligned_free(pv) always differs from free(pv), and operator delete(pv, al) is never compatible with operator delete(pv). While delete[] usually has the same effect as delete, the aligned and unaligned versions always behave differently at run time.
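For readers not on the Microsoft CRT, the same over-allocate-and-store-the-original-pointer idea can be sketched portably on top of std::malloc. This is only an illustration, not the real CRT code: sketch_aligned_alloc, sketch_aligned_free and header_gap are invented names, and the alignment is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Over-allocates, rounds the address up to the alignment, and stores the
// pointer returned by malloc in the slot just before the user pointer.
void* sketch_aligned_alloc(std::size_t size, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    if (alignment < alignof(void*)) alignment = alignof(void*);
    const std::size_t mask = alignment - 1;
    void* raw = std::malloc(size + sizeof(void*) + mask);
    if (!raw) return nullptr;
    const auto addr = reinterpret_cast<std::uintptr_t>(raw);
    const auto aligned =
        (addr + sizeof(void*) + mask) & ~static_cast<std::uintptr_t>(mask);
    void** user = reinterpret_cast<void**>(aligned);
    user[-1] = raw;  // remember where the real block starts
    return user;
}

// Frees the block malloc really returned, not the user pointer.
void sketch_aligned_free(void* p) {
    if (p) std::free(static_cast<void**>(p)[-1]);
}

// Bytes between the real block start and the pointer handed to the user.
std::size_t header_gap(void* p) {
    char* raw = static_cast<char*>(static_cast<void**>(p)[-1]);
    return static_cast<std::size_t>(static_cast<char*>(p) - raw);
}
```

header_gap is never zero for a pointer from sketch_aligned_alloc, which is exactly why passing that pointer to plain free would hand the heap an address it never returned.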

The below syntax was the only one that worked for me to create and destroy an overaligned array, using clang-cl 13 on Windows 10 x64:
int* arr = new (std::align_val_t(64)) int[555];
::operator delete[] (arr, std::align_val_t(64));
For the same new operation, the below delete expression would not compile ("cannot delete expression of type 'std::align_val_t'"):
delete[] (arr, std::align_val_t(64));
The below delete expression will compile, but then throws a runtime error ("Critical error detected c0000374"):
delete[](std::align_val_t(64), blocks);
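An alternative that avoids spelling out the alignment at every call site, assuming C++17 and that you control the element type, is to put alignas on the type itself. For a type whose alignment exceeds the default new alignment, plain new[] and delete[] expressions select the aligned overloads automatically. The Block type and the helper below are hypothetical, just a sketch of the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical over-aligned element type (64-byte alignment).
struct alignas(64) Block {
    int data[16];
};

// Allocates and frees an array of Block with ordinary new[]/delete[].
// Since C++17 the compiler passes std::align_val_t(alignof(Block)) to the
// allocation and deallocation functions on our behalf.
bool overaligned_array_round_trip() {
    Block* blocks = new Block[8];
    blocks[7].data[0] = 1;
    const bool ok = reinterpret_cast<std::uintptr_t>(blocks) % 64 == 0;
    delete[] blocks;
    return ok;
}
```

This does not help for int[555] as in the snippet above, since int itself is not over-aligned; there the explicit ::operator delete[] call is still needed.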

Related

Heap array allocates 4 extra bytes if class has destructor

I'm new to C++ and I've been playing around with memory allocation lately. I have found out that when you declare a class with a destructor, like so:
class B
{
public:
~B() { }
};
And then create a heap array of it like so:
B* arr = new B[8];
The allocator allocates 12 bytes but when I remove the destructor, it only allocates 8 bytes. This is how I'm measuring the allocation:
size_t allocated = 0;
void* operator new(size_t size)
{
allocated += size;
return malloc(size);
}
void deallocate(B* array, size_t size)
{
delete[] array;
allocated -= size * sizeof(B);
}
Of course I have to call deallocate manually while the new operator is called automatically.
I have found this problem while working with an std::string* and I realised that the deallocator worked fine with an int* but not with the former.
Does anyone know why that happens and more importantly: How to detect these programmatically at runtime?
Thanks in advance.
You are looking at an implementation detail of how your compiler treats new[] and delete[], so there isn't a definitive answer for the extra space being allocated since the answer will be specific to an implementation -- though I can provide a likely reason below.
Since this is implementation-defined, you cannot reliably detect this at runtime. More importantly, there shouldn't be any real reason to do this. Especially if you're new to C++, this fact is more of an interesting/esoteric thing to know of, but there should be no real benefit detecting this at runtime.
It's important to also be aware that this only happens with array-allocations, and not with object allocations. For example, the following will print expected numbers:
struct A {
~A(){}
};
struct B {
};
auto operator new(std::size_t n) -> void* {
std::cout << "Allocated: " << n << std::endl;
return std::malloc(n);
}
auto operator delete(void* p, std::size_t n) -> void {
std::free(p);
}
auto main() -> int {
auto* a = new A{};
delete a;
auto* b = new B{};
delete b;
}
Output:
Allocated: 1
Allocated: 1
The extra storage only gets allocated for types with non-trivial destructors:
auto* a = new A[10];
delete[] a;
auto* b = new B[10];
delete[] b;
Outputs:
Allocated: 18
Allocated: 10
The most likely reason why this happens is that extra bookkeeping of a single size_t is being kept at the beginning of allocated arrays containing non-trivial destructors. This would be done so that when delete[] is called, the language can know how many objects require their destructors invoked. For trivial destructors, it's able to rely on the underlying delete mechanics of their deallocation functions.
This hypothesis is also supported by the fact that for the GNU ABI, the extra storage is sizeof(size_t) bytes. Building for x86_64 yields 18 for an allocation of A[10] (8 bytes for size_t). Building for x86 yields 14 for that same allocation (4 bytes for size_t).
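If you want to observe the overhead without replacing the global operator new, a class-specific operator new[] can record the size it was asked for. The following is a minimal sketch under that assumption; Tracked, WithDtor, Plain and array_overhead are names invented for this example, and the exact overhead you see depends on your ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical CRTP base: remembers the byte count passed to operator new[].
template <class Derived>
struct Tracked {
    static inline std::size_t last_request = 0;
    static void* operator new[](std::size_t n) {
        last_request = n;
        if (void* p = std::malloc(n)) return p;
        throw std::bad_alloc();
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};

struct WithDtor : Tracked<WithDtor> { char c; ~WithDtor() {} };  // non-trivial destructor
struct Plain    : Tracked<Plain>    { char c; };                 // trivial destructor

// Returns how many bytes beyond count * sizeof(T) the new[] expression requested.
template <class T>
std::size_t array_overhead(std::size_t count) {
    T* arr = new T[count];
    const std::size_t extra = T::last_request - count * sizeof(T);
    delete[] arr;
    return extra;
}
```

Comparing array_overhead<WithDtor>(10) with array_overhead<Plain>(10) shows the size of the bookkeeping on your platform; the only thing the language promises is that the request is at least count * sizeof(T).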
Edit
I don't recommend doing this in practice, but you can actually view this extra data from arrays. The allocated pointer from new[] gets adjusted before being returned to the caller (which you can test by printing the address from the new[] operator).
If you read this data into a std::size_t, you can see that this data -- at least for the GNU ABI -- contains the exact count for the number of objects allocated.
Again, I do not recommend doing this in practice since this exploits implementation-defined behavior. But just for fun:
auto* a = new A[10];
const auto* bytes = reinterpret_cast<const std::byte*>(a);
std::size_t count;
std::memcpy(&count, bytes - sizeof(std::size_t), sizeof(std::size_t));
std::cout << "Count " << count << std::endl;
delete[] a;
The output:
Count 10

Does c++ operator new[]/delete[] call operator new/delete?

Does c++ operator new[]/delete[] (not mine) call operator new/delete?
After I replaced operator new and operator delete with my own implementation, the following code calls them:
int *array = new int[3];
delete[] array;
And when I also replaced operator new[] and operator delete[], the above code calls only those.
My operators implementation:
void *operator new(std::size_t blockSize) {
std::cout << "allocate bytes: " << blockSize << std::endl;
return malloc(blockSize);
}
void *operator new[](std::size_t blockSize) {
std::cout << "[] allocate: " << blockSize << std::endl;
return malloc(blockSize);
}
void operator delete(void *block) throw() {
int *blockSize = static_cast<int *>(block);
blockSize = blockSize - sizeof(int);
std::cout << "deallocate bytes: " << *blockSize << std::endl;
free(block);
}
void operator delete[](void *block) throw() {
int *blockSize = static_cast<int *>(block);
blockSize = blockSize - sizeof(int);
std::cout << "[] deallocate bytes: " << *blockSize << std::endl;
free(block);
}
I have a second question, which may not be closely related: why does the code print:
[] allocate: 12
[] deallocate bytes: 0
Instead of this:
[] allocate: 16
[] deallocate bytes: 16
Since the allocation operators new and new[] pretty much do the same thing(a), it makes sense that one would be defined in terms of the other. They're both used for allocating a block of a given size, regardless of what you intend to use it for. Ditto for delete and delete[].
In fact, this is required by the standard. C++11 18.6.1.2 /4 (for example) states that the default behaviour of operator new[] is that it returns operator new(size). There's a similar restriction in /13 for operator delete[].
So a sample default implementation would be something like:
void *operator new(std::size_t sz) { return malloc(sz); }
void operator delete(void *mem) throw() { free(mem); }
void *operator new[](std::size_t sz) { return operator new(sz); }
void operator delete[](void *mem) throw() { return operator delete(mem); }
When you replace the new and delete functions, the new[] and delete[] ones will still use them under the covers. However, replacing new[] and delete[] with your own functions that don't call your new and delete results in them becoming disconnected.
That's why you're seeing the behaviour described in the first part of your question.
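The forwarding can be checked with a small experiment: replace only operator new and operator delete, count the calls, and allocate an array. This is a sketch, not production code; g_new_calls, g_last_size and calls_made_by_array_new are names made up here, and tools that intercept the allocation functions themselves (some sanitizers, for instance) may bypass the replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static std::size_t g_new_calls = 0;  // how often our operator new ran
static std::size_t g_last_size = 0;  // size of the most recent request

// Replace only the single-object forms; new[] and delete[] are left alone.
void* operator new(std::size_t sz) {
    ++g_new_calls;
    g_last_size = sz;
    if (void* p = std::malloc(sz ? sz : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }

// Returns how many times our operator new ran for one new int[3].
std::size_t calls_made_by_array_new() {
    const std::size_t before = g_new_calls;
    int* volatile arr = new int[3];  // volatile keeps the pair from being optimized away
    arr[0] = 1;
    delete[] arr;
    return g_new_calls - before;
}
```

With a default operator new[] that follows the standard's required behavior, the array allocation goes through the replaced operator new exactly once, with a request of 3 * sizeof(int) bytes.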
As per the second part of your question, you're seeing what I'd expect to see. The allocation of int[3] is asking for three integers, each four bytes in size (in your environment). That's clearly 12 bytes.
Why it seems to be freeing zero bytes is a little more complex. You seem to think that the four bytes immediately before the address you were given are the size of the block but that's not necessarily so.
Implementations are free to store whatever control information they like in the memory arena(b) including the following possibilities (this is by no means exhaustive):
the size of the current memory allocation;
a link to the next (and possibly previous) control block;
a sentinel value (such as 0xa55a or a checksum of the control block) to catch arena corruption.
Unless you know and control how the memory allocation functions use their control blocks, you shouldn't be making assumptions. For a start, to ensure correct alignment, control blocks may be padded with otherwise useless data. If you want to save/use the requested size, you'll need to do it yourself with something like:
#include <cstdlib>   // std::malloc, std::free
#include <iostream>
// Need to check this is enough to maintain alignment.
namespace { const int buffSz = 16; }
// New will allocate more than needed, store size, return adjusted address.
void *operator new(std::size_t blockSize) {
std::cout << "Allocating size " << blockSize << '\n';
auto mem = static_cast<std::size_t*>(std::malloc(blockSize + buffSz));
*mem = blockSize;
return reinterpret_cast<char*>(mem) + buffSz;
}
// Delete will unadjust address, use that stored size and free.
void operator delete(void *block) throw() {
auto mem = reinterpret_cast<std::size_t*>(static_cast<char*>(block) - buffSz);
std::cout << "Deallocating size " << *mem << '\n';
std::free(mem);
}
// Leave new[] and delete[] alone, they'll use our functions above.
// Test harness.
int main() {
int *x = new int;
*x = 7;
int *y = new int[3];
y[0] = y[1] = y[2] = 42;
std::cout << *x << ' ' << y[1] << '\n';
delete[] y;
delete x;
}
Running that code results in successful values being printed:
Allocating size 4
Allocating size 12
7 42
Deallocating size 12
Deallocating size 4
(a) The difference between new MyClass and new MyClass[7] comes later than the allocation phase, when the objects are being constructed. Basically, they both allocate the required memory once, then construct as many objects in that memory as necessary (once in the former, seven times in the latter).
(b) And an implementation is allowed to not store any control information inline. I remember working on embedded systems where we knew that no allocation would ever be more than 1K. So we basically created an arena that had no inline control blocks. Instead it had a big chunk of memory, several hundred of those 1K blocks, and used a bitmap to decide which was in use and which was free.
On the off chance someone asked for more than 1K, they got NULL. Those asking for less than or equal to 1K got 1K regardless. Needless to say, it was much faster than the general purpose allocation functions provided with the implementation.
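A rough sketch of what such a bitmap arena might look like follows. It is not the original embedded code: FixedArena, the block count of 256 and the linear search are assumptions made for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Fixed-size block arena with no inline control blocks: a bitmap records
// which 1K blocks are in use.
class FixedArena {
public:
    static constexpr std::size_t kBlockSize = 1024;
    static constexpr std::size_t kBlockCount = 256;

    // Any request up to 1K gets a whole 1K block; larger requests fail.
    void* allocate(std::size_t size) {
        if (size > kBlockSize) return nullptr;
        for (std::size_t i = 0; i < kBlockCount; ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return storage_ + i * kBlockSize;
            }
        }
        return nullptr;  // arena exhausted
    }

    // The block index is recovered from the address alone.
    void release(void* p) {
        if (!p) return;
        const auto offset = static_cast<std::size_t>(
            static_cast<unsigned char*>(p) - storage_);
        assert(offset % kBlockSize == 0 && offset / kBlockSize < kBlockCount);
        used_[offset / kBlockSize] = false;
    }

    std::size_t blocks_in_use() const { return used_.count(); }

private:
    alignas(std::max_align_t) unsigned char storage_[kBlockSize * kBlockCount];
    std::bitset<kBlockCount> used_;
};
```

Because the object is 256K, it is best given static storage (for example static FixedArena arena;) rather than placed on the stack.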

How to align array of structs, each require alignment (SSE)

I have a struct alignedStruct, and it requires special alignment (SSE ext):
void* operator new(size_t i) { return _aligned_malloc(i, 16); }
void operator delete(void* p) { _aligned_free(p); }
This alignment works fine for unique objects/pointers of alignedStruct, but then I tried to do this:
alignedStruct * arr = new alignedStruct[count];
My application crashes, and the problem is definitely about "alignment of array" (exactly at previous line):
0xC0000005: Access violation reading location 0xFFFFFFFF.
The crash occurs in about 60% of runs, which also suggests the problem is not a typical one.
I believe what you're looking for is placement new which allows you to use the _aligned_malloc with a constructor properly. Alternatively you can overload operator new[] and operator delete[].
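void* operator new[] (std::size_t size)
{
    void* mem = _aligned_malloc(size, 16);
    if (mem == nullptr)
        throw std::bad_alloc(); // bad_alloc must be constructed before it is thrown
    return mem;
}
// Matching deallocation: memory from _aligned_malloc must go to _aligned_free.
void operator delete[] (void* p) noexcept
{
    _aligned_free(p);
}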
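If C++17 is available, a portable variant of the same idea is to give the struct class-specific single and array forms that forward to the standard aligned allocation functions instead of _aligned_malloc. The sketch below assumes a 16-byte requirement; AlignedStruct here is a stand-in for the asker's type and is_aligned_16 is a helper invented for checking the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Stand-in for the asker's SSE type: 16-byte aligned, four floats.
struct alignas(16) AlignedStruct {
    float v[4];

    // Single-object and array forms both go through the aligned global
    // functions, and each delete matches the new that produced the memory.
    static void* operator new(std::size_t n) {
        return ::operator new(n, std::align_val_t(16));
    }
    static void* operator new[](std::size_t n) {
        return ::operator new[](n, std::align_val_t(16));
    }
    static void operator delete(void* p) noexcept {
        ::operator delete(p, std::align_val_t(16));
    }
    static void operator delete[](void* p) noexcept {
        ::operator delete[](p, std::align_val_t(16));
    }
};

// Hypothetical helper: true if the address is a multiple of 16.
bool is_aligned_16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

With this in place, new AlignedStruct[count] and delete[] arr use the array pair, so the array case no longer falls back to the unaligned global operator new[].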

overloading new and delete C++ for tracking memory allocations

I need help in understanding the code snippet below. allocate is a function that would be called by the overloaded new operator to allocate memory. I am having problems trying to understand the following casts in particular:
*static_cast<std::size_t*>(mem) = pAmount; //please explain?
return static_cast<char*>(mem) + sizeof(std::size_t); //?
and..
// get original block
void* mem = static_cast<char*>(pMemory) - sizeof(std::size_t); //?
the code is shown below:
const std::size_t allocation_limit = 1073741824; // 1G
std::size_t totalAllocation = 0;
void* allocate(std::size_t pAmount)
{
// make sure we're within bounds
assert(totalAllocation + pAmount < allocation_limit);
// over allocate to store size
void* mem = std::malloc(pAmount + sizeof(std::size_t));
if (!mem)
return 0;
// track amount, return remainder
totalAllocation += pAmount;
*static_cast<std::size_t*>(mem) = pAmount;
return static_cast<char*>(mem) + sizeof(std::size_t);
}
void deallocate(void* pMemory)
{
// get original block
void* mem = static_cast<char*>(pMemory) - sizeof(std::size_t);
// track amount
std::size_t amount = *static_cast<std::size_t*>(mem);
totalAllocation -= amount;
// free
std::free(mem);
}
The allocator keeps track of the size of allocations by keeping them along with the blocks it serves to client code. When asked for a block of pAmount bytes, it allocates an extra sizeof(size_t) bytes at the beginning and stores the size there. To get to this size, it interprets the mem pointer it gets from malloc as a size_t* and dereferences that (*static_cast<std::size_t*>(mem) = pAmount;). It then returns the rest of the block, which starts at mem + sizeof(size_t), since that is the part that the client may use.
When deallocating, it must pass the exact pointer it got from malloc to free. To get this pointer, it subtracts the sizeof(size_t) bytes it added in the allocate member function.
In both cases, the casts to char* are needed because pointer arithmetic is not allowed on void pointers.
void* allocate(std::size_t pAmount)
allocates pAmount of memory plus space to store the size
|-size-|---- pAmount of memory-----|
       ^
       |
"allocate" will return a pointer just past the size field.
void deallocate(void* pMemory)
will move the pointer back to the beginning
|-size-|---- pAmount of memory-----|
^
|
and free it.
1.)
std::size_t mySize = 0;
void* mem = &mySize;
// same as: mySize = 42;
*static_cast<std::size_t*>(mem) = 42;
std::cout << mySize;
// prints "42"
2.)
return static_cast<char*>(mem) + sizeof(std::size_t);
// casts void pointer mem to a char* so that you can do pointer arithmetic.
// same as
char *myPointer = (char*)mem;
// increment myPointer by the size of size_t
return myPointer + sizeof(std::size_t);
3.)
void* mem = static_cast<char*>(pMemory) - sizeof(std::size_t);
// mem points sizeof(std::size_t) bytes before pMemory
In order to know how much memory to clean up when you delete it (and provide some diagnostics) the allocator stores off the size in extra allocated memory.
*static_cast<std::size_t*>(mem) = pAmount; //please explain?
This takes the allocated memory and stores the number of allocated bytes into this location. The cast treats the raw memory as a size_t for storage purposes.
return static_cast<char*>(mem) + sizeof(std::size_t); //?
This moves forward past the size bytes to the actual memory that your application will use and returns that pointer.
void* mem = static_cast<char*>(pMemory) - sizeof(std::size_t); //?
This is taking the block previously returned to the user and advancing back to the "real" allocated block that stored the size earlier. It's needed to do checks and reclaim the memory.
The cast is needed in order to get the proper offset, since void* does not point to a type with a size.
When you write
return static_cast<char*>(mem) + sizeof(std::size_t);
the pointer is cast to a char* before the offset in bytes is added.
The same applies to the subtraction when deallocating.
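One thing the code in the question does not address is alignment: returning mem + sizeof(std::size_t) can leave the user pointer less aligned than what malloc itself guarantees. A minimal sketch of the same size-prefix scheme with a header padded to alignof(std::max_align_t) follows; the sketch namespace, kHeader and total_allocated are illustrative names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

namespace sketch {

// Header large enough for a size_t and a multiple of the strictest
// fundamental alignment, so the user pointer stays as aligned as malloc's.
constexpr std::size_t kHeader =
    alignof(std::max_align_t) > sizeof(std::size_t) ? alignof(std::max_align_t)
                                                    : sizeof(std::size_t);

std::size_t total_allocated = 0;  // bytes currently handed out

void* allocate(std::size_t amount) {
    void* raw = std::malloc(amount + kHeader);
    if (!raw) return nullptr;
    *static_cast<std::size_t*>(raw) = amount;  // store the size at the front
    total_allocated += amount;
    return static_cast<char*>(raw) + kHeader;  // user memory starts after the header
}

void deallocate(void* p) {
    if (!p) return;
    void* raw = static_cast<char*>(p) - kHeader;  // step back to the real block
    total_allocated -= *static_cast<std::size_t*>(raw);
    std::free(raw);
}

}  // namespace sketch
```

The cost is a few wasted bytes per allocation on platforms where alignof(std::max_align_t) is larger than sizeof(std::size_t).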

Overloading delete and retrieving size?

I am currently writing a small custom memory allocator in C++, and want to use it together with overloaded new/delete operators. My allocator basically checks whether the requested memory is over a certain threshold, and if so uses malloc to allocate the requested chunk. Otherwise the memory is provided by some fixed-pool allocators. That generally works, but my deallocation function looks like this:
void MemoryManager::deallocate(void * _ptr, size_t _size){
if(_size > heapThreshold)
deallocHeap(_ptr);
else
deallocFixedPool(_ptr, _size);
}
So I need to provide the size of the chunk pointed to, to deallocate from the right place.
Now the problem is that the delete keyword does not provide any hint on the size of the deleted chunk, so I would need something like this:
void operator delete(void * _ptr, size_t _size){
MemoryManager::deallocate(_ptr, _size);
}
But as far as I can see, there is no way to determine the size inside the delete operator. If I want to keep things the way they are right now, would I have to save the size of the memory chunks myself?
Allocate more memory than necessary and store the size information there. That's what your system allocator probably does already. Something like this (demonstrated with malloc for simplicity):
void *allocate(size_t size) {
size_t *p = (size_t*)malloc(size + sizeof(size_t)); // the cast is required in C++
if (!p) return NULL; // allocation failed
p[0] = size; // store the size in the first few bytes
return (void*)(&p[1]); // return the memory just after the size we stored
}
void deallocate(void *ptr) {
size_t *p = (size_t*)ptr; // make the pointer the right type
size_t size = p[-1]; // get the data we stored at the beginning of this block
// do what you need with size here...
void *p2 = (void*)(&p[-1]); // get a pointer to the memory we originally really allocated
free(p2); // free it
}
You could keep a map of memory address to size for your pool-allocated memory. When you delete, check if the pointer is in the map, if it is delete that size, if it isn't call regular delete.
For class type, C++ already supports it directly. For nonclass types, you need to store the size manually like the other solution shows.
struct MyClass {
void operator delete(void *p, size_t size) {
MemoryManager::deallocate(p, size);
}
};
As of C++14 the Standard supports the second size parameter in the global deallocation function (operator delete). So what you want to do is possible natively now.
http://en.cppreference.com/w/cpp/memory/new/operator_delete
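As a small illustration of the class-specific form mentioned above, the sketch below records the size the compiler passes to a two-parameter operator delete. Pooled, last_freed_size and size_seen_by_delete are hypothetical names; a real allocator would forward the size to something like MemoryManager::deallocate instead of storing it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with a sized, class-specific operator delete.
struct Pooled {
    double payload[4];
    static inline std::size_t last_freed_size = 0;

    static void* operator new(std::size_t n) { return ::operator new(n); }

    // The compiler supplies the size of the object being deleted.
    static void operator delete(void* p, std::size_t n) noexcept {
        last_freed_size = n;  // a pool allocator would pick its bucket from n
        ::operator delete(p);
    }
};

// Creates and deletes one Pooled, returning the size operator delete received.
std::size_t size_seen_by_delete() {
    Pooled* p = new Pooled;
    delete p;
    return Pooled::last_freed_size;
}
```

For this non-polymorphic class the value received is sizeof(Pooled), so no size header has to be stored next to the object.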