I've a structure that can be dynamically sized.
I want to use smart pointer (unique_ptr here) to allocate this structure.
The problem is that this struct is dynamically sized..
Here is the structure (of Windows library):
typedef struct _STORAGE_DEPENDENCY_INFO
{
STORAGE_DEPENDENCY_INFO_VERSION Version;
ULONG NumberEntries;
union
{
STORAGE_DEPENDENCY_INFO_TYPE_1 Version1Entries[];
STORAGE_DEPENDENCY_INFO_TYPE_2 Version2Entries[];
};
} STORAGE_DEPENDENCY_INFO, *PSTORAGE_DEPENDENCY_INFO;
I CAN get the total size of the structure.
So, I know i can do this with malloc:
STORAGE_DEPENDENCY_INFO *info = std::malloc(struct_size);
But I don't know how to allocate it with make_unique..
You can use smart pointers with any allocator-deallocator pair by using a custom deallocation function. Here is an example using std::malloc and std::free:
struct freer
{
void operator()(void* p) const noexcept {
std::free(p);
}
};
template<class T>
using unique_c_ptr = std::unique_ptr<T, freer>;
template<class T>
[[nodiscard]] unique_c_ptr<T>
make_unique_malloc(std::size_t size) noexcept
{
static_asset(std::is_trvial_v<T>);
return unique_c_ptr<T>{static_cast<T*>(std::malloc(size))};
}
auto unique = make_unique_malloc<STORAGE_DEPENDENCY_INFO>(struct_size);
STORAGE_DEPENDENCY_INFO_TYPE_1 Version1Entries[];
Array of unspecified bound cannot be a non-static member in C++, so this class is ill-formed. There is no such thing as "dynamically sized structure" in C++.
Given that it is from a system library, it probably relies on some language extension.
C language does have "flexible array members", but those are allowed only for structs; not for unions.
STORAGE_DEPENDENCY_INFO *info = std::malloc(struct_size);
std::malloc returns a void* so this implicit conversion is also ill-formed in C++.
Related
I have a class hierarchy that I'm storing in a std::vector<std::unique_ptr<Base>>. There is frequent adding and removing from this vector, so I wanted to experiment with custom memory allocation to avoid all the calls to new and delete. I'd like to use STL tools only, so I'm trying std::pmr::unsynchronized_pool_resource for the allocation, and then adding a custom deleter to the unique_ptr.
Here's what I've come up with so far:
#include <memory_resource>
#include <vector>
#include <memory>
// dummy classes
struct Base
{
virtual ~Base() {}
};
struct D1 : public Base
{
D1(int i_) : i(i_) {}
int i;
};
struct D2 : public Base
{
D2(double d_) : d(d_) {}
double d;
};
// custom deleter: this is what I'm concerned about
struct Deleter
{
Deleter(std::pmr::memory_resource& m, std::size_t s, std::size_t a) :
mr(m), size(s), align(a) {}
void operator()(Base* a)
{
a->~Base();
mr.get().deallocate(a, size, align);
}
std::reference_wrapper<std::pmr::memory_resource> mr;
std::size_t size, align;
};
template <typename T>
using Ptr = std::unique_ptr<T, Deleter>;
// replacement function for make_unique
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args)
{
auto aPtr = m.allocate(sizeof(T), alignof(T));
return Ptr<T>(new (aPtr) T(args...), Deleter(m, sizeof(T), alignof(T)));
}
// simple construction of vector
int main()
{
auto pool = std::pmr::unsynchronized_pool_resource();
auto vec = std::vector<Ptr<Base>>();
vec.push_back(newT<Base>(pool));
vec.push_back(newT<D1>(pool, 2));
vec.push_back(newT<D2>(pool, 4.0));
return 0;
}
This compiles, and I'm pretty sure that it doesn't leak (please tell me if I'm wrong!) But I'm not too happy with the Deleter class, which has to take extra arguments for the size and alignment.
I first tried making it a template, so that I could work out the size and alignment automatically:
template <typename T>
struct Deleter
{
Deleter(std::pmr::memory_resource& m) :
mr(m) {}
void operator()(Base* a)
{
a->~Base();
mr.get().deallocate(a, sizeof(T), alignof(T));
}
std::reference_wrapper<std::pmr::memory_resource> mr;
};
But then the unique_ptrs for each type are incompatible, and the vector won't hold them.
Then I tried deallocating through the base class:
mr.get().deallocate(a, sizeof(Base), alignof(Base));
But this is clearly a bad idea, as the memory that's deallocated has a different size and alignment from what was allocated.
So, how do I deallocate through the base pointer without storing the size and alignment at runtime? delete seems to manage, so it seems like it should be possible here as well.
After writing my answer, I would recommend you stick with your code.
Letting unique_ptr handle the storage is not bad at all, it is allocated on stack if unique_ptr itself is, it is safe, and there is no additional overhead at deallocation time. The latter is not true for std::shared_ptr which uses type-erause for its deleters.
I think it is the cleanest and simplest way how to achieve the goal. And there's nothing wrong with your code as far as I can tell.
Most allocators to my knowledge allocate extra space for storing any data they need for deallocation directly next to the pointer they return to you. We can do the same to the aPtr blob:
// Extra information needed for deallocation
struct Header {
std::size_t s;
std::size_t a;
std::pmr::memory_resource* res;
};
// Deleter is now just a free function
void deleter(Base* a) {
// First delete the object itself.
a->~Base();
// Obtain the header
auto* ptr = reinterpret_cast<unsigned char*>(a);
Header* header = reinterpret_cast<Header*>(ptr - sizeof(Header));
// Deallocate the allocated blob.
header->res->deallocate(ptr, header->s, header->a);
};
// Use the new custom function.
template <typename T>
using Ptr = std::unique_ptr<T, decltype(&deleter)>;
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args) {
// Let the compiler calculate the correct way how to store `T` and `H`
// together.
struct Storage {
Header header;
T type;
};
Header h = {sizeof(Storage), alignof(Storage)};
auto aPtr = m.allocate(h.s, h.a);
// Use dummy header.
Storage* storage = new (aPtr) Storage{h, T(args...)};
static_assert(sizeof(Storage) == (sizeof(Header) + sizeof(T)),
"No padding bytes allowed in Storage.");
return Ptr<T>(&storage->type, deleter);
}
We store all information necessary for deallocation in Header structure.
Allocating both T and the header in a single blob is not straight forward as it might seem - see below. We need at least sizeof(T)+sizeof(Header) bytes but must also respect the alignof(T). So we let the compiler figure it out via Storage.
This way we can allocate T properly and return a pointer to &storage->type to the user. The issue now is that there might be some to-deleter-unknown amount of padding in Storage between header and type, thus the deleter function would not be able to recover &storage->header only from &storage->type pointer.
I have two proposals for this:
Just assert the padding amount to 0.
Manually write the header at the known place, albeit I cannot guarantee 100% safe.
Restricting to known padding
Although the extra padding in Storage is unlikely because Header is aligned to 8 bytes on normal 64-bit systems which should be generally enough for all Ts, there is no such alignment guarantee in C++. vtable pointer makes this even less guaranteed IMHO and the fact that alignas(N) offers some user-control over the alignment, increasing it in particular for e.g. vector instructions, doesn't help either. So to be safe, we can just use static_assert and if any "weird" type comes along, the code will not compile and remain safe.
If that happens, one can manually add extra padding to Storage and modify the subtraction amount. The cost would be extra memory for that padding for all allocations.
Writing the header manually
Another option is that we just ignore storage->header member and write the header ourselves directly before type, potentially into the padding area. This requires the use of memcopy because we cannot just placement-new it there because of possible alignof(Header) mismatch. Same in deleter itself because there is no Header object at ptr-sizeof(Header), simple reinterpret_cast<Header*>(ptr-sizeof(header)) would break the strict aliasing rule.
// Extra information needed for deallocation
struct Header {
std::size_t s;
std::size_t a;
std::pmr::memory_resource* res;
};
// Deleter is now just a free function
void deleter(Base* a) {
// First delete the object itself.
a->~Base();
// Obtain the header
auto* ptr = reinterpret_cast<unsigned char*>(a);
Header header;
std::memcpy(&header, ptr - sizeof(Header), sizeof(Header));
// Deallocate the allocated blob.
header.res->deallocate(ptr, header.s, header.a);
};
// Use the new custom function.
template <typename T>
using Ptr = std::unique_ptr<T, decltype(&deleter)>;
template <typename T, typename... Args>
Ptr<T> newT(std::pmr::memory_resource& m, Args... args) {
// Let the compiler calculate the correct way how to store `T` and `H`
// together.
struct Storage {
Header header;
// Padding???
T type;
};
Header h = {sizeof(Storage), alignof(Storage)};
auto aPtr = m.allocate(h.s, h.a);
// Use dummy header.
Storage* storage = new (aPtr) Storage{{0, 0}, T(args...)};
// Write our own header at the known -sizeof(Header) offset.
auto* ptr = reinterpret_cast<unsigned char*>(storage);
std::memcpy(ptr - sizeof(Header), &h, sizeof(Header));
return Ptr<T>(&storage->type, deleter);
}
I know this solution is safe w.r.t strict aliasing, object lifetime and allocating T. What I am not 100% certain about is whether the compiler is allowed to store anything relevant to T inside the potential padding bytes, which would thus be overwritten by the manually-written header.
Is there some way to create an array in C++ where we don't know the type, but we do know it's size and alignmnent requirements?
Let's say we have a template:
template<typename T>
T* create_array(size_t numElements) { return new T[numElements]; }
This works because each element T has known size and alignment, which is known at compile-time. But I'm looking for something where we can delegate the creation for later by simply extracting size and align and passing them on. This is the interface that I seek:
// my_header.hpp
// "internal" helper function, implementation in source file!
void* _create_array(size_t s, size_t a, size_t n);
template<typename T>
T* create_array(size_t numElements) {
return (T*)_create_array(sizeof(T), alignof(T), numElements);
}
Can we implement this in a source file?:
#include "my_header.hpp"
void* _create_array(size_t s, size_t a, size_t n) {
// ... ?
}
Requirements:
Each array element must have the correct alignment.
The total array size must be equal to s*n, and be aligned to a.
Type safety is assumed to be managed by the templated interface.
Indexing into the array should use correct size and align offsets.
I'm using C++20, so newer features may also be considered.
In advance, thank you!
While you can also implement this yourself, you can simply use std::allocator:
template<typename T>
constexpr T* create_array(size_t numElements) {
std::allocator<T> a;
return std::allocator_traits<decltype(a)>::allocate(a, numElements);
}
and then
template<typename T>
constexpr void destroy_array(T* ptr) noexcept {
std::allocator<T> a;
std::allocator_traits<decltype(a)>::deallocate(a, ptr);
}
The benefit over doing it yourself via a call to operator new is that this will also be usable in constant expression evaluation.
You then need to create objects in the returned storage via placement-new, std::allocator_traits<std::allocator<T>>::construct or std::construct_at.
Anyway, first make sure that you really need to do all of this memory management manually. Standard library containers already offer similar functionality, e.g. std::vector has a .reserve member function to reserve memory in which objects can be placed later via push_back, emplace_back, resize, etc.
If you want to implement the above yourself, you basically need
#include<new>
//...
void* create_array(size_t s, size_t a, size_t n) {
// CAREFUL: check here that `s*n` does not overflow! Potential for vulnerabilities!
return ::operator new(s*n, std::align_val_t{a});
}
void destroy_array(void* ptr, size_t a) noexcept {
::operator delete(ptr, std::align_val_t{a});
}
(Note that identifiers starting with an underscore are reserved in the global namespace scope and may not be used there as function names, so I changed the name.)
I have setup, similar to this:
There is class similar to vector (it is implemented using std::vector).
It contains pointers to int's.
I am using my own custom allocator.
The vector does not create elements, but it can destroy elements.
In order to destroy it needs to call non static method Allocator::deallocate(int *p).
If I do it with manual livetime management, I can call Allocator::deallocate(int *p) manually. This works, but is not RAII.
Alternatively, I can use std::unique_ptr with custom deleter. However if I do so, the size of array became double, because each std::unique_ptr must contain pointer to the allocator.
Is there any way I can do it without doubling the size of the vector?
Note i do not want to templatize the class.
Here is best RAII code I come up.
#include <functional>
#include <cstdlib>
#include <memory>
struct MallocAllocator{
template<class T>
static T *allocate(size_t size = sizeof(T) ) noexcept{
return reinterpret_cast<T *>( malloc(size) );
}
// this is deliberately not static method
void deallocate(void *p) noexcept{
return ::free(p);
}
// this is deliberately not static method
auto getDeallocate() noexcept{
return [this](void *p){
deallocate(p);
};
}
};
struct S{
std::function<void(void *)> fn;
S(std::function<void(void *)> fn) : fn(fn){}
auto operator()() const{
auto f = [this](void *p){
fn(p);
};
return std::unique_ptr<int, decltype(f)>{ (int *) malloc(sizeof(int)), f };
}
};
int main(){
MallocAllocator m;
S s{ m.getDeallocate() };
auto x = s();
printf("%zu\n", sizeof(x));
}
You can't do it. If you want your unique_ptr to store a reference to non-static deleter, there is nothing you can do about that, it will have to store it somewhere.
Some ways to work around this:
If you are using allocator-aware data structure, pass you allocator to it and don't use unique_ptr's, use the actual data type as a stored type.
Wrap you allocated objects around some sort of manager that would deallocate those objects when needed. You lose RAII inside it, but to outside code it will still be RAII. You can even transfer the ownership of some objects from this manager to the outside code and you'll only have to use custom deleter there.
(not recommended) Use some global state that you can access from deleter to make it size 0.
How is it possible to use C++ STL containers with jemalloc (or any other malloc implementation)?
Is it as simple as include jemalloc/jemalloc.h? Or should I write an allocator for them?
Edit: The application I'm working on allocates and frees relatively small objects over its lifetime. I want the replace the default allocator, because benchmarks showed that the application doesn't scale beyond 2 cores. Profiling showed that it was waiting for memory allocation, that's what caused the scaling issues. As I understand, jemalloc will help with that.
I'd like to see a solution, that's platform-neutral as the application has to work on both Linux and Windows. (Linking against a different implementation is easy under Linux, but it's very hard on Windows as far as I know.)
C++ allows you to replace operator new. If this replacement operator new calls je_malloc, then std::allocator will indirectly call je_malloc, and in turn all standard containers will.
This is by far the simplest approach. Writing a custom allocator requires writing an entire class. Replacing malloc may not be sufficient (there's no guarantee that the non-replaced operator new calls malloc), and it has the risks noted earlier by Adrian McCarthy
If you want to replace malloc everywhere in your program (which I wanted to and also seems the only logical solution), then all you have to do is link against it.
So, if you use gcc then all you have to do is:
g++ yourprogram.cpp -ljemalloc
But, if it's not possible, then you have to use jemalloc via another functions e.g. je_malloc and je_free, and then you have to overload the new and delete operators.
There's no need for including any header if you don't use implementation-specific features (statistics, mostly).
Writing an allocator is going to be the easiest solution, since the stl was designed to have interchangeable allocators. This will be the easiest path.
Some projects play games try to get the alternate malloc implementation to replace the malloc and news provided by the compiler's companion library. That's prone to all sorts of issues because you end up relying on specific implementation details of your compiler and the library it normally uses. This path is fraught with danger.
Some dangers of trying to replace malloc globally:
Static initializer order has limited guarantees in C++. There's no way to guarantee the allocator replacement is initialized before the first caller tries to use it, unless you ban static objects that might allocate memory. The runtime doesn't have this problem, since the compiler and the runtime work together to make sure the runtime is fully initialized before initializing any statics.
If you dynamically link to the runtime library, then there's no way to ensure some of the runtime library's code isn't already bound to its own implementation. Trying to modify the compiler's runtime library might lead to licensing issues when redistributing your application.
All other methods of allocation might not always ultimately rely on malloc. For example, an implementation of new might bypass malloc for large allocations and directly call the OS to allocate memory. That requires tracking to make sure such allocations aren't accidentally sent to the replacement free.
I believe Chromium and Firefox has both replaced the allocator, but they play some dirty tricks and probably have to update their approach as the compiler, linker, and runtime evolve.
Make yourself allocator. Do like this:
#include <vector>
template<typename T>
struct RemoveConst
{
typedef T value_type;
};
template<typename T>
struct RemoveConst<const T>
{
typedef T value_type;
};
template <class T>
class YourAlloc {
public:
// type definitions
typedef RemoveConst<T> Base;
typedef typename Base::value_type value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
// rebind allocator to type U
template <class U>
struct rebind {
typedef YourAlloc<U> other;
};
// return address of values
pointer address(reference value) const {
return &value;
}
const_pointer address(const_reference value) const {
return &value;
}
/* constructors and destructor
* - nothing to do because the allocator has no state
*/
YourAlloc() throw() {
}
YourAlloc(const YourAlloc&) throw() {
}
template <class U>
YourAlloc(const YourAlloc<U>&) throw() {
}
~YourAlloc() throw() {
}
// return maximum number of elements that can be allocated
size_type max_size() const throw() {
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
// allocate but don't initialize num elements of type T
pointer allocate(size_type num, const void* = 0) {
return (pointer)je_malloc(num * sizeof(T));
}
// initialize elements of allocated storage p with value value
void construct(pointer p, const T& value) {
// initialize memory with placement new
new((void*)p)T(value);
}
// destroy elements of initialized storage p
void destroy(pointer p) {
// destroy objects by calling their destructor
p->~T();
}
// deallocate storage p of deleted elements
void deallocate(pointer p, size_type num) {
je_free(p);
}
};
// return that all specializations of this allocator are interchangeable
template <class T1, class T2>
bool operator== (const YourAlloc<T1>&,
const YourAlloc<T2>&) throw() {
return true;
}
template <class T1, class T2>
bool operator!= (const YourAlloc<T1>&,
const YourAlloc<T2>&) throw() {
return false;
}
int main()
{
std::vector<int, YourAlloc<int>> vector;
return 0;
}
The code is copied from here
There may be problems as the constructors won't be called. You may use differnt options of operator new (has more options than just new) which can just allocate memory without calling constructor, or call the constructor in already allocated memory. http://www.cplusplus.com/reference/std/new/operator%20new%5B%5D/
Is it necessary for me to call allocator.construct() for an array of primitive types allocated using an arbitrary allocator, as in the code listing below? The class doesn't require the allocated memory to be initialized to any particular value, so it seems to me that calling allocator.construct() with a newly-allocated chunk of memory would be unnecessary. Is there any danger in not calling this method, given that the array always consists of primitive types?
template <class T, template <class> class Allocator = std::allocator>
class foo
{
public:
typedef Allocator<T> allocator;
typedef typename allocator::pointer pointer;
private:
unsigned size_;
allocator alloc_;
pointer t_;
public:
foo(unsigned n) throw(std::bad_alloc) : size_(n), alloc_(),
t_(alloc_.allocate(n))
{
// Note that I do not call alloc_.construct() here.
}
~foo() { alloc_.deallocate(t_, size_); }
};
Yes. The allocator is free to impose whatever custom book-keeping it wants, including the number of existing objects. There is no guarantee at all that it simply does new (memory) T(...). And in addition, it would be a very nasty surprise for a person to change your code so that it's no longer just primitives and then find it randomly breaks sometime later.