Background:
For hardware dependant reasons, I need to allocate memory of using the DOS Protected Mode Interface in order to communicate with some low-level interfaces (e.g. VESA BIOS Extensions).
Situation:
So I can overload new and delete for dynamically allocated memory which is great but I really want to overload the allocators for statically allocated memory. The project I'm working on is rather old library and therefore requires a fair number of static global variables.
Question:
Is there some way I could overload the allocation process for these variables? If not, is there a template to dynamically allocate these variables that wouldn't require explicit allocation or deletion and be almost entirely transparent?
Sadly, there is no keyword to specify the type of memory allocation you wish to use and there are no standard templates to force heap allocation. The best answer I could find was to just make a small wrapper class a that allocates and deletes memory how you want and provides you with access to the pointer. It's nothing fancy but here's some of my code.
template<typename T>
class mem
{
public:
mem(void) { m_data = reinterpret_cast<T*>(dos::malloc(sizeof(T))); }
~mem(void) { dos::free(m_data); m_data = nullptr; }
T* operator ->(void) noexcept { return m_data; }
operator T*(void) noexcept { return m_data; }
private:
T* m_data;
};
Related
I am using some C-library in my C++ code. The library wants me to allocate some amount of the memory and pass the pointer to the library. Unfortunately, the exact required memory size is not known in advance, so the library also requires me to provide C-callback with the following signature:
void* callback_realloc(void* ptr, size_t new_size);
where ptr is previously passed memory and new_size is required size, and the callback must return the pointer to newly allocated memory. There is no direct way to store the allocator state. Instead, I need to rely on pointer arithmetic somehow as the following:
template<class T>
struct o_s {
std::aligned_storage_t<sizeof(T), alignof(T)> data;
};
template<class Alloc>
struct o_i: private Alloc {
std::size_t allocated_size;
const Alloc& get_allocator() const { return *this; }
};
template<class T>
struct o: public o_i, public o_s<T> {
void* ptr() {
return &data;
}
// Additionally, override class operator new and operator delete...
};
void* my_callback(void* ptr, size_t new_size) {
auto meta = static_cast<o*>(reinterpret_cast<o_s<char>*>(ptr));
// access to the allocator state ...
}
Then sizeof(o_i) + initial_size memory is allocated and ptr() is passed to the C library.
At this point, I understand that I am not the first person in the world who needs this pattern. Unfortunately (and surprisingly) I have not found anything suitable for this in Boost or STL. I would like to use ready implementation to avoid possible underwater rocks.
The simplest solution is to allocate the memory using std::malloc, and use std::realloc as the callback. C API such as this are the case where using those makes sense in C++.
You don't necessarily have to use std::malloc however. You can implement a custom allocator if you want to. Using some of the allocated storage is one way of storing allocation metadata, and it's an efficient way. That's not necessary either though, since you can also store the metadata separately in a map-like structure.
I don't think you'd find an idiomatic way to do this in C++, as this is a C idiom.
Conceptually, you have two ways of going about this:
Store the metadata alongside the buffer you've allocated (as you seem to be doing). I feel using double inheritance etc. is a bit overkill here, when you could just allocate a char buffer of sizeof(size_t)+allocation_size and use the first part for your metadata.
Allocate and return to the API a raw buffer as needed, and use a separate static data structure to manage this with a map of ptr->allocation size. I suppose this is what malloc/realloc is doing behind the scenes anyway.
Both are valid, and the one you chose depends on the specific details of your application. When dealing with memory it's ok to actually deal with memory, be it pointer arithmetic or what not.
The important thing is probably to keep the ugly bit to a single location, and provide a C++ style API to this on the C++ side of things, so that the client code isn't exposed to implementation details.
I am investigating the pros and cons between using class overloaded news and deletes vs placement news. By this I mean, either declaring every class I may wish to new and delete with their own operator overloads, or by using the memory manager to give me the memory I need via placement new.
I have a memory manager that allows me to allocate memory from a number of pools:
enum MemPool
{
kPool1,
kPool2,
}
class MemoryManager
{
public:
template <typename T>
void* Allocate(MemPool pool);
void Remove(MemPool pool, void* ptr);
};
MemoryManager g_mmgr;
The Allocate is templated since in debug mode I store the name of each allocation (via typeid(T).name()) and I can get the size of each allocation via sizeof(T)
I see myself as having at least 2 options as to how to allocate and am trying to decide which is best in terms of syntactical usage, efficiency, safety and portability.
Option 1 is to have a templated base class with news and deletes, which wraps up the mempool and type nicely for me.
template <typename T, MemPool pool>
class MemPoolUser
{
public:
static void* operator new(int size)
{
return g_mmgr.Allocate<T>(pool);
}
static void operator delete(void* ptr)
{
g_mmgr.Remove(pool,ptr);
}
};
I could then ensure that every class that may need newing via the MemoryManager is declared thus:
class MyClass : public MemPoolUser<MyClass, kPool1>
{
};
This will allow me to simply do
MyClass* c = new MyClass();
...
delete c;
and the correct new and delete inside MemPoolUser will be called.
Option 2 is to use placement news:
class MyClass
{
};
MyClass* c = new (g_mmgr.Allocate<MyClass>(kPool1)) MyClass();
....
c->~MyClass();
g_mmgr.Remove(kPool1,c);
Any pros and cons to each of these options? Option 1 seems neater, but I have to know the type of mempool I want each class to allocate from, which may depend on other runtime factors.
Option 2 is more flexible but the newing and deleting is syntactically ugly (it could be wrapped in #defines)
So my question is, apart from the above problems mentioned, is there anything else I have failed to consider with these two options and is one more hazardous than the other?
If you MUST do this in the first place (and I expect you do have some pretty decent use-case where which has some specific need for a specific memory pool for these object), I would definitely go for the operator new and operator delete solution. It allows your application-code that uses this to be written with a minimal amount of extra coding above and beyond the typical code.
I am writing a performance critical application in which I am creating large number of objects of similar type to place orders. I am using boost::singleton_pool for allocating memory. Finally my class looks like this.
class MyOrder{
std::vector<int> v1_;
std::vector<double> v2_;
std::string s1_;
std::string s2_;
public:
MyOrder(const std::string &s1, const std::string &s2): s1_(s1), s2_(s2) {}
~MyOrder(){}
static void * operator new(size_t size);
static void operator delete(void * rawMemory) throw();
static void operator delete(void * rawMemory, std::size_t size) throw();
};
struct MyOrderTag{};
typedef boost::singleton_pool<MyOrderTag, sizeof(MyOrder)> MyOrderPool;
void* MyOrder:: operator new(size_t size)
{
if (size != sizeof(MyOrder))
return ::operator new(size);
while(true){
void * ptr = MyOrderPool::malloc();
if (ptr != NULL) return ptr;
std::new_handler globalNewHandler = std::set_new_handler(0);
std::set_new_handler(globalNewHandler);
if(globalNewHandler) globalNewHandler();
else throw std::bad_alloc();
}
}
void MyOrder::operator delete(void * rawMemory) throw()
{
if(rawMemory == 0) return;
MyOrderPool::free(rawMemory);
}
void MyOrder::operator delete(void * rawMemory, std::size_t size) throw()
{
if(rawMemory == 0) return;
if(size != sizeof(Order)) {
::operator delete(rawMemory);
}
MyOrderPool::free(rawMemory);
}
I recently posted a question about performance benefit in using boost::singleton_pool. When I compared the performances of boost::singleton_pool and default allocator, I did not gain any performance benefit. When someone pointed that my class had members of the type std::string, whose allocation was not being governed by my custom allocator, I removed the std::string variables and reran the tests. This time I noticed a considerable performance boost.
Now, in my actual application, I cannot get rid of member variables of time std::string and std::vector. Should I be using boost::pool_allocator with my std::string and std::vector member variables?
boost::pool_allocator allocates memory from an underlying std::singleton_pool. Will it matter if different member variables (I have more than one std::string/std::vector types in my MyOrder class. Also I am employing pools for classes other than MyOrder which contain std::string/std::vector types as members too) use the same memory pool? If it does, how do I make sure that they do one way or the other?
Now, in my actual application, I cannot get rid of member variables of time std::string and std::vector. Should I be using boost::pool_allocator with my std::string and std::vector member variables?
I have never looked into that part of boost, but if you want to change where strings allocate their memory, you need to pass a different allocator to std::basic_string<> at compile time. There is no other way. However, you need to be aware of the downsides of that: For example, such strings will not be assignable to std::string anymore. (Although employing c_str() would work, it might impose a small performance penalty.)
boost::pool_allocator allocates memory from an underlying std::singleton_pool. Will it matter if different member variables (I have more than one std::string/std::vector types in my MyOrder class. Also I am employing pools for classes other than MyOrder which contain std::string/std::vector types as members too) use the same memory pool? If it does, how do I make sure that they do one way or the other?
The whole point of a pool is to put more than one object into it. If it was just one, you wouldn't need a pool. So, yes, you can put several objects into it, including the dynamic memory of several std::string objects.
Whether this gets you any performance gains, however, remains to be seen. You use a pool because you have reasons to assume that it is faster than the general-purpose allocator (rather than using it to, e.g., allocate memory from a specific area, like shared memory). Usually such a pool is faster because it can make assumptions on the size of the objects allocated within. That's certainly true for your MyOrder class: objects of it always have the same size, otherwise (larger derived classes) you won't allocate them in the pool.
That's different for std::string. The whole point of using a dynamically allocating string class is that it adapts to any string lengths. The memory chunks needed for that are of different size (otherwise you could just char arrays instead). I see little room for a pool allocator to improve over the general-purpose allocator for that.
On a side note: Your overloaded operator new() returns the result of invoking the global one, but your operator delete just passes anything coming its way to that pool's free(). That seems very suspicious to me.
Using a custom allocator for the std::string/std::vector in your class would work (assuming the allocator is correct) - but only performance testing will see if you really see any benefits from it.
Alternatively, if you know that the std::string/std::vector will have upper limits, you could implement a thin wrapper around a std::array (or normal array if you don't have c++11) that makes it a drop in replacement.
Even if the size is unbounded, if there is some size that most values would be less than, you could extend the std::array based implementations above to be expandable by allocating with your pooled allocator if they fill up.
How is it possible to use C++ STL containers with jemalloc (or any other malloc implementation)?
Is it as simple as include jemalloc/jemalloc.h? Or should I write an allocator for them?
Edit: The application I'm working on allocates and frees relatively small objects over its lifetime. I want the replace the default allocator, because benchmarks showed that the application doesn't scale beyond 2 cores. Profiling showed that it was waiting for memory allocation, that's what caused the scaling issues. As I understand, jemalloc will help with that.
I'd like to see a solution, that's platform-neutral as the application has to work on both Linux and Windows. (Linking against a different implementation is easy under Linux, but it's very hard on Windows as far as I know.)
C++ allows you to replace operator new. If this replacement operator new calls je_malloc, then std::allocator will indirectly call je_malloc, and in turn all standard containers will.
This is by far the simplest approach. Writing a custom allocator requires writing an entire class. Replacing malloc may not be sufficient (there's no guarantee that the non-replaced operator new calls malloc), and it has the risks noted earlier by Adrian McCarthy
If you want to replace malloc everywhere in your program (which I wanted to and also seems the only logical solution), then all you have to do is link against it.
So, if you use gcc then all you have to do is:
g++ yourprogram.cpp -ljemalloc
But, if it's not possible, then you have to use jemalloc via another functions e.g. je_malloc and je_free, and then you have to overload the new and delete operators.
There's no need for including any header if you don't use implementation-specific features (statistics, mostly).
Writing an allocator is going to be the easiest solution, since the stl was designed to have interchangeable allocators. This will be the easiest path.
Some projects play games try to get the alternate malloc implementation to replace the malloc and news provided by the compiler's companion library. That's prone to all sorts of issues because you end up relying on specific implementation details of your compiler and the library it normally uses. This path is fraught with danger.
Some dangers of trying to replace malloc globally:
Static initializer order has limited guarantees in C++. There's no way to guarantee the allocator replacement is initialized before the first caller tries to use it, unless you ban static objects that might allocate memory. The runtime doesn't have this problem, since the compiler and the runtime work together to make sure the runtime is fully initialized before initializing any statics.
If you dynamically link to the runtime library, then there's no way to ensure some of the runtime library's code isn't already bound to its own implementation. Trying to modify the compiler's runtime library might lead to licensing issues when redistributing your application.
All other methods of allocation might not always ultimately rely on malloc. For example, an implementation of new might bypass malloc for large allocations and directly call the OS to allocate memory. That requires tracking to make sure such allocations aren't accidentally sent to the replacement free.
I believe Chromium and Firefox has both replaced the allocator, but they play some dirty tricks and probably have to update their approach as the compiler, linker, and runtime evolve.
Make yourself allocator. Do like this:
#include <vector>
template<typename T>
struct RemoveConst
{
typedef T value_type;
};
template<typename T>
struct RemoveConst<const T>
{
typedef T value_type;
};
template <class T>
class YourAlloc {
public:
// type definitions
typedef RemoveConst<T> Base;
typedef typename Base::value_type value_type;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef std::size_t size_type;
typedef std::ptrdiff_t difference_type;
// rebind allocator to type U
template <class U>
struct rebind {
typedef YourAlloc<U> other;
};
// return address of values
pointer address(reference value) const {
return &value;
}
const_pointer address(const_reference value) const {
return &value;
}
/* constructors and destructor
* - nothing to do because the allocator has no state
*/
YourAlloc() throw() {
}
YourAlloc(const YourAlloc&) throw() {
}
template <class U>
YourAlloc(const YourAlloc<U>&) throw() {
}
~YourAlloc() throw() {
}
// return maximum number of elements that can be allocated
size_type max_size() const throw() {
return std::numeric_limits<std::size_t>::max() / sizeof(T);
}
// allocate but don't initialize num elements of type T
pointer allocate(size_type num, const void* = 0) {
return (pointer)je_malloc(num * sizeof(T));
}
// initialize elements of allocated storage p with value value
void construct(pointer p, const T& value) {
// initialize memory with placement new
new((void*)p)T(value);
}
// destroy elements of initialized storage p
void destroy(pointer p) {
// destroy objects by calling their destructor
p->~T();
}
// deallocate storage p of deleted elements
void deallocate(pointer p, size_type num) {
je_free(p);
}
};
// return that all specializations of this allocator are interchangeable
template <class T1, class T2>
bool operator== (const YourAlloc<T1>&,
const YourAlloc<T2>&) throw() {
return true;
}
template <class T1, class T2>
bool operator!= (const YourAlloc<T1>&,
const YourAlloc<T2>&) throw() {
return false;
}
int main()
{
std::vector<int, YourAlloc<int>> vector;
return 0;
}
The code is copied from here
There may be problems as the constructors won't be called. You may use differnt options of operator new (has more options than just new) which can just allocate memory without calling constructor, or call the constructor in already allocated memory. http://www.cplusplus.com/reference/std/new/operator%20new%5B%5D/
Have you overloaded operator new in C++?
If yes, why?
An interview question, for which I humbly request some of your thoughts.
We had an embedded system where new was only rarely allowed, and the memory could never be deleted, as we had to prove a maximum heap usage for reliability reasons.
We had a third party library developer who didn't like those rules, so they overloaded new and delete to work against a chunk of memory we allocated just for them.
Yes.
Overloading operator new gives you a chance to control where an object lives in memory. I did this because I knew some details about the lifetime of my objects, and wanted to avoid fragmentation on a platform which didn't have virtual memory.
You would overload new if you're using your own allocator, doing something fancy with reference counting, instrumenting for garbage collection, debugging object lifetimes or something else entirely; you're replacing the allocator for objects. I've personally had to do it to ensure certain objects get allocated on specific mmap'ed pages of memory.
Yes, for two reasons: Custom allocator, and custom allocation tracking.
Overloading new operator may look as a good idea at first glance if you want to do custom allocation for some reason (i.e. avoiding memory fragmentation intrinsic to c-runtime allocator or/and avoiding locks on memory management calls in multithreaded programs). But when you get to implementation you may realize that in most cases you want to pass some additional context to this call, for example a thread-specific heap for objects of a given size. And overloading of new/delete simply doesn't work here. So eventually you may want to create your own facade to your custom memory management subsystem.
I found it very handy to overload operator new when writing Python extension code in C++. I wrapped the Python C-API code for allocation and deallocation in operator new and operator delete overloads, respectively – this allows for PyObject*-compatible structures that can be created with new MyType() and managed with predictable heap-allocation semantics.
It also allows for a separation of the allocation code (normally in the Python __new__ method) and the initialization code (in Python’s __init__) into, respectively, the operator new overloads and any constructors one sees fit to define.
Here’s a sample:
struct ModelObject {
static PyTypeObject* type_ptr() { return &ModelObject_Type; }
/// operator new performs the role of tp_alloc / __new__
/// Not using the 'new size' value here
void* operator new(std::size_t) {
PyTypeObject* type = type_ptr();
ModelObject* self = reinterpret_cast<ModelObject*>(
type->tp_alloc(type, 0));
if (self != NULL) {
self->weakrefs = NULL;
self->internal = std::make_unique<buffer_t>(nullptr);
}
return reinterpret_cast<void*>(self);
}
/// operator delete acts as our tp_dealloc
void operator delete(void* voidself) {
ModelObject* self = reinterpret_cast<ModelObject*>(voidself);
PyObject* pyself = reinterpret_cast<PyObject*>(voidself);
if (self->weakrefs != NULL) { PyObject_ClearWeakRefs(pyself); }
self->cleanup();
type_ptr()->tp_free(pyself);
}
/// Data members
PyObject_HEAD
PyObject* weakrefs = nullptr;
bool clean = false;
std::unique_ptr<buffer_t> internal;
/// Constructors fill in data members, analagous to __init__
ModelObject()
:internal(std::make_unique<buffer_t>())
,accessor{}
{}
explicit ModelObject(buffer_t* buffer)
:clean(true)
,internal(std::make_unique<buffer_t>(buffer))
{}
ModelObject(ModelObject const& other)
:internal(im::buffer::heapcopy(other.internal.get()))
{}
/// No virtual methods are defined to keep the struct as a POD
/// ... instead of using destructors I defined a 'cleanup()' method:
void cleanup(bool force = false) {
if (clean && !force) {
internal.release();
} else {
internal.reset(nullptr);
clean = !force;
}
}
/* … */
};