C++ - overload operator new and provide additional arguments

I know you can overload operator new. When you do, your method is passed a size_t parameter by default. However, is it possible to pass additional user-provided parameters, along with the size_t parameter, to the overloaded operator new?
For example
int a = 5;
Monkey* monk = new Monkey(a);
Because I want to override new operator like this
void* Monkey::operator new(size_t size, int a)
{
...
}
Thanks
EDIT: Here's what I want to accomplish:
I have a chunk of virtual memory allocated at the start of the app (a memory pool). All objects that inherit my base class will inherit its overloaded new operator.
The reason I want to sometimes pass an argument in overloaded new is to tell my memory manager if I want to use the memory pool, or if I want to allocate it with malloc.

Invoke new with that additional operand, e.g.
Monkey *amonkey = new (1275) Monkey(a);
addenda
A practical example of passing argument[s] to your new operator is given by Boehm's garbage collector, which enables you to code
Monkey *acollectedmonkey = new(UseGc) Monkey(a);
and then you don't have to bother about delete-ing acollectedmonkey (assuming its destructor doesn't do weird things; see this answer). These are the rare situations where you want to pass an explicit Allocator argument to template collections like std::vector or std::map.
When using memory pools, you often want to have some MemoryPool class, and pass instances (or pointers to them) of that class to your new and your delete operations. For readability reasons, I won't recommend referencing memory pools by some obscure integer number.
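As a rough sketch of that idea, the following passes a pool object, rather than an integer, to a class-level operator new. The MemoryPool class, its bump-pointer strategy, and the demo_pool_new function are assumptions made for illustration; they are not taken from the question or from any particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bump-pointer pool; not part of the original question.
class MemoryPool {
public:
    explicit MemoryPool(std::size_t bytes)
        : base_(static_cast<char*>(std::malloc(bytes))), size_(bytes), used_(0) {}
    ~MemoryPool() { std::free(base_); }
    MemoryPool(const MemoryPool&) = delete;
    MemoryPool& operator=(const MemoryPool&) = delete;

    void* allocate(std::size_t n) {
        // Round up so every block stays aligned for any fundamental type.
        const std::size_t a = alignof(std::max_align_t);
        n = (n + a - 1) / a * a;
        if (base_ == nullptr || n > size_ - used_) throw std::bad_alloc();
        void* p = base_ + used_;
        used_ += n;
        return p;
    }
    std::size_t used() const { return used_; }

private:
    char* base_;
    std::size_t size_;
    std::size_t used_;
};

class Monkey {
public:
    explicit Monkey(int a) : age(a) {}
    int age;

    // Arguments after size_t come from the call site: new (pool) Monkey(a)
    static void* operator new(std::size_t size, MemoryPool& pool) {
        return pool.allocate(size);
    }
    // Matching placement delete: called only if the constructor throws.
    static void operator delete(void*, MemoryPool&) noexcept {}
    // Usual delete: this pool releases everything at once, so it is a no-op.
    static void operator delete(void*) noexcept {}
};

// Allocates one Monkey from the pool and reports how many pool bytes were used.
std::size_t demo_pool_new() {
    MemoryPool pool(1024);
    Monkey* m = new (pool) Monkey(5);
    assert(m->age == 5);
    m->~Monkey();   // run the destructor; the storage goes away with the pool
    return pool.used();
}
```

Note that declaring a class-level operator new hides the global one, so a plain new Monkey(a) no longer compiles for this class unless an operator new(std::size_t) overload is also declared.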

Related

Dealing with std::string/std::vector member variables while using boost::singleton_pool

I am writing a performance critical application in which I am creating large number of objects of similar type to place orders. I am using boost::singleton_pool for allocating memory. Finally my class looks like this.
class MyOrder{
std::vector<int> v1_;
std::vector<double> v2_;
std::string s1_;
std::string s2_;
public:
MyOrder(const std::string &s1, const std::string &s2): s1_(s1), s2_(s2) {}
~MyOrder(){}
static void * operator new(size_t size);
static void operator delete(void * rawMemory) throw();
static void operator delete(void * rawMemory, std::size_t size) throw();
};
struct MyOrderTag{};
typedef boost::singleton_pool<MyOrderTag, sizeof(MyOrder)> MyOrderPool;
void* MyOrder:: operator new(size_t size)
{
if (size != sizeof(MyOrder))
return ::operator new(size);
while(true){
void * ptr = MyOrderPool::malloc();
if (ptr != NULL) return ptr;
std::new_handler globalNewHandler = std::set_new_handler(0);
std::set_new_handler(globalNewHandler);
if(globalNewHandler) globalNewHandler();
else throw std::bad_alloc();
}
}
void MyOrder::operator delete(void * rawMemory) throw()
{
if(rawMemory == 0) return;
MyOrderPool::free(rawMemory);
}
void MyOrder::operator delete(void * rawMemory, std::size_t size) throw()
{
if(rawMemory == 0) return;
if(size != sizeof(MyOrder)) {
::operator delete(rawMemory);
}
MyOrderPool::free(rawMemory);
}
I recently posted a question about the performance benefit of using boost::singleton_pool. When I compared the performance of boost::singleton_pool and the default allocator, I did not gain any performance benefit. When someone pointed out that my class had members of the type std::string, whose allocation was not being governed by my custom allocator, I removed the std::string variables and reran the tests. This time I noticed a considerable performance boost.
Now, in my actual application, I cannot get rid of member variables of type std::string and std::vector. Should I be using boost::pool_allocator with my std::string and std::vector member variables?
boost::pool_allocator allocates memory from an underlying boost::singleton_pool. Will it matter if different member variables (I have more than one std::string/std::vector types in my MyOrder class. Also I am employing pools for classes other than MyOrder which contain std::string/std::vector types as members too) use the same memory pool? If it does, how do I make sure that they do one way or the other?
Now, in my actual application, I cannot get rid of member variables of type std::string and std::vector. Should I be using boost::pool_allocator with my std::string and std::vector member variables?
I have never looked into that part of boost, but if you want to change where strings allocate their memory, you need to pass a different allocator to std::basic_string<> at compile time. There is no other way. However, you need to be aware of the downsides of that: For example, such strings will not be assignable to std::string anymore. (Although employing c_str() would work, it might impose a small performance penalty.)
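To make the compile-time nature of that choice concrete, here is a minimal sketch. CountingAllocator, tracked_allocations, PoolString and demo_custom_string are hypothetical names; the allocator simply forwards to ::operator new and counts calls, standing in for a pool-backed allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Counter shared by all CountingAllocator instances (illustrative only).
inline std::size_t& tracked_allocations() { static std::size_t n = 0; return n; }

// Hypothetical minimal allocator: forwards to ::operator new and counts calls.
template <typename T>
struct CountingAllocator {
    using value_type = T;
    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++tracked_allocations();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// The allocator is part of the string's type, so this is not std::string.
using PoolString = std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Returns how many times the custom allocator was asked for memory.
std::size_t demo_custom_string() {
    const std::size_t before = tracked_allocations();
    PoolString s(100, 'x');      // long enough to need heap storage
    // std::string t = s;        // would not compile: different types
    std::string t(s.c_str());    // conversion goes through a copy
    assert(t.size() == 100);
    return tracked_allocations() - before;
}
```

The commented-out line is the incompatibility mentioned above: the two string types only interoperate through something like c_str(), which costs a copy.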
boost::pool_allocator allocates memory from an underlying boost::singleton_pool. Will it matter if different member variables (I have more than one std::string/std::vector types in my MyOrder class. Also I am employing pools for classes other than MyOrder which contain std::string/std::vector types as members too) use the same memory pool? If it does, how do I make sure that they do one way or the other?
The whole point of a pool is to put more than one object into it. If it was just one, you wouldn't need a pool. So, yes, you can put several objects into it, including the dynamic memory of several std::string objects.
Whether this gets you any performance gains, however, remains to be seen. You use a pool because you have reasons to assume that it is faster than the general-purpose allocator (rather than using it to, e.g., allocate memory from a specific area, like shared memory). Usually such a pool is faster because it can make assumptions on the size of the objects allocated within. That's certainly true for your MyOrder class: objects of it always have the same size, otherwise (larger derived classes) you won't allocate them in the pool.
That's different for std::string. The whole point of using a dynamically allocating string class is that it adapts to any string length. The memory chunks needed for that are of different sizes (otherwise you could just use char arrays instead). I see little room for a pool allocator to improve over the general-purpose allocator for that.
On a side note: Your overloaded operator new() returns the result of invoking the global one, but your operator delete just passes anything coming its way to that pool's free(). That seems very suspicious to me.
Using a custom allocator for the std::string/std::vector in your class would work (assuming the allocator is correct) - but only performance testing will see if you really see any benefits from it.
Alternatively, if you know that the std::string/std::vector will have upper limits, you could implement a thin wrapper around a std::array (or normal array if you don't have c++11) that makes it a drop in replacement.
Even if the size is unbounded, if there is some size that most values would be less than, you could extend the std::array based implementations above to be expandable by allocating with your pooled allocator if they fill up.
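A very small sketch of the bounded case might look like the following. FixedString and demo_fixed_string are hypothetical names, and only a handful of members are shown; a real drop-in replacement would need much more of the std::string interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical fixed-capacity string: the characters live inside the object,
// so a pooled MyOrder-like class would need no separate heap allocation for it.
template <std::size_t N>
class FixedString {
public:
    FixedString() : len_(0) { buf_[0] = '\0'; }
    explicit FixedString(const char* s) : len_(std::strlen(s)) {
        if (len_ > N) throw std::length_error("FixedString capacity exceeded");
        std::memcpy(buf_.data(), s, len_ + 1);   // copy including the terminator
    }
    const char* c_str() const { return buf_.data(); }
    std::size_t size() const { return len_; }
    static constexpr std::size_t capacity() { return N; }

private:
    std::array<char, N + 1> buf_;   // one extra byte for the terminator
    std::size_t len_;
};

// Builds a short string and returns its length.
std::size_t demo_fixed_string() {
    FixedString<15> symbol("MSFT");
    assert(std::strcmp(symbol.c_str(), "MSFT") == 0);
    return symbol.size();
}
```

The expandable variant described above would add a fallback pointer that is only allocated (from the pool) when the inline buffer overflows.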

C++ New vs Malloc for dynamic memory array of Objects

I have a class Bullet that takes several arguments for its construction. However, I am using a dynamic memory array to store them. I am using C++, so I want to conform to its standard by using the new operator to allocate the memory. The problem is that the new operator asks for the constructor arguments when I'm allocating the array, which I don't have at that time. I can accomplish this using malloc to get the right size and then fill it in from there, but that's not what I want to use :) Any ideas?
pBulletArray = (Bullet*) malloc(iBulletArraySize * sizeof(Bullet)); // Works
pBulletArray = new Bullet[iBulletArraySize]; // Requires constructor arguments
Thanks.
You can't.
And if you truly want to conform to C++ standards, you should use std::vector.
FYI, it would probably be even more expensive than what you're trying to achieve. If you could do this, new would call a constructor. But since you'll modify the object later on anyway, the initial construction is useless.
1) std::vector
A std::vector really is the proper C++ way to do this.
std::vector<Bullet> bullets;
bullets.reserve(10); // allocate memory for bullets without constructing any
bullets.push_back(Bullet(10.2,"Bang")); // put a Bullet in the vector.
bullets.emplace_back(10.2,"Bang"); // (C++11 only) construct a Bullet in the vector without copying.
2) new [] operator
It is also possible to do this with new, but you really shouldn't. Manually managing resources with new/delete is an advanced task, similar to template meta-programming in that it's best left to library builders, who'll use these features to build efficient, high level libraries for you. In fact to do this correctly you'll basically be implementing the internals of std::vector.
When you use the new operator to allocate an array, every element in the array is default initialized. Your code could work if you added a default constructor to Bullet:
class Bullet {
public:
Bullet() {} // default constructor
Bullet(double,std::string const &) {}
};
std::unique_ptr<Bullet[]> b(new Bullet[10]); // default construct 10 bullets
Then, when you have the real data for a Bullet you can assign it to one of the elements of the array:
b[3] = Bullet(20.3,"Bang");
Note the use of unique_ptr to ensure that proper clean-up occurs, and that it's exception safe. Doing these things manually is difficult and error prone.
3) operator new
The new operator initializes its objects in addition to allocating space for them. If you want to simply allocate space, you can use operator new.
std::unique_ptr<Bullet,void(*)(Bullet*)> bullets(
static_cast<Bullet*>(::operator new(10 * sizeof(Bullet))),
[](Bullet *b){::operator delete(b);});
(Note that the unique_ptr ensures that the storage will be deallocated but no more. Specifically, if we construct any objects in this storage we have to manually destruct them and do so in an exception safe way.)
bullets now points to storage sufficient for an array of Bullets. You can construct an array in this storage:
new (bullets.get()) Bullet[10];
However the array construction again uses default initialization for each element, which we're trying to avoid.
AFAIK C++ doesn't specify any well defined method of constructing an array without constructing the elements. I imagine this is largely because doing so would be a no-op for most (all?) C++ implementations. So while the following is technically undefined, in practice it's pretty well defined.
bool constructed[10] = {}; // a place to mark which elements are constructed
// construct some elements of the array
for(int i=0;i<10;i+=2) {
try {
// pretend bullets points to the first element of a valid array. Otherwise 'bullets.get()+i' is undefined
new (bullets.get()+i) Bullet(10.2,"Bang");
constructed[i] = true;
} catch(...) {}
}
That will construct elements of the array without using the default constructor. You don't have to construct every element, just the ones you want to use. However when destroying the elements you have to remember to destroy only the elements that were constructed.
// destruct the elements of the array that we constructed before
for(int i=0;i<10;++i) {
if(constructed[i]) {
bullets.get()[i].~Bullet(); // unique_ptr<Bullet> has no operator[], so index the raw pointer
}
}
// unique_ptr destructor will take care of deallocating the storage
The above is a pretty simple case. Making non-trivial uses of this method exception safe without wrapping it all up in a class is more difficult. Wrapping it up in a class basically amounts to implementing std::vector.
4) std::vector
So just use std::vector.
It's possible to do what you want -- search for "operator new" if you really want to know how. But it's almost certainly a bad idea. Instead, use std::vector, which will take care of all the annoying details for you. You can use std::vector::reserve to allocate all the memory you'll use ahead of time.
Bullet** pBulletArray = new Bullet*[iBulletArraySize];
Then populate pBulletArray:
for(int i = 0; i < iBulletArraySize; i++)
{
pBulletArray[i] = new Bullet(arg0, arg1);
}
Just don't forget to free the memory afterwards: delete each Bullet, then delete[] the pointer array.
The way C++ new normally works is allocating the memory for the class instance and then calling the constructor for that instance. You basically have already allocated the memory for your instances.
You can call only the constructor for the first instance like this:
new((void*)pBulletArray) Bullet(foo);
Calling the constructor of the second one would look like this (and so on)
new((void*)(pBulletArray+1)) Bullet(bar);
if the Bullet constructor takes an int (foo and bar being int values here).
If what you're really after here is just fast allocation/deallocation, then you should look into "memory pools." I'd recommend using boost's implementation, rather than trying to roll your own. In particular, you would probably want to use an "object_pool".

Is there a C++ allocator that respects an overridden new/delete?

I'm implementing a resource-allocating cloning operation for an array of type T. The straightforward implementation uses new T[sz] followed by a std::copy call from the source into the new array. It walks memory twice.
I'd like to allocate raw memory and then use std::uninitialized_copy so I only walk memory once for performance reasons. I know how to accomplish this when a custom allocator is used (Allocator.allocate followed by std::uninitialized_copy), and I know how to accomplish this using std::allocator (which employs ::operator new following lib.allocator.members in section 20.4.1.1 of the specification). My concern is that a std::allocator-based approach seems wrong for types T where T::operator new has been defined. I know I can detect such a situation using Boost.TypeTraits' has_new_operator.
Is there a simple, standards-compliant way to allocate-and-then-initialize raw memory in a fashion that will respect an overridden new (and does so passing over memory only once)? If not, does using SFINAE to dispatch between an implementation employing std::allocator and one using the overridden operator new seem reasonable? FWIW, grepping through Boost does not show such a use of the has_new_operator trait.
Thanks,
Rhys
Seems it isn't possible. Only operator new[] knows how to store the array size (if T has a destructor) in some implementation-specific way (operator delete[] then utilizes this info). Therefore, there is no portable way to store this information without a new-expression (and without calling the elements' constructors).
Try placement new, then.
typedef std::string T;
T src[5];
char* p = new char[sizeof(T)* 5];
T* dest = (T*)p;
for(int i = 0;i < 5; ++i)
{
new(dest + i) T(src[i]); //placement new
}
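The one-pass approach from the question can be sketched with std::uninitialized_copy as follows. clone_array, destroy_array and demo_clone are hypothetical helper names. This sketch calls the global ::operator new, so it does not by itself honor a class-specific T::operator new; that would have to be dispatched separately, as the question suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Clones n elements into freshly allocated raw storage in a single pass.
// The caller must later pass the result to destroy_array.
template <typename T>
T* clone_array(const T* src, std::size_t n) {
    void* raw = ::operator new(n * sizeof(T));        // raw bytes, no constructors run
    T* dest = static_cast<T*>(raw);
    try {
        std::uninitialized_copy(src, src + n, dest);  // copy-constructs each element in place
    } catch (...) {
        ::operator delete(raw);  // uninitialized_copy has already destroyed partial work
        throw;
    }
    return dest;
}

// Destroys the elements in reverse order, then releases the raw storage.
template <typename T>
void destroy_array(T* p, std::size_t n) {
    for (std::size_t i = n; i > 0; --i) p[i - 1].~T();
    ::operator delete(p);
}

// Clones three strings and returns the total number of characters copied.
std::size_t demo_clone() {
    const std::string src[3] = {"a", "bb", "ccc"};
    std::string* copy = clone_array(src, 3);
    assert(copy[2] == "ccc");
    const std::size_t total = copy[0].size() + copy[1].size() + copy[2].size();
    destroy_array(copy, 3);
    return total;
}
```

Because the element count is kept by the caller, no implementation-specific array bookkeeping is needed, which sidesteps the portability concern raised in the first answer.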

C++: general thoughts about overloading operator new

Have you overloaded operator new in C++?
If yes, why?
An interview question, for which I humbly request some of your thoughts.
We had an embedded system where new was only rarely allowed, and the memory could never be deleted, as we had to prove a maximum heap usage for reliability reasons.
We had a third party library developer who didn't like those rules, so they overloaded new and delete to work against a chunk of memory we allocated just for them.
Yes.
Overloading operator new gives you a chance to control where an object lives in memory. I did this because I knew some details about the lifetime of my objects, and wanted to avoid fragmentation on a platform which didn't have virtual memory.
You would overload new if you're using your own allocator, doing something fancy with reference counting, instrumenting for garbage collection, debugging object lifetimes or something else entirely; you're replacing the allocator for objects. I've personally had to do it to ensure certain objects get allocated on specific mmap'ed pages of memory.
Yes, for two reasons: Custom allocator, and custom allocation tracking.
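As an illustration of the tracking use case, a class-level overload can keep a count of live objects. The Tracked type and demo_tracking function are hypothetical, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class used only to illustrate allocation tracking.
struct Tracked {
    // Number of Tracked objects currently allocated through operator new.
    static std::size_t& live() { static std::size_t count = 0; return count; }

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++live();
        return p;
    }
    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        --live();
        std::free(p);
    }

    int payload = 0;
};

// Allocates two objects, records the peak count, then frees both.
std::size_t demo_tracking() {
    Tracked* a = new Tracked;
    Tracked* b = new Tracked;
    const std::size_t peak = Tracked::live();   // both objects are alive here
    delete a;
    delete b;
    assert(Tracked::live() == 0);
    return peak;
}
```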
Overloading operator new may look like a good idea at first glance if you want custom allocation for some reason (e.g. avoiding the memory fragmentation intrinsic to the C runtime allocator and/or avoiding locks on memory management calls in multithreaded programs). But when you get to the implementation you may realize that in most cases you want to pass some additional context to this call, for example a thread-specific heap for objects of a given size. And overloading of new/delete simply doesn't work here. So eventually you may want to create your own facade to your custom memory management subsystem.
I found it very handy to overload operator new when writing Python extension code in C++. I wrapped the Python C-API code for allocation and deallocation in operator new and operator delete overloads, respectively – this allows for PyObject*-compatible structures that can be created with new MyType() and managed with predictable heap-allocation semantics.
It also allows for a separation of the allocation code (normally in the Python __new__ method) and the initialization code (in Python’s __init__) into, respectively, the operator new overloads and any constructors one sees fit to define.
Here’s a sample:
struct ModelObject {
static PyTypeObject* type_ptr() { return &ModelObject_Type; }
/// operator new performs the role of tp_alloc / __new__
/// Not using the 'new size' value here
void* operator new(std::size_t) {
PyTypeObject* type = type_ptr();
ModelObject* self = reinterpret_cast<ModelObject*>(
type->tp_alloc(type, 0));
if (self != NULL) {
self->weakrefs = NULL;
self->internal = std::make_unique<buffer_t>(nullptr);
}
return reinterpret_cast<void*>(self);
}
/// operator delete acts as our tp_dealloc
void operator delete(void* voidself) {
ModelObject* self = reinterpret_cast<ModelObject*>(voidself);
PyObject* pyself = reinterpret_cast<PyObject*>(voidself);
if (self->weakrefs != NULL) { PyObject_ClearWeakRefs(pyself); }
self->cleanup();
type_ptr()->tp_free(pyself);
}
/// Data members
PyObject_HEAD
PyObject* weakrefs = nullptr;
bool clean = false;
std::unique_ptr<buffer_t> internal;
/// Constructors fill in data members, analogous to __init__
ModelObject()
:internal(std::make_unique<buffer_t>())
,accessor{}
{}
explicit ModelObject(buffer_t* buffer)
:clean(true)
,internal(std::make_unique<buffer_t>(buffer))
{}
ModelObject(ModelObject const& other)
:internal(im::buffer::heapcopy(other.internal.get()))
{}
/// No virtual methods are defined to keep the struct as a POD
/// ... instead of using destructors I defined a 'cleanup()' method:
void cleanup(bool force = false) {
if (clean && !force) {
internal.release();
} else {
internal.reset(nullptr);
clean = !force;
}
}
/* … */
};

Issues with C++ 'new' operator?

I've recently come across this rant.
I don't quite understand a few of the points mentioned in the article:
The author mentions the small annoyance of delete vs delete[], but seems to argue that it is actually necessary (for the compiler), without ever offering a solution. Did I miss something?
In the section 'Specialized allocators', in function f(), it seems the problems can be solved with replacing the allocations with: (omitting alignment)
// if you're going to the trouble to implement an entire Arena for memory,
// making an arena_ptr won't be much work. basically the same as an auto_ptr,
// except that it knows which arena to deallocate from when destructed.
arena_ptr<char> string(a); string.allocate(80);
// or: arena_ptr<char> string; string.allocate(a, 80);
arena_ptr<int> intp(a); intp.allocate();
// or: arena_ptr<int> intp; intp.allocate(a);
arena_ptr<foo> fp(a); fp.allocate();
// or: arena_ptr<foo> fp; fp.allocate(a);
// use templates in 'arena.allocate(...)' to determine that foo has
// a constructor which needs to be called. do something similar
// for destructors in '~arena_ptr()'.
In 'Dangers of overloading ::operator new[]', the author tries to do a new(p) obj[10]. Why not this instead (far less ambiguous):
obj *p = (obj *)special_malloc(sizeof(obj[10]));
for(int i = 0; i < 10; ++i, ++p)
new(p) obj;
'Debugging memory allocation in C++'. Can't argue here.
The entire article seems to revolve around classes with significant constructors and destructors located in a custom memory management scheme. While that could be useful, and I can't argue with it, it's pretty limited in commonality.
Basically, we have placement new and per-class allocators -- what problems can't be solved with these approaches?
Also, in case I'm just thick-skulled and crazy, in your ideal C++, what would replace operator new? Invent syntax as necessary -- what would be ideal, simply to help me understand these problems better.
Well, the ideal would probably be to not need delete of any kind. Have a garbage-collected environment, let the programmer avoid the whole problem.
The complaints in the rant seem to come down to
"I liked the way malloc does it"
"I don't like being forced to explicitly create objects of a known type"
He's right about the annoying fact that you have to implement both new and new[], but you're forced into that by Stroustrup's desire to maintain the core of C's semantics. Since you can't tell a pointer from an array, you have to tell the compiler yourself. You could fix that, but doing so would mean changing the semantics of the C part of the language radically; you could no longer make use of the identity
*(a+i) == a[i]
which would break a very large subset of all C code.
So, you could have a language which
implements a more complicated notion of an array, and eliminates the wonders of pointer arithmetic, implementing arrays with dope vectors or something similar.
is garbage collected, so you don't need your own delete discipline.
Which is to say, you could download Java. You could then extend that by changing the language so it
isn't strongly typed, so type checking the void * upcast is eliminated,
...but that means that you can write code that transforms a Foo into a Bar without the compiler seeing it. This would also enable ducktyping, if you want it.
The thing is, once you've done those things, you've got Python or Ruby with a C-ish syntax.
I've been writing C++ since Stroustrup sent out tapes of cfront 1.0; a lot of the history involved in C++ as it is now comes out of the desire to have an OO language that could fit into the C world. There were plenty of other, more satisfying, languages that came out around the same time, like Eiffel. C++ seems to have won. I suspect that it won because it could fit into the C world.
The rant, IMHO, is very misleading and it seems to me that the author does understand the finer details, it's just that he appears to want to mislead. IMHO, the key point that shows the flaw in argument is the following:
void* operator new(std::size_t size, void* ptr) throw();
The standard defines that the above function has the following properties:
Returns: ptr.
Notes: Intentionally performs no other action.
To restate that - this function intentionally performs no other action. This is very important, as it is the key to what placement new does: It is used to call the constructor for the object, and that's all it does. Notice explicitly that the size parameter is not even mentioned.
For those without time, to summarise my point: everything that 'malloc' does in C can be done in C++ using "::operator new". The only difference is that if you have non-aggregate types, i.e. types that need to have their destructors and constructors called, then you need to call those constructors and destructors. Such types do not explicitly exist in C, and so using the argument that "malloc does it better" is not valid. If you have a struct in 'C' that has a special "initializeMe" function which must be called with a corresponding "destroyMe" then all points made by the author apply equally to that struct as they do to a non-aggregate C++ struct.
Taking some of his points explicitly:
To implement multiple inheritance, the compiler must actually change the values of pointers during some casts. It can't know which value you eventually want when converting to a void * ... Thus, no ordinary function can perform the role of malloc in C++--there is no suitable return type.
This is not correct, again ::operator new performs the role of malloc:
class A1 { };
class A2 { };
class B : public A1, public A2 { };
void foo () {
void * v = ::operator new (sizeof (B));
B * b = new (v) B(); // Placement new calls the constructor for B.
b->~B(); // Destroy the object explicitly...
::operator delete (v); // ...then release the raw storage.
v = ::operator new (sizeof(int));
int * i = reinterpret_cast <int*> (v);
::operator delete (v);
}
As I mention above, we need placement new to call the constructor for B. In the case of 'i' we can cast from void* to int* without a problem, although again using placement new would improve type checking.
Another point he makes is about alignment requirements:
Memory returned by new char[...] will not necessarily meet the alignment requirements of a struct intlist.
The standard under 3.7.3.1/2 says:
The pointer returned shall be suitably aligned so that it can be converted to a
pointer of any complete object type and then used to access the object or array in the storage allocated (until
the storage is explicitly deallocated by a call to a corresponding deallocation function).
That to me appears pretty clear.
Under specialized allocators the author describes potential problems that you might have, e.g. you need to use the allocator as an argument to any types which allocate memory themselves and the constructed objects will need to have their destructors called explicitly. Again, how is this different to passing the allocator object through to an "initializeMe" call for a C struct?
Regarding calling the destructor, in C++ you can easily create a special kind of smart pointer, let's call it "placement_pointer" which we can define to call the destructor explicitly when it goes out of scope. As a result we could have:
template <typename T>
class placement_pointer {
// ...
~placement_pointer() {
if (*count == 0) {
m_b->~T();
}
}
// ...
T * m_b;
};
void
f ()
{
arena a;
// ...
foo *fp = new (a) foo; // must be destroyed
// ...
fp->~foo ();
placement_pointer<foo> pfp = new (a) foo; // automatically !!destructed!!
// ...
}
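One way to get similar behavior without writing a new smart pointer class is std::unique_ptr with a deleter that only runs the destructor. This is a sketch under assumptions: DestroyOnly, placed_ptr, Probe and demo_placed_ptr are hypothetical names, and a local buffer stands in for arena memory.

```cpp
#include <cassert>
#include <memory>
#include <new>

// Deleter that runs the destructor but leaves the storage to its owner
// (for example an arena that frees everything at once).
struct DestroyOnly {
    template <typename T>
    void operator()(T* p) const noexcept { p->~T(); }
};

template <typename T>
using placed_ptr = std::unique_ptr<T, DestroyOnly>;

// Test type that records when its destructor has run.
struct Probe {
    explicit Probe(int* flag) : destroyed(flag) {}
    ~Probe() { *destroyed = 1; }
    int* destroyed;
};

// Returns 1 if the destructor ran when the smart pointer left scope.
int demo_placed_ptr() {
    alignas(Probe) unsigned char storage[sizeof(Probe)];   // stands in for arena memory
    int destroyed = 0;
    {
        placed_ptr<Probe> p(new (storage) Probe(&destroyed));
        assert(destroyed == 0);
    }   // p goes out of scope: ~Probe runs, the storage itself is untouched
    return destroyed;
}
```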
The last point I want to comment on is the following:
g++ comes with a "placement" operator new[] defined as follows:
inline void *
operator new[](size_t, void *place)
{
return place;
}
As noted above, not just implemented this way - but it is required to be so by the standard.
Let obj be a class with a destructor. Suppose you have sizeof (obj[10]) bytes of memory somewhere and would like to construct 10 objects of type obj at that location. (C++ defines sizeof (obj[10]) to be 10 * sizeof (obj).) Can you do so with this placement operator new[]? For example, the following code would seem to do so:
obj *
f ()
{
void *p = special_malloc (sizeof (obj[10]));
return new (p) obj[10]; // Serious trouble...
}
Unfortunately, this code is incorrect. In general, there is no guarantee that the size_t argument passed to operator new[] really corresponds to the size of the array being allocated.
But as he highlights by supplying the definition, the size argument is not used in the allocation function. The allocation function does nothing - and so the only effect of the above placement expression is to call the constructor for the 10 array elements as you would expect.
There are other issues with this code, but not the one the author listed.