Overriding new operator - non-allocating placement allocation functions - c++

I need to write / overload / override the default C++ new operator. I found the information below:
non-allocating placement allocation functions
void* operator new ( std::size_t count, void* ptr );
(9)
void* operator new[]( std::size_t count, void* ptr );
(10)
According to the documentation:
Called by the standard single-object placement new expression. The standard library implementation performs no action and returns ptr unmodified.
I do not understand what is meant by "non-allocating placement allocation functions".

These overloads are used by placement new. This is an expression that creates an object at a memory location, but doesn't allocate space for it. For instance, in this toy example:
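#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement operator new
void foo() {
    using T = int;   // an alias is needed to name the pseudo-destructor of int
    void *raw = std::malloc(sizeof(T));
    T *pint = new(raw) T(10);
    pint->~T();      // no-op for int, shown for symmetry with class types
    std::free(raw);
}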
This illustrates that if we need to create an object in memory allocated by something that is not the C++ standard library (in this case, the C allocation functions), special syntax is used to create an object at that location.
The placement operator new accepts the address and returns it unchanged. Thus the new expression just creates an object there. Naturally, we cannot call delete to free said object and memory; we must make an explicit destructor call, followed by the correct memory deallocation function.
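The same steps can be sketched with a class type, where the constructor and destructor calls are observable. `Widget` and `placement_demo` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>      // declares the non-allocating placement operator new

// Hypothetical type used only for illustration.
struct Widget {
    static int alive;          // number of currently constructed Widgets
    int value;
    explicit Widget(int v) : value(v) { ++alive; }
    ~Widget() { --alive; }
};
int Widget::alive = 0;

// Returns the value stored in a Widget built inside malloc'ed storage.
int placement_demo() {
    void* raw = std::malloc(sizeof(Widget));
    assert(raw != nullptr);

    // The library's placement form hands back the address it was given.
    assert(::operator new(sizeof(Widget), raw) == raw);

    Widget* w = new (raw) Widget(10);   // construct only, no allocation
    assert(static_cast<void*>(w) == raw && Widget::alive == 1);
    int result = w->value;

    w->~Widget();                       // destroy only, no deallocation
    assert(Widget::alive == 0);
    std::free(raw);                     // release with the matching C function
    return result;
}
```

Calling `placement_demo()` constructs and destroys exactly one `Widget` without ever going through an allocating `operator new`.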

A non-allocating placement allocation function is a form of operator new that doesn't allocate memory; the new expression that uses it constructs an object in pre-allocated memory.
See this answer for more details.

To answer the question implied in your title:
It is not permitted to supply your own replacement function for the non-allocating forms of operator new. This is covered in the standard (N4659):
Non-allocating forms [new.delete.placement]
These functions are reserved; a C++ program may not define functions that displace the versions in the C++ standard library.
You can only replace the forms of operator new which are listed as Replaceable under the section of the Standard labelled [new.delete].

Related

Who defines the new operator?

Is this program supposed to compile?
int main(){
double* p = new double[4];
delete[] p;
return 0;
}
(It does compile with GCC 7.1.1)
The reason I ask this is because, I am not sure who is providing the definition of the operator new.
Is it the language? Is the compiler #including <new> by default?
To make this question more clear I can actually define (overwrite?) operator new.
#include<cstdlib> // malloc and size_t
#include<cassert> // assert
void* operator new[](std::size_t sz){ // what if this is not defined? who provides `new`?
assert(0); // just to check I am calling this function
return std::malloc(sz);
}
int main(){
double* p = new double[4]; // calls the new function above
delete[] p;
return 0;
}
What am I doing here?
overriding new? overloading new? Is the definition of new special (magic) for the compiler? (e.g. if not defined use a language provided one).
What is the role of #include<new> in all this?
Here you have a new expression, which indeed invokes operator new[]:
void* operator new[]( std::size_t count );
The compiler is implicitly declaring the basic operator news in each translation unit (this is specified by the standard), see the cppreference documentation.
The versions (1-4) are implicitly declared in each translation unit even if the <new> header is not included.
In your second code snippet, you are overloading (technically you are actually replacing) operator new (not overriding, that's only for virtual functions), although you should overload operator new[] instead (note that if you don't, then operator new[] will fall back on operator new, so technically I believe your code is OK, but I'd just overload the array version for clarity).
There can be other overloads of operator new. You have scalar and array versions both provided by the compiler, possibly by the standard library, and possibly by user-code.
There are additional overloads of new (and delete) if you write them yourself (if they don't have the signature of the built-in versions) or if you #include <new>. If you provide operators that match the built-in signatures, then they will replace the built-in version. Beware. :)
The main reasons people include <new> are:
1) constructing objects in a user-provided memory address
2) non-throwing versions of new
There are, of course, operator delete overloads as well, since new/delete must be overloaded pairwise to work together, or nearly certain destruction will follow. If your operator new matches the built-in signature, it will replace rather than overload the built-in version.
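A minimal sketch of replacement, assuming `std::malloc`/`std::free` as the backing store. The counter `g_new_calls` and the function `count_replaced_calls` are hypothetical names used only to make the effect visible. Because the signatures match the library versions, these definitions replace them rather than add overloads, and the library's default array forms forward to them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

static int g_new_calls = 0;   // hypothetical counter, for illustration only

// Same signature as the library version, so this REPLACES it program-wide.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching replacements so that memory from malloc is released by free.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Counts how often the replacement runs for one scalar and one array request.
int count_replaced_calls() {
    int before = g_new_calls;
    void* single = ::operator new(sizeof(double));
    void* array = ::operator new[](4 * sizeof(double)); // default new[] forwards to operator new
    ::operator delete[](array);                         // default delete[] forwards to operator delete
    ::operator delete(single);
    return g_new_calls - before;
}
```

The sketch calls the allocation functions directly because a compiler is allowed to optimize away the allocation behind a new expression whose result is never used, which would make the counter unreliable.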
Remember, a "new expression" as simple as:
std::string* myString = new std::string();
really works in 2 parts: 1) Allocating memory (via operator new), and then, 2) constructing an object in that memory. If successful, it returns the pointer to that object, and if failure, cleans up what was constructed, deallocates whatever was allocated, and then throws.
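Written out by hand, the two parts look roughly like the sketch below. `make_string` and `destroy_string` are hypothetical helpers, not library functions; the compiler generates equivalent logic for an ordinary new or delete expression.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hand-written equivalent of `new std::string(text)`, for illustration.
std::string* make_string(const char* text) {
    void* memory = ::operator new(sizeof(std::string));   // part 1: allocate
    try {
        return new (memory) std::string(text);            // part 2: construct
    } catch (...) {
        ::operator delete(memory);                        // undo part 1 on failure
        throw;
    }
}

// Hand-written equivalent of `delete s`.
void destroy_string(std::string* s) {
    using String = std::string;
    if (!s) return;
    s->~String();                // run the destructor
    ::operator delete(s);        // then release the raw memory
}
```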
When you overload operator new, you are only dealing with the memory allocation, not the constructor call, and the object will be constructed in whatever address this operator returns. For the normal placement new, you might see code like this:
alignas(std::string) char myBuffer[1024]; // suitably aligned; nobody needs more than 1024 bytes
std::string *ptr = new (myBuffer) std::string();
assert(static_cast<void*>(ptr) == static_cast<void*>(myBuffer));
The extra parameter to new is the myBuffer address, which is immediately returned by operator new and becomes the place where the object is constructed. The assertion should pass, and shows that the string was created in the bytes of myBuffer.
The no-throw versions of new are also available after #including <new>; they also use an extra argument to the operator:
char * buf = new (std::nothrow) char[INT_MAX]; // INT_MAX comes from <climits>
if (buf) {
std::cout << "WHOA, you have a lot of memory!!!\n";
}
Now instead of failing, it'll return a null pointer, so you have to check it.
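A small sketch of the null check wrapped in a function. `can_allocate` is a hypothetical helper name; whether a very large request actually fails depends on the platform, so the sketch only reports the outcome.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // std::nothrow

// Returns true if `count` chars could be allocated; never throws bad_alloc.
bool can_allocate(std::size_t count) {
    char* buf = new (std::nothrow) char[count];
    if (buf == nullptr) return false;   // failure is reported as a null pointer
    delete[] buf;                       // the ordinary delete[] is the match
    return true;
}
```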

Placement forms of the operator delete functions

In his new book TC++PL4, Stroustrup casts a slightly different light on a once usual practice regarding user-controlled memory allocation and placement new—or, more specifically, regarding the enigmatical "placement delete." In the book's sect. 11.2.4, Stroustrup writes:
The "placement delete" operators do nothing except possibly inform a garbage collector that the deleted pointer is no longer safely derived.
This implies that sound programming practice will follow an explicit call to a destructor by a call to placement delete.
Fair enough. However, is there no better syntax to call placement delete than the obscure
::operator delete(p);
The reason I ask is that Stroustrup's sect. 11.2.4 mentions no such odd syntax. Indeed, Stroustrup does not dwell on the matter; he mentions no syntax at all. I vaguely dislike the look of ::operator, which interjects the matter of namespace resolution into something that properly has nothing especially to do with namespaces. Does no more elegant syntax exist?
For reference, here is Stroustrup's quote in fuller context:
By default, operator new creates its object on the free store. What
if we wanted the object allocated elsewhere?... We can place objects
anywhere by providing an allocator function with extra arguments and
then supplying such extra arguments when using new:
void* operator new(size_t, void* p) { return p; }
void* buf = reinterpret_cast<void*>(0xF00F);
X* p2 = new(buf) X;
Because of this usage, the new(buf) X syntax for supplying extra
arguments to operator new() is known as the placement syntax.
Note that every operator new() takes a size as its first argument
and that the size of the object allocated is implicitly supplied.
The operator new() used by the new operator is chosen by the
usual argument-matching rules; every operator new() has
a size_t as its first argument.
The "placement" operator new() is the simplest such allocator. It
is defined in the standard header <new>:
void* operator new (size_t, void* p) noexcept;
void* operator new[](size_t, void* p) noexcept;
void operator delete (void* p, void*) noexcept; // if (p) make *p invalid
void operator delete[](void* p, void*) noexcept;
The "placement delete" operators do nothing except possibly inform a
garbage collector that the deleted pointer is no longer safely
derived.
Stroustrup then continues to discuss the use of placement new with arenas. He does not seem to mention placement delete again.
If you don't want to use ::, you don't really have to. In fact, you generally shouldn't (don't want to).
You can provide replacements for ::operator new and ::operator delete (and the array variants, though you should never use them).
You can also, however, overload operator new and operator delete for a class (and yes, again, you can do the array variants, but still shouldn't ever use them).
Using something like void *x = ::operator new(some_size); forces the allocation to go directly to the global operator new instead of using a class specific one (if it exists). Generally, of course, you want to use the class specific one if it exists (and the global one if it doesn't). That's exactly what you get from using void *x = operator new(some_size); (i.e., no scope resolution operator).
As always, you need to ensure that your news and deletes match, so you should only use ::operator delete to delete the memory when/if you used ::operator new to allocate it. Most of the time you shouldn't use :: on either one.
The primary exception to that is when/if you're actually writing an operator new and operator delete for some class. These will typically call ::operator new to get a big chunk of memory, then divvy that up into object-sized pieces. To allocate that big chunk of memory, it typically (always?) has to explicitly specify ::operator new because otherwise it would end up calling itself to allocate it. Obviously, if it specifies ::operator new when it allocates the data, it also needs to specify ::operator delete to match.
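A minimal sketch of a class-specific pair that forwards to the global functions. `Particle`, `live_blocks`, and `particle_demo` are hypothetical names; a real implementation would typically carve objects out of a larger chunk instead of forwarding one request at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class with its own allocation functions.
struct Particle {
    static int live_blocks;   // blocks handed out and not yet returned
    double x = 0, y = 0;

    static void* operator new(std::size_t size) {
        ++live_blocks;
        return ::operator new(size);   // `::` avoids calling ourselves recursively
    }
    static void operator delete(void* p) noexcept {
        if (p) --live_blocks;
        ::operator delete(p);          // matches the `::operator new` above
    }
};
int Particle::live_blocks = 0;

// Returns the number of live blocks observed while one Particle exists.
int particle_demo() {
    Particle* p = new Particle;        // uses Particle::operator new
    int during = Particle::live_blocks;
    delete p;                          // uses Particle::operator delete
    return during;
}
```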
First of all: No there isn't.
But what is the type of memory? Exactly, it doesn't have one. So why not just use the following:
typedef unsigned char byte;
byte *buffer = new byte[SIZE];
Object *obj1 = new (buffer) Object;
Object *obj2 = new (buffer + sizeof(Object)) Object;
...
obj1->~Object();
obj2->~Object();
delete[] buffer;
This way you don't have to worry about placement delete at all. Just wrap the whole thing in a class called Buffer and there you go.
EDIT
I thought about your question and tried a lot of things out but I found no occasion for what you call placement delete. When you take a look into the <new> header you'll see this function is empty. I'd say it's just there for the sake of completeness. Even when using templates you're able to call the destructor manually, you know?
class Buffer
{
private:
size_t size, pos;
byte *memory;
public:
Buffer(size_t size) : size(size), pos(0), memory(new byte[size]) {}
~Buffer()
{
delete[] memory;
}
template<class T>
T* create()
{
if(pos + sizeof(T) > size) return NULL;
T *obj = new (memory + pos) T;
pos += sizeof(T);
return obj;
}
template<class T>
void destroy(T *obj)
{
if(obj) obj->~T(); //no need for placement delete here
}
};
int main()
{
Buffer buffer(1024 * 1024);
HeavyA *aObj = buffer.create<HeavyA>();
HeavyB *bObj = buffer.create<HeavyB>();
if(aObj && bObj)
{
...
}
buffer.destroy(aObj);
buffer.destroy(bObj);
}
This class is just an arena (what Stroustrup calls it). You can use it when you have to allocate many objects and don't want the overhead of calling new every time. IMHO this is the only use case for a placement new/delete.
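One thing the Buffer class above does not handle is alignment: an object placed right after a smaller one may land on a misaligned address. The following is a sketch of an alignment-aware variant using `std::align`. `AlignedArena` is a hypothetical name, and like Buffer it leaves destruction to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>    // std::align
#include <new>
#include <utility>   // std::forward

// Hypothetical alignment-aware variant of the Buffer idea.
class AlignedArena {
    unsigned char* memory_;
    void* cursor_;          // next free byte
    std::size_t space_;     // bytes left starting at cursor_
public:
    explicit AlignedArena(std::size_t size)
        : memory_(new unsigned char[size]), cursor_(memory_), space_(size) {}
    ~AlignedArena() { delete[] memory_; }
    AlignedArena(const AlignedArena&) = delete;
    AlignedArena& operator=(const AlignedArena&) = delete;

    template <class T, class... Args>
    T* create(Args&&... args) {
        // std::align moves cursor_ to a suitably aligned address, or returns
        // nullptr (leaving cursor_ and space_ untouched) if T does not fit.
        if (!std::align(alignof(T), sizeof(T), cursor_, space_)) return nullptr;
        T* obj = new (cursor_) T(std::forward<Args>(args)...);
        cursor_ = static_cast<unsigned char*>(cursor_) + sizeof(T);
        space_ -= sizeof(T);
        return obj;
    }

    template <class T>
    void destroy(T* obj) { if (obj) obj->~T(); }   // storage is reclaimed by ~AlignedArena
};
```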
This implies that sound programming practice will follow an explicit call to a destructor by a call to placement delete.
No it doesn't. IIUC Stroustrup does not mean placement delete is necessary to inform the garbage collector that memory is no longer in use, he means it doesn't do anything apart from that. All deallocation functions can tell a garbage collector memory is no longer used, but when using placement new to manage memory yourself, why would you want a garbage collector to fiddle with that memory anyway?
I vaguely dislike the look of ::operator, which interjects the matter of namespace resolution into something that properly has nothing especially to do with namespaces.
"Properly" it does have to do with namespaces, qualifying it to refer to the "global operator new" distinguishes it from any overloaded operator new for class types.
Does no more elegant syntax exist?
You probably don't ever want to call it. A placement delete operator will be called by the compiler if you use placement new and the constructor throws an exception. Since there is no memory to deallocate (because the placement new didn't allocate any), all it does is potentially mark the memory as unused.

overriding delete with parameters

I can override global operator new with different parameters, so for example I can have:
void* operator new (std::size_t size) throw (std::bad_alloc);
void* operator new (std::size_t size, int num) throw (std::bad_alloc);
which can be called separately as
int* p1 = new int; // calls new(size_t)
int* p2 = new(5) int; // calls new(size_t, int)
Since each of these can potentially use some different allocation scheme, I would need a separate delete() function for each. However, delete(void*) cannot be overloaded in the same way! delete(void*) is the only valid signature. So how can the above case be handled?
P.S. I am not suggesting this is a good idea. This kind of thing happened to me and so I discovered this "flaw" (at least in my opinion) in C++. If the language allows the new overrides, it must allow the delete overrides, or it becomes useless. And so I was wondering if there is a way around this, not if this is a good idea.
I can override global operator new with different parameters
Those are called placement allocation functions.
delete(void*) is the only valid signature.
No.
First, some terminology: A delete expression such as delete p is not just a function call, it invokes the destructor then calls a deallocation function, which is some overload of operator delete that is chosen by overload resolution.
You can override operator delete with signatures to match your placement allocation function, but that overload will only be used if the constructor called by the placement new expression throws an exception e.g.
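#include <cstddef>  // std::size_t
#include <cstdio>   // std::puts
struct E {
    E() { throw 1; }
};
// No throw(std::bad_alloc) here: dynamic exception specifications were removed in C++17.
void* operator new(std::size_t n, int) { return new char[n]; }
void operator delete(void* p, int) { std::puts("hello!"); delete[] (char*)p; }
int main()
{
    try {
        new (1) E;  // E() throws, so operator delete(void*, int) is called
    } catch (...) {
        std::puts("caught");
    }
}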
The placement deallocation function which matches the form of placement new expression used (in this case it has an int parameter) is found by overload resolution and called to deallocate the storage.
So you can provide "placement delete" functions, you just can't call them explicitly. It's up to you to remember how you allocated an object and ensure you use the corresponding deallocation.
If you keep track of the different memory regions you allocate with your different new overloads, you can tag them with the version of new that was called.
Then at delete time you can look the address up to find which new was called, and do something different in each case.
This way you can guarantee that the correct logic is automatically associated with each different new overload.
As pointed out by baruch in the comments below, there is a performance overhead associated with maintaining the data you use for tracking, and this logic will also only work as long as the overloaded delete is not passed anything allocated using the default new.
As far as tracking overhead, it seems to me that the minimum overhead method of tracking the type of the allocation is to allocate the amount requested, plus a small amount of additional space at the start of the allocated region in which to tag the request type (sized according to conservative alignment requirements). You can then look at this tag region on delete to determine which logic to follow.
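The header-tag idea can be sketched as follows. `Scheme`, `kHeaderSize`, `tagged_allocate`, and `tagged_release` are hypothetical names, and both schemes are backed by `std::malloc` here purely to keep the example short; in real code each tag would select a different allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical tag naming the allocation scheme that produced a block.
enum class Scheme : unsigned char { General = 1, Scratch = 2 };

// Header size chosen conservatively so the user block stays maximally aligned.
constexpr std::size_t kHeaderSize = alignof(std::max_align_t);

// Allocates `size` bytes and records `scheme` in a header just before them.
void* tagged_allocate(std::size_t size, Scheme scheme) {
    auto* raw = static_cast<unsigned char*>(std::malloc(kHeaderSize + size));
    if (!raw) throw std::bad_alloc();
    raw[0] = static_cast<unsigned char>(scheme);
    return raw + kHeaderSize;
}

// Reads the tag back, releases the block, and reports which scheme owned it.
Scheme tagged_release(void* block) {
    auto* raw = static_cast<unsigned char*>(block) - kHeaderSize;
    Scheme scheme = static_cast<Scheme>(raw[0]);
    // A real implementation would switch on `scheme` here; in this sketch
    // both schemes use malloc, so free is correct for either tag.
    std::free(raw);
    return scheme;
}
```

An `operator new(std::size_t, Scheme)` overload could forward to `tagged_allocate`, and a single `operator delete(void*)` could then call `tagged_release` to pick the right logic. As noted above, this only works if every block passed to that delete was allocated with a tag.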
There is a placement delete which matches the placement operator new. And you can't call it directly. This is because the placement delete is only used to deallocate memory when the constructor of the new'ed object throws.
void *operator new(size_t, Area&); // placement new
void operator delete(void*, Area&); // matching placement delete
...
Area &area;
SomeType *t=new(area) SomeType();
// when SomeType() throws then `delete(t,area)` from above is called
// but you can't do this:
delete (area) t;
A common way to overcome this is to write an overloaded "destroy" function, which accepts all kinds of parameters.
template<class T> void destroy(Area &a, T* &pt) //<-you can't do this with 'delete'
{
if (pt) {
pt->~T(); // run the destructor
a.freeMem(pt); // deallocate the object
pt=NULL; // nulls the pointer on the caller side.
}
}
The simple answer is: Do not do this. All forms of non-placement new are redundant in C++11 and horrifically unsafe; for example, they return raw pointers. If you want to allocate objects in a custom place, then use a class with an allocate function if stateful, or a free function if not. The best treatment for new, and indeed delete, is to excise them from your program with prejudice, with the possible exception of placement new.
Edit: The reason why it's useless for you is because you're trying to use it for a purpose which it was not intended for. All you can use the extra params for is stuff like logging or other behaviour control. You can't really change the fundamental semantics of new and delete. If you need stateful allocation, you must use a class.
You're wrong. It is possible to provide a placement delete.

new operator for memory allocation on heap

I was looking at the signature of new operator. Which is:
void* operator new (std::size_t size) throw (std::bad_alloc);
But when we use this operator, we never use a cast, i.e.
int *arr = new int;
So how does C++ convert a pointer of type void* to int* in this case? After all, malloc also returns a void*, and there we need to use an explicit cast.
There is a very subtle difference in C++ between operator new and the new operator. (Read that over again... the ordering is important!)
The function operator new is the C++ analog of C's malloc function. It's a raw memory allocator whose responsibility is solely to produce a block of memory on which to construct objects. It doesn't invoke any constructors, because that's not its job. Usually, you will not see operator new used directly in C++ code; it looks a bit weird. For example:
void* memory = operator new(137); // Allocate at least 137 bytes
The new operator is a keyword that is responsible for allocating memory for an object and invoking its constructor. This is what's encountered most commonly in C++ code. When you write
int* myInt = new int;
You are using the new operator to allocate a new integer. Internally, the new operator works roughly like this:
1) Allocate memory to hold the requested object by using operator new.
2) Invoke the object constructor, if any. If this throws an exception, free the above memory with operator delete, then propagate the exception.
3) Return a pointer to the newly-constructed object.
Because the new operator and operator new are separate, it's possible to use the new keyword to construct objects without actually allocating any memory. For example, the famous placement new allows you to build an object at an arbitrary memory address in user-provided memory. For example:
T* memory = (T*) malloc(sizeof(T)); // Allocate a raw buffer
new (memory) T(); // Construct a new T in the buffer pointed at by 'memory.'
Overloading the new operator by defining a custom operator new function lets you use new in this way; you specify how the allocation occurs, and the C++ compiler will wire it into the new operator.
In case you're curious, the delete keyword works in the same way. There's a deallocation function called operator delete responsible for disposing of memory, and also a delete operator responsible for invoking object destructors and freeing memory. However, operator new and operator delete can be used outside of these contexts in place of C's malloc and free, for example.
You are confusing the new expression with the operator new() function. When the former is compiled, the compiler generates, among other things, a call to the operator new() function, passing a size large enough to hold the type mentioned in the new expression; the expression then yields a pointer of that type.
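The distinction can be checked at compile time. In the sketch below the `static_assert`s confirm that the allocation function returns `void*` while the new expression already has type `int*`, which is why no cast appears in user code. `make_int` and `release_int` are hypothetical helpers.

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// The allocation function deals in untyped storage...
static_assert(std::is_same<decltype(::operator new(sizeof(int))), void*>::value,
              "operator new returns void*");
// ...while the new expression itself already has the pointer type.
static_assert(std::is_same<decltype(new int), int*>::value,
              "a new expression yields int*");

// Roughly what the compiler arranges for `new int(value)`.
int* make_int(int value) {
    void* storage = ::operator new(sizeof(int));   // untyped allocation
    return ::new (storage) int(value);             // typed result of construction
}

// Roughly what the compiler arranges for `delete p` when p is an int*.
void release_int(int* p) {
    ::operator delete(p);   // int has no destructor to run first
}
```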

How do 'malloc' and 'new' work? How are they different (implementation wise)? [duplicate]

I know how they are different syntactically, and that C++ uses new, and C uses malloc. But how do they work, in a high-level explanation?
See What is the difference between new/delete and malloc/free?
I'm just going to direct you to this answer: What is the difference between new/delete and malloc/free? Martin provided an excellent overview. Quick overview on how they work (without diving into how you could overload them as member functions):
new-expression and allocation
The code contains a new-expression supplying the type-id.
The compiler will look into whether the type overloads the operator new with an allocation function.
If it finds an overload of an operator new allocation function, that one is called using the arguments given to new and sizeof(TypeId) as its first argument:
Sample:
new (a, b, c) TypeId;
// the function called by the compiler has to have the following signature:
operator new(std::size_t size, TypeOfA a, TypeOfB b, TypeOfC c);
If operator new fails to allocate storage, it can call the new_handler and hope it frees up space. If there still is not enough space, new has to throw std::bad_alloc or a type derived from it. An allocation function declared throw() (no-throw guarantee) shall return a null pointer in that case.
The C++ runtime environment will create an object of the type given by the type-id in the memory returned by the allocation function.
There are a few special allocation functions given special names:
no-throw new. That takes a nothrow_t as second argument. A new-expression of the form like the following will call an allocation function taking only std::size_t and nothrow_t:
Example:
new (std::nothrow) TypeId;
placement new. That takes a void* pointer as second argument, and instead of returning a newly allocated memory address, it returns that argument. It is used to create an object at a given address. Standard containers use that to preallocate space, but only create objects when needed, later.
Code:
// the following function is defined implicitly in the standard library
void * operator new(std::size_t size, void * ptr) throw() {
return ptr;
}
If the allocation function returns storage, and the constructor of the object created by the runtime throws, then the operator delete is called automatically. In case a form of new was used that takes additional parameters, like
new (a, b, c) TypeId;
Then the operator delete that takes those parameters is called. That operator delete version is only called if the deletion is done because the constructor of the object did throw. If you call delete yourself, then the compiler will use the normal operator delete function taking only a void* pointer:
int * a = new int;
=> void * operator new(std::size_t size) throw(std::bad_alloc);
delete a;
=> void operator delete(void * ptr) throw();
TypeWhosCtorThrows * a = new ("argument") TypeWhosCtorThrows;
=> void * operator new(std::size_t size, char const* arg1) throw(std::bad_alloc);
=> void operator delete(void * ptr, char const* arg1) throw();
TypeWhosCtorDoesntThrow * a = new ("argument") TypeWhosCtorDoesntThrow;
=> void * operator new(std::size_t size, char const* arg1) throw(std::bad_alloc);
delete a;
=> void operator delete(void * ptr) throw();
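Both paths from the listing above can be observed with a class-specific pair of functions. `Tracked`, its counters, and `run_both_paths` are hypothetical names; class-scope overloads are used only so the sketch does not touch the global functions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical class whose allocation functions record which delete ran.
struct Tracked {
    static int placement_deletes;
    static int ordinary_deletes;

    explicit Tracked(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }

    static void* operator new(std::size_t size, const char* /*tag*/) {
        return ::operator new(size);
    }
    // Called by the compiler only when the constructor throws.
    static void operator delete(void* p, const char* /*tag*/) {
        ++placement_deletes;
        ::operator delete(p);
    }
    // Called by an ordinary delete expression.
    static void operator delete(void* p) {
        ++ordinary_deletes;
        ::operator delete(p);
    }
};
int Tracked::placement_deletes = 0;
int Tracked::ordinary_deletes = 0;

void run_both_paths() {
    try {
        new ("argument") Tracked(true);    // constructor throws
    } catch (const std::runtime_error&) {
        // storage was already released through the placement form
    }
    Tracked* ok = new ("argument") Tracked(false);
    delete ok;                             // plain form, the tag is not passed
}
```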
new-expression and arrays
If you do
new (possible_arguments) TypeId[N];
The compiler uses the operator new[] functions instead of plain operator new. The first argument passed to the operator is not necessarily exactly sizeof(TypeId)*N: the compiler could add some space to store the number of objects created (necessary to be able to call destructors). The Standard puts it this way:
new T[5] results in a call of operator new[](sizeof(T)*5+x), and
new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).
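The extra amount x can be observed by recording the size that reaches a class-specific `operator new[]`. `Element`, `last_request`, and `array_overhead` are hypothetical names, and the actual value of x is implementation-defined, so the sketch makes no claim about what it will be.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical element type; the user-provided destructor makes it
// non-trivially destructible, the case where a count is commonly stored.
struct Element {
    static std::size_t last_request;   // size most recently passed to operator new[]
    int id = 0;
    ~Element() {}

    static void* operator new[](std::size_t size) {
        last_request = size;
        return ::operator new[](size);
    }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};
std::size_t Element::last_request = 0;

// Returns x from "sizeof(T)*5+x" for `new Element[5]` on this implementation.
std::size_t array_overhead() {
    Element* items = new Element[5];
    delete[] items;
    return Element::last_request - 5 * sizeof(Element);
}
```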
What new does differently from malloc is the following:
It constructs a value in the allocated memory: storage is obtained by calling operator new, and the object is then initialized there. The allocation step can be adapted by overloading this operator, either for all types, or just for your class.
It calls handler functions if no memory can be allocated. This gives you the opportunity to free the required memory on the fly if you have registered such a handler function beforehand.
If that doesn't help (e.g. because you didn't register any function), it throws an exception.
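The handler loop can be sketched with a replaced `operator new`. To make a failure reproducible, the sketch assumes an artificial per-request limit, `g_max_request`, which is not part of any library. `relax_limit` and `demo_new_handler` are likewise hypothetical names; only `std::set_new_handler` and `std::get_new_handler` are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>

// Artificial limit (an assumption of this sketch) used to simulate failure.
static std::size_t g_max_request = std::numeric_limits<std::size_t>::max();
static int g_handler_calls = 0;

void* operator new(std::size_t size) {
    if (size == 0) size = 1;
    for (;;) {
        if (size <= g_max_request) {
            if (void* p = std::malloc(size)) return p;
        }
        std::new_handler handler = std::get_new_handler();
        if (!handler) throw std::bad_alloc();   // nobody registered: give up
        handler();                              // let the handler free memory, then retry
    }
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Handler that "frees memory" by lifting the artificial limit.
void relax_limit() {
    ++g_handler_calls;
    g_max_request = std::numeric_limits<std::size_t>::max();
}

// Returns how many times the handler ran for one oversized request.
int demo_new_handler() {
    int before = g_handler_calls;
    std::set_new_handler(relax_limit);
    g_max_request = 16;                 // the next 64-byte request fails at first
    void* p = ::operator new(64);       // succeeds after the handler ran
    ::operator delete(p);
    std::set_new_handler(nullptr);
    return g_handler_calls - before;
}
```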
So all in all, new is highly customizable and also does initialization work besides memory allocation. These are the two big differences.
Although malloc/free and new/delete have different behaviors, they both do the same thing at a low level: manage dynamically allocated memory. I'm assuming this is what you're really asking about. On my system, new actually calls malloc internally to perform its allocation, so I'll just talk about malloc.
The actual implementation of malloc and free can vary a lot, since there are many ways to implement memory allocation. Some approaches get better performance, some waste less memory, others are better for debugging. Garbage collected languages may also have completely different ways of doing allocation, but your question was about C/C++.
In general, blocks are allocated from the heap, a large area of memory in your program's address space. The library manages the heap for you, usually using system calls like sbrk or mmap. One approach to allocating blocks from the heap is to maintain a list of free and allocated blocks which stores block sizes and locations. Initially, the list might contain one big block for the whole heap. When a new block is requested, the allocator will select a free block from the list. If the block is too large, it can be split into two blocks (one of the requested size, the other of whatever size is left). When an allocated block is freed, it can be merged with adjacent free blocks, since having one big free block is more useful than several small free blocks. The actual list of blocks can be stored as separate data structures or embedded into the heap.
There are many variations. You might want to keep separate lists of free and allocated blocks. You might get better performance if you have separate areas of the heap for blocks of common sizes or separate lists for those sizes. For instance, when you allocate a 16-byte block, the allocator might have a special list of 16-byte blocks so allocation can be O(1). It may also be advantageous to only deal with block sizes that are powers of 2 (anything else gets rounded up). For instance, the Buddy allocator works this way.
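The "special list of same-sized blocks" idea can be sketched as a fixed-size pool whose free blocks store the list links themselves. `FixedPool` is a hypothetical name, and this is only an illustration of the O(1) free list, not how any particular malloc is implemented.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size block pool: every block has the same size, and
// the free blocks themselves hold the links of the free list.
template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
    union Block {
        Block* next;                                           // used while the block is free
        alignas(std::max_align_t) unsigned char bytes[BlockSize];
    };
    Block blocks_[BlockCount];
    Block* free_list_ = nullptr;
public:
    FixedPool() {
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < BlockCount; ++i) {
            blocks_[i].next = free_list_;
            free_list_ = &blocks_[i];
        }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // O(1): pop the head of the free list, or nullptr when exhausted.
    void* allocate() {
        if (!free_list_) return nullptr;
        Block* b = free_list_;
        free_list_ = b->next;
        return b;
    }
    // O(1): push the block back; it must have come from this pool.
    void deallocate(void* p) {
        if (!p) return;
        Block* b = static_cast<Block*>(p);
        b->next = free_list_;
        free_list_ = b;
    }
};
```

Because freed blocks are pushed onto the head of the list, the most recently freed block is the next one handed out.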
"new" does a lot more than malloc. malloc simply allocates the memory - it doesn't even zero it for you. new initialises objects, calls constructors etc. I would suspect that in most implementations new is little more than a thin wrapper around malloc for basic types.
In C:
malloc allocates a chunk of memory of a size that you provide in an argument, and returns back a pointer to this memory.
The memory is allocated on the heap, so make sure to deallocate it when you are finished.