I just learned about the C++ construct called "placement new". It allows you to exactly control where a pointer points to in memory. It looks like this:
#include <new> // Must #include this to use "placement new"
#include "Fred.h" // Declaration of class Fred
void someCode()
{
    char memory[sizeof(Fred)];
    void* place = memory;
    Fred* f = new(place) Fred();   // Create a pointer to a Fred(),
                                   // stored at "place"
    // The pointers f and place will be equal
    ...
}
(example from C++ FAQ Lite)
In this example, the this pointer of Fred will be equal to place.
I've seen it used in our team's code once or twice. In your experience, what does this construct enable? Do other pointer languages have similar constructs? To me, it seems reminiscent of equivalence in FORTRAN, which allows disparate variables to occupy the same location in memory.
It allows you to do your own memory management. Usually this will get you at best marginally improved performance, but sometimes it's a big win. For example, if your program is using a large number of standard-sized objects, you might well want to make a pool with one large memory allocation.
This sort of thing was also done in C, but since there are no constructors in C it didn't require any language support.
It is also used for embedded programming, where IO devices are often mapped to specific memory addresses
It's useful when building your own container-like objects.
For example, suppose you were to create a vector. If you reserve space for a large number of objects, you want to allocate the memory with some method that does not invoke the constructor of the object (like new char[sizeof(object) * reserveSize]). Then, when people start adding objects to the vector, you use placement new to copy them into the allocated memory.
template<typename T>
class SillyVectorExample
{
public:
    SillyVectorExample()
        :reserved(10)
        ,size(0)
        ,data(new char[sizeof(T) * reserved])
    {}
    void push_back(T const& object)
    {
        if (size >= reserved)
        {
            // Do something.
        }
        // Place a copy of the object into the data store.
        new (data+(sizeof(T)*size)) T(object);
        ++size;
    }
    // Add other methods to make sure data is copied and deallocated correctly.
private:
    size_t reserved;
    size_t size;
    char* data;
};
PS. I am not advocating doing this. This is just a simplified example of how containers can work.
I've used it when constructing objects in a shared memory segment.
Placement new can be used to create type-safe unions, such as Boost's variant.
The union class contains a buffer as big as the biggest type it's specified to contain (and with sufficient alignment). It placement news objects into the buffer as required.
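As a rough illustration of that idea (this is not how Boost.Variant is actually implemented), here is a minimal sketch of a two-alternative union. The class name IntOrString and all of its members are hypothetical. It keeps the pointer returned by placement new so that no reinterpret_cast is needed when reading the value back.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical type-safe union holding either an int or a std::string.
class IntOrString {
public:
    explicit IntOrString(int v) { object_ = new (buffer_) int(v); }
    ~IntOrString() { reset(); }
    IntOrString(const IntOrString&) = delete;
    IntOrString& operator=(const IntOrString&) = delete;

    void set(int v) {
        reset();
        object_ = new (buffer_) int(v);
    }
    void set(const std::string& s) {
        reset();                                  // buffer now holds int 0
        object_ = new (buffer_) std::string(s);   // construct the string in place
        holds_string_ = true;
    }

    bool holds_string() const { return holds_string_; }
    int as_int() const {
        assert(!holds_string_);
        return *static_cast<const int*>(object_);
    }
    const std::string& as_string() const {
        assert(holds_string_);
        return *static_cast<const std::string*>(object_);
    }

private:
    typedef std::string String;

    // Destroy the current alternative and fall back to an int 0,
    // so the buffer always contains a live object.
    void reset() {
        if (holds_string_) {
            static_cast<String*>(object_)->~String();
            holds_string_ = false;
        }
        object_ = new (buffer_) int(0);
    }

    // Big enough and aligned strictly enough for the largest alternative.
    static const std::size_t kSize =
        sizeof(String) > sizeof(int) ? sizeof(String) : sizeof(int);
    static const std::size_t kAlign =
        alignof(String) > alignof(int) ? alignof(String) : alignof(int);

    alignas(kAlign) unsigned char buffer_[kSize];
    void* object_ = nullptr;
    bool holds_string_ = false;
};
```

The tag (holds_string_) is what makes the union type-safe: the accessors check it before touching the buffer, and the destructor uses it to decide whether a std::string destructor has to run.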
I use this construct when doing C++ in kernel mode.
I use the kernel mode memory allocator and construct the object on the allocated chunk.
All of this is wrapped in classes and functions, but in the end I do a placement new.
Placement new is NOT about making pointers equal (you can just use assignment for that!).
Placement new is for constructing an object at a particular location. There are three ways of constructing an object in C++, and placement new is the only one that gives you explicit control over where that object "lives". This is useful for several things, including shared memory, low-level device I/O, and memory pool/allocator implementation.
With stack allocation, the object is constructed at the top of the stack, wherever that happens to be currently.
With "regular" new, the object is constructed at an effectively arbitrary address on the heap, as managed by the standard library (unless you've overridden operator new).
Placement new says "build me an object at this address specifically", and its implementation is simply an overload of operator new that returns the pointer passed to it, as a means of getting to the remainder of the machinery of the new operator, which constructs an object in the memory returned by the operator new function.
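To make the two halves of that description visible, here is a small sketch that calls the standard placement overload of operator new directly and then uses the full new-expression. The Fred defined here is a hypothetical stand-in for the class from the question.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the Fred class from the question.
struct Fred {
    int value;
    Fred() : value(5) {}
};

// Returns true if the object really was built at the requested address.
bool constructed_in_place() {
    alignas(Fred) unsigned char memory[sizeof(Fred)];
    void* place = memory;

    // The standard placement overload of operator new allocates nothing:
    // it simply hands back the pointer it was given.
    void* raw = ::operator new(sizeof(Fred), place);
    assert(raw == place);

    // The new-expression calls that overload, then runs the constructor there.
    Fred* f = new (place) Fred();
    bool same_address = (static_cast<void*>(f) == place);
    bool constructed = (f->value == 5);

    f->~Fred();  // there is no matching delete-expression; destroy explicitly
    return same_address && constructed;
}
```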
It's also worth noting that the operator new function can be overloaded with arbitrary arguments (just as any other function). These other arguments are passed via the "new(arg2, arg3, ..., argN)" syntax. Arg1 is always implicitly passed as "sizeof(whatever you're constructing)".
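A minimal sketch of such an overload, assuming a made-up AllocStats argument that only records what was requested before forwarding to the ordinary global operator new. None of these names come from the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical bookkeeping object passed as the extra argument to new.
struct AllocStats {
    std::size_t calls = 0;
    std::size_t bytes = 0;
};

// Overload with one extra argument. `size` is filled in implicitly by the
// new-expression; `stats` comes from the parentheses after `new`.
void* operator new(std::size_t size, AllocStats& stats) {
    ++stats.calls;
    stats.bytes += size;
    return ::operator new(size);  // forward to the ordinary allocator
}

// Matching placement delete: only called if the constructor throws.
void operator delete(void* p, AllocStats&) noexcept {
    ::operator delete(p);
}

struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

AllocStats tracked_allocation() {
    AllocStats stats;
    Point* p = new (stats) Point(3, 4);  // calls operator new(sizeof(Point), stats)
    assert(p->x == 3 && p->y == 4);
    delete p;  // fine here: the memory came from the ordinary ::operator new
    return stats;
}
```

Because this particular overload gets its memory from the ordinary ::operator new, a plain delete is a valid way to release it. An overload that hands out memory from somewhere else would need its own cleanup path.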
By controlling the exact placement, you can align things in memory and this can sometimes be used to improve CPU fetch/cache performance.
Never actually saw it in use, though
It can be useful when paging out memory to a file on the hard drive, which one might do when manipulating large objects.
Placement new allows the developer to construct objects in a preallocated chunk of memory. Developers tend to reach for it in larger systems. I am currently working on a large avionics application where we allocate all the memory required for execution at startup, and then use placement new to construct objects in it wherever required. It improves performance somewhat.
seems to me like a way of allocating an object on the stack ..
I've used it to create objects based on memory containing messages received from the network.
Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background what I don't understand is why there needs to be a different invocation at all.
With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.
There is no function free_array(). I've seen some crazy theories on other questions tangentially related to this, such as calling delete ptr will only free the top of the array, not the whole array. Or the more correct, it is not defined by the implementation. And sure... if this was the first version of C++ and you made a weird design choice that makes sense. But why with $PRESENT_YEAR's standard of C++ has it not been overloaded???
It seems to be the only extra bit that C++ adds is going through the array and calling destructors, and I think maybe this is the crux of it, and it literally is using a separate function to save us a single runtime length lookup, or nullptr at end of the list in exchange for torturing every new C++ programmer or programmer who had a fuzzy day and forgot that there is a different reserve word.
Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?
Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructors of each element of the array are called. But doing this has unspecified overhead, while delete does not. This is why there are two forms of delete expressions. One for arrays, which pays the overhead and one for single objects which does not.
In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.
Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.
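A small sketch of what those alternatives buy you, using a made-up Tracker type that counts live instances. Both containers destroy every element on scope exit without the caller having to choose between delete and delete[].

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracker {
    static int alive;
    Tracker() { ++alive; }
    Tracker(const Tracker&) { ++alive; }
    ~Tracker() { --alive; }
};
int Tracker::alive = 0;

int peak_alive_with_vector() {
    int peak = 0;
    {
        std::vector<Tracker> items(3);
        peak = Tracker::alive;
    }  // the vector destroys all three elements here
    assert(Tracker::alive == 0);
    return peak;
}

int peak_alive_with_unique_ptr() {
    int peak = 0;
    {
        // The array form of unique_ptr remembers to use delete[].
        std::unique_ptr<Tracker[]> items = std::make_unique<Tracker[]>(3);
        peak = Tracker::alive;
    }  // delete[] runs here, destroying every element
    assert(Tracker::alive == 0);
    return peak;
}
```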
Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.
To elaborate on the unspecified overhead François Andrieux's answer mentions, you can see my answer on this question in which I examined what a specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do a similar thing.
In case the new[] was used with an array of objects with a non-trivial destructor, what it did was allocating 4 bytes more, and returning the pointer shifted by 4 bytes ahead, so when delete[] wants to know how many objects are there, it takes the pointer, shifts it 4 bytes prior, and takes the number at that address and treats it as the number of objects stored there. It then calls a destructor on each object (the size of the object is known from the type of the pointer passed). Then, in order to release the exact address, it passes the address that was 4 bytes prior to the passed address.
On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!
Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.
They have primitive C semantics and do not have C++ semantics, and therefore C++ compiler and runtime support, which would let you or the compiler runtime systems do useful things with pointers to them.
In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.
This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.
And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, because of a) being able to write useful/interesting examples early on that do in fact call OS APIs, and of course because of the awesome power of b) "that's the way we've always done it".
Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime. In particular in this case the C memory manager.
The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.
Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)
C++ also has an attitude of "you only pay for what you use", storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with this attitude.
Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.
The cover story is that delete[] is required because of C++'s relationship with C.
The new operator can make a dynamically allocated object of almost any object type.
But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:
being the location of a single object, and
being the base of a dynamic array.
The delete versus delete[] situation just follows from that.
However, that does not ring true, because, in spite of the above observations being true, a single delete operator could be used. It does not logically follow that two operators are required.
Here is an informal proof. The new T operator invocation (single object case) could implicitly behave as if it were new T[1]. That is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then, there would just have to exist a single delete which behaves like today's delete[].
Why isn't that design followed?
I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.
It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.
An example might help. When you allocate a C-style array of objects, those objects may have their own destructors that need to be called. Plain delete does not do that: it destroys a single object, not the elements of a C-style array. You need delete[] for them.
Here is an example:
#include <iostream>
#include <stdlib.h>
#include <string>

using std::cerr;
using std::cout;
using std::endl;

class silly_string : private std::string {
public:
    silly_string(const char* const s) :
        std::string(s) {}
    ~silly_string() {
        cout.flush();
        cerr << "Deleting \"" << *this << "\"."
             << endl;
        // The destructor of the base class is now implicitly invoked.
    }
    friend std::ostream& operator<< ( std::ostream&, const silly_string& );
};

std::ostream& operator<< ( std::ostream& out, const silly_string& s )
{
    // Cast to a reference so the string is printed without being copied.
    return out << static_cast<const std::string&>(s);
}

int main()
{
    constexpr size_t nwords = 2;
    silly_string *const words = new silly_string[nwords]{
        "hello,",
        "world!" };
    cout << words[0] << ' '
         << words[1] << '\n';
    delete[] words;
    return EXIT_SUCCESS;
}
That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.
Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you force it to compile the buggy code anyway, you get heap corruption.
delete is an operator that destroys array and non-array objects that were created by a new expression.
It comes in two forms: the delete operator and the delete[] operator.
The new operator performs dynamic memory allocation, which places objects in heap memory.
This means the delete operator deallocates memory from the heap.
The pointer itself is not destroyed; the object or memory block it points to is.
A delete expression has type void and does not return a value.
Here's my situation: I want to overload "operator new" so that instead of allocating my object in a random space in memory, it gets allocated in a pre-defined memory buffer. I want to be able to save this buffer to a file and load it in the future, so I want to use handles instead of pointers. What I want, ideally, is for "operator new" to return a handle that I can use to go straight to the object's place in its buffer. Is it possible to do this in C++(11)? If not, what are my best alternatives?
Afaik you cannot change the return type of the new operator. For your scenario you could try the following approach:
Define a Handle (or Handle<T>, if you need handles for more than one type) class, which internally stores the index into your allocation area (i.e. Handle has one single member variable index). The Handle class will need to internally somehow have access to your allocation subsystem, i.e. it will need to know where the actual storage area (buffer) is located.
Define a constructor for Handle taking a pointer (returned by your new implementation) as argument and computing the index from that (e.g. by subtracting the beginning of your storage area)
Also define a dereferencing operator (operator *) for the Handle class returning a reference to the "handled" object (e.g. by adding the index to the beginning of the storage area...)
In your code, always use Handle<T> instead of pointers to T, at least at every point where you actually store the Handle/pointer.
That way, when you serialize/deserialize your storage area, only indices will be written to the disk. Using the Handle class will be the same as using pointers. The computation of the actual pointers will be done internally.
Of course this will have some performance penalty, since instead of directly using pointers there will always be some computation. Also this might affect the optimization that can be done by the compiler. One idea to minimize this performance issue would be to implement a conversion operator from Handle<T> to T*, so for code pieces that use a single Handle/Pointer a lot you could easily "precompute" the pointer and use it throughout the code.
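The approach above could be sketched roughly as follows. Storage, Handle::resolve and Record are hypothetical names, and for brevity the storage area is passed to the handle explicitly instead of being reachable through a global allocation subsystem or an operator*.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical storage area; in the scenario above this is the buffer
// that gets written to and read back from a file.
struct Storage {
    alignas(std::max_align_t) unsigned char bytes[1024];
};

// Stores a byte offset into the storage area instead of a raw address.
template <typename T>
class Handle {
public:
    Handle(const Storage& storage, const T* object)
        : offset_(static_cast<std::size_t>(
              reinterpret_cast<const unsigned char*>(object) - storage.bytes)) {}

    // Recompute the real pointer from wherever the storage currently lives.
    T* resolve(Storage& storage) const {
        assert(offset_ + sizeof(T) <= sizeof(storage.bytes));
        return reinterpret_cast<T*>(storage.bytes + offset_);
    }

    std::size_t offset() const { return offset_; }

private:
    std::size_t offset_;
};

// Hypothetical payload type.
struct Record {
    int id;
    double value;
};

// Constructs a Record at `position` bytes into a storage area and returns
// the offset recorded by its handle. `position` must be suitably aligned.
std::size_t offset_of_record_at(std::size_t position) {
    Storage storage;
    Record* r = new (storage.bytes + position) Record{1, 2.5};
    Handle<Record> h(storage, r);
    assert(h.resolve(storage) == r);
    return h.offset();
}
```

Only the offset would be written to disk, so it stays meaningful even if the buffer is loaded at a different address next time. Note that this only holds for objects that do not themselves contain raw pointers.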
I know something that might help you.. Try this..
Standard C++ also supports a second version of new, called placement new, which constructs an object in preallocated storage. For this to work, we must provide the address where we want the object to be constructed as a pointer parameter:
(my_class = new (place) Myclass);
So why would you want to use placement new? Placement new is useful for constructing objects in a pre-allocated block of memory. This bypasses the work of operator new by allowing the person constructing the object to choose the memory that it is initialized into. You might do this if you have a pool of memory you want to use for constructing some objects of a class, but don't want to overload operator new for the whole class.
What is the use of Construction of objects at predetermined locations in C++?
The following code illustrates construction at a predetermined location:
void *address = (void *) 0xBAADCAFE ;
MyClass *ptr = new (address) MyClass (/*arguments to constructor*/) ;
This creates an object of MyClass at the predetermined "address"
(assuming the storage pointed to by address is large enough to hold a MyClass object).
I would like to know the use of creating objects at such predetermined locations in memory.
One scenario where placement new is useful is:
You can preallocate a big buffer once and then use placement new many times.
This gives you better performance (you don't need a reallocation every time) and less fragmented memory (when you need small memory chunks). Typically this is what a std::vector implementation uses.
The downside is that you have to manage the allocated memory manually. Objects constructed with placement new require an explicit destructor invocation once they are no longer needed.
That said, it is always advisable to profile your application for bottlenecks instead of rushing to placement new as a premature optimization.
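A short sketch of the manual lifetime management this answer describes, using a made-up Widget type with an instance counter. Two objects share one buffer, and each one has to be destroyed by calling its destructor explicitly.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical type with a non-trivial destructor and an instance counter.
struct Widget {
    static int live;
    std::string name;
    explicit Widget(const std::string& n) : name(n) { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

int live_after_manual_lifetime() {
    // One buffer, obtained once, with room for two Widgets.
    alignas(Widget) unsigned char buffer[2 * sizeof(Widget)];

    Widget* a = new (buffer) Widget("a");
    Widget* b = new (buffer + sizeof(Widget)) Widget("b");
    assert(Widget::live == 2);

    // No delete here: the buffer was not obtained from new.
    // Each object must be destroyed explicitly instead.
    b->~Widget();
    a->~Widget();
    return Widget::live;
}
```

Forgetting the explicit destructor calls would not corrupt the buffer, but each Widget's std::string would never release its own resources.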
There are mainly two cases:
The first is when -for example in an embedded system- you have to construct an object in a given well-known place.
The second is when you want -for some reason- to manage memory in a way other than the default.
In C++, an expression like pA = new(...) A(...) does two consecutive things:
calls the void* operator new(size_t, ...) function and subsequently
calls A::A(...).
Since a new-expression is the only way to call A::A(), adding parameters to new lets you specialize how memory is managed. The most trivial case is "use memory already obtained by some other means".
This method is fine when allocation and construction need to be separated. The typical case is std::allocator, whose purpose is to allocate uninitialized memory for a given quantity of objects, while object construction happens later.
This happens, for example, in std::vector, since it has to allocate a capacity normally larger than its actual size, and then construct the objects, as they are push_back-ed, in the space that already exists.
In fact the default std::allocator, when asked to allocate n objects, essentially does return static_cast<T*>(::operator new(n*sizeof(T))), so it allocates the space but does not construct anything.
Assuming that std::vector stores:
T* pT; //the actual buffer
size_t sz; //the actual size
size_t cap; //the actual capacity
allocator<T> alloc;
an implementation of push_back can be:
template<class T>
void vector<T>::push_back(const T& t)
{
    if(sz==cap)
    {
        size_t ncap = cap + 1 + cap/2; //just something more than cap
        T* npT = alloc.allocate(ncap);
        for(size_t i=0; i<sz; ++i)
        {
            new(npT+i)T(pT[i]); //copy old values (may be move in C++11)
            pT[i].~T(); // destroy old value, without deallocating
        }
        alloc.deallocate(pT,cap);
        pT = npT;
        cap = ncap;
        // now we have extra capacity
    }
    new(pT+sz)T(t); //copy the new value
    ++sz; //actual size grown
}
In essence, there is a need to separate the allocation (which relates to the buffer as a whole) from the construction of the elements (which has to happen in the already existing buffer).
Usually you use predetermined locations in embedded or driver code, where some hardware is addressed via certain address ranges.
But in this case the storage at that address isn't necessarily meant to be accessed (you don't know what the new operator does with it), since the new operation is executed later on.
You use it as an initialization value (with new not really changing it).
Two purposes come to mind. First, in case you later forget a new, you instantly see your magic address in the debugger (in this case 0xBAADCAFE).
Second, you can use it when you fiddle around with the new operator and need an initial value, so you can debug it (e.g. you can see changes).
Or you have modified your new operator so that it does something with that magic number (e.g. for debugging, or, as mentioned above, to really use memory at a specific address for certain hardware), to switch between different allocation methods, ...
EDIT: To answer this correctly for this case, one needs to see what the new operator really does; you should check that new's source code.
This particular behaviour is useful when you know the address of a class by having a long, DWORD, DWORD_PTR or otherwise sized pointer passed as an argument to a function and need to reconstruct a copy of the class for O-O use.
Alternatively, this could also be used to create a class in pre-allocated memory or a location which you have determined is static (ie: you are linking your application with some ancient ASM libraries).
Custom allocators, realtime (no lock here), and performance.
Class-specific version of placement new can be provided even though you can't replace the global one. What scenarios exist where a class should provide its own placement new operator?
Even if my class doesn't implement placement new, the following code works (assuming no operator new is overloaded for abc).
char arr[100];
abc *pt = new(&arr)abc;
So my interpretation is that there is some default placement new, but for a class we can provide our own version of operator new. My question is: what is the use case for that?
What is one supposed to do other than returning the same pointer that is passed? Is there any useful example/scenario that you have encountered?
Sounds like a quiz question...
Faster and Leaner Allocation
The most common reason is a lot of small objects that need to be allocated dynamically. A custom allocator for fixed-size objects has much less allocation overhead than a generic allocator, does not suffer from fragmentation, and is typically faster. (Also, when these allocations are removed from the main heap, they don't contribute to main heap fragmentation anymore).
Similarly, a non-freeing allocator (where you can allocate multiple objects, but can't free them individually, only all together) is the fastest allocation scheme possible, and does not have any overhead (except alignment in a few rare cases). It makes sense if you are building a data structure that you never modify, only delete as a whole.
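A minimal sketch of such a non-freeing allocator, combined with a class-specific placement form of operator new. Arena and Node are hypothetical names, and the fixed 4096-byte capacity is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical non-freeing arena: objects are never released one by one;
// all of the memory goes away together when the arena itself does.
class Arena {
public:
    void* allocate(std::size_t size, std::size_t align) {
        // Round the current position up to the requested alignment.
        std::size_t start = (used_ + align - 1) / align * align;
        if (start + size > sizeof(bytes_)) throw std::bad_alloc();
        used_ = start + size;
        return bytes_ + start;
    }
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char bytes_[4096];
    std::size_t used_ = 0;
};

struct Node {
    int value;
    Node* next;
    Node(int v, Node* n) : value(v), next(n) {}

    // Class-specific placement form, used as: new (arena) Node(...)
    static void* operator new(std::size_t size, Arena& arena) {
        return arena.allocate(size, alignof(Node));
    }
    // Matching placement delete; only called if the constructor throws.
    // Nothing to do, because the arena never frees individual objects.
    static void operator delete(void*, Arena&) noexcept {}
};

int sum_list_in_arena() {
    Arena arena;
    Node* head = nullptr;
    for (int i = 1; i <= 3; ++i) {
        head = new (arena) Node(i, head);  // a pointer bump, no heap call
    }
    assert(arena.used() >= 3 * sizeof(Node));

    int sum = 0;
    for (Node* n = head; n != nullptr; n = n->next) sum += n->value;
    return sum;  // the arena and every node in it are released together here
}
```

Node is trivially destructible, so dropping the whole arena without running destructors is fine here. Types that own resources would still need their destructors called before the arena goes away.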
Other base allocator
Another application is allocating from a different heap than the C++ heap.
Maybe the data in the objects needs to be allocated in shared memory for exchange with other processes, or it needs to be passed to a system function that takes ownership and requires the use of a certain allocator. (Note that this requires implementing the same mechanism for all sub-objects, too; there is no generic way to achieve that.)
Similarly (this is where I use it), when you create code on the fly. Nowadays, you need to tell the OS that data on a memory page is allowed to run, but you get this memory in rather large chunks (e.g. 4K). So again, request a page (4K) from the OS with execution rights, then allocate many small objects on top of it, using placement new.
Unfortunately, AFAIK, you cannot do a class specific overload of the standard placement new operator, only of custom placement new operators. So a use-case for it is a bit academic, but I wanted to use it to forbid placement new on the class by using = delete of C++11. This works great with standard operator new but not for placement new.
Straight from the horse's mouth wiki. The section titled 'Use' highlights the need for placement new.
This SO thread here might also help
UPDATE:
To specifically answer your question: you might use the standard placement new provided by the header <new> if you have a pool of memory you want to use for constructing some objects of a class, but don't want to overload operator new for the whole class. In the latter case, all objects of the class are placed according to the overloaded placement new defined in the class.
I'm not sure that it's possible to overload the placement new, only the regular new. I can't think of even a single use for that, since the only possible implementation is just creating a temp object and memcpy'ing it to the given memory address, since you're not supposed to allocate any other memory in there, but use the given one.