According to paragraph 3.7.3/1 of the C++ Standard, objects are created dynamically with new-expressions, and the C++ runtime provides an allocation function, ::operator new().
Once in a while it is necessary to call ::operator new() directly.
Does the C++ Standard allow direct calls to ::operator new(), or are this function and its relatives for internal use only?
It's perfectly acceptable to call operator new and operator delete directly; they are part of the global namespace and act like a C++-ier version of malloc and free that interacts a bit more nicely with set_new_handler and the bad_alloc exception. The C++ ISO standard even contains a few examples of this. For example, §13.5/4 has this example:
Operator functions are usually not called directly; instead they are invoked to evaluate the operators they implement (13.5.1 - 13.5.7). They can be explicitly called, however, using the operator-function-id as the name of the function in the function call syntax (5.2.2). [Example:
complex z = a.operator+(b); // complex z = a+b;
void* p = operator new(sizeof(int)*n);
—end example]
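To make the quoted example concrete, here is a minimal sketch of a direct call paired with the matching deallocation function. The helper names allocate_ints and release_ints are made up for illustration and do not come from the Standard.

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: obtains raw, uninitialized storage for n ints by
// calling the global allocation function directly. No constructors run.
int* allocate_ints(std::size_t n) {
    void* raw = ::operator new(sizeof(int) * n);  // throws std::bad_alloc on failure
    return static_cast<int*>(raw);
}

// Hypothetical helper: storage obtained from ::operator new must be
// released with ::operator delete (not free, not delete[]).
void release_ints(int* p) noexcept {
    ::operator delete(p);
}
```

This is only reasonable for trivial types such as int; for class types the object still has to be constructed in the storage before use.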
Yes, it is allowed to call the global operator new function directly – though it's not as often required as you might believe. You must match allocation and deallocation functions, but if you have full control over both, then you can always use new[] and delete[] with char. However, that would be a new-expression and delete-expression, so you are only "required" to use the global functions themselves if you need a function pointer. (You would have to wrap the new-expression to get a function pointer, otherwise.)
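As a rough illustration of the function-pointer case, the sketch below stores the global functions in a small table. RawStorageHooks and default_storage_hooks are hypothetical names; the casts just select the usual single-argument overloads.

```cpp
#include <cstddef>
#include <new>

// Pointer types matching the ordinary (non-placement) global signatures.
using AllocFn = void* (*)(std::size_t);
using FreeFn = void (*)(void*);

// Hypothetical table of storage hooks, e.g. for handing to a C-style API.
struct RawStorageHooks {
    AllocFn alloc;
    FreeFn release;
};

// Both names are overloaded, so the cast picks the overload whose
// type matches the pointer type.
RawStorageHooks default_storage_hooks() {
    return RawStorageHooks{
        static_cast<AllocFn>(&::operator new),
        static_cast<FreeFn>(&::operator delete)
    };
}
```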
If you replace these global functions so that new and new[] use different heaps, for example, then you might also want to explicitly use ::operator new, but this is rare.
I've been using LLVM lately, and I found that new statements in C++ are translated to _Znam in LLVM IR. I know that new in C++ also calls the function _Znwm, and new[] calls _Znam, so what is the difference in functionality between these two functions?
What if I use _Znwm to allocate space for an array?
Example
a = new int*[10];
is compiled as
%2 = call i8* @_Znam(i64 80) #2
_Znwm and _Znam are just mangled names for the functions
operator new(std::size_t)
and
operator new[](std::size_t)
respectively.
A (non-placement) array new expression calls the latter, while a (non-placement) non-array new expression calls the former to allocate memory.
These functions can be replaced by the user, but the standard library provides default implementations. Since C++11, the default array version simply calls the non-array version. The non-array version allocates memory of the requested size, aligned suitably for all non-overaligned types, in some unspecified way. It throws std::bad_alloc if allocation fails and otherwise returns a non-null pointer to the beginning of the allocated block.
So it behaves similarly to std::malloc, except that the latter returns a null pointer if allocation fails, rather than throwing an exception. It is unspecified, but likely, that the default operator new implementation just uses malloc internally to do the allocation.
malloc should not call operator new or operator new[], so I don't know why you think that it would translate to that in IR.
I don't think there is anything LLVM-specific here. Which allocation function is called is specified by the C++ standard. Only the names are mangled in an implementation-defined manner.
Also note that these calls are not all that the new expressions translate to. After calling operator new/operator new[], the new expression also constructs objects in the memory, which may require constructor calls or stores of values.
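One way to see which function each form calls is to replace both and count. The sketch below is a simplified replacement (it ignores the new-handler loop a conforming one would run), and the counter names, g_escape and the two make_ helpers are invented for illustration. Compilers are allowed to elide allocations made by new-expressions, so the pointers are stored to a volatile global to keep the calls observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Counters updated by the replacement allocation functions below.
std::size_t g_scalar_new_calls = 0;  // operator new(std::size_t), mangled _Znwm on LP64 Itanium-ABI targets
std::size_t g_array_new_calls = 0;   // operator new[](std::size_t), mangled _Znam there
std::size_t g_last_request = 0;      // size passed to the most recent call
void* volatile g_escape = nullptr;   // makes pointers escape so calls are not elided

void* operator new(std::size_t size) {
    ++g_scalar_new_calls;
    g_last_request = size;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

void* operator new[](std::size_t size) {
    ++g_array_new_calls;
    g_last_request = size;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching deallocation functions, since the storage came from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }

// Non-array new-expression: allocates via operator new, then stores 7.
int* make_scalar() {
    int* p = new int(7);
    g_escape = p;
    return p;
}

// Array new-expression: allocates via operator new[].
int** make_pointer_array() {
    int** a = new int*[10];
    g_escape = a;
    return a;
}
```

With 8-byte pointers the array form requests 80 bytes, which matches the i64 80 argument in the IR from the question.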
The context is:
Writing a container, containing type T, and a char * p to a memory region. Let's suppose the pointer is already suitably aligned for type T - the alignment issue is not part of the question.
How do I default construct an element on that memory region?
((*T)(p))->T();
works for classes, but not with some builtin types.
((*T)(p)) = 0; // or simply memset
for integral types, pointers.
Do these two cover everything, unions and what not?
Is there a best practice for this, or some standard library feature?
std::allocator::construct can do it, that is what e.g. std::vector uses, but it is not a static method, so I would need an instance of it. Is there some freestanding or static function that can do it?
--EDIT--
Yes, the answer is obvious, and I was dumb today -- placement new
BTW, Now I'm trying to destroy the element...
"Placement new" is the term to look for. It is a standard library operator new overload that does not actually allocate memory, but just returns whatever pointer you pass to it.
Include the <new> header and use its placement new allocation function like this:
::new (p) T()
The :: qualification avoids picking up a class-specific allocation function.
The parenthesized (p) is an argument list for the allocation function.
This allocation function just returns the passed in pointer.
To be pedantic about things you would also cast the pointer to void*, to avoid picking up some hypothetical other operator new in the global namespace.
The code shown in the question, ((*T)(p))->T();, should not compile. The standard explicitly points out that a constructor doesn't have a name. So it can't be called like an ordinary function.
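For the container case in the question, a minimal sketch might look like the following. construct_default_in_place and destroy_in_place are hypothetical helper names; the code assumes the pointer is non-null, suitably aligned, and backed by at least sizeof(T) bytes, as the question stipulates.

```cpp
#include <new>
#include <string>

// Hypothetical helper: value-initializes a T in storage the caller owns.
// For built-in types T() yields zero; for classes it runs the default constructor.
template <class T>
T* construct_default_in_place(void* p) {
    return ::new (p) T();
}

// Hypothetical helper: ends the object's lifetime without freeing the storage.
// Inside a template, the explicit destructor call is also valid for built-in types.
template <class T>
void destroy_in_place(T* obj) noexcept {
    obj->~T();
}
```

The same pair covers the follow-up about destroying the element: an explicit destructor call is the counterpart of placement new.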
I know of storage classes in both C and C++ (static, extern, auto, register; C++ also adds mutable and some compiler-specific ones), but I can't figure out what a storage allocator is. I don't think it refers to the memory allocators that can be implemented for the STL. What is it, in simple terms?
It's whatever is behind operator new and operator delete (not to be confused with the new operator and the delete operator). operator new allocates memory from the free store, and operator delete releases memory previously allocated by operator new for possible reuse. When code does foo *ptr = new foo (new operator), the compiler generates code that calls operator new to get the right number of bytes of storage, then calls the constructor for foo. When code does delete ptr (delete operator) the compiler calls the destructor for foo, then calls operator delete to release the memory.
Note that this is how the term is used in the C++03 standard. In the C++11 standard it is also used to refer to standard allocators.
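The two steps described above can be written out by hand. This is only a sketch of roughly what the compiler generates; Traced, the counters, create_by_hand and destroy_by_hand are illustrative names.

```cpp
#include <new>

int g_traced_ctor_runs = 0;  // incremented by every Traced constructor call
int g_traced_dtor_runs = 0;  // incremented by every Traced destructor call

struct Traced {
    int value;
    explicit Traced(int v) : value(v) { ++g_traced_ctor_runs; }
    ~Traced() { ++g_traced_dtor_runs; }
};

// Roughly what `new Traced(v)` does: allocate storage, then construct in it.
Traced* create_by_hand(int v) {
    void* raw = ::operator new(sizeof(Traced));
    try {
        return ::new (raw) Traced(v);
    } catch (...) {
        ::operator delete(raw);  // a new-expression also frees the storage if the constructor throws
        throw;
    }
}

// Roughly what `delete p` does: run the destructor, then release the storage.
void destroy_by_hand(Traced* p) noexcept {
    if (p == nullptr) return;
    p->~Traced();
    ::operator delete(p);
}
```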
In the C++ standard, that term is used to refer to the allocator class used by STL-style containers - either std::allocator, or a user-defined custom allocator that meets the requirements given by C++11 17.6.3.5.
However, it's not a formally defined term, and also appears once referring to the implementation of the free store - that is, the dynamic storage allocated by new.
[NOTE: I'm referring to the current (2011) language specification. As noted in the comments, historical versions of the specification apparently only used the term (informally) to refer to the free store]
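For the other sense of the term, the allocator class used by containers, here is a small sketch that drives std::allocator through std::allocator_traits the way a container would. The function name roundtrip_through_allocator is made up, and the sketch deliberately omits exception safety.

```cpp
#include <memory>
#include <string>

// Hypothetical helper: builds one std::string through an allocator object,
// copies it out, then destroys it and returns the storage.
std::string roundtrip_through_allocator(const char* text) {
    using Alloc = std::allocator<std::string>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    std::string* p = Traits::allocate(alloc, 1);  // raw storage for one element
    Traits::construct(alloc, p, text);            // construct the element in place
    std::string copy = *p;
    Traits::destroy(alloc, p);                    // run the destructor
    Traits::deallocate(alloc, p, 1);              // give the storage back
    return copy;
}
```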
I understand that there are 3 general ways to modify the behaviour of new and delete in C++:
Replacing the default new/delete and new[]/delete[]
Overriding or overloading the placement versions (overriding the one with a memory location passed to it, overloading when creating versions which pass other types or numbers of arguments)
Overloading class specific versions.
What are the restrictions for performing these modifications to the behaviour of new/delete?
In particular are there limitations on the signatures that new and delete can be used with?
It makes sense if any replacement versions must have the same signature (otherwise they wouldn't be replacement or would break other code, like the STL for example), but is it permissible to have global placement or class specific versions return smart pointers or some custom handle for example?
First off, don't confuse the new/delete expression with the operator new() function.
The expression is a language construct that performs construction and destruction. The operator is an ordinary function that performs memory (de)allocation.
Only the default operators (operator new(size_t) and operator delete(void *)) can be used with the default new and delete expressions. All other forms are summarily called "placement" forms, and for those you can only use new, but you have to destroy objects manually by invoking the destructor. Placement forms are of rather limited and specialised need. By far the most useful placement form is global placement-new, ::new (addr) T, but the behavior of that cannot even be changed (which is presumably why it's the only popular one).
All new operators must return void *. These allocation functions are far more low-level than you might appreciate, so basically you "will know when you need to mess with them".
To repeat: C++ separates the notions of object construction and memory allocation. All you can do is provide alternative implementations for the latter.
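To illustrate a user-defined placement form, here is a sketch with a hypothetical BumpArena type. The allocation function takes an extra argument but still returns void*, and the object has to be destroyed by an explicit destructor call because there is no matching delete-expression. The arena is fixed-size, not thread-safe, and never reuses memory.

```cpp
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity bump arena used only for illustration.
class BumpArena {
public:
    // Hands out the next suitably aligned chunk, or throws when full.
    void* take(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) / align * align;
        if (start > sizeof(buffer_) || size > sizeof(buffer_) - start) {
            throw std::bad_alloc();
        }
        used_ = start + size;
        return buffer_ + start;
    }
    std::size_t used() const { return used_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[1024];
    std::size_t used_ = 0;
};

// Custom placement allocation function: extra parameter, void* result.
void* operator new(std::size_t size, BumpArena& arena) {
    return arena.take(size);
}

// Matching placement deallocation function. It is called only if the
// constructor throws; the arena has nothing to undo, so it does nothing.
void operator delete(void*, BumpArena&) noexcept {}

struct Point3 {
    double x, y, z;
    Point3(double a, double b, double c) : x(a), y(b), z(c) {}
};
```

Usage would look like `Point3* p = new (arena) Point3(1.0, 2.0, 3.0);` followed later by `p->~Point3();`.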
When you overload new and delete within a class you are effectively modifying the way the memory is allocated and released for the class, asking for it to give you this control.
This may be done when a class wants to use some kind of pool to allocate its instances, either for optimisation or for tracking purposes.
The restrictions, as with pretty much any operator overload, are on the parameter list you may pass and the behaviour the function is expected to adhere to.
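A minimal sketch of the tracking use case: a class-specific operator new and operator delete that forward to the global functions and keep a count. CountedNode and live_allocations are illustrative names.

```cpp
#include <cstddef>
#include <new>

struct CountedNode {
    int payload = 0;
    static std::size_t live_allocations;  // blocks currently handed out

    // Class-specific allocation function: used by `new CountedNode`.
    static void* operator new(std::size_t size) {
        void* p = ::operator new(size);  // may throw std::bad_alloc
        ++live_allocations;
        return p;
    }

    // Class-specific deallocation function: used by `delete ptr`.
    static void operator delete(void* p) noexcept {
        if (p != nullptr) --live_allocations;
        ::operator delete(p);
    }
};

std::size_t CountedNode::live_allocations = 0;
```

Only the storage handling changes here; construction and destruction are still performed by the new and delete expressions themselves.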
Why can't it just be regular function calls? New is essentially:
malloc(sizeof(Foo));
Foo::Foo();
While delete is
Foo::~Foo();
free(...);
So why do new/delete end up having their own syntax rather than being regular functions?
Here's a stab at it:
The new operator calls the operator new() function. Similarly, the delete operator calls the operator delete() function (and similarly for the array versions).
So why is this? Because the user is allowed to override operator new() but not the new operator (which is a keyword). You override operator new() (and delete) to define your own allocator; however, you are not responsible for (or, for that matter, allowed to be responsible for) calling the appropriate constructors and destructors. These functions are called automatically by the compiler when it sees the new keyword.
Without this dichotomy, a user could override the operator new() function, but the compiler would still have to treat this as a special function and call the appropriate constructor(s) for the object(s) being created.
You can overload operator new and operator delete to provide your own allocation semantics. This is useful when you want to bypass the default heap allocator's behavior. For example if you allocate and deallocate a lot of instances of a small, fixed-size object, you may want to use a pool allocator for its memory management.
Having new and delete as explicit operators like other operators makes this flexibility easier to express using C++'s operator overloading mechanism.
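The pool idea might be sketched as below, using a simple free list of recycled blocks. PooledParticle is a made-up class, and the pool is deliberately naive: it is not thread-safe and never returns blocks to the system.

```cpp
#include <cstddef>
#include <new>

// Hypothetical small object whose storage is recycled through a free list.
class PooledParticle final {
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list_;  // blocks released earlier, ready for reuse

    // Each block must be able to hold either a particle or a free-list node.
    static std::size_t block_size() {
        return sizeof(PooledParticle) > sizeof(FreeNode) ? sizeof(PooledParticle)
                                                         : sizeof(FreeNode);
    }

public:
    float mass;
    explicit PooledParticle(float m = 1.0f) : mass(m) {}

    static void* operator new(std::size_t) {
        if (free_list_ != nullptr) {   // reuse a recycled block if one exists
            FreeNode* node = free_list_;
            free_list_ = node->next;
            return node;
        }
        return ::operator new(block_size());  // otherwise fall back to the global heap
    }

    static void operator delete(void* p) noexcept {
        if (p == nullptr) return;
        // Turn the dead object's storage into a free-list node and push it.
        free_list_ = ::new (p) FreeNode{free_list_};
    }
};

PooledParticle::FreeNode* PooledParticle::free_list_ = nullptr;
```

Callers keep writing `new PooledParticle(2.0f)` and `delete p`; only the source of the memory changes.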
For auto objects on the stack, allocation/constructor call and deallocation/destructor calls basically are transparent as you request. :)
'Cause there is no way to provide compile-time type safety with a function (malloc() returns void*, remember). Additionally, C++ tries to eliminate even the slightest chance of allocated but uninitialized objects floating around. And there are objects out there without a default constructor - for these, how would you feed constructor arguments to a function? A function like this would require too much special-case handling; it is easier to promote it to a language feature. Thus operator new.
'new/delete' are keywords in the C++ language (like 'for' and 'while'), whereas malloc/calloc are function calls in the standard C library (like 'printf' and 'sleep'). Very different beasts, more than their similar syntax may let on.
The primary difference is that 'new' and 'delete' trigger additional user code - specifically, constructors and destructors. All malloc does is set aside some memory for you to use. When setting aside memory for a simple plain old data (floats or ints, for example), 'new' and 'malloc' behave very similarly. But when you ask for space for a class, the 'new' keyword sets aside memory and then calls a constructor to initialize that class. Big difference.
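A small sketch of that difference, using a hypothetical Gadget class that counts its constructor calls: a new-expression runs the constructor, while malloc only hands back bytes.

```cpp
#include <cstdlib>
#include <new>

int g_gadget_ctor_runs = 0;  // incremented by every Gadget constructor call

struct Gadget {
    int id;
    Gadget() : id(42) { ++g_gadget_ctor_runs; }
};

// Number of constructor calls triggered by one new-expression.
int ctor_runs_for_new() {
    int before = g_gadget_ctor_runs;
    Gadget* g = new Gadget;  // allocates, then runs Gadget::Gadget()
    int runs = g_gadget_ctor_runs - before;
    delete g;
    return runs;
}

// Number of constructor calls triggered by one malloc of the same size.
int ctor_runs_for_malloc() {
    int before = g_gadget_ctor_runs;
    void* raw = std::malloc(sizeof(Gadget));  // bytes only, no Gadget object yet
    int runs = g_gadget_ctor_runs - before;
    std::free(raw);
    return runs;
}
```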
Why does C++ have separate syntax for greater-than? Why can't it just be a regular function call?
greaterThan(foo, bar);