I've been using LLVM lately, and I found that new expressions in C++ are translated to _Znam in LLVM IR. I know that
new in C++ also calls the function _Znwm, and new[] calls _Znam, so what is the difference in functionality between these two functions?
What if I use _Znwm to allocate space for an array?
Example
a = new int*[10];
is compiled as
%2 = call i8* @_Znam(i64 80) #2
_Znwm and _Znam are just mangled names for the functions
operator new(std::size_t)
and
operator new[](std::size_t)
respectively.
A (non-placement) array new expression calls the latter, while a (non-placement) non-array new expression calls the former to allocate memory.
These functions can be replaced by the user, but a default implementation is provided by the standard library. Since C++11, the default implementation of the array version simply calls the non-array version. The non-array version allocates memory of the passed size, aligned suitably for all non-overaligned types, in some unspecified way. It throws the exception std::bad_alloc if allocation fails and otherwise returns a non-null pointer to the beginning of the allocated block.
So it behaves similarly to std::malloc, except that the latter returns a null pointer if allocation fails, rather than throwing an exception. It is unspecified, but likely, that the default operator new implementation just uses malloc internally to do the allocation.
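To make the contrast with malloc concrete, here is a rough sketch of how a default operator new could be layered on a malloc-like function. The names allocate_or_throw, malloc_alloc and always_fail, and the injected raw_alloc parameter, are hypothetical and exist only so that failure can be simulated; real standard library implementations differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Thin wrapper so the sketch does not take the address of std::malloc.
inline void* malloc_alloc(std::size_t n) { return std::malloc(n); }

// Hypothetical raw allocator that always fails, to simulate exhaustion.
inline void* always_fail(std::size_t) { return nullptr; }

// Sketch of operator-new-like behaviour on top of a malloc-like function.
// `raw_alloc` is an illustrative parameter; the real operator new has none.
inline void* allocate_or_throw(std::size_t size,
                               void* (*raw_alloc)(std::size_t) = malloc_alloc) {
    if (size == 0) size = 1;  // a zero-size request still yields a non-null pointer
    for (;;) {
        if (void* p = raw_alloc(size)) return p;
        std::new_handler handler = std::get_new_handler();
        if (!handler) throw std::bad_alloc();  // malloc would return null here
        handler();  // a handler may free memory, throw, or terminate
    }
}

// Returns true if the failing allocator led to std::bad_alloc.
inline bool failure_throws() {
    try {
        void* p = allocate_or_throw(64, always_fail);
        assert(p == nullptr && "not reached: the call above throws");
    } catch (const std::bad_alloc&) {
        return true;
    }
    return false;
}
```

With no new-handler installed, failure_throws() reports that the failure surfaced as an exception rather than as a null return.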
malloc should not call operator new or operator new[], so I don't know why you think that it would translate to that in IR.
I don't think there is anything LLVM-specific here. Which allocation function is called is specified by the C++ standard. Only the names are mangled in an implementation-defined manner.
Also note that these calls are not all that the new expressions translate to. After calling operator new/operator new[] the new expression will also construct objects in the memory, which may require constructor calls or stores of values.
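A small sketch of that split between allocation and construction. Widget, Counts and run_demo are hypothetical names; the class-specific allocation functions are used only so the calls can be counted without replacing the global ones.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Counts {
    int scalar_new = 0;   // calls to Widget::operator new
    int array_new = 0;    // calls to Widget::operator new[]
    int constructed = 0;  // Widget constructor calls
};

inline Counts& counts() {
    static Counts c;
    return c;
}

// Hypothetical class; its member allocation functions forward to the
// global ones and only add counting.
struct Widget {
    int value;
    Widget() : value(0) { ++counts().constructed; }

    static void* operator new(std::size_t n) {
        ++counts().scalar_new;
        return ::operator new(n);
    }
    static void* operator new[](std::size_t n) {
        ++counts().array_new;
        return ::operator new[](n);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};

inline Counts run_demo() {
    counts() = Counts{};
    Widget* one = new Widget;      // one allocation call, one constructor call
    Widget* many = new Widget[3];  // one allocation call, three constructor calls
    assert(counts().constructed == 4);
    delete one;
    delete[] many;
    return counts();
}
```

The array new expression makes a single call to the allocation function for the whole block and then runs the constructor once per element.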
Related
Section "11.2.4 Overloading new" ends with:
"There is no special syntax for placement of arrays. Nor need there be since arbitrary types can be allocated by placement new. However, an operator delete can be defined for arrays".
If I understand it correctly, what is being said is that for arrays, we use the usual placement new syntax, which would invoke the appropriate operator new[]. But, what I don't understand is the last sentence. What is he trying to say there? Afaik, we can specify both operator new and operator delete for arrays.
There is a special allocator for arrays (operator new[]()) but it is not dependent on a special syntax.
In the following code
new T();
or
new (p) T();
The compiler will generate a call to either operator new(...) or operator new[](...) depending on whether T is an array type. There is no syntactic difference in the new-expression.
(Now, there is a special syntax for a new-expression with a runtime size... but invocation of operator new[]() is not limited to scenarios with a runtime size)
In contrast to new, for delete the same pointer type is compatible with both scalar and array. So you, the programmer, must tell the compiler which one you want via
delete p;
vs
delete [] p;
There is no automatic translation to delete[] based on recognition of an array type.
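A sketch of both points, assuming a hypothetical class Node whose member allocation and deallocation functions log their own names. NodeArray and run_trace are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Records which allocation/deallocation functions ran, in order.
inline std::string& trace() {
    static std::string t;
    return t;
}

// Hypothetical class with member allocation functions so calls are observable.
struct Node {
    int id = 0;
    static void* operator new(std::size_t n) {
        trace() += "new;";
        return ::operator new(n);
    }
    static void* operator new[](std::size_t n) {
        trace() += "new[];";
        return ::operator new[](n);
    }
    static void operator delete(void* p) noexcept {
        trace() += "delete;";
        ::operator delete(p);
    }
    static void operator delete[](void* p) noexcept {
        trace() += "delete[];";
        ::operator delete[](p);
    }
};

using NodeArray = Node[4];  // an array type hidden behind an alias

inline std::string run_trace() {
    trace().clear();
    Node* a = new Node;       // not an array type: Node::operator new
    Node* b = new NodeArray;  // no brackets in the expression, still operator new[]
    assert(trace() == "new;new[];");
    // Both pointers have type Node*, so the form of delete is chosen by hand.
    delete a;
    delete[] b;
    return trace();
}
```

Since a and b have the same static type, nothing in the type system would catch a mismatched delete b; here.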
In C++, is 'new' an operator or an expression or some kind of keyword? A similar question that comes to mind is, should I call '=' an operator or an expression?
C++ separates the notion of memory allocation and object lifetime. This is a new feature compared to C, since in C an object was equivalent to its memory representation (which is called "POD" in C++).
An object begins its life when a constructor has completed, and its life ends when the destructor starts running. For an object of dynamic storage duration, the life cycle thus consists of four key milestones:
Memory allocation.
Object construction.
Object destruction.
Memory deallocation.
The standard way in C++ to allocate memory dynamically is with the global ::operator new(), and deallocation with ::operator delete(). However, to construct an object there is only one method: A new expression:
T * p = new T;
This most common form of the new expression does allocation and construction in one step. It is equivalent to the broken down version:
void * addr = ::operator new(sizeof(T));
T * p = new (addr) T; // placement-new
Similarly, the delete expression delete p; first calls the destructor and then releases memory. It is equivalent to this:
p->~T();
::operator delete(addr);
Thus, the default new and delete expressions each combine two steps: new performs memory allocation and object construction, and delete performs object destruction and memory deallocation. All other forms of the new expression, collectively called "placement new", call a corresponding placement-new operator to allocate memory before constructing the object. However, there is no matching "placement delete expression", and all dynamic objects created with placement-new have to be destroyed manually with p->~T();.
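The four milestones can be traced with a small sketch. Probe, events and manual_lifecycle are hypothetical names; the type only logs its constructor and destructor.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <vector>

inline std::vector<std::string>& events() {
    static std::vector<std::string> e;
    return e;
}

// Hypothetical type that logs construction and destruction.
struct Probe {
    Probe() { events().push_back("construct"); }
    ~Probe() { events().push_back("destroy"); }
};

// Performs by hand what `Probe* p = new Probe; delete p;` does in two expressions.
inline std::vector<std::string> manual_lifecycle() {
    events().clear();

    void* addr = ::operator new(sizeof(Probe));  // 1. memory allocation
    events().push_back("allocate");

    Probe* p = new (addr) Probe;                 // 2. object construction
    p->~Probe();                                 // 3. object destruction

    ::operator delete(addr);                     // 4. memory deallocation
    events().push_back("deallocate");

    assert(events().size() == 4);
    return events();
}
```

The "allocate" and "deallocate" entries are logged by the sketch itself, since the global allocation functions are not replaced here.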
In summary, it is very important to distinguish the new expression from the operator new. This is really at the heart of memory management in C++.
It's all of those.
2.13 Table 4 explicitly lists new as a keyword.
5.3.4 Introduces the new-expression. This is an expression such as new int(5) which uses the new keyword, a type and an initial value.
5.3.4/8 Then states that operator new is called to allocate memory for the object created by the new-expression.
= works quite the same. Each class has an operator= (unless explicitly deleted), which is used in assignment expressions. We usually call a=5; just an assignment, even when it's technically "an expression statement containing an assignment expression."
new is an operator. You can overload it and write your own version of it. I also think that = is an operator. An expression is a more complex thing which consists of operators, variables, function calls etc. And please try to get the C++ language standard. It should describe all the things you mentioned.
According to C++ Standard paragraph 3.7.3/1 objects should be dynamically created with new expression and the C++ runtime should provide an allocation function ::operator new().
Once in a while it is necessary to call ::operator new() directly.
Does the C++ Standard allow such calls to ::operator new() function or is this (and related) function for internal use only?
It's perfectly acceptable to call operator new and operator delete directly; they are a part of the global namespace and act like a C++-ier version of malloc and free that interact with set_new_handler and the bad_alloc exceptions a bit nicer. The C++ ISO standard even contains a few examples of this. For example, §13.5/4 has this example:
Operator functions are usually not called directly; instead they are invoked to evaluate the operators they implement (13.5.1 - 13.5.7). They can be explicitly called, however, using the operator-function-id as the name of the function in the function call syntax (5.2.2). [Example:
complex z = a.operator+(b); // complex z = a+b;
void* p = operator new(sizeof(int)*n);
—end example]
Yes, it is allowed to call the global operator new function directly – though it's not as often required as you might believe. You must match allocation and deallocation functions, but if you have full control over both, then you can always use new[] and delete[] with char. However, that would be a new-expression and delete-expression, so you are only "required" to use the global functions themselves if you need a function pointer. (You would have to wrap the new-expression to get a function pointer, otherwise.)
If you replace these global functions so that new and new[] use different heaps, for example, then you might also want to explicitly use ::operator new, but this is rare.
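For illustration, a minimal sketch of a direct call in the spirit of the standard's operator new(sizeof(int)*n) example. The function name sum_first is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Allocates raw storage for `n` ints by calling the allocation function
// directly, constructs each element with placement new, sums the elements,
// and then releases the storage.
inline long sum_first(std::size_t n) {
    void* raw = ::operator new(sizeof(int) * n);  // raw, uninitialized bytes
    assert(raw != nullptr);  // the throwing form never returns null
    int* data = static_cast<int*>(raw);

    for (std::size_t i = 0; i < n; ++i) {
        // Construct element i in place with the value i.
        ::new (static_cast<void*>(data + i)) int(static_cast<int>(i));
    }

    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];

    // Pair ::operator new with ::operator delete, not with delete[] or free.
    ::operator delete(raw);
    return total;
}
```

Because int is trivially destructible, no explicit destructor calls are needed before the storage is released; a class type with a non-trivial destructor would need them.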
Looks like operator new and operator new[] have exactly the same signature:
void* operator new( size_t size );
void* operator new[]( size_t size );
and do exactly the same: either return a pointer to a big enough block of raw (not initialized in any way) memory or throw an exception.
Also operator new is called internally when I create an object with new and operator new[] - when I create an array of objects with new[]. Still the above two special functions are called by C++ internally in exactly the same manner and I don't see how the two calls can have different meanings.
What's the purpose of having two different functions with exactly the same signatures and exactly the same behavior?
In Design and Evolution of C++ (section 10.3), Stroustrup mentions that if the new operator for object X was itself used for allocating an array of object X, then the writer of X::operator new() would have to deal with array allocation too, which is not the common usage for new() and adds complexity. So, it was not considered to use new() for array allocation. Then, there was no easy way to allocate different storage areas for dynamic arrays. The solution was to provide separate allocator and deallocator methods for arrays: new[] and delete[].
The operators can be overridden (for a specific class, or within a namespace, or globally), and this allows you to provide separate versions if you want to treat object allocations differently from array allocations. For example, you might want to allocate from different memory pools.
I've had a reasonably good look at this, and to be blunt there's no reason from an interface standpoint.
The only possible reason that I can think of is to allow an optimization hint for the implementation: operator new[] is likely to be called upon to allocate larger blocks of memory. But that is a really, really tenuous supposition, as you could new a very large structure or new char[2], which doesn't really count as large.
Note that operator new[] doesn't add any magic extra storage for the array count or anything. It is the job of the new[] expression to work out how much overhead (if any) is needed and to pass the correct byte count to operator new[].
[A test with gcc indicates that no extra storage is needed by new[] unless the type of the array members being constructed has a non-trivial destructor.]
From an interface and contract standpoint (other than requiring the use of the correct corresponding deallocation function) operator new and operator new[] are identical.
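A sketch of how such a test could be written. Recording, Plain, Owner and array_overhead are hypothetical names, and the resulting numbers are implementation-specific, so none are claimed here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Size most recently passed to the class-specific operator new[].
inline std::size_t& last_request() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical base that records the byte count the new[] expression asks for.
struct Recording {
    static void* operator new[](std::size_t n) {
        last_request() = n;
        return ::operator new[](n);
    }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};

struct Plain : Recording { int x; };              // trivial destructor
struct Owner : Recording { int x; ~Owner() {} };  // non-trivial destructor

// Extra bytes requested beyond count * sizeof(T).
template <class T>
std::size_t array_overhead(std::size_t count) {
    T* p = new T[count];
    // The new[] expression never asks for less than the elements need.
    assert(last_request() >= count * sizeof(T));
    std::size_t extra = last_request() - count * sizeof(T);
    delete[] p;
    return extra;
}
```

Comparing array_overhead<Plain>(8) with array_overhead<Owner>(8) on a given compiler shows whether it stores an element count for types that need their destructors run by delete[].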
One purpose is that they can be separately defined by the user. So if I want to initialize memory in single heap-allocated objects to 0xFEFEFEFE and memory in heap-allocated arrays to 0xEFEFEFEF, because I think it will help me with debugging, then I can.
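A minimal sketch of that debugging idea, done per class rather than globally so it does not affect the rest of the program. Tagged and the two helper functions are hypothetical names. The helpers call the allocation functions directly and inspect the raw block, because reading the pattern back through a default-initialized object would be reading an indeterminate value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical class whose allocation functions pre-fill memory with
// different byte patterns for single objects and for arrays.
struct Tagged {
    int payload;

    static void* operator new(std::size_t n) {
        void* p = ::operator new(n);
        std::memset(p, 0xFE, n);  // single objects: 0xFEFEFEFE...
        return p;
    }
    static void* operator new[](std::size_t n) {
        void* p = ::operator new[](n);
        std::memset(p, 0xEF, n);  // arrays: 0xEFEFEFEF...
        return p;
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};

// Allocates a block for one object (no object is constructed) and
// returns its first byte.
inline unsigned char first_byte_scalar() {
    void* p = Tagged::operator new(sizeof(Tagged));
    unsigned char b = *static_cast<unsigned char*>(p);
    Tagged::operator delete(p);
    return b;
}

// Same for a block sized for four elements, via operator new[].
inline unsigned char first_byte_array() {
    void* p = Tagged::operator new[](4 * sizeof(Tagged));
    unsigned char b = *static_cast<unsigned char*>(p);
    Tagged::operator delete[](p);
    assert(b != first_byte_scalar());  // the two patterns are distinguishable
    return b;
}
```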
Whether that's worth it is another matter. I guess if your particular program mostly uses quite small objects, and quite large arrays, then you could allocate off different heaps in the hope that this will reduce fragmentation. But equally you could identify the classes which you allocate large arrays of, and just override operator new[] for those classes. Or operator new could switch between different heaps based on the size.
There is actually a difference in the wording of the requirements. One allocates memory aligned for any object of the specified size, the other allocates memory aligned for any array of the specified size. I don't think there's any difference - an array of size 1 surely has the same alignment as an object - but I could be mistaken. The fact that by default the array version returns the same as the object version strongly suggests there is no difference. Or at least that the alignment requirements on an object are stricter than those on an array, which I can't make any sense of...
The standard says that new T calls operator new() and new T[] results in a call of operator new[](). You could overload them if you want. I believe that there is no difference between them by default. The standard says that they are replaceable (3.7.3/2):
The library provides default definitions for the global allocation and deallocation functions. Some global allocation and deallocation functions are replaceable (18.4.1). A C++ program shall provide at most one definition of a replaceable allocation or deallocation function. Any such function definition replaces the default version provided in the library (17.4.3.4). The following allocation and deallocation functions (18.4) are implicitly declared in global scope in each translation unit of a program
void* operator new(std::size_t) throw(std::bad_alloc);
void* operator new[](std::size_t) throw(std::bad_alloc);
void operator delete(void*) throw();
void operator delete[](void*) throw();
This MSDN page mentions that there're nothrow versions of new and delete. nothrow new is quite a known thing - returns null instead of throwing an exception if memory allocation fails. But what is nothrow delete mentioned there?
They are probably referring to the raw memory allocation functions operator new and operator delete.
When you invoke a specific version of placement new-expression (i.e. new-expression with extra parameters; they all are officially referred to as placement forms of new) and the memory allocation function operator new succeeds, but the process fails later for some other reason (the constructor throws), the implementation has to abort the process and automatically release the allocated memory by calling the appropriate version of operator delete. "Appropriate version" of operator delete in this case is the version that has the same set of parameters as the operator new function previously used for memory allocation (except for the very first parameter, of course).
This applies to nothrow version of operator new as well. When you use a nothrow form of new-expression, it calls a nothrow version of operator new and then constructs the object in the allocated memory. If the constructor fails (throws), the implementation of the new-expression releases allocated memory with the help of nothrow version of operator delete. This is basically the only reason for this version of operator delete to exist.
In other words, the nothrow version of operator delete exists for very specific internal purposes. You should not normally want to call it yourself and, maybe, you don't really need to know about its existence. However, it is worth knowing that for the reasons described above, whenever you create your own version of operator new with extra parameters, it is always a good idea to provide a matching version of operator delete with the same set of extra parameters.
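The same mechanism can be sketched with a user-defined placement form. PoolTag, Fragile and the two helper functions are hypothetical names; the extra parameter carries no data and only selects the overload.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

struct PoolTag {};  // hypothetical extra argument for the placement form

inline bool& matching_delete_called() {
    static bool flag = false;
    return flag;
}

struct Fragile {
    explicit Fragile(bool fail) {
        if (fail) throw std::runtime_error("constructor failed");
    }

    // Placement allocation function with an extra PoolTag parameter.
    static void* operator new(std::size_t n, PoolTag) { return ::operator new(n); }

    // Matching placement deallocation function: same extra parameters.
    // The new-expression calls it only when the constructor throws.
    static void operator delete(void* p, PoolTag) noexcept {
        matching_delete_called() = true;
        ::operator delete(p);
    }

    // Usual deallocation function, used by ordinary delete-expressions.
    static void operator delete(void* p) noexcept { ::operator delete(p); }
};

inline bool throwing_ctor_triggers_matching_delete() {
    matching_delete_called() = false;
    try {
        Fragile* f = new (PoolTag{}) Fragile(true);  // constructor throws
        delete f;                                    // not reached
    } catch (const std::runtime_error&) {
        // The memory has already been released by operator delete(void*, PoolTag).
    }
    return matching_delete_called();
}

inline bool successful_ctor_skips_matching_delete() {
    matching_delete_called() = false;
    Fragile* f = new (PoolTag{}) Fragile(false);
    assert(f != nullptr);
    delete f;  // uses the usual operator delete(void*)
    return !matching_delete_called();
}
```

Without the operator delete(void*, PoolTag) overload, the program would still compile, but no deallocation function would be called when the constructor throws, and the block would leak.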