What's the purpose of having a separate "operator new[]"? - c++

Looks like operator new and operator new[] have exactly the same signature:
void* operator new( size_t size );
void* operator new[]( size_t size );
and do exactly the same: either return a pointer to a big enough block of raw (not initialized in any way) memory or throw an exception.
Also, operator new is called internally when I create an object with new, and operator new[] when I create an array of objects with new[]. Still, the above two special functions are called by C++ internally in exactly the same manner, and I don't see how the two calls can have different meanings.
What's the purpose of having two different functions with exactly the same signatures and exactly the same behavior?

In Design and Evolution of C++ (section 10.3), Stroustrup mentions that if the new operator for object X were itself used for allocating an array of X objects, then the writer of X::operator new() would have to deal with array allocation too, which is not the common usage for operator new() and would add complexity. So it was decided not to use operator new() for array allocation. But then there was no easy way to allocate different storage areas for dynamic arrays, and the solution was to provide separate allocation and deallocation functions for arrays: new[] and delete[].

The operators can be overridden (for a specific class, or within a namespace, or globally), and this allows you to provide separate versions if you want to treat object allocations differently from array allocations. For example, you might want to allocate from different memory pools.
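For instance, a class can route its single-object and array allocations to different places. A minimal sketch (the pool_* functions here are hypothetical stand-ins backed by malloc):
#include <cstdlib>
#include <new>

void* pool_alloc_object(std::size_t n) { return std::malloc(n); }  // stand-in for an object pool
void* pool_alloc_array(std::size_t n)  { return std::malloc(n); }  // stand-in for an array pool
void  pool_free(void* p)               { std::free(p); }

struct Widget {
    static void* operator new(std::size_t size) {
        if (void* p = pool_alloc_object(size)) return p;
        throw std::bad_alloc();
    }
    static void* operator new[](std::size_t size) {
        if (void* p = pool_alloc_array(size)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) noexcept   { pool_free(p); }
    static void operator delete[](void* p) noexcept { pool_free(p); }
};

int main() {
    Widget* w  = new Widget;      // goes through Widget::operator new
    Widget* ws = new Widget[4];   // goes through Widget::operator new[]
    delete w;
    delete[] ws;
}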

I've had a reasonably good look at this, and to be blunt there's no reason from an interface standpoint.
The only possible reason that I can think of is to allow an optimization hint for the implementation: operator new[] is likely to be called upon to allocate larger blocks of memory. But that is a really, really tenuous supposition, as you could new a very large structure or new char[2], which doesn't really count as large.
Note that operator new[] doesn't add any magic extra storage for the array count or anything. It is the job of the new[] expression to work out how much overhead (if any) is needed and to pass the correct byte count to operator new[].
[A test with gcc indicates that no extra storage is needed by new[] unless the type of the array members being constructed has a non-trivial destructor.]
From an interface and contract standpoint (other than requiring the use of the correct corresponding deallocation function), operator new and operator new[] are identical.
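One way to observe that overhead (the exact numbers are implementation-dependent; with gcc or clang, an element type with a non-trivial destructor typically gets an extra count field):
#include <cstdio>
#include <cstdlib>
#include <new>

struct NeedsDtor {
    ~NeedsDtor() {}   // non-trivial destructor, so delete[] must know the element count
    static void* operator new[](std::size_t size) {
        std::printf("operator new[] was asked for %zu bytes\n", size);
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc();
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
    char c;
};

int main() {
    std::printf("5 * sizeof(NeedsDtor) = %zu\n", 5 * sizeof(NeedsDtor));
    NeedsDtor* a = new NeedsDtor[5];   // the size passed may be larger than 5 * sizeof(NeedsDtor)
    delete[] a;
}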

One purpose is that they can be separately defined by the user. So if I want to initialize memory in single heap-allocated objects to 0xFEFEFEFE and memory in heap-allocated arrays to 0xEFEFEFEF, because I think it will help me with debugging, then I can.
Whether that's worth it is another matter. I guess if your particular program mostly uses quite small objects, and quite large arrays, then you could allocate off different heaps in the hope that this will reduce fragmentation. But equally you could identify the classes which you allocate large arrays of, and just override operator new[] for those classes. Or operator new could switch between different heaps based on the size.
There is actually a difference in the wording of the requirements. One allocates memory aligned for any object of the specified size, the other allocates memory aligned for any array of the specified size. I don't think there's any difference - an array of size 1 surely has the same alignment as an object - but I could be mistaken. The fact that by default the array version returns the same as the object version strongly suggests there is no difference. Or at least that the alignment requirements on an object are stricter than those on an array, which I can't make any sense of...
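A minimal sketch of the debug-fill idea above, replacing the global operators (a real replacement would also want the nothrow and, in newer dialects, the sized-deallocation overloads):
#include <cstdlib>
#include <cstring>
#include <new>

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    std::memset(p, 0xFE, size);   // pattern for single heap-allocated objects
    return p;
}
void* operator new[](std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    std::memset(p, 0xEF, size);   // pattern for heap-allocated arrays
    return p;
}
void operator delete(void* p) noexcept   { std::free(p); }
void operator delete[](void* p) noexcept { std::free(p); }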

Standard says that new T calls operator new() and new T[] results in a call of operator new[](). You could overload them if you want. I believe that there is no difference between them by default. Standard says that they are replaceable (3.7.3/2):
The library provides default definitions for the global allocation and deallocation functions. Some global allocation and deallocation functions are replaceable (18.4.1). A C++ program shall provide at most one definition of a replaceable allocation or deallocation function. Any such function definition replaces the default version provided in the library (17.4.3.4). The following allocation and deallocation functions (18.4) are implicitly declared in global scope in each translation unit of a program
void* operator new(std::size_t) throw(std::bad_alloc);
void* operator new[](std::size_t) throw(std::bad_alloc);
void operator delete(void*) throw();
void operator delete[](void*) throw();
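Since they are ordinary (if special-purpose) functions, they can also be called directly, which makes the symmetry plain (a small sketch):
#include <new>

int main() {
    void* p = ::operator new(64);    // raw, uninitialized storage; no constructor runs
    ::operator delete(p);

    void* q = ::operator new[](64);  // same contract, array flavour
    ::operator delete[](q);
}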

Related

What are the limitations of overloading, overriding and replacing new/delete?

I understand that there are 3 general ways to modify the behaviour of new and delete in C++:
Replacing the default new/delete and new[]/delete[]
Overriding or overloading the placement versions (overriding the one with a memory location passed to it, overloading when creating versions which pass other types or numbers of arguments)
Overloading class specific versions.
What are the restrictions for performing these modifications to the behaviour of new/delete?
In particular are there limitations on the signatures that new and delete can be used with?
It makes sense that replacement versions must have the same signature (otherwise they wouldn't be replacements, or would break other code, the STL for example), but is it permissible for global placement or class-specific versions to return smart pointers or some custom handle, for example?
First off, don't confuse the new/delete expression with the operator new() function.
The expression is a language construct that performs construction and destruction. The operator is an ordinary function that performs memory (de)allocation.
Only the default operators (operator new(size_t) and operator delete(void *)) can be used with the default new and delete expressions. All other forms are summarily called "placement" forms, and for those you can only use new; you have to destroy the objects manually by invoking the destructor. Placement forms are of rather limited and specialised need. By far the most useful placement form is global placement-new, ::new (addr) T, but the behavior of that cannot even be changed (which is presumably why it's the only popular one).
All new operators must return void *. These allocation functions are far more low-level than you might appreciate, so basically you "will know when you need to mess with them".
To repeat: C++ separates the notions of object construction and memory allocation. All you can do is provide alternative implementations for the latter.
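A small sketch of that separation, using global placement-new and a manual destructor call (Foo is a made-up type; alignas is C++11):
#include <cstdio>
#include <new>

struct Foo {
    explicit Foo(int v) : value(v) { std::printf("Foo(%d) constructed\n", v); }
    ~Foo()                         { std::printf("Foo(%d) destroyed\n", value); }
    int value;
};

int main() {
    alignas(Foo) unsigned char buf[sizeof(Foo)];  // storage obtained without any allocator

    Foo* f = ::new (buf) Foo(42);  // placement-new: construction only, no allocation
    f->~Foo();                     // manual destruction; no delete, since nothing was allocated
}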
When you overload new and delete within a class, you are effectively modifying the way memory is allocated and released for that class, taking that control into your own hands.
This may be done when a class wants to use some kind of pool to allocate its instances, either for optimisation or for tracking purposes.
The restrictions, as with pretty much any operator overload, are the parameter list you may pass and the behaviour the overload is expected to adhere to.
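For example, a class-specific overload used purely for tracking might look like this (a sketch; the counter is illustrative and not thread-safe):
#include <cstdio>
#include <cstdlib>
#include <new>

struct Tracked {
    static std::size_t live_bytes;   // bytes currently allocated for Tracked objects

    static void* operator new(std::size_t size) {
        if (void* p = std::malloc(size)) { live_bytes += size; return p; }
        throw std::bad_alloc();
    }
    static void operator delete(void* p, std::size_t size) noexcept {
        std::free(p);
        live_bytes -= size;
    }

    double payload[4];
};

std::size_t Tracked::live_bytes = 0;

int main() {
    Tracked* t = new Tracked;
    std::printf("live bytes after new: %zu\n", Tracked::live_bytes);
    delete t;
    std::printf("live bytes after delete: %zu\n", Tracked::live_bytes);
}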

What is the difference between ::operator new(size_t n) and new char[n] with regard to alignment?

With regard to alignment, there is something I don't understand about the difference between an explicit call to T* t = (T*) ::operator new(sizeof(T)) and T* t = (T*)new char[sizeof(T)], where T is a C-like struct.
In the book "C++ Solutions Companion to The C++ Programming Language, Third Edition" (see exercise 12.9), the author says: "Note that one should not use the expression new char[n] to allocate raw memory because that memory may not satisfy the alignment requirements of a T, while an explicit call of the global operator new is guaranteed to yield storage that is sufficiently aligned for any C++ object.". This is also stated in an article I read on internet: http://www.scs.stanford.edu/~dm/home/papers/c++-new.html (see the "delete vs. delete[] and free" paragraph).
On the other hand, in my previous post Data alignment in C++, standard and portability a guy quoted the standard, plus a note at the end of the quote which contradicts the above people:
"A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement (3.11) of any object type whose size is no greater than the size of the array being created. [ Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type with fundamental alignment, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. — end note ]"
I am confused. Could you please tell me who is right here according to the standard and, in case, why is ::operator new() different from new char[]?
That's right, new char[n] delegates to operator new[](size_t), see 5.3.4 "New" §8:
A new-expression obtains storage for the object by calling an allocation function [...]
If the allocated type is an array type, the allocation function's name is operator new[]
Hence, there is absolutely no way new char[n] could somehow yield "less aligned" memory than a direct call to ::operator new.
Sadly, when it comes to technical details, most C++ books out there just plain suck.
C++98 §3.7.3.1/2, about allocation functions:
The pointer returned shall be suitably aligned so that it can be converted to a pointer of
any complete object type and then used to access the object or array in the storage allocated
Together with your quote of §5.3.4/10 about new-expressions, "A new-expression passes the amount...", this means new char[n] can't offer weaker alignment guarantees, it can't be less aligned.
Cheers & hth.,
The confusion is between different versions of the standard. The older standard did not make the guarantee about new char[n] and the newer one does.
Note the bit in both quotes about how new char[n] might NOT return the same pointer it got back from the underlying operator new[](size_t) call. That's because, when allocating an array, the runtime may use the space at the beginning of the allocated block (the space between what operator new[](size_t) returned and what new char[n] returns) to store the size of the array, and that overhead might not preserve the full worst-case alignment. An implementation is not required to do this: it might note that, since char has no destructor, it doesn't need to record the size of the array to know how many destructor calls to make when delete[] is called. But an implementation is free to treat all arrays uniformly.
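If you want to convince yourself on a given implementation, a quick check might look like this (alignof and std::uintptr_t are C++11; both remainders should print as 0):
#include <cstdint>
#include <cstdio>
#include <new>

struct T { long double x; };   // a type with a reasonably strict fundamental alignment

int main() {
    void* a = ::operator new(sizeof(T));
    char* b = new char[sizeof(T)];

    std::printf("%zu %zu\n",
                std::size_t(reinterpret_cast<std::uintptr_t>(a) % alignof(T)),
                std::size_t(reinterpret_cast<std::uintptr_t>(b) % alignof(T)));

    ::operator delete(a);
    delete[] b;
}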

Why does C++ have separate syntax for new & delete?

Why can't it just be regular function calls? New is essentially:
malloc(sizeof(Foo));
Foo::Foo();
While delete is
Foo::~Foo();
free(...);
So why does new/delete end up having its own syntax rather than being regular functions?
Here's a stab at it:
The new operator calls the operator new() function. Similarly, the delete operator calls the operator delete() function (and similarly for the array versions).
So why is this? Because the user is allowed to override operator new() but not the new operator (which is a keyword). You override operator new() (and delete) to define your own allocator; however, you are not responsible for calling the appropriate constructors and destructors (nor are you allowed to). These functions are called automatically by the compiler when it sees the new keyword.
Without this dichotomy, a user could override the operator new() function, but the compiler would still have to treat this as a special function and call the appropriate constructor(s) for the object(s) being created.
You can overload operator new and operator delete to provide your own allocation semantics. This is useful when you want to bypass the default heap allocator's behavior. For example if you allocate and deallocate a lot of instances of a small, fixed-size object, you may want to use a pool allocator for its memory management.
Having new and delete as explicit operators like other operators makes this flexibility easier to express using C++'s operator overloading mechanism.
For auto objects on the stack, the allocation/constructor call and the deallocation/destructor call basically are transparent, as you describe. :)
'Cause there is no way to provide compile-time type safety with a function (malloc() returns void*, remember). Additionally, C++ tries to eliminate even the slightest chance of allocated but uninitialized objects floating around. And there are objects out there without a default constructor; for these, how would you feed constructor arguments to a function? A function like this would require too much special-case handling; easier to promote it to a language feature. Thus operator new.
'new/delete' are keywords in the C++ language (like 'for' and 'while'), whereas malloc/calloc are function calls in the standard C library (like 'printf' and 'sleep'). Very different beasts, more than their similar syntax may let on.
The primary difference is that 'new' and 'delete' trigger additional user code - specifically, constructors and destructors. All malloc does is set aside some memory for you to use. When setting aside memory for a simple plain old data (floats or ints, for example), 'new' and 'malloc' behave very similarly. But when you ask for space for a class, the 'new' keyword sets aside memory and then calls a constructor to initialize that class. Big difference.
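A short sketch of that difference (Account is a made-up type; the second half shows what you would have to do by hand with malloc):
#include <cstdio>
#include <cstdlib>
#include <new>

struct Account {
    explicit Account(int id) : id(id) { std::printf("Account %d opened\n", id); }
    ~Account()                        { std::printf("Account %d closed\n", id); }
    int id;
};

int main() {
    Account* a = new Account(7);               // allocates and runs the constructor

    void* raw = std::malloc(sizeof(Account));  // only sets memory aside; no constructor runs
    if (!raw) return 1;
    Account* b = ::new (raw) Account(8);       // the constructor has to be invoked separately

    delete a;                                  // runs the destructor, then deallocates
    b->~Account();                             // by hand: destroy...
    std::free(raw);                            // ...then free
}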
Why does C++ have separate syntax for greater-than? Why can't it just be a regular function call?
greaterThan(foo, bar);

STL allocators and operator new[]

Are there STL implementations that use operator new[] as an allocator? On my compiler, making Foo::operator new[] private did not prevent me from creating a vector<Foo>... is that behavior guaranteed by anything?
C++ Standard, section 20.4.1.1. The default allocator allocate() function uses global operator new:
pointer allocate(size_type n, allocator<void>::const_pointer hint = 0);
3 Notes: Uses ::operator new(size_t) (18.4.1).
std library implementations won't use T::operator new[] for std::allocator. Most of them use their own memory pooling infrastructure behind the scenes.
In general, if you want to stop Foo objects being dynamically allocated, you'll have to have make all the constructors private and provide a function that creates Foo objects. Of course, you won't be able to create them as auto variables either though.
std::vector uses an Allocator that's passed as a template argument, which defaults to std::allocator. The allocator doesn't work like new[] though -- it just allocates raw memory, and placement new is used to actually create the objects in that memory when you tell it to add them (e.g. with push_back() or resize()).
About the only way you could use new[] in an allocator would be if you abused things a bit, and allocated raw space using something like new char[size];. As abuses go, that one's fairly harmless, but it's still unrelated to your overload of new[] for the class.
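The allocate-then-construct split looks roughly like this when used by hand (a sketch using the C++11 allocator_traits interface):
#include <memory>
#include <string>

int main() {
    std::allocator<std::string> alloc;
    using Traits = std::allocator_traits<std::allocator<std::string>>;

    std::string* p = Traits::allocate(alloc, 3);     // raw memory for 3 strings; no constructors yet

    for (int i = 0; i < 3; ++i)
        Traits::construct(alloc, p + i, "element");  // placement-construct each element

    for (int i = 0; i < 3; ++i)
        Traits::destroy(alloc, p + i);               // destroy before giving the memory back

    Traits::deallocate(alloc, p, 3);
}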
If you want to prohibit creation of your objects, make the constructor private rather than operator new.
In addition to the other answers here, if you want to prevent anyone from creating a STL container for your type Foo, then simply make the copy-constructor for Foo private (also the move-constructor if you're working with C++11). All STL-container objects must have a valid copy or move constructor for the container's allocator to properly call placement new and construct a copy of the object in the allocated memory block for the container.

Operator overloading with memory allocation?

The sentence below is from The Positive Legacy of C++ and Java by Bruce Eckel, about operator overloading in C++:
C++ has both stack allocation and heap allocation and you must overload your operators to handle all situations and not cause memory leaks. Difficult indeed.
I do not understand how operator overloading has anything to do with memory allocation. Can anyone please explain how they are correlated?
I can imagine a couple possible interpretations:
First, in C++ new and delete are both actually operators; if you choose to provide custom allocation behavior for an object by overloading these operators, you must be very careful in doing so to ensure you don't introduce leaks.
Second, some types of objects require that you overload operator= to avoid memory management bugs. For example, if you have a reference counting smart pointer object (like the Boost shared_ptr), you must implement operator=, and you must be sure to do so correctly. Consider this broken example:
template <class T>
class RefCountedPtr {
public:
    RefCountedPtr(T *data) : mData(data) { mData->incrRefCount(); }
    ~RefCountedPtr() { mData->decrRefCount(); }
    RefCountedPtr<T>& operator=(const RefCountedPtr<T>& other) {
        mData = other.mData;
        return *this;
    }
    ...
protected:
    T *mData;
};
The operator= implementation here is broken because it doesn't manage the reference counts on mData and other.mData: it does not decrement the reference count on mData, leading to a leak; and it does not increment the reference count on other.mData, leading to a possible memory fault down the road because the object being pointed to could be deleted before all the actual references are gone.
Note that if you do not explicitly declare your own operator= for your classes, the compiler will provide a default implementation which has behavior identical to the implementation shown here -- that is, completely broken for this particular case.
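For contrast, a corrected assignment operator would adjust both counts, taking the new reference before releasing the old one so that self-assignment stays safe (a sketch against the same hypothetical incrRefCount()/decrRefCount() interface):
RefCountedPtr<T>& operator=(const RefCountedPtr<T>& other) {
    other.mData->incrRefCount();   // take a reference to the new target first
    mData->decrRefCount();         // then release the old target
    mData = other.mData;
    return *this;
}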
So as the article says -- in some cases you must overload operators, and you must be careful to handle all situations correctly.
EDIT: Sorry, I didn't realize that the reference was an online article, rather than a book. Even after reading the full article it's not clear what was intended, but I think Eckel was probably referring to situations like the second one I described above.
new and delete are actually operators in C++ which you can override to provide your own customized memory management. Take a look at the example here.
Operators are functions. Just because they add syntactic sugar does not mean you need not be careful with memory. You must manage memory as you would with any other member/global/friend function.
This is particularly important when you overload operators for a wrapper pointer class.
Then there is string concatenation, done by overloading operator+ or operator+=. Take a look at the basic_string template for more information.
If you are comparing operator overloading between Java and C++, you wouldn't be talking about new and delete - Java doesn't expose enough memory management detail for new, and doesn't need delete.
You can't overload the other operators for pointer types - at least one argument must be a class or enumerated type, so he can't be talking about providing different operators for pointers.
So operators in C++ operate on values or const references to values.
It would be very unusual for operators which operate on values or const references to values to return anything other than a value.
Apart from obvious errors common to all C++ functions - returning a reference to a stack allocated object (which is the opposite of a memory leak), or returning a reference to an object created with new rather than a value (which is usually done no more than once in a career before being learnt), it would be hard to come up with a scenario where the common operators have memory issues.
So there isn't any need to create multiple versions depending on whether the operands are stack or heap allocated based on normal patterns of use.
The arguments to an operator are objects passed as either values or references. There is no portable mechanism in C++ to test whether an object was allocated on the heap or on the stack. If the object was passed by value, it will always be on the stack. So if there were a requirement to change the behaviour of the operators for the two cases, it would not be possible to do so portably in C++. (You could, on many OSes, test whether the pointer to the object is in the space normally used for the stack or the space normally used for the heap, but that is neither portable nor entirely reliable. Also, even if you could have operators which took two pointers as arguments, there's no reason to believe that the objects are heap allocated just because they are pointers; that information simply doesn't exist in C++.)
The only duplication you get is for cases such as operator[] where the same operator is used as both an accessor and a mutator. Then it is normal to have a const and a non-const version, so you can set the value if the receiver is not const. That is a good thing - not being able to mutate (the publicly accessible state of) objects which have been marked constant.
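That accessor/mutator pair typically looks like this (a small sketch with a made-up container):
#include <cstddef>
#include <vector>

class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : data(n) {}

    int& operator[](std::size_t i)             { return data[i]; }  // mutator: usable on non-const buffers
    const int& operator[](std::size_t i) const { return data[i]; }  // accessor: the only one callable on const buffers

private:
    std::vector<int> data;
};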