Why is [] used in delete (delete[]) to free a dynamically allocated array? - c++

I know that delete[] destroys all the array elements and then releases the memory.
I initially thought the compiler only needs it so that it calls the destructor for every element in the array, but I also have a counter-argument for that, which is:
The heap allocator must already know how many bytes were allocated, and with sizeof(Type) it should be possible to compute the number of elements and call the appropriate number of destructors for the array, preventing resource leaks.
Is my assumption correct or not? Please clear up my doubt:
why is the [] needed in delete[]?

Scott Meyers says in his Effective C++ book: Item 5: Use the same form in corresponding uses of new and delete.
The big question for delete is this: how many objects reside in the memory being deleted? The answer to that determines how many destructors must be called.
Does the pointer being deleted point to a single object or to an array of objects? The only way for delete to know is for you to tell it. If you don't use brackets in your use of delete, delete assumes a single object is pointed to.
Also, the memory allocator might allocate more space than required to store your objects, and in that case dividing the size of the returned memory block by the size of each object won't work.
Depending on the platform, the _msize (Windows), malloc_usable_size (Linux) or malloc_size (OS X) function will tell you the real length of the block that was allocated. This information can be exploited when designing growing containers.
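For example, on a glibc-based Linux system (an assumption; on other platforms the equivalent calls are the ones named above), a quick check of the real block size might look like this:
// Rough illustration only: malloc_usable_size is a non-standard glibc extension.
#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // malloc_usable_size

int main() {
    void* p = std::malloc(40);
    if (p) {
        // Typically prints a value >= 40, e.g. rounded up to an allocator bucket size.
        std::printf("requested 40 bytes, usable size: %zu\n", malloc_usable_size(p));
        std::free(p);
    }
}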
Another reason why it won't work is that Foo* foo = new Foo[10]; calls operator new[] to allocate the memory, and delete [] foo; then calls operator delete[] to deallocate it. Since those operators can be overloaded, you have to adhere to the convention; otherwise delete foo; calls operator delete, which may have an implementation incompatible with operator delete []. It's a matter of semantics, not just of keeping track of the number of allocated objects in order to later issue the right number of destructor calls.
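To make the routing concrete, here is a minimal sketch (the class Foo and its logging are made up for illustration) showing that the single-object and array forms call different allocation and deallocation functions:
#include <cstdio>
#include <cstdlib>
#include <new>

struct Foo {
    // Class-specific allocation functions; error handling omitted for brevity.
    static void* operator new(std::size_t n)         { std::puts("operator new");      return std::malloc(n); }
    static void* operator new[](std::size_t n)       { std::puts("operator new[]");    return std::malloc(n); }
    static void  operator delete(void* p) noexcept   { std::puts("operator delete");   std::free(p); }
    static void  operator delete[](void* p) noexcept { std::puts("operator delete[]"); std::free(p); }
};

int main() {
    Foo* one = new Foo;       // Foo::operator new
    delete one;               // Foo::operator delete

    Foo* many = new Foo[10];  // Foo::operator new[]
    delete[] many;            // Foo::operator delete[]
}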
See also:
[16.14] After p = new Fred[n], how does the compiler know there are n objects to be destructed during delete[] p?
Short answer: Magic.
Long answer: The run-time system stores the number of objects, n, somewhere where it can be retrieved if you only know the pointer, p. There are two popular techniques that do this. Both these techniques are in use by commercial-grade compilers, both have tradeoffs, and neither is perfect. These techniques are:
Over-allocate the array and put n just to the left of the first Fred object.
Use an associative array with p as the key and n as the value.
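The first technique can be imitated by hand. The sketch below (hypothetical create_array/destroy_array helpers; real compilers differ in layout, alignment handling and error checking) shows the idea of a count stored just before the first element:
#include <cstdio>
#include <cstdlib>
#include <new>

struct Fred {
    ~Fred() { std::puts("~Fred"); }
};

Fred* create_array(std::size_t n) {
    // Over-allocate: one std::size_t "cookie" followed by n Fred objects (error checking omitted).
    void* raw = std::malloc(sizeof(std::size_t) + n * sizeof(Fred));
    *static_cast<std::size_t*>(raw) = n;
    Fred* first = reinterpret_cast<Fred*>(static_cast<std::size_t*>(raw) + 1);
    for (std::size_t i = 0; i < n; ++i) new (first + i) Fred;   // construct each element
    return first;
}

void destroy_array(Fred* p) {
    std::size_t* cookie = reinterpret_cast<std::size_t*>(p) - 1;
    std::size_t n = *cookie;                            // recover the element count
    for (std::size_t i = n; i-- > 0; ) p[i].~Fred();    // destroy in reverse order
    std::free(cookie);                                  // free the whole block, cookie included
}

int main() {
    Fred* a = create_array(3);
    destroy_array(a);   // prints "~Fred" three times
}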
EDIT: after having read @AndreyT's comments, I dug into my copy of Stroustrup's "The Design and Evolution of C++" and excerpted the following:
How do we ensure that an array is correctly deleted? In particular, how do we ensure that the destructor is called for all elements of an array?
...
Plain delete isn't required to handle both individual objects and arrays. This avoids complicating the common case of allocating and deallocating individual objects. It also avoids encumbering individual objects with the information necessary for array deallocation.
An intermediate version of delete[] required the programmer to specify the number of elements of the array.
...
That proved too error prone, so the burden of keeping track of the number of elements was placed on the implementation instead.
As @Marcus mentioned, the rationale may have been "you don't pay for what you don't use".
EDIT2:
In "The C++ Programming Language, 3rd edition", §10.4.7, Bjarne Stroustrup writes:
Exactly how arrays and individual objects are allocated is implementation-dependent. Therefore, different implementations will react differently to incorrect uses of the delete and delete[] operators. In simple and uninteresting cases like the previous one, a compiler can detect the problem, but generally something nasty will happen at run time.
The special destruction operator for arrays, delete[], isn’t logically necessary. However, suppose the implementation of the free store had been required to hold sufficient information for every object to tell if it was an individual or an array. The user could have been relieved of a burden, but that obligation would have imposed significant time and space overheads on some C++ implementations.

The main reason why it was decided to keep separate delete and delete[] is that these two entities are not as similar as it might seem at the first sight. For a naive observer they might seem to be almost the same: just destruct and deallocate, with the only difference in the potential number of objects to process. In reality, the difference is much more significant.
The most important difference between the two is that delete might perform polymorphic deletion of objects, i.e. the static type of the object in question might be different from its dynamic type. delete[] on the other hand must deal with strictly non-polymorphic deletion of arrays. So, internally these two entities implement logic that is significantly different and non-intersecting between the two. Because of the possibility of polymorphic deletion, the functionality of delete is not even remotely the same as the functionality of delete[] on an array of 1 element, as a naive observer might incorrectly assume initially.
Contrary to the strange claims made in some other answers, it is, of course, perfectly possible to replace delete and delete[] with a single construct that would branch at a very early stage, i.e. determine the type of the memory block (array or not) using the housekeeping information stored by new/new[], and then jump to the appropriate functionality, equivalent to either delete or delete[]. However, this would be a rather poor design decision, since, once again, the functionality of the two is too different. Forcing both into a single construct would be akin to creating a Swiss Army Knife of a deallocation function. Also, in order to be able to tell an array from a non-array, we'd have to introduce an additional piece of housekeeping information even into single-object memory allocations done with plain new. This might easily result in notable memory overhead in single-object allocations.
But, once again, the main reason here is the functional difference between delete and delete[]. These language entities possess only apparent skin-deep similarity that exists only at the level of naive specification ("destruct and free memory"), but once one gets to understand in detail what these entities really have to do one realizes that they are too different to be merged into one.
P.S. This is BTW one of the problems with the suggestion about sizeof(type) you made in the question. Because of the potentially polymorphic nature of delete, you don't know the type in delete, which is why you can't obtain any sizeof(type). There are more problems with this idea, but that one is already enough to explain why it won't fly.

The heap itself knows the size of an allocated block - you only need the address. Look at how free() works - you pass only the address, and it frees the memory.
The difference between delete (delete[]) and free() is that the former two first call the destructors and then free the memory (possibly using free()). The problem is that delete[] also takes only one argument - the address - and with only that address it needs to know the number of objects to run destructors on. So new[] uses some implementation-defined way of recording the number of elements somewhere - usually it prepends the array with that count. Now delete[] relies on that implementation-specific data to run the destructors and then free the memory (again, using only the block address).

delete[] just calls a different implementation (function);
There's no reason an allocator couldn't track it (in fact, it would be easy enough to write your own).
I don't know the reason they did not handle it automatically, nor the history of the implementation. If I were to guess, many of these 'well, why wasn't this slightly simpler?' questions (in C++) come down to one or more of:
compatibility with C
performance
In this case, performance. Using delete vs delete[] is easy enough; I believe it could all be abstracted away from the programmer and still be reasonably fast (for general use). delete[] requires only a few additional function calls and operations (omitting the destructor calls), but that cost would be paid on every call to delete, and it is unnecessary because the programmer generally knows the type he or she is dealing with (if not, there's likely a bigger problem at hand). So it simply avoids going through the allocator. Additionally, single allocations need not be tracked by the allocator in as much detail; treating every allocation as an array would require an extra count entry even for trivial allocations. So it amounts to several small simplifications of the allocator implementation, which actually matter to many people, considering that this is a very low-level domain.

This is more complicated.
The keyword and the convention of using it to delete an array were invented for the convenience of implementations, and some implementations do use it (I don't know which, though; MS VC++ does not).
The convenience is this:
In all other cases, you know the exact size to be freed by other means. When you delete a single object, you can have the size from compile-time sizeof(). When you delete a polymorphic object by base pointer and you have a virtual destructor, you can have the size as a separate entry in vtbl. If you delete an array, how would you know the size of memory to be freed, unless you track it separately?
The special syntax would allow tracking such size only for an array - for instance, by putting it before the address that is returned to the user. This takes up additional resources and is not needed for non-arrays.
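That per-object size lookup can be observed with a class-scope sized operator delete. The Base/Derived pair below is a made-up sketch; the point is only that the compiler can hand delete the correct size (via sizeof or the virtual-destructor machinery) without any per-allocation bookkeeping:
#include <cstdio>
#include <cstdlib>

struct Base {
    virtual ~Base() = default;
    static void* operator new(std::size_t sz) { return std::malloc(sz); }   // error handling omitted
    // Sized deallocation: the compiler passes the size of the most derived object.
    static void operator delete(void* p, std::size_t sz) {
        std::printf("deleting %zu bytes\n", sz);
        std::free(p);
    }
};

struct Derived : Base {
    char payload[100];   // makes Derived larger than Base
};

int main() {
    Base* p = new Derived;
    delete p;   // prints sizeof(Derived), not sizeof(Base), thanks to the virtual destructor
}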

Related

Why does the delete[] syntax exist in C++?

Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background what I don't understand is why there needs to be a different invocation at all.
With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.
There is no free_array() function. I've seen some crazy theories in other, tangentially related questions, such as that calling delete ptr will only free the top of the array, not the whole array; or, more correctly, that the behaviour is implementation-defined. And sure... if this were the first version of C++ and you had made a weird design choice, that would make sense. But why, with $PRESENT_YEAR's standard of C++, has it not been overloaded?
It seems the only extra bit C++ adds is walking the array and calling destructors, and I think maybe that is the crux of it: it literally uses a separate function to save us a single runtime length lookup (or a nullptr at the end of the list), in exchange for torturing every new C++ programmer, or any programmer who had a fuzzy day and forgot that there is a different reserved word.
Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?
Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructors of each element of the array are called. But doing this has unspecified overhead, while delete does not. This is why there are two forms of delete expressions. One for arrays, which pays the overhead and one for single objects which does not.
In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.
Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.
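For instance, the usual replacements look roughly like this (a minimal sketch):
#include <memory>
#include <string>
#include <vector>

int main() {
    auto one  = std::make_unique<std::string>("single object");   // instead of new/delete
    auto some = std::make_shared<std::string>("shared object");   // shared ownership
    std::vector<int> many(10);                                     // instead of new int[10]/delete[]
    // No delete or delete[] anywhere: everything is released automatically at end of scope.
}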
Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.
To elaborate on the unspecified overhead that François Andrieux's answer mentions, you can see my answer on this question, in which I examined what a specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do a similar thing.
In case new[] was used with an array of objects with a non-trivial destructor, what it did was allocate 4 more bytes and return the pointer shifted 4 bytes ahead. So when delete[] wants to know how many objects there are, it takes the pointer, shifts it back 4 bytes, and treats the number stored at that address as the number of objects. It then calls a destructor on each object (the size of each object is known from the type of the pointer passed). Then, in order to release the exact address, it passes the address that was 4 bytes before the passed address.
On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!
Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.
They have primitive C semantics rather than C++ semantics, and therefore lack the C++ compiler and runtime support that would let you, or the compiler and runtime system, do useful things with pointers to them.
In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.
This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.
And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, because of a) being able to write useful/interesting examples early on that do in fact call OS APIs, and of course because of the awesome power of b) "that's the way we've always done it".
Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime. In particular in this case the C memory manager.
The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.
Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)
C++ also has an attitude of "you only pay for what you use"; storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with that attitude.
Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.
The cover story is that delete is required because of C++'s relationship with C.
The new operator can make a dynamically allocated object of almost any object type.
But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:
being the location of a single object, and
being the base of a dynamic array.
The delete versus delete[] situation just follows from that.
However, that does not quite ring true, because, in spite of the above observations being correct, a single delete operator could have been used. It does not logically follow that two operators are required.
Here is informal proof. The new T operator invocation (single object case) could implicitly behave as if it were new T[1]. So that is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then, there would just have to exist a single delete which behaves like today's delete[].
Why isn't that design followed?
I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.
It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.
An example might help. When you allocate a C-style array of objects, those objects may have their own destructors that need to be called. Plain delete does not call them for every element; it works on a single object (including a container object), but not on a C-style array. You need delete[] for that.
Here is an example:
#include <iostream>
#include <stdlib.h>
#include <string>
using std::cerr;
using std::cout;
using std::endl;
class silly_string : private std::string {
public:
    silly_string(const char* const s) :
        std::string(s) {}
    ~silly_string() {
        cout.flush();
        cerr << "Deleting \"" << *this << "\"."
             << endl;
        // The destructor of the base class is now implicitly invoked.
    }
    friend std::ostream& operator<< ( std::ostream&, const silly_string& );
};

std::ostream& operator<< ( std::ostream& out, const silly_string& s )
{
    return out << static_cast<const std::string>(s);
}

int main()
{
    constexpr size_t nwords = 2;
    silly_string *const words = new silly_string[nwords]{
        "hello,",
        "world!" };
    cout << words[0] << ' '
         << words[1] << '\n';
    delete[] words;
    return EXIT_SUCCESS;
}
That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.
Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you force it to compile the buggy code anyway, you get heap corruption.
delete is an operator that destroys array and non-array (pointer) objects created by a new expression.
It is used either as the delete operator or as the delete[] operator.
The new operator performs dynamic memory allocation, placing objects on heap memory.
Correspondingly, the delete operator deallocates memory from the heap.
The pointer to the object is not destroyed; the value or memory block pointed to by the pointer is.
The delete operator has a void return type, so it does not return a value.

Which order should allocated memory blocks be freed in?

Given something like:
char *a = malloc(m);
char *b = malloc(n);
char *c = malloc(o);
Once you have finished with the blocks of memory, this is correct:
free(a);
free(b);
free(c);
And so is this:
free(c);
free(b);
free(a);
Yet one or other order must be used, and it is more satisfying to have a basis for choosing one.
Is there any existing implementation on which it makes any difference to memory fragmentation, or to the speed at which the memory is freed and subsequent allocation requests satisfied? If so, which order is more efficient? Or is there any other basis for choosing an order?
Without any known relationship between these blocks of memory, there is no basis for designating which deallocation order is correct. Just as there is no basis for deciding which allocation order is correct.
This is not a question that can be answered a priori. The only reason not to deallocate memory is if you're still using it, and one piece of memory could be "using" (ie: storing a pointer into) another. So without knowing anything about what is going on inside of that memory, there is no reason to pick any particular deallocation order.
Normally this should be irrelevant for most use cases. However, I once had a project which was allocating an extremely high number of objects, and there it made a noticeable performance difference. In that case, the second variant (from back to front) was much faster than deallocating in the same order as the allocations were made.
In our scenario it took almost 4 minutes to deallocate from first to last, while it took only about a minute in the other direction.
But as always with performance, whether a change makes any difference should be measured on the target system.
All else equal, releasing resources is better done in the reverse order of acquisition. This "feels" more natural, and may actually be more efficient in some cases (including memory de/allocations) where the symmetry can help the manager of said resources reclaim them more orderly.
It is the same reason why for example C++ constructs elements of an array from beginning to end, but destructs them in reverse order (class.init.general/3):
When an array of class objects is initialized (either explicitly or implicitly) and the elements are initialized by constructor, the constructor shall be called for each element of the array, following the subscript order; see dcl.array.
[Note 1: Destructors for the array elements are called in reverse order of their construction. — end note]
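A tiny demonstration of that rule (the Tracer type is made up for illustration):
#include <cstdio>

struct Tracer {
    static int next_id;
    int id;
    Tracer()  : id(next_id++) { std::printf("construct %d\n", id); }
    ~Tracer()                 { std::printf("destruct  %d\n", id); }
};
int Tracer::next_id = 0;

int main() {
    Tracer elems[3];
    // Prints: construct 0, 1, 2 and then, at end of scope, destruct 2, 1, 0.
}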
I like to think of it like this, with scoping ifs
char *a = malloc(m);
if (a) {
    char *b = malloc(n);
    if (b) {
        char *c = malloc(o);
        if (c) {
            //...
            free(c);
        }
        free(b);
    }
    free(a);
}
And now, remove the braces and ifs ... and you get your 2nd option.
This is a low-level detail, subject to change when your build environment upgrades. As such, I would look to general principles instead of specific implementations. In this case, I would add a hypothesis to the general principles.
Hypothesis:
If the order of freeing memory makes a difference for performance, the build environment would prefer better performance for the general tendency of the language.
This is based on an assumption that the build environment values producing fast programs. Anyone who wants to dispute that should disregard this answer.
By "general tendency of the language", I am referring to the idea that the last thing constructed tends to be the first thing destroyed. For example, a derived class is constructed after its base is, but the derived class is destroyed before its base. Even within a class, members are destroyed in the reverse order of construction. Last constructed, first destroyed. If the members own dynamically allocated memory, then last allocated, first released (more or less).
This tendency does not guarantee that memory will be released in reverse order from acquisition. In fact, it's not hard to come up with counter-examples. However, it does mean that the reverse order seems more likely than any other order. There are very few guarantees when accounting for all the crazy things a programmer might do, but it seems reasonable to me that free would be optimized either in a symmetric manner (order does not matter) or geared towards this reverse order, if possible.
This is not a proof, nor is it an absolute rule. This is just playing the odds. If the above seems reasonable, I would suggest going with the tendency of the language initially. (Actually, I would suggest that memory be released in a destructor, which forces the order specified by the language.) If profiling indicates that this is a problem area, then more attention would be warranted. Until then, in the question's scenario, free(c) first to improve your odds.

Program with realloc() works just fine but at the end it returns error -1073741819. Can someone please find my mistake in this code? [duplicate]

From what is written here, new allocates in the free store while malloc uses the heap, and the two terms often mean the same thing.
From what is written here, realloc may move the memory block to a new location. If the free store and the heap are two different memory spaces, is that a problem?
Specifically I'd like to know if it is safe to use
int* data = new int[3];
// ...
int* mydata = (int*)realloc(data,6*sizeof(int));
If not, is there any other way to realloc memory allocated with new safely? I could allocate new area and memcpy the contents, but from what I understand realloc may use the same area if possible.
You can only realloc that which has been allocated via malloc (or family, like calloc).
That's because the underlying data structures that keep track of free and used areas of memory, can be quite different.
It's likely but by no means guaranteed that C++ new and C malloc use the same underlying allocator, in which case realloc could work for both. But formally that's in UB-land. And in practice it's just needlessly risky.
C++ does not offer functionality corresponding to realloc.
The closest is the automatic reallocation of (the internal buffers of) containers like std::vector.
The C++ containers suffer from being designed in a way that excludes use of realloc.
Instead of the presented code
int* data = new int[3];
//...
int* mydata = (int*)realloc(data,6*sizeof(int));
… do this:
vector<int> data( 3 );
//...
data.resize( 6 );
However, if you absolutely need the general efficiency of realloc, and if you have to accept new for the original allocation, then your only recourse is to rely on compiler-specific means, i.e. knowledge that realloc happens to be safe with your particular compiler.
Otherwise, if you absolutely need the general efficiency of realloc but are not forced to accept new, you can use malloc and realloc. Using smart pointers then lets you get much of the same safety as with the C++ containers.
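A rough sketch of that malloc/realloc-plus-smart-pointer approach could look like this (FreeDeleter, make_buffer and grow are illustrative names, not library facilities):
#include <cstdlib>
#include <memory>

struct FreeDeleter {
    void operator()(int* p) const { std::free(p); }
};
using IntBuffer = std::unique_ptr<int[], FreeDeleter>;

IntBuffer make_buffer(std::size_t n) {
    return IntBuffer(static_cast<int*>(std::malloc(n * sizeof(int))));
}

bool grow(IntBuffer& buf, std::size_t new_n) {
    int* p = static_cast<int*>(std::realloc(buf.get(), new_n * sizeof(int)));
    if (!p) return false;   // realloc failed; the old block is still valid and owned by buf
    buf.release();          // the old pointer was consumed by realloc; don't free it again
    buf.reset(p);           // take ownership of the (possibly relocated) block
    return true;
}

int main() {
    IntBuffer data = make_buffer(3);
    grow(data, 6);          // legal: the memory came from malloc, not new[]
}                           // FreeDeleter calls free() automatically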
The only possibly relevant restriction C++ adds to realloc is that C++'s malloc/calloc/realloc must not be implemented in terms of ::operator new, and its free must not be implemented in terms of ::operator delete (per C++14 [c.malloc]p3-4).
This means the guarantee you are looking for does not exist in C++. It also means, however, that you can implement ::operator new in terms of malloc. And if you do that, then in theory, ::operator new's result can be passed to realloc.
In practice, you should be concerned about the possibility that new's result does not match ::operator new's result. C++ compilers may e.g. combine multiple new expressions to use one single ::operator new call. This is something compilers already did when the standard didn't allow it, IIRC, and the standard now does allow it (per C++14 [expr.new]p10). That means that even if you go this route, you still don't have a guarantee that passing your new pointers to realloc does anything meaningful, even if it's no longer undefined behaviour.
In general, don't do that. If you are using user-defined types with non-trivial initialization, then in the reallocate-copy-free case the destructors of your objects won't be called by realloc, and the copy constructors won't be called for the copy either. This may lead to undefined behavior due to incorrect use of object lifetime (see C++ Standard §3.8 Object lifetime, [basic.life]).
1 The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —end note ]
The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
And later (emphasis mine):
3 The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime.
So, you really don't want to use an object out of its lifetime.
It is not safe, and it's not elegant.
It might be possible to override new/delete to support reallocation, but then you may as well just use the containers.
In general, no.
There are a slew of things which must hold to make it safe:
Bitwise copying the type and abandoning the source must be safe.
The destructor must be trivial, or you must in-place-destruct the elements you want to deallocate.
Either the constructor is trivial, or you must in-place-construct the new elements.
Trivial types satisfy the above requirements.
In addition:
The new[]-function must pass the request on to malloc without any change, nor do any bookkeeping on the side. You can force this by replacing global new[] and delete[], or the ones in the respective classes.
The compiler must not ask for more memory in order to save the number of elements allocated, or anything else.
There is no way to force that, though a compiler shouldn't save such information if the type has a trivial destructor as a matter of Quality of Implementation.
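One way to encode those requirements is a small wrapper that refuses to compile for unsuitable types. checked_realloc below is an illustrative name, not a standard function:
#include <cstdlib>
#include <type_traits>

template <typename T>
T* checked_realloc(T* p, std::size_t new_count) {
    // Bitwise relocation must be safe and no destructors may be owed.
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_destructible<T>::value,
                  "checked_realloc requires a trivially copyable, trivially destructible type");
    return static_cast<T*>(std::realloc(p, new_count * sizeof(T)));
}

int main() {
    // Only valid because the original block came from malloc, not from new[].
    int* data = static_cast<int*>(std::malloc(3 * sizeof(int)));
    int* grown = checked_realloc(data, 6);
    if (grown) data = grown;   // on failure the old block is still valid
    std::free(data);
}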
Yes - if new actually called malloc in the first place (for example, this is how VC++ new works).
No otherwise. Do note that once you decide to reallocate the memory (because new called malloc), your code is compiler-specific and no longer portable between compilers.
(I know this answer may upset many developers, but the answer depends on real facts, not just on what is idiomatic.)
That is not safe. Firstly the pointer you pass to realloc must have been obtained from malloc or realloc: http://en.cppreference.com/w/cpp/memory/c/realloc.
Secondly the result of new int [3] need not be the same as the result of the allocation function - extra space may be allocated to store the count of elements.
(And for more complex types than int, realloc wouldn't be safe since it doesn't call copy or move constructors.)
You may be able to (not in all cases), but you shouldn't. If you need to resize your data table, you should use std::vector instead.
Details on how to use it are given in another SO question.
These functions are mostly used in C.
memset sets the bytes in a block of memory to a specific value.
malloc allocates a block of memory.
calloc does the same as malloc; the only difference is that it initializes the bytes to zero.
In C++ the preferred method to allocate memory is to use new.
C: int* intArray = (int*) malloc(10 * sizeof(int));
C++: int* intArray = new int[10];
C: int* intArray = (int*) calloc(10, sizeof(int));
C++: int* intArray = new int[10]();
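Put side by side, the two styles might look like this (a minimal sketch with error checks omitted; in real C++ code a std::vector would normally replace all of these):
#include <cstdlib>
#include <cstring>

int main() {
    // C style: allocate, then zero explicitly, or let calloc zero for you.
    int* a = (int*)std::malloc(10 * sizeof(int));
    std::memset(a, 0, 10 * sizeof(int));
    int* b = (int*)std::calloc(10, sizeof(int));   // already zero-initialized

    // C++ style: value-initialization with () zeroes the elements.
    int* c = new int[10]();

    std::free(a);
    std::free(b);
    delete[] c;
}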

C++ Size Of Dynamic Memory at Runtime

This is something I've been wondering for a while and never found an answer for:
Why is it that when you allocate something on the heap you cannot determine the size of it from just the pointer, yet you can delete it using just the pointer and somehow C++ knows how many bytes to free?
Does this have something to do with the way it is stored on the heap?
Is this information there but not exposed by C++?
And perhaps this should be a separate question but I think it's pretty related so I'll ask it here:
Why is it that a dynamic array of elements must be deleted using delete[] as opposed to just the simple delete command? Why does C++ need this additional information to correctly free all the memory?
When an allocation is made, a small section of memory immediately before [or, technically, somewhere completely different, but just before is the most common scenario] will store the size of the allocation, and in the case of new [] also store the number of allocated objects.
Note that the C++ standard doesn't give any way to retrieve this information for a reason: It may not accurately describe what is allocated, for example the size of an array may very well be rounded up to some "nice" boundary [almost all modern allocators round to 16 bytes at the very least, so that the memory is usable for SSE and other similar SIMD implementations on other processor architectures]. So if you allocated 40 bytes, it would report back 48, which isn't what you asked for, so it would be rather confusing. And of course, there is no guarantee that the information is stored at ALL - it may be implied by some other information that is stored in the "admin" block of the allocation.
And of course, you can use placement new, in which case there is no admin block, and the allocation is not deleted in the normal fashion - some arbitrary code wouldn't be able to tell the difference.
delete differs from delete [] in that delete [] will know how many objects have been allocated, and call the destructor for all of those objects. It is also possible [or even likely] that new [] stores the number of elements in a way that means that calling delete [] on something that wasn't created with new [] will go horribly wrong.
And as Zan Lynx commented, if there is no destructor for the objects (e.g. when you are allocating data for int or struct { int x; double y; }, etc. - including classes without a user-declared destructor; note, however, that if the class contains a member of class type with a non-trivial destructor, the compiler will generate a destructor for you), then there is no need to store the count or do anything else, so the compiler CAN, if it wishes, optimise this sort of allocation into regular new and delete.

Why should C++ programmers minimize use of 'new'?

I stumbled upon Stack Overflow question Memory leak with std::string when using std::list<std::string>, and one of the comments says this:
Stop using new so much. I can't see any reason you used new anywhere you did. You can create objects by value in C++ and it's one of the huge advantages to using the language. You do not have to allocate everything on the heap. Stop thinking like a Java programmer.
I'm not really sure what he means by that.
Why should objects be created by value in C++ as often as possible, and what difference does it make internally? Did I misinterpret the answer?
There are two widely-used memory allocation techniques: automatic allocation and dynamic allocation. Commonly, there is a corresponding region of memory for each: the stack and the heap.
Stack
The stack always allocates memory in a sequential fashion. It can do so because it requires you to release the memory in the reverse order (First-In, Last-Out: FILO). This is the memory allocation technique for local variables in many programming languages. It is very, very fast because it requires minimal bookkeeping and the next address to allocate is implicit.
In C++, this is called automatic storage because the storage is claimed automatically at the end of the scope. As soon as execution of the current code block (delimited by {}) is completed, the memory for all variables in that block is automatically reclaimed. This is also the moment when destructors are invoked to clean up resources.
Heap
The heap allows for a more flexible memory allocation mode. Bookkeeping is more complex and allocation is slower. Because there is no implicit release point, you must release the memory manually, using delete or delete[] (free in C). However, the absence of an implicit release point is the key to the heap's flexibility.
Reasons to use dynamic allocation
Even if using the heap is slower and potentially leads to memory leaks or memory fragmentation, there are perfectly good use cases for dynamic allocation, as it's less limited.
Two key reasons to use dynamic allocation:
You don't know how much memory you need at compile time. For instance, when reading a text file into a string, you usually don't know what size the file has, so you can't decide how much memory to allocate until you run the program.
You want to allocate memory which will persist after leaving the current block. For instance, you may want to write a function string readfile(string path) that returns the contents of a file; a sketch of such a function follows below. In this case, even if the stack could hold the entire file contents, you could not return from the function and keep the allocated memory block.
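Such a readfile might be sketched like this (one possible implementation; the std::string return value carries its heap buffer out of the function for you):
#include <fstream>
#include <sstream>
#include <string>

std::string readfile(const std::string& path) {
    std::ifstream in(path);
    std::ostringstream contents;
    contents << in.rdbuf();    // the file size need not be known up front
    return contents.str();     // the buffer outlives this function's stack frame
}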
Why dynamic allocation is often unnecessary
In C++ there's a neat construct called a destructor. This mechanism allows you to manage resources by aligning the lifetime of the resource with the lifetime of a variable. This technique is called RAII and is the distinguishing point of C++. It "wraps" resources into objects. std::string is a perfect example. This snippet:
int main ( int argc, char* argv[] )
{
    std::string program(argv[0]);
}
actually allocates a variable amount of memory. The std::string object allocates memory using the heap and releases it in its destructor. In this case, you did not need to manually manage any resources and still got the benefits of dynamic memory allocation.
In particular, it implies that in this snippet:
int main ( int argc, char* argv[] )
{
    std::string * program = new std::string(argv[0]);  // Bad!
    delete program;
}
there is unneeded dynamic memory allocation. The program requires more typing (!) and introduces the risk of forgetting to deallocate the memory. It does this with no apparent benefit.
Why you should use automatic storage as often as possible
Basically, the last paragraph sums it up. Using automatic storage as often as possible makes your programs:
faster to type;
faster when run;
less prone to memory/resource leaks.
Bonus points
In the referenced question, there are additional concerns. In particular, the following class:
class Line {
public:
    Line();
    ~Line();
    std::string* mString;
};

Line::Line() {
    mString = new std::string("foo_bar");
}

Line::~Line() {
    delete mString;
}
Is actually a lot more risky to use than the following one:
class Line {
public:
    Line();
    std::string mString;
};

Line::Line() {
    mString = "foo_bar";
    // note: there is a cleaner way to write this.
}
The reason is that std::string properly defines a copy constructor. Consider the following program:
int main ()
{
    Line l1;
    Line l2 = l1;
}
Using the original version, this program will likely crash, as it uses delete on the same string twice. Using the modified version, each Line instance will own its own string instance, each with its own memory and both will be released at the end of the program.
Other notes
Extensive use of RAII is considered a best practice in C++ because of all the reasons above. However, there is an additional benefit which is not immediately obvious. Basically, it's better than the sum of its parts. The whole mechanism composes. It scales.
If you use the Line class as a building block:
class Table
{
    Line borders[4];
};
Then
int main ()
{
    Table table;
}
allocates four std::string instances, four Line instances, one Table instance and all the string's contents and everything is freed automagically.
Because the stack is faster and leak-proof
In C++, it takes but a single instruction to allocate space—on the stack—for every local scope object in a given function, and it's impossible to leak any of that memory. That comment intended (or should have intended) to say something like "use the stack and not the heap".
The reason why is complicated.
First, C++ is not garbage collected. Therefore, for every new, there must be a corresponding delete. If you fail to put this delete in, then you have a memory leak. Now, for a simple case like this:
std::string *someString = new std::string(...);
//Do stuff
delete someString;
This is simple. But what happens if "Do stuff" throws an exception? Oops: memory leak. What happens if "Do stuff" issues return early? Oops: memory leak.
And this is for the simplest case. If you happen to return that string to someone, now they have to delete it. And if they pass it as an argument, does the person receiving it need to delete it? When should they delete it?
Or, you can just do this:
std::string someString(...);
//Do stuff
No delete. The object was created on the "stack", and it will be destroyed once it goes out of scope. You can even return the object, thus transferring its contents to the calling function. You can pass the object to functions (typically as a reference or const reference: void SomeFunc(std::string &iCanModifyThis, const std::string &iCantModifyThis)). And so forth.
All without new and delete. There's no question of who owns the memory or who's responsible for deleting it. If you do:
std::string someString(...);
std::string otherString;
otherString = someString;
It is understood that otherString has a copy of the data of someString. It isn't a pointer; it is a separate object. They may happen to have the same contents, but you can change one without affecting the other:
someString += "More text.";
if(otherString == someString) { /*Will never get here */ }
See the idea?
Objects created by new must be eventually deleted lest they leak. The destructor won't be called, memory won't be freed, the whole bit. Since C++ has no garbage collection, it's a problem.
Objects created by value (i. e. on stack) automatically die when they go out of scope. The destructor call is inserted by the compiler, and the memory is auto-freed upon function return.
Smart pointers like unique_ptr, shared_ptr solve the dangling reference problem, but they require coding discipline and have other potential issues (copyability, reference loops, etc.).
Also, in heavily multithreaded scenarios, new is a point of contention between threads; there can be a performance impact for overusing new. Stack object creation is by definition thread-local, since each thread has its own stack.
The downside of value objects is that they die once the host function returns - you cannot pass a reference to those back to the caller, only by copying, returning or moving by value.
C++ doesn't employ any memory manager of its own. Other languages like C# and Java have a garbage collector to handle the memory.
C++ implementations typically use operating system routines to allocate the memory, and too much new/delete can fragment the available memory.
With any application, if memory is used frequently it's advisable to preallocate it and release it when it is no longer required.
Improper memory management can lead to memory leaks, and those are really hard to track down. So using stack objects within the scope of a function is a proven technique.
The downside of using stack objects is that it creates multiple copies of objects when returning them, passing them to functions, etc. However, smart compilers are well aware of these situations and optimize them well for performance.
It's really tedious in C++ if memory is allocated and released in two different places. The responsibility for the release is always a question, and mostly we rely on some commonly accessible pointers, stack objects (as much as possible) and techniques like auto_ptr (RAII objects).
The best thing is that you have control over the memory, and the worst thing is that you will have no control over it if the application employs improper memory management. The crashes caused by memory corruption are the nastiest and hardest to trace.
I see that a few important reasons for doing as few new's as possible are missed:
Operator new has a non-deterministic execution time
Calling new may or may not cause the OS to allocate a new physical page to your process. This can be quite slow if you do it often. Or it may already have a suitable memory location ready; we don't know. If your program needs to have consistent and predictable execution time (like in a real-time system or game/physics simulation), you need to avoid new in your time-critical loops.
Operator new is an implicit thread synchronization
Yes, you heard me. Your OS needs to make sure your page tables are consistent and as such calling new will cause your thread to acquire an implicit mutex lock. If you are consistently calling new from many threads you are actually serialising your threads (I've done this with 32 CPUs, each hitting on new to get a few hundred bytes each, ouch! That was a royal p.i.t.a. to debug.)
The rest, such as slow, fragmentation, error prone, etc., have already been mentioned by other answers.
Pre-C++17:
Because it is prone to subtle leaks even if you wrap the result in a smart pointer.
Consider a "careful" user who remembers to wrap objects in smart pointers:
foo(shared_ptr<T1>(new T1()), shared_ptr<T2>(new T2()));
This code is dangerous because there is no guarantee that either shared_ptr is constructed before either T1 or T2. Hence, if one of new T1() or new T2() fails after the other succeeds, then the first object will be leaked because no shared_ptr exists to destroy and deallocate it.
Solution: use make_shared.
Post-C++17:
This is no longer a problem: C++17 imposes a constraint on the order of these operations, in this case ensuring that each call to new() must be immediately followed by the construction of the corresponding smart pointer, with no other operation in between. This implies that, by the time the second new() is called, it is guaranteed that the first object has already been wrapped in its smart pointer, thus preventing any leaks in case an exception is thrown.
A more detailed explanation of the new evaluation order introduced by C++17 was provided by Barry in another answer.
Thanks to @Remy Lebeau for pointing out that this is still a problem under C++17 (although less so): the shared_ptr constructor can fail to allocate its control block and throw, in which case the pointer passed to it is not deleted.
Solution: use make_shared.
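For completeness, the fixed version of the earlier call is simply this (a sketch reusing the same made-up foo, T1 and T2):
#include <memory>

struct T1 {};
struct T2 {};

void foo(std::shared_ptr<T1>, std::shared_ptr<T2>) {}

int main() {
    // Each object is allocated and wrapped in a single step, so nothing can leak
    // between the allocation and the smart pointer taking ownership.
    foo(std::make_shared<T1>(), std::make_shared<T2>());
}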
To a great extent, that's someone elevating their own weaknesses to a general rule. There's nothing wrong per se with creating objects using the new operator. What there is some argument for is that you have to do so with some discipline: if you create an object you need to make sure it's going to be destroyed.
The easiest way of doing that is to create the object in automatic storage, so C++ knows to destroy it when it goes out of scope:
{
    File foo = File("foo.dat");
    // Do things
}
Now, observe that when you fall off that block after the end-brace, foo is out of scope. C++ will call its destructor automatically for you. Unlike Java, you don't need to wait for the garbage collection to find it.
Had you written
{
    File * foo = new File("foo.dat");
you would want to match it explicitly with
    delete foo;
}
or even better, allocate your File * as a "smart pointer". If you aren't careful about that it can lead to leaks.
The answer itself makes the mistaken assumption that if you don't use new you don't allocate on the heap; in fact, in C++ you don't know that. At most, you know that a small amount of memory, say one pointer, is certainly allocated on the stack. However, consider if the implementation of File is something like:
class File {
private:
    FileImpl * fd;
public:
    File(String fn) { fd = new FileImpl(fn); }
};
Then FileImpl will still be allocated on the heap.
And yes, you'd better be sure to have
~File(){ delete fd ; }
in the class as well; without it, you'll leak memory from the heap even if you didn't apparently allocate on the heap at all.
new() shouldn't be used as little as possible. It should be used as carefully as possible. And it should be used as often as necessary as dictated by pragmatism.
Allocation of objects on the stack, relying on their implicit destruction, is a simple model. If the required scope of an object fits that model then there's no need to use new(), with the associated delete() and checking of NULL pointers.
In the case where you have lots of short-lived objects allocation on the stack should reduce the problems of heap fragmentation.
However, if the lifetime of your object needs to extend beyond the current scope then new() is the right answer. Just make sure that you pay attention to when and how you call delete() and the possibilities of NULL pointers, using deleted objects and all of the other gotchas that come with the use of pointers.
When you use new, objects are allocated to the heap. It is generally used when you anticipate expansion. When you declare an object such as,
Class var;
it is placed on the stack.
You will always have to call delete on an object that you placed on the heap with new. This opens up the potential for memory leaks. Objects placed on the stack are not prone to memory leaks!
One notable reason to avoid overusing the heap is for performance -- specifically involving the performance of the default memory management mechanism used by C++. While allocation can be quite quick in the trivial case, doing a lot of new and delete on objects of non-uniform size without strict order leads not only to memory fragmentation, but it also complicates the allocation algorithm and can absolutely destroy performance in certain cases.
That's the problem that memory pools were created to solve, allowing you to mitigate the inherent disadvantages of traditional heap implementations while still allowing you to use the heap as necessary.
Better still, though, to avoid the problem altogether. If you can put it on the stack, then do so.
I tend to disagree with the idea of using new "too much". Though the original poster's use of new with system classes is a bit ridiculous. (int *i; i = new int[9999];? really? int i[9999]; is much clearer.) I think that is what was getting the commenter's goat.
When you're working with system objects, it's very rare that you'd need more than one reference to the exact same object. As long as the value is the same, that's all that matters. And system objects don't typically take up much space in memory. (one byte per character, in a string). And if they do, the libraries should be designed to take that memory management into account (if they're written well). In these cases, (all but one or two of the news in his code), new is practically pointless and only serves to introduce confusions and potential for bugs.
When you're working with your own classes/objects, however (e.g. the original poster's Line class), then you have to begin thinking about the issues like memory footprint, persistence of data, etc. yourself. At this point, allowing multiple references to the same value is invaluable - it allows for constructs like linked lists, dictionaries, and graphs, where multiple variables need to not only have the same value, but reference the exact same object in memory. However, the Line class doesn't have any of those requirements. So the original poster's code actually has absolutely no needs for new.
I think the poster meant to say: you do not have to allocate everything on the heap rather than on the stack.
Basically, objects are allocated on the stack (if the object size allows, of course) because of the cheap cost of stack-allocation, rather than heap-based allocation which involves quite some work by the allocator, and adds verbosity because then you have to manage data allocated on the heap.
Two reasons:
It's unnecessary in this case. You're making your code needlessly more complicated.
It allocates space on the heap, and it means that you have to remember to delete it later, or it will cause a memory leak.
Many answers have gone into various performance considerations. I want to address the comment which puzzled OP:
Stop thinking like a Java programmer.
Indeed, in Java, as explained in the answer to this question,
You use the new keyword when an object is being explicitly created for the first time.
but in C++, objects of type T are created like so: T{} (or T{ctor_argument1,ctor_arg2} for a constructor with arguments). That's why usually you just have no reason to want to use new.
So, why is it ever used at all? Well, for two reasons:
You need to create many values the number of which is not known at compile time.
Due to limitations of C++ implementations on common machines: to prevent a stack overflow caused by allocating too much space for values created the regular way.
Now, beyond what the comment you quoted implied, you should note that even those two cases above are covered well enough without you having to "resort" to using new yourself:
You can use container types from the standard libraries which can hold a runtime-variable number of elements (like std::vector).
You can use smart pointers, which give you a pointer similar to new, but ensure that memory gets released where the "pointer" goes out of scope.
and for this reason, it is an official item in the C++ community Coding Guidelines to avoid explicit new and delete: Guideline R.11.
The core reason is that objects on the heap are always more difficult to use and manage than simple values. Writing code that is easy to read and maintain is always the first priority of any serious programmer.
Another scenario is that the library we are using provides value semantics and makes dynamic allocation unnecessary. std::string is a good example.
For object-oriented code, however, using a pointer - which means using new to create the object beforehand - is a must. In order to simplify the complexity of resource management, we have dozens of tools to make it as simple as possible, such as smart pointers. The object-based paradigm or the generic paradigm assumes value semantics and requires less or no new, just as the other posters have stated.
Traditional design patterns, especially those mentioned in GoF book, use new a lot, as they are typical OO code.
new is the new goto.
Recall why goto is so reviled: while it is a powerful, low-level tool for flow control, people often used it in unnecessarily complicated ways that made code difficult to follow. Furthermore, the most useful and easiest-to-read patterns were encoded in structured programming statements (e.g. for or while); the ultimate effect is that code where goto is the appropriate tool is rather rare. If you are tempted to write goto, you're probably doing things badly (unless you really know what you're doing).
new is similar - it is often used to make things unnecessarily complicated and harder to read, and the most useful usage patterns have been encoded into various classes. Furthermore, if you need a usage pattern for which there isn't already a standard class, you can write your own class that encodes it!
I would even argue that new is worse than goto, due to the need to pair new and delete statements.
Like goto, if you ever think you need to use new, you are probably doing things badly — especially if you are doing so outside of the implementation of a class whose purpose in life is to encapsulate whatever dynamic allocations you need to do.
One more point to all the above correct answers, it depends on what sort of programming you are doing. Kernel developing in Windows for example -> The stack is severely limited and you might not be able to take page faults like in user mode.
In such environments, new or C-like API calls are preferred and even required.
Of course, this is merely an exception to the rule.
new allocates objects on the heap. Otherwise, objects are allocated on the stack. Look up the difference between the two.