I understand that, when we write delete[] on a pointer created by a corresponding new[], the program looks for the accounting information stored with the array and finds out how many elements it holds (a count). Then it invokes each element's destructor. Finally the memory (what memory??) is deallocated by a function named operator delete.
What I am asking is whether delete[] will deallocate the entire memory, allocated by the new[] expression, in one shot because that information (total amount of memory) is available after all elements are destroyed, or will it successively deallocate the memory occupied by the array elements that it has invoked destructors on?
A related follow-up question asks: does delete (non-array form) know the total amount of memory allocated by either new or new[]?
All of the memory will be released to the underlying allocator at once. This is mandated by the C++ standard, although not especially clearly. N3337 is the closest approximation to the official C++11 standard available online for free. Look at section 5.3.5, [expr.delete], which spells out the code that an invocation of delete[] expands to. In particular, operator delete[] is called once in step 7, after destructors have been invoked for all elements of the array (step 6).
You can also deduce this behavior from the wording in 18.6.1.2 [new.delete.array] regarding what pointers are valid to pass to operator delete[]: "... shall be the value returned by an earlier call to operator new[] (... caveats ...)". Since operator new[] returns one pointer to an entire array, operator delete[] must also be invoked once for an entire array.
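A minimal sketch (the Tracked type and message strings are mine) that makes this observable: give a class its own operator new[]/operator delete[] and print from them and from the destructor. With a conforming compiler, delete[] runs all three destructors first and then makes a single operator delete[] call for the whole block.

#include <cstdio>
#include <cstdlib>
#include <new>

struct Tracked {
    ~Tracked() { std::puts("~Tracked()"); }

    // Class-scope array allocation functions, just so the calls are visible.
    // (Minimal sketch: a conforming operator new[] would also honour the
    // installed new-handler; error handling here is just throwing bad_alloc.)
    static void* operator new[](std::size_t bytes) {
        std::printf("operator new[](%zu bytes)\n", bytes);
        if (void* p = std::malloc(bytes)) return p;
        throw std::bad_alloc();
    }
    static void operator delete[](void* p) noexcept {
        std::puts("operator delete[]  <-- called exactly once, for the whole block");
        std::free(p);
    }
};

int main() {
    Tracked* arr = new Tracked[3];
    delete[] arr;
    // Typical output: one operator new[] line, three ~Tracked() lines,
    // then a single operator delete[] line.
}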
5.3.5/2:
In the first alternative (delete object), the value of the operand of
delete may be a null pointer value, a pointer to a non-array object
created by a previous new-expression, or a pointer to a subobject
(1.8) representing a base class of such an object (Clause 10). If not,
the behavior is undefined.
This clearly shows us that the behavior in your proposed situation is undefined by the standard, which leaves "not leaking memory" or "leaking memory" as the two most likely undefined behaviors (it seems that heap corruption could be a much less likely third possibility just for completeness). Just don't write such code and you avoid possible problems.
In practice your example
auto pi = new int[10];
...
delete pi;
works just fine and is quite common in working programs. I'll trust those here who looked it up in the standard, that it is undefined behavior. So I won't suggest you take advantage of the fact that it routinely works.
In practice, the allocator keeps track of the total size and deallocation frees the correct total size, so pairing new[] with delete only causes the destructors of elements past the zeroeth to fail to be called and does no harm for arrays of objects with trivial destruction. But coding undefined behavior is not a great idea even when you are confident every compiler behaves OK for that particular UB.
Edit: now that you removed all that from your question to ask only the more trivial part of the original question:
I assume the standard does not go into any such detail, so an implementation could do either. But I'm also confident any real implementation frees the entire contiguous chunk in one shot after calling all the destructors. (The implementation would also have no choice but to do it the sane way for cases where delete[] is overridden)
Related
From what is written here, new allocates in free store while malloc uses heap and the two terms often mean the same thing.
From what is written here, realloc may move the memory block to a new location. If free store and heap are two different memory spaces, does it mean any problem then?
Specifically I'd like to know if it is safe to use
int* data = new int[3];
// ...
int* mydata = (int*)realloc(data,6*sizeof(int));
If not, is there any other way to realloc memory allocated with new safely? I could allocate new area and memcpy the contents, but from what I understand realloc may use the same area if possible.
You can only realloc that which has been allocated via malloc (or family, like calloc).
That's because the underlying data structures that keep track of free and used areas of memory, can be quite different.
It's likely but by no means guaranteed that C++ new and C malloc use the same underlying allocator, in which case realloc could work for both. But formally that's in UB-land. And in practice it's just needlessly risky.
C++ does not offer functionality corresponding to realloc.
The closest is the automatic reallocation of (the internal buffers of) containers like std::vector.
The C++ containers suffer from being designed in a way that excludes use of realloc.
Instead of the presented code
int* data = new int[3];
//...
int* mydata = (int*)realloc(data,6*sizeof(int));
… do this:
vector<int> data( 3 );
//...
data.resize( 6 );
However, if you absolutely need the general efficiency of realloc, and if you have to accept new for the original allocation, then your only recourse for efficiency is compiler-specific means, such as knowing that realloc happens to be safe with that particular compiler.
Otherwise, if you absolutely need the general efficiency of realloc but are not forced to accept new, then you can use malloc and realloc. Using smart pointers then lets you get much of the same safety as with C++ containers.
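A sketch of that malloc/realloc-plus-smart-pointer combination (the FreeDeleter, make_buffer and grow names are mine; this only makes sense for trivially copyable element types, since realloc copies raw bytes):

#include <cstdlib>
#include <memory>
#include <new>

// unique_ptr whose deleter calls std::free instead of delete[].
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

using IntBuffer = std::unique_ptr<int[], FreeDeleter>;

IntBuffer make_buffer(std::size_t n) {
    void* p = std::malloc(n * sizeof(int));
    if (!p) throw std::bad_alloc();
    return IntBuffer(static_cast<int*>(p));
}

void grow(IntBuffer& buf, std::size_t new_n) {
    void* p = std::realloc(buf.get(), new_n * sizeof(int));
    if (!p) throw std::bad_alloc();       // old block is still owned by buf
    buf.release();                        // realloc already freed/moved it
    buf.reset(static_cast<int*>(p));
}

// Usage:
//   IntBuffer data = make_buffer(3);
//   grow(data, 6);   // may reuse the same block, as realloc allows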
The only possibly relevant restriction C++ adds to realloc is that C++'s malloc/calloc/realloc must not be implemented in terms of ::operator new, and its free must not be implemented in terms of ::operator delete (per C++14 [c.malloc]p3-4).
This means the guarantee you are looking for does not exist in C++. It also means, however, that you can implement ::operator new in terms of malloc. And if you do that, then in theory, ::operator new's result can be passed to realloc.
In practice, you should be concerned about the possibility that new's result does not match ::operator new's result. C++ compilers may e.g. combine multiple new expressions to use one single ::operator new call. This is something compilers already did when the standard didn't allow it, IIRC, and the standard now does allow it (per C++14 [expr.new]p10). That means that even if you go this route, you still don't have a guarantee that passing your new pointers to realloc does anything meaningful, even if it's no longer undefined behaviour.
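For illustration, a minimal malloc-based replacement of the global allocation functions might look like the sketch below. A fully conforming replacement would also loop on the installed new-handler and provide the sized, array and nothrow overloads, so treat this as the idea only, not a drop-in implementation.

#include <cstdlib>
#include <new>

// Replace the global allocation functions with thin wrappers over malloc/free.
void* operator new(std::size_t size) {
    if (void* p = std::malloc(size ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept {
    std::free(p);
}

int main() {
    int* p = new int(42);   // goes through the replaced operator new
    delete p;               // and the replaced operator delete
}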
In general, don't do that. If you are using user-defined types with non-trivial initialization, then in the reallocate-copy-free case the destructors of your objects won't get called by realloc, and neither will any copy constructor when the data is copied. This may lead to undefined behavior due to an incorrect use of object lifetime (see C++ Standard §3.8 Object lifetime, [basic.life]).
1 The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. —end note ]
The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
— the storage which the object occupies is reused or released.
And later (emphasis mine):
3 The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime.
So, you really don't want to use an object out of its lifetime.
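To make that concrete, here is a hypothetical type (name and layout are mine) whose invariant a bitwise relocation silently breaks; real types such as small-string-optimised strings behave similarly.

#include <cstdlib>
#include <new>

// The type keeps a pointer into its own storage. realloc may memcpy the bytes
// and free the old block, and no constructor, destructor, or move operation
// ever runs to re-point `cursor` -- so the relocated copy would dangle.
struct SelfReferential {
    char  buffer[16];
    char* cursor = buffer;   // invariant: points into this very object
};

int main() {
    void* raw = std::malloc(sizeof(SelfReferential));
    if (!raw) return 1;
    SelfReferential* p = new (raw) SelfReferential;   // invariant established

    // If this block were now moved by realloc, the moved copy's `cursor`
    // would still point at the old, freed storage -- exactly the lifetime
    // violation described above.

    p->~SelfReferential();
    std::free(raw);
}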
It is not safe, and it's not elegant.
It might be possible to override new/delete to support the reallocation, but then you may as well consider to use the containers.
In general, no.
There are a slew of things which must hold to make it safe (a well-defined alternative for trivial types is sketched after this list):
Bitwise copying the type and abandoning the source must be safe.
The destructor must be trivial, or you must in-place-destruct the elements you want to deallocate.
Either the constructor is trivial, or you must in-place-construct the new elements.
Trivial types satisfy the above requirements.
In addition:
The new[] function must pass the request on to malloc without any change and must not do any bookkeeping on the side. You can force this by replacing global new[] and delete[], or the ones in the respective classes.
The compiler must not ask for more memory in order to save the number of elements allocated, or anything else.
There is no way to force that, though as a matter of quality of implementation a compiler shouldn't store such information if the type has a trivial destructor.
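For contrast, here is a sketch of the well-defined allocate/copy/free approach the question itself mentions (the helper name and signature are mine). For trivially copyable types the copy is essentially a memcpy; the only thing realloc can sometimes save over this is the copy itself.

#include <algorithm>
#include <cstddef>

int* grow_int_array(int* old_data, std::size_t old_n, std::size_t new_n) {
    int* new_data = new int[new_n];
    std::copy(old_data, old_data + std::min(old_n, new_n), new_data);
    delete[] old_data;          // matches the original new[]
    return new_data;
}

// Usage:
//   int* data = new int[3];
//   data = grow_int_array(data, 3, 6);
//   ...
//   delete[] data;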
Yes - if new actually called malloc in the first place (for example, this is how VC++ new works).
Otherwise, no. Do note that once you decide to reallocate the memory (because new called malloc), your code is compiler-specific and no longer portable between compilers.
(I know this answer may upset many developers, but my answer depends on real facts, not just idiom.)
That is not safe. Firstly the pointer you pass to realloc must have been obtained from malloc or realloc: http://en.cppreference.com/w/cpp/memory/c/realloc.
Secondly the result of new int [3] need not be the same as the result of the allocation function - extra space may be allocated to store the count of elements.
(And for more complex types than int, realloc wouldn't be safe since it doesn't call copy or move constructors.)
You may be able to (not in all cases), but you shouldn't. If you need to resize your data table, you should use std::vector instead.
Details on how to use it are listed in another SO question.
These functions are mostly used in C.
memset sets the bytes in a block of memory to a specific value.
malloc allocates a block of memory.
calloc, same as malloc. Only difference is that it initializes the bytes to zero.
In C++ the preferred method to allocate memory is to use new.
C: int* intArray = (int*) malloc(10 * sizeof(int));
C++: int* intArray = new int[10];
C: int* intArray = (int*) calloc(10, sizeof(int));
C++: int* intArray = new int[10]();
Current draft standard explicitly states that placement new[] can have a space overhead:
This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another.
So presumably they have something in mind as to why a compiler needs this overhead. What is it? Can a compiler use this overhead for anything useful?
In my understanding, to destruct this array, the only solution is to call destructors in a loop (am I right on this?), as there is no placement delete[] (btw, shouldn't we have placement delete[] to properly destruct the array, not just its elements?). So the compiler doesn't have to know the array length.
I thought as this overhead cannot be used for anything useful, compilers don't use it (so this is not an issue in practice). I've checked compilers with this simple code:
#include <stdio.h>
#include <new>
struct Foo {
    ~Foo() { }
};

int main() {
    char buffer1[1024];
    char buffer2[1024];
    float *fl = new(buffer1) float[3];
    Foo *foo = new(buffer2) Foo[3];
    printf("overhead for float[]: %d\n", (int)(reinterpret_cast<char*>(fl) - buffer1));
    printf("overhead for Foo[] : %d\n", (int)(reinterpret_cast<char*>(foo) - buffer2));
}
GCC and Clang don't use any overhead at all. But MSVC uses 8 bytes in the Foo case. For what purpose could MSVC use this overhead?
Here's some background, why I put this question.
There were previous questions about this subject:
Array placement-new requires unspecified overhead in the buffer?
Can placement new for arrays be used in a portable way?
As far as I see, the moral of these questions is to avoid placement new[] and use placement new in a loop. But this solution doesn't create an array, only elements sitting next to each other; that is not an array, and using operator[] on them is undefined behavior. Those questions are more about how to avoid placement new[]; this question is more about the "why?".
Current draft standard explicitly states ...
To clarify, this rule has (probably) existed since the first version of the standard (the earliest version I have access to is C++03, which does contain that rule, and I found no defect report about needing to add it).
So presumably they have something in mind as to why a compiler needs this overhead
My suspicion is that the standard committee didn't have any particular use case in mind, but added the rule in order to keep the existing compiler(s?) with this behaviour compliant.
For what purpose could MSVC use this overhead? "why?"
These questions could confidently be answered only by the MS compiler team, but I can propose a few conjectures:
The space could be used by a debugger, which would allow it to show all of the elements of the array. It could be used by an address sanitiser to verify that the array isn't overflowed. That said, I believe both of these tools could store the data in an external structure.
Considering the overhead is only reserved in the case of a non-trivial destructor, it might be that it is used to store the number of elements constructed so far, so that the compiler can know which elements to destroy in the event of an exception in one of the constructors. Again, as far as I know, this could just as well be stored in a separate temporary object on the stack.
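A small illustration (the Widget type is mine) of why something has to track how many elements have been constructed so far: if a constructor throws part-way through a new[] expression, exactly the already-constructed elements must be destroyed before the exception propagates. Whether that running count lives in the array cookie or in a stack temporary is, as said, an implementation detail.

#include <cstdio>

struct Widget {
    static int made;
    Widget()  { if (made == 2) throw 42; ++made; std::printf("ctor #%d\n", made); }
    ~Widget() { std::puts("dtor"); }
};
int Widget::made = 0;

int main() {
    try {
        Widget* w = new Widget[4];   // the third constructor throws
        delete[] w;                  // not reached
    } catch (int) {
        // Expected output so far: ctor #1, ctor #2, dtor, dtor -- exactly the
        // two fully constructed elements were destroyed.
        std::puts("caught: exactly two destructors ran before this");
    }
}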
For what it's worth, the Itanium C++ ABI agrees that the overhead isn't needed:
No cookie is required if the new operator being used is ::operator new[](size_t, void*).
Where cookie refers to the array length overhead.
The dynamic array allocation is implementation-specific. But one of the common practices when implementing dynamic array allocation is storing the array's size before its beginning (that is, storing the size before the first element). This perfectly overlaps with:
representing array allocation overhead; the result of the
new-expression will be offset by this amount from the value returned
by operator new[].
"Placement delete" would not make much sense. What delete does is call destructor and free memory. delete calls destructor on all of the array elements and frees it. Calling destructor explicitly is in some sense "placement delete".
Current draft standard explicitly states that placement new[] can have a space overhead ...
Yes, beats the hell out of me too. I posted it (rightly or wrongly) as an issue on GitHub, see:
https://github.com/cplusplus/draft/issues/2264
So presumably they have something in mind as to why a compiler needs this overhead. What is it? Can a compiler use this overhead for anything useful?
Not so far as I can see, no.
In my understanding, to destruct this array, the only solution is to call destructors in a loop (am I right on this?), as there is no placement delete[] (btw, shouldn't we have placement delete[] to properly destruct the array, not just its elements?). So the compiler doesn't have to know the array length.
For the first part of what you say there, absolutely. But we don't need a placement delete [] (we can just call the destructors in a loop, because we know how many elements there are).
I thought as this overhead cannot be used for anything useful, compilers don't use it (so this is not an issue in practice). I've checked compilers with this simple code:
...
GCC and clang doesn't use any overhead at all. But, MSVC uses 8 bytes for the Foo case. For what purpose could MSVC use this overhead?
That's depressing. I really thought that no compiler would do this, because there's no point. It's only used by delete[] and you can't use that with placement new anyway, so...
So, to summarise, the purpose of placement new[] should be to let the compiler know how many elements there are in the array so that it knows how many constructors to call. And that's all it should do. Period.
(Edit: added more detail)
But this solution doesn't create an array, but elements which are sitting next to each other, which is not an array, using operator[] is undefined behavior for them.
As far as I understand, this is not quite true.
[basic.life]
The lifetime of an object of type T begins when:
(1.1) — storage with the proper alignment and size for type T is obtained, and
(1.2) — if the object has non-vacuous initialization, its initialization is complete
Initialisation of an array consists of initialisation of its elements. (Important: this statement may not be directly supported by the standard. If it is indeed not supported, then this is a defect in the standard which makes creation of variable length arrays other than by new[] undefined. In particular, users cannot write their own replacement for std::vector. I don't believe this to be the intent of the standard).
So whenever there is a char array suitably sized and aligned for an array of N objects of type T, the first condition is satisfied.
In order to satisfy the second condition, one needs to initialise N individual objects of type T. This initialisation may be portably achieved by incrementing the original char array address by sizeof(T) at a time, and calling placement new on the resulting pointers.
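A sketch of that element-by-element approach (the helper names construct_n/destroy_n are mine; exception safety during construction is omitted for brevity, and the element type is assumed default-constructible):

#include <new>
#include <cstddef>

// Construct n objects of type T into suitably sized and aligned raw storage,
// one placement new at a time; destroy them in reverse order when done.
// (This is essentially what std::vector and the std::uninitialized_*
// algorithms do under the hood.)
template <typename T>
T* construct_n(void* raw, std::size_t n) {
    char* p = static_cast<char*>(raw);
    for (std::size_t i = 0; i < n; ++i)
        ::new (p + i * sizeof(T)) T();   // one element at a time
    return static_cast<T*>(raw);         // pointer to the first element
}

template <typename T>
void destroy_n(T* first, std::size_t n) {
    while (n > 0)
        first[--n].~T();                 // reverse order of construction
}

// Usage sketch:
//   alignas(Foo) char buffer[sizeof(Foo) * 3];
//   Foo* foos = construct_n<Foo>(buffer, 3);
//   ...
//   destroy_n(foos, 3);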
Possible Duplicate:
Are memory leaks “undefined behavior” class problem in C++?
Is never calling delete or delete[] on an address returned by new or new[] respectively in a C++ program undefined behavior, or merely a memory leak?
References from the Standard(if any) are welcome.
This came up in one of the comments here & I am just a bit confused about it.
[basic.life] (3.8 Object lifetime) in paragraph 4 tells :
A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.
The standard is clear with regards to the semantics of new and delete. There's certainly no undefined behavior if you don't call delete; it is, in fact, standard practice for singletons, and I imagine that std::cout and std::cin use new[] to acquire their buffers (which they almost certainly never delete). Why would not calling delete be undefined behavior?
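The "never deleted" singleton practice mentioned above looks roughly like this sketch (the class name and member are mine): the object is new-ed once and intentionally leaked, the OS reclaims the memory at process exit, and the only thing given up is the destructor's side effects.

#include <string>

class Registry {
public:
    static Registry& instance() {
        static Registry* obj = new Registry;   // constructed on first use
        return *obj;                           // never deleted: a leak, not UB
    }
private:
    Registry() = default;
    std::string name_{"global registry"};      // illustrative state
};

int main() {
    Registry& r = Registry::instance();
    (void)r;
}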
What is undefined behavior is calling the wrong form of delete, calling free for memory allocated with new, or in general to attempt to delete an object without following the protocol required by its allocation.
Referring to [basic.stc.dynamic.deallocation] (aka 3.7.4.2 in n3337) there are only 4 paragraphs.
operator delete and operator delete[] should be either class members or in global scope
Details on the valid signatures of operator delete and operator delete[]
Details on which delete can be used for deallocation, depending on which new was used for allocation
Details on the possible argument values and the effects of the call (i.e. the pointers to this storage are now invalid)
There is absolutely no note here on what would happen if storage is allocated but never released.
I don't think that the Standard concerns itself with this, so it is more unspecified rather than undefined.
It's just a memory leak.
But I explicitly remember the standard saying that use new with delete[] and new [] with delete is undefined behavior. (or any combination with malloc or free)
I don't think the standard specifically says calling new results in undefined behavior if you fail to call delete. Also, how can the run-time tell if you call delete sometime later or never call it at all?
I don't think there are any contracts in the standard that say - if you do X, you MUST do Y afterwards, otherwise it's UB.
Let's say that if you do not call delete, your program will still work. BUT if you never free your allocations, your program's memory usage will keep growing until it runs out of free memory (the longer it runs, the better the chances of that happening), which will cause crashes at various points and be very hard to track down (and I think that is what the 'Undefined Behavior' mentioned in the comment refers to).
If delete/delete[] is not called for the objects allocated with new/new[], there would be resource leaks. It could be memory leak if the constructor had allocated dynamic memory. Other things like semaphore lock not released, file handles not released etc can happen if the constructor had allocated them.
It will not be undefined behavior.
I don't see how not releasing memory will lead to undefined behaviour.
If you don't clean up, the OS still has knowledge of the allocated memory. That will lead to a resource leak for as long as the application runs.
Someone on IRC claimed that, although allocating with new[] and deleting with delete (not delete[]) is UB, on Linux platforms (no further details about the OS) it would be safe.
Is this true? Is it guaranteed? Is it to do with something in POSIX that specifies that dynamically-allocated blocks should not have metadata at the start?
Or is it just completely untrue?
Yes, I know I shouldn't do it. I never would. I am curious about the veracity of this idea; that's it!
By "safe", I mean: "will not cause behaviour other than were the original allocation performed by new, or were the de-allocation performed by delete[]". This means that we might see 1 "element" destruction or n, but no crashing.
Of course it's not true. That person is mixing up several different concerns:
how does the OS handle allocations/deallocations
correct calls to constructors and destructors
UB means UB
On the first point, I'm sure he's correct. It is common to handle both in the same way on that level: it is simply a request for X bytes, or a request to release the allocation starting at address X. It doesn't really matter if it's an array or not.
On the second point, everything falls apart. new[] calls the constructor for each element in the allocated array. delete calls the destructor for the one element at the specified address. And so, if you allocate an array of objects, and free it with delete, only one element will have its destructor invoked. (This is easy to forget because people invariably test this with arrays of ints, in which case this difference is unnoticeable)
And then there's the third point, the catch-all. It's UB, and that means it's UB. The compiler may make optimizations based on the assumption that your code does not exhibit any undefined behavior. If it does, it may break some of these assumptions, and seemingly unrelated code might break.
Even if it happens to be safe on some environment, don't do it. There's no reason to want to do it.
Even if it did return the right memory to the OS, the destructors wouldn't be called properly.
It's definitely not true for all or even most Linuxes, your IRC friend is talking bollocks.
POSIX has nothing to do with C++. In general, this is unsafe. If it works anywhere, it's because of the compiler and library, not the OS.
This question discusses in great details when exactly mixing new[] and delete looks safe (no observable problems) on Visual C++. I suppose that by "on Linux" you actually mean "with gcc" and I've observed very similar results with gcc on ideone.com.
Please note that this requires:
global operator new() and operator new[]() functions to be implemented identically and
the compiler optimizing away the "prepend with number of elements" allocation overhead
and also only works for types with trivial destructors.
Even with these requirements met there's no guarantee it will work on a specific version of a specific compiler. You'll be much better off simply not doing that - relying on undefined behavior is a very bad idea.
It is definitely not safe as you can simply try out with the following code:
#include <iostream>

class test {
public:
    test()  { std::cout << "Constructor" << std::endl; }
    ~test() { std::cout << "Destructor" << std::endl; }
};

int main() {
    test* t = new test[10];
    delete t;
    return 1;
}
Have a look at http://ideone.com/b8BiQ . It fails miserably.
It may work when you do not use classes, but only fundamental types, but even that is not guaranteed.
EDIT: Some explanations for those of you who want to know why this crashes:
new and delete mainly serve as wrappers around malloc(), hence calling free() on a newed pointer is most of the time "safe" (remember to call the destructor), but you should not rely on it. For new[] and delete[] however the situation is more complicated.
When an array of classes gets constructed using new[], each default constructor is called in turn. When you do delete[], each destructor gets called. However each destructor also has to be supplied a this pointer to use internally as a hidden parameter. So before calling the destructors, the program has to find the locations of all objects within the reserved memory, in order to pass these locations as this pointers to the destructors. So whatever is needed to later reconstruct these locations has to be stored somewhere.
Now the easiest way would be to have a global map somewhere that stores this information for all new[]-ed pointers. In this case, if delete is called instead of delete[], only one of the destructors would be called and the entry would not be removed from the map. However this method is usually not used, because maps are slow and memory management should be as fast as possible.
Hence for libstdc++ a different solution is used. Since only a few bytes are needed as additional information, it is fastest to just over-allocate by those few bytes, store the information at the beginning of the memory, and return a pointer to the memory after the bookkeeping. So if you allocate an array of 10 objects of 10 bytes each, the program will allocate 100+X bytes, where X is the size of the data needed to reconstruct the this pointers.
So in this case it looks something like this:

| Bookkeeping | First Object | Second Object | ...
^               ^
|               This is what is returned by new[]
|
This is what is returned by malloc()
So if you pass the pointer you received from new[] to delete[], it will call all the destructors, then subtract X from the pointer and hand that to free(). However if you call delete instead, it will call the destructor for the first object only and then immediately pass that pointer to free(), which means free() has just been given a pointer which was never malloc'ed, so the result is UB.
Have a look at http://ideone.com/tIiMw to see what gets passed to delete and delete[]. As you can see, the pointer returned from new[] is not the pointer which was allocated internally; 4 is added to it before it is returned to main(). When calling delete[] correctly, the same 4 is subtracted and we get the correct pointer inside delete[]; this subtraction is missing when calling delete, so we get the wrong pointer.
In case of calling new[] on a fundamental type, the compiler immediately knows that it will not have to call any destructors later and it just optimizes the bookkeeping away. However it is definitely allowed to write bookkeeping even for fundamental types. And it is also allowed to add bookkeeping in case you call new.
This bookkeeping in front of the real pointer is actually a very good trick if you ever need to write your own memory allocation routines as a replacement for new and delete. There is hardly any limit on what you can store there, so one should never assume that anything returned from new or new[] was actually returned from malloc().
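A minimal sketch of that trick as it might appear in hand-rolled allocation routines (Header, tracked_alloc and tracked_free are mine; a real implementation would also make sure the header keeps the returned pointer suitably aligned, e.g. by padding it to alignof(std::max_align_t)):

#include <cstdio>
#include <cstdlib>
#include <new>

// The hidden header here just records the requested size.
struct Header {
    std::size_t size;
};

void* tracked_alloc(std::size_t bytes) {
    void* raw = std::malloc(sizeof(Header) + bytes);    // over-allocate
    if (!raw) throw std::bad_alloc();
    new (raw) Header{bytes};                            // bookkeeping up front
    return static_cast<char*>(raw) + sizeof(Header);    // hand out the rest
}

void tracked_free(void* user) noexcept {
    if (!user) return;
    char* raw = static_cast<char*>(user) - sizeof(Header);  // step back over the header
    std::printf("freeing a block of %zu user bytes\n",
                reinterpret_cast<Header*>(raw)->size);
    std::free(raw);
}

int main() {
    void* p = tracked_alloc(100);
    tracked_free(p);
}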
I expect that new[] and delete[] just boil down to malloc() and free() under Linux (gcc, glibc, libstdc++), except that the con(de)structors get called. The same for new and delete except that the con(de)structors get called differently. This means that if his constructors and destructors don't matter, then he can probably get away with it. But why try?
I am relatively new to programming so this may well sound like a stupid question to you seasoned pros out there. Here goes:
In C++, when I use the delete operator on arrays, I have noticed that the data contained in the released memory locations is preserved. For example:
int* testArray=new int[5];
testArray[3]=24;
cout<<testArray[3]; //prints 24
delete [] testArray;
cout<<testArray[3]; // still prints 24
Subsequently, am I right in assuming that since testArray[3] still prints 24, the data in the deleted memory location is still preserved? If so, does this notion hold true for other languages, and is there any particular reason for this?
Shouldn't "freed" memory locations have null data, or is "free memory" just a synonym for memory that can be used by other applications, irrespective of whether the locations contain data or not?
I've noticed this is not the case when it comes to non array types such as int, double etc. Dereferencing and outputting the deleted variable prints 0 rather than the data. I also have a sneaking suspicion that I might be using wrong syntax to delete testArray, which will probably make this question all the more stupid. I'd love to hear your thoughts nonetheless.
Once you deallocate the memory by calling delete and try to access the memory at that address again it is an Undefined Behavior.
The Standard does not mandate that compilers do anything special in this regard. It does not ask them to mark the deallocated memory with 0 or some special magic number; that is left as an implementation detail. Some implementations do mark such memory with special magic numbers, but it is up to each implementation.
In your case, the data still exists at the deallocated addresses, perhaps because nothing else has needed that memory to be reused yet and the implementation didn't clear the contents of the previous allocation (since it is not required to).
However, you should not rely on this at all, as it might not always be the case. It still is, and will remain, undefined behavior.
EDIT: To answer the Q in comment.
The delete operator does not return any value, so you cannot check a return status; however the Standard guarantees that the delete operator will successfully do its job.
Relevant quote from the C++03 Standard:
Section §3.7.3.2.4:
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage.
The data is still there because when you free it, the allocator only marks the block as free in its allocation table; the system would be very slow if it had to zero all the memory every time free() or delete is called.
This is the same in any language.
I think the non-array types were set to zero because they were in fact statically allocated rather than dynamically allocated.
non-POD data will be altered in a destructor (which might appear as being null-ed in a debugger).
Freed memory is indeed just memory that can be used again, regardless of what it still contains.
You can NOT depend on the data being unaltered after delete. On a related note, debugging malloc's or runtime libraries will frequently reset the data to a specific signature (0xdeadbeef, 0xdcdcdcdc etc) so you can easily spot accesses to deleted memory in a debugger.