This is something I've been wondering for a while and never found an answer for:
Why is it that when you allocate something on the heap you cannot determine the size of it from just the pointer, yet you can delete it using just the pointer and somehow C++ knows how many bytes to free?
Does this have something to do with the way it is stored on the heap?
Is this information there but not exposed by C++?
And perhaps this should be a separate question but I think it's pretty related so I'll ask it here:
Why is it that a dynamically allocated array must be deleted with delete [] rather than plain delete? Why does C++ need this additional information to correctly free all the memory?
When an allocation is made, a small section of memory immediately before it [or, technically, somewhere else entirely, but just before is the most common scenario] stores the size of the allocation, and in the case of new [] also stores the number of allocated objects.
Note that the C++ standard doesn't give any way to retrieve this information, for a reason: it may not accurately describe what you allocated. For example, the size of an array may very well be rounded up to some "nice" boundary [almost all modern allocators round to 16 bytes at the very least, so that the memory is usable for SSE and similar SIMD implementations on other processor architectures]. So if you allocated 40 bytes, it would report back 48, which isn't what you asked for and would be rather confusing. And of course, there is no guarantee that the information is stored at ALL - it may be implied by some other information kept in the "admin" block of the allocation.
And of course, you can use placement new, in which case there is no admin block, and the allocation is not deleted in the normal fashion - some arbitrary code wouldn't be able to tell the difference.
delete differs from delete [] in that delete [] will know how many objects have been allocated, and call the destructor for all of those objects. It is also possible [or even likely] that new [] stores the number of elements in a way that means that calling delete [] on something that wasn't created with new [] will go horribly wrong.
And as Zan Lynx commented, if the objects have no destructor (e.g. when you are allocating data for int or struct { int x; double y; }, etc. - including classes without a destructor of their own [note, however, that if the class contains a member of class type, the compiler will generate a destructor for you]), then there is no need to store the count or do anything else, so the compiler CAN, if it wishes, optimise this sort of allocation into a regular new and delete.
Related:
How does delete[] "know" the size of the operand array?
How does delete[] know it's an array?
int* i = new int[4];
delete[] i;
When we call delete[], how does the program know that "i" is 4-byte length? Is the 4 stored somewhere in memory?
Does the implementation of delete[] depend on the system or the compiler?
Is there some system API to get the length of i?
As HadeS said, something has to hold the information about how much memory was allocated; but what, and where?
It must be held in memory somewhere, perhaps near the pointer i.
First off, i is not "4-byte length". Rather, i is a pointer to an array of four ints.
Next, delete[] doesn't need to know anything, because int has no destructor. All that has to happen is that the memory needs to be freed, which is done by the system's allocator. This is the same situation as with free(p) -- you don't need to tell free how much memory needs to be freed, since you expect it to figure that out.
The situation is different when destructors need to be called; in that case, the C++ implementation does indeed need to remember the number of objects separately. The method for this is up to the implementation, although many compilers follow the popular Itanium ABI, which allows object code compiled by different compilers to be linked together.
There is no way for you to query this information. You should consider dynamic arrays a misfeature of C++: there is essentially no reason to use them*, and you can always do better with some kind of class that manages the memory and the objects separately. Since you'll have to remember the number of array elements anyway, it's much better to encapsulate the size and the allocation in one coherent class than to pass vague dynamic arrays around together with the extra information you need in order to use them (unless you had self-terminating semantics, but then you'd just be spending the extra space on the terminator instead).
*) And there are at least two defect reports against dynamic arrays in the standard that nobody has been bothered to fix.
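To make that concrete, here is a minimal sketch of the alternative the answer recommends (my own illustration, not part of the original answer): let a container such as std::vector own both the allocation and the element count.

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v(100);          // the vector remembers its own size
    std::cout << v.size() << '\n';    // prints 100; no need to ask the allocator
}                                     // elements destroyed and memory freed here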
When you dynamically allocate memory, the runtime allocates an extra block of memory in addition to what you asked for, and that block holds the information about how much memory was allocated.
When you later delete this memory using delete, that extra block is read to see how much memory was allocated, and the space is freed accordingly.
I don't think there is any API which will fetch this information.
In "The C++ Programming Language" book Stroustrup says:
"To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. Typically, one word is used to hold the object’s size.
That means every object allocated by new has its size located somewhere in the heap. Is the location known and if it is how can I access it?
In actual fact, the typical implementation of the memory allocators store some other information too.
There is no standard way to access this information, in fact there is nothing in the standard saying WHAT information is stored either (the size in bytes, number of elements and their size, a pointer to the last element, etc).
Edit:
If you have the base-address of the object and the correct type, I suspect the size of the allocation could be relatively easily found (not necessarily "at no cost at all"). However, there are several problems:
It assumes you have the original pointer.
It assumes the memory is allocated exactly with that runtime library's allocation code.
It assumes the allocator doesn't "round" the allocation address in some way.
To illustrate how this could go wrong, let's say we do this:
// Hypothetical helper: pretend the runtime exposed the allocation size.
size_t get_len_array(int *mem)
{
    return allocated_length(mem);  // made-up function, for illustration only
}
...
void func()
{
    int *p = new int[100];
    cout << get_len_array(p);   // fine: p really points to a heap allocation
    delete [] p;
}
void func2()
{
    int buf[100];
    cout << get_len_array(buf); // Ouch! buf was never heap-allocated
}
That means every object allocated by new has its size located somewhere in the heap. Is the location known and if it is how can I access it?
Not really, that is not needed for all cases. To simplify the reasoning, there are two levels at which the sizes could be needed. At the language level, the compiler needs to know what to destroy. At the allocator level, the allocator needs to know how to release the memory given only a pointer.
At the language level, only the array versions new[] and delete[] need to handle any size. When you allocate with new, you get a pointer with the type of the object, and that type has a given size.
To destroy the object the size is not needed. When you delete, either the pointer is to the correct type, or the static type of the pointer is a base and the destructor is virtual. All other cases are undefined behavior, and thus can be ignored (anything can happen). If it is the correct type, then the size is known. If it is a base with a virtual destructor, the dynamic dispatch will find the final overrider, and at that point the type is known.
There can be different strategies to manage this. The Itanium C++ ABI (used by multiple compilers on multiple platforms, although not by Visual Studio), for example, generates up to three different destructors per type, one of them being a version that also takes care of releasing the memory. So although delete ptr is defined in terms of calling the appropriate destructor and then releasing the memory, in this particular ABI delete ptr calls a special "deleting destructor" that both destroys the object and releases the memory.
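As a small sketch of the polymorphic case just described (my own example, not from the answer): the complete type, and hence its size, is only discovered through the virtual destructor call.

#include <iostream>

struct Base {
    virtual ~Base() { std::cout << "~Base\n"; }
};

struct Derived : Base {
    ~Derived() override { std::cout << "~Derived\n"; }
};

int main()
{
    Base* p = new Derived;
    delete p;   // dynamic dispatch reaches ~Derived first, then ~Base;
                // the implementation also releases the full Derived object
}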
When you use new[], the type of the pointer is the same regardless of the number of elements in the dynamic array, so the type cannot be used to retrieve that information. A common implementation allocates an extra integral value, stores the size there, follows it with the real objects, and then returns a pointer to the first object. delete[] then moves the received pointer back by one integer, reads the number of elements, calls the destructor for each of them, and releases the memory (using the pointer originally obtained from the allocator, not the pointer given to the program). This is really only needed if the type has a non-trivial destructor; if the destructor is trivial, the implementation does not need to call it and can avoid storing the count.
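A rough hand-written sketch of that scheme follows (purely illustrative; a real implementation would also worry about alignment, exceptions during construction, and integration with operator new[]):

#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <new>

struct Widget {
    Widget()  { std::cout << "construct\n"; }
    ~Widget() { std::cout << "destroy\n"; }
};

// Allocate room for a count followed by n Widgets, construct them,
// and hand back a pointer to the first Widget (just past the count).
Widget* make_array(std::size_t n)
{
    void* raw = std::malloc(sizeof(std::size_t) + n * sizeof(Widget));
    *static_cast<std::size_t*>(raw) = n;              // store the element count
    Widget* first = reinterpret_cast<Widget*>(
        static_cast<char*>(raw) + sizeof(std::size_t));
    for (std::size_t i = 0; i < n; ++i)
        new (first + i) Widget;                       // placement-new each element
    return first;
}

// Step back to the count, destroy every element, then free the block
// that was originally allocated (not the pointer the caller holds).
void destroy_array(Widget* first)
{
    char* raw = reinterpret_cast<char*>(first) - sizeof(std::size_t);
    std::size_t n = *reinterpret_cast<std::size_t*>(raw);
    for (std::size_t i = n; i > 0; --i)
        first[i - 1].~Widget();                       // destroy in reverse order
    std::free(raw);
}

int main()
{
    Widget* w = make_array(3);
    destroy_array(w);
}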
Below the language level, the real memory allocator (think of malloc) needs to know how much memory was allocated so that the same amount can be released. In some cases that can be done by attaching the metadata to the memory buffer in the same way that new[] stores the size of the array: acquire a larger block, store the metadata at the front, and return a pointer just past it. The deallocator then undoes the transformation to get to the metadata.
This is, on the other hand, not always needed. A common implementation for small allocations is to allocate pages of memory that form pools from which the small allocations are handed out. To make this efficient, the allocator considers only a few different sizes, and requests that don't fit one of those sizes exactly are bumped up to the next one. If you request, say, 65 bytes, the allocator might actually give you 128 bytes (assuming pools of 64 and 128 bytes). Thus, given one of the larger blocks managed by the allocator, all pointers that were allocated from it have the same size, and the allocator can find the block from which a pointer came and infer the size from it.
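For example, a toy size-class table might look like this (purely illustrative; real allocators use many more classes and keep per-page metadata to map a pointer back to its class):

#include <cstddef>
#include <cstdio>

// Round a request up to the next pool size; every pointer handed out
// from a given pool is known to be exactly that big.
std::size_t round_to_class(std::size_t n)
{
    static const std::size_t classes[] = { 16, 32, 64, 128, 256 };
    for (std::size_t c : classes)
        if (n <= c) return c;
    return n;   // larger requests take a general-purpose path
}

int main()
{
    std::printf("%zu\n", round_to_class(65));   // prints 128
}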
Of course, these are all implementation details that are not accessible to a C++ program in a standard, portable way, and the exact implementation can differ based not just on the program but also on the execution environment. If you are interested in knowing how the information is really kept in your environment, you may be able to find out, but I would think twice before trying to use it for anything other than learning purposes.
You are not deleting an object directly; instead you pass a pointer to the delete operator.
Reference C++
You use delete by following it with a pointer to a block of memory originally allocated with new:
int * ps = new int; // allocate memory with new
. . . // use the memory
delete ps; // free memory with delete when done
This removes the memory to which ps points; it doesn’t remove the pointer ps itself.
You can reuse ps, for example, to point to another new allocation
Possible duplicate: Why [] is used in delete ( delete [] ) to free dynamically allocated array?
Why does C++ still have a delete[] AND a delete operator?
I'm wondering what the difference is. I know the obvious answer some might give, that one is for deleting an array and the other for deleting a single object, but why should there be two different deletion operations for these two cases? I mean, delete is basically implemented on top of C's free, which doesn't care whether the pointer actually points to an array or to a single object. The only reason I can think of is to be able to tell that it's an array and call the destructor for every element instead of only the first, but that shouldn't be possible either, since the compiler cannot guess the length of the array just by looking at its pointer. By the way, although it's said to invoke undefined behavior to call delete on memory allocated with new[], I can't imagine anything that could actually go wrong.
As you have discovered, the compiler needs to know the length of an array (at least for non-trivial types) to be able to call the destructor for each element. For this, new[] typically allocates some extra bytes to record the element count and returns a pointer to the end of this bookkeeping area.
When you use delete[] the compiler will look at the memory before the array to find the count and adjust the pointer, so that the originally allocated block is freed.
If you use delete to destroy a dynamically allocated array, destructors for elements (except the first) won't be called and typically this will end up attempting to free a pointer that doesn't point to the beginning of an allocated block, which may corrupt the heap.
but that shouldn't be possible either, since the compiler cannot guess the length of the array just by looking at its pointer
That's not really true. The compiler itself doesn't need to guess anything, but it does decide which function to call to free the memory based on the operator it sees. There is a separate function dedicated to releasing arrays, and this function does indeed know the length of the array to be freed so it can appropriately call destructors.
It knows the length of the array because typically new[] allocates memory that includes the array length (since this is known on allocation) and returns a pointer to just the "usable" memory allocated. When delete[] is called it knows how to access this memory based on the pointer to the usable part of the array that was given.
When you allocate memory using new[], the compiler not only needs to construct each element, it also needs to keep track of how many elements have been allocated. This is needed for delete[] to work correctly.
Since new and delete operate on scalars, they don't need to do that, and could save on a little bit of overhead.
There is absolutely no requirement for new to be compatible with delete[] and vice versa. Mixing the two is undefined behaviour.
Possible duplicate: (POD) freeing memory: is delete[] equal to delete?
Does delete deallocate the elements beyond the first in an array?
char *s = new char[n];
delete s;
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
For more complex types, would delete call the destructor of objects beyond the first one?
Object *p = new Object[n];
delete p;
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region? What if the memory region was allocated with some overhang for performance reasons? For example one could assume that not all allocators would provide a granularity of a single byte. Then any particular allocation could exceed the required size for each element by a whole element or more.
For primitive types, such as char, int, is there any difference between:
int *p = new int[n];
delete p;
delete[] p;
free(p);
Except for the routes taken by the respective calls through the delete->free deallocation machinery?
It's undefined behaviour (it will most likely corrupt the heap or crash the program immediately) and you should never do it. Only free memory with the primitive corresponding to the one used to allocate it.
Violating this rule may happen to work by coincidence, but the program can break as soon as anything changes - the compiler, the runtime, the compiler settings. You should never rely on or expect such behaviour.
delete[] uses compiler-specific bookkeeping data to determine the number of elements. Usually a bigger block is allocated when new[] is called, the number is stored at the beginning, and the caller is given the address just past the stored number. In any case, delete[] relies on the block having been allocated by new[], nothing else. If you pair anything other than new[] with delete[], or vice versa, you run into undefined behaviour.
Read the FAQ: 16.3 Can I free() pointers allocated with new? Can I delete pointers allocated with malloc()?
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
Yes it does.
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region?
The compiler needs to know. See FAQ 16.11
Because the compiler stores that information.
What I mean is the compiler needs different deletes to generate appropriate book-keeping code. I hope this is clear now.
Yes, this is dangerous!
Don't do it!
It will lead to program crashes or even worse behavior!
For objects allocated with new you MUST use delete;
For objects allocated with new [] you MUST use delete [];
For objects allocated with malloc() or calloc() you MUST use free();
Be aware also that in all these cases it's illegal to delete/free an already deleted/freed pointer a second time. Calling free, delete, or delete[] on a null pointer is legal and does nothing.
Yes, there's a real practical danger. Even implementation details aside, remember that the operator new/operator delete and operator new[]/operator delete[] functions can be replaced completely independently. For this reason, it is wise to think of new/delete, new[]/delete[], malloc/free, etc. as different, completely independent methods of memory allocation that have absolutely nothing in common.
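To see that they really are independent functions, here is a hedged sketch (my own example) of a class that replaces only the array forms; new[]/delete[] on this type go through these overloads, while plain new/delete would not:

#include <cstdio>
#include <cstdlib>
#include <new>

struct Tracked {
    // Class-specific array forms: separate functions from the scalar
    // operator new / operator delete, and replaceable independently.
    static void* operator new[](std::size_t n)
    {
        std::printf("operator new[](%zu)\n", n);  // n may include space for an array cookie
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete[](void* p) noexcept
    {
        std::printf("operator delete[]\n");
        std::free(p);
    }
};

int main()
{
    Tracked* t = new Tracked[4];   // allocation goes through Tracked::operator new[]
    delete[] t;                    // must be released through Tracked::operator delete[]
}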
Raymond Chen (a Microsoft developer) has an in-depth article covering scalar vs. vector deletes, and gives some background on the differences. See:
http://blogs.msdn.com/oldnewthing/archive/2004/02/03/66660.aspx
Does delete deallocate the elements beyond the first in an array?
No. delete will destroy only the first element, regardless of which compiler you use. It may appear to work in some cases, but that's coincidental.
Does it matter in the above case seeing as all the elements of s are allocated contiguously, and it shouldn't be possible to delete only a portion of the array?
Depends on how the memory is marked as free. Again, implementation dependent.
For more complex types, would delete call the destructor of objects beyond the first one?
No. Try this:
#include <cstdio>

class DelTest {
    static int next;
    int i;
public:
    DelTest() : i(next++) { printf("Allocated %d\n", i); }
    ~DelTest()            { printf("Deleted %d\n", i); }
};

int DelTest::next = 0;

int main(){
    DelTest *p = new DelTest[5];
    delete p;   // deliberately wrong: should be delete [] p
    return 0;
}
How can delete[] deduce the number of Objects beyond the first, wouldn't this mean it must know the size of the allocated memory region?
Yes, the size is stored somewhere, but where it is stored depends on the implementation. For example, the allocator could store the size in a header preceding the allocated address.
What if the memory region was allocated with some overhang for performance reasons? For example one could assume that not all allocators would provide a granularity of a single byte. Then any particular allocation could exceed the required size for each element by a whole element or more.
It is for this reason that the returned address is made to align to word boundaries. The "overhang" can be seen using the sizeof operator and applies to objects on the stack as well.
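For instance (my own illustration), alignment padding shows up in sizeof even for objects on the stack:

#include <iostream>

struct Padded {
    char c;   // 1 byte
    int  i;   // 4 bytes, but must sit on a 4-byte boundary
};

int main()
{
    // Typically prints 8 on common platforms: three padding bytes are
    // inserted after 'c' so that 'i' is correctly aligned.
    std::cout << sizeof(Padded) << '\n';
}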
For primitive types, such as char, int, is there any difference between ...?
Yes. malloc and new could be using separate blocks of memory. Even if this were not the case, it's a good practice not to assume they are the same.
It's undefined behavior. Hence, the answer is: yes, there could be danger. And it's impossible to predict exactly what will trigger problems. Even if it works one time, will it work again? Does it depend on the type? On the element count?
For primitive types, such as char, int, is there any difference between:
I'd say you'll get undefined behaviour. So you shouldn't count on stable behaviour. You should always use new/delete, new[]/delete[] and malloc/free pairs.
Although it might seem logical that you could mix new[] with free, or delete instead of delete[], this rests on the assumption that the compiler is fairly simplistic, i.e. that it will always use malloc() to implement the memory allocation for new[].
The problem is that if your compiler has a smart enough optimizer it might see that there is no "delete[]" corresponding to the new[] for the object you created. It might therefore assume that it can fetch the memory for it from anywhere, including the stack in order to save the cost of calling the real malloc() for the new[]. Then when you try to call free() or the wrong kind of delete on it, it is likely to malfunction hard.
Step 1 read this: what-is-the-difference-between-new-delete-and-malloc-free
You are only looking at what you see on the developer side.
What you are not considering is how the std lib does memory management.
The first difference is that new and malloc allocate memory from two different areas (new from the free store and malloc from the heap; don't focus on the names, they are both basically heaps, those are just their official names from the standard). If you allocate from one and de-allocate to the other, you will mess up the data structures used to manage the memory (there is no guarantee they will use the same structure for memory management).
When you allocate a block like this:
int* x= new int; // 0x32
Memory may look like this (it probably won't, since I made this up without thinking too hard):
Memory    Value         Comment
0x08      0x40          // Chunk size
0x16      0x10000008    // Free list for chunk size 0x40
0x24      0x08          // Block size
0x32      ??            // Address returned by new
0x40      0x08          // Pointer back to head block
0x48      0x32          // Link to next item in a chain of something
The point is that the allocated block contains a lot more information than just the int you asked for, in order to handle memory management.
The standard does not specify how this is done because (in C/C++ style) the committee did not want to impinge on the compiler/library implementers' ability to use the most efficient memory management method for their architecture.
Taking this into account, you want to give the implementer the ability to distinguish array allocation/deallocation from normal allocation/deallocation, so that each can be made as efficient as possible independently. As a result, you cannot mix and match, as internally they may use different data structures.
If you actually analyse the memory allocation patterns of C and C++ applications, you find that they are very different, and thus it is not unreasonable to use completely different memory management techniques optimised for the application type. This is another reason to prefer new over malloc() in C++, as it will probably be more efficient (though the more important reason will always be reducing complexity, IMO).
I know that delete [] will destroy all the array elements and then release the memory.
I initially thought the compiler needs it just to call the destructor for every element in the array, but I also have a counter-argument to that, which is:
The heap memory allocator must know the number of bytes allocated, and using sizeof(Type) it should be possible to find the number of elements and call the appropriate number of destructors for the array to prevent resource leaks.
Is my assumption correct or not? Please clear up my doubt about it.
So what I am not getting is the purpose of the [] in delete [].
Scott Meyers says in his Effective C++ book: Item 5: Use the same form in corresponding uses of new and delete.
The big question for delete is this: how many objects reside in the memory being deleted? The answer to that determines how many destructors must be called.
Does the pointer being deleted point to a single object or to an array of objects? The only way for delete to know is for you to tell it. If you don't use brackets in your use of delete, delete assumes a single object is pointed to.
Also, the memory allocator might allocate more space than required to store your objects, and in that case dividing the size of the memory block returned by the size of each object won't work.
Depending on the platform, the _msize (windows), malloc_usable_size (linux) or malloc_size (osx) functions will tell you the real length of the block that was allocated. This information can be exploited when designing growing containers.
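For example, on Linux with glibc (non-portable, and the value reflects an implementation detail, so treat it as informational only):

#include <cstdio>
#include <cstdlib>
#include <malloc.h>   // glibc-specific header for malloc_usable_size

int main()
{
    void* p = std::malloc(40);
    // May print "requested 40, usable 40" or a larger number,
    // depending on how the allocator rounds its block sizes.
    std::printf("requested 40, usable %zu\n", malloc_usable_size(p));
    std::free(p);
}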
Another reason why it won't work is that Foo* foo = new Foo[10] calls operator new[] to allocate the memory, and delete [] foo; then calls operator delete[] to deallocate it. As those operators can be overloaded, you have to adhere to the convention; otherwise delete foo; calls operator delete, which may have an implementation incompatible with operator delete []. It's a matter of semantics, not just of keeping track of the number of allocated objects so the right number of destructor calls can be issued later.
See also:
[16.14] After p = new Fred[n], how does the compiler know there are n objects to be destructed during delete[] p?
Short answer: Magic.
Long answer: The run-time system stores the number of objects, n, somewhere where it can be retrieved if you only know the pointer, p. There are two popular techniques that do this. Both these techniques are in use by commercial-grade compilers, both have tradeoffs, and neither is perfect. These techniques are:
Over-allocate the array and put n just to the left of the first Fred object.
Use an associative array with p as the key and n as the value.
EDIT: after having read #AndreyT's comments, I dug into my copy of Stroustrup's "The Design and Evolution of C++" and excerpted the following:
How do we ensure that an array is correctly deleted? In particular, how do we ensure that the destructor is called for all elements of an array?
...
Plain delete isn't required to handle both individual objects and arrays. This avoids complicating the common case of allocating and deallocating individual objects. It also avoids encumbering individual objects with the information necessary for array deallocation.
An intermediate version of delete[] required the programmer to specify the number of elements of the array.
...
That proved too error prone, so the burden of keeping track of the number of elements was placed on the implementation instead.
As #Marcus mentioned, the rationale may have been "you don't pay for what you don't use".
EDIT2:
In "The C++ Programming Language, 3rd edition", §10.4.7, Bjarne Stroustrup writes:
Exactly how arrays and individual objects are allocated is implementation-dependent. Therefore, different implementations will react differently to incorrect uses of the delete and delete[] operators. In simple and uninteresting cases like the previous one, a compiler can detect the problem, but generally something nasty will happen at run time.
The special destruction operator for arrays, delete[], isn’t logically necessary. However, suppose the implementation of the free store had been required to hold sufficient information for every object to tell if it was an individual or an array. The user could have been relieved of a burden, but that obligation would have imposed significant time and space overheads on some C++ implementations.
The main reason why it was decided to keep delete and delete[] separate is that these two entities are not as similar as they might seem at first sight. To a naive observer they might appear to be almost the same: just destruct and deallocate, with the only difference being the potential number of objects to process. In reality, the difference is much more significant.
The most important difference between the two is that delete might perform polymorphic deletion of objects, i.e. the static type of the object in question might be different from its dynamic type. delete[] on the other hand must deal with strictly non-polymorphic deletion of arrays. So, internally these two entities implement logic that is significantly different and non-intersecting between the two. Because of the possibility of polymorphic deletion, the functionality of delete is not even remotely the same as the functionality of delete[] on an array of 1 element, as a naive observer might incorrectly assume initially.
Contrary to the strange claims made in some other answers, it is, of course, perfectly possible to replace delete and delete[] with a single construct that would branch at a very early stage, i.e. it would determine the type of the memory block (array or not) using the housekeeping information stored by new/new[], and then jump to the appropriate functionality, equivalent to either delete or delete[]. However, this would be a rather poor design decision, since, once again, the functionality of the two is too different. Forcing both into a single construct would be akin to creating a Swiss Army Knife of a deallocation function. Also, in order to be able to tell an array from a non-array we'd have to introduce an additional piece of housekeeping information even into single-object memory allocations done with plain new, which could easily result in noticeable memory overhead for single-object allocations.
But, once again, the main reason here is the functional difference between delete and delete[]. These language entities possess only apparent skin-deep similarity that exists only at the level of naive specification ("destruct and free memory"), but once one gets to understand in detail what these entities really have to do one realizes that they are too different to be merged into one.
P.S. This is BTW one of the problems with the suggestion about sizeof(type) you made in the question. Because of the potentially polymorphic nature of delete, you don't know the type in delete, which is why you can't obtain any sizeof(type). There are more problems with this idea, but that one is already enough to explain why it won't fly.
The heap itself knows the size of an allocated block - you only need the address. Look at how free() works - you only pass the address and it frees the memory.
The difference between delete (and delete[]) and free() is that the former two first call the destructors and then free the memory (possibly using free()). The problem is that delete[] also takes only one argument - the address - and with only that address it needs to know the number of objects to run destructors on. So new[] uses some implementation-defined way of recording the number of elements somewhere - usually it prepends the array with the number of elements. delete[] then relies on that implementation-specific data to run the destructors and then frees the memory (again, using only the block address).
delete[] just calls a different implementation (function);
There's no reason an allocator couldn't track it (in fact, it would be easy enough to write your own).
I don't know the reason they did not track it, or the history of the implementation. If I were to guess, many of these 'well, why wasn't this slightly simpler?' questions (in C++) come down to one or more of:
compatibility with C
performance
In this case, performance. Using delete vs delete[] is easy enough, and I believe it could all be abstracted from the programmer and still be reasonably fast (for general use). delete[] only requires a few additional function calls and operations (omitting destructor calls), but that cost would be paid on every call to delete, and it is unnecessary because the programmer generally knows the type he or she is dealing with (if not, there's likely a bigger problem at hand). So it simply avoids going through the allocator. Additionally, single allocations may not need to be tracked by the allocator in as much detail; treating every allocation as an array would require extra bookkeeping for a count even in trivial allocations. So it is several layers of allocator implementation simplifications, which actually matter to many people, considering it is a very low-level domain.
This is more complicated.
The keyword and the convention of using it to delete an array were invented for the convenience of implementations, and some implementations do use it (I don't know which, though; MS VC++ does not).
The convenience is this:
In all other cases, you know the exact size to be freed by other means. When you delete a single object, you can have the size from compile-time sizeof(). When you delete a polymorphic object by base pointer and you have a virtual destructor, you can have the size as a separate entry in vtbl. If you delete an array, how would you know the size of memory to be freed, unless you track it separately?
The special syntax would allow tracking such size only for an array - for instance, by putting it before the address that is returned to the user. This takes up additional resources and is not needed for non-arrays.