C++ Flat Array vs Multidimensional Array Memory Footprint

In C++, the delete[] operator deletes an array. It is able to access the length of the array because the allocator keeps track of it.
Does that mean that a flattened one-dimensional array takes up less memory than a multi-dimensional array?
To be more specific: if I allocate an Object** c, does the allocator store the lengths of both the first and the second dimension, whereas allocating an Object* c (with the same total number of elements as the two-dimensional array) stores only one length?

If you do this:
Object **c = new Object*[n];
for (size_t i = 0; i != n; ++i) {
    c[i] = new Object[m];
}
then it will typically take more memory than doing this:
Object *c = new Object[n*m];
for just the reasons you stated.
Every memory allocation has a certain amount of overhead. In addition to needing to keep the number of elements, there is overhead for the memory allocator itself. It also takes more memory for all the extra pointers for each row.
Note that it is possible to have a situation where breaking it up would use less memory. If your heap was fragmented, then finding one large chunk of memory may require allocating more memory from the operating system, whereas if your array was broken into smaller pieces, those pieces may be able to fit in the holes of your fragmented heap.
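That said, if you do flatten, remember that indexing becomes manual. A minimal sketch of the flattened layout with row-major indexing (the dimensions and the Object stand-in are made up for illustration):

#include <cstddef>

struct Object { int value; }; // stand-in for the real class

int main()
{
    std::size_t n = 4, m = 8; // example dimensions

    // one allocation, so the allocator tracks only one length
    Object* c = new Object[n * m];

    // element (i, j) lives at row-major offset i*m + j
    std::size_t i = 2, j = 5;
    c[i * m + j].value = 42;

    delete[] c; // one delete[] matches the one new[]
}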

Related

Problem in Deleting the elements of an array allocated with new[]

This question is similar to Problem with delete[], how to partially delete the memory?
I understand that deleting an array after incrementing its pointer is not possible, since the runtime then loses track of how many bytes to clean up. But I am not able to understand why deleting the elements of a dynamic array one by one doesn't work either.
int main()
{
    int n = 5;
    int *p = new int[n];
    for (int i = 0; i < n; ++i) {
        delete &p[i];
    }
}
I believe this should work, but with Clang 12.0 it fails with an invalid-pointer error. Can anyone explain why?
An array is a contiguous object in memory of a specific size. It is one object in which you place your data, and therefore you can only free/delete it as one object.
You are thinking of an array as a list of multiple objects, but that's not true. That would be true for something like a linked list, where you allocate individual objects and link them together.
You allocated one object of the type int[n] (one extent of memory for an array) using the operator new
int *p = new int[n];
The elements of the array were not allocated dynamically as separate objects.
So to delete it you just need to write
delete []p;
If for example you allocated an array of pointers like
int **p = new int *[n];
and then for each pointer of the array you allocated an object of the type int like
for ( int i = 0; i < n; ++i )
{
    p[i] = new int( i );
}
then to delete all the allocated objects you need to write
for ( int i = 0; i < n; ++i )
{
    delete p[i];
}
delete []p;
That is, the calls to operator delete or delete[] must correspond one-to-one with the calls to operator new or new[].
One new always goes with one delete. It is as simple as that.
In detail, when we request an array using new, what we actually get is a pointer that controls a contiguous, fixed block of memory. Whatever we do with that array, we do it through that pointer, and this pointer is strictly associated with the array itself.
Furthermore, suppose you were able to delete an element in the middle of that array. After the deletion the array would fall apart, and its elements would no longer be contiguous. By then it would not really be an array!
Because of that, we cannot 'chop off' pieces of an array. We must always treat an array as one thing, not as distinct elements scattered around the memory.
Greatly simplifying: in most systems memory is allocated in logical blocks which are described by the starting pointer of the allocated block.
So if you allocate an array:
int* array = new int[100];
the OS stores the information about that allocation as a pair (simplifying): (block_begin, size) -> (value of the array pointer, 100)
Thus when you deallocate the memory you don't need to specify how much memory you allocated, i.e.:
// you use
delete[] array; // won't go into detail why you do delete[] instead of delete - mostly it is due to C++ way of handling destruction of objects
// instead of
delete[100] array;
In fact, in plain C you would do this with:
int* array = malloc(100 * sizeof(int));
[...]
free(array);
So in most OSes it is not possible, due to the way they are implemented.
Theoretically, allocating a large chunk of memory could in fact allocate many smaller blocks, which could then be deallocated this way, but that would still deallocate smaller blocks at a time, not element by element.
new, new[], and even C's malloc all do exactly the same thing with respect to memory: they request a fixed block of memory from the operating system.
You cannot split up this block of memory and return it partially to the operating system; that's simply not supported, so you cannot delete a single element from the array either. Only all or none…
If you need to remove an element from an array, all you can do is copy the subsequent elements one position towards the front, overwriting the element to delete, and additionally remember how many elements are actually valid – the elements at the end of the array stay alive!
If those need to be destructed immediately, you might call the destructor explicitly – and then make sure it isn't called again on an already-destructed element when delete[]ing the array (otherwise: undefined behaviour!). That ends in not calling new[] and delete[] at all, but instead using malloc, placement new for each element, std::launder on any pointer to an element created that way, and finally explicit destructor calls when needed.
Sounds like a lot of hassle, doesn't it? Well, there's std::vector doing all this stuff for you! You should use it instead…
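For illustration, a minimal sketch of the safe shifting variant (names are made up; it uses move assignment, so the tail element stays alive and a later delete[] remains well-defined):

#include <cstddef>
#include <utility>

// remove the element at `index` from the first `count` valid slots of a
// new[]-allocated buffer by shifting the tail one position to the front
template <typename T>
void remove_at(T* data, std::size_t& count, std::size_t index)
{
    for (std::size_t i = index + 1; i < count; ++i)
        data[i - 1] = std::move(data[i]); // overwrite towards the front
    --count; // the last slot stays alive (moved-from) and is destroyed
             // later by the usual delete[]
}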
Side note: You could get similar behaviour if you use an array of pointers; you then can – and need to – maintain each object individually (i.e. control its lifetime). Further disadvantages are an additional level of pointer indirection whenever you access the array members, and the members indeed being scattered around the memory (though this can turn into an advantage if you need to move objects around your array and copying/moving them is expensive). Still, you would prefer a std::vector, of pointers this time; insertions, deletions, and managing the pointer array itself, among other things, become much safer and much less complicated.

Better method to increase the size of an array in C++ dynamically

Suppose I have a character array allocated with new. Now, if I want to modify the size of the array, which of the following two is the best method?
1. Use the realloc function, OR
2. Allocate a new array, copy the data from the old array to the new array, and then delete the old array.
Allocate the block using malloc (not new) and then use realloc. realloc knows how much free space is available after the block for expansion.
char *s2 = realloc(s, newSize); /* newSize: whatever size you need (hypothetical name) */
if (s2) {
    s = s2;
} else {
    free(s); /* realloc failed; the original block is still allocated */
    /* ... handle the error ... */
}
Most code I have seen doesn't correctly handle failure of realloc.
You can't portably apply realloc to a buffer allocated with new, hence only your second option is viable.
Consider switching to std::vector and std::string.
I think you are approaching the whole matter from the C perspective.
If you are dealing with an array, use a vector; it is dynamic and avoids the issues that you state in your question,
e.g.
std::vector<int> v(10); // allocates an array of 10 ints initialized to 0
v.push_back(42); // added another, so the vector now holds 11 elements
The first option also implies copying if realloc finds that there is not enough free space after the block. (And realloc is only an option at all if you used malloc to allocate the array in the first place.)
However, if you double the size of the array on each reallocation (or multiply its size by any constant > 1), the operation "increase the array by one" takes constant time on average. Search for "amortized constant time".
Assume an array of initial size n. Reallocating to the new size 2n and copying costs 2n steps, but the next n "increase" operations are free.
By the way, this is how std::vector and many other array containers are implemented internally.
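As a rough sketch of that doubling strategy (made-up names, exception safety ignored, and T is assumed default-constructible because of new T[]):

#include <cstddef>
#include <utility>

template <typename T>
struct DynArray {
    T* data = nullptr;
    std::size_t size = 0, capacity = 0;

    void push_back(T value) {
        if (size == capacity) {
            std::size_t newCap = capacity ? capacity * 2 : 1; // constant factor > 1
            T* bigger = new T[newCap];
            for (std::size_t i = 0; i < size; ++i)
                bigger[i] = std::move(data[i]); // the 2n-step copy happens rarely
            delete[] data;
            data = bigger;
            capacity = newCap;
        }
        data[size++] = std::move(value); // the common, constant-time path
    }

    ~DynArray() { delete[] data; }
};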

Vector of vectors, heap versus stack (C++)

I wanted to initialize a vector of vectors that contain pointers to Courses. I declared this:
std::vector<std::vector<Course*> > *CSPlan =
new std::vector<std::vector<Course*> >(smsNum);
What I wanted is a vector of vectors, where each inner vector contains pointers to Courses, and I wanted the MAIN vector to be of size smsNum. Furthermore, I wanted it on the heap.
My questions are:
Are both the main vector AND the inner vectors allocated on the heap? Or is only the MAIN vector on the heap, with its elements pointing to smaller vectors on the stack?
I declared it to be of size smsNum, so the main vector is of size 10, but what about the smaller vectors? Are they also of that size, or are they still dynamic?
My goal in the end is to have a vector of vectors, with both the main vector and the child vectors on the heap, where ONLY the main vector is of size smsNum while the rest are dynamic.
Any structure that can grow as large as the user wants is going to be allocated on the heap. The stack, on the other hand, is used for variables whose sizes the program knows statically, during compilation.
Since you can have a loop like this:
for (i = 0; i < your_value; i++) {
    vector.insert(...);
}
And considering your_value as an integer read from standard input, the compiler has no way of knowing how large your vector will be, i.e., it does not know the maximum number of inserts you may perform.
To solve this, the structure must be allocated on the heap, where it may grow as large as the OS allows – considering primary memory and swap. Additionally, if you use a pointer to the vector, you are simply dynamically allocating a variable that references the vector. This does NOT change the fact that the contents of the vector are, necessarily, allocated on the heap.
You'll have, in your stack:
a variable "x" that stores the address of a variable "y";
And in your heap:
the value of the variable "y", that is a reference to your vector of vectors;
the contents of your vector of vectors (accessed by "y", that is accessed by "x").
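A small sketch to make this concrete (smsNum = 10 is assumed, matching the question; note that you don't even need the new, since the element data ends up on the heap either way):

#include <cstddef>
#include <vector>

class Course; // as in the question

int main()
{
    const std::size_t smsNum = 10; // assumed value from the question

    // the vector object itself lives on the stack here, but its buffer --
    // and the buffers of all smsNum inner vectors -- live on the heap
    std::vector<std::vector<Course*>> CSPlan(smsNum);

    // each inner vector starts empty and grows dynamically on the heap
    CSPlan[0].reserve(4);
}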

Resizing Arrays - Difference between two execution blocks?

I have a function which grows an array when trying to add an element if it is full. Which of the execution blocks is better or faster?
I think my second block (commented out) may be wrong, because after doubling my array I then go back and point to the original.
When creating arrays, does the compiler look for a contiguous block of memory into which the array entirely fits? (On the stack or the heap? I don't fully understand which; though it is important for me to learn, it is irrelevant to the actual question.)
If so, would this mean using the second block could potentially overwrite other information by writing into adjacent memory? (Since the original would use 20 adjacent blocks of memory, and the latter 40.)
Or would it just mean the location of elements in my array would be split, causing poor performance?
void Grow()
{
    length *= 2; // double the size of our stack

    // create a temp pointer to this double-sized array
    int* tempStack = new int[length];

    // loop the same number of times as the original size
    for (int i = 0; i < (length / 2); i++)
    {
        // copy the elements from the original array to the temp one
        tempStack[i] = myStack[i];
    }

    delete[] myStack;    // delete the original pointer and free the memory
    myStack = tempStack; // make the original point to the new stack

    // Could do the following - but may not get a contiguous memory block,
    // causing overwritten data
#if 0
    int* tempStack = myStack; // create temp pointer to our current stack
    delete[] myStack;         // delete the original pointer and free memory
    myStack = new int[length *= 2]; // delete not required due to new?
    myStack = tempStack;
#endif
}
The second block wouldn't accomplish what you want at all.
When you do
myStack = new int[length *= 2];
then the system will return a pointer to wherever it happens to allocate the new, larger array.
You then reassign myStack to the old location (which you've already de-allocated!), which means you're pointing at memory that's not allocated (bad!) and you've lost the pointer to the new memory you just allocated (also bad!).
Edit: To clarify, your array will be allocated on the heap. Additionally, the (new) pointer returned by your larger array allocation (new int[foo]) will be a contiguous block of memory, like the old one, just probably in a different location. Unless you go out of bounds, don't worry about "overwriting" memory.
Your second block is incorrect because of this sequence:
int* tempStack = myStack; //create temp pointer to our current stack
delete[] myStack; //delete the original pointer and free memory
tempStack and myStack are both simply pointers to the same block of memory. When you delete[] the pointer in the second line, you no longer have access to that memory via either pointer.
Using C++ memory management, if you want to grow an array, you need to create a new array before you delete the old one and copy over the values yourself.
That said, since you are working with POD, you could use C style memory management which supports directly growing an array via realloc. That can be a bit more efficient if the memory manager realizes it can grow the buffer without moving it (although if it can't grow the buffer in place, it will fall back on the way you grow your array in your first block).
C style memory management is only okay for arrays of POD. For non-POD, you must do the create new array/copy/delete old array technique.
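A hedged sketch of that C-style approach (names are made up; this is only valid if the buffer came from malloc/realloc, never from new[]):

#include <cstdlib>

// grow a malloc'd int buffer to twice its length; returns false on failure,
// in which case the original buffer is left untouched and still valid
bool grow(int*& stack, std::size_t& length)
{
    std::size_t newLength = length * 2;
    void* tmp = std::realloc(stack, newLength * sizeof(int));
    if (tmp == nullptr)
        return false;               // old block untouched, nothing leaked
    stack = static_cast<int*>(tmp); // realloc may have moved the block
    length = newLength;
    return true;
}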
This doesn't exactly answer your question, but you shouldn't be doing either one. Generally new[] or delete[] should be avoided in favor of using std::vector. new[] is hard to use because it requires explicit memory management (if an exception is thrown as the elements are being copied, you will need to catch the exception and delete the array to avoid a memory leak). std::vector takes care of this for you, automatically grows itself, and is likely to have an efficient implementation tuned by the vendor.
One argument for using explicit arrays is to have a contiguous block of memory that can be passed to C functions, but that also can be done with std::vector for any non-pathological implementation (and the next version of the C++ standard will require all conforming implementations to support that). (For reference, see http://www.gotw.ca/publications/mill10.htm by Herb Sutter, former convener of the ISO C++ standards committee.)
Another argument against std::vector is the weirdness with std::vector<bool>, but if you need that you can simply use std::vector<char> or std::vector<int> instead. (See: http://www.gotw.ca/publications/mill09.htm)

How is dynamic memory managed in std::vector?

How does std::vector implement the management of the changing number of elements: does it use the realloc() function, or does it use a linked list?
Thanks.
It uses the allocator that was given to it as the second template parameter. Roughly like this: say we are in push_back, and let t be the object to be pushed:
...
if (_size == _capacity) { // size is never greater than capacity
    // reallocate: get a larger block from the allocator
    T* _begin1 = alloc.allocate(_capacity * 2, 0);
    size_type _capacity1 = _capacity * 2;

    // copy-construct the items (copy over from the old location)
    for (size_type i = 0; i < _size; i++)
        alloc.construct(_begin1 + i, *(_begin + i));
    alloc.construct(_begin1 + _size, t);

    // destruct the old ones. dtors are not allowed to throw here;
    // if they do, behavior is undefined (17.4.3.6/2)
    for (size_type i = 0; i < _size; i++)
        alloc.destroy(_begin + i);
    alloc.deallocate(_begin, _capacity);

    // set the new state, after everything worked out nicely
    _begin = _begin1;
    _capacity = _capacity1;
} else { // size less than capacity
    // tell the allocator to construct an object at the right
    // memory place previously allocated
    alloc.construct(_begin + _size, t);
}
_size++; // now we hold one more item
...
Something like that. The allocator takes care of allocating memory. It keeps the steps of allocating memory and constructing objects in that memory apart, so it can preallocate memory but not yet call constructors. During reallocation, the vector has to take care of exceptions thrown by copy constructors, which complicates the matter somewhat. The above is just a pseudo-code snippet – not real code, and it probably contains many bugs. If the size would exceed the capacity, the vector asks the allocator for a new, greater block of memory; if not, it just constructs at the previously allocated space.
The exact semantics of this depend on the allocator. If it is the standard allocator, construct will do
new ((void*)(_start + n)) T(t); // known as "placement new"
And allocate will just get memory from ::operator new. destroy will call the destructor:
(_start + n)->~T();
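Putting those pieces together, here is a minimal self-contained sketch of the allocate/construct/destroy/deallocate cycle using std::allocator directly (through std::allocator_traits, which is how these calls are spelled since C++11):

#include <memory>
#include <string>

int main()
{
    std::allocator<std::string> alloc;
    using Traits = std::allocator_traits<decltype(alloc)>;

    std::string* p = Traits::allocate(alloc, 2); // raw memory, no constructors run
    Traits::construct(alloc, p, "hello");        // placement-construct element 0
    Traits::construct(alloc, p + 1, "world");    // placement-construct element 1

    Traits::destroy(alloc, p + 1);               // run the destructors explicitly
    Traits::destroy(alloc, p);
    Traits::deallocate(alloc, p, 2);             // release the raw memory
}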
All that is abstracted behind the allocator, and the vector just uses it. A stack or pooling allocator could work completely differently. Some key points about vector that are important:
After a call to reserve(N), you can have up to N items inserted into your vector without risking a reallocation. Until then – that is, as long as size() <= capacity() – references and iterators to its elements remain valid.
Vector's storage is contiguous. You can treat &v[0] as a buffer containing as many elements you have currently in your vector.
One of the hard-and-fast rules of vectors is that the data will be stored in one contiguous block of memory.
That way you know you can theoretically do this:
const Widget* pWidgetArrayBegin = &(vecWidget[0]);
You can then pass pWidgetArrayBegin into functions that want an array as a parameter.
The only exception to this is the std::vector<bool> specialisation. It actually isn't bools at all, but that's another story.
So the std::vector will reallocate the memory, and will not use a linked list.
This means you can shoot yourself in the foot by doing this:
Widget* pInteresting = &(vecWidget.back());
vecWidget.push_back(anotherWidget);
For all you know, the push_back call could have caused the vector to shift its contents to an entirely new block of memory, invalidating pInteresting.
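A minimal sketch of how reserve avoids exactly this trap (assuming nothing beyond two push_backs):

#include <vector>

int main()
{
    std::vector<int> v;
    v.reserve(2);       // capacity() >= 2: no reallocation for two push_backs
    v.push_back(1);
    int* p = &v.back(); // safe to keep...
    v.push_back(2);     // ...because this push_back cannot reallocate
    return *p;          // p is still valid and points at 1
}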
The memory managed by std::vector is guaranteed to be contiguous, such that you can treat &vec[0] as a pointer to the beginning of a dynamic array.
Given this, how it actually manages its reallocations is implementation-specific.
std::vector stores its data in one contiguous memory block.
Suppose we declare a vector as
std::vector<int> intvect;
Initially, memory for x elements will be created, where x is implementation-dependent.
If the user inserts more than x elements, a new memory block of 2x elements (twice the size) is created, and the initial contents are copied into this memory block.
That is why it is always recommended to reserve memory for the vector by calling the reserve function,
intvect.reserve(100);
so as to avoid the deallocation and copying of vector data.
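You can observe the (implementation-dependent) growth pattern with a small test program, for example:

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t cap = v.capacity();
    for (int i = 0; i < 100; ++i) {
        v.push_back(i);
        if (v.capacity() != cap) { // a reallocation just happened
            cap = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << cap << '\n';
        }
    }
}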