Array of contiguous new objects - C++

I am currently filling a vector of elements like so:
std::vector<T*> elemArray;
for (size_t i = 0; i < elemArray.size(); ++i)
{
    elemArray[i] = new T();
}
The code has obviously been simplified. Now, after asking another question (unrelated to this problem but related to the program), I realized I need an array of new'd objects (they can't be on the stack; there are too many elements and it would overflow) that are nonetheless contiguous. That is, if I were to receive an element without its array index, I should be able to find the index by computing returnedElement - elemArray[0].
I hope I have explained the problem; if not, please let me know which parts and I will attempt to clarify.
EDIT: I am not sure why the highest-voted answer is not being looked into. I have tried this many times. If I try allocating a vector like that with more than approximately 100,000 elements, it always gives me a memory error. Secondly, I require pointers, as is clear from my example. Changing the code to not use pointers would require a large rewrite (although I am willing to do that), and it still does not address the issue that allocating a vector like that with a few million elements does not work.

A std::vector<> stores its elements in a heap-allocated array; it won't store the elements on the stack. So you won't get a stack overflow even if you do it the simple way:
std::vector<T> elemArray;
for (size_t i = 0; i < elemCount; ++i) {
    elemArray.push_back(T(i));
}
&elemArray[0] will be a pointer to a (contiguous) array of T objects.

If you need the elements to be contiguous, not the pointers, you can just do:
std::vector<T> elemArray(numberOfElements);
The elements themselves won't be on the stack; the vector manages the dynamic allocation of memory, and as in your example the elements will be value-initialized. (Strictly, copy-initialized from a value-initialized temporary, but this should work out the same for objects that it is valid to store in a vector.)
I believe that your index calculation should be &returnedElement - &elemArray[0], and this will work with a vector, provided that returnedElement is actually stored in elemArray.
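For illustration, here is a minimal sketch of that index recovery with a contiguous vector (the element type T and the count here are placeholders):
#include <cstddef>
#include <iostream>
#include <vector>

struct T { int value; };

int main()
{
    std::vector<T> elemArray(1000000); // elements live contiguously on the heap

    T& returnedElement = elemArray[42]; // pretend this came back from elsewhere

    // pointer subtraction on contiguous storage yields the index
    std::size_t index = &returnedElement - &elemArray[0];
    std::cout << index << '\n'; // prints 42
}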

Your original loop should look something like this (though it doesn't create the objects in contiguous memory):
for (size_t i = 0; i < someSize; ++i)
{
    elemArray.push_back(new T());
}
And then you should know two basic things here:
elemArray.size() returns the number of elements elemArray currently holds. That means if you use it in the for loop's condition while push_back'ing in the body, your for loop becomes an infinite loop: you keep adding elements, so the vector's size keeps increasing.
elemArray is a vector of T*, so you can only store T* in it, and to populate it you have to use the push_back function (see the sketch below).
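Putting both points together, a minimal sketch (someSize and T are placeholder names; note that the vector will not delete the objects for you):
#include <cstddef>
#include <vector>

struct T { };

int main()
{
    const std::size_t someSize = 1000;

    std::vector<T*> elemArray;
    elemArray.reserve(someSize); // optional: avoids reallocations of the pointer array
    for (std::size_t i = 0; i < someSize; ++i)
    {
        elemArray.push_back(new T()); // loop on a fixed bound, not elemArray.size()
    }

    // ... use elemArray ...

    for (T* p : elemArray) // the vector only destroys the pointers, not the objects
        delete p;
}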

Considering that your old code caused a stack overflow, I think it probably looked like this:
Item items[2000000]; // stack overflow
If that is the case, then you can use the following syntax:
std::vector<Item> items(2000000);
This will allocate (and construct) the items contiguously on the heap.

I also have the same requirement, but for a different reason: mainly to keep the objects cache-hot (for a small number of objects). One option:
int maxelements = 100000;
T *obj = new T[maxelements];
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
    vectorofptr.push_back(&obj[i]);
}
or
int sz = sizeof(T);
int maxelements = 100000;
void *base = calloc(maxelements, sz); // need to save base for free()
vector<T *> vectorofptr;
for (int i = 0; i < maxelements; i++)
{
    // note: pointer arithmetic on T* already advances in units of
    // sizeof(T), so adding a byte offset here would overshoot
    vectorofptr.push_back((T *) base + i);
}
Note that calloc only zero-fills the memory; it does not run T's constructor, so this second variant is only suitable for trivially constructible types (otherwise combine it with placement new, as in the answer below).

Allocate a large chunk of memory and use placement new to construct your vector elements in it:
elemArray[i] = new (GetNextContinuousAddress()) T();
(assuming that you really need pointers to individually new'd objects in your array and are aware of the reasons why this is not recommended ..)
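As a minimal sketch of that approach, assuming a plain char arena and a trivially destructible T (GetNextContinuousAddress above is the answer's hypothetical helper; here the address is computed inline):
#include <cstddef>
#include <new>
#include <vector>

struct T { int value = 0; };

int main()
{
    const std::size_t count = 1000000;

    // one contiguous arena; new char[] is suitably aligned for this T
    char* buffer = new char[count * sizeof(T)];

    std::vector<T*> elemArray;
    elemArray.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        elemArray.push_back(new (buffer + i * sizeof(T)) T()); // construct in place

    // index recovery works because the objects themselves are contiguous:
    // std::size_t idx = elemArray[i] - elemArray[0];

    for (std::size_t i = 0; i < count; ++i)
        elemArray[i]->~T(); // explicit destructor call; never delete these pointers
    delete[] buffer;
}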

I know this is an old post, but it might be useful to anyone who needs it at some point. Storing pointers in a sequence container is perfectly fine, as long as you keep in mind a few important caveats:
the objects your stored pointers point to won't be contiguous in memory, since they were allocated (dynamically or not) elsewhere in your program; you will end up with a contiguous array of pointers pointing to data likely scattered across memory.
since STL containers work with value copy and not reference copy, your container won't take ownership of the allocated data: on destruction it will only destroy the pointer objects, not the objects they point to. This is fine when those objects were not dynamically allocated; otherwise you will need to provide a release mechanism elsewhere for the pointed-to objects, such as simply looping through your pointers and deleting them individually, or using shared pointers, which will do the job for you when required (see the sketch after this list) ;-)
one last but very important thing to remember: if your container is to be used in a dynamic environment with possible insertions and deletions at random places, you need to make sure to use a stable container, so that your iterators/pointers to stored elements remain valid after such operations; otherwise you will end up with undesirable side effects. This is the case with std::vector, which will release and reallocate extra space when inserting new elements and then copy the shifted elements (unless you insert at the end), thus invalidating element pointers/iterators. Some implementations provide stability, like boost::stable_vector, but the penalty is that you lose the contiguity of the container; magic does not exist when programming, life is unfair, right? ;-)
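For the ownership caveat above, a minimal sketch using std::shared_ptr (T is a placeholder type), so the container cleans up the pointed-to objects automatically:
#include <memory>
#include <vector>

struct T { int value = 0; };

int main()
{
    std::vector<std::shared_ptr<T>> elems;
    for (int i = 0; i < 1000; ++i)
        elems.push_back(std::make_shared<T>());

    // no manual delete loop needed: when elems is destroyed, each
    // shared_ptr releases its T (the objects themselves are still
    // scattered on the heap, not contiguous)
}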
Regards

Related

Problem in Deleting the elements of an array allocated with new[]

This question is similar to Problem with delete[], how to partially delete the memory?
I understand that deleting an array after incrementing its pointer is not possible, as it loses track of how many bytes to clean up. But I am not able to understand why one-by-one deletion/deallocation of a dynamic array's elements doesn't work either.
int main()
{
    int n = 5;
    int *p = new int[n];
    for (int i = 0; i < n; ++i) {
        delete &p[i];
    }
}
I believe this should work, but in clang 12.0 it fails with an invalid pointer error. Can anyone explain why?
An array is a contiguous object in memory of a specific size. It is one object in which you can place your data, and therefore you can only free/delete it as one object.
You are thinking that an array is a list of multiple objects, but that's not true. That would be true for something like a linked list, where you allocate individual objects and link them together.
You allocated one object of the type int[n] (one extent of memory for an array) using the operator new:
int *p = new int[n];
The elements of the array were not each dynamically allocated on their own.
So to delete it you just need to write
delete []p;
If for example you allocated an array of pointers like
int **p = new int *[n];
and then for each pointer of the array you allocated an object of the type int like
for ( int i = 0; i < n; ++i )
{
    p[i] = new int( i );
}
then to delete all the allocated objects you need to write
for ( int i = 0; i < n; ++i )
{
    delete p[i];
}
delete []p;
That is, calls to operator delete or delete[] correspond one-to-one with calls to operator new or new[].
One new always goes with one delete. Just like that.
In detail, when we request an array using new, what we actually get is a pointer that controls a contiguous, fixed block of memory. Whatever we do with that array, we do it through that pointer, and this pointer is strictly associated with the array itself.
Furthermore, let's assume that you were able to delete an element in the middle of that array. After the deletion, the array would fall apart and its elements would no longer be contiguous; by then it would not really be an array!
Because of that, we cannot 'chop' an array into separate pieces. We must always treat an array as one thing, not as distinct elements scattered around memory.
Greatly simplifying: in most systems, memory is allocated in logical blocks described by the starting pointer of the allocated block.
So if you allocate an array:
int* array = new int[100];
The OS stores the information about that allocation as (simplifying) a pair (block_begin, size) -> (value of array ptr, 100).
Thus when you deallocate the memory you don't need to specify how much you allocated, i.e.:
// you use
delete[] array; // won't go into detail on why it's delete[] instead of delete - mostly it is due to the C++ way of handling destruction of objects
// instead of something like
delete[100] array;
In fact, in plain C you would do this with:
int* array = malloc(100 * sizeof(int));
[...]
free(array);
So in most OSes it is not possible, due to the way they are implemented.
Theoretically, allocating one large chunk could be implemented as allocating many smaller blocks, which could then be deallocated this way, but that would still deallocate smaller blocks at a time, not individual elements one by one.
All of new, new[], and even C's malloc do exactly the same thing with respect to memory: they request a fixed block of memory from the operating system.
You cannot split up this block of memory and return it partially to the operating system; that's simply not supported, so you cannot delete a single element from the array either. Only all or none…
If you need to remove an element from an array, all you can do is copy the subsequent elements one position toward the front, overwriting the element to delete, and additionally remember how many elements are actually valid; the elements at the end of the array stay alive!
If these need to be destructed immediately, you might call the destructor explicitly, and then ensure it isn't called again on an already-destructed element when delete[]'ing the array (otherwise undefined behaviour!). That ends in not calling new[] and delete[] at all, but instead using malloc, placement new for each element, std::launder for any pointer to any element created that way, and finally explicitly calling the destructor when needed.
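Here is a minimal sketch of that manual approach, assuming a trivially simple element type T (alignment and exception safety are glossed over):
#include <cstdlib> // std::malloc, std::free
#include <new>     // placement new, std::launder

struct T { int value; };

int main()
{
    const std::size_t n = 5;

    // raw storage instead of new[]: no constructors have run yet
    void* raw = std::malloc(n * sizeof(T));

    for (std::size_t i = 0; i < n; ++i)
        new (static_cast<char*>(raw) + i * sizeof(T)) T{ int(i) }; // construct in place

    T* arr = std::launder(static_cast<T*>(raw)); // pointer to the constructed elements

    // "remove" element 1: shift the tail one position toward the front,
    // then destroy the now-unused last slot and remember the live count
    for (std::size_t i = 1; i + 1 < n; ++i)
        arr[i] = arr[i + 1];
    arr[n - 1].~T();             // explicit destructor call
    std::size_t live = n - 1;

    for (std::size_t i = 0; i < live; ++i) // destroy the survivors
        arr[i].~T();
    std::free(raw);              // no delete[]: we never used new[]
}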
Sounds like a lot of hassle, doesn't it? Well, there's std::vector, which does all this for you! You should use it instead…
Side note: you could get similar behaviour if you use an array of pointers; you then can, and need to, manage (i.e. control the lifetime of) each object individually. Further disadvantages are an additional level of pointer indirection whenever you access the array members, and the members being scattered around memory (though this can turn into an advantage if you need to move objects around your array and copying/moving them is expensive). Still, you would prefer a std::vector, of pointers this time; insertions, deletions, and managing the pointer array itself, among other things, get much safer and much less complicated.

C++: How are pointers themselves handled regarding memory management?

I have a fairly simple question.
I have arrays which contain pointers to objects. I sometimes create mutated arrays from those arrays and use them only, let's say, within a method; afterwards I don't need them. In this case I don't want the pointed-to data to be destroyed, as I keep using the original array. What I don't fully understand is what happens to the pointers (not the data itself, but the pointers) that were created in my temporary array. How does memory deal with them? As far as I know, pointers can only point to an address; you can't "delete" them.
Can anyone give me more insight? All this time I feel like I'm doing something wrong with memory.
In this case list is my "bag", which is an object wrapper for an array implementation. However, since it contains gaps between indexes, I use getGapless to get a bag where the nullptr indexes are excluded.
I delete my bag at the end, but that doesn't delete the actual content (that is done with a different method).
So when do those pointers in my "players" bag go out of scope?
virtual void processEntities(artemis::ImmutableBag<artemis::Entity*>& bag)
{
    artemis::Bag<artemis::Entity*>* list =
        (artemis::Bag<artemis::Entity*>*)this->world->getGroupManager()->getEntities("HUMAN");
    if (list == nullptr) return; // kill function
    artemis::Bag<artemis::Entity*>* players = list->getGapless();
    for (int i = 0; i < players->getCount(); i++)
    {
        for (int j = i + 1; j < players->getCount(); j++)
        {
            if (intersects(*players->get(i), *players->get(j))) {
                std::cout << "Collide YEAH \n";
            }
        }
    }
    delete players;
}
Nope, don't worry! You can think of pointers as being managed the same way as ints or doubles (at least in terms of memory). The pointer itself is like an int that happens to contain the address of some other object or array of objects. Once the pointer goes out of scope, the memory for the pointer itself is automatically recovered.
The exception would be if you're doing something like int** p = new int*[1], i.e. creating pointers with new. Then you will at some point need to delete p.
If you're creating your pointers like int* p = new int[size]; (which is probably what you want), then p itself is on the stack, which means you don't need to concern yourself with memory deallocation, but the array p points to is on the heap which means you will need to deallocate it at some point.
Pointers are ordinary variables. They are not handled in any special way. There's no difference between pointer variables and integer variables in that respect, just as there's no difference between pointer arrays and integer arrays in that respect.
The memory management for all variables in the language is entirely up to you. If you declare a local variable, it is automatically destroyed when control goes out of its block. If you allocate/create objects dynamically, then you have to deallocate/destroy them explicitly. And so on. There's absolutely nothing special about pointers. They are just like any other variables.
Basically, it is not clear why you are even asking this question, since the issue your question seems to address does not really exist. Can you provide an example of what caused you to ask this?
Pointers just hold addresses, the same way an int holds an integer. If you instead had an array of ints and were using a mutated array based on it, then got rid of the mutated array, the original array would stay untouched; here it is really no different.
The values in the mutated array go away, but since they are copies (regardless of whether they are ints or pointers or anything else), this does not affect the original.
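A small sketch of that point (the Entity type here is a placeholder): copying and discarding a container of pointers never touches the pointed-to objects:
#include <vector>

struct Entity { int hp = 100; };

int main()
{
    std::vector<Entity*> original;
    for (int i = 0; i < 10; ++i)
        original.push_back(new Entity());

    {
        std::vector<Entity*> mutated(original); // copies only the pointer values
        // ... use mutated within this scope ...
    } // mutated's pointers are gone; the Entity objects are untouched

    for (Entity* e : original) // the objects still need exactly one delete each
        delete e;
}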

Reallocation in std::vector after std::vector.reserve()

I have a snippet of code where I first put some values into a std::vector and then give the address of each of them to one of the objects that will be using them, like this:
std::vector<useSomeObject> uso;
// uso is filled
std::vector<someObject> obj;
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    obj.push_back(someObject());
    obj.back().fillWithData();
}
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    uso[i].setSomeObject(&obj[i]);
}
// use objects via uso vector
// deallocate everything
Now, since I'm sometimes a bit of a style freak, I think this is ugly and would like to use only one for loop, like this:
for (int i = 0; i < numberOfDesiredObjects; ++i) {
    obj.push_back(someObject());
    obj.back().fillWithData();
    uso[i].setSomeObject(&obj.back());
}
Of course, I cannot do that, because reallocation happens occasionally and all the pointers I set become invalid.
So, my question is:
I know that std::vector::reserve() is the way to go if you know how much you will need and want to allocate the memory in advance. If I make sure to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Thank you.
Sidenote: this is a similar question, but there is no answer there to what I would like to know. Just to prevent it from popping up as a first comment to this question.
This is, in fact, one of the principal reasons for using reserve. You are guaranteed that appending to the end of a std::vector will not invalidate iterators, references, or pointers to elements in the vector as long as the new size of the vector does not exceed the old capacity.
If I make sure that I am trying to allocate enough memory in advance with reserve(), does that guarantee that my pointers will stay valid?
Yes, it guarantees your pointers will stay valid unless:
the size increases beyond the current capacity, or
you erase any elements (which you don't).
The iterator invalidation rules for a vector are specified in 23.2.4.3/1 and 23.2.4.3/3 as:
All iterators and references before the point of insertion are unaffected, unless the new container size is greater than the previous capacity (in which case all iterators and references are invalidated)
Every iterator and reference after the point of erase is invalidated.
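Based on that guarantee, a sketch of the single-loop version from the question; someObject, useSomeObject, and numberOfDesiredObjects are the question's placeholder names, stubbed out here so the example stands alone. The key line is the reserve() call before the loop:
#include <vector>

struct someObject {
    void fillWithData() {}
};
struct useSomeObject {
    void setSomeObject(someObject* p) { obj = p; }
    someObject* obj = nullptr;
};

int main()
{
    const int numberOfDesiredObjects = 1000000;
    std::vector<useSomeObject> uso(numberOfDesiredObjects);

    std::vector<someObject> obj;
    obj.reserve(numberOfDesiredObjects); // capacity fixed up front...
    for (int i = 0; i < numberOfDesiredObjects; ++i) {
        obj.push_back(someObject());     // ...so no push_back here reallocates
        obj.back().fillWithData();
        uso[i].setSomeObject(&obj.back()); // this pointer stays valid
    }
}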

Where exactly in memory is the count of allocated memory that's used by delete?

int* Array;
Array = new int[10];
delete[] Array;
delete knows the count of the allocated memory. I Googled it and learned that this count is stored in memory, but that where is compiler-dependent. Is there any way to get this count?
Actually, the heap knows how large each allocation is. However, that's not something you can access easily, and it is only guaranteed to be greater than or equal to the amount requested; sometimes more is allocated for the benefit of byte alignment.
As Ben said though, the implementation does know in certain circumstances how many objects are in the array so that their destructors can be called.
There is no standard way to retrieve the number of elements after construction. In fact, for an int array it is probably NOT stored anywhere. The count is only necessary for arrays of elements with non-trivial destructors, so that delete[] can call the right number of destructors. In your example there aren't any destructor calls, since int doesn't have a destructor.
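A small sketch illustrating that point: for a type with a non-trivial destructor, delete[] has to know how many destructors to run (the static counter is only for demonstration):
#include <iostream>

struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

int main()
{
    Tracked* arr = new Tracked[10];
    delete[] arr; // the implementation recorded the count: 10 destructors run
    std::cout << Tracked::destroyed << '\n'; // prints 10

    int* ints = new int[10];
    delete[] ints; // no destructors to call, so no count needs to be stored
}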
There's no way to get the number of elements of a dynamically allocated array after you allocate it.
The one way to rule them all
However, you can store it beforehand:
int* Array;
size_t len = 10;
Array = new int[len];
delete[] Array;
Custom class
If you don't like that, you could create your own class:
class IntArray
{
public:
    int* data;
    size_t length;

    IntArray(size_t);
    ~IntArray();
};

IntArray::IntArray(size_t len)
{
    length = len;
    data = new int[len];
}

IntArray::~IntArray()
{
    length = 0;
    delete[] data; // must be delete[], matching new[]
    data = NULL;
}
std::vector
The method I recommend is to use std::vector:
std::vector<int> Array (10, 0);
You can use it just like a regular array... with extra features:
for (size_t i = 0; i < Array.size(); ++i)
    Array[i] = i;
There are likely one or two counts of the number of elements in such an allocation, depending upon the type and the implementation you are using, though you can't really access them in the way you probably want.
The first is the accounting information stored by the actual memory manager you are using (the library that provides malloc). It records that a block of some size has been allocated in the free store of the system (both heap and anonymous memory allocation are possible with glibc's malloc, for example). This space will be at least as large as the data you are trying to store (sizeof(int) * count + delta, where delta is the C++ compiler's tracking information I talk about below), but it could also be larger, even significantly so.
The second count is a value kept by the compiler that tells it how to call destructors on all the elements in the array (the whole magic of RAII), but that value is not accessible, and the compiler could probably even manage without directly storing it, though that would be unlikely.
As others have said, if you need to track the allocation size you probably want to use a vector; you can even use it as an actual array for the purposes of pointer math if need be (see http://www.cplusplus.com/reference/stl/vector/ for more on this).
Who says that there actually is one?
This depends on the implementation, and as such is uninteresting for you, me, or anyone else who wants to know it.
C++ generally, and intentionally, doesn't give you access to that information, because arrays are simple types that do not keep it associated with them. Ultimately the information must be stored, but the standard leaves the compiler free to figure out how, where, and when, to allow for optimization in the generated assembly.
Basically, either store it yourself somewhere, or (better, most of the time), use std::vector.
No, you need to keep track of it yourself if you need to know.
Many people like using a std::vector if it's not a fixed size. std::vector keeps track of the size allocated for you.

How is dynamic memory managed in std::vector?

How does std::vector implement the management of a changing number of elements: does it use the realloc() function, or does it use a linked list?
Thanks.
It uses the allocator that was given to it as the second template parameter. It could work like this: say we are in push_back, and let t be the object to be pushed:
...
if (_size == _capacity) { // size is never greater than capacity
    // reallocate: double the capacity, treating empty as a special case
    size_type _capacity1 = _capacity ? _capacity * 2 : 1;
    T* _begin1 = alloc.allocate(_capacity1, 0);
    // copy-construct items (copy over from the old location)
    for (size_type i = 0; i < _size; i++)
        alloc.construct(_begin1 + i, *(_begin + i));
    alloc.construct(_begin1 + _size, t);
    // destroy the old ones. dtors are not allowed to throw here;
    // if they do, behavior is undefined (17.4.3.6/2)
    for (size_type i = 0; i < _size; i++)
        alloc.destroy(_begin + i);
    alloc.deallocate(_begin, _capacity);
    // commit the new state, after everything worked out nicely
    _begin = _begin1;
    _capacity = _capacity1;
} else { // size less than capacity
    // tell the allocator to construct an object at the right
    // place in the previously allocated memory
    alloc.construct(_begin + _size, t);
}
_size++; // now we hold one more item
...
Something like that. The allocator takes care of allocating memory. It keeps the steps of allocating memory and constructing objects in that memory separate, so it can preallocate memory without yet calling constructors. During reallocation, the vector has to take care of exceptions thrown by copy constructors, which complicates matters somewhat. The above is just a pseudo-code snippet, not real code, and probably contains many bugs. If the size rises above the capacity, the vector asks the allocator for a new, larger block of memory; if not, it just constructs at the previously allocated space.
The exact semantics of this depend on the allocator. If it is the standard allocator, construct will do
new ((void*)(_start + n)) T(t); // known as "placement new"
And allocate will just get memory from ::operator new. destroy would call the destructor:
(_start + n)->~T();
All of that is abstracted behind the allocator, and the vector just uses it. A stack or pooling allocator could work completely differently. Some key points about vector that are important:
After a call to reserve(N), you can have up to N items inserted into your vector without risking a reallocation. Until then, that is, as long as size() <= capacity(), references and iterators to its elements remain valid.
Vector's storage is contiguous. You can treat &v[0] as a buffer containing as many elements as you currently have in your vector.
One of the hard-and-fast rules of vectors is that the data will be stored in one contiguous block of memory.
That way you know you can theoretically do this:
const Widget* pWidgetArrayBegin = &(vecWidget[0]);
You can then pass pWidgetArrayBegin into functions that want an array as a parameter.
The only exception to this is the std::vector<bool> specialisation. It doesn't actually store bools at all, but that's another story.
So the std::vector will reallocate the memory, and will not use a linked list.
This means you can shoot yourself in the foot by doing this:
Widget* pInteresting = &(vecWidget.back());
vecWidget.push_back(anotherWidget);
For all you know, the push_back call could have caused the vector to shift its contents to an entirely new block of memory, invalidating pInteresting.
The memory managed by std::vector is guaranteed to be contiguous, such that you can treat &vec[0] as a pointer to the beginning of a dynamic array.
Given this, how it actually manages its reallocations is implementation-specific.
std::vector stores its data in contiguous memory blocks.
Suppose we declare a vector as
std::vector<int> intvect;
Initially, memory for x elements will be allocated, where x is implementation-dependent.
If the user inserts more than x elements, a new memory block of 2x elements (twice the size) is allocated, and the initial vector's contents are copied into it.
That is why it is always recommended to reserve memory for a vector up front by calling the reserve function:
intvect.reserve(100);
so as to avoid the deletion and copying of vector data.
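As a closing sketch, this makes the reallocations visible; the growth factor you observe is implementation-dependent (1.5x and 2x are common):
#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t lastCapacity = 0;

    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != lastCapacity) { // a reallocation happened
            lastCapacity = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << lastCapacity << '\n';
        }
    }

    std::vector<int> w;
    w.reserve(1000);    // one allocation up front
    for (int i = 0; i < 1000; ++i)
        w.push_back(i); // no reallocations, no copying
}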