Memory management when using vector - c++

I am making a game engine and need to use the std::vector container for all of the components and entities in the game.
In a script the user might need to hold a pointer to an entity or component, perhaps to continuously check some kind of state. If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
Considering this issue i have a couple of possible solutions. After each push_back to the vector, would it be a viable to check if a current capacity variable is exceeded by the actual capacity of the vector? And if so, fetch and overwrite the old pointers to the new ones? Would this guarantee to "catch" every case that invalidates pointers when performing a push_back?
Another solution that i've found is to instead save an index to the element and access it that way, but i suspect that is bad for performance when you need to continuously check the state of that element (every 1/60 second).
I am aware that other containers do not have this issue but i'd really like to make it work with a vector. Also it might be worth noting that i do not know in advance how many entities / components there will be.
Any input is greatly appreciated.

You shouldn't worry about performance of std::vector when you access its element only 60 times per second. By the way, in Release compilation mode std::vector::operator[] is being converted to a single lea opcode. In Debug mode it is decorated by some runtime range checks though.

If the user is going to store pointers to the objects, why even contain them in a vector?
I don't feel like it is a good idea to (poor wording)->store pointers to objects in a vector. (what I meant is to create pointers that point to vector elements, i.e. my_ptr = &my_vec[n];) The whole point of a container is to reference the contents in the normal ways that the container supports, not to create outside pointers to elements of the container.
To answer your question about whether you can detect the allocations, yes you could, but it is still probably a bad idea to reference the contents of a vector by pointers to elements.
You could also reserve space in the vector when you create it, if you have some idea of what the maximum size might grow to. Then it would never resize.
edit:
After reading other responses, and thinking about what you asked, another thought occurred. If your vector is a vector of pointers to objects, and you pass out the pointers to the objects to your clients, resizing the vector does not invalidate the pointers that the vector hold. The issue becomes keeping track of the life of the object (who owns it), which is why using shared_ptr would be useful.
For example:
vector<shared_ptr> my_vec;
my_vec.push_back(stuff);
if you pass out the pointers contained in the vector to clients...
client_ptr = my_vec[3];
There will be no problem when the vector resizes. The contents of the vector will be preserved, and whatever was at my_vec[3] will still be there. The object pointed to by my_vec[3] will still be at the same address, and my_vec[3] will still contain that address. Whomever got a copy of the pointer at my_vec[3] will still have a valid pointer.
However, if you did this:
client_ptr = &my_vec[3];
And the client is dereferencing like this:
*client_ptr->whatever();
You have a problem. Now when my_vec resized, &my_vec[3] is probably no longer valid, thus client_ptr points to nowhere.

If something is added to the vector that the pointer points to and the
capacity is exceeded, it is my understanding that the vector will
allocate new memory and every pointer that points to any element in
the vector will become invalid.
I once wrote some code to analyze what happens when a vector's capacity is exceeded. (Have you done this, yet?) What that code demonstrated on my Ubuntu with g++v5 system was that std::vector code simply a) doubles the capacity, b) moves all the elements from old to the new storage, then c) cleans up the old. Perhaps your implementation is similar. I think the details of capacity expansion is implementation dependent.
And yes, any pointer into the vector would be invalidated when push_back() causes capacity to be exceeded.
1) I simply don't use pointers-into-the-vector (and neither should you). In this way the issue is completely eliminated, as it simply can not occur. (see also, dangling pointers) The proper way to access a std::vector (or a std::array) element is to use an index (via the operator[]() method).
After any capacity-expansion, the index of all elements at indexes less than the previous capacity limit are still valid, as the push_back() installed the new element at the 'end' (I think highest memory addressed.) The elements memory location may have changed, but the element index is still the same.
2) It is my practice that I simply don't exceed the capacity. Yes, by that I mean that I have been able to formulate all my problems such that I know the required maximum-capacity. I have never found this approach to be a problem.
3) If the vector contents can not be contained in system memory (my system's best upper limit capacity is roughly 3.5 GBytes), then perhaps a vector container (or any ram based container) is inappropriate. You will have to accomplish your goal using disk storage, perhaps with vector containers acting as a cache.
update 2017-July-31
Some code to consider from my latest Game of Life.
Each Cell_t (on the 2-d gameboard) has 8 neighbors.
In my implementation, each Cell_t has a neighbor 'list,' (either std::array or std::vector, I've tried both), and after the gameboard has fully constructed, each Cell_t's init() method is run, filling it's neighbor 'list'.
// see Cell_t data attributes
std::array<int, 8> m_neighbors;
// ...
void Cell_t::void init()
{
int i = 0;
m_neighbors[i] = validCellIndx(m_row-1, m_col-1); // 1 - up left
m_neighbors[++i] = validCellIndx(m_row-1, m_col); // 2 - up
m_neighbors[++i] = validCellIndx(m_row-1, m_col+1); // 3 - up right
m_neighbors[++i] = validCellIndx(m_row, m_col+1); // 4 - right
m_neighbors[++i] = validCellIndx(m_row+1, m_col+1); // 5 - down right
m_neighbors[++i] = validCellIndx(m_row+1, m_col); // 6 - down
m_neighbors[++i] = validCellIndx(m_row+1, m_col-1); // 7 - down left
m_neighbors[++i] = validCellIndx(m_row, m_col-1); // 8 - left
// ^^^^^^^^^^^^^- returns info to quickly find cell
}
The int value in m_neighbors[i] is the index into the gameboard vector. To determine the next state of the cell, the code 'counts the neighbor's states.'
Note - Some cells are at the edge of the gameboard ... in this implementation, validCellIndx() can return a value indicating 'no-neighbor', (above top row, left of left edge, etc.)
// multiplier: for 100x200 cells,20,000 * m_generation => ~20,000,000 ops
void countNeighbors(int& aliveNeighbors, int& totalNeighbors)
{
{ /* ... initialize m_count[]s to 0 */ }
for(auto neighborIndx : m_neighbors ) { // each of 8 neighbors // 123
if(no_neighbor != neighborIndx) // 8-4
m_count[ gBoard[neighborIndx].m_state ] += 1; // 765
}
aliveNeighbors = m_count[ CellALIVE ]; // CellDEAD = 1, CellALIVE
totalNeighbors = aliveNeighbors + m_count [ CellDEAD ];
} // Cell_Arr_t::countNeighbors
init() pre-computes the index to this cells neighbors. The m_neighbors array holds index integers, not pointers. It is trivial to have NO pointers-into-the-gameboard vector.

Related

drawback to vector.reserve(10000000) because i want to keep original memory locations?

I'm creating objects at runtime with vector.push_back() and i want to store the player object so i can use it whenever i want. So i store the pointer of the just created player to a global pointer variable, but when i push back the next object, the memory adress of the player changes, i think because the vector has to resize and therefore changes all of its elements locations.
for (int y = 0; y < 5; y++)
{
for (int x = 0; x < 10; x++)
{
switch (map[x + 10 * y])
{
case '1':
gameObjects.push_back(player);
playerPointer = &gameObjects.back();
break;
case '2':
gameObjects.push_back(block);
gameObjects.back().transform.pos = vec((float)x * 100, (float)y * 150, 0);
break;
}
}
}
but when i use
gameObjects.reserve(10000);
the location doesnt change, because its reserved and doesnt need to resize until the size becomes 10000 in this case.
So whats the catch? can you just reserve(1000000000) with no consequences? my RAM usage doesnt skyrocket.
Thanks in advance
The drawback of reserve( a lot ) is that you can either reserve far too much, which is a waste of memory, or too little, then reallocations will still happen.
Pointers to elements get invalidated when iterators get invalidated. For when iterators get invalidated I refer you to this question: Iterator invalidation rules.
Face it: std::vectors iterators are rather unstable (and thus also pointers to elements). You have several options:
Reserving enough space upfront can be a solution, though it isn't safe. Once you push more than you reserved, iterators are off.
When you are fine with reserving enough space for the maximum number of elements you can as well use a std::array.
Use indices. Assuming you do not reorder the vector and you never remove elements, indices are stable.
Use a std::list. Iterators of lists are much more stable than iterators into vectors. In particular, inserting to a list does not invalidate iterators. Though, consider the drawbacks of std::list compared to std::vector.
Store the elements elsewhere. Instead of storing the player in the vector you can store a (smart-) pointer to it in the vector. Reordering or reallocating the vector will then not affect pointers to the actual objects.
quickly after i posted this i realized that using pointers as an ID is stupid (especially with vectors) so i just gave the gameobject a const char* name and made a function which loops trough my gameobjects till it finds one with the name you seek and returns it. The i set a GameObject* to the gameobject* the function returns and from there i can use player->doStuff

Memory fragmentation using std list?

I'm using list of lists to store points data in my appliation.
Here some examples test I made:
//using list of lists
list<list<Point>> ls;
for(int i=0;i<10000;++i)
{
list<Point> lp;
lp.resize(4);
lp.pushback(Point(1,2));
ls.push_back(lp);
}
I asume that memory used will be
10k elements * 5 Points * Point size = 10000*5*2*4=400.000 bytes + some overhead of list container, but memory used by programm rises dramatically.
Is it due to overhead of list container or maybe because of memory fragmentation?
EDIT:
add some info and another example
Point is mfc CPoint class or you can define your own point class with int x,y , I'm using VS2008 in debug mode, Win XP, and Window Task Manager to view memory of application
I can't use vector instead of outer list because I don't know total size N of it beforehand, so I must push_back every new entry.
here is modified example
int N=10000;
list<vector<CPoint>> ls;
for(int i=0;i<N;++i)
{
vector<CPoint> vp;
vp.resize(5);
vp.reserve(5);
ls.push_back(vp);
}
and I compare it to
CPoint* p= new CPoint[N*5];
It's not "+ some overhead of list container". List overhead is linear with the number of objects, not constant. There's 50,000 Points, but with each Point you also have two pointers (std::list is doubly-linked), and also with each element in ls, you have two pointers. Plus, each list is going to have a head and tail pointer.
So that's 140,002 (I think) extra pointers that your math doesn't account for. Note that this dwarfs the size of the Point objects themselves, since they're so small. You sure that list is the right container for you? vector has constant overhead - basically three pointer per container, which would be just 30,003 additional pointers on top of just the Point objects. That's a large memory savings - if that is something that matters.
[Update based on Bill Lynch's comment] vector could allocate more space than 5 for your points. Worst-case, it will allocate twice as much space as you need. But since sizeof(Point) == sizeof(Point*) for you, that's still strictly better than list since list will always use three times as much space.

How to make sure that recently created area points to NULL?

How can I make sure that, for each allocated new space on heap area, the recently created element of vector of pointers points to NULL ?
Ex:
vector < Sometype* >
vector ----------------------
| | | | ... |
----------------------
new element is pushed back but no available area so double space
index x x+1 y
vector -------------------------------------------
| | | | ... | | | ... |
-------------------------------------------
^^^^^^^^^^^^^^^^^^^^^
recently created
x, x+1, ... y all points to the NULL
I want each space on recently created part point to NULL ?
This new space is part of the capacity of the vector, but not part of the size. You shouldn't need to care what values it contains, since you're not allowed to access it anyway. Other than the one value you pushed back, the extra space is not "elements of the vector", it's just unused space.
As far as the standard is concerned, the implementation could use it to store something meaningful, if it wanted. For example, an implementation could legally store some eye-catcher value in the unused memory, which conflicts with your desire for the unused memory to contain null pointers.
You could write code like this:
v.push_back(some_value);
if (v.capacity() > v.size()) {
size_t oldsize = v.size();
v.resize(v.capacity(), NULL);
v.resize(oldsize);
}
There's no guarantee this will actually leave the memory set to 0 once you resize back down again, but it probably will. So it might be good enough for debugging. If the purpose you have in mind is not debugging, please say what it is, because if not debugging then either your purpose is illegitimate or else one of us has misunderstood something.
If I correctly understood your question, one straightforward solution is to call resize() yourself passing NULL as second argument to be used as default value for newly created items:
if (v.size() == v.capacity()) //vector is full
{
//compute the new size
size_t newSize = 2 * v.size();
//second argument is the default value for newly added items
v.resize(newSize, NULL);
}
Why would you need it to be NULL unless it's been constructed? If you create a vector of 10 objects, say, and then push an extra, 11th item onto the vector then the vector may reserve enough space for another 10 items but you cannot use those items unless you either push items onto the vector increasing it's size, or you call resize.
size is not the same as capacity
Why would you need that? vector does not allow you to access these elements anyway. The expanding of it's capacity is implementation detail of vector and values of elements at the allocated space are not relevant. These elements will get overwritten once you push_back something there, or resize the vector with the given value.
There is an important difference between capacity and size of a vector.
When you push new elements into vector and there is no room, although std::vector allocates extra memory for new elements (similar to reserve() call), it does not create them (does not call constructors). See placement "new" to understand how this could work. There's no real way to enforce certain value for new elements, because there are no new elements - only raw memory block allocated for future elements. By using std::vector::at instead of operator[] you can ensure that you're accessing elements within valid range.
If you resize vector yourself by calling std::vector::resize, then simply provide default value for new elements in 2nd parameters. However, there's a catch. When you resize std::vector yourself and do not provide value for 2nd argument of std::vector::resize, std::vector will value-initialize new elements if value stored in std::vector has constructor and zero-initialize them otherwise. Which means, that if you do std::vector<int*> v; v.resize(200);, all new elements of v will be initialized to zero. See this answer for details.

pop-push element from std::vector and reuse elements

i have a project in c++03 that have a problem with data structure: i use vector instead of list even if i have to continuously pop_front-push_back. but for now it is ok because i need to rewrite too many code for now.
my approach is tuo have a buffer of last frame_size point always updated. so each frame i have to pop front and push back. (mayebe there is a name for this approach?)
so i use this code:
Point apoint; // allocate new point
apoint.x = xx;
apoint.y = yy;
int size = points.size()
if (size > frame_size) {
this->points.erase( points.begin() ); // pop_front
}
this->points.push_back(apoint);
i have some ready-to-use code for an object pool and so i thought: it is not a great optimization but i can store the front in the pool and so i can gain the allocation time of apoint.
ok this is not so useful and probably it make no sense but i ask just to educational curiosity: how can i do that?
how can i store the memory of erased element of a vector for reusing it? does this question make sense? if not, why?
.. because erase does not return the erased vector, it return:
A random access iterator pointing to the new location of the element
that followed the last element erased by the function call, which is
the vector end if the operation erased the last element in the
sequence.
i have some ready-to-use code for an object pool ... how can i do that?
Using a vector, you can't. A vector stores its elements in a contiguous array, so they can't be allocated one at a time, only in blocks of arbitrary size. Therefore, you can't use an object pool as an allocator for std::vector.
how can i store the memory of erased element of a vector for reusing it? does this question make sense? if not, why?
The vector already does that. Your call to erase moves all the elements down into the space vacated by the first element, leaving an empty space at the end to push the new element into.
As long as you use a vector, you can't avoid moving all the elements when you erase the first; if that is too inefficient, then use a deque (or possibly a list) instead.
I'm not sure to understand what you want to do, but this should be functionnally equivalent to what you wrote, without constructing a temporary Point instance:
// don't do this on an empty vector
assert (points.size() > 0);
// rotate elements in the vector, erasing the first element
// and duplicating the last one
copy (points.begin()+1, points.end(), points.begin());
// overwrite the last element with your new data
points.back().x = xx;
points.back().y = yy;
EDIT: As Mike Seymour noted in the comments, neither this solution nor the approach proposed in the question cause any new memory allocation.

inserting into the middle of an array

I have an array int *playerNum which stores the list of all the numbers of the players in the team. Each slot e.g playerNum[1]; represents a position on the team, if I wanted to add a new player for a new position on the team. That is, inserting a new element into the array somewhere near the middle, how would I go about doing this?
At the moment, I was thinking you memcpy up to the position you want to insert the player into a new array and then insert the new player and copy over the rest of it?
(I have to use an array)
If you're using C++, I would suggest not using memcpy or memmove but instead using the copy or copy_backward algorithms. These will work on any data type, not just plain old integers, and most implementations are optimized enough that they will compile down to memmove anyway. More importantly, they will work even if you change the underlying type of the elements in the array to something that needs a custom copy constructor or assignment operator.
If you have to use an array, after having made sure you have enough storage (using realloc if necessary), use memmove to shift the items from the insertion point to the end by one position, then save your new player at the desired location.
You can't use memcpy if the source and target areas overlap.
This will fail as soon as the objects in your array have non-trivial copy-constructors, and it's not idiomatic C++. Using one of the container classes is much safer (std::vector or std::list for instance).
Your solution using memcpy is correct (under few assumptions mentionned by other).
However, and since you are programming in C++. It is probably a better choice to use std::vector and its insert method.
vector<int> myvector (3,100);
myvector.insert ( 10 , 42 );
An array takes a contiguous block of memory, there is no function for you to insert an element in the middle. you can create a new one of size larger than the origin's by one then copy the original array into the new one plus the new member
for(int i=0;i<arSize/2;i++)
{
newarray[i]<-ar[i];
}
newarray[i+1]<-newelemant;
for(int j=i+1<newSize;j++,i++)
{
newarray[i]<-ar[i];
}
if you use STL, ting becomes easier, use list.
As you're talking about an array and "insert" I assume that it is a sorted array. You don't necessarily need a second array provided that the capacity N of your existing array is large enough to store more entries (N>n, where n is the number of current entries). You can move the entries from k to n-1 (zero-indexed) to k+1 to n, where k is the desired insert position. Insert the new element at index position k and increase n by one. If the array is not large enough in the beginning, you can follow your proposed approach or just reallocate a new array of larger capacity N' and copy the existing data before applying the actual insert operation described above.
BTW: As you're using C++, you could easily use std::vector.
While it is possible to use arrays for this, C++ has a better solutions to offer. For starters, try std::vector, which is a decent enough general-purpose container, based on a dynamically-allocated array. It behaves exactly like an array in many cases.
Looking at your problem, however, there are two downsides to arrays or vectors:
Indices have to be 0-based and contiguous; you cannot remove elements from the middle without losing key/value associations for everything after the removed element; so if you remove the player on position 4, then the player from position 9 will move to position 8
Random insertion and deletion (that is, anywhere except the end) is expensive - O(n), that is, execution time grows linearly with array size. This is because every time you insert or delete, a part of the array needs to be moved.
If the key/value thing isn't important to you, and insertion/deletion isn't time critical, and your container is never going to be really large, then by all means, use a vector. If you need random insertion/deletion performance, but the key/value thing isn't important, look at std::list (although you won't get random access then, that is, the [] operator isn't defined, as implementing it would be very inefficient for linked lists; linked lists are also very memory hungry, with an overhead of two pointers per element). If you want to maintain key/value associations, std::map is your friend.
Losting the tail:
#include <stdio.h>
#define s 10
int L[s];
void insert(int v, int p, int *a)
{
memmove(a+p+1,a+p,(s-p+1)*4);
*(a+p) = v;
}
int main()
{
for(int i=0;i<s;i++) L[i] = i;
insert(11,6, L);
for(int i=0;i<s;i++) printf("%d %d\n", L[i], &L[i]);
return 0;
}