Memory fragmentation using std list? - c++

I'm using list of lists to store points data in my appliation.
Here some examples test I made:
//using list of lists
list<list<Point>> ls;
for(int i=0;i<10000;++i)
{
list<Point> lp;
lp.resize(4);
lp.pushback(Point(1,2));
ls.push_back(lp);
}
I asume that memory used will be
10k elements * 5 Points * Point size = 10000*5*2*4=400.000 bytes + some overhead of list container, but memory used by programm rises dramatically.
Is it due to overhead of list container or maybe because of memory fragmentation?
EDIT:
add some info and another example
Point is mfc CPoint class or you can define your own point class with int x,y , I'm using VS2008 in debug mode, Win XP, and Window Task Manager to view memory of application
I can't use vector instead of outer list because I don't know total size N of it beforehand, so I must push_back every new entry.
here is modified example
int N=10000;
list<vector<CPoint>> ls;
for(int i=0;i<N;++i)
{
vector<CPoint> vp;
vp.resize(5);
vp.reserve(5);
ls.push_back(vp);
}
and I compare it to
CPoint* p= new CPoint[N*5];

It's not "+ some overhead of list container". List overhead is linear with the number of objects, not constant. There's 50,000 Points, but with each Point you also have two pointers (std::list is doubly-linked), and also with each element in ls, you have two pointers. Plus, each list is going to have a head and tail pointer.
So that's 140,002 (I think) extra pointers that your math doesn't account for. Note that this dwarfs the size of the Point objects themselves, since they're so small. You sure that list is the right container for you? vector has constant overhead - basically three pointer per container, which would be just 30,003 additional pointers on top of just the Point objects. That's a large memory savings - if that is something that matters.
[Update based on Bill Lynch's comment] vector could allocate more space than 5 for your points. Worst-case, it will allocate twice as much space as you need. But since sizeof(Point) == sizeof(Point*) for you, that's still strictly better than list since list will always use three times as much space.

Related

Memory management when using vector

I am making a game engine and need to use the std::vector container for all of the components and entities in the game.
In a script the user might need to hold a pointer to an entity or component, perhaps to continuously check some kind of state. If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
Considering this issue i have a couple of possible solutions. After each push_back to the vector, would it be a viable to check if a current capacity variable is exceeded by the actual capacity of the vector? And if so, fetch and overwrite the old pointers to the new ones? Would this guarantee to "catch" every case that invalidates pointers when performing a push_back?
Another solution that i've found is to instead save an index to the element and access it that way, but i suspect that is bad for performance when you need to continuously check the state of that element (every 1/60 second).
I am aware that other containers do not have this issue but i'd really like to make it work with a vector. Also it might be worth noting that i do not know in advance how many entities / components there will be.
Any input is greatly appreciated.
You shouldn't worry about performance of std::vector when you access its element only 60 times per second. By the way, in Release compilation mode std::vector::operator[] is being converted to a single lea opcode. In Debug mode it is decorated by some runtime range checks though.
If the user is going to store pointers to the objects, why even contain them in a vector?
I don't feel like it is a good idea to (poor wording)->store pointers to objects in a vector. (what I meant is to create pointers that point to vector elements, i.e. my_ptr = &my_vec[n];) The whole point of a container is to reference the contents in the normal ways that the container supports, not to create outside pointers to elements of the container.
To answer your question about whether you can detect the allocations, yes you could, but it is still probably a bad idea to reference the contents of a vector by pointers to elements.
You could also reserve space in the vector when you create it, if you have some idea of what the maximum size might grow to. Then it would never resize.
edit:
After reading other responses, and thinking about what you asked, another thought occurred. If your vector is a vector of pointers to objects, and you pass out the pointers to the objects to your clients, resizing the vector does not invalidate the pointers that the vector hold. The issue becomes keeping track of the life of the object (who owns it), which is why using shared_ptr would be useful.
For example:
vector<shared_ptr> my_vec;
my_vec.push_back(stuff);
if you pass out the pointers contained in the vector to clients...
client_ptr = my_vec[3];
There will be no problem when the vector resizes. The contents of the vector will be preserved, and whatever was at my_vec[3] will still be there. The object pointed to by my_vec[3] will still be at the same address, and my_vec[3] will still contain that address. Whomever got a copy of the pointer at my_vec[3] will still have a valid pointer.
However, if you did this:
client_ptr = &my_vec[3];
And the client is dereferencing like this:
*client_ptr->whatever();
You have a problem. Now when my_vec resized, &my_vec[3] is probably no longer valid, thus client_ptr points to nowhere.
If something is added to the vector that the pointer points to and the
capacity is exceeded, it is my understanding that the vector will
allocate new memory and every pointer that points to any element in
the vector will become invalid.
I once wrote some code to analyze what happens when a vector's capacity is exceeded. (Have you done this, yet?) What that code demonstrated on my Ubuntu with g++v5 system was that std::vector code simply a) doubles the capacity, b) moves all the elements from old to the new storage, then c) cleans up the old. Perhaps your implementation is similar. I think the details of capacity expansion is implementation dependent.
And yes, any pointer into the vector would be invalidated when push_back() causes capacity to be exceeded.
1) I simply don't use pointers-into-the-vector (and neither should you). In this way the issue is completely eliminated, as it simply can not occur. (see also, dangling pointers) The proper way to access a std::vector (or a std::array) element is to use an index (via the operator[]() method).
After any capacity-expansion, the index of all elements at indexes less than the previous capacity limit are still valid, as the push_back() installed the new element at the 'end' (I think highest memory addressed.) The elements memory location may have changed, but the element index is still the same.
2) It is my practice that I simply don't exceed the capacity. Yes, by that I mean that I have been able to formulate all my problems such that I know the required maximum-capacity. I have never found this approach to be a problem.
3) If the vector contents can not be contained in system memory (my system's best upper limit capacity is roughly 3.5 GBytes), then perhaps a vector container (or any ram based container) is inappropriate. You will have to accomplish your goal using disk storage, perhaps with vector containers acting as a cache.
update 2017-July-31
Some code to consider from my latest Game of Life.
Each Cell_t (on the 2-d gameboard) has 8 neighbors.
In my implementation, each Cell_t has a neighbor 'list,' (either std::array or std::vector, I've tried both), and after the gameboard has fully constructed, each Cell_t's init() method is run, filling it's neighbor 'list'.
// see Cell_t data attributes
std::array<int, 8> m_neighbors;
// ...
void Cell_t::void init()
{
int i = 0;
m_neighbors[i] = validCellIndx(m_row-1, m_col-1); // 1 - up left
m_neighbors[++i] = validCellIndx(m_row-1, m_col); // 2 - up
m_neighbors[++i] = validCellIndx(m_row-1, m_col+1); // 3 - up right
m_neighbors[++i] = validCellIndx(m_row, m_col+1); // 4 - right
m_neighbors[++i] = validCellIndx(m_row+1, m_col+1); // 5 - down right
m_neighbors[++i] = validCellIndx(m_row+1, m_col); // 6 - down
m_neighbors[++i] = validCellIndx(m_row+1, m_col-1); // 7 - down left
m_neighbors[++i] = validCellIndx(m_row, m_col-1); // 8 - left
// ^^^^^^^^^^^^^- returns info to quickly find cell
}
The int value in m_neighbors[i] is the index into the gameboard vector. To determine the next state of the cell, the code 'counts the neighbor's states.'
Note - Some cells are at the edge of the gameboard ... in this implementation, validCellIndx() can return a value indicating 'no-neighbor', (above top row, left of left edge, etc.)
// multiplier: for 100x200 cells,20,000 * m_generation => ~20,000,000 ops
void countNeighbors(int& aliveNeighbors, int& totalNeighbors)
{
{ /* ... initialize m_count[]s to 0 */ }
for(auto neighborIndx : m_neighbors ) { // each of 8 neighbors // 123
if(no_neighbor != neighborIndx) // 8-4
m_count[ gBoard[neighborIndx].m_state ] += 1; // 765
}
aliveNeighbors = m_count[ CellALIVE ]; // CellDEAD = 1, CellALIVE
totalNeighbors = aliveNeighbors + m_count [ CellDEAD ];
} // Cell_Arr_t::countNeighbors
init() pre-computes the index to this cells neighbors. The m_neighbors array holds index integers, not pointers. It is trivial to have NO pointers-into-the-gameboard vector.

trim array to elements between i and j

A classic, I'm looking for optimisation here : I have an array of things, and after some processing I know I'm only interested in elements i to j. How to trim my array in the fatset, lightest way, with complete deletions/freeing of memory of elements before i and after j ?
I'm doing mebedded C++, so I may not be able to compile all sorts of library let's say. But std or vector things welcome in a first phase !
I've tried, for array A to be trimmed between i and j, with variable numElms telling me the number of elements in A :
A = &A[i];
numElms = i-j+1;
As it is this yields an incompatibility error. Can that be fixed, and even when fixed, does that free the memory at all for now-unused elements?
A little context : This array is the central data set of my module, and it can be heavy. It will live as long as the module lives. And there's no need to carry dead weight all this time. This is the very first thing that is done - figuring which segment of the data set has to be at all analyzed, and trimming and dumping the rest forever, never to use it again (until the next cycle where we get a fresh array with possibily a compeltely different size).
When asking questions about speed your millage may very based on the size of the array you're working with, but:
Your fastest way will be to not trim the array, just use A[index + i] to find the elements you want.
The lightest way to do this would be to:
Allocate a dynamic array with malloc
Once i and j are found copy that range to the head of the dynamic array
Use realloc to resize the dynamic array to the size j - i + 1
However you have this tagged as C++ not C, so I believe that you're also interested in readability and the required programming investment, not raw speed or weight. If this is true then I would suggest use of a vector or deque.
Given vector<thing> A or a deque<thing> A you could do:
A.erase(cbegin(A), next(cbegin(A), i));
A.resize(j - i + 1);
There is no way to change aloocated memory block size in standard C++ (unless you have POD data — in this case C facilities like realloc could be used). The only way to trim an array is to allocate new array. copy/move needed elements and destroy old array.
You can do it manually, or using vectors:
int* array = new int[10]{0,1,2,3,4,5,6,7,8,9};
std::vector<int> vec {0,1,2,3,4,5,6,7,8,9};
//We want only elements 3-5
{
int* new_array = new int[3];
std::copy(array + 3, array + 6, new_array);
delete[] array;
array = new_array;
}
vec = std::vector<int>(vec.begin()+3, vec.begin()+6);
If you are using C++11, both approaches should have same perfomance.
If you only want to remove extra elements and do not really want to release memory (for example you might want to add more elements later) you can follow NathanOliver link
However, you should consider: do you really need that memory freed immideately? Do you need to move elements right now? Will you array live for such long time that this memory would be lost for your program completely? Maybe you need a range or perharps a view to the array content? In many cases you can store two pointers (or pointer and size) to denote your "new" array, while keeping old one to be released all at once.

Why doesn't QList have a resize() method?

I just noticed that QList doesn't have a resize method, while QVector, for example, has one. Why is this? And is there an equivalent function?
Well, this is the more generic answer, but I hope you will see, by comparising QList and QVector why there is no need of manually expanding the container.
QList is using internal buffer to save pointers to the elements (or, if the element is smaller than pointer size, or element is one of the shared classes - elements itself), and the real data will be kept on the heap.
During the time, removing the data will not reduce internal buffer (empty space will be filled by shifting left or right elements, leaving space on the beginning and the end for later insertions).
Appending items, like QVector will create additional new space on end of the array, and since, unlike QVector, real data is not stored in internal buffer, you can create a lot of space in single instruction, no matter what size of the item is (unlike QVector) - because you are simply adding pointers into indexing buffer.
For example, if you are using 32bit system (4 bytes per pointer) and you are storing 50 items in the QList, and each item is 1MB big, QVector buffer will need to be resized to 50MB, and QList's internal buffer is need to allocate only 200B of memory. This is where you need to call resize() in QVector, but in QList there is no need, since allocating small chunk of memory is not problematic, as allocating 50MB of memory.
However, there is a price for that which means that you sometimes you want to preffer QVector instead of QList: For single item stored in the QList, you need one additional alloc on the heap - to keep the real data of the item (data where pointer in the internal buffer is pointing to). If you want to add 10000 items larger than the pointer (because, if it can fit into pointer, it will be stored directly in the internal buffer), you will need 10000 system calls to allocate data for 10000 items on the heap. But, if you are using QVector, and you call resize, you are able to fit all the items in the single alloc call - so don't use QList if you need a lot of inserting or appending, prefer QVector for that. Of course, if you are using QList to store shared classes, there is no need for additional allocating, which again makes QList more suitable.
So, prefer QList for most of the cases as it is:
Using indices to access the individual elements, accessing items will be faster that QLinkedList
Inserting into middle of the list will only require moving of the pointers to create space, and it is faster than shifting actual QVector data around.
There is no need to manually reserve or resize space, as empty space will be moved to the end of the buffer for later use, and allocating space in the array is very fast, as the elements are very small, and it can allocate a lot of space without killing your memory space.
Don't use it in the following scenarios, and prefer QVector:
If you need to ensure that your data is stored in the sequential memory locations
If you are rarely inserting data at the random positions, but you are appending a lot of data
at the end or beginning, which can cause a lot of unnecessary system calls, and you still need fast indexing.
If you are looking for (shared) replacement for simple arrays which will not grow over the time.
And, finally, note: QList (and QVector) have reserve(int alloc) function which will cause QList's internal buffer to grow if alloc is greater than the current size of the internal buffer. However, this will not affect external size of the QList (size() will always return the exact number of elements contained in the list).
I think reason is because QList doesn't require the element type to have a default constructor.
As a result of this, there is no operation where QList ever creates an object it only copies them.
But if you really need to resize a QList (for whatever reason), here's a function that will do it. Note that it's just a convenience function, and it's not written with performance in mind.
template<class T>
void resizeList(QList<T> & list, int newSize) {
int diff = newSize - list.size();
T t;
if (diff > 0) {
list.reserve(newSize);
while (diff--) list.append(t);
} else if (diff < 0) list.erase(list.end() + diff, list.end());
}
wasle answer is good, but it'll add the same object multiple time. Here is an utility functions that will add different object for list of smart pointers.
template<class T>
void resizeSmartList(QList<QSharedPointer<T> > & list, int newSize) {
int diff = newSize - list.size();
if (diff > 0) {
list.reserve(diff);
while (diff>0){
QSharedPointer<T> t = QSharedPointer<T>(new T);
list.append(t);
diff--;
}
}else if (diff < 0) list.erase(list.end() + diff, list.end());
}
For use without smart pointers, the following will add different objects to your list.
template<class T>
void resizeList(QList<T> & list, int newSize) {
int diff = newSize - list.size();
if (diff > 0) {
list.reserve(diff);
while (diff>0){
T t = new T;
list.append(t);
diff--;
}
}else if (diff < 0) list.erase(list.end() + diff, list.end());
}
Also remember that your objects must have default constructor (constructor declared in the header with arg="someValue") or else it will fail.
Just use something like
QList<Smth> myList;
// ... some operations on the list here
myList << QVector<Smth>(desiredNewSize - myList.size()).toList();
Essentially, there are these to/from Vector/List/Set() methods everywhere, which makes it trivial to resize Qt containers when necessary in a somewhat manual, but trivial and effective (I believe) way.
Another (1 or 2-liner) solution would be:
myList.reserve(newListSize); // note, how we have to reserve manually
std::fill_n(std::back_inserter(myList), desiredNewSize - myList.size(), Smth());
-- that's for STL-oriented folks :)
For some background on how complex an effective QList::resize() may get, see:
bugreports.qt.io/browse/QTBUG-42732 , and
codereview.qt-project.org/#/c/100738/1//ALL

inserting into the middle of an array

I have an array int *playerNum which stores the list of all the numbers of the players in the team. Each slot e.g playerNum[1]; represents a position on the team, if I wanted to add a new player for a new position on the team. That is, inserting a new element into the array somewhere near the middle, how would I go about doing this?
At the moment, I was thinking you memcpy up to the position you want to insert the player into a new array and then insert the new player and copy over the rest of it?
(I have to use an array)
If you're using C++, I would suggest not using memcpy or memmove but instead using the copy or copy_backward algorithms. These will work on any data type, not just plain old integers, and most implementations are optimized enough that they will compile down to memmove anyway. More importantly, they will work even if you change the underlying type of the elements in the array to something that needs a custom copy constructor or assignment operator.
If you have to use an array, after having made sure you have enough storage (using realloc if necessary), use memmove to shift the items from the insertion point to the end by one position, then save your new player at the desired location.
You can't use memcpy if the source and target areas overlap.
This will fail as soon as the objects in your array have non-trivial copy-constructors, and it's not idiomatic C++. Using one of the container classes is much safer (std::vector or std::list for instance).
Your solution using memcpy is correct (under few assumptions mentionned by other).
However, and since you are programming in C++. It is probably a better choice to use std::vector and its insert method.
vector<int> myvector (3,100);
myvector.insert ( 10 , 42 );
An array takes a contiguous block of memory, there is no function for you to insert an element in the middle. you can create a new one of size larger than the origin's by one then copy the original array into the new one plus the new member
for(int i=0;i<arSize/2;i++)
{
newarray[i]<-ar[i];
}
newarray[i+1]<-newelemant;
for(int j=i+1<newSize;j++,i++)
{
newarray[i]<-ar[i];
}
if you use STL, ting becomes easier, use list.
As you're talking about an array and "insert" I assume that it is a sorted array. You don't necessarily need a second array provided that the capacity N of your existing array is large enough to store more entries (N>n, where n is the number of current entries). You can move the entries from k to n-1 (zero-indexed) to k+1 to n, where k is the desired insert position. Insert the new element at index position k and increase n by one. If the array is not large enough in the beginning, you can follow your proposed approach or just reallocate a new array of larger capacity N' and copy the existing data before applying the actual insert operation described above.
BTW: As you're using C++, you could easily use std::vector.
While it is possible to use arrays for this, C++ has a better solutions to offer. For starters, try std::vector, which is a decent enough general-purpose container, based on a dynamically-allocated array. It behaves exactly like an array in many cases.
Looking at your problem, however, there are two downsides to arrays or vectors:
Indices have to be 0-based and contiguous; you cannot remove elements from the middle without losing key/value associations for everything after the removed element; so if you remove the player on position 4, then the player from position 9 will move to position 8
Random insertion and deletion (that is, anywhere except the end) is expensive - O(n), that is, execution time grows linearly with array size. This is because every time you insert or delete, a part of the array needs to be moved.
If the key/value thing isn't important to you, and insertion/deletion isn't time critical, and your container is never going to be really large, then by all means, use a vector. If you need random insertion/deletion performance, but the key/value thing isn't important, look at std::list (although you won't get random access then, that is, the [] operator isn't defined, as implementing it would be very inefficient for linked lists; linked lists are also very memory hungry, with an overhead of two pointers per element). If you want to maintain key/value associations, std::map is your friend.
Losting the tail:
#include <stdio.h>
#define s 10
int L[s];
void insert(int v, int p, int *a)
{
memmove(a+p+1,a+p,(s-p+1)*4);
*(a+p) = v;
}
int main()
{
for(int i=0;i<s;i++) L[i] = i;
insert(11,6, L);
for(int i=0;i<s;i++) printf("%d %d\n", L[i], &L[i]);
return 0;
}

Delete parts of a dynamic array and grow other

I need to have a dynamic array, so I need to allocate the necessary amount of memory through a pointer. What makes me wonder about which is a good solution, is that C++ has the ability to do something like:
int * p = new int[6];
which allocates the necessary array. What I need is that, afterwards, I want to grow some parts of this array. A(n flawed) example:
int *p1 = &p[0];
int *p2 = &p[2];
int *p3 = &p[4];
// delete positions p[2], p[3]
delete [] p2;
// create new array
p2 = new int[4];
I don't know how to achieve this behavior.
EDIT: std::vector does not work for me since I need the time of insertion/deletion of k elements to be proportional to the number k and not to the number of elements stored in the std::vector.
Using pointers, in the general case, I would point to the start of any non continuous region of memory and I would keep account of how many elements it stores. Conceptually, I would fragment the large array into many small ones and not necessarily in continuous space in the memory (the deletion creates "holes" while the allocation does not necessarily "fill" them).
You achieve this behavior by using std::vector:
std::vector<int> v(6); // create a vector with six elements.
v.erase(v.begin() + 2); // erase the element at v[2]
v.insert(v.begin() + 2, 4, 0); // insert four new elements starting at v[2]
Really, any time you want to use a dynamically allocated array, you should first consider using std::vector. It's not the solution to every problem, but along with the rest of the C++ standard library containers it is definitely the solution to most problems.
You should look into STL containers in C++, for example vector has pretty much the functionality you want.
I'd advise against doing this on your own. Look up std::vector for a reasonable starting point.
another option, besides std::vector is std::deque, which works in much the same way, but is a little more efficient at inserting chunks into the middle. If that's still not good enough for you, you might get some mileage using a collection of collections. You'll have to do a little bit more work getting random access to work (perhaps writing a class to wrap the whole thing.