Freeze in C++ program using huge vector

I have an issue with a C++ program. I think it's a memory problem.
In my program I routinely create some enormous std::vectors (I use reserve to allocate memory). With a vector size of 1,000,000 it's OK, but if I increase this number (to around ten million), my program freezes my PC and I can do nothing except wait for a crash (or for the end of the program if I'm lucky). My vector contains a structure called Point which contains a vector of double.
I used valgrind to check for a memory leak, and according to it there is no problem. Maybe using a vector of objects is not advised? Or maybe there are some system parameters to check? Or is the vector simply too big for the computer?
What do you think about this?

Disclaimer
Note that this answer assumes a few things about your machine; the exact memory usage and error potential depend on your environment. And of course it is even easier to crash when you compute not on 2D points but, e.g., on 4D points, which are common in computer graphics, or on even larger Points for other numeric purposes.
About your problem
That's quite a lot of memory to allocate:
#include <iostream>
#include <vector>

struct Point {
    std::vector<double> coords;
};

int main() {
    std::cout << sizeof(Point) << std::endl;
}
This prints 12 (on a 32-bit system; an empty std::vector is typically 24 bytes on 64-bit), which is the size in bytes of an empty Point. If you have 2-dimensional points, add another 2*sizeof(double) = 16 bytes of heap data per element, i.e. you now have a total of at least 28 bytes per Point.
With tens of millions of elements, you request hundreds of millions of bytes of data, e.g. for 20 million elements you request roughly 560 million bytes. While this does not exceed the maximum index into an std::vector, it is possible that the OS does not have that much contiguous memory free for you.
Also, your vector's memory needs to be copied quite often in order for it to grow. This happens for example on push_back: when the vector already holds several hundred MiB, the push_back that triggers growth allocates a new, larger buffer while the old one is still alive, so you may easily exceed 1 GiB temporarily, et cetera.
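To see this growth behaviour for yourself, here is a minimal sketch (illustrative only, not the OP's code) that prints every point at which the vector reallocates while being filled:
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<double> v;
    std::size_t last_capacity = 0;
    for (int i = 0; i < 1000000; ++i) {
        v.push_back(1.0);
        if (v.capacity() != last_capacity) {   // a reallocation just happened
            last_capacity = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << last_capacity << '\n';
        }
    }
}
Each printed line marks a moment where the old buffer and the new, larger buffer briefly coexist.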
Optimizations (high level; preferred)
Do you actually need to store all the data all the time? Can you use a similar algorithm which does not require so much storage? Can you refactor your code so that storage is reduced? Can you write some data out to disk when you know it will be a while until you need it again?
Optimizations (low level)
If you know the number of elements before creating your outer vector, use the std::vector constructor to which you can pass an initial size:
std::vector<Foo> foo(12); // initialize with 12 elements
Of course you can optimize a lot for memory; e.g. if you know you always have only 2D points, just have two doubles as members: 28 bytes -> 16 bytes. When you do not really need the precision of double, use float: 16 bytes -> 8 bytes. That's a reduction to roughly 2/7 of the original size:
// struct Point { std::vector<double> coords; }; <-- old
struct Point { float x, y; }; // <-- new
If this is still not enough, an ad-hoc solution could be std::deque, or another non-contiguous container: no temporary memory "doubling" because no full-buffer resizing is needed, and no need for the OS to find you such a contiguous block of memory.
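For instance, a minimal sketch of the deque alternative (using the two-float Point assumed above):
#include <deque>

struct Point { float x, y; };

int main() {
    std::deque<Point> points;                    // grows chunk by chunk
    for (int i = 0; i < 20000000; ++i)
        points.push_back({float(i), float(i)});  // no old-buffer + new-buffer copy on growth
}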
You can also use compression mechanisms, or indexed data, or fixed point numbers. But it depends on your exact circumstances.
struct Point { signed char x, y; }; // <-- or even this? examine a proper type
struct Point { short x_index, y_index; };

Without seeing your code, this is just speculation, but I suspect it is in large part due to your attempt to allocate a massive amount of memory that is contiguous. std::vector is guaranteed to be in contiguous memory, so if you try to allocate a large amount of space, the operating system has to try to find a block of memory that large that it can use. This may not be a problem for 2MB, but if you are suddenly trying to allocate 200MB or 2GB of contiguous memory ...
Additionally, any time you add a new element to the vector and it is forced to resize, all of the existing elements must be copied into the newly allocated space. If you have 9 million elements and adding the 9,000,001st element requires a resize, that is 9 million elements that have to be moved. As your vector gets larger, this copy takes longer.
Try using std::deque instead. It will basically allocate pages (each internally contiguous), but the pages themselves can be placed wherever they fit.

Related

std::vector increasing peak memory

This is in continuation of my last question. I fail to understand the memory taken up by a vector. Problem skeleton:
Consider a vector which is a collection of lists, and each list is a collection of pointers. Exactly like:
std::vector<std::list<ABC*> > vec;
where ABC is my class. We work on 64-bit machines, so the size of a pointer is 8 bytes.
At the start of my flow in the project, I resize this vector to a number so that I can store lists at the respective indexes.
vec.resize(613284686);
At this point, capacity and size of the vector would be 613284686. Right? After resizing, I am inserting the lists at the corresponding indexes as:
// Somewhere down in the program, make these lists. Simple push for now.
std::list<ABC*> l1;
l1.push_back(<pointer_to_class_ABC>);
l1.push_back(<pointer_to_class_ABC>);
// Copy the list to the location
setInfo(613284686, l1);
void setInfo(uint64_t index, const std::list<ABC*>& list) {
    std::copy(list.begin(), list.end(), std::back_inserter(vec.at(index)));
}
Alright. So inserting is done. Notable things are:
Size of vector is : 613284686
Entries in the vector: 3638243731 // calculated by going over the vector indexes and adding up the size of the std::list at each index.
Now, since there are 3638243731 pointer entries, I would expect the memory taken by this vector to be ~30 GB: 3638243731 * 8 (bytes) = ~30 GB.
BUT when I have this data in memory, memory peaks at 400 GB.
And then I clear up this vector with:
std::vector<std::list<nl_net> >& ccInfo = getVec(); // getVec defined somewhere and return me original vec.
std::vector<std::list<nl_net> >::iterator it = ccInfo.begin();
for(; it != ccInfo.end(); ++it) {
(*it).clear();
}
ccInfo.clear(); // since it is a reference
std::vector<std::list<nl_net> >().swap(ccInfo); // This makes the capacity of the vector 0.
Well, after clearing up this vector, memory drops down to 100 GB. That is too much for a vector to hold on to.
Could you correct me on what I am failing to understand here?
P.S. I cannot reproduce it on smaller cases; it only shows up in my project.
vec.resize(613284686);
At this point, capacity and size of the vector would be 613284686
It would be at least 613284686. It could be more.
std::vector<std::list<nl_net> >().swap(ccInfo); // This makes the capacity of the vector 0.
Technically, the standard does not guarantee that a default-constructed vector has a capacity of 0... but in practice, this is probably true.
Now, since there are 3638243731 entries of pointers, I would expect memory taken by this vector is ~30Gb. 3638243731 * 8(bytes)
But the vector doesn't contain pointers. It contains std::list<ABC*> objects. So you should expect vec.capacity() * sizeof(std::list<ABC*>) bytes used by the buffer of the vector itself. Each list has at least pointers to its beginning and end.
Furthermore, you should expect each element in each of the lists to use memory as well. Since the list is doubly linked, you should expect about two pointers plus the data (a third pointer) worth of memory for each element.
Also, each pointer in the lists apparently points to an ABC object, and each of those use sizeof(ABC) memory as well.
Furthermore, since each element of the linked lists is allocated separately, and each dynamic allocation requires book-keeping so that it can be individually de-allocated, and each allocation must be aligned to the maximum native alignment, and the free store may have fragmented during execution, there is a lot of overhead associated with each dynamic allocation.
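A minimal sketch (with ABC as a stand-in for your real class; the exact numbers are implementation-dependent) to see the per-slot and per-node costs discussed here:
#include <iostream>
#include <list>

struct ABC { };  // stand-in for the real class

int main() {
    // Cost per slot in the outer vector: one (possibly empty) list object.
    std::cout << "sizeof(std::list<ABC*>) = " << sizeof(std::list<ABC*>) << '\n';
    // Cost per stored pointer: one separately allocated list node holding
    // two links plus the pointer itself (3 * 8 bytes on a typical 64-bit
    // system), plus allocator book-keeping and alignment padding.
    std::cout << "sizeof(ABC*)            = " << sizeof(ABC*) << '\n';
}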
Well, after clearing up this vector, memory drops down to 100G.
It is quite typical for the language implementation to retain (some) memory it has allocated from the OS. If your target system documents an implementation specific function for explicitly requesting release of such memory, then you could attempt using that.
However, if the vector buffer wasn't the latest dynamic allocation, then its deallocation may have left a massive reusable area in the free store, but if there exists later allocations, then all that memory might not be releasable back to the OS.
Even if the language implementation has released the memory to the OS, it is quite typical for the OS to keep the memory mapped for the process until another process actually needs it for something else. So, depending on how you're measuring memory use, the results might not necessarily be meaningful.
General rules of thumb that may be useful:
Don't use a vector unless you use all (or most) of the indices. Where you don't, consider a sparse data structure instead (there is no standard container for that, though; see the sketch after this list).
When using vector, reserve before resize if you know the upper bound of allocation.
Don't use linked lists without a good reason.
Don't rely on getting all memory back from peak usage (back to the OS that is; The memory is still usable for further dynamic allocations).
Don't stress about virtual memory usage.
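As a rough illustration of the sparse-array suggestion above (a sketch, assuming most indexes stay empty; std::unordered_map is just one possible substitute):
#include <cstdint>
#include <list>
#include <unordered_map>

struct ABC { };  // stand-in for the real class

int main() {
    // Only indexes that actually receive a list consume memory.
    std::unordered_map<std::uint64_t, std::list<ABC*>> sparse;
    ABC a;
    sparse[613284685].push_back(&a);  // no 613-million-slot contiguous buffer needed
}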
std::list is a fragmented-memory container. Each node must hold the data it is storing plus the two prev/next pointers, and then you have to add the allocator's book-keeping for each allocation (typically 16 or 32 bytes, depending on the OS). You also have to account for the fact that all allocations are returned on a 16-byte boundary (on Intel/AMD-based 64-bit machines anyway).
So, using the example of std::list<ABC*>: the size of a pointer is 8 bytes, yet you will need at least 48 bytes to store each element.
So memory usage for ONLY the list entries is going to be around 3638243731 * 48 (bytes) = ~162 GiB.
This of course assumes that there is no memory fragmentation (where there may be a block of 62 bytes free, and the allocator returns the entire block of 62 rather than the 48 requested). We are also assuming that the OS has a minimum allocation size of 48 bytes (and not, say, 64 bytes, which would not be overly silly, but would push the usage up far higher).
The size of the std::lists themselves within the vector comes to around 18GB. So in total we are looking at 180Gb at least to store that vector. It would not be beyond the realm of possibility that the extra allocations are additional OS book keeping info, for all of those individual memory allocations (e.g. lists of loaded memory pages, lists of swapped out memory pages, the read/write/mmap permissions, etc, etc).
As a final note, instead of swapping with a newly constructed vector, you can just use shrink_to_fit:
ccInfo.clear();
ccInfo.shrink_to_fit();
The main vector needs some more consideration. I get the impression it will always be a fixed size, so why not use a std::array instead? A std::vector typically allocates more memory than it currently needs, to allow for growth, and the bigger your vector, the bigger that extra reservation. The reasoning behind this is to keep reallocations to a minimum: reallocations on really big vectors take huge amounts of time, so a lot of extra memory is reserved to prevent them.
None of the vector functions that remove elements (such as vector::clear and vector::erase) deallocate memory (i.e. lower the capacity). The size decreases but the capacity doesn't. Again, this is meant to prevent reallocations: if you delete, you are also very likely to add again. shrink_to_fit also doesn't guarantee that all of the unused memory is released.*
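A minimal sketch (illustrative only) of that size/capacity distinction:
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v(1000000, 42);
    v.clear();
    std::cout << v.size() << ' ' << v.capacity() << '\n';  // size 0, capacity still ~1000000
    v.shrink_to_fit();                                     // non-binding request to give it back
    std::cout << v.capacity() << '\n';                     // 0 on common implementations
}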
Next is the choice of a list to store the elements. Is a list really applicable? Lists are strong at insertion and removal at arbitrary positions, but they offer no random access and scatter their nodes across memory. Are you really constantly adding and removing ABC objects at arbitrary locations in the list? Or would another container type with different properties but contiguous memory be more suitable, perhaps another std::vector or std::array? If the answer is yes, then you're pretty much stuck with a list and its scattered memory allocations. If no, then you could win back a lot of memory by using a different container type.
So, what is it you really want to do? Do you really need dynamic growth on both the main container and its elements? Do you really need random manipulation? Or can you use fixed-size arrays for both container and ABC objects and use iteration instead? When contemplating this you might want to read up on the available containers and their properties on en.cppreference.com. It will help you decide what is most appropriate.
*For the fun of it I dug around in VS2017's implementation and it creates an entirely new vector without the growth segment, copies the old elements and then reassigns the internal pointers of the old vector to the new one while deleting the old memory. So at least with that compiler you can count on memory being released.

std::vector new runs out of memory

I am writing a program in C++ for an embedded platform, an STM32. I used to use plain arrays, but for the first time I am using vectors. So far it runs well in the majority of cases. However, in a very few cases I get the error stated in the title even though it shouldn't happen. I would appreciate some help, as I have run out of ideas.
My situation
The vectors are like this
struct Boundary {
    vector<unsigned short> x; // the x coordinates
    vector<unsigned short> y; // the y coordinates
};
Every time I use these vectors I clear them with
boundary0.x.clear();
boundary0.y.clear();
I add elements with the normal push_back
The strange part
Sometimes the program terminates with "Operator new out of memory" when adding elements to the vector.
"Ah, you ran out of memory!" you would say, but that is the strange part. The vector so far has only 275 elements which, being unsigned short, amounts to 550 bytes.
But this very same program has handled the same vector with many more elements (500 or more) without problems.
"Somehow you leaked memory earlier!" you might say, and I suspected that. Perhaps I used it before and failed to clean it up (although I cleared it as stated), but this error appears even when I disconnect and reconnect the processor, wiping out any previously used memory.
I am at a loss as to why this could be happening. Any help, comment or advice is greatly appreciated.
What you need is to reduce the vector's capacity to its size after the vector has been used and clear() has been performed.
C++11 solution:
Use shrink_to_fit() after the clear() call to release the previously allocated memory.
boundary0.x.clear();
boundary0.x.shrink_to_fit();
boundary0.y.clear();
boundary0.y.shrink_to_fit();
It will reduce the capacity of the vector to be equal to its size, which after clear() is zero.
Note that shrink_to_fit was introduced in C++11.
C++03 and earlier:
The 'swap-to-fit' idiom can be used to get the same effect as shrink_to_fit:
std::vector<T>(my_vector.begin(), my_vector.end()).swap(my_vector);
will reduce my_vector's capacity to its size.
The idiom is described here with a detailed explanation of how exactly it works: https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Shrink-to-fit
When you add an element with push_back and the number of elements has reached the current capacity, the internally reserved buffer is reallocated (i.e. the existing one may be freed and a larger chunk of memory allocated). This can fragment your memory: smaller chunks of free memory become available, yet the allocator needs ever larger chunks, and on a system with rather little memory it may at some point not find one any more. Hence, if you do this very often with a lot of vectors, it can become a problem on an embedded system.
Hard to say if this is actually the reason, but I'd try to initialize the vectors with a capacity that they will most likely not exceed. Maybe that solves your problem. So you could try out:
struct Boundary {
    // Constructing each vector with 500 elements up front gives it a capacity
    // of at least 500, which is retained across clear().
    vector<unsigned short> x = vector<unsigned short>(500); // the x coordinates
    vector<unsigned short> y = vector<unsigned short>(500); // the y coordinates
};

Limit on vectors in c++

I have a question regarding vectors in C++. I know that, unlike arrays, there is no fixed limit on vectors. I have a graph with 6 million vertices and I am using a vector of a class. When I try to insert nodes into the vector it fails with a bad memory allocation, whereas it works perfectly with 2 million nodes. I know bad allocation can mean it's failing because of the pointers I am using in my code, but that does not seem to be the case to me. My question is: is it possible that it is failing because of the large size of the graph? And if so, is there any way to increase that limit?
First of all you should verify how much memory a single element requires. What is the size of one vertex/node? (You can verify that with the sizeof operator.) Consider that if the answer is, say, 50 bytes, you need 50 bytes times 6 million vertices = 300 MB.
Then consider the next problem: in a vector the memory must be contiguous. This means your program will ask the OS for a single 300 MB chunk, and there's no guarantee such a chunk is available even if more than 300 MB of memory is free. You might have to split your data, or choose another, non-contiguous container. RAM fragmentation is impossible to control, which means the program may work on one run and fail on another (or vice versa).
Another possible approach is to resize the vector manually, instead of letting it choose its new size automatically. The vector tries to anticipate some future growth, so if it has to grow it will try to allocate more capacity than is needed. This extra capacity might be the difference between having enough memory and not having it. You can use std::vector::reserve for this, though I think the exact behaviour is implementation dependent - it might still decide to reserve more than the amount you have requested.
One more option you have is to optimize the data types you are using. For example, if your vertex class uses 32-bit integers where 16 bits would suffice, you could use int16_t, which takes half the space. See the full list of fixed-width integer types at CPP Reference.
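For instance (a hypothetical vertex layout, not your actual class), shrinking the field types shrinks every one of the 6 million elements:
#include <cstdint>
#include <iostream>

struct VertexWide  { std::int32_t id, weight, color; };  // 12 bytes
struct VertexSmall { std::int16_t id, weight, color; };  // 6 bytes

int main() {
    std::cout << sizeof(VertexWide) << " vs " << sizeof(VertexSmall) << '\n';  // typically 12 vs 6
}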
There is std::vector::max_size that you can use to see the maximum number of elements the vector you declared can potentially hold:
Return maximum size
Returns the maximum number of elements that the vector can hold.
This is the maximum potential size the container can reach due to known system or library implementation limitations, but the container is by no means guaranteed to be able to reach that size: it can still fail to allocate storage at any point before that size is reached.
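A minimal sketch of querying it (the printed number is implementation-specific and, as the quote says, no promise that an allocation of that size will succeed):
#include <iostream>
#include <vector>

struct Vertex { int id; double weight; };  // hypothetical node type

int main() {
    std::vector<Vertex> graph;
    std::cout << "max_size: " << graph.max_size() << '\n';
}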

200 million items in a vector

Here is my structure
struct Node
{
    int chrN;
    long int pos;
    int nbp;
    string Ref;
    string Alt;
};
To fill the structure I read through a file, parse the variables of interest into the structure, and then push it back into a vector. The problem is that there are around 200 million items and I have to keep all of them in memory (for further steps)! But the program terminates after pushing back 50 million nodes, with a bad_alloc error.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Searching around gives me the idea that I'm out of memory, but the output of top showed 48% (when the termination happened).
Additional information which may be useful:
I set the stack limit to unlimited, and
I'm using Ubuntu (x86_64 GNU/Linux) with 4 GB of RAM.
Any help would be most welcome.
Update:
I first switched from vector to list, then stored each ~500 MB chunk in a file and indexed them for further analysis.
Vector storage is contiguous; in this case, 200 million * sizeof(Node) bytes are required for the vector's buffer alone. For each of the strings in the struct, another small allocation may be needed to hold the string's characters. All together, this is not going to fit in your available address space, and no (non-compressing) data structure is going to solve this.
Vectors usually grow their backing capacity exponentially (which amortizes the cost of push_back). So when your program was already using about half the available address space, the vector probably attempted to double its size (or add 50%), which then caused the bad_alloc. The failed allocation does not free the existing buffer, so the reported memory use stays at only 48%.
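For illustration (assuming roughly 40 bytes per Node on a 64-bit system, an estimate rather than a measurement): 50 million nodes occupy about 2 GB, which matches the ~48% of 4 GB that top reported; the next capacity increase then asks for a second buffer at least that large while the old one is still alive, and that allocation throws bad_alloc.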
That Node structure consumes at least around 40 bytes per element, plus the actual string buffers. There's no way 200 million of them will fit in 4 GB.
You need to not hold your entire dataset in memory at once.
Since vectors must store all elements in contiguous memory, you are much more likely to run out of memory before you consume your full available RAM.
Try using a std::list and see if it can hold all of your items. You won't be able to randomly access the elements, but that is a tradeoff you will most likely have to accept.
A std::list can better utilize fragmented free RAM, since unlike a vector it doesn't try to store the elements adjacent to one another.
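A minimal sketch of that switch (keeping the Node from the question; note that each list node carries two extra link pointers, so total memory actually goes up even though no single large contiguous block is needed):
#include <list>
#include <string>

struct Node {
    int chrN;
    long int pos;
    int nbp;
    std::string Ref;
    std::string Alt;
};

int main() {
    std::list<Node> nodes;                          // each element is allocated individually
    nodes.push_back(Node{1, 12345L, 1, "A", "T"});  // no large contiguous buffer required
}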
Depending on the sizes of the data types, I'd guess the structure you are using is at least 4+8+4+2+2 = 20 bytes long. If you have 200,000,000 of them, this comes to around 3.8 GB of data. Not sure what you read from top, but this is close to your memory limit.
As LatencyMachine noted, the items have to be in a contiguous memory block (the string data can live somewhere else, but the 20 bytes I summed up have to be in the vector).
It might help to initialize the vector with the correct size to avoid reallocation.
If you have a look at this code:
#include <iostream>
#include <string>

using namespace std;

struct Node
{
    int chrN;
    long int pos;
    int nbp;
    string Ref;
    string Alt;
};

int main() {
    cout << sizeof(Node) << endl;
    return 0;
}
The result it produces on ideone shows that the size of the structure, even with empty strings and on a 32-bit machine, is 20 bytes. Thus 200 * 10^6 times this size is exactly 4 GB. You can't hope to have the whole memory just for your process, so the program will thrash virtual memory like crazy. You have to think of a way to store the elements only partially, or your program will be in big trouble.

vector reserve c++

I have a very large multidimensional vector that changes in size all the time.
Is there any point in using the vector.reserve() function when I only know a good approximation of the sizes?
So basically I have a vector
A[256*256][x][y]
where x goes from 0 to 50 for every iteration in the program and then back to 0 again. The y sizes can differ every time, which means that for each of the [256*256][x] elements the innermost vector can have a different size, but always smaller than 256.
So to clarify my problem this is what I have:
vector<vector<vector<int>>> A;
for (int i = 0; i < 256*256; i++) {
    A.push_back(vector<vector<int>>());
    A[i].push_back(vector<int>());
    A[i][0].push_back(SOME_VALUE);
}
Add elements to the vector...
A.clear();
And after this I do the same thing again from the top.
When and how should I reserve space for the vectors?
If I have understood this correctly, I would save a lot of time by using reserve, since I change the sizes all the time?
What would be the negative/positive sides of reserving the maximum size my vector can have which would be [256*256][50][256] in some cases.
BTW. I am aware of different Matrix Templates and Boost, but have decided to go with vectors on this one...
EDIT:
I was also wondering how to use the reserve function on multidimensional vectors.
If I only reserve the vector in two dimensions, will it copy the whole thing if I exceed its capacity in the third dimension?
To help with discussion you can consider the following typedefs:
typedef std::vector<int> int_t; // internal vector
typedef std::vector<int_t> mid_t; // intermediate
typedef std::vector<mid_t> ext_t; // external
The cost of growing (a capacity increase of) an int_t only affects the contents of that particular vector and no other element. The cost of growing a mid_t requires copying all of the int_t vectors stored in it, which is quite a bit more costly. The cost of growing the ext_t is huge: it requires copying all the elements already stored in the container.
Now, to increase performance, it is most important to get the ext_t size correct (it seems to be fixed at 256*256 in your question). Then get the intermediate mid_t size right so that expensive reallocations are rare.
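A minimal sketch of that sizing, reusing the typedefs above (the 50-element guess for mid_t comes from the question and is only an estimate):
#include <cstddef>
#include <vector>

typedef std::vector<int>   int_t;  // internal vector
typedef std::vector<int_t> mid_t;  // intermediate
typedef std::vector<mid_t> ext_t;  // external

int main() {
    ext_t A(256 * 256);              // outer size is fixed: allocate it exactly once
    for (std::size_t i = 0; i < A.size(); ++i)
        A[i].reserve(50);            // expected upper bound for x, so mid_t rarely regrows
}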
The amount of memory you are talking about is huge, so you might want to consider less standard ways to solve your problem. The first thing that comes to mind is adding an extra level of indirection. If instead of holding the actual vectors you hold smart pointers to the vectors, you can reduce the cost of growing the mid_t and ext_t vectors (if the ext_t size is fixed, just use a vector of mid_t). Now, this implies that code using your data structure will be more complex (or better, add a wrapper that takes care of the indirection). Each int_t vector will be allocated once in memory and will never move during mid_t or ext_t reallocations. The cost of reallocating a mid_t is proportional to the number of allocated int_t vectors, not the actual number of inserted integers.
using std::tr1::shared_ptr; // or boost::shared_ptr
typedef std::vector<int> int_t;
typedef std::vector< shared_ptr<int_t> > mid_t;
typedef std::vector< shared_ptr<mid_t> > ext_t;
Another thing to take into account is that std::vector::clear() does not free the vector's internal storage; it only destroys the contained objects and sets the size to 0. That is, calling clear() will never release memory. The pattern for actually releasing the allocated memory of a vector is:
typedef std::vector<...> myvector_type;
myvector_type myvector;
...
myvector_type().swap( myvector ); // swap with a default-constructed temporary
Whenever you push a vector into another vector, set the size in the pushed vector's constructor:
A.push_back(vector<vector<int> >( somesize ));
You have a working implementation but are concerned about the performance. If your profiling shows it to be a bottleneck, you can consider using a naked C-style array of integers rather than the vector of vectors of vectors.
See how-do-i-work-with-dynamic-multi-dimensional-arrays-in-c for an example
You can re-use the same allocation each time, reallocating as necessary and eventually keeping it at the high-tide mark of usage.
If indeed the vectors are the bottleneck, performance beyond avoiding the sizing operations on the vectors each loop iteration will likely become dominated by your access pattern into the array. Try to access the highest orders sequentially.
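A rough sketch of that flat-block idea, using a single std::vector<int> as the backing store rather than a raw array (the 256*256 x 50 x 256 maxima come from the question; sized like that the block is on the order of 3 GB, so in practice you would size it to your actual high-tide mark):
#include <cstddef>
#include <vector>

int main() {
    const std::size_t NI = 256 * 256, NJ = 50, NK = 256;
    std::vector<int> flat(NI * NJ * NK);  // one allocation, reused every iteration

    // (i, j, k) maps to a single offset; iterating k innermost gives sequential access.
    std::size_t i = 10, j = 3, k = 42;
    flat[(i * NJ + j) * NK + k] = 7;
}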
If you know the size of a vector at construction time, pass the size to the c'tor and assign using operator[] instead of push_back. If you're not totally sure about the final size, make a guess (maybe add a little bit more) and use reserve to have the vector reserve enough memory upfront.
What would be the negative/positive sides of reserving the maximum size my vector can have which would be [256*256][50][256] in some cases.
Negative side: potential waste of memory. Positive side: less CPU time, less heap fragmentation. It's a memory/cpu tradeoff, the optimum choice depends on your application. If you're not memory-bound (on most consumer machines there's more than enough RAM), consider reserving upfront.
To decide how much memory to reserve, look at the average memory consumption, not at the peak (reserving 256*256*50*256 is not a good idea unless such dimensions are needed regularly)