200 million items in a vector - C++

Here is my structure
#include <string>

struct Node
{
    int chrN;
    long int pos;
    int nbp;
    std::string Ref;
    std::string Alt;
};
To fill the structure I read through a file, parse the variables I'm interested in into the structure, and then push it back into a vector. The problem is that there are around 200 million items and I must keep all of them in memory (for the later steps)! But the program terminates after pushing back 50 million nodes with a bad_alloc error.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Searching around gave me the idea that I'm out of memory, but the output of top shows 48% (at the moment the termination happened).
Additional information which may be useful:
I set the stack limit to unlimited, and
I'm using Ubuntu x86_64 GNU/Linux with 4 GB of RAM.
Any help would be most welcome.
Update:
First I switched from vector to list, then stored each ~500 MB chunk in a file and indexed the files for further analysis.

Vector storage is contiguous; in this case, 200 million * sizeof(Node) bytes are required. For each of the strings in the struct, another small allocation may be needed to hold the string's contents. All together, this is not going to fit in your available address space, and no (non-compressing) data structure will change that.
Vectors usually grow their backing capacity exponentially (which amortizes the cost of push_back). So when your program was already using about half the available address space, the vector probably attempted to double its capacity (or grow it by 50%), which triggered the bad_alloc. Since the old buffer is only released after a successful reallocation, the memory reported at the moment of failure appears to be only 48%.
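A minimal sketch (not the OP's code) that makes this growth visible; the exact growth factor depends on the standard library implementation:

#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    std::size_t lastCapacity = 0;
    for (int i = 0; i < 1000000; ++i) {
        v.push_back(i);
        if (v.capacity() != lastCapacity) {       // a reallocation just happened
            lastCapacity = v.capacity();
            std::cout << "size " << v.size()
                      << " -> capacity " << lastCapacity << '\n';
        }
    }
}

Each of those capacity jumps is a fresh allocation plus a copy of every existing element; the last, failed jump is where the bad_alloc comes from.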

That node structure consumes up to 44 bytes, plus the actual string buffers. There's no way 200 million of them will fit in 4 GB.
You need a design that does not hold your entire dataset in memory at once.

Since vectors must store all of their elements in one contiguous block of memory, you are much more likely to run out of a large enough block before you have consumed all of your available RAM.
Try using a std::list and see if that can retain all of your items. You won't be able to randomly access the elements, but that is a tradeoff you are most likely going to have to deal with.
The std::list can better utilize free fragments of RAM, since unlike the vector, it doesn't try to store the elements adjacent to one another.
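A minimal sketch of that switch, reusing the Node struct from the question (the values pushed are made up for illustration):

#include <list>
#include <string>

struct Node
{
    int chrN;
    long int pos;
    int nbp;
    std::string Ref;
    std::string Alt;
};

int main() {
    std::list<Node> nodes;                 // each node is its own small allocation
    Node n = {1, 12345L, 1, "A", "T"};     // hypothetical values
    nodes.push_back(n);
    // sequential traversal still works; operator[] does not exist for std::list
    for (std::list<Node>::const_iterator it = nodes.begin(); it != nodes.end(); ++it) {
        // process *it
    }
}

Keep in mind that every list node also carries two extra pointers of bookkeeping, so the total footprint goes up, not down; the only thing the list avoids is the need for one huge contiguous block.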

Depending on the sizes of the data types, I'd guess the structure you are using is at least 4+8+4+2+2 = 20 bytes long. With 200,000,000 entries that comes to around 3.8 GB of data. I'm not sure what you read from top, but that is close to your memory limit.
As LatencyMachine noted, the items have to be in one contiguous memory block, which will be difficult (the string contents can live elsewhere, but the bytes I summed up above have to sit in the vector itself).
It might help to initialize the vector with the correct size to avoid reallocation.
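A sketch of that idea, using the struct from the question; note that reserving up front does not shrink the total requirement, it only avoids repeated reallocation and the temporary peak:

#include <vector>
#include <string>

struct Node { int chrN; long int pos; int nbp; std::string Ref, Alt; };

int main() {
    std::vector<Node> nodes;
    nodes.reserve(200000000);   // request the whole block once, up front;
                                // if it cannot be satisfied, bad_alloc is thrown
                                // here instead of halfway through parsing the file
}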

If you have a look at this code:
#include <iostream>
#include <string>
using namespace std;

struct Node
{
    int chrN;
    long int pos;
    int nbp;
    string Ref;
    string Alt;
};

int main() {
    cout << sizeof(Node) << endl;
    return 0;
}
and at the result it produces on ideone, you will find that the size of your structure, even with the strings empty and on a 32-bit computer, is 20 bytes. Thus 200 * 10^6 times this size comes to exactly 4 GB. You can't hope to have the whole memory to yourself, so your program will thrash virtual memory like crazy. You have to find a way to keep only part of the elements in memory, or your program will be in serious trouble.

Related

Limit on vectors in C++

I have a question regarding vectors in C++. I know that, unlike arrays, there is no fixed limit on vectors. I have a graph with 6 million vertices and I am using a vector of a class. When I try to insert nodes into the vector it fails with a bad memory allocation error, whereas it works perfectly with 2 million nodes. I know bad_alloc usually means the failure is due to pointers I am using in my code, but that does not seem to be the case here. My question is: is it possible that it is failing due to the large size of the graph, i.e. that some limit on the vector has been reached? If so, is there any way to increase that limit?
First of all you should verify how much memory a single element requires. What is the size of one vertex/node? (You can verify that by using the sizeof operator). Consider that if the answer is, say, 50 bytes, you need 50 bytes times 6 million vertices = 300 MBytes.
Then, consider the next problem: in a vector the memory must be contiguous. This means your program will ask the OS to give it a contiguous chunk of 300 MBytes, and there's no guarantee this chunk is available even if the available memory is more than 300 MB. You might have to split your data, or to choose another, non-contiguous container. RAM fragmentation is impossible to control, which means if you run your program and it works, maybe you run it again and it doesn't work (or vice versa).
Another possible approach is to resize the vector manually, instead of letting it choose its new size automatically. The vector tries to anticipate some future growth, so if it has to grow it will try to allocate more capacity than is needed. This extra capacity might be the difference between having enough memory and not having it. You can use std::vector::reserve for this, though I think the exact behaviour is implementation dependent - it might still decide to reserve more than the amount you have requested.
One more option you have is to optimize the data types you are using. For example, if inside your vertex class you are using 32-bit integers while you only need 16 bits, you might use int16_t which would take half the space. See the full list of fixed size variables at CPP Reference.
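A hypothetical illustration (the type and field names are invented, not from the question):

#include <cstdint>
#include <iostream>

// If the real values fit in 16 bits, fixed-width types halve the storage
// of those fields compared to a 32-bit int.
struct VertexCompact {
    std::int32_t id;        // still needs the full range
    std::int16_t weight;    // fits in 16 bits
    std::int16_t degree;    // fits in 16 bits
};

int main() {
    std::cout << sizeof(VertexCompact) << '\n';   // 8 bytes instead of 12
}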
There is std::vector::max_size that you can use to see the maximum number of elements that the vector you declared can potentially hold.
Return maximum size
Returns the maximum number of elements that the vector can hold.
This is the maximum potential size the container can reach due to known system or library implementation limitations, but the container is by no means guaranteed to be able to reach that size: it can still fail to allocate storage at any point before that size is reached.
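For example (Vertex here is just a stand-in for the real node class):

#include <iostream>
#include <vector>

struct Vertex { int a, b, c; };   // placeholder for the actual vertex class

int main() {
    std::vector<Vertex> v;
    std::cout << "sizeof(Vertex): " << sizeof(Vertex) << '\n';
    std::cout << "max_size: " << v.max_size() << '\n';   // theoretical ceiling only;
                                                         // allocation can fail long before this
}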

Memory Demands: Heap vs Stack in C++

So I had a strange experience this evening.
I was working on a program in C++ that required some way of reading a long list of simple data objects from file and storing them in the main memory, approximately 400,000 entries. The object itself is something like:
class Entry
{
public:
Entry(int x, int y, int type);
Entry(); ~Entry();
// some other basic functions
private:
int m_X, m_Y;
int m_Type;
};
Simple, right? Well, since I needed to read them from file, I had some loop like
Entry** globalEntries;
globalEntries = new Entry*[totalEntries];
entries = new Entry[totalEntries];// totalEntries read from file, about 400,000
for (int i=0;i<totalEntries;i++)
{
globalEntries[i] = new Entry(.......);
}
That addition added about 25 to 35 megabytes to the program's memory usage when I tracked it in the Task Manager. A simple change to stack allocation:
Entry* globalEntries;
globalEntries = new Entry[totalEntries];
for (int i=0;i<totalEntries;i++)
{
globalEntries[i] = Entry(.......);
}
and suddenly it only required 3 megabytes. Why is that happening? I know pointer objects have a little bit of extra overhead (4 bytes for the pointer address), but it shouldn't be enough to make THAT much of a difference. Could it be that the program is allocating memory inefficiently and ending up with chunks of unallocated memory between the allocated blocks?
Your code is wrong, or I don't see how this worked: with new Entry[count] you create a new array of Entry (the resulting type is Entry*), yet you assign it to an Entry**, so I presume you actually used new Entry*[count].
What you did next was to create another new Entry object on the heap for each slot and store it in the globalEntries array. So you need memory for 400,000 pointers plus 400,000 separate Entry objects. The 400,000 pointers take about 3 MiB of memory on a 64-bit machine. Additionally, each of the 400,000 single Entry allocations requires sizeof(Entry) plus potentially some more memory for the memory manager (it may have to store the size of the allocation, the associated pool, alignment/padding, etc.). This additional bookkeeping memory can quickly add up.
If you change your second example to:
Entry* globalEntries;
globalEntries = new Entry[count];
for (...) {
globalEntries [i] = Entry (...);
}
memory usage should be equal to the stack approach.
Of course, ideally you'll use a std::vector<Entry>.
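A sketch of what that could look like, reusing the Entry class from the question; the element count and constructor arguments here are placeholders for the values read from the file:

#include <vector>

class Entry {
public:
    Entry(int x, int y, int type) : m_X(x), m_Y(y), m_Type(type) {}
private:
    int m_X, m_Y;
    int m_Type;
};

int main() {
    int totalEntries = 400000;               // read from the file in the real program
    std::vector<Entry> globalEntries;
    globalEntries.reserve(totalEntries);     // one contiguous block, no per-element heap headers
    for (int i = 0; i < totalEntries; ++i)
        globalEntries.push_back(Entry(i, i, 0));   // placeholder values
    // globalEntries frees itself; no manual delete[] bookkeeping needed
}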
First of all, without specifying which column exactly you were watching, the number in Task Manager means nothing. On a modern operating system it's difficult even to define what you mean by "used memory": are we talking about private pages? The working set? Only the stuff that stays in RAM? Does reserved but not committed memory count? Who pays for memory shared between processes? Are memory-mapped files included?
If you are watching some meaningful metric, it's impossible to see 3 MB of memory used - your object is at least 12 bytes (assuming 32 bit integers and no padding), so 400000 elements will need about 4.58 MB. Also, I'd be surprised if it worked with stack allocation - the default stack size in VC++ is 1 MB, you should already have had a stack overflow.
Anyhow, it is reasonable to expect a different memory usage:
the stack is (mostly) allocated right from the beginning, so that's memory you nominally consume even without really using it for anything (actually virtual memory and automatic stack expansion makes this a bit more complicated, but it's "true enough");
the CRT heap is opaque to the Task Manager: all it sees is the memory given by the operating system to the process, not what the C heap "really" has in use; the heap grows (requesting memory from the OS) more than strictly necessary in order to be ready for further memory requests - so what you see is how much memory it is ready to give away without further syscalls;
your "separate allocations" method has a significant overhead. The all-contiguous array you'd get with new Entry[size] costs size*sizeof(Entry) bytes, plus the heap bookkeeping data (typically a few integer-sized fields); the separated allocations method costs at least size*sizeof(Entry) (size of all the "bare elements") plus size*sizeof(Entry *) (size of the pointer array) plus size+1 multiplied by the cost of each allocation. If we assume a 32 bit architecture with a cost of 2 ints per allocation, you quickly see that this costs size*24+8 bytes of memory, instead of size*12+8 for the contiguous array in the heap;
the heap normally hands out blocks that aren't exactly the size you asked for, because it manages blocks of fixed sizes; so, if you allocate single objects like that, you are probably also paying for some extra padding - supposing it has 16-byte blocks, you are paying 4 extra bytes per element by allocating them separately; this moves our memory estimate to size*28+8, i.e. an overhead of 16 bytes for each 12-byte element.

2D vector with very large size

I have defined a 2-dimensional vector in C++ whose sizes are very large. The vector definition is like this:
vector<vector<string> > CommunityNodes(3600, vector<string>(240005));
When I run the program, I get the following error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Before I run the program, I run the following command in the console, which makes the stack size unlimited:
ulimit -s unlimited
But I still get the allocation error. How can I define such a big vector in C++?
Let's assume an implementation of std::string containing one 32-bit pointer to its contents, one 32-bit int for the current length of the string, and one 32-bit int for the current allocation size.
That gives us 12 bytes per string * 240005 strings per row * 3600 rows, which works out to 9.7 gigabytes -- considerably more than you can deal with on a 32-bit implementation. Worse, an implementation might easily pad that 12-byte string out to 16 bytes, thus increasing the memory needed still more.
If you go to a 64-bit implementation, you can address more memory, but the size is likely to double, so you'd need roughly 20 gigabytes of memory just to store the arrays of empty strings. Add some actual contents, and you need even more (and, again, the string could easily be padded to be larger still).
So, yes, with a 64-bit implementation this can probably be made to work, but it's impractical for most purposes. You probably want/need to find some way to reduce the number of strings you use.
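If you want to see the footprint on your own implementation, a quick check along these lines works (a sketch, not part of the original answer; the exact sizes vary between standard libraries):

#include <iostream>
#include <string>

int main() {
    // footprint of one empty string object (heap contents not included)
    std::cout << "sizeof(std::string) = " << sizeof(std::string) << '\n';

    // lower bound for the 3600 x 240005 grid of empty strings from the question
    unsigned long long total = 3600ULL * 240005ULL * sizeof(std::string);
    std::cout << "at least " << total << " bytes\n";
}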
An attempt to pre-allocate that much space suggests that you are taking the wrong approach to the problem. I suspect that most of the nodes in the double vector are going to be empty. If that is the case, you would be much better off with something like a double map instead of a double vector:
std::map< int, std::map< int, std::string > > CommunityNodes;
Then you only create the entries you must have, instead of preallocating the entire array.
Please Note: using the '[]' operator will automatically create a node, even if you don't assign anything to it. If you don't want to create the node, then use 'find()' instead of '[]'.
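For example:

#include <iostream>
#include <map>
#include <string>

int main() {
    std::map< int, std::map< int, std::string > > CommunityNodes;

    CommunityNodes[5][42] = "some value";   // operator[] creates both nodes as needed

    // read-only lookup: find() never creates anything
    std::map< int, std::map< int, std::string > >::const_iterator row = CommunityNodes.find(5);
    if (row != CommunityNodes.end()) {
        std::map< int, std::string >::const_iterator col = row->second.find(42);
        if (col != row->second.end())
            std::cout << col->second << '\n';
    }
}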

Freeze in C++ program using huge vector

I have an issue with a C++ program. I think it's a memory problem.
In my program I create some enormous std::vectors (I use reserve to allocate memory). With a vector size of 1,000,000 it's OK, but if I increase this number (to about ten million), my program freezes my PC and I can do nothing except wait for a crash (or for the end of the program if I'm lucky). My vector contains a structure called Point which contains a vector of doubles.
I used valgrind to check whether there is a memory leak, but no: according to it, there is no problem. Maybe using a vector of objects is not advised? Or are there some system parameters to check, or something? Or is the vector simply too big for the computer?
What do you think about this?
Disclaimer
Note that this answer assumes a few things about your machine; the exact memory usage and error potential depends on your environment. And of course it is even easier to crash when you don't compute on 2d-Points, but e.g. 4d-points, which are common in computer graphics for example, or even larger Points for other numeric purposes.
About your problem
That's quite a lot of memory to allocate:
#include <iostream>
#include <vector>

struct Point {
    std::vector<double> coords;
};

int main () {
    std::cout << sizeof(Point) << std::endl;
}
This prints 12, which is the size in bytes of an empty Point on that implementation. If you have 2-dimensional points, add another 2*sizeof(double) = 16 bytes of heap data per element, i.e. you now have a total of 28 bytes per Point.
With tens of millions of elements, you request hundreds of millions of bytes of data, e.g. for 20 million elements you request 560 million bytes. While this does not exceed the maximum index into an std::vector, it is possible that the OS does not have that much contiguous memory free for you.
Also, your vector's memory needs to be copied quite often in order to grow. This happens, for example, on push_back: when you already have a 400 MiB vector, the next push_back may leave you with the old version of the vector plus the newly allocated 400 MiB * X of memory, so you can easily exceed 1000 MiB temporarily, et cetera.
Optimizations (high level; preferred)
Do you need to actually store the data all the time? Can you use a similar algorithm which does not require so much storage? Can you refactor your code so that storage is reduced? Can you write some data out to disk when you know it will be a while until you need it again?
Optimizations (low level)
If you know the number of elements before creating your outer vector, use the std::vector constructor that takes an initial size:
vector<Foo> foo(12); // constructed with 12 elements already in place
Of course you can optimize a lot for memory; e.g. if you know you only ever have 2d-Points, just have two doubles as members: 28 bytes -> 16 bytes. When you do not really need the precision of double, use float: 16 bytes -> 8 bytes. That brings you down to less than a third of the original size:
// struct Point { std::vector<double> coords; }; <-- old
struct Point { float x, y; }; // <-- new
If this is still not enough, an ad-hoc solution could be std::deque or another non-contiguous container: no temporary memory "doubling", because no resizing is needed, and no need for the OS to find you such a large contiguous block of memory.
You can also use compression mechanisms, or indexed data, or fixed point numbers. But it depends on your exact circumstances.
struct Point { signed char x, y; }; // <-- or even this? examine a proper type
struct Point { short x_index, y_index; };
Without seeing your code, this is just speculation, but I suspect it is in large part due to your attempt to allocate a massive amount of memory that is contiguous. std::vector is guaranteed to be in contiguous memory, so if you try to allocate a large amount of space, the operating system has to try to find a block of memory that large that it can use. This may not be a problem for 2MB, but if you are suddenly trying to allocate 200MB or 2GB of contiguous memory ...
Additionally, any time you add a new element to the vector and it is forced to resize, all of the existing elements must be copied into the newly allocated space. If you have 9 million elements and adding the 9,000,001st element requires a resize, that is 9 million elements that have to be moved. As your vector gets larger, this copy takes longer.
Try using std::deque instead. It will basically allocate pages (each contiguous internally), but each page can be allocated wherever it fits.
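A sketch of the swap, using the compact two-float Point from the earlier answer:

#include <deque>

struct Point { float x, y; };   // the compact Point suggested above

int main() {
    std::deque<Point> points;   // grows chunk by chunk: no single huge contiguous block,
                                // and no copy of all existing elements on growth
    for (int i = 0; i < 10000000; ++i) {
        Point p = { float(i), float(i) };
        points.push_back(p);
    }
}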

Why is the heap after array allocation so large

I've got a very basic application that boils down to the following code:
char* gBigArray[200][200][200];
unsigned int Initialise() {
    for (int ta = 0; ta < 200; ta++)
        for (int tb = 0; tb < 200; tb++)
            for (int tc = 0; tc < 200; tc++)
                gBigArray[ta][tb][tc] = new char;
    return sizeof(gBigArray);
}
The function returns the expected value of 32000000 bytes, which is approximately 30 MB, yet the Windows Task Manager (granted, it's not 100% accurate) reports a Memory (Private Working Set) value of around 157 MB. I've loaded the application into VMMap by Sysinternals and looked at the values it reports.
I'm unsure what Image means (listed under Type), although regardless of that, its value is around what I'm expecting. What is really throwing things off for me is the Heap value, which is where the apparently enormous size is coming from.
What I don't understand is why this is. According to this answer, if I've understood it correctly, gBigArray would be placed in the data or bss segment - and since each element is an uninitialised pointer, I'm guessing the bss segment. Why then is the Heap value larger than what is required by such a silly amount?
It doesn't sound silly if you know how memory allocators work. They keep track of the allocated blocks so there's a field storing the size and also a pointer to the next block, perhaps even some padding. Some compilers place guarding space around the allocated area in debug builds so if you write beyond or before the allocated area the program can detect it at runtime when you try to free the allocated space.
You are allocating one char at a time, and there is typically a space overhead per allocation. Allocate the memory in one big chunk (or at least in a few chunks) instead.
Do not forget that char* gBigArray[200][200][200]; allocates space for 200*200*200 = 8,000,000 pointers, each a word in size. That is 32 MB on a 32-bit system.
Add another 8,000,000 chars for another 8 MB. Since you are allocating them one by one, the allocator probably cannot hand them out at one byte per item, so they will likely also take at least a word per item, resulting in another 32 MB (32-bit system).
The rest is probably overhead, which is also significant, because the C++ runtime must remember how many elements an array allocated with new[] contains so that delete[] works.
Owww! My embedded systems stuff would roll over and die if faced with that code. Each allocation has quite a bit of extra info associated with it and is either padded out to a fixed size or managed via a linked-list-type structure. On my system, that one-char new would become a 64-byte allocation out of a small-object allocator, so that management stays O(1). But on other systems this could easily fragment your memory horribly, make subsequent news and deletes run extremely slowly, O(n) where n is the number of things it tracks, and in general bring doom upon an app over time, as each char becomes at least a 32-byte allocation and gets placed in all sorts of cubby holes in memory, pushing your allocation heap out much further than you might expect.
Do a single large allocation and map your 3D array over it if you need to with a placement new or other pointer trickery.
Allocating 1 char at a time is probably much more expensive. There is a metadata header per allocation, and 1 byte for a character is smaller than the header itself, so you might actually save space by doing one large allocation (if possible); that way you avoid the overhead of each individual allocation carrying its own metadata.
Perhaps this is an issue of memory stride? What size of gaps are between values?
30 MB is for the pointers. The rest is for the storage you allocated with the new calls that the pointers are pointing to. The allocator is allowed to hand out more than one byte per request for various reasons, like aligning on word boundaries or leaving some growing room in case you want it later. If you want 8 MB worth of characters, leave the * off your declaration of gBigArray.
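That is, something along these lines: 8,000,000 chars of static storage, no pointers and no heap headers at all.

char gBigArray[200][200][200];   // 8 MB of chars, allocated once, no per-element overhead

unsigned int Initialise() {
    for (int ta = 0; ta < 200; ta++)
        for (int tb = 0; tb < 200; tb++)
            for (int tc = 0; tc < 200; tc++)
                gBigArray[ta][tb][tc] = 0;   // use the chars directly instead of new char
    return sizeof(gBigArray);                // now reports 8000000
}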
Edited out of the above post into a community wiki post:
As the answers above say, the issue here is that I am creating a new char 200^3 times, and although each char is only 1 byte, there is overhead for every object on the heap. Creating one char array for all of the chars knocks the memory down to a much more believable level:
char* gBigArray[200][200][200];
char* gCharBlock = new char[200*200*200];

unsigned int Initialise() {
    unsigned int mIndex = 0;
    for (int ta = 0; ta < 200; ta++)
        for (int tb = 0; tb < 200; tb++)
            for (int tc = 0; tc < 200; tc++)
                gBigArray[ta][tb][tc] = &gCharBlock[mIndex++];
    return sizeof(gBigArray);
}