C++ Containers: Optimal Memory Management

I want to implement a container. The data will be stored in a dynamically allocated array. I need advice on memory reallocation.
Basically, I want a formula for how much bigger I should make the array when it is full. I think growing by a constant amount would be sub-optimal, since the larger the array is, the longer it takes to copy it.
For example, if an array can store 1000000 doubles and it becomes full, reallocating for 1000005 doubles would be stupid; going for 1001000 would be a better idea. Conversely, if I have an array of 5 doubles and it gets full, enlarging it to 1005 units is equally stupid. Maybe enlarging it by 10% (or by 20 elements plus 10%, so that it behaves sensibly for small arrays too) every time would be a better idea. Any advice on this?
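For illustration, here is a minimal sketch of the kind of growth policy being described; the factor of 1.5 and the floor of 8 elements are arbitrary example values, not a recommendation:

#include <algorithm>
#include <cstddef>

// Hypothetical growth policy: grow geometrically (~50% here) with a small
// fixed floor so that tiny arrays do not reallocate on every insertion.
std::size_t grow_capacity(std::size_t current)
{
    const std::size_t minimum = 8;                    // floor for small arrays
    return std::max(minimum, current + current / 2);  // roughly 1.5x growth
}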

I would start by reusing std::vector. Don't re-implement what already works well.
If you know something about the size of your data, then use the reserve() function to ensure you don't allocate more often than you need to. Feel free to reserve 20% or 40% extra space if you don't know exactly how much data you will have.
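For example, a minimal sketch of that reserve() pattern (the 20% head room is just the guess mentioned above, and load_samples is a made-up function):

#include <cstddef>
#include <vector>

std::vector<double> load_samples(std::size_t expected_count)
{
    std::vector<double> samples;
    // Reserve the expected count plus ~20% head room so that typical inputs
    // never trigger a reallocation during the push_back loop below.
    samples.reserve(expected_count + expected_count / 5);
    for (std::size_t i = 0; i < expected_count; ++i)
        samples.push_back(static_cast<double>(i));    // stand-in for real data
    return samples;
}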
If you don't know anything about the size of your data, std::vector is already optimized for good performance in that case. Knowing nothing, you are just as likely to have 10001 entries (and have the vector wastefully allocate a lot of extra space) as to have 9999 entries (and have it avoid the 4 or 5 wasteful copies a less aggressive growth algorithm would incur). std::vector implementations are fine-tuned over many hundreds of man-hours to ensure good behavior when the user has no information about sizing.
The only time I'd start to deviate from that is when you start getting up into the gigabyte data sets. Then it's nice to make sure that you don't allocate something too big to fit into memory.

Related

Memory efficient way to create a dynamic-list attribute that can grow indefinitely

I need to make an object which owns lists that can grow indefinitely. Items in the list are structs composed of basic types.
So, I want to know if using a vector for it could lead to memory fragmentation if it grows too much. If so, what should I use instead?
Would a pointer to a vector be enough? I don't know whether memory fragmentation would matter less if the vector is stored outside the object.
From the comments:
In the biggest test case I have, the mother object with the biggest list has 10,000 elements. However, there are 23,000 mother objects in that case. So we could speak of a total of 230,000,000 "basic structs" as a maximum, given there are no bigger cases than this.
Use vectors.
You shouldn't worry about memory fragmentation when the biggest contiguous array of memory you need contains about 10,000 elements (at, say, 30 bytes per element, that means roughly 300 kB). Today's memory allocators are efficient enough to manage a few hundred kilobytes of contiguous memory without trouble. If you want to know more about memory fragmentation, here's a question about it.
The fact that you can have a lot of "mother objects" doesn't matter, since they don't need to be contiguous in memory.
You can also read about deques if you want to dig a little deeper.
It shouldn't make a big difference unless the sequence is very big, but a vector uses contiguous memory, so it doesn't lead to fragmentation, whereas a list asks for space in different portions of memory for each node, which could eventually lead to fragmentation.
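To make that concrete, here is a minimal sketch of the layout being suggested; the struct fields and names are made up for illustration:

#include <cstdint>
#include <vector>

// A "basic struct" element composed of basic types (well under 30 bytes).
struct Item {
    std::int32_t id;
    double value;
    float weight;
};

// The "mother object" owns the vector by value; the vector's elements live
// in one contiguous heap block that grows as needed.
struct Mother {
    std::vector<Item> items;
};

int main() {
    std::vector<Mother> mothers(23000);        // many mother objects
    mothers[0].items.reserve(10000);           // optional: known worst case
    mothers[0].items.push_back(Item{1, 3.14, 1.0f});
}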

Is it okay to allocate more memory than you need for array when you do not know exact amount of elements but can estimate upper bound?

C++.
I want to create an array and store elements in it. I can estimate an upper bound on the number of elements that holds in most cases, let's say about 98% of cases. Is it better in terms of speed and beauty to create a static array with size equal to that upper bound instead of allocating dynamically?
To be more specific, let's say that the number of elements varies between 10,000 and 60,000, at 2 bytes per element. In very rare cases the count can exceed 60,000 (in which case I'll have to reallocate).
Is it okay to statically allocate an array of size 60,000 and use only part of it, reallocating to a bigger size in those rare cases, or is this practice too ugly?
Just use std::vector, which lets you resize as needed and reserve an estimated maximum size to improve performance. You can even make this static/global if you want to, but then you have the usual issues with global variables.
Unless you are creating new instances of the vector very frequently, which I doubt with such a large size, this should have good performance.
The std::vector class template is made to tackle exactly such problems. It automatically allocates new memory as elements are added, and its growth strategy, together with reserve(), keeps time-expensive reallocations to a minimum.
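A minimal sketch of that approach, assuming the 2-byte elements are something like std::uint16_t:

#include <cstdint>
#include <vector>

int main() {
    std::vector<std::uint16_t> data;
    data.reserve(60000);              // estimated upper bound, ~120 kB up front
    for (std::uint16_t v = 0; v < 10000; ++v)
        data.push_back(v);            // no reallocation until size() exceeds 60000
    // In the rare cases with more than 60000 elements, the vector simply
    // reallocates once more and keeps working.
}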

fast variable size container c++

I am making a chess engine, and have hit a brick wall with optimization. After using a profiler, I have found that the move generation is the biggest factor. When I looked closer, it turned out that a large portion of time generating moves was spent calling std::vector.push_back(move) when I had found a move.
Is there a way to have a dynamically sized C++ container that is fast? It can't be a fixed-size array, as I have no way of knowing ahead of time how many moves will be generated (although there are usually fewer than 50).
Does anyone have experience with this sort of issue? I would write my own container if necessary, but I feel like there should be a standard way of doing this.
Call std::vector::reserve() with an adequate size before the subsequent push_back() calls to avoid reallocating memory again and again.
vector::reserve() helps. You can profile to see the distribution of move counts and reserve a suitable number in advance. Don't worry too much about wasted memory: when you have 32-50 moves, the capacity the vector grows to might be 64 anyway, wasting 14-32 slots, so reserving 8 or even 16 extra slots does not cost much additional memory.
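A minimal sketch of that idea; the Move struct and the figure of 64 are hypothetical placeholders:

#include <cstdint>
#include <vector>

struct Move {                         // hypothetical move representation
    std::uint8_t from;
    std::uint8_t to;
    std::uint8_t promotion;
};

std::vector<Move> generate_moves(/* const Position& pos */) {
    std::vector<Move> moves;
    moves.reserve(64);                // usually fewer than 50 moves, so a single allocation
    // ... the real generation loop would push_back each legal move ...
    moves.push_back(Move{12, 28, 0});
    return moves;
}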
Do you need to access moves by index? Why not use std::list?
Or you can push_back a shared_ptr to each move and reserve some number in advance; since pointers are small, over-reserving wastes less memory.
Did you try profiling with std::deque? If you have no requirement that the objects be allocated contiguously, it might be an optimal solution. It provides constant-time insert and erase at the front; usually std::deque is preferred when you need to insert or erase at both ends of the sequence.
You can read the details in GotW 54.
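For comparison, a minimal sketch using std::deque (the Move struct is again a hypothetical placeholder); growth never copies existing elements because storage is a sequence of fixed-size blocks:

#include <cstdint>
#include <deque>

struct Move {
    std::uint8_t from;
    std::uint8_t to;
};

int main() {
    std::deque<Move> moves;
    for (std::uint8_t i = 0; i < 40; ++i)
        moves.push_back(Move{i, static_cast<std::uint8_t>(i + 8)});
    // No contiguous buffer to outgrow, at the cost of slightly slower
    // indexed access than std::vector.
}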
You can use std::vector and call its reserve method at appropriate places.
I use this method of profiling.
It doesn't surprise me that push_back is a big time-taker, and reserve should fix that.
However, if you profile again, you might find something else is the big time-taker, such as calls to new and delete for your move objects.
Fix that (by pooling), and do it again. Now, something else will be big.
Each time you do this, you get a speedup factor, and those factors multiply together, until you will be really pleased with the result.

Should I use boost fast pool allocator for following?

I have a server that keeps adding new items to a set throughout the course of 24 hours. Elements are not deleted over the 24-hour period; new elements just keep getting inserted.
Then at end of period the set is cleared, and new elements start getting added again for another 24 hours.
Do you think a fast pool allocator would be useful here as to reuse the memory and possibly help with fragmentation?
The set grows to around 1 million elements. Each element is about 1k.
It's highly unlikely …but you are of course free to test it in your program.
For a collection of that size and allocation pattern (more! more! more! + grow! grow! grow!), you should use an array of vectors. Keep the data in contiguous blocks, reserve() each block when it is created, and you never need to reallocate/resize or waste space and bandwidth traversing lists. A vector is going to be best for your memory layout with a collection that large -- not one big vector (which would take a long time to resize), but several vectors, each representing a chunk (the ideal chunk size can vary by platform -- I'd start with 5 MB each and measure from there). If you follow that approach, there is no need to resize or reuse memory; you just create an allocation every few minutes for the next N objects, so there is no need for high-frequency object allocation and destruction.
A pool allocator suggests you have a lot of objects with discontiguous allocations and lots of inserts and deletes -- like a list of large allocations -- which is bad here for a few reasons. If you want an implementation that optimizes for contiguous allocation at this size, just aim for the blocks-of-vectors approach. Allocation and lookup will both be close to minimal, and allocation times should be tiny relative to the other work you do. You will also have nothing unusual or surprising about your allocation patterns. The fast pool allocator, however, suggests you treat this collection as a list, which will have terrible performance for this problem.
Once you implement that block+vector approach, a better performance comparison (at that point) would be to compare boost's pool_allocator vs std::allocator. Of course, you could test all three, but memory fragmentation is likely going to be reduced far more by that block of vectors approach, if you implement it correctly. Reference:
If you are seriously concerned about performance, use fast_pool_allocator when dealing with containers such as std::list, and use pool_allocator when dealing with containers such as std::vector.
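A minimal sketch of the block-of-vectors idea described above; the 5 MB chunk size and the Element layout are placeholder values to be tuned:

#include <array>
#include <cstddef>
#include <vector>

// Placeholder for the roughly 1 kB element mentioned in the question.
struct Element {
    std::array<char, 1024> payload;
};

class ChunkedStore {
public:
    void push_back(const Element& e) {
        if (chunks_.empty() || chunks_.back().size() == kChunkElements) {
            chunks_.emplace_back();                  // start a new block
            chunks_.back().reserve(kChunkElements);  // one allocation per block
        }
        chunks_.back().push_back(e);
    }

    void clear() { chunks_.clear(); }                // end of the 24-hour period

private:
    static constexpr std::size_t kChunkElements = 5 * 1024;  // ~5 MB per block
    std::vector<std::vector<Element>> chunks_;
};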

are STL Containers .push_back() naughty

This might seem daft, for which I'm sorry; I've been writing a bit of code for the PlayStation 2 for uni. I am writing a sort of API for the Graphics Synthesizer, using a syntax similar to that of OpenGL, which is a state machine.
So the input would be something like
gsBegin(GS_TRIANGLE);
gsColor(...);
gsVertex3f(...);
gsVertex3f(...);
gsVertex3f(...);
gsEnd();
This is great so far for lines/triangles/quads with a fixed number of vertices; however, things like a LINE_STRIP or TRIANGLE_FAN take an undetermined number of points. I have been warned off several times from using STL containers in this situation because of the push_back() method and the time-sensitive nature of the code (is this justified?).
If it's not justified, what would be a better way of dealing with the undetermined count? Currently I have an array that can hold 30 vertices at a time; is this the best way of dealing with this kind of situation?
Vector's push_back has amortized constant time complexity because it exponentially increases the capacity of the vector. (I'm assuming you're using vector, because it's ideal for this situation.) However, in practice, rendering code is very performance sensitive, so if the push_back causes a vector reallocation, performance may suffer.
You can prevent reallocations by reserving the capacity before you add to it. If you call myvec.reserve(10);, you are guaranteed to be able to add 10 elements before the vector reallocates.
However, this still requires knowing ahead of time how many elements you need. Also, if you create and destroy lots of different vectors, you're still doing a lot of memory allocation. Instead, just use one vector for all vertices, and re-use it. Calling clear() returns it to empty while keeping its allocated capacity. This way you don't actually need to reserve anything - the first few times you use it it'll reallocate and grow, but once it reaches its peak size, it won't need to reallocate any more. The nice thing about this is the vector finds the approximate size it needs to be, and once it's "warmed up" there's no further allocation so it is high performance.
In short:
Use a single persistently stored std::vector
push_back as much as you like
When you're done, clear().
In practice this will perform as well as a C array, but without a hard limit on size.
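A minimal sketch of that pattern; the Vertex type and the function names are placeholders standing in for whatever the real gs* API does:

#include <vector>

struct Vertex { float x, y, z; };          // placeholder vertex type

std::vector<Vertex> g_vertices;            // persists across begin/end pairs

void strip_begin() { g_vertices.clear(); } // keeps the allocated capacity

void strip_vertex(float x, float y, float z) {
    g_vertices.push_back(Vertex{x, y, z}); // no allocation once "warmed up"
}

void strip_end() {
    // ... hand g_vertices.data() / g_vertices.size() to the Graphics Synthesizer ...
}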
University, eh? Just tell them push_back has amortized constant time complexity and they'll be happy.
First, avoid using glBegin / glEnd if you can, and instead use something like glDrawArrays or glDrawElements.
push_back() on a std::vector is a quick operation unless the array needs to grow when the operation occurs. Set the vector capacity as high as you think you will need it to be and you should see minimal overhead. 'Raw' arrays will usually be slightly faster, but then you have to deal with using 'raw' arrays.
There is always the alternative of using a deque.
A deque is very much like a vector, apart from contiguity. Basically, it's often implemented as a vector of arrays.
This means a lower allocation cost, but element access might be slightly slower (though still constant time) because of the double indirection, so I am unsure whether it's profitable in your case.
There is also the LLVM alternative: SmallVector<T,N>, which preallocates space for N elements right inside the vector object, and simply falls back to a traditional heap-allocated vector-like implementation once the size grows beyond that.
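A brief usage sketch, assuming LLVM's ADT headers are available in the build:

#include "llvm/ADT/SmallVector.h"

struct Vertex { float x, y, z; };

void collect_strip() {
    // Space for 32 vertices lives inside the object itself; only if a 33rd
    // element is pushed does SmallVector fall back to a heap allocation.
    llvm::SmallVector<Vertex, 32> verts;
    verts.push_back(Vertex{0.0f, 1.0f, 2.0f});
}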
The drawback to using std::vector in this kind of situation is making sure you manage your memory allocation properly. On systems like the PS2 (PS3 seems to be a bit better at this), memory allocation is insanely slow and if you don't reserve the right amount of space in the vector to begin with (and it has to resize several times when adding items), you will slow your game to a creeping crawl. If you know what your max size is going to be and reserve it when you create the vector, you won't have a problem.
That said, if this vector is going to be a temporary/local variable, you will still be reallocating memory every time your function is called. So if this function is called every frame, you will still have the performance problem. You can get around this by using a custom allocator and/or making the vector global (or a member variable of a class that will exist for the duration of your game loop).
You can always equip the container you want to use with a proper allocator that takes into account the limitations of the platform and the expected grow/shrink scenarios, etc.
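One modern way to express that, sketched under the assumption that C++17's <memory_resource> is available on the target (it would not be on a real PS2 toolchain, so treat this purely as an illustration of the idea):

#include <cstddef>
#include <memory_resource>
#include <vector>

struct Vertex { float x, y, z; };

int main() {
    // A fixed buffer on the stack; the monotonic resource hands out pieces of
    // it and releases nothing until the resource itself is destroyed.
    std::byte buffer[16 * 1024];
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof(buffer));

    std::pmr::vector<Vertex> verts(&arena);  // the vector allocates from the arena
    verts.reserve(256);
    verts.push_back(Vertex{0.0f, 1.0f, 2.0f});
}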