Choosing an STL container for a very large list - C++

I have a very large list of items (~2 million) that I want to optimize for access speed. I iterate through the items using an iterator (++it).
Right now the code is implemented using std::map<std::wstring, STRUCT>.
I wonder whether it's worth changing the std::map to a std::deque<std::pair<std::wstring, STRUCT>>. I think I would benefit from pointer arithmetic and fewer cache misses. Is it worth it?
I know that profiling is the answer, but I need an opinion before implementing this ...

If you know the size in advance, then std::vector is clearly the way to go, provided your objects aren't too big.
std::vector<Object> list;
list.reserve(2000000);
And then fill it as usual.
This is the fastest and least memory-consuming approach. However, you need to be able to allocate enough contiguous memory; unless your objects are on the order of 1 KB each, that shouldn't be a problem.

With a deque, you would lose (or would have to re-implement) the advantage of key-value pairs. If that's not essential for your data, I would consider using a deque.

Generally, if you're only doing searches in this set (no insertions/deletions), you're probably better off using a sorted sequential container, like deque or vector. You can then use a simple binary search to find the needed elements. The advantage of a sequential container is that it is better in terms of memory usage, has a very simple implementation, and provides better locality of reference. I'd write one version of the code using vector and another using deque, then compare them in terms of performance to decide which one to use in the final version.
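As a minimal sketch of that sorted-vector lookup (Item and find_key are illustrative names of mine, with STRUCT standing in for your payload; the vector is assumed to already be sorted by key):

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

struct STRUCT { /* your payload */ };
using Item = std::pair<std::wstring, STRUCT>;

// items must already be sorted by .first for the binary search to be valid.
const STRUCT* find_key(const std::vector<Item>& items, const std::wstring& key) {
    auto it = std::lower_bound(
        items.begin(), items.end(), key,
        [](const Item& a, const std::wstring& k) { return a.first < k; });
    return (it != items.end() && it->first == key) ? &it->second : nullptr;
}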
However, if your structure needs to be updated (new elements inserted or old elements deleted frequently), a map is the better choice. Or maybe you should drop STL containers altogether and just use an in-memory database (see SQLite), but that highly depends on what problem you're solving.

The fastest container to iterate through is usually a vector, so if you want to optimize for iteration at the expense of everything else, use that.
Overall app performance will of course depend on how many times you iterate, and on how you construct your data in the first place. For a simple test, once your map has been populated, you can construct a vector from it as follows:
vector<pair<K,V> > myvec(mymap.begin(), mymap.end());
Where K and V are the key and value types of the map. Then just use the vector iterators in place of the map iterators and compare performance.
Of course, if you want to modify the map in future, then normally it would not be appropriate to optimize for iteration at the expense of everything else.
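For reference, here is a rough way the timing comparison above could be set up (a sketch with names of my own; the volatile sink only keeps the compiler from optimizing the loop away, and it assumes string-like values with a size() method):

#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

template <class Container>
long long iterate_ms(const Container& c) {
    auto t0 = std::chrono::steady_clock::now();
    volatile std::size_t sink = 0;   // keeps the loop from being optimized away
    for (const auto& kv : c) sink = sink + kv.second.size();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
}

// Usage, given std::map<std::string, std::string> mymap:
//   std::vector<std::pair<std::string, std::string>> myvec(mymap.begin(), mymap.end());
//   compare iterate_ms(mymap) against iterate_ms(myvec)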

Related

Memory efficient std::map alternative

I'm using a std::map to store about 20 million entries. If they were stored without any container overhead, it would take approximately 650 MB of memory. However, since they are stored using std::map, it uses up about 15 GB of memory (i.e. too much).
The reason I am using an std::map is because I need to find keys that are equal to/larger/smaller than x. This is why something like sparsehash wouldn't work (since, using that, I cannot find keys by comparison).
Is there an alternative to using std::map (or ordered maps in general) that would result in less memory usage?
EDIT: Writing performance is much more important than reading performance. It will probably only read ~10 entries, but I don't know which entries it will read.
One alternative would be to use flat_map from Boost.Container: it supports the same interface as std::map, but is backed by a sorted contiguous array (think std::vector) instead of a tree. Or hand-roll your own solution based on the same idea.
Its performance characteristics are of course different, due to the different back-end. It's up to you to evaluate whether it's usable in your case.
Are you writing on-the-fly, or once before the lookups are done? If the latter is the case, you shouldn't need a map; you could use std::vector and a one-time sort.
You could just insert everything unsorted into the vector, sort once after everything is there (O(N log N), the same as building the std::map, but with much better constant factors), and then look up in the sorted array (O(log N), the same as in the std::map).
And especially if you know the number of elements before reading and can reserve the vector size upfront, that can work pretty well. Or, at least, if you know some upper bound, reserve perhaps slightly more than actually needed and avoid the reallocations.
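A sketch of that build-once pattern (Entry, build, and the key type are illustrative, not from the question):

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Entry {
    std::uint64_t key;
    // ... payload ...
    bool operator<(const Entry& rhs) const { return key < rhs.key; }
};

std::vector<Entry> build(std::size_t expected_count) {
    std::vector<Entry> v;
    v.reserve(expected_count);      // no reallocations while filling
    // ... push_back all entries, unsorted ...
    std::sort(v.begin(), v.end());  // one-time O(N log N)
    return v;                       // ready for std::lower_bound lookups
}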
Given your requirements:
Insertion needs to be quick
There are many elements to read
Read-back can be slow
You only read back data once
I'd consider typedef std::pair<uint64, thirty_six_byte_struct> element; and populating a std::list<element>. That will be hard to beat in terms of performance.
For reading back, I'd simply traverse the linked list, checking at every node whether you need that element. That's an O(N) traversal, but as you say, you'll only do it once.
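A hedged sketch of that approach (thirty_six_byte_struct here is a stand-in for the real 36-byte payload, and the predicate/callback names are mine):

#include <cstdint>
#include <list>
#include <utility>

struct thirty_six_byte_struct { char payload[36]; };  // stand-in for the real data
using element = std::pair<std::uint64_t, thirty_six_byte_struct>;

std::list<element> items;

void add(std::uint64_t key, const thirty_six_byte_struct& value) {
    items.emplace_back(key, value);  // O(1), never moves existing nodes
}

template <class Pred, class Fn>
void read_back_once(Pred wanted, Fn use) {
    for (const element& e : items)   // the single O(N) traversal
        if (wanted(e.first)) use(e);
}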
Turns out the issue wasn't std::map.
I realized I was using 3 separate maps to represent various parts of the same data, and after slimming that down to 1, the difference in memory was entirely negligible.
Looking at the code a little more, I realized the code I had written to free a really expensive struct (per element of the map) didn't actually work.
Fixing that part, it now uses <1 GB of memory, as it should! :)
TL;DR: std::map's overhead is entirely negligible for this. The issue was my own.

Fast data structure that supports finding the minimum element and accessing, inserting, removing and updating data at any index

I'm looking for ideas to implement a templatized sequence container data structure which can beat the performance of std::vector in as many features as possible and potentially perform much faster. It should support the following:
Finding the minimum element (and returning its index)
Insertion at any index
Removal at any index
Accessing and updating any element by index (via operator[])
What would be some good ways to implement such a structure in C++?
You can generally be pretty sure that the STL implementations of all containers are very good at the range of tasks they were designed for. That is to say, you're unlikely to be able to build a container that is as robust as std::vector and quicker for all applications. However, generally speaking, it is almost always possible to beat a generic tool when optimizing for a specific application.
First, let's think about what a vector actually is. You can think of it as a pointer to a C-style array, except that its elements are stored on the heap. Unlike a C array, it also provides a bunch of methods that make it a little more convenient to manipulate. But like a C array, all of its data is stored contiguously in memory, so lookups are extremely cheap, but changing its size may require the entire array to be shifted elsewhere in memory to make room for the new elements.
Here are some ideas for how you could do each of the things you're asking for better than a vanilla std::vector:
Finding the minimum element: Search is typically O(N) for many containers, and certainly for a vector (because you need to iterate through all elements to find the lowest). You can make it O(1), or very close to free, by simply keeping track of the smallest element at all times, and only updating it when the container changes (a sketch of this idea follows the list).
Insertion at any index: If your elements are small and there are not many, I wouldn't bother tinkering here; just do what the vector does and keep elements contiguously next to each other to keep lookups quick. If you have large elements, store pointers to the elements instead of the elements themselves (Boost's stable_vector will do this for you). Keep in mind that this makes lookup more expensive, because you now need to dereference the pointer, so whether you want to do this will depend on your application. If you know the number of elements you are going to insert, std::vector provides the reserve method which preallocates some memory for you, but what it doesn't do is allow you to decide how the size of the allocated memory grows. So if your application warrants lots of push_back operations without enough information to intelligently call reserve, you might be able to beat the standard std::vector implementation by tailoring the growth function of your container to your particular needs. Another option is using a linked list (e.g. std::list), which will beat an std::vector in insertions for larger containers. However, the cost here is that lookup (see 4.) will now become vastly slower (O(N) instead of O(1) for vectors), so you're unlikely to want to go down this path unless you plan to do more insertions/erasures than lookups.
Removal at any index: Similar considerations as for 2.
Accessing and updating any element by index (via operator[]): The only way you can beat std::vector in this regard is by making sure your data is in the cache when you try to access it. This is because lookup for a vector is essentially an array lookup, which is really just some pointer arithmetic and a pointer dereference. If you don't access your vector often, you might be able to squeeze out a few clock cycles by using a custom allocator (see Boost.Pool) and placing your pool close to the stack pointer.
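To make point 1 concrete, here is a sketch of the cached-minimum idea (MinTrackingVector and all its method names are illustrative, not a real library type):

#include <cstddef>
#include <vector>

template <class T>
class MinTrackingVector {
    std::vector<T> data_;
    std::size_t min_idx_ = 0;

    void rescan() {  // O(N) fallback when the cached minimum is invalidated
        min_idx_ = 0;
        for (std::size_t i = 1; i < data_.size(); ++i)
            if (data_[i] < data_[min_idx_]) min_idx_ = i;
    }

public:
    std::size_t min_index() const { return min_idx_; }  // precondition: not empty
    const T& operator[](std::size_t i) const { return data_[i]; }

    void insert_at(std::size_t i, const T& value) {
        data_.insert(data_.begin() + i, value);
        if (i <= min_idx_ && data_.size() > 1) ++min_idx_;  // old minimum shifted right
        if (value < data_[min_idx_]) min_idx_ = i;
    }

    void erase_at(std::size_t i) {
        data_.erase(data_.begin() + i);
        if (i == min_idx_) rescan();        // we lost the minimum
        else if (i < min_idx_) --min_idx_;  // minimum shifted left
    }

    void update(std::size_t i, const T& value) {
        data_[i] = value;
        if (value < data_[min_idx_]) min_idx_ = i;
        else if (i == min_idx_) rescan();   // the old minimum may have grown
    }
};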
I stopped writing mainly because there are dozens of ways in which you could approach this problem.
At the end of the day, this is probably more of an exercise in teaching you that the implementation of std::vector is likely to be extremely efficient for most compilers. All of these suggestions are essentially micro-optimizations (which are the root of all evil), so please don't blindly apply these in important code, as they're highly likely to end up costing you a lot of time and headache.
However, that's not to say you shouldn't tinker and learn for yourself, so by all means go ahead and try to beat it for your application and let us know how you go! Good luck :)

Keeping an unordered list of small objects with frequent insertions and removals

Suppose I have a list of small objects that I iterate through (say, in a loop) with frequent insertions and removals. However, the sequential order that I iterate through the list does not matter. Instead of using std::list to store the elements, I was thinking about using std::vector in the following way (for constant time removals):
Insertion: use push_back to insert at the end of the array.
Removal: let's say I want to remove an element at position k from a vector of size n. Then, I copy the content of the nth (or (n-1)st, depending on how you see it) element to the kth element and use pop_back. Given that the elements are small, the copy operation shouldn't be costly.
This is to take advantage of contiguous memory and not having to dynamically allocate memory for every insertion. Is there a downside for this approach? I also noticed that C++11 has unordered_set, but I think this may be overkill for what I'm trying to do.
I apologize if this idea sounds blatantly obvious.
Your idea is the basic approach to keep an array efficient. If the order really doesn't matter for you, I think it's the ideal approach. You might want to encapsulate it in a class (a wrapper around std::vector) so that you can employ it in multiple places without code duplication, test it separately and generally follow the "single responsibility" principle.
If you have access to C++11 features, you won't even have to copy the n-th element - you can move it instead, making this feasible even for heavier objects.
I can't see a downside to the approach given your fairly loose requirements.
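A minimal sketch of that erase, using the C++11 move mentioned above (unordered_erase is my name for it):

#include <cstddef>
#include <utility>
#include <vector>

template <class T>
void unordered_erase(std::vector<T>& v, std::size_t k) {
    if (k + 1 != v.size())           // avoid a self-move on the last slot
        v[k] = std::move(v.back());  // overwrite slot k with the last element
    v.pop_back();                    // O(1); sequential order is not preserved
}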
One other option to consider: if your item is cheaper to swap than to copy, you can swap the last item with the one to delete and then pop your now-swapped item off the end.
It really does sound like unordered_set is too heavyweight for your needs, since it provides constant-time find that you don't need for your requirements.

Initializing a std::map when the size is known in advance

I would like to initialize a std::map. For now I am using ::insert, but I feel I am wasting some computational time since I already know the size I want to allocate. Is there a way to allocate a fixed-size map and then fill it?
No, the members of the map are internally stored in a tree structure. There is no way to build the tree until you know the keys and values that are to be stored.
The short answer is: yes, this is possible, but it's not trivial. You need to define a custom allocator for your map. The basic idea is that your custom allocator will set aside a single block of memory for the map. As the map requires new nodes, the allocator will simply assign them addresses within the pre-allocated block. Something like this:
std::map<KeyType, ValueType, std::less<KeyType>, MyAllocator> myMap;
myMap.get_allocator().reserve( nodeSize * numberOfNodes );
There are a number of issues you'll have to deal with, however.
First, you don't really know the size of each map node or how many allocations the map will perform. These are internal implementation details. You can experiment to find out, but you can't assume that the results will hold across different compilers (or even future versions of the same compiler). Therefore, you shouldn't worry about allocating a "fixed" size map. Rather, your goal should be to reduce the number of allocations required to a handful.
Second, this strategy becomes quite a bit more complex if you want to support deletion.
Third, don't forget memory alignment issues. The pointers your allocator returns must be properly aligned for the various types of objects the memory will store.
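To make the shape of this concrete, here is a much-simplified bump-allocator sketch (ArenaAllocator is an illustrative name of mine, not a real library type; it never reclaims individual nodes, so it only suits build-once maps and sidesteps the deletion issue above):

#include <cstddef>
#include <functional>
#include <map>
#include <new>

template <class T>
struct ArenaAllocator {
    using value_type = T;

    char* pool;           // the single pre-allocated block
    std::size_t capacity; // block size in bytes
    std::size_t* offset;  // bump pointer, shared across rebound copies

    ArenaAllocator(char* p, std::size_t cap, std::size_t* off)
        : pool(p), capacity(cap), offset(off) {}

    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& o)
        : pool(o.pool), capacity(o.capacity), offset(o.offset) {}

    T* allocate(std::size_t n) {
        // Round up to T's alignment (the third issue above).
        std::size_t aligned = (*offset + alignof(T) - 1) & ~(alignof(T) - 1);
        if (aligned + n * sizeof(T) > capacity) throw std::bad_alloc();
        *offset = aligned + n * sizeof(T);
        return reinterpret_cast<T*>(pool + aligned);
    }

    void deallocate(T*, std::size_t) {}  // no per-node reclamation (the second issue)
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) { return a.pool == b.pool; }
template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) { return !(a == b); }

// Usage sketch:
//   char buffer[1 << 20];
//   std::size_t used = 0;
//   using A = ArenaAllocator<std::pair<const int, int>>;
//   std::map<int, int, std::less<int>, A> m{A(buffer, sizeof buffer, &used)};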
All that being said, before you try this, make sure it's necessary. Memory allocation can be very expensive, but you still shouldn't assume that it's a problem for your program. Measure to find out. You should also consider alternative strategies that more naturally allow pre-allocation. For example, a sorted list or a std::unordered_map.
Not sure if this answers your question, but Boost.Container has a flat_map in which you can reserve space. Basically you can see this as a sorted vector of (key, value) pairs. Tip: if you also know that your input is sorted, you can use insert with hint for maximal performance.
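For example (a sketch assuming Boost is available; the size and values are illustrative):

#include <boost/container/flat_map.hpp>
#include <string>

void fill_sorted() {
    boost::container::flat_map<int, std::string> m;
    m.reserve(1000);  // pre-allocate the underlying contiguous storage

    // With input already sorted by key, hinting with end() turns each
    // insert into an append rather than a search.
    for (int k = 0; k < 1000; ++k)
        m.insert(m.end(), {k, "value"});
}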
There are several good answers to this question already, but they miss some primary points.
Initialize the map directly
The map knows the size up front if initialized directly with iterators:
auto mymap = std::map(it_begin, it_end); // class template argument deduction (C++17)
This is the best way to dodge the issue. If you are agnostic about the implementation, the map can then know the size up front from the iterators, and you move the issue into the std:: implementation to worry about.
Alternatively use insert with iterators instead, that is:
mymap.insert(it_begin, it_end);
See: https://en.cppreference.com/w/cpp/container/map/insert
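Concretely (a small illustrative example; the source vector is mine):

#include <map>
#include <string>
#include <utility>
#include <vector>

std::vector<std::pair<int, std::string>> src{
    {1, "one"}, {2, "two"}, {3, "three"}};

// If src is already sorted by key, the standard guarantees the
// range constructor runs in linear time.
std::map<int, std::string> mymap(src.begin(), src.end());

// Or range-insert into an existing map:
// mymap.insert(src.begin(), src.end());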
Beware of premature optimization
but I feel I am wasting some computational time.
This sounds a lot like you are optimizing prematurely (meaning you do not know where the bottleneck is; you are guessing, or seeing an issue that isn't really one). Instead, measure first and then optimize - repeat if necessary.
Memory allocation could already be optimized, to a large degree
Rolling your own block allocator for the map could be close to fruitless. On modern systems (here I include the OS/hardware and the C++ language level), memory allocation is already very well optimized for the general case, and you could be looking at little or no improvement from rolling your own block allocator. Even if you take a lot of care and get the map into one contiguous array - while an improvement in itself - you could still face the problem that, in the end, the elements are placed randomly within the array (e.g. in insertion order) and are less cache-friendly anyway (this depends very much on your actual use case, though - I'm assuming a super large data set).
Use another container or third party map
If you are still facing this issue - the best approach is probably to use another container (eg. a sorted std::vector - use std::lower_bound for lookups) or use a third party map optimized for how you are using the map. A good example is flat_map from boost - see this answer.
Conclusion
Let the std::map worry about the issue.
When performance is the main issue: use a data structure (perhaps 3rd party) that best suits how your data is being used (random inserts or bulk inserts / mostly iteration or mostly lookups / etc.). You then need to profile and gather performance metrics to compare.
You are talking about block allocators, but they are hard to implement. Measure before thinking about such hard things. Anyway, Boost has some articles about implementing a block allocator. Or use an already-implemented preallocated map, Stree.

Would I see a performance gain using std::map instead of vector<pair<string, string> >?

I currently have some code where I am using a vector of pair<string,string>. This is used to store some data from XML parsing, and as such the process is quite slow in places. In terms of trying to speed up the entire process, I was wondering if there would be any performance advantage in switching from vector<pair<string,string> > to std::map<string,string>? I could code it up and run a profiler, but I thought I would see if I could get an answer that suggests some obvious performance gain first. I am not required to do any sorting; I simply add items to the vector, then at a later stage iterate over the contents and do some processing - I have no need for sorting or anything of that nature. I am guessing that perhaps I would not get any performance gain, but I have never actually used a std::map before, so I don't know without asking or coding it all up.
No. If (as you say) you are simply iterating over the collection, you will see a small (probably not measurable) performance decrease by using a std::map.
Maps are for accessing a value by its key. If you never do this, map is a bad choice for a container.
If you are not modifying your vector<pair<string,string> > - just iterating over it again and again - you will get performance degradation by using a map. This is because a typical map is organized as a binary tree of nodes, each of which can be allocated in a different memory block (unless you write your own allocator). In addition, each map node manages pointers to its neighbor nodes, which is time and memory overhead too. In exchange, search by key is an O(log N) operation. On the other side, a vector holds its data in one block, so the processor cache usually feels better with it. Searching a vector is an O(N) operation, which is not as good but acceptable; a search in a sorted vector can be upgraded to O(log N) using lower_bound and similar functions.
It depends on the operations you are doing on this data. If you do many searches, it's probably better to use a hashing container like unordered_map, since search by key in those containers is an O(1) operation. For iterating, as mentioned, a vector is faster.
It may also be worth replacing the strings in your pair, but that highly depends on what you hold there and how you access the container.
The answer depends on what you are doing with these data structures and what their size is. If you have thousands of elements in your std::vector<std::pair<std::string, std::string> > and you keep searching for the first element over and over, using a std::map<std::string, std::string> may improve the performance (you might want to consider using std::unordered_map<std::string, std::string> for this use case instead). If your vectors are relatively small and you aren't inserting elements into the middle too often, using vectors may very well be faster. If you just iterate over the elements, vectors are a lot faster than maps: iteration isn't really one of a map's strengths. Maps are good at looking things up, assuming the number of elements isn't really small, because otherwise a linear search over a vector is still faster.
The best way to determine where the time is spent is to profile the code: it is often not entirely clear up front where the time goes. Frequently, the suspected hot spots are actually non-problematic and other areas show unexpected performance problems. For example, you might be passing your objects by value rather than by reference in some obscure place.
If your usage pattern is such that you perform many insertions before performing any lookups, then you might benefit from implementing a "lazy" map where the elements are sorted on demand (i.e. when you acquire an iterator, perform a lookup, etc).
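A sketch of that lazy idea (LazyMap and its members are illustrative; appends stay O(1) and the sort is deferred until the first lookup):

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

class LazyMap {
    std::vector<std::pair<std::string, std::string>> data_;
    bool sorted_ = true;

public:
    void insert(std::string key, std::string value) {
        data_.emplace_back(std::move(key), std::move(value));
        sorted_ = false;  // defer the sort
    }

    const std::string* find(const std::string& key) {
        if (!sorted_) {
            std::sort(data_.begin(), data_.end());  // sort on first lookup
            sorted_ = true;
        }
        auto it = std::lower_bound(
            data_.begin(), data_.end(), key,
            [](const std::pair<std::string, std::string>& p,
               const std::string& k) { return p.first < k; });
        return (it != data_.end() && it->first == key) ? &it->second : nullptr;
    }
};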
std::vector stores its items in linear (contiguous) memory, so it first allocates a memory block with an initial capacity, and then, when you want to insert a new item, it checks whether it has more room; if not, it allocates a new buffer with more space, copy-constructs all items into the new buffer, and then deletes the source buffer and switches to the new one.
When you just start inserting items into the vector and you have a lot of items, you suffer from too many reallocations, copy constructions, and destructor calls.
To solve this problem, if you know the count of the input items (not exactly, but a usual length), you can reserve some memory for the vector up front and avoid the reallocations and everything that goes with them.
If you have no idea about the size, you can use a collection like std::list, which never reallocates its internal items.