Working on some legacy code, I am running into memory issues due mainly (I believe) to the extensive use of STL maps (particularly “maps-of-maps”.)
I am looking at Boost flat_map as a possible solution. Does anyone have firsthand experience with flat_map, in particular with regard to improvements in speed and/or memory usage? I realize of course this can be very dependent on the types of data stored and the manner in which they are stored, but I am still curious about folks’ actual experience.
Can anyone point me to some solid examples?
As an example: there are several cases in this code of a map-of-a-map; that is, a map where the value is another map.
By replacing the “inner” map with a pair of vectors, I reduced the memory footprint 10:1 (from 3 GB to 300 MB). Of course this can slow down searches, but for this particular case it doesn’t seem to matter much. And it involved about a day of refactoring and careful testing.
Boost’s flat_map sounds like it might be just what I need but I can’t seem to find out much about it other than the class description on the Boost web site. Looking for some firsthand feedback.
Boost's flat_map is an ordered map implementation that stores its elements in a single sorted vector of key-value pairs; lookups are binary searches over that vector rather than walks through a node-based tree.
You can basically work out the answers regarding performance (relative to a std::map) yourself based on that fact:
Iterating the map, or a large part of it, should be very fast compared to std::map
Lookup should typically be relatively fast
Adding or removing values is theoretically much slower, but in practice - assuming your key and value types are small and the number of map elements is not very high - it is probably comparable in speed (or even faster on small maps, since an insert often needs no allocation)
etc.
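As a concrete illustration, here is a minimal sketch (assuming Boost is installed) of flat_map used as a near drop-in replacement for std::map; note reserve(), which std::map has no equivalent of:

    #include <boost/container/flat_map.hpp>
    #include <iostream>
    #include <string>

    int main() {
        // Same interface as std::map, but backed by one contiguous sorted vector.
        boost::container::flat_map<int, std::string> m;
        m.reserve(3);                 // pre-allocate the backing vector up front
        m[2] = "two";
        m[1] = "one";                 // inserting before 2 shifts elements right
        m.emplace(3, "three");

        auto it = m.find(2);          // binary search over the vector, O(log n)
        if (it != m.end())
            std::cout << it->second << '\n';
    }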
In your case - maps-of-maps - you're going to lose some of the benefit of "flattening things out", since the outer map will still hold a pointer (if not more levels of indirection) to each inner map; but a flat map would at least reduce that. Also, supposing you have two levels of maps, you could arrange to store all of the inner maps contiguously (either by constructing the inner maps appropriately or by instantiating them with your own allocator, a trickier affair); in that case, you could replace pointers to maps with map indices, reducing the amount of space they take up and making life easier for the compiler. A sketch of that index-based arrangement follows.
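Here is one hypothetical way the index-based layout could look (NestedMaps and inner_for are illustrative names, not from any library); note that only the inner map objects are contiguous - each still owns its own element buffer:

    #include <boost/container/flat_map.hpp>
    #include <cstdint>
    #include <vector>

    using Inner = boost::container::flat_map<int, double>;

    struct NestedMaps {
        boost::container::flat_map<int, std::uint32_t> outer; // outer key -> index
        std::vector<Inner> inners;                            // all inner maps, side by side

        Inner& inner_for(int key) {
            auto it = outer.find(key);
            if (it == outer.end()) {
                inners.emplace_back();                        // create a new inner map
                it = outer.emplace(key, std::uint32_t(inners.size() - 1)).first;
            }
            return inners[it->second];                        // 4-byte index, not an 8-byte pointer
        }
    };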
You might also want to read Boost's documentation of flat_map; and you could also just use the force and read the source (and the source of the underlying flat_tree) - like I have; I don't actually have flat_map experience myself.
I know this is an old question, but this might be of use to someone finding this question.
I found that flat_map was a big improvement in lookup and iteration over large maps. The fact that the map uses contiguous memory also makes inserting faster than you might expect, due to great data locality. If you're doing more inserts than lookups in your map, then it might not be for you.
Having said that, repeatedly inserting a random value into a sorted vector is faster than the same operation on a linked list, because of that data locality - despite what big-O analysis might tell you (tested in VS2017 and g++ 4.8).
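For reference, the sorted-vector insertion pattern being described is just this (a minimal sketch):

    #include <algorithm>
    #include <vector>

    // O(log n) to find the insertion point, O(n) to shift the tail, but the
    // shift is one contiguous move that the cache and prefetcher handle well.
    void insert_sorted(std::vector<int>& v, int value) {
        auto pos = std::lower_bound(v.begin(), v.end(), value);
        v.insert(pos, value);
    }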
Related
I want to make a container with a few elements, and I will just be checking whether an element is part of that set or not.
I know a vector would not be the appropriate container if the set were big, since each search would be worst-case O(n) and there are better options that use a hash function or binary trees.
However, I was wondering what happens if my set has only a few elements (e.g. just 5) and I know them in advance: is it worth implementing the container as a structure with a hash function?
Maybe if the set is small enough, the overhead introduced by applying the hash function is greater than just iterating through 5 elements.
For example in C++ using std::unordered_set instead of std::vector.
As always, thank you
There are many factors affecting the point where the std::vector falls behind other approaches. See "std::vector faster than std::unordered_set?" and "Performance of vector sort/unique/erase vs. copy to unordered_set" for some of the reasons why this is the case. As a consequence, any calculation of this point would have to be rather involved and complicated. The most expedient way to find this point is performance testing.
Keep in mind that some of the factors are based on the hardware being used. So not only do you need performance testing on your development machine, but also performance testing on "typical" end-user machines. Your development machine can give you a gut check, though (as can an online tool like Quick Bench), letting you know that you are not even in the ballpark yet. (The intuition of individual programmers is notoriously unreliable. Some people see 100 as a big number and worry about performance; others don't worry until numbers hit the billions. Either of these people would be blown away by the other's view.)
Given the difficulty in determining the point where std::vector falls behind, I think this is a good place to give a reminder that premature optimization is usually a waste of time. Investigating this performance out of curiosity is fine, but until this is identified as a performance bottleneck, don't hold up work on more important aspects of your project for this. Choose the approach that fits your code best.
That being said, I personally would assume the break point is well, well over 10 items. So for the 5-element collection of the question, I would be inclined to use a vector and not look back.
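For the 5-element case, the vector approach is as simple as it gets. A minimal sketch (using std::array since the five values are fixed; a std::vector works identically, and the set of primes is just an arbitrary example):

    #include <algorithm>
    #include <array>

    // Five known elements: a linear scan over contiguous storage involves
    // no hashing and no pointer chasing, just a few comparisons.
    constexpr std::array<int, 5> allowed{2, 3, 5, 7, 11};

    bool is_allowed(int x) {
        return std::find(allowed.begin(), allowed.end(), x) != allowed.end();
    }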
Why define and implement data structures (e.g. a stack) ourselves if they are already available in the C++ STL?
What are the differences between the two implementations?
First, implementing an existing data structure yourself is a useful exercise. You understand better what it does (and so you understand better what the standard containers do). In particular, you understand better why time complexity is so important.
Then, there is a quality of implementation issue. The standard implementation might not be suitable for you.
Let me give an example. std::stack does indeed implement a stack - a general-purpose implementation. But have you measured sizeof(std::stack<char>)? Have you benchmarked it with, say, a million stacks of 3.2 elements on average, Poisson-distributed?
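Checking that size takes one line; a minimal sketch (the 80-byte figure below is what a 64-bit libstdc++ build commonly reports, since the default backing container is std::deque):

    #include <iostream>
    #include <stack>

    int main() {
        // Prints the size of the stack object itself, not of its elements;
        // commonly 80 bytes on 64-bit libstdc++ (std::deque bookkeeping).
        std::cout << sizeof(std::stack<char>) << '\n';
    }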
Perhaps in your case, you happen to know that you have millions of stacks of chars (never NUL), and that 99% of them have fewer than 4 elements. With that additional knowledge, you should be able to implement something "better" than what the standard C++ stack provides. So std::stack<char> would work, but given that extra knowledge you'll be able to implement it differently. For readability and maintenance, you would still use the same method names as std::stack<char> - so your WeirdSmallStackOfChar would have a push method, etc. If, later in the project, you realize that bigger stacks might be useful (e.g. in 1% of cases), you'll reimplement your stack differently; for instance, if your code base grows to a million lines of C++ and you find that you quite often have bigger stacks, you might remove your WeirdSmallStackOfChar class and add typedef std::stack<char> WeirdSmallStackOfChar; ...
If you happen to know that all your stacks have fewer than 4 chars and that \0 is not valid in them, representing each such "stack" as a char w[4] field is probably the wisest approach. It is fast and easy to code.
So, if performance and memory space matter, you might perhaps code something as weird as:
    #include <stack>

    class MyWeirdStackOfChars {
        bool small;                        // true: smallstack is active, false: bigstack
        union {
            std::stack<char>* bigstack;    // heap-allocated stack for the rare big case
            char smallstack[4];            // in-place storage for the common small case
        };
    };
Of course, that is very incomplete. When small is true your implementation uses smallstack; for the 1% of cases where it is false, your implementation uses bigstack. The rest of MyWeirdStackOfChars is left as an exercise (not that easy) for the reader. Don't forget to follow the rule of five.
Ok, maybe the above example is not convincing. But what about std::map<int,double>? You might have millions of them, and you might know that 99.5% of them have fewer than 5 entries. You obviously could optimize for that case. It is highly probable that representing small maps as an array of (int, double) pairs is more efficient, both in terms of memory and in terms of CPU time.
Sometimes you even know that all your maps have fewer than 16 entries (which std::map<int,double> cannot know) and that the key is never 0. Then you might represent them differently. In that case, I guess I could implement something much more efficient than what std::map<int,double> provides (probably, because of cache effects, an array of 16 (int, double) entries is the fastest).
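A hypothetical sketch of that 16-entry representation (SmallIntDoubleMap is an illustrative name; it relies entirely on the assumptions above - at most 16 entries, key 0 never used):

    #include <array>
    #include <utility>

    class SmallIntDoubleMap {
        // Zero-initialized: a key of 0 marks a free slot, which is only
        // valid because we know 0 is never a real key.
        std::array<std::pair<int, double>, 16> slots{};
    public:
        double* find(int key) {
            for (auto& s : slots)              // 16 entries: a linear scan over
                if (s.first == key)            // two or three cache lines
                    return &s.second;
            return nullptr;
        }
        bool insert(int key, double value) {
            if (double* p = find(key)) { *p = value; return true; }
            for (auto& s : slots)
                if (s.first == 0) { s = {key, value}; return true; }
            return false;                      // full: the caller must handle this
        }
    };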
That is why any developer should know the classical algorithms (and have read some Introduction to Algorithms), even if in many cases they will use existing containers. Be aware of the as-if rule, too.
The STL implementation of a data structure is not perfect for every possible use case.
I like the example of hash tables. I have been using the STL implementation for a while, but I use it mainly for Competitive Programming contests.
Imagine that you are Google and you have billions of dollars in resources dedicated to storing and accessing hash tables. You would probably like to have the best possible implementation for the company's use cases, since it will save resources and make searches faster in general.
Oh, and I forgot to mention that you also have some of the best engineers on the planet working for you (:
(This video is a talk by Matt Kulukundis about the new hash table built by his team at Google:)
https://www.youtube.com/watch?v=ncHmEUmJZf4
Some other reasons that justify implementing your own version of Data Structures:
Test your understanding of a specific structure.
Customize part of the structure to some peculiar use case.
Seek better performance than STL for a specific data structure.
Hating STL errors.
Benchmarking STL against some simple implementation.
I have a map with about 100,000 pairs. Is there any way I can speed up searching when using find(), given that the keys are in alphabetical order? Also, how should I go about doing it? I know that you can specify a new comparator when you create the map, but will that speed up the find() function at all?
Thanks in advance.
[solved] Thanks a bunch guys, I have decided to go with a vector and use lower_bound and upper_bound to "snip" some of the searching.
Also, I am new here; is there any way to mark this question as answered, or pick a best answer?
A different comparator will only speed up find() if it manages to do the comparison faster (which, for strings, will usually be pretty difficult).
If you're basically inserting all the data in order, then doing the searching, it may be faster to use a std::vector with std::lower_bound or std::upper_bound.
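A minimal sketch of that lookup pattern for string keys (Entry and find_value are illustrative names):

    #include <algorithm>
    #include <string>
    #include <utility>
    #include <vector>

    using Entry = std::pair<std::string, int>;

    // Binary search over a vector kept sorted by key; the comparator only
    // looks at the key, so we can pass the bare string as the search value.
    int* find_value(std::vector<Entry>& v, const std::string& key) {
        auto it = std::lower_bound(v.begin(), v.end(), key,
            [](const Entry& e, const std::string& k) { return e.first < k; });
        return (it != v.end() && it->first == key) ? &it->second : nullptr;
    }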
If you don't really care about ordering, and just want to find the data as quickly as possible, you might find that std::unordered_map works better for you.
Edit: Just for the record: the way you "might find" or "may find" those things is normally by profiling. Depending on the situation, it might be enough faster that it's pretty obvious even in simple testing, so profiling isn't really necessary, but if there's (much) doubt, or you want to quantify the effect, a profiler is probably the right way to do it.
std::map is already taking advantage of the fact the keys are in alphabetical order - it guarantees that itself. You aren't going to be able to improve it by changing the comparator (one assumes it's already a reasonably efficient string comparison).
Have you considered using unordered_map (aka hash_map in various pre-C++11 implementations)? It should be able to search in O(1) on average, instead of O(log n) for std::map.
You could also look into something slightly more exotic, like a trie, but that's not part of the standard library so you'd either have to find one elsewhere or roll your own, so I'd suggest unordered_map is a good place to start.
If you're using std::find to find elements, you should switch to using map::find (you don't really say in your question). map::find uses the fact that the map is ordered to search much faster.
If that's still not good enough, you might look into a hash container such as unordered_map rather than map.
I've put in a vote for unordered_map but I wanted to also make another point.
One of the things that can hurt performance on modern machines is poor use of the cache. A map is going to have nodes allocated all over the place and there won't be much locality of reference. Also since it has to store a bunch of pointers between nodes it will use up more memory.
At the recent GoingNative 2012 conference, Bjarne Stroustrup gave an interesting talk that touched on this topic. He compared vector and list performance on a task involving a lot of random insertions and deletions, where it might seem list ought to have dominated; but because of the memory size and layout issue, vector was in fact the fastest by far. Take a look at his slides, starting at slide 43.
unordered_map gives you direct access to each element, so it probably means even less hopping around in memory than sticking your data in a vector (and thus better performance than vector). So my comment is simply an admonishment to always keep your memory access pattern in mind for performance.
I have a large collection of unique strings (about 500k). Each string is associated with a vector of strings. I'm currently storing this data in a
map<string, vector<string> >
and it's working fine. However, I'd like the look-up into the map to be faster than log(n). Under these constrained circumstances, how can I create a hashtable that supports O(1) look-up? It seems like this should be possible since I know all the keys ahead of time... and all the keys are unique (so I don't have to account for collisions).
Cheers!
You can create a hashtable with boost::unordered_map, std::tr1::unordered_map or (on C++0x compilers) std::unordered_map. That takes almost zero effort. Google sparsehash may be faster still and tends to take less memory. (Deletion can be a pain, but it seems you won't need that.)
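Since the ~500k keys are known ahead of time, one easy win with any of these is to size the table once up front; a minimal sketch with std::unordered_map (build and data are illustrative names):

    #include <string>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    using Value = std::vector<std::string>;

    std::unordered_map<std::string, Value> table;

    void build(const std::vector<std::pair<std::string, Value>>& data) {
        table.reserve(data.size());   // allocate buckets once: no rehash mid-build
        for (const auto& kv : data)
            table.emplace(kv.first, kv.second);
    }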
If the code is still not fast enough, you can exploit prior knowledge of the keys with a minimal perfect hash, as suggested by others, to obtain guaranteed O(1) performance. Whether the code-generation effort that takes is worth it depends on you; putting 500k keys into a tool like gperf may itself require writing a generator for the generator's input.
You may also want to look at CMPH, which generates a perfect hash function at run-time, though it is accessed through a C API.
I would look into creating a Perfect Hash Function for your table. This guarantees no collisions, which are expensive to resolve. Perfect Hash Function generators are also available.
What you're looking for is a Perfect Hash. gperf is often used to generate these, but I don't know how well it works with such a large collection of strings.
If you want no collisions for a known collection of keys you're looking for a perfect hash. The CMPH library (my apologies as it is for C rather than C++) is mature and can generate minimal perfect hashes for rather large data sets.
For small sets or maps, it's usually much faster to just use a sorted vector instead of the tree-based set/map, especially for something like 5-10 elements. LLVM has some classes in that spirit, but no real adapter that would provide a std::map-like interface backed by a std::vector.
Any (free) implementation of this out there?
Edit: Thanks for all the alternative ideas, but I'm really interested in a vector based set/map. I do have specific cases where I tend to create huge amounts of sets/maps which contain usually less than 10 elements, and I do really want to have less memory pressure. Think about for example neighbor edges for a vertex in a triangle mesh, you easily wind up with 100k sets of 3-4 elements each.
I just stumbled upon your question; hope it's not too late.
I recommend a great (open source) library named Loki.
It has a vector based implementation of an associative container that is a drop-in replacement for std::map, called AssocVector.
It offers better performance for accessing elements (and worse performance for insertions/deletions).
The library was written by Andrei Alexandrescu, author of Modern C++ Design.
It also contains some other really nifty stuff.
If you can't find anything suitable, I would just wrap a std::vector, keeping it sorted on insert, and implement find() using lower_bound(). It should be straightforward, and just as efficient as a custom solution.
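A minimal sketch of that wrapper for the set case (VectorSet is an illustrative name; it keeps the vector sorted with lower_bound rather than re-sorting on every insert):

    #include <algorithm>
    #include <vector>

    template <typename T>
    class VectorSet {
        std::vector<T> data;  // invariant: always sorted, no duplicates
    public:
        void insert(const T& x) {
            auto it = std::lower_bound(data.begin(), data.end(), x);
            if (it == data.end() || x < *it)   // skip if x is already present
                data.insert(it, x);
        }
        bool contains(const T& x) const {
            auto it = std::lower_bound(data.begin(), data.end(), x);
            return it != data.end() && !(x < *it);
        }
    };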
Old post, I know, but for more recent visitors, Boost's flat_set and flat_map look like what you need. See https://theboostcpplibraries.com/boost.container for more information.
I don't know any such implementation, but there are some functions that help working with sorted vectors already in STL, such as lower_bound and upper_bound.
If the set or map truly is small, the performance gained by micro-optimizing the data structure will have little to no noticeable effect. You'll save maybe one or two memory (read: cache) lookups when searching a tiny tree vs. a tiny vector, which in the big picture is insignificant.
Having said that, you could give hash_map a try. Lookups by key run in constant time on average (though not guaranteed: a bad hash or many collisions can degrade that).
Maybe you're looking for unordered maps and unordered sets. Try taking a look at the TR1 unordered containers that rely on hashing, or the Boost.Unordered container library. Underneath the interface, I'm not sure if they really use std::vector, but I'd wager it's worth taking a look.