Which one is better, STL map or unordered_map, for the following cases? - c++

I am trying to compare std::map and std::unordered_map for certain operations. I looked around on the net, and that only increased my doubts about which one is better as a whole. So I would like to compare the two on the basis of the operations they perform.
Which one performs faster in
Insert, Delete, Look-up
Which one takes less memory and less time to clear from memory? Any explanations are heartily welcome!
Thanks in advance

For a specific use, you should try both with your actual data and usage patterns and see which is actually faster... there are enough factors that it's dangerous to assume either will always "win".
implementation and characteristics of unordered maps / hash tables
Academically, as the number of elements grows towards infinity, those operations on a std::unordered_map (the C++ library offering for what Computing Science terms a "hash map" or "hash table") will tend to keep taking the same amount of time, O(1) (ignoring memory limits/caching etc.), whereas with a std::map (a balanced binary tree) each time the number of elements doubles it will typically need to do one extra comparison, so it gets gradually slower, O(log₂ n).
std::unordered_map implementations necessarily use open hashing: the fundamental expectation is that there'll be a contiguous array of "buckets", each logically a container of any values hashing thereto.
It generally serves to picture the hash table as a vector<list<pair<key,value>>> where getting from the vector elements to a value involves at least one pointer dereference as you follow the list-head-pointer stored in the bucket to the initial list node; the insert/find/delete operations' performance depends on the size of the list, which on average equals the unordered_map's load_factor.
If the max_load_factor is lowered (the default is 1.0), there will be fewer collisions but more reallocation/rehashing during insertion and more wasted memory (which can hurt performance through increased cache misses).
The memory usage for this most-obvious of unordered_map implementations involves both the contiguous array of bucket_count() list-head-iterator/pointer-sized buckets and one doubly-linked list node per key/value pair. Typically, bucket_count() + 2 * size() extra pointers of overhead, adjusted for any rounding-up of dynamic memory allocation request sizes the implementation might do. For example, if you ask for 100 bytes you might get 128 or 256 or 512. An implementation's dynamic memory routines might use some memory for tracking the allocated/available regions too.
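To get a feel for these numbers on your own implementation, the container exposes its bucket interface directly; a minimal sketch:

    #include <iostream>
    #include <string>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, std::string> m;
        for (int i = 0; i < 100000; ++i)
            m.emplace(i, "value");

        // Roughly one list node per element plus one bucket head per bucket,
        // as discussed above (exact layout is implementation-specific).
        std::cout << "size:            " << m.size() << '\n'
                  << "bucket_count:    " << m.bucket_count() << '\n'
                  << "load_factor:     " << m.load_factor() << '\n'
                  << "max_load_factor: " << m.max_load_factor() << '\n';

        m.max_load_factor(0.5f); // fewer collisions, more buckets (and memory)
        m.rehash(0);             // rehash so the new max_load_factor is respected
        std::cout << "bucket_count after lowering max_load_factor: "
                  << m.bucket_count() << '\n';
    }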
Still, the C++ Standard leaves room for real-world implementations to make some of their own performance/memory-usage decisions. They could, for example, keep the old contiguous array of buckets around for a while after allocating a new larger array, so rehashing values into the latter can be done gradually to reduce the worst-case performance at the cost of average-case performance as both arrays are consulted during operations.
implementation and characteristics of maps / balanced binary trees
A map is a binary tree, and can be expected to employ pointers linking distinct heap memory regions returned by different calls to new. As well as the key/value data, each node in the tree will need parent, left, and right pointers (see wikipedia's binary tree article if lost).
comparison
So, both unordered_map and map need to allocate nodes for key/value pairs with the former typically having two-pointer/iterator overhead for prev/next-node linkage, and the latter having three for parent/left/right. But, the unordered_map additionally has the single contiguous allocation for bucket_count() buckets (== size() / load_factor()).
For most purposes that's not a dramatic difference in memory usage, and the deallocation time difference for one extra region is unlikely to be noticeable.
another alternative
For those occasions when the container is populated up front and then repeatedly searched without further inserts/erases, it can sometimes be fastest to use a sorted vector, searched using the Standard algorithms binary_search, equal_range, lower_bound and upper_bound. This has the advantage of a single contiguous memory allocation, which is much more cache friendly. It generally outperforms map for this usage, but unordered_map may still be faster - measure if you care.
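A minimal sketch of that sorted-vector approach (the key/value types here are just for illustration):

    #include <algorithm>
    #include <string>
    #include <utility>
    #include <vector>

    int main() {
        // Populate up front, sort once, then search many times.
        std::vector<std::pair<int, std::string>> v = {
            {3, "three"}, {1, "one"}, {2, "two"}, {5, "five"}
        };
        std::sort(v.begin(), v.end(),
                  [](const auto& a, const auto& b) { return a.first < b.first; });

        // Key-only lookup via lower_bound over the contiguous, sorted storage.
        const int key = 2;
        auto it = std::lower_bound(v.begin(), v.end(), key,
                                   [](const auto& p, int k) { return p.first < k; });
        const bool found = (it != v.end() && it->first == key);
        (void)found;
    }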

The reason there is both is that neither is better as a whole.
Use either one. Switch if the other proves better for your usage.
std::map provides better space for worse time.
std::unordered_map provides better time for worse space.

The answer to your question is heavily dependent on the particular STL implementation you're using. Really, you should look at your STL implementation's documentation – it'll likely have a good amount of information on performance.
In general, though, according to cppreference.com, maps are usually implemented as red-black trees and support operations with time complexity O(log n), while unordered_maps usually support constant-time operations. cppreference.com offers little insight into memory usage; however, another StackOverflow answer suggests maps will generally use less memory than unordered_maps.
For the STL implementation Microsoft packages with Visual Studio 2012, it looks like map supports these operations in amortized O(log n) time, and unordered_map supports them in amortized constant time. However, the documentation says nothing explicit about memory footprint.

Map:
Insertion:
For the first version ( insert(x) ), logarithmic.
For the second version ( insert(position, x) ), logarithmic in general, but amortized constant if x is inserted right after the element pointed to by position (a small sketch of this case follows after this answer).
For the third version ( insert(first, last) ), N*log(size+N) in general (where N is the distance between first and last, and size the size of the container before the insertion), but linear if the elements between first and last are already sorted according to the same ordering criterion used by the container.
Deletion:
For the first version ( erase(position) ), amortized constant.
For the second version ( erase(x) ), logarithmic in container size.
For the last version ( erase(first,last) ), logarithmic in container size plus linear in the distance between first and last.
Lookup:
Logarithmic in size.
Unordered map:
Insertion:
Single element insertions:
Average case: constant.
Worst case: linear in container size.
Multiple elements insertion:
Average case: linear in the number of elements inserted.
Worst case: N*(size+1), i.e. the number of elements inserted times the container size plus one.
Deletion:
Average case: linear in the number of elements removed (constant when you remove just one element).
Worst case: Linear in the container size.
Lookup:
Average case: constant.
Worst case: linear in container size.
Knowing these complexities, you can decide which container to use according to the operations your implementation performs most often.
Source: www.cplusplus.com
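To make the hinted-insert case from the map list above concrete, here is a minimal sketch (under the C++11 rule, the hinted insert is amortized constant when the new element goes immediately before the hint, so ascending keys with an end() hint qualify):

    #include <map>

    int main() {
        std::map<int, int> m;
        // Keys arrive in ascending order, so every new element belongs
        // immediately before m.end(); with that hint each insertion is
        // amortized constant instead of logarithmic.
        for (int i = 0; i < 1'000'000; ++i)
            m.emplace_hint(m.end(), i, 2 * i);
    }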

Related

When std::map/std::set should be used over std::unordered_map/std::unordered_set?

Since the new standard introduced std::unordered_map/std::unordered_set, which use a hash function and have, on average, constant complexity for inserting/deleting/finding elements, it seems there is no reason to use the "old" std::map/std::set when we do not need to iterate over the collection in a particular order. Or are there other cases/reasons where std::map/std::set would be the better choice? For example, are they less memory-consuming, or is their only advantage over the "unordered" versions the ordering?
They are ordered, and writing < is easier than writing hash and equality.
Never underestimate ease of use, because 90% of your code has a trivial impact on your program's performance. Making the other 10% faster can use up the time you would have spent writing a hash for yet another type.
OTOH, a good hash combiner only has to be written once, and exposing an object's state as a tie of its members makes <, == and hash nearly free.
Splicing guarantees between containers with node based operations may be better, as splicing into a hash map isn't free like a good ordered container splice. But I am uncertain.
Finally, the iterator invalidation guarantees differ. Blindly replacing a mature, tested meow with an unordered meow could create bugs. And maybe the invalidation guarantees of maps are worth it to you.
std::set/std::map and std::unordered_set/std::unordered_map are used in very different problem areas and are not replaceable by each other.
std::set/std::map are used where the problem revolves around the order of elements and O(log n) average-case element access is acceptable. By using std::set/std::map other information can also be retrieved, for example the number of elements greater than a given element (a short sketch follows after this answer).
std::unordered_set/std::unordered_map are used where element access has to be O(1) on average and order is not important. For example, if you want to key elements by integer you could keep them in a std::vector so that vec[10] == 10, but that is not practical when the keys vary drastically: with only two keys, say 20 and 50000, you would still have to allocate a std::vector of size 50001, and if you use std::set/std::map instead, element access is O(log n) rather than O(1). For this problem std::unordered_set/std::unordered_map is used: it offers O(1) average-case access via hashing without allocating a large amount of space.
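A short sketch of the kind of order-based query mentioned above (counting elements greater than a given value), which the unordered containers cannot answer directly:

    #include <iostream>
    #include <iterator>
    #include <set>

    int main() {
        std::set<int> s = {1, 4, 7, 9, 12, 20};

        // Elements strictly greater than 7: O(log n) to find the boundary,
        // then std::distance over bidirectional iterators (linear in the count).
        auto first_greater = s.upper_bound(7);
        std::cout << "elements > 7: " << std::distance(first_greater, s.end()) << '\n';

        // Range query: everything in [4, 12).
        const auto last = s.lower_bound(12);
        for (auto it = s.lower_bound(4); it != last; ++it)
            std::cout << *it << ' ';
        std::cout << '\n';
    }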

c++ Why std::multimap is slower than std::priority_queue

I implemented an algorithm where I make use of a priority queue.
I was motivated by this question:
Transform a std::multimap into std::priority_queue
I am going to store up to 10 million elements with their specific priority value.
I then want to iterate until the queue is empty.
Every time an element is retrieved it is also deleted from the queue.
After this I recalculate the element's priority value, because it can change due to previous iterations.
If the value increased, I insert the element into the queue again.
This happens more often depending on the progress (in the first 25% it does not happen, in the next 50% it does happen, and in the last 25% it will happen multiple times).
When the next element is retrieved and not reinserted, I process it. For this I do not need the element's priority value, only its technical ID.
This was the reason I intuitively had chosen a std::multimap to achieve this, using .begin() to get the first element, .insert() to insert it and .erase() to remove it.
Also, I did not intuitively choose std::priority_queue directly, because other questions on this topic suggested that std::priority_queue is mostly used for single values and not for mapped values.
After reading the link above I reimplemented it using a priority queue, analogous to the other question from the link.
My runtimes do not seem that far apart (about an hour for 10 million elements).
Now I am wondering why std::priority_queue is faster at all.
I actually would have expected std::multimap to be faster because of the many reinsertions.
Maybe the problem is that there are too many reorganizations of the multimap?
To summarize: your runtime profile involves both removing and inserting elements from your abstract priority queue, with you trying to use both a std::priority_queue and a std::multimap as the actual implementation.
Both the insertion into a priority queue and into a multimap have roughly equivalent complexity: logarithmic.
However, there's a big difference when removing the next element from a multimap versus a priority queue. With a priority queue, popping sifts the heap down within its underlying vector and then removes the vector's last element, all within one contiguous block of memory; that final removal is going to be mostly a nothing-burger.
But with a multimap you're removing the element from one of the extreme ends of the multimap.
The typical underlying implementation of a multimap is a balanced red/black tree. Repeated element removals from one of the extreme ends of a multimap has a good chance of skewing the tree, requiring frequent rebalancing of the entire tree. This is going to be an expensive operation.
This is likely to be the reason why you're seeing a noticeable performance difference.
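A rough sketch of the two patterns being compared (the priority and ID types, and the min-first ordering, are assumptions rather than the asker's actual code):

    #include <functional>
    #include <iterator>
    #include <map>
    #include <queue>
    #include <utility>
    #include <vector>

    using Priority = double;
    using Id = int;

    // multimap version, as described in the question: take the first element
    // (smallest key under the default ordering), then erase it.
    Id pop_next(std::multimap<Priority, Id>& mm) {
        auto it = mm.begin();
        Id id = it->second;
        mm.erase(it); // node deallocation plus possible rebalancing
        return id;
    }

    // priority_queue version: the "next" element is the top of a heap laid out
    // in a contiguous std::vector; std::greater<> makes it a min-heap so both
    // sketches pop in the same order.
    using MinQueue = std::priority_queue<std::pair<Priority, Id>,
                                         std::vector<std::pair<Priority, Id>>,
                                         std::greater<>>;

    Id pop_next(MinQueue& pq) {
        Id id = pq.top().second;
        pq.pop(); // sift-down within the vector, then a trivial pop_back
        return id;
    }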
I think the main difference comes from two facts:
The priority queue has a weaker constraint on the order of elements: it doesn't have to keep the whole range of keys/priorities sorted, whereas the multimap does. The priority queue only has to guarantee that the first / top element is the largest.
So, while the theoretical time complexities of the operations on both are the same, O(log(size)), I would argue that erasing from a multimap and rebalancing the RB-tree performs more operations; it simply has to move around more elements. (NOTE: an RB-tree is not mandatory, but it is very often chosen as the underlying structure for multimap.)
The underlying container of priority queue is contiguous in memory (it's a vector by default).
I suspect the rebalancing is also slower, because the RB-tree relies on nodes (versus the contiguous memory of a vector), which makes it prone to cache misses, although one has to remember that heap operations do not walk the vector sequentially either; they hop between parent and child indices. To be really sure one would have to profile it.
The above points are true for both insertions and erasures. I would say the difference is in the constant factors lost in the big-O notation. This is intuitive thinking.
The abstract, high level explanation for map being slower is that it does more. It keeps the entire structure sorted at all times. This feature comes at a cost. You are not paying that cost if you use a data structure that does not keep all elements sorted.
Algorithmic explanation:
To meet the complexity requirements, a map must be implemented as a node based structure, while priority queue can be implemented as a dynamic array. The implementation of std::map is a balanced (typically red-black) tree, while std::priority_queue is a heap with std::vector as the default underlying container.
Heap insertion is usually quite fast. The average complexity of insertion into a heap is O(1), compared to O(log n) for balanced tree (worst case is the same, though). Creating a priority queue of n elements has worst case complexity of O(n) while creating a balanced tree is O(n log n). See more in depth comparison: Heap vs Binary Search Tree (BST)
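Bulk construction makes the difference easy to see in code (a sketch, not a benchmark):

    #include <cstddef>
    #include <queue>
    #include <set>
    #include <vector>

    int main() {
        std::vector<int> data(1'000'000);
        for (std::size_t i = 0; i < data.size(); ++i)
            data[i] = static_cast<int>(data.size() - i); // descending, i.e. unsorted input

        // O(n): the range constructor heapifies the elements in place.
        std::priority_queue<int> pq(data.begin(), data.end());

        // O(n log n): each element is inserted into a balanced tree.
        std::multiset<int> ms(data.begin(), data.end());
    }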
Additional implementation detail:
Arrays usually use CPU cache much more efficiently, than node based structures such as trees or lists. This is because adjacent elements of an array are adjacent in memory (high memory locality) and therefore may fit within a single cache line. Nodes of a linked structure however exist in arbitrary locations (low memory locality) in memory and usually only one or very few are within a single cache line. Modern CPUs are very very fast at calculations but memory speed is a bottle neck. This is why array based algorithms and data structures tend to be significantly faster than node based.
While I agree with both #eerorika and #luk32, it is worth mentioning that in the real world, when using the default STL allocator, memory management cost easily outweighs a few data structure maintenance operations such as updating pointers to perform a tree rotation. Depending on the implementation, the memory allocation itself could involve tree maintenance operations and potentially trigger a system call, where it would become even more costly.
With a multimap there is a memory allocation and deallocation associated with each insert() and erase() respectively, which often contributes slowness of a higher order of magnitude than the extra steps in the algorithm.
priority_queue, however, by default uses a vector, which only triggers a memory allocation (a much more expensive one, though, which involves moving all stored objects to the new memory location) once the capacity is exhausted. In your case pretty much all allocation happens in the first iterations for the priority_queue, whereas the multimap keeps paying the memory management cost with each insert and erase.
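If that one large reallocation matters, the underlying vector can be reserved up front and handed to the queue; a sketch (the (priority, id) pair layout and the 10 million figure from the question are assumptions about the asker's data):

    #include <functional>
    #include <queue>
    #include <utility>
    #include <vector>

    int main() {
        using Entry = std::pair<double, int>; // (priority, id) -- assumed layout

        std::vector<Entry> storage;
        storage.reserve(10'000'000); // one allocation, done before the run starts

        // The priority_queue adopts the pre-reserved vector as its container;
        // in practice the reserved capacity travels with the moved buffer.
        std::priority_queue<Entry> pq(std::less<Entry>{}, std::move(storage));
    }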
The downside around memory management for map could be mitigated by using a memory-pool based custom allocator. This also gives you a cache hit rate comparable to the priority queue. It might even outperform the priority queue when your objects are expensive to move or copy.

Does a map get slower the longer it is

Will a map get slower the longer it is? I'm not talking about iterating through it, but rather operations like .find(), .insert() and .at().
For instance if we have map<int, Object> mapA which contains 100'000'000 elements and map<int, Object> mapB which only contains 100 elements.
Will there be any difference performance wise executing mapA.find(x) and mapB.find(x)?
The complexity of lookup and insertion operations on std::map is logarithmic in the number of elements in the map. So it gets slower as the map gets larger, but only very slowly (more slowly than any polynomial in the element count). To implement a container with such properties, operations usually take the form of a binary search.
To imagine how much slower it is, you essentially require one further operation every time you double the number of elements. So if you need k operations on a map with 4000 elements, you need k + 1 operations on a map with 8000 elements, k + 2 for 16000 elements, and so forth.
By contrast, std::unordered_map does not offer you an ordering of the elements, and in return it gives you a complexity that's constant on average. This container is usually implemented as a hash table. "On average" means that looking up one particular element may take long, but the time it takes to look up many randomly chosen elements, divided by the number of looked-up elements, does not depend on the container size. The unordered map offers you fewer features, and as a result can potentially give you better performance.
However, be careful when choosing which map to use (provided ordering doesn't matter), since asymptotic cost doesn't tell you anything about real wall-clock cost. The cost of hashing involved in the unordered map operations may contribute a significant constant factor that only makes the unordered map faster than the ordered map at large sizes. Moreover, the lack of predictability of the unordered map (along with potential complexity attacks using chosen keys) may make the ordered map preferable in situations where you need control on the worst case rather than the average.
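When in doubt, a quick measurement on your own keys and sizes settles it; a minimal timing sketch (the element count is arbitrary):

    #include <chrono>
    #include <iostream>
    #include <map>
    #include <unordered_map>

    template <typename Map>
    void time_lookups(const char* name, const Map& m, int n) {
        const auto start = std::chrono::steady_clock::now();
        long long hits = 0;
        for (int i = 0; i < n; ++i)
            hits += static_cast<long long>(m.count(i)); // keep the loop observable
        const auto stop = std::chrono::steady_clock::now();
        const auto ms =
            std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        std::cout << name << ": " << hits << " hits in " << ms << " ms\n";
    }

    int main() {
        const int n = 1'000'000; // arbitrary; try your real sizes and key types
        std::map<int, int> ordered;
        std::unordered_map<int, int> hashed;
        for (int i = 0; i < n; ++i) {
            ordered.emplace(i, i);
            hashed.emplace(i, i);
        }
        time_lookups("std::map          ", ordered, n);
        time_lookups("std::unordered_map", hashed, n);
    }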
The C++ standard only requires that std::map has logarithmic lookup time; not that it is a logarithm of any particular base or with any particular constant overhead.
As such, asking "how many times slower would a 100 million map be than a 100 map" is nonsensical; it could well be that the overhead easily dominates both, so that the operations are about the same speed. It could even be that for small sizes the time growth looks exponential! By design, none of these things are deducible purely from the specification.
Further, you're asking about time, rather than operations. The answer depends heavily on access patterns. To borrow some diagrams from Paul Khong's (amazing) blog post on binary searches: the runtimes for repeated searches (the stl line in his graphs) are almost perfectly logarithmic, but once you start doing random access the performance becomes decidedly non-logarithmic, due to memory accesses falling outside the level-1 cache.
Note that goog refers to Google's dense_hash_map, which is akin to unordered_map. In this case, even it does not escape performance degradation at larger sizes.
The latter graph is likely the more telling for most situations, and suggests that looking up a random key in a 100-element map costs about 10x less than in a 500'000-element map. dense_hash_map degrades worse than that, in that it goes from almost-free to certainly-not-free, albeit always remaining much faster than the STL's map.
In general, when asking these questions, an approach from theory can only give you very rough answers. A quick look at actual benchmarks and considerations of constant factors is likely to fine-tune these rough answers significantly.
Now, also remember that you're talking about map<int, Object>, which is very different from set<uint32_t>; if the Object is large this will emphasize the cost of cache misses and de-emphasize the cost of traversal.
A pedantic aside.
A quick note about hash maps: Their time complexity is often described as constant time, but this isn't strictly true. Most hash maps rather give you constant time with very high likelihood with regards to lookups, and amortized constant time with very high likelihood with regards to inserts.
The former means that for most hash tables there is an input that makes them perform worse than optimal, and for user-controlled input this can be dangerous. For this reason Rust uses a DoS-resistant keyed hash (SipHash) by default, Java's HashMap falls back to balanced trees for heavily-collided buckets, and CPython randomizes hashes. Generally, if you're exposing your hash table to untrusted input, you should make sure you're using some mitigation of this kind.
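One common mitigation with the standard containers is to mix a per-process random seed into the hash, so bucket assignment is not predictable across runs; a rough sketch (this is a cheap scrambler, not a keyed hash like SipHash, and the struct name is made up):

    #include <cstddef>
    #include <random>
    #include <string>
    #include <unordered_set>

    struct SeededHash {
        // One random seed per process (C++17 inline static member).
        static inline const std::size_t seed = std::random_device{}();

        std::size_t operator()(const std::string& s) const {
            const std::size_t h = std::hash<std::string>{}(s);
            // hash_combine-style mixing step (constant is the 64-bit golden ratio)
            return h ^ (seed + 0x9e3779b97f4a7c15ULL + (h << 6) + (h >> 2));
        }
    };

    std::unordered_set<std::string, SeededHash> untrusted_keys;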
Some schemes, like cuckoo hashing, do better than probabilistic guarantees (on constrained data types, given a special kind of hash function) for the case where you're worried about attackers, and incremental resizing removes the amortized time cost (assuming cheap allocations), but neither is commonly used, since these are rarely problems that need solving and the solutions are not free.
That said, if you're struggling to think of why we'd go through the hassle of using unordered maps, look back up at the graphs. They're fast, and you should use them.

For the operation count(). Which is faster std::set<void*> or std::unordered_set<void*>?

Suppose that sizeof(void*) == sizeof(size_t) and that hashing the pointer is just casting it to size_t. So I want to know if my set contains an element, which will be faster std::set<void*> or std::unordered_set<void*>?
I know how std::set works, but I am not familiar with std::unordered_set. Well, I know that an unordered set uses hashing and buckets, and that if no collisions occur (which is my case) the complexity is O(1). But I don't know how large that constant factor is.
If the amount of data in the container is relevant: my actual scenario uses fewer than a hundred¹ elements. But my curiosity concerns both the case with few elements and the case with a lot of elements.
¹ The amount of elements is so few that even a std::vector would perform fine.
I think the important bit is in the small footnote:
The amount of elements is so few that even a std::vector would perform fine.
std::vector is more cache friendly than std::set and std::unordered_set, because it allocates its elements in a contiguous region of memory (this is mandated by the Standard).
And although lookup and insertion complexity is worse for std::vector than for std::set or std::unordered_set (linear versus O(log N) and amortized O(1), respectively), data locality and a higher cache hit rate are likely to dominate over the asymptotic complexity and yield better performance.
In general, anyway, the impact on performance of the data structures you choose also depends on the kind of operations you want to perform on them and on their frequency - this is not mentioned in your post.
As always, however, measure the different alternatives before you commit to one, and always favor a simpler design when you do not have evidence that it represents a bottleneck preventing your application from meeting its performance requirements.
unordered_set has better cache locality than set in many implementations. But since the number of elements is so small in your case a vector is likely to fit completely into cache, making it a better choice despite the O(n) complexity of looking up elements.
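For the sizes mentioned (under a hundred pointers), the flat-vector version is trivially small; a sketch:

    #include <algorithm>
    #include <vector>

    // For a few dozen pointers, an unsorted contiguous vector plus linear search
    // is often the fastest "set": no hashing, no tree nodes, one allocation.
    bool contains(const std::vector<const void*>& v, const void* p) {
        return std::find(v.begin(), v.end(), p) != v.end();
    }

    void insert_unique(std::vector<const void*>& v, const void* p) {
        if (!contains(v, p))
            v.push_back(p);
    }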

How large does a collection have to be for std::map<k,v> to outpace a sorted std::vector<std::pair<k,v> >?

How large does a collection have to be for std::map<k,v> to outpace a sorted std::vector<std::pair<k,v>>?
I've got a system where I need several thousand associative containers, and std::map seems to carry a lot of overhead in terms of CPU cache. I've heard somewhere that for small collections std::vector can be faster -- but I'm wondering where that line is....
EDIT: I'm talking about 5 items or fewer at a time in a given structure. I'm concerned most with execution time, not storage space. I know that questions like this are inherently platform-specific, but I'm looking for a "rule of thumb" to use.
Billy3
It's not really a question of size, but of usage.
A sorted vector works well when the usage pattern is that you read the data, then you do lookups in the data.
A map works well when the usage pattern involves a more or less arbitrary mixture of modifying the data (adding or deleting items) and doing queries on the data.
The reason for this is fairly simple: a map has higher overhead on an individual lookup (thanks to using linked nodes instead of a monolithic block of storage). An insertion or deletion in the map, however, has a complexity of only O(lg N), whereas an insertion or deletion that maintains order in a vector is O(N) instead.
There are, of course, various hybrid structures that can be helpful to consider as well. For example, even when data is being updated dynamically, you often start with a big bunch of data, and make a relatively small number of changes at a time to it. In this case, you can load your data into memory into a sorted vector, and keep the (small number of) added objects in a separate vector. Since that second vector is normally quite small, you simply don't bother with sorting it. When/if it gets too big, you sort it and merge it with the main data set.
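A rough sketch of that hybrid (the merge threshold and int elements are arbitrary choices for illustration):

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // Big sorted vector for the bulk of the data, plus a small unsorted buffer
    // for recent insertions that is merged in once it grows past a threshold.
    struct HybridSet {
        std::vector<int> sorted; // bulk data, kept sorted
        std::vector<int> recent; // small, unsorted overflow

        bool contains(int x) const {
            return std::binary_search(sorted.begin(), sorted.end(), x)
                || std::find(recent.begin(), recent.end(), x) != recent.end();
        }

        void insert(int x) {
            recent.push_back(x);
            if (recent.size() > 64) { // arbitrary threshold
                std::sort(recent.begin(), recent.end());
                std::vector<int> merged;
                merged.reserve(sorted.size() + recent.size());
                std::merge(sorted.begin(), sorted.end(),
                           recent.begin(), recent.end(),
                           std::back_inserter(merged));
                sorted.swap(merged);
                recent.clear();
            }
        }
    };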
Edit2: (in response to edit in question). If you're talking about 5 items or fewer, you're probably best off ignoring all of the above. Just leave the data unsorted, and do a linear search. For a collection this small, there's effectively almost no difference between a linear search and a binary search. For a linear search you expect to scan half the items on average, giving ~2.5 comparisons. For a binary search you're talking about log2 N, which (if my math is working this time of the morning) works out to ~2.3 -- too small a difference to care about or notice (in fact, a binary search has enough overhead that it could very easily end up slower).
If by "outpace" you mean consuming more space (i.e. memory), then it's very likely that vector will always be more efficient (the underlying implementation is a contiguous memory array with no other data, whereas map is a tree, so every element implies extra space). This, however, depends on how much extra space the vector reserves for future inserts.
When it is about time (and not space), vector will also be very effective for look-ups (doing a binary/dichotomic search), but it will be extremely bad at adding or removing elements.
So: no simple answer! Look up the complexities and think about the usage patterns you are going to have. http://www.cplusplus.com/reference/stl/
The main issue with std::map is an issue of cache, as you pointed out.
The sorted vector is a well-known approach: Loki::AssocVector.
For very small datasets, the AssocVector should crush the map despite the copying involved during insertion, simply because of cache locality. The AssocVector will also outperform the map for read-only usage. Binary search is more efficient there (fewer pointers to follow).
For all other uses, you'll need to profile...
There is, however, a hybrid alternative that you might wish to consider: using the Allocator parameter of the map to restrict the memory area where the items are allocated, thus minimizing the locality-of-reference problem (the root of cache misses).
There is also a paradigm shift that you might consider: do you need sorted items, or fast look-up ?
In C++, the only STL-compliant containers for fast look-up were, for years, implemented in terms of Sorted Associative Containers. However, C++0x (now C++11) features the long-awaited unordered_map, which could outperform all the above solutions!
EDIT: Seeing as you're talking about 5 items or fewer:
Sorting involves swapping items; inserting into std::map only involves pointer swaps. Whether a vector or a map will be faster therefore depends on how fast it is to swap two elements.
I suggest you profile your application to figure it out.
If you want a simple and general rule, then you're out of luck - you'll need to consider at least the following factors:
Time
How often do you insert new items compared to how often you lookup?
Can you batch inserts of new items?
How expensive is sorting your vector? Vectors of elements that are expensive to swap become very expensive to sort - vectors of pointers take far less effort.
Memory
How much overhead per allocation does the allocator you're using have? std::map will perform one allocation per item.
How big are your key/value pairs?
How big are your pointers? (32/64 bit)
How fast does your implementation of std::vector grow? (Popular growth factors are 1.5 and 2.)
Past a certain size of container and element, the overhead of allocation and tree pointers will become outweighed by the cost of the unused memory at the end of the vector - but by far the easiest way to find out if and when this occurs is by measuring.
It has to be in the millions of items. And even there...
I am thinking here more of memory usage and memory access patterns. Under hundreds of thousands of elements, take whichever you want; there will be no noticeable difference. CPUs are really fast these days, and the bottleneck is memory latency.
But even with millions of items, if your map<> has been built by inserting elements in random order, then when you want to traverse it (in sorted order) you'll end up jumping around randomly in memory, stalling the CPU while it waits for data, resulting in poor performance.
On the other hand, if your millions of items are in a vector, traversing it is really fast, taking advantage of the CPU's memory-access prediction (prefetching).
As other have written, it depends on your usage.
Edit: if the containers hold only 5 items each, I would question the way your thousands of associative containers are organized more than the choice of container itself.