C++ std::unordered_map complexity

I've read a lot about unordered_map (c++11) time-complexity here at stackoverflow, but I haven't found the answer for my question.
Let's assume indexing by integer (just for example):
The insert and at functions run in constant time on average, so each insertion in this example would take O(1):
std::unordered_map<int, int> mymap = {
    { 1, 1 },
    { 100, 2 },
    { 100000, 3 }
};
What I am curious about is how long it takes to iterate through all the (unsorted) values stored in the map, e.g.
for ( auto it = mymap.begin(); it != mymap.end(); ++it ) { ... }
Can I assume that each stored value is accessed only once (or twice, or some constant number of times)? That would imply that iterating through all values of an N-element map is O(N). The other possibility is that my example with keys {1, 100, 100000} could take up to 100000 iterations (if the map were represented by an array).
Is there any other container that can be iterated linearly and whose values can be accessed by a given key in constant time?
What I would really need is (pseudocode)
myStructure.add(key, value) // O(1)
value = myStructure.at(key) // O(1)
for (auto key : myStructure) {...} // O(1) for each key/value pair = O(N) for N values
Is std::unordered_map the structure I need?
Integer indexing is sufficient, average complexity as well.

Regardless of how they're implemented, standard containers provide iterators that meet the iterator requirements. Incrementing an iterator is required to take amortized constant time, so iterating through all the elements of any standard container is O(N).
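As a small check of that claim, the sketch below counts the iterator increments needed to traverse the map from the question. The function names count_iteration_steps and make_sparse_map are made up for this illustration; only the std::unordered_map calls come from the standard library.

```cpp
#include <cstddef>
#include <unordered_map>

// Counts how many iterator increments a full traversal performs.
// For a standard container this equals the number of stored elements,
// no matter how far apart the keys are.
std::size_t count_iteration_steps(const std::unordered_map<int, int>& m) {
    std::size_t steps = 0;
    for (auto it = m.begin(); it != m.end(); ++it) {
        ++steps;
    }
    return steps;
}

// The map from the question: three widely spaced keys.
std::unordered_map<int, int> make_sparse_map() {
    return {{1, 1}, {100, 2}, {100000, 3}};
}
```

Calling count_iteration_steps(make_sparse_map()) yields 3, one step per stored element, not one per possible key.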

The complexity guarantees of all standard containers are specified in the C++ Standard.
std::unordered_map element access and element insertion is required to be of complexity O(1) on average and O(N) worst case (cf. Sections 23.5.4.3 and 23.5.4.4; pages 797-798).
A specific implementation (that is, a specific vendor's implementation of the Standard Library) can choose whatever data structure it wants. However, to be compliant with the Standard, its complexity must be at least as good as specified.

There are a few different ways that a hash table can be implemented, and I suggest you read more on those if you're interested, but the main two are chaining and open addressing.
In the first case you have an array of linked lists. Each entry in the array could be empty, and each item in the hash table will be in some bucket. So iteration is walking down the array, and walking down each non-empty list in it. Clearly O(N), but it could potentially be very memory inefficient depending on how the linked lists themselves are allocated.
In the second case, you just have one very large array which will have lots of empty slots. Here, iteration is again clearly linear, but could be inefficient if the table is mostly empty (which it should be for lookup purposes) because the elements that are actually present will be in different cache lines.
Either way, you're going to have linear iteration and you're going to be touching every element exactly once. Note that this is true of std::map also; iteration will be linear there as well. But in the case of the maps, iteration will definitely be far less efficient than iterating a vector, so keep that in mind. If your use case requires BOTH fast lookup and fast iteration, and you insert all your elements up front and never erase, it could be much better to actually have both the map and the vector. Take the extra space for the added performance.
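One possible shape for the "map plus vector" idea is sketched below. The class name IndexedStore and its members are hypothetical, and the sketch assumes int keys and values and that elements are only ever added, never erased.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container: values live in a contiguous vector (fast iteration),
// and a hash map from key to vector index gives average O(1) lookup.
// Assumes elements are only added, never erased.
class IndexedStore {
public:
    // Adds the pair if the key is new; returns false if the key already exists.
    bool add(int key, int value) {
        auto inserted = index_.emplace(key, items_.size());
        if (!inserted.second) return false;
        items_.emplace_back(key, value);
        return true;
    }

    // Average O(1) lookup; throws std::out_of_range for a missing key.
    int& at(int key) { return items_[index_.at(key)].second; }

    // Iteration walks contiguous memory in insertion order.
    std::vector<std::pair<int, int>>::const_iterator begin() const { return items_.begin(); }
    std::vector<std::pair<int, int>>::const_iterator end() const { return items_.end(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::pair<int, int>> items_;
    std::unordered_map<int, std::size_t> index_;
};
```

A range-based for loop over an IndexedStore visits each key/value pair once, and at(key) stays a hash lookup. The price is the extra index and, in this simple form, no support for erasing.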

Related

Performance difference for iteration over all elements std::unordered_map vs std::map?

I wanted to map data with pointer as the key. What container should I have chosen, map or unordered_map? There are multiple questions on stackoverflow for this topic but none of them covers the performance aspect when we need to iterate over all the key-value pairs.
std::map<classKey* , classData*> myMap;
std::unordered_map<classKey* , classData*> myUnorderedMap;
for (auto & iter : myMap) { //loop1
    display(iter.second);
}
for (auto & iter : myUnorderedMap) { //loop2
    display(iter.second);
}
Which one gives better performance, loop1 or loop2?
Benchmark provided by RetiredNinja for size = 10,000,000 (the results are not reproduced here).
As you might expect, this depends heavily on the actual implementation of the standard library data structures. Therefore, this answer will be a bit more theoretical and less tied to any one implementation.
A std::map uses a balanced binary tree under the covers. This is why it has O(log(n)) insertion, deletion, and lookup. Iterating over it should be linear because you just have to do an in-order (depth-first) traversal. A recursive traversal would require O(log(n)) memory in the form of stack space, although library iterators typically follow parent pointers instead. The benefit of using a std::map for iteration is that you will iterate over the keys in sorted order and you will get that benefit "for free".
A std::unordered_map uses a hash table under the covers. This allows you to have an amortized constant time insertion, deletion, and lookup. If the implementation is not optimized for iterating, a naive approach would be to iterate over every bucket in the hash table. Since a good hash table (in theory) has exactly one element in 50% of the buckets and zero in the rest, this operation will also be linear. However, it will take more "wall clock time" than the same linear operation for a std::map. To get around this, some hash table implementations keep a side list of all of the elements for fast iterations. If this is the case, iterating on a std::unordered_map will be faster because you can't get much better than iterating over contiguous memory (still linear time though, obviously).
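The naive bucket-by-bucket approach described above can be imitated with the bucket interface of std::unordered_map (bucket_count, begin(n), end(n)). This is only a sketch of the idea, not how any particular library iterates internally; the two function names are invented for the illustration.

```cpp
#include <cstddef>
#include <unordered_map>

// Visits every bucket, including empty ones, and sums the values found.
// The work is proportional to bucket_count() + size(), not to size() alone.
long long sum_by_walking_buckets(const std::unordered_map<int, int>& m) {
    long long total = 0;
    for (std::size_t b = 0; b < m.bucket_count(); ++b) {
        for (auto it = m.begin(b); it != m.end(b); ++it) {
            total += it->second;
        }
    }
    return total;
}

// Ordinary iteration, for comparison: one step per stored element.
long long sum_by_iterating(const std::unordered_map<int, int>& m) {
    long long total = 0;
    for (const auto& kv : m) total += kv.second;
    return total;
}
```

Both functions return the same sum. They differ only in how many empty buckets are touched along the way, which is the wall-clock difference the paragraph above is describing.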
In the extremely unlikely case that you actually need to optimize to this level (instead of just being curious about the performance in theory), you likely have much bigger performance bottlenecks elsewhere in your code.
All of this ignores the oddity of keying off of a pointer value, but that's neither here nor there.
Sources for further reading:
GCC std::map implementation
GCC std::unordered_map implementation
How GCC std::unordered_map achieves fast iteration

Insert a sorted range into std::set with hint

Assume I have a std::set (which is by definition sorted), and I have another range of sorted elements (for the sake of simplicity, in a different std::set object). Also, I have a guarantee that all values in the second set are larger than all the values in the first set.
I know I can efficiently insert one element into std::set - if I pass a correct hint, this will be O(1). I know I can insert any range into std::set, but as no hint is passed, this will be O(k logN) (where k is number of new elements, and N number of old elements).
Can I insert a range in a std::set and provide a hint? The only way I can think of is to do k single inserts with a hint, which does push the complexity of the insert operations in my case down to O(k):
std::set<int> bigSet{1,2,5,7,10,15,18};
std::set<int> biggerSet{50,60,70};
for(auto bigElem : biggerSet)
    bigSet.insert(bigSet.end(), bigElem);
First of all, to do the merge you're talking about, you probably want to use set's (or map's) merge member function (available since C++17), which will let you merge an existing set into this one. The advantage of doing this (and the reason you might not want to, depending on your usage pattern) is that the items being merged in are actually moved from one set to the other, so you don't have to allocate new nodes (which can save a fair amount of time). The disadvantage is that the nodes then disappear from the source set, so if you need each local histogram to remain intact after being merged into the global histogram, you don't want to do this.
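A minimal sketch of the merge member function, assuming a C++17 compiler. The wrapper name absorb is made up for the example; std::set::merge itself is the standard call.

```cpp
#include <set>

// Moves every element of `source` into `target` by relinking nodes (C++17).
// No elements are copied and no nodes are allocated. Elements whose values
// already exist in `target` are left behind in `source`.
void absorb(std::set<int>& target, std::set<int>& source) {
    target.merge(source);
}
```

After absorb(bigSet, biggerSet), biggerSet holds only the values that were already present in bigSet, which is the "nodes disappear from the source" behaviour mentioned above.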
You can typically do better than O(log N) when searching a sorted vector. Assuming reasonably predictable distribution you can use an interpolating search to do a search in (typically) around O(log log N), often called "pseudo-constant" complexity.
Given that you only do insertions relatively infrequently, you might also consider a hybrid structure. This starts with a small chunk of data that you don't keep sorted. When you reach an upper bound on its size, you sort it and insert it into a sorted vector. Then you go back to adding items to your unsorted area. When it reaches the limit, again sort it and merge it with the existing sorted data.
Assuming you limit the unsorted chunk to no larger than log(N), search complexity is still O(log N)--one log(n) binary search or log log N interpolating search on the sorted chunk, and one log(n) linear search on the unsorted chunk. Once you've verified that an item doesn't exist yet, adding it has constant complexity (just tack it onto the end of the unsorted chunk). The big advantage is that this can still easily use a contiguous structure such as a vector, so it's much more cache friendly than a typical tree structure.
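The hybrid structure could look roughly like the sketch below. The class name HybridSet is hypothetical, and for simplicity the buffer limit is a fixed number passed to the constructor rather than log(N) as suggested above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical hybrid set of ints: a large sorted vector plus a small
// unsorted buffer that is folded in once it reaches `limit` elements.
class HybridSet {
public:
    explicit HybridSet(std::size_t limit) : limit_(limit) {}

    bool contains(int x) const {
        // Binary search in the sorted part, linear scan of the small buffer.
        return std::binary_search(sorted_.begin(), sorted_.end(), x) ||
               std::find(buffer_.begin(), buffer_.end(), x) != buffer_.end();
    }

    // Returns false if x was already present.
    bool insert(int x) {
        if (contains(x)) return false;
        buffer_.push_back(x);  // constant-time append to the unsorted area
        if (buffer_.size() >= limit_) flush();
        return true;
    }

    std::size_t size() const { return sorted_.size() + buffer_.size(); }

private:
    // Sorts the buffer and merges it into the sorted part.
    void flush() {
        std::sort(buffer_.begin(), buffer_.end());
        const auto middle = static_cast<std::ptrdiff_t>(sorted_.size());
        sorted_.insert(sorted_.end(), buffer_.begin(), buffer_.end());
        std::inplace_merge(sorted_.begin(), sorted_.begin() + middle, sorted_.end());
        buffer_.clear();
    }

    std::size_t limit_;
    std::vector<int> sorted_;
    std::vector<int> buffer_;
};
```

Both parts are plain vectors, so lookups stay cache friendly, and the comparatively expensive merge runs only once per `limit` insertions.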
Since your global histogram is (apparently) only ever populated with data coming from the local histograms, it might be worth considering just keeping it in a vector, and when you need to merge in the data from one of the local chunks, just use std::merge to take the existing global histogram and the local histogram, and merge them together into a new global histogram. This has O(N + M) complexity (N = size of global histogram, M = size of local histogram). Depending on the typical size of a local histogram, this could pretty easily work out as a win.
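A sketch of that std::merge step, using plain sorted vectors of int as stand-ins for the histograms (the real element type is not given here). The function name merge_sorted is made up for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Merges two sorted vectors into a new sorted vector in O(N + M).
// Equal values from both inputs are kept, so duplicates are not collapsed.
std::vector<int> merge_sorted(const std::vector<int>& global,
                              const std::vector<int>& local) {
    std::vector<int> merged;
    merged.reserve(global.size() + local.size());
    std::merge(global.begin(), global.end(), local.begin(), local.end(),
               std::back_inserter(merged));
    return merged;
}
```

If equal keys should be combined (for example, by adding their counts), an extra pass over the merged result would be needed; that part is left out of this sketch.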
Merging two sorted containers is much quicker than sorting. Its complexity is O(N), so in theory what you say makes sense. It's the reason why merge-sort is one of the quickest sorting algorithms. If you follow the link, you will also find pseudo-code; what you are doing is just one pass of the main loop.
You will also find the algorithm implemented in the STL as std::merge. It takes iterator ranges, so it works with ANY container as input; I would suggest using std::vector as the default container for the new elements. Sorting a vector is a very fast operation. You may even find it better to use a sorted vector instead of a set for the output. You can always use std::lower_bound to get O(log(N)) lookups from a sorted vector.
Vectors have many advantages compared with set/map. Not least of which is they are very easy to visualise in a debugger :-)
(The code at the bottom of the std::merge reference page shows an example of using vectors)
You can merge the sets more efficiently using special functions for that.
In case you insist, insert returns information about the inserted location.
iterator insert( const_iterator hint, const value_type& value );
Code:
std::set<int> bigSet{1,2,5,7,10,15,18};
std::set<int> biggerSet{50,60,70};
auto hint = bigSet.cend();
for(auto& bigElem : biggerSet)
    hint = bigSet.insert(hint, bigElem);
This assumes, of course, that you are inserting new elements that will end up together or close in the final set. Otherwise there is not much to gain, only the fact that since the source is a set (it is ordered) about half of the tree will not be searched.
There is also a member function
template< class InputIt > void insert( InputIt first, InputIt last );
That might or might not do something like this internally.

What alternative to C++ vector when it comes to fast deletion?

vector is the first choice in many situations because random access is O(1); not many containers offer random access that fast, or even in O(log(n)).
My issue with vector is that vector<>::erase() is O(n), whereas map<>::erase() is faster, which makes map a better container when elements are erased often.
An alternative would be to use an object pool, but it is not a standard container, and implementations might vary depending on use, so I'm not very keen on using something I don't really understand or know a lot about.
It seems map is a very good alternative to vector<> when deletions occur often, but I wanted to know if there are better alternatives to it.
So is there a container that is both fast with random access and deletion?
Is there a usual way to make an object pool?
Erasing the last element of a vector (i.e. pop operation) has constant complexity, so if you don't need to keep your sequence ordered, then an efficient solution is to swap the target element with the last one, and pop it.
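The swap-and-pop idiom can be sketched as follows. The function name unordered_erase is made up for this example and is not a standard library function.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes the element at `index` in O(1) by swapping it with the last
// element and popping the back. The order of the remaining elements changes.
template <typename T>
void unordered_erase(std::vector<T>& v, std::size_t index) {
    assert(index < v.size());  // precondition: index must be valid
    std::swap(v[index], v.back());
    v.pop_back();
}
```

For example, erasing index 1 from {10, 20, 30, 40} leaves {10, 40, 30}: the former last element now sits in the erased slot.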
A linked list has constant complexity deletion that maintains the order of the sequence, but indexed lookup is linear (i.e. not random access).
The (unordered) map sure has both asymptotically efficient lookup and erase, but you won't get the same behaviour as a vector would have. If you create an index -> element map, and remove element from index i, then there will be a gap between i - 1 and i + 1, while the vector would shift the elements at indices greater than i left.
The indexable skip list has logarithmic (on average; worst case is linear) lookup and deletion. However, there is no implementation of it in the standard library.

How to efficiently insert a range of consecutive integers into a std::set?

In C++, I have a std::set that I would like to insert a range of consecutive integers. How can I do this efficiently, hopefully in O(n) time where n is the length of the range?
I'm thinking I'd use the input-iterator overload of std::set::insert, but am unclear on how to build the input iterator.
std::set<int> mySet;
// Insert [34 - 75):
mySet.insert(inputIteratorTo34, inputIteratorTo75);
How can I create the input iterator and will this be O(n) on the range size?
The efficient way of inserting already ordered elements into a set is to hint the library as to where the next element will be. For that you want to use the version of insert that takes an iterator:
std::set<int>::iterator it = mySet.end();
for (int x : input) {
    it = mySet.insert(it, x);
}
On the other hand, you might want to consider other containers. Whenever possible, use std::vector. If the number of insertions is small compared to lookups, or if all inserts happen upfront, then you can build a vector, sort it and use lower_bound for lookups. In this case, since the input is already sorted, you can skip the sorting.
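A sketch of the sorted-vector alternative for the range in the question. The helper names make_range and sorted_contains are invented for the illustration.

```cpp
#include <algorithm>
#include <vector>

// Builds the lookup vector for the half-open range [first, last).
// The values are generated in ascending order, so no sort is needed.
std::vector<int> make_range(int first, int last) {
    std::vector<int> v;
    for (int i = first; i < last; ++i) v.push_back(i);
    return v;
}

// Returns true if `x` is present in `sorted`, which must be in ascending order.
// std::lower_bound finds the first element not less than x in O(log n).
bool sorted_contains(const std::vector<int>& sorted, int x) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), x);
    return it != sorted.end() && *it == x;
}
```

With make_range(34, 75), looking up 34 or 74 succeeds and looking up 75 fails, since the upper bound is exclusive.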
If insertions (or removals) happen all over the place, you might want to consider using std::unordered_set<int> which has an average O(1) insertion (per element) and lookup cost.
For the particular case of tracking a set of small integers (34 to 75 are small numbers) you can also consider using a bitset or even a plain array of bool in which you set the elements to true when inserted. Either will have O(n) insertion (all elements) and O(1) lookup (each lookup), which is better than the set.
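The bitset variant might look like this. The bound kMaxValue and the function name insert_range are assumptions made for the sketch; pick a bound that covers the values you actually expect.

```cpp
#include <bitset>
#include <cstddef>

// Assumed upper bound on the tracked values; chosen only for this sketch.
constexpr std::size_t kMaxValue = 128;

// Marks every integer in [first, last) as present: O(n) for n values.
// Precondition: 0 <= first <= last <= kMaxValue.
void insert_range(std::bitset<kMaxValue>& present, int first, int last) {
    for (int i = first; i < last; ++i) {
        present.set(static_cast<std::size_t>(i));
    }
}
```

A lookup is then present.test(value), which is O(1), and present.count() gives the number of stored values.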
A Boost way could be:
std::set<int> numbers(
boost::counting_iterator<int>(0),
boost::counting_iterator<int>(10));
The linked question has other good answers as well, especially Mani's answer.
std::set is a type of binary search tree, which means an insertion costs O(log n) on average.
C++98: If N elements are inserted, Nlog(size+N) in general, but linear in size+N if the elements are already sorted according to the same ordering criterion used by the container.
C++11: If N elements are inserted, Nlog(size+N). Implementations may optimize if the range is already sorted.
I think a C++98 implementation will keep track of the current insertion node and check whether the next value to insert is larger than the current one, in which case there's no need to start from the root again.
In C++11 this optimization is optional, so you may implement a skip-list structure and use this range-insert feature in your own implementation, or you may optimize the program according to your scenario.
Taking the hint provided by aksham, I see the answer is:
#include <boost/iterator/counting_iterator.hpp>
std::set<int> mySet;
// Insert [34 - 75):
mySet.insert(boost::counting_iterator<int>(34),
boost::counting_iterator<int>(75));
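If Boost is not available, one alternative (a different technique from the counting iterator above) is to materialise the values in a temporary vector with std::iota and pass that to the range overload of insert. The function name insert_consecutive is made up for this sketch, and it costs O(n) extra memory.

```cpp
#include <cstddef>
#include <numeric>
#include <set>
#include <vector>

// Inserts the half-open range [first, last) into `s` without Boost:
// the values are written to a temporary vector, then range-inserted.
void insert_consecutive(std::set<int>& s, int first, int last) {
    if (first >= last) return;  // empty range: nothing to insert
    std::vector<int> values(static_cast<std::size_t>(last - first));
    std::iota(values.begin(), values.end(), first);
    s.insert(values.begin(), values.end());
}
```

Whether the range insert runs in linear time for sorted input depends on the library implementation, as discussed above.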
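It's not clear why you specifically want to insert using iterators to specify a range.
However, I believe you can use a simple for-loop to insert with the desired O(n) complexity, as long as each insert is given a hint.
Quoting from cppreference's page on std::set, the complexity of the range insert is:
If N elements are inserted, Nlog(size+N) in general, but linear in size+N if the elements are already sorted according to the same ordering criterion used by the container.
That guarantee covers the range overload. A loop of plain insert(i) calls costs O(log(size)) per element, whereas passing end() as a hint makes each insertion amortized constant when every new value is larger than everything already in the set, as it is here:
std::set<int> mySet;
for(int i = 34; i < 75; ++i)
    mySet.insert(mySet.end(), i);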

Efficiently finding multiple items in a container

I need to find a number of objects from a large container.
The only way I can think of to do that seems to be to just search the container for one item at a time in a loop. However, even with an efficient search with an average case of, say, "log n" (where n is the size of the container), this gives me "m log n" (where m is the number of items I'm looking for) for the entire operation.
That seems highly suboptimal to me, and as it's something that I am likely to need to do on a frequent basis, it's something I'd definitely like to improve if possible.
Neither part has been implemented yet, so I'm open for suggestions on the format of the main container, the "list" of items I'm looking for, etc, as well as the actual search algorithm.
The items are complex objects, however the search key is just a simple integer.
Hash tables have basically O(1) lookup. This gives you O(m) to lookup m items; obviously you can't lookup m items faster than O(m) because you need to get the result out.
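A sketch of looking up m keys in a hash table. The Item struct is a hypothetical stand-in for the "complex object" keyed by a simple integer, and find_all is a made-up helper name.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a complex object with an integer search key.
struct Item {
    int id;
    std::string name;
};

// Looks up each wanted id once; the average cost is O(m) for m ids.
// Ids that are not present are skipped.
std::vector<const Item*> find_all(const std::unordered_map<int, Item>& items,
                                  const std::vector<int>& wanted) {
    std::vector<const Item*> found;
    found.reserve(wanted.size());
    for (int id : wanted) {
        auto it = items.find(id);
        if (it != items.end()) found.push_back(&it->second);
    }
    return found;
}
```

The returned pointers refer to elements owned by the map, so they are only valid while those elements stay in the map.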
If you're purely doing look-up (you don't require ordered elements) and can give up some memory, try unordered_map (introduced in TR1, also implemented in Boost, and standard as std::unordered_map since C++11), which has constant-time look-up on average.
In a game engine, we tested std::map and unordered_map, and while map was faster for insertions (if I recall), unordered_map blew it out of the water for retrieval. We had greater than 1000 elements in the map, for scale, which is fairly low compared to some other tasks you may be doing.
If you require elements to be ordered, your next bet is std::map, which has the look-up times you've posted, and keeps the elements ordered. In general, it also uses less memory than an unordered_map.
If your container is a vector and the elements are sorted, you can use std::lower_bound to search in O(log n) time. If your search items are also sorted, you can do a small optimization by always using the last found iterator as the start of the search for the next one, e.g.
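std::vector<stuff> container; // sorted in ascending order
std::vector<stuff>::iterator it = container.begin();
// search_items must be sorted in the same order as container
for (std::size_t i = 0; i < search_items.size() && it != container.end(); ++i)
{
    // resume from the previous match instead of from container.begin()
    it = std::lower_bound(it, container.end(), search_items[i]);
    // make sure the found item is a match
    if (it != container.end() && search_items[i] < *it)
        it = container.end(); // not a match: break out early
}
if (it != container.end()) { /* found it! */ }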
The Boost/TR1 unordered_map and unordered_set are containers backed by a hash table, which gives you search in amortized constant time [ O(1) ].
Boost Unordered documentation.
I suppose if you have a sorted container and a uniform distribution of items then the most efficient type of method would be a recursive bisection search with an execution path somewhat like a tree - calling itself twice whenever all the objects being searched for are in both halves of the bisection.
However, if you choose a container based on a hash-table (boost unordered set, I think?), or something similar, then lookup can be O(1), so searching in a loop really doesn't matter.
EDIT:
note that std::map and std::set are normally (always?) implemented using rb-trees, so are only log(n) for lookup.
Are you sure that m log2(n) is actually going to be a problem? If you are using a std::map that is even relatively large, the number of actual comparisons is still pretty small: if you are looking up 10,000 elements in a map of 1,000,000, the number of comparisons should be about 200,000, or about 20 comparisons per target element. This really isn't bad if your key is just a simple integer.
If you were hashing something that didn't already have a nice key, then I would say go with boost::unordered_map. I would implement it with std::map first, profile it, and then decide if you want to make the next jump to Boost.
If you're frequently performing the same projections on your collection, such as extracting elements with a key of "42", you could consider maintaining these subsets in buckets. You'd internally maintain a hashmap from keys to vectors of elements with that key, and add elements to the appropriate bucket as well as your primary collection representing "everything". Extracting a subgroup of elements is constant time (because the respective collections have already been built), and the memory overhead of maintaining the buckets scales primarily with the number of unique keys in your dataset.
This technique is decidedly less effective if you have a large number of unique key values, and it makes insertions and removals more expensive, but it's good for some situations. I thought it was at least worth mentioning.
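The bucket idea might be sketched like this. Record and BucketedCollection are hypothetical names, and as a variation on the description above the buckets store indices into the primary collection rather than copies of the elements. Removal is not handled.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical element type with an integer key.
struct Record {
    int key;
    std::string payload;
};

// Primary collection plus per-key buckets that are maintained on insertion.
class BucketedCollection {
public:
    void add(const Record& r) {
        const std::size_t pos = all_.size();
        all_.push_back(r);
        by_key_[r.key].push_back(pos);  // remember where this key's record lives
    }

    // Returns the indices (into all()) of every record with this key.
    // The bucket already exists, so this is an average O(1) hash lookup.
    const std::vector<std::size_t>& with_key(int key) const {
        static const std::vector<std::size_t> none;
        auto it = by_key_.find(key);
        return it == by_key_.end() ? none : it->second;
    }

    const std::vector<Record>& all() const { return all_; }

private:
    std::vector<Record> all_;
    std::unordered_map<int, std::vector<std::size_t>> by_key_;
};
```

Extracting the group for key 42 is then c.with_key(42), and each index can be resolved through c.all(). The memory overhead grows with the number of distinct keys, as noted above.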