Memory allocations when using unordered_map - c++

If a std::unordered_map<int,...> were to stay roughly the same size but continually add and remove items, would it continually allocate and free memory, or would it cache and reuse the memory (i.e. like a pool or vector)? Assume a modern, standard MS implementation of the libraries.

The standard is not specific about these aspects, so they are implementation-defined. Most notably, caching behaviour like you describe is normally achieved with a custom allocator (e.g. a memory pool allocator), so it should normally be decoupled from the container implementation.
The relevant bits of the standard, ~p874 about unordered containers:
The elements of an unordered associative container are organized into
buckets. Keys with the same hash code appear in the same bucket. The
number of buckets is automatically increased as elements are added to
an unordered associative container, so that the average number of
elements per bucket is kept below a bound.
and insert:
The insert and emplace members shall not affect the validity of
iterators if (N+n) <= z * B, where N is the number of elements in the
container prior to the insert operation, n is the number of elements
inserted, B is the container’s bucket count, and z is the container’s
maximum load factor
You could read between the lines and assume that, since iterator validity is not affected, no memory allocations will take place. This is by no means guaranteed, though (e.g. if the bucket data structure were a linked list, you could append to it without invalidating iterators). The standard doesn't seem to specify what should happen when an element is removed, but since removal cannot violate the constraint above, I don't see a reason to deallocate memory.
The easiest way to find out for sure for your specific implementation is to read the source or profile your code.
Alternatively you can try to take control of this behaviour (if you really need to) by using the rehash and reserve member functions and by tuning the map's max_load_factor.
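For example, here is a minimal sketch of that approach (the container name and the sizes are illustrative, not taken from the question): pre-reserving buckets keeps the load factor below the maximum for the whole workload, so the bucket array should never be reallocated, though per-node allocations on insert remain implementation-specific.

    #include <cstddef>
    #include <iostream>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, int> m;
        m.max_load_factor(1.0f);  // the default, shown for clarity
        m.reserve(1000);          // pre-allocate buckets for ~1000 elements

        const std::size_t buckets = m.bucket_count();
        for (int i = 0; i < 100000; ++i) {
            m.emplace(i % 1000, i);       // size never exceeds 1000
            m.erase((i + 500) % 1000);
        }
        // If the bucket count is unchanged, the bucket array was never
        // reallocated; node allocations per insert are still
        // implementation-defined.
        std::cout << buckets << " vs " << m.bucket_count() << '\n';
    }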

Related

Will a copy of an unordered_map have exactly the same buckets?

As far as I understand, the number of buckets in an unordered_map increases automatically as the unordered_map fills up.
If I copy an unordered_map (to another unordered_map), it is guaranteed that the copy will contain exactly the same pairs. But will they be in the same buckets? Will the number of buckets be the same?
I don't know the mechanism by which buckets are created, and I didn't find a short explanation of how the standard requires it to be implemented. But if the number of buckets may depend on the sequence of insertions, allocations, etc., then after copying we may get a different number of buckets, or a different distribution of items among them (even if the items themselves are the same). Is that true? Both for boost's implementation and for gcc's standard implementation?
The max load factor, but not the "current" load factor, is specified as being copied when an unordered_map is copied.
Both the entry for copy construction and copy assignment include the following
In addition to the requirements of Table 64, copies the hash function,
predicate, and maximum load factor.
[unord.req]
In general this means there may be a different count of buckets, and thus a different distribution of elements into buckets in a copy.
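A quick illustration (the printed counts are implementation-dependent; nothing here is guaranteed by the standard, only permitted by it):

    #include <iostream>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, int> a;
        for (int i = 0; i < 1000; ++i)
            a.emplace(i, i);              // triggers several rehashes

        std::unordered_map<int, int> b = a; // copies elements, hash, predicate,
                                            // and max_load_factor -- but the
                                            // bucket_count need not match
        std::cout << a.bucket_count() << ' ' << b.bucket_count() << '\n';
        // The two counts may legitimately differ.
    }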

Does the C++ standard define the structure of a bucket for unordered_set?

When the hash value for an element of an unordered_set is computed, the element is placed in a "bucket" together with other, different elements that have the same hash value.
My experience is that the elements in such a bucket are stored in a singly linked list, meaning that searching inside a bucket gets very slow when the hash function is bad.
Is the singly linked list a requirement by the standard or just one possible implementation? Could one implement unordered_set with sets as buckets?
The standard states the requirements and guarantees, but doesn't explicitly force the underlying data structures and algorithms.
N4140 §23.2.5 [unord.req]/1
Unordered associative containers provide an ability for fast retrieval
of data based on keys. The worst-case complexity for most operations is
linear, but the average case is much faster.
This is a little weird, because it states the worst case complexity as a fact, instead of just allowing it.
N4140 §23.2.5 [unord.req]/9
The elements of an unordered associative container are organized into
buckets. Keys with the same hash code appear in the same bucket. The number of buckets is automatically increased as elements are added to
an unordered associative container, so that the average number of
elements per bucket is kept below a bound. Rehashing invalidates
iterators, changes ordering between elements, and changes which
buckets elements appear in, but does not invalidate pointers or
references to elements.
The above does seem to rule out std::set as a possible bucket data structure, but it should allow a set-like structure that can move elements between its instances without invalidating pointers or references.
That leaves one hurdle: sets would require a comparator/operator< to be defined (with strict weak ordering semantics), while unordered associative containers impose no such requirement. In that case you could simply fall back to a linked list, though.
So, as far as I can tell, you could replace the linked list with a set-like structure, if the aforementioned conditions were met. That being said, it does feel like a problem that you shouldn't have experienced in the first place, had you used a proper hashing algorithm.
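To see why this matters in practice, here is a small demonstration of the failure mode described above (the all-colliding hash below is deliberately terrible, purely for illustration): with a degenerate hash function, every element lands in one bucket and lookups degrade to a linear walk of that bucket's chain.

    #include <cstddef>
    #include <iostream>
    #include <unordered_set>

    struct BadHash {
        std::size_t operator()(int) const { return 0; } // every key collides
    };

    int main() {
        std::unordered_set<int, BadHash> s;
        for (int i = 0; i < 1000; ++i)
            s.insert(i);
        // All 1000 elements share one bucket, so find() walks a
        // 1000-node chain.
        std::cout << "bucket_size = " << s.bucket_size(s.bucket(0)) << '\n';
    }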

c++ unordered_map collision handling, resize and rehash

I have not read the C++ standard, but this is how I feel C++'s unordered_map is supposed to work:
Allocate a memory block in the heap.
With every put request, hash the object and map it to a space in this memory
During this process, handle collisions via chaining or open addressing.
I am quite surprised that I could not find much about how memory is handled by unordered_map. Is there a specific initial amount of memory that unordered_map allocates? What happens if, let's say, we allocated memory for 50 ints and ended up inserting 5000 integers?
That would be a lot of collisions, so I believe there should be some kind of rehashing and resizing algorithm to decrease the number of collisions once a certain collision threshold is reached. Since they are explicitly provided as member functions of the class, I assume they are used internally as well. Is there such a mechanism?
With every put request, hash the object and map it to a space in this memory
Unfortunately, this isn't exactly true. You are referring to an open addressing or closed hashing data structure which is not how unordered_map is specified.
Every unordered_map implementation stores a linked list of external nodes, referenced from the array of buckets. Meaning that inserting an item will always allocate at least once (the new node), if not twice (resizing the array of buckets, then the new node).
No, that is not at all the most efficient way to implement a hash map for most common uses. Unfortunately, a small "oversight" in the specification of unordered_map all but requires this behavior. The required behavior is that iterators to elements must stay valid when inserting or deleting other elements. Because inserting might cause the bucket array to grow (reallocate), it is not generally possible to have an iterator pointing directly into the bucket array and meet the stability guarantees.
unordered_map is a better data structure if you are storing expensive-to-copy items as your key or value. Which makes sense, given that its general design was lifted from Boost's pre-move-semantics design.
Chandler Carruth (Google) mentions this problem in his CppCon '14 talk "Efficiency with Algorithms, Performance with Data Structures".
std::unordered_map maintains a load factor that it uses to manage the size of its internal bucket array: the container grows the array so that the load factor stays below a configurable maximum (1.0 by default), which decreases the likelihood of collisions within a bucket. Collisions that do occur within a bucket are resolved by chaining through the linked list, not by linear probing.
Allocate a memory block in the heap.
True - there's a block of memory for an array of "buckets", which in the case of GCC are actually iterators capable of recording a place in a forward-linked list.
With every put request, hash the object and map it to a space in this memory
No... when you insert/emplace further items into the list, an additional dynamic (i.e. heap) allocation is done with space for the node's next link and the value being inserted/emplaced. The linked list is rewired accordingly, so the newly inserted element is linked to and/or from the other elements that hashed to the same bucket, and if other buckets also have elements, that group will be linked to and/or from the nodes for those elements.
At some point, the hash table content might look like this (GCC does things this way, but it's possible to do something simpler):
singly-linked list of all elements (each #X is a node whose key hashes to X):

    head -> #503 -> #1003 -> #22 -> #7 -> #177 -> #100

bucket array (bucket_count() == 10); each non-empty bucket stores an
iterator to the node *before* its first element:

    [0] -> #177   (node before #100, bucket 0's only element)
    [1]    empty
    [2] -> #1003  (node before #22)
    [3] -> head   (node before #503; bucket 3 holds #503 and #1003)
    [4]    empty
    [5]    empty
    [6]    empty
    [7] -> #22    (node before #7; bucket 7 holds #7 and #177)
    [8]    empty
    [9]    empty
The bucket numbers on the left index into the array from the original allocation: there are 10 buckets in the illustrated array, so "bucket_count()" == 10.
A key with hash value X - denoted #X, e.g. #177 - hashes to bucket X % bucket_count(); that bucket needs to store an iterator to the singly-linked-list element immediately before the first element hashing to that bucket, so it can remove the last element from the bucket and rewire either head, or another bucket's next pointer, to skip over the erased element.
While elements in a bucket need to be contiguous in the forward-linked list, the ordering of buckets within that list is an unimportant consequence of the order of insertion of elements in the container, and isn't stipulated in the Standard.
During this process handle collision handling via chaining or open addressing..
The Standard library containers that are backed by hash tables always use separate chaining.
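You can observe the chaining directly through the container's local (per-bucket) iterators. A small sketch (the keys are arbitrary; which of them actually share a bucket depends on your implementation's bucket count):

    #include <cstddef>
    #include <iostream>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, char> m{{1, 'a'}, {11, 'b'}, {21, 'c'}};
        // Walk the chain of whichever bucket key 1 landed in.
        std::size_t b = m.bucket(1);
        for (auto it = m.begin(b); it != m.end(b); ++it)
            std::cout << it->first << " -> " << it->second << '\n';
        // Keys that collide (same hash % bucket_count()) appear in this
        // same bucket, chained one after another.
    }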
I am quite surprised that I could not find much about how the memory is handled by unordered_map. Is there a specific initial size of memory which unordered_map allocates.
No, the C++ Standard doesn't dictate what the initial memory allocation should be; it's up to the C++ implementation to choose. You can see how many buckets a newly created table has by printing out .bucket_count(), and in all likelihood, if you multiply that by your pointer size, you'll get approximately the size of the heap allocation that the unordered container made: myUnorderedContainer.bucket_count() * sizeof(int*). That said, there's no prohibition on your Standard Library implementation varying the initial bucket_count() in arbitrary and bizarre ways (e.g. with optimisation level, or depending on the Key type), but I can't imagine why any would.
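For instance (the printed values are whatever your implementation chooses; nothing below is mandated):

    #include <iostream>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, int> m;   // newly created, empty
        std::cout << "initial buckets: " << m.bucket_count() << '\n';
        std::cout << "approx. bucket-array bytes: "
                  << m.bucket_count() * sizeof(int*) << '\n';
    }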
What happens if lets say we allocated 50 int memory and we ended up
inserting 5000 integer? This will be lot of collisions so I believe there should be kind of like a re-hashing and re-sizing algorithm to decrease the number of collisions after a certain level of collision threshold is reached.
Rehashing/resizing isn't triggered by a certain number of collisions, but a certain proneness for collisions, as measured by the load factor, which is .size() / .bucket_count().
When an insertion would push the .load_factor() above the .max_load_factor(), which you can change but is required by the C++ Standard to default to 1.0, then the hash table is resized. That effectively means it allocates more buckets - normally somewhere close to but not necessarily exactly twice as many - then it points the new buckets at the linked list nodes, then finally deletes the heap allocation with the old buckets.
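A sketch that makes this resizing visible (the exact growth pattern printed is implementation-specific; only the "grow before exceeding max_load_factor" behaviour is required):

    #include <cstddef>
    #include <iostream>
    #include <unordered_map>

    int main() {
        std::unordered_map<int, int> m;
        std::size_t last = m.bucket_count();
        for (int i = 0; i < 10000; ++i) {
            m.emplace(i, i);
            if (m.bucket_count() != last) {  // a rehash just happened
                last = m.bucket_count();
                std::cout << "size " << m.size() << " -> " << last
                          << " buckets, load_factor " << m.load_factor()
                          << '\n';
            }
        }
    }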
Since they are explicitly provided as member functions to the class, I assume they are used internally as well. Is there a such mechanism?
There is no C++ Standard requirement about how the resizing is implemented. That said, if I were implementing resize() I'd consider creating a function-local container whilst specifying the newly desired bucket_count, then iterating over the elements in the *this object, calling extract() to detach them and merge() to add them to the function-local container object, then eventually invoking swap on *this and the function-local container.
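A minimal sketch of that idea (C++17, since it relies on node handles; rehash_via_merge is a hypothetical helper name, not a standard function). merge() subsumes the per-node extract() step, splicing every node across without copying elements:

    #include <cstddef>
    #include <unordered_map>

    template <typename K, typename V>
    void rehash_via_merge(std::unordered_map<K, V>& m, std::size_t buckets) {
        std::unordered_map<K, V> local(buckets); // function-local container
        local.merge(m);  // splices every node out of m -- no element copies
        m.swap(local);   // m now owns the rebuilt table
    }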

How std::unordered_map is implemented

c++ unordered_map collision handling, resize and rehash
This is a previous question opened by me, and I can see that I have a lot of confusion about how unordered_map is implemented. I am sure many other people share that confusion. Based on what I know without having read the standard:
Every unordered_map implementation stores a linked list to external
nodes in the array of buckets... No, that is not at all the most
efficient way to implement a hash map for most common uses.
Unfortunately, a small "oversight" in the specification of
unordered_map all but requires this behavior. The required behavior is
that iterators to elements must stay valid when inserting or deleting
other elements
I was hoping that someone might explain the implementation and how it fits the C++ standard's definition (in terms of performance requirements), and, if it really is not the most efficient way to implement a hash map data structure, how it can be improved?
The Standard effectively mandates that implementations of std::unordered_set and std::unordered_map - and their "multi" brethren - use open hashing aka separate chaining, which means an array of buckets, each of which holds the head of a linked list†. That requirement is subtle: it is a consequence of:
the default max_load_factor() being 1.0 (which means the table will resize whenever size() would otherwise exceed 1.0 times the bucket_count()), and
the guarantee that the table will not be rehashed unless grown beyond that load factor.
That would be impractical without chaining, as the collisions with the other main category of hash table implementation - closed hashing aka open addressing - become overwhelming as the load_factor() approaches 1.
References:
23.2.5/15: The insert and emplace members shall not affect the validity of iterators if (N+n) < z * B, where N is the number of elements in the container prior to the insert operation, n is the number of elements inserted, B is the container’s bucket count, and z is the container’s maximum load factor.
amongst the Effects of the constructor at 23.5.4.2/1: max_load_factor() returns 1.0.
† To allow optimal iteration without passing over any empty buckets, GCC's implementation fills the buckets with iterators into a single singly-linked list holding all the values: the iterators point to the element immediately before that bucket's elements, so the next pointer there can be rewired if erasing the bucket's last value.
Regarding the text you quote:
No, that is not at all the most efficient way to implement a hash map for most common uses. Unfortunately, a small "oversight" in the specification of unordered_map all but requires this behavior. The required behavior is that iterators to elements must stay valid when inserting or deleting other elements
There is no "oversight"... what was done was very deliberate and done with full awareness. It's true that other compromises could have been struck, but the open hashing / chaining approach is a reasonable compromise for general use, that copes reasonably elegantly with collisions from mediocre hash functions, isn't too wasteful with small or large key/value types, and handles arbitrarily-many insert/erase pairs without gradually degrading performance the way many closed hashing implementations do.
As evidence of the awareness, from Matthew Austern's proposal N1456 ("A Proposal to Add Hash Tables to the Standard Library"):
I'm not aware of any satisfactory implementation of open addressing in a generic framework. Open addressing presents a number of problems:
• It's necessary to distinguish between a vacant position and an occupied one.
• It's necessary either to restrict the hash table to types with a default constructor, and to construct every array element ahead of time, or else to maintain an array some of whose elements are objects and others of which are raw memory.
• Open addressing makes collision management difficult: if you're inserting an element whose hash code maps to an already-occupied location, you need a policy that tells you where to try next. This is a solved problem, but the best known solutions are complicated.
• Collision management is especially complicated when erasing elements is allowed. (See Knuth for a discussion.) A container class for the standard library ought to allow erasure.
• Collision management schemes for open addressing tend to assume a fixed size array that can hold up to N elements. A container class for the standard library ought to be able to grow as necessary when new elements are inserted, up to the limit of available memory.
Solving these problems could be an interesting research project, but, in the absence of implementation experience in the context of C++, it would be inappropriate to standardize an open-addressing container class.
Specifically for insert-only tables with data small enough to store directly in the buckets, a convenient sentinel value for unused buckets, and a good hash function, a closed hashing approach may be roughly an order of magnitude faster and use dramatically less memory, but that's not general purpose.
A full comparison and elaboration of hash table design options and their implications is off topic for S.O. as it's way too broad to address properly here.

Does binary search have logarithmic performance of deque C++ data structure?

The standard says that std::binary_search(...) and the two related functions std::lower_bound(...) and std::upper_bound(...) are O(log n) if the data structure has random access. So, given that, I presume that these algorithms have O(log n) performance on std::deque (assuming its contents are kept sorted by the user).
However, it seems that the internal representation of std::deque is tricky (it's broken into chunks), so I was wondering: does the requirement of O(log n) search hold for std::deque?
Yes, it still holds for deque, because the container is required to provide access to any element in constant time (just with a slightly higher constant factor than vector).
That doesn't relieve you of the obligation to keep the deque sorted, though.
Yes. deque has constant-time access by index. It is organized in pages of equal size; what you have is something like a vector of pointers to pages. If you have, say, 2 pages with 100 elements each and you would like to access the 103rd element, then determine the page by 103 / 100 = 1 and the index within the page by 103 % 100 = 3. Now use constant-time access into two vectors to get the element.
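A sketch of that addressing arithmetic (page_size is illustrative; real implementations pick their own chunk size):

    #include <cstddef>
    #include <iostream>

    int main() {
        const std::size_t page_size = 100;   // illustrative chunk size
        const std::size_t i = 103;           // element we want
        std::size_t page   = i / page_size;  // -> 1  (which chunk)
        std::size_t offset = i % page_size;  // -> 3  (position in the chunk)
        std::cout << "page " << page << ", offset " << offset << '\n';
    }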
Here is the statement from http://www.cplusplus.com/reference/stl/deque/:
Deques may be implemented by specific libraries in different ways, but in all cases they allow for the individual elements to be accessed through random access iterators, with storage always handled automatically (expanding and contracting as needed).
Just write a program to test it. I think deque's binary_search will be slower than vector's, but its complexity is still O(log n).
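A small test along those lines (timing code omitted; this only demonstrates that std::binary_search and std::lower_bound work on a sorted deque with logarithmic comparison counts):

    #include <algorithm>
    #include <deque>
    #include <iostream>

    int main() {
        std::deque<int> d;
        for (int i = 0; i < 1000000; ++i)
            d.push_back(i);                    // kept sorted by construction

        // deque's iterators are random access, so both calls perform
        // O(log n) comparisons.
        std::cout << std::boolalpha
                  << std::binary_search(d.begin(), d.end(), 123456) << '\n';
        auto it = std::lower_bound(d.begin(), d.end(), 123456);
        std::cout << (it - d.begin()) << '\n'; // index of the first match
    }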