How to retrieve elements using hash values in STL hash containers? - c++

I am using an unordered set to store strings. Different strings can have the same hash value, so they would have to be added to a linked list for that hash value. Is this supported by the C++ STL unordered_set?
std::unordered_set<std::string, hashFunction> m_Dictionary;
I inserted "world", which has hash value 4, and another word "HellO", which also has hash value 4, so both should go into the dictionary. How can we achieve this?
Another requirement is that I want to search by string: if the string is present in the dictionary, the search should return true.
I also want to search by hash value, i.e. for 4 I should get "world" and "HellO" as output.
Can this be achieved with unordered_set or unordered_map? I want to use STL hash containers.
Basically I want to search both by string and by hash value. If a hash value has multiple strings, then all strings with that hash value have to be printed. I am looking for sample code showing how to achieve this.
I am aware that we can do this without using STL hash containers; I am wondering whether it is possible with them.
Thanks for help.

The hash container already supports objects that have the same hash. Basically, they are all pushed into a sublist and searched linearly, so lookups get more expensive if your hash is poor. You can't search by hash value in the STL containers.
If you really need to look things up by hash value, you will need to put a wrapper around an unordered_multimap indexed by integer, where your wrapper computes the hash and uses it as the key; the multimap can then store multiple values under the same key. You can use equal_range() to get a pair of iterators covering all entries with a common key.
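A minimal sketch of such a wrapper might look like the following. The names HashIndexedDictionary, ToyHash, insert, contains and findByHash are hypothetical, and ToyHash (string length modulo 8) is only a stand-in for the asker's hashFunction, chosen so that "world" and "HellO" collide.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy hash used only for illustration: many strings share a value,
// which makes collisions easy to see. Not the asker's hashFunction.
struct ToyHash {
    std::size_t operator()(const std::string& s) const {
        return s.size() % 8;
    }
};

// Hypothetical wrapper: computes the hash itself and uses it as the
// key of an unordered_multimap, so several strings can share one key.
class HashIndexedDictionary {
public:
    // Insert a word unless the exact string is already stored.
    void insert(const std::string& word) {
        if (!contains(word)) {
            byHash_.emplace(ToyHash{}(word), word);
        }
    }

    // Search by string: scan only the entries stored under its hash.
    bool contains(const std::string& word) const {
        auto range = byHash_.equal_range(ToyHash{}(word));
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == word) return true;
        }
        return false;
    }

    // Search by hash value: every string stored under that key.
    std::vector<std::string> findByHash(std::size_t hash) const {
        std::vector<std::string> result;
        auto range = byHash_.equal_range(hash);
        for (auto it = range.first; it != range.second; ++it) {
            result.push_back(it->second);
        }
        return result;
    }

private:
    std::unordered_multimap<std::size_t, std::string> byHash_;
};
```

With this toy hash, inserting "world" and "HellO" and then calling findByHash(5) returns both strings (in unspecified order), while contains("hello") is false even though "hello" has the same hash value.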

Related

How does std::unordered_map find values?

When iterating over an unordered map of values, std::unordered_map<Foo, Bar>, it will use an iterator pointing to values in std::pair<Foo, Bar>. This makes it seem like std::unordered_map stores its values internally as std::pairs of values.
If I'm not mistaken, std::unordered_map works by hashing the key and using that for lookups. So, if the internal structure is something like a list of pairs, how does it know which pair has the hashed key? Wouldn't it need to hash the whole pair, value included?
If the internal structure does not hold pairs, does calling std::unordered_map::begin() then create a std::pair object using the data in the map? Then how does modifying the data in the pair also modify the data in the actual map itself?
Thank you.
Let's pretend that the map is really a single vector:
std::vector<std::pair<Foo, Bar>> container;
It's easy to visualize that iterating over this container iterates over std::pairs of two classes.
The real unordered map works the same way, except instead of a vector the map stores std::pair<Foo, Bar>s in a hash table, with additional pointers that stitch the whole hash table together, that are not exposed via the map's iterators.
The way that associative containers, like maps, get loosely explained and described -- as a key/value lookup -- makes it sound like in the maps keys and values are stored separately: here's a hash table of keys, and each key points to its corresponding value. However that is not the case. The key and its value are kept together, tightly coupled in discrete std::pair objects, and the map's internal structure arranges them in a hashed table that the iterator knows how to iterate over.
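As a small illustration of the point that the stored pair is the actual element, the sketch below writes through the reference obtained while iterating. The function name addToAll is hypothetical. Note that the element type is std::pair<const Key, Value>: the key is const, so only the mapped value can be changed in place.

```cpp
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

// Adds `delta` to every mapped value by writing through the reference
// to the pair stored inside the map; no copy of the pair is made.
void addToAll(std::unordered_map<std::string, int>& m, int delta) {
    for (auto& kv : m) {
        // The element really is a pair with a const key.
        static_assert(
            std::is_same<decltype(kv),
                         std::pair<const std::string, int>&>::value,
            "elements are std::pair<const Key, Value>");
        kv.second += delta;  // modifies the map's own storage
    }
}
```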
So, if the internal structure is something like a list of pairs, how does it know which pair has the hashed key?
Neither. An unordered map can be loosely described as a:
std::vector<std::list<std::pair<Key, Value>>>
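A key's hash, reduced to a bucket index (typically the hash modulo the number of buckets), is the index in the vector. All keys in the ith position of this vector map to the same bucket index; keys with equal hash values always do, but keys with different hash values can end up there as well. All keys in the same bucket are stored in a single linked list.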
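The standard containers expose this layout through their bucket interface (bucket_count(), bucket(), and the begin(n)/end(n) local iterators). The sketch below uses it to list every element that shares a bucket with a given key; the function name sameBucket is hypothetical. Which strings actually share a bucket depends on the hash function and the current bucket count, so the result is implementation-specific.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Returns all elements stored in the bucket that `key` maps to.
// `key` itself does not have to be present in the set.
std::vector<std::string> sameBucket(const std::unordered_set<std::string>& s,
                                    const std::string& key) {
    std::vector<std::string> out;
    if (s.bucket_count() == 0) {
        return out;  // bucket() requires at least one bucket
    }
    const std::size_t b = s.bucket(key);
    // Local iterators walk a single bucket only.
    for (auto it = s.begin(b); it != s.end(b); ++it) {
        out.push_back(*it);
    }
    return out;
}
```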

Unordered_map produce secondary key

I'm using strings as a type of key for my unordered_map but is it possible that I could associate a secondary unique key, independent from the primary, so I could perform a find operation with the second key?
I was thinking that the key could be a hash number the internal hash algorithm came up with.
I thought of including an id (increasing by 1 each time) to the structure I'm saving but then again, I would have to look up for the key that is a string first.
Reason behind this: I want to make lists that enlist some of the elements in the unordered_map but saving strings in the list is very inefficient instead of saving int or long long. (I would prefer not to use pointers but rather a bookkeeping style of procedure).
You can't use the number the internal hash algorithm comes up with: the bucket position derived from it changes when the table grows in size (this is called rehashing), and hashes are not guaranteed to be unique (they certainly won't be).
Keeping pointers to elements in your list will work just fine, since unordered_map doesn't invalidate pointers to its elements when it rehashes. But deleting elements will be hard.
Boost has multi_index_container, which provides many useful database-like functions. It will be perfect for your task.
If you don't want to use Boost, you could use an unordered_map with unique integer indices, and another unordered_map which keeps string->index pairs for searching by string keys. Deletion will also be hard, because you either have to check all your lists each time you delete a record, or check whether the record still exists each time you traverse a list.
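The two-map approach could be sketched roughly as follows. Record, TwoKeyStore and all member names are hypothetical; ids are simply handed out from an increasing counter, as the question suggests.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical payload type.
struct Record {
    std::string name;
    int payload;
};

// Hypothetical store with two keys: a unique integer id (primary)
// and the string name (secondary, mapped to the id).
class TwoKeyStore {
public:
    // Inserts a record and returns its id; an existing name keeps its id.
    std::size_t insert(const std::string& name, int payload) {
        auto found = idByName_.find(name);
        if (found != idByName_.end()) return found->second;
        const std::size_t id = nextId_++;
        idByName_.emplace(name, id);
        recordById_.emplace(id, Record{name, payload});
        return id;
    }

    // Lookup by id; nullptr means "no such record" (e.g. already erased).
    const Record* findById(std::size_t id) const {
        auto it = recordById_.find(id);
        return it == recordById_.end() ? nullptr : &it->second;
    }

    // Lookup by name goes through the name -> id map first.
    const Record* findByName(const std::string& name) const {
        auto it = idByName_.find(name);
        return it == idByName_.end() ? nullptr : findById(it->second);
    }

    // Removes the record from both maps; returns false if the id is unknown.
    bool erase(std::size_t id) {
        auto it = recordById_.find(id);
        if (it == recordById_.end()) return false;
        idByName_.erase(it->second.name);
        recordById_.erase(it);
        return true;
    }

private:
    std::size_t nextId_ = 0;
    std::unordered_map<std::size_t, Record> recordById_;
    std::unordered_map<std::string, std::size_t> idByName_;
};
```

Side lists can then store plain ids. A list that may contain stale ids can check findById(id) for nullptr while traversing, which corresponds to the second deletion strategy mentioned above.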

Underlying storage and functionality of unordered_maps vs unordered_multimaps in C++?

I'm having a hard time wrapping my head around unordered_maps and unordered_multimaps because my test code isn't producing what I've been told to expect.
std::unordered_map<std::string, int> names;
names.insert(std::make_pair("Peter", 4));
names.insert(std::make_pair("George", 4));
names.insert(std::make_pair("George", 4));
When I iterate through this list, I get one instance of George first, then Peter.
1) It's my understanding unordered_maps do not allow multiple keys to map to one value, and that multimaps do. Is this true?
2) Why can Peter and George coexist at a value of 4? What is happening to the second George? And for that matter, why is George appearing first when I iterate from begin() to end() if this is unordered?
3) What is the underlying representation of an unordered map vs. unordered multimap?
4) Is there a way to insert keys into either map without providing a value? E.g. have the compiler create its own hash function that I don't need to worry about when I retrieve keys and look for collisions?
I'll make it short:
1) No. Multi... refers to keys. A (non-multi) map can't have multiple equivalent keys with different values, i.e. per key there is at most one value. A multimap can. The same holds for the unordered versions.
2) Peter != George, which is why they are different keys and may very well have the same value.
3) A hashmap.
4) Use sets.
In your example the second insertion for George into a (non-multi) map is skipped because the same key was previously inserted.
You want to use unordered_multimap to have several keys that are the same.
Since this is unordered you can't really hope to have any particular order, because it depends on the hash function.
If you want the order in which you inserted things, you need to use std::vector. Even normal maps, which are ordered, follow the comparison order of the keys and not the order in which you inserted things; for example the string "AB" comes before "BB", because "A" is less than "B".
To insert without providing a value you need a set, and not a map.
The underlying structure of the "unordered_" containers is a hash table.
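The difference in how duplicate keys are handled can be checked with a short sketch that repeats the question's three insertions for either container type. The helper name insertSampleNames is hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Performs the three insertions from the question on any map-like type.
template <typename Map>
Map insertSampleNames() {
    Map names;
    names.insert(std::make_pair(std::string("Peter"), 4));
    names.insert(std::make_pair(std::string("George"), 4));
    // unordered_map: ignored, the key already exists.
    // unordered_multimap: stored as a second "George" entry.
    names.insert(std::make_pair(std::string("George"), 4));
    return names;
}
```

Instantiated with std::unordered_map<std::string, int>, the result has 2 elements and count("George") is 1; with std::unordered_multimap<std::string, int> it has 3 elements and count("George") is 2. Iteration order is unspecified in both cases.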

C++ std::hash_map: What is the key's role

Both maps and hash_maps are designed to hold pairs of <key, data>. It's clear to me why the map should have a key for its sorting (more precisely: treeing), but I don't understand why hash_maps need a key. Why can't the data alone be hashed and placed into the hash table?
I couldn't find the answer either in the documentation or by searching around the net.
std::unordered_set works precisely in the way you describe. However, there are times when you want to map from one piece of data to another; that's where std::unordered_map comes into play.
Walk to the cupboard. Get the phone book out and look up a number. It has a mapping between a name and a number.
You are looking for a set, where a key is also the data.
C++ offers several flavours of them: set, unordered_set, etc.
A hash map, also called an unordered map, uses a hash of the key as an index into an array of buckets or slots. In other words, any hash table needs a hash function to compute an index into an array of buckets or slots, from which the correct value can be found. These indices are what the hash table uses to access the data in O(1) time in the best case.
If you want to use the data itself as the key, the appropriate container is std::set or std::unordered_set. A map holds both a key and a value; the difference between std::map and std::unordered_map is in how the data is organized; std::map sorts by the key, and std::unordered_map hashes by the key.
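To make the set/map distinction concrete, here is a small sketch with one helper for each container. The function names and the phone-book data layout are hypothetical, picked to match the phone book analogy above.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Set: the data is its own key, so the only question is "is it present?".
bool isKnownWord(const std::unordered_set<std::string>& words,
                 const std::string& w) {
    return words.count(w) != 0;
}

// Map: the key (a name) leads to separate data (a number).
// Returns an empty string when the name is not listed.
std::string lookUpNumber(
    const std::unordered_map<std::string, std::string>& phoneBook,
    const std::string& name) {
    auto it = phoneBook.find(name);
    return it == phoneBook.end() ? std::string() : it->second;
}
```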

Why retrieving elements from CMap is not ordered

In my application, I have a CMap of CString values. After adding the elements to the map, if I retrieve them in some other place, I am not getting the elements in the order of insertion. Suppose I retrieve the third element; I get the fifth instead. Is this the behavior of CMap? Why does this happen?
You asked for "why", so here goes:
A Map provides for an efficient way to retrieve values by key. It does this by using a clever datastructure that is faster for this than a list or an array would be (where you have to search through the whole list before you know if an element is in there or not). There are trade-offs, such as increased memory usage, and the inability to do some other things (such as knowing in which order things were inserted).
There are two common ways to implement this:
a hashmap, which puts keys into buckets by hash value.
a treemap, which arranges keys into a binary tree, according to how they are sorted
You can iterate over maps, but it will be according to how they are stored internally, either in key order (treemap) or completely unpredictable (hashmap). Your CMap seems to be a hashmap.
Either way, insertion order is not preserved. If you want that, you need an extra datastructure (such as a list).
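One way to add such an extra data structure is sketched below. It uses standard containers (std::unordered_map plus a std::vector of keys) rather than MFC's CMap and CString, which is an assumption made to keep the example self-contained; the class and member names are hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical wrapper: a hash map for fast lookup plus a vector
// that remembers the order in which keys were first inserted.
class InsertionOrderedMap {
public:
    // Inserts a new key or overwrites the value of an existing one.
    // An existing key keeps its original position in the order.
    void set(const std::string& key, const std::string& value) {
        auto result = values_.emplace(key, value);
        if (result.second) {
            order_.push_back(key);          // first time we see this key
        } else {
            result.first->second = value;   // overwrite, order unchanged
        }
    }

    // Lookup by key; nullptr if the key is absent.
    const std::string* find(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? nullptr : &it->second;
    }

    // Keys in the order they were first inserted.
    const std::vector<std::string>& keysInInsertionOrder() const {
        return order_;
    }

private:
    std::unordered_map<std::string, std::string> values_;
    std::vector<std::string> order_;
};
```

Erasing is left out of this sketch; supporting it would mean removing the key from the vector as well, which is linear in the number of keys unless a different ordering structure is used.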
How about reading the documentation for CMap?
http://msdn.microsoft.com/ru-ru/library/s897094z%28v=vs.71%29.aspx
It really is an unordered map. How do you retrieve elements? By GetStartPosition and GetNextAssoc? http://msdn.microsoft.com/ru-ru/library/d82fyybt%28v=vs.71%29.aspx Read the Remarks section there:
Remarks
The iteration sequence is not predictable; therefore, the "first element in the map" has no special significance.
CMap is a dictionary collection class that maps unique keys to values. Once you have inserted a key-value pair (element) into the map, you can efficiently retrieve or delete the pair using the key to access it. You can also iterate over all the elements in the map.