C++ unordered_map where key is also unordered_map - c++

I am trying to use an unordered_map with another unordered_map as a key (custom hash function). I've also added a custom equal function, even though it's probably not needed.
The code does not do what I expect, but I can't make heads or tails of what's going on. For some reason, the equal function is not called when doing find(), which is what I'd expect.
unsigned long hashing_func(const unordered_map<char,int>& m) {
string str;
for (auto& e : m)
str += e.first;
return hash<string>()(str);
}
bool equal_func(const unordered_map<char,int>& m1, const unordered_map<char,int>& m2) {
return m1 == m2;
}
int main() {
unordered_map<
unordered_map<char,int>,
string,
function<unsigned long(const unordered_map<char,int>&)>,
function<bool(const unordered_map<char,int>&, const unordered_map<char,int>&)>
> mapResults(10, hashing_func, equal_func);
unordered_map<char,int> t1 = getMap(str1);
unordered_map<char,int> t2 = getMap(str2);
cout<<(t1 == t2)<<endl; // returns TRUE
mapResults[t1] = "asd";
cout<<(mapResults.find(t2) != mapResults.end()); // returns FALSE
return 0;
}

First of all, the equality operator is certainly required, so you should keep it.
Let's look at your unordered map's hash function:
string str;
for (auto& e : m)
str += e.first;
return hash<string>()(str);
Since it's an unordered map, by definition, the iterator can iterate over the unordered map's keys in any order. However, since the hash function must produce the same hash value for the same key, this hash function will obviously fail in that regard.
Additionally, I would expect the hash function to include the values of the unordered map key, in addition to the keys themselves. Perhaps you intend it this way, so that two unordered maps are considered the same key as long as their keys match, ignoring their values. It's not clear from the question what your expectation is, but you may want to think it over.

Comparing two std::unordered_map objects using == checks whether the maps contain the same key-value pairs. It says nothing about whether they store them in the same order (it's an unordered map, after all). However, your hashing_func depends on the order of items in the map: hash<string>()("ab") is in general different from hash<string>()("ba").

A good place to start is with what hashing_func returns for each map, or more easily what the string construction in hashing_func generates.
A more obviously correct hash function for such a type could be:
unsigned long hashing_func(const unordered_map<char,int>& m) {
unsigned long res = 0;
for (auto& e : m)
res ^= hash<char>()(e.first) ^ hash<int>()(e.second);
return res;
}
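To make the order-independence concrete, here is a minimal, self-contained sketch. The names CharCount, CharCountHash, countChars and demoLookup are made up for illustration (countChars merely stands in for the getMap helper that the question does not show). It folds a per-entry hash into the result with XOR, which is commutative, so iteration order cannot change the outcome.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

using CharCount = std::unordered_map<char, int>;

// Order-independent hash: each entry is hashed on its own and the results
// are combined with XOR, so the iteration order does not matter.
struct CharCountHash {
    std::size_t operator()(const CharCount& m) const {
        std::size_t res = 0;
        for (const auto& e : m) {
            // Mix key and value within one entry first, so that swapping
            // values between keys does not trivially produce the same hash.
            std::size_t h = std::hash<char>()(e.first) * 31u
                          + std::hash<int>()(e.second);
            res ^= h;
        }
        return res;
    }
};

// Hypothetical stand-in for getMap() from the question: counts characters.
CharCount countChars(const std::string& s) {
    CharCount m;
    for (char c : s) ++m[c];
    return m;
}

// Inserts one map as a key and looks it up through an equal map that was
// built in a different order.
bool demoLookup() {
    std::unordered_map<CharCount, std::string, CharCountHash> results;
    results[countChars("abc")] = "asd";
    return results.find(countChars("cba")) != results.end();
}
```

The default std::equal_to already uses operator== on the inner maps, so no custom equality function is needed in this sketch.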

Related

The best practice for (unordered) map keys and values modification

The map of the form map<long long, vector<long long>> is given. One has to take all keys and values modulo some integer N. Some keys can merge and corresponding values must join accordingly. For example, the map {{1,{2,6,4}}, {5,{8,4,9}}, {10,{5,1,7}}} should be equal to {{1,{2,1,4}}, {0,{0,1,2,3,4}}} after reduction modulo 5.
My way is in using a new map but I think there should be a better way.
code added
vector<long long> tmp;
//integer N, for example N = 5
int N = 5;
unordered_map<long long, vector<long long>> map;
//temporary map
unordered_map<long long, vector<long long>> map_tmp;
for (auto & x : map)
{
tmp.clear();
for (auto & y : x.second) tmp.push_back(y % N);
long long ind = x.first % N;
map_tmp[ind].insert(map_tmp[ind].end(), tmp.begin(), tmp.end());
sort(map_tmp[ind].begin(), map_tmp[ind].end());
map_tmp[ind].erase(unique(map_tmp[ind].begin(), map_tmp[ind].end()), map_tmp[ind].end());
}
map = map_tmp;
Since the values stored for each key are apparently meant to be unique, both before and after applying the modulo operation, you should use a different data structure. For example:
using Map = std::unordered_map<int, std::set<int>>;
std::set will handle uniqueness and order of items for given key.
Now the whole trick is to inspect the API of std::unordered_map and std::set and see how an item can be inserted. See:
std::unordered_map::insert
std::set::insert
Note the return value: std::pair<iterator,bool>, which gives you an iterator to the inserted or already existing item in the map/set.
Knowing this, writing code that meets your requirements is quite simple:
using Map = std::unordered_map<int, std::set<int>>;
Map moduloMap(const Map& in, int mod)
{
Map out;
for (const auto& [k, s] : in) {
if (s.empty())
continue;
auto& destSet = out.insert({ k % mod, {} }).first->second;
for (auto x : s) {
destSet.insert(x % mod);
}
}
return out;
}
Sometimes a for loop can be the easiest, clearest way to do something.
map<long long, vector<long long>> result;
for (const auto& [key, vec] : input) {
process (result[key%5], vec);
}
and process takes the vector by (non-const) reference and appends the reduced values from the second (const) argument.
update
After seeing the code you posted, I have several suggestions:
use a set instead. You are spending multiple steps to append the new values, sort the whole thing together, then remove duplicates. Just use a set which maintains a single copy of each value automatically.
use structured binding in your loop. Instead of x.second and x.first you can just name them key and vec as in my earlier post.
Assuming you still need tmp, declare it where you are calling .clear() now, instead of declaring it way up at the top of your code. You don't need to clear it each time through the loop; it will be empty each time through the loop naturally.
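Putting the first two suggestions together, a minimal sketch could look like the following. The names Input, Reduced and reduceModulo are made up for illustration; the sketch assumes C++17 (for structured bindings) and non-negative keys and values, since % can yield negative results otherwise.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using Input = std::map<long long, std::vector<long long>>;
using Reduced = std::map<long long, std::set<long long>>;

// Reduces every key and value modulo n. Keys that collide after the
// reduction are merged, and std::set keeps the merged values unique
// and sorted without an explicit sort/unique pass.
Reduced reduceModulo(const Input& input, long long n) {
    Reduced result;
    for (const auto& [key, vec] : input) {
        std::set<long long>& dest = result[key % n];  // created on first use
        for (long long v : vec) {
            dest.insert(v % n);
        }
    }
    return result;
}
```

With the example from the question, reduceModulo(input, 5) merges the keys 5 and 10 into the key 0.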

Same key, multiple entries for std::unordered_map?

I have a map inserting multiple values with the same key of C string type.
I would expect to have a single entry with the specified key.
However, the map seems to take its address into consideration when uniquely identifying a key.
#include <cassert>
#include <iostream>
#include <string>
#include <unordered_map>
typedef char const* const MyKey;
/// @brief Hash function for StatementMap keys
///
/// Delegates to std::hash<std::string>.
struct MyMapHash {
public:
size_t operator()(MyKey& key) const {
return std::hash<std::string>{}(std::string(key));
}
};
typedef std::unordered_map<MyKey, int, MyMapHash> MyMap;
int main()
{
// Build std::strings to prevent optimizations on the addresses of
// underlying C strings.
std::string key1_s = "same";
std::string key2_s = "same";
MyKey key1 = key1_s.c_str();
MyKey key2 = key2_s.c_str();
// Make sure addresses are different.
assert(key1 != key2);
// Make sure hashes are identical.
assert(MyMapHash{}(key1) == MyMapHash{}(key2));
// Insert two values with the same key.
MyMap map;
map.insert({key1, 1});
map.insert({key2, 2});
// Make sure we find them in the map.
auto it1 = map.find(key1);
auto it2 = map.find(key2);
assert(it1 != map.end());
assert(it2 != map.end());
// Get values.
int value1 = it1->second;
int value2 = it2->second;
// The first one of any of these asserts fails. Why is there not only one
// entry in the map?
assert(value1 == value2);
assert(map.size() == 1u);
}
A print in the debugger shows that map contains two elements just after inserting them.
(gdb) p map
$4 = std::unordered_map with 2 elements = {
[0x7fffffffda20 "same"] = 2,
[0x7fffffffda00 "same"] = 1
}
Why does this happen if the hash function, which delegates to std::hash<std::string>, only takes its value into account (this is asserted in the code)?
Moreover, if this is the intended behaviour, how can I use a map with C string as key, but with a 1:1 key-value mapping?
The reason is that hash maps (like std::unordered_map) do not rely only on the hash function to determine whether two keys are equal. The hash function is the first comparison layer; after that, the keys are always also compared by value. The reason is that even with good hash functions you might have collisions where two different keys yield the same hash value, but you still need to be able to save both entries in the hash map. There are various strategies to handle that; you can find more information by searching for collision resolution for hash maps.
In your example both entries have the same hash value but different keys. The keys are compared by the standard comparison function, which compares the char* pointers, and those are different. Therefore the key comparison fails and you get two entries in the map. To solve your issue you also need to define a custom equality function for your hash map, which can be done by specifying the fourth template parameter KeyEqual for std::unordered_map.
This fails because the unordered_map does not and cannot solely rely on the hash function for the key to differentiate keys, but it must also compare keys with the same hash for equality. And comparing two char pointers compares the address pointed to.
If you want to change the comparison, pass a KeyEqual parameter to the map in addition to the hash.
struct MyKeyEqual
{
bool operator()(MyKey const &lhs, MyKey const &rhs) const
{
return std::strcmp(lhs, rhs) == 0;
}
};
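For context, a complete sketch of a map that uses both a content-based hash and a content-based equality could look like this. The names CStrKey, CStrHash, CStrEqual, CStrMap and insertSameTextTwice are made up for illustration; only std::strcmp, std::hash and std::unordered_map are standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>

using CStrKey = const char*;

// Hashes the characters the pointer refers to, not the pointer itself.
struct CStrHash {
    std::size_t operator()(CStrKey key) const {
        return std::hash<std::string>{}(std::string(key));
    }
};

// Compares the characters, so two distinct pointers to equal text match.
struct CStrEqual {
    bool operator()(CStrKey lhs, CStrKey rhs) const {
        return std::strcmp(lhs, rhs) == 0;
    }
};

using CStrMap = std::unordered_map<CStrKey, int, CStrHash, CStrEqual>;

// Inserts the same text through two different addresses and reports how
// many entries the map ends up with.
std::size_t insertSameTextTwice() {
    std::string a = "same";
    std::string b = "same";
    CStrMap map;
    map.insert({a.c_str(), 1});
    map.insert({b.c_str(), 2});  // rejected: an equal key already exists
    return map.size();
}
```

Keep in mind that such a map only stores pointers, so the pointed-to strings must outlive the map entries.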
unordered_map needs to be able to perform two operations on the key: checking equality and obtaining a hash code. Naturally, two unequal keys are allowed to have the same hash code. When this happens, the unordered map applies its hash collision resolution strategy to treat these unequal keys as distinct.
That is precisely what happens when you supply a character pointer for the key, and provide an implementation of hash to it: the default equality comparison for pointers kicks in, so two different pointers produce two different keys, even though the content of the corresponding C strings is the same.
You can fix it by providing a custom implementation of KeyEqual template parameter to perform actual comparison of C strings, for example, by calling strcmp:
return !strcmp(lhsKey, rhsKey);
You didn't define a map of keys but a map of pointers to a key.
typedef char const* const MyKey;
The compiler can merge two instances of the literal "same" and use only one instance in the const data segment, but that may or may not happen; the result is unspecified.
Your map should contain the key itself. Make the key a std::string or similar.
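A minimal sketch of that suggestion, assuming std::string is acceptable as the key type (the function name buildMap is made up for illustration):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// With std::string keys the map owns a copy of the text, and both the
// default hash and the default equality work on the characters.
std::unordered_map<std::string, int> buildMap() {
    std::string key1_s = "same";
    std::string key2_s = "same";

    std::unordered_map<std::string, int> map;
    map.emplace(key1_s.c_str(), 1);
    map.emplace(key2_s.c_str(), 2);  // no effect: the key "same" exists already
    return map;
}
```

Because emplace does not overwrite, the value of the first insertion is kept; use operator[] or insert_or_assign (C++17) if the later value should win.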

Accessing pair-vector to return using iterator

my task is to overload the [] operator and use girl[index] = partner to write stuff into a pair vector:
class Dancers {
unsigned long & operator [] (int i);
vector<pair<int,string>> m_dancers;
};
unsigned long & operator [] (int i) {
auto iter = lower_bound(m_dancers.first.begin(), m_dancers.first.end(), i, cmpInt);
m_dancers.first.insert(iter, i);
//what now?
}
int main() {
Pairs girl;
girl[0] = "Richard";
return 0;
}
So I've managed to sort the girls and now I have the girl that I want to assign a partner to. From what I understand, now it's time to return the reference so I can assign the partner. How do I do that using the iterator?
And MORE IMPORTANTLY: is there a more efficient way to assign x and y to a pair-vector in a a[x] = y situation? Or am I trying to reinvent a wheel?
Presumably you don't want to insert a new element if there is already an existing key in your map (i.e. you want a unique-key map, not a multi-key map). So you need to check the key and only insert conditionally. And you want to return the mapped element, not the key.
string & operator[](int key) {
auto it = lower_bound(m_dancers.begin(), m_dancers.end(), key, cmpInt);
if (it == m_dancers.end() || it->first != key)
it = m_dancers.insert(it, make_pair(key, string()));
return it->second;
}
If you wanted a multi-key map instead, then just omit the conditional check and make the insertion unconditionally. (But then you'd probably want to use upper_bound so that new elements are added at the end of their equal-range.)
To summarize, things that needed to be fixed in your code:
Return type
Iterators are from the vector, not from the pair
Insertion is conditional
Remember the result of the insertion
You misspelled your use case, it should say Dancers girl;
You are probably misspelling the out-of-line member definition; it should say string & Dancers::operator[](int i)... (or just define it inline).
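With all of these points applied, a self-contained sketch of the class might look like this. The lambda stands in for cmpInt, which the question does not show, and size() is added here only so the behaviour can be checked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Dancers {
public:
    // Returns a reference to the partner stored for `key`, inserting an
    // empty entry at the sorted position if the key is not present yet.
    std::string& operator[](int key) {
        auto it = std::lower_bound(
            m_dancers.begin(), m_dancers.end(), key,
            [](const std::pair<int, std::string>& entry, int k) {
                return entry.first < k;  // compare by key only
            });
        if (it == m_dancers.end() || it->first != key) {
            it = m_dancers.insert(it, std::make_pair(key, std::string()));
        }
        return it->second;
    }

    std::size_t size() const { return m_dancers.size(); }

private:
    std::vector<std::pair<int, std::string>> m_dancers;  // kept sorted by key
};
```

Note that a reference returned by operator[] may be invalidated by a later insertion, because the vector can reallocate or shift its elements. As for reinventing the wheel: std::map<int, std::string> already offers the same a[x] = y interface.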

How not to use custom comparison function of std::map in searching ( map::find)?

As you can see in my code, lenMap is a std::map with a custom comparison function. This function just check the string's length.
Now when I want to search for some key ( using map::find), the map still uses that custom comparison function.
But How can I force my map not to use that when I search for some key ?
Code:
struct CompareByLength : public std::binary_function<string, string, bool>
{
bool operator()(const string& lhs, const string& rhs) const
{
return lhs.length() < rhs.length();
}
};
int main()
{
typedef map<string, string, CompareByLength> lenMap;
lenMap mymap;
mymap["one"] = "one";
mymap["a"] = "a";
mymap["foobar"] = "foobar";
// Now In mymap: [a, one, foobar]
string target = "b";
if (mymap.find(target) == mymap.end())
cout << "Not Found :) !";
else
cout << "Found :( !"; // I don't want to reach here because of "a" item !
return 0;
}
The map itself does not offer such an operation. The idea of the comparison functor is to create an internal ordering for faster lookup, so the elements are actually ordered according to your functor.
If you need to search for elements in a different way, you can either use the STL algorithm std::find_if() (which has linear time complexity) or create a second map that uses another comparison functor.
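A minimal sketch of the std::find_if approach, assuming the map from the question (the names LenMap and containsExact are made up for illustration):

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

struct CompareByLength {
    bool operator()(const std::string& lhs, const std::string& rhs) const {
        return lhs.length() < rhs.length();
    }
};

using LenMap = std::map<std::string, std::string, CompareByLength>;

// Linear search that compares the full key text instead of using the
// map's own length-based ordering.
bool containsExact(const LenMap& m, const std::string& target) {
    auto it = std::find_if(m.begin(), m.end(),
                           [&target](const LenMap::value_type& kv) {
                               return kv.first == target;
                           });
    return it != m.end();
}
```

In this sketch, mymap.find("b") still reports a match when "a" is stored, because the map treats keys of equal length as equivalent, while containsExact(mymap, "b") does not.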
In your specific example, since you seem only to be interested in the string's length, you should rather use the length (of type std::size_t) and not the string itself as a key.
By the way, std::binary_function is not needed as a base class. It has been deprecated since C++11 and was removed in C++17.
The comparison function tells the map how to order elements and how to differentiate between them. If it only compares the length, two different strings with the same length will occupy the same position in the map (one will overwrite the other).
Either store your strings in a different data structure and sort them, or perhaps try this comparison function:
struct CompareByLength
{
bool operator()(const string& lhs, const string& rhs) const
{
if (lhs.length() < rhs.length())
{
return true;
}
else if (rhs.length() < lhs.length())
{
return false;
}
else
{
return lhs < rhs;
}
}
};
I didn't test it, but I believe this will first order strings by length, and then however strings normally compare.
You could also use std::map<std::string::size_type, std::map<std::string, std::string>> and use the length for the first map and the string value for the second map. You would probably want to wrap this in a class to make it easier to use, as there is no protection against messing it up.
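One way such a wrapper could look is sketched below. The class name LengthIndex and its member functions are made up for illustration; only the nested std::map layout comes from the suggestion above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Groups entries by key length first, then by the key text itself.
class LengthIndex {
public:
    void insert(const std::string& key, const std::string& value) {
        byLength_[key.size()][key] = value;
    }

    // Exact lookup: returns nullptr if the key is not stored.
    const std::string* find(const std::string& key) const {
        auto outer = byLength_.find(key.size());
        if (outer == byLength_.end()) return nullptr;
        auto inner = outer->second.find(key);
        return inner == outer->second.end() ? nullptr : &inner->second;
    }

    // Number of stored keys that have the given length.
    std::size_t countWithLength(std::size_t len) const {
        auto outer = byLength_.find(len);
        return outer == byLength_.end() ? 0 : outer->second.size();
    }

private:
    std::map<std::string::size_type, std::map<std::string, std::string>> byLength_;
};
```

Keeping the nested maps private means callers cannot insert a string under the wrong length.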

Finding an element in map by its value

I'm creating a HandleManager whose purpose is to simply map Handles (which is a typedef of long long int) to strings. The purpose is so that objects that use a Handle can also be identified via strings if it helps a user remember the object. In which case, in this map:
typedef std::unordered_map<Handle, std::string> HandleMap;
both types in the pair are keys insofar they can be used to identify anything. So far everything has compiled apart from the code which needs to get the Handle. The purpose is such that when a user allocates a string like so:
handle("myHandle");
A Handle is generated randomly and then the string passed is paired with it in the aforesaid map. What I want now is to be able to get the Handle that is paired with the string based on the string that is passed:
Handle HandleManager::id(const std::string &name)
{
HandleMap::iterator it = pHandles.find(name);
if (it != pHandles.end())
return it->first;
return -1;
}
But for some weird reason the compiler complains about this:
HandleManager.cpp:48:45: error: no matching function for call to ‘std::unordered_map<long long int, std::basic_string<char> >::find(const string&)’
In the aforesaid map, the string is the value and the Handle is the key. So how can I get the key from the unordered_map based on the value contained therein?
The member function find only searches by key. To search for a value, you can use std::find_if with a lambda function (if you use C++11), or traverse the map yourself (which also works in earlier C++ versions):
for (HandleMap::const_iterator it = map.begin(); it != map.end(); ++it) {
if (it->second == name) return it->first;
}
// or value not found
On the other hand, if searching for a value is a very common operation, you may want to have two maps: std::unordered_map<Handle, std::string> and std::unordered_map<std::string, Handle>. In that case, you have to make sure you perform insertions, deletions, etc. in both maps to keep them synchronized.
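A minimal sketch of the two-map approach, assuming handles and names are both meant to be unique. The class name HandleRegistry and its members are made up for illustration and are not part of the asker's HandleManager.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Handle = long long;

// Keeps a forward and a reverse map and updates both in every operation,
// so they cannot drift apart.
class HandleRegistry {
public:
    // Returns false if either the handle or the name is already in use.
    bool add(Handle h, const std::string& name) {
        if (byHandle_.count(h) || byName_.count(name)) return false;
        byHandle_.emplace(h, name);
        byName_.emplace(name, h);
        return true;
    }

    bool remove(Handle h) {
        auto it = byHandle_.find(h);
        if (it == byHandle_.end()) return false;
        byName_.erase(it->second);  // erase the reverse entry first
        byHandle_.erase(it);
        return true;
    }

    // Average constant-time lookup by name; -1 signals "not found",
    // following the convention used in the question.
    Handle id(const std::string& name) const {
        auto it = byName_.find(name);
        return it == byName_.end() ? -1 : it->second;
    }

private:
    std::unordered_map<Handle, std::string> byHandle_;
    std::unordered_map<std::string, Handle> byName_;
};
```

The cost of this design is that every name is stored twice, in exchange for fast lookups in both directions.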
std::unordered_map::find operates on the key, not the value. You can use std::find_if:
Handle HandleManager::id(const std::string &name)
{
auto it = std::find_if(std::begin(pHandles), std::end(pHandles),
[&name](auto&& p) { return p.second == name; });
if (it == std::end(pHandles))
return -1;
return it->first;
}
Note that auto, std::begin, std::end and lambdas are C++11 and generic lambdas are C++14, so substitute those out if you're stuck with an old compiler.
But for some weird reason the compiler complains about this:
Of course it does: the find function is for looking up by key, and you're not doing that.
To find a value you need to visit every element until you find it (or use a bidirectional map which maps values back to keys, e.g. Boost.Bimap).
Based on the answer from @TartanLlama (*):
Handle HandleManager::id(const std::string & name) {
auto it = std::find_if(std::begin(pHandles), std::end(pHandles),
[& name](auto && pair) {
return pair.second == name;
});
if (it == std::end(pHandles)) {
return -1;
}
return it->first;
}
(*): Because it doesn't seem possible to format code in comments.