unordered_map - Hash function has no effect - c++

Why in the following the hash function (which returns constant 0) seems not be taking any effect?
Since the hash function is returning constant, I was expecting as output all values to be 3. However, it seems to uniquely map the std::vector values to a unique value, regardless of my hash function being constant.
#include <iostream>
#include <map>
#include <unordered_map>
#include <vector>
// Hash returning always zero.
class TVectorHash {
public:
std::size_t operator()(const std::vector<int> &p) const {
return 0;
}
};
int main ()
{
std::unordered_map<std::vector<int> ,int, TVectorHash> table;
std::vector<int> value1({0,1});
std::vector<int> value2({1,0});
std::vector<int> value3({1,1});
table[value1]=1;
table[value2]=2;
table[value3]=3;
std::cout << "\n1=" << table[value1];
std::cout << "\n2=" << table[value2];
std::cout << "\n3=" << table[value3];
return 0;
}
Obtained output:
1=1
2=2
3=3
Expected output:
1=3
2=3
3=3
What am I missing about hash?

You misunderstood the use of the hash function: it's not used to compare elements. Internally, the map organizes the elements into buckets and the hash function is used to determine the bucket into which the element resides. Comparison of the elements is performed with another template parameter, look at the full declaration of the unordered_map template:
template<
class Key,
class T,
class Hash = std::hash<Key>,
class KeyEqual = std::equal_to<Key>,
class Allocator = std::allocator< std::pair<const Key, T> >
> class unordered_map;
The next template parameter after the hasher is the key comparator. To get the behavior you expect, you would have to do something like this:
class TVectorEquals {
public:
bool operator()(const std::vector<int>& lhs, const std::vector<int>& rhs) const {
return true;
}
};
std::unordered_map<std::vector<int> ,int, TVectorHash, TVectorEquals> table;
Now your map will have a single element and all your results will be 3.

A sane hash table implementation should not lose information, even in the presence of hash collisions. There are several techniques that allow the resolution of collisions (usually trading off runtime performance to data integrity).
Obviously, std::unordered_map implements it.
See: Hash Collision Resolution

Add a predicate key comparer class.
class TComparer {
public:
bool operator()(const std::vector<int> &a, const std::vector<int> &b) const {
return true; // this means that all keys are considered equal
}
};
Use it like this:
std::unordered_map<std::vector<int> ,int, TVectorHash, TComparer> table;
Then the rest of your code will work as expected.

Related

Getting error while sorting a map by value [duplicate]

I was trying to make a map sort by value using a custom comparator but I couldn't figure out why I kept getting the error of "no matching call to compareByVal"
Here's what I had in my main.cpp:
#include <map>
#include <iostream>
struct compareByVal {
bool operator[](const std::pair<int,int> & a, const std::pair<int,int> & b)
return a.second < b.second;
}
int main() {
std::map<int,int,compareByVal> hash;
hash[1] = 5;
hash[2] = 2;
hash[3] = 10;
std::cout << hash.begin()->first << std::endl;
}
The first, simple problem is
struct compareByVal {
bool operator[](const std::pair<int,int> & a, const std::pair<int,int> & b)
return a.second < b.second;
}
should be
struct compareByVal {
bool operator()(const std::pair<int,int> & a, const std::pair<int,int> & b) const {
return a.second < b.second;
}
};
The second, serious problem is the signature of the compare is wrong. It should be
struct compareByVal {
bool operator()(const int leftKey, const int rightKey) const;
}
You can't access the value in the compare function. There is no (simple) way to sort a map by value.
Simply put, you cannot. Not sure which compiler you're using, but clang and gcc both give useful messages. with context.
clang:
static_assert(__is_invocable<_Compare&, const _Key&, const _Key&>{},
gcc:
if (__i == end() || key_comp()(__k, (*__i).first))
You can see that clang and gcc are both calling the compare method with only they key, and not a value. This is simply how maps work.
If you want to sort by value, you would have to create your own custom map, or, more realistically, use the value as the key instead. Creating your own map to achieve this would be more difficult than you'd think, since it would have to sort after any value is modified.
If you want to sort a std::map by its value, then you are using the wrong container. std::map is sorted by the keys by definition.
You can wrap key and value:
struct foo {
int key;
int value;
};
and then use a std::set<foo> that uses a comparator that only compares foo::value.
Well, first, the reason you're getting the error: "no matching call to compareByVal" is because map's comparator works only with the keys. So the comparator should like:
struct compareByVal {
template <typename T>
bool operator()(const T& a, const T& b) const
return a < b;
}
Coming on to what you want to achieve, I see two ways of doing so:
Copy all the elements of the map to a std::vector and sort that:
std::vector<std::pair<int,int> > v(hash.begin(), hash.end());
std::sort(v.begin(), v.end(), [](const auto& a, const auto& b) { return a.second < b.second; });
Copy all the elements of the map to another map with keys as values and values as keys. If values of your map are not unique, you can use a std::multimap instead.
This may be an X-Y issue.
If you need to sort by both key and value, then a single std::map may not be the most efficient choice.
In database theory, all the data would be placed into a single table. An index table would be created describing the access or sorting method. Data that needs to be sorted in more than one method would have multiple index tables.
In C++, the core table would be a std::vector. The indices would be std::map<key1, vector_index>, std::map<key2, vector_index>, where vector_index is the index of the item in the core table.
Example:
struct Record
{
int age;
std::string name;
};
// Core table
std::vector<Record> database;
// Index by age
std::map<int, unsigned int> age_index_table;
// Index by name
std::map<std::string, unsigned int> name_index_table;
// Fetching by age:
unsigned int database_index = age_index_table[42];
Record r = database[database_index];
// Fetching by name:
unsigned int database_index = name_index_table["Harry Potter"];
Record r = database[database_index];
You can learn more by searching the internet for "database index tables c++".
If it looks like a database and smells like a database ...

Using string_view to search unordered_map<string, int> [duplicate]

I have an unordered_map that uses a string-type as a key:
std::unordered_map<string, value> map;
A std::hash specialization is provided for string, as well as a
suitable operator==.
Now I also have a "string view" class, which is a weak pointer into an existing string, avoiding heap allocations:
class string_view {
string *data;
size_t begin, len;
// ...
};
Now I'd like to be able to check if a key exists in the map using a string_view object. Unfortunately, std::unordered_map::find takes a Key argument, not a generic T argument.
(Sure, I can "promote" one to a string, but that causes an allocation I'd like to avoid.)
What I would've liked instead was something like
template<class Key, class Value>
class unordered_map
{
template<class T> iterator find(const T &t);
};
which would require operator==(T, Key) and std::hash<T>() to be suitably defined, and would return an iterator to a matching value.
Is there any workaround?
P0919R2 Heterogeneous lookup for unordered containers has been merged in the C++2a's working draft!
The abstract seems a perfect match w.r.t. my original question :-)
Abstract
This proposal adds heterogeneous lookup support to the unordered associative containers in the C++ Standard Library. As a result, a creation of a temporary key object is not needed when different (but compatible) type is provided as a key to the member function. This also makes unordered and regular associative container interfaces and functionality more compatible with each other.
With the changes proposed by this paper the following code will work without any additional performance hits:
template<typename Key, typename Value>
using h_str_umap = std::unordered_map<Key, Value, string_hash>;
h_str_umap<std::string, int> map = /* ... */;
map.find("This does not create a temporary std::string object :-)"sv);
As mentioned above, C++14 does not provide heterogeneous lookup for std::unordered_map (unlike std::map). You can use Boost.MultiIndex to define a fairly close substitute for std::unordered_map that allows you to look up string_views without allocating temporary std::strings:
Live Coliru Demo
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <string>
using namespace boost::multi_index;
struct string_view
{
std::string *data;
std::size_t begin,len;
};
template<typename T,typename Q>
struct mutable_pair
{
T first;
mutable Q second;
};
struct string_view_hash
{
std::size_t operator()(const string_view& v)const
{
return boost::hash_range(
v.data->begin()+v.begin,v.data->begin()+v.begin+v.len);
}
std::size_t operator()(const std::string& s)const
{
return boost::hash_range(s.begin(),s.end());
}
};
struct string_view_equal_to
{
std::size_t operator()(const std::string& s1,const std::string& s2)const
{
return s1==s2;
}
std::size_t operator()(const std::string& s1,const string_view& v2)const
{
return s1.size()==v2.len&&
std::equal(
s1.begin(),s1.end(),
v2.data->begin()+v2.begin);
}
std::size_t operator()(const string_view& v1,const std::string& s2)const
{
return v1.len==s2.size()&&
std::equal(
v1.data->begin()+v1.begin,v1.data->begin()+v1.begin+v1.len,
s2.begin());
}
};
template<typename Q>
using unordered_string_map=multi_index_container<
mutable_pair<std::string,Q>,
indexed_by<
hashed_unique<
member<
mutable_pair<std::string,Q>,
std::string,
&mutable_pair<std::string,Q>::first
>,
string_view_hash,
string_view_equal_to
>
>
>;
#include <iostream>
int main()
{
unordered_string_map<int> m={{"hello",0},{"boost",1},{"bye",2}};
std::string str="helloboost";
auto it=m.find(string_view{&str,5,5});
std::cout<<it->first<<","<<it->second<<"\n";
}
Output
boost,1
I faced an equal problem.
We need two structs:
struct string_equal {
using is_transparent = std::true_type ;
bool operator()(std::string_view l, std::string_view r) const noexcept
{
return l == r;
}
};
struct string_hash {
using is_transparent = std::true_type ;
auto operator()(std::string_view str) const noexcept {
return std::hash<std::string_view>()(str);
}
};
For unordered_map:
template <typename Value>
using string_unorderd_map = std::unordered_map<std::string, Value, string_hash, string_equal>;
For unordered_set:
using string_unorderd_set = std::unordered_set<std::string, string_hash, string_equal>;
Now using string_view is possible.
It looks like only as recently as C++14 did even the basic map get such a templated find for is_transparent types in the comparison. Most likely the correct implementation for hashed containers was not immediately evident.
As far as I can see your two options are:
Just do the allocation and profile to see if maybe it's not actually a problem.
Take a look at boost::multi_index (http://www.boost.org/doc/libs/1_60_0/libs/multi_index/doc/index.html) and have both string and string_view indexes into the container.
This solution has drawbacks, which may or may not make it unviable for your context.
You can make a wrapper class:
struct str_wrapper {
const char* start, end;
};
And change your map to use str_wrapper as its key. You'd have to add 2 constructors to str_wrapper, one for std::string and one for your string_view. The major decision is whether to make these constructors perform deep or shallow copies.
For example, if you use std::string only for inserts and str_view only for lookups, you'd make the std::string constructor deep and the str_view one shallow (this can be enforced at compile time if you use a custom wrapper around unordered_map). If you care to avoid memory leaks on the deep copy you would need additional fields to support proper destruction.
If your usage is more varied, (looking up std::string's or inserting by str_view), there will be drawbacks, which again, might make the approach too distasteful so as to be unviable. It depends on your intended usage.
Yet another option is to split the lookup and the data management by using multiple containters:
std::unordered_map<string_view, value> map;
std::vector<unique_ptr<const char[]>> mapKeyStore;
Lookups are done using string_views without the need of allocations.
Whenever a new key is inserted we need to add a real string allocation first:
mapKeyStore.push_back(conv(str)); // str can be string_view, char*, string... as long as it converts to unique_ptr<const char[]> or whatever type
map.emplace(mapKeyStore.back().get(), value)
It would be much more intuitive to use std::string in the mapKeyStore. However, using std::string does not guarantee unchanging string memory (e.g. if the vector resizes). With the unique_ptr this is enforced. However, we need some special conversion/allocation routine, called conv in the example. If you have a custom string container which guarantees data consistency under moves (and forces the vector to use moves), then you can use it here.
The drawback
The disadvantage of the above method is that handling deletions is non-trivial and expensive if done naive. If the map is only created once or only growing this is a non-issue and the above pattern works quite well.
Running example
The below example includes a naive deletion of one key.
#include <vector>
#include <unordered_map>
#include <string>
#include <string_view>
#include <iostream>
#include <memory>
#include <algorithm>
using namespace std;
using PayLoad = int;
unique_ptr<const char[]> conv(string_view str) {
unique_ptr<char[]> p (new char [str.size()+1]);
memcpy(p.get(), str.data(), str.size()+1);
return move(p);
}
int main() {
unordered_map<string_view, PayLoad> map;
vector<unique_ptr<const char[]>> mapKeyStore;
// Add multiple values
mapKeyStore.push_back(conv("a"));
map.emplace(mapKeyStore.back().get(), 3);
mapKeyStore.push_back(conv("b"));
map.emplace(mapKeyStore.back().get(), 1);
mapKeyStore.push_back(conv("c"));
map.emplace(mapKeyStore.back().get(), 4);
// Search all keys
cout << map.find("a")->second;
cout << map.find("b")->second;
cout << map.find("c")->second;
// Delete the "a" key
map.erase("a");
mapKeyStore.erase(remove_if(mapKeyStore.begin(), mapKeyStore.end(),
[](const auto& a){ return strcmp(a.get(), "a") == 0; }),
mapKeyStore.end());
// Test if verything is OK.
cout << '\n';
for(auto it : map)
cout << it.first << ": " << it.second << "\n";
return 0;
}
Of course, the two containers can be put into a wrapper which handles the insertion and deletion for its own.
I'll just present one variation I found on github, it involves defining a new map class that wraps the std.
Redefining some key API to intercept the adaptors we want, and use a static string to copy the key.
It's not necessary a good solution, but it's interesting to know it exists for people who deems it enough.
original:
https://gist.github.com/facontidavide/95f20c28df8ec91729f9d8ab01e7d2df
code gist:
template <typename Value>
class StringMap: public std::unordered_map<std::string, Value>
{
public:
typename std::unordered_map<string,Value>::iterator find(const nonstd::string_view& v )
{
tmp_.reserve( v.size() );
tmp_.assign( v.data(), v.size() );
return std::unordered_map<string, Value>::find(tmp_);
}
typename std::unordered_map<std::string,Value>::iterator find(const std::string& v )
{
return std::unordered_map<std::string, Value>::find(v);
}
typename std::unordered_map<std::string,Value>::iterator find(const char* v )
{
tmp_.assign(v);
return std::unordered_map<std::string, Value>::find(v);
}
private:
thread_local static std::string tmp_;
};
credits:
Davide Faconti
Sorry for answering this very old question, but it still comes up in search engine results...
In this case your unordered_map is using the string type as its key, the find method is looking for a reference to a string which will not generate an allocation. Your string_view class stores a pointer to a string. Therefore your string_view class can dereference the pointer into a ref of the type needed for your map without causing an allocation. The method would look like this...
string &string_view::getRef() const
{
return *_ptr;
}
and to use the string_view with the map it would look like this
auto found=map.find(string_view_inst.getRef());
note that this will not work for the c++17 string_view class as it does not internally store a std::string object
ps.
Your string_view class is probably not great for cpu caches as it stores a pointer to a string allocated somewhere on the heap, and the string itself stores a pointer to the actual data located somewhere else on the heap. Every time you access your string_view it will result in a double dereference.
You could allow your view to be implicitly convertible to a std::string:
class StringView {
// ...
operator std::string() const
{
return data->substr(begin, len);
}
// ...
};

std::unordered_map::find using a type different than the Key type?

I have an unordered_map that uses a string-type as a key:
std::unordered_map<string, value> map;
A std::hash specialization is provided for string, as well as a
suitable operator==.
Now I also have a "string view" class, which is a weak pointer into an existing string, avoiding heap allocations:
class string_view {
string *data;
size_t begin, len;
// ...
};
Now I'd like to be able to check if a key exists in the map using a string_view object. Unfortunately, std::unordered_map::find takes a Key argument, not a generic T argument.
(Sure, I can "promote" one to a string, but that causes an allocation I'd like to avoid.)
What I would've liked instead was something like
template<class Key, class Value>
class unordered_map
{
template<class T> iterator find(const T &t);
};
which would require operator==(T, Key) and std::hash<T>() to be suitably defined, and would return an iterator to a matching value.
Is there any workaround?
P0919R2 Heterogeneous lookup for unordered containers has been merged in the C++2a's working draft!
The abstract seems a perfect match w.r.t. my original question :-)
Abstract
This proposal adds heterogeneous lookup support to the unordered associative containers in the C++ Standard Library. As a result, a creation of a temporary key object is not needed when different (but compatible) type is provided as a key to the member function. This also makes unordered and regular associative container interfaces and functionality more compatible with each other.
With the changes proposed by this paper the following code will work without any additional performance hits:
template<typename Key, typename Value>
using h_str_umap = std::unordered_map<Key, Value, string_hash>;
h_str_umap<std::string, int> map = /* ... */;
map.find("This does not create a temporary std::string object :-)"sv);
As mentioned above, C++14 does not provide heterogeneous lookup for std::unordered_map (unlike std::map). You can use Boost.MultiIndex to define a fairly close substitute for std::unordered_map that allows you to look up string_views without allocating temporary std::strings:
Live Coliru Demo
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <string>
using namespace boost::multi_index;
struct string_view
{
std::string *data;
std::size_t begin,len;
};
template<typename T,typename Q>
struct mutable_pair
{
T first;
mutable Q second;
};
struct string_view_hash
{
std::size_t operator()(const string_view& v)const
{
return boost::hash_range(
v.data->begin()+v.begin,v.data->begin()+v.begin+v.len);
}
std::size_t operator()(const std::string& s)const
{
return boost::hash_range(s.begin(),s.end());
}
};
struct string_view_equal_to
{
std::size_t operator()(const std::string& s1,const std::string& s2)const
{
return s1==s2;
}
std::size_t operator()(const std::string& s1,const string_view& v2)const
{
return s1.size()==v2.len&&
std::equal(
s1.begin(),s1.end(),
v2.data->begin()+v2.begin);
}
std::size_t operator()(const string_view& v1,const std::string& s2)const
{
return v1.len==s2.size()&&
std::equal(
v1.data->begin()+v1.begin,v1.data->begin()+v1.begin+v1.len,
s2.begin());
}
};
template<typename Q>
using unordered_string_map=multi_index_container<
mutable_pair<std::string,Q>,
indexed_by<
hashed_unique<
member<
mutable_pair<std::string,Q>,
std::string,
&mutable_pair<std::string,Q>::first
>,
string_view_hash,
string_view_equal_to
>
>
>;
#include <iostream>
int main()
{
unordered_string_map<int> m={{"hello",0},{"boost",1},{"bye",2}};
std::string str="helloboost";
auto it=m.find(string_view{&str,5,5});
std::cout<<it->first<<","<<it->second<<"\n";
}
Output
boost,1
I faced an equal problem.
We need two structs:
struct string_equal {
using is_transparent = std::true_type ;
bool operator()(std::string_view l, std::string_view r) const noexcept
{
return l == r;
}
};
struct string_hash {
using is_transparent = std::true_type ;
auto operator()(std::string_view str) const noexcept {
return std::hash<std::string_view>()(str);
}
};
For unordered_map:
template <typename Value>
using string_unorderd_map = std::unordered_map<std::string, Value, string_hash, string_equal>;
For unordered_set:
using string_unorderd_set = std::unordered_set<std::string, string_hash, string_equal>;
Now using string_view is possible.
It looks like only as recently as C++14 did even the basic map get such a templated find for is_transparent types in the comparison. Most likely the correct implementation for hashed containers was not immediately evident.
As far as I can see your two options are:
Just do the allocation and profile to see if maybe it's not actually a problem.
Take a look at boost::multi_index (http://www.boost.org/doc/libs/1_60_0/libs/multi_index/doc/index.html) and have both string and string_view indexes into the container.
This solution has drawbacks, which may or may not make it unviable for your context.
You can make a wrapper class:
struct str_wrapper {
const char* start, end;
};
And change your map to use str_wrapper as its key. You'd have to add 2 constructors to str_wrapper, one for std::string and one for your string_view. The major decision is whether to make these constructors perform deep or shallow copies.
For example, if you use std::string only for inserts and str_view only for lookups, you'd make the std::string constructor deep and the str_view one shallow (this can be enforced at compile time if you use a custom wrapper around unordered_map). If you care to avoid memory leaks on the deep copy you would need additional fields to support proper destruction.
If your usage is more varied, (looking up std::string's or inserting by str_view), there will be drawbacks, which again, might make the approach too distasteful so as to be unviable. It depends on your intended usage.
Yet another option is to split the lookup and the data management by using multiple containters:
std::unordered_map<string_view, value> map;
std::vector<unique_ptr<const char[]>> mapKeyStore;
Lookups are done using string_views without the need of allocations.
Whenever a new key is inserted we need to add a real string allocation first:
mapKeyStore.push_back(conv(str)); // str can be string_view, char*, string... as long as it converts to unique_ptr<const char[]> or whatever type
map.emplace(mapKeyStore.back().get(), value)
It would be much more intuitive to use std::string in the mapKeyStore. However, using std::string does not guarantee unchanging string memory (e.g. if the vector resizes). With the unique_ptr this is enforced. However, we need some special conversion/allocation routine, called conv in the example. If you have a custom string container which guarantees data consistency under moves (and forces the vector to use moves), then you can use it here.
The drawback
The disadvantage of the above method is that handling deletions is non-trivial and expensive if done naive. If the map is only created once or only growing this is a non-issue and the above pattern works quite well.
Running example
The below example includes a naive deletion of one key.
#include <vector>
#include <unordered_map>
#include <string>
#include <string_view>
#include <iostream>
#include <memory>
#include <algorithm>
using namespace std;
using PayLoad = int;
unique_ptr<const char[]> conv(string_view str) {
unique_ptr<char[]> p (new char [str.size()+1]);
memcpy(p.get(), str.data(), str.size()+1);
return move(p);
}
int main() {
unordered_map<string_view, PayLoad> map;
vector<unique_ptr<const char[]>> mapKeyStore;
// Add multiple values
mapKeyStore.push_back(conv("a"));
map.emplace(mapKeyStore.back().get(), 3);
mapKeyStore.push_back(conv("b"));
map.emplace(mapKeyStore.back().get(), 1);
mapKeyStore.push_back(conv("c"));
map.emplace(mapKeyStore.back().get(), 4);
// Search all keys
cout << map.find("a")->second;
cout << map.find("b")->second;
cout << map.find("c")->second;
// Delete the "a" key
map.erase("a");
mapKeyStore.erase(remove_if(mapKeyStore.begin(), mapKeyStore.end(),
[](const auto& a){ return strcmp(a.get(), "a") == 0; }),
mapKeyStore.end());
// Test if verything is OK.
cout << '\n';
for(auto it : map)
cout << it.first << ": " << it.second << "\n";
return 0;
}
Of course, the two containers can be put into a wrapper which handles the insertion and deletion for its own.
I'll just present one variation I found on github, it involves defining a new map class that wraps the std.
Redefining some key API to intercept the adaptors we want, and use a static string to copy the key.
It's not necessary a good solution, but it's interesting to know it exists for people who deems it enough.
original:
https://gist.github.com/facontidavide/95f20c28df8ec91729f9d8ab01e7d2df
code gist:
template <typename Value>
class StringMap: public std::unordered_map<std::string, Value>
{
public:
typename std::unordered_map<string,Value>::iterator find(const nonstd::string_view& v )
{
tmp_.reserve( v.size() );
tmp_.assign( v.data(), v.size() );
return std::unordered_map<string, Value>::find(tmp_);
}
typename std::unordered_map<std::string,Value>::iterator find(const std::string& v )
{
return std::unordered_map<std::string, Value>::find(v);
}
typename std::unordered_map<std::string,Value>::iterator find(const char* v )
{
tmp_.assign(v);
return std::unordered_map<std::string, Value>::find(v);
}
private:
thread_local static std::string tmp_;
};
credits:
Davide Faconti
Sorry for answering this very old question, but it still comes up in search engine results...
In this case your unordered_map is using the string type as its key, the find method is looking for a reference to a string which will not generate an allocation. Your string_view class stores a pointer to a string. Therefore your string_view class can dereference the pointer into a ref of the type needed for your map without causing an allocation. The method would look like this...
string &string_view::getRef() const
{
return *_ptr;
}
and to use the string_view with the map it would look like this
auto found=map.find(string_view_inst.getRef());
note that this will not work for the c++17 string_view class as it does not internally store a std::string object
ps.
Your string_view class is probably not great for cpu caches as it stores a pointer to a string allocated somewhere on the heap, and the string itself stores a pointer to the actual data located somewhere else on the heap. Every time you access your string_view it will result in a double dereference.
You could allow your view to be implicitly convertible to a std::string:
class StringView {
// ...
operator std::string() const
{
return data->substr(begin, len);
}
// ...
};

STL like iterator over a collection of hash maps

I'm looking for a way to create a forward iterator which allows to iterate over a collection of hash maps.
An exemplary class which holds several maps looks like follows. (I'm using boost for unordered_map and shared_ptr, but C++11 would provide those classes as well).
In my particular application I'm using this construct to represent sparse hierarchical 2D grid locations; i.e. KeyT is a 2D location, ValueT is an integral counter and the different levels represent different resolutions of the grid.
template <typename KeyT, typename ValueT>
class MapCollection
{
public:
// type-definitions for the map and a shared pointer to the map
typedef boost::unordered_map<KeyT, ValueT> Map;
typedef boost::shared_ptr<Map> MapPtr;
// Constructor for class
MapCollection (int num_levels)
{
levels_.reserve (num_levels);
for (int i = 0; i < num_levels; ++i)
levels_.push_back (MapPtr (new Map()));
}
// adds a key-value pair to the map on the given level
void addValue (const KeyT &key, const ValueT &value)
{
int level = getLevelForKey (key);
(*levels_[level])[key] = value;
}
// TODO define const_iterator for this class
// TODO define member function begin(), returning levels_.front()->begin()
// TODO define member function end(), returning levels_.back()->end()
private:
// return the hierarchy level for the given key
int getLevelForKey (const KeyT &key) { return /* ... */ };
// collection of maps
std::vector<MapPtr> levels_;
};
Within an application I would now like to be able to iterate over all entries of all maps, similarly to what is possible if one just iterates over a single map, i.e.
int main (int argc, char *argv[])
{
int num_levels = 5;
MapCollection maps (num_levels);
// fill maps
maps.addValue ( /* ... */ )
// iterator over all entries
MapCollection::const_iterator iter = maps.begin();
for (; iter != maps.end(); ++iter)
{
std::cout << "Key: " << iter->first << " | Value: " << iter->second << std::endl;
}
return EXIT_SUCCESS;
}
Obviously it would be possible to iterator over the different levels and for each level over it's map, but I would like to hide the creation of different levels for the user.
What is the correct way to define an (const_)iterator for the class MapCollection?
Thanks for your help!
You could use boost iterator facade, that will help you with the task of implementing custom iterators. The idea would be:
1-maintain a reference to the vector containing the maps (levels_).
2-a iterator indicating what element of the previous vector the custom iterator is iterating (of type std::vector<MapPtr>::iterator or std::vector<MapPtr>::const_iterator) or a index with the same info (to retrieve the end of level_[index]).
3-a iterator with current element of the iteration of the previous iterator (the actual element of the iteration).
Some sample code:
#include <boost/iterator/iterator_facade.hpp>
namespace impl
{
template <class Value>
class iterator
: public boost::iterator_facade<
config_iterator<Value>
, Value
, boost::forward_traversal_tag
>
{
public:
config_iterator() {...}
explicit config_iterator(parameters) { /*your specific contructor*/ }
private:
template <class OtherValue>
config_iterator(config_iterator<OtherValue> const& other);
friend class boost::iterator_core_access;
template <class> friend class config_iterator;
template <class OtherValue>
bool equal(config_iterator<OtherValue> const& other) const { } // Verify is two iterators are equals (used to verify if it == end)
void increment() {} // Logic for incrementing the iterator
Value& dereference() const {} // Returning the actual value
// members
};
}
typedef impl::iterator<type> iterator;
This is a template file for very simple iterator (forward), read the Iterator Help for more info, the same can be implementing overloading the right operators (++ post a pre increment, *, ->, etc...), by boost provide a ways of defining the minimum necessary to implement the rest.

display map every time it is updated sorted by value

basically, I have the
map<std::string, int>
so if i have
foo 5
bar 10
jack 3
in the map, I want to display it (notice the reverse order)
bar 10
foo 5
jack 3
And every time it is updated, I want iterate through all the elements, cout them, sorted by value. What is the good way to implement that? should I provide a comparator to the constructor?
I want to note that values in the map will be updated at least 100 million times, so efficiency is crucial, where as extra-space is no problem
Please no Boost solutions...thx
struct keyval_t { std::string key; int val; };
int operator<(const keyval_t &a, const ketval_t &b)
{ return a.val<b.val || (a.val==b.val && a.key<b.key); }
Then you need one map and one set:
map<std::string, int>; set<keyval_t>;
On update, you need to look up the map first to determine the key-value pair and then update both map and set. On printing, you just iterate through the set. In terms of theoretical time complexity, this is optimal. It doubles the memory, though. Does this meet your goal?
To reduce memory, you may consider the following:
map<std::string,uint64_t>; set<uint64_t>;
The value of the map (also the key of the set) is: (uint64_t)val<<32|counter, where counter is something that differentiates identical values. For example, whenever you insert a key, increase the counter by 1. You do not need to update the counter when you update the value. If you do not like uint64_t, use pair<int,int> instead. This solution is also faster as it avoids comparisons between strings.
If you want a performant map sorted by both key and value, you want Boost MultiIndex, it gets updated (resorted) on every update (which you have to do manually) and has a good documentation.
The previous responses have the inconvenience not to take into account the initial requirements (the key is std::string and the value is int).
EDITED: following the comments, I suppose presenting it directly with a Bimap is better :)
So here we go, right in!
#include <boost/bimap.hpp>
class MyMap
{
struct name {};
struct value {};
typedef boost::tagged<name, std::string> tagged_name;
typedef boost::tagged<value, int> tagged_value;
// unordered_set_of: guarantees only unicity (not order)
// multi_set_of: guarantees only order (not unicity)
typedef boost::bimap< boost::unordered_set_of< tagged_name >,
boost::multi_set_of< tagged_value,
std::greater< tagged_value >
>
> impl_type;
public:
// Redefine all usual types here
typedef typename impl_type::map_by<name>::const_iterator const_iterator;
typedef typename impl_type::value_type value_type;
// Define the functions you want
// the bimap will not allow mutators because the elements are used as keys
// so you may want to add wrappers
std::pair< iterator, bool > insert(const value_type & x)
{
std::pair< iterator, bool > result = m_impl.insert(x);
if (result.second) this->display();
return result;
} // insert
iterator insert(iterator position, const value_type & x)
{
iterator result = m_impl.insert(x);
this->display();
return result;
} // insert
template< class InputIterator >
void insert(InputIterator first, InputIterator last)
{
m_impl.insert(first, last);
this->display();
} // insert
private:
void display() const
{
// Yeah I know about std::for_each...
typedef typename impl_type::map_by<value>::const_iterator const_it;
for (const_it it = m_impl.begin(), end = m_impl.end(); it != end; ++it)
{
// Note the inversion of the 'second' and 'first',
// we are looking at it from the right
std::cout << it->second << " " << it->first << std::endl;
}
}
impl_type m_impl;
}; // class MyMap
Here you go.
I strongly suggest that you consult the bimap documentation though. There are a lot of possibilities for storing (set_of, unordered_set_of, unconstrained_set_of, the muli variants, the list_of variant...) so there is probably one that could do what you want.
Then you also have the possibility to just sort each time you display:
#include <set>
#include <map>
// Just use a simple std::map<std::string,int> for your impl_type
// Mutators are allowed since the elements are sorted each time you display
struct Comparator
{
bool operator(const value_type& lhs, const value_type& rhs) const
{
return lhs.second < rhs.value;
}
};
void display() const
{
typedef std::multi_set<value_type, Comparator> sort_type;
sort_type mySet;
std::copy(m_impl.begin(), m_impl.end(), std::inserter(mySet, mySet.end()));
for (sort_type it = mySet.begin(), end = mySet.end(); it != end; ++it)
{
std::cout << it->first<< " " << it->second << std::endl;
}
}
It should be easier to understand my point now :)
Well, you have to sort by key. The easiest way to "sort by value" is to use a multimap with the key and value switched. So here's a way to do that (note -- i don't have access to a compiler right now, so if it doesn't compile, I'm sorry):
#include <algorithm>
#include <functional>
#include <map>
#include <string>
#include <utility>
typedef std::multimap<int, std::string, std::greater<int> > MapType;
struct MapKeyOutput
{
void operator()(const MapType::value_type& val)
{
std::cout << val.second << " " << val.first << "\n";
}
}
void display_insert(MapType& m, const std::string& str, int val)
{
m.insert(std::make_pair(val, str));
std::for_each(m.begin(), m.end(), MapKeyOutput());
}
int main()
{
MapType m;
display_insert(m, "Billy", 5);
display_insert(m, "Johnny", 10);
}
You could also make a map class that uses a multimap internally, I just didn't want to type it out. Then display_insert would be some member function instead. This should demonstrate the point, though. Points of interest:
typedef std::multimap<int, std::string, std::greater<int> > MapType;
Note the comparator is greater<int> to sort descending. We're using a multimap so more than one name can have the same number.
struct MapKeyOutput
{
void operator()(const MapType::value_type& val)
{
std::cout << val.second << " " << val.first << "\n";
}
}
This is a function object to output one element in the map. second is output before first so the order is what you want.
std::for_each(m.begin(), m.end(), MapKeyOutput());
This applies our function object to every element in m.