C++, using Visual Studio 2010. A question about why a user-defined trait of hash_map actually requires total ordering.
I have a simple structure, say FOO, which contains only a few integers. I'd like to use hash_map, a hash table whose keys are unordered, with FOO as the key type. I only need fast lookup of the associated value, so hash_map<FOO, int32_t> seems like the right choice.
However, I need to implement my own hash function and comparison function for FOO. Here is the definition of hash_map, taken from MSDN:
template <
class Key,
class Type,
class Traits=hash_compare<Key, less<Key> >,
class Allocator=allocator<pair <const Key, Type> >
>
class hash_map
It turned out that I needed to implement hash_compare functors:
template<class Key, class Traits = less<Key> >
class hash_compare
{
Traits comp;
public:
const size_t bucket_size = 4;
const size_t min_buckets = 8;
hash_compare( );
hash_compare( Traits pred );
size_t operator( )( const Key& _Key ) const; // This is a hash function
bool operator( )( // This is an ordering function
const Key& _Key1,
const Key& _Key2
) const;
};
Here is the detailed description of the bool operator() from MSDN:
For any value _Key1 of type Key that precedes _Key2 in the sequence and has the same hash value (value returned by the hash function), hash_comp(_Key2, _Key1) is false. The function must impose a total ordering on values of type Key.
The function supplied by hash_compare returns comp(_Key2, _Key1), where comp is a stored object of type Traits that you can specify when you construct the object hash_comp. For the default Traits parameter type less, sort keys never decrease in value.
It was easy to write the hash_compare class for FOO, so I am not asking how to implement it. What is not clear to me is why the default traits parameter is less<Key> and why a total ordering is required.
hash_map is an unordered data structure, so I thought equal_to or not_equal_to would be sufficient instead of less or greater. However, the MSDN description explicitly states that keys are ordered, which confuses me.
Did I misunderstand the definition of hash_map? Why does STL's hash_map actually require an ordering of its keys?
For any value _Key1 of type Key that precedes _Key2 in the sequence and has the same hash value (value
returned by the hash function), hash_comp(_Key2, _Key1) is false. The function must impose a
total ordering on values of type Key.
A total ordering of keys with the same hash value guarantees a total ordering of keys which hash to the same bucket.
That provides the opportunity for a more efficient implementation of search for a key within a particular bucket - e.g. Θ(log n) binary search is possible. If there is no such guaranteed ordering, the worst case (many different keys which are all in the same bucket because they all hash to the same value) is Θ(n).
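To make the point concrete, here is a minimal sketch of a single bucket that keeps its keys sorted so that lookup can use binary search. The names Key, key_less and SortedBucket are invented for illustration, and nothing here claims to be how Microsoft's hash_map is actually implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical key type for illustration; not from the original post.
struct Key {
    int a;
    int b;
};

// Strict total order on Key: lexicographic on a, then b.
inline bool key_less(const Key& lhs, const Key& rhs) {
    if (lhs.a != rhs.a) return lhs.a < rhs.a;
    return lhs.b < rhs.b;
}

// One bucket of a hash table whose keys are kept sorted by key_less.
struct SortedBucket {
    std::vector<Key> keys;

    // Insert while preserving sorted order; equivalent keys are ignored.
    void insert(const Key& k) {
        auto pos = std::lower_bound(keys.begin(), keys.end(), k, key_less);
        if (pos == keys.end() || key_less(k, *pos)) {
            keys.insert(pos, k);
        }
    }

    // Binary search: O(log n) comparisons within the bucket.
    bool contains(const Key& k) const {
        auto pos = std::lower_bound(keys.begin(), keys.end(), k, key_less);
        return pos != keys.end() && !key_less(k, *pos);
    }
};
```

With only an equality predicate, contains() would have to compare against every key in the bucket one by one.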
The hash_map that you are looking at is a Microsoft extension that was introduced in VS2003 and now lives in the stdext namespace in Visual C++; it is not part of the STL.
std::unordered_map is the official STL version of an associative container with value access by hashable key - the predicate on that is for equality, as you expected.
template<class Key,
class Ty,
class Hash = std::hash<Key>,
class Pred = std::equal_to<Key>,
class Alloc = std::allocator<std::pair<const Key, Ty> > >
class unordered_map;
The exact requirements on hash_map vary with the implementation, and some of them (as you've seen) don't make a whole lot of sense. That's part of why they decided not to include a hash_map (or hash_*) in TR1 and/or C++0x. Instead, they have unordered_[multi](map|set), which requires only an equality predicate (key_equal), not operator<.
Bottom line: unless you have a truly outstanding reason to do otherwise, use unordered_map instead of hash_map.
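For a struct of integers like the asker's FOO, a sketch with unordered_map could look like the following. The member names x and y, and the functor names FooHash and FooEqual, are assumptions made for the example; the mixing formula resembles the classic boost::hash_combine step and is only one of many reasonable choices.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Stand-in for the asker's FOO; the member names are assumptions.
struct Foo {
    std::int32_t x;
    std::int32_t y;
};

// Equality predicate: all unordered_map needs to tell two keys apart.
struct FooEqual {
    bool operator()(const Foo& lhs, const Foo& rhs) const {
        return lhs.x == rhs.x && lhs.y == rhs.y;
    }
};

// Hash functor: hashes each member and mixes the two results.
struct FooHash {
    std::size_t operator()(const Foo& f) const {
        std::size_t h1 = std::hash<std::int32_t>()(f.x);
        std::size_t h2 = std::hash<std::int32_t>()(f.y);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// No ordering on Foo is required anywhere.
typedef std::unordered_map<Foo, std::int32_t, FooHash, FooEqual> FooMap;
```

A FooMap can then be used like any other map, for example m[key] = 10 followed by m.find(key).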
Related
In case of unordered_map we define the hash and pred functors whenever we are using user-defined keys.
The template syntax for a map is as follows:
template < class Key, // map::key_type
class T, // map::mapped_type
class Compare = less<Key>, // map::key_compare
class Alloc = allocator<pair<const Key,T> > // map::allocator_type
> class map;
In case of map there is no option for hash and pred functors. Do we never have collisions in case of map? If collisions can happen, then why don't we have the hash and pred functors as in unordered_map?
Am I missing something here?
std::map and std::unordered_map are two different types of containers that both provide key-value pair mapping. How they do that, though, is completely different.
std::map uses a tree structure for its implementation. Typically this is an RBTree but any tree that can guarantee worst case O(logN) operations will work. This means it only needs to have a comparison operator for the key type since you can get total ordering and check for equality with a comparator that implements a strict weak ordering. This means you'll never have a hash collision since you aren't using a hash.
std::unordered_map is based on a hash table implementation. Since it hashes the key, you need a hash operator. You also need a comparison operator since two values could hash to the same value (hash collision). Without the comparison operator you would not be able to tell if the duplicate hash is really a duplicate item.
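The role of the equality comparison can be seen by forcing every key to collide. The sketch below uses a deliberately bad hash (ConstantHash, a name made up for this example) and relies on the default std::equal_to<std::string> to keep distinct keys apart.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Deliberately terrible hash: every key gets the same hash value.
struct ConstantHash {
    std::size_t operator()(const std::string&) const { return 0; }
};

typedef std::unordered_map<std::string, int, ConstantHash> CollidingMap;

// Builds a map in which all keys land in the same bucket.
inline CollidingMap make_colliding_map() {
    CollidingMap m;
    m["apple"] = 1;
    m["banana"] = 2;  // collides with "apple" but is a different key
    m["apple"] = 3;   // equal key: overwrites the value, adds no entry
    return m;
}
```

The map stays correct, it just degrades to a linear scan of one bucket.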
std::map is not a hash table and thus doesn't use a hash function. Instead, it requires operator< for ordering the values contained in the map.
For a std::map, can I always trust begin() to return the element with the smallest key, according to the comparison operators for the key type, when iterating?
In other words...
Will std::map<Key, SomeClass>::iterator smallestKeyIt = someMap.begin(); give me the pair in the map with the smallest key?
Is this ordering guaranteed for a std::map, or can I configure it somehow? My understanding is that the underlying tree structure is kept ordered when performing operations such as adding and removing elements.
Can I always trust begin() to return the element with the smallest key according to the comparison operators for the type?
Yes.
Is this ordering guaranteed for a std::map, or can I configure it somehow?
Yes, it is guaranteed. You can also configure the ordering by specifying the comparator. (The default one is std::less.)
From cppreference:
template<
class Key,
class T,
class Compare = std::less<Key>,
class Allocator = std::allocator<std::pair<const Key, T> >
> class map;
std::map is a sorted associative container that contains key-value
pairs with unique keys. Keys are sorted by using the comparison
function Compare. Search, removal, and insertion operations have
logarithmic complexity. Maps are usually implemented as red-black trees.
std::map is defined as:
template<
class Key,
class T,
class Compare = std::less<Key>,
class Allocator = std::allocator<std::pair<const Key, T> >
> class map;
You can use your specialized Compare to configure how the entries of a map are ordered. For example, if you use:
std::map<int, double, std::greater<int>> myMap;
then, the first entry in myMap will have the largest key.
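A small sketch of both cases, using two helper functions whose names are invented for this example:

```cpp
#include <functional>
#include <map>

// Default ordering (std::less): begin() refers to the smallest key.
inline int first_key_default() {
    std::map<int, double> m;
    m[3] = 0.3;
    m[1] = 0.1;
    m[2] = 0.2;
    return m.begin()->first;
}

// With std::greater the order is reversed: begin() refers to the largest key.
inline int first_key_greater() {
    std::map<int, double, std::greater<int> > m;
    m[3] = 0.3;
    m[1] = 0.1;
    m[2] = 0.2;
    return m.begin()->first;
}
```

The insertion order is the same in both functions; only the comparator decides which element comes first.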
The keys in my std::unordered_map are boost::uuids::uuid values, which are effectively 128-bit hashes that can be considered unique. However, the compiler can't know that and therefore reports this:
error C2338: The C++ Standard doesn't provide a hash for this type.
How can I make the map use the keys as hashes as they are? By the way, std::size_t is defined as unsigned int __w64 on my system, which I think refers to only 64 bits.
You always need to provide a function object mapping the key to a hash value, even if this mapping is the identity. You can either define a specialization of std::hash<boost::uuids::uuid> and have std::unordered_map<K, V> pick it up automatically, or you can parameterize the unordered map with an additional template argument for the function object type. In addition to the hash, an equality operation is also needed, but the default, which uses operator==(), is probably OK.
That said, the hash value won't accept a 128-bit integer unless your system has a built-in 128-bit integer type. The hash value needs to be a std::size_t to be usable with the standard unordered containers. The complete list of requirements for std::hash<T> specializations is listed in 20.8.12 [unord.hash]:
std::hash<X> needs to be default constructible, copy constructible, and copy assignable.
std::hash<X> needs to be swappable.
It needs to provide two nested types argument_type for the key type and result_type for the type of the hashed value with the latter being the same as std::size_t.
For the function the relation k1 == k2 => h(k1) == h(k2) needs to be true where h is the hashing function object.
So, you will need to define something along the lines of this:
namespace std {
template <>
struct hash<boost::uuids::uuid>
{
typedef boost::uuids::uuid argument_type;
typedef std::size_t result_type;
// transform_to_size_t() is a placeholder for the reduction you supply.
std::size_t operator()(const boost::uuids::uuid& key) const {
return transform_to_size_t(key);
}
};
}
where transform_to_size_t() is the actual transformation you'll need to provide.
You need to provide a hash function for the type boost::uuids::uuid. Since the value is already unique, the hash can be derived directly from its bytes.
Here is the declaration of unordered_map.
template < class Key, // unordered_map::key_type
class T, // unordered_map::mapped_type
class Hash = hash<Key>, // unordered_map::hasher
class Pred = equal_to<Key>, // unordered_map::key_equal
class Alloc = allocator< pair<const Key,T> > // unordered_map::allocator_type
> class unordered_map;
I think the simplest way is to implement a specialization of std::hash for that type, which returns its input converted to std::size_t:
namespace std
{
template<>
struct hash<Foo>
{
// Relies on Foo being implicitly convertible to std::size_t.
std::size_t operator()(const Foo& foo) const
{
return foo;
}
};
}
This assumes that the type, Foo in the example, is implicitly convertible to std::size_t.
In your case, the type is a 128-bit GUID, and std::size_t has 32 or 64 bits. You could split the 128-bit GUID into 64-bit or 32-bit parts and combine their hash values.
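A minimal sketch of that idea follows. Id128 is a hypothetical stand-in for a 128-bit GUID (it is not boost::uuids::uuid), and the mixing formula is just one common way to fold two hash values together. The hash does not have to be unique: keys whose hashes collide are still told apart by operator==.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical 128-bit identifier, stored as two 64-bit halves.
struct Id128 {
    std::uint64_t hi;
    std::uint64_t lo;
};

// Equality over the full 128 bits; used by the default std::equal_to.
inline bool operator==(const Id128& a, const Id128& b) {
    return a.hi == b.hi && a.lo == b.lo;
}

// Folds the two halves into a single std::size_t.
struct Id128Hash {
    std::size_t operator()(const Id128& id) const {
        std::size_t h1 = std::hash<std::uint64_t>()(id.hi);
        std::size_t h2 = std::hash<std::uint64_t>()(id.lo);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

typedef std::unordered_map<Id128, int, Id128Hash> IdMap;
```

The key stored in the map is still the full 128-bit value; only the bucket selection uses the reduced hash.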
I found no way to use UUIDs as keys for std::unordered_map, since the UUID is 128 bits long while the hash for the map is a std::size_t, which can hold only 64 bits.
Instead, I dropped real 128-bit UUIDs in favor of 64-bit ids, which can be stored in a uint64_t and are natively supported by the containers of the standard library.
I couldn't find a way to set a custom comparator function for QMap, like I can for std::map (the typename _Compare = std::less<_Key> part of its template arguments).
Does QMap have a way to set one?
It's not documented (and that's a mistake, I think), but you can specialize the qMapLessThanKey template function for your types (cf. the source). That will allow your type to use some other function rather than operator<:
template<> bool qMapLessThanKey<int>(const int &key1, const int &key2)
{
return key1 > key2; // sort by operator> !
}
Nonetheless, std::map has the advantage that you can specify a different comparator for each map, while here you can't (all maps using your type must see that specialization, or everything will fall apart).
No, as far as I know QMap doesn't have that functionality. It requires its key type to have operator<, so you are stuck with std::map if you really need a custom comparator.
QMap's key type must provide operator<(). QMap uses it to keep its items sorted, and assumes that two keys x and y are equal if neither x < y nor y < x is true.
In that case, overload operator<() for your key type.
In C++, the std::set::insert() only inserts a value if there is not already one with the same 'value'. By the same, does this mean operator== or does it mean one for which operator< is false for either ordering, or does it mean something else?
does it mean one for which operator< is false for either ordering?
Yes, if the set uses the default comparator and compares keys using <. More generally, in an ordered container with comparator Compare, two keys k1 and k2 are regarded as equivalent if !Compare(k1,k2) && !Compare(k2,k1).
Keys are not required to implement operator== or anything else; they are just required to be comparable using the container's comparator to give a strict weak ordering.
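As an illustration, the sketch below uses a hypothetical Person type with no operator== at all and a comparator (ById, also invented here) that looks only at the id field. Two Person values with the same id are equivalent to the set even though their names differ.

```cpp
#include <set>
#include <string>

// Hypothetical record type; note that no operator== is defined.
struct Person {
    int id;
    std::string name;
};

// Orders by id only, so Persons with equal ids are equivalent.
struct ById {
    bool operator()(const Person& a, const Person& b) const {
        return a.id < b.id;
    }
};

typedef std::set<Person, ById> PersonSet;

// Inserts three Persons, two of which share an id.
inline PersonSet make_people() {
    PersonSet s;
    Person alice = {1, "Alice"};
    Person bob = {1, "Bob"};      // same id as alice
    Person carol = {2, "Carol"};
    s.insert(alice);
    s.insert(bob);    // not inserted: equivalent to alice under ById
    s.insert(carol);
    return s;
}
```

The set ends up holding Alice and Carol; Bob is rejected because neither ById(alice, bob) nor ById(bob, alice) is true.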
std::set has a template argument called Compare, as in this signature:
template < class Key, class Compare = less<Key>,
class Allocator = allocator<Key> > class set;
Compare is used to determine the ordering between elements. Here, the default less<Key> uses the < operator to compare two keys.
If it helps, you can think of a set as just a std::map with meaningless values, i.e. a std::set<int> can be thought of as a std::map<int, int> where the values are ignored.
The only comparison that set is allowed to perform on T is via the functor type it was given to do comparisons as part of the template. Thus, that's how it defines equivalence.
For the new value to be stored, the comparison must evaluate to true for one of the two orderings between it and every value already in the set. If it is false both ways for any existing value, the two are considered equivalent and the new value won't be stored.