Can a std::map key address multiple values? - c++

This may be a slightly stupid question, but assuming a std::map defined as follows:
std::map<int, int> m;
Is there any way to have more than one value stored and be accessible for a single key?
Motivation for asking:
std::map has methods like count() and equal_range() that take a key as a parameter, which suggests that more than one value may be associated with a single key.

Those methods exist so as to provide a common interface with other associative containers that do allow multiple values per key (such as std::multimap which is exactly what you're looking for).
This makes implementing algorithms generically (i.e. with templates) much easier than it would otherwise be, and nothing of value is lost by designing it this way.
It's true that, in the case of std::map, count() can only give you zero or one (unless you're using transparent keys, which are a whole other kettle of fish).
C++20 introduces std::map::contains(), which is more or less a check that count() == 1; this seems to have been intended to address concerns that count() is a somewhat odd function to have for a std::map specifically.

Is there any way to have more than one value stored and be accessible for a single key?
Not with std::map; these objects store only one value per key. std::multimap, however, can store a variable number of values per key.
Similarities between the two types (e.g. std::map::count, std::multimap::count) exist to give STL containers similar interfaces.

No. std::map is designed to have one value per key.
If you want multiple values for one key, you should use std::multimap.

Related

Map count really counts or just checks existence

In C++ Primer and on other websites, I have found the wording of the definition of count (from the STL map) very vague and misleading:
Searches the container for elements with a key equivalent to k and returns the number of matches
Now what I have studied so far is that key is singular and so is the mapped value - the mapped value can be changed through assignment.
So doesn't it just return whether the container contains the key or not, rather than a count? Where am I going wrong in understanding the concept?
A std::map's count() will always return either 0 or 1.
But the C++ library has other associative containers that might very well have multiple values for the same key. Like std::multimap and std::multiset. And by a very lucky coincidence they also have a count() method that may actually return values greater than 1.
But what this allows you to do is metaprogramming by developing templates that can use any associative container, one that may or may not be unique. All your template needs to do is use count() to determine how many values exist in the container with the given key, and the end result can be used with either std::map or std::multimap. It won't care in the slightest. In both cases, your template will get the right answer: the number of values in the container with the given key.
According to cplusplus.com
Because all elements in a map container are unique, the function can only return 1 (if the element is found) or zero (otherwise).

which container from std::map or std::unordered_map is suitable for my case

I don't know how a red-black tree works with string keys. I've already seen it with numbers on YouTube and it baffled me a lot. However, I know very well how unordered_map works (the internals of hash maps). std::map remains esoteric to me, but I read and tested that if we don't have many changes in the std::map, it could beat hash maps.
My case is simple: I have a std::map of <std::string,bool>. Keys contain paths to XML elements (example of a key: "Instrument_Roots/Instrument_Root/Rating_Type"), and I use the boolean value in my SAX parser to know if we reached a particular element.
I build this map "only once"; after that, all I do is use std::find to check whether a particular "key" ("path") exists in order to set its Boolean value to true, or search for the first element that has "true" as its associated value and use its corresponding "key". Finally, I set all the boolean values to false to guarantee that only a single "key" has a "true" boolean value.
You shouldn't need to understand how red-black trees work in order to understand how to use a std::map. It's simply an associative array where the keys are in order (lexicographical order, in the case of string keys, at least with the default comparison function). That means that you can not only look keys up in a std::map, you can also make queries which depend on order. For example, you can find the largest key in the map which is not greater than the key you have. You can find the next larger key. Or (again in the case of strings) you can find all keys which start with the same prefix.
If you iterate over all the key-value pairs in a std::map, you will see them in order by key. That can be very useful, sometimes.
The extra functionality comes at a price. std::map is usually slower than std::unordered_map (though not always; for large string keys, the overhead of computing the hash function might be noticeable), and the underlying data structure has a certain amount of overhead, so they may occupy more space. The usual advice is to use a std::map if you find the fact that the keys are ordered to be essential or even useful.
But if you've benchmarked and concluded that for your application, a std::map is also faster, then go ahead and use it :)
It is occasionally useful to have a map whose mapped type is bool, but only if you need to distinguish between keys whose corresponding value is false and keys which are not present in the map at all. In effect, a std::map<T, bool> (or std::unordered_map<T, bool>) provides a ternary choice for each possible key.
If you don't need to distinguish between the two false cases, and you don't frequently change a key's value, then you may well be better off with a std::set (or std::unordered_set), which is exactly the same datastructure but without the overhead of the bool in each element. (Although only one bit of the bool is useful, alignment considerations may end up using 8 additional bytes for each entry.) Other than storage space, there won't be much (if any) performance difference, though.
If you do really need a ternary case, then you would be well-advised to make the value an enum rather than a bool. What do true and false mean in the context of your usage? My guess is that they don't mean "true" and "false". Instead, they mean something like "is an attribute path" and "is an element path". That distinction could be made much clearer (and therefore less accident-prone) by using enum PathType {ATTRIBUTE_PATH, ELEMENT_PATH};. That will not involve any additional resources, since the bool is occupying eight bytes of storage in any case (because of alignment).
By the way, there is no guarantee that the underlying data structure is precisely a red-black tree, although the performance guarantees would be difficult to achieve without some kind of self-balancing tree. I don't know of such an implementation, but it would be possible to use k-ary trees (for some small k) to take advantage of SIMD vector comparison operations, for example. Of course, that would need to be customized for appropriate key types.
If you do want to understand red-black trees, you could do worse than Robert Sedgewick's standard textbook on Algorithms. On the book's website, you'll find a brief illustrated explanation in the chapter on balanced trees.
I would recommend you to use std::unordered_set because you really don't need to store this boolean flag and you also don't need to keep these xml tags in sorted order so std::unordered_set seems to me as logical and the most efficient choice.

Underlying storage and functionality of unordered_maps vs unordered_multimaps in C++?

I'm having a hard time wrapping my head around unordered_maps and unordered_multimaps because my test code isn't producing what I've been told to expect.
std::unordered_map<string, int> names;
names.insert(std::make_pair("Peter", 4));
names.insert(std::make_pair("George", 4));
names.insert(std::make_pair("George", 4));
When I iterate through this list, I get one instance of George first, then Peter.
1) It's my understanding that unordered_maps do not allow multiple keys to map to one value, and that multimaps do. Is this true?
2) Why can Peter and George coexist at a value of 4? What is happening to the second George? And for that matter, why is George appearing first when I iterate from begin() to end() if this is unordered?
3) What is the underlying representation of an unordered map vs. unordered multimap?
4) Is there a way to insert keys into either map without providing a value? E.g. have the compiler create its own hash function that I don't need to worry about when I retrieve keys and look for collisions?
I'll make it short:
1) No. Multi... refers to keys. A (non-multi) map can't have multiple equivalent keys with different values, i.e. there is at most one value per key. A multimap can. The same holds for the unordered versions.
2) Peter != George, which is why they are different keys and may very well have the same value.
3) A hash map.
4) Use sets.
In your example, the second insertion for George into a (non-multi) map is skipped because the same key was previously inserted.
You want to use unordered_multimap to have several keys that are the same.
Since this is unordered you can't really hope to have any particular order, because it depends on the hash function.
If you want the order in which you inserted things, you need to use std::vector. Even normal maps, which are ordered, use the comparison order and not the order in which you inserted things; for example, the string "AB" comes before "BB" because "A" is less than "B".
To insert without providing a value you need a set, and not a map.
The underlying structure of "unordered_" things is a hash table.

Why does multimap allow duplicate key-value pairs?

EDIT: Please note, I'm NOT asking why multimap can't contain duplicate keys.
What's the rationale behind multimap allowing duplicate key-value pairs? (not keys)
#include <map>
#include <string>
#include <iostream>
int main(int argc, char** argv)
{
    std::multimap<std::string, std::string> m;
    m.insert(std::make_pair("A", "B"));
    m.insert(std::make_pair("A", "B"));
    m.insert(std::make_pair("A", "C"));
    std::cout << m.size() << std::endl;
    return 0;
}
This printed 3, which somewhat surprised me, I expected multimap to behave like a set of pairs, so I was expecting 2.
Intuitively, it's not consistent with C++ std::map behaviour, where insert does not always change the map (as opposed to operator[]).
Is there a rationale behind it, or it's just arbitrary?
Multimap only has a predicate ordering the keys. It has no method to determine whether the values are equal. Is value "A" a duplicate of value "a"? Without a second predicate for the values, there's no telling. Therefore, it doesn't even make sense to talk about duplicate values in a multimap.
If you would like a container that stores pairs and enforces the uniqueness of both parts of the pair, look at boost::multi_index_container. It's very flexible, but takes a load of arguments as a result.
EDIT: This answer does not answer the current question anymore. I'll keep it as it is because it got upvoted a lot so it must be useful for some.
The multi in multimap stands for the fact that the same key can occur multiple times.
The standard puts no limit on the type used as value, so one cannot assume that operator==() is defined. Because we don't want the result of your code to depend on whether operator==() is defined or not, it is never used.
std::multimap is not a replacement for std::map. As you noticed, it behaves differently when the same key is inserted multiple times. If you want std::map's behaviour, use std::map.
There is also a std::multiset.
The rationale: sometimes one would like to keep all old entries for the same key around as well. [TBD: Insert some example here]
Personally, I barely ever use std::multimap. If I want multiple entries for the same key, I usually rely on std::map<K, std::vector<T> >.
The values are allowed to be duplicates because they are not required to be comparable to each other. The container cannot do anything with the values besides copy them in. This enables types like multimap< int, my_class >.
If duplicate key-value pairs are undesirable, then use set< pair< T, U > > and use lower_bound to find the first match to a given key.
As you know, multimap allows multiple equal keys. Since it does not place any constraints on the comparability of values, it is unable to check whether a value has been duplicated.
If you want to have some dictionary data structure which allows for duplicate keys, but not key-value pairs, you would have to ensure that values are comparable.
Let's say we have a game of some sort, where there is a 2D world of square fields, and you can put items on fields. You can have multimap<Field, Item>, which will allow you to keep two identical items on the field. Items don't have to be comparable here.
My reasoning is multimap is based on the Key lookup/insertion and not on the value. So whether the value on duplicate keys is the same or not does not play a part when elements are being inserted.
23.3.2 Class template multimap
1 A multimap is a kind of associative container that supports equivalent keys (possibly containing multiple copies of the same key value) and provides for fast retrieval of values of another type T based on the keys.
"multimap" is meant to support 'multiple' keys, unlike a simple "map". Since it allows multiple equal keys, it does not look at their values, so it shows 3 elements in your example. The other difference is that multimap has no operator[].
A use of duplicate [key,value] pairs is to count the number of occurrences of, say, a word on a page of a book: no occurrences means no entry in the multimap for that word, one occurrence means one entry, and more than one means as many entries for make_pair(word, page_number) as there are occurrences. It was more by accident than design that I found this usage.

Hash map, string compares, and std::map?

First off, I would like to make a few points I believe to be true. Please can these be verified?
A hash map stores strings by converting them into an integer somehow.
std::map is not a hash map, and if I'm using strings, I should consider using a hash map for memory issues?
String compares are not good to rely on.
If std::map is not a hash map and I should not be relying on string compares (basically, I have a map with strings as keys...I was told to look up using hash maps instead?), is there a hash map in the C++ STL? If not, how about Boost?
Secondly, is a hash map worth it for [originally] a std::map< std::string, non-POD GameState >?
I think my point is getting across...I plan to have a store of different game states I can look up and register into a factory.
If any more info is needed, please ask.
Thanks for your time.
I don't believe most of your points are correct.
there is no hash map in the current standard. C++0x introduces unordered_map, whose implementation will be a hash table, and your compiler probably already supports it.
std::map is implemented as a balanced tree, not a hash table. There are no "memory issues" when using either map type with strings, either as keys or data.
strings are not stored as numbers in either case - an unordered_map will use a hashing function to derive a numeric key from the string, but this is not stored.
my experience is that unordered_map is about twice the speed of map - they have basically the same interface, so you can try both with your own data - whenever you are interested in performance you should always perform tests yourself with your own real data, rather than depending on the experience of others. Both map types will be somewhat sensitive to the length of the string key.
Assuming you have some class A, that you want to access via a string key, the maps would be declared as:
map <string, A> amap;
unordered_map <string, A> umap;
I made a benchmark that compares std::map with boost::unordered_map.
My conclusion was basically this: If you do not need map-specific things like equal_range, then always use boost::unordered_map.
The full benchmark can be found here
A hash map will typically have some integral representation of a string, yes.
std::map has a requirement to be sorted, so implementing it as a hash table is unlikely, and I've never seen it in practice.
Whether string comparisons are good or bad depends entirely on what you're doing, what data you're comparing, and how often. If the first letter differs then that's barely any different from an integer comparison, for example.
You want unordered_map (that's the Boost version - there is also a version in the TR1 standard library if your compiler has that).
Is it worth it for game states? Yes, but only because using an unordered_map is simple. You're prematurely worrying about optimisations at this stage. Save the worries over access patterns for things you're going to look up thousands of times a second (ie. when your profiler tells you that it's a problem).