Does map count() really count, or does it just check existence? - c++

In C++ Primer and on other websites, I have found the wording of the definition of count (from the STL map) very vague and misleading:
Searches the container for elements with a key equivalent to k and returns the number of matches
Now, what I have learned so far is that each key is unique and has a single mapped value; the mapped value can be changed through assignment.
So doesn't it just return whether the container contains the key, rather than a count? Where am I going wrong in understanding the concept?

A std::map's count() will always return either 0 or 1.
But the C++ library has other associative containers that might very well have multiple values for the same key. Like std::multimap and std::multiset. And by a very lucky coincidence they also have a count() method that may actually return values greater than 1.
What this allows you to do is write templates that can use any associative container, whether or not its keys are unique. All your template needs to do is use count() to determine how many values exist in the container with the given key, and the end result can be used with either std::map or std::multimap. It won't care in the slightest. In both cases, your template will get the right answer: the number of values in the container with the given key.
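A minimal sketch of such a template, assuming a hypothetical helper named count_key (it is not part of the standard library). It relies only on the count() member that both std::map and std::multimap provide:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: works with any container that has a count() member
// accepting a key, regardless of whether its keys are unique.
template <typename Container, typename Key>
std::size_t count_key(const Container& c, const Key& key)
{
    return c.count(key);
}
```

Called on a std::map<std::string, int>, it can only yield 0 or 1; called on a std::multimap<std::string, int> that holds the same key twice, it yields 2.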

According to cplusplus.com
Because all elements in a map container are unique, the function can only return 1 (if the element is found) or zero (otherwise).

Related

Can a std::map key address more than one value?

This may be a slightly stupid question, but assuming a std::map defined as follows:
std::map<int, int> m;
Is there any way to have more than one value stored and be accessible for a single key?
Motivation of asking:
std::map has methods like count() and equal_range() that take a key as a parameter, which gives the impression that more than one value may be associated with a single key.
Those methods exist so as to provide a common interface with other associative containers that do allow multiple values per key (such as std::multimap which is exactly what you're looking for).
This makes implementing algorithms generically (i.e. with templates) much easier than it would otherwise be, and nothing of value is lost by designing it this way.
It's true that, in the case of std::map, count() can only give you zero or one (unless you're using transparent keys, which are a whole other kettle of fish).
C++20 will introduce std::map::contains(), which is more or less a check that count() == 1 — this seems to have been intended to address concerns that the function count() is kind of a weird thing to have for a std::map specifically.
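To illustrate the transparent-key caveat mentioned above, here is a sketch that assumes a hypothetical Key type, a Group lookup type, and a KeyLess comparator (none of these come from the answer itself). Since C++14, a comparator that declares is_transparent enables heterogeneous lookup, and count() on a std::map can then report more than one match:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical key type, ordered by (group, id).
struct Key {
    int group;
    int id;
};

// Hypothetical lookup type that names only a group.
struct Group {
    int value;
};

// Transparent comparator: the is_transparent member enables the
// heterogeneous overloads of count(), find(), equal_range(), and so on.
struct KeyLess {
    using is_transparent = void;

    bool operator()(const Key& a, const Key& b) const
    {
        return a.group != b.group ? a.group < b.group : a.id < b.id;
    }
    bool operator()(const Key& a, const Group& g) const { return a.group < g.value; }
    bool operator()(const Group& g, const Key& b) const { return g.value < b.group; }
};

using GroupedMap = std::map<Key, std::string, KeyLess>;

// Number of (unique) keys whose group equals the given group.
std::size_t keys_in_group(const GroupedMap& m, int group)
{
    return m.count(Group{group});
}
```

Every Key in the map is still unique; the count exceeds one only because a Group compares equivalent to several distinct keys.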
Is there any way to have more than one value stored and be accessible for a single key?
Not with std::map: it stores only one value per key. std::multimap, however, can store a variable number of values per key.
Similarities between both types (e.g. std::map::count, std::multimap::count) are due to establishing similar interfaces between STL containers.
No. std::map is designed to have one value per key.
If you want multiple values for one key, you should use std::multimap.
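As a rough illustration of the std::multimap alternative, the sketch below uses a hypothetical helper named values_for_key to gather every value stored under one key via equal_range():

```cpp
#include <map>
#include <vector>

// Hypothetical helper: collect every value stored under the given key.
std::vector<int> values_for_key(const std::multimap<int, int>& m, int key)
{
    std::vector<int> out;
    auto range = m.equal_range(key);  // [first, second) spans all matches
    for (auto it = range.first; it != range.second; ++it)
        out.push_back(it->second);
    return out;
}
```

For a key that is absent, equal_range() returns an empty range, so the helper returns an empty vector.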

Use of equal_range for unordered_map

As makes sense, lower_bound and upper_bound are absent from std::unordered_map because its elements have no order.
However, std::unordered_map has the method equal_range, which returns iterators for the range corresponding to a key.
How does that make sense? Since there can be only one element with a given key in std::unordered_map, it is just the find method.
Also, in std::unordered_multimap, does its presence mean that all elements with the same key will always come together while iterating the unordered_multimap with an iterator?
How does it make sense?
It kind of does. The standard requires all associative containers to offer equal_range, so the non-multi containers need to provide it too. It does make writing generic code easier, so I suspect that is the reason why all of the containers are required to provide it.
Does its presence mean that all elements with the same key will always come together while iterating the unordered_multimap with an iterator?
Actually, yes. Since all of the equivalent keys hash to the same value, they all get stored in the same bucket, and they are grouped together since the keys compare equal. From [unord.req]/6:
An unordered associative container supports unique keys if it may contain at most one element for each key. Otherwise, it supports equivalent keys. unordered_set and unordered_map support unique keys. unordered_multiset and unordered_multimap support equivalent keys. In containers that support equivalent keys, elements with equivalent keys are adjacent to each other in the iteration order of the container. Thus, although the absolute order of elements in an unordered container is not specified, its elements are grouped into equivalent-key groups such that all elements of each group have equivalent keys. Mutating operations on unordered containers shall preserve the relative order of elements within each equivalent-key group unless otherwise specified.
emphasis mine
It's for consistency with the other containers.
It makes more sense in the _multi variants, but is present in all the associative (and unordered associative) containers in the standard library.
It allows you to write code like
template <typename MapLike, typename KeyLike>
void do_stuff(const MapLike & map, const KeyLike & key)
{
    auto range = map.equal_range(key);
    for (auto it = range.first; it != range.second; ++it)
    {
        // blah: *it is one element whose key is equivalent to key
    }
}
This code does not care which specific container it is dealing with.
cplusplus.com writes about std::unordered_map::equal_range:
Returns the bounds of a range that includes all the elements in the container with a key that compares equal to k. In unordered_map containers, where keys are unique, the range will include one element at most.
Also, in std::unordered_multimap, does its presence mean that all elements with the same key will always come together while iterating the unordered_multimap with an iterator?
In general, the order, in which elements stored in a std::unordered_multimap are obtained while traversing it, is actually not defined. However, note that std::unordered_multimaps are usually implemented as hash tables. By analysing such an implementation you will realize that the ordering is not going to be as "undefined" as someone might initially think.
At element insertion (or hash table rehashing), the value resulting from applying the hash function to an element's key is used to select the bucket where that element is going to be stored. Two elements with equal keys will result in the same hash value, therefore they will be stored in the same bucket, so they come together* while iterating a std::unordered_multimap.
*Note that even two elements with different keys might also result in the same hash value (i.e., a collision). However, std::unordered_multimap can handle these cases by comparing the keys for equality, and therefore still group elements with equal keys together.
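The grouping described above can be checked with a small sketch. The function name keys_are_grouped is hypothetical; it walks a std::unordered_multimap once and reports whether every key forms a single contiguous run in iteration order:

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical check: true if each key appears as one contiguous run
// while iterating the container.
bool keys_are_grouped(const std::unordered_multimap<std::string, int>& m)
{
    std::unordered_set<std::string> finished;  // keys whose run has ended
    const std::string* current = nullptr;      // key of the run in progress
    for (const auto& kv : m) {
        if (current && *current == kv.first)
            continue;                          // still inside the same run
        if (current)
            finished.insert(*current);         // the previous run just ended
        if (finished.count(kv.first))
            return false;                      // key reappeared after its run
        current = &kv.first;
    }
    return true;
}
```

Because the standard requires elements with equivalent keys to be adjacent, this should return true for any std::unordered_multimap, whatever the insertion order.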

Why is the C++ STL set container's count() method thus named?

What it really checks for is contains() and not the count of the number of occurrences, right? Duplicates are not permitted either so wouldn't contains() be a better name than count()?
It's to make it consistent with other container classes, given that one of the great aspects of polymorphism is to be able to treat different classes with the same API.
It does actually return the count. The fact that the count can only be zero or one for a set does not change that aspect.
It's not fundamentally different to a collection object that only allows two things of each "value" at the same time. In that case, it would return the count of zero, one or two, but it's still a count, the same as with a set.
The relevant part of the standard that requires this is C++11 23.2.4 which talks about the associative containers set, multiset, map and multimap. Table 102 contains the requirements for these associative containers over and above the requirements for "regular" containers, and the bit for count is paraphrased below:
size_type a.count(k) - returns the number of elements with key equivalent to k. Complexity is log(a.size()) + a.count(k).
All associative containers must meet the requirements listed in §23.2.4/8 Table 102 - Associative container requirements. One of these is that they implement a.count(k) which then
returns the number of elements with key equivalent to k
So the reason is to have a consistent interface between all associative containers. For instance, this uniformity will be very important when writing generic function templates that must work with any associative container.
It's a standard operation on containers that returns the number of matching elements. In things like lists, this makes perfect sense. It just so happens that on a set, there can only be one occurrence of an element and therefore count can never return a value greater than 1.

Why do unordered_set operations like count and erase return a size_type?

Apparently, unordered_set::erase and unordered_set::count return something that is not strictly boolean (logically, that is, I'm not talking about the actual type).
The linked page reads for the third version of erase:
size_type erase( const key_type& key );
Removes the elements with the key value key
This has a tone to it that suggests there could be more than just one element with a given key. It doesn't explicitly state this, but it sounds like it a lot.
Now, the point of a set, even an unordered one, is to have each element once.
The standard library acknowledges the existence of the bool type and uses it for boolean values like unordered_set::empty(). So, what's the point of returning size_type in the cases above? Even in spite of hash collisions, the container should distinguish elements with different keys, right? Can I still rely on that?
a.erase(k)    size_type    Erases all elements with key equivalent to k. Returns the number of elements erased.
b.count(k)    size_type    Returns the number of elements with key equivalent to k.
It's because of the unordered associative container requirements [23.2.5].
It's probably just so that they could re-use the wording from unordered_multiset. You don't have to worry about hash collisions except performance-wise: the container is still correct even if every element collides, even though such a thing would be stupendously slow.
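A brief sketch of why a size_type return value is convenient, assuming a hypothetical helper named erase_key: the same template works for the unique and the multi variants, and only the range of possible results differs.

```cpp
#include <cstddef>
#include <unordered_set>

// Hypothetical helper: erase by key and report how many elements were removed.
// For std::unordered_set the result is 0 or 1; for std::unordered_multiset
// it can be larger.
template <typename SetLike, typename Key>
std::size_t erase_key(SetLike& s, const Key& key)
{
    return s.erase(key);
}
```

With a std::unordered_multiset<int> holding three copies of 7, erase_key(ms, 7) removes all three and returns 3.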

Why does multimap allow duplicate key-value pairs?

EDIT: Please note, I'm NOT asking why multimap can't contain duplicate keys.
What's the rationale behind multimap allowing duplicate key-value pairs? (not keys)
#include <map>
#include <string>
#include <iostream>

int
main(int argc, char** argv)
{
    std::multimap<std::string, std::string> m;
    m.insert(std::make_pair("A", "B"));
    m.insert(std::make_pair("A", "B"));
    m.insert(std::make_pair("A", "C"));
    std::cout << m.size() << std::endl;
    return 0;
}
This printed 3, which somewhat surprised me. I expected multimap to behave like a set of pairs, so I was expecting 2.
Intuitively, it's not consistent with C++ std::map behaviour, where insert does not always change the map (as opposed to operator[]).
Is there a rationale behind it, or is it just arbitrary?
Multimap only has a predicate ordering the keys. It has no method to determine whether the values are equal. Is value "A" a duplicate of value "a"? Without a second predicate for the values, there's no telling. Therefore, it doesn't even make sense to talk about duplicate values in a multimap.
If you would like a container that stores pairs and enforces the uniqueness of both parts of the pair, look at boost::multi_index_container. It's very flexible, but takes a load of arguments as a result.
EDIT: This answer does not answer the current question anymore. I'll keep it as it is because it got upvoted a lot so it must be useful for some.
The multi in multimap stands for the fact that the same key can occur multiple times.
The standard puts no limit on the type used as value, so one cannot assume that operator==() is defined. Because we don't want the result of your code to depend on whether operator==() is defined or not, it is never used.
std::multimap is not a replacement for std::map. As you noticed, it behaves differently when the same key is inserted multiple times. If you want std::map's behaviour, use std::map.
There is also a std::multiset.
The rationale: sometimes one would like to keep all old entries for the same key around as well. [TBD: Insert some example here]
Personally, I barely ever use std::multimap. If I want multiple entries for the same key, I usually rely on std::map<K, std::vector<T> >.
The values are allowed to be duplicates because they are not required to be comparable to each other. The container cannot do anything with the values besides copy them in. This enables types like multimap< int, my_class >.
If duplicate key-value pairs are undesirable, then use set< pair< T, U > > and use lower_bound to find the first match to a given key.
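A minimal sketch of the set-of-pairs approach, using a hypothetical PairSet alias and a hypothetical helper named values_in_set. It assumes std::string for both parts of the pair, so that the empty string can serve as the smallest possible second element when calling lower_bound:

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical alias: a set of (key, value) pairs, so exact duplicates collapse.
using PairSet = std::set<std::pair<std::string, std::string>>;

// Hypothetical helper: all values paired with key. std::pair compares
// lexicographically, and the empty string is the smallest std::string,
// so lower_bound lands on the first pair whose first element is >= key.
std::vector<std::string> values_in_set(const PairSet& s, const std::string& key)
{
    std::vector<std::string> out;
    for (auto it = s.lower_bound(std::make_pair(key, std::string()));
         it != s.end() && it->first == key; ++it)
        out.push_back(it->second);
    return out;
}
```

Inserting ("A", "B") twice and ("A", "C") once leaves two elements in the set, unlike the multimap in the question. Note that this requires the value type to be ordered, which std::multimap does not.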
As you know, multimap allows the same key to appear multiple times. Since it does not place any constraints on the comparability of values, it is unable to check whether a value has been inserted twice.
If you want to have some dictionary data structure which allows for duplicate keys, but not key-value pairs, you would have to ensure that values are comparable.
Let's say we have a game of some sort, where there is a 2D world of square fields, and you can put items on fields. You can have multimap<Field, Item>, which will allow you to keep two identical items on the same field. Items don't have to be comparable here.
My reasoning is multimap is based on the Key lookup/insertion and not on the value. So whether the value on duplicate keys is the same or not does not play a part when elements are being inserted.
23.3.2 Class template multimap
1 A multimap is a kind of associative
container that supports equivalent
keys (possibly containing multiple
copies of the same key value) and
provides for fast retrieval of values
of another type T based on the keys.
"multimap" is meant to support 'multiple' keys, unlike the simple "map". Since it allows multiple keys, it does not look at their values, so it shows 3 elements in your example. The other difference is that multimap has no operator[].
One use of duplicate [key,value] pairs is to count the number of occurrences of, say, a word on a page of a book: if the word does not occur, there is no entry in the multimap for it; if it occurs once, there is one entry; and if it occurs more than once, the multimap holds one make_pair(word, page_number) entry per occurrence. It was more by accident than design that I found this usage.
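The word-counting idea can be sketched as follows. The names Index, record, and occurrences are hypothetical, and pages are assumed to be plain int page numbers:

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical index: word -> page number, one entry per occurrence.
using Index = std::multimap<std::string, int>;

// Record one occurrence of word on page; duplicates are kept on purpose.
void record(Index& index, const std::string& word, int page)
{
    index.emplace(word, page);
}

// Number of recorded occurrences of word on the given page.
std::size_t occurrences(const Index& index, const std::string& word, int page)
{
    std::size_t n = 0;
    auto range = index.equal_range(word);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second == page)
            ++n;
    return n;
}
```

Here index.count(word) gives the total across all pages, while occurrences() narrows that to a single page.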