Confused with the standard associative container set and vector - c++

I am familiar with the standard libraries associative container map and the sequence container map. However I cant seem to understand the purpose of a set. While trying to understand std::set online I came across the following statement
A set is an STL container that stores values and permits easy lookup.
For example, you might have a set of strings:
std::set<std::string> S;
You can add a new element by writing
S.insert("foo";.
A set may not contain more than one element with the same key, so
insert won't add anything if S already contains the string "foo";
instead it just looks up the old element. The return value includes a
status code indicating whether or not the new element got inserted
So from the above text it seems to me like the set container only stores key and is not like a map which stores a key and a value. If this is true why is it an associative container and not a sequence container like a map ?

It is not really an associative container as it only stores values. It has the property of a mathematical set that if you put the same value in more than once you still only get one instance of that. This can be very useful. Suppose you were trying to get a list of all the words in a document; a set would be great as they would occur many times in the text but you would just get one of each word in the set.

You may think about it as an association of a key and a boolean value, telling whether the key is present in the set or not.

Related

Map count really counts or just checks existence

In CPP primer or other websites I have found the language of count (from map STL) definition very vague and misleading:
Searches the container for elements with a key equivalent to k and returns the number of matches
Now what I have studied so far is that key is singular and so is the mapped value - the mapped value can be changed through assignment.
So doesn't it just returns whether the container contains the key or not? Rather than the count? Where am I wrong in understanding the concept?
A std::map's count() will always return either 0 or 1.
But the C++ library has other associative containers that might very well have multiple values for the same key. Like std::multimap and std::multiset. And by a very lucky coincidence they also have a count() method that may actually return values greater than 1.
But what this allows you to do is metaprogramming by developing templates that can use any associative container, one that may or may not be unique. All your template needs to do is use count() to determine how many values exist in the container with the given key, and the end result can be used with either std::map or std::multimap. It won't care in the slightest. In both cases, your template will get the right answer: the number of values in the container with the given key.
According to cplusplus.com
Because all elements in a map container are unique, the function can only return 1 (if the element is found) or zero (otherwise).

Underlying storage and functionality of unordered_maps vs unordered_multimaps in C++?

I'm having a hard time wrapping my head around unordered_maps and unordered_multimaps because my test code isn't producing what I've been told to expect.
std::unordered_map<string, int> names;
names.insert(std::make_pair("Peter", 4));
names.insert(std::make_pair("George", 4));
names.insert(std::make_pair("George", 4));
When I iterate through this list, I get one instance of George first, then Peter.
1) It's my understanding unordered_maps do not allow multiple keys to map to one value, and that multimaps due. Is this true?
2) Why can Peter and George coexist at a value of 4? What is happening to the second George? And for that matter, why is George appearing first when I iterate from begin() to end() if this is unordered?
3) What is the underlying representation of an unordered map vs. unordered multimap?
4) Is there a way to insert keys into either map without providing a value? E.g. have the compiler create its own hash function that I don't need to worry about when I retrieve keys and look for collisions?
I'll make it short:
No. Multi... refers to keys. A (non-multi)map can't have multiple equivalent keys with differeny values, ie. per key there is at most one value. A multi map can. The same holds for the unordered versions.
Peter != George, which is why they have different key and may very well have the same value.
A hashmap.
Use sets.
In your example the second insertion for George using a (non-multi) is skipped as the same key was previously inserted.
You want to use unordered_multimap to have several keys that are the same.
Since this is unordered you can't really hope to have any particular order, because it depends on the hash function.
If you want order in which you insert things, you need to use std::vector. Even normal maps, which are supposed to be ordered imply the comparison order, and not the order in which you insert things, for example string "AB" comes before "BB", because "A" is less than "B".
To insert without providing a value you need a set, and not a map.
The underlying structure of "unordered_" things is hashtable.

How can I find out if a map contains a given value?

I'd like to find out whether a given value is present in a map. Also getting the corresponding key(s) would be nice but is not required.
bool map::contains(string value);
Is there a simple way to do this other than to iterate over the whole map and comparing each value with the given value? Why is there no corresponding method in the STL?
std::map only indexes its elements by key; it does not index them by value. Therefore, there is no way to look up an element by its value without iterating over the map.
Take a look at Boost.Bimap:
Boost.Bimap is a bidirectional maps library for C++. With Boost.Bimap you can create associative containers in which both types can be used as key. A bimap<X,Y> can be thought of as a combination of a std::map<X,Y> and a std::map<Y,X>.
Using it is pretty straightforward, although you will of course need to consider the question of whether duplicate values are allowed.
Also, see Is there a Boost.Bimap alternative in c++11?

C++ container for list and map

we have a collection of key and value pairs. We are in the need for a container which can help us to retrieve the value o(1) but also remember the insertion order so that when we do iteration, we could iterate like a inserting order. Since the key is a string, we will not able to use a set or similar structure.
Currently we have defined our own collection class which contains a list, also a map and the values are stored into 2 different structure.
Are there any readily available implementation available?
Sounds like you need a Boost Multi-Index container.

Why does multimap allow duplicate key-value pairs?

EDIT: Please note, I'm NOT asking why multimap can't contain duplicate keys.
What's the rationale behind multimap allowing duplicate key-value pairs? (not keys)
#include <map>
#include <string>
#include <iostream>
int
main(int argc, char** argv)
{
std::multimap<std::string, std::string> m;
m.insert(std::make_pair("A", "B"));
m.insert(std::make_pair("A", "B"));
m.insert(std::make_pair("A", "C"));
std::cout << m.size() << std::endl;
return 0;
}
This printed 3, which somewhat surprised me, I expected multimap to behave like a set of pairs, so I was expecting 2.
Intuitively, it's not consistent with C++ std::map behaviour, where insert does not always change the map (as opposed to operator[]).
Is there a rationale behind it, or it's just arbitrary?
Multimap only has a predicate ordering the keys. It has no method to determine whether the values are equal. Is value "A" a duplicate of value "a"? Without a second predicate for the values, there's no telling. Therefore, it doesn't even make sense to talk about duplicate values in a multimap.
If you would like a container that stores pairs, and enforces the unique-ness of both parts of the pair, look at boost::multi_index_container. It's very flexible, but takes a load of arguments as a result.
EDIT: This answer does not answer the current question anymore. I'll keep it as it is because it got upvoted a lot so it must be useful for some.
The multi in multimap stands for the fact that the same key can occur multiple times.
The standard puts no limit on the type used as value, so one cannot assume that operator==() is defined. Because we don't want the result of your code depend on whether the operator==() is defined or not, it is never used.
std::multimap is not a replacement for std::map. As you noticed, it behaves differently when the same key is inserted multiple times. If you want std::map's behaviour, use std::map.
There is also a std::multiset.
The rational: sometimes one would like to keep all old entries for the same key around as well. [TBD: Insert some example here]
Personally, I barely ever use std::multimap. If I want multiple entries for the same key, I usually rely on std::map<std::vector<T> >.
The values are allowed to be duplicates because they are not required to be comparable to each other. The container cannot do anything with the values besides copy them in. This enables types like multimap< int, my_class >.
If duplicate key-value pairs are undesirable, then use set< pair< T, U > > and use lower_bound to find the first match to a given key.
As you know, multimap allows to have multiple keys. Since it does not place any constraints on values comparability, it is unable to check, if values haven't been doubled.
If you want to have some dictionary data structure which allows for duplicate keys, but not key-value pairs, you would have to ensure that values are comparable.
Let's say we have a game of some sort, where there is 2D world of sqaure fields, and you can put items on fields. You can have multimap<Field, Item>, which will allow you to keep two identical items on the field. Items don't have to be comparable here.
My reasoning is multimap is based on the Key lookup/insertion and not on the value. So whether the value on duplicate keys is the same or not does not play a part when elements are being inserted.
23.3.2 Class template multimap
1 A multimap is a kind of associative
container that supports equivalent
keys (possibly containing multiple
copies of the same key value) and
provides for fast retrieval of values
of another type T based on the keys.
"multimap" is meant to support 'multiple' keys unlike simple "map". Since it allows multiple keys, it won't bother for their values, so it shows 3 elements in your example. The other difference is, one can not have operator [] for multimap.
A use of duplicate [map,value] pairs is to count the number of occurrences of say a word on a page of a book, be it no times, thus no entry in the multimap for that word, be it once with one entry, or more than once with the number of occurrences in multimap for make_pair(word, page_number). It was more by accident that design that I found this usage.