Order of map in C++? - c++

I'm new to C++ and I've been experimenting with the language lately.
I started doing some basic iterations with map.
What I found was that the following code:
map<string, int> persons = {{"Lily", 14}, {"John", 45}};
for ( const auto &p : persons ) {
cout << p.first << " is " << p.second << " years old." << endl;
}
Always returns:
John is 45 years old.
Lily is 14 years old.
No matter what order the entries of persons are written in (e.g., if I swap Lily and John).
Is there any ordering within map?

Yes.
std::map (as well as std::set) is ordered according to its comparator, which defaults to std::less, which in turn calls operator< for the stored keys.
Hence, std::strings are ordered lexicographically.

Yes, there is ordering in map.
To be specific, a std::map orders items by their keys. In this case, you've used std::string as the key, so the keys are ordered by comparing strings. Since J comes before L in the alphabet, John is ordered first in the map as well.
If you prefer, you can supply your own comparison routine (as a function, or preferably, a function object) that specifies a different ordering. It still has to satisfy the "strict weak ordering" requirements, so (for example) A<B and B<C must imply that A<C.

set::key_comp vs set::value_comp in C++?

What is the difference between set::key_comp and set::value_comp in C++? According to the cplusplus.com pages, there is no significant difference.
Furthermore, the last sentence on both the set::key_comp page and the related set::value_comp page is "(...) key_comp and its sibling member function value_comp are equivalent."
Examples are almost the same:
http://www.cplusplus.com/reference/set/set/key_comp/
http://www.cplusplus.com/reference/set/set/value_comp/
key_comp defines the order of the keys in a container.
value_comp defines the order of the values in a container.
In a std::set where, essentially, the values are the keys, the two are indeed exactly equivalent. But that's not true in all containers, e.g. std::map, or, in general, in a container that you might build yourself that follows the conventions of the C++ Standard Library Containers.
Note also that http://en.cppreference.com/w/ is a superior reference for C++. It pretty much proxies the standards.
These are identical, both must be made available by any implementation because std::set must meet the requirement of Associative Container.
This allows you to write generic code that works with any Associative Container (std::set, std::map, std::multiset, std::multimap in the standard library).
The difference comes when key and value are different entities inside a container.
For containers like set, these two terms mean the same thing.
For containers like map or multimap, however, the key and value are separate entities maintained as a single entry.
Here is an example which shows how they differ:
std::set<int> myset;
int highest1, highest2, highest3;
typedef std::map<int, int> MyMap;
MyMap mymap;
std::set<int>::key_compare myCompKeyForSet = myset.key_comp();
std::set<int>::value_compare myCompValForSet = myset.value_comp();
MyMap::key_compare myCompKeyForMap = mymap.key_comp();
MyMap::value_compare myCompValForMap = mymap.value_comp();
for (int i=0; i<=5; i++) {
myset.insert(i);
mymap.insert(std::make_pair(i, 2*i));
}
//////SET///////
highest1=*myset.rbegin();
std::set<int>::iterator it=myset.begin();
while ( myCompKeyForSet(*it, highest1) ) it++;
std::cout << "\nhighest1 is " << highest1; // prints 5
highest2 = *myset.rbegin();
it=myset.begin();
while ( myCompValForSet(*it, highest2) ) it++;
std::cout << "\nhighest2 is " << highest2; // prints 5
//////MAP///////
MyMap::iterator it2 = mymap.begin();
highest3 = mymap.rbegin()->first;
while ( myCompKeyForMap((it2->first), highest3) ) it2++;
std::cout << "\nhighest3 is " << highest3; // prints 5
std::pair<int,int> highest4 = *mymap.rbegin(); //must be defined as map's `value_type`
it2 = mymap.begin();
while ( myCompValForMap(*(it2), highest4) ) it2++; // takes `value_type` which is `pair<int, int>` in this case.
std::cout << "\nhighest4 is " << highest4.second; // prints 10
As I mentioned, the arguments passed to the value_compare function object must be of type value_type, so I somewhat disagree with those who say that key_comp and value_comp are interchangeable across associative containers.

C++ Multiset count()

The problem is the following: I have a multiset where I use std::equal_to for comparing the elements, but when I call the count() method, it reports that all 4 elements in my multiset are equal to the argument I pass to count().
std::multiset< std::string, std::equal_to< std::string > > mset;
mset.insert("C++");
mset.insert("SQL");
mset.insert("Jav");
mset.insert("C");
for(std::multiset<std::string>::iterator it = mset.begin(); it != mset.end(); ++it){
std::cout << *it << std::endl;
}
std::cout << std::endl;
std::cout << mset.count("STR");
The output is : 4
If I understand correctly, what's happening is "STR"=="C++"=="SQL"=="Jav"=="C"==true.
And this is what I don't understand.
Thank you for the help.
As BobTFish already said in a comment, the Compare type of std::multiset should return true if the first argument is "less" than (has to be ordered before) the second argument. The default type is std::less<Key>.
For elements stored in std::multiset, you must define a strict weak ordering relation f(x, y). One of the properties of a strict weak ordering is irreflexivity, that is, f(x, x) must be false. std::equal_to violates this property, so the behavior is undefined. The container treats two keys as equivalent when neither f(a, b) nor f(b, a) is true; with std::equal_to that holds for every pair of distinct strings, which is the likely reason count("STR") reports 4 on your implementation.
What you probably want is to use std::unordered_multiset instead.

Should items with duplicate keys in unordered_multimap be kept in the order of their insertion?

One book mentioned that for std::unordered_multimap:
The order of the elements is undefined. The only guarantee is that
duplicates, which are possible because a multiset is used, are grouped
together in the order of their insertion.
But from the output of the example below, we can see that the print order is reverse from their insertion.
#include <iostream>
#include <string>
#include <unordered_map>
int main()
{
    std::unordered_multimap<int, std::string> um;
    um.insert( {1,"hello1.1"} );
    um.insert( {1,"hello1.2"} );
    um.insert( {1,"hello1.3"} );
    for (auto &a: um){
        std::cout << a.first << '\t' << a.second << std::endl;
    }
}
Which when compiled and run produces this output (g++ 5.4.0):
1 hello1.3
1 hello1.2
1 hello1.1
updated: unordered_multiset has the same issue:
auto cmp = [](const pair<int,string> &p1, const pair<int,string> &p2)
{return p1.first == p2.first;};
auto hs = [](const pair<int,string> &p1){return std::hash<int>()(p1.first);};
unordered_multiset<pair<int, string>, decltype(hs), decltype(cmp)> us(0, hs, cmp);
us.insert({1,"hello1.1"});
us.insert({1,"hello1.2"});
us.insert({1,"hello1.3"});
for(auto &a:us){
cout<<a.first<<"\t"<<a.second<<endl;
}
output:
1 hello1.3
1 hello1.2
1 hello1.1
Here is what the standard says of the ordering [unord.req] / §6:
... In containers that support equivalent keys, elements with equivalent keys are adjacent to each other in the iteration order of the container. Thus, although the absolute order of elements in an unordered container is not specified, its elements are grouped into equivalent-key groups such that all elements of each group have equivalent keys. Mutating operations on unordered containers shall preserve the relative order of elements within each equivalent-key group unless otherwise specified.
So, to answer the question:
Should items with duplicate keys in unordered_multimap be kept in the order of their insertion?
No, there is no such requirement or guarantee. If the book makes such a claim about the standard, then it is not correct. If the book describes a particular implementation of std::unordered_multimap, then the description could be true for that implementation.
The requirements of the standard make an implementation using open addressing impractical. Therefore, compliant implementations typically use separate chaining of hash collisions, see How does C++ STL unordered_map resolve collisions?
Because equivalent keys, which necessarily collide, are (in practice, though not explicitly required to be) stored in a separate linked list, the most efficient way to insert them is in order of insertion (push_back) or in reverse (push_front). Only the latter is efficient if the separate chain is singly linked.

Why isn't vector::operator[] implemented similar to map::operator[]?

Is there any reason for std::vector's operator[] to just return a reference instead of inserting a new element? The cppreference.com page for vector::operator[] says:
Unlike std::map::operator[], this operator never inserts a new element into the container.
While the page for map::operator[] says
"Returns a reference to the value that is mapped to a key equivalent to key, performing an insertion if such key does not already exist."
Why couldn't vector::operator[] be implemented by calling vector::push_back or vector::insert like how map::operator[] calls insert(std::make_pair(key, T())).first->second;?
Quite simply: Because it doesn't make sense. What do you expect
std::vector<int> a = {1, 2, 3};
a[10] = 4;
to do? Create a fourth element even though you specified index 10? Create elements 3 through 10 and return a reference to the last one? Neither would be particularly intuitive.
If you really want to fill a vector with values using operator[] instead of push_back, you can call resize on the vector to create the elements before setting them.
Edit: Or, if you actually want to have an associative container, where the index is important apart from ordering, std::map<int, YourData> might actually make more sense.
A map and a vector are completely different concepts. A map is an "associative container" whereas a vector is a "sequence container". Delineating the differences is out of the scope of this answer, though at the most superficial of levels, a map is generally implemented as a red-black tree, while a vector is a convoluted wrapper over a C-style array (elements stored contiguously in memory).
If you want to check if an element already exists, you would need to resize the entire container. But what happens if you decide to remove the element? What do you do with the entries you just created? With a map:
std::map<int, int> m;
m[1] = 1;
m.erase(m.begin());
Erasing by iterator is an amortized constant-time operation.
With a vector:
std::vector<int> v;
// ... initialize some values between 25 and 100
v[100] = 1;
v.erase(v.begin() + 25, v.end());
This is a linear operation. That's horribly inefficient compared to a map. While this is a contrived example, it's not hard to imagine how this could blow up in other scenarios. At a minimum, most people would go out of their way to avoid operator[], which is a cost in and of itself (maintenance and code complexity).
Is there any reason for std::vector's operator[] to just return a reference instead of inserting a new element?
std::vector::operator[] is implemented in an array-like fashion because std::vector is a sequence container (i.e., array-like). Built-in arrays must not be accessed out of bounds (doing so is undefined behavior). Similarly, calling std::vector::operator[] with an index outside of the vector's length is not allowed either. So, yes, the reason it is not implemented the way you ask about is that in no other context do arrays in C++ act like that.
std::map is not a sequence container. Its operator[] syntax makes it resemble associative arrays in other languages. In terms of C++ (and its predecessor, C), map::operator[] is just syntactic sugar. It is the "black sheep" of the operator[] family, not std::vector::operator[].
The interesting part of the C++ specification regarding this is that accessing a map with a key that doesn't exist, using std::map::operator[], adds an element to the map. Thus,
#include <iostream>
#include <map>
int main(void) {
std::map<char, int> m;
m['a'] = 1;
std::cout << "m['a'] == " << m['a'] << ", m.size() == " << m.size() << std::endl;
std::cout << "m['b'] == " << m['b'] << ", m.size() == " << m.size() << std::endl;
}
results in:
m['a'] == 1, m.size() == 1
m['b'] == 0, m.size() == 2
See also: Difference between map[] and map.at in C++? :
[map::at] throws an exception if the key doesn't exist, find returns aMap.end() if the element doesn't exist, and operator[] value-initializes a new value for the corresponding key if no value exists there.

How can I sort a std::map first by value, then by key?

I need to sort a std::map by value, then by key. The map contains data like the following:
1 realistically
8 really
4 reason
3 reasonable
1 reasonably
1 reassemble
1 reassembled
2 recognize
92 record
48 records
7 recs
I need to get the values in order, but the kicker is that the keys need to be in alphabetical order after the values are in order. How can I do this?
std::map will sort its elements by keys. It doesn't care about the values when sorting.
You can use std::vector<std::pair<K,V>> then sort it using std::sort followed by std::stable_sort:
std::vector<std::pair<K,V>> items;
//fill items
//sort by key (the secondary criterion) using std::sort
std::sort(items.begin(), items.end(), key_comparer);
//sort by value (the primary criterion) using std::stable_sort
std::stable_sort(items.begin(), items.end(), value_comparer);
The first sort should use std::sort since it is O(n log n); the second must be std::stable_sort, which is O(n (log n)^2) in the worst case (when no extra memory is available).
Note that while std::sort is chosen for performance reasons, std::stable_sort is needed for correct ordering: the second pass must preserve the order-by-key among elements with equal values. Running the passes the other way around (value first, then a stable sort by key) would leave the items ordered primarily by key.
As @gsf noted in a comment, you could use only std::sort if you choose a comparer that compares values first and, only if they're equal, compares the keys.
auto cmp = [](std::pair<K,V> const & a, std::pair<K,V> const & b)
{
return a.second != b.second? a.second < b.second : a.first < b.first;
};
std::sort(items.begin(), items.end(), cmp);
That should be efficient.
But wait, there is a better approach: store std::pair<V,K> instead of std::pair<K,V> and then you don't need any comparer at all — the standard comparer for std::pair would be enough, as it compares first (which is V) first then second which is K:
std::vector<std::pair<V,K>> items;
//...
std::sort(items.begin(), items.end());
That should work great.
You can use std::set instead of std::map.
You can store both key and value in std::pair and the type of container will look like this:
std::set< std::pair<int, std::string> > items;
std::set will sort its elements by both members of each pair: first by the int (the value from the original std::map), then by the std::string (the original key).
As explained in Nawaz's answer, you cannot sort your map by itself as you need it, because std::map sorts its elements based on the keys only. So, you need a different container, but if you have to stick to your map, then you can still copy its content (temporarily) into another data structure.
I think the best solution is to use a std::set storing flipped key-value pairs, as presented in ks1322's answer.
The std::set is sorted by default and the order of the pairs is exactly as you need it:
3) If lhs.first<rhs.first, returns true. Otherwise, if rhs.first<lhs.first, returns false. Otherwise, if lhs.second<rhs.second, returns true. Otherwise, returns false.
This way you don't need an additional sorting step and the resulting code is quite short:
std::map<std::string, int> m; // Your original map.
m["realistically"] = 1;
m["really"] = 8;
m["reason"] = 4;
m["reasonable"] = 3;
m["reasonably"] = 1;
m["reassemble"] = 1;
m["reassembled"] = 1;
m["recognize"] = 2;
m["record"] = 92;
m["records"] = 48;
m["recs"] = 7;
std::set<std::pair<int, std::string>> s; // The new (temporary) container.
for (auto const &kv : m)
s.emplace(kv.second, kv.first); // Flip the pairs.
for (auto const &vk : s)
std::cout << std::setw(3) << vk.first << std::setw(15) << vk.second << std::endl;
Output:
1 realistically
1 reasonably
1 reassemble
1 reassembled
2 recognize
3 reasonable
4 reason
7 recs
8 really
48 records
92 record
Note: Since C++17 you can use range-based for loops together with structured bindings for iterating over a map.
As a result, the code for copying your map becomes even shorter and more readable:
for (auto const &[k, v] : m)
s.emplace(v, k); // Flip the pairs.
std::map already sorts its elements by key, using a predicate you define or std::less if you don't provide one. std::set will also store items in the order defined by its comparator. However, neither set nor map allows duplicate keys. I would suggest defining a std::map<int, std::set<std::string>> if you want to accomplish this using your data structure alone. You should also realize that std::less for strings sorts lexicographically (by character code), which is not the same as alphabetically: for example, uppercase letters sort before lowercase ones.
EDIT: The other two answers make a good point. I'm assuming that you want to order them into some other structure, or in order to print them out.
"Best" can mean a number of different things. Do you mean "easiest," "fastest," "most efficient," "least code," "most readable?"
The most obvious approach is to loop through twice. On the first pass, order the values:
if(current_value > examined_value)
{
current_value = examined_value
(and then swap them, however you like)
}
Then on the second pass, alphabetize the words, but only if their values match.
if(current_value == examined_value)
{
(alphabetize the two)
}
Strictly speaking, this is a "bubble sort", which is slow (quadratic in the worst case) because you have to keep making passes over the list as long as any swap occurs. The sort is finished when you get through the whole list without making any swaps.
There are other sorting algorithms, but the principle would be the same: order by value, then alphabetize.