Multimap not sorting - c++

I have this multimap built to map the hamming distance of a string to its corresponding string.
Since the hamming distance of two strings could be the same, I want them to be sorted in ascending order. However when I print it out, it is not sorted. The hamdistArray is declared as an unsigned type.
typedef multimap<unsigned, string, less<unsigned> > Check;
Check pairs;
pairs.insert(Check::value_type(hamdistArray[j], d.sortedWordDatabase[j]));
for(Check::const_iterator iter = pairs.begin(); iter != pairs.end(); ++iter)
{
cout << iter->first << '\t' << iter->second<< endl;
}

Elements in a multimap are sorted by the key (in this case the unsigned hamming distance). Elements with the same key are not sorted by the value (in this case the string), they are usually kept in the order in which they were inserted.

This is not possible using std::multimap, because when keys are compared, it is not known which value they represent.

multimap only sorts by its key (length) not also by value (string). In this case I suspect your best approach is a std::map<unsigned, std::set<std::string> >. You could also use a std::set<std::pair<unsigned, std::string> > but searching would require you to construct dummy pairs to search on.

The less template function is not necessary since it's the default. Try declaring Check without as:
typedef multimap<unsigned, string> Check;
Edited: The best way to do this is to generate a hash-key as the *key_type* and than the value-type could be a std::pair<unsigned, string>

Related

How to get the top 100 names in a collection

I am creating a program where I am summarizing from a data file. The data file has information about first names, etc. The information are the fields in the csv file. The fields in the data file are included as instance variables in the class. I created setter and getter methods to return the data for one person. I created vectors to hold the collection of variables.
I am having trouble understanding how create a list of the 100 most common first names of all people in the collection. The list must be in descending order of occurrence.
I was able to print all the common names and its frequencies. But, I am unable to print the 100 most common names. I sorted the vector and got the following errors:
class std::pair<const std::string, int> has no member begin and end
Please help me resolve these issue. All processing of data in the vector must be done with iterators.I am not sure how to fix these issues since I am a beginner.
std::vector<std::string> commonNamesFirst; //vector
for (auto x : census) {
commonNamesFirst.push_back(x.getFirstName()); //populate vector
}
std::map<std::string, int> frequencies;
for (auto& x : census) { ++frequencies[x.getFirstName()]; }
for (auto& freq : frequencies) {
sort(freq.begin(), freq.end(), greater <>()); //error, need to sort in descending order
cout << freq.first << ": " << freq.second << endl; //print the 100 common names in descending order
}
std::map<std::string, int> frequencies;
This is generally the right direction. You're using this to count how many times each word occurs.
for (auto& freq : frequencies) {
This iterates over each individual word and a count of how many times it occured. This no longer makes any logical sense. You are looking to find the 100 most common ones, the one with the highest count values. Iterating, and looking at each one individually, in the manner that's done here, does not make any sense.
sort(freq.begin(), freq.end(), greater <>());
freq, here, is a single word and how many times it occured. You are using freq to iterate over all of the frequencies. Therefore, this is just one of the words, and its frequency value. This is a single std::pair value. And it does not have anything called begin, or end. And that's what your compiler is telling you, directly.
Furthermore, you cannot sort a std::map in the first place. This is not a sortable container. The simplest option is to extract the contents if the now-complete map into something that's sortable. Like, for example, a vector:
std::vector<std::pair<std::string, int>> vfrequencies{
frequencies.begin(),
frequencies.end()
};
So, you've now copied the contents of a map into a vector. Not the most efficient approach, but a workable one.
And now, you can sort this vector. Rather easily.
However, as one last detail, you can't just drop std::greater<> and expect the right thing to happen.
You are looking to sort on the frequency count value only, which is the .second of these std::pairs. A plain std::greater is not going to do this for you. The std::greater overload for a std::pair is not going to do what you think it will do, here.
You will need to provide your own custom lambda for the third parameter of std::sort, that compares the second value of the std::pairs in that vector.
And then, the first 100 most common words will be the first 100 values in the vector. Mission accomplished.
You cannot (re-)sort std::map, you can copy frequencies in vector or std::multimap as intermediate:
std::map<std::string, int> frequencies;
for (auto& x : census) { ++frequencies[x.getFirstName()]; }
std::vector<std::pair<std::string, int>> freqs{frequencies.begin(), frequencies.end()};
std::partial_sort(freqs.begin(), freqs.begin() + 100, freqs.end(),
[](const auto& lhs, const auto& rhs){ return lhs.second > rhs.second; });
for (std::size_t i = 0; i != 100; ++i)
{
std::cout << freqs[i].second << ":" << freqs[i] << std::endl;
}
Building on to #Michał Kaczorowski's answer, you are trying to sort the values in each pair instead of the pairs in the map. However, as Sam mentoined, you cannot sort an std::map (the internal implementation stores things sorted by the key value, or the name in this case). You'd have to get the values out of the map and sort them then, or use a priority queue and heapsort (faster constant factor), or a monotonic queue (linear time but harder to implement). Here is an example heapsort implementation:
vector<string> commonNamesFirst; //vector
for (auto x : census) {
commonNamesFirst.push_back(x.getFirstName()); //populate vector
}
std::map<std::string, int> frequencies;
for (auto& x : census) { ++frequencies[x.getFirstName()]; }
std::priority_queue<pair<int, std::string> > top_names; // put the frequency before the name to take advantage of default pair compare
for (auto& freq : frequencies) top_names.push(std::make_pair(freq.second, freq.first));
for (int i=0; i<100; ++i)
{
outputFile << top_names.top().second << ": " << top_names.top().first << endl; //print the 100 common names in descending order
top_names.pop();
}
The error you get, says it all. You are trying to sort individual std::pair. I think the best way would be to transform your map into a std::vector of pairs and then sort that vector. Then just go through first 100 elements in a loop and print results.

How can I sort a std::map first by value, then by key?

I need to sort a std::map by value, then by key. The map contains data like the following:
1 realistically
8 really
4 reason
3 reasonable
1 reasonably
1 reassemble
1 reassembled
2 recognize
92 record
48 records
7 recs
I need to get the values in order, but the kicker is that the keys need to be in alphabetical order after the values are in order. How can I do this?
std::map will sort its elements by keys. It doesn't care about the values when sorting.
You can use std::vector<std::pair<K,V>> then sort it using std::sort followed by std::stable_sort:
std::vector<std::pair<K,V>> items;
//fill items
//sort by value using std::sort
std::sort(items.begin(), items.end(), value_comparer);
//sort by key using std::stable_sort
std::stable_sort(items.begin(), items.end(), key_comparer);
The first sort should use std::sort since it is nlog(n), and then use std::stable_sort which is n(log(n))^2 in the worst case.
Note that while std::sort is chosen for performance reason, std::stable_sort is needed for correct ordering, as you want the order-by-value to be preserved.
#gsf noted in the comment, you could use only std::sort if you choose a comparer which compares values first, and IF they're equal, sort the keys.
auto cmp = [](std::pair<K,V> const & a, std::pair<K,V> const & b)
{
return a.second != b.second? a.second < b.second : a.first < b.first;
};
std::sort(items.begin(), items.end(), cmp);
That should be efficient.
But wait, there is a better approach: store std::pair<V,K> instead of std::pair<K,V> and then you don't need any comparer at all — the standard comparer for std::pair would be enough, as it compares first (which is V) first then second which is K:
std::vector<std::pair<V,K>> items;
//...
std::sort(items.begin(), items.end());
That should work great.
You can use std::set instead of std::map.
You can store both key and value in std::pair and the type of container will look like this:
std::set< std::pair<int, std::string> > items;
std::set will sort it's values both by original keys and values that were stored in std::map.
As explained in Nawaz's answer, you cannot sort your map by itself as you need it, because std::map sorts its elements based on the keys only. So, you need a different container, but if you have to stick to your map, then you can still copy its content (temporarily) into another data structure.
I think, the best solution is to use a std::set storing flipped key-value pairs as presented in ks1322's answer.
The std::set is sorted by default and the order of the pairs is exactly as you need it:
3) If lhs.first<rhs.first, returns true. Otherwise, if rhs.first<lhs.first, returns false. Otherwise, if lhs.second<rhs.second, returns true. Otherwise, returns false.
This way you don't need an additional sorting step and the resulting code is quite short:
std::map<std::string, int> m; // Your original map.
m["realistically"] = 1;
m["really"] = 8;
m["reason"] = 4;
m["reasonable"] = 3;
m["reasonably"] = 1;
m["reassemble"] = 1;
m["reassembled"] = 1;
m["recognize"] = 2;
m["record"] = 92;
m["records"] = 48;
m["recs"] = 7;
std::set<std::pair<int, std::string>> s; // The new (temporary) container.
for (auto const &kv : m)
s.emplace(kv.second, kv.first); // Flip the pairs.
for (auto const &vk : s)
std::cout << std::setw(3) << vk.first << std::setw(15) << vk.second << std::endl;
Output:
1 realistically
1 reasonably
1 reassemble
1 reassembled
2 recognize
3 reasonable
4 reason
7 recs
8 really
48 records
92 record
Code on Ideone
Note: Since C++17 you can use range-based for loops together with structured bindings for iterating over a map.
As a result, the code for copying your map becomes even shorter and more readable:
for (auto const &[k, v] : m)
s.emplace(v, k); // Flip the pairs.
std::map already sorts the values using a predicate you define or std::less if you don't provide one. std::set will also store items in order of the of a define comparator. However neither set nor map allow you to have multiple keys. I would suggest defining a std::map<int,std::set<string> if you want to accomplish this using your data structure alone. You should also realize that std::less for string will sort lexicographically not alphabetically.
EDIT: The other two answers make a good point. I'm assuming that you want to order them into some other structure, or in order to print them out.
"Best" can mean a number of different things. Do you mean "easiest," "fastest," "most efficient," "least code," "most readable?"
The most obvious approach is to loop through twice. On the first pass, order the values:
if(current_value > examined_value)
{
current_value = examined_value
(and then swap them, however you like)
}
Then on the second pass, alphabetize the words, but only if their values match.
if(current_value == examined_value)
{
(alphabetize the two)
}
Strictly speaking, this is a "bubble sort" which is slow because every time you make a swap, you have to start over. One "pass" is finished when you get through the whole list without making any swaps.
There are other sorting algorithms, but the principle would be the same: order by value, then alphabetize.

C++ Vector of linked lists

I have a nested template argument in the form of
vector<list<int, string> >.
That is, it's a vector of linked lists that hold integer values and string words. If this is not a valid form, please let me know. My question is in calling it. If 'table' is of the above data type, would it be okay to call an index as, for example, table[0]? If so, how do I start walking through the linked list in that index?
With the nested data structure you've defined, you are exactly correct to call an index like you mentioned, table[0]. You can perform your list operations on that exactly. To make your code more clean, it can be helpful to do something like:
list<string> listInVector = table[i];
So you don't get confused performing operations on an index in table, instead you can use that identifier to make the code more clean.
The elements in the containers are individual elements. If you want to have multiple elements, e.g., an int and a std::string you'll need to put them into a suitable structure, e.g., into a std::pair<int, std::string>:
std::vector<std::list<std::pair<int, std::string>>> table;
To walk the elements in the list at a specific position you could use, e.g.:
std::list<std::pair<int, std::strin>>::iterator it(table[i].begin()), end(table[i].end());
for (; it != end; ++it) {
std::cout << "int=" << it->first << " string=" << it->second << "\n";
}
"If this is not a valid form, please let me know"
Yes its not valid, I think you meant :-
std::vector< std::list<std::pair<std::string, int> > > table ;
How to access ?
Something like :-
typedef std::pair<std::string , int> ele;
std::list<ele>::iterator it = table[i].begin(); //for ith table
for(;it!=table[i].end();++it)
std::cout<<it->first<<" "<<it->second;

predicate for a map from string to int

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that only appear once, will be ordered to be at the beginning, then the words that appeared twice 7 so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If some one could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}
std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
You can't resort a map, it's order is predefined (by default, from std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which will be ordered on the key type (int, the number of occurences), which will give you the order that you want, without having to define a predicate.
You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
A map has its keys sorted, not its values. That's what makes the map efficent. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)
As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.
You probably want to transform map<string,int> to vector<pair<const string, int> > then sort the vector on the int member.
You could do
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
Now
std::vector< std::map<std::string,int>::value_type > byCount;
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );

Make Map Key Sorted According To Insert Sequence

Without help from additional container (like vector), is it possible that I can make map's key sorted same sequence as insertion sequence?
#include <map>
#include <iostream>
using namespace std;
int main()
{
map<const char*, int> m;
m["c"] = 2;
m["b"] = 2;
m["a"] = 2;
m["d"] = 2;
for (map<const char*, int>::iterator begin = m.begin(); begin != m.end(); begin++) {
// How can I get the loop sequence same as my insert sequence.
// c, b, a, d
std::cout << begin->first << std::endl;
}
getchar();
}
No. A std::map is a sorted container; the insertion order is not maintained. There are a number of solutions using a second container to maintain insertion order in response to another, related question.
That said, you should use std::string as your key. Using a const char* as a map key is A Bad Idea: it makes it near impossible to access or search for an element by its key because only the pointers will be compared, not the strings themselves.
No. std::map<Key, Data, Compare, Alloc> is sorted according to the third template parameter Compare, which defaults to std::less<Key>. If you want insert sequence you can use std::list<std::pair<Key, Data> >.
Edit:
As was pointed out, any sequential STL container would do: vector, deque, list, or in this particular case event string. You would have to decide on the merits of each.
Consider using a boost::multi_index container instead of a std::map. You can put both an ordered map index and an unordered sequential index on your container.