predicate for a map from string to int - c++

I have this small program that reads a line of input & prints the words in it, with their respective number of occurrences. I want to sort the elements in the map that stores these values according to their occurrences. I mean, the words that appear only once should come first, then the words that appear twice, & so on. I know that the predicate should return a bool value, but I don't know what the parameters should be. Should it be two iterators to the map? If someone could explain this, it would be greatly appreciated. Thank you in advance.
#include<iostream>
#include<map>
#include<string>
using std::cout;
using std::cin;
using std::endl;
using std::string;
using std::map;
int main()
{
string s;
map<string,int> counters; //store each word & an associated counter
//read the input, keeping track of each word & how often we see it
while(cin>>s)
{
++counters[s];
}
//write the words & associated counts
for(map<string,int>::const_iterator iter = counters.begin();iter != counters.end();iter++)
{
cout<<iter->first<<"\t"<<iter->second<<endl;
}
return 0;
}

std::map is always sorted according to its key. You cannot sort the elements by their value.
You need to copy the contents to another data structure (for example std::vector<std::pair<string, int> >) which can be sorted.
Here is a predicate that can be used to sort such a vector. Note that sorting algorithms in C++ standard library need a "less than" predicate which basically says "is a smaller than b".
bool cmp(std::pair<string, int> const &a, std::pair<string, int> const &b) {
return a.second < b.second;
}
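To show how the copy-and-sort approach fits together, here is a minimal sketch. The names WordCount, by_count and sorted_by_count are made up for illustration, and std::stable_sort is my own choice (plain std::sort works too); it just keeps words with equal counts in the map's alphabetical order.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Illustrative alias: one (word, count) entry copied out of the map.
typedef std::pair<std::string, int> WordCount;

// "Less than" predicate: orders entries by ascending count.
bool by_count(const WordCount& a, const WordCount& b) {
    return a.second < b.second;
}

// Copies the map's entries into a vector and sorts the copy by count.
// stable_sort keeps words with equal counts in the map's key order.
std::vector<WordCount> sorted_by_count(const std::map<std::string, int>& counters) {
    std::vector<WordCount> result(counters.begin(), counters.end());
    std::stable_sort(result.begin(), result.end(), by_count);
    return result;
}
```

In the question's program you would call sorted_by_count(counters) after the reading loop and print the returned vector instead of the map.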

You can't re-sort a map; its order is predefined (by default, by std::less on the key type). The easiest solution for your problem would be to create a std::multimap<int, string> and insert your values there, then just loop over the multimap, which is ordered on its key type (int, the number of occurrences). That gives you the order you want without having to define a predicate.
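A rough sketch of the multimap idea, assuming the word counts are already in a std::map<std::string, int> as in the question. The function name index_by_count is hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>

// Builds a count -> word index from the word -> count map.
// A multimap is needed because several words can share the same count.
std::multimap<int, std::string> index_by_count(const std::map<std::string, int>& counters) {
    std::multimap<int, std::string> by_count;
    for (std::map<std::string, int>::const_iterator it = counters.begin();
         it != counters.end(); ++it) {
        // Swap the roles: the count becomes the key, the word the value.
        by_count.insert(std::make_pair(it->second, it->first));
    }
    return by_count;
}
```

Iterating over the result visits the words from the lowest count to the highest.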

You are not going to be able to do this with one pass with an std::map. It can only be sorted on one thing at a time, and you cannot change the key in-place. What I would recommend is to use the code you have now to maintain the counters map, then use std::max_element with a comparison function that compares the second field of each std::pair<string, int> in the map.
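For reference, a small sketch of the std::max_element call described here. Note that it only finds the single most frequent word; it does not produce a full ordering by count. Counters, count_less and most_frequent are names I made up for the example.

```cpp
#include <algorithm>
#include <map>
#include <string>

typedef std::map<std::string, int> Counters;

// Compares two map entries by their count (the second field).
bool count_less(const Counters::value_type& a, const Counters::value_type& b) {
    return a.second < b.second;
}

// Returns the most frequent word, or an empty string for an empty map.
std::string most_frequent(const Counters& counters) {
    Counters::const_iterator it =
        std::max_element(counters.begin(), counters.end(), count_less);
    return it == counters.end() ? std::string() : it->first;
}
```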

A map has its keys sorted, not its values. That's what makes the map efficient. You cannot sort it by occurrences without using another data structure (maybe a reversed index!)

As stated, it simply won't work -- a map always remains sorted by its key value, which would be the strings.
As others have noted, you can copy the data to some other structure, and sort by the value. Another possibility would be to use a Boost bimap instead. I've posted a demo of the basic idea previously.

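You probably want to copy the map<string,int> into a vector<pair<string, int> > and then sort the vector on the int member. (The element type cannot be the map's own value_type, pair<const string, int>, because std::sort has to assign elements and a pair with a const member is not assignable.)
For the comparison you could use
struct PairLessSecond
{
template< typename P >
bool operator()( const P& pairLeft, const P& pairRight ) const
{
return pairLeft.second < pairRight.second;
}
};
You can probably also construct all this somehow using a lambda with a bind.
Now
// copy the entries out of the map (the question's counters), then sort by count
std::vector< std::pair<std::string,int> > byCount( counters.begin(), counters.end() );
std::sort( byCount.begin(), byCount.end(), PairLessSecond() );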

Related

sorting std::map by value

Right now I have a map and I need to sort it by value(int), and then by key(string) if there is a tie. I know I would need to write a customized comparison function for this, however so far I haven't been able to make it work.
(I need to store my stuffs in a map since the strings are words and ints are the frequencies and I will need to 'find' the pairs by searching the keys later)
The std::map can only be sorted by key (string in your case).
If you need to sort it by value as well, you'd need to create a std::multimap with the int as key and the string as value, and populate it by iterating over the map.
Alternatively, you could also create a vector<pair<int,string>> that you populate by iteration over the map and just use std::sort().
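A sketch of the vector approach, assuming C++11 and a std::map<std::string, int> of frequencies. Because std::pair compares its first member and then its second, putting the int first gives "by value, then by key on ties" without a custom comparator. The function name by_value_then_key is hypothetical.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Returns (count, word) pairs sorted by count, then by word on ties.
// The original map is left untouched, so lookups by key still work.
std::vector<std::pair<int, std::string>> by_value_then_key(
        const std::map<std::string, int>& freq) {
    std::vector<std::pair<int, std::string>> out;
    out.reserve(freq.size());
    for (const auto& kv : freq) {
        out.emplace_back(kv.second, kv.first);  // count first, word second
    }
    // std::pair's operator< compares first, then second.
    std::sort(out.begin(), out.end());
    return out;
}
```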
You can use a std::multiset<std::pair<int, std::string>>
With the information given, it's a bit of a guessing game, but unless you are shuffling massive amounts of data, this may do.
using entry = std::pair<std::string, int>;
using CompareFunc = bool(*)(const entry&, const entry&);
using sortset = std::set<entry, CompareFunc>;
sortset bv(themap.begin(), themap.end(), [](const entry& a, const entry& b){ return a.second!=b.second ? a.second<b.second : a.first<b.first; });
for(const auto& d : bv) {
//
}

How to remove duplicates from a vector of pair<int, Object>

This is what I am trying right now. I made a comparison function:
bool compare(const std::pair<int, Object>& left, const std::pair<int, Object>& right)
{
return (left.second.name == right.second.name) && (left.second.time == right.second.time) &&
(left.second.value == right.second.value);
}
After I add an element I call std::unique to filter duplicates:
data.push_back(std::make_pair(index, obj));
data.erase(std::unique(data.begin(), data.end(), compare), data.end());
But it seems that this doesn't work. And I don't know what the problem is.
From my understanding std::unique should use the compare predicate.
How should I update my code to make this work ?
I am using C++03.
edit:
I have tried to sort it too, but it still doesn't work.
bool compare2(const std::pair<int, Object>& left, const std::pair<int, Object>& right)
{
return (left.second.time< right.second.time);
}
std::sort(simulatedLatchData.begin(), simulatedLatchData.end(), compare2);
std::unique requires the range passed to it to have all the duplicate elements next to one another in order to work.
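You can use std::sort on the range before you call unique to achieve that, as sorting automatically groups duplicates. The sort order has to be consistent with the equality predicate, though: sorting on time alone does not guarantee that elements that are equal in name, time and value end up adjacent.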
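Roughly, the sort-then-unique combination could look like the sketch below. The Object struct is a stand-in I invented (the question does not show its definition, so the field types are assumptions), and the ordering compares the same three fields that the equality predicate checks, so equal elements are guaranteed to be adjacent. It sticks to C++03 features.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Object type.
struct Object {
    std::string name;
    int time;
    int value;
};

typedef std::pair<int, Object> Entry;

// Strict weak ordering over the same fields that define equality below.
bool object_less(const Entry& l, const Entry& r) {
    if (l.second.name != r.second.name) return l.second.name < r.second.name;
    if (l.second.time != r.second.time) return l.second.time < r.second.time;
    return l.second.value < r.second.value;
}

// Two entries are duplicates when name, time and value all match.
bool object_equal(const Entry& l, const Entry& r) {
    return l.second.name == r.second.name && l.second.time == r.second.time &&
           l.second.value == r.second.value;
}

// Sorts so duplicates become adjacent, then erases the repeats.
void remove_duplicates(std::vector<Entry>& data) {
    std::sort(data.begin(), data.end(), object_less);
    data.erase(std::unique(data.begin(), data.end(), object_equal), data.end());
}
```

Keep in mind that this reorders the vector, so the original insertion order is lost, and which index survives for a group of duplicates is not specified.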
Sorting and filtering is nice, but since you never want any duplicate, why not use std::set?
And while we're at it, these pairs look suspiciously like key-values, so how about std::map?
If you want to keep only unique objects, then use an appropriate container type, such as a std::set (or std::map). For example
bool operator<(object const&, object const&);
std::set<object> data;
object obj = new_object(/*...*/);
data.insert(obj); // will only insert if unique

Working with a vector of pair vectors?

I've been searching around on Google but I didn't find what I need. I'm trying to create a vector that lets me store 3 variables per element (later I'll need 4), access them, and sort by them.
I'm implementing the vector as follows for 3 variables:
std::vector<std::pair<std::string, std::pair<int, double> > > chromosomes;
To add information (variables), I'm doing:
chromosomes.emplace_back(dirp->d_name, std::make_pair(WSA, fault_percent));
How can I access each parameter and sort them based on the WSA and fault coverage? As in a vector of pair that I can do that using members first and second.
And for 4 variables, it would be as follows?
std::vector<std::pair<std::pair<std::string, std::string>, std::pair<int, double> > > chromosomes;
chromosomes.emplace_back( std::make_pair(dirp->d_name, x), std::make_pair(WSA, fault_percent));
As suggested here I think you should be using a vector of tuple<string, int, double>s or tuple<string, string, int, double>s respectively.
There is a defined tuple::operator< which uses the less-than operator of each of its component types, moving left to right. If a simple comparison of each element is sufficient then all you'll need to do is call sort:
sort(chromosomes.begin(), chromosomes.end());
If the tuple::operator< does not provide a sufficient comparison for your needs, sort provides an overload which takes a comparison lambda. Your lambda would need to do the following:
Take in 2 const references to the tuples
Return true if the first tuple is strictly smaller than the second tuple
Return false if the first tuple is greater or equal to the second tuple
In the end your call would look something like this:
sort(chromosomes.begin(), chromosomes.end(), [](const auto& lhs, const auto& rhs) {
// Your comparison between the two goes here
});
If you're not familiar with working with tuples you'll need to use the templated get method to extract either by index or type in the cases where there is not a duplicate type contained by the tuple.
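As an illustration of the tuple approach, here is a sketch that sorts on the second and third elements only. The alias ChromosomeTuple and the function name are made up, and the element order (name, WSA, fault_percent) is assumed from the question.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

// (name, WSA, fault_percent), mirroring the three values in the question.
using ChromosomeTuple = std::tuple<std::string, int, double>;

// Sorts by WSA ascending, then by fault_percent ascending on ties.
void sort_by_wsa_then_fault(std::vector<ChromosomeTuple>& chromosomes) {
    std::sort(chromosomes.begin(), chromosomes.end(),
              [](const ChromosomeTuple& lhs, const ChromosomeTuple& rhs) {
                  // std::get<N> extracts element N; std::tie builds tuples of
                  // references, which are compared left to right.
                  return std::tie(std::get<1>(lhs), std::get<2>(lhs)) <
                         std::tie(std::get<1>(rhs), std::get<2>(rhs));
              });
}
```

The name (element 0) is ignored by the comparison, so elements that tie on both WSA and fault_percent may end up in either order.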
First to access to the different elements:
for (auto& x :chromosomes)
cout <<x.first<<": "<<x.second.first<<" "<<x.second.second<<endl;
Next, to sort the elements on WSA:
sort(chromosomes.begin(), chromosomes.end(),
[](auto &x, auto &y) { return x.second.first<y.second.first;});
If you want to sort on several criteria, for example WSA and fault_percent, you just have to change the lambda function for comparison:
sort(chromosomes.begin(), chromosomes.end(),
[](auto &x, auto &y) { return x.second.first<y.second.first
|| (x.second.first==y.second.first
&& x.second.second<y.second.second );});
Here is an online demo
Remark
Now what puzzles me, is why you want to use pairs of pairs or even tuples, when you could use a clean struct which would be easier to store/retrieve, and access its members:
struct Chromosome {
string name;
int WSA;
double fault_percent;
};
vector <Chromosome> chromosomes;
It would be much more readable and maintainable this way:
sort(chromosomes.begin(), chromosomes.end(),
[](auto &x, auto &y) { return x.WSA<y.WSA
|| (x.WSA==y.WSA && x.fault_percent<y.fault_percent );});
It seems like you need a table-like data structure, that allows sorting by multiple columns. C++ isn't the easiest language to manipulate table/matrix data structures in, but here's a few links to help you get started.
An example Table class:
How to dynamically sort data by arbitrary column(s)
A vector/tuple solution, which is a slightly cleaner version of what you're currently working on:
sorting table in place using stl sort
A lengthy discussion of this problem, which might give you some additional ideas:
https://softwareengineering.stackexchange.com/questions/188130/what-is-the-best-way-to-store-a-table-in-c

stdext::hash_map unclear hash function

#include <iostream>
#include <hash_map>
using namespace stdext;
using namespace std;
class CompareStdString
{
public:
bool operator ()(const string & str1, const string & str2) const
{
return str1.compare(str2) < 0;
}
};
int main()
{
hash_map<string, int, hash_compare<string, CompareStdString> > Map;
Map.insert(make_pair("one", 1));
Map.insert(make_pair("two", 2));
Map.insert(make_pair("three", 3));
Map.insert(make_pair("four", 4));
Map.insert(make_pair("five", 5));
hash_map<string, int, hash_compare<string, CompareStdString> > :: iterator i;
for (i = Map.begin(); i != Map.end(); ++i)
{
i -> first; // they are ordered as three, five, two, four, one
}
return 0;
}
I want to use hash_map with std::string as the key. But when I insert the pairs, the order gets mixed up. Why does the order not match the insertion order? How can I get the order one, two, three, four, five?
Why does the order not match the insertion order?
That's because a stdext::hash_map (and the platform-independent standard library version std::unordered_map from C++11) doesn't maintain/guarantee any reasonable order of its elements, not even insertion order. That's because it is a hashed container, with the individual elements' position based on their hash value and the size of the container. So you won't be able to maintain a reasonable order for your data with such a container.
What you can use to keep your elements in a guaranteed order is a good old std::map. But this also doesn't order elements by insertion order, but by the order induced by the comparison predicate (which can be configured to respect insertion time, but that would be quite unintuitive and not that easy at all).
For anything else you won't get around rolling your own (or searching for other libraries; I don't know if boost has something like that). For example, add all elements to a linear std::vector/std::list for insertion-order iteration and maintain an additional std::(unordered_)map pointing into that vector/list for O(1)/O(log n) retrieval if necessary.
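A minimal sketch of that roll-your-own idea, assuming C++11. The class name InsertionOrderedMap and its interface are invented for illustration. It stores vector indices rather than pointers in the lookup table, since pointers into a std::vector are invalidated when it grows, and it does not support erasing.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: iterates in insertion order, looks up in O(1) on average.
class InsertionOrderedMap {
public:
    // Inserts a new key or overwrites the value of an existing one.
    void set(const std::string& key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            items_[it->second].second = value;  // key keeps its original position
            return;
        }
        index_[key] = items_.size();  // position the new entry will occupy
        items_.emplace_back(key, value);
    }

    // Returns a pointer to the value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : &items_[it->second].second;
    }

    // Entries in the order they were first inserted.
    const std::vector<std::pair<std::string, int>>& items() const { return items_; }

private:
    std::vector<std::pair<std::string, int>> items_;       // insertion order
    std::unordered_map<std::string, std::size_t> index_;   // key -> index into items_
};
```

With the question's data, calling set("one", 1) through set("five", 5) and then looping over items() visits the keys in the order one, two, three, four, five.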

Make Map Key Sorted According To Insert Sequence

Without the help of an additional container (like a vector), is it possible to make a map keep its keys in insertion order?
#include <map>
#include <iostream>
using namespace std;
int main()
{
map<const char*, int> m;
m["c"] = 2;
m["b"] = 2;
m["a"] = 2;
m["d"] = 2;
for (map<const char*, int>::iterator begin = m.begin(); begin != m.end(); begin++) {
// How can I get the loop sequence same as my insert sequence.
// c, b, a, d
std::cout << begin->first << std::endl;
}
getchar();
}
No. A std::map is a sorted container; the insertion order is not maintained. There are a number of solutions using a second container to maintain insertion order in response to another, related question.
That said, you should use std::string as your key. Using a const char* as a map key is A Bad Idea: it makes it near impossible to access or search for an element by its key because only the pointers will be compared, not the strings themselves.
No. std::map<Key, Data, Compare, Alloc> is sorted according to the third template parameter Compare, which defaults to std::less<Key>. If you want insert sequence you can use std::list<std::pair<Key, Data> >.
Edit:
As was pointed out, any sequential STL container would do: vector, deque, list, or in this particular case even string. You would have to decide on the merits of each.
Consider using a boost::multi_index container instead of a std::map. You can put both an ordered map index and an unordered sequential index on your container.