Below is code for getting the top K frequent elements from an array. The code is correct, but I'm confused about what the comparator is doing here. Why is it "p1.second > p2.second" and not "p1.second < p2.second"? Shouldn't the pair with the lower count be the one at the top of the heap? Please help!
class Solution {
struct compare {
bool operator() (pair<int, int>p1, pair<int, int>p2) {
return p1.second > p2.second;
}
};
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
int n = nums.size();
unordered_map<int, int>m;
for (int i = 0; i < n; i++) {
m[nums[i]]++;
}
priority_queue<pair<int, int>, vector<pair<int, int>>, compare>pq;
for (auto it = m.begin(); it != m.end(); it++) {
pq.push(make_pair(it->first, it->second));
if (pq.size() > k)
pq.pop();
}
vector<int>v;
while (!pq.empty()) {
pair<int, int>p = pq.top();
v.push_back(p.first);
pq.pop();
}
return v;
}
};
By default, std::priority_queue uses the std::less comparator. In that case, pop() removes the largest element.
However, in your case you want to keep the k largest elements in the queue, and pop() the smallest one (to discard it). To do that, you need to reverse the sense of the comparison.
The priority queue maintains a heap over the underlying container, and top() returns the element that the comparator orders last. With > as the comparator the ordering is descending, so the element with the lowest frequency is at the top and is popped first.
The comparator function is passed as an argument when we want to build the heap in a customized way. One node of your heap is storing two values, the element, and its frequency. Since you are using pair<int, int>, it means the first value of the pair is the element itself and the second value is its frequency.
Now, inside the comparator function, you compare two pair<int, int> values by their second member, so the heap is ordered by frequency. Note that std::priority_queue places at the top the element that the comparator orders last, so with > the pair with the smallest frequency ends up at the top.
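To make the effect of the > comparator concrete, here is a minimal sketch of the "keep the k largest counts" pattern. The names CountGreater and keepKLargest are hypothetical and not part of the original code; the input is assumed to be a list of (value, count) pairs.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Compares pairs by their second member (the count). std::priority_queue
// exposes at top() the element that the comparator orders last, so using >
// here makes the pair with the SMALLEST count the top element.
struct CountGreater {
    bool operator()(const std::pair<int, int>& a,
                    const std::pair<int, int>& b) const {
        return a.second > b.second;
    }
};

// Hypothetical helper: returns the values of the k pairs with the largest
// counts, in increasing order of count.
std::vector<int> keepKLargest(const std::vector<std::pair<int, int>>& items,
                              std::size_t k) {
    std::priority_queue<std::pair<int, int>,
                        std::vector<std::pair<int, int>>,
                        CountGreater> pq;
    for (const auto& item : items) {
        pq.push(item);
        if (pq.size() > k) {
            pq.pop();  // discards the pair with the smallest count seen so far
        }
    }
    std::vector<int> result;
    while (!pq.empty()) {
        result.push_back(pq.top().first);
        pq.pop();
    }
    return result;
}
```

With < instead of >, pop() would discard the pair with the largest count each time, and the queue would end up holding the k least frequent elements.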
Related
I was wondering if there's a way to sort my list of pairs based on the second element. Here is the code:
std::list<std::pair<std::string, unsigned int>> words;
words.push_back(std::make_pair("aba", 23));
words.push_back(std::make_pair("ab", 20));
words.push_back(std::make_pair("aBa", 15));
words.push_back(std::make_pair("acC", 8));
words.push_back(std::make_pair("aaa", 23));
I would like to sort my list words based on the integer element in decreasing order so that my list would be like:
<"aba", 23>,<"aaa", 23>,<"ab", 20>,<"aBa", 15>,<"acC", 8>
Also, is it possible to sort them by both the first and second element such that it sorts by the second elements first (by integer value), and then if there's two or more pairs with the same second element (i.e. same integer value), then it will sort those based on the first element in alphabetical order, then the first 2 pairs on my sorted list above would swap, so:
<"aaa", 23>,<"aba", 23>,<"ab", 20>,<"aBa", 15>,<"acC", 8>
I would like to sort my list words based on the integer element in decreasing order
The sorting predicate must return true if the first element (i.e., the first pair) passed precedes the second one in your established order:
words.sort([](auto const& a, auto const& b) {
return a.second > b.second;
});
Since you want to sort the list in decreasing order, the pair a will precede b if its second element (i.e., the int) is greater than b's second element.
Note that std::sort() doesn't work for sorting an std::list because it requires random access iterators but std::list only provides bidirectional iterators.
is it possible to sort them by both the first and second element such that it sorts by the second elements first (by integer value), and then if there's two or more pairs with the same second element (i.e. same integer value), then it will sort those based on the first element in alphabetical order
Assuming again decreasing order for the int element, just fall back to the first element of the pairs (the string) when both int elements are the same:
lst.sort([](auto const& a, auto const& b) {
if (a.second > b.second)
return true;
if (a.second < b.second)
return false;
return a.first < b.first;
});
or more concise thanks to std::tie():
lst.sort([](auto const& a, auto const& b) {
return std::tie(b.second, a.first) < std::tie(a.second, b.first);
});
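As a self-contained sketch of the two-key ordering (count descending, then word ascending), the function below applies it to a list like the one in the question. The alias WordList and the function name sortByCountThenWord are made up for illustration. Swapping a and b only for the second member is what makes that key descending while the first stays ascending.

```cpp
#include <list>
#include <string>
#include <tuple>
#include <utility>

using WordList = std::list<std::pair<std::string, unsigned int>>;

// Sorts by the integer in decreasing order; ties are broken by the string
// in increasing (byte-wise lexicographic) order.
void sortByCountThenWord(WordList& words) {
    words.sort([](const auto& a, const auto& b) {
        // b.second on the left and a.second on the right reverses the
        // direction of the integer comparison only.
        return std::tie(b.second, a.first) < std::tie(a.second, b.first);
    });
}
```

Note that std::string comparison is case-sensitive, so uppercase letters order before lowercase ones.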
std::list has a member function std::list::sort that is supposed to do the sorting.
One of its two overloads accepts custom comparison function:
template <class Compare>
void sort(Compare comp);
which you can use as follows:
words.sort([](const std::pair<std::string, unsigned int> &x,
const std::pair<std::string, unsigned int> &y)
{
return x.second > y.second;
});
I have a vector<tuple<string, float> > vector that stores the euclidean distance between two nodes, and the node's name. I need to sort this vector by the distance value first, and then by string value. This is so in the instance where two nodes have the same euclidean distance, their sorted position resolves to being done by alphabetical (lexicographical) order of the node names. At the moment, I have a custom sorting helper function that sorts the vector by float value first.
bool sort_second(const tuple<string, float>& a, const tuple<string, float>& b) {
return (get<1>(a) < get<1>(b));
}
and call sort(vec.begin(), vec.end(), sort_second). But when nodes foo and bar have the same distance, it's possible for foo to come before bar even though bar should be first. How would I go about sorting the vector a second time (or better yet, on the first pass) such that I don't mess up the value order that I already computed? Thanks
Use std::tie to sort as if tuples were reversed:
bool sort_reversed(const tuple<string, float>& a, const tuple<string, float>& b)
{
return std::tie(get<1>(a), get<0>(a)) < std::tie(get<1>(b), get<0>(b));
}
Or if you want it done manually:
bool sort_second(const tuple<string, float>& a, const tuple<string, float>& b)
{
return (get<1>(a) < get<1>(b)) ||
((get<1>(a) == get<1>(b)) && (get<0>(a) < get<0>(b)));
}
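For reference, a complete sketch of the single-pass sort might look like the following. The alias NodeDistance and the function name sortByDistanceThenName are assumptions introduced here, not names from the question.

```cpp
#include <algorithm>
#include <string>
#include <tuple>
#include <vector>

using NodeDistance = std::tuple<std::string, float>;

// Sorts by distance (element 1) ascending; nodes with equal distance are
// ordered by name (element 0) ascending.
void sortByDistanceThenName(std::vector<NodeDistance>& nodes) {
    std::sort(nodes.begin(), nodes.end(),
              [](const NodeDistance& a, const NodeDistance& b) {
                  // std::tie builds tuples of references, so nothing is copied.
                  return std::tie(std::get<1>(a), std::get<0>(a)) <
                         std::tie(std::get<1>(b), std::get<0>(b));
              });
}
```

Because both keys are handled in one comparator, no second sorting pass is needed and the distance order cannot be disturbed.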
Can be achieved simply by modifying the code:
bool sort_second(const tuple<string, float>& a, const tuple<string, float>& b) {
if(get<1>(a) == get<1>(b))
return get<0>(a) < get<0>(b);
return (get<1>(a) < get<1>(b));
}
I was recently in a similar situation, but to make things easier I used a different approach.
I copied all the vector elements into an array, sorted that array, and then copied the sorted elements back into the vector. Perhaps this gives you an idea :)
A std::priority_queue uses a std::vector as the default container. For sorting on the basis of the first element in a std::vector<pair<int, int>>, we need to define our own comparison function. This is what I understand.
Now, the following code returns the k most frequent elements in a non-empty array, in O(NlogK):
class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
if(nums.empty())
return vector<int>();
unordered_map< int, int > hashMap;
for(int i=0; i<nums.size(); i++)
hashMap[nums[i]]++;
priority_queue< pair< int, int >> pq;
vector< int > result;
unordered_map< int, int >::iterator it=hashMap.begin();
for(it=hashMap.begin(); it!=hashMap.end(); it++) {
//the first one is frequency and the second one is the value
pq.push(make_pair(it->second, it->first));
//the peculiar implementation below is because we want the code to be O(NlogK)
if(pq.size()>(hashMap.size()-k)) {
result.push_back(pq.top().second);
pq.pop();
}
}
return result;
}
};
This code works correctly and gets accepted by the judge - but how? The std::priority_queue, using a std::vector<pair<int, int>> as its underlying container must contain a custom comparison function so that it sorts correctly. So, how does it work?
Frankly, it works because it is designed to do so.
A few things:
a std::priority_queue employs std::less<T>, where T is the underlying sequence value type, as the default comparator when no override is specified.
std::less<T> invokes operator < against two T arguments, resolving to whatever best-fits and/or is available.
Therefore, if this works as you desired with no special override of the sequence type comparator, it must mean that there exists an operator < for std::pair<int,int> that wires this whole thing together.
And indeed there is. Checking the documentation for std::pair<T1,T2>, you'll find there is an operator < overload that effectively does this:
if (lhs.first < rhs.first)
return true;
else if (!(rhs.first < lhs.first))
return lhs.second < rhs.second;
else
return false;
Mind-play examples of how this works are left to the reader to think about.
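For readers who would rather run the examples than imagine them, here is a small sketch. pairLess restates the logic above by hand, and valuesInPopOrder pushes (frequency, value) pairs into a default std::priority_queue and records the order in which values come out; both names are hypothetical.

```cpp
#include <queue>
#include <utility>
#include <vector>

// Hand-written equivalent of the lexicographic operator< for pairs:
// compare first members, and only if they are equal compare second members.
bool pairLess(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) {
    if (lhs.first < rhs.first) return true;
    if (rhs.first < lhs.first) return false;
    return lhs.second < rhs.second;
}

// Pushes (frequency, value) pairs into a priority_queue with the default
// std::less comparator and returns the values in the order they are popped.
std::vector<int> valuesInPopOrder(const std::vector<std::pair<int, int>>& items) {
    std::priority_queue<std::pair<int, int>> pq;
    for (const auto& item : items) {
        pq.push(item);
    }
    std::vector<int> order;
    while (!pq.empty()) {
        order.push_back(pq.top().second);
        pq.pop();
    }
    return order;
}
```

Since the frequency is stored in first, the default comparison orders the heap by frequency, with the value itself acting only as a tie-breaker.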
I'm using std::multimap in this way
std::multimap<float, std::pair<int, int> > edges;
I want to sort it by the first float number, but later count how many int (the first one of <int, int>) are in this multimap.
for example,
I have element pairs (0.6001, <2,3>), (0.62, <2,4>), (0.63, <1,3>) in my multimap,
I want to count the number of <2,*> (it should be 2 here).
Is there a simpler way (something like edges.count()) than to get every element out and count?
Or is there another container that I could turn to?
Solution 1
I'll first store the values I need to count in a std::set and count them using the code given by jrok or johny.
Solution 2
I'll use a std::multimap to store the second and third element again and count.
Thank you both jrok and johny!
What about this?
std::multimap<float, std::pair<int, int> > edges;
typedef std::multimap<float, std::pair<int, int> >::value_type ElemT;
int value = 2;
int count =
std::count_if(edges.begin(), edges.end(),
[value](const ElemT& e) { return e.second.first == value; });
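Wrapped into a reusable function, the same idea could look like this sketch. The alias EdgeMap and the name countEdgesFrom are assumptions for illustration. This is still a linear scan over the whole multimap, since the container is keyed on the float and cannot look up the int directly.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>

using EdgeMap = std::multimap<float, std::pair<int, int>>;

// Counts the entries whose mapped pair has `value` as its first int.
// Runs in O(n): every element is visited once.
std::ptrdiff_t countEdgesFrom(const EdgeMap& edges, int value) {
    return std::count_if(edges.begin(), edges.end(),
                         [value](const EdgeMap::value_type& e) {
                             return e.second.first == value;
                         });
}
```

If this count is needed often, keeping a second container keyed on the int alongside the multimap, as the question's edit suggests, avoids the repeated scan.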
I'm a bit of a noob to iterators. I'm trying to create a priority_queue, sorted by vector length. (I.e., I want to pop off the longest vectors in order.)
This is the resource that I've been using:
http://www.cplusplus.com/reference/stl/priority_queue/priority_queue/
I tried this code, and it seems to do what I want:
// testing to make sure that a priority queue will always give me the longest vector
priority_queue< vector<int> > q;
vector<int> f;
f.push_back(1);
vector<int> g;
g.push_back(19);
g.push_back(80);
vector<int> y;
y.push_back(62);
y.push_back(10);
y.push_back(11);
q.push(f);
q.push(g);
q.push(y);
vector<int> out = q.top();
for (unsigned int i = 0; i < out.size(); i++) {
cout << out[i] << endl;
}
My questions:
1. Will this always give me the longest vector? This seems to be the case.
2. If not, what else should I do? The iterator syntax on the reference page is like... o_O
Thanks!!
No, the code doesn't do what you expect. It compares the vectors lexicographically rather than by length. To compare by length use a custom comparator:
struct LengthCompare {
bool operator() (const vector<int>& a, const vector<int>& b) const {
return a.size() < b.size();
}
};
priority_queue<vector<int>, vector<vector<int> >, LengthCompare> q;
Also note that your queue stores copies of the vectors, which might not be very efficient because it may copy them when it builds the heap. Store (smart) pointers instead.
priority_queues in C++ use a Comparison object to determine what is the greatest element. By default, this is the < (less-than) operator over the objects held in the priority_queue - so you need to know what < means over vectors. This page http://www.cplusplus.com/reference/stl/vector/operators/ has some information about that.
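The difference between the two orderings can be seen with a short sketch. The helper names topWithDefaultOrder and topByLength are hypothetical, and both assume a non-empty input. In the question's test, the longest vector {62, 10, 11} also happens to start with the largest first element, which is why the default ordering appeared to work.

```cpp
#include <queue>
#include <vector>

// Orders vectors by size, so the longest vector ends up at top().
struct LengthCompare {
    bool operator()(const std::vector<int>& a, const std::vector<int>& b) const {
        return a.size() < b.size();
    }
};

// Default comparator: std::less, i.e. lexicographic operator< on vectors.
// Precondition: vs is non-empty.
std::vector<int> topWithDefaultOrder(const std::vector<std::vector<int>>& vs) {
    std::priority_queue<std::vector<int>> q;
    for (const auto& v : vs) {
        q.push(v);
    }
    return q.top();
}

// Custom comparator: the longest vector is at the top.
// Precondition: vs is non-empty.
std::vector<int> topByLength(const std::vector<std::vector<int>>& vs) {
    std::priority_queue<std::vector<int>,
                        std::vector<std::vector<int>>,
                        LengthCompare> q;
    for (const auto& v : vs) {
        q.push(v);
    }
    return q.top();
}
```

For the input {1, 2, 3} and {9}, the default ordering puts {9} at the top because 9 > 1 decides the lexicographic comparison, while LengthCompare puts {1, 2, 3} at the top.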