When I use this code to remove duplicates, I get "invalid operands to binary expression" errors. I think this is because I'm using a vector of a struct, but I'm not sure. I have Googled my question and I get this code over and over again, which suggests that it is right, but it isn't working for me.
std::sort(vec.begin(), vec.end());
vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
Any help will be appreciated.
EDIT:
fileSize = textFile.size();
vector<wordFrequency> words (fileSize);
int index = 0;
for(int i = 0; i <= fileSize - 1; i++)
{
for(int j = 0; j < fileSize - 1; j++)
{
if(string::npos != textFile[i].find(textFile[j]))
{
words[i].Word = textFile[i];
words[i].Times = index++;
}
}
index = 0;
}
sort(words.begin(), words.end());
words.erase(unique(words.begin(), words.end(), words.end()));
First problem.
unique used wrongly
unique(words.begin(), words.end(), words.end()));
You are calling the three-argument form of unique, which takes a start, an end, and a predicate. The compiler passes words.end() as the predicate, where the function expects your comparison functor. An iterator isn't one, and you enter the happy world of C++ error messages.
Second problem.
either use the predicate form or define an ordering
See the definitions of sort and unique.
You can either provide a
bool operator< (wordFrequency const &lhs, wordFrequency const &rhs)
{
return lhs.val_ < rhs.val_;
}
, but only do this if a less-than operation makes sense for that type, i.e. if there is a natural ordering, and if it's not just arbitrary (maybe you want other sort orders in the future?).
In the general case, use the predicate forms. Note that sort expects a less-than predicate, while unique expects an equality predicate:
auto pred = [](wordFrequency const &lhs, wordFrequency const &rhs)
{
    return lhs.foo < rhs.foo;
};
auto same = [](wordFrequency const &lhs, wordFrequency const &rhs)
{
    return lhs.foo == rhs.foo;
};
sort (words.begin(), words.end(), pred);
words.erase (unique (words.begin(), words.end(), same), words.end());
If you can't use C++11, write a functor:
struct FreqAscending { // pre-C++11 code derived from std::binary_function to make it adaptable (removed in C++17)
    bool operator() (wordFrequency const &lhs, wordFrequency const &rhs) const
    { ... };
};
I guess in your case ("frequency of words"), operator< makes sense.
Also note vector::erase: the single-iterator form removes only the element indicated by the passed iterator. std::unique returns an iterator to the new end of the range, and I doubt you want to remove just that one element. To drop all the leftover duplicates, erase from the new end to the old end:
words.erase (unique (words.begin(), words.end(), same),
             words.end());
Third problem.
If you only need top ten, don't sort
C++ comes with different sorting algorithms (based on this). For top 10, you can use:
nth_element: gives you the top elements without sorting them
partial_sort: gives you the top elements, sorted
This wastes less watts on your CPU, will contribute to overall desktop performance, and your laptop batteries last longer so can do even more sorts.
The most probable answer is that operator< is not declared for the type of object vec contains. Have you overloaded it? It should look something like this:
bool operator<(const YourType& _a, const YourType& _b)
{
//... comparison check here
}
That code should work, as std::unique moves the unique elements to the front and returns an iterator to the start of the leftover elements, which erase then removes. What type does your vector contain? Perhaps you need to implement the equality operator.
I'd like to sort m_correlationValues in descending order and get ids of the sorted list. I've got this error. I'll appreciate your help.
no match for 'operator=' (operand types are 'std::vector<std::pair<int, float> >' and 'void')
return idx_correlation.second; });
void MatrixCanvas::SortMatrix()
{
int naxes = (int) m_correlationData.size();
std::vector<std::pair<int,float>> idx_correlations;
std::vector<std::pair<int,float>> sorted;
std::vector<int> idxs(naxes);
for(int idx =0; idx<naxes;idx++){
idx_correlations[idx] = std::make_pair(idx, m_correlationValues[chosen_row_id][idx]);}
// Wrong
sorted = std::sort(idx_correlations.begin(),
idx_correlations.end(),
[](std::pair<int,float> &idx_correlation){
return idx_correlation.second; });
// this will contain the order:
for(int i =0; i<naxes;i++)
idxs[i] = sorted[i].first;
}
You have two problems:
sort does not return a copy of the sorted range. It modifies the range provided. If you want the original to be left alone, make a copy of it first and then sort it.
std::sort's third argument is a comparator between two values, which has the meaning of "less than". That is, "does a come before b?" For keeping the line short, I replaced your pair<...> type with auto in the lambda, but it'll be deduced to "whatever type of thing" is being sorted.
Note, if you want decreasing, just change < to > in the lambda when it compares the two elements.
Possible fix:
auto sorted = idx_correlations; // full copy
std::sort(sorted.begin(),
sorted.end(),
[](auto const & left, auto const & right) {
return left.second < right.second; });
After that, sorted will be a sorted vector and idx_correlations will be left unchanged. Of course, if you don't mind modifying your original collection, there's no need to make this copy (and you can take begin/end of idx_correlations).
So the main issue I can see in your code is that you're expecting std::sort to return the sorted vector, and this is NOT how it works.
https://en.cppreference.com/w/cpp/algorithm/sort
The solution in your case is to copy the original vector, i.e. sorted = idx_correlations, and then sort the copy.
sorted = idx_correlations;
std::sort( sorted.begin(), sorted.end(), your_comparator... );
This will do the trick while also maintaining the original vector.
Update: another issue is that your comparator must take TWO arguments, not one (the two elements being compared by the sort).
The other answers covered proper use of std::sort. I wish to show C++20's std::ranges::sort, which has projection functionality that is close to what you tried to do:
std::vector<std::pair<int, float>> idx_correlations;
.....
auto sorted = idx_correlations;
std::ranges::sort(sorted, std::greater{}, &std::pair<int, float>::second);
https://godbolt.org/z/4rzzqW9Gx
So I encountered this very weird behavior on an edge case when sorting a vector using a custom comparator.
When running this code, it will not halt, and goes forever:
int main() {
auto comp = [](int lhs, int rhs) {
return true;
};
vector<int> vec{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
sort(vec.begin(), vec.end(), comp);
for (int num : vec)
cout << num;
return 0;
}
however, when I change the true to false, it works perfectly.
auto comp = [](int lhs, int rhs) {
return false;
};
What's weirder, when I decrease the number of 0's to be sorted, it also works perfectly. It works with 16 or fewer 0's; when I add one more 0 to make it 17, the program will not halt again. (Will g++ switch to another sorting algorithm if the length exceeds 16?)
vector<int> vec{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
Why is this the case? Am I missing some important concepts in C++'s sort() function?
This comparator:
auto comp = [](int lhs, int rhs) {
return true;
};
violates the requirement of std::sort that the comparator must establish a strict-weak-ordering. Note that this function returns true regardless of the order of arguments, implying that 2 elements can both be less than the other, which doesn't really make sense. Violating this requirement of std::sort invokes undefined behavior (this is sufficient to explain the differing behavior you see with different vector sizes).
On the other hand, this comparator:
auto comp = [](int lhs, int rhs) {
return false;
};
is perfectly fine. It basically says that no elements compare less than any other (i.e. they are all equivalent). This satisfies strict-weak-ordering, so std::sort will work just fine with it.
Of course, std::sort won't do anything useful with the second comparator, since all the elements are already "sorted". This might reorder the elements though; but if you use std::stable_sort then the original range is guaranteed to be unchanged.
Suppose we have a vector of pairs:
std::vector<std::pair<A,B>> v;
where for type A only equality is defined:
bool operator==(A const & lhs, A const & rhs) { ... }
How would you sort it so that all pairs with the same first element end up next to each other? To be clear, the output I hope to achieve should be the same as what something like this produces:
std::unordered_multimap<A,B> m(v.begin(),v.end());
std::copy(m.begin(),m.end(),v.begin());
However I would like, if possible, to:
Do the sorting in place.
Avoid the need to define a hash function for equality.
Edit: additional concrete information.
In my case the number of elements isn't particularly big (I expect N = 10~1000), though I have to repeat this sorting many times (~400) as part of a bigger algorithm, and the datatype known as A is pretty big (it contains, among other things, an unordered_map with ~20 std::pair<uint32_t,uint32_t> in it, which is the structure preventing me from inventing an ordering and making it hard to build a hash function).
First option: cluster() and sort_within()
The handwritten double loop by @MadScienceDreams can be written as a cluster() algorithm of O(N * K) complexity with N elements and K clusters. It repeatedly calls std::partition (using C++14 style with generic lambdas, easily adaptable to C++11, or even C++98 style by writing your own function objects):
template<class FwdIt, class Equal = std::equal_to<>>
void cluster(FwdIt first, FwdIt last, Equal eq = Equal{})
{
for (auto it = first; it != last; /* increment inside loop */)
it = std::partition(it, last, [=](auto const& elem){
return eq(elem, *it);
});
}
which you call on your input vector<std::pair> as
cluster(begin(v), end(v), [](auto const& L, auto const& R){
return L.first == R.first;
});
The next algorithm to write is sort_within which takes two predicates: an equality and a comparison function object, and repeatedly calls std::find_if_not to find the end of the current range, followed by std::sort to sort within that range:
template<class RndIt, class Equal = std::equal_to<>, class Compare = std::less<>>
void sort_within(RndIt first, RndIt last, Equal eq = Equal{}, Compare cmp = Compare{})
{
for (auto it = first; it != last; /* increment inside loop */) {
auto next = std::find_if_not(it, last, [=](auto const& elem){
return eq(elem, *it);
});
std::sort(it, next, cmp);
it = next;
}
}
On an already clustered input, you can call it as:
sort_within(begin(v), end(v),
[](auto const& L, auto const& R){ return L.first == R.first; },
[](auto const& L, auto const& R){ return L.second < R.second; }
);
Live Example that shows it for some real data using std::pair<int, int>.
Second option: user-defined comparison
Even if there is no operator< defined on A, you might define it yourself. Here, there are two broad options. First, if A is hashable, you can define
bool operator<(A const& L, A const& R)
{
return std::hash<A>()(L) < std::hash<A>()(R);
}
and write std::sort(begin(v), end(v)) directly. You will have O(N log N) calls to std::hash if you don't want to cache all the unique hash values in a separate storage.
Second, if A is not hashable, but does have data member getters x(), y() and z() that uniquely determine equality on A (and that return references, since std::tie binds to lvalues), you can do
bool operator<(A const& L, A const& R)
{
return std::tie(L.x(), L.y(), L.z()) < std::tie(R.x(), R.y(), R.z());
}
Again you can write std::sort(begin(v), end(v)) directly.
If you can come up with a function that assigns a unique number to each distinct element, you can build a secondary array of these numbers and sort it together with the primary array, for example with merge sort.
In this case you need a function that maps each distinct element to a distinct number, i.e. a hash function without collisions. I think this should not be a problem.
As for the asymptotics: if the hash function is O(1), building the secondary array is O(N) and sorting it together with the primary array is O(N log N), for a total of O(N + N log N) = O(N log N).
The downside of this solution is that it requires twice the memory.
In short, the idea is to translate your elements into values that can be compared quickly.
An in place algorithm is
for (int i = 0; i < n-2; i++)
{
    for (int j = i+2; j < n; j++)
    {
        if (v[j].first == v[i].first)
        {
            std::swap(v[j],v[i+1]);
            i++;
        }
    }
}
There is probably a more elegant way to write the loop, but this is O(n*m), where n is the number of elements and m is the number of keys. So if m is much smaller than n (with a best case being that all the keys are the same), this can be approximated by O(n). Worst case, the number of key ~= n, so this is O(n^2). I have no idea what you expect for the number of keys, so I can't really do the average case, but it is most likely O(n^2) for the average case as well.
For a small number of keys, this may work faster than unordered multimap, but you'll have to measure to find out.
Note: the order of clusters is completely random.
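Edit: (much more efficient in the partially-clustered case, doesn't change complexity; it also matters for correctness, because the first version can leave a cluster split when v[i+1] already has the same key as v[i], for example on keys a, a, b, a)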
for (int i = 0; i < n-2; i++)
{
    for(;i<n-2 && v[i+1].first==v[i].first; i++){}
    for (int j = i+2; j < n; j++)
    {
        if (v[j].first == v[i].first)
        {
            std::swap(v[j],v[i+1]);
            i++;
        }
    }
}
Edit 2: At /u/MrPisarik's comment, removed redundant i check in inner loop.
I'm surprised no one has suggested the use of std::partition yet. It makes the solution nice, elegant, and generic:
template<typename BidirIt, typename BinaryPredicate>
void equivalence_partition(BidirIt first, BidirIt last, BinaryPredicate p) {
using element_type = typename std::decay<decltype(*first)>::type;
if(first == last) {
return;
}
auto new_first = std::partition
(first, last, [=](element_type const &rhs) { return p(*first, rhs); });
equivalence_partition(new_first, last, p);
}
template<typename BidirIt>
void equivalence_partition(BidirIt first, BidirIt last) {
using element_type = typename std::decay<decltype(*first)>::type;
equivalence_partition(first, last, std::equal_to<element_type>());
}
Example here.
Given a std::vector< std::string > that is ordered by string length, how can I find a range of strings of equal length?
I am looking for an idiomatic solution in C++.
I have found this solution:
// any idea for a better name? (English is not my mother tongue)
bool less_length( const std::string& lhs, const std::string& rhs )
{
return lhs.length() < rhs.length();
}
std::vector< std::string > words;
words.push_back("ape");
words.push_back("cat");
words.push_back("dog");
words.push_back("camel");
size_t length = 3;
// this will give a range from "ape" to "dog" (included):
std::equal_range( words.begin(), words.end(), std::string( length, 'a' ), less_length );
Is there a standard way of doing this (beautifully)?
I expect that you could write a comparator as follows:
struct LengthComparator {
    bool operator()(const std::string &lhs, std::string::size_type rhs) const {
        return lhs.size() < rhs;
    }
    bool operator()(std::string::size_type lhs, const std::string &rhs) const {
        return lhs < rhs.size();
    }
    bool operator()(const std::string &lhs, const std::string &rhs) const {
        return lhs.size() < rhs.size();
    }
};
Then use it:
std::equal_range(words.begin(), words.end(), length, LengthComparator());
I expect the third overload of operator() is never used, because the information it provides is redundant. The range has to be pre-sorted, so there's no point the algorithm comparing two items from the range, it should be comparing items from the range against the target you supply. But the standard doesn't guarantee that. [Edit: and defining all three means you can use the same comparator class to put the vector in order in the first place, which might be convenient].
This works for me (gcc 4.3.4), and while I think this will work on your implementation too, I'm less sure that it is actually valid. It implements the comparisons that the description of equal_range says will be true of the result, and 25.3.3/1 doesn't require that the template parameter T must be exactly the type of the objects referred to by the iterators. But there might be some text I've missed which adds more restrictions, so I'd do more standards-trawling before using it in anything important.
Your way is definitely not unidiomatic, but having to construct a dummy string with the target length does not look very elegant and it isn't very readable either.
I'd perhaps write my own helper function (i.e. string_length_range), encapsulating a plain, simple loop through the string list. There is no need to use std:: tools for everything.
std::equal_range does a binary search. Which means the words vector must be sorted, which in this case means that it must be non-decreasing in length.
I think your solution is a good one, definitely better than writing your own implementation of binary search which is notoriously error prone and hard to prove correct.
If doing a binary search was not your intent, then I agree with Alexander. Just a simple for loop through the words is the cleanest.
I have a list of objects ("Move"s in this case) that I want to sort based on their calculated evaluation. So, I have the List, and a bunch of numbers that are "associated" with an element in the list. I now want to sort the List elements with the first element having the lowest associated number, and the last having the highest. Once the items are ordered I can discard the associated number. How do I do this?
This is what my code looks like (kind of):
list<Move> moves = board.getLegalMoves(board.turn);
for(i = moves.begin(); i != moves.end(); ++i)
{
//...
a = max; // <-- number associated with current Move
}
I would suggest a Schwartzian transform sort. Make a new vector (I recommend vector for more efficient sorting) of pairs of the associated value, and a pointer to its item. Sort the vector of pairs and then regenerate the list from the sorted vector. Since operator< is defined on a std::pair to be comparison by the first item of the pair and then the second, you will get a proper ordering.
Example:
#include <algorithm> // gives you std::sort
#include <list> // gives you std::list
#include <utility> // gives you std::pair
#include <vector> // gives you std::vector
typedef double CostType;
typedef std::pair<CostType, Move*> Pair;
// Create the vector of pairs
std::vector<Pair> tempVec;
tempVec.reserve(moves.size());
for (std::list<Move>::iterator i = moves.begin(); i != moves.end(); ++i)
{
CostType cost = calcCost(*i);
Move* ptrToI = &(*i);
tempVec.push_back(Pair(cost, ptrToI));
}
// Now sort 'em
std::sort(tempVec.begin(), tempVec.end());
// Regenerate your original list in sorted order by copying the original
// elements from their pointers in the Pair.
std::list<Move> sortedMoves;
for (std::vector<Pair>::iterator i = tempVec.begin(); i != tempVec.end(); ++i)
{
sortedMoves.push_back(*(i->second));
}
Note that you will need a calcCost function that I have assumed here. This approach has an advantage over creating a comparison function if your comparison value calculation is time consuming. This way, you only pay the cost for calculating the comparison N times instead of 2 * N * log(N).
You could make a comparison function that compares the two elements in the way that you would like.
bool compare_m (const Move &first,const Move &second)
{
if (first.thing_you_are_comparing_on() < second.thing_you_are_comparing_on()) return true;
else return false;
}
Where "thing_you_are_comparing_on" is some member of the Move class that gives you the ordering you want. We use const here to make sure that we are only comparing and not actually changing the objects in the comparison function. You can then call the sort method on the list with compare_m as the comparison function:
moves.sort(compare_m);
Something to note is that if the calculation of the comparison function is particularly expensive it may be worthwhile to precompute all the associated rank numbers before sorting.
This would require adding something to the move class to store the rank for use later:
class Move{
//rest of move class
public:
int rank;
};
list<Move>::iterator iter;
for(iter = moves.begin(); iter != moves.end(); ++iter)
{
//...
(*iter).rank = max; // store the number associated with current Move
}
bool compare_rank (const Move &first,const Move &second)
{
if (first.rank < second.rank) return true;
else return false;
}
std::sort is used to sort STL collections. If the elements in the collection you are sorting can be compared simply by calling operator< and the collection in question is a vector, then sorting is very simple:
std::sort(collection.begin(), collection.end());
If the collection in question is not a vector but a list as in your case, then you can't use the general version of std::sort, but you can use std::list's version instead:
list<int> numbers;
numbers.sort();
STL's sort, along with most other algorithms in the STL, comes in two flavors. One is the simple version we have already seen, which just uses operator< to do the comparison of two elements. The other is a 'predicated' version, which instead of using operator< uses a comparison functor you provide. There is a predicated version of sort for list, and this is what you need to use in your case.
You can create a functor in a number of ways, but one of the most useful is to derive a class from std::unary_function or from std::binary_function, depending on how many arguments your functor will take -- in your case, two. (These base classes are deprecated since C++11 and removed in C++17, where a plain class with operator() is enough.) Define the function-call operator, operator(), and add the code that compares two elements:
class compare_functor : public std::binary_function<Move, Move, bool>
{
public:
    bool operator()(const Move& lhs, const Move& rhs) const
    {
        int left_val = lhs.Value();
        int right_val = rhs.Value();
        return left_val < right_val;
    }
};
Here is a complete working example that puts everything together. In this program, instead of having a list of Moves, I have a list of 10 strings. Each string is 6 random characters. The list is populated by the call to generate_n, which uses the functor generator to create each random string. Then I dump that list of strings, along with their values, by calling transform and passing an output iterator that dumps the values to stdout (ostream_iterator). The value of each string is simply a sum of the numeric value of each character, computed by the function string_val.
Then I sort the list using list's predicated version of sort. The comparison predicate used by sort is evaluator. Then I finally dump the resulting list and the string values to the screen again as above:
#include <cstdlib>
#include <iostream>
#include <list>
#include <string>
#include <algorithm>
#include <functional>
#include <iterator>
#include <ctime>
#include <sstream>
using namespace std;
class generator
{
public:
generator() { srand((unsigned)time(0)); }
string operator()() const
{
string ret;
for( int i = 0; i < 6; ++i )
ret += static_cast<char>((rand() % 26) + 'A'); // rand()/(RAND_MAX/26) can yield 26, one past 'Z'
return ret;
}
};
unsigned string_val(const string& rhs)
{
unsigned val = 0;
for( string::const_iterator it = rhs.begin(); it != rhs.end(); ++it )
val += (*it)-'A'+1;
return val;
}
class evaluator : public std::binary_function<string,string,bool>
{
public:
bool operator()(const string& lhs, const string& rhs) const
{
return string_val(lhs) < string_val(rhs);
}
};
class string_dumper : public std::unary_function<string, string>
{
public:
string operator()(const string& rhs) const
{
stringstream ss;
ss << rhs << " = " << string_val(rhs);
return ss.str();
}
};
int main()
{
// fill a list with strings of 6 random characters
list<string> strings;
generate_n(back_inserter(strings), 10, generator());
// dump it to the screen
cout << "Unsorted List:\n";
transform(strings.begin(), strings.end(), ostream_iterator<string>(cout, "\n"), string_dumper());
// sort the strings according to their numeric values computed by 'evaluator'
strings.sort(evaluator()); // because this is a 'list', we are using list's 'sort'
// dump it to the screen
cout << "\n\nSorted List:\n";
transform(strings.begin(), strings.end(), ostream_iterator<string>(cout, "\n"), string_dumper());
return 0;
}