order a vector of points based on another vector - c++

I am working on a C++ application.
I have 2 vectors of points
vector<Point2f> vectorAll;
vector<Point2f> vectorSpecial;
Point2f is defined as typedef Point_<float> Point2f;
vectorAll has 1000 points while vectorSpecial has 10 points.
First Step:
I need to order the points in vectorSpecial depending on their order in vectorAll.
So something like this:
For each Point in vectorSpecial
Get The Order Of that point in the vectorAll
Insert it in the correct order in a new vector
I can do a double loop and save the indexes, and then order the points based on their indexes. However, this method takes too long when we have lots of points (for example, 10000 points in vectorAll and 1000 points in vectorSpecial, which is ten million iterations).
What are better methods of doing that?
Second Step:
Some points in vectorSpecial might not be available in vectorAll. I need to take the point that is closest to it (by using the usual distance formula sqrt((x1-x2)^2 + (y1-y2)^2))
This also can be done when looping, but if someone has any suggestions for better methods, I would appreciate it.
Thanks a lot for any help

You can use std::sort on vectorSpecial with a Compare function designed to take into account the contents of vectorAll:
struct myCompareStruct
{
    const std::vector<Point2f>& all;
    myCompareStruct(const std::vector<Point2f>& a)
        : all(a)
    {
    }
    bool operator() (const Point2f& i, const Point2f& j) const
    {
        // e.g. order by position in all (requires <algorithm>); each std::find
        // is an O(N) lookup, so this comparator is simple but not fast
        return std::find(all.begin(), all.end(), i)
             < std::find(all.begin(), all.end(), j);
    }
};
std::vector<Point2f> all;
std::vector<Point2f> special;
//fill your vectors
myCompareStruct compareObject(all);
std::sort(special.begin(), special.end(), compareObject);

For your First Step, you can use C++11 lambdas to great effect (special.size() = K, and all.size() = N):
#include <algorithm> // std::sort, std::transform, std::find, std::min_element
#include <iterator>  // std::distance, std::back_inserter
#include <numeric>   // std::iota
#include <vector>
std::vector<int> indices;
indices.reserve(special.size());
// locate exact index in all for every element of special. Complexity = O(K * N)
// (note: writing through indices.begin() on an empty vector would be undefined
// behavior, so use std::back_inserter)
std::transform(special.begin(), special.end(), std::back_inserter(indices), [&all](Point2f const& s){
    return std::distance(
        all.begin(),
        std::find(all.begin(), all.end(), s)
    );
});
// sort special based on index comparison. Complexity = O(K * log(K))
// Sorting special directly would move its elements out from under indices,
// so sort a permutation of positions instead and then apply it.
std::vector<int> order(special.size());
std::iota(order.begin(), order.end(), 0);
std::sort(order.begin(), order.end(), [&indices](int i, int j){
    return indices[i] < indices[j];
});
std::vector<Point2f> sortedSpecial;
sortedSpecial.reserve(special.size());
for (int i : order) sortedSpecial.push_back(special[i]);
special.swap(sortedSpecial);
Explanation: first, for every point in special, compute the distance between the beginning of all and the location of that point in all, and store the result in the indices vector. Second, sort a permutation of the positions 0..K-1 by comparing the corresponding entries of indices, and rearrange special accordingly.
For your Second Step, you only have to change the way you compute indices:
// locate closest element in all for every element of special. Complexity = O(K * N)
std::transform(special.begin(), special.end(), std::back_inserter(indices), [&all](Point2f const& s){
    return std::distance(
        all.begin(),
        std::min_element(all.begin(), all.end(), [&s](Point2f const& a, Point2f const& b){
            // compare squared Euclidean 2D-distances to s; sqrt is monotonic,
            // so it can be skipped for ordering purposes
            float da = (a.x - s.x) * (a.x - s.x) + (a.y - s.y) * (a.y - s.y);
            float db = (b.x - s.x) * (b.x - s.x) + (b.y - s.y) * (b.y - s.y);
            return da < db;
        })
    );
});
Explanation: the only change compared to your First Step is that for every element in special you find the element in all that is closest to it, which you do by computing the minimum Euclidean distance as you suggested in your question.
UPDATE: You could make a space/time tradeoff by first storing the index of every element of all into a std::unordered_map hash table, and then doing the comparison between elements of special based on lookup into that hash table. This reduces the time complexity of the first step to O(N) (assuming K < N), but adds O(N) of storage for the hash table.
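For illustration, a rough sketch of that tradeoff. Point2f has no standard hash, so this assumes a hand-written one; PointHash, PointEq, and build_position_table are illustrative names, not part of the original answer, and the table only covers exact matches (the First Step), not the nearest-point case:
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical hash and equality for an (x, y) float pair.
struct PointHash {
    std::size_t operator()(Point2f const& p) const {
        return std::hash<float>()(p.x) ^ (std::hash<float>()(p.y) << 1);
    }
};
struct PointEq {
    bool operator()(Point2f const& a, Point2f const& b) const {
        return a.x == b.x && a.y == b.y;
    }
};

// Build the table once in O(N); each later lookup is O(1) on average,
// replacing the O(N) std::find per element of special.
std::unordered_map<Point2f, int, PointHash, PointEq>
build_position_table(std::vector<Point2f> const& all)
{
    std::unordered_map<Point2f, int, PointHash, PointEq> position;
    position.reserve(all.size());
    for (std::size_t i = 0; i != all.size(); ++i)
        position.emplace(all[i], static_cast<int>(i));
    return position;
}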


c++ improve vector sorting by presorting with old vector

I have a vector of pairs with the following typedefs:
typedef std::pair<double, int> myPairType;
typedef std::vector<myPairType> myVectorType;
myVectorType myVector;
I fill this vector with double values and the int part of the pair is an index.
The vector then looks like this
0.6594 1
0.5434 2
0.5245 3
0.8431 4
...
My program has a number of time steps with slight variations in the double values and every time step I sort this vector with std::sort to something like this.
0.5245 3
0.5434 2
0.6594 1
0.8431 4
The idea is now to somehow use the vector from the last time step (the "old" vector, already sorted) to presort the current vector (the "new" vector, not yet sorted), and use an insertion sort or Timsort to sort the "rest" of the then presorted vector.
Is this somehow possible? I couldn't find a function to order the "new" vector of pairs by one part (the int part).
And if it is possible, could this be faster than sorting the whole unsorted "new" vector?
Thanks for any pointers into the right direction.
tiom
UPDATE
First of all thanks for all the suggestions and code examples. I will have a look at each of them and do some benchmarking if they will speed up the process.
Since there were some questions regarding the vectors, I will try to explain in more detail what I want to accomplish.
As I said, I have a number of time steps 1 to n. For every time step I have a vector of double data values with approximately 260000 elements.
In every time step I add an index to this vector which will result in a vector of pairs <double, int>. See the following code snippet.
typedef myVectorType::iterator myVectorTypeIterator; // iterator for myVector
std::vector<double> vectorData; // holds the double data values (filled elsewhere)
myVectorType myVector(vectorData.size()); // vector of pairs <double, int>
myVectorTypeIterator myVectorIter = myVector.begin();
// generating of the index
for (int i = 0; i < vectorData.size(); ++i) {
    myVectorIter->first = vectorData[i];
    myVectorIter->second = i;
    ++myVectorIter;
}
std::sort(myVector.begin(), myVector.end());
(The index is 0 based. Sorry for my initial mistake in the example above)
I do this for every time step and then sort this vector of pairs with std::sort.
The idea was now to use the sorted vector of pairs of time step j-1 (let's call it vectorOld) in time step j as a "presorter" for the "new" myVector, since I assume the ordering of the sorted "new" myVector of time step j will differ only in some places from the already sorted vectorOld of time step j-1.
With "presorter" I mean to rearrange the pairs in the "new" myVector into a vector presortedVector of type myVectorType using the same index order as vectorOld, and then let a Timsort or some similar sorting algorithm that is good on presorted data do the rest of the sorting.
Some data examples:
This is what the beginning of myVector looks like in time step j-1 before the sorting.
0.0688015 0
0.0832928 1
0.0482259 2
0.142874 3
0.314859 4
0.332909 5
...
And after the sorting
0.000102207 23836
0.000107378 256594
0.00010781 51300
0.000109315 95454
0.000109792 102172
...
So in the next time step j this is my vectorOld, and I would like to take the element with index 23836 of the "new" myVector and put it in the first place of presortedVector, the element with index 256594 should be the second element in presortedVector, and so on. But the elements have to keep their original index, so 256594 will not be index 0 but only element 0 in presortedVector, still with index 256594.
I hope this is a better explanation of my plan.
First, scan through the sequence to find the first element that's smaller than the preceding one (either a loop, or C++11's std::is_sorted_until). This is the start of the unsorted portion. Use std::sort on the remainder, then merge the two halves with std::inplace_merge.
template<class RandomIt, class Compare>
void sort_new_elements(RandomIt first, RandomIt last, Compare comp)
{
    RandomIt mid = std::is_sorted_until(first, last, comp);
    std::sort(mid, last, comp);
    std::inplace_merge(first, mid, last, comp);
}
This should be more efficient than sorting the whole sequence indiscriminately, as long as the presorted sequence at the front is significantly larger than the unsorted part.
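For example, a minimal usage sketch (assuming the sort_new_elements template above is in scope; the data values are made up):
#include <utility>
#include <vector>

int main()
{
    typedef std::pair<double, int> myPairType;
    // sorted prefix from the previous time step, plus two new unsorted entries
    std::vector<myPairType> v = {{0.52, 3}, {0.54, 2}, {0.66, 1}, {0.84, 4},
                                 {0.60, 5}, {0.10, 6}};
    sort_new_elements(v.begin(), v.end(),
                      [](const myPairType& a, const myPairType& b)
                      { return a.first < b.first; });
    // v is now fully sorted by the double component
}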
Using the sorted vector would likely result in more comparisons (just to find a matching item).
What you seem to be looking for is a self-ordering container.
You could use a set (and remove/re-insert on modification).
Alternatively you could use Boost Multi Index which affords a bit more convenience (e.g. use a struct instead of the pair)
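A minimal sketch of the erase/re-insert idea (update_value is an illustrative name; std::set keeps the pairs ordered by their double component automatically):
#include <set>
#include <utility>

typedef std::pair<double, int> myPairType;

// The set stays sorted at all times; when a value changes between time
// steps, remove the stale pair and insert the updated one.
void update_value(std::set<myPairType>& s, const myPairType& oldPair, double newValue)
{
    s.erase(oldPair);                                   // O(log n)
    s.insert(std::make_pair(newValue, oldPair.second)); // O(log n)
}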
I have no idea if this could be faster than sorting the whole unsorted "new" vector. It will depend on the data.
But this will create a sorted copy of a new vector based on the order of an old vector:
myVectorType getSorted(const myVectorType& unsorted, const myVectorType& old) {
    myVectorType sorted(unsorted.size());
    auto matching_value
        = [&unsorted](const myPairType& value)
          // note: value.second - 1 matches the 1-based indices of the first
          // example; for the 0-based indices of the update, drop the - 1
          { return unsorted[value.second - 1]; };
    std::transform(old.begin(), old.end(), sorted.begin(), matching_value);
    return sorted;
}
You will then need to "finish" sorting this vector. I don't know how much quicker (if at all) this will be than sorting it from scratch.
Live demo.
Well, you can create the new vector in the order of the old one and then use an algorithm that has good complexity for (nearly) sorted inputs to restore the order.
Below is an example of how it works, with Mark's function as restore_order:
#include <iostream>
#include <algorithm>
#include <vector>
#include <utility>
using namespace std;
typedef std::pair<double, int> myPairType;
typedef std::vector<myPairType> myVectorType;
void outputMV(const myVectorType& vect, std::ostream& out)
{
    for(const auto& element : vect)
        out << element.first << " " << element.second << '\n';
}
//https://stackoverflow.com/a/28813905/1133179
template<class RandomIt, class Compare>
void restore_order(RandomIt first, RandomIt last, Compare comp)
{
    RandomIt mid = std::is_sorted_until(first, last, comp);
    std::sort(mid, last, comp);
    std::inplace_merge(first, mid, last, comp);
}
int main() {
    myVectorType myVector = {{3.5,0},{1.4,1},{2.5,2},{1.0,3}};
    myVectorType mv2 = {{3.6,0},{1.35,1},{2.6,2},{1.36,3}};
    auto comparer = [] (const auto& lhs, const auto& rhs) { return lhs.first < rhs.first; };
    // make sure we didn't mess with the initial indexing
    int i = 0;
    for(auto& element : myVector) element.second = i++;
    i = 0;
    for(auto& element : mv2) element.second = i++;
    // sort the initial vector
    std::sort(myVector.begin(), myVector.end(), comparer);
    outputMV(myVector, cout);
    // this will replace each element of myVector with a corresponding
    // value from mv2 using the old sorted order
    std::for_each(myVector.begin(), myVector.end(),
        [&mv2] (auto& el) { el = mv2[el.second]; }
    );
    // restore order in case it was different for the new vector
    restore_order(myVector.begin(), myVector.end(), comparer);
    outputMV(myVector, cout);
    return 0;
}
This works in O(n) up to the restore step. The trick then is to use a good function for the restore; a nice candidate will have good complexity for nearly sorted inputs. I used the function that Mark Ransom posted, which works, but still isn't perfect.
It could be outperformed by a bubble-sort-inspired method: iterate over the elements, and whenever the order between the current and the next element is wrong, recursively swap them. However, there is a bet on how much the order changes: if the order doesn't vary much you will stay close to O(2n), if it does you will go up to O(n^2).
I think the best would be an implementation of natural merge sort, which has best case (sorted input) O(n) and worst case O(n log n).

What is the most efficient way of removing duplicates from a container only using almost equality criteria (no sort)

How do I remove duplicates from an unsorted container (mainly a vector) when I do not have the possibility to define operator<, e.g. when I can only define a fuzzy compare function?
This answer using sort does not work since I cannot define a function for ordering the data.
#include <boost/next_prior.hpp> // boost::next
template <typename T>
void removeDuplicatesComparable(T& cont){
    for(auto iter = cont.begin(); iter != cont.end(); ++iter){
        cont.erase(std::remove(boost::next(iter), cont.end(), *iter), cont.end());
    }
}
This is O(n²) and should be quite localized concerning cache hits.
Is there a faster or at least neater solution?
Edit: On why I cannot use sets. I do geometric comparisons. An example could be this but I have other entities different from polygons as well.
bool match(SegPoly const& left, SegPoly const& right, double epsilon){
    double const cLengthCompare = 0.1; //just an example
    if(!isZero(left.getLength() - right.getLength(), cLengthCompare)) return false;
    double const interArea = areaOfPolygon(left.intersected(right)); //this is a geometric intersection
    if(!isZero(interArea - right.getArea(), epsilon)) return false;
    else return true;
}
So for such comparisons I would not know how to formulate sorting or a neat hash function.
First, don't remove elements one at a time.
Next, use a hash table (or similar structure) to detect duplicates.
If you don't need to preserve order, then copy all elements into a hashset (this destroys duplicates), then recreate the vector using the values left in the hashset.
If you need to preserve order, then:
Set read and write iterators to the beginning of the vector.
Start moving the read iterator through, checking elements against a hashset or octree or something that allows finding nearby elements quickly.
For each element that collides with one in the hashset/octree, advance the read iterator only.
For elements that do not collide, move from read iterator to write iterator, copy to hashset/octree, then advance both.
When the read iterator reaches the end, call erase to truncate the vector at the write iterator position.
The key advantage of the octree is that while it doesn't let you immediately determine whether there is something close enough to be a "duplicate", it allows you to test against only near neighbors, excluding most of your dataset. So your algorithm might be O(N lg N) or even O(N lg lg N), depending on the spatial distribution.
Again, if you don't care about the ordering, you can actually move survivors into the hashset/octree and at the end move them back into the vector (compactly); a sketch of the order-preserving variant follows.
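A minimal sketch of the order-preserving pass, assuming a user-supplied fuzzy comparison (e.g. the match() from the question). remove_fuzzy_duplicates is an illustrative name; here the survivors themselves stand in for the hashset/octree, so the inner lookup is linear, and a real spatial index would only make that lookup faster:
#include <algorithm>
#include <utility>
#include <vector>

// Order-preserving removal of fuzzy duplicates. isDuplicate(kept, candidate)
// is the user-supplied comparison; survivors live in [begin, write).
template <class T, class Pred>
void remove_fuzzy_duplicates(std::vector<T>& v, Pred isDuplicate)
{
    auto write = v.begin();
    for (auto read = v.begin(); read != v.end(); ++read) {
        bool dup = std::any_of(v.begin(), write,
                               [&](const T& kept) { return isDuplicate(kept, *read); });
        if (!dup) {
            if (write != read)
                *write = std::move(*read); // keep the survivor
            ++write;                       // and advance both iterators
        } // on a collision, only the read iterator advances
    }
    v.erase(write, v.end()); // truncate at the write position
}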
If you don't want to rewrite your code to prevent duplicates from being placed in the vector to begin with, you can do something like this:
std::vector<Type> myVector;
// fill in the vector's data
std::unordered_set<Type> mySet(myVector.begin(), myVector.end());
myVector.assign(mySet.begin(), mySet.end());
This will be O(2 * n) = O(n) (note that the original element order is not preserved).
std::set (or std::unordered_set - which uses a hash instead of a comparison) doesn't allow for duplicates, so it will eliminate them as the set is initialized. Then you re-assign the vector with the non-duplicated data.
Since you are insisting that you cannot create a hash, another alternative is to create a temporary vector:
std::vector<Type> vec1;
// fill vec1 with your data
std::vector<Type> vec2;
vec2.reserve(vec1.size()); // vec1.size() will be the maximum possible size for vec2
std::for_each(vec1.begin(), vec1.end(), [&](const Type& t)
{
    bool is_unique = true;
    for (std::vector<Type>::iterator it = vec2.begin(); it != vec2.end(); ++it)
    {
        if (YourCustomEqualityFunction(*it, t)) // a match means t is a duplicate
        {
            is_unique = false;
            break;
        }
    }
    if (is_unique)
    {
        vec2.push_back(t);
    }
});
vec1.swap(vec2);
If copies are a concern, switch to a vector of pointers, and you can decrease the memory reallocations:
std::vector<std::shared_ptr<Type>> vec1;
// fill vec1 with your data
std::vector<std::shared_ptr<Type>> vec2;
vec2.reserve(vec1.size()); // vec1.size() will be the maximum possible size for vec2
std::for_each(vec1.begin(), vec1.end(), [&](const std::shared_ptr<Type>& t)
{
    bool is_unique = true;
    for (std::vector<std::shared_ptr<Type>>::iterator it = vec2.begin(); it != vec2.end(); ++it)
    {
        if (YourCustomEqualityFunction(**it, *t)) // compare the pointed-to objects
        {
            is_unique = false;
            break;
        }
    }
    if (is_unique)
    {
        vec2.push_back(t);
    }
});
vec1.swap(vec2);

Efficient way to get the indices of the k highest values in vector<float>

How can I create a std::map<int, float> from a vector<float>, so that the map contains the k highest values from the vector, with the keys being the index of the value in the vector?
A naive approach would be to traverse the vector (O(n)) and extract and erase (O(n)) the highest element k times, leading to a complexity of O(k*n), which is suboptimal, I guess.
Even better would be to just copy (O(n)) and remove the smallest until the size is k, which would lead to O(n^2). Still polynomial...
Any ideas?
The following should do the job:
#include <cstddef>
#include <algorithm>
#include <iostream>
#include <map>
#include <tuple>
#include <utility>
#include <vector>
// Compare: greater T2 first.
struct greater_by_second
{
    template <typename T1, typename T2>
    bool operator () (const std::pair<T1, T2>& lhs, const std::pair<T1, T2>& rhs) const
    {
        return std::tie(lhs.second, lhs.first) > std::tie(rhs.second, rhs.first);
    }
};
std::map<std::size_t, float> get_index_pairs(const std::vector<float>& v, int k)
{
    std::vector<std::pair<std::size_t, float>> indexed_floats;
    indexed_floats.reserve(v.size());
    for (std::size_t i = 0, size = v.size(); i != size; ++i) {
        indexed_floats.emplace_back(i, v[i]);
    }
    std::nth_element(indexed_floats.begin(),
                     indexed_floats.begin() + k,
                     indexed_floats.end(), greater_by_second());
    return std::map<std::size_t, float>(indexed_floats.begin(), indexed_floats.begin() + k);
}
Let's test it:
int main(int argc, char *argv[])
{
    const std::vector<float> fs {45.67f, 12.34f, 67.8f, 4.2f, 123.4f};
    for (const auto& elem : get_index_pairs(fs, 2)) {
        std::cout << elem.first << " " << elem.second << std::endl;
    }
    return 0;
}
Output:
2 67.8
4 123.4
You can keep a list of the k-highest values so far, and update it for each of the values in your vector, which takes you down to O(n*log k) (assuming log k for each update of the list of highest values) or, for a naive list, O(kn).
You can probably get closer to O(n), but assuming k is pretty small, it may not be worth the effort.
Your optimal solution will have a complexity of O(n+k*log(k)), since sorting the k elements can be reduced to this, and you will have to look at each of the elements at least once.
Two possible solutions come to mind:
Iterate through the vector while adding all elements to a bounded (size k) priority-queue/heap, also keeping their indices.
Create a copy of your vector including the original indices, i.e. std::vector<std::pair<float, std::size_t>>, and use std::nth_element to move the k highest values to the front, using a comparator that compares only the first element. Then insert those elements into your target map. Ironically, that last step adds the k*log(k) to the overall complexity, while nth_element is linear (but it will permute your indices). A sketch of the first, heap-based solution follows.
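A minimal sketch of the bounded-heap variant (top_k and its exact shape are illustrative, not from the original answers):
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <utility>
#include <vector>

// Keep the k highest values in a min-heap of (value, index) pairs: the
// smallest of the current top k sits on top and is evicted whenever a
// larger value arrives. O(n * log k) overall.
std::map<std::size_t, float> top_k(const std::vector<float>& v, std::size_t k)
{
    typedef std::pair<float, std::size_t> Entry;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (heap.size() < k) {
            heap.push(Entry(v[i], i));
        } else if (v[i] > heap.top().first) {
            heap.pop();
            heap.push(Entry(v[i], i));
        }
    }
    std::map<std::size_t, float> result;
    while (!heap.empty()) {
        result.insert(std::make_pair(heap.top().second, heap.top().first));
        heap.pop();
    }
    return result;
}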
Maybe I did not get it, but in case the incremental approach is not an option, why not use std::partial_sort instead of a full std::sort?
That should be O(n log k), and since k is very likely to be small, that makes it practically O(n).
Edit: thanks to Mike Seymour for the update.
Edit (bis):
The idea is to use an intermediate vector for sorting, and then put it into the map. Trying to reduce the order of the computation would only be justified for a significant amount of data, so I guess the copy time (O(n)) could be lost in the background noise.
Edit (bis):
That's actually what the selected answer does, without the theoretical explanations :). A sketch of the partial_sort variant follows.
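For concreteness, one possible shape of that partial_sort variant (a sketch under the same assumptions as the selected answer, not code taken from it):
#include <algorithm>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Copy (value, index) pairs into an intermediate vector, partially sort the
// k largest to the front in O(n log k), then build the map in O(k log k).
std::map<std::size_t, float> top_k_partial(const std::vector<float>& v, std::size_t k)
{
    typedef std::pair<float, std::size_t> Entry;
    std::vector<Entry> tmp;
    tmp.reserve(v.size());
    for (std::size_t i = 0; i != v.size(); ++i)
        tmp.push_back(Entry(v[i], i));
    std::partial_sort(tmp.begin(), tmp.begin() + k, tmp.end(),
                      [](const Entry& a, const Entry& b) { return a.first > b.first; });
    std::map<std::size_t, float> result;
    for (std::size_t i = 0; i != k; ++i)
        result.insert(std::make_pair(tmp[i].second, tmp[i].first));
    return result;
}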

sort operator not working in C++

I'm having trouble using my sort operator since I need to sort only the first element in the pair. The code is simple but is not working:
The operator is defined in:
struct sort_pred {
    bool operator()(const CromosomaIndex& left, const CromosomaIndex& right) const {
        return left.first < right.first;
    }
};
and the type is
typedef std::pair<double,int> CromosomaIndex;
I'm trying to sort the array like this:
CromosomaIndex nuevo[2];
nuevo[0].first = 0.01;
nuevo[0].second = 0;
nuevo[1].first = 0.009;
nuevo[1].second = 1;
int elements = sizeof(nuevo) / sizeof(nuevo[0]);
sort(nuevo, nuevo + elements, sort_pred());
But the problem is that this is sorting the first and the second element and I only want to sort the first element and keep the second fixed.
Any thoughts?
If you want the results to depend on the original order, use std::stable_sort.
This approach sorts pairs as a single unit, which is what it is expected to do: it never makes sense to break up the first and the second of a pair. If you would like to sort only the first items and leave the second ones in place, you will end up with a different set of pairs.
If you want to sort the first separately from the second, place them in separate arrays (better yet, use vectors) and sort the first vector. Then iterate both vectors, and make a new set of pairs.
I am not sure that you understood the answer to the other question, but you do want the whole pair to be reordered according to the double value. The original index (the int) must be attached to the double that was in that location in the original vector so that you can recover the location. Note that if you sorted only the double within the pair, then the value of the int would be the location in the array... which does not need to be maintained as a datum at all.
Alternatively, you can consider a similar (although slightly different) solution. Create a single vector of integers that is initialized with values in the range [0..N) where N is the size of the vector of doubles. Then sort the vector of indices using a comparator functor that instead of looking at the value (int) passed in will check the value in the original double vector:
struct dereference_cmp {
    std::vector<double> const & d_data;
    dereference_cmp( std::vector<double> const & data ) : d_data(data) {}
    bool operator()( int lhs, int rhs ) const {
        return d_data[lhs] < d_data[rhs];
    }
};
std::vector<double> d = ...;
std::vector<int> ints;
ints.reserve( d.size() );
for ( int i = 0; i < d.size(); ++i ) ints.push_back(i);
std::sort( ints.begin(), ints.end(), dereference_cmp(d) ); // sort the indices, not d
In this approach, note that what is not being reordered are the doubles, but rather the vector of indices. After the sort completes the vector of indices will contain locations into the vector of double such that i < j => d[ ints[i] ] <= d[ ints[j] ].
Note that in the whole process, what you want to reorder is the indices (in the original approach to be able to reconstruct the unsorted vector, in this approach to be able to find the values in sorted order), and the original vector is there only to provide the criterion for the sort.
Also note that the only reason to sort only the indices and not a modified container with both the value and the index would be if the cost of moving the data was high (say that each datum is a large object that cannot be cheaply moved, as a struct holding an array --not vector-- of data).
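To illustrate reading the result (a small assumed follow-up to the snippet above, with <iostream> included):
// visit the doubles in ascending order; ints[i] is each value's original position
for ( std::size_t i = 0; i < ints.size(); ++i )
    std::cout << ints[i] << " -> " << d[ ints[i] ] << '\n';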

C++ Standard Library approach to removing one of a pair of items in a list that satisfy a criterion

Imagine you have an std::list with a set of values in it. For demonstration's sake, we'll say it's just std::list<int>, but in my case they're actually 2D points. Anyway, I want to remove one of a pair of ints (or points) which satisfy some sort of distance criterion. My question is how to approach this as an iteration that doesn't do more than O(N^2) operations.
Example
Source is a list of ints containing:
{ 16, 2, 5, 10, 15, 1, 20 }
If I gave this a distance criterion of 1 (i.e. no item in the list should be within 1 of any other), I'd like to produce the following output:
{ 16, 2, 5, 10, 20 } if I iterated forward or
{ 20, 1, 15, 10, 5 } if I iterated backward
I feel that there must be some awesome way to do this, but I'm stuck with this double loop of iterators and trying to erase items while iterating through the list.
Make a map of "regions": basically, a std::map<coordinates/len, std::vector<point>>.
Add each point to its region, and to each of the 8 neighboring regions: O(N*log N). Run the "naive" algorithm on each of these smaller lists (technically O(N^2), unless there is a maximum density, in which case it becomes O(N*density)). Finally, on your original list, iterate through each point, and if it has been removed from any of the 8 mini-lists it was put in, remove it from the list: O(N).
With no limit on density, this is O(N^2), and slow. But it gets faster and faster the more spread out the points are. If the points are somewhat evenly distributed in a known boundary, you can switch to a two-dimensional array, making this significantly faster, and if there is a constant limit to the density, that technically makes this an O(N) algorithm.
That is how you sort a list of two variables, by the way: the grid/map/2d-vector thing.
[EDIT] You mentioned you were having trouble with the "naive" method too, so here is that:
template<class iterator, class criterion>
iterator RemoveCriterion(iterator begin, iterator end, criterion criter) {
    iterator actend = end;
    for(iterator L = begin; L != actend; ++L) {
        iterator R(L);
        for(++R; R != actend;) {
            if (criter(*L, *R)) {
                iterator N(R);
                std::rotate(R, ++N, actend);
                --actend;
            } else
                ++R;
        }
    }
    return actend;
}
This should work on linked lists, vectors, and similar containers, and works in reverse. Unfortunately, it's kinda slow due to not taking into account the properties of linked lists. It's possible to make much faster versions that only work on linked lists in a specific direction. Note that the return value is important, like with the other mutating algorithms. It can only alter contents of the container, not the container itself, so you'll have to erase all elements after the return value when it finishes.
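For example, a small usage sketch on the list from the question (assuming the RemoveCriterion template above is in scope):
#include <cstdlib>
#include <list>

int main()
{
    std::list<int> data = {16, 2, 5, 10, 15, 1, 20};
    // drop the later element of every pair that is within distance 1
    auto newEnd = RemoveCriterion(data.begin(), data.end(),
                                  [](int a, int b) { return std::abs(a - b) <= 1; });
    data.erase(newEnd, data.end()); // data now holds {16, 2, 5, 10, 20}
}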
Cubbi had the best answer, though he deleted it for some reason:
Sounds like it's a sorted list, in which case std::unique will do the job of removing the second element of each pair:
#include <list>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
int main()
{
    std::list<int> data = {1,2,5,10,15,16,20};
    std::unique_copy(data.begin(), data.end(),
                     std::ostream_iterator<int>(std::cout, " "),
                     [](int n, int m){ return std::abs(n - m) <= 1; });
    std::cout << '\n';
}
demo: https://ideone.com/OnGxk
That trivially extends to other types -- either by changing int to something else, or by defining a template:
template<typename T> void remove_close(std::list<T>& data, int distance)
{
    // erase, in place, the later element of every pair within the given distance
    data.erase(std::unique(data.begin(), data.end(),
                           [distance](T n, T m){ return abs(n - m) <= distance; }),
               data.end());
}
Which will work for any type that defines operator - and abs to allow finding a distance between two objects.
As a mathematician, I am pretty sure there is no 'awesome' way to approach this problem for an unsorted list. It seems to me a logical necessity to check the criterion for any one element against all previously selected elements in order to determine whether insertion is viable or not. There may be a number of ways to optimize this, depending on the size of the list and the criterion.
Perhaps you could maintain a bitset based on the criterion. E.g. suppose abs(n-m) <= 1 is the criterion. Suppose the first element is of size 5. This is carried over into the new list, so flip bitset[5] to 1. Then, when you encounter an element of size 6, say, you need only test
!( bitset[5] | bitset[6] | bitset[7] )
This would ensure no element is within magnitude 1 of any element of the resulting list (a sketch follows). This idea may be difficult to extend for more complicated (non-discrete) criteria, however.
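A minimal sketch of that idea for non-negative ints and a distance of 1 (MAXV and filter_with_bitset are assumptions for illustration; this only works for discrete criteria):
#include <bitset>
#include <list>

const int MAXV = 1024; // upper bound on the value range, assumed known

std::list<int> filter_with_bitset(const std::list<int>& src)
{
    std::bitset<MAXV> seen;
    std::list<int> out;
    for (int n : src) {
        bool blocked = seen[n];
        if (n > 0)        blocked = blocked || seen[n - 1];
        if (n + 1 < MAXV) blocked = blocked || seen[n + 1];
        if (!blocked) {   // nothing within magnitude 1 was kept before
            out.push_back(n);
            seen[n] = true;
        }
    }
    return out;
}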
What about:
struct IsNeighbour : public std::binary_function<int,int,bool>
{
    IsNeighbour(int dist)
        : distance(dist) {}
    bool operator()(int a, int b) const
        { return abs(a-b) <= distance; }
    int distance;
};
std::list<int>::iterator iter = lst.begin();
while(iter != lst.end())
{
    iter = std::adjacent_find(iter, lst.end(), IsNeighbour(some_distance));
    if(iter != lst.end())
        iter = lst.erase(iter);
}
This should be O(n). It searches for the first pair of neighbours (which are at most some_distance away from each other) and removes the first element of that pair. This is repeated (starting from the found item and not from the beginning, of course) until no pairs are found anymore.
EDIT: Oh sorry, you said any other and not just the next element. In this case the above algorithm only works for a sorted list, so you should sort it first, if necessary.
You can also use std::unique instead of this custom loop above:
lst.erase(std::unique(lst.begin(), lst.end(), IsNeighbour(some_distance)), lst.end());
but this removes the second item of each equal pair, and not the first, so you may have to reverse the iteration direction if this matters.
For 2D points instead of ints (1D points) it is not that easy, as you cannot just sort them by their euclidean distance. So if your real problem is to do it on 2D points, you might rephrase the question to point that out more clearly and remove the oversimplified int example.
I think this will work, as long as you don't mind making copies of the data; but if it's just pairs of integers/floats, that should be pretty low-cost. You're making n^2 comparisons, but you're using standard algorithms and can declare the input vector const.
//calculates the distance between two points and returns true if said distance is
//under its threshold
bool isTooClose(const Point& lhs, const Point& rhs, int threshold = 1);

const vector<Point>& vec = ...; //the original vector, passed in
vector<Point> out;              //the output vector, returned however you like
for(auto b = vec.begin(), e = vec.end(); b != e; ++b) {
    const Point& candidate = *b;
    if(find_if(out.begin(),
               out.end(),
               [&candidate](const Point& p){ return isTooClose(candidate, p); }) == out.end())
    { //we didn't find anyone too close to us in the output vector. Let's add!
        out.push_back(candidate);
    }
}
std::list<>.erase(remove_if(...)) using functors
http://en.wikipedia.org/wiki/Erase-remove_idiom
Update(added code):
struct IsNeighbour : public std::unary_function<int,bool>
{
    IsNeighbour(int dist)
        : m_distance(dist), m_old_value(0) {}
    bool operator()(int a)
    {
        bool result = abs(a - m_old_value) <= m_distance;
        m_old_value = a;
        return result;
    }
    int m_distance;
    int m_old_value;
};
main function...
std::list<int> data = {1,2,5,10,15,16,20};
data.erase(std::remove_if(data.begin(), data.end(), IsNeighbour(1)), data.end());