Given a path, I want to optimize it so that all vertices that lie on a straight line between their neighbours are removed.
For example:
Path:
*******
* *
* *
***********
Could be optimized to:
*-----*
| \
| \
*---------*
However, I want to control the allowed deviation from the slope, so that a point doesn't have to lie exactly on the line to be removed.
What sort of algorithm can do this?
Thanks
I believe that you can do this with a simple iterative walk across the points on the path. Keep track, at each point, of the last three points you've encountered. If all three of them are collinear, then remove the middle point from the path, since taking a straight-line path from the first to the third node will pass through the middle node. You could control how much of a deviation there is by having some term that controls how close to collinear the points would have to be.
This can be implemented in O(n) time with a simple pass over the data if you have the points stored in a data structure like a doubly-linked list.
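A minimal sketch of that walk, under some assumptions: the `Point` type, the function names, and the choice of perpendicular distance as the "how close to collinear" term are all hypothetical, and a `std::vector` is rebuilt instead of editing a linked list in place.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical point type

// Perpendicular distance from b to the infinite line through a and c.
// Falls back to the distance |ab| when a and c coincide.
double deviation(const Point& a, const Point& b, const Point& c) {
    const double dx = c.x - a.x, dy = c.y - a.y;
    const double len = std::hypot(dx, dy);
    if (len == 0.0) return std::hypot(b.x - a.x, b.y - a.y);
    return std::fabs(dx * (b.y - a.y) - dy * (b.x - a.x)) / len;
}

// One pass over an open path: a middle point is dropped when it lies within
// `tolerance` of the line from the last kept point to the next input point.
std::vector<Point> simplify(const std::vector<Point>& path, double tolerance) {
    if (path.size() < 3) return path;
    std::vector<Point> out{path.front()};
    for (std::size_t i = 1; i + 1 < path.size(); ++i) {
        if (deviation(out.back(), path[i], path[i + 1]) > tolerance)
            out.push_back(path[i]);
    }
    out.push_back(path.back());
    return out;
}
```

Because the test uses the infinite line, a point where the path doubles back on itself also counts as collinear; whether that is wanted depends on the data.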
Hope this helps!
You should use the convex hull algorithm (how exactly depends on how your polygon is stored in memory) and then clean up the points using a minimum angle between each point and its two neighbours. You'll then have a polygon containing only the extreme points.
Here it is:
http://en.wikipedia.org/wiki/Convex_hull
There are many possible implementations. It depends on what language you're programming in and the data you're working with.
Edit: I didn't realize at the time that you already had the points as data. Just iterate through the points and calculate the angle between the point you're on, the previous one and the next one. If the angle is ~= 180 degrees, erase the current point.
This is going to be a bit of an abstracted view since I'm not much of a C++ person, but here goes...
Let's take a look at one point right now:
*******
* *
* *<- this one, let's call it X
***********
What you're going to do is decide, one point at a time, whether each point is necessary. To decide that, two other points must be used: the points immediately before and immediately after it:
*******
* *<- A
* *
***********<- B
If the angle from A to X is the same (or within an error you deem accurate enough) as the angle from X to B, then X is unnecessary.
This will NOT result in the same outcome as the Convex Hull algorithm. This will simply reduce the resolution of the path. You can get side effects if your allowed error is too great, such as this:
* *
* |
* |
* -> |
* |
* |
* *
Or if your error is too small you may not change the path at all.
Also note that convex hull can greatly change the path. Example:
* * *---*
* * * * / \
* * * -> * *
* * | |
********* *-------*
set `D` to a maximum deviation of 10 degrees or so.
set `P` to the first point.
set `Q` to the point after `P`.
set `A` to the angle from `P` to `Q`.
while `Q` is not the last point in the list
    set `N` to the point after `Q`
    if the angle from `P` to `N` is within `A` plus or minus `D`
        remove `Q`
    else
        set `P` to `Q`
        set `A` to the angle from `P` to `N`
    set `Q` to `N`
This is slightly more complicated than templatetypedef's answer, but has the advantage that it forms a better fit on large curves.
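One way to read the loop above in C++, as a sketch only: the direction from the anchor `P` to each later point is compared with the reference angle `A`, and the point just before the direction drifts out of tolerance is kept as the new anchor. The `Point` type, the function names and the use of radians are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical point type

// Smallest absolute difference between two angles, in radians.
double angle_diff(double a, double b) {
    const double pi = std::acos(-1.0);
    const double d = std::fmod(std::fabs(a - b), 2.0 * pi);
    return d > pi ? 2.0 * pi - d : d;
}

std::vector<Point> simplify_by_angle(const std::vector<Point>& path,
                                     double max_dev_rad) {
    if (path.size() < 3) return path;
    std::vector<Point> out{path[0]};
    Point p = path[0];                                         // anchor P
    double a = std::atan2(path[1].y - p.y, path[1].x - p.x);   // reference A
    for (std::size_t n = 2; n < path.size(); ++n) {
        const double an = std::atan2(path[n].y - p.y, path[n].x - p.x);
        if (angle_diff(an, a) > max_dev_rad) {
            // The previous point is a corner: keep it and restart from there.
            p = path[n - 1];
            out.push_back(p);
            a = std::atan2(path[n].y - p.y, path[n].x - p.x);
        }
    }
    out.push_back(path.back());
    return out;
}
```

Coincident consecutive points make `atan2(0, 0)` the reference direction, so real input would probably need duplicates filtered first.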
A more complicated solution would involve techniques from image processing. You could try a Hough transform that allows deviations. Deviations can be included by "blurring" the parameter space. However, the algorithm is not simple. Also, I don't know how well it handles a large number of lines when the number of points on each line is very different. Since your points are ordered, you could try to have a look at the parameter space and remove all points that have produced a match. If you select the best matches first, you will probably be left with a good solution.
I think this page should help you: Simplyfing Polygons (and I also recommend the book).
I've implemented @templatetypedef's solution in C++, for a closed polygonal chain, described by two x,y vectors. I walk the polygon, and if a point is collinear with the previous and the next point, I delete it:
template<class T> void del_collinear_vanilla(std::vector<T> &x,
                                             std::vector<T> &y) {
    assert(x.size() == y.size());
    size_t i = x.size();
    size_t im1, ip1 = 0;
    do {
        i--;
        im1 = i ? i - 1 : x.size() - 1;
        if (are_collinear(x[ip1], y[ip1], x[i], y[i], x[im1], y[im1])) {
            x.erase(x.begin() + i);
            y.erase(y.begin() + i);
        }
        ip1 = i;
    } while (i != 0);
}
where the implementation depends on a macro/template are_collinear(x0,y0,x1,y1,x2,y2).
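The answer does not show `are_collinear`. A possible shape for it, purely as an assumption, is a cross-product test with an absolute tolerance; the `eps` parameter and its default are hypothetical and not part of the original code.

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper: true when (x1,y1) lies on the line through
// (x0,y0) and (x2,y2), up to an absolute tolerance `eps`.
template <class T>
bool are_collinear(T x0, T y0, T x1, T y1, T x2, T y2, T eps = T(1e-9)) {
    // Twice the signed area of the triangle P0, P1, P2.
    const T cross = (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0);
    return std::abs(cross) <= eps;
}
```

The cross product scales with the segment lengths, so a fixed `eps` is only meaningful for coordinates of a known magnitude. Coincident points give a zero cross product and therefore count as collinear, which matches how the example below treats P5 and P0.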
However, in some cases I still had some collinear points in the output. This is a sample input with which the algorithm fails:
In the example, P5 coincides with P0, and P4 has the same ordinate as P0 and P1; I shifted their coordinates slightly to show all the segments. The algorithm should return only a rectangle with vertices P1,P2,P3,P4.
Above, P6 is collinear with P5 and P0. Then, once P6 is eliminated, P5 and P0 coincide, and they are both collinear with P4 and P1.
It turns out that a simple loop over each point, deleting a point if it is collinear with the previous and the next point, does not provide the correct result.
(In the example, let's say you start with P0, and you find that it is not collinear with the point before P6 and the point after P1. Then you move to P1,P2,... until you reach P6. P6 is collinear, you delete it, and the loop is finished. But now P0 is collinear with P4 and P1, and it should have been deleted!)
The same flaw exists for an open path. The algorithm works fine as long as the input path has not collapsed on itself, in a way.
The solution is to take a step back every time you delete a point, to verify if the previous point has now become collinear:
template<class T> void del_collinear(std::vector<T> &x, std::vector<T> &y) {
    assert(x.size() == y.size());
    size_t target = x.size() - 1;
    size_t i = x.size() - 1;
    do {
        size_t im1 = i ? i - 1 : x.size() - 1;
        size_t ip1 = (i == x.size() - 1) ? 0 : i + 1;
        if (are_collinear(x[ip1], y[ip1], x[i], y[i], x[im1], y[im1])) {
            x.erase(x.begin() + i);
            y.erase(y.begin() + i);
            // I do not decrease i in this case, as the previous (already
            // processed) point may now be a collinear point that must be
            // deleted. I mod it because i may now exceed x.size()
            i = i % x.size();
            // Increment the target as well.
            target = (i + 1 + x.size()) % x.size();
        } else
            // Go for the next point.
            i = i ? i - 1 : x.size() - 1;
    } while (i != target);
}
Related
I have N points in a 2D cartesian space loaded in a boost:rtree.
Given a random point P(x,y) not in the tree, I need an efficient way to find the nearest point in each of the four quadrants generated by the local coordinate system centered at P and parallel to the main coordinate system.
As shown in the image (linked above), given the red point I need to find the four purple points.
I tried this naive approach:
namespace bg = boost::geometry;
typedef bg::model::box<point> box;
vector<item> result_s;
vector<item> result_p;
int xres = 10; /*this is a fixed amount that is loosely related to the points distribution*/
int yres = 10; /*as for xres*/
int range = 10;
int maxp = 30;
/*
* .. filling the tree
*/
box query_box2(point(lat, lon), point(lat-range*yres, lon+range*xres));
rtree.query(bgi::intersects(query_box2) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box1(point(lat, lon), point(lat+range*yres, lon+range*xres));
rtree.query(bgi::intersects(query_box1) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box3(point(lat, lon), point(lat+range*yres, lon-range*xres));
rtree.query(bgi::intersects(query_box3) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box4(point(lat, lon), point(lat-range*yres, lon-range*xres));
rtree.query(bgi::intersects(query_box4) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
if(result_s.size()>3)
cout << "OK!" << endl;
else
cout << "KO" << endl;
but it often ends up with an empty result (KO).
Any suggestions or pointers would be much appreciated.
Tnx.
I would do an iterated nearest query.
It will produce nearest points ordered by distance ascending.
You can cancel it after you received at least 1 point in all quadrants.
In principle the time complexity of this approach is MUCH lower because it involves only a single query.
Worst case behaviour would iterate all points in the tree e.g.
if one quadrant doesn't contain any points, or
when all the points in one quadrant are actually closer than the closest point in another quadrant.
Seems like the former might not be possible in your model (?) and the latter is statistically unlikely with normal distributions. You'd have to check your domain's expected point distributions.
Or, and this always applies: MEASURE and compare the effective performance
Use a modified distance function. More precisely, use four.
The main idea is to use a distance such that
d(v1,v2) = infinity if v2.x < v1.x
d(v1,v2) = infinity if v2.y < v1.y
d(v1,v2) = (v1.x-v2.x)²+(v1.y-v2.y)² otherwise
If you search for the nearest point with this distance, it must be in the top right quadrant.
You'll need to extend this logic to minDist when searching the tree.
The benefit is that it can stop searching a quadrant when it has found a point. Pages that overlap the "axes" may be expanded twice though.
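As an illustration only, the four distances can be folded into one function with two sign parameters; `sx = sy = +1` gives the top-right quadrant from the definition above. The brute-force scan is a stand-in for the tree search (the `minDist` extension for tree pages is not shown), and all names here are hypothetical.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

struct Pt { double x, y; };  // hypothetical point type

// Squared distance from v1 to v2, or infinity when v2 lies outside the
// quadrant of v1 selected by the signs (sx, sy), each +1 or -1.
double quadrant_dist(const Pt& v1, const Pt& v2, int sx, int sy) {
    const double dx = (v2.x - v1.x) * sx;
    const double dy = (v2.y - v1.y) * sy;
    if (dx < 0.0 || dy < 0.0) return std::numeric_limits<double>::infinity();
    return dx * dx + dy * dy;
}

// Brute-force stand-in for the tree search: index of the nearest point
// in the chosen quadrant, or -1 when that quadrant is empty.
int nearest_in_quadrant(const std::vector<Pt>& pts, const Pt& p, int sx, int sy) {
    int best = -1;
    double best_d = std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const double d = quadrant_dist(p, pts[i], sx, sy);
        if (d < best_d) {
            best_d = d;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```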
There are n points with each having two attributes:
1. Position (from axis)
2. Attraction value (integer)
Attraction force between two points A & B is given by:
Attraction_force(A, B) = (distance between them) * Max(Attraction_val_A, Attraction_val_B);
Find the summation of all the forces between all possible pairs of points?
I tried by calculating and adding forces between all the pairs
for(int i=0; i<n-1; i++) {
    for(int j=i+1; j<n; j++) {
        force += abs(P[i].pos - P[j].pos) * max(P[i].attraction_val, P[j].attraction_val);
    }
}
Example:
Points P1 P2 P3
Points distance: 2 3 4
Attraction Val: 4 5 6
Force = abs(2 - 3) * max(4, 5) + abs(2 - 4) * max(4, 6) + abs(3 - 4) * max(5, 6) = 23
But this takes O(n^2) time, I can't think of a way to reduce it further!!
Scheme of a solution:
1. Sort all points by their attraction value and process them one-by-one, starting with the one with lowest attraction.
2. For each point you have to quickly calculate sum of distances to all previously added points. That can be done using any online Range Sum Query problem solution, like segment tree or BIT. Key idea is that all points to the left are really not different and sum of their coordinates is enough to calculate sum of distances to them.
3. For each newly added point you just multiply that sum of distances (obtained on step 2) by point's attraction value and add that to the answer.
Intuitive observations that I made in order to invent this solution:
We have two "bad" functions here (somewhat "discrete"): max and the absolute value (in the distance).
We can get rid of max by sorting our points and processing them in a specific order.
We can get rid of the absolute value if we process points to the left and to the right separately.
After all these transformations, we have to calculate something which, after some simple algebraic transformations, converts to an online RSQ problem.
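A sketch of that scheme with two Fenwick trees (BITs), one counting the points already added and one summing their positions. It assumes integer positions and values that fit comfortably in `long long`; the type and function names are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Point { long long pos; long long attraction; };  // hypothetical layout

// Fenwick tree over 1-based indices supporting point update and prefix sum.
struct Fenwick {
    std::vector<long long> t;
    explicit Fenwick(std::size_t n) : t(n + 1, 0) {}
    void add(std::size_t i, long long v) {
        for (; i < t.size(); i += i & (~i + 1)) t[i] += v;
    }
    long long sum(std::size_t i) const {
        long long s = 0;
        for (; i > 0; i -= i & (~i + 1)) s += t[i];
        return s;
    }
};

long long total_force(std::vector<Point> pts) {
    // Compress positions to ranks 1..m so they can index the trees.
    std::vector<long long> xs;
    for (const Point& p : pts) xs.push_back(p.pos);
    std::sort(xs.begin(), xs.end());
    xs.erase(std::unique(xs.begin(), xs.end()), xs.end());

    // Ascending attraction: the current point always supplies the max.
    std::sort(pts.begin(), pts.end(), [](const Point& a, const Point& b) {
        return a.attraction < b.attraction;
    });

    Fenwick cnt(xs.size()), sum(xs.size());
    long long added = 0, added_sum = 0, total = 0;
    for (const Point& p : pts) {
        const std::size_t r = static_cast<std::size_t>(
            std::lower_bound(xs.begin(), xs.end(), p.pos) - xs.begin()) + 1;
        const long long cl = cnt.sum(r), sl = sum.sum(r);      // at or left of p
        const long long cr = added - cl, sr = added_sum - sl;  // right of p
        const long long dist = (cl * p.pos - sl) + (sr - cr * p.pos);
        total += dist * p.attraction;
        cnt.add(r, 1);
        sum.add(r, p.pos);
        ++added;
        added_sum += p.pos;
    }
    return total;
}
```

Sorting and the tree operations give O(n log n) overall. Ties in attraction value need no special handling, since either point of a tied pair supplies the same maximum.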
An O(N^2) algorithm is optimal, because you need the actual distance between all possible pairs.
I have the following problem. Suppose you have a big array of Manhattan polygons on the plane (their sides are parallel to the x or y axis). I need to find the polygons that are placed closer to each other than some value delta. The question is how to do this in the most efficient way, because the number of polygons is very large. I would be glad if you could give me a reference to an implemented solution that would be easy to adapt to my case.
The first thing that comes to mind is the sweep and prune algorithm (also known as sort and sweep).
Basically, you first find out the 'bounds' of each shape along each axis. For the x axis, these would be leftmost and rightmost points on a shape. For the y axis, the topmost and bottommost.
Let's say you have a bound structure that looks something like this:
struct Bound
{
    float value;     // The value of the bound, i.e., the x or y coordinate.
    bool isLower;    // True for a lower bound (leftmost point or bottommost point).
    int shapeIndex;  // The index (into your array of shapes) of the shape this bound is on.
};
Create two arrays of these Bounds, one for the x axis and one for the y.
Bound* xBounds = new Bound[2 * numberOfShapes];
Bound* yBounds = new Bound[2 * numberOfShapes];
You will also need two more arrays. An array that tracks on how many axes each pair of shapes is close to one another, and an array of candidate pairs.
int* closeAxes = new int[numberOfShapes * numberOfShapes];
for (int i = 0; i < numberOfShapes * numberOfShapes; i++)
    closeAxes[i] = 0;
struct Pair
{
    int shapeIndexA;
    int shapeIndexB;
};
Pair* candidatePairs = new Pair[numberOfShapes * numberOfShapes];
int numberOfPairs = 0;
Iterate through your list of shapes and fill the arrays appropriately, with one caveat:
Since you're checking for closeness rather than intersection, add delta to each upper bound.
Then sort each array by value, using whichever algorithm you like.
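The fill-and-sort step could look roughly like this. It is a sketch that assumes each polygon has already been reduced to an axis-aligned bounding box; the `Box` type and `makeBounds` are hypothetical names, and `std::vector` replaces the raw arrays for brevity.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Bound {
    float value;     // x or y coordinate of the bound
    bool isLower;    // true for the leftmost / bottommost bound
    int shapeIndex;  // index of the shape this bound belongs to
};

struct Box { float minX, maxX, minY, maxY; };  // hypothetical bounding box

// Builds the sorted bound list for one axis; every upper bound is pushed
// out by delta so that "closer than delta" shows up as an overlap.
std::vector<Bound> makeBounds(const std::vector<Box>& boxes, float delta, bool xAxis) {
    std::vector<Bound> bounds;
    bounds.reserve(2 * boxes.size());
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const float lo = xAxis ? boxes[i].minX : boxes[i].minY;
        const float hi = xAxis ? boxes[i].maxX : boxes[i].maxY;
        bounds.push_back({lo, true, static_cast<int>(i)});
        bounds.push_back({hi + delta, false, static_cast<int>(i)});
    }
    std::sort(bounds.begin(), bounds.end(),
              [](const Bound& a, const Bound& b) { return a.value < b.value; });
    return bounds;
}
```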
Next, do the following (and repeat for the Y axis):
for (int i = 0; i + 1 < 2 * numberOfShapes; i++)
{
    if (xBounds[i].isLower && xBounds[i + 1].isLower)
    {
        unsigned int L = xBounds[i].shapeIndex;
        unsigned int R = xBounds[i + 1].shapeIndex;
        closeAxes[L + R * numberOfShapes]++;
        closeAxes[R + L * numberOfShapes]++;
        if (closeAxes[L + R * numberOfShapes] == 2 ||
            closeAxes[R + L * numberOfShapes] == 2)
        {
            candidatePairs[numberOfPairs].shapeIndexA = L;
            candidatePairs[numberOfPairs].shapeIndexB = R;
            numberOfPairs++;
        }
    }
}
All the candidate pairs are less than delta apart on each axis. Now simply check each candidate pair to make sure they're actually less than delta apart. I won't go into exactly how to do that at the moment because, well, I haven't actually thought about it, but hopefully my answer will at least get you started. I suppose you could just check each pair of line segments and find the shortest x or y distance, but I'm sure there's a more efficient way to go about the 'narrow phase' step.
Obviously, the actual implementation of this algorithm can be a lot more sophisticated. My goal was to make the explanation clear and brief rather than elegant. Depending on the layout of your shapes and the sorting algorithm you use, a single run of this is approximately between O(n) and O(n log n) in terms of efficiency, as opposed to O(n^2) to check every pair of shapes.
I want to create a large random point cloud in the 2D plane that is non-degenerate (no 3 points on a straight line in the whole set). I have a naive solution which generates a random float pair P_new(x,y) and checks, for every pair of points (P1, P2) generated so far, whether the points (P1, P2, P_new) lie on the same line. This takes O(n^2) checks for each new point added to the list, making the whole complexity O(n^3), which is very slow if I want to generate more than 4000 points (it takes more than 40 minutes).
Is there a faster way to generate such a set of non-degenerate points?
Instead of checking the possible points' collinearity on each cycle iteration, you could compute and compare coefficients of linear equations. These coefficients should be stored in a container with quick search. I considered using std::set, but unordered_map could work too and could lead to even better results.
To sum it up, I suggest the following algorithm:
1. Generate a random point p.
2. Compute the coefficients of the lines through p and each existing point (I mean the usual A, B and C). This takes n computations.
3. Try to find the newly computed values in the previously computed set. This step requires at most n*log(n^2) operations.
4. If the search finds nothing, add the new point and add its coefficients to the corresponding sets. The cost per insertion is about O(log(n)) too.
The whole complexity is reduced to O(n^2*log(n)).
This algorithm requires additional storing of n^2*sizeof(Coefficient) memory. But this seems to be ok if you are trying to compute 4000 points only.
O(n^2 log n) algorithm can be easily constructed in the following way:
For each point P in the set:
1. Sort other points by polar angle (cross-product as a comparison function, standard idea, see 2D convex hull gift-wrapping algorithm for example). In this step you should consider only points Q that satisfy
Q.x > P.x || Q.y >= P.y
2. Iterate over the sorted list; equal points are lying on the same line.
Sorting is done in O(n log n), step 2 is O(n). This gives O(n^2 log n) for removing degenerate points.
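For a single point, the sort-and-scan step might be sketched as below. To keep the comparison exact it assumes small integer coordinates (the question uses floats, which would need a tolerance instead), and it folds every direction into one half-plane so that opposite directions compare equal. `P` and `collinear_with_any_pair` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct P { long long x, y; };  // hypothetical integer point

// True when q lies on a line through two of the existing points
// (or coincides with one of them). Costs O(n log n) per call.
bool collinear_with_any_pair(const std::vector<P>& pts, const P& q) {
    std::vector<P> dirs;
    dirs.reserve(pts.size());
    for (const P& p : pts) {
        P v{p.x - q.x, p.y - q.y};
        if (v.x == 0 && v.y == 0) return true;  // duplicate point
        // Flip into the half-plane x > 0, or x == 0 and y > 0.
        if (v.x < 0 || (v.x == 0 && v.y < 0)) { v.x = -v.x; v.y = -v.y; }
        dirs.push_back(v);
    }
    // Within one half-plane, cross(a, b) > 0 means a has the smaller angle.
    std::sort(dirs.begin(), dirs.end(), [](const P& a, const P& b) {
        return a.x * b.y - a.y * b.x > 0;
    });
    // Parallel directions end up adjacent after sorting.
    for (std::size_t i = 1; i < dirs.size(); ++i) {
        if (dirs[i - 1].x * dirs[i].y - dirs[i - 1].y * dirs[i].x == 0)
            return true;
    }
    return false;
}
```

Calling this once per candidate point while building the set gives the O(n^2 log n) total mentioned above.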
Determining whether a set of points is degenerate is a 3SUM-hard problem. (The very first problem listed is determining whether three lines contains a common point; the equivalent problem under projective duality is whether three points belong to a common line.) As such, it's not reasonable to hope that a generate-and-test solution will be significantly faster than n^2.
What are your requirements for the distribution?
generate random point Q
for previous points P calculate (dx, dy) = P - Q
and B = (abs(dx) > abs(dy) ? dy/dx : dx/dy)
sort the list of points P by its B value, so that points that form a line with Q will be in near positions inside the sorted list.
walk over the sorted list checking where Q forms a line with the current P value being considered and some next values that are nearer than a given distance.
Perl implementation:
#!/usr/bin/perl
use strict;
use warnings;
use 5.010;
use Math::Vector::Real;
use Math::Vector::Real::Random;
use Sort::Key::Radix qw(nkeysort);
use constant PI => 3.14159265358979323846264338327950288419716939937510;
@ARGV <= 2 or die "Usage:\n $0 [n_points [tolerance]]\n\n";
my $n_points = shift // 4000;
my $tolerance = shift // 0.01;
$tolerance = $tolerance * PI / 180;
my $tolerance_arctan = 3 / 2 * $tolerance;
# I got to that relation using not-so-basic maths in a hurry.
# It may be wrong!
my $tolerance_sin2 = sin($tolerance) ** 2;
sub cross2d {
    my ($p0, $p1) = @_;
    $p0->[0] * $p1->[1] - $p1->[0] * $p0->[1];
}
sub line_p {
    my ($p0, $p1, $p2) = @_;
    my $a0 = $p0->abs2 || return 1;
    my $a1 = $p1->abs2 || return 1;
    my $a2 = $p2->abs2 || return 1;
    my $cr01 = cross2d($p0, $p1);
    my $cr12 = cross2d($p1, $p2);
    my $cr20 = cross2d($p2, $p0);
    $cr01 * $cr01 / ($a0 * $a1) < $tolerance_sin2 or return;
    $cr12 * $cr12 / ($a1 * $a2) < $tolerance_sin2 or return;
    $cr20 * $cr20 / ($a2 * $a0) < $tolerance_sin2 or return;
    return 1;
}
my ($c, $f1, $f2, $f3) = (0, 1, 1, 1);
my @p;
GEN: for (1..$n_points) {
    my $q = Math::Vector::Real->random_normal(2);
    $c++;
    $f1 += @p;
    my @B = map {
        my ($dx, $dy) = @{$_ - $q};
        abs($dy) > abs($dx) ? $dx / $dy : $dy / $dx;
    } @p;
    my @six = nkeysort { $B[$_] } 0..$#B;
    for my $i (0..$#six) {
        my $B0 = $B[$six[$i]];
        my $pi = $p[$six[$i]];
        for my $j ($i + 1..$#six) {
            last if $B[$six[$j]] - $B0 > $tolerance_arctan;
            $f2++;
            my $pj = $p[$six[$j]];
            if (line_p($q - $pi, $q - $pj, $pi - $pj)) {
                $f3++;
                say "BAD: $q $pi-$pj";
                redo GEN;
            }
        }
    }
    push @p, $q;
    say "GOOD: $q";
    my $good = @p;
    my $ratiogood = $good/$c;
    my $ratio12 = $f2/$f1;
    my $ratio23 = $f3/$f2;
    print STDERR "gen: $c, good: $good, good/gen: $ratiogood, f2/f1: $ratio12, f3/f2: $ratio23 \r";
}
print STDERR "\n";
The tolerance indicates the acceptable error in degrees when considering if three points are in a line as π - max_angle(Q, Pi, Pj).
It does not take into account the numerical instabilities that can happen when subtracting vectors (i.e. |Pi-Pj| may be several orders of magnitude smaller than |Pi|). An easy way to eliminate that problem would be to also require a minimum distance between any two given points.
Setting tolerance to 1e-6, the program just takes a few seconds to generate 4000 points. Translating it to C/C++ would probably make it two orders of magnitude faster.
O(n) solution:
Pick a random number r from 0..1
The point added to the cloud is then P(cos(2 × π × r), sin(2 × π × r))
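This works because a straight line meets a circle in at most two points, so no three distinct points on the unit circle can be collinear. A small sketch, where the function name and the seeded generator are assumptions:

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct Point { double x, y; };  // hypothetical point type

// Generates n points on the unit circle, one random angle per point.
std::vector<Point> circle_cloud(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<Point> cloud;
    cloud.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double r = unit(gen);  // r in [0, 1)
        cloud.push_back({std::cos(two_pi * r), std::sin(two_pi * r)});
    }
    return cloud;
}
```

The result is far from a uniform cloud over the plane, and two draws of `r` could in principle coincide, so whether it is acceptable depends on the distribution requirements.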
Is there a fast way, and if so what is it, to check where my line will intersect a plane, given that the plane is always at a fixed z (so it cannot be rotated) and its width/height is infinite? Also, my "line" isn't actually a line segment but a 3D vector, so the "line" can extend to infinite distance.
Here is the code that relies on two points:
(p1 and p2 are start and end points of the line. plane_z = where the plane is)
k1 = (plane_z-p2.z)/(p1.z-p2.z);
k2 = 1.0f-k1;
ix = k1*p1.x + k2*p2.x;
iy = k1*p1.y + k2*p2.y;
iz = plane_z; // where my plane lays
Another solution, which works with a vector (I made it use two points as the first example did; "p2.x-p1.x" etc. is the vector calculation):
a = (plane_z-p1.z)/(p2.z-p1.z);
ix = p1.x + a*(p2.x-p1.x);
iy = p1.y + a*(p2.y-p1.y);
iz = plane_z;
Edit3: added Orbling's solution, which is slightly faster and doesn't necessarily rely on two points.
You can implement a straightforward solution like the one at http://paulbourke.net/geometry/planeline/, then apply your simplifications. In the algebraic solution (#2), A and B are zero in your case (if I understand this statement correctly):
plane is always in the same z-axis (so it cannot be rotated)
Note: your line should be a point and a direction, or two points right?
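With the line given as a point and a direction, the constant-z case reduces to one division. The sketch below uses hypothetical names (`Vec3`, `intersect_z_plane`) and reports a line parallel to the plane instead of dividing by zero.

```cpp
struct Vec3 { double x, y, z; };  // hypothetical vector type

// Intersects the line origin + a * dir with the plane z = plane_z.
// Returns false when the line is parallel to the plane (dir.z == 0).
bool intersect_z_plane(const Vec3& origin, const Vec3& dir, double plane_z, Vec3& hit) {
    if (dir.z == 0.0) return false;
    const double a = (plane_z - origin.z) / dir.z;
    hit = {origin.x + a * dir.x, origin.y + a * dir.y, plane_z};
    return true;
}
```

A negative `a` means the intersection lies behind the starting point, which matters if the "line" is meant as a ray.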