I have N points in a 2D cartesian space loaded in a boost:rtree.
Given a random point P(x,y) not in the tree, I need to find an effective way to identify the nearest point for each of the four quadrant of generated by the local csys centered in P and parallel to the main csys
As shown in the image (linked above), given the red point I need to find the four purple points.
I tried this naive approach:
namespace bg = boost::geometry;
typedef bg::model::box<point> box;
vector<item> result_s;
vector<item> result_p;
int xres = 10; /*this is a fixed amount that is loosely related to the points distribution*/
int yres = 10; /*as for xres*/
int range = 10;
int maxp = 30;
/*
* .. filling the tree
*/
box query_box2(point(lat, lon), point(lat-range*yres, lon+range*xres));
rtree.query(bgi::intersects(query_box2) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box1(point(lat, lon), point(lat+range*yres, lon+range*xres));
rtree.query(bgi::intersects(query_box1) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box3(point(lat, lon), point(lat+range*yres, lon-range*xres));
rtree.query(bgi::intersects(query_box3) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
box query_box4(point(lat, lon), point(lat-range*yres, lon-range*xres));
rtree.query(bgi::intersects(query_box4) && bgi::nearest(p, maxp), std::back_inserter(result_p));
if(result_p.size()>0) result_s.push_back(result_p[0]);
result_p.clear();
if(result_s.size()>3)
cout << "OK!" << endl;
else
cout << "KO" << endl;
but often it end up with an empty result (KO)
Any suggestion or address will be very appreciated.
Tnx.
I would do an iterated nearest query.
It will produce nearest points ordered by distance ascending.
You can cancel it after you received at least 1 point in all quadrants.
In principle the time complexity of this approach is MUCH lower because it involves only a single query.
Worst case behaviour would iterate all points in the tree e.g.
if one quadrant doesn't contain any points, or
when all the points in one quadrant are actually closer than the closest point in another quadrant.
Seems like the former might not be possible in your model (?) and the latter is statistically unlikely with normal distributions. You'd have to check your domains expected point distributions.
Or, and this always applies: MEASURE and compare the effective performance
Use a modified distance function. More precisely, use four.
The main idea is to use a distance such that
d(v1,v2) = infinity if v2.x < v1.x
d(v1,v2) = infinity if v2.y < v1.y
d(v1,v2) = (v1.x-v2.x)²+(v1.y-v2.y)² otherwise
If you search for the nearest point with this distance, it must be in the top right quadrant.
You'll need to extend this logic to minDist when searching the tree.
The benefit is that it can stop searching a quadrant when it has found a point. Pages that overlap the "axes" may be expanded twice though.
Related
I have an input of n unique points (X,Y) that are between 0 and 2^32 inclusive. The coordinates are integers.
I need to create an algorithm that finds the number of pairs of points with a distance of exactly 2018.
I have thought of checking with every other point but it would be O(n^2) and I have to make it more efficient. I also thought of using a set or a vector and sort it using a comparator based on the distance with the origin point but it wouldn't help at all.
So how can I do it efficiently?
There is one Pythagorean triple with the hypotenuse of 2018: 11182+16802=20182.
Since all coordinates are integers, the only possible differences between the coordinates (both X an Y) of the two points are 0, 1118, 1680, and 2018.
Finding all pairs of points with a given difference between X (or Y) coordinates is a simple n log n operation.
Numbers other than 2018 might need a bit more work because they might be members of more than one Pythagorean triple (for example 2015 is a hypotenuse of 3 triples). If the number is not given as a constant, but provided at run time, you will have to generate all triples with this hypotenuse. This may require some sqrt(N) effort (N is the hypotenuse, not the number of points). One can find a recipe on the math stackexchange, e.g. here (there are many others).
You could try using a Quadtree. First you start sorting your points into the quadtree. You should specify a lower limit for the cell size of e.g. 2048 wich is a power of 2. Then iterate though the points and calculate distances to the points in the same cell and to the points in adjacent cells. That way you should be able to decrease the number of distance calculations drastically.
The main difficulty will probably be implementing the tree structure. You also have to find a way to find adjacent cells (you must include the possibility to traverse upwards in the tree)
The complexity of this is probably O(n*log(n)) in the best case but don't pin me down on that.
One additional word on the distance calculation: You are probably much faster if you don't do
dx = p1x - p2x;
dy = p1y - p2y;
if ( sqrt(dx*dx + dy*dy) == 2018 ) {
...
}
but
dx = p1x - p2x;
dy = p1y - p2y;
if ( dx*dx + dy*dy == 2018*2018 ) {
...
}
Squaring is faster than taking the sqare root. So just compare the square of the distance with the square of 2018.
I would like to write a C++ function which finds the median of an array of circular data.
For example, consider the reading from a compass where the readings are assumed to be in [0,360). Though 1 & 359 appears to be far away, they are very close due to the circular nature of the reading.
Finding median of N-elements in ordinary data is as follows.
1. sort the data of N-elements (ascending or descending order)
2. If N is odd, median is the (N+1)/2 th element in the sorted array.
3. If N is even, median is the average of the N/2 th and N/2+1 th elements in the sorted array.
However, the wrap around problem in the circular data takes the problem to a different dimension and the solution non-trivial.
A similar question to find mean from circular data is explained here How do you calculate the average of a set of circular data?
The suggestion in the above link is to find the unit vector corresponding to each angle and find the average. However, median requires sorting the data and sorting of vectors don't make any sense in this context. Hence I don't think we can use the proposed scheme to find median!
I've actually given this topic way more thought than is healthy so I'll share my thoughts and findings here. Maybe someone will have a similar problem and find this useful.
I haven't used C++ in many years so please forgive me if I write all the code in C#. I believe a fluent C++ speaker can pretty easily translate the algorithms.
Circular mean
First, let's define the circular mean. It's calculated by converting your points to radians, where your period (256, 360 or whatever - the value that is interpreted to be the same as zero) is scaled to 2*pi. You then calculate the sine and cosine of those radian values. Those are the y and x coordinates of your values on a unit circle. You then sum up all the sines and cosines and calculate atan2. This gives you the average angle, which can be easily converted back to your data point by dividing with the scaling factor.
var scalingFactor = 2 * Math.PI / period;
var sines = 0.0;
var cosines = 0.0;
foreach (var value in inputs)
{
var radians = value * scalingFactor;
sines += Math.Sin(radians);
cosines += Math.Cos(radians);
}
var circularMean = Math.Atan2(sines, cosines) / scalingFactor;
if (circularMean >= 0)
return circularMean;
else
return circularMean + period;
Marginal circular median
The simplest approach to a circular median is just a modified way of handling the circular mean.
The circular median can be calculated in a similar way, by just finding the median of the sines and cosines instead of the sums, and calculating the atan2 of that. This way, you are finding the marginal median of the circle points and taking its angle as a result.
var scalingFactor = 2 * Math.PI / period;
var sines = new List<double>();
var cosines = new List<double>();
foreach (var value in inputs)
{
var radians = value * scalingFactor;
sines.Add(Math.Sin(radians));
cosines.Add(Math.Cos(radians));
}
var circularMedian = Math.Atan2(Median(sines), Median(cosines)) / scalingFactor;
if (circularMedian >= 0)
return circularMedian;
else
return circularMedian + period;
This approach is O(n), robust to outliers and very simple to implement. It may suit your purposes well enough, but it has a problem: rotating the input points will give you different results. Depending on the distribution of your input data, it may or may not be a problem.
Circular arc median
To understand this other approach, you need to stop thinking of means and medians in terms of "this is how it's calculated", but in terms of what the resulting values actually represent.
For non-cyclic data, you get the mean by summing up all the values and dividing by the number of elements. What this number represents, though, is the value with the minimal sum of all squared distances to data elements. (I hear statisticians call this value the L2 estimate of location, but a statistician should probably confirm or deny this.)
Likewise for median. You get it by finding the data element that would end up in the middle if all data were sorted (ideally, using an O(n) selection algorithm, like nth_element in C++). What this number is, though, is a value that has the minimal sum of all absolute (non-squared!) distances to data elements. (Supposedly, this value is called an L1 estimate of location.)
Sorting circular data doesn't help you find a middle, so the usual way of thinking about medians doesn't work, but you can still find this point that minimizes the sum of absolute distances from all data points. Here's the algorithm that I came up with, that runs in O(n) time assuming the input data is normalized to >= 0 and < period, and then sorted. (If you need to do this sorting as part of your calculation, then the runtime is O(n log n).)
It works by going through all the data points and keeping track of the sum of distances. When you shift to the right data point by a distance D, the sum of distances to all the left points increases by D*LeftCount and the sum of all distances to all the right points decreases by D*RightCount. Then, if some of the left points are now actually the right points, because their left distance is larger than period/2, you subtract their previous distance and add the new, correct distance.
For comparing the current sum to the best sum, I added a bit of tolerance to guard against inexact floating point arithmetic.
There may be multiple or infinitely many points that satisfy the minimum distances condition. With non-circular medians with even number of values, the median can be any value between the two central values. It's usually taken to be the average of those two central values, so I took the similar approach with this median algorithm. I find all data points that minimize the distances and then just calculate the circular mean of those points.
// Requires a sorted list with values normalized to [0,period).
// Doing an initialization pass:
// * candidate is the lowest number
// * finding the index where the circle with this candidate starts
// * calculating the score for this candidate - the sum of absolute distances
// * counting the number of values to the left of the candidate
int i;
var candidate = list[0];
var distanceSum = 0.0;
for (i = 1; i < list.Count; ++i)
{
if (list[i] >= candidate + period / 2)
break;
distanceSum += list[i] - candidate;
}
var leftCount = list.Count - i;
var circleStart = i;
if (circleStart == list.Count)
circleStart = 0;
else
for (; i < list.Count; ++i)
distanceSum += candidate + period - list[i];
var previousCandidate = candidate;
var bestCandidates = new List<double> { candidate };
var bestDistanceSum = distanceSum;
var equalityTolerance = period * 1e-10;
for (i = 1; i < list.Count; ++i)
{
candidate = list[i];
// A formula for correcting the distance given the movement to the right.
// It doesn't take into account that some values may have wrapped to the other side of the circle.
++leftCount;
distanceSum += (2 * leftCount - list.Count) * (candidate - previousCandidate);
// Counting all the values that wrapped to the other side of the circle
// and correcting the sum of distances from the candidate.
if (i <= circleStart)
while (list[circleStart] < candidate + period / 2)
{
--leftCount;
distanceSum += 2 * (list[circleStart] - candidate) - period;
++circleStart;
if (circleStart == list.Count)
{
circleStart = 0;
break; // Letting the next loop continue.
}
}
if (i > circleStart)
while (list[circleStart] < candidate - period / 2)
{
--leftCount;
distanceSum += 2 * (list[circleStart] - candidate) + period;
++circleStart;
}
// Comparing current sum to the best one, using the given tolerance.
if (distanceSum <= bestDistanceSum + equalityTolerance)
{
if (distanceSum >= bestDistanceSum - equalityTolerance)
{
// The numbers are close, so using their average as the next best.
bestDistanceSum = (bestCandidates.Count * bestDistanceSum + distanceSum) / (bestCandidates.Count + 1);
}
else
{
// The new number is significantly better, clearing.
bestDistanceSum = distanceSum;
bestCandidates.Clear();
}
bestCandidates.Add(candidate);
}
previousCandidate = candidate;
}
if (bestCandidates.Count == 1)
return bestCandidates[0];
else
return CircularMean(bestCandidates, period);
Geometric circular median
There is an inconsistency in the previous algorithm, in the way the median is defined in relation to the circular mean. The circular mean minimizes the sum of squared euclidian distances between points on a circle. In other words, it looks at the straight lines connecting points on a circle, cutting through the circle.
The arc median, as I calculate it above, looks at the arc distances: how far the points are to each other by moving on the perimeter of the circle, not by taking a straight line between them.
I have thought about how to address this issue, if it bothers you, but I haven't really done any experiments so I can't claim the following method works. In short, I believe you could use a modification of the Iteratively reweighted least squares algorithm (IRLS), which is what is usually used to calculate geometric medians.
The idea is to pick a starting value (for instance, the circular mean or the arc median presented above), and calculate the euclidean distance to each point: Di = sqrt(dxi^2 + dyi^2). Circular mean will minimize the squares of those distances, so the weights of each point should cancel out the square and reset to just D: Wi = Di / Di^2, which is just Wi = 1 / Di.
With these weights, calculate the weighted circular mean (same as the circular mean, but multiply each sine and cosine by the weight of that point before summing them up) and repeat the process. Repeat until enough iterations have passed or until the result stops changing much.
The problem with this algorithm is that it has a division by zero if the current solution falls exactly on a data point. Even if the distance isn't exactly zero, the solution will stop moving if you hit close enough to the point because the weight will become enormous compared to all the other ones. This can be fixed by adding a small fixed offset to the distance before dividing by it. This will make the solution suboptimal, but at least it won't stop on a wrong point.
It will still take some number of iterations to dig itself out of that wrong point unless the offset is relatively large, and the final solution is worse the bigger the offset is. So the best way would probably be to start with a fairly large offset and then progressively making it smaller for each next iteration.
Two properties of median allow inventing two distinct algorithms for median finding.
1) Median minimizes sum of absolute distance to all other elements -- O(n^2) algo:
for (i = 0; i < N; i++)
{
sum = 0;
for (j = 0; j < N; j++)
sum += abs(item[i] - item[j]) % 360;
if (sum < best_so_far) { best_so_far = sum; index = i; }
}
2) Median satisfies that half of items are less and half are greater
sort the items
locate the first set of items (i=0...I), satisfying either that
I <= N/2, OR item[I] > i + 180
if the condition for median is not satisfied, advance either i, or I.
requires O(N*log N) for sorting and O(N) for the next scan
Of course in cyclical data all items (and all items inbetween data points) can be a proper candidate for the median.
For definition and discussion of circular median see
N.I. Fisher's 'Statistical Analysis of Circular Data', Cambridge Univ. Press 1993
and the discussion surrounding equations 2.32 and 2.33. For multi-modal or isotropic data a unique median may not exist.
Find an axis that divides the data into 2 equal groups and choose the end of the axis at the smaller value of the angle. If the sample size is odd the median will be a data point, otherwise it will be the midpoint of 2 data points.
There are packages in other languages (e.g. R, MatLab) that would help provide test values for any function you write.
e.g.
https://www.rdocumentation.org/packages/circular/versions/0.4-93
See in particular median.circular and medianHL.circular
or
Berens, Philipp. ‘CircStat: A MATLAB Toolbox for Circular Statistics’. Journal of Statistical Software 31, no. 1 (23 September 2009): 1–21. https://doi.org/10.18637/jss.v031.i10.
and see circ_median
With your vector of angular datapoints (i.e. vector of numbers from 0 to 259), create two new vectors, I'll call them x and y. These two new vectors are the sine and cosine respectively of your angular datapoints.
That is, x[n] = cos(data[n]) and y[n] = sin(data[n]) where data is your angular data vector and n is however many datapoints there are.
Next, add up all the values in the x vector to get a single value, call it say sum_x and add up all the values in the y vector to get a another single value, call it sum_y.
Now you can do tangent-inverse (e.g. atan(sum_y/sum_x)) to get a new value. And this value is very meaningful. This value is basically telling you which direction your data is "pointing", i.e. where the majority of your data exists. NOTE: You must be careful of dividing by 0 (when sum_x=0) and when the indeterminate forms occurs (when both sum_x=0 and sum_y=0). The indeterminate form just means your data is evenly distributed, in which case the median is meaningless, and when sum_x=0 but sum_y!=0, then it is effectively atan(inf) or atan(-inf), both of which are known.
EDIT:
My previous answer needed some tweaking after this point.
From here, it is easy. Take the value you got in the previous step (atan(sum_y/sum_x)) and add 180 degrees to that value. This is your reference point of where your data starts and ends. From here, you can sort your angular data with this reference point as both the starting and ending point, and find the median of that data.
It is not possible to canonically extend the concept of median to circular data. For the sake of simplicity lets consider numbers in [0 10) and as an example the (already ordered) set { 1 3 5 7 8 }. Depending on how you rotate the array you get different values for the median:
1 3 5 7 8 -> 5
3 5 7 8 1 -> 7
5 7 8 1 3 -> 8
...etc...
and any is as good as the other.
I am not claiming that it is not possible to define a median on circular data. I am just claiming that the "normal" median cannot be extended to that case in a meaningful way without adding additional constraints or making an arbitrary choice.
I'am trying to find out an algorithm to recognize circle in array of points.
Lets say that I've got points array where circle could or could not be stored (that also means array doesn't have to store only circle's points, there could be some "extra" points before or after circle's data).
I've already tried some algorithms but none of them work properly with those "extra" points. Have you got any ideas how to deal with this problem?
EDIT// I didn't mention that before. I want this algorithm to be used on circle gesture recognition. I've thought I would have data in array (for last few seconds) and by analysing this data in every tracking frame I would be able to say if there was or was not a circle gesture.
First I calculate the geometric mean (not the aritmetic mean) for each X and Y component.
I choose geometric mean because one feature is that small values (with respect to the arithmetic mean ) of the values are much more influential than the large values.
This lead me to the theoretical center of all points: circ_center
Then I calculate the standard deviation of distance of each point to center: stddev. This gives me the "indicator" to quantify the amount of variation. One property of circle is that all circumference point is at the same distance of it's center. With standard dev I try to test if your points are (with max variance threshold: max_dispersion) equally distance.
Last I calculates the average distance of points inside max_dispersion threshold from center, this give me the radius of the circle: avg_dist.
Parameters:
max_dispersion represents the "cicle precision". Smaller means more precise.
min_points_needed is the minimun number of points valid to be considered as circumference.
This is just an attempt, I have not tried. Let me know.
I will try this (in pseudo language)
points_size = 100; //number_of_user_points
all_poins[points_size]; //coordinates of points
//thresholds to be defined by user
max_dispersion = 20; //value of max stddev accepted, expressed in geometric units
min_points_needed = 5; //minimum number of points near the circumference
stddev = 0; //standard deviation of points from center
circ_center; //estimated circumference center, using Geometric mean
num_ok_points = 0; //points with distance under standard eviation
avg_dist = 0; //distance from center of "ok points"
all_x = 1; all_y = 1;
for(i = 0 ; i < points_size ; i++)
{
all_x = all_x * all_poins[i].x;
all_y = all_y * all_poins[i].y;
}
//pow(x, 1/y) = nth root
all_x = pow(all_x, 1 / points_size); //Geometric mean
all_y = pow(all_y, 1 / points_size); //Geometric mean
circ_center = make_point(all_x, all_y);
for(i = 0 ; i < points_size ; i++)
{
dist = distance(all_poins[i], circ_center);
stddev = stddev + (dist * dist);
}
stddev = square_root(stddev / points_size);
for(i = 0 ; i < points_size ; i++)
{
if( distance(all_poins[i], circ_center) < max_dispersion )
{
num_ok_points++;
avg_dist = avg_dist + distance(all_poins[i], circ_center);
}
}
avg_dist = avg_dist / num_ok_points;
if(stddev <= max_dispersion && num_ok_points >= min_points_needed)
{
circle recognized; it's center is circ_center; it's radius is avg_dist;
}
Can we assume the array of points are mostly on or near to the circumference of the circle?
A circle has a center and radius. If you can determine the circle's center coordinates, via the intersection of perpendiculars of two chords, then all the true circle points should be equidistant(r), from the center point.
The false points can be eliminated by not being equidistant (+-)tolerance from the center point.
The weakness of this approach is how well can you determine the center and radius? You may want to try a least squares approach to computing the center coordinates.
To answer the initially stated question, my approach would be to iterate through the points and derive the center of a circle from each consecutive set of three points. Then, take the longest contiguous subset of points that create circles with centers that fall within some absolute range. Then determine if the points wind consistently around the average of the circles. You can always perform some basic heuristics on any discarded data to determine if a circle is actually what the user wanted to make though.
Now, since you say that you want to perform gesture recognition, I would suggest you think of a completely different method. Personally, I would first create a basic sort of language that can be used to describe gestures. It should be very simple; the only words I would consider having are:
Start - Denotes the start of a stroke
Angle - The starting angle of the stroke. This should be one of the eight major cardinal directions (N, NW, W, SW, S, SE, E, NE) or Any for unaligned gestures. You could also add combining mechanisms, or perhaps "Axis Aligned" or other such things.
End - Denotes the end of a stroke
Travel - Denotes a straight path in the stroke
Distance - The percentage of the total length of the path that this particular operation will consume.
Turn - Denotes a turn in the stroke
Direction - The direction to turn in. Choices would be Left, Right, Any, Previous, or Opposite.
Angle - The angle of the turn. I would suggest you use just three directions (90 deg, 180 deg, 270 deg)
Tolerance - The maximum tolerance for deviation from the specified angle. This should have a default of somewhere around 45 degrees in either direction for a high chance of matching the angle in a signature.
Type - Hard or Radial. Radial angles would be a stroke along a radius. Hard angles would be a turn about a point.
Radius - If the turn is radial, this is the radius of the turn (units are in percentage of total path length, with appropriate conversions of course)
Obviously you can make the angles much more fine, but the coarser the ranges are, the more tolerant of input error it can be. Being too tolerant can lead to misinterpretation though.
If you apply some fuzzy logic, it wouldn't be hard to break just about any gesture down into a language like this. You could then create a bunch of gesture "signatures" that describe various gestures that can be performed. For instance:
//Circle
Start Angle=Any
Turn Type=Radial Direction=Any Angle=180deg Radius=50%
Turn Type=Radial Direction=Previous Angle=180deg Radius=50%
End
//Box
Start Angle=AxisAligned
Travel Distance=25%
Turn Type=Hard Direction=Any Angle=90deg Tolerance=10deg
Travel Distance=25%
Turn Type=Hard Direction=Previous Angle=90deg Tolerance=10deg
Travel Distance=25%
Turn Type=Hard Direction=Previous Angle=90deg Tolerance=10deg
Travel Distance=25%
End
If you want, I could work on an algorithm that could take a point cloud and degenerate it into a series of commands like this so you can compare them with pre-generated signatures.
At the moment i'm working with a camera to detect a marker. I use opencv and the Aruco Libary.
Only I'm stuck with a problem right now. I need to detect if the distance between 2 marker is less than a specific value. I have a function to calculate the distance, I can compare everything. But I'm looking for the most efficient way to keep track of all the markers (around 5/6) and how close they are together.
There is a list with markers but I cant find a efficient way to compare all of them.
I have a
Vector <Marker>
I also have a function called getDistance.
double getDistance(cv::Point2f punt1, cv::Point2f punt2)
{
float xd = punt2.x-punt1.x;
float yd = punt2.y-punt1.y;
double Distance = sqrtf(xd*xd + yd*yd);
return Distance;
}
The Markers contain a Point2f, so i can compare them easily.
One way to increase performance is to keep all the distances squared and avoid using the square root function. If you square the specific value you are checking against then this should work fine.
There isn't really a lot to recommend. If I understand the question and I'm counting the pairs correctly, you'll need to calculate 10 distances when you have 5 points, and 15 distances when you have 6 points. If you need to determine all of the distances, then you have no choice but to calculate all of the distances. I don't see any way around that. The only advice I can give is to make sure you calculate the distance between each pair only once (e.g., once you know the distance between points A and B, you don't need to calculate the distance between B and A).
It might be possible to sort the vector in such a way that you can short circuit your loop. For instance, if you sort it correctly and the distance between point A and point B is larger than your threshold, then the distances between A and C and A and D will also be larger than the threshold. But keep in mind that sorting isn't free, and it's likely that for small sets of points it would be faster to just calculate all distances ("Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. ... For example, binary trees are always faster than splay trees for workaday problems.").
Newer versions of the C and C++ standard library have a hypot function for calculating distance between points:
#include <cmath>
double getDistance(cv::Point2f punt1, cv::Point2f punt2)
{
return std::hypot(punt2.x - punt1.x, punt2.y - punt1.y);
}
It's not necessarily faster, but it should be implemented in a way that avoids overflow when the points are far apart.
One minor optimization is to simply check if the change in X or change in Y exceeds the threshold. If it does, you can ignore the distance between those two points because the overall distance will also exceed the threshold:
const double threshold = ...;
std::vector<cv::Point2f> points;
// populate points
...
for (auto i = points.begin(); i != points.end(); ++i) {
for (auto j = i + 1; j != points.end(); ++j) {
double dx = std::abs(i->x - j->x), dy = std::abs(i->y - j->y);
if (dx > threshold || dy > threshold) {
continue;
}
double distance = std::hypot(dx, dy);
if (distance > threshold) {
continue;
}
...
}
}
If you're dealing with large amounts of data inside your vector you may want to consider some multithreading using future.
Vector <Marker> could be chunked into X chunks which are asynchronously computed together and stored inside std::future<>, putting to use #Sesame's suggestion will also increase your speed as well.
I am trying to calculate the centre of mass (x,y,z) coordinates of an object defined in an STL file (stereo lithography, not to be confused with the standard template library). The STL file contains a closed object (or objects) defined by a boundary made of triangles. The triangles themselves are not necessarily in any order, the file is simply the coordinates 3 vertices of each triangle floating in 3D space plus a normal vector to the triangle (the normal should be disregarded as it is not always done properly). There is nothing that links each triangle to one another, it is assumed that the object is closed.
One simple approach would be to divide a volume (in this case, a box) into millions of elements and determine if each element is inside the object defined in the STL file or not, then sum up the moments and calculate the centre of mass. This would work but its far from elegant and extremely slow.
Another method would be to convert the boundary representation into a number of packed tetrahedron solids. Form that I could calculate the centre of mass of each tetrahedron, its volume, and resulting moment and thus calculate the overall centre of mass from the sum of all tetrahedrons. The problem with this is that I don't know how to convert a surface representation of triangles into a volume representation of tetrahedrons (I'm assuming its a fairly non trivial task).
Dose anyone know of any methods or can think up of any methods that I could try? Or maybe even any reference material that talks about this?
For more information about STL files (only the first 2 sections are important, everything else is useless): http://en.wikipedia.org/wiki/STL_%28file_format%29
After a lot of thinking and experimentation I have the answer!
First we add a 4th point to each triangle in to make them into tetrahedrons with a volume centroid. We calculate the volumes and centres of masses and multiply them by each other to get our moments. We sum the moments and divide by total volume to get our overall centroid.
We calculate volumes using the determinate method shown here (equation 32): http://mathworld.wolfram.com/Tetrahedron.html
The centroids of each of the tetrahedrons is simply the average of the 4 points.
The trick here is that due to the way the STL file is created, the triangles have a normal that point outwards from the part surface, following the right hand rule of the 3 verticies used to create the triangle. we can use this to our advantage by allowing us to have a consistent convention in which to determine if a volume of the tetrahedron should be added or subtracted from our net part (this is because the reference point we chose may not necessarily be inside the part and the overall part is not necessarily convex, it is, however a closed object).
Using the determine method to calculate the volume, the first three coordinate points will represent the three points of our triangle. The fourth point would be our common origin. If the normal created by the triangle (following the right hand rule going from point 1, 2, 3) points towards our common reference point, that volume will be calculated as not part of our overall solid, or negative volume (by pointing towards, i mean the vector created by the triangle's normal is pointing loosely towards the same side as a normal plane created by the vector from our reference point to the centroid of the tetrahedron). If the vector is pointing away from the reference point, it is then positive volume or inside the part. If it is normal, then the volume goes to zero as the triangle is in the same plane as the reference point.
We don't need to worry about actually keeping track of any of this as if we are consistent with our inputs (as in the triangles follow the right hand rule with normal facing outwards from the part) the determine will give us the correct sign.
Anyways, heres the code (its even more simple than the explanation).
class data // 3 vertices of each triangle
{
public:
float x1,y1,z1;
float x2,y2,z2;
float x3,y3,z3;
};
int main ()
{
int numTriangles; // pull in the STL file and determine number of triangles
data * triangles = new triangles [numTriangles];
// fill the triangles array with the data in the STL file
double totalVolume = 0, currentVolume;
double xCenter = 0, yCenter = 0, zCenter = 0;
for (int i = 0; i < numTriangles; i++)
{
totalVolume += currentVolume = (triangles[i].x1*triangles[i].y2*triangles[i].z3 - triangles[i].x1*triangles[i].y3*triangles[i].z2 - triangles[i].x2*triangles[i].y1*triangles[i].z3 + triangles[i].x2*triangles[i].y3*triangles[i].z1 + triangles[i].x3*triangles[i].y1*triangles[i].z2 - triangles[i].x3*triangles[i].y2*triangles[i].z1) / 6;
xCenter += ((triangles[i].x1 + triangles[i].x2 + triangles[i].x3) / 4) * currentVolume;
yCenter += ((triangles[i].y1 + triangles[i].y2 + triangles[i].y3) / 4) * currentVolume;
zCenter += ((triangles[i].z1 + triangles[i].z2 + triangles[i].z3) / 4) * currentVolume;
}
cout << endl << "Total Volume = " << totalVolume << endl;
cout << endl << "X center = " << xCenter/totalVolume << endl;
cout << endl << "Y center = " << yCenter/totalVolume << endl;
cout << endl << "Z center = " << zCenter/totalVolume << endl;
}
Extremely fast for calculating centres of mass for STL files.
EDIT: Look up "winding number algorithm" or "crossing number algorithm" - what I try to describe below is a 3-d crossing number algorithm.
I've got a feeling that something like this will work, but I don't have the ability to test it out, right now:
Build the filled-in 3-d structure from the triangles in the STL file iteratively. Start by picking a single point to use as a basis for the 3-d structure. Then, begin your structure by creating a triangular pyramid, with base defined by the first triangle in the STL file, and vertex your chosen point. Each such component of your iteratively built volume would also contain an "intersection parity" - initialize it to 0.
For each subsequent triangle in the STL file, create a similar pyramid, and see if it intersects with the 3-d structure that you've built so far. If it does, calculate the intersection, and segment the existing structure and the new pyramid so that no two components overlap. Keep the "intersection parity" of the outermost part of the new polyhedron 0, but toggle it on all inner portions of the intersection -- if it was 0, make it 1, if it was 1, make it 0.
At the end, you'll have the closed polyhedron defined by all the portions of your structure that have intersection parity 0. Calculate the moments of all of these polyhedrons, and average them together to get your center of mass. I think the complexity would be something like O(n^2).