Approximation for average distance to nearest neighbour? - c++

And another algorithm I'm looking for: A free C/C++ implementation of the average distance to nearest neighbour problem.
So basically I have a cloud of points in 3D and I want the average over the distances between all points and their respective nearest neighbours. The easiest way to do this would be to find the nearest neighbour for every point, calculate the distance from that neighbour to the point, and divide the sum of those distances by the number of points. However, there are much better algorithms, as this approach has a lot of redundancy, and the approximate versions run even faster. I'm looking for a free C/C++ implementation of those better algorithms.
An ε-approximation is fine.

The C++ library FLANN allows you to do "fast approximate nearest-neighbor searches." It's written in C++ and claims to be one of the fastest implementations of this sort of search available.
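A minimal sketch of how FLANN could be used for this; the index and search parameters here are assumptions you should check against your FLANN version. Each point is queried against the whole set and the second neighbour is taken, since the first hit is the point itself; FLANN's L2 functor reports squared distances, so a square root is taken before averaging.

    #include <flann/flann.hpp>

    #include <cmath>
    #include <vector>

    // Average nearest-neighbour distance over a 3D cloud.
    // `coords` holds n*3 floats laid out as x0,y0,z0, x1,y1,z1, ...
    double averageNearestNeighbourDistance(std::vector<float>& coords, int n)
    {
        flann::Matrix<float> dataset(coords.data(), n, 3);

        // 4 randomized kd-trees; tune trees/checks for speed vs. accuracy.
        flann::Index<flann::L2<float>> index(dataset, flann::KDTreeIndexParams(4));
        index.buildIndex();

        // Ask for 2 neighbours per query point: the first hit is the point itself.
        std::vector<int>   idx(n * 2);
        std::vector<float> d2(n * 2);
        flann::Matrix<int>   indices(idx.data(), n, 2);
        flann::Matrix<float> dists(d2.data(), n, 2);
        index.knnSearch(dataset, indices, dists, 2, flann::SearchParams(64));

        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            sum += std::sqrt(d2[i * 2 + 1]);   // L2 gives squared distances
        return n > 0 ? sum / n : 0.0;
    }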
Hope this helps!

You might try a Quadtree, as described in this question. There are many implementations for your problem in other 3D/2D graphics libraries, too.
I used GEOS, the 'Geometry Engine, Open Source', in a project some years ago and was very satisfied.

Related

Point to point path in a Graph

I want an algorithm to be able to find an optimal path between two vertices on a graph (with positive integer weights). The thing is, my graph is relatively big (up to 100 vertices). I have considered Dijkstra's algorithm, but as I searched the net, most implementations use the adjacency matrix, which in my case would be 100x100.
If you could recommend a good source to read and learn from, or even better provide me with a C++ implementation, that would be great.
PS: The algorithm needs to output the required route and not just the shortest distance between two points.
Thank you for your time.
Have you looked into A*?
Here's a good article to start reading: http://www.redblobgames.com/pathfinding/a-star/introduction.html
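For a graph this small either Dijkstra or A* will finish instantly. Here is a minimal sketch of Dijkstra over an adjacency list (no adjacency matrix needed) that reconstructs the actual route via a predecessor array; the Edge/Graph types are just illustrative, and A* is the same loop with a heuristic added to the priority key.

    #include <algorithm>
    #include <functional>
    #include <limits>
    #include <queue>
    #include <utility>
    #include <vector>

    // Illustrative adjacency-list graph: g[u] lists the edges leaving u.
    struct Edge { int to; int weight; };
    using Graph = std::vector<std::vector<Edge>>;

    // Returns the vertex sequence from start to goal,
    // or an empty vector if goal is unreachable.
    std::vector<int> shortestPath(const Graph& g, int start, int goal)
    {
        const int INF = std::numeric_limits<int>::max();
        std::vector<int> dist(g.size(), INF), prev(g.size(), -1);
        using State = std::pair<int, int>;          // (distance, vertex)
        std::priority_queue<State, std::vector<State>, std::greater<State>> pq;

        dist[start] = 0;
        pq.push({0, start});
        while (!pq.empty()) {
            auto [d, u] = pq.top();
            pq.pop();
            if (d > dist[u]) continue;              // stale queue entry
            if (u == goal) break;
            for (const Edge& e : g[u]) {
                if (dist[u] + e.weight < dist[e.to]) {
                    dist[e.to] = dist[u] + e.weight;
                    prev[e.to] = u;
                    pq.push({dist[e.to], e.to});
                }
            }
        }

        std::vector<int> path;
        if (dist[goal] == INF) return path;         // no route
        for (int v = goal; v != -1; v = prev[v]) path.push_back(v);
        std::reverse(path.begin(), path.end());
        return path;
    }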

Building the tetrahedra of a set of random points - tetrahedralization

I have a set of points (1 million of them, possibly more in the future, like 10 or 100 million) in 3D space that forms a sphere (they fill the sphere - they're not just on the surface) and I would like to build the tetrahedra that connect each point to its first neighbours... Searching for tetrahedralization, all I have found so far is:
algorithms for meshing, but they fill empty spaces as far as I understand, whereas my points are fixed.
algorithms for surface viewing, which is quite irrelevant
algorithms for 3D image viewing (in the medical field, mostly): which is closer but does not quite do the trick.
How can I do this?
2014-08-09 First off, thanks to you all for your suggestions! I was - and still am - on holidays and was just passing by to check whether anyone had answered... I am not disappointed !!!! :-)
I guess I'll first try CGAL, and will see from there. I have other data calculations on the same set of points in O(n²) that I expect will last about 1 week, so a few hours would not be that bad. Minutes would be a dream come true!
You appear to be looking for a Delaunay triangulation algorithm in 3-space.
I hope you don't mind waiting a while, because a Delaunay triangulation of 100 million points is going to take quite some time.
qhull has an n-dimensional Delaunay implementation that you might try. So does CGAL. Both packages will compute the Delaunay triangulation in O(n log(n)) asymptotic time, and CGAL can, with an appropriate choice of geometry kernels, do so in a numerically robust fashion. (That is, it can automatically switch to exact arithmetic for those computations where inexact arithmetic produces an uncertain result.)
I would not recommend trying to implement a fast Delaunay triangulation yourself, even in two dimensions. Terrifying things can happen when you need to evaluate predicates on the results of arithmetic.
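A minimal sketch using CGAL's 3D Delaunay triangulation; the kernel choice and the handful of sample points are just placeholders for your real cloud.

    #include <CGAL/Delaunay_triangulation_3.h>
    #include <CGAL/Exact_predicates_inexact_constructions_kernel.h>

    #include <iostream>
    #include <vector>

    typedef CGAL::Exact_predicates_inexact_constructions_kernel K;
    typedef CGAL::Delaunay_triangulation_3<K>                   Delaunay;
    typedef K::Point_3                                          Point;

    int main()
    {
        // Replace with your real point cloud; a few points keep the sketch short.
        std::vector<Point> points = {
            Point(0, 0, 0), Point(1, 0, 0), Point(0, 1, 0),
            Point(0, 0, 1), Point(1, 1, 1)
        };

        Delaunay dt(points.begin(), points.end());

        // Each finite cell is one tetrahedron; print its four corner points.
        for (auto cell = dt.finite_cells_begin(); cell != dt.finite_cells_end(); ++cell) {
            for (int i = 0; i < 4; ++i)
                std::cout << cell->vertex(i)->point() << (i < 3 ? " | " : "\n");
        }
        return 0;
    }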
I used tetgen in one of my projects to do tetrahedralization. It works quite well and is fast enough.

Nearest neighbour search on graphics hardware

Given a huge collection of points (float64) in 2d space...
Is there a way to determine the nearest neighbour using a feature of OpenGL or DirectX?
I've implemented a kd-tree, which is still not fast enough.
A kd-tree should work just fine, but here are some hints.
I implemented a kd-tree for a million-point data set once. Here's what I learned from it:
Did you try profiling your code? You might find that there are easy optimizations to make such as common helper functions needing to be forced inline.
Did you actually test your code to validate that it is culling tree branches for partitions that are easily identified as "too far away"? If you aren't careful, you can easily have a bug that does needless distance computations on points too far away.
Easiest thing: when comparing distances between points, you don't need to take the square root of (x2-x1)² + (y2-y1)²; comparing the squared distances works just as well.
Most of the time spent in my code was just building the tree from the original data set, including multiple full sorts on each iteration deciding which axis was the best to partition on. An easier algorithm would be to just alternate between partitioning on the x and y axis for each tree branch and to cache the sorting order for each axis. It may not build the most optimal search tree, but the overall savings can be enormous.
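A rough sketch of that cheaper build strategy (the names and structure are illustrative, not a drop-in replacement for your tree): alternate the split axis by depth and use std::nth_element to place the median instead of fully sorting, and compare squared distances during the search.

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Point2 { double x, y; };

    struct KdNode {
        Point2  pt;
        KdNode* left  = nullptr;
        KdNode* right = nullptr;
    };

    // Build over pts[begin, end): alternate the split axis by depth and use
    // nth_element to place the median, which is O(n) per level on average
    // instead of a full sort.
    KdNode* build(std::vector<Point2>& pts, std::size_t begin, std::size_t end, int depth)
    {
        if (begin >= end) return nullptr;
        std::size_t mid = begin + (end - begin) / 2;
        if (depth % 2 == 0)
            std::nth_element(pts.begin() + begin, pts.begin() + mid, pts.begin() + end,
                             [](const Point2& a, const Point2& b) { return a.x < b.x; });
        else
            std::nth_element(pts.begin() + begin, pts.begin() + mid, pts.begin() + end,
                             [](const Point2& a, const Point2& b) { return a.y < b.y; });
        KdNode* node = new KdNode{pts[mid]};
        node->left  = build(pts, begin,   mid, depth + 1);
        node->right = build(pts, mid + 1, end, depth + 1);
        return node;
    }

    // For nearest-neighbour tests, compare squared distances; no sqrt needed.
    double dist2(const Point2& a, const Point2& b)
    {
        double dx = a.x - b.x, dy = a.y - b.y;
        return dx * dx + dy * dy;
    }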

Assurance of ICP, internal Metrics

So I have an iterative closest point (ICP) algorithm that has been written and will fit a model to a point cloud. As a quick tutorial for those not in the know: ICP is a simple algorithm that fits points to a model, ultimately providing a homogeneous transform matrix between the model and the points.
Here is a quick picture tutorial.
Step 1. Find the closest point in the model set to your data set:
Step 2: Using a bunch of fun maths (sometimes based on gradient descent or SVD), pull the clouds closer together and repeat until a pose is formed:
(Figure 2)
Now that bit is simple and working. What I would like help with is:
How do I tell if the pose that I have is a good one?
So currently I have two ideas, but they are kind of hacky:
How many points are used by the ICP algorithm. I.e., if I am fitting to almost no points, I assume that the pose will be bad:
But what if the pose is actually good? It could be, even with few points. I don't want to reject good poses:
So what we see here is that few points can actually give a very good pose if they are in the right place.
So the other metric investigated was the ratio of the supplied points to the used points. Here's an example
Now we exclude points that are too far away because they will be outliers. This means we need a good starting position for the ICP to work, but I am OK with that. Now in the above example the assurance check will say NO, this is a bad pose, and it would be right, because the ratio of points used to points supplied is:
2/11 < SOME_THRESHOLD
So that's good, but it will fail in the case shown above where the triangle is upside down. It will say that the upside-down triangle is good because all of the points are used by ICP.
You don't need to be an expert on ICP to answer this question; I am looking for good ideas. Using knowledge of the points, how can we classify whether it is a good pose solution or not?
Using both of these solutions together in tandem is a good suggestion, but it's a pretty lame solution if you ask me - very dumb to just threshold it.
What are some good ideas for how to do this?
PS. If you want to add some code, please go for it. I am working in C++.
PPS. Someone help me with tagging this question; I am not sure where it should fall.
One possible approach might be comparing poses by their shapes and their orientation.
Shape comparison can be done with the Hausdorff distance up to isometry; that is, poses have the same shape if
d(I(actual_pose), calculated_pose) < d_threshold
where d_threshold should be found from experiments. As isometric modifications of X I would consider rotations by different angles - seems to be sufficient in this case.
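Here is a brute-force sketch of that symmetric Hausdorff distance d between two point clouds; it is O(n·m), which is fine for small pose point sets, and the P3 type is just illustrative.

    #include <algorithm>
    #include <cmath>
    #include <limits>
    #include <vector>

    struct P3 { double x, y, z; };

    double dist(const P3& a, const P3& b)
    {
        double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
        return std::sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Directed Hausdorff distance: for each point of A, the distance to the
    // closest point of B; return the largest of those values.
    double directedHausdorff(const std::vector<P3>& A, const std::vector<P3>& B)
    {
        double worst = 0.0;
        for (const P3& a : A) {
            double best = std::numeric_limits<double>::max();
            for (const P3& b : B) best = std::min(best, dist(a, b));
            worst = std::max(worst, best);
        }
        return worst;
    }

    // Symmetric Hausdorff distance between the two clouds.
    double hausdorff(const std::vector<P3>& A, const std::vector<P3>& B)
    {
        return std::max(directedHausdorff(A, B), directedHausdorff(B, A));
    }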
If the poses have the same shape, we should compare their orientation. To compare orientation we could use a somewhat simplified Freksa model. For each pose we should calculate the values
{x_y min, x_y max, x_z min, x_z max, y_z min, y_z max}
and then make sure that each difference between corresponding values for the poses does not exceed another_threshold, derived from experiments as well.
Hopefully this makes some sense, or at least you can draw something useful for your purpose from this.
ICP attempts to minimize the distance between your point-cloud and a model, yes? Wouldn't it make the most sense to evaluate it based on what that distance actually is after execution?
I'm assuming it tries to minimize the sum of squared distances between each point you try to fit and the closest model point. So if you want a metric for quality, why not just normalize that sum by dividing by the number of points it's fitting? Yes, outliers will disrupt it somewhat, but they're also going to disrupt your fit somewhat.
It seems like any calculation you can come up with that provides more insight than whatever ICP is minimizing would be more useful incorporated into the algorithm itself, so it can minimize that too. =)
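A minimal sketch of that idea, assuming you can get at the matched pairs after ICP converges; the Point3 type and the pairing of data[i] with matchedModel[i] are assumptions about your code.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Point3 { double x, y, z; };

    // Root-mean-square distance between each used data point and its matched
    // model point, i.e. the ICP residual normalized by the number of
    // correspondences so fits with different point counts stay comparable.
    double rmsFitError(const std::vector<Point3>& data,
                       const std::vector<Point3>& matchedModel)
    {
        double sumSq = 0.0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            double dx = data[i].x - matchedModel[i].x;
            double dy = data[i].y - matchedModel[i].y;
            double dz = data[i].z - matchedModel[i].z;
            sumSq += dx * dx + dy * dy + dz * dz;
        }
        return data.empty() ? 0.0 : std::sqrt(sumSq / data.size());
    }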
Update
I think I didn't quite understand the algorithm. It seems that it iteratively selects a subset of points, transforms them to minimize error, and then repeats those two steps? In that case your ideal solution selects as many points as possible while keeping error as small as possible.
You said combining the two terms seemed like a weak solution, but it sounds to me like an exact description of what you want, and it captures the two major features of the algorithm (yes?). Evaluating using something like error + B * (selected / total) seems spiritually similar to how regularization is used to address the overfitting problem with gradient descent (and similar) ML algorithms. Selecting a good value for B would take some experimentation.
Looking at your examples, it seems that one of the things that determines whether the match is good or not, is the quality of the points. Could you use/calculate a weighting factor in calculating your metric?
For example, you could weight down points which are co-linear / co-planar, or spatially close, as they probably define the same feature. That would perhaps allow your upside-down triangle to be rejected (as the points are in a line, and that's not a great indicator of the overall pose), but the corner case would be OK, as those points roughly define the hull.
Alternatively, maybe the weighting should be on how distributed the points are around the pose, again trying to ensure you have good coverage, rather than matching small indistinct features.

Finding the nearest XY coordinates

I've got a point in a 2D image, for example the red dot in the given picture, and a set of n points (blue dots) (x1,y1)...(xn,yn), and I want to find the nearest point to (x0,y0) in a way better than trying all points. I'd like to have the best possible solution. I would appreciate it if you could share any similar class you have.
There are many approaches to this, the most common probably being using some form of space partitioning to speed up the search so that it is not O(n). For details, see Nearest neighbor search on Wikipedia.
Most solutions that we could suggest would depend on a bit more knowledge, but I am going to go out on a limb and say that unless you already know you are short on time - i.e. there are tens of thousands of blue dots, or you have to do thousands of these calculations in a short time - a linear search will serve you well enough.
Don't bother calculating the actual distance; save yourself the square root and use the squared distance as the "distance".
Most other methods use more complex data structures to organize the points with respect to their geometric arrangement, but they are a lot harder to implement.
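A minimal sketch of that linear scan using squared distances; the Pt struct is just illustrative.

    #include <cstddef>
    #include <limits>
    #include <vector>

    struct Pt { double x, y; };

    // Linear scan: compare squared distances so no sqrt is needed.
    std::size_t nearestIndex(const Pt& query, const std::vector<Pt>& points)
    {
        std::size_t best = 0;
        double bestD2 = std::numeric_limits<double>::max();
        for (std::size_t i = 0; i < points.size(); ++i) {
            double dx = points[i].x - query.x;
            double dy = points[i].y - query.y;
            double d2 = dx * dx + dy * dy;
            if (d2 < bestD2) { bestD2 = d2; best = i; }
        }
        return best;
    }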