I have N points (x_i, y_i, z_i) in a point cloud. The point cloud forms the shape of a spherical object. My problem is that I have points in my cloud that stick out noticeably along the z-axis (this is due to a pin inserted in my object during the scan). I would like to remove these points.
One approach I have taken is computing the change in slope for a set of points in my cloud and comparing it to the change in slope for the immediately following set of points (for example, I take my first 10 points, compute the change in slope, and compare it to the change in slope for the next ten points). But this is not working so well. Any suggestions?
Any help would be greatly appreciated. If anything about my problem is unclear, just let me know.
If it's certain to be a sphere-like object and the points are spread evenly (no side has more points than another), take the average X, Y, and Z of all points.
This will be close to the center of the sphere. If the pin is not very thick or very long (if it contributes few points compared to the total), you can take this average as the center.
Then, measure the distance of each point to the center.
Discard those whose distance is greater than the average distance.
If you know the radius of the sphere and its center, simply calculate the distance of each point to the center and compare it to the radius.
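For illustration, here is a minimal C++ sketch of that idea; the point type, the function name, and the 1.05 tolerance factor are my own assumptions, not something from the question:

#include <cmath>
#include <vector>

struct Point3 { double x, y, z; };

// Keep only points whose distance from the centroid does not exceed
// the average distance by more than a small tolerance.
std::vector<Point3> removeOutliers(const std::vector<Point3>& cloud) {
    Point3 c{0.0, 0.0, 0.0};
    for (const Point3& p : cloud) { c.x += p.x; c.y += p.y; c.z += p.z; }
    const double n = static_cast<double>(cloud.size());
    c.x /= n; c.y /= n; c.z /= n;

    std::vector<double> dist(cloud.size());
    double avg = 0.0;
    for (std::size_t i = 0; i < cloud.size(); ++i) {
        const Point3& p = cloud[i];
        dist[i] = std::sqrt((p.x - c.x) * (p.x - c.x) +
                            (p.y - c.y) * (p.y - c.y) +
                            (p.z - c.z) * (p.z - c.z));
        avg += dist[i];
    }
    avg /= n;

    std::vector<Point3> kept;
    for (std::size_t i = 0; i < cloud.size(); ++i)
        if (dist[i] <= avg * 1.05) kept.push_back(cloud[i]);
    return kept;
}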
I have an observation and a corresponding suggestion:
First, the observation: You appear to be building a custom solution for a one-off case. This will not work when you scan a different object (with the pin sticking out again).
Now, the suggestion: Use something like meshlab, where you can load up a point cloud, select points and delete them.
Of course, if you're keen on writing code to solve this problem, then this is not helpful.
Find the highest point in z, which is 100% sure to be a pin or part of one.
Take that point as the center of a sphere and remove all points within a chosen radius of it.
Iterate twice more for the other pins. A sketch of this loop is below.
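A minimal sketch of that procedure, assuming a simple point struct and a hand-picked removal radius:

#include <algorithm>
#include <vector>

struct Point3 { double x, y, z; };

// Repeatedly take the highest-z point as a pin tip and delete every
// point within radius r of it.
void removePins(std::vector<Point3>& cloud, double r, int numPins) {
    for (int k = 0; k < numPins && !cloud.empty(); ++k) {
        auto top = std::max_element(cloud.begin(), cloud.end(),
            [](const Point3& a, const Point3& b) { return a.z < b.z; });
        const Point3 c = *top;
        cloud.erase(std::remove_if(cloud.begin(), cloud.end(),
            [&](const Point3& p) {
                const double dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
                return dx * dx + dy * dy + dz * dz <= r * r;
            }), cloud.end());
    }
}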
I am trying to understand how the Centroid to Contour (CtC) detector works. Here is some sample code that I found on GitHub, and I am trying to understand the idea behind it. In this sample, the author is trying to detect a speed sign using CtC. There are just two important functions:
pre_process()
CtC_features
I have understood part of the code and how it works, but I have some problems understanding how the CtC_features function works.
If you can help me, I would like to understand the following parts (just 3 points):
Why, if centroid.x > curr.x, do we need to add PI to the angle result? ( if (centroid.x > curr.x) ang += 3.14159; //PI )
When we start selecting the features on line 97, we set the start angle ( double ang = - 1.57079; ). Why is this half of PI, and negative? How was this value chosen?
And a more general question: how can you know that the features you select are related to the speed limit sign? You find the centroid of the image and adjust the angle in the first step, but in the second step, why is it that when the current angle is bigger than the hardcoded angle ( while (feature_v[i].first > ang), initially ang = - 1.57079 ), we add that distance as a feature?
I would like to understand the idea behind this code, and if someone with more experience and some knowledge of trigonometry could help me, that would be great.
Thank you.
The code you provided is not the best, but let's see what happens.
I took this starting image of a sign:
Then, pre_process is called, which basically runs a Canny edge detector, along with some tricks that should lead to better edge detection. I won't look into them, but this is what it returns:
Not the greatest. Maybe some parameter tuning would help.
But now, CtC_features is called, which is the scope of the question. The role of CtC_features is to obtain some features for a machine learning algorithm. This amounts to finding a numerical description of the image which would help the ML algorithm detect the sign. Such a description can be anything. Think about how someone who never saw a STOP sign and does not know how to read would describe it. They would say something like "a red, flat plate, with 8 sides and some white stuff in the middle". Based on this description, someone might be able to tell it's a STOP sign. We want to do the same, but since computers are computers, we look for numerical features. And, with them, some algorithm could be trained to "learn" what features each sign has.
So, let's see what features CtC_features obtains from the contours.
The first thing it does is to call findContours. This function takes a binary image and returns arrays of points representing the contours of the image. Basically, it takes the edges and puts them into arrays of points. With connectivity, so we basically know which points are connected. If we use the code from here for visualization, we can see what happens:
So, the array contours is a std::vector<std::vector<cv::Point>> and you have in each sub-array a continuous contour, here drawn with a different color.
Next, we compute the number of points (edge pixels), and do an average over their coordinates to find the centroid of the edge image. The centroid is the filled circle:
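For reference, this first part of CtC_features amounts to something like the following sketch (the flag choices and the function name here are mine, not necessarily what the original code uses):

#include <opencv2/imgproc.hpp>
#include <vector>

// Find contours in the binary edge image and return the centroid of
// all edge pixels (this mirrors the first part of CtC_features).
cv::Point2f edgeCentroid(const cv::Mat& edges,
                         std::vector<std::vector<cv::Point>>& contours) {
    cv::findContours(edges, contours, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);
    cv::Point2f centroid(0.f, 0.f);
    int n = 0;
    for (const auto& contour : contours)
        for (const cv::Point& p : contour) {
            centroid += cv::Point2f(p);
            ++n;
        }
    if (n > 0) centroid *= 1.f / static_cast<float>(n);
    return centroid;
}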
Then, we iterate over all points, and create a vector of std::pair<double, double>, recording for each point the distance from the centroid and the angle. The angle function is defined at the bottom of the file as
double angle(Point2f a, Point2f b) {
    return atan((a.y - b.y) / (a.x - b.x));
}
It basically computes the angle of the line from a to b with respect to the x axis, using the arctangent function. I'll let you watch a video on the arctangent, but the tl;dr is that it gives you the angle from a ratio, in radians (a circle is 2 PI radians, half a circle is PI radians). The problem is that the function is periodic, with a period of PI. This means that there are 2 angles on the circle (the circle of all points at the same distance around the centroid) which give you the same value. So, we compute the ratio (the ratio is, by the way, known as the tangent of the angle), apply the inverse function (arctangent) and we get an angle (corresponding to a point). But what if it's the other point? Well, we know that the other point is offset by exactly PI radians (it is diametrically opposite), so we add PI if we detect that it's the other point.
The picture below also helps understand why there are 2 points:
The tangent of the angle is the highlighted vertical distance. But the angle on the other side of the diagonal line, which intersects the circle in the bottom left, also has the same tangent. The atan function gives the angles only for points on one side of the center; note that within that half, no 2 directions have the same tangent.
What the check does is ask on which side of the centroid the point lies. This is done in order to be able to add half a circle (PI radians or 180 degrees) to correct the result of atan for points on the other side.
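As a side note, the standard library's atan2 performs this quadrant correction for you, so the angle function plus the PI fix-up could be replaced by something like the sketch below. Note that atan2 returns angles in (-PI, PI] rather than the corrected [-PI/2, 3*PI/2) range used here, so the bin start angle would have to change accordingly:

#include <cmath>
#include <opencv2/core.hpp>

// atan2 takes dy and dx separately and returns the angle of the
// direction from b to a in (-PI, PI], so no manual branch is needed.
double angleFull(cv::Point2f a, cv::Point2f b) {
    return std::atan2(a.y - b.y, a.x - b.x);
}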
Now, we know the distance (a simple formula) and we have found (and corrected) the angle. We insert this pair into the vector feature_v, and we sort it. The sort function, called like that, sorts by the first element of the pair, so we sort by angle, then by distance.
The interval variable:
int degree = 10;
double interval = double((double(degree) / double(360)) * 2 * 3.14159); //5 degrees interval
is simply the value of degree, converted from degrees into radians. We need radians since the angles have been computed in radians so far, and degrees are more user-friendly. Yep, the comment is wrong: the interval is 10 degrees, not 5.
The ang variable defined below it is -PI / 2 (a quarter of a circle):
double ang = - 1.57079;
Now, what it does is divide the points around the centroid into bins, based on the angle. Each bin is 10 degrees wide. This is done by iterating over the points sorted by angle, accumulating them until we get to the next bin. We are only interested in the largest distance of a point in each bin. The starting angle should be small enough that all the directions (points) are captured.
In order to understand why it starts from -PI/2, we have to go back to the trigonometric function diagram above. What happens if the angle goes like this:
See how the highlighted vertical segment goes "downwards" on the y axis. This means that its length (and implicitly the tangent) is negative here. Also, the angle is considered to be negative (otherwise there would be 2 angles on the same side of the center with the same tangent). Now, we are interested in the range of angles we have. It's all the angles on the right side of the centroid, starting from the bottom at -PI/2 to the top at PI/2. A range of PI radians, or 180 degrees. This is also written in the documentation of atan:
If no errors occur, the arc tangent of arg (arctan(arg)) in the range [-PI/2, +PI/2] radians, is returned.
So, we simply split all the possible directions (360 degrees) into buckets of 10 degrees, and take the distance of the farthest point in each bin. Since the circle has 360 degrees, we'll get 360 / 10 = 36 bins. Then, these are normalized such that the greatest value is 1. This helps a bit with the machine learning algorithm.
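Putting the binning step together, it amounts to something like this simplified sketch (names and details are mine, not a copy of the original code):

#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// feature_v holds (angle, distance) pairs, with angles in
// [-PI/2, 3*PI/2) after the atan correction described above.
std::vector<double> binFeatures(std::vector<std::pair<double, double>> feature_v) {
    const double PI = 3.14159265358979;
    const double interval = 10.0 / 360.0 * 2.0 * PI;   // 10-degree bins
    std::sort(feature_v.begin(), feature_v.end());     // sort by angle

    std::vector<double> features(36, 0.0);
    double binEnd = -PI / 2.0 + interval;
    std::size_t i = 0;
    for (int bin = 0; bin < 36 && i < feature_v.size(); ++bin) {
        // Keep the largest distance among the points falling in this bin.
        while (i < feature_v.size() && feature_v[i].first <= binEnd) {
            features[bin] = std::max(features[bin], feature_v[i].second);
            ++i;
        }
        binEnd += interval;
    }
    // Normalize so the greatest value is 1.
    const double m = *std::max_element(features.begin(), features.end());
    if (m > 0.0) for (double& f : features) f /= m;
    return features;
}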
How can we know if the point we selected belongs to the sign? We don't. Most computer vision algorithms make some assumptions about the image in order to simplify the problem. The idea of the algorithm is to determine the shape of the sign by recording the distance from the center to the edges. This makes the assumption that the centroid is roughly in the middle of the sign. Depending on the ML algorithm used, and on the training data, different levels of robustness can be obtained.
Also, it assumes that (some of) the edges can be reliably identified. See how in my image, the algorithm was not able to detect the upper left edge?
The good news is that this doesn't have to be perfect. ML algorithms know how to handle this variation (to some extent) provided that they are appropriately trained. It doesn't have to be perfect, but it has to be good enough. In order to answer what good enough means, and what the actual limitations of the algorithm are, some more testing needs to be done, as well as some understanding of the ML algorithm used. But this is also why ML is so popular in vision: it can handle a lot of variation quite well.
At the end, we basically get an array of 36 numbers, one for each of the 36 bins of 10 degrees, representing the maximum distance of a point in the bin. I assume this is because the developer of the algorithm wanted a way to capture the shape of the sign, by looking at distances from the center in various directions. This assumes that no edges are detected in the background, and the sign looks something like:
The max distance is used to pick the border, and not the digits or other symbols inside the sign.
It is not directly used here, but a possibly related reading is the Hough transform, which uses a similar parameterization to detect straight lines in an image.
I have a set of point cloud data, and I would like to test if there is a corner in a 3D room. So I would like to discuss my approach, and whether there is a better approach in terms of speed, because I want to run it on mobile phones.
I will try to use the Hough transform to detect lines, then I will check whether there are three lines that intersect and form two intersecting planes.
If the point cloud data comes from a depth sensor, then you have a relatively dense sampling of your walls. One thing I found that works well with depth sensors (e.g. Kinect or DepthSense) is a robust version of the RANSAC procedure that #MartinBeckett suggested.
Instead of picking 3 points at random, pick one point at random, and get the neighboring points in the cloud. There are two ways to do that:
The proper way: use a 3D nearest neighbor query data structure, like a KD-tree, to get all the points within some small distance from your query point.
The sloppy but faster way: use the pixel grid neighborhood of your randomly selected pixel. This may include points that are far from it in 3D, because they are on a different plane/object, but that's OK, since this pixel will not get much support from the data.
The next step is to generate a plane equation from that group of 3D points. You can use PCA on their 3D coordinates to get the two most significant eigenvectors, which define the plane surface (the last eigenvector should be the normal).
From there, the RANSAC algorithm proceeds as usual: check how many other points in the data are close to that plane, and find the plane(s) with maximal support. I found it better to find the largest support plane, remove the supporting 3D points, and run the algorithm again to find other 'smaller' planes. This way you may be able to get all the walls in your room.
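A sketch of the neighborhood plane fit via PCA, using Eigen (the function name and interface are my own choices):

#include <Eigen/Dense>
#include <vector>

// Fit a plane to a set of 3D points: the centroid lies on the plane,
// and the normal is the eigenvector of the covariance matrix with the
// smallest eigenvalue (Eigen sorts eigenvalues in increasing order).
void fitPlanePCA(const std::vector<Eigen::Vector3d>& pts,
                 Eigen::Vector3d& centroid, Eigen::Vector3d& normal) {
    centroid = Eigen::Vector3d::Zero();
    for (const auto& p : pts) centroid += p;
    centroid /= static_cast<double>(pts.size());

    Eigen::Matrix3d cov = Eigen::Matrix3d::Zero();
    for (const auto& p : pts) {
        const Eigen::Vector3d d = p - centroid;
        cov += d * d.transpose();
    }
    Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> es(cov);
    normal = es.eigenvectors().col(0);
}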
EDIT:
To clarify the above: the support of a hypothesized plane is the set of all 3D points whose distance from that plane is at most some threshold (e.g. 10 cm, should depend on the depth sensor's measurement error model).
After each run of the RANSAC algorithm, the plane that had the largest support is chosen. All the points supporting that plane may be used to refine the plane equation (this is more robust than just using the neighboring points) by performing PCA/linear regression on the support set.
In order to proceed and find other planes, the support of the previous iteration should be removed from the 3D point set, so that remaining points lie on other planes. This may be repeated as long as there are enough points and best plane fit error is not too large.
In your case (looking for a corner), you need at least 3 perpendicular planes. If you find two planes with large support which are roughly parallel, then they may be the floor and some counter, or two parallel walls. Either the room has no visible corner, or you need to keep looking for a perpendicular plane with smaller support.
The normal approach would be RANSAC:
Pick 3 points at random.
Make a plane.
Check if each other point lies on the plane.
If enough are on the plane, recalculate the best plane from all these points and remove them from the set.
If not, try another 3 points.
Stop when you have enough planes, or too few points left. (A sketch of this loop follows.)
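A minimal sketch of that loop, using Eigen for the vector math (tolerance, iteration count, and naming are assumptions):

#include <Eigen/Dense>
#include <cmath>
#include <cstdlib>
#include <vector>

// Repeatedly pick 3 random points, build the plane through them, and
// count how many points lie within `tol` of it; keep the best plane
// (stored in n . x = d form).
void ransacPlane(const std::vector<Eigen::Vector3d>& pts, double tol,
                 int iters, Eigen::Vector3d& bestN, double& bestD) {
    std::size_t bestCount = 0;
    for (int it = 0; it < iters; ++it) {
        const Eigen::Vector3d& a = pts[std::rand() % pts.size()];
        const Eigen::Vector3d& b = pts[std::rand() % pts.size()];
        const Eigen::Vector3d& c = pts[std::rand() % pts.size()];
        Eigen::Vector3d n = (b - a).cross(c - a);
        if (n.norm() < 1e-12) continue;          // degenerate triple
        n.normalize();
        const double d = n.dot(a);
        std::size_t count = 0;
        for (const auto& p : pts)
            if (std::abs(n.dot(p) - d) < tol) ++count;
        if (count > bestCount) { bestCount = count; bestN = n; bestD = d; }
    }
}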
Another approach, if you know that the planes are near vertical or near horizontal:
Pick a small vertical range.
Get all the points in this range.
Try to fit 2D lines.
Repeat for other Z ranges.
If you get a parallel set of lines in each Z slice, then you probably have a plane; recalculate the best-fit plane for those points.
Even though this is an old post, I would like to present a complementary approach, similar to Hough voting, to jointly find all corner locations formed by plane intersections:
1. Uniformly sample the space. Ensure that there is at least a distance $d$ between the points (e.g. you can even do this in CloudCompare with a 'space' subsampling).
2. Compute the point cloud normals at these points.
3. Randomly pick 3 points from this downsampled cloud.
4. Each oriented point (point + normal) defines a hypothetical plane. Therefore, each pick of 3 points defines 3 planes. Those planes, if not parallel and not intersecting at a line, always intersect at a single point.
5. Create a voting space to describe the corner: the intersection of the 3 planes (the point) might be a valid parameterization. So our parameter space has 3 free parameters.
6. For each triple of points, cast a vote in the accumulator space for the corner point.
7. Go back to step 3 and repeat until all sampled points are exhausted, or enough iterations are done. This way we'll be casting votes for all possible corner locations.
8. Take the local maxima of the accumulator space. Depending on the votes, we'll be selecting corners from the intersection of the largest planes (as they'll receive more votes) down to the intersection of small planes. The largest 4 are probably the corners of the room. If not, one could also consider the other local maxima.
Note that the voting space is a quantized 3D space and the corner location will be a rough estimate of the actual one. If desired, one could store the planes intersection at that very location and refine them (with iterative optimization similar to ICP or etc) to get a very fine corner location.
This approach will be quite fast and probably very accurate, given that you could refine the location. I believe it's the best algorithm presented so far. Of course this assumes that we could compute the normals of the point clouds (we can always do that at sample locations with the help of the eigenvectors of the covariance matrix).
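For step 4 above, the single intersection point of three planes can be computed by solving a 3x3 linear system; a sketch using Eigen (plane i stored in n[i] . x = d[i] form):

#include <cmath>
#include <Eigen/Dense>

// Three planes n[i] . x = d[i] intersect in a single point when their
// normals are linearly independent: solve the 3x3 system N x = d.
bool intersectThreePlanes(const Eigen::Vector3d n[3], const double d[3],
                          Eigen::Vector3d& corner) {
    Eigen::Matrix3d N;
    N.row(0) = n[0].transpose();
    N.row(1) = n[1].transpose();
    N.row(2) = n[2].transpose();
    if (std::abs(N.determinant()) < 1e-9) return false;  // parallel, or meeting in a line
    corner = N.fullPivLu().solve(Eigen::Vector3d(d[0], d[1], d[2]));
    return true;
}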
Please also look here, where I have put out a list of plane-fitting related questions at stackoverflow:
3D Plane fitting algorithms
I have a bunch of points lying on a vertical plane. In reality this plane should be exactly vertical, but when I visualize the point cloud there is a slight inclination (nearly 2 degrees) from vertical. At the moment, I can only calculate this inclination. Concerning other errors, I assume there are no shifts or anything like that.
So, I want to update the coordinates of my point data so that they lie on the vertical plane. I think I need some kind of transformation; it may be just a rotation about the x-axis, but I am not sure what it would be.
I guess you understand my question. Honestly, I am poor at mathematics, so please let me know how to update my point coordinates so that they lie on the exact vertical plane.
Note: As I am implementing this in C++, and there are many programmers who have sound knowledge of these things, I am posting this question under c++.
UPDATES
To say exactly what I have done so far:
I have point cloud data representing a vertical object plus its surroundings (the data is collected by a moving scanner and may have axis deviations from the correct world axes). The problem is, I cannot say for certain whether there is an error in my data or not. Therefore, I checked it against a vertical planar object (which is also the dominant object in my data). In reality that plane is truly vertical, but when I fit a plane after removing outliers, the fitted plane is not truly vertical and has nearly 2 degrees of inclination. Therefore, I suspect that my data has some error. So I want to update all my points (including the points on the plane and the points which represent other objects) in a way that lays those planar points exactly on the vertical plane. Then, I guess, all the points will be updated into their correct real-world positions. That is, all (x, y, z) coordinates should be updated.
As an example, please refer to the figure below.
The left side represents the original point cloud (as you can see, the points themselves are not vertical); the black line shows the vertical plane which I fitted, and red is the zenith line. As you can see, there is an inclination of the fitted plane.
So, I want to update my whole data set as in the right figure. Then, after updating, if I fit a plane again (removing outliers), it should be exactly parallel to the zenith line. Please help me.
I may be able to help you out, considering I worked with planes recently. First of all, how come the points aren't coplanar from the get go? I'd make the points coplanar in the first place instead of them being at an inclination (from what origin?), and then having to fix them. Also, having the points be coplanar on your first go would increase efficiency.
Sorry if this is the answer you're not looking for, but I need more information before I can help you out. Also, 3D math is hard. If you work with it enough, it starts to get pounded into your head, where you will NEVER forget it, especially if you went through the headaches I had to go through.
I did a bit of thinking on it, and since you want to rotate about the x-axis, your rotation will be done in the xz-plane, which means we can make this a 2D problem. After doing a bit of research on Wikipedia, this may be your solution:
new x = ((x - intended x) * cos(angle)) - (z * sin(angle)) + intended x
new z = ((x - intended x) * sin(angle)) + (z * cos(angle))
What I'm doing here is subtracting our intended x value from our current x value, so that (intended x, 0) becomes the point of origin to rotate around. After the point is rotated, I add intended x back to the x coordinate so that we get the correct result.
Depending on where you got your points from (some kind of measurement, I guess) and what you want to do with them, there are several different things you could do with your data.
The search keyword "regression plane" might help - there are several ways of finding planes approximating point clouds, and several ways to "snap" points to planes.
Edit: You want to apply a rotation around the axis defined by the cross product of the normal vector of your regression plane and the normal of your desired plane, through a point of your choice. From your illustration I take it that you probably want the bottom of your vertical planar object to be the reference point for the rotation.
So you've got your point of reference, you know the axis around which you want to rotate, and the angle. All you need to do is (see the sketch below):
Translation (to move your point of reference to the origin)
Rotation
Translation back (to undo the first step)
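A sketch of those three steps using Eigen's geometry module, assuming the axis, angle, and reference point have already been computed as described:

#include <Eigen/Dense>
#include <vector>

// Translate each point so the reference point is at the origin, rotate
// by `angle` radians about `axis`, then translate back.
void rotateAboutAxisThroughPoint(std::vector<Eigen::Vector3d>& pts,
                                 const Eigen::Vector3d& axis, double angle,
                                 const Eigen::Vector3d& ref) {
    const Eigen::AngleAxisd R(angle, axis.normalized());
    for (auto& p : pts)
        p = R * (p - ref) + ref;
}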
I read your question again, and hopefully this answer will help you out. If there's anything else I need to know, please tell me.
Now, in order to rotate anything, there must be a center point to rotate around. You've already been able to detect the angle of inclination, so now we need a formula for rotating a point by a certain angle around an origin. In addition, since this problem only occurs on a 2D plane, we can use this basic formula to readjust the points. For any two axes x and y:
x' = (x - x.origin) * cos(theta) - (y - y.origin) * sin(theta) + x.origin
y' = (x - x.origin) * sin(theta) + (y - y.origin) * cos(theta) + y.origin
Theta is the angle that you will rotate around in a counter-clockwise direction. x' and y' are your new points. x.origin and y.origin are the coordinates of the point you will be rotating around. Now, I don't know if my math is 100% correct on this, but if it's not, hopefully you can change a thing or two and it will work.
Say we have an object at point A. It wants to find out if it can move to point B. It has limited velocity, so it can only move step by step. It casts a ray in the direction it is moving. The ray collides with an object, and we detect the collision. How do we find a path for our object that safely avoids the obstacle?
By the way, is there a way to make such a thing work with an object cast (instead of a ray cast), and would it be as fast, or nearly as fast, as a simple ray cast?
Is there a way to find a path that is optimal in some way?
What you're asking about is actually a pathfinding question; more specifically, it's the "any-angle pathfinding problem."
If you can limit the edges of obstacles to a grid, then a popular solution is to just use A* on that grid, then apply path-smoothing. However, there is a (rather recent) algorithm that is both simpler to implement/understand and gives better results than path-smoothing. It's called Theta*.
There is a nice article explaining Theta* (from which I stole the above image) here
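The key primitive Theta* adds on top of A* is a grid line-of-sight test between a node and its grandparent. A minimal sketch of such a test (the grid representation is an assumption; real implementations use a slightly more careful corner-cutting rule):

#include <cstdlib>
#include <vector>

// True if the straight segment from (x0, y0) to (x1, y1) passes only
// through unblocked cells, walking the grid Bresenham-style.
bool lineOfSight(const std::vector<std::vector<bool>>& blocked,
                 int x0, int y0, int x1, int y1) {
    int dx = std::abs(x1 - x0), dy = std::abs(y1 - y0);
    const int sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
    int err = dx - dy;
    while (true) {
        if (blocked[y0][x0]) return false;
        if (x0 == x1 && y0 == y1) return true;
        const int e2 = 2 * err;
        if (e2 > -dy) { err -= dy; x0 += sx; }
        if (e2 < dx)  { err += dx; y0 += sy; }
    }
}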
If you can't restrict your obstacles to a grid, you'll have to generate a navigation mesh for your map:
There are many ways of doing this, of varying complexity; see for example here, here, or here. A quick google search also turns up plenty of libraries available to do this for you, such as this one or this one.
One approach could be to use a rope, or several ropes, where a rope is made of a few points connected linearly. You can initialize the points in random places in space, but the first point is the initial position of A, and the last point is the final position of A.
Initially, the rope will be a very bad route. In order to optimize, move the points along an energy gradient. In your case the energy function is very simple, i.e. the total length of the rope.
This is not a new idea; it is used in computer vision to detect boundaries of objects, although there the energy functions are much more complicated. Still, have a look at "snakes" to get an idea of how to move each point given its two neighbors: http://en.wikipedia.org/wiki/Snake_(computer_vision)
In your case, however, simply deriving a direction for each point from the force exerted by its neighbors will be just fine.
Your problem is a constrained problem, since you must consider collision. I would really go with #paddy's idea here to use a convex hull, or even just a sphere for each object. In the latter case, don't move a point into a place where its distance to B is less than the radius of A plus the radius of B plus a fudge factor, considering that you don't have an infinite number of points.
A valid solution requires that the longest distance between any two neighbors is smaller than a threshold; otherwise, the connecting line between two points may intersect the obstacle.
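A minimal sketch of one relaxation step for such a rope, with the obstacle modeled as a single circle purely for illustration:

#include <cmath>
#include <vector>

struct P2 { double x, y; };

// One relaxation step: pull each interior point toward the midpoint of
// its neighbors (shortening the rope), then push it back out of a
// circular obstacle centered at c with radius r.
void relaxRope(std::vector<P2>& rope, P2 c, double r, double step) {
    for (std::size_t i = 1; i + 1 < rope.size(); ++i) {
        const P2 mid{(rope[i - 1].x + rope[i + 1].x) / 2.0,
                     (rope[i - 1].y + rope[i + 1].y) / 2.0};
        rope[i].x += step * (mid.x - rope[i].x);
        rope[i].y += step * (mid.y - rope[i].y);

        const double dx = rope[i].x - c.x, dy = rope[i].y - c.y;
        const double d = std::sqrt(dx * dx + dy * dy);
        if (d < r && d > 1e-12) {              // inside the obstacle: project out
            rope[i].x = c.x + dx / d * r;
            rope[i].y = c.y + dy / d * r;
        }
    }
}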
How about a simple approach to begin with....
If this is just one object, you could compute the convex hull of all the vertices of the obstacle, plus the start and end points. You would then examine the two directions to get from A to B by traversing the hull clockwise and anti-clockwise. Choose the shortest path.
It's a little more complex because the shape you are moving is not just a point. You can't just blindly move its centre or it will collide. It gets more complicated still as it moves past a vertex, because you have to graze an edge of your object against the vertex of the obstacle.
But hopefully that gives you an idea to ponder over, that's not conceptually difficult to understand.
I have made this image to illustrate my idea for getting the object to point B.
Objects in the image :-
The dark blue dot represents the object. The red lines are obstacles. The grey dot and line are the area which can be reached. The purple arrow is the direction of the point B. The grey line of the object is the field of visibility.
Understanding the image :-
The object will have a certain field of visibility. This is a 2D situation, so I have assumed the field of visibility to be 180 degrees (for the human field of view, refer to http://en.wikipedia.org/wiki/Human_eye#Field_of_view ). The object will measure distance using the idea of SONAR. With the help of SONAR, the object can find out the area it can reach. Using BACKTRACKING, the object can find a way to point B. If there is no way to go, the object must change its field of visibility.
One way to look at this is as a shadow casting problem. Make A the "light source" and then decide whether each point in the scene is in or out of shadow. Those not in shadow are accessible by rays from A. The other areas are not. If you find B is in shadow, then you need only locate the nearest point in the scene that is in light.
If you discretize this problem into "pixels," then the above approach has very well-known solutions in the huge computer graphics literature on shadow rendering. For example, you can use a Shadow Map to paint each pixel with a boolean flag that indicates whether it's in shadow or not. Finding the nearest lit pixel is just a simple search of growing concentric circles around B. Both of these operations can be made extremely fast by exploiting GPU hardware.
One other note: You can treat a general object path finding problem as a point path problem. The secret is to "grow" the obstacles by an appropriate amount using Minkowski Differences. See for example this work on robot path planning.
I know the distance to various points on a plane, as it is being viewed from an angle. I want to find the equation for this plane from just that information (5 to 15 different points, as many as necessary).
I will later use the equation for the plane to estimate what the distance to the plane should be at different points; in order to prove that it is roughly flat.
Unfortunately, a google search doesn't bring much up. :(
If you, indeed, know distances and not coordinates, then it is an ill-posed problem: there is an infinite number of planes that will have points at any number of given distances from the origin.
This is easy to verify. Let's take the shortest distance D0 from the set of given distances {D0..DN-1}, and construct a plane with normal vector {D0,0,0} (a vector of length D0 along the x-axis). For each of the remaining lengths we now have an infinite number of points that will lie in this plane (forming in-plane circles around the point (D0,0,0)). Moreover, we can rotate all vectors by an arbitrary angle and get a new plane.
Here is a simple picture in 2D (distances to a line; it's simpler to draw ;) ).
As we can see, there are TWO points on the line for each distance D1..DN-1 > D0. One of each is shown for D1 and D2; the other two for these distances would be placed in the 4th quadrant (+x, -y). Moreover, we can rotate our line around the origin by an arbitrary angle and still satisfy the given distances.
I'm going to skip over the process of finding the best fit plane, it's been handled in some other answers, and talk about something else.
"Prove" takes us into statistical inference. The way this is done is you make a formal hypothesis "the surface is flat" and then see if the data supports rejecting this hypothesis at some confidence level.
So you can wind up saying "I'm not even 1% sure that the surface isn't flat" -- but you can't ever prove that it's flat.
Geometry? Sounds like a job for math.SE! What form will the equation take? Will it be a plane?
I will assume you want an accurate solution.
Find the absolute positions with geometry.
Fit a best-fit regression line in C++ in 2 of the 3 dimensions.
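Extending that regression idea to the full plane, a least-squares fit of z = a*x + b*y + c with Eigen could look like the sketch below (this assumes the plane is not close to vertical in the chosen coordinates):

#include <Eigen/Dense>
#include <vector>

// Least-squares fit of z = a*x + b*y + c to the measured points; the
// residuals z_i - (a*x_i + b*y_i + c) then show how far from flat the
// surface really is.
Eigen::Vector3d fitRegressionPlane(const std::vector<Eigen::Vector3d>& pts) {
    const Eigen::Index n = static_cast<Eigen::Index>(pts.size());
    Eigen::MatrixXd A(n, 3);
    Eigen::VectorXd z(n);
    for (Eigen::Index i = 0; i < n; ++i) {
        A(i, 0) = pts[i].x();
        A(i, 1) = pts[i].y();
        A(i, 2) = 1.0;
        z(i) = pts[i].z();
    }
    return A.colPivHouseholderQr().solve(z);   // returns (a, b, c)
}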