Using ball-pivoting algorithm on segmented point cloud? - computer-vision

I have seen the ball pivoting algorithm used on simple point clouds e.g. Stanford Armadillo. However, say I have a more complex scene and I have segmented it into different regions. Am I able to run the ball pivoting algorithm on each segment individually, and is this implementable within Open3d?

Related

How to merge two point cloud with there scales are different?

We now are using SFM to reconstruct our school building.
But our school is too big to reconstruct them by once.
So we decide to divide them into several blocks and reconstruct them, and merge these point clouds.
Because of the scale ambiguity of SFM, so the point clouds' scale is different.
And point cloud merge algorithm, such as ICP, can only estimate rotation and translation transform matrix, but not similarity transform matrix.
So, are there exist an algorithm to merge two point cloud with different scale or any paper?

how to detect the coordinates of certain points on image

I'm using the ORB algorithm to detect and get the coordinates of the crossings of rope shown in the image, which is represented by the red dot. I want to detect the coordinates of the four points surrounding the crossing represented by the blue dots. All the four points have the same distance from the red spot.
Any idea how to get their coordinates by getting use of the red spot coordinate.
Thank you in Advance
Although you're using ORB, you're still going to need an algorithm to segment the rope from the background, or at least some technique to identify image chunks that belong to the rope and that are equidistant from the red dot. There are a number of options to explore.
It's important to consider your lighting & imaging as separate problems to be solved if this is meant to be a real-world application. This looks a bit like a problem for a class rather than for a application you'll sell and support, but you should still consider lighting:
Will your algorithm(s) still work when light level is reduced?
How will detection be affected by changes in camera pose relative to the surface where the rope will be located?
If you'll be detecting "black" rope, will the algorithm also be required to detect rope of different colors? dirty rope? rope on different backgrounds?
Since you're object of interest is rope, you have to consider a class of algorithms suitable for detection of non-rigid objects. Always consider the simplest solution first!
Connected Components
Connected components labeling is a traditional image processing algorithm and still suitable as the starting point for many applications. The last I knew, this was implemented in OpenCV as findContours(). This can also be called "blob finding" or some variant thereof.
https://en.wikipedia.org/wiki/Connected-component_labeling
https://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html?highlight=findcontours
Depending on lighting, you may have to take different steps to binarize the image before running connected components. As a start, convert the color image to grayscale, which will simplify the task significantly.
Try a manual threshold since you can quickly test a number of values to see the effect. Don't be too discouraged if the binarization isn't quite right--this can often be fixed with preprocessing.
If a range of manual thresholds works (e.g. 52 - 76 in an 8-bit grayscale range), then use an algorithm that will automatically calculate the threshold for you: Otsu, entropy-based methods, etc., will all offer comparable performance. Whichever technique works best, the code/algorithm can be tweaked further to optimize for your rope application.
If thresholding and binarization don't work--which for your rope application seems unlikely, at least how you've presented it--then switch to thinking in terms of gradient-based (edge-based, energy-based) techniques.
But assuming you can separate the rope from the background, you're still going to need a method to start at the red dot [within the rope] and move equal distances out to the blue points. More about that later after a discussion of other rope segmentation methods.
Note: connected components labeling can work in scenarios beyond just binarizing black & white images. If you can create a texture field or some other 2D representation of the image that makes it possible to distinguish the black rope from the relatively light background, you may be able to use a connected components algorithm. (Finding a "more complicated" or "more modern" algorithm isn't necessarily going to be the right approach.)
In a binarized image, blobs can be nested: on a white background you can have several black blobs, inside of one or more of which are white blobs, inside of which are black blobs, etc. An earlier version of OpenCV handled this reasonably well. (OpenCV is a nice starting point, and a touchpoint for many, but for a number of reasons it doesn't always compare favorably to other open source and commercial packages; popularity notwithstanding, OpenCV has some issues.)
Once you have a "blob" (a 4-connected region of pixels) in a 2D digital image, you can treat the blob as an object, at which point you have a number of options:
Edge tracing: trace around the inside and outside edges of the blob. From what I recall, OpenCV does (or at least should) have some relatively straightforward method to get the edges.
Split the blob into component blobs, each of which can be treated separately
Convert the blob to a polygon
...
A connected components algorithm should be high on the list of techniques to try if you have a non-rigid object.
Boolean Operations
Once you have the rope as a connected component (and possibly even without this), you can use boolean image operations to find the spots at the blue dots in your image:
Create a circular region in data, or even in the image
Find the intersection of the circle (an annulus) and the black region representing the rope. Using your original image, you should have four regions.
Find the center point of the intersection regions.
You could even try this without using connected components at all, but using connected components as part of the solution could make it more robust.
Polygon Simplification
If you have a blob, which in your application would be a connected set of black pixels representing the rope on the floor, then you can consider converting this blob to one or more polygons for further processing. There are advantages to working with polygons.
If you consider only the outside boundary of the rope, then you can see that the set of pixels defining the boundary represents a polygon. It's a polygon with a lot of points, and not a convex polygon, but a polygon nonetheless.
To simplify the polygon, you can use an algorithm such as Ramer-Douglas-Puecker:
https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm
Once you have a simplified polygon, you can try a few techniques to render useful data from the polygon
Angle Bisector Network
Triangulation (e.g. using ear clipping)
Triangulation is typically dependent on initial conditions, so the resulting triangulation for slighting different polygons (that is, rope -> blob -> polygon -> simplified polygon). So in your application it might be useful to triangulate the dark rope region, and then to connect the center of one triangle to the center of the next nearest triangle. You'll also have to deal with crossings, such as the rope overlap. Ultimately this can yield a "skeletonization" of the rope. Speaking of which...
Skeletonization
If the rope problem was posed to you as a class exercise, then it may have been a prompt to try skeletonization. You can read about it here:
https://en.wikipedia.org/wiki/Topological_skeleton
Skeletonization and thinning have their own problems to solve, but you should dig into them a bit and see those problems themselves.
The Medial Axis Transform (MAT) is a related concept. Long story there.
Edge-based techniques
There are a number of techniques to generate "edge images" based on edge strength, energy, entropy, etc. Making them robust takes a little effort. If you've had academic training in image processing you've likely heard of Harris, Sobel, Canny, and similar processing methods--none are magic bullets, but they're simple and dependable and will yield data you need.
An "edge image" consists of pixels representing the image gradient strength [and sometimes the gradient direction]. People may call this edge image something else, but it's the concept that matters.
What you then do with the edge data is another subject altogether. But one reason to think of edge images (or at least object borders) is that it reduces the amount of information your algorithm(s) will need to process.
Mean Shift (and related)
To get back to segmentation mentioned in the section on connected components, there are other methods for segmenting figures from a background: K-means, mean shift, and so on. You probably won't need any of those, but they're neat and worth studying.
Stroke Width Transform
This is an intriguing technique used to extract text from noisy backgrounds. Although it's intended for OCR, it could work for rope since the rope width is relatively constant, the rope shape varies, there are crossings, etc.
In short, and simplifying quite a bit, you can think of SWT as a means to find "strokes" (thick lines) by finding gradients antiparallel to each other. On either side of a stroke (or line), the edge gradient points normal to the object edge. The normal on one side of the stroke points opposite the direction of the normal on the other side of the stroke. By filtering for pixel-gradient pairs within a certain distance of each other, you can isolate certain strokes--even automatically. For your example the collection of points representing edge pairs for the rope would be much more common than other point pairs.
Non-Rigid Matching
There are techniques for matching non-rigid shapes, but they would not be worth exploring. If any of the techniques I mentioned above is unfamiliar to you, explore some of those first before you try any fancier algorithms.
CNNs, machine learning, etc.
Just don't even think of these methods as a starting point.
Other Considerations
If this were an application for industry, security, or whatnot, you'd have to determine how well your image processing worked under all environmental considerations. That's not an easy task, and can make all the difference between a setup that "works" in the lab and a setup that actually works in practice.
I hope that's of some help. Feel free to post a reply if I've confused more than helped, or if you want to explore some idea in more detail. Though I tried to touch on some common(ish) techniques, I didn't mention all the different ways of addressing this problem.
And briefly: once you have a skeleton, point network, or whatever representing a reduced data set for the rope and the red dot (the identified feature), a few techniques to find the items at the blue dots:
For a skeleton, trace along each "branch" of the rope outward from the know until the geodesic distance or straight-line 2D distance is the distance D that you want.
To use geometry, create a circle of width 1 - 2 pixels. Find the intersection of that circle and the rope. Find the center point of the intersections of circle and rope. (Also described above.)
Good luck!

B-Spline for any number of control points

I am currently working on a soft body system using numeric spring physics and I have finally got that working. My issue is that everything is currently in straight lines.
I am aiming to replicate something similar to the game "The floor is Jelly" and everything work except the smooth corners and deformation which currently are straight and angular.
I have tried using Cubic Bezier equations but that just means every 3 nodes I have a new curve. Is there an equation for Bezier splines that take in n number of control points that will work with loop of vec2's (so node[0] is the first and last control point).
Sorry I don't any code to show for this but i'm completely stumped and googling is bringing up nothing.
Simply google "B-spline library" will give you many references. Having said this, B-spline is not your only choice. You can use cubic Hermite spline (which is defined by a series of points and derivatives) (see link for details) as well.
On the other hand, you can also continue using straight lines in your system and create a curve interpolating the straight line vertices just for display purpose. To create an interpolating curve thru a series of data points, Catmull-Rom spline is a good choice for easy implementation. This approach is likely to have a better performance than really using a B-spline curve in your system.
I would use B-splines for this problem since they can represent smooth curves with minimal number of control points. In addition finding the approximate smooth surface for a given data set is a simple linear algebra problem.
I have written a simple B-spline C++ library (includes Bezier curves as well) that I am using for scientific computations, here:
https://github.com/feevos/bsplines
it can accept arbitrary number of control points / multiplicities and give you back a basis. However, creating the B-spline curve that fits your data is something you have to do.
A great implementation of B-splines (but no Bezier curves) exists also in GNU GSL (
https://www.gnu.org/software/gsl/manual/html_node/Basis-Splines.html). Again here you have to implement the control points to be 2/3D for the given basis, and fix the boundary conditions to fit your data.
More information on open/closed curves and B-splines here:
https://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/index.html

How do I minimize global error across multiple image homographies?

I am stitching together multiple images with arbitrary 3D views of a planar surface. I have some estimation of which images overlap and a coarse estimate of each pairwise homography between pairs of overlapping images. However, I need to refine my homographies by minimizing the global error across all images.
I have read a few different papers with various methods for doing this, and I think the best way would be to use a non-linear optimization such as Levenberg–Marquardt, ideally in a fast way that is sparse and/or parallel.
Ideally I would like to use an existing library such as sba or pba, but I am really confused as to how to limit the calculation to just estimating the eight parameters of the homography rather than the full 3 dimensions for both camera pose and object position. I also found this handy explanation by Szeliski (see section 5.1 on page 50) but again, the math is all for a rotating camera rather than a flat surface.
How do I use L-M to minimize the global error for a set of homographies? Is there a speedy way to do this with existing bundle adjustment libraries?
Note: I cannot use methods that rely on rotation-only camera motion (such as in openCV) because those cannot accurately estimate camera poses, and I also cannot use full 3D reconstruction methods (such as SfM) because those have too many parameters which results in non-planar point clouds. I definitely need something specific to a full 8 parameter homography. Camera intrinsics don't really matter because I am already correcting those in an earlier step.
Thanks for your help!

Detecting a cross in an image with OpenCV

I'm trying to detect a shape (a cross) in my input video stream with the help of OpenCV. Currently I'm thresholding to get a binary image of my cross which works pretty good. Unfortunately my algorithm to decide whether the extracted blob is a cross or not doesn't perform very good. As you can see in the image below, not all corners are detected under certain perspectives.
I'm using findContours() and approxPolyDP() to get an approximation of my contour. If I'm detecting 12 corners / vertices in this approximated curve, the blob is assumed to be a cross.
Is there any better way to solve this problem? I thought about SIFT, but the algorithm has to perform in real-time and I read that SIFT is not really suitable for real-time.
I have a couple of suggestions that might provide some interesting results although I am not certain about either.
If the cross is always near the center of your image and always lies on a planar surface you could try to find a homography between the camera and the plane upon which the cross lies. This would enable you to transform a sample image of the cross (at a selection of different in plane rotations) to the coordinate system of the visualized cross. You could then generate templates which you could match to the image. You could do some simple pixel agreement tests to determine if you have a match.
Alternatively you could try to train a Haar-based classifier to recognize the cross. This type of classifier is often used in face detection and detects oriented edges in images, classifying faces by the relative positions of several oriented edges. It has good classification accuracy on faces and is extremely fast. Although I cannot vouch for its accuracy in this particular situation it might provide some good results for simple shapes such as a cross.
Computing the convex hull and then taking advantage of the convexity defects might work.
All crosses should have four convexity defects, making up four sets of two points, or four vectors. Furthermore, if your shape was a cross then these four vectors will have two pairs of supplementary angles.