rotating an object with opticalflow? - c++

I am rather new to C++ and openframeworks. I am beginning to play with manipulating objects using the Lucas Kanade technique. I am having some success with pushing objects around but unfortunately I cannot figure out how to go about rotating them properly or even detect when rotational movement is occurring for that matter.
Does anyone have any pointers or tips they would like to share?
Many thanks,
N

Optical flow calculations won't on their own help you detect things like "rotational movement". Basically, all the optical flow calc is doing is looking at changes pixel-by-pixel, while what you mean by rotation is a larger aggregate of pixel change. An algorithm would need to detect something like "all the pixels on the edge of the object are flowing in a (counter-)clockwise direction". Very difficult to do, and I don't think there's anything in OpenFrameworks or OpenCV that will help you.
Are you trying to detect rotation of an object in the image, or rotation-like movements in the image that will affect a virtual object? If it's the former, I think there are OpenCV techniques for identifying objects and then tracking them, including things like rotation. I think the things to research are like "opencv object tracking" and "opencv object motion analysis".

To compute the 2x3 affine transformation matrix of your motion could be a solution. The affine transformation matrix contains tranlational and rotational movements as far as scaling. If you are using OpenCV than cv::getAffineTransform is what you are looking for where you can directly input the tracked feature points.

Related

Estimating a cuboid's rotation from its parallel projection

I paint in my spare time, and that means I have a truly massive collection of reference images. Folders full of buildings, people, animals, cars, etc. It's gotten to the point where it'd be great to tag the objects by their pose, so I can find the right object at the right angle. CVAT, an image annotating tool for machine learning, allows you to mark images with cuboids, as you can see in this picture.
But suddenly I'm wondering... is it even possible for a computer to estimate the rotation of a cuboid based on a single image, when all I can feed it are the eight (x,y) pairs that define the image of said cuboid?
My thinking is that I need to somehow invert the transformation matrix so that this cuboid looks like a rectangle. That would mean that we're looking at it "on-axis", and I'm imagining that this inversion could furnish me with those XYZ rotations I'm looking for.
My best lead right now is OpenCv's getPerspectiveTransform function, which can create a matrix that will warp an image, but that transformation seems to be purely two-dimensional.
Wikipedia does mention the idea of using an "augmented matrix" to perform transformations in an extra dimension, which seems apropos here, since I want to go from a 2D representation to a 3d.
A couple constraints & advantages that might clarify the feasibility, here:
The cuboids are rendered in a parallel projection. They don't match the perspective of the image, and that's okay! Just need a rough sense of their pose -- a margin of error of 10 degrees on any given axis of rotation is fine by me, in case there are some inexact solutions that could work.
In the case of multiple cuboids in the scene, I don't care at all about their interrelations -- each case can be treated separately.
I always have a sense of the "rear wall" of the cuboid, because I'm careful in how I make these annotations, in case that symmetry-breaking helps.
The lengths of edges are irrelevant, I'm not trying to measure the "aspect ratio" of these bounding cuboids.
Thank you for any advice or hints!

OpenCV - PCA analysis on BW image - area vs shape - is there already an implementation?

The shape of an object is detected on a bw image. The object is a black continuous shape, the background is white.
We use PCA (http://docs.opencv.org/3.1.0/d1/dee/tutorial_introduction_to_pca.html) to get the object direction and align the object. Currently the shape itself (the points on the contour) is the input to the opencv PCA implementation. This usually works very well. But from time to time there is small dirt on the object border, causing the shape to pass around the dirt. This causes more points and more weight on one side, slightly turning the object.
Idea: Instead of the contour, we use the area of the object as input for our PCA analysis. The issue there, to check all points on if they are inside the contour and then use them for PCA slows the application down. This part will be about 52352 times slower.
New Approach: We take random points in the image, check if they are inside the shape and if so, use them for our PCA. We have to see if we can get the consistent quality needed from this approach.
Is there already a similar implementation in opencv which is using the area instead of the shape?
Another approach would be to put a mesh over the object and use the mesh points inside the object for PCA.
Is there already something similar available one can just use or does one quickly need to implement something like this?
Going for straight lines around the object isn't an option.
Given that we have received very limited information about your problem (posting images would help a lot) and you do not seem to know the probability density function of the noise, your best bet is to consider the noise to be Gaussian.
As such, and following your intuition, my suggested approach is to take a few (by a few I mean statistically relevant but not raising the computation time that much) random points that lie inside the object and compute the PCA.
Repeat this procedure in an iterative loop and store somewhere the resulting rotation angles you get from the application of the PCA to the object shape.
Stop once you have enough point, compute the mean of the rotation angles: this is a decent estimate of the true angle. Compute also the standard deviation to get a measure of the quality of your estimation. By "enough points" you can consider that ~30 points is usually considered to be "enough" for being representative of the underlying population according to the central limit theorem.
If you want, you can improve on this approach in many ways, for example doing robust estimation of the true angle once you have collected enough points. It all depends on the data you have at hand...take my suggestion just as a starting point.
There are few parameters that you could change, in which may improve your system.
First is the threshold you use to binarize your image. I don't know what your application is about, but you could use other color systems, or normalize your image by cromacity, and after that, apply the new threshold.
Other aspect is to exclude shapes (contours) that have bigger or smaller area that what you are expecting.
To add up, you may use a blur filter before detect contours.
I don't know how the noise looks but when you say "small dirt" I think it might be only some few pixels that is a lot smaller then the object it self, but it might be attached to the object. To reduce this noise it might be possible to perform an opening (morphology) on the binary image.
http://docs.opencv.org/2.4/doc/tutorials/imgproc/opening_closing_hats/opening_closing_hats.html

OpenCV C++, find the equation describing the shape of an object

Thresholded Image
BGR Image
Fitted Thresholded Image
Hi all. I'm working on a project about computer vision using OpenCV for C++ interface. My purpose is to track a moving deformable object that is marked with a colored tape. By processing each frame of the video I'm able to effectively isolate the color (as you can see in the thresholded image) and track its trajectory, movement and shape into the BGR image.
My problem is that I need to extrapolate an equation or polynomial that can describe the current shape assumed by my tracked object.
Is there an effective way to do this? I've no idea on how to address the problem.
Thanks in advance,
Cheers!
If your final goal is to detect your shape in various forms i think you want to read about Active shape model: https://en.wikipedia.org/wiki/Active_shape_model
If you just want to get a polynomial fit of the shape in each instance of time i would use the suggestion of Cherkesgiller Tural and read about 2D curve fitting.
If I understood correctly:
I would start to fit a polygon on your shape. A common method for that is alpha-shapes.
You can also try an optimization approach which is enormously powerful because you can basically design your cost-function and constrains however you want. But it is computationally very costly (depending on the algorithm).
Have a look at this thread: It might help you.

Suggestions for improvenent object detection

I am new to opencv and i am trying to track some moving objects(e.g. cars) in an image. I have computed the optical flow and have used it to implement kmeans and try something like background substraction , i mean seperate moving objects from stationary. Then i have also used the intensity of the video as information . The following screenshots are from the result of the flow and the k means segmentation respectively :
The results are not good but also not bad. How could i proceed from now on ? I am thinking of trying SURF feature extraction and SURF detector . Any ideas are welcome .
It seems you are using dense optical flow. I would advice trying some feature detection (surf, fast, whatever) followed by sparse optical flow tracking(from my experience it is better than feature matching for this task). Then, once you have the feature correspondences over some frames, you can use fundamental matrix, trifocal tensor, plane+parallax or some other method to detect moving objects. You can later cluster moving objects into different motion groups that represent different objects.
Also it seems that your camera is fixed. In this case you can drop the movement detection step, and consider only tracks with enough displacement, and then do the clustering into motion groups.

findHomography usage opencv

I am using opencv c++ and am a new user. I am interested in object detection problems . So far I have studies and implemented the use of sparse optical flow( Lucas Kanade method) in a video from a stationary camera.After trying k means and Background substraction , I have decided to move to a more difficult problem , that is the moving camera.
I have so far studied some documentation and found out that I could use cv::findHomography in order to find the inliers or outliers during the sequence of frames in my video and then understand from the returned values what movement is caused due to camera motion and what due to object motion. In addition , I could use SURF features to track some objects and then decide which of them are good points .
However , I was wondering how I could implement this theory. For example, should I use the first frame as ground truth and detect some features using SURF and then for the rest of the video use findHomography for each frame ? Any ideas/help is welcome !
Detecting moving objects from moving camera is a quite challenging task, and requires solid understanding of multiple view geometry, besides there is less info on this topic available (than, for example, about structure from motion), so be warned!
Anyway, homography matrix will not be a good choice for detection of moving objects (unless you are 100% sure that your background can be represented by a flat surface accurately enough). You should probably use a fundamental matrix or trifocal tensor.
Fundamental matrix is computed from point correspondences between 2 frames. It associates points on one image with lines on other image (so called epipolar lines), and this way it is independent from scene structure. After you have obtained F matrix using some robust estimation method, like RANSAC or LMEDS (RANSAC seems to be better for this kind of task), you can calculate the reprojection error for each point. Objects that move independently from scene would not be accurately described by F matrix and will have a bigger error. So, outliers of F matrix calculated from image matches over two frames can be considered moving objects. One note though - objects that move along epipolar lines would not be detected by this approach, since their parallax can be also described by some depth level.
Trifocal tensor does not have the depth/motion ambiguity with objects that move along epipolar lines, but it is harder to estimate and it is not included into OpenCV. It can be calculated from correspondences over 3 frames, and its usage can be conceptually described as triangulating a point from 2 views and then calculating reprojection error on a third view.
As for the matching - I still think that LK tracking will be better than SURF matching if you work with video sequences, since in that case you don't need to consider very distant points as matches, and tracking usually is faster then detection+matching.