Affine homography computation - computer-vision

Suppose you have a homography H between two images. The first image is the reference image, where the planar object covers the entire image (and is parallel to the image plane). The second image depicts the planar object from another arbitrary view (the run-time image). Now, given a point p = (x, y) in the reference image, I have a rectangular patch of pixels of size SxS (with S <= 20 pixels) around p. I can unwarp this patch using the pixels in the run-time image and the inverse homography H^(-1).
Now, what I want to do is compute, given H, an affine homography H_affine suitable for the patch around the point p. The naive way I am using is to build 4 point correspondences: the four corners of the patch and the corresponding points in the run-time image (computed with the full homography H). Given these four correspondences (all belonging to a small neighborhood of p), one can compute the affine homography by solving a simple linear system (using the gold standard algorithm). The affine homography computed this way approximates the full projective homography with reasonable precision (below 0.5 pixel), since we are in a small neighborhood of p (provided the scale is not too unfavorable, that is, the SxS patch does not correspond to a big region in the run-time image).
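For concreteness, a minimal numpy sketch of this naive 4-corner method (the patch centering, corner order and function name are illustrative assumptions, not taken from any particular library):

    import numpy as np

    def affine_from_patch(H, p, S):
        # Corners of the S x S patch centered on p = (x, y) in the reference image
        x, y = p
        h = S / 2.0
        corners = np.array([[x - h, y - h],
                            [x + h, y - h],
                            [x + h, y + h],
                            [x - h, y + h]], dtype=np.float64)
        # Map the corners into the run-time image with the full homography H
        mapped_h = np.hstack([corners, np.ones((4, 1))]) @ H.T
        mapped = mapped_h[:, :2] / mapped_h[:, 2:]
        # Least-squares fit of an affine map to the 4 correspondences
        A = np.hstack([corners, np.ones((4, 1))])             # 4 x 3
        M, _, _, _ = np.linalg.lstsq(A, mapped, rcond=None)   # 3 x 2
        return np.vstack([M.T, [0.0, 0.0, 1.0]])              # 3 x 3 H_affine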
Is there a faster way to compute H_affine given H (related to the point p and the patch SxS)?

You say that you already know H, but then it sounds like you're trying to compute it all over again, this time calling the result H_affine. The correct H would be a projective transformation, and it can be uniquely decomposed into 3 parts representing the projective part, the affine part and the similarity part. If you already know H and only want the affine part and below, then decompose H and ignore its projective component. If you don't know H, then the 4-point correspondence approach is the way to go.
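For illustration, a possible numpy sketch of that chain decomposition, H = H_S * H_A * H_P (the convention used in Hartley & Zisserman); it assumes H[2,2] != 0 and that the upper-left part, after removing the projective contribution, has positive determinant (no reflection), and is only one of several possible conventions:

    import numpy as np

    def decompose_homography(H):
        H = H / H[2, 2]                        # normalize so that H[2,2] = 1
        A, b, v = H[:2, :2], H[:2, 2], H[2, :2]
        # Projective part: identity block on top, last row (v1, v2, 1)
        H_P = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [v[0], v[1], 1.0]])
        # Remove the projective contribution: M = A - b * v^T = s * R * K
        M = A - np.outer(b, v)
        Q, Rup = np.linalg.qr(M)               # M = Q * Rup, Rup upper triangular
        D = np.diag(np.sign(np.diag(Rup)))     # force a positive diagonal on Rup
        Q, Rup = Q @ D, D @ Rup
        s = np.sqrt(np.linalg.det(Rup))        # scale so that det(K) = 1
        K = Rup / s
        H_A = np.array([[K[0, 0], K[0, 1], 0.0],
                        [0.0,     K[1, 1], 0.0],
                        [0.0,     0.0,     1.0]])
        H_S = np.array([[s * Q[0, 0], s * Q[0, 1], b[0]],
                        [s * Q[1, 0], s * Q[1, 1], b[1]],
                        [0.0,         0.0,         1.0]])
        return H_S, H_A, H_P                   # H_S @ H_A @ H_P gives back H (up to scale)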

Related

Given a Fundamental Matrix and Image Points in one image plane, find exactly corresponding points in second Image Plane

As a relative beginner in this topic, I have read the literature, but I am not sure how to manipulate the equations for my purposes and would like advice on tackling this.
Preamble:
I have 2 cameras in a stereo rig which have been calibrated, thus extracting data structures such as each camera's camera matrix K1 and K2, as well as the fundamental, essential, rotation and translation matrices F, E, R and T respectively. After rectifying, one also has the projection matrices P1 and P2 as well as the disparity-to-depth matrix Q.
My aim, however, is to test the triangulation method of OpenCV, and to this end I would like to use synthetic images where the correspondence between the points in image1 and image2 is exact.
My idea was to take an image of a chessboard with one camera, and use findChessboardCorners() and cornerSubPix() to get image points in the left camera; let's call them imagePoints1.
To get synthetically generated image points in the second image plane that correspond exactly to the points imagePoints1 on the left camera's image plane, I intend to use the property
x2^T * F * x1 = 0, given the F matrix and x1 (which represents one homogeneous 2D point from imagePoints1),
to generate said set of image points.
This is where I am stuck, since the obvious solution would be the zero vector, which trivially satisfies the equation; otherwise I get a parametric solution. How do I then get non-zero points that fulfill the property x2^T * F * x1 = 0, given x1 and F?
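For illustration, the parametric nature of the solution can be made explicit: x2 is only constrained to lie on the epipolar line l2 = F * x1, so any point on that line satisfies the property. A small numpy sketch with an illustrative helper name:

    import numpy as np

    def points_on_epipolar_line(F, x1, x_coords):
        # x1 is a homogeneous 2D point from imagePoints1; l2 = F * x1 is the
        # epipolar line (a, b, c) in image 2, i.e. a*x + b*y + c = 0
        a, b, c = F @ x1
        # Pick arbitrary x coordinates and solve for y (assumes b != 0,
        # i.e. the epipolar line is not vertical)
        return [np.array([x, -(a * x + c) / b, 1.0]) for x in x_coords]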
Thank you.

Usage of findEssentialMat function in OpenCV 3.0

I'm currently working on a project to recover the camera's 6-DOF pose from two images using SIFT/SURF. In older versions of OpenCV, I used findFundamentalMat to find the fundamental matrix, then obtained the essential matrix using the known camera intrinsics K, and eventually got R and t by matrix decomposition. The result is very sensitive and unstable.
I saw some people have the same issue here
OpenCV findFundamentalMat very unstable and sensitive
Some people suggest applying Nistér's 5-point algorithm, which is implemented in the latest version, OpenCV 3.0.
I have read an example from the
OpenCV documentation
In the example, it uses focal = 1.0 and Point2d pp(0.0, 0.0).
Are these the real focal length and principal point of the camera? What are the units: pixels, or actual size (e.g. millimeters)? I am having trouble understanding these two parameters. I think they should be acquired from a calibration routine, right?
For my current camera (VGA mode), I used the MATLAB Camera Calibrator to get these two parameters, which are:
Focal length (millimeters): [ 1104 1102]
Principal point (pixels):[ 259 262]
So, if I want to use my own camera parameters instead, should I directly fill in these values, or convert them to actual size, like millimeters?
Also, the translation result I get looks like a direction rather than an actual distance; is there any way I can get the actual-scale translation rather than just a direction?
Any help is appreciated.
Focal Length
The focal length that you get from camera calibration is in pixels. It is actually the ratio of the "real" focal length (e.g. in mm) and the pixel size (also in mm). The world units cancel out, and you are left with pixels. Unfortunately, you cannot estimate both the focal length in world units and the pixel size, only their ratio.
Principal Point
The principal point is also in pixels. It is simply the point where the optical axis intersects the image plane. One caveat: the principal point you get from the Camera Calibrator in MATLAB uses 1-based pixel coordinates, where the center of the top-left pixel of the image is (1,1). OpenCV uses 0-based pixel coordinates. So if you want to use your camera parameters in OpenCV, you have to subtract 1 from the principal point.
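As an illustration, converting the MATLAB Camera Calibrator output into an OpenCV-style camera matrix might look like this (a sketch; fc and cc are the focal length and principal point, both in pixels):

    import numpy as np

    def opencv_camera_matrix(fc, cc):
        fx, fy = fc
        cx, cy = cc[0] - 1.0, cc[1] - 1.0   # MATLAB is 1-based, OpenCV is 0-based
        return np.array([[fx, 0.0, cx],
                         [0.0, fy, cy],
                         [0.0, 0.0, 1.0]])

    K = opencv_camera_matrix((1104, 1102), (259, 262))   # values quoted above (pixels)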
Translation Vector
The translation vector you get from the essential matrix is a unit vector, because the essential matrix is only defined up to scale. In other words, you get a reconstruction where the unit is the distance between the cameras. If you need a metric reconstruction (in actual world units) you would either need to know the actual distance between the cameras, or you need to be able to detect an object of a known size in the scene. See this example.
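Putting it together, a hedged sketch of using the calibrated values with findEssentialMat/recoverPose; pts1 and pts2 are assumed to be matched pixel coordinates as Nx2 float arrays, and focal/pp are the calibrated values in pixels after the 0-based correction above:

    import cv2

    def relative_pose(pts1, pts2, focal, pp, baseline=None):
        E, mask = cv2.findEssentialMat(pts1, pts2, focal=focal, pp=pp,
                                       method=cv2.RANSAC, prob=0.999, threshold=1.0)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, focal=focal, pp=pp)
        # t is a unit vector; it only becomes metric if the real camera
        # separation (baseline) is known, e.g. in millimeters
        if baseline is not None:
            t = t * baseline
        return R, t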

displacement between two images using opencv surf

I am working on image processing with OpenCV.
I want to find the x, y and rotational displacement between two images in OpenCV.
I have found the features of the images using SURF and the features have been matched.
Now I want to find the displacement between the images. How do I do that? Can RANSAC be useful here?
regards,
shiksha
Rotation and two translations are three unknowns, so your minimum number of matches is two (since each match delivers two equations, or constraints). Indeed, imagine a line segment between two points in one image and the corresponding (matched) line segment in the other image. The difference between the segments' orientations gives you the rotation angle. After you have rotated, just use any of the matched points to find the translation. Thus this is a 3-DOF problem that requires two points. It is called a Euclidean transformation, rigid body transformation, or orthogonal Procrustes.
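A minimal numpy sketch of this two-point estimate (points given as (x, y) pairs; the helper name is illustrative):

    import numpy as np

    def rigid_from_two_matches(p1, p2, q1, q2):
        # (p1, p2) in image 1 match (q1, q2) in image 2
        angle_p = np.arctan2(p2[1] - p1[1], p2[0] - p1[0])
        angle_q = np.arctan2(q2[1] - q1[1], q2[0] - q1[0])
        theta = angle_q - angle_p                      # rotation angle
        R = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        t = np.asarray(q1) - R @ np.asarray(p1)        # translation from one match
        return theta, R, t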
Using a homography (an 8-DOF problem), which has no closed-form solution and relies on non-linear optimization, is a bad idea. It is slow (in the RANSAC case) and inaccurate, since it adds 5 extra DOF. RANSAC is only needed if you have outliers. In the case of pure noise and an overdetermined system (more than 2 points), the optimal solution that minimizes the sum of squared geometric distances between matched points is given in closed form by:
Problem statement: minimize ||R*P + t - Q||^2 over R (rotation) and t (translation)
Solution: R = V*U^T, t = Q_mean - R*P_mean
where X = P - P_mean, Y = Q - Q_mean, and the SVD gives X*Y^T = U*L*V^T; all matrices have the data points as columns. For a gentle intro to rigid transformations see this
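A numpy sketch of this closed-form solution, with the reflection guard that is usually added; P and Q are 2xN arrays with the data points as columns, as above:

    import numpy as np

    def rigid_transform_2d(P, Q):
        Pmean = P.mean(axis=1, keepdims=True)
        Qmean = Q.mean(axis=1, keepdims=True)
        X, Y = P - Pmean, Q - Qmean
        U, L, Vt = np.linalg.svd(X @ Y.T)              # X * Y^T = U * L * V^T
        d = np.sign(np.linalg.det(Vt.T @ U.T))         # guard against a reflection
        R = Vt.T @ np.diag([1.0, d]) @ U.T             # R = V * U^T (up to the guard)
        t = Qmean - R @ Pmean
        return R, t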

Compensate rotation from successive images

Let's say I have image1 and image2 obtained from a webcam. To take image2, the webcam undergoes a rotation (yaw, pitch, roll) and a translation.
What I want: remove the rotation from image2 so that only the translation remains (to be precise: my tracked points (x, y) from image2 should be rotated back to the same values as in image1, so that only the translation component remains).
What I have done/tried so far:
- I tracked corresponding features from image1 and image2.
- Calculated the fundamental matrix F with RANSAC to remove outliers.
- Calibrated the camera so that I got a camera matrix CAM_MATRIX (fx, fy and so on).
- Calculated the essential matrix from F with CAM_MATRIX (E = cam_matrix^T * F * cam_matrix).
- Decomposed the E matrix with OpenCV's SVD function so that I have a rotation matrix and translation vector.
- I know that there are 4 combinations and only 1 is the right translation vector/rotation matrix pair.
So my thought was: I know that the camera movement from image1 to image2 won't be more than, let's say, about 20° per axis, so I can eliminate at least 2 possibilities where the angles are too far off.
For the 2 remaining ones I have to triangulate the points and see which one is correct (I have read that I only need 1 point, but due to possible errors/outliers it should be done with a few more to be sure which one is right). I think I could use OpenCV's triangulation function for this? Is my reasoning right so far? Do I need to calculate the reprojection error?
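For reference, a small sketch of building E from F and enumerating the four R|t candidates (assuming OpenCV 3.x; cam_matrix is the 3x3 calibration matrix from the steps above):

    import cv2

    def candidate_poses(F, cam_matrix):
        E = cam_matrix.T @ F @ cam_matrix          # E = cam_matrix^T * F * cam_matrix
        R1, R2, t = cv2.decomposeEssentialMat(E)   # SVD-based decomposition
        # The four combinations; only one puts the triangulated points in front
        # of both cameras (this cheirality check is what cv2.recoverPose does)
        return [(R1, t), (R1, -t), (R2, t), (R2, -t)]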
Let's move on and assume that I finally obtained the right R|t matrix.
How do I continue? I tried to multiply a tracked point from image2 with the plain as well as the transposed rotation matrix, which should reverse the rotation (?) (for testing purposes I just tried both possible combinations of R|t; I have not done the triangulation in code yet), but the calculated point is way too far off from what it should be. Do I need the calibration matrix here as well?
So how can I invert the rotation applied to image2? (to be exact, apply the inverse rotation to my std::vector<cv::Point2f> array which contains the tracked (x,y) points from image2)
Displaying the de-rotated image would also be nice to have. Is this done with the warpPerspective function, like in this post?
(I just don't fully understand what the purpose of A1/A2 and dist in the T matrix is, or how I can adapt that solution to my problem.)
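For what it's worth, a hedged sketch of what the de-rotation could look like if the calibration matrix is indeed needed. The assumption here is that R maps camera-1 coordinates to camera-2 coordinates, so the pixel-domain warp that undoes it is K * R^T * K^(-1); points2 is the tracked (x, y) array from image2:

    import cv2
    import numpy as np

    def derotate(points2, image2, R, K):
        H = K @ R.T @ np.linalg.inv(K)             # undo the rotation in pixel space
        pts = np.float32(points2).reshape(-1, 1, 2)
        derotated_pts = cv2.perspectiveTransform(pts, H).reshape(-1, 2)
        h, w = image2.shape[:2]
        derotated_img = cv2.warpPerspective(image2, H, (w, h))   # for display
        return derotated_pts, derotated_img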

Homography computation of a deformed square planar object

Suppose you have a square planar object (a piece of paper). You take a photo of it; generally speaking, it will appear deformed. Suppose you process the image and compute the four corners of the planar object. Given the four points, you can compute a homography.
But now suppose that the object undergoes some type of deformation. All we can say about the nature of the deformation is:
it is "smooth" ( the surface of the object will not form sharp angles)
the surface of the object will be always totally visible even after the deformation.
For example: you stick the square paper on the surface of a cylindrical object.
The question is: given only the four pixel coordinates of the corners of the (deformed) planar object, can I compute the correct homography? That is, can I "remove" the effect of the deformation before computing the homography?
Even an "approximated" (read: working ;) ) method would be really useful.
Thanks.
PS: I wish to add that I don't know, a priori, the content of the planar object. In fact, the algorithm I am writing computes the homography, unwarps the object and checks its content. It is a 2D barcode, so I have an id/crc pair of numbers. If the CRC extracted from the object is equal to the CRC computed on the id, then it is a valid barcode.
A homography is by definition a plane-to-plane transform. If the barcode is small enough, you could probably assume that the object it is attached to is piecewise planar. After rectifying the image of the barcode, you could estimate a barrel distortion model.
If you want to remove the deformation first then you would have to estimate the surface first and then flatten it. That would be a lot more difficult.
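As an illustration of the first step suggested above, rectifying the barcode region with the plain 4-corner homography might look like this (a sketch; the corner order top-left, top-right, bottom-right, bottom-left and the output size are assumptions), after which a distortion model could be fitted on the rectified patch:

    import cv2
    import numpy as np

    def rectify_barcode(image, corners, side=256):
        src = np.float32(corners)                       # 4 detected corners
        dst = np.float32([[0, 0], [side - 1, 0],
                          [side - 1, side - 1], [0, side - 1]])
        H = cv2.getPerspectiveTransform(src, dst)
        return cv2.warpPerspective(image, H, (side, side))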