OpenCV - things to do between stereoCalibrate and collecting 2x2D coordinates - c++

I've calibrated a pair of cameras, and I have a loop where markings on a person's legs are being tracked and their positions saved. What now?
Do I have to cv::undistort the images before saving the pixel coordinates? Is there anything else that has to be done before I try to convert these pairs of 2D coordinates to 3D (I don't know how, but that's for later)?

Yes, you need to remove lens distortion before doing triangulation. If you are going to apply cv::undistort to the whole images, though, you have to apply it before detecting/tracking the points of interest.
If you are not going to compute a disparity image and you are only interested in a few points, it is more efficient to detect/track the points of interest on the raw image and then apply cv::undistortPoints to just those points.
After you remove distortion from the points, you can compose the projection matrices for the two cameras and carry out triangulation.
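For example, here is a minimal sketch of those two steps, assuming K1, D1, K2, D2, R and T are the outputs of your earlier cv::stereoCalibrate call and ptsL/ptsR hold the tracked marker positions in the left and right images (all of these names are placeholders):

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Sketch only: K1, D1, K2, D2, R, T are assumed to come from cv::stereoCalibrate
// (CV_64F), ptsL/ptsR are the tracked pixel coordinates of the same markers.
std::vector<cv::Point3f> triangulateMarkers(
    const std::vector<cv::Point2f>& ptsL, const std::vector<cv::Point2f>& ptsR,
    const cv::Mat& K1, const cv::Mat& D1,
    const cv::Mat& K2, const cv::Mat& D2,
    const cv::Mat& R,  const cv::Mat& T)
{
    // Undistort only the selected points. Passing K as the new projection matrix
    // keeps the result in pixel coordinates; without it the points come back in
    // normalized coordinates and the projection matrices below must drop K1/K2.
    std::vector<cv::Point2f> undistL, undistR;
    cv::undistortPoints(ptsL, undistL, K1, D1, cv::noArray(), K1);
    cv::undistortPoints(ptsR, undistR, K2, D2, cv::noArray(), K2);

    // Projection matrices: the left camera is the reference, P1 = K1 [I|0],
    // and the right camera is P2 = K2 [R|T].
    cv::Mat P1 = K1 * cv::Mat::eye(3, 4, CV_64F);
    cv::Mat Rt;
    cv::hconcat(R, T, Rt);
    cv::Mat P2 = K2 * Rt;

    // Triangulate; the result is 4xN in homogeneous coordinates.
    cv::Mat points4D;
    cv::triangulatePoints(P1, P2, undistL, undistR, points4D);
    points4D.convertTo(points4D, CV_64F);

    std::vector<cv::Point3f> points3D;
    for (int i = 0; i < points4D.cols; ++i) {
        double w = points4D.at<double>(3, i);   // dehomogenize
        points3D.emplace_back(
            static_cast<float>(points4D.at<double>(0, i) / w),
            static_cast<float>(points4D.at<double>(1, i) / w),
            static_cast<float>(points4D.at<double>(2, i) / w));
    }
    return points3D;   // expressed in the left camera's coordinate system
}
```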

Related

How to triangulate Points from a Single Camera multiple Images?

I have a single calibrated camera pointing at a checkerboard at different locations, with known:
Camera intrinsics: fx, fy, cx, cy
Distortion coefficients: K1, K2, K3, T1, T2, etc.
Camera rotation & translation (R, T) from an IMU
After undistortion, I have computed point correspondences of the checkerboard points in all the images, with known camera-to-camera rotation and translation vectors.
How can I estimate the 3D points of the checkerboard in all the images?
I think OpenCV has a function to do this but I'm not able to understand how to use this!
1) cv::sfm::triangulatePoints
2) triangulatePoints
How to compute the 3D Points using OpenCV?
Since you already have the matched points from the images, you can use findFundamentalMat() to get the fundamental matrix. Keep in mind that you need at least 7 matched points to do this. If you have more than 8 points, CV_FM_RANSAC might be the best option.
Then use cv::sfm::projectionsFromFundamental() to find the projection matrix for each image, and check that the projection matrices are valid (e.g. check that the points are in front of the camera).
Then feed the projections and the points into cv::sfm::triangulatePoints().
Hope this helps :)
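A rough sketch of those steps, assuming opencv_contrib was built with the sfm module and that pts1/pts2 are the already-undistorted matched points (the names are placeholders):

```cpp
#include <opencv2/calib3d.hpp>
#include <opencv2/sfm.hpp>
#include <vector>

void triangulateFromMatches(const std::vector<cv::Point2f>& pts1,
                            const std::vector<cv::Point2f>& pts2,
                            cv::Mat& points3d)                   // 3xN output
{
    // Fundamental matrix from the matches (>= 7 needed; RANSAC wants >= 8).
    cv::Mat F = cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC, 3.0, 0.99);

    // Projection matrices for the two views (defined up to a projective ambiguity).
    cv::Mat P1, P2;
    cv::sfm::projectionsFromFundamental(F, P1, P2);

    // cv::sfm::triangulatePoints expects one 2xN matrix of points per view.
    const int n = static_cast<int>(pts1.size());
    cv::Mat x1(2, n, CV_64F), x2(2, n, CV_64F);
    for (int i = 0; i < n; ++i) {
        x1.at<double>(0, i) = pts1[i].x;  x1.at<double>(1, i) = pts1[i].y;
        x2.at<double>(0, i) = pts2[i].x;  x2.at<double>(1, i) = pts2[i].y;
    }

    std::vector<cv::Mat> points2d{ x1, x2 };
    std::vector<cv::Mat> projections{ P1, P2 };
    cv::sfm::triangulatePoints(points2d, projections, points3d);
}
```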
Edit
The rotation and translation matrices are needed to change reference frames, because the camera moves in SfM. The reference frame is at the position of the camera. Transforms are needed to make sure the positions of the points are coherent (under the same reference frame, which is usually the reference frame of the camera in the first image), so all the points are in the same coordinate system.
I.e., to relate the points gathered in the second frame to the first frame, the third to the second, and so on.
So basically you can use the R and T vectors to construct a transform matrix for each frame and multiply it with your points to put them in the reference frame of the camera in the first frame.
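As an illustration (not the poster's code), a small sketch of that chaining, assuming Rs[i] and Ts[i] take points from the frame of camera i+1 into the frame of camera i; if yours go the other way, chain the inverses instead:

```cpp
#include <opencv2/core.hpp>
#include <vector>

// Build the 4x4 homogeneous transform [R|t; 0 0 0 1] for one frame pair.
cv::Mat toHomogeneous(const cv::Mat& R, const cv::Mat& t)   // R: 3x3, t: 3x1, CV_64F
{
    cv::Mat T = cv::Mat::eye(4, 4, CV_64F);
    R.copyTo(T(cv::Rect(0, 0, 3, 3)));
    t.copyTo(T(cv::Rect(3, 0, 1, 3)));
    return T;
}

// Map a homogeneous 3D point expressed in the frame of camera k (0-based, camera 0
// is the first) back into the frame of the first camera by chaining the transforms.
cv::Mat toFirstFrame(const cv::Mat& pointInFrameK,           // 4x1: (X, Y, Z, 1)
                     const std::vector<cv::Mat>& Rs,
                     const std::vector<cv::Mat>& Ts,
                     int k)
{
    cv::Mat chain = cv::Mat::eye(4, 4, CV_64F);
    for (int i = 0; i < k; ++i)
        chain = chain * toHomogeneous(Rs[i], Ts[i]);
    return chain * pointInFrameK;
}
```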

OpenCV Structure from Motion Reprojection Issue

I am currently facing an issue with my Structure from Motion program based on OpenCv.
I'm going to try to describe what it does, and what it is supposed to do.
This program relies on the classic "structure from motion" method.
The basic idea is to take a pair of images, detect their keypoints and compute the descriptors of those keypoints. Then, the keypoints matching is done, with a certain number of tests to insure the result is good. That part works perfectly.
Once this is done, the following computations are performed: fundamental matrix, essential matrix, SVD decomposition of the essential matrix, camera matrix computation and, finally, triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly, for a pair.
Indeed, here is my problem: for a pair of images, the 3D point coordinates are calculated in the coordinate system of the first image of the pair, taken as the reference image. When working with more than two images, which is the objective of my program, I have to reproject the computed 3D points into the coordinate system of the very first image, in order to get a consistent result.
My question is: how do I reproject 3D point coordinates given in one camera-related system into another camera-related system? With the camera matrices?
My idea was to take the 3D point coordinates and multiply them by the inverses of the preceding camera matrices.
To clarify:
Suppose I am working on the third and fourth image (hence, the third pair of images, because I am working like 1-2 / 2-3 / 3-4 and so on).
I get my 3D point coordinates in the coordinate system of the third image; how do I reproject them properly into the very first image's coordinate system?
I would have done the following:
Get the 3D points coordinates matrix, apply the inverse of the camera matrix for image 2 to 3, and then apply the inverse of the camera matrix for image 1 to 2.
Is that even correct? Those camera matrices are non-square, so I can't invert them.
I am surely mistaking somewhere, and I would be grateful if someone could enlighten me, I am pretty sure this is a relative easy one, but I am obviously missing something...
Thanks a lot for reading :)
Let us say you have a 3x4 extrinsic parameter matrix called P. To match the notation of the OpenCV documentation, this is [R|t].
This matrix P describes the projection from world space coordinates to the camera space coordinates. To quote the documentation:
[R|t] translates coordinates of a point (X, Y, Z) to a coordinate system, fixed with respect to the camera.
You are wondering why this matrix is non-square. That is because in the usual context of OpenCV, you are not expecting homogeneous coordinates as output. Therefore, to make it square, just add a fourth row containing (0,0,0,1). Let's call this new square matrix Q.
You have one such matrix for each pair of cameras, that is, you have one Qk matrix for each pair of images {k, k+1} that describes the projection from the coordinate space of camera k to that of camera k+1. Those matrices are invertible because they describe isometries in homogeneous coordinates.
To go from the coordinate space of camera 3 to that of camera 1, just apply to your points the inverse of Q2 and then the inverse of Q1.
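A small sketch of this, where P1to2 and P2to3 are hypothetical names for the 3x4 [R|t] matrices estimated for image pairs 1-2 and 2-3:

```cpp
#include <opencv2/core.hpp>

// Turn a 3x4 [R|t] matrix into the invertible 4x4 matrix Q described above.
cv::Mat makeSquare(const cv::Mat& P)                    // P: 3x4, CV_64F
{
    cv::Mat Q = cv::Mat::eye(4, 4, CV_64F);
    P.copyTo(Q(cv::Rect(0, 0, 4, 3)));                  // top three rows = [R|t]
    return Q;
}

// Bring a homogeneous 3D point expressed in camera 3 coordinates back into the
// coordinate system of camera 1 by applying the inverse of Q2, then of Q1.
cv::Mat camera3ToCamera1(const cv::Mat& pointInCam3,    // 4x1: (X, Y, Z, 1)
                         const cv::Mat& P1to2,          // [R|t] from view 1 to view 2
                         const cv::Mat& P2to3)          // [R|t] from view 2 to view 3
{
    cv::Mat Q1 = makeSquare(P1to2);
    cv::Mat Q2 = makeSquare(P2to3);
    return Q1.inv() * Q2.inv() * pointInCam3;
}
```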

Finding the Z coordinate using a disparity map

I have found Disparity map of two stereoscopic images. And now I have to write an OpenGL code to visualize it for 3D reconstruction.
OpenGL has the function glVertex3f(), for which three coordinates have to be given.
Two of the dimensions are the point's position on the image.
So how do I find the Z dimension using the disparity map?
Please suggest something on this.
Since you have found a disparity map, I assume that you are working with rectified images. In that case, the Z coordinate is given by a simple similar-triangles formulation:
z = B*f/d, where f is the focal length of the camera used (in pixels), d is the obtained disparity value for the pixel of interest, and B is the baseline between the two stereo cameras.
Note that the unit of z will be the same as that of B.
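For instance, a minimal sketch under the assumption that the disparity map is already a float image in pixel units (StereoBM/StereoSGBM return fixed-point CV_16S disparities that must be converted to float and divided by 16 first):

```cpp
#include <opencv2/core.hpp>

// z = B*f/d for every valid pixel of a rectified disparity map.
cv::Mat disparityToDepth(const cv::Mat& disparity,   // CV_32F, disparity in pixels
                         float focalLengthPx,        // f, in pixels
                         float baseline)             // B, in the unit you want z in
{
    cv::Mat depth(disparity.size(), CV_32F, cv::Scalar(0.f));
    for (int y = 0; y < disparity.rows; ++y) {
        for (int x = 0; x < disparity.cols; ++x) {
            float d = disparity.at<float>(y, x);
            if (d > 0.f)                             // skip invalid / zero disparities
                depth.at<float>(y, x) = baseline * focalLengthPx / d;
        }
    }
    return depth;                                    // z values, same unit as B
}
```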

Homography for Image Rectification

We can do image rectification given stereo images. For a single image we can find the two vanishing points and then the vanishing line. Using this vanishing line we can do projective rectification. But what constraints are required for affine rectification?
I basically want to rectify an image to its frontal view, such that parallel lines in the world are parallel in the image and also parallel to the x-axis. I hope I am making myself clear.
Also, what is the homography to transfer a vanishing point (x, y) in the image to (1, 0, 0)?
Thanks in advance
A projective transformation transfers an imaged point to an ideal point (1,0,0). It thereby rectifies an image up to an affine transformation of what people would normally want to see. This process is "affine rectification". Affine transformations keep lines parallel, but can skew the image and make it look weird, so I'm not sure if that's where you want to stop. Affine transformations can turn circles into ellipses and can change the angle of intersecting lines.
To remove the affine transformation (metric rectification), one must identify the "conic dual to the circular points". This can be done by identifying two pairs of perpendicular lines; identifying an ellipse that should really be a circle; or by using two known length ratios. Once this is done, the only differences between your image and how you would like it to look will be its scale, its x and y positions and the rotation of the whole image.
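As a sketch of the affine-rectification step only (following Hartley & Zisserman), assuming v1 and v2 are the two vanishing points in homogeneous image coordinates: the homography that maps the imaged line at infinity back to (0, 0, 1) removes the projective part of the distortion.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

cv::Mat affineRectify(const cv::Mat& image, cv::Vec3d v1, cv::Vec3d v2)
{
    // The imaged line at infinity (vanishing line) passes through both vanishing points.
    cv::Vec3d l = v1.cross(v2);
    l[0] /= l[2];  l[1] /= l[2];  l[2] = 1.0;   // scale so that l = (l1, l2, 1)

    // H maps the line (l1, l2, 1) to the true line at infinity (0, 0, 1); after
    // warping, world-parallel lines are parallel again (up to an affinity).
    cv::Mat H = (cv::Mat_<double>(3, 3) <<
                 1.0,  0.0,  0.0,
                 0.0,  1.0,  0.0,
                 l[0], l[1], l[2]);

    cv::Mat rectified;
    cv::warpPerspective(image, rectified, H, image.size());
    return rectified;
}
```

Aligning the now-parallel lines with the x-axis afterwards is just an image rotation; removing the remaining affine distortion needs the extra constraints described above.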

How to use a chessboard to find the rotation/translation between 2 cameras

I am using opencv with C, and I am trying to get the extrinsic parameters (Rotation and translation) between 2 cameras.
I'm told that a checkerboard pattern can be used to calibrate, but I can't find any good samples on this. How do I go about doing this?
edit
The suggestions given are for calibrating a single camera with a checkerboard. How would you find the rotation and translation between 2 cameras given the checkerboard images from both views?
I was using code from http://www.starlino.com/opencv_qt_stereovision.html. It has some useful information, and the author's code is pretty easy to understand and analyze; it covers both chessboard calibration and getting a depth image from stereo cameras. I think it's based on this OpenCV book.
The OpenCV library and about 3 chapters of the OpenCV book also cover this.
A picture from a camera is just a projection of a bunch of color samples onto a plane. Assuming that the camera itself creates pictures with square pixels, the possible position of a given pixel is a vector from the camera's origin through the plane the pixel was projected onto. We'll refer to that plane as the picture plane.
One sample doesn't give you that much information. Two samples tell you a little bit more: the position of the camera relative to the plane created by three points, the two sample points and the camera position. And a third sample tells you the relative position of the camera in the world; this will be a single point in space.
If you take the same three samples and find them in another picture taken from a different camera, you will be able to determine the relative position of the cameras from the three samples (and their orientations based on the right and up vectors of the picture plane). To get the correct distance, you need to know the distance between the actual sample points; in the case of a checkerboard, that is the physical dimensions of the board.
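The question uses the C interface, but as an illustrative sketch with the current C++ API (all variable names are placeholders): detect the checkerboard corners in both cameras for the same shots, calibrate each camera on its own, and then let cv::stereoCalibrate estimate the rotation and translation between them.

```cpp
#include <opencv2/calib3d.hpp>
#include <vector>

// objectPoints: physical 3D corner positions of the board (same for every shot);
// imagePointsL/R: corners found with cv::findChessboardCorners in each camera.
void calibratePair(const std::vector<std::vector<cv::Point3f>>& objectPoints,
                   const std::vector<std::vector<cv::Point2f>>& imagePointsL,
                   const std::vector<std::vector<cv::Point2f>>& imagePointsR,
                   cv::Size imageSize)
{
    cv::Mat K1, D1, K2, D2;            // per-camera intrinsics and distortion
    cv::Mat R, T, E, F;                // extrinsics between the two cameras
    std::vector<cv::Mat> rvecs, tvecs;

    // Calibrate each camera on its own first...
    cv::calibrateCamera(objectPoints, imagePointsL, imageSize, K1, D1, rvecs, tvecs);
    cv::calibrateCamera(objectPoints, imagePointsR, imageSize, K2, D2, rvecs, tvecs);

    // ...then keep the intrinsics fixed and estimate only the relative pose.
    double rms = cv::stereoCalibrate(objectPoints, imagePointsL, imagePointsR,
                                     K1, D1, K2, D2, imageSize,
                                     R, T, E, F, cv::CALIB_FIX_INTRINSIC);

    // R, T now map a point from the first camera's frame into the second's;
    // rms is the reprojection error (well under 1 pixel is a good sign).
    (void)rms;
}
```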