I am trying to develop an augmented reality program that overlays a 3D object on top of a marker. The model does not move proportionately with the marker. Here is the list of things that I did:
1) Using OpenCV: a) I used the solvePnP method to find rvecs and tvecs. b) I also used the Rodrigues method to find the rotation matrix and appended the tvecs vector to get the projection matrix. c) Just for testing, I made some points and lines and projected them to make a cube. This works perfectly fine and I am getting a good output.
2) Using Irrlicht: a) I tried to place a 3D model (at position (0,0,0) and rotation (0,0,0)) with the camera feed running in the background. b) Using the rotation matrix found with Rodrigues in OpenCV, I calculated the pitch, yaw and roll values following this post ("http://planning.cs.uiuc.edu/node103.html") and passed them to the rotation field. In the position field I passed the tvecs values. The tvecs values are tvecs[0], -tvecs[1], tvecs[2].
The model is moving in the correct directions but it is not moving proportionately. Meaning, if I move the marker 100 pixels in the x direction, the model only moves 20 pixels (the values 100 and 20 are not measured, I just took arbitrary values to illustrate the example). The same happens for the y axis and z axis. I do know I have to introduce another transformation matrix that maps the OpenCV camera coordinates to Irrlicht camera coordinates, and that it is a 4x4 matrix. But I do not know how to find it. Also, OpenCV's projection matrix [R|t] is a 3x4 matrix and it yields a 2D point that is to be projected. The 4x4 mapping between OpenCV and Irrlicht requires a 3D point (made homogeneous) to be fed into a 4x4 matrix. How do I achieve that?
The 4x4 matrix you are writing about seems to be M=[ R|t; 0 1], where R is the 3x3 rotation matrix and t is the 3x1 translation vector. To get the transformed coordinates v' of a 4x1 point v ([x y z 1]^T), just compute v'=Mv.
Your problem with scaling may also be caused by a difference between the units used for camera calibration in OpenCV and those used by the other library.
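A minimal sketch of that 4x4 matrix in plain C++, with no OpenCV or Irrlicht types. The names `makePose` and `transform` are made up for illustration, and R and t are assumed to be the Rodrigues output and tvec from solvePnP copied into plain arrays:

```cpp
#include <array>

using Mat3 = std::array<std::array<double, 3>, 3>;
using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;
using Vec4 = std::array<double, 4>;

// Build M = [R | t; 0 0 0 1] from a 3x3 rotation and a 3x1 translation.
Mat4 makePose(const Mat3& R, const Vec3& t) {
    Mat4 M{};  // value-initialized, so every entry starts at 0
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            M[r][c] = R[r][c];
        }
        M[r][3] = t[r];
    }
    M[3][3] = 1.0;
    return M;
}

// Compute v' = M * v for a homogeneous point v = [x y z 1]^T.
Vec4 transform(const Mat4& M, const Vec4& v) {
    Vec4 out{};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += M[r][c] * v[c];
        }
    }
    return out;
}
```

Note that tvec comes out in whatever units the object points passed to solvePnP were expressed in, so the translation may need rescaling to the units of the model in the rendering engine.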
Related
I am currently facing an issue with my Structure from Motion program based on OpenCV.
I will try to describe what it does, and what it is supposed to do.
This program relies on the classic "structure from motion" method.
The basic idea is to take a pair of images, detect their keypoints and compute the descriptors of those keypoints. Then the keypoints are matched, with a certain number of tests to ensure the result is good. That part works perfectly.
Once this is done, the following computations are performed : fundamental matrix, essential matrix, SVD decomposition of the essential matrix, camera matrix computation and finally, triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly, for a pair.
Here is my problem: for a pair of images, the 3D point coordinates are calculated in the coordinate system of the first image of the pair, taken as the reference image. When working with more than two images, which is the objective of my program, I have to re-express the 3D points in the coordinate system of the very first image, in order to get a consistent result.
My question is: how do I transform 3D point coordinates given in one camera's coordinate system into another camera's coordinate system? With the camera matrices?
My idea was to take the 3D point coordinates and multiply them by the inverse of each preceding camera matrix.
I clarify :
Suppose I am working on the third and fourth image (hence, the third pair of images, because I am working like 1-2 / 2-3 / 3-4 and so on).
I get my 3D point coordinates in the coordinate system of the third image. How do I re-express them properly in the coordinate system of the very first image?
I would have done the following:
Get the 3D point coordinates matrix, apply the inverse of the camera matrix for image 2 to 3, and then apply the inverse of the camera matrix for image 1 to 2.
Is that even correct?
I ask because those camera matrices are non-square, so I can't invert them.
I am surely making a mistake somewhere, and I would be grateful if someone could enlighten me. I am pretty sure this is a relatively easy one, but I am obviously missing something...
Thanks a lot for reading :)
Let us say you have a 3x4 extrinsic parameter matrix called P. To match the notation of the OpenCV documentation, this is [R|t].
This matrix P describes the projection from world space coordinates to the camera space coordinates. To quote the documentation:
[R|t] translates coordinates of a point (X, Y, Z) to a coordinate system, fixed with respect to the camera.
You are wondering why this matrix is non-square. That is because in the usual context of OpenCV, you are not expecting homogeneous coordinates as output. Therefore, to make it square, just add a fourth row containing (0,0,0,1). Let's call this new square matrix Q.
You have one such matrix for each pair of cameras; that is, you have one Qk matrix for each pair of images {k,k+1}, which describes the transformation from the coordinate space of camera k to that of camera k+1. Those matrices are invertible because they describe isometries in homogeneous coordinates.
To go from the coordinate space of camera 3 to that of camera 1, just apply to your points the inverse of Q2 and then the inverse of Q1.
I get an image point in the left camera (pointL) and the corresponding image point in the right camera (pointR) of my stereo camera using feature matching. The two cameras are parallel and are at the same "height". There is only an x-translation between them.
I also know the projection matrices for each camera (projL, projR), which I got during calibration using initUndistortRectifyMap.
For triangulating the point, I call:
triangulatePoints(projL, projR, pointL, pointR, pos3D), where pos3D is the output 3D position of the object.
Now, I want to project the 3D-coordinates to the 2D-image of the left camera:
2Dpos = projL*3dPos
The resulting x-coordinate is correct. But the y-coordinate is off by about 20 pixels.
How can I fix this?
Edit:
Of course, I need to use homogeneous coordinates, in order to multiply it with the projection matrix (3x4). For that reason, I set:
3dPos[0] = x;
3dPos[1] = y;
3dPos[2] = z;
3dPos[3] = 1;
Is it wrong to set 3dPos[3] to 1?
Note:
All images are remapped, I do this in a kind of preprocessing step.
Of course, I always use the homogeneous coordinates
You are likely projecting into the rectified camera. You need to apply the inverse of the rectification warp to obtain the point in the original (undistorted) linear camera coordinates, then apply the distortion model to get into the original image.
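On the homogeneous-coordinates part of the question: setting the fourth component to 1 is fine, provided the 3-vector that comes out of the multiplication is divided by its third component. Here is a minimal sketch in plain C++, where `Mat34` and `projectPoint` are illustrative names rather than OpenCV API. It only covers the projection arithmetic and does not address the rectification issue described above:

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat34 = std::array<std::array<double, 4>, 3>;
using Vec2 = std::array<double, 2>;
using Vec4 = std::array<double, 4>;

// Project a homogeneous 3D point X = [x y z w]^T with a 3x4 projection
// matrix P and convert the homogeneous result to pixel coordinates.
Vec2 projectPoint(const Mat34& P, const Vec4& X) {
    std::array<double, 3> x{};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 4; ++c) {
            x[r] += P[r][c] * X[c];
        }
    }
    // A zero third component means the point has no finite image position.
    assert(std::fabs(x[2]) > 0.0);
    return {x[0] / x[2], x[1] / x[2]};
}
```

The same normalization applies on the input side: triangulatePoints returns points in homogeneous coordinates, so x, y and z should be obtained by dividing by the fourth component before it is replaced with 1.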
I'm working on an AR program. I have done these steps:
Detect corners of a chess board with OpenCV
Use solvePnP to find rvec and tvec
Apply Rodrigues on rvec to get R_mat
Use hconcat(R_mat, tvec, P); to concatenate R_mat and tvec to get Projection matrix
Apply decomposeProjectionMatrix on P to get a new translation vector, T2, and eulerAngles
Now my problem is that, in the translation vector, only tvec[2] (the translation along the Z-axis) is correct,
and in the rotations, again, among the Euler angles (from decomposeProjectionMatrix) only eulerAngles[2] (the rotation around the Z-axis) is correct.
I don't know how I can get the translation and rotation for the X-axis and Y-axis correctly.
I'm going to use these in OpenGL to augment a cube on the pattern.
In my code, the translation along the X and Y axes is much larger than the window size, and the rotations around the X and Y axes are very small and always near zero (less than 0.0001).
Any idea how to get a correct, meaningful T and R?
P.S.: I'm using an identity matrix as the camera matrix and a zero matrix for the distortion coefficients.
I have a set of 2D points (X,Y) corresponding to a set of 3D points (X,Y,Z).
The 2D points were captured from a camera and the 3D points are the real coordinates in the world frame. I want to find the transformation matrix between them, that is to say, how to convert other 2D points to 3D points.
I have tried the getPerspectiveTransform function, but it didn't work for this problem.
How can I write a regression to find this transformation matrix?
You can use OpenCV's solvePnP; it gives you the rotation and translation vectors. This answer goes into more detail:
Camera position in world coordinate from cv::solvePnP
I have calibrated my camera with OpenCV (findChessboard etc) so I have:
- Camera Distortion Coefficients & Intrinsics matrix
- Camera Pose information (Translation & Rotation, computed separately via other means) as Euler Angles & a 4x4
- 2D points within the camera frame
How can I convert these 2D points into 3D unit vectors pointing out into the world? I tried using cv::undistortPoints but that doesn't seem to do it (only returns 2D remapped points), and I'm not exactly sure what method of matrix math to use to model the camera via the Camera intrinsics I have.
Convert your 2D point into a homogeneous point (give it a third coordinate equal to 1) and then multiply by the inverse of your camera intrinsics matrix. For example:
cv::Matx31f hom_pt(point_in_image.x, point_in_image.y, 1);
hom_pt = camera_intrinsics_mat.inv()*hom_pt; //back-project the pixel to a direction in the camera's coordinate frame
cv::Point3f origin(0,0,0);
cv::Point3f direction(hom_pt(0),hom_pt(1),hom_pt(2));
//To get a unit vector, direction just needs to be normalized
direction *= 1/cv::norm(direction);
origin and direction now define the ray corresponding to that image point, expressed in the camera's coordinate frame. Note that here the origin is centered on the camera; you can use your camera pose to transform to a different origin. Distortion coefficients map from your actual camera to the pinhole camera model and should be used at the very beginning to find your actual 2D coordinate. The steps then are:
Undistort 2d coordinate with distortion coefficients
Convert to ray (as shown above)
Move that ray to whatever coordinate system you like.
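The last step can be sketched in plain C++ as follows. This assumes the 4x4 pose is a camera-to-world transform [R|c; 0 0 0 1], so its last column is the camera position in the world. If your pose is world-to-camera instead, it has to be inverted first. `Ray` and `rayToWorld` are illustrative names:

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;

struct Ray {
    Vec3 origin;
    Vec3 direction;  // unit length
};

// Move a camera-frame ray direction into the world frame.
// Assumption: camToWorld = [R | c; 0 0 0 1] maps camera coordinates to world
// coordinates, so the camera center (the ray origin) is the last column.
Ray rayToWorld(const Mat4& camToWorld, const Vec3& dirCam) {
    Ray ray{};
    double normSq = 0.0;
    for (int r = 0; r < 3; ++r) {
        ray.origin[r] = camToWorld[r][3];
        for (int c = 0; c < 3; ++c) {
            // Directions are rotated only; the translation does not apply.
            ray.direction[r] += camToWorld[r][c] * dirCam[c];
        }
        normSq += ray.direction[r] * ray.direction[r];
    }
    assert(normSq > 0.0);  // a zero direction cannot be normalized
    const double invNorm = 1.0 / std::sqrt(normSq);
    for (double& d : ray.direction) {
        d *= invNorm;
    }
    return ray;
}
```

Only the rotation block is applied to the direction, because a direction is a difference of two points and the translation cancels out.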