use SolvePnP Rotation and translation vector in OpenGL - c++

I'm working on an AR program. I have done these steps:
Detect corners of a chess board with OpenCV
Use solvePnP to find rvec and tvec
Apply Rodrigues on rvec to get R_mat
Use hconcat(R_mat, tvec, P); to concatenate R_mat and tvec into the projection matrix P
Apply decomposeProjectionMatrix on P to get a new translation vector, T2, and eulerAngles
Now my problem is that, of the translation vector, only tvec[2] (the translation along the Z axis) is correct,
and of the rotations, again only eulerAngles[2] (the rotation around the Z axis, from decomposeProjectionMatrix) is correct.
I don't know how I can get the translation and rotation along the X and Y axes correctly.
I'm going to use these in OpenGL to augment a cube on the pattern.
In my code, the translation along the X and Y axes is much larger than the window size, and the rotation around the X and Y axes is very small and always near zero (less than 0.0001).
Any idea how to get correct, meaningful T and R?
P.S.: I'm using an identity matrix as the camera matrix and a zero matrix for the distortion coefficients.
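For reference, a minimal sketch of the pipeline described above (it is only an illustration; objectPoints and imagePoints are assumed to come from the chessboard detection, and the camera matrix and distortion coefficients are the identity/zero ones mentioned in the P.S.):

    #include <opencv2/calib3d.hpp>
    #include <opencv2/core.hpp>
    #include <vector>

    // objectPoints: 3D chessboard corners, imagePoints: their detected pixel positions.
    void poseFromChessboard(const std::vector<cv::Point3f>& objectPoints,
                            const std::vector<cv::Point2f>& imagePoints)
    {
        // Identity camera matrix and zero distortion, as described in the question.
        cv::Mat K = cv::Mat::eye(3, 3, CV_64F);
        cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

        cv::Mat rvec, tvec;
        cv::solvePnP(objectPoints, imagePoints, K, dist, rvec, tvec);

        cv::Mat R_mat;
        cv::Rodrigues(rvec, R_mat);      // 3x1 rotation vector -> 3x3 rotation matrix

        cv::Mat P;
        cv::hconcat(R_mat, tvec, P);     // 3x4 [R|t]

        cv::Mat camMat, rotMat, T2, eulerAngles;   // T2 comes back as a 4x1 homogeneous vector
        cv::decomposeProjectionMatrix(P, camMat, rotMat, T2,
                                      cv::noArray(), cv::noArray(), cv::noArray(),
                                      eulerAngles);
    }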

Related

Reference frame of homography matrix (and how to obtain the motion wrt the first image)?

The OpenCV function findHomography "finds a perspective transformation between two planes". According to this post, if H is the transformation matrix, we have (where X1 is the prior image):
X1 = H · X2
I am struggling to fully comprehend the reference axes of the parameters in the transformation (translation distances, scaling factors and rotation angle, in a simplified affine transformation case without shear).
What I am trying to obtain is the camera motion from one image to another, relative to the first image. That is, motion between images expressed as the translation in the first image's axes. I assume these translations are the (tx,ty) from the homography matrix, but corrected using the rotation angle and scaling parameters. Could someone enlighten me on how to obtain these translation values?
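If it helps, one way (under the assumption that the camera intrinsics K are known) to recover a relative camera motion from H in OpenCV, instead of reading tx, ty straight off the matrix, is cv::decomposeHomographyMat; a rough sketch with pts1/pts2 as matched points in the first and second image:

    #include <opencv2/calib3d.hpp>
    #include <opencv2/core.hpp>
    #include <vector>

    // pts1/pts2: matched points in the first/second image, K: camera intrinsics.
    void motionFromHomography(const std::vector<cv::Point2f>& pts1,
                              const std::vector<cv::Point2f>& pts2,
                              const cv::Mat& K)
    {
        cv::Mat H = cv::findHomography(pts2, pts1, cv::RANSAC);   // X1 = H * X2

        // Each candidate (R, t, n) expresses the motion of frame 2 relative to frame 1;
        // the physically valid candidate still has to be selected (e.g. points must
        // end up in front of the camera), and t is only known up to scale.
        std::vector<cv::Mat> rotations, translations, normals;
        cv::decomposeHomographyMat(H, K, rotations, translations, normals);
    }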

solvePnP: Obtaining the rotation translation matrix

I am trying to convert image coordinates to 3D coordinates. Using the solvePnP function (in C++) has given me a 3x1 rotation vector and a 3x1 translation vector. But isn't the [R|t] matrix supposed to be 3x4?
Any help will be greatly appreciated!
From the OpenCV documentation for solvePnP:
"rvec โ€“ Output rotation vector (see Rodrigues() ) that, together with tvec , brings points from the model coordinate system to the camera coordinate system."
Following the link to Rodrigues():
src – Input rotation vector (3x1 or 1x3) or rotation matrix (3x3).
dst – Output rotation matrix (3x3) or rotation vector (3x1 or 1x3), respectively.
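In other words, the 3x1 output is a Rodrigues rotation vector, not the rotation matrix itself; a short sketch of expanding it and assembling the 3x4 [R|t] (rvec and tvec assumed to be the solvePnP outputs):

    #include <opencv2/calib3d.hpp>
    #include <opencv2/core.hpp>

    // rvec, tvec: the 3x1 outputs of cv::solvePnP.
    cv::Mat buildRt(const cv::Mat& rvec, const cv::Mat& tvec)
    {
        cv::Mat R;
        cv::Rodrigues(rvec, R);     // 3x1 rotation vector -> 3x3 rotation matrix

        cv::Mat Rt;
        cv::hconcat(R, tvec, Rt);   // 3x4 [R|t]
        return Rt;
    }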

Matrix multiplication to mirror translation and rotation of only 1 axis?

I'm using OpenGL with another library. This library provides the ProjectionMatrix and I cannot modify it. I have to provide only the ModelViewMatrix.
However, something strange happens: only the y-axis translation and rotation are inverted.
For example, if I increase the x translation in the ModelViewMatrix, the object goes to the right just fine (positive x direction), but along the y axis it is reversed; I want it to go the other way.
Rotation around the y axis is also reversed; it rotates opposite to the way it should.
I cannot fix it in the ProjectionMatrix, so I think I have to multiply my ModelViewMatrix by something that reverses one axis before sending it to the library. Do you know that something? A matrix that can reverse one axis?
It's just the matrix

    [ 1, 0, 0, 0]
    [ 0,-1, 0, 0]
    [ 0, 0, 1, 0]
    [ 0, 0, 0, 1]

in other words, an identity matrix with the second diagonal element negated.
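For illustration only (GLM is assumed here purely as an example math library), the flip can be pre-multiplied onto the ModelViewMatrix before handing it to the library:

    #include <glm/glm.hpp>

    // Mirror the Y axis of an existing ModelView matrix.
    glm::mat4 mirrorY(const glm::mat4& modelView)
    {
        glm::mat4 flipY(1.0f);      // identity
        flipY[1][1] = -1.0f;        // negate the second diagonal element
        return flipY * modelView;   // applied after the ModelView transform, i.e. in eye space
    }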

Convert a bounding box in ECEF coordinates to ENU coordinates

I have a geometry with its vertices in cartesian coordinates. These cartesian coordinates are ECEF (Earth-centred, Earth-fixed) coordinates. The geometry actually lies on an ellipsoidal model of the Earth using WGS84 coordinates. The cartesian coordinates were obtained by converting the set of latitudes and longitudes along which the geometries lie, but I no longer have access to them. What I have is an axis-aligned bounding box with xmax, ymax, zmax and xmin, ymin, zmin obtained by parsing the cartesian coordinates (there is obviously no cartesian point of the geometry at xmax, ymax, zmax or xmin, ymin, zmin; the bounding box is just a cuboid enclosing the geometry).
What I want to do is calculate the camera distance in an overview mode such that this geometry's bounding box perfectly fits the camera frustum.
I am not very clear on the approach to take here. A method like using a local-to-world matrix comes to mind, but it's not very clear.
@Specktre I referred to your suggestions on shifting points in 3D, and that led me to another, improved solution, though it is still not perfect.
Compute a matrix that can transform from ECEF to ENU. Refer to this: http://www.navipedia.net/index.php/Transformations_between_ECEF_and_ENU_coordinates
Rotate all eight corners of my original bounding box using this matrix.
Compute a new bounding box by finding the min and max of x, y, z of these rotated points.
Compute the distance:
cameraDistance1 = ((newbb.ymax - newbb.ymin)/2) / tan(fov/2)
cameraDistance2 = ((newbb.xmax - newbb.xmin)/2) / (tan(fov/2) * aspectRatio)
cameraDistance = max(cameraDistance1, cameraDistance2)
This time I had to use the aspect ratio along x, as I had previously expected, since in my application the fov is along y. Although this works almost accurately, there is still a small bug, I guess. I am not very sure if it is a good idea to generate a new bounding box. Maybe it is more accurate to identify two points, point1(xmax, ymin, zmax) and point2(xmax, ymax, zmax), in the original bounding box, find their values after multiplying with the matrix, and then do (point2 - point1).length(). Similarly for y. Would that be more accurate?
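A rough sketch of steps 2-4 above, just to make the computation concrete (the eight ENU-rotated corners are assumed to be computed already, and fov is the vertical field of view in radians):

    #include <algorithm>
    #include <array>
    #include <cmath>

    struct Vec3 { double x, y, z; };

    // corners: the eight corners of the original bounding box, already rotated into ENU.
    double cameraDistance(const std::array<Vec3, 8>& corners,
                          double fov, double aspectRatio)
    {
        Vec3 mn = corners[0], mx = corners[0];
        for (const Vec3& c : corners) {        // new axis-aligned box over the rotated corners
            mn.x = std::min(mn.x, c.x); mx.x = std::max(mx.x, c.x);
            mn.y = std::min(mn.y, c.y); mx.y = std::max(mx.y, c.y);
            mn.z = std::min(mn.z, c.z); mx.z = std::max(mx.z, c.z);
        }
        double d1 = ((mx.y - mn.y) / 2.0) / std::tan(fov / 2.0);
        double d2 = ((mx.x - mn.x) / 2.0) / (std::tan(fov / 2.0) * aspectRatio);
        return std::max(d1, d2);
    }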
transform matrix
The first thing to understand is that a transform matrix represents a coordinate system. Look here: Transform matrix anatomy, for another example.
In standard OpenGL notation, if you use the direct matrix then you are converting from the matrix's local coordinate space (LCS) to the global world space (GCS). If you use the inverse matrix then you are converting coordinates from GCS to LCS.
camera matrix
The camera matrix converts into camera space, so you need the inverse matrix. You get the camera matrix like this:
camera=inverse(camera_space_matrix)
Now, for info on how to construct your camera_space_matrix so it fits the bounding box, look here:
Frustum distance computation
So compute the midpoint of the top rectangle of your box, and compute the camera distance as the max of the distances computed from all vertices of the box, so
camera position = midpoint + distance*midpoint_normal
Orientation depends on your projection matrix. If you use gluPerspective then you are viewing along -Z or +Z according to the selected glDepthFunc. So set the Z axis of the matrix to the normal, and the Y and X vectors can be aligned to North/South and East/West, so for example
Y=Z x (1,0,0)
X = Z x Y
Now put the position and the axis vectors X, Y, Z into the matrix, compute the inverse matrix, and that is it.
[Notes]
Do not forget that the FOV can have different angles for the X and Y axes (aspect ratio).
The normal is just midpoint - Earth center, which is (0,0,0), so the normal is also the midpoint. Just normalize it to length 1.0.
For all computations use cartesian world GCS (global coordinate system).
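A small sketch of that construction (GLM is used here only for illustration; midpoint and distance are assumed to come from the steps above):

    #include <glm/glm.hpp>

    // midpoint: midpoint of the box's top rectangle in GCS, distance: from the frustum fit.
    glm::mat4 buildViewMatrix(const glm::vec3& midpoint, float distance)
    {
        glm::vec3 Z = glm::normalize(midpoint);                      // normal = midpoint - (0,0,0)
        glm::vec3 Y = glm::normalize(glm::cross(Z, glm::vec3(1, 0, 0)));
        glm::vec3 X = glm::normalize(glm::cross(Z, Y));
        glm::vec3 pos = midpoint + distance * Z;                     // camera position

        glm::mat4 cameraSpace(1.0f);                                 // camera_space_matrix
        cameraSpace[0] = glm::vec4(X, 0.0f);
        cameraSpace[1] = glm::vec4(Y, 0.0f);
        cameraSpace[2] = glm::vec4(Z, 0.0f);
        cameraSpace[3] = glm::vec4(pos, 1.0f);

        return glm::inverse(cameraSpace);                            // view matrix = inverse
    }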

3d object overlay - augmented reality irrlicht + opencv

I am trying to develop an augmented reality program that overlays a 3D object on top of a marker. The model does not move along (proportionately) with the marker. Here is the list of things that I did:
1) Using OpenCV: a) I used the solvePnP method to find rvecs and tvecs. b) I also used the Rodrigues method to find the rotation matrix and appended the tvecs vector to get the projection matrix. c) Just for testing, I made some points and lines and projected them to draw a cube. This works perfectly fine and I am getting a good output.
2) Using Irrlicht: a) I tried to place a 3D model (at position (0,0,0) and rotation (0,0,0)) with the camera feed running in the background. b) Using the rotation matrix found with Rodrigues in OpenCV, I calculated the pitch, yaw and roll values from this post (http://planning.cs.uiuc.edu/node103.html) and passed them to the rotation field. In the position field I passed the tvecs values: tvecs[0], -tvecs[1], tvecs[2].
The model moves in the correct directions, but it does not move proportionately. Meaning, if I move the marker 100 pixels in the x direction, the model only moves 20 pixels (the values 100 and 20 are not measured; I just took arbitrary values to illustrate the example). Similarly for the y and z axes. I do know I have to introduce another transformation matrix that maps the OpenCV camera coordinates to Irrlicht camera coordinates, and that it's a 4x4 matrix. But I do not know how to find it. Also, OpenCV's projection matrix [R|t] is a 3x4 matrix and it yields a 2D point that is to be projected, whereas the 4x4 mapping between OpenCV and Irrlicht requires a 3D point (made homogeneous) to be fed into a 4x4 matrix. How do I achieve that?
The 4x4 matrix you are writing about seems to be M = [R|t; 0 1], where t is the 3x1 translation vector. To get the transformed coordinates v' of the 4x1 point v ([x y z 1]^T), just do v' = M*v.
Your problem with scaling may also be caused by a difference between the units used for camera calibration in OpenCV and those used by the other library.
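A small sketch of building M from the OpenCV R (3x3, from Rodrigues) and t (3x1) and applying it to a homogeneous point (R and t are assumed to be CV_64F, which is solvePnP's default):

    #include <opencv2/core.hpp>

    // R: 3x3 rotation matrix, t: 3x1 translation vector, both CV_64F.
    cv::Mat buildM(const cv::Mat& R, const cv::Mat& t)
    {
        cv::Mat M = cv::Mat::eye(4, 4, CV_64F);
        R.copyTo(M(cv::Rect(0, 0, 3, 3)));   // top-left 3x3 block = R
        t.copyTo(M(cv::Rect(3, 0, 1, 3)));   // top-right 3x1 block = t
        return M;                            // bottom row stays [0 0 0 1]
    }

    // v' = M * v with v = [x y z 1]^T (4x1 homogeneous point).
    cv::Mat transformPoint(const cv::Mat& M, const cv::Point3d& p)
    {
        cv::Mat v = (cv::Mat_<double>(4, 1) << p.x, p.y, p.z, 1.0);
        return M * v;
    }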