OpenCV triangulatePoints handedness - c++

I have two cameras attached rigidly side by side looking in parallel directions.
Projection matrix for left camera
Projection matrix for right camera
When I perform triangulatePoints on the two vectors of corresponding points, I get the collection of points in 3D space. All of the points in 3D space have a negative Z coordinate.
So, to get to the bottom of this...
My assumption was that OpenCV uses Right Hand coordinate system.
The handedness reminder:
I assume that the initial orientation of each camera is directed in the positive Z axis direction.
So, by using the projection matrices like the ones I presented in the beginning, I would assume that the cameras are positioned in space like this:
This assumption is in conflict with what I observe when I get negative Z coordinates for the triangulated points. The only explanation I can think of is that OpenCV in fact uses a left-handed coordinate system. So, with the projection matrices I stated in the beginning, this is how the cameras are positioned in space:
This would indicate that my left camera in this situation is not on the left side. And this is why I am getting negative depth for points.
Moreover, if I try to combine triangulatePoints with solvePnP I run into problems.
I use the output of triangulatePoints as an input to solvePnP. I expect to get the camera coordinates near the origin of the 3D coordinate system. I expect the calculated camera position to match the projection matrices used in the beginning. But this is not happening. I get some completely wild results, missing the expected values by over 10 times the baseline length.
Example
This example is a more complete representation of the problem than what is stated above.
points3D
Here is the code for generating these points.
Moving on, setting up camera A and camera D...
Mat cameraMatrix = (Mat_<double>(3, 3) <<
    716.731, 0,       660.749,
    0,       716.731, 360.754,
    0,       0,       1);
Mat distCoeffs = (Mat_<double>(5, 1) << 0, 0, 0, 0, 0);
Mat rotation_a = Mat::eye(3, 3, CV_64F); // no rotation
Mat translation_a = (Mat_<double>(3, 1) << 0, 0, 0); // no translation
Mat rt_a;
hconcat(rotation_a, translation_a, rt_a);
Mat projectionMatrix_a = cameraMatrix * rt_a;
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(rotation_d, rotation_d); // convert to 3x3 matrix
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, translation_d, rt_d);
Mat projectionMatrix_d = cameraMatrix * rt_d;
What are the pixel coordinates of points3D when observed by projections A and D?
Mat points2D_a = projectionMatrix_a * points3D;
Mat points2D_d = projectionMatrix_d * points3D;
I put them in vectors:
vector<Point2f> points2Dvector_a, points2Dvector_d;
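Filling the vectors amounts to dehomogenizing each column; a sketch of the loop, assuming points2D_a and points2D_d are the 3xN CV_64F results of the multiplications above:
for (int c = 0; c < points2D_a.cols; ++c) {
    // dehomogenize: divide x and y by the third (scale) coordinate
    double wa = points2D_a.at<double>(2, c);
    double wd = points2D_d.at<double>(2, c);
    points2Dvector_a.push_back(Point2f((float)(points2D_a.at<double>(0, c) / wa),
                                       (float)(points2D_a.at<double>(1, c) / wa)));
    points2Dvector_d.push_back(Point2f((float)(points2D_d.at<double>(0, c) / wd),
                                       (float)(points2D_d.at<double>(1, c) / wd)));
}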
After that, I generate the 3D points again.
Mat points3DHomogeneous;
triangulatePoints(projectionMatrix_a, projectionMatrix_d, points2Dvector_a, points2Dvector_d, points3DHomogeneous);
Mat triangulatedPoints3D;
transpose(points3DHomogeneous, triangulatedPoints3D);
convertPointsFromHomogeneous(triangulatedPoints3D, triangulatedPoints3D);
Now, triangulatedPoints3D start out like this:
and they are identical to points3D.
And then the last step.
Mat rvec, tvec;
solvePnP(triangulatedPoints3D, points2Dvector_d, cameraMatrix, distCoeffs, rvec, tvec);
Resulting rvec and tvec:
I had hoped to get something more similar to the transformations used in creation of projectionMatrix_d, i.e. translation of (100, 0, 0) and rotation of 30° about Y axis.
If I use inverted transformations when creating projection matrix, like this:
Mat rotation_d = (Mat_<double>(3, 1) << 0, CV_PI / 6.0, 0); // 30° rotation about Y axis
Rodrigues(-rotation_d, rotation_d); // NEGATIVE ROTATION
Mat translation_d = (Mat_<double>(3, 1) << 100, 0, 0);
Mat rt_d;
hconcat(rotation_d, -translation_d, rt_d); // NEGATIVE TRANSLATION
Mat projectionMatrix_d = cameraMatrix * rt_d;
then I get rvec and tvec:
And that makes much more sense. But then I change the starting transformation so that the rotation is negative (CV_PI / 6.0 -> -CV_PI / 6.0), and the resulting rvec and tvec are:
I would like to find an explanation for why this is happening. Why am I getting such weird results from solvePnP?

The OpenCV coordinate system is right-handed; the answer here gives an illustrative example of the OpenCV camera system. I guess the confusion is about rvec and tvec: tvec is not the translation of the camera, it is the position of the world origin expressed in camera coordinates. The first answer here explains it based on an example. You can get the actual projection matrix from the output of solvePnP by a simple matrix multiplication; the details are here in the first answer.
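For example, a minimal sketch of getting the projection matrix and the camera position from the question's rvec, tvec and cameraMatrix:
// rvec/tvec from solvePnP map world (object) points into the camera frame.
Mat R;
Rodrigues(rvec, R);                  // 3x1 rotation vector -> 3x3 rotation matrix
Mat rt;
hconcat(R, tvec, rt);                // [R | t], the world-to-camera transform
Mat P = cameraMatrix * rt;           // projection matrix, comparable to projectionMatrix_d
Mat cameraPosition = -R.t() * tvec;  // camera centre expressed in world coordinates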

Related

Translate and rotate a 3d point to origin with pcl transformcloud

Given a rotation matrix and translation vector obtained from camera pose estimation of a corner of a charucoBoard, I want to transform one particular corner to the origin by using a homogeneous transformation. If I understand correctly, the position of the camera is (0, 0, 0) in the camera's world, and the rotation matrix and translation vector are also interpreted in the camera's coordinate system.
The translation vector tells you where the point is in the camera's coordinate system, whereas the rotation matrix tells you its orientation with respect to the camera.
I then construct my homogeneous transformation matrix as :
float transformation_matrix[] = {
    rmatrix.at<float>(0,0), rmatrix.at<float>(0,1), rmatrix.at<float>(0,2), tvec[0],
    rmatrix.at<float>(1,0), rmatrix.at<float>(1,1), rmatrix.at<float>(1,2), tvec[1],
    rmatrix.at<float>(2,0), rmatrix.at<float>(2,1), rmatrix.at<float>(2,2), tvec[2],
    0.f,                    0.f,                    0.f,                    1.f};
where the rmatrix is my rotation matrix and tvec is the translation vector
and the output of transformation_matrix is
transformation matrix =
[-0.99144238, 0.12881841, 0.021162758, 0.18116128;
0.045751289, 0.49469706, -0.86786038, -0.037676446;
-0.12226554, -0.8594653, -0.49635723, 0.67637974;
0, 0, 0, 1]
and the rotation matrix is:
rodrigues matrix =
[-0.99144238, 0.12881841, 0.021162758;
0.045751289, 0.49469706, -0.86786038;
-0.12226554, -0.8594653, -0.49635723]
translation vector:
[0.181161, -0.0376764, 0.67638]
I then transform the cloud with pcl::transformPointCloud:
pcl::transformPointCloud (*cloud, *transformed_cloud, translation_matrix);
However, I cannot align the point to the origin after the transformation.
Does anyone have an insight into where I might have gone wrong? Here is the relevant code:
aruco::drawAxis(imageCopy, Camera_matrix, distortion, rvec, tvec, 0.1);
cv::Rodrigues(rvec, rmatrix);
pcl::PointCloud<pcl::PointXYZ>::Ptr transformed_cloud (new pcl::PointCloud<pcl::PointXYZ> ());
float transformation_matrix[] = {
    rmatrix.at<float>(0,0), rmatrix.at<float>(0,1), rmatrix.at<float>(0,2), tvec[0],
    rmatrix.at<float>(1,0), rmatrix.at<float>(1,1), rmatrix.at<float>(1,2), tvec[1],
    rmatrix.at<float>(2,0), rmatrix.at<float>(2,1), rmatrix.at<float>(2,2), tvec[2],
    0.f,                    0.f,                    0.f,                    1.f};
cv::Mat translation_m = cv::Mat(4,4,CV_32F, transformation_matrix);
Eigen::Matrix<float,Eigen::Dynamic,Eigen::Dynamic> translation_matrix;
cv::cv2eigen(translation_m, translation_matrix);
pcl::transformPointCloud (*cloud, *transformed_cloud, translation_matrix);
UPDATE:
I tested the transformation before and after and found that my transformation with the homogeneous matrix was incorrect:
Here are the depth data taken from the side of the board
As you can see, the x, y, and z axes are displayed in red, green, and blue respectively; the white point cloud is before the transformation and the red point cloud is after. The image is flipped in the direction I want, but the origin (the intersection of the red, green, and blue axes) is sometimes not attached to one of the corners of the board, and the x axis is not perfectly aligned with the border of the board.
One thing I can confirm for sure is that when the camera is parallel to the board, the x axis is parallel to the transformed image as well.
taken from the upright angle at the top of the board
I am speculating that I am missing one or two axes of rotation.
I made sure my rotation matrix and translation vector of the corner are correct by labeling another corner with the following code using the R*P + T formula:
aruco::drawAxis(imageCopy, Camera_matrix, distortion, rvec, tvec, 0.1);
cv::Rodrigues(rvec, rmatrix);
Vec3f corner1_0(0.4, 0, 0);
Mat matcorner1_0;
matcorner1_0 = FloatMatFromVec3f(corner1_0);
auto mat_tvec = FloatMatFromVec3f(tvec);
Mat3f rcorner1_0 = rmatrix * matcorner1_0 + mat_tvec;
float cornerdata[3] = {rcorner1_0.at<float>(0,0),rcorner1_0.at<float>(0,1),rcorner1_0.at<float>(0,2)};
cv::Mat rcorner_10 = cv::Mat(3,1, CV_32F, cornerdata);
auto r1_0 = to_vec3f(rcorner_10);
aruco::drawAxis(imageCopy, Camera_matrix, distortion, rvec, r1_0, 0.1);
The length of the board is 0.4 meters, so I set corner1_0 to (0.4, 0, 0),
and this is the image of the two marker axes I drew for indication purposes:
As you can see the rotation matrix and translation vector are correct.
Any help/hint/speculation is greatly appreciated:)
I solved the problem.
The problem is that the homogeneous transformation matrix built from rvec and tvec transforms points from the board's (real-world) coordinate system into the camera's coordinate system. However, what I need here is the transformation from the camera's coordinate system back into the board's coordinate system.
You can achieve this by taking the inverse of the transformation matrix.
so adding translation_m = translation_m.inv(); between
cv::Mat translation_m = cv::Mat(4,4,CV_32F, transformation_matrix);
and
Eigen::Matrix<float,Eigen::Dynamic,Eigen::Dynamic> translation_matrix;
cv::cv2eigen(translation_m, translation_matrix);
will give you the correct answer.
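Since the matrix is a rigid transform, the inverse can also be written out directly as [R'|-R'*t] instead of calling inv(); a sketch using the rmatrix and tvec from the code above (assuming, as there, that rmatrix is CV_32F and tvec is a Vec3f):
// Analytic inverse of the rigid transform [R|t]: inv([R|t]) = [R^T | -R^T * t]
cv::Mat R_inv = rmatrix.t();              // transpose equals inverse for a rotation matrix
cv::Mat t_inv = -R_inv * cv::Mat(tvec);   // 3x1 translation of the inverse transform
float inverse_matrix[] = {
    R_inv.at<float>(0,0), R_inv.at<float>(0,1), R_inv.at<float>(0,2), t_inv.at<float>(0),
    R_inv.at<float>(1,0), R_inv.at<float>(1,1), R_inv.at<float>(1,2), t_inv.at<float>(1),
    R_inv.at<float>(2,0), R_inv.at<float>(2,1), R_inv.at<float>(2,2), t_inv.at<float>(2),
    0.f, 0.f, 0.f, 1.f};
cv::Mat translation_m = cv::Mat(4, 4, CV_32F, inverse_matrix).clone();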

How to get the difference of Aruco markers' axis-angle in the world coordinate?

I am new to OpenCV and Aruco. I attempted to find the difference in axis-angles of two different Aruco markers, for example the angle between the two (1, 0, 0) vectors of the markers in the real world. In my understanding, the transform happens in the following order: local coordinates -> camera coordinates -> world coordinates, and then I can get the angle difference of the two Aruco markers because they are now both in the same world coordinates. Can someone explain how this process is done? Or is there a better way to find the angle difference? I am coding with Python and OpenCV.
Known:
Translation and rotation vectors (1x3) from the estimatePoseSingleMarkers() function in the aruco module (the rotation vector can be converted to a 3x3 matrix with Rodrigues()).
Camera matrix (3x3) and dist_coefs matrix (1x5) from camera calibration.
Using 6x6_250 aruco markers
Update:
world coordinates = camera coordinates
The following is the function that draws the Aruco marker's XYZ axes. It looks like the origin of the camera is (0, 0, 0). Applying the translation and rotation of the Aruco marker to the x, y, z axis points of the origin gives me the x, y, z axis points of the Aruco marker. Is that right?
void drawAxis(InputOutputArray _image, InputArray _cameraMatrix, InputArray _distCoeffs,
              InputArray _rvec, InputArray _tvec, float length) {
    CV_Assert(_image.getMat().total() != 0 &&
              (_image.getMat().channels() == 1 || _image.getMat().channels() == 3));
    CV_Assert(length > 0);
    // project axis points
    vector< Point3f > axisPoints;
    axisPoints.push_back(Point3f(0, 0, 0));
    axisPoints.push_back(Point3f(length, 0, 0));
    axisPoints.push_back(Point3f(0, length, 0));
    axisPoints.push_back(Point3f(0, 0, length));
    vector< Point2f > imagePoints;
    projectPoints(axisPoints, _rvec, _tvec, _cameraMatrix, _distCoeffs, imagePoints);
    // draw axis lines
    line(_image, imagePoints[0], imagePoints[1], Scalar(0, 0, 255), 3);
    line(_image, imagePoints[0], imagePoints[2], Scalar(0, 255, 0), 3);
    line(_image, imagePoints[0], imagePoints[3], Scalar(255, 0, 0), 3);
}
Technically it's a workaround for my question. I figured out that OpenCV has a function called projectPoints which maps 3D objectPoints to 2D imagePoints.
imagePoints[0] and imagePoints[1] give you the 2D projections of (0, 0, 0) and (length, 0, 0), so their difference is the marker's projected x axis. By using arccos(np.dot(v1, v2)) on the two normalized axis vectors, you can get the difference in x-axis angle of the two markers.
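If you want the angle in 3D (camera coordinates, which are the world coordinates here) rather than in the image plane, you can also compare the rotated x axes directly. A sketch in C++, where rvec1 and rvec2 stand for the two markers' rotation vectors from estimatePoseSingleMarkers():
// Rotate the local x axis (1,0,0) of each marker into camera coordinates
// and measure the angle between the two resulting unit vectors.
cv::Mat R1, R2;
cv::Rodrigues(rvec1, R1);
cv::Rodrigues(rvec2, R2);
cv::Mat x1 = R1.col(0);   // first column of a rotation matrix = rotated (1,0,0), unit length
cv::Mat x2 = R2.col(0);
double c = x1.dot(x2);
if (c > 1.0)  c = 1.0;    // guard against numerical noise
if (c < -1.0) c = -1.0;
double angle = acos(c);   // angle in radians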

How to find correspondence of 3d points and 2d points

I have a set of 3d points in world coordinates and respective correspondences with 2d points in an image. I want to find a matrix that gives me the transformation between these set of points. How can I do this in OpenCV?
cv::solvePnP() is what you are looking for: it finds an object pose from 3D-2D point correspondences and returns a rotation vector (rvec) that, together with a translation vector (tvec), brings points from the model coordinate system to the camera coordinate system.
you can use solvePnP for this:
// camMatrix based on img size
int max_d = std::max(img.rows,img.cols);
Mat camMatrix = (Mat_<double>(3,3) <<
    max_d, 0,     img.cols/2.0,
    0,     max_d, img.rows/2.0,
    0,     0,     1.0);
// 2d -> 3d correspondence
vector<Point2d> pts2d = ...
vector<Point3d> pts3d = ...
Mat rvec,tvec;
solvePnP(pts3d, pts2d, camMatrix, Mat(1,4,CV_64F,0.0), rvec, tvec, false, SOLVEPNP_EPNP);
// get 3d rot mat
Mat rotM(3, 3, CV_64F);
Rodrigues(rvec, rotM);
// push tvec to transposed Mat
Mat rotMT = rotM.t();
rotMT.push_back(tvec.reshape(1, 1));
// transpose back, and multiply
return camMatrix * rotMT.t();
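As a quick sanity check, projecting one of the input points with the returned 3x4 matrix should land close to its matching 2D point; a sketch, where pt stands for one entry of pts3d:
// Project a 3D point with the 3x4 matrix and dehomogenize.
Mat P = camMatrix * rotMT.t();                       // same product as the return value above
Mat X = (Mat_<double>(4, 1) << pt.x, pt.y, pt.z, 1.0);
Mat x = P * X;                                       // 3x1 homogeneous pixel coordinates
Point2d pixel(x.at<double>(0) / x.at<double>(2),
              x.at<double>(1) / x.at<double>(2));    // compare with the matching pts2d entry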

3D Geometry projecting a point in space to an image given FOV and angles

I'm attempting to project a 3D point in Cartesian space onto an image, so it is shown as a circle at the location on the image it would be seen if a certain camera was looking at it.
Known:
The camera has horizontal field of view, vertical FOV, azimuth and elevation.
It is located at (0,0,0) although being able to change this would be good.
The image has a certain size.
The point in space has a known X, Y, Z location.
I'm using the very good C++ library MathGeoLib.
In my attempt, changing the azimuth or elevation does not work, but the projection seems to work OK if rotation is ignored.
How do you convert a real-world point into a pixel location in an image, provided that you know where the camera is, which direction it is looking, and its field of view?
Frustum MyClass::GetCameraFrustrum()
{
    Frustum frustrum;
    frustrum.SetPerspective(DegToRad(_horizontal_field_of_view), DegToRad(_vertical_filed_of_view));
    // Looking straight down the z axis
    frustrum.SetFront(vec(0, 0, 1));
    frustrum.SetUp(vec(0, 1, 0));
    frustrum.SetPos(vec(0, 0, 0));
    frustrum.SetKind(FrustumSpaceGL, FrustumRightHanded);
    frustrum.SetViewPlaneDistances(1, 2000);
    float3x3 rotate_around_x = float3x3::RotateX(DegToRad(_angle_azimuth));
    frustrum.Transform(rotate_around_x);
    float3x3 rotate_around_y = float3x3::RotateY(DegToRad(_angle_elevation));
    frustrum.Transform(rotate_around_y);
    return frustrum;
}
void MyClass::Test()
{
    cv::Mat image = cv::Mat::zeros(2000, 2000, CV_8UC1);
    Frustum camera_frustrum = GetCameraFrustrum();
    vec projection = camera_frustrum.Project(_position_of_object);
    cv::Point2d location_on_image(projection.x * image.cols, projection.y * image.rows);
    LOG_DEBUG << location_on_image;
    cv::circle(image, location_on_image, 5, cv::Scalar(255, 0, 0), 3, cv::LINE_AA);
}

Camera position in world coordinate from cv::solvePnP

I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to know the camera position knowing some 3d points and their corresponding points in the image (2d points).
I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.
So I need to find out the camera rotation/translation in the world coordinate system.
From the links above it seems that the code is straightforward, in python:
found,rvec,tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
rotM = cv2.Rodrigues(rvec)[0]
cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)
I don't know Python/numpy stuff (I'm using C++), but this does not make a lot of sense to me:
rvec and tvec output from solvePnP are 3x1 matrices, i.e. 3-element vectors
cv2.Rodrigues(rvec) is a 3x3 matrix
cv2.Rodrigues(rvec)[0] is a 3x1 matrix, 3 element vectors
cameraPosition is a 3x1 * 1x3 matrix multiplication that is a.. 3x3 matrix. how can I use this in opengl with simple glTranslatef and glRotate calls?
If with "world coordinates" you mean "object coordinates", you have to get the inverse transformation of the result given by the pnp algorithm.
There is a trick to invert transformation matrices that allows you to save the inversion operation, which is usually expensive, and that explains the code in Python. Given a transformation [R|t], we have that inv([R|t]) = [R'|-R'*t], where R' is the transpose of R. So, you can code (not tested):
cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
T( cv::Range(0,3), cv::Range(0,3) ) = R * 1; // copies R into T
T( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1; // copies tvec into T
// T is a 4x4 matrix with the pose of the camera in the object frame
Update: Later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.
OpenCV uses the reference frame usually used in computer vision: X points to the right, Y down, Z to the front (as in this image). The frame of the camera in OpenGL is: X points to the right, Y up, Z to the back (as in the left-hand side of this image). So, you need to apply a rotation of 180 degrees around the X axis. The formula for this rotation matrix is on Wikipedia.
// T is your 4x4 matrix in the OpenCV frame
cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.
Finally, take into account that matrices in OpenCV are stored in row-major order in memory, and OpenGL ones, in column-major order.
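For example, to hand Tgl to the legacy fixed-function pipeline you could flatten it into a column-major float array; a sketch (Tgl is CV_64F as built above):
// OpenCV stores matrices row-major, OpenGL expects column-major,
// so write element (r,c) of Tgl into slot c*4 + r of the flat array.
float gl_matrix[16];
for (int c = 0; c < 4; ++c)
    for (int r = 0; r < 4; ++r)
        gl_matrix[c * 4 + r] = (float)Tgl.at<double>(r, c);
// glLoadMatrixf(gl_matrix);   // or upload gl_matrix as a uniform in a shader pipeline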
If you want to turn it into a standard 4x4 pose matrix specifying the position of your camera, use rotM as the top-left 3x3 block, tvec as the three elements of the rightmost column, and 0, 0, 0, 1 as the bottom row:
pose = [ rotM(0,0)  rotM(0,1)  rotM(0,2)  tvec(0)
         rotM(1,0)  rotM(1,1)  rotM(1,2)  tvec(1)
         rotM(2,0)  rotM(2,1)  rotM(2,2)  tvec(2)
         0          0          0          1       ]
Then invert it (to get the pose of the camera instead of the pose of the world).