I have a set of 2D points (X,Y) corresponding to a set of 3D points (X,Y,Z).
The 2D points were captured from a camera, and the 3D points are the real coordinates in the world frame. I want to find the transformation matrix between them, that is to say, how to convert other 2D points to 3D points.
I have tried the getPerspectiveTransform function, but it didn't work for this problem.
How can I write a regression to find this transformation matrix?
You can use OpenCV's solvePnP. Given the 3D-2D correspondences and the camera intrinsics, it gives you a rotation vector and a translation vector. This answer goes into more detail:
Camera position in world coordinate from cv::solvePnP
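For reference, the pinhole model that solvePnP fits can be sketched as below. The symbols are assumptions for illustration: (u, v) is the pixel (the question's 2D X,Y), K is the 3x3 camera matrix from calibration, R is the rotation matrix obtained from the returned rotation vector (for example with cv::Rodrigues), t is the returned translation vector, and s is an unknown per-point scale. Lens distortion is ignored here.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = K \, [\, R \mid t \,]
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
C = -R^{\top} t
```

Because s is unknown, a single pixel only fixes a ray, not a 3D point, so going from 2D back to 3D needs an extra constraint such as a known plane or a known depth. C is the camera position in world coordinates, which is the quantity the linked answer is about.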
Related
I am currently facing an issue with my Structure from Motion program based on OpenCV.
I'll try to describe what it does and what it is supposed to do.
This program relies on the classic "structure from motion" method.
The basic idea is to take a pair of images, detect their keypoints and compute the descriptors of those keypoints. Then the keypoints are matched, with a number of tests to ensure the result is good. That part works perfectly.
Once this is done, the following computations are performed: fundamental matrix, essential matrix, SVD of the essential matrix, camera matrix computation and, finally, triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly, for a pair.
Here is my problem: for a pair of images, the 3D point coordinates are calculated in the coordinate system of the first image of the pair, taken as the reference image. When working with more than two images, which is the objective of my program, I have to re-express the computed 3D points in the coordinate system of the very first image in order to get a consistent result.
My question is: how do I transform 3D point coordinates given in one camera's coordinate system into another camera's coordinate system? With the camera matrices?
My idea was to take the 3D point coordinates and multiply them by the inverse of each preceding camera matrix.
To clarify:
Suppose I am working on the third and fourth images (hence the third pair of images, because I am working like 1-2 / 2-3 / 3-4 and so on).
I get my 3D point coordinates in the coordinate system of the third image. How do I properly transform them into the coordinate system of the very first image?
I would have done the following :
Take the matrix of 3D point coordinates, apply the inverse of the camera matrix for images 2 to 3, and then apply the inverse of the camera matrix for images 1 to 2.
Is that even correct?
The trouble is that those camera matrices are non-square, so I can't invert them.
I am surely making a mistake somewhere, and I would be grateful if someone could enlighten me. I am pretty sure this is a relatively easy one, but I am obviously missing something...
Thanks a lot for reading :)
Let us say you have a 3x4 extrinsic parameter matrix called P. To match the notation of the OpenCV documentation, this is [R|t].
This matrix P describes the transformation from world space coordinates to camera space coordinates. To quote the documentation:
[R|t] translates coordinates of a point (X, Y, Z) to a coordinate system, fixed with respect to the camera.
You are wondering why this matrix is non-square. That is because in the usual context of OpenCV, you are not expecting homogeneous coordinates as output. Therefore, to make it square, just add a fourth row containing (0,0,0,1). Let's call this new square matrix Q.
You have one such matrix for each pair of cameras; that is, you have one matrix Qk for each pair of images {k,k+1}, describing the transformation from the coordinate space of camera k to that of camera k+1. Those matrices are invertible because they describe rigid motions (isometries) in homogeneous coordinates.
To go from the coordinate space of camera 3 to that of camera 1, just apply to your points the inverse of Q2 and then the inverse of Q1.
I am trying to develop an augmented reality program that overlays a 3D object on top of a marker. The model does not move along (proportionately) with the marker. Here is the list of things I did:
1) Using OpenCV: a) I used the solvePnP method to find rvecs and tvecs. b) I also used the Rodrigues method to find the rotation matrix and appended the tvecs vector to get the projection matrix. c) Just for testing, I made some points and lines and projected them to make a cube. This works perfectly fine and I am getting a good output.
2) Using Irrlicht: a) I tried to place a 3D model (at position (0,0,0) and rotation (0,0,0)) with the camera feed running in the background. b) Using the rotation matrix found with Rodrigues in OpenCV, I calculated the pitch, yaw and roll values following this post ("http://planning.cs.uiuc.edu/node103.html") and passed them to the rotation field. In the position field I passed the tvecs values, as tvecs[0], -tvecs[1], tvecs[2].
The model is moving in the correct directions but it is not moving proportionately. Meaning, if I move the marker 100 pixels in the x direction, the model only moves 20 pixels (the values 100 and 20 are not measured, I just took arbitrary values to illustrate the example). Similarly for the y axis and z axis. I do know I have to introduce another transformation matrix that maps the OpenCV camera coordinates to Irrlicht camera coordinates, and that it is a 4x4 matrix. But I do not know how to find it. Also, OpenCV's projection matrix [R|t] is a 3x4 matrix and it yields a 2D point that is to be projected. The 4x4 matrix mapping between OpenCV and Irrlicht requires a 3D point (made homogeneous) to be fed into a 4x4 matrix. How do I achieve that?
The 4x4 matrix you are writing about seems to be M=[R|t; 0 1], where t is the 3x1 translation vector. To get the transformed coordinates v' of a 4x1 point v ([x y z 1]^T), just compute v'=Mv.
Your scaling problem may also be caused by a difference between the units used for camera calibration in OpenCV and those used by the other library.
I get my camera parameters using calibrateCamera() and now I have cameraMatrix, distCoeffs, rotationMatrix, transformMatrix.
With these matrices I can build the Projection Matrix and convert 3D object points in the space into 2D image points.
Something like this:
But what I want is the reverse of this projection. I want to convert these 2D points back into 3D space. I know I'll lose some information, but all of my original points were in the same plane.
Please help me build a similar matrix from the camera parameters for this conversion.
From a set of projected 2D points you cannot recover the original 3D points, only the rays that join the 3D points to their projections in the image plane. So you lose the depth of the 3D points; that is to say, you know the direction but not the distance from the camera to each 3D point.
You will have to make up the depth of the 3D points. Their planar condition lets you put some constraints on their relative positions, but it is not enough to retrieve their original depth.
For example, you can set the depth of 3 non-collinear points to create a plane in the 3D space. The depth of the other 2D points will be given by the intersection of those rays with this new plane.
If you know the normal vector of the plane that contains the 3D points, you could do something similar just by setting the depth of a single 2D point and computing the others accordingly. Only if you have, in addition, the distance from that plane to the origin (to the camera) can you retrieve the real depth of your 3D points.
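The last case (known normal and known distance) comes down to a ray-plane intersection. Here is a minimal sketch in plain C++, assuming the plane is written in camera coordinates as n . X = d, with n the unit normal and d the signed distance from the camera center, and assuming the ray direction for each pixel has already been computed. The names `dot` and `intersectRayWithPlane` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Dot product of two 3D vectors.
double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Intersect the ray X = s * dir, which starts at the camera center,
// with the plane n . X = d. The direction does not need unit length.
// Returns the 3D point in camera coordinates.
Vec3 intersectRayWithPlane(const Vec3& dir, const Vec3& n, double d) {
    const double denom = dot(n, dir);
    // A (near-)zero denominator means the ray runs parallel to the plane.
    assert(std::fabs(denom) > 1e-12 && "ray is parallel to the plane");
    const double s = d / denom;
    return {s * dir[0], s * dir[1], s * dir[2]};
}
```

For example, with the plane z = 5 (n = (0,0,1), d = 5) the ray direction (0.2, -0.4, 1) hits the plane at (1, -2, 5).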
I have calibrated my camera with OpenCV (findChessboard etc) so I have:
- Camera Distortion Coefficients & Intrinsics matrix
- Camera Pose information (Translation & Rotation, computed separately via other means) as Euler Angles & a 4x4
- 2D points within the camera frame
How can I convert these 2D points into 3D unit vectors pointing out into the world? I tried using cv::undistortPoints but that doesn't seem to do it (only returns 2D remapped points), and I'm not exactly sure what method of matrix math to use to model the camera via the Camera intrinsics I have.
Convert your 2D point into a homogeneous point (give it a third coordinate equal to 1) and then multiply by the inverse of your camera intrinsics matrix. For example:
// camera_intrinsics_mat is assumed to be a cv::Matx33f here
cv::Matx31f hom_pt(point_in_image.x, point_in_image.y, 1);
hom_pt = camera_intrinsics_mat.inv() * hom_pt; // now in normalized camera coordinates
cv::Point3f origin(0, 0, 0);
cv::Point3f direction(hom_pt(0), hom_pt(1), hom_pt(2));
// To get a unit vector, direction just needs to be normalized
direction *= 1 / cv::norm(direction);
origin and direction now define the ray corresponding to that image point, expressed in the camera's coordinate frame (the origin is the camera center). You can use your camera pose to transform the ray into a different frame. Distortion coefficients relate your actual camera to the ideal pinhole camera model, so they should be used at the very beginning to find the undistorted 2D coordinate. The steps then are:
1) Undistort the 2D coordinate with the distortion coefficients
2) Convert to a ray (as shown above)
3) Move that ray to whatever coordinate system you like.
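Steps 2 and 3 can be sketched without OpenCV as below. Everything here is an assumption for illustration: the pixel is already undistorted, the intrinsics have no skew, and the pose is given as a camera-to-world rotation matrix plus the camera center in world coordinates. `Intrinsics`, `Ray`, `pixelToCameraDirection` and `cameraDirectionToWorldRay` are hypothetical names. Note also that cv::undistortPoints, when called without its optional P argument, returns normalized coordinates, so in that case the multiplication by the inverse intrinsics has already been done.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Pinhole intrinsics without skew (assumed layout, not an OpenCV type).
struct Intrinsics {
    double fx, fy, cx, cy;
};

struct Ray {
    Vec3 origin;     // starting point of the ray
    Vec3 direction;  // unit direction of the ray
};

// Step 2: undistorted pixel (u, v) -> unit direction in the camera frame.
// Equivalent to normalizing K^-1 * (u, v, 1)^T for a skew-free K.
Vec3 pixelToCameraDirection(double u, double v, const Intrinsics& k) {
    assert(k.fx != 0.0 && k.fy != 0.0);
    Vec3 d = {(u - k.cx) / k.fx, (v - k.cy) / k.fy, 1.0};
    const double norm = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    for (double& c : d) c /= norm;
    return d;
}

// Step 3: rotate the direction into the world frame and start the ray
// at the camera center expressed in world coordinates.
Ray cameraDirectionToWorldRay(const Vec3& dirCam, const Mat3& camToWorld,
                              const Vec3& camCenterWorld) {
    Ray ray{};
    ray.origin = camCenterWorld;
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            ray.direction[i] += camToWorld[i][j] * dirCam[j];
    return ray;
}
```

If your pose is stored the other way round (world-to-camera, as solvePnP returns it), use the transpose of that rotation as `camToWorld`.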
I am trying to use OpenCV to correct an image for distortion and then calculate the real-world coordinates given a pixel coordinate. I cannot find any examples online or in the OpenCV book of how to do this.
I have done the camera calibration with the chessboard image. Now, I just need a basic function that I can give pixel coordinates to and that will give me real-world coordinates based on the camera matrix, the distortion coefficients, and the rotation and translation vectors.
Does anyone know how to do this?
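Take a look at the findHomography() function. If you know the real-world locations of a set of points, you can use this function to create a transformation matrix that you can then use with the function perspectiveTransform(). This only works when the world points lie on a single plane, which is why the world coordinates below are 2D: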
std::vector<Point2f> worldPoints;   // known positions on the world plane
std::vector<Point2f> cameraPoints;  // matching pixel positions
// insert at least 4 corresponding points in both vectors
// RANSAC replaces the legacy CV_RANSAC constant
Mat perspectiveMat_ = findHomography(cameraPoints, worldPoints, RANSAC);
// use perspective transform to translate other points to real-world coordinates
std::vector<Point2f> camera_corners;
// insert points from your camera image here
std::vector<Point2f> world_corners;
perspectiveTransform(camera_corners, world_corners, perspectiveMat_);
You can find more information about the function here
If I understand correctly, you need a world point from an image point. With a single image from a monocular camera and no further information, this problem is unsolvable: you cannot determine the depth (distance) of the real-world point from the camera.
There are visual simultaneous localization and mapping (SLAM) algorithms that create a map of the world and compute the trajectory of the camera from a video, but they are a whole other thing.
Given a single image and a point on it, expressed in terms of 2D pixel coordinates, there is an infinity of 3D points in the real world, all belonging to a line, which map to your point in your image... not just one point.
But if you know the distance from the camera to the object seen at pixel (x,y), then you can calculate its location in 3D.
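As a sketch of that last statement, assume an undistorted pixel (u, v) (the (x,y) above), pinhole intrinsics f_x, f_y, c_x, c_y without skew, and a known depth Z measured along the optical axis. These symbols are assumptions for illustration. The point in camera coordinates is then:

```latex
X = \frac{(u - c_x)\, Z}{f_x}, \qquad
Y = \frac{(v - c_y)\, Z}{f_y}, \qquad
\mathbf{P}_{\text{cam}} = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
```

If the known quantity is instead the straight-line distance \rho from the camera center to the object, scale the unit ray through the pixel:

```latex
\mathbf{d} = \begin{bmatrix} (u - c_x)/f_x \\ (v - c_y)/f_y \\ 1 \end{bmatrix},
\qquad
\mathbf{P}_{\text{cam}} = \rho \, \frac{\mathbf{d}}{\lVert \mathbf{d} \rVert}
```

Either result is in the camera frame; the camera pose is still needed to express it in world coordinates.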