I am trying to use OpenCV to correct an image for distortion and then calculate the real-world coordinates of a given pixel coordinate. I cannot find any examples online or in the OpenCV book of how to do this.
I have done the camera calibration with the chessboard image. Now I just need a basic function that I can give pixel coordinates to and that will return real-world coordinates, based on the camera matrix, the distortion coefficients, and the rotation and translation vectors.
Does anyone know how to do this?
Take a look at the findHomography() function. If you know the real-world locations of a set of points, you can use this function to create a transformation matrix, which you can then use with the perspectiveTransform() function:
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    std::vector<Point2f> worldPoints;
    std::vector<Point2f> cameraPoints;
    // insert corresponding points in both vectors here

    // estimate the image-to-world homography robustly with RANSAC
    Mat perspectiveMat_ = findHomography(cameraPoints, worldPoints, CV_RANSAC);

    // use perspectiveTransform to map other image points to real-world coordinates
    std::vector<Point2f> camera_corners;
    // insert points from your camera image here
    std::vector<Point2f> world_corners;
    perspectiveTransform(camera_corners, world_corners, perspectiveMat_);
    return 0;
}
You can find more information about the function in the OpenCV documentation.
If I understand correctly, you need a world point from an image point. With a monocular camera this problem is unsolvable: you cannot determine the depth (distance) of the real-world point from the camera.
There are visual simultaneous localization and mapping (SLAM) algorithms that create a map of the world and compute the trajectory of the camera from a video, but they are a whole other thing.
Given a single image and a point on it, expressed in 2D pixel coordinates, there is an infinite set of 3D points in the real world, all lying on a single line, that map to that point in your image, not just one point.
But if you know the distance of the object at pixel (x, y) from the camera, then you can calculate its location in 3D, as in the sketch below.
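A minimal sketch of that back-projection, assuming K is the 3x3 camera matrix from your calibration and Z is the known depth along the optical axis (the function name and parameters are illustrative):
#include <opencv2/opencv.hpp>

// Back-project pixel (u, v) with known depth Z into the camera frame:
// X_cam = Z * K^-1 * [u, v, 1]^T
cv::Point3d backProject(const cv::Mat& K, double u, double v, double Z)
{
    cv::Mat pixel = (cv::Mat_<double>(3, 1) << u, v, 1.0);
    cv::Mat ray = K.inv() * pixel;                  // [x', y', 1]^T
    return cv::Point3d(ray.at<double>(0) * Z,       // scale the ray by the
                       ray.at<double>(1) * Z, Z);   // known depth
}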
I am interested in finding the rotation matrix of an ArUco marker from a stereo camera.
I know that estimatePoseSingleMarkers gives a rotation vector (which can be converted to a matrix via Rodrigues) and a translation vector, but the values are not very stable, and the function is apparently written for monocular cameras.
I can get stable 3D points of the marker from a stereo camera, but I am struggling to build a rotation matrix from them. My main goal is to achieve what Ali has achieved in the following blog post: Relative Position of Aruco Markers.
I have tried working with Euler angles (from here) by fitting a plane to the marker's 3D points from the stereo camera, but in vain.
I know my algorithm is failing because the relative coordinates keep changing as the camera moves, which should not happen: the relative coordinates between the markers should remain constant.
I have a properly calibrated camera with all the required matrices.
I tried using solvePnP, but I believe it gives rvecs and tvecs which, combined, bring points from the model coordinate system into the camera coordinate system.
Any idea how I can build the marker's rotation matrix from my fairly stable 3D points, so that the relative coordinates don't change when the camera moves?
Thanks in advance.
I am completing my thesis, which is related to OpenCV.
I want to measure the real size of an object (in mm) with a single camera, but I have a problem converting between the camera's natural units (pixels) and real-world units.
After calibrating the camera, I have:
Camera matrix (3x3)
Distortion coefficients
Extrinsic parameters [rotation vector (1x3) + translation vector (1x3)]
I have read the following link, but I can't find the formula to convert the units:
https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html
For example, how would I measure the size of an object?
Any suggestions?
Thanks so much.
As mentioned in the comments, you need the distance to the object to obtain 3D coordinates from pixels. A possible workflow would be:
Undistort (rectify) the image using the distortion coefficients, i.e., correct the lens distortion introduced by the camera.
Deproject the pixel into a 3D point in the camera coordinate frame using the camera matrix: multiply the inverse of the 3x3 camera matrix with the homogeneous pixel vector [pixel_x, pixel_y, 1]^T. Multiplying the result [x', y', 1]^T by the depth, i.e., the z-component, gives the 3D point in the camera coordinate frame.
Transform the point from the camera coordinate frame into the world coordinate frame using the extrinsic parameters (see the sketch after these steps).
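A minimal sketch of steps 2 and 3, assuming the depth Z of the pixel is known and that rvec/tvec are the extrinsics mapping world coordinates into the camera frame (so their inverse is applied here); all names are illustrative:
#include <opencv2/opencv.hpp>

// K: 3x3 camera matrix; rvec, tvec (3x1, CV_64F): extrinsics world -> camera;
// (u, v): an already-undistorted pixel; Z: known depth in the camera frame.
cv::Mat pixelToWorld(const cv::Mat& K, const cv::Mat& rvec, const cv::Mat& tvec,
                     double u, double v, double Z)
{
    cv::Mat pixel = (cv::Mat_<double>(3, 1) << u, v, 1.0);
    cv::Mat camPoint = Z * (K.inv() * pixel);   // 3D point in the camera frame

    cv::Mat R;
    cv::Rodrigues(rvec, R);                     // rotation vector -> 3x3 matrix
    // Invert X_cam = R * X_world + t  =>  X_world = R^T * (X_cam - t)
    return R.t() * (camPoint - tvec);
}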
Obtaining the depth values from an image alone is not possible. The only option is to use some additional information. Maybe your object is placed on a table and you know the distance between the camera and the table.
To measure the distance between the camera and a table, or even the object itself, you could use ArUco markers, which are also available within OpenCV, for example:
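A hedged sketch of reading a marker's distance with the aruco contrib module (the dictionary choice and markerLength are assumptions you would replace with your own; the API details vary slightly between OpenCV versions):
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/aruco.hpp>

// cameraMatrix/distCoeffs come from your calibration; markerLength is the
// physical side length of the printed marker (here assumed to be in meters).
void printMarkerDistances(const cv::Mat& image, const cv::Mat& cameraMatrix,
                          const cv::Mat& distCoeffs, float markerLength)
{
    cv::Ptr<cv::aruco::Dictionary> dict =
        cv::aruco::getPredefinedDictionary(cv::aruco::DICT_6X6_250);
    std::vector<int> ids;
    std::vector<std::vector<cv::Point2f>> corners;
    cv::aruco::detectMarkers(image, dict, corners, ids);

    std::vector<cv::Vec3d> rvecs, tvecs;
    cv::aruco::estimatePoseSingleMarkers(corners, markerLength,
                                         cameraMatrix, distCoeffs, rvecs, tvecs);
    for (size_t i = 0; i < ids.size(); ++i)
        std::cout << "marker " << ids[i] << ": "
                  << cv::norm(tvecs[i]) << " m away" << std::endl;
}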
I am currently facing an issue with my structure-from-motion program based on OpenCV.
I'll try to describe what it does and what it is supposed to do.
The program relies on the classic structure-from-motion method.
The basic idea is to take a pair of images, detect their keypoints, and compute the descriptors of those keypoints. Then the keypoints are matched, with a certain number of tests to ensure the result is good. That part works perfectly.
Once this is done, the following computations are performed: fundamental matrix, essential matrix, SVD decomposition of the essential matrix, camera matrix computation, and finally triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly for a single pair.
Here is my problem: for a pair of images, the 3D point coordinates are calculated in the coordinate system of the first image of the pair, taken as the reference image. When working with more than two images, which is the objective of my program, I have to reproject the 3D points into the coordinate system of the very first image in order to get a consistent result.
My question is: how do I reproject 3D point coordinates given in one camera-related system into another camera-related system? Using the camera matrices?
My idea was to take the 3D point coordinates, and to multiply them by the inverse of each camera matrix before.
To clarify:
Suppose I am working on the third and fourth image (hence, the third pair of images, because I am working like 1-2 / 2-3 / 3-4 and so on).
I get my 3D point coordinates in the coordinate system of the third image; how do I reproject them properly into the coordinate system of the very first image?
I would have done the following:
Take the 3D point coordinates, apply the inverse of the camera matrix for images 2 to 3, and then apply the inverse of the camera matrix for images 1 to 2.
Is that even correct?
The problem is that those camera matrices are non-square, so I can't invert them.
I am surely mistaken somewhere, and I would be grateful if someone could enlighten me. I am pretty sure this is a relatively easy one, but I am obviously missing something...
Thanks a lot for reading :)
Let us say you have a 3x4 extrinsic parameter matrix called P. To match the notation of the OpenCV documentation, this is [R|t].
This matrix P describes the projection from world space coordinates to the camera space coordinates. To quote the documentation:
[R|t] translates coordinates of a point (X, Y, Z) to a coordinate system, fixed with respect to the camera.
You are wondering why this matrix is non-square. That is because, in the usual context of OpenCV, you are not expecting homogeneous coordinates as output. To make it square, just add a fourth row containing (0, 0, 0, 1). Let's call this new square matrix Q.
You have one such matrix for each pair of cameras, that is, one matrix Qk for each pair of images {k, k+1}, describing the transformation from the coordinate space of camera k to that of camera k+1. These matrices are invertible because they describe isometries in homogeneous coordinates.
To go from the coordinate space of camera 3 to that of camera 1, just apply to your points the inverse of Q2 and then the inverse of Q1, as in the sketch below.
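A minimal sketch of that chaining, assuming each P is a 3x4 [R|t] of type CV_64F and the points are 4x1 homogeneous vectors (function names are illustrative):
#include <opencv2/opencv.hpp>

// Build a 4x4 homogeneous matrix Q from a 3x4 extrinsic matrix P = [R|t].
cv::Mat toHomogeneous(const cv::Mat& P)
{
    cv::Mat Q = cv::Mat::eye(4, 4, CV_64F);
    P.copyTo(Q(cv::Rect(0, 0, 4, 3)));   // copy [R|t] into the top three rows
    return Q;
}

// Map a homogeneous point from camera 3's frame back to camera 1's frame.
cv::Mat camera3ToCamera1(const cv::Mat& Q1, const cv::Mat& Q2, const cv::Mat& p3)
{
    // p3 is a 4x1 homogeneous point [X, Y, Z, 1]^T expressed in camera 3's frame
    return Q1.inv() * (Q2.inv() * p3);
}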
I have a set of 2D points (X, Y) corresponding to a set of 3D points (X, Y, Z).
The 2D points were captured from the camera, and the 3D points are the real coordinates in the world frame. I want to find the transformation between them, that is to say, how to convert other 2D points to 3D points.
I have tried the getPerspectiveTransform function, but it didn't work for this problem.
How can I write a regression to find this transformation matrix?
You can use solvePnP from OpenCV; it gives you the rotation and translation vectors. This answer explains it in more detail:
Camera position in world coordinate from cv::solvePnP
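A minimal sketch of such a solvePnP call (the point vectors and calibration matrices are placeholders you would fill in yourself):
#include <opencv2/opencv.hpp>

int main()
{
    std::vector<cv::Point3f> objectPoints;   // known 3D points in the world frame
    std::vector<cv::Point2f> imagePoints;    // their measured 2D pixel positions
    cv::Mat cameraMatrix, distCoeffs;        // from your camera calibration
    // ... fill the correspondences and calibration data here ...

    cv::Mat rvec, tvec;                      // bring world points into the camera frame
    cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

    cv::Mat R;
    cv::Rodrigues(rvec, R);                  // rotation vector -> 3x3 matrix
    // As in the linked answer, the camera position in world coordinates:
    cv::Mat cameraPosition = -R.t() * tvec;
    return 0;
}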
I'm given a plane (a support vector and the plane's normal vector) and an image taken by a camera whose intrinsic parameters (fx, fy, cx, cy) I know. How do I obtain the transformation of this image into a bird's-eye-view-like image (so that the viewing direction is collinear with the plane's normal vector)? I'm confused about the coordinate systems I have to use: some matrices are in world coordinates and some in local ones. I know that there is warpPerspective() in OpenCV; would this do the job?
I'm using OpenCV 2.4.9.
Thanks a lot!
Update:
Do I have to calculate 4 points in the image as seen by the camera, then the corresponding 4 points in the bird's-eye view, and pass them to findHomography() to obtain the transformation matrix?
Update:
Solved. Got it to work!
A rectangle on the world plane will appear as a quadrangle in your image.
In a bird's eye view, you want it to appear as a rectangle again.
You must know at least the aspect ratio of this world rectangle for the top view to be correctly and equally scaled along both axes.
Given the 4 quadrangle 2D points in your image, and the destination rectangle corners (for which you essentially choose the 2D coordinates), you can calculate the homography from your quadrangle to the rectangle and use warpPerspective() to render it, as in the sketch below.
This is often the easiest way to do it.
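A minimal sketch of that warp (the corner coordinates and file names below are made-up placeholders; with exactly 4 correspondences getPerspectiveTransform() suffices, while findHomography() would be used with more points):
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat image = cv::imread("camera_image.png");   // hypothetical input file

    // The 4 corners of the world rectangle as they appear in the image
    // (top-left, top-right, bottom-right, bottom-left; example values).
    std::vector<cv::Point2f> imageQuad = {
        {162, 95}, {450, 110}, {520, 380}, {120, 360}
    };
    // Destination rectangle; pick its size to match the world aspect ratio.
    std::vector<cv::Point2f> birdQuad = {
        {0, 0}, {400, 0}, {400, 300}, {0, 300}
    };

    cv::Mat H = cv::getPerspectiveTransform(imageQuad, birdQuad);
    cv::Mat topView;
    cv::warpPerspective(image, topView, H, cv::Size(400, 300));
    cv::imwrite("bird_view.png", topView);
    return 0;
}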
You can also go through the camera matrices themselves. For this you will need to rotate the camera to be above the plane, with an orthographic projection. See the homography decomposition here.