I am trying to measure the pose of a camera and I have done the following.
Marked world 3-D points (assuming z = 0, since the surface is flat) on the corners of a square on a flat surface and chose a world coordinate system (in cm).
I have taken the top-left corner of the square as my origin and given the world points in the following order, (x, y) or (col, row):
(0, 0), (12.8, 0), (12.8, 12.8), (0, 12.8) - in cm
Detected those points in my image (in pixels).
The image points and world points are in the same order.
I have calibrated my camera for intrinsic matrix and distortion coefficients.
I use the solvePnP function to get rvec and tvec.
I use the Rodrigues function to get the rotation matrix.
To check whether rvec and tvec are correct, I project the 3-D points (z = 0) back into the image plane using projectPoints, and the points land on my image correctly, with an error of about 3 pixels on the X axis.
Now I go ahead and calculate my camera position in the world frame using the formula:
cam_world_pos = -inverse(R) * tvec. (I have verified this formula in many blogs, and it also makes sense.)
But the x, y, and z of my cam_world_pos (in cm) do not seem to be correct.
My doubt is: if I am able to project the 3-D world points back into the image plane using rvec and tvec with a 3-pixel error on the X axis and almost no error on the Y axis (which I hope is not too bad), then why am I not getting the camera position in the world frame right?
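For reference, the reasoning behind that formula (standard pinhole model, not specific to this post): solvePnP returns R and tvec such that a world point X_w maps into camera coordinates as X_c = R * X_w + tvec. The camera center C is the world point that maps to X_c = 0, so 0 = R * C + tvec, which gives C = -inverse(R) * tvec = -transpose(R) * tvec, since the inverse of a rotation matrix is its transpose.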
Also, I have a doubt about the solvePnP rvec and tvec solution: it might be one of multiple solutions, but not the one I want.
How do I get the right rvec and tvec from solvePnP? Any other suggestions for getting rvec and tvec would also be helpful.
EDITS:
Image size - 720 (rows) x 1280 (cols)
Camera parameters:
cameraMatrix_Front=[908.65 0 642.88
0 909.28 364.95
0 0 1]
distCoeffs_Front=[-0.4589, 0.09462, -1.46*10^-3, 1.23*10^-3]
OpenCV C++ code:
vector<Point3f> front_object_pts;
vector<Point2f> front_image_pts;
Mat rvec_front;
Mat tvec_front;
Mat rotation_front;
Mat world_position_front_cam;
//Fill front object points(x-y-z order in cms)
//It is square of side 12.8cms on Z=0 plane
front_object_pts.push_back(Point3f(0, 0, 0));
front_object_pts.push_back(Point3f(12.8, 0, 0));
front_object_pts.push_back(Point3f(12.8,12.8,0));
front_object_pts.push_back(Point3f(0, 12.8, 0));
//Corresponding Image points detected in the same order as object points
front_image_pts.push_back(points_front[0]);
front_image_pts.push_back(points_front[1]);
front_image_pts.push_back(points_front[2]);
front_image_pts.push_back(points_front[3]);
//Detected points in image matching the 3-D points in the same order
//(467,368)
//(512,369)
//(456,417)
//(391,416)
//Get rvec and tvec using solvePnP
solvePnP(front_object_pts, front_image_pts, cameraMatrix_Front,
Mat(4,1,CV_64FC1,Scalar(0)), rvec_front, tvec_front, false, CV_ITERATIVE);
//Output of SolvePnP
//tvec=[-26.951,0.6041,134.72] (3 x 1 matrix)
//rvec=[-1.0053,0.6691,0.3752] (3 x 1 matrix)
//Check whether rvec and tvec are correct by projecting the 3-D object points back into the image
vector<Point2f> check_front_image_pts;
projectPoints(front_object_pts, rvec_front, tvec_front,
cameraMatrix_Front, distCoeffs_Front, check_front_image_pts);
//Note that I have passed a zero vector for the distortion coefficients to solvePnP,
//since my image points are detected after radial distortion is removed
//Get rotation matrix
Rodrigues(rvec_front, rotation_front);
//Get rotation matrix inverse
Mat rotation_inverse;
transpose(rotation_front, rotation_inverse);
//Get camera position in world coordinates
world_position_front_cam = -rotation_inverse * tvec_front;
//Actual location of camera(Measured manually)
X=47cm
Y=18cm
Z=25cm
//Obtained location
X=110cm
Y=71cm
Z=-40cm
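For completeness, one way to quantify the reprojection check described above is to compare the detected points with the reprojected ones directly. A minimal sketch using the variables from the code above (not part of the original post):
//Mean reprojection error between detected and reprojected points
double total_err = 0.0;
for (size_t i = 0; i < front_image_pts.size(); i++)
{
    Point2f d = front_image_pts[i] - check_front_image_pts[i];
    total_err += std::sqrt(d.x * d.x + d.y * d.y);
}
cout << "Mean reprojection error (pixels): " << total_err / front_image_pts.size() << endl;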
Related
I am working on head pose estimation using dlib and OpenCV from a saved video.
I am able to get the head rotation matrix and Euler angles using the OpenCV solvePnP function in each frame. My question is: how can I get the angle difference between two consecutive frames based on the rotation matrices or Euler angles?
std::vector<cv::Point3d> object_pts; //generic 3D world coordinate
std::vector<cv::Point2d> image_pts; //dlib landmark points
cv::Mat rotation_vec;    //3 x 1 Rodrigues rotation vector
cv::Mat rotation_mat;    //3 x 3 rotation matrix R
cv::Mat translation_vec; //3 x 1 translation vector
cv::Mat pose_mat;        //3 x 4 [R | t]
cv::Mat out_intrinsics, out_rotation, out_translation, euler_angle;
//cam_matrix and dist_coeffs come from the camera calibration
//calc pose
cv::solvePnP(object_pts, image_pts, cam_matrix, dist_coeffs, rotation_vec, translation_vec);
//calc euler angle
cv::Rodrigues(rotation_vec, rotation_mat);
cv::hconcat(rotation_mat, translation_vec, pose_mat);
cv::decomposeProjectionMatrix(pose_mat, out_intrinsics, out_rotation, out_translation, cv::noArray(), cv::noArray(), cv::noArray(), euler_angle);
This is example of rotation and euler angle matrices from two frames:
First frame :
rotation_mat: [0.9808042880813987, -0.00560492255811192, -0.194914169121329;
-0.009358357320917634, 0.9970819362297282, -0.07576300939528717;
0.1947700429465335, 0.07613276093439612, 0.9778897857545662]
Euler_angle[4.451733321076386;
-11.23129045806711;
-0.5466718623384634]
Second frame:
rotation_mat: [0.9821961040283421, -0.005362175244554571, -0.1877819488347928;
-0.01108060484669302, 0.9961985559469002, -0.08640403535450306;
0.1875314198415228, 0.0869464444699003, 0.9784029243447022]
Euler_angle[5.078288701210371;
-10.80875560893563;
-0.6463525542012101]
My goal is to find one pose as a baseline and compare other head pose from each frame to find how much it deviates from the baseline.
Best regards.
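One common way to measure the deviation from a baseline pose (a sketch, not part of the original post) is to form the relative rotation R_delta = R_current * R_baseline^T and convert it back to an axis-angle vector with Rodrigues; the norm of that vector is the rotation angle between the two poses:
//rotation_mat_baseline and rotation_mat_current are the 3 x 3 matrices
//obtained from cv::Rodrigues for the two frames (names are illustrative)
cv::Mat R_delta = rotation_mat_current * rotation_mat_baseline.t();
cv::Mat rvec_delta;
cv::Rodrigues(R_delta, rvec_delta);           //axis-angle representation
double angle_rad = cv::norm(rvec_delta);      //rotation angle in radians
double angle_deg = angle_rad * 180.0 / CV_PI; //in degrees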
I have two sets of points from an image that I am trying to transform between. To do this, I am using OpenCV's function estimateRigidTransform(). With the homography matrix that is produced (containing the rotation matrix and the translation vector), I am using OpenCV's function warpAffine() to transform the image. I am then displaying this transformed image in a new window.
My code is as follows:
cv::namedWindow("Transformed Blue A", CV_WINDOW_AUTOSIZE);
cv::Mat mask_image, homography_matrix;
bool fullAffine = false;
homography_matrix = estimateRigidTransform(black_A_points, blue_A_points, fullAffine);
warpAffine(image, mask_image, homography_matrix, image.size());
cv::imshow("Transformed Blue A", mask_image);
black_A_points and blue_A_points are vectors with four Point2f values in them (these are the coordinates that are transformed between). After transforming with warpAffine(), I am displaying the transformed image in a new window.
The results are as follows:
Image to be Transformed
Transformed Image
I am using the corners of the 'A's as the feature points to transform between (hence why they have red lines and green dots drawn on them, for visual confirmation that I have found these points correctly). I manually shifted the image over in the window so I could see it better, and the rotation is visually correct. However, I was expecting to see the blue 'A' where the black 'A' was in the window showing the original image (translated to that position).
The homography matrix is: [-0.7138494567933336, 0.7193648090910907, 40.48675760211488; -0.7193648090910907, -0.7138494567933336, 849.4044159834291]
Therefore the rotation matrix is: [-0.7138494567933336, 0.7193648090910907; -0.7193648090910907, -0.7138494567933336]
and the translation vector is: [40.48675760211488; 849.4044159834291]
Am I using the homography matrix correctly? Do I need to be performing a mathematical operation on the homography matrix before I can use it in the coordinates of the window (as in the current coordinate frame is wrong)? Or am I using the OpenCV functions wrong?
I also tried the OpenCV functions findHomography() and getAffineTransform() but both of these produced the same problem.
Thank you very much for your time. I appreciate any help.
UPDATE:
Corners of Black A:
[(495, 515), (479, 497), (428, 646), (345, 565)]
Corners of Blue A:
[(57, 125), (57, 151), (200, 80), (200, 198)]
After testing it with your numbers I found the problem :)
You are using points as (y, x) and not as (x, y). I tried with your original numbers and reproduced the same results as you did. Then I wrote a small script to test it in Python, inverting the coordinates:
import numpy as np
import cv2
# load image and data
img = cv2.imread("test.png")
pointsBlack = np.array([(495, 515), (479, 497), (428, 646), (345, 565)])
pointsBlue = np.array([(57, 125), (57, 151), (200, 80), (200, 198)])
# inverts the points (y,x) is (x,y)
a = np.array([(x[1], x[0]) for x in pointsBlack])
b = np.array([(x[1], x[0]) for x in pointsBlue])
res = cv2.estimateRigidTransform(a, b, True)
print(res)
imgWarped = cv2.warpAffine(img, res, img.shape[1::-1])
cv2.imshow("warped", imgWarped)
cv2.imshow("s", img)
cv2.waitKey(0)
The result of this is:
[[-7.80571429e-01 -7.46857143e-01 8.96688571e+02]
[ 6.53714286e-01 -7.35428571e-01 8.43742857e+01]]
and the image looks like:
In C++ the cv::Point2f constructor is cv::Point2f(x, y), but you are passing (y, x). I am not sure how you found these points, but it could be a confusion with the cv::Mat::at<T>(row, col) function, which takes the row first and then the column, i.e. in Cartesian terms, y and then x.
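In C++ terms, the fix would look something like this (a sketch, assuming black_A_points and blue_A_points were filled with swapped coordinates as in the question):
// Swap each point from (y, x) to (x, y) before estimating the transform
std::vector<cv::Point2f> black_xy, blue_xy;
for (const cv::Point2f &p : black_A_points) black_xy.push_back(cv::Point2f(p.y, p.x));
for (const cv::Point2f &p : blue_A_points)  blue_xy.push_back(cv::Point2f(p.y, p.x));
cv::Mat warp = cv::estimateRigidTransform(black_xy, blue_xy, fullAffine);
cv::warpAffine(image, mask_image, warp, image.size());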
I successfully calibrated my camera using OpenCV. The camera lens I am using:
https://www.baslerweb.com/en/products/vision-components/lenses/basler-lens-c125-0418-5m-f1-8-f4mm/
The internal and external camera parameters are given below.
cv::Mat cameraMatrix(3, 3, cv::DataType<double>::type);
cameraMatrix.at<double>(0) = 1782.80;//fx //432.2 in mm
cameraMatrix.at<double>(1) = 0;
cameraMatrix.at<double>(2) = 3.0587694283633488e+002;//cx
cameraMatrix.at<double>(3) = 0;
cameraMatrix.at<double>(4) = 1782.80;//fy
cameraMatrix.at<double>(5) = 3.0535864258476721e+002;//cy
cameraMatrix.at<double>(6) = 0;
cameraMatrix.at<double>(7) = 0;
cameraMatrix.at<double>(8) = 1;
cv::Mat disCoeffs(1, 5, cv::DataType<double>::type);
disCoeffs.at<double>(0) = -8.1752937039996709e-001;//k1
disCoeffs.at<double>(1) = -2.5660653367749450e+001;//k2
disCoeffs.at<double>(2) = -1.5556922931812768e-002;//p1
disCoeffs.at<double>(3) = -4.4021541217208054e-002;//p2
disCoeffs.at<double>(4) = 1.5042036073609015e+002;//k3
I know this formula is used to calculate the distance of the object, but I am very confused about how to use it properly.
Resolution of my camera is 640x480.
Focal length = 1782.80 (px); I do not know how to correctly convert this to mm.
I know the focal length is the distance from the lens to the image plane. So what does this value actually represent? A pixel is just a unit representing a dot on the screen.
The object I am using is a circle.
Radius = 22 (width and height 44 x 44).
Circle center point: 300, 300 (x, y).
Sensor height - I do not know how to get this.
Where do I use the principal points?
How do I get the distance from the camera to the object? How do I get the real-world coordinates of the circle?
I know it's a lot to ask. I have been trying for a month and have not found any proper solution.
I use the solvePnP function to get the camera translation and rotation matrices, but I have a problem: how do I calculate the object points?
Your cx and cy seem to be wrong, because they should be roughly half the resolution: 640/2 and 480/2.
fx and fy are in pixel units; you get them from the calibration process. To convert between pixel and mm focal lengths, use these formulas:
fx (pixels) = (image width in pixels) * (focal length in mm) / (CCD width in mm)
fy (pixels) = (image height in pixels) * (focal length in mm) / (CCD height in mm)
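As a numeric illustration of those formulas (the sensor width here is a made-up example value, not this camera's actual specification): with a 4 mm lens, a 640-pixel-wide image, and a hypothetical 4.8 mm sensor width, fx = 640 * 4 / 4.8 ≈ 533 pixels; inverting the same relation gives focal length in mm = fx * (CCD width in mm) / (image width in pixels).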
When you calibrate your camera, you use those formulas to make sure you have the right values. To me, cx and cy look wrong because they should represent the center of the image (and they shouldn't be nearly equal unless your image is square, which is not the case). For fx and fy I can't tell, because I don't know the CCD of your camera; they can be equal if the pixels of the CCD are square.
Don't change those parameters manually; let your calibration software compute them.
Now that you have those parameters, how do you compute the distance?
The formula you presented is not very useful, in the sense that if you can measure the real height you can usually also measure the distance (at least in your case), so why use a camera?
So, to compute the distance in the real world, you need two more things: the extrinsic parameters (your cameraMatrix holds the intrinsic parameters) and at least four points (the more points the better) in real-world coordinates.
Once you have those things, you can use the solvePnP function to find the pose of an object. The pose represents the translation and rotation with respect to the camera frame.
http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#solvepnp
This is a piece of code that can help to do that:
//Four points in real world with `x,y and z` coordinates
vector<Point3f> vec3d;
vec3d.push_back(Point3f(0, 0, 0));
vec3d.push_back(Point3f(0, 211, 0));
vec3d.push_back(Point3f(295, 211, 0));
vec3d.push_back(Point3f(295, 0, 0));
z = 0 because your real points lie in a plane.
//The same four points but in your image plane, therefore there is no z and they're in pixel units
vector<Point2f> vec2d;
vec2d.push_back(Point2f(532, 412)); //(x, y)
vec2d.push_back(Point2f(583, 594));
vec2d.push_back(Point2f(927, 535));
vec2d.push_back(Point2f(817, 364));
//The pose of the object: rvec is your rotation vector, tvec is your translation vector
cv::Mat rvec, tvec;
solvePnP(vec3d, vec2d, cameraMatrix, distCoeffs, rvec, tvec);
Finally, you can compute the real distance from tvec as the Euclidean distance: d = std::sqrt(tx*tx + ty*ty + tz*tz).
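In code, that distance can be read directly from the translation vector; a minimal sketch using the rvec/tvec names above:
//Euclidean distance from the camera to the origin of the object frame
double tx = tvec.at<double>(0), ty = tvec.at<double>(1), tz = tvec.at<double>(2);
double distance = std::sqrt(tx*tx + ty*ty + tz*tz); //same units as the object points
//Equivalently: double distance = cv::norm(tvec);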
Your questions:
Sensor height - I do not know how to get this.
Look for your camera's specifications on the internet or in the manual and you'll find it.
Where do I use the principal points?
They're part of your intrinsic parameters; you're not going to use them separately.
How do I get the distance from the camera to the object? How do I get the real-world coordinates of the circle?
I explained that above. You need at least four points, and with a circle you have only one, which is not enough to compute the pose.
But I have a problem: how do I calculate the object points?
objectPoints in solvePnP are your real-world coordinates. For example, a chessboard has corners for which we know the exact position in mm of each one with respect to a world frame that you choose on the chessboard. It can be the top-left corner or something like that, and z = 0 because the chessboard is printed on paper, just like your circle!
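For example, the object points of a chessboard with a known square size are usually generated like this (a sketch; boardWidth, boardHeight and squareSize are example parameters, not values from this question):
//Chessboard corner coordinates in the board's own frame (z = 0)
int boardWidth = 9, boardHeight = 6; //inner corners per row and column
float squareSize = 25.0f;            //square side in mm
vector<Point3f> objectPoints;
for (int i = 0; i < boardHeight; i++)
    for (int j = 0; j < boardWidth; j++)
        objectPoints.push_back(Point3f(j * squareSize, i * squareSize, 0.0f));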
EDIT:
You can find more specifications in the manual, page 13, here. It says the pixel size is 7.4 x 7.4 µm:
f (mm) = f (pixels) x pixel_size (mm) => f (mm) = 1782.80 x 7.4e-3 ≈ 13.19 (mm)
That is not 4 mm! So you need to do the calibration again; something is wrong.
3D points:
vector<Point3f> vec3d;
vec3d is where you are going to store your 3D coordinate points. I gave you an example for the first point, which is the origin:
vec3d.push_back(Point3f(0, 0, 0)); //x, y, z
EDIT3
If you take a pattern like this:
Then choose, for example, the circle in the top-left or top-right corner; it will have the coordinate (0, 0, 0) - that is the origin. The circle next to it is your second point, and it will have (x, 0, 0), where x is the distance in mm between the two circles. You do the same for four points in your pattern. You can choose any pattern you want as long as you can detect it in your image and retrieve the points' coordinates in pixels.
If you still don't understand, I advise you to take a course on projective geometry and camera models, so that you understand what every parameter means.
I have a fish-eye camera on the ceiling and I want to locate some points on the floor. I have put the origin of my reference system (real world) just below the camera, and I want to know the position of every object in centimeters. This picture shows this:
Reference system - Real world
First, I did the camera calibration and obtained the following result, with an RMS of 1.11:
Undistorted image after calibration
As a result of the calibration I obtained the intrinsic parameters (camera matrix), so I used cv::solvePnP to get the rotation and translation vectors. To apply this, I marked some points in the undistorted image (in pixels) and measured them in the real world according to my reference system.
For example, the origin is in the center of a 1024x768 image, so:
Point 0: ImagePoint(512, 384) [pixels] --> ObjectPoint(0,0) [centimeters]
The next code shows this:
std::vector<cv::Point2f> imagePointsPix;
std::vector<cv::Point3f> objectPointsCm;
imagePointsPix.push_back(cv::Point2f(512.,384.));
imagePointsPix.push_back(cv::Point2f(404.,512.));
imagePointsPix.push_back(cv::Point2f(666.,211.));
imagePointsPix.push_back(cv::Point2f(519.,66.));
objectPointsCm.push_back(cv::Point3f(0., 0., 0.));
objectPointsCm.push_back(cv::Point3f(-80.,-132.,0.));
objectPointsCm.push_back(cv::Point3f(120.,188.,0.));
objectPointsCm.push_back(cv::Point3f(-40.,268.,0.));
cv::Mat rvec(1,3,cv::DataType<double>::type);
cv::Mat tvec(1,3,cv::DataType<double>::type);
cv::Mat rotationMatrix(3,3,cv::DataType<double>::type);
cv::solvePnP(objectPointsCm, imagePointsPix, cameraMatrix, distCoeffs, rvec, tvec, 0, SOLVEPNP_ITERATIVE);
cv::Rodrigues(rvec,rotationMatrix);
Now I have the camera matrix, the rotation matrix and the translation vector, so using these as a reference I should be able to compute any point if I have its position in pixels. This is the code:
cv::Mat uvPoint = cv::Mat::ones(3,1,cv::DataType<double>::type); //u,v,1
uvPoint.at<double>(0,0) = 512.; //img point for which we want its real coordinates
uvPoint.at<double>(1,0) = 384.;
cv::Mat tempMat, tempMat2;
double s;
tempMat = rotationMatrix.inv() * cameraMatrix.inv() * uvPoint;
tempMat2 = rotationMatrix.inv() * tvec;
s = 0 + tempMat2.at<double>(2,0); //before 0 it was 285, which represents the height Zconst
s /= tempMat.at<double>(2,0);
std::cout << "P = " << rotationMatrix.inv() * (s * cameraMatrix.inv() * uvPoint - tvec) << std::endl;
I get these results for the same points I used to obtain my parameters:
Point 0 --> (0.213, 3.391) (it should be (0,0)) ERROR: 3.69 cm
Point 1 --> (-68.28, -112.82) (it should be (-80, -132)) ERROR: 17.49 cm
Point 2 --> (84.48, 137.61) (it should be (120, 188)) ERROR: 49.62 cm
The rest of the points also show errors that are too big... I have used more points, but the results do not improve. I don't know where I went wrong; could anyone help me?
Thanks in advance.
It looks like you may be effectively undistorting your image twice from solvePnP's perspective. This is due to passing in the distortion coefficients along with point correspondences that are already derived from an undistorted image.
Try passing the actual camera matrix from your calibration to solvePnP instead of an identity matrix, but still pass NULL for the distortion coefficients to avoid the double undistortion.
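A minimal sketch of that suggestion, assuming the image points were detected on the already-undistorted image (variable names as in the question):
//Use the calibrated camera matrix, but no distortion model,
//since imagePointsPix already come from the undistorted image
cv::solvePnP(objectPointsCm, imagePointsPix, cameraMatrix,
             cv::noArray(), rvec, tvec, false, cv::SOLVEPNP_ITERATIVE);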
Finally, I found out that the error was caused by the distortion coefficients, i.e. by my calibration. I set the cameraMatrix to the identity matrix (eye(3)) and distCoeffs to NULL so that solvePnP assumed I had a perfect camera. Using this approach I obtained a much lower error. I will have to do a better calibration.
I am new to VisualSFM and computer vision. I am wondering how I can compute the camera pose (the direction in which the camera is looking) with respect to the world coordinate system, and also the camera location with respect to the world coordinate system. The output in the camera_v2.txt file is as follows:
The format of each camera is as follows:
Filename (of the undistorted image in visualize folder)
Original filename
Focal Length (of the undistorted image)
2-vec Principal Point (image center)
3-vec Translation T (as in P = K[R T])
3-vec Camera Position C (as in P = K[R -RC])
3-vec Axis Angle format of R
4-vec Quaternion format of R
3x3 Matrix format of R
[Normalized radial distortion] = [radial distortion] * [focal length]^2
3-vec Lat/Lng/Alt from EXIF
Am I right in assuming that C is the position of the camera with respect to world coordinates? In that case, what is the direction in which the camera is looking?
'R' and 'T' (commonly denoted as 't') are the extrinsic camera calibration [R|t]. The camera center in 3D coordinates is given by -R^-1 * t = -R^T * t (R^T is the transpose of R; for a rotation matrix the inverse equals the transpose). R is the rotation matrix, which can be given in (Rodrigues) axis-angle, quaternion, or plain 3x3 matrix format; it indeed expresses how the camera is oriented in 3D space. Watch out for how the 3D Y and Z axes are chosen; different conventions exist.
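A small sketch of both computations in OpenCV terms (R and t here are assumed to be a 3x3 and a 3x1 cv::Mat of type CV_64F; the names are illustrative):
//Camera center in world coordinates: C = -R^T * t
cv::Mat C = -R.t() * t;
//Viewing direction (the camera's +Z axis) expressed in world coordinates:
//it is R^T * (0, 0, 1)^T, i.e. the third row of R written as a column
cv::Mat viewDir = R.row(2).t();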