I want to perform camera calibration with OpenCV C++ API, using a set of known world to image point matches.
OpenCV has a function called cv::calibrateCamera, as documented here. The documentation clearly mentions that the function will deduce the
intrinsic camera matrix for planar objects and that it expects the user
to specify the matrix for non-planar 3D environments.
In my point correspondences, the world coordinates are not planar. And I do not have a qualified guess for the internal camera matrix.
How would I go about calibrating the camera in this case?
Currently, I am using a simple DLT-based approach for the calculation, using the cv::SVD::solveZ function. But I would like to use the non-linear estimation that OpenCV performs.
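For context, the DLT step I use looks roughly like the sketch below. This is a simplified illustration rather than my exact code (the helper name and the minimum of 6 correspondences are only assumptions for this example):
#include <opencv2/core.hpp>
#include <vector>

// Estimate a 3x4 projection matrix P from 3D-2D correspondences via DLT.
// cv::SVD::solveZ finds the unit vector p minimising |A*p|; reshaped, it is P.
cv::Mat dltProjectionMatrix(const std::vector<cv::Point3f>& world,
                            const std::vector<cv::Point2f>& image)
{
    CV_Assert(world.size() == image.size() && world.size() >= 6);
    cv::Mat A = cv::Mat::zeros(2 * (int)world.size(), 12, CV_64F);
    for (int i = 0; i < (int)world.size(); ++i) {
        const double X = world[i].x, Y = world[i].y, Z = world[i].z;
        const double u = image[i].x, v = image[i].y;
        double* r1 = A.ptr<double>(2 * i);
        double* r2 = A.ptr<double>(2 * i + 1);
        // Row for u: [X Y Z 1  0 0 0 0  -uX -uY -uZ -u]
        r1[0] = X; r1[1] = Y; r1[2] = Z; r1[3] = 1;
        r1[8] = -u * X; r1[9] = -u * Y; r1[10] = -u * Z; r1[11] = -u;
        // Row for v: [0 0 0 0  X Y Z 1  -vX -vY -vZ -v]
        r2[4] = X; r2[5] = Y; r2[6] = Z; r2[7] = 1;
        r2[8] = -v * X; r2[9] = -v * Y; r2[10] = -v * Z; r2[11] = -v;
    }
    cv::Mat p;
    cv::SVD::solveZ(A, p);     // 12x1 null-space vector of A
    return p.reshape(1, 3);    // 3x4 projection matrix P
}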
This page explains how to perform camera auto-calibration. It includes a method based on the Kruppa equations, which appears to be solvable with the kind of non-linear techniques you are after.
I was in the same situation: I have a non-planar 3D target; however, I wanted to use OpenCV's non-linear LM optimization for the calibration process. (Zhang's initialization method used by OpenCV only allows for planar calibration targets.)
What you can do is extract the camera matrix from your own DLT result and use it as an initial guess for calibrateCamera. It is sufficient to do this for a single pair only (camera points - object points). Even though the other pairs might produce other camera matrices, they will hopefully be similar, and you need that matrix only for initialization anyway.
Note, I do assume, though, that with your own DLT you obtain a projection matrix P which maps homogeneous world points X to homogeneous image points x via x = P * X.
This would be the way to go. It is in Python, but you should be able to adapt it to your own needs:
P = YOUR_DLT(imagePoints[0], objectPoints[0])
cameraMatrix, _, _, _, _, _, _ = cv2.decomposeProjectionMatrix(P)
cameraMatrix /= cameraMatrix[2,2] # ensure unit elem[2,2]
cameraMatrix[0,1] = 0 # ensure no skew
cameraMatrix[0,0] = abs(cameraMatrix[0,0]) # ensure positive focal lengths
cameraMatrix[1,1] = abs(cameraMatrix[1,1])
# ensure principal point within image:
cameraMatrix[0,2] = min(resX-1, max(0, cameraMatrix[0,2]))
cameraMatrix[1,2] = min(resY-1, max(0, cameraMatrix[1,2]))
retval, cameraMatrix, distCoeffs, rvecs, tvecs = \
    cv2.calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix, None,
                        flags=cv2.CALIB_USE_INTRINSIC_GUESS)  # use the DLT-derived matrix as the initial guess
Note, since calibrateCamera assumes cameraMatrix[2,2] == 1 and is constrained to positive focal lengths and zero skew, the camera matrix likely needs to be corrected, as shown in the code above.
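If you prefer to stay in C++, a rough, untested adaptation of the sketch above might look like this (YOUR_DLT, resX, resY, imagePoints, objectPoints and imageSize are placeholders from the context above; the CALIB_USE_INTRINSIC_GUESS flag tells calibrateCamera to use the supplied matrix as the initial guess):
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

cv::Mat P = YOUR_DLT(imagePoints[0], objectPoints[0]);   // 3x4, CV_64F
cv::Mat cameraMatrix, rotMatrix, transVect;
cv::decomposeProjectionMatrix(P, cameraMatrix, rotMatrix, transVect);

cameraMatrix /= cameraMatrix.at<double>(2, 2);                            // ensure unit element [2,2]
cameraMatrix.at<double>(0, 1) = 0;                                        // ensure no skew
cameraMatrix.at<double>(0, 0) = std::abs(cameraMatrix.at<double>(0, 0));  // ensure positive focal lengths
cameraMatrix.at<double>(1, 1) = std::abs(cameraMatrix.at<double>(1, 1));
// ensure principal point within image
cameraMatrix.at<double>(0, 2) = std::min(resX - 1.0, std::max(0.0, cameraMatrix.at<double>(0, 2)));
cameraMatrix.at<double>(1, 2) = std::min(resY - 1.0, std::max(0.0, cameraMatrix.at<double>(1, 2)));

cv::Mat distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                 cameraMatrix, distCoeffs, rvecs, tvecs,
                                 cv::CALIB_USE_INTRINSIC_GUESS);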
Related
I'm trying to deskew an image that has an element of known size. Given this image:
I can use aruco::estimatePoseBoard, which returns rotation and translation vectors. Is there a way to use that information to deskew everything that's in the same plane as the marker board? (Unfortunately my linear algebra is rudimentary at best.)
Clarification
I know how to deskew the marker board. What I want to be able to do is deskew the other things (in this case, the cloud-shaped object) in the same plane as the marker board. I'm trying to determine whether or not that's possible and, if so, how to do it. I can already put four markers around the object I want to deskew and use the detected corners as input to getPerspectiveTransform along with the known distance between them. But for our real-world application it may be difficult for the user to place markers exactly. It would be much easier if they could place a single marker board in the frame and have the software deskew the other objects.
Since you tagged OpenCV:
From the image I can see that you have detected the corners of all the black boxes. So just get the four outermost corner points one way or another:
Then it is like this:
std::vector<cv::Point2f> src_points = {/*Fill your 4 corners here*/};
std::vector<cv::Point2f> dst_points = {cv::Point2f(0, 0), cv::Point2f(width, 0), cv::Point2f(width, height), cv::Point2f(0, height)};
auto H = cv::getPerspectiveTransform(src_points, dst_points);
cv::Mat cropped_image;
cv::warpPerspective(full_image, cropped_image, H, cv::Size(width, height));
I was stuck on the assumption that the destination points in the call to getPerspectiveTransform had to be the corners of the output image (as they are in Humam's suggestion). Once it dawned on me that the destination points could be somewhere within the output image, I had my answer.
float boardX = 1240;
float boardY = 1570;
float boardWidth = 1730;
float boardHeight = 1400;
vector<Point2f> destinationCorners;
destinationCorners.push_back(Point2f(boardX + boardWidth, boardY));
destinationCorners.push_back(Point2f(boardX + boardWidth, boardY + boardHeight));
destinationCorners.push_back(Point2f(boardX, boardY + boardHeight));
destinationCorners.push_back(Point2f(boardX, boardY));
Mat h = getPerspectiveTransform(detectedCorners, destinationCorners);
Mat bigImage(image.size() * 3, image.type(), Scalar(0, 50, 50));
warpPerspective(image, bigImage, h, bigImage.size());
This fixed the perspective of the board and everything in its plane. (The waviness of the board is due to the fact that the paper wasn't lying flat in the original photo.)
I'm getting results I don't expect when I use OpenCV 3.0 calibrateCamera. Here is my algorithm:
Load in 30 image points
Load in 30 corresponding world points (coplanar in this case)
Use points to calibrate the camera, just for un-distorting
Un-distort the image points, but don't use the intrinsics (coplanar world points, so intrinsics are dodgy)
Use the undistorted points to find a homography, transforming to world points (can do this because they are all coplanar)
Use the homography and perspective transform to map the undistorted points to the world space
Compare the original world points to the mapped points
The points I have are noisy and only a small section of the image. There are 30 coplanar points from a single view so I can't get camera intrinsics, but should be able to get distortion coefficients and a homography to create a fronto-parallel view.
As expected, the error varies depending on the calibration flags. However, it varies opposite to what I expected. If I allow all variables to adjust, I would expect error to come down. I am not saying I expect a better model; I actually expect over-fitting, but that should still reduce error. What I see though is that the fewer variables I use, the lower my error. The best result is with a straight homography.
I have two suspected causes, but they seem unlikely and I'd like to hear an unadulterated answer before I air them. I have pulled out the code to just do what I'm talking about. It's a bit long, but it includes loading the points.
The code doesn't appear to have bugs; I've used "better" points and it works perfectly. I want to emphasize that the solution here can't be to use better points or perform a better calibration; the whole point of the exercise is to see how the various calibration models respond to different qualities of calibration data.
Any ideas?
Added
To be clear, I know the results will be bad and I expect that. I also understand that I may learn bad distortion parameters, which leads to worse results when testing points that have not been used to train the model. What I don't understand is how the distortion model has more error when using the training set as the test set. That is, cv::calibrateCamera is supposed to choose parameters that reduce error over the training set of points provided, yet it produces more error than if it had simply selected 0s for K1, K2, ... K6, P1, P2. Bad data or not, it should at least do better on the training set. Before I can say the data is not appropriate for this model, I have to be sure I'm doing the best I can with the data available, and I can't say that at this stage.
Here is an example image
The points with the green pins are marked. This is obviously just a test image.
Here is some more example material
In the following, the image is cropped from the big one above. The centre has not changed. This is what happens when I undistort with just the points marked manually from the green pins, allowing only K1 to vary from 0:
Before
After
I would put it down to a bug, but when I use a larger set of points that covers more of the screen, even from a single plane, it works reasonably well. This looks terrible. However, the error is not nearly as bad as you might think from looking at the picture.
// Load image points
std::vector<cv::Point2f> im_points;
im_points.push_back(cv::Point2f(1206, 1454));
im_points.push_back(cv::Point2f(1245, 1443));
im_points.push_back(cv::Point2f(1284, 1429));
im_points.push_back(cv::Point2f(1315, 1456));
im_points.push_back(cv::Point2f(1352, 1443));
im_points.push_back(cv::Point2f(1383, 1431));
im_points.push_back(cv::Point2f(1431, 1458));
im_points.push_back(cv::Point2f(1463, 1445));
im_points.push_back(cv::Point2f(1489, 1432));
im_points.push_back(cv::Point2f(1550, 1461));
im_points.push_back(cv::Point2f(1574, 1447));
im_points.push_back(cv::Point2f(1597, 1434));
im_points.push_back(cv::Point2f(1673, 1463));
im_points.push_back(cv::Point2f(1691, 1449));
im_points.push_back(cv::Point2f(1708, 1436));
im_points.push_back(cv::Point2f(1798, 1464));
im_points.push_back(cv::Point2f(1809, 1451));
im_points.push_back(cv::Point2f(1819, 1438));
im_points.push_back(cv::Point2f(1925, 1467));
im_points.push_back(cv::Point2f(1929, 1454));
im_points.push_back(cv::Point2f(1935, 1440));
im_points.push_back(cv::Point2f(2054, 1470));
im_points.push_back(cv::Point2f(2052, 1456));
im_points.push_back(cv::Point2f(2051, 1443));
im_points.push_back(cv::Point2f(2182, 1474));
im_points.push_back(cv::Point2f(2171, 1459));
im_points.push_back(cv::Point2f(2164, 1446));
im_points.push_back(cv::Point2f(2306, 1474));
im_points.push_back(cv::Point2f(2292, 1462));
im_points.push_back(cv::Point2f(2278, 1449));
// Create corresponding world / object points
std::vector<cv::Point3f> world_points;
for (int i = 0; i < 30; i++) {
world_points.push_back(cv::Point3f(5 * (i / 3), 4 * (i % 3), 0.0f));
}
// Perform calibration
// Flags are set out so they can be commented out and "freed" easily
int calibration_flags = 0
| cv::CALIB_FIX_K1
| cv::CALIB_FIX_K2
| cv::CALIB_FIX_K3
| cv::CALIB_FIX_K4
| cv::CALIB_FIX_K5
| cv::CALIB_FIX_K6
| cv::CALIB_ZERO_TANGENT_DIST
| 0;
// Initialise matrix
cv::Mat intrinsic_matrix = cv::Mat::eye(3, 3, CV_64F); // CV_64F, so double accessors; identity as a placeholder
cv::Mat distortion_coeffs = cv::Mat::zeros(5, 1, CV_64F);
// Rotation and translation vectors
std::vector<cv::Mat> undistort_rvecs;
std::vector<cv::Mat> undistort_tvecs;
// Wrap in an outer vector for calibration
std::vector<std::vector<cv::Point2f>> im_points_v(1, im_points);
std::vector<std::vector<cv::Point3f>> w_points_v(1, world_points);
// Calibrate; only 1 plane, so intrinsics can't be trusted
cv::Size image_size(4000, 3000);
calibrateCamera(w_points_v, im_points_v,
image_size, intrinsic_matrix, distortion_coeffs,
undistort_rvecs, undistort_tvecs, calibration_flags);
// Undistort im_points
std::vector<cv::Point2f> ud_points;
cv::undistortPoints(im_points, ud_points, intrinsic_matrix, distortion_coeffs);
// ud_points have been "unintrinsiced", but we don't know the intrinsics, so reverse that
double fx = intrinsic_matrix.at<double>(0, 0);
double fy = intrinsic_matrix.at<double>(1, 1);
double cx = intrinsic_matrix.at<double>(0, 2);
double cy = intrinsic_matrix.at<double>(1, 2);
for (std::vector<cv::Point2f>::iterator iter = ud_points.begin(); iter != ud_points.end(); iter++) {
iter->x = iter->x * fx + cx;
iter->y = iter->y * fy + cy;
}
// Find a homography mapping the undistorted points to the known world points, ground plane
cv::Mat homography = cv::findHomography(ud_points, world_points);
// Transform the undistorted image points to the world points (2d only, but z is constant)
std::vector<cv::Point2f> estimated_world_points;
std::cout << "homography" << homography << std::endl;
cv::perspectiveTransform(ud_points, estimated_world_points, homography);
// Work out error
double sum_sq_error = 0;
for (int i = 0; i < 30; i++) {
double err_x = estimated_world_points.at(i).x - world_points.at(i).x;
double err_y = estimated_world_points.at(i).y - world_points.at(i).y;
sum_sq_error += err_x*err_x + err_y*err_y;
}
std::cout << "Sum squared error is: " << sum_sq_error << std::endl;
I would take random samples of the 30 input points, compute a homography for each sample along with the errors under the estimated homographies (a RANSAC-style scheme), and verify the consensus between error levels and homography parameters. This can serve simply as a verification of the global optimisation process. I know that might seem unnecessary, but it is a sanity check for how sensitive the procedure is to the input (noise levels, location).
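A rough sketch of what I mean, reusing the ud_points / world_points names from the question (my own illustration, not tested against the actual data):
#include <opencv2/calib3d.hpp>
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

std::mt19937 rng(42);
std::vector<int> idx(ud_points.size());
std::iota(idx.begin(), idx.end(), 0);

for (int trial = 0; trial < 100; ++trial) {
    std::shuffle(idx.begin(), idx.end(), rng);
    std::vector<cv::Point2f> img_sub, wld_sub;
    for (int k = 0; k < 15; ++k) {   // random half of the 30 points
        img_sub.push_back(ud_points[idx[k]]);
        wld_sub.push_back(cv::Point2f(world_points[idx[k]].x, world_points[idx[k]].y));
    }
    cv::Mat H = cv::findHomography(img_sub, wld_sub);
    // Evaluate the reprojection error of H on the held-out points, then
    // compare error levels and homography parameters across trials.
}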
Also, it seems logical that fixing most of the variables gives you the lowest errors, since there are fewer degrees of freedom in the minimization process. I would try fixing different ones to establish another consensus. At least this would let you know which variables are the most sensitive to the noise levels of the input.
Hopefully, such a small section of the image is close to the image centre, as that incurs the least lens distortion. Is using a different distortion model possible in your case? A more viable way is to adapt the number of distortion parameters to the position of the pattern with respect to the image centre.
Without knowing the constraints of the algorithm, I might have misunderstood the question; that is also a possibility, in which case I can roll back.
I would rather have posted this as a comment, but I do not have enough reputation points.
OpenCV runs the Levenberg-Marquardt algorithm inside calibrateCamera.
https://en.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm
This algorithm works fine for problems with a single minimum. With a single image, points located close to each other, and a high-dimensional problem (n = number of coefficients), the algorithm may be unstable (especially with a wrong initial guess of the camera matrix). The convergence of the algorithm is well described here:
https://na.math.kit.edu/download/papers/levenberg.pdf
As you wrote, the error depends on the calibration flags - the number of flags changes the dimension of the problem being optimized.
Camera calibration also calculates the pose of the camera, which will be bad in models with a wrong calibration matrix.
As a solution I suggest changing the approach. You don't need to calculate the camera matrix and pose in this step. Since you know that the points are located on a plane, you can use the 3D-2D plane projection equation to determine the distribution type of the points. By distribution I mean that all points will be located evenly on some kind of trapezoid.
Then you can use cv::undistort with different distCoeffs on your test image and calculate the image point distribution and the distribution error.
The last step would be to run these steps as a target function for some optimization algorithm, with the distortion coefficients being optimized.
This is not the easiest solution, but I hope it will help you.
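A minimal sketch of that idea, adapted to the marked points rather than the full image (the helper name and structure are my own; the camera matrix is whatever nominal guess you use, and the cost would be fed to an optimizer of your choice):
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

double costForDistCoeffs(const cv::Mat& distCoeffs,            // 1x5, CV_64F
                         const std::vector<cv::Point2f>& imgPts,
                         const std::vector<cv::Point2f>& worldPts,
                         const cv::Mat& cameraMatrix)
{
    std::vector<cv::Point2f> und;
    // Passing the camera matrix again as P keeps the result in pixel coordinates
    cv::undistortPoints(imgPts, und, cameraMatrix, distCoeffs,
                        cv::noArray(), cameraMatrix);

    // Fit a homography to the (planar) world points and measure the residual
    cv::Mat H = cv::findHomography(und, worldPts);
    std::vector<cv::Point2f> mapped;
    cv::perspectiveTransform(und, mapped, H);

    double err = 0;
    for (size_t i = 0; i < mapped.size(); ++i) {
        cv::Point2f d = mapped[i] - worldPts[i];
        err += d.x * d.x + d.y * d.y;
    }
    return err;
}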
I am trying to do some image analysis using an inverse perspective map. I used the OpenCV functions getTransform and findHomography to generate a transformation matrix and apply it to the source image. This works well and I am able to get the points I want from the image. The problem is, I don't know how I can take individual point values and undo the transform to draw them back on the original picture. I want to undo the transform only for this set of points to find their original location. How does one do this?
The points are in the form Point(x,y) from the OpenCV library.
To invert a homography (e.g. a perspective transformation) you typically just invert the transformation matrix.
So, to transform some points back from your destination image to your source image, you invert the transformation matrix and transform those points with the result. To transform a point with a transformation matrix, you multiply the matrix by the point written as a homogeneous column vector, possibly followed by a de-homogenization.
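As a small illustration of that step (assuming a 3x3 CV_64F matrix H from findHomography and a pixel coordinate (x, y) in the destination image):
// H: 3x3 CV_64F homography; (x, y): a pixel in the destination image
cv::Mat p = (cv::Mat_<double>(3, 1) << x, y, 1.0);            // homogeneous point
cv::Mat q = H.inv() * p;                                      // map back into the source image
cv::Point2f src((float)(q.at<double>(0) / q.at<double>(2)),   // de-homogenization
                (float)(q.at<double>(1) / q.at<double>(2)));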
Luckily, OpenCV provides not only the warpAffine/warpPerspective methods, which transform each pixel of one image to the other image, but also a method to transform single points.
Use the cv::perspectiveTransform(inputVector, emptyOutputVector, yourTransformation) method to transform a set of points, where:
inputVector is a std::vector<cv::Point2f> (you can use an nx2 or 2xn matrix, too, but that is sometimes error-prone). Alternatively you can use the cv::Point3f type, but I'm not sure whether those would be homogeneous coordinate points or 3D points for a 3D transformation (or maybe both?).
outputVector is an empty std::vector<cv::Point2f> where the result will be stored.
yourTransformation is a double-precision 3x3 cv::Mat (like the one provided by findHomography) transformation matrix (or 4x4 for 3D points).
Here's a Python example:
import cv2
import numpy as np
# trans: 3x3 perspective matrix (e.g. from cv2.findHomography)
# point_original: float32 array of shape (N, 1, 2)

# Forward transform
point_transformed = cv2.perspectiveTransform(point_original, trans)
# Reverse transform
inv_trans = np.linalg.pinv(trans)
round_tripped = cv2.perspectiveTransform(point_transformed, inv_trans)
# Now, round_tripped should be approximately equal to point_original
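For a C++ version of the same round trip (a sketch, assuming H is a 3x3 CV_64F homography, e.g. from findHomography):
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

std::vector<cv::Point2f> srcPts = { cv::Point2f(10, 20), cv::Point2f(30, 40) };
std::vector<cv::Point2f> dstPts, roundTripped;

cv::perspectiveTransform(srcPts, dstPts, H);              // forward transform
cv::perspectiveTransform(dstPts, roundTripped, H.inv());  // reverse transform
// roundTripped should now be approximately equal to srcPts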
You can use cv::perspectiveTransform(inputVector, emptyOutputVector, yourTransformation) to apply a perspective transform to points.
Python: cv2.perspectiveTransform(src, m) → dst
src – input two-channel or three-channel floating-point array; each element is a 2D/3D vector to be transformed.
m – 3x3 or 4x4 floating-point transformation matrix calculated earlier by cv2.getPerspectiveTransform(_src, _dst)
In Python, you have to pass the points in a numpy array, as shown below:
points_to_be_transformed = np.array([[[0, 0]]], dtype=np.float32)
transformed_points = cv2.perspectiveTransform(points_to_be_transformed, m)
transformed_points will have the same shape as the input array points_to_be_transformed.
I have used this code as a basis to detect my rectangular target in a scene. I use ORB and the FLANN matcher. I have been able to draw the bounding box of the detected target in my scene successfully using the findHomography() and perspectiveTransform() functions.
The reference image (img_object in the above code) is a straight view of only the rectangular target. Now the target in my scene image may be tilted forwards or backwards. I want to find out the angle by which it has been tilted. I have read various posts and came to the conclusion that the homography returned by findHomography() can be decomposed into a rotation matrix and a translation vector. I have used code from https://gist.github.com/inspirit/740979, recommended by this link, translated to C++. This is the Zhang SVD decomposition code taken from the camera calibration module of OpenCV. I got the complete explanation of this decomposition code from O'Reilly's Learning OpenCV book.
I also used solvePnP() on the keypoints returned by the matcher to cross-check the rotation matrix and the translation vector returned from the homography decomposition, but they do not seem to be the same.
I already have measurements of the tilts of all my scene images. I found 2 ways to retrieve the angles from the rotation matrix to check how well they match my values.
Given a 3×3 rotation matrix

R = [ r11  r12  r13 ]
    [ r21  r22  r23 ]
    [ r31  r32  r33 ]

the 3 Euler angles are

theta_x = atan2(r32, r33)
theta_y = atan2(-r31, sqrt(r32^2 + r33^2))
theta_z = atan2(r21, r11)
The axis-angle representation: for a general rotation matrix R, its corresponding rotation axis u
and rotation angle θ can be retrieved from:
cos(θ) = (trace(R) − 1) / 2
[u]× = (R − R⊤) / (2 sin(θ))
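In code, the two extractions above could look roughly like this (a sketch with helper names of my own; cv::Rodrigues gives the axis-angle form, the rotation vector being theta times u):
#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <cmath>

// Euler angles (x, y, z convention as in the formulas above), R is 3x3 CV_64F
cv::Vec3d eulerAnglesXYZ(const cv::Mat& R)
{
    double tx = std::atan2(R.at<double>(2, 1), R.at<double>(2, 2));
    double ty = std::atan2(-R.at<double>(2, 0),
                           std::sqrt(R.at<double>(2, 1) * R.at<double>(2, 1) +
                                     R.at<double>(2, 2) * R.at<double>(2, 2)));
    double tz = std::atan2(R.at<double>(1, 0), R.at<double>(0, 0));
    return cv::Vec3d(tx, ty, tz);
}

// Axis-angle via cv::Rodrigues: the output vector's direction is the axis u
// and its norm is the angle theta.
void axisAngle(const cv::Mat& R, cv::Vec3d& axis, double& theta)
{
    cv::Mat rvec;
    cv::Rodrigues(R, rvec);                 // rvec = theta * u
    theta = cv::norm(rvec);
    axis = cv::Vec3d(rvec.at<double>(0), rvec.at<double>(1), rvec.at<double>(2));
    if (theta > 1e-12) { axis[0] /= theta; axis[1] /= theta; axis[2] /= theta; }
}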
I calculated the angles using both methods for the rotation matrices obtained from the homography decomposition and from solvePnP(). All the angles are different and give very unexpected values.
Is there a hole in my understanding? I do not understand where my calculations are wrong. Are there any alternatives I can use?
Why do you expect them to be the same? They are not the same thing at all.
The Euler angles are three angles of rotation about one axis at a time, starting from the world frame.
Rodrigues' formula gives the components of one vector in the world frame, and an angle of rotation about that vector.
I am currently lost in the OpenCV documentation and am looking for some guidance on the possible ordering of functions, or perhaps a function within OpenCV that I haven't come across yet...
I am tracking a laser blob within a camera feed to a location on a projection screen. Up until now I have been using findHomography and then perspectiveTransform to accomplish this; however, the camera I was using had very little distortion. Now I am using a different camera with a noticeable radial distortion. I have used cvCalibrateCamera to get the distortion coefficients, camera matrix, etc., but I am not sure how I should use this data with my current process, or whether I need to use different functions and/or a different ordering of functions from OpenCV altogether. Any suggestions would be appreciated...
My current code that works well (without distortion) is as follows:
Mat homog;
homog = findHomography(Mat(vCameraPoints), Mat(vTargetPoints), CV_RANSAC);
vector<Point2f> cvTrackPoint;
cvTrackPoint.push_back(Point2f(pMapPoint.fX, pMapPoint.fY));
Mat normalizedImageMat;
perspectiveTransform(Mat(cvTrackPoint), normalizedImageMat, homog);
Point2f normalizedImgPt;
normalizedImgPt = Point2f(normalizedImageMat.at<Point2f>(0,0));
normalizedImgPt.x /= szCameraSize.fWidth;
normalizedImgPt.y /= szCameraSize.fHeight;
I then of course multiply normalizedImgPt by my projection screen resolution.
So again, just to clarify: I do have what appears to be good data from calibrateCamera; how would I use this information to factor in the lens distortion? Perhaps the above process won't work. Any help?
Thanks, in advance
If you have acquired the distortion coefficients, then a simple (yet probably suboptimal) way to get back to the non-distorted case would be to undistort the image. The undistorted image is the image a camera with similar intrinsic and extrinsic parameters, but without lens distortion, would acquire.
The corresponding OpenCV function is undistort.
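A minimal sketch of that idea (variable names such as frame, cameraMatrix and distCoeffs are my own, standing for your video frame and the outputs of your calibration):
#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Option 1: undistort the whole frame before blob detection / findHomography
cv::Mat undistorted;
cv::undistort(frame, undistorted, cameraMatrix, distCoeffs);

// Option 2: undistort only the tracked point, keeping pixel coordinates
// by passing the camera matrix again as the new projection matrix P
std::vector<cv::Point2f> raw = { cv::Point2f(pMapPoint.fX, pMapPoint.fY) };
std::vector<cv::Point2f> corrected;
cv::undistortPoints(raw, corrected, cameraMatrix, distCoeffs,
                    cv::noArray(), cameraMatrix);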