How to correctly use cv::triangulatePoints() - c++

I am trying to triangulate some points with OpenCV and I found this cv::triangulatePoints() function. The problem is that there is almost no documentation or examples of it.
I have some doubts about it.
What method does it use?
I've done some research on triangulation and there are several methods (linear, linear LS, eigen, iterative LS, iterative eigen, ...), but I can't find which one OpenCV is using.
How should I use it? It seems that as input it needs a projection matrix and 3xN homogeneous 2D points. I have those defined as std::vector<cv::Point3d> pnts, but as output it needs a 4xN array, and obviously I can't create a std::vector<cv::Point4d> because it doesn't exist, so how should I define the output?
For the second question I tried cv::Mat pnts3D(4,N,CV_64F); and cv::Mat pnts3d;, but neither seems to work (it throws an exception).

1.- The method used is least squares (a linear LS triangulation). There are more sophisticated algorithms, but this one is the most common, as the other methods may fail in some cases (e.g. some of them fail if the points lie on a plane or at infinity).
The method can be found in Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman (p. 312).
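For reference, a sketch of the linear formulation from that chapter: each observation $x_i = (u_i, v_i, 1)^\top$ seen by a camera with projection matrix $P_i$ contributes two rows to a homogeneous system in the unknown 3D point $X$,
$$A X = 0, \qquad A = \begin{bmatrix} u_1 p_1^{3\top} - p_1^{1\top} \\ v_1 p_1^{3\top} - p_1^{2\top} \\ u_2 p_2^{3\top} - p_2^{1\top} \\ v_2 p_2^{3\top} - p_2^{2\top} \end{bmatrix},$$
where $p_i^{k\top}$ is the $k$-th row of $P_i$; the least-squares solution is the singular vector of $A$ associated with its smallest singular value.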
2.-The usage:
cv::Mat pnts3D(1,N,CV_64FC4);
cv::Mat cam0pnts(1,N,CV_64FC2);
cv::Mat cam1pnts(1,N,CV_64FC2);
Fill the two 2-channel point matrices with the points from the images.
cam0 and cam1 are 3x4 camera (projection) matrices containing the intrinsic and extrinsic parameters. You can construct them by multiplying A*RT, where A is the intrinsic parameter matrix and RT is the 3x4 rotation-translation pose matrix.
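For example, a minimal sketch of building one of them (K, R and t here are placeholders for your own calibration results, all CV_64F):
cv::Mat Rt;                    // 3x4 [R|t]
cv::hconcat(R, t, Rt);
cv::Mat cam0 = K * Rt;         // 3x4 projection matrix P = A*RT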
cv::triangulatePoints(cam0,cam1,cam0pnts,cam1pnts,pnts3D);
NOTE: pnts3D NEEDS to be a 4-channel 1xN cv::Mat when defined; an exception is thrown otherwise. Yet the result is a cv::Mat(4,N,CV_64FC1) matrix. Really confusing, but it was the only way I didn't get an exception.
UPDATE: As of version 3.0 (or possibly earlier) this is no longer true, and pnts3D can also be of type Mat(4,N,CV_64FC1) or may be left completely empty (as usual, it is created inside the function).

A small addition to Ander Biguri's answer. You should detect your image points on the original (distorted) image, and then invoke undistortPoints() on cam0pnts and cam1pnts, because cv::triangulatePoints expects the 2D points in normalized coordinates (independent of the camera). cam0 and cam1 should then be just [R|t] matrices; you do not need to multiply them by A.
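A minimal sketch of that variant (K0/K1, dist0/dist1, R0/t0, R1/t1 are placeholders for your own calibration data):
std::vector<cv::Point2d> cam0norm, cam1norm;
cv::undistortPoints(cam0pnts, cam0norm, K0, dist0);   // pixel -> normalized, undistorted
cv::undistortPoints(cam1pnts, cam1norm, K1, dist1);
cv::Mat P0, P1;                                       // only [R|t], no intrinsics
cv::hconcat(R0, t0, P0);
cv::hconcat(R1, t1, P1);
cv::Mat pnts3D;
cv::triangulatePoints(P0, P1, cam0norm, cam1norm, pnts3D);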

Thanks to Ander Biguri! His answer helped me a lot. But since I always prefer the alternative with std::vector, I edited his solution to this:
std::vector<cv::Point2d> cam0pnts;
std::vector<cv::Point2d> cam1pnts;
// You fill them, both with the same size...
// You can pick any of the following 2 (your choice)
// cv::Mat pnts3D(1,cam0pnts.size(),CV_64FC4);
cv::Mat pnts3D(4,cam0pnts.size(),CV_64F);
cv::triangulatePoints(cam0,cam1,cam0pnts,cam1pnts,pnts3D);
So you just need to emplace_back the points as you go. Main advantage: you do not need to know the size N before you start filling them. Unfortunately, there is no cv::Point4f, so pnts3D must be a cv::Mat...
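Either way, the result is homogeneous. Here is a small sketch of recovering Euclidean 3D points from it, assuming the double-precision 4xN layout discussed above:
std::vector<cv::Point3d> points;
for (int i = 0; i < pnts3D.cols; ++i) {
    const double w = pnts3D.at<double>(3, i);          // use at<float> if your build returns CV_32F
    points.emplace_back(pnts3D.at<double>(0, i) / w,
                        pnts3D.at<double>(1, i) / w,
                        pnts3D.at<double>(2, i) / w);
}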

I tried cv::triangulatePoints, but somehow it calculates garbage. I was forced to implement a linear triangulation method manually, which returns a 4x1 matrix for the triangulated 3D point:
// Linear least-squares triangulation (cf. Hartley & Zisserman, ch. 12): each view contributes
// two rows to A*X = b, which is then solved with SVD. Assumes `using namespace cv;`.
Mat triangulate_Linear_LS(Mat mat_P_l, Mat mat_P_r, Mat warped_back_l, Mat warped_back_r)
{
Mat A(4,3,CV_64FC1), b(4,1,CV_64FC1), X(3,1,CV_64FC1), X_homogeneous(4,1,CV_64FC1), W(1,1,CV_64FC1);
W.at<double>(0,0) = 1.0;
// rows 0-1: constraints from the left view, rows 2-3: constraints from the right view
A.at<double>(0,0) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,0) - mat_P_l.at<double>(0,0);
A.at<double>(0,1) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,1) - mat_P_l.at<double>(0,1);
A.at<double>(0,2) = (warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,2) - mat_P_l.at<double>(0,2);
A.at<double>(1,0) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,0) - mat_P_l.at<double>(1,0);
A.at<double>(1,1) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,1) - mat_P_l.at<double>(1,1);
A.at<double>(1,2) = (warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,2) - mat_P_l.at<double>(1,2);
A.at<double>(2,0) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,0) - mat_P_r.at<double>(0,0);
A.at<double>(2,1) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,1) - mat_P_r.at<double>(0,1);
A.at<double>(2,2) = (warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,2) - mat_P_r.at<double>(0,2);
A.at<double>(3,0) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,0) - mat_P_r.at<double>(1,0);
A.at<double>(3,1) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,1) - mat_P_r.at<double>(1,1);
A.at<double>(3,2) = (warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,2) - mat_P_r.at<double>(1,2);
b.at<double>(0,0) = -((warped_back_l.at<double>(0,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,3) - mat_P_l.at<double>(0,3));
b.at<double>(1,0) = -((warped_back_l.at<double>(1,0)/warped_back_l.at<double>(2,0))*mat_P_l.at<double>(2,3) - mat_P_l.at<double>(1,3));
b.at<double>(2,0) = -((warped_back_r.at<double>(0,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,3) - mat_P_r.at<double>(0,3));
b.at<double>(3,0) = -((warped_back_r.at<double>(1,0)/warped_back_r.at<double>(2,0))*mat_P_r.at<double>(2,3) - mat_P_r.at<double>(1,3));
solve(A,b,X,DECOMP_SVD);        // least-squares solution for the inhomogeneous 3D point
vconcat(X,W,X_homogeneous);     // append w = 1 to return a homogeneous 4x1 point
return X_homogeneous;
}
The input parameters are two 3x4 camera projection matrices and a corresponding left/right pixel pair in homogeneous form (x, y, w).
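A hypothetical call, with P_left/P_right standing in for your 3x4 projection matrices and (u_left, v_left) / (u_right, v_right) for a matched pixel pair:
cv::Mat x_l = (cv::Mat_<double>(3,1) << u_left,  v_left,  1.0);
cv::Mat x_r = (cv::Mat_<double>(3,1) << u_right, v_right, 1.0);
cv::Mat X = triangulate_Linear_LS(P_left, P_right, x_l, x_r);   // 4x1 homogeneous point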

In addition to Ginés Hidalgo's comments: if you performed a stereo calibration and obtained an accurate fundamental matrix from it (for example, estimated from a checkerboard), you can use the correctMatches function to refine the detected keypoints before triangulating:
std::vector<cv::Point2f> pt_set1_pt_c, pt_set2_pt_c;
cv::correctMatches(F, pt_set1_pt, pt_set2_pt, pt_set1_pt_c, pt_set2_pt_c);
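If you only have the stereo calibration output (R, T and the intrinsics K1, K2 from cv::stereoCalibrate), here is a hedged sketch of assembling F yourself via F = K2^{-T} [T]x R K1^{-1} (note that stereoCalibrate can also return E and F directly):
cv::Mat Tx = (cv::Mat_<double>(3,3) <<
              0,               -T.at<double>(2),  T.at<double>(1),
              T.at<double>(2),  0,               -T.at<double>(0),
             -T.at<double>(1),  T.at<double>(0),  0);   // skew-symmetric [T]x
cv::Mat E = Tx * R;                                      // essential matrix
cv::Mat F = K2.inv().t() * E * K1.inv();                 // fundamental matrix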

Related

Rotate around a point with a 3x3 matrix and a translation vector

I have a 3x3 matrix in OpenGL format and a translation vector. I get confused when rotating around a point because my rotate function does not consider translation.
For rotating, I assumed I could simply convert the 3x3 matrix into a quaternion, rotate it and then convert it back. This does not work when rotating about anything other than the object's current position.
The normal procedure for 4x4 matrices is:
translate point by -(point - position)
rotate
translate point by (point - position)
Optional advice:
Bullet physics uses this format and that is why I was considering using their data format to store transformations. I do eventually have to convert to a 4x4 matrix to do projection, but that does not matter all that much. Bullet's format is attractive because it removes the useless shear data. Is it worth the bother to keep transformations in bullet format or no?
I came up with one. Just do the matrix math: create the rotation matrix manually or from a quaternion, then multiply the transforms. Bullet does this for you, but I'll show what Bullet does internally too.
(pseudo code)
void Transform::rotateAroundPoint(axis, angle, point)
{
Matrix3x3 mat = createMatrixFromQuaternion(axis, angle);
Transform rotateTransform;
rotateTransform.basis = mat; // <- 3x3 rotation matrix
rotateTransform.origin = Vector3(0,0,0); // <- zero translation vector
// "this" is a pointer to the transform being edited since I decided to write
// this as a member function.
this->origin = this->origin - point; // move the pivot point to the origin
(*this) = rotateTransform * (*this); // apply the rotation
this->origin = this->origin + point; // move back
}
// Internally bullet does basically this
resultTransform.basis = t1.basis * t2.basis;
resultTransform.origin = t1.basis * t2.origin + t1.origin;
I have not tested the code yet. I am a little bit terrified that it may suffer from gimbal lock, so in my actual code I am probably going to just do quaternion math when I have to multiply two basis matrices.
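If you do end up storing everything as Bullet transforms, here is an untested sketch of the same idea written against the actual btTransform API (treat it as a guide rather than drop-in code):
#include <LinearMath/btTransform.h>

// Rotate transform 't' by quaternion 'q' about an arbitrary pivot point.
btTransform rotateAroundPoint(const btTransform& t, const btQuaternion& q, const btVector3& pivot)
{
    const btTransform toOrigin(btQuaternion::getIdentity(), -pivot);  // move the pivot to the origin
    const btTransform rotation(q);                                    // pure rotation, zero origin
    const btTransform back(btQuaternion::getIdentity(), pivot);       // move back
    return back * rotation * toOrigin * t;
}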

OpenCV most efficient way to find a point in a polygon

I have a dataset of 500 cv::Point.
For each point, I need to determine whether it is contained in an ROI modeled by a concave polygon.
This polygon can be quite large (most of the time, it can be contained in a bounding box of 100x400, but it can be larger)
For that number of points and that size of polygon, what is the most efficient way to determine if a point is in a polygon?
using the pointPolygonTest openCV function?
building a mask with drawContours and finding if the point is white or black in the mask?
other solution? (I really want to be accurate, so convex polygons and bounding boxes are excluded).
In general, to be both accurate and efficient, I'd go with a two-step process.
First, a bounding box on the polygon. It's a quick and simple matter to see which points are not inside the box. With that, you can discard several points right off the bat.
Secondly, pointPolygonTest. It's a relatively costly operation, but the first step guarantees that you will only perform it for those points that need better accuracy.
This way you maintain accuracy but speed up the process. The only exception is when most of the points fall inside the bounding box; in that case the first step will hardly ever discard anything, so it won't optimise the algorithm and will actually make it slightly slower.
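A rough sketch of that two-step filter (polygon being a std::vector<cv::Point> and points your 500 query points):
cv::Rect box = cv::boundingRect(polygon);
for (size_t i = 0; i < points.size(); ++i) {
    if (!box.contains(points[i]))
        continue;                       // cheap rejection: definitely outside
    bool inside = cv::pointPolygonTest(polygon, points[i], false) >= 0;  // exact test, >= 0 means inside or on the edge
    // ... use 'inside' ...
}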
Quite some time ago I had exactly the same problem and used the masking approach (the second point in your question). I tested it on datasets containing millions of points and found this solution very effective.
This is faster than pointPolygonTest with and without a bounding box!
Scalar color(0,255,0);
drawContours(image, contours, k, color, CV_FILLED); // k is the index of the contour in the array of arrays 'contours'
for(int y = 0; y < image.rows; y++){
const uchar *ptr = image.ptr(y);
for(int x = 0; x < image.cols; x++){
const uchar * pixel = ptr;
if((int) pixel[1] == 255){
//point is inside contour
}
ptr += 3; // assumes a 3-channel image
}
}
It uses the color to check if the point is inside the contour.
For faster matrix access than Mat::at() we're using pointer access.
In my case this was up to 20 times faster than the pointPolygonTest.
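A related shortcut: since there are only ~500 query points, you can draw the filled contour into a single-channel mask once and then just index the mask at each point instead of scanning the whole image (a sketch, assuming all points lie within the image bounds):
cv::Mat mask = cv::Mat::zeros(image.size(), CV_8UC1);
cv::drawContours(mask, contours, k, cv::Scalar(255), CV_FILLED);
for (size_t i = 0; i < points.size(); ++i) {
    bool inside = mask.at<uchar>(points[i]) > 0;   // Mat::at(cv::Point) takes (x, y)
    // ... use 'inside' ...
}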

Two 3D point cloud transformation matrix

I'm trying to estimate the rigid transformation matrix between two 3D point clouds.
The two point clouds are:
keypoints from the Kinect (kinect_keypoints).
keypoints from a 3D object (a box) (object_keypoints).
I have tried two options:
[1]. Implementation of the algorithm to find rigid transformation.
**1. Calculate the centroid of each point cloud.**
**2. Center the points according to the centroids.**
**3. Calculate the covariance matrix.**
**4. Calculate the rotation matrix from the SVD decomposition of the covariance matrix:**
cvSVD( &_H, _W, _U, _V, CV_SVD_U_T );
cvMatMul( _V, _U, &_R );
(a compact C++ sketch of these four steps follows after the matrix code below)
float _Tsrc[16] = { 1.f,0.f,0.f,0.f,
0.f,1.f,0.f,0.f,
0.f,0.f,1.f,0.f,
-_gc_src.x,-_gc_src.y,-_gc_src.z,1.f }; // 1: src points to the origin
float _S[16] = { _scale,0.f,0.f,0.f,
0.f,_scale,0.f,0.f,
0.f,0.f,_scale,0.f,
0.f,0.f,0.f,1.f }; // 2: scale the src points
float _R_src_to_dst[16] = { _Rdata[0],_Rdata[3],_Rdata[6],0.f,
_Rdata[1],_Rdata[4],_Rdata[7],0.f,
_Rdata[2],_Rdata[5],_Rdata[8],0.f,
0.f,0.f,0.f,1.f }; // 3: rotate the src points
float _Tdst[16] = { 1.f,0.f,0.f,0.f,
0.f,1.f,0.f,0.f,
0.f,0.f,1.f,0.f,
_gc_dst.x,_gc_dst.y,_gc_dst.z,1.f }; // 4: from src to dst
// _Tdst * _R_src_to_dst * _S * _Tsrc
mul_transform_mat( _S, _Tsrc, Rt );
mul_transform_mat( _R_src_to_dst, Rt, Rt );
mul_transform_mat( _Tdst, Rt, Rt );
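For reference, here is the compact sketch of steps 1-4 mentioned above, written with the C++ API (a Kabsch-style rigid alignment without scale; src/dst are assumed to be std::vector<cv::Point3d> with one-to-one correspondences):
cv::Point3d cs(0,0,0), cd(0,0,0);
for (size_t i = 0; i < src.size(); ++i) { cs += src[i]; cd += dst[i]; }
cs *= 1.0 / src.size();                              // 1. centroids
cd *= 1.0 / dst.size();
cv::Mat H = cv::Mat::zeros(3, 3, CV_64F);
for (size_t i = 0; i < src.size(); ++i) {            // 2.-3. covariance of the centered points
    cv::Mat a = (cv::Mat_<double>(3,1) << src[i].x - cs.x, src[i].y - cs.y, src[i].z - cs.z);
    cv::Mat b = (cv::Mat_<double>(3,1) << dst[i].x - cd.x, dst[i].y - cd.y, dst[i].z - cd.z);
    H += a * b.t();
}
cv::SVD svd(H);                                      // 4. rotation from the SVD
cv::Mat R = svd.vt.t() * svd.u.t();
if (cv::determinant(R) < 0) {                        // guard against a reflection
    cv::Mat Vt = svd.vt.clone();
    Vt.row(2) *= -1.0;
    R = Vt.t() * svd.u.t();
}
cv::Mat t = (cv::Mat_<double>(3,1) << cd.x, cd.y, cd.z)
          - R * (cv::Mat_<double>(3,1) << cs.x, cs.y, cs.z);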
[2]. Use estimateAffine3D from OpenCV.
double _poseTrans[12];
std::vector<cv::Point3f> first, second; // first --> kinect_keypoints, second --> object_keypoints
cv::Mat aff(3,4,CV_64F, _poseTrans);    // 3x4 affine transform wrapping the double buffer
std::vector<uchar> inliers;
cv::estimateAffine3D( first, second, aff, inliers );
double _poseTrans2[16];
for (int i=0; i<12; ++i)
{
_poseTrans2[i] = _poseTrans[i];
}
_poseTrans2[12] = 0.0;
_poseTrans2[13] = 0.0;
_poseTrans2[14] = 0.0;
_poseTrans2[15] = 1.0;
The problem with the first one is that the transformation is not correct, and with the second one, if I multiply the Kinect point cloud by the resulting matrix, some values are infinite.
Is there any solution from any of these options? Or an alternative one, apart from the PCL?
Thank you in advance.
EDIT: This is an old post, but an answer might be useful to someone ...
Your first approach can work in very specific cases (ellipsoid point clouds or very elongated shapes), but it is not appropriate for point clouds acquired by the Kinect. As for your second approach, I am not familiar with the OpenCV function estimateAffine3D, but I suspect it assumes the two input point clouds correspond to the same physical points, which is not the case if you used a Kinect point cloud (which contains noisy measurements) and points from an ideal 3D model (which are perfect).
You mentioned that you are aware of the Point Cloud Library (PCL) and do not want to use it. If possible, I think you might want to reconsider this, because PCL is much more appropriate than OpenCV for what you want to do (check the tutorial list, one of them covers exactly what you want to do: Aligning object templates to a point cloud).
However, here are some alternative solutions to your problem:
If your two point clouds correspond exactly to the same physical points, your second approach should work, but you can also check out Absolute Orientation (e.g. Matlab implementation)
If your two point clouds do not correspond to the same physical points, you actually want to register (or align) them and you can use either:
one of the many variants of the Iterative Closest Point (ICP) algorithm, if you know approximately the position of your object. Wikipedia Entry
3D feature points such as 3D SIFT, 3D SURF or NARF feature points, if you have no clue about your object's position.
Again, all these approaches are already implemented in PCL.

Finding extrinsics between cameras

I'm in the situation where I need to find the relative camera poses between two/or more cameras based on image correspondences (so the cameras are not in the same point). To solve this I tried the same approach as described here (code below).
cv::Mat calibration_1 = ...;
cv::Mat calibration_2 = ...;
cv::Mat calibration_target = calibration_1;
calibration_target.at<float>(0, 2) = 0.5f * frame_width; // principal point
calibration_target.at<float>(1, 2) = 0.5f * frame_height; // principal point
auto fundamental_matrix = cv::findFundamentalMat(left_matches, right_matches, CV_RANSAC);
fundamental_matrix.convertTo(fundamental_matrix, CV_32F);
cv::Mat essential_matrix = calibration_2.t() * fundamental_matrix * calibration_1;
cv::SVD svd(essential_matrix);
cv::Matx33f w(0,-1,0,
1,0,0,
0,0,1);
cv::Matx33f w_inv(0,1,0,
-1,0,0,
0,0,1);
cv::Mat rotation_between_cameras = svd.u * cv::Mat(w) * svd.vt; //HZ 9.19
But in most of my cases I get extremely weird results. So my next thought was using a full-fledged bundle adjuster (which should do what I am looking for?!). Currently my only big dependency is OpenCV, and it only has an undocumented bundle adjustment implementation.
So the question is:
Is there a bundle adjuster which has no dependencies and uses a licence that allows commercial use?
Are there other easy ways to find the extrinsics?
Are objects with very different distances to the cameras a problem? (heavy parallax)
Thanks in advance
I'm also working on the same problem and facing similar issues.
Here are some suggestions:
Modify Essential Matrix Before Decomposition:
Take [U, W, Vt] = SVD(E) and rebuild E' = U * diag(s, s, 0) * Vt, where s = (W(0,0) + W(1,1)) / 2.
2-Stage Fundamental Matrix Estimation:
Recalculate the fundamental matrix using only the RANSAC inliers from the first pass.
These steps should make the rotation estimation more robust to noise.
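A sketch of the first suggestion, reusing the CV_32F matrices from the question:
cv::SVD svd_e(essential_matrix);
float s = (svd_e.w.at<float>(0) + svd_e.w.at<float>(1)) / 2.0f;
cv::Mat D = (cv::Mat_<float>(3,3) << s, 0, 0,
                                     0, s, 0,
                                     0, 0, 0);
cv::Mat E_fixed = svd_e.u * D * svd_e.vt;   // enforce two equal singular values and one zero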
You have to generate the 4 different solutions and select the one for which the largest number of points ends up with positive Z coordinates (in front of both cameras). The solutions are generated by inverting the sign of the essential matrix and by substituting w with w_inv, which you did not do even though you calculated w_inv. Are you reusing somebody else's code?
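If you are on OpenCV 3.0 or newer, you can also let the library enumerate and test the candidates for you (a sketch; recoverPose takes a single camera matrix for both views, so it only applies if the two intrinsics are the same):
cv::Mat R1, R2, t;
cv::decomposeEssentialMat(essential_matrix, R1, R2, t);   // the four candidates are (R1, +/-t) and (R2, +/-t)
cv::Mat R, t_best, mask;
cv::recoverPose(essential_matrix, left_matches, right_matches, calibration_1, R, t_best, mask);
// recoverPose performs the positive-depth (cheirality) check and returns the surviving (R, t)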

What algorithm does OpenCV's Bayer conversion use?

I would like to implement a GPU Bayer to RGB image conversion algorithm, and I was wondering what algorithm the OpenCV cvtColor function uses. Looking at the source I see what appears to be a variable number of gradients algorithm and a basic algorithm that could maybe be bilinear interpolation? Does anyone have experience with this that they could share with me, or perhaps know of GPU code to convert from Bayer to BGR format?
The source code is in imgproc/src/color.cpp. I'm looking for a link to it. Bayer2RGB_ and Bayer2RGB_VNG_8u are the functions I'm looking at.
Edit: Here's a link to the source.
http://code.opencv.org/projects/opencv/repository/revisions/master/entry/modules/imgproc/src/color.cpp
I've already implemented a bilinear interpolation algorithm, but it doesn't seem to work very well for my purposes. The picture looks ok, but I want to compute HOG features from it and in that respect it doesn't seem like a good fit.
The default is a simple 4-way linear interpolation; the variable-number-of-gradients algorithm is used if you specify the VNG conversion code.
see ..\modules\imgproc\src\color.cpp for details.
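For reference, a minimal way to compare the two CPU paths (the exact Bayer pattern constant depends on your sensor's layout):
cv::Mat bgr_bilinear, bgr_vng;
cv::cvtColor(bayer8u, bgr_bilinear, CV_BayerBG2BGR);       // default demosaicing
cv::cvtColor(bayer8u, bgr_vng,      CV_BayerBG2BGR_VNG);   // variable number of gradients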
I submitted a simple linear CUDA Bayer->RGB(A) conversion to OpenCV; I haven't followed whether it has been accepted, but it should be in the bug tracker.
It's based on the code in the CUDA Bayer/CFA demosaicing example.
Here is a sample of how to use cv::gpu in your own code.
/*-------RG ccd BGRA output ----------------------------*/
__global__ void bayerRG(const cv::gpu::DevMem2Db in, cv::gpu::PtrStepb out)
{
// Note: called for every pair, so x/y are the start of the cell; use x+1, y+1 for the right/bottom pair
// R G
// G B
// src
int x = 2 * ((blockIdx.x*blockDim.x) + threadIdx.x);
int y = 2 * ((blockIdx.y*blockDim.y) + threadIdx.y);
uchar r,g,b;
// 'R'
r = (in.ptr(y)[x]);
g = (in.ptr(y)[x-1]+in.ptr(y)[x+1]+(in.ptr(y-1)[x]+in.ptr(y+1)[x]))/4;
b = (in.ptr(y-1)[x-1]+in.ptr(y-1)[x+1]+(in.ptr(y+1)[x-1]+in.ptr(y+1)[x+1]))/4;
((uchar4*)out.ptr(y))[x] = make_uchar4( b,g,r,0xff);
// 'G' in R
r = (in.ptr(y)[x]+in.ptr(y)[x+2])/2;
g = (in.ptr(y)[x+1]);
b = (in.ptr(y-1)[x+1]+in.ptr(y+1)[x+1])/2;
((uchar4*)out.ptr(y))[x+1] = make_uchar4( b,g,r,0xff);
// 'G' in B
r = (in.ptr(y)[x]+in.ptr(y+2)[x])/2;
g = (in.ptr(y+1)[x]);
b = (in.ptr(y+1)[x-1]+in.ptr(y+1)[x+2])/2;
((uchar4*)out.ptr(y+1))[x] = make_uchar4( b,g,r,0xff);
// 'B'
r = (in.ptr(y)[x]+in.ptr(y)[x+2]+in.ptr(y+2)[x]+in.ptr(y+2)[x+2])/4;
g = (in.ptr(y+1)[x]+in.ptr(y+1)[x+2]+in.ptr(y)[x+1]+in.ptr(y+2)[x+1])/4;
b = (in.ptr(y+1)[x+1]);
((uchar4*)out.ptr(y+1))[x+1] = make_uchar4( b,g,r,0xff);
}
/* called from */
extern "C" void cuda_bayer(const cv::gpu::DevMem2Db& img, cv::gpu::PtrStepb out)
{
dim3 threads(16,16);
dim3 grid((img.cols/2)/(threads.x), (img.rows/2)/(threads.y));
bayerRG<<<grid,threads>>>(img,out);
cudaThreadSynchronize();
}
Currently, to my knowledge, the best debayer algorithm out there is DFPD (directional filtering with a posteriori decision), as explained in this paper. The paper is quite explanatory and you can easily prototype this approach in Matlab. Here's a blog post comparing the results of DFPD to a debayer based on a linear approach. You can clearly see the improvement in artifacts, colors and sharpness.
As far as I know, at this point it is using adaptive homogeneity directed demosaicing, explained in a paper by Hirakawa and in many other sources on the web.