Extracting 3D coordinates given 2D image points, depth map and camera calibration matrices - c++

I have a set of 2D image keypoints output by the OpenCV FAST corner detector. Using an Asus Xtion I also have a time-synchronised depth map, with all camera calibration parameters known. Using this information I would like to extract a set of 3D coordinates (a point cloud) in OpenCV.
Can anyone give me any pointers regarding how to do so? Thanks in advance!

Nicolas Burrus has created a great tutorial for depth sensors like the Kinect.
http://nicolas.burrus.name/index.php/Research/KinectCalibration
I'll copy & paste the most important parts:
Mapping depth pixels with color pixels
The first step is to undistort rgb and depth images using the
estimated distortion coefficients. Then, using the depth camera
intrinsics, each pixel (x_d,y_d) of the depth camera can be projected
to metric 3D space using the following formula:
P3D.x = (x_d - cx_d) * depth(x_d,y_d) / fx_d
P3D.y = (y_d - cy_d) * depth(x_d,y_d) / fy_d
P3D.z = depth(x_d,y_d)
with fx_d, fy_d, cx_d and cy_d the intrinsics of the depth camera.
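In your case this translates almost directly into code. A minimal, hedged C++ sketch (assuming a CV_16UC1 depth map in millimetres from the Xtion; the intrinsics and the depth scale are placeholders you must replace with your own calibration values):
#include <opencv2/opencv.hpp>
#include <vector>
// Hedged sketch: fx_d, fy_d, cx_d, cy_d and the mm-to-metre scale are placeholders;
// assumes a CV_16UC1 depth map where 0 means "no reading".
std::vector<cv::Point3f> keypointsTo3D(const std::vector<cv::KeyPoint>& keypoints,
                                       const cv::Mat& depthMap,
                                       float fx_d, float fy_d, float cx_d, float cy_d)
{
    std::vector<cv::Point3f> cloud;
    for (size_t i = 0; i < keypoints.size(); ++i)
    {
        int u = cvRound(keypoints[i].pt.x), v = cvRound(keypoints[i].pt.y);
        float z = depthMap.at<unsigned short>(v, u) * 0.001f; // mm -> metres
        if (z <= 0.0f) continue;                              // skip invalid depth
        cloud.push_back(cv::Point3f((u - cx_d) * z / fx_d,    // P3D.x
                                    (v - cy_d) * z / fy_d,    // P3D.y
                                    z));                      // P3D.z
    }
    return cloud;
}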
If you are further interested in stereo mapping (the values below are for the Kinect):
We can then reproject each 3D point on the color image and get its
color:
P3D' = R.P3D + T
P2D_rgb.x = (P3D'.x * fx_rgb / P3D'.z) + cx_rgb
P2D_rgb.y = (P3D'.y * fy_rgb / P3D'.z) + cy_rgb
with R and T the rotation and translation parameters estimated during
the stereo calibration.
The parameters I could estimate for my Kinect are:
Color
fx_rgb 5.2921508098293293e+02
fy_rgb 5.2556393630057437e+02
cx_rgb 3.2894272028759258e+02
cy_rgb 2.6748068171871557e+02
k1_rgb 2.6451622333009589e-01
k2_rgb -8.3990749424620825e-01
p1_rgb -1.9922302173693159e-03
p2_rgb 1.4371995932897616e-03
k3_rgb 9.1192465078713847e-01
Depth
fx_d 5.9421434211923247e+02
fy_d 5.9104053696870778e+02
cx_d 3.3930780975300314e+02
cy_d 2.4273913761751615e+02
k1_d -2.6386489753128833e-01
k2_d 9.9966832163729757e-01
p1_d -7.6275862143610667e-04
p2_d 5.0350940090814270e-03
k3_d -1.3053628089976321e+00
Relative transform between the sensors (in meters)
R [ 9.9984628826577793e-01, 1.2635359098409581e-03, -1.7487233004436643e-02,
-1.4779096108364480e-03, 9.9992385683542895e-01, -1.2251380107679535e-02,
1.7470421412464927e-02, 1.2275341476520762e-02, 9.9977202419716948e-01 ]
T [ 1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02 ]
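If you want to feed these numbers straight into OpenCV calls (cv::undistort, the reprojection above, and so on), here is a hedged sketch of the standard matrix layouts, using the Kinect depth-camera values quoted above as placeholders for your own Xtion calibration:
#include <opencv2/opencv.hpp>
// Hedged sketch: the numbers are the Kinect values quoted above, arranged in the
// standard OpenCV camera-matrix and (k1, k2, p1, p2, k3) distortion layouts.
// Replace them with your own Asus Xtion calibration; the RGB camera follows the same pattern.
cv::Mat K_depth = (cv::Mat_<double>(3, 3) <<
    5.9421434211923247e+02, 0,                      3.3930780975300314e+02,
    0,                      5.9104053696870778e+02, 2.4273913761751615e+02,
    0,                      0,                      1);
cv::Mat dist_depth = (cv::Mat_<double>(5, 1) <<
    -2.6386489753128833e-01, 9.9966832163729757e-01,
    -7.6275862143610667e-04, 5.0350940090814270e-03, -1.3053628089976321e+00);
cv::Mat R_depth_to_rgb = (cv::Mat_<double>(3, 3) <<
     9.9984628826577793e-01, 1.2635359098409581e-03, -1.7487233004436643e-02,
    -1.4779096108364480e-03, 9.9992385683542895e-01, -1.2251380107679535e-02,
     1.7470421412464927e-02, 1.2275341476520762e-02,  9.9977202419716948e-01);
cv::Mat T_depth_to_rgb = (cv::Mat_<double>(3, 1) <<
    1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02);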

Related

Convert Kinect depth intensity to distance in meters

I'm working on Kinect v1 depth images.
How do I find the distance in meters corresponding to the depth intensity value of each pixel?
The intensity value ranges from 0 to 255 since it is a grayscale image, and I don't have the raw depth data.
I've tried various ways to get the distance, such as using the following formulas:
- 1.0 / (raw_depth * -0.0030711016 + 3.3309495161)
- 0.1236 * tan(rawDisparity / 2842.5 + 1.1863)
I've also tried to get the raw data by using:
raw = (255 - depthIntensity)/256*2047
How do I solve this problem?
The Kinect actually sends something akin to a disparity image over USB. Both OpenNI and libfreenect are capable of converting that to a depth image using parameters reported by the device (baseline, focal length, and distance to the reference plane, IIRC), e.g. CV_CAP_PROP_OPENNI_BASELINE.
In math form, this is how depth is recovered from disparity:
depth = baseline * focal_length / disparity
The depth corresponds to the Z axis of the current image frame.
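As a hedged snippet (the baseline in metres and the focal length in pixels are values you would have to obtain from the device/driver, e.g. via the OpenNI capture properties):
// Hedged sketch: baseline (metres) and focal length (pixels) must come from the
// device/driver; the function simply applies depth = baseline * f / disparity.
float disparityToDepth(float disparity, float baseline_m, float focalLength_px)
{
    if (disparity <= 0.0f) return 0.0f;              // no valid measurement
    return baseline_m * focalLength_px / disparity;  // depth along the camera Z axis, in metres
}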

C++ / OpenCV - Depth map issue: Items in the Point Cloud have some distortions

I want to create a depth map in order to obtain the 3D position of each pixel, so that I can get the 3D position of selected items in my picture. To check whether my depth data is correct, I visualize it with MeshLab.
I use the stereo data of KITTI dataset so the images are rectified and the calibration for each camera is provided.
The process is the following:
Left image + right image --> compute the disparity using semi-global block matching (SGBM) --> compute the depth map using cv::reprojectImageTo3D(), with Q initialized from the calibration parameters via this function:
cv::stereoRectify(cameraMatrix1, distCoeffs1, cameraMatrix2, distCoeffs2, imgSize, R, T, R1, R2, P1, P2, Q);
My problem is the following:
The road is fine, but the sign board has some distortion. I don't understand why; I tried to change the parameters, but without success, and I always get this distortion. It's annoying because I cannot compute a good 3D position of the sign board.
I also tried classic block matching, but the result is the same and not as good as the semi-global one.
However, my disparity map looks like this (which seems good to me):
The parameters of the disparity computation are the following:
StereoSGBM sgbm;                                    // OpenCV 2.x semi-global block matcher
sgbm.SADWindowSize = 3;                             // matched block size
sgbm.numberOfDisparities = 128;                     // disparity search range (multiple of 16)
sgbm.preFilterCap = 10;                             // truncation value for the prefiltered pixels
sgbm.minDisparity = 0;
sgbm.uniquenessRatio = 10;                          // margin by which the best match must win
sgbm.speckleWindowSize = 100;                       // speckle filtering of the disparity map
sgbm.speckleRange = 32;
sgbm.disp12MaxDiff = 1;                             // left-right consistency check
sgbm.fullDP = 1;                                    // full-scale two-pass algorithm (more memory)
sgbm.P1 = sgbm.SADWindowSize*sgbm.SADWindowSize*4;  // smoothness penalties
sgbm.P2 = sgbm.SADWindowSize*sgbm.SADWindowSize*32;
sgbm(gray1, gray2, disp);                           // compute the disparity map
Do you have an idea why this happens? How can I solve it? I would like to get a properly planar surface for the sign board.
You could probably improve things slightly by playing with the SGBM parameters, but the reality is that stereo data is noisy, and you should not expect to get a perfectly planar sign in your point cloud.
If you are interested in finding the 3D location of the sign, segmenting the sign from the RGB image and averaging the corresponding points in the point cloud together should produce reasonable results.
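For instance, a hedged sketch (assuming 'xyz' is the CV_32FC3 output of cv::reprojectImageTo3D and 'mask' is a binary segmentation of the sign; the names and the Z cutoff are placeholders):
#include <opencv2/opencv.hpp>
#include <cmath>
// Hedged sketch: averages the 3D points that fall inside the segmentation mask,
// skipping pixels with no valid disparity (reprojectImageTo3D marks them with a huge Z).
cv::Point3f averageSignPosition(const cv::Mat& xyz, const cv::Mat& mask)
{
    cv::Point3d sum(0, 0, 0);
    int n = 0;
    for (int r = 0; r < xyz.rows; ++r)
        for (int c = 0; c < xyz.cols; ++c)
        {
            if (!mask.at<uchar>(r, c)) continue;
            cv::Vec3f p = xyz.at<cv::Vec3f>(r, c);
            if (!std::isfinite(p[2]) || std::fabs(p[2]) > 1000.0f) continue; // drop invalid points
            sum += cv::Point3d(p[0], p[1], p[2]);
            ++n;
        }
    return n > 0 ? cv::Point3f(sum.x / n, sum.y / n, sum.z / n) : cv::Point3f(0, 0, 0);
}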

How to map thermal image (Flir A325sc) with Asus XTion Pro Live IR image under OpenCV+ROS

I want to map the thermal image of the Flir to the depth image of the Xtion.
As the depth image is calculated from the Xtion's IR camera, I want to map the Flir image to the Xtion's IR image.
Therefore I placed both cameras on one plane, close to each other (offset by about 7 cm in x, 1 cm in y and 3 cm in z).
Then I used ROS Indigo and OpenCV 2.4.9 to:
Set the Flir focus to a fixed one (no autofocus)
Get both images synchronized.
Resize the Xtion IR image from 640x480 to 320x240 pixels to match the Flir image
Calculate the intrinsic camera parameters for both cameras. (Flir + Xtion IR)
Calculate the extrinsic parameters
Remap both images to get the rectified images
I now have the two rectified images, but there is still an offset in X (horizontal direction).
If I understand this correctly, the offset is due to the different focal lengths and fields of view (Flir with lens: 45° H x 33.8° V and 9.66 mm focal length; Xtion: 58° H x 45° V), and I could solve the problem with a perspective transform, but I don't have both focal lengths in mm.
The datasheets:
http://support.flir.com/DsDownload/Assets/48001-0101_en_40.pdf
https://www.imc-store.com.au/v/vspfiles/assets/images/1196960_en_51.pdf
http://www.asus.com/us/Multimedia/Xtion_PRO_LIVE/specifications/
I had the idea of getting the focal lengths with cv::calibrationMatrixValues, but I don't know the apertureWidth and apertureHeight.
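For reference, a hedged illustration of the shape of that call (the aperture arguments, in mm, are exactly the unknown physical sensor dimensions; all names here are placeholders):
#include <opencv2/opencv.hpp>
// Hedged illustration only: apertureWidth_mm / apertureHeight_mm are the physical sensor
// dimensions in mm (the missing values) and cameraMatrix is the intrinsic matrix from calibration.
void focalLengthInMm(const cv::Mat& cameraMatrix, double apertureWidth_mm, double apertureHeight_mm)
{
    double fovx, fovy, focalLength_mm, aspectRatio;
    cv::Point2d principalPoint;
    cv::calibrationMatrixValues(cameraMatrix, cv::Size(320, 240),  // image size used for calibration
                                apertureWidth_mm, apertureHeight_mm,
                                fovx, fovy, focalLength_mm, principalPoint, aspectRatio);
}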
How could I solve this problem?

How to rectify an image from a single calibrated camera using the Matlab toolbox [duplicate]

I'm using Matlab for camera calibration using Jean-
Yves Bouguet's Camera Calibration Toolbox. I have all the camera
parameters from the calibration procedure. When I use a new image not
in the calibration set, I can get its transformation equation e.g.
Xc=R*X+T, where X is the 3D point of the calibration rig (planar) in
the world frame, and Xc its coordinates in the camera frame. In other
words, I have everything (both extrinsic and intrinsic parameters).
What I want to do is to perform perspective correction on this image,
i.e. I want to remove any perspective and see the calibration rig
undistorted (it's a checkerboard).
Matlab's new Computer Vision toolbox has an object that performs a perspective transformation on an
image, given a 3x3 matrix H. The problem is, I can't compute this
matrix from the known intrinsic and extrinsic parameters!
To all who are still interested in this after so many months: I've managed to get the correct homography matrix using Kovesi's code (http://www.csse.uwa.edu.au/~pk/research/matlabfns), in particular the homography2d.m function. You will, however, need the pixel values of the four corners of the rig. If the camera stays fixed, you only need to do this once. See the example code below:
%get corner pixel coords from base image
p1=[33;150;1];
p2=[316;136;1];
p3=[274;22;1];
p4=[63;34;1];
por=[p1 p2 p3 p4];
por=[0 1 0;1 0 0;0 0 1]*por; %swap x-y <--------------------
%calculate target image coordinates in world frame
% rig is 9x7 (X,Y) with 27.5mm box edges
XXw=[[0;0;0] [0;27.5*9;0] [27.5*7;27.5*9;0] [27.5*7;0;0]];
Rtarget=[0 1 0;1 0 0;0 0 -1]; %Rotation matrix of target camera (vertical pose)
XXc=Rtarget*XXw+Tc_ext*ones(1,4); %go from world frame to camera frame
xn=XXc./[XXc(3,:);XXc(3,:);XXc(3,:)]; %calculate normalized coords
xpp=KK*xn; %calculate target pixel coords
% get homography matrix from original to target image
HH=homography2d(por,xpp);
%do perspective transformation to validate homography
pnew=HH*por./[HH(3,:)*por;HH(3,:)*por;HH(3,:)*por];
That should do the trick. Note that Matlab defines the x axis in an image as the rows index and y as the columns. Thus one must swap x and y in the equations (as you'll probably see in the code above). Furthermore, I had managed to compute the homography matrix from the parameters alone, but the result was slightly off (maybe round-off errors in the calibration toolbox). The best way to do this is the above.
If you want to use just the camera parameters (that is, not use Kovesi's code), then the homography matrix is H=KK*Rmat*inv_KK. In this case the code is:
% corner coords in pixels
p1=[33;150;1];
p2=[316;136;1];
p3=[274;22;1];
p4=[63;34;1];
pmat=[p1 p2 p3 p4];
pmat=[0 1 0;1 0 0;0 0 1]*pmat; %swap x-y
R=[0 1 0;1 0 0;0 0 1]; %rotation matrix of final camera pose
Rmat=Rc_ext'*R; %rotation from original pose to final pose
H=KK*Rmat*inv_KK; %homography matrix
pnew=H*pmat./[H(3,:)*pmat;H(3,:)*pmat;H(3,:)*pmat]; %do perspective transformation
H2=[0 1 0;-1 0 0;0 0 1]*H; %swap x-y in the homography matrix to apply in image
Approach 1:
In the Camera Calibration Toolbox you should notice that there is an H matrix for each image of your checkerboard in your workspace. I am not familiar with the computer vision toolbox yet but perhaps this is the matrix you need for your function. It seems that H is computed like so:
KK = [fc(1) fc(1)*alpha_c cc(1);0 fc(2) cc(2); 0 0 1];
H = KK * [R(:,1) R(:,2) Tc]; % where R is your extrinsic rotation matrix and Tc the translation vector
H = H / H(3,3);
Approach 2:
If the Computer Vision toolbox function doesn't work out for you, then to compute the perspective projection of an image I have used the interp2 function like so:
[X, Y] = meshgrid(0:size(I,2)-1, 0:size(I,1)-1);
im_coord = [X(:), Y(:), ones(numel(I), 1)]'; % homogeneous pixel coordinates, one column per pixel
% Insert projection here for X and Y to XI and YI
ZI = interp2(X, Y, double(I), XI, YI); % sample the image at the projected coordinates
I used perspective projections in a project a while ago and I believe that you need to use homogeneous coordinates. I found this Wikipedia article quite helpful.

Resolving rotation matrices to obtain the angles

I have used this code as a basis to detect my rectangular target in a scene. I use ORB and the FLANN matcher. I have been able to successfully draw the bounding box of the detected target in my scene using the findHomography() and perspectiveTransform() functions.
The reference image (img_object in the above code) is a straight-on view of only the rectangular target. Now the target in my scene image may be tilted forwards or backwards, and I want to find out the angle by which it has been tilted. I have read various posts and came to the conclusion that the homography returned by findHomography() can be decomposed into a rotation matrix and a translation vector. I have used code from https://gist.github.com/inspirit/740979, recommended by this link, translated to C++. This is the Zhang SVD decomposition code taken from the camera calibration module of OpenCV. I got the complete explanation of this decomposition code from O'Reilly's Learning OpenCV book.
I also used solvePnP() on the keypoints returned by the matcher to cross-check the rotation matrix and the translation vector returned from the homography decomposition, but they do not seem to be the same.
I already have measurements of the tilts of all my scene images. I found two ways to retrieve the angles from the rotation matrix to check how well they match my values.
Given a 3×3 rotation matrix
R = [ r11 r12 r13 ]
    [ r21 r22 r23 ]
    [ r31 r32 r33 ]
the 3 Euler angles are
theta_x = atan2(r32, r33)
theta_y = atan2(-r31, sqrt(r32^2 + r33^2))
theta_z = atan2(r21, r11)
The axis-angle representation: with R a general rotation matrix, its corresponding rotation axis u
and rotation angle θ can be retrieved from:
cos(θ) = (trace(R) − 1) / 2
[u]× = (R − R^T) / (2 sin(θ))
I calculated the angles using both methods for the rotation matrices obtained from the homography decomposition and from solvePnP(). All the angles are different and give very unexpected values.
Is there a hole in my understanding? I do not understand where my calculations are wrong. Are there any alternatives I can use?
Why do you expect them to be the same? They are not the same thing at all.
The Euler angles are three angles of rotation about one axis at a time, starting from the world frame.
Rodrigues' formula gives the components of one vector in the world frame, and an angle of rotation about that vector.
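To make the difference concrete, here is a hedged C++/OpenCV sketch that extracts both representations from the same rotation matrix (the Euler convention is the x-y-z one from the question; the function and variable names are placeholders):
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <cstdio>
// Hedged sketch: prints the two readings of the same rotation matrix.
void describeRotation(const cv::Matx33d& R)
{
    // Euler-style angles (x-y-z convention from the question), one axis at a time
    double thetaX = std::atan2(R(2, 1), R(2, 2));
    double thetaY = std::atan2(-R(2, 0), std::sqrt(R(2, 1) * R(2, 1) + R(2, 2) * R(2, 2)));
    double thetaZ = std::atan2(R(1, 0), R(0, 0));

    // Axis-angle: cv::Rodrigues packs theta * u into a single 3-vector
    cv::Mat rvec;
    cv::Rodrigues(cv::Mat(R), rvec);                // rvec = theta * u, a 3x1 vector
    double theta = cv::norm(rvec);                  // rotation angle in radians
    cv::Mat axis = rvec / std::max(theta, 1e-12);   // unit axis (ill-defined when theta ~ 0)

    std::printf("Euler x,y,z: %.4f %.4f %.4f rad\n", thetaX, thetaY, thetaZ);
    std::printf("axis-angle : %.4f rad about (%.4f, %.4f, %.4f)\n",
                theta, axis.at<double>(0), axis.at<double>(1), axis.at<double>(2));
}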