TOF camera calibration - distance to chessboard - computer-vision

For my application, I need to calibrate my TOF camera (Kinect v2).
I did this with MATLAB's camera calibration. After calibrating, I noticed that planes that should be at right angles appear oblique.
For example, here is a result for two "right-angled" planes:
I think the result is this skewed because the calibration produced wrong parameters. Therefore I want to improve my calibration process for the Kinect.
So I have three major questions:
Is the distance between the TOF camera and the chessboard important for the calibration result? For my application, I need fairly high accuracy in the interval 2.5 m - 3 m (Z distance between camera and object). So I chose this interval to get the best result for this range, especially because it is a TOF camera. Or should I use a fairly short distance (1-1.5 m) to get a chessboard image with high resolution?
What kind of images (viewpoint: rotated or oblique; board in the middle or in a corner of the image) are important to get a good result for the tangential distortion (I think this parameter turns out totally wrong)? Any tips to improve my results here?
I fixed my chessboard on a flat wall and fixed my camera on a tripod. For different calibration images I move the tripod. Is this procedure also OK? Or do I have to move the chessboard pattern to get better results?

Related

OpenCV 3.4 camera calibration yields strange principal point

I'm doing camera calibration using the calibration.cpp sample provided in the OpenCV 3.4 release. I'm using a simple 9x6 chessboard, with square length = 3.45 mm.
Command to run the code:
Calib.exe -w=9 -h=6 -s=3.45 -o=camera.yml -oe imgList.xml
imgList.xml
I'm using a batch of 28 images available here
camera.yml (output)
Image outputs from drawChessboardCorners: here
There are 4 images without the chessboard overlay drawn, findChessboardCorners has failed for these.
Results look kind of strange (if I understand them correctly). I'm taking focal length value for granted, but the principal point seems way off at c = (834, 1513). I was expecting a point closer to the image center at (1280, 960) since the orientation of the camera to the surface viewed is very close to 90 degrees.
Also if I place an object at the principal point and move it in the Z axis I shouldn't see it move along x and y in the image, is this correct?
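A small sketch of that expectation under an ideal pinhole model without lens distortion. The focal length is a made-up value and `project` is a hypothetical helper, not an OpenCV function; the principal point is simply the image center mentioned above.

```python
def project(point, fx, fy, cx, cy):
    """Project a 3D point given in camera coordinates with an ideal pinhole model."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

FX = FY = 3000.0         # hypothetical focal length in pixels
CX, CY = 1280.0, 960.0   # image center used as the assumed principal point

DEPTHS = (100.0, 200.0, 400.0)

# A point on the optical axis (X = Y = 0) stays on the principal point for any Z.
on_axis = [project((0.0, 0.0, z), FX, FY, CX, CY) for z in DEPTHS]

# A point 10 units off the axis drifts toward the principal point as Z grows.
off_axis = [project((10.0, 0.0, z), FX, FY, CX, CY) for z in DEPTHS]

print(on_axis)
print(off_axis)
```

Under this model, only a point that lies exactly on the optical axis keeps its pixel position while moving in Z.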
I suspect I should add images with greater tilt of the chessboard with respect to the camera to get better results (z-angle). But the camera has a really narrow depth of field, and this prevents the chessboard corners from being detected.
The main issue is that you are not giving the calibration software enough information to estimate the different parameters correctly.
In all 28 images you only changed the orientation of the chessboard around the z axis, within the same plane. You don't need to take that many photos; for me around 15 is enough. You need to add more degrees of freedom to your images: change the distance of the chessboard from the camera and tilt the chessboard around its X and Y axes. Recalibrate the camera and you should get the right parameters.
It really depends on the camera and lens you use.
More specifically on things like:
precision of chip deployment
attachment of screw thread of lens
manufacturing of lens itself
Some cheap webcams with a small chip can even have the principal point outside the image (which means it could also be a negative number). So in your case C could be either (834, 1513) or (1513, 834).
If you are using an industrial camera or something similar, C should be within a few tens of percent of the image center, e.g. (1280, 960) ± 25%.
About the problem with the narrow depth of field (in a nutshell): to make it wider, make the aperture as small as possible, lengthen the exposure, and add some extra light from behind the camera to compensate for the smaller aperture.
You could also refocus to get sharp shots at different distances, but your accuracy gets lower because refocusing slightly changes the focal length. In most cases you do not need that level of accuracy, so this should not be a problem.

Take an image of a tube that always spins around in OpenCV C++

First of all, sorry for my bad English,
I have an object like the one in the following picture; it always spins around a horizontal axis. Can anybody recommend how I can take a photo that shows the full label of the tube while the tube is spinning? I can capture an image from my camera via OpenCV C++, but when the tube is spinning I can't get a clean photo (the image is blurry).
My tube is facing the camera directly. Its rotation speed is about 500 RPM.
Hope to get your help soon,
Thank you very much!
This is my object:
Some sample images:
Here is my image when I use the camera of an Ip5 with flash:
Motion blur
This can be improved by lowering the exposure time, but you need more light to compensate. Most modern compact cameras do not let you set the exposure time directly (so that the companies can sell the expensive professional cameras), even though it is just a few lines of GUI code, but if you increase the light the automatic exposure should shorten on its own.
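To get a feel for how short the exposure has to be, here is a rough estimate of the motion blur at 500 RPM. The tube diameter and the image scale (pixels per millimetre on the label) are made-up values, so substitute measurements from your own setup.

```python
import math

def surface_speed_mm_s(diameter_mm, rpm):
    """Speed of the tube surface: one circumference per revolution."""
    return math.pi * diameter_mm * rpm / 60.0

def blur_px(diameter_mm, rpm, exposure_s, px_per_mm):
    """Distance the label travels during one exposure, in pixels."""
    return surface_speed_mm_s(diameter_mm, rpm) * exposure_s * px_per_mm

def max_exposure_s(diameter_mm, rpm, px_per_mm, allowed_blur_px=1.0):
    """Longest exposure that keeps the blur below allowed_blur_px."""
    return allowed_blur_px / (surface_speed_mm_s(diameter_mm, rpm) * px_per_mm)

DIAMETER_MM = 20.0  # hypothetical tube diameter
PX_PER_MM = 10.0    # hypothetical image scale on the tube surface
RPM = 500.0         # rotation speed from the question

print(f"surface speed: {surface_speed_mm_s(DIAMETER_MM, RPM):.1f} mm/s")
print(f"blur at 1/1000 s: {blur_px(DIAMETER_MM, RPM, 0.001, PX_PER_MM):.2f} px")
print(f"exposure for 1 px blur: {max_exposure_s(DIAMETER_MM, RPM, PX_PER_MM) * 1e6:.0f} us")
```

With these assumed numbers even a 1/1000 s exposure smears the label over several pixels, which is why the extra light matters.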
In industry this problem is solved by special TDI cameras like
HAMAMATSU TDI Line Scan Cameras
TDI means time delay integration: the camera's CCD pixels pass their charge to the next pixel in sync with the motion. The effect is as if you moved the camera synchronously with your object's surface. The blur is still present but much smaller (only a fraction of the real exposure time).
In computer vision and DIP you can de-blur the image by deconvolution if you know the movement properties (which you do). It is the inversion of a Gaussian blur filter, using an FFT and an optimization process to find the inverse filter.
Out of focus blur
This is due to the fact that your surface is curved and the camera chip is not, so the outer pixels are at a different distance from the chip than the center pixels. Without special optics you can handle this with line cameras. Of course I do not expect you have one, so you can use your regular camera for this too.
Just mount your camera so that one of the camera axes (for example the x axis) is parallel to your object's rotation axis (surface). Then sample several images with a constant time step and use only the center line/slice of each image (the height of the line/slice depends on your exposure time and the object speed; the slices should overlap a bit). Then combine these lines/slices from all the sampled images to form the focused image.
[Edit1] home made TDI setup
Mount the camera so its view axis is perpendicular to the surface.
Take burst shots or video with a constant frame rate
The shorter the exposure time (higher frame rate), the more focused the whole image will be (less optical blur) and the bigger the area dy that is free of motion blur. The higher the rotation RPM, the smaller dy will be. So find the best option for your camera, RPM and lighting conditions (usually adding strong light helps if you do not have reflective surfaces on the tube).
For correct output you need to balance the parameters so that:
exposure time is as short as possible
focused areas overlap between the shots (if not, you can sample more rounds, similar to old FDD sector reading...)
extract focused part of shots
You need just the focused middle part of each shot, so empirically take a few shots with your setup and choose the dy size. Then use that as a constant later. Extract the middle part (slice) from each shot. In my example image it is the red area.
combine slices
You just copy the slices together (or average the overlapping parts). They should overlap a bit so you do not have holes in the final image. As you can see, my final image example uses smaller slices than were acquired, to make that more obvious.
Your camera image can be off by a few pixels due to vibrations, so if that is a problem in the final image you can use SIFT/SURF + RANSAC for auto-stitching for higher-precision output.
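The extract-and-combine steps can be sketched in a few lines. This is a simplified illustration, not a full implementation: frames are plain lists of pixel rows, the slices are assumed to abut exactly (no overlap averaging, no stitching correction), and `center_slice` and `stitch` are hypothetical helper names.

```python
def center_slice(frame, dy):
    """Return the dy rows around the vertical center of a frame (list of rows)."""
    top = (len(frame) - dy) // 2
    return frame[top:top + dy]

def stitch(frames, dy):
    """Stack the center slices of successive frames into one unrolled image."""
    unrolled = []
    for frame in frames:
        unrolled.extend(center_slice(frame, dy))
    return unrolled

# Three synthetic 6x4 frames; every pixel of frame k has the value k.
frames = [[[k] * 4 for _ in range(6)] for k in range(3)]

label = stitch(frames, dy=2)
for row in label:
    print(row)
```

With real images the frame order has to follow the rotation direction, and dy is the empirically chosen slice height described above.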

OpenCV triangulatePoints varying distance

I am using OpenCV's triangulatePoints function to determine 3D coordinates of a point imaged by a stereo camera.
I find that this function gives me a different distance to the same point depending on the angle of the camera to that point.
Here is a video:
https://www.youtube.com/watch?v=FrYBhLJGiE4
In this video, we are tracking the 'X' mark. In the upper left corner info is displayed about the point that is being tracked. (Youtube dropped the quality, the video is normally much sharper. (2x1280) x 720)
In the video, left camera is the origin of 3D coordinate system and it's looking in positive Z direction. Left camera is undergoing some translation, but not nearly as much as the triangulatePoints function leads to believe. (More info is in the video description.)
Metric unit is mm, so the point is initially triangulated at ~1.94m distance from the left camera.
I am aware that insufficiently precise calibration can cause this behaviour. I have run three independent calibrations using a chessboard pattern. The resulting parameters vary too much for my taste (approx. ±10% for the focal length estimate).
As you can see, the video is not highly distorted. Straight lines appear pretty straight everywhere. So the optimum camera parameters must be close to the ones I am already using.
My question is, is there anything else that can cause this?
Can the convergence angle between the two stereo cameras have this effect? Or a wrong baseline length?
Of course, there is always the matter of errors in feature detection. Since I am using optical flow to track the 'X' mark, I get subpixel precision, which may be off by... I don't know... ±0.2 px?
I am using the Stereolabs ZED stereo camera. I am not accessing the video frames directly with OpenCV. Instead, I have to use the special SDK I acquired when purchasing the camera. It has occurred to me that this SDK might be doing some undistortion of its own.
So, now I wonder... If the SDK undistorts an image using incorrect distortion coefficients, can that create an image that is neither barrel-distorted nor pincushion-distorted but something different altogether?
The SDK provided with the ZED camera performs undistortion and rectification of images. The geometry model is the same as OpenCV's:
intrinsic parameters and distortion parameters for both Left and Right cameras.
extrinsic parameters for rotation/translation between Right and Left.
Through one of the ZED tools (ZED Settings App), you can enter your own intrinsic matrix for Left/Right, distortion coefficients, and Baseline/Convergence.
To get a precise 3D triangulation, you may need to adjust those parameters since they have a high impact on the disparity you will estimate before converting to depth.
OpenCV provides a good module for calibrating 3D cameras. It does:
- Mono calibration (calibrateCamera) for Left and Right, followed by a stereo calibration (cv::stereoCalibrate()). It outputs the intrinsic parameters (focal length, optical center (very important)) and the extrinsic parameters (Baseline = T[0], Convergence = R[1] if R is a 3x1 matrix). The RMS (the return value of stereoCalibrate()) is a good way to check whether the calibration was done correctly.
The important thing is that you need to do this calibration on raw images, not on images provided by the ZED SDK. Since the ZED is a standard UVC camera, you can use OpenCV to get the side-by-side raw images (cv::VideoCapture with the correct device number) and extract the Left and Right native images.
You can then enter those calibration parameters in the tool. The ZED SDK will then perform the undistortion/rectification and provide the corrected images. The new camera matrix is provided by getParameters(). You need to use those values when you triangulate, since the images are corrected as if they were taken by this "ideal" camera.
hope this helps.
/OB/
There are 3 points I can think of that can probably help you.
Probably the least important, but from your description you have calibrated the cameras separately and then the stereo system. Running an overall optimization should improve the reconstruction accuracy, as some "less accurate" parameters compensate for other "less accurate" parameters.
If the accuracy of the reconstruction is important to you, you need a systematic approach to reducing the error. Building an uncertainty model is easy thanks to the mathematical model, and you can write a few lines of code to build it. Say you want to examine a 3d point 2 meters away, at a particular angle to the camera system, and you have a specific uncertainty on the 2d projections of that 3d point: it's easy to back-project the uncertainty into the 3d space around your 3d point. By adding uncertainty to the other parameters of the system you can then see which ones matter more and need to have lower uncertainty.
This inaccuracy is inherent in the problem and the method you're using.
First, if you model the uncertainty you will see that reconstructed 3d points further away from the center of the cameras have a much higher uncertainty. The reason is that the angle <left-camera, 3d-point, right-camera> is narrower. I remember the MVG book had a good description of this with a figure.
Second, if you look at the implementation of triangulatePoints you see that the pseudo-inverse method is implemented using SVD to construct the 3d point. That can lead to many issues, which you probably remember from linear algebra.
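The first point can be made concrete for the special case of a rectified stereo pair, where Z = f * B / d and a disparity error dd therefore maps to a depth error of roughly Z^2 / (f * B) * dd. The sketch below uses that first-order formula only; the focal length and baseline are made-up values, and the 0.2 px error is the tracking error guessed in the question.

```python
def depth_from_disparity(f_px, baseline_mm, disparity_px):
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    return f_px * baseline_mm / disparity_px

def depth_error_mm(f_px, baseline_mm, depth_mm, disparity_error_px):
    """First-order depth error: |dZ| ~= Z^2 / (f * B) * |dd|."""
    return depth_mm ** 2 / (f_px * baseline_mm) * disparity_error_px

F_PX = 700.0         # hypothetical focal length in pixels
BASELINE_MM = 120.0  # hypothetical baseline in mm
DISP_ERR_PX = 0.2    # disparity error assumed from the question

for depth_mm in (1000.0, 1940.0, 4000.0):
    err = depth_error_mm(F_PX, BASELINE_MM, depth_mm, DISP_ERR_PX)
    print(f"Z = {depth_mm:6.0f} mm -> about +/- {err:.1f} mm")
```

The error grows with the square of the distance, so doubling the distance quadruples the expected depth error for the same pixel error.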
Update:
But I consistently get larger distance near edges and several times
the magnitude of the uncertainty caused by the angle.
That's the result of using the pseudo-inverse, a numerical method. You can replace it with a geometrical method. One easy method is to back-project the 2d projections to get 2 rays in 3d space. Then you want to find where they intersect, which doesn't happen due to the inaccuracies. Instead you want to find the point where the 2 rays have the least distance. Without considering the uncertainty you will consistently favor one point from the set of feasible solutions. That's why with the pseudo-inverse you don't see any fluctuation but a gross error.
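A minimal sketch of that geometrical method, assuming you already have each camera center and the direction of the back-projected ray in a common 3d frame. `closest_point_between_rays` is a hypothetical helper, not an OpenCV function, and the camera positions in the example are made up.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def closest_point_between_rays(p1, d1, p2, d2):
    """Midpoint of the shortest segment between the lines p1 + s*d1 and p2 + t*d2."""
    w0 = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12 * a * c:
        raise ValueError("rays are (nearly) parallel")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = [p + s * k for p, k in zip(p1, d1)]  # closest point on ray 1
    q2 = [p + t * k for p, k in zip(p2, d2)]  # closest point on ray 2
    return [(u + v) / 2.0 for u, v in zip(q1, q2)]

# Hypothetical setup: left camera at the origin, right camera 120 mm to its right.
left_center = [0.0, 0.0, 0.0]
right_center = [120.0, 0.0, 0.0]
target = [100.0, 50.0, 1940.0]

# Perfect rays through the target intersect exactly, so the target is recovered.
ray_left = [t - c for t, c in zip(target, left_center)]
ray_right = [t - c for t, c in zip(target, right_center)]
print(closest_point_between_rays(left_center, ray_left, right_center, ray_right))
```

With noisy rays the two closest points no longer coincide, and the length of the segment between them gives a rough indication of how consistent the two observations are.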
Regarding the general optimization, yes, you can run an iterative LM optimization on all the parameters. This is the method used in applications like SLAM for autonomous vehicles where accuracy is very important. You can find some papers by googling bundle adjustment slam.

3D reconstruction from 2 images with baseline and single camera calibration

My semester project is to calibrate stereo cameras with a big baseline (~2 m).
So my approach is to work without an exactly defined calibration pattern like the chessboard, because it would have to be huge and would be hard to handle.
My problem is similar to this: 3d reconstruction from 2 images without info about the camera
Program till now:
Corner detection left image goodFeaturesToTrack
refined corners cornerSubPix
Find corner locations in right image calcOpticalFlowPyrLK
calculate fundamental matrix F findFundamentalMat
calculate H1, H2 rectification homography matrix stereoRectifyUncalibrated
Rectify images warpPerspective
Calculate Disparity map sgbm
So far so good; it works passably, but the rectified images are "jumping" in perspective if I change the number of corners.
I don't know if this comes from imprecision or from mistakes I made, or if it can't be calculated because there are no known camera parameters or no lens distortion compensation (but it also happens on the Tsukuba pictures).
Suggestions are welcome :)
But that is not my main problem; now I want to reconstruct the 3D points.
But reprojectImageTo3D needs the Q matrix, which I don't have so far. So my question is how to calculate it. I have the baseline, the distance between the two cameras. My feeling is that if I convert the disparity map into a 3D point cloud, the only thing I'm missing is the scale, right? So if I plug in the baseline, I get the correct 3D reconstruction, right? Then how do I do that?
I'm also planning to compensate lens distortion as the first step, for each camera separately, with a chessboard (small and close to one camera at a time, so I don't have to be 10-15 m away with a big pattern in the overlapping area of both). So if it helps, I could also use the camera parameters.
Is there documentation besides http://docs.opencv.org where I can see and understand what the Q matrix is and how it is calculated? Or can I open the source code (probably hard for me to understand ^^)? If I press F2 in Qt I only see the function with the parameter types. (Sorry, I'm really new to all of this.)
left: input with found corners
top h1, h2: rectify images (looks good with this corner count ^^)
SGBM: Disparity map
So I found out what the Q matrix contains here:
Using OpenCV to generate 3d points (assuming frontal parallel configuration)
all these parameters are given by the single camera calibration:
c_x , c_y , f
and the baseline is what i have measured:
T_x
So this works for now; only the units are not that clear to me. I used the values from the single camera calibration, which are in px, set the baseline in meters, and divided the disparity map by 16, but the scale does not seem right.
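On the units, here is a sketch of the frontal-parallel case under a few assumptions: both rectified images share the same principal point, the baseline is entered as a positive number, and the sign of the 1/T_x entry is chosen so that depth comes out positive (OpenCV's own Q uses -1/T_x with its signed T_x). With f, c_x and c_y in pixels and the disparity in pixels, the output takes the unit of the baseline. `make_q` and `reproject` are hypothetical helpers and all numbers are made up.

```python
def make_q(f_px, cx, cy, baseline):
    """Q for a rectified, frontal-parallel pair with identical principal points."""
    return [
        [1.0, 0.0, 0.0, -cx],
        [0.0, 1.0, 0.0, -cy],
        [0.0, 0.0, 0.0, f_px],
        [0.0, 0.0, 1.0 / baseline, 0.0],
    ]

def reproject(q, x, y, disparity_px):
    """Compute [X Y Z W] = Q [x y d 1] and return (X/W, Y/W, Z/W)."""
    v = (x, y, disparity_px, 1.0)
    X, Y, Z, W = (sum(q[r][c] * v[c] for c in range(4)) for r in range(4))
    return (X / W, Y / W, Z / W)

F_PX, CX, CY = 1000.0, 640.0, 360.0  # hypothetical intrinsics in pixels
BASELINE_M = 2.0                     # baseline in meters, so the output is in meters

raw_sgbm_value = 1600                # SGBM returns fixed-point disparities (scaled by 16)
disparity_px = raw_sgbm_value / 16.0

q = make_q(F_PX, CX, CY, BASELINE_M)
point = reproject(q, 740.0, 360.0, disparity_px)
print(point)
```

In this model Z = f * B / d, so a wrong scale points either to a disparity that is not in pixels or to an f that does not match the rectified images.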
By the way, the disparity map above was wrong ^^ and now it looks better. You have to apply an anti-shearing transform because stereoRectifyUncalibrated shears your image (not documented?).
It is described in this paper, in section "7 Shearing Transform", by Charles Loop and Zhengyou Zhang:
http://research.microsoft.com/en-us/um/people/Zhang/Papers/TR99-21.pdf
Result:
http://i.stack.imgur.com/UkuJi.jpg

web cam calibrate

I have 2 Logitech webcams. I want to do stereo triangulation, for which I have to measure the focal length of the 2 webcams.
My question is: if I use OpenCV to calibrate the cameras and generate the intrinsic and extrinsic matrices, can I use the focal length value from the intrinsic matrix as the exact value of the focal length?
In short, I want to know whether I can use 2 webcams to do stereo triangulation rather than a pinhole stereo camera...
Well, the answer depends on what you mean by the exact value of the focal length. If the accuracy of triangulation is your concern, then you need to know that a few factors affect the accuracy of calibration and triangulation. First, the rule of thumb is to have a wider baseline (the distance between the two cameras) to improve the accuracy of calibration. Second, a larger number of points and more accurate points should be used for calibration. Third, check the back-projection error after running bundle adjustment. Fourth, when triangulating, points further from the cameras have a larger uncertainty. And finally, apart from the first point (the wide baseline), the relative pose between the two cameras is very important: consider which points you want to triangulate and roughly where in 3D space they will be, and you can then reconstruct the points that are important to you more accurately than the others. If you provide more details about the problem you're dealing with, perhaps you will get a more detailed answer too. I hope that helps.