OpenCV blob tracking of a point relative to a plane - C++

I'm doing an installation that tracks blobs using OpenCV and projects graphics over the blobs. The problem is that my camera is off to the side, away from the projector.
I'm thinking that to get a point's position relative to the projection plane, I would need to calibrate by marking out the plane's corners as seen in the camera view.
My problem is: how do I use those 4 points to convert a tracked blob from the camera view to the projection plane, so the projected graphic lines up with the tracked blob? I'm not sure what I should be searching for.

After you detect the 4 corner points, you can calculate the transformation to the projector plane using getPerspectiveTransform.
Once you have this transformation, you can use warpPerspective to warp whole images from one coordinate system to the other, or perspectiveTransform to map individual point coordinates.
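Below is a minimal sketch of that idea for a single tracked blob centre; the corner coordinates and resolutions are placeholders you would replace with your own measurements.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // The projection plane's corners as seen in the camera image (marked by hand).
    std::vector<cv::Point2f> cameraCorners = { {102, 75}, {540, 61}, {561, 410}, {88, 423} };
    // The same corners in projector coordinates (here: the full projector resolution).
    std::vector<cv::Point2f> projectorCorners = { {0, 0}, {1280, 0}, {1280, 800}, {0, 800} };

    // 3x3 homography mapping camera-view coordinates to projector-plane coordinates.
    cv::Mat H = cv::getPerspectiveTransform(cameraCorners, projectorCorners);

    // Map a tracked blob centre (camera coordinates) into projector coordinates.
    std::vector<cv::Point2f> blob = { {320, 240} }, blobOnProjector;
    cv::perspectiveTransform(blob, blobOnProjector, H);

    // blobOnProjector[0] is where to draw the graphic so it lines up with the blob.
    return 0;
}
```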

Unfortunately I'm unable to provide a minimal code example at the moment, but I recommend having a look at ofxCv and its examples. There is a camera-based undistort example, and the wrapper also provides utilities for warping/unwarping perspective via warpPerspective and unwarpPerspective.
Bear in mind ofxCv has handy functions like toCv() and toOf() to convert between ofImage and cv::Mat.
ofxCv may make it easier to use the OpenCV functions Elad Joseph recommends (which sound like exactly what you need).
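As a rough sketch of how those conversion helpers fit in (exact ofxCv signatures may differ slightly between versions):

```cpp
#include "ofxCv.h"
#include "ofMain.h"

// Warp a camera frame into projector space given a 3x3 homography,
// converting between openFrameworks and OpenCV types with toCv()/toOf().
void warpToProjector(ofImage& camFrame, const cv::Mat& homography, ofImage& result) {
    cv::Mat input = ofxCv::toCv(camFrame);            // view the ofImage pixels as a cv::Mat
    cv::Mat warped;
    cv::warpPerspective(input, warped, homography, input.size());
    ofxCv::toOf(warped, result);                      // copy the warped Mat back into an ofImage
    result.update();
}
```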

Related

How to calibrate the Kinect camera?

Camera calibration is the process of estimating intrinsic and/or extrinsic parameters. Intrinsic parameters deal with the camera's internal characteristics, such as its focal length, skew, distortion, and image center. Extrinsic parameters describe its position and orientation in the world. Knowing the intrinsic parameters is an essential first step for 3D computer vision, as it allows you to estimate the scene's structure in Euclidean space and removes lens distortion, which degrades accuracy.
I'm using the Kinect for computer vision, but I need to calibrate it. I've already read some articles about Kinect calibration, but I didn't understand them very clearly.
I want to start from scratch, because I need to know how the calibration is done.
How do I do this?
Thanks.
The Kinect is slightly different from your standard camera. There is a customized toolbox here: http://www.ee.oulu.fi/~dherrera/kinect/
I'd suggest you read the paper and try to understand what calibration is.
In very simplistic terms you need calibration so that other geometric algorithms can work. The vast majority of geometry-based algorithms in vision assume the pinhole camera model. That is, the center of the camera is a tiny pinhole and rays reflecting off of objects travel in straight lines.
However, a pinhole camera is not practical to manufacture. You can make a pinhole camera at home, but the image quality won't be good.
People use lenses to deal with this. But lenses are imperfect and they introduce distortion.
Distortion means that pixel coordinates no longer correspond to straight lines, so many of the algorithms fail to compute the right thing.
Camera intrinsic calibration corrects the distortion in the lens so that the projection is as close to a pinhole as possible.
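As a tiny sketch of what that correction looks like in code, assuming cameraMatrix and distCoeffs were produced by a prior intrinsic calibration:

```cpp
#include <opencv2/opencv.hpp>

// Undistort a captured frame so that straight lines in the world
// project to (approximately) straight lines in the image.
cv::Mat removeLensDistortion(const cv::Mat& frame,
                             const cv::Mat& cameraMatrix,
                             const cv::Mat& distCoeffs) {
    cv::Mat undistorted;
    cv::undistort(frame, undistorted, cameraMatrix, distCoeffs);
    return undistorted;
}
```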
The Kinect has two cameras: the RGB camera and the IR camera. Both are factory calibrated, but you can get better results, tailored to the particular sensor you use, with the toolbox above.
HTH
Thulio, for calibrating the color camera of the Kinect, have a look at:
the camera calibration toolbox. Basically you need to print a chessboard pattern on paper, glue it onto a planar surface, take a lot of Kinect color pictures and load them into the toolbox to get your camera parameters. I suspect that somebody else may have done that before you (most Kinects will have the same intrinsic parameters, I guess).
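For reference, this is roughly what the same chessboard workflow looks like with OpenCV's own calibration functions (board size, square size and input handling are placeholders):

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

void calibrateFromChessboardImages(const std::vector<cv::Mat>& images) {
    const cv::Size boardSize(9, 6);     // number of inner corners on the printed chessboard
    const float squareSize = 0.025f;    // edge length of one square, in metres (placeholder)

    // Reference 3D positions of the corners on the (planar) board, z = 0.
    std::vector<cv::Point3f> boardCorners;
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            boardCorners.emplace_back(x * squareSize, y * squareSize, 0.0f);

    std::vector<std::vector<cv::Point3f>> objectPoints;
    std::vector<std::vector<cv::Point2f>> imagePoints;

    for (const cv::Mat& img : images) {
        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(img, boardSize, corners)) {
            objectPoints.push_back(boardCorners);
            imagePoints.push_back(corners);
        }
    }

    cv::Mat cameraMatrix, distCoeffs;
    std::vector<cv::Mat> rvecs, tvecs;
    double rms = cv::calibrateCamera(objectPoints, imagePoints, images[0].size(),
                                     cameraMatrix, distCoeffs, rvecs, tvecs);
    std::cout << "RMS reprojection error: " << rms << std::endl;
    // cameraMatrix and distCoeffs are the intrinsic parameters you were after.
}
```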

How do I re-project points in a camera-projector system (after calibration)?

I have seen many blog entries, videos and source code on the internet about how to carry out camera + projector calibration using OpenCV, in order to produce the camera.yml, projector.yml and projectorExtrinsics.yml files.
I have yet to see anyone discuss what to do with these files afterwards. Indeed, I have done a calibration myself, but I don't know what the next step is in my own application.
Say I write an application that now uses the calibrated camera-projector system to track objects and project something on them. I will use findContours() to grab some points of interest from the moving objects, and now I want to project these points (from the projector!) onto the objects!
What I want to do is (for example) track the centre of mass (COM) of an object and show a point on the camera view of the tracked object (at its COM). Then a point should be projected onto the COM of the object in real time.
It seems that projectPoints() is the OpenCV function I should use after loading the yml files, but I am not sure how to account for all the intrinsic & extrinsic calibration values of both camera and projector. Namely, projectPoints() requires as parameters:
the vector of points to re-project (duh!)
the rotation + translation matrices. I think I can use the projectorExtrinsics here, or I can use the composeRT() function to generate a final rotation & translation from the projectorExtrinsics (which I have in the yml file) and the cameraExtrinsics (which I don't have. Side question: should I not save them too in a file?).
the intrinsics matrix. This is tricky now: should I use the camera or the projector intrinsics matrix here?
the distortion coefficients. Again, should I use the projector or the camera coefficients here?
other params...
So if I use either the projector or the camera (which one??) intrinsics + coefficients in projectPoints(), then I will only be 'correcting' for one of the 2 instruments. Where / how will I use the other instrument's intrinsics?
What else do I need to use apart from load()-ing the yml files and projectPoints()? (Perhaps undistortion?)
ANY help on the matter is greatly appreciated.
If there is a tutorial or a book (no, O'Reilly's "Learning OpenCV" does not talk about how to use the calibration yml files either - only about how to do the actual calibration), please point me in that direction. I don't necessarily need an exact answer!
First, you seem to be confused about the general role of a camera/projector model: its role is to map 3D world points to 2D image points. This sounds obvious, but it means that given the extrinsics R,t (orientation and position), the distortion function D(.) and the intrinsics K, you can infer for this particular camera the 2D projection m of a 3D point M as follows: m = K.D(R.M+t). The projectPoints function does exactly that (i.e. 3D to 2D projection), for each input 3D point, hence you need to give it the input parameters associated with the camera in which you want your 3D points projected (projector K & D if you want projector 2D coordinates, camera K & D if you want camera 2D coordinates).
Second, when you jointly calibrate your camera and projector, you do not estimate a set of extrinsics R,t for the camera and another for the projector, but only one R and one t, which represent the rotation and translation between the camera's and projector's coordinate systems. For instance, this means that your camera is assumed to have rotation = identity and translation = zero, and the projector has rotation = R and translation = t (or the other way around, depending on how you did the calibration).
Now, concerning the application you mentioned, the real problem is: how do you estimate the 3D coordinates of a given point?
Using two cameras and one projector, this would be easy: you could track the objects of interest in the two camera images, triangulate their 3D positions from the two 2D projections using the function triangulatePoints, and finally project each 3D point into the projector's 2D coordinates using projectPoints in order to know where to display things with your projector.
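A hedged sketch of that two-camera pipeline for a single tracked point (the projection matrices and projector parameters are assumed to come from your yml files; the names are illustrative):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point2f> projectTrackedPoint(
        const cv::Mat& P1, const cv::Mat& P2,              // 3x4 projection matrices of the two cameras
        const cv::Point2f& obs1, const cv::Point2f& obs2,  // the tracked point in each camera image
        const cv::Mat& projK, const cv::Mat& projDist,     // projector intrinsics + distortion
        const cv::Mat& rvec, const cv::Mat& tvec)          // rotation vector + translation to the projector
{
    // triangulatePoints expects 2xN arrays of image points (here N = 1).
    cv::Mat pts1 = (cv::Mat_<float>(2, 1) << obs1.x, obs1.y);
    cv::Mat pts2 = (cv::Mat_<float>(2, 1) << obs2.x, obs2.y);
    cv::Mat points4D;
    cv::triangulatePoints(P1, P2, pts1, pts2, points4D);

    // Convert the homogeneous result to Euclidean 3D coordinates.
    std::vector<cv::Point3f> points3D;
    cv::convertPointsFromHomogeneous(points4D.t(), points3D);

    // Re-project into projector image coordinates using the projector's K & D.
    std::vector<cv::Point2f> projected;
    cv::projectPoints(points3D, rvec, tvec, projK, projDist, projected);
    return projected;   // projected[0] is where the projector should draw
}
```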
With only one camera and one projector, this is still possible but more difficult because you cannot triangulate the tracked points from only one observation. The basic idea is to approach the problem like a sparse stereo disparity estimation problem. A possible method is as follows:
project a non-ambiguous image (e.g. black and white noise) using the projector, in order to texture the scene observed by the camera.
as before, track the objects of interest in the camera image
for each object of interest, correlate a small window around its location in the camera image with the projector image, in order to find where it projects in the projector 2D coordinates
Another approach, which unlike the one above would use the calibration parameters, could be to do a dense 3D reconstruction using stereoRectify and StereoBM::operator() (or gpu::StereoBM_GPU::operator() for the GPU implementation), map the tracked 2D positions to 3D using the estimated scene depth, and finally project into the projector using projectPoints.
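A rough sketch of that dense alternative, under the assumption that the projected pattern image can be treated as the second view of a stereo pair (all calibration matrices are assumed to come from the yml files):

```cpp
#include <opencv2/opencv.hpp>

// Returns a 3-channel image where xyz(y, x) is the 3D point seen at camera pixel (x, y).
cv::Mat denseReconstruction(const cv::Mat& camGray, const cv::Mat& projGray,
                            const cv::Mat& K1, const cv::Mat& D1,   // camera intrinsics + distortion
                            const cv::Mat& K2, const cv::Mat& D2,   // projector intrinsics + distortion
                            const cv::Mat& R, const cv::Mat& T)     // camera-projector extrinsics
{
    // Rectify both views so that corresponding points lie on the same scanline.
    cv::Mat R1, R2, P1, P2, Q;
    cv::stereoRectify(K1, D1, K2, D2, camGray.size(), R, T, R1, R2, P1, P2, Q);

    cv::Mat map1x, map1y, map2x, map2y, camRect, projRect;
    cv::initUndistortRectifyMap(K1, D1, R1, P1, camGray.size(), CV_32FC1, map1x, map1y);
    cv::initUndistortRectifyMap(K2, D2, R2, P2, camGray.size(), CV_32FC1, map2x, map2y);
    cv::remap(camGray, camRect, map1x, map1y, cv::INTER_LINEAR);
    cv::remap(projGray, projRect, map2x, map2y, cv::INTER_LINEAR);

    // Block-matching disparity (StereoBM outputs fixed-point disparities, scaled by 16).
    cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 21);
    cv::Mat disparity, disparityF;
    bm->compute(camRect, projRect, disparity);
    disparity.convertTo(disparityF, CV_32F, 1.0 / 16.0);

    // Reproject the disparity map to 3D using the Q matrix from stereoRectify.
    cv::Mat xyz;
    cv::reprojectImageTo3D(disparityF, xyz, Q, true);
    return xyz;
}
```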
Anyhow, this is easier, and more accurate, using two cameras.
Hope this helps.

Augmented Reality OpenGL+OpenCV

I am very new to OpenCV and have limited experience with OpenGL. I want to overlay a 3D object on a calibrated image of a checkerboard. Any tips or guidance?
The basic idea is that you have 2 cameras: the physical one (the one you are retrieving images from with OpenCV) and the OpenGL one. You have to align those two.
To do that, you need to calibrate the physical camera.
First, you need the distortion parameters (because every lens has more or less some optical distortion) and, built from those, the so-called intrinsic parameters. You obtain them by printing a chessboard on paper, using it to capture some images, and calibrating the camera. The internet is full of nice tutorials about this, and from your question it seems you already have them. That's nice.
Then you have to calibrate the position of the camera. This is done with the so-called extrinsic parameters, which encode the position and rotation of the camera in the 3D world.
The intrinsic parameters are needed by the OpenCV functions cv::solvePnP and cv::Rodrigues, which you use to get the extrinsic parameters. solvePnP takes as input two sets of corresponding points: some known 3D points and their 2D projections. That's why all augmented reality applications need markers: the markers are usually square, so after detecting one you know the 2D projections of the points P1(0,0,0), P2(0,1,0), P3(1,1,0), P4(1,0,0) that form a square, and you can find the plane they lie on.
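A minimal sketch of that step, assuming you already have cameraMatrix and distCoeffs from the intrinsic calibration and have detected the four marker corners in the image:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Recover the extrinsic parameters (rotation matrix R and translation tvec)
// of the camera relative to the marker from the 4 corner correspondences.
void estimatePose(const std::vector<cv::Point2f>& detectedCorners,   // P1..P4 found in the image
                  const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
                  cv::Mat& R, cv::Mat& tvec)
{
    // The marker corners in the marker's own 3D coordinate system (the unit square above).
    std::vector<cv::Point3f> markerCorners = {
        {0, 0, 0}, {0, 1, 0}, {1, 1, 0}, {1, 0, 0}
    };

    cv::Mat rvec;                                    // axis-angle rotation vector
    cv::solvePnP(markerCorners, detectedCorners, cameraMatrix, distCoeffs, rvec, tvec);
    cv::Rodrigues(rvec, R);                          // 3x3 rotation matrix, e.g. for the OpenGL modelview
}
```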
Once you have the extrinsic parameters, the rest is easily solved: you just have to set up a perspective projection in OpenGL with the FoV and aperture angle of the camera (derived from the intrinsic parameters) and put the OpenGL camera in the position given by the extrinsic parameters.
Of course, if you want to (and you should) understand and handle each step of this process correctly... there is a lot of math - matrices, angles, quaternions, matrices again, and... matrices again. You can find a reference in the famous Multiple View Geometry in Computer Vision by R. Hartley and A. Zisserman.
Moreover, to handle the OpenGL part correctly you have to deal with so-called "Modern OpenGL" (remember that glLoadMatrix is deprecated) and a bit of shader code for loading the camera matrices (for me this was a problem because I didn't know anything about it).
I dealt with this some time ago and I have some code, so feel free to ask about any problems you have. Here are some links I found interesting:
http://ksimek.github.io/2012/08/14/decompose/ (really good explanation)
Camera position in world coordinate from cv::solvePnP (a question I asked about that)
http://www.morethantechnical.com/2010/11/10/20-lines-ar-in-opencv-wcode/ (fabulous blog about computer vision)
http://spottrlabs.blogspot.it/2012/07/opencv-and-opengl-not-always-friends.html (nice tricks)
http://strawlab.org/2011/11/05/augmented-reality-with-OpenGL/
http://www.songho.ca/opengl/gl_projectionmatrix.html (very good explanation on opengl camera settings basics)
Some other random useful stuff:
http://docs.opencv.org/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html (documentation, always look at the docs!!!)
Determine extrinsic camera with opencv to opengl with world space object
Rodrigues into Eulerangles and vice versa
Python Opencv SolvePnP yields wrong translation vector
http://answers.opencv.org/question/23089/opencv-opengl-proper-camera-pose-using-solvepnp/
Please read them before anything else. As usual, once you get the concepts it is easy, though you may need to bang your head against the wall a bit first. Just don't be scared by all that math :)

Reconstructing 3D from some images without calibration?

I want to do a 3D reconstruction from multiple images without using chessboard calibration. I'm using OpenCV and studying how to obtain a 3D model from 30 images without calibrating the camera with a chessboard pattern.
Is this possible? Where can I get the extrinsic params?
Can I do the 3D reconstruction without calibrating?
The calibration grid (chessboard in the typical OpenCV example) is simply an object of known dimensions that lets you estimate the camera's intrinsic parameters, i.e. the mapping from camera coordinates to the image coordinates of a point. This includes focal length, centre of projection, radial distortion parameters et cetera.
If you do away with the calibration object, you will need to find these parameters from the image observations themselves. This approach is called "self-calibration" or "auto-calibration" and can be fairly involved. Basically, you are trying to get a good starting point for the follow-up non-linear optimisation (i.e. bundle adjustment). For a start, you might want to refer to the PhD thesis of Marc Pollefeys, who came up with a simple linear algorithm for this problem:
http://www.cs.unc.edu/~marc/pubs/PollefeysIJCV04.pdf

Rigid motion estimation

What I have now are the 3D point sets as well as the projection parameters of the camera. Given two 2D point sets, projected from the 3D points by the camera and by the transformed camera (rotated and translated), there should be an intuitive way to estimate the camera motion... I have read parts of Zisserman's book "Multiple View Geometry in Computer Vision", but I still haven't found the solution.
Are there any hints on how the rigid motion can be estimated in this case?
THANKS!!
What you are looking for is a solution to the PnP problem. OpenCV has a function for this called solvePnP. Just to be clear, for this to work you need the point locations in world space, a camera matrix, and the points' projections onto the image plane. It will then tell you the rotation and translation of the camera or of the points, depending on how you choose to think of it.
Adding to the previous answer, Eigen has an implementation of Umeyama's method for estimating the rigid transformation between two sets of 3D points. You can use it to get an initial estimate, and then refine it using an optimization algorithm that also considers the projections of the 3D points onto the images. For example, you could try to minimize the reprojection error between the 2D points on the first image and the projections of the 3D points after you bring them from the reference frame of one camera to the reference frame of the other using the previously estimated transformation. You can do this in both directions, using the transformation and its inverse, and try to minimize the bidirectional reprojection error. I'd recommend the paper "Stereo visual odometry for autonomous ground robots" by Andrew Howard, as well as some of its references, for a better explanation, especially if you are considering an outlier removal/inlier detection step before the actual motion estimation.
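As a small sketch of the Eigen route (point sets are stored one point per column in 3xN matrices; the last argument disables scale estimation so the result is a pure rigid transform):

```cpp
#include <Eigen/Dense>
#include <Eigen/Geometry>

// Estimate the rigid transform T (4x4, homogeneous) mapping src points onto dst points.
Eigen::Matrix4d estimateRigidMotion(const Eigen::Matrix3Xd& src,
                                    const Eigen::Matrix3Xd& dst)
{
    Eigen::Matrix4d T = Eigen::umeyama(src, dst, false);
    // T.topLeftCorner<3, 3>() is the rotation, T.topRightCorner<3, 1>() the translation.
    return T;
}
```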