How to do the 2D-3D point correspondence - OpenGL

I'm working with the OpenCV API on an augmented reality project using one camera. I have:
The 3D points of my 3D object (I get 4 points from MeshLab).
The 2D points which I want to follow (I have 4 points); these points are not the projections of the 3D points.
The intrinsic camera parameters.
Using these parameters, I get the extrinsic parameters (rotation and translation, using the cvFindExtrinsicParam function), which I have used to render my model and set the modelview matrix.
My problem is that the 3D model is not shown at a fixed position: it appears at different locations on my image. How can I fix the model's location, and hence the modelview matrix?
In other forums I was told that I should do the 2D-3D correspondence to get the extrinsic parameters, but I don't know how to match my 2D points with the 3D points.

Typically you would design the points you want to track in such a fashion that the 2D-3D correspondence is immediately clear. The easiest way to do this is to have points with different colors. You could also go with some sort of pattern (google "augmented reality cards"), which you would then have to analyze in order to find out how it is rotated in the image. The pattern of course cannot be rotationally symmetric.
If you can't do that, you can try out all the different permutations of the points, plug them into OpenCV to get a pose, project your 3D points to 2D with each candidate pose, and then see which one fits best.
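A minimal C++ sketch of that brute-force idea, assuming the four 3D points, the four 2D points, the intrinsic matrix and the distortion coefficients are already available (all values and variable names below are placeholders; solvePnP plays the role of the cvFindExtrinsicParam function mentioned in the question):

    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <iostream>
    #include <limits>
    #include <vector>

    int main() {
        // Placeholder data -- replace with your own 3D model points, tracked 2D
        // points, intrinsic matrix and distortion coefficients.
        std::vector<cv::Point3f> object3d = {
            {0.f, 0.f, 0.f}, {1.f, 0.f, 0.f}, {1.f, 1.f, 0.f}, {0.f, 1.f, 0.f}};
        std::vector<cv::Point2f> image2d = {
            {320.f, 240.f}, {400.f, 245.f}, {405.f, 320.f}, {318.f, 322.f}};
        cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
        cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);

        // Try every ordering of the 2D points against the fixed 3D ordering and
        // keep the pose with the smallest reprojection error.
        std::vector<int> idx = {0, 1, 2, 3};
        double bestErr = std::numeric_limits<double>::max();
        cv::Mat bestRvec, bestTvec;
        do {
            std::vector<cv::Point2f> candidate;
            for (int i : idx) candidate.push_back(image2d[i]);

            cv::Mat rvec, tvec;
            // P3P works with exactly four point pairs.
            if (!cv::solvePnP(object3d, candidate, K, dist, rvec, tvec,
                              false, cv::SOLVEPNP_P3P))
                continue;

            std::vector<cv::Point2f> reproj;
            cv::projectPoints(object3d, rvec, tvec, K, dist, reproj);
            double err = cv::norm(candidate, reproj, cv::NORM_L2);

            if (err < bestErr) {
                bestErr = err;
                bestRvec = rvec.clone();
                bestTvec = tvec.clone();
            }
        } while (std::next_permutation(idx.begin(), idx.end()));

        std::cout << "best reprojection error: " << bestErr << "\n"
                  << "rvec: " << bestRvec << "\ntvec: " << bestTvec << std::endl;
        return 0;
    }

Bear in mind that with only four points several orderings may fit almost equally well, so colour or pattern cues remain the more robust option.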

Related

Determining homography from known planes?

I've got a question related to multiple view geometry.
I'm currently dealing with a problem where I have a number of images collected by a drone flying around an object of interest. This object is planar, and I am hoping to eventually stitch the images together.
Leaving aside the classical way of identifying corresponding feature pairs, computing a homography and warping/blending, I want to see what information related to this task I can infer from prior known data.
Specifically, for each acquired image I know the following two things: I know the correspondence between the central point of my image and a point on the object of interest (on whose plane I would eventually want to warp my image). I also have a normal vector to the plane of each image.
So, knowing the centre point (in object-centric world coordinates) and the normal, I can derive the plane equation of each image.
My question is, knowing the plane equation of 2 images is it possible to compute a homography (or part of the transformation matrix, such as the rotation) between the 2?
I get the feeling that this may seem like a very straightforward/obvious answer to someone with deep knowledge of visual geometry but since it's not my strongest point I'd like to double check...
Thanks in advance!
Your "normal" is the direction of the focal axis of the camera.
So, IIUC, you have a 3D point that projects on the image center in both images, which is another way of saying that (absent other information) the motion of the camera consists of the focal axis orbiting about a point on the ground plane, plus an arbitrary rotation about the focal axis, plus an arbitrary translation along the focal axis.
The motion has a non-zero baseline, therefore the transformation between images is generally not a homography. However, the portion of the image occupied by the ground plane does, of course, transform as a homography.
Such a motion is defined by 5 parameters, e.g. the 3 components of the rotation vector for the orbit, plus the angle of rotation about the focal axis, plus the displacement along the focal axis. However, the one point correspondence you have gives you only two equations.
It follows that you don't have enough information to constrain the homography between the images of the ground plane.
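To make the counting concrete, here is the standard plane-induced homography (see Hartley and Zisserman, ch. 13), written in my own notation: K_1, K_2 are the intrinsics, R, t the unknown relative motion, and the plane satisfies n^T X + d = 0 in the first camera's frame.

    H \;=\; K_2 \left( R - \frac{t\, n^{\top}}{d} \right) K_1^{-1}

Knowing the plane in each view fixes neither R nor t; after accounting for the known focal-axis directions, the 5-parameter motion described above remains, while the single centre-point correspondence supplies only 2 equations, so H stays under-determined.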

OpenCV Projection Matrix Choice

I am currently facing a problem; to show you what my program does and should do, here is a copy/paste of the beginning of a previous post I've made.
This program relies on the classic "structure from motion" method.
The basic idea is to take a pair of images, detect their keypoints and compute the descriptors of those keypoints. Then, the keypoint matching is done, with a certain number of tests to ensure the result is good. That part works perfectly.
Once this is done, the following computations are performed: fundamental matrix, essential matrix, SVD decomposition of the essential matrix, camera projection matrix computation and, finally, triangulation.
The result for a pair of images is a set of 3D coordinates, giving us points to be drawn in a 3D viewer. This works perfectly, for a pair.
However, I have to perform a step manually, and this is not acceptable if I want my program to work efficiently with more than two images.
Indeed, I compute my projection matrices according to the classic method, described in the section "Determining R and t from E" of https://en.wikipedia.org/wiki/Essential_matrix
I then have 4 possible solutions for my projection matrix.
I think I have understood the geometrical point of view of the problem, portrayed in this extract from Hartley and Zisserman (sections 9.6.3 and 9.7.1): http://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZepipolar.pdf
Nonetheless, my question is: given the four possible projection matrices and the 3D points computed by the OpenCV function triangulatePoints() (for each projection matrix), how can I select the "true" projection matrix automatically? (Without having to draw my points 4 times in my 3D viewer to check whether they are consistent.)
Thanks for reading.
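The disambiguation described in those Hartley and Zisserman sections is usually automated with a cheirality test: triangulate with each of the four candidate (R, t) pairs and keep the one that places the (majority of the) points in front of both cameras. A rough C++ sketch of that test, with made-up names and assuming the first camera is K[I | 0]:

    #include <opencv2/opencv.hpp>

    // Count how many triangulated points lie in front of both cameras for one
    // candidate (R, t). pts1/pts2 are matched 2xN pixel coordinates (CV_64F)
    // and K is the intrinsic matrix shared by both views.
    static int countInFront(const cv::Mat& K, const cv::Mat& R, const cv::Mat& t,
                            const cv::Mat& pts1, const cv::Mat& pts2)
    {
        cv::Mat P1 = K * cv::Mat::eye(3, 4, CV_64F);   // first camera: K[I | 0]
        cv::Mat Rt;
        cv::hconcat(R, t, Rt);
        cv::Mat P2 = K * Rt;                           // candidate second camera

        cv::Mat X;                                     // 4xN homogeneous points
        cv::triangulatePoints(P1, P2, pts1, pts2, X);
        X.convertTo(X, CV_64F);

        int good = 0;
        for (int i = 0; i < X.cols; ++i) {
            cv::Mat x = X.col(i) / X.at<double>(3, i); // de-homogenise
            double z1 = x.at<double>(2);               // depth in camera 1
            cv::Mat x2 = R * x.rowRange(0, 3) + t;     // same point in camera 2 frame
            if (z1 > 0 && x2.at<double>(2) > 0) ++good;
        }
        return good;
    }

    // Usage: with R1, R2 and t taken from the SVD of the essential matrix, call
    // countInFront for the four candidates (R1, t), (R1, -t), (R2, t), (R2, -t)
    // and keep the candidate with the largest count.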

How do I re-project points in a camera-projector system (after calibration)?

I have seen many blog entries, videos and source code on the internet about how to carry out camera + projector calibration using OpenCV, in order to produce the camera.yml, projector.yml and projectorExtrinsics.yml files.
I have yet to see anyone discussing what to do with these files afterwards. Indeed, I have done a calibration myself, but I don't know what the next step is in my own application.
Say I write an application that now uses the camera-projector system that I calibrated to track objects and project something on them. I will use contourFind() to grab some points of interest from the moving objects, and now I want to project these points (from the projector!) onto the objects!
What I want to do is (for example) track the centre of mass (COM) of an object and show a point on the camera view of the tracked object (at its COM). Then a point should be projected on the COM of the object in real time.
It seems that projectPoints() is the OpenCV function I should use after loading the yml files, but I am not sure how I will account for all the intrinsic and extrinsic calibration values of both camera and projector. Namely, projectPoints() requires as parameters:
the vector of points to re-project (duh!)
rotation + translation matrices. I think I can use the projectorExtrinsics here. Or I can use the composeRT() function to generate a final rotation and a final translation matrix from the projectorExtrinsics (which I have in the yml file) and the cameraExtrinsics (which I don't have. Side question: should I not save them too in a file?).
the intrinsics matrix. This is tricky now: should I use the camera or the projector intrinsics matrix here?
distortion coefficients. Again, should I use the projector or the camera coefficients here?
other params...
So if I use either the projector's or the camera's (which one??) intrinsics + coefficients in projectPoints(), then I will only be 'correcting' for one of the two instruments. Where/how will I use the other instrument's intrinsics?
What else do I need to use apart from load() (for the yml files) and projectPoints()? (Perhaps undistortion?)
ANY help on the matter is greatly appreciated.
If there is a tutorial or a book (no, O'Reilly's "Learning OpenCV" does not talk about how to use the calibration yml files either! - only about how to do the actual calibration), please point me in that direction. I don't necessarily need an exact answer!
First, you seem to be confused about the general role of a camera/projector model: its role is to map 3D world points to 2D image points. This sounds obvious, but it means that given the extrinsics R, t (for orientation and position), the distortion function D(.) and the intrinsics K, you can infer for this particular camera the 2D projection m of a 3D point M as follows: m = K.D(R.M+t). The projectPoints function does exactly that (i.e. 3D to 2D projection), for each input 3D point, hence you need to give it the input parameters associated with the camera in which you want your 3D points projected (projector K & D if you want projector 2D coordinates, camera K & D if you want camera 2D coordinates).
Second, when you jointly calibrate your camera and projector, you do not estimate a set of extrinsics R,t for the camera and another for the projector, but only one R and one t, which represent the rotation and translation between the camera's and projector's coordinate systems. For instance, this means that your camera is assumed to have rotation = identity and translation = zero, and the projector has rotation = R and translation = t (or the other way around, depending on how you did the calibration).
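Putting those two paragraphs together, here is a rough C++ sketch of the projection call for a 3D point expressed in the camera's coordinate frame, assuming the projector intrinsics/distortion and the camera-to-projector R, T came out of the joint calibration (the yml node names below are only guesses at your file layout):

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        // Load calibration results (node names are placeholders -- adapt them to
        // whatever your projector.yml / projectorExtrinsics.yml actually contain).
        cv::FileStorage proj("projector.yml", cv::FileStorage::READ);
        cv::FileStorage ext("projectorExtrinsics.yml", cv::FileStorage::READ);
        cv::Mat Kp, Dp, R, T;
        proj["camera_matrix"] >> Kp;            // projector intrinsics
        proj["distortion_coefficients"] >> Dp;  // projector distortion
        ext["R"] >> R;                          // camera -> projector rotation
        ext["T"] >> T;                          // camera -> projector translation

        // A 3D point expressed in the CAMERA coordinate frame (e.g. the tracked
        // object's centre of mass, once you have estimated its depth).
        std::vector<cv::Point3f> objectPoints = { {0.10f, 0.05f, 1.20f} };

        // projectPoints evaluates m = Kp.Dp(R.M + T), i.e. it maps the point
        // into the projector's 2D coordinates -- the pixel you should light up.
        cv::Mat rvec;
        cv::Rodrigues(R, rvec);                 // projectPoints expects a rotation vector
        std::vector<cv::Point2f> projectorPixels;
        cv::projectPoints(objectPoints, rvec, T, Kp, Dp, projectorPixels);

        std::cout << "draw at projector pixel: " << projectorPixels[0] << std::endl;
        return 0;
    }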
Now, concerning the application you mentioned, the real problem is: how do you estimate the 3D coordinates of a given point?
Using two cameras and one projector, this would be easy: you could track the objects of interest in the two camera images, triangulate their 3D positions from the two 2D projections using the function triangulatePoints, and finally project this 3D point into the projector's 2D coordinates using projectPoints in order to know where to display things with your projector.
With only one camera and one projector, this is still possible but more difficult because you cannot triangulate the tracked points from only one observation. The basic idea is to approach the problem like a sparse stereo disparity estimation problem. A possible method is as follows:
project a non-ambiguous image (e.g. black and white noise) using the projector, in order to texture the scene observed by the camera.
as before, track the objects of interest in the camera image
for each object of interest, correlate a small window around its location in the camera image with the projector image, in order to find where it projects in the projector 2D coordinates (a rough sketch of this step follows below)
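A rough C++ sketch of that last correlation step, ignoring for simplicity the perspective and photometric differences between the camera view and the projected pattern (filenames, window size and the tracked pixel are placeholders):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        cv::Mat cam = cv::imread("camera_view.png", cv::IMREAD_GRAYSCALE);
        cv::Mat pattern = cv::imread("projected_pattern.png", cv::IMREAD_GRAYSCALE);

        // Tracked point of interest in the camera image (e.g. an object's COM).
        cv::Point tracked(245, 180);
        int half = 15;                                   // correlation window radius
        cv::Rect win(tracked.x - half, tracked.y - half, 2 * half + 1, 2 * half + 1);
        cv::Mat templ = cam(win);

        // Normalised cross-correlation of the window against the pattern image.
        cv::Mat score;
        cv::matchTemplate(pattern, templ, score, cv::TM_CCOEFF_NORMED);
        cv::Point best;
        cv::minMaxLoc(score, nullptr, nullptr, nullptr, &best);

        // Centre of the best-matching window = approximate projector coordinates.
        cv::Point projectorPixel = best + cv::Point(half, half);
        std::cout << "projector pixel: " << projectorPixel << std::endl;
        return 0;
    }

In practice you would restrict the search to the corresponding epipolar line (or rectify first, as in the next approach) rather than correlating against the whole pattern.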
Another approach, which unlike the one above would use the calibration parameters, could be to do a dense 3D reconstruction using stereoRectify and StereoBM::operator() (or gpu::StereoBM_GPU::operator() for the GPU implementation), map the tracked 2D positions to 3D using the estimated scene depth, and finally project into the projector using projectPoints.
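A very rough sketch of that dense pipeline, treating the projected pattern as the projector's "image" and pretending, for simplicity, that camera and projector share the same resolution (all numeric values and filenames are placeholders):

    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        // Calibration data (placeholders -- load your real camera.yml /
        // projector.yml / projectorExtrinsics.yml values instead).
        cv::Mat Kc = (cv::Mat_<double>(3, 3) << 700, 0, 320, 0, 700, 240, 0, 0, 1);
        cv::Mat Kp = (cv::Mat_<double>(3, 3) << 900, 0, 320, 0, 900, 240, 0, 0, 1);
        cv::Mat Dc = cv::Mat::zeros(5, 1, CV_64F), Dp = cv::Mat::zeros(5, 1, CV_64F);
        cv::Mat R = cv::Mat::eye(3, 3, CV_64F);
        cv::Mat T = (cv::Mat_<double>(3, 1) << 0.2, 0.0, 0.0);
        cv::Size imageSize(640, 480);

        // 1. Rectify the camera/projector pair as if it were a stereo rig.
        cv::Mat R1, R2, P1, P2, Q;
        cv::stereoRectify(Kc, Dc, Kp, Dp, imageSize, R, T, R1, R2, P1, P2, Q);

        // 2. Remap both "views": the camera image and the pattern you projected.
        cv::Mat camImg = cv::imread("camera_view.png", cv::IMREAD_GRAYSCALE);
        cv::Mat patImg = cv::imread("projected_pattern.png", cv::IMREAD_GRAYSCALE);
        cv::Mat map1x, map1y, map2x, map2y, rectCam, rectPat;
        cv::initUndistortRectifyMap(Kc, Dc, R1, P1, imageSize, CV_32FC1, map1x, map1y);
        cv::initUndistortRectifyMap(Kp, Dp, R2, P2, imageSize, CV_32FC1, map2x, map2y);
        cv::remap(camImg, rectCam, map1x, map1y, cv::INTER_LINEAR);
        cv::remap(patImg, rectPat, map2x, map2y, cv::INTER_LINEAR);

        // 3. Dense disparity -> depth -> 3D points in the rectified camera frame.
        cv::Ptr<cv::StereoBM> bm = cv::StereoBM::create(64, 21);
        cv::Mat disp16, disp;
        bm->compute(rectCam, rectPat, disp16);
        disp16.convertTo(disp, CV_32F, 1.0 / 16.0);   // StereoBM returns fixed-point
        cv::Mat xyz;
        cv::reprojectImageTo3D(disp, xyz, Q);

        // 4. Read the 3D position under a tracked pixel (here an arbitrary one,
        //    after mapping it into the rectified image), then feed that point to
        //    projectPoints with the projector parameters, as in the sparse case.
        std::cout << xyz.at<cv::Vec3f>(240, 320) << std::endl;
        return 0;
    }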
Anyhow, this is easier, and more accurate, using two cameras.
Hope this helps.

Reconstructing 3D from some images without calibration?

I want to make a 3D reconstruction from multiple images without using a chessboard calibration. I'm using OpenCV and studying how to obtain a 3D model from 30 images without calibrating the camera with a chessboard pattern.
Is this possible? Where can I get the extrinsic parameters?
Can I make the 3D reconstruction without calibrating?
The calibration grid (chessboard in the typical OpenCV example) is simply an object of known dimensions that lets you estimate the camera's intrinsic parameters, i.e. the mapping from camera coordinates to the image coordinates of a point. This includes focal length, centre of projection, radial distortion parameters et cetera.
If you do away with the calibration object, you will need to find these parameters from the image observations themselves. This approach is called "self-calibration" or "auto-calibration" and can be fairly involved. Basically, you are trying to get a good starting point for the follow-up non-linear optimisation (i.e. bundle adjustment). For a start, you might want to refer to Marc Pollefeys' PhD thesis; he came up with a simple linear algorithm for this problem:
http://www.cs.unc.edu/~marc/pubs/PollefeysIJCV04.pdf

How to get curve from intersection of point cloud and arbitrary plane?

I have various point clouds defining RT-STRUCTs called ROIs, taken from DICOM files. DICOM files are produced by tomographic scanners. Each ROI is formed by a point cloud and represents some 3D object.
The goal is to get the 2D curve formed by a plane cutting the ROI's point cloud. The problem is that I can't just use the points which are intersected by the plane. What I probably need is to intersect a 3D concave hull with the plane and get the resulting intersection contour.
Are there any libraries which have already implemented these operations? I've found the PCL library and it should probably be able to solve my problem, but I can't figure out how to achieve this with PCL. In addition, I can use Matlab as well - we use it through its runtime from C++.
Has anyone stumbled upon this problem already?
P.S. As I've mentioned above, I need to use the solution from my C++ code - so it should be some library or a Matlab solution which I'll use through the Matlab Runtime.
P.P.S. Accuracy in this kind of calculation is really important - it will be used in medical software intended for work with brain tumors, so you can imagine the consequences of an error (:
You first need to form a surface from the point set.
If it's possible to pick a 2D direction for the points (i.e. they form a convex hull in one view) you can use a simple 2D Delaunay triangulation in those 2 coordinates.
Otherwise you need a full 3D surfacing function (marching cubes or Poisson reconstruction).
Then, once you have the triangles, it's simple to calculate the contour line where a plane cuts them.
See links in Mesh generation from points with x, y and z coordinates
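For the last step (computing the contour where the plane cuts the triangles) no extra library is strictly needed; here is a small self-contained C++ sketch of the per-triangle clipping, with made-up types, assuming the plane is given as n.x = d and ignoring the degenerate case of a vertex lying exactly on the plane:

    #include <array>
    #include <iostream>
    #include <vector>

    struct Vec3 { double x, y, z; };

    static double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    // Signed distance of p to the plane n.x = d (keep n normalised if the actual
    // distance value matters; only the sign is used for the crossing test).
    static double signedDist(const Vec3& p, const Vec3& n, double d) { return dot(n, p) - d; }

    static Vec3 lerp(const Vec3& a, const Vec3& b, double t) {
        return { a.x + t*(b.x - a.x), a.y + t*(b.y - a.y), a.z + t*(b.z - a.z) };
    }

    // Intersect one triangle with the plane; returns 0 or 2 points, i.e. one
    // segment of the final contour.
    static std::vector<Vec3> cutTriangle(const std::array<Vec3, 3>& tri,
                                         const Vec3& n, double d)
    {
        std::vector<Vec3> seg;
        for (int i = 0; i < 3; ++i) {
            const Vec3& a = tri[i];
            const Vec3& b = tri[(i + 1) % 3];
            double da = signedDist(a, n, d), db = signedDist(b, n, d);
            if ((da > 0) != (db > 0)) {            // this edge crosses the plane
                double t = da / (da - db);         // interpolation parameter on the edge
                seg.push_back(lerp(a, b, t));
            }
        }
        return seg;
    }

    int main() {
        // Tiny example: a slanted triangle cut by the plane z = 0.5.
        std::array<Vec3, 3> tri = { Vec3{0,0,0}, Vec3{1,0,1}, Vec3{0,1,1} };
        for (const Vec3& p : cutTriangle(tri, Vec3{0,0,1}, 0.5))
            std::cout << p.x << " " << p.y << " " << p.z << "\n";
        return 0;
    }

Collecting one segment per crossed triangle and chaining shared endpoints gives the closed cut curve.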
Perhaps you could just discard the points that are far from the plane and project the remaining ones onto the plane. You'll still need to reconstruct the curve in the plane but there are several good methods for that. See for instance http://www.cse.ohio-state.edu/~tamaldey/curverecon.htm and http://valis.cs.uiuc.edu/~sariel/research/CG/applets/Crust/Crust.html.
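A small C++ sketch of that filter-and-project idea, with a unit plane normal n, offset d and a distance threshold eps that you would choose for your data (all values below are placeholders):

    #include <cmath>
    #include <iostream>
    #include <vector>

    struct Pt { double x, y, z; };

    int main() {
        // Plane n.x = d with a UNIT normal n, and a distance threshold eps.
        const Pt n{0.0, 0.0, 1.0};
        const double d = 10.0, eps = 0.5;

        std::vector<Pt> cloud = { {1.0, 2.0, 10.2}, {3.0, 1.0, 9.7}, {5.0, 5.0, 14.0} };

        std::vector<Pt> onPlane;
        for (const Pt& p : cloud) {
            double dist = n.x*p.x + n.y*p.y + n.z*p.z - d;   // signed distance to plane
            if (std::fabs(dist) > eps) continue;             // discard far points
            // Project onto the plane: p - dist * n.
            onPlane.push_back({ p.x - dist*n.x, p.y - dist*n.y, p.z - dist*n.z });
        }

        // 'onPlane' now holds the points to feed into a 2D curve-reconstruction
        // method, e.g. the crust-style algorithms linked above.
        for (const Pt& p : onPlane)
            std::cout << p.x << " " << p.y << " " << p.z << "\n";
        return 0;
    }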