Vanishing points in a Rubik's cube - computer-vision

Hi, I'm trying to do Ex 6.5 from Szeliski's book ... But I'm stuck at points 3 and 4. I have the theory of what a vanishing point is, but what does it mean to find it for each face? And what about the focal length and rotation angle for those VPs? If you can provide some easy-to-understand resources, I would appreciate it.

A vanishing point is the image point at which the projections of lines that are parallel in 3D converge. So in order to find the vanishing points you take lines on the face that are parallel (there are 6 of these per face on a Rubik's cube, 3 in each of 2 directions); where those lines intersect in the image is a vanishing point for that plane. You should be able to find 2 vanishing points per face unless you're looking at the cube head on, in which case the lines stay parallel in the image.
You are a bit mistaken in saying "the focal length and rotation angle for those VP"; the book wants you to find these FROM the vanishing points. After you've found the vanishing points, you can use them together with the other points from the face to construct the plane of the face (in the coordinate system it would be described by a single vector normal to the plane). The rotation angle would then be the angle between the image plane and the plane of the face.
Unfortunately I'm not well versed in finding focal lengths. But there should be a way of determining focal length of the camera by knowing the actual distance between the cubes. You could try reading this: from an image processing class I took.
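As a rough illustration of the steps above, here is a minimal Python sketch. It intersects image lines in homogeneous coordinates to get a vanishing point, and then uses the standard relation for two vanishing points of perpendicular directions, (v1 - c) . (v2 - c) = -f^2, to recover the focal length. The function names are made up for this example, and it assumes square pixels, zero skew and a known principal point c (the image center is a common guess). It will not work for a head-on view, where the vanishing point lies at infinity.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def vanishing_point(segments):
    """Least-squares intersection of image segments that are parallel in 3D.

    segments: list of ((x1, y1), (x2, y2)) pairs in pixel coordinates.
    """
    lines = np.array([line_through(p, q) for p, q in segments])
    # Normalise each line so that all of them carry similar weight.
    lines /= np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    # The point v minimising |lines @ v| is the last right singular vector.
    _, _, vt = np.linalg.svd(lines)
    v = vt[-1]
    return v[:2] / v[2]

def focal_from_orthogonal_vps(v1, v2, principal_point):
    """Focal length in pixels from vanishing points of two perpendicular directions."""
    c = np.asarray(principal_point, dtype=float)
    d = -np.dot(np.asarray(v1, dtype=float) - c, np.asarray(v2, dtype=float) - c)
    if d <= 0:
        raise ValueError("vanishing points are inconsistent with this principal point")
    return np.sqrt(d)
```

For one face of the cube you would pass the three sticker-boundary segments of each direction to vanishing_point and feed the two results to focal_from_orthogonal_vps.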


Rendering an atmosphere around a planet with shading

I have made a planet and want to render an atmosphere around it. So I was referring to this site:
I don't understand this:
As with the lookup table proposed in Nishita et al. 1993, we can get the optical depth for the ray to the sun from any sample point in the atmosphere. All we need is the height of the sample point (x) and the angle from vertical to the sun (y), and we look up (x, y) in the table. This eliminates the need to calculate one of the out-scattering integrals. In addition, the optical depth for the ray to the camera can be figured out in the same way, right? Well, almost. It works the same way when the camera is in space, but not when the camera is in the atmosphere. That's because the sample rays used in the lookup table go from some point at height x all the way to the top of the atmosphere. They don't stop at some point in the middle of the atmosphere, as they would need to when the camera is inside the atmosphere.
Fortunately, the solution to this is very simple. First we do a lookup from sample point P to the camera to get the optical depth of the ray passing through the camera to the top of the atmosphere. Then we do a second lookup for the same ray, but starting at the camera instead of starting at P. This will give us the optical depth for the part of the ray that we don't want, and we can subtract it from the result of the first lookup. Examine the rays starting from the ground vertex (B1) in Figure 16-3 for a graphical representation of this.
First Question - isn't optical depth dependent on how you look, that is, on the viewing angle? If yes, the table just gives me the optical depth of the rays going from the ground to the top of the atmosphere in a straight line. So what about the case where the rays pierce the atmosphere to reach the camera? How do I get the optical depth in this case?
Second Question - What is the vertical angle it is talking about? Is it the same as the angle with the z-axis that we use in polar coordinates?
Third Question - The article talks about scattering of the rays going to the sun. Shouldn't it be the other way around, like coming from the sun to a point?
Any explanation on the article or on my questions will help a lot.
Thanks in advance!
I am no expert in the matter but have played with Atmospheric scattering and various physical and optical simulations. I strongly recommend to look at this:
my VEEERRRYYY Simplified version of atmospheric scattering in GLSL
It does not do the full volume integration, just a linear path integration along the ray, and it does only Rayleigh scattering with isotropic coefficients. As you can see, it's still good enough.
In real scattering the viewing angle affects the scattering equation, as the scattering coefficients differ with the angle (relative to the main light source and the viewer). So the answer to your first question is: yes, it does.
I'm not sure what you are referring to in your second question. The scattering itself depends on the angle between light source, particle and camera, which lies in an arbitrary plane. However, if the Earth's surface is accounted for in the equation too, then it depends on the horizontal and vertical angles (relative to the terrain), so azimuth and elevation, as usually more light is reflected when the camera is facing the sun (azimuth) and the reflected rays are closer to your elevation. So my guess is that's what the horizontal angle is about: accounting for reflected light from the surface.
As for your 3rd question, this is called back ray tracing. You can cast rays both ways (from the camera or from the sun); however, if you start from the light source you do not know which way to go to hit a pixel on the camera screen, so you need to cast a lot of rays to raise the probability of hits enough to fill the screen, which is too slow and inaccurate (it produces holes). If you start from a screen pixel instead, then you cast just a single ray (or one per wavelength), which is much, much faster. The resulting color is the same.
[Edit1] vertical angle
OK, I read the linked topic a bit and this is how I understand it:
So it's just the angle between the surface normal and the cast ray. It's scaled so that vert.angle=0 means that the ray and the normal point the same way and vert.angle=1 means they point in opposite directions.
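To make the two-lookup trick from the quoted article concrete, here is a small Python sketch. It is not the article's code: instead of a precomputed table it integrates the density numerically, so depth_to_top stands in for one table lookup indexed by (height, angle from vertical). The radii, the scale height and all function names are assumptions made for this example, and the rays are assumed not to hit the planet.

```python
import numpy as np

PLANET_R = 6360.0  # km, assumed planet radius
ATMOS_R = 6420.0   # km, assumed radius of the top of the atmosphere
SCALE_H = 8.0      # km, assumed density scale height

def integrate_density(start, end, steps=2000):
    """Midpoint-rule integral of the relative air density along start -> end."""
    ts = (np.arange(steps) + 0.5) / steps
    pts = start + np.outer(ts, end - start)
    heights = np.linalg.norm(pts, axis=1) - PLANET_R
    return np.exp(-heights / SCALE_H).mean() * np.linalg.norm(end - start)

def depth_to_top(p, direction):
    """Optical depth (up to a constant coefficient) from p along `direction`
    until the ray leaves the atmosphere. The value depends only on the height
    of p and the angle between the local vertical and the ray."""
    p = np.asarray(p, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    b = np.dot(p, d)
    # Distance to the outer sphere: solve |p + t d| = ATMOS_R for t > 0.
    t_exit = -b + np.sqrt(b * b - np.dot(p, p) + ATMOS_R ** 2)
    return integrate_density(p, p + t_exit * d)

def depth_between(sample, camera):
    """Optical depth sample -> camera as the difference of two 'to the top' values."""
    sample = np.asarray(sample, dtype=float)
    camera = np.asarray(camera, dtype=float)
    direction = camera - sample
    return depth_to_top(sample, direction) - depth_to_top(camera, direction)
```

Both calls in depth_between follow the same ray, so the part beyond the camera cancels and only the segment between the sample point and the camera remains.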

Invalid camera calibration for a head-mounted eye tracking system

I'm working on an Eye Tracking system with two cameras mounted on some kind of glasses. There are optical lenses so that the screen is perceived at around 420 mm from the eye.
From a few dozen pupil samples, we compute two eye models (one for each camera), each located in its camera's coordinate system. This is based on the work here, but modified so that an estimate of the eye center is found using a brute-force approach that minimizes the ellipse projection error on the model given its center position in camera space.
Theoretically, the camera parameters should be approximately symmetrical with respect to the lenses about the Y axis. So each camera should be at coordinates of around (17.5 or -17.5, 0, 3.3) mm in the lens coordinate system, with a rotation of around 42.5 degrees about the Y axis.
However, with these values, there is an offset in the result. See below:
The red point is the gaze center estimated by the left eye tracker, the white one is the right eye tracker, in screen coordinates
The screen limits are represented by the white lines.
The green line is the gaze vector, in camera coordinates (projected in 2D for visualization)
The two camera centers found, projected in 2D, are in the middle of the eye (the blue circle).
The pupil samples and current pupils are represented by the ellipses with matching colors.
The offset on x isn't constant, which means the rotation about Y is not exact, and the positions of the cameras aren't precise either. In order to fix it, we used: this to calibrate and then this to get the rotation parameters from the rotation matrix.
We added a camera in the middle of the lenses (close to the theoretical 0,0,0 point?) to get the extrinsic and intrinsic parameters of the cameras, relative to our lens center. However, with about 50 checkerboard captures from different positions, the results given by OpenCV don't seem correct.
For example, it gives for a camera a position of about (-14,0,10) in lens coordinates for the translation and something like (-2.38, 49, -2.83) as rotation angles in degrees.
The previous screenshots were taken with these parameters. The theoretical ones are a bit further apart, but are more likely to reach the screen borders, unlike the OpenCV values.
This is probably because the test camera is in front of the optic, not behind, where our real 0,0,0 would be located (we just add the distance at which the screen is perceived on the Z axis afterwards, which is 420mm).
However, we have no way to put the camera in (0, 0, 0).
As the system is compact (everything is captured within a few cm^2), each degree or millimeter can change the result drastically, so without precise values for the cameras, we're a bit stuck.
Our objective here is to find an accurate way to get the extrinsic and intrinsic parameters of each camera, so that we can compute a precise position of the center of the eye of the person wearing the glasses, without any calibration procedure other than looking around (so no fixation points).
Right now, the system is precise enough to give a global indication of where someone is looking on the screen, but there is a divergence between the right and left cameras, so it's not precise enough. Any advice or hint that could help us is welcome :)

Matching top view human detections with floor projection on interactive floor project

I'm building an interactive floor. The main idea is to match the detections made with a Xtion camera with objects I draw in a floor projection and have them following the person.
I also detect the projection area on the floor, which translates to a polygon. The camera can detect outside the "screen" area.
The problem is that the algorithm detects the topmost part of the person under it using depth data, and because of the angle between that point and the camera, that point isn't directly above the person's feet.
I know the distance to the floor and the height of the person detected. And I know that the camera is not perpendicular to the floor but I don't know the camera's tilt angle.
My question is how can I project that 3D point onto the polygon on the floor?
I'm hoping someone can point me in the right direction. I've been reading about camera projections but I'm not seeing how to use it in this particular problem.
Thanks in advance
With the answer from Diego O.d.L I was able to get an almost perfect detection. I'll write the steps I used for those who might be looking for the same solution (I won't go into much detail on how the detection is made):
Step 1: Calibration
Here I get some color and depth frames from the camera, using OpenNI, with the projection area cleared.
The projection area is detected on the color frames.
I then convert the detection points to real world coordinates (using OpenNI's CoordinateConverter). With the new real world detection points I look for the plane that best fits them.
Step 2: Detection
I use the detection algorithm to get new person detections and to track them using the depth frames.
These detection points are converted to real world coordinates and projected to the plane previously computed. This corrects the offset between the person's height and the floor.
The points are mapped to screen coordinates using a perspective transform.
Hope this helps. Thank you again for the answers.
Work with the camera coordinate system initially. I'm assuming you don't have problems converting from (row,column,distance) to a real world system aligned with the camera axis (x,y,z):
Calculate the floor plane from three or more points (more for robustness) given in the camera coordinates (x,y,z) (choose your favorite plane-fitting algorithm).
Then find the projection of your head point onto the floor plane.
Finally, you can convert it to the floor coordinate system or just keep it in the camera system.
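The plane fit and the projection can be sketched in a few lines of Python with NumPy. This is only an illustration of the steps above, not the code used in the question: the function names are made up, the plane is fitted by least squares via an SVD (one of several possible choices), and all points are assumed to be in the same camera-aligned (x,y,z) system.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through N >= 3 points; returns (centroid, unit normal)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The normal is the direction of least variance of the centred points,
    # i.e. the last right singular vector.
    _, _, vt = np.linalg.svd(pts - centroid)
    return centroid, vt[-1]

def project_to_plane(point, centroid, normal):
    """Orthogonal projection of a 3D point onto the plane."""
    point = np.asarray(point, dtype=float)
    return point - np.dot(point - centroid, normal) * normal
```

Projecting the detected head point with project_to_plane gives the point on the floor directly below it along the floor normal, without needing the camera's tilt angle explicitly, since the tilt is implied by the fitted plane.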
From the description of your intended application, it is probably more useful for you to recover the image coordinates, I guess.
This type of problem usually benefits from clearly defining the variables.
In this case, you have a head at physical position {x,y,z} and you want the ground projection {x,y,0}. That's trivial, but your camera gives you {u,v,d} (d being depth) and you need to transform that to {x,y,z}.
The easiest solution to find the transform for a given camera positioning may be to simply put known markers on the floor at {0,0,0}, {1,0,0}, {0,1,0} and see where they pop up in your camera.

Sorting of vertices after intersection of 3d isosurface with plane

Here is another geometric problem:
I have created a 3-dimensional triangulated iso-surface of a point cloud using the marching cubes algorithm. Then I intersect this iso-surface with a plane and get a number of line segments that represent the contour lines of the intersection.
Is there any possibility to sort the vertices of these line segments clockwise so that I can draw them as a closed path and do a flood fill?
Thanks in advance!
It depends on how complex your isosurface is, but the simplest thing I can think of that might work is:
For each point, project to the plane. This will give you a set of points in 2d.
Make sure these are centered, via a translation to the centroid or center of the bounding box.
For each 2d point, run atan2 and get an angle. atan2 just puts things in the correct quadrant.
Order by that angle.
If your isosurface/plane intersection is monotonically increasing in angle around the centroid, then this will work fine. If not, then you might need to find the 2 nearest neighbors of each point in the plane, and hope that that makes a simple loop. In fact, the simple loop idea might be simpler, because you don't need to project and you don't need to compute angles - just do everything in 3d.
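Here is a minimal Python sketch of the angle-sorting idea. The function name and the way the in-plane basis is built are choices made for this example; it assumes the points already lie (approximately) in the cutting plane and that the contour is a single loop that is star-shaped around its centroid, as discussed above.

```python
import numpy as np

def sort_clockwise(points, normal):
    """Sort coplanar 3D points clockwise, as seen from the side the normal points to."""
    pts = np.asarray(points, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Build an orthonormal basis (u, v) of the plane with u x v = n.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, helper)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    # Centre the points, express them in 2D and take the polar angle.
    rel = pts - pts.mean(axis=0)
    angles = np.arctan2(rel @ v, rel @ u)
    # Increasing angle is counter-clockwise, so sort in descending order.
    return pts[np.argsort(-angles)]
```

The returned array can be drawn as a closed path by connecting consecutive points and joining the last one back to the first.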

Creating OOBB from points

How can I create minimal OOBB for given points? Creating AABB or sphere is very easy, but I have problems creating minimal OOBB.
The first answer didn't give me good results. I don't have a huge cloud of points, only a small number of them. I am doing collision geometry generation. For example, a cube has 36 points (6 sides, 2 triangles each, 3 points for each triangle), and the algorithm from the first post gave bad results for the cube. Example points for cube: (should return identity axis)
The PCA/covariance/eigenvector method essentially finds the axes of an ellipsoid that approximates the vertices of your object. It should work for random objects, but will give bad results for symmetric objects like the cube. That's because the approximating ellipsoid for a cube is a sphere, and a sphere does not have well defined axes. So you're not getting the standard axes that you expect.
Perhaps if you know in advance that an object is, for example, a cube you can use a specialized method, and use PCA for everything else.
On the other hand, if you want to compute the true OBB there are existing implementations you can use e.g.
(archived at and I believe this implements the algorithm alluded to in the comments to your question.
Quoting from that page:
The ContMinBox3 files implement an algorithm for computing the minimum-volume box containing the points. This method computes the convex hull of the points, a convex polyhedron. The minimum-volume box either has a face coincident with a face of the convex polyhedron or has axis directions given by three mutually perpendicular edges of the convex polyhedron. Each face of the convex polyhedron is processed by projecting the polyhedron to the plane of the face, computing the minimum-area rectangle containing the projections, and computing the minimum-length interval containing the projections onto the perpendicular of the face. The minimum-area rectangle and minimum-length interval combine to form a candidate box. Then all triples of edges of the convex polyhedron are processed. If any triple has mutually perpendicular edges, the smallest box with axes in the directions of the edges is computed. Of all these boxes, the one with the smallest volume is the minimum-volume box containing the original point set.
If, as you say, your objects do not have a large number of vertices, the running time should be acceptable.
In a discussion at the author of the above library casts some more light on the topic:
Gottschalk's approach to OBB construction is to compute a covariance matrix for the point set. The eigenvectors of this matrix are the OBB axes. The average of the points is the OBB center. The OBB is not guaranteed to have the minimum volume of all containing boxes. An OBB tree is built by recursively splitting the triangle mesh whose vertices are the point set. A couple of heuristics are mentioned for the splitting.
The minimum volume box (MVB) containing a point set is the minimum volume box containing the convex hull of the points. The hull is a convex polyhedron. Based on a result of Joe O'Rourke, the MVB is supported by a face of the polyhedron or by three perpendicular edges of the polyhedron. "Supported by a face" means that the MVB has a face coincident with a polyhedron face. "Supported by three perpendicular edges" means that three perpendicular edges of the MVB are coincident with edges of the polyhedron.
As jyk indicates, implementing any of these algorithms is not trivial. However, never let that discourage you from trying :) An AABB can be a good fit, but it can also be a very bad fit. Consider a "thin" cylinder with end points at (0,0,0) and (1,1,1) [imagine the cylinder is the line segment connecting the points]. The AABB is 0 <= x <= 1, 0 <= y <= 1, and 0 <= z <= 1, with a volume of 1. The MVB has center (1,1,1)/2, an axis (1,1,1)/sqrt(3), and an extent for this axis of sqrt(3)/2. It also has two additional axes perpendicular to the first axis, but the extents are 0. The volume of this box is 0. If you give the line segment a little thickness, the MVB becomes slightly larger, but still has a volume much smaller than that of the AABB.
Which type of box you choose should depend on your own application's data.
Implementations of all of this are at my website. I use the median-split heuristic for the bounding-volume trees. The MVB construction requires a convex hull finder in 2D, a convex hull finder in 3D, and a method for computing the minimum area box containing a set of planar points--I use the rotating caliper method for this.
First you have to compute the centroid of the points, in pseudocode
mu = sum(i = 0..N-1, x[i]) / N
then you have to compute the covariance matrix
C = sum(i = 0..N-1, mult(x[i] - mu, transpose(x[i] - mu)))
Note that mult multiplies a (3x1) matrix by a (1x3) matrix, and the result is a 3x3 matrix.
The eigenvectors of the C matrix define the three axes of the OBB.
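A runnable version of this covariance approach might look like the following Python sketch using NumPy. The function name and return layout are made up for this example. Note that, as explained in the other answers, this does not give the minimum-volume box in general, and for symmetric inputs such as a cube the eigenvectors are not well defined, so the axes can come out arbitrarily rotated.

```python
import numpy as np

def pca_obb(points):
    """Covariance-based OBB; returns (center, axes, half_extents).

    axes is a 3x3 matrix whose columns are the box directions.
    """
    pts = np.asarray(points, dtype=float)
    mu = pts.mean(axis=0)
    centered = pts - mu
    # Dividing by N only scales the matrix and does not change the eigenvectors.
    cov = centered.T @ centered / len(pts)
    # eigh handles symmetric matrices and returns orthonormal eigenvectors,
    # ordered by ascending eigenvalue.
    _, axes = np.linalg.eigh(cov)
    # Express the points in the box frame and take min/max along each axis.
    local = centered @ axes
    lo, hi = local.min(axis=0), local.max(axis=0)
    # The box center is the midpoint of the extents, not the centroid.
    center = mu + axes @ ((lo + hi) / 2)
    return center, axes, (hi - lo) / 2
```

The box center is computed from the extents rather than taken as the mean of the points, because the mean can be off-center when the points are unevenly distributed.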
There is a new C++ library, ApproxMVBB, available online which computes an approximation of the minimum volume bounding box. It's released under the MPL 2.0 license, and written by me.
If you have time look at:
The library is C++11 compatible and only needs Eigen.
Tests show that an approximation for 140 million points in 3D can be computed in reasonable time (around 5-7 seconds), depending on your settings for the approximation.