My models are axis-aligned.
I know the normal of the surface I want to draw them on; how do I compute the rotations or rotation matrix that I should use to draw the model perpendicular to the surface?
Well, if u is the unit up vector, and n is the unit surface normal, then you want a rotation that turns u into n. (This rotation isn't unique: you can compose any rotation about u before it, and any rotation about n after it.)
How to compute this rotation? Well, it depends on what framework or API you're working with, and how rotations are represented. A decent framework has an API call for this, for example in Unity 3D there's Quaternion.FromToRotation.
If you don't have an API call for that, then you can work it out for yourself. Any rotation can be represented in axis–angle form, say (a, θ). Here θ is the angle between the vectors u and n (which you can compute using the cosine rule: u · n = |u| |n| cos θ), and a is an axis perpendicular to both vectors: a = u × n. (There are two special cases: when u = n the null rotation will do, and if u = −n any half-rotation about any vector perpendicular to u will do.)
If you're using OpenGL you can now call glRotate.
And if you don't even have that ... well, you've no business asking this kind of basic question! But see Wikipedia for how to turn axis–angle representation into matrix representation.
Related
I have a fisheye camera, which I have already calibrated. I need to calculate the camera pose w.r.t a checkerboard just by using a single image of said checkerboard,the intrinsic parameters, and the size of the squares of the checkerboards. Unfortunately many calibration libraries first calculate the extrinsic parameters from a set of images and then the intrinsic parameters, which is essentially the "inverse" procedure of what I want. Of course I can just put my checkerboard image inside the set of other images I used for the calibration and run the calib procedure again, but it's very tedious, and moreover, I can't use a checkerboard of different size from the ones used for the instrinsic calibration. Can anybody point me in the right direction?
EDIT: After reading francesco's answer, I realized that I didn't explain what I mean by calibrating the camera. My problem begins with the fact that I don't have the classic intrinsic parameters matrix (so I can't actually use the method Francesco described).In fact I calibrated the fisheye camera with the Scaramuzza's procedure (https://sites.google.com/site/scarabotix/ocamcalib-toolbox), which basically finds a polynom which maps 3d world points into pixel coordinates( or, alternatively, the polynom which backprojects pixels to the unit sphere). Now, I think these information are enough to find the camera pose w.r.t. a chessboard, but I'm not sure exactly how to proceed.
the solvePnP procedure calculates extrinsic pose for Chess Board (CB) in camera coordinates. openCV added a fishEye library to its 3D reconstruction module to accommodate significant distortions in cameras with a large field of view. Of course, if your intrinsic matrix or transformation is not a classical intrinsic matrix you have to modify PnP:
Undo whatever back projection you did
Now you have so-called normalized camera where intrinsic matrix effect was eliminated.
k*[u,v,1]T = R|T * [x, y, z, 1]T
The way to solve this is to write the expression for k first:
k=R20*x+R21*y+R22*z+Tz
then use the above expression in
k*u = R00*x+R01*y+R02*z+Tx
k*v = R10*x+R11*y+R12*z+Tx
you can rearrange the terms to get Ax=0, subject to |x|=1, where unknown
x=[R00, R01, R02, Tx, R10, R11, R12, Ty, R20, R21, R22, Tz]T
and A, b
are composed of known u, v, x, y, z - pixel and CB corner coordinates;
Then you solve for x=last column of V, where A=ULVT, and assemble rotation and translation matrices from x. Then there are few ‘messy’ steps that are actually very typical for this kind of processing:
A. Ensure that you got a real rotation matrix - perform orthogonal Procrustes on your R2 = UVT, where R=ULVT
B. Calculate scale factor scl=sum(R2(i,j)/R(i,j))/9;
C. Update translation vector T2=scl*T and check for Tz>0; if it is negative invert T and negate R;
Now, R2, T2 give you a good starting point for non linear algorithm optimization such as Levenberg Marquardt. It is required because a previous linear step optimizes only an algebraic error of parameters while non-linear one optimizes a correct metrics such as squared error in pixel distances. However, if you don’t want to follow all these steps you can take advantage of the fish-eye library of openCV.
I assume that by "calibrated" you mean that you have a pinhole model for your camera.
Then the transformation between your chessboard plane and the image plane is a homography, which you can estimate from the image of the corners using the usual DLT algorithm. You can then express it as the product, up to scale, of the matrix of intrinsic parameters A and [x y t], where x and y columns are the x and y unit vectors of the world's (i.e. chessboard's) coordinate frame, and t is the vector from the camera centre to the origin of that same frame. That is:
H = scale * A * [x|y|t]
Therefore
[x|y|t] = 1/scale * inv(A) * H
The scale is chosen so that x and y have unit length. Once you have x and y, the third axis is just their cross product.
I am using chai3d api, which uses a 3x3 floating pt matrix for storing objects orientation in my virtual world.
I want to predict these orientations on client side, after periodic updates from server, so that I have a consistent virtual graphical world.
I predict the objects (e.g. opengl cube) position by sending a position and velocity value.
Is angular velocity for orientation same as velocity for position?
if yes how do I calculate the angular velocity from this 3x3 matrix and use it for extrapolation?
A transformation matrix is essentially a representation of a new coordinate system within another coordinate system. If you add a column you can even put the translation into it. If you remember your calculus and physics, then you may remember
r = 1/2 a² t + v0 t + r0
v = d/dt r = a + v0
a = d/dt v
To get from velocity 'v' to position 'r' you have to integrate. In the case of scalars you multiply v with time. But scalar multiplication with a matrix will just scale it, not rotate it. So you must do something else. The keyword, if you want to do this using matrices is matrix powers, i.e. calculating the powers of a matrix.
Say you have a differential rotation, d/dt R, then you would integrate this, by multiplying the corresponding rotation matrix infinitesimaly often with itself, i.e. take a power.
But there's also a mathematically much nicer way to do this. Something very close to just multiplying with a factor. And that is: Using quaternions instead of matrices to represent orientations. It turns out that simply scaling a quaternions is the same as just multiplying on the rotation it desscribes.
The keywords you should Google for (because StackOverflow is the wrong place for introducing one into the whole theory of quaternions) are:
quaternion
angular velocity
angular interpolation
SLERP http://en.wikipedia.org/wiki/Slerp
The Project
I am working on a texture tracking project for mobile. It exclusively tracks planar surfaces so I have been using openCV's cv::FindHomography() to calculate the homography between two frames. That function runs very very slow however and is the primary bottleneck in my pipeline. I decided that an algorithm that can take an initial estimate of the homography would run much faster because my change in homography between frames is very small. Also, my outlier percentage is very small so robust methods are optional. Unfortunately, to my knowledge open CV does not include a homography finder that takes an initial estimate. It does however include solvePnP() which takes the original 3d world coordinates of the scene, the current 2d image coordinates, a camera matrix, distortion parameters, and most importantly an initial estimate. I am trying to replace FindHomography with solvePnP. Since I use only 2d coordinates throughout the pipeline and solvePnP asks for 3d coordinates I am trying to move from 2d->3d->3d_transform->2d_transform. Right now that process runs 6x faster than FindHomography() if it is given a good initial guess but it has issues.
The Problem
Something is wrong with how I am converting. My theory was that since a camera matrix is not required to find a homography it should not be required for this process since I only want the information contained in a homography in the end. I also assumed that since I throw out all z information in the end how I initialize z should not matter. My process is as follows
First I convert all my initial 2d coordinates to 3d by giving them a z pos of 1. I can assume that my original coordinates lie flat in the x-y plane. Then
cv::Mat rot_mat; //3x3 rotation matrix
cv::Mat pnp_rot; //3x1 rotation vector
cv::Mat pnp_tran; //3x1 translation vector
cv::Matx33f camera_matrix(1,0,0,
0,1,0,
0,0,1);
cv::Matx41f dist(0,0,0,0);
cv::solvePnP(original_cord, current_cord, camera_matrix, dist, pnp_rot, pnp_tran,true);
//Rodrigues converts from a rotation vector to a rotation matrix
cv::Rodrigues(pnp_rot, rot_mat);
cv::Matx33f homography(rot_mat(0,0),rot_mat(0,1),pnp_tran(0),
rot_mat(1,0),rot_mat(1,1),pnp_tran(1),
rot_mat(2,0),rot_mat(2,1),pnp_tran(2)+1);
The conversion to a homography here is simple. The first two columns of the homography are from the 3x3 rotation matrix, the last column is the translation vector. The one trick is that homography(2,2) corresponds to scale while pnp_tran(2) corresponds to movement in the z axis. Given that I initialize my z coordinates to 1, scale is z_translation + 1. This process works perfectly for 4 of 6 degrees of freedom. Translation_x, translation_x, scale, and rotation about z all work. Rotation about x and y however display significant error. I believe that this is due to initializing my points at z = 1 but I don't know how to fix it.
The Question
Was my assumption that I can get good results from solvePnP by using a faked camera matrix and initial z coord correct? If so, how should I set up my camera matrix and z coordinates to make x and y rotation work? Also if anyone knows where I could get a homography finding algorithm that takes an initial guess and works only in 2d, or information on techniques for writing my own it would be very helpful. I will most likely be moving in that direction once I get this working.
Update
I built myself a test program which takes a homography, generates a set of coplanar points from that homography, and then runs the points through solvePnP to recover the specified homography. In the process of doing this I realized that I am fundamentally misunderstanding some part of how homographies are constructed. I have been assuming that a homography is constructed as follows.
hom(0,2) = x translation
hom(1,2) = y translation
hom(2,2) = scale, I can divide the entire matrix by this to normalize
the first two columns I assumed were the first two columns of a 3x3 rotation matrix. This essentially amounts to taking a 3x4 transform and throwing away column(2). I have discovered however that this is not true. The test case showing me the error of my ways was trying to make a homography which rotates points some small angle around the y axis.
//rotate by .0175 rad about y axis
rot_mat = (1,0,.0174,
0,1,0,
-.0174,0,1);
//my conversion method to make this a homography gives
homography = (1,0,0,
0,1,0,
-.0174,0,1);
The above homography does not work at all. Take for example a point x,y,1 where x > 58. The result will be x,y,some_negative_number. When I convert from homogeneous coordinates back to cartesian my x and y values will both flip signs.
All that is to say, I now have a much simpler question that I think would let me solve everything. How do I construct a homography that rotates points by some angle around the x and y axis?
Homographies are not simple translation or rotation matrices. The aim is to map straight lines to straight lines rather than to map single points to other points. They take into account perspective matrices to achieve this and are explained here
Hence, homography matrices cannot be easily decomposed, but there are (complicated) ways to do so shown here. This may help you extract rotations and translations out of it.
This should help you better understand a homography, but the rest I am unfamiliar with.
I have already done the comparison of 2 images of same scene which are taken by one camera with different view angles(say left and right) using SURF in emgucv (C#). And it gave me a 3x3 homography matrix for 2D transformation. But now I want to make those 2 images in 3D environment (using DirectX). To do that I need to calculate relative location and orientation of 2nd image(right) to the 1st image(left) in 3D form. How can I calculate Rotation and Translate matrices for 2nd image?
I need also z value for 2nd image.
I read something called 'Homograhy decomposition'. Is it the way?
Is there anybody who familiar with homography decomposition and is there any algorithm which it implement?
Thanks in advance for any help.
Homography only works for planar scenes (ie: all of your points are coplanar). If that is the case then the homography is a projective transformation and it can be decomposed into its components.
But if your scene isn't coplanar (which I think is the case from your description) then it's going to take a bit more work. Instead of a homography you need to calculate the fundamental matrix (which emgucv will do for you). The fundamental matrix is a combination of the camera intrinsic matrix (K), the relative rotation (R) and translation (t) between the two views. Recovering the rotation and translation is pretty straight forward if you know K. It looks like emgucv has methods for camera calibration. I am not familiar with their particular method but these generally involve taking several images of a scene with know geometry.
To figure out camera motion (exact rotation and translation up to a scaling factor) you need
Calculate fundamental matrix F, for example, using eight-point
algorithm
Calculate Essential matrix E = A’FA, where A is intrinsic camera matrix
Decompose E which is by definition Tx * R via SVD into E=ULV’
Create a special 3x3 matrix
0 -1 0
W = 1 0 0
0 0 1
that helps to run decomposition:
R = UW-1VT, Tx = ULWUT, where
0 -tx ty
Tx = tz 0 -tx
-ty tx 0
Since E can have an arbitrary sign and W can be replace by Winv we have 4 distinct solution and have to select the one which produces most points in front of the camera.
It's been a while since you asked this question. By now, there are some good references on this problem.
One of them is "invitation to 3D image" by Ma, chapter 5 of it is free here http://vision.ucla.edu//MASKS/chapters.html
Also, Vision Toolbox of Peter Corke includes the tools to perform this. However, he does not explain much math of the decomposition
i have an object in 3d space that i want to align according to a vector.
i already got the Y-rotation out by doing an atan2 on the x and z component of the vector. but i would also like to have an X-rotation to make the object look downwards or upwards.
imagine a plane that does it's pitch yaw roll, just without the roll.
i am using openGL to set the rotations so i will need an Y-angle and an X-angle.
I would not use Euler angles, but rather a Euler axis/angle. For that matter, this is what Opengl glRotate uses as input.
If all you want is to map a vector to another vector, there are an infinite number of rotations to do that. For the shortest one, (the one with the smallest angle of rotation), you can use the vector found by the cross product of your from and to unit vectors.
axis = from X to
from there, the angle of rotation can be found from from.to = cos(theta) (assuming unit vectors)
theta = arccos(from.to)
glRotate(axis, theta) will then transform from to to.
But as I said, this is only one of many rotations that can do the job. You need a full referencial to define better how you want the transform done.
You should use some form of quaternion interpolation (Spherical Linear Interpolation) to animate your object going from its current orientation to this new orientation.
If you store the orientations using Quaternions (vector space math), then you can get the shortest path between two orientations very easily. For a great article, please read Understanding Slerp, Then Not Using It.
If you use Euler angles, you will be subject to gimbal lock and some really weird edge cases.
Actually...take a look at this article. It describes Euler Angles which I believe is what you want here.