Camera projection matrix: why transpose rotation matrix? - computer-vision

In the following:
http://cvlab.epfl.ch/files/content/sites/cvlab2/files/data/strechamvs/rathaus.tar.gz
there's a README file that says:
a 3D point X will be projected into the images in the usual way:
x = K[R^T|-R^T t]X
I remember that the 3D-to-2D camera projection matrix uses the rotation R itself, not its transpose, i.e. I expect:
x = K[R|-R t]X
Why does it say R^T and not simply R ?

It depends on the direction in which R was determined, i.e. whether it describes the transformation of the camera in the global reference frame or the transformation of points into the camera's local reference frame.
The true answer is: don't worry, just check that what you've got is right.

Since R^T == R^{-1} for rotation matrices, the first formula simply expects the rotation in the opposite direction from the second one. Just make sure to use the direction they expect as input.
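A quick way to convince yourself is to build one camera pose and evaluate both conventions, feeding each the rotation direction it expects. A minimal numpy sketch (the intrinsics K and the pose are made-up values):

import numpy as np

# assumed intrinsics and a camera pose given in the world frame (hypothetical values)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
theta = np.deg2rad(30.0)
R_cam = np.array([[ np.cos(theta), 0.0, np.sin(theta)],   # camera-to-world rotation
                  [ 0.0,           1.0, 0.0          ],
                  [-np.sin(theta), 0.0, np.cos(theta)]])
t_cam = np.array([1.0, 0.5, -2.0])                        # camera position in the world
X = np.array([0.3, -0.2, 4.0])                            # a 3D point in world coordinates

# README convention: x = K [R^T | -R^T t] X, with R the camera-to-world rotation
x1 = K @ (R_cam.T @ X - R_cam.T @ t_cam)

# "usual" convention: x = K [R | -R t] X, with R the world-to-camera rotation
R_w2c = R_cam.T                                           # the same rotation, reversed
x2 = K @ (R_w2c @ X - R_w2c @ t_cam)

print(x1 / x1[2], x2 / x2[2])           # identical pixels after the perspective divide
print(K @ (R_cam @ X - R_cam @ t_cam))  # feeding the wrong direction gives a different point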

Related

how do I maintain relative transformation b/w 2 objects after changing transformation of any one without a scenegraph?

Say I have two objects, a camera and a cube, both on the XZ plane; the cube has some arbitrary rotation, and the camera is facing the cube.
Now a transformation R is applied to the camera, giving it a new rotation and position.
I want to move the cube in front of the camera using a transformation R1 such that the view looks exactly as it did before R was applied, meaning the relative distance, rotation and scale between the two objects remain the same after both R and R1.
The following image gives the gist of the problem.
Assume that there's no scenegraph that we can use.
I've posed the problem mainly in 2D but I'm trying to solve it in 3D, so rotations can have all yaw, pitch and roll, and translations can be anywhere in 3D space.
EDIT:
I forgot to add what I have done so far.
I figured out how to maintain the relative distance between the camera and the cube: I can project the cube's position to get a world-to-screen point, then unproject that screen point with the new camera position to get the new world position.
For the rotation, however, I have tried the following:
I thought I could apply the same rotation as R in R1. This didn't work: it appears to work if the rotation happens in only one axis, but not if it happens in more than one axis.
I thought I could take the delta rotation between the camera and the cube, apply the camera's rotation to the cube, and then multiply by the delta rotation. This also didn't work.
Let M and V be the model and view matrices before you move the camera, and M2 and V2 the matrices after you move it. To be clear: the model matrix transforms coordinates from object-local coordinates into world coordinates; the view matrix transforms from world coordinates into camera (eye) coordinates. Consequently V*M*p transforms the position p into eye space.
For the position on the screen to stay constant, we need V*M*p = V2*M2*p to be true for all p (that's assuming that the FOV doesn't change). Therefore V*M = V2*M2, or
M2 = inverse(V2)*V*M
If you apply the camera transformation on the right (V2 = V*R) then the above expression for M2 can be simplified:
M2 = inverse(R)*M
(that is you apply inverse(R) on the left of the model matrix to compensate).
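A small numpy check of this compensation, assuming column vectors and made-up rigid transforms:

import numpy as np

rng = np.random.default_rng(0)

def random_rigid():
    # a hypothetical rigid transform: random rotation (via QR) plus random translation
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0        # ensure a proper rotation, det = +1
    T = np.eye(4)
    T[:3, :3] = Q
    T[:3, 3] = rng.normal(size=3)
    return T

V, M, R = random_rigid(), random_rigid(), random_rigid()

V2 = V @ R                       # camera transformation applied on the right
M2 = np.linalg.inv(V2) @ V @ M   # compensating model matrix

p = np.array([0.7, -1.2, 3.0, 1.0])
print(np.allclose(V @ M @ p, V2 @ M2 @ p))     # True: the projected position is unchanged
print(np.allclose(M2, np.linalg.inv(R) @ M))   # True: matches the simplified form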
Alternatively, ask yourself whether you really need to keep the object's coordinates in the world reference frame. It may be easier not to apply the view matrix at all when rendering that object; that would effectively keep it relative to the camera at all times without any additional tweaks, and it would have better numerical stability too.

How to calculate a linear tapering transformation matrix

I need to calculate a 4x4 matrix (for OpenGL) that can transform the 3d object on the left into the one on the right. The transformation is applied in only one axis.
EDIT:
The inputs are a given 3d object (points) to be deformed and a single variable for the amount of deformation.
The picture shows a cube projected onto a plane, with only the relevant changes drawn. There are no changes in the axis perpendicular to the view plane.
The relative position of these two objects is not relevant and used only to show "before and after" situation.
******* wrong answer was here *******
It's somewhat the opposite of a 2D perspective matrix followed by perspective division. So, to do this "perspective" thing in reverse, you would need to do something opposite to perspective division and then multiply the result by an inverted "perspective" matrix. And although the perspective matrix can be inverted, there is no "opposite of perspective division", so I don't think you can do it with matrices alone. You'll have to transform each vertex's coordinates as a function of Y instead.
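Transforming each vertex directly is straightforward. A minimal sketch of a linear taper, assuming the cross-section (X and Z) scales with Y and a single parameter k for the amount of deformation:

import numpy as np

def taper_y(vertices, k):
    # (x, y, z) -> ((1 + k*y) * x, y, (1 + k*y) * z): cross-section scales linearly with y
    v = np.asarray(vertices, dtype=float).copy()
    factor = 1.0 + k * v[:, 1]
    v[:, 0] *= factor
    v[:, 2] *= factor
    return v

# unit cube corners: the top face (y = +0.5) widens and the bottom face narrows
cube = np.array([[x, y, z] for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)])
print(taper_y(cube, 0.4))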

How to flip only one axis of transformation matrix?

I have a 4x4 transformation matrix. However, after trying out the transformation I noticed that movement and rotation about the Y axis go the opposite way. The rest is correct.
I got this matrix from some other API, so it is probably a difference in coordinate systems. So, how can I flip one axis of a transformation matrix?
If it were only translation, I could put a minus sign on the Y translation, but I have no idea how to reverse the rotation about only one axis, since all of the rotation is represented in the same 3x3 area. I thought there might be some way to affect both translation and rotation at the same time (truly flipping the axis).
Edit: I'm pretty sure the operation you're looking for is changing coordinate systems while maintaining Z-up or Y-up. In this case, try negating all the elements of the second column (or row) of your matrix.
This question would be better suited for the Math StackExchange. First, a really helpful read on rotation matrices.
The first problem is the matter of rotation order. I will assume the XYZ rotation order. The rotation matrices for the individual axes are:

Rx(alpha) = [ 1  0           0           ]
            [ 0  cos(alpha) -sin(alpha)  ]
            [ 0  sin(alpha)  cos(alpha)  ]

Ry(beta)  = [  cos(beta)  0  sin(beta)  ]
            [  0          1  0          ]
            [ -sin(beta)  0  cos(beta)  ]

Rz(gamma) = [ cos(gamma) -sin(gamma)  0 ]
            [ sin(gamma)  cos(gamma)  0 ]
            [ 0           0           1 ]

Composing them in the X-then-Y-then-Z order gives R = Rz(gamma)*Ry(beta)*Rx(alpha), where alpha is the X angle, beta is the Y angle, and gamma is the Z angle:

R = [ cos(beta)cos(gamma)  sin(alpha)sin(beta)cos(gamma) - cos(alpha)sin(gamma)  cos(alpha)sin(beta)cos(gamma) + sin(alpha)sin(gamma) ]
    [ cos(beta)sin(gamma)  sin(alpha)sin(beta)sin(gamma) + cos(alpha)cos(gamma)  cos(alpha)sin(beta)sin(gamma) - sin(alpha)cos(gamma) ]
    [ -sin(beta)           sin(alpha)cos(beta)                                   cos(alpha)cos(beta)                                  ]
You can derive the individual axis angles from this matrix. For example, you can derive the Y angle from the -sin(beta) entry using some inverse trigonometry. Given beta, you can derive alpha from the cos(beta)sin(alpha) entry, and gamma from the cos(beta)sin(gamma) entry. Note that the same number in the matrix can correspond to multiple angles (e.g. sin(0)=0 and sin(180)=0).
Now that you know alpha, beta, and gamma, you can reverse beta and remake the rotation matrix.
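A sketch of that procedure in numpy, assuming the Rz(gamma)*Ry(beta)*Rx(alpha) composition shown above (it ignores the gimbal-lock case beta = ±90°, where the decomposition is not unique):

import numpy as np

def rot_x(a):
    return np.array([[1, 0, 0], [0, np.cos(a), -np.sin(a)], [0, np.sin(a), np.cos(a)]])

def rot_y(b):
    return np.array([[np.cos(b), 0, np.sin(b)], [0, 1, 0], [-np.sin(b), 0, np.cos(b)]])

def rot_z(g):
    return np.array([[np.cos(g), -np.sin(g), 0], [np.sin(g), np.cos(g), 0], [0, 0, 1]])

def flip_y_angle(R):
    # decompose R = Rz(gamma) @ Ry(beta) @ Rx(alpha), negate beta, recompose
    beta  = np.arcsin(-R[2, 0])            # from the -sin(beta) entry
    alpha = np.arctan2(R[2, 1], R[2, 2])   # from cos(beta)sin(alpha) / cos(beta)cos(alpha)
    gamma = np.arctan2(R[1, 0], R[0, 0])   # from cos(beta)sin(gamma) / cos(beta)cos(gamma)
    return rot_z(gamma) @ rot_y(-beta) @ rot_x(alpha)

R = rot_z(0.3) @ rot_y(0.7) @ rot_x(-0.2)
print(np.allclose(flip_y_angle(R), rot_z(0.3) @ rot_y(-0.7) @ rot_x(-0.2)))  # True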
There's a good chance that there's a better way to do this using quaternions, but you should ask the Math StackExchange these kinds of language-agnostic questions.
Much shorter answer: if you are not careful with your frame orientation, many things down your pipeline are likely to have a bad hair day. The reason is "parity", a.k.a. "frame orientation", a.k.a. "right-handedness" (or, rarely, left-handedness). Most 3D geometry tools and libraries that work together implicitly assume that all coordinate systems in play are right-handed (or at least consistently handed). Inverting the orientation of just one axis in a coordinate system changes its orientation from right-handed to left-handed or vice versa.
So, suggestion for things to check & try in your problem:
Check that the frame you get from your API is right-handed. You do so by computing the determinant of the 3x3 rotation part of your 4x4 transform matrix: it must be +1 or very close to it.
If it is -1, then flip one of its axes, i.e. change the sign of one of the columns of the 3x3 rotation.
Note carefully: I said "columns" because I assume that you apply a transform Q to a point x by multiplying as Q * x, x being a 4x1 column vector with the last component equal to one. If you use row vectors left-multiplied by Q, you need to flip a row instead.
If that determinant is +1, you have a bug someplace else.
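A minimal sketch of that check in numpy, assuming the column-vector convention:

import numpy as np

def fix_handedness(T):
    # T is a 4x4 transform; inspect the determinant of its 3x3 rotation block
    det = np.linalg.det(T[:3, :3])
    if np.isclose(det, -1.0):
        T = T.copy()
        T[:3, 1] *= -1.0   # left-handed frame: flip one axis (here the second column)
    elif not np.isclose(det, 1.0):
        raise ValueError("rotation block is not a pure rotation: det = %g" % det)
    return T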

3d object overlay - augmented reality irrlicht + opencv

I am trying to develop an augmented reality program that overlays a 3d object on top of a marker. The model does not move along (proportionately) with the marker. Here is the list of things that I did:
1) Using OpenCV:
a) I used the solvePnP method to find rvecs and tvecs.
b) I also used the Rodrigues method to find the rotation matrix, and appended the tvecs vector to get the projection matrix.
c) Just for testing, I made some points and lines and projected them to make a cube. This works perfectly fine and I am getting a good output.
2) Using Irrlicht:
a) I tried to place a 3d model (at position (0,0,0) and rotation (0,0,0)) with the camera feed running in the background.
b) Using the rotation matrix found with Rodrigues in OpenCV, I calculated the pitch, yaw and roll values following this post (http://planning.cs.uiuc.edu/node103.html) and passed them into the rotation field. Into the position field I passed the tvecs values, as tvecs[0], -tvecs[1], tvecs[2].
The model moves in the correct directions, but not proportionately. Meaning: if I move the marker 100 pixels in the x direction, the model only moves 20 pixels (the values 100 and 20 are not measured; I just took arbitrary values to illustrate the point). The same goes for the y and z axes. I know I have to introduce another transformation matrix that maps the OpenCV camera coordinates to Irrlicht camera coordinates, and that it is a 4x4 matrix, but I do not know how to find it. Also, OpenCV's projection matrix [R|t] is a 3x4 matrix that yields a projected 2D point, while the 4x4 mapping between OpenCV and Irrlicht requires a 3D point (made homogeneous) to be fed into a 4x4 matrix. How do I achieve that?
The 4x4 matrix you are writing about seems to be M = [R|t; 0 1], where t is the 3x1 translation vector. To get the transformed coordinates v' of the 4x1 point v ([x y z 1]^T), just compute v' = M*v.
Your problem with scaling may also be caused by a difference between the units used for camera calibration in OpenCV and those used by the other library.
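For example, building M from solvePnP's output and applying it to a marker-frame point might look like this (a sketch: the marker corners, detected pixels, and intrinsics are all placeholder values):

import cv2
import numpy as np

# placeholders: 3D marker corners, their detected 2D pixels, calibrated intrinsics
object_points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=np.float64)
image_points = np.array([[320, 240], [420, 238], [424, 338], [318, 342]], dtype=np.float64)
camera_matrix = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)

R, _ = cv2.Rodrigues(rvec)   # 3x3 rotation matrix from the rotation vector
M = np.eye(4)                # M = [R|t; 0 1]
M[:3, :3] = R
M[:3, 3] = tvec.ravel()

v = np.array([0.5, 0.5, 0.0, 1.0])   # a marker-frame point, made homogeneous
print((M @ v)[:3])                   # the same point in camera coordinates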

Translating a Quaternion

(perhaps this is better for a math Stack Exchange?)
I have a chain composed of bones. Each bone has a tip and a tail. The following code computes where its tip will be, given a rotation, and sets the next link in the chain's position appropriately:
// Quaternion is a hand-rolled class that works correctly (as far as I can tell.)
Quaternion quat = new Quaternion(getRotationAngleDegrees(), getRotation());
// figure out where the tip will be after applying the rotation
Vector3f rotatedTip = quat.applyRotationTo(tip);
// set the next bone's tail to be at this one's tip
updateNextPosFrom(rotatedTip);
This works if the rotation is supposed to occur around the origin of the object's coordinate system. But what if I want the rotation to occur around some other arbitrary point in the object? I'm not sure how to translate the quaternion. What is the best way to do it?
(I'm using JOGL / OpenGL.)
Dual quaternions are useful for expressing rigid spatial transformations (combined rotations and translations.)
Based on dual numbers (one of the Clifford algebras: d = a + e b, where a and b are real and e is nonzero but e^2 = 0), dual quaternions, U + e V, can represent lines in space, with U the unit direction quaternion and V the moment about a reference point. In this way, dual quaternion lines are very much like Pluecker lines.
While the quaternion transform Q V Q* (Q* being the quaternion conjugate of Q) is used to rotate a unit vector quaternion V about a point, a similar dual quaternion form can be used to apply a screw transform to a line (a rigid rotation about an axis combined with a translation along that axis).
Just as any rigid 2D transform can be resolved to a rotation about a point, any rigid 3D transform can be resolved to a screw.
For such power and expressiveness, dual quaternion references are thin, and the Wikipedia article is as good a place as any to start.
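Since references are thin, here is a minimal, hedged sketch of the point-transform use case in numpy (quaternions stored as [w, x, y, z]; the screw values are made up):

import numpy as np

def qmul(a, b):
    # Hamilton product of two quaternions stored as [w, x, y, z]
    w1, v1 = a[0], a[1:]
    w2, v2 = b[0], b[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2], w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def dq_from_rt(r, t):
    # dual quaternion (real, dual) for "rotate by r, then translate by t"
    return r, 0.5 * qmul(np.concatenate(([0.0], t)), r)

def dq_mul(a, b):
    # composition: (r1 + e d1)(r2 + e d2) = r1 r2 + e (r1 d2 + d1 r2); applies b first
    (r1, d1), (r2, d2) = a, b
    return qmul(r1, r2), qmul(r1, d2) + qmul(d1, r2)

def dq_apply(dq, p):
    r, d = dq
    rotated = qmul(qmul(r, np.concatenate(([0.0], p))), qconj(r))[1:]
    return rotated + 2.0 * qmul(d, qconj(r))[1:]   # translation is the vector part of 2 d r*

# example screw: 90 degrees about the z-axis plus a translation of 3 along it
r = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
dq = dq_from_rt(r, np.array([0.0, 0.0, 3.0]))
print(dq_apply(dq, np.array([1.0, 0.0, 0.0])))              # ~[0, 1, 3]
print(dq_apply(dq_mul(dq, dq), np.array([1.0, 0.0, 0.0])))  # the screw applied twice: ~[-1, 0, 6]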
A quaternion is used specifically to handle a rotation factor, but does not include a translation at all.
Typically, in this situation, you'll want to apply a rotation to a point based on the "bone's" length, but centered at the origin. You can then translate post-rotation to the proper location in space.
Quaternions are generally used to represent rotations only; they cannot represent translations as well.
You need to convert your quaternion into a rotation matrix, insert it into the appropriate part of your standard OpenGL 4x4 matrix, and combine it with a translation in order to rotate about an arbitrary point.
4x4 rotation matrix:
[ r r r 0 ]
[ r r r 0 ] <- the r's are the 3x3 rotation matrix from the wiki article
[ r r r 0 ]
[ 0 0 0 1 ]
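To rotate about an arbitrary point c, conjugate the rotation by translations, M = T(c) * R * T(-c). A minimal numpy sketch (quaternion stored as [w, x, y, z]):

import numpy as np

def quat_to_mat3(q):
    # rotation matrix from a unit quaternion [w, x, y, z]
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)    ],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)    ],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def rotate_about_point(q, center):
    # 4x4 matrix: translate center to the origin, rotate, translate back
    T_in, T_out, R4 = np.eye(4), np.eye(4), np.eye(4)
    T_in[:3, 3] = -np.asarray(center)
    T_out[:3, 3] = np.asarray(center)
    R4[:3, :3] = quat_to_mat3(q)
    return T_out @ R4 @ T_in

q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])  # 90 degrees about Z
M = rotate_about_point(q, (1.0, 0.0, 0.0))
print(M @ np.array([2.0, 0.0, 0.0, 1.0]))  # ~[1, 1, 0, 1]: rotated about (1, 0, 0)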
The Wikipedia page on forward kinematics points to this paper: Introduction to Homogeneous Transformations & Robot Kinematics.
Edit: This answer is wrong. It argues from the properties of 4x4 transformation matrices, which are not quaternions...
I might have got it wrong, but to me (unlike some answers) a quaternion is indeed a tool to handle rotations and translations (and more). It is a 4x4 matrix where the last column represents the translation. Using matrix algebra, replace the 3-vector (x, y, z) by the 4-vector (x, y, z, 1) and multiply it by the matrix. You will find that the values in the last column of the matrix get added to the x, y, z coordinates of the original vector, as in a translation.
A 3x3 matrix over 3D space can only represent a linear transformation (like a rotation around the origin); you cannot use a 3x3 matrix for an affine transformation like a translation. So I understood the quaternions simply as a little "trick" to represent more kinds of transformations using matrix algebra: add a fourth coordinate equal to 1 and use 4x4 matrices. Because matrix algebra remains valid, you can combine space transformations by multiplying the matrices, which is indeed powerful.
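The homogeneous-coordinate mechanism this answer actually describes (4x4 matrices, not quaternions, as the edit above notes) looks like this; a tiny sketch with made-up values:

import numpy as np

# a 4x4 homogeneous transform: rotation in the upper-left 3x3, translation in the last column
T = np.eye(4)
T[:3, 3] = [2.0, -1.0, 5.0]          # translation part

p = np.array([1.0, 1.0, 1.0, 1.0])   # 3D point with the fourth coordinate set to 1
print(T @ p)                         # -> [3, 0, 6, 1]: the last column was added to (x, y, z)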