Model transformation from the transformation between two frames - OpenGL

Given:
a) two frames of a moving object and the Euclidean transformation (rotation + translation) between them, which aligns the projection of the object in the first frame with its projection in the second frame;
b) the model transformation matrix for the first frame;
c) the view and projection matrices, which are the same for both frames.
How do I find the model transformation matrix for the second frame?
Thanks.

Related

Place a billboard at a given distance so that it occupies a certain size on screen

I have a rectangle that I place on the screen using a simple scale matrix (S). Now I would like to place this rectangle into "3D space", but have it appear just like before on the screen. I found that I can do so by applying the view and projection matrices in inverse. Something like:
S' = (V⁻¹ P⁻¹ S)
Matrix = P V (V⁻¹ P⁻¹ S)
This works fine so far. My rectangle is like a billboard now and I can treat it like any other object, apply P and V and it will show up correctly. However, there is a degeneracy here: I don't specify at which depth the object is placed. It could be twice as far away but x times bigger!
The reason that this is important is that I want to animate the rectangle, say rotate it around the Z axis or move in 3D space. Then I want it to come to a stop and be positioned pixel-perfect on the screen.
How can I place a flat object at a given z distance, such that it appears on screen in a certain way? I already have the scale matrix that I need to display it in OpenGL without any 3D transformation, that is, the matrix for displaying it in NDC or screen coordinates. I also have the projection and view matrices I want to use. How can I go from this to the desired model matrix?
How can I place a flat object at a given z distance, such that it appears on screen in a certain way [...]
Actually you want to draw the object in view space. Define a model matrix for the object that contains only one translation component (0, 0, -z) and skip the view matrix when drawing the object.
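A minimal sketch of that suggestion, assuming GLM; makeBillboardMVP, proj, scaleMatrix and z are illustrative names, not from the question:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: define the billboard directly in view space at depth -z and skip the
// view matrix entirely; only the projection is applied afterwards.
glm::mat4 makeBillboardMVP(const glm::mat4& proj, const glm::mat4& scaleMatrix, float z)
{
    glm::mat4 modelInViewSpace =
        glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 0.0f, -z)) * scaleMatrix;
    return proj * modelInViewSpace; // view matrix intentionally skipped
}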
Usually the order of transforming a 3D vertex v to a 2D point p is written as a matrix multiplication. [Depending on the API you are using, the order might be reversed. The notation used here is GLSL-friendly.]
p = P * V * M * S * v
v = vertex (usually 3D of the form x,y,z,1)
P = projection matrix
V = view (camera) matrix
M = model matrix (or world transformation)
S = a 4x4 object-scaling matrix
Matrices are usually 4x4, with the last row being 0, 0, 0, 1.
The model matrix M can be decomposed into a number of sub-components, such as a translation T, a scale S and a rotation R. Of course the order matters here.
To rotate an object or vertex around itself, first rotate it (as if it were already at the origin) and then translate it to the position it needs to be at, for example using a vector v with coordinates (x, y, z).
R is a typical 3x3 rotation matrix embedded in a 4x4, with 0, 0, 0, 1 on the last row
v is a 3D vector with coordinates (x, y, z)
T is a 4x4 translation matrix: the identity except for the last column, which holds x, y, z, 1
M = T * R (first R, then T)
To rotate an object around an arbitrary point q, first translate it so that it is relative to that point (i.e. q is at the origin, so you subtract q from v). Then rotate around q (now at the origin) by applying R. Lastly, place the object back where it should be (so add q again).
R, v and T are as above
L is another 4x4 translation matrix, but built from q instead of v
L' is the inverse transformation of L (translation by -q)
M = T * L * R * L'
Also, scaling the object first (i.e. with S at the right-hand end of the multiplications), before the translations, keeps the translations in world units. Scaling after all other transformations instead scales all translations too, and the object will move over scaled distances.
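A small sketch of the composition above (M = T * L * R * L'), assuming GLM; rotateAroundPoint and its parameters are illustrative names, not from the answer:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: rotate about the pivot point q, then place the object at 'position'.
glm::mat4 rotateAroundPoint(const glm::vec3& position, const glm::vec3& q,
                            float angle, const glm::vec3& axis)
{
    glm::mat4 T  = glm::translate(glm::mat4(1.0f), position);  // final placement
    glm::mat4 L  = glm::translate(glm::mat4(1.0f), q);         // move pivot back
    glm::mat4 R  = glm::rotate(glm::mat4(1.0f), angle, axis);  // rotate about the origin
    glm::mat4 Li = glm::translate(glm::mat4(1.0f), -q);        // L': move pivot to the origin
    return T * L * R * Li;                                     // read right to left
}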

What is the correct order of Transformations when calculating Matrices in OpenGL?

I am following the tutorials at LearnOpenGL.com and I am confused about the order of Matrices.
The Transformations chapter says:
Matrix multiplication is not commutative, which means their order is important. When multiplying matrices the right-most matrix is first multiplied with the vector so you should read the multiplications from right to left. It is advised to first do scaling operations, then rotations and lastly translations when combining matrices otherwise they might (negatively) affect each other. For example, if you would first do a translation and then scale, the translation vector would also scale!
So if I am not wrong, the order is Translate * Rotate * Scale * vector_to_transform.
But immediately in the next Chapter, when calculating the LookAt matrix, the multiplication order is flipped. Here is the code snippet from the website:
// Custom implementation of the LookAt function
glm::mat4 calculate_lookAt_matrix(glm::vec3 position, glm::vec3 target, glm::vec3 worldUp)
{
    // 1. Position = known
    // 2. Calculate cameraDirection
    glm::vec3 zaxis = glm::normalize(position - target);
    // 3. Get positive right axis vector
    glm::vec3 xaxis = glm::normalize(glm::cross(glm::normalize(worldUp), zaxis));
    // 4. Calculate camera up vector
    glm::vec3 yaxis = glm::cross(zaxis, xaxis);

    // Create translation and rotation matrix
    // In glm we access elements as mat[col][row] due to column-major layout
    glm::mat4 translation = glm::mat4(1.0f); // Identity matrix by default
    translation[3][0] = -position.x; // Fourth column, first row
    translation[3][1] = -position.y;
    translation[3][2] = -position.z;
    glm::mat4 rotation = glm::mat4(1.0f);
    rotation[0][0] = xaxis.x; // First column, first row
    rotation[1][0] = xaxis.y;
    rotation[2][0] = xaxis.z;
    rotation[0][1] = yaxis.x; // First column, second row
    rotation[1][1] = yaxis.y;
    rotation[2][1] = yaxis.z;
    rotation[0][2] = zaxis.x; // First column, third row
    rotation[1][2] = zaxis.y;
    rotation[2][2] = zaxis.z;

    // Return lookAt matrix as combination of translation and rotation matrix
    return rotation * translation; // Remember to read from right to left (first translation then rotation)
}
At the end of the code snippet, the matrix is calculated as rotation * translation, even though the matrix is going to be multiplied as,
gl_Position = projection * lookAt * model * vec4(vertexPosition, 1.0);
since column-major matrices must be pre-multiplied with the vector.
Please help me understand this.
LearnOpenGL unfortunately doesn't explain where the Camera transform comes from.
You can see the Camera transform as an inverse model transform.
3D math doesn't care if you move the camera towards the Objects or the Objects towards the Camera.
Also, if your objects are already scaled to the proper "world space" size, you don't need the camera to scale them. The scaling for the "intrinsic camera parameters" is dealt with in the projection matrix (scaling for aspect ratio and field of view), which is applied after the camera transform.
So we move the object points towards the "camera" instead of the camera towards the points. As said, you would not leave any scaling in the camera matrix, since you only want to position and orient the objects in front of the camera.
Placing the Camera as Model in the world space would be:
M = TR (leave out S for above reasons)
Then you invert that to get the camera transform:
C = M^-1 | M = TR
= (TR)^-1
= R^-1 * T^-1 | inverse of matrix product -> flip order and invert matrices
Let's assume R(angle) is the matrix that rotates by a given angle and T(t) is the matrix that translates by the vector t; then:
= R(angle)^-1 * T(t)^-1
= R^T(angle) * T(-t)
Which is exactly what you return in your lookAt method. The basis vectors of R are set up from the column vectors of your new coordinate frame, but transposed (so they appear as row vectors). That's because the camera frame vectors are orthogonal and of unit length, so the resulting matrix is orthonormal, which means its inverse is its transpose. And the inverse of the translation matrix T(t) is T(-t) (translation by the negated eye position of the camera).
Hope my explanation clarifies more than it confuses :-)
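A quick numerical check of this claim (a sketch assuming GLM; all names are illustrative): build the camera's model matrix M = T * R from the same axes and compare its inverse against glm::lookAt.

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <algorithm>
#include <cmath>
#include <cstdio>

int main()
{
    glm::vec3 eye(4.0f, 3.0f, 3.0f), target(0.0f), worldUp(0.0f, 1.0f, 0.0f);

    // Camera axes, exactly as in the custom lookAt above.
    glm::vec3 zaxis = glm::normalize(eye - target);
    glm::vec3 xaxis = glm::normalize(glm::cross(worldUp, zaxis));
    glm::vec3 yaxis = glm::cross(zaxis, xaxis);

    // Camera model matrix M = T * R: the rotation columns are the camera axes.
    glm::mat4 R(1.0f);
    R[0] = glm::vec4(xaxis, 0.0f);
    R[1] = glm::vec4(yaxis, 0.0f);
    R[2] = glm::vec4(zaxis, 0.0f);
    glm::mat4 M = glm::translate(glm::mat4(1.0f), eye) * R;

    // The view matrix is the inverse of the camera's model matrix.
    glm::mat4 view = glm::inverse(M);
    glm::mat4 ref  = glm::lookAt(eye, target, worldUp);

    float maxDiff = 0.0f;
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            maxDiff = std::max(maxDiff, std::abs(view[c][r] - ref[c][r]));
    std::printf("max |inverse(M) - lookAt| = %g\n", maxDiff); // expected: ~0 (float noise)
    return 0;
}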
Okay! After reading through the entire chapter again, I realized I had missed a crucial detail. The view matrix usually does not scale the objects. It's just a matrix that rotates and translates the model matrix in such a way as to simulate an eye looking into the world. This chapter of LearnOpenGL.com has a block explaining how to combine matrices, and it shows how to combine a translation and a rotation matrix, which is how the lookAt function is implemented.

Rotating an Object Around an Axis

I have a circular object which I want to rotate like a fan along its own axis.
I can change the rotation in any direction, i.e. dx, dy, dz, using my transformation matrix.
The following is the code:
Matrix4f matrix = new Matrix4f();
matrix.setIdentity();
Matrix4f.translate(translation, matrix, matrix);
Matrix4f.rotate((float) Math.toRadians(rx), new Vector3f(1,0,0), matrix, matrix);
Matrix4f.rotate((float) Math.toRadians(ry), new Vector3f(0,1,0), matrix, matrix);
Matrix4f.rotate((float) Math.toRadians(rz), new Vector3f(0,0,1), matrix, matrix);
Matrix4f.scale(new Vector3f(scale,scale,scale), matrix, matrix);
My vertex code:
vec4 worldPosition = transformationMatrix * vec4(position,1.0);
vec4 positionRelativeToCam = viewMatrix*worldPosition;
gl_Position = projectionMatrix *positionRelativeToCam;
Main Game Loop:
Object.increaseRotation(dxf,dyf,dzf);
But it's not rotating along its own axis. What am I missing here?
I want something like this. Please help.
You should get rid of Euler angles for this.
Object/mesh geometry
You need to be aware of how your object is oriented in its local space. For example, let's assume this:
So in this case the main rotation is around the z axis. If your mesh is defined so that the rotation axis is not aligned to any of the axes (x, y or z), or the center point is not (0,0,0), then that will cause you problems. The remedy is either to change your mesh geometry, or to create a special constant transform matrix M0 that transforms all vertices from the mesh LCS (local coordinate system) to a different one that is axis aligned and in which the rotation axis passes through the origin.
In the latter case any operation on object matrix M would be done like this:
M'=M.M0.operation.Inverse(M0)
or in reverse or inverse order (it depends on your matrix/vertex multiplication and row/column-major conventions). If your mesh is already centered and axis aligned, then just do this instead:
M'=M.operation
Here operation is the transform matrix of the change increment (for example a rotation matrix), M is the object's current transform matrix from #2, and M' is its new version after applying operation. A sketch of the first form is below.
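A minimal sketch of that conjugation, assuming GLM and column-vector math; the same caveat about conventions applies, and applyOperation and its parameter names are illustrative:

#include <glm/glm.hpp>

// Sketch: re-align the mesh with M0, apply the increment in the aligned frame, then go back.
// Depending on your conventions you may need the reversed/inverted form instead.
glm::mat4 applyOperation(const glm::mat4& M, const glm::mat4& M0, const glm::mat4& operation)
{
    return M * M0 * operation * glm::inverse(M0);
}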
Object transform matrix
You need a single transform matrix for each object you have. It holds the position and orientation of your object's LCS so that it can be converted to the world/scene GCS (global coordinate system) or to its parent object's LCS.
Rotating your object around its local axis of rotation
As mentioned in Understanding 4x4 homogeneous transform matrices, for standard OpenGL matrix conventions you need to do this:
M'=M*rotation_matrix
Where M is the current object transform matrix and M' is its new version after the rotation. This is the thing you are doing differently: you are using Euler angles rx, ry, rz instead of accumulating the rotations incrementally. You cannot do this with Euler angles in any sane and robust way! Even if many modern games and apps still try hard to do it (and have been failing at it for years).
So what to do to get rid of Euler angles:
You must have a persistent/global/static matrix M per object
instead of a local instance per render, so you need to initialize it just once instead of clearing it on a per-frame basis.
On each animation update, apply the operation you need,
so:
M*=rotation_around_z(angspeed*dt);
Where angspeed is your fan speed in [rad/second] or [deg/second] and dt is the time elapsed in [seconds]. For example, if you do this in a timer, then dt is the timer interval. For variable timing you can measure the elapsed time (it is platform dependent; I usually use performance timers or RDTSC).
You can stack more operations on top of this (for example, your fan could also turn back and forth around the y axis to cover more area); see the sketch below.
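A per-frame update sketch of the accumulation step, assuming GLM; updateFan, angspeed and dt are illustrative names:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Sketch: M is the object's persistent model matrix (never reset each frame).
// Post-multiplying rotates the object about its own local Z axis.
// angspeed is in radians per second, dt in seconds.
void updateFan(glm::mat4& M, float angspeed, float dt)
{
    M = M * glm::rotate(glm::mat4(1.0f), angspeed * dt, glm::vec3(0.0f, 0.0f, 1.0f));
}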
For object direct control (by keyboard,mouse or joystick) just add things like:
if (keys.get( 38)) { redraw=true; M*=translate_z(-pos_speed*dt); }
if (keys.get( 40)) { redraw=true; M*=translate_z(+pos_speed*dt); }
if (keys.get( 37)) { redraw=true; M*=rotation_around_y(-turn_speed*dt); }
if (keys.get( 39)) { redraw=true; M*=rotation_around_y(+turn_speed*dt); }
Where keys is my key map holding the on/off state of every key on the keyboard (so I can use multiple keys at once). This code just controls the object with the arrow keys. For more info on the subject see this related QA:
Computer Graphics: Moving in the world
Preserve accuracy
With incremental changes there is a risk of losing precision due to floating-point errors. So add a counter to your matrix class which counts how many times it has been changed (how many incremental operations have been applied), and when some constant count is hit (for example 128 operations), normalize your matrix.
To do that you need to ensure the orthonormality of your matrix: each axis vector X, Y, Z must be perpendicular to the other two and of unit length. I do it like this:
Choose the main axis, which will keep its direction unchanged. I choose the Z axis, as that is usually the main axis in my meshes (viewing direction, rotation axis, etc.), so just make this vector unit length: Z = Z/|Z|
Exploit the cross product to compute the other two axes, so X = (+/-) Z x Y and Y = (+/-) Z x X, and normalize them too: X = X/|X| and Y = Y/|Y|. The (+/-) is there because I do not know your coordinate system conventions, and the cross product can produce the opposite of your original direction, so if the direction comes out opposite, change the multiplication order or negate the result (this is decided at coding time, not at runtime!).
Here is an example in C++ of how my orthonormal normalization is done:
void reper::orto(int test)
{
    double x[3], y[3], z[3];
    if ((cnt >= _reper_max_cnt) || (test)) // here cnt is the operations counter and test forces normalization regardless of it
    {
        use_rep();          // you can ignore this
        _rep = 1; _inv = 0; // you can ignore this
        axisx_get(x);
        axisy_get(y);
        axisz_get(z);
        vector_one(z, z);
        vector_mul(x, y, z); // x is perpendicular to y,z
        vector_one(x, x);
        vector_mul(y, z, x); // y is perpendicular to z,x
        vector_one(y, y);
        axisx_set(x);
        axisy_set(y);
        axisz_set(z);
        cnt = 0;
    }
}
Where axis?_get/set(a) just gets/sets a as an axis from/to your matrix, vector_one(a,b) returns a = b/|b|, and vector_mul(a,b,c) returns a = b x c.

How to calculate extrinsic parameters of one camera relative to the second camera?

I have calibrated 2 cameras with respect to some world coordinate system. I know the rotation matrix and translation vector for each of them relative to the world frame. From these matrices, how can I calculate the rotation matrix and translation vector of one camera with respect to the other?
Any help or suggestion is appreciated. Thanks!
Here is an easier solution, since you already have the 3x3 rotation matrices R1 and R2, and the 3x1 translation vectors t1 and t2.
These express the motion from the world coordinate frame to each camera, i.e. are the matrices such that, if p is a point expressed in world coordinate frame, then the same point expressed in, say, camera 1 frame is p1 = R1 * p + t1.
The motion from camera 1 to 2 is then simply the composition of (a) the motion FROM camera 1 TO the world frame, and (b) of the motion FROM the world frame TO camera 2. You can easily compute this composition as follows:
Form the 4x4 roto-translation matrices Qw1 = [R1 t1] and Qw2 = [ R2 t2 ], both with the 4th row equal to [0 0 0 1]. These matrices completely express the roto-translation FROM the world coordinate frame TO camera 1 and 2 respectively.
The motion FROM camera 1 TO the world frame is simply Q1w = inv(Qw1). Here inv() is the algebraic inverse matrix, i.e. the one such that inv(X) * X = X * inv(X) = IdentityMatrix, for every nonsingular matrix X.
The roto-translation from camera 1 to 2 is then Q12 = Qw2 * Q1w, and vice versa, the one from camera 2 to 1 is Q21 = Qw1 * Q2w = Qw1 * inv(Qw2).
Once you have Q12 you can extract the rotation and translation parts from it, if you so wish: respectively, its upper-left 3x3 submatrix and its rightmost 3x1 sub-column.
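A minimal sketch of that composition, assuming GLM types and the convention p_cam = R * p_world + t from above; relativePose and the output names are illustrative:

#include <glm/glm.hpp>

// Sketch: rotation/translation mapping camera-1 coordinates into camera-2 coordinates,
// i.e. the rotation and translation parts of Qw2 * inv(Qw1).
void relativePose(const glm::mat3& R1, const glm::vec3& t1,
                  const glm::mat3& R2, const glm::vec3& t2,
                  glm::mat3& R12, glm::vec3& t12)
{
    R12 = R2 * glm::transpose(R1); // R1 is orthonormal, so its inverse is its transpose
    t12 = t2 - R12 * t1;
}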
First convert your rotation matrices into rotation vectors. Now you have two 3D vectors for each camera; call them A1, A2, B1, B2. You have all 4 of them with respect to some origin O. The rule you need is
A relative to B = (A relative to O)- (B relative to O)
Apply that rule to your two pairs of vectors and you will have their pose relative to one another.
Some documentation on converting from a rotation matrix to Euler angles can be found here, as well as in many other places. If you are using OpenCV you can just use Rodrigues. Here is some MATLAB/Octave code I found.
Here is a very simple and easy solution. I suppose your 1st camera has R1 and T1 and your 2nd camera has R2 and T2 as rotation matrices and translation vectors with respect to a common reference frame.
The translation and rotation from the 1st to the 2nd camera can be calculated with the following two lines of MATLAB code:
R=R2*R1';
T=T2-R*T1;
But note that this is true only if you have just one R and T per camera (I mean the rotation and translation for one unique world reference). If you have more reference translations and rotations, you should calculate R, T for every single reference point. They will probably be very close to each other, but they might be slightly different. Then you can calculate the mean of the translation vectors, convert all the found rotation matrices to rotation vectors, calculate their mean, and then convert that back to a rotation matrix.

Get 3D coordinates from 2D image pixel if extrinsic and intrinsic parameters are known

I am doing camera calibration with Tsai's algorithm. I got the intrinsic and extrinsic matrices, but how can I reconstruct the 3D coordinates from that information?
1) I can use Gaussian elimination to find X, Y, Z, W, and then the point will be X/W, Y/W, Z/W, as it is a homogeneous system.
2) I can use the OpenCV documentation approach (the pinhole projection model, s·[u, v, 1]ᵀ = K·[R|t]·[X, Y, Z, 1]ᵀ):
as I know u, v, K, R and t, I can compute X, Y, Z.
However, both methods end up with different results that are not correct.
What am I doing wrong?
If you have the extrinsic parameters then you have everything. That means you can get a homography from the extrinsics (also called the camera pose). The pose is a 3x4 matrix, the homography is a 3x3 matrix, with H defined as
H = K*[r1, r2, t], //eqn 8.1, Hartley and Zisserman
with K being the camera intrinsic matrix, r1 and r2 being the first two columns of the rotation matrix R, and t the translation vector.
Then normalize by dividing everything by t3.
What happens to column r3, don't we use it? No, because it is redundant: it is the cross product of the first two columns of the pose.
Now that you have the homography, project the points. Your 2D points are x, y. Add a z = 1 to them, so they are now 3D. Project them as follows:
p = [x; y; 1];
projection = H * p;                    % project
projnorm = projection / projection(3); % normalize
Hope this helps.
As nicely stated in the comments above, projecting 2D image coordinates into 3D "camera space" inherently requires making up the z coordinate, as this information is totally lost in the image. One solution is to assign a dummy value (z = 1) to each of the 2D image-space points before projection, as answered by Jav_Rock:
p = [x; y; 1];
projection = H * p;                    % project
projnorm = projection / projection(3); % normalize
One interesting alternative to this dummy solution is to train a model to predict the depth of each point prior to reprojection into 3D camera space. I tried this method and had a high degree of success using a PyTorch CNN trained on 3D bounding boxes from the KITTI dataset. I would be happy to provide code, but it would be a bit lengthy for posting here.
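For reference, a minimal sketch of the reprojection step once a depth estimate is available, assuming GLM; backProject, K, u, v and depth are illustrative names, not from the answer:

#include <glm/glm.hpp>

// Sketch: back-project a pixel (u, v) into 3D camera space given a depth estimate.
// K is the 3x3 intrinsic matrix.
glm::vec3 backProject(const glm::mat3& K, float u, float v, float depth)
{
    glm::vec3 pixel(u, v, 1.0f);             // homogeneous pixel coordinates
    glm::vec3 ray = glm::inverse(K) * pixel; // point on the normalized (z = 1) image plane
    return depth * ray;                      // scale by the known/predicted depth
}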