What is the correct order of Transformations when calculating Matrices in OpenGL? - c++

I am following the tutorials at LearnOpenGL.com and I am confused about the order of Matrices.
The Transformations chapter tells:
Matrix multiplication is not commutative, which means their order is important. When multiplying matrices the right-most matrix is first multiplied with the vector so you should read the multiplications from right to left. It is advised to first do scaling operations, then rotations and lastly translations when combining matrices otherwise they might (negatively) affect each other. For example, if you would first do a translation and then scale, the translation vector would also scale!
So If I am not wrong, the order is Translate * Rotate * Scale * vector_to_transform.
But immediately in the next Chapter, when calculating the LookAt matrix, the multiplication order is flipped. Here is the code snippet from the website:
// Custom implementation of the LookAt function
glm::mat4 calculate_lookAt_matrix(glm::vec3 position, glm::vec3 target, glm::vec3 worldUp)
{
// 1. Position = known
// 2. Calculate cameraDirection
glm::vec3 zaxis = glm::normalize(position - target);
// 3. Get positive right axis vector
glm::vec3 xaxis = glm::normalize(glm::cross(glm::normalize(worldUp), zaxis));
// 4. Calculate camera up vector
glm::vec3 yaxis = glm::cross(zaxis, xaxis);
// Create translation and rotation matrix
// In glm we access elements as mat[col][row] due to column-major layout
glm::mat4 translation = glm::mat4(1.0f); // Identity matrix by default
translation[3][0] = -position.x; // Third column, first row
translation[3][1] = -position.y;
translation[3][2] = -position.z;
glm::mat4 rotation = glm::mat4(1.0f);
rotation[0][0] = xaxis.x; // First column, first row
rotation[1][0] = xaxis.y;
rotation[2][0] = xaxis.z;
rotation[0][1] = yaxis.x; // First column, second row
rotation[1][1] = yaxis.y;
rotation[2][1] = yaxis.z;
rotation[0][2] = zaxis.x; // First column, third row
rotation[1][2] = zaxis.y;
rotation[2][2] = zaxis.z;
// Return lookAt matrix as combination of translation and rotation matrix
return rotation * translation; // Remember to read from right to left (first translation then rotation)
}
At the end of the code snippet, the matrix is calculated as rotation * translation, even though the matrix is going to be multiplied as,
gl_position = projection * lookAt * model * vec4(vertexPosition, 1.0);
as Column-major matrices must be pre-multiplied to the vector.
Please help me understand this.

LearnOpenGL unfortunately doesn't explain where the Camera transform comes from.
You can see the Camera transform as an inverse model transform.
3D math doesn't care if you move the camera towards the Objects or the Objects towards the Camera.
Also if your objects are already scaled to have the proper "World Space" size you don't need the camera to scale them. The scaling for the "intrinsic Camera Parameters" are dealt with in the projection matrix (scale for aspect ration and field of view). Which is done after the Camera transform.
So we move the object points towards the "camera" instead of the camera towards the points. As I said you would not leave the scaling in the Camera Matrix, since you only want to orient the objects in front of the Camera.
Placing the Camera as Model in the world space would be:
M = TR (leave out S for above reasons)
Then you inverse the Camera Transform ->
C = M^-1 | M = TR
= (TR)^-1
= R^-1 * T^-1 | inverse of matrix product -> flip order and invert matrices
Lets assume R(angle) is the Matrix that rotates by angle angle and T(t) is the Matrix that translates by vector t then:
= R(angle)^-1 * T(t)^-1
= R^T(angle) * T(-t)
Which is exactly what you return in your lookAt-method. The basis vectors of R are set up by the column vectors of you new coordinate frame, but then transposed (so you have them as row vectors). That's because the camera frame vectors are orthogonal and unit length, so the resulting matrix is "orthonomal" which means it's inverse is it's transpose. And the inverse Translation Matrix T(t) gets the translation vector T(-t) (inverse of eye-Position of the Camera).
Hope my explanation clearifies more than it confuses :-)

Okay! After reading through the entire chapter again, I missed a crucial detail. The View matrix usually does not scale the objects. It's just a matrix to rotate and translate the model matrix in such a way to simulate an eye into the world. This chapter of LearnOpenGL.com has a block explaining about combining matrices and it shows how to combine a translate and rotate matric and it's how the lookAt function is implemented.

Related

Place a billboard at a given distance so that it occupies a certain size on screen

I have a rectangle that I place on the screen using a simple scale matrix (S). Now I would like to place this rectangle into "3D space", but have it appear just like before on the screen. I found that I can do so by applying the view and projection matrices in inverse. Something like:
S' = (V⁻¹ P⁻¹ S)
Matrix = P V (V⁻¹ P⁻¹ S)
This works fine so far. My rectangle is like a billboard now and I can treat it like any other object, apply P and V and it will show up correctly. However, there is a degeneracy here: I don't specify at which depth the object is placed. It could be twice as far away but x times bigger!
The reason that this is important is that I want to animate the rectangle, say rotate it around the Z axis or move in 3D space. Then I want it to come to a stop and be positioned pixel-perfect on the screen.
How can I place a flat object at a given z distance, such that it appears on screen in a certain way? I already have with the scale matrix that I need to display it in OpenGL without any 3D transformation, that is the matrix for displaying it in NDC or screen coordinates. I also have the projection and view matrices I want to use. How can I go from this to the desired model matrix?
How can I place a flat object at a given z distance, such that it appears on screen in a certain way [...]
Actually you want to draw the object in view space. Define a model matrix for the object that contains only one translation component (0, 0, -z) and skip the view matrix when drawing the object.
Usually the order to transform 3D vertex v to 2D point p is written as a matrix multiplication. [Depending on the API you are using, the order might be reversed. The notation I used here is glsl - friendly]
p = P * V * M * S * v
v = vertex (usually 3D of the form x,y,z,1)
P = projection matrix
V = view (camera) matrix
M = model matrix (or world transformation)
S is a 4x4 object-scaling matrix
matrices are usually 4x4 with the last line/column 0,0,0,1
The model matrix M can be decomposed into a number of sub components such as T translation, S scale and R rotation. Of course here the order matters.
To rotate an object or vertex around itself, first rotate it (as if it is at the origin already) and then translate it to the position it needs to be, for example using vector v with coordinates (x,y,z).
R is a typical 3x3 rotation matrix embedded in a 4x4 with 0,0,0,1 on the last line
v is a 3D vector with coordinates (x,y,z)
T is a 4x4 translation matrix, all zeroes except the last column where it has x,y,z,1
M = T * R (first R, then T)
To rotate an object around an arbitrary point q, first translate it so that it is relative to that point (i.e. q is at the origin, so you subtracted q from v). Then rotate around q (at the origin) by applying R. Lastly place the object it back where it should be (so add q again).
R is your typical 3x3 rotation matrix embedded in a 4x4 with 0,0,0,1 on the last line
v is a 3D vector with coordinates (x,y,z)
T is a 4x4 translation matrix, all zeroes except the last column where it has x,y,z,1
L is another 4x4 translation matrix, but now with q instead of v
L' is the inverse transformation of L
M = T * L * R * L'
Also, scaling your object first (i.e. S is at the end of the multiplications) before translations will keep the translation in world units. Scale after all transformations, in fact scales all translations too, and the object will move over scaled distances.

placing objects perpendicularly on the surface of a sphere that has a wavy surface

So I have a sphere. It rotates around a given axis and changes its surface by a sin * cos function.
I also have a bunck of tracticoids at fix points on the sphere. These objects follow the sphere while moving (including the rotation and the change of the surface). But I can't figure out how to make them always perpendicular to the sphere. I have the ponts where the tracticoid connects to the surface of the sphere and its normal vector. The tracticoids are originally orianted by the z axis. So I tried to make it's axis to the given normal vector but I just can't make it work.
This is where i calculate M transformation matrix and its inverse:
virtual void SetModelingTransform(mat4& M, mat4& Minv, vec3 n) {
M = ScaleMatrix(scale) * RotationMatrix(rotationAngle, rotationAxis) * TranslateMatrix(translation);
Minv = TranslateMatrix(-translation) * RotationMatrix(-rotationAngle, rotationAxis) * ScaleMatrix(vec3(1 / scale.x, 1 / scale.y, 1 / scale.z));
}
In my draw function I set the values for the transformation.
_M and _Minv are the matrixes of the sphere so the tracticoids are following the sphere, but when I tried to use a rotation matrix, the tracticoids strated moving on the surface of the sphere.
_n is the normal vector that the tracticoid should follow.
void Draw(RenderState state, float t, mat4 _M, mat4 _Minv, vec3 _n) {
SetModelingTransform(M, Minv, _n);
if (!sphere) {
state.M = M * _M * RotationMatrix(_n.z, _n);
state.Minv = Minv * _Minv * RotationMatrix(-_n.z, _n);
}
else {
state.M = M;
state.Minv = Minv;
}
.
.
.
}
You said your sphere has an axis of rotation, so you should have a vector a aligned with this axis.
Let P = P(t) be the point on the sphere at which your object is positioned. You should also have a vector n = n(t) perpendicular to the surface of the sphere at point P=P(t) for each time-moment t. All vectors are interpreted as column-vectors, i.e. 3 x 1 matrices.
Then, form the matrix
U[][1] = cross(a, n(t)) / norm(cross(a, n(t)))
U[][3] = n(t) / norm(n(t))
U[][2] = cross(U[][3], U[][1])
where for each j=1,2,3 U[][j] is a 3 x 1 vector column. Then
U(t) = [ U[][1], U[][2], U[][3] ]
is a 3 x 3 orthogonal matrix (i.e. it is a 3D rotation around the origin)
For each moment of time t calculate the matrix
M(t) = U(t) * U(0)^T
where ^T is the matrix transposition.
The final transformation that rotates your object from its original position to its position at time t should be
X(t) = P(t) + M(t)*(X - P(0))
I'm not sure if I got your explanations, but here I go.
You have a sphere with a wavy surface. This means that each point on the surface changes its distance to the center of the sphere, like a piece of wood on a wave in the sea changes its distance to the bottom of the sea at that position.
We can tell that the radious R of the sphere is variable at each point/time case.
Now you have a tracticoid (what's a tracticoid?). I'll take it as some object floating on the wave, and following the sphere movements.
Then it seems you're asking as how to make the tracticoid follows both wavy surface and sphere movements.
Well. If we define each movement ("transformation") by a 4x4 matrix it all reduces to combine in the proper order those matrices.
There are some good OpenGL tutorials that teach you about transformations, and how to combine them. See, for example, learnopengl.com.
To your case, there are several transformations to use.
The sphere spins. You need a rotation matrix, let's call it MSR (matrix sphere rotation) and an axis of rotation, ASR. If the sphere also translates then also a MST is needed.
The surface waves, with some function f(lat, long, time) which calculates for those parameters the increment (signed) of the radious. So, Ri = R + f(la,lo,ti)
For the tracticoid, I guess you have some triangles that define a tracticoid. I also guess those triangles are expressed in a "local" coordinates system whose origin is the center of the tracticoid. Your issue comes when you have to position and rotate the tracticoid, right?
You have two options. The first is to rotate the tracticoid to make if aim perpendicular to the sphere and then translate it to follow the sphere rotation. While perfect mathematically correct, I find this option some complicated.
The best option is to make the tracticoid to rotate and translate exactly as the sphere, as if both would share the same origin, the center of the sphere. And then translate it to its current position.
First part is quite easy: The matrix that defines such transformation is M= MST * MSR, if you use the typical OpenGL axis convention, otherwise you need to swap their order. This M is the common part for all objects (sphere & tracticoids).
The second part requires you have a vector Vn that defines the point in the surface, related to the center of the sphere. You should be able to calculate it with the parameters latitude, longitude and the R obtained by f() above, plus the size/2 of the tracticoid (distance from its center to the point where it touches the wave). Use the components of Vn to build a translation matrix MTT
And now, just get the resultant transformation to use with every vertex of the tracticoid: Mt = MTT * M = MTT * MST * MSR
To render the scene you need other two matrices, for the camera (MV) and for the projection (MP). While Mt is for each tracticoid, MV and MP are the same for all objects, including the sphere itself.

Trouble Animating Quaternion Slerp

I am attempting to animate a slerp from q1 to q2 for my FPS camera. I have a target somewhere in my world and I want the camera to pan from its current axis to looking at my target. From what I understand the way to do this would be to calculate a quaternion representing my current (axis, rotation) and a second representing my final (axis, rotation) then every frame increment the amount I interpolate between the two from 0 to 1. Is this the correct idea?
What I don't understand is how to compute these beginning and end quaternions?
My camera is pretty standard and has the usual member variables:
glm::vec3 position,forward, up, yAxis, target;
glm::quat orientation;
Note:
= in this post represents mathematical equations, not assignments. (Sadly we have no mathmode on stackoverflow)
If your camera already has a member-quaternion, which describes its rotation, i suppose you have this quaternion. If not, you can use the same technique to find it as well:
If you know your rotational axis vec3 r and your angle a then your quaternion is vec4 q = (cos(a/2), sin(a/2)*r) (and any multiple of it). Your rotated vector is then vec3 v' = q v inv(q).
I assume you want the camera to still point upwards, then you can split the rotation in two rotations, one around the global up axis (probably y) and one around the local horizontal-axis of the camera (probably x).
So your rotation is:
vec3 v' = g l v inv(l) inv(g)
g = (cos(a/2), sin(a/2)*(0,1,0))
l = (cos(b/2), sin(b/2)*(1,0,0))
with the addition of
vec3 normal(viewDirection) = g l (0,0,1) inv(l) inv(g)
(because later you want to have your cameras z-axis point in your viewDirection) you should be able to solve the equations.

Direction of rotation in GLM matrix, using quaternions

I am working on a 3D rendering setup (all math done with GLM for OpenGL), and it all works correctly, except for how I would prefer my transformations to work.
I create a matrix for each entity like so:
matrix = mat4(1);
vec3 scale = GetWorldScale();
vec3 pos = GetWorldPosition(); // Returns pos + parent->pos
quat rot = GetWorldRotationQuat(); // Returns parent->rot * rot
matrix = glm::translate(matrix, pos);
matrix *= mat4_cast(rot);
matrix = glm::scale(matrix, scale);
right = matrix[0].xyz;
up = matrix[1].xyz;
direction = matrix[2].xyz;
Using this, it generally works correctly, except that I'm not sure how to adjust part of it for preference. That is that, using this, translation on the X-axis is flipped (eg. left is positive, and forward is positive on the Z-axis, but I less discriminant with that), and rotation on the Y-axis is flipped.
Looking at other code for this purpose, it seems that many negate what I've used for direction (for the camera). I've done that as well, and translation is correct, but all axes of rotation are the opposite of what's preferred (though rotation on X is the same whether direction is negated or not).
I'm not quite sure what I should do to help correct this, except possibly negate X-axis translation and Y-axis rotation before usage, but I feel that that isn't the best way. Thoughts?
I believe the problems come from the fact that you mix the order of transformations, especially translation and rotation. Each of these transformations (scale, translation, and rotation) define how to transform one coordinate system into another.
Let's go through an example: You have one child object c and its parent p. They each have translation t, rotation r, and scale s. To keep it simple, each of these are 4x4 matrices. Currently you do
matrix = p.t * c.t * p.r * c.r * p.s * c.s
So imagine the local coordinate system of the child that's transformed by this matrix (xyz axes of length 1). Points are multiplied from the right, so we have to read it right to left. First, the coordinate system gets scaled by the child, then scaled by the parent. Then it gets rotated by the child. Then by the parent. And now the child's translation is applied. That means the child translation is applied in a rotated coordinate system. Since you already applied both the child's and the parent's rotations, it's rotated into the parent's coordinate system. So the coordinates of the child are interpreted as if they were parent coordinates.
So what you should be doing: In your method, compute the object matrix m = c.t * c.r * c.s. Then the world matrix of your object is defined as wm = pm * m, where the parent matrix pm is the world matrix of the parent. That way you'll end up with a world matrix:
c.wm = (c.pm) * (c.o) = (p.t * p.r * p.s) * (c.t * c.r * c.s)
And that means that the child's translation is in the coordinate system of the child, and the parent's translation is in the coordinate system of the parent.

3d coordinate from point and angles

I'm working on a simple OpenGL world- and so far I've got a bunch of cubes randomly placed about and it's pretty fun to go zooming about. However I'm ready to move on. I would like to drop blocks in front of my camera, but I'm having trouble with the 3d angles. I'm used to 2d stuff where to find an end point we simply do something along the lines of:
endy = y + (sin(theta)*power);
endx = x + (cos(theta)*power);
However when I add the third dimension I'm not sure what to do! It seems to me that the power of the second dimensional plane would be determined by the z axis's cos(theta)*power, but I'm not positive. If that is correct, it seems to me I'd do something like this:
endz = z + (sin(xtheta)*power);
power2 = cos(xtheta) * power;
endx = x + (cos(ytheta) * power2);
endy = y + (sin(ytheta) * power2);
(where x theta is the up/down theta and y = left/right theta)
Am I even close to the right track here? How do I find an end point given a current point and an two angles?
Working with euler angles doesn't work so well in 3D environments, there are several issues and corner cases in which they simply don't work. And you actually don't even have to use them.
What you should do, is exploit the fact, that transformation matrixes are nothing else, then coordinate system bases written down in a comprehensible form. So you have your modelview matrix MV. This consists of a model space transformation, followed by a view transformation (column major matrices multiply right to left):
MV = V * M
So what we want to know is, in which way the "camera" lies within the world. That is given to you by the inverse view matrix V^-1. You can of course invert the view matrix using Gauss Jordan method, but most of the time your view matrix will consist of a 3×3 rotation matrix with a translation vector column P added.
R P
0 1
Recall that
(M * N)^-1 = N^-1 * M^-1
and also
(M * N)^T = M^T * N^T
so it seems there is some kind of relationship between transposition and inversion. Not all transposed matrices are their inverse, but there are some, where the transpose of a matrix is its inverse. Namely it are the so called orthonormal matrices. Rotations are orthonormal. So
R^-1 = R^T
neat! This allows us to find the inverse of the view matrix by the following (I suggest you try to proof it as an exersice):
V = / R P \
\ 0 1 /
V^-1 = / R^T -P \
\ 0 1 /
So how does this help us to place a new object in the scene at a distance from the camera? Well, V is the transformation from world space into camera space, so V^-1 transforms from camera to world space. So given a point in camera space you can transform it back to world space. Say you wanted to place something at the center of the view in distance d. In camera space that would be the point (0, 0, -d, 1). Multiply that with V^-1:
V^-1 * (0, 0, -d, 1) = (R^T)_z * d - P
Which is exactly what you want. In your OpenGL program you somewhere have your view matrix V, probably not properly named yet, but anyway it is there. Say you use old OpenGL-1 and GLU's gluLookAt:
void display(void)
{
/* setup viewport, clear, set projection, etc. */
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(...);
/* the modelview matrix now holds the View transform */
At this point we can extract the modelview matrix
GLfloat view[16];
glGetFloatv(GL_MODELVIEW_MATRIX, view);
Now view is in column major order. If we were to use it directly we could directly address the columns. But remember that transpose is inverse of a rotation, so we actually want the 3rd row vector. So let's assume you keep view around, so that in your event handler (outside display) you can do the following:
GLfloat z_row[3];
z_row[0] = view[2];
z_row[1] = view[6];
z_row[2] = view[10];
And we want the position
GLfloat * const p_column = &view[12];
Now we can calculate the new objects position at distance d:
GLfloat new_object_pos[3] = {
z_row[0]*d - p_column[0],
z_row[1]*d - p_column[1],
z_row[2]*d - p_column[2],
};
There you are. As you can see, nowhere you had to work with angles or trigonometry, it's just straight linear algebra.
Well I was close, after some testing, I found the correct formula for my implementation, it looks like this:
endy = cam.get_pos().y - (sin(toRad(180-cam.get_rot().x))*power1);
power2 = cos(toRad(180-cam.get_rot().x))*power1;
endx = cam.get_pos().x - (sin(toRad(180-cam.get_rot().y))*power2);
endz = cam.get_pos().z - (cos(toRad(180-cam.get_rot().y))*power2);
This takes my camera's position and rotational angles and get's the corresponding points. Works like a charm =]