Transformation in vertex shader only works with post-multiplying - opengl

I am currently in the process of learning OpenGL and GLSL to write a simple piece of software that loads models, displays them on the screen, transforms them, etc.
As a first stage, I wrote a pure-C++ program without using OpenGL.
It works great, and it uses a row-major matrix representation:
So for instance mat[i][j] means row i and column j.
class mat4
{
    vec4 _m[4]; // vec4 is a struct with 4 fields
    ...
};
This is the relevant matrix multiplication method:
mat4 operator*(const mat4& m) const
{
    mat4 a(0.0);
    for (int i = 0; i < 4; ++i)
    {
        for (int j = 0; j < 4; ++j)
        {
            for (int k = 0; k < 4; ++k)
            {
                a[i][j] += _m[i][k] * m[k][j];
            }
        }
    }
    return a;
}
In order to get from model space to clip space I do as follows in C++:
vec4 vertexInClipSpace = projectionMat4 * viewMat4 * modelMat4 * vertexInModelSpace;
Now, trying to implement that in a GLSL shader (version 1.5) yields weird results. It works, but only if I post-multiply the vertex instead of pre-multiplying it, and in addition transpose each of the matrices.
uniform mat4 m;
uniform mat4 v;
uniform mat4 p;
void main()
{
    // works ok, but using post-multiplication and transposed matrices:
    gl_Position = vec4(vertex, 1.0f) * m * v * p;
}
Although this is mathematically OK, since v2 = P * V * M * v1 is the same as transpose(v2) = transpose(v1) * transpose(M) * transpose(V) * transpose(P),
I am obviously missing something, because I have not seen a single reference where a vertex is post-multiplied in the vertex shader.
To sum up, here are specific questions:
Why does this work? Is it even legal to post-multiply in GLSL?
How can I pass my C++ matrices so that they work properly inside the shader?
Links to related Questions:
link 1
link 2
EDIT:
Problem was sort of "solved" by altering the "transpose" flag in the call to:
glUniformMatrix4fv(
    m_modelTransformID,
    1,
    GL_TRUE,
    &m[0][0]
);
Now the multiplication in the shader is a pre-multiplication:
gl_Position = MVP * vec4(vertex, 1.0f);
Which kind of left me puzzled, as the mathematics doesn't seem to make sense for column-major matrices that are the transpose of row-major matrices.
Could someone please explain?

Citing the OpenGL FAQ:
For programming purposes, OpenGL matrices are 16-value arrays with
base vectors laid out contiguously in memory. The translation
components occupy the 13th, 14th, and 15th elements of the 16-element
matrix, where indices are numbered from 1 to 16 as described in
section 2.11.2 of the OpenGL 2.1 Specification.
Column-major versus row-major is purely a notational convention. Note
that post-multiplying with column-major matrices produces the same
result as pre-multiplying with row-major matrices. The OpenGL
Specification and the OpenGL Reference Manual both use column-major
notation. You can use any notation, as long as it's clearly stated.
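As a concrete illustration of the layout the FAQ describes (values chosen arbitrarily for this sketch), a translation matrix stored with its base vectors contiguous looks like this in a plain C array, with the translation in 1-based elements 13-15:

float m[16] = {
    1.0f, 0.0f, 0.0f, 0.0f,   // X basis vector (elements 1-4)
    0.0f, 1.0f, 0.0f, 0.0f,   // Y basis vector (elements 5-8)
    0.0f, 0.0f, 1.0f, 0.0f,   // Z basis vector (elements 9-12)
    5.0f, 6.0f, 7.0f, 1.0f    // translation tx, ty, tz (elements 13-15), then 1
};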
About some conventions:
Row vs Column Vector
Multiplying two matrices is possible only if the number of columns of the left matrix equals the number of rows of the right matrix.
MatL[r1,c] x MatR[c,r2]
So, if you are working on a piece of paper, considering that a vector is a one-dimensional matrix, if you want to multiply a vec4 by a 4x4 matrix then the vector should be:
a row vector if the vector is on the left of the matrix (post-multiplying, v * M)
a column vector if the vector is on the right of the matrix (pre-multiplying, M * v)
In a computer you can treat 4 consecutive values either as a column or as a row (there is no intrinsic notion of dimension), so you can post-multiply or pre-multiply a vector by the same matrix. Implicitly you are sticking with one of the two conventions.
Row Major vs Column Major layout
Computer memory is a contiguous sequence of locations. The concept of multiple dimensions doesn't exist there; it's a pure convention. All matrix elements are stored contiguously in one-dimensional memory.
If you decide to store a 2 dimensional entity, you have 2 conventions:
storing consecutive row elements in memory (row-major)
storing consecutive column elements in memory (column-major)
Incidentally, transposing a matrix stored in row-major order is equivalent to storing its elements in column-major order.
That implies that swapping the order of a vector-matrix multiplication is equivalent to multiplying the same vector, in the same order, by the transposed matrix.
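A quick glm sketch of that statement (my own illustration, not part of the original answer); glm accepts both orders, treating the vector as a row vector on the left and as a column vector on the right:

#include <glm/glm.hpp>

// Swapping the multiplication order and transposing the matrix
// gives the same four components.
bool ordersAgree(const glm::mat4& M, const glm::vec4& v)
{
    return v * M == glm::transpose(M) * v;
}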
OpenGL
It doesn't officially prescribe any convention, as stated above. I suggest you look at the OpenGL convention as if the translation were stored in the last column and the matrix layout were column-major.
Why does this work? Is it even legal to post-multiply in GLSL?
It is legal. As long as you are consistent across your code, either convention/multiplication order is fine.
How can I pass my C++ matrices so that they work properly inside the shader?
If you are using two different conventions in C++ and in the shader, then you can either transpose the matrix and keep the same multiplication order, or not transpose the matrix and invert the multiplication order.
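As a sketch, using the names from the question (and assuming the row-major C++ matrices described above), the two workable combinations look like this:

// Option A: keep pre-multiplication in the shader, let GL transpose on upload:
glUniformMatrix4fv(m_modelTransformID, 1, GL_TRUE, &m[0][0]);
// in the shader: gl_Position = p * v * m * vec4(vertex, 1.0);

// Option B: upload as-is and reverse the multiplication order in the shader:
glUniformMatrix4fv(m_modelTransformID, 1, GL_FALSE, &m[0][0]);
// in the shader: gl_Position = vec4(vertex, 1.0) * m * v * p;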

If you have any gaps, see Understanding 4x4 homogenous transform matrices.
If you swap between column-major (OpenGL matrices) and row-major (DirectX and your matrices) order, that is the same as transposing, so you're right. What you are missing is that:
For orthogonal (orthonormal) homogeneous transform matrices, i.e. pure rotations with no translation or scaling, transposing a matrix is the same as inverting it,
which I think is the answer to your question:
transpose(M) = inverse(M)
As for the other question, whether it is OK to post-multiply a vertex: that is only a matter of convention, and it is not forbidden in GLSL. The whole point of GLSL is that you can do almost anything there.

Related

OpenGL: mat4x4 multiplied with vec4 yields tvec<float>

Consider the code:
glm::mat4x4 T = glm::mat4x4(1);
glm::vec4 vrpExpanded;
vrpExpanded.x = this->vrp.x;
vrpExpanded.y = this->vrp.y;
vrpExpanded.z = this->vrp.z;
vrpExpanded.w = 1;
this->vieworientationmatrix = T * (-vrpExpanded);
Why does T*(-vrpExpanded) yield a vector? According to my knowledge of linear algebra this should yield a mat4x4.
According to my knowledge of linear algebra this should yield a mat4x4.
Then that's the problem.
According to linear algebra, a matrix can be multiplied by a scalar (which does element-wise multiplication) or by another matrix. But even then, a matrix * matrix multiplication only works if the number of columns in the first matrix equals the number of rows in the second. And the resulting matrix is one which has the number of rows of the first and the number of columns of the second.
So if you have an AxB matrix and you multiply it with a CxD matrix, this only works if B and C are equal. And the result is an AxD matrix.
Multiplying a matrix by a vector means to pretend the vector is a matrix. So if you have a 4x4 matrix and you right-multiply it with a 4-element vector, this will only make sense if you treat that vector as a 4x1 matrix (since you cannot multiply a 4x4 matrix by a 1x4 matrix). And the result of a 4x4 matrix * a 4x1 matrix is... a 4x1 matrix.
AKA: a vector.
GLM is doing exactly what you asked.
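A tiny glm sketch of the same point (values are arbitrary, names are mine):

#include <glm/glm.hpp>

glm::mat4 T = glm::mat4(1);               // 4x4 identity
glm::vec4 p(1.0f, 2.0f, 3.0f, 1.0f);      // treated as a 4x1 matrix
glm::vec4 q = T * (-p);                   // 4x4 * 4x1 = 4x1, i.e. a vec4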

Packing the normal vector and tangent vector

In the deferred shading engine I'm working on, I currently store the normal vector in a buffer with the internal format GL_RGBA16F.
I was always aware that this could not be the best solution, but I had no time to deal with it.
Recently I read "Survey of Efficient Representations for Independent Unit Vectors", which inspired me to use Octahedral Normal Vectors (ONV) and to change the buffer to GL_RG16_SNORM:
Encode the normal vector (vec3 to vec2):
// Returns +/- 1
vec2 signNotZero( vec2 v )
{
    return vec2((v.x >= 0.0) ? +1.0 : -1.0, (v.y >= 0.0) ? +1.0 : -1.0);
}
// Assume normalized input. Output is on [-1, 1] for each component.
vec2 float32x3_to_oct( in vec3 v )
{
    // Project the sphere onto the octahedron, and then onto the xy plane
    vec2 p = v.xy * (1.0 / (abs(v.x) + abs(v.y) + abs(v.z)));
    // Reflect the folds of the lower hemisphere over the diagonals
    return (v.z <= 0.0) ? ((1.0 - abs(p.yx)) * signNotZero(p)) : p;
}
Decode the normal vector (vec2 to vec3):
vec3 oct_to_float32x3( vec2 e )
{
    vec3 v = vec3(e.xy, 1.0 - abs(e.x) - abs(e.y));
    if (v.z < 0) v.xy = (1.0 - abs(v.yx)) * signNotZero(v.xy);
    return normalize(v);
}
Since I have now implemented an anisotropic lighting model, it is necessary to store the tangent vector as well as the normal vector. I want to store both vectors in one and the same color attachment of the frame buffer. That brings me to my question: what is an efficient compromise for packing a unit normal vector and a tangent vector into a buffer?
Of course it would be easy with the algorithms from the paper to store the normal vector in the RG channels and the tangent vector in the BA channels of a GL_RGBA16_SNORM buffer, and this is my current implementation too.
But since the normal vector and the tangent vector are always orthogonal, there must be a more elegant way, which either increases accuracy or saves memory.
So the real question is: how can I take advantage of the fact that the two vectors are orthogonal? Can I store both vectors in a GL_RGB16_SNORM buffer, and if not, can I improve the accuracy when I pack them into a GL_RGBA16_SNORM buffer?
The following considerations are purely mathematical and I have no experience with their practicality. However, I think that especially Option 2 might be a viable candidate.
Both of the following options have in common how they state the problem: Given a normal (that you can reconstruct using ONV), how can one encode the tangent with a single number.
Option 1
The first option is very close to what meowgoesthedog suggested. Define an arbitrary reference vector (e.g. (0, 0, 1)). Then encode the tangent as the angle (normalized to the [-1, 1] range) that you need to rotate this vector about the normal to match the tangent direction (after projecting on the tangent plane, of course). You will need two different reference vectors (or even three) and choose the correct one depending on the normal. You don't want the reference vector to be parallel to the normal. I assume that this is computationally more expensive than the second option but that would need measuring. But you would get a uniform error distribution in return.
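A minimal C++/glm sketch of Option 1, assuming unit-length, orthogonal inputs; the reference vectors, the threshold, and the function names are my own choices, not part of the answer:

#include <cmath>
#include <glm/glm.hpp>
#include <glm/gtc/constants.hpp>

// Reference vector that is never parallel to the normal.
static glm::vec3 referenceFor(const glm::vec3& n)
{
    return std::abs(n.z) < 0.9f ? glm::vec3(0, 0, 1) : glm::vec3(1, 0, 0);
}

// n and t are unit length and orthogonal; returns the angle normalized to [-1, 1].
float encodeTangentAngle(const glm::vec3& n, const glm::vec3& t)
{
    glm::vec3 r  = referenceFor(n);
    glm::vec3 rp = glm::normalize(r - glm::dot(r, n) * n);   // project onto tangent plane
    float angle  = std::atan2(glm::dot(glm::cross(rp, t), n), glm::dot(rp, t));
    return angle / glm::pi<float>();
}

glm::vec3 decodeTangentAngle(const glm::vec3& n, float a)
{
    glm::vec3 r  = referenceFor(n);
    glm::vec3 rp = glm::normalize(r - glm::dot(r, n) * n);
    float angle  = a * glm::pi<float>();
    // Rotate rp about n (Rodrigues formula, simplified because dot(n, rp) == 0).
    return rp * std::cos(angle) + glm::cross(n, rp) * std::sin(angle);
}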
Option 2
Let's consider the plane orthogonal to the tangent. This plane can be defined either by the tangent or by two vectors that lie in the plane. We know one vector: the surface normal. If we know a second vector v, we can calculate the tangent as t = normalize(cross(normal, v)). To encode this vector, we can prescribe two components and solve for the remaining one. E.g. let our vector be (1, 1, x). Then, to encode the vector, we need to find x, such that cross((1, 1, x), normal) is parallel to the tangent. This can be done with some simple arithmetic. Again, you would need a few different vector templates to account for all scenarios. In the end, you have a scheme whose encoder is more complex but whose decoder couldn't be simpler. The error distribution will not be as uniform as in Option 1, but should be ok for a reasonable choice of vector templates.
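A minimal sketch of the Option 2 arithmetic with the template vector (1, 1, x); the names are mine, and the fallback templates for a tiny t.z and for the case where cross(normal, v) comes out antiparallel to the tangent are omitted (the answer notes such extra templates are needed):

#include <glm/glm.hpp>

// Encoder: v = (1, 1, x) must lie in the plane orthogonal to the tangent t,
// i.e. dot(v, t) = 0  =>  t.x + t.y + x * t.z = 0. Assumes |t.z| is not tiny.
float encodeTangent(const glm::vec3& t)
{
    return -(t.x + t.y) / t.z;
}

// Decoder: cross(n, v) is orthogonal to both n and v, hence parallel to the tangent.
glm::vec3 decodeTangent(const glm::vec3& n, float x)
{
    return glm::normalize(glm::cross(n, glm::vec3(1.0f, 1.0f, x)));
}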

matrix order in skeletal animation using assimp

I had followed this tutorial and got the output animation for a rigged model as expected. The tutorial uses assimp, glsl and c++ to load a rigged model from a file. However, there were things that I couldn't figure out.
The first thing is that assimp's transformation matrices are row-major, and the tutorial uses a Matrix4f class which uses those transformation matrices just as they are, i.e. in row-major order. The constructor of that Matrix4f class is as given:
Matrix4f(const aiMatrix4x4& AssimpMatrix)
{
    m[0][0] = AssimpMatrix.a1; m[0][1] = AssimpMatrix.a2; m[0][2] = AssimpMatrix.a3; m[0][3] = AssimpMatrix.a4;
    m[1][0] = AssimpMatrix.b1; m[1][1] = AssimpMatrix.b2; m[1][2] = AssimpMatrix.b3; m[1][3] = AssimpMatrix.b4;
    m[2][0] = AssimpMatrix.c1; m[2][1] = AssimpMatrix.c2; m[2][2] = AssimpMatrix.c3; m[2][3] = AssimpMatrix.c4;
    m[3][0] = AssimpMatrix.d1; m[3][1] = AssimpMatrix.d2; m[3][2] = AssimpMatrix.d3; m[3][3] = AssimpMatrix.d4;
}
However, in the tutorial for calculating the final node transformation, the calculations are done expecting the matrices to be in column major order, which is shown below:
Matrix4f NodeTransformation;
NodeTransformation = TranslationM * RotationM * ScalingM; //note here
Matrix4f GlobalTransformation = ParentTransform * NodeTransformation;
if (m_BoneMapping.find(NodeName) != m_BoneMapping.end())
{
    unsigned int BoneIndex = m_BoneMapping[NodeName];
    m_BoneInfo[BoneIndex].FinalTransformation = m_GlobalInverseTransform * GlobalTransformation * m_BoneInfo[BoneIndex].BoneOffset;
    m_BoneInfo[BoneIndex].NodeTransformation = GlobalTransformation;
}
Finally, since the matrices calculated are in row-major order, this is specified when passing the matrices to the shader by setting the GL_TRUE transpose flag in the following function. OpenGL then knows the supplied data is in row-major order, since OpenGL itself uses column-major order.
void SetBoneTransform(unsigned int Index, const Matrix4f& Transform)
{
    glUniformMatrix4fv(m_boneLocation[Index], 1, GL_TRUE, (const GLfloat*)Transform);
}
So, how does the calculation, done assuming column-major order,
transformation = translation * rotation * scale * vertices
yield a correct output? I expected that for the calculation to hold true, each matrix should first be transposed to change it to column order, followed by the above calculation, and finally transposed again to obtain back a row-order matrix, which is also discussed in this link. However, doing so produced a horrible output. Is there something that I am missing here?
You are confusing two different things:
the layout the data has in memory (row vs. column major order)
the mathematical interpretation of the operations (things like multiplication order)
It is often claimed that when working with row-major vs. column-major, things have to be transposed and the matrix multiplication order has to be reversed. But this is not true.
What is true is that mathematically, transpose(A*B) = transpose(B) * transpose(A). However, that is irrelevant here, because the matrix storage order is independent of, and orthogonal to, the mathematical interpretation of the matrices.
What I mean by this is: In math, it is exactly defined what a row and a column of a matrix is, and each element can be uniquely addressed by these two "coordinates". All the matrix operations are defined based on this convention. For example, in C=A*B, the element in the first row and the first column of C, is calculated as the dot product of the first row of A (transposed to a column vector) and the first column of B.
Now, the matrix storage order just defines how the matrix data is laid out in memory. As a generalization, we could define a function f(row,col) mapping each (row, col) pair to some memory address. We could then write our matrix functions using f, and we could change f to adopt row-major, column-major, or something completely different (like a Z-order curve, if we want some fun).
It doesn't matter which f we actually use (as long as the mapping is bijective); the operation C=A*B will always have the same result. What changes is just the data in memory, but we also have to use f to interpret that data. We could just write a simple print function, also using f, to print the matrix as the 2D array of columns x rows a typical human would expect.
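To make this concrete, here is a small sketch (my own illustration, not from the answer) of a multiplication routine written against such an index function f; passing rowMajor or colMajor changes only where the values live in memory, not the mathematical result:

#include <array>

using Mat = std::array<float, 16>;
using IndexFn = int (*)(int row, int col);

int rowMajor(int r, int c) { return r * 4 + c; }
int colMajor(int r, int c) { return c * 4 + r; }

// C = A * B, with all three matrices stored according to the same f.
Mat multiply(const Mat& A, const Mat& B, IndexFn f)
{
    Mat C{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            for (int k = 0; k < 4; ++k)
                C[f(r, c)] += A[f(r, k)] * B[f(k, c)];
    return C;
}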
The confusion comes in when you use a matrix in a different layout than the one the implementation of the matrix functions was designed for.
If you have a matrix library which internally assumes column-major layout, and you pass in data in row-major format, it is as if you had transposed that matrix beforehand - and only at this point do things get screwed up.
To confuse things even more, there is another issue related to this: the matrix * vector vs. vector * matrix issue. Some people like to write x' = x * M (with x' and x being row vectors), while others like to write y' = N * y (with column vectors). It is clear that mathematically, M*x = transpose(transpose(x) * transpose(M)), so people often also confuse this with row- vs. column-major order effects - but it is totally independent of that. It is just a matter of convention whether you want to use the one or the other.
So, to finally answer your question:
The transformation matrices created there are written for the convention of multiplying matrix * vector, so that Mparent * Mchild is the correct matrix multiplication order.
Up to this point, the actual data layout in memory does not matter at all. It only begins to matter because now, we are interfacing a different API, with its own conventions. GL's default order is column-major. The matrix class in use is written for row-major memory layout. So you just transpose at this point, so that GL's interpretation of that matrix matches your other library's.
The alternative would be not to convert them, and to account for that by incorporating the implicit operation this creates into the system - either by changing the multiplication order in the shader, or by adjusting the operations which created the matrix in the first place. However, I would not recommend going that path, because the resulting code will be totally unintuitive: in the end, it would mean working with column-major matrices in a matrix class that uses a row-major interpretation.
Yes, the memory layout is similar for glm and assimp : data.html
But, according to the doc page : classai_matrix4x4t
The assimp matrix is always row-major whereas the glm matrix is always column-major, meaning you need to transpose on conversion:
inline static Mat4 Assimp2Glm(const aiMatrix4x4& from)
{
return Mat4(
(double)from.a1, (double)from.b1, (double)from.c1, (double)from.d1,
(double)from.a2, (double)from.b2, (double)from.c2, (double)from.d2,
(double)from.a3, (double)from.b3, (double)from.c3, (double)from.d3,
(double)from.a4, (double)from.b4, (double)from.c4, (double)from.d4
);
}
inline static aiMatrix4x4 Glm2Assimp(const Mat4& from)
{
return aiMatrix4x4(from[0][0], from[1][0], from[2][0], from[3][0],
from[0][1], from[1][1], from[2][1], from[3][1],
from[0][2], from[1][2], from[2][2], from[3][2],
from[0][3], from[1][3], from[2][3], from[3][3]
);
}
PS: In assimp, the letters a-d stand for the row and the digits 1-4 stand for the column.
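A hypothetical usage sketch (node and modelLoc are assumed to exist; the glm::mat4 cast covers the case where Mat4 above is a double-precision glm type):

#include <glm/gtc/type_ptr.hpp>

glm::mat4 model = glm::mat4(Assimp2Glm(node->mTransformation)); // row-major in, column-major out
glUniformMatrix4fv(modelLoc, 1, GL_FALSE, glm::value_ptr(model)); // no transpose flag needed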

vector * matrix product efficiency issue

Just as Z boson recommended, I am using a column-major matrix format in order to avoid having to use the dot product. I don't see a feasible way to avoid it when multiplying a vector with a matrix, though. The matrix multiplication trick requires efficient extraction of rows (or columns, if we transpose the product). To multiply a vector by a matrix, we therefore transpose:
(b * A)^T = A^T * b^T
A is a matrix, b a row vector, which, after being transposed, becomes a column vector. Its rows are just single scalars and the vector * matrix product implementation becomes an inefficient implementation of dot products of columns of (non-transposed) matrix A with b. Is there a way to avoid performing these dot products? The only way I see that could do it, would involve row extraction, which is inefficient with the column-major matrix format.
This can be understood from the original post on this (my first on SO): efficient-4x4-matrix-vector-multiplication-with-sse-horizontal-add-and-dot-prod. The rest of the discussion applies to 4x4 matrices.
Here are two methods to do matrix times vector (v = Mu where v and u are column vectors):
method 1) v1 = dot(row1, u), v2 = dot(row2, u), v3 = dot(row3, u), v4 = dot(row4, u)
method 2) v = u1*col1 + u2*col2 + u3*col3 + u4*col4.
The first method is more familiar from math class while the second is more efficient for a SIMD computer. The second method uses vectorized math (like numpy) e.g.
u1*col1 = (u1*col1.x, u1*col1.y, u1*col1.z, u1*col1.w), i.e. the scalar u1 is broadcast across all four lanes.
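For illustration, a minimal SSE sketch of method 2 for v = Mu with a column-major M; the function and variable names are my own:

#include <xmmintrin.h>

// M: 16 floats, columns contiguous; u, v: 4 floats each.
void mat4MulVec4ColMajor(const float* M, const float* u, float* v)
{
    __m128 r = _mm_mul_ps(_mm_loadu_ps(M + 0), _mm_set1_ps(u[0]));           // u1*col1
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(M + 4),  _mm_set1_ps(u[1])));  // + u2*col2
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(M + 8),  _mm_set1_ps(u[2])));  // + u3*col3
    r = _mm_add_ps(r, _mm_mul_ps(_mm_loadu_ps(M + 12), _mm_set1_ps(u[3])));  // + u4*col4
    _mm_storeu_ps(v, r);
}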
Now let's look at vector times matrix (v = uM where v and u are row vectors)
method 1) v1 = dot(col1, u), v2 = dot(col2, u), v3 = dot(col3, u), v4 = dot(col4, u)
method 2) v = u1*row1 + u2*row2 + u3*row3 + u4*row4.
Now the roles of columns and rows have swapped but method 2 is still the efficient method to use on a SIMD computer.
To do matrix times vector efficiently on a SIMD computer the matrix should be stored in column-major order. To do vector times matrix efficient on a SIMD computer the matrix should be stored in row-major order.
As far as I understand OpenGL uses column major ordering and does matrix times vector and DirectX uses row-major ordering and does vector times matrix.
If you have three matrix transformations that you do in order M1 first then M2 then M3 with matrix times vector you write it as
v = M3*M2*M1*u //u and v are column vectors - OpenGL form
With vector times matrix you write
v = u*M1*M2*M3 //u and v are row vectors - DirectX form
Neither form is better than the other in terms of efficiency. It's just a question of notation (and causing confusion which is useful when you have competition).
It's important to note that for matrix*matrix row-major versus column-major storage is irrelevant.
If you want to know why the vertical SIMD instructions are faster than the horizontal ones that's a separate question which should be asked but in short the horizontal ones really act in serial rather than parallel and are broken up into several micro-ops (which is why ironically dppd is faster than dpps).

z direction of a matrix in a GLSL shader

I have the following (working) code in a GLSL shader:
vec3 normal = vec3(u_mvMatrix * vec4(0.0,0.0,1.0,0.0));//normal in eye space (no scaling in u_mvMatrix).
However, since what I need is just the z direction of the u_mvMatrix, which of the following lines is equivalent to the line above:
vec3 normal = u_mvMatrix[2].xyz;
or:
vec3 normal = vec3(u_mvMatrix[0].z,u_mvMatrix[1].z,u_mvMatrix[2].z);
?
Thanks in advance for your help!
OpenGL matrices are column-major by default.
You can make them row-major in GLSL with a layout qualifier (e.g. layout (row_major) uniform mat4 u_mvMatrix;). Note that pre-multiplication of a row-major matrix is the same as post-multiplication of a column-major matrix, so this is a true statement:
(mat * vec) == (vec * transpose (mat))
In the fixed-function pipeline when OpenGL does a matrix multiply, it is always column-major and post-multiplied. Direct3D is row-major and pre-multiplied. In the programmable pipeline you have full control over the matrix notation and whether you pre- or post-multiply.
Getting back to your question, in this example you are post-multiplying (the matrix is on the left-hand side of the * operator) a column-major matrix. The correct equivalence is therefore vec3 normal = u_mvMatrix[2].xyz;
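As a quick sanity check in C++ with glm (which is also column-major by default; the epsilon comparison and names are mine):

#include <glm/glm.hpp>
#include <glm/gtc/epsilon.hpp>

// Multiplying by (0, 0, 1, 0) extracts the third column, i.e. mv[2].
bool zColumnMatches(const glm::mat4& mv)
{
    glm::vec3 a = glm::vec3(mv * glm::vec4(0.0f, 0.0f, 1.0f, 0.0f));
    glm::vec3 b = glm::vec3(mv[2]);
    return glm::all(glm::epsilonEqual(a, b, 1e-6f));
}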
For a lengthy article that does a decent job explaining what I just wrote, see this site.