Matrix calculations for GPU skinning - C++

I'm trying to do skeletal animation in OpenGL using Assimp as my model import library.
What exactly do I need to do with the bones' offsetMatrix variable? What do I need to multiply it by?

Let's take for instance this code, which I used to animate characters in a game I worked on. I used Assimp too, to load the bone information, and I have read the OGL tutorial already pointed out by Nico.
glm::mat4 getParentTransform()
{
    if (this->parent)
        return parent->nodeTransform;
    else
        return glm::mat4(1.0f);
}

void updateSkeleton(Bone* bone = NULL)
{
    bone->nodeTransform = bone->getParentTransform() // this retrieves the transformation one level above in the tree
                        * bone->transform            // bone->transform is the Assimp matrix assimp_node->mTransformation
                        * bone->localTransform;      // this is your T * R matrix

    bone->finalTransform = inverseGlobal        // the inverse of scene->mRootNode->mTransformation from Assimp
                         * bone->nodeTransform  // defined above
                         * bone->boneOffset;    // which is ai_mesh->mBones[i]->mOffsetMatrix

    for (size_t i = 0; i < bone->children.size(); i++) {
        updateSkeleton(&bone->children[i]);
    }
}
Essentially, the GlobalTransform, as it is referred to in the tutorial Skeletal Animation with Assimp, or more precisely the transform of the root node scene->mRootNode->mTransformation, is the transformation from local space to global space. To give you an example: when you create your mesh or load your character in a 3D modeler (let's pick Blender for instance), it is usually positioned (by default) at the origin of the coordinate system and its rotation is set to the identity quaternion.
However, you can translate/rotate your mesh/character away from the origin (0,0,0), and a single scene may even contain multiple meshes with different positions. When you load them, especially if you do skeletal animation, it is mandatory to bring them back into local space (i.e. back to the origin 0,0,0), and this is why you have to multiply everything by the InverseGlobal (which brings your mesh back to local space).
After that you need to multiply by the node transform, which is the product of the parentTransform (the transformation one level up in the tree, i.e. the accumulated transform so far), the transform (assimp_node->mTransformation, which is just the transformation of the bone relative to the node's parent) and your local transformation (any T * R you want to apply) to do forward kinematics, inverse kinematics or key-frame interpolation.
Finally there is the boneOffset (ai_mesh->mBones[i]->mOffsetMatrix), which transforms from mesh space to bone space in bind pose, as stated in the documentation.
Here is a link to GitHub if you want to look at the whole code for my Skeleton class.
Hope it helps.

The offset matrix defines the transform (translation, scale, rotation) that takes a vertex in mesh space and converts it to "bone" space. As an example, consider a vertex and a bone with the following properties:
Vertex Position<0, 1, 2>
Bone Position<10, 2, 4>
Bone Rotation<0,0,0,1> // Note - no rotation
Bone Scale<1, 1, 1>
If we multiply the vertex by the offset matrix in this case, we get a vertex position of <-10, -1, -2>.
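To illustrate that arithmetic with GLM (a minimal sketch just for this example; the offset matrix here is simply the inverse of the bone's bind-pose transform):
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Bind-pose transform of the bone from the example: translation only, no rotation, unit scale.
glm::mat4 boneBind = glm::translate(glm::mat4(1.0f), glm::vec3(10.0f, 2.0f, 4.0f));

// The offset matrix maps mesh space -> bone space, i.e. the inverse of the bind-pose transform.
glm::mat4 offset = glm::inverse(boneBind);

glm::vec4 vertexInBoneSpace = offset * glm::vec4(0.0f, 1.0f, 2.0f, 1.0f); // -> (-10, -1, -2, 1)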
How do we use this? You have two options for how to use this matrix, which comes down to how we store the vertex data in the vertex buffers. The options are:
1) Store the mesh vertices in mesh space
2) Store the mesh vertices in bone space
In the case of #1, we would take the offsetMatrix and apply it to the vertices influenced by the bone as we build the vertex buffer. Then, when we animate the mesh, we apply only the animated matrix for that bone.
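A rough sketch of option #1 at buffer-build time (using GLM; the container names are placeholders, and I'm assuming a single influencing bone per vertex for simplicity):
// Bake the mesh-space -> bone-space conversion into the buffer up front.
for (size_t v = 0; v < meshVertices.size(); ++v)
{
    glm::vec4 p(meshVertices[v], 1.0f);
    bufferVertices[v] = glm::vec3(offsetMatrix[boneOfVertex[v]] * p); // stored in bone space
}
// At run time the shader then only needs the bone's animated matrix.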
In the case of #2, we would use the offsetMatrix in combination with the animation matrix for that bone when transforming the vertices stored in the vertex buffer. So it would be something like (note: you may have to switch the matrix concatenation order here):
anim_vertex = (offset_matrix * anim_matrix) * mesh_vertex
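With column-major GLM matrices, for example, that might look like the following (just a sketch; animMatrix, offsetMatrix and meshVertex stand for whatever your engine calls them, and as noted the concatenation order depends on your conventions):
glm::vec4 animVertex = animMatrix * offsetMatrix * glm::vec4(meshVertex, 1.0f);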
Does this help?

As I already assumed, the mOffsetMatrix is the inverse bind pose matrix. This tutorial states the correct transformations that you need for linear blend skinning:
You first need to evaluate your animation state. This will give you a system transform from animated bone space to world space for every bone (GlobalTransformation in the tutorial). The mOffsetMatrix is the system transform from world space to bind pose bone space. Therefore, what you do for skinning is the following (assuming that a specific vertex is influenced by a single bone): Transform the vertex to bone space with mOffsetMatrix. Now assume an animated bone and transform the intermediate result back from animated bone space to world space. So:
boneMatrix[i] = animationMatrix[i] * mOffsetMatrix[i]
If the vertex is influenced by multiple bones, LBS simply averages the results. That's where the weights come into play. Skinning is usually implemented in a vertex shader:
vec4 result = vec4(0);
for each influencing bone i
    result += weight[i] * boneMatrix[i] * vertexPos;
Usually, the maximum number of influencing bones is fixed and you can unroll the for loop.
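On the CPU side, this typically amounts to a per-bone loop filling the uniform array before the draw call. A minimal sketch with GLM (animationMatrix and offsetMatrix here stand for the evaluated GlobalTransformation and the converted mOffsetMatrix; the names are mine, not the tutorial's):
#include <vector>
#include <glm/glm.hpp>

std::vector<glm::mat4> boneMatrix(numBones);
for (size_t i = 0; i < numBones; ++i)
    boneMatrix[i] = animationMatrix[i] * offsetMatrix[i];
// Upload boneMatrix as a uniform array (or UBO) and index it in the shader loop above.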
The tutorial uses an additional m_GlobalInverseTransform for the boneMatrix. However, I have no clue why they do that. Basically, this undoes the overall transformation of the entire scene. Probably it is used to center the model in the view.

Related

Skeletal animation with ASSIMP

I have been trying to implement skeletal animation in my own 3D OpenGL/C++ game engine using ASSIMP to import the models.
I think the problem is caused by the calculation of the bone matrices that are loaded as uniform variables in the shader, because if I set them to identity matrices, the mesh is rendered in its bind pose.
These are the functions I use to initialize and calculate the final bone matrices:
void Joint::AddToSceneGraph(GameObject* root)
{
    GameObject* parent = root->FindChild(m_name);
    //m_globalInverseMatrix = inverse(root->GetTransform().GetMatrix());
    parent->AddComponent(this);
    while (parent->GetParent() != root)
        parent = parent->GetParent();
    parent->GetTransform().SetParent(NULL); // (1)
}

mat4 Joint::GetMatrix()
{
    mat4 ans = /*m_globalInverseMatrix*/ GetTransform().GetMatrix() * m_offsetMatrix;
    return ans;
}
Since I am trying to render the model in its bind pose, I won't supply the code for calculating the animation matrices.
Some clarifications - I have a Transform class which has a Transform* parent and whose GetMatrix() method builds the matrix from the parent's matrix, the scaling and translation vec3s, and the rotation quaternion, so the relation between parent and child is taken into account. I assume that the parent transform of the root joint has to be NULL, and thus its matrix the identity, which is the purpose of (1). Although I am not sure about this assumption, I am sure that the node containing the root of the skeleton and the node containing the meshes are siblings.
Also, I am not sure whether I should use m_globalInverseMatrix, and furthermore what exactly its purpose is.
In general I think my main issue is a misunderstanding of how ASSIMP calculates the offset matrix, or so-called inverse bind pose matrix, and how to invert its effect. This issue results in the model looking "packed" into itself.
According to the Assimp docs, the offset matrix of a bone transforms from mesh space to bone space.
I managed to render the bind pose with a hierarchy similar to yours (if I have not misunderstood, you have a root node with two children, one spawning the skeleton hierarchy and the other containing the mesh). What you have to do is set m_globalInverseMatrix to the inverse transform of the node containing the mesh. Then, each bone's transform is:
boneTransform = m_globalInverseMatrix * boneHierarchyTransforms * boneOffsetMatrix
where the boneHierarchyTransform comes from traversing the tree up to each bone node, and the boneOffsetMatrix is the one you mention and that I referred to in the first paragraph.
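A minimal sketch of that computation (assuming a simple node structure; the types and member names are illustrative, not from the question's engine):
#include <glm/glm.hpp>

struct Node { glm::mat4 localTransform; const Node* parent; };

// Walk up to the root, accumulating each node's local transform.
glm::mat4 hierarchyTransform(const Node* node)
{
    glm::mat4 result = node->localTransform;
    for (const Node* p = node->parent; p != nullptr; p = p->parent)
        result = p->localTransform * result;
    return result;
}

// Per bone: boneTransform = globalInverseMatrix * hierarchyTransform(boneNode) * boneOffsetMatrix;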

Normal mapping without using Tangent/Bitangent vectors

Unfortunately, many tutorials describe the TBN matrix as a de facto must for any type of normal mapping without going into much detail on why that's the case, which confused me in one particular scenario.
Let's assume I need to apply bump/normal mapping to a simple quad on screen, which could later be transformed by its normal matrix.
If the quad's surface normal in "rest position", before any transformation, points exactly in the positive-Z direction (OpenGL), isn't it sufficient to just transform the vector you read from the normal texture map with the model matrix?
vec3 bumpnormal = texture2D(texture, Coord.xy).rgb;
bumpnormal = mat3(model) * bumpnormal; // assuming no scaling occurred
I do understand how things would change if we were computing the bumpnormal on a cube without taking into account how different faces with the same texture coordinates actually have different orientations, which leads me to the next question.
Assuming that an entire model uses only a single normal map texture, without any repetition of said texture coordinates in different parts of the model, is it possible to save those 6 floats of the tangent/bitangent vectors stored for each vertex and the computation of the TBN matrix altogether, while getting the same results by simply transforming the bumpnormal with the model's matrix?
If that's the case, why isn't it the preferred solution?
If the quad's surface normal in "rest position", before any transformation, points exactly in the positive-Z direction (OpenGL), isn't it sufficient to just transform the vector you read from the normal texture map with the model matrix?
No.
Let's say the value you get from the normal map is (1, 0, 0). So that means the normal in the map points right.
So... where is that exactly? Or more to the point, what space are we in when we say "right"?
Now, you might immediately think that right is just +X in model space. But the thing is, it isn't. Why?
Because of your texture coordinates.
If your model-space matrix performs a 90 degree rotation, clockwise, around the model-space Z axis, and you transform your normal by that matrix, then the normal you get should go from (1, 0, 0) to (0, -1, 0). That is what is expected.
But if you have a square facing +Z, and you rotate it by 90 degrees around the Z axis, should that not produce the same result as rotating the texture coordinates? After all, it's the texture coordinates that define what U and V mean relative to model space.
If the top-right texture coordinate of your square is (1, 1), and the bottom left is (0, 0), then "right" in texture space means "right" in model space. But if you change the mapping, so that (1, 1) is at the bottom-right and (0, 0) is at the top-left, then "right" in texture space has become "down" (-Y) in model space.
If you ignore the texture coordinates, the mapping from the model-space positions to locations on the texture, then your (1, 0, 0) normal will still be pointing "right" in model space. But your texture mapping says that it should be pointing down (0, -1, 0) in model space. Just like it would have if you had rotated model space itself.
With a tangent-space normal map, normals stored in the texture are relative to how the texture is mapped onto a surface. Defining a mapping from model space into the tangent space (the space of the texture's mapping) is what the TBN matrix is for.
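For reference, the tangent (and analogously the bitangent) is commonly derived per triangle from its positions and UVs roughly like this - a C++/GLM sketch, not part of the answer above:
#include <glm/glm.hpp>

// Compute the tangent of a triangle from its positions and texture coordinates.
glm::vec3 triangleTangent(glm::vec3 p0, glm::vec3 p1, glm::vec3 p2,
                          glm::vec2 uv0, glm::vec2 uv1, glm::vec2 uv2)
{
    glm::vec3 e1 = p1 - p0, e2 = p2 - p0;
    glm::vec2 d1 = uv1 - uv0, d2 = uv2 - uv0;
    float r = 1.0f / (d1.x * d2.y - d1.y * d2.x); // assumes a non-degenerate UV mapping
    return (e1 * d2.y - e2 * d1.y) * r;           // together with B and the vertex normal N, this forms the TBN basis
}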
This gets more complicated as the mapping between the object and the normals gets more complex. You could fake it for the case of a quad, but for a general figure, it needs to be algorithmic. The mapping is not constant, after all. It involves stretching and skewing as different triangles use different texture coordinates.
Now, there are object-space normal maps, which generate normals that are explicitly in model space. These avoid the need for a tangent-space basis matrix. But it intimately ties a normal map to the object it is used with. You can't even do basic texture coordinate animation, let alone allow a normal map to be used with two separate objects. And they're pretty much unworkable if you're doing bone-weight skinning, since triangles often change sizes.
http://www.thetenthplanet.de/archives/1180
vec3 perturb_normal( vec3 N, vec3 V, vec2 texcoord )
{
    // assume N, the interpolated vertex normal and
    // V, the view vector (vertex to eye)
    vec3 map = texture2D( mapBump, texcoord ).xyz;
#ifdef WITH_NORMALMAP_UNSIGNED
    map = map * 255./127. - 128./127.;
#endif
#ifdef WITH_NORMALMAP_2CHANNEL
    map.z = sqrt( 1. - dot( map.xy, map.xy ) );
#endif
#ifdef WITH_NORMALMAP_GREEN_UP
    map.y = -map.y;
#endif
    mat3 TBN = cotangent_frame( N, -V, texcoord );
    return normalize( TBN * map );
}
Basically I think you are describing this method, which I agree is superior in most respects. It makes later calculations much cleaner instead of devolving into a mess of space transformations.
Instead of transforming everything into tangent space, you just find what the correct world-space normal is. That's what I am using in my projects, and I am very happy I found this method.

Transforming Vertices by Joints with Smooth Skinning

I'm having an issue when trying to animate a mesh within my game using smooth skinning. I'm exporting a mesh, its joints, and all animation data from Maya, and it's importing properly within the game. I've also got my joints displayed in-game as spheres, just to make sure they are animating properly, and they are. I'm having trouble when transforming the vertices to bone space, and I'm clueless as to why. Here's a basic run-down of what I'm currently doing.
1) Precalculating the inverse of each bone in its bind pose. The bone hierarchy is also set up here; I'm only setting their local matrices, as their world matrices are calculated upon request.
2) Interpolating matrices between keyframes. This appears to be working, as the spheres are animating correctly.
3) Copying the mesh's vertices from the bind pose into a separate buffer to be modified.
4) Transforming each vertex (see the sketch after this list):
For each vertex
    For each influencing bone in vertex
        Vertex tempvert;
        tempvert = vertex.position * local bone inverse matrix;
        tempvert = tempvert * world interpolated bone transform;
        tempvert *= influence weight;
        vertex += tempvert;
5) Update the GPU with the transformed vertices.
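For comparison, with column-vector (GLM-style) conventions a CPU skinning loop usually ends up looking roughly like this - a sketch only, with illustrative names (bindPoseVertices, influences, worldInterpolated, inverseBindPose, skinnedVertices are not the asker's identifiers); note the conventional order: inverse bind pose first, then the interpolated world transform:
struct BoneInfluence { int bone; float weight; }; // illustrative type

for (size_t v = 0; v < bindPoseVertices.size(); ++v)
{
    glm::vec4 source(bindPoseVertices[v], 1.0f);
    glm::vec4 skinned(0.0f);
    for (const BoneInfluence& inf : influences[v])
    {
        // Mesh space -> bone space (inverse bind pose), then bone space -> world space (animated pose).
        skinned += inf.weight * (worldInterpolated[inf.bone] * inverseBindPose[inf.bone] * source);
    }
    skinnedVertices[v] = glm::vec3(skinned);
}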
I've also tried switching around the local and world transforms for the bones, but to no avail. I'm pretty sure I'm doing something stupid; I just want to make sure that's the case before I go tearing apart my math library.
Thanks for the help.

COLLADA: Inverse bind pose in the wrong space?

I'm working on writing my own COLLADA importer. I've gotten pretty far, loading meshes and materials and such. But I've hit a snag on animation, specifically: joint rotations.
The formula I'm using for skinning my meshes is straightforward:
weighted;
for (i = 0; i < joint_influences; i++)
{
    weighted +=
        joint[joint_index[i]]->parent->local_matrix *
        joint[joint_index[i]]->local_matrix *
        skin->inverse_bind_pose[joint_index[i]] *
        position *
        skin->weight[j];
}
position = weighted;
And as far as the literature is concerned, this is the correct formula. Now, COLLADA specifies two types of rotations for the joints: local and global. You have to concatenate the rotations together to get the local transformation for the joint.
What the COLLADA documentation does not differentiate between is the joint's local rotation and the joint's global rotation. But in most of the models I've seen, rotations can have an id of either rotate (global) or jointOrient (local).
When I disregard the global rotations and only use the local ones, I get the bind pose for the model. But when I add the global rotations to the joint's local transformation, strange things start to happen.
This is without using global rotations:
And this is with global rotations:
In both screenshots I'm drawing the skeleton using lines, but in the first it's invisible because the joints are inside the mesh. In the second screenshot the vertices are all over the place!
For comparison, this is what the second screenshot should look like:
It's hard to see, but you can see that the joints are in the correct position in the second screenshot.
But now the weird thing. If I disregard the inverse bind pose as specified by COLLADA and instead take the inverse of the joint's parent local transform times the joint's local transform, I get the following:
In this screenshot I'm drawing a line from each vertex to the joints that have influence. The fact that I get the bind pose is not so strange, because the formula now becomes:
world_matrix * inverse_world_matrix * position * weight
But it leads me to suspect that COLLADA's inverse bind pose is in the wrong space.
So my question is: in what space does COLLADA specifies its inverse bind pose? And how can I transform the inverse bind pose to the space I need?
I started by comparing my values to the ones I read from Assimp (an open source model loader). Stepping through the code I looked at where they built their bind matrices and their inverse bind matrices.
Eventually I ended up in SceneAnimator::GetBoneMatrices, which contains the following:
// Bone matrices transform from mesh coordinates in bind pose to mesh coordinates in skinned pose
// Therefore the formula is offsetMatrix * currentGlobalTransform * inverseCurrentMeshTransform
for( size_t a = 0; a < mesh->mNumBones; ++a)
{
    const aiBone* bone = mesh->mBones[a];
    const aiMatrix4x4& currentGlobalTransform
        = GetGlobalTransform( mBoneNodesByName[ bone->mName.data ]);
    mTransforms[a] = globalInverseMeshTransform * currentGlobalTransform * bone->mOffsetMatrix;
}
globalInverseMeshTransform is always identity, because the mesh node has no transformation of its own. currentGlobalTransform is the bind matrix, the joint's parents' local matrices concatenated with the joint's local matrix. And mOffsetMatrix is the inverse bind matrix, which comes directly from the skin.
I checked the values of these matrices against my own (oh yes, I compared them in a watch window) and they were exactly the same, off by maybe 0.0001%, but that's insignificant. So why does Assimp's version work and mine doesn't, even though the formula is the same?
Here's what I got:
When Assimp finally uploads the matrices to the skinning shader, they do the following:
helper->piEffect->SetMatrixTransposeArray( "gBoneMatrix", (D3DXMATRIX*)matrices, 60);
Waaaaait a second. They upload them transposed? It couldn't be that easy. No way.
Yup.
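(For context: Assimp's aiMatrix4x4 is stored row-major, while column-major math libraries such as GLM effectively expect the transpose, so a conversion like the following sketch is common when loading. This is illustrative, not from Assimp's sample code.)
#include <assimp/matrix4x4.h>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

// aiMatrix4x4 stores rows contiguously; glm::make_mat4 reads columns, so transpose afterwards.
glm::mat4 aiToGlm(const aiMatrix4x4& m)
{
    return glm::transpose(glm::make_mat4(&m.a1));
}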
Something else I was doing wrong: I was converting the coordinates to the right system (centimeters to meters) before applying the skinning matrices. That results in completely distorted models, because the matrices are designed for the original coordinate system.
FUTURE GOOGLERS
Read all the node transforms (rotate, translation, scale, etc.) in the order you receive them.
Concatenate them to a joint's local matrix.
Take the joint's parent and multiply it with the local matrix.
Store that as the bind matrix.
Read the skin information.
Store the joint's inverse bind pose matrix.
Store the joint weights for each vertex.
Multiply the bind matrix with the inverse bind pose matrix and transpose it, call it the skinning matrix.
Multiply the skinning matrix with the position times the joint weight and add it to the weighted position.
Use the weighted position to render.
Done!
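As a rough C++/GLM sketch of the per-vertex part of those steps (the types and names below are illustrative; the skinning matrix per joint is the bind matrix times the inverse bind pose matrix, transposed or not depending on your conventions):
#include <vector>
#include <glm/glm.hpp>

struct Influence { int joint; float weight; }; // illustrative, not from the answer

glm::vec4 skinVertex(const glm::vec4& bindPosePosition,
                     const std::vector<Influence>& influences,
                     const std::vector<glm::mat4>& skinningMatrix) // bindMatrix * inverseBindPose per joint
{
    glm::vec4 weighted(0.0f); // weighted sum over the influencing joints
    for (const Influence& inf : influences)
        weighted += inf.weight * (skinningMatrix[inf.joint] * bindPosePosition);
    return weighted;
}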
BTW, if you transpose the matrices upon loading them, rather than transposing the final matrix (which can be problematic when animating), you will want to perform your multiplication differently (the method you use above appears to be for skinning in DirectX when using OpenGL-friendly matrices - ergo the transpose).
In DirectX I transpose matrices when they are loaded from the file and then I use (in the example below I am simply applying the bind pose for the sake of simplicity):
XMMATRIX l_oWorldMatrix = XMMatrixMultiply( l_oBindPose, in_oParentWorldMatrix );
XMMATRIX l_oMatrixPallette = XMMatrixMultiply( l_oInverseBindPose, l_oWorldMatrix );
XMMATRIX l_oFinalMatrix = XMMatrixMultiply( l_oBindShapeMatrix, l_oMatrixPallette );

How to transform back-facing vertices in GLSL when creating a shadow volume

I'm writing a game using OpenGL and I am trying to implement shadow volumes.
I want to construct the shadow volume of a model on the GPU via a vertex shader. To that end, I represent the model with a VBO where:
Vertices are duplicated such that each triangle has its own unique three vertices
Each vertex has the normal of its triangle
For reasons I'm not going to get into, I was actually doing the above two points anyway, so I'm not too worried about the vertex duplication
Degenerate triangles are added to form quads inside the edges between each pair of "regular" triangles
Using this model format, inside the vertex shader I am able to find vertices that are part of triangles that face away from the light and move them back to form the shadow volume.
What I have left to figure out is what transformation exactly I should apply to the back-facing vertices.
I am able to detect when a vertex is facing away from the light, but I am unsure what transformation I should apply to it. This is what my vertex shader looks like so far:
uniform vec3 lightDir; // Parallel light.
                       // On the CPU this is represented in world
                       // space. After setting the camera with
                       // gluLookAt, the light vector is multiplied by
                       // the inverse of the modelview matrix to get
                       // it into eye space (I think that's what I'm
                       // working in :P ) before getting passed to
                       // this shader.

void main()
{
    vec3 eyeNormal = normalize(gl_NormalMatrix * gl_Normal);
    vec3 realLightDir = normalize(lightDir);
    float dotprod = dot(eyeNormal, realLightDir);
    if (dotprod <= 0.0)
    {
        // Facing away from the light.
        // Need to translate the vertex along the light vector to
        // stretch the model into a shadow volume.
        //---------------------------------//
        // This is where I'm getting stuck //
        //---------------------------------//
        // All I know is that I'll probably turn realLightDir into a
        // vec4
        gl_Position = ???;
    }
    else
    {
        gl_Position = ftransform();
    }
}
I've tried simply setting gl_Position to ftransform() - (vec4(realLightDir, 1.0) * someConstant), but this caused some kind of depth-testing bugs (some faces seemed to be visible behind others when I rendered the volume with colour), and someConstant didn't seem to affect how far the back-faces were extended.
Update - Jan. 22
Just wanted to clear up questions about what space I'm probably in. I must say that keeping track of what space I'm in is the greatest source of my shader headaches.
When rendering the scene, I first set up the camera using gluLookAt. The camera may be fixed or it may move around; it should not matter. I then use translation functions like glTranslated to position my model(s).
In the program (i.e. on the CPU) I represent the light vector in world space (three floats). I've found during development that to get this light vector into the right space for my shader, I had to multiply it by the inverse of the modelview matrix after setting the camera and before positioning the models. So, my program code is like this:
Position camera (gluLookAt)
Take light vector, which is in world space, and multiply it by the inverse of the current modelview matrix and pass it to the shader
Transformations to position models
Drawing of models
Does this make anything clearer?
The ftransform result is in clip space, so this is not the space you want to apply realLightDir in. I'm not sure which space your light is in (your comment confuses me), but what is certain is that you want to add vectors that are in the same space.
On the CPU this is represented in world
space. After setting the camera with
gluLookAt, the light vector is multiplied by
the inverse of the modelview matrix to get
it into eye space (I think that's what I'm
working in :P ) before getting passed to
this shader.
Multiplying a vector by the inverse of the modelview matrix brings the vector from view space to model space. So you're saying that your light vector (in world space) has a view->model transform applied to it, which makes little sense to me.
We have 4 spaces:
model space: the space your gl_Vertex is defined in.
world space: a space that GL does not care about in general, which represents an arbitrary space to locate the models in. It's usually what the 3D engine works in (it maps to our general understanding of world coordinates).
view space: the space that corresponds to the viewer's frame of reference. (0,0,0) is where the viewer is, looking down Z. Obtained by multiplying gl_Vertex by the modelview matrix.
clip space: the magic space that the projection matrix brings us into. The result of ftransform is in this space (so is gl_ModelViewProjectionMatrix * gl_Vertex).
Can you clarify exactly which space your light direction is in ?
What you need to do, however, is perform the light vector addition in either model, world or view space: bring all the parts of your operation into the same space. E.g. for model space, just compute the light direction in model space on the CPU, and do:
vec4 vertexTemp = gl_Vertex + vec4(lightDirInModelSpace, 0.0) * someConst;
Then you can bring that new vertex position into clip space with:
gl_Position = gl_ModelViewProjectionMatrix * vertexTemp;
Last bit, don't try to apply vector additions in clip-space. It won't generally do what you think it should do, as at that point you are necessarily dealing with homogeneous coordinates with non-uniform w.
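On the CPU side, the model-space light direction used above can be obtained by transforming the world-space direction by the inverse of the model (world) matrix. A minimal sketch with GLM (the variable names are illustrative):
#include <glm/glm.hpp>

// worldLightDir: the light direction in world space.
// modelMatrix:   the matrix placing the model in the world.
glm::vec3 lightDirInModelSpace(const glm::vec3& worldLightDir, const glm::mat4& modelMatrix)
{
    // w = 0 so that only rotation/scale affect the direction, not translation.
    return glm::vec3(glm::inverse(modelMatrix) * glm::vec4(worldLightDir, 0.0f));
}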