Skinning with Assimp.Net and OpenTK - opengl

I'm trying to implement skeletal animation using Assimp.net and OpenTK and have been following this tutorial but I cannot get it to work.
The model appears fine with identity matrices but is terribly garbled when using the transforms I generate from Assimp.
I suspect the issue is the way I am combining the matrices, or that there is a difference in OpenTK that I am not realising. I have made similar adjustments to the tutorial code, as suggested here: Matrix calculations for gpu skinning
but the result is still garbled, just differently. I have also tried converting all Assimp matrices to OpenTK matrices before performing any multiplication. These are the areas of the code related to the matrices; I can provide more if needed:
Matrix Conversion
public static OpenTK.Matrix4 TKMatrix(Assimp.Matrix4x4 input)
{
return new OpenTK.Matrix4(input.A1, input.B1, input.C1, input.D1,
input.A2, input.B2, input.C2, input.D2,
input.A3, input.B3, input.C3, input.D3,
input.A4, input.B4, input.C4, input.D4);
}
Storing the Global Inverse
public class LoaderMesh
{
public Scene mScene;
public Mesh mMesh;
public OpenTK.Matrix4 GlobalInverseTransform { get; set; }
public LoaderMesh(Scene aiScene, Mesh aiMesh)
{
mScene = aiScene;
mMesh = aiMesh;
GlobalInverseTransform = Util.TKMatrix(mScene.RootNode.Transform);
GlobalInverseTransform.Invert();
}
Loading the bones
public void LoadBones(List<VBO.Vtx_BoneWeight.Vtx> boneData)
{
for (uint iBone = 0; iBone < mMesh.BoneCount; ++iBone)
{
uint boneIndex = 0;
String bonename = mMesh.Bones[iBone].Name;
if (!BoneMapping.ContainsKey(bonename))
{
boneIndex = (uint)NumBones;
NumBones++;
BoneInfo bi = new BoneInfo();
BoneInfos.Add(bi);
}
else
{
boneIndex = BoneMapping[bonename];
}
BoneMapping[bonename] = boneIndex;
BoneInfos[(int)boneIndex].OffsetMatrix = Util.TKMatrix(mMesh.Bones[iBone].OffsetMatrix);
for (uint iWeight = 0; iWeight < mMesh.Bones[iBone].VertexWeightCount; iWeight++)
{
uint VertexID = /*m_Entries[MeshIndex].BaseVertex*/ mMesh.Bones[iBone].VertexWeights[iWeight].VertexID;
float Weight = mMesh.Bones[iBone].VertexWeights[iWeight].Weight;
VBO.Vtx_BoneWeight.Vtx vtx = boneData[(int)VertexID];
VBO.Vtx_BoneWeight.AddWeight(ref vtx, boneIndex, Weight);
boneData[(int)VertexID] = vtx;
}
}
}
Calculating the Transforms
public void ReadNodeHierarchy(float animationTime, Node aiNode, ref OpenTK.Matrix4 parentTransform)
{
String NodeName = aiNode.Name;
Animation animation = mScene.Animations[0];
OpenTK.Matrix4 NodeTransformation = Util.TKMatrix(aiNode.Transform);
NodeAnimationChannel nodeAnim = FindNodeAnim(animation, NodeName);
OpenTK.Matrix4 localTransform = OpenTK.Matrix4.Identity;
if (nodeAnim != null)
{
// Interpolate scaling and generate scaling transformation matrix
Vector3D Scaling = new Vector3D();
CalcInterpolatedScaling(ref Scaling, animationTime, nodeAnim);
Console.WriteLine("Scaling: " + Scaling.ToString());
OpenTK.Matrix4 ScalingM = Util.TKMatrix(Matrix4x4.FromScaling(Scaling));
// Interpolate rotation and generate rotation transformation matrix
Quaternion RotationQ = new Quaternion();
CalcInterpolatedRotation(ref RotationQ, animationTime, nodeAnim);
Console.WriteLine("Rotation: " + RotationQ.ToString());
OpenTK.Matrix4 RotationM = Util.TKMatrix(RotationQ.GetMatrix());
// Interpolate translation and generate translation transformation matrix
Vector3D Translation = new Vector3D();
CalcInterpolatedPosition(ref Translation, animationTime, nodeAnim);
Console.WriteLine("Transform: " + Translation.ToString());
OpenTK.Matrix4 TranslationM = Util.TKMatrix(Matrix4x4.FromTranslation(Translation));
// Combine the above transformations
NodeTransformation = TranslationM * RotationM * ScalingM;
localTransform = TranslationM * RotationM * ScalingM;
}
OpenTK.Matrix4 GlobalTransformation = parentTransform * NodeTransformation;
OpenTK.Matrix4 parentPass = OpenTK.Matrix4.Identity;
if (BoneMapping.ContainsKey(NodeName) == true)
{
uint BoneIndex = BoneMapping[NodeName];
//BoneInfos[(int)BoneIndex].FinalTransformation = GlobalInverseTransform * BoneInfos[(int)BoneIndex].OffsetMatrix * GlobalTransformation;
BoneInfos[(int)BoneIndex].NodeTransformation = parentTransform * Util.TKMatrix(aiNode.Transform) * localTransform;
parentPass = BoneInfos[(int)BoneIndex].NodeTransformation;
BoneInfos[(int)BoneIndex].FinalTransformation = GlobalInverseTransform * BoneInfos[(int)BoneIndex].NodeTransformation * BoneInfos[(int)BoneIndex].OffsetMatrix;
}
for (uint i = 0; i < aiNode.ChildCount; i++)
{
ReadNodeHierarchy(animationTime, aiNode.Children[i], ref parentPass);
}
}
And this is the vertex shader code
#version 400
layout(location = 0)in vec4 vert;
layout(location = 1)in vec4 normal;
layout(location = 2)in vec4 texCoord;
layout(location = 3)in vec4 tanCoord;
layout(location = 4)in ivec4 boneIDs;
layout(location = 5)in vec4 boneWeights;
uniform mat4 projectionMtx;
uniform mat4 viewMtx;
uniform mat4 modelMtx;
const int MAX_BONES = 100;
uniform mat4 bones[MAX_BONES];
out vec3 positionFrg_CS;
out vec3 normalFrg_CS;
out vec3 tanCoordFrg_CS;
out vec3 bitCoordFrg_CS;
out vec4 texCoordFrg;
void main()
{
mat4 BoneTransform = bones[boneIDs[0]] * boneWeights[0];
BoneTransform += bones[boneIDs[1]] * boneWeights[1];
BoneTransform += bones[boneIDs[2]] * boneWeights[2];
BoneTransform += bones[boneIDs[3]] * boneWeights[3];
gl_Position = projectionMtx * viewMtx * modelMtx * BoneTransform * vert;
}
Is there anything I am doing wrong multiplying the matrices together?

In reply to livin_amuk: I have got this working, at least well enough for my needs. However, I fixed this 6 months ago and my memory is vague...
If I remember correctly, my main issue was the bone/vertex indices; I think I messed up the BaseVertex because I was in a rush. Here is my current working LoadBones function.
public void LoadBones(List<VBO.Vtx_BoneWeight.Vtx> boneData, SubMesh mesh)
{
for (int iBone = 0; iBone < mesh.mMesh.BoneCount; ++iBone)
{
uint boneIndex = 0;
String bonename = mesh.mMesh.Bones[iBone].Name;
if (!BoneMapping.ContainsKey(bonename))
{
boneIndex = (uint)NumBones;
NumBones++;
BoneInfo bi = new BoneInfo();
BoneInfos.Add(bi);
//Note, I have these two lines included inside the if statement, the original tut does not. Not sure if it makes a difference.
BoneMapping[bonename] = boneIndex;
BoneInfos[(int)boneIndex].OffsetMatrix = AssimpToOpenTK.TKMatrix(mesh.mMesh.Bones[iBone].OffsetMatrix);
}
else
{
boneIndex = BoneMapping[bonename];
}
for (int iWeight = 0; iWeight < mesh.mMesh.Bones[iBone].VertexWeightCount; iWeight++)
{
//My question has the mesh.BaseVertex commented out. it is important!
long VertexID = mesh.BaseVertex + mesh.mMesh.Bones[iBone].VertexWeights[iWeight].VertexID;
float Weight = mesh.mMesh.Bones[iBone].VertexWeights[iWeight].Weight;
VBO.Vtx_BoneWeight.Vtx vtx = boneData[(int)VertexID];
VBO.Vtx_BoneWeight.AddWeight(ref vtx, boneIndex, Weight);
boneData[(int)VertexID] = vtx;
}
}
}
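The BaseVertex offset matters whenever several sub-meshes share one vertex buffer: the VertexID values Assimp stores in each bone weight are local to their mesh, so the mesh's starting position in the shared buffer has to be added. A minimal C++ sketch of that bookkeeping follows. The helper names computeBaseVertices and globalVertexIndex are made up for illustration and are not part of the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: given the vertex count of each sub-mesh, return the
// index at which each sub-mesh starts inside one shared vertex buffer.
std::vector<std::size_t> computeBaseVertices(const std::vector<std::size_t>& vertexCounts)
{
    std::vector<std::size_t> base(vertexCounts.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < vertexCounts.size(); ++i)
    {
        base[i] = running;          // first vertex of sub-mesh i
        running += vertexCounts[i]; // the next sub-mesh starts right after it
    }
    return base;
}

// Hypothetical helper: map a mesh-local vertex ID to an index in the shared buffer.
std::size_t globalVertexIndex(const std::vector<std::size_t>& baseVertices,
                              const std::vector<std::size_t>& vertexCounts,
                              std::size_t mesh, std::size_t localVertexId)
{
    assert(mesh < baseVertices.size() && localVertexId < vertexCounts[mesh]);
    return baseVertices[mesh] + localVertexId;
}
```

With sub-meshes of 100, 50 and 25 vertices, local vertex 7 of the second mesh lands at index 107 of the shared buffer. Without the offset it would overwrite the weights of vertex 7 of the first mesh.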
I also had the transforms backwards. Here is the ReadNodeHierarchy function.
public void ReadNodeHierarchy(float animationTime, Node aiNode, ref OpenTK.Matrix4 parentTransform)
{
String NodeName = aiNode.Name;
Animation animation = mScene.Animations[0];
OpenTK.Matrix4 NodeTransformation = AssimpToOpenTK.TKMatrix(aiNode.Transform);
NodeAnimationChannel nodeAnim = FindNodeAnim(animation, NodeName);
if (nodeAnim != null)
{
// Interpolate scaling and generate scaling transformation matrix
Vector3D Scaling = new Vector3D();
CalcInterpolatedScaling(ref Scaling, animationTime, nodeAnim);
OpenTK.Matrix4 ScalingM = AssimpToOpenTK.TKMatrix(Matrix4x4.FromScaling(Scaling));
// Interpolate rotation and generate rotation transformation matrix
Quaternion RotationQ = new Quaternion();
CalcInterpolatedRotation(ref RotationQ, animationTime, nodeAnim);
OpenTK.Matrix4 RotationM = AssimpToOpenTK.TKMatrix(RotationQ.GetMatrix());
// Interpolate translation and generate translation transformation matrix
Vector3D Translation = new Vector3D();
CalcInterpolatedPosition(ref Translation, animationTime, nodeAnim);
OpenTK.Matrix4 TranslationM = AssimpToOpenTK.TKMatrix(Matrix4x4.FromTranslation(Translation));
// Combine the above transformations
//All that local transform stuff is gone. The order of the transforms is reversed from my question AND the original tut.
NodeTransformation = ScalingM * RotationM * TranslationM;
}
//Also reversed.
OpenTK.Matrix4 GlobalTransformation = NodeTransformation * parentTransform;
//GlobalTransformation = OpenTK.Matrix4.Identity;
if (BoneMapping.ContainsKey(NodeName) == true)
{
uint BoneIndex = BoneMapping[NodeName];
//Also, Also, reversed.
BoneInfos[(int)BoneIndex].FinalTransformation = BoneInfos[(int)BoneIndex].OffsetMatrix * GlobalTransformation * GlobalInverseTransform;
}
for (int i = 0; i < aiNode.ChildCount; i++)
{
ReadNodeHierarchy(animationTime, aiNode.Children[i], ref GlobalTransformation);
}
}
The Matrix conversion at the top is also correct, as is the Shader code.
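One way to see why every product had to be reversed: the conversion function transposes each Assimp matrix (A1, B1, C1, D1 become the first row), and the transpose of a product is the product of the transposes in reverse order, (A * B)^T = B^T * A^T. The C++ sketch below checks that identity with a small hand-written Mat4 type. The type and all function names are hypothetical and unrelated to the OpenTK or Assimp APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Mat4 = std::array<std::array<float, 4>, 4>; // indexed as m[row][col]

Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{}; // value-initialized, so every element starts at zero
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

Mat4 transpose(const Mat4& m)
{
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[j][i] = m[i][j];
    return r;
}

// Translation in the column-vector convention (offsets in the last column).
Mat4 translation(float x, float y, float z)
{
    Mat4 m{};
    for (std::size_t i = 0; i < 4; ++i)
        m[i][i] = 1.0f;
    m[0][3] = x;
    m[1][3] = y;
    m[2][3] = z;
    return m;
}

// Uniform scale by s.
Mat4 scaling(float s)
{
    Mat4 m{};
    m[0][0] = m[1][1] = m[2][2] = s;
    m[3][3] = 1.0f;
    return m;
}
```

For T = translation(1, 2, 3) and S = scaling(2), transpose(multiply(T, S)) equals multiply(transpose(S), transpose(T)) but not multiply(transpose(T), transpose(S)). That is the same flip as going from TranslationM * RotationM * ScalingM to ScalingM * RotationM * TranslationM once the matrices have been transposed.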

Related

How to generate mesh object file from a VBO mesh that only have (vertices, normal, color, tangent)?

Recently I've been reading a piece of code, and I want to generate an .obj mesh file from it. But it looks like the VBO doesn't have triangle information.
Here is the code that generates the VBO mesh:
void buildVBOMesh()
{
const vector<VertexData> &vertexData = m_graph->vertexData();
uint nrVertices = vertexData.size();
VertexBufferObjectAttribs::DATA *attrData = new VertexBufferObjectAttribs::DATA[nrVertices];
for(uint i=0; i<nrVertices; ++i)
{
VertexData d = vertexData[i];
vec3 p = d.position;
vec3 n = d.direction;
vec3 v = d.vParallel;
vec3 t = d.tangent;
float thick = d.thickness;
float lengthFromBeginning = d.lengthFromBegining;
float lengthTotal = d.lengthTotal;
attrData[i].vx = p.x;
attrData[i].vy = p.y;
attrData[i].vz = p.z;
attrData[i].vw = 1.0f;
attrData[i].nx = n.x;
attrData[i].ny = n.y;
attrData[i].nz = n.z;
attrData[i].nw = lengthFromBeginning;
attrData[i].cx = v.x;
attrData[i].cy = v.y;
attrData[i].cz = v.z;
attrData[i].cw = thick;
attrData[i].tx = t.x;
attrData[i].ty = t.y;
attrData[i].tz = t.z;
attrData[i].tw = lengthTotal;
}
delete m_vboMesh;
m_vboMesh = new VertexBufferObjectAttribs();
m_vboMesh->setData(attrData, GL_STATIC_DRAW, nrVertices, GL_LINES);
delete[] attrData;
}
If there is no index buffer then the faces will simply be:
f 1/1/1 2/2/2 3/3/3
f 4/4/4 5/5/5 6/6/6
# and so on.
However, I don't agree that you should be creating a .obj file; instead you can simply do a binary dump of the vertexData and skip the text parsing when loading.
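The sequential face pattern above can be generated mechanically. The sketch below uses hypothetical helper names and assumes that the matching v, vt and vn records have already been written in buffer order, since the i/i/i form refers to all three. Note also that the snippet in the question passes GL_LINES to setData, so its vertices may describe line segments rather than triangles; the OBJ format's "l" element covers that case, and the second function emits it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: "f" records for a non-indexed triangle list.
// OBJ indices are 1-based, and v/vt/vn all use the same index here.
std::vector<std::string> objFaceLines(std::size_t vertexCount)
{
    assert(vertexCount % 3 == 0); // a triangle list has 3 vertices per face
    std::vector<std::string> lines;
    for (std::size_t first = 1; first + 2 <= vertexCount; first += 3)
    {
        std::string line = "f";
        for (std::size_t k = 0; k < 3; ++k)
        {
            const std::string idx = std::to_string(first + k);
            line += " " + idx + "/" + idx + "/" + idx;
        }
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: "l" records for a non-indexed list of line segments,
// two consecutive vertices per segment (the GL_LINES layout).
std::vector<std::string> objLineElements(std::size_t vertexCount)
{
    assert(vertexCount % 2 == 0); // a segment list has 2 vertices per line
    std::vector<std::string> lines;
    for (std::size_t first = 1; first + 1 <= vertexCount; first += 2)
    {
        lines.push_back("l " + std::to_string(first) + " " + std::to_string(first + 1));
    }
    return lines;
}
```

For six vertices, objFaceLines returns exactly the two "f" lines shown above. If no texture coordinates or normals are exported, the plain "f 1 2 3" form would be used instead.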

Cell-Shading Outlines: edge mesh writer does not define all desired edges

The program that I am writing takes in the vertex data of a 3D mesh, performs a series of calculations (forgive the vagueness, I'll try to explain in better detail later), and outputs a binary file that defines where the edges are on the mesh. My program then draws a colored line where each edge is. Without the appropriate vertex shader, this would look like a regular triangulated mesh, but once the appropriate vertex shader is applied, only the edges that are "sharp" (the dot product of their normals is greater than something close to zero) have lines drawn on them, along with the edges on the outside of the figure.
My implementation for the outline is not correct, as I made the assumption that if an edge wasn't behind the object, and didn't define a sharp edge, it would be an outline edge. I haven't found a satisfactory answer to this elsewhere, and I didn't want to rely on the old trick of re-drawing the mesh as a solid color and rendering it slightly larger than the original mesh. This approach was to be entirely math-based, relying only on the vertex data of a mesh.
I am writing a program that uses the following vertex shader:
uniform mat4 worldMatrix;
uniform mat4 projMatrix;
uniform mat4 viewProjMatrix;
uniform vec4 eyepos;
attribute vec3 a;
attribute vec3 b;
attribute vec3 n1;
attribute vec3 n2;
attribute float w;
void main()
{
float a_vertex = dot(eyepos.xyz - a, n1);
float b_vertex = dot(eyepos.xyz - a, n2);
if (a_vertex * b_vertex > 0.0) // product is positive, so the signs are the same; edge is treated as behind the object
{
gl_Position = vec4(2.0,2.0,2.0,1.0);
}
else // the outline of the figure
{
if(w == 0.0)
{
vec4 p = vec4(a.x, a.y, a.z, 1.0);
p = p * worldMatrix * viewProjMatrix;
gl_Position = p;
}
else
{
vec4 p = vec4(b.x, b.y, b.z, 1.0);
p = p * worldMatrix * viewProjMatrix;
gl_Position = p;
}
}
if(dot(n1, n2) <= 0.2) // there is a sharp edge
{
if(w == 0.0)
{
vec4 p = vec4(a.x, a.y, a.z, 1.0);
p = p * worldMatrix * viewProjMatrix;
gl_Position = p;
}
else
{
vec4 p = vec4(b.x, b.y, b.z, 1.0);
p = p * worldMatrix * viewProjMatrix;
gl_Position = p;
}
}
}
... to take information from a binary file that is written using this program in C++:
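#include <iostream>
#include "llgl.h"
#include <fstream>
#include <map>    // std::map, used for the edge map below
#include <set>    // std::set, used as the edge map key
#include <string> // std::string
#include <vector>
#include "SuperMesh.h"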
using namespace std;
using namespace llgl;
struct Vertex
{
float x,y,z,w;
float s,t,p,q;
float nx,ny,nz,nw;
};
bool isFileAlright(string fName)
{
ifstream in(fName.c_str());
if(!in.good())
return false;
return true;
}
int main(int argc, char* argv[])
{
// INPUT FILE NAME //
string fName;
cout << "Enter the path to your spec.mesh file here: ";
cin >> fName;
while(!isFileAlright(fName))
{
cout << "Enter the path to your spec.mesh file here: ";
cin >> fName;
}
SuperMesh* Model = new SuperMesh(fName.c_str());
// END INPUT //
Model->load();
Model->draw();
string fname = Model->fname;
string FileName = fname.substr(0, fname.size() - 10); // supposed to slash the last 10 characters off of the string, removing ".spec.mesh"...
FileName = FileName + ".bin"; //... and then we make it a .bin file
cout << FileName << endl;
ofstream out(FileName.c_str(), ios::binary);
for (unsigned w = 0; w < Model->m.size(); w++)
{
vector<float> &vdata = Model->m[w]->vdata;
vector<char> &idata = Model->m[w]->idata;
//Create a vertex and index variable, a map for Edge Mesh, perform two loops to analyze all triangles on a mesh and write out their vertex values to a file.//
Vertex* V = (Vertex*)(&vdata[0]);
unsigned short* I16 = (unsigned short*)(&idata[0]);
unsigned char* I8 = (unsigned char*)(&idata[0]);
unsigned int* I32 = (unsigned int*)(&idata[0]);
map<set<int>, vector<vec3> > EM;
for(unsigned i = 0; i < Model->m[w]->ic; i += 3) // 3 because we're looking at triangles //
{
Mesh* foo = Model->m[w];
int i1;
int i2;
int i3;
if( Model->m[w]->ise == GL_UNSIGNED_BYTE)
{
i1 = I8[i];
i2 = I8[i + 1];
i3 = I8[i + 2];
}
else if( Model->m[w]->ise == GL_UNSIGNED_SHORT)
{
i1 = I16[i];
i2 = I16[i + 1];
i3 = I16[i + 2];
}
else
{
i1 = I32[i];
i2 = I32[i + 1];
i3 = I32[i + 2];
}
vec3 p = vec3(V[i1].x, V[i1].y, V[i1].z); // to represent the point in 3D space of each vertex on every triangle on the mesh
vec3 q = vec3(V[i2].x, V[i2].y, V[i2].z);
vec3 r = vec3(V[i3].x, V[i3].y, V[i3].z);
vec3 v1 = p - q;
vec3 v2 = r - q;
vec3 n = cross(v2,v1); //important to make sure the order is correct here, do VECTOR TWO cross VECTOR ONE//
set<int> tmp;
tmp.insert(i1); tmp.insert(i2);
EM[tmp].push_back(n);
set<int> tmp2;
tmp2.insert(i2); tmp2.insert(i3);
EM[tmp2].push_back(n);
set<int> tmp3;
tmp3.insert(i3); tmp3.insert(i1);
EM[tmp3].push_back(n);
//we have now pushed every needed point into our edge map
}
int edgeNumber = 0;
cout << "There should be 12 edges on a lousy cube." << endl;
for(map<set<int>, vector<vec3> >::iterator it = EM.begin(); it != EM.end(); ++it)
{
//Now we will take our edge map and write its data to the file!//
/* Information is written to the file in this form:
Vertex One, Vertex Two, Normal One, Normal Two, r (where r, depending on its value, determines whether one edge is on top of the other in the case
where two edges are aligned with one another)
*/
set<int>::iterator tmp = it->first.begin();
int pi = *tmp;
tmp++;
int qi = *tmp;
Vertex One = V[pi];
Vertex Two = V[qi];
vec3 norm1 = it->second[0];
vec3 norm2;
if(it->second.size() == 1)
norm2 = -1 * norm1;
else
norm2 = it->second[1];
out.write((char*) &One, 12);
out.write((char*) &Two, 12);
out.write((char*) &norm1, 12);
out.write((char*) &norm2, 12);
float r = 0;
out.write((char*) &r, 4);
out.write((char*) &One, 12);
out.write((char*) &Two, 12);
out.write((char*) &norm1, 12);
out.write((char*) &norm2, 12);
r = 1;
out.write((char*) &r, 4);
edgeNumber++;
cout << "Wrote edge #" << edgeNumber << endl;
}
}
return 0;
}
The problem that this program has is that it does neither of these two essential things in the test case where I use it to draw a simple box with outlines:
It does not draw outlines. The vertex shader is not sufficient to determine anything more than where the edges of the object are. The binary file that makes this happen is pre-computed in a separate program using code from the second snippet posted above, and then it is saved as a .bin file along with the mesh assets to which it belongs. However, raw vertex data would only take me so far, and I seek a way to draw a line around the outside of the mesh without using more traditional methods.
It does not draw ALL of the edges that I need. In my test case, two of the edges are missing, and I cannot figure out for the life of me why. I figure I must have done something wrong in writing the edge map.
A couple notes about the above code:
llgl is an OpenGL wrapper that I have used to simplify many elements of OpenGL. It is not used extensively here, but rather in the creation of meshes, done elsewhere.
Things like Mesh and SuperMesh (a collection of meshes into one rigid body) are meant to be 3D objects in my scene. In my test case, there is only one Mesh in my scene, and defining a SuperMesh of a single Mesh is essentially just creating a single Mesh.
The "draw" call in the second snippet, which pre-computes a Mesh's edge map, does not actually draw anything. It is necessary to gain access to the Mesh's vertex data.
The variable "ise" is taken from the individual Meshes in the SuperMesh, and is a variable found by reading it in from the original Blender .OBJ file. It is related to how much memory should be used to store the important vertex data. It generally isn't a good idea to allocate more space than is needed for these values, as I've been told by friends and mentors who work with Blender.
It isn't well-commented, as I'm not the only one who has worked on this code, and I, unfortunately, have a limited understanding of how the second snippet could iterate through all of the triangles on a mesh and somehow miss the last two edges. Once I understand better what this code should do when properly written, I plan on heavily commenting it and using it in future applications.
The order of multiplication between a matrix and a vector is not commutative, so your vertex shader has to output Projection * Model * Vertex and not the opposite.
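Whether p * M or M * p is the right form depends on how the matrix was laid out when it was uploaded. In GLSL, putting the vector on the left treats it as a row vector, which gives the same result as multiplying the transposed matrix with the vector on the right. A small C++ sketch of that relationship follows; the Vec4 and Mat4 types and all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>; // indexed as m[row][col]

Mat4 identity()
{
    Mat4 m{}; // value-initialized to zeros
    for (std::size_t i = 0; i < 4; ++i)
        m[i][i] = 1.0f;
    return m;
}

Mat4 transpose(const Mat4& m)
{
    Mat4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            r[j][i] = m[i][j];
    return r;
}

// M * v: v is a column vector.
Vec4 mulMatVec(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

// v * M: v is a row vector.
Vec4 mulVecMat(const Vec4& v, const Mat4& m)
{
    Vec4 r{};
    for (std::size_t j = 0; j < 4; ++j)
        for (std::size_t k = 0; k < 4; ++k)
            r[j] += v[k] * m[k][j];
    return r;
}
```

For any M and v, mulVecMat(v, transpose(M)) equals mulMatVec(M, v), while mulVecMat(v, M) generally does not. A shader written as p * worldMatrix can therefore still be correct if the application uploads the matrices in transposed layout.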
I solved the mystery of the undrawn lines by allocating more space to write vertex data in a different part of my code. As for my other problems, although the order of multiplication being done in my vertex shader was actually alright, I had messed up another fundamental concept of vector math. The dot product of two face normals will be a negative number when the normals make an obtuse angle... the way a sharp point on my model would. Also, there is the faulty logic above that basically says that if the face is visible, draw all of the lines on it. I re-wrote my shader to test first if a face was visible, and then in that same conditional block I did the test for sharp edges. Now, if a face is visible BUT it doesn't create a sharp edge, the shader will ignore that edge. Also, outlines appear now, just not perfectly. Here is a modified version of the above vertex shader:
uniform mat4 worldMatrix; /* the matrix that defines how to project a point from
object space to world space.*/
uniform mat4 viewProjMatrix; // the view (pertaining to screen size) matrix times the projection (how to project points to 3D) matrix.
uniform vec4 eyepos; // the position of the eye, given by the program.
attribute vec3 a; // one vertex on an edge, having x, y, and z coordinates.
attribute vec3 b; // the other edge vertex.
attribute vec3 n1; // the normal of the face the edge is on.
attribute vec3 n2; // another normal in the case that an edge shares two faces... otherwise, this is the same as n1.
attribute float w; // an attribute given to make a binary choice between two edges when they draw on top of one another.
void main()
{
// WORLD SPACE ATTRIBUTES //
vec4 eye_world = eyepos * worldMatrix;
vec4 a_world = vec4(a.x, a.y,a.z,1.0) * worldMatrix;
vec4 b_world = vec4(b.x, b.y,b.z,1.0) * worldMatrix;
vec4 n1_world = normalize(vec4(n1.x, n1.y,n1.z,0.0) * worldMatrix);
vec4 n2_world = normalize(vec4(n2.x, n2.y,n2.z,0.0) * worldMatrix);
// END WORLD SPACE ATTRIBUTES //
// TEST CASE ATTRIBUTES //
float a_vertex = dot(eye_world - a_world, n1_world);
float b_vertex = dot(eye_world - b_world, n2_world);
float normalDot = dot(n1_world.xyz, n2_world.xyz);
float vertProduct = a_vertex * b_vertex;
float hardness = 0.0; // this would be the value for an object made of sharp angles, like a box. Take a look at its use below.
// END TEST CASE ATTRIBUTES //
gl_Position = vec4(2.0,2.0,2.0,1.0); // if all else fails, keeping this here will discard unwanted data.
if (vertProduct >= 0.1) // NOTE: face is behind the viewable portion of the object, normally uses 0.0 when not checking for silhouette
{
gl_Position = vec4(2.0,2.0,2.0,1.0);
}
else if(vertProduct < 0.1 && vertProduct >= -0.1) // NOTE: face makes almost a right angle with the eye vector
{
if(w == 0.0)
{
vec4 p = vec4(a_world.x, a_world.y, a_world.z, 1.0);
p = p * viewProjMatrix;
gl_Position = p;
}
else
{
vec4 p = vec4(b_world.x, b_world.y, b_world.z, 1.0);
p = p * viewProjMatrix;
gl_Position = p;
}
}
else // NOTE: this is the case where you can very clearly see a face.
{ // NOTE: the number that normalDot compares to should be its "hardness" value. The more negative the value, the smoother the surface.
// a.k.a. the less we care about hard edges (when the normals of the faces make an obtuse angle) on the object, the more negative
// hardness becomes on a scale of 0.0 to -1.0.
if(normalDot <= hardness) // NOTE: the dot product of the two normals is obtuse, so we are looking at a sharp edge.
{
if(w == 0.0)
{
vec4 p = vec4(a_world.x, a_world.y, a_world.z, 1.0);
p = p * viewProjMatrix;
gl_Position = p;
}
else
{
vec4 p = vec4(b_world.x, b_world.y, b_world.z, 1.0);
p = p * viewProjMatrix;
gl_Position = p;
}
}
else // NOTE: not sharp enough, just throw the vertex away
{
gl_Position = vec4(2.0,2.0,2.0,1.0);
}
}
}
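The sharp-edge test in the shader above relies on the dot product of two unit face normals being the cosine of the angle between them: positive for nearly coplanar faces, zero at a right angle, and negative once the normals point more than 90 degrees apart. Below is a CPU-side C++ sketch of the same test, using a hypothetical Vec3 type and isSharpEdge function. It normalizes first, because cross-product normals like the ones written by the pre-computation program are not unit length.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

Vec3 normalize(const Vec3& v)
{
    const float len = std::sqrt(dot(v, v));
    assert(len > 0.0f); // a degenerate triangle would give a zero normal
    return {v.x / len, v.y / len, v.z / len};
}

// Hypothetical helper: an edge counts as sharp when the cosine of the angle
// between its two face normals is at or below `hardness` (0.0 for a box,
// more negative values to accept only sharper creases).
bool isSharpEdge(const Vec3& n1, const Vec3& n2, float hardness)
{
    return dot(normalize(n1), normalize(n2)) <= hardness;
}
```

Two adjacent cube faces have perpendicular normals, so their dot product is 0 and the edge passes with hardness 0.0. Two triangles of the same flat face give a dot product of 1, and that edge is rejected.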

Weird performance drop, caused by a single for loop

I'm currently writing an OpenGL 3.1 (with GLSL version 330) application on Linux (NVIDIA 360M card, with the 313.0 nv driver) that has about 15k lines. My problem is that in one of my vertex shaders, I experience drastic performance drops when making minimal changes to the code that should actually be no-ops.
For example:
// With this solution my program runs with 3-5 fps
for(int i = 0; i < 4; ++i) {
vout.shadowCoord[i] = uShadowCP[i] * w_pos;
}
// But with this it runs with 30+ fps
vout.shadowCoord[0] = uShadowCP[0] * w_pos;
vout.shadowCoord[1] = uShadowCP[1] * w_pos;
vout.shadowCoord[2] = uShadowCP[2] * w_pos;
vout.shadowCoord[3] = uShadowCP[3] * w_pos;
// This works with 30+ fps too
vec4 shadowCoords[4];
for(int i = 0; i < 4; ++i) {
shadowCoords[i] = uShadowCP[i] * w_pos;
}
for(int i = 0; i < 4; ++i) {
vout.shadowCoord[i] = shadowCoords[i];
}
Or consider this:
uniform int uNumUsedShadowMaps = 4; // edit: I called this "random_uniform" in the original question
// 8 fps
for(int i = 0; i < min(uNumUsedShadowMaps, 4); ++i) {
vout.shadowCoord[i] = vec4(1.0);
}
// 30+ fps
for(int i = 0; i < 4; ++i) {
if(i < uNumUsedShadowMaps) {
vout.shadowCoord[i] = vec4(1.0);
} else {
vout.shadowCoord[i] = vec4(0.0);
}
}
See the entire shader code here, where this problem appeared:
http://pastebin.com/LK5CNJPD
Any idea about what could cause this would be appreciated.
I finally managed to find the source of the problem, and also found a solution to it.
But before jumping right to the solution, let me paste the most minimal shader code with which I could reproduce this 'bug'.
Vertex Shader:
#version 330
vec3 CountPosition(); // Irrelevant how it is implemented.
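uniform mat4 uProjectionMatrix, uCameraMatrix;
uniform mat4 uShadowCP[4]; // declaration missing from the original snippet; type and size assumed from its use below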
out VertexData {
vec3 c_pos, w_pos;
vec4 shadowCoord[4];
} vout;
void main() {
vout.w_pos = CountPosition();
vout.c_pos = (uCameraMatrix * vec4(vout.w_pos, 1.0)).xyz;
vec4 w_pos = vec4(vout.w_pos, 1.0);
// 20 fps
for(int i = 0; i < 4; ++i) {
vout.shadowCoord[i] = uShadowCP[i] * w_pos;
}
// 50 fps
vout.shadowCoord[0] = uShadowCP[0] * w_pos;
vout.shadowCoord[1] = uShadowCP[1] * w_pos;
vout.shadowCoord[2] = uShadowCP[2] * w_pos;
vout.shadowCoord[3] = uShadowCP[3] * w_pos;
gl_Position = uProjectionMatrix * vec4(vout.c_pos, 1.0);
}
Fragment Shader:
#version 330
in VertexData {
vec3 c_pos, w_pos;
vec4 shadowCoord[4];
} vin;
out vec4 frag_color;
void main() {
frag_color = vec4(1.0);
}
The funny thing is that only a minimal modification of the vertex shader is needed to make both solutions run at 50 fps. The main function should be modified like this:
void main() {
vec4 w_pos = vec4(CountPosition(), 1.0);
vec4 c_pos = uCameraMatrix * w_pos;
vout.w_pos = vec3(w_pos);
vout.c_pos = vec3(c_pos);
// 50 fps
for(int i = 0; i < 4; ++i) {
vout.shadowCoord[i] = uShadowCP[i] * w_pos;
}
// 50 fps
vout.shadowCoord[0] = uShadowCP[0] * w_pos;
vout.shadowCoord[1] = uShadowCP[1] * w_pos;
vout.shadowCoord[2] = uShadowCP[2] * w_pos;
vout.shadowCoord[3] = uShadowCP[3] * w_pos;
gl_Position = uProjectionMatrix * c_pos;
}
The difference is that the upper code reads from the shader's out varyings, while the bottom one saves those values in temporary variables and only writes to the out varyings.
The conclusion:
Reading a shader's out varying is often used as an optimisation to get away with one less temporary variable, or at least I have seen it in many places on the internet. Despite that, reading an out varying might actually be an invalid OpenGL operation, and might get the GL into an undefined state in which random changes in the code can trigger bad things.
The best thing about this is that the GLSL 330 specification doesn't say anything about reading from an out varying that was previously written to. Probably because it's not something I should be doing.
P.S.
Also note that the second example in the original question might look totally different, but it behaves exactly the same in this small code snippet: if the out varyings are read, it gets quite slow with i < min(uNumUsedShadowMaps, 4) as the condition in the for loop; however, if the out varyings are only written, the condition makes no difference to the performance, and the i < min(uNumUsedShadowMaps, 4) version runs at 50 fps too.

DirectX 11 Compute Shader 5 loop

I have the following compute shader code for computing depth of field. However, very unusually, the loop executes just once, even if g_rayCount is 10. Please have a look at the main function RaycasterCS, where the for loop lies.
//--------------------------------------------------------------------------------------
// Compute Shader
//-------------------------------------------------------------------------------------
SamplerState SSLinear
{
Filter = Min_Mag_Linear_Mip_Point;
AddressU = Border;
AddressV = Border;
AddressW = Border;
};
float3 CalculateDoF(uint seedIndex, uint2 fragPos)
{
;
}
[numthreads(RAYCASTER_THREAD_BLOCK_SIZE, RAYCASTER_THREAD_BLOCK_SIZE, 1)]
void RaycasterCS(in uint3 threadID: SV_GroupThreadID, in uint3 groupID: SV_GroupID, in uint3 dispatchThreadID :SV_DispatchThreadID)
{
uint2 fragPos = groupID.xy * RAYCASTER_THREAD_BLOCK_SIZE + threadID.xy;
float4 dstColor = g_texFinal[fragPos];
uint seedIndex = dispatchThreadID.x * dispatchThreadID.y;
float3 final = float3(0, 0, 0);
float color = 0;
[loop][allow_uav_condition]
for (int i = 0; i < g_rayCount; ++i);
{
float3 dof = CalculateDoF(seedIndex, fragPos);
final += dof;
}
final *= 1.0f / ((float) g_rayCount);
g_texFinalRW[fragPos] = float4(final, 1);
}
//--------------------------------------------------------------------------------------
technique10 Raycaster
{
pass RaycastDefault
{
SetVertexShader(NULL);
SetGeometryShader(NULL);
SetPixelShader(NULL);
SetComputeShader(CompileShader(cs_5_0, RaycasterCS()));
}
}
Remove the semicolon at the end of the for statement:
for (int i = 0; i < g_rayCount; ++i) // removed semicolon
{
float3 dof = CalculateDoF(seedIndex, fragPos);
final += dof;
}
As I guess you know, the semicolon made the for statement an empty loop, and the code in braces was then executed just once.
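The same rule applies in any C-style language, so the effect is easy to reproduce outside a shader. The C++ sketch below (function names are made up for illustration) counts how often the braced block runs with and without the stray semicolon.

```cpp
#include <cassert>

// Buggy shape: the semicolon is the entire loop body, so the braces that
// follow are an ordinary block that runs exactly once.
int bodyRunsWithStraySemicolon(int n)
{
    assert(n >= 0);
    int bodyRuns = 0;
    for (int i = 0; i < n; ++i);   // empty loop: iterates n times doing nothing
    {
        ++bodyRuns;                // executed once, regardless of n
    }
    return bodyRuns;
}

// Fixed shape: the braces are the loop body.
int bodyRunsFixed(int n)
{
    assert(n >= 0);
    int bodyRuns = 0;
    for (int i = 0; i < n; ++i)
    {
        ++bodyRuns;                // executed n times
    }
    return bodyRuns;
}
```

With n = 10 the first function returns 1 and the second returns 10. In the shader this means final held a single CalculateDoF sample that was then divided by g_rayCount. Some C++ compilers may warn about the empty loop body in the first function, which is a cheap way to catch this kind of mistake.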

GLSL linker error(Sampler needs to be a uniform (global or parameter to main))

We have a GLSL fragment shader, and the problem is in this code:
vec4 TFSelection(StrVolumeColorMap volumeColorMap , vec4 textureCoordinate)
{
vec4 finalColor = vec4(0.0);
if(volumeColorMap.TransferFunctions[0].numberOfBits == 0)
{
return texture(volumeColorMap.TransferFunctions[0].TransferFunctionID,textureCoordinate.x);
}
if(textureCoordinate.x == 0)
return finalColor;
float deNormalize = textureCoordinate.x *65535/*255*/;
for(int i = 0; i < volumeColorMap.TransferFunctions.length(); i++)
{
int NormFactor = volumeColorMap.TransferFunctions[i].startBit + volumeColorMap.TransferFunctions[i].numberOfBits;
float minval = CalculatePower(2, volumeColorMap.TransferFunctions[i].startBit);
if(deNormalize >= minval)
{
float maxval = CalculatePower(2, NormFactor);
if(deNormalize <maxval)
{
//float tempPower = CalculatePower(2 , NormFactor);
float coord = deNormalize /maxval/*tempPower*/;
return texture(volumeColorMap.TransferFunctions[i].TransferFunctionID,coord);
}
}
}
return finalColor;
}
When we compile and link the shader, this message is logged:
Sampler needs to be a uniform (global or parameter to main), need to
inline function or resolve conditional expression
With a simple change the shader links successfully, for example changing
float coord = deNormalize / maxval;
to
float coord = deNormalize;
Driver: NVIDIA 320.49