I am relatively familiar with instanced drawing and per instance data: I've implemented this in the past with success.
Now I am refactoring some old code, and I introduced a bug in how per-instance data is supplied to shaders.
The relevant bits are the following:
I have a working render loop implemented using glMultiDrawElementsIndirect: if I ignore the per instance data everything draws as expected.
I have a VBO storing the world transforms of my objects. I used AMD's CodeXL to debug this: the buffer is correctly populated with data, and is bound when drawing a frame.
glBindBuffer(GL_ARRAY_BUFFER,batch.mTransformBuffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(glm::mat4) * OBJ_NUM, &xforms, GL_DYNAMIC_DRAW);
The shader specifies the input location explicitly:
#version 450
layout(location = 0) in vec3 vertexPos;
layout(location = 1) in vec4 vertexCol;
//...
layout(location = 6)uniform mat4 ViewProj;
layout(location = 10)uniform mat4 Model;
The ViewProj matrix is equal for all instances and is set correctly using:
glUniformMatrix4fv(6, 1, GL_FALSE, &viewProjMat[0][0]);
Model is the per-instance world matrix, and it is wrong: it contains all zeros.
After binding the buffer and before drawing each frame, I am trying to setup the attribute pointers and divisors in such a way that every drawn instance will receive a different transform:
for (size_t i = 0; i < 4; ++i)
{
glEnableVertexAttribArray(10 + i);
glVertexAttribPointer(10 + i, 4, GL_FLOAT, GL_FALSE,
sizeof(GLfloat) * 16,
(const GLvoid*) (sizeof(GLfloat) * 4 * i));
glVertexAttribDivisor(10 + i, 1);
}
Now, I've looked at the code for a while and I really can't figure out what I am missing. CodeXL clearly shows that Model (location 10) isn't correctly filled. No OpenGL error is generated.
My question is: does anyone know under which circumstances the setup of per-instance data may fail silently? Or any suggestion on how to debug this issue further?
layout(location = 6)uniform mat4 ViewProj;
layout(location = 10)uniform mat4 Model;
These are uniforms, not input values. They don't get fed by attributes; they get fed by glUniform* calls. If you want Model to be an input value, then qualify it with in, not uniform.
Equally importantly, inputs and uniforms do not get the same locations. What I mean is that uniform locations have a different space from input locations. An input can have the same location index as a uniform, and they won't refer to the same thing. Input locations only refer to attribute indices; uniform locations refer to uniform locations.
Lastly, uniform locations don't work like input locations. With attributes, each vec4-equivalent uses a separate attribute index. With uniform locations, every basic type (anything that isn't a struct or an array) uses a single uniform location. So if ViewProj is a uniform, then it only takes up 1 location. But if Model is an input, then it takes up 4 attribute indices.
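As a rough illustration of that last point, the sketch below computes the four (index, stride, offset) triples that a mat4 input would need, i.e. the values the loop in the question passes to glVertexAttribPointer. It makes no OpenGL calls. The names Column and mat4AttributeLayout are invented for this example, and it assumes tightly packed 4x4 float matrices and the spec-guaranteed minimum of 16 vertex attributes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the parameters for one vec4 column of a mat4 input.
struct Column {
    unsigned index;      // attribute index (input location)
    std::size_t stride;  // bytes between consecutive matrices in the buffer
    std::size_t offset;  // byte offset of this column inside one matrix
};

// A mat4 input declared at `baseLocation` occupies four consecutive attribute
// indices, one per vec4 column. Assumes tightly packed float matrices.
// `maxAttribs` defaults to 16, the minimum value of GL_MAX_VERTEX_ATTRIBS.
std::array<Column, 4> mat4AttributeLayout(unsigned baseLocation,
                                          unsigned maxAttribs = 16) {
    assert(baseLocation + 4 <= maxAttribs);
    const std::size_t columnBytes = sizeof(float) * 4;
    std::array<Column, 4> columns{};
    for (unsigned i = 0; i < 4; ++i) {
        columns[i] = Column{baseLocation + i, columnBytes * 4, columnBytes * i};
    }
    return columns;
}
```

With a base location of 10 this yields indices 10 through 13, which matches the loop in the question; that setup only takes effect once Model is declared with in rather than uniform.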
I'm having a little problem with glDrawArraysInstanced().
Right now I'm trying to draw a chess board with pieces.
I have all the models loaded in properly.
I've tried drawing only the pawns with instanced drawing and it worked. I would send an array of vec3 translations to the shader through a uniform and index into the array with gl_InstanceID.
That would be done with this for loop (individual draw call for each model):
for (auto& i : this->models) {
i->draw(this->shaders[0], count);
}
which eventually leads to:
glDrawArraysInstanced(GL_TRIANGLES, 0, vertices.size(), count);
where the vertex shader is:
#version 460
layout(location = 0) in vec3 vertex_pos;
layout(location = 1) in vec2 vertex_texcoord;
layout(location = 2) in vec3 vertex_normal;
out vec3 vs_pos;
out vec2 vs_texcoord;
out vec3 vs_normal;
flat out int InstanceID;
uniform mat4 modelMatrix;
uniform mat4 viewMatrix;
uniform mat4 projectionMatrix;
uniform vec3 offsets[16];
void main(void){
vec3 offset = offsets[gl_InstanceID]; //saving transformation in the offset
InstanceID = gl_InstanceID; //unimportant
vs_pos = vec4(modelMatrix * vec4(vertex_pos + offset, 1.f)).xyz; //using the offset
vs_texcoord = vec2(vertex_texcoord.x,1.f-vertex_texcoord.y);
vs_normal = mat3(transpose(inverse(modelMatrix))) * vertex_normal;
gl_Position = projectionMatrix * viewMatrix * modelMatrix * vec4(vertex_pos + offset,1.f); //using the offset
}
Now my problem is that I don't know how to draw multiple objects this way and change their transformations, since gl_InstanceID starts from 0 on each draw call and thus my array of transformations would be read again from the beginning (which would just draw the next pieces at the pawns' positions).
Any help will be appreciated.
You've got two problems. Or rather, you have one problem, but the natural solution will create a second problem for you.
The natural solution to your problem is to use one of the base-instance rendering functions, like glDrawElementsInstancedBaseInstance. These allow you to specify a starting instance for your instanced rendering calls.
This will precipitate a second problem: gl_InstanceID does not respect the base instance. It will always be in the range [0, instancecount). Only instance arrays respect the base instance. So instead of using a uniform to provide your per-instance data, you must use instance array rendering. This means storing the per-instance data in a buffer object (which you should have done anyway) and accessing it via a VS input whose VAO specifies that the particular attribute is instanced.
This also has the advantage of not restricting your instance count to uniform limitations.
OpenGL 4.6/ARB_shader_draw_parameters allows access to the gl_BaseInstance vertex shader input, which provides the baseinstance value specified by the draw command. So if you don't want to/can't use instanced arrays (for example, the amount of per-instance data is too big for the attribute limitations), you will have to rely on that extension/4.6 functionality. Recent desktop GL drivers offer this functionality, so if your hardware is decently new, you should be able to use it.
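The difference between gl_InstanceID and an instanced array fetch can be modelled on the CPU. The sketch below is only a simulation of that rule, not OpenGL code; InstanceView and simulateInstancedDraw are names made up for illustration.

```cpp
#include <cassert>
#include <vector>

// What the vertex shader sees for one instance of an instanced draw.
struct InstanceView {
    int instanceID;         // value of gl_InstanceID
    unsigned arrayElement;  // element fetched from an instanced attribute array
};

// Models a draw with `instanceCount` instances, a given `baseInstance`,
// and an attribute divisor of `divisor` (must be non-zero).
std::vector<InstanceView> simulateInstancedDraw(unsigned instanceCount,
                                                unsigned baseInstance,
                                                unsigned divisor) {
    assert(divisor != 0 && "a divisor of 0 means a per-vertex attribute");
    std::vector<InstanceView> views;
    views.reserve(instanceCount);
    for (unsigned i = 0; i < instanceCount; ++i) {
        // gl_InstanceID ignores baseInstance; the instanced array fetch does not.
        views.push_back({static_cast<int>(i), baseInstance + i / divisor});
    }
    return views;
}
```

For the chess board, one might draw the pawns with a base instance of 0 and the next piece type with a base instance of 16: gl_InstanceID restarts at 0 in both draws, while the instanced attribute continues from element 16.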
I am trying to pass some integer values to the Vertex Shader along with the vertex data.
I generate a buffer while the vertex array is bound and then try to attach it to a location, but in the vertex shader the value always seems to be 0.
Here is the part of the code that generates the buffer, and its usage in the shader.
glm::vec3 materialStuff = glm::vec3(31, 32, 33);
glGenBuffers(1, &materialBufferIndex);
glBindBuffer(GL_ARRAY_BUFFER, materialBufferIndex);
glBufferData(GL_ARRAY_BUFFER, sizeof(glm::vec3), &materialStuff, GL_STATIC_DRAW);
glEnableVertexAttribArray(9);
glVertexAttribIPointer(9, 3, GL_INT, sizeof(glm::vec3), (void*)0);
And here is the part of the shader that is supposed to receive the integer values:
// Some other locations
layout (location = 0) in vec3 vertex_position;
layout (location = 1) in vec2 vertex_texcoord;
layout (location = 2) in vec3 vertex_normal;
layout (location = 3) in vec3 vertex_tangent;
layout (location = 4) in vec3 vertex_bitangent;
layout (location = 5) in mat4 vertex_modelMatrix;
// layout (location = 6) in_use...
// layout (location = 7) in_use...
// layout (location = 8) in_use...
// The location I am attaching my integer buffer to
layout (location = 9) in ivec3 vertex_material;
// I also tried with these variations
//layout (location = 9) in int vertex_material[3];
//layout (location = 9) in int[3] vertex_material;
// and then in vertex shader I try to retrieve the int value by doing something like this
diffuseTextureInd = vertex_material[0];
That diffuseTextureInd should go to fragment shader through
out flat int diffuseTextureInd;
And I am planning to use this to index into an array of bindless textures that I already have set up and working. The issue is that it seems like vertex_material just contains 0s since my fragment shader always displays the 0th texture in the array.
Note: I know that my fragment shader is fine since if I do
diffuseTextureInd = 31;
in the vertex shader, the fragment shader correctly receives the correct index and displays the correct texture. But when I try to use the value from the layout location 9, it seems like I always get a 0. Any idea what I am doing wrong here?
The following definitions:
glm::vec3 materialStuff = glm::vec3(31, 32, 33);
glVertexAttribIPointer(9, 3, GL_INT, sizeof(glm::vec3), (void*)0);
...
layout (location = 9) in ivec3 vertex_material;
practically mean that:
glm::vec3 declares a vector of 3 floats rather than integers. glm::ivec3 should be used for a vector of integers.
An ivec3 vertex attribute means that a vector of 3 integer values is expected for each vertex. At the same time, materialStuff defines values for only a single vertex (which makes no sense for a triangle, which would require at least 3 glm::ivec3 values).
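To see why the float/integer mismatch matters, the sketch below mimics an integer attribute fetch on the CPU: it reads 4 bytes from a buffer and interprets them as a signed 32-bit integer. fetchAsInt is a name invented for this example, and it assumes IEEE 754 single-precision floats, where 31.0f has the bit pattern 0x41F80000.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::int32_t), "expects 32-bit floats");

// Reads 4 bytes at `byteOffset` and interprets them as a signed 32-bit integer,
// roughly what a GL_INT attribute fetch does with whatever bytes the buffer holds.
std::int32_t fetchAsInt(const void* buffer, std::size_t bufferSize,
                        std::size_t byteOffset) {
    assert(byteOffset + sizeof(std::int32_t) <= bufferSize);
    std::int32_t value = 0;
    std::memcpy(&value,
                static_cast<const unsigned char*>(buffer) + byteOffset,
                sizeof(value));
    return value;
}
```

Fetching from a buffer filled with the floats 31, 32, 33 returns the bit pattern of 31.0f (1106771968) rather than 31, whereas the same fetch from a buffer of integers returns 31 as expected.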
What is supposed to be declared for passing a single integer vertex attribute:
layout (location = 9) in int vertex_material;
(without any array qualifier)
GLint materialStuff[3] = { 31, 32, 33 };
glVertexAttribIPointer(9, 1, GL_INT, sizeof(GLint), (void*)0); // stride: one GLint per vertex
It should be noted, though, that passing a different per-vertex integer to the fragment shader makes no sense, which I suppose you solved with the flat keyword. The existing pipeline defines only per-vertex inputs, not per-triangle or anything like that. There is glVertexAttribDivisor(), which defines the vertex attribute rate, but it applies only to rendering instances via glDrawArraysInstanced()/glDrawElementsInstanced() (a specific vertex attribute may be advanced per instance), not per triangle.
There are ways to handle per-triangle inputs - this could be done by defining Uniform Buffer Object or Texture Buffer Object (same as 1D texture but for accessing by index without interpolation) instead of generic Vertex Attribute. But tricks will be still necessary to determine the triangle index in this array - again, from vertex attribute or from built-in variables like gl_VertexID in Vertex Shader, gl_PrimitiveIDIn in Geometry Shader or gl_PrimitiveID in Fragment Shader (I cannot say, though, how these counters are affected by culling).
I am trying to render 2 textures in OpenGL 3.
I created two arrays of vertices of GLfloat type, generated and bound the buffers, etc.
Note: The texture loading function is working fine; I have already loaded a texture before, and now I just need 2 textures rendered at the same time.
Then I load my textures like this:
GLuint grass = texture.loadTexture("grass.bmp");
GLuint grassLoc = glGetUniformLocation(programID, "grassSampler");
glUniform1i(grassLoc, 0);
GLuint crate = texture.loadTexture("crate.bmp");
GLuint crateLoc = glGetUniformLocation(programID, "crateSampler");
glUniform1i(crateLoc, 1);
This is how I draw them:
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, grass);
glDrawArrays(GL_TRIANGLES, 0, 6);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, crate);
glDrawArrays(GL_TRIANGLES, 2, 6);
Vertex shader:
#version 330 core
layout(location = 0) in vec3 grassPosition;
layout(location = 1) in vec2 grassUvPosition;
layout(location = 2) in vec3 cratePosition;
layout(location = 3) in vec2 crateUvPosition;
out vec2 grassUV;
out vec2 crateUV;
uniform mat4 MVP;
void main(){
gl_Position = MVP * vec4(grassPosition,1);
gl_Position = MVP * vec4(cratePosition,1);
grassUV = grassUvPosition;
crateUV = crateUvPosition;
}
Fragment shader:
#version 330 core
in vec2 grassUV;
in vec2 crateUV;
out vec3 grassColor;
out vec3 crateColor;
uniform sampler2D grassSampler;
uniform sampler2D crateSampler;
void main(){
crateColor = texture(grassSampler, grassUV).rgb;
grassColor = texture(crateSampler, crateUV).rgb;
}
Can anyone see what I am doing wrong?
EDIT:
I am trying to render 2 different textures on 2 different VAOs
You're kinda doing everything wrong; it's hard to pick out one thing.
Your shaders look like they're trying to take two positions and two texture coordinates, presumably generate two triangles, then sample from two textures and write colors to two different images.
That's not how it works. Unless you use a geometry shader (and please do not take that as an endorsement), your call to glDrawArrays(GL_TRIANGLES, 0, 6); will render exactly 2 triangles, no matter what your VS or FS's say.
A vertex has only one position. Writing to gl_Position twice will simply overwrite the previous value, just like writing to any variable twice in C++ would. And the number of triangles to be rendered is defined by the number of vertices. A vertex shader cannot create vertices. It can't even destroy them (though, through gl_CullDistance, it can potentially cull whole primitives).
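The arithmetic behind "the number of triangles is defined by the number of vertices" is simple enough to write down. The two helpers below are hypothetical names for this sketch and make no OpenGL calls; they only restate how glDrawArrays(GL_TRIANGLES, first, count) consumes vertices.

```cpp
#include <cassert>

// Triangles produced by glDrawArrays(GL_TRIANGLES, first, count): every three
// consecutive vertices form one triangle, and leftover vertices are ignored.
int trianglesDrawn(int count) {
    assert(count >= 0);
    return count / 3;
}

// Index of the last vertex read by glDrawArrays(mode, first, count).
int lastVertexRead(int first, int count) {
    assert(first >= 0 && count > 0);
    return first + count - 1;
}
```

So a count of 6 always gives 2 triangles. Note also that the second call in the question, glDrawArrays(GL_TRIANGLES, 2, 6), reads vertices 2 through 7, which may not be what was intended.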
It is not clear what you mean by "I just need 2 textures rendered at the same time." Or more to the point, what "at the same time" refers to. I don't know what your code ought to be trying to do.
Given the data your vertex shader expects, it looks like you have two separate sets of triangles, with their own positions and texture coordinates. You want to render one set of triangles with one texture, then render another set with a different texture.
So... do that. Instead of having your VAOs send 2 positions and 2 texture coordinates, send just one. Your VS should also take one position/texcoord, and your FS should similarly take a single texture and write to a single output. The difference will be determined by what VAO is currently active and which texture is bound to texture unit 0 at the time you issue the render call.
If you truly intend to write to different output images, the way your FS suggests, then change FBOs between rendering as well.
If however, your goal is to have the same triangle use two textures with two mappings, writing separate results to two images, you can do that too. The difference is that you only provide a single position, and textures must be bound to both texture units 0 and 1 when you issue your rendering command.
I want to incorporate a custom attribute that varies per vertex. In this case it is assigned to location=4 ... but nothing happens, the other four attributes vary properly except that one. At the bottom, I added a test to produce a specific color if it encounters the value '1' (which I know exists in the buffer, because I queried the buffer earlier). Attribute 4 is stuck at the first value of its array and never moves.
Am I missing a setting (something to be enabled, maybe)? Or is it that OpenGL only varies a handful of attributes and nothing else?
#version 330 //for openGL 3.3
//uniform variables stay constant for the whole glDraw call
uniform mat4 ProjViewModelMatrix;
uniform vec4 DefaultColor; //x=-1 signifies no default color
//non-uniform variables get fed per vertex from the buffers
layout (location=0) in vec3 coords; //feeding from attribute=0 of the main code
layout (location=1) in vec4 color; //per vertex color, feeding from attribute=1 of the main code
layout (location=2) in vec3 normals; //per vertex normals
layout (location=3) in vec2 UVcoord; //texture coordinates
layout (location=4) in int vertexTexUnit;//per vertex texture unit index
//Output
out vec4 thisColor;
out vec2 vertexUVcoord;
flat out int TexUnitIdx;
void main ()
{
vertexUVcoord = UVcoord;
TexUnitIdx=vertexTexUnit;
if (DefaultColor.x==-1) {thisColor = color;} //If no default color is set, use per vertex colors
else {thisColor = DefaultColor;}
gl_Position = ProjViewModelMatrix * vec4(coords,1.0); //This outputs the position to the graphics card.
//TESTING
if (vertexTexUnit==1) thisColor=vec4(1,1,0,1); //Never receives value of 1, but the buffer does contain such values
}
Because the vertexTexUnit attribute is an integer, you must use glVertexAttribIPointer() instead of glVertexAttribPointer().
You can use vertex attributes for whatever you want. OpenGL doesn't know or care what you're using them for.
I'm developing a small 3D engine using OpenGL and GLSL. I currently use Texture Buffer Objects (TBOs) to store all my matrices (Proj, View, Model and Shadow matrices). I did some research on the most efficient way to handle matrices within a graphics engine, without any success. The goal is to store as many matrices as possible in as few TBOs as possible, and to incur a minimum of state changes and a minimum of transfers between the GPU and client code (glBufferSubData).
I propose 2 different methods (with their advantages and disadvantages):
Here's a scene example:
1 Camera (1 ProjMatrix, 1 ViewMatrix)
5 boxes (5 ModelMatrix)
Here's an example of a simple vertex shader I use:
#version 400
/*
** Vertex attributes.
*/
layout (location = 0) in vec4 VertexPosition;
layout (location = 1) in vec2 VertexTexture;
/*
** Uniform matrix buffer.
*/
uniform samplerBuffer matrixBuffer;
/*
** Matrix buffer offset.
*/
uniform int MatrixBufferOffset;
/*
** Output variables.
*/
out vec2 TexCoords;
/*
** Returns matrix4x4 from texture cache.
*/
mat4 Get_Matrix(int offset)
{
return (mat4(texelFetch(
matrixBuffer, offset), texelFetch(
matrixBuffer, offset + 1), texelFetch(matrixBuffer, offset + 2),
texelFetch(matrixBuffer, offset + 3)));
}
/*
** Vertex shader entry point.
*/
void main(void)
{
TexCoords = VertexTexture;
{
mat4 ModelViewProjMatrix = Get_Matrix(
MatrixBufferOffset);
gl_Position = ModelViewProjMatrix * VertexPosition;
}
}
1) The method I currently use: in my vertex shader I use the ModelViewProjMatrix (needed for rasterization (gl_Position)), the ModelViewMatrix (for lighting calculations) and the ModelMatrix. To avoid redundant calculation within the vertex shader, I've decided to store the ModelViewProjMatrix, the ModelViewMatrix and the ModelMatrix for each mesh node inline in the TBO as follows:
TBO = {[ModelViewProj_Box1][ModelView_Box1][Model_Box1]|[ModelViewProj_Box2]...}
Advantages: I don't need to compute the product Proj * View * Model (the ModelViewProj matrix, for example) in each vertex shader invocation (the matrices are pre-calculated).
Disadvantages: if I move the camera I need to update all the ModelViewProj and ModelView matrices, so there is a lot of data to update.
2) I thought about another way, which I think is more efficient: store the projection matrix once, the view matrix once, and finally each box scene node's model matrix once, this way:
TBO = {[ProjMatrix][ViewMatrix][ModelMatrix_Box1][ModelMatrix_Box2]...}
So my vertex shader will look like this:
#version 400
/*
** Vertex attributes.
*/
layout (location = 0) in vec4 VertexPosition;
layout (location = 1) in vec2 VertexTexture;
/*
** Uniform matrix buffer.
*/
uniform samplerBuffer matrixBuffer;
/*
** Matrix buffer offset.
*/
uniform int MatrixBufferOffset;
/*
** Output variables.
*/
out vec2 TexCoords;
/*
** Returns matrix4x4 from texture cache.
*/
mat4 Get_Matrix(int offset)
{
return (mat4(texelFetch(
matrixBuffer, offset), texelFetch(
matrixBuffer, offset + 1), texelFetch(matrixBuffer, offset + 2),
texelFetch(matrixBuffer, offset + 3)));
}
/*
** Vertex shader entry point.
*/
void main(void)
{
TexCoords = VertexTexture;
{
mat4 ProjMatrix = Get_Matrix(MatrixBufferOffset);
mat4 ViewMatrix = Get_Matrix(MatrixBufferOffset + 4);
mat4 ModelMatrix = Get_Matrix(MatrixBufferOffset + 8);
gl_Position = ProjMatrix * ViewMatrix * ModelMatrix * VertexPosition;
}
}
Advantages: The TBO contains the exact number of matrices used. The update is highly targeted (if I move the camera I only update the view matrix, if I resize the window I only update the projection matrix, and if an object moves only its model matrix is updated).
Disadvantages: I need to compute the ModelViewProjMatrix for each vertex within the vertex shader. Plus, if the scene is composed of a huge number of objects, each owning a different model matrix, I will probably need to create a new TBO. Consequently, I will lose the proj/view matrix information because I won't be connected to the right TBO, which brings us to my third method.
3) Store the projection and view matrices in one TBO and all the model matrices in one or more other TBOs, as follows:
TBO_0 = {[ProjMatrix][ViewMatrix]}
TBO_1 = {[ModelMatrix_Box1][ModelMatrix_Box2]...}
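To make the trade-off concrete, here is a small CPU-side sketch of the texel offsets and camera-move update counts for layouts 1 and 2. The function names are hypothetical. It assumes a four-component float buffer format such as GL_RGBA32F, so that one texel holds one matrix column and one mat4 spans four texels, which is what Get_Matrix above implies.

```cpp
#include <cassert>

// One mat4 occupies four texels (one vec4 column each) in the buffer texture.
constexpr int kTexelsPerMatrix = 4;

// Layout 1: [ModelViewProj_i][ModelView_i][Model_i] repeated per object.
int layout1ModelViewProjOffset(int object) {
    assert(object >= 0);
    return object * 3 * kTexelsPerMatrix;
}

// Layout 2: [Proj][View][Model_0][Model_1]...
int layout2ModelOffset(int object) {
    assert(object >= 0);
    return (2 + object) * kTexelsPerMatrix;
}

// Matrices that must be re-uploaded when only the camera moves.
int layout1CameraUpdateCount(int objectCount) {
    assert(objectCount >= 0);
    return 2 * objectCount;  // every ModelViewProj and ModelView changes
}

int layout2CameraUpdateCount(int /*objectCount*/) {
    return 1;  // only the view matrix changes
}
```

For the five-box scene, a camera move touches 10 matrices in layout 1 and a single matrix in layout 2. In layout 2 the projection and view matrices sit at fixed offsets 0 and 4, so only the model offset varies per object; layout 3 is the same idea with the model matrices starting at offset 0 of their own buffer.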
What do you think of my 3 methods? Which one do you think is best?
Thanks a lot in advance for your help!
Solution 3 is what most engines do, except that they use uniform buffers (constant buffers) instead of texture buffers. Also, they don't generally group all the model matrices together in the same buffer; they are usually grouped by object type (because identical objects are drawn at once with instancing) and sometimes by frequency of update (objects that never move are in the same buffer so that it never needs to be updated).
Also glBufferSubData can be pretty slow; updating a buffer is often slower than just binding a different one, because of all the synchronization happening inside the driver. There is a very good book chapter about that, freely available on the Internet, called "OpenGL Insights: Asynchronous Buffer Transfers" (Google it to find it).
EDIT: The NVIDIA article you linked in the comments is very interesting. They recommend using glMultiDrawElements to make several draw calls at once (that's the main trick; everything else is there because of that decision). That can reduce the CPU work in the driver a lot, but it also means that it's a lot more complicated to provide all the data required to draw the objects: you have to build and update bigger buffers for the matrices and material values, and you also need to use something like bindless textures to be able to have different textures for each object. So, interesting, but more complicated.
And glMultiDrawElements is only important if you want to draw a lot of different objects. Their examples have 68000-98000 different meshes, which really is a lot. In a game, for example, you usually have lots of instances of the same objects, but only a few hundred different objects at most. In the end, it depends on what your 3D engine needs to render.