Previously, to render a bunch of quads, I was simply using a couple of uniforms (one for the model matrix and another for the texture layer ID). However, I'd rather not loop through every quad and set both uniforms each frame.
So I went looking for an alternative that would let me render everything with a single call.
Now I'm using instanced rendering:
// In VAO definition
glGenBuffers(1, &instanceVBO);
glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
glBufferData(GL_ARRAY_BUFFER, 0, nullptr, GL_DYNAMIC_DRAW);
glEnableVertexAttribArray(3);
glVertexAttribPointer(3, 4, GL_FLOAT, GL_FALSE, sizeof(RenderingData), nullptr);
glVertexAttribDivisor(3, 1);
glEnableVertexAttribArray(4);
glVertexAttribPointer(4, 4, GL_FLOAT, GL_FALSE, sizeof(RenderingData), (GLvoid*)(sizeof(glm::vec4)));
glVertexAttribDivisor(4, 1);
glEnableVertexAttribArray(5);
glVertexAttribPointer(5, 4, GL_FLOAT, GL_FALSE, sizeof(RenderingData), (GLvoid*)(sizeof(glm::vec4) * 2));
glVertexAttribDivisor(5, 1);
glEnableVertexAttribArray(6);
glVertexAttribPointer(6, 4, GL_FLOAT, GL_FALSE, sizeof(RenderingData), (GLvoid*)(sizeof(glm::vec4) * 3));
glVertexAttribDivisor(6, 1);
glEnableVertexAttribArray(7);
glVertexAttribIPointer(7, 1, GL_UNSIGNED_INT, sizeof(RenderingData), (GLvoid*)(sizeof(glm::mat4)));
glVertexAttribDivisor(7, 1);
// In quad adding function
instanceData.push_back(quad->data);
glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
glBufferData(GL_ARRAY_BUFFER, sizeof(RenderingData) * quads.size(), &instanceData[0], GL_DYNAMIC_DRAW);
// Rendering
glDrawElementsInstanced(GL_TRIANGLES, 6, GL_UNSIGNED_INT, 0, quads.size());
I'm using 2 VBOs: one for per-vertex data (position, normal, etc.) and another for per-instance data (model matrix and texture layer ID).
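For reference, here is a minimal sketch of how the per-instance offsets above relate to the struct layout. The definition of RenderingData is not shown in the question, so the one below is an assumption (a column-major 4x4 float matrix followed by an unsigned layer ID, with plain arrays standing in for the glm types), and `columnOffset` and `instanceBufferSize` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout of one per-instance record: a column-major 4x4 float matrix
// followed by an unsigned texture layer ID (plain arrays stand in for glm types).
struct RenderingData {
    float model[4][4];     // model[c] is column c, consumed by attributes 3..6
    std::uint32_t layer;   // consumed by integer attribute 7
};

// Byte offset of matrix column `c` inside one RenderingData record.
// A mat4 attribute occupies four consecutive locations, one vec4 per column.
inline std::size_t columnOffset(std::size_t c) {
    assert(c < 4);
    return offsetof(RenderingData, model) + c * sizeof(float[4]);
}

// Size in bytes of an instance buffer holding `instanceCount` records.
inline std::size_t instanceBufferSize(std::size_t instanceCount) {
    return instanceCount * sizeof(RenderingData);
}

static_assert(offsetof(RenderingData, layer) == 16 * sizeof(float),
              "layer ID is expected directly after the matrix");
```

Under that assumed layout, the offsets come out the same as the hand-written sizeof(glm::vec4) multiples in the setup code. The byte count passed to glBufferData would normally be computed from the number of records in the container that is actually uploaded.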
My problem is the following: http://imgur.com/20Wb9pQ
As you can see, the middle and bottom images aren't rendering correctly: the middle one is far too stretched, and the bottom one's UVs are incorrect, since they don't fill the whole quad. However, I'm 98% sure both my vertex and instance data are correct, since everything rendered correctly when I was using the previously mentioned uniforms.
Also, I might've spotted the problem: the indices. Somehow, if I change the count value of my glDrawElementsInstanced call to, say, 12 indices, the middle one renders correctly, whereas the other 2 do not (proof: http://imgur.com/PDQqbuT). The same thing happens if I change it to 18 (the last one renders correctly and the other 2 do not).
What might the problem be?
I have a piece of OpenGL code that renders meshes. I use VBOs to render them. Now, meshes consist of vertices that have the following attributes:
glm::vec3 position;
glm::vec2 uv;
glm::vec4 color;
glm::vec3 normal;
glm::vec3 tangent;
glm::vec3 binormal;
Currently, I render the vertices on per-vertex basis like this:
// Upload a vector of vertices
glBindBuffer(GL_ARRAY_BUFFER, m_vbo);
glBufferData(GL_ARRAY_BUFFER, m_vertices.size() * sizeof(Vertex), &m_vertices[0], GL_STATIC_DRAW);
// Set the "layout" of the vertex attributes
// Binormal
glVertexAttribPointer(5, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) (sizeof(glm::vec3) * 3 + sizeof(glm::vec2) + sizeof(glm::vec4)));
// Tangent
glVertexAttribPointer(4, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) (sizeof(glm::vec3) * 2 + sizeof(glm::vec2) + sizeof(glm::vec4)));
// Normal
glVertexAttribPointer(3, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) (sizeof(glm::vec3) + sizeof(glm::vec2) + sizeof(glm::vec4)));
// Color
glVertexAttribPointer(2, 4, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) (sizeof(glm::vec3) + sizeof(glm::vec2)));
// UV
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) (sizeof(glm::vec3)));
// Position
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, sizeof(Vertex), (void*) 0);
// Draw
glDrawElements(GL_TRIANGLES, m_indices.size(), GL_UNSIGNED_SHORT, 0);
Now, I've seen people do it a bit differently. Some upload all the vertex positions first, then the UV data, then normals and so on. To do a rough visualization of the data layout:
// P = position, U = uv, N = normal
// Per-vertex layout
PUNPUNPUNPUNPUNPUNPUNPUNPUNPUNPUNPUN
// Per-attribute layout
PPPPPPPPPPPPUUUUUUUUUUUUNNNNNNNNNNNN
Is there any difference between these two layouts? Is one or the other causing any performance issues, especially if data gets updated constantly?
The first layout you're describing is typically called "interleaved", and is mostly considered advantageous. The reasoning is that it results in more local memory access patterns, which are more cache friendly.
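To make the two layouts concrete, here is a small CPU-side sketch. The Vertex struct is cut down to position, UV and normal to match the diagram in the question, and the helper names (`AttribLayout`, `interleavedLayout`, `planarLayout`) are made up for illustration. It only computes the stride and byte offset you would hand to glVertexAttribPointer in each case.

```cpp
#include <cassert>
#include <cstddef>

// Reduced vertex matching the P/U/N diagram (plain floats stand in for glm types).
struct Vertex {
    float position[3];
    float uv[2];
    float normal[3];
};

// Stride and byte offset as they would be passed to glVertexAttribPointer.
struct AttribLayout {
    std::size_t stride;
    std::size_t offset;
};

enum Attrib { Position = 0, UV = 1, Normal = 2 };

// Interleaved (PUNPUN...): every attribute shares the stride sizeof(Vertex),
// and the offset is the member's position inside the struct.
inline AttribLayout interleavedLayout(Attrib a) {
    switch (a) {
        case Position: return {sizeof(Vertex), offsetof(Vertex, position)};
        case UV:       return {sizeof(Vertex), offsetof(Vertex, uv)};
        default:       return {sizeof(Vertex), offsetof(Vertex, normal)};
    }
}

// Per-attribute (PPP...UUU...NNN...): each attribute is tightly packed,
// and its block starts after the blocks of all preceding attributes.
inline AttribLayout planarLayout(Attrib a, std::size_t vertexCount) {
    assert(vertexCount > 0);
    const std::size_t posBytes = sizeof(float[3]) * vertexCount;
    const std::size_t uvBytes  = sizeof(float[2]) * vertexCount;
    switch (a) {
        case Position: return {sizeof(float[3]), 0};
        case UV:       return {sizeof(float[2]), posBytes};
        default:       return {sizeof(float[3]), posBytes + uvBytes};
    }
}
```

With the interleaved layout, all attributes of one vertex sit within sizeof(Vertex) bytes of each other, which is the locality argument made above. With the per-attribute layout, the same vertex's attributes are spread across three distant regions of the buffer.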
One good reason to use a different layout would be if some of the attributes are updated much more frequently than others. In the extreme case, where some of them are static, while others are updated frequently, it might actually be beneficial to keep the static attributes in one VBO, with GL_STATIC_DRAW usage, and use a separate buffer with GL_DYNAMIC_DRAW usage for the attributes that change frequently.
@leemes brings up another interesting case in a comment above: If you often use only a subset of the attributes for draw calls, it might also be worth grouping them differently. In that case, you could have the attributes that are always used in an interleaved layout, and keep the more rarely used ones separate.
With all that said, you will often have bigger bottlenecks in your rendering pipeline, so the difference might be difficult to measure outside targeted synthetic benchmarks. Still, I think it's mostly worth it to keep everything as streamlined as possible. Particularly since most computers/devices run on battery power these days, where you don't want to waste anything.
I'm using a Widget inheriting QGLWidget to show an OpenGL viewport inside my Qt application.
The Widget does nothing more than create three CRenderVectors and draw them all the time.
A CRenderVector is simply a group of a QVector3D, a QOpenGLVertexArrayObject and three QOpenGLBuffers for vertices, indices and colors.
The vertex buffer objects get created with
const GLfloat color[] = {1, 0, 0, 1, 0, 0};
color_buffer.create();
color_buffer.setUsagePattern(QOpenGLBuffer::StaticDraw);
color_buffer.bind();
color_buffer.allocate(color, sizeof(color));
color_buffer.release();
respectively, using 0, 0, 0 and vec().{x,y,z}() for the vertex_buffer.
The vertex array object gets created via
vertex_array.create();
vertex_array.bind();
vertex_buffer.bind();
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_TRUE, 0, (void*)0);
index_buffer.bind();
color_buffer.bind();
glEnableVertexAttribArray(1);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_TRUE, 0, (void*)0);
vertex_array.release();
Drawing of a vector looks like
vertex_array.bind();
glDrawElements(GL_LINES, elements, GL_UNSIGNED_INT, (void*)0);
vertex_array.release();
The problem is that I can't see anything in the viewport except the clear color, although I think I'm using everything as shown in the official Qt documentation. Where did I misunderstand the Qt or OpenGL documentation?
The stripped project for QtCreator can be downloaded at mediafire.
It seems like glBufferSubData is overwriting or somehow mangling data between my glDrawArrays calls. I'm working on Windows 7 64-bit, with the latest drivers for my Nvidia GeForce GT520M CUDA 1GB.
I have 2 models, each with an animation. Each model has 1 mesh, and that mesh is stored in the same VAO. Each also has 1 animation, and the bone transformations used for rendering the mesh are stored in the same VBO.
My workflow looks like this:
1. calculate the bone transformation matrices for a model
2. load the bone transformation matrices into OpenGL using glBufferSubData, then bind the buffer
3. render the model's mesh using glDrawArrays
For one model, this works (at least, mostly - sometimes I get weird gaps in between the vertices).
However, for more than one model, it looks like bone transformation matrix data is getting mixed up between the rendering calls to the meshes.
[Image: Single Model Animated Windows]
[Image: Two Models Animated Windows]
I load my bone transformation data like so:
void Animation::bind()
{
glBindBuffer(GL_UNIFORM_BUFFER, bufferId_);
glBufferSubData(GL_UNIFORM_BUFFER, 0, currentTransforms_.size() * sizeof(glm::mat4), &currentTransforms_[0]);
bindPoint_ = openGlDevice_->bindBuffer( bufferId_ );
}
And I render my mesh like so:
void Mesh::render()
{
glBindVertexArray(vaoId_);
glDrawArrays(GL_TRIANGLES, 0, vertices_.size());
glBindVertexArray(0);
}
If I add a call to glFinish() after my call to render(), it works just fine! This seems to indicate to me that, for some reason, the transformation matrix data for one animation is 'bleeding' over to the next animation.
How could this happen? I am under the impression that if I called glBufferSubData while that buffer was in use (e.g. by a glDrawArrays call), then it would block. Is this not the case?
It might be worth mentioning that this same code works just fine in Linux.
Note: Related to a previous post, which I deleted.
Mesh Loading Code:
void Mesh::load()
{
LOG_DEBUG( "loading mesh '" + name_ +"' into video memory." );
// create our vao
glGenVertexArrays(1, &vaoId_);
glBindVertexArray(vaoId_);
// create our vbos
glGenBuffers(5, &vboIds_[0]);
glBindBuffer(GL_ARRAY_BUFFER, vboIds_[0]);
glBufferData(GL_ARRAY_BUFFER, vertices_.size() * sizeof(glm::vec3), &vertices_[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, vboIds_[1]);
glBufferData(GL_ARRAY_BUFFER, textureCoordinates_.size() * sizeof(glm::vec2), &textureCoordinates_[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(1);
glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, vboIds_[2]);
glBufferData(GL_ARRAY_BUFFER, normals_.size() * sizeof(glm::vec3), &normals_[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(2);
glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, vboIds_[3]);
glBufferData(GL_ARRAY_BUFFER, colors_.size() * sizeof(glm::vec4), &colors_[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(3);
glVertexAttribPointer(3, 4, GL_FLOAT, GL_FALSE, 0, 0);
if (bones_.size() == 0)
{
bones_.resize( vertices_.size() );
for (auto& b : bones_)
{
b.weights = glm::vec4(0.25f);
}
}
glBindBuffer(GL_ARRAY_BUFFER, vboIds_[4]);
glBufferData(GL_ARRAY_BUFFER, bones_.size() * sizeof(VertexBoneData), &bones_[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(4);
glVertexAttribIPointer(4, 4, GL_INT, sizeof(VertexBoneData), (const GLvoid*)0);
glEnableVertexAttribArray(5);
glVertexAttribPointer(5, 4, GL_FLOAT, GL_FALSE, sizeof(VertexBoneData), (const GLvoid*)(sizeof(glm::ivec4)));
glBindVertexArray(0);
}
Animation UBO Setup:
void Animation::setupAnimationUbo()
{
bufferId_ = openGlDevice_->createBufferObject(GL_UNIFORM_BUFFER, Constants::MAX_NUMBER_OF_BONES_PER_MESH * sizeof(glm::mat4), &currentTransforms_[0]);
}
where Constants::MAX_NUMBER_OF_BONES_PER_MESH is set to 100.
In OpenGlDevice:
GLuint OpenGlDevice::createBufferObject(GLenum target, glmd::uint32 totalSize, const void* dataPointer)
{
GLuint bufferId = 0;
glGenBuffers(1, &bufferId);
glBindBuffer(target, bufferId);
glBufferData(target, totalSize, dataPointer, GL_DYNAMIC_DRAW);
glBindBuffer(target, 0);
bufferIds_.push_back(bufferId);
return bufferId;
}
Those usage flags are mostly correct for this scenario, though you might consider trying GL_STREAM_DRAW.
Your driver appears to be failing to synchronize implicitly for some reason, so you might want to try a technique that eliminates the need for synchronization in the first place. I would suggest buffer orphaning: call glBufferData (...) with NULL for the data pointer before sending the new data. Commands that are currently using the UBO can then keep reading the original data store without forcing synchronization, because a new data store is allocated before the new data is written. Once those earlier commands finish, the GL implementation can free the orphaned data store.
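As a rough illustration of why orphaning avoids the stall, here is a toy CPU-side model of the data-store lifetime. It is not OpenGL code, and `ToyBuffer`, `DataStore`, `orphan`, `subData` and `snapshotForDraw` are invented names. In the real API, the equivalent step is calling glBufferData with the same size and a NULL pointer, followed by glBufferSubData with the new contents.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a buffer object's data store.
using DataStore = std::vector<float>;

// Toy model of a buffer object: the name stays the same, the store can be swapped.
class ToyBuffer {
public:
    explicit ToyBuffer(std::size_t count)
        : store_(std::make_shared<DataStore>(count, 0.0f)) {}

    // Models a queued draw call: it keeps the store it was issued with alive.
    std::shared_ptr<const DataStore> snapshotForDraw() const { return store_; }

    // Models glBufferData(target, size, NULL, usage): detach the old store and
    // attach a fresh one, so pending draws are unaffected by later writes.
    // (Real contents are undefined after orphaning; zeros are used here.)
    void orphan() { store_ = std::make_shared<DataStore>(store_->size(), 0.0f); }

    // Models glBufferSubData: overwrite part of the current store.
    void subData(std::size_t offset, const DataStore& src) {
        assert(offset + src.size() <= store_->size());
        for (std::size_t i = 0; i < src.size(); ++i) {
            (*store_)[offset + i] = src[i];
        }
    }

private:
    std::shared_ptr<DataStore> store_;
};
```

In this model, a draw that took its snapshot before `orphan()` keeps seeing the old matrices, while the next `subData()` writes into the fresh store. The old store is released when the last pending draw drops its reference.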
In newer OpenGL implementations you can use glInvalidateBuffer[Sub]Data (...) to hint the driver into doing what was discussed above. Likewise, you can use glMapBufferRange (...) with appropriate flags to control all of this behavior more explicitly. Unmapping will implicitly flush and synchronize access to a buffer object unless told otherwise; this might get your driver to do its job if you do not want to mess around with synchronization-free buffer update logic.
Most of what I mentioned is discussed in more detail here.
I'm a bit confused why this still renders. I thought you need to bind a vertex buffer object so that glDrawArrays knows which vertex buffer to use.
Here is my initialisation code..
// Create and bind vertex array to store vertex attribute states.
glGenVertexArraysOES(NUM_VERTEX_ARRAYS, &m_vertexArray);
glBindVertexArrayOES(m_vertexArray);
// Create and bind vertex buffer to store vertex data.
glGenBuffers(NUM_VERTEX_BUFFERS, &m_vertexBuffer);
glBindBuffer(GL_ARRAY_BUFFER, m_vertexBuffer);
glBufferData(GL_ARRAY_BUFFER, sizeof(Vertex) * 36, &m_vertices[0], GL_STATIC_DRAW);
glEnableVertexAttribArray(VertexAttribPosition);
glVertexAttribPointer(VertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(0));
glEnableVertexAttribArray(VertexAttribNormal);
glVertexAttribPointer(VertexAttribNormal, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(12));
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindVertexArrayOES(0);
Here is my render code. I'm confused why glDrawArrays still works when I bind 0 to GL_ARRAY_BUFFER.
glBindVertexArrayOES(m_vertexArray);
glBindBuffer(GL_ARRAY_BUFFER, 0);
glDrawArrays(GL_TRIANGLES, 0, 36);
glBindVertexArrayOES(0);
I thought you need to bind a vertex buffer object so that glDrawArrays knows which vertex buffer to use.
When glDraw… is called, it uses the data addressed by the most recent gl…Pointer (or equivalent) calls and enabled by glEnableVertexAttribArray. When you do
glBindBuffer(GL_ARRAY_BUFFER, m_vertexBuffer);
glVertexAttribPointer(VertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(0));
glVertexAttribPointer(VertexAttribNormal, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(12));
an association between the (active) vertex attributes and the buffer objects is formed. Or in other words: glBindBuffer is only relevant for calls to glBuffer… and gl…Pointer calls. Hence you can safely bind a different buffer object after making the call to a gl…Pointer function. In fact the following would work, too:
glBindBuffer(GL_ARRAY_BUFFER, m_vertexPositionBuffer);
glVertexAttribPointer(VertexAttribPosition, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(0));
glBindBuffer(GL_ARRAY_BUFFER, m_vertexNormalBuffer);
glVertexAttribPointer(VertexAttribNormal, 3, GL_FLOAT, GL_FALSE, 24, BUFFER_OFFSET(0));
i.e., different buffer objects are used for each vertex attribute array.
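The rule can be modelled in a few lines of plain C++. This is a toy state machine rather than real OpenGL; `ToyContext`, `ToyAttribState` and their members are hypothetical names that mimic glBindBuffer, glVertexAttribPointer and the lookup performed at draw time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Per-attribute pointer state; buffer names are plain unsigned IDs.
struct ToyAttribState {
    unsigned buffer;      // buffer captured when the pointer was specified
    std::size_t stride;
    std::size_t offset;
};

// Toy model of the relevant context state.
struct ToyContext {
    unsigned arrayBufferBinding = 0;             // current GL_ARRAY_BUFFER binding
    std::array<ToyAttribState, 8> attribs{};     // zero-initialised attribute state

    // Mimics glBindBuffer(GL_ARRAY_BUFFER, id): only changes the binding point.
    void bindArrayBuffer(unsigned id) { arrayBufferBinding = id; }

    // Mimics glVertexAttribPointer: copies the current binding into the attribute.
    void vertexAttribPointer(std::size_t index, std::size_t stride, std::size_t offset) {
        assert(index < attribs.size());
        attribs[index] = ToyAttribState{arrayBufferBinding, stride, offset};
    }

    // Mimics what a draw call reads: the captured buffer, not the binding point.
    unsigned bufferUsedByDraw(std::size_t index) const {
        assert(index < attribs.size());
        return attribs[index].buffer;
    }
};
```

Binding buffer 0 after the pointer calls, as the render code in the question does, changes only `arrayBufferBinding`. The attributes still reference the buffer that was bound when their pointers were set.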
Update
Vertex Array Objects add some sugar coating to this by making it possible to keep the bind→pointer/offset association in an object that can itself be bound. So switching to a new (set of) buffer object(s) becomes less work.
You're using vertex array objects, so all of this state is already recorded in the VAO. Here is a good explanation of what data a VAO holds: http://www.altdevblogaday.com/2013/10/18/ios-open-gl-es-2-multiple-objects-at-once/