glDeleteBuffers slower than glBufferData - c++

I'm having a bit of a performance issue in my iOS/Android game, where several VBOs have to be updated every once in a while. After profiling my game, it turns out that glDeleteBuffers() takes up to 7 ms per VBO update. This of course results in a hiccup when frames normally take only 4 ms to render.
Here's the part where I update my VBO:
Chunk* chunk;
pthread_join(constructionThread, (void**)&chunk);
building = false;
if (vboID)
{
    // takes 7 milliseconds
    glDeleteBuffers(1, &vboID);
    vboID = 0;
}
if (offset)
{
    glGenBuffers(1, &vboID);
    glBindBuffer(GL_ARRAY_BUFFER, vboID);
    // takes about 1-2 milliseconds, which is acceptable
    glBufferData(GL_ARRAY_BUFFER, offset * 4, constructionBuffer, GL_STATIC_DRAW);
}
where offset is an instance variable that is basically the size of the new VBO, which is quite variable. vboID speaks for itself, I guess ;)

glGenBuffers and glDeleteBuffers are designed to only be run on initialization and cleanup, respectively. Calling them during runtime is bad.
glBufferData replaces the current buffer data with a new set of data, which automatically changes the size of the buffer. You can safely remove the whole glGenBuffers/glDeleteBuffers thing and move it into initialization and cleanup.
Additionally, you are creating the buffer as a static buffer. This tells OpenGL that you will almost never change it, so the driver may store it in a way that's quicker to access on the GPU but slower to update from the rest of the system. Try changing GL_STATIC_DRAW to GL_DYNAMIC_DRAW or GL_STREAM_DRAW. More on this here: http://www.opengl.org/wiki/Buffer_Objects#Buffer_Object_Usage
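As a rough illustration of the lifecycle described above, here is a minimal C++ sketch. The names BufferBackend and ChunkBuffer are hypothetical and not part of OpenGL; the three callbacks are assumed stand-ins for glGenBuffers, glBindBuffer + glBufferData, and glDeleteBuffers, injected so the sketch can run without a GL context. The point is only that the ID is created once and destroyed once, while uploads happen as often as needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical stand-ins for the GL entry points (not real OpenGL names).
struct BufferBackend {
    std::function<unsigned()> gen;                      // would wrap glGenBuffers(1, &id)
    std::function<void(unsigned, std::size_t)> upload;  // would wrap glBindBuffer + glBufferData
    std::function<void(unsigned)> destroy;              // would wrap glDeleteBuffers(1, &id)
};

// Owns one buffer ID for its whole lifetime and only re-uploads data into it.
class ChunkBuffer {
public:
    explicit ChunkBuffer(BufferBackend backend) : backend_(std::move(backend)), id_(0) {
        assert(backend_.gen && backend_.upload && backend_.destroy);
        id_ = backend_.gen();  // initialization: the only place an ID is generated
    }

    ~ChunkBuffer() { backend_.destroy(id_); }  // cleanup: the only place it is deleted

    ChunkBuffer(const ChunkBuffer&) = delete;
    ChunkBuffer& operator=(const ChunkBuffer&) = delete;

    // Runtime update: replaces the contents (and size) of the same buffer object.
    void update(std::size_t byteCount) { backend_.upload(id_, byteCount); }

    unsigned id() const { return id_; }

private:
    BufferBackend backend_;
    unsigned id_;
};
```

In the question's code this would correspond to keeping vboID alive across chunk rebuilds and calling only glBindBuffer and glBufferData in the update path.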

Related

Why are glDeleteBuffers and glDeleteVertexArrays so slow?

At some point in my program's flow I generate anywhere from between 0 and 300 meshes, each of them like so:
public Mesh(float[] vertices, byte[] indices, float[] textureCoordinates)
{
    vao = glGenVertexArrays();
    glBindVertexArray(vao);

    vbo = glGenBuffers();
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, BufferUtils.createFloatBuffer(vertices), GL_STATIC_DRAW);
    glVertexAttribPointer(ShaderProgram.VERTEX_ATTRIB, 3, GL_FLOAT, false, 0, 0);
    glEnableVertexAttribArray(ShaderProgram.VERTEX_ATTRIB);

    tbo = glGenBuffers();
    glBindBuffer(GL_ARRAY_BUFFER, tbo);
    glBufferData(GL_ARRAY_BUFFER, BufferUtils.createFloatBuffer(textureCoordinates), GL_STATIC_DRAW);
    glVertexAttribPointer(ShaderProgram.TCOORD_ATTRIB, 2, GL_FLOAT, false, 0, 0);
    glEnableVertexAttribArray(ShaderProgram.TCOORD_ATTRIB);

    ibo = glGenBuffers();
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, BufferUtils.createByteBuffer(indices), GL_STATIC_DRAW);

    glBindVertexArray(0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
When the user presses a button these meshes need to be deleted again, so I run a cleanup method on each of these mesh objects:
public void cleanup()
{
    glDeleteBuffers(vbo);
    glDeleteBuffers(tbo);
    glDeleteBuffers(ibo);
    glDeleteVertexArrays(vao);
}
The problem is that I am trying to run at 60 fps and deleting 70 of these mesh objects takes about 30 ms (and deleting 110 of them takes 75 ms). This creates a noticeable hiccup in performance because one frame should take at most ~16 ms.
Is this not the right way to dispose of VBOs and VAOs? I read in a different question (glDeleteBuffers slower than glBufferData) that
glGenBuffers and glDeleteBuffers are designed to only be run on initialization and cleanup, respectively. Calling them during runtime is bad.
but I am not sure how I can get rid of these VBOs and VAOs without calling the above functions.
I have thought of adding all meshes-to-be-deleted to a queue and only deleting a couple of them each frame, slowly emptying the queue, but that does not feel like the right solution. Another (possible) solution I have thought of is to use instanced rendering, but, as far as I understand, when I make sub 1000 draw calls per frame, non-instanced rendering should work fine too. My program will never have much more than 1000 Mesh objects at any given time, and I am not even sure this will solve my problem.
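For reference, the queue idea described above could look roughly like the following C++ sketch. DeletionQueue is a hypothetical helper, and the injected deleter is an assumed stand-in for a call such as glDeleteBuffers, so the sketch runs without a GL context. It only shows the per-frame budgeting; it does not by itself make each deletion cheaper.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: spreads object deletions over several frames.
class DeletionQueue {
public:
    // 'deleter' stands in for the real GL delete call (assumption for this sketch).
    explicit DeletionQueue(std::function<void(unsigned)> deleter)
        : deleter_(std::move(deleter)) {
        assert(static_cast<bool>(deleter_));
    }

    // Marks an object ID for deletion at some later frame.
    void enqueue(unsigned id) { pending_.push_back(id); }

    // Call once per frame: deletes at most maxPerFrame objects, oldest first.
    // Returns how many objects were actually deleted.
    std::size_t drain(std::size_t maxPerFrame) {
        std::size_t deleted = 0;
        while (deleted < maxPerFrame && !pending_.empty()) {
            deleter_(pending_.front());
            pending_.pop_front();
            ++deleted;
        }
        return deleted;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::function<void(unsigned)> deleter_;
    std::deque<unsigned> pending_;
};
```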
UPDATE: Besides the answer below pointing me in exactly the right direction, I also discovered I wasn't actually deleting ~0-300 VBOs, but a factor of 48 more! No wonder performance was killed. So if anyone else ever has the same problem, thoroughly check the number of glDeleteBuffers calls your code is performing.
You've hit some mental roadblock there. You seem to think that "per mesh: {one vertex attribute == one VBO}", but that's not how it's supposed to work. What you should do is use one single, large VBO and use it as a pool of memory from which you allocate chunks, each holding some data.
So you put not only all the vertex attributes of a single mesh into a common VBO, you also put several meshes into a single VBO.
Also it seems like you're creating and deleting your VBOs and VAOs on every single rendering iteration. Why? Do the meshes dramatically change between every frame? If that is so, that must be epilepsy inducing to watch; mesh deformations baked into the geometry data don't require recreation of the buffers, you just overwrite the data with glBufferSubData.
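One possible way to do the bookkeeping for such a pool is a small first-fit allocator that hands out byte ranges of a single large VBO. The sketch below is an assumption about how this could be organized, not a prescribed design: RangePool and kNoSpace are hypothetical names, and the class tracks offsets only. The actual upload into the returned range would be done with glBufferSubData on the shared buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Sentinel returned when no free range is large enough (hypothetical name).
constexpr std::size_t kNoSpace = static_cast<std::size_t>(-1);

// Tracks free byte ranges inside one large buffer object.
class RangePool {
public:
    explicit RangePool(std::size_t capacityBytes) {
        assert(capacityBytes > 0);
        free_[0] = capacityBytes;  // initially the whole buffer is one free range
    }

    // First-fit: returns the byte offset of a reserved range, or kNoSpace.
    std::size_t allocate(std::size_t size) {
        if (size == 0) return kNoSpace;
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second < size) continue;
            const std::size_t offset = it->first;
            const std::size_t remaining = it->second - size;
            free_.erase(it);
            if (remaining > 0) free_[offset + size] = remaining;
            return offset;
        }
        return kNoSpace;
    }

    // Returns a range to the pool and merges it with adjacent free ranges.
    void release(std::size_t offset, std::size_t size) {
        auto next = free_.lower_bound(offset);
        if (next != free_.begin()) {
            auto prev = std::prev(next);
            if (prev->first + prev->second == offset) {  // touches the range before it
                offset = prev->first;
                size += prev->second;
                free_.erase(prev);
            }
        }
        if (next != free_.end() && offset + size == next->first) {  // touches the range after it
            size += next->second;
            free_.erase(next);
        }
        free_[offset] = size;
    }

private:
    std::map<std::size_t, std::size_t> free_;  // offset -> size of each free range
};
```

With this kind of scheme, "deleting" a mesh becomes a call to release() with no GL object destroyed, and the freed range can be reused by the next mesh that fits.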

how to use glMapBufferRange

I have defined a VBO with a section of vertices that will be updated periodically by a thread.
After creating the VBO and filling its buffers, I retrieved the pointer to the section I want to change.
Should I call glMapBufferRange every time before I change the content of my array?
The documentation says the buffer must be unmapped after use, which means the pointer is invalid from then on.
My understanding is that the following pseudocode has to appear in my update thread:
glBindBuffer(GL_ARRAY_BUFFER, id);
glBufferData(GL_ARRAY_BUFFER, size, null, GL_STATIC_DRAW);
ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, size, GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
-> populate buffer
glUnmapBuffer(GL_ARRAY_BUFFER);
Am I right about that, and is this the best way to manipulate a section of the VBO?

OpenGL VAO VBO resize

I have run into an issue where uploading new content of a different size to the buffer causes the VAO to behave unpredictably, making my object look as if the buffer size had been set incorrectly.
1) I generate VAO and VBO for an object
glGenVertexArrays(1, &_vao); // Generate 1 VAO
glGenBuffers(1, &_vbo); // Generate 1 VBO
2) Get location of uniform variables and attributes
3) Set the bindings like this:
glBindBuffer(GL_ARRAY_BUFFER, _vbo); // Bind VBO first
glBindVertexArray(_vao); // Bind VAO properties to VBO
{
    GLsizei packSize = DIM * sizeof(GLfloat) * 2;
    glEnableVertexAttribArray(_position);
    glVertexAttribPointer(_position, DIM, GL_FLOAT, GL_FALSE, packSize, (GLvoid*)0);
    // And so on ...
}
glBindVertexArray(0); // Unbind VAO
glBindBuffer(GL_ARRAY_BUFFER, 0); // Unbind VBO
4) Then I load my data:
glBindBuffer(GL_ARRAY_BUFFER, _vbo);
glBufferData(GL_ARRAY_BUFFER, _vertices.size() * sizeof(float),
_vertices.data(), GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);
The problem starts when I try to load new data of a different size:
glBindBuffer(GL_ARRAY_BUFFER, _vbo);
glBufferData(GL_ARRAY_BUFFER, _vertices.size() * sizeof(float),
_vertices.data(), GL_STATIC_DRAW);
glBindBuffer(GL_ARRAY_BUFFER, 0);
And my buffer size seems to remain the same.
However, if I recreate the buffer and bind my VAO to the VBO again (as in step 3)
glDeleteBuffers(1, &_vbo);
glGenBuffers(1, &_vbo);
// glBindBuffer... See step 3
// ...
It works.
Is it possible to avoid recreating the new buffer?
What is the preferred way of resizing it?
[Image: resize screenshot]
Why does it behave this way?
I use OpenGL version 4.2.0 Build 10.18.10.3379
What is the preferred way of resizing it?
The preferred way of resizing buffer objects is to not resize them. I'm being serious.
The OpenGL ARB created an entire alternate way of allocating storage for buffers, for the sole purpose of making it impossible to reallocate them later. OK, if you want to be technical, the purpose of immutable storage is to allow persistent mapping, which requires immutability, but if they just wanted persistent mapped buffers, they could have restricted the new API to just those.
Figure out what the maximum size of your data will be (or whatever maximum you want to support), allocate that up front, and just use whatever part of it you need.
Technically, your code ought to work. By the standard, if you resize a buffer object, things that reference it ought to be able to see the resized storage. But since implementations don't want you to resize them, high-performance applications don't resize them. So that code is rarely tested in real applications, and bugs can linger for quite some time.
Just don't do it.
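A minimal sketch of that "allocate the maximum once, then use a part of it" bookkeeping might look like this in C++. FixedCapacityBuffer is a hypothetical name, and the GL calls appear only in comments so the sketch runs without a context; the GL_DYNAMIC_DRAW hint in those comments is an assumption, not something the advice above prescribes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for a buffer whose storage is allocated once at its maximum size.
class FixedCapacityBuffer {
public:
    FixedCapacityBuffer(std::size_t maxVertices, std::size_t strideBytes)
        : capacityBytes_(maxVertices * strideBytes), strideBytes_(strideBytes) {
        assert(strideBytes > 0);
        // Real code would allocate the storage here, exactly once, e.g.:
        // glBufferData(GL_ARRAY_BUFFER, capacityBytes_, nullptr, GL_DYNAMIC_DRAW);
    }

    // Records a new vertex count; returns false if it exceeds the preallocated storage.
    bool setVertexCount(std::size_t vertexCount) {
        const std::size_t bytes = vertexCount * strideBytes_;
        if (bytes > capacityBytes_) return false;  // caller must cap or split the data
        usedBytes_ = bytes;
        // Real code would overwrite only the used prefix, e.g.:
        // glBufferSubData(GL_ARRAY_BUFFER, 0, usedBytes_, data);
        return true;
    }

    std::size_t capacityBytes() const { return capacityBytes_; }
    std::size_t usedBytes() const { return usedBytes_; }

    // Number of vertices to pass to the draw call; the unused tail is simply ignored.
    std::size_t drawCount() const { return usedBytes_ / strideBytes_; }

private:
    std::size_t capacityBytes_;
    std::size_t strideBytes_;
    std::size_t usedBytes_ = 0;
};
```

The storage size never changes after construction, so the VAO bindings set up once against the buffer stay valid regardless of how much of it is in use.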

Why do OpenGL buffers (and objects in general) work?

Here is what I understand about how OpenGL buffers work: Sending data from the CPU to the GPU is slow, so we don't want to send, say, vertex data vertex by vertex. Instead, we send vertex data to the GPU all at once, and inform the GPU how to interpret what we send it. Buffers are involved in this process somehow (that's the point I do not understand).
So how to do this? We follow the OpenGL object creation model:
Generate a buffer ID
GLuint VBO;
glGenBuffers(1, &VBO);
Bind the buffer ID to a target buffer in the current state/context
glBindBuffer(GL_ARRAY_BUFFER, VBO);
Modify or query data. For instance, glBufferData() called on the target GL_ARRAY_BUFFER can be used to send the vertex data.
Set the target back to default
glBindBuffer(GL_ARRAY_BUFFER, 0);
De-allocate resources (edit: after the buffer has served its purpose; for instance, after the end of the game loop)
glDeleteBuffers(1, &VBO);
So on one hand I have the abstract idea of a block of GPU memory, and on the other hand I have a procedure to access that block, and I do not understand how the two relate.

glBufferData synchronization between draw calls

I found myself in trouble working with OpenGL ES, in particular when feeding the VBO with my data.
Here is the code causing the problem:
void Renderer::DrawCurrentData()
{
    glBufferData(GL_ARRAY_BUFFER, (currentVertexIndex) * sizeof(Vertex), vertices, GL_STREAM_DRAW);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, (currentIndexIndex) * sizeof(GLushort), indices, GL_STREAM_DRAW);
    glDrawElements(GL_TRIANGLE_STRIP, currentIndexIndex, GL_UNSIGNED_SHORT, 0);
    currentVertexIndex = 0;
    currentIndexIndex = 0;
    bufVertex = &vertices[currentVertexIndex];
}
It works fine as long as I have only one draw call per frame, so that glBufferData is called only once for each buffer per frame before calling glDrawElements. But if I want to make several draw calls per frame, like
1) frame begin
2) glBufferData
3) glDrawElements
4) repeat steps 2 and 3
5) presentRenderbuffer
then I get a crash in glBufferData. This probably happens because the buffer is still locked by the previous draw call when I try to feed it new data. I know I should use synchronization here, but I'm not sure how to do that the right way.
I tried to orphan the buffer by calling glBufferData with a NULL data pointer before providing the new data, but it didn't help; in that case I got the crash when presenting the framebuffer. The question is: how do I feed the buffer new data while the old data is still in use?
I tried both GL_STREAM_DRAW and GL_DYNAMIC_DRAW; both showed the same results.