OpenGL: How does Vertex Buffer manage its memory? - c++

I'm learning OpenGL and I'm trying to understand things properly. If my understanding is incorrect at any point, please correct me.
Introduction
So let's say we have a triangle. This triangle has its vertices. Let's say these vertices only have the position set - no color, not anything else. These vertices are passed to the shaders using a buffer - let's call it VB (VBO in tutorials).
The shaders are the following:
Vertex shader:
#version 330 core
layout (location = 0) in vec3 aPos;
void main()
{
gl_Position = vec4(aPos.x, aPos.y, aPos.z, 1.0);
}
Fragment shader:
#version 330 core
out vec4 FragColor;
void main()
{
FragColor = vec4(1.0f, 0.5f, 0.2f, 1.0f);
}
The VB is an unformatted array of data. For example, if we wanted to pass 3 one-byte values to this buffer (0, 255, 16), the data would look like this in hexadecimal:
00FF10
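As a quick check of that byte dump, here is a minimal sketch that formats raw bytes as hexadecimal. The helper name toHex is made up for this illustration and is not part of OpenGL. Since 16 is 0x10, the three bytes print as 00FF10.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Format raw bytes as uppercase hex, two digits per byte, the way a
// hex dump of the buffer's data store would show them.
std::string toHex(const std::vector<std::uint8_t>& bytes) {
    std::string out;
    char digits[3];
    for (std::uint8_t b : bytes) {
        std::snprintf(digits, sizeof(digits), "%02X", static_cast<unsigned>(b));
        out += digits;
    }
    return out;
}

// Usage: the three one-byte values from the example (0, 255, 16).
void hexExample() {
    const std::vector<std::uint8_t> values = {0, 255, 16};
    assert(toHex(values) == "00FF10");
}
```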
However, the shaders do not know how to read the data, so we need to "instruct" them by telling them what is what. To do this we use Vertex Array Objects. Let's call our Vertex Array Object VA.
To pass data to the buffer, glBufferData is used. When calling the function like this:
glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
We inform OpenGL that we want to copy sizeof(vertices) bytes from the array vertices into the buffer currently bound to GL_ARRAY_BUFFER, and that the data will be used for static drawing.
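The size argument is a byte count, not an element count. A small sketch of the difference, assuming a hypothetical triangle of three x, y, z positions (the values are made up):

```cpp
#include <cstddef>

// Hypothetical triangle: 3 vertices, each with x, y, z stored as floats.
static const float vertices[] = {
    -0.5f, -0.5f, 0.0f,
     0.5f, -0.5f, 0.0f,
     0.0f,  0.5f, 0.0f,
};

// Size in bytes: this is the kind of value glBufferData takes as its size.
constexpr std::size_t kByteCount = sizeof(vertices);
// Number of float elements, which is a different (smaller) number.
constexpr std::size_t kFloatCount = sizeof(vertices) / sizeof(vertices[0]);

static_assert(kByteCount == kFloatCount * sizeof(float), "size is given in bytes");
```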
Then, we inform VA how to use the data like this:
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(float), (void*)0);
glEnableVertexAttribArray(0);
This way we tell VA to read 3 floating-point values per vertex from the currently bound buffer, starting at offset 0 with a stride of 3 * sizeof(float), without normalizing them, and to feed this data to the vertex attribute at location 0.
This way, the shaders finally get all the data they need to work and our triangle is drawn.
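To make the offset and stride arguments concrete, here is a CPU-side sketch of where the attribute of a given vertex sits in the buffer. The function attribByteOffset is a made-up helper, not an OpenGL call; it only mirrors the addressing rule, including the convention that a stride of 0 means tightly packed data.

```cpp
#include <cassert>
#include <cstddef>

// Byte position of one attribute of vertex `vertexIndex` inside the bound buffer.
// `offset` and `stride` mirror the last two arguments of glVertexAttribPointer;
// `attribSize` is the attribute's own size in bytes (3 floats for a vec3).
// A stride of 0 means "tightly packed": consecutive attributes sit back to back.
std::size_t attribByteOffset(std::size_t offset, std::size_t stride,
                             std::size_t attribSize, std::size_t vertexIndex) {
    const std::size_t effectiveStride = (stride == 0) ? attribSize : stride;
    return offset + vertexIndex * effectiveStride;
}

void attribExample() {
    const std::size_t vec3Size = 3 * sizeof(float);
    // Offset 0, stride 3 * sizeof(float): vertex 2 starts two vec3s in.
    assert(attribByteOffset(0, vec3Size, vec3Size, 2) == 2 * vec3Size);
    // Stride 0 addresses the same bytes for tightly packed data.
    assert(attribByteOffset(0, 0, vec3Size, 2) == 2 * vec3Size);
}
```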
Question
However, what if we wanted to change one vertex after we've already passed the data to the buffer?
In my understanding, we'd need to call glBufferData the same way as before. But how does it influence the data that was originally in the buffer? Does it overwrite it?
If it does overwrite it, how do we pass another data, let's say colors, without overwriting the positions?
If it doesn't, how does VA know the data it's "pointing" to is no longer up to date?

The buffer data can be updated with glBufferSubData (or by mapping the buffer). glBufferData creates a new data store of a fixed size and discards any existing one. Therefore, you cannot append additional data to an existing buffer; you must first create a buffer that is large enough.
However you can create a separate buffer for additional attributes.
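The difference between the two calls can be modeled on the CPU. The MockBuffer type below is purely illustrative and not real OpenGL: bufferData stands in for glBufferData (replace the whole store) and bufferSubData for glBufferSubData (overwrite a range that must already fit).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// CPU-side stand-in for a buffer object's data store (not real OpenGL).
struct MockBuffer {
    std::vector<unsigned char> store;

    // Like glBufferData: discard the old store and allocate `size` bytes.
    // (Real GL leaves the contents undefined when `data` is NULL; this mock zero-fills.)
    void bufferData(std::size_t size, const void* data) {
        store.assign(size, 0);
        if (data != nullptr && size > 0) std::memcpy(store.data(), data, size);
    }

    // Like glBufferSubData: overwrite [offset, offset + size) in place.
    // Returns false where GL would raise GL_INVALID_VALUE (range does not fit).
    bool bufferSubData(std::size_t offset, std::size_t size, const void* data) {
        if (offset > store.size() || size > store.size() - offset) return false;
        if (size > 0) std::memcpy(store.data() + offset, data, size);
        return true;
    }
};

void mockBufferExample() {
    const float position[3] = {1.0f, 2.0f, 3.0f};
    MockBuffer vb;
    vb.bufferData(sizeof(position), position);
    const float newY = 9.0f;
    const bool updated = vb.bufferSubData(sizeof(float), sizeof(float), &newY);
    const bool grown = vb.bufferSubData(sizeof(position), sizeof(float), &newY);
    assert(updated);   // changing y alone leaves x and z untouched
    assert(!grown);    // writing past the end is rejected, the store does not grow
}
```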

In my understanding, we'd need to call glBufferData the same way as before. But how does it influence the data that was originally in the buffer? Does it overwrite it?
glBufferSubData() lets you overwrite a subrange of a buffer, so you can use it to selectively update parts of one.
If it does overwrite it, how do we pass another data, let's say colors, without overwriting the positions?
The simplest way to do that is to structure your buffer so that it's composed of separate sequential buffers instead of being one big interleaved buffer.
It would look roughly like so (take note of the stride parameter set to 0, which tells the driver that each attribute's values are tightly packed rather than interleaved with other data):
size_t coord_start = 0;
size_t normal_start = coord_start + coord_data_len;
size_t color_start = normal_start + normal_data_len;
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, (void*)coord_start );
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, (void*)normal_start );
glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, 0, (void*)color_start );
This way, when you call glBufferSubData() to update the memory range that contains colors, you will be leaving the coord and normal data alone.
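A sketch of how those start offsets could be computed, assuming every attribute uses 3 floats per vertex as in the calls above. BlockLayout and makeBlockLayout are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Byte layout of one buffer holding three tightly packed blocks:
// all coordinates, then all normals, then all colors (3 floats per vertex each).
struct BlockLayout {
    std::size_t coordStart;
    std::size_t normalStart;
    std::size_t colorStart;
    std::size_t totalSize;
};

BlockLayout makeBlockLayout(std::size_t vertexCount) {
    const std::size_t blockSize = vertexCount * 3 * sizeof(float);
    BlockLayout layout;
    layout.coordStart  = 0;
    layout.normalStart = layout.coordStart + blockSize;
    layout.colorStart  = layout.normalStart + blockSize;
    layout.totalSize   = layout.colorStart + blockSize;
    return layout;
}

// Updating only the colors would then be, roughly:
//   glBufferSubData(GL_ARRAY_BUFFER, layout.colorStart, blockSize, newColors);
void blockLayoutExample() {
    const BlockLayout layout = makeBlockLayout(3);   // one triangle
    assert(layout.colorStart == 2 * 3 * 3 * sizeof(float));
}
```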
Alternatively, you can also create separate buffers for different attributes.
If it doesn't, how does VA know the data it's "pointing" to is no longer up to date?
This is where things get kinda tricky. On paper, that's the driver's problem and you shouldn't worry about it.
In practice, because the GPU runs in parallel from what is happening on the CPU, you can end up in a circumstance where updating a buffer that's currently being used can cause some slowdowns as the synchronization resolves itself.
Because of this, it's sometimes preferable to just create a brand new buffer and fill it "from scratch" instead of updating an existing one. This way, you can be confident that the buffer update process doesn't step on the toes of any rendering making use of the buffer. Whether sub-updates or fresh buffers are the better choice depends on a lot of factors though.

Related

Is it necessary to bind all VBOs (and textures) each frame?

I'm following a basic tutorial on OpenGL 3.0. What is not clear to me is why/if I have to bind, enable and unbind/disable all vertex buffers and textures each frame.
To me it seems too much gl**** calls which I guess have some overhead. For example here you see each frame several blocks like:
// do this for each mesh in scene
// vertexes
glEnableVertexAttribArray(0);
glBindBuffer(GL_ARRAY_BUFFER, vertex_buffer);
glVertexAttribPointer( 0, 3, GL_FLOAT,GL_FALSE,0,(void*)0);
// normals
glEnableVertexAttribArray(1);
glBindBuffer(GL_ARRAY_BUFFER, normal_buffer );
glVertexAttribPointer( 1, 3, GL_FLOAT,GL_FALSE,0,(void*)0);
// UVs
glEnableVertexAttribArray(2);
glBindBuffer(GL_ARRAY_BUFFER, uv_buffer );
glVertexAttribPointer( 2, 2, GL_FLOAT,GL_FALSE,0,(void*)0);
// ...
glDrawArrays(GL_TRIANGLES, 0, nVerts );
// ...
glDisableVertexAttribArray(0);
glDisableVertexAttribArray(1);
glDisableVertexAttribArray(2);
Imagine you have not just one but 100 different meshes, each with its own VBOs for vertices, normals and UVs. Should I really do this procedure each frame for each of them? Sure, I can encapsulate that complexity into some functions/objects, but I worry about the overhead of these gl**** function calls.
Is it not possible to move some part of this machinery from the per-frame loop into scene setup?
Also, I read that a VAO is a way to pack the corresponding VBOs for one object together, and that binding a VAO automatically binds the corresponding VBOs. So I was thinking that maybe one VAO for each mesh (not instance) is how it should be done, but according to this answer it does not seem so?
First things first: Your concerns about GL call overhead have been addressed with the introduction of Vertex Array Objects (see @Criss's answer). However, the real problem with your train of thought is that you equate VBOs with geometry meshes, i.e. give each geometry its own VBO.
That's not how you should see and use VBOs. VBOs are chunks of memory and you can put the data of several objects into a single VBO; you don't have to draw the whole thing, you can limit draw calls to subsets of a VBO. And you can coalesce geometries with similar or even identical drawing setup and draw them all at once with a single draw call. Either by having the right vertex index list, or by use of instancing.
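As a rough sketch of putting several objects into one VBO, the helper below appends position arrays into one shared array and records the vertex range each mesh occupies. MeshRange and packMeshes are hypothetical names; the ranges map onto the first and count arguments of glDrawArrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Where one mesh lives inside a shared vertex buffer, measured in vertices.
struct MeshRange {
    std::size_t first;   // `first` argument of glDrawArrays
    std::size_t count;   // `count` argument of glDrawArrays
};

// Append several meshes (flat x, y, z position arrays) to `packed` and
// record the range each one occupies.
std::vector<MeshRange> packMeshes(const std::vector<std::vector<float>>& meshes,
                                  std::vector<float>& packed) {
    std::vector<MeshRange> ranges;
    for (const std::vector<float>& mesh : meshes) {
        MeshRange range;
        range.first = packed.size() / 3;
        range.count = mesh.size() / 3;
        packed.insert(packed.end(), mesh.begin(), mesh.end());
        ranges.push_back(range);
    }
    return ranges;
}

void packExample() {
    std::vector<std::vector<float>> meshes;
    meshes.push_back(std::vector<float>(9, 0.0f));    // one triangle
    meshes.push_back(std::vector<float>(18, 0.0f));   // two triangles
    std::vector<float> packed;
    const std::vector<MeshRange> ranges = packMeshes(meshes, packed);
    // The second mesh would be drawn with glDrawArrays(GL_TRIANGLES, 3, 6).
    assert(ranges[1].first == 3);
    assert(ranges[1].count == 6);
}
```

The packed array would be uploaded once with glBufferData, and each mesh drawn from its own range.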
When it comes to the binding state of textures… well, yeah, that's a bit more annoying. You really have to do the whole binding dance when switching textures. That's why in general you sort geometry by texture/shader before drawing, so that the amount of texture switches is minimized.
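Sorting by state can be sketched with plain data. DrawItem and both functions below are illustrative assumptions, not part of any engine or of OpenGL; the point is only that grouping equal shader/texture pairs reduces the number of rebinds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical draw record: which program and texture a mesh needs.
struct DrawItem {
    unsigned shader;
    unsigned texture;
    unsigned mesh;
};

// Order draws so that items sharing a shader, then a texture, are adjacent.
void sortByState(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(), [](const DrawItem& a, const DrawItem& b) {
        return std::tie(a.shader, a.texture, a.mesh) <
               std::tie(b.shader, b.texture, b.mesh);
    });
}

// Count how often the shader/texture pair changes while drawing in order
// (the first item always counts as one bind).
std::size_t countStateSwitches(const std::vector<DrawItem>& items) {
    std::size_t switches = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shader != items[i - 1].shader ||
            items[i].texture != items[i - 1].texture) {
            ++switches;
        }
    }
    return switches;
}

void sortExample() {
    std::vector<DrawItem> items = {{1, 7, 0}, {1, 8, 1}, {1, 7, 2}, {1, 8, 3}};
    assert(countStateSwitches(items) == 4);   // textures alternate
    sortByState(items);
    assert(countStateSwitches(items) == 2);   // one bind per texture
}
```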
The last 3 or 4 generations of GPUs (as of late 2016) do support bindless textures though, where you can access textures through a 64 bit handle (effectively the address of the relevant data structure in some address space) in the shader. However bindless textures did not yet make it into the core OpenGL standard and you have to use vendor extensions to make use of it.
Another interesting approach (popularized by Id Tech 4) is virtual textures. You can allocate sparsely populated texture objects that are huge in their addressable size, but only part of them actually populated with data. During program execution you determine which areas of the texture are required and swap in the required data on demand.
You should use a vertex array object (generated by glGenVertexArrays). Thanks to it you don't have to perform those calls every time. A vertex array object stores:
Calls to glEnableVertexAttribArray or glDisableVertexAttribArray.
Vertex attribute configurations via glVertexAttribPointer.
Vertex buffer objects associated with vertex attributes by calls to glVertexAttribPointer.
Maybe this will be a better tutorial.
So you can generate a VAO, bind it, perform the calls once and unbind it. Then in the drawing loop you just have to bind the VAO.
Example:
glUseProgram(shaderId);
glBindVertexArray(vaoId);
glDrawArrays(GL_TRIANGLES, 0, 3);
glBindVertexArray(0);
glUseProgram(0);

Why is texture buffer faster than vertex inputs when using instancing in glsl?

I am coding my own rendering engine. Currently I am working on terrain.
I render the terrain using glDrawArraysInstanced. The terrain is made out of a lot of "chunks". Every chunk is one quad which is also one instance of the draw call. Each quad is then tessellated in tessellation shaders. For my shader inputs I use VBOs, instanced VBOs (using vertex attribute divisor) and texture buffers. This is a simple example of one of my shaders:
#version 410 core
layout (location = 0) in vec3 perVertexVector; // VBO attribute
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
uniform samplerBuffer someTextureBuffer; // texture buffer
out vec3 outputVector;
void main()
{
// some processing of the inputs;
outputVector = something...whatever...;
}
Everything works fine and I got no errors. It renders at around 60-70 FPS. But today I was changing the code a bit and I had to change all the instanced VBOs to texture buffers. For some reason the performance doubled and it runs at 120-160 FPS! (sometimes even more!) I didn't change anything else, I just created more texture buffers and used them instead of all instanced attributes.
This was my code for creating the instanced attribute for the shader (simplified to a readable version):
glBindBuffer(GL_ARRAY_BUFFER, VBO);
glBufferData(GL_ARRAY_BUFFER, size, buffer, GL_DYNAMIC_DRAW);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat), (GLvoid*)0);
glEnableVertexAttribArray(0);
glVertexAttribDivisor(0, 1); // this makes the buffer instanced
This is my simplified code for creating texture buffer:
glBindTexture(GL_TEXTURE_BUFFER, textureVBO);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGB32F, VBO);
I don't think I am doing anything wrong because everything works correctly. It's just the performance... I would assume that attributes are faster than textures but I got the opposite result and I am really surprised by the fact that texture buffers are more than two times faster than attributes.
But there is one more thing that I don't understand.
I actually call the render function for the terrain (glDrawArraysInstanced) two times. The first time is to render the terrain and the second time is to render it to the FBO with different transformation matrix for water reflection. When I render it only once (without the reflection) and I use the instanced attributes I get around 90 FPS so that is a bit faster than 60 FPS which I mentioned earlier.
BUT! when I render it only once and I use the texture buffers the difference is really small. It runs just as fast as when I render it two times (around 120-150 fps)!
I am wondering if it uses some kind of caching or something but it doesn't make any sense for me because the vertices are transformed with different matrices each of the two render calls so the shaders output totally different results.
I would really appreciate some explanation for this question:
Why is the texture buffer faster than the instanced attributes?
EDIT:
Here is a summary of my question for better understanding:
The only thing I do is that I change these lines in my glsl code:
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
outputVector = perInstanceVector;
to this:
uniform samplerBuffer textureBuffer; // texture buffer which has the same data as the previous VBO instanced attribute
outputVector = texelFetch(textureBuffer, gl_InstanceID).xyz;
Everything works exactly as before but it is twice as fast in terms of performance.
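The reason both variants read the same data can be modeled on the CPU. This is a simplified sketch that ignores any base instance; attribElementIndex and texelIndex are made-up helpers that mirror the indexing rules, not OpenGL functions.

```cpp
#include <cassert>
#include <cstddef>

// Which array element a vertex attribute reads (base instance ignored).
// divisor == 0: one element per vertex.
// divisor  > 0: the element advances once every `divisor` instances.
std::size_t attribElementIndex(std::size_t vertexId, std::size_t instanceId,
                               std::size_t divisor) {
    return (divisor == 0) ? vertexId : instanceId / divisor;
}

// Which texel texelFetch(textureBuffer, gl_InstanceID) reads.
std::size_t texelIndex(std::size_t instanceId) {
    return instanceId;
}

void instancingExample() {
    // With glVertexAttribDivisor(index, 1), instance 5 reads element 5,
    // the same element the texel fetch reads, whatever the vertex index is.
    assert(attribElementIndex(2, 5, 1) == texelIndex(5));
}
```

So the two shaders are functionally equivalent; only the hardware path used to fetch the data differs.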
I see 3 possible reasons:
The shaders could have a different occupancy, as the registers are used differently, so performance can differ quite a bit.
The two kinds of input are not fetched in the same way, and the scheduler may handle the wait for the data better inside the shader than the input assembler does.
Maybe there is less driver overhead with the second one.
Did you try with different numbers of primitives? Or try to use timers?

OpenGL, VAOs and multiple buffers

I am writing a little graphics engine using OpenGL ( via OpenTK with C# ).
To define vertex attributes, I have a VertexDeclaration class with an array of VertexElement structures that are mapped to glEnableVertexAttribArray/glVertexAttribPointer calls.
Also, to support multiple vertex streams, I have a special structure holding a vertex buffer, vertex declaration, vertex offset and instance frequency (like the XNA's VertexBufferBinding structure).
Currently, whenever a drawing call is invoked, I iterate over all the set vertex streams and
bind their vertex buffers, apply vertex declarations, disable unused vertex attributes and draw the primitives.
I would like to use VAOs to cache the glEnableVertexAttribArray calls into them,
and whenever a vertex stream is applied, bind the VAO and change its array buffer binding.
Is that a correct usage of VAOs?
Is that a correct usage of VAOs?
No (see note 1 below).
glVertexAttribPointer uses the buffer object that was bound to GL_ARRAY_BUFFER at the moment the function was called. So you can't do this:
glVertexAttribPointer(...);
glBindBuffer(GL_ARRAY_BUFFER, bufferObject);
glDrawArrays(...);
This will not use bufferObject; it will use whatever was bound to GL_ARRAY_BUFFER when glVertexAttribPointer was originally called.
VAOs capture this state. So the VAO will, for each vertex attribute, store whatever buffer object was bound to GL_ARRAY_BUFFER when glVertexAttribPointer was called for that attribute. This allows you to do things like this:
glBindVertexArray(VAO);
glBindBuffer(GL_ARRAY_BUFFER, buffer1);
glVertexAttribPointer(0, ...);
glVertexAttribPointer(1, ...);
glBindBuffer(GL_ARRAY_BUFFER, buffer2);
glVertexAttribPointer(2, ...);
Attributes 0 and 1 will come from buffer1, and attribute 2 will come from buffer2. VAO now captures all of that state. To render, you just do this:
glBindVertexArray(VAO);
glDraw*();
In short, if you want to change where an attribute's storage comes from in OpenGL, you must also change its format. Even if it's the same format, you must call glVertexAttribPointer again.
1: This discussion assumes you're not using the new ARB_vertex_attrib_binding. Or, as it is otherwise known, "Exactly how Direct3D does vertex attribute binding." If you happen to be using an implementation that offers this extension, you can effectively do what you're talking about, because the attribute format is not tied with the buffer object's storage. Also, the tortured logic of glVertexAttribPointer is gone.
In general, the way we solve this in the OpenGL world is to put as many things as possible in the same buffer object. Failing that, just use one VAO for each object.
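The latching behavior described in this answer can be modeled with a toy state machine. MockVertexState is an illustration only, not real OpenGL: the pointer call copies the current array-buffer binding into per-attribute state, and later rebinds do not touch it.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of the relevant state (not real OpenGL): one global array-buffer
// binding plus, per attribute, the buffer captured by the pointer call.
struct MockVertexState {
    static const std::size_t kMaxAttribs = 16;
    unsigned arrayBufferBinding = 0;
    unsigned attribBuffer[kMaxAttribs] = {};

    // Like glBindBuffer(GL_ARRAY_BUFFER, buffer): changes the binding only.
    void bindArrayBuffer(unsigned buffer) { arrayBufferBinding = buffer; }

    // Like glVertexAttribPointer(index, ...): latches the current binding.
    void vertexAttribPointer(std::size_t index) {
        attribBuffer[index] = arrayBufferBinding;
    }
};

void latchExample() {
    MockVertexState state;
    state.bindArrayBuffer(1);
    state.vertexAttribPointer(0);
    state.bindArrayBuffer(2);              // rebinding afterwards...
    assert(state.attribBuffer[0] == 1);    // ...does not retarget attribute 0
}
```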

OpenGL structure of VAO/VBO for model with moving parts?

I came from this question:
opengl vbo advice
I use OpenGL 3.3 and will not use deprecated features. I'm using Assimp to import my Blender models, but I'm a bit confused as to how much I should split them up in terms of VAOs and VBOs.
First off, a little side question. I use glDrawElements; does that mean I cannot interleave my vertex attributes, or can the VAO figure out where my vertex position is from glVertexAttribPointer and the glDrawElements offset?
The main question, I guess, boils down to: how do I structure my VAOs/VBOs for a model with multiple moving parts and multiple meshes per part?
Each node in Assimp can contain multiple meshes, where each mesh has a texture, vertices, normals, a material etc. The nodes in Assimp contain the transformations. Say I have a ship with a cannon turret on it, and I want to be able to rotate the turret. Does this mean I will make the ship node a separate VAO with VBOs for each mesh containing its attributes (or multiple VBOs etc.)?
I guess it goes like
draw(ship); //call to draw ship VAO
pushMatrix(turretMatrix) //updating uniform modelview matrix for the shader
draw(turret); //call to draw turret VAO
I don't fully understand UBO(uniform buffer objects) yet, but it seems i can pass in multiple uniforms, will that help me contain a full model with moveable parts in a single VAO?
First off, a VAO only "remembers" the last vertex attribute bindings (and the binding of an index buffer, GL_ELEMENT_ARRAY_BUFFER_BINDING, if there is one). So it does not remember offsets in glDrawElements(); you need to call that later when using the VAO. It also does not prevent you from using interleaved vertex arrays. Let me try to explain:
GLuint vbo[3];
glGenBuffers(3, vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo[0]);
glBufferData(GL_ARRAY_BUFFER, size0, data0, GL_STATIC_DRAW); // arguments: target, size in bytes, data, usage hint
glBindBuffer(GL_ARRAY_BUFFER, vbo[1]);
glBufferData(GL_ARRAY_BUFFER, size1, data1, GL_STATIC_DRAW);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vbo[2]);
glBufferData(GL_ELEMENT_ARRAY_BUFFER, size2, data2, GL_STATIC_DRAW);
// create some buffers and fill them with data
GLuint vao;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
// create a VAO
{
glBindBuffer(GL_ARRAY_BUFFER, vbo[0]); // not saved in VAO
glVertexAttribPointer(0, 3, GL_FLOAT, false, 3 * sizeof(float), NULL); // this is VAO saved state
glEnableVertexAttribArray(0); // this is VAO saved state
// sets up one vertex attrib array from vbo[0] (say positions)
glBindBuffer(GL_ARRAY_BUFFER, vbo[1]); // not saved in VAO
glVertexAttribPointer(1, 3, GL_FLOAT, false, 5 * sizeof(float), NULL); // this is VAO saved state
glVertexAttribPointer(2, 2, GL_FLOAT, false, 5 * sizeof(float), (const void*)(3 * sizeof(float))); // this is VAO saved state; texcoords start after the 3 normal floats
glEnableVertexAttribArray(1); // this is VAO saved state
glEnableVertexAttribArray(2); // this is VAO saved state
// sets up two more VAAs from vbo[1] (say normals interleaved with texcoords)
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vbo[2]); // this is VAO saved state
// uses the third buffer as the source for indices
}
// set up state that VAO "remembers"
glBindVertexArray(0); // bind different vaos, etc ...
Later ...
glBindVertexArray(vao); // bind our VAO (so we have VAAs 0, 1 and 2 as well as index buffer)
glDrawElements(GL_TRIANGLE_STRIP, 57, GL_UNSIGNED_INT, NULL);
glDrawElements(GL_TRIANGLE_STRIP, 23, GL_UNSIGNED_INT, (const void*)(57 * sizeof(unsigned int)));
// draws two parts of the mesh as triangle strips
So you see ... you can draw interleaved vertex arrays using glDrawElements using a single VAO and one or more VBOs.
To answer the second part of your question, you either can have different VAOs and VBOs for different parts of the mesh (so drawing separate parts is easy), or you can fuse all into one VAO VBO pair (so you need not call glBind*() often) and use multiple glDraw*() calls to draw individual parts of the mesh (as seen in the code above - imagine the first glDrawElements() draws the ship and the second draws the turret, you just update some matrix uniform between the calls).
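For the single-VAO variant, the count and byte offset of each glDrawElements call follow from the index counts of the parts. A sketch, assuming the parts' indices are stored back to back as unsigned int values; PartRange and partRanges are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One drawable part (for example ship or turret) inside a shared index buffer.
struct PartRange {
    std::size_t indexCount;   // `count` argument of glDrawElements
    std::size_t byteOffset;   // `indices` argument, as a byte offset
};

// Given the index count of each part, compute the arguments for one
// glDrawElements call per part.
std::vector<PartRange> partRanges(const std::vector<std::size_t>& indexCounts) {
    std::vector<PartRange> ranges;
    std::size_t firstIndex = 0;
    for (std::size_t count : indexCounts) {
        PartRange range;
        range.indexCount = count;
        range.byteOffset = firstIndex * sizeof(unsigned int);
        ranges.push_back(range);
        firstIndex += count;
    }
    return ranges;
}

void partExample() {
    const std::vector<std::size_t> counts = {57, 23};   // ship, turret
    const std::vector<PartRange> ranges = partRanges(counts);
    // Matches the second draw call above: 23 indices starting at index 57.
    assert(ranges[1].indexCount == 23);
    assert(ranges[1].byteOffset == 57 * sizeof(unsigned int));
}
```

Between the two draws you would update the model matrix uniform, as described above.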
Because shaders can contain multiple modelview matrices in uniforms, you can also encode mesh id as another vertex attribute, and let the vertex shader choose which matrix to use to transform the vertex, based on this attribute. This idea can also be extended to using multiple matrices per a single vertex, with some weights assigned for each matrix. This is commonly used when animating organic objects such as player character (look up "skinning").
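The weighted-matrix idea can be sketched on the CPU. This is only an illustration of the blend a vertex shader would perform, under the assumptions of column-major 4x4 matrices and weights that sum to 1; all names here are made up.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec3 = std::array<float, 3>;
using Mat4 = std::array<float, 16>;   // column-major: element (row, col) is m[col * 4 + row]

// Transform a point (w = 1) by a column-major 4x4 matrix.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    Vec3 r = {{0.0f, 0.0f, 0.0f}};
    for (std::size_t row = 0; row < 3; ++row) {
        r[row] = m[row] * p[0] + m[4 + row] * p[1] + m[8 + row] * p[2] + m[12 + row];
    }
    return r;
}

// Blend the results of several matrices; weights are expected to sum to 1.
template <std::size_t N>
Vec3 skinPoint(const Vec3& p, const std::array<Mat4, N>& bones,
               const std::array<float, N>& weights) {
    Vec3 out = {{0.0f, 0.0f, 0.0f}};
    for (std::size_t i = 0; i < N; ++i) {
        const Vec3 t = transformPoint(bones[i], p);
        for (std::size_t k = 0; k < 3; ++k) out[k] += weights[i] * t[k];
    }
    return out;
}

// Helper for the example: a pure translation matrix.
Mat4 translation(float x, float y, float z) {
    return Mat4{{1.0f, 0.0f, 0.0f, 0.0f,
                 0.0f, 1.0f, 0.0f, 0.0f,
                 0.0f, 0.0f, 1.0f, 0.0f,
                 x,    y,    z,    1.0f}};
}

void skinExample() {
    const std::array<Mat4, 2> bones = {{translation(0.0f, 0.0f, 0.0f),
                                        translation(2.0f, 0.0f, 0.0f)}};
    const std::array<float, 2> weights = {{0.5f, 0.5f}};
    const Vec3 p = {{1.0f, 0.0f, 0.0f}};
    const Vec3 r = skinPoint(p, bones, weights);
    assert(r[0] == 2.0f);   // halfway between x = 1 and x = 3
}
```

Selecting a single matrix by mesh id is the special case where one weight is 1 and the rest are 0.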
As far as uniform buffer objects go, the only advantage is that you can pack a lot of data into them and that they can be easily shared between shaders (just bind the UBO to any shader that is able to use it). There is no real advantage in using them for you, unless you have objects with 1000s of matrices.
Also, i wrote the source codes above from memory. Let me know if there are some errors / problems ...
@theswine
Not binding this during VAO initialization causes my program to crash, but binding it after binding the VAO causes it to run correctly. Are you sure this isn't saved in the VAO?
glBindBuffer(GL_ARRAY_BUFFER, vbo[0]); // not saved in VAO
(BTW: sorry for bringing up an old topic, I just thought this could be useful to others, this post sure was! (which reminds me, thank you!!))

Storing different vertex attributes in different VBO's

Is it possible to store different vertex attributes in different vertex buffers?
All the examples I've seen so far do something like this
float data[] =
{
//position
v1x, v1y, v1z,
v2x, v2y, v2z,
...
vnx, vny, vnz,
//color
c1r, c1g, c1b,
c2r, c2g, c2b,
...
cnr, cng, cnb,
};
GLuint buffname;
glGenBuffers(1, &buffname);
glBindBuffer(GL_ARRAY_BUFFER, buffname);
glBufferData(GL_ARRAY_BUFFER, sizeof(data), data, GL_STATIC_DRAW);
And the drawing is done something like this:
glBindBuffer(GL_ARRAY_BUFFER, buffname);
glEnableVertexAttribArray(position_location);
glEnableVertexAttribArray(color_location);
glVertexAttribPointer(position_location, 3, GL_FLOAT, GL_FALSE, 0, 0);
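glVertexAttribPointer(color_location, 3, GL_FLOAT, GL_FALSE, 0, (void*)(3 * n * sizeof(float))); // byte offset: colors start after n positions of 3 floats each
glDrawArrays(GL_TRIANGLES, 0, n); // count is the number of vertices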
glDisableVertexAttribArray(position_location);
glDisableVertexAttribArray(color_location);
glBindBuffer(GL_ARRAY_BUFFER, 0);
Isn't it possible to store position data and color data in different VBOs? The problem is I don't understand how this would work out because you can't bind two buffers at once, can you?
If there is a simple but inefficient solution, I would prefer it over a more complicated but efficient solution because I am in primary learning state and I don't want to complicate things too much.
Also, if what I'm asking is possible, is it a good idea or not?
To clarify: I do understand how I could store different attributes in different VBO's. I don't understand how I would later draw them.
The association between attribute location X and the buffer object that provides that attribute is made with the glVertexAttribPointer command. The way it works is simple, but unintuitive.
At the time glVertexAttribPointer is called (that's the part a lot of people don't get), whatever buffer object is currently bound to GL_ARRAY_BUFFER becomes associated with the attribute X, where X is the first parameter of glVertexAttribPointer.
So if you want to have an attribute that comes from one buffer and an attribute that comes from another, you do this:
glEnableVertexAttribArray(position_location);
glEnableVertexAttribArray(color_location);
glBindBuffer(GL_ARRAY_BUFFER, buffPosition);
glVertexAttribPointer(position_location, 3, GL_FLOAT, GL_FALSE, 0, 0);
glBindBuffer(GL_ARRAY_BUFFER, buffColor);
glVertexAttribPointer(color_location, 3, GL_FLOAT, GL_FALSE, 0, 0);
As for whether you should split attributes into different buffers... I would say that you should only do it if you have a demonstrable need.
For example, let's say you're doing a dynamic height-map, perhaps for some kind of water effect. The Z position of each element changes, but this also means that the normals change. However, the XY positions and the texture coordinates do not change.
Efficient streaming often requires either double-buffering buffer objects or invalidating them (reallocating them with glBufferData and a NULL data pointer, or mapping them with glMapBufferRange and GL_MAP_INVALIDATE_BUFFER_BIT). Either way only works if the streamed data is in a different buffer object from the non-streamed data.
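Double-buffering can be as simple as a round-robin choice of which buffer object to write each frame. BufferRing is a hypothetical helper, and the buffer names in the example are placeholders for what glGenBuffers would return.

```cpp
#include <cassert>
#include <cstddef>

// Round-robin choice among N buffer objects: each frame writes to a different
// buffer than the previous frame, so the GPU can keep reading the old one.
template <std::size_t N>
struct BufferRing {
    unsigned buffers[N] = {};   // buffer object names (from glGenBuffers in real code)
    std::size_t frame = 0;

    unsigned writeTarget() const { return buffers[frame % N]; }
    void nextFrame() { ++frame; }
};

void ringExample() {
    BufferRing<2> ring;
    ring.buffers[0] = 10;   // placeholder names
    ring.buffers[1] = 11;
    const unsigned first = ring.writeTarget();
    ring.nextFrame();
    const unsigned second = ring.writeTarget();
    assert(first != second);   // consecutive frames never write the same buffer
}
```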
Another example of a demonstrable need is if memory is a concern and several objects share certain attribute lists. Perhaps objects have different position and normal arrays but the same color and texture coordinate arrays. Or something like that.
But otherwise, it's best to just put everything for an object into one buffer. Even if you don't interleave the arrays.
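If you do want to interleave, the arrays can be merged on the CPU before upload. A minimal sketch, assuming positions and colors both use 3 floats per vertex; interleave is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build one interleaved array (x, y, z, r, g, b per vertex) from separate
// position and color arrays holding 3 floats per vertex each.
std::vector<float> interleave(const std::vector<float>& positions,
                              const std::vector<float>& colors) {
    assert(positions.size() == colors.size() && positions.size() % 3 == 0);
    std::vector<float> out;
    out.reserve(positions.size() * 2);
    for (std::size_t v = 0; v < positions.size() / 3; ++v) {
        out.insert(out.end(), positions.begin() + 3 * v, positions.begin() + 3 * v + 3);
        out.insert(out.end(), colors.begin() + 3 * v, colors.begin() + 3 * v + 3);
    }
    return out;
}

// With this layout both attributes use a stride of 6 * sizeof(float);
// positions start at byte 0 and colors at byte 3 * sizeof(float).
void interleaveExample() {
    const std::vector<float> positions = {0.0f, 0.0f, 0.0f, 1.0f, 0.0f, 0.0f};
    const std::vector<float> colors    = {1.0f, 0.0f, 0.0f, 0.0f, 1.0f, 0.0f};
    const std::vector<float> mixed = interleave(positions, colors);
    assert(mixed.size() == 12);
    assert(mixed[3] == 1.0f);   // first vertex's red channel follows its position
}
```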