How does OpenGL differentiate binding points in a VAO from ones defined with glBindBufferBase?

I am writing a particle simulation which uses OpenGL >= 4.3 and came upon a "problem" (or rather the lack of one), which confuses me.
For the compute shader part, I use various GL_SHADER_STORAGE_BUFFERs which are bound to binding points via glBindBufferBase().
One of these GL_SHADER_STORAGE_BUFFERs is also used in the vertex shader to supply normals needed for rendering.
The binding in both the compute shader and the vertex shader (together called shaders 1 below) looks like this:
OpenGL part:
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, normals_ssbo);
GLSL part:
...
layout(std430, binding = 1) buffer normals_ssbo
{
    vec4 normals[];
};
...
The interesting part is that in a separate shader program with a different vertex shader (below called shader 2), binding point 1 is (re-)used like this:
GLSL:
layout(location = 1) in vec4 Normal;
but in this case, the normals come from a different buffer object and the binding is done using a VAO, like this:
OpenGL:
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, 0);
As you can see, the binding point and the layout of the data (both are vec4) are the same, but the actual buffer objects differ.
Now to my questions:
Why does the VAO of shader 2, which is created and used after setting up shaders 1 (which use glBindBufferBase for binding), seemingly overwrite the binding point, while shaders 1 still remember the SSBO binding and work fine without calling glBindBufferBase again before using them?
How does OpenGL know which of those two buffer objects the binding point (which in both cases is 1) should use? Are binding points created via a VAO and via glBindBufferBase simply completely separate things? If that's the case, why does something like this NOT work:
layout(std430, binding = 1) buffer normals_ssbo
{
    vec4 normals[];
};
layout(location = 1) in vec4 Normal;

Are binding points created via VAO and glBindBufferBase simply completely separate things?
Yes, they are. That's why they're set by two different functions.
If that's the case, why does something like this NOT work:
Two possibilities present themselves: you implemented it incorrectly on the rendering side, or your driver has a bug. Which it is cannot be determined without seeing your actual code.
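To make the separation concrete, here is a minimal sketch (the names vao and attrib_vbo are hypothetical; normals_ssbo is the question's buffer) in which "index 1" is used twice without conflict, because each call indexes a different binding table:
// SSBO binding point 1 of the indexed GL_SHADER_STORAGE_BUFFER target;
// read by any program whose interface block declares binding = 1:
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, normals_ssbo);
// Vertex attribute 1, whose buffer association is recorded in the
// currently bound VAO; read by a vertex shader input at location = 1:
glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, attrib_vbo);
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, (GLvoid*)0);
glEnableVertexAttribArray(1);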

Related

Why does this shader to draw a triangle in OpenGL not get run multiple times?

I am learning to use OpenGL through some YouTube tutorials online. At 24:43 of this video is the code I am talking about: https://www.youtube.com/watch?v=71BLZwRGUJE&list=PLlrATfBNZ98foTJPJ_Ev03o2oq3-GGOS2&index=7
In the previous video of the series, the guy says that the vertex shader is run 3 times (for a triangle) and the fragment shader is run once for every pixel within the shape. However, in the video I have linked there is nothing telling the vertex shader to run 3 times, and nothing telling the fragment shader to run multiple times either. Can someone please explain why?
Also, I am struggling to understand the terminology being used. For example, the vertex shader contains the code in vec4 position, and the fragment shader contains out vec4 color. I searched Google a lot but couldn't find an explanation of what these mean anywhere.
1.
A vertex shader is executed for each vertex of the primitives that need to be drawn. Since only a triangle (i.e. a primitive with three vertices) is being drawn in the example, the vertex shader is obviously executed three times, once for each vertex of that triangle. The scheduling of the vertex shaders is done by OpenGL itself. The user does not need to take care of this.
A fragment shader is executed for each fragment generated by the rasterizer (i.e. the rasterizer breaks primitives down into discrete elements called fragments). A fragment roughly corresponds to a pixel, though the mapping is not one-to-one: depending on the scene to draw, some pixels receive no fragment and some receive more than one. The scheduling of the fragments is done by OpenGL itself. The user does not need to take care of this.
The user effectively only configures the configurable stages of the pipeline, binds the programmable shaders, binds the shader input and output resources, and binds the geometry resources (vertex and index buffers, topology). The latter corresponds in the example to the vertex buffer containing the three vertices of the triangle, and the GL_TRIANGLES topology.
So given the example:
// The buffer ID.
unsigned int buffer;
// Generate one buffer object:
glGenBuffers(1, &buffer);
// Bind the newly created buffer to the GL_ARRAY_BUFFER target:
glBindBuffer(GL_ARRAY_BUFFER, buffer);
// Copies the previously defined vertex data into the buffer's memory:
glBufferData(GL_ARRAY_BUFFER, 6 * sizeof(float), positions, GL_STATIC_DRAW);
// Set the vertex attributes pointers
glEnableVertexAttribArray(0);
glVertexAttribPointer(0, 2, GL_FLOAT, GL_FALSE, sizeof(float) * 2, 0);
...
// Re-bind the buffer as the vertex buffer before drawing:
glBindBuffer(GL_ARRAY_BUFFER, buffer);
...
// Draw a triangle list for the triangles with the vertices at indices [0,3) = 1 triangle:
glDrawArrays(GL_TRIANGLES, 0, 3);
2.
layout(location = 0) in vec4 position;
A user-defined input value to a vertex shader (i.e. vertex attribute) of type vec4 (a vector of 4 floats) with name position. In the example, each vertex has a position which needs to be transformed properly in the vertex shader before passing eventually to the rasterizer (assignment to gl_Position).
3.
layout(location = 0) out vec4 color;
A user-defined output value from a fragment shader of type vec4 (a vector of 4 floats) with name color. In the example, the fragment shader outputs a constant color (e.g., red) for each fragment, to be eventually written to the back buffer.
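Put together, a minimal matching pair could look like this, written as C++ string constants ready to be handed to glShaderSource (a hedged sketch; the #version chosen and the constant red output are assumptions, not the tutorial's exact code):
const char* vertex_src = R"(
#version 330 core
layout(location = 0) in vec4 position; // per-vertex input, attribute 0
void main() { gl_Position = position; } // forward the position to the rasterizer
)";
const char* fragment_src = R"(
#version 330 core
layout(location = 0) out vec4 color; // per-fragment output, color attachment 0
void main() { color = vec4(1.0, 0.0, 0.0, 1.0); } // constant red
)";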
References
Some useful OpenGL/GLSL references:
Learn OpenGL
And if you want to skip all the CPU boilerplate and just focus on the shaders themselves, you can take a look at ShaderToy to facilitate prototyping.

OpenGL Compute Shader Binding Point Redundancy

I wrote my first OGL compute shader. After many hours and examples I finally got things working. One part of the program I don't understand is block indices and binding points. (Following this example among others.)
My compute shader has:
layout (std430, binding=2) buffer particles
{
    particle ps[];
};
layout (std430, binding=3) buffer spheres
{
    sphere ss[];
};
The part I am unclear on is the binding=X. My setup code has the following:
GLuint block_index = glGetProgramResourceIndex(
    compute_shader->getProgramId(), GL_SHADER_STORAGE_BLOCK, "particles");
GLuint ssbo_binding_point_index = 2;
GL_CALL(glShaderStorageBlockBinding,
    compute_shader->getProgramId(), block_index, ssbo_binding_point_index);
(Note, GL_CALL just forwards the call and checks for errors afterwards.)
Finally, before each invocation of the compute shader I have:
GL_CALL(glBindBufferBase, GL_SHADER_STORAGE_BUFFER, 2, particles->bufferId());
GL_CALL(glUseProgram, compute_shader->getProgramId());
This works, but it seems overly complicated. Is there a simpler way? Why can't I just query and set the buffer base like any other uniform?
Thanks!
By using layout(binding = #), you're already setting the block binding index for the SSBO. You don't need to set it again from code unless you're changing the index to a new one, and here you're using the same index.
So just bind the buffer and move forward.
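In other words, with layout(std430, binding=2) and binding=3 already in the shader, the whole glGetProgramResourceIndex/glShaderStorageBlockBinding block can be dropped. A minimal sketch of the remaining setup, reusing the question's objects and GL_CALL macro (num_groups_x is a placeholder for the real dispatch size):
GL_CALL(glBindBufferBase, GL_SHADER_STORAGE_BUFFER, 2, particles->bufferId());
GL_CALL(glBindBufferBase, GL_SHADER_STORAGE_BUFFER, 3, spheres->bufferId());
GL_CALL(glUseProgram, compute_shader->getProgramId());
GL_CALL(glDispatchCompute, num_groups_x, 1, 1); // launch the compute work groups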

In OpenGL can I access the buffer I'm drawing in a vertex shader?

I know that each time a vertex shader runs it basically accesses part of the buffer (VBO) being drawn: when drawing vertex number 7, for example, it is basically indexing 7 elements into that VBO, based on the vertex attributes and so on.
layout (location = 0) in vec3 position;
layout (location = 1) in vec3 normal;
layout (location = 2) in vec3 texCoords; // This may be running on the 7th vertex for example.
What I want to do is have access to an earlier part of the VBO for example, so when it's drawing the 7th Vertex I would like to have access to vertex number 1 for example, so that I can interpolate with it.
Seeing that at the time of running the shader it is already indexing into the VBO, I would think that this is possible, but I don't know how to do it.
Thank you.
As you can see in the documentation, vertex attributes are expected to change on every shader run. So no, you can't access attributes defined for other vertices in a vertex shader.
You can probably do this:
Define a uniform array and pass in the values you need. But keep in mind that you are using more memory this way, you need to pass more data etc.
As #Reaper said, you can use a uniform buffer, which can be accessed freely. But the GPU doesn't like random access; it's usually more efficient to stream the data.
You can solve this as well by just adding the data for later/earlier vertices into the array, because in C++ all vertices are at your disposal.
For example if this is the "normal" array:
{
    vertex1_x, vertex1_y, vertex1_z, normal1_x, normal1_y, normal1_z, texCoord1_x, texCoord1_y,
    ...
}
Then you could extend it with data for the other vertex to interpolate with:
{
    vertex1_x, vertex1_y, vertex1_z, normal1_x, normal1_y, normal1_z, texCoord1_x, texCoord1_y,
    vertex2_x, vertex2_y, vertex2_z, normal2_x, normal2_y, normal2_z, texCoord2_x, texCoord2_y,
    ...
}
Actually you can pass any data per vertex. Just make sure that the stride size and offsets are adjusted accordingly in the glVertexAttribPointer parameters, as sketched below.
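A hedged sketch of what the adjusted attribute setup could look like for the extended array above (attribute location 3 for the duplicated data is an assumption): every vertex now occupies 16 floats, so the stride grows accordingly and the duplicated vertex starts at float offset 8.
const GLsizei stride = 16 * sizeof(float); // 8 floats of own data + 8 floats of the other vertex
glEnableVertexAttribArray(0); // own position
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (GLvoid*)0);
glEnableVertexAttribArray(1); // own normal
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, stride, (GLvoid*)(3 * sizeof(float)));
glEnableVertexAttribArray(2); // own texCoords
glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, stride, (GLvoid*)(6 * sizeof(float)));
glEnableVertexAttribArray(3); // position of the vertex to interpolate with
glVertexAttribPointer(3, 3, GL_FLOAT, GL_FALSE, stride, (GLvoid*)(8 * sizeof(float)));
// ...and likewise for the duplicated normal (float offset 11) and texCoords (float offset 14).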

Why is texture buffer faster than vertex inputs when using instancing in glsl?

I am coding my own rendering engine. Currently I am working on terrain.
I render the terrain using glDrawArraysInstanced. The terrain is made out of a lot of "chunks". Every chunk is one quad which is also one instance of the draw call. Each quad is then tessellated in tessellation shaders. For my shader inputs I use VBOs, instanced VBOs (using vertex attribute divisor) and texture buffers. This is a simple example of one of my shaders:
#version 410 core
layout (location = 0) in vec3 perVertexVector; // VBO attribute
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
uniform samplerBuffer someTextureBuffer; // texture buffer
out vec3 outputVector;
void main()
{
    // some processing of the inputs
    outputVector = something...whatever...;
}
Everything works fine and I got no errors. It renders at around 60-70 FPS. But today I was changing the code a bit and had to change all the instanced VBOs to texture buffers. For some reason the performance doubled and it now runs at 120-160 FPS (sometimes even more)! I didn't change anything else; I just created more texture buffers and used them instead of all the instanced attributes.
This was my code for creating an instanced attribute for the shader (simplified to a readable version):
glBindBuffer(GL_ARRAY_BUFFER, VBO);
glBufferData(GL_ARRAY_BUFFER, size, buffer, GL_DYNAMIC_DRAW);
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat), (GLvoid*)0); // location 1 = perInstanceVector in the shader above
glEnableVertexAttribArray(1);
glVertexAttribDivisor(1, 1); // advance this attribute once per instance instead of once per vertex
This is my simplified code for creating a texture buffer:
glBindTexture(GL_TEXTURE_BUFFER, textureVBO);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGB32F, VBO); // expose the VBO's contents as an RGB32F buffer texture
I don't think I am doing anything wrong, because everything works correctly. It's just the performance... I would assume that attributes are faster than textures, but I got the opposite result, and I am really surprised by the fact that texture buffers are more than twice as fast as attributes.
But there is one more thing that I don't understand.
I actually call the render function for the terrain (glDrawArraysInstanced) two times. The first time is to render the terrain itself, and the second time is to render it to an FBO with a different transformation matrix for the water reflection. When I render it only once (without the reflection) and use the instanced attributes, I get around 90 FPS, which is a bit faster than the 60 FPS I mentioned earlier.
BUT when I render it only once and use the texture buffers, the difference is really small: it runs just as fast as when I render it two times (around 120-150 FPS)!
I am wondering if it uses some kind of caching, but that doesn't make sense to me, because the vertices are transformed with different matrices in each of the two render calls, so the shaders output totally different results.
I would really appreciate some explanation for this question:
Why is the texture buffer faster than the instanced attributes?
EDIT:
Here is a summary of my question for better understanding:
The only thing I do is change these lines in my GLSL code:
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
outputVector = perInstanceVector;
to this:
uniform samplerBuffer textureBuffer; // texture buffer which has the same data as the previous VBO instanced attribute
outputVector = texelFetch(textureBuffer, gl_InstanceID).xyz;
Everything works exactly as before but it is twice as fast in terms of performance.
I see 3 possible reasons:
The shaders could have different occupancy, as the registers are used differently, so performance can differ considerably.
Attribute fetching and buffer-texture fetching are not performed the same way, and the scheduler may hide fetch latency better inside the shaders than in the input assembler.
Maybe there is less driver overhead with the second approach.
Did you try with different amounts of primitives, or try using timers?
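For the timer part, a minimal sketch using a GL timer query (available since OpenGL 3.3), which measures the GPU time of whatever is issued between begin and end; in a real renderer the query object would be created once at startup:
GLuint query;
glGenQueries(1, &query);
glBeginQuery(GL_TIME_ELAPSED, query);
// ... issue the instanced draw call being measured ...
glEndQuery(GL_TIME_ELAPSED);
GLuint64 elapsed_ns = 0;
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &elapsed_ns); // blocks until the GPU result is ready
// elapsed_ns now holds the measured GPU time in nanoseconds.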

What's glUniformBlockBinding used for?

Assuming I have a shader program with a UniformBlock at index 0.
The following is apparently enough to bind a UniformBuffer to the block:
glUseProgram(program);
glBindBuffer(GL_UNIFORM_BUFFER, buffer);
glBindBufferBase(GL_UNIFORM_BUFFER, 0, buffer);
I only have to use glUniformBlockBinding when I bind the buffer to a different index than used in the shader program.
//...
glBindBufferBase(GL_UNIFORM_BUFFER, 1, buffer)
glUniformBlockBinding(program, 0, 1); // bind the program's uniform block 0 to binding point 1
Did I understand it right? Would I only have to use glUniformBlockBinding if I use the buffer in different programs where the appropriate blocks have different indices?
Per-program active uniform block indices differ from global binding locations.
The general idea here is that assuming you use the proper layout, you can bind a uniform buffer to one location in GL and use it in multiple GLSL programs. But the mapping between each program's individual buffer block indices and GL's global binding points needs to be established by this command.
To put this in perspective, consider sampler uniforms.
Samplers have a uniform location the same as any other uniform, but that location actually says nothing about the texture image unit the sampler uses. You still bind your textures to GL_TEXTURE7 for instance instead of the location of the sampler uniform.
The only conceptual difference between samplers and uniform buffers in this respect is how the binding location is assigned: for samplers you use glUniform1i (...) to set the index, while for uniform buffers there is a special command, glUniformBlockBinding, that does this.
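A hedged sketch of establishing that mapping by hand (program and buffer are placeholders, and a block named MyUniformBlock like the one below is assumed): query the block's per-program index, point it at global binding point 0, then attach a buffer to that binding point:
GLuint blockIndex = glGetUniformBlockIndex(program, "MyUniformBlock");
glUniformBlockBinding(program, blockIndex, 0); // program's block -> binding point 0
glBindBufferBase(GL_UNIFORM_BUFFER, 0, buffer); // buffer -> binding point 0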
Beginning with GLSL 4.20 (and applied retroactively to older GLSL versions by GL_ARB_shading_language_420pack), you can also establish a uniform block's binding location explicitly from within the shader:
layout (std140, binding = 0) uniform MyUniformBlock
{
    vec4 foo;
    vec4 bar;
};
Done this way, you never have to determine the uniform block index for MyUniformBlock; this block will be bound to 0 at link-time.
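On the application side this then reduces the setup to a single call (a sketch, with buffer a placeholder), since the block is already mapped to binding point 0:
glBindBufferBase(GL_UNIFORM_BUFFER, 0, buffer); // no glUniformBlockBinding call needed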