I wrote my first OpenGL compute shader. After many hours and examples I finally got things working. One part of the program I don't understand is block indices and binding points. (I followed this example, among others.)
My compute shader has:
layout (std430, binding=2) buffer particles
{
particle ps[];
};
layout (std430, binding=3) buffer spheres
{
sphere ss[];
};
The part I am unclear on is the binding=X. My setup code has the following:
GLuint block_index =
glGetProgramResourceIndex(
compute_shader->getProgramId(),
GL_SHADER_STORAGE_BLOCK,
"particles");
GLuint ssbo_binding_point_index = 2;
GL_CALL(glShaderStorageBlockBinding,
compute_shader->getProgramId(),
block_index,
ssbo_binding_point_index);
(Note, GL_CALL just forwards the call and checks for errors afterwards.)
Finally, before each invocation of the compute shader I have:
GL_CALL(glBindBufferBase, GL_SHADER_STORAGE_BUFFER, 2, particles->bufferId());
GL_CALL(glUseProgram, compute_shader->getProgramId());
This works, but it seems overly complicated. Is there a simpler way? Why can't I just query and set the buffer base like any other uniform?
Thanks!
By using layout(binding = #), you're already setting the block binding index for the SSBO. You don't need to set it again from code unless you're changing the index to a new one, and here you're using the same index anyway.
So just bind the buffer and move forward.
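For example, the whole setup from the question collapses to this (a sketch reusing the question's wrapper objects, and assuming a spheres wrapper analogous to particles; the dispatch arguments are placeholders):
// layout(binding = 2/3) in the GLSL already fixes the binding indices,
// so no glGetProgramResourceIndex / glShaderStorageBlockBinding calls are needed.
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, particles->bufferId());
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 3, spheres->bufferId());
glUseProgram(compute_shader->getProgramId());
glDispatchCompute(num_groups_x, 1, 1); // placeholder work-group count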
I am writing a particle simulation which uses OpenGL >= 4.3 and came upon a "problem" (or rather the lack of one), which confuses me.
For the compute shader part, I use various GL_SHADER_STORAGE_BUFFERs which are bound to binding points via glBindBufferBase().
One of these GL_SHADER_STORAGE_BUFFERs is also used in the vertex shader to supply normals needed for rendering.
The binding in both the compute and the vertex shader (these are called shaders 1 below) looks like this:
OpenGL part:
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, normals_ssbo);
GLSL part:
...
layout(std430, binding = 1) buffer normals_ssbo
{
vec4 normals[];
};
...
The interesting part is that in a separate shader program with a different vertex shader (below called shader 2), binding point 1 is (re-)used like this:
GLSL:
layout(location = 1) in vec4 Normal;
but in this case, the normals come from a different buffer object and the binding is done using a VAO, like this:
OpenGL:
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, 0);
As you can see, the binding point and the layout of the data (both are vec4) are the same, but the actual buffer objects differ.
Now to my questions:
Why does the VAO of shader 2, which is created and used after setting up shaders 1 (which use glBindBufferBase for binding), seemingly overwrite (?) the binding point, while shaders 1 still remember the SSBO binding and work fine without calling glBindBufferBase again before using them?
How does OpenGL know which of those two buffer objects the binding point (which in both cases is 1) should use? Are binding points created via VAO and glBindBufferBase simply completely separate things? If that's the case, why does something like this NOT work:
layout(std430, binding = 1) buffer normals_ssbo
{
vec4 normals[];
};
layout(location = 1) in vec4 Normal;
Are binding points created via VAO and glBindBufferBase simply completely separate things?
Yes, they are. That's why they're set by two different functions.
If that's the case, why does something like this NOT work:
Two possibilities present themselves. You implemented it incorrectly on the rendering side, or your driver has a bug. Which is which cannot be determined without seeing your actual code.
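To illustrate the separation, here's a sketch (buffer names are placeholders): the two occurrences of "1" below live in entirely different pieces of state, which is why neither bind disturbs the other.
// Indexed SSBO binding point 1 -- context state, consumed by
// layout(std430, binding = 1) buffer normals_ssbo { ... };
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, normals_ssbo);
// Generic vertex attribute 1 -- VAO state, consumed by
// layout(location = 1) in vec4 Normal;
glBindVertexArray(vao);
glBindBuffer(GL_ARRAY_BUFFER, attrib_vbo);
glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, 0, 0);
glEnableVertexAttribArray(1);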
I am coding my own rendering engine. Currently I am working on terrain.
I render the terrain using glDrawArraysInstanced. The terrain is made out of a lot of "chunks". Every chunk is one quad, which is also one instance of the draw call. Each quad is then tessellated in the tessellation shaders. For my shader inputs I use VBOs, instanced VBOs (using a vertex attribute divisor) and texture buffers. This is a simple example of one of my shaders:
#version 410 core
layout (location = 0) in vec3 perVertexVector; // VBO attribute
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
uniform samplerBuffer someTextureBuffer; // texture buffer
out vec3 outputVector;
void main()
{
// some processing of the inputs;
outputVector = something...whatever...;
}
Everything works fine and I got no errors. It renders at around 60-70 FPS. But today I was changing the code a bit and I had to change all the instanced VBOs to texture buffers. For some reason the performance doubled and it runs at 120-160 FPS! (sometimes even more!) I didn't change anything else, I just created more texture buffers and used them instead of all instanced attributes.
This was my code for creating an instanced attribute for the shader (simplified to a readable version):
glBindBuffer(GL_ARRAY_BUFFER, VBO);
glBufferData(GL_ARRAY_BUFFER, size, buffer, GL_DYNAMIC_DRAW);
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat), (GLvoid*)0);
glEnableVertexAttribArray(0);
glVertexAttribDivisor(0, 1); // this makes the buffer instanced
This is my simplified code for creating texture buffer:
glBindTexture(GL_TEXTURE_BUFFER, textureVBO);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGB32F, VBO);
I don't think I am doing anything wrong, because everything works correctly. It's just the performance... I would assume that attributes are faster than textures, but I got the opposite result and I am really surprised that texture buffers are more than two times faster than attributes.
But there is one more thing that I don't understand.
I actually call the render function for the terrain (glDrawArraysInstanced) two times. The first time is to render the terrain and the second time is to render it to an FBO with a different transformation matrix for the water reflection. When I render it only once (without the reflection) and I use the instanced attributes, I get around 90 FPS, which is a bit faster than the 60 FPS I mentioned earlier.
BUT! When I render it only once and I use the texture buffers, the difference is really small. It runs just as fast as when I render it two times (around 120-150 FPS)!
I am wondering if it uses some kind of caching or something, but that doesn't make any sense to me, because the vertices are transformed with different matrices in each of the two render calls, so the shaders output totally different results.
I would really appreciate some explanation for this question:
Why is the texture buffer faster than the instanced attributes?
EDIT:
Here is a summary of my question for better understanding:
The only thing I do is that I change these lines in my glsl code:
layout (location = 1) in vec3 perInstanceVector; // VBO instanced attribute
outputVector = perInstanceVector;
to this:
uniform samplerBuffer textureBuffer; // texture buffer which has the same data as the previous VBO instanced attribute
outputVector = texelFetch(textureBuffer, gl_InstanceID).xyz;
Everything works exactly as before but it is twice as fast in terms of performance.
I see three possible reasons:
The shaders could have different occupancy because the registers are used differently, so performance can differ quite a bit.
Attribute fetching and texture-buffer fetching don't go through the same path, and the scheduler may handle fetch latency better inside the shader than in the input assembler.
Maybe there is less driver overhead with the second approach.
Did you try rendering different numbers of primitives? Or try using timers?
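For the timer suggestion, GL_TIME_ELAPSED queries (core since GL 3.3) are a rough but easy way to compare the two variants. A sketch, with drawTerrain() standing in for your glDrawArraysInstanced call:
GLuint query;
glGenQueries(1, &query);
glBeginQuery(GL_TIME_ELAPSED, query);
drawTerrain(); // placeholder for the glDrawArraysInstanced call
glEndQuery(GL_TIME_ELAPSED);
GLuint64 elapsed_ns = 0;
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &elapsed_ns); // waits for the GPU to finish
printf("terrain draw: %.3f ms\n", elapsed_ns / 1e6);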
I'm writing my own OpenGL-3D-Application and have stumbled across a little problem:
I want the number of light sources to be dynamic. For this, my shader contains an array of my light struct: uniform PointLight pointLights[NR_POINT_LIGHTS];
The variable NR_POINT_LIGHTS is set by the preprocessor, and the directive for it is generated by my application code (Java). So when creating a shader program, I pass the desired starting number of PointLights, complete the source text with the preprocessor directive, compile, link and use. This works great.
Now I want to change this variable. I re-build the shader source string, re-compile and re-link a new shader program and continue using this one. It just appears that all uniforms set in the old program are lost in the process (of course, I had only set them for the old program).
My ideas on how to fix this:
Don't compile a new program, but rather somehow change the source data for the currently running shaders and somehow re-compile them, to continue using the program with the right uniform values
Copy all uniform data from the old program to the newly generated one
What is the right way to do this, and how do I do it? I'm not very experienced yet and don't know whether either of my ideas is even possible.
You're looking for a Uniform Buffer or (4.3+ only) a Shader Storage Buffer.
struct Light {
vec4 position;
vec4 color;
vec4 direction;
/*Anything else you want*/
};
Uniform Buffer:
const int MAX_ARRAY_SIZE = /*65536 / sizeof(Light)*/;
layout(std140, binding = 0) uniform light_data {
Light lights[MAX_ARRAY_SIZE];
};
uniform int num_of_lights;
Host Code for Uniform Buffer:
glGenBuffers(1, &light_ubo);
glBindBuffer(GL_UNIFORM_BUFFER, light_ubo);
glBufferData(GL_UNIFORM_BUFFER, sizeof(GLfloat) * static_light_data.size(), static_light_data.data(), GL_STATIC_DRAW); //Can be adjusted for your needs
GLuint light_index = glGetUniformBlockIndex(program_id, "light_data");
glBindBufferBase(GL_UNIFORM_BUFFER, 0, light_ubo);
glUniformBlockBinding(program_id, light_index, 0);
glUniform1i(glGetUniformLocation(program_id, "num_of_lights"), static_light_data.size() / 12); //My lights have 12 floats per light, so we divide by 12.
Shader Storage Buffer (4.3+ Only):
layout(std430, binding = 0) buffer light_data {
Light lights[];
};
/*...*/
void main() {
/*...*/
int num_of_lights = lights.length();
/*...*/
}
Host Code for Shader Storage Buffer (4.3+ Only):
glGenBuffers(1, &light_ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, light_ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(GLfloat) * static_light_data.size(), static_light_data.data(), GL_STATIC_DRAW); //Can be adjusted for your needs
light_ssbo_block_index = glGetProgramResourceIndex(program_id, GL_SHADER_STORAGE_BLOCK, "light_data");
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, light_ssbo);
glShaderStorageBlockBinding(program_id, light_ssbo_block_index, 0);
The main differences between the two are that Uniform Buffers:
Are compatible with older, OpenGL 3.x hardware,
Are limited on most systems to 64 KiB (65536 bytes) per buffer,
Need their arrays' [maximum] sizes declared statically at shader compile time.
Whereas Shader Storage Buffers:
Require GL 4.3-capable hardware (roughly, nothing older than five years),
Have an API-mandated minimum allowable size of 16 MB (and most systems will allow up to 25% of the total VRAM),
Can dynamically query the size of any arrays stored in the buffer (though this can be buggy on older AMD systems)
Can be slower than Uniform Buffers on the Shader side (roughly equivalent to a Texture Access)
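With either buffer type, changing the number of lights later becomes a buffer update instead of a shader rebuild. A sketch, assuming a std::vector<Light> lights on the host that mirrors the GLSL struct (names are placeholders):
// Re-upload when the light list changes; no recompile or relink needed.
glBindBuffer(GL_SHADER_STORAGE_BUFFER, light_ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, lights.size() * sizeof(Light), lights.data(), GL_DYNAMIC_DRAW);
// Uniform-buffer variant: also refresh the count uniform
// (the program must currently be in use for glUniform1i):
glUniform1i(glGetUniformLocation(program_id, "num_of_lights"), (GLint)lights.size());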
Don't compile a new program, but rather somehow change the source data for the currently running shaders and somehow re-compile them, to continue using the program with the right uniform values
This isn't doable at runtime, if I'm understanding right (it would imply changing the shader code of an already-compiled shader program), but if you modify the shader source text you can compile a new shader program. The thing is, how often does the number of lights change in your scene? This is a fairly expensive process.
You could specify a maximum number of lights, if you don't mind having a limitation, and only use the lights in the shader that have been populated with information. That saves you the task of tweaking the source text and recompiling a whole new shader program, but it leaves you with a cap on the number of lights. (If you aren't planning on having absolutely loads of lights in your scene, but are planning on having the number of lights change relatively often, this is probably the best option for you.)
However, if you really want to go down the route that you are looking at here:
Copy all uniform data from the old program to the newly generated one
You can look at using a Uniform Block. If you're going to be using shader programs with similar or shared uniforms, Uniform Blocks are a good way of managing those 'universal' uniform variables across your shader programs, or in your case across the shaders you move between as you grow the number of lights. There's a good tutorial on uniform blocks here.
Lastly, depending on the OpenGL version you're using, you might still be able to achieve dynamic array sizes. OpenGL 4.3 introduced the ability to back unsized arrays with buffers, using glBindBufferRange to convey the length of your lights array. You'll see more talk about that functionality in this question and this wiki reference.
The last would probably be my preference, but it depends on if you're aiming at hardware supporting older OpenGL versions.
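For completeness, the glBindBufferRange variant might look like this (a sketch; light_ssbo is assumed to be allocated for some maximum number of entries, with only num_active currently valid):
// Bind only the populated prefix of the buffer; with an unsized std430
// array, lights.length() in the shader then reports num_active.
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, 0, light_ssbo, 0, num_active * sizeof(Light));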
I have an SSBO which stores vec4 colour values for each pixel on screen and is pre-populated with values by a compute shader before the main loop.
I'm now trying to get this data onscreen, which I guess involves using the fragment shader (although if you know a better method for this, I'm open to suggestions).
So I'm trying to get the buffer, or at least the data in it, to the fragment shader so that I can set the colour of each fragment to the corresponding value in the buffer, but I cannot find any way of doing this.
I have been told that I can bind the SSBO to the fragment shader, but I don't know how to do that. Another thought I had was somehow moving the data from the SSBO to a texture, but I can't work that out either.
UPDATE:
In response to thokra's excellent answer and the following comments, here is the code to set up my buffer:
//Create the buffer
GLuint pixelBufferID;
glGenBuffers(1, &pixelBufferID);
//Bind it
glBindBuffer(GL_SHADER_STORAGE_BUFFER, pixelBufferID);
//Set the data of the buffer
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(vec4) * window.getNumberOfPixels, new vec4[window.getNumberOfPixels], GL_DYNAMIC_DRAW);
//Bind the buffer to the correct interface block number
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, pixelBufferID);
Then I call the compute shader, and this part works; I've checked that the data has been populated correctly. Then in my fragment shader, just as a test:
layout(std430, binding=0) buffer PixelBuffer
{
vec4 data[];
} pixelBuffer;
void main()
{
gl_FragColor = pixelBuffer.data[660000];
}
What I've noticed is that it seems to take longer and longer the higher the index gets, so at 660000 it doesn't actually crash, it's just taking a silly amount of time.
Storage buffers work quite similarly to uniform buffers. To get a sense of how those work, I suggest something like this. The main differences are that storage buffers can hold substantially larger amounts of data and that you can read from and write to them at random locations.
There are multiple angles to this, but I'll start with the most basic one: the interface block inside your shader. I will only describe a subset of the possibilities when using interface blocks, but it should be enough to get you started.
In contrast to "normal" variables, you cannot specify buffer variables at global scope. You need to use an interface block (Section 4.3.9 - GLSL 4.40 Spec), as per Section 4.3.7 - GLSL 4.40 Spec:
The buffer qualifier can be used to declare interface blocks (section 4.3.9 “Interface Blocks”), which are then referred to as shader storage blocks. It is a compile-time error to declare buffer variables at global scope (outside a block).
Note that the above mentioned section differs slightly from the ARB extension.
So, to get access to stuff in your storage buffer you'll need to define a buffer interface block inside your fragment shader (or any other applicable stage):
layout (binding = 0) buffer BlockName
{
float values[]; // just as an example
};
Like with any other block without an instance name, you'll refer to the buffer storage as if values were at global scope, e.g.:
void main()
{
// ...
values[0] = 1.f;
// ...
}
On the application level the only thing you now need to know is that the buffer interface block BlockName has the binding 0 after the program has been successfully linked.
After creating a storage buffer object with your application, you first bind the buffer to the binding you specified for the corresponding interface block using
glBindBufferBase(GLenum target, GLuint index, GLuint buffer);
for binding the complete buffer to the index, or
glBindBufferRange(GLenum target, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
for binding a subset of the buffer, specified by an offset and a size, to the index.
Note that index refers to the binding specified in your layout for the corresponding interface block.
And that's basically it. Be aware that there are certain limits for the storage buffer size, the number of binding points, maximum storage block sizes and so on. I refer you to the corresponding sections in the GL and GLSL specs.
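Those limits can also be queried at runtime, e.g. (a sketch):
GLint max_bindings = 0, max_block_size = 0;
glGetIntegerv(GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS, &max_bindings);
glGetIntegerv(GL_MAX_SHADER_STORAGE_BLOCK_SIZE, &max_block_size); // spec minimum is 2^24 bytes
printf("max SSBO bindings: %d, max block size: %d bytes\n", max_bindings, max_block_size);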
Also, there is a minimal example in the ARB extension. Reading the issues section of an extension also often provides further insight into the exposed functionality and the rationale behind it. I advise you to read through it.
Leave a comment if you run into problems.
Using this code I can send one texture to the shader:
devcon->PSSetShaderResources(0, 1, &pTexture);
Of course, I created pTexture with D3DX11CreateShaderResourceViewFromFile.
Shader:
Texture2D Texture;
return color * Texture.Sample(ss, texcoord);
I'm currently only sending one texture to the shader, but I would like to send multiple textures. How is this possible?
Thank you.
You can use multiple textures, as long as their count does not exceed the limits of your shader profile. Here is an example:
HLSL Code:
Texture2D diffuseTexture : register(t0);
Texture2D anotherTexture : register(t1);
C++ Code:
devcon->[V|P|D|G|C|H]SSetShaderResources(texture_index, 1, &texture); // one variant per shader stage (VS, PS, DS, GS, CS, HS)
So for example for above HLSL code it will be:
devcon->PSSetShaderResources(0, 1, &diffuseTextureSRV);
devcon->PSSetShaderResources(1, 1, &anotherTextureSRV); (SRV stands for Shader Resource View)
OR:
ID3D11ShaderResourceView* textures[] = { diffuseTextureSRV, anotherTextureSRV };
devcon->PSSetShaderResources(0, 2, textures);
HLSL names can be arbitrary and don't have to correspond to any specific name; only the indexes matter. While the register(tXX) statements are not required, I'd recommend you use them to avoid confusion as to which texture corresponds to which slot.
By using Texture Arrays. When you fill out your D3D11_TEXTURE2D_DESC look at the ArraySize member. This desc struct is the one that gets passed to ID3D11Device::CreateTexture2D. Then in your shader you use a 3rd texcoord sampling index which indicates which 2D texture in the array you are referring to.
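A sketch of the creation side (all field values here are illustrative):
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = 512;
desc.Height = 512;
desc.MipLevels = 1;
desc.ArraySize = 4; // four 2D slices in one texture array
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc.SampleDesc.Count = 1;
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;
ID3D11Texture2D* textureArray = nullptr;
device->CreateTexture2D(&desc, nullptr, &textureArray); // device is your ID3D11Device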
Update: I just realised you might be talking about doing it over multiple calls (i.e. for different geometry), in which case you update the shader's texture resource view. If you are using the effects framework you can use ID3DX11EffectShaderResourceVariable::SetResource, or alternatively rebind a new texture using PSSetShaderResources. However, if you are trying to blend between multiple textures, then you should use texture arrays.
You may also want to look into 3D textures, which provide a natural way to interpolate between adjacent textures in the array (whereas 2D arrays are clamped to the nearest integer slice) via the 3rd element in the texcoord. See the HLSL sample remarks.
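Creation is analogous, via D3D11_TEXTURE3D_DESC and ID3D11Device::CreateTexture3D (again, field values are illustrative):
D3D11_TEXTURE3D_DESC desc3d = {};
desc3d.Width = 512;
desc3d.Height = 512;
desc3d.Depth = 4; // depth takes the place of ArraySize
desc3d.MipLevels = 1;
desc3d.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
desc3d.Usage = D3D11_USAGE_DEFAULT;
desc3d.BindFlags = D3D11_BIND_SHADER_RESOURCE;
ID3D11Texture3D* volumeTexture = nullptr;
device->CreateTexture3D(&desc3d, nullptr, &volumeTexture);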