I've decided to take a look at uniform buffer objects, but I am not sure when to use them and when not to.
I have tried to batch all model transformations into a single array that I send to the shader at once, but that has its consequences: I also have to send an id for every vertex to the shader to match it with its transformation.
So my question is: is it worth it? Or should I prefer regular glUniform calls in this particular case?
Below is my shader program.
#version 330 core
layout (location = 0) in vec3 position;
layout (location = 1) in int id;
layout (std140) uniform scene_data
{
mat4 ViewProjectionMatrix;
mat4 ModelMatrix[128];
};
void main()
{
gl_Position = ViewProjectionMatrix * ModelMatrix[id] * vec4(position,1.0);
}
Here is how I create the uniform buffer object:
Dot::UniformBuffer::UniformBuffer(const void* data, unsigned int size, unsigned int index)
    : m_Index(index), m_size(size)
{
    glGenBuffers(1, &m_UBO);
    glBindBuffer(GL_UNIFORM_BUFFER, m_UBO);
    glBufferData(GL_UNIFORM_BUFFER, size, data, GL_DYNAMIC_DRAW);
    // Attach the buffer to the binding point referenced by the shader's uniform block.
    glBindBufferBase(GL_UNIFORM_BUFFER, index, m_UBO);
    glBindBuffer(GL_UNIFORM_BUFFER, 0);
}
Dot::UniformBuffer::~UniformBuffer()
{
    glDeleteBuffers(1, &m_UBO);
}
void Dot::UniformBuffer::Update(const void* data)
{
    glBindBuffer(GL_UNIFORM_BUFFER, m_UBO);
    // Map the whole buffer and overwrite it with the new data.
    GLvoid* p = glMapBuffer(GL_UNIFORM_BUFFER, GL_WRITE_ONLY);
    memcpy(p, data, m_size);
    glUnmapBuffer(GL_UNIFORM_BUFFER);
}
Everything has its tradeoffs, and you would need performance measurements to verify which approach wins.
Think of glUniform as pass-by-value, whereas a UBO is pass-by-reference. Except that there is [usually] no performance penalty in the GPU shader code for the additional indirection through the UBO.
The main benefit of UBOs is reducing CPU overhead at the "bind" stage, since all the values are already written to memory. A renderer can prepare many UBOs up front at loading time (one UBO per "state vector") to avoid having to transfer the data during the main rendering loop. Typically you don't want to modify a UBO in-place in your renderer, because glMapBuffer will wait for previous draws that are using that UBO to complete.
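One common way around that stall (a sketch only, reusing the asker's Update method and member names) is to orphan the buffer's old storage before writing, so the driver can hand back fresh memory instead of synchronizing:
void Dot::UniformBuffer::Update(const void* data)
{
    glBindBuffer(GL_UNIFORM_BUFFER, m_UBO);
    // Orphan the old storage: draws still reading it keep their copy,
    // and the glBufferSubData write below does not have to wait for them.
    glBufferData(GL_UNIFORM_BUFFER, m_size, nullptr, GL_DYNAMIC_DRAW);
    glBufferSubData(GL_UNIFORM_BUFFER, 0, m_size, data);
    glBindBuffer(GL_UNIFORM_BUFFER, 0);
}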
In this particular example, the shader is performing an "indexed constant lookup" using the input attribute id, which is slower than a uniform constant lookup.
Other considerations:
layout (location = 1) in int id; requires the CPU to prepare an additional vertex buffer and index buffer. This may also imply making data copies of the raw geometry, which is more CPU and memory intensive.
glDrawArraysInstanced or glDrawElementsInstanced can let OpenGL generate the id for you instead; from the shader, use gl_InstanceID. This removes the need for the additional vertex buffer, and is similar to https://learnopengl.com/Advanced-OpenGL/Instancing (see the sketch after this list).
glUniform will cause the data to be transferred from application code --> driver --> GPU on every call.
UBOs require more object management and planning, but usually converge on the fastest-performing renderer. Vulkan and D3D12 support only UBO-style programming.
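A minimal sketch of the instanced approach (vao, indexCount, and instanceCount are placeholder names; the vertex shader would index ModelMatrix with gl_InstanceID instead of the id attribute):
glBindVertexArray(vao);
// One instance per model; gl_InstanceID selects ModelMatrix[gl_InstanceID]
// in the vertex shader, so no per-vertex id attribute is required.
glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr, instanceCount);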
Related
For some reason I cannot find the answer on the web. I want to update vertex attributes on the GPU through the shader, in a form similar to this:
#version 330 core
layout(location = 0) in vec4 position;
uniform mat4 someTransformation;
void main()
{
position = position * someTransformation;
gl_Position = position;
}
Is it possible?
Can you write the code you have written? Not quite as shown: vertex shader inputs are read-only, so you would assign the transformed value to gl_Position or to a separate out variable instead.
Will that change the contents of any GPU storage? No.
While there are ways for a VS to directly manipulate the contents of a buffer, if the buffer region being manipulated is also potentially being used as an attribute array for a rendering command, then you will have undefined behavior.
You can use SSBOs to manipulate other storage which is not being used as the input for rendering. And you can use transform feedback to accumulate data output from vertex processing. But you cannot have a VS directly modify its own input array.
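For completeness, a rough sketch of the transform feedback route (the varying name outPosition and the objects program, captureBuffer, and vertexCount are hypothetical); the capture buffer must be different from the one being read as the attribute source:
// Declare which vertex shader output to capture, before linking.
const char* varyings[] = { "outPosition" };
glTransformFeedbackVaryings(program, 1, varyings, GL_INTERLEAVED_ATTRIBS);
glLinkProgram(program);

// Bind the destination buffer and capture one pass of vertex processing.
glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, captureBuffer);
glEnable(GL_RASTERIZER_DISCARD);          // we only want the captured data
glBeginTransformFeedback(GL_POINTS);
glDrawArrays(GL_POINTS, 0, vertexCount);
glEndTransformFeedback();
glDisable(GL_RASTERIZER_DISCARD);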
I was wondering if I can index into a uniform buffer array with a value contained in the vertices I draw, like:
layout (location = 0) in vec3 position;
layout (location = 1) in flat int idx;
layout(binding = 0, std140) uniform uniformValues
{
float values[100];
};
void main()
{
values[idx];
}
My understanding is that this is not possible because
'in flat int idx'
is most likely not a 'dynamically uniform expression', and according to the documentation cannot be used to index into a uniform buffer array:
There are places where GLSL requires the expression to be dynamically
uniform. All of the following must use a dynamically uniform
expression:
-The index to buffer-backed interface block arrays.
However I came across information from the same source regarding how to access an array of samplers holding texture handles for 'bindless textures', and it says (emphasis mine):
Sampler and image types used in default block uniform variables can be
populated from handles rather than the index of a binding point.
These types can also now be passed as Shader Stage inputs/outputs (using the flat interpolation qualifier where needed). They can be
used as Vertex Attributes, where they are treated as 64-bit integers
on the OpenGL side. And they can be used in Interface Blocks of all
kinds; buffer-backed interface blocks treat them as 64-bit integers.
It's saying, I believe, that instead of doing this:
layout (location = 0) in flat int textureBinding;
layout (binding = 0) uniform sampler2D textures[16];
void main()
{
textures[textureBinding];
}
You do this:
layout (location = 0) in flat int bindlessTextureHandle;
layout (binding = 0) uniform textureBuffer
{
sampler2D textures[200];
}
void main()
{
textures[bindlessTextureHandle];
}
'bindlessTextureHandle' isn't a 'dynamically uniform expression', so how can it be used to index into a uniform buffer?
All of the following must use a dynamically uniform expression:
-The index to buffer-backed interface block arrays.
So why is it saying that you can index into 'interface blocks' of all kinds with values from vertex inputs?
Also are you allowed to index into:
'uniform sampler2D[16] textures;'
with a 'non-dynamically uniform expression'?
My understanding is that this is not possible because
in flat int idx
Is most likely not a 'dynamically uniform expression', and according
to the documentation cannot be used to index into a uniform buffer
array.
You are right that you need a dynamically uniform value to index into a uniform buffer array. However, this:
layout(binding = 0, std140) uniform uniformValues
{
float values[100];
};
is not a uniform buffer array. That is an array inside a single uniform buffer object, and you can index with non-uniform values into this array as you like. A uniform buffer array would be:
layout(binding = 0, std140) uniform myUBO
{
float value;
} myUBOArray[4];
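On the application side, each element of such a block array consumes its own binding point (with layout(binding = 0) the four elements land on bindings 0 through 3), so a sketch with placeholder buffer names would be:
// uboBuffers[i] is a placeholder for the buffer backing element i of myUBOArray.
for (unsigned int i = 0; i < 4; ++i)
    glBindBufferBase(GL_UNIFORM_BUFFER, i, uboBuffers[i]);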
The rest of your question gets even more obscure. You sometimes reference "The index to buffer-backed interface block arrays", which your code never uses. This is talking about SSBOs, which use interface blocks of the form layout(...) buffer foo {...}.
So why is it saying that you can index into 'interface blocks' of all kinds with values from vertex inputs?
Because that is how it is. You just need to understand that indexing into an array of interface blocks is not the same as indexing some other array (which might or might not be defined inside an interface block, doesn't matter).
Also are you allowed to index into:
uniform sampler2D[16] textures; with a 'non-dynamically uniform expression'?
No, not in standard GL.
The first thing to know about bindless textures is that this is not a core feature of any OpenGL version released to date (which is 4.6 at the time of writing this). It is only defined by the extension GL_ARB_bindless_texture, which some modern GPUs and drivers expose, but the availability of that feature is quite far from universal.
Second, the extension spec above explains: "Sampler and image handles passed to texture built-in functions must be dynamically uniform", so it still doesn't get you there. However, the extension GL_NV_gpu_shader5 removes that restriction. So on recent NVIDIA GPUs, you can use a non-dynamically uniform index into an array of bindless texture samplers - but performance will still suffer significantly if you do so.
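A small sketch of checking for the extension at runtime (assumes a current context and loaded GL function pointers; the function name is hypothetical):
#include <cstring>

bool HasBindlessTextures()
{
    GLint numExtensions = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &numExtensions);
    for (GLint i = 0; i < numExtensions; ++i)
    {
        // Query each extension string and compare against the one we need.
        const GLubyte* ext = glGetStringi(GL_EXTENSIONS, i);
        if (std::strcmp(reinterpret_cast<const char*>(ext), "GL_ARB_bindless_texture") == 0)
            return true;
    }
    return false;
}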
There are a myriad of separate, overlapping issues here.
Indexing an array within a uniform block has never been limited to dynamically uniform expressions (generally, see below). Even in GL 3.x, you can index an array within a buffer-backed block with an arbitrary index.
However, you're not asking about a general array; you're asking about arrays of textures. Or to be more general, the entire sequence of operations leading to the computation of a sampler type through bindless textures.
That entire sequence must be dynamically uniform (unless you're on NVIDIA, which allows arbitrary expressions). It doesn't matter if you're indexing an SSBO array, using an input variable to pass a texture handle directly, or anything else. The value that leads to the acquisition of a sampler type must be dynamically uniform.
So why is it saying that you can index into 'interface blocks' of all kinds with values from vertex inputs?
Because you can.
A common misunderstanding of what "dynamically uniform" means is that it is a static property. That an expression by itself is either dynamically uniform or not. This is close to true, but it's not actually true.
Some expressions are dynamically uniform by their nature. You could call these "statically dynamically uniform" expressions. A constant expression is always dynamically uniform, for example.
However, being dynamically uniform is about the value of the expression. All invocations (within the rendering command) must result in the same value. An in variable for a shader stage can be dynamically uniform so long as it just so happens to always have the same value within the rendering command. For example, a VS could access a value from a uniform array using gl_DrawID (which is dynamically uniform), pass that as an in to the FS, and the FS can use that value as a sampler. Or to access an array of samplers. Or whatever. All FS invocations will get the same value within the draw command, so that value is dynamically uniform.
I am working on a 3D mesh I am storing in an array: each element of the array is a 4D point in homogeneous coordinates (x, y, z, w). I use OpenCL to do some calculations on these data, which later I want to visualise, therefore I set up an OpenCL/GL interop context. I have created a shared buffer between OpenCL and OpenGL by using the clCreateFromGLBuffer function on a GL_ARRAY_BUFFER:
...
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, size_of_data, data, GL_DYNAMIC_DRAW);
vbo_buff = clCreateFromGLBuffer(ctx, CL_MEM_READ_WRITE, vbo, &err);
...
In the vertex shader, I access data this way:
layout (location = 0) in vec4 data;
out VS_OUT
{
vec4 vertex;
} vs_out;
void main(void)
{
vs_out.vertex = data;
}
Then in the geometry shader I do something like this:
layout (points) in;
layout (triangle_strip, max_vertices = MAX_VERT) out;
in VS_OUT
{
vec4 vertex;
} gs_in[];
void main()
{
gl_Position = gs_in[0].vertex;
EmitVertex();
...etc...
}
This gives me the ability to generate geometry based on the position of each point stored in the data array.
This way, the geometry I can generate is only based on the current point being processed by the geometry shader: e.g. I am able to construct a small cube (voxel) around each point.
Now I would like to be able to access the position of other points in the data array within the geometry shader: e.g. I would like to be able to retrieve the coordinates of another point (indexed by another shared buffer of arbitrary length) besides the one currently being processed, in order to draw a line connecting them.
The problem I have is that in the geometry shader gs_in[0].vertex gives me the position of each point, but I don't know which one it is at the time (which index?). Moreover, I don't know how to access the position of other points besides the current one at the same time.
In hypothetical pseudo-code I would like to be able to do something like this:
point_A = gs_in[0].vertex[index_A];
point_B = gs_in[0].vertex[index_B];
draw_line_between_A_and_B(point_A, point_B);
It is not clear to me whether this is possible or not, or how to achieve it within a geometry shader. I would like to stick to this approach because the calculations I do in the OpenCL kernels implement a cellular automaton, hence it is convenient for me to organise my code (neutrino) in terms of central nodes and related neighbours.
All suggestions are welcome.
Thanks.
but I don't know which one it is at the time (which index?)
See gl_PrimitiveIDIn
I don't know how to access the position of other points besides that one at the same time.
You can bind the same source buffer twice, once as a vertex source and once as a GL_TEXTURE_BUFFER. If your OpenGL implementation supports it, you'll then be able to read the point data from there in the geometry shader.
Unlike Direct3D, in GL support for the feature is optional; the spec says GL_MAX_GEOMETRY_SHADER_STORAGE_BLOCKS can be zero.
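A sketch of that second binding as a buffer texture (tbo is a placeholder; the geometry shader would read it through a samplerBuffer with texelFetch):
// Expose the same VBO to the geometry shader as a buffer texture.
GLuint tbo;
glGenTextures(1, &tbo);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_BUFFER, tbo);
glTexBuffer(GL_TEXTURE_BUFFER, GL_RGBA32F, vbo);   // vbo is the existing point buffer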
"This gives me the ability of generating geometry based on the position of each point the stored in the data array."
No it does not. The input to the geometry shader are not all the vertex attributes in the buffer. Let me quote the Geometry Shader wiki page:
Geometry shaders take a primitive as input; each primitive is composed of some number of vertices, as defined by the input primitive type in the shader.
A primitive is a single point, a line, or a triangle. For instance, if the primitive type is GL_POINTS, then the size of the input array is 1 and you can only access the vertex coordinates of that point; if the primitive type is GL_TRIANGLES, then the size of the input array is 3 and you can only access the 3 vertex attributes (i.e. corners) which form the triangle.
If you want to access more data, then you have to use a Shader Storage Buffer Object (or a texture).
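A minimal sketch of the SSBO route (requires GL 4.3; the binding index 1 and reuse of vbo are placeholders), which lets the geometry shader index any point:
// Bind the point data a second time, as a shader storage buffer.
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, vbo);
// Matching declaration in the geometry shader (shown here as a comment):
//   layout(std430, binding = 1) buffer Points { vec4 points[]; };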
shader code:
// UBO for MVP matrices
layout (binding = 0) uniform UniformBufferObject {
mat4 model;
mat4 view;
mat4 proj;
} ubo;
This works fine because it's just one struct and I can set VkWriteDescriptorSet.descriptorCount to 1. But how can I create an array of those structs?
// Want to do something like this
// for lighting calculations
layout (binding = 2) uniform Light {
vec3 position;
vec3 color;
} lights[4];
I have the data for all four lights stored in one buffer. If I set VkWriteDescriptorSet.descriptorCount to four, do I have to create four VkDescriptorBufferInfo structs? If so, I don't know what to put into offset and range.
All of the blocks in a uniform buffer array live in the same descriptor. However, they are still different blocks; each gets its own VkDescriptorBufferInfo object. So those blocks don't have to come from sequential regions of storage.
Note: the KHR_vulkan_glsl extension gets this wrong, if you pay close attention. It notes that arrays of opaque types should go into a single descriptor, but arrays of interface blocks don't. The actual glslangValidator compiler (and SPIR-V and Vulkan) do handle it as I described.
However, you cannot access the elements of an interface block array with anything other than dynamically uniform expressions. And even that requires having a certain feature available; without that feature, you can only access the array with constant expressions.
What you probably want is an array within the block, not an array of blocks:
struct Light
{
vec4 position; //(NEVER use `vec3` in blocks)
vec4 color;
};
layout (set = 0, binding = 2, std140) uniform Lights {
Light lights[4];
};
This means that you have a single descriptor in binding slot 2 (descriptorCount is 1). And the buffer's data should be 4 sequential structs.
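A sketch of the corresponding descriptor update (device, descriptorSet, and lightsBuffer are placeholders):
// One descriptor, one buffer: the whole Lights block lives in lightsBuffer.
VkDescriptorBufferInfo bufferInfo{};
bufferInfo.buffer = lightsBuffer;
bufferInfo.offset = 0;
bufferInfo.range  = VK_WHOLE_SIZE;        // or the exact size of the 4 Light structs

VkWriteDescriptorSet write{};
write.sType           = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
write.dstSet          = descriptorSet;
write.dstBinding      = 2;
write.dstArrayElement = 0;
write.descriptorCount = 1;                // a single block, a single descriptor
write.descriptorType  = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
write.pBufferInfo     = &bufferInfo;

vkUpdateDescriptorSets(device, 1, &write, 0, nullptr);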
I am looking into using bindless textures to rapidly display a series of images. My reference is the OpenGL 4.5 redbook. The book says I can sample bindless textures in a shader with this fragment shader:
#version 450 core
#extension GL_ARB_bindless_texture : require
in FS_INPUTS {
vec2 i_texcoord;
flat int i_texindex;
};
layout (binding = 0) uniform ALL_TEXTURES {
sampler2D fs_textures[200];
};
out vec4 color;
void main(void) {
color = texture(fs_textures[i_texindex], i_texcoord);
};
I created a vertex shader that looks like this:
#version 450 core
in vec2 vert;
in vec2 texcoord;
uniform int texindex;
out FS_INPUTS {
vec2 i_texcoord;
flat int i_texindex;
} tex_data;
void main(void) {
tex_data.i_texcoord = texcoord;
tex_data.i_texindex = texindex;
gl_Position = vec4(vert.x, vert.y, 0.0, 1.0);
};
As you may notice, my grasp of what's going on is a little weak.
In my OpenGL code, I create a bunch of textures, get their handles, and make them resident. The function I am using to get the texture handles is 'glGetTextureHandleARB'. There is another function that could be used instead, 'glGetTextureSamplerHandleARB' where I can pass in a sampler location. Here is what I did:
Texture* textures = new Texture[load_limit];
GLuint64* tex_handles = new GLuint64[load_limit];
for (int i=0; i<load_limit; ++i)
{
textures[i].bind();
textures[i].data(new CvImageFile(image_names[i]));
tex_handles[i] = glGetTextureHandleARB(textures[i].id());
glMakeTextureHandleResidentARB(tex_handles[i]);
textures[i].unbind();
}
My question is how do I bind my texture handles to the ALL_TEXTURES uniform attribute of the fragment shader? Also, what should I use to update the vertex attribute 'texindex' - an actual index into my texture handle array or a texture handle?
It's bindless texturing. You do not "bind" such textures to anything.
In bindless texturing, the data value of a sampler is a number. Specifically, the number returned by glGetTextureHandleARB. Texture handles are 64-bit unsigned integers.
In a shader, values of sampler types in buffer-backed interface blocks (UBOs and SSBOs) are 64-bit unsigned integers. So an array of samplers is equivalent in structure to an array of 64-bit unsigned integers.
So in C++, a struct equivalent to your ALL_TEXTURES block would be:
struct AllTextures
{
GLuint64 textures[200];
};
Well, assuming you properly use std140 layout, of course. Otherwise, you'd have to query the layout of the structure.
At this point, you treat the buffer as no different from any other UBO usage. Build the data for the shader by sticking an AllTextures into a buffer object, then bind that buffer as a UBO to binding 0. You just need to fill the array in with the actual texture handles.
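A sketch of that step, reusing tex_handles and load_limit from the question (handleUbo is a placeholder, and the C++ array stride must actually match the block's layout, as noted above):
// Copy the resident handles into the CPU-side mirror of the block.
AllTextures block{};
for (int i = 0; i < load_limit; ++i)
    block.textures[i] = tex_handles[i];

GLuint handleUbo;
glGenBuffers(1, &handleUbo);
glBindBuffer(GL_UNIFORM_BUFFER, handleUbo);
glBufferData(GL_UNIFORM_BUFFER, sizeof(block), &block, GL_STATIC_DRAW);
// Attach it to binding 0, which the ALL_TEXTURES block declares.
glBindBufferBase(GL_UNIFORM_BUFFER, 0, handleUbo);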
Also, what should I use to update the vertex attribute 'texindex' - an actual index into my texture handle array or a texture handle?
Well, neither one will work. Not the way you've written it.
See, ARB_bindless_texture does not allow you to access any texture you want in any way at any time from any shader invocation. Unless you are using NV_gpu_shader5, the code leading to the texture access must be based on dynamically uniform expressions.
So unless every vertex in your rendering command gets the same index or handle... you cannot use them to pick which texture to use. Even instancing will not save you, since dynamically uniform expressions don't care about instancing.
If you want to render a bunch of quads without having to change uniforms between them (and without having to rely on an NVIDIA extension), then you have a few options. Most hardware that supports bindless texture also supports ARB_shader_draw_parameters. This gives you access to gl_DrawID, which represents the current index of a rendering command within a glMultiDraw-style command. And that extension explicitly declares that gl_DrawID is dynamically uniform.
So you could use that to select which texture to render. You simply need to issue a multi-draw command where you render the same mesh data over and over, but it gets a different gl_DrawID index in each case.
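A sketch of such a multi-draw call (quadCount and the vertex counts are placeholders; each sub-draw re-renders the same quad and receives its own gl_DrawID):
const GLsizei quadCount = 16;                         // placeholder count
struct DrawArraysIndirectCommand
{
    GLuint count, instanceCount, first, baseInstance;
};
DrawArraysIndirectCommand cmds[quadCount];
for (GLsizei i = 0; i < quadCount; ++i)
    cmds[i] = { 6, 1, 0, 0 };                         // 6 vertices per quad, no instancing

GLuint dib;
glGenBuffers(1, &dib);
glBindBuffer(GL_DRAW_INDIRECT_BUFFER, dib);
glBufferData(GL_DRAW_INDIRECT_BUFFER, sizeof(cmds), cmds, GL_STATIC_DRAW);

// 16 sub-draws of the same vertices; the shader indexes the handle array with gl_DrawID.
glMultiDrawArraysIndirect(GL_TRIANGLES, nullptr, quadCount, 0);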