Difference between SSBO and Image load/store - OpenGL

What are the differences between Shader Storage Buffer Objects (SSBOs) and image load/store operations?
When should one be used and not the other?
Both support atomic operations, and I assume they are stored in the same type of memory. And regardless of whether they are stored in the same type of memory, do they have the same performance characteristics?
edit: the original question asked about SSBOs versus uniform buffer objects; it was meant to be about SSBOs versus image load/store.

The difference between shader storage buffer objects and image textures, and the reason one would want to use SSBOs, is that SSBOs can use interface blocks.
Images are just textures, which means the data structure is many elements of a single format (not necessarily vec4, but always one format throughout).
SSBOs, by contrast, are generic: a single interface block can combine ints, floats, arrays of vec3s, and so on.
So SSBOs are much more flexible than image textures.

Your question is already answered more or less definitively at http://www.opengl.org/wiki/Shader_Storage_Buffer_Object. It says:
SSBOs are a lot like Uniform Buffer Objects. Shader storage blocks are defined by Interface Blocks (GLSL) in almost the same way as uniform blocks. Buffer objects that store SSBOs are bound to SSBO binding points, just as buffer objects for uniforms are bound to UBO binding points. And so forth.
The major differences between them are:
SSBOs can be much larger. The smallest required UBO size is 16KB; the smallest required SSBO size is 16MB, and typical sizes will be on the order of the size of GPU memory.
SSBOs are writable, even atomically; UBOs are uniforms. SSBO reads and writes use incoherent memory accesses, so they need the appropriate barriers, just as Image Load Store operations do.
SSBOs can have unbounded storage, up to the buffer range bound; UBOs must have a specific, fixed storage size. This means that you can have an array of arbitrary length in an SSBO. The actual size of the array, based on the range of the buffer bound, can be queried at runtime in the shader using the length function on the unbounded array variable.
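To make that last point concrete, here is a minimal host-side C++ sketch (buffer name and sizes are made up, and a current GL 4.3+ context is assumed) that binds only part of a buffer to an SSBO binding point; an unsized array in the corresponding shader storage block then reports the bound range through its length() function:

GLuint ssbo;
glGenBuffers(1, &ssbo);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, ssbo);
glBufferData(GL_SHADER_STORAGE_BUFFER, 1024 * sizeof(float), nullptr, GL_DYNAMIC_DRAW);

// Bind only the first 256 floats to binding point 0; length() on an
// unsized float array in that block would then evaluate to 256.
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, 0, ssbo, 0, 256 * sizeof(float));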

As others have mentioned, SSBOs offer much larger storage and support atomic operations. The accepted answer also notes that SSBOs are generic in the sense that they let you combine different types. Personally, though, I want to point out that this is often a bad idea: it is not always ideal to use interface blocks or structs in an SSBO. Here's an example:
Let's say you have a struct in C++ like this:
struct Foo {
    glm::vec4 position;
    glm::vec4 velocity;
    glm::vec4 padding_and_range; // range is a float padded to a vec4
};
which corresponds to an SSBO buffer in glsl:
struct Foo {
    vec4 position;
    vec4 velocity;
    vec4 padding_and_range; // range is a float padded to a vec4
};

layout(std430, binding = 0) readonly buffer SSBO {
    Foo data[];
} foo;
Although the SSBO buffer is able to hold an array of struct Foo, notice that padding must be taken into account as per the std430 memory layout: you have to pad your float range to a vec4 and then use foo.data[i].padding_and_range.w to access it. This is error-prone, not to mention the waste of memory space, especially when your SSBO is large (to be used in a compute shader) and your Foo struct is complex (needs a lot of padding). Apart from that, you often need to fill in the buffer data in a loop like this:
Foo* foos = reinterpret_cast<Foo*>(glMapNamedBufferRange(ssbo, offset, size, GL_MAP_WRITE_BIT)); // write access: we are filling the buffer
for (int i = 0; i < n_foos; i++) {
    Foo& foo = foos[i];
    foo.position = glm::vec4(1.0f);
    foo.velocity = glm::vec4(2.0f);
    foo.padding_and_range = glm::vec4(glm::vec3(0.0f), 3.5f); // range lives in .w
}
glUnmapNamedBuffer(ssbo);
instead of simply writing data to it in one go using glNamedBufferData or glNamedBufferSubData.
A better way of handling the struct is to store each struct member in its own SSBO, so that each SSBO's array is tightly packed and homogeneous. Even though the performance may not be any better, it helps keep your code clean and more readable. Rather than using the struct, you would use:
layout(std430, binding = 0) buffer FooPosition {
    vec4 position[];
};
layout(std430, binding = 1) buffer FooVelocity {
    vec4 velocity[];
};
layout(std430, binding = 2) buffer FooRange {
    float range[];
};
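With this split layout, the host side can fill each buffer with a single bulk call instead of a mapped loop. A sketch (buffer names and the element count are illustrative, and a GL 4.5 context with direct state access is assumed):

const int n_foos = 1024;
std::vector<glm::vec4> positions(n_foos, glm::vec4(1.0f));
std::vector<glm::vec4> velocities(n_foos, glm::vec4(2.0f));
std::vector<float>     ranges(n_foos, 3.5f);

GLuint ssbo_position, ssbo_velocity, ssbo_range;
glCreateBuffers(1, &ssbo_position);
glCreateBuffers(1, &ssbo_velocity);
glCreateBuffers(1, &ssbo_range);

// One upload per buffer, no per-element padding to worry about.
glNamedBufferData(ssbo_position, positions.size() * sizeof(glm::vec4), positions.data(), GL_DYNAMIC_DRAW);
glNamedBufferData(ssbo_velocity, velocities.size() * sizeof(glm::vec4), velocities.data(), GL_DYNAMIC_DRAW);
glNamedBufferData(ssbo_range,    ranges.size() * sizeof(float),         ranges.data(),    GL_DYNAMIC_DRAW);

// Match the binding points declared in the shader blocks above.
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_position);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, ssbo_velocity);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, ssbo_range);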

Related

GLSL buffer with multiple arbitrary sized variables

I'm trying to bind a storage buffer of object structs containing multiple variables of different types so that I can access them in my vertex shader.
Struct:
struct Object
{
    vec2 vertices[4];
};

layout(std140, set = 0, binding = 0) buffer ObjectBlock
{
    Object objects[];
};
Using the code above, although I am able to compile the shader and get the Vulkan program running, the values from the array variable vertices are completely different from the values I inputted. I'm not sure why this is happening.
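A frequent cause of exactly this symptom is the std140 layout's array rules: every array element is padded up to a 16-byte stride, so vec2 vertices[4] occupies 4 × 16 bytes on the GPU rather than 4 × 8. A hedged C++ sketch of a CPU-side layout that would line up (struct names are purely illustrative):

#include <glm/glm.hpp>

// Under std140, each element of vec2 vertices[4] has a 16-byte stride,
// so the CPU-side struct needs explicit padding to match.
struct PaddedVec2 {
    glm::vec2 v;
    float     pad[2]; // rounds the element up to 16 bytes
};

struct ObjectStd140 {
    PaddedVec2 vertices[4]; // 64 bytes total, matching the shader block
};

static_assert(sizeof(ObjectStd140) == 64, "layout must match the std140 stride");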

OpenGL uniform blocks syntax

I was asking myself a question about UBOs and the way of accessing them in GLSL with uniform blocks.
Following the official documentation, if I want to design an array of lights, I would probably write:
layout(std140, binding = 0) uniform LightBlock
{
    vec4 position;
    vec4 direction;
    vec4 color;
    ...
} lights[8];
Now I see a lot of examples where the uniform block is written this way:
struct LightStruct
{
    vec4 position;
    vec4 direction;
    vec4 color;
    ...
};

layout(std140, binding = 0) uniform LightBlock
{
    LightStruct lights[8];
};
What is the difference between the two ways?
I guess it could help to reduce the number of uniform variables in use within a shader, but I'm not sure.
The first
layout(std140, binding = 0) uniform LightBlock
{
    vec4 position;
    vec4 direction;
    vec4 color;
    ...
} lights[8];
declares an array of uniform buffer blocks itself. That means that you can bind a different buffer object, or a different buffer range, for each index in your array. Note that in this example you will consume the UBO binding indices from 0 to 7; the GLSL spec explicitly states:
If the binding identifier is used with a uniform or shader storage
block instanced as an array, the first element of the array takes the
specified block binding and each subsequent element takes the next
consecutive uniform block binding point.
This has a couple of implications:
you can only use a very limited array size, because the number of UBO binding points is limited
you must index those arrays only with a dynamically uniform expression
you can bind the same UBO and buffer range to some or all of the individual indices of your array (something you could not do with an array inside the block)
In summary, you seldom really want to use an array of uniform blocks. Especially for your light example, you would use the latter:
layout(std140, binding = 0) uniform LightBlock
{
    LightStruct lights[8];
};
just declares one uniform block with an array in it. It means you have to provide one consecutive UBO buffer range for the array, so you consume only one of the precious UBO binding points, and the array can be as big as your implementation's maximum UBO size allows.
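On the host side, that single block maps to one buffer object and one binding point. A sketch (the CPU-side struct is a hypothetical mirror of the three members shown above, and a current GL context is assumed):

// Hypothetical CPU mirror of the GLSL LightStruct; with only vec4 members
// the std140 layout adds no extra padding for these fields.
struct LightStructCpu {
    glm::vec4 position;
    glm::vec4 direction;
    glm::vec4 color;
};

GLuint lightUbo;
glGenBuffers(1, &lightUbo);
glBindBuffer(GL_UNIFORM_BUFFER, lightUbo);
glBufferData(GL_UNIFORM_BUFFER, 8 * sizeof(LightStructCpu), nullptr, GL_DYNAMIC_DRAW);

// One consecutive range, one binding point consumed.
glBindBufferBase(GL_UNIFORM_BUFFER, 0, lightUbo);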

Suggestions for returning memory from a class

I have a class which is supposed to keep pixel data (floats for the position, floats for the color). I'm trying to use a C++ style for the data members (the data is kept in std::array<float, N> instead of plain C arrays). The class has other getters, setters and functions meant to be "helpers" to populate these fields.
Now I need to create an OpenGL vertex data buffer where I should write out
4 floats for xyzw
4 floats for rgba
2 floats for UV coords
in this order. I'm wondering how I should do this. I tried:
class MyVertexData {
    std::array<float, 4> pos;
    std::array<float, 4> rgba;
    std::array<float, 2> uv;
public:
    void writeData(float *ptrToMemory) {
        if (ptrToMemory == nullptr)
            throw std::runtime_error("Null pointer");
        std::array<float, 10> output = {
            pos[0], pos[1], pos[2], pos[3],
            rgba[0], rgba[1], rgba[2], rgba[3],
            uv[0], uv[1]
        };
        memcpy(ptrToMemory, output.data(), 10 * sizeof(float));
    }
};
// Caller code
std::vector<std::array<float, 10>> buffer(4);
vertex0.writeData(buffer[0].data());
vertex1.writeData(buffer[1].data());
vertex2.writeData(buffer[2].data());
vertex3.writeData(buffer[3].data());
but this approach has two problems:
I need to trust the caller to have allocated memory to store 10 floats
No C++11+ signature, I just get a float pointer
I can't just return a std::unique_ptr since I need a contiguous memory area (buffer) where the elements are to be stored, but I also need a distinction between the different elements (that would also make the code more readable).
It would be nice to return a smart pointer or something similar whose memory I can easily "concatenate" to other elements so I can safely pass this stuff to OpenGL.
The CppCoreGuidelines introduce span, which is a view over contiguous elements, so you may use something like:
void writeData(gsl::span<float, 10> ptrToMemory)
to express the intent.
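A sketch along those lines (it assumes the Microsoft GSL headers are available and reuses the field names from the question):

#include <algorithm>
#include <array>
#include <vector>
#include <gsl/span>

class MyVertexData {
    std::array<float, 4> pos{};
    std::array<float, 4> rgba{};
    std::array<float, 2> uv{};
public:
    void writeData(gsl::span<float, 10> out) const {
        // The fixed extent makes "exactly 10 floats" part of the signature,
        // so a too-small destination can no longer slip through silently.
        const std::array<float, 10> tmp = {
            pos[0], pos[1], pos[2], pos[3],
            rgba[0], rgba[1], rgba[2], rgba[3],
            uv[0], uv[1]
        };
        std::copy(tmp.begin(), tmp.end(), out.begin());
    }
};

int main() {
    std::vector<float> buffer(4 * 10); // one contiguous region for 4 vertices
    std::array<MyVertexData, 4> vertices{};
    for (int i = 0; i < 4; ++i)
        vertices[i].writeData(gsl::span<float, 10>(buffer.data() + i * 10, 10));
    // buffer.data() can now be handed to glBufferData in one piece.
}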

What data does GLSL take?

What data does GLSL take? For example, I have a 4×4 matrix of the following kind:
float **matrix = new float*[4];
for (int i = 0; i < 4; i++)
    matrix[i] = new float[4];
Can GLSL take this as a mat4x4?
Or is it better to use the following:
float *matrix = new float[16];
I haven't found this information in the GLSL 1.30 specification (the version I am using).
First of all, you should use GLfloat instead of float, since it exists exactly to represent the GL data exchanged between your program and the GPU.
Regarding your specific question, glUniform comes in many flavours for sending what you need. Among others, you have:
void glUniformMatrix2fv(GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
void glUniformMatrix4fv(GLint location, GLsizei count, GLboolean transpose, const GLfloat *value);
You can use them easily by passing the pointer to your data:
GLint location = glGetUniformLocation(program, "variable_name");
glUniformMatrix4fv(location, 1, GL_FALSE, matrix);
You can't use the first variant, because the 16 floats of the matrix data will be allocated in different memory areas: you store an array of 4 pointers, each pointing to a separate array of 4 floats. OpenGL expects the data to be laid out sequentially, so either use the second variant or use a struct/static 2D array:
GLfloat matrix[4][4];
It may be convenient to use an existing library for that, for example gl matrix.
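For illustration, the contiguous variants could be uploaded like this (a sketch; the uniform location is assumed to have been queried already, and an identity matrix stands in for real data):

// Either a flat array or a static 2D array is contiguous in memory,
// so both can be handed directly to the *v upload function.
GLfloat flat[16] = {
    1.0f, 0.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f, 0.0f,
    0.0f, 0.0f, 1.0f, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f
};
GLfloat grid[4][4] = {
    {1.0f, 0.0f, 0.0f, 0.0f},
    {0.0f, 1.0f, 0.0f, 0.0f},
    {0.0f, 0.0f, 1.0f, 0.0f},
    {0.0f, 0.0f, 0.0f, 1.0f}
};
glUniformMatrix4fv(location, 1, GL_FALSE, flat);        // flat 16-float array
glUniformMatrix4fv(location, 1, GL_FALSE, &grid[0][0]); // static 2D array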
What data does GLSL take? For example, I have a 4×4 matrix of the following kind:
float **matrix = new float*[4];
for (int i = 0; i < 4; i++)
    matrix[i] = new float[4];
That's not a 4×4 matrix. That's an array of pointers to arrays of 4 float elements.
Can GLSL take this as a mat4x4?
No, because it's not a matrix.
A 4×4 matrix would be a region of contiguous memory that contains 4·4 = 16 values (floats, if you will), where you denote that each n-tuple (n = 4) of values forms a vector and the m-tuple (m = 4) of vectors forms a matrix.
Or is it better to use the following:
float *matrix = new float[16];
That would be a contiguous region of memory holding 16 = 4·4 float values, which by that convention can be interpreted as a 4×4 matrix of floats. So yes, this can be interpreted by OpenGL / GLSL as a 4×4 matrix of floats, mat4.
However, if you make this part of some class, don't use dynamic memory.
class foo {
    foo() { matrix = new float[16]; }
    float *matrix;
};
is bad, because it creates unnecessary overhead. If the class is dynamically allocated (with new), that will trigger another dynamic memory allocation, which adds overhead. If the instances are in automatic memory (on the stack, i.e. no new), it still imposes unnecessary overhead.
class foo {
    foo() { … }
    float matrix[16];
};
is much better, because if a class instance is created with new, that single allocation also covers the memory for the matrix. And if it lives in automatic memory, it completely avoids the overhead of dynamic allocation.

Allocating vertex buffer object

I'm trying to create terrain from a heightmap in OpenGL (C++), following this tutorial.
I'm also trying to use a vertex buffer object to speed it up. In their example, they create a vertex object with 3 floats for x, y, z. They then pass a pointer to an array of these vertex objects to be copied into the buffer object. What I don't understand is why, for the buffer's size parameter, they pass the size of the 3 floats (multiplied by the number of vertices).
Surely the vertex objects being passed to it are larger than just the size of the 3 floats? Does the glBufferDataARB function somehow extract these variables? Is the size of an object equal to the size of the variables in it? Or am I missing something?
VBOs store bytes. Later gl*Pointer() and/or glVertexAttribPointer() calls tell OpenGL how to interpret those bytes.
To store three floats you need sizeof(float) * 3 bytes.
To store N three-float vertices you need sizeof(float) * 3 * N bytes.
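As a sketch (buffer and array names are made up; the core, non-ARB calls behave the same as the ARB ones in the tutorial), allocating and filling a VBO for N = 3 three-float vertices looks like this:

// Three vertices of three floats each: the buffer simply needs
// sizeof(GLfloat) * 3 * N bytes, nothing more.
const int N = 3;
GLfloat vertices[3 * N] = {
    0.0f, 0.0f, 0.0f,
    1.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f
};

GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, sizeof(GLfloat) * 3 * N, vertices, GL_STATIC_DRAW);

// Only this later call tells OpenGL to read those bytes as
// 3-component float positions.
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, nullptr);
glEnableVertexAttribArray(0);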