My GLSL fragment shader skips the "if" statement. The shader itself is very short.
I send some data via a uniform buffer object and use it further in the shader. However, the thing skips the assignment attached to the "if" statement for whatever reason.
I checked the values of the buffer object using glGetBufferSubData (tested with specific nonzero values). Everything is where it needs to be. So I'm really kinda lost here. Must be some GLSL weirdness I'm not aware of.
Currently the
#version 420
layout(std140, binding = 2) uniform textureVarBuffer
{
vec3 colorArray; // 16 bytes
int textureEnable; // 20 bytes
int normalMapEnable; // 24 bytes
int reflectionMapEnable; // 28 bytes
};
out vec4 frag_colour;
void main() {
frag_colour = vec4(1.0, 1.0, 1.0, 0.5);
if (textureEnable == 0) {
frag_colour = vec4(colorArray, 0.5);
}
}
You are confusing the base alignment rules with the offsets. The spec states:
The base offset of the first member of a structure is taken from the aligned offset of the structure itself. The base offset of all other structure members is derived by taking the offset of the last basic machine unit consumed by the previous member and adding one. Each structure member is stored in memory at its aligned offset. The members of a top-level uniform block are laid out in buffer storage by treating the uniform block as a structure with a base offset of zero.
It is true that a vec3 requires a base alignment of 16 bytes, but it only consumes 12 bytes. As a result, the next element after the vec3 will begin 12 bytes after the aligned offset of the vec3 itself. Since the alignment rules for int are just 4 bytes, there will be no padding at all.
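A minimal C++ sketch of what those offsets look like on the CPU side. The struct name TextureVarBuffer is hypothetical, and a plain float[3] is assumed as a stand-in for vec3 (12 bytes, no trailing padding), so the three ints land at offsets 12, 16 and 20 rather than 16, 20 and 24:

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::int32_t

// Hypothetical CPU-side mirror of the std140 block `textureVarBuffer`.
// Assumption: float is 4 bytes and float[3] stands in for vec3.
struct TextureVarBuffer {
    float        colorArray[3];        // offset 0, consumes 12 bytes
    std::int32_t textureEnable;        // offset 12 (int only needs 4-byte alignment)
    std::int32_t normalMapEnable;      // offset 16
    std::int32_t reflectionMapEnable;  // offset 20
};

// Compile-time checks that the C++ layout matches the std140 offsets.
static_assert(offsetof(TextureVarBuffer, colorArray) == 0, "vec3 starts the block");
static_assert(offsetof(TextureVarBuffer, textureEnable) == 12, "int follows the vec3 directly");
static_assert(offsetof(TextureVarBuffer, normalMapEnable) == 16, "second int");
static_assert(offsetof(TextureVarBuffer, reflectionMapEnable) == 20, "third int");
```

If the upload code writes textureEnable at byte 16 instead of byte 12, the shader reads whatever sits at byte 12, which would explain the "skipped" branch.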
I'm flattening out an octree and sending it to my fragment shader using an SSBO, and I believe I am running into some memory alignment issues. I'm using std430 for the layout and binding a vector of voxels to this SSBO. This is the structure in my shader (I'm using GLSL 4.3, FYI):
struct Voxel
{
bool data; // 4
vec4 pos; // 16
vec4 col; // 16
float size; // 4
int index; // 4
int pIndex; // 4
int cIdx[8]; // 4, 16 or 32 bytes?
};
layout (std430, binding=2) buffer octreeData
{
Voxel voxels[];
};
I'm not 100% sure but I think I'm running into an issue using the int cIdx[8] array inside of the struct, looking at the spec (page 124, section 7.6)
If the member is an array of scalars or vectors, the base alignment and array stride are set to match the base alignment of a single array element, according to rules (1), (2), and (3), and rounded up to the base alignment of a vec4. The array may have padding at the end; the base offset of the member following the array is rounded up to the next multiple of the base alignment.
I'm not entirely sure what the alignment is, I know the vec4's take up 16 bytes of memory, but how much does my array? If it was just sizeof(int)*8 that would be 32, but it says that it's set to the size of a single array element and then rounded up to a vec4 right? So does that mean my cIdx array has a base alignment of 16 bytes? There's no follow up members so is there padding getting added to my struct?
So total structure memory = 52 bytes (if we only allocate 4 bytes for cIdx), would that mean there is 12 bytes of padding being added on that I need to account for that may be causing me issues? If it was allocating 16 bytes would that be 64 bytes total for the structure and no memory alignment issues?
My corresponding c++ structure
struct Voxel
{
bool data;
glm::vec4 pos;
glm::vec4 col;
float size;
int index;
int pIndex;
int cIdx[8];
};
I'm then filling in my std::vector<Voxel> and passing it to my shader like so
glGenBuffers(1, &octreeSSBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, octreeSSBO);
glBufferData(GL_SHADER_STORAGE_BUFFER, voxelData.size()*sizeof(Voxel), voxelData.data(), GL_DYNAMIC_DRAW);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, octreeSSBO);
Reading directly from the voxelData vector, I can confirm that the data is filled in correctly. I can even occasionally see that the data reaches the shader, but it behaves incorrectly compared to what I would expect based on the values I'm looking at.
Does it look like there are memory alignment issues here?
I'm not entirely sure what the alignment is
The specification is very clear as to what the base alignment of each type is. Your problem is not in item #4 (std430 doesn't do the rounding specified in #4 anyway).
Your problem is in #2:
If the member is a two- or four-component vector with components consuming N basic machine units, the base alignment is 2N or 4N, respectively.
In GLSL, vec4 has a base alignment of 16. That means that any vec4 must be allocated on a 16-byte boundary.
pos must be on a 16-byte boundary. However, data is only 4 bytes. Therefore, 12 bytes of padding must be inserted between data and pos to satisfy std430's alignment requirements.
However, glm::vec4 has a C++ alignment of 4. So the C++ compiler does not insert a bunch of padding between data and pos. Thus, the types in the two languages do not agree.
You should explicitly align all GLM vectors in C++ structs that you want to match GLSL, using C++11's alignas keyword:
struct Voxel
{
bool data;
alignas(16) glm::vec4 pos;
alignas(16) glm::vec4 col;
float size;
int index;
int pIndex;
int cIdx[8];
};
Also, I would not assume that the C++ type bool and the GLSL type bool have the same size.
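The resulting layout can be checked at compile time. The following is only a sketch: Vec4 is a hypothetical stand-in for glm::vec4 (assumed to be four tightly packed floats with 4-byte alignment), and the bool is replaced by a 32-bit integer on the assumption that the GLSL bool occupies 4 bytes in the buffer:

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::int32_t, std::uint32_t

// Hypothetical stand-in for glm::vec4: four floats, 4-byte alignment.
struct Vec4 { float x, y, z, w; };

struct Voxel {
    std::uint32_t    data;     // offset 0; 4 bytes to match the GLSL bool
    alignas(16) Vec4 pos;      // offset 16 (12 bytes of padding before it)
    alignas(16) Vec4 col;      // offset 32
    float            size;     // offset 48
    std::int32_t     index;    // offset 52
    std::int32_t     pIndex;   // offset 56
    std::int32_t     cIdx[8];  // offset 60; std430 keeps a 4-byte stride for int arrays
};

// The struct inherits 16-byte alignment from its vec4 members, so its size
// (and therefore the stride of voxels[]) is rounded up from 92 to 96 bytes.
static_assert(offsetof(Voxel, pos) == 16, "pos must sit on a 16-byte boundary");
static_assert(offsetof(Voxel, col) == 32, "col must sit on a 16-byte boundary");
static_assert(offsetof(Voxel, cIdx) == 60, "cIdx follows pIndex directly");
static_assert(sizeof(Voxel) == 96, "array stride of voxels[]");
```

With these assertions in place, a mismatch between the C++ struct and the std430 rules shows up as a compile error instead of as garbage in the shader.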
According to khronos.org,
GL_MAX_UNIFORM_BLOCK_SIZE refers to the maximum size in basic machine units of a uniform block. The value must be at least 16384.
I have a fragment shader, where I declared a uniform interface block and attached a uniform buffer object to it.
#version 460 core
layout(std140, binding=2) uniform primitives{
vec3 foo[3430];
};
...
If I query the size of GL_MAX_UNIFORM_BLOCK_SIZE with:
GLuint info;
glGetUniformiv(shaderProgram.getShaderProgram_id(), GL_MAX_UNIFORM_BLOCK_SIZE, reinterpret_cast<GLint *>(&info));
cout << "GL_MAX_UNIFORM_BLOCK_SIZE: " << info << endl;
I get: GL_MAX_UNIFORM_BLOCK_SIZE: 22098. That is OK, but when I change the size of the array to 3000 (instead of 3430), for example, I get GL_MAX_UNIFORM_BLOCK_SIZE: 21956.
As far as I know, GL_MAX_UNIFORM_BLOCK_SIZE should be a constant depending on my GPU. Then why does it change, when I modify the size of the array?
GL_MAX_UNIFORM_BLOCK_SIZE is properly queried with glGetIntegerv. It is a constant defined by the implementation which tells you the implementation-defined maximum. glGetUniform returns the value of a uniform in the given program. You probably got an OpenGL error of some kind, since GL_MAX_UNIFORM_BLOCK_SIZE is not a valid uniform location, and therefore your integer was never written to. So you're just reading uninitialized data.
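Once the real limit is known, it can be compared against the size of the block. A small sketch, with hypothetical helper names: std140Vec3ArrayBytes relies on the std140 rule that every element of a vec3 array occupies 16 bytes, and maxBlockSize is assumed to be the value written by glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &value):

```cpp
#include <cstddef>

// Bytes occupied by `vec3 name[count]` in a std140 block:
// each element is padded to a 16-byte stride.
constexpr std::size_t std140Vec3ArrayBytes(std::size_t count) {
    return count * 16;
}

// `maxBlockSize` is assumed to come from
// glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE, &value) at runtime.
constexpr bool fitsInUniformBlock(std::size_t bytes, std::size_t maxBlockSize) {
    return bytes <= maxBlockSize;
}

// vec3 foo[3430] needs 54880 bytes, well above the guaranteed minimum of 16384.
static_assert(std140Vec3ArrayBytes(3430) == 54880, "3430 elements * 16 bytes");
static_assert(!fitsInUniformBlock(std140Vec3ArrayBytes(3430), 16384), "exceeds the spec minimum");
```

So the block in the question only works on implementations whose limit is larger than the minimum the specification guarantees.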
I'm trying to use a SPIR-V specialization constant to define the size of an array in a uniform block.
#version 460 core
layout(constant_id = 0) const uint count = 0;
layout(binding = 0) uniform Uniform
{
vec4 foo[count];
uint bar[count];
};
void main() {}
With a declaration of count = 0 in the shader, compilation fails with :
array size must be a positive integer
With count = 1 and a specialization of 5, the code compiles but linking fails at runtime with complaints of aliasing :
error: different uniforms (named Uniform.foo[4] and Uniform.bar[3]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[3] and Uniform.bar[2]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[2] and Uniform.bar[1]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[1] and Uniform.bar[0]) sharing the same offset within a uniform block (named Uniform) between shaders
It seems the layout of the uniform block (the offset of each member) is not affected during specialization so foo and bar overlap.
Explicit offsets don't work either and result in the same link errors :
layout(binding = 0, std140) uniform Uniform
{
layout(offset = 0) vec4 foo[count];
layout(offset = count) uint bar[count];
};
Is this intended behavior? An oversight?
Can a specialization constant be used to define the size of an array ?
This is an odd quirk of ARB_spir_v. From the extension specification:
Arrays inside a block may be sized with a specialization constant, but the block will have a static layout. Changing the specialized size will not re-layout the block. In the absence of explicit offsets, the layout will be based on the default size of the array.
Since the default size is 0, the struct in the block will be laid out as though the arrays were zero-sized.
Basically, you can use specialization constants to make the arrays shorter than the default, but not longer. And even if you make them shorter, they still take up the same space as the default.
So really, using specialization constants in block array lengths is just a shorthand way of declaring the array with the default value as its length, and then replacing where you would use name.length() with the specialization constant/expression. It's purely syntactic sugar.
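The aliasing in the link errors can be reproduced with a little arithmetic. This sketch assumes the case from the question that compiled (a default of count = 1, specialized to 5) and a std140-style stride of 16 bytes for both the vec4 and the uint array; the function names are hypothetical:

```cpp
#include <cstddef>

constexpr std::size_t kArrayStride  = 16; // assumed 16-byte stride for vec4[] and uint[]
constexpr std::size_t kDefaultCount = 1;  // default value of the specialization constant

// foo starts at offset 0, so element i sits at i * stride.
constexpr std::size_t fooOffset(std::size_t i) {
    return i * kArrayStride;
}

// bar is placed right after foo as laid out with the DEFAULT count;
// specializing count to 5 does not move it.
constexpr std::size_t barOffset(std::size_t i) {
    return kDefaultCount * kArrayStride + i * kArrayStride;
}

// With count specialized to 5, foo[1..4] overlap bar[0..3],
// which matches the pairs named in the link errors.
static_assert(fooOffset(1) == barOffset(0), "foo[1] aliases bar[0]");
static_assert(fooOffset(4) == barOffset(3), "foo[4] aliases bar[3]");
```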
I have the following uniform buffer:
layout(std140) uniform Light
{
vec4 AmbientLight;
vec4 LightIntensity;
vec3 LightPosition;
float LightAttenuation;
};
I have some issues when buffering the data and the padding I need to add. I have read the http://ptgmedia.pearsoncmg.com/images/9780321552624/downloads/0321552628_AppL.pdf which says I have to add an extra 4 bytes at the end of the vec3 for padding - so I will upload a total of 13 bytes for 'Light'. When I do that however, 'LightAttenuation' gets the value I padded on 'LightPosition', rather than one byte ahead, so I get the correct values in the shader when I do NOT pad. Why is this?
See section 7.6.2.2 of the OpenGL spec for the details, but basically, std140 layout says that each variable will be laid out immediately after the previous variable with enough padding added for the alignment required for the variable's type. vec3 and vec4 both require 16-byte alignment and are 12 and 16 bytes respectively. float requires 4 byte alignment and has 4 byte size. So with std140 layout, LightPosition will get 16 byte alignment so will always end at an address that is 12 mod 16. Since this is 4-byte aligned, no extra padding will be inserted before LightAttenuation.
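As an illustration of that rule, here is a sketch of a C++ struct whose natural layout matches the Light block. Plain float arrays are assumed as stand-ins for vec4 and vec3, and the name LightBlock is hypothetical:

```cpp
#include <cstddef>   // offsetof

// Hypothetical CPU-side mirror of the std140 block `Light`.
// Assumption: float is 4 bytes; float[4] / float[3] stand in for vec4 / vec3.
struct LightBlock {
    float AmbientLight[4];    // offset 0
    float LightIntensity[4];  // offset 16
    float LightPosition[3];   // offset 32, consumes 12 bytes
    float LightAttenuation;   // offset 44, fills the slot after the vec3
};

static_assert(offsetof(LightBlock, LightPosition) == 32, "vec3 is 16-byte aligned");
static_assert(offsetof(LightBlock, LightAttenuation) == 44, "float packs right after the vec3");
static_assert(sizeof(LightBlock) == 48, "three 16-byte rows in total");
```

Padding after LightPosition would only be needed if the next member required more than 4-byte alignment, for example another vec3 or vec4.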
Usually yes, OpenGL will treat a vec3 as a vec4. But AFAIK in this case it appends the float LightAttenuation to the vec3 LightPosition, forming an overall vec4 (it's some kind of optimization done by the GLSL compiler).
The whole structure will be of size 3x vec4.
Try out using a vec3 or vec4 for LightAttenuation.
My vertex shader is:
uniform Block1{ vec4 offset_x1; vec4 offset_x2;}block1;
out float value;
in vec4 position;
void main()
{
value = block1.offset_x1.x + block1.offset_x2.x;
gl_Position = position;
}
The code I am using to pass values is :
GLfloat color_values[8];// contains valid values
glGenBuffers(1,&buffer_object);
glBindBuffer(GL_UNIFORM_BUFFER,buffer_object);
glBufferData(GL_UNIFORM_BUFFER,sizeof(color_values),color_values,GL_STATIC_DRAW);
glUniformBlockBinding(psId,blockIndex,0);
glBindBufferRange(GL_UNIFORM_BUFFER,0,buffer_object,0,16);
glBindBufferRange(GL_UNIFORM_BUFFER,0,buffer_object,16,16);
What I am expecting here is to pass 16 bytes for each vec4 uniform. Instead I get a GL_INVALID_VALUE error for offset = 16, size = 16.
I am confused with offset value. Spec says it is corresponding to "buffer_object".
There is an alignment restriction for UBOs when binding. Any glBindBufferRange/Base's offset must be a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT. This alignment could be anything, so you have to query it before building your array of uniform buffers. That means you can't do it directly in compile-time C++ logic; it has to be runtime logic.
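A sketch of the runtime logic this implies. The helper name alignUp is hypothetical, and the alignment argument is assumed to be the value obtained from glGetIntegerv(GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT, &value); the 256 used in the checks is only an example value, not something to hard-code:

```cpp
#include <cstddef>

// Round `offset` up to the next multiple of `alignment`.
// `alignment` is assumed to be the runtime value of
// GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT and must be non-zero.
constexpr std::size_t alignUp(std::size_t offset, std::size_t alignment) {
    return ((offset + alignment - 1) / alignment) * alignment;
}

// Example with a hypothetical alignment of 256 bytes: a second range that
// "wants" to start at byte 16 has to be placed at byte 256 instead.
static_assert(alignUp(0, 256) == 0, "first range starts at 0");
static_assert(alignUp(16, 256) == 256, "second range moves to the next boundary");
```

Each range passed to glBindBufferRange would then start at alignUp(previousEnd, alignment), and the buffer has to be allocated large enough to hold the padded ranges.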
Speaking of querying things at runtime, your code is horribly broken in many other ways. You did not define a layout qualifier for your uniform block; therefore, the default is used: shared. And you cannot use shared layout without querying the layout of each block's members from OpenGL. Ever.
If you had done a query, you would have quickly discovered that your uniform block is at least 32 bytes in size, not 16. And since you only provided 16 bytes in your range, undefined behavior (which includes the possibility of program termination) results.
If you want to be able to define C/C++ objects that map exactly to the uniform block definition, you need to use std140 layout and follow the rules of std140's layout in your C/C++ object.