Having an unbound sampler inside a uniform branch - glsl

Let's say I have a pixel shader that sometimes needs to read from one sampler and sometimes from two different samplers, depending on a uniform variable:
layout (set = 0, binding = 0) uniform UBO {
....
bool useSecondTexture;
} ubo;
...
void main() {
vec3 value0 = texture(sampler1, pos).rgb;
vec3 value2 = vec3(0,0,0);
if(ubo.useSecondTexture) {
value2 = texture(sampler2, pos).rgb;
}
value0 += value2;
}
Does the second sampler, sampler2, need to be bound to a valid texture even though the texture will not be read when useSecondTexture is false?

All of the vkCmdDraw and vkCmdDispatch commands have this Valid Usage statement:
Descriptors in each bound descriptor set, specified via vkCmdBindDescriptorSets, must be valid if they are statically used by the currently bound VkPipeline object, specified via vkCmdBindPipeline
Since sampler2 is statically used, you must have a valid descriptor for it or you'll get undefined behavior.
My guess is that on some implementations, it'll work as you expect. But drivers/hardware are allowed to require that all descriptors that might be used by a pipeline are valid, and requiring them to inspect the contents of memory buffers to determine if something might be used would be very expensive.

Why does the value of GL_MAX_UNIFORM_BLOCK_SIZE change?

According to khronos.org,
GL_MAX_UNIFORM_BLOCK_SIZE refers to the maximum size in basic machine units of a uniform block. The value must be at least 16384.
I have a fragment shader, where I declared a uniform interface block and attached a uniform buffer object to it.
#version 460 core
layout(std140, binding=2) uniform primitives{
vec3 foo[3430];
};
...
If I query the size of GL_MAX_UNIFORM_BLOCK_SIZE with:
GLuint info;
glGetUniformiv(shaderProgram.getShaderProgram_id(), GL_MAX_UNIFORM_BLOCK_SIZE, reinterpret_cast<GLint *>(&info));
cout << "GL_MAX_UNIFORM_BLOCK_SIZE: " << info << endl;
I get: GL_MAX_UNIFORM_BLOCK_SIZE: 22098. That is fine, but when I change the size of the array to 3000 (instead of 3430), I get GL_MAX_UNIFORM_BLOCK_SIZE: 21956.
As far as I know, GL_MAX_UNIFORM_BLOCK_SIZE should be a constant that depends on my GPU. So why does it change when I modify the size of the array?
GL_MAX_UNIFORM_BLOCK_SIZE is properly queried with glGetIntegerv. It is a constant defined by the implementation which tells you the implementation-defined maximum. glGetUniform returns the value of a uniform in the given program. You probably got an OpenGL error of some kind, since GL_MAX_UNIFORM_BLOCK_SIZE is not a valid uniform location, and therefore your integer was never written to. So you're just reading uninitialized data.

Specialization constant used for array size

I'm trying to use a SPIR-V specialization constant to define the size of an array in a uniform block.
#version 460 core
layout(constant_id = 0) const uint count = 0;
layout(binding = 0) uniform Uniform
{
vec4 foo[count];
uint bar[count];
};
void main() {}
With a declaration of count = 0 in the shader, compilation fails with:
array size must be a positive integer
With count = 1 and a specialization of 5, the code compiles but linking fails at runtime with complaints of aliasing:
error: different uniforms (named Uniform.foo[4] and Uniform.bar[3]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[3] and Uniform.bar[2]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[2] and Uniform.bar[1]) sharing the same offset within a uniform block (named Uniform) between shaders
error: different uniforms (named Uniform.foo[1] and Uniform.bar[0]) sharing the same offset within a uniform block (named Uniform) between shaders
It seems the layout of the uniform block (the offset of each member) is not affected during specialization so foo and bar overlap.
Explicit offsets don't work either and result in the same link errors:
layout(binding = 0, std140) uniform Uniform
{
layout(offset = 0) vec4 foo[count];
layout(offset = count) uint bar[count];
};
Is this intended behavior, or an oversight?
Can a specialization constant be used to define the size of an array?
This is an odd quirk of ARB_gl_spirv. From the extension specification:
Arrays inside a block may be sized with a specialization constant, but the block will have a static layout. Changing the specialized size will not re-layout the block. In the absence of explicit offsets, the layout will be based on the default size of the array.
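The layout is therefore fixed by the default value of count. A default of 0 does not even compile. With a default of 1, the block is laid out as though each array had a single element, so bar starts immediately after foo[0]. After specializing to 5, foo[1] lands on the same offset as bar[0], foo[2] on bar[1], and so on, which is exactly what the link errors report.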
Basically, you can use specialization constants to make the arrays shorter than the default, but not longer. And even if you make them shorter, they still take up the same space as the default.
So really, using specialization constants in block array lengths is just a shorthand way of declaring the array with the default value as its length, and then replacing where you would use name.length() with the specialization constant/expression. It's purely syntactic sugar.
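The aliasing reported by the linker can be reproduced with a little offset arithmetic. The following is a minimal sketch, assuming std140 rules (a 16-byte stride for both the vec4 and the uint array) and a block that contains only foo followed by bar. The names fooOffset, barOffset and findAliases are hypothetical helpers written for this illustration, not part of any API.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assumption: std140 layout, where every array element occupies a
// 16-byte stride, for vec4 elements and for uint elements alike.
constexpr std::size_t kStd140ArrayStride = 16;

// foo is the first member of the block, so foo[i] starts at i * stride.
constexpr std::size_t fooOffset(std::size_t i) {
    return i * kStd140ArrayStride;
}

// bar starts right after the space reserved for foo. That space is
// sized from the DEFAULT count, not from the specialized count.
constexpr std::size_t barOffset(std::size_t defaultCount, std::size_t j) {
    return (defaultCount + j) * kStd140ArrayStride;
}

// Returns every (foo index, bar index) pair that shares an offset when
// the arrays are indexed up to specializedCount elements.
inline std::vector<std::pair<std::size_t, std::size_t>>
findAliases(std::size_t defaultCount, std::size_t specializedCount) {
    assert(defaultCount > 0);  // a zero-sized array is rejected at compile time
    std::vector<std::pair<std::size_t, std::size_t>> aliases;
    for (std::size_t i = 0; i < specializedCount; ++i) {
        for (std::size_t j = 0; j < specializedCount; ++j) {
            if (fooOffset(i) == barOffset(defaultCount, j)) {
                aliases.emplace_back(i, j);
            }
        }
    }
    return aliases;
}
```

Calling findAliases(1, 5) yields the pairs (1,0), (2,1), (3,2) and (4,3), the same four collisions listed in the link errors. Specializing to a size at or below the default produces no collisions.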

GL_SHADER_STORAGE_BUFFER memory limitations

I'm writing a ray tracer with OpenGL compute shaders, and I use buffers to pass data to and from the shaders.
When the size of the vec2 output buffer (the number of rays multiplied by the number of faces) reaches ~30 MB, every attempt to map the buffer returns a NULL pointer. Range mapping also fails.
I can't find any information about GL_SHADER_STORAGE_BUFFER limits in the OpenGL documentation. Is ~30 MB a limit, or could the mapping failure have a different cause?
And is there any way to avoid this other than dispatching the shader multiple times?
Data declaration in shader:
#version 440
layout(std430, binding=0) buffer rays{
vec4 r[];
};
layout(std430, binding=1) buffer faces{
vec4 f[];
};
layout(std430, binding=2) buffer outputs{
vec2 o[];
};
uniform int face_count;
uniform vec4 origin;
Calling code (using some Qt5 wrappers):
QOpenGLBuffer ray_buffer;
QOpenGLBuffer face_buffer;
QOpenGLBuffer output_buffer;
QVector<QVector2D> output;
output.resize(rays[r].size()*faces.size());
if(!ray_buffer.create()) { /*...*/ }
if(!ray_buffer.bind()) { /*...*/ }
ray_buffer.allocate(rays.data(), rays.size()*sizeof(QVector4D));
if(!face_buffer.create()) { /*...*/ }
if(!face_buffer.bind()) { /*...*/ }
face_buffer.allocate(faces.data(), faces.size()*sizeof(QVector4D));
if(!output_buffer.create()) { /*...*/ }
if(!output_buffer.bind()) { /*...*/ }
output_buffer.allocate(output.size()*sizeof(QVector2D));
ogl->glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ray_buffer.bufferId());
ogl->glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, face_buffer.bufferId());
ogl->glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, output_buffer.bufferId());
int face_count = faces.size();
compute.setUniformValue("face_count", face_count);
compute.setUniformValue("origin", pos);
ogl->glDispatchCompute(rays.size()/256, faces.size(), 1);
ray_buffer.destroy();
face_buffer.destroy();
QVector2D* data = (QVector2D*)output_buffer.map(QOpenGLBuffer::ReadOnly);
First of all, you have to understand that the OpenGL specification defines minimum maxima for a variety of values (the ones starting with a MAX_ prefix). That means implementations are required to provide at least the specified amount as the maximum value, but are free to raise the limit as implementors see fit. This way, developers can rely on a guaranteed floor for each limit, and can still make provisions for possibly larger values.
Section 23 - State Tables summarizes what has been previously specified in the corresponding sections. The information you were looking for is found in table 23.64 - Implementation Dependent Aggregate Shader Limits (cont.). If you want to know about which state belongs where (because there is per-object state, quasi-global state, program state and so on), you go to section 23.
The minimum maximum size of a shader storage buffer is represented by the symbolic constant MAX_SHADER_STORAGE_BLOCK_SIZE as per section 7.8 of the core OpenGL 4.5 specification.
Since their adoption into core, the required size (i.e. the minimum maximum) has been significantly increased. In core OpenGL 4.3 and 4.4, the minimum maximum was pow(2, 24) (or 16MB with 1 byte basic machine units and 1MB = 1024^2 bytes) - in core OpenGL 4.5 this value is now pow(2, 27) (or 128MB)
Summary: When in doubt about OpenGL state, refer to section 23 of the core specification.
From OpenGL Wiki:
SSBOs can be much larger. The OpenGL spec guarantees that UBOs can be
up to 16KB in size (implementations can allow them to be bigger). The
spec guarantees that SSBOs can be up to 128MB. Most implementations
will let you allocate a size up to the limit of GPU memory.
OpenGL versions before 4.5 guarantee only 16 MiB (OpenGL 4.5 increased the minimum to 128 MiB). You can query the actual limit with glGetInteger64v to see whether you can bind more:
GLint64 max;
glGetInteger64v(GL_MAX_SHADER_STORAGE_BLOCK_SIZE, &max);
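To make the numbers concrete, here is a small sketch that compares the size of the output buffer against a block size limit. It assumes the std430 vec2 array from the question (8 bytes per element) and takes the limit as a plain parameter, which in a real program would be the value queried above. The helper names and the ray and face counts in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Guaranteed minimums for MAX_SHADER_STORAGE_BLOCK_SIZE, in bytes.
constexpr std::int64_t kMinMaxBlockSizeGL43 = std::int64_t{1} << 24;  // 16 MiB (4.3, 4.4)
constexpr std::int64_t kMinMaxBlockSizeGL45 = std::int64_t{1} << 27;  // 128 MiB (4.5)

// Bytes needed for one vec2 per (ray, face) pair.
// Assumption: std430 layout, where a vec2 array has an 8-byte stride.
inline std::int64_t outputBufferBytes(std::int64_t rayCount, std::int64_t faceCount) {
    assert(rayCount >= 0 && faceCount >= 0);
    return rayCount * faceCount * 8;
}

// True if the buffer fits under the given limit (queried or guaranteed).
inline bool fitsInBlock(std::int64_t requiredBytes, std::int64_t maxBlockSize) {
    return requiredBytes <= maxBlockSize;
}
```

For example, 4096 rays against 1000 faces need 32,768,000 bytes. That exceeds the 16 MiB guaranteed before OpenGL 4.5 but fits within the 128 MiB guaranteed by 4.5, so whether it works on a pre-4.5 context depends on what the driver actually reports.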
In fact, the problem seems to be in the Qt wrappers. I didn't look into it in depth, but when I replaced QOpenGLBuffer's create(), bind(), allocate() and map() with glCreateBuffers(), glBindBuffer(), glNamedBufferData() and glMapNamedBuffer(), all called through QOpenGLFunctions_4_5_Core, the memory problem was gone until I reached 2 GB (which is the GPU's physical memory limit).
The second mistake I made was not using glMemoryBarrier(), but adding it didn't help while QOpenGLBuffer was in use.

GLSL: will compiler evaluate functions with constant arguments?

If I have the following code in a GLSL fragment shader:
float r = 0.386;
float a = 26.6;
float xd = r*cos(0.0174532924*(a+0));
float yd = r*sin(0.0174532924*(a+0));
float xe = r*cos(0.0174532924*(a+90));
float ye = r*sin(0.0174532924*(a+90));
is it a sane assumption that the compiler will evaluate those trigonometric functions once, instead of having them evaluated in every fragment execution?
In this case, sadly, you can't know much, since the compilation is done by the GPU driver. I would say it is implementation dependent, since some compilers optimize better than others.
However, as WearyWanderer said, you can hardcode the values or pass them through uniforms or a UBO.
You mentioned that you could calculate the values and assign them directly, but that you want to keep the expressions for documentation purposes. I assume the values will be the same in every execution of the shader code.
Uniform Variables are variables that you can calculate once, send to a shader, and are the same for every execution, unless you change the uniform variable at some point. For example:
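// Computed once on the CPU (C++ host code), not in the shader.
float r = 0.386f;
float a = 26.6f;
float xd_val = r * cos(0.0174532924f * (a + 0));
// glGetUniformLocation returns a GLint; -1 means the uniform was not found.
GLint xd_id = glGetUniformLocation(pShaderProgram, "xd");
// The program must be current (glUseProgram) when glUniform1f is called.
glUniform1f(xd_id, xd_val);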
This calculates the value only once on the CPU and passes it to the shader program as a uniform variable. The shader then has access to the value in every execution without recalculating it, and the expression stays in your code as the documentation you wanted.
Uniforms are commonly used for object-wide values, e.g. an alpha value or the scene lights for a Phong shading model.
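The CPU-side precomputation described above could be gathered into one small helper that produces all four values from the question. This is only a sketch: the Offsets struct and the computeOffsets function are hypothetical names, and uploading the results with glUniform1f is left out so the snippet stays free of OpenGL dependencies.

```cpp
#include <cassert>
#include <cmath>

// The four values the shader in the question derives from r and a.
struct Offsets {
    float xd, yd;  // direction at angle a
    float xe, ye;  // direction at angle a + 90 degrees
};

// Evaluates the trigonometry once on the CPU. The results can then be
// uploaded as uniforms instead of being recomputed for every fragment.
inline Offsets computeOffsets(float r, float angleDegrees) {
    assert(r >= 0.0f);
    const float degToRad = 0.0174532924f;  // same constant as in the question
    const float a0 = degToRad * angleDegrees;
    const float a90 = degToRad * (angleDegrees + 90.0f);
    return Offsets{r * std::cos(a0), r * std::sin(a0),
                   r * std::cos(a90), r * std::sin(a90)};
}
```

Because the second pair is the first pair rotated by 90 degrees, xe equals -yd and ye equals xd up to rounding, so two uniforms would be enough if the shader derives the other two itself.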

Issue with glBindBufferRange() OpenGL 3.1

My vertex shader is:
uniform Block1{ vec4 offset_x1; vec4 offset_x2;}block1;
out float value;
in vec4 position;
void main()
{
value = block1.offset_x1.x + block1.offset_x2.x;
gl_Position = position;
}
The code I am using to pass values is:
GLfloat color_values[8];// contains valid values
glGenBuffers(1,&buffer_object);
glBindBuffer(GL_UNIFORM_BUFFER,buffer_object);
glBufferData(GL_UNIFORM_BUFFER,sizeof(color_values),color_values,GL_STATIC_DRAW);
glUniformBlockBinding(psId,blockIndex,0);
glBindBufferRange(GL_UNIFORM_BUFFER,0,buffer_object,0,16);
glBindBufferRange(GL_UNIFORM_BUFFER,0,buffer_object,16,16);
What I expect here is to pass 16 bytes for each vec4 uniform. Instead, I get a GL_INVALID_VALUE error for offset = 16, size = 16.
I am confused about the offset value. The spec says it is relative to "buffer_object".
There is an alignment restriction for UBOs when binding. Any glBindBufferRange/Base's offset must be a multiple of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT. This alignment could be anything, so you have to query it before building your array of uniform buffers. That means you can't do it directly in compile-time C++ logic; it has to be runtime logic.
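Once the alignment has been queried at runtime, placing several blocks in one buffer comes down to rounding each offset up to the next multiple of that alignment. A minimal sketch follows. The helper names are hypothetical, and the alignment is passed in as a plain number that would come from querying GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT.

```cpp
#include <cassert>
#include <cstddef>

// Rounds offset up to the next multiple of alignment. This does not
// assume the alignment is a power of two.
inline std::size_t alignUp(std::size_t offset, std::size_t alignment) {
    assert(alignment > 0);
    return ((offset + alignment - 1) / alignment) * alignment;
}

// Byte offset of the index-th block when blocks of blockSize bytes are
// packed into one buffer, each starting on an aligned boundary.
inline std::size_t alignedBlockOffset(std::size_t index, std::size_t blockSize,
                                      std::size_t alignment) {
    return index * alignUp(blockSize, alignment);
}
```

With an alignment of 256, for instance, an offset of 16 is not a legal binding offset and the second block would have to start at byte 256. The buffer must then be allocated large enough to hold the padded blocks.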
Speaking of querying things at runtime, your code is horribly broken in many other ways. You did not define a layout qualifier for your uniform block; therefore, the default is used: shared. And you cannot use shared layout without querying the layout of each block's members from OpenGL. Ever.
If you had done a query, you would have quickly discovered that your uniform block is at least 32 bytes in size, not 16. And since you only provided 16 bytes in your range, undefined behavior (which includes the possibility of program termination) results.
If you want to be able to define C/C++ objects that map exactly to the uniform block definition, you need to use std140 layout and follow the rules of std140's layout in your C/C++ object.
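As a sketch of what such a C++ mirror might look like for the block in the question, assuming the block is declared with layout(std140) and that float is 4 bytes on the host: each vec4 occupies 16 bytes at a 16-byte aligned offset, so two of them give a 32-byte block. The struct and function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-side mirror of:
//   layout(std140) uniform Block1 { vec4 offset_x1; vec4 offset_x2; };
struct Block1Std140 {
    float offset_x1[4];  // vec4 at byte offset 0
    float offset_x2[4];  // vec4 at byte offset 16
};

// Compile-time checks that the C++ layout matches the std140 rules.
static_assert(offsetof(Block1Std140, offset_x1) == 0, "offset_x1 must be at 0");
static_assert(offsetof(Block1Std140, offset_x2) == 16, "offset_x2 must be at 16");
static_assert(sizeof(Block1Std140) == 32, "block must be 32 bytes");

// Fills the block from a flat array of exactly 8 floats, such as the
// color_values array in the question.
inline Block1Std140 makeBlock(const float* values, std::size_t count) {
    assert(values != nullptr && count == 8);
    Block1Std140 block;
    std::memcpy(&block, values, sizeof(block));
    return block;
}
```

The whole 32-byte struct would then be uploaded and bound as a single range, rather than binding each vec4 separately.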