GLSL produced by SPIRV-cross from SPIR-V breaks std140 rules? - glsl

I've put my snippet HLSL code here: https://shader-playground.timjones.io/d9011ef7826a68ed93394792c2edb732
I compile HLSL with DXC to SPIR-V and then use SPIRV-Cross to get the GLSL code.
The GLSL constant buffer is tagged with std140 and it contains vec3 and float.
As far as I know, this will not work. Shouldn't GL_EXT_scalar_block_layout be used here, with the constant block tagged scalar instead of std140? Am I missing something obvious? Thanks.

For an arbitrary input buffer, there isn't a generic OpenGL memory layout that is exactly equivalent to the DX constant buffer layout.
DX constant buffers will add whatever padding is needed to stop single variables spanning 16-byte boundaries, but variables themselves are only 4-byte aligned.
GL std140 uniform buffers will always align a vec3 on a 16-byte boundary. This has no equivalent in DX.
GL std430 uniform buffers (if supported via GL_EXT_scalar_block_layout) relax the std140 rounding of arrays and structs, but still align a vec3 on a 16-byte boundary. This has no equivalent in DX.
GL scalar uniform buffers (if supported via GL_EXT_scalar_block_layout) only pad to the component element size and don't care about 16-byte boundaries. This has no equivalent in DX.
Things get even more fun if you start throwing around struct and array types ...
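For instance (my own illustration, not part of the original answer), even a plain float array already diverges between the layouts:

// Hypothetical block; the array stride depends entirely on the layout chosen.
layout (std140) uniform Example { float weights[4]; };
// std140:           array stride 16 -> weights occupies 64 bytes
// std430 / scalar:  array stride 4  -> weights occupies 16 bytes
// DX cbuffer:       each element starts on a 16-byte boundary -> 52 bytes (last element is only 4)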
TL;DR: if you want a fixed binary memory layout that is portable between DX, GL/GLES, and Vulkan, you have to take some responsibility for designing a portable memory layout for your constant buffers. You can't throw arbitrary layouts around and expect them to work.
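As an illustration of that last point (a sketch of mine, not from the original answer), one common convention is to restrict constant buffers to 16-byte-sized, 16-byte-aligned members, so that DX packing, std140, std430 and scalar all produce the same offsets:

// Sketch: every member is a vec4 or mat4, so all layouts agree.
layout (std140, binding = 0) uniform PerDraw
{
    vec4 color_and_intensity;    // offset 0:  rgb = color, a = intensity
    vec4 light_position;         // offset 16: xyz = position, w = explicit padding
    mat4 world_view_projection;  // offset 32
};                               // total size 96 bytes under every layout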

Related

Nondeterministic SSBO has GL_BUFFER_DATA_SIZE too small after glBufferData [duplicate]

The OpenGL specification lies (or is this a bug?)... Referring to the layout for std140, with shared uniform buffers, it states:
"The set of rules shown in Tabl e L-1 are used by the GLSL compiler to
layout members in a std140-qualified uniform block. The offsets of
members in the block are accumulated based on the sizes of the
previous members in the block (those declared before the variable in
question), and the starting offset. The starting offset of the first
member is always zero.
Scalar variable type (bool, int, uint, float) - Size of the scalar in
basic machine types"
(http://www.opengl-redbook.com/appendices/AppL.pdf)
So, armed with this information, I setup a uniform block in my shader that looks something like this:
// Spotlight.
layout (std140) uniform Spotlight
{
    float Light_Intensity;
    vec4  Light_Ambient;
    vec3  Light_Position;
};
... only to discover it doesn't work with the std140 layout I subsequently set up on the CPU side. That is, the first 4 bytes are a float (the size of the machine scalar type for GLfloat), the next 16 bytes are a vec4, and the following 12 bytes are a vec3 (with 4 bytes left over at the end to account for the rule that a vec3 is really a vec4).
When I change the CPU side to treat a float as being the same size as a vec4, i.e. 16 bytes, and compute my offsets and buffer size under that assumption, the shader works as intended.
So, either the spec is wrong or I've misunderstood the meaning of "scalar" in this context, or ATI have a driver bug. Can anyone shed any light on this mystery?
That PDF you linked to is not the OpenGL specification. I don't know where you got it from, but that is certainly not the full list of rules. Always check your sources; the spec is not as unreadable as many claim it to be.
Yes, the size of variables of basic types is the same size as the basic machine type (ie: 4 bytes). But size alone does not determine the position of the variable.
Each type has a base alignment, and no matter where that type appears in a uniform block, its overall byte offset must fit that alignment. The base alignment of a vec4 is 4 * the base alignment of its basic type (ie: float), so the base alignment of a vec4 is 16.
Because Light_Intensity ends after 4 bytes and Light_Ambient cannot start at offset 4 (it must start on a 16-byte boundary), the compiler inserts 12 bytes of padding between them.
ATI does have a few driver bugs around std140 layout, but this isn't one of them.
As a general rule, I like to explicitly put padding into my structures, and I avoid vec3 (because it has 16-byte alignment). Doing this generally cuts down on compiler bugs as well as accidental misunderstandings about where things go and how much room they actually take.
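For example (my own sketch of that advice; the padding member names are made up), the Spotlight block could be written so the std140 offsets match a naively packed CPU struct:

// Explicit padding, no vec3: the offsets are now obvious on both sides.
layout (std140) uniform Spotlight
{
    float Light_Intensity;       // offset 0
    float _pad0, _pad1, _pad2;   // offsets 4, 8, 12 (explicit padding)
    vec4  Light_Ambient;         // offset 16
    vec4  Light_Position;        // offset 32, w component unused
};                               // total size 48 bytes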

Is it legal to reuse Bindings for several Shader Storage Blocks

Suppose that I have one shader storage buffer and want to have several views into it, e.g. like this:
layout(std430,binding=0) buffer FloatView { float floats[]; };
layout(std430,binding=0) buffer IntView { int ints[]; };
Is this legal GLSL?
opengl.org says no:
Two blocks cannot use the same index.
However, I could not find such a statement in the GL 4.5 Core Spec or GLSL 4.50 Spec (or the ARB_shader_storage_buffer_object extension description) and my NVIDIA Driver seems to compile such code without errors or warnings.
Does the OpenGL specification expressly forbid this? Apparently not. Or at least, if it does, I can't see where.
But that doesn't mean that it will work cross-platform. When dealing with OpenGL, it's always best to take the conservative path.
If you need to "cast" memory from one representation to another, you should just use separate binding points. It's safer.
There is some official word on this now. I filed a bug on this issue, and they've read it and decided some things. Specifically, the conclusion was:
There are separate binding namespaces for: atomic counters, images, textures, uniform buffers, and SSBOs.
We don't want to allow aliasing on any of them except atomic counters, where aliasing with different offsets (e.g. sharing a binding) is allowed.
In short, don't do this. Hopefully, the GLSL specification will be clarified in this regard.
This was "fixed" in the revision 7 of GLSL 4.5:
It is a compile-time or link-time error to use the same binding number for more than one uniform block or for more than one buffer block.
I say "fixed" because you can still perform aliasing manually via glUniform/ShaderStorageBlockBinding. And the specification doesn't say how this will work exactly.

GLSL: Data Distortion

I'm using OpenGL 3.3 with GLSL 1.5 compatibility. I'm getting a strange problem with my vertex data: I'm trying to pass an index value to the fragment shader, but the value seems to change based on my camera position.
This should be simple: I pass a GLfloat through the vertex shader to the fragment shader and then convert this value to an unsigned integer. The value is correct the majority of the time, except at the edges of the fragment. No matter what I do, the same distortion appears. Why does my camera position change this value? Even in the ridiculous example below, tI erratically equals something other than 1.0:
uint i;
if (tI == 1.0) i = 1;
else i = 0;
vec4 color = texture2D(tex[i], t) ;
If I send integer data instead of float data I get the exact same problem. It does not seem to matter what I enter as vertex data; the value I enter is not consistent across the fragment, and the distortion even looks exactly the same each time.
What you are doing here is invalid in OpenGL/GLSL 3.30.
Let me quote the GLSL 3.30 specification, section 4.1.7 "Samplers" (emphasis mine):
Samplers aggregated into arrays within a shader (using square brackets
[ ]) can only be indexed with integral constant expressions (see
section 4.3.3 “Constant Expressions”).
Using a varying as index to a texture does not represent a constant expression as defined by the spec.
Beginning with GL 4.0, this was somewhat relaxed. The GLSL 4.00 specification states now the following (still my emphasis):
Samplers aggregated into arrays within a shader (using square brackets
[ ]) can only be indexed with a dynamically uniform integral
expression, otherwise results are undefined.
With dynamically uniform being defined as follows:
A fragment-shader expression is dynamically uniform if all fragments
evaluating it get the same resulting value. When loops are involved,
this refers to the expression's value for the same loop iteration.
When functions are involved, this refers to calls from the same call
point.
So now this is a bit tricky. If all fragment shader invocations actually get the same value for that varying, it would be allowed, I guess. But it is unclear whether your code guarantees that. You should also take into account that the fragment might even be sampled outside of the primitive.
However, you should never check floats for equality; there will be numerical issues. I don't know exactly what you are trying to achieve here, but you might use some simple rounding behavior, or use an integer varying. In any case you should also disable interpolation of the value using the flat qualifier (which is required for the integer case anyway), which should greatly improve the chances of that construct becoming dynamically uniform.
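As a sketch of that last suggestion (fragment shader only; it assumes a GLSL 4.00+ context, that the index really is dynamically uniform, and the variable names are mine):

#version 400
flat in int texIndex;          // flat: no interpolation, so no edge distortion
in vec2 t;
uniform sampler2D tex[2];
out vec4 fragColor;
void main()
{
    // Legal in GLSL 4.00+ only if texIndex is dynamically uniform.
    fragColor = texture(tex[texIndex], t);
}

The vertex shader would declare a matching flat out int texIndex; and copy the integer attribute straight through.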

Why is gl_VertexID not an unsigned int?

I am in the process of designing a shader program that makes use of the built-in variable gl_VertexID:
gl_VertexID — contains the index of the current vertex
The variable is defined as a signed int. Why is it not an unsigned int? What happens when it is used with very large arrays (e.g. an array 2^30 elements long)?
Does GLSL treat it as an unsigned int?
I want to use its content as an output of my shader (e.g. writing it into an output FBO buffer), and I will read its content using glReadPixels with GL_RED_INTEGER as the format and either GL_INT or GL_UNSIGNED_INT as the type.
Which one is correct?
If I use GL_INT I will not be able to address very large arrays.
In order to use GL_UNSIGNED_INT I might cast gl_VertexID to a uint inside my shader, but again, how do I address such a long array?
Most likely historical reasons. gl_VertexID was first defined as part of the EXT_gpu_shader4 extension. This extension is defined based on OpenGL 2.0:
This extension is written against the OpenGL 2.0 specification and version 1.10.59 of the OpenGL Shading Language specification.
GLSL did not yet support unsigned types at the time. They were not introduced until OpenGL 3.0.
I cannot tell whether OpenGL might treat the vertex ID as an unsigned int, but you could most likely create your own (full 32-bit) ID. I did this some time ago by specifying an RGBA8888 vertex color attribute, which is converted to an ID in the shader by bit-shifting the r, g, b, and a components.
Doing this, I also noticed that it wasn't any slower than using gl_VertexID, which seemed to introduce some overhead. Nowadays you would just use an unsigned int attribute.
Also, I wonder: why would you want to read back gl_VertexID?
(I did this once for an algorithm, and it turned out to be not well thought through; it has since been replaced by something more efficient ;) )
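A sketch of the "unsigned int attribute" approach mentioned above (the attribute names and locations are mine):

#version 330
layout(location = 0) in vec3 position;
layout(location = 1) in uint vertexId;   // fed via glVertexAttribIPointer on the CPU side
flat out uint vId;                       // passed to the fragment shader without interpolation
uniform mat4 mvp;
void main()
{
    vId = vertexId;
    gl_Position = mvp * vec4(position, 1.0);
}

The fragment shader can then declare flat in uint vId; and write it to an R32UI color attachment, which lines up with the GL_RED_INTEGER / GL_UNSIGNED_INT read-back described in the question.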

OpenGL - How is GLenum an unsigned 32-bit integer?

To begin there are 8 types of Buffer Objects in OpenGL:
GL_ARRAY_BUFFER​
GL_ELEMENT_ARRAY_BUFFER​
GL_COPY_READ_BUFFER
...
They are enums, or more specifically GLenums, where GLenum is an unsigned 32-bit integer that can hold values up to roughly 4.29 billion (2^32 - 1).
Most of the uses of buffer objects involve binding them to a certain target like this: e.g.
glBindBuffer (GL_ARRAY_BUFFER, Buffers [size]);
[void glBindBuffer (GLenum target, GLuint buffer)] documentation
My question is: if it's an enum, its only values must be 0, 1, 2, 3, 4 .. 7 respectively, so why go all the way and make it a 32-bit integer if it only has values up to 7? Pardon my knowledge of CS and OpenGL; it just seems unethical.
Enums aren't used just for the buffers, but everywhere a symbolic constant is needed. Currently, several thousand enum values are assigned (look into your GL.h and the latest glext.h). Note that vendors get allocated their official enum ranges so they can implement vendor-specific extensions without interfering with others, so a 32-bit enum space is not a bad idea. Furthermore, on modern CPU architectures, using fewer than 32 bits won't be any more efficient, so this is not a problem performance-wise.
UPDATE:
As Andon M. Coleman pointed out, currently only 16-bit enumerant ranges are being allocated. It might be useful to link to the OpenGL Enumerant Allocation Policies, which also have the following remark:
Historically, enumerant values for some single-vendor extensions were allocated in blocks of 1000, beginning with the block [102000,102999] and progressing upward. Values in this range cannot be represented as 16-bit unsigned integers. This imposes a significant and unnecessary performance penalty on some implementations. Such blocks that have already been allocated to vendors will remain allocated unless and until the vendor voluntarily releases the entire block, but no further blocks in this range will be allocated.
Most of these seem to have been removed in favor of 16-bit values, but 32-bit values have been in use. In the current glext.h, one can still find some (obsolete) enumerants above 0xffff, like
#ifndef GL_PGI_misc_hints
#define GL_PGI_misc_hints 1
#define GL_PREFER_DOUBLEBUFFER_HINT_PGI 0x1A1F8
#define GL_CONSERVE_MEMORY_HINT_PGI 0x1A1FD
#define GL_RECLAIM_MEMORY_HINT_PGI 0x1A1FE
...
Why would you use a short anyway? What situation would you ever be in where you would even save more than 8 KB of RAM (if the reports of nearly a thousand GLenums are correct) by using a short or uint8_t instead of GLuint for enums and const declarations? Considering the trouble of potential hardware incompatibilities and potential cross-platform bugs you would introduce, it's kind of odd to try to save something like 8 KB of RAM, even in the context of the original 2 MB Voodoo3D graphics hardware, much less the SGI super-computer farms OpenGL was created for.
Besides, modern x86 and GPU hardware aligns on 32 or 64 bits at a time; you would actually stall the operation of the CPU/GPU, as 24 or 56 bits of the register would have to be zeroed out and THEN read/written, whereas it could operate on the standard int as soon as it was copied in. From the start of OpenGL, compute resources have tended to be more valuable than memory; while you might do billions of state changes during a program's life, you'd be saving about 10 KB of RAM at most if you replaced every 32-bit GLuint enum with a uint8_t one. I'm trying so hard not to be extra-cynical right now, heh.
For example, one valid reason for things like uint8_t and the like is large data buffers/algorithms where the data fits in that bit depth. 1024 ints vs 1024 uint8_t variables on the stack is a difference of a few kilobytes; are we going to split hairs over that? Now consider a raw bitmap image of 4000*2500 pixels at 32 bits per pixel: that's about 40 MB, and it would be 8 times the size if we used 64-bit-per-channel RGBA buffers in place of standard 8-bit RGBA8 buffers, or quadruple the size with 32-bit-per-channel RGBA encoding. Multiply that by the number of textures open or pictures saved, and swapping a bit of CPU work for all that extra memory makes sense, especially in the context of that type of work.
That is where using a non-standard integer type makes sense. Unless you're on a 64k machine or something (like an old-school beeper; good luck running OpenGL on that), trying to save a few bits of memory on something like a const declaration or a reference counter is just wasting everyone's time.