(unsigned) byte in GLSL - opengl

Some of my vertex attributes are single unsigned bytes. I need them in my GLSL fragment shader, not for any "real" calculations, but for comparing them (like enums, if you will). I didn't find any unsigned byte or even byte data type in GLSL, so is there a way of using one as an input? If not (which at the moment seems to be the case), what is the purpose of GL_UNSIGNED_BYTE?

GLSL doesn't deal in sized types (well, not sized types smaller than 32-bits). It only has signed/unsigned integers, floats, doubles, booleans, and vectors/matrices of them. If you pass an unsigned byte as an integer vertex attribute to a vertex shader, then it can read it as a uint type, which is 32-bits in size. Passing integral attributes requires the use of glVertexAttribIPointer/IFormat (note the "I").
The vertex shader can then pass this value to the fragment shader as a uint type (but only with the flat interpolation qualifier). Of course, every fragment for a triangle will get the same value.

Related

What use has the layout specifier scalar in EXT_scalar_block_layout?

Question
What use has the scalar layout specifier when accessing a storage buffer in GL_EXT_scalar_block_layout? (see below for example)
What would be a use case for scalar?
Background
I recently programmed a simple Raytracer using Vulkan and NVidia's VkRayTracing extension, following this tutorial. In the section about the closest-hit shader, it is required to access some data that's stored in storage buffers (with usage flag vk::BufferUsageFlagBits::eStorageBuffer).
In the shader the extension GL_EXT_scalar_block_layout is used and those buffers are accessed like this:
layout(binding = 4, set = 1, scalar) buffer Vertices { Vertex v[]; } vertices[];
When I first used this code the validation layers told me that structs like Vertex had an invalid layout, so I changed them to have each member aligned on 16-byte boundaries:
struct Vertex {
vec4 position;
vec4 normal;
vec4 texCoord;
};
with the corresponding struct in C++:
#pragma pack(push, 1)
struct Vertex {
glm::vec4 position_1unused;
glm::vec4 normal_1unused;
glm::vec4 texCoord_2unused;
};
#pragma pack(pop)
Errors disappeared and I got a working Raytracer. But I still don't understand why the scalar keyword is used here. I found this document talking about the GL_EXT_scalar_block_layout extension, but I really don't understand it. Probably I'm just not used to GLSL terminology? I can't see any reason why I would have to use this.
Also, I just tried removing scalar and it still worked without any difference, warnings or errors whatsoever. I would be grateful for any clarification or further resources on this topic.
The std140 and std430 layouts do quite a bit of rounding of the offsets, alignments, and sizes of objects. std140 basically makes any non-scalar type aligned to the same alignment as a vec4. std430 relaxes that somewhat, but it still does a lot of rounding up to a vec4's alignment.
scalar layout basically means laying out the objects in accord with their component scalars. Aggregating components (into vectors, matrices, arrays, and structs) does not affect layout. In particular:
All types are sized/aligned only to the highest alignment of the scalar components that they actually use. So a struct containing a single uint is sized/aligned to the same size/alignment as a uint: 4 bytes. Under std140 rules, it would have 16-byte size and alignment.
Note that this layout makes vec3 and similar types actually viable, because C and C++ would then be capable of creating alignment rules that map to those of GLSL.
The array stride of elements in the array is based solely on the size/alignment of the element type, recursively. So an array of uint has an array stride of 4 bytes; under std140 rules, it would have a 16-byte stride.
Alignment and padding only matter for scalars. If you have a struct containing a uint followed by a uvec2, in std140/430, this will require 16 bytes, with 4 bytes of padding after the first uint. Under scalar layout, such a struct only takes 12 bytes (and is aligned to 4 bytes), with the uvec2 being conceptually misaligned. Padding therefore only exists if you have smaller scalars, like a uint16 followed by a uint.
In the specific case you showed, scalar layout was unnecessary since all of the types you used are vec4s.

Confusion About glVertexAttrib... Functions

After a lot of searching, I still am confused about what the glVertexAttrib... functions (glVertexAttrib1d, glVertexAttrib1f, etc.) do and what their purpose is.
My current understanding from reading this question and the documentation is that their purpose is to somehow set a vertex attribute as constant (i.e. don't use an array buffer). But the documentation also talks about how they interact with "generic vertex attributes" which are defined as follows:
Generic attributes are defined as four-component values that are organized into an array. The first entry of this array is numbered 0, and the size of the array is specified by the implementation-dependent constant GL_MAX_VERTEX_ATTRIBS. Individual elements of this array can be modified with a glVertexAttrib call that specifies the index of the element to be modified and a value for that element.
It says that they are all "four-component values", yet it is entirely possible to have more or less components than that in a vertex attribute.
What is this saying exactly? Does this only work for vec4 types? What would be the index of a "generic vertex attribute"? A clear explanation is probably what I really need.
In OpenGL, a vertex is specified as a set of vertex attributes. With the advent of the programmable pipeline, you are responsible for writing your own vertex processing functionality. The vertex shader processes one vertex and gets that specific vertex's attributes as input.
These vertex attributes are called generic vertex attributes, since their meaning is completely defined by you as the application programmer (in contrast to the legacy fixed-function pipeline, where the set of attributes was completely defined by the GL).
The OpenGL spec requires implementors to support at least 16 different vertex attributes. So each vertex attribute can be identified by its index from 0 to 15 (or whatever limit your implementation allows, see glGet(GL_MAX_VERTEX_ATTRIBS,...)).
A vertex attribute is conceptually treated as a four-component vector. When you use a type smaller than vec4 in the shader, the additional components are just ignored. If you supply fewer than 4 components on the input side, the vector is always filled up to (0,0,0,1), which makes sense for both RGBA color vectors and homogeneous vertex coordinates.
Though you can declare vertex attributes of mat types, this will just be mapped to a number of consecutive vertex attribute indices.
The vertex attribute data can come either from a vertex array (nowadays, these are required to lie in a Vertex Buffer Object, possibly directly in VRAM; in legacy GL, they could also come from the ordinary client address space) or from the current value of that attribute.
You enable fetching from attribute arrays via glEnableVertexAttribArray. If a vertex array is enabled for a particular attribute you access in your vertex shader, the GPU will fetch the i-th element from that array when processing vertex i. For all other attributes you access, you will get the current value for that attribute.
The current value can be set via the glVertexAttrib[1234]* family of GL functions. It cannot be changed during a draw call, so it remains constant for the whole draw call - just like a uniform variable.
One important thing worth noting is that, by default, vertex attributes are always floating-point, and you must declare them as float/vec2/vec3/vec4 in the vertex shader to access them. Setting the current value with, for example, glVertexAttrib4ubv, or using GL_UNSIGNED_BYTE as the type parameter of glVertexAttribPointer, will not change this. The data will be automatically converted to floating-point.
Nowadays, the GL does support two other attribute data types, though: 32-bit integers and 64-bit double-precision floating-point values. You have to declare them as int/ivec*, uint/uvec* or double/dvec* respectively in the shader, and you have to use completely separate functions when setting up the array pointer or current values: glVertexAttribIPointer and glVertexAttribI* for signed/unsigned integers, and glVertexAttribLPointer and glVertexAttribL* for doubles ("long floats").

How do I deal with many variables per triangle in OpenGL?

I'm working with OpenGL and am not totally happy with the standard method of passing values per triangle (or, in my case, per quad) that need to make it to the fragment shader: assigning them to each vertex of the primitive and passing them through the vertex shader, where they are presumably interpolated unnecessarily (unless the flat qualifier is used) - in other words, values that do not vary per fragment.
Is there some way to store a value per triangle (or quad) that needs to be accessed in the fragment shader, in such a way that you don't need redundant copies of it per vertex? If so, is this better than the likely overhead of 3x (or 4x) the CPU-side data-moving code?
I am aware of using geometry shaders to spread the values out to new vertices, but I have heard geometry shaders are terribly slow on older hardware. Is that the case?
OpenGL fragment language supports the gl_PrimitiveID input variable, which will be the index of the primitive for the currently processed fragment (starting at 0 for each draw call). This can be used as an index into some data store which holds per-primitive data.
Depending on the amount of data that you will need per primitive, and the number of primitives in total, different options are available. For a small number of primitives, you could just set up a uniform array and index into that.
For a reasonably high number of primitives, I would suggest using a texture buffer object (TBO). This is basically an ordinary buffer object, which can be accessed read-only at random locations via the texelFetch GLSL operation. Note that TBOs are not really textures, they only reuse the existing texture object interface. Internally, it is still a data fetch from a buffer object, and it is very efficient with none of the overhead of the texture pipeline.
The only issue with this approach is that you cannot easily mix different data types. You have to define a base data type for your TBO, and every fetch will get you the data in that format. If you just need some floats/vectors per primitive, this is not a problem at all. If you e.g. need some ints and some floats per primitive, you could either use different TBOs, one for each type, or with modern GLSL (>=3.30), you could use an integer type for the TBO and reinterpret the integer bits as floating point with intBitsToFloat(), so you can get around that limitation, too.
You can use one array element for rendering multiple vertices by using instanced vertex attributes (set up with glVertexAttribDivisor).

GLSL : uniform buffer object example

I have an array of GLubyte of variable size. I want to pass it to fragment shader. I have seen
This thread and this thread. So I decided to use "Uniform Buffer Objects". But being a newbie in GLSL, I do not know:
1 - If I am going to add this to fragment shader, how do I pass size? Should I create a struct?
layout(std140) uniform MyArray
{
GLubyte myDataArray[size]; //I know GLSL doesn't understand GLubyte
};
2- how and where in C++ code associate this buffer object ?
3 - how to deal with casting GLubyte to float?
1 - If I am going to add this to fragment shader, how do I pass size? Should I create a struct?
Using Uniform Buffers (UB), you cannot do this.
size must be static and known when you link your GLSL program. This means it has to be hard-coded into the actual shader.
The modern way around this is to use a feature from GL4 called Shader Storage Buffers (SSB).
SSBs can have variable length (the last field can be declared as an unsized array, like myDataArray[]) and they can also store much more data than UBs.
In older versions of GL, you can use a Buffer Texture to pass large amounts of dynamically sized data into a shader, but that is a cheap hack compared to SSBs and you cannot access the data using a nice struct-like interface either.
3 - how to deal with casting GLubyte to float?
You really would not do this at all; it is considerably more complicated than a cast.
The smallest data type you can use in a GLSL data structure is 32-bit. You can pack and unpack smaller pieces of data into a uint if needed, though, using special functions like packUnorm4x8 (...). That was done intentionally, to avoid having to define new data types with smaller sizes.
You can do that even without using any special GLSL functions.
packUnorm4x8 (...) is roughly equivalent to performing the following:
for (int i = 0; i < 4; i++)
packed += round (clamp (vec [i], 0, 1) * 255.0) * pow (2, i * 8);
It takes a 4-component vector of floating-point values in the range [0,1] and does fixed-point arithmetic to pack each of them into an unsigned normalized (unorm) 8-bit integer occupying its own 1/4 of a uint.
Newer versions of GLSL introduce intrinsic functions that do that, but GPUs have actually been doing that sort of thing for as long as shaders have been around. Anytime you read/write a GL_RGBA8 texture from a shader you are basically packing or unpacking 4 8-bit unorms represented by a 32-bit integer.

GLSL packing 4 float attributes into vec4

I have a question about the resource consumption of a float attribute in GLSL.
Does it take as many resources as a vec4, or not?
I ask this because uniforms can take up a full slot even for a single float (at least, they could) - see https://stackoverflow.com/a/20775024/1559666.
If it does not, does it make any sense to pack 4 floats into one vec4 attribute?
Yes, all vertex attributes require some multiple of a 4-component vector for storage.
This means that a float vertex attribute takes 1 slot the same as a vec2, vec3 or vec4 would. And types larger than vec4 take multiple slots. A mat4 vertex attribute takes 4 x vec4 many units of storage. A dvec4 (double-precision vector) vertex attribute takes 2 x vec4. Since implementations are only required to offer 16 unique vertex attribute slots, if you naively used single float attributes, you could easily exhaust all available storage just to store a 4x4 matrix.
There is no getting around this. Unlike uniforms (scalar GPUs may be able to store float uniforms more efficiently than vec4), attributes are always tied to a 4-component data type. So for vertex attributes, packing attributes into vectors is quite important.
I have updated my answer to point out relevant excerpts from the GL and GLSL specifications:
OpenGL 4.4 Core Profile Specification - 10.2.1 Current Generic Attributes - pp. 307
Vertex shaders (see section 11.1) access an array of 4-component generic vertex
attributes. The first slot of this array is numbered zero, and the size of the array is
specified by the implementation-dependent constant GL_MAX_VERTEX_ATTRIBS.
GLSL 4.40 Specification - 4.4.1 Input Layout Qualifiers - pp. 60
If a vertex shader input is any scalar or vector type, it will consume a single location. If a non-vertex shader input is a scalar or vector type other than dvec3 or dvec4, it will consume a single location, while types dvec3 or dvec4 will consume two consecutive locations. Inputs of type double and dvec2 will consume only a single location, in all stages.
Admittedly, the behavior described for dvec4 differs slightly. In GL_ARB_vertex_attrib_64bit form, double-precision types may consume twice as much storage as floating-point, such that a dvec3 or dvec4 may consume two attribute slots. When it was promoted to core, that behavior changed... they are only supposed to consume 1 location in the vertex stage, potentially more in any other stage.
Original (extension) behaviour of double-precision vector types:
Name
ARB_vertex_attrib_64bit
[...]
Additionally, some vertex shader inputs using the wider 64-bit components
may count double against the implementation-dependent limit on the number
of vertex shader attribute vectors. A 64-bit scalar or a two-component
vector consumes only a single generic vertex attribute; three- and
four-component "long" may count as two. This approach is similar to the
one used in the current GL where matrix attributes consume multiple
attributes.
A vec4 attribute will take 4 times the memory of a float attribute.
For uniforms, due to alignment rules, you may lose some components (a vec4 will be aligned to a 16-byte boundary).