I've been using Vulkan for a while but I just now learned about specialization constants. The specification says:
Specialization constants are useful to allow a compute shader to have its local workgroup size changed at runtime by the user, for example.
Neat! I want to do almost exactly that, and I would like to use such varying constants for other purposes as well. But the examples of specialization constants given in the Vulkan specification (version 1.0.34 at the moment) all appear to be in SPIR-V, not GLSL, and my shaders are all written in GLSL. So I think I probably can't use this nice feature. :(
Am I right? Or is there a way to use specialization constants via GLSL, either as workgroup size constants, or as arbitrary constant variable values, or in some other fashion?
Sure, specialization constants can be used with GLSL in Vulkan, as described by the GL_KHR_vulkan_glsl extension.
It's a special layout qualifier, so your GLSL specialization constants will look like this:
layout (constant_id = 0) const int SSAO_KERNEL_SIZE = 64;
Values for those constants are then specified upon pipeline creation using the pSpecializationInfo member of the shader stage create info used in the pipeline create info.
This also works for compute shader workgroup sizes, e.g. through the local_size_x_id layout qualifier.
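As a rough sketch of the host side, the values are described by a list of map entries plus a data blob. The structs below are stand-ins that mirror the fields of VkSpecializationMapEntry and VkSpecializationInfo so the example compiles without the Vulkan SDK; SpecMapEntry, SpecInfo, SpecData, and makeSpecInfo are hypothetical names, and the second constant (ID 1, used as a workgroup size via `layout(local_size_x_id = 1) in;`) is an assumption added for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins mirroring the fields of VkSpecializationMapEntry and
// VkSpecializationInfo. In real code, include <vulkan/vulkan.h> and use
// the Vk* types instead.
struct SpecMapEntry {
    std::uint32_t constantID;  // matches constant_id / local_size_x_id in GLSL
    std::uint32_t offset;      // byte offset of the value inside the data blob
    std::size_t   size;        // byte size of the value
};

struct SpecInfo {
    std::uint32_t       mapEntryCount;
    const SpecMapEntry* pMapEntries;
    std::size_t         dataSize;
    const void*         pData;
};

// Hypothetical blob holding the values to specialize.
struct SpecData {
    std::int32_t  ssaoKernelSize;  // for constant_id = 0
    std::uint32_t workgroupSizeX;  // for local_size_x_id = 1
};

// One map entry per constant, each pointing into SpecData.
static const SpecMapEntry kEntries[2] = {
    {0, static_cast<std::uint32_t>(offsetof(SpecData, ssaoKernelSize)), sizeof(std::int32_t)},
    {1, static_cast<std::uint32_t>(offsetof(SpecData, workgroupSizeX)), sizeof(std::uint32_t)},
};

// Builds the struct that pSpecializationInfo would point to.
// `data` must stay alive until pipeline creation has finished.
inline SpecInfo makeSpecInfo(const SpecData& data) {
    return SpecInfo{2, kEntries, sizeof(SpecData), &data};
}
```

With the real Vulkan types, the address of the resulting struct would be assigned to pSpecializationInfo before calling the pipeline creation function.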
I need to pass a minimum of one and a maximum of seven file paths to a function. There is a convention where the file path alone is enough to identify how to handle each file.
Order of the parameters does not matter.
An obvious option to handle this (the one I currently implemented) is to pass an empty string as a parameter for unused slots.
Another one is to pass the parameters as an array or vector.
Yet another one would be to implement all possible permutations of parameters (possible, not practical).
I wonder if there is a way to simply specify that the number of parameters can vary, and then simply pass the parameters themselves.
So, for example, assuming that there is only one implementation of f(), with special syntax to denote a varying number of parameters,
all of the following should compile:
int main()
{
f(file);
f(file1, file2);
f(file1, file3, file2, file6);
}
Is there a way to achieve this in C++ ?
You can use a recursive template function.
#include <iostream>
#include <utility> // std::forward
template <typename First>
void f(First&& first) {
std::cout << first << std::endl;
}
template <typename First, typename... Rest>
void f(First&& first, Rest&&... rest) {
f(std::forward<First>(first));
f(std::forward<Rest>(rest)...);
}
int main() {
f(6,7,8,9,10);
}
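For the file-path case in the question, the same parameter-pack idea can be written without recursion. This is only a sketch, assuming C++17 (for the fold expression) and that every argument is convertible to std::string; collect_paths is a hypothetical name, and the one-to-seven bound comes from the question.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: gathers any number of path-like arguments into a
// vector so that a single non-template function can do the real work.
template <typename... Paths>
std::vector<std::string> collect_paths(Paths&&... paths) {
    static_assert(sizeof...(Paths) >= 1 && sizeof...(Paths) <= 7,
                  "expected between one and seven file paths");
    std::vector<std::string> out;
    out.reserve(sizeof...(Paths));
    // C++17 fold over the comma operator: one emplace_back per argument,
    // evaluated left to right.
    (out.emplace_back(std::forward<Paths>(paths)), ...);
    return out;
}
```

A call such as collect_paths(file1, file3, file2) then yields a vector that can be handed to one ordinary function, which keeps the template part small.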
If you really need a variable (unbounded) number of arguments:
If you are on C++11 or later:
Use std::initializer_list (only if all the types are the same) -- see https://stackoverflow.com/a/16338804/9305398.
Use variadic templates (i.e. parameter packs) -- see @super's answer and/or https://stackoverflow.com/a/16338804/9305398.
If you are on C++03 or later:
Use variadic arguments -- see https://stackoverflow.com/a/1657924/9305398.
Otherwise, if you have a fixed number of (optional) parameters:
If you are on C++20 or later:
Use designated initialization as a way to have named parameters.
If you are on C++03 or later:
Use a nullable/optional type (e.g. a raw pointer, boost::optional, C++17's std::optional...) -- see @NicolBolas' answer.
Define all required/logical overloads (possibly using custom types) -- ugly, but this may be automated via an external code generator and/or with the preprocessor.
Otherwise, if you can use a different design to accomplish the same thing, you can do any of the following -- for C++03 and later:
Pass a pointer to a struct as suggested by @PaulMcKenzie.
Design a class that lets you set properties (through the constructor and/or methods) and then has member functions to perform operations on that data, e.g.:
ShaderCompiler sc(vs, fs, ...);
sc.setGeometryShader(...);
sc.compile();
A particularly nice way (see e.g. QString) is to design a class that supports method chaining:
result = ShaderCompiler()
.vertex(...)
.fragment(...)
...
.compile()
;
Similarly, exploiting argument-dependent lookup:
Shader()
<< Vertex(...)
<< Fragment(...)
...
;
Since you have a bounded set of possibilities, here's the obvious way to handle this:
using opt_path = std::optional<path>;
shader compile_shaders(opt_path vs, opt_path tcs = std::nullopt, opt_path tes = std::nullopt, opt_path gs = std::nullopt, opt_path fs = std::nullopt, opt_path cs = std::nullopt)
{
...
}
This just uses default arguments for all the other shader paths. You can tell which is provided and which is not through the interface of std::optional. If you're not using C++17, you can substitute boost::optional or a similar type.
Of course, however you handle this, it will lead to a decidedly poor interface. Consider what one has to do in order to create the most common case: a vertex shader combined with a fragment shader:
compile_shaders(vs_path, std::nullopt, std::nullopt, std::nullopt, fs_path);
Will the user remember that there are 3 stages between them? Odds are good, they will not. People will constantly make the mistake of using only 2 std::nullopts or using 4. And considering that VS+FS is the most common case, you have an interface where the most common case is very easy to get wrong.
Now sure, you could rearrange the order of the parameters, making the FS the second parameter. But if you want to use other stages, you now have to look up the definition of the function to remember which values map to which stages. At least, the way I did it here follows OpenGL's pipeline. An arbitrary mapping requires looking up the docs.
And if you want to create a compute shader, you have to remember there are 5 stages you have to explicitly null out:
compile_shaders(std::nullopt, std::nullopt, std::nullopt, std::nullopt, std::nullopt, cs_path);
Compare all this to a more self-descriptive interface:
shader_paths paths(vs_path, shader_paths::vs);
paths.fragment(fs_path);
auto shader = compile_shaders(paths);
There is zero ambiguity here. The path given to the constructor is explicitly stated to be a vertex shader, using a second argument. So if you want a compute shader, you would use shader_paths::cs to express that. The paths are then given a fragment shader, using an appropriately named function. Following this, you compile the shaders, and you're done.
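One possible shape for such a parameter object is sketched below. It is not a definitive design: only the constructor, shader_paths::vs, shader_paths::cs, and fragment() appear in the text above, while the other enumerators, setters, get(), and the std::string path type are assumptions made for illustration (C++17, for std::optional).

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch of a self-descriptive parameter object.
class shader_paths {
public:
    enum stage { vs, tcs, tes, gs, fs, cs, stage_count };

    // The first path must name its stage explicitly.
    shader_paths(std::string path, stage s) { set(s, std::move(path)); }

    // Named setters; each returns *this so calls can be chained.
    shader_paths& vertex(std::string p)   { return set(vs, std::move(p)); }
    shader_paths& fragment(std::string p) { return set(fs, std::move(p)); }
    shader_paths& geometry(std::string p) { return set(gs, std::move(p)); }
    shader_paths& compute(std::string p)  { return set(cs, std::move(p)); }

    // An empty optional means "stage not provided".
    const std::optional<std::string>& get(stage s) const { return paths_[s]; }

private:
    shader_paths& set(stage s, std::string p) {
        paths_[s] = std::move(p);
        return *this;
    }

    std::array<std::optional<std::string>, stage_count> paths_{};
};
```

A compile_shaders(const shader_paths&) function could then query get() for each stage instead of counting positional arguments.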
Background
I'm currently writing a wrapper around OpenGL's glUniform functions in C++ in an effort to make them type safe. I have a bunch of set_uniform functions that are overloaded to accept either the OpenGL PODs (GLint, GLuint, GLfloat) or any of the GLM vector and matrix types.
I thought it had all been straightforward so far, but then I hit a problem with boolean types. GLSL provides bool, bvec2, bvec3 and bvec4, so I must provide a set_uniform overload for GLboolean as well as the GLM boolean vector types.
According to the OpenGL manual there is no glUniform function that accepts either a GLboolean or a pointer to a GLboolean array. I must pass either GLint, GLuint or GLfloat and the driver will do the conversion for me.
Either the i, ui or f variants may be used to provide values for uniform variables of type bool, bvec2, bvec3, bvec4, or arrays of these. The uniform variable will be set to false if the input value is 0 or 0.0f, and it will be set to true otherwise.
Converting GLboolean to GLint before passing is easy enough but the GLM vector types are proving more difficult. The deeper I go into the implementation the more worried I get about this library.
Problem
The recommended way to pass a GLM vector type to OpenGL is to use glm::value_ptr:
glm::bvec3 v(true, true, false);
glUniform3iv(some_uniform_id, 1, glm::value_ptr(v));
I have a number of problems with this code.
First, glm::bvec3 is implemented as a struct of 3 bools (not GLboolean but C++ bool). I don't believe I should pass it directly, since glUniform3iv expects a pointer to GLints. The C++ standard gives no guarantee about the size of a bool. This means glUniform3iv is potentially reading garbage for the second and third components, or worse, it's actually reading past the end of the array.
To correct this I convert from glm::bvec3 to glm::ivec3 before passing to OpenGL:
glm::bvec3 bv(true, true, false);
glm::ivec3 iv = bv;
glUniform3iv(some_uniform_id, 1, glm::value_ptr(iv));
I'm not 100% happy with this since glm::ivec3 has a value type of glm::detail::mediump_int_t which is a typedef for int rather than GLint but maybe this can be chalked up to 'the library designer knows the sizes are the same'.
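The widening step can also be sketched without GLM at all, which makes the element type explicit. This is a minimal illustration, assuming GLint is int on the target platform; to_int_array is a hypothetical helper name, and std::array stands in for the vector types.

```cpp
#include <array>
#include <cstddef>

// Widens N bools into N ints (1 for true, 0 for false) so the result can be
// handed to an integer-based upload call such as glUniform3iv.
// Assumption: GLint is int on the target platform.
template <std::size_t N>
std::array<int, N> to_int_array(const std::array<bool, N>& flags) {
    std::array<int, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = flags[i] ? 1 : 0;
    }
    return out;
}
```

The data() pointer of the returned array addresses N contiguous ints, which is what the iv variants of glUniform read.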
The second and more major problem is that glm::value_ptr is just passing the address of the first struct member and treating the struct as an array with no regard to padding.
Am I missing something here? The GLM library is very widely used alongside OpenGL, it's even listed on Khronos' own wiki. Yet the function it provides for passing its structures to OpenGL, namely glm::value_ptr, makes no effort to ensure the types it's passing are actually the same size as the types OpenGL expects as well as completely disregarding any padding that may exist. Is the GLM library doing some hidden trickery with regard to type sizes and struct padding so that the data sent to OpenGL is valid or does this library have some serious fundamental problems?
Is the GLM library doing some hidden trickery with regard to type sizes and struct padding so that the data sent to OpenGL is valid or does this library have some serious fundamental problems?
Neither. It's simply making the same assumptions that everyone else does about the behavior of struct layouts and pointer arithmetic.
The C++ standard does not allow value_ptr to work; it is clearly undefined behavior. But it is also a commonly used technique for dealing with such things. Lots of real, functional code out there assumes that a struct { int x; int y;}; can be considered equivalent to an int[2]. And under most C++ implementations, this will all function as expected.
When dealing with low-level programming, it is not unreasonable to make assumptions of this nature.
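If you rely on that assumption, you can at least have the compiler confirm the layout part of it. The sketch below uses a hypothetical Vec3i struct as a stand-in for a GLM vector. The checks only verify size and offsets; they do not make pointer arithmetic across members defined behavior.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a three-component integer vector.
struct Vec3i {
    int x;
    int y;
    int z;
};

// Compile-time checks of the layout assumptions: no padding between or
// after the members, so the bytes line up with those of an int[3].
static_assert(std::is_standard_layout<Vec3i>::value, "offsetof needs a standard-layout type");
static_assert(sizeof(Vec3i) == 3 * sizeof(int), "unexpected padding");
static_assert(offsetof(Vec3i, x) == 0, "x must come first");
static_assert(offsetof(Vec3i, y) == sizeof(int), "y must follow x");
static_assert(offsetof(Vec3i, z) == 2 * sizeof(int), "z must follow y");
```

If a compiler ever lays the struct out differently, the build fails instead of the program silently sending the wrong bytes.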
I'm not 100% happy with this since glm::ivec3 has a value type of glm::detail::mediump_int_t which is a typedef for int rather than GLint but maybe this can be chalked up to 'the library designer knows the sizes are the same'.
That has nothing to do with it. While GLM is called "OpenGL Mathematics", it has no dependency on OpenGL itself. As such, it has no access to GLint or any other OpenGL-defined type.
So you can either assume that ivec3's value_type will be the same type as GLint (you can even write static_asserts to verify it) or you can make your own variation. After all, GLM is templated:
using gl_ivec3 = glm::vec<3, GLint>; // GLM 0.9.9 or later; older releases spell this glm::tvec3<GLint>
...
gl_ivec3 iv = bv;
glUniform3iv(some_uniform_id, 1, glm::value_ptr(iv));
I've just learned from Bug in VC++ 14.0 (2015) compiler? that one shouldn't make assumptions about how a struct's layout will end up in memory. However, I don't understand how it is common practice in a lot of code I've seen. For example, the Vulkan graphics API does the following:
Defines a struct
struct {
glm::mat4 projection;
glm::mat4 model;
glm::vec4 lightPos;
} uboVS;
Then fills up its fields:
uboVS.model = ...
uboVS....
Then just copies over the struct (in host memory) to device memory via memcpy:
uint8_t *pData;
vkMapMemory(device, memory, 0, sizeof(uboVS), 0, (void **)&pData);
memcpy(pData, &uboVS, sizeof(uboVS));
vkUnmapMemory(device, memory);
Then over to the GPU, it defines a UBO to match that struct:
layout (binding = 0) uniform UBO
{
mat4 projection;
mat4 model;
vec4 lightPos;
} ubo;
Then, on the GPU side, ubo will always match uboVS.
Is this the same undefined behavior? Doesn't that code rely on the uboVS struct to be laid out exactly as defined, or for both sides (the compiled C++ code and the compiled SPIR-V shader) to basically generate the same different struct layout? (similar to the first example in https://www.securecoding.cert.org/confluence/display/c/EXP11-C.+Do+not+make+assumptions+regarding+the+layout+of+structures+with+bit-fields)
This question is not specific to Vulkan or graphics APIs, I am curious as to what exactly can one assume and when is it ok to just use a struct as a chunk of memory. I understand struct packing and alignment, but is there more to it?
Thanks
It's important to recognize the difference between what you did in the question you cite and what you're doing here.
What you did in the question you showed breaks the rules of C++. It invokes undefined behavior. You tried to pretend that an object containing 16 floats is the same thing as a 16-float array. C++ doesn't permit this to be well-defined behavior, and compilers are allowed to assume you won't try it.
By contrast, converting a struct into a byte array and copying that array somewhere else actually doesn't break the rules of the C++ object model. It has a very specific clause permitting such things for appropriate types (trivially copyable ones).
The difference is that it's not the C++ compiler that cares about the object's layout; it's the GPU. So long as the layout of data you provide matches what your shader said it would be, you're fine. You're not casting floats into arrays or trying to access one object through a pointer to a different one or somesuch. You're just copying bytes.
At which point, the only question that remains is whether the byte representation of that struct matches the byte representation of the expected SPIR-V data structure definition. And yes, this is something you can rely upon for most systems that Vulkan can run on.
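Both halves of that argument can be checked at compile time. The sketch below is an illustration under stated assumptions: Mat4 and Vec4 are hypothetical stand-ins for glm::mat4 and glm::vec4, float is 4 bytes, and the shader block is assumed to follow std140 rules (mat4 takes 64 bytes, vec4 takes 16, both aligned to 16). A std::vector stands in for the mapped device memory.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for glm::mat4 and glm::vec4: plain float arrays.
struct Mat4 { float m[16]; };
struct Vec4 { float v[4]; };

struct UboVS {
    Mat4 projection;
    Mat4 model;
    Vec4 lightPos;
};

// Copying an object's bytes is well defined for trivially copyable types.
static_assert(std::is_trivially_copyable<UboVS>::value, "memcpy requires this");

// Offsets the shader-side block is expected to use, assuming std140 rules.
static_assert(offsetof(UboVS, projection) == 0, "projection offset");
static_assert(offsetof(UboVS, model) == 64, "model offset");
static_assert(offsetof(UboVS, lightPos) == 128, "lightPos offset");
static_assert(sizeof(UboVS) == 144, "total size");

// Copies the struct's bytes into a buffer that stands in for mapped memory.
inline std::vector<unsigned char> to_bytes(const UboVS& ubo) {
    std::vector<unsigned char> bytes(sizeof(UboVS));
    std::memcpy(bytes.data(), &ubo, sizeof(UboVS));
    return bytes;
}
```

Structs that mix in vec3 or scalar members are where host and shader layouts tend to diverge, so checks like these are most useful there.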
It is true that, roughly speaking, the C++ standard does not mandate any particular internal layout of class members.
However, specialty libraries, like graphics libraries for a particular operating system, are going to be targeting the operating system's specific compiler. They know how this particular compiler arranges the layout of C/C++ class and structure members, and the library is going to supply suitable definitions that match the actual hardware in question.
Operating systems that have more than one compiler will often have a formal specification for that operating system's binary ABI, and the compilers are going to follow that ABI, and the specialty library will provide class and structure definitions that will be in sync with that.
So, in your specific case, it is OK to "just use a struct as a chunk of memory" once you have consulted your compiler's documentation, determined how it lays out the members of structures and classes, and designed your structure layout accordingly.
SPIR-V (the intermediate shader representation you pass to Vulkan) requires that you add layout decorations to struct members used for UniformConstant, Uniform, and PushConstant variables. You can use this to make the SPIR-V member offsets match the member offsets in the C++ struct.
Actually doing this is tricky, as it requires inspecting the SPIR-V code and setting the offsets as needed.
I'm trying to run my shaders on an Intel card. I found that sampler types cannot be declared as structure fields... It was disappointing.
My shaders compile and run fine on NVIDIA platforms, with sampler arrays and structs with sampler fields. I know that the NVIDIA platform is more permissive than others w.r.t. GLSL syntax, but I think that sampler types should be allowed in structures and arrays.
But after reading this page, I got more confused. In particular, I found the following quotes interesting:
Arrays of sampler types are special. Under GLSL version 3.30, sampler arrays can be declared
Structs cannot contain variables of sampler types.
So I investigated the GLSL specification and found that sampler types are basic types (para 4.1), that arrays can be composed of basic types (para ), and that the same holds for structure member declarations (para 4.1.9). Am I misinterpreting the specification, or is the Intel driver too "strict"?
Could someone shed some light on this? The final question is: "Are sampler types considered basic types?"
What part of this is unclear?
Samplers are basic types. Basic types can be in arrays. And samplers:
They can only be declared as function parameters or uniform variables (see section 4.3.5 “Uniform” ).
Fields in a struct are neither function parameters nor uniform variables. The struct itself can later be declared as a uniform, but the member declaration isn't a uniform yet. Therefore, it is illegal to declare a sampler within a struct.
It is best to not think of samplers and other opaque types as types, but instead as placeholders for special constructs (like textures and such).
I have to send vertex attributes using glVertexAttribPointer to shaders expecting them as built-in (gl_Vertex, gl_Color, etc.).
The glVertexAttribPointer function needs the index (or location) of each built-in attribute. I can do this on NVIDIA implementations, since the location of each attribute is fixed (see http://www.opengl.org/sdk/docs/tutorials/ClockworkCoders/attributes.php, section "Custom attributes"); however, I'm not sure about the locations in ATI implementations.
Also, glGetAttribLocation returns -1 when asked for the location of any attribute starting with "gl_".
I think I'm missing something and this is a trivial problem, but I have not found the correct solution for ATI.
The builtin attribute arrays are not set with glVertexAttribPointer, but with functions like glVertexPointer, glColorPointer, .... And you enable these by calling glEnableClientState with constants like GL_VERTEX_ARRAY, GL_COLOR_ARRAY, ..., instead of glEnableVertexAttribArray.
While glVertexAttribPointer might work on NVIDIA, due to their aliasing of custom attribute indices with builtin attributes, this is not standard-conformant and I'm sure you cannot expect it from any other hardware vendor. So to be sure, use glVertexAttribPointer for custom attributes and the glVertexPointer/glNormalPointer/... functions for builtin attributes, together with the matching enable/disable functions.
Keep in mind that the builtin attributes are deprecated anyway, together with the mentioned functions. So if you want to write modern OpenGL code, you should define your own attributes anyway. But maybe you have to support legacy shaders or don't care about forward compatibility at the moment.