Following Kyle Halladay's tutorial on "Using Arrays of Textures in Vulkan Shaders" I managed to make my code work.
At some point, Kyle says:
"I don’t know how much (if any) of a performance penalty you pay for using an array of textures over a sampler2DArray." (cit.)
I'm concerned about performance. This is the shader:
#version 450
#extension GL_ARB_separate_shader_objects : enable
layout(binding = 1) uniform sampler baseSampler;
layout(binding = 2) uniform texture2D textures[2];
layout(location = 0) in vec2 inUV;
layout(location = 1) flat in vec4 inTint;
layout(location = 2) flat in uint inTexIndex;
layout(location = 0) out vec4 outColor;
void main()
{
    outColor = inTint * texture(sampler2D(textures[inTexIndex], baseSampler), inUV);
}
The part I'm concerned about is sampler2D(textures[inTexIndex], baseSampler), where it looks like a sampler2D is set up based on baseSampler. This looks horrendous, and I don't know whether it runs per-fragment or whether glslc can somehow optimize it away.
Does anyone know how much of an impact sampler2D() has?
Obsolete question which received answers in the comments: What if I bind an array of VkSampler descriptors (VK_DESCRIPTOR_TYPE_SAMPLER) in place of the VkImageView descriptors (VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE)? The shader wouldn't index into a texture2D array but into a sampler2D array with attached ImageInfo (the textures).
Also, are these kinds of optimizations crucial or are they irrelevant?
EDIT: cleaned up original question without changing the meaning, added same/corollary question with better wording below.
I apologize for my English.
What does this specific piece of code do:
texture(sampler2D(textures[inTexIndex], baseSampler), inUV)
Is this executed per-fragment? And if so, is the sampler2D() a function? A type cast? A constructor that allocates memory? This "function" is what I'm concerned about most. I presume indexing is inevitable.
In the comments I wonder whether, as an alternative, I could use VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER descriptors and have an array of sampler2D in the shader. Would this choice increase performance?
Finally, I wonder if switching to a sampler2DArray really makes much difference (performance-wise).
The cost of not using combined image/sampler objects will of course depend on the hardware. However, consider this.
HLSL, from Shader Model 4 onwards (so D3D10+), has never had combined image/samplers; it has always used separate texture and sampler objects.
So if an entire API has been doing this for over a decade, it's probably safe to say that this is not going to be a performance problem.
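For comparison, here is a minimal sketch of the combined-image-sampler alternative the question asks about (VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER bound as an array); the binding number and array size are my assumptions, not taken from the original setup:
#version 450
#extension GL_ARB_separate_shader_objects : enable
// Assumed: binding 1 holds an array of combined image/samplers,
// i.e. VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER on the descriptor side.
layout(binding = 1) uniform sampler2D combinedTextures[2];
layout(location = 0) in vec2 inUV;
layout(location = 1) flat in vec4 inTint;
layout(location = 2) flat in uint inTexIndex;
layout(location = 0) out vec4 outColor;
void main()
{
    // No sampler2D(...) construction needed: image and sampler are already paired.
    outColor = inTint * texture(combinedTextures[inTexIndex], inUV);
}
The difference is only whether the image/sampler pairing happens at descriptor-write time or in the shader; the sampled result is the same.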
According to the Vulkan specification,
The following features are removed:
default uniforms (uniform variables not inside a uniform block)
For example, the following is forbidden in GLSL:
layout(set = 0, binding = 0) uniform mat4 projectionMatrix;
Instead, it seems, programmers are encouraged to create a uniform block:
layout(set = 0, binding = 0) uniform Projection {
    mat4 matrix;
} projection;
Using single-valued uniforms is sometimes useful for smaller applications. What was the rationale behind removing the functionality?
It's in keeping with Vulkan's philosophy of being more explicit, where the application makes the decisions the driver otherwise would. It's not efficient for GPUs to manage single uniforms in isolation; they will inevitably be aggregated into some kind of uniform block under the hood. So Vulkan exposes only uniform blocks and lets you make those decisions yourself. You can use a single uniform block per variable if you really want to, but you'll have to explicitly allocate memory for each of them separately.
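As an illustration (my sketch, not from the answer above): if you really do want one value per binding, each variable simply gets its own block, and each block is then backed by a buffer range you allocate and bind yourself:
layout(set = 0, binding = 0) uniform Projection {
    mat4 projectionMatrix;
};
layout(set = 0, binding = 1) uniform View {
    mat4 viewMatrix;
};
// Each block above needs its own descriptor and its own explicitly managed
// buffer memory, which is exactly the bookkeeping Vulkan pushes onto you.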
For the simple vertex shader below, which renders a triangle, when is it safe to update the uniform mvpMatrix from the CPU (using glUniformMatrix4fv)? Is it safe to assume that after a draw call, e.g. glDrawArrays, the uniform can be updated for the next draw? Or are there sync mechanisms needed to ensure that the update does not take place midway through the vertex shader applying the MVP matrix?
#version 330
layout (location = 0) in vec3 vert;
uniform mat4 mvpMatrix;
void main(void)
{
    gl_Position = mvpMatrix * vec4(vert, 1.0);
}
OpenGL is defined as a synchronous API. This means that everything (largely) happens "as-if" every command is executed fully and completely by the time the function call returns. As such, you can change uniforms (or any other state) as you wish without affecting any prior rendering commands.
Now, that doesn't make it a good idea. But how bad an idea it is depends on the kind of state in question. Changing uniform state is extremely cheap (especially compared to other state), and implementations of OpenGL kind of expect you to do so between rendering calls, so they're optimized for that circumstance. Changing the contents of storage (like the data stored in a buffer object used to provide uniform data) is going to be much more painful. So if you're using UBOs, it's better to write your data to an unused piece of the buffer than to overwrite the part of a buffer being used by a prior rendering command.
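To make the two kinds of state concrete, here is a sketch (the names are mine, not from the question) of the same matrix declared both ways; the first is the cheap glUniform* case, the second is backed by buffer storage, where overwriting a region a prior draw still reads from is the thing to avoid:
// Plain uniform: updated with glUniformMatrix4fv, cheap to change between draws.
uniform mat4 mvpMatrix;
// UBO-backed uniform: updated by writing the bound buffer object, so prefer
// writing to an unused region of the buffer rather than overwriting in-flight data.
layout(std140) uniform Matrices {
    mat4 mvpMatrixUBO;
};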
I use glGetActiveUniform to query uniforms from my shaders, but I also use uniform buffers (regular and std140). Querying them returns an array with the primitive type of each uniform in the buffer, but I need a way to identify that those are uniform buffers and not plain uniforms.
Examples of uniform buffers in a shader:
layout(std140, binding = LIGHTS_UBO_INDEX) uniform LightsBuffer {
    LightInfo lights[MAX_LIGHTS];
};
Is it possible to query only buffers from GLSL shader?
Technically, what you actually have here is a uniform block. It has a name, but no type; its members (which are uniforms) have types, and I think that is what you are actually describing.
It is a pretty important distinction because of the way program introspection works. Using OpenGL 4.3+ (or GL_ARB_program_interface_query), you will find that you cannot query a type for GL_UNIFORM_BLOCK interfaces.
glGetActiveUniformBlockiv (...) can be used to query information about the uniform block, but "LightsBuffer" is still a block and not a buffer. By that I mean even though it has an attribute called GL_BUFFER_BINDING, that is really the binding location of the uniform block and not the buffer that is currently bound to that location. Likewise, GL_BUFFER_DATA_SIZE is the size of the data required by the uniform block and not the size of the buffer currently bound.
If some calculations in a GLSL shader are only dependent on uniform variables, they could be calculated only once and used for every vertex/fragment. Is this really used in hardware? I got the idea after reading about "Uniform and Non-Uniform Control Flow" in the GLSL specification:
https://www.opengl.org/registry/doc/GLSLangSpec.4.40.pdf#page=30&zoom=auto,115.2,615.4
For example, I would like to know whether there's a difference between precalculating the projection and view matrix product and doing that multiplication in the shader.
It depends on the driver and the optimizations it is built to do; in Direct3D there is an explicit API for that.
For example, given the simple case of
//...
uniform mat4 projection;
uniform mat4 view;
uniform mat4 model;
in vec4 pos; // vertex position input
void main() {
    gl_Position = projection * view * model * pos;
}
most drivers will be able to optimize it by precalculating the MVP matrix and passing just that as a single uniform.
This is implementation defined, and some drivers are better at inlining uniforms than others. Another optimization option is recompiling the entire program with inlined uniform values and optimizing non-taken paths out.
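If you would rather not rely on that, you can of course do the multiplication once per draw on the CPU and upload only the product; a sketch (mvp is my name for the precomputed matrix):
uniform mat4 mvp;   // precomputed on the CPU as projection * view * model
in vec4 pos;
void main() {
    gl_Position = mvp * pos;
}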
When doing multitexturing in GLSL, is there any way to have an indexable array of samplers where each texture is a different size? This syntax isn't valid:
uniform sampler2D texArray[5];
Right now it seems like the only option is to individually create the samplers:
uniform sampler2D tex1;
uniform sampler2D tex2;
uniform sampler2D tex3;
uniform sampler2D tex4;
uniform sampler2D tex5;
But then I can't iterate through them, which is a real pain in the ass. Is there a solution?
This syntax isn't valid:
Says who? Arrays of samplers most certainly are valid (depending on the version). How you use them is a different matter.
GLSL 1.20 and below do not allow sampler arrays.
In GLSL 1.30 to 3.30, you can have sampler arrays, but with severe restrictions on the index. The index must be an integral constant expression. Thus, while you can declare a sampler array, you can't loop over it.
GLSL 4.00 and above allow the index to be a "dynamically uniform integral expression". That term basically means that all instantiations of the shader (within the same draw call) must get the same values.
So you can loop over a constant range in GLSL 4.00+, and index a sampler array with the loop counter. You can even get the index from a uniform variable. What you can't do is have the index depend on an input to the shader stage (unless that value is the same across all instances caused by the rendering command), or come from a value derived from a texture access (unless that value is the same across all instances caused by the rendering command), or something.
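A sketch of what that looks like in GLSL 4.00+ (texCount is an assumed uniform telling the shader how many entries are bound):
#version 400
uniform sampler2D texArray[5];
uniform int texCount;   // assumed: number of array entries actually in use
in vec2 uv;
out vec4 color;
void main()
{
    color = vec4(0.0);
    // i is dynamically uniform here: every invocation in the draw call runs
    // the same loop with the same texCount, so indexing texArray[i] is legal.
    for (int i = 0; i < texCount; ++i)
        color += texture(texArray[i], uv);
}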
The only requirement on the textures placed in arrays of samplers is that they match the sampler type. So you have to use a GL_TEXTURE_2D on all the elements of a sampler2D array. Beyond that, the textures can have any number of differences, including size. The array exists to make coding easier; it doesn't change the semantics of what is there.
And remember: each individual element in the sampler array needs to be bound to its own texture image unit.
is there any way to have an indexable array of samplers where each texture is a different size?
Not yet. Maybe this gets added to a later OpenGL version down the road, but I doubt it.
But then I can't iterate through them, which is a real pain in the ass. Is there a solution?
As a workaround you could use Array Textures and use only a subregion of each layer. Use a vec4 array to store the boundaries of each picture within its layer.
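A sketch of the shader side of that workaround (all names and the bounds layout are my assumptions): each picture lives in a subregion of one layer of a sampler2DArray, and a per-layer vec4 stores its normalized bounds.
#version 330
uniform sampler2DArray texLayers;
uniform vec4 bounds[5];   // assumed: (minU, minV, maxU, maxV) of the picture within each layer
in vec2 uv;
flat in int layerIndex;
out vec4 color;
void main()
{
    // Remap the incoming UV into the subregion this picture actually occupies.
    vec2 subUV = mix(bounds[layerIndex].xy, bounds[layerIndex].zw, uv);
    color = texture(texLayers, vec3(subUV, float(layerIndex)));
}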