Running openGL 3.1, the question is simple.
From GLSL site, here is how one can define array of uniform buffer blocks:
uniform BlockName
{
vec3 blockMember1, blockMember2;
float blockMember3;
} multiBlocks[3];
Now, is it possible to have dynamic number of these multiBlocks? There are no pointers in GLSL so no "new" statement etc.
If not, is there other approach to send dynamic number of elements?
My block is currently packing four floats and one vec2.
I haven't wrote shader yet so you can suggest anything, thanks ;)
You can't have a dynamic number of them, and you can't have a dynamic index into them. That means that even if you could change the count dynamically, it would be of little use since you'd still have to change the shader code to access the new elements.
One possible alternative would be to make the block members arrays:
#define BLOCK_COUNT %d
uniform BlockName
{
vec3 blockMember1[BLOCK_COUNT];
vec3 blockMember2[BLOCK_COUNT];
float blockMember3[BLOCK_COUNT];
}
multiBlocks;
Then you can alter BLOCK_COUNT to change the number of members, and you can use dynamic indexes just fine:
multiBlocks.blockMember2[i];
It still doesn't allow you to alter the number of elements without recompiling the shader, though.
Ok so i wrote also to openGL forum and this came out
So basicaly you have 3 solutions:
Uniform buffer objects or Texture buffers or static array with some high number of prepared elements and use another uniform for specifying actual size.
The last one could be upgraded with OneSadCookie's definition of max size in compile time.
Related
One of the inputs of my fragment shader is an array of 5 structures. The shader computes a color based on each of the 5 structures. In the end, these 5 colors are summed together to produce the final output. The total size of the array is 1440 bytes. To accommodate the alignment of the uniform buffer, the size of the uniform buffer changes to 1920 bytes.
1- If I define the array of 5 structures as a uniform buffer array, the rendering takes 5ms (measured by Nsight Graphics). The uniform buffer's memory property is 'VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT'. The uniform buffer in glsl is defined as follows
layout(set=0,binding=0) uniform UniformStruct { A a; } us[];
layout(location=0) out vec4 c;
void main()
{
vec4 col = vec4(0);
for (int i = 0; i < 5; i++)
col += func(us[nonuniformEXT(i)]);
c = col;
}
Besides, I'm using 'GL_EXT_nonuniform_qualifier' extension to access the uniform buffer array. This seems the most straightforward way for me but there are alternative implementations.
2- I can split the rendering from one vkCmdDraw to five vkCmdDraw, change the framebuffer's blend mode from overwriting to addition and define a uniform buffer instead of a uniform buffer array in the fragment shader. On the CPU side, I change the descriptor type from UNIFORM_BUFFER to UNIFORM_BUFFER_DYNAMICS. Before each vkCmdDraw, I bind the dynamic uniform buffer and the corresponding offsets. In the fragment shader, the for loop is removed. Although it seems that it should be slower than the first method, it is surprisingly much faster than the first method. The rendering only takes 2ms total for 5 draws.
3- If I define the array of 5 structures as a storage buffer and do one vkCmdDraw, the rendering takes only 1.4ms. In other words, if I change the array from the uniform buffer array to storage buffer but keep anything else the same as 1, it becomes faster.
4- If I define the array of 5 structures as a global constant in the glsl and do one vkCmdDraw, the rendering takes only 0.5ms.
In my opinion, 4 should be the fastest way, which is true in the test. Then 1 should be the next. Both 2 and 3 should be slower than 1. However, Neither 2 or 3 is slower than 1. In contrast, they are much faster than 1. Any ideas why using uniform buffer array slows down the rendering? Is it because it is a host visible buffer?
When it comes to UBOs, there are two kinds of hardware: the kind where UBOs are specialized hardware and the kind where they aren't. For GPUs where UBOs are not specialized hardware, a UBO is really just a readonly SSBO. You can usually tell the difference because hardware where UBOs are specialized will have different size limits on them from those of SSBOs.
For specialized hardware-based UBOs (which NVIDIA still uses, if I recall correctly), each UBO represents an upload from memory into a big block of constant data that all invocations of a particular shader stage can access.
For this kind of hardware, an array of UBOs is basically creating an array out of segments of this block of constant data. And some hardware has multiple blocks of constant data, so indexing then with non-constant expressions is tricky. This is why non-constant access to such indices is an optional feature of Vulkan.
By contrast, a UBO which contains an array is just one big UBO. It's special only in how big it is. Indexing through an array within a UBO is no different from indexing any array. There are no special rules with regard to the uniformity of the index of such accesses.
So stop using an array of UBOs and just use a single UBO which contains an array of data:
layout(set=0,binding=0) uniform UniformStruct { A a[5]; } us;
It'll also avoid additional padding due to alignment, additional descriptors, additional buffers, etc.
However, you might also speed things up by not lying to Vulkan. The expression nonuniformEXT(i) states that the expression i is not dynamically uniform. This is incorrect. Every shader invocation that executes this loop will generate i expressions that have values from 0 to 4. Every dynamic instance of the expression i for any invocation will have the same value at that place in the code as every other.
Therefore i is dynamically uniform, so telling Vulkan that it isn't is not helpful.
I've recently replaced all usage of glUniform() in my code to instead use Uniform Buffer Objects (UBO) to better organize my object rendering code. However, my performance has sharply plummeted as the number of objects drawn increases (an object in my case is anything with it's own offset into a UBO; only one UBO is shared between everything that uses the same shader program).
I think this may be because my objects actually use very little uniform data, sometimes as little as one vec4, while a UBO requires each offset be in multiples of GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT which can be rather large.
I understood UBO to be a strict upgrade to using glUniform(), am I mistaken? Should I have a glUniform() path for these simple objects, or is there a necessary step to using UBOs effectively with many different simple objects that I haven't taken?
My scene is mostly quads, with very few objects using more than 4 verts, 6 indices. Some of these quads have large uniform blocks while most have very simple ones, maybe one or two vec4.
(For now, I've gotten around this problem by embedding this object data into the vertices, though it definitely feels wrong to do it this way.)
When doing multitexturing in GLSL, is there anyway to have an indexable array of samplers where each texture is a different size? This syntax isn't valid:
uniform sampler2D texArray[5];
Right now it seems like the only option is to individually create the samplers:
uniform sampler2D tex1;
uniform sampler2D tex2;
uniform sampler2D tex3;
uniform sampler2D tex4;
uniform sampler2D tex5;
But then I can't iterate through them, which is a real pain in the ass. Is there a solution?
This syntax isn't valid:
Says who? Arrays of samplers most certainly are valid (depending on the version). How you use them is a different matter.
GLSL 1.20 and below do not allow sampler arrays.
In GLSL 1.30 to 3.30, you can have sampler arrays, but with severe restrictions on the index. The index must be an integral constant expression. Thus, while you can declare a sampler array, you can't loop over it.
GLSL 4.00 and above allow the index to be a "dynamically uniform integral expression". That term basically means that all instantiations of the shader (within the same draw call) must get the same values.
So you can loop over a constant range in GLSL 4.00+, and index a sampler array with the loop counter. You can even get the index from a uniform variable. What you can't do is have the index depend on an input to the shader stage (unless that value is the same across all instances caused by the rendering command), or come from a value derived from a texture access (unless that value is the same across all instances caused by the rendering command), or something.
The only requirement on the textures placed in arrays of samplers is that they match the sampler type. So you have to use a GL_TEXTURE_2D on all the elements of a sampler2D array. Beyond that, the textures can have any number of differences, including size. The array exists to make coding easier; it doesn't change the semantics of what is there.
And remember: each individual element in the sampler array needs to be bound to its own texture image unit.
is there anyway to have an indexable array of samplers where each texture is a different size?
Not yet. Maybe this gets added to a later OpenGL version down the road, but I doubt it.
But then I can't iterate through them, which is a real pain in the ass. Is there a solution?
As a workaround you could use Array Textures and use only subregions of each layer. Use a vec4 array to store the boudaries of each picture on each layer.
Lets say I have 2 species such as humans and ponies. They have different skeletal systems so the uniform bone array will have to be different for each species. Do I have to implement two separate shader programs able to render each bone array properly or is there a way to dynamically declare uniform arrays and iterate through that dynamic array instead?
Keeping in mind performance (There's all of the shaders suck at decision branching going around).
Until OpenGL 4.3, arrays in GLSL had to be of a fixed, compile-time size. 4.3 allows the use of shader storage buffer objects, which allow for their ultimate length to be "unbounded". Basically, you can do this:
buffer BlockName
{
mat4 manyManyMatrices[];
};
OpenGL will figure out how many matrices are in this array at runtime based on how you use glBindBufferRange. So you can still use manyManyMatrices.length() to get the length, but it won't be a compile-time constant.
However, this feature is (at the time of this edit) very new and only implemented in beta. It also requires GL 4.x-class hardware (aka: Direct3D 11-class hardware). Lastly, since it uses shader storage blocks, accessing the data may be slower than one might hope for.
As such, I would suggest that you just use a uniform block with the largest number of matrices that you would use. If that becomes a memory issue (unlikely), then you can split your shaders based on array size or use shader storage blocks or whatever.
You can use n-by-1-Textures as a replacement for arrays. Texture size can be specified at run-time. I use this approach for passing an arbitrary number of lights to my shaders. I'm surprised how fast it runs despite the many loops and branches. For an example see the polygon.f shader file in the jogl3.glsl.nontransp in the jReality sources.
uniform sampler2D sys_globalLights;
uniform int sys_numGlobalDirLights;
uniform int sys_numGlobalPointLights;
uniform int sys_numGlobalSpotLights;
...
int lightTexSize = sys_numGlobalDirLights*3+sys_numGlobalPointLights*3+sys_numGlobalSpotLights*5;
for(int i = 0; i < numDir; i++){
vec4 dir = texture(sys_globalLights, vec2((3*i+1+0.5)/lightTexSize, 0));
...
As the title says, I can't do vector_array[foo] (assuming foo is in-range and integer) in webgl vertex shaders, correct?
Are textures the best alternative, or is there a workaround or some better way to mimick a lookup table?
http://www.khronos.org/registry/webgl/specs/latest/#DYNAMIC_INDEXING_OF_ARRAYS
"WebGL only allows dynamic indexing with constant expressions, loop indices or a combination. The only exception is for uniform access in vertex shaders, which can be indexed using any expression."
Did you try it? If it didn't work, there are a couple options.
If you have a small number of values, if-else could work ok. AFAIK the uniform values are going to be loaded into registers anyhow, so doing a dozen cycles of math on them isn't going to make your shader much slower.
For a large number of values, textures are your best bet.
I haven't tested it, but I don't get any compilation error from the following
//index as a float
attribute lowp float vColorIndex;
//the array
uniform vec4 Colors[16];
//type cast the float in an int
int index = int(vColorIndex);
//use index
vec4 col = Colors[index];