GLSL: Float array in uniform buffer object - C++

I have a uniform buffer like this (GLSL/GPU):
layout(std140) uniform UConstantBufferPS1
{
    float m_LuminanceHistory[8];
};
I upload my data like this (C++/CPU):
SHistoryBuffer* pHistogramHistory = static_cast<SHistoryBuffer*>(Gfx::BufferManager::MapConstantBuffer(m_BufferSetPtr->GetBuffer(1)));
pHistogramHistory->m_LuminanceHistory[0] = 1.0f;
pHistogramHistory->m_LuminanceHistory[1] = 1.0f;
pHistogramHistory->m_LuminanceHistory[2] = 1.0f;
pHistogramHistory->m_LuminanceHistory[3] = 1.0f;
pHistogramHistory->m_LuminanceHistory[4] = 1.0f;
// ...
Gfx::BufferManager::UnmapConstantBuffer(m_BufferSetPtr->GetBuffer(1));
On the GLSL side everything is 0 except the first and second float values (m_LuminanceHistory[0] and m_LuminanceHistory[1]). It seems to be packed in a certain way?
One bad solution is to define an array of float vectors (vec4) on both CPU and GPU and then read only the x-value of each element, but that carries a lot of overhead.
Is there a better solution? Thanks for your help!
EDIT:
I used the following solution:
layout(std140) uniform UConstantBufferPS1
{
    vec4 m_LuminanceHistory[2];
};
float History[8];
History[0] = m_LuminanceHistory[0].x;
History[1] = m_LuminanceHistory[0].y;
History[2] = m_LuminanceHistory[0].z;
History[3] = m_LuminanceHistory[0].w;
History[4] = m_LuminanceHistory[1].x;
History[5] = m_LuminanceHistory[1].y;
History[6] = m_LuminanceHistory[1].z;
History[7] = m_LuminanceHistory[1].w;
This solution works as expected, but I don't know why I can't use a float[8] directly.

Without having much detail on your Gfx::BufferManager, here are a couple of possibilities that might help you debug.
Try commenting out the layout(std140).
Double-check that you have called glBindBufferBase() to bind the buffer object to the correct binding point (by default it starts at 0 if you didn't specify one when declaring UConstantBufferPS1 in the shader).
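For what it's worth, the behaviour described above matches the std140 layout rules: in a std140 block, the array stride of an array of scalars is rounded up to the base alignment of a vec4, i.e. 16 bytes. A float[8] therefore occupies 128 bytes on the GPU side, and the tightly packed CPU floats only line up with every fourth element. Here is a sketch of a CPU-side struct that would match that layout directly (assuming Gfx::BufferManager just maps the raw buffer memory; the struct names are illustrative):

struct SPaddedFloat
{
    float m_Value;
    float m_Pad[3]; // std140 pads each array element to a 16-byte stride
};

struct SHistoryBufferStd140
{
    SPaddedFloat m_LuminanceHistory[8];
};

static_assert(sizeof(SHistoryBufferStd140) == 8 * 16, "must match the std140 stride of float[8]");

With this layout you would write pHistogramHistory->m_LuminanceHistory[i].m_Value on the CPU and keep float m_LuminanceHistory[8]; unchanged in the shader.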


When do I need GL_EXT_nonuniform_qualifier?

I want to compile the following code into SPIR-V:
#version 450 core

#define BATCH_ID (PushConstants.Indices.x >> 16)
#define MATERIAL_ID (PushConstants.Indices.x & 0xFFFF)

layout (push_constant) uniform constants {
    ivec2 Indices;
} PushConstants;

layout (constant_id = 1) const int MATERIAL_SIZE = 32;

in Vertex_Fragment {
    layout(location = 0) vec4 VertexColor;
    layout(location = 1) vec2 TexCoord;
} inData;

struct ParameterFrequence_3 {
    int ColorMap;
};

layout (set = 3, binding = 0, std140) uniform ParameterFrequence_3 {
    ParameterFrequence_3[MATERIAL_SIZE] data;
} Frequence_3;

layout (location = 0) out vec4 out_Color;

layout (set = 2, binding = 0) uniform sampler2D[] Sampler2DResources;

void main(void) {
    vec4 color = vec4(1.0);
    color *= texture(Sampler2DResources[Frequence_3.data[MATERIAL_ID].ColorMap], inData.TexCoord);
    color *= inData.VertexColor;
    out_Color = color;
}
(The code is generated by a program I am developing, which is why it might look a little strange, but it should make the problem clear.)
When trying to do so, I am told:
error: 'variable index' : required extension not requested: GL_EXT_nonuniform_qualifier
(for the third-to-last line, where the texture lookup also happens)
After following a lot of discussion about how "dynamically uniform" is specified, and how the shading language spec basically says the scope is specified by the API while neither OpenGL nor Vulkan really do so (maybe that has changed), I am confused about why I get that error.
Initially I wanted to use instanced vertex attributes for the indices; those, however, are not dynamically uniform, which is what I thought the push constants would be.
So if push constants are constant during the draw call (which is the maximum scope the dynamically-uniform requirement can demand), how can the above shader end up in any dynamically non-uniform state?
Edit: Does it have to do with the fact that the buffer backing the storage for the "ColorMap" could be aliased by another buffer via which the content might be modified during the invocation? Or is there a way to tell the compiler this is a "restricted" storage so it knows it is constant?
Thanks
It is 3 am over here; I should just go to sleep.
The chances that anyone ends up having the same problem are small, but I'd still rather answer it myself than delete it:
I simply had to add a specialization constant to set the size of the sampler2D array; now it works without requiring any extension.
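For reference, the fix looks roughly like this (the constant_id value and default size here are placeholders):

// Size the sampler array with a specialization constant instead of leaving it
// unsized; indexing a sized array with a dynamically uniform expression does
// not require GL_EXT_nonuniform_qualifier.
layout (constant_id = 2) const int SAMPLER_COUNT = 16;
layout (set = 2, binding = 0) uniform sampler2D Sampler2DResources[SAMPLER_COUNT];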
Good night

Unexpected value upon accessing an SSBO float

I am trying to calculate a morph offset for a GPU-driven animation.
To that effect I have the following function (and SSBOs):
layout(std140, binding = 7) buffer morph_buffer
{
    vec4 morph_targets[];
};

layout(std140, binding = 8) buffer morph_weight_buffer
{
    float morph_weights[];
};

vec3 GetMorphOffset()
{
    vec3 offset = vec3(0);
    for (int target_index = 0; target_index < target_count; target_index++)
    {
        float w1 = morph_weights[1];
        offset += w1 * morph_targets[target_index * vertex_count + gl_VertexIndex].xyz;
    }
    return offset;
}
I am seeing strange behaviour, so I opened RenderDoc to trace the state:
As you can see, index 1 of the morph_weights SSBO is 0. However, if I step over it in RenderDoc's built-in debugger, I obtain:
Or in short, the value I get back is 1, not 0.
So I did a little experiment and changed one of the values; now the SSBO looks like this:
And now I get this:
So my SSBO of floats is being treated like an SSBO of vec4s, it seems. I am aware of the alignment issues with vec3s, but IIRC floats are fair game. What is happening?
After a little bit of asking around:
The issue is that the SSBO is marked as std140; the correct layout for a float array is std430.
For the Vulkan GLSL dialect, an alternative is to use the scalar layout qualifier.
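A minimal sketch of the corrected declaration (same binding as in the question):

// std430 packs a float array tightly (4-byte stride); under std140 each array
// element was padded to 16 bytes, which is why the floats behaved like vec4s.
layout(std430, binding = 8) buffer morph_weight_buffer
{
    float morph_weights[];
};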

GLSL compute shader: setting a buffer with a lookup table results in no data written, while setting the same buffer with other data works

I am attempting to implement a slightly modified version of this standard marching cubes algorithm in a compute shader.
I have reached the stage at which triTable is used to insert the correct vertex indices into a buffer, and I have modified the table to be one-dimensional (const int triTable[4096]={-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,0,8,3...}).
The following code shows the error I am experiencing (it does not implement the algorithm; however, it demonstrates the current issue fully):
layout(binding = 1) buffer Grid
{
    float GridData[]; // contains a 512*512*512 data volume generated previously; unused in this test case
};

uniform uint marchableCount;
uniform uint pointCount;

layout(std430, binding = 4) buffer X { uvec4 marchableList[]; }; // format is x, y, z, cubeIndex
layout(std430, binding = 5) buffer v { vec4 vertices[]; };
layout(std430, binding = 6) buffer n { vec4 normals[]; };
layout(binding = 7) uniform atomic_uint triCount;

void main()
{
    uvec3 gid = marchableList[gl_GlobalInvocationID.x].xyz; // xyz of grid cell
    int E = int(edgeTable[marchableList[gl_GlobalInvocationID.x].w]);
    if (E != 0)
    {
        uint cubeIndex = marchableList[gl_GlobalInvocationID.x].w;
        uint index = atomicCounterIncrement(triCount);
        int tCount = 0; // unused in this test; used for iteration in the actual algorithm
        int tGet = tCount + 16 * int(cubeIndex); // correction from converting the 2D array to 1D
        vertices[index] = vec4(tGet);
    }
}
This code produces the expected values: the vertices buffer is filled with data and the atomic counter increments.
Changing this line:
vertices[index] = vec4(tGet);
to
vertices[index] = vec4(triTable[tGet]);
or
vertices[index] = vec4(triTable[tGet]+1);
(demonstrating that triTable is not coincidentally returning zeros)
results in what appears to be a complete failure of the shader: the buffer is filled with zeros and the atomic counter does not increment. No error messages are output when the shader is compiled. tGet is less than 4096.
The following test cases also produce the correct output:
vertices[index] = vec4(triTable[3]); //-1
vertices[index] = vec4(triTable[4095]); //also -1
showing that triTable is in fact implemented correctly.
What causes the shader to have issues in these very specific cases?
I'm more surprised that const int triTable[4096] = {...}; compiles at all. That array, if it is actually needed, is 16 KB in size. That's a lot for a shader, even if the array lives in shared memory.
What is most likely happening is that whenever the compiler detects a usage of this array that it can't optimize down to a simple value (triTable[3] is a constant expression, so the compiler doesn't need to store the whole table to evaluate it), the compilation either fails or results in a non-functional shader.
It would be best to make this table a uniform buffer. An SSBO might work too, but some hardware implements uniform blocks through specialized memory rather than with a global memory fetch.
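As a sketch of what the uniform-buffer version could look like (the binding point and helper name are illustrative): under std140 each int in an array is itself padded to a 16-byte stride, so the usual trick is to pack four table entries into each ivec4:

// 4096 ints packed four per ivec4 to respect the std140 array stride
layout(std140, binding = 2) uniform TriTableBlock
{
    ivec4 triTablePacked[1024];
};

int triTableLookup(int i)
{
    return triTablePacked[i >> 2][i & 3];
}

The lookup in the shader then becomes vertices[index] = vec4(triTableLookup(tGet));.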

GLSL Channel Selection

I have a GLSL shader that reads from one of the channels (e.g. R) of an input texture and then writes to the same channel in an output texture. This channel has to be selected by the user.
What I can think of right now is to just use an int uniform and tons of if-statements:
uniform sampler2D uTexture;
uniform int uChannelId;
varying vec2 vUv;

void main() {
    // read in data from texture
    vec4 t = texture2D(uTexture, vUv);

    float data;
    if (uChannelId == 0) {
        data = t.r;
    } else if (uChannelId == 1) {
        data = t.g;
    } else if (uChannelId == 2) {
        data = t.b;
    } else {
        data = t.a;
    }

    // process the data...
    float result = data * 2.0; // for example (2.0 rather than 2: GLSL ES has no implicit int-to-float conversion)

    // write out
    if (uChannelId == 0) {
        gl_FragColor = vec4(result, t.g, t.b, t.a);
    } else if (uChannelId == 1) {
        gl_FragColor = vec4(t.r, result, t.b, t.a);
    } else if (uChannelId == 2) {
        gl_FragColor = vec4(t.r, t.g, result, t.a);
    } else {
        gl_FragColor = vec4(t.r, t.g, t.b, result);
    }
}
Is there any way of doing something like a dictionary access such as t[uChannelId]?
Or perhaps I should have 4 different versions of the same shader, each of which processes a different channel, so that I can avoid all the if-statements?
What is the best way to do this?
EDIT: To be more specific, I am using WebGL (Three.js)
There is such a way, and it is as simple as you actually wrote it in the question. Just use t[uChannelId]. To quote the GLSL spec (this is from version 3.30, section 5.5, but it applies to other versions as well):
Array subscripting syntax can also be applied to vectors to provide numeric indexing. So in
vec4 pos;
pos[2] refers to the third element of pos and is equivalent to pos.z. This allows variable indexing into a vector, as well as a generic way of accessing components. Any integer expression can be used as the subscript. The first component is at index zero. Reading from or writing to a vector using a constant integral expression with a value that is negative or greater than or equal to the size of the vector is illegal. When indexing with non-constant expressions, behavior is undefined if the index is negative, or greater than or equal to the size of the vector.
Note that for the first part of your code, you can use this to access a specific channel of a texture. You could also use the ARB_texture_swizzle functionality. In that case, you would just use a fixed channel, say r, for access in the shader, and then swizzle the actual texture channels so that whatever channel you want to access becomes r.
Update: as the target platform turned out to be WebGL, these suggestions are not available. However, a simple solution would be to use a vec4 uniform in place of uChannelId which is 1.0 for the selected component and 0.0 for all others. Say this variable is called uChannelSel. You could use data = dot(t, uChannelSel) in the first part and gl_FragColor = (vec4(1.0) - uChannelSel) * t + uChannelSel * result for the second part, as in the sketch below.
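Putting those two expressions together, the whole shader from the question reduces to something like this (a sketch; uChannelSel would be set from the application to e.g. (0.0, 1.0, 0.0, 0.0) to select the green channel):

uniform sampler2D uTexture;
uniform vec4 uChannelSel; // 1.0 in the selected channel, 0.0 elsewhere
varying vec2 vUv;

void main() {
    vec4 t = texture2D(uTexture, vUv);
    // select the channel without branching
    float data = dot(t, uChannelSel);
    // process the data...
    float result = data * 2.0; // for example
    // write the result back into only the selected channel
    gl_FragColor = (vec4(1.0) - uChannelSel) * t + uChannelSel * result;
}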
As I'm sure you know, branching can be expensive in shaders. However, it sounds like it will always be the same channel within a pass (yes?), so you might maintain enough coherence to see good performance.
It's been a good while since I've used GLSL, but if you're using a newer version, maybe you could do some bitwise shifting (<< or >>) magic? You would read the texture into an int instead of a vec4, then shift it a number of bits depending on which channel you want to read.

How do constant buffers need to be padded in order to avoid an E_INVALIDARG?

I am investigating an E_INVALIDARG exception that is thrown when I attempt to create a second constant buffer that stores the information for my lights:
// create matrix stack early
CD3D11_BUFFER_DESC constantMatrixBufferDesc(sizeof(ModelViewProjectionConstantBuffer), D3D11_BIND_CONSTANT_BUFFER);
DX::ThrowIfFailed(
    m_d3dDevice->CreateBuffer(
        &constantMatrixBufferDesc,
        nullptr,
        &m_constantMatrixBuffer
    )
);
DX::ThrowIfFailed(
    m_matrixStack.Initialize(m_d3dContext, m_constantMatrixBuffer, &m_constantMatrixBufferData)
);

// also create the light buffer early; we must create it now but we will later
// update it with the light information that we parsed from the model
CD3D11_BUFFER_DESC constantLightBufferDesc(sizeof(LightConstantBuffer), D3D11_BIND_CONSTANT_BUFFER);

/* !!!!---- AN E_INVALIDARG IS THROWN BY THE FOLLOWING LINE ----!!!! */
DX::ThrowIfFailed(
    m_d3dDevice->CreateBuffer(
        &constantLightBufferDesc,
        nullptr,
        &m_constantLightBuffer
    )
);
At this point, the parameters being passed into the light's CreateBuffer call appear to be in the same state as the matrix's were! The problem seems to have to do with the number of bytes specified in the buffer description.
The buffer is defined as such in the module:
// a constant buffer that contains the 3 matrices needed to
// transform points so that they're rendered correctly
struct ModelViewProjectionConstantBuffer
{
    DirectX::XMFLOAT4X4 model;
    DirectX::XMFLOAT4X4 view;
    DirectX::XMFLOAT4X4 projection;
};

// a constant buffer that contains up to 4 directional or point lights
struct LightConstantBuffer
{
    DirectX::XMFLOAT3 ambient[4];
    DirectX::XMFLOAT3 diffuse[4];
    DirectX::XMFLOAT3 specular[4];
    // the first spot in the array is the constant attenuation term,
    // the second is the linear term, and the third is quadratic
    DirectX::XMFLOAT3 attenuation[4];
    // the position and direction of the light
    DirectX::XMFLOAT3 position[4];
    DirectX::XMFLOAT3 direction[4];
    // the type of light that we're working with, defined in lights.h
    UINT type[4];
    // a number from 0 to 4 that tells us how many lights there are
    UINT num;
};
And as such in the vertex shader (.hlsl):
cbuffer ModelViewProjectionConstantBuffer : register (b0)
{
    matrix model;
    matrix view;
    matrix projection;
};

cbuffer LightConstantBuffer : register (b1)
{
    float3 ambient[4];
    float3 diffuse[4];
    float3 specular[4];
    // the first spot in the array is the constant attenuation term,
    // the second is the linear term, and the third is quadratic
    float3 attenuation[4];
    // the position and direction of the light
    float3 position[4];
    float3 direction[4];
    // the type of light that we're working with, defined in lights.h
    uint type[4];
    // a number from 0 to 4 that tells us how many lights there are
    uint num;
}
In an attempt to figure out what is causing this, I have stumbled across this line in the MSDN HLSL Shader documentation (http://msdn.microsoft.com/en-us/library/windows/desktop/ff476898(v=vs.85).aspx):
Each element stores a 1-to-4 component constant, determined by the format of the data stored.
What does this mean, and is it the reason for this exception? I have noticed that in the Visual Studio 3D Starter Kit (http://code.msdn.microsoft.com/wpapps/Visual-Studio-3D-Starter-455a15f1) the buffers are padded with extra floats:
///////////////////////////////////////////////////////////////////////////////////////////
//
// Constant buffer structures
//
// These structs use padding and different data types in places to adhere
// to the shader constant's alignment.
//
struct MaterialConstants
{
    MaterialConstants()
    {
        Ambient = DirectX::XMFLOAT4(0.0f, 0.0f, 0.0f, 1.0f);
        Diffuse = DirectX::XMFLOAT4(1.0f, 1.0f, 1.0f, 1.0f);
        Specular = DirectX::XMFLOAT4(0.0f, 0.0f, 0.0f, 0.0f);
        Emissive = DirectX::XMFLOAT4(0.0f, 0.0f, 0.0f, 0.0f);
        SpecularPower = 1.0f;
        Padding0 = 0.0f;
        Padding1 = 0.0f;
        Padding2 = 0.0f;
    }

    DirectX::XMFLOAT4 Ambient;
    DirectX::XMFLOAT4 Diffuse;
    DirectX::XMFLOAT4 Specular;
    DirectX::XMFLOAT4 Emissive;
    float SpecularPower;
    float Padding0;
    float Padding1;
    float Padding2;
};

struct LightConstants
{
    LightConstants()
    {
        ZeroMemory(this, sizeof(LightConstants));
        Ambient = DirectX::XMFLOAT4(1.0f, 1.0f, 1.0f, 1.0f);
    }

    DirectX::XMFLOAT4 Ambient;
    DirectX::XMFLOAT4 LightColor[4];
    DirectX::XMFLOAT4 LightAttenuation[4];
    DirectX::XMFLOAT4 LightDirection[4];
    DirectX::XMFLOAT4 LightSpecularIntensity[4];
    UINT IsPointLight[4*4];
    UINT ActiveLights;
    float Padding0;
    float Padding1;
    float Padding2;
};

... // and there's even more where that came from
So am I just not padding these things correctly? And if so, how should I pad them? Or is it something completely different that I'm missing?
I greatly appreciate you reading this and trying to help.
It is hard to fix your problem because of the lack of important info, but let's give it a try.
Obviously, E_INVALIDARG says that an invalid argument was passed to a function. Now we must figure out which parameter is wrong.
ID3D11Device::CreateBuffer accepts 3 parameters: a D3D11_BUFFER_DESC, a D3D11_SUBRESOURCE_DATA, and the ID3D11Buffer** itself.
And you feed it &constantLightBufferDesc, nullptr, &m_constantLightBuffer.
Now you must carefully read all 4 MSDN articles to find out what is wrong.
m_constantLightBuffer is not a problem; just check that it is of type ID3D11Buffer*.
nullptr is unlikely to be a problem; it has been a standard C++ keyword since C++11.
Unfortunately you don't show the final contents of your constantLightBufferDesc, which is the prime candidate to be the problem:
As you've stated, there can be a buffer alignment mistake: if your constantLightBufferDesc.BindFlags has the D3D11_BIND_CONSTANT_BUFFER flag and constantLightBufferDesc.ByteWidth is not a multiple of 16, buffer creation fails. But that's just a guess; you could have any other mismatch here, so guessing can go on forever.
Fortunately, there is another way of diagnosing this: if you create your ID3D11Device with the D3D11_CREATE_DEVICE_DEBUG flag, you will see all the D3D11 warnings and errors in the Visual Studio output window. For example, in case of misalignment you will see:
D3D11 ERROR: ID3D11Device::CreateBuffer: The Dimensions are invalid. For ConstantBuffers, marked with the D3D11_BIND_CONSTANT_BUFFER BindFlag, the ByteWidth (value = 10) must be a multiple of 16. ByteWidth must also be less than or equal to 65536 on the current driver. [ STATE_CREATION ERROR #66: CREATEBUFFER_INVALIDDIMENSIONS]
So, if CreateBuffer() is failing because of a wrong buffer size, there are several ways to handle it:
Resize your structures: add padding members so the total sizeof() becomes a multiple of 16.
Declare your structures as 16-byte aligned. AFAIK there are only compiler-specific ways to do this: for example, #pragma pack for MSVC.
Assign to ByteWidth not the real structure size, but the size rounded up to the next multiple of 16, as in the sketch below.
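A sketch of that last option applied to the code from the question:

// Round the byte width up to the next multiple of 16; the struct itself stays unchanged.
UINT byteWidth = UINT((sizeof(LightConstantBuffer) + 15) / 16 * 16);
CD3D11_BUFFER_DESC constantLightBufferDesc(byteWidth, D3D11_BIND_CONSTANT_BUFFER);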
Happy debugging! =)