I am working on a compute shader where the output is written to SSBO.Now,the consumer of this buffer is CUDA which expects it to contain unsigned bytes.I currently can't see find the way how to write a byte per index in SSBO.With texture or image the normalized float to unsigned byte conversion is handled by OpenGL.For example I can attach a texture with internal format R8 and store byte per entry.But nothing like this is possible with SSBO.Does it mean that except of bool data type all the numerical storage types in SSBO can be at least 4 bytes per entry only?
Practically speaking I would like to be able to do the following:
Compute shader:
#version 430 core
layout (local_size_x = 8,local_size_y = 8 ) in;
struct SSBOBlock
{
byte mydata;
};
layout(std430,binding = BUFFER_OUTPUT) writeonly buffer bBuffer
{
SSBOBlock Ouput[];
} Out;
void main()
{
//..... Compute shader stuff...
//.......
Out.Ouput[globalIndex].mydata = val;//where val is normalized float
}
The smallest type exposed on GPUs tends to be 32-bit for scalars. Even the boolean type you mentioned is actually 32-bit. The same goes for languages like C often; a boolean does not need anything more than 1-bit but even so bool is not synonymous with "give me the smallest data type available."
There are intrinsic functions you can use to pack and unpack data types however and I will show an example of how to use them below:
#version 420 core
layout (local_size_x = 8,local_size_y = 8 ) in;
struct SSBOBlock
{
uint mydata;
};
layout(std430,binding = BUFFER_OUTPUT) writeonly buffer bBuffer
{
SSBOBlock Ouput[];
} Out;
void main()
{
//..... Compute shader stuff...
//.......
Out.Output [globalIndex].mydata = packUnorm4x8 (val)
// where val is a 4-component unsigned normalized vector to pack into globalIndex
}
Your sample shader shows an attempt to write a single scalar to a "byte" data type, that is not possible and you are going to have to modify this somehow to work with indices that reference a packed group of 4 scalars. In the worst-case, this might mean unpacking three values and then re-packing the entire thing just to write one scalar.
This intrinsic function is discussed in the extension specification for GL_ARB_shading_languge_packing and is core in GL 4.2 and later.
Even if you were on an implementation that does not support that extension, it is explained in the text of the extension specification exactly what each does. The equivalent operation for packUnorm4x8 is:
uint fixed_val = round(clamp(float_val, 0, +1) * 255.0);
Some bit-shifts will be necessary to properly pack each component, but those are trivial.
I found a way to write unsigned byte data into buffer in compute shader.Buffer texture does the job.It is basically image texture with buffer as storage.This way I can specify image format to be R8 which allows me to store byte size values on each index of the buffer.
GLuint _tbo_buffer,_tbo_tex;
glGenBuffers(1, &_tbo_buffer);
glBindBuffer(GL_TEXTURE_BUFFER, _tbo_buffer);
glBufferData(GL_TEXTURE_BUFFER, SCREEN_WIDTH * SCREEN_HEIGHT, NULL, GL_DYNAMIC_COPY);
glGenTextures(1, &_tbo_tex);
glBindTexture(GL_TEXTURE_BUFFER, _tbo_tex);
//attach the TBO to the texture:
glTexBuffer(GL_TEXTURE_BUFFER, GL_R8, _tbo_buffer);
glBindImageTexture(0, _tbo_tex, 0, GL_FALSE, 0, GL_WRITE_ONLY, GL_R8);
Compute shader:
#version 430 core
layout (local_size_x = 8,local_size_y = 8 ) in;
layout(binding=0) uniform sampler2D TEX_IN;
layout(r8) writeonly uniform imageBuffer mybuffer;
void main(){
vec2 texSize = vec2(textureSize(TEX_IN,0));
vec2 uv = vec2(gl_GlobalInvocationID.xy / texSize);
vec4 tex = texture(TEX_IN,uv);
uint globalIndex = gl_GlobalInvocationID.y * nThreads.x + gl_GlobalInvocationID.x;
//store only r:
imageStore(mybuffer,int(globalIndex),vec4(0.5,0,0,0));
}
Then we can read byte by byte on CPU or map to CUDA buffer resource:
GLubyte* ptr = (GLubyte*)glMapBuffer(GL_TEXTURE_BUFFER, GL_READ_ONLY);
Related
I have a vertex shader with a push-constant block containing one float:
layout(push_constant) uniform pushConstants {
float test1;
} u_pushConstants;
And a fragment shader with another push-constant block with a different float value:
layout(push_constant) uniform pushConstants {
float test2;
} u_pushConstants;
test1 and test2 are supposed to be different.
The push-constant ranges for the pipeline layout are defined like this:
std::array<vk::PushConstantRange,2> ranges = {
vk::PushConstantRange{
vk::ShaderStageFlagBits::eVertex,
0,
sizeof(float)
},
vk::PushConstantRange{
vk::ShaderStageFlagBits::eFragment,
sizeof(float), // Push-constant range offset (Start after vertex push constants)
sizeof(float)
}
};
The actual constants are then pushed during rendering like this:
std::array<float,1> constants = {123.f};
commandBufferDraw.pushConstants(
pipelineLayout,
vk::ShaderStageFlagBits::eVertex,
0,
sizeof(float),
constants.data()
);
std::array<float,1> constants = {456.f};
commandBufferDraw.pushConstants(
pipelineLayout,
vk::ShaderStageFlagBits::eFragment,
sizeof(float), // Offset in bytes
sizeof(float),
constants.data()
);
However, when checking the values inside the shaders, both have the value 123.
It seems that the offsets are completely ignored. Am I using them incorrectly?
In your pipeline layout, you stated that your vertex shader would access the range of data from [0, 4) bytes in the push constant range. You stated that your fragment shader would access the range of data from [4, 8) in the push constant range.
But your shaders tell a different story.
layout(push_constant) uniform pushConstants {
float test2;
} u_pushConstants;
This definition very clearly says that the push constant range starts uses [0, 4). But you told Vulkan it uses [4, 8). Which should Vulkan believe: your shader, or your pipeline layout?
A general rule of thumb to remember is this: your shader means what it says it means. Parameters given to pipeline creation cannot change the meaning of your code.
If you intend to have the fragment shader really use [4, 8), then the fragment shader must really use it:
layout(push_constant) uniform fragmentPushConstants {
layout(offset = 4) float test2;
} u_pushConstants;
Since it has a different definition from the VS version, it should have a different block name too. The offset layout specifies the offset of the variable in question. That's standard stuff from GLSL, and compiling for Vulkan doesn't change that.
I'm trying to make a simple compute shader using a Shader Storage Buffer (SSBO) to pass data to the shader. I'm coding in C++ with GLFW3 and GLEW. I'm passing an array of integers into an SSBO, binding it to the index 0, and expecting to retrieve the data in the shader from a layout buffer variable (as explained on various websites). However, I get an unexpected "undefined variable" error on shader compilation concerning this layout buffer variable, though it is clearly declared.
Here is the GLSL code of the compute shader (this script is only at its beginning) :
#version 430
layout (local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
layout (std430, binding = 0) buffer params
{
ivec3 dims;
};
int index(ivec3 coords){
ivec3 dims = params.dims;
return coords.x + dims.y * coords.y + dims.x * dims.y * coords.z;
}
void main() {
ivec3 coords = ivec3(gl_GlobalInvocationID);
int i = index(coords);
}
I get the error : 0(12) : error C1503: undefined variable "params"
Here is the C++ script that setups and runs the compute shader :
int dimensions[] {width, height, depth};
GLuint paramSSBO;
glGenBuffers(1, ¶mSSBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, paramSSBO);
glBufferData(GL_SHADER_STORAGE_BUFFER, sizeof(dimensions), &dimensions, GL_STREAM_READ);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, paramSSBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, 0);
GLuint computeShaderID;
GLuint csProgramID;
char* computeSource;
loadShaderSource(computeSource, "compute.glsl");
computeShaderID = glCreateShader(GL_COMPUTE_SHADER);
compileShader(computeShaderID, computeSource);
delete[] computeSource;
csProgramID = glCreateProgram();
glAttachShader(csProgramID, computeShaderID);
glLinkProgram(csProgramID);
glDeleteShader(computeShaderID);
glUseProgram(csProgramID);
glDispatchCompute(width, height, depth);
glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);
glUseProgram(0);
glDeleteBuffers(1, ¶mSSBO);
width, height and depth are int variables defined earlier in the program. I'm binding the dimensions array to the index 0 and I expect to retrieve it in the ivec3 params.dims variable in the shader. However the params variable is said to be undefined when used in the index() function.
This script is just the beginning and I wanted to add a second buffer where the shader would actually write its result, but I'm stuck here. For clarification : in the complete script I expect not to write in any texture (as all online examples show), but write the results in the second buffer from which I will get the data back into a C++ array for further use.
params is not a variable. Nor is it a struct or class. It is the name of an interface block. And the name of an interface block is not really part of GLSL itself; it's part of OpenGL. It's the name used by the OpenGL API to represent that particular block.
You never use an interface block's name in the shader text itself, outside of defining it.
Unless you give your interface block an instance name, the names of all variables within that block are essentially part of the global namespace. Indeed, scoping those names is the whole point of giving the block an instance name.
So the correct way to access the dims field in the interfae block is as "dims".
Im passing uniform buffer to compute shader in vulkan. The buffer contains an array of 49 floating point numbers (gaussian matrix). Everything is fine, but when I read array in the shader, it gives only 13 values, the others are 0 or gunk, and they correspond to 0, 4, 8, etc. of initial array. I think its some kind of alignment problem
Shader layouts are
struct Pixel
{
vec4 value;
};
layout(push_constant) uniform params_t
{
int width;
int height;
} params;
layout(std140, binding = 0) buffer buf
{
Pixel imageData[];
};
layout (binding = 1) uniform sampler2D inputTex;
layout (binding = 2) uniform unf_t
{
float gauss[SAMPLE_SIZE*SAMPLE_SIZE];
};
Could that be binding 0 influencing binding 2? and if so how can I copy array to buffer with needed alignment? Currently I use
vkCmdUpdateBuffer(a_cmdBuff, a_uniform, 0, a_gaussSize, (const uint32_t *)gauss)
or may be better to split on different sets?
Edit: by expanding buffer and array i manage to pass it with alignment of 16 and all great, but it looks like a waste of memory. How can I align floats by 4?
Uniform blocks require that array elements are aligned to vec4 (16 bytes).
To work around this you use a vec4 instead and you can pass 52 floats and then take the correct component based on index/4 and index%4.
I've noticed that my shaders are performing a calculation that I need in the CPU code. Is it possible for me to load that results of that calculation into a uniform array, and then access that uniform from the CPU once the GPU has finished working?
You can write arbitrary amounts of data through either Image Load/Store or SSBOs. While the number of image variables is restricted in image load/store, those variables can refer to buffer textures or array textures. Either of which give you access to a more-or-less arbitrarily large amount of data to write to:
layout(rgba32f, writeOnly) imageBuffer buffer;
imageStore(buffer, valueOffset1, value1);
imageStore(buffer, valueOffset2, value2);
imageStore(buffer, valueOffset3, value3);
imageStore(buffer, valueOffset4, value4);
SSBOs make this even easier:
layout(std430) buffer Data
{
float giantArray[];
};
giantArray[valueOffset1] = data1;
giantArray[valueOffset2] = data2;
giantArray[valueOffset3] = data3;
giantArray[valueOffset4] = data4;
However, note that any such writes will be unordered with regard to writes from other shader invocations. So overwriting such data will be... problematic. And you'll need an appropriate glMemoryBarrier call before you try to read from it.
But if all you're doing is a compute operation, you ought to be using dedicated compute shaders.
As far as i know, there is no way of retreiving uniform data from your GPU. But you could execute the calculation and set the output color to something you can identify on your screen depending on the expected result of your calculation. For exmaple:
#version 330 core
layout(location = 0) out vec4 color;
void main() {
if( Something you're trying to debug )
color = vec4(1, 1, 1, 1);
else
color = vec4(0, 0, 0, 1);
}
That's the only way I know of, and I use it all the time.
I have a vertex shader with a push-constant block containing one float:
layout(push_constant) uniform pushConstants {
float test1;
} u_pushConstants;
And a fragment shader with another push-constant block with a different float value:
layout(push_constant) uniform pushConstants {
float test2;
} u_pushConstants;
test1 and test2 are supposed to be different.
The push-constant ranges for the pipeline layout are defined like this:
std::array<vk::PushConstantRange,2> ranges = {
vk::PushConstantRange{
vk::ShaderStageFlagBits::eVertex,
0,
sizeof(float)
},
vk::PushConstantRange{
vk::ShaderStageFlagBits::eFragment,
sizeof(float), // Push-constant range offset (Start after vertex push constants)
sizeof(float)
}
};
The actual constants are then pushed during rendering like this:
std::array<float,1> constants = {123.f};
commandBufferDraw.pushConstants(
pipelineLayout,
vk::ShaderStageFlagBits::eVertex,
0,
sizeof(float),
constants.data()
);
std::array<float,1> constants = {456.f};
commandBufferDraw.pushConstants(
pipelineLayout,
vk::ShaderStageFlagBits::eFragment,
sizeof(float), // Offset in bytes
sizeof(float),
constants.data()
);
However, when checking the values inside the shaders, both have the value 123.
It seems that the offsets are completely ignored. Am I using them incorrectly?
In your pipeline layout, you stated that your vertex shader would access the range of data from [0, 4) bytes in the push constant range. You stated that your fragment shader would access the range of data from [4, 8) in the push constant range.
But your shaders tell a different story.
layout(push_constant) uniform pushConstants {
float test2;
} u_pushConstants;
This definition very clearly says that the push constant range starts uses [0, 4). But you told Vulkan it uses [4, 8). Which should Vulkan believe: your shader, or your pipeline layout?
A general rule of thumb to remember is this: your shader means what it says it means. Parameters given to pipeline creation cannot change the meaning of your code.
If you intend to have the fragment shader really use [4, 8), then the fragment shader must really use it:
layout(push_constant) uniform fragmentPushConstants {
layout(offset = 4) float test2;
} u_pushConstants;
Since it has a different definition from the VS version, it should have a different block name too. The offset layout specifies the offset of the variable in question. That's standard stuff from GLSL, and compiling for Vulkan doesn't change that.