Should I reuse the same constant buffer in multiple shader stages? - hlsl

For example, instead of:
VertexShader.hlsl
cbuffer VSPerInstance : register(b0){
matrix World, View, Projection;
};
PixelShader.hlsl
cbuffer PSPerInstance : register(b0){
float4 AmbientColor;
float4 DiffuseColor;
float4 SpecularColor;
float4 EmissiveColor;
};
I could have:
MyInclude.hlsl
cbuffer PerInstance : register(b0){
matrix World, View, Projection;
float4 AmbientColor;
float4 DiffuseColor;
float4 SpecularColor;
float4 EmissiveColor;
};
in an include file. When updating the constant buffer, this would reduce the number of calls to ID3D11DeviceContext::Map, although I would still have to copy the same amount of bytes and set the constant buffer for each stage, with ID3D11DeviceContext::VSSetConstantBuffers, ID3D11DeviceContext::PSSetConstantBuffers, etc.
So my question is: is it even legal to set the same constant buffer in multiple shader stages? And is there anything negative about it that I should reconsider? Since I started learning Direct3D programming, all the examples that I have seen use individual buffers for each stage, as in the first example, so I don't know if this is good practice.
Just to make things clearer: I am still using more than one constant buffer anyway, one for each frequency of update: per instance, per partition, per draw call, and so on.

You generally want to organize the constant buffers by "frequency of update". So in your case, you'd probably want:
cbuffer VSPerInstance : register(b0){
matrix World, View, Projection;
};
which is updated once per render frame, and a different buffer:
cbuffer PerInstance : register(b1){
float4 AmbientColor;
float4 DiffuseColor;
float4 SpecularColor;
float4 EmissiveColor;
};
which is updated per material.
You can reuse the same cbuffers at different stages of the render pipeline, and it's generally a good idea to minimize binding and update costs. The only real restriction is you can't have the same resource bound as an input and output at the same time (such as a texture bound for reading by the pixel shader and as render target in the same Draw).
See this venerable presentation from Gamefest 2007 for general recommendations on constant buffer usage in Direct3D 10 and 11.

Related

Can I modify vertex buffer in GPU through the vertex shader?

For some reason I cannot find the answer on the web. I want to update vertex attributes on the GPU through the shader, in a form similar to:
#version 330 core
layout(location = 0) in vec4 position;
uniform mat4 someTransformation;
void main()
{
position = position * someTransformation;
gl_Position = position;
}
Is it possible?
Can you write the code you have written? Not quite: vertex shader inputs are read-only in GLSL, so you would first have to copy position into a local variable.
But even with that fix, will it change the contents of any GPU storage? No.
While there are ways for a VS to directly manipulate the contents of a buffer, if the buffer region being manipulated is also potentially being used as an attribute array for a rendering command, then you will have undefined behavior.
You can use SSBOs to manipulate other storage which is not being used as the input for rendering. And you can use transform feedback to accumulate data output from vertex processing. But you cannot have a VS directly modify its own input array.

OpenGL Bindless Textures: Bind to uniform sampler2D array

I am looking into using bindless textures to rapidly display a series of images. My reference is the OpenGL 4.5 redbook. The book says I can sample bindless textures in a shader with this fragment shader:
#version 450 core
#extension GL_ARB_bindless_texture : require
in FS_INPUTS {
vec2 i_texcoord;
flat int i_texindex;
};
layout (binding = 0) uniform ALL_TEXTURES {
sampler2D fs_textures[200];
};
out vec4 color;
void main(void) {
color = texture(fs_textures[i_texindex], i_texcoord);
};
I created a vertex shader that looks like this:
#version 450 core
in vec2 vert;
in vec2 texcoord;
uniform int texindex;
out FS_INPUTS {
vec2 i_texcoord;
flat int i_texindex;
} tex_data;
void main(void) {
tex_data.i_texcoord = texcoord;
tex_data.i_texindex = texindex;
gl_Position = vec4(vert.x, vert.y, 0.0, 1.0);
};
As you may notice, my grasp of what's going on is a little weak.
In my OpenGL code, I create a bunch of textures, get their handles, and make them resident. The function I am using to get the texture handles is 'glGetTextureHandleARB'. There is another function that could be used instead, 'glGetTextureSamplerHandleARB' where I can pass in a sampler location. Here is what I did:
Texture* textures = new Texture[load_limit];
GLuint64* tex_handles = new GLuint64[load_limit];
for (int i=0; i<load_limit; ++i)
{
textures[i].bind();
textures[i].data(new CvImageFile(image_names[i]));
tex_handles[i] = glGetTextureHandleARB(textures[i].id());
glMakeTextureHandleResidentARB(tex_handles[i]);
textures[i].unbind();
}
My question is how do I bind my texture handles to the ALL_TEXTURES uniform attribute of the fragment shader? Also, what should I use to update the vertex attribute 'texindex' - an actual index into my texture handle array or a texture handle?
It's bindless texturing. You do not "bind" such textures to anything.
In bindless texturing, the data value of a sampler is a number. Specifically, the number returned by glGetTextureHandleARB. Texture handles are 64-bit unsigned integers.
In a shader, values of sampler types in buffer-backed interface blocks (UBOs and SSBOs) are 64-bit unsigned integers. So an array of samplers is equivalent in structure to an array of 64-bit unsigned integers.
So in C++, a struct equivalent to your ALL_TEXTURES block would be:
struct AllTextures
{
GLuint64 textures[200];
};
Well, assuming you properly use std140 layout, of course. Otherwise, you'd have to query the layout of the structure.
At this point, you treat the buffer as no different from any other UBO usage. Build the data for the shader by sticking an AllTextures into a buffer object, then bind that buffer as a UBO to binding 0. You just need to fill the array in with the actual texture handles.
Also, what should I use to update the vertex attribute 'texindex' - an actual index into my texture handle array or a texture handle?
Well, neither one will work. Not the way you've written it.
See, ARB_bindless_texture does not allow you to access any texture you want in any way at any time from any shader invocation. Unless you are using NV_gpu_shader5, the code leading to the texture access must be based on dynamically uniform expressions.
So unless every vertex in your rendering command gets the same index or handle... you cannot use them to pick which texture to use. Even instancing will not save you, since dynamically uniform expressions don't care about instancing.
If you want to render a bunch of quads without having to change uniforms between them (and without having to rely on an NVIDIA extension), then you have a few options. Most hardware that supports bindless texture also supports ARB_shader_draw_parameters. This gives you access to gl_DrawID, which represents the current index of a rendering command within a glMultiDraw-style command. And that extension explicitly declares that gl_DrawID is dynamically uniform.
So you could use that to select which texture to render. You simply need to issue a multi-draw command where you render the same mesh data over and over, but it gets a different gl_DrawID index in each case.

Accessing VBO/VAO Data in a GLSL Shader

In a vertex shader how can a function within the shader be made to access a specific attribute array value after buffering its vertex data to a VBO?
In the shader below the cmp() function is supposed to compare a uniform variable with vertex i.
#version 150 core
in vec2 vertices;
in vec3 color;
out vec3 Color;
uniform mat4 projection;
uniform mat4 view;
uniform mat4 model;
uniform vec2 cmp_vertex; // Vertex to compare
out int isEqual; // Output variable for cmp()
// Comparator
vec2 cmp(){
int i = 3;
return (cmp_vertex == vertices[i]);
}
void main() {
Color = color;
gl_Position = projection * view * model * vec4(vertices, 0.0, 1.0);
isEqual = cmp();
}
Also, can cmp() be modified so that it does the comparison in parallel?
Based on the naming in your shader code, and the wording of your question, it looks like you misunderstood the concept of vertex shaders.
The vertex shader is invoked once for each vertex. So when your vertex shader code executes, it always operates on a single vertex. This means that the name of your in variable is misleading:
in vec2 vertices;
This variable gives you the position of the one and only vertex your shader is working on. So it would probably be clearer if you used a name in singular form:
in vec2 vertex;
Once you realize that you're operating on a single vertex, the rest becomes easy. For the comparison:
bool cmp() {
return (cmp_vertex == vertex);
}
Vertex shaders are typically already invoked in parallel, meaning that many instances can execute at the same time, each one on its own vertex. So there is no need for parallelism within a single shader instance.
You'll probably have more issues achieving what you're after. But I hope that this gets you at least over the initial hurdle.
For example, the following out variable is problematic:
out int isEqual;
out variables of the vertex shader have matching in variables in the fragment shader. By default, the value written by the vertex shader is linearly interpolated across triangles, and the fragment shader gets the interpolated values. This is not supported for variables of type int. They only support flat interpolation:
flat out int isEqual;
But this will probably not give you what you're after, since the value you see in the fragment shader will always be the same across an entire triangle.

Using input/output structs in GLSL-Shaders

In HLSL I can write:
struct vertex_in
{
float3 pos : POSITION;
float3 normal : NORMAL;
float2 tex : TEXCOORD;
};
and use this struct as an input of a vertex shader:
vertex_out VS(vertex_in v) { ... }
Is there something similar in GLSL? Or do I need to write something like:
layout(location = 0) in vec4 aPosition;
layout(location = 1) in vec4 aNormal;
...
What you are looking for are known as interface blocks in GLSL.
Unfortunately, they are not supported for the input to the Vertex Shader stage or for the output from the Fragment Shader stage. Those must have explicitly bound data locations at program link-time (or they will be automatically assigned, and you can query them later).
Regarding the use of HLSL semantics like : POSITION, : NORMAL, : TEXCOORD, modern GLSL does not have anything like that. Vertex attributes are bound to a generic (numbered) location by name either using glBindAttribLocation (...) prior to linking or as in your example, layout (location = ...).
For input / output between shader stages, matching is done entirely on the basis of variable / interface block name during GLSL program linking (unless you use the relatively new Separate Shader Objects extension). In no case will you have constant pre-defined named semantics like POSITION, NORMAL, etc.; locations are all application- or linker-defined numbers.

Understanding Shader Programming

I am trying to understand shader programming, but at this point the documentation won't help me any further.
1] Do the data type and size of the buffers have to match?
In the DX tutorial 4 from the DX SDK, they have a struct:
struct SimpleVertex{
XMFLOAT3 Pos;
XMFLOAT4 Color;
};
While in their shader file, they define:
struct VS_OUTPUT{
float4 Pos : SV_POSITION;
float4 Color : COLOR0;
};
They define Pos as a vector of 3 in one file, while it is 4 in the other. How is this correct? I thought the sizes of the data had to match.
2] To create another constant buffer, are these the steps I need to follow?
// Make in shader file
cbuffer LightBuffer : register(b0){
float3 lDir;
float4 lColor;
}
// Make in C++ file
struct LightBuffer{
XMFLOAT3 Direction;
XMFLOAT4 Color;
};
...
LightBuffer lb;
lb.Direction=XMFLOAT3(-1.0f, -10.0f, 4.0f); // Make an instance of it
lb.Color=XMFLOAT4(0.35f, 0.5f, 1.0f, 1.0f);
...
ID3D11Buffer* lightBuffer=NULL; // Declare in global scope
D3D11_BUFFER_DESC bd;
ZeroMemory(&bd, sizeof(bd));
bd.Usage=D3D11_USAGE_DEFAULT;
bd.ByteWidth=sizeof(LightBuffer);
bd.BindFlags=D3D11_BIND_CONSTANT_BUFFER;
hr=graphics->deviceInterface->CreateBuffer(&bd, NULL, &lightBuffer);
graphics->deviceContext->UpdateSubresource(lightBuffer, 0, NULL, &lb, 0, 0);
graphics->deviceContext->PSSetConstantBuffers(0, 1, &lightBuffer);
These are the steps I followed, similar to the constant buffer setup in the tutorial. It ends up producing nothing.
I found out by accident that if I change the type of LightBuffer::Direction from XMFLOAT3 to XMFLOAT4, it works. What am I not understanding? Why can't I use the type I want?
Thanks for reading.
1 - As you can see from the struct's name, it is not the input but the output of the vertex shader. A vertex shader should output its position as 4 floats (homogeneous coordinates). So somewhere in the shader file there should be an operation that expands the vector to a float4, something like float4(inputPos, 1.0);.
2 - It's probably an alignment issue. GPUs are designed to work with 4D vectors. When using constant buffers, try to order your structure members with matrices first, 4D variables second, 3D variables third, and so on. Alternatively, you can add extra unused padding bytes, which is effectively what you did by switching to XMFLOAT4. If you have too many non-4D vectors, you can pack several into one slot with the packoffset keyword so as not to waste GPU registers. A detailed explanation is here:
http://msdn.microsoft.com/en-us/library/windows/desktop/bb509581(v=vs.85).aspx
SimpleVertex is the C++-side structure defining the input vertex layout. VS_OUTPUT is the HLSL-side structure defining the vertex shader output / pixel shader input. The layout of SimpleVertex corresponds to the vertex shader input, i.e. the arguments to the vertex shader function VS(float4 Pos : POSITION, float4 Color : COLOR). In D3D11 you use an ID3D11InputLayout object (on the C++ side) to describe how the input vertex layout (the SimpleVertex structure) should be bound to the vertex shader inputs. If you search the tutorial's C++ source code for InputLayout, you will see where this is created.
You also need to account for alignment with constant buffers. Constant buffer members are aligned on float4 (16-byte) boundaries, but C++ struct members are not, so when you use an XMFLOAT3 for Direction, your XMFLOAT4 Color no longer lines up with the layout your constant buffer defines in HLSL.