Write to some fragment locations but not others - opengl

Currently, I have two shaders that are intended to process the same type of objects, but produce different output: one color for the screen, the other selection info.
Output of draw shader:
layout(location = 0) out vec4 outColor;
Output of selection shader:
layout(location = 0) out vec4 selectionInfo0;
layout(location = 1) out ivec4 selectionInfo1;
I am considering combining the shaders together (these two and others in my application) for clarity and ease of maintenance (why edit two shaders when you can edit one?).
Output of unified shader:
layout(location = 0) out vec4 outColor;
layout(location = 1) out vec4 selectionInfo0;
layout(location = 2) out ivec4 selectionInfo1;
Under this scheme, I would set a uniform that determines which fragments need to be written to.
Can I write to some fragment locations and not others?
void main()
{
    if (Mode == 1) {
        outColor = vec4(1, 0, 0, 1);
    }
    else {
        selectionInfo0 = vec4(0.1, 0.2, 0.3, 0.4);
        selectionInfo1 = ivec4(1, 2, 3, 4);
    }
}
Is this a legitimate approach? Is there anything I should be concerned about?

Is this a legitimate approach?
That depends on how you define "legitimate". It can be made to function.
A fragment is either discarded in its entirety or it is not. If it is discarded, then the fragment has (mostly) no effect. If it is not discarded, then all of its outputs either have defined values (ie: you wrote to them), or they have undefined values.
However, undefined values can be OK, depending on other state. For example, the framebuffer's draw buffer state routes FS output colors to actual color attachments. It can also route an output to GL_NONE, which throws it away. Similarly, you can use color write masks on a per-attachment basis, turning off writes to attachments you don't want to write to.
But this means the decision cannot be made on a per-fragment basis; it can only be made using state external to the shader. The FS can't make this happen or not happen; it has to be done between draw calls with state changes.
If Mode is some kind of uniform value, then that should be OK. But if it is something derived on a per-vertex or per-fragment basis, then this will not work effectively.
As for branching performance, again, that depends on Mode. If it's a uniform, you shouldn't be concerned at all. Modern GPUs can handle that just fine. If it is something else... well, your entire scheme stops working for reasons that have already been detailed, so there's no reason to worry about it ;)
That all being said, I would advise against this sort of complexity. It is a confusing way of handling things from the driver's perspective. Also, because you're relying on a lot of things that other applications are not relying on, you open yourself up to driver bugs. Your idea is different from a traditional Ubershader, because your options fundamentally change the nature of the render targets and outputs.
So I would suggest you try to do things in as conventional a way as possible. If you really want to minimize the number of separate files you work with, employ #ifdefs, and simply patch the shader string with a #define, based on the reason you're loading it. So you have one shader file, but 2 programs built from it.
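As a sketch of that pattern (the macro name SELECTION_PASS and the output names are assumptions, not part of the original shaders), the single source file could look like this; the application prepends either nothing or a #define SELECTION_PASS line after the #version directive before compiling, yielding two distinct programs:

```glsl
#version 330 core

#ifdef SELECTION_PASS
// Selection program: two outputs, matching the selection framebuffer.
layout(location = 0) out vec4 selectionInfo0;
layout(location = 1) out ivec4 selectionInfo1;
#else
// Draw program: a single color output.
layout(location = 0) out vec4 outColor;
#endif

void main()
{
#ifdef SELECTION_PASS
    selectionInfo0 = vec4(0.1, 0.2, 0.3, 0.4);
    selectionInfo1 = ivec4(1, 2, 3, 4);
#else
    outColor = vec4(1, 0, 0, 1);
#endif
}
```

Each compiled program then declares only the outputs its framebuffer actually has, so neither pass leaves attachments with undefined values.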

Related

Vulkan, why do validation layers (and by extension the spec) forbid pipelines from not writing to certain attachments?

In Vulkan, if, within a single render pass, you naively render to a framebuffer that contains multiple attachments with a pipeline that renders to all of them, and then render again with a pipeline that renders to only one of them, you will get a validation error.
Let me give an example.
Consider the following image, which is an intermediate step in a multi-pass effect.
It is obtained by drawing a wireframe on top of an albedo:
#version 450
#extension GL_ARB_separate_shader_objects : enable
layout(location = 0) out vec4 color_out;
layout(location = 1) out vec4 color_out1;
layout(location = 2) out vec4 color_out2;
layout(location = 3) out vec4 color_out3;
layout(location = 4) out vec4 color_out4;
void main()
{
    color_out = vec4(1, 1, 1, 1);
    color_out1 = vec4(0, 0, 0, 0);
    color_out2 = vec4(0, 0, 0, 0);
    color_out3 = vec4(0, 0, 0, 0);
    color_out4 = vec4(0, 0, 0, 0);
}
The 4 "noop" outputs are not really necessary; they exist merely to prevent Vulkan validation errors.
Let's assume we instead do this (and modify our pipeline as well):
#version 450
#extension GL_ARB_separate_shader_objects : enable
layout(location = 0) out vec4 color_out;
void main()
{
    color_out = vec4(1, 1, 1, 1);
}
Then we obtain the same image.
However, a critical difference exists: the second shader produces multiple validation warnings, one for each attachment, which look like this:
Message ID name: UNASSIGNED-CoreValidation-Shader-InputNotProduced
Message: Attachment 1 not written by fragment shader; undefined values will be written to attachment
Severity: VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT
Why is not explicitly writing to the attachments of a framebuffer invalid as per the spec? I.e., why doesn't the spec say that if you do not write to an attachment, the contents are preserved?
why doesn't the spec say that if you do not write to an attachment, the contents are preserved?
Because Vulkan is a low-level rendering API.
What gets written to has always, in OpenGL just as in Vulkan, been governed by the write mask state, not anything the fragment shader does. If you have some number of attachments, again in OpenGL as well as Vulkan, any rendering operation will write to all of them (assuming the various tests are passed) unless write masks (or blending) are employed to prevent those writes.
Note that this distinction may well be a hardware issue. If a GPU uses specialized hardware to interpret fragment data and perform blending/write masking/etc, it is hardly unreasonable to consider that there may be no mechanism for the shader to directly communicate which values in the fragment are valid and which are not.
It appears that some hardware does handle this as you would prefer. Or at least in some subset of cases. However, as we're looking at unspecified behavior, there's no way to know what triggers it or in which cases it may not function.
Now, one might say that, given that the Vulkan pipeline object includes all of this state, the code which builds the internal pipeline state data could just detect which values get written by the FS and write-mask out the rest. But that's kind of against the point of being a low-level API.

Am I able to initiate blank variables and declare them afterwards?

I would like to do something like this:
vec4 text;
if (something) {
    text = texture(backgroundTexture, pass_textureCoords);
}
if (somethingElse) {
    text = texture(anotherTexture, pass_textureCoords);
}
Is this valid GLSL code, and if not, is there any suitable alternative?
The answer depends on what something and somethingElse actually are.
If this is uniform control flow, which means that all shader invocations execute the same branch, then this is perfectly valid.
If they are non-uniform expressions, then there is a limitation:
When a texture uses mipmapping or anisotropic filtering of any kind, any texture function that requires implicit derivatives will return undefined results.
To sum it up: yes, this is perfectly valid GLSL code, but texture reads in non-uniform control flow have some restrictions. More details on this can be found here.
Side note: In the code sample you are not declaring the variables afterwards. You are just assigning values to them.
Edit (more info)
To elaborate a bit more on what uniform and non-uniform control flow are: shaders can (in general) have two types of inputs, uniform variables like uniform vec3 myuniform; and varyings like in vec3 myvarying. The difference is where the data comes from and how it can change during an invocation:
Uniforms are set from the application side and are therefore constant across a draw command.
Varyings are read from vertex inputs (in the vertex shader) or are passed and interpolated from previous shader stages (e.g. in the fragment shader).
Uniform control flow means that the condition of the corresponding if statement depends only on uniforms. Everything else is non-uniform control flow. Let's have a look at this fragment shader example:
//Uniforms
uniform vec3 my_uniform;
uniform sampler2D my_sampler;

//Varyings
in vec2 tex_coord;
in vec2 ndc_coords;

void main()
{
    vec3 result = vec3(0, 0, 0);

    if (my_uniform.y > 0.5) //Uniform control flow
        result += texture(my_sampler, tex_coord).rgb;

    if (ndc_coords.y > 0.5) //Non-uniform control flow
        result += texture(my_sampler, tex_coord).rgb;

    float some_value = ndc_coords.y + my_uniform.y;
    if (some_value > 0.5) //Still non-uniform control flow
        result += texture(my_sampler, tex_coord).rgb;
}
In this shader, only the first texture read happens in uniform control flow (since its if condition depends only on a uniform variable). The other two reads are in non-uniform control flow, since their if conditions also depend on a varying (ndc_coords).
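When a read genuinely has to happen in non-uniform control flow, one common workaround (a sketch, not part of the original answer) is to use a texture function with an explicit LOD or explicit derivatives, whose result is defined even there:

```glsl
#version 330 core

uniform sampler2D my_sampler;
in vec2 tex_coord;
in vec2 ndc_coords;
out vec4 frag_color;

void main()
{
    // Derivatives must be computed outside the branch, where control
    // flow is still uniform, if textureGrad() is to be used inside it.
    vec2 ddx = dFdx(tex_coord);
    vec2 ddy = dFdy(tex_coord);

    vec3 result = vec3(0.0);
    if (ndc_coords.y > 0.5) // Non-uniform control flow
    {
        // Explicit LOD: defined even in non-uniform control flow.
        result += textureLod(my_sampler, tex_coord, 0.0).rgb;

        // Alternative with proper mip selection, using the derivatives
        // computed above:
        result += textureGrad(my_sampler, tex_coord, ddx, ddy).rgb;
    }
    frag_color = vec4(result, 1.0);
}
```

The trade-off is that textureLod pins the mip level, while textureGrad keeps mipmapping but requires hoisting the derivative computation out of the branch.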

OpenGL GLSL uniform branching vs. Multiple shaders

I've been reading many articles on uniform if statements that use branching to change the behavior of large "uber shaders". I started on an uber shader (OpenGL, LWJGL), but then I realized that the simple act of adding an if statement set by a uniform to a fragment shader that does simple calculations decreased my FPS by 5 compared to separate shaders without uniform if statements. I haven't set any cap on my FPS; it's just refreshing as fast as possible. I'm about to add normal mapping and parallax mapping, and I can see two routes:
Uber vertex shader:
#version 400 core

layout(location = 0) in vec3 position;
layout(location = 1) in vec2 textureCoords;
layout(location = 2) in vec3 normal;

uniform float RenderFlag;

void main(void){
    if(RenderFlag == 0){
        //Calculate outVariables for normal mapping to the fragment shader
    }
    if(RenderFlag == 1){
        //Calculate outVariables for parallax mapping to the fragment shader
    }
    gl_Position = MVPmatrix * vec4(position, 1);
}
Uber fragment shader:
layout(location = 0) in vec3 position;
layout(location = 1) in vec2 textureCoords;
layout(location = 2) in vec3 normal;

uniform float RenderFlag;
uniform float reflectionFlag; // if set, either of the 2 render modes will have
                              // some reflection of the skybox added to it,
                              // like a reflective surface

void main(void){
    if(RenderFlag == 0){
        //display normal mapping
        if(reflectionFlag == 1){
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            //add reflection color to final color and output
        }
    }
    if(RenderFlag == 1){
        //display parallax mapping
        if(reflectionFlag == 1){
            vec4 reflectColor = texture(cube_texture, ReflectDirR);
            //add reflection color to final color and output
        }
    }
}
The benefit of this (for me) is simplicity of flow, but it makes the overall program more complex and leaves me with ugly nested if statements. Also, if I wanted to completely avoid if statements, I would need 4 separate shaders, one for each possible branch (normal without reflection, normal with reflection, parallax without reflection, parallax with reflection), just for one feature: reflection.
1: Does GLSL execute both branches (and subsequent branches), calculate BOTH results, and then output the correct one?
2: Instead of a uniform flag for the reflection, should I remove the if statement in favor of calculating the reflection color regardless and adding it to the final color, given that it is a relatively small operation, with something like
finalColor = finalColor + reflectionColor * X
where X is a uniform variable: X == 0 if there is no reflection, X == some amount if there is.
Right off the bat, let me point out that GL4 has added subroutines, which are sort of a combination of both things you discussed. However, unless you are using a massive number of permutations of a single basic shader that gets swapped out multiple times during a frame (as you might if you had some dynamic material system in a forward rendering engine), subroutines really are not a performance win. I've put some time and effort into this in my own work and I get worthwhile improvements on one particular hardware/driver combination, and no appreciable change (good or bad) on most others.
Why did I bring up subroutines? Mostly because you're discussing what amounts to micro optimization, and subroutines are a really good example of why it doesn't pay to invest a whole lot of time thinking about that until the very end of development. If you're struggling to meet some performance number and you've crossed every high-level optimization strategy off the list, then you can worry about this stuff.
That said, it's almost impossible to answer how GLSL executes your shader. It's just a high-level language; the underlying hardware architectures have changed several times over since GLSL was created. The latest generation of hardware has actual branch predication and some pretty complicated threading engines that GLSL 1.10 class hardware never had, some of which is actually exposed directly through compute shaders now.
You could run the numbers to see which strategy works best on your hardware, but I think you'll find it's the old micro-optimization dilemma, and you may not even measure enough of a difference in performance to guess which approach to take. Keep in mind that "uber shaders" are attractive for multiple reasons (not all performance related), not the least of which is that you may have fewer and less complicated draw commands to batch. If there's no appreciable difference in performance, consider the design that's simpler and easier to implement and maintain instead.
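For what it's worth, the asker's second idea, folding the reflection term in unconditionally and scaling it by a uniform, can be sketched like this (cube_texture, ReflectDirR, and the factor and color names are assumptions based on the question's snippets):

```glsl
#version 400 core

uniform samplerCube cube_texture;
uniform float reflectionFactor; // 0.0 = no reflection, > 0.0 = reflective

in vec3 ReflectDirR;
in vec4 baseColor;
out vec4 finalColor;

void main()
{
    // Sampled in uniform control flow, so implicit derivatives are fine.
    vec4 reflectColor = texture(cube_texture, ReflectDirR);

    // No branch: when reflectionFactor is 0, the reflection term vanishes.
    finalColor = baseColor + reflectColor * reflectionFactor;
}
```

This trades one always-taken texture fetch for the removal of a branch; whether that is a win is exactly the kind of thing the answer above suggests measuring rather than guessing.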

OpenSceneGraph and GLSL 330 light and shadows

I've written plenty of
#version 330 core
GLSL shaders that I'd like to reuse with the OpenSceneGraph (OSG) 3.2.0 framework. I'm trying to figure out how to get the state I need from OSG and pass it in via uniforms, how to set those uniforms without having to change well-tested shader code, and how to populate arbitrarily named attributes.
This (version 140, OpenGL 3.1)
http://trac.openscenegraph.org/projects/osg/browser/OpenSceneGraph/trunk/examples/osgsimplegl3/osgsimplegl3.cpp
and this (version 400)
http://trac.openscenegraph.org/projects/osg/browser/OpenSceneGraph/trunk/examples/osgtessellationshaders/osgtessellationshaders.cpp
examples give rise to the notion of aliasing certain attribute and uniform names to "osg_"-prefixed ones, but I'd like to use arbitrary names for the uniforms,
uniform mat4 uMVMatrix;
/*...*/
and to refer, or let OSG refer, to the attributes by their numbers only, so something like this
/*...*/
layout(location = 0) in vec4 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aST;
as used in my legacy shaders. I'd like the OSG framework to populate these from the VBOs it already maintains for the Drawables, or, at least, to have an API call and do it myself.
In addition, I'd like to populate uniforms for lights and shadow maps by means of the scene graph and its visitors; "somewhere" and "somehow" in the SG, light and especially shadow information must be aggregated for default shading, so I'd simply like to use this data and tailor it to fit my custom shaders.
So the fundamental question is
How do I populate arbitrary GLSL 330 shaders with data from within OSG without having to resort to redundant uniform assignment (providing my "u[..]Matrix" manually in addition to the "osg_[...]" uniforms set by OSG) or changing attribute names in the shader sources?
I just stumbled upon this. It turns out you can use your own names after all, if you just specify the layout location. (So far I have only tried it for the vertex position, so you might have to take care to use the layout locations OSG expects, i.e. vertex position at 0, normal at 1, which is not done in the example at the link.)
layout (location = 0) in vec3 vertex;
This is enough to use the variable named vertex in the shader.
The link also provides an example of using custom names for matrices: you create an osg::Uniform::Callback class that uploads the matrix to the uniform.
When you create the osg::Uniform object, you specify the name of your choosing and add the callback.
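Putting the shader side together, a minimal vertex shader along those lines might look like this (the locations 0 and 1 for position and normal follow the convention mentioned above, and uMVPMatrix is an arbitrarily named uniform fed by such a callback; both are assumptions, not verified against every OSG version):

```glsl
#version 330 core

// Arbitrary uniform name, set from an osg::Uniform with an update callback.
uniform mat4 uMVPMatrix;

// Custom attribute names, bound purely by location.
layout(location = 0) in vec3 aPosition; // OSG vertex array
layout(location = 1) in vec3 aNormal;   // OSG normal array

out vec3 vNormal;

void main()
{
    vNormal = aNormal;
    gl_Position = uMVPMatrix * vec4(aPosition, 1.0);
}
```

No "osg_" names appear anywhere; OSG's data reaches the shader through the attribute locations and the named uniform alone.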

Does RenderMonkey have a bug in TEXCOORD stream mapping for GLSL?

For clarity, I start with my question:
Is it possible to use (in the shader code) the custom attribute name which I set for the TEXCOORD usage in the (OpenGL) stream mapping in RenderMonkey 1.82 or do I have to use gl_MultiTexCoord0?
(The question might be valid for the NORMAL usage too, i.e custom name or gl_Normal)
Background:
Using RenderMonkey version 1.82.
I have successfully used the stream mapping to map the general vertex attribute "position" (and maybe "normal"), but the texture coordinates do not seem to be forwarded correctly.
For the shader code, I use #version 330 and the "in" qualifier in GLSL, which should be OK since RM does not compile the shaders itself (the OpenGL driver does).
I have tried both .obj and .3ds files (exported from blender), and when checking the wavefront .obj-file, all texture coordinate information is there, as well as the vertex positions and normals.
If it is not possible, the stream mapping is broken and there is no point in naming the variables in the stream mapping editor (besides for the vertex position stream, which works), since one has to use the built-in variables anyway.
Update:
If using the deprecated built-in variables, one has to use compatibility mode in the shader, e.g.
#version 330 compatibility
out vec2 vTexCoord;
and, in the main function:
vTexCoord = vec2(gl_MultiTexCoord0);
(Now I'm not sure about the stream mapping of normals either. As soon as I got the texture coordinates working, I had normal problems and had to revert to gl_Normal.)
Here is a picture of a working solution, but with built-in variables (and yes, the commented texcoord variable in the picture does not have the same name as in the stream mapping dialog, but it had the same name when I tried to use it, so it's OK.):
You could try to use the generic vertex attributes; see http://open.gl, it's a great tutorial ;)
(But I think it implies you'd have to rewrite the code to handle the transformations manually...)
#version 330
layout(location = 0) in vec3 bla_bla_bla_Vertex;
layout(location = 2) in vec3 bla_bla_bla_Normal;
layout(location = 8) in vec3 bla_bla_bla_TexCoord0;
This is a working solution for RM 1.82