Let's say I have a varying variable between any two GLSL shader stages (e.g. the vertex and fragment stage) declared as a vec4:
in/out/varying vec4 texCoord;
What happens if I only use part of that variable (say, through swizzling) in both shaders, i.e. I only write to a part of it in the vertex shader and only read from that same part in the fragment shader?
// vertex shader
texCoord.st = ...
// fragment shader
... = texture2D(..., texCoord.st);
Is that guaranteed (i.e. by specification) to always produce sane results? It seems reasonable that it does, however I'm not too well-versed in the intricacies of GLSL language-lawyering and don't know if that varying variable is interpreted as somehow "incomplete" by the compiler/linker because it isn't fully written to in the preceding stage. I'm sure the values of texCoord.pq will be undefined anyway, but does that affect the validity of texCoord.st too or does the whole varying system operate on a pure component level?
I haven't found anything to that effect in the GLSL specification at first glance and I would prefer answers based either on the actual specification or any other "official" guarantees, rather than statements that it should work on reasonable hardware (unless of course this case simply is unspecified or implementation-defined). I would also be interested in any possible changes of that throughout GLSL history, including all the way back to its application to deprecated builtin varying variables like gl_TexCoord[] in good old GLSL 1.10.
I'm going to argue that your code will be fine, as per the specification. However, I'm not sure if you will find my reasoning 100% convincing, because I think the spec is somewhat imprecise about this. I'm going to refer to the OpenGL 4.5 Core Profile Specification and the OpenGL Shading Language 4.50 specification.
Concerning input and output variables, the GLSL spec establishes the following in section 4.3.4:
Shader input variables are declared with the storage qualifier in. They form the input interface between
previous stages of the OpenGL pipeline and the declaring shader. [...] Values from the previous pipeline stage are copied into input variables at the beginning of
shader execution.
and 4.3.6, respectively:
Shader output variables are declared with a storage qualifier using the storage qualifier out. They form
the output interface between the declaring shader and the subsequent stages of the OpenGL pipeline. [...]
During shader execution they will behave as normal
unqualified global variables. Their values are copied out to the subsequent pipeline stage on shader exit.
Only output variables that are read by the subsequent pipeline stage need to be written; it is allowed to
have superfluous declarations of output variables.
Section 5.8 "Assignments" establishes that
Reading a variable before writing (or initializing) it is legal, however the value is undefined.
Since the assignment to .st writes only to that sub-vector, we can establish that this variable will contain two initialized and two uninitialized components at the end of the shader invocation, and the whole vector will be copied to the output.
Section 11.1.2.1 of the GL spec states:
If the output variables are passed directly to the vertex processing
stages leading to rasterization, the values of all outputs are
expected to be interpolated across the primitive being rendered,
unless flatshaded. Otherwise the values of all outputs are collected
by the primitive assembly stage and passed on to the subsequent
pipeline stage once enough data for one primitive has been collected.
"The values of all outputs" are determined by the shader, and although some components have undefined values, they still have values, and there is no undefined or implementation-defined behavior here. The interpolation formulas for the line and polygon primitives (sections 14.5.1 and 14.6.1) also never mix between the components, so any defined component value will result in a defined value in the interpolated datum.
Section 11.1.2.1 also contains this statement about the vertex shader outputs:
When a program is linked, all components of any outputs written by a
vertex shader will count against this limit. A program whose vertex
shader writes more than the value of MAX_VERTEX_OUTPUT_COMPONENTS
components worth of outputs may fail to link, unless device-dependent
optimizations are able to make the program fit within available
hardware resources.
Note that this language implies that the full 4 components of a vec4 are counted against the limit as soon as a single component is written to.
On output variables, the specification says:
Their values are copied out to the subsequent pipeline stage on shader exit.
So the question boils down to two things:
What is the value of such an output variable?
That is easily answered. The section on swizzling makes it clear that writing to a swizzle mask will not modify the components that are not part of the swizzle mask. Since you did not write to those components, their values are undefined. So undefined values will be copied out to the subsequent pipeline stage.
Will interpolation of undefined values affect the interpolation of defined values?
No. Interpolation is a component-wise operation. The result of one component's interpolation cannot affect another's.
So this is fine.
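The case under discussion can be written out as a minimal vertex shader sketch. The attribute names position and uv, the uniform mvp and the #version line are assumptions made for illustration; only texCoord comes from the question.

```glsl
#version 330 core

// Hypothetical vertex attributes and uniform, used only for illustration.
layout(location = 0) in vec4 position;
layout(location = 1) in vec2 uv;
uniform mat4 mvp;

// The varying from the question, declared as a full vec4.
out vec4 texCoord;

void main()
{
    // Only the .st components are written. The .pq components keep
    // undefined values, which are copied out and interpolated as well,
    // but component-wise, so they cannot disturb .st.
    texCoord.st = uv;

    gl_Position = mvp * position;
}
```

If defined values in every component are preferred, writing the whole vector instead (for example texCoord = vec4(uv, 0.0, 0.0);) leaves nothing undefined at the cost of two extra component writes.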
I am following tutorials on both YouTube and online, and they all say to use glGetUniformLocation to get the location of a uniform in the shader. But what if you already know the uniform's location to begin with? Hard-coding the value would save one API call at the start of the program.
I'm asking specifically about whether the call is necessary given that the value can be hard-coded, not about the newer OpenGL features that give you the same result.
In my program I tried the following: glUniform4f(0, 0.1, 0.4, 0.6, 0.3); and it works! It sets the first uniform in the shader.
I've tried searching for "importance of glGetUniformLocation" and "Is glGetUniformLocation necessary" on Stack Overflow and found no one asking the same question. As someone new to OpenGL, I can't seem to find any explanation of this in the OpenGL books either.
Until GLSL gained the ability to specify a uniform's location in the text itself, you needed to call glGetUniformLocation (or an equivalent) at least once for any particular program, for any particular uniform within that program which you wanted to manipulate. You could cache that value, either in specific variables or in data structures. But you needed to call it at least once.
However, once a program could specify the location of its uniforms, there was no longer a reason to do this. After all, the shader must specify the correct name that your glGetUniformLocation call looks for, right? What is the difference between the shader specifying a name and the shader specifying a number? They're just identifiers representing some conceptual meaning within that program; one is a string, the other a number. And the shader has to use the correct identifier, the one which matches the thing your code expects to find.
So if you have the ability to specify locations in your shader, you should just do that and forgo any glGetUniformLocation usage.
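As a minimal sketch of what specifying the location in the shader text looks like: explicit uniform locations need GLSL 4.30 or the ARB_explicit_uniform_location extension. The uniform name tint and the output name fragColor are hypothetical.

```glsl
#version 430 core

// Explicit uniform location: the shader, not the linker, decides that
// this uniform lives at location 0. "tint" is a hypothetical name.
layout(location = 0) uniform vec4 tint;

out vec4 fragColor;

void main()
{
    fragColor = tint;
}
```

With this declaration, a call such as glUniform4f(0, 0.1, 0.4, 0.6, 0.3) made while the program is in use is guaranteed to set tint. Without the layout qualifier, locations are assigned by the implementation at link time, so a hard-coded 0 may happen to work on one driver without being guaranteed anywhere else.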
Firstly, I'm not entirely sure how the clipping works, but I suppose it "cuts" off the fragments that are not seen by the viewer, although I don't know how this works in practice. However, does it happen before or after the primitive assembly?
The official documentation says this:
The purpose of the primitive assembly step is to convert a vertex stream into a sequence of base primitives. For example, a primitive
which is a line list of 12 vertices needs to generate 11 line base
primitives.
The full primitive assembly step (including the processing below) will
always happen after Vertex Post-Processing. However, some Vertex
Processing steps require that a primitive be decomposed into a
sequence of base primitives. For example, a Geometry Shader operates
on each input base primitive in the primitive sequence. Therefore, a
form of primitive assembly must happen before the GS can execute.
This early primitive assembly only performs the conversion to base
primitives. It does not perform any of the below processing steps.
Such early processing must happen if a Geometry Shader or Tessellation
is active. The early assembly step for Tessellation is simplified,
since Patch Primitives are always sequences of patches.
It seems that there are two forms of primitive assembly, which I'm confused about.
First, we see that when the vertex data is first fed into the vertex shader for rendering, it has to interpret the stream of vertices as triangles, lines, etc. This is called "rendering", I suppose.
But on the other hand, the primitive assembly as quoted above also does something so similar. What is the difference between the two processes?
The article on primitives says this:
The term Primitive in OpenGL is used to refer to two similar but
separate concepts. The first is the interpretive scheme used by OpenGL
to determine what a stream of vertices represents when being rendered.
Such sequences of vertices can be arbitrarily long.
The other meaning of "Primitive" is as the result of the
interpretation of a vertex stream, as part of Primitive Assembly.
Therefore, processing a vertex stream by one of these primitive
interpretations results in an ordered sequence of primitives. The
individual primitives are sometimes called "base primitives".
If we follow the quote above, it seems that there is no difference between the two apparently separate concepts. The "interpretation step" can view, say, a sequence of 10 vertices as 8 dependent triangles. But so can the primitive assembly steps, which views the "dependent triangles" as base primitives. What is concretely different between the two?
Basically, there are a lot of things called "primitive assembly". They all do the same thing (turn a sequence of vertices into individual primitives), but several of them happen at different times.
There is a specific chapter of the specification titled "Fixed-Function Primitive Assembly and Rasterization." One could argue that this is where "the" Primitive Assembly stage happens. But the standard also says that it happens after the VS, after the TES, and after the GS (where applicable).
The standard still talks about "the primitive assembly stage" as if there were only one, despite the fact that it clearly calls for multiple of them.
What's clear is that the process of clipping knows about the individual primitives, so some primitive assembly has happened prior to reaching that stage.
I found something in the Specification of OpenGL Version 4.6 (Core Profile):
The output of Vertex Shader:
If the output variables are passed directly to the vertex processing stages leading to rasterization, the values of all outputs are expected to be interpolated across the primitive being rendered, unless flatshaded. Otherwise the values of all outputs are collected by the primitive assembly stage and passed on to the subsequent pipeline stage once enough data for one primitive has been collected.
It seems that if there is no TES and no GS, "Primitive Assembly" is done later. Otherwise there will be an "early primitive assembly", as the official documentation says.
However in "11.1.3 Shader Execution" of the specification:
The following sequence of operations is performed:
Vertices are processed by the vertex shader (see section 11.1) and assembled into primitives as described in sections 10.1 through 10.3.
If the current program contains a tessellation control shader, each individual patch primitive is processed by the tessellation control shader (section 11.2.1). Otherwise, primitives are passed through unmodified. If active, the tessellation control shader consumes its input patch and produces a new patch primitive, which is passed to subsequent pipeline stages.
If the current program contains a tessellation evaluation shader, each individual patch primitive is processed by the tessellation primitive generator (section 11.2.2) and tessellation evaluation shader (see section 11.2.3). Otherwise, primitives are passed through unmodified. When a tessellation evaluation shader is active, the tessellation primitive generator produces a new collection of point, line, or triangle primitives to be passed to subsequent pipeline stages. The vertices of these primitives are processed by the tessellation evaluation shader. The patch primitive passed to the tessellation primitive generator is consumed by this process.
If the current program contains a geometry shader, each individual primitive is processed by the geometry shader (section 11.3). Otherwise, primitives are passed through unmodified. If active, the geometry shader consumes its input patch primitive. However, each geometry shader invocation may emit new vertices, which are arranged into primitives and passed to subsequent pipeline stages.
Following shader execution, the fixed-function operations described in chapter 13 are applied.
Have a look at the "fixed-function operations" in chapter 13, which perform all of the "Vertex Post-processing" such as clipping, perspective divide, viewport transform and transform feedback. In chapter 13, I found:
After programmable vertex processing, the following fixed-function operations are applied to vertices of the resulting primitives:...
My understanding
I think the exact point at which "Primitive Assembly" happens is a little hard to pin down, but I tend to believe that it is done just after vertex processing. As Nicol said: "What's clear is that the process of clipping knows about the individual primitives, so some primitive assembly has happened prior to reaching that stage."
I think one of the main tasks of the so-called "Primitive Assembly" stage, which sits between vertex processing and rasterization, is face culling. (I am not familiar with multi-draw; maybe this stage has something to do with that too.)
A figure of vertex processing:
A simple pipeline based on my understanding:
# Start
(Vertices Data)
|
|
|
V
Vertex Shader # Do primitive assembly here
|
|
| (primitives)
|
V
[Tessellation Shaders]
|
|
| (primitives)
|
V
[Geometry Shader]
|
|
| (primitives)
|
V
Vertex Post-processing
|
|
| (primitives)
|
V
Primitive Assembly # Mainly do face culling
|
|
| (primitives)
|
V
Rasterization
|
|
| (fragment)
|
V
Fragment Shader
# [] means can be ignored
On the Internet I found some examples of TCS code, where gl_TessLevel* variables are set only for one output patch vertex
// first code snippet
if ( gl_InvocationID == 0 ) // set tessellation level, can do only for one vertex
{
gl_TessLevelOuter [0] = foo;
gl_TessLevelOuter [1] = bar;
}
instead of just
// second code snippet
gl_TessLevelOuter [0] = foo;
gl_TessLevelOuter [1] = bar;
It works the same with and without the condition check, but I didn't find anything about such usage on the OpenGL wiki.
Thinking logically, it should be OK to set these variables in only one TCS invocation, and it would be weird to set them to different values based on gl_InvocationID. So my questions are:
Is this way of setting gl_TessLevel* correct and may it cause errors or crashes on some platforms?
If it's correct, should it be used always? Is it idiomatic?
And finally, how do both snippets affect performance? May the first snippet slow-down performance due to branching? May the second snippet cause redundant and/or idle invocations of subsequent pipeline stages, also slowing down performance?
What you are seeing here is an attempt by the shader's author to establish a convention similar to provoking vertices used by other primitive types.
OpenGL Shading Language 4.50 - 2.2 Tessellation Control Processor - p. 7
Tessellation control shader invocations run mostly independently, with undefined relative execution order.
However, the built-in function barrier() can be used to control execution order by synchronizing invocations, effectively dividing tessellation control shader execution into a set of phases.
Tessellation control shaders will get undefined results if one invocation reads a per-vertex or per-patch attribute written by another invocation at any point during the same phase, or if two invocations attempt to write different
values to the same per-patch output in a single phase.
It is unclear from the shader pseudo-code whether foo and bar are uniform across all TCS invocations. If they are not, the second shader snippet invokes undefined behavior due to the undefined relative ordering.
Arbitrarily deciding that the first invocation is the only one that is allowed to write the per-patch attribute solves this problem and is analogous to a first-vertex provoking convention. A last-vertex convention could just as easily be implemented since the number of patch vertices is known to all invocations.
None of this is necessary if you know foo and bar are constant, however.
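A minimal sketch of the first-invocation convention for triangle patches might look like the following. The uniform names outerLevel and innerLevel are hypothetical stand-ins for foo and bar; because they are uniforms, every invocation would compute the same values, so the guard here is a convention rather than a requirement.

```glsl
#version 400 core

// Three output control points per patch (triangle patches assumed).
layout(vertices = 3) out;

// Hypothetical uniforms standing in for "foo" and "bar".
uniform float outerLevel;
uniform float innerLevel;

void main()
{
    // Every invocation passes its own control point through.
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    // Only invocation 0 writes the per-patch outputs, so no two
    // invocations can ever write different values to them.
    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = outerLevel;
        gl_TessLevelOuter[1] = outerLevel;
        gl_TessLevelOuter[2] = outerLevel;
        gl_TessLevelInner[0] = innerLevel;
    }
}
```

The guard starts to matter once the levels are derived from per-invocation data, since that is the situation in which two invocations could write different values to the same per-patch output.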
I'm using OpenGL 3.3 GLSL 1.5 compatibility. I'm getting a strange problem with my vertex data. I'm trying to pass an index value to the fragment shader, but the value seems to change based on my camera position.
This should be simple: I pass a GLfloat through the vertex shader to the fragment shader. I then convert this value to an unsigned integer. The value is correct the majority of the time, except at the edges of the fragment. No matter what I do, the same distortion appears. Why does my camera position change this value? Even in the ridiculous example below, tI erratically equals something other than 1.0:
uint i;
if (tI == 1.0) i = 1;
else i = 0;
vec4 color = texture2D(tex[i], t) ;
If I send integer data instead of float data I get the exact same problem. It does not seem to matter what I enter as vertex Data. The value I enter into the data is not consistent across the fragment. The distortion even looks the exact same each time.
What you are doing here is invalid in OpenGL/GLSL 3.30.
Let me quote the GLSL 3.30 specification, section 4.1.7 "Samplers" (emphasis mine):
Samplers aggregated into arrays within a shader (using square brackets
[ ]) can only be indexed with integral constant expressions (see
section 4.3.3 “Constant Expressions”).
Using a varying as index to a texture does not represent a constant expression as defined by the spec.
Beginning with GL 4.0, this was somewhat relaxed. The GLSL 4.00 specification states now the following (still my emphasis):
Samplers aggregated into arrays within a shader (using square brackets
[ ]) can only be indexed with a dynamically uniform integral
expression, otherwise results are undefined.
With dynamically uniform being defined as follows:
A fragment-shader expression is dynamically uniform if all fragments
evaluating it get the same resulting value. When loops are involved,
this refers to the expression's value for the same loop iteration.
When functions are involved, this refers to calls from the same call
point.
So now this is a bit tricky. If all fragment shader invocations actually get the same value for that varying, it would be allowed, I guess. But it is unclear that your code guarantees that. You should also take into account that the fragment might even be sampled outside of the primitive.
However, you should never check floats for equality; there will be numerical issues. I don't know exactly what you are trying to achieve here, but you might use some simple rounding, or use an integer varying. In any case, you should also disable interpolation of the value with the flat qualifier (which is required for the integer case anyway), which should greatly improve the chances of that construct being dynamically uniform.
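One way to stay within the GLSL 3.30 rule is to index the sampler array with constants only and use the flat integer varying purely to select a result. The sketch below assumes a core-profile fragment shader; the names texIndex and fragColor are hypothetical, and the vertex shader would need a matching flat out int texIndex fed from an integer attribute (for example one set up with glVertexAttribIPointer).

```glsl
#version 330 core

uniform sampler2D tex[2];

// Hypothetical integer varying; integer inputs must be declared flat,
// so the value is taken from the provoking vertex, not interpolated.
flat in int texIndex;
in vec2 t;

out vec4 fragColor;

void main()
{
    // Both lookups use integral constant expressions as indices,
    // which is what GLSL 3.30 requires for sampler arrays.
    vec4 c0 = texture(tex[0], t);
    vec4 c1 = texture(tex[1], t);

    // Integer comparison, so there is no float equality test involved.
    fragColor = (texIndex == 1) ? c1 : c0;
}
```

Sampling both textures costs an extra fetch, but it keeps the texture lookups in uniform control flow. For more than a handful of textures, an array texture (sampler2DArray) is probably the better fit.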
As it says on the tin: Is there any reason, ever, to use gl_FragColor instead of gl_FragData[0]? If not, then why does gl_FragColor even exist? Is it mere legacy from a time when gl_FragData did not exist?
(Yes, I know that both are deprecated in the latest GLSL versions, but I still want to write code that can run on older cards.)
I will refer you to the OpenGL specification for GLSL 1.1, as it makes very little distinction between the two except to say that they are mutually exclusive.
The OpenGL Shading Language (version 1.1) - 7.2 Fragment Shader Special Variables - p. 43
If a shader statically assigns a value to gl_FragColor, it may not assign a value to any element of gl_FragData. If a shader statically writes a value to any element of gl_FragData, it may not assign a value to gl_FragColor. That is, a shader may assign values to either gl_FragColor or gl_FragData, but not both.
Given this language, gl_FragColor should probably be preferred in shaders that do not use MRT (multiple render targets). For shaders that output to multiple buffers, use gl_FragData [n]. But never mix-and-match, even though you might logically assume that gl_FragColor is an alias to gl_FragData [0].
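For the single-target case, a minimal GLSL 1.10 fragment shader sketch could look like this. The sampler name tex is hypothetical, and the texture coordinate is assumed to arrive through the built-in gl_TexCoord[0] varying.

```glsl
#version 110

// Hypothetical sampler name.
uniform sampler2D tex;

void main()
{
    // Single render target: write gl_FragColor only.
    // Assigning to gl_FragData[0] in the same shader is not allowed.
    gl_FragColor = texture2D(tex, gl_TexCoord[0].st);
}
```

A shader targeting multiple buffers would instead assign to gl_FragData[0], gl_FragData[1] and so on, and leave gl_FragColor untouched.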
The GLSL specification pre-dates FBOs, so having an array of locations to output fragment data did not always make sense. Since GLSL and FBOs are both core in OpenGL 3.0, it is easy to take this for granted. The old ARB extension specification for fragment shaders has a blurb on this very subject:
14) What is the interaction with a possible MRT (Multiple Render Target)
extension?
The OpenGL Shading Language defines the array gl_FragData[] to output
values to multiple buffers. There are two situations to consider.
1) There is no MRT extension support. A shader can statically assign a
value to either gl_FragColor or gl_FragData[0] (but not both).
Either way the same buffer will be targeted.
2) There is MRT support. In this case what happens is defined in the
relevant MRT extension documentation.
They were thinking ahead, but did not quite have all the pieces in place yet. This is ancient history in any case.