Why is the tessellation control shader invoked many times? - c++

My question is: if all tessellation control shader invocations produce the same result, why does OpenGL have to call this shader many times for each patch?
For example, my tessellation control shader calculates control points for a Bézier surface. It takes an array of three vertices, which was aggregated earlier from the vertex shader invocations:
// attributes of the input CPs
in vec3 WorldPos_CS_in[];
My patch size is 3, so the tessellation control shader is called three times with the same input (apart from gl_InvocationID), and then all of the invocations produce the same control points:
struct OutputPatch
{
    vec3 WorldPos_B030;
    vec3 WorldPos_B021;
    vec3 WorldPos_B012;
    vec3 WorldPos_B003;
    vec3 WorldPos_B102;
    vec3 WorldPos_B201;
    vec3 WorldPos_B300;
    vec3 WorldPos_B210;
    vec3 WorldPos_B120;
    vec3 WorldPos_B111;
    vec3 Normal[3];
    vec2 TexCoord[3];
};
// attributes of the output CPs
out patch OutputPatch oPatch;
And they also all produce the same information that tells OpenGL how to subdivide this patch into tessellation coordinates:
// Calculate the tessellation levels
gl_TessLevelOuter[0] = gTessellationLevel;
gl_TessLevelOuter[1] = gTessellationLevel;
gl_TessLevelOuter[2] = gTessellationLevel;
gl_TessLevelInner[0] = gTessellationLevel;
Clearly all of the tessellation control shader invocations do the same job. Doesn't that waste resources? Why shouldn't the tessellation control shader be called only once per patch?

Well, the control shader invocations don't produce exactly the same result, because each one writes a different output control point. But that's being pedantic.
In your program, and in all of mine so far, yes, the control shader does exactly the same thing for every control point and the tessellation level doesn't change.
But suppose the shader generated new attributes for each control point, a per-control-point normal or texture coordinate, say? Then it would produce different results for each invocation. It's nice to have the extra flexibility if you need it.
Modern GPUs try to do as much as possible in parallel. The older geometry shaders have one invocation generating multiple outputs. It's more efficient, not less, on modern GPUs to have multiple invocations each generating one output.
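To make that concrete, here is a minimal sketch of such a control shader, assuming the question's triangle patches and the WorldPos_CS_in input above (the output name WorldPos_ES_in and the rest are illustrative assumptions, not the asker's actual code):
#version 410 core

layout(vertices = 3) out;   // three invocations per patch, one per output control point

in vec3 WorldPos_CS_in[];   // aggregated from the vertex shader, as in the question
out vec3 WorldPos_ES_in[];  // per-control-point output (name assumed)

uniform float gTessellationLevel;

void main()
{
    // each invocation writes exactly one output control point
    WorldPos_ES_in[gl_InvocationID] = WorldPos_CS_in[gl_InvocationID];

    // per-patch values such as the tessellation levels only need to be
    // written once; guarding on gl_InvocationID makes that explicit
    // (writing the same value from every invocation is also allowed)
    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = gTessellationLevel;
        gl_TessLevelOuter[1] = gTessellationLevel;
        gl_TessLevelOuter[2] = gTessellationLevel;
        gl_TessLevelInner[0] = gTessellationLevel;
    }
}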

Related

How to access gl_Position variable for a specific vertex?

In the vertex shader, the position of the current vertex is stored in gl_Position, from what I understand. If I wanted to change that variable for a specific vertex in the tessellation shader, how would I do that? Does each vertex contain its own gl_Position variable? Is there a way to do something like vertex1.gl_Position, so that OpenGL would know I want to modify vertex1's gl_Position variable?
The tessellation control shader operates on patches, specific groups of vertices that are conceptually tessellated as a bundle. As such, the TCS's inputs are arrayed. The pre-defined VS outputs are aggregated into an array of TCS inputs in the following input interface block:
in gl_PerVertex
{
    vec4 gl_Position;
    float gl_PointSize;
    float gl_ClipDistance[];
} gl_in[gl_MaxPatchVertices];
So gl_in[1].gl_Position is the gl_Position value output by the vertex shader for the second vertex in the patch.
Note that, while the array is sized on the maximum supported number of patch vertices, the actual limit is gl_PatchVerticesIn, the number of vertices in the tessellation patch.
Each TCS invocation outputs to a single vertex in the output patch (which may have a different number of vertices than the input patch). TCS outputs are also arrayed, and the built-in outputs live in a block similar to the input one:
out gl_PerVertex
{
    vec4 gl_Position;
    float gl_PointSize;
    float gl_ClipDistance[];
} gl_out[];
Each TCS invocation is meant to write only to the output index gl_InvocationID. If a barrier is used, they can read from output vertices that were written by other invocations (or other patch outputs).
Tessellation evaluation shaders operate on individual vertices in the abstract patch generated by the primitive generator. However, all TES invocations operating on the same patch get the entire output patch from the TCS. This is arrayed in the same way as the TCS input patch, just with an array size defined by the TCS's output patch size.
If the TCS did not write to the predefined outputs, the corresponding TES reads undefined values from them. As such, they're not actually special; they have no intrinsic meaning. gl_in[1].gl_Position in the TES is only a "position" if the TCS put a position value there.
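As a minimal sketch (triangle patches and a constant tessellation level are assumed here), the TCS below copies each input vertex's gl_Position into the matching output slot, and the TES then reads the whole output patch:
// tessellation control shader (separate shader object)
#version 410 core
layout(vertices = 3) out;

void main()
{
    // this invocation writes only its own output vertex
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;

    if (gl_InvocationID == 0) {   // per-patch values, written once
        gl_TessLevelOuter[0] = 4.0;
        gl_TessLevelOuter[1] = 4.0;
        gl_TessLevelOuter[2] = 4.0;
        gl_TessLevelInner[0] = 4.0;
    }
}

// tessellation evaluation shader (separate shader object)
#version 410 core
layout(triangles, equal_spacing, ccw) in;

void main()
{
    // gl_in here is the TCS output patch; gl_Position only means "position"
    // because the TCS above wrote a position into it
    gl_Position = gl_TessCoord.x * gl_in[0].gl_Position
                + gl_TessCoord.y * gl_in[1].gl_Position
                + gl_TessCoord.z * gl_in[2].gl_Position;
}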

How to pass GL_PATCHES

I am trying to create an example of an interpolated surface.
First I created an example of an interpolated trefoil.
Here is the source of my example.
Then I noticed that the animation is pretty slow, around 20-30 FPS.
After reading some papers, I learned that I have to "move" the evaluation of the trefoil onto the GPU, so I studied some papers about tessellation shaders.
At the moment I bind the following simple vertex shader:
#version 130
in vec4 Position;
in vec3 Normal;
uniform mat4 Projection;
uniform mat4 Modelview;
uniform mat3 NormalMatrix;
uniform vec3 DiffuseMaterial;
out vec3 EyespaceNormal;
out vec3 Diffuse;
void main()
{
    EyespaceNormal = NormalMatrix * Normal;
    gl_Position = Projection * Modelview * Position;
    Diffuse = DiffuseMaterial;
}
Now I have multiple questions:
Do I use an array of vertices to pass GL_PATCHES, like I already did with triangle strips? Which way is faster? glDrawElements?
glDrawElements(GL_TRIANGLE_STRIP, Indices.Length, OpenGL.GL_UNSIGNED_SHORT, IntPtr.Zero);
or should I use
glPatchParameteri(GL_PATCH_VERTICES,16);
glBegin(GL_PATCHES);
glVertex3f(x0, y0, z0);
...
glEnd();
What about the array of indices? How can I determine the order in which the patches will be passed?
Do I calculate the normals in the shader as well?
I found some examples of tessellation shaders, but only in #version 400.
Can I use this version on mobile devices (OpenGL ES) as well?
Can I pass multiple patches to the GPU by multithreading?
Many many thanks in advance.
In essence I don't believe you have to send anything to the GPU in terms of indices (or vertices) as everything can be synthesized. I don't know if the evaluation of the trefoil knot directly maps onto the connectivity of the resulting tessellated mesh of a bilinear patch, but this could work.
You could make do with a simple vertex buffer where each vertex is the position of a single trefoil knot. Set glPatchParameteri(GL_PATCH_VERTICES, 1). Then you could draw multiple knots with a single call to glDrawArrays:
glDrawArrays(GL_PATCHES, 0, numKnots);
The tessellation control stage can be a simple pass-through stage. Then, in the tessellation evaluation shader, you can use the quads abstract patch type and move the evaluation of the trefoil knot, or any other biparametric shape, into that shader, using the supplied [u, v] coordinates. You could then translate every trefoil by its input vertex. The normals can be calculated in the shader as well.
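A sketch of what such an evaluation shader could look like (the TCS is assumed to be a pass-through that forwards gl_Position and sets the tessellation levels; evaluateTrefoil is a placeholder for whatever biparametric formula your CPU code currently evaluates):
#version 400 core
layout(quads, equal_spacing, ccw) in;

uniform mat4 Projection;
uniform mat4 Modelview;

out vec3 EyespaceNormal;

// placeholder body: put the biparametric trefoil evaluation from the CPU code here
vec3 evaluateTrefoil(float u, float v)
{
    return vec3(u, v, 0.0);
}

void main()
{
    // the primitive generator supplies (u, v) in [0, 1] x [0, 1] per vertex
    float u = gl_TessCoord.x;
    float v = gl_TessCoord.y;

    // evaluate the surface on the GPU and translate it by the patch's
    // single input vertex (one knot per patch)
    vec3 p = evaluateTrefoil(u, v) + gl_in[0].gl_Position.xyz;

    // normals can be derived here as well, e.g. from analytic partial
    // derivatives or finite differences in (u, v); placeholder value for now
    EyespaceNormal = vec3(0.0, 0.0, 1.0);

    gl_Position = Projection * Modelview * vec4(p, 1.0);
}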
Alternatively, you could use the geometry shader to synthesize the trefoil just from one input vertex position using points as input primitive and triangle strip as output primitive. Then you could just call again
glDrawArrays(GL_POINTS, 0, numKnots);
and create the trefoil in the geometry shader, using your index-generation function to determine the order of evaluation and translating the generated vertices by the input vertex.
In both cases there is no need to multithread the draw calls, which is ineffective with OpenGL anyway. You are limited by the maximum number of vertices that can be generated per patch, which should be 64 times 64 for tessellation (the maximum tessellation level is required to be at least 64) and GL_MAX_GEOMETRY_OUTPUT_VERTICES for geometry shaders.

Am I able to initiate blank variables and declare them afterwards?

I would like to do something like this:
vec4 text;
if (something) {
    text = texture(backgroundTexture, pass_textureCoords);
}
if (somethingElse) {
    text = texture(anotherTexture, pass_textureCoords);
}
Is this valid GLSL code, and if not, is there any suitable alternative?
The answer depends on what something and somethingElse actually are.
If this is uniform control flow, which means that all shader invocations execute the same branch, then this is perfectly valid.
If they are non-uniform expressions, then there are some limitations:
When a texture uses mipmapping or anisotropic filtering of any kind, any texture function that requires implicit derivatives will return undefined results, because those derivatives are undefined inside non-uniform control flow.
To sum it up: yes, this is perfectly valid GLSL code, but texture reads in non-uniform control flow have some restrictions. More details on this can be found here.
Side note: In the code sample you are not declaring the variables afterwards. You are just assigning values to them.
Edit (more info)
To elaborate a bit more on what uniform and non-uniform control flow are: shaders can (in general) have two types of inputs, uniform variables like uniform vec3 myuniform; and varyings like in vec3 myvarying;. The difference is where the data comes from and how it can change during an invocation:
Uniforms are set from the application side and are therefore constant over an invocation (where "invocation", simplified, means a draw command).
Varyings are read from vertex inputs (in the vertex shader) or are passed and interpolated from previous shader stages (e.g. in the fragment shader).
Uniform control flow now means that the condition of the corresponding if statement depends only on uniforms. Everything else is non-uniform control flow. Let's have a look at this fragment shader example:
//Uniforms
uniform vec3 my_uniform;
uniform sampler2D my_sampler;

//Varyings
in vec2 tex_coord;
in vec2 ndc_coords;

void main()
{
    vec3 result = vec3(0, 0, 0);

    if (my_uniform.y > 0.5)   //Uniform control flow
        result += texture(my_sampler, tex_coord).rgb;

    if (ndc_coords.y > 0.5)   //Non-uniform control flow
        result += texture(my_sampler, tex_coord).rgb;

    float some_value = ndc_coords.y + my_uniform.y;
    if (some_value > 0.5)     //Still non-uniform control flow
        result += texture(my_sampler, tex_coord).rgb;
}
In this shader only the first texture read happens in uniform control flow (since the if condition depends only on a uniform variable). The other two reads are in non-uniform control flow, since their if conditions also depend on a varying (ndc_coords).
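If you cannot make the condition uniform, a common workaround (sketched below with the same inputs as the example above; it is one option among several) is to perform the implicit-derivative fetch in uniform control flow and only use its result conditionally, or to switch to an explicit-LOD lookup such as textureLod, which needs no derivatives:
#version 330 core

uniform sampler2D my_sampler;

in vec2 tex_coord;
in vec2 ndc_coords;

out vec4 frag_color;

void main()
{
    // option 1: fetch unconditionally (uniform control flow), so the implicit
    // derivatives are well defined, and only make the *use* conditional
    vec3 fetched = texture(my_sampler, tex_coord).rgb;

    vec3 result = vec3(0.0);
    if (ndc_coords.y > 0.5)          // non-uniform branch
        result += fetched;

    // option 2: an explicit-LOD lookup has no implicit derivatives, so it is
    // fine even inside non-uniform control flow
    if (ndc_coords.y > 0.5)
        result += textureLod(my_sampler, tex_coord, 0.0).rgb;

    frag_color = vec4(result, 1.0);
}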

Fragment shader color interpolation: details and hardware support

I know using a very simple vertex shader like
attribute vec3 aVertexPosition;
attribute vec4 aVertexColor;
uniform mat4 uMVMatrix;
uniform mat4 uPMatrix;
varying vec4 vColor;
void main(void) {
    gl_Position = uPMatrix * uMVMatrix * vec4(aVertexPosition, 1.0);
    vColor = aVertexColor;
}
and a very simple fragment shader like
precision mediump float;
varying vec4 vColor;
void main(void) {
    gl_FragColor = vColor;
}
to draw a triangle with red, blue, and green vertices will produce a triangle whose colors are smoothly interpolated across its surface (the original post showed an image of such a triangle here).
My questions are:
Do calculations for interpolating fragment colors belonging to one triangle (or a primitive) happen in parallel on GPU?
What are the algorithm and also hardware support for interpolating fragment colors inside the triangle?
The interpolation happens in the fragment-processing step.
The algorithm is very simple: the colors are just interpolated according to the fragment's position (its barycentric coordinates) within the triangle.
Yes, absolutely.
Triangle color interpolation is part of the fixed-function pipeline (it's actually part of the rasterization step, which happens before fragment processing), so it is carried out entirely in hardware on probably all video cards. The equations for interpolating vertex data can be found e.g. in the OpenGL 4.3 specification, section 14.6.1 (pp. 405-406). The algorithm defines barycentric coordinates for the triangle and uses them to interpolate between the vertices.
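For illustration, the interpolation described there boils down to the following GLSL-style sketch (bary holds the barycentric weights a, b, c with a + b + c = 1, and invW the reciprocal clip-space w of the three vertices for the perspective-correct case):
// noperspective (screen-space linear) interpolation:
//   f = a*f0 + b*f1 + c*f2
// perspective-correct interpolation (the default for smooth varyings):
//   f = (a*f0/w0 + b*f1/w1 + c*f2/w2) / (a/w0 + b/w1 + c/w2)
vec4 interpolateColor(vec3 bary, vec3 invW, vec4 c0, vec4 c1, vec4 c2)
{
    vec3 k = bary * invW;   // a/w0, b/w1, c/w2
    return (k.x * c0 + k.y * c1 + k.z * c2) / (k.x + k.y + k.z);
}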
Besides the answers given here, I wanted to add that there doesn't have to be dedicated fixed-function hardware for the interpolation. Modern GPUs tend to use "pull-model interpolation", where the interpolation is actually done in the shader units.
I recommend reading Fabian Giesen's blog article about the topic (and the whole series of articles about the graphics pipeline in general).
On the first question: though there are parallel units on the GPU, it depends on the size of the triangle under consideration. On most GPUs, drawing happens on a tile-by-tile basis, and depending on the "screen" size of the triangle, if it falls completely within just one tile it will be processed entirely by one tile processor. If it is split across tiles, it can be processed in parallel by different units.
The second question is answered by other posters before me.

GLSL shader for each situation

In my game I want to create separate GLSL shaders for each situation. For example, if I had three models (a character, a shiny sword and a blurry ghost), I would like to assign renderShader, animationShader and lightingShader to the character; renderShader, lightingShader and specularShader to the shiny sword; and finally renderShader, lightingShader and blurShader to the blurry ghost.
The renderShader should multiply the vertex positions by the projection, world and other matrices, and its fragment shader should simply apply the texture to the model.
animationShader should transform vertices by the given bone transforms.
lightingShader should do the lighting and specularShader should do the specular lighting.
blurShader should do the blur effect.
Now, first of all, how can I do multiple vertex transforms in different shaders? The animationShader should calculate the animated positions of the vertices, and then renderShader should take those positions and transform them by some matrices.
Secondly, how can I change the color of fragments in different shaders?
The basic idea is that I want to be able to use different shaders for each situation/effect, and I don't know how to achieve it.
I need to know how I should use these shaders in OpenGL, and how I should write the GLSL so that the shaders complement each other and don't care whether another shader is used or not.
What you're asking for is decidedly non-trivial, and is probably extreme overkill for the relatively limited number of "shader" types you describe.
Doing what you want will require developing what is effectively your own shading language. It may be a highly #defined version of GLSL, but the shaders you write would not be pure GLSL. They would have specialized hooks and be written in ways that code could be expected to flow into other code.
You'll need your own way of specifying the inputs and outputs of your language. When you want to connect shaders together, you have to say whose outputs go to which shader's inputs. Some inputs can come from actual shader stage inputs, while others come from other shaders. Some outputs written by a shader will be actual shader stage outputs, while others will feed other shaders.
Therefore, a shader that needs an input from another shader must execute after that other shader. Your system will have to work out the dependency graph.
Once you've figured out all of the inputs and outputs for a specific sequence of shaders, you have to take all of those shader text files and compile them into GLSL, as appropriate. Obviously, this is a non-trivial process.
Your shader language might look like this:
INPUT vec4 modelSpacePosition;
OUTPUT vec4 clipSpacePosition;
uniform mat4 modelToClipMatrix;
void main()
{
    clipSpacePosition = modelToClipMatrix * modelSpacePosition;
}
Your "compiler" will need to do textual transformations on this, converting references to modelSpacePosition into an actual vertex shader input or a variable written by another shader, as appropriate. Similarly, if clipSpacePosition is to be written to gl_Position, you will need to convert all uses of clipSpacePosition to gl_Position. Also, you will need to remove the explicit output declaration.
In short, this will be a lot of work.
If you're going to do this, I would strongly urge you to avoid trying to merge the concepts of vertex and fragment shaders. Keep this shader system working within the well-defined shader stages. So your "lightingShader" would need to be either a vertex shader or a fragment shader. If it's a fragment shader, then one of the shaders in the vertex shader stage that feeds into it will need to provide a normal in some way, or you'll need the fragment shader component to compute the normal via some mechanism.
Effectively, for every combination of the shader stages you'll have to create an individual shader program. To save work and avoid redundancy, you'd use some caching structure to create a program for each requested combination only once and reuse it whenever that combination is requested again.
You can do something similar with the shader stages themselves. However, a shader stage cannot be linked from several compilation units (yet; this is an ongoing effort in OpenGL development, and the separable shaders of OpenGL 4 are a stepping stone there). But you can compile a shader from several sources. So you'd write the functions for each desired effect into a separate source and then combine them at compile time. And again, use a caching structure to map combinations of source modules to shader objects.
Update due to comment
Let's say you want to have some modularity. For this we can exploit the fact that glShaderSource accepts multiple source strings; it simply concatenates them. You write a number of shader modules, one doing the per-vertex illumination calculations:
uniform vec3 light_positions[N_LIGHT_SOURCES];
out vec3 light_directions[N_LIGHT_SOURCES];
out vec3 light_halfdirections[N_LIGHT_SOURCES];
void illum_calculation()
{
    for (int i = 0; i < N_LIGHT_SOURCES; i++) {
        light_directions[i] = ...;
        light_halfdirections[i] = ...;
    }
}
you put this into illum_calculation.vs.glslmod (the filename and extensions are arbitrary). Next you have a small module that does bone animation
uniform vec4 armature_pose[N_ARMATURE_BONES];
uniform vec3 armature_bones[N_ARMATURE_BONES];
in vec3 vertex_position;
void skeletal_animation()
{
    /* ... */
}
put this into illum_skeletal_anim.vs.glslmod. Then you have some common header
#version 330
uniform ...;
in ...;
and a common tail that contains the main function, which invokes all the different stages:
void main() {
    skeletal_animation();
    illum_calculation();
}
and so on. Now you can load all of those modules, in the right order, into a single shader stage. You can do the same with all shader stages. The fragment shader is special, since it can write to several framebuffer targets at the same time (in recent enough OpenGL versions). And technically you can pass a lot of varyings between the stages, so you could pass a separate set of varyings between shader stages for each framebuffer target. However, the geometry and the transformed vertex positions are common to all of them.
You have to provide different shader programs for each Model you want to render.
You can switch between different shader combinations using the glUseProgram function.
So before rendering your character or shiny sword or whatever you have to initialize the appropriate shader attributes and uniforms.
So it is just a question of how you design your game's code:
you need to provide all the uniforms to the shader (for example light information and texture samplers), and you must enable all necessary vertex attributes of the shader in order to assign position, color and so on.
These attributes can differ between the shaders, and your client-side models can also have different kinds of vertex attribute structures.
That means the model in your code directly influences the assigned shader and depends on it.
If you want to share common code between different shader programs, e.g. illuminateDiffuse,
you have to factor this function out and provide it to your shaders by simply inserting the string literal that contains the function into your shader code, which is itself nothing more than a string literal. That way you can achieve a kind of modularity, or include-like behavior, through string manipulation of your shader code.
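Such a shared function could be as small as the sketch below (the signature is just an assumption); its source string is simply prepended to every fragment shader that calls it:
// shared module kept in its own string/file, e.g. an "illuminate diffuse" snippet
vec3 illuminateDiffuse(vec3 albedo, vec3 normal, vec3 lightDir)
{
    // simple Lambertian term: N dot L, clamped to zero
    float ndotl = max(dot(normalize(normal), normalize(lightDir)), 0.0);
    return albedo * ndotl;
}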
In any case, the shader compiler will tell you what's wrong.
Best regards