Why is my geometry shader becoming "overloaded"?

I use an OpenGL shader to plot graphs. Every span of the graph has the form:
The vertex shader just passes the a's and b's to a geometry shader that then evaluates the curve at max_vertices points.
The problem is that sometimes the geometry shader seems to become "overloaded" and stops spitting out points:
Both curves actually have the exact same values, but for some reason the geometry shader fails to generate points for the bottom one.
When I change max_vertices in the following line of my geometry shader:
layout (triangle_strip, max_vertices = ${max_vertices}) out;
from 1024 (the result of gl.glGetInteger(gl.GL_MAX_GEOMETRY_OUTPUT_VERTICES)) to 256, then I get the desired output:
What is happening? What is the true maximum number of vertices? Why is the top graph unaffected, but the bottom one corrupted? They have the same data.

Geometry shaders have two competing sets of limitations. The first is the number of vertices a single GS invocation may emit (queried via GL_MAX_GEOMETRY_OUTPUT_VERTICES), and the second is the total number of output components a single GS invocation may write (queried via GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS). You must stay within both to get defined behavior.
1024 is the required minimum value for GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS; all implementations must support at least that many. If an implementation supported only that minimum and you tried to output 1024 vertices, you could only write a single component of data for each such vertex (a single float or int, not even a vec2).
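To make the arithmetic concrete, here is a minimal sketch, assuming a GS that writes gl_Position plus one vec4 of user data (8 components per vertex) on an implementation that supports only the 1024-component minimum; the names are illustrative:

// 8 output components per vertex (gl_Position + color), so a 1024-component
// budget allows at most 1024 / 8 = 128 vertices, regardless of what
// GL_MAX_GEOMETRY_OUTPUT_VERTICES reports.
layout (triangle_strip, max_vertices = 128) out;
out vec4 color; // 4 user-defined components on top of gl_Position's 4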
Overall, it would be better to avoid this problem entirely. What you're trying to do seems like it'd be more easily done via a compute shader or at the very least, geometry shader instancing.

Related

Can one OpenGL tessellation shader handle variable patch sizes and tessellation levels?

I have a well-established OpenGL project (in C# using SharpGL, if that helps), and within it is a class that can handle drawing points, lines (well, line strips), and triangles (for filled polygons). Currently, my single shader program consists of a vertex shader and a fragment shader, which works for any of the three primitive types.
However, in reality, any lines in the resulting graphic (from line strips or lines between triangle vertices) need to follow a curvature within a well-understood geometry (I know how to calculate points between the vertices that will follow the curve).
Given that, I now want to introduce tessellation shaders (control and evaluation) to add the additional points needed to display the curvatures.
That leads to my main questions:
Is there a way to have one shader program where the tessellation shaders can be told at runtime how many vertices are in the input patches about to be rendered (i.e., there will be 2 vertices per patch when rendering lines but 3 when rendering triangles)?
Further, can the tessellation shaders dynamically decide how many vertices will be output (e.g., if the 2 vertices of a line segment are too far apart, I may want to increase the number of vertices in the output to better depict the curvature).
I've had a hard time researching these questions as most tutorials focus on other, more fundamental aspects of tessellation shaders.
I know that there is an OpenGL call, glPatchParameter, that lets me set the patch vertex count as well as default outer and inner tessellation levels. Does that forgo the need for having layout(vertices = patch_size) out; in the shader code? Is there a way for me to access, for example, the patch vertex count set using glPatchParameter from within the shader code (other than passing in my own, additional uniform variable)? Are there any good examples out there of code that does something similar to what I'm looking for?
The TCS and TES do not define the input patch size. They can effectively query the patch size by calling the .length() function on any arrayed input parameter, e.g. gl_in.length().
However, the size of the output patch from the TCS is a compile-time fixed part of the TCS itself. So even if you could make a TCS that could handle 2 or 3 input vertices, it wouldn't be able to selectively choose between 2 or 3 output vertices based on the number of input vertices.
So you're going to need to use different programs. If you're able to use SPIR-V shaders, you can use specialization constants to set the number of output vertices in the patch. You would still get different programs, but they would all come from the same shader source.
You can also do some find/replace stuff with the text of your shader before compiling it to get the same effect.
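As an illustrative sketch of that find/replace approach, here PATCH_SIZE is a placeholder token you substitute with "2" or "3" before compiling, not a real GLSL identifier:

// Tessellation control shader template; PATCH_SIZE is textually replaced
// with 2 (lines) or 3 (triangles) before the source is compiled.
layout(vertices = PATCH_SIZE) out;

void main()
{
    // gl_in.length() reports the input patch size set via glPatchParameteri
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;
    // (the gl_TessLevelOuter/Inner values appropriate for the primitive
    // would also be written here)
}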
Note: do not mistake the number of vertices output by the TCS for the amount of tessellation done to the abstract patch. They are in no way related.
Further, can the tessellation shaders dynamically decide how many vertices will be output (e.g., if the 2 vertices of a line segment are too far apart, I may want to increase the number of vertices in the output to better depict the curvature).
This is about tessellation levels. And basically 80% of the job of the TCS is to decide how much tessellation to do.
Lines are somewhat tricky as far as tessellation is concerned. An isoline output "patch" is really a sequence of lines. The number of lines is defined by gl_TessLevelOuter[0], and the subdivisions within each line are defined by gl_TessLevelOuter[1]. But since the amount of tessellation is capped (implementation-defined, but at least 64), if you need more than that many subdivisions for a single conceptual line, you'll have to build it out of multiple lines.
This would be done by making the end-point of one line binary-identical to the start-point of the next line in the tessellated isoline patch. Fortunately, you're guaranteed that gl_TessCoord.x will be 0 and 1 exactly for the start and end of lines.
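A minimal sketch of what that looks like in a TCS; the level values are illustrative only:

layout(vertices = 2) out;

void main()
{
    gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position;
    if (gl_InvocationID == 0) {
        gl_TessLevelOuter[0] = 4.0;  // 4 lines in this isoline patch
        gl_TessLevelOuter[1] = 16.0; // 16 segments within each line
    }
}

In the matching TES, declared with layout(isolines) in;, gl_TessCoord.x runs from 0.0 to 1.0 along each line and gl_TessCoord.y selects which of the lines in the patch you are on.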

Get element ID in vertex shader in OpenGL

I'm rendering a line that is composed of triangles in OpenGL.
Right now I have it working where:
Vertex buffer: {v0, v1, v2, v3}
Index buffer (triangle strip): {0, 1, 2, 3}
The top image is the raw data passed into the vertex shader and the bottom is the vertex shader output after applying an offset to v1 and v3 (using a vertex attribute).
My goal is to use one vertex per point on the line and generate the offset some other way. I was looking at gl_VertexID, but I want something more like an element ID. Here's my desired setup:
Vertex buffer: {v0, v2}
Index buffer (triangle strip): {0, 0, 1, 1}
and use an imaginary gl_ElementID % 2 to offset every other vertex.
I'm trying to avoid using geometry shaders or additional vertex attributes. Is there any way of doing this? I'm open to completely different ideas.
I can think of one way to avoid the geometry shader and still work with a compact representation: instanced rendering. Just draw many instances of one quad (as a triangle strip), and define the two positions as per-instance attributes via glVertexAttribDivisor().
Note that you don't need a "template quad" with 4 vertices at all. You just need conceptually two attributes, one for your start point, and one for your end point. (If you work in 2D, you can fuse that into one vec4, of course). In each vertex shader invocation, you will have access to both points, and can construct the final vertex position based on that and the value of gl_VertexID (which will only be in range 0 to 3). That way, you can get away with exactly that vertex array layout of two points per line segment you are aiming for, and still only need a single draw call and a vertex shader.
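Here is a minimal sketch of such a vertex shader, assuming 2D line segments with both endpoints fused into one hypothetical per-instance attribute (set up with glVertexAttribDivisor(0, 1) and drawn with glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, segmentCount)); halfWidth is an assumed uniform:

#version 330 core
layout(location = 0) in vec4 segment; // xy = start point, zw = end point (per instance)
uniform float halfWidth;              // half of the desired line width

void main()
{
    vec2 a = segment.xy;
    vec2 b = segment.zw;
    vec2 n = normalize(vec2(a.y - b.y, b.x - a.x)); // unit normal of the segment
    vec2 p = (gl_VertexID < 2) ? a : b;             // vertices 0,1 at the start; 2,3 at the end
    float side = (gl_VertexID % 2 == 0) ? 1.0 : -1.0; // alternate sides of the strip
    gl_Position = vec4(p + side * halfWidth * n, 0.0, 1.0);
}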
No, that is not possible, because each unique vertex is only processed once: if you reference a vertex 10 times through the index buffer, the corresponding vertex shader invocation may still only be executed a single time.
This is implemented in hardware with the Post Transform Cache.
In the absolute best case, you never have to process the same vertex more than once.

The test for whether a vertex is the same as a previous one is somewhat indirect. It would be impractical to test all of the user-defined attributes for inequality. So instead, a different means is used.

Two vertices are considered equal (within a single rendering command) if the vertex's index and instance count are the same (gl_VertexID and gl_InstanceID in the shader). Since vertices for non-indexed rendering are always increasing, it is not possible to use the post transform cache with non-indexed rendering.

If the vertex is in the post transform cache, then that vertex data is not necessarily even read from the input vertex arrays again. The process skips the read and vertex shader execution steps, and simply adds another copy of that vertex's post-transform data to the output stream.
To solve your problem I would use a geometry shader with a line (or line strip) as input and a triangle strip as output. With this setup you could get rid of the index buffer entirely, since the input is just a sequence of lines.

How do vertex and fragment shaders communicate in OpenGL?

I really do not understand how the fragment shader works.
I know that
the vertex shader runs once per vertex
the fragment shader runs once per fragment
Since the fragment shader does not work per vertex but per fragment, how can the vertex shader send data to it? The number of vertices and the number of fragments are not equal.
How is it decided which fragments belong to which vertices?
To make sense of this, you'll need to consider the whole render pipeline. The outputs of the vertex shader (besides the special output gl_Position) are passed along as "associated data" of the vertex to the next stages in the pipeline.
While the vertex shader works on a single vertex at a time, not caring about primitives at all, further stages of the pipeline do take the primitive type (and the vertex connectivity info) into account. That's what is typically called "primitive assembly". Now, we still have the single vertices with the associated data produced by the VS, but we also know which vertices are grouped together to define a basic primitive like a point (1 vertex), a line (2 vertices) or a triangle (3 vertices).
During rasterization, fragments are generated for every pixel location in the output pixel raster which belongs to the primitive. In doing so, the associated data of the vertices defining the primitive can be interpolated across the whole primitive. For a line, this is rather simple: a linear interpolation is done. Call the endpoints A and B, each with some associated output vector v, so that we have v_A and v_B. Along the line, we get the interpolated value v(x) = (1 - x) * v_A + x * v_B, where x ranges from 0 (at point A) to 1 (at point B). For a triangle, barycentric interpolation between the data of all 3 vertices is used. So while there is no 1:1 mapping between vertices and fragments, the outputs of the VS still define the values of the corresponding inputs of the FS, just not directly, but indirectly via interpolation across the primitive.
The formulas I have given so far are a bit simplified. Actually, by default, a perspective correction is applied, effectively modifying the formula so that the distortion effects of the perspective projection are taken into account. This means the interpolation acts as if it were applied linearly in object space (before the distortion by the projection was applied). For example, if you have a perspective projection and some primitive which is not parallel to the image plane, moving 1 pixel to the right in screen space means moving a variable distance on the real object, depending on the distance of the actual point from the camera plane.
You can disable the perspective correction by using the noperspective qualifier for the in/out variables in GLSL. Then, the linear/barycentric interpolation is used as I described it.
You can also use the flat qualifier, which disables the interpolation entirely. In that case, the value of just one vertex (the so-called "provoking vertex") is used for all fragments of the whole primitive. Integer data can never be automatically interpolated by the GL and has to be qualified as flat when sent to the fragment shader.
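For example, matching out/in declarations on both sides of the rasterizer (a minimal sketch; the variable names are illustrative):

// vertex shader
noperspective out vec2 screenUV;  // interpolated linearly in screen space
flat out int materialIndex;       // integer: must be flat; taken from the provoking vertex

// fragment shader
noperspective in vec2 screenUV;
flat in int materialIndex;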
The answer is that they don't -- at least not directly. There's an additional stage called "the rasterizer" that sits between the vertex processor and the fragment processor in the pipeline. The rasterizer is responsible for collecting the vertices that come out of the vertex shader, reassembling them into primitives (usually triangles), breaking up those triangles into "rasters" of (partially) covered pixels, and sending the resulting fragments to the fragment shader.
This is a (mostly) fixed-function piece of hardware that you don't program directly. There are some configuration tweaks you can do that affect what it treats as a primitive and what it produces as fragments, but for the most part it's just there between the vertex shader and fragment shader, doing its thing.

Access world-space primitive size in fragment shader

It is essential for my fragment shader to know the world-space size of the primitive it belongs to. The shader is intended to be used solely for rendering rectangles (= pairs of triangles).
Naturally, I can compute the width and height on the CPU and pass them as uniform values, but using such a shader can be uncomfortable in the long run - one has to remember what to compute and how, or search the documentation. Is there any "automated" way of finding the primitive size?
I have an idea of using a kind of pass-through geometry shader to do this (since it is the only part of the pipeline I know of that has access to the whole primitive), but would that be a good idea?
Is there any "automated" way of finding primitive size?
No, because the concept of "primitive size" depends entirely on you, namely on how your shaders work. As far as OpenGL is concerned, there's not even such a thing as world space. There's just clip space and NDC space, and your vertex shaders take a formless bunch of data, called vertex attributes, and throw it out into clip space.
Since the vertex shader operates on a per-vertex basis and doesn't see the other vertices (except if you pass them in as additional vertex attributes with shifted indices; in that case the output varying must be qualified flat, and for triangles the computation can be skipped whenever vertex_ID % 3 != 0), the only viable shader stages for fully automating this are the geometry or tessellation shaders, doing that preparatory calculation and emitting the result as a per-vertex output.
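For what it's worth, here is a hedged sketch of such a pass-through geometry shader, assuming axis-aligned rectangles in the XY plane and a hypothetical world-space position input worldPos forwarded by the vertex shader:

#version 330 core
layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

in vec3 worldPos[];     // assumed world-space position from the vertex shader
flat out vec2 primSize; // world-space width/height, constant over the primitive

void main()
{
    // Each triangle of an axis-aligned rectangle touches 3 of its 4 corners,
    // so the triangle's bounding box already spans the whole rectangle.
    vec3 lo = min(min(worldPos[0], worldPos[1]), worldPos[2]);
    vec3 hi = max(max(worldPos[0], worldPos[1]), worldPos[2]);
    primSize = (hi - lo).xy;

    for (int i = 0; i < 3; ++i) {
        gl_Position = gl_in[i].gl_Position;
        EmitVertex();
    }
    EndPrimitive();
}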

Is specifying EndStreamPrimitive() necessary in Geometry shader with streams

EndStreamPrimitive() can only be used in a Geometry shader with streams.
Geometry shader with streams can only emit GL_POINTS.
But with GL_POINTS, each vertex is itself a primitive.
So what is the point of having a function like EndStreamPrimitive()?
Just specifying EmitStreamVertex() when the primitive type is GL_POINTS already means the end of a primitive.
My next question is: what is max_vertices in a Geometry shader?
layout(points, max_vertices = 6) out;
I suppose it is the maximum number of vertices a Geometry shader will emit (irrespective of whether it is using streams or not).
If I have 2 streams in my Geometry shader and I emit 2 vertices to stream 0 and 3 vertices to stream 1, should the value of max_vertices be set to 5?
As far as I know, EndStreamPrimitive (...) currently has no use; it was likely provided simply for consistency with the non-stream GLSL design. If the restriction that points are the only type of output primitive is ever lifted, it may become useful.
To answer your second question: that is the maximum number of vertices you will ever emit in a single invocation of your GS. Geometry Shaders are different from other stages in that the size of their output data can vary at run-time. You could decide because of some condition at run-time that you want to output 8 vertices to stream 1 -- GLSL needs to know an upper bound, and it cannot figure that out if your flow control is based on a variable set elsewhere. So yes, in your example max_vertices would be 5: it counts the vertices emitted to all streams together.
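For the counts in the question, the output declarations might look like this (a sketch, assuming 2 vertices on stream 0 and 3 on stream 1 really is the worst case, and with illustrative variable names):

layout(points) in;
layout(points, max_vertices = 5) out; // 2 (stream 0) + 3 (stream 1) = 5 total

layout(stream = 0) out vec4 streamZeroData;
layout(stream = 1) out vec4 streamOneData;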
There are implementation-defined limits on both the number of vertices a GS invocation can emit (GL_MAX_GEOMETRY_OUTPUT_VERTICES) and the sum total number of vector components it can write (GL_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS). These limits are also defined in the shading language.
Implementations must support the following minimums:
const int gl_MaxGeometryOutputVertices = 256;
const int gl_MaxGeometryTotalOutputComponents = 1024;
If your shader exceeds those limits, you will need to split it into multiple invocations. You can either do that using multiple draw calls or you can have GL4+ automatically invoke your GS multiple times:
layout (invocations = <num_invocations>) in;
You can determine which invocation the GS is performing from the value of gl_InvocationID (GL4+).
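A minimal sketch of GS instancing (GL4+); the per-invocation offset is purely illustrative:

#version 400 core
layout(points, invocations = 4) in;  // the GS runs 4 times per input primitive
layout(points, max_vertices = 1) out;

void main()
{
    // Each invocation emits its own share of the output; here, one point
    // shifted horizontally by the invocation index.
    gl_Position = gl_in[0].gl_Position + vec4(0.1 * float(gl_InvocationID), 0.0, 0.0, 0.0);
    EmitVertex();
    EndPrimitive();
}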