Does a fragment shader only run for visible fragments?

We need to write a raytracer in OpenGL. Now, I decided I would shoot a ray for every fragment shader call since, as far as I understand, a fragment is a screen pixel that could be written to by a geometry object. So I was wondering if a fragment shader would only run for visible pixels or for all pixels. If it only runs for visible ones, it would be a given that the primary ray (from screen to object) is not obstructed. This would save a lot of calculations.

There is absolutely no guarantee that the execution of a fragment shader means that the fragment is certainly visible.
Early depth test by itself will not save you. Rendering each triangle front-to-back will not save you; there is no guarantee in OpenGL that fragments are generated in order (only "as if" in order). And that's ignoring cases of overlap where it's impossible to have a proper ordering. Even issuing each triangle in its own separate rendering command guarantees nothing as far as OpenGL is concerned.
The only thing you can do to ensure this is to perform a depth pre-pass. That is, render your entire scene, but without a fragment shader active (and turn off color writes to the framebuffer). That will write all of the depth data to the depth buffer. That way, if you use early depth tests, when you render your scene again, the only fragments that pass the depth test will be those that are visible.
Depth pre-passes can be pretty fast, depending on your vertex shader and other aspects of your rendering pipeline.
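A rough sketch of such a pre-pass (the program handles and the drawScene() helper are placeholder names, not anything prescribed by OpenGL):

/* Pass 1: depth only. No color writes, and a trivial (or no) fragment shader. */
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LESS);
glDepthMask(GL_TRUE);
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
glUseProgram(depthOnlyProgram);   /* placeholder: vertex shader only            */
drawScene();                      /* placeholder: issues all the scene's draws  */

/* Pass 2: the expensive shading. The depth buffer already holds the nearest
 * depth per pixel, so with GL_LEQUAL only the visible fragments pass the test. */
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthMask(GL_FALSE);            /* depth is already correct, no need to write it again */
glDepthFunc(GL_LEQUAL);
glUseProgram(raytracingProgram);  /* placeholder: the per-fragment raytracing shader     */
drawScene();

For the early test to actually happen before your expensive shader runs, the second pass's fragment shader must not write gl_FragDepth (and should avoid discard); otherwise the hardware has to fall back to late depth testing.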

OpenGL: using instanced drawing to draw with the framebuffer I'm drawing into

I'm trying to basically "add" framebuffers, or rather the color texture attachments of framebuffers. One way I found to do this is to have a shader that takes all the textures as input and renders their combination.
But to improve performance, wouldn't it be better to have just one shader and one framebuffer, and then, through instanced drawing, have the shader draw onto the very color texture attachment it is using for drawing?
A bit better explained:
I have 2 framebuffers: Default and Framebuffer1.
I bind Framebuffer1
and give the color texture attachment of Framebuffer1 as uniform "Fb1_cta" to the following fragment shader:
#version 330 core

out vec4 FragColor;
in vec2 TexCoords;

uniform sampler2D Fb1_cta;   // the color texture attachment of Framebuffer1

void main()
{
    // Read the attachment's current contents and add 0.5 to them.
    vec3 text = texture(Fb1_cta, TexCoords).rgb;
    FragColor = vec4(vec3(0.5) + text, 1.0);
}
So I draw into Framebuffer1, but also use its current color texture attachment as input for the drawing.
Now I call glDrawArraysInstanced with an instance count of 2.
The first render pass should draw the whole texture in grey (rgb = (0.5, 0.5, 0.5)) and the second should add another vec3(0.5) to that, so the result should be white. That, however, didn't really work, so I split the glDrawArraysInstanced into two glDrawArrays calls and checked the two results.
Now, while the first pass works as intended (screenshot: result of the first rendering),
the second doesn't (screenshot: result of the second rendering; by the way, this is the same result as with glDrawArraysInstanced).
To me this pretty much looks like the two render passes aren't executed sequentially but in parallel. So I reran my code, this time with a bit of time passing between the calls, and that seemed to solve the issue.
Now I wonder: is there any way to tell OpenGL that those calls should truly be sequential, and is there maybe even a way to do it with glDrawArraysInstanced to improve performance?
Is there in general a more elegant solution to this kind of problem?
In general, you cannot read from a texture image that is also being rendered to. To achieve the level of performance necessary for real-time rendering, it is essential to take advantage of parallelism wherever possible. Fragment shader invocations are generally not processed sequentially. On a modern GPU, there will be thousands and thousands of fragment shader invocations running concurrently during rendering. Even fragment shader invocations from separate draw calls. OpenGL and GLSL are designed specifically to enable this sort of parallelization.
From the OpenGL 4.6 specification, section 9.3.1:
Specifically, the values of rendered fragments are undefined if any shader stage
fetches texels and the same texels are written via fragment shader outputs, even
if the reads and writes are not in the same draw call, unless any of the following
exceptions apply:
The reads and writes are from/to disjoint sets of texels (after accounting for
texture filtering rules).
There is only a single read and write of each texel, and the read is in
the fragment shader invocation that writes the same texel (e.g. using
texelFetch2D(sampler, ivec2(gl_FragCoord.xy), 0);).
If a texel has been written, then in order to safely read the result a texel fetch
must be in a subsequent draw call separated by the command
 void TextureBarrier( void );
TextureBarrier will guarantee that writes have completed and caches have
been invalidated before subsequent draw calls are executed.
The OpenGL implementation is allowed to (and, as you have noticed, will actually) run multiple drawcalls concurrently if possible. Across all the fragment shader invocations that your two drawcalls produce, you do have some that read and write from/to the same sets of texels. There is more than a single read and write of each texel from different fragment shader invocations. The drawcalls are not separated by a call to glTextureBarrier(). Thus, your code produces undefined results.
A drawcall alone does not constitute a rendering pass. A rendering pass is usually understood as the whole set of operations that produce a certain piece of output (like a particular image in a framebuffer) that is then usually again consumed as an input into another pass. To make your two draw calls "truly sequential", you could call glTextureBarrier() between issuing the draw calls.
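A minimal sketch of that, assuming the handles from your setup (the names here are placeholders) and an OpenGL 4.5 context or ARB_texture_barrier for glTextureBarrier():

glBindFramebuffer(GL_FRAMEBUFFER, framebuffer1);        /* placeholder handle                  */
glUseProgram(addHalfProgram);                           /* the shader from your question       */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, framebuffer1ColorTexture); /* the very texture being rendered to  */

glDrawArrays(GL_TRIANGLES, 0, 6);   /* first pass: writes 0.5 everywhere                        */
glTextureBarrier();                 /* writes completed, texture caches invalidated             */
glDrawArrays(GL_TRIANGLES, 0, 6);   /* second pass: now reliably reads 0.5 and writes 1.0       */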
But if all you want to do is draw two triangles, one after the other, on top of each other into the same framebuffer, all you have to do is draw two triangles and use additive blending. You don't need instancing. You don't need separate drawcalls. Just draw two triangles. OpenGL requires blending to take place in the order in which the triangles that produced the fragments were specified. Be aware that if you happen to have depth testing enabled, chances are your depth test will prevent the second triangle from ever being drawn unless you change the depth test function to something other than the default (GL_LESS).
The downside of blending is that you're limited to a set of a few fixed functions that you can select as your blend function. But add is one of them. If you need more complex blending functions, there are vendor-specific extensions that enable what is typically called "programmable blending" on some GPUs…
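A sketch of that path, with placeholder names; the six vertices are assumed to be two screen-covering triangles drawn on top of each other:

glBindFramebuffer(GL_FRAMEBUFFER, framebuffer1);   /* placeholder handle                         */
glDisable(GL_DEPTH_TEST);                          /* see the depth-test caveat above            */
glEnable(GL_BLEND);
glBlendEquation(GL_FUNC_ADD);
glBlendFunc(GL_ONE, GL_ONE);                       /* destination = source + destination         */

glUseProgram(constantGreyProgram);                 /* placeholder: outputs vec4(vec3(0.5), 1.0)  */
glDrawArrays(GL_TRIANGLES, 0, 6);                  /* two overlapping triangles, one draw call   */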
Note that all of the above only concerns drawcalls that read from and write to the same target. Drawcalls that read from a target that earlier drawcalls rendered to are guaranteed to be sequenced after the drawcalls that produced their input.

OpenGL Lighting Shader

I can't understand the concept of smaller shaders in OpenGL. How does it work? For example: do I need to create one shader for positioning an object in space and then another shader for lighting, or what? Could someone explain this to me? Thanks in advance.
This is a very complex topic, especially since your question isn't very specific. First of all, there are various shader stages (vertex shader, pixel shader, and so on). A shader program consists of different shader stages, usually at least a vertex and a pixel (fragment) shader (except for compute shader programs, which consist of a single compute shader each). The vertex shader calculates the position of the points on screen, so this is where the objects are moved. The pixel shader calculates the color of each pixel that is covered by the geometry your vertex shader produced. Now, in terms of lighting, there are different ways of doing it:
Forward Shading
This is the straightforward way, where you simply calculate the lighting in the pixel shader of the same shader program that moves the objects. It is the oldest and easiest way of calculating lighting, but its abilities are very limited.
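For illustration, a minimal forward-shading fragment shader with a single point light (all names here are just assumptions for the sketch):

#version 330 core
out vec4 FragColor;

in vec3 Normal;             // interpolated from the vertex shader
in vec3 FragPos;            // world-space position from the vertex shader

uniform vec3 lightPos;      // assumed single point light
uniform vec3 lightColor;
uniform vec3 objectColor;

void main()
{
    vec3 N = normalize(Normal);
    vec3 L = normalize(lightPos - FragPos);
    float diff = max(dot(N, L), 0.0);                  // Lambert diffuse term
    FragColor = vec4(objectColor * lightColor * diff, 1.0);
}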
Deferred Shading
For a long time, this has been the go-to variant in games. Here, you have one shader program (vertex + pixel shader) that renders the geometry into one (or multiple) textures; it moves the objects, but instead of the lit color it stores things like the base color and the surface normals in those textures. A second shader program then renders a screen-covering quad for each light you want to render; its pixel shader reads the information previously written into the textures by the first program and uses it to render the lit objects into another texture (which is then the final image). In contrast to forward shading, this allows (in theory) any number of lights in the scene and makes it easier to use shadow maps.
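As a sketch, the fragment shader of the first (geometry) pass might write its outputs to multiple render targets like this (names are assumed):

#version 330 core
layout(location = 0) out vec3 gAlbedo;   // goes to color attachment 0
layout(location = 1) out vec3 gNormal;   // goes to color attachment 1

in vec2 TexCoords;
in vec3 Normal;

uniform sampler2D albedoMap;             // assumed material texture

void main()
{
    // No lighting here: just store the data the lighting pass will need.
    gAlbedo = texture(albedoMap, TexCoords).rgb;
    gNormal = normalize(Normal);
}

The lighting pass then samples gAlbedo and gNormal for each screen pixel and accumulates the contribution of every light.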
Tiled/Clustered Shading
This is a rather new and very complex way of calculating lighting that can be built on top of deferred or forward shading. It basically uses compute shaders to build an acceleration structure on the GPU, which is then used to draw huge numbers of lights very fast. This allows rendering thousands of lights in a scene in real time, but using shadow maps for all these lights is very hard, and the algorithm is far more complex than the previous ones.
Writing smaller shaders means separating some of your shader functionality into other files. If you are writing a big shader that contains lighting algorithms, anti-aliasing algorithms, and other shader computations, you can split them into smaller shader files (light.glsl, fxaa.glsl, and so on) and then combine these files with your main shader file (the one that contains the void main() function), since in OpenGL only one shader program (a combination of vertex shader, fragment shader, geometry shader, etc.) can be active for a draw call during the rendering pipeline.
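One simple way to do this combining is to exploit the fact that glShaderSource accepts an array of strings, so the helper files can be prepended to the main file before compilation; a sketch, where the *Source variables are assumed to hold the file contents loaded from disk:

const char *sources[] = {
    lightGlslSource,   /* assumed: contents of light.glsl                      */
    fxaaGlslSource,    /* assumed: contents of fxaa.glsl                       */
    mainGlslSource     /* assumed: the file containing void main()             */
};                     /* note: only the first string may contain #version     */
GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
glShaderSource(fs, 3, sources, NULL);   /* NULL lengths = null-terminated strings */
glCompileShader(fs);
/* attach it to a program together with the vertex shader, then glLinkProgram() */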
How you split up your shaders also depends on your rendering algorithm (forward rendering, deferred rendering, or forward+ rendering).
It's important to note that writing a lot of shaders will increase shader compilation time, and writing a big shader with a lot of uniforms will also slow things down...

Understanding the shader workflow in OpenGL?

I'm having a little bit of trouble conceptualizing the workflow used in a shader-based OpenGL program. While I've never really done any major projects using either the fixed-function or shader-based pipelines, I've started learning and experimenting, and it's become quite clear to me that shaders are the way to go.
However, the fixed-function pipeline makes much more sense to me from an intuitive perspective. Rendering a scene with that method is simple and procedural—like painting a picture. If I want to draw a box, I tell the graphics card to draw a box. If I want a lot of boxes, I draw my box in a loop. The fixed-function pipeline fits well with my established programming tendencies.
These all seem to go out the window with shaders, and this is where I'm hitting a block. A lot of shader-based tutorials show how to, for example, draw a triangle or a cube on the screen, which works fine. However, they don't really go into how I would apply these concepts in, for example, a game. If I wanted to draw three procedurally generated triangles, would I need three shaders? Obviously not, since that would be infeasible. Still, it's clearly not as simple as just sticking the drawing code in a loop that runs three times.
Therefore, I'm wondering what the "best practices" are for using shaders in game development environments. How many shaders should I have for a simple game? How do I switch between them and use them to render a real scene?
I'm not looking for specifics, just a general understanding. For example, if I had a shader that rendered a circle, how would I reuse that shader to draw different sized circles at different points on the screen? If I want each circle to be a different color, how can I pass some information to the fragment shader for each individual circle?
There is really no conceptual difference between the fixed-function pipeline and the programmable pipeline. The only thing shaders introduce is the ability to program certain stages of the pipeline.
On current hardware you have (for the most part) control over the vertex, primitive assembly, tessellation and fragment stages. Some operations that occur in between and after these stages are still fixed-function, such as depth/stencil testing, blending, perspective divide, etc.
Because shaders are actually nothing more than programs that you drop-in to define the input and output of a particular stage, you should think of input to a fragment shader as coming from the output of one of the previous stages. Vertex outputs are interpolated during rasterization and these are often what you're dealing with when you have an in variable in a fragment shader.
You can also have program-wide variables, known as uniforms. These variables can be accessed by any stage simply by using the same name in each stage. They do not vary across invocations of a shader, hence the name uniform.
Now you should have enough information to figure out this circle example... you can use a uniform to scale your circle (likely a simple scaling matrix) and you can either rely on per-vertex color or a uniform that defines the color.
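For instance, a sketch of drawing many circles with one program by changing uniforms between draw calls (the uniform names and the circles array are assumptions for the sketch):

GLint centerLoc = glGetUniformLocation(circleProgram, "uCenter");
GLint radiusLoc = glGetUniformLocation(circleProgram, "uRadius");
GLint colorLoc  = glGetUniformLocation(circleProgram, "uColor");

glUseProgram(circleProgram);
for (int i = 0; i < circleCount; ++i) {
    glUniform2f(centerLoc, circles[i].x, circles[i].y);     /* where to place this circle */
    glUniform1f(radiusLoc, circles[i].radius);              /* how big it is              */
    glUniform4f(colorLoc,  circles[i].r, circles[i].g, circles[i].b, 1.0f);
    glDrawArrays(GL_TRIANGLE_FAN, 0, circleVertexCount);    /* same circle mesh each time */
}

The vertex shader scales and offsets the shared circle mesh by uRadius and uCenter, and the fragment shader outputs uColor.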
You don't have shaders that draw circles (OK, you can with the right tricks, but let's forget that for now, because it is misleading and has very rare and specific uses). Shaders are little programs you write to take care of certain stages of the graphics pipeline, and they are more specific than "drawing a circle".
Generally speaking, every time you make a draw call, you have to tell OpenGL which shaders to use (with a call to glUseProgram). You have to use at least a vertex shader and a fragment shader. The resulting pipeline will look something like this:
Vertex Shader: the code that is executed for each of the vertices you send to OpenGL. It is executed for each index in the element array and uses the corresponding vertex attributes as input data, such as the vertex position, its normal, its UV coordinates, maybe its tangent (if you are doing normal mapping), or whatever else you are sending to it. Generally you want to do your geometric calculations here. You can also access uniform variables you set up for your draw call, which are global variables that do not change per vertex. A typical uniform you might want to use in a vertex shader is the PVM matrix. If you don't use tessellation, the vertex shader writes gl_Position, the position the rasterizer uses to create fragments. You can also have the vertex shader output other things (such as the UV coordinates and the normals, after you have dealt with their geometry), hand them to the rasterizer and use them later.
Rasterization
Fragment Shader: the code that is executed for each fragment (for each pixel, if that is clearer). Generally you do texture sampling and lighting calculations here. You will use the data coming from the vertex shader and the rasterizer, such as the normals (to evaluate diffuse and specular terms) and the UV coordinates (to fetch the right colors from the textures). The textures are going to be uniforms, and probably also the parameters of the lights you are evaluating. (A minimal sketch of this vertex-to-fragment data flow follows the list below.)
Depth test and stencil test (which you can move before the fragment shader with the early fragment test optimization: http://www.opengl.org/wiki/Early_Fragment_Test).
Blending.
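A minimal sketch of that data flow (all names are assumptions, nothing here is required by OpenGL):

// Vertex shader: runs once per vertex, writes gl_Position, passes data on.
#version 330 core
layout(location = 0) in vec3 aPos;
layout(location = 1) in vec2 aUV;

uniform mat4 uPVM;          // the PVM matrix mentioned above

out vec2 vUV;               // interpolated by the rasterizer

void main()
{
    vUV = aUV;
    gl_Position = uPVM * vec4(aPos, 1.0);
}

// Fragment shader: runs once per fragment produced by the rasterizer.
#version 330 core
in vec2 vUV;                // the interpolated value the vertex shader wrote
uniform sampler2D uDiffuse; // assumed texture uniform
out vec4 FragColor;

void main()
{
    FragColor = texture(uDiffuse, vUV);
}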
I suggest having a look at this nice program for developing simple shaders, http://sourceforge.net/projects/quickshader/ , which has very good examples, including some more advanced things you won't find in every tutorial.

Tessellation shader culling unseen vertices?

I've noticed that in my program when I look away from my tessellated mesh, the frame time goes way down, suggesting that no tessellation is happening on meshes that aren't on screen. (I have no custom culling code)
But for my purposes I need access in the geometry shader to the tessellated vertices whether they are on screen or not.
Are these tessellated triangles making it to the geometry shader stage? Or are they being culled before they even reach the tessellation evaluation stage?
when I look away from my tessellated mesh, the frame time goes way down, suggesting that no tessellation is happening on meshes that aren't on screen.
Well there's your problem; the belief that faster speed means lack of tessellation.
There are many reasons why off-screen polygons will be faster than on-screen ones. For example, you could be partially or entirely bound by the fragment shader and/or by fill rate. Thus, once those fragments are culled by being entirely off-screen, your rendering time goes down.
In short, everything's fine. Neither OpenGL nor D3D allows the tessellation stage to discard geometry arbitrarily. Remember: you don't have to tessellate in clip-space, so there's no way for the system to even know if a triangle is "off-screen". Your GS is allowed to potentially bring off-screen triangles on-screen. So there's not enough information at that point to decide what is and is not off-screen.
In all likelihood, what's happening is that your renderer's performance was bound to the rasterizer/fragment shader, rather than to vertex processing. So even though the vertices are still being processed, the expensive per-fragment operations aren't being done because the triangles are off-screen. Thus, a performance improvement.

Quick question about glColorMask and its work

I want to render depth buffer to do some nice shadow mapping. My drawing code though, consists of many shader switches. If I set glColorMask(0,0,0,0) and leave all shader programs, textures and others as they are, and just render the depth buffer, will it be 'OK' ? I mean, if glColorMask disables the "write of color components", does it mean that per-fragment shading IS NOT going to be performed?
For rendering a shadow map, you will normally want to bind a depth texture (preferably square and power-of-two, because stereo drivers take this as a hint!) to an FBO and use exactly one shader (as simple as possible) for everything. You do not want to attach a color buffer, because you are not interested in color at all, and it puts unnecessary pressure on the ROPs (plus, some hardware can render at double speed or more with depth-only). You do not want to switch between many shaders.
Depending on whether you do "classic" shadow mapping, or something more sophisticated such as exponential shadow maps, the shader that you will use is either as simple as it can be (constant color, and no depth write), or performs some (moderately complex) calculations on depth, but you normally do not want to perform any colour calculations, since that will mean needless calculations which will not be visible in any way.
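A sketch of such a depth-only FBO setup (the size and the variable names are arbitrary):

GLuint shadowTex, shadowFbo;
glGenTextures(1, &shadowTex);
glBindTexture(GL_TEXTURE_2D, shadowTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, 1024, 1024, 0,
             GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

glGenFramebuffers(1, &shadowFbo);
glBindFramebuffer(GL_FRAMEBUFFER, shadowFbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, shadowTex, 0);
glDrawBuffer(GL_NONE);    /* no color attachment, so no color writes at all */
glReadBuffer(GL_NONE);
/* render the shadow pass here with a single, minimal depth-only program */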
No, the fragment operations will be performed anyway, but their result will be squashed by your zero color mask.
If you don't want some fragment operations to be performed - use the proper shader program which has an empty fragment shader attached and set the draw buffer to GL_NONE.
There is another way to disable fragment processing - to enable GL_RASTERIZER_DISCARD, but you won't get even the depth values in this case :)
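For completeness, a sketch of that switch (usually paired with transform feedback to capture the vertex shader outputs; vertexCount is a placeholder):

glEnable(GL_RASTERIZER_DISCARD);
glDrawArrays(GL_TRIANGLES, 0, vertexCount);   /* vertex processing runs, nothing is rasterized   */
glDisable(GL_RASTERIZER_DISCARD);             /* no fragments, so neither color nor depth written */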
No, the shader programs execute independent of the fixed function pipeline. Setting the glColorMask will have no effect on the shader programs.