Relation between depth-only FBO and fragment shader - opengl

I’ve been wondering what happens when binding a depth-only FBO (only the GL_DEPTH_ATTACHMENT gets attached and glDrawBuffer(GL_NONE) is called) for the fragment shader part. Because any color is discarded:
does OpenGL simply process vertices the regular way, call the rasterizer, apply the fragment shader for rasterized fragments, but discard any result
or does it do smarter things, like process vertices until the optional geometry shader, then cut the fragment shader part and use a dummy fragment shader in order to discard useless color computations?
Because of vendor-implementation details, I guess it might vary, but I’d like to have a better idea about that topic.

In my experience, the fragment shader will still run even if it has no outputs. This can be used for example to draw shadow maps with punch-through alpha textures, using discard.
If it does have outputs (or more outputs then are bound), then they should just be ignored. I'd imagine that a smart driver could easily skip the fragment shader entirely if it doesn't contain any discard statements.
Also perhaps look into Separate Shader Objects (https://www.opengl.org/registry/specs/ARB/separate_shader_objects.txt). It allows you to disable the stages manually.

I've read (Though never personally tested) that a complete lack of a color buffer causes strange undefined behavior, as OpenGL implementations each had to ask this question in reverse: "What /should/ we make it do when there's no color buffer?" and have no official, commonly-used-across-all-implementations answer.
The official documentation carefully avoids mentioning this situation generally.
As such, it is just recommended that you simply... not do that, and instead always have a color buffer, even if you don't use it.

Related

Access different Fragment in Fragmentshader OpenGL

Can I access and change output values of another Fragment at a certain location in the Fragmentshader?
For example in the main() loop I process everything just like usualy and output the color with some value. But in adition to that I also want the fragment at position vec3(5,3,6) (in world coordinates) to have the same colour.
Now I already did some researche on the web on that. The OpenGL site says, the fragmentshader has one fragment as input and has one fragment as output, which doesnt sound very promising.
Also I know that all fragments are being processed in parallel. But maybe it is posible to say, if the fragment at this position has not been processed yet, write this color to it and take this fragment as already processed.
My be someone can explain if this is posible somehow and if not, why this is not a good idea. The best guess I would have is, to build this logic into the shader, it would have a very bad effect on the general performance.
My be someone can explain if this is posible somehow and if not, why this is not a good idea.
It's not a question of bad idea vs. good idea. It's simply not possible.
The closest you can get to this functionality is ARB_fragment_shader_interlock. Through its interlock and ordering guarantees, it allows limited interoperation. And that limitation is... it only allows interoperation for fragments that cover the same pixel/sample.
So even this functionality does not allow you to write to some other pixel.
The absolute best you can do is use SSBOs and atomic counters to have fragment shaders write what color values and "world coordinates" they would like to write to, then have a second process execute that buffer as either a rendering command or a compute shader to actually write that data.
As already pointed out in Nicol's answer, you can't write to additional fragments of a framebuffer surface in the fragment shader.
The description of your use case is not clear enough to tell what might work best. In the interest of brainstorming, the most direct approach that comes to mind is that you don't use a framebuffer draw surface at all, but output to an image instead.
If you bind a texture as an image, you can write to it in the fragment shader using the imageStore() built-in function. This function takes coordinates as one of the argument, so you can write to any pixel you want, as well as write multiple pixels from the same shader invocation.
Depending on what exactly you want to achieve, I could also imagine a hybrid approach, where your primary rendering still goes to a framebuffer, but you write additional pixel values to an image at the desired positions. Then, in a second rendering pass, you can combine the content of the image with the primary rendering. The combination could be done with blending if the math/logic is simple enough. If you need a more complex combination, you can use a texture as the framebuffer attachment of the initial pass, and then use the result of the rendering and the extra image as two inputs for the fragment shader of the combination pass.

Understanding the shader workflow in OpenGL?

I'm having a little bit of trouble conceptualizing the workflow used in a shader-based OpenGL program. While I've never really done any major projects using either the fixed-function or shader-based pipelines, I've started learning and experimenting, and it's become quite clear to me that shaders are the way to go.
However, the fixed-function pipeline makes much more sense to me from an intuitive perspective. Rendering a scene with that method is simple and procedural—like painting a picture. If I want to draw a box, I tell the graphics card to draw a box. If I want a lot of boxes, I draw my box in a loop. The fixed-function pipeline fits well with my established programming tendencies.
These all seem to go out the window with shaders, and this is where I'm hitting a block. A lot of shader-based tutorials show how to, for example, draw a triangle or a cube on the screen, which works fine. However, they don't seem to go into at all how I would apply these concepts in, for example, a game. If I wanted to draw three procedurally generated triangles, would I need three shaders? Obviously not, since that would be infeasible. Still, it's clearly not as simple as just sticking the drawing code in a loop that runs three times.
Therefore, I'm wondering what the "best practices" are for using shaders in game development environments. How many shaders should I have for a simple game? How do I switch between them and use them to render a real scene?
I'm not looking for specifics, just a general understanding. For example, if I had a shader that rendered a circle, how would I reuse that shader to draw different sized circles at different points on the screen? If I want each circle to be a different color, how can I pass some information to the fragment shader for each individual circle?
There is really no conceptual difference between the fixed-function pipeline and the programmable pipeline. The only thing shaders introduce is the ability to program certain stages of the pipeline.
On current hardware you have (for the most part) control over the vertex, primitive assembly, tessellation and fragment stages. Some operations that occur inbetween and after these stages are still fixed-function, such as depth/stencil testing, blending, perspective divide, etc.
Because shaders are actually nothing more than programs that you drop-in to define the input and output of a particular stage, you should think of input to a fragment shader as coming from the output of one of the previous stages. Vertex outputs are interpolated during rasterization and these are often what you're dealing with when you have an in variable in a fragment shader.
You can also have program-wide variables, known as uniforms. These variables can be accessed by any stage simply by using the same name in each stage. They do not vary across invocations of a shader, hence the name uniform.
Now you should have enough information to figure out this circle example... you can use a uniform to scale your circle (likely a simple scaling matrix) and you can either rely on per-vertex color or a uniform that defines the color.
You don't have shaders that draws circles (ok, you may with the right tricks, but's let's forget it for now, because it is misleading and has very rare and specific uses). Shaders are little programs you write to take care of certain stages of the graphic pipeline, and are more specific than "drawing a circle".
Generally speaking, every time you make a draw call, you have to tell openGL which shaders to use ( with a call to glUseProgram You have to use at least a Vertex Shader and a Fragment Shader. The resulting pipeline will be something like
Vertex Shader: the code that is going to be executed for each of the vertices you are going to send to openGL. It will be executed for each indices you sent in the element array, and it will use as input data the correspnding vertex attributes, such as the vertex position, its normal, its uv coordinates, maybe its tangent (if you are doing normal mapping), or whatever you are sending to it. Generally you want to do your geometric calculations here. You can also access uniform variables you set up for your draw call, which are global variables whic are not goin to change per vertex. A typical uniform variable you might watn to use in a vertex shader is the PVM matrix. If you don't use tessellation, the vertex shader will be writing gl_Position, the position which the rasterizer is going to use to create fragments. You can also have the vertex outputs different things (as the uv coordinates, and the normals after you have dealt with thieri geometry), give them to the rasterizer an use them later.
Rasterization
Fragment Shader: the code that is going to be executed for each fragment (for each pixel if that is more clear). Generally you do here texture sampling and light calculation. You will use the data coming from the vertex shader and the rasterizer, such as the normals (to evaluate diffuse and specular terms) and the uv coordinates (to fetch the right colors form the textures). The texture are going to be uniform, and probably also the parameters of the lights you are evaluating.
Depth Test, Stencil Test. (which you can move before the fragment shader with the early fragments optimization ( http://www.opengl.org/wiki/Early_Fragment_Test )
Blending.
I suggest you to look at this nice program to develop simple shaders http://sourceforge.net/projects/quickshader/ , which has very good examples, also of some more advanced things you won't find on every tutorial.

GLSL vertex shader cancel render

Can the rendering for a pixel be terminated in a vertex shader. For example if a vertex does not meet a certain requirement cancel the rendering of that vertex?
I'll assuming you said "rendering for a vertex be terminated". And no, you can't; OpenGL is very strict about the 1:1 ratio of input vertices to outputs for a VS. Also, it wouldn't really mean what you want it to, since vertices don't get rendered. Primitives do, and a primitive can be composed of more than one vertex. What would it mean to discard a vertex in the middle of a triangle strip, for example.
This is why Geometry Shaders have the ability to "cull" primitives; they deal specifically with a primitive, not merely a single vertex. This is done by simply not emitting any vertices; GS's must explicitly emit the vertices that it wants to output.
Vertex shaders now have the ability to cull primitives. This is done using the "cull distance" feature of OpenGL 4.5. It's like gl_ClipDistance, only instead of clipping, it culls the entire primitive if one of the vertices crosses the threshold.
In theory, you can use a vertex shader to produce a degenerate (zero-area) primitive. A primitive with zero area should not result in anything rasterized, and thus no fragment will be rendered. It is not particularly intuitive, however, especially if you are using primitives that share vertices.
But no, canceling a vertex is almost meaningless. It is the fundamental unit upon which primitives are constructed. If you simply remove a single vertex, then you will alter the rasterized output in undefined ways.
Put simply, vertices are not what create pixels on screen. It is the connectivity between vertices, which creates primitives, that ultimately lead to pixels. Geometry Shaders operate on a primitive-by-primitive basis, so they are generally where you would cancel rasterization and fragment shading in a programatic fashion.
UPDATE:
It has come to my attention that you are using GL_POINTS as your primitive type. In this special case, all you have to do to prevent your vertex from going further down the pipeline is set its position somewhere outside of your camera's viewing volume. The vertex will be clipped and no rasterization or fragment shading will occur.
This is a much more efficient solution to testing for some condition in a fragment shader and then discarding, because you skip rasterization and do not have to execute a fragment shader at all. Not to mention, discard usually winds up working as a post-shader execution flag that tells the GPU to discard the result - the GPU is often forced to execute the entire shader no matter where in the shader you issue the discard instruction. Thus discard rarely gives a performance benefit, and in many cases it can disable other potentially more useful hardware optimizations. This is the nature of the way GPUs schedule their shader workload, unfortunately.
The cheapest fragment is the one you never have to process :)
You can't terminate rendering of a pixel in a vertex shader (it doesn't deal with pixels), but you can in the fragment shader using the discard instruction.
I am elaborating on Andon M. Coleman answer, which deserves IMHO to be marked as the right one.
Even though the OpenGL specification is adamant about the fact that you cannot skip the fragment shader step (unless you actually remove the whole primitive in the geometry shader, as Nicol Bolas correctly pointed out, which is a bit overkill imho), you can do it in practice by letting OpenGL cull the whole geometry, as modern GPUs have early fragment rejection optimizations which will likely produce the same effect.
And, for the records, making the whole geometry get discarded is really really easy: just write the vertex outside the (-1, -1, -1),(1,1,1) cube,
gl_Position = vec4(2.0, 2.0, 2.0, 1.0);
...and off you go!
Hope this helps
You can make alterations to the vertex stream, including the removal of vertices, but the place to do that would be in a geometry shader. If you look into geometry shaders, you may find the solution you're looking for in simply failing to 'emit' a vertex.
EDIT: If rendering a triangle strip you would probably also want to take the care to start a new primitive, when a vertex is removed; you'll see why if you investigate geom. shaders. With GL_POINTS it would be less of an issue.
And yes, if you send a triangle strip of only 2 vertices, for instance, then indeed you fail to render anything -- just as you would do if you passed in such a degenerate strip in the first place. That does not mean the vertex stream can't be altered on the GL side of things, however.
Hope that helps >Tom
set the position out of ndc
or
set flag and pass to fragment, and discard in fragment according to the flag

Quick question about glColorMask and its work

I want to render depth buffer to do some nice shadow mapping. My drawing code though, consists of many shader switches. If I set glColorMask(0,0,0,0) and leave all shader programs, textures and others as they are, and just render the depth buffer, will it be 'OK' ? I mean, if glColorMask disables the "write of color components", does it mean that per-fragment shading IS NOT going to be performed?
For rendering a shadow map, you will normally want to bind a depth texture (preferrably square and power of two, because stereo drivers take this as hint!) to a FBO and use exactly one shader (as simple as possible) for everything. You do not want to attach a color buffer, because you are not interested in color at all, and it puts more unnecessary pressure on ROP (plus, some hardware can render double speed or more with depth-only). You do not want to switch between many shaders.
Depending on whether you do "classic" shadow mapping, or something more sophisticated such as exponential shadow maps, the shader that you will use is either as simple as it can be (constant color, and no depth write), or performs some (moderately complex) calculations on depth, but you normally do not want to perform any colour calculations, since that will mean needless calculations which will not be visible in any way.
No, the fragment operations will be performed anyway, but their result will be squashed by your zero color mask.
If you don't want some fragment operations to be performed - use the proper shader program which has an empty fragment shader attached and set the draw buffer to GL_NONE.
There is another way to disable fragment processing - to enable GL_RASTERIZER_DISCARD, but you won't get even the depth values in this case :)
No, the shader programs execute independent of the fixed function pipeline. Setting the glColorMask will have no effect on the shader programs.

Shader framebuffer readback

I was wondering if there is support in the newer shader models to read-back a pixel value from the target framebuffer. I assume that this is alrdy done in later (non-programmable) stages in the drawing pipeline which made me hope that this feature might have been added into the programmable pipeline.
I am aware that it is possible to draw to a texture bound framebuffer and then send this texture to the shader, I was just hoping for a more elegant way to achieve the same functionality.
As Andrew notes, the framebuffer access is logically a separate stage from the fragment shader, so reading the framebuffer in the fragment shader is impossible. The reason for this (to answer Andrew's question) is a combination of performance and the ordering requirements of the graphics pipeline. The way the rendering pipeline is defined, framebuffer blending operations MUST occur in the same order as the triangles/primitives that went into the beginning of the pipeline. The fragment shaders, on the other hand, can happen in any order. So by having them be separate stages, the GPU is free to run fragment shaders as fast as it can, as their inputs become available, without having to synchronize between them. As long as it maintains enough bufffer space to hold on to the outputs of the fragment shaders, so that they can be accumulated and allow the framebuffer blends and writes to occur in order, all is well, as the results of any given fragment shader are not visible until after the blending stage.
If there was a way for the fragment shader to read the framebuffer, it would require some sort of synchronization to ensure that those reads happen in order, thus greatly slowing things down.
No. As you mention, rendering to a texture is the way to achieve that functionality.
If you take a look at a block diagram of a GPU pipeline, you'll see that the blending stage - which is what combines fragment shader output with the framebuffer - is separate from the fragment shader and is fixed-function.
I'm not a GPU designer - so I can only speculate the reason for this. Presumably it is to keep framebuffer access fast and insulate the fragment shader stage from the frame buffer so that it can be better parallelised. There are probably also issues regarding multi-sampling, and so on.
(Not to mention that fixed-function blending is "good enough" in most cases.)
Actually I think this is now doable with Direct3D 11 SM 5.0 (I didn't test it though).
You can bind an UAV to a PS 5.0, for allowing read and write operations on it using method OMSetRenderTargetsAndUnorderedAccessViews.
In that case the backbuffer of the swap chain in which you render has to be created with flag DXGI_USAGE_UNORDERED_ACCESS (I guess).
This is used in DXSDK OIT11 sample.
It is possible to read back the contents of the frame buffer in the fragment shader with Shader_framebuffer_fetch extension. The support can be added to the GPU with some performance loss. In fact, these days I'm working on to add the support of this extension in the OpenGL ES2.0 driver of a well known GPU brand in the consumer electronics market.
You can draw to a texture TEX (using a render target view) and then bind that as an input to another shader (using a shader resource view). TEX is then a pseduo-framebuffer.