Optimize for expensive fragment shader - OpenGL

I'm rendering multiple layers of flat triangles with a raytracer in the fragment shader. The upper layers have holes, and I'm looking for a way to avoid running the shader for pixels that are already filled by one of the upper layers, i.e. I want only those parts of the lower layers rendered that lie in the holes of the upper layers. Of course, whether or not there is a hole isn't known until the fragment shader has done its work for a layer.
As far as I understand, I cannot use early depth testing because there, the depth values are interpolated between the vertices and do not come from the fragment shader. Is there a way to "emulate" that behavior?

The best way to solve this issue is to not use layers. You are only using layers because of the limitations of using a 3D texture to store your scene data. So... don't do that.
SSBOs and buffer textures (if your hardware is too old for SSBOs) can access more memory than a 3D texture. And you could even employ manual swizzling of the data to improve cache locality if that is viable.
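As a rough illustration, scene data can be read straight from an SSBO in the fragment shader (GL 4.3+). This is only a sketch: the buffer layout, the gridSize uniform and the indexing below are placeholders for whatever your raytracer actually stores, not anything taken from the question.

    #version 430 core

    // Hypothetical flat array replacing the 3D texture; the std430 layout and
    // the gridSize uniform are assumptions.
    layout(std430, binding = 0) readonly buffer SceneData {
        vec4 voxels[];
    };

    uniform ivec3 gridSize;

    out vec4 fragColor;

    vec4 fetchVoxel(ivec3 p)
    {
        // A manual swizzle for cache locality could go here; this is plain
        // row-major indexing.
        int index = p.x + gridSize.x * (p.y + gridSize.y * p.z);
        return voxels[index];
    }

    void main()
    {
        // The raytracing loop would call fetchVoxel() instead of texelFetch().
        fragColor = fetchVoxel(ivec3(0));
    }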
As far as I understand, I cannot use early depth testing because there, the depth values are interpolated between the vertices and do not come from the fragment shader.
This is correct insofar as you cannot use early depth tests, but it is incorrect as to why.
The "depth" provided by the VS doesn't need to be the depth of the actual fragment. You are rendering your scene in layers, presumably with each layer being a full-screen quad. By definition, everything in one layer of rendering is beneath everything in a lower layer. So the absolute depth value doesn't matter; what matters is whether there is something from a higher layer over this fragment.
So each layer could get its own depth value, with lower layers getting a lower depth value. The exact value is arbitrary and irrelevant; what matters is that higher layers have higher values.
The reason this doesn't work is this: if your raytracing algorithm detects a miss within a layer (a "hole"), you must discard that fragment. And the use of discard at all turns off early depth testing in most hardware, since the depth testing logic is usually tied to the depth writing logic (it is an atomic read/conditional-modify/conditional-write).
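To make that concrete, here is roughly what the per-layer depth trick would look like, with the offending discard marked; the layerDepth uniform and the traceLayer() stub are illustrative only, not something from the question.

    // Vertex shader for one full-screen layer quad: every fragment of a layer
    // gets the same, arbitrary depth; only the ordering between layers matters.
    #version 330 core
    layout(location = 0) in vec2 position;
    uniform float layerDepth;   // e.g. -0.8 for the top layer, -0.6 for the next, ...

    void main()
    {
        gl_Position = vec4(position, layerDepth, 1.0);
    }

    // Fragment shader: the discard on a miss is exactly what turns early depth
    // testing back off on most hardware.
    #version 330 core
    out vec4 fragColor;

    vec4 traceLayer()           // stand-in for the real per-layer raytracer
    {
        return vec4(0.0);
    }

    void main()
    {
        vec4 hit = traceLayer();
        if (hit.a == 0.0)
            discard;            // a "hole" in this layer
        fragColor = hit;
    }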

Related

(Modern) OpenGL Different Colored Faces on a Cube - Using Shaders

A cube with different colored faces in immediate mode is very simple. But doing the same thing with shaders seems to be quite a challenge.
I have read that in order to create a cube with different coloured faces, I should create 24 vertices instead of 8 vertices for the cube - in other words, I visualize this as 6 squares that don't quite touch.
Is perhaps another (better?) solution to texture the faces of the cube using a really simple texture of a flat color - perhaps a 1x1 pixel texture?
My texturing idea seems simpler to me - from a coder's point of view... but which method would be the most efficient from a GPU/graphics card perspective?
I'm not sure what your overall goal is (e.g. what you're learning to do in the long term), but generally for high performance applications (e.g. games) your goal is to reduce GPU load. Every time you switch certain states (e.g. change textures, render targets, shader uniform values, etc..) the GPU stalls reconfiguring itself to meet your demands.
So, you can pass in a 1x1 pixel texture for each face, but then you'd need six draw calls (usually not so bad, but there is some prep work and potential cache misses) and six texture sets (can be very bad, often as bad as changing shader uniform values).
Suppose you wanted to pass in one texture and use that as a texture map for the cube. This is a little less trivial than it sounds -- you need to lay out each cube face on the texture in a way that maps to the vertices. Often you need to pass in a texture coordinate for each vertex, and due to the spatial configuration of the texture this normally doesn't end up meaning one texture coordinate per spatial vertex.
However, if you use an environment/reflection map, the complexities of mapping are handled for you. In this way, you could draw a single texture on all sides of your cube. (Or on your sphere, or whatever sphere-mapped shape you wanted.) I'm not sure I'd call this easier, since you have to build the environment texture carefully, and you still have to set a different texture for each new color you want to represent -- or change the texture either via the GPU or in step with the GPU, and that's tricky and usually not performant.
Which brings us back to the canonical way of doing it, as you mentioned: use per-vertex colors -- they're fast, you can draw many, many cubes very quickly by only specifying different vertex data, and it's easy to understand. It really is the best way, and it's how GPUs are designed to run quickly.
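For reference, the shader side of that canonical approach is just a pass-through of a per-vertex color attribute; this sketch assumes each of the 24 vertices carries its face's color (attribute locations and names are arbitrary).

    // Vertex shader: each of the 24 cube vertices carries the color of its face.
    #version 330 core
    layout(location = 0) in vec3 position;
    layout(location = 1) in vec3 faceColor;
    uniform mat4 mvp;
    out vec3 vColor;

    void main()
    {
        vColor = faceColor;
        gl_Position = mvp * vec4(position, 1.0);
    }

    // Fragment shader: output the color, which is constant across each face.
    #version 330 core
    in vec3 vColor;
    out vec4 fragColor;

    void main()
    {
        fragColor = vec4(vColor, 1.0);
    }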
Additionally...
And yes, you can do this with just shaders... but it'd be ugly and slow, and the GPU would end up computing it per pixel. Pass the object-space coordinates to the fragment shader, and in the fragment shader test which side you're on and output the corresponding color. This is not recommended at all: it's not particularly easier, and it's definitely not faster for the GPU -- to change colors you'd again end up changing uniform values for the shaders.
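For completeness, that not-recommended approach would look roughly like the sketch below, assuming a unit cube centered at the origin with its object-space position handed to the fragment shader (the colors are arbitrary).

    // Fragment shader: pick a color by testing which axis of the object-space
    // position dominates. Assumes a unit cube centered at the origin.
    #version 330 core
    in vec3 objPos;          // passed through from the vertex shader
    out vec4 fragColor;

    void main()
    {
        vec3 a = abs(objPos);
        if (a.x >= a.y && a.x >= a.z)
            fragColor = objPos.x > 0.0 ? vec4(1, 0, 0, 1) : vec4(0, 1, 0, 1);
        else if (a.y >= a.z)
            fragColor = objPos.y > 0.0 ? vec4(0, 0, 1, 1) : vec4(1, 1, 0, 1);
        else
            fragColor = objPos.z > 0.0 ? vec4(1, 0, 1, 1) : vec4(0, 1, 1, 1);
    }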

How can I apply a depth test to vertices (not fragments)?

TL;DR I'm computing a depth map in a fragment shader and then trying to use that map in a vertex shader to see if vertices are 'in view' or not and the vertices don't line up with the fragment texel coordinates. The imprecision causes rendering artifacts, and I'm seeking alternatives for filtering vertices based on depth.
Background. I am very loosely attempting to implement a scheme outlined in this paper (http://dash.harvard.edu/handle/1/4138746). The idea is to represent arbitrary virtual objects as lots of tangent discs. While they wanted to replace triangles in some graphics card of the future, I'm implementing this on conventional cards; my discs are just fans of triangles ("Discs") around center points ("Points").
This is targeting WebGL.
The strategy I intend to use, similar to what's done in the paper, is:
Render the Discs in a Depth-Only pass.
In a second (or more) pass, compute what's visible based solely on which Points are "visible" - ie their depth is <= the depth from the Depth-Only pass at that x and y.
I believe the authors of the paper used a gaussian blur on top of the equivalent of a GL_POINTS render applied to the Points (ie re-using the depth buffer from the DepthOnly pass, not clearing it) to actually render their object. It's hard to say: the process is unfortunately a one line comment, and I'm unsure of how to duplicate it in WebGL anyway (a naive gaussian blur will just blur in the background pixels that weren't touched by the GL_POINTS call).
Instead, I'm hoping to do something slightly different, by re-rendering the discs in a second pass as cones (the center of a disc becomes the apex of a cone; think "close the umbrella"), effectively computing a Voronoi diagram on the surface of the object (à la the Red Book, http://www.glprogramming.com/red/chapter14.html#name19). The idea is that an output pixel gets the color of the first disc to reach it as radii grow from 0 to their natural size.
The crux of the problem is that only discs whose centers pass the depth test in the first pass should be allowed to carry on (as cones) to the 2nd pass. Because what's true at the disc center applies to the whole disc/cone, I believe this requires evaluating a depth test at a vertex or object level, and not at a fragment level.
Since WebGL support for accessing depth buffers is still poor, in my first pass I am packing depth info into an RGBA framebuffer in a fragment shader. I then intended to use this in the vertex shader of the second pass via a sampler2D; any disc center that was closer than the corresponding texture2D() lookup would be allowed on to the second pass; otherwise I would hack a "discard" of the vertex (its alpha would be set to 0, or some flag would be set that causes the fragments associated with the disc/cone to be discarded, etc.).
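To be concrete, the packing and vertex-shader lookup I'm describing look roughly like this (a simplified sketch, not my exact code; names, the epsilon and the remapping are illustrative):

    // First pass, fragment shader: pack a depth in [0,1) into an RGBA8 target.
    precision highp float;
    varying float vDepth;   // depth remapped to [0,1) in the vertex shader

    vec4 packDepth(float depth)
    {
        vec4 enc = fract(depth * vec4(1.0, 255.0, 65025.0, 16581375.0));
        enc -= enc.yzww * vec4(1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0, 0.0);
        return enc;
    }

    void main()
    {
        gl_FragColor = packDepth(vDepth);
    }

    // Second pass, vertex shader: unpack the stored depth at the disc center's
    // screen position and flag the vertex when it is occluded.
    attribute vec3 center;
    uniform mat4 mvp;
    uniform sampler2D depthTex;   // the RGBA-packed depth from the first pass
    varying float vOccluded;      // 1.0 = occluded; fragments can discard on it

    float unpackDepth(vec4 rgba)
    {
        return dot(rgba, vec4(1.0, 1.0 / 255.0, 1.0 / 65025.0, 1.0 / 16581375.0));
    }

    void main()
    {
        gl_Position = mvp * vec4(center, 1.0);
        vec3 ndc = gl_Position.xyz / gl_Position.w;
        float stored = unpackDepth(texture2D(depthTex, ndc.xy * 0.5 + 0.5));
        float depth = ndc.z * 0.5 + 0.5;
        vOccluded = (depth <= stored + 1e-5) ? 0.0 : 1.0;   // epsilon is a guess
    }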
This actually kind of worked but it caused horrendous z-fighting between discs that were close together (very small perturbations wildly changed which discs were visible). I believe there is some floating point error between depth->rgba->depth. More importantly, though, the depth texture is being set by fragment texel coords, but I'm looking up vertices, which almost certainly don't line up exactly on top of relevant texel coordinates; so I get depth +/- noise, essentially, and the noise is the issue. Adding or subtracting .000001 or something isn't sufficient: you trade Type I errors for Type II. My render became more accurate when I switched from NEAREST to LINEAR for the depth texture interpolation, but it still wasn't good enough.
How else can I determine which disc's centers would be visible in a given render, so that I can do a second vertex/fragment (or more) pass focused on objects associated with those points? Or: is there a better way to go about this in general?

GLSL glass effect plus depth peeling

I'm working on rendering a scene that potentially has multiple intersecting transparent objects. This makes the standard method of sorting and drawing back to front problematic (even sorting triangles wouldn't work if the triangles intersect). So I've implemented depth peeling using a GLSL fragment shader to do the second depth test. It works great.
Now I want to be able to apply certain effects using shaders. One of the objects in the scene is a syringe, and I would like to apply a glass effect. If I was drawing back to front, this would be easy - just start the shader when I draw the syringe, since everything behind it is already in the frame buffer. However, when using depth peeling this approach won't work.
So my questions are:
How do I apply shader effects to a single object in a scene when using depth peeling?
How do I combine effect shaders with my depth peeling shader (assuming they need to run at the same time)?
I should note that I'm pretty new at using shaders, so code examples are appreciated!
I'd be surprised if that's possible without ray tracing. As far as I know, the way to use refraction shaders is to do texture lookups in an environment map. This map can be either precomputed, or it's computed on the fly in a separate rendering pass. For the latter option you would need one separate environment map and one extra pass for each object that uses the shader. I kinda doubt that that's possible if the objects intersect each other. Even if it was, for each of these passes you would also need another couple passes for the depth peeling. Now if you also wanted the depth peeling shader passes to factor in refractions for the surrounding objects, this would quickly get out of hand.
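To give a flavor of what such a refraction shader does, here is a minimal sketch of a single cube-map lookup with refract(); the uniform names, the Fresnel hack and the constant output alpha are assumptions, and this ignores the depth-peeling interaction entirely.

    // Fragment shader: single-bounce "glass" look via a cube-map lookup.
    #version 330 core
    in vec3 worldNormal;
    in vec3 worldPos;
    uniform vec3 cameraPos;
    uniform samplerCube envMap;     // precomputed or rendered-on-the-fly environment
    out vec4 fragColor;

    void main()
    {
        vec3 I = normalize(worldPos - cameraPos);
        vec3 N = normalize(worldNormal);
        // Ratio of indices of refraction, e.g. air -> glass.
        vec3 refracted = refract(I, N, 1.0 / 1.5);
        vec3 reflected = reflect(I, N);
        // Cheap Fresnel-ish blend between reflection and refraction.
        float fresnel = pow(1.0 - max(dot(-I, N), 0.0), 3.0);
        vec3 color = mix(texture(envMap, refracted).rgb,
                         texture(envMap, reflected).rgb, fresnel);
        fragColor = vec4(color, 0.6);   // partially transparent glass
    }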

GLSL Shaders: blending, primitive-specific behavior, and discarding a vertex

Criteria: I’m using OpenGL with shaders (GLSL) and trying to stay with modern techniques (e.g., trying to stay away from deprecated concepts).
My questions, in a very general sense (see below for more detail), are as follows:
Do shaders allow you to do custom blending that help eliminate z-order transparency issues found when using GL_BLEND?
Is there a way for a shader to know what type of primitive is being drawn without “manually” passing it some sort of flag?
Is there a way for a shader to “ignore” or “discard” a vertex (especially when drawing points)?
Background: My application draws points connected with lines in an ortho projection (vertices have varying depth in the projection). I’ve only recently started using shaders in the project (trying to get away from deprecated concepts). I understand that standard blending has ordering issues with alpha testing and depth testing: basically, if a “translucent” pixel at a higher z level is drawn first (thus blending with whatever colors were already drawn to that pixel at a lower z level), and an opaque object is then drawn at that pixel but at a lower z level, depth testing prevents changing the pixel that was already drawn for the “higher” z level, thus causing blending issues. To overcome this, you need to draw opaque items first, then translucent items in ascending z order. My gut feeling is that shaders wouldn’t provide an (efficient) way to change this behavior—am I wrong?
Further, for speed and convenience, I pass information for each vertex (along with a couple of uniform variables) to the shaders and they use the information to find a subset of the vertices that need special attention. Without doing a similar set of logic in the app itself (and slowing things down) I can’t know a priori what subset of vertices that is. Thus I send all vertices to the shader. However, when I draw “points” I’d like the shader to ignore all the vertices that aren’t in the subset it determines. I think I can get the effect by setting alpha to zero and using an alpha function in the GL context that will prevent drawing anything with alpha less than, say, 0.01. However, is there a better or more “correct” GLSL way for a shader to say “just ignore this vertex”?
Do shaders allow you to do custom blending that help eliminate z-order transparency issues found when using GL_BLEND?
Sort of. If you have access to GL 4.x-class hardware (Radeon HD 5xxx or better, or GeForce 4xx or better), then you can perform order-independent transparency. Earlier versions have techniques like depth peeling, but they're quite expensive.
The GL 4.x-class version essentially builds a series of per-pixel "linked lists" of transparent samples, which you resolve into the final sample color with a full-screen pass. It's not free of course, but it isn't as expensive as other OIT methods. How expensive it would be for your case is uncertain; the cost is proportional to how many overlapping transparent fragments you have.
You still have to draw opaque stuff first, and you have to draw transparent stuff using special shader code.
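Very roughly, that "special shader code" for the transparent pass looks something like the sketch below (GLSL 4.30; the binding points, the Node layout and the maxNodes uniform are placeholders, and the full-screen resolve pass that sorts and blends each list is not shown).

    // Transparent-geometry pass: append each fragment to a per-pixel linked list
    // instead of blending it immediately.
    #version 430 core

    layout(early_fragment_tests) in;    // respect the depth laid down by the opaque pass

    layout(binding = 0, r32ui) uniform uimage2D headPointers;       // per-pixel list heads
    layout(binding = 0, offset = 0) uniform atomic_uint nodeCounter;

    struct Node { vec4 color; float depth; uint next; };
    layout(std430, binding = 1) buffer NodeBuffer { Node nodes[]; };

    uniform uint maxNodes;
    in vec4 vColor;
    // No color output here: a later full-screen pass sorts and blends each list.

    void main()
    {
        uint idx = atomicCounterIncrement(nodeCounter);
        if (idx < maxNodes) {
            uint prev = imageAtomicExchange(headPointers, ivec2(gl_FragCoord.xy), idx);
            nodes[idx].color = vColor;
            nodes[idx].depth = gl_FragCoord.z;
            nodes[idx].next  = prev;
        }
    }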
Is there a way for a shader to know what type of primitive is being drawn without “manually” passing it some sort of flag?
No.
Is there a way for a shader to “ignore” or “discard” a vertex (especially when drawing points)?
No in general, but yes for points. A Geometry shader can conditionally emit vertices, thus allowing you to discard any vertex for arbitrary reasons.
Discarding a vertex in non-point primitives is possible, but it will also affect the interpretation of that primitive. The reason it's simple for points is because a vertex is a primitive, while a vertex in a triangle isn't a whole primitive. You can discard lines, but discarding a vertex within a line is... of dubious value.
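A minimal sketch of that geometry-shader filtering for points follows; the vKeep flag is an assumed per-vertex value computed in the vertex shader, and any other varyings would need to be forwarded the same way.

    // Geometry shader: pass points through only when the vertex shader flagged
    // them as wanted; otherwise emit nothing, effectively discarding the vertex.
    #version 330 core
    layout(points) in;
    layout(points, max_vertices = 1) out;

    in float vKeep[];      // 1.0 = keep, 0.0 = drop (set in the vertex shader)

    void main()
    {
        if (vKeep[0] > 0.5) {
            gl_Position = gl_in[0].gl_Position;
            EmitVertex();
            EndPrimitive();
        }
    }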
That being said, your explanation for why you want to do this is of dubious merit. You want to update vertex data with what is essentially a boolean that says "render me" or "don't". That means that, every frame, you have to modify your data to say which points should be rendered and which shouldn't.
The simplest and most efficient way to do this is to simply not render with them. That is, arrange your data so that the only thing on the GPU are the points you want to render. Thus, there's no need to do anything special at all. If you're going to be constantly updating your vertex data, then you're already condemned to dealing with streaming vertex data. So you may as well stream it in a way that makes rendering efficient.

My own z-buffer

How can I make my own z-buffer for correct alpha blending? I'm using GLSL.
I have only one idea: use two "buffers", one storing the depth component and the other the color (with alpha channel). I don't need to access the buffers from my application. I can't use a uniform array because GLSL restricts the number of uniform variables. I can't use an FBO because reading from and writing to the same framebuffer at the same time is undefined behavior (and doesn't work on any card).
How can I resolve this problem?
Or, how can I read the actual, real-time z-buffer from GLSL? (I mean the z-buffer must be updated for each fragment shader invocation.)
How can I make my own z-buffer for correct alpha blending?
That's not possible. For perfect order-independent transparency you must get rid of the z-buffer and replace it with another mechanism for hidden surface removal.
With a z-buffer there are two possible ways to tackle the problem.
Multi-layered z-buffer (impractical with hardware acceleration) - basically it stores several layers of "depth" values and uses them for blending transparent surfaces. It will hog a lot of memory, and there will be a maximum number of overlapping transparent surfaces; once you're over the limit, there will be artifacts.
Depth peeling (google it). Order-independent transparency, but there's a limit on the maximum number of "overlapping" transparent polygons per pixel. It can actually be implemented on hardware; a sketch of the core fragment-shader test is shown after this list.
Both approaches have a limit (a maximum number of overlapping transparent polygons per pixel); once you go over the limit, the scene will no longer render properly, which makes the whole thing rather useless.
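As promised above, the core of a depth-peeling pass is just a second depth test in the fragment shader against the previous layer's depth. This is only a sketch: the texture name and epsilon are assumptions, and the CPU-side loop over peels is omitted.

    // Fragment shader for one peel: reject anything at or in front of the depth
    // captured in the previous peel, so each pass keeps the next-nearest surface.
    #version 330 core
    uniform sampler2D prevDepth;    // depth texture written by the previous peel
    uniform vec2 viewportSize;
    in vec4 vColor;
    out vec4 fragColor;

    void main()
    {
        float prev = texture(prevDepth, gl_FragCoord.xy / viewportSize).r;
        if (gl_FragCoord.z <= prev + 1e-7)
            discard;                // this surface was already peeled
        fragColor = vColor;         // the regular depth test keeps the nearest of the rest
    }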
What you could actually do (to get a perfect solution) is to remove the z-buffer completely, and build a rendering pipeline that gathers all polygons to be rendered, clips them, splits them (where two polygons intersect), sorts them and then paints them on screen in the correct order to ensure a correct result. However, this is hard, and doing it with hardware acceleration is harder. I think (I'm not completely certain it happened) 5 or 6 years ago some ATI GPU-related document mentioned that some of their cards could render a correct scene with the z-buffer disabled by enabling some kind of extension. However, they didn't say a thing about alpha blending. I haven't heard about this feature since. Perhaps it didn't become popular and shared the fate of TruForm (forgotten). Also, such a rendering pipeline will not be able to do some things that are possible with a z-buffer.
If it's order-independent transparency you're after then the fundamental problem is that a depth buffer stores one depth per pixel, but if you're composing a view of partially transparent geometry then more than one fragment contributes to each pixel.
If you were to solve the problem robustly you'd need an ordered list of depths per pixel, going back to the closest opaque fragment. You'd then walk the list in reverse order. In practice OpenGL doesn't do things like variably sized arrays so people achieve pretty much that by drawing their geometry in back-to-front order.
An alternative, embodied by GL_SAMPLE_ALPHA_TO_COVERAGE, is to switch to screen-door transparency, which is indistinguishable from real transparency either at a really high resolution or with multisampling. Ideally you'd do that stochastically, but that would void the OpenGL rule of repeatability. Nevertheless, since you're in GLSL you can do it yourself. Your shader simply takes the input alpha and uses it as the probability that it'll output the final fragment. So grab a random value in the range 0.0 to 1.0 from somewhere and if it's greater than the alpha then discard the fragment. Always output with an alpha of 1.0 and just use the normal depth buffer. Answers like this say a bit more about what you can do to get randomish numbers in GLSL, and obviously you want to turn multisampling up as high as possible.
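Concretely, that stochastic version might look like the sketch below; the screen-space hash is just one common cheap choice for a "randomish" number, not anything canonical.

    // Fragment shader: screen-door / stochastic transparency. Survive with
    // probability equal to alpha, then write an opaque fragment.
    #version 330 core
    in vec4 vColor;
    out vec4 fragColor;

    // Cheap screen-space hash in [0,1); any per-fragment pseudo-random source works.
    float hash(vec2 p)
    {
        return fract(sin(dot(p, vec2(12.9898, 78.233))) * 43758.5453);
    }

    void main()
    {
        if (hash(gl_FragCoord.xy) > vColor.a)
            discard;                          // dropped with probability 1 - alpha
        fragColor = vec4(vColor.rgb, 1.0);    // always alpha 1; normal depth buffer use
    }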
Eric Enderton has written a decent paper (which has a slide version) on stochastic order-independent transparency that goes alongside a DirectX implementation that's worth checking out.