Binary OR framebuffer blending - opengl

I'm currently looking into an algorithm described in a research paper; however, I've come across a portion that I'm unclear on how to achieve.
A grid is defined by placing a camera above the scene and adjusting its view frustum to enclose the area to be voxelized. This camera has an associated viewport with (w, h) dimensions. The scene is then rendered, constructing the voxelization in the frame buffer. A pixel (i, j) represents a column in the grid and each voxel within this column is binary encoded using the k-th bit of the RGBA value of the pixel. Therefore, the corresponding image represents a w×h×32 grid with one bit of information per voxel. This bit indicates whether a primitive passes through a cell or not. The union of voxels corresponding to the k-th bit for all pixels defines a slice. Consequently, the image/texture encoding the grid is called a slicemap. When a primitive is rasterized, a set of fragments are obtained. A fragment shader is used in order to determine the position of the fragment in the column based on its depth. The result is then OR-ed with the current value of the frame buffer.
Presumably one would achieve this by setting the blend equation to a binary OR; however, that's not an available option, and I can't see a way to achieve it through any combination of glBlendFunc() and glBlendEquation().
Additionally, from my understanding, it's not possible to read the framebuffer within the fragment shader. You can bind a texture to both the shader and the framebuffer, but accessing it from the shader is undefined behaviour due to the lack of synchronisation.
The paper doesn't state whether OpenGL or Direct3D was used, but to the best of my understanding Direct3D's blend equations have the same limitations.
Am I missing something?
I realise I could simply achieve the same result in 32 passes.

OpenGL has a separate glLogicOp() for performing logical operations on the framebuffer.
This can be configured and enabled using
glLogicOp(GL_OR);
glEnable(GL_COLOR_LOGIC_OP);
Although the flag is named GL_COLOR_LOGIC_OP, the documentation implies it also covers alpha values.
It's described in slightly more detail in citation 26.
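To make the pieces concrete, here is a CPU-side sketch of the per-fragment work. The 32-slice depth mapping and the function names are my own assumptions, not taken from the paper: the shader maps a fragment's normalized depth to a slice index k and outputs a value whose k-th bit is set; glLogicOp(GL_OR) with GL_COLOR_LOGIC_OP enabled then merges that into the framebuffer for you.

```c
#include <stdint.h>

/* Map a normalized depth in [0, 1) to one of 32 slices and return a
 * 32-bit value with only that slice's bit set. On the GPU this would
 * be the fragment shader's output color; the OR with the framebuffer
 * is performed by glLogicOp(GL_OR) + glEnable(GL_COLOR_LOGIC_OP). */
static uint32_t slice_bit(float depth)
{
    int k = (int)(depth * 32.0f);   /* slice index 0..31 */
    if (k > 31) k = 31;             /* guard against depth == 1.0 */
    return 1u << k;
}

/* OR several fragments into one pixel's column, as the hardware would. */
static uint32_t voxelize_column(const float *depths, int n)
{
    uint32_t column = 0;
    for (int i = 0; i < n; ++i)
        column |= slice_bit(depths[i]);
    return column;
}
```

For example, three fragments at depths 0.02, 0.5 and 0.97 land in slices 0, 16 and 31, so the column ends up as 0x80010001.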

Related

OpenGL trim/inline contour of stencil

I have created a shape in my stencil buffer (black in the picture below). Now I would like to render to the backbuffer. I would like one texture on the outer pixels (say, 4 pixels deep) of my stencil (red in the picture), and another texture on the remaining pixels.
I have read several solutions that involve scaling, but that will not work when there is no obvious center of the shape.
How do I acquire the desired effect?
The stencil buffer works great for doing operations on the specific fragments being overlaid onto them. However, it's not so great for doing operations that require looking at pixels other than the one corresponding to the fragment being rendered. In order to do outlining, you have to ask about the values of neighboring pixels, which stencil operations don't allow.
So, if it is possible to put the stencil data you want to test against in a non-stencil format image (ie: a color image, maybe with an integer texture format), that would make things much simpler. You can do the effect of stencil discarding by using discard directly in the fragment shader. Since you can fetch arbitrarily from the texture (as long as you're not trying to modify it), you can fetch neighboring pixels and test their values. You can use that to identify when a fragment is near a border.
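That neighbor test can be sketched in plain C over a binary mask (the mask layout, radius parameter, and names here are my own; in a fragment shader the mask[] lookups would become texelFetch() calls, and the result would drive the choice of texture or a discard):

```c
/* A pixel is "border" if it is inside the shape but some pixel within
 * r texels of it is outside. Out-of-bounds counts as outside. */
enum { W = 8, H = 8 };

static int is_border(const unsigned char mask[H][W], int x, int y, int r)
{
    if (!mask[y][x]) return 0;              /* not inside the shape */
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx) {
            int nx = x + dx, ny = y + dy;
            if (nx < 0 || ny < 0 || nx >= W || ny >= H || !mask[ny][nx])
                return 1;                   /* a neighbor is outside */
        }
    return 0;                               /* fully interior */
}
```

Note how the cost grows as (2r+1)², which is exactly the border-size problem discussed further down.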
However, if you're relying on specialized stencil operations to build the stencil data itself (like bitwise operations), then that's more complicated. You will have to employ stencil texturing operations, so you're going to have to render to an FBO texture that has a depth/stencil format. And you'll have to set it up to allow you to read from the stencil aspect of the texture. This is an OpenGL 4.3 feature.
This effectively converts it into an 8-bit unsigned integer texture, which lets you play whatever games you need to. But if you also want to keep using stencil tests to discard fragments, you will need texture-barrier functionality in order to be allowed to read from an image that's attached to the current FBO. You don't need to actually issue the barrier, since you should mask off stencil writing; you just need OpenGL 4.5 or the NV/ARB_texture_barrier extension to be available, which they widely are.
Either way this happens, the biggest difficulty is going to be varying the size of the border. It is easy to test a pixel's 3×3 neighborhood to see if it is at a border. But the larger the border size, the larger the area of pixels each fragment has to test. At that point, I would suggest looking for a different solution, one that is based on some knowledge of what pattern is being written into the stencil buffer.
That is, if the rendering operation that lays down the stencil has some knowledge of the shape, then it could compute a distance to the edge of the shape in some way. This might require constructing the geometry in a way that it has distance information in it.

How to do particle binning in OpenGL?

Currently I'm creating a particle system and I would like to transfer most of the work to the GPU using OpenGL, for gaining experience and performance reasons. At the moment, there are multiple particles scattered through the space (these are currently still created on the CPU). I would more or less like to create a histogram of them. If I understand correctly, for this I would first translate all the particles from world coordinates to screen coordinates in a vertex shader. However, now I want to do the following:
So, for each pixel, I want a hit count of how many particles fall inside it. Each particle also has several properties (e.g. a colour), and I would like to sum them for every pixel (as shown in the lower-right corner). Would this be possible using OpenGL? If so, how?
If the whole data set fits in GPU memory, the best tool I can recommend is an SSBO (Shader Storage Buffer Object).
You need the data after it has been transformed (e.g. by a projection), but an SSBO is still your best option:
In the fragment shader you read the properties already accumulated for that pixel and write the modified properties (number of particles at this pixel, colour, etc.) back to the same index in the buffer.
Due to the parallel nature of the GPU, several invocations coming from different particles may be doing this work concurrently for the same index, so you need to handle that yourself. Read up on the memory model and atomic operations.
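As a CPU analogue of that idea (the struct layout and names are my own assumption; in GLSL you would declare a matching SSBO block and use atomicAdd on its members), each invocation atomically bumps the counters for its pixel, so concurrent updates to the same index cannot lose writes:

```c
#include <stdatomic.h>

/* One bin per pixel: a hit count plus a summed (quantized) property.
 * The pixel index plays the role of gl_FragCoord mapped to a 1D
 * buffer offset. */
typedef struct {
    atomic_uint count;      /* number of particles in this pixel */
    atomic_uint red_sum;    /* summed red channel, for example   */
} Bin;

/* What each "shader invocation" does for its particle. */
static void deposit(Bin *bins, unsigned pixel, unsigned red)
{
    atomic_fetch_add(&bins[pixel].count, 1u);
    atomic_fetch_add(&bins[pixel].red_sum, red);
}
```

On the GPU the same pattern works because atomicAdd gives you the read-modify-write as a single indivisible operation, which is exactly what the plain SSBO read/write above lacks.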
Another approach, more limited, is blending.
The idea is that each fragment increments the colour value already in the framebuffer. This can be done by setting GL_FUNC_ADD via glBlendEquation (or glBlendEquationSeparate) and outputting 1/255 (as a normalized integer) for each RGBA component from the fragment shader.
The limitation comes from the [0, 255] range: only up to 255 particles can be counted per pixel; anything beyond that is clamped to the range and therefore "lost".
With four components (RGBA), four properties can be handled per render target, but an FBO can have several colour attachments.
You can read the result back with glReadPixels. Call glReadBuffer with the appropriate GL_COLOR_ATTACHMENTi first if you're using an FBO rather than the default framebuffer.
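The saturating behaviour is easy to model (the names are mine; this just mimics GL_FUNC_ADD with a source value of 1/255 written to one pixel of an 8-bit normalized colour attachment):

```c
/* Adding 1/255 to a normalized 8-bit channel is a saturating +1
 * in fixed point: the hardware clamps the blend result to 1.0. */
static unsigned char blend_add_one(unsigned char dst)
{
    return dst < 255 ? dst + 1 : 255;
}

/* One fragment per particle hitting the same pixel. */
static unsigned char count_particles(unsigned n)
{
    unsigned char pixel = 0;            /* cleared framebuffer */
    for (unsigned i = 0; i < n; ++i)
        pixel = blend_add_one(pixel);
    return pixel;
}
```

So 10 particles read back as 10, but 300 particles read back as 255: the overflow is silently clamped, which is the limitation mentioned above.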

How can I apply a depth test to vertices (not fragments)?

TL;DR I'm computing a depth map in a fragment shader and then trying to use that map in a vertex shader to see if vertices are 'in view' or not and the vertices don't line up with the fragment texel coordinates. The imprecision causes rendering artifacts, and I'm seeking alternatives for filtering vertices based on depth.
Background. I am very loosely attempting to implement a scheme outlined in this paper (http://dash.harvard.edu/handle/1/4138746). The idea is to represent arbitrary virtual objects as lots of tangent discs. While they wanted to replace triangles in some graphics card of the future, I'm implementing this on conventional cards; my discs are just fans of triangles ("Discs") around center points ("Points").
This is targeting WebGL.
The strategy I intend to use, similar to what's done in the paper, is:
Render the Discs in a Depth-Only pass.
In a second (or more) pass, compute what's visible based solely on which Points are "visible" - ie their depth is <= the depth from the Depth-Only pass at that x and y.
I believe the authors of the paper used a gaussian blur on top of the equivalent of a GL_POINTS render applied to the Points (ie re-using the depth buffer from the DepthOnly pass, not clearing it) to actually render their object. It's hard to say: the process is unfortunately a one line comment, and I'm unsure of how to duplicate it in WebGL anyway (a naive gaussian blur will just blur in the background pixels that weren't touched by the GL_POINTS call).
Instead, I'm hoping to do something slightly different, by rerendering the discs in a second pass instead as cones (center of disc becomes apex of cone, think "close the umbrella") and effectively computing a voronoi diagram on the surface of the object (ala redbook http://www.glprogramming.com/red/chapter14.html#name19). The idea is that an output pixel is the color value of the first disc to reach it when growing radiuses from 0 -> their natural size.
The crux of the problem is that only discs whose centers pass the depth test in the first pass should be allowed to carry on (as cones) to the 2nd pass. Because what's true at the disc center applies to the whole disc/cone, I believe this requires evaluating a depth test at a vertex or object level, and not at a fragment level.
Since WebGL support for accessing depth buffers is still poor, in my first pass I am packing depth info into an RGBA Framebuffer in a fragment shader. I then intended to use this in the vertex shader of the second pass via a sampler2D; any disc center that was closer than the relative texture2D() lookup would be allowed on to the second pass; otherwise I would hack "discarding" the vertex (its alpha would be set to 0 or some flag set that would cause discard of fragments associated with the disc/cone or etc).
This actually kind of worked but it caused horrendous z-fighting between discs that were close together (very small perturbations wildly changed which discs were visible). I believe there is some floating point error between depth->rgba->depth. More importantly, though, the depth texture is being set by fragment texel coords, but I'm looking up vertices, which almost certainly don't line up exactly on top of relevant texel coordinates; so I get depth +/- noise, essentially, and the noise is the issue. Adding or subtracting .000001 or something isn't sufficient: you trade Type I errors for Type II. My render became more accurate when I switched from NEAREST to LINEAR for the depth texture interpolation, but it still wasn't good enough.
How else can I determine which disc's centers would be visible in a given render, so that I can do a second vertex/fragment (or more) pass focused on objects associated with those points? Or: is there a better way to go about this in general?

How does multisample really work?

I am very interested in understanding how multisampling works. I have found a large literature on how to enable or use it, but very little information concerning what it really does in order to achieve an antialiased rendering. What I have found, in many places, is conflicting information that only confused me more.
Please note that I know how to enable and use multisampling (I actually already use it), what I don't know is what kind of data really gets into the multisampled renderbuffers/textures, and how this data is used in the rendering pipeline.
I can understand very well how supersampling works, but multisampling still has some obscure areas that I would like to understand.
Here is what the spec says (OpenGL 4.2):
Pixel sample values, including color, depth, and stencil values, are stored in this buffer (the multisample buffer). Samples contain separate color values for each fragment color.
...
During multisample rendering the contents of a pixel fragment are changed in two ways. First, each fragment includes a coverage value with SAMPLES bits.
...
Second, each fragment includes SAMPLES depth values and sets of associated data, instead of the single depth value and set of associated data that is maintained in single-sample rendering mode.
So, each sample contains a distinct color, coverage bit, and depth value. What's the difference from normal supersampling? It seems like a "weighted" supersampling to me, where each final pixel value is determined by the coverage of its samples rather than a simple average, but I'm very unsure about this. And what about texture coordinates at the sample level?
If I store, say, normals in a RGBF multisampled texture, will I read them back "antialiased" (that is, approaching 0) on the edges of a polygon?
A fragment shader is invoked once per fragment, unless it uses gl_SampleID or gl_SamplePosition, or has a variable declared with the 'sample' qualifier. How can a fragment shader be invoked only once per fragment and still produce an antialiased rendering?
OpenGL on Silicon Graphics Systems:
http://www-f9.ijs.si/~matevz/docs/007-2392-003/sgi_html/ch09.html#LE68984-PARENT
mentions: When you use multisampling and read back color, you get the resolved color value (that is, the average of the samples). When you read back stencil or depth, you typically get back a single sample value rather than the average. This sample value is typically the one closest to the center of the pixel.
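That resolve step is just an average. The sketch below (sample count and names are my own, and it tracks a single channel for brevity) shows why MSAA edges come out "antialiased" even though the shader ran once per fragment: the covered samples all receive the one shaded color, the uncovered samples keep the previous contents, and the resolve averages them.

```c
enum { SAMPLES = 4 };

/* The resolve described above: the final pixel color is the plain
 * average of the per-sample colors. */
static float resolve(const float sample_color[SAMPLES])
{
    float sum = 0.0f;
    for (int i = 0; i < SAMPLES; ++i)
        sum += sample_color[i];
    return sum / SAMPLES;
}
```

A fragment covering 3 of 4 samples writes its single shaded color (say 1.0) to those samples while one sample keeps the background (0.0), so the resolved pixel is 0.75, an intermediate edge value, even though the color was computed once.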
And there's this technical spec (1994) from the OpenGL site. It explains in full detail what is done if MULTISAMPLE_SGIS is enabled: http://opengl.org/registry/specs/SGIS/multisample.txt
See also this related question: How are depth values resolved in OpenGL textures when multisampling?
And the answers to this question, where GL_MULTISAMPLE_ARB is recommended: where is GL_MULTISAMPLE defined?. The specs for GL_MULTISAMPLE_ARB (2002) are here: http://www.opengl.org/registry/specs/ARB/multisample.txt

The advantages of using a Z-Buffer versus prioritising pixels according to depth

This is a bit more of an academic question. I am preparing for an exam and I'm just trying to truly understand this concept.
Allow me to explain somewhat the context. The issue at hand is hiding objects (or more specifically polygons) behind each other when drawing to the screen. A calculation needs to be done to decide which one gets drawn last and therefore to the forefront.
In a lecture the other day my professor stated that prioritising pixels in terms of their depth value is computationally inefficient. He then gave us a short explanation of Z-buffers and how they test the depth values of incoming pixels against the depth values already in the buffer. How is this any different from 'prioritising pixels in terms of their depth'?
Thanks!
Deciding which polygon a fragment belongs to is computationally expensive, because it would require finding the closest polygon for every single pixel (and having the entire geometry information available during pixel shading!).
It is easy, almost trivial, to sort entire objects, each consisting of many triangles (a polygon is no more than one or several triangles), according to their depth. This, however, is only a rough approximation: nearby objects will overlap and produce artefacts, so something needs to be done to make it pixel-perfect.
This is where the z buffer comes in. If it turns out that a fragment's calculated depth is greater than what's already stored in the z-buffer, this means the fragment is "behind something", so it is discarded. Otherwise, the fragment is written to the color buffer and the depth value is written to the z-buffer. Of course that means that when 20 triangles are behind each other, then the same pixel will be shaded 19 times in vain. Alas, bad luck.
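In code, the per-fragment test described above is nothing more than the following (names are mine; a GL_LESS-style comparison is assumed):

```c
/* One screen pixel: its current z-buffer and color-buffer entries. */
typedef struct { float depth; unsigned color; } Pixel;

/* Returns 1 if the fragment survived the depth test and was written,
 * 0 if it was behind something and discarded. */
static int depth_test(Pixel *px, float frag_depth, unsigned frag_color)
{
    if (frag_depth >= px->depth)
        return 0;                  /* behind something: discarded */
    px->depth = frag_depth;        /* update z-buffer */
    px->color = frag_color;        /* update color buffer */
    return 1;
}
```

Note there is no sorting anywhere: each fragment only compares against the single stored value, which is exactly why this is cheaper than keeping all fragments and prioritising them.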
Modern graphics hardware addresses this by doing the z test before actually shading a pixel, according to the interpolated depth of the triangle's vertices (this optimization is obviously not possible if per-pixel depth is calculated).
Also, they employ conservative (sometimes hierarchical, sometimes just tiled) optimizations which discard entire groups of fragments quickly. For this, the z-buffer holds some additional (unknown to you) information, such as for example the maximum depth rendered to a 64x64 rectangular area. With this information, it can immediately discard any fragments in this screen area which are greater than that, without actually looking at the stored depths, and it can fully discard any fragments belonging to a triangle of which all vertices have a greater depth. Because, obviously, there is no way that any of it could be visible.
Those are implementation details, and very platform specific, though.
EDIT: Though this is probably obvious, I'm not sure if I made that point clear enough: When sorting to exploit z-culling, you would do the exact opposite of what you do with painter's algorithm. You want the closest things drawn first (roughly, does not have to be 100% precise), so instead of determining a pixel's final color in the sense of "last man standing", you have it in the sense of "first come, first served, and only one served".
The first thing you need to understand is what your professor meant by 'prioritising pixels in terms of their depth'. My guess is that it means storing all the fragments requested for a given screen pixel and then producing the final color by choosing the closest one. That's inefficient because it requires keeping every fragment around, whereas a Z-buffer lets us store just a single value per pixel.