Why are depth buffers faster than depth textures? - opengl

This tutorial on shadow-mapping in OpenGL briefly mentions the difference between using a depth buffer and a depth texture (edit: to store per-pixel depth information for depth testing or other purposes, such as shadow-mapping) by stating:
Depth texture. Slower than a depth buffer, but you can sample it later in your shader
However, this got me wondering why this is so. After all, both seem to be nothing more than a two-dimensional array containing some data, and Microsoft's notes on graphics define them in very similar terms. (As pointed out in a comment, these notes are not about OpenGL but about another graphics engine. Still, the purpose of depth buffers and depth textures seems to be quite similar, and I have not found an equivalent description of the two for OpenGL, so I have decided to keep these articles. If someone has a link to an article describing depth buffers and depth textures in OpenGL, you are welcome to post it in the comments.)
A depth buffer contains per-pixel floating-point data for the z depth of each pixel rendered.
and
A depth texture, also known as a shadow map, is a texture that contains the data from the depth buffer for a particular scene
Of course, there are a few differences between the two methods -- notably, the depth texture can be sampled later, unlike the depth buffer.
Despite these differences, I still cannot see why the depth buffer should be faster to use than a depth texture. My question is therefore: why can't these two methods of storing the same data be equally fast (edit: when used for storing depth data for depth testing)?

By "depth buffer", I will assume you mean "renderbuffer with a depth format".
Possible reasons why a depth renderbuffer might be faster to render to than a depth texture include:
1. A depth renderbuffer can live within specialized memory that is not shader-accessible, since the implementation knows that you can't access it from the shader.
2. A depth renderbuffer might be able to have a special format or layout that a depth texture cannot have, since the texture has to be shader-accessible. This could include things like Hi-Z/Hierarchical-Z and so forth.
#1 tends to crop up on tile-based architectures. If you do things right, you can keep your depth renderbuffer entirely within tile memory. That means that, after a rendering operation, there is no need to copy it out to main memory. By contrast, with a depth texture, the implementation can't be sure you don't need to copy it out, so it has to do so just to be safe.
Note that this list is purely speculative. Unless you've actually profiled it, or have some specific knowledge of hardware (as in the TBR case), there's no reason to assume that there is any substantial difference in performance.
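To make the Hi-Z idea mentioned above a little more concrete, here is a CPU-side sketch. It is only a model and does not describe any particular GPU. It assumes a GL_LESS-style depth test and a hypothetical layout in which every 8x8 tile caches the farthest depth it currently contains, so that a whole block of incoming fragments can be rejected with a single comparison. The names HiZBuffer, kTile and canRejectBlock are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size; real hardware picks its own.
constexpr int kTile = 8;

struct HiZBuffer {
    int width, height;            // size in pixels, assumed multiples of kTile
    std::vector<float> depth;     // per-pixel depth, 1.0 = far plane
    std::vector<float> tileMax;   // farthest depth currently stored in each tile

    HiZBuffer(int w, int h)
        : width(w), height(h),
          depth(static_cast<std::size_t>(w) * h, 1.0f),
          tileMax(static_cast<std::size_t>(w / kTile) * (h / kTile), 1.0f) {
        assert(w % kTile == 0 && h % kTile == 0);
    }

    std::size_t tileIndex(int x, int y) const {
        return static_cast<std::size_t>(y / kTile) * (width / kTile) + (x / kTile);
    }

    // Per-pixel GL_LESS test with depth write; returns true if the fragment passed.
    bool testAndWrite(int x, int y, float z) {
        float& stored = depth[static_cast<std::size_t>(y) * width + x];
        if (!(z < stored)) return false;
        stored = z;
        refreshTile(x, y);
        return true;
    }

    // Coarse test: if the nearest depth of an incoming block is not nearer than
    // the farthest depth in the tile, every fragment in it would fail GL_LESS.
    bool canRejectBlock(int x, int y, float blockMinZ) const {
        return blockMinZ >= tileMax[tileIndex(x, y)];
    }

private:
    // Recompute the cached maximum for the tile containing (x, y).
    void refreshTile(int x, int y) {
        const int x0 = (x / kTile) * kTile;
        const int y0 = (y / kTile) * kTile;
        float m = 0.0f;
        for (int j = y0; j < y0 + kTile; ++j)
            for (int i = x0; i < x0 + kTile; ++i)
                m = std::max(m, depth[static_cast<std::size_t>(j) * width + i]);
        tileMax[tileIndex(x, y)] = m;
    }
};
```

The extra per-tile array is exactly the kind of side data that is easy to keep in sync when only the fixed-function depth path touches the buffer, which is one plausible reason an implementation might treat a renderbuffer differently from a sampleable texture.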

Related

Optimize for expensive fragment shader

I'm rendering multiple layers of flat triangles with a raytracer in the fragment shader. The upper layers have holes, and I'm looking for a way to avoid running the shader for pixels that are already filled by one of the upper layers, i.e. I want to render only the parts of the lower layers that lie in the holes of the upper layers. Of course, whether there is a hole is not known until the fragment shader has done its work for a layer.
As far as I understand, I cannot use early depth testing because there, the depth values are interpolated between the vertices and do not come from the fragment shader. Is there a way to "emulate" that behavior?
The best way to solve this issue is to not use layers. You are only using layers because of the limitations of using a 3D texture to store your scene data. So... don't do that.
SSBOs and buffer textures (if your hardware is too old for SSBOs) can access more memory than a 3D texture. And you could even employ manual swizzling of the data to improve cache locality if that is viable.
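The answer does not say which swizzling scheme to use. One common choice, shown here purely as an assumption, is Morton (Z-order) indexing, which keeps voxels that are close in 3D close in the linear buffer as well. The function names spreadBits3 and mortonIndex3D are hypothetical, and the sketch is limited to coordinates below 1024 so that the index fits in 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Spread the low 10 bits of v so that two zero bits separate each original bit.
inline std::uint32_t spreadBits3(std::uint32_t v) {
    v &= 0x000003FFu;
    v = (v | (v << 16)) & 0xFF0000FFu;
    v = (v | (v << 8))  & 0x0300F00Fu;
    v = (v | (v << 4))  & 0x030C30C3u;
    v = (v | (v << 2))  & 0x09249249u;
    return v;
}

// Morton (Z-order) index of the voxel at (x, y, z); each coordinate must be < 1024.
// Bits are interleaved as ... z1 y1 x1 z0 y0 x0.
inline std::uint32_t mortonIndex3D(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    assert(x < 1024u && y < 1024u && z < 1024u);
    return spreadBits3(x) | (spreadBits3(y) << 1) | (spreadBits3(z) << 2);
}
```

The same bit interleaving would have to be repeated in the shader when it looks up a voxel in the SSBO or buffer texture, so whether the better locality pays for the extra index math is something to measure.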
As far as I understand, I cannot use early depth testing because there, the depth values are interpolated between the vertices and do not come from the fragment shader.
This is correct insofar as you cannot use early depth tests, but it is incorrect as to why.
The "depth" provided by the vertex shader (VS) doesn't need to be the depth of the actual fragment. You are rendering your scene in layers, presumably with each layer being a full-screen quad. By definition, everything in one layer of rendering is beneath everything in a higher layer. So the absolute depth value doesn't matter; what matters is whether there is something from a higher layer over this fragment.
So each layer could get its own depth value, with lower layers getting a lower depth value. The exact value is arbitrary and irrelevant; what matters is that higher layers have higher values.
The reason this doesn't work is this: if your raytracing algorithm detects a miss within a layer (a "hole"), you must discard that fragment. And the use of discard at all turns off early depth testing in most hardware, since the depth testing logic is usually tied to the depth writing logic (it is an atomic read/conditional-modify/conditional-write).
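The interaction between discard and a combined test-and-write can be illustrated with a one-pixel CPU model. This is a simplification under stated assumptions, not how any specific GPU is built: higher layers get higher depth values, the buffer is cleared to 0, and a fragment passes when its depth is greater than the stored one. The types Pixel and LayerFragment and both functions are hypothetical.

```cpp
#include <cassert>

struct Pixel {
    float depth = 0.0f;   // depth buffer value, cleared to 0
    int   color = 0;      // 0 = nothing drawn, otherwise the id of the layer drawn
};

struct LayerFragment {
    int   layerId;
    float depth;          // per-layer depth; higher layer = higher value (assumption)
    bool  isHole;         // what the raytracing shader would find (a miss)
};

// Depth test and depth write happen together BEFORE the shader runs.
void shadeEarlyZ(Pixel& p, const LayerFragment& f) {
    if (!(f.depth > p.depth)) return;   // rejected without running the shader
    p.depth = f.depth;                  // the write is tied to the test
    if (f.isHole) return;               // discard comes too late: depth is already written
    p.color = f.layerId;
}

// The shader runs first; the test and write happen afterwards.
void shadeLateZ(Pixel& p, const LayerFragment& f) {
    if (f.isHole) return;               // discard: no test, no write
    if (!(f.depth > p.depth)) return;
    p.depth = f.depth;
    p.color = f.layerId;
}
```

Drawing a top layer with a hole and then a solid bottom layer leaves the pixel empty in the early model, because the hole has already claimed the depth value, while the late model shows the bottom layer. That wrong result is what an implementation avoids by falling back to late testing when discard is present.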

Is it possible to depth test against a depth texture I am also sampling, in the same draw call?

Context:
I am using a deferred rendering setup, where in the first stage I have two FBOs: one is the GBuffer, for storing the normals, albedo, and material information for all visible fragments. This FBO has a 32-bit depth texture. This gets drawn into in a geometry pass, before any lighting is calculated.
The second FBO is color-only, and starts off black, but accumulates lighting over several passes, from lighting shaders that sample from the GBuffer and write to the color-only buffer using additive blending.
The problem is, I would really like to utilize early depth testing in order to have my lighting ONLY calculate for fragments that contain actual geometry (not just sky). The best way I can think of to do this is to use depth testing to fail any pixels that have a depth of one in the case of sunlight, or to fail any pixels that lie behind the sphere of influence for point lights. However, I don't think I can bind this depth texture to my color FBO, since I also sample from it inside the lighting shader to calculate the fragment's position in world-space.
So my question is: Is there a way to use the same depth texture for both the early depth test, and for sampling inside the shader? Or if not, is there some other (reasonably performant) way of rejecting pixels that don't have geometry in them? I will not be writing to this depth texture at all in my lighting pass.
I only have to target modern graphics hardware on PCs (so I can use any common extensions, or OpenGL 4.6 features).
There are rules in OpenGL about reading from data in a shader that's also being updated due to a framebuffer operation. Those rules used to be quite strict. Indeed, pre-GL 4.4, the rules were so strict that what you're trying to do was actually undefined behavior. That is, if an image from a texture was attached to the rendering FBO, and you took a sample from that texture in a way such that it was at all possible to be reading from the attached image, you got undefined behavior. Never mind if your write mask meant that no writing happened; it was UB.
Fortunately, it's well-defined now. You only get UB if you're doing an actual write, not merely because you have an image attached to the FBO. And by "now," I mean basically any hardware made in the last 10 years. While ARB_texture_barrier and GL 4.5 are fairly recent, their predecessor NV_texture_barrier is actually quite old. And despite being an NVIDIA extension by name, it was so widely implemented that it is even available on macOS implementations.
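Since the lighting pass only tests against the depth attachment and never writes to it, the sunlight case from the question can be modeled on the CPU as follows. The sketch assumes a full-screen quad drawn at the far plane (depth 1.0) with a GL_GREATER-style comparison and depth writes disabled, which in GL terms would correspond to glDepthFunc(GL_GREATER) and glDepthMask(GL_FALSE). The function name countLitPixels is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many pixels the lighting shader would run on when a full-screen
// quad at depth 1.0 is tested with "fragment depth > stored depth".
// Pixels still at the cleared value 1.0 (sky) fail the test.
std::size_t countLitPixels(const std::vector<float>& gbufferDepth) {
    const float quadDepth = 1.0f;
    std::size_t shaded = 0;
    for (float stored : gbufferDepth) {
        if (quadDepth > stored) ++shaded;   // test only; the buffer is never modified
    }
    return shaded;
}
```

The input vector is taken by const reference on purpose: it mirrors the condition from the answer that the attached depth image is read and tested but not written during the pass.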

Depth-fighting solution using custom depth testing

The core of my problem is that I am having trouble with depth-fighting in pure OpenGL.
I have two identical geometries, but one is simpler than the other.
That forms a set of perfectly coplanar polygons, and I want to display the complex geometry on top of the simpler geometry.
Unsurprisingly, this leads to depth-fighting when I draw the two sets of triangles sequentially using the OpenGL depth buffer. At the moment, I've patched it using glPolygonOffset, but this solution is not suitable for me (I want the polygons to be exactly coplanar).
My idea is to temporarily use a custom depth test when drawing the second set of triangles. I would like to save the depth of the fragments during the rendering of the first set. Next, I would use glDepthFunc(GL_ALWAYS) to disable the depth test (while still writing to the depth buffer). When rendering the second set, I would discard fragments that have a greater z than the depth I just saved, but with a certain allowance (at least one unit of Z-buffer precision at that specific z, I guess). Then I would reset the depth function to GL_LEQUAL.
Actually, I just want to force a certain allowance for the depth test.
Is this a possible approach?
The problem is that I have no idea how to pass information (custom depth buffer) from one program to another.
Thanks
PS : I also looked into Frame Buffer Objects and Deferred Rendering because apparently it allows passing information via a 'G-buffer', but once I write:
unsigned int gBuffer;
glGenFramebuffers(1, &gBuffer);
glBindFramebuffer(GL_FRAMEBUFFER, gBuffer);
My window goes black... Sorry if this is obvious; I'm not yet familiar with OpenGL.
As Rabbid76 said, I can simply disable depth writing using glDepthMask(GL_FALSE).
Now, I can draw several layers of coplanar polygons using the same offset.
Solved.
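Why disabling depth writes lets several coplanar layers share one offset can be seen in a one-pixel CPU model. It assumes a GL_LEQUAL-style test and treats the polygon offset as a fixed bias subtracted from the fragment depth. DepthPixel and drawFragment are hypothetical names.

```cpp
#include <cassert>

struct DepthPixel {
    float depth = 1.0f;     // cleared to the far plane
};

// GL_LEQUAL-style test. The new depth is stored only when writeEnabled is true,
// which is the role glDepthMask plays. Returns true if the fragment is drawn.
bool drawFragment(DepthPixel& p, float z, bool writeEnabled) {
    if (!(z <= p.depth)) return false;
    if (writeEnabled) p.depth = z;
    return true;
}
```

With writes disabled, every overlay layer is compared against the depth of the base geometry, so the same offset keeps working for each of them. With writes enabled, the first overlay moves the stored depth forward, and the next layer has to beat that new value.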

Is the stencil buffer still relevant in modern OpenGL?

A friend and I have been having an ongoing argument about the stencil buffer. In short, I haven't been able to find a situation where the stencil buffer would provide any advantage over the programmable pipeline tools in OpenGL 3.2+. Are there any uses for the stencil buffer in modern OpenGL?
[EDIT]
Thanks everyone for all the inputs on the subject.
It is more useful than ever since you can sample stencil index textures from fragment shaders. It should not even be argued that the stencil buffer is not part of the programmable pipeline.
The depth buffer is used for simple pass/fail fragment rejection, which the stencil buffer can also do as suggested in comments. However, the stencil buffer can also accumulate information about test results over multiple passes. All sorts of logic and counting applications exist such as measuring a scene's depth complexity, constructive solid geometry, etc.
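As a small illustration of the depth complexity example, the following CPU sketch models an 8-bit stencil buffer whose stencil operation is increment-and-clamp (GL_INCR) for every fragment. The Rect type, the use of axis-aligned rectangles in place of real triangles, and the function name depthComplexity are simplifications made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rect { int x0, y0, x1, y1; };   // half-open pixel rectangle [x0,x1) x [y0,y1)

// Counts, per pixel, how many draws covered it. Each rectangle stands in for
// one draw call; all rectangles are assumed to lie inside the buffer.
std::vector<std::uint8_t> depthComplexity(int width, int height,
                                          const std::vector<Rect>& draws) {
    std::vector<std::uint8_t> stencil(static_cast<std::size_t>(width) * height, 0);
    for (const Rect& r : draws) {
        assert(r.x0 >= 0 && r.y0 >= 0 && r.x1 <= width && r.y1 <= height);
        for (int y = r.y0; y < r.y1; ++y) {
            for (int x = r.x0; x < r.x1; ++x) {
                std::uint8_t& s = stencil[static_cast<std::size_t>(y) * width + x];
                if (s < 255) ++s;       // GL_INCR clamps at the maximum value
            }
        }
    }
    return stencil;
}
```

The resulting values accumulate across draws, which is the property a plain pass/fail depth test does not have.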
To add a recent example to Andon's answer, GTA V uses the stencil buffer kinda like an ID buffer to mark the player character, cars, vegetation etc.
It subsequently uses the stencil buffer to e.g. apply subsurface scattering only to the character or exclude him from motion blur.
See the GTA V Graphics Study (highly recommended, it's a great read!)
Edit: sure, you can do this in software. But you can do rasterization or tessellation in software just as well... In the end it's about performance, I guess. With depth24stencil8 you have a nice hardware-supported format, and the stencil test is most likely faster than doing discards in the fragment shader.
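The ID-buffer usage boils down to the stencil comparison itself. Here is a minimal CPU sketch of the GL_EQUAL and GL_NOTEQUAL stencil functions, which compare (ref & mask) with (stored & mask). The ID values in StencilId are invented for illustration and are not the values any particular game uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical object IDs written into the stencil buffer during the geometry pass.
enum StencilId : std::uint8_t {
    kSky        = 0x00,
    kCharacter  = 0x01,
    kVehicle    = 0x02,
    kVegetation = 0x03,
};

// GL_EQUAL: the fragment passes when the masked reference equals the masked stored value.
constexpr bool stencilEqual(std::uint8_t ref, std::uint8_t stored,
                            std::uint8_t mask = 0xFF) {
    return (ref & mask) == (stored & mask);
}

// GL_NOTEQUAL: passes everywhere except on pixels carrying the reference ID.
constexpr bool stencilNotEqual(std::uint8_t ref, std::uint8_t stored,
                               std::uint8_t mask = 0xFF) {
    return (ref & mask) != (stored & mask);
}
```

A pass that should touch only the character would use the equal test with kCharacter as the reference, while a pass that should skip the character (motion blur in the example above) would use the not-equal test with the same reference.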
Just to provide one other use case, shadow volumes (aka "stencil shadows") are still very relevant: https://en.wikipedia.org/wiki/Shadow_volume
They're useful for indoor scenes where shadows are supposed to be pixel perfect, and you're less likely to have alpha-tested foliage messing up the extruded shadow volumes.
It's true that shadow maps are more common, but I suspect that stencil shadows will have a comeback once the brain dead Creative/3DLabs patent expires on the zfail method.

Writing to depth buffer from OpenGL compute shader

Generally, on modern desktop OpenGL hardware, what is the best way to fill a depth buffer from a compute shader and then use that depth buffer for graphics pipeline rendering with triangles etc.?
Specifically, I am wondering about concerns regarding Hi-Z. I also wonder whether it's better to do compute shader modifications to the depth buffer before or after the graphics rendering.
If the compute shader is run after the graphics rendering, I assume the depth buffer will typically be decompressed behind the scenes. But I worry that, done the other way around, the depth buffer may be in a decompressed/non-optimal state for the graphics pipeline.
As far as I know, you cannot bind textures with any of the depth formats as images, and thus cannot write to depth format textures in compute shaders. See the glBindImageTexture documentation; it lists the formats that your texture format must be compatible with. Depth formats are not among them, and the specification says the depth formats are not compatible with the normal formats.
Texture copying functions have the same compatibility restrictions, so you can't even e.g. write to a normal texture in the compute shader and then copy to a depth texture. glCopyImageSubData does not explicitly have that restriction but I haven't tried it and it's not part of the core profile anymore.
What might work is writing to a normal texture, then rendering a fullscreen triangle and setting gl_FragDepth to values read from the texture, but that's an additional fullscreen pass.
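For the full-screen pass, one common trick (an assumption here, not something the answer specifies) is to derive the three clip-space positions from the vertex index, so that no vertex buffer is needed. In a vertex shader the index would come from gl_VertexID, and the fragment shader would assign the value read from the texture to gl_FragDepth. The CPU sketch below only shows the position math; Vec2 and both function names are hypothetical.

```cpp
#include <cassert>

struct Vec2 { float x, y; };

// Clip-space position for vertex 0, 1 or 2 of a single triangle that covers
// the whole [-1, 1] x [-1, 1] viewport: (-1,-1), (3,-1), (-1,3).
Vec2 fullscreenTriangleVertex(int vertexId) {
    assert(vertexId >= 0 && vertexId < 3);
    return { static_cast<float>((vertexId & 1) << 2) - 1.0f,
             static_cast<float>((vertexId & 2) << 1) - 1.0f };
}

// True if (x, y) lies inside or on the edge of that triangle. The three edges
// are x = -1, y = -1 and x + y = 2.
bool fullscreenTriangleCovers(float x, float y) {
    return x >= -1.0f && y >= -1.0f && x + y <= 2.0f;
}
```

All four viewport corners fall inside the triangle, so every pixel gets exactly one fragment and there is no diagonal seam as there would be with two triangles.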
I don't quite understand your second question - if your compute shader stuff modifies the depth buffer, the result will most likely be different depending on whether you do it before or after regular rendering because different parts will be visible or occluded.
But maybe that question is moot since it seems you cannot manually write into depth buffers at all - which might also answer your third question - by not writing into depth buffers you cannot mess with their compression :)
Please note that I'm no expert in this; I had a similar problem and looked at the docs/spec myself, so this all might be wrong :) Please let me know if you manage to write to depth buffers with compute shaders!