Accessing 32-bit Depth Buffer from fragment shader? - opengl

I'm trying to do the following technique for shadowing; I read about it on the NVIDIA site and it seemed like a good technique. I would prefer it to calculating shadow volumes on the CPU because it seems more 'true', and I could use it for soft shadowing:
1st pass:
Fill depth buffer from perspective of LIGHT0. Copy this depth buffer for second pass. (*)
2nd pass:
Render view from EYE, and for each fragment:
Get the XY location in the depth buffer stored in (*). Get the corresponding 32-bit value.
Calculate the distance to the light.
Compare this distance to the stored depth buffer value.
If it is larger, the fragment is drawn in glDisable(LIGHT0) mode; otherwise it is drawn with the light enabled. For this purpose I use two fragment shaders, and fragments blend/switch between the two according to the result of the distance comparison.
Now, I'd want to do the last steps in the fragment shader, for several reasons. One of them is that I want to take the distance into account for the 'effect' of the shadowing. In my game, if the distance to the obstructing object is small, it is safe to say that the shadow will be very 'strict'. If it is further away, global illumination kicks in more and the shadow is slighter. This works because it's a card game; it would not be the case for more complicated 'concave' shapes.
But I'm new to OpenGL and I don't understand how to do any of the following:
How to access that 1st pass depth buffer in the fragment shader without copying it to a 2d texture. I assume that is not possible?
Whether copying the 32-bit depth buffer to a texture that has 8 bits in each R, G, B, A component and then re-assembling that value in the fragment shader is the most efficient thing I can do.
If there are cross-platform extensions I could use for this.
Thanks if someone can help me or give me some more ideas; I'm kind of stumped right now, and my lack of good hardware and free time makes debugging and trying everything out an exhausting process.

The first way is to use an FBO with a GL_DEPTH_COMPONENT texture attached at GL_DEPTH_ATTACHMENT.
The second way is to use glCopyTexImage2D, again with a GL_DEPTH_COMPONENT texture.
FBOs are cross-platform and available in almost every modern OpenGL implementation, so you should have them available to you.
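A minimal sketch of the FBO route (the names shadowFBO, shadowDepthTex and SHADOW_SIZE are placeholders, and this assumes a context with framebuffer objects available):

    /* Create a depth texture for the light pass to render into. */
    GLuint shadowDepthTex, shadowFBO;
    glGenTextures(1, &shadowDepthTex);
    glBindTexture(GL_TEXTURE_2D, shadowDepthTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, SHADOW_SIZE, SHADOW_SIZE,
                 0, GL_DEPTH_COMPONENT, GL_FLOAT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    /* Attach it as the depth attachment of an FBO; no color buffer is needed. */
    glGenFramebuffers(1, &shadowFBO);
    glBindFramebuffer(GL_FRAMEBUFFER, shadowFBO);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                           GL_TEXTURE_2D, shadowDepthTex, 0);
    glDrawBuffer(GL_NONE);   /* depth-only pass, no color output */
    glReadBuffer(GL_NONE);

    /* 1st pass: render from LIGHT0's point of view into shadowFBO.        */
    /* 2nd pass: bind shadowDepthTex as a sampler and compare in a shader. */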

You are right: you need to create a 2D texture from the depth buffer values in order to use those values in the 2nd pass.
Concerning the texture itself, I think that copying from a 32-bit depth buffer to 8-bit RGBA will not cast the data the way you might expect: for a mid-range depth value (say 0x80000000), you will get a half tone on R, G, B and A in your RGBA texture:
RGBA[0] = 0x80;
RGBA[1] = 0x80;
RGBA[2] = 0x80;
RGBA[3] = 0x80;
Where you would have expected (a cast):
RGBA[0] = 0x80;
RGBA[1] = 0;
RGBA[2] = 0;
RGBA[3] = 0;
So I am not sure about the right format, but I would suggest not converting the data during the copy, since you don't want the conversion overhead.
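If you do end up packing the depth into an RGBA8 texture (for example by writing it out from a shader in the first pass), the usual pack/unpack pair looks roughly like this in GLSL; treat it as a sketch, since the exact weights depend on how the packing is done:

    // Pack a depth value in [0,1) into four 8-bit channels, and undo it again.
    vec4 packDepth(float depth) {
        vec4 enc = fract(depth * vec4(1.0, 255.0, 65025.0, 16581375.0));
        enc -= enc.yzww * vec4(1.0/255.0, 1.0/255.0, 1.0/255.0, 0.0);
        return enc;
    }

    float unpackDepth(vec4 enc) {
        return dot(enc, vec4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0));
    }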

Related

How to write integers alongside pixels in the framebuffer, and then use the written integer to ignore the depth buffer

What I want to do
I want to have one set of triangles bleed through, or rather ignore the depth buffer, for another set of triangles, but only if they have the same number.
Problem (optional reading)
I do not know how to do this without introducing a ton of bubbles into the pipeline. Right now I have very high throughput because I can throw my geometry onto the GPU, tell it to render, and forget about it. However, if I have to keep toggling state while drawing, I'm worried I'm going to tank my performance. Other people who have done what I've just described (doing a ton of draw calls and state changes) have much worse performance than me. This performance hit is also significantly worse on older hardware, where we are talking about a 50-100+ times performance loss by doing it the state-change way.
Unfortunately this triangle bleeding scenario happens many thousands of times, so the state machine would be flooded with "draw triangles, depth off, draw triangles that bleed through, depth on, ...", repeated N times, where N can get large (N >= 1000).
A good way of imagining this is having a set of triangles T_i, and a set of triangles that bleed through B_i where B_i only bleeds through T_i, and i ranges from 0...1000+. Note that if we are drawing B_100, then it should only bleed through T_100, not T_99 or T_101.
My next thought is to draw all the triangles into one framebuffer (along with their integer), then draw the bleed-through triangles into another framebuffer (also with their integer), and then merge these framebuffers together. I figure they will have the color, depth, and the integer, so I can hopefully merge them in the fragment shader.
Problem is, I have no idea how to write an integer alongside the out vec4 fragColor in the fragment shader.
Questions (and in short)
This leaves me with two questions:
How do I write an integer into a framebuffer? Do I need to write to 4 separate texture framebuffers? (like one color/depth framebuffer texture, another integer framebuffer texture, and then double this so I can merge the pairs of framebuffers together at some point?)
To make this more clear, the algorithm would look like:
1. Render all the 'could be bled from' triangles, described above as set T_i; write colors and depth info into FB1, and write integers into FB2.
2. Render all the 'bleeding' triangles, described above as set B_i; write colors and depth into FB3, and write integers into FB4.
3. Bind the textures for FB1, FB2, FB3, FB4.
4. Render each pixel by sampling the RGBA, depth, and integers from the appropriate textures and write those out into the final framebuffer.
I would need to access the color and depth from the textures in the shader. I would also need to access the integer from the other texture. Then I can do the comparison and choose which pixel to write to the default framebuffer.
Is this idea possible? I assume that if (1) is, then the answer is yes. Maybe another question could be whether there's a better way. I tried thinking of doing this with the stencil buffer, but had no luck.
What you want is theoretically possible, but I can't speak as to its performance. You'll be reading and writing a whole lot of texels in a lot of textures for every program iteration.
Anyway, to answer your questions:
A framebuffer can have multiple color attachments by using glFramebufferTexture2D with GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, etc. Each texture can then have its own internal format; in your example you probably want a regular RGBA texture for your color output, and a second single-channel integer texture (GL_R32UI, for example) for your integer.
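As a rough sketch (fbo, colorTex, idTex, width and height are made-up names), setting up such an FBO could look like:

    GLuint fbo, colorTex, idTex;

    /* Color output: an ordinary RGBA8 texture. */
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);

    /* Integer output: one 32-bit unsigned integer per pixel.
       Integer textures must use NEAREST filtering. */
    glGenTextures(1, &idTex);
    glBindTexture(GL_TEXTURE_2D, idTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_R32UI, width, height, 0,
                 GL_RED_INTEGER, GL_UNSIGNED_INT, NULL);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, idTex, 0);

    /* Map fragment shader output 0 to attachment 0 and output 1 to attachment 1. */
    const GLenum bufs[2] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1 };
    glDrawBuffers(2, bufs);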
Your depth buffer is more complicated, because you don't want to let OpenGL handle it as normal. If you want to take over the depth buffer, you probably want to attach it as yet another texture, a float one, that you can check your screen-space fragments against (or not).
If you have doubts about your shader, remember that you can bind any number of textures as input samplers to your program, and each color attachment gets its own output value (your shader runs per fragment, and outputs one value per attachment). Make sure the format of each output is correct, i.e. vec3/vec4 for the color buffer, an int/uint for your integer buffer and a float for the float buffer.
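A corresponding fragment shader sketch (GLSL 3.30; the identifiers are invented for illustration), with one output per color attachment:

    #version 330 core

    flat in uint vTriangleId;   // the integer tag, passed through from the vertex shader
    in vec3 vColor;

    layout(location = 0) out vec4 outColor;  // written to GL_COLOR_ATTACHMENT0
    layout(location = 1) out uint outId;     // written to GL_COLOR_ATTACHMENT1 (GL_R32UI)

    void main()
    {
        outColor = vec4(vColor, 1.0);
        outId    = vTriangleId;
    }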
And stencil buffers won't help you turn depth checking on or off within a single (possibly indirect) draw call. I can't quite visualize what your bleeding thing means, but the stencil buffer can probably help with some of it? Maybe? But definitely not with conditional depth checking.

OpenGL trim/inline contour of stencil

I have created a shape in my stencil buffer (black in the picture below). Now I would like to render to the backbuffer. I would like one texture on the outer pixels (say 4 pixels) of my stencil (red), and another texture on the remaining pixels (red).
I have read several solutions that involve scaling, but that will not work when there is no obvious center of the shape.
How do I acquire the desired effect?
The stencil buffer works great for doing operations on the specific fragments being overlaid onto them. However, it's not so great for doing operations that require looking at pixels other than the one corresponding to the fragment being rendered. In order to do outlining, you have to ask about the values of neighboring pixels, which stencil operations don't allow.
So, if it is possible to put the stencil data you want to test against in a non-stencil format image (ie: a color image, maybe with an integer texture format), that would make things much simpler. You can do the effect of stencil discarding by using discard directly in the fragment shader. Since you can fetch arbitrarily from the texture (as long as you're not trying to modify it), you can fetch neighboring pixels and test their values. You can use that to identify when a fragment is near a border.
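For example, assuming the shape is available as a single-channel mask texture (maskTex, borderTex, innerTex, texelSize and uv are made-up names), a GLSL 1.20 fragment shader could detect a 4-pixel border roughly like this:

    uniform sampler2D maskTex;     // 1.0 inside the shape, 0.0 outside
    uniform sampler2D borderTex;   // texture for the outer ring of pixels
    uniform sampler2D innerTex;    // texture for the interior
    uniform vec2 texelSize;        // 1.0 / mask resolution

    varying vec2 uv;

    void main()
    {
        if (texture2D(maskTex, uv).r < 0.5)
            discard;                            // outside the shape entirely

        // If any sample within 4 pixels falls outside the mask,
        // this fragment belongs to the outer ring (brute-force 9x9 taps).
        bool nearEdge = false;
        for (int x = -4; x <= 4; ++x)
            for (int y = -4; y <= 4; ++y)
                if (texture2D(maskTex, uv + vec2(x, y) * texelSize).r < 0.5)
                    nearEdge = true;

        gl_FragColor = nearEdge ? texture2D(borderTex, uv) : texture2D(innerTex, uv);
    }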
However, if you're relying on specialized stencil operations to build the stencil data itself (like bitwise operations), then that's more complicated. You will have to employ stencil texturing operations, so you're going to have to render to an FBO texture that has a depth/stencil format. And you'll have to set it up to allow you to read from the stencil aspect of the texture. This is an OpenGL 4.3 feature.
This effectively converts it into an 8-bit unsigned integer texture. That allows you to play whatever games you need to. But if you want to use stencil tests to discard fragments, you will also need texture barrier functionality to allow you to read from an image that's attached to the current FBO. But you don't need to actually use the barrier, since you should mask off stencil writing. You just need GL 4.5 or the NV/ARB_texture_barrier extension to be available, which they widely are.
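Setting up the stencil-texturing read, roughly (GL 4.3+; depthStencilTex, width and height are placeholder names):

    /* The depth/stencil texture that is attached to the FBO. */
    glBindTexture(GL_TEXTURE_2D, depthStencilTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH24_STENCIL8, width, height, 0,
                 GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, NULL);

    /* Make samplers return the stencil index rather than the depth value
       (GL 4.3 / ARB_stencil_texturing). Declare the sampler as a usampler2D
       in GLSL; it then reads back values in the 0..255 range. */
    glTexParameteri(GL_TEXTURE_2D, GL_DEPTH_STENCIL_TEXTURE_MODE, GL_STENCIL_INDEX);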
Either way this happens, the biggest difficulty is going to be varying the size of the border. It is easy to just test the neighboring 9 pixels to see if it is at a border. But the larger the border size, the larger the area of pixels each fragment has to test. At that point, I would suggest trying to look for a different solution, one that is based on some knowledge of what pattern is being written into the stencil buffer.
That is, if the rendering operation that lays down the stencil has some knowledge of the shape, then it could compute a distance to the edge of the shape in some way. This might require constructing the geometry in a way that it has distance information in it.

Mimic OpenGL texture mapping on CPU for reprojection

I'm trying to code a texture reprojection using a UV gBuffer (a texture that contains the desired UV value for mapping at that pixel).
I think this should be easy to understand just by looking at this picture (I cannot attach it directly due to low reputation):
http://www.andvfx.com/wp-content/uploads/2012/12/3-objectes.jpg
The first image (the black/yellow/red/green one) is the UV gBuffer and represents the UV values; the second one is the diffuse channel, and the third is the desired result.
Doing this in OpenGL is pretty trivial: draw a simple rectangle and use something like this as the fragment shader:
vec2 newUV = texture2D(UVgbufferTex, gl_TexCoord[0].xy).xy;
vec3 finalcolor = texture2D(DIFFgbufferTex, newUV).rgb;
gl_FragColor = vec4(finalcolor, 0.0);
OpenGL takes care of selecting the mipmap level, the anisotropic filtering, etc., whereas if I do this in a regular CPU process I get a single pixel for finalcolor, so my result looks crispy (aliased).
Any advice here? I was wondering about computing a kind of mipmap chain manually and selecting the level by checking the contiguous pixels, but I'm not sure if this is the right way. I'm also unsure how to deal with the fact that the UVs could be changing fast horizontally but slower vertically, or vice versa.
In fact, I don't know how this is computed internally in OpenGL/DirectX, since I have used this kind of code for a long time but never thought about the internals.
You are on the right track.
To select mipmap level or apply anisotropic filtering you need a gradient. That gradient comes naturally in GL (in fragment shaders) because it is computed for all interpolated variables after rasterization. This all becomes quite obvious if you ever try to sample a texture using mipmap filtering in a vertex shader.
You can compute the LOD (lambda) as such:
    ρ = max( sqrt((du/dx)^2 + (dv/dx)^2), sqrt((du/dy)^2 + (dv/dy)^2) )
    λ = log2(ρ)
The texture level is picked based on the size on screen after reprojection. After you emit a triangle, check the rasterized size and pick the appropriate mipmap.
As for filtering, it's not that hard to implement, say, bilinear filtering manually.
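A rough CPU-side sketch of the level selection, mirroring the ρ/λ formula above (the function and parameter names here are hypothetical):

    #include <math.h>

    /* Pick a mip level from the UV gradients at this pixel. The gradients are finite
       differences of the UV gBuffer against the neighbouring pixels, expressed in
       texels (i.e. the UV difference multiplied by the base texture size). */
    static int select_mip_level(float dudx, float dvdx, float dudy, float dvdy, int numLevels)
    {
        float rho    = fmaxf(sqrtf(dudx * dudx + dvdx * dvdx),
                             sqrtf(dudy * dudy + dvdy * dvdy));
        float lambda = log2f(fmaxf(rho, 1.0f));   /* clamp so we never go below the base level */
        int   level  = (int)floorf(lambda);
        return level < numLevels ? level : numLevels - 1;
    }

Bilinear filtering within the chosen level is then a weighted average of the four surrounding texels, and blending two adjacent levels with the fractional part of λ gives you trilinear filtering.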

Using depth buffer for layering 2D sprites

I'm making a 2D game using OpenGL. I want to do the drawing like this: first I copy the vertex data of all objects I want to draw into VBOs (one VBO per texture/shader), then I draw each VBO in a separate draw call. It seemed like a good idea, until I realized it will mess up the drawing order - the draw calls won't necessarily be in the order the objects were loaded into the VBOs. I thought of using the depth buffer to sort items - every new object to draw will get a slightly higher Z position. The question is, how much should I increment it by to not run into any problems? AFAIK, there can be two kinds of problems - if I make it too large, I will have a limited number of objects I can draw in a single frame, and if I make it too small, the precision loss of the depth buffer might make overlapping images be drawn in the wrong order. To summarize:
1) What should be front and back values of my orthographic projection? 0 to 1? -1 to 1? 1 to 2? Does it matter?
2) If I use <cmath>'s std::nextafter() for incrementing the Z position, what kind of trouble can I run into? How do OpenGL and the depth buffer react to subnormal floats? If I started with std::numeric_limits<float>::min() and ended at 1, is there anything else I should worry about?
First and foremost, you need to know the bit-depth of your depth buffer. Generally the depth buffer is fixed-point, either 16-, 24- or 32-bit.
Given a fixed-point depth buffer and the default depth range [0,1] you can make every integer value represent a uniquely distinguishable depth by using an orthographic projection matrix with 0.0 for nearVal and:
16-bit: farVal = 65535.0
24-bit: farVal = 16777215.0 // Most Common Configuration
32-bit: farVal = 4294967295.0
Then, you can assign your layered sprites up to farVal+1 different depths (always use an integer value for the sprite depth and begin with 0) and not worry about the depth buffer being unable to distinguish between the layers. In other words, the precision of your depth buffer dictates the maximum number of layers you can have.
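For a 24-bit depth buffer, a hedged sketch of that setup with the old fixed-function matrix stack (windowWidth, windowHeight and spriteIndex are assumed names) could be:

    /* One unit of Z per sprite layer: a 24-bit depth buffer holds 2^24 distinct values. */
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0.0, windowWidth, windowHeight, 0.0,   /* left, right, bottom, top       */
            0.0, 16777215.0);                      /* nearVal = 0, farVal = 2^24 - 1 */
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    /* When emitting sprite vertices, give the i-th sprite drawn a Z of exactly -i
       (negative because the camera looks down -Z):
       glVertex3f(x, y, -(float)spriteIndex);  */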

I need my GLSL fragment shader to return the distance calculation

I'm using some standard GLSL (version 120) vertex and fragment shaders to simulate LIDAR. In other words, instead of just returning a color at each x,y position (each pixel, via the fragment shader), it should return color and distance.
I suppose I don't actually need all of the color bits, since I really only want the intensity; so I could store the distance in gl_FragColor.b, for example, and use .rg for the intensity. But then I'm not entirely clear on how I get the value back out again.
Is there a simple way to return values from the fragment shader? I've tried varying, but it seems like the fragment shader can't write variables other than gl_FragColor.
I understand that some people use the GLSL pipeline for general-purpose (non-graphics) GPU processing, and that might be an option — except I still do want to render my objects normally.
OpenGL already returns this "distance calculation" via the depth buffer, although it's not linear. You can simply create a frame buffer object (FBO), attach colour and depth buffers, render to it, and you have the result sitting in the depth buffer (although you'll have to undo the depth transformation). This is the easiest option to program provided you are familiar with the depth calculations.
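For reference, 'undoing the depth transformation' for a standard perspective projection looks roughly like this in the shader (zNear and zFar are your projection's clip planes; this is a sketch, not tied to your code):

    // depthSample is the value read from the depth texture, in [0,1].
    float ndcZ = depthSample * 2.0 - 1.0;
    float eyeDepth = 2.0 * zNear * zFar / (zFar + zNear - ndcZ * (zFar - zNear));
    // eyeDepth is the distance along the view axis, not the Euclidean distance to the point.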
Another method, as you suggest, is storing the value in a colour buffer. You don't have to use the main colour buffer because then you'd lose your colour or have to render twice. Instead, attach a second render target (texture) to your FBO (GL_COLOR_ATTACHMENT1) and use gl_FragData[0] for normal colour and gl_FragData[1] for your distance (for newer GL versions you should be declaring out variables in the fragment shader). It depends on the precision you need, but you'll probably want to make the distance texture 32 bit float (GL_R32F and write to gl_FragData[1].r).
- This is a decent place to start: http://www.opengl.org/wiki/Framebuffer_Object
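A minimal GLSL 1.20 sketch of that second option (the varying names are invented here), writing colour to the first attachment and distance to the second:

    #version 120

    varying vec3 color;        // whatever your normal shading produces
    varying vec4 eyePosition;  // vertex position in eye space, passed from the vertex shader

    void main()
    {
        gl_FragData[0] = vec4(color, 1.0);               // GL_COLOR_ATTACHMENT0: normal colour
        gl_FragData[1] = vec4(length(eyePosition.xyz));  // GL_COLOR_ATTACHMENT1: distance (GL_R32F keeps .r)
    }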
Yes, GLSL can be used for compute purposes, especially with ARB_image_load_store and NVIDIA's bindless graphics. You even have access to shared memory via compute shaders (though I've never gotten one to run faster than 5 times slower). As @Jherico says, fragment shaders generally output to a single place in a framebuffer attachment/render target, and recent features such as image units (ARB_image_load_store) allow you to write to arbitrary locations from a shader. It's probably overkill and slower, but you could also write your distances to a buffer via image units.
Finally, if you want the data back on the host (CPU accessible) side, use glGetTexImage with your distance texture (or glMapBuffer if you decided to use image units).
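Reading the distance texture back could look roughly like this (distanceTex, width and height are assumed names, inside a valid GL context):

    /* Allocate host storage (needs <stdlib.h>) and pull the GL_R32F texture back to the CPU. */
    float *distances = malloc(width * height * sizeof(float));
    glBindTexture(GL_TEXTURE_2D, distanceTex);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RED, GL_FLOAT, distances);
    /* ... use distances[y * width + x] ... */
    free(distances);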
Fragment shaders output to a rendering buffer. If you want to use the GPU for computing and fetch data back into host memory, you have a few options:
Create a framebuffer and attach a texture to it to hold your data. Once the image has been rendered, you can read the information back from the texture into host memory.
Use CUDA, OpenCL or an OpenGL compute shader to write to an arbitrarily bound buffer, and read back the buffer contents.