OpenGL antialiasing without the accumulation buffer - opengl

On an NVIDIA card I can perform full scene anti-aliasing using the accumulation buffer something like this:
if(m_antialias)
{
    glClear(GL_ACCUM_BUFFER_BIT);
    for(int j = 0; j < antialiasing; j++)
    {
        accPerspective(m_camera.FieldOfView(), // Vertical field of view in degrees.
                       aspectratio, // The aspect ratio.
                       20., // Near clipping
                       1000.,
                       JITTER[antialiasing][j].X(), JITTER[antialiasing][j].Y(),
                       0.0, 0.0, 1.0);
        m_camera.gluLookAt();
        ActualDraw();
        glAccum(GL_ACCUM, float(1.0 / antialiasing));
        glDrawBuffer(GL_FRONT);
        glAccum(GL_RETURN, float(antialiasing) / (j + 1));
        glDrawBuffer(GL_BACK);
    }
    glAccum(GL_RETURN, 1.0);
}
On ATI cards the accumulation buffer is not implemented, and everyone says that you can do that in shader language now. The problem with that, of course, is that GLSL is a pretty high barrier to entry for an OpenGL beginner.
Can anyone point me to something that will show me how to do whole-scene anti-aliasing in a way that ATI cards can do, and that a newbie can understand?

Why would you ever do antialiasing this way, regardless of whether you have accumulation buffers or not? Just use multisampling; it's not free, but it's much cheaper than what you're doing.
First, you have to create a context with a multisampled buffer. That means you need to use WGL/GLX_ARB_multisample, which on Windows means two-stage context creation: you create a throwaway context first, because the extension entry points can only be obtained while a context is current. You should request a pixel format with *_SAMPLE_BUFFERS_ARB set to 1 and some number of *_SAMPLES_ARB. The larger the number of samples, the better the antialiasing (and the slower the rendering). You can get the maximum number of samples by querying the available formats with wglGetPixelFormatAttribivARB (or its fv variant) or glXGetConfig.
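As a rough sketch of what such a request looks like on Windows, the attribute list handed to wglChoosePixelFormatARB could be built as below. The token values are the ones published in the WGL_ARB_pixel_format and WGL_ARB_multisample specs (normally they come from wglext.h; they are redefined here under different names only so the snippet compiles on its own). The helper name buildMultisampleAttribs and the 32-bit color / 24-bit depth choice are assumptions made for illustration.

```cpp
#include <cassert>
#include <vector>

// Values of the WGL_*_ARB tokens, copied from the extension specs.
// In real code include wglext.h and use the WGL_*_ARB names directly.
constexpr int kDrawToWindow  = 0x2001;  // WGL_DRAW_TO_WINDOW_ARB
constexpr int kSupportOpenGL = 0x2010;  // WGL_SUPPORT_OPENGL_ARB
constexpr int kDoubleBuffer  = 0x2011;  // WGL_DOUBLE_BUFFER_ARB
constexpr int kPixelType     = 0x2013;  // WGL_PIXEL_TYPE_ARB
constexpr int kColorBits     = 0x2014;  // WGL_COLOR_BITS_ARB
constexpr int kDepthBits     = 0x2022;  // WGL_DEPTH_BITS_ARB
constexpr int kTypeRGBA      = 0x202B;  // WGL_TYPE_RGBA_ARB
constexpr int kSampleBuffers = 0x2041;  // WGL_SAMPLE_BUFFERS_ARB
constexpr int kSamples       = 0x2042;  // WGL_SAMPLES_ARB

// Hypothetical helper: returns a zero-terminated list of (attribute, value)
// pairs asking for a double-buffered RGBA format with `samples` samples.
std::vector<int> buildMultisampleAttribs(int samples) {
    assert(samples > 0);
    return {
        kDrawToWindow,  1,
        kSupportOpenGL, 1,
        kDoubleBuffer,  1,
        kPixelType,     kTypeRGBA,
        kColorBits,     32,       // assumed color depth
        kDepthBits,     24,       // assumed depth precision
        kSampleBuffers, 1,        // exactly one multisample buffer
        kSamples,       samples,  // e.g. 2, 4, 8
        0                         // terminator
    };
}
```

If the driver finds no matching format for the requested sample count, a common fallback is to retry with a smaller count.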
Once you have successfully created a context with a multisample framebuffer, you render as normal, with one exception: call glEnable(GL_MULTISAMPLE) in your setup code. This will activate multisampled rendering.
And that's all you need.
Alternatively, if you're using GL 3.x or have access to ARB_framebuffer_object, you can skip the context stuff and create a multisampled framebuffer. Your depth buffer and color buffer(s) must all have the same number of samples. I would suggest using renderbuffers for these, since you're still using fixed-function (and you can't texture from a multisample texture in the fixed-function pipeline).
You create multisampled renderbuffers for color and depth (they must have the same number of samples). You set them up in an FBO, and render into them (with glEnable(GL_MULTISAMPLE), of course). When you're done, you then use glBlitFramebuffer to blit from your multisample framebuffer into the back-buffer (which shouldn't be multisampled).
"The problem with that, of course, is that GLSL is a pretty high barrier to entry for an OpenGL beginner."
Says who? There is nothing wrong with a beginner learning from shaders. Indeed, in my experience, such beginners often learn better, because they understand the details of what's going on more effectively.

Related

Writing to depth buffer from opengl compute shader

Generally on modern desktop OpenGL hardware what is the best way to fill a depth buffer from a compute shader and then use that depth buffer for graphics pipeline rendering with triangles etc?
Specifically, I am wondering about concerns regarding HiZ. I also wonder whether it's better to do the compute shader modifications to the depth buffer before or after the graphics rendering.
If the compute shader is run after the graphics rendering, I assume the depth buffer will typically be decompressed behind the scenes. But I worry that, done the other way around, the depth buffer may be left in a decompressed/non-optimal state for the graphics pipeline.
As far as I know, you cannot bind textures with any of the depth formats as images, and thus cannot write to depth-format textures in compute shaders. See the glBindImageTexture documentation: it lists the formats that your texture format must be compatible with. Depth formats are not among them, and the specification says the depth formats are not compatible with the normal formats.
Texture copying functions have the same compatibility restrictions, so you can't even, for example, write to a normal texture in the compute shader and then copy it to a depth texture. glCopyImageSubData does not explicitly have that restriction, but I haven't tried it and it's not part of the core profile anymore.
What might work is writing to a normal texture, then rendering a fullscreen triangle and setting gl_FragDepth to values read from the texture, but that's an additional fullscreen pass.
I don't quite understand your second question - if your compute shader stuff modifies the depth buffer, the result will most likely be different depending on whether you do it before or after regular rendering because different parts will be visible or occluded.
But maybe that question is moot since it seems you cannot manually write into depth buffers at all - which might also answer your third question - by not writing into depth buffers you cannot mess with the compression of it :)
Please note that I'm no expert in this; I had a similar problem and looked at the docs/spec myself, so this all might be wrong :) Please let me know if you manage to write to depth buffers with compute shaders!

OpenGL SuperSampling Anti-Aliasing?

At the office we're working with old GLX/Motif software that uses OpenGL's accumulation buffer to implement anti-aliasing for saving images.
Our problem is that Apple removed the accumulation buffer from all of its drivers (starting from OS X 10.7.5), and some Linux drivers, like those for Intel HDxxxx, don't support it either.
So I would like to update the anti-aliasing code of the software to make it compatible with most current OSes and GPUs, while keeping the generated images as beautiful as they were before (because we need them for scientific publications).
Supersampling seems to be the oldest and highest-quality anti-aliasing method, but I can't find any example of SSAA that doesn't use the accumulation buffer. Is there a different way to implement supersampling with OpenGL/GLX?
You can use FBOs to implement the same kind of anti-aliasing that you most likely used with accumulation buffers. The process is almost the same, except that you use a texture/renderbuffer as your "accumulation buffer". You can either use two FBOs for the process, or change the attached render target of a single render FBO.
In pseudo-code, using two FBOs, the flow looks roughly like this:
create renderbuffer rbA
create fboA (will be used for accumulation)
bind fboA
attach rbA to fboA
clear
create texture texB
create fboB (will be used for rendering)
attach texB to fboB
(create and attach a renderbuffer for the depth buffer)
loop over jitter offsets
    bind fboB
    clear
    render scene, with jitter offset applied
    bind fboA
    bind texB for texturing
    set blend function GL_CONSTANT_ALPHA, GL_ONE
    set blend color 0.0, 0.0, 0.0, 1.0 / #passes
    enable blending
    render screen size quad with simple texture sampling shader
    disable blending
end loop
bind fboA as read_framebuffer
bind default framebuffer as draw framebuffer
blit framebuffer
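To see why the blend setup in the loop produces an average, here is a small CPU model of one color channel of one pixel. It only mirrors the blend equation selected by GL_CONSTANT_ALPHA, GL_ONE (result = source * constant alpha + destination); the function names are made up for this sketch and no OpenGL calls are involved.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Models glBlendFunc(GL_CONSTANT_ALPHA, GL_ONE) for one color channel:
// result = src * constantAlpha + dst * 1.
float blendConstantAlphaOne(float src, float dst, float constantAlpha) {
    return src * constantAlpha + dst;
}

// Accumulates the value one pixel takes in each jittered pass, the way the
// full-screen quad draws into fboA would.
float accumulatePasses(const std::vector<float>& passValues) {
    assert(!passValues.empty());
    const float weight = 1.0f / static_cast<float>(passValues.size());
    float accum = 0.0f;  // fboA starts out cleared to zero
    for (float value : passValues) {
        accum = blendConstantAlphaOne(value, accum, weight);
    }
    return accum;
}
```

A pixel on an edge that is covered in two of four jittered passes ends up at half intensity. One caveat worth checking in a real implementation: with an 8-bit accumulation target every pass is quantized separately, so a floating-point format such as GL_RGBA16F for rbA is probably the safer choice when many passes are used.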
Full super-sampling is also possible. As Andon in the comment above suggested, you create an FBO with a render target that is a multiple of your window size in each dimension, and in the end do a down-scaling blit to your window. The whole thing tends to be slow and use a lot of memory, even with just a factor of 2.
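What the final down-scaling step amounts to can be sketched on the CPU as a box filter over a single-channel image. This is only a model of the idea, not the GL code path: the Image struct and downsampleBox are names invented for the sketch. For a factor of 2, a scaling blit with GL_LINEAR filtering should give a comparable 2x2 average; for larger factors a linear filter reads only four texels per output pixel, so a shader pass or repeated halving would be needed to match this result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Single-channel image stored row by row (hypothetical helper type).
struct Image {
    std::size_t width;
    std::size_t height;
    std::vector<float> pixels;
};

// Averages every factor x factor block of the oversized render into one
// output pixel. Width and height must be multiples of factor.
Image downsampleBox(const Image& src, std::size_t factor) {
    assert(factor > 0);
    assert(src.width % factor == 0 && src.height % factor == 0);
    Image dst{src.width / factor, src.height / factor, {}};
    dst.pixels.resize(dst.width * dst.height);
    const float norm = 1.0f / static_cast<float>(factor * factor);
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            float sum = 0.0f;
            for (std::size_t sy = 0; sy < factor; ++sy) {
                for (std::size_t sx = 0; sx < factor; ++sx) {
                    sum += src.pixels[(y * factor + sy) * src.width + (x * factor + sx)];
                }
            }
            dst.pixels[y * dst.width + x] = sum * norm;
        }
    }
    return dst;
}
```

The memory cost mentioned above follows directly from this layout: a factor of 2 means four times as many color and depth values as the window itself.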

Why does OpenGL lighten my scene when multisampling with an FBO?

I just switched my OpenGL drawing code from drawing to the display directly to using an off-screen FBO with render buffers attached. The off-screen FBO is blitted to the screen correctly when I allocate normal render buffer storage.
However, when I enable multisampling on the render buffers (via glRenderbufferStorageMultisample), every color in the scene seems like it has been brightened (thus giving different colors than the non-multisampled part).
I suspect there's some glEnable option that I need to set to maintain the same colors, but I can't seem to find any mention of this problem elsewhere.
Any ideas?
I stumbled upon the same problem, due to the lack of proper downsampling because of mismatching sample locations. What worked for me was:
A separate "single sample" FBO with identical attachments, format and dimension (with texture or renderbuffer attached) to blit into for downsampling and then draw/blit this to the window buffer
Render into a multisample window buffer with multisample texture having the same sample count as input, by passing all corresponding samples per fragment using a GLSL fragment shader. This worked with sample shading enabled and is the overkill approach for deferred shading as you can calculate light, shadow, AO, etc. per sample.
I did also rather sloppy manual downsampling to single sample framebuffers using GLSL, where I had to fetch each sample separately using texelFetch().
Things got really slow with multisampling. Although CSAA performed better than MSAA, I recommend taking a look at FXAA post-processing shaders as a serious alternative when performance is an issue, or when the rather new extensions required, such as ARB_texture_multisample, are not available.
Accessing samples in GLSL:
vec4 texelDownsampleAvg(sampler2DMS tex, ivec2 texelCoord, const int sampleCount)
{
    // "sample" is a reserved keyword from GLSL 4.00 on, so the loop counter is named i.
    vec4 accum = texelFetch(tex, texelCoord, 0);
    for (int i = 1; i < sampleCount; ++i) {
        accum += texelFetch(tex, texelCoord, i);
    }
    return accum / float(sampleCount);
}
http://developer.download.nvidia.com/opengl/specs/GL_EXT_framebuffer_multisample.txt
http://developer.download.nvidia.com/opengl/specs/GL_EXT_framebuffer_blit.txt
11) Should blits be allowed between buffers of different bit sizes?
Resolved: Yes, for color buffers only. Attempting to blit
between depth or stencil buffers of different size generates
INVALID_OPERATION.
13) How should BlitFramebuffer color space conversion be
specified? Do we allow context clamp state to affect the
blit?
Resolved: Blitting to a fixed point buffer always clamps,
blitting to a floating point buffer never clamps. The context
state is ignored.
http://www.opengl.org/registry/specs/ARB/sample_shading.txt
Blitting multisampled FBO with multiple color attachments in OpenGL
The solution that worked for me was changing the renderbuffer color format. I had originally picked GL_RGBA32F and GL_DEPTH_COMPONENT32F (figuring that I wanted the highest precision), and the NVIDIA drivers interpret that differently (I suspect sRGB compensation, but I could be wrong).
The renderbuffer image formats I found to work are GL_RGBA8 with GL_DEPTH_COMPONENT24.

C++, OpenGL Z-buffer prepass

I'm making a simple voxel engine (think Minecraft) and am currently at the stage of getting rid of occluded faces to gain some precious fps. I'm not very experienced in OpenGL and do not quite understand how the glColorMask magic works.
This is what I have:
// new and shiny
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
// this one goes without saying
glEnable(GL_DEPTH_TEST);
// I want to see my code working, so fill the mask
glPolygonMode(GL_FRONT_AND_BACK, GL_FILL);
// fill the z-buffer, or whatever
glDepthFunc(GL_LESS);
glColorMask(0,0,0,0);
glDepthMask(GL_TRUE);
// do a first draw pass
world_display();
// now only show lines, so I can see the occluded lines do not display
glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);
// I guess the error is somewhere here
glDepthFunc(GL_LEQUAL);
glColorMask(1,1,1,1);
glDepthMask(GL_FALSE);
// do a second draw pass for the real rendering
world_display();
This somewhat works, but once I change the camera position the world starts to fade away: I see fewer and fewer lines until nothing is left at all.
It sounds like you are not clearing your depth buffer.
You need to have depth writing enabled (via glDepthMask(GL_TRUE);) when you clear the depth buffer with glClear. You probably still have it disabled from the previous frame, which turns all your depth clears into no-ops in subsequent frames. Just move your glDepthMask call before the glClear.
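The effect can be reproduced with a toy model of a single depth-buffer texel and the depth write mask. This is a sketch of the state logic only, with invented names (DepthPixel, renderFrame) and no OpenGL calls; it assumes, as in the question, that each frame ends with depth writes disabled for the color pass.

```cpp
#include <cassert>

// Toy model of one depth-buffer texel plus the depth write mask.
struct DepthPixel {
    float depth = 1.0f;     // 1.0 is the default clear depth (far plane)
    bool writeMask = true;  // models glDepthMask

    // glClear(GL_DEPTH_BUFFER_BIT) honours the depth mask.
    void clear() {
        if (writeMask) depth = 1.0f;
    }

    // GL_LESS test with a masked write; returns true if the fragment passes.
    bool drawLess(float z) {
        const bool pass = z < depth;
        if (pass && writeMask) depth = z;
        return pass;
    }
};

// Runs one frame and reports whether the pre-pass fragment at depth z passed.
// maskBeforeClear == false mimics the question's order (clear, then mask).
bool renderFrame(DepthPixel& px, float z, bool maskBeforeClear) {
    if (maskBeforeClear) px.writeMask = true;
    px.clear();
    px.writeMask = true;   // depth pre-pass state
    const bool visible = px.drawLess(z);
    px.writeMask = false;  // color-pass state, still set when the frame ends
    return visible;
}
```

In the question's order the second frame's clear does nothing, so stale depth from the first frame rejects anything farther away, which matches the scene fading out as the camera moves.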
glColorMask and glDepthMask determine which parts of the framebuffer are actually written to.
The idea of early Z culling is to render only the depth buffer first. The actual savings come from sorting the geometry near to far, so that the GPU can quickly discard occluded fragments. While drawing the Z buffer you don't want to draw the color component: this allows you to switch off shaders, texturing, in short everything that's computationally intense.
A word of warning: early Z only works with opaque geometry. Actually the whole depth buffer algorithm only works for opaque stuff. As soon as you're doing blending, you'll have to sort far to near and do without depth buffering (search for "order independent transparency" for algorithms to overcome the associated problems).
So if you've got anything that's blended, remove it from the 'early Z' stage.
In the first pass you set
glDepthMask(1); // enable depth buffer writes
glColorMask(0,0,0,0); // disable color buffer writes (glColorMask takes four arguments: R, G, B, A)
glDepthFunc(GL_LESS); // use normal depth order testing
glEnable(GL_DEPTH_TEST); // and we want to perform depth tests
After the Z pass is done you change the settings a bit
glDepthMask(0); // don't write to the depth buffer
glColorMask(1,1,1,1); // now write the color components
glDepthFunc(GL_EQUAL); // only draw if the depth of the incoming fragment
// matches the depth already in the depth buffer
GL_LEQUAL does the job too, but it also passes fragments that are closer than the value stored in the depth buffer. Since the depth buffer is no longer updated in this pass, anything between the viewer and the stored depth will overwrite the pixel each time something is drawn there.
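The saving from the two passes can be illustrated with a CPU model that counts how many fragments of a single pixel would reach the expensive color shading. It is a sketch under simplifying assumptions: all depths lie in front of the far plane (below 1.0), both passes produce bit-identical depth values, and the function names are invented here.

```cpp
#include <cassert>
#include <vector>

// Single pass with GL_LESS and depth writes on: every fragment that is
// closer than what has been drawn so far gets shaded.
int shadedWithoutPrepass(const std::vector<float>& depths) {
    float zbuf = 1.0f;  // cleared depth
    int shaded = 0;
    for (float z : depths) {
        if (z < zbuf) {
            zbuf = z;
            ++shaded;
        }
    }
    return shaded;
}

// Pass 1 fills the depth buffer only; pass 2 uses GL_EQUAL with depth
// writes off, so only the front-most fragment is shaded.
int shadedWithPrepass(const std::vector<float>& depths) {
    float zbuf = 1.0f;
    for (float z : depths) {
        if (z < zbuf) zbuf = z;  // depth-only pass, no color work
    }
    int shaded = 0;
    for (float z : depths) {
        if (z == zbuf) ++shaded;  // GL_EQUAL against the stored depth
    }
    return shaded;
}
```

With four overlapping fragments submitted back to front, the single pass shades all four while the pre-pass version shades one; submitted front to back, both shade one, which is why near-to-far sorting already captures much of the benefit.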
A slight change of the theme is using an 'early Z' populated depth buffer as a geometry buffer in multiple deferred shading passes afterwards.
To save further geometry, take a look at occlusion queries. With occlusion queries you ask the GPU how many fragments, if any, pass all tests. This being a voxel engine, you're probably using an octree or k-d tree. Drawing the spatial dividing faces of the tree's branches (with glDepthMask(0), glColorMask(0,0,0,0)) before traversing a branch tells you whether any geometry in that branch is visible at all. That, combined with a near-to-far sorted traversal and a (coarse) frustum clipping on the tree, will give you HUGE performance benefits.
A Z pre-pass can work with translucent objects. If they are translucent, do not render them in the pre-pass; then Z-sort them and render.

How to draw to a texture in OpenGL

Now that my OpenGL application is getting larger and more complex, I am noticing that it's also getting a little slow on very low-end systems such as netbooks. In Java, I am able to get around this by drawing to a BufferedImage, then drawing that to the screen and updating the cached render every once in a while. How would I go about doing this in OpenGL with C++?
I found a few guides, but they seem to only work on newer hardware or specific NVIDIA cards. Since the cached rendering operations will only be updated every once in a while, I can sacrifice speed for compatibility.
glBegin(GL_QUADS);
setColor(DARK_BLUE);
glVertex2f(0, 0); //TL
glVertex2f(appWidth, 0); //TR
setColor(LIGHT_BLUE);
glVertex2f(appWidth, appHeight); //BR
glVertex2f(0, appHeight); //BL
glEnd();
This is something that I am especially concerned about. A gradient that takes up the entire screen is being re-drawn many times per second. How can I cache it to a texture then just draw that texture to increase performance?
Also, a trick I use in Java is to render it to a 1 x height texture and then scale that to width x height to increase performance and lower memory usage. Is there such a trick in OpenGL?
If you don't want to use Framebuffer Objects for compatibility reasons (but they are pretty widely available), you don't want to use the legacy (and non portable) Pbuffers either. That leaves you with the simple possibility of reading the contents of the framebuffer with glReadPixels and creating a new texture with that data using glTexImage2D.
Let me add that I don't really think you are going to gain much in your case. Drawing a texture onscreen requires at least one texel access per pixel, and that's not really a huge saving if the alternative is just interpolating a color as you are doing now!
I sincerely doubt drawing from a texture is less work than drawing a gradient.
In drawing a gradient:
Color is interpolated at every pixel
In drawing a texture:
Texture coordinate is interpolated at every pixel
Color is still interpolated at every pixel
Texture lookup for every pixel
Multiply lookup color with current color
Not that either of these is slow, but drawing untextured polygons is pretty much as fast as it gets.
Hey there, thought I'd give you some insight into this.
There are essentially two ways to do it:
Framebuffer objects (FBOs) for more modern hardware, and the back buffer as a fallback.
The article from one of the previous posters is a good one to follow, and there are plenty of tutorials on Google for FBOs.
In my 2d Engine (Phoenix), we decided we would go with just the back buffer method. Our class was fairly simple and you can view the header and source here:
http://code.google.com/p/phoenixgl/source/browse/branches/0.3/libPhoenixGL/PhRenderTexture.h
http://code.google.com/p/phoenixgl/source/browse/branches/0.3/libPhoenixGL/PhRenderTexture.cpp
Hope that helps!
Consider using a display list rather than a texture. Texture reads (especially for large ones) are a good deal slower than 8 or 9 function calls.
Before doing any optimization you should make sure you fully understand the bottlenecks. You'll probably be surprised at the result.
Look into FBOs - framebuffer objects. It's an extension that lets you render to arbitrary rendertargets, including textures. This extension should be available on most recent hardware. This is a fairly good primer on FBOs: OpenGL Frame Buffer Object 101