I was reading about the side effects of using "discard" in OpenGL's fragment shader, such as early testing being disabled. But I could not find any alternative for alpha-testing until I stumbled upon glAlphaFunc, which seems to be deprecated since OpenGL 3.0. However I could not find any documentation on why it has been removed, and there seems to be no alternative to "discard".
Alpha testing has (on all implementations I know of) never been done in the early testing stage. I don't think it is even possible there, because before the fragment shader has been executed there is no concept of a color or an alpha channel.
In addition, enabling alpha testing usually disables early depth testing (see here), which means that it behaves the same as when discard is used in the shader.
I cannot directly answer why glAlphaFunc has been removed, but since there is no real difference between discard and alpha testing, it's not really a problem.
I'm in the process of learning Vulkan, and I have just integrated ImGui into my code using the Vulkan-GLFW example in the original ImGui repo, and it works fine.
Now I want to render both the GUI and my 3D model on the screen at the same time, and since the GUI and the model definitely need different shaders, I need to use multiple pipelines and submit multiple commands. The GUI is partly transparent, so I would like it to be rendered after the model. The Vulkan spec states that commands are not necessarily executed in the order in which I record them, thus I need synchronization of some kind. In this Reddit post several methods of achieving exactly this were proposed, and I once believed that I must use multiple subpasses (together with a subpass dependency) or barriers or other synchronization methods like that to solve this problem.
Then I had a look at SaschaWillems' Vulkan examples. In the ImGui example, though, I see no synchronization between the two draw calls; it just records the command to draw the model first, and then the command to draw the GUI.
I am confused. Is synchronization really needed in this case, or did I misunderstand something about command re-ordering or blending? Thanks.
Think about what you're doing for a second. Why do you think there needs to be synchronization between the two sets of commands? Because the second set of commands needs to blend with the data in the first set, right? And therefore, it needs to do a read/modify/write (RMW), which must be able to read data written by the previous set of commands. The data being read has to have been written, and that typically requires synchronization.
But think a bit more about what that means. Blending has to read from the framebuffer to do its job. But... so does the depth test, right? It has to read the existing sample's depth value, compare it with the incoming fragment, and then discard the fragment or not based on the depth test. So basically every draw call that uses a depth test contains a framebuffer read/modify/write.
And yet... your depth tests work. Not only do they work between draw calls without explicit synchronization, they also work within a draw call. If two triangles in a draw call overlap, you don't have any problem with seeing the bottom one through the top one, right? You don't have to do inter-triangle synchronization to make sure that the previous triangles' writes are finished before the reads.
So somehow, the depth test's RMW works without any explicit synchronization. So... why do you think that this is untrue of the blend stage's RMW?
The Vulkan specification states that commands, and stages within commands, will execute in a largely unordered way, with several exceptions, the most obvious being the presence of explicit execution barriers/dependencies. But it also says that the fixed-function per-sample testing and blending stages will always execute (as if) in submission order (within a subpass). Not only that, it requires that the triangles generated within a command also execute these stages (as if) in a specific, well-defined order.
That's why your depth test doesn't need synchronization; Vulkan requires that this is handled. This is also why your blending will not need synchronization (within a subpass).
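To make the ordering argument concrete, here is a minimal CPU-side sketch of the blend stage's read/modify/write for a single pixel. It is not Vulkan code: `Rgb`, `blend_src_alpha` and `composite_in_order` are hypothetical names, and the blend equation assumed is the common source-alpha one (source factor SRC_ALPHA, destination factor ONE_MINUS_SRC_ALPHA). It only illustrates that the result depends on the order in which fragments reach the blend stage, which is the order the specification pins down.

```c
#include <assert.h>
#include <stddef.h>

/* Colour of one framebuffer pixel, components in [0, 1]. */
typedef struct { float r, g, b; } Rgb;

/* One blend read/modify/write: reads dst, mixes in src, returns the value
   that would be written back. Assumes SRC_ALPHA / ONE_MINUS_SRC_ALPHA. */
static Rgb blend_src_alpha(Rgb src, float src_alpha, Rgb dst)
{
    assert(src_alpha >= 0.0f && src_alpha <= 1.0f);
    Rgb out;
    out.r = src.r * src_alpha + dst.r * (1.0f - src_alpha);
    out.g = src.g * src_alpha + dst.g * (1.0f - src_alpha);
    out.b = src.b * src_alpha + dst.b * (1.0f - src_alpha);
    return out;
}

/* Applies the fragments to one pixel strictly in array order, the way the
   blend stage behaves "as if" in submission order within a subpass. */
static Rgb composite_in_order(Rgb clear, const Rgb *frags,
                              const float *alphas, size_t count)
{
    assert(frags != NULL && alphas != NULL);
    Rgb pixel = clear;
    for (size_t i = 0; i < count; ++i)
        pixel = blend_src_alpha(frags[i], alphas[i], pixel);
    return pixel;
}
```

With an opaque red model fragment followed by a half-transparent white UI fragment, the pixel ends up light red; with the two swapped, the opaque model simply overwrites the UI. That difference is why the guaranteed order matters, and why recording the UI draw after the model draw is enough within a subpass.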
So you have plenty of options (in order from fastest to slowest):
Render your UI in the same subpass as the non-UI. Just change pipelines as appropriate.
Render your UI in a subpass with an explicit dependency on the framebuffer images of the non-UI subpass. While this is technically slower, it probably won't be slower by much if at all. Also, this is useful for deferred rendering, since your UI needs to happen after your lighting pass, which will undoubtedly be its own subpass.
Render your UI in a different render pass. This would only really be needed for cases where you need to do some full-screen work (SSAO) that would force your non-UI render pass to terminate anyway.
I've been using glDisable(GL_DEPTH_TEST) to disable depth testing, thinking that it only disables depth "testing". The reason I've been confused I guess is because I've created two functions, one to disable the depth "test", and another to disable depth "writes" with glDepthMask(GL_FALSE);
If disabling GL_DEPTH_TEST disables both "testing" and "writing" then would it be the equivalent of doing:
glDepthFunc(GL_ALWAYS); // DISABLE TESTS (OR ACTUALLY ALWAYS LET THE TEST SUCCEED)
glDepthMask(GL_FALSE); // DISABLE WRITES
I'm thinking that there's little use for disabling GL_DEPTH_TEST unless I want to disable both tests and writes, and I'm wondering which one is better. The wording seems confusing, but maybe it's just me. I suppose disabling depth tests with glDepthFunc(GL_ALWAYS) will still do a comparison. Is there no way to disable the depth test entirely while still allowing writes?
glDisable(GL_DEPTH_TEST)
turns the depth test off in the pipeline. If you use:
glDepthFunc(GL_ALWAYS);
glDepthMask(GL_FALSE);
instead, the depth test stays in the pipeline and is merely made to always pass, which could mean a performance difference on some platforms ...
Also, using your equivalent would require you to save and restore the original depth function in case you want to turn the test on again (many rendering techniques need to switch the depth test on and off several times).
And lastly, two GL function calls have more overhead than just one. So from a performance point of view, using glDisable(GL_DEPTH_TEST) is better for this. glDepthFunc and glDepthMask are intended for different purposes, like multi-pass rendering, order-independent transparency, holes, etc ...
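The three switches discussed above can be modelled on the CPU for a single depth sample. This is only a sketch: `DepthState` and `depth_stage` are hypothetical names, and only GL_LESS and GL_ALWAYS are modelled. It follows the behaviour documented for glDepthFunc, namely that the depth buffer is not updated while the depth test is disabled, so an unconditional write needs the test enabled with GL_ALWAYS and a depth mask of GL_TRUE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEPTH_LESS, DEPTH_ALWAYS } DepthFunc;

typedef struct {
    bool      test_enabled; /* glEnable / glDisable(GL_DEPTH_TEST) */
    DepthFunc func;         /* glDepthFunc                          */
    bool      write_mask;   /* glDepthMask                          */
} DepthState;

/* Returns true if the fragment survives the depth stage. *stored is the
   depth buffer value for this sample and is only modified on a write. */
static bool depth_stage(const DepthState *state, float frag_depth, float *stored)
{
    assert(state != NULL && stored != NULL);
    if (!state->test_enabled)
        return true; /* stage bypassed: no comparison and no depth write */

    bool pass = (state->func == DEPTH_ALWAYS) || (frag_depth < *stored);
    if (pass && state->write_mask)
        *stored = frag_depth;
    return pass;
}
```

In this model, disabling the test and using GL_ALWAYS with glDepthMask(GL_FALSE) give the same visible result (every fragment passes, nothing is written), while GL_ALWAYS with glDepthMask(GL_TRUE) is the combination that writes depth without rejecting anything.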
I am experiencing the issue described in this article where the second color ramp is effectively being gamma-corrected twice, resulting in overbright and washed-out colors. This is in part a result of my using an sRGB framebuffer, but that is not the actual reason for the problem.
I'm testing textures in my test app on iOS8, and in particular I am currently using a PNG image file and using GLKTextureLoader to load it in as a cubemap.
By default, textures are treated NOT as being in sRGB space (which they are invariably saved in by the image editing software used to build the texture).
The consequence of this is that Apple has made GLKTextureLoader do the glTexImage2D call for you, and they invariably are calling it with the GL_RGB8 setting, whereas for actual correctness in future color operations we have to uncorrect the gamma in order to obtain linear brightness values in our textures for our shaders to sample.
Now I can actually see the argument that it is not required of most mobile applications to be pedantic about color operations and color correctness as applied to advanced 3D techniques involving color blending. Part of the issue is that it's unrealistic to use the precious shared device RAM to store textures at any bit depth greater than 8 bits per channel, and if we read our JPG/PNG/TGA/TIFF and gamma-uncorrect its 8 bits of sRGB into 8 bits linear, we're going to degrade quality.
So the process for most apps is to happily toss linear color correctness out the window, ignore gamma correction, and do blending in sRGB space. This suits Angry Birds very well, as it is a game that has no shading or blending, so it's perfectly sensible to do all operations in gamma-corrected color space.
So this brings me to the problem that I have now. I need to use EXT_sRGB and GLKit makes it easy for me to set up an sRGB framebuffer, and this works great on last-3-or-so-generation devices that are running iOS 7 or later. In doing this I address the dark and unnatural shadow appearance of an uncorrected render pipeline. This allows my lambertian and blinn-phong stuff to actually look good. It lets me store sRGB in render buffers so I can do post-processing passes while leveraging the improved perceptual color resolution provided by storing the buffers in this color space.
But the problem now as I start working with textures is that it seems like I can't even use GLKTextureLoader as it was intended, as I just get a mysterious error (code 18) when I set the options flag for SRGB (GLKTextureLoaderSRGB). And it's impossible to debug as there's no source code to go with it.
So I was thinking I could go build my texture loading pipeline back up with glTexImage2D and use GL_SRGB8 to specify that I want to gamma-uncorrect my textures before I sample them in the shader. However a quick look at GL ES 2.0 docs reveals that GL ES 2.0 is not even sRGB-aware.
At last I find the EXT_sRGB spec, which says
Add Section 3.7.14, sRGB Texture Color Conversion
If the currently bound texture's internal format is one of SRGB_EXT or
SRGB_ALPHA_EXT the red, green, and blue components are converted from an
sRGB color space to a linear color space as part of filtering described in
sections 3.7.7 and 3.7.8. Any alpha component is left unchanged. Ideally,
implementations should perform this color conversion on each sample prior
to filtering but implementations are allowed to perform this conversion
after filtering (though this post-filtering approach is inferior to
converting from sRGB prior to filtering).
The conversion from an sRGB encoded component, cs, to a linear component,
cl, is as follows.
     { cs / 12.92,                 cs <= 0.04045
cl = {
     { ((cs + 0.055)/1.055)^2.4,   cs >  0.04045
Assume cs is the sRGB component in the range [0,1]."
Since I've never dug this deep when implementing a game engine for desktop hardware (where I would expect color resolution considerations to be essentially moot when using render buffers of 16 bits per channel or higher), my understanding of how this works is unclear. Still, this paragraph does go some way toward reassuring me that I can have my cake and eat it too: I can retain all 8 bits of color information if I load the textures using the SRGB_EXT image storage format.
Here in OpenGL ES 2.0 with this extension I can use SRGB_EXT or SRGB_ALPHA_EXT rather than the analogous SRGB or SRGB8_ALPHA from vanilla GL.
My apologies for not presenting a simple answerable question. Let it be this one: Am I barking up the wrong tree here or are my assumptions more or less correct? Feels like I've been staring at these specs for far too long now. Another way to answer my question is if you can shed some light on the GLKTextureLoader error 18 that I get when I try to set the sRGB option.
It seems like there is yet more reading for me to do as I have to decide whether to start to branch my code to get one codepath that uses GL ES 2.0 with EXT_sRGB, and the other using GL ES 3.0, which certainly looks very promising by comparing the documentation for glTexImage2D with other GL versions and appears closer to OpenGL 4 than the others, so I am really liking that ES 3 will be bringing mobile devices a lot closer to the API used on the desktop.
Am I barking up the wrong tree here or are my assumptions more or less correct?
Your assumptions are correct. If the GL_EXT_sRGB OpenGL ES extension is supported, both sRGB framebuffers (with automatic conversion from linear to gamma-corrected sRGB) and sRGB texture formats (with automatic conversion from sRGB to linear RGB when sampling from them) are available, so that is definitely the way to go if you want to work in a linear color space.
I can't help with that GLKit issue, no idea about that.
I'm trying to replicate the effect of Cathode but I'm not really aware of any rendering effects in SDL. Does anyone know the technique used in Cathode? Are they using OpenGL and shaders maybe?
If you are still interested in the subject I'm working on a similar project. The effects were obtained by using GLSL shaders.
You can grab the source code here: https://github.com/Swordifish90/cool-old-term/
The shader strings might not be extremely readable due to the extensive use of ternary operators (needed to customize the appearance), but they should give you a really good idea.
If you poke around a bit in the application bundle, you'll find that the only relevant framework is GLKit which, according to Apple, will "reduce the effort required to create new shader-based apps".
There's also a bunch of ".fragdata", ".vertdata", and ".glsldata" files, which are encrypted.
Very unfortunate for you.
So I would say: Yes, it's OpenGL shaders all the way.
Unfortunately, since the shaders are encrypted, you're going to have to locate suitable algorithms elsewhere.
(Perhaps it's possible to use the OpenGL debugging and profiling tools to capture the shader source as it is compiled, but I doubt it.)
You may have noticed that Android phones have (had?) such animations when you put them to sleep. That code is available in a file named ElectronBeam.java.
However, it is Java code and uses GLES 1.0 with GLES 1.1 extensions, but the algorithm for bending the screen should be understandable.
It seems to be based on GLTerminal, which uses OpenGL; it would have to use OpenGL and shaders for speed.
I guess the fastest approximation would be to render the text to buffers within OpenGL and use a deformed 2d grid to create the "rounded corners" radial distortion.
But it would take a lot of work to add all the features that cathode has, not to mention to run them quickly.
I suspect emulating a CRT perfectly is a bit like emulating an analog synth perfectly - hard to impossible.
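The deformed 2D grid idea mentioned above could look roughly like this on the CPU side. It is a sketch under assumptions, not Cathode's or GLTerminal's actual method: `radial_distort`, `build_distorted_grid` and the strength parameter `k` are hypothetical, and the formula is a simple one-term radial model that pulls corners toward the center more than edge midpoints, so the screen edges appear to bow outward.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;

/* Moves a point in [-1, 1]^2 toward the center by an amount that grows with
   its squared distance from the center. k >= 0 is the distortion strength. */
static Vec2 radial_distort(Vec2 p, float k)
{
    assert(k >= 0.0f);
    float r2 = p.x * p.x + p.y * p.y;
    float scale = 1.0f - k * r2;
    Vec2 out = { p.x * scale, p.y * scale };
    return out;
}

/* Fills verts (cols * rows entries, row-major) with the distorted positions
   of a regular grid. The texture coordinates of the rendered text would stay
   regular; only the vertex positions are bent. */
static void build_distorted_grid(Vec2 *verts, int cols, int rows, float k)
{
    assert(verts != NULL && cols >= 2 && rows >= 2);
    for (int j = 0; j < rows; ++j) {
        for (int i = 0; i < cols; ++i) {
            Vec2 p = { -1.0f + 2.0f * (float)i / (float)(cols - 1),
                       -1.0f + 2.0f * (float)j / (float)(rows - 1) };
            verts[j * cols + i] = radial_distort(p, k);
        }
    }
}
```

The text would first be rendered into a texture, which is then drawn onto this grid. A finer grid gives smoother curvature at the cost of more vertices.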
If you want it to run quickly without killing the CPU, the GPU is the best solution, so use pixel shaders. Pixel shaders can do all of these effects. I once made such an application. I wrote it in Silverlight, but that does not matter; the point is that I used a pixel shader.
I suggest writing this in Qt4 and adding pixel shader effects to the QWidget.
I ran into an issue while compiling some OpenGL code. The thing is that I want to achieve full-scene anti-aliasing and I don't know how. I turned on forced anti-aliasing from the Nvidia control panel, and that is the result I am really after. I do it now with GL_POLYGON_SMOOTH. Obviously it is neither efficient nor good-looking. Here are the questions:
1) Should I use multisampling?
2) Where in the pipeline does OpenGL blend the colors for anti-aliasing?
3) What alternatives exist besides GL_*_SMOOTH and multisampling?
GL_POLYGON_SMOOTH is not a method to do Full-screen AA (FSAA).
Not sure what you mean by "not efficient" in this context, but it certainly is not good looking, because of its tendency to blend in the middle of meshes (at the triangle edges).
Now, with respect to FSAA and your questions:
Multisampling (aka MSAA) is the standard way today to do FSAA. The usual alternative is super-sampling (SSAA), which consists of rendering at a higher resolution and downsampling at the end. It's much more expensive.
The specification says that logically, the GL keeps a sample buffer (4x the size of the pixel buffer, for 4xMSAA), and a pixel buffer (for a total of 5x the memory), and on each sample write to the sample buffer, updates the pixel buffer with the resolved value from the current 4 samples in the sample buffer. (It's not called blending, by the way. Blending is what happens at the time of the write into the sample buffer, controlled by glBlendFunc et al.) In practice, this is not what happens in hardware though. Typically, you write only to the sample buffer (and the hardware usually tries to compress the data), and when the time comes to use it, the GL implementation will resolve the full buffer at once, before the usage happens. This also helps if you actually use the sample buffer directly (no need to resolve at all, then).
I covered SSAA and its cost. The latest technique is called Morphological anti-aliasing (MLAA), and is actively being researched. The idea is to do a post-processing pass on the fully rendered image, and anti-alias what looks like sharp edges. Bottom line is, it's not implemented by the GL itself, you have to code it as a post-processing pass. I include it for reference, but it can cost quite a lot.
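As an illustration of the resolve step described for multisampling above, here is a minimal sketch that collapses the samples of one pixel into a single color. The plain average (box filter) is an assumption made for the example; the filter a given implementation really uses may differ, and `Rgba` and `resolve_pixel` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float r, g, b, a; } Rgba;

/* Resolves one pixel: averages its `count` samples (e.g. 4 for 4xMSAA)
   from the sample buffer into the single value stored in the pixel buffer. */
static Rgba resolve_pixel(const Rgba *samples, int count)
{
    assert(samples != NULL && count > 0);
    Rgba sum = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (int i = 0; i < count; ++i) {
        sum.r += samples[i].r;
        sum.g += samples[i].g;
        sum.b += samples[i].b;
        sum.a += samples[i].a;
    }
    float inv = 1.0f / (float)count;
    Rgba out = { sum.r * inv, sum.g * inv, sum.b * inv, sum.a * inv };
    return out;
}
```

A pixel on a triangle edge where three of four samples are covered by a white triangle over a black background resolves to 75% gray, which is the smoothed edge you see on screen.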
I wrote a post about this here: Getting smooth, big points in OpenGL
You have to specify WGL_SAMPLE_BUFFERS and WGL_SAMPLES (or GLX prefix for XOrg/GLX) before creating your OpenGL context, when selecting a pixel format or visual.
On Windows, make sure that you use wglChoosePixelFormatARB() if you want a pixel format with extended traits, NOT ChoosePixelFormat() from GDI/GDI+. wglChoosePixelFormatARB has to be queried with wglGetProcAddress from the ICD driver, so you need to create a dummy OpenGL context beforehand. WGL function pointers are valid even after the OpenGL context is destroyed.
WGL_SAMPLE_BUFFERS is a boolean (1 or 0) that toggles multisampling. WGL_SAMPLES is the number of samples per pixel you want, typically 2, 4 or 8.
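A possible attribute list for wglChoosePixelFormatARB could be built like this. It is a sketch under assumptions: the helper `build_msaa_attribs` is hypothetical, the color and depth sizes are example values, and the tokens are defined locally with their values from the WGL_ARB_pixel_format and WGL_ARB_multisample extensions (where they carry an _ARB suffix) so that the snippet compiles without wglext.h. In a real program you would include that header instead.

```c
#include <assert.h>
#include <stddef.h>

/* Tokens from WGL_ARB_pixel_format and WGL_ARB_multisample. */
#define WGL_DRAW_TO_WINDOW_ARB 0x2001
#define WGL_SUPPORT_OPENGL_ARB 0x2010
#define WGL_DOUBLE_BUFFER_ARB  0x2011
#define WGL_PIXEL_TYPE_ARB     0x2013
#define WGL_COLOR_BITS_ARB     0x2014
#define WGL_DEPTH_BITS_ARB     0x2022
#define WGL_TYPE_RGBA_ARB      0x202B
#define WGL_SAMPLE_BUFFERS_ARB 0x2041
#define WGL_SAMPLES_ARB        0x2042

/* 8 name/value pairs plus the terminating 0. */
enum { MSAA_ATTRIB_COUNT = 17 };

/* Fills `out` with a zero-terminated attribute list requesting a
   double-buffered RGBA window format with `samples` samples per pixel. */
static void build_msaa_attribs(int samples, int out[MSAA_ATTRIB_COUNT])
{
    assert(out != NULL && samples >= 2);
    const int attribs[MSAA_ATTRIB_COUNT] = {
        WGL_DRAW_TO_WINDOW_ARB, 1,
        WGL_SUPPORT_OPENGL_ARB, 1,
        WGL_DOUBLE_BUFFER_ARB,  1,
        WGL_PIXEL_TYPE_ARB,     WGL_TYPE_RGBA_ARB,
        WGL_COLOR_BITS_ARB,     32,      /* example value */
        WGL_DEPTH_BITS_ARB,     24,      /* example value */
        WGL_SAMPLE_BUFFERS_ARB, 1,       /* turn multisampling on */
        WGL_SAMPLES_ARB,        samples, /* samples per pixel     */
        0                                /* list terminator       */
    };
    for (int i = 0; i < MSAA_ATTRIB_COUNT; ++i)
        out[i] = attribs[i];
}
```

The resulting array would be passed as the integer attribute list to wglChoosePixelFormatARB, once that function pointer has been obtained through the dummy context described above. If no format is returned for the requested sample count, retrying with a lower count is a common fallback.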