In a deferred shading framework, I am using different framebufer objects to perform various render passes. In the first pass I write the DEPTH_STENCIL_ATTACHMENT for the whole scene to a texture, let's call it DepthStencilTexture.
To access the depth information stored in DepthStencilTexture from different render passes, for which I use different framebuffer objects, I know two ways:
1) I bind the DepthStencilTexture to the shader and I access it in the fragment shader, where I do the depth manually, like this
uniform vec2 WinSize; //windows dimensions
vec2 uv=gl_FragCoord.st/WinSize;
float depth=texture(DepthStencilTexture ,uv).r;
if(gl_FragCoord.z>depth) discard;
I also set glDisable(GL_DEPTH_TEST) and glDepthMask(GL_FALSE)
2) I bind the DepthStencilTexture to the framebuffer object as DEPTH_STENCIL_ATTACHMENT and set glEnable(GL_DEPTH_TEST) and glDepthMask(GL_FALSE) (edit: in this case I won't bind the DepthStencilTexture to the shader, to avoid loop feedback, see the answer by Nicol Bolas, and I if I need the depth in the fragment shader I will use gl_FragCorrd.z)
In certain situations, such as drawing light volumes, for which I need the Stencil Test and writing to the stencil buffer, I am going for the solution 2).
In other situations, in which I completely ignore the Stencil, and just need the depth stored in the DepthStencilTexture, does option 1) gives any advantages over the more "natural" option 2) ?
For example I have a (silly, I think) doubt about it . Sometimes in my fragment shaders Icompute the WorldPosition from the depth. In the case 1) it would be like this
uniform mat4 invPV; //inverse PV matrix
vec2 uv=gl_FragCoord.st/WinSize;
vec4 WorldPosition=invPV*vec4(uv, texture(DepthStencilTexture ,uv).r ,1.0f );
WorldPosition=WorldPosition/WorldPosition.w;
In the case 2) it would be like this (edit: this is wrong, gl_FragCoord.z is the current fragment's depth, not the actual depth stored in the texture)
uniform mat4 invPV; //inverse PV matrix
vec2 uv=gl_FragCoord.st/WinSize;
vec4 WorldPosition=invPV*vec4(uv, gl_FragCoord.z, 1.0f );
WorldPosition=WorldPosition/WorldPosition.w;
I am assuming that gl_FragCoord.z in case 2) will be the same as texture(DepthStencilTexture ,uv).r in case 1), or, in other words, the depth stored in the the DepthStencilTexture. Is it true? Is gl_FragCoord.z read from the currently bound DEPTH_STENCIL_ATTACHMENT also with glDisable(GL_DEPTH_TEST) and glDepthMask(GL_FALSE) ?
Going strictly by the OpenGL specification, option 2 is not allowed. Not if you're also reading from that texture.
Yes, I realize you're using write masks to prevent depth writes. It doesn't matter; the OpenGL specification is quite clear. In accord with 9.3.1 of OpenGL 4.4, a feedback loop is established when:
an image from texture object T is attached to the currently bound draw framebuffer object at attachment point A
the texture object T is currently bound to a texture unit U, and
the current programmable vertex and/or fragment processing state makes it
possible (see below) to sample from the texture object T bound to texture
unit U
That is the case in your code. So you technically have undefined behavior.
One reason this is undefined is so that simply changing write masks won't have to do things like clearing framebuffer and/or texture caches.
That being said, you can get away with option 2 if you employ NV_texture_barrier. Which, despite the name, is quite widely available on AMD hardware. The main thing to do here is to issue a barrier after you do all of your depth writing, so that all subsequent reads are guaranteed to work. The barrier will do all of the cache clearing and such you need.
Otherwise, option 1 is the only choice: doing the depth test manually.
I am assuming that gl_FragCoord.z in case 2) will be the same as texture(DepthStencilTexture ,uv).r in case 1), or, in other words, the depth stored in the the DepthStencilTexture. Is it true?
Neither is true. gl_FragCoord is the coordinate of the fragment being processed. This is the fragment generated by the rasterizer, based on the data for the primitive being rasterized. It has nothing to do with the contents of the framebuffer.
Related
I have a functioning OpenGL ES 3 program (iOS), but I've having a difficult time understanding OpenGL textures. I'm trying to render several quads to the screen, all with different textures. The textures are all 256 color images with a sperate palette.
This is C++ code that sends the textures to the shaders
// THIS CODE WORKS, BUT I'M NOT SURE WHY
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, _renderQueue[idx]->TextureId);
glUniform1i(_glShaderTexture, 1); // what does the 1 mean here
glActiveTexture(GL_TEXTURE2);
glBindTexture(GL_TEXTURE_2D, _renderQueue[idx]->PaletteId);
glUniform1i(_glShaderPalette, 2); // what does the 2 mean here?
glDrawElements(GL_TRIANGLES, sizeof(Indices)/sizeof(Indices[0]), GL_UNSIGNED_BYTE, 0);
This is the fragment shader
uniform sampler2D texture; // New
uniform sampler2D palette; // A palette of 256 colors
varying highp vec2 texCoordOut;
void main()
{
highp vec4 palIndex = texture2D(texture, texCoordOut);
gl_FragColor = texture2D(palette, palIndex.xy);
}
As I said, the code works, but I'm unsure WHY it works. Several seemingly minor changes break it. For example, using GL_TEXTURE0, and GL_TEXTURE1 in the C++ code breaks it. Changing the numbers in glUniform1i to 0, and 1 break it. I'm guessing I do not understand something about texturing in OpenGL 3+ (maybe Texture Units???), but need some guidance to figure out what.
Since it's often confusing to newer OpenGL programmers, I'll try to explain the concept of texture units on a very basic level. It's not a complex concept once you pick up on the terminology.
The whole thing is motivated by offering the possibility of sampling multiple textures in shaders. Since OpenGL traditionally operates on objects that are bound with glBind*() calls, this means that an option to bind multiple textures is needed. Therefore, the concept of having one bound texture was extended to having a table of bound textures. What OpenGL calls a texture unit is an entry in this table, designated by an index.
If you wanted to describe this state in a C/C++ style notation, you could define the table of bound texture as an array of texture ids, where the size is the maximum number of bound textures supported by the implementation (queried with glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS, ...)):
GLuint BoundTextureIds[MAX_TEXTURE_UNITS];
If you bind a texture, it gets bound to the currently active texture unit. This means that the last call to glActiveTexture() determines which entry in the table of bound textures is modified. In a typical call sequence, which binds a texture to texture unit i:
glActiveTexture(GL_TEXTUREi);
glBindTexture(GL_TEXTURE_2D, texId);
this would correspond to modifying our imaginary data structure by:
BoundTextureIds[i] = texId;
That covers the setup. Now, the shaders can access all the textures in this table. Variables of type sampler2D are used to access textures in the GLSL code. To determine which texture each sampler2D variable accesses, we need to specify which table entry each one uses. This is done by setting the uniform value to the table index:
glUniform1i(samplerLoc, i);
specifies that the sampler uniform at location samplerLoc reads from table entry i, meaning that it samples the texture with id BoundTextureIds[i].
In the specific case of the question, the first texture was bound to texture unit 1 because glActiveTexture(GL_TEXTURE1) was called before glBindTexture(). To access this texture from the shader, the shader uniform needs to be set to 1 as well. Same thing for the second texture, with texture unit 2.
(The description above was slightly simplified because it did not take into account different texture targets. In reality, textures with different targets, e.g. GL_TEXTURE_2D and GL_TEXTURE_3D, can be bound to the same texture unit.)
GL_TEXTURE1 and GL_TEXTURE2 refer to texture units. glUniform1i takes a texture unit id for the second argument for samplers. This is why they are 1 and 2.
From the OpenGL website:
The value of a sampler uniform in a program is not a texture object,
but a texture image unit index. So you set the texture unit index for
each sampler in a program.
I am starting to learn OpenGL (3.3+), and now I am trying to do an algorithm that draws 10000 points randomly in the screen.
The problem is that I don't know exactly where to do the algorithm. Since they are random, I can't declare them on a VBO (or can I?), so I was thinking in passing a uniform value to the vertex shader with the varying position (I would do a loop changing the uniform value). Then I would do the operation 10000 times. I would also pass a random color value to the shader.
Here is kind of my though:
#version 330 core
uniform vec3 random_position;
uniform vec3 random_color;
out vec3 Color;
void main() {
gl_Position = random_position;
Color = random_color;
}
In this way I would do the calculations outside the shaders, and just pass them through the uniforms, but I think a better way would be doing this calculations inside the vertex shader. Would that be right?
The vertex shader will be called for every vertex you pass to the vertex shader stage. The uniforms are the same for each of these calls. Hence you shouldn't pass the vertices - be they random or not - as uniforms. If you would have global transformations (i.e. a camera rotation, a model matrix, etc.), those would go into the uniforms.
Your vertices should be passed as a vertex buffer object. Just generate them randomly in your host application and draw them. The will be automatically the in variables of your shader.
You can change the array in every iteration, however it might be a good idea to keep the size constant. For this it's sometimes useful to pass a 3D-vector with 4 dimensions, one being 1 if the vertex is used and 0 otherwise. This way you can simply check if a vertex should be drawn or not.
Then just clear the GL_COLOR_BUFFER_BIT and draw the arrays before updating the screen.
In your shader just set gl_Position with your in variables (i.e. the vertices) and pass the color on to the fragment shader - it will not be applied in the vertex shader yet.
In the fragment shader the last set variable will be the color. So just use the variable you passed from the vertex shader and e.g. gl_FragColor.
By the way, if you draw something as GL_POINTS it will result in little squares. There are lots of tricks to make them actually round, the easiest to use is probably to use this simple if in the fragment shader. However you should configure them as Point Sprites (glEnable(GL_POINT_SPRITE)) then.
if(dot(gl_PointCoord - vec2(0.5,0.5), gl_PointCoord - vec2(0.5,0.5)) > 0.25)
discard;
I suggest you to read up a little on what the fragment and vertex shader do, what vertices and fragments are and what their respective in/out/uniform variables represent.
Since programs with full vertex buffer objects, shader programs etc. get quite huge, you can also start out with glBegin() and glEnd() to draw vertices directly. However this should only be a very early starting point to understand what you are drawing where and how the different shaders affect it.
The lighthouse3d tutorials (http://www.lighthouse3d.com/tutorials/) usually are a good start, though they might be a bit outdated. Also a good reference is the glsl wiki (http://www.opengl.org/wiki/Vertex_Shader) which is up to date in most cases - but it might be a bit technical.
Whether or not you are working with C++, Java, or other languages - the concepts for OpenGL are usually the same, so almost all tutorials will do well.
The following is an excerpt from GLSL spec:
"Texture lookup functions are available in all shading stages. However, automatic level of detail is computed only for fragment shaders. Other shaders operate as though the base level of detail were computed as zero."
So this is how I see it:
Vertex shader:
vec4 texel = texture(SamplerObj, texCoord);
// since this is vertex shader, sampling will always take place
// from 0th Mipmap level of the texture.
Fragment shader:
vec4 texel = texture(SamplerObj, texCoord);
// since this is fragment shader, sampling will take place
// from Nth Mipmap level of the texture, where N is decided
// based on the distance of object on which texture is applied from camera.
Is my understanding correct?
That sounds right. You can specify an explicit LOD by using textureLod() instead of texture() in the vertex shader.
I believe you could also make it use a higher LOD by setting the GL_TEXTURE_MIN_LOD parameter on the texture. If you call e.g.:
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_LOD, 2.0f);
while the texture is bound, it should use mipmap level 2 when you sample the texture in the vertex shader. I have never tried this, but this is my understanding of how the behavior is defined.
// since this is fragment shader, sampling will take place
// from Nth Mipmap level of the texture, where N is decided
// based on the distance of object on which texture is applied from camera.
I think the bit about the distance isn't correct. The mipmap level to use is determined using the derivation of the texture coordinates for the neighbouring pixels. The sampler hardware can determine this because the generated code for the fragment shader typically uses SIMD instructions and generates values for multiple pixels simultaneously. For example, on Intel hardware a single thread usually operates on a 4x4 grid of pixels. That means that whenever a message is sent to the sampler hardware it is given a set 16 of texture coordinates and 16 texels are expected in reply. The sampler hardware can determine the derivation by looking at the difference between those 16 texture coordinates. That is probably why further down in the GLSL spec it says:
Implicit derivatives are undefined within non-uniform control flow and for non-fragment-shader texture fetches.
Non-uniform control flow would mess up the implicit derivatives because potentially not all of the fragments being processed in the thread would be sampling at the same time.
Let's say I have this scene
And I want to add depth information from a custom made fragment shader.
Now the intuitive thing to do would be to draw a quad over my teapot without depth test enabled but with glDepthMask( 1 ) and glColorMask( 0, 0, 0, 0 ). Write some fragments gl_FragDepth and discard some other fragments.
if ( gl_FragCoord.x < 100 )
gl_FragDepth = 0.1;
else
discard;
For some reason, on a NVidia Quadro 600 and K5000 it works as expected but on a NVidia K3000M and a Firepro(dont't remember which one), all the area covered by my discarded fragments is given the depth value of the quad.
Can't I leave the discarded fragments depth values unmodified?
EDIT I have found a solution to my problem. It turns out that as Andon M. Coleman and Matt Fishman pointed out, I have early_fragment_test enabled but not because I enabled it, but because I use imageStore, imageLoad.
With the little time I had to address the problem, I simply copied the content of my current depth buffer just before the "add depth pass" to a texture. Assigned it to a uniform sampler2D. And this is the code in my shader:
if ( gl_FragCoord.x < 100 )
gl_FragDepth = 0.1;
else
{
gl_FragDepth = texture( depthTex, gl_PointCoord ).r;
color = vec4(0.0,0.0,0.0,0.0);
return;
}
This writes a completely transparent pixel with an unchanged z value.
Well, it sounds like a driver bug to me -- discarded fragments should not hit the depth buffer. You could bind the original depth buffer as a texture, sample it using the gl_FragCoord, and then write the result back instead of using discard. That would add an extra texture lookup -- but it might be a suitable workaround.
EDIT: From section 6.4 of the GLSL 4.40 Specification:
The discard keyword is only allowed within fragment shaders. It can be
used within a fragment shader to abandon the operation on the current
fragment. This keyword causes the fragment to be discarded and no
updates to any buffers will occur. Control flow exits the shader, and
subsequent implicit or explicit
derivatives are undefined when this exit is non-uniform. It would
typically be used within a conditional statement, for example:
if (intensity < 0.0) discard;
A fragment shader may test a fragment’s
alpha value and discard the fragment based on that test. However, it
should be noted that coverage testing occurs after the fragment shader
runs, and the coverage test can change the alpha value.if
(emphasis mine)
Posting a separate answer, because I found some new info in the OpenGL spec:
If early fragment tests are enabled, any depth value computed by the
fragment shader has no effect. Additionally, the depth buffer, stencil
buffer, and occlusion query sample counts may be updated even for
fragments or samples that would be discarded after fragment shader
execution due to per-fragment operations such as alpha-to-coverage or
alpha tests.
Do you have early fragment testing enabled? More info: Early Fragment Test
I would like to know if GL_TEXTURE_2D is active in the shader.
I am binding a color to the shader as well as the active texture (if GL_TEXTURE_2D is set) and need to combine these two.
So if texture is bound, mix the color and the texture (sampler2D * color) and if no texture is bound, use color.
Or should I go another way about this?
It is not quite clear what you mean by 'GL_TEXTURE_2D is active' or 'GL_TEXTURE_2D is set'.
Please note the following:
glEnable(GL_TEXTURE_2D) has no effect on your (fragment) shader. It parametrizes the fixed function part of your pipeline that you just replaced by using a fragment shader.
There is no 'direct'/'clean' way of telling from inside the GLSL shader whether there is a valid texture bound to the texture unit associated with your texture sampler (to my knowledge).
Starting with GLSL 1.3 you might have luck using textureSize(sampler, 0).x > 0 to detect the presence of a valid texture associated with sampler, but that might result in undefined behavior.
The ARB_texture_query_levels extension does indeed explicitly state that textureQueryLevels(gsampler2D sampler) returns 0 if there is no texture associated with sampler.
Should you go another way about this? I think so: Instead of making a decision inside the shader, simply bind a 1x1 pixel texture of 'white' and unconditionally sample that texture and multiply the result with color, which will obviously return 1.0 * color. That is going to be more portable and faster, too.