I have programmed the following shader for testing how linear filtering works in OpenGL.
Here we have a 5x1 texture splatted onto a face of a cube (the magenta region is just the color of the background).
The texture is this one (it's very small).
The bottom-left corner corresponds to uv=(0, 0) and the top-right corresponds to uv=(1, 1).
Linear filtering is enabled.
The shader splits the v coordinate vertically into 5 rows (from top to bottom):
1. Continuous sampling. Just sample normally.
2. Green if u is in [0, 1], red otherwise. Just for testing purposes.
3. The u coordinate in grayscale.
4. Sampling at the left of the texel.
5. Sampling at the center of the texel.
The problem is that between rows 3 and 4 there is a one-pixel row that flickers. The flickering changes with the camera distance, and sometimes you can even make it disappear. The problem seems to be in the shader code that handles the fourth row.
// sample at the left of the pixel
// the following line can fix the problem if I add any number different from 0
tc.y += 0.000000; // replace by any number other than 0 and works fine
tc.x = floor(5 * tc.x) * 0.2;
c = texture(tex0, tc);
This looks weird to me because in that zone the v coordinate is not near any edge of the texture.
Your code relies on undefined values during the texture fetch.
The GLSL 4.60 specification states in Section 8.9 Texture Functions (emphasis mine):
Some texture functions (non-“Lod” and non-“Grad” versions) may require
implicit derivatives. Implicit derivatives are undefined within
non-uniform control flow and for non-fragment-shader texture fetches.
While most people think that those derivatives are only required for mip-mapping, that is not correct. The LOD factor is also needed to determine if the texture is magnified or minified (and also for anisotropic filtering in the non-mipmapped case, but that is not of interest here).
GPUs usually approximate the derivatives by finite differencing between neighboring pixels in a 2x2 pixel quad.
What's happening is that at the edge between your various options, you have non-uniform control flow where for one line you do the texture filtering, and on the line above, you don't do it. The finite differencing will result in trying to access the texture coords for the texture sampling operation in the upper row, which aren't guaranteed to have been calculated at all, since that shader invocation did not actively execute that code path - this is why the spec treats them as undefined.
Now, depending on where in the 2x2 pixel quad your edge lies, you either get correct results or you don't. In the cases where you don't, one possible outcome is that the GL uses the minification filter, which is GL_NEAREST in your example.
It would probably help to just set both filters to GL_LINEAR. However, that would still not be correct code, as the results are still undefined as per the spec.
The only correct solution would be to move the texture sampling out of the non-uniform control flow, like
vec4 c1=texture(tex, tc); // sample directly at tc
vec4 c2=texture(tex, some_function_of(tc)); // sample somewhere else
vec4 c3=texture(tex, ...);
// select output color in some non-uniform way
if (foo) {
    c = c1;
} else if (bar) {
    c = c2;
} else {
    c = c3;
}
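Applied to the five-row test shader from the question, the same idea could look roughly like the sketch below. The row layout is taken from the question's description; the varying name `uv`, the output name `fragColor`, and the way the row index is derived are assumptions, since the full original shader was not posted.

```glsl
#version 330 core

uniform sampler2D tex0;
in vec2 uv;            // assumed name of the interpolated texture coordinate
out vec4 fragColor;    // assumed output name

void main() {
    // Index of the texel column (0..4) for the 5x1 texture.
    float texel = floor(5.0 * uv.x);

    // All texture() calls happen here, in uniform control flow, so the
    // implicit derivatives are defined for every invocation in a 2x2 quad.
    vec4 cContinuous = texture(tex0, uv);
    vec4 cLeft       = texture(tex0, vec2(texel * 0.2, uv.y));
    vec4 cCenter     = texture(tex0, vec2((texel + 0.5) * 0.2, uv.y));

    // Row 0 is the top fifth of the face, row 4 the bottom fifth (assumed mapping).
    int row = int(floor((1.0 - uv.y) * 5.0));

    // Only the selection is non-uniform; no texture fetch happens inside a branch.
    if (row == 0) {
        fragColor = cContinuous;
    } else if (row == 1) {
        bool inside = uv.x >= 0.0 && uv.x <= 1.0;
        fragColor = inside ? vec4(0.0, 1.0, 0.0, 1.0) : vec4(1.0, 0.0, 0.0, 1.0);
    } else if (row == 2) {
        fragColor = vec4(vec3(uv.x), 1.0);
    } else if (row == 3) {
        fragColor = cLeft;
    } else {
        fragColor = cCenter;
    }
}
```

The cost is three fetches per fragment instead of one, which is usually acceptable for a test shader like this.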
I'm trying to draw a rectangle with a texture in OpenGL. I'm simply trying to render an entire .jpg image, so I specify the texture coordinates as [0, 0] to [1, 1] in the vertex buffer. I expect all the interpolated texture coordinates in the fragment shader to be between [0, 0] and [1, 1], however, depending on where the texture is drawn, I sometimes get a texture coordinate that is less than 0 (I know this is the case because I tried outputting red from the fragment shader if the tex coord is less than 0).
How come I get an interpolated value outside of the specified range? I currently visualize vertices/fragments like the following image (https://learnopengl.com/Advanced-OpenGL/Anti-Aliasing):
If I imagine a rectangle instead, then if the pixel sample is inside the rectangle, then the interpolated texture coord must be at least 0, since the very left of the rectangle represents 0, right? So how do I end up with a value less than 0?
Edit: after some basic testing, it looks like the fragment shader is called if a shape simply intersects that pixel, not if the pixel sample point is inside the shape. I tested this by placing the start of the rectangle slightly before and slightly after the middle of a pixel - when slightly behind the middle of the pixel, I don't get a negative value, but if I place it slightly after the middle, then I do get a negative value. This contradicts what the website I linked to said - perhaps it's driver-dependent?
Edit: the previous test I did was with multisampling on. If I turn multisampling off, then even if the shape is past the middle, I don't get a negative value...
Turns out I just needed to keep reading the article I linked:
This is where multisampling becomes interesting. We determined that 2 subsamples were covered by the triangle so the next step is to determine a color for this specific pixel. Our initial guess would be that we run the fragment shader for each covered subsample and later average the colors of each subsample per pixel. In this case we'd run the fragment shader twice on the interpolated vertex data at each subsample and store the resulting color in those sample points. This is (fortunately) not how it works, because this basically means we need to run a lot more fragment shaders than without multisampling, drastically reducing performance.
How MSAA really works is that the fragment shader is only run once per pixel (for each primitive) regardless of how many subsamples the triangle covers. The fragment shader is run with the vertex data interpolated to the center of the pixel and the resulting color is then stored inside each of the covered subsamples. Once the color buffer's subsamples are filled with all the colors of the primitives we've rendered, all these colors are then averaged per pixel resulting in a single color per pixel. Because only two of the 4 samples were covered in the previous image, the color of the pixel was averaged with the triangle's color and the color stored at the other 2 sample points (in this case: the clear color) resulting in a light blue-ish color.
So I was getting a negative value because the fragment shader was being run on a pixel that had at least one of its sub-sample points covered by the shape, but the shape was slightly after the mid-point of the pixel, and since "the fragment shader is run with the vertex data interpolated to the center of the pixel", I was getting a negative value.
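If the out-of-range coordinate is a problem in practice, one option that is not mentioned in the article excerpt above is GLSL's `centroid` interpolation qualifier, which asks for the varying to be interpolated at a location that lies inside both the pixel and the primitive. A minimal sketch, assuming the varying is called `uv` and the sampler `tex0` (both hypothetical names):

```glsl
#version 330 core

uniform sampler2D tex0;   // hypothetical sampler name

// Declare the matching vertex shader output as `centroid out vec2 uv;`.
// With multisampling, a partially covered pixel is then interpolated at a
// covered location instead of at the (possibly uncovered) pixel center.
centroid in vec2 uv;

out vec4 fragColor;

void main() {
    // Sample first, in uniform control flow, then pick the debug color.
    vec4 texel = texture(tex0, uv);
    bool outside = uv.x < 0.0 || uv.y < 0.0;
    fragColor = outside ? vec4(1.0, 0.0, 0.0, 1.0) : texel;
}
```

One caveat: derivatives of centroid-interpolated varyings can be less accurate at primitive edges, so this is a trade-off rather than a free fix.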
TL;DR I'm computing a depth map in a fragment shader and then trying to use that map in a vertex shader to see if vertices are 'in view' or not and the vertices don't line up with the fragment texel coordinates. The imprecision causes rendering artifacts, and I'm seeking alternatives for filtering vertices based on depth.
Background. I am very loosely attempting to implement a scheme outlined in this paper (http://dash.harvard.edu/handle/1/4138746). The idea is to represent arbitrary virtual objects as lots of tangent discs. While they wanted to replace triangles in some graphics card of the future, I'm implementing this on conventional cards; my discs are just fans of triangles ("Discs") around center points ("Points").
This is targeting WebGL.
The strategy I intend to use, similar to what's done in the paper, is:
Render the Discs in a Depth-Only pass.
In a second (or more) pass, compute what's visible based solely on which Points are "visible" - ie their depth is <= the depth from the Depth-Only pass at that x and y.
I believe the authors of the paper used a gaussian blur on top of the equivalent of a GL_POINTS render applied to the Points (ie re-using the depth buffer from the DepthOnly pass, not clearing it) to actually render their object. It's hard to say: the process is unfortunately a one line comment, and I'm unsure of how to duplicate it in WebGL anyway (a naive gaussian blur will just blur in the background pixels that weren't touched by the GL_POINTS call).
Instead, I'm hoping to do something slightly different, by rerendering the discs in a second pass instead as cones (center of disc becomes apex of cone, think "close the umbrella") and effectively computing a voronoi diagram on the surface of the object (ala redbook http://www.glprogramming.com/red/chapter14.html#name19). The idea is that an output pixel is the color value of the first disc to reach it when growing radiuses from 0 -> their natural size.
The crux of the problem is that only discs whose centers pass the depth test in the first pass should be allowed to carry on (as cones) to the 2nd pass. Because what's true at the disc center applies to the whole disc/cone, I believe this requires evaluating a depth test at a vertex or object level, and not at a fragment level.
Since WebGL support for accessing depth buffers is still poor, in my first pass I am packing depth info into an RGBA Framebuffer in a fragment shader. I then intended to use this in the vertex shader of the second pass via a sampler2D; any disc center that was closer than the relative texture2D() lookup would be allowed on to the second pass; otherwise I would hack "discarding" the vertex (its alpha would be set to 0 or some flag set that would cause discard of fragments associated with the disc/cone or etc).
This actually kind of worked but it caused horrendous z-fighting between discs that were close together (very small perturbations wildly changed which discs were visible). I believe there is some floating point error between depth->rgba->depth. More importantly, though, the depth texture is being set by fragment texel coords, but I'm looking up vertices, which almost certainly don't line up exactly on top of relevant texel coordinates; so I get depth +/- noise, essentially, and the noise is the issue. Adding or subtracting .000001 or something isn't sufficient: you trade Type I errors for Type II. My render became more accurate when I switched from NEAREST to LINEAR for the depth texture interpolation, but it still wasn't good enough.
How else can I determine which disc's centers would be visible in a given render, so that I can do a second vertex/fragment (or more) pass focused on objects associated with those points? Or: is there a better way to go about this in general?
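One part of the mismatch described above, looking up the depth texture at a position that does not sit on a texel center, can be reduced by snapping the lookup coordinate before sampling. The vertex shader below is only a sketch of that idea for WebGL 1 (GLSL ES 1.00). All attribute and uniform names are hypothetical, the unpacking assumes the first pass stored `gl_FragCoord.z` with a matching base-256 packing, and the bias is a tuning parameter, not a guaranteed fix for the z-fighting.

```glsl
// WebGL 1 / GLSL ES 1.00 vertex shader (requires vertex texture fetch support)
attribute vec3 aCenter;       // disc center (hypothetical)
attribute vec3 aPosition;     // cone vertex (hypothetical)

uniform mat4 uMvp;            // same matrix as in the depth-only pass (assumed)
uniform sampler2D uDepthTex;  // RGBA-packed depth from the first pass
uniform vec2 uDepthTexSize;   // depth texture size in texels
uniform float uDepthBias;     // small tolerance, to be tuned

varying float vVisible;       // 1.0 if the disc center passed the depth test

// Assumed inverse of the first pass's packing (base-256, most significant in .a).
float unpackDepth(vec4 rgba) {
    return dot(rgba, vec4(1.0 / 16777216.0, 1.0 / 65536.0, 1.0 / 256.0, 1.0));
}

void main() {
    // Project the disc center the same way the rasterizer would.
    vec4 clip = uMvp * vec4(aCenter, 1.0);
    vec3 ndc = clip.xyz / clip.w;
    vec2 uv = ndc.xy * 0.5 + 0.5;
    float centerDepth = ndc.z * 0.5 + 0.5;   // matches the default depth range

    // Snap to the center of the texel that contains uv, so that the lookup
    // returns exactly one stored value regardless of the filter mode.
    vec2 snapped = (floor(uv * uDepthTexSize) + 0.5) / uDepthTexSize;
    float storedDepth = unpackDepth(texture2D(uDepthTex, snapped));

    vVisible = (centerDepth <= storedDepth + uDepthBias) ? 1.0 : 0.0;
    gl_Position = uMvp * vec4(aPosition, 1.0);
}
```

The fragment shader of the second pass could then `discard` when `vVisible` is below 0.5, which corresponds to the "hack" described in the question.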
I've been trying to utilize the techniques in Eric Penner's "Shader Amortization using Pixel Quad Message Passing" from GPU Pro 2, Chapter VI.2. The basic idea is that modern GPUs process fragment shaders in 2x2 fragment quads, and you can use ddx() and ddy() to get the value of some_var at all four fragments as long as the following hold:
Your GPU supports high-quality derivatives
You know which fragment you're processing (top-left, top-right, bottom-left, bottom-right)
This opens up a lot of opportunities for fragment shader optimization (like distributing texture fetches over a 2x2 pixel quad) that you'd need Compute Shaders to beat.
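For readers unfamiliar with the trick, here is a rough GLSL sketch of the idea, not Penner's actual code. It uses `dFdxFine()`/`dFdyFine()` (GLSL 4.50) as the counterparts of high-quality `ddx()`/`ddy()`, and it assumes that quads start on even pixel coordinates. The names `tex0`, `uv`, and `fragColor` are placeholders.

```glsl
#version 450 core

uniform sampler2D tex0;   // placeholder sampler
in vec2 uv;               // placeholder varying
out vec4 fragColor;

void main() {
    // Position inside the 2x2 quad: 0.0 = left/bottom, 1.0 = right/top.
    // ASSUMPTION: quads start on even pixel coordinates.
    vec2 quadPos = mod(floor(gl_FragCoord.xy), 2.0);

    // +1.0 if the neighbor lies in the positive direction, -1.0 otherwise.
    vec2 dir = 1.0 - 2.0 * quadPos;

    // Each fragment does one fetch of its own.
    vec4 mine = texture(tex0, uv);

    // Fine derivatives are (right - left) and (top - bottom) within the quad,
    // so adding or subtracting them reconstructs the neighbor's value.
    vec4 horizontal = mine + dir.x * dFdxFine(mine);
    vec4 vertical   = mine + dir.y * dFdyFine(mine);
    vec4 diagonal   = vertical + dir.x * dFdxFine(vertical);

    // Example use: average the four values fetched across the quad.
    fragColor = 0.25 * (mine + horizontal + vertical + diagonal);
}
```

If the assumption about where quads start is wrong, `dir` points the wrong way and the reconstructed values are garbage, which is why the fragment's position in the quad matters so much.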
My problem is this:
I can't deterministically detect which fragment I'm processing. Ideally, each fragment block would start at even-numbered output pixel coords like (0, 0), (2, 0), ... (1024, 1024), ..., so you'd just need to check whether the output pixel x and y coords are even or odd to know which fragment you're currently processing. The method Penner uses in the book assumes this works...but it seems to be going wrong for me.
Unfortunately, my 2x2 fragment quads appear to be starting in nondeterministic places: I've seen them start at (even, even), (even, odd), and (odd, even). I can't remember if I've seen (odd, odd) or not, but anyway, the arrangement seems to depend on a myriad of factors I don't understand, including the output resolution and shader specifics. (I'm testing on an 8800 GTS, in case anyone's wondering.)
Does anyone know what might be causing this nondeterminism or have any documentation on it? I understand there's virtually no official standardization in this area, but I'm more interested in how things work in practice on modern desktop-level GPUs, and I'm hoping there's a way to get this technique to work. If no one knows how to reason about the even/odd start behavior, does anyone know any other way of determining the current fragment's relative location in its 2x2 quad?
Thanks :)
As it turns out, the premise of my question was mostly wrong:
The 2x2 fragment quads DO almost always start on even pixel numbers...as long as the output resolution is even-numbered.
If the output resolution is odd-numbered (a possibility with the underlying program I'm working with), things can get more complicated, for obvious reasons. I don't expect there's any uniformity here across drivers/GPUs/etc. either, but my current tests (which themselves may still be buggy) appear to demonstrate 2x2 pixel quads starting at an odd pixel along the dimension with odd resolution, at least when the odd dimension is horizontal.
All of this weirdness helped obscure my bigger issue: The code I used to detect the fragment's location in the pixel quad was buggy. I tested by setting the texture coordinates equal within a pixel quad (set to the pixel quad center)...or so I thought. However, I calculated the screen coordinates based on a full-screen quad where the uv mapping has the +v axis pointing downward. The screenspace origin starts at the bottom-left, because it's based on the top-right quadrant of Cartesian coordinates, and I accidentally forgot to invert the v-coordinate of the uv offset I used to find the pixel quad center. Many of my nondeterministic observations came from failing to check my assumptions while debugging and misinterpreting things as a result, particularly in combination with odd resolutions.
This was an embarrassing mistake I should have caught a lot sooner, but I figured I'd detail it as a warning to others to always double-check the direction of your vertical axis when you're dealing with opposite-facing coordinate frames. ;)
UPDATE:
I ran across a situation where 2x2 pixel quads started on even pixel numbers even when the resolution was odd. Thanks to the nondeterminism under odd resolutions, I had to work out another solution:
1. If you're deriving your screen pixel numbers from the uv coords of a fullscreen quad (for post-processing), the fragment location derived from this is only useful for arranging/placing shared samples between fragments, etc., not for the quad-pixel communication itself. You'll need to have screen pixel numbers with respect to the screenspace origin for that. You can derive these from vertex positions, or you can use ddx().x and ddy().y on the uv-based pixel numbers to find out their screen direction and mirror the fragment position in the appropriate direction from there.
2. Calculate the fragment location based on your screen pixel numbers (with respect to the true screenspace origin) and the assumption 2x2 pixel quads start on even pixels. (If you used uv-based pixel numbers, now is the time to mirror things.)
3. Do a ddx().x and ddy().y on the fragment location, and if they're negative in either direction, you know the pixel quad starts at an odd pixel number in that direction...so mirror in that direction.
4. If you calculate two fragment positions, one based on a uv origin and one based on a screen origin, use the uv-based one for reasoning about uv-based sample placement, and use the screen-based one for actually obtaining the values of a variable at neighboring fragments.
5. Profit.
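The detect-and-mirror part of the steps above (assume even starts, then check the sign of the derivative) might be sketched in GLSL as follows. This is an illustration rather than the released code mentioned below; it uses `gl_FragCoord` for the screen-based pixel numbers and `dFdxFine()`/`dFdyFine()` (GLSL 4.50) in place of `ddx()`/`ddy()`.

```glsl
#version 450 core

out vec4 fragColor;

void main() {
    // Guess the position in the quad, assuming quads start on even pixels:
    // 0.0 = left/bottom fragment, 1.0 = right/top fragment.
    vec2 quadPos = mod(floor(gl_FragCoord.xy), 2.0);

    // Within a quad the fine derivative is (right - left) or (top - bottom).
    // If the guess is right this is +1.0; if the quad actually starts on an
    // odd pixel in that direction, it comes out as -1.0.
    float slopeX = dFdxFine(quadPos.x);
    float slopeY = dFdyFine(quadPos.y);

    // Mirror the guess in whichever direction turned out to be wrong.
    if (slopeX < 0.0) quadPos.x = 1.0 - quadPos.x;
    if (slopeY < 0.0) quadPos.y = 1.0 - quadPos.y;

    // Debug output: red = right column of the quad, green = top row.
    fragColor = vec4(quadPos, 0.0, 1.0);
}
```

After the mirroring, `quadPos` is relative to the quad itself rather than to even/odd screen coordinates, which is what the neighbor reconstruction needs.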
I'll post a link to my working MIT-licensed code once I release it on Github, along with usage examples (the speedup is unfortunately not what I expected, but whatever ;)). I'm just waiting to get done with a larger shader I'll be uploading along with it.
I'm building a LIDAR simulator in OpenGL. This means that the fragment shader returns the length of the light vector (the distance) in place of one of the color channels, normalized by the distance to the far plane (so it'll be between 0 and 1). In other words, I use red to indicate light intensity and blue to indicate distance; and I set green to 0. Alpha is unused, but I keep it at 1.
Here's my test object, which happens to be a rock:
I then write the pixel data to a file and load it into a point cloud visualizer (one point per pixel) — basically the default. When I do that, it becomes clear that all of my points are in discrete planes each located at a different depth:
I tried plotting the same data in R. It doesn't show up initially with the default histogram because the density of the planes is pretty high. But when I set the breaks to about 60, I get this:
I've tried shrinking the distance between the near and far planes, in case it was a precision issue. First I was doing 1–1000, and now I'm at 1–500. It may have decreased the distance between planes, but I can't tell, because it means the camera has to be closer to the object.
Is there something I'm missing? Does this have to do with the fact that I disabled anti-aliasing? (Anti-aliasing was causing even worse periodic artifacts, but between the camera and the object instead. I disabled line smoothing, polygon smoothing, and multisampling, and that took care of that particular problem.)
Edit
These are the two places the distance calculation is performed:
The vertex shader calculates ec_pos, the position of the vertex relative to the camera.
The fragment shader calculates light_dir0 from ec_pos and the camera position and uses this to compute a distance.
Is it because I'm calculating ec_pos in the vertex shader? How can I calculate ec_pos in the fragment shader instead?
There are several possible issues I can think of.
(1) Your depth precision. The far plane has very little effect on resolution; the near plane is what's important. See Learning to Love your Z-Buffer.
(2) The more probable explanation, based on what you've provided, is the conversion/saving of the pixel data. The shader outputs floating point values, but these are stored in the framebuffer, which will typically have only 8 bits per channel. For color, that means your floats are mapped to the underlying 8-bit (fixed-width, integer) representation and can therefore only take 256 distinct values.
If you want to output pixel data as the true floats they are, you should make a 32-bit floating point RGBA FBO (with e.g. GL_RGBA32F or something similar). This will store actual floats. Then, when you read your data back from the GPU, it will return the original shader values.
I suppose you could alternately encode a single float in a vec4 with some multiplication, if you don't have a FBO implementation handy.
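A commonly used form of that encoding spreads a value in [0, 1) over the four 8-bit channels. The sketch below is one such scheme, not necessarily what the answer had in mind; the input name `norm_dist` is hypothetical and stands for the distance already divided by the far-plane distance. Note that it uses all four channels, so the intensity would have to go to a second render target or a separate pass.

```glsl
#version 330 core

in float norm_dist;    // hypothetical: distance / far distance, expected in [0, 1)
out vec4 fragColor;

// Split a value in [0, 1) into four base-255 "digits", one per 8-bit channel.
vec4 packUnitFloat(float v) {
    vec4 enc = fract(v * vec4(1.0, 255.0, 65025.0, 16581375.0));
    // Remove from each channel the part that the next channel already carries.
    enc -= enc.yzww * vec4(1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0, 0.0);
    return enc;
}

// Inverse of packUnitFloat; the same dot product works on the CPU after
// dividing each byte read back from the framebuffer by 255.
float unpackUnitFloat(vec4 rgba) {
    return dot(rgba, vec4(1.0, 1.0 / 255.0, 1.0 / 65025.0, 1.0 / 16581375.0));
}

void main() {
    // Keep the value strictly below 1.0, because fract(1.0) would wrap to 0.0.
    float d = clamp(norm_dist, 0.0, 0.999999);
    fragColor = packUnitFloat(d);
}
```

Because the shader computes in 32-bit floats, the lowest channel carries little real information; the scheme mainly serves to get past the 256-level limit of a single channel.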
I have a scene that is rendered to texture via FBO and I am sampling it from a fragment shader, drawing regions of it using primitives rather than drawing a full-screen quad: I'm conserving resources by only generating the fragments I'll need.
To test this, I am issuing the exact same geometry as my texture-render, which means that the rasterization pattern produced should be exactly the same: When my fragment shader looks up its texture with the varying coordinate it was given it should match up perfectly with the other values it was given.
Here's how I'm giving my fragment shader the coordinates to auto-texture the geometry with my fullscreen texture:
// Vertex shader
uniform mat4 proj_modelview_mat;
in vec2 in_pos; // 2D vertex position (declaration was missing from the posted snippet)
out vec2 f_sceneCoord;
void main(void) {
    gl_Position = proj_modelview_mat * vec4(in_pos,0.0,1.0);
    f_sceneCoord = (gl_Position.xy + vec2(1,1)) * 0.5;
}
I'm working in 2D so I didn't concern myself with the perspective divide here. I just set the sceneCoord value using the clip-space position scaled back from [-1,1] to [0,1].
// Fragment shader
uniform sampler2D scene;
in vec2 f_sceneCoord;
//in vec4 gl_FragCoord;
in float f_alpha;
out vec4 out_fragColor;
void main (void) {
    //vec4 color = texelFetch(scene,ivec2(gl_FragCoord.xy - vec2(0.5,0.5)),0);
    vec4 color = texture(scene,f_sceneCoord);
    if (color.a == f_alpha) {
        out_fragColor = vec4(color.rgb,1);
    } else {
        out_fragColor = vec4(1,0,0,1);
    }
}
Notice I spit out a red fragment if my alpha's don't match up. The texture render sets the alpha for each rendered object to a specific index so I know what matches up with what.
Sorry I don't have a picture to show but it's very clear that my pixels are off by (0.5,0.5): I get a thin, one pixel red border around my objects, on their bottom and left sides, that pops in and out. It's quite "transient" looking. The giveaway is that it only shows up on the bottom and left sides of objects.
Notice I have a line commented out which uses texelFetch: This method works, and I no longer get my red fragments showing up. However I'd like to get this working right with texture and normalized texture coordinates because I think more hardware will support that. Perhaps the real question is, is it possible to get this right without sending in my viewport resolution via a uniform? There's gotta be a way to avoid that!
Update: I tried shifting the texture access by half a pixel, a quarter of a pixel, and one hundredth of a pixel; it all made things worse and produced a solid border of wrong values all around the edges. It seems like my (gl_Position.xy+vec2(1,1))*0.5 trick sets the right values, but sampling is off by just a little somehow. This is quite strange... See the red fragments? When objects are in motion they shimmer in and out ever so slightly. It means the alpha values I set aren't matching up perfectly on those pixels.
It's not critical for me to get pixel perfect accuracy for that alpha-index-check for my actual application but this behavior is just not what I expected.
Well, first consider dropping that f_sceneCoord varying and just using gl_FragCoord / screenSize as the texture coordinate (you already have this in your example, but the -0.5 is rubbish), with screenSize being a uniform (maybe pre-divided). This should work almost exactly, because by default gl_FragCoord is at the pixel center (meaning i+0.5) and OpenGL returns exact texel values when sampling the texture at the texel center ((i+0.5)/textureSize).
This may still introduce very, very slight deviations from the exact texel values (if any) due to finite precision and such. But then again, you will likely want to use a filtering mode of GL_NEAREST for such one-to-one texture-to-screen mappings anyway. Actually, your existing f_sceneCoord approach may already work well, and it's just those small rounding issues, which GL_NEAREST would prevent, that create your artefacts. But then again, you still don't need that f_sceneCoord thing.
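The gl_FragCoord / screenSize suggestion could look roughly like this in the fragment shader. It is a sketch that assumes the scene texture has the same size as the viewport and that the viewport starts at (0, 0); the uniform name `invScreenSize` is made up for the pre-divided size.

```glsl
#version 130

uniform sampler2D scene;
uniform vec2 invScreenSize;   // assumed uniform: 1.0 / viewport size in pixels

in float f_alpha;
out vec4 out_fragColor;

void main(void) {
    // gl_FragCoord.xy is (i + 0.5, j + 0.5) at the pixel center, so this is
    // exactly the center of texel (i, j) in a texture of the same size.
    vec2 sceneCoord = gl_FragCoord.xy * invScreenSize;
    vec4 color = texture(scene, sceneCoord);

    if (color.a == f_alpha) {
        out_fragColor = vec4(color.rgb, 1.0);
    } else {
        out_fragColor = vec4(1.0, 0.0, 0.0, 1.0);   // mismatch marker
    }
}
```

With the scene texture set to GL_NEAREST, any remaining rounding error in the multiplication should no longer change which texel is returned.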
EDIT: Regarding the portability of texelFetch. That function was introduced with GLSL 1.30 (~SM4/GL3/DX10-hardware, ~GeForce 8), I think. But this version is already required by the new in/out syntax you're using (in contrast to the old varying/attribute syntax). So if you're not gonna change these, assuming texelFetch as given is absolutely no problem and might also be slightly faster than texture (which also requires GLSL 1.30, in contrast to the old texture2D), by circumventing filtering completely.
If you are working in perfect X,Y [0,1] with no rounding errors that's great... But sometimes - especially if working with polar coords, you might consider aligning your calculated coords to the texture 'grid'...
I use:
// align it to the nearest centered texel
curPt -= mod(curPt, (0.5 / vec2(imgW, imgH)));
works like a charm and I no longer get random rounding errors at the screen edges...