What is, in simple terms, textureGrad()? - opengl

I read the Khronos wiki on this, but I don't really understand what it is saying. What exactly does textureGrad do?
I think it samples multiple mipmap levels and computes some color mixing using the explicit derivative vectors given to it, but I am not sure.

When you sample a texture, you need the specific texture coordinates to sample the texture data at. For the sake of simplicity, I'm going to assume a 2D texture, so the texture coordinates are a 2D vector (s,t). (The explanation is analogous for other dimensionalities.)
If you want to texture-map a triangle, you typically use one of two strategies to get the texture coordinates:
The texture coordinates are part of the model. Every vertex contains the 2D texture coordinates as a vertex attribute. During rasterization, those texture coordinates are interpolated across the primitive.
You specify a mathematical mapping. For example, you could define some function mapping the 3D object coordinates to 2D texture coordinates. You could, for instance, define some projection and project the texture onto a surface, just like a real projector would project an image onto real-world objects.
In either case, each fragment generated when rasterizing the primitive typically gets different texture coordinates, so each drawn pixel on the screen will get a different part of the texture.
The key point is this: each fragment has 2D pixel coordinates (x,y) as well as 2D texture coordinates (s,t), so we can basically interpret this relationship as a mathematical function:
(s,t) = T(x,y)
Since this is a vector-valued function of the 2D pixel position (x,y), we can also build the partial derivatives along the x direction (to the right) and the y direction (upwards), which tell us the rate of change of the texture coordinates along those directions.
And the dTdx and dTdy in textureGrad are just that.
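To make that concrete, here is a minimal fragment shader sketch (the sampler and varying names are made up). The implicit gradients that a plain texture() call uses are exactly what dFdx()/dFdy() return for the texture coordinates, so passing those to textureGrad() should give the same result:
#version 330 core
uniform sampler2D tex;   // hypothetical sampler
in vec2 uv;              // hypothetical interpolated texture coordinates (s,t)
out vec4 fragColor;
void main() {
    // Both samples use the same pixel footprint: texture() derives the
    // gradients implicitly, textureGrad() receives them explicitly.
    vec4 implicitGrad = texture(tex, uv);
    vec4 explicitGrad = textureGrad(tex, uv, dFdx(uv), dFdy(uv));
    fragColor = mix(implicitGrad, explicitGrad, 0.5);  // both should be identical
}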
So what does the GPU need this for?
When you want to actually filter the texture (in contrast to simple point sampling), you need to know the pixel footprint in texture space. Each single fragment represents the area of one pixel on the screen, and you are going to use a single color value from the texture to represent the whole pixel (multisampling aside). The pixel footprint now represents the actual area the pixel would cover in texture space. We could calculate it by interpolating the texcoords not for the pixel center, but for the 4 pixel corners. The resulting texcoords would form a trapezoid in texture space.
When you minify the texture, several texels are mapped to the same pixel (so the pixel footprint is large in texture space). When you magnify it, each pixel will represent only a fraction of the corresponding texel (so the footprint is quite small).
The texture footprint tells you:
whether the texture is minified or magnified (GL has different filter settings for each case)
how many texels would be mapped to each pixel, and therefore which mipmap level would be appropriate (see the sketch right after this list)
how much anisotropy there is in the pixel footprint. Each pixel on the screen and each texel in texture space is basically a square, but the pixel footprint might deviate significantly from that, and can be much taller than wide or the other way around (especially in situations with high perspective distortion). Classic bilinear or trilinear texture filters always use a square filter footprint, but the anisotropic texture filter uses this information to generate a filter footprint which more closely matches the actual pixel footprint (to avoid mixing in texel data which shouldn't really belong to the pixel).
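For the mipmap-level point in particular, the GL specification boils the footprint down to a single scale factor. The following is only a rough GLSL sketch of that idea (the sampler and varying names are made up, and LOD bias and clamping are ignored): scale the gradients to texel units, take the longer of the two footprint edges, and use its log2 as the level of detail.
#version 330 core
uniform sampler2D tex;   // hypothetical sampler
in vec2 uv;              // hypothetical texture coordinates
out vec4 fragColor;
void main() {
    vec2 texSize = vec2(textureSize(tex, 0));
    vec2 dTdx = dFdx(uv) * texSize;   // footprint edge along screen x, in texels
    vec2 dTdy = dFdy(uv) * texSize;   // footprint edge along screen y, in texels
    float rho    = max(length(dTdx), length(dTdy));
    float lambda = log2(rho);         // > 0 means minification, < 0 magnification
    fragColor = vec4(lambda);         // just visualizing the computed level
}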
Instead of calculating the texture coordinates at all pixel corners, we are going to use the partial derivatives at the fragment center as an approximation for the pixel footprint.
The following diagram shows the geometric relationship:
This represents the footprint of four neighboring pixels (2x2) in texture space, so the uniform grid are the texels, and the 4 trapezoids represent the 4 pixel footprints.
Now calculating the actual derivatives would imply that we have some more or less explicit formula T(x,y) as described above. GPUs usually use another approximation:
they just look at the actual texcoords of the neighboring fragments (which are going to be calculated anyway) in each 2x2 pixel block, and approximate the footprint by finite differencing: simply subtracting the actual texcoords of neighboring fragments from each other.
The result is shown as the dotted parallelogram in the diagram.
In hardware, this is implemented by always shading 2x2 pixel quads in parallel in the same warp/wavefront/SIMD group. The GLSL derivative functions like dFdx and dFdy simply work by subtracting the actual values of the neighboring fragments. The standard texture function just uses this mechanism internally on the texture coordinate argument. The textureGrad functions bypass that and allow you to specify your own values, which means you control what pixel footprint the GPU assumes when doing the actual filtering / mipmap level selection.
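A typical situation where you would want to do that is sketched below, under some assumptions (the atlas sampler and the tile uniforms are made-up names): when you warp the texture coordinates yourself, for example by tiling a sub-rectangle of a texture atlas with fract(), the implicit derivatives of the warped coordinates become huge at the wrap seam and the hardware picks a far too small mipmap level for those 2x2 quads. Passing the gradients of the unwrapped coordinates restores the footprint the filter should actually use:
#version 330 core
uniform sampler2D atlas;   // hypothetical: an atlas containing the tile
uniform vec2 tileOffset;   // hypothetical: lower-left corner of the tile in the atlas, in [0,1]
uniform vec2 tileScale;    // hypothetical: size of the tile in the atlas, in [0,1]
in vec2 uv;                // coordinates meant to repeat across the surface
out vec4 fragColor;
void main() {
    // fract() jumps from ~1 back to ~0 between neighboring fragments at the
    // seam, so dFdx/dFdy of 'tiled' would be wrong there.
    vec2 tiled = tileOffset + fract(uv) * tileScale;
    fragColor = textureGrad(atlas, tiled, dFdx(uv) * tileScale, dFdy(uv) * tileScale);
}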

Related

Why do we need texture filtering in OpenGL?

When mapping a texture to geometry, we can choose the filtering method between GL_NEAREST and GL_LINEAR.
In the examples we have a texture coordinate surrounded by the texels like so:
And it's explained how each algorithm chooses what color the fragment will be, for example by linearly interpolating all the neighboring texels based on their distance from the texture coordinate.
Isn't each texture coordinate essentially the fragment position, which is mapped to a pixel on screen? So how can these coordinates be smaller than the texels, which are essentially pixels and the same size as fragments?
A (2D) texture can be looked at as a function t(u, v), whose output is a "color" value. This is a pure function, so it will return the same value for the same u and v values. The value comes from a lookup table stored in memory, indexed by u and v, rather than through some kind of computation.
Texture "mapping" is the process whereby you associate a particular location on a surface with a particular location in the space of a texture. That is, you "map" a surface location to a location in a texture. As such, the inputs to the texture function t are often called "texture coordinates". Some surface locations may map to the same position on a texture, and some texture positions may not have surface locations mapped to them. It all depends on the mapping
An actual texture image is not a smooth function; it is a discrete function. It has a value at the texel locations (0, 0), and another value at (1, 0), but the value of a texture at (0.5, 0) is undefined. In image space, u and v are integers.
Your picture of a zoomed in part of the texture is incorrect. There are no values "between" the texels, because "between the texels" is not possible. There is no number between 0 and 1 on an integer number line.
However, any useful mapping from a surface to the texture function is going to need to happen in a continuous space, not a discrete space. After all, it's unlikely that every fragment will land exactly on a location that maps to an exact integer within a texture. Especially in shader-based rendering, a shader can just invent a mapping arbitrarily. The "mapping" could be based on light directions (projective texturing), the elevation of a fragment relative to some surface, or anything a user might want. To a fragment shader, a texture is just a function t(u, v) which can be evaluated to produce a value.
So we really want that function to be in a continuous space.
The purpose of filtering is to create a continuous function t by inventing values in-between the discrete texels. This allows you to declare that u and v are floating-point values, rather than integers. We also get to normalize the texture coordinates, so that they're on the range [0, 1] rather than being based on the texture's size.
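To make "inventing values in-between the discrete texels" concrete, here is a sketch of what GL_LINEAR filtering effectively computes for a 2D texture, written out by hand with texelFetch (the sampler and coordinate names are placeholders, and wrap/edge handling at the texture border is ignored):
#version 330 core
uniform sampler2D tex;   // hypothetical sampler
in vec2 uv;              // hypothetical normalized texture coordinates
out vec4 fragColor;

vec4 bilinear(sampler2D t, vec2 coord) {
    vec2 size = vec2(textureSize(t, 0));
    vec2 pos  = coord * size - 0.5;         // continuous position in texel space
    ivec2 i   = ivec2(floor(pos));          // lower-left texel of the 2x2 block
    vec2 f    = fract(pos);                 // the "invented" in-between weights
    vec4 t00  = texelFetch(t, i,               0);
    vec4 t10  = texelFetch(t, i + ivec2(1, 0), 0);
    vec4 t01  = texelFetch(t, i + ivec2(0, 1), 0);
    vec4 t11  = texelFetch(t, i + ivec2(1, 1), 0);
    return mix(mix(t00, t10, f.x), mix(t01, t11, f.x), f.y);
}

void main() {
    fragColor = bilinear(tex, uv);
}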
Texture filtering does not decide what color the fragment should be. This is what the fragment shader does. However, the fragment shader may sample a texture at a given position to get a color. It may directly return that color or it can process it (e.g. add shading etc.)
Texture filtering happens at sampling. The texture coordinates are not necessarily perfect pixel positions. E.g., the texture could be the material of a 3D model that you show in a perspective view. Then a fragment may cover more than a single texel or it may cover less. Or it might not be aligned with the texture grid. In all cases you need some kind of filtering.
For applications that render a sprite at its original size without any deformation, you usually don't need filtering as you have a 1:1 mapping from screen pixels to texels.

Why does OpenGL allow/use fractional values as the location of vertices?

As far as I understand, the location of a point/pixel cannot be a fraction, at least on a raster graphics system where the hardware uses pixels to display images.
Then, why and how does OpenGL use fractional values for plotting pixels?
For example, how is it possible: glVertex2f(0.15f, 0.51f); ?
This command does not plot any pixels. It merely defines the location of a point in space (glVertex2f sets only x and y and leaves z at 0, so the point still lives in 3D space, while a pixel on the screen only needs 2 coordinates). This is the starting point for the OpenGL pipeline. This point then goes through a lot of transformations before it ends up on the screen.
Also, the coordinates are unitless. For example, you can say that your viewport spans the range 0.0f to 1.0f, and then these coordinates make a lot of sense. Basically, you have to think of these points in terms of mathematics, not pixels.
I would suggest some reading on how OpenGL transformations work.
The vectors you pass into OpenGL are not viewport positions but arbitrary numbers in some vector space. Only after a chain of transformations are these numbers mapped to viewport pixel positions. With the old fixed-function pipeline, this mapping could be anything that can be represented by a vector-matrix multiplication.
These days, where everything is programmable (shaders), the mapping can very well be any kind of function you can think of. For example, the values you pass to glVertex (an immediate mode call, but still available to shaders with OpenGL 2.1) may be interpreted as polar coordinates in the vertex shader:
#version 110
void main() {
    gl_Position =
        gl_ModelViewProjectionMatrix
        * vec4(gl_Vertex.y * vec2(sin(gl_Vertex.x), cos(gl_Vertex.x)), 0.0, 1.0);
}
This is a perfectly valid OpenGL 2.1 vertex shader that interprets the vertex position as polar coordinates. Note that since triangles and lines have straight edges while polar coordinates are curvilinear, this gives good visual results only for points or highly tessellated primitives.
As you can see here, the values passed to glVertex are actually arbitrary, unitless components of vectors in some vector space. Only by applying some transformation to viewport space do these vectors gain meaning. Hence it makes no sense to impose a certain value range onto the values that go into the vertex attribute.
Vertex and pixel are very different things.
It's quite possible to have all your vertices within one pixel (although in this case you probably need help with LODing).
You might want to start here...
http://www.glprogramming.com/blue/ch01.html
Specifically...
Primitives are defined by a group of one or more vertices. A vertex defines a point, an endpoint of a line, or a corner of a polygon where two edges meet. Data (consisting of vertex coordinates, colors, normals, texture coordinates, and edge flags) is associated with a vertex, and each vertex and its associated data are processed independently, in order, and in the same way.
And...
Rasterization produces a series of frame buffer addresses and associated values using a two-dimensional description of a point, line segment, or polygon. Each fragment so produced is fed into the last stage, per-fragment operations, which performs the final operations on the data before it's stored as pixels in the frame buffer.
For your example, before glVertex2f(0.15f, 0.51f) ends up on the screen, there are many transforms to be done. Making a complex thing crudely simpler: after moving your vertex into view space (applying the camera position and direction), the magic here is (1) the projection matrix, and (2) the viewport settings.
Internally, OpenGL "screen coordinates" are in a cube (-1, -1, -1) - (1, 1, 1), :
http://www.matrix44.net/cms/wp-content/uploads/2011/03/ogl_coord_object_space_cube.png
The projection matrix 'squeezes' the view frustum into this cube (which you do in the vertex shader), assuming a perspective transform; if the projection is orthogonal, the view volume is just a box, limited by the near and far values (and, in both cases, by scaling factors):
http://www.songho.ca/opengl/files/gl_projectionmatrix01.png
EDIT: Maybe better example here:
http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/#The_Projection_matrix
(EDIT: The Z coordinate is used as the depth value.) When fragments are finally transferred to pixels on a texture/framebuffer/screen, the coordinates are scaled and offset according to the viewport settings:
https://www3.ntu.edu.sg/home/ehchua/programming/opengl/images/GL_2DViewportAspectRatio.png
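For a concrete (simplified) example: assuming the modelview and projection matrices are left at identity and the viewport is set to 800x600 pixels, the vertex from glVertex2f(0.15f, 0.51f) is already inside the (-1, 1) cube, and the viewport transform maps it to the window position ((0.15 + 1)/2 * 800, (0.51 + 1)/2 * 600) = (460, 453), i.e. a pixel a bit right of and above the center of the window.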
Hope this helps!

Texture Mapping without OpenGL

So I'm supposed to texture map a specific model I've loaded into a scene (with a Framebuffer and a Planar Pinhole Camera); however, I'm not allowed to use OpenGL and I have no idea how to do it otherwise (we do use glDrawPixels for other functionality, but that's the only function we can use).
Is anyone here able enough to give me a run-through on how to texture map without OpenGL functionality?
I'm supposed to use these slides: https://www.cs.purdue.edu/cgvlab/courses/334/Fall_2014/Lectures/TMapping.pdf
But they make very little sense to me.
What I've gathered so far is the following:
You iterate over a model, and assign each triangle "texture coordinates" (which I'm not sure what those are), and then use "model space interpolation" (again, I don't understand what that is) to apply the texture with the right perspective.
I currently have my program doing the following:
TL;DR:
1. What is model space interpolation/how do I do it?
2. What explicitly are texture coordinates?
3. How, on a high level (in layman's terms), do I texture map a model without using OpenGL?
OK, let's start by making sure we're both on the same page about how the color interpolation works. Lines 125 through 143 set up three vectors redABC, greenABC and blueABC that are used to interpolate the colors across the triangle. They work one color component at a time, and each of the three vectors helps interpolate one color component.
By convention, s,t coordinates are in source texture space. As provided in the mesh data, they specify the position within the texture of that particular vertex of the triangle. The crucial thing to understand is that s,t coordinates need to be interpolated across the triangle just like colors.
So, what you want to do is set up two more ABC vectors: sABC and tABC, exactly duplicating the logic used to set up redABC, but instead of using the color components of each vertex, you just use the s,t coordinates of each vertex. Then for each pixel, instead of computing ssiRed etc. as unsigned int values, you compute ssis and ssit as floats; they should be in the range 0.0f through 1.0f, assuming your source s,t values are well behaved.
Now that you have an interpolated s,t coordinate, multiply ssis by the texel width of the texture, and ssit by the texel height, and use those coordinates to fetch the texel. Then just put that on the screen.
Since you are not using OpenGL I assume you wrote your own software renderer to render that teapot?
A texture is simply an image. A texture coordinate is a 2D position in the texture. So (0,0) is bottom-left and (1,1) is top-right. For every vertex of your 3D model you should store a 2D position (u,v) in the texture. That means that at that vertex, you should use the colour the texture has at that point.
To know the UV texture coordinate of a pixel in between vertices you need to interpolate the texture coordinates of the vertices around it. Then you can use that UV to look up the colour in the texture.
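Put as a formula (this is the simple screen-space linear version; a perspective-correct rasterizer would interpolate u/w, v/w and 1/w instead and divide afterwards): if a pixel inside the triangle has barycentric weights a, b, c (with a + b + c = 1) relative to the three vertices, its interpolated texture coordinate is (u, v) = a*(u0, v0) + b*(u1, v1) + c*(u2, v2). You then multiply u by the texture width and v by the texture height, floor the results to integers, and fetch that texel's color.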

OpenGL: depth calculations are discontinuous

I'm building a LIDAR simulator in OpenGL. This means that the fragment shader returns the length of the light vector (the distance) in place of one of the color channels, normalized by the distance to the far plane (so it'll be between 0 and 1). In other words, I use red to indicate light intensity and blue to indicate distance; and I set green to 0. Alpha is unused, but I keep it at 1.
Here's my test object, which happens to be a rock:
I then write the pixel data to a file and load it into a point cloud visualizer (one point per pixel) — basically the default. When I do that, it becomes clear that all of my points are in discrete planes each located at a different depth:
I tried plotting the same data in R. It doesn't show up initially with the default histogram because the density of the planes is pretty high. But when I set the breaks to about 60, I get this:
I've tried shrinking the distance between the near and far planes, in case it was a precision issue. First I was doing 1–1000, and now I'm at 1–500. It may have decreased the distance between planes, but I can't tell, because it means the camera has to be closer to the object.
Is there something I'm missing? Does this have to do with the fact that I disabled anti-aliasing? (Anti-aliasing was causing even worse periodic artifacts, but between the camera and the object instead. I disabled line smoothing, polygon smoothing, and multisampling, and that took care of that particular problem.)
Edit
These are the two places the distance calculation is performed:
The vertex shader calculates ec_pos, the position of the vertex relative to the camera.
The fragment shader calculates light_dir0 from ec_pos and the camera position and uses this to compute a distance.
Is it because I'm calculating ec_pos in the vertex shader? How can I calculate ec_pos in the fragment shader instead?
There are several possible issues I can think of.
(1) Your depth precision. The far plane has very little effect on resolution; the near plane is what's important. See Learning to Love your Z-Buffer.
(2) The more probable explanation, based on what you've provided, is the conversion/saving of the pixel data. The shader outputs floating point values, but these are stored in the framebuffer, which will typically have only 8 bits per channel. For color, that means your floats will be mapped to the underlying 8-bit (fixed-width, integer) representation, so only 256 distinct values are possible.
If you want to output pixel data as the true floats they are, you should make a 32-bit floating point RGBA FBO (with e.g. GL_RGBA32F or something similar). This will store actual floats. Then, when you read your data back from the GPU, you will get the original shader values.
I suppose you could alternatively encode a single float into a vec4 with some multiplication, if you don't have an FBO implementation handy.
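One commonly used encoding along those lines looks roughly like the sketch below. It assumes the value is already normalized to the range [0, 1), as your distance divided by the far-plane distance should be; each successive channel stores finer fractional bits of the value, and the matching decode reassembles them:
vec4 encodeFloatRGBA(float v) {
    // spread the value across 4 channels with increasing precision
    vec4 enc = fract(v * vec4(1.0, 255.0, 65025.0, 16581375.0));
    // subtract the part already carried by the next finer channel
    enc -= enc.yzww * vec4(1.0/255.0, 1.0/255.0, 1.0/255.0, 0.0);
    return enc;
}

float decodeFloatRGBA(vec4 rgba) {
    return dot(rgba, vec4(1.0, 1.0/255.0, 1.0/65025.0, 1.0/16581375.0));
}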

draw a triangle within a single pixel in opengl

Is it possible to draw a triangle within single pixel?
For example, when I specify the coordinates of the vertices of the triangle as A(0, 1), B(0, 0) and C(1, 0), I don't see a triangle being rendered at all. I was expecting to see a small triangle fitting within the pixel.
Is there something I am missing?
A pixel is the smallest discrete unit your display can show. Pixels can only have one color.
Therefore, while OpenGL can attempt to render a triangle to half of a pixel, all you will see is either that pixel filled in or that pixel not filled in. Antialiasing can make the filled in color less strong, but the color from a pixel is solid across the entire pixel.
That's simply the nature of a discrete image.
A pixel is a single point; how does a triangle fit into a single point?
It is the absolute smallest unit of an image.
Why do you think you can render half a pixel diagonally? A pixel is either on or off; it can't be in any other state. What OpenGL specification do you base your assumption on? Most 3D libraries will decide whether to render a pixel based on how much of the sub-pixel area is filled in, but a pixel can't be partially painted; it is either on or off. A pixel is like a light bulb: you can't light up half of a light bulb.
Regardless, the 3D coordinate space doesn't map directly to the 2D space of the camera's image plane drawn on the monitor.
Only with specific camera settings and drawing triangles in a 2D plane at a specific distance from the camera can you expect to try and map the 3D coordinates to 2D coordinates in a 1:1 manner, and even then it isn't precise in many cases.
Sub-pixel rendering doesn't mean what you think it means. It is a technique/algorithm for deciding which RGB elements of a pixel to light up and what color to make them when many pixels need to be lit up, especially in anti-aliasing situations where the surrounding pixels are taken into consideration, on a 2D rasterized display. There is no way to partially illuminate a single pixel with a shape; sub-pixel rendering just varies the intensity of the color and brightness of a pixel in a more subtle manner. This only works on LCD displays. The Wikipedia article describes this very well.
You could never draw a triangle in a single pixel in that case either. A triangle will require at minimum 3 pixels to appear as something that might represent a triangle:
■
■ ■
and 6 pixels to represent a rasterized triangle with all three edges represented.
■
■ ■
■ ■ ■
Is it possible to draw a triangle within single pixel?
No!
You could try to evaluate how much of the pixel is covered by the triangle, but there's no way to draw only part of a pixel. A pixel is the smallest unit of a rasterized display device, and the pixel density of a display device sets the physical limit on the representable resolution.
The mathematical theory behind this is called "sampling theory", and most importantly you need to know about the so-called Nyquist theorem.
Pixels being the ultimate smallest elements of a picture are also the reason why you can't zoom into a picture like they do in CSI:NY; it's simply not possible, because there's no more information in the picture than there are pixels. (Well, if you have some additional source of information, for example by combining images taken over a longer period of time, and you can estimate the movements, then it actually is possible to turn temporal information into spatial information, but that's a different story.)