I'm trying to make something on Shadertoy: https://www.shadertoy.com/view/wsffDN
(original ref: https://www.shadertoy.com/view/3dtSD7)
In Buffer A, line 18, uv is declared as a vec2:
vec2 uv = (fragCoord.xy - iResolution.xy*.5) / iResolution.y;
but in this line
sceneColor = vec3((uv[0] + stagger) / initpack + 0.05*0., -0, 0.05);
uv[0] is used as if it were a float.
How does this work, and what does uv[0]'s value become?
It is perfectly legal to access the components of any vec type (or mat type for that matter) with array syntax. You can even use a non-constant array index (well, depending on the GLSL version, but 1.30+ versions allow it). uv[0] does exactly what it looks like: access the first element of the vector.
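For example, with the declaration from your Buffer A, uv[0] is just another way of writing uv.x:
vec2 uv = (fragCoord.xy - iResolution.xy*.5) / iResolution.y;
float x = uv[0];  // identical to uv.x; roughly in [-0.5*aspect, +0.5*aspect], with aspect = iResolution.x/iResolution.y
float y = uv[1];  // identical to uv.y; roughly in [-0.5, +0.5]
So in your sceneColor line, uv[0] is simply the pixel's horizontal coordinate, centered at 0 and scaled by the screen height.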
I need help to precisely sample from my 3D Texture in the OpenGL (4.5) Compute Shader given a world position (within the domain of the texture dimensions). More precisely, I need help with my uv() function which maps world coordinates to the exact corresponding texture coordinates.
I want linear interpolation of the data, so my current approach uses texture(). But this results in errors around 0.001 compared to the expected values.
However, if I use texelFetch() and mix() to manually mimic the linear interpolation of texture() as stated in the specification (p. 248), I can reduce the error to 0.0000001 (which is desired). You can see an example of how I implemented it below in the Code section.
This is the function which I currently use inside the Compute Shader to calculate my uv-coordinates:
vec3 uv(const vec3 position) {
    return (position + 0.5) / textureSize(tex[0], 0);
}
Though this function is often suggested across the internet, my results are not perfectly aligned.
Example
To elaborate, I have floating point data stored in a Texture as GL_RGB32F. For simplicity my example here uses scalar GL_R32F. The data has dimensions of, e.g., 20x20x20 (but can be arbitrary). I operate in the data domain [0, 19]^3 and want to exactly map my current position to the texture domain [0, 1]^3 to index the data at this position.
I have a test texture which alternates between 0 and 1 along the x-axis, so sampling at vec3(2.2, 0, 0) should interpolate to 0.2.
As stated above, I tested texture() and texelFetch() + mix(). My manual interpolation evaluates to 0.200000003, which is fine. But calling texture() evaluates to 0.199218750, quite a high error in comparison. Strangely, manual and automatic interpolation evaluate to the same (correct) value at integer positions and at the midpoints between integer positions (e.g., for vec3(2.0, 0, 0), vec3(2.5, 0, 0) and vec3(3.0, 0, 0)).
A visual example with actual calculated values:
uv(x, y, z) = ((x, y, z) + 0.5) / (20, 20, 20)
[Diagram: the point x = (2.2, 3.0) in the [0, 19]^2 data domain maps through uv to x = (0.135, 0.175) in the [0, 1]^2 texture domain.]
Code
I use C++, OpenGL 4.5 and globjects as a wrapper for OpenGL. The texture buffers are created and configured as depicted below.
// Texture buffer creation
t = globjects::Texture::createDefault(gl::GLenum::GL_TEXTURE_3D);
t->setParameter(gl::GL_TEXTURE_WRAP_S, gl::GL_CLAMP_TO_EDGE);
t->setParameter(gl::GL_TEXTURE_WRAP_T, gl::GL_CLAMP_TO_EDGE);
t->setParameter(gl::GL_TEXTURE_WRAP_R, gl::GL_CLAMP_TO_EDGE);
t->setParameter(gl::GL_TEXTURE_MIN_FILTER, gl::GL_LINEAR);
t->setParameter(gl::GL_TEXTURE_MAG_FILTER, gl::GL_LINEAR);
The texture data is uploaded and the compute shader is invoked as follows.
// datatex holds image information
t->image3D(0, gl::GL_RGB32F, datatex->dimensions, 0, gl::GL_RGB, gl::GL_FLOAT, (const uint8_t*) datatex->data());
// ... (Make texture resident)
gl::glDispatchCompute(1, 1, 1);
// ... (Make texture not resident)
The Compute Shader, summarized to the important parts, is as follows:
#version 450
#extension GL_ARB_bindless_texture : enable
layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;
layout(binding=0) uniform samplers
{
    sampler3D tex[1];
};
vec3 uv(const vec3 position) {
    return (position + 0.5) / textureSize(tex[0], 0);
}
void main() {
    // Automatic interpolation
    vec4 correct1 = texture(tex[0], uv(vec3(2.0, 0, 0)));
    vec4 correct2 = texture(tex[0], uv(vec3(2.5, 0, 0)));
    vec4 correct3 = texture(tex[0], uv(vec3(3.0, 0, 0)));
    vec4 wrong    = texture(tex[0], uv(vec3(2.1, 0, 0)));
    // Manual interpolation on the x-axis
    vec3 pos = vec3(2.1, 0, 0);
    vec4 v0 = texelFetch(tex[0], ivec3(floor(pos.x), pos.yz), 0);
    vec4 v1 = texelFetch(tex[0], ivec3(ceil(pos.x), pos.yz), 0);
    vec4 correct4 = mix(v0, v1, fract(pos.x));
}
I'd love your input; I'm at my wits' end. Thanks!
System
I'm trying to achieve this on an NVIDIA GPU.
According to the D3D11 specs, the texture units of GPUs are only required to sample with 8-bit precision in the fractional part. This explains the small error, which does not occur at (normalized) integer or mid-integer coordinates.
The fractional precision can also be queried in Vulkan via subTexelPrecisionBits, and the online Vulkan hardware database shows that, as of today, no GPU offers more than 8 bits of fractional precision during sampling.
Performing linear interpolation in the shader itself offers the full float32 precision.
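For completeness, here is a minimal sketch of doing the full trilinear interpolation manually with texelFetch() and mix(), extending the 1D version from your code to all three axes. It assumes pos is in the [0, size-1] data domain, and edge clamping is omitted.
vec4 sampleTrilinear( sampler3D s, vec3 pos )
{
    ivec3 p0 = ivec3( floor( pos ) );
    vec3  f  = fract( pos );
    // Fetch the 8 surrounding texels (no edge handling here; this is just a sketch)
    vec4 c000 = texelFetch( s, p0 + ivec3( 0, 0, 0 ), 0 );
    vec4 c100 = texelFetch( s, p0 + ivec3( 1, 0, 0 ), 0 );
    vec4 c010 = texelFetch( s, p0 + ivec3( 0, 1, 0 ), 0 );
    vec4 c110 = texelFetch( s, p0 + ivec3( 1, 1, 0 ), 0 );
    vec4 c001 = texelFetch( s, p0 + ivec3( 0, 0, 1 ), 0 );
    vec4 c101 = texelFetch( s, p0 + ivec3( 1, 0, 1 ), 0 );
    vec4 c011 = texelFetch( s, p0 + ivec3( 0, 1, 1 ), 0 );
    vec4 c111 = texelFetch( s, p0 + ivec3( 1, 1, 1 ), 0 );
    // Interpolate along x, then y, then z, entirely in float32
    vec4 c00 = mix( c000, c100, f.x );
    vec4 c10 = mix( c010, c110, f.x );
    vec4 c01 = mix( c001, c101, f.x );
    vec4 c11 = mix( c011, c111, f.x );
    vec4 c0  = mix( c00, c10, f.y );
    vec4 c1  = mix( c01, c11, f.y );
    return mix( c0, c1, f.z );
}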
I am having trouble converting a Cg shader to GLSL.
In Cg, there are these lines:
float4 dst = tex2D(DST, i.uv);
float4 outputColor = (dst > 0.5 ? 1.0 : 2.0);
And when I convert them to GLSL:
vec4 dst = texture2D(DST, v_texCoord);
vec4 outputColor = (dst > 0.5 ? 1.0 : 2.0);
I get the error:
'>' : comparison operator only defined for scalars
And then I tried:
vec4 outputColor = (dst > vec4(0.5) ? 1.0 : 2.0);
Still the same error.
Can anybody give me some advice on how to convert this to GLSL? Thanks :)
Assuming that the Cg comparison code is essentially broadcasting each of those operations to the 4 components of the vector, GLSL doesn't have a simple, built-in way to handle it. But it does have a way to do it.
Modern GLSL (i.e., versions where texture2D has long since been discarded) has access to component-wise comparison functions, such as greaterThan, that have the effect of your condition. They produce boolean vectors that say whether the corresponding components satisfy the condition.
You can then use the mix function to do component-wise selection. However, you have to manually do the broadcasting of the integers to make this work.
So the equivalent GLSL code would be:
mix(vec4(2.0), vec4(1.0), greaterThan(dst, vec4(0.5)));
Yes, the order of the values in mix is "backwards": the value taken for a false condition (not greater than) is the first one; the true condition is the second.
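Putting it together, a minimal conversion of your two lines (keeping the sampler and texture-coordinate names from your code) would look something like this:
vec4 dst = texture(DST, v_texCoord);            // texture() replaces texture2D in modern GLSL
vec4 outputColor = mix(vec4(2.0),               // taken where dst <= 0.5
                       vec4(1.0),               // taken where dst >  0.5
                       greaterThan(dst, vec4(0.5)));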
In the deferred shading engine I'm working on, I currently store the normal vector in a buffer with the internal format GL_RGBA16F.
I was always aware that this could not be the best solution, but I had no time to deal with it.
Recently I read "Survey of Efficient Representations for Independent Unit Vectors", which inspired me to use Octahedral Normal Vectors (ONV) and to change the buffer to GL_RG16_SNORM:
Encode the normal vector (vec3 to vec2):
// Returns +/- 1
vec2 signNotZero( vec2 v )
{
    return vec2((v.x >= 0.0) ? +1.0 : -1.0, (v.y >= 0.0) ? +1.0 : -1.0);
}
// Assume normalized input. Output is on [-1, 1] for each component.
vec2 float32x3_to_oct( in vec3 v )
{
    // Project the sphere onto the octahedron, and then onto the xy plane
    vec2 p = v.xy * (1.0 / (abs(v.x) + abs(v.y) + abs(v.z)));
    // Reflect the folds of the lower hemisphere over the diagonals
    return (v.z <= 0.0) ? ((1.0 - abs(p.yx)) * signNotZero(p)) : p;
}
Decode the normal vector (vec2 to vec3):
vec3 oct_to_float32x3( vec2 e )
{
    vec3 v = vec3(e.xy, 1.0 - abs(e.x) - abs(e.y));
    if (v.z < 0) v.xy = (1.0 - abs(v.yx)) * signNotZero(v.xy);
    return normalize(v);
}
Since I have now implemented an anisotropic light model, it is necessary to store the tangent vector as well as the normal vector. I want to store both vectors in one and the same color attachment of the framebuffer. That brings me to my question: what is an efficient compromise for packing a unit normal vector and a tangent vector into a buffer?
Of course it would be easy with the algorithms from the paper to store the normal vector in the RG channels and the tangent vector in the BA channels of a GL_RGBA16_SNORM buffer, and this is my current implementation too.
But since the normal vector and the tangent vector are always orthogonal, there must be a more elegant way, which either increases accuracy or saves memory.
So the real question is: how can I take advantage of the fact that the 2 vectors are orthogonal? Can I store both vectors in a GL_RGB16_SNORM buffer, and if not, can I improve the accuracy when I pack them into a GL_RGBA16_SNORM buffer?
The following considerations are purely mathematical and I have no experience with their practicality. However, I think that especially Option 2 might be a viable candidate.
Both of the following options have in common how they state the problem: Given a normal (that you can reconstruct using ONV), how can one encode the tangent with a single number.
Option 1
The first option is very close to what meowgoesthedog suggested. Define an arbitrary reference vector (e.g. (0, 0, 1)). Then encode the tangent as the angle (normalized to the [-1, 1] range) that you need to rotate this vector about the normal to match the tangent direction (after projecting on the tangent plane, of course). You will need two different reference vectors (or even three) and choose the correct one depending on the normal. You don't want the reference vector to be parallel to the normal. I assume that this is computationally more expensive than the second option but that would need measuring. But you would get a uniform error distribution in return.
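A minimal sketch of Option 1, assuming a single reference vector (0, 0, 1) (real code would switch to a second reference when the normal is nearly parallel to it); the function names are illustrative:
// Encode: signed angle (about n) from the projected reference to the tangent, normalized to [-1, 1]
float encodeTangentAngle( vec3 n, vec3 t )
{
    vec3 r  = vec3( 0.0, 0.0, 1.0 );                          // reference vector
    vec3 rp = normalize( r - n * dot( r, n ) );               // reference projected onto the tangent plane
    float a = atan( dot( n, cross( rp, t ) ), dot( rp, t ) ); // signed angle about n
    return a / 3.14159265;
}

// Decode: rotate the projected reference about n by the stored angle
vec3 decodeTangentAngle( vec3 n, float e )
{
    vec3 r  = vec3( 0.0, 0.0, 1.0 );
    vec3 rp = normalize( r - n * dot( r, n ) );
    float a = e * 3.14159265;
    return rp * cos( a ) + cross( n, rp ) * sin( a );
}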
Option 2
Let's consider the plane orthogonal to the tangent. This plane can be defined either by the tangent or by two vectors that lie in the plane. We know one vector: the surface normal. If we know a second vector v, we can calculate the tangent as t = normalize(cross(normal, v)). To encode this vector, we can prescribe two components and solve for the remaining one. E.g. let our vector be (1, 1, x). Then, to encode the vector, we need to find x, such that cross((1, 1, x), normal) is parallel to the tangent. This can be done with some simple arithmetic. Again, you would need a few different vector templates to account for all scenarios. In the end, you have a scheme whose encoder is more complex but whose decoder couldn't be simpler. The error distribution will not be as uniform as in Option 1, but should be ok for a reasonable choice of vector templates.
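A minimal sketch of Option 2 with the single template v = (1, 1, x); as noted above, real code would need additional templates (and a sign convention) for cases where t.z is close to zero or where cross(normal, v) points opposite to the tangent. Function names are illustrative:
// Encode: find x so that v = (1, 1, x) lies in the plane orthogonal to the tangent,
// i.e. dot(v, t) == 0  =>  x = -(t.x + t.y) / t.z
float encodeTangentPlane( vec3 t )
{
    return -( t.x + t.y ) / t.z;
}

// Decode: the tangent is orthogonal to both the normal and v
vec3 decodeTangentPlane( vec3 n, float x )
{
    return normalize( cross( n, vec3( 1.0, 1.0, x ) ) );
}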
I'm using a logarithmic depth algorithm which results in someFunc(clipspace.z) being written to the depth buffer and no implicit perspective divide.
I'm doing RTT / postprocessing so later on in a fragment shader I want to recompute eyespace.xyz, given ndc.xy (from the fragment coordinates) and clipspace.z (from someFuncInv() on the value stored in the depth buffer).
Note that I do not have clipspace.w, and my stored value is not clipspace.z / clipspace.w (as it would be when using fixed function depth) - so something along the lines of ...
float clip_z = ...; /* [-1 .. +1] */
vec2 ndc = vec2(FragCoord.xy / viewport * 2.0 - 1.0);
vec4 clipspace = InvProjMatrix * vec4(ndc, clip_z, 1.0);
clipspace /= clipspace.w;
... does not work here.
So is there a way to calculate clipspace.w out of clipspace.xyz, given the projection matrix or its inverse?
clipspace.xy = FragCoord.xy / viewport * 2.0 - 1.0;
This is wrong in terms of nomenclature. "Clip space" is the space that the vertex shader (or whatever the last Vertex Processing stage is) outputs. Between clip space and window space is normalized device coordinate (NDC) space. NDC space is clip space divided by the clip space W coordinate:
vec3 ndcspace = clipspace.xyz / clipspace.w;
So the first step is to take our window space coordinates and get NDC space coordinates. Which is easy:
vec3 ndcspace = vec3(FragCoord.xy / viewport * 2.0 - 1.0, depth);
Now, I'm going to assume that your depth value is the proper NDC-space depth. I'm assuming that you fetch the value from a depth texture, then use the depth range near/far values it was rendered with to map it into a [-1, 1] range. If you didn't, you should.
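Assuming the depth texture was rendered with the default glDepthRange(0.0, 1.0), that mapping is just:
float ndc_depth = stored_depth * 2.0 - 1.0;  // [0, 1] depth value back to [-1, 1] NDC (default depth range assumed)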
So, now that we have ndcspace, how do we compute clipspace? Well, that's obvious:
vec4 clipspace = vec4(ndcspace * clipspace.w, clipspace.w);
Obvious and... not helpful, since we don't have clipspace.w. So how do we get it?
To get this, we need to look at how clipspace was computed the first time:
vec4 clipspace = Proj * cameraspace;
This means that clipspace.w is computed by taking cameraspace and dot-producting it by the fourth row of Proj.
Well, that's not very helpful. It gets more helpful if we actually look at the fourth row of Proj. Granted, you could be using any projection matrix, and if you're not using the typical projection matrix, this computation becomes more difficult (potentially impossible).
The fourth row of Proj, using the typical projection matrix, is really just this:
[0, 0, -1, 0]
This means that the clipspace.w is really just -cameraspace.z. How does that help us?
It helps by remembering this:
ndcspace.z = clipspace.z / clipspace.w;
ndcspace.z = clipspace.z / -cameraspace.z;
Well, that's nice, but it just trades one unknown for another; we still have an equation with two unknowns (clipspace.z and cameraspace.z). However, we do know something else: clipspace.z comes from dot-producting cameraspace with the third row of our projection matrix. The traditional projection matrix's third row looks like this:
[0, 0, T1, T2]
Where T1 and T2 are non-zero numbers. We'll ignore what these numbers are for the time being. Therefore, clipspace.z is really just T1 * cameraspace.z + T2 * cameraspace.w. And if we know cameraspace.w is 1.0 (as it usually is), then we can remove it:
ndcspace.z = (T1 * cameraspace.z + T2) / -cameraspace.z;
So, we still have a problem. Actually, we don't. Why? Because there is only one unknown in this equation. Remember: we already know ndcspace.z. We can therefore use ndcspace.z to compute cameraspace.z:
ndcspace.z = -T1 + (-T2 / cameraspace.z);
ndcspace.z + T1 = -T2 / cameraspace.z;
cameraspace.z = -T2 / (ndcspace.z + T1);
T1 and T2 come right out of our projection matrix (the one the scene was originally rendered with). And we already have ndcspace.z. So we can compute cameraspace.z. And we know that:
clipspace.w = -cameraspace.z;
Therefore, we can do this:
vec4 clipspace = vec4(ndcspace * clipspace.w, clipspace.w);
Obviously you'll need a float for clipspace.w rather than the literal code, but you get my point. Once you have clipspace, to get camera space, you multiply by the inverse projection matrix:
vec4 cameraspace = InvProj * clipspace;
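Putting the whole derivation together, a minimal sketch of the reconstruction could look like this (assuming the typical projection matrix described above; in GLSL's column-major indexing, T1 = Proj[2][2] and T2 = Proj[3][2], and the parameter names are illustrative):
vec3 reconstructCameraSpace( vec2 fragCoord, float ndcDepth, vec2 viewport, mat4 Proj, mat4 InvProj )
{
    vec3 ndcspace = vec3( fragCoord / viewport * 2.0 - 1.0, ndcDepth );

    float T1 = Proj[2][2];                      // third row: clip.z = T1 * cam.z + T2 * cam.w
    float T2 = Proj[3][2];

    float cameraZ = -T2 / ( ndcspace.z + T1 );  // from the derivation above
    float clipW   = -cameraZ;                   // fourth row of the typical projection matrix

    vec4 clipspace   = vec4( ndcspace * clipW, clipW );
    vec4 cameraspace = InvProj * clipspace;
    return cameraspace.xyz;
}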
So, I've got an imposter (the real geometry is a cube, possibly clipped, and the imposter geometry is a Menger sponge) and I need to calculate its depth.
I can calculate the amount to offset in world space fairly easily. Unfortunately, I've spent hours failing to perturb the depth with it.
The only correct results I can get are when I go:
gl_FragDepth = gl_FragCoord.z
Basically, I need to know how gl_FragCoord.z is calculated so that I can:
1. Take the inverse transformation from gl_FragCoord.z to eye space
2. Add the depth perturbation
3. Transform this perturbed depth back into the same space as the original gl_FragCoord.z.
I apologize if this seems like a duplicate question; there's a number of other posts here that address similar things. However, after implementing all of them, none work correctly. Rather than trying to pick one to get help with, at this point, I'm asking for complete code that does it. It should just be a few lines.
For future reference, the key code is:
float far=gl_DepthRange.far; float near=gl_DepthRange.near;
vec4 eye_space_pos = gl_ModelViewMatrix * /*something*/
vec4 clip_space_pos = gl_ProjectionMatrix * eye_space_pos;
float ndc_depth = clip_space_pos.z / clip_space_pos.w;
float depth = (((far-near) * ndc_depth) + near + far) / 2.0;
gl_FragDepth = depth;
For another future reference, this is the same formula as given by imallett, which was working for me in an OpenGL 4.0 application:
vec4 v_clip_coord = modelview_projection * vec4(v_position, 1.0);
float f_ndc_depth = v_clip_coord.z / v_clip_coord.w;
gl_FragDepth = (1.0 - 0.0) * 0.5 * f_ndc_depth + (1.0 + 0.0) * 0.5;
Here, modelview_projection is the 4x4 modelview-projection matrix and v_position is the object-space position of the pixel being rendered (in my case calculated by a raymarcher).
The equation comes from the window coordinates section of this manual. Note that in my code, near is 0.0 and far is 1.0, which are the default values of gl_DepthRange. Note that gl_DepthRange is not the same thing as the near/far distances in the formula for the perspective projection matrix! The only trick is using the 0.0 and 1.0 (or gl_DepthRange in case you actually need to change it); I struggled for an hour with the projection near/far distances, but those are already "baked" into my (perspective) projection matrix.
Note that this way, the equation really contains just a single multiply by a constant ((far - near) / 2) and a single addition of another constant ((far + near) / 2). Compare that to multiply, add and divide (possibly converted to a multiply by an optimizing compiler) that is required in the code of imallett.
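In other words, with the default gl_DepthRange the whole thing reduces to:
gl_FragDepth = 0.5 * f_ndc_depth + 0.5;  // (far - near)/2 = 0.5 and (near + far)/2 = 0.5 for the default depth range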