I want the actual world-space distance, and I get the feeling from experimentation that
(gl_FragCoord.z / gl_FragCoord.w)
is the depth in world space. But I'm not too sure.
EDIT: I've just found where I originally got this snippet of code. Apparently it is the actual depth from the camera?
This was asked (by the same person) and answered elsewhere. I'm paraphrasing and embellishing the answer here:
As stated in section 15.2.2 of the OpenGL 4.3 core profile specification (PDF), gl_FragCoord.w is 1 / clip.w, where clip.w is the W component of the clip-space position (ie: what you wrote to gl_Position).
gl_FragCoord.z is generated by the following process, assuming the usual transforms:
Camera-space to clip-space transform, via projection matrix multiplication in the vertex shader. clip.z = (projectionMatrix * cameraPosition).z
Transform to normalized device coordinates. ndc.z = clip.z / clip.w
Transform to window coordinates, using the glDepthRange near/far values. win.z = ((dfar-dnear)/2) * ndc.z + (dfar+dnear)/2.
Now, using the default depth range of near=0, far=1, we can define win.z in terms of clip-space: (clip.z/clip.w)/2 + 0.5. If we then divide this by gl_FragCoord.w, that is the equivalent of multiplying by clip.w, thus giving us:
(gl_FragCoord.z / gl_FragCoord.w) = clip.z/2 + clip.w/2 = (clip.z + clip.w) / 2
Using the standard projection matrix, clip.z represents a scale and offset from camera-space Z component. The scale and offset are defined by the camera's near/far depth values. clip.w is, again in the standard projection matrix, just the negation of the camera-space Z. Therefore, we can redefine our equation in those terms:
(gl_FragCoord.z / gl_FragCoord.w) = (A * cam.z + B - cam.z)/2 = C * cam.z + D
where A and B represent the scale and offset based on near/far, and C = (A - 1)/2 and D = B/2.
Therefore, gl_FragCoord.z / gl_FragCoord.w is not the camera-space (or world-space) distance to the camera. Nor is it the camera-space planar distance to the camera. But it is a linear transform of the camera-space depth. You could use it as a way to compare two depth values together, if they came from the same projection matrix and so forth.
To actually compute the camera-space Z, you need to either pass the camera near/far values that built your matrix into the shader (OpenGL already gives you the glDepthRange near/far) and compute A and B from them, or use the inverse of the projection matrix. Alternatively, you can just use the projection matrix directly, since fragment shaders can use the same uniforms available to vertex shaders: pick the A and B terms straight from that matrix, with A = projectionMatrix[2][2] and B = projectionMatrix[3][2].
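For example, a minimal fragment-shader sketch of that last option (assuming the default depth range of [0, 1], and that projectionMatrix is the same uniform the vertex shader used; the function name is just illustrative):
uniform mat4 projectionMatrix;

float cameraSpaceZ()
{
    // A and B straight from the projection matrix; GLSL indexes column-major,
    // so [2][2] is column 2 / row 2 and [3][2] is column 3 / row 2.
    float A = projectionMatrix[2][2];
    float B = projectionMatrix[3][2];

    // Undo the default [0, 1] depth-range mapping to recover NDC z.
    float ndcZ = gl_FragCoord.z * 2.0 - 1.0;

    // ndc.z = -A - B / cam.z, therefore cam.z = -B / (ndc.z + A).
    return -B / (ndcZ + A);
}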
According to the docs:
Available only in the fragment language, gl_FragDepth is an output variable that is used to establish the depth value for the current fragment. If depth buffering is enabled and no shader writes to gl_FragDepth, then the fixed function value for depth will be used (this value is contained in the z component of gl_FragCoord); otherwise, the value written to gl_FragDepth is used.
So, it looks like gl_FragDepth should just be gl_FragCoord.z unless you've set it somewhere else in your shaders.
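In other words, writing the fixed-function value yourself is redundant; a trivial sketch:
gl_FragDepth = gl_FragCoord.z; // same depth as not writing gl_FragDepth at all
(Be aware that statically writing gl_FragDepth at all can disable early depth-test optimizations, even when the value is unchanged.)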
Since
gl_FragCoord.w = 1.0 / gl_Position.w
and (most likely) your projection matrix derives w from -z (i.e. its bottom row is (0, 0, -1, 0)), then:
float distanceToCamera = 1.0 / gl_FragCoord.w;
Note that this is the eye-space depth (the distance along the view axis), not the Euclidean distance from the fragment to the camera position.
I am a graphics programming beginner working on my own engine and tried to implement frustum-aligned volume rendering.
The idea was to render multiple planes as vertical slices across the view frustum and then use the world coordinates of those planes for procedural volumes.
Rendering the slices as a 3d model and using the vertex positions as worldspace coordinates works perfectly fine:
//Vertex Shader
gl_Position = P*V*vec4(vertexPosition_worldspace,1);
coordinates_worldspace = vertexPosition_worldspace;
Result:
However, rendering the slices in frustum space and trying to reverse-engineer the world-space coordinates doesn't give the expected results. The closest I got was this:
//Vertex Shader
gl_Position = vec4(vertexPosition_worldspace,1);
coordinates_worldspace = (inverse(V) * inverse(P) * vec4(vertexPosition_worldspace,1)).xyz;
Result:
My guess is that the standard projection matrix somehow gets rid of some crucial depth information, but other than that I have no clue what I am doing wrong or how to fix it.
Well, it is not 100% clear what you mean by "frustum space". I'm going to assume it refers to normalized device coordinates in OpenGL, where the view frustum is (by default) the axis-aligned cube -1 <= x,y,z <= 1. I'm also going to assume a perspective projection, so that the NDC z coordinate is actually a hyperbolic function of eye-space z.
My guess is that the standard projection matrix somehow gets rid of some crucial depth information, but other than that I have no clue what I am doing wrong or how to fix it.
No, a standard perspective matrix in OpenGL looks like this:
( sx 0 tx 0 )
( 0 sy ty 0 )
( 0 0 A B )
( 0 0 -1 0 )
When you multiply this by an (x, y, z, 1) eye-space vector, you get the homogeneous clip coordinates. Consider only the last two rows of the matrix as separate equations:
z_clip = A * z_eye + B
w_clip = -z_eye
Since we do the perspective divide by w_clip to get from clip space to NDC, we end up with
z_ndc = - A - B/z_eye
which is the hyperbolically remapped depth information, so that information is completely preserved. (Note that the division is also applied to x and y.)
When you calculate inverse(P), you only invert the 4D -> 4D homogeneous mapping. But you will get a resulting w that is not 1 again, so here:
coordinates_worldspace = (inverse(V) * inverse(P) * vec4(vertexPosition_worldspace,1)).xyz;
^^^
lies your information loss. You just drop the resulting w and use the xyz components as if they were cartesian 3D coordinates, but they are 4D homogeneous coordinates representing some 3D point.
The correct approach would be to divide by w:
vec4 coordinates_worldspace = (inverse(V) * inverse(P) * vec4(vertexPosition_worldspace,1));
coordinates_worldspace /= coordinates_worldspace.w;
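Putting it together, a minimal vertex-shader sketch of the fix (assuming the same P and V uniforms as in the question; the names are just illustrative):
#version 330 core

uniform mat4 V; // view matrix
uniform mat4 P; // projection matrix

in vec3 vertexPosition_slicespace; // slice vertices given directly in clip space
out vec3 coordinates_worldspace;

void main()
{
    // The slices are already in clip space, so pass them through unchanged.
    gl_Position = vec4(vertexPosition_slicespace, 1.0);

    // Un-project to world space, then divide by w to get cartesian coordinates.
    vec4 world = inverse(V) * inverse(P) * vec4(vertexPosition_slicespace, 1.0);
    coordinates_worldspace = world.xyz / world.w;
}
(Calling inverse() per vertex is wasteful; in practice you would compute inverse(P * V) once on the CPU and pass it in as a uniform.)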
I have a vertex (x, y, z) and I want to calculate the screen location where this point would be rendered on my viewport. Something like Ray Picking, just more or less the other way around. I don't think I can use gluProject because at the time I need the projected point my matrices are restored to identities.
I would like to stay independent of OpenGL, so no extra render pass. This way I'm sure it would only be some math, like the ray-picking thing. I've implemented that one and it works well, so I want to project a vertex the same way.
Of course I have camera pos, up and lookAt vectors and fovy. Is there any source of information about this? Or does anyone know how to work this out?
If you know your matrices (or at least know how to construct them), you can compute the screen location for a vertex by multiplying its position with the matrices and then performing the viewport transformation:
vProjected = modelViewProjectionMatrix * v;
if (
// check that vertex shouldn't be clipped.
-vProjected.w <= vProjected.x && vProjected.x <= vProjected.w &&
-vProjected.w <= vProjected.y && vProjected.y <= vProjected.w &&
-vProjected.w <= vProjected.z && vProjected.z <= vProjected.w
) {
vProjected /= vProjected.w;
vScreen.x = VIEWPORT_W * vProjected.x / 2 + VIEWPORT_CENTER_X;
vScreen.y = VIEWPORT_H * vProjected.y / 2 + VIEWPORT_CENTER_Y;
}
Note that, as per OpenGL convention, (0, 0) is lower left corner, not upper left one.
Any math library with vector and matrix operations can help you with that. For example, mathfu or glm.
UPD. How can you construct modelViewProjectionMatrix given the camera position and orientation and the projection parameters? We need two matrices (let's assume that the model matrix is just an identity, i.e. vertex positions are given already in the world coordinate system). The first one is the view matrix, which takes into account camera position and orientation. Here I'll be using mathfu since I'm more familiar with it, but almost every math library designed with 3D graphics in mind has the same functions:
viewMatrix = mathfu::mat4::LookAt(
cameraLookAtPosition,
cameraPosition,
cameraUpVector
);
The second one would be projection matrix:
projectionMatrix = mathfu::mat4::Perspective(fovy, aspect, zNear, zFar);
Now modelViewProjectionMatrix is just a product of those two:
modelViewProjectionMatrix = projectionMatrix * viewMatrix;
Note that matrix multiplication is not commutative, in other words A * B != B * A. So order in which matrices are multiplied is important.
I render a 3D mesh model using OpenGL with perspective camera – gluPerspective(fov, aspect, near, far).
Then I use rendered image in a computer vision algorithm.
At some point that algorithm requires camera matrix K (along with several vertices on the model and their corresponding projections) in order to estimate camera position: rotation matrix R and translation vector t. I can estimate R and t by using any algorithm which solves Perspective-n-Point problem.
I construct K from the OpenGL projection matrix (see how here)
K = [fX, 0, pX | 0, fY, pY | 0, 0, 1]
If I want to project a model point 'by hand' I can compute:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1] / X_proj[3]
y_pixel = X_proj[2] / X_proj[3]
Anyway, I pass this camera matrix in a PnP algorithm and it works just fine.
But then I had to change perspective projection to orthographic one.
As far as I understand when using orthographic projection the camera matrix becomes:
K = [1, 0, 0 | 0, 1, 0 | 0, 0, 0]
So I changed gluPerspective to glOrtho. Constructing K from the OpenGL projection matrix the same way, it turned out that fX and fY are not ones but 0.0037371. Is this a scaled orthographic projection or what?
Moreover, in order to project model vertices 'by hand' I managed to do the following:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1] + width / 2
y_pixel = X_proj[2] + height / 2
Which is not what I expected (adding width and height divided by 2 seems strange...). I tried to pass this camera matrix to the POSIT algorithm to estimate R and t, and it doesn't converge. :(
So here are my questions:
How to get orthographic camera matrix from OpenGL?
If the way I did it is correct, then is it truly orthographic? Why doesn't POSIT work?
Orthographic projection does not use depth to scale down farther points. It does, however, scale the points to fit inside the NDC cube, which means it scales the values to fit inside the range [-1, 1].
This matrix from Wikipedia shows what this means:
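( 2/(r-l)  0        0         -(r+l)/(r-l) )
( 0        2/(t-b)  0         -(t+b)/(t-b) )
( 0        0        -2/(f-n)  -(f+n)/(f-n) )
( 0        0        0          1           )
Here l, r, b, t, n, f are the left/right/bottom/top/near/far clip planes passed to glOrtho; the diagonal entries are pure scale factors.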
So, it is correct to have numbers other than 1.
As for your way of computing by hand, I believe the problem is that it does not scale back to screen coordinates, which makes it wrong. As I said, the output of the projection matrix will be in the range [-1, 1], so if you want to get pixel coordinates, I believe you should do something similar to this:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1]*width/2 + width / 2
y_pixel = X_proj[2]*height/2 + height / 2
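For example, with a 640x480 viewport, NDC x = -1 maps to pixel 0 (-1 * 320 + 320) and NDC x = +1 maps to pixel 640 (+1 * 320 + 320), which is exactly the viewport transform.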
Anyway, I think you'd be better off using modern OpenGL with libraries like GLM. In that case, you have the exact projection matrices used at hand.
I'm using a logarithmic depth algorithm, which results in someFunc(clipspace.z) being written to the depth buffer and no implicit perspective divide.
I'm doing RTT / postprocessing so later on in a fragment shader I want to recompute eyespace.xyz, given ndc.xy (from the fragment coordinates) and clipspace.z (from someFuncInv() on the value stored in the depth buffer).
Note that I do not have clipspace.w, and my stored value is not clipspace.z / clipspace.w (as it would be when using fixed function depth) - so something along the lines of ...
float clip_z = ...; /* [-1 .. +1] */
vec2 ndc = gl_FragCoord.xy / viewport * 2.0 - 1.0;
vec4 clipspace = InvProjMatrix * vec4(ndc, clip_z, 1.0);
clipspace /= clipspace.w;
... does not work here.
So is there a way to calculate clipspace.w out of clipspace.xyz, given the projection matrix or its inverse?
clipspace.xy = gl_FragCoord.xy / viewport * 2.0 - 1.0;
This is wrong in terms of nomenclature. "Clip space" is the space that the vertex shader (or whatever the last Vertex Processing stage is) outputs. Between clip space and window space is normalized device coordinate (NDC) space. NDC space is clip space divided by the clip space W coordinate:
vec3 ndcspace = clipspace.xyz / clipspace.w;
So the first step is to take our window space coordinates and get NDC space coordinates. Which is easy:
vec3 ndcspace = vec3(gl_FragCoord.xy / viewport * 2.0 - 1.0, depth);
Now, I'm going to assume that your depth value is the proper NDC-space depth. I'm assuming that you fetch the value from a depth texture, then use the depth range near/far values it was rendered with to map it into the [-1, 1] range. If you didn't, you should.
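With the default glDepthRange(0, 1), that remapping is just this (a sketch; storedDepth stands for the value fetched from the depth texture):
float ndcDepth = storedDepth * 2.0 - 1.0;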
So, now that we have ndcspace, how do we compute clipspace? Well, that's obvious:
vec4 clipspace = vec4(ndcspace * clipspace.w, clipspace.w);
Obvious and... not helpful, since we don't have clipspace.w. So how do we get it?
To get this, we need to look at how clipspace was computed the first time:
vec4 clipspace = Proj * cameraspace;
This means that clipspace.w is computed by taking cameraspace and dot-producting it with the fourth row of Proj.
Well, that's not very helpful. It gets more helpful if we actually look at the fourth row of Proj. Granted, you could be using any projection matrix, and if you're not using the typical projection matrix, this computation becomes more difficult (potentially impossible).
The fourth row of Proj, using the typical projection matrix, is really just this:
[0, 0, -1, 0]
This means that clipspace.w is really just -cameraspace.z. How does that help us?
It helps by remembering this:
ndcspace.z = clipspace.z / clipspace.w;
ndcspace.z = clipspace.z / -cameraspace.z;
Well, that's nice, but it just trades one unknown for another; we still have an equation with two unknowns (clipspace.z and cameraspace.z). However, we do know something else: clipspace.z comes from dot-producting cameraspace with the third row of our projection matrix. The traditional projection matrix's third row looks like this:
[0, 0, T1, T2]
Where T1 and T2 are non-zero numbers. We'll ignore what these numbers are for the time being. Therefore, clipspace.z is really just T1 * cameraspace.z + T2 * cameraspace.w. And if we know cameraspace.w is 1.0 (as it usually is), then we can remove it:
ndcspace.z = (T1 * cameraspace.z + T2) / -cameraspace.z;
So, we still have a problem. Actually, we don't. Why? Because there is only one unknown in this equation. Remember: we already know ndcspace.z. We can therefore use ndcspace.z to compute cameraspace.z:
ndcspace.z = -T1 + (-T2 / cameraspace.z);
ndcspace.z + T1 = -T2 / cameraspace.z;
cameraspace.z = -T2 / (ndcspace.z + T1);
T1 and T2 come right out of our projection matrix (the one the scene was originally rendered with). And we already have ndcspace.z. So we can compute cameraspace.z. And we know that:
clipspace.w = -cameraspace.z;
Therefore, we can do this:
vec4 clipspace = vec4(ndcspace * clipspace.w, clipspace.w);
Obviously you'll need a float for clipspace.w rather than the literal code, but you get my point. Once you have clipspace, to get camera space, you multiply by the inverse projection matrix:
vec4 cameraspace = InvProj * clipspace;
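Putting the whole derivation together, a minimal GLSL sketch (assuming the standard projection matrix discussed above; T1 and T2 are Proj[2][2] and Proj[3][2] in GLSL's column-major indexing, and the uniform and function names are illustrative):
uniform mat4 InvProj;   // inverse of the original projection matrix
uniform float T1;       // Proj[2][2]
uniform float T2;       // Proj[3][2]
uniform vec2 viewport;  // viewport width/height in pixels

vec4 reconstructCameraSpace(float ndcDepth)
{
    // Window space -> NDC space.
    vec3 ndcspace = vec3(gl_FragCoord.xy / viewport * 2.0 - 1.0, ndcDepth);

    // cameraspace.z = -T2 / (ndcspace.z + T1), as derived above.
    float cameraZ = -T2 / (ndcspace.z + T1);

    // For the standard projection matrix, clipspace.w = -cameraspace.z.
    float clipW = -cameraZ;

    // NDC -> clip space, then clip space -> camera space.
    vec4 clipspace = vec4(ndcspace * clipW, clipW);
    return InvProj * clipspace;
}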
So, I've got an imposter (the real geometry is a cube, possibly clipped, and the imposter geometry is a Menger sponge) and I need to calculate its depth.
I can calculate the amount to offset in world space fairly easily. Unfortunately, I've spent hours failing to perturb the depth with it.
The only correct results I can get are when I go:
gl_FragDepth = gl_FragCoord.z;
Basically, I need to know how gl_FragCoord.z is calculated so that I can:
Take the inverse transformation from gl_FragCoord.z to eye space
Add the depth perturbation
Transform this perturbed depth back into the same space as the original gl_FragCoord.z.
I apologize if this seems like a duplicate question; there's a number of other posts here that address similar things. However, after implementing all of them, none work correctly. Rather than trying to pick one to get help with, at this point, I'm asking for complete code that does it. It should just be a few lines.
For future reference, the key code is:
float near = gl_DepthRange.near;
float far  = gl_DepthRange.far;
vec4 eye_space_pos = gl_ModelViewMatrix * /*something*/
vec4 clip_space_pos = gl_ProjectionMatrix * eye_space_pos;
float ndc_depth = clip_space_pos.z / clip_space_pos.w;
float depth = (((far-near) * ndc_depth) + near + far) / 2.0;
gl_FragDepth = depth;
For another future reference, this is the same formula as given by imallett, which was working for me in an OpenGL 4.0 application:
vec4 v_clip_coord = modelview_projection * vec4(v_position, 1.0);
float f_ndc_depth = v_clip_coord.z / v_clip_coord.w;
gl_FragDepth = (1.0 - 0.0) * 0.5 * f_ndc_depth + (1.0 + 0.0) * 0.5;
Here, modelview_projection is the 4x4 modelview-projection matrix and v_position is the object-space position of the pixel being rendered (in my case calculated by a raymarcher).
The equation comes from the window coordinates section of this manual. Note that in my code, near is 0.0 and far is 1.0, which are the default values of gl_DepthRange. Note that gl_DepthRange is not the same thing as the near/far distances in the formula for the perspective projection matrix! The only trick is using the 0.0 and 1.0 (or gl_DepthRange in case you actually need to change it); I've been struggling for an hour with the other near/far pair, but those are already "baked" into my (perspective) projection matrix.
Note that this way, the equation really contains just a single multiply by a constant ((far - near) / 2) and a single addition of another constant ((far + near) / 2). Compare that to the multiply, add, and divide (possibly converted to a multiply by an optimizing compiler) required in imallett's code.
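In other words, with the default depth range the last line collapses to a single multiply-add:
gl_FragDepth = 0.5 * f_ndc_depth + 0.5;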