OpenGL gl_Position 3D to 2D Screen Space

I am trying to understand the maths involved in converting a 3D point into a 2D screen position. I understand that the process involves moving from object space -> world space -> eye space -> clip space -> NDC space -> viewport space (the final 2D position on screen).
Vertex shader:
gl_Position = projectionMatrix * viewMatrix * modelMatrix * vec4(position, 1.0); // clip space
Fragment shader:
(Pseudo code)
// assuming the clip-space position is received as a vec4 input variable named clipPos
vec2 posNdc = (clipPos.xy / clipPos.w) / 2.0 + 0.5;
(posNdc -> gl_FragColor) after the perspective division and the conversion to normalized device coordinate space
Do the perspective divide and the NDC conversion happen automatically to the gl_Position output by the vertex shader, or do I have to perform them in the fragment shader as described above?

Yes, division by w is automatic after you output the clip-space vertex in the vertex shader.
It happens before rasterization, and therefore before the fragment shader runs. Now, one interesting quirk is that in window-space (gl_FragCoord) in a fragment shader, w = 1/clip_w. If you try to do this divide again using gl_FragCoord, you actually undo the perspective division and things will get weird.
There are reasons you might want to divide by 1/clip.w, but this is not one of them.
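As a rough CPU-side illustration of the point above, the following C++ sketch mimics what happens to a clip-space position between the vertex shader and the fragment shader. The names fragCoordFromClip, Vec4 and Viewport are made up for this example, and it assumes the default depth range of [0, 1]; it is not how any driver actually implements the step.

```cpp
#include <cassert>

struct Vec4 { double x, y, z, w; };
struct Viewport { double x, y, width, height; };  // values as passed to glViewport

// Hypothetical helper: turns a clip-space position into the value a fragment
// shader would see in gl_FragCoord, assuming the default depth range [0, 1].
Vec4 fragCoordFromClip(const Vec4& clip, const Viewport& vp) {
    assert(clip.w != 0.0);                 // points with w == 0 cannot be divided
    const double invW = 1.0 / clip.w;
    const double ndcX = clip.x * invW;     // perspective divide -> NDC in [-1, 1]
    const double ndcY = clip.y * invW;
    const double ndcZ = clip.z * invW;
    Vec4 frag;
    frag.x = (ndcX + 1.0) * 0.5 * vp.width + vp.x;   // viewport transform
    frag.y = (ndcY + 1.0) * 0.5 * vp.height + vp.y;
    frag.z = (ndcZ + 1.0) * 0.5;                     // window-space depth
    frag.w = invW;                                   // 1 / clip.w, not clip.w
    return frag;
}
```

Because the fourth component already holds 1/clip.w, dividing the window-space x and y by it multiplies them by clip.w again, which is the "undo" effect described in the answer.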

Related

OpenGL vertex shader for pinhole camera model

I am trying to implement a simple OpenGL renderer that simulates a pinhole camera model (as defined for example here). Currently I use the vertex shader to map the 3D vertices to the clip space, where K in the shader contains [focal length x, focal length y, principal point x, principal point y] and zrange is the depth range of the vertices.
#version 330 core
layout (location = 0) in vec3 vin;
layout (location = 1) in vec3 cin;
layout (location = 2) in vec3 nin;
out vec3 shader_pos;
out vec3 shader_color;
out vec3 shader_normal;
uniform vec4 K;
uniform vec2 zrange;
uniform vec2 imsize;
void main() {
    vec3 uvd;
    uvd.x = (K[0] * vin.x + K[2] * vin.z) / vin.z;
    uvd.y = (K[1] * vin.y + K[3] * vin.z) / vin.z;
    uvd.x = 2 * uvd.x / (imsize[0]) - 1;
    uvd.y = 2 * uvd.y / (imsize[1]) - 1;
    uvd.z = 2 * (vin.z - zrange[0]) / (zrange[1] - zrange[0]) - 1;
    shader_pos = uvd;
    shader_color = cin;
    shader_normal = nin;
    gl_Position = vec4(uvd.xyz, 1.0);
}
I verify the renderings with a simple ray-tracer, however there seems to be an offset stemming from my OpenGL implementation. The depth values are different, but not by an affine offset as it would be caused by a wrong remapping (see the slanted surface on the tetrahedron, ignoring the errors on the edges).
"I am trying to implement a simple OpenGL renderer that simulates a pinhole camera model."
A standard perspective projection matrix already implements a pinhole camera model. What you're doing here is just having more calculations per vertex, which could all be pre-calculated on the CPU and put in a single matrix.
The only difference is the z range. But a "pinhole camera" does not have a z range, all points are projected to the image plane. So what you want here is a pinhole camera model for x and y, and a linear mapping for z.
However, your implementation is wrong. A GPU will interpolate z linearly in window space. That means it will calculate the barycentric coordinates of each fragment with respect to the 2D projection of the triangle in the window. However, when using a perspective projection, and when the triangle is not exactly parallel to the image plane, those barycentric coordinates will not be the ones the respective 3D point would have had with respect to the actual 3D primitive before the projection.
The trick here is that since in screen space we typically have x/z and y/z as the vertex coordinates, and we interpolate linearly between those, we also have to interpolate 1/z for the depth. In reality, however, we don't divide by z but by w (and let the projection matrix set w_clip = [+/-]z_eye for us). After the division by w_clip, we get a hyperbolic mapping of the z value, but with the nice property that it can be linearly interpolated in window space.
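The claim that 1/z, rather than z, interpolates linearly in screen space can be checked numerically. The sketch below works in a top-down 2D view (x sideways, z as distance from the eye); all function names are invented for this illustration. It compares both interpolation schemes against a reference obtained by intersecting the view ray with the segment.

```cpp
#include <cassert>

struct Point { double x, z; };  // top-down view: x sideways, z = distance from eye

// Depth obtained by interpolating 1/z linearly in screen space.
double depthFromInverseZ(double z0, double z1, double s) {
    assert(z0 > 0.0 && z1 > 0.0);
    return 1.0 / ((1.0 - s) / z0 + s / z1);
}

// Depth obtained by interpolating z itself linearly in screen space.
double depthFromLinearZ(double z0, double z1, double s) {
    return (1.0 - s) * z0 + s * z1;
}

// Reference: intersect the view ray through the screen position that lies a
// fraction s between the projections of p0 and p1 with the segment p0-p1.
double depthByIntersection(const Point& p0, const Point& p1, double s) {
    const double sx = (1.0 - s) * (p0.x / p0.z) + s * (p1.x / p1.z);
    const double dx = p1.x - p0.x;
    const double dz = p1.z - p0.z;
    const double denom = dx - sx * dz;
    assert(denom != 0.0);                  // segment must not be seen edge-on
    const double t = (sx * p0.z - p0.x) / denom;
    return p0.z + t * dz;
}
```

For the segment from (0, 1) to (4, 4), the screen-space midpoint lies at depth 1.6 on the actual segment. Interpolating 1/z reproduces that value, while interpolating z directly gives 2.5.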
What this means is that with your linear z mapping, your primitives would now have to be bent along the z dimension to get the correct result. Look at the following top-down view of the situation. The "lines" represent flat triangles, seen from straight above:
In eye space, the view rays all go from the origin through each pixel (we could imagine the 2D pixel raster on the near plane, for example). In NDC, we have transformed this into an orthographic projection. The pixels can still be imagined on the near plane, but all view rays are now parallel.
In the standard hyperbolic mapping, the point in the middle of the frustum is compressed strongly towards the far end. However, the triangle is still flat.
If you use a linear mapping instead, your triangle would no longer be flat. Look, for example, at the intersection point between the two triangles. It must have the same x (and y) coordinate as in the hyperbolic case for the result to be correct.
However, you only transform the vertices according to a linear z value, and the GPU still interpolates the result linearly. In your case you therefore get straight connections between your transformed points, the intersection point between the two triangles is moved, and your depth values are all wrong except at the actual vertices themselves.
If you want to use a linear depth buffer, you have to correct the depth of each fragment in the fragment shader to implement the required non-linear interpolation on your own. Doing so would break a lot of the clever depth-test optimizations GPUs do, notably early Z and hierarchical Z, so while it is possible, you'll lose some performance.
The much better solution is to use a standard hyperbolic depth value and linearize the depth values after you read them back. Also, don't do the z division in the vertex shader. That not only breaks z, it also breaks the perspective-corrected interpolation of the varyings, so your shading will be wrong too. Let the GPU do the division and just put the correct value into gl_Position.w. The GPU does not only perform the divide internally; the perspective-corrected interpolation also depends on w.
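Linearizing after read-back could look roughly like the sketch below. It assumes a standard glFrustum/gluPerspective-style projection with near and far distances nearZ and farZ, and the default glDepthRange(0, 1). A custom pinhole matrix built from K would need its z row set up the same way for these formulas to apply. Both function names are invented for this example.

```cpp
#include <cassert>

// Depth-buffer value for a point at positive eye-space distance z, assuming a
// standard perspective matrix and the default depth range [0, 1].
double hyperbolicDepth(double z, double nearZ, double farZ) {
    assert(nearZ > 0.0 && farZ > nearZ && z > 0.0);
    const double zNdc = (farZ + nearZ) / (farZ - nearZ)
                      - 2.0 * farZ * nearZ / ((farZ - nearZ) * z);
    return 0.5 * zNdc + 0.5;               // NDC [-1, 1] -> depth [0, 1]
}

// Inverse mapping: recover the eye-space distance from a depth-buffer value.
double linearizeDepth(double depth, double nearZ, double farZ) {
    assert(nearZ > 0.0 && farZ > nearZ);
    const double zNdc = 2.0 * depth - 1.0; // depth [0, 1] -> NDC [-1, 1]
    return 2.0 * nearZ * farZ / (farZ + nearZ - zNdc * (farZ - nearZ));
}
```

With nearZ = 0.1 and farZ = 100, a point halfway through the frustum already has a stored depth above 0.99, which illustrates how strongly the hyperbolic mapping compresses the far range.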

How to get homogeneous screen space coordinates in openGL

I'm studying OpenGL and I've got a little 3D scene with some objects. In the GLSL vertex shader I multiply the vertices by matrices like this:
vertexPos= viewMatrix * worldMatrix * modelMatrix * gl_Vertex;
gl_Position = vertexPos;
vertexPos is a vec4 varying variable and I pass it to fragment shader.
Here is how the scene renders normally:
[image: normal render]
But then I want to do a debug render. In the fragment shader I write:
gl_FragColor = vec4(vertexPos.x, vertexPos.x, vertexPos.x, 1.0);
vertexPos is multiplied by all the matrices, including the perspective matrix, and I assumed that I would get a smooth gradient from the center of the screen to the right edge, because the coordinates are mapped into the -1 to 1 square. But it looks like they are in screen space with no perspective deformation applied. Here is what I see:
(don't look at the red line and the light source, they use a different shader)
[image: debug render]
If I divide it by about 15 it looks like this:
gl_FragColor = vec4(vertexPos.x, vertexPos.x, vertexPos.x, 1.0)/15.0;
[image: divided by 15]
Can someone please explain to me why the coordinates aren't homogeneous and yet the scene still renders correctly with perspective distortion?
P.S. If I try to use gl_Position in the fragment shader instead of vertexPos, it doesn't work.
A so-called perspective division is applied to gl_Position after it's computed in a vertex shader:
gl_Position.xyz /= gl_Position.w;
But it doesn't happen to your varyings unless you do it manually. Thus, you need to add
vertexPos.xyz /= vertexPos.w;
at the end of your vertex shader. Make sure to do it after you copy the value to gl_Position; you don't want to do the division twice.
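One caveat that may be worth checking: varyings are interpolated with perspective correction by default, so a value divided per vertex matches the fragment's NDC position exactly only at the vertices. The C++ sketch below is a simplified 1D model of that interpolation along one edge. The names are invented, and s is assumed to be the screen-space fraction between the two vertices. It compares dividing in the vertex shader with passing the undivided clip-space value and dividing in the fragment shader.

```cpp
#include <cassert>

struct ClipVertex { double x, w; };  // one clip-space coordinate plus w

// Perspective-correct interpolation of an attribute along an edge, where s is
// the screen-space fraction between the two vertices.
double perspectiveCorrect(double a0, double a1, double w0, double w1, double s) {
    assert(w0 > 0.0 && w1 > 0.0);
    const double denom = (1.0 - s) / w0 + s / w1;
    return ((1.0 - s) * a0 / w0 + s * a1 / w1) / denom;
}

// Divide per vertex, then let the rasterizer interpolate the result.
double ndcDividedInVertexShader(const ClipVertex& v0, const ClipVertex& v1, double s) {
    return perspectiveCorrect(v0.x / v0.w, v1.x / v1.w, v0.w, v1.w, s);
}

// Interpolate the undivided clip-space value, then divide per fragment.
double ndcDividedInFragmentShader(const ClipVertex& v0, const ClipVertex& v1, double s) {
    const double x = perspectiveCorrect(v0.x, v1.x, v0.w, v1.w, s);
    const double w = perspectiveCorrect(v0.w, v1.w, v0.w, v1.w, s);
    return x / w;
}
```

For vertices with clip-space (x, w) of (-1, 1) and (4, 4), whose NDC x values are -1 and 1, the screen-space midpoint has NDC x = 0. The per-fragment divide returns 0, while the per-vertex divide returns -0.6, so for a debug gradient the per-fragment variant is probably the safer choice.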

OpenGL shadow mapping with deferred rendering, position transformation

I am using deferred rendering, where I store the eye-space position in a texture as follows:
vertex:
gl_Position = vec4(vertex_position, 1.0);
geometry:
vertexOut.position = vec3(viewMatrix * modelMatrix * gl_in[i].gl_Position);
fragment:
positionOut = vec3(vertexIn.position);
Now, in the second pass (lighting pass) I am trying to sample my shadow map, using UV coordinates calculated from this vec4
vec4 lightSpacePos = lightProjectionMatrix * lightViewMatrix * lightModelMatrix * vec4(position, 1.0);
The position used is the same position stored and sampled from the position texture.
Do I need to transform the position with the inverse camera view matrix before doing this calculation, to bring it back to world space? Or how should I proceed?
Typically shadow mapping is done by comparing the window-space Z coordinate (this is what a depth texture stores) of your current fragment vs. your light. This must be done using a common reference orientation, so that involves re-projecting your current fragment's position from the perspective of your light.
You have the view-space position right now, which is relative to your current camera and not particularly useful. To do this effectively you want world-space position. You can get that if you transform the view-space position by the inverse view matrix.
Given the world-space position, transform it into clip-space from the light's perspective:
// This will be in clip-space (assuming worldPos is a vec3)
vec4 lightSpacePos = lightProjectionMatrix * lightViewMatrix * vec4 (worldPos, 1.0);
// Transform it into NDC-space by dividing by w
lightSpacePos /= lightSpacePos.w;
// Range is now [-1.0, 1.0], but you need [0.0, 1.0]
lightSpacePos = lightSpacePos * vec4 (0.5) + vec4 (0.5);
Assuming default depth range, lightSpacePos is now ready for use. xy contains the texture coordinates to sample from your shadow map and z contains the depth to use for comparison.
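The same sequence can be written out on the CPU to check values by hand. The following C++ sketch uses a tiny hand-rolled matrix type stored row-major, which is an assumption of this example and differs from GLSL's column-major storage. The name shadowCoord is invented, and lightViewProj is assumed to already hold lightProjectionMatrix * lightViewMatrix.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };  // row-major in this sketch: m[row][col]

// Matrix * column vector.
Vec4 mul(const Mat4& a, const Vec4& v) {
    const double in[4] = { v.x, v.y, v.z, v.w };
    double out[4];
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0;
        for (int c = 0; c < 4; ++c) out[r] += a.m[r][c] * in[c];
    }
    return Vec4{ out[0], out[1], out[2], out[3] };
}

// Returns (u, v, depth) in [0, 1] for points inside the light's frustum,
// assuming the default depth range.
Vec3 shadowCoord(const Mat4& lightViewProj, const Vec3& worldPos) {
    const Vec4 clip = mul(lightViewProj, Vec4{ worldPos.x, worldPos.y, worldPos.z, 1.0 });
    assert(clip.w != 0.0);
    return Vec3{ clip.x / clip.w * 0.5 + 0.5,    // NDC [-1, 1] -> [0, 1]
                 clip.y / clip.w * 0.5 + 0.5,
                 clip.z / clip.w * 0.5 + 0.5 };
}
```

The x and y of the result would index the shadow map, and z would be compared against the depth stored there.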
For a more thorough explanation, see the following answer.
Incidentally, you will want to eliminate your position texture from your G-Buffer to achieve reasonable performance. It is very easy to reconstruct world- or view-space position given only the depth and the projection and view matrices and the arithmetic involved is much quicker than an extra texture fetch. Storing an additional texture with adequate precision to represent position in 3D space will burn through tons of memory bandwidth each frame and is completely unnecessary.
This article from the OpenGL Wiki explains how to do this. You can take it one step farther and work back to world-space, which is more desirable than view-space. You may need to tweak your depth buffer a little bit to get adequate precision, but it will still be quicker than storing position separately.

Apply custom projectionmatrix (to texturecoordinate) in GLSL

I want to view a flat fullscreen texture as if it were spherical, by transforming it in a postprocess shader.
I figure I have to apply a projection matrix to the texture coordinate in the shader.
I found this website: http://www.songho.ca/opengl/gl_projectionmatrix.html which taught me a lot about the internals of the projection matrix.
But how do I apply it? I thought I would have to multiply the third row of the projection matrix to the texture coordinate with a calculated z value added to make it spherical. My efforts don't show any result though.
EDIT: I see the same issue here: http://lists.openscenegraph.org/pipermail/osg-users-openscenegraph.org/2008-April/009765.html
I think that after you multiply the texture coordinates by the projection matrix you have to do a perspective division and move from 3D to 2D (since the texture is 2D). This is the same as with shadow mapping.
// in fragment shader:
vec4 proj = uniformModelViewProjMatrix * tex_coords;
proj.xyz /= proj.w;
proj.xyz += vec3(1.0);
proj.xyz *= 0.5;
vec4 col = texture2D(sampler, proj.xy);
or look at http://www.ozone3d.net/tutorials/glsl_texturing_p08.php (for texture2DProj)

C++/OpenGL convert world coords to screen(2D) coords

I am making a game in OpenGL where I have a few objects within the world space. I want to make a function where I can take in an object's location (3D) and transform it to the screen's location (2D) and return it.
I know the 3D location of the object, the projection matrix and the view matrix, in the following variables:
Matrix projectionMatrix;
Matrix viewMatrix;
Vector3 point3D;
To do this transform, you must first take your world-space position and transform it to clip-space. This is done with matrix multiplies. I will use GLSL-style code to make it obvious what I'm doing:
vec4 clipSpacePos = projectionMatrix * (viewMatrix * vec4(point3D, 1.0));
Notice how I convert your 3D vector into a 4D vector before the multiplication. This is necessary because the matrices are 4x4, and you cannot multiply a 4x4 matrix with a 3D vector. You need a fourth component.
The next step is to transform this position from clip-space to normalized device coordinate space (NDC space). NDC space is on the range [-1, 1] in all three axes. This is done by dividing the first three coordinates by the fourth:
vec3 ndcSpacePos = clipSpacePos.xyz / clipSpacePos.w;
Obviously, if clipSpacePos.w is zero, you have a problem, so you should check that beforehand. If it is zero, that means the object is in the plane of projection; its view-space depth is zero. Such vertices are automatically clipped by OpenGL.
The next step is to transform from this [-1, 1] space to window-relative coordinates. This requires the use of the values you passed to glViewport. The first two parameters are the offset from the bottom-left of the window (vec2 viewOffset), and the second two parameters are the width/height of the viewport area (vec2 viewSize). Given these, the window-space position is:
vec2 windowSpacePos = ((ndcSpacePos.xy + 1.0) / 2.0) * viewSize + viewOffset;
And that's as far as you go. Remember: OpenGL's window-space is relative to the bottom-left of the window, not the top-left.
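Put together as a C++ function, the steps above might look like the following sketch. The question's Matrix and Vector3 types are not shown, so this example substitutes minimal stand-ins (Mat4 stored row-major, Vec2, Vec3, Vec4). The function name worldToScreen and its out-parameter signature are likewise assumptions.

```cpp
#include <cassert>

struct Vec2 { double x, y; };
struct Vec3 { double x, y, z; };
struct Vec4 { double x, y, z, w; };
struct Mat4 { double m[4][4]; };  // stand-in type, row-major: m[row][col]

// Matrix * column vector.
Vec4 mul(const Mat4& a, const Vec4& v) {
    const double in[4] = { v.x, v.y, v.z, v.w };
    double out[4];
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0;
        for (int c = 0; c < 4; ++c) out[r] += a.m[r][c] * in[c];
    }
    return Vec4{ out[0], out[1], out[2], out[3] };
}

// Writes the window-space position (origin at bottom-left) to 'out'.
// Returns false when w is zero and no screen position exists.
bool worldToScreen(const Mat4& projection, const Mat4& view, const Vec3& point3D,
                   const Vec2& viewOffset, const Vec2& viewSize, Vec2& out) {
    assert(viewSize.x > 0.0 && viewSize.y > 0.0);
    // World space -> clip space.
    const Vec4 clip = mul(projection, mul(view, Vec4{ point3D.x, point3D.y, point3D.z, 1.0 }));
    if (clip.w == 0.0) return false;
    // Clip space -> NDC via the perspective divide.
    const double ndcX = clip.x / clip.w;
    const double ndcY = clip.y / clip.w;
    // NDC [-1, 1] -> window coordinates using the glViewport values.
    out.x = (ndcX + 1.0) / 2.0 * viewSize.x + viewOffset.x;
    out.y = (ndcY + 1.0) / 2.0 * viewSize.y + viewOffset.y;
    return true;
}
```

Under a standard perspective projection, a point behind the camera has a negative w and still yields a (mirrored) result, so a caller may also want to reject clip.w <= 0. If the UI layer uses a top-left origin, the usual adjustment is to replace y with windowHeight - y.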