3D point to screen position: discarding points behind the camera in OpenGL

I'm trying to convert a 3D point to its screen position. This is the code that I use:
glm::vec2 screenPosition(const glm::vec3 & _coord) const {
    glm::vec4 coord = glm::vec4(_coord, 1);
    coord = getProjection() * getView() * coord;
    coord.x /= coord.w;
    coord.y /= coord.w;
    coord.z /= coord.w;
    coord.x = (coord.x + 1) * width * 0.5;
    coord.y = (coord.y + 1) * height * 0.5;
    return glm::vec2(coord.x, coord.y);
}
I'm not 100% sure about the code, but I do not know how to discard the points that are behind the camera.
Can someone help me?
Thanks

If coord.z (after the division by coord.w) is not in the interval [-1, 1], the point should be discarded. A value outside that interval indicates that the point is not inside the camera frustum, which also covers the case where the point is behind the camera. For DirectX and Vulkan the interval is [0, 1].
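A minimal sketch of that check, reusing the question's getProjection(), getView(), width and height (testing coord.w before the divide is an addition; it avoids a division by zero for points exactly in the camera plane):

// Sketch: returns false when the point should be discarded.
bool screenPosition(const glm::vec3 & worldPos, glm::vec2 & out) const {
    glm::vec4 clip = getProjection() * getView() * glm::vec4(worldPos, 1.0f);
    if (clip.w <= 0.0f)                  // behind (or exactly in) the camera plane
        return false;
    glm::vec3 ndc = glm::vec3(clip) / clip.w;
    if (ndc.z < -1.0f || ndc.z > 1.0f)   // outside the near/far range ([0, 1] for DirectX/Vulkan)
        return false;
    out.x = (ndc.x + 1.0f) * width * 0.5f;
    out.y = (ndc.y + 1.0f) * height * 0.5f;
    return true;
}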

The correct way to do the culling / clipping of primitives is to do it in clip space, before the perspective divide.
The default OpenGL clip convention is -w <= x, y, z <= w, and all points which do not fulfill this condition can be discarded (culling). Note that discarding points only works for point primitives; if you deal with more complex primitives (lines, triangles), you need to do actual clipping.
In the most general case, you will be using a perspective projection, and the clip-space w value will vary per vertex - and it can be 0 - so trying to do the discard in NDC will lead to a division by zero in such cases.
If you only care about clipping points behind the camera, you can discard everything with w <= 0, but usually, additionally clipping against the near plane makes much more sense (and also avoids some numerical issues when getting very close to the camera): z < -w.
I'd like to stress a few details here. The clip condition -w <= x, y, z <= w implies that points which lie truly behind the camera (w < 0) must be rejected, but the w = 0 case is still a bit weird, because the homogeneous point (0,0,0,0) would still satisfy the above clip condition (and yield no useful result when doing the perspective divide). However, OpenGL (and GPUs) do not clip against the plane the camera lies in (w = 0), but against a view volume, and require you to set up a near plane which is in front of the camera. In such a scenario, even if w = 0 can occur, it is guaranteed that w = 0 and z = 0 never happen simultaneously, so the (0,0,0,0) case is never hit. However, this does not prevent people from actually feeding (0,0,0,0) into gl_Position, and you can assume that real-world implementations will not only reject the w < 0 case which is directly mandated by the above clip condition, but will reject/clip anything with w <= 0. Note that primitive clipping where one vertex has clip-space coords of (0,0,0,0) will still result in nonsense, but then you're explicitly asking for that.
For an orthogonal projection, there is actually no way to clip points behind the camera, because conceptually, the camera is infinitely far away. You still might set up an imagined "camera position" via your view matrix, and a view volume via the projection matrix, and you can still cull/clip against the near plane there (z < -w). Note that for practical purposes, an orthogonal projection will yield w = 1, so the additional w <= 0 check required in the perspective case is irrelevant.
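Putting those clip-space conditions into code, a hedged sketch for culling a single point (pointVisible and clip are placeholder names, not from the question):

// Sketch: cull a point in clip space, before the perspective divide.
// 'clip' is projection * view * model * vec4(position, 1).
bool pointVisible(const glm::vec4 & clip) {
    if (clip.w <= 0.0f)      // behind the camera plane, including the degenerate w == 0 case
        return false;
    return -clip.w <= clip.x && clip.x <= clip.w &&
           -clip.w <= clip.y && clip.y <= clip.w &&
           -clip.w <= clip.z && clip.z <= clip.w;   // default OpenGL clip convention
}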

Related

Project a 3D vertex to screen coordinates independently from OpenGL?

I have a vertex (x, y, z) and I want to calculate the screen location where this point would be rendered on my viewport. Something like Ray Picking, just more or less the other way around. I don't think I can use gluProject because at the time I need the projected point my matrices are restored to identities.
I would like to stay independent from OpenGL, so no extra render pass. This way I'm sure it would only be some math like the ray picking thing. I've implemented that one and it works well, so I want to project a vertex the same way.
Of course I have camera pos, up and lookAt vectors and fovy. Is there any source of information about this? Or does anyone know how to work this out?
If you know your matrices (or at least know how to construct them), you can compute the screen location for a vertex by multiplying its position with the matrices and then performing the viewport transformation:
vProjected = modelViewProjectionMatrix * v;
if (
    // check that the vertex shouldn't be clipped
    -vProjected.w <= vProjected.x && vProjected.x <= vProjected.w &&
    -vProjected.w <= vProjected.y && vProjected.y <= vProjected.w &&
    -vProjected.w <= vProjected.z && vProjected.z <= vProjected.w
) {
    vProjected /= vProjected.w;
    vScreen.x = VIEWPORT_W * vProjected.x / 2 + VIEWPORT_CENTER_X;
    vScreen.y = VIEWPORT_H * vProjected.y / 2 + VIEWPORT_CENTER_Y;
}
Note that, as per the OpenGL convention, (0, 0) is the lower-left corner, not the upper-left one.
Any math library with vector and matrix operations can help you with that. For example, mathfu or glm.
UPD. How can you construct modelViewProjectionMatrix given the camera position and orientation and the projection parameters? We need two matrices (let's assume that the model matrix is just an identity, i.e. vertex positions are already given in the world coordinate system). The first one is the view matrix, which takes the camera position and orientation into account. Here I'll be using mathfu since I'm more familiar with it, but almost every math library designed with 3D graphics in mind has the same functions:
viewMatrix = mathfu::mat4::LookAt(
    cameraLookAtPosition,
    cameraPosition,
    cameraUpVector
);
The second one would be projection matrix:
projectionMatrix = mathfu::mat4::Perspective(fovy, aspect, zNear, zFar);
Now modelViewProjectionMatrix is just a product of those two:
modelViewProjectionMatrix = projectionMatrix * viewMatrix;
Note that matrix multiplication is not commutative, in other words A * B != B * A. So order in which matrices are multiplied is important.
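For comparison, a hedged glm-based sketch of the same pipeline, including the clip test and viewport transformation from above (vertex, the camera vectors and VIEWPORT_W/VIEWPORT_H are placeholders):

// Same math with glm instead of mathfu; the model matrix is assumed to be identity.
glm::mat4 view = glm::lookAt(cameraPosition, cameraLookAtPosition, cameraUpVector);
glm::mat4 projection = glm::perspective(fovy, aspect, zNear, zFar);   // fovy in radians
glm::mat4 mvp = projection * view;

glm::vec4 clip = mvp * glm::vec4(vertex, 1.0f);
if (glm::abs(clip.x) <= clip.w && glm::abs(clip.y) <= clip.w && glm::abs(clip.z) <= clip.w) {
    glm::vec3 ndc = glm::vec3(clip) / clip.w;
    glm::vec2 screen((ndc.x + 1.0f) * 0.5f * VIEWPORT_W,
                     (ndc.y + 1.0f) * 0.5f * VIEWPORT_H);   // (0, 0) at the lower-left corner
}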

Clipping triangles in screen space per pixel

As I understand it, in OpenGL polygons are usually clipped in clip space, and only those triangles (or parts of triangles, if the clipping process splits them) that survive the comparison with ±w are rasterized. This then requires implementing a polygon clipping algorithm such as Sutherland-Hodgman.
I am implementing my own CPU rasterizer and for now would like to avoid doing that. I have the NDC coordinates of the vertices available (not really normalized, since I did not clip anything, so the positions may not be in the range [-1, 1]). I would like to interpolate these values for all pixels and only draw the pixels whose NDC coordinates fall within [-1, 1] in the x, y and z dimensions. I would then additionally perform the depth test.
Would this work? If yes, what would the interpolation look like? Can I use the attribute interpolation formula from the OpenGL spec (equation 14.9, page 427) as described here? Alternatively, should I use formula 14.10, which is used for depth (z) interpolation, for all 3 coordinates (I don't really understand why a different one is used there)?
Update:
I have tried interpolating the NDC values per pixel by two methods:
w0, w1, w2 are the barycentric weights of the vertices.
1)
float x_ndc = w0 * v0_NDC.x + w1 * v1_NDC.x + w2 * v2_NDC.x;
float y_ndc = w0 * v0_NDC.y + w1 * v1_NDC.y + w2 * v2_NDC.y;
float z_ndc = w0 * v0_NDC.z + w1 * v1_NDC.z + w2 * v2_NDC.z;
2)
float x_ndc = (w0*v0_NDC.x/v0_NDC.w + w1*v1_NDC.x/v1_NDC.w + w2*v2_NDC.x/v2_NDC.w) /
              (w0/v0_NDC.w + w1/v1_NDC.w + w2/v2_NDC.w);
float y_ndc = (w0*v0_NDC.y/v0_NDC.w + w1*v1_NDC.y/v1_NDC.w + w2*v2_NDC.y/v2_NDC.w) /
              (w0/v0_NDC.w + w1/v1_NDC.w + w2/v2_NDC.w);
float z_ndc = w0 * v0_NDC.z + w1 * v1_NDC.z + w2 * v2_NDC.z;
The clipping + depth test always looks like this:
if (-1.0f < z_ndc && z_ndc < 1.0f && z_ndc < currentDepth &&
    -1.0f < y_ndc && y_ndc < 1.0f &&
    -1.0f < x_ndc && x_ndc < 1.0f)
Case 1) corresponds to using equation 14.10 for their interpolation. Case 2) corresponds to using equation 14.9 for interpolation.
Results documented in gifs on imgur.
1) Strange things happen when the second cube is behind the camera or when I go into a cube.
2) Strange artifacts are not visible, but as the camera approaches vertices, they start disappearing. And since this is the perspective-correct interpolation of attributes, vertices (nearer to the camera?) have greater weight, so as soon as a vertex gets clipped, this information is interpolated with a strong weight into the triangle's pixels.
Is all of this expected or have I done something wrong?
Clipping against the near plane is not strictly necessary, unless the triangle goes to or past 0 in the camera-space Z. Once that happens, the homogeneous coordinate math gets weird.
Most hardware only bothers to clip triangles if they extend more than a screen's width outside the clip space or if they cross the camera-Z of zero. This kind of clipping is called "guard-band clipping", and it saves a lot of performance, since clipping isn't cheap.
So yes, the math can work fine. The main thing you have to do, when setting up your scan lines, is figure out where each of them starts and ends on screen. The interpolation math is the same either way.
I don't see any reason why this wouldn't work. But it will be way slower than traditional clipping. Note that you might get into trouble with triangles close to the projection center, since they will be vanishingly small and might cause problems in the barycentric coordinate calculation.
The difference between equations 14.9 and 14.10 is that depth is basically z/w (and remapped to [0, 1]). Since the perspective divide has already happened, it has to be left out during the interpolation.
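To make the difference concrete, a hedged sketch of both interpolation schemes (the Vert struct and function names are placeholders; each vertex is assumed to carry its clip-space w alongside its NDC values, as in the question's case 2):

// w0, w1, w2 are the barycentric weights of the pixel within the triangle.
struct Vert { float x, y, z, w; };   // NDC x/y/z plus the original clip-space w

// Attribute interpolation (spec eq. 14.9): weight each vertex by 1/w (perspective correct).
float interpAttribute(float a0, float a1, float a2,
                      const Vert & v0, const Vert & v1, const Vert & v2,
                      float w0, float w1, float w2) {
    float num = w0 * a0 / v0.w + w1 * a1 / v1.w + w2 * a2 / v2.w;
    float den = w0 / v0.w + w1 / v1.w + w2 / v2.w;
    return num / den;
}

// Depth interpolation (spec eq. 14.10): z has already been divided by w,
// so it is interpolated linearly in screen space.
float interpDepth(float z0, float z1, float z2, float w0, float w1, float w2) {
    return w0 * z0 + w1 * z1 + w2 * z2;
}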

Ray tracing texture implementation for spheres

I'm trying to implement textures for spheres in my ray tracer. I managed to get something working, but I am unsure about its correctness. Below is the code for getting the texture coordinates. For now, the texture is random and is generated at runtime.
virtual void GetTextureCoord(Vect hitPoint, int hres, int vres, int& x, int& y) {
    float theta = acos(hitPoint.getVectY());
    float phi = atan2(hitPoint.getVectX(), hitPoint.getVectZ());
    if (phi < 0.0) {
        phi += TWO_PI;
    }
    float u = phi * INV_TWO_PI;
    float v = 1 - theta * INV_PI;
    y = (int) ((hres - 1) * u);
    x = (int) ((vres - 1) * v);
}
This is how the spheres look now:
I had to normalize the coordinates of the hit point to get the spheres to look like that. Otherwise they would look like:
Was normalising the hit point coordinates the right approach, or is something else broken in my code? Thank you!
Instead of normalising the hit point, I tried translating it to the world origin (as if the sphere center was there) and obtained the following result:
I'm using a 256x256 resolution texture by the way.
It's unclear what you mean by "normalizing" the hit point since there's nothing that normalizes it in the code you posted, but you mentioned that your hit point is in world space.
Also, you didn't say what texture mapping you're trying to implement, but I assume you want your U and V texture coordinates to represent latitude and longitude on the sphere's surface.
Your first problem is that converting Cartesian to spherical coordinates requires that the sphere is centered at the origin in the Cartesian space, which isn't true in world space. If the hit point is in world space, you have to subtract the sphere's world-space center point to get the effective hit point in local coordinates. (You figured this part out already and updated the question with a new image.)
Your second problem is that the way you're calculating theta requires that the sphere have a radius of 1, which isn't true even after you move the sphere's center to the origin. Remember your trigonometry: the argument to acos is the ratio of a triangle's side to its hypotenuse, and is always in the range [-1, +1]. In this case your Y-coordinate is the side, and the sphere's radius is the hypotenuse. So you have to divide by the sphere's radius when calling acos. It's also a good idea to clamp the value to the [-1, +1] range in case floating-point rounding error puts it slightly outside.
(In principle you'd also have to divide the X and Z coordinates by the radius, but you're only using those for an inverse tangent, and dividing them both by the radius won't change their quotient and thus won't change phi.)
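A hedged sketch of the corrected coordinate computation (sphereCenter and sphereRadius are assumed members of the sphere class; everything else mirrors the question's code):

virtual void GetTextureCoord(Vect hitPoint, int hres, int vres, int& x, int& y) {
    // Move the hit point into the sphere's local frame and normalise by its radius.
    float lx = (hitPoint.getVectX() - sphereCenter.getVectX()) / sphereRadius;
    float ly = (hitPoint.getVectY() - sphereCenter.getVectY()) / sphereRadius;
    float lz = (hitPoint.getVectZ() - sphereCenter.getVectZ()) / sphereRadius;
    // Clamp against floating-point rounding error before calling acos.
    if (ly < -1.0f) ly = -1.0f;
    if (ly >  1.0f) ly =  1.0f;
    float theta = acos(ly);
    float phi = atan2(lx, lz);   // dividing x and z by the radius doesn't change the quotient
    if (phi < 0.0) {
        phi += TWO_PI;
    }
    float u = phi * INV_TWO_PI;
    float v = 1 - theta * INV_PI;
    y = (int) ((hres - 1) * u);
    x = (int) ((vres - 1) * v);
}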
Right now your sphere intersection and texture-coordinate functions are operating in world space, but you'll probably find it useful later to implement transformation matrices, which let you transform things from one coordinate space to another. Then you can change your sphere functions to operate in a local coordinate space where the center is the origin and the radius is 1, and give each object an associated transformation matrix that maps the local coordinate space to the world coordinate space. This will simplify your ray/sphere intersection code, and let you remove the origin subtraction and radius division from GetTextureCoord (since they're always (0, 0, 0) and 1 respectively).
To intersect a ray with an object, you'd use the object's transformation matrix to transform the ray into the object's local coordinate space, do the intersection (and compute texture coordinates) there, and then transform the result (e.g. hit point and surface normal) back to world space.
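As a rough sketch of that pattern using glm (which appears elsewhere in this thread), with intersectUnitSphere and all variable names as hypothetical placeholders:

// Transform the ray into the object's local space, intersect there, then go back to world space.
glm::mat4 worldToObject = glm::inverse(objectToWorld);

glm::vec3 localOrigin    = glm::vec3(worldToObject * glm::vec4(rayOrigin, 1.0f));    // points: w = 1
glm::vec3 localDirection = glm::vec3(worldToObject * glm::vec4(rayDirection, 0.0f)); // vectors: w = 0

glm::vec3 localHit, localNormal;
if (intersectUnitSphere(localOrigin, localDirection, localHit, localNormal)) {   // sphere at origin, radius 1
    glm::vec3 worldHit = glm::vec3(objectToWorld * glm::vec4(localHit, 1.0f));
    // Normals transform with the inverse-transpose of the object-to-world matrix.
    glm::vec3 worldNormal = glm::normalize(
        glm::vec3(glm::transpose(worldToObject) * glm::vec4(localNormal, 0.0f)));
}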

How to check if an object lies outside the clipping volume in OpenGL?

I'm really confused about OpenGL's modelview transformation. I understand all the transformation processes, but when it comes to projection matrix, I'm lost :(
If I have a point P (x, y, z), how can I check to see if this point will be drawn on a clipping volume defined by either by parallel clipping volume or perspective clipping volume? What's the mathematical background behind this process?
Apply the model-view-projection matrix to the object, then check if it lies outside the clip coordinate frustum, which is defined by the planes:
-w < x < w
-w < y < w
0 < z < w
So if you have a point p which is a vec3, and a model-view-projection matrix, M, then in GLSL it would look like this:
bool in_frustum(mat4 M, vec3 p) {
    vec4 Pclip = M * vec4(p, 1.);
    return abs(Pclip.x) < Pclip.w &&
           abs(Pclip.y) < Pclip.w &&
           0 < Pclip.z &&
           Pclip.z < Pclip.w;
}
For anyone relying on the accepted answer, it is incorrect (at least in current implementations): OpenGL clips in z the same way as in x and y, i.e. -w < z < w (https://www.khronos.org/opengl/wiki/Vertex_Post-Processing).
The two tests for z should then be replaced with: abs(Pclip.z) < Pclip.w
Checking against zero would exclude all drawn points that are closer to the near clip plane than to the far clip plane.
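With that correction, a hedged glm version of the check (the GLSL body is identical apart from the type and function prefixes):

// Corrected frustum test: z is clipped symmetrically, like x and y.
bool in_frustum(const glm::mat4 & M, const glm::vec3 & p) {
    glm::vec4 Pclip = M * glm::vec4(p, 1.0f);
    return glm::abs(Pclip.x) < Pclip.w &&
           glm::abs(Pclip.y) < Pclip.w &&
           glm::abs(Pclip.z) < Pclip.w;
}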
To determine if a given point will be visible on the screen, you test it against the viewing frustum. See this frustum culling tutorial:
http://www.lighthouse3d.com/tutorials/view-frustum-culling/

3D coordinate of 2D point given camera and view plane

I wish to generate rays from the camera through the viewing plane. In order to do this, I need my camera position ("eye"), the up, right, and towards vectors (where towards is the vector from the camera in the direction of the object that the camera is looking at) and P, the point on the viewing plane. Once I have these, the ray that's generated is:
ray = camera_eye + t*(P-camera_eye);
where t is the distance along the ray (assume t = 1 for now).
My question is, how do I obtain the 3D coordinates of point P given that it is located at position (i,j) on the viewing plane? Assume that the upper left and lower right corners of the viewing plane are given.
NOTE: The viewing plane is not actually a plane in the sense that it doesn't extend infinitely in all directions. Rather, one may think of this plane as a width x height image. In the x direction, the range is 0 --> width and in the y direction the range is 0 --> height. I wish to find the 3D coordinate of the (i, j)th element, where 0 <= i < width and 0 <= j < height.
For the general solution of the intersection of a line and a plane, see http://local.wasp.uwa.edu.au/~pbourke/geometry/planeline/
Your particular graphics lib (OpenGL/DirectX etc.) may have a standard way to do this.
edit: Are you trying to find the 3D intersection of a screen point (e.g. a mouse cursor) with a 3D object in your scene?
To work out P, you need the distance from the camera to the near clipping plane (the screen), the size of the window on the near clipping plane (or the view angle, you can work out the window size from the view angle) and the size of the rendered window.
Scale the screen position to the range -1 < x < +1 and -1 < y < +1 where +1 is the top/right and -1 is the bottom/left
Scale normalised x,y by the view window size
Scale by the right and up vectors of the camera and sum the results
Add the look at vector scaled by the clipping plane distance
In effect, you get:
p = at * near_clip_dist + x * right + y * up
where x and y are:
x = (screen_x - screen_centre_x) / (width / 2) * view_width
y = (screen_y - screen_centre_y) / (height / 2) * view_height
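A hedged glm sketch of those steps for a pixel (i, j) of a width x height image (fovy, aspect, near_clip_dist and the camera vectors are assumptions mirroring the question; fovy is in radians):

// Half-extents of the view window on the near clipping plane.
float half_height = near_clip_dist * std::tan(fovy * 0.5f);
float half_width  = half_height * aspect;

// Map the pixel centre to [-1, +1], with +1 at the top/right.
float x = (i + 0.5f) / width  * 2.0f - 1.0f;
float y = 1.0f - (j + 0.5f) / height * 2.0f;   // flip if your rows grow upwards instead

glm::vec3 P = camera_eye
            + at    * near_clip_dist
            + right * (x * half_width)
            + up    * (y * half_height);

glm::vec3 ray_dir = glm::normalize(P - camera_eye);   // ray = camera_eye + t * ray_dir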
When I directly plugged the suggested formulas into my program, I didn't obtain correct results (maybe some debugging needed to be done). My initial problem seemed to be a misunderstanding of the (x, y, z) coordinates of the interpolating corner points. I was treating the x, y, z coordinates separately, which I should not have done (and this may be specific to the application, since the camera can be oriented in any direction). Instead, the solution turned out to be a simple interpolation of the corner points of the viewing plane (sketched below):
interpolate the bottom corner points in the i direction to get P1
interpolate the top corner points in the i direction to get P2
interpolate P1 and P2 in the j direction to get the world coordinates of the final point
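A hedged sketch of that interpolation with glm, assuming the four viewing-plane corners (bottomLeft, bottomRight, topLeft, topRight) are known and ti, tj are i and j normalised to [0, 1]:

// Bilinear interpolation of the viewing-plane corners.
glm::vec3 P1 = glm::mix(bottomLeft, bottomRight, ti);   // bottom edge, interpolated in i
glm::vec3 P2 = glm::mix(topLeft, topRight, ti);          // top edge, interpolated in i
glm::vec3 P  = glm::mix(P1, P2, tj);                     // between the two, interpolated in j

glm::vec3 ray_dir = glm::normalize(P - camera_eye);      // ray through the (i, j)th element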