Why clipping should be done in CCS, not NDCS.
I think it is easier to clip in NDCS, but many book said the clipping should be done in CCS. They give an example that a line is laid over eye from behind and front.
I could not understand why it can be a problem.
The only difference between Normalized Device Coordinates (NDCS) and Clip Space (CCS) is, that CCS is before the perspective divide and NDCS is afterwards. The reason why clipping doesn't work well in NDCS is that the perspective divide moves points behind the viewer to the front (since w contains -z), so triangles behind the viewer would not be clipped away correctly at the front plane.
Q: Where is the viewer in NDCS. In VCS, the viewer's location is origin point [0,0,0,1]. However, if I calculate the origin point with perspective matrix, the result is weird. The homogeneous coordinate is not 1 but 0. How can we define the viewer's position in NDCS?
In NDCS and CCS there is no finite viewing point (and I'm not sure what the viewer has to do with clipping). One has to think about both systems as the view-frustum being warped to a cube (near and far plane having the same size). In NDCS, the visible area is in [-1, 1] along each axis, whereas in CCS it is in [-w, w]. Now think about the viewer: In view space, the viewer (the projection center) was that point where all rays going from a corner of the near plane to the respective corner in the far plane intersected. When we now warp the frustum to a cube, all these rays are parallel and there is no intersection point anymore. This means the projection center is infinitely far away, which is described in projective space by vectors that have a homogeneous coordinate of 0.
Q: However, the point where z> 0 is always larger than 1 after conversion, and is also cut in NDCS. Am I wrong? If I'm wrong, can you give me one example?
You are basically right. But clipping doesn't happen at single points, clipping happens on edges spanned between these points.
Let's assume we have a line going from a point inside the frustum (A) to a point behind the viewer (B). In this case clipping should happen at the near plane and the line should go from A to B' (the intersection of the line with the near plane).
If we would first perform the perspective divide, then (as you noted) A still stays inside the frustum but B gets mapped to a point behind the far plane. When we now clip the line between those points, we get a line going from A to a point B' which is on the far plane. Obviously we don't want to get a line away from the viewer when the line was initially going through the viewer.
Related
I heard clipping should be done in clipping coordinate system.
The book suggests a situation that a line is laid from behind camera to in viewing volume. (We define this line as PQ, P is behind camera point)
I cannot understand why it can be a problem.
(The book says after finishing normalizing transformation, the P will be laid in front of camera.)
I think before making clipping coordinate system, the camera is on original point (0, 0, 0, 1) because we did viewing transformation.
However, in NDCS, I cannot think about camera's location.
And I have second question.
In vertex shader, we do model-view transformation and then projection transformation. Finally, we output these vertices to rasterizer.
(some vertices's w is not equal to '1')
Here, I have curiosity. The rendering pipeline will automatically do division procedure (by w)? after finishing clipping.
Sometimes not all the model can be seen on screen, mostly because some objects of it lie behind the camera (or "point of view"). Those objects are clipped out. If just a part of the object can't be seen, then just that part must be clipped leaving the rest as seen.
OpenGL clips
OpenGL does this clipping in Clipping Coordinate Space (CCS). This is a cube of size 2w x 2w x 2w where 'w' is the fourth coordinate resulting of (4x4) x (4x1) matrix and point multiplication. A mere comparison of coordinates is enough to tell if the point is clipped or not. If the point passes the test then its coordinates are divided by 'w' (so called "perspective division"). Notice that for ortogonal projections 'w' is always 1, and with perspective it's generally not 1.
CPU clips
If the model is too big perhaps you want to save GPU resources or improve the frame rate. So you decide to skip those objects that are going to get clipped anyhow. Then you do the maths on your own (on CPU) and only send to the GPU the vertices that passed the test. Be aware that some objects may have some vertices clipped while other vertices of this same object may not.
Perhaps you do send them to GPU and let it handle these special cases.
You have a volume defined where only objects inside are seen. This volume is defined by six planes. Let's put ourselves in the camera and look at this volume: If your projection is perspective the six planes build a "fustrum", a sort of truncated pyramid. If your projection is orthogonal, the planes form a parallelepiped.
In order to clip or not to clip a vertex you must use the distance form the vertex to each of these six planes. You need a signed distance, this means that the sign tells you what side of the plane is seen form the vertex. If any of the six distance signs is not the right one, the vertex is discarded, clipped.
If a plane is defined by equation Ax+By+Cz+D=0 then the signed distance from p1,p2,p3 is (Ap1+Bp2+Cp3+D)/sqrt(AA+BB+C*C). You only need the sign, so don't bother to calculate the denominator.
Now you have all tools. If you know your planes on "camera view" you can calculate the six distances and clip or not the vertex. But perhaps this is an expensive operation considering that you must transform the vertex coodinates from model to camera (view) spaces, a ViewModel matrix calculation. With the same cost you use your precalculated ProjectionViewModel matrix instead and obtain CCS coordinates, which are much easier to compare to '2w', the size of the CCS cube.
Sometimes you want to skip some vertices not due to they are clipped, but because their depth breaks a criteria you are using. In this case CCS is not the right space to work with, because Z-coordinate is transformed into [-w, w] range, depth is somehow "lost". Instead, you do your clip test in "view space".
I am learning openGL from this scratchpixel, and here is a quote from the perspective project matrix chapter:
Cameras point along the world coordinate system negative z-axis so that when a point is converted from world space to camera space (and then later from camera space to screen space), if the point is to left of the world coordinate system y-axis, it will also map to the left of the camera coordinate system y-axis. In other words, we need the x-axis of the camera coordinate system to point to the right when the world coordinate system x-axis also points to the right; and the only way you can get that configuration, is by having camera looking down the negative z-axis.
I think it has something to do with the mirror image? but this explanation just confused me...why is the camera's coordinate by default does not coincide with the world coordinate(like every other 3D objects we created in openGL)? I mean, we will need to transform the camera coordinate anyway with a transformation matrix (whatever we want with the negative z set up, we can simulate it)...why bother?
It is totally arbitrary what to pick for z direction.
But your pick has a lot of deep impact.
One reason to stick with the GL -z way is that the culling of faces will match GL constant names like GL_FRONT. I'd advise just to roll with the tutorial.
Flipping the sign on just one axis also flips the "parity". So a front face becomes a back face. A znear depth test becomes zfar. So it is wise to pick one early on and stick with it.
By default, yes, it's "right hand" system (used in physics, for example). Your thumb is X-axis, index finger Y-axis, and when you make those go to right directions, Z-points (middle finger) to you. Why Z-axis has been selected to point inside/outside screen? Because then X- and Y-axes go on screen, like in 2D graphics.
But in reality, OpenGL has no preferred coordinate system. You can tweak it as you like. For example, if you are making maze game, you might want Y to go outside/inside screen (and Z upwards), so that you can move nicely at XY plane. You modify your view/perspective matrices, and you get it.
What is this "camera" you're talking about? In OpenGL there is no such thing as a "camera". All you've got is a two stage transformation chain:
vertex position → viewspace position (by modelview transform)
viewspace position → clipspace position (by projection transform)
To see why be default OpenGL is "looking down" -z, we have to look at what happens if both transformation steps do "nothing", i.e. full identity transform.
In that case all vertex positions passed to OpenGL are unchanged. X maps to window width, Y maps to window height. All calculations in OpenGL by default (you can change that) have been chosen adhere to the rules of a right hand coordinate system, so if +X points right and +Y points up, then Z+ must point "out of the screen" for the right hand rule to be consistent.
And that's all there is about it. No camera. Just linear transformations and the choice of using right handed coordinates.
I am confused about the position of objects in opengl .The eye position is 0,0,0 , the projection plane is at z = -1 . At this point , will the objects be in between the eye position and and the plane (Z =(0 to -1)) ? or its behind the projection plane ? and also if there is any particular reason for being so?
First of all, there is no eye in modern OpenGL. There is also no camera. There is no projection plane. You define these concepts by yourself; the graphics library does not give them to you. It is your job to transform your object from your coordinate system into clip space in your vertex shader.
I think you are thinking about projection wrong. Projection doesn't move the objects in the same sense that a translation or rotation matrix might. If you take a look at the link above, you can see that in order to render a perspective projection, you calculate the x and y components of the projected coordinate with R = V(ez/pz), where ez is the depth of the projection plane, pz is the depth of the object, V is the coordinate vector, and R is the projection. Almost always you will use ez=1, which makes that equation into R = V/pz, allowing you to place pz in the w coordinate allowing OpenGL to do the "perspective divide" for you. Assuming you have your eye and plane in the correct places, projecting a coordinate is almost as simple as dividing by its z coordinate. Your objects can be anywhere in 3D space (even behind the eye), and you can project them onto your plane so long as you don't divide by zero or invalidate your z coordinate that you use for depth testing.
There is no "projection plane" at z=-1. I don't know where you got this from. The classic GL perspective matrix assumes an eye space where the camera is located at origin and looking into -z direction.
However, there is the near plane at z<0 and eveything in front of the near plane is going to be clipped. You cannot put the near plane at z=0, because then, you would end up with a division by zero when trying to project points on that plane. So there is one reasin that the viewing volume isn't a pyramid with they eye point at the top but a pyramid frustum.
This is btw. also true for real-world eyes or cameras. The projection center lies behind the lense, so no object can get infinitely close to the optical center in either case.
The other reason why you want a big near clipping distance is the precision of the depth buffer. The whole depth range between the front and the near plane has to be mapped to some depth value with a limited amount of bits, typically 24. So you want to keep the far plane as close as possible, and shift away the near plane as far as possible. The non-linear mapping of the screen-space z coordinate makes this even more important, as that the precision is non-uniformely distributed over that range.
I want to test if any given point in the world is on a quad/plane? The quad/plane can be translated/rotated/scaled by any values but it still should be able to detect if the given point is on it. I also need to get the location where the point should have been, if the quad was not applied any rotation/scale/translation.
For example, consider a quad at 0, 0, 0 with size 100x100, rotated at an angle of 45 degrees along z axis. If my mouse location in the world is at ( x, y, 0, ), I need to know if that point falls on that quad in its current transformation? If yes, then I need to know if no transformations were applied to the quad, where that point would have been on it? Any code sample would be of great help
A ray-casting approach is probably simplest:
Use gluUnProject() to get the world-space direction of the ray to cast into the scene. The ray's origin is the camera position.
Put this ray into object space by transforming it by the inverse of your rectangle's transform. Note that you need to transform both the ray's origin point and direction vector.
Compute the intersection point between this ray and the XY plane with a standard ray-plane intersection test.
Check that the intersection point's x and y values are within your rectangle's bounds, if they are then that's your desired result.
A math library such as GLM will be very helpful if you aren't confident about some of the math involved here, it has corresponding functions such as glm::unProject() as well as functions to invert matrices and do all the other transformations you'd need.
I know it possible to use PerspectiveCamera's frustum member to judge if a point(x,y,z) is in the frustum by calling pointInFrustum(Vector3 point) function. Now if the z-coordinate is fixed to some value, can I get the boundary of x and y axis that is visible directly?
The Frustum contains the 8 points (planePoints) making up the near and far plane:
/** eight points making up the near and far clipping "rectangles". order is counter clockwise, starting at bottom left **/
public final Vector3[] planePoints
These points are updated when Camera.update is called. To be on the safe side, just call this once before doing the following operations:
Calculate a parameter t (in [0,1]) specifying how far the POI is away from the near plane wrt the far plane.
t allows to linearly interpolate the top surface of the frustum (unsure about how to call this) down the z axis.
And btw, frustum in libgdx's Camera class should really be called something like viewvolume, since orthographic projections don't span a frustum in space.