I have a point in 2D screen coordinates that I want to project to a point on a plane given in clip space. However, does this even make sense, given that the w coordinate depends on the z coordinate, so linear equations no longer apply? Should I use eye coordinates instead?
Turns out it was the normal transformation problem. Thanks everyone!
OpenGL spec:
It says: However, depth values for polygons must be interpolated by (14.10).
Why? Are the z coordinates depth values in camera space? If so, shouldn't we use perspective-correct barycentric coordinates to interpolate them (like equation 14.9)?
Update:
So the z coordinates are NDC coordinates (which have already been divided by w). I have a small demo that implements a rasterizer. When I use linear interpolation of the NDC z coordinates, the result is a bit unusual (image below). When I use perspective-correct interpolation of the camera z coordinates, the result is OK.
This is the perspective projection matrix I use:
Why? Are the z coordinates depth values in camera space? If so, shouldn't we use perspective-correct barycentric coordinates to interpolate them?
No, they are not. They are in window space, meaning they have already been divided by w. It is correct that if you wanted to interpolate camera space z, you would have to apply perspective correction. But for NDC and window space z this would be wrong. After all, the perspective transformation (as achieved by the perspective projection matrix and the perspective divide) still maps straight lines to straight lines, and flat triangles to flat triangles. That's why we use the hyperbolically distorted z values as depth in the first place. This is also a property that is exploited for the hierarchical depth test optimization. Have a look at my answer here for some more details, including a few diagrams.
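A small numeric check of this claim, as a sketch only. It assumes the classic GL perspective mapping with a symmetric 90-degree frustum and made-up near/far values of 1 and 10; the names `project`, `lerp`, `a_eye` and `b_eye` are illustrative and not taken from the question's demo.

```python
def project(p, near=1.0, far=10.0):
    """Map an eye-space point (x, y, z) with z < 0 to NDC, assuming the
    classic GL perspective matrix with a symmetric 90-degree frustum."""
    x, y, z = p
    w = -z  # clip-space w produced by that matrix
    z_clip = -(far + near) / (far - near) * z - 2.0 * far * near / (far - near)
    return (x / w, y / w, z_clip / w)


def lerp(a, b, t):
    return a + (b - a) * t


# Two endpoints of an edge in eye space (arbitrary example values).
a_eye = (-1.0, 0.0, -2.0)
b_eye = (3.0, 0.0, -8.0)
a_ndc, b_ndc = project(a_eye), project(b_eye)

# Halfway along the edge as seen on screen.
t = 0.5
z_ndc_linear = lerp(a_ndc[2], b_ndc[2], t)  # plain linear interpolation

# Ground truth: the eye-space point that lands on that screen position.
# Its parameter along the 3D edge is the perspective-correct one.
w_a, w_b = -a_eye[2], -b_eye[2]
s = (t / w_b) / ((1.0 - t) / w_a + t / w_b)
true_eye = tuple(lerp(a, b, s) for a, b in zip(a_eye, b_eye))
true_ndc = project(true_eye)

print("NDC z, linear interpolation:", z_ndc_linear)
print("NDC z, exact:               ", true_ndc[2])
print("eye z, linear interpolation:", lerp(a_eye[2], b_eye[2], t))
print("eye z, exact:               ", true_eye[2])
```

The two NDC values agree, while the linearly interpolated eye-space z does not match the exact one. That is the asymmetry described above.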
I am confused about the position of objects in OpenGL. The eye position is (0,0,0) and the projection plane is at z = -1. Will the objects be between the eye position and the plane (z from 0 to -1), or behind the projection plane? And is there a particular reason for this?
First of all, there is no eye in modern OpenGL. There is also no camera. There is no projection plane. You define these concepts by yourself; the graphics library does not give them to you. It is your job to transform your object from your coordinate system into clip space in your vertex shader.
I think you are thinking about projection wrong. Projection doesn't move the objects in the same sense that a translation or rotation matrix might. If you take a look at the link above, you can see that in order to render a perspective projection, you calculate the x and y components of the projected coordinate with R = V(ez/pz), where ez is the depth of the projection plane, pz is the depth of the object, V is the coordinate vector, and R is the projection. Almost always you will use ez=1, which turns that equation into R = V/pz, so you can place pz in the w coordinate and let OpenGL do the "perspective divide" for you. Assuming you have your eye and plane in the correct places, projecting a coordinate is almost as simple as dividing by its z coordinate. Your objects can be anywhere in 3D space (even behind the eye), and you can project them onto your plane so long as you don't divide by zero or invalidate the z coordinate that you use for depth testing.
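The formula R = V(ez/pz) can be sketched in a few lines. This is only an illustration: the function name `project_onto_plane` is made up, and depth is treated here as a positive distance along the viewing direction, which is an assumption about the sign convention.

```python
def project_onto_plane(v, ez=1.0):
    """Project a point v = (x, y, pz) onto the plane at depth ez by
    scaling x and y with ez / pz (the R = V(ez/pz) rule)."""
    x, y, pz = v
    if pz == 0.0:
        raise ValueError("cannot project a point at depth 0")
    scale = ez / pz
    return (x * scale, y * scale)


# A point four units deep appears four times closer to the centre
# when the projection plane sits at depth 1.
print(project_onto_plane((2.0, 4.0, 4.0)))
print(project_onto_plane((2.0, 4.0, 4.0), ez=2.0))
```

With ez=1 the scale is simply 1/pz, which is why storing pz in w and letting the hardware divide by w gives the same result.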
There is no "projection plane" at z=-1. I don't know where you got this from. The classic GL perspective matrix assumes an eye space where the camera is located at the origin and looks in the -z direction.
However, there is the near plane at z<0, and everything between the eye and the near plane is going to be clipped. You cannot put the near plane at z=0, because then you would end up with a division by zero when trying to project points on that plane. So that is one reason why the viewing volume isn't a pyramid with the eye point at the top but a pyramid frustum.
This is, by the way, also true for real-world eyes or cameras. The projection center lies behind the lens, so no object can get infinitely close to the optical center in either case.
The other reason why you want a big near clipping distance is the precision of the depth buffer. The whole depth range between the near and the far plane has to be mapped to some depth value with a limited number of bits, typically 24. So you want to keep the far plane as close as possible, and push the near plane as far away as possible. The non-linear mapping of the screen-space z coordinate makes this even more important, as the precision is non-uniformly distributed over that range.
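To put rough numbers on this, here is a sketch that assumes the classic GL perspective matrix, the default [0, 1] depth range and a 24-bit integer depth buffer. The function names and the chosen near/far values are illustrative.

```python
def window_depth(d, near, far):
    """Window-space depth in [0, 1] for a point at eye distance d > 0,
    assuming the classic GL perspective matrix and default depth range."""
    z_ndc = (far + near) / (far - near) - 2.0 * far * near / ((far - near) * d)
    return 0.5 * z_ndc + 0.5


def depth_step(d, near, far, bits=24):
    """Approximate eye-space distance covered by one depth-buffer step at
    distance d, using the derivative of window_depth with respect to d."""
    slope = far * near / ((far - near) * d * d)  # d(depth) / d(distance)
    return 1.0 / ((2 ** bits - 1) * slope)


far = 1000.0
for near in (0.01, 0.1, 1.0):
    half = 2.0 * far * near / (far + near)  # distance where depth reaches 0.5
    step = depth_step(500.0, near, far)
    print(f"near={near}: depth 0.5 at d={half:.4f}, step at d=500 is {step:.6f}")
```

Half of all depth values are spent on distances up to roughly twice the near distance, so moving the near plane outwards directly buys resolution for the rest of the scene.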
I've been drawing directly into homogeneous clip space (the 2x2x2 cube centered around 0,0,0) in OpenGL and I've realized that the perspective transformation matrix transforms all geometry from one right parallelepiped (view space) to another right parallelepiped (homogeneous clip space).
So, why the heck does every OpenGL article use a non-right-parallelepiped frustum to illustrate how projection works? I understand that the perspective transformation matrix will cause everything to get scaled by a term containing its distance from the camera and the camera's distance from the plane... is the traditional frustum illustration trying to explain that? Or have we truly entered some warped space at some point in the perspective transformation? If so, how are we ending up back at a right parallelepiped (homogeneous clip space) at the end of it all?
You are right in the sense that there is just a linear transformation and no real perspective distortion, as long as you interpret clip space as a 4-dimensional vector space.
But the clip space is not the "end of it all". The perspective effect is a nonlinear transformation which is finally achieved by the perspective division which will be done after the transformation to clip space. The projection matrix determines the w value that will be the divisor for this, and which is typically just -z_eye.
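A sketch of where the frustum turns into the cube, assuming the matrix layout documented for glFrustum. The helper names and the frustum bounds are made up for illustration.

```python
def frustum(l, r, b, t, n, f):
    """Classic GL perspective matrix (row-major), as documented for glFrustum."""
    return [
        [2.0 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2.0 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2.0 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 matrix with a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def perspective_divide(clip):
    """The nonlinear step: divide x, y, z by w to get NDC."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]


proj = frustum(-1.0, 1.0, -1.0, 1.0, 1.0, 10.0)

# Top-right corners of the near and far faces of the frustum in eye space.
near_corner = [1.0, 1.0, -1.0, 1.0]
far_corner = [10.0, 10.0, -10.0, 1.0]

for corner in (near_corner, far_corner):
    clip = transform(proj, corner)
    print("eye", corner, "-> clip", clip, "-> NDC", perspective_divide(clip))
```

The clip-space w equals -z_eye for both corners, and only after dividing by it do the two frustum corners land on corners of the cube.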
I am using the vtkCamera and am trying to move it around and make it look at a certain point. For example, if I want to put the camera at position (x,y,z) and make it look at (0,0,0), with gluLookAt in OpenGL we would set the eye coordinates to (x,y,z), the centre coordinates to (0,0,0) and the up vector to (0,1,0).
In VTK, however, the vtkCamera has three separate methods, namely setPosition, setFocalPoint and setViewUp.
My question is: what do setPosition and setFocalPoint correspond to?
Thanks
setPosition corresponds to the eye coordinates. setFocalPoint corresponds to the point the camera is looking at, so the centre coordinates of gluLookAt. It functions the same way as both OpenGL and DirectX in that sense.
I have a very general question. I wish to determine the boundary points of a number of objects (comprising 30-50 closed polygons (z) each having around 300 points (x,y,z)). I am working with a fixed viewport which is rotated about the x, y and z axes (alpha, beta, gamma) with respect to the origin of the coordinate system for the polygons.
As I see it there are two possibilities: perspective projection or raytracing. Perspective projection would seem to require a large number of matrix operations for each point to determine whether its position is inside or outside the viewport.
Or, given the large number of points, would I be better off raytracing from the viewport pixels to the objects?
That is, determine whether there is an intersection and then whether the intersection occurs inside or outside the object(s).
In either case I will write this result as 0 (outside) or 1 (inside) to a 200x200 integer matrix representing the viewport.
Thank you in anticipation
Perspective projection (and then scan-converting the polygons in image coordinates) is going to be a lot faster.
The matrix transform that is required in the case of perspective projection (essentially the world-to-camera matrix) is required in exactly the same way when raytracing. However, with perspective projection, you're only transforming the corner points, whereas with raytracing, you're transforming all the points in the image.
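A minimal sketch of the projection route, under several assumptions: the polygon vertices are already in camera space with z > 0 in front of the camera, the image plane spans [-1, 1] at focal distance 1, and a brute-force even-odd test per pixel centre stands in for a real scan-conversion routine. All names are illustrative.

```python
def project_to_pixels(points, focal=1.0, size=200):
    """Perspective-project camera-space points to pixel coordinates.
    Only the polygon vertices are transformed, not every pixel."""
    pixels = []
    for x, y, z in points:
        u = focal * x / z  # image-plane coordinates, assumed in [-1, 1]
        v = focal * y / z
        pixels.append(((u + 1.0) * 0.5 * size, (1.0 - v) * 0.5 * size))
    return pixels


def inside(px, py, poly):
    """Even-odd rule: count edge crossings of a ray going in +x."""
    hit = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        if (yi > py) != (yj > py):
            x_cross = xi + (py - yi) * (xj - xi) / (yj - yi)
            if px < x_cross:
                hit = not hit
        j = i
    return hit


def rasterize(polygons, size=200):
    """Return a size x size matrix of 0 (outside) and 1 (inside)."""
    mask = [[0] * size for _ in range(size)]
    for poly in polygons:
        pix = project_to_pixels(poly, size=size)
        for row in range(size):
            for col in range(size):
                if inside(col + 0.5, row + 0.5, pix):
                    mask[row][col] = 1
    return mask


# One square polygon two units in front of the camera.
square = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (1.0, 1.0, 2.0), (-1.0, 1.0, 2.0)]
mask = rasterize([square])
print(sum(map(sum, mask)), "pixels covered")
```

A proper scanline fill would visit only the rows and spans each polygon touches, which matters more as the vertex count grows.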
You should be able to use perspective projection and a perspective projection matrix to compute the position of the vertices in screen space. It's hard to understand what you really want to do. If you want to create an image of that 3D scene, then with only a few polygons it would be hard to see any difference between ray tracing and rasterisation if your code is optimised (you will still need an acceleration structure for the ray tracing approach). However, yes, rasterisation is likely to be faster anyway.
Now, if you need to compute the distance between the eye (the camera's origin) and the geometry visible through the camera's view, then I don't see why you can't take the depth value of any sample for any pixel in the image and use the inverse of the perspective projection matrix to find its distance in camera space.
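For the depth part, the inversion can be written in closed form. This sketch assumes the classic GL perspective matrix and a [0, 1] depth range. The function names are illustrative, and it recovers only the eye-space z, so the full Euclidean distance would additionally need the pixel's x and y unprojected.

```python
def depth_from_eye_z(z_eye, near, far):
    """Forward mapping: eye-space z (negative in front of the camera)
    to window depth in [0, 1] under the classic GL perspective matrix."""
    z_ndc = (far + near) / (far - near) + 2.0 * far * near / ((far - near) * z_eye)
    return 0.5 * z_ndc + 0.5


def eye_z_from_depth(depth, near, far):
    """Inverse mapping: window depth in [0, 1] back to eye-space z."""
    z_ndc = 2.0 * depth - 1.0
    return 2.0 * far * near / (z_ndc * (far - near) - (far + near))


near, far = 0.5, 50.0
for z in (-0.5, -3.7, -50.0):
    d = depth_from_eye_z(z, near, far)
    print(f"z_eye={z}: depth={d:.6f}, recovered z_eye={eye_z_from_depth(d, near, far):.6f}")
```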
Why is speed an issue in your problem? Otherwise use RT indeed.
Most of this information can be found on www.scratchapixel.com