Why must depth values be interpolated directly by barycentric coordinates in OpenGL?

OpenGL spec:
It says: "However, depth values for polygons must be interpolated by (14.10)."
Why? Are the z coordinates depth values in camera space? If so, shouldn't we use perspective-correct barycentric coordinates (like equation 14.9) to interpolate them?
Update:
So the z coordinates are NDC coordinates (already divided by w). I have a small demo which implements a rasterizer. When I use linear interpolation of the NDC z coordinates, the result is a bit unusual (image below), while perspective-correct interpolation of camera-space z coordinates gives the correct result.
This is the perspective projection matrix I use:

Why? Are the z coordinates depth values in camera space? If so, shouldn't we use perspective-correct barycentric coordinates to interpolate them?
No, they are not. They are in window space, meaning they have already been divided by w. It is correct that if you wanted to interpolate camera-space z, you would have to apply perspective correction. But for NDC and window-space z this would be wrong: after all, the perspective transformation (as achieved by the perspective projection matrix and the perspective divide) still maps straight lines to straight lines, and flat triangles to flat triangles. That's why we use the hyperbolically distorted z values as depth in the first place. This property is also exploited by the hierarchical depth test optimization. Have a look at my answer here for some more details, including a few diagrams.
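To make the distinction concrete, here is a minimal per-fragment sketch of what a software rasterizer does (the names are illustrative, not from the spec):

// b:     barycentric weights computed in window space
// zwin:  window-space z of each vertex (already divided by w)
// wclip: clip-space w of each vertex
// attr:  some per-vertex attribute, e.g. a texture coordinate
void interpolate(const float b[3], const float zwin[3],
                 const float wclip[3], const float attr[3],
                 float& depth, float& attrOut)
{
    // Depth (spec eq. 14.10): window-space z is already hyperbolically
    // distorted, so plain linear interpolation is the correct thing.
    depth = b[0] * zwin[0] + b[1] * zwin[1] + b[2] * zwin[2];

    // Attributes (spec eq. 14.9): these are linear in eye space, so
    // interpolate attr/w and 1/w linearly, then divide.
    float oneOverW  = b[0] / wclip[0] + b[1] / wclip[1] + b[2] / wclip[2];
    float attrOverW = b[0] * attr[0] / wclip[0]
                    + b[1] * attr[1] / wclip[1]
                    + b[2] * attr[2] / wclip[2];
    attrOut = attrOverW / oneOverW;
}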

Related

Coordinate System [-1, 1]

I am confused about how the OpenGL coordinate system works. I know you start with object coordinates, with everything defined in its own system. Then, by applying a matrix, the coordinates change to world coordinates. By applying another matrix, you have view coordinates. Then, if you're working in 3D, you can apply a perspective matrix. In the end, you are left with a set of coordinates which are likely not in [-1, 1]. How does OpenGL know how to normalize them to [-1, 1]? How does it know what to clip out? In the shader, gl_Position is just given your coordinates; it doesn't know that they have been through several transformations. I know that a view-to-normalized-coordinates matrix involves a translation and a scale, but we never explicitly make a matrix for that in OpenGL. Does OpenGL use its own hidden matrix to translate from the coordinates passed to gl_Position to normalized coordinates?
The deprecated fixed-function vertex transformations are explained at https://www.opengl.org/wiki/Vertex_Transformation
Shader-based rendering is likely to use the same or very similar math for each transformation step. The missing step between gl_Position and device coordinates is the perspective divide (as LJ noted in a comment), where xyzw coordinates are converted to xyz coordinates. xyzw coordinates are homogeneous coordinates, which represent a 3-dimensional location using 4 components.
https://en.wikipedia.org/wiki/Homogeneous_coordinates
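A minimal sketch of the two fixed steps that happen after gl_Position is written (the names are mine; OpenGL performs the equivalent internally, driven by the glViewport and glDepthRange state):

#include <array>

struct Vec4 { float x, y, z, w; };

std::array<float, 3> clipToWindow(const Vec4& clip,
                                  float vpX, float vpY, float vpW, float vpH,
                                  float depthNear = 0.0f, float depthFar = 1.0f)
{
    // Perspective divide: homogeneous clip coordinates -> NDC in [-1,1].
    float x = clip.x / clip.w;
    float y = clip.y / clip.w;
    float z = clip.z / clip.w;
    // Viewport transform: NDC -> window coordinates.
    return { vpX + (x * 0.5f + 0.5f) * vpW,
             vpY + (y * 0.5f + 0.5f) * vpH,
             depthNear + (z * 0.5f + 0.5f) * (depthFar - depthNear) };
}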

OpenGL mixed perspective and orthographic projection

How can one mix orthographic and perspective projection in OpenGL?
Some 2D elements have to be drawn in screen space (no scaling, rotation, etc.).
These 2D elements have a z position; they have to appear in front of/behind other 3D elements.
So I set up an orthographic projection, draw all 2D elements, then set up a perspective projection and draw all 3D elements.
The result is that all 2D elements are drawn on top. It seems that the z values from the orthographic projection and the z values from the perspective projection are not compatible (GL_DEPTH_TEST).
Separately, all 2D and all 3D elements work fine; the problem is when I try to mix them.
Does the perspective projection change the z values? In what way?
Is it possible to use z values from the orthographic projection mixed with z values from the perspective projection for depth testing, or is this whole concept flawed?
Bare OpenGL 1.5.
It seems that the z values from the orthographic projection and the z values from the perspective projection are not compatible (GL_DEPTH_TEST).
That is indeed the case. The perspective transformation maps the Z values nonlinearly to the depth buffer values. The usual way to address this problem is to copy the depth buffer after the perspective pass into a depth texture, use that as an additional input in the fragment shader of the orthographically drawn stuff, reverse the nonlinearity in the depth input, and compare the incoming Z coordinate with that; then discard appropriately.
It's also possible to emit linear depth values in the fragment shaders of the perspective-drawn geometry; however, the depth nonlinearity of the perspective projection has its purpose: without it, you lose depth precision where it matters most, close to the point of view.
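A minimal sketch of reversing that nonlinearity, assuming the depth texture holds values in [0,1] produced by a standard perspective matrix with near/far distances n and f (the helper name is mine):

// Map a [0,1] depth-buffer sample back to a positive eye-space distance.
float linearizeDepth(float depthSample, float n, float f)
{
    float zNdc = depthSample * 2.0f - 1.0f;            // [0,1] -> [-1,1]
    return 2.0f * n * f / (f + n - zNdc * (f - n));    // eye-space distance
}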

What is the difference between ProjectionTransformMatrix of VTK and GL_PROJECTION of OpenGL?

I am having profound issues understanding the transformations involved in VTK. OpenGL has fairly good documentation, and I was under the impression that VTK is very similar to OpenGL (it is, in many ways). But when it comes to transformations, it seems to be an entirely different story.
This is a good OpenGL documentation about transforms involved:
http://www.songho.ca/opengl/gl_transform.html
The perspective projection matrix in OpenGL is:
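For reference, the matrix given on that page is (n and f are the near and far clipping distances, fovy the vertical field of view):

P = | cot(fovy/2)/aspect   0             0              0          |
    | 0                    cot(fovy/2)   0              0          |
    | 0                    0             -(f+n)/(f-n)   -2fn/(f-n) |
    | 0                    0             -1             0          |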
I wanted to see if this formula, applied in VTK, would give me the projection matrix of VTK (by cross-checking with the VTK projection matrix).
Relevant Camera and Renderer Parameters:
// Camera setup
camera->SetPosition(0, 0, 20);
camera->SetFocalPoint(0, 0, 0);
double crSet[2] = {10, 1000};   // near/far clipping range
renderer->GetActiveCamera()->SetClippingRange(crSet);
// Window size (GetSize() works with ints)
int windowSize[2];
renderWindow->SetSize(1280, 720);
renderWindowInteractor->GetSize(windowSize);
double aspect = static_cast<double>(windowSize[0]) / windowSize[1];
vtkMatrix4x4* proj =
    renderer->GetActiveCamera()->GetProjectionTransformMatrix(aspect, crSet[0], crSet[1]);
The projection transform matrix I got for this configuration is:
The (3,3) and (3,4) values of the projection matrix (let's say it is indexed 1 to 4 for rows and columns) should be -(f+n)/(f-n) and -2*f*n/(f-n) respectively. In my VTK camera settings, nearz is 10 and farz is 1000, and hence I should get -1.020 and -20.20 in the (3,3) and (3,4) locations of the matrix. But I get -1010 and -10000.
I have changed my clipping range values to see the changes, and the (3,3) position is always -(nearz+farz), which makes no sense to me. Also, it would be great if someone could explain why it is 3.7320 in the (1,1) and (2,2) positions; this value does NOT change when I change the window size of the render window. Quite perplexing to me.
I see in the vtkCamera class reference that GetProjectionTransformMatrix() returns the transformation matrix that maps from camera coordinates to viewport coordinates.
VTK Camera Class Reference
This is a nice depiction of the transforms involved in OpenGL rendering:
The OpenGL projection matrix is the matrix that maps from eye coordinates to clip coordinates. It is beyond doubt that eye coordinates in OpenGL are the same as camera coordinates in VTK. But are the clip coordinates in OpenGL the same as the viewport coordinates of VTK?
My aim is to simulate a real webcam camera (already calibrated) in VTK to render a 3D model.
Well, the documentation you linked to actually explains this (emphasis mine):
vtkCamera::GetProjectionTransformMatrix:
Return the projection transform matrix, which converts from camera coordinates to viewport coordinates. This method computes the aspect, nearz and farz, then calls the more specific signature of GetCompositeProjectionTransformMatrix
with:
vtkCamera::GetCompositeProjectionTransformMatrix:
Return the concatenation of the ViewTransform and the ProjectionTransform. This transform will convert world coordinates to viewport coordinates. The 'aspect' is the width/height for the viewport, and the nearz and farz are the Z-buffer values that map to the near and far clipping planes. The viewport coordinates of a point located inside the frustum are in the range ([-1,+1], [-1,+1], [nearz,farz]).
Note that this matches neither OpenGL's window space nor its normalized device space. I find the term "viewport coordinates" for this a poor choice, but be that as it may. What bugs me more is that the matrix actually does not transform to that "viewport space", but to some clip-space equivalent. Only after the perspective divide will the coordinates be in the range given in the above definition of "viewport space".
But are the clip coordinates in OpenGL the same as the viewport coordinates of VTK?
So the answer is a clear no. But it is close. Basically, that projection matrix is just scaled and shifted along the z dimension, and it is easy to convert between the two. You can simply take znear and zfar out of VTK's matrix and put them into the OpenGL projection matrix formula you linked above, replacing just those two matrix elements.
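A minimal sketch of that replacement (the helper is my own, not a VTK API; n and f are the clipping distances used for the VTK camera):

#include <vtkMatrix4x4.h>

// Overwrite the two z-row elements of the VTK matrix with the
// OpenGL-style mapping to [-1,1].
void makeGlStyleProjection(vtkMatrix4x4* m, double n, double f)
{
    m->SetElement(2, 2, -(f + n) / (f - n));       // 0-indexed row 2, col 2
    m->SetElement(2, 3, -2.0 * f * n / (f - n));   // 0-indexed row 2, col 3
}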

Perspective Projection - OpenGL

I am confused about the position of objects in OpenGL. The eye position is (0,0,0) and the projection plane is at z = -1. At this point, will the objects be between the eye position and the plane (z from 0 to -1), or behind the projection plane? And is there any particular reason for it being so?
First of all, there is no eye in modern OpenGL. There is also no camera, and there is no projection plane. You define these concepts yourself; the graphics library does not give them to you. It is your job to transform your objects from your coordinate system into clip space in your vertex shader.
I think you are thinking about projection wrong. Projection doesn't move the objects in the same sense that a translation or rotation matrix might. If you take a look at the link above, you can see that in order to render a perspective projection, you calculate the x and y components of the projected coordinate with R = V(ez/pz), where ez is the depth of the projection plane, pz is the depth of the object, V is the coordinate vector, and R is the projection. Almost always you will use ez = 1, which turns that equation into R = V/pz, allowing you to place pz in the w coordinate and let OpenGL do the "perspective divide" for you. Assuming you have your eye and plane in the correct places, projecting a coordinate is almost as simple as dividing by its z coordinate. Your objects can be anywhere in 3D space (even behind the eye), and you can project them onto your plane as long as you don't divide by zero or invalidate the z coordinate that you use for depth testing.
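A minimal sketch of that projection (names are illustrative):

struct Vec3 { float x, y, z; };

// R = V * (ez / pz): project eye-space point p onto the plane at depth ez.
Vec3 projectOntoPlane(const Vec3& p, float ez)
{
    float s = ez / p.z;                 // division by zero if p.z == 0!
    return { p.x * s, p.y * s, ez };
}
// With ez = 1 this is just V / pz, which is why a perspective matrix can
// copy the depth into w and let OpenGL's perspective divide do the work.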
There is no "projection plane" at z=-1. I don't know where you got this from. The classic GL perspective matrix assumes an eye space where the camera is located at origin and looking into -z direction.
However, there is the near plane at z<0 and eveything in front of the near plane is going to be clipped. You cannot put the near plane at z=0, because then, you would end up with a division by zero when trying to project points on that plane. So there is one reasin that the viewing volume isn't a pyramid with they eye point at the top but a pyramid frustum.
This is btw. also true for real-world eyes or cameras. The projection center lies behind the lense, so no object can get infinitely close to the optical center in either case.
The other reason why you want a big near clipping distance is the precision of the depth buffer. The whole depth range between the front and the near plane has to be mapped to some depth value with a limited amount of bits, typically 24. So you want to keep the far plane as close as possible, and shift away the near plane as far as possible. The non-linear mapping of the screen-space z coordinate makes this even more important, as that the precision is non-uniformely distributed over that range.
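A small sketch that makes the precision argument concrete, using the standard perspective mapping from eye-space distance d to a [0,1] depth value (names are mine):

#include <cstdio>

float windowZ(float d, float n, float f)
{
    float zNdc = (f + n - 2.0f * f * n / d) / (f - n);
    return zNdc * 0.5f + 0.5f;
}

int main()
{
    // At d = 2n the depth value is already ~0.5: half of all distinct
    // depth values are spent between n and 2n. Pushing the near plane
    // from 0.1 to 1.0 spreads the precision much more usefully.
    for (float n : { 0.1f, 1.0f })
        std::printf("n=%.1f: z(%.1f)=%.5f z(10)=%.5f z(100)=%.5f\n",
                    n, 2.0f * n, windowZ(2.0f * n, n, 1000.0f),
                    windowZ(10.0f, n, 1000.0f),
                    windowZ(100.0f, n, 1000.0f));
}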

OpenGL/GLUT - Project ModelView Coordinate to Texture Matrix

Is there a way, using OpenGL or GLUT, to project a point from the model-view matrix into an associated texture matrix? If not, is there a commonly used library that achieves this? I want to modify the texture of an object according to a ray cast in 3D space.
The simplest case would be:
A ray is cast which intersects a quad, mapped with a single texture.
The point of intersection is converted to a value in texture space, clamped between [0.0, 1.0] in the x and y axes.
A 3x3 patch of pixels centered around the rounded value of the resulting texture point is set to an alpha value of 0 (or another RGBA value which is convenient for the desired effect).
To illustrate, here is a more complex version of the question using a sphere; the pink box shows the replaced pixels.
I just specify texture points for mapping in OpenGL; I don't actually know how the pixels are projected onto the sphere. Basically, I need to do the inverse of that projection, but I don't quite know how to do that math, especially on more complex shapes like a sphere or an arbitrary convex hull. I assume that you can somehow find a planar polygon that makes up the shape which the ray is intersecting, and from there the inverse projection of a quad or triangle would be trivial.
Some equations, articles and/or example code would be nice.
There are a few ways you could accomplish what you're trying to do:
Project a world-coordinate point into normalized device coordinates (NDCs) by doing the model-view and projection matrix multiplications yourself (or, if you're using old-style OpenGL, by calling gluProject), and perform the perspective division step. If you use a depth coordinate of zero, this corresponds to intersecting your ray at the imaging plane. The only other correction needed is to map from NDCs (which are in the range [-1,1] in x and y) into texture space: divide the resulting coordinate by two, then shift by 0.5 (sketched after this list).
Skip the ray tracing altogether: bind your texture as a framebuffer attachment to a framebuffer object, and then render a big point (or sprite) that modifies the colors in the neighborhood of the intersection as you want. You could use the same model-view and projection matrices, and will (probably) only need to update the viewport to match the texture resolution.
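A minimal sketch of the first option, assuming a row-major combined projection * model-view matrix and helper names of my own:

#include <array>

struct Vec4 { float x, y, z, w; };

// mvp * v, with mvp stored row-major as float[16].
Vec4 mulRowMajor(const float mvp[16], const Vec4& v)
{
    return { mvp[0]*v.x + mvp[1]*v.y + mvp[2]*v.z  + mvp[3]*v.w,
             mvp[4]*v.x + mvp[5]*v.y + mvp[6]*v.z  + mvp[7]*v.w,
             mvp[8]*v.x + mvp[9]*v.y + mvp[10]*v.z + mvp[11]*v.w,
             mvp[12]*v.x + mvp[13]*v.y + mvp[14]*v.z + mvp[15]*v.w };
}

// World-space point -> [0,1]^2 texture coordinates.
std::array<float, 2> worldToTexCoord(const float mvp[16], const Vec4& world)
{
    Vec4 clip = mulRowMajor(mvp, world);
    return { (clip.x / clip.w) * 0.5f + 0.5f,   // NDC [-1,1] -> [0,1]
             (clip.y / clip.w) * 0.5f + 0.5f };
}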
So I found a solution that is a little complicated, but does the trick.
For complex geometry you must determine which quad or triangle was intersected, and use this as the plane. The quad must be planar (obviously).
Draw a plane at the identity matrix with dimensions 1x1x0, and map the texture to points identical to the model geometry.
Transform the plane, and store the inverse of each transform matrix in a stack.
Find the point at which the plane is intersected.
Transform this point using the inverse matrix stack until it returns to the identity matrix (it should have no depth).
Convert this point from 1x1 space into pixel space by multiplying it by the number of pixels and rounding. Or start your 2D combining logic here.
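A minimal sketch of steps 2-5 above, with illustrative types and names; the caller pushes the inverse of every transform it applies to the quad:

#include <vector>

struct Vec4 { float x, y, z, w; };
struct Mat4 {
    float m[16];                                    // row-major
    Vec4 operator*(const Vec4& v) const {
        return { m[0]*v.x + m[1]*v.y + m[2]*v.z  + m[3]*v.w,
                 m[4]*v.x + m[5]*v.y + m[6]*v.z  + m[7]*v.w,
                 m[8]*v.x + m[9]*v.y + m[10]*v.z + m[11]*v.w,
                 m[12]*v.x + m[13]*v.y + m[14]*v.z + m[15]*v.w };
    }
};

std::vector<Mat4> inverseStack;   // push inverse(T) whenever T is applied

Vec4 hitToPlaneLocal(Vec4 hit)
{
    // Undo the transforms in reverse order; the point returns to the
    // unit quad, so its z should be (approximately) zero.
    for (auto it = inverseStack.rbegin(); it != inverseStack.rend(); ++it)
        hit = (*it) * hit;
    return hit;   // x,y in [0,1]; multiply by texture size for pixels
}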