How to calculate a linear tapering transformation matrix - c++

I need to calculate a 4x4 matrix (for OpenGL) that can transform the 3D object on the left into the one on the right. The transformation is applied along only one axis.
EDIT:
The inputs are a given 3D object (its points) to be deformed and a single variable for the amount of deformation.
The picture shows a cube projected onto a plane, with only the relevant changes visible. There are no changes along the axis perpendicular to the view plane.
The relative position of the two objects is not relevant; it is used only to show the "before and after" situation.

******* wrong answer was here *******
It's somewhat the opposite of a 2D perspective matrix followed by the perspective division. So, to do this "perspective" thing in reverse, you would need to do something opposite to the perspective division and then multiply the result by an inverted "perspective" matrix. And though the perspective matrix can be inverted, I have no idea what the "opposite of perspective division" would be. I think you just can't do it with matrices; you'll have to transform the Y coordinate of each vertex instead.
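As a rough sketch of that per-vertex approach (the taper axis, the amount parameter, and the Vec3 type here are my illustrative assumptions, not something given in the question), one could scale each vertex's Y coordinate by a factor that varies linearly along X:

#include <vector>

struct Vec3 { float x, y, z; };

// Per-vertex linear taper sketch (not a single 4x4 matrix):
// scale Y by a factor that grows linearly with X.
// 'amount' is the single deformation variable from the question.
void taperY(std::vector<Vec3>& verts, float amount)
{
    for (Vec3& v : verts)
        v.y *= 1.0f + amount * v.x;   // linear taper along X
}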

Related

Calculating the perspective projection matrix according to the view plane

I'm working with OpenGL, but this is basically a math question.
I'm trying to calculate the projection matrix. I have a point on the view plane, R(x,y,z), and the normal vector of that plane, N(n1,n2,n3).
I also know that the eye is at (0,0,0), which I guess in technical terms is the perspective reference point.
How can I arrive at the perspective projection from this data? I know how to do it the regular way, where you have the FOV, the aspect ratio, and the near and far planes.
I think you created a bit of confusion by putting this question under the "opengl" tag. The problem is that in computer graphics, the term projection is not understood in a strictly mathematical sense.
In maths, a projection is defined (and the following is not the exact mathematical definition, just my own paraphrasing) as something which doesn't change the result any further when applied twice: P(P(x)) = P(x). Think about it. When you project a point in 3D space onto a 2D plane (which is still in that 3D space), each point's projection ends up on that plane. But points which are already on this plane don't move any more, so you can apply the projection as many times as you want without changing the outcome any further.
The classic "projection" matrices in computer graphics don't do this. They transform the space in such a way that a general frustum is mapped to a cube (or cuboid). For that, you basically need all the parameters that describe the frustum: typically the aspect ratio, the field-of-view angle, and the distances to the near and far planes, as well as the projection direction and the center point (the latter two are usually defined implicitly by convention). For the general case, there are also horizontal and vertical asymmetry components (think of it like "lens shift" on a projector). All of that is what the typical projection matrix in computer graphics represents.
Constructing such a matrix from the parameters you have given is not really possible, because you are lacking most of them. Also (and I think this is kind of revealing) you have given a view plane. But the projection matrices discussed so far do not define a view plane: any plane parallel to the near or far plane and in front of the camera can be imagined as the viewing plane (behind the camera would also work, but the image would be mirrored), if you should need one. In the strict sense, it would only be a "view plane" if all of the projected points also ended up on that plane, which the computer-graphics perspective matrix explicitly doesn't do. It instead keeps their 3D distance information, which also means that the operation is invertible, while a classical mathematical projection typically isn't.
From all of that, I simply guess that what you are looking for is a perspective projection from 3D space onto a 2D plane, as opposed to the perspective transformation used for computer graphics. And the only parameters you need for that are the view point and a plane. Note that this is exactly what you have been given: the projection center shall be the origin, and R and N define the plane.
Such a projection can also be expressed as a 4x4 homogeneous matrix. There is one thing that is not defined in your question: the orientation of the normal. I'm assuming standard maths convention again and define the view plane as <N,x> + d = 0. Plugging R into that equation gives d = -N_x*R_x - N_y*R_y - N_z*R_z. So the projection matrix is just
(    1      0      0    0 )
(    0      1      0    0 )
(    0      0      1    0 )
( -N_x/d -N_y/d -N_z/d  0 )
This matrix has a few notable properties. It contains a zero column, so it is not invertible. Also note that for every point (s*x, s*y, s*z, 1) you apply it to, the result (after division by the resulting w, of course) is the same no matter what s is, so every point on a line between the origin and (x,y,z) results in the same projected point, which is exactly what a perspective projection is supposed to do. And finally note that w = (N_x*x + N_y*y + N_z*z)/(-d), so for every point fulfilling the plane equation above, w = -d/-d = 1 results; in combination with the identity transform on the other coordinates, this means that such a point is unchanged.
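To make the mechanics concrete, here is a small sketch (the struct, the function name, and the test plane are my illustrative choices) that computes d from N and R, applies the last matrix row to get w, and performs the division:

#include <cstdio>

struct Vec3d { double x, y, z; };

// Project p onto the plane through R with normal N using the 4x4
// matrix above: x, y, z pass through, w = -(N.p)/d, then divide by w.
Vec3d projectOntoPlane(Vec3d p, Vec3d N, Vec3d R)
{
    double d = -(N.x*R.x + N.y*R.y + N.z*R.z);     // from <N,x> + d = 0
    double w = -(N.x*p.x + N.y*p.y + N.z*p.z) / d; // last matrix row
    return { p.x/w, p.y/w, p.z/w };                // perspective division
}

int main()
{
    Vec3d N{0,0,1}, R{0,0,5};                    // plane z = 5
    Vec3d q = projectOntoPlane({1,2,10}, N, R);
    std::printf("%g %g %g\n", q.x, q.y, q.z);    // expect 0.5 1 5
}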
Projection matrix must be at (0,0,0) and viewing in Z+ or Z- direction
This is a must because many things in OpenGL depend on it, like fog, lighting, etc. So if your direction or position is different, then you need to move that difference into the camera matrix. Let's assume your focal point is (0,0,0), as you stated, and the normal vector is (0,0,+/-1).
Z near
is the distance between the focal point and the projection plane, so znear is the perpendicular distance between the plane and (0,0,0). If the assumption above holds, then
znear=R.z
otherwise you need to compute it; I think you have everything you need for that (see the sketch after these steps):
cast a line from R with direction N
find the point on it closest to the focal point (0,0,0)
the znear is then the distance from that point to R
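Equivalently, as a closed form (a small sketch; the names are mine): znear is the perpendicular distance |N.R|/|N| from the focal point to the plane:

#include <cmath>

struct Vec3d { double x, y, z; };

// Perpendicular distance from the focal point (0,0,0) to the plane
// through R with normal N; this is the znear described above.
double computeZnear(Vec3d R, Vec3d N)
{
    double dotNR = N.x*R.x + N.y*R.y + N.z*R.z;
    double lenN  = std::sqrt(N.x*N.x + N.y*N.y + N.z*N.z);
    return std::fabs(dotNR) / lenN;
}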
Z far
is determined by the depth buffer bit width and z near
zfar=znear*(1<<(cDepthBits-1))
This is the maximal usable zfar (for my purposes). If you need more precision, then lower it a bit, and do not forget that precision is higher near znear and much, much worse near zfar. The zfar is usually set to the max view distance with znear computed from it, or znear is set to the min focus range.
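For example (a tiny sketch of the rule above, with assumed values):

#include <cstdio>

int main()
{
    const int    cDepthBits = 24;  // depth buffer bit width
    const double znear = 0.1;
    const double zfar  = znear * double(1 << (cDepthBits - 1));
    std::printf("zfar = %f\n", zfar);  // 0.1 * 2^23 = 838860.8
}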
view angle
I mostly use a 60 degree view angle: zang=60.0 [deg]
Common males in my region can see up to 90 degrees, but that includes peripheral vision; the 60 degree view is more comfortable.
Females have a bit wider view ... but I have never heard any complaints from them about 60 degree views, so let's assume it's comfortable for them too ...
Aspect
the aspect ratio is determined by your OpenGL window dimensions xs,ys
aspect=(xs/ys)
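One pitfall worth flagging (my note, not the original author's): if xs and ys are integers, divide them as floating point, otherwise the ratio truncates:

int xs = 1920, ys = 1080;                  // window dimensions (assumed)
double aspect = double(xs) / double(ys);   // 1.777..., not 1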
This is how I set the projection matrix:
const double deg=M_PI/180.0; // degrees to radians conversion factor
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(zang/aspect,aspect,znear,zfar);
// gluPerspective has an inaccurate tangent, so correct the perspective matrix like this:
double perspective[16];
glGetDoublev(GL_PROJECTION_MATRIX,perspective);
perspective[ 0]=    1.0/tan(0.5*zang*deg); // x-axis scale
perspective[ 5]= aspect/tan(0.5*zang*deg); // y-axis scale
glLoadMatrixd(perspective);
perspective is a copy of the projection matrix; I use it for mouse position conversions etc ...
If you do not correct the matrix, then you will be off when using advanced things like overlapping multiple frustums to get a high-precision depth range. I use this to obtain a <0.1m, 1000AU> frustum with a 24-bit depth buffer, and the inaccuracy would cause the images not to fit together perfectly ...
[Notes]
if the focal point is not really at (0,0,0), or you are not viewing along the Z axis (i.e. you do not have a camera matrix and instead use the projection matrix for that), then with basic scenes/techniques you will see no problem. The problems start with the use of advanced graphics. If you use GLSL, then you can handle this without problems, but the fixed-function OpenGL pipeline cannot handle it properly. This is also called PROJECTION_MATRIX abuse
[edit1] a few links
If your view is a standard frustum, then write the matrix yourself (gluPerspective); otherwise look here: Projections, for some ideas on how to construct it
[edit2]
From your comment I see it like this:
f is your viewing point (the axes are the global world axes)
f' is the viewing point as if R were the center of the screen
so create the projection matrix for the f' position (as explained above), then create a transform matrix that transforms f' to f. The transformed f must have the same Z axis as f'; the other axes can be obtained by cross products (see the sketch below). Use that as the camera matrix, or multiply both together and use the result as an abused projection matrix
How to construct such a matrix is explained in the Understanding transform matrices link from my earlier comments
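A minimal sketch of that cross-product construction (the names and the up-vector hint are my assumptions): given the preserved Z axis, derive the other two axes:

#include <cmath>

struct Vec3d { double x, y, z; };

Vec3d cross(Vec3d a, Vec3d b)
{
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}

Vec3d normalize(Vec3d v)
{
    double l = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return { v.x/l, v.y/l, v.z/l };
}

// Given the (kept) Z axis and a rough "up" hint, derive the other two
// camera axes by cross products, as described above.
void buildBasis(Vec3d zAxis, Vec3d upHint, Vec3d& xAxis, Vec3d& yAxis)
{
    xAxis = normalize(cross(upHint, zAxis));
    yAxis = cross(zAxis, xAxis);  // unit length if zAxis is unit
}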

OpenGL Frustum Pedagogy

I've been drawing directly into homogeneous clip space (the 2x2x2 cube centered around (0,0,0)) in OpenGL, and I've realized that the perspective transformation matrix transforms all geometry from one right parallelepiped (view space) to another right parallelepiped (homogeneous clip space).
So why the heck does every OpenGL article use a frustum that is not a right parallelepiped to illustrate how projection works? I understand that the perspective transformation matrix causes everything to get scaled by a term containing its distance from the camera and the camera's distance from the plane... is the traditional frustum illustration trying to explain that? Or have we truly entered some warped space at some point in the perspective transformation? If so, how do we end up back at a right parallelepiped (homogeneous clip space) at the end of it all?
You are right in the sense that there is just some affine, linear transformation and no real perspective distortion, as long as you interpret clip space as a 4-dimensional vector space.
But clip space is not the "end of it all". The perspective effect is a nonlinear transformation which is finally achieved by the perspective division performed after the transformation to clip space. The projection matrix determines the w value that will be the divisor for this, which is typically just -z_eye.
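In code, that nonlinear step is just this division (a sketch; the types are illustrative):

struct Vec4d { double x, y, z, w; };
struct Vec3d { double x, y, z; };

// After the linear projection matrix has produced a clip-space point,
// dividing by w yields normalized device coordinates; this division is
// the nonlinear part of the perspective effect.
Vec3d perspectiveDivide(Vec4d clip)
{
    return { clip.x/clip.w, clip.y/clip.w, clip.z/clip.w };
}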

Calculating the Normal Matrix in OpenGL

The following site says to use the model_view matrix when computing the normal matrix (assuming we are not using the built-in gl_NormalMatrix): Light House. I have the following algorithm in my program:
// Calculate the normal matrix
// 1. Multiply the model matrix by the view matrix and then grab the
//    upper-left 3x3 matrix.
mat3x3 mv_orientation = glext::orientation_matrix<float, glext::column>(
    glext::model_view<float, glext::column>(glext_model_, glext_view_));
// 2. Because OpenGL matrices use homogeneous coordinates, an affine
//    inversion should work???
mv_orientation.affine_invert();
// 3. The normal matrix is defined as the transpose of the inverse of
//    the upper-left 3x3 matrix.
mv_orientation.transpose();
// 4. Pass this to the shader.
basic_shader_.set_uniform_3d_matrix("normal_matrix", mv_orientation.to_gl_matrix());
Assuming the statements in the code above are correct: do you not include the projection matrix in the computation of the normal matrix? If not, why does the projection matrix not affect normals the way it affects points?
That's because projection is not an affine transformation. Projections don't preserve the inner product, and therefore they don't preserve angles. And the real angles that affect light diffusion and reflection are the angles in the affine 3D space. So also using the projection matrix would get you different angles, wrong angles, and hence wrong lighting.
Do you not include the projection matrix in the computation of the normal matrix?
No. Normals are required for calculations, like illumination, that happen in world and/or view space. It doesn't make sense from a mathematical point of view to do this after projection.
If not why, does the projection matrix not affect the normals like they do points?
Because it would make no sense. That normals should not undergo the projective transformation was the original reason for having a separate projection matrix. If you put normals through the projection, they'd lose their meaning and usefulness.
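For reference, here is how the same computation commonly looks with GLM (a sketch assuming GLM is available; the question's glext wrappers are project-specific): the transpose of the inverse of the upper-left 3x3 of the model-view matrix, with no projection matrix anywhere in it:

#include <glm/glm.hpp>

// Normal matrix = transpose(inverse(upper-left 3x3 of model-view)).
glm::mat3 normalMatrix(const glm::mat4& modelView)
{
    return glm::transpose(glm::inverse(glm::mat3(modelView)));
}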

How perspective matrix works?

I started reading lesson 1 on the learningwebgl blog, and I noticed this part:
var pMatrix = mat4.create();
mat4.perspective(45, gl.viewportWidth / gl.viewportHeight, 0.1, 100.0, pMatrix);
I roughly understand how matrices (translation/rotation/multiplication) work, but I have no idea what mat4.perspective(...) means. What is it used for? What is the result if I multiply a vector by this matrix?
The perspective matrix is used to scale, and possibly translate or flip, the coordinate system in preparation for the perspective divide. Since the perspective projection operation involves a divide, it cannot be represented by a linear matrix transformation alone.
In a programmable graphics pipeline (see pixel shaders) you cannot see the divide operation; it is still one of the fixed-function parts. The programmer controls it by tweaking the variables involved in the operation. In the case of the perspective divide, it is the projection matrix that gives you this control.
The projection matrix is used to convert world-coordinates to screen coordinates.
The positions in your three-dimensional virtual world are triplets of x, y and z coordinates. When you want to draw something (or rather tell OpenGL to draw something), it needs to calculate where these coordinates lie on the user's screen.
This calculation is implemented with matrix multiplication.
A vector consisting of x, y and z (and a fourth value of 1, which is necessary to allow the matrix to perform operations like translation) is multiplied with the matrix to produce a new set of x, y and z coordinates (the fourth value is discarded), which represents where this point lies on the user's screen (the z coordinate is kept to determine which objects are in front of others).
The function mat4.perspective generates a projection matrix which does exactly that. The arguments are:
The field-of-view in degree (45)
the aspect ratio of the field of view (the aspect ratio of the viewport)
the minimal distance from the viewer which is still drawn (0.1 world units)
the maximum distance from the viewer which is still drawn (100.0 world units)
the array in which the generated matrix is stored (pMatrix)
When a point is multiplied by this matrix, the result is the screen coordinates where this point has to be drawn.
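A C++ analogue of that call (sketched with GLM; the original snippet is WebGL, so this is an illustration rather than the same API): build the matrix, multiply the point, divide by w:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Build a projection matrix like mat4.perspective(45, aspect, 0.1, 100)
// and map a point to normalized device coordinates.
glm::vec3 toNdc(const glm::vec3& p, float aspect)
{
    glm::mat4 pMatrix = glm::perspective(glm::radians(45.0f), aspect, 0.1f, 100.0f);
    glm::vec4 clip = pMatrix * glm::vec4(p, 1.0f);
    return glm::vec3(clip) / clip.w;  // perspective divide
}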

Efficiency of perspective projection vs raytracing/ray casting

I have a very general question. I wish to determine the boundary points of a number of objects (each comprising 30-50 closed polygons, each polygon having around 300 points (x,y,z)). I am working with a fixed viewport which is rotated about the x, y and z axes (alpha, beta, gamma) with respect to the origin of the polygons' coordinate system.
As I see it there are two possibilities: perspective projection or ray tracing. Perspective projection would seem to require a large number of matrix operations for each point to determine whether its position is within or without the viewport.
Or, given the large number of points, would I be better off ray tracing from the viewport pixels to the objects?
i.e. determining whether there is an intersection and then whether the intersection occurs within or without the object(s).
In either case I will write the result as 0 (outside) or 1 (inside) to a 200x200 integer matrix representing the viewport
Thank you in anticipation
Perspective projection (and then scan-converting the polygons in image coordinates) is going to be a lot faster.
The matrix transform that is required in the case of perspective projection (essentially the world-to-camera matrix) is required in exactly the same way when ray tracing. However, with perspective projection you're only transforming the polygon vertices, whereas with ray tracing you're processing every pixel in the image.
You should be able to use perspective projection and a perspective projection matrix to compute the position of the vertices in screen space. It's hard to understand what you really want to do. If you want to create an image of that 3D scene, then with only a few polygons it would be hard to see any difference between ray tracing and rasterisation anyway if your code is optimised (you would still need an acceleration structure for the ray-tracing approach); however, yes, rasterisation is likely to be faster.
Now, if you need to compute the distance between the eye (the camera's origin) and the geometry visible through the camera's view, then I don't see why you can't take the depth value of any sample for any pixel in the image and use the inverse of the perspective projection matrix to find its distance in camera space, as sketched below.
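A sketch of that last idea (GLM again; the function name is mine): unproject a pixel's NDC position and depth through the inverse projection matrix and measure its distance to the eye:

#include <glm/glm.hpp>

// Recover the camera-space position behind a pixel from its NDC
// coordinates (x, y, depth, all in [-1,1]) and return its distance
// to the eye at the origin.
float eyeDistance(const glm::mat4& proj, const glm::vec3& ndc)
{
    glm::vec4 eye = glm::inverse(proj) * glm::vec4(ndc, 1.0f);
    eye /= eye.w;                         // undo the perspective divide
    return glm::length(glm::vec3(eye));   // distance from camera origin
}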
Why is speed an issue in your problem? If it isn't, then use ray tracing indeed.
Most of this information can be found on www.scratchapixel.com