Can anyone explain what steps need to be performed to generate NDC from object space coordinates, using the example below?
float vertices[] = {
    -2.0f, -12.0f,
     2.0f, -12.0f,
    -2.0f,  12.0f,
     2.0f,  12.0f
};
here is my mvp matrix:
float MVP[16] = {
     0.993,  0.054, -0.102, -0.102,
     0.007,  0.852,  0.524,  0.524,
    -0.115,  0.521, -0.846, -0.846,
     0.575, -2.604,  4.061,  4.260 };
By my calculation I get:
X,Y,Z,W:
0.824918 7.111959 1.110172 1.000000
-1.111003 5.714332 1.090060 1.000000
-0.123152 0.698949 0.981588 1.000000
0.256286 0.747339 0.980855 1.000000
But those are wrong. By wrong I mean, if I do:
vec4 vert = mvp * vec4(inPos.x,inPos.y,0,1);
vert.xyzw/=vert.w;
gl_Position = vert;
in the vertex shader, I get different output than with:
vec4 vert = mvp * vec4(inPos.x,inPos.y,0,1);
gl_Position = vert;
First of all, I can't exactly reproduce the numbers you got. Maybe you rounded the matrix elements in the question. The values I get are:
V0: [0.819627 7.09211 1.1091 1]
V1: [-1.10977 5.69893 1.08916 1]
V2: [-0.123419 0.698661 0.981492 1]
V3: [0.255704 0.7471 0.980762 1]
However, the issue is something else. When you look at the clip space coordinates of those points, you get this:
V0: [-1.495 -12.936 -2.023 -1.824]
V1: [2.477 -12.72 -2.431 -2.232]
V2: [-1.327 7.512 10.553 10.752]
V3: [2.645 7.728 10.145 10.344]
Note several things:
The viewing volume is restricted by the inequality -w <= x,y,z <= w in clip space. The coordinates of V0 and V1 do not satisfy that inequality, so these points lie outside of the viewing frustum. In general, the GL will clip primitives which lie partly inside and partly outside of the viewing volume.
The clipping in itself is not the issue here though. Note that the w component of V0 and V1 is actually negative. This means that those points lie behind the projection center (= virtual camera). By dividing them by w, you mirror these points back in front of the camera. NDC coordinates of points which are behind the camera are mirrored in x,y and z, and skipping the clipping step will result in completely wrong primitives in this situation.
The GL will never have this issue, because the clipping is done in clip space, before the division. If you want meaningful NDC coordinates, you also have to implement your own clipping. However, this can hardly be done in the vertex shader (unless all you are drawing are separate points). For line or triangle primitives, you need the data of all vertices to calculate the intersection between the primitive edges and the clipping plane(s), and you might have to create new primitives on the fly.
This also means that there just are no meaningful NDC coordinates at all for points which lie behind the projection center. For example, consider a triangle where two vertices lie in front of the camera, well inside the view volume, and the third vertex lies behind the camera. Clipping will create a new point on each of the edges from the third point to the others (and will generate two triangles replacing the original one). Those two new points do have meaningful NDC coordinates, but the third original point never does - and there is no 1:1 mapping between that third input point and the newly created points, either.
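To illustrate the above on the CPU, here is a minimal sketch (assuming the usual column-major layout for your MVP array, and using GLM, which is only my choice here): it computes the clip space coordinates, tests the -w <= x,y,z <= w condition, and only performs the division for points that lie in front of the projection center:

// Minimal sketch: clip space and NDC for the vertices from the question.
#include <cstdio>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

int main()
{
    float MVP[16] = {
         0.993f,  0.054f, -0.102f, -0.102f,
         0.007f,  0.852f,  0.524f,  0.524f,
        -0.115f,  0.521f, -0.846f, -0.846f,
         0.575f, -2.604f,  4.061f,  4.260f };
    float vertices[] = { -2.0f, -12.0f,  2.0f, -12.0f,
                         -2.0f,  12.0f,  2.0f,  12.0f };

    glm::mat4 mvp = glm::make_mat4(MVP);  // interprets the array column-major, as OpenGL does

    for (int i = 0; i < 4; ++i)
    {
        glm::vec4 clip = mvp * glm::vec4(vertices[2 * i], vertices[2 * i + 1], 0.0f, 1.0f);

        // Viewing volume in clip space: -w <= x, y, z <= w
        bool inside = -clip.w <= clip.x && clip.x <= clip.w &&
                      -clip.w <= clip.y && clip.y <= clip.w &&
                      -clip.w <= clip.z && clip.z <= clip.w;

        if (clip.w > 0.0f)  // only points in front of the projection center
        {
            glm::vec4 ndc = clip / clip.w;
            std::printf("V%d: ndc [%g %g %g] %s\n", i, ndc.x, ndc.y, ndc.z,
                        inside ? "(inside the view volume)" : "(outside, would be clipped)");
        }
        else
        {
            std::printf("V%d: w = %g, behind the projection center, no meaningful NDC\n",
                        i, clip.w);
        }
    }
    return 0;
}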
I am trying to orient a 3D object at the world origin such that it doesn't change its position relative to the camera when I move the camera or change its field of view. I tried doing this:
Object Transform = Inverse(CameraProjectionMatrix)
How do I undo the perspective divide? When I change the fov, the object is affected by it.
In detail it looks like
origin(0.0, 0.0, 0.0, 1.0f);
projViewInverse = Camera.projViewMatrix().inverse();
projectionMatrix = Camera.projViewMatrix();
projectedOrigin = projectionMatrix * origin;
topRight(0.5f, 0.5f, 0.f);
scaleFactor = 1.0/projectedOrigin.z();
scale(scaleFactor,scaleFactor,scaleFactor);
finalMatrix = projViewInverse * Scaling(w) * Translation(topRight);
If you use a gfx pipeline where positions (w=1.0) and vectors (w=0.0) are transformed to NDC like this:
(x',y',z',w') = M*(x,y,z,w) // applying transforms
(x'',y'') = (x',y')/w' // perspective divide
where M is all your 4x4 homogeneous transform matrices multiplied together in their order. If you want to go back to the original (x,y,z), you need to know w', which can be computed from z. The equation depends on your projection. In such a case you can do this:
w' = f(z') // z' is usually the value encoded in the depth buffer and can be obtained
(x',y') = (x'',y'')*w' // screen -> camera
(x,y) = Inverse(M)*(x',y',z',w') // camera -> world
However, this can be used only if you know z' and can derive w' from it. So what is usually done (if we cannot) is to cast a ray from the camera focal point through (x'',y'') and stop at the wanted perpendicular distance to the camera. For a perspective projection you can look at it as triangle similarity:
So for each vertex you want to transform, you need its projected x'',y'' position on the znear plane (screen) and then just scale x'',y'' by the ratio between the distances to the camera focal point (*z1/z0). Now all we need is the focal length z0. That one depends on the kind of projection matrix you use. I usually encounter two versions: when you are in the camera coordinate system, the point (0,0,0) is either the focal point or lies on the znear plane. However, the projection matrix can be anything, hence the focal point position can vary ...
When you have to deal with aspect ratio, the first method handles it internally, as it is inside M. The second method needs to apply the inverse of the aspect ratio correction before the conversion, so apply it directly to x'',y''.
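For completeness, a minimal sketch of the first method (when you know the depth value), assuming a GLM-based setup, glDepthRange(0,1) and a viewport given as (x, y, width, height); the names proj, view and viewport are mine, not from your code:

// Minimal sketch: window position + depth buffer value -> world space,
// i.e. the "(x,y) = Inverse(M)*(x',y',z',w')" path from above.
#include <glm/glm.hpp>

glm::vec3 unprojectToWorld(float winX, float winY, float depth,
                           const glm::mat4& proj, const glm::mat4& view,
                           const glm::vec4& viewport)
{
    // window -> NDC
    glm::vec4 ndc((winX - viewport.x) / viewport.z * 2.0f - 1.0f,
                  (winY - viewport.y) / viewport.w * 2.0f - 1.0f,
                  depth * 2.0f - 1.0f,   // depth buffer value -> [-1, 1]
                  1.0f);

    // NDC -> world: apply Inverse(M) and undo the perspective divide
    glm::vec4 world = glm::inverse(proj * view) * ndc;
    return glm::vec3(world) / world.w;
}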
I'm displaying an array of 3D points with OpenGL. The problem is the 3D points are from a sensor where X is forward, Y is to the left and Z is up. From my understanding, OpenGL has X to the right, Y up and Z out of the screen. So when I use a lot of the examples of projection matrices and cameras, the points are obviously not viewed the right way, or in a way that makes sense.
So to compare the two (S for sensor, O for OpenGL):
Xs == -Zo, Ys == -Xo, Zs == Yo.
Now my questions are:
How can I rotate the points from S to O? I tried rotating by 90 degrees around X, then Z, but it doesn't appear to be working.
Do I even need to rotate to the OpenGL convention, or can I make up my own axes (use the sensor's orientation) and change the camera code? Or will some assumptions break somewhere in the graphics pipeline?
My implementation based on the answer below:
glm::mat4 model = glm::mat4(0.0f);
model[0][1] = -1;
model[1][2] = 1;
model[2][0] = -1;
// My input to the shader was a mat4 for the model matrix so need to
// make sure the bottom right element is 1
model[3][3] = 1;
The one line in the shader:
// Note that the above matrix is OpenGL to Sensor frame conversion
// I want Sensor to OpenGL so I need to take the inverse of the model matrix
// In the real implementation I will change the code above to
// take inverse before sending to shader
" gl_Position = projection * view * inverse(model) * vec4(lidar_pt.x, lidar_pt.y, lidar_pt.z, 1.0f);\n"
In order to convert the sensor data's coordinate system into OpenGL's right-handed world-space, where the X axis points to the right, Y points up and Z points towards the user in front of the screen (i.e. "out of the screen"), you can very easily come up with a 3x3 rotation matrix that will perform what you want:
Since you said that in the sensor's coordinate system X points into the screen (which is equivalent to OpenGL's -Z axis), we will map the sensor's (1, 0, 0) axis to (0, 0, -1).
And your sensor's Y axis points to the left (as you said), so that will be OpenGL's (-1, 0, 0). And likewise, the sensor's Z axis points up, so that will be OpenGL's (0, 1, 0).
With this information, we can build the rotation matrix:
/ 0 -1 0\
| 0 0 1|
\-1 0 0/
Simply multiply your sensor data vertices with this matrix before applying OpenGL's view and projection transformation.
So, when you multiply that out with a vector (Sx, Sy, Sz), you get:
Ox = -Sy
Oy = Sz
Oz = -Sx
(where Ox/y/z is the point in OpenGL coordinates and Sx/y/z is the sensor coordinates).
Now, you can just build a transformation matrix (right-multiply against your usual model-view-projection matrix) and let a shader transform the vertices by that, or you can simply pre-transform the sensor vertices before uploading to OpenGL.
You hardly ever need angles in OpenGL when you know your linear algebra math.
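For reference, a minimal sketch of that matrix in GLM (which is column-major, so each vec4 below is a column); the variable name sensorToGl is mine:

#include <glm/glm.hpp>

// Sensor -> OpenGL rotation from the answer above.
// Rows: ( 0 -1  0 | Ox = -Sy ), ( 0  0  1 | Oy = Sz ), ( -1  0  0 | Oz = -Sx )
const glm::mat4 sensorToGl(
    glm::vec4( 0.0f, 0.0f, -1.0f, 0.0f),  // column 0: image of the sensor X axis
    glm::vec4(-1.0f, 0.0f,  0.0f, 0.0f),  // column 1: image of the sensor Y axis
    glm::vec4( 0.0f, 1.0f,  0.0f, 0.0f),  // column 2: image of the sensor Z axis
    glm::vec4( 0.0f, 0.0f,  0.0f, 1.0f)); // column 3: no translation

// Upload sensorToGl as the model matrix, so the shader can simply do
//   gl_Position = projection * view * model * vec4(lidar_pt, 1.0);
// without calling inverse().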
Can someone tell me how to make triangle vertices collide with edges of the screen?
For math library I am using GLM and for window creation and keyboard/mouse input I am using GLFW.
I created perspective matrix and simple array of triangle vertices.
Then I multiplied all this in vertex shader like:
gl_Position = projection * view * model * vec4(pos, 1.0);
Projection matrix is defined as:
glm::mat4 projection = glm::perspective(
45.0f, (GLfloat)screenWidth / (GLfloat)screenHeight, 0.1f, 100.0f);
I have a fully working camera and projection. I can move around my "world" and see the triangle standing there. The problem I have is that I want to make the triangle collide with the edges of the screen.
What I did was disable the camera and only enable keyboard movement. Then I initialized the translation matrix as glm::translate(model, glm::vec3(xMove, yMove, -2.5f)); and a scale matrix to scale by 0.4.
Now all of that is working fine. When I press RIGHT the triangle moves to the right, when I press UP the triangle moves up, etc... The problem is I have no idea how to make it stop moving when it hits the edges.
This is what I have tried:
triangleRightVertex is a glm::vec3 object.
0.4 is the scaling value that I used in the scaling matrix.
if (((xMove + triangleRightVertex.x) * 0.4f) >= 1.0f)
{
    cout << "Right side collision detected!" << endl;
}
When I move the triangle to the right, it does detect the collision when the x of the third vertex (the bottom-right corner of the triangle) collides with the right side, but it goes a little bit beyond before it detects it. And when I tried moving up, it detected the collision only when half of the triangle was already off the top.
I have no idea what to do here. Can someone explain this to me, please?
Each of the vertex coordinates of the triangle is transformed by the model matrix from model space to world space, by the view matrix from world space to view space and by the projection matrix from view space to clip space. gl_Position is the homogeneous coordinate in clip space, which is further transformed by the perspective divide from clip space to normalized device space. The normalized device space is a cube, with a left, bottom, front corner of (-1, -1, -1) and a right, top, back corner of (1, 1, 1).
All the geometry which is in this (volume) cube is "visible" on the viewport.
In clip space the clipping of the scene is performed.
A point is in clip space if the x, y and z components are in the range defined by the inverted w component and the w component of the homogeneous coordinates of the point:
-w <= x, y, z <= w
What you want to do is to check whether a vertex x coordinate of the triangle is clipped. So you have to check if the x component of the clip space coordinate is in the view volume.
Calculate the clip space position of the vertices on the CPU, as it does the vertex shader.
The glm library is very suitable for things like that:
glm::vec3 triangleVertex = ... ; // new model coordinate of the triangle
glm::vec4 h_pos = projection * view * model * glm::vec4(triangleVertex, 1.0f);
bool x_is_clipped = h_pos.x < -h_pos.w || h_pos.x > h_pos.w;
If you don't know how the orientation of the triangle is transformed by the model matrix and the view matrix, then you have to do this for all 3 vertex coordinates of the triangle, as in the sketch below.
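As a minimal sketch of that (model, view, projection and the vertex array are assumed names matching the question), the same test extended to all three vertices and to all four screen edges:

#include <glm/glm.hpp>

bool triangleTouchesScreenEdge(const glm::vec3 triangleVertices[3],
                               const glm::mat4& model,
                               const glm::mat4& view,
                               const glm::mat4& projection)
{
    for (int i = 0; i < 3; ++i)
    {
        glm::vec4 h = projection * view * model * glm::vec4(triangleVertices[i], 1.0f);

        // A vertex is clipped as soon as one coordinate leaves -w <= x, y <= w.
        if (h.x < -h.w || h.x > h.w ||   // left / right edge
            h.y < -h.w || h.y > h.w)     // bottom / top edge
            return true;
    }
    return false;
}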
I have been playing around with OpenGL and matrix operations and I understand the concept of P * V * M, but I cannot understand why changing the Z position of the 'camera' does not have the effect of zooming.
When using a perspective projection, changing the Z of the camera has the effect of a zoom (as I'd expect).
glm::mat4 Projection = glm::perspective(45.0f, 4.0f / 3.0f, 0.1f, 100.0f);
glm::mat4 View = glm::lookAt(
glm::vec3(0,0,3), // changing 3 to 8 will zoom out
glm::vec3(0,0,0),
glm::vec3(0,1,0)
);
glm::mat4 Model = glm::mat4(1.0f);
glm::mat4 MVP = Projection * View * Model;
However, when I use an ortho projection, changing the 3 to 8 (or anything else) does not have the effect of zooming out. I know they are very different projections, but I am looking for an explanation (the math behind why it doesn't work would be especially helpful).
glm::mat4 Projection = glm::ortho(
0.0f,
128.0f,
0.0f,
72.0f,
0.0f,
100.0f
);
That's how orthographic projections work. Let's start with a perspective transform:
You get the projection of an object by following a straight line to the camera:
If you move the camera closer, then you will see that the projected area increases:
Orthographic projections work differently. You get the projection by following a straight line that is perpendicular to the image plane:
And obviously, the size of the projected area does not depend on how far the camera is away from the object. That's because the projection lines will always be parallel and preserve the size of the object in the two directions of the image plane.
When you change the Z coordinate from 3 to 8, you're not actually zooming out, you're just moving the camera farther away. You can zoom out without moving the camera by changing the first parameter for glm::perspective.
An orthographic camera doesn't have a location (you can think of it as infinitely far away), so it's not possible to "move" an orthographic camera in the same way. You can zoom out by changing the bounds passed to glm::ortho. Simply pass larger numbers to glm::ortho.
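A minimal sketch of both options, reusing the parameter values from the question and a single assumed zoom factor (values > 1 zoom in):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Zooming without moving the camera.
glm::mat4 perspectiveZoom(float zoom)
{
    // Shrinking the field of view magnifies the image.
    // 45.0f as in the question (current GLM expects radians: glm::radians(45.0f)).
    return glm::perspective(45.0f / zoom, 4.0f / 3.0f, 0.1f, 100.0f);
}

glm::mat4 orthoZoom(float zoom)
{
    // Shrinking the visible rectangle has the same magnifying effect.
    return glm::ortho(0.0f, 128.0f / zoom, 0.0f, 72.0f / zoom, 0.0f, 100.0f);
}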
Look at what happens when you move a perspective camera:
Here: (xe, ye, ze) - point in the eye coordinate system, (xp, yp, zp) - projection of that point
n - distance to the near plane
t - distance to the top plane of the frustum
You can see that when you approach the camera, xp and yp will grow.
In contrast, changing the z position of an orthogonal camera won't affect xp and yp, but it will still affect zp, and thus will change the value in the depth buffer.
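A minimal numeric sketch of that difference (n and the sample values are mine): by similar triangles a perspective projection gives xp = xe * n / -ze, while an orthographic projection gives xp = xe regardless of ze:

#include <cstdio>
#include <initializer_list>

int main()
{
    const float n  = 1.0f;   // distance to the near plane
    const float xe = 2.0f;   // eye-space x of the point

    // Move the point closer to the camera (ze is negative in front of it).
    for (float ze : { -10.0f, -5.0f, -2.0f })
    {
        float xp_persp = xe * n / -ze;  // similar triangles: grows as the point approaches
        float xp_ortho = xe;            // independent of ze
        std::printf("ze = %5.1f   perspective xp = %.3f   ortho xp = %.3f\n",
                    ze, xp_persp, xp_ortho);
    }
    return 0;
}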
I am using the default OpenGL values like glDepthRangef(0.0, 1.0);, glDepthFunc(GL_LESS); and glClearDepthf(1.f); because my projection matrices convert the right-handed coordinates to left-handed ones. I mean, my near plane and far plane z-values are supposed to map to [-1, 1] in NDC.
The problem is that when I draw two objects into the one FBO (sharing the same RBOs), for example with the code below,
glEnable(GL_DEPTH_TEST);
glClearDepthf(1.f);
glClearColor(0.0,0.0,0.0,0.0);
glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);
drawObj1(); // this uses 1) the orthogonal projection below
drawObj2(); // this uses 2) the perspective projection below
glDisable(GL_DEPTH_TEST);
object1 always ends up above object2.
1) orthogonal
2) perspective
However, when they both use the same projection, whichever it is, it works fine.
Which part do you think I should go over?
--Updated--
Converting eye coordinates to NDC to screen coordinates, what really happens?
My understanding is that because the NDC shape after both projections is the same (as in the images below), the z-value after multiplying by the 2) perspective matrix shouldn't have to be distorted. However, according to derbass's good answer, if the z-value in the view coordinates is multiplied by the perspective matrix, the z-value will be hyperbolically distorted in NDC.
If so, if one vertex position is, for example, [-240.0, 0.0, -100.0] in eye (view) coordinates with [w:480.0, h:320.0], and I clip it with [-0.01, -100], would it be [-1, 0, -1] or [something >= -1, 0, -1] in NDC? And is its z-value still -1, even though the z-value is distorted?
1) Orthogonal
2) Perspective
You can't expect that the z values of your vertices are projected to the same window space z value just because you use the same near and far values for a perspective and an orthogonal projection matrix.
In the perspective case, the eye space z value will be hyperbolically distorted to the NDC z value. In the orthogonal case, it is just linearly scaled and shifted.
If your "Obj2" lies just in a flat plane z_eye=const, you can pre-calculate the distorted depth it should have in the perspective case. But if it has a non-zero extent into depth, this will not work. I can think of different approaches to deal with the situation:
"Fix" the depth of object two in the fragment shader by adjusting the gl_FragDepth according to the hyperbolic distortion your z buffer expects.
Use a linear z-buffer, a.k.a. a w-buffer.
These approaches are conceptually the inverse of each other. In both cases, you have to play with gl_FragDepth so that it matches the conventions of the other render pass; a sketch of the first option follows below.
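A minimal sketch of that pre-calculation for a flat object (the near/far values, the constant eye-space z and the glDepthRange(0,1) mapping are assumptions, not taken from your setup):

#include <cstdio>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    const float n = 0.1f, f = 100.0f;          // assumed near/far, same for both passes
    glm::mat4 persp = glm::perspective(glm::radians(45.0f), 480.0f / 320.0f, n, f);

    const float z_eye = -2.5f;                 // the flat plane the object lies in
    glm::vec4 clip  = persp * glm::vec4(0.0f, 0.0f, z_eye, 1.0f);
    float z_ndc     = clip.z / clip.w;         // hyperbolically distorted NDC z
    float depth_win = 0.5f * z_ndc + 0.5f;     // window depth with glDepthRange(0,1)

    // depth_win is the value the other pass would have to write via gl_FragDepth
    // so that both passes compare depths under the same convention.
    std::printf("z_ndc = %f, window depth = %f\n", z_ndc, depth_win);
    return 0;
}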
UPDATE
My understanding is that because the NDC shape after both projections is the same (as in the images below), the z-value after multiplying by the 2) perspective matrix shouldn't have to be distorted.
Well, these images show the conversion from clip space to NDC, and that transformation is what the projection matrix followed by the perspective divide does. Once a value is in normalized device coordinates, no further distortion occurs. It is just linearly transformed to window space z according to the glDepthRange() setup.
However, according to derbass's good answer, if the z-value in the view coordinates is multiplied by the perspective matrix, the z-value will be hyperbolically distorted in NDC.
The perspective matrix is applied to the complete 4D homogeneous eye space vector, so it is applied to z_eye as well as to x_eye, y_eye and also w_eye (which is typically just 1, but doesn't have to be).
So the resulting NDC coordinates for the perspective case are hyperbolically distorted to
z_ndc = (f + n) / (n - f) + (2 * f * n) / ((n - f) * z_eye) = A + B / z_eye
while, in the orthogonal case, they are just linearly transformed to
z_ndc = -2 / (f - n) * z_eye - (f + n) / (f - n) = C * z_eye + D
For n=1 and f=10, it will look like this (note that I plotted the range partly outside of the frustum; clipping will prevent these values from occurring in the GL, of course).
If so, if one vertex position is, for example, [-240.0, 0.0, -100.0] in eye (view) coordinates with [w:480.0, h:320.0], and I clip it with [-0.01, -100], would it be [-1, 0, -1] or [something >= -1, 0, -1] in NDC? And is its z-value still -1, even though the z-value is distorted?
Points at the far plane are always transformed to z_ndc=1, and points at the near plane to z_ndc=-1. This is how the projection matrices were constructed, and this is exactly where the two graphs in the plot above intersect. So for these trivial cases, the different mappings do not matter at all. But for all other distances, they will.
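To see this numerically, here is a minimal sketch (with the n=1, f=10 values from the plot and otherwise assumed parameters) comparing the two mappings; they agree only at the near and far planes:

#include <cstdio>
#include <initializer_list>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    const float n = 1.0f, f = 10.0f;
    glm::mat4 persp = glm::perspective(glm::radians(45.0f), 1.0f, n, f);
    glm::mat4 ortho = glm::ortho(-1.0f, 1.0f, -1.0f, 1.0f, n, f);

    for (float z_eye : { -1.0f, -2.0f, -5.0f, -10.0f })
    {
        glm::vec4 p = persp * glm::vec4(0.0f, 0.0f, z_eye, 1.0f);
        glm::vec4 o = ortho * glm::vec4(0.0f, 0.0f, z_eye, 1.0f);
        // Both agree at z_eye = -n (-1) and z_eye = -f (+1), and differ in between.
        std::printf("z_eye = %5.1f   perspective z_ndc = %6.3f   ortho z_ndc = %6.3f\n",
                    z_eye, p.z / p.w, o.z / o.w);
    }
    return 0;
}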