I want to convert 2D screen coordinates to 3D world coordinates. I have searched a lot but I did not get any satisfying result.
Note: I am not using OpenGL nor any other graphics library.
Data which I have:
Screen X
Screen Y
Screen Height
Screen Width
Aspect Ratio
If you have the Camera world Matrix and Projection Matrix this is pretty simple.
If you don't have the world matrix, you can compute it from the camera's position and rotation.
worldMatrix = Translate(x, y, z) * RotateZ(z_angle) * RotateY(y_angle) * RotateX(x_angle);
Where Translate returns a 4x4 translation matrix and each Rotate function returns a 4x4 rotation matrix around the given axis.
The projection matrix can be calculated from the aspect ratio, field of view angle, and near and far planes.
This blog has a good explanation of how to calculate the projection matrix.
You can unproject the screen coordinates by doing:
mat = worldMatrix * inverse(ProjectionMatrix)
dir = transpose(mat) * <x_screen, y_screen, 0.5, 1>   // x_screen, y_screen normalized to [-1, 1]
dir /= dir.w   // perspective divide by the w component of the transformed point
dir -= camera.position
Your ray will point from the camera in the direction dir.
This should work, but it's not a very concrete example of how to do this.
Basically you just need to do the following steps:
calculate camera's worldMatrix
calculate camera's projection matrix
multiply worldMatrix with inverse projection matrix.
create a point <Screen_X_Value, Screen_Y_Value, SOME_POSITIVE_Z_VALUE, 1>
apply this "inverse" projection to your point.
then subtract the camera's position from this point.
The resulting vector is the direction from the camera. Any point along that ray is a 3D coordinate corresponding to your 2D screen coordinate.
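The steps above can be sketched without any graphics library, assuming a symmetric perspective projection described by a vertical field of view, a camera that looks down its local -Z axis, and a 3x3 camera-to-world rotation. The names Vec3, screenToRay and the row-major rot array are hypothetical, and for this special case the inverse projection reduces to the closed form below rather than a general 4x4 inverse.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Returns the normalized world-space direction of the ray through a pixel.
// Assumptions (not from the original post): screen origin at the top-left,
// y growing downward, fovY is the vertical field of view in radians, the
// camera looks down -Z in its own space, and rot is the row-major
// camera-to-world rotation matrix.
Vec3 screenToRay(double sx, double sy, double width, double height,
                 double fovY, const double rot[3][3]) {
    const double aspect = width / height;
    const double t = std::tan(fovY * 0.5);

    // Screen coordinates -> normalized device coordinates in [-1, 1].
    const double nx = 2.0 * sx / width - 1.0;
    const double ny = 1.0 - 2.0 * sy / height;

    // Inverse projection for a symmetric frustum: a point on the plane z = -1.
    const Vec3 c{nx * aspect * t, ny * t, -1.0};

    // Camera space -> world space (rotation only, because this is a direction).
    const Vec3 w{rot[0][0] * c.x + rot[0][1] * c.y + rot[0][2] * c.z,
                 rot[1][0] * c.x + rot[1][1] * c.y + rot[1][2] * c.z,
                 rot[2][0] * c.x + rot[2][1] * c.y + rot[2][2] * c.z};

    const double len = std::sqrt(w.x * w.x + w.y * w.y + w.z * w.z);
    return {w.x / len, w.y / len, w.z / len};
}
```

Every world point of the form camera.position + s * dir with s > 0 then projects back onto the same pixel.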
I am trying to orient a 3D object at the world origin such that it doesn't change its position relative to the camera when I move the camera OR change its field of view. I tried doing this:
Object Transform = Inverse(CameraProjectionMatrix)
How do I undo the perspective divide? When I change the FOV, the object is affected by it.
In detail it looks like
origin(0.0, 0.0, 0.0, 1.0f);
projViewInverse = Camera.projViewMatrix().inverse();
projectionMatrix = Camera.projViewMatrix();
projectedOrigin = projectionMatrix * origin;
topRight(0.5f, 0.5f, 0.f);
scaleFactor = 1.0/projectedOrigin.z();
scale(scaleFactor,scaleFactor,scaleFactor);
finalMatrix = projViewInverse * Scaling(w) * Translation(topRight);
If you use a graphics pipeline where positions (w=1.0) and vectors (w=0.0) are transformed to NDC like this:
(x',y',z',w') = M*(x,y,z,w) // applying transforms
(x'',y'') = (x',y')/w' // perspective divide
where M is the product of all your 4x4 homogeneous transform matrices, multiplied together in their order. If you want to go back to the original (x,y,z) you need to know w', which can be computed from z'. The equation depends on your projection. In that case you can do this:
w' = f(z') // z' is usually the value encoded in the depth buffer and can be obtained from it
(x',y') = (x'',y'')*w' // screen -> camera
(x,y,z,w) = Inverse(M)*(x',y',z',w') // camera -> world
However, this can be used only if you know z' and can derive w' from it. If you cannot, the usual approach is to cast a ray from the camera focal point through (x'',y'') and stop at the wanted perpendicular distance from the camera. For a perspective projection you can look at it as triangle similarity:
So for each vertex you want to transform, you need its projected x'',y'' position on the znear plane (screen), and then you just scale x'',y'' by the ratio between the distances to the camera focal point (*z1/z0). Now all we need is the focal length z0. That one depends on the kind of projection matrix you use. I usually encounter 2 versions: in the camera coordinate system, point (0,0,0) is either the focal point or lies on the znear plane. However, the projection matrix can be anything, hence the focal point position can vary too ...
When you have to deal with aspect ratio, the first method handles it internally, as it is inside M. The second method needs the inverse of the aspect ratio correction applied before the conversion, so apply it directly to x'',y''.
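As one concrete instance of w' = f(z') for the first method, the sketch below assumes an OpenGL-style perspective matrix with near plane n and far plane f, takes the depth as an NDC value in [-1, 1], and returns w' as the positive distance along the viewing axis. The function names are hypothetical, and other conventions (for example a [0, 1] depth range) need a different formula.

```cpp
struct Vec2 { double x, y; };

// NDC depth in [-1, 1] -> clip-space w for an OpenGL-style perspective
// projection with near plane n and far plane f (both positive).
// Assumption: w equals the view-space distance along the viewing axis.
double clipWFromNdcDepth(double zNdc, double n, double f) {
    return 2.0 * n * f / ((f + n) - zNdc * (f - n));
}

// Undo the perspective divide for the x and y components:
// (x', y') = (x'', y'') * w'
Vec2 undoPerspectiveDivide(Vec2 ndc, double w) {
    return {ndc.x * w, ndc.y * w};
}
```

The recovered (x', y') together with z' and w' can then be multiplied by Inverse(M) as in the steps above.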
Can someone tell me how to make triangle vertices collide with edges of the screen?
For math library I am using GLM and for window creation and keyboard/mouse input I am using GLFW.
I created perspective matrix and simple array of triangle vertices.
Then I multiplied all this in vertex shader like:
gl_Position = projection * view * model * vec4(pos, 1.0);
Projection matrix is defined as:
glm::mat4 projection = glm::perspective(
45.0f, (GLfloat)screenWidth / (GLfloat)screenHeight, 0.1f, 100.0f);
I have a fully working camera and projection. I can move around my "world" and see the triangle standing there. The problem is that I want to make sure the triangle collides with the edges of the screen.
What I did was disable the camera and only enable keyboard movement. Then I initialized the translation matrix as glm::translate(model, glm::vec3(xMove, yMove, -2.5f)); and the scale matrix to scale by 0.4.
Now all of that is working fine. When I press RIGHT the triangle moves to the right, when I press UP the triangle moves up, etc... The problem is I have no idea how to make it stop moving when it hits the edges.
This is what I have tried:
triangleRightVertex is a glm::vec3 object.
0.4 is scaling value that I used in scaling matrix.
if(((xMove + triangleRightVertex.x) * 0.4f) >= 1.0f)
{
cout << "Right side collision detected!" << endl;
}
When I move the triangle to the right, it does detect a collision when the x of the third vertex (the bottom right corner of the triangle) collides with the right side, but it goes a little bit beyond the edge before it detects it. But when I tried moving up, it detected a collision when half of the triangle was above the edge.
I have no idea what to do here. Can someone explain this to me, please?
Each of the vertex coordinates of the triangle is transformed by the model matrix from model space to world space, by the view matrix from world space to view space, and by the projection matrix from view space to clip space. gl_Position is the homogeneous coordinate in clip space, which is further transformed by the perspective divide from clip space to normalized device space. The normalized device space is a cube with its left, bottom, near corner at (-1, -1, -1) and its right, top, far corner at (1, 1, 1).
All the geometry which is in this (volume) cube is "visible" on the viewport.
In clip space the clipping of the scene is performed.
A point is inside the clipping volume if its x, y and z components are in the range defined by the negated w component and the w component of the homogeneous coordinates of the point:
-w <= x, y, z <= w
What you want to do is check whether the x coordinate of a triangle vertex is clipped. So you have to check whether the x component of the clip space coordinate is inside the view volume.
Calculate the clip space position of the vertices on the CPU, as the vertex shader does.
The glm library is very suitable for things like that:
glm::vec3 triangleVertex = ... ; // new model coordinate of the triangle
glm::vec4 h_pos = projection * view * model * glm::vec4(triangleVertex, 1.0);
bool x_is_clipped = h_pos.x < -h_pos.w || h_pos.x > h_pos.w;
If you don't know how the orientation of the triangle is transformed by the model matrix and view matrix, then you have to do this for all 3 vertex coordinates of the triangle.
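To see why comparing the translated x coordinate against 1.0 does not work, it helps to write the clip-space test out by hand. The sketch below assumes the view matrix is the identity, the projection is a symmetric perspective projection of the kind glm::perspective builds, and the vertex is already in view space (model matrix applied). isClippedX and its parameters are hypothetical names.

```cpp
#include <cmath>

// True if a view-space point lies outside the left or right frustum plane.
// Assumptions: fovY is the vertical field of view in radians, aspect is
// width / height, and the camera looks down -Z (so visible points have z < 0).
bool isClippedX(double x, double z, double fovY, double aspect) {
    // First row of the perspective matrix applied to (x, y, z, 1).
    const double clipX = x / (aspect * std::tan(fovY * 0.5));
    // Fourth row of the perspective matrix applied to (x, y, z, 1).
    const double clipW = -z;
    return clipX < -clipW || clipX > clipW;
}
```

With a 45 degree vertical field of view and a 4:3 aspect ratio, a vertex at z = -2.5 reaches the right edge at about x = 1.38 rather than 1.0, and the top edge sits at a different distance than the side edges, which matches the mismatch described in the question.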
I was trying to place a sphere in 3D space at a point the user selects in 2D screen space. For this I am trying to calculate the 3D point from the 2D point using the technique below, but it is not giving the correct result.
mousePosition.x = ((clickPos.clientX - window.left) / control.width) * 2 - 1;
mousePosition.y = -((clickPos.clientY - window.top) / control.height) * 2 + 1;
Then I am multiplying mousePosition by the inverse of the MVP matrix, but I am getting seemingly random numbers as the result.
for calculating MVP Matrix :
osg::Matrix mvp = _camera->getViewMatrix() * _camera->getProjectionMatrix();
How can I proceed? Thanks.
Under the assumption that the mouse position is normalized to the range [-1, 1] for x and y, the following code will give you 2 points in world coordinates projected from your mouse coords: nearPoint is the point in 3D lying on the camera frustum near plane, farPoint on the frustum far plane.
Then you can compute a line passing through these points and intersect it with your plane.
// compute the matrix to unproject the mouse coords (in homogeneous space)
osg::Matrix VP = _camera->getViewMatrix() * _camera->getProjectionMatrix();
osg::Matrix inverseVP;
inverseVP.invert(VP);
// compute world near far
osg::Vec3 nearPoint(mousePosition.x, mousePosition.y, -1.0f);
osg::Vec3 farPoint(mousePosition.x, mousePosition.y, 1.0f);
osg::Vec3 nearPointWorld = nearPoint * inverseVP;
osg::Vec3 farPointWorld = farPoint * inverseVP;
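The last step, intersecting the line through the two unprojected points with a plane, can be sketched as follows. This is a minimal version with a plain struct instead of osg::Vec3, assuming the plane is given as dot(n, p) = d; Vec3, dot and intersectLinePlane are hypothetical names.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Intersects the line through a and b with the plane dot(n, p) = d.
// Writes the hit point to out and returns true, or returns false when the
// line is (numerically) parallel to the plane.
bool intersectLinePlane(const Vec3& a, const Vec3& b,
                        const Vec3& n, double d, Vec3& out) {
    const Vec3 dir{b.x - a.x, b.y - a.y, b.z - a.z};
    const double denom = dot(n, dir);
    if (std::fabs(denom) < 1e-12) {
        return false;
    }
    // Parameter along the line: 0 at a (near point), 1 at b (far point).
    const double t = (d - dot(n, a)) / denom;
    out = Vec3{a.x + t * dir.x, a.y + t * dir.y, a.z + t * dir.z};
    return true;
}
```

Here a and b would be nearPointWorld and farPointWorld, and the resulting point is where the sphere would be placed.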
The target is to draw a shape, lets say a triangle, pixel-perfect (vertices shall be specified in pixels) and be able to transform it in the 3rd dimension.
I've tried it with an orthogonal projection matrix and everything works fine, but the shape doesn't have any depth: if I rotate it around the Y axis, it looks as if I were just scaling it along the X axis (because an orthogonal projection obviously behaves like this). Now I want to try it with a perspective projection. But with this projection the coordinate system changes completely, and because of this I can't specify my triangle's vertices in pixels. Also, if the size of my window changes, the size of the shape changes too (because of the changed coordinate system).
Is there any way to change the coordinate system of the perspective projection so that I can specify my vertices as if I were using the orthogonal projection? Or does anyone have an idea how to achieve the target described in the first sentence?
The projection matrix describes the mapping from 3D points of a scene, to 2D points of the viewport. It transforms from eye space to the clip space, and the coordinates in the clip space are transformed to the normalized device coordinates (NDC) by dividing with the w component of the clip coordinates. The NDC are in range (-1,-1,-1) to (1,1,1).
With perspective projection, the projection matrix describes the mapping from 3D points in the world, as they are seen from a pinhole camera, to 2D points of the viewport. The eye space coordinates in the camera frustum (a truncated pyramid) are mapped to a cube (the normalized device coordinates).
Perspective Projection Matrix:
r = right, l = left, b = bottom, t = top, n = near, f = far
2*n/(r-l)      0              0               0
0              2*n/(t-b)      0               0
(r+l)/(r-l)    (t+b)/(t-b)    -(f+n)/(f-n)    -1
0              0              -2*f*n/(f-n)    0
where:
aspect = w / h
tanFov = tan( fov_y * 0.5 );
prjMat[0][0] = 2*n/(r-l) = 1.0 / (tanFov * aspect)
prjMat[1][1] = 2*n/(t-b) = 1.0 / tanFov
I assume that the view matrix is the identity matrix, and thus the view space coordinates are equal to the world coordinates.
If you want to draw a polygon whose vertex coordinates translate 1:1 into pixels, then you have to draw the polygon in a plane parallel to the viewport. This means all points have to be drawn with the same depth. The depth has to be chosen such that transforming a point in normalized device coordinates by the inverse projection matrix gives the vertex coordinates in pixels. Note that the homogeneous coordinates produced by the inverse projection matrix have to be divided by their w component to get Cartesian coordinates.
This means, that the depth of the plane depends on the field of view angle of the projection:
Assuming you set up a perspective projection like this:
float vp_w = .... // width of the viewport in pixel
float vp_h = .... // height of the viewport in pixel
float fov_y = ..... // field of view angle (y axis) of the view port in degrees < 180°
gluPerspective( fov_y, vp_w / vp_h, 1.0, vp_h*2.0f );
Then the depthZ of the plane with a 1:1 relation of vertex coordinates and pixels, will be calculated like this:
float angRad = fov_y * PI / 180.0;
float depthZ = -vp_h / (2.0 * tan( angRad / 2.0 ));
Note that the center point of the projection onto the viewport is (0,0), so the bottom left corner point of the plane is (-vp_w/2, -vp_h/2, depthZ) and the top right corner point is (vp_w/2, vp_h/2, depthZ). Ensure that the near plane of the perspective projection is less than -depthZ and the far plane is greater than -depthZ.
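As a quick check of the depthZ formula, the following sketch computes the depth and then projects a point on that plane back to normalized device coordinates. It assumes an identity view matrix and a symmetric perspective projection as above; pixelPerfectDepth and ndcY are hypothetical helper names.

```cpp
#include <cmath>

// Depth of the plane on which one unit equals one pixel, for a viewport
// height in pixels and a vertical field of view in degrees (< 180).
double pixelPerfectDepth(double vpHeight, double fovYDegrees) {
    const double pi = std::acos(-1.0);
    const double angRad = fovYDegrees * pi / 180.0;
    return -vpHeight / (2.0 * std::tan(angRad / 2.0));
}

// NDC y coordinate of a view-space point (y, z) under the same projection:
// clip y = y / tan(fov_y / 2), clip w = -z, NDC y = clip y / clip w.
double ndcY(double y, double z, double fovYDegrees) {
    const double pi = std::acos(-1.0);
    const double angRad = fovYDegrees * pi / 180.0;
    return (y / std::tan(angRad / 2.0)) / -z;
}
```

For a 480 pixel high viewport, a vertex at y = 240 on that plane lands on the top edge (NDC y = 1) and a vertex at y = 120 lands halfway between the center and the top edge, whatever the field of view is.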
See further:
Both depth buffer and triangle face orientation are reversed in OpenGL
Transform the modelMatrix
In computer vision, when I need to convert a cv::Mat from object (world) space coordinates to camera space, or a camera-centric coordinate, I use the following
code (where rvec and tvec are the rotation and translation vectors of the camera):
cv::Mat R; //holds rotation matrix
cv::Rodrigues(rvec, R); // converts a rotation vector to a 3x3 matrix
R = R.t(); // rotation of inverse
tvec2 = -R * tvec; // translation of inverse
giving me tvec2 as the camera space coordinate.
My current problem is the opposite. I have an array of 3D points in camera space, and I need to convert them to world space. What does the inverse of the above function look like?
Thank you.
What I would advise you to do is use a 4x4 projection matrix (strictly speaking, a rigid transformation matrix that combines rotation and translation). It makes it much easier to transform your points from the World Coordinate System (WCS) to your Camera Coordinate System (CCS). Let's say you have a point in the WCS at {0,0,0}, and your camera at {10, 0, 0} with a rotation about the Y axis of 45 degrees. You can create your projection matrix like this:
rot = 45 degrees = 0.785398 rad.
Rotation: (3x3)
Translation: (1x3)
Your projection Matrix will be: (4x4)
With the projection matrix, let's say you have a point in your WCS at:
just change it to a point at:
then what you can do to change it to your CCS is:
PointInCCS = projectionMatrix * PointInWCS;
to return in WCS just use the matrix inverse:
PointInWCS = projectionMatrix.inv() * PointInCCS;
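For the OpenCV case in the question, assuming rvec and tvec map world points to camera points as Xc = R * Xw + t, the inverse does not need a general matrix inversion because R is orthonormal. The following is a minimal sketch with plain arrays instead of cv::Mat; Vec3, worldToCamera and cameraToWorld are hypothetical names.

```cpp
struct Vec3 { double x, y, z; };

// World -> camera: Xc = R * Xw + t, with R a row-major rotation matrix.
Vec3 worldToCamera(const double R[3][3], const Vec3& t, const Vec3& p) {
    return {R[0][0] * p.x + R[0][1] * p.y + R[0][2] * p.z + t.x,
            R[1][0] * p.x + R[1][1] * p.y + R[1][2] * p.z + t.y,
            R[2][0] * p.x + R[2][1] * p.y + R[2][2] * p.z + t.z};
}

// Camera -> world: Xw = R^T * (Xc - t). This is valid because the inverse
// of a rotation matrix is its transpose.
Vec3 cameraToWorld(const double R[3][3], const Vec3& t, const Vec3& p) {
    const Vec3 q{p.x - t.x, p.y - t.y, p.z - t.z};
    return {R[0][0] * q.x + R[1][0] * q.y + R[2][0] * q.z,
            R[0][1] * q.x + R[1][1] * q.y + R[2][1] * q.z,
            R[0][2] * q.x + R[1][2] * q.y + R[2][2] * q.z};
}
```

In terms of the question's variables, after R = R.t() and tvec2 = -R * tvec, a camera-space point Xc maps to world space as R * Xc + tvec2, which is the same expression expanded.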