Unprojecting 2d screen position into 3d world space - c++

I am using the GLM maths library for the following problem: converting a 2d screen position into 3d world space.
In an attempt to track down the problem, I have simplified the code to the following:
float screenW = 800.0f;
float screenH = 600.0f;
glm::vec4 viewport = glm::vec4(0.0f, 0.0f, screenW, screenH);
glm::mat4 tmpView(1.0f);
glm::mat4 tmpProj = glm::perspective( 90.0f, screenW/screenH, 0.1f, 100000.0f);
glm::vec3 screenPos = glm::vec3(0.0f, 0.0f, 1.0f);
glm::vec3 worldPos = glm::unProject(screenPos, tmpView, tmpProj, viewport);
With glm::unProject in this case, I would expect worldPos to be (0, 0, 1). However, it is coming through as (127100.12, -95325.094, -95325.094).
Am I misunderstanding what glm::unProject is supposed to do? I have traced through the function and it seems to be working OK.

The Z component in screenPos corresponds to the values in the depth buffer. So 0.0f is the near clip plane and 1.0f is the far clip plane.
If you want to find the world pos that is one unit away from the screen, you can rescale the vector:
worldPos = worldPos / (worldPos.z * -1.f);
Note also that a screenPos of 0,0 designates the bottom-left corner of the screen, while in worldPos 0,0 is the center of the screen. So after this rescaling, 0,0,1 should give you -1.3333,-1,-1, and 400,300,1 should give you 0,0,-1.
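To make the numbers above easier to check, here is a minimal sketch that does the same unprojection by hand, without GLM. It assumes an identity view matrix, a viewport starting at (0, 0), a symmetric OpenGL-style perspective projection and a 90 degree vertical FOV given in radians. The names Vec3, unprojectIdentityView and atUnitDepth are made up for this illustration and are not part of GLM.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Hypothetical helper (not a GLM function): unprojects a window-space point,
// assuming an identity view matrix and a symmetric perspective projection.
// winZ is the depth-buffer value: 0 = near clip plane, 1 = far clip plane.
Vec3 unprojectIdentityView(double winX, double winY, double winZ,
                           double fovYRadians, double width, double height,
                           double zNear, double zFar) {
    // Window coordinates (origin at the bottom left) to NDC in [-1, 1].
    double ndcX = 2.0 * winX / width - 1.0;
    double ndcY = 2.0 * winY / height - 1.0;
    // Depth-buffer value to distance in front of the camera.
    double dist = zNear * zFar / (zFar - winZ * (zFar - zNear));
    double halfTan = std::tan(fovYRadians / 2.0);
    double aspect = width / height;
    // The camera looks down the negative z axis.
    return { ndcX * aspect * halfTan * dist, ndcY * halfTan * dist, -dist };
}

// Rescales a point so that it lies one unit in front of the camera,
// i.e. worldPos / (worldPos.z * -1).
Vec3 atUnitDepth(Vec3 p) {
    double s = -p.z;
    return { p.x / s, p.y / s, p.z / s };
}
```

With winZ = 1 the unprojected point lies on the far plane, so its coordinates scale with the far distance. After atUnitDepth, the window position (0, 0) ends up at roughly (-1.3333, -1, -1) and (400, 300) at (0, 0, -1) for an 800 by 600 viewport.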

Related

For mouse click ray casting a line, why aren't my starting rays updating to my camera position after I move my camera?

When the camera is moved around, why are my starting rays still stuck at the origin 0, 0, 0 even though the camera position has been updated?
It works fine if I start the program and my camera position is at the default 0, 0, 0. But once I move my camera, for instance pan to the right, and click some more, the lines still come from 0, 0, 0 when they should start from wherever the camera is. Am I doing something terribly wrong? I've checked to make sure the matrices are being updated in the main loop. I've used the code snippet below, referenced from:
picking in 3D with ray-tracing using NinevehGL or OpenGL i-phone
// 1. Get mouse coordinates then normalize
float x = (2.0f * lastX) / width - 1.0f;
float y = 1.0f - (2.0f * lastY) / height;
// 2. Move from clip space to world space
glm::mat4 inverseWorldMatrix = glm::inverse(proj * view);
glm::vec4 near_vec = glm::vec4(x, y, -1.0f, 1.0f);
glm::vec4 far_vec = glm::vec4(x, y, 1.0f, 1.0f);
glm::vec4 startRay = inverseWorldMatrix * near_vec;
glm::vec4 endRay = inverseWorldMatrix * far_vec;
// perspective divide
startRay /= startRay.w;
endRay /= endRay.w;
glm::vec3 direction = glm::vec3(endRay - startRay);
// start the ray points from the camera position
glm::vec3 startPos = glm::vec3(camera.GetPosition());
glm::vec3 endPos = glm::vec3(startPos + direction * someLength);
In the first screenshot I click to cast some rays. In the second I move my camera to the right and click some more, but the starting points of the rays are still at 0, 0, 0. What I'm looking for is for the rays to come out of wherever the camera position is, as in the third image (the red rays). Sorry for the confusion: the red lines are supposed to shoot out into the distance, not up.
// and these are my matrices
// projection
glm::mat4 proj = glm::perspective(glm::radians(camera.GetFov()), (float)width / height, 0.1f, 100.0f);
// view
glm::mat4 view = camera.GetViewMatrix(); // This returns glm::lookAt(this->Position, this->Position + this->Front, this->Up);
// model
glm::mat4 model = glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 0.0f, 0.0f));
It's hard to tell where in the code the problem lies. But I use this function for ray casting, which is adapted from code from scratch-a-pixel and learnopengl:
vec3 rayCast(double xpos, double ypos, mat4 projection, mat4 view) {
// converts a position from the 2d xpos, ypos to a normalized 3d direction
float x = (2.0f * xpos) / WIDTH - 1.0f;
float y = 1.0f - (2.0f * ypos) / HEIGHT;
float z = 1.0f;
vec3 ray_nds = vec3(x, y, z);
vec4 ray_clip = vec4(ray_nds.x, ray_nds.y, -1.0f, 1.0f);
// eye space to clip we would multiply by projection so
// clip space to eye space is the inverse projection
vec4 ray_eye = inverse(projection) * ray_clip;
// convert point to forwards
ray_eye = vec4(ray_eye.x, ray_eye.y, -1.0f, 0.0f);
// world space to eye space is usually multiply by view so
// eye space to world space is inverse view
vec4 inv_ray_wor = (inverse(view) * ray_eye);
vec3 ray_wor = vec3(inv_ray_wor.x, inv_ray_wor.y, inv_ray_wor.z);
ray_wor = normalize(ray_wor);
return ray_wor;
}
where you can draw your line with startPos = camera.Position and endPos = camera.Position + rayCast(...) * scalar_amount.
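For comparison, here is a self-contained sketch of the same idea that avoids matrix inverses by building the ray directly from the camera's basis vectors. This is a swapped-in technique, not the code from the answer above. It assumes mouse coordinates with the origin at the top left, a vertical FOV in radians, and a camera described by a position, a front vector and a world up vector. Vec3, Ray and pickRay are hypothetical names for this illustration.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin; Vec3 direction; };

Vec3 add(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
Vec3 scale(Vec3 a, double s) { return { a.x * s, a.y * s, a.z * s }; }
Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
Vec3 normalize(Vec3 a) {
    double len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return { a.x / len, a.y / len, a.z / len };
}

// Hypothetical helper: builds a world-space picking ray for a mouse position.
// The ray always starts at the camera position, never at the world origin.
Ray pickRay(double mouseX, double mouseY, double width, double height,
            double fovYRadians, Vec3 cameraPos, Vec3 front, Vec3 worldUp) {
    // Mouse coordinates (origin top left) to NDC in [-1, 1], y pointing up.
    double ndcX = 2.0 * mouseX / width - 1.0;
    double ndcY = 1.0 - 2.0 * mouseY / height;
    // Orthonormal camera basis, built the same way a look-at matrix builds it.
    Vec3 f = normalize(front);
    Vec3 right = normalize(cross(f, worldUp));
    Vec3 up = cross(right, f);
    // Offset from the view axis on a plane one unit in front of the camera.
    double halfTan = std::tan(fovYRadians / 2.0);
    double aspect = width / height;
    Vec3 dir = add(f, add(scale(right, ndcX * aspect * halfTan),
                          scale(up, ndcY * halfTan)));
    return { cameraPos, normalize(dir) };
}
```

A line for the click could then be drawn from ray.origin to ray.origin + ray.direction * someLength, which keeps the start point attached to the camera wherever it moves.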

Create an OpenGL 2D view camera but using model view projection cameras

I am trying to create a 2D, top down, style camera in OpenGL. I would like to stick to the convention of using model-view-projection matrices, this is so I can switch between a 3D view and a top down view while the application runs. I am actually using the glm::lookAt method to create the view matrix.
However there is something missing in my understanding. I am rendering a triangle on the screen, [very close to this tutorial][1], and that works perfectly fine (so no problems with windowing, display loops, vertex buffers, etc). The triangle is centered at (0, 0), and vertices are on -0.5/0.5 (so already in NDC).
I then added a uniform mat4 mpv; to the vertex shader. If I set the mpv matrix to:
glm::vec3 camera_pos = glm::vec3(0.0f, 0.0f, -1.0f);
glm::vec3 target_pos = glm::vec3(0.0f, 0.0f, 0.0f);
glm::mat4 view = glm::lookAt(camera_pos, target_pos, glm::vec3(0.0f, 1.0f, 0.0f));
I get the same, unmodified triangle as expected as these are (from my understanding) the default values for OpenGL.
Now I thought if I changed the Z value of the camera position it would have the same effect as zooming in and out, however all I get is the clear color, no triangle is rendered.
// Trying to simulate zoom in and out by changing z value of camera
glm::vec3 camera_pos = glm::vec3(0.0f, 0.0f, -3.0f);
glm::vec3 target_pos = glm::vec3(0.0f, 0.0f, 0.0f);
glm::mat4 view = glm::lookAt(camera_pos, target_pos, glm::vec3(0.0f, 1.0f, 0.0f));
So I printed the view matrix, and noticed that all I was doing was translating the Z value, which makes sense.
I then added an ortho projection matrix, to make sure everything is in NDC, but I still get nothing.
// *2 cause im on a Mac/high-res screen and the frame buffer scale is 2.
// Doing projection * view in one step and just updating view uniform until I get it working.
view = glm::ortho(0.0f, 800.0f * 2, 0.0f, 600.0f * 2, 0.1f, 100.0f) * view;
Where is my misunderstanding taking place. I would like to:
Simulate a top down view where I can zoom in and out on the target.
Create a 2D camera that follows a target (racing car), so the camera_pos XY and target_pos XY will be the same.
Eventually add an option to switch to a 3D following camera, like a standard racing game 3rd person view, hence the MPV vs just using simple translations.
[1]: https://learnopengl.com/Getting-started/Hello-Triangle
The vertex coordinates are in the range [-0.5, 0.5], but the orthographic projection maps the cuboid volume with the left, bottom, near point (0, 0, 0.1) and the right, top, far point (800.0 * 2, 600.0 * 2, 100) to the viewport.
Therefore, the triangle mesh covers just one fragment in the lower left of the viewport.
Change the orthographic projection from
view = glm::ortho(0.0f, 800.0f * 2, 0.0f, 600.0f * 2, 0.1f, 100.0f) * view;
to
view = glm::ortho(-1.0f, 1.0f, -1.0f, 1.0f, 0.1f, 100.0f) * view;

Incorrect ray direction from inverse vp matrix and camera position

I have a problem with my ray generation that I do not understand. The direction for my ray is computed wrongly. I ported this code from DirectX 11, where it works fine, to Vulkan, so I was surprised I could not get it to work:
vec4 farPos = inverseViewProj * vec4(screenPos, 1, 1);
farPos /= farPos.w;
r.Origin = camPos.xyz;
r.Direction = normalize(farPos.xyz - camPos.xyz);
Yet this code works perfectly:
vec4 nearPos = inverseViewProj * vec4(screenPos, 0, 1);
nearPos /= nearPos.w;
vec4 farPos = inverseViewProj * vec4(screenPos, 1, 1);
farPos /= farPos.w;
r.Origin = camPos.xyz;
r.Direction = normalize(farPos.xyz - nearPos.xyz);
[Edit] Matrix and camera positions are set like this:
const glm::mat4 clip(1.0f, 0.0f, 0.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f, 0.0f, 0.0f, 0.5f, 1.0f);
projMatrix = clip * glm::perspectiveFov(FieldOfView, float(ViewWidth), float(ViewHeight), NearZ, FarZ);
viewMatrix = glm::inverse(glm::translate(glm::toMat4(Rotation), -Position));
buffer.inverseViewProjMatrix = glm::inverse(projMatrix * viewMatrix);
buffer.camPos = viewMatrix[3];
[Edit2] What I see on screen is correct if I start at the origin. However, if I move left, for example, it looks as if I am moving right. All my rays seem to be perturbed. In some cases, strafing the camera looks as if I am moving around a different point in space. I assume the camera position is not equal to the singularity of my perspective matrix, yet I can not figure out why.
I think I am misunderstanding something basic. What am I missing?
Thanks to the comments I have found the problem. I was building my view matrix incorrectly, in the exact same way as in this post:
glm::inverse(glm::translate(glm::toMat4(Rotation), -Position));
This is equal to translating first and then rotating, which of course leads to something unwanted. In addition, the Position was negative and camPos was obtained using the last column of the view matrix instead of the inverse view matrix, which is wrong.
It was not noticeable with my fractal raycaster simply because I never moved far away from the origin. That, and the fact that there is no point of reference in such an environment.
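The point about the last column can be illustrated with a minimal sketch that does not use GLM. It assumes a camera described by a camera-to-world rotation and a world-space position, and stores the view transform as a rotation plus a translation. Mat3, View, makeView, toView and cameraPosition are hypothetical names for this illustration.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Mat3 { double m[3][3]; };        // row-major: m[row][col]
struct View { Mat3 rot; Vec3 trans; };  // x_view = rot * x_world + trans

Mat3 rotationY(double angle) {
    double c = std::cos(angle), s = std::sin(angle);
    return {{ { c, 0.0, s }, { 0.0, 1.0, 0.0 }, { -s, 0.0, c } }};
}
Mat3 transpose(const Mat3& a) {
    Mat3 t{};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            t.m[r][c] = a.m[c][r];
    return t;
}
Vec3 mul(const Mat3& a, Vec3 v) {
    return { a.m[0][0] * v.x + a.m[0][1] * v.y + a.m[0][2] * v.z,
             a.m[1][0] * v.x + a.m[1][1] * v.y + a.m[1][2] * v.z,
             a.m[2][0] * v.x + a.m[2][1] * v.y + a.m[2][2] * v.z };
}

// The view transform is the inverse of the camera's world transform:
// x_view = R^T * (x_world - P), so the translation part is -R^T * P, not P.
View makeView(const Mat3& camRot, Vec3 camPos) {
    Mat3 rt = transpose(camRot);
    Vec3 t = mul(rt, camPos);
    return { rt, { -t.x, -t.y, -t.z } };
}

Vec3 toView(const View& v, Vec3 world) {
    Vec3 r = mul(v.rot, world);
    return { r.x + v.trans.x, r.y + v.trans.y, r.z + v.trans.z };
}

// Recovers the camera position from the view transform. This equals the
// translation part of the inverse view transform.
Vec3 cameraPosition(const View& v) {
    Vec3 p = mul(transpose(v.rot), v.trans);
    return { -p.x, -p.y, -p.z };
}
```

As soon as the camera is rotated, the translation part of the view transform no longer matches the camera position, so the position has to be read from the inverse view transform instead.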

Pixel-perfect projection matrix?

I'm trying to understand how far should I place the camera position in the lookat function (or the object in the model matrix) to have pixel-perfect coordinates to pass in the vertex shader.
This is actually simple with orthographic projection matrices, but I fail to visualize how the math would work for perspective projection.
Here's the perspective matrix I'm using:
glm::mat4 projection = glm::perspective(45.0f, (float)SCR_WIDTH / (float)SCR_HEIGHT, 0.1f, 10000.0f);
vertex multiplication in the shader is as simple as:
gl_Position = projection * view * model * vec4(position.xy, 0.0f, 1.0);
I'm basically trying to show a quad on screen that needs to be rotated and show perspective effects (hence why I can't use orthographic projection), but I'd like to specify in pixel coordinates where and how big it should appear on screen.
Well it can only have pixel-coordinates in one "z-plane" if you want to use a trapezoid view-frustum.
Basic Math
If you use a standard camera at (0,0,0), with alpha being the vertical FOV (45° in your case), the basic math would be:
target_y = tan(alpha/2) * z_distance * ((pixel_y/height)*2-1)
target_x = tan(alpha/2) * aspect_ratio * z_distance * ((pixel_x/width)*2-1)
Reversing projection
As for the general case, you can "un-project" to find where a point in 3D should be before all transforms in order to end up at a specific point on screen.
Basically you need to un-do the math.
gl_Position = projection * view * model * vec4(position.xy, 0.0f, 1.0);
So if you have your final position and want to revert it you do:
unprojection = model^-1 * view^-1 *projection^-1 * gl_Position //not actual glsl notation, '^-1' being the inverse
This is basically what functions like gluUnProject or glm::gtc::matrix_transform::unProject do.
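But you should note that normalized device coordinates, i.e. the result of applying the projection matrix followed by the perspective divide, range from [-1,-1,-1] to [1,1,1] in OpenGL (depth runs from 0 to 1 in Direct3D and Vulkan), so if you want to enter pixel coordinates you can apply an additional matrix to transform into that space.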
Something like:
screenToClip =
[ 2/width, 0,        0, -1 ]
[ 0,       2/height, 0, -1 ]
[ 0,       0,        1,  0 ]
[ 0,       0,        0,  1 ]
would transform [0,0,0,1] to [-1,-1,0,1] and [width,height,0,1] to [1,1,0,1]
Also, you're probably best off trying some z-value like 0.5 to make sure that you're well within the view frustum and not clipping near the front or back.
You can achieve this effect with a 60 degree field of view. Basically you want to place the camera at a distance from the viewing plane such that the camera forms an equilateral triangle with center points at the top and bottom of the screen.
Here's some code to do that:
float fovY = 60.0f; // vertical field of view in degrees
float aspect = (float)nScreenWidth / (float)nScreenHeight; // casts avoid integer division
float zNearClip = 0.1f;
float zFarClip = nScreenHeight * 2.0f;
float degToRad = MF_PI / 180.0f;
float fH = tanf(fovY * degToRad / 2.0f) * zNearClip; // half height of the near plane
float fW = fH * aspect; // half width of the near plane
glFrustum(-fW, fW, -fH, fH, zNearClip, zFarClip);
// height of the equilateral triangle: sqrt(0.75) * nScreenHeight
float nCameraDistance = sqrtf(nScreenHeight * nScreenHeight - 0.25f * nScreenHeight * nScreenHeight);
glTranslatef(0, 0, -nCameraDistance);
You can also use a 90 degree fov. In that case the camera distance is 1/2 the height of the window. However, this has a lot of foreshortening.
In the 90 degree case, you could push the camera out by the full height, but then apply a 2x scaling to the x and y components (i.e. glScalef(2, 2, 1)).
Here's an image of what I mean:
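Both camera distances mentioned in this answer follow from the same relation: the view is exactly screenHeight units tall at a distance of (screenHeight / 2) / tan(fovY / 2). A minimal sketch, assuming a vertical FOV in degrees and one world unit per pixel; pixelPerfectDistance is a made-up name for this illustration.

```cpp
#include <cmath>

// Distance from the camera at which a perspective view with the given
// vertical field of view is exactly screenHeight world units tall.
double pixelPerfectDistance(double screenHeight, double fovYDegrees) {
    const double pi = std::acos(-1.0);
    double halfFovRadians = fovYDegrees * pi / 360.0;
    return (screenHeight / 2.0) / std::tan(halfFovRadians);
}
```

For a 600 pixel tall window this gives about 519.6 at 60 degrees (which equals sqrt(0.75) * 600) and 300 at 90 degrees, i.e. half the window height.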
I'll extend PeterT's answer and leave here the practical code I used to find the world coordinates of one of the frustum's planes through unprojection.
This assumes a basic view matrix (camera position at 0,0,0).
glm::mat4 projectionInv(0);
glm::mat4 projection = glm::perspective(45.0f, (float)SCR_WIDTH / (float)SCR_HEIGHT, 0.1f, 500.0f);
projectionInv = glm::inverse(projection);
std::vector<glm::vec4> NDCCube;
NDCCube.push_back(glm::vec4(-1.0f, -1.0f, -1.0f, 1.0f));
NDCCube.push_back(glm::vec4(1.0f, -1.0f, -1.0f, 1.0f));
NDCCube.push_back(glm::vec4(1.0f, -1.0f, 1.0f, 1.0f));
NDCCube.push_back(glm::vec4(-1.0f, -1.0f, 1.0f, 1.0f));
NDCCube.push_back(glm::vec4(-1.0f, 1.0f, -1.0f, 1.0f));
NDCCube.push_back(glm::vec4(1.0f, 1.0f, -1.0f, 1.0f));
NDCCube.push_back(glm::vec4(1.0f, 1.0f, 1.0f, 1.0f));
NDCCube.push_back(glm::vec4(-1.0f, 1.0f, 1.0f, 1.0f));
std::vector<glm::vec3> frustumVertices;
for (int i = 0; i < 8; i++)
{
// multiply by the inverse projection matrix to obtain the frustum vertex
glm::vec4 tempvec = projectionInv * NDCCube.at(i);
// perspective divide
frustumVertices.push_back(glm::vec3(tempvec) / tempvec.w);
}
Keep in mind that these coordinates would not end up on screen if your perspective far distance is lower than the one I set in the projection matrix.
If you happen to know the world-coordinate width of "some item" that you want to display pixel-exact, this ends up being a bit of trivial trigonometry (works for both the y FOV and the x FOV):
S = width of the item in world coordinates
T = "pixel exact" size of the item (say, the width of the texture)
r = total screen resolution (width or height, depending on the FOV you want)
h = Z distance to the object
a = 2 * h * tan(Phi / 2) = (r / T) * S
b = h / cos(Phi / 2)
Theta = atan(2*h / a)
Phi = 180 - 2*Theta
Here a is the base of the triangle, b is the length of its two equal sides, h is its height, Theta is each of the two equal angles of the isosceles triangle, and Phi is the resulting FOV.
So the end code might look something like
float frustumWidth = (float(ScreenWidth) / TextureWidth) * InWorldItemWidth;
float theta = glm::degrees(atan((2 * zDistance) / frustumWidth));
float PixelPerfectFOV = 180 - 2 * theta;

Changing From Perspective to Orthographic Matrix

I have a scene with one simple triangle, and I am using perspective projection. I have my MVP matrix set up (with the help of GLM) like this:
glm::mat4 Projection = glm::perspective(45.0f, 4.0f / 3.0f, 0.1f, 100.0f);
glm::mat4 View = glm::lookAt(
glm::vec3(0,0,5), // Camera is at (0,0,5), in World Space
glm::vec3(0,0,0), // and looks at the origin
glm::vec3(0,1,0) // Head is up (set to 0,-1,0 to look upside-down)
);
glm::mat4 Model = glm::mat4(1.0f);
glm::mat4 MVP = Projection * View * Model;
And it all works OK: I can change the values of the camera and the triangle is still displayed properly.
But I want to use orthographic projection. When I change the projection matrix to orthographic, it behaves unpredictably: I can't display the triangle, or I just see one small part of it in the corner of the screen. To use the orthographic projection, I do this:
glm::mat4 Projection = glm::ortho( 0.0f, 800.0f, 600.0f, 0.0f,-5.0f, 5.0f);
while I don't change anything in the View and Model matrices. And it just doesn't work properly.
I just need a push in the right direction. Am I doing something wrong? What am I missing, and what should I do to properly set up orthographic projection?
PS: I don't know if it's needed, but these are the coordinates of the triangle:
static const GLfloat g_triangle[] = {
-1.0f, 0.0f, 0.0f,
1.0f, 0.0f, 0.0f,
0.0f, 2.0f, 0.0f,
};
Your triangle is about 1 unit large. If you use an orthographic projection matrix that is 800 by 600 units wide, it is natural that your triangle appears very small. Just decrease the bounds of the orthographic matrix and make sure that the triangle is inside this area (e.g. the first vertex is outside of the view, because its x-coordinate is less than 0).
Furthermore, make sure that your triangle is not erased by backface culling or z-clipping. By the way, negative values for zNear are a bit unusual but should work for orthographic projections.
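As a rough sketch of "decrease the bounds and keep the triangle inside", the following picks bounds centered on the origin that keep the window's aspect ratio, and checks whether a vertex falls inside them. OrthoBounds, centeredBounds and contains are made-up names for this illustration, and the half height of 3 units is just an assumed value that fits this triangle.

```cpp
#include <algorithm>

struct OrthoBounds { double left, right, bottom, top; };

// Bounds centered on the origin that preserve the window's aspect ratio.
OrthoBounds centeredBounds(double halfHeight, double aspect) {
    return { -halfHeight * aspect, halfHeight * aspect, -halfHeight, halfHeight };
}

// True if (x, y) lies inside the bounds. min/max are used so that flipped
// bounds, such as bottom = 600 and top = 0, are handled as well.
bool contains(const OrthoBounds& b, double x, double y) {
    double xMin = std::min(b.left, b.right), xMax = std::max(b.left, b.right);
    double yMin = std::min(b.bottom, b.top), yMax = std::max(b.bottom, b.top);
    return x >= xMin && x <= xMax && y >= yMin && y <= yMax;
}
```

With the bounds from the question (0, 800, 600, 0) the vertex (-1, 0) is outside, while centeredBounds(3.0, 4.0 / 3.0) contains all three vertices. The resulting values could then be passed on as glm::ortho(b.left, b.right, b.bottom, b.top, zNear, zFar). Note also that with the camera at z = 5 the triangle sits 5 units in front of it, right at the far value of 5.0 used in the question, so a larger far value may be the safer choice.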