How do I find the intersection of a mouse click and a 3D mesh? - c++

In my program I'm loading a 3D mesh to view and interact with. The user can rotate and scale the view; I will be using a rotation matrix with glMultMatrix to rotate the view, and glScalef for scaling. The user can also paint the mesh, which is why I need to translate the mouse coordinates into the scene to see whether they intersect the mesh.
I've read http://www.opengl.org/resources/faq/technical/selection.htm and tried the method where I call gluUnProject at the near and far planes and subtract the results. I have some success with it, but only when gluLookAt's eye position is (0, 0, z), where z can be any reasonable number. When I move the position to, say, (0, 1, z), it goes haywire: it returns an intersection where there is only empty space, and returns no intersection where the mesh is clearly underneath the mouse.
This is how I'm making the ray from the mouse click to the scene:
float hx, hy;
hx = mouse_x;
hy = mouse_y;
GLdouble m1x,m1y,m1z,m2x,m2y,m2z;
GLint viewport[4];
GLdouble projMatrix[16], mvMatrix[16];
glGetIntegerv(GL_VIEWPORT,viewport);
glGetDoublev(GL_MODELVIEW_MATRIX,mvMatrix);
glGetDoublev(GL_PROJECTION_MATRIX,projMatrix);
//unproject to find actual coordinates
gluUnProject(hx,scrHeight-hy,2.0,mvMatrix,projMatrix,viewport,&m1x,&m1y,&m1z);
gluUnProject(hx,scrHeight-hy,10.0,mvMatrix,projMatrix,viewport,&m2x,&m2y,&m2z);
gmVector3 dir = gmVector3(m2x,m2y,m2z) - gmVector3(m1x,m1y,m1z);
dir.normalize();
gmVector3 point;
bool intersected = findIntersection(gmVector3(0,0,5), dir, point);
I'm using glFrustum if that makes any difference.
The findIntersection code is really long and I'm pretty confident it works, so I won't post it unless someone wants to see it. The gist of it is: for every face in the mesh, find the intersection between the ray and the face's plane, then check whether the intersection point lies inside the face.
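For reference, here is a minimal sketch of that kind of per-face test, assuming triangular faces and a small stand-in Vec3 type (not the asker's gmVector3); it is only an illustration of the "intersect the plane, then check the point is inside the face" idea, not the actual findIntersection code:

#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3 sub(Vec3 a, Vec3 b)    { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 add(Vec3 a, Vec3 b)    { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3 mul(Vec3 a, double s)  { return {a.x * s, a.y * s, a.z * s}; }
static double dot(Vec3 a, Vec3 b)  { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Intersect a ray (origin o, normalized direction d) with triangle (a, b, c).
// Returns true and writes the hit point if the ray meets the triangle's plane
// in front of the origin and the hit lies inside the triangle.
bool rayTriangle(Vec3 o, Vec3 d, Vec3 a, Vec3 b, Vec3 c, Vec3& hit)
{
    Vec3 n = cross(sub(b, a), sub(c, a));        // face normal (not normalized)
    double denom = dot(n, d);
    if (std::fabs(denom) < 1e-12) return false;  // ray is parallel to the plane

    double t = dot(n, sub(a, o)) / denom;        // ray/plane intersection distance
    if (t < 0.0) return false;                   // plane is behind the ray origin
    Vec3 p = add(o, mul(d, t));

    // "Inside the face" test: p must be on the inner side of all three edges.
    if (dot(n, cross(sub(b, a), sub(p, a))) < 0.0) return false;
    if (dot(n, cross(sub(c, b), sub(p, b))) < 0.0) return false;
    if (dot(n, cross(sub(a, c), sub(p, c))) < 0.0) return false;

    hit = p;
    return true;
}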
I believe it has to do with the camera's position and look-at vector, but I don't know how, or what to do with them so that mouse clicks land on the mesh properly. Can anyone help me with this?
I also haven't added the rotation matrix or the glScalef call yet, so can anyone give me insight into that as well? For example, does gluUnProject account for the glMultMatrix and glScalef calls when calculating?
Many thanks!

The solution is ray tracing. The ray you shoot is defined by two points: the first is the origin of the camera, and the second is the mouse position projected onto the view plane in the scene (the plane you describe with glFrustum). The intersection of this ray with your model is the point where your mouse click hit the model.

When making the ray from the camera into the scene using the ray direction dir, I should have used:
bool intersected = findIntersection(gmVector3(m1x,m1y,m1z), dir, point);
(notice the different vector being passed to the function). This solved my problem, and it didn't have anything to do with gluLookAt after all!
Also, for the second part of my question: yes, gluUnProject does take into account the glScalef and glMultMatrix calls.

Related

Get screen position of the center of a mesh

My goal is to create an intuitive 3D manipulator to handle rotations of meshes displayed in my 3D editor, made with Qt / QML.
To do that, when the user clicks on an entity, 3 tori are spawned around the mesh, representing the Euler angles the user can act on. If the user then clicks on one torus, I want him to be able to rotate the mesh by dragging the mouse. The natural way users seem to do that is by dragging the mouse around the torus in the direction they want the mesh to rotate.
I therefore need a way to know how the user is rotating his mouse. I thought of a way: when the user clicks on the torus, I retrieve the position of the center of the torus. Then, I translate this world position to its screen position. Then, I monitor the angle between the cursor of the mouse and the center of the torus. The evolution of this angle should tell me everything I need: if the angle increases clockwise, the mesh should rotate clockwise and vice versa. This solution should yield a result good enough for my application, since it won't depend on the angle of the camera, or only very minimally.
However, I can't find a way to translate a world position to its screen position with Qt. I found the function QVector3D::project(const QMatrix4x4 &modelView, const QMatrix4x4 &projection, const QRect &viewport), but its documentation is very scarce and I couldn't find anyone using it... I might have found what to feed in for the projection argument (the projectionMatrix property from QCamera, here https://doc.qt.io/qt-5/qml-qt3d-render-camera.html), but that's it. What is the modelView? And the viewport? Is it simply QRect(0, 0, 1920, 1080)?
If anyone has any kind of lead, it would be amazing; I can't find anything anywhere and I'm kind of losing hope now. Or maybe another, simpler solution to my problem? Please note that the user can also freely move the camera around the mesh, which adds complexity.
Thanks a lot for your time, and have a nice day!
Yes, you should be able to translate from world position to screen position using the function you mention. You are correct about the projection argument. As for the modelView argument, you should use the viewMatrix property from QCamera, which is missing from the official documentation but works for me. The viewport parameter represents the dimensions of the part of the screen you are projecting onto. You could use QRect(0, 0, 1920, 1080) for a full-screen Full HD projection; otherwise use something like QRect(QPoint(0, 0), view->size()), where view is the widget or window showing your 3D image. Be careful: the resulting screen position has y = 0 at the bottom with positive values going up, which is the opposite of the usual screen coordinates.
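A minimal sketch of that call, assuming a Qt3DRender::QCamera pointer named camera and the view's size (both names are placeholders, not from the question); the last line flips y so the result matches the usual top-left screen convention:

#include <Qt3DRender/QCamera>
#include <QVector3D>
#include <QMatrix4x4>
#include <QRect>
#include <QPoint>
#include <QPointF>
#include <QSize>

// Project a world-space point (e.g. the centre of a torus) to screen coordinates.
QPointF worldToScreen(const QVector3D &worldPos,
                      const Qt3DRender::QCamera *camera,
                      const QSize &viewSize)
{
    const QRect viewport(QPoint(0, 0), viewSize);
    const QVector3D win = worldPos.project(camera->viewMatrix(),       // modelView
                                           camera->projectionMatrix(), // projection
                                           viewport);
    // QVector3D::project() reports y growing upwards; flip it to get the
    // usual top-left-origin screen coordinates.
    return QPointF(win.x(), viewport.height() - win.y());
}

From there you can track the angle atan2(mouseY - centreY, mouseX - centreX) on every mouse move and rotate the mesh according to how that angle evolves.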

What are we trying to achieve by creating right and down vectors in this ray tracing depiction?

Often, we see the following picture when talking about ray tracing.
Here, I see the Z axis as the direction the camera would point if it looked straight ahead, and the XY grid as the grid the camera is seeing. From the camera's point of view, we see the usual Cartesian grid that my classmates and I are used to.
Recently I was examining code that simulates this. One thing that is not obvious from this picture to me is the requirement for the "right" and "down" vectors. Obviously we have look_at, which shows where the camera is looking. And campos is where the camera is located. But why do we need camright and camdown? What are we trying to construct?
Vect X (1, 0, 0);
Vect Y (0, 1, 0);
Vect Z (0, 0, 1);
Vect campos (3, 1.5, -4);
Vect look_at (0, 0, 0);
Vect diff_btw (
    campos.getVectX() - look_at.getVectX(),
    campos.getVectY() - look_at.getVectY(),
    campos.getVectZ() - look_at.getVectZ()
);
Vect camdir = diff_btw.negative().normalize();  // forward: from campos towards look_at
Vect camright = Y.crossProduct(camdir);         // horizontal axis of the image plane
Vect camdown = camright.crossProduct(camdir);   // vertical axis of the image plane
Camera scene_cam (campos, camdir, camright, camdown);
I was searching about this question recently and found this post as well: Setting the up vector in the camera setup in a ray tracer
Here, the answerer says this: "My answer assumes that the XY plane is the "ground" in world space and that Z is the altitude. Imagine standing on the floor holding a camera. It's position has a positive Z component, and it's view vector is nearly parallel to the ground. (In my diagram, it's pointing slightly upwards.) The plane of the camera's film (the uv grid) is perpendicular to the view grid. A common approach is to transform everything until the film plane coincides with the XY plane and the camera points along the Z axis. That's equivalent, but not what I'm describing here."
I'm not entirely sure why "transformations" are necessary. How is this point of view different from the picture at the top? They also say that they need an "up" vector and a "right" vector to "construct an image plane". I'm not sure what an image plane is.
Could someone explain better the relationship between the physical representation and code representation?
How do you know that you always want the camera's "up" to be aligned with the vertical lines in the grid in your image?
Trying to explain it another way: the grid is not really there. That imaginary grid is the result of calculations based on the camera's directional vectors and the resolution you are rendering at. The grid is not what decides the camera angle.
When you are holding a physical camera in your hand, like the camera in your cell phone, don't you ever rotate the camera a little bit for effect? Or when filming, you may want to slowly rotate the camera? Have you not seen any movies where the camera is rotated?
In the same way, you may want to rotate the "camera" in your ray-traced world. And rotating only the camera is much easier than rotating all the objects in your scene (maybe millions!).
Check out the example of rotating the camera from the movie Ice Age here:
https://youtu.be/22qniGtZhZ8?t=61
The (up or down) and right vectors construct the plane you project the scene onto. Since the scene is in 3D, you need to project it onto a 2D plane in order to render a picture to display on your screen.
If you have the camera position and direction, you still don't know whether you're holding the camera right-side up, upside down, or tilted to the left or right.
Using the camera position, look-at, up (or down) and right vectors, we can uniquely define how the 3D scene is projected into a 2D picture.
Concretely, if you look at the code and the picture: the 3D scene is the objects displayed, and the image/projection plane is the grid in front of the camera. Its orientation is defined by the camright and camdir vectors (and since the image plane is assumed perpendicular to the camera's line of sight, camdown is uniquely determined by the other two).
The placement of the grid is based on the camera's position and intrinsic properties (it's not being displayed here, but the camera will have a specific field of view).
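To make the role of those vectors concrete, here is a hedged sketch of how a ray tracer typically builds the primary ray for a pixel from campos, camdir, camright and camdown; the small V3 type stands in for Vect, the 0.5 offsets centre the image plane on camdir, and aspect-ratio handling is left out for brevity (this is an illustration, not the exact code the tutorial uses):

#include <cmath>

struct V3 { double x, y, z; };                     // stands in for Vect
static V3 add(V3 a, V3 b)       { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static V3 scale(V3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
static V3 normalize(V3 a) {
    double len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return {a.x / len, a.y / len, a.z / len};
}

// Direction of the primary ray through pixel (px, py) of a width x height image.
// xamnt/yamnt locate the pixel on the image plane in [0, 1]; subtracting 0.5
// centres the plane on camdir, and camright/camdown slide the ray across it.
V3 primaryRayDir(V3 camdir, V3 camright, V3 camdown,
                 int px, int py, int width, int height)
{
    double xamnt = (px + 0.5) / width;
    double yamnt = (py + 0.5) / height;
    V3 offset = add(scale(camright, xamnt - 0.5),
                    scale(camdown,  yamnt - 0.5));
    return normalize(add(camdir, offset));          // the ray's origin is campos
}

This is exactly why the camera needs more than a position and a look-at point: without camright and camdown there would be no way to say which world-space point a given pixel corresponds to.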

How to draw a ray/line from the near clipping plane w/ perspective projection?

Simply put - I want to draw a ray/line from the near clipping plane out to the far clipping plane using a perspective projection. I have what I believe are correctly normalized world coordinates generated from a mouse click using methods described in various OpenGL/graphics programming guides.
The problem I am having is that it seems my ray is being drawn from outside the near clipping plane.
Background: This is for a simple model viewer I am building in Qt that requires a picking capability. I need to draw the ray in order to calculate intersections with objects in the scene. However, my basic problem is that I can't seem to draw the ray correctly.
My perspective projection is defined:
gluPerspective(_fov, aspect, 0.1, 100.0);
where _fov is 45.0 degrees, and aspect is the ratio of the window width/height.
Using my picking code, I've generated what I believe to be correctly normalized world coordinates based off of mouse clicks. An example of these coordinates:
-0.385753,-0.019608,-0.100000
However, when I try to draw a ray starting at that point, it looks like it is being drawn from outside of the clipping plane:
Maybe I am expecting something different, but in the example above I clicked on the nose of the airplane, generated the world coordinates above, and I am drawing the ray incorrectly (or so I believe). I was hoping to see the line being drawn from the location of the mouse click into the airplane model.
When I draw the ray I first load the identity matrix, and then draw a line from the near clipping plane coordinates to the far plane. Then I draw a sphere at the end of the ray (in this screenshot it is behind the plane).
glPushMatrix();
glLoadIdentity();
glColor3f(0,0,1);
glBegin(GL_LINES);
glVertex3f(_near_ray.x(), _near_ray.y(), _near_ray.z());
glVertex3f(_far_ray.x(), _far_ray.y(), _far_ray.z());
glEnd();
glTranslatef(_far_ray.x(), _far_ray.y(), _far_ray.z());
glColor3f(1,0,0);
glutWireSphere(1, 10, 10);
glPopMatrix();
Any hints as to what I am doing wrong? The _far_ray coordinates are the same as the _near_ray except for the Z field. I want the ray to be drawn straight into the scene.
In the end... I'd just like to know how to draw the ray itself. I understand that there might be errors in my code that generates the coordinates, but what if I just wanted to draw an arbitrary ray from the near clipping plane straight into the scene? That is what I'd like answered.
With perspective projection, a line looks like a point on the screen if and only if it passes through the eye position.
Since you reset the modelview matrix to identity, the eye is located at the origin (according to this question). Pass (0, 0, 0) as one of the vertices, and hopefully you'll see the line degenerate into a point.
Generally, for the line to project to a single point, the two 3D vectors used as vertices must be collinear (so that the line through them passes through the eye at the origin).
If you do not reset the modelview matrix, then you can draw a line from (eye) to (eye + dir), where eye is the first vector passed to gluLookAt, and dir is any sufficiently long vector pointing in the proper direction.
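A minimal sketch of that second option, assuming eye is the position passed to gluLookAt and dir is the (already unprojected) pick direction; the names and the fixed-function drawing style are placeholders modelled on the question's code, not a prescribed API:

#include <GL/gl.h>

// Draw the pick ray in world space. Note: no glLoadIdentity() here - the
// current modelview (camera) matrix stays in place, so the line is
// transformed exactly like the rest of the scene.
void drawPickRay(const GLdouble eye[3], const GLdouble dir[3], GLdouble length)
{
    glPushMatrix();
    glColor3f(0.0f, 0.0f, 1.0f);
    glBegin(GL_LINES);
    glVertex3d(eye[0], eye[1], eye[2]);
    glVertex3d(eye[0] + dir[0] * length,
               eye[1] + dir[1] * length,
               eye[2] + dir[2] * length);
    glEnd();
    glPopMatrix();
}

Keep in mind the point made above: seen from the camera itself, a ray starting at the eye projects to a single dot, so to check it visually you either offset the start slightly or look at the scene from a second camera angle.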

C++ OpenGL dragging multiple objects with mouse

Just wondering how someone would go about dragging 4 different objects in OpenGL. I have very simple code to draw these objects:
glPushMatrix();
glTranslatef(mouse_x, mouse_y, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x2, mouse_y2, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x3, mouse_y3, 0);
glutSolidIcosahedron();
glPopMatrix();
glPushMatrix();
glTranslatef(mouse_x4, mouse_y4, 0);
glutSolidIcosahedron();
glPopMatrix();
I understand how to move an object, but I want to learn how to drag and drop any one of these objects.
I've been researching about the Name Stack and Selection Mode, but it just confused the hell out of me.
And I also know about having to have some sort of glutMouseFunc.
It's just the selection of each shape I'm puzzled about.
The first thing you need to do is capture the position of the mouse on the screen when the button is clicked. There are plenty of ways to do it, but I believe that's outside the scope of this question. Once you have the screen X,Y coordinates, you must detect whether any object is selected and which one it is. There are two possible approaches. You can either keep track of a bounding rectangle for each object (in screen space), in which case the test of whether the cursor is inside any of those rectangles is quite simple, or you can cast a ray from the eye through the cursor position in world space and check the intersection of that ray with each object.
The second approach is more versatile for 3D graphics, but you seem to be using only X and Y coordinates, so you don't need to worry about the Z order of objects.
In the case of the first solution, the main problem is: how do you know how big your object is on the screen? glutSolidIcosahedron() renders an object of radius 1. To calculate its screen radius you can either use some matrix math or, in this case, simple trigonometry. You will need to know the distance from the camera to the drawing plane (I believe you're using some glTranslatef(0,0,X) before you render; X is your distance). You also need to know the view angle of the camera, which you set in the projection matrix. Now take a piece of paper, draw a cone of angle alpha intersecting a plane at distance X, and knowing that the object has radius 1 you can easily calculate how large an area of the screen it occupies. (I'll leave this calculation for you.)
Now that you know the radius on screen, simply test the distance from your click position to each object. If the distance is below the radius, it's selected. If more than one object passes this test, just select the first one of them.
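For what it's worth, here is a hedged sketch of how that trigonometry and the distance test can come out, assuming a symmetric gluPerspective projection with vertical field of view fovYDegrees and an object of world-space radius 1 sitting dist units in front of the camera (all names here are illustrative, not from the question's code):

#include <cmath>

// Approximate on-screen radius, in pixels, of a sphere of radius worldRadius
// located 'dist' units in front of the camera, under a gluPerspective
// projection with vertical field of view fovYDegrees.
double screenRadius(double worldRadius, double dist,
                    double fovYDegrees, int viewportHeight)
{
    const double pi = 3.14159265358979323846;
    double halfFov = fovYDegrees * pi / 360.0;            // fovY / 2, in radians
    double halfHeightWorld = dist * std::tan(halfFov);    // visible half-height at 'dist'
    return worldRadius * (viewportHeight / 2.0) / halfHeightWorld;
}

// Pick test: true if the click lands within the object's projected radius.
bool hitTest(double clickX, double clickY,    // cursor position in pixels
             double objX, double objY,        // object's centre projected to the screen
             double radiusPx)
{
    double dx = clickX - objX;
    double dy = clickY - objY;
    return dx * dx + dy * dy <= radiusPx * radiusPx;
}

Run hitTest once per icosahedron with its own projected centre; whichever passes first becomes the dragged object, and during the drag you keep writing the converted mouse coordinates into that object's mouse_x/mouse_y pair.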

How do I implement basic camera operations in OpenGL?

I'm trying to implement an application using OpenGL and I need to implement the basic camera movements: orbit, pan and zoom.
To make it a little clearer, I need Maya-like camera control. Due to the nature of the application, I can't use the good ol' "transform the scene to make it look like the camera moves". So I'm stuck using transform matrices, gluLookAt, and such.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt), but I'm not quite sure how to implement the other two, pan and orbit. Has anyone ever done this?
I can't use the good ol' "transform the scene to make it look like the camera moves"
OpenGL has no camera. So you'll end up doing exactly this.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt),
This is not a Zoom, this is a Dolly. Zooming means varying the limits of the projection volume, i.e. the extents of an ortho projection, or the field of view of a perspective projection.
gluLookAt, which you've already run into, is your solution. The first three arguments are the camera's position (x,y,z), the next three are the camera's center (the point it's looking at), and the final three are the up vector (usually (0,1,0)), which defines the camera's y-z plane.*
It's pretty simple: you just call glLoadIdentity();, then gluLookAt(...), and then draw your scene as normal. Personally, I always do all the calculations on the CPU myself. I find that orbiting a point is an extremely common task. My template C/C++ code uses spherical coordinates and looks like:
double camera_center[3] = {0.0, 0.0, 0.0};
double camera_radius = 4.0;
double camera_rot[2] = {0.0, 0.0};
double camera_pos[3] = {
    camera_center[0] + camera_radius*cos(radians(camera_rot[0]))*cos(radians(camera_rot[1])),
    camera_center[1] + camera_radius*sin(radians(camera_rot[1])),
    camera_center[2] + camera_radius*sin(radians(camera_rot[0]))*cos(radians(camera_rot[1]))
};
gluLookAt(
    camera_pos[0], camera_pos[1], camera_pos[2],
    camera_center[0], camera_center[1], camera_center[2],
    0, 1, 0
);
Clearly you can adjust camera_radius, which will change the "zoom" of the camera, camera_rot, which will change the rotation of the camera about its axes, or camera_center, which will change the point about which the camera orbits.
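A hedged sketch of how mouse input could drive that template (orbit changes camera_rot, dolly scales camera_radius, pan shifts camera_center along the camera's right/up axes); the sensitivity constants, sign conventions and the way the mouse deltas dx/dy are obtained are assumptions, not part of the answer above:

#include <cmath>

// Same state as in the template above, redeclared so this sketch is self-contained.
double camera_center[3] = {0.0, 0.0, 0.0};
double camera_radius    = 4.0;
double camera_rot[2]    = {0.0, 0.0};           // {azimuth, elevation} in degrees

static double radians(double deg) { return deg * 3.14159265358979323846 / 180.0; }

// Orbit: a mouse drag changes the two spherical angles.
void orbit(double dx, double dy)                // dx, dy = mouse delta in pixels
{
    camera_rot[0] += dx * 0.5;                  // azimuth, degrees per pixel
    camera_rot[1] += dy * 0.5;                  // elevation
    if (camera_rot[1] >  89.0) camera_rot[1] =  89.0;   // stay away from the poles
    if (camera_rot[1] < -89.0) camera_rot[1] = -89.0;
}

// Dolly (the question's "zoom"): move towards or away from the orbit centre.
void dolly(double wheelSteps)
{
    camera_radius *= std::pow(0.9, wheelSteps); // ~10% closer per wheel step
}

// Pan: slide the orbit centre along the camera's right and up directions.
void pan(double dx, double dy)
{
    double az = radians(camera_rot[0]), el = radians(camera_rot[1]);
    double right[3] = { std::sin(az), 0.0, -std::cos(az) };            // camera right
    double up[3]    = { -std::cos(az) * std::sin(el), std::cos(el),
                        -std::sin(az) * std::sin(el) };                // camera up
    double s = 0.002 * camera_radius;           // pan speed scales with distance
    for (int i = 0; i < 3; ++i)
        camera_center[i] += (-dx * right[i] + dy * up[i]) * s;
}

Flip the signs on dx/dy if the motion feels inverted for your mouse coordinate convention; after any of these calls you simply recompute camera_pos and call gluLookAt again as shown above.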
*The only other tricky bit is learning exactly what all that means. To clarify, because the internet is lacking:
The position is the (x,y,z) position of the camera. Pretty straightforward.
The center is the (x,y,z) point the camera is focusing at. You're basically looking along an imaginary ray from the position to the center.
Now, your camera could still be looking in any direction around this vector (e.g., it could be upside down but still looking along the same direction). The up vector is a vector, not a position. It, together with that imaginary vector from the position to the center, forms a plane. This is the camera's y-z plane.
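To make that concrete, here is a small sketch of the orthonormal frame gluLookAt effectively derives from those three arguments (a simplified illustration of what the GLU implementation does, not a drop-in replacement):

#include <cmath>

struct V3 { double x, y, z; };
static V3 sub(V3 a, V3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static V3 cross(V3 a, V3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static V3 normalize(V3 a) {
    double len = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return {a.x / len, a.y / len, a.z / len};
}

// Derive the camera's frame from gluLookAt's arguments: forward is the
// imaginary ray from position to center, right is perpendicular to forward
// and the supplied up vector, and trueUp is recomputed so the three axes are
// orthogonal even if the supplied up isn't exactly perpendicular to forward.
void cameraFrame(V3 position, V3 center, V3 up,
                 V3& forward, V3& right, V3& trueUp)
{
    forward = normalize(sub(center, position));
    right   = normalize(cross(forward, up));
    trueUp  = cross(right, forward);
}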