Convert 2D screen coordinates to 3D space coordinates in C++?

I'm writing a 2D game using a wrapper around OpenGL ES. There is a camera aiming at a bunch of textures, which are the sprites for the game. The user should be able to move the view around by moving their fingers around on the screen. The problem is that the camera is about 100 units away from the textures, so when a finger is slid across the screen to pan the camera, the sprites move faster than the finger due to the parallax effect.
So basically, I need to convert 2D screen coordinates to 3D coordinates at a specific z distance (in my case 100 units, because that's how far away the textures are).
There are some "Unproject" functions in C#, but I'm using C++, so I need the math behind that function. I'm extremely new to 3D and very bad at math, so an explanation pitched at a ten-year-old would be much appreciated.
If I can do this, I can pan the camera at a speed that makes the distant sprites appear to move with the user's finger.

For picking purposes, there are better ways than doing a reverse projection. See this answer: https://stackoverflow.com/a/1114023/252687

In general, you will have to scale your finger-movement distance to apply it on a plane that is some distance away (z units away).
That is, if l is the length of the finger movement and you want the corresponding length on a plane z units away, the length is l' = l/z.
But please check the effect and adjust l' (double it, halve it, etc.) to get the desired result.

Found the answer at:
Wikipedia
It has the following formula:
To determine which screen x-coordinate corresponds to a point at (Ax, Az), multiply the point coordinates by
Bx = Ax * Bz / Az
where
Bx is the screen x coordinate,
Ax is the model x coordinate,
Bz is the focal length (the axial distance from the camera center to the image plane), and
Az is the subject distance.
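Turning that relation around gives the conversion the question asks for: a model-space offset at the sprites' distance is the screen offset times Az/Bz. Below is a minimal C++ sketch of that idea; the function name and the focal-length value are purely illustrative, not from the original post.

#include <cstdio>

// Invert Bx = Ax * Bz / Az to recover a model-space offset (Ax) from a
// screen-space offset (Bx) for sprites that sit Az units from the camera.
float screenDeltaToWorldDelta(float screen_dx,     // finger movement on screen (Bx)
                              float focal_length,  // camera-to-image-plane distance (Bz)
                              float subject_dist)  // camera-to-sprite distance (Az)
{
    return screen_dx * subject_dist / focal_length;  // Ax = Bx * Az / Bz
}

int main()
{
    // Example: sprites are 100 units away; the focal length here is made up.
    float world_dx = screenDeltaToWorldDelta(12.0f, 500.0f, 100.0f);
    std::printf("pan the camera by %f world units\n", world_dx);
    return 0;
}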

Related

Matching top view human detections with floor projection on interactive floor project

I'm building an interactive floor. The main idea is to match the detections made with an Xtion camera with objects I draw in a floor projection and have them follow the person.
I also detect the projection area on the floor, which translates to a polygon; the camera can see outside the "screen" area.
The problem is that the algorithm detects the topmost part of the person under it using depth data, and because of the angle between that point and the camera, that point isn't directly above the person's feet.
I know the distance to the floor and the height of the person detected. And I know that the camera is not perpendicular to the floor but I don't know the camera's tilt angle.
My question is how can I project that 3D point onto the polygon on the floor?
I'm hoping someone can point me in the right direction. I've been reading about camera projections but I'm not seeing how to use it in this particular problem.
Thanks in advance
UPDATE:
With the answer from Diego O.d.L I was able to get an almost perfect detection. I'll write down the steps I used for those who might be looking for the same solution (I won't go into much detail on how the detection is made):
Step 1 : Calibration
Here I get some color and depth frames from the camera, using OpenNI, with the projection area cleared.
The projection area is detected in the color frames.
I then convert the detected points to real-world coordinates (using OpenNI's CoordinateConverter). With the new real-world detection points I look for the plane that best fits them.
Step 2: Detection
I use the detection algorithm to get new person detections and to track them using the depth frames.
These detection points are converted to real world coordinates and projected to the plane previously computed. This corrects the offset between the person's height and the floor.
The points are mapped to screen coordinates using a perspective transform.
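For reference, a rough sketch of that last step, assuming the perspective transform was estimated during calibration as a 3x3 homography H (the matrix layout and names here are assumptions, not part of the original setup):

#include <array>

struct Point2 { float x, y; };

// Apply a 3x3 perspective transform (row-major homography H) to map a point
// on the floor plane to projector/screen coordinates.
Point2 applyPerspectiveTransform(const std::array<float, 9>& H, Point2 p)
{
    float xh = H[0]*p.x + H[1]*p.y + H[2];
    float yh = H[3]*p.x + H[4]*p.y + H[5];
    float w  = H[6]*p.x + H[7]*p.y + H[8];
    return { xh / w, yh / w };  // homogeneous divide
}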
Hope this helps. Thank you again for the answers.
Work with the camera coordinate system initially. I'm assuming you don't have problems converting from (row,column,distance) to a real world system aligned with the camera axis (x,y,z):
1. Calculate the floor plane from three or more points (more for robustness) in camera coordinates (x,y,z), using your favorite fitting algorithm.
2. Then find the projection of your head point onto the floor plane (a sketch of this is given below).
3. Finally, you can convert it to the floor coordinate system or just keep it in the camera system.
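A minimal C++ sketch of steps 1 and 2, assuming you simply build the plane from three non-collinear floor points (a least-squares fit over more points would be more robust; all names here are illustrative):

#include <cmath>

struct Vec3 { float x, y, z; };

Vec3  sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3  cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
float dot(Vec3 a, Vec3 b)   { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Plane through p0 with unit normal n.
struct Plane { Vec3 n; Vec3 p0; };

Plane planeFromPoints(Vec3 p0, Vec3 p1, Vec3 p2)
{
    Vec3 n = cross(sub(p1, p0), sub(p2, p0));
    float len = std::sqrt(dot(n, n));
    return { { n.x / len, n.y / len, n.z / len }, p0 };
}

// Orthogonal projection of the head point q onto the floor plane.
Vec3 projectOntoPlane(const Plane& pl, Vec3 q)
{
    float d = dot(sub(q, pl.p0), pl.n);  // signed distance to the plane
    return { q.x - d * pl.n.x, q.y - d * pl.n.y, q.z - d * pl.n.z };
}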
From the description of your intended application, it is probably more useful for you to recover the image coordinates, I guess.
This type of problems usually benefits from clearly defining the variables.
In this case, you have a head at physical position {x,y,z} and you want the ground projection {x,y,0}. That's trivial, but your camera gives you {u,v,d} (d being depth) and you need to transform that to {x,y,z}.
The easiest solution to find the transform for a given camera positioning may be to simply put known markers on the floor at {0,0,0}, {1,0,0}, {0,1,0} and see where they pop up in your camera.

Best way to drag 3D point in 3D space with mouse picking in OpenGL?

What is the best way to drag a 3D point with mouse picking? The issue is not the picking but the dragging in 3D space.
There are two ways I am thinking of. One is to get the view-to-world coordinates using gluUnProject and translate the 3D point. The problem in this case is that it only works on surfaces with a depth value (read with glReadPixels); if the mouse leaves the surface it gives the maximum or minimum depth values based on the winZ component passed to gluUnProject, and it doesn't work in some cases.
The second way is to drag along the XY, XZ, or YZ plane using GL_MODELVIEW_MATRIX.
But the problem in this case is, how would we know whether we are on the XY, XZ, or YZ plane? How can we know that the front view of the trackball is in the XY plane, and what if we want to drag on the side plane rather than the front plane?
So, is there any way that gives me an accurate 2D-to-3D coordinate so that I can drag a 3D point easily in every direction without considering these plane cases? There must be some way; I have seen 3D software packages with perfect dragging features.
I'm used to solving these user-interaction problems somewhat naively (perhaps not in a mathematically optimal way) but "well enough" considering that they're not very performance-critical (the user interaction parts, not necessarily the resulting modifications to the scene).
For unconstrained free dragging of an object, the method you described using unproject tends to work quite well, often giving near pixel-perfect dragging, with a slight tweak:
... instead of using glReadPixels to try to extract screen depth, you want a notion of a geometric object/mesh when the user is picking and/or has selected. Now just project the center point of that object to get your screen depth. Then you can drag around in screen X/Y, keeping the same Z you got from this projection, and unproject to get the resulting translation delta from previous center to new center to transform the object. This also makes it "feel" like you're dragging from the center of the object which tends to be quite intuitive.
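As a reference, here is roughly what that looks like with the raw GLU calls (a sketch only: matrices and viewport are assumed to be fetched elsewhere, error handling is omitted, and window y may need flipping as viewport[3] - mouse_y depending on your input source):

#include <GL/glu.h>

// Drag an object by projecting its center once to get a screen-space depth,
// then unprojecting the previous and current cursor positions at that depth.
void dragObjectCenter(GLdouble obj_center[3],
                      GLdouble mouse_prev_x, GLdouble mouse_prev_y,
                      GLdouble mouse_x, GLdouble mouse_y,
                      const GLdouble model[16], const GLdouble proj[16], const GLint view[4])
{
    // Screen-space position (and depth) of the object's center.
    GLdouble cx, cy, cz;
    gluProject(obj_center[0], obj_center[1], obj_center[2], model, proj, view, &cx, &cy, &cz);

    // Unproject both cursor positions at that same depth.
    GLdouble px, py, pz, qx, qy, qz;
    gluUnProject(mouse_prev_x, mouse_prev_y, cz, model, proj, view, &px, &py, &pz);
    gluUnProject(mouse_x,      mouse_y,      cz, model, proj, view, &qx, &qy, &qz);

    // Translate the object by the delta between the two unprojected points.
    obj_center[0] += qx - px;
    obj_center[1] += qy - py;
    obj_center[2] += qz - pz;
}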
For auto-constrained dragging, a quick way to detect the constraint is to first grab a 'viewplane normal'. One way (which might make mathematicians frown) using those projection/unprojection functions you're used to is to unproject two points at the center of the viewport in screen space (one with a near z value and one with a far z value) and take the unit vector between those two points. Now you can find the world axis closest to that normal using the dot product. The other two world axes define the world plane we want to drag along.
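A small self-contained sketch of that axis pick (names are illustrative): once you have the unit view direction from the two unprojected points, the closest world axis is simply the component with the largest absolute value.

#include <cmath>

// view_dir: unit vector from the near to the far unprojected viewport-center points.
// Returns 0, 1, or 2 for the X, Y, or Z world axis most aligned with the view;
// the other two axes span the plane to drag along.
int closestWorldAxis(const float view_dir[3])
{
    int best_axis = 0;
    float best_dot = 0.0f;
    for (int axis = 0; axis < 3; ++axis)
    {
        float d = std::fabs(view_dir[axis]);  // |dot| with that unit world axis
        if (d > best_dot) { best_dot = d; best_axis = axis; }
    }
    return best_axis;
}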
Then it becomes a simple matter of using those handy unprojection functions again to get a ray along the mouse cursor. After that, you can do repeated ray/plane intersections as you're dragging the cursor around to compute a translation vector from the delta.
For more flexible constraints, a gizmo (aka manipulator, basically a 3D widget) can come in handy so that the user can indicate what kind of drag constraint he wants (planar, axis, unconstrained, etc) based on which parts of the gizmo he picks/drags. For axis constraints, ray/line or line/line intersection is handy.
As requested in the comments, to retrieve a ray from a viewport (C++-ish pseudocode):
// Get a ray from the current cursor position (screen_x and screen_y).
const float near = 0.0f;
const float far = 1.0f;
Vec3 ray_org = unproject(Vec3(screen_x, screen_y, near));
Vec3 ray_dir = unproject(Vec3(screen_x, screen_y, far));
ray_dir -= ray_org;
// Normalize ray_dir (rsqrt should handle zero cases to avoid divide by zero).
const float rlen = rsqrt(ray_dir[0]*ray_dir[0] +
ray_dir[1]*ray_dir[1] +
ray_dir[2]*ray_dir[2]);
ray_dir[0] *= rlen;
ray_dir[1] *= rlen;
ray_dir[2] *= rlen;
Then we do a ray/plane intersection with the ray obtained from the mouse cursor to figure out where the ray intersects the plane when the user begins dragging (the intersection will give us a 3D point). After that, it's just translating the object by the deltas between the points gathered from repeatedly doing this as the user drags the mouse around. The object should intuitively follow the mouse while being moved along a planar constraint.
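For reference, a self-contained version of that ray/plane intersection (plain float[3] vectors rather than the Vec3 pseudocode above; purely a sketch):

#include <cmath>

// Intersect a ray with a plane given by a point on it and its normal.
// Returns false when the ray is (nearly) parallel to the plane.
bool intersectRayPlane(const float ray_org[3], const float ray_dir[3],
                       const float plane_point[3], const float plane_normal[3],
                       float hit[3])
{
    float denom = ray_dir[0]*plane_normal[0] +
                  ray_dir[1]*plane_normal[1] +
                  ray_dir[2]*plane_normal[2];
    if (std::fabs(denom) < 1e-6f)
        return false;  // ray is parallel (or nearly so) to the plane

    float t = ((plane_point[0] - ray_org[0])*plane_normal[0] +
               (plane_point[1] - ray_org[1])*plane_normal[1] +
               (plane_point[2] - ray_org[2])*plane_normal[2]) / denom;

    for (int i = 0; i < 3; ++i)
        hit[i] = ray_org[i] + ray_dir[i]*t;
    return true;
}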
Axis dragging is basically the same idea, but we turn the ray into a line and do a line/line intersection (mouse line against the line for the axis constraint, giving us a nearest point since the lines generally won't perfectly intersect), giving us back a 3D point from which we can use the deltas to translate the object along the constrained axis.
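And a sketch of the line/line nearest point for the axis case (the standard closest-point-between-two-lines formulation, again with plain float[3] vectors):

#include <cmath>

// Returns (in out_point) the point on the axis line closest to the mouse ray.
void closestPointOnAxis(const float axis_org[3], const float axis_dir[3],
                        const float ray_org[3],  const float ray_dir[3],
                        float out_point[3])
{
    float w[3] = { axis_org[0] - ray_org[0],
                   axis_org[1] - ray_org[1],
                   axis_org[2] - ray_org[2] };

    float a = axis_dir[0]*axis_dir[0] + axis_dir[1]*axis_dir[1] + axis_dir[2]*axis_dir[2];
    float b = axis_dir[0]*ray_dir[0]  + axis_dir[1]*ray_dir[1]  + axis_dir[2]*ray_dir[2];
    float c = ray_dir[0]*ray_dir[0]   + ray_dir[1]*ray_dir[1]   + ray_dir[2]*ray_dir[2];
    float d = axis_dir[0]*w[0] + axis_dir[1]*w[1] + axis_dir[2]*w[2];
    float e = ray_dir[0]*w[0]  + ray_dir[1]*w[1]  + ray_dir[2]*w[2];

    float denom = a*c - b*b;  // ~0 when the lines are (nearly) parallel
    float s = (std::fabs(denom) < 1e-6f) ? 0.0f : (b*e - c*d) / denom;

    for (int i = 0; i < 3; ++i)
        out_point[i] = axis_org[i] + axis_dir[i]*s;
}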
Note that there are tricky edge cases involved with axis/planar dragging constraints. For example, if a plane is perpendicular to the viewing plane (or close), it can shoot off the object into infinity. The same kind of case exists with axis dragging along a line that is perpendicular, like trying to drag along the Z axis from a front viewport (X/Y viewing plane). So it's worth detecting those cases where the line/plane is perpendicular (or close) and prevent dragging in such cases, but that can be done after you get the basic concept working.
Another trick worth noting to improve the way things "feel" for some cases is to hide the mouse cursor. For example, with axis constraints, the mouse cursor could end up becoming very distant from the axis itself, and it might look/feel weird. So I've seen a number of commercial packages simply hide the mouse cursor in this case to avoid revealing that discrepancy between the mouse and the gizmo/handle, and it tends to feel a bit more natural as a result. When the user releases the mouse button, the mouse cursor is moved to the visual center of the handle. Note that you shouldn't do this hidden-cursor dragging for tablets (they're a bit of an exception).
This picking/dragging/intersection stuff can be very difficult to debug, so it's worth tackling it in baby steps. Set small goals for yourself, like just clicking a mouse button in a viewport somewhere to create a ray. Then you can orbit around and make sure the ray was created in the right position. Next you can try a simple test to see if that ray intersects a plane in the world (say, the X/Y plane), create/visualize the intersection point between the ray and the plane, and make sure that's correct. Take it in small, patient baby steps, pacing yourself, and you'll make smooth, confident progress. Try to do too much at once and you can end up with very discouraging, jarring setbacks while trying to figure out where you went wrong.
It's a very interesting topic. As well as the methods you described, color tagging and coloring the z-buffer are also methods. There have been similar discussions about the same topic here:
Generic picking solution for 3D scenes with vertex-shader-based geometry deformation applied
OpenGL - Picking (fastest way)
How to get object coordinates from screen coordinates?
This is also similar to object picking that has been discussed fully including their pros and cons:
http://www.opengl-tutorial.org/miscellaneous/clicking-on-objects/picking-with-an-opengl-hack/
Actually, my answer is that there is no unique best way to do this. As you also mentioned, they all have pros and cons. I personally like gluUnProject as it is easy and its results are okay. In addition, you cannot move a vertex in an arbitrary direction using screen space alone, as screen space is 2D and there is no unique way to map it back to the 3D geometry space. Hope it helps :).

How to preserve the perception of angle between walls when the camera is rotating in OpenGL?

I am building a number of rooms with different shapes: parallelogram, rectangle, rhombus, square, etc. The viewer is supposed to look at the rooms from different corners, turn his head to the right and left, and guess the shape of the room. So here "the perception of the angles between walls" is very important. My problems are these:
1) most of the acute angles seem to be 90 degrees from the distance,
2) the angles between walls as well as the length of the walls seem to change when the viewer turns his head left or right.
From what I have read so far, this is a consequence of using perspective projection; however, with an orthographic projection I would have no perception of depth on the screen, and since I am inside the room, the room is bigger than the clipped area, which produces a quite rubbish image.
I just want to know whether there is any way to avoid, or at least minimize, this deformation effect. Should I build my own projection (something between glOrtho and gluPerspective)?
It is also worth mentioning that I use the gluLookAt function for positioning the camera and the look-at point. The eye position is always one of the room corners and the initial look-at point is the corner opposite the viewer. By pressing the right and left arrow keys, the look-at point moves on an imaginary circle surrounding the room, just like in most of the OpenGL programs I have seen so far. Do you think it would be better if I moved the look-at point along the walls? Or rotated the room instead of changing the look-at point?
I added some pictures for a better illustration of my problem.
This is my parallelogram room:
parallelogram.png
As you can see here, the acute angle, which is supposed to be 60 degrees, seems to be at least 90 degrees. And this is my rectangle room, which doesn't give you the sense of being in a rectangular room at all:
Rectangle.png
I'd say, without seeing the problem in action, that your camera FOV is too large.
From your comment, your camera has a 112.5° horizontal FOV, which I believe introduces 'unnatural distortions' at the edges of the screen: the simple OpenGL perspective projection is linear and suffers with overly wide FOVs.
You may want to take a look at this article comparing cameras to human eyes, as you want to gauge the perception of your users.
You should try to reduce the horizontal FOV of your camera, perhaps in the 80-90° range to see if it helps.
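One practical detail when trying this: gluPerspective takes a vertical FOV, so if you reason in terms of horizontal FOV you need to convert through the aspect ratio. A small sketch (the 16:9 aspect and clip planes below are just example values):

#include <cmath>

double verticalFovDegrees(double horizontal_fov_deg, double aspect /* width / height */)
{
    const double pi = 3.14159265358979323846;
    double h = horizontal_fov_deg * pi / 180.0;
    return 2.0 * std::atan(std::tan(h / 2.0) / aspect) * 180.0 / pi;
}

// e.g. gluPerspective(verticalFovDegrees(90.0, 16.0/9.0), 16.0/9.0, 0.1, 100.0);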
As a last resort, you could switch to non-linear projections, but you should make an educated choice before switching to this, there should be psychophysics research available (such as this) that may help you.

How do I implement basic camera operations in OpenGL?

I'm trying to implement an application using OpenGL and I need to implement the basic camera movements: orbit, pan and zoom.
To make it a little clearer, I need Maya-like camera control. Due to the nature of the application, I can't use the good ol' "transform the scene to make it look like the camera moves". So I'm stuck using transform matrices, gluLookAt, and such.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt), but I'm not quite sure how to implement the other two, pan and orbit. Has anyone ever done this?
I can't use the good ol' "transform the scene to make it look like the camera moves"
OpenGL has no camera. So you'll end up doing exactly this.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt),
This is not a Zoom, this is a Dolly. Zooming means varying the limits of the projection volume, i.e. the extents of a ortho projection, or the field of view of a perspective.
gluLookAt, which you've already run into, is your solution. First three arguments are the camera's position (x,y,z), next three are the camera's center (the point it's looking at), and the final three are the up vector (usually (0,1,0)), which defines the camera's y-z plane.*
It's pretty simple: you just call glLoadIdentity();, then gluLookAt(...), and then draw your scene as normal. Personally, I always do all the calculations on the CPU myself. I find that orbiting a point is an extremely common task. My template C/C++ code uses spherical coordinates and looks like:
double camera_center[3] = {0.0, 0.0, 0.0};
double camera_radius    = 4.0;
double camera_rot[2]    = {0.0, 0.0};
double camera_pos[3] = {
    camera_center[0] + camera_radius*cos(radians(camera_rot[0]))*cos(radians(camera_rot[1])),
    camera_center[1] + camera_radius*sin(radians(camera_rot[1])),
    camera_center[2] + camera_radius*sin(radians(camera_rot[0]))*cos(radians(camera_rot[1]))
};
gluLookAt(
    camera_pos[0],    camera_pos[1],    camera_pos[2],
    camera_center[0], camera_center[1], camera_center[2],
    0, 1, 0
);
Clearly you can adjust camera_radius, which will change the "zoom" of the camera, camera_rot, which will change the rotation of the camera about its axes, or camera_center, which will change the point about which the camera orbits.
*The only other tricky bit is learning exactly what all that means. To clarify, because the internet is lacking:
The position is the (x,y,z) position of the camera. Pretty straightforward.
The center is the (x,y,z) point the camera is focusing at. You're basically looking along an imaginary ray from the position to the center.
Now, your camera could still be looking in any direction around this vector (e.g., it could be upside down but still looking along the same direction). The up vector is a vector, not a position. It, along with that imaginary vector from the position to the center, forms a plane. This is the camera's y-z plane.
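For panning with this setup (not covered in the answer above, so take this as a sketch built on the same variables): shift camera_center along the camera's right and up directions; since camera_pos is recomputed from camera_center, the whole view translates with it.

// dx, dy: mouse deltas already scaled by some sensitivity factor.
// Degenerate when looking straight up or down; that edge case is ignored here.
void pan(double dx, double dy)
{
    // Forward direction from the camera position to the orbit center.
    double f[3] = { camera_center[0] - camera_pos[0],
                    camera_center[1] - camera_pos[1],
                    camera_center[2] - camera_pos[2] };
    double flen = sqrt(f[0]*f[0] + f[1]*f[1] + f[2]*f[2]);
    for (int i = 0; i < 3; ++i) f[i] /= flen;

    // right = forward x worldUp, with worldUp = (0,1,0).
    double r[3] = { -f[2], 0.0, f[0] };
    double rlen = sqrt(r[0]*r[0] + r[1]*r[1] + r[2]*r[2]);
    for (int i = 0; i < 3; ++i) r[i] /= rlen;

    // up = right x forward (unit length, since right and forward are unit and orthogonal).
    double u[3] = { r[1]*f[2] - r[2]*f[1],
                    r[2]*f[0] - r[0]*f[2],
                    r[0]*f[1] - r[1]*f[0] };

    // Move the orbit center in the camera's screen plane.
    for (int i = 0; i < 3; ++i)
        camera_center[i] += r[i]*dx + u[i]*dy;
}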

OpenGL find distance to a point

I have a virtual landscape with the ability to walk around in first-person. I want to be able to walk up any slope if it is 45 degrees or less. As far as I know, this involves translating your current position out x units then finding the distance between the translated point and the ground. If that distance is x units or more, the user can walk there. If not, the user cannot. I have no idea how to find the distance between one point and the nearest point in the negative y direction. I have programmed this in Java3D, but I do not know how to program this in OpenGL.
Barking this problem at OpenGL is barking up the wrong tree: OpenGL's sole purpose is drawing nice pictures to the screen. It's not a math library!
Depending on your demands there are several solutions. This is how I'd tackle this problem: the normals you calculate for proper shading give you the slope at each point. Say your heightmap (= terrain) is in the XY plane and your gravity vector is g = -Z; then the normal force is terrain_normal(x,y) · g. The normal force is what "pushes" your feet against the ground. Without sufficient normal force, there's not enough friction to convey your muscles' force into movement along the ground. If you look at the normal force formula you can see that the more the angle between g and terrain_normal(x,y) deviates, the smaller the normal force.
So in your program you could simply test whether the normal force exceeds some threshold; to do it more correctly, you'd project the exerted friction force onto the terrain and use that as the acceleration vector.
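A minimal sketch of that threshold test, using the 45-degree limit from the question: with unit vectors, the slope is walkable when dot(terrain_normal, up) >= cos(45°). This follows the answer's Z-up convention; swap the index if your world is Y-up.

#include <cmath>

bool walkable(const float terrain_normal[3])  // unit-length surface normal
{
    const float cos_max_slope = std::cos(45.0f * 3.14159265f / 180.0f);  // ~0.7071
    return terrain_normal[2] >= cos_max_slope;  // dot((0,0,1), n) == n[2]
}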
If you just have a regular triangular heightmap you can use barycentric coordinates to interpolate Z values at a given (X,Y) position.
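A sketch of that interpolation for one triangle of the heightmap (Z-up, matching the answer; picking which triangle contains (x, y) is left out):

struct V3 { float x, y, z; };

// Barycentric interpolation of the height z at position (x, y) inside
// the triangle with vertices a, b, c.
float interpolateHeight(V3 a, V3 b, V3 c, float x, float y)
{
    float det = (b.y - c.y)*(a.x - c.x) + (c.x - b.x)*(a.y - c.y);
    float l1  = ((b.y - c.y)*(x - c.x) + (c.x - b.x)*(y - c.y)) / det;
    float l2  = ((c.y - a.y)*(x - c.x) + (a.x - c.x)*(y - c.y)) / det;
    float l3  = 1.0f - l1 - l2;
    return l1*a.z + l2*b.z + l3*c.z;
}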