I have gone through some similar posts for this problem, like these:
#RafazZ
OpenCV unproject 2D points to 3D with known depth `Z`
2D Coordinate to 3D world coordinate
When doing the 2D-to-3D conversion in the camera frame, they have removed the R, t matrices from the calculation and only use something like this:
x = (u - cx) / fx
y = (v - cy) / fy

actual (X, Y, Z):
X = x * depth.at(u, v)
Y = y * depth.at(u, v)
Z = depth.at(u, v)
So I want to understand: why can we drop R and t from the calculation like this when converting 2D --> 3D in the camera frame?
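For context, here is a minimal C++/OpenCV sketch of that back-projection (a sketch only, assuming a pinhole model with intrinsics fx, fy, cx, cy and a float depth map indexed as (row = v, col = u); the function name is illustrative):

#include <opencv2/core.hpp>

// Back-project a pixel (u, v) with known depth into the camera frame.
// No R or t is needed because the result is expressed in the camera's own frame;
// R and t would only come in if you then wanted the point in the world frame.
cv::Point3f backProject(int u, int v, const cv::Mat& depth,
                        float fx, float fy, float cx, float cy)
{
    float Z = depth.at<float>(v, u);   // depth map is indexed as (row, col)
    float X = (u - cx) / fx * Z;       // normalized image x times depth
    float Y = (v - cy) / fy * Z;       // normalized image y times depth
    return cv::Point3f(X, Y, Z);       // 3D point in the camera coordinate frame
}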
I'm working with SFML/C++ to make a 2D isometric game engine, and this is what I get when I do the isometric calculations:
Here is my formula for calculating isometric coordinates in my 2D engine:
For I-J coordinates I have:
x = (I - J) * (tileWidth / 2);
y = (J + I) * (tileHeight / 2);
// Totally working with classic tiles
EDIT: My problem comes from my tiles' shape, which is a cube, but I don't have a clue how to fix it. Do I really have to do some complicated maths to handle 3D objects (I would rather avoid this), or can I just change the formula a little bit?
EDIT 2: Solution: int isoY = (x + y) * (height / 4);
First, if it is a 2D engine, I wonder why there are three dimensions at all, and why and how you use z in your engine.
Assuming you want to lay out a plane of tiles in isometric projection ((x, y) in pixels) given the coordinates (I, J) in number of tiles in orthographic projection:
In that case your formulas for x and y look fine to me, provided tileWidth and tileHeight are correct (i.e. values in isometric projection), and you shouldn't need any z. A small sketch of that mapping follows.
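Here is a minimal sketch of that tile-to-pixel mapping (plain C++ with SFML's vector type; tileWidth and tileHeight follow the question, and the function name is illustrative):

#include <SFML/System/Vector2.hpp>

// Convert tile indices (I, J) to isometric pixel coordinates.
// tileWidth and tileHeight are the on-screen diamond dimensions of one flat tile.
sf::Vector2f tileToIso(int I, int J, float tileWidth, float tileHeight)
{
    float x = (I - J) * (tileWidth  / 2.0f);
    float y = (I + J) * (tileHeight / 2.0f);
    return sf::Vector2f(x, y);
}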
On the other hand, if your problem is to get the (x, y) pixel coordinates of a 3D object given its (x, y, z) Cartesian coordinates, I suggest you read this: Computing the Pixel Coordinates of a 3D Point
In case I assumed wrong, I'll edit or delete.
I have a function in my program which rotates a point (x_p, y_p, z_p) around another point (x_m, y_m, z_m) by the angles w_nx and w_ny.
The new coordinates are stored in the global variables x_n, y_n, and z_n. Rotation around the y-axis (i.e. changing the value of w_nx, so that the y values are not affected) works correctly, but as soon as I rotate around the x- or z-axis (changing the value of w_ny) the coordinates are no longer accurate. I commented on the line I think my mistake is in, but I can't figure out what's wrong with that code.
void rotate(float x_m, float y_m, float z_m, float x_p, float y_p, float z_p, float w_nx, float w_ny)
{
    float z_b = z_p - z_m;
    float x_b = x_p - x_m;
    float y_b = y_p - y_m;
    float length_ = sqrt((z_b*z_b) + (x_b*x_b) + (y_b*y_b));
    float w_bx = asin(z_b / sqrt((x_b*x_b) + (z_b*z_b))) + w_nx;
    float w_by = asin(x_b / sqrt((x_b*x_b) + (y_b*y_b))) + w_ny; // <- the fault must be here
    x_n = cos(w_bx) * sin(w_by) * length_ + x_m;
    z_n = sin(w_bx) * sin(w_by) * length_ + z_m;
    y_n = cos(w_by) * length_ + y_m;
}
What the code almost does:
compute difference vector
convert vector into spherical coordinates
add w_nx and w_ny to the inclination and azimuth angles (see link for terminology)
convert modified spherical coordinates back into Cartesian coordinates
There are two problems:
the conversion is not correct; the computation you do yields two inclination angles (one relative to the x-axis, the other relative to the y-axis)
even if the computation were correct, shifting the spherical coordinates is not the same as rotating around two axes
Therefore in this case using matrix and vector math will help:
b = p - m
b = RotationMatrixAroundX(wn_x) * b
b = RotationMatrixAroundY(wn_y) * b
n = m + b
See basic rotation matrices for the definitions of RotationMatrixAroundX and RotationMatrixAroundY; a sketch follows below.
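A minimal sketch of that approach in plain C++ (the helper names mirror the pseudocode above and are illustrative, not a particular library's API):

#include <cmath>

struct Vec3 { float x, y, z; };

// Basic rotation about the x-axis by angle a (radians).
Vec3 rotateAroundX(const Vec3& v, float a)
{
    return { v.x,
             v.y * std::cos(a) - v.z * std::sin(a),
             v.y * std::sin(a) + v.z * std::cos(a) };
}

// Basic rotation about the y-axis by angle a (radians).
Vec3 rotateAroundY(const Vec3& v, float a)
{
    return {  v.x * std::cos(a) + v.z * std::sin(a),
              v.y,
             -v.x * std::sin(a) + v.z * std::cos(a) };
}

// Rotate point p around point m by the two angles.
Vec3 rotate(const Vec3& p, const Vec3& m, float w_nx, float w_ny)
{
    Vec3 b = { p.x - m.x, p.y - m.y, p.z - m.z };   // b = p - m
    b = rotateAroundX(b, w_nx);                     // b = RotationMatrixAroundX(w_nx) * b
    b = rotateAroundY(b, w_ny);                     // b = RotationMatrixAroundY(w_ny) * b
    return { m.x + b.x, m.y + b.y, m.z + b.z };     // n = m + b
}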
Try to use vector math. Decide in which order you rotate; first around x, then around y, perhaps.
If you rotate around the z-axis [z' = z]:
x' = x*cos a - y*sin a;
y' = x*sin a + y*cos a;
The same repeated for the y-axis [y'' = y']:
x'' = x'*cos b - z' * sin b;
z'' = x'*sin b + z' * cos b;
Again, rotating around the x-axis [x''' = x'']:
y''' = y'' * cos c - z'' * sin c
z''' = y'' * sin c + z'' * cos c
And finally the question of rotating around some specific "point":
First, subtract the point from the coordinates, then apply the rotations, and finally add the point back to the result; a sketch of these steps follows.
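Here is a compact sketch that applies exactly these three steps, using this answer's formulas and sign conventions as written, after subtracting and later re-adding the rotation center (angles in radians; the function name is illustrative):

#include <cmath>

// Rotate (x, y, z) around the center (cx, cy, cz): first around z by a,
// then around y by b, then around x by c, using the formulas above.
void rotatePoint(float& x, float& y, float& z,
                 float cx, float cy, float cz,
                 float a, float b, float c)
{
    x -= cx; y -= cy; z -= cz;                        // move the center to the origin

    float x1 = x * std::cos(a) - y * std::sin(a);     // rotate around z (z' = z)
    float y1 = x * std::sin(a) + y * std::cos(a);
    float z1 = z;

    float x2 = x1 * std::cos(b) - z1 * std::sin(b);   // rotate around y (y'' = y')
    float z2 = x1 * std::sin(b) + z1 * std::cos(b);
    float y2 = y1;

    float y3 = y2 * std::cos(c) - z2 * std::sin(c);   // rotate around x (x''' = x'')
    float z3 = y2 * std::sin(c) + z2 * std::cos(c);

    x = x2 + cx; y = y3 + cy; z = z3 + cz;            // move the center back
}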
The problem, as far as I can see, is a close relative of "gimbal lock". The angle w_ny cannot be measured relative to the fixed xyz coordinate system, but only relative to the coordinate system that results from applying the angle w_nx.
As kakTuZ observed, your code converts the point to spherical coordinates. There's nothing inherently wrong with that -- with longitude and latitude, one can reach every place on Earth. And if one doesn't care about tilting the Earth's equatorial plane relative to its trajectory around the Sun, that's fine with me.
The result of not rotating the second reference axis along with the first rotation is that two points that are 1 km apart at the equator move closer to each other towards the poles, and at a latitude of 90 degrees they touch, even though the apparent purpose is to keep them 1 km apart wherever they are rotated.
If you want to transform coordinate systems rather than only points, you need 3 angles. But you are right - for transforming points, 2 angles are enough. For details, ask Wikipedia...
But when you work with OpenGL you really should use OpenGL functions like glRotatef. These functions are handled by the OpenGL pipeline rather than by your own function on the CPU. The doc is here.
Like many others have said, you should use glRotatef to rotate it for rendering. For collision handling, you can obtain its world-space position by multiplying its position vector by the OpenGL ModelView matrix on top of the stack at the point where it is rendered. Obtain that matrix with glGetFloatv, and then multiply it with either your own vector-matrix multiplication function or one of the many you can find online; a short sketch of that approach follows.
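A minimal sketch of that idea with legacy (fixed-function) OpenGL, assuming the object's ModelView matrix is currently on top of the stack; the matrix-vector multiply is hand-rolled here purely for illustration:

#include <GL/gl.h>

// Transform an object-space point (homogeneous, in[4]) by the current
// ModelView matrix. OpenGL stores matrices in column-major order.
void transformByModelView(const GLfloat in[4], GLfloat out[4])
{
    GLfloat m[16];
    glGetFloatv(GL_MODELVIEW_MATRIX, m);   // read the matrix on top of the stack

    for (int row = 0; row < 4; ++row)
    {
        out[row] = m[row + 0]  * in[0]
                 + m[row + 4]  * in[1]
                 + m[row + 8]  * in[2]
                 + m[row + 12] * in[3];
    }
}

Note that the ModelView matrix also contains the view transform, so strictly speaking this gives eye-space coordinates; if the view matrix is identity at that point, eye space and world space coincide.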
But, that would be a pain! Instead, look into using the GL feedback buffer. This buffer will simply store the points where the primitive would have been drawn instead of actually drawing the primitive, and then you can access them from there.
This is a good starting point.
I have a point in 3D space and two angles, I want to calculate the resulting line from this information. I have found how to do this with 2D lines, but not 3D. How can this be calculated?
If it helps: I'm using C++ & OpenGL and have the location of the user's mouse click and the angle of the camera, I want to trace this line for intersections.
In trig terms, two angles and a point are required to define a line in 3D space. Converting that to (x, y, z) is just a spherical-to-Cartesian coordinate conversion; the equations are:
x = r sin(q) cos(f)
y = r sin(q) sin(f)
z = r cos(q)
where r is the distance from the point P to the origin; q (the zenith angle) is the angle between the line OP and the positive polar axis (which can be thought of as the z-axis); and f (the azimuth angle) is the angle between the initial ray and the projection of OP onto the equatorial plane (usually measured from the x-axis).
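A minimal C++ sketch of that conversion (angles in radians; variable names follow the formulas above):

#include <cmath>

// Convert spherical coordinates (r, zenith q, azimuth f) to Cartesian (x, y, z).
void sphericalToCartesian(float r, float q, float f,
                          float& x, float& y, float& z)
{
    x = r * std::sin(q) * std::cos(f);
    y = r * std::sin(q) * std::sin(f);
    z = r * std::cos(q);
}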
Edit:
Okay that was the first part of what you ask. The rest of it, the real question after the updates to the question, is much more complicated than just creating a line from 2 angles and a point in 3d space. This involves using a camera-to-world transformation matrix and was covered in other SO questions. For convenience here's one: How does one convert world coordinates to camera coordinates? The answers cover converting from world-to-camera and camera-to-world.
The line can be thought of as a point moving in "time". The equation must be vectorized, i.e. have a direction, to make sense, so time is a natural way to think of it. An equation of a line in 3 dimensions can then be written as three equations relating x, y, z to time, such as:
x = ax*t + cx
y = ay*t + cy
z = az*t + cz
To find that set of equations, assuming the camera is at origin, (0,0,0), and your point is (x1,y1,z1) then
ax = x1 - 0
ay = y1 - 0
az = z1 - 0
cx = cy = cz = 0
so
x = x1*t
y = y1*t
z = z1*t
Note: this also assumes that the "speed" of the line or vector is such that it is at your point (x1,y1,z1) after 1 second.
So to draw that line, just evaluate the points as finely as you like for as long as required. For example, sampling every 1/1000 of a second for 10 seconds draws a "line" (really a series of points that, seen from a distance, appear as a line) covering 10 seconds' worth of distance, determined by the "speed" you choose. A sketch of that sampling follows.
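A minimal C++ sketch of that sampling (the camera sits at the origin and the clicked point is (x1, y1, z1), as above; the struct and function names are illustrative):

#include <vector>

struct Point3 { float x, y, z; };

// Sample the parametric line x = x1*t, y = y1*t, z = z1*t at 'steps' values of t
// between 0 and tMax, producing a dense series of points that reads as a line.
std::vector<Point3> sampleRay(float x1, float y1, float z1, float tMax, int steps)
{
    std::vector<Point3> points;
    points.reserve(steps + 1);
    for (int i = 0; i <= steps; ++i)
    {
        float t = tMax * static_cast<float>(i) / static_cast<float>(steps);
        points.push_back({ x1 * t, y1 * t, z1 * t });
    }
    return points;
}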
I am doing camera calibration with the Tsai algorithm. I got the intrinsic and extrinsic matrices, but how can I reconstruct the 3D coordinates from that information?
1) I can use Gaussian elimination to find X, Y, Z, W, and then the points will be X/W, Y/W, Z/W as a homogeneous system.
2) I can use the OpenCV documentation approach:
since I know u, v, R, t, I can compute X, Y, Z.
However, both methods give different results, and neither is correct.
What am I doing wrong?
If you have the extrinsic parameters then you have everything. That means you can build a homography from the extrinsics (also called the camera pose). The pose is a 3x4 matrix, the homography is a 3x3 matrix, with H defined as
H = K*[r1, r2, t], //eqn 8.1, Hartley and Zisserman
with K being the camera intrinsic matrix, r1 and r2 the first two columns of the rotation matrix R, and t the translation vector.
Then normalize by dividing everything by t3.
What happens to column r3, don't we use it? No, because it is redundant: it is the cross product of the first two columns of the pose.
Now that you have the homography, project the points. Your 2D points are (x, y). Append z = 1 so they become 3D, then project them as follows:
p = [x y 1];
projection = H * p;                    //project
projnorm = projection / projection(3); //normalize by the third component
Hope this helps.
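Here is a hedged OpenCV/C++ sketch of those steps, assuming K is 3x3, R is 3x3, and t is 3x1, all of type CV_64F (the function name and exact layout are assumptions, not from the original post):

#include <opencv2/core.hpp>

// Build H = K * [r1, r2, t], normalize it by t3, lift the 2D point to
// homogeneous form, apply H, and normalize the result by its third component.
cv::Mat applyHomography(double x, double y,
                        const cv::Mat& K, const cv::Mat& R, const cv::Mat& t)
{
    cv::Mat cols[3] = { R.col(0), R.col(1), t };   // r1, r2, t
    cv::Mat pose;
    cv::hconcat(cols, 3, pose);                    // 3x3 matrix [r1, r2, t]

    cv::Mat H = K * pose;                          // eqn 8.1, Hartley and Zisserman
    H /= t.at<double>(2, 0);                       // normalize by t3

    cv::Mat p = (cv::Mat_<double>(3, 1) << x, y, 1.0);   // add z = 1
    cv::Mat projection = H * p;                           // project
    return projection / projection.at<double>(2, 0);      // normalize
}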
As nicely stated in the comments above, projecting 2D image coordinates into 3D "camera space" inherently requires making up the z coordinates, as this information is totally lost in the image. One solution is to assign a dummy value (z = 1) to each of the 2D image space points before projection as answered by Jav_Rock.
p = [x y 1];
projection = H * p;                    //project
projnorm = projection / projection(3); //normalize by the third component
One interesting alternative to this dummy solution is to train a model to predict the depth of each point prior to reprojection into 3D camera space. I tried this method and had a high degree of success using a PyTorch CNN trained on 3D bounding boxes from the KITTI dataset. I would be happy to provide code, but it would be a bit lengthy to post here.
(This is all in ortho mode; the origin is in the top-left corner, x is positive to the right, and y is positive going down the y-axis.)
I have a rectangle in world space, which can have a rotation m_rotation (in degrees).
I can work with the rectangle fine, it rotates, scales, everything you could want it to do.
The part that I am getting really confused about is calculating the rectangle's world coordinates from its local coordinates.
I've been trying to use the formula:
x' = x*cos(t) - y*sin(t)
y' = x*sin(t) + y*cos(t)
where (x, y) are the original points,
(x', y') are the rotated coordinates,
and t is the angle measured in radians
from the x-axis. The rotation is
counter-clockwise as written.
-credits duffymo
I tried implementing the formula like this:
//GLfloat Ax = getLocalVertices()[BOTTOM_LEFT].x * cosf(DEG_TO_RAD( m_orientation )) - getLocalVertices()[BOTTOM_LEFT].y * sinf(DEG_TO_RAD( m_orientation ));
//GLfloat Ay = getLocalVertices()[BOTTOM_LEFT].x * sinf(DEG_TO_RAD( m_orientation )) + getLocalVertices()[BOTTOM_LEFT].y * cosf(DEG_TO_RAD( m_orientation ));
//Vector3D BL = Vector3D(Ax,Ay,0);
I create a vector to the translated point and store it in the rectangle's world_vertice member variable. That's fine. However, in my main draw loop, I draw a line from (0,0,0) to the vector BL, and it seems as if the line goes in a circle from the point on the rectangle (the rectangle's bottom-left corner) around the origin of the world coordinate system.
Basically, as m_orientation gets bigger, it draws a huge circle around the (0,0,0) world-coordinate origin. Edit: when m_orientation = 360, it gets set back to 0.
I feel like I am doing this part wrong:
and t is the angle measured in radians
from the x-axis.
Possibly I am not supposed to use m_orientation (the rectangle's rotation angle) in this formula?
Thanks!
Edit: the reason I am doing this is collision detection. I need to know where the coordinates of the rectangles (soon to be rigid bodies) lie in world-coordinate space for collision detection.
What you are doing is a rotation (a special linear transformation) of a vector by an angle Q in 2D. It keeps the vector's length and changes its direction around the origin.
(A linear transformation is additive, L(m + n) = L(m) + L(n) for vectors m and n, and homogeneous, L(k·m) = k·L(m) for a vector m and a scalar k.) So:
You split your vector into two pieces, so that m[1, 0] + n[0, 1] = your vector.
Then, as you see in the image, the rotation is applied to these two pieces, after which your vector takes the form:
m[cosQ, sinQ] + n[-sinQ, cosQ] = [m·cosQ - n·sinQ, m·sinQ + n·cosQ]
You can also look at the Wikipedia article on rotation.
If you are trying to obtain the eye coordinates corresponding to your object coordinates, you should multiply your object coordinates by the model-view matrix in OpenGL.
With M the model-view matrix and [x y z w]^T your object coordinates as a column vector, you compute:
M [x y z w]^T = eye coordinates of [x y z w]^T
This seems to be overcomplicating things somewhat: typically you would store an object's world position and orientation separately from its own set of local coordinates. Rotating the object is done in model space, so the position is unchanged. The world position of each coordinate is the same whether you rotate or not - add the world position to the local position to translate the local coordinates to world space.
Any rotation occurs around a specific origin, and the typical sin/cos formula presumes (0,0) is your origin. If the coordinate system in use doesn't currently have (0,0) as the origin, you must translate to one that does, perform the rotation, then transform back. Usually model space is defined so that (0,0) is the origin for the model, making this step trivial; a short sketch of that local-to-world step follows.
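A minimal sketch of that local-to-world conversion for a single corner (plain C++; the struct and function names are illustrative, not the poster's actual code):

#include <cmath>

struct Vec2 { float x, y; };

// Rotate a local (model-space) corner around the model origin (0, 0),
// then translate it by the object's world position.
Vec2 localToWorld(const Vec2& local, const Vec2& worldPos, float angleDegrees)
{
    float t = angleDegrees * 3.14159265f / 180.0f;   // degrees -> radians
    Vec2 rotated = { local.x * std::cos(t) - local.y * std::sin(t),
                     local.x * std::sin(t) + local.y * std::cos(t) };
    return { rotated.x + worldPos.x, rotated.y + worldPos.y };
}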