I'm creating a 360° image player using the Oculus Rift SDK.
The scene is composed of a cube, and the camera is placed at its center with only the ability to rotate in yaw, pitch and roll.
I've drawn the object with OpenGL, using a 2D texture for each of the cube's faces to create the 360° effect.
I would like to find the portion of the original texture that is actually shown in the Oculus viewport at a given instant.
Up to now, my approach has been to try to find an approximate pixel position of some significant points of the viewport (e.g. the center and the corners) using Euler angles, in order to identify the corresponding areas in the original textures.
Considering all the problems that come with Euler angles, this does not seem like the smartest way to do it.
Is there a better approach to accomplish this?
Edit
I wrote a small example that can be run in the render loop:
// Keep the orientation from Oculus (Point 1)
OVR::Matrix4f rotation = OVR::Matrix4f(hmdState.HeadPose.ThePose);
// Find the vector with respect to a certain point in the viewport, in this case the center (Point 2)
FovPort fov_viewport = FovPort::CreateFromRadians(hmdDesc.CameraFrustumHFovInRadians, hmdDesc.CameraFrustumVFovInRadians);
Vector2f temp2f = fov_viewport.TanAngleToRendertargetNDC(Vector2f(0.0f, 0.0f)); // these values are the tangents at the center
Vector3f vector_view = Vector3f(temp2f.x, temp2f.y, -1.0f); // just add the third component: the direction the viewport is facing
vector_view.Normalize();
// Apply the rotation (Point 3)
Vector3f final_vect = rotation.Transform(vector_view); // seems to be the right operation
// An example to check whether we are looking at the front face (partial Point 4)
if (fabs(final_vect.z) > fabs(final_vect.x) && fabs(final_vect.z) > fabs(final_vect.y) && final_vect.z < 0) {
    system("pause");
}
Is it right to consider the entire viewport, or should this be done separately for each eye?
How can a point of the viewport other than the center be specified? I don't really understand which values should be the input of TanAngleToRendertargetNDC().
You can get a full rotation matrix by passing the camera pose quaternion to the OVR::Matrix4 constructor.
You can take any 2D position in the eye viewport and convert it to its camera space 3D coordinate by using the fovPort tan angles. Normalize it and you get the direction vector in camera space for this pixel.
If you apply the rotation matrix obtained earlier to this direction vector, you get the actual direction of that ray in world space.
Now you have to convert from this direction to your texture UV. The component with the highest absolute value in the direction vector will give you the face of the cube it's looking at. The remaining components can be used to find the actual 2D location on the texture. This depends on how your cube faces are oriented, if they are x-flipped, etc.
If you are at the rendering part of the viewer, you will want to do this in a shader. If this is to find where the user is looking in the original image, or the extent of their field of view, then only a handful of rays will suffice, as you wrote.
Edit
Here is a bit of code to go from tan angles to camera space coordinates.
float u = (x / eyeWidth) * (leftTan + rightTan) - leftTan;
float v = (y / eyeHeight) * (upTan + downTan) - upTan;
float w = 1.0f;
x and y are pixel coordinates, eyeWidth and eyeHeight are the eye buffer size, and the *Tan variables are the fovPort values. I first express the pixel coordinate in the [0..1] range, then scale it by the total tan angle for that direction, and then recenter.
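To make the whole chain concrete, here is a minimal self-contained sketch in plain C++ (no SDK types; the struct and function names, the face enumeration and the row-major rotation layout are all my own assumptions) that goes from a pixel in one eye buffer to a cube face and an approximate UV:

#include <cmath>

// All names here (EyeFov, Vec3, pixelToCubeFace, the face constants) are mine, not SDK types;
// the head rotation is assumed to be a row-major 3x3 matrix.
struct EyeFov { float upTan, downTan, leftTan, rightTan; }; // fovPort-style tangents
struct Vec3   { float x, y, z; };
enum Face { FRONT, BACK, RIGHT, LEFT, TOP, BOTTOM };

void pixelToCubeFace(float px, float py, float eyeWidth, float eyeHeight,
                     const EyeFov& fov, const float rot[3][3],
                     Face& face, float& u, float& v)
{
    // 1. Pixel -> tan angles (same formula as above) -> camera-space direction (-Z forward).
    float tx = (px / eyeWidth)  * (fov.leftTan + fov.rightTan) - fov.leftTan;
    float ty = (py / eyeHeight) * (fov.upTan   + fov.downTan)  - fov.upTan;
    Vec3 d = { tx, -ty, -1.0f };   // y flipped because pixel rows grow downward

    // 2. Rotate into world space with the head orientation.
    Vec3 w = { rot[0][0]*d.x + rot[0][1]*d.y + rot[0][2]*d.z,
               rot[1][0]*d.x + rot[1][1]*d.y + rot[1][2]*d.z,
               rot[2][0]*d.x + rot[2][1]*d.y + rot[2][2]*d.z };

    // 3. The dominant axis picks the cube face; the remaining components, divided by the
    //    dominant one, land in [-1..1] on that face.
    float ax = std::fabs(w.x), ay = std::fabs(w.y), az = std::fabs(w.z);
    if (az >= ax && az >= ay) { face = (w.z < 0) ? FRONT : BACK;   u = w.x / az; v = w.y / az; }
    else if (ax >= ay)        { face = (w.x > 0) ? RIGHT : LEFT;   u = w.z / ax; v = w.y / ax; }
    else                      { face = (w.y > 0) ? TOP   : BOTTOM; u = w.x / ay; v = w.z / ay; }
    u = 0.5f * (u + 1.0f);         // map [-1..1] to [0..1]
    v = 0.5f * (v + 1.0f);
}

The per-face sign flips at the end are the part you would adjust to match how your six textures are actually mapped onto the cube, as noted above.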
Related
I get an image point in the left camera (pointL) and the corresponding image point in the right camera (pointR) of my stereo camera using feature matching. The two cameras are parallel and at the same height; there is only an x-translation between them.
I also know the projection matrices for each camera (projL, projR), which I got during calibration using initUndistortRectifyMap.
For triangulating the point, I call:
triangulatePoints(projL, projR, pointL, pointR, pos3D), where pos3D is the output 3D position of the object.
Now, I want to project the 3D-coordinates to the 2D-image of the left camera:
2Dpos = projL*3dPos
The resulting x-coordinate is correct, but the y-coordinate is off by about 20 pixels.
How can I fix this?
Edit:
Of course, I need to use homogeneous coordinates in order to multiply by the projection matrix (3x4). For that reason, I set:
3dPos[0] = x;
3dPos[1] = y;
3dPos[2] = z;
3dPos[3] = 1;
Is it wrong to set 3dPos[3] to 1?
Note:
All images are remapped; I do this in a preprocessing step.
Of course, I always use homogeneous coordinates.
You are likely projecting into the rectified camera. You need to apply the inverse of the rectification warp to obtain the point in the original (undistorted) linear camera coordinates, and then apply the distortion to get into the original image.
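A hedged sketch of that correction with OpenCV, assuming the projection matrices came from stereoRectify so that you also have the left rectification rotation R1, the original camera matrix K_left and the distortion coefficients distL (the variable and function names here are mine):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// pos3D is the triangulated point in the rectified left camera frame (after dividing by w).
// R1: 3x3 rectification rotation from stereoRectify; K_left/distL: original left intrinsics.
cv::Point2f projectToOriginalLeftImage(const cv::Point3d& pos3D,
                                       const cv::Mat& R1,
                                       const cv::Mat& K_left,
                                       const cv::Mat& distL)
{
    // Undo the rectification rotation: rectified frame -> original left camera frame.
    cv::Mat p = R1.t() * (cv::Mat_<double>(3, 1) << pos3D.x, pos3D.y, pos3D.z);

    // Project into the original (distorted) left image; rvec = tvec = 0 because the
    // point is already expressed in the left camera frame.
    std::vector<cv::Point3d> obj = { cv::Point3d(p.at<double>(0), p.at<double>(1), p.at<double>(2)) };
    std::vector<cv::Point2d> img;
    cv::projectPoints(obj, cv::Vec3d(0, 0, 0), cv::Vec3d(0, 0, 0), K_left, distL, img);
    return cv::Point2f((float)img[0].x, (float)img[0].y);
}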
I have a geometry with its vertices in Cartesian coordinates. These Cartesian coordinates are ECEF (Earth-centred, Earth-fixed) coordinates. The geometry actually lies on an ellipsoidal model of the Earth using WGS84 coordinates. The Cartesian coordinates were obtained by converting the set of latitudes and longitudes along which the geometries lie, but I no longer have access to them. What I have is an axis-aligned bounding box with xmax, ymax, zmax and xmin, ymin, zmin, obtained by parsing the Cartesian coordinates (there is obviously no Cartesian point of the geometry at xmax, ymax, zmax or xmin, ymin, zmin; the bounding box is just a cuboid enclosing the geometry).
What I want to do is calculate the camera distance in an overview mode such that this geometry's bounding box perfectly fits the camera frustum.
I am not very clear on the approach to take here. A method like using a local-to-world matrix comes to mind, but it is not very clear to me.
@Specktre: I referred to your suggestions on shifting points in 3D, and that led me to another, improved solution, though still not perfect.
Compute a matrix that can transform from ECEF to ENU. Refer to this: http://www.navipedia.net/index.php/Transformations_between_ECEF_and_ENU_coordinates
Rotate all eight corners of my original bounding box using this matrix.
Compute a new bounding box by finding the min and max of x, y, z of these rotated points.
Compute the distance:
cameraDistance1 = ((newbb.ymax - newbb.ymin)/2)/tan(fov/2)
cameraDistance2 = ((newbb.xmax - newbb.xmin)/2)/(tan(fov/2)*aspectRatio)
cameraDistance = max(cameraDistance1, cameraDistance2)
This time I had to use the aspect ratio along x, as I had previously expected, since in my application the fov is along y. Although this works almost accurately, there is still a small bug, I guess. I am not very sure it is a good idea to generate a new bounding box. Maybe it is more accurate to identify two points, point1(xmax, ymin, zmax) and point2(xmax, ymax, zmax), in the original bounding box, find their values after multiplying with the matrix, and then do (point2 - point1).length(). Similarly for y. Would that be more accurate?
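For reference, here is a compact sketch of steps 1-4 above in plain C++ (the names are mine, and it assumes you can recover a reference latitude/longitude in radians, e.g. from the centre of the box):

#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };

// ECEF -> ENU rotation for a reference latitude/longitude in radians
// (rows are East, North, Up), as on the Navipedia page linked above.
void ecefToEnuRotation(double lat, double lon, double R[3][3]) {
    double sl = std::sin(lat), cl = std::cos(lat);
    double so = std::sin(lon), co = std::cos(lon);
    R[0][0] = -so;      R[0][1] =  co;      R[0][2] = 0.0; // East
    R[1][0] = -sl * co; R[1][1] = -sl * so; R[1][2] = cl;  // North
    R[2][0] =  cl * co; R[2][1] =  cl * so; R[2][2] = sl;  // Up
}

// corners: the 8 corners of the original ECEF bounding box.
// fovY: vertical field of view in radians, aspect = width / height.
double cameraDistanceForBox(const Vec3 corners[8], double lat, double lon,
                            double fovY, double aspect) {
    double R[3][3];
    ecefToEnuRotation(lat, lon, R);

    double mn[3] = {  1e300,  1e300,  1e300 };
    double mx[3] = { -1e300, -1e300, -1e300 };
    for (int i = 0; i < 8; ++i) {                       // rotate each corner, track min/max
        double p[3] = { corners[i].x, corners[i].y, corners[i].z };
        for (int r = 0; r < 3; ++r) {
            double q = R[r][0] * p[0] + R[r][1] * p[1] + R[r][2] * p[2];
            mn[r] = std::min(mn[r], q);
            mx[r] = std::max(mx[r], q);
        }
    }
    // Same distance formulas as above: fit the y extent to the vertical fov,
    // the x extent to the horizontal one (tan(fov/2) * aspect).
    double d1 = ((mx[1] - mn[1]) / 2.0) /  std::tan(fovY / 2.0);
    double d2 = ((mx[0] - mn[0]) / 2.0) / (std::tan(fovY / 2.0) * aspect);
    return std::max(d1, d2);
}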
transform matrix
The first thing to understand is that a transform matrix represents a coordinate system. Look here: Transform matrix anatomy, for another example.
In standard OpenGL notation, if you use the direct matrix, then you are converting from the matrix's local coordinate space (LCS) to the world global coordinate space (GCS). If you use the inverse matrix, then you are converting coordinates from GCS to LCS.
camera matrix
The camera matrix converts to camera space, so you need the inverse matrix. You get the camera matrix like this:
camera=inverse(camera_space_matrix)
Now, for info on how to construct your camera_space_matrix so it fits the bounding box, look here:
Frustrum distance computation
So compute the midpoint of the top rectangle of your box, and compute the camera distance as the max of the distances computed from all vertices of the box, so:
camera position = midpoint + distance*midpoint_normal
The orientation depends on your projection matrix. If you use gluPerspective, then you are viewing along -Z or +Z, according to the selected glDepthFunc. So set the Z axis of the matrix to the normal, and the Y, X vectors can be aligned to North/South and East/West, so for example:
Y = Z x (1,0,0)
X = Z x Y
Now put the position and the axis vectors X, Y, Z inside the matrix, compute the inverse matrix, and that is it.
[Notes]
Do not forget that the FOV can have different angles for the X and Y axes (aspect ratio).
The normal is just midpoint - Earth center, which is (0,0,0), so the normal is also just the midpoint. Normalize it to length 1.0.
For all computations use the Cartesian world GCS (global coordinate system).
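A hedged sketch of the whole construction using GLM (the function name, and the assumptions that GLM is available and that distance is the frustum-fit distance from the link above, are mine):

#include <glm/glm.hpp>

// midpoint: centre of the top rectangle of the bounding box (ECEF / GCS),
// distance: the camera distance computed from the frustum fit above.
glm::mat4 buildViewMatrix(const glm::vec3& midpoint, float distance)
{
    glm::vec3 Z = glm::normalize(midpoint);                           // normal = midpoint - (0,0,0), normalized
    glm::vec3 Y = glm::normalize(glm::cross(Z, glm::vec3(1, 0, 0)));  // roughly North/South
    glm::vec3 X = glm::normalize(glm::cross(Z, Y));                   // East/West, per the answer above
    glm::vec3 pos = midpoint + distance * Z;                          // camera position = midpoint + distance*normal

    // Camera space matrix: columns are the axis vectors, last column is the position.
    glm::mat4 camSpace(1.0f);
    camSpace[0] = glm::vec4(X, 0.0f);
    camSpace[1] = glm::vec4(Y, 0.0f);
    camSpace[2] = glm::vec4(Z, 0.0f);
    camSpace[3] = glm::vec4(pos, 1.0f);

    return glm::inverse(camSpace);    // view (camera) matrix = inverse of the camera space matrix
}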
Does anyone know how to project a set of 3D points onto a virtual image plane in OpenCV (C++)?
Thank you
First you need to have your transformation matrix defined (rotation, translation, etc.) to map the 3D space to the 2D virtual image plane; then just multiply your 3D point coordinates (x, y, z) by the matrix to get the 2D coordinates in the image.
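If the virtual camera is a simple pinhole camera, OpenCV's projectPoints does exactly that multiplication (plus optional distortion) for you; a minimal sketch with made-up intrinsics:

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <cstdio>
#include <vector>

int main()
{
    // 3D points in world coordinates.
    std::vector<cv::Point3f> objectPoints = { {0.f, 0.f, 5.f}, {1.f, -1.f, 4.f} };

    // Hypothetical intrinsics of the virtual camera: focal length 800 px, principal point (320, 240).
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                           0, 800, 240,
                                           0,   0,   1);
    cv::Mat dist = cv::Mat::zeros(5, 1, CV_64F);   // no distortion for a virtual camera

    // Extrinsics: rotation (as a Rodrigues vector) and translation of the world w.r.t. the camera.
    cv::Vec3d rvec(0, 0, 0), tvec(0, 0, 0);

    std::vector<cv::Point2f> imagePoints;
    cv::projectPoints(objectPoints, rvec, tvec, K, dist, imagePoints);

    for (const auto& p : imagePoints)
        std::printf("(%.1f, %.1f)\n", p.x, p.y);
    return 0;
}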
Registration (OpenNI 2) or the alternative-viewpoint capability (OpenNI 1.5) does indeed help to align depth with RGB using a single line of code. The price you pay is that you cannot really restore exact X, Y point locations in 3D space, since the row and column are moved after alignment.
Sometimes you need not only Z but also exact X and Y, plus you want the alignment of depth and RGB. Then you have to align RGB to depth. Note that this alignment is not supported by Kinect/OpenNI. The price you pay for this is that there are no RGB values in the locations where depth is undefined.
If one knows the extrinsic parameters, that is, the rotation and translation of the depth camera relative to the color one, then alignment is just a matter of creating an alternative viewpoint: restore 3D from depth, and then look at your point cloud from the point of view of the color camera, that is, apply the inverse rotation and translation. For example, moving the camera to the right is like moving the world (points) to the left. Reproject 3D into 2D and interpolate if needed. This is really easy and is just the inverse of 3D reconstruction; below, Cx is close to w/2 and Cy to h/2:
col = focal*X/Z+Cx
row = -focal*Y/Z+Cy // this is because row in the image increases downward
A proper but more expensive way to get a nice depth map after point cloud rotation is to trace rays from each pixel until they intersect the point cloud or come sufficiently close to one of its points. This way you will have fewer holes in your depth map due to sampling artifacts.
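For illustration, a minimal sketch of the reprojection loop described above, assuming the point cloud has already been moved into the color camera's frame (the names and the nearest-point z-buffering are my own):

#include <cmath>
#include <vector>

struct Point3 { float X, Y, Z; };

// Reproject a point cloud (already expressed in the color camera frame) into a
// w x h depth image aligned with the RGB image, keeping the nearest point per pixel.
std::vector<float> reprojectToColorCamera(const std::vector<Point3>& cloud,
                                          int w, int h, float focal)
{
    const float Cx = w / 2.0f, Cy = h / 2.0f;   // principal point approximated by the image centre
    std::vector<float> depth(w * h, 0.0f);      // 0 = no depth (hole)

    for (const Point3& p : cloud) {
        if (p.Z <= 0.0f) continue;              // behind the camera
        int col = (int)std::lround( focal * p.X / p.Z + Cx);
        int row = (int)std::lround(-focal * p.Y / p.Z + Cy);   // row grows downward
        if (col < 0 || col >= w || row < 0 || row >= h) continue;
        float& d = depth[row * w + col];
        if (d == 0.0f || p.Z < d) d = p.Z;      // keep the closest point per pixel
    }
    return depth;                               // remaining holes can be filled by interpolation
}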
In OpenGL I'm trying to create a free-flight camera. My problem is the rotation about the Y axis. The camera should always be rotated about the world Y axis, not about its local axis. I have tried several matrix multiplications, but all without success. With
camMatrix = camMatrix * yrotMatrix
rotates the camera around its local axis, and with
camMatrix = yrotMatrix * camMatrix
rotates the camera around the world axis, but always around the origin. However, the rotation center should be the camera. Does somebody have an idea?
One of the more tricky aspects of 3D programming is getting complex transformations right.
In OpenGL, every point is transformed with the model/view matrix and then with the projection matrix.
The model/view matrix takes each point and translates it to where it should be from the point of view of the camera. The projection matrix converts the point's coordinates so that the X and Y coordinates can be mapped to the window easily.
To get the model/view matrix right, you have to start with an identity matrix (one that doesn't change the vertices), then apply the transforms for the camera's position and orientation, then those for the object's position and orientation, in reverse order.
Another thing you need to keep in mind is, rotations are always about an axis that is centered on the origin (0,0,0). So when you apply a rotate transform for the camera, whether you are turning it (as you would turn your head) or orbiting it around the origin (as the Earth orbits the Sun) depends on whether you have previously applied a translation transform.
So if you want to both rotate and orbit the camera, you need to:
Apply the rotation(s) to orient the camera
Apply translation(s) to position it
Apply rotation(s) to orbit the camera round the origin
(Optionally) apply translation(s) to shift the orbit centre so the camera orbits around a point other than (0,0,0); a sketch of this order follows this list.
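Here is a hedged sketch of that order in legacy fixed-function OpenGL; all the angle and position parameters are hypothetical placeholders (angles in degrees):

#include <GL/gl.h>

void applyCameraTransform(float pitchDeg, float yawDeg, float camDistance,
                          float orbitDeg, float targetX, float targetY, float targetZ)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glRotatef(pitchDeg, 1.0f, 0.0f, 0.0f);       // 1. orient the camera (turn its head)
    glRotatef(yawDeg,   0.0f, 1.0f, 0.0f);
    glTranslatef(0.0f, 0.0f, -camDistance);      // 2. position it, here at a distance along its view axis
    glRotatef(orbitDeg, 0.0f, 1.0f, 0.0f);       // 3. orbit it around the origin
    glTranslatef(-targetX, -targetY, -targetZ);  // 4. shift the orbit centre away from (0,0,0)
    // ...then apply the object transforms and draw.
}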
Things can get more complex if you, say, want to point the camera at a point that is not (0,0,0) and also orbit that point at a set distance, while also being able to pitch or yaw the camera. See here for an example in WebGL. Look for GLViewerBase.prototype.display.
The Red Book covers transforms in much more detail.
Also note gluLookAt, which you can use to point the camera at something, without having to use rotations.
Rather than doing this using matrices, you might find it easier to create a camera class which stores a position and orthonormal n, u and v axes, and rotate them appropriately, e.g. see:
https://github.com/sgolodetz/hesperus2/blob/master/Shipwreck/MapEditor/GUI/Camera.java
and
https://github.com/sgolodetz/hesperus2/blob/master/Shipwreck/MapEditor/Math/MathUtil.java
Then you write things like:
if(m_keysDown[TURN_LEFT])
{
m_camera.rotate(new Vector3d(0,0,1), deltaAngle);
}
When it comes time to set the view for the camera, you do:
gl.glLoadIdentity();
glu.gluLookAt(m_position.x, m_position.y, m_position.z,
m_position.x + m_nVector.x, m_position.y + m_nVector.y, m_position.z + m_nVector.z,
m_vVector.x, m_vVector.y, m_vVector.z);
If you're wondering how to rotate about an arbitrary axis like (0,0,1), see MathUtil.rotate_about_axis in the above code.
If you don't want to transform based on the camera from the previous frame, my suggestion would be to throw out the matrix compounding and recalculate the matrix every frame. I don't think there's a way to do what you want with a single compounded matrix, as it stores the translation and rotation together.
I guess if you want a pitch/yaw camera only, just store those two values as floats and then rebuild the matrix from them every frame. Maybe something like this pseudocode:
onFrameUpdate() {
    newPos = camMatrix * (0,0,speed) // move forward along the camera axis
    yaw   += mouse_move_x;
    pitch += mouse_move_y;
    camMatrix = identity.translate(newPos)
    camMatrix = rotate(camMatrix, (0,1,0), yaw)
    camMatrix = rotate(camMatrix, (1,0,0), pitch)
}
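A hedged translation of that pseudocode into C++ with GLM (the names are mine; it assumes the camera looks down -Z and that angles are in radians):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>   // glm::translate, glm::rotate

// Persistent camera state (all names are mine).
static float     g_yaw   = 0.0f;          // radians
static float     g_pitch = 0.0f;          // radians
static glm::mat4 g_camMatrix(1.0f);       // camera-to-world transform

void onFrameUpdate(float mouseDeltaX, float mouseDeltaY, float speed)
{
    // Move forward along the camera's local axis: the point (0,0,-speed) in camera space,
    // since an OpenGL camera looks down -Z.
    glm::vec3 newPos = glm::vec3(g_camMatrix * glm::vec4(0.0f, 0.0f, -speed, 1.0f));

    g_yaw   += mouseDeltaX;
    g_pitch += mouseDeltaY;

    // Rebuild the camera matrix from scratch: translate, then yaw about world Y, then pitch.
    g_camMatrix = glm::translate(glm::mat4(1.0f), newPos);
    g_camMatrix = glm::rotate(g_camMatrix, g_yaw,   glm::vec3(0.0f, 1.0f, 0.0f));
    g_camMatrix = glm::rotate(g_camMatrix, g_pitch, glm::vec3(1.0f, 0.0f, 0.0f));

    // The view matrix for rendering would then be glm::inverse(g_camMatrix).
}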
rotates the camera around the world axis, but always around the origin. However, the rotation center should be the camera. Does somebody have an idea?
I assume the matrix is stored in memory this way (the numbers represent element indices as if the matrix were a linear 1D array):
0 1 2 3 //row 0
4 5 6 7 //row 1
8 9 10 11 //row 2
12 13 14 15 //row 3
Solution:
Store the last row of the camera matrix in a temporary variable.
Set the last row of the camera matrix to (0, 0, 0, 1).
Use camMatrix = yrotMatrix * camMatrix.
Restore the last row of the camera matrix from the temporary variable.
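A hedged sketch of those four steps in C-style C++, using the element layout above; mat4_mul is a hypothetical stand-in for whatever 4x4 multiply routine you already have:

#include <cstring>

// Hypothetical helper: out = a * b for 4x4 matrices stored as float[16] in the layout above.
void mat4_mul(float out[16], const float a[16], const float b[16]);

void rotateCameraAroundWorldY(float camMatrix[16], const float yrotMatrix[16])
{
    // 1. Store the last row (elements 12..15, holding the translation).
    float lastRow[4];
    std::memcpy(lastRow, &camMatrix[12], sizeof(lastRow));

    // 2. Set the last row to (0, 0, 0, 1) so only the orientation gets rotated.
    camMatrix[12] = 0.0f; camMatrix[13] = 0.0f; camMatrix[14] = 0.0f; camMatrix[15] = 1.0f;

    // 3. camMatrix = yrotMatrix * camMatrix
    float tmp[16];
    mat4_mul(tmp, yrotMatrix, camMatrix);
    std::memcpy(camMatrix, tmp, sizeof(tmp));

    // 4. Restore the last row, so the camera stays where it was.
    std::memcpy(&camMatrix[12], lastRow, sizeof(lastRow));
}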
I am trying to implement a raytracer that uses an arbitrary camera position and perspective projection. I have the camera position, the look-at position and the field-of-view angle, but I cannot figure out the direction in which I have to shoot the rays so that each ray corresponds to a pixel. If I could find a way to compute the coordinates of the image plane, or the direction vectors the rays should have, it would be downhill from there. Any help is appreciated.
I would do the following: imagine that there is a rectangular grid just in front of your eye. The grid is defined by one point (the (0,0) point of the grid) and two three-dimensional basis vectors (x, y); with this you can calculate a ray as (origin + Xcoordinate * x + Ycoordinate * y) - eye. By adjusting the distance between your eye point and the origin, or by adjusting the length of the basis vectors, you can get the desired angle of view.
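A hedged sketch of that construction in plain C++ (the small vector helpers, the pixel-centering and the choice of a plane at distance 1 in front of the eye are my own):

#include <cmath>

struct Vec3 {
    double x, y, z;
    Vec3 operator+(const Vec3& o) const { return { x + o.x, y + o.y, z + o.z }; }
    Vec3 operator-(const Vec3& o) const { return { x - o.x, y - o.y, z - o.z }; }
    Vec3 operator*(double s)      const { return { x * s, y * s, z * s }; }
};
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}
static Vec3 normalize(const Vec3& v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

// Returns the (unnormalized) direction of the ray through pixel (px, py).
// eye/lookAt/up define the camera; fovY is the vertical field of view in radians.
Vec3 rayDirection(int px, int py, int width, int height,
                  const Vec3& eye, const Vec3& lookAt, const Vec3& up, double fovY)
{
    Vec3 forward = normalize(lookAt - eye);
    Vec3 right   = normalize(cross(forward, up));
    Vec3 trueUp  = cross(right, forward);

    // Image plane at distance 1 in front of the eye; its half-height follows from the FOV.
    double halfH = std::tan(fovY / 2.0);
    double halfW = halfH * (double)width / (double)height;

    // Grid origin = lower-left corner of the plane; xStep/yStep are the per-pixel basis vectors.
    Vec3 origin = (eye + forward) - right * halfW - trueUp * halfH;
    Vec3 xStep  = right  * (2.0 * halfW / width);
    Vec3 yStep  = trueUp * (2.0 * halfH / height);

    // Ray through the centre of pixel (px, py): (origin + X*x + Y*y) - eye, as described above.
    Vec3 pointOnPlane = origin + xStep * (px + 0.5) + yStep * (py + 0.5);
    return pointOnPlane - eye;
}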