I want to create a perspective projection matrix in OpenGL that simulates the depth camera of the Xbox One Kinect, as exactly as possible.
For the moment I am using this:
matrix = glm::perspective(70.6f, 1.177f, 0.01f, 1700.0f);
I found that the Kinect depth camera has a field of view of 70.6 in x and 60 in y. So I thought it would work by just giving 70.6 as the angle and 1.177 (which is 70.6/60) as the aspect ratio.
What would be an accurate way to define such a projection matrix?
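One way this could be set up, as a minimal sketch: the aspect ratio of a projection is the ratio of the tangents of the half-angles (equivalently width/height for square pixels), not the ratio of the angles themselves, and newer GLM (>= 0.9.6) expects the vertical field of view in radians. The near/far values below are the published Kinect v2 operating range and are an assumption:

#include <cmath>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Kinect v2 depth camera: 512x424 pixels, ~70.6 x 60 degree FOV.
// glm::perspective takes the VERTICAL field of view in radians
// (GLM >= 0.9.6), and aspect = tan(fovx/2) / tan(fovy/2), which is
// ~1.23 here -- not fovx/fovy = 1.177.
glm::mat4 kinectDepthProjection()
{
    const float fovY   = glm::radians(60.0f);
    const float aspect = std::tan(glm::radians(70.6f) / 2.0f)
                       / std::tan(glm::radians(60.0f) / 2.0f);
    const float zNear  = 0.5f;   // Kinect v2 minimum sensing range (m)
    const float zFar   = 4.5f;   // Kinect v2 maximum sensing range (m)
    return glm::perspective(fovY, aspect, zNear, zFar);
}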
Related
I am trying to render a 3D point cloud from depth data which I saved from an OpenGL framebuffer. Basically, I took depth samples from n different viewpoints (which are already known) for a rendered model centered at (0, 0, 0). I successfully saved the depth maps, but now I want to extract the x, y, z coordinates from them. For this, I back-project each point from image to world coordinates using the equation P = [R|t]_inv * K_inv * p.
To calculate the image intrinsics matrix I used information from the OpenGL camera matrix, glm::perspective(fov, aspect, near_plane, far_plane). The intrinsic matrix K is calculated as

K = [ f_x  0    c_x ]
    [ 0    f_y  c_y ]
    [ 0    0    1   ]

where f_y = height / (2 * tan(fov / 2)), f_x = f_y (square pixels), c_x = width / 2, and c_y = height / 2.
If I transform the coordinates in the camera origin (i.e., no extrinsic transformation [R|t]), I get a correct 3D model for a single image. To fuse multiple depth maps, I also need the extrinsic transformation, which I calculate from the OpenGL lookAt matrix glm::lookAt(eye=n_viewpoint_coordinates, center=(0, 0, 0), up=(0, 1, 0)). The extrinsics matrix is calculated as [R|t] with t = -R*C, where C is the camera center (ref: http://ksimek.github.io/2012/08/22/extrinsic/).
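For concreteness, a minimal sketch of these two computations under the assumptions above (w and h are the render size, fov is the vertical field of view passed to glm::perspective; the names are illustrative):

#include <cmath>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Intrinsics K from the glm::perspective parameters, as given above.
glm::mat3 intrinsics(float w, float h, float fovYRadians)
{
    const float fy = h / (2.0f * std::tan(fovYRadians / 2.0f));
    const float fx = fy;                  // square pixels
    glm::mat3 K(1.0f);                    // GLM is column-major: K[col][row]
    K[0][0] = fx;  K[2][0] = w / 2.0f;    // f_x and c_x
    K[1][1] = fy;  K[2][1] = h / 2.0f;    // f_y and c_y
    return K;
}

// Extrinsics [R|t]: glm::lookAt already returns the world-to-camera
// transform with t = -R*C, where C is the 'eye' position.
glm::mat4 extrinsics(const glm::vec3& eye)
{
    return glm::lookAt(eye, glm::vec3(0.0f), glm::vec3(0.0f, 1.0f, 0.0f));
}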
But when I fuse two depth images they are misaligned. I think the extrinsic matrix is not correct. I also tried using the glm::lookAt matrix directly, but that does not work either. The fused model snapshot is shown below.
Can someone suggest what is wrong with my approach? Is it the extrinsic matrix that is wrong (which I am damn sure of)?
Finally, I managed to solve this by myself. Instead of doing the transformation inside OpenGL, I did it outside of OpenGL. Basically, I kept the camera constant at some distance from the model, applied the rotation transformation to the model instead, and then rendered the model without a lookAt matrix (or rather, with a 4x4 identity view matrix). I don't know why using the lookAt matrix did not give me the right result; maybe it is due to something I was missing. To back-project the model into world coordinates I just take the inverse of the exact transformation I applied to the model before feeding it to OpenGL.
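A minimal sketch of that workaround (distance, angle, projection, and cameraSpacePoint are illustrative placeholders, not taken from the original code):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::vec3 renderAndBackProject(float distance, float angle,
                               const glm::mat4& projection,
                               const glm::vec3& cameraSpacePoint)
{
    // Keep the camera fixed looking down -Z and rotate the model
    // instead, so the view matrix is just the identity (no lookAt).
    glm::mat4 model = glm::translate(glm::mat4(1.0f), glm::vec3(0.0f, 0.0f, -distance))
                    * glm::rotate(glm::mat4(1.0f), angle, glm::vec3(0.0f, 1.0f, 0.0f));
    glm::mat4 view  = glm::mat4(1.0f);           // identity view matrix
    glm::mat4 mvp   = projection * view * model; // pass this to the shader

    // Back-project a reconstructed camera-space point into world
    // coordinates by inverting the exact model transform applied above.
    return glm::vec3(glm::inverse(model) * glm::vec4(cameraSpacePoint, 1.0f));
}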
I am trying to understand the pinhole camera model and the geometry behind some computer vision and camera calibration stuff that I am looking at.
So, if I understand correctly, the pinhole camera model maps 3D real world coordinates to pixel coordinates, and the model looks like:
y = K [R|T] x
Here y is the pixel coordinates in homogeneous coordinates, [R|T] is the extrinsic transformation matrix, and x is the 3D world coordinates, also in homogeneous coordinates.
Now, I am looking at a presentation which says
project the center of the focus region onto the ground plane using [R|T]
Now the center of the focus region is just taken to be the center of the image. I am not sure how I can estimate the ground plane. Assuming the point to be projected is in image space, should the projection be computed by inverting the [R|T] matrix and multiplying the point by the inverted matrix?
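For reference, one standard way to do such a projection under the model above (a sketch, assuming the ground plane is known, e.g. Z = 0 in world coordinates): a pixel y back-projects not to a single point but to a ray,

X(lambda) = C + lambda * R^T * K^-1 * y,   with camera center C = -R^T * T

and intersecting that ray with a ground plane n . X = d gives

lambda = (d - n . C) / (n . (R^T * K^-1 * y))

So inverting K and [R|T] alone is not enough: the plane constraint is what fixes the unknown scale lambda.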
EDIT
Source here on page 29: http://romilbhardwaj.github.io/static/BuildSys_v1.pdf
In an OpenGL game, I am trying to rotate the camera relative to the player's view. This rotation is not easily defined by relative angles, but it is easily defined by relative forward/up/left vectors.
How do I construct a matrix such that I can multiply it by the current projection matrix to achieve this rotation?
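A sketch of one way to build such a matrix, under assumptions the post does not state (the three vectors are orthonormal, view space is right-handed looking down -Z, and the rotation is applied to the view matrix rather than the projection matrix):

#include <glm/glm.hpp>

glm::mat4 relativeRotation(const glm::vec3& forward,
                           const glm::vec3& up,
                           const glm::vec3& left)
{
    // Columns of the new camera basis in current view space:
    // right = -left, up, and back = -forward (camera looks down -Z).
    glm::mat3 basis(-left, up, -forward);
    // basis rotates vectors from the new camera frame into the old one;
    // the view transform needs the inverse, which for an orthonormal
    // basis is simply the transpose.
    return glm::mat4(glm::transpose(basis));
}

// Usage: prepend to the view matrix, e.g. view = relativeRotation(f, u, l) * view;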
I want to display a model on an image using the camera extrinsic matrix and the gluLookAt function.
The model is translated to the origin; that is, the model's center of mass is at the origin. (The model's coordinates are right-handed.)
And using the cvFindExtrinsicCameraParams2 function, I got the camera extrinsic matrix E = [R|t].
For this case, I'd like to display the CAD model using gluLookAt.
It has three parameters: camera position, camera eye, and camera up.
What values do I have to enter?
I guess the camera position is t, the extrinsic matrix's translation vector.
Also, if rotation and translation are zero, then the camera sees the model along the (0,0,1) vector. Thus, if a rotation exists, the camera eye should be R * (0,0,1).
Finally, the camera up vector: it should be (0,-1,0) at first if the camera looks at the model from the front. Then the new camera up vector is R * (0,-1,0).
But it does not give me a correct result. What's the problem? What's my mistake?
The eye is a point in space at which the camera is looking. What you currently calculate is the direction in which it should look. You can, for example, use
eye = t + R * [0,0,1];
I'm wondering why you try to recreate the camera matrix using gluLookAt, since the result should be exactly the extrinsic camera matrix that you already have.
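For illustration, a sketch of the suggestion above (assuming R and t are extracted from E = [R|t] as a row-major 3x3 array and a 3-vector, and following the poster's R * (0,-1,0) reasoning for the up vector):

#include <GL/glu.h>

// R * (0,0,1) is the third column of R; R * (0,-1,0) is the negated
// second column (with R indexed as R[row][col]).
void applyExtrinsicLookAt(const double R[3][3], const double t[3])
{
    const double dir[3] = {  R[0][2],  R[1][2],  R[2][2] };  // R * (0,0,1)
    const double up[3]  = { -R[0][1], -R[1][1], -R[2][1] };  // R * (0,-1,0)
    gluLookAt(t[0], t[1], t[2],                              // camera position
              t[0] + dir[0], t[1] + dir[1], t[2] + dir[2],   // look-at point
              up[0], up[1], up[2]);                          // up vector
}

As the answer points out, though, E = [R|t] is itself already the model-view matrix, so loading it directly (transposed for OpenGL's column-major layout) avoids gluLookAt entirely.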
I am having profound issues understanding the transformations involved in VTK. OpenGL has fairly good documentation, and I was under the impression that VTK is very similar to OpenGL (it is, in many ways). But when it comes to transformations, it seems to be an entirely different story.
This is a good OpenGL documentation about transforms involved:
http://www.songho.ca/opengl/gl_transform.html
The perspective projection matrix in OpenGL, for a symmetric frustum with near plane n, far plane f, and near-plane extents r (right) and t (top), is:

[ n/r   0     0              0            ]
[ 0     n/t   0              0            ]
[ 0     0    -(f+n)/(f-n)   -2*f*n/(f-n)  ]
[ 0     0    -1              0            ]
I wanted to see if this formula applied in VTK would give me the projection matrix of VTK (by cross-checking against the VTK projection matrix).
Relevant Camera and Renderer Parameters:
camera->SetPosition(0,0,20);
camera->SetFocalPoint(0,0,0);
double crSet[2] = {10, 1000};
renderer->GetActiveCamera()->SetClippingRange(crSet);
int windowSize[2]; // vtkRenderWindowInteractor::GetSize uses int
renderWindow->SetSize(1280, 720);
renderWindowInteractor->GetSize(windowSize);
proj = renderer->GetActiveCamera()->GetProjectionTransformMatrix(
    (double)windowSize[0] / windowSize[1], crSet[0], crSet[1]);
The projection transform matrix I got for this configuration is:

3.7320    0         0         0
0         3.7320    0         0
0         0        -1010     -10000
0         0        -1         0
The (3,3) and (3,4) values of the projection matrix (let's say it is indexed 1 to 4 for rows and columns) should be -(f+n)/(f-n) and -2*f*n/(f-n) respectively. In my VTK camera settings, nearz is 10 and farz is 1000, and hence I should get -1.0202 and -20.202 respectively in the (3,3) and (3,4) locations of the matrix. But they are -1010 and -10000.
I have changed my clipping range values to observe the changes, and the (3,3) position is always -(nearz+farz), which makes no sense to me. Also, it would be great if someone could explain why it is 3.7320 in the (1,1) and (2,2) positions. And this value DOES NOT change when I change the size of the render window. Quite perplexing to me.
I see in the vtkCamera class reference that GetProjectionTransformMatrix() returns the transformation matrix that maps from camera coordinates to viewport coordinates.
VTK Camera Class Reference
The songho.ca page linked above also has a nice depiction of the transforms involved in OpenGL rendering. The OpenGL projection matrix is the matrix that maps from eye coordinates to clip coordinates. It is beyond doubt that eye coordinates in OpenGL are the same as camera coordinates in VTK. But are the clip coordinates in OpenGL the same as the viewport coordinates of VTK?
My aim is to simulate a real webcam (already calibrated) in VTK to render a 3D model.
Well, the documentation you linked to actually explains this (emphasis mine):
vtkCamera::GetProjectionTransformMatrix:
Return the projection transform matrix, which converts from camera
coordinates to viewport coordinates. This method computes the aspect,
nearz and farz, then calls the more specific signature of
GetCompositeProjectionTransformMatrix
with:
vtkCamera::GetCompositeProjectionTransformMatrix:
Return the concatenation of the ViewTransform and the
ProjectionTransform. This transform will convert world coordinates to
viewport coordinates. The 'aspect' is the width/height for the
viewport, and the nearz and farz are the Z-buffer values that map to
the near and far clipping planes. The viewport coordinates of a point located inside the frustum are in the range
([-1,+1],[-1,+1], [nearz,farz]).
Note that this matches neither OpenGL's window space nor normalized device space. I find the term "viewport coordinates" a poor choice for this, but be that as it may. What bugs me more is that the matrix actually does not transform to that "viewport space", but to some clip-space equivalent: only after the perspective divide will the coordinates be in the range given in the above definition of "viewport space".
But are the clip coordinates in OpenGL the same as the viewport coordinates of VTK?
So the answer is a clear no. But it is close. Basically, that projection matrix is just scaled and shifted along the z dimension compared to OpenGL's, and it is easy to convert between the two. You can simply take znear and zfar out of VTK's matrix and put them into the OpenGL projection matrix formula linked above, replacing just those two matrix elements.
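A minimal sketch of that conversion, assuming VTK's third row equals a * (GL third row) + b * (GL fourth row) with a = (farz - nearz)/2 and b = (farz + nearz)/2 (i.e., the linear remapping of [-1, 1] onto [nearz, farz]):

// Convert the third row of VTK's projection matrix (depth mapped to
// [nearz, farz]) back to OpenGL clip-space form (depth mapped to [-1, 1]).
void vtkThirdRowToGl(double m[4][4], double nearz, double farz)
{
    const double a = (farz - nearz) / 2.0;   // scale of the z remapping
    const double b = (farz + nearz) / 2.0;   // shift of the z remapping
    for (int j = 0; j < 4; ++j)
        m[2][j] = (m[2][j] - b * m[3][j]) / a;
}

Checking against the numbers in the question (nearz = 10, farz = 1000, so a = 495 and b = 505): (-1010 - 505 * (-1)) / 495 = -1.0202 and (-10000 - 0) / 495 = -20.202, exactly the OpenGL values expected above.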