VTKCamera difference between focal point and position - c++

I am using vtkCamera and am trying to move it around and make it look at a certain point. For example, to put the camera at position (x,y,z) and make it look at (0,0,0) with gluLookAt in OpenGL, we would set the eye coordinates to (x,y,z), the centre coordinates to (0,0,0) and the up vector to (0,1,0).
In VTK, however, vtkCamera has three separate methods, namely SetPosition, SetFocalPoint and SetViewUp.
My question is: what do SetPosition and SetFocalPoint correspond to?
Thanks

SetPosition corresponds to the eye coordinates. SetFocalPoint corresponds to the point the camera is looking at, i.e. the centre coordinates of gluLookAt, and SetViewUp corresponds to the up vector. In that sense it works the same way as in both OpenGL and DirectX.
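To make the mapping concrete: in VTK the calls would be camera->SetPosition(x, y, z), camera->SetFocalPoint(0, 0, 0) and camera->SetViewUp(0, 1, 0). The sketch below is plain C++ with no VTK dependency and only illustrates what the two points mean together: the camera looks along the unit vector from the position to the focal point. The Vec3 alias and the viewDirection helper are hypothetical names used for illustration.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Unit vector pointing from the camera position (eye) towards the focal point (centre).
// Assumes the two points are distinct; otherwise the direction is undefined.
Vec3 viewDirection(const Vec3& position, const Vec3& focalPoint) {
    Vec3 d = {focalPoint[0] - position[0],
              focalPoint[1] - position[1],
              focalPoint[2] - position[2]};
    double len = std::sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2]);
    return {d[0] / len, d[1] / len, d[2] / len};
}
```

For a camera at (0,0,5) with its focal point at the origin, this gives a view direction along -z.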

Related

Does a picking operation make sense in clip coordinates?

I have a point in 2D screen coordinates that I want to project onto a plane given in clip space. However, does this even make sense, given that the w coordinate depends on the z coordinate, so linear equations are worthless? Should I use eye coordinates instead?
Turns out it was the normal transformation problem. Thanks everyone!

Why does the camera face the negative end of the z-axis by default?

I am learning OpenGL from scratchpixel, and here is a quote from the perspective projection matrix chapter:
Cameras point along the world coordinate system negative z-axis so that when a point is converted from world space to camera space (and then later from camera space to screen space), if the point is to left of the world coordinate system y-axis, it will also map to the left of the camera coordinate system y-axis. In other words, we need the x-axis of the camera coordinate system to point to the right when the world coordinate system x-axis also points to the right; and the only way you can get that configuration, is by having camera looking down the negative z-axis.
I think it has something to do with the mirror image, but this explanation just confused me. Why does the camera's coordinate system not coincide with the world coordinate system by default (like every other 3D object we create in OpenGL)? I mean, we will need to transform the camera coordinates anyway with a transformation matrix (whatever we want from the negative-z setup, we can simulate it). Why bother?
Which direction you pick for z is totally arbitrary.
But your pick has far-reaching consequences.
One reason to stick with the GL -z way is that the culling of faces will match GL constant names like GL_FRONT. I'd advise just to roll with the tutorial.
Flipping the sign on just one axis also flips the "parity". So a front face becomes a back face. A znear depth test becomes zfar. So it is wise to pick one early on and stick with it.
By default, yes, it's a "right hand" system (used in physics, for example). Your thumb is the X axis, your index finger the Y axis, and when you point those in the right directions, the Z axis (middle finger) points towards you. Why was the Z axis chosen to point into/out of the screen? Because then the X and Y axes lie on the screen, as in 2D graphics.
But in reality, OpenGL has no preferred coordinate system. You can tweak it as you like. For example, if you are making a maze game, you might want Y to go into/out of the screen (and Z upwards), so that you can move nicely in the XY plane. You modify your view/perspective matrices, and you get it.
What is this "camera" you're talking about? In OpenGL there is no such thing as a "camera". All you've got is a two stage transformation chain:
vertex position → viewspace position (by modelview transform)
viewspace position → clipspace position (by projection transform)
To see why by default OpenGL is "looking down" -z, we have to look at what happens if both transformation steps do "nothing", i.e. a full identity transform.
In that case all vertex positions passed to OpenGL are unchanged. X maps to window width, Y maps to window height. All calculations in OpenGL by default (you can change that) have been chosen to adhere to the rules of a right hand coordinate system, so if +X points right and +Y points up, then +Z must point "out of the screen" for the right hand rule to be consistent.
And that's all there is about it. No camera. Just linear transformations and the choice of using right handed coordinates.
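The two-stage chain can be sketched on the CPU. This is a minimal illustration, assuming column vectors and row-major 4x4 arrays; the Vec4 and Mat4 aliases and the identityMatrix, mulMatVec, toClip and insideClipVolume helpers are hypothetical names. With both stages set to identity, the clip-space position equals the vertex position, so only vertices inside the cube from -1 to 1 on each axis survive clipping.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;

// Identity matrix: the "do nothing" transform.
Mat4 identityMatrix() {
    Mat4 m{};  // value-initialised to all zeros
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0;
    return m;
}

// Matrix times column vector.
Vec4 mulMatVec(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            r[i] += m[i][j] * v[j];
    return r;
}

// vertex position -> viewspace position -> clipspace position
Vec4 toClip(const Mat4& modelview, const Mat4& projection, const Vec4& vertex) {
    return mulMatVec(projection, mulMatVec(modelview, vertex));
}

// OpenGL keeps a clip-space point when -w <= x, y, z <= w.
bool insideClipVolume(const Vec4& c) {
    for (int i = 0; i < 3; ++i)
        if (c[i] < -c[3] || c[i] > c[3]) return false;
    return true;
}
```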

Translate/Rotate "World" or "Camera" in OpenGL

Before I ask the question: Yes I know that there doesn't exist a camera in OpenGL - but the setLookAt-Method is nearly the same for me ;)
What I was wondering about: if my task is to look at a specific point from a specific distance in my scene, I basically have two options:
I could change the eyeX, eyeY, eyeZ and centerX, centerY, centerZ values of my lookAt method to achieve this, or I could translate the model itself.
Let's say I'm translating/rotating my model. How would I ever know where to put the center/eye coordinates of my setLookAt to look at a specific point? Because the world is rotated, the point (x,y,z) is also translated and rotated. So when I want to look at the point (x,y,z), its values change after the rotation/translation and it's impossible for me to look at this point.
When I only transform the eye and center values of my lookAt, I can easily look at the point again. Am I missing something? Moving the model instead of the camera doesn't seem like a good approach...
It helps to understand your vector spaces.
Model Space: The intrinsic coordinate system of an object. Basically how it lines up with XYZ axes in your 3D modeling software.
World Space: Where everything is in your universe. When you move your camera in a scene layout program, the XYZ axes don't change. This is the coordinate system you're used to interacting with and thinking about.
Camera Space: This is where everything is with respect to your camera. The camera's position in camera space is by definition the origin, and your XYZ axes are your orthonormalized right, up, and look vectors. When you move or rotate your camera, all the positions and orientations of your objects change with it in camera space. This isn't intuitive: when you walk around, you think of everything "staying the same way", since it didn't actually move. That's because you're thinking in world space. In camera space, the position and orientation of everything is relative to your eye. If a chair's position is 5 units in front of you (i.e. (0,0,-5) in camera space) and you walk 2 units towards it, the chair's position is now (0,0,-3).
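The chair example can be written out in a few lines. A minimal sketch, assuming the camera is not rotated (so camera space is just world space shifted by the camera position) and looks down -z; the Vec3 alias and the toCameraSpace helper are hypothetical names.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

// Position of a world-space point relative to an unrotated camera.
Vec3 toCameraSpace(const Vec3& worldPoint, const Vec3& cameraPosition) {
    return {worldPoint[0] - cameraPosition[0],
            worldPoint[1] - cameraPosition[1],
            worldPoint[2] - cameraPosition[2]};
}
```

With the chair fixed at world position (0,0,-5), a camera at the origin sees it at (0,0,-5). After the camera walks to (0,0,-2), the same call returns (0,0,-3), although the chair never moved in world space.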
How do I set a lookat?
What does a lookat function do, exactly? It's a convenient way to set up your view matrix without you having to understand what it's doing.
Your eye variables are the camera's position in world space, i.e. they're what you think they are. The same goes for the center variables: they're the position of the object you are looking at in world space. From these, together with the up vector, you get the transformation from world space to camera space that you give to OpenGL.
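As a rough illustration of what a lookat function computes, the sketch below transforms a single world-space point into camera space using the same basis that gluLookAt documents: forward is center minus eye, right is forward cross up, and the corrected up is right cross forward. It is a sketch under stated assumptions, not a replacement for gluLookAt; the Vec3 struct and all helper names are hypothetical.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    double len = std::sqrt(dot(v, v));
    return {v.x / len, v.y / len, v.z / len};
}

// World space -> camera space for one point.
// Assumes eye != center and that up is not parallel to the view direction.
Vec3 worldToCamera(Vec3 eye, Vec3 center, Vec3 up, Vec3 p) {
    Vec3 forward = normalize(sub(center, eye));
    Vec3 right = normalize(cross(forward, up));
    Vec3 trueUp = cross(right, forward);
    Vec3 rel = sub(p, eye);
    // The camera looks down -z, so distance along forward becomes negative z.
    return {dot(rel, right), dot(rel, trueUp), -dot(rel, forward)};
}
```

With the eye at (0,0,5), the center at the origin and up (0,1,0), the world origin lands at (0,0,-5) in camera space: five units straight ahead.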

Computer Vision: labelling camera pose

I am trying to create a dataset of images of objects at different poses, where each image is annotated with camera pose (or object pose).
Suppose, for example, I have a world coordinate system, place the object of interest at the origin, and place the camera at a known position (x,y,z) facing the origin. Given this information, how can I calculate the pose (rotation matrix) for the camera or for the object?
I had one idea, which was to have a reference coordinate i.e. (0,0,z') where I can define the rotation of the object. i.e. its tilt, pitch and yaw. Then I can calculate the rotation from (0,0,z') and (x,y,z) to give me a rotation matrix. The problem is, how to now combine the two rotation matrices?
BTW, I know the world position of the camera as I am rendering these with OpenGL from a CAD model as opposed to physically moving a camera around.
The homography matrix maps from homogeneous screen coordinates (i,j) to homogeneous world coordinates (x,y,z).
Homogeneous coordinates are normal coordinates with a 1 appended. So (3,4) in screen coordinates is (3,4,1) in homogeneous screen coordinates.
If you have a set of homogeneous screen coordinates S and their associated homogeneous world locations W, the 4x4 homography matrix satisfies
S * H = transpose(W)
So it boils down to finding several features in world coordinates whose (i,j) positions you can also identify in screen coordinates, then computing a "best fit" homography matrix (OpenCV has a function, findHomography, for this).
Whilst knowing the camera's xyz provides helpful info, it's not enough to fully constrain the equation and you will have to generate more screen-world pairs anyway. Thus I don't think it's worth your time integrating the camera's position into the mix.
I have done a similar experiment here: http://edinburghhacklab.com/2012/05/optical-localization-to-0-1mm-no-problemo/

How do I implement basic camera operations in OpenGL?

I'm trying to implement an application using OpenGL and I need to implement the basic camera movements: orbit, pan and zoom.
To make it a little clearer, I need Maya-like camera control. Due to the nature of the application, I can't use the good ol' "transform the scene to make it look like the camera moves". So I'm stuck using transform matrices, gluLookAt, and such.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt), but I'm not quite sure how to implement the other two, pan and orbit. Has anyone ever done this?
I can't use the good ol' "transform the scene to make it look like the camera moves"
OpenGL has no camera. So you'll end up doing exactly this.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt),
This is not a zoom, this is a dolly. Zooming means varying the limits of the projection volume, i.e. the extents of an ortho projection or the field of view of a perspective projection.
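The difference can be sketched in code. This is only an illustration under stated assumptions: a dolly moves the eye along the line of sight and leaves the projection alone, while a zoom leaves the eye where it is and changes the vertical field of view that would be passed to something like gluPerspective. The Vec3 struct and the dolly and zoom helpers are hypothetical names.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Dolly: move the eye by `distance` towards the center (negative moves away).
// Assumes eye != center.
Vec3 dolly(Vec3 eye, Vec3 center, double distance) {
    double dx = center.x - eye.x, dy = center.y - eye.y, dz = center.z - eye.z;
    double len = std::sqrt(dx * dx + dy * dy + dz * dz);
    return {eye.x + distance * dx / len,
            eye.y + distance * dy / len,
            eye.z + distance * dz / len};
}

// Zoom: magnify the image by `factor` by changing the field of view (degrees).
// A factor above 1 narrows the field of view, so objects appear larger.
double zoom(double fovDegrees, double factor) {
    const double pi = 3.14159265358979323846;
    double halfTan = std::tan(fovDegrees * pi / 360.0) / factor;
    return 2.0 * std::atan(halfTan) * 180.0 / pi;
}
```

A dolly changes perspective (nearby objects grow faster than distant ones), whereas a zoom magnifies everything in view by the same factor.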
gluLookAt, which you've already run into, is your solution. First three arguments are the camera's position (x,y,z), next three are the camera's center (the point it's looking at), and the final three are the up vector (usually (0,1,0)), which defines the camera's y-z plane.*
It's pretty simple: you just call glLoadIdentity(), then gluLookAt(...), and then draw your scene as normal. Personally, I always do all the calculations on the CPU myself. I find that orbiting a point is an extremely common task. My template C/C++ code uses spherical coordinates and looks like:
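#include <cmath>
#include <GL/glu.h>

// Degrees to radians (helper the snippet assumes; std::cos and std::sin take radians).
static double radians(double deg) { return deg * 3.14159265358979323846 / 180.0; }

double camera_center[3] = {0.0, 0.0, 0.0};  // point the camera orbits and looks at
double camera_radius = 4.0;                 // distance from camera_center
double camera_rot[2] = {0.0, 0.0};          // {azimuth, elevation} in degrees

// Call with GL_MODELVIEW selected, right after glLoadIdentity() and before drawing.
void apply_camera() {
    // Spherical to Cartesian coordinates around camera_center.
    double camera_pos[3] = {
        camera_center[0] + camera_radius * std::cos(radians(camera_rot[0])) * std::cos(radians(camera_rot[1])),
        camera_center[1] + camera_radius * std::sin(radians(camera_rot[1])),
        camera_center[2] + camera_radius * std::sin(radians(camera_rot[0])) * std::cos(radians(camera_rot[1]))
    };
    gluLookAt(
        camera_pos[0], camera_pos[1], camera_pos[2],
        camera_center[0], camera_center[1], camera_center[2],
        0, 1, 0
    );
}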
Clearly you can adjust camera_radius, which will change the "zoom" of the camera (its distance from the center), camera_rot, which will change the camera's azimuth and elevation around the center, or camera_center, which will change the point about which the camera orbits.
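Panning, the remaining operation from the question, can be handled in the same spirit. A minimal sketch, assuming a pan means sliding the eye and the center together along the camera's own right and up vectors so that the view direction stays the same; the Vec3 and Camera structs and the pan helper are hypothetical names, and the resulting eye and center would be handed to gluLookAt as before.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Camera { Vec3 eye, center; };

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Shift eye and center by dx along the camera's right vector and dy along its
// up vector. Assumes eye != center and up not parallel to the view direction.
Camera pan(Camera cam, Vec3 up, double dx, double dy) {
    Vec3 forward = normalize({cam.center.x - cam.eye.x,
                              cam.center.y - cam.eye.y,
                              cam.center.z - cam.eye.z});
    Vec3 right = normalize(cross(forward, up));
    Vec3 trueUp = cross(right, forward);
    Vec3 offset = {right.x * dx + trueUp.x * dy,
                   right.y * dx + trueUp.y * dy,
                   right.z * dx + trueUp.z * dy};
    cam.eye = {cam.eye.x + offset.x, cam.eye.y + offset.y, cam.eye.z + offset.z};
    cam.center = {cam.center.x + offset.x, cam.center.y + offset.y, cam.center.z + offset.z};
    return cam;
}
```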
*The only other tricky bit is learning exactly what all that means. To clarify, because the internet is lacking:
The position is the (x,y,z) position of the camera. Pretty straightforward.
The center is the (x,y,z) point the camera is focusing on. You're basically looking along an imaginary ray from the position to the center.
Now, your camera could still be rotated any way around this vector (e.g., it could be upside down, but still looking along the same direction). The up vector is a vector, not a position. Together with that imaginary vector from the position to the center, it spans a plane. This is the camera's y-z plane.
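One practical consequence, sketched below: if the up vector is parallel to the line of sight, the two vectors no longer span a plane, their cross product is zero, and the camera orientation is undefined. With the orbit code above this happens when camera_rot[1] reaches 90 or -90 degrees while the up vector stays (0,1,0). The Vec3 struct, the upIsUsable helper and the tolerance value are hypothetical choices for illustration.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// True when `up` and the view direction (center - eye) span a plane, i.e. their
// cross product is not (nearly) zero. The tolerance is an arbitrary assumption.
bool upIsUsable(Vec3 eye, Vec3 center, Vec3 up, double tolerance = 1e-9) {
    Vec3 f = {center.x - eye.x, center.y - eye.y, center.z - eye.z};
    Vec3 c = {f.y * up.z - f.z * up.y,
              f.z * up.x - f.x * up.z,
              f.x * up.y - f.y * up.x};
    return std::sqrt(c.x * c.x + c.y * c.y + c.z * c.z) > tolerance;
}
```

A common workaround is to clamp the elevation angle slightly short of 90 degrees, or to switch to a different up vector when the check fails.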