Translate/Rotate "World" or "Camera" in OpenGL - opengl

Before I ask the question: Yes I know that there doesn't exist a camera in OpenGL - but the setLookAt-Method is nearly the same for me ;)
What I was wondering about: If my task is to look at a specific point from a specific distance in my scene, I basically have two options:
I could change the eyeX, eyeY, eyeZ and the centerX, centerY, centerZ values of my lookAt-Method to achieve this, or I could translate my model itself.
Let's say I'm translating/rotating my model. How would I ever know where to put the center/eye-coords of my setLookAt to look at a specific point? Because the world is rotated, the point (x,y,z) is also translated and rotated. So basically, when I want to look at the point x,y,z, its values change after the rotation/translation and it's impossible for me to look at this point.
When I only transform the eye and center-values of my lookAt I can easily look at the point again - am I missing something? Moving the model instead of the camera doesn't seem like a good way...

It helps to understand your vector spaces.
Model Space: The intrinsic coordinate system of an object. Basically how it lines up with XYZ axes in your 3D modeling software.
World Space: Where everything is in your universe. When you move your camera in a scene layout program, the XYZ axes don't change. This is the coordinate system you're used to interacting with and thinking about.
Camera Space: This is where everything is with respect to your camera. The camera's position in camera space is by definition the origin, and your XYZ axes are your orthonormalized right, up, and look vectors. When you move or rotate your camera, the positions and orientations of all your objects change with it in camera space. This isn't intuitive: when you walk around, you think of everything as "staying the same way", because it didn't actually move. That's because you're thinking in world space. In camera space, the position and orientation of everything is relative to your eye. If a chair's position is 5 units in front of you (i.e. (0,0,-5) in camera space) and you walk 2 units towards it, the chair's position is now (0,0,-3).
How do I set a lookat?
What does a lookat function do, exactly? It's a convenient way to set up your view matrix without you having to understand what it's doing.
Your eye variables are the camera's position in world space, i.e. they're what you think they are. The same goes for the center variables: they're the position of the object you're looking at in world space. From these you get the transformation from world space to camera space that you give to OpenGL.
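To make the world-to-camera step concrete, here is a minimal sketch of the basis construction a lookat performs, written as plain C++ with no OpenGL calls. The names Vec3 and worldToCamera are made up for this illustration; the math follows the documented behaviour of gluLookAt, but treat it as a sketch rather than the library's actual code.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    double len = std::sqrt(dot(v, v));
    assert(len > 0.0);  // eye == center (or a zero up vector) has no direction
    return {v.x / len, v.y / len, v.z / len};
}

// Hypothetical helper: express world-space point p in the space of a camera
// placed at `eye` and looking at `center` (all inputs in world space).
Vec3 worldToCamera(Vec3 eye, Vec3 center, Vec3 up, Vec3 p) {
    Vec3 f = normalize(sub(center, eye));  // look direction
    Vec3 r = normalize(cross(f, up));      // camera right, +X in camera space
    Vec3 u = cross(r, f);                  // camera up, +Y in camera space
    Vec3 d = sub(p, eye);                  // point relative to the eye
    return {dot(r, d), dot(u, d), -dot(f, d)};  // camera looks down -Z
}
```

With the eye at (0,0,5) looking at a chair at the origin, the chair comes out at (0,0,-5); move the eye to (0,0,3) and it comes out at (0,0,-3). The chair never moved in world space, only the eye values changed.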

Related

OpenGL: How can I change my coordinate setup?

From what I understand, OpenGL uses a right-hand coordinate system, that, at least in clip space, works like this:
X points right
Y points up
Z points into the screen
This means that, without any modifications to all the matrices used for transformations, world space coordinates work like this:
The X-Z plane is horizontal
The X-Y and Z-Y planes are vertical
What if I want to change it so that the Z axis is the one pointing up? How could I go about doing this? I've thought about multiplying all matrices by a rotation matrix that just shifts all coordinates by 90 degrees, or maybe I could change the Y and Z components of a vector once I send data to the GPU, but those seem more like workarounds than actual solutions, and they might also take a hit on performance if done for every mesh in the scene. Is there any standard way to do this? Am I getting something wrong?
Clip space and NDC space use a left-handed axis system. The axes as you listed them (X right, Y up, Z into the screen) are in fact left-handed, not right-handed.
You can have several axis systems. For example, some "object stores" use a left-handed system. If you're starting with OpenGL, try to set everything up in a right-handed system; it will be easier for you to understand.
Your objects are normally each defined in their own local system (right-handed or not). You place them with a "world" matrix. You see the world from a camera position, which requires a "view" matrix. And then you project everything with another matrix, the "proj" matrix.
As you can see, matrices are used everywhere. Don't be afraid of them.
Changing from one axis system to another is just another matrix. There are many examples on the web.
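As one possible example of such a matrix, the sketch below maps a right-handed Z-up world (X right, Y forward, Z up) onto the usual right-handed Y-up convention. The axis assignment is an assumption about what the asker wants, and the names kZUpToYUp and mul are hypothetical.

```cpp
#include <array>

using Mat4 = std::array<double, 16>;  // column-major storage, as OpenGL expects
using Vec4 = std::array<double, 4>;

// Rotation by -90 degrees about X: a Z-up point (x, y, z) becomes (x, z, -y),
// so "up" lands on +Y and "forward" lands on -Z.
const Mat4 kZUpToYUp = {
    1.0, 0.0,  0.0, 0.0,   // column 0
    0.0, 0.0, -1.0, 0.0,   // column 1
    0.0, 1.0,  0.0, 0.0,   // column 2
    0.0, 0.0,  0.0, 1.0    // column 3
};

// Multiply a column-major 4x4 matrix by a column vector.
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col * 4 + row] * v[col];
    return r;
}
```

Because it is a pure rotation, it can be multiplied into the view matrix once per frame (view * kZUpToYUp) instead of being applied per mesh or per vertex, which should address the performance worry in the question.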

Why does the camera face the negative end of the z-axis by default?

I am learning OpenGL from Scratchapixel, and here is a quote from the perspective projection matrix chapter:
Cameras point along the world coordinate system negative z-axis so that when a point is converted from world space to camera space (and then later from camera space to screen space), if the point is to left of the world coordinate system y-axis, it will also map to the left of the camera coordinate system y-axis. In other words, we need the x-axis of the camera coordinate system to point to the right when the world coordinate system x-axis also points to the right; and the only way you can get that configuration, is by having camera looking down the negative z-axis.
I think it has something to do with the mirror image? But this explanation just confused me... Why does the camera's coordinate system not coincide with the world coordinate system by default (like every other 3D object we create in OpenGL)? I mean, we will need to transform the camera coordinates anyway with a transformation matrix (whatever we want from the negative-z setup, we can simulate it)... why bother?
It is totally arbitrary what to pick for z direction.
But your pick has a lot of deep impact.
One reason to stick with the GL -z way is that the culling of faces will match GL constant names like GL_FRONT. I'd advise just to roll with the tutorial.
Flipping the sign on just one axis also flips the "parity". So a front face becomes a back face. A znear depth test becomes zfar. So it is wise to pick one early on and stick with it.
By default, yes, it's a "right-hand" system (as used in physics, for example). Your thumb is the X axis, your index finger the Y axis, and when you point those in the right directions, the Z axis (middle finger) points toward you. Why has the Z axis been selected to point into/out of the screen? Because then the X and Y axes lie on the screen, like in 2D graphics.
But in reality, OpenGL has no preferred coordinate system. You can tweak it as you like. For example, if you are making a maze game, you might want Y to go into/out of the screen (and Z upwards), so that you can move nicely in the XY plane. You modify your view/perspective matrices, and you get it.
What is this "camera" you're talking about? In OpenGL there is no such thing as a "camera". All you've got is a two stage transformation chain:
vertex position → viewspace position (by modelview transform)
viewspace position → clipspace position (by projection transform)
To see why by default OpenGL is "looking down" -z, we have to look at what happens if both transformation steps do "nothing", i.e. a full identity transform.
In that case all vertex positions passed to OpenGL are unchanged. X maps to window width, Y maps to window height. All calculations in OpenGL by default (you can change that) have been chosen to adhere to the rules of a right-hand coordinate system, so if +X points right and +Y points up, then +Z must point "out of the screen" for the right-hand rule to be consistent.
And that's all there is about it. No camera. Just linear transformations and the choice of using right handed coordinates.
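To illustrate the second stage (viewspace to clipspace), here is a small sketch that applies the matrix documented for gluPerspective to a view-space point by hand. The function names are hypothetical; the point is only that the last row sets w_clip = -z_view, so it is the projection transform, not a camera object, that makes points at negative view-space z end up in front of the viewer.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<double, 4>;

// Apply the gluPerspective matrix to a view-space point (x, y, z, w).
Vec4 projectPerspective(double fovyDeg, double aspect,
                        double zNear, double zFar, const Vec4& v) {
    assert(zNear > 0.0 && zFar > zNear);
    const double kPi = 3.14159265358979323846;
    double f = 1.0 / std::tan(fovyDeg * kPi / 360.0);  // cot(fovy / 2)
    return {
        f / aspect * v[0],
        f * v[1],
        (zFar + zNear) / (zNear - zFar) * v[2]
            + 2.0 * zFar * zNear / (zNear - zFar) * v[3],
        -v[2]  // w_clip = -z_view
    };
}

// Perspective divide: clip space -> normalized device coordinates.
std::array<double, 3> toNdc(const Vec4& c) {
    assert(c[3] != 0.0);
    return {c[0] / c[3], c[1] / c[3], c[2] / c[3]};
}
```

With zNear = 1 and zFar = 10, a point at view-space z = -1 lands on NDC z = -1 (the near plane) and a point at z = -10 lands on NDC z = +1 (the far plane).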

Names for camera moves

I've got a 3D scene and want to offer an API to control the camera. The camera is currently described by its own position, a look-at point in the scene somewhere along the z axis of the camera frame of reference, an “up” vector describing the y axis of the camera frame of reference, and a field-of-view angle. I'd like to provide at least the following operations:
Two-dimensional operations (mouse drag or arrow keys)
Keep look-at point and rotate camera around that. This can also feel like rotating the object, with the look-at point describing its centre. I think that at some point I've heard this described as the camera “orbiting” around the centre of the scene.
Keep camera position, and rotate camera around that point. Colloquially I'd call this “looking around”. With a cinema camera this might perhaps be called pan and tilt, but in 3d modelling “panning” is usually something else, see below. Using aircraft principal directions, this would be a pitch-and-yaw movement of the camera.
Move camera position and look-at point in parallel. This can also feel like translating the object parallel to the view plane. As far as I know this is usually called “panning” in 3d modelling contexts.
One-dimensional operations (e.g. mouse wheel)
Keep look-at point and move camera closer to that, by a given factor. This is perhaps what most people would consider a “zoom” except for those who know about real cameras, see below.
Keep all positions, but change field-of-view angle. This is what a “real” zoom would be: changing the focal length of the lens but nothing else.
Move both look-at point and camera along the line connecting them, by a given distance. At first this feels very much like the first item above, but since it changes the look-at point, subsequent rotations will behave differently. I see this as complementing the last point of the 2d operations above, since together they allow me to move camera and look-at point together in all three directions. The cinema camera man might call this a “dolly” shot, but I guess a dolly might also be associated with the other translation moves parallel to the viewing plane.
Keep look-at point, but change camera distance from it and field-of-view angle in such a way that projected sizes in the plane of the look-at point remain unchanged. This would be a dolly zoom in cinematic contexts, but might also be used to adjust for the viewer's screen size and distance from screen, to make the field-of-view match the user's environment.
Rotate around z axis in camera frame of reference. Using aircraft principal directions, this would be a roll motion of the camera. But it could also feel like a rotation of the object within the image plane.
What would be a consistent, unambiguous, concise set of function names to describe all of the above operations? Perhaps something already established by some existing API?
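For the dolly-zoom item in the list above, the constraint "projected sizes in the plane of the look-at point remain unchanged" can be written down directly: the visible half-height at the look-at plane is distance * tan(fov / 2), so that product has to stay constant. A minimal sketch, where the function name dollyZoomFov is purely illustrative and not taken from any existing API:

```cpp
#include <cassert>
#include <cmath>

// Given the old field of view (radians) and old camera distance, return the
// field of view that keeps distNew * tan(fovNew / 2) equal to
// distOld * tan(fovOld / 2).
double dollyZoomFov(double fovOld, double distOld, double distNew) {
    assert(distOld > 0.0 && distNew > 0.0);
    return 2.0 * std::atan(distOld * std::tan(fovOld / 2.0) / distNew);
}
```

Moving the camera farther away narrows the returned angle and moving it closer widens it, which is the behaviour expected from a dolly zoom.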

How do I implement basic camera operations in OpenGL?

I'm trying to implement an application using OpenGL and I need to implement the basic camera movements: orbit, pan and zoom.
To make it a little clearer, I need Maya-like camera control. Due to the nature of the application, I can't use the good ol' "transform the scene to make it look like the camera moves". So I'm stuck using transform matrices, gluLookAt, and such.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt), but I'm not quite sure how to implement the other two, pan and orbit. Has anyone ever done this?
I can't use the good ol' "transform the scene to make it look like the camera moves"
OpenGL has no camera. So you'll end up doing exactly this.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt),
This is not a zoom, this is a dolly. Zooming means varying the limits of the projection volume, i.e. the extents of an ortho projection, or the field of view of a perspective projection.
gluLookAt, which you've already run into, is your solution. First three arguments are the camera's position (x,y,z), next three are the camera's center (the point it's looking at), and the final three are the up vector (usually (0,1,0)), which defines the camera's y-z plane.*
It's pretty simple: you just call glLoadIdentity(), then gluLookAt(...), and then draw your scene as normal. Personally, I always do all the calculations on the CPU myself. I find that orbiting a point is an extremely common task. My template C/C++ code uses spherical coordinates and looks like:
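#include <cmath>
#include <GL/glu.h>

// Degrees-to-radians helper assumed by the expressions below.
static double radians(double deg) { return deg * 3.14159265358979323846 / 180.0; }

double camera_center[3] = {0.0, 0.0, 0.0};  // point the camera orbits around
double camera_radius = 4.0;                 // distance from that point
double camera_rot[2] = {0.0, 0.0};          // horizontal and vertical angle, in degrees
// Spherical coordinates -> camera position in world space.
double camera_pos[3] = {
    camera_center[0] + camera_radius * cos(radians(camera_rot[0])) * cos(radians(camera_rot[1])),
    camera_center[1] + camera_radius * sin(radians(camera_rot[1])),
    camera_center[2] + camera_radius * sin(radians(camera_rot[0])) * cos(radians(camera_rot[1]))
};
gluLookAt(
    camera_pos[0], camera_pos[1], camera_pos[2],           // eye
    camera_center[0], camera_center[1], camera_center[2],  // center
    0, 1, 0                                                // up
);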
Clearly you can adjust camera_radius, which will change the "zoom" of the camera, camera_rot, which will change the rotation of the camera about its axes, or camera_center, which will change the point about which the camera orbits.
*The only other tricky bit is learning exactly what all that means. To clarify, because the internet is lacking:
The position is the (x,y,z) position of the camera. Pretty straightforward.
The center is the (x,y,z) point the camera is focusing at. You're basically looking along an imaginary ray from the position to the center.
Now, your camera could still be rotated any way around this vector (e.g., it could be upside down, but still looking along the same direction). The up vector is a vector, not a position. Together with that imaginary vector from the position to the center, it forms a plane. This is the camera's y-z plane.
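The orbit case is covered by the spherical-coordinate code above; for panning, one possible approach is to shift the position and the center by the same offset along the camera's own right and up axes. The sketch below is an illustration under that assumption, with hypothetical names (Camera, pan), and leaves the actual gluLookAt call to the caller.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 scale(Vec3 v, double s) { return {v.x * s, v.y * s, v.z * s}; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
Vec3 normalize(Vec3 v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0);
    return {v.x / len, v.y / len, v.z / len};
}

// The nine gluLookAt arguments, grouped (hypothetical struct).
struct Camera { Vec3 eye, center, up; };

// Pan: move eye and center together, dx units along the camera's right
// axis and dy units along its up axis. The view direction is unchanged.
void pan(Camera& cam, double dx, double dy) {
    Vec3 f = normalize(sub(cam.center, cam.eye));  // look direction
    Vec3 r = normalize(cross(f, cam.up));          // camera right
    Vec3 u = cross(r, f);                          // true camera up
    Vec3 shift = add(scale(r, dx), scale(u, dy));
    cam.eye = add(cam.eye, shift);
    cam.center = add(cam.center, shift);
}
```

After calling pan, pass cam.eye, cam.center and cam.up to gluLookAt as before. Since the center moves too, a later orbit will rotate around the new center.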

point - plane collision without the gluLookAt* functions

As I understand it, it is recommended to use glTranslate / glRotate in preference to gluLookAt. I am not going to seek the reasons beyond the obvious HW vs SW computation mode, but just go with the flow. However, this is giving me some headaches as I do not know exactly how to efficiently stop the camera from breaking through walls. I am only interested in point-plane intersections, not AABB or anything else.
So, using glTranslates and glRotates means that the viewpoint stays still (at (0,0,0) for simplicity) while the world revolves around it. This means to me that in order to check for any intersection points, I now need to recompute the coordinates of the world's vertices (which was not needed with the gluLookAt approach) for every camera movement.
As there is no way of obtaining the needed new coordinates from GPU-land, they need to be calculated in CPU-land by hand. For every camera movement ... :(
It seems I need to keep track of the current rotation about each of the 3 axes, and the same for translations. There is no scaling used in my program. My questions:
1 - is the above reasoning flawed ? How ?
2 - if not, there has to be a way to avoid such recalculations.
The way I see it (and by looking at http://www.glprogramming.com/red/appendixf.html) it needs one matrix multiplication for translating and another one for rotating (only about the y axis is needed). However, having to compute so many additions / multiplications and especially the sines / cosines will certainly be killing FPS. There are going to be thousands or even tens of thousands of vertices to compute on. Every frame... all the maths... After having computed the new coordinates of the world, things seem to be very easy - just see if there is any plane that changed its 'd' sign (from the plane's equation ax + by + cz + d = 0). If it did, use a lightweight cross-product approach to test if the point is inside each 'moving' triangle of that plane.
Thanks
edit: I have found about glGet and I think it is the way to go but I do not know how to properly use it:
// Retains the current modelview matrix
//glPushMatrix();
glGetFloatv(GL_MODELVIEW_MATRIX, m_vt16CurrentMatrixVerts);
//glPopMatrix();
m_vt16CurrentMatrixVerts is a float[16] which gets filled with 0.f or 8.67453e-13 or something similar. Where am I screwing up ?
gluLookAt is a very handy function with absolutely no performance penalty. There is no reason not to use it, and, above all, no "HW vs SW" consideration about that. As Mk12 stated, glRotatef is also done on the CPU. The GPU part is : gl_Position = ProjectionMatrix x ViewMatrix x ModelMatrix x VertexPosition.
"using glTranslates and glRotates means that the viewpoint stays still" -> same thing for gluLookAt
"at (0,0,0) for simplicity" -> not for simplicity, it's a fact. However, this (0,0,0) is in the Camera coordinate system. It makes sense: relative to the camera, the camera is at the origin...
Now, if you want to prevent the camera from going through the walls, the usual method is to trace a ray from the camera. I suspect this is what you're talking about ("to check for any intersection points"). But there is no need to do this in camera space. You can do this in world space. Here's a comparison :
Tracing rays in camera space : ray always starts from (0,0,0) and goes to (0,0,-1). Geometry must be transformed from Model space to World space, and then to Camera space, which is what annoys you
Tracing rays in world space : ray starts from camera position (in world space) and goes to (eyeCenter - eyePos).normalize(). Geometry must be transformed from Model space to World space.
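As a rough sketch of the world-space variant, the function below intersects a ray with a wall plane given by ax + by + cz + d = 0, the same plane form used in the question. The names Plane and intersectRayPlane are made up for this example, and the check that the hit point lies inside a particular triangle is left out.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Plane { Vec3 n; double d; };  // n.x*x + n.y*y + n.z*z + d = 0

double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Ray: origin + t * dir, with t >= 0. Returns true and writes t if the ray
// hits the plane; returns false if it is parallel or the plane is behind it.
bool intersectRayPlane(Vec3 origin, Vec3 dir, Plane pl, double& t) {
    double denom = dot(pl.n, dir);
    if (std::fabs(denom) < 1e-12) return false;  // parallel to the plane
    t = -(dot(pl.n, origin) + pl.d) / denom;
    return t >= 0.0;
}
```

Here origin would be the camera position in world space and dir the normalized (eyeCenter - eyePos). If dir is unit length, t is the distance to the wall, so a forward step longer than t would pass through that plane.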
Note that there is no third option (tracing rays in Model space) which would avoid transforming the geometry from Model space to World space. However, you have a pair of workarounds:
First, your game's world is probably still : the Model matrix is probably always identity. So transforming its geometry from Model to World space is equivalent to doing nothing at all.
Secondly, for all other objects, you can take the opposite approach. Instead of transforming the entire geometry in one direction, transform only the ray the other way around: take your Model matrix, invert it, and you've got a matrix which goes from world space to model space. Multiply your ray's origin and direction by this matrix: your ray is now in model space. Intersect the normal way. Done.
Note that all I've said is standard techniques. No hacks or other weird stuff, just math :)
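A small sketch of the second workaround, under the assumption stated in the question that the model transform has no scaling and is just a rotation about the y axis followed by a translation. In that special case the inverse can be written directly instead of inverting a general 4x4 matrix. All names here (ModelTransform, worldToModel) are hypothetical.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin, dir; };

// Assumed rigid model transform: rotate by angleY (radians) about +Y,
// then translate by t. No scaling.
struct ModelTransform { double angleY; Vec3 t; };

// Rotate v about the +Y axis by angle a (radians).
Vec3 rotateY(Vec3 v, double a) {
    double c = std::cos(a), s = std::sin(a);
    return {c * v.x + s * v.z, v.y, -s * v.x + c * v.z};
}

// Apply the inverse model transform to a world-space ray.
// The origin is a position: undo the translation, then undo the rotation.
// The direction is a vector: only the rotation is undone.
Ray worldToModel(const Ray& r, const ModelTransform& m) {
    Vec3 o = {r.origin.x - m.t.x, r.origin.y - m.t.y, r.origin.z - m.t.z};
    return {rotateY(o, -m.angleY), rotateY(r.dir, -m.angleY)};
}
```

The resulting ray can then be intersected against the object's untransformed vertices, so the per-frame cost is one ray transform per object rather than one transform per vertex.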