How do I implement basic camera operations in OpenGL?

How do I implement basic camera operations in OpenGL? - opengl

I'm trying to implement an application using OpenGL and I need to implement the basic camera movements: orbit, pan and zoom.
To make it a little clearer, I need Maya-like camera control. Due to the nature of the application, I can't use the good ol' "transform the scene to make it look like the camera moves". So I'm stuck using transform matrices, gluLookAt, and such.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt), but I'm not quite sure how to implement the other two, pan and orbit. Has anyone ever done this?

I can't use the good ol' "transform the scene to make it look like the camera moves"
OpenGL has no camera. So you'll end up doing exactly this.
Zoom I know is dead easy, I just have to hook to the depth component of the eye vector (gluLookAt),
This is not a Zoom, this is a Dolly. Zooming means varying the limits of the projection volume, i.e. the extents of a ortho projection, or the field of view of a perspective.

gluLookAt, which you've already run into, is your solution. First three arguments are the camera's position (x,y,z), next three are the camera's center (the point it's looking at), and the final three are the up vector (usually (0,1,0)), which defines the camera's y-z plane.*
It's pretty simple: you just glLoadIdentity();, call gluLookAt(...), and then draw your scene as normally. Personally, I always do all the calculations in the CPU myself. I find that orbiting a point is an extremely common task. My template C/C++ code uses spherical coordinates and looks like:
double camera_center[3] = {0.0,0.0,0.0};
double camera_radius = 4.0;
double camera_rot[2] = {0.0,0.0};
double camera_pos[3] = {
camera_center[0] + camera_radius*cos(radians(camera_rot[0]))*cos(radians(camera_rot[1])),
camera_center[1] + camera_radius* sin(radians(camera_rot[1])),
camera_center[2] + camera_radius*sin(radians(camera_rot[0]))*cos(radians(camera_rot[1]))
};
gluLookAt(
camera_pos[0], camera_pos[1], camera_pos[2],
camera_center[0],camera_center[1],camera_center[2],
0,1,0
);
Clearly you can adjust camera_radius, which will change the "zoom" of the camera, camera_rot, which will change the rotation of the camera about its axes, or camera_center, which will change the point about which the camera orbits.
*The only other tricky bit is learning exactly what all that means. To clarify, because the internet is lacking:
The position is the (x,y,z) position of the camera. Pretty straightforward.
The center is the (x,y,z) point the camera is focusing at. You're basically looking along an imaginary ray from the position to the center.
Now, your camera could still be looking any direction around this vector (e.g., it could be upsidedown, but still looking along the same direction). The up vector is a vector, not a position. It, along with that imaginary vector from the position to the center, form a plane. This is the camera's y-z plane.

Related

Best way to drag 3D point in 3D space with mouse picking in OpenGL?

What is the best way to drag 3D point with mouse picking. Issue is not the picking but to dragging in 3D space.
There are two ways I am thinking, one is to get the View to World coordinates using gluUnProject and translate the 3D point. The problem in this case is, it only world on surfaces with Depth value (with glReadPixels), if mouse leaves the surface it gives the maximum or minimum depth values based on winZ component of gluUnProject. And it doesn't work in some cases.
The second way is to drag along XY, XZ, YZ plane using GL_MODELVIEW_MATRIX.
But the problem in this case, is how would we know that we are on XY or XZ or YZ plane? How can we know that the front view of the trackball is in XY plane, and what if we want to drag on the side plane not the front plane?
So, is there any way that gives me the accurate 2D to 3D coordinate so that I can drag 3D point easily in every direction without considering the plane cases? There must be some ways, I have seen 3D softwares, they have perfect dragging feature.

I'm used to solving these user-interaction problems somewhat naively (perhaps not in a mathematically optimal way) but "well enough" considering that they're not very performance-critical (the user interaction parts, not necessarily the resulting modifications to the scene).
For unconstrained free dragging of an object, the method you described using unproject tends to work quite well to often give near pixel-perfect dragging with a slight tweak:
... instead of using glReadPixels to try to extract screen depth, you want a notion of a geometric object/mesh when the user is picking and/or has selected. Now just project the center point of that object to get your screen depth. Then you can drag around in screen X/Y, keeping the same Z you got from this projection, and unproject to get the resulting translation delta from previous center to new center to transform the object. This also makes it "feel" like you're dragging from the center of the object which tends to be quite intuitive.
For auto-constrained dragging, a quick way to detect that is to first grab a 'viewplane normal'. A quick way (might make mathematicians frown) using those projection/unprojection functions you're used to is to unproject two points at the center of the viewport in screenspace (one with a near z value and one with a far z value) and get a unit vector in between those two points. Now you can find the world axis closest to that normal using dot product. The other two world axises define the world plane we want to drag along.
Then it becomes a simple matter of using those handy unprojection functions again to get a ray along the mouse cursor. After that, you can do repeated ray/plane intersections as you're dragging the cursor around to compute a translation vector from the delta.
For more flexible constraints, a gizmo (aka manipulator, basically a 3D widget) can come in handy so that the user can indicate what kind of drag constraint he wants (planar, axis, unconstrained, etc) based on which parts of the gizmo he picks/drags. For axis constraints, ray/line or line/line intersection is handy.
As requested in the comments, to retrieve a ray from a viewport (C++-ish pseudocode):
// Get a ray from the current cursor position (screen_x and screen_y).
const float near = 0.0f;
const float far = 1.0f;
Vec3 ray_org = unproject(Vec3(screen_x, screen_y, near));
Vec3 ray_dir = unproject(Vec3(screen_x, screen_y, far));
ray_dir -= ray_org;
// Normalize ray_dir (rsqrt should handle zero cases to avoid divide by zero).
const float rlen = rsqrt(ray_dir[0]*ray_dir[0] +
ray_dir[1]*ray_dir[1] +
ray_dir[2]*ray_dir[2]);
ray_dir[0] *= rlen;
ray_dir[1] *= rlen;
ray_dir[2] *= rlen;
Then we do a ray/plane intersection with the ray obtained from the mouse cursor to figure out where the ray intersects the plane when the user begins dragging (the intersection will give us a 3D point). After that, it's just translating the object by the deltas between the points gathered from repeatedly doing this as the user drags the mouse around. The object should intuitively follow the mouse while being moved along a planar constraint.
Axis dragging is basically the same idea, but we turn the ray into a line and do a line/line intersection (mouse line against the line for the axis constraint, giving us a nearest point since the lines generally won't perfectly intersect), giving us back a 3D point from which we can use the deltas to translate the object along the constrained axis.
Note that there are tricky edge cases involved with axis/planar dragging constraints. For example, if a plane is perpendicular to the viewing plane (or close), it can shoot off the object into infinity. The same kind of case exists with axis dragging along a line that is perpendicular, like trying to drag along the Z axis from a front viewport (X/Y viewing plane). So it's worth detecting those cases where the line/plane is perpendicular (or close) and prevent dragging in such cases, but that can be done after you get the basic concept working.
Another trick worth noting to improve the way things "feel" for some cases is to hide the mouse cursor. For example, with axis constraints, the mouse cursor could end up becoming very distant from the axis itself, and it might look/feel weird. So I've seen a number of commercial packages simply hide the mouse cursor in this case to avoid revealing that discrepancy between the mouse and the gizmo/handle, and it tends to feel a bit more natural as a result. When the user releases the mouse button, the mouse cursor is moved to the visual center of the handle. Note that you shouldn't do this hidden-cursor dragging for tablets (they're a bit of an exception).
This picking/dragging/intersection stuff can be very difficult to debug, so it's worth tackling it in babysteps. Set small goals for yourself, like just clicking a mouse button in a viewport somewhere to create a ray. Then you can orbit around and make sure the ray was created in the right position. Next you can try a simple test to see if that ray intersects a plane in the world (say X/Y) plane, and create/visualize the intersection point between the ray and plane, and make sure that's correct. Take it in small, patient babysteps, pacing yourself, and you'll have smooth, confident progress. Try to do too much at once and you can have very discouraging jarring progress trying to figure out where you went wrong.

It's a very interesting toic. As well as the methods you described color tagging coloring the z-buffer is also a method. There have been similar discussions about the same topic here:
Generic picking solution for 3D scenes with vertex-shader-based geometry deformation applied
OpenGL - Picking (fastest way)
How to get object coordinates from screen coordinates?
This is also similar to object picking that has been discussed fully including their pros and cons:
http://www.opengl-tutorial.org/miscellaneous/clicking-on-objects/picking-with-an-opengl-hack/
Actually my answer is there is no unique best way to do this. As you also mentioned they have pros and cons. I personally like gluunproject as it is easy and its results are okay. In addition, you cannot move a vertex at any direction using the screen space as screen space is 2D and there is not a unique way to map it back to th 3D geometry space. Hope it helps :).

Names for camera moves

I've got a 3D scene and want to offer an API to control the camera. The camera is currently described by its own position, a look-at point in the scene somewhere along the z axis of the camera frame of reference, an “up” vector describing the y axis of the camera frame of reference, and a field-of-view angle. I'd like to provide at least the following operations:
Two-dimensional operations (mouse drag or arrow keys)
Keep look-at point and rotate camera around that. This can also feel like rotating the object, with the look-at point describing its centre. I think that at some point I've heard this described as the camera “orbiting” around the centre of the scene.
Keep camera position, and rotate camera around that point. Colloquially I'd call this “looking around”. With a cinema camera this might perhaps be called pan and tilt, but in 3d modelling “panning” is usually something else, see below. Using aircraft principal directions, this would be a pitch-and-yaw movement of the camera.
Move camera position and look-at point in parallel. This can also feel like translating the object parallel to the view plane. As far as I know this is usually called “panning” in 3d modelling contexts.
One-dimensional operations (e.g. mouse wheel)
Keep look-at point and move camera closer to that, by a given factor. This is perhaps what most people would consider a “zoom” except for those who know about real cameras, see below.
Keep all positions, but change field-of-view angle. This is what a “real” zoom would be: changing the focal length of the lens but nothing else.
Move both look-at point and camera along the line connecting them, by a given distance. At first this feels very much like the first item above, but since it changes the look-at point, subsequent rotations will behave differently. I see this as complementing the last point of the 2d operations above, since together they allow me to move camera and look-at point together in all three directions. The cinema camera man might call this a “dolly” shot, but I guess a dolly might also be associated with the other translation moves parallel to the viewing plane.
Keep look-at point, but change camera distance from it and field-of-view angle in such a way that projected sizes in the plane of the look-at point remain unchanged. This would be a dolly zoom in cinematic contexts, but might also be used to adjust for the viewer's screen size and distance from screen, to make the field-of-view match the user's environment.
Rotate around z axis in camera frame of reference. Using aircraft principal directions, this would be a roll motion of the camera. But it could also feel like a rotation of the object within the image plane.
What would be a consistent, unambiguous, concise set of function names to describe all of the above operations? Perhaps something already established by some existing API?

Translate/Rotate "World" or "Camera" in OpenGL

Before I ask the question: Yes I know that there doesn't exist a camera in OpenGL - but the setLookAt-Method is nearly the same for me ;)
What I was wondering about: If I have the task, to look at a specific point with a specific distance in my scene I basically have two options:
I could change the eyeX,eyeY,eyeZ and the centerX, centerY, centerZ values of my lookAt-Method to achieve this or I could translate my model itself.
Let's say I'm translating/rotating my model. How would I ever know where to put my center/eye-coords of my setLookAt to look at a specific point? Because the world is rotated, the point (x,y,z) is also translated and rotated. So basically when I want to look at the point x,y,z the values are changing after the rotation/translation and it's impossible for me, to look at this point.
When I only transform my eye and center-values of my lookAt I can easily look at the point again - am I missing something? Seems not like a good way to move the model instead of the camera...

It helps to understand your vector spaces.
Model Space: The intrinsic coordinate system of an object. Basically how it lines up with XYZ axes in your 3D modeling software.
World Space: Where everything is in your universe. When you move your camera in a scene layout program, the XYZ axes don't change. This is the coordinate system you're used to interacting with and thinking about.
Camera Space: This is where everything is with respect to your camera. The camera's position in camera space is the by definition the origin, and your XYZ axes are your orthonormalized right, up, and look vectors. When you move or rotate your camera, all the positions and orientations of your objects change with it in camera space. This isn't intuitive - when you walk around, you see think of everything "staying the same way" - it didn't actually move. That's because you're thinking in world space. In camera space, the position and orientation of everything is relative to your eye. If a chair's position is 5 units in front of you (ie (0,0,-5) in camera space) and you want 2 units towards it, the chair's position is now (0,0,-3).
How do I set a lookat?
What does a lookat function do, exactly? It's a convenient way to set up your view matrix without you having to understand what it's doing.
Your eye variables are the camera's position in world space. IE they're what you think they are. The same goes for the center variables - they're the position of your object in world space. From here you get the transformation from world space to camera space that you give to OpenGL.

Relative rotation of OpenGL Camera

I am currently struggling in finding a formula to rotate my OpenGL "Camera" (I tried do do it via a scene rotation, but have the same issue).
Basically my Camera is at a given position, looking a given point (all indicated to gluLookAt) and I would like to rotate the camera upwards for example, and still looking at the same point.
What should be the right process ?
What input data should I take to decide the amount of movement ? 2D mouse coordinates evolution or 3D unprojected mouse coordinates evolution ?

The trick is to see that a camera-rotation is the same as a scene rotation if you do it at the correct position. Move the camera into the point around which you want to rotate, then rotate the camera, then move back out by the same distance you moved in.
The amount by which you rotate depends on your application. Take G-Earth as an example: if you are close to the surface the rotation is (absolute) small, if you are far from the surface it is large.

If you're creating orbiting(oribitng around LookAt) camera for openGL I sugest you make it with these data:
LookAtPosition- 3D vector
CamUp - 3D unit vector
RelativeCamPosition - 3D unit vector
CamDistance - decimal number
LookAtPosition is a point on which you'll be looking. CamUp is vector that points up from camera, you can see it on this image. It's best to initialize camera at no rotation, so that CamUp = [0,1,0]. Note that it's unit vector so it's magnitude/size/length is always 1. RelativeCamPosition is again unit vector. You get it by taking LookAt to Camera
vector and dividing by it's magnitude, which you'll save in CamDistance. In intialized state it might look as this:
LookAtPosition = [0,0,0]
CamUp = [0,1,0]
RelativeCamPosition = [1,0,0]
CamDistance = 10
You can now get camera position by
CamPosition = LookAtPosition + RelativeCamPosition * CamDistance
But you need to rotate that camera arround right? Well there's a reason for unit vectors - they are easy to use in calculations. I believe you use angles for rotating so you need to use only sine and cosine. Rotate function might look like this:
Rotate(angleX, angleY){
RelativeCamPosition.x = sin(angleX)*cos(angleY);
RelativeCamPosition.z = cos(angleX)*cos(angleY);
RelativeCamPosition.y = sin(angleY);
}
where angleX and angleY are absolute (NOT RELATIVE) rotations in horizontal and vertical direction. You should always use absolute roations because there can be floating point errors while adding. Anyway I just made those calculations on scrap of paper so I hope they're allright.
Edit: I've just noticed that this will work just if your intiial state is like I wrote RelativeCamPosition = [1,0,0]. However it shouldn't be hard to edit them so it works for arbirtary initial state.

How does zooming, panning and rotating work?

Using OpenGL I'm attempting to draw a primitive map of my campus.
Can anyone explain to me how panning, zooming and rotating is usually implemented?
For example, with panning and zooming, is that simply me adjusting my viewport? So I plot and draw all my lines that compose my map, and then as the user clicks and drags it adjusts my viewport?
For panning, does it shift the x/y values of my viewport and for zooming does it increase/decrease my viewport by some amount? What about for rotation?
For rotation, do I have to do affine transforms for each polyline that represents my campus map? Won't this be expensive to do on the fly on a decent sized map?
Or, is the viewport left the same and panning/zooming/rotation is done in some otherway?
For example, if you go to this link you'll see him describe panning and zooming exactly how I have above, by modifying the viewport.
Is this not correct?

They're achieved by applying a series of glTranslate, glRotate commands (that represent camera position and orientation) before drawing the scene. (technically, you're rotating the whole scene!)
There are utility functions like gluLookAt which sorta abstract some details about this.
To simplyify things, assume you have two vectors representing your camera: position and direction.
gluLookAt takes the position, destination, and up vector.
If you implement a vector class, destinaion = position + direction should give you a destination point.
Again to make things simple, you can assume the up vector to always be (0,1,0)
Then, before rendering anything in your scene, load the identity matrix and call gluLookAt
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt( source.x, source.y, source.z, destination.x, destination.y, destination.z, 0, 1, 0 );
Then start drawing your objects
You can let the user span by changing the position slightly to the right or to the left. Rotation is a bit more complicated as you have to rotate the direction vector. Assuming that what you're rotating is the camera, not some object in the scene.
One problem is, if you only have a direction vector "forward" how do you move it? where is the right and left?
My approach in this case is to just take the cross product of "direction" and (0,1,0).
Now you can move the camera to the left and to the right using something like:
position = position + right * amount; //amount < 0 moves to the left
You can move forward using the "direction vector", but IMO it's better to restrict movement to a horizontal plane, so get the forward vector the same way we got the right vector:
forward = cross( up, right )
To be honest, this is somewhat of a hackish approach.
The proper approach is to use a more "sophisticated" data structure to represent the "orientation" of the camera, not just the forward direction. However, since you're just starting out, it's good to take things one step at a time.

All of these "actions" can be achieved using model-view matrix transformation functions. You should read about glTranslatef (panning), glScalef (zoom), glRotatef (rotation). You also should need to read some basic tutorial about OpenGL, you might find this link useful.

Generally there are three steps that are applied whenever you reference any point in 3d space within opengl.
Given a Local point
Local -> World Transform
World -> Camera Transform
Camera -> Screen Transform (usually a projection. depends on if you're using perspective or orthogonal)
Each of these transforms is taking your 3d point, and multiplying by a matrix.
When you are rotating the camera, it is generally changing the world -> camera transform by multiplying the transform matrix by your rotation/pan/zoom affine transformation. Since all of your points are re-rendered each frame, the new matrix gets applied to your points, and it gives the appearance of a rotation.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js