All yalls,
I set up my camera eye on the positive z axis (0, 0, 10), up pointing towards positive y (0, 1, 0), and center towards positive x (2, 0, 0). If y is up and the camera is staring down the negative z axis, then x points left in screen coordinates, in right-handed OpenGL world coordinates.
I also have an object centered at the world origin. As the camera looks more to the left (positive x direction), I would expect my origin-centered object to move right in the resulting screen projection. But I see the opposite is the case.
Am I lacking a fundamental understanding? If so, what? If not, can anyone explain how to properly use glm to generate the view and projection matrices that are sent to shaders, in the default OpenGL right-handed world model?
glm::vec3 _eye(0, 0, 10), _center(2, 0, 0), _up(0, 1, 0);
viewMatrix = glm::lookAt(_eye, _center, _up);
projectionMatrix = glm::perspective(glm::radians(45.0), 6.0 / 8.0, 0.1, 200.0); // glm::radians needs a floating-point argument
Another thing I find interesting is that the red line in the image points in the positive x direction. It literally is the [eye -> (forward + eye)] vector of another camera in the scene, which I extract from the inverse of that camera's viewMatrix. What melts my brain is this: when I use that camera's VP matrices, it points in the direction opposite to the same forward direction that was extracted from the inverse of the viewMatrix. I'd really appreciate any insight into this discrepancy as well.
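For reference, the extraction looks roughly like this (a sketch; camToWorld is just a local name, and the negation reflects that an OpenGL camera looks down its local negative z axis):

glm::mat4 camToWorld = glm::inverse(viewMatrix); // camera-to-world transform
glm::vec3 eye     =  glm::vec3(camToWorld[3]);   // camera position (4th column)
glm::vec3 forward = -glm::vec3(camToWorld[2]);   // line of sight (negated 3rd column)
// the red line is drawn from eye to (forward + eye)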
Also worth noting: I built glm 0.9.9 with CMake, and I verified it uses the right-handed, [-1, 1] variants of lookAt and perspective.
resulting image:
I would expect my origin-centered object to move right in the resulting screen projection. But I see the opposite is the case.
glm::lookAt builds a view matrix. Its parameters are given in world space, and the center parameter defines the position you are looking at.
The view space is the local coordinate system defined by the point of view onto the scene.
The position of the view, the line of sight, and the upwards direction of the view define a coordinate system relative to the world coordinate system. The view matrix transforms from world space to view (eye) space.
If the coordinate system of the view space is a right-handed system, then the X-axis points to the right, the Y-axis up, and the Z-axis out of the view (note that in a right-handed system the Z-axis is the cross product of the X-axis and the Y-axis).
The line of sight is the vector from the eye position to the center position:
eye = (0, 0, 10)
center = (2, 0, 0)
up = (0, 1, 0)
los = center - eye = (2, 0, -10)
In this case, if the center of the object is at (0, 0, 0) and you look at (2, 0, 0), then you are looking at a position to the right of the object, which means the object is shifted to the left.
This will change if you move the point of view, e.g. to (0, 0, -10), or flip the up vector, e.g. to (0, -1, 0).
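As a minimal sketch (using the values from the question): aiming center at the object keeps it in the middle of the screen, while moving center towards positive x shifts the object to the left:

glm::vec3 eye(0.0f, 0.0f, 10.0f);
glm::vec3 up(0.0f, 1.0f, 0.0f);
glm::mat4 centered = glm::lookAt(eye, glm::vec3(0.0f, 0.0f, 0.0f), up); // object in the center
glm::mat4 shifted  = glm::lookAt(eye, glm::vec3(2.0f, 0.0f, 0.0f), up); // object shifted left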
I have to render two scenes separately and embed one of them into the other as a plane. The sub scene rendered on the plane should use a view matrix computed from the relative camera position, and a projection matrix that accounts for the distance and the required skew, so that the sub scene looks as if it were actually placed at that point.
To describe this in more detail, here is a figure for the simpler case.
(In this case, we have the sub scene on the center line of the main frustum)
It is easy to calculate the perspective matrix, visualized as the red frustum, by using these parameters.
However, it is very difficult for me to solve the other case. If the sub scene lies off the center line of the frustum, I have to skew the projection matrix to match it.
I think this is a kind of oblique perspective projection, and it is also very similar to rendering a mirror. How do I calculate this perspective matrix?
As @Rabbid76 already pointed out, this is just a standard asymmetric frustum. For that, you just need to know the coordinates, in eye space, of the rectangle on the near plane you are going to use.
However, there is also another option: you can modify the existing projection matrix. That approach is easier if you know the position of your rectangle in window coordinates or normalized device coordinates. You can simply pre-multiply scale and translation matrices to select any sub-region of your original frustum.
Let's assume that your viewport is w * h pixels wide, and starts at (0,0) in the window. And you want to create a frustum which just renders a sub-rectangle which starts at the lower left corner of pixel (x,y), and which is a pixels wide and b pixels tall.
Convert to NDC:
x_ndc = (x / w) * 2 - 1 and y_ndc = (y / h) * 2 - 1
a_ndc = (a / w) * 2 and b_ndc = (b / h) * 2
Create a scale and translation transform which maps the range [x_ndc, x_ndc + a_ndc] to [-1,1], and similarly for y:
    ( 2/a_ndc     0       0   -2*x_ndc/a_ndc - 1 )
M = (    0     2/b_ndc    0   -2*y_ndc/b_ndc - 1 )
    (    0        0       1            0         )
    (    0        0       0            1         )
(Note that the factor 2 will cancel out. Instead of going to [-1,1] NDC space in step 1, we could also just have used the normalized [0,1] range; I just wanted to use the standard spaces.)
Pre-Multiply M to the original projection matrix P:
P' = M * P
Note that even though we defined the transformation in NDC space, and P works in clip space before the division, the math still works out. Thanks to the homogeneous coordinates, the translation part of M will be scaled by w accordingly. The resulting matrix is just a general asymmetric projection matrix.
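Putting the three steps together, a possible sketch with glm (glm matrices are column-major, so M[col][row]; all names here are illustrative):

glm::mat4 subRectProjection(const glm::mat4& P,
                            float w, float h,  // viewport size in pixels
                            float x, float y,  // lower left corner of the sub-rectangle
                            float a, float b)  // sub-rectangle size in pixels
{
    float x_ndc = (x / w) * 2.0f - 1.0f;
    float y_ndc = (y / h) * 2.0f - 1.0f;
    float a_ndc = (a / w) * 2.0f;
    float b_ndc = (b / h) * 2.0f;

    glm::mat4 M(1.0f);                      // start from identity
    M[0][0] = 2.0f / a_ndc;                 // scale x
    M[1][1] = 2.0f / b_ndc;                 // scale y
    M[3][0] = -2.0f * x_ndc / a_ndc - 1.0f; // translate x
    M[3][1] = -2.0f * y_ndc / b_ndc - 1.0f; // translate y
    return M * P;                           // P' = M * P
}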
Now this does not adjust the near and far clipping planes of the original projection. But you can adjust them in the very same way by adding appropriate scale and translation to the z coordinate.
Also note that using this approach, you are not even restricted to selecting an axis-parallel rectangle, you can also rotate or skew it arbitrarily, so basically, you can select an arbitrary parallelogram in window space.
How do I calculate this perspective matrix?
An asymmetric perspective (column major order) projection matrix is set up like this:
m[16] = [
     2*n/(r-l),    0,            0,             0,
     0,            2*n/(t-b),    0,             0,
    (r+l)/(r-l),  (t+b)/(t-b),  -(f+n)/(f-n),  -1,
     0,            0,           -2*f*n/(f-n),   0 ];
Where l, r, b, and t are the left, right, bottom, and top distances to the frustum planes on the near plane, and n and f are the distances to the near and far planes.
Commonly, in a framework or library, a projection matrix like this is set up by a function called frustum, e.g.:
OpenGL Mathematics: glm::frustum
OpenGL fixed function pipeline: glFrustum
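For example, with glm (the numbers are made up; l, r, b, and t would be the sub scene rectangle's edges on the near plane, expressed in eye space):

float n = 0.1f, f = 100.0f;     // near and far distances
float l = -0.2f, r = 0.6f;      // deliberately off-center in x
float b = -0.3f, t = 0.3f;
glm::mat4 proj = glm::frustum(l, r, b, t, n, f);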
I want to determine the coordinates of the camera in OpenGL.
So I simply draw a sphere in a window; the code is like this:
glutSolidSphere (1.0, 20, 16); //draw a sphere, its radius is 1
//I use glOrtho to set the x,y coordinates
//1
glOrtho(-1,1,-1,1,-0.99,-1.0);
//2
glOrtho(-1,1,-1,1,-1.0,-0.99);
//3
glOrtho(-1,1,-1,1,1.0,0.99);
//4
glOrtho(-1,1,-1,1,0.99,1.0);
//5
glOrtho(-1,1,-1,1,1.0,1.0);
//6
glOrtho(-1,1,-1,1,10,10);
//7
glOrtho(-1,1,-1,1,0.0,0.0);
//8
glOrtho(-1,1,-1,1,-0.5,0.5);
//9
//glOrtho(-1,1,-1,1,0.0,0.1);
in cases 1, 2, 3, and 4, the picture is like this:
a small circle
in cases 5, 6, and 7, the sphere is just the same size as the window.
in case 8, the picture is like this:
like a torus, strange
According to the glOrtho description:
void glOrtho( GLdouble left,
              GLdouble right,
              GLdouble bottom,
              GLdouble top,
              GLdouble nearVal,
              GLdouble farVal );
Let's assume that the camera position is fixed in OpenGL.
From case 1, it seems that the camera is at (0,0,0).
1) But if so, how can cases 2, 3, and 4 look the same as case 1?
2) How do cases 5, 6, and 7 come out?
3) How does case 8 come out?
You seem to be confusing several things.
Conceptually, the default glOrtho() and glFrustum()/gluPerspective() projections assume that the camera is at the eye-space origin, looking in the negative z direction. If you have left the ModelView matrix at identity (the default), your object space is identical to eye space, so you are drawing directly in eye space.
OpenGL defines a three-dimensional viewing volume. This means that there is not only a 2D rectangle limited by your viewport/window size, but also near and far clipping planes. That viewing volume is described as an axis-aligned cube -1 <= x,y,z <= 1 in normalized device coordinates.
The purpose of the projection matrix is to transform some viewing volume to that normalized cube. With an orthogonal projection, there is no perspective effect: objects which are far away do not appear smaller. So you can interpret the ortho matrix as defining an axis-aligned cuboid in eye space, which is the part of space that will be visible on the screen. Note that you can set up that projection such that you can see things which are actually behind your "camera" (by using negative values for near or far).
Your cases 1-4 all appear identical because you cut out only a tiny slice, z in [0.99, 1] or z in [-1, -0.99], where the intersection with the sphere just appears as a disc. It doesn't matter if you flip the ranges, since that only swaps what counts as in front and behind. Without lighting, you basically see only the silhouette, so you can't see the differences.
Your cases 5, 6, and 7 are simply invalid: the near and far parameters must not be identical. That code just generates a GL error and creates no ortho matrix at all, which means the projection matrix is left at identity, and then you get exactly the [-1,1]^3 viewing volume. Since you draw a sphere with radius 1 centered at the origin, it fits exactly.
Case 8 is just a cut of the sphere, the intersection within -0.5 <= z <= 0.5.
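You can verify the invalid cases yourself; a small sketch in the style of the question's code:

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(-1, 1, -1, 1, 1.0, 1.0);          // case 5: near == far is invalid
if (glGetError() == GL_INVALID_VALUE) {
    // glOrtho did nothing; the projection is still the identity,
    // so the viewing volume is exactly -1 <= x,y,z <= 1
}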
I am currently working on ray-tracing techniques and I think I've done a pretty good job so far, but I haven't covered the camera yet.
Until now, I used a plane fragment as the view plane, located between (-width/2, height/2, 200) and (width/2, -height/2, 200) [200 is just a fixed z value and can be changed].
In addition to that, I place the camera mostly at e(0, 0, 1000), and I use a perspective projection.
I send rays from point e through the pixels, and after calculating the pixel color, I write it to the corresponding pixel of the image.
Here is an image I created. Hopefully you can guess where the eye and the view plane are by looking at the image.
My question starts from here. It's time to move my camera around, but I don't know how to map the 2D view plane coordinates to the canonical coordinates. Is there a transformation matrix for that?
The method I have in mind requires knowing the 3D coordinates of the pixels on the view plane. I am not sure it's the right method to use. So, what do you suggest?
There are a variety of ways to do it. Here's what I do:
Choose a point to represent the camera location (camera_position).
Choose a vector that indicates the direction the camera is looking (camera_direction). (If you know a point the camera is looking at, you can compute this direction vector by subtracting camera_position from that point.) You probably want to normalize camera_direction, in which case it's also the normal vector of the image plane.
Choose another normalized vector that's (approximately) "up" from the camera's point of view (camera_up).
camera_right = Cross(camera_direction, camera_up)
camera_up = Cross(camera_right, camera_direction) (This corrects for any slop in the choice of "up".)
Visualize the "center" of the image plane at camera_position + camera_direction. The up and right vectors lie in the image plane.
You can choose a rectangular section of the image plane to correspond to your screen. The ratio of the width or height of this rectangular section to the length of camera_direction determines the field of view. To zoom in, you can lengthen camera_direction or decrease the width and height. Do the opposite to zoom out.
So given a pixel position (i, j), you want the (x, y, z) of that pixel on the image plane. From that you can subtract camera_position to get a ray vector (which then needs to be normalized).
Ray ComputeCameraRay(int i, int j) {
    const float width = 512.0;  // pixels across
    const float height = 512.0; // pixels high

    // map pixel indices to [-0.5, 0.5] on each axis
    double normalized_i = (i / width) - 0.5;
    double normalized_j = (j / height) - 0.5;

    // start at the image plane's center and walk along the right/up vectors
    Vector3 image_point = normalized_i * camera_right +
                          normalized_j * camera_up +
                          camera_position + camera_direction;

    Vector3 ray_direction = image_point - camera_position;
    return Ray(camera_position, ray_direction);
}
This is meant to be illustrative, so it is not optimized.
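A typical call site, just to show where it plugs in (the image and shading routines are assumed to exist in your own tracer):

for (int j = 0; j < 512; ++j) {
    for (int i = 0; i < 512; ++i) {
        Ray ray = ComputeCameraRay(i, j);
        image.SetPixel(i, j, TraceAndShade(ray)); // hypothetical helpers
    }
}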
For rasterising renderers, you tend to need a transformation matrix because that's how you map directly from 3D coordinates to 2D screen coordinates.
For ray tracing, it's not necessary because you're typically starting from a known pixel coordinate in 2D space.
Given the eye position, a point in 3-space that's in the center of the screen, and vectors for "up" and "right", it's quite easy to calculate the 3D "ray" that goes from the eye position and through the specified pixel.
I've previously posted some sample code from my own ray tracer at https://stackoverflow.com/a/12892966/6782
I am working on an application that has similar functionality to MotionBuilder in its viewport interactions. It has three buttons:
Button 1 rotates the viewport around X and Y depending on X/Y mouse drags.
Button 2 translates the viewport around X and Y depending on X/Y mouse drags.
Button 3 "zooms" the viewport by translating along Z.
The code is simple:
glTranslatef(posX,posY,posZ);
glRotatef(rotX, 1, 0, 0);
glRotatef(rotY, 0, 1, 0);
Now, the problem is that if I translate first, the translation will be correct but the rotation then follows the world axis. I've also tried rotating first:
glRotatef(rotX, 1, 0, 0);
glRotatef(rotY, 0, 1, 0);
glTranslatef(posX,posY,posZ);
^ the rotation works, but the translation follows the world axes.
My question is: how can I do both, so that I get the translation from the first code snippet and the rotation from the second?
EDIT
I drew this rather crude image to illustrate what I mean by world and local rotations/translations. I need the camera to rotate and translate around its local axes.
http://i45.tinypic.com/2lnu3rs.jpg
Ok, the image makes things a bit clearer.
If you were just talking about an object, then your first code snippet would be fine, but for the camera it's quite different.
Since there's technically no 'camera' object in OpenGL, what you're doing when building a camera is just moving everything by the inverse of how you're moving the camera. I.e., you don't move the camera up by +1 on the Y axis; you just move the world by -1 on the Y axis, which achieves the same visual effect of having a camera.
Imagine you have a camera at position (Cx, Cy, Cz), and it has x/y rotation angles (CRx, CRy). If this were just a regular object, and not the camera, you would transform this by:
glTranslate(Cx, Cy, Cz);
glRotate(CRx, 1, 0, 0);
glRotate(CRy, 0, 1, 0);
But because this is the camera, we need to do the inverse of this operation instead (we just want to move the world by (-Cx, -Cy, -Cz) to emulate moving a 'camera'). To invert the matrix, you just have to do the opposite of each individual transform, and apply them in reverse order:
glRotate(-CRy, 0, 1, 0);
glRotate(-CRx, 1, 0, 0);
glTranslate(-Cx, -Cy, -Cz);
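If you are using glm rather than the fixed-function calls, the same inverse can be sketched like this (glm::rotate and glm::translate post-multiply, so the order matches the glRotate/glTranslate sequence above):

glm::mat4 view(1.0f);
view = glm::rotate(view, glm::radians(-CRy), glm::vec3(0, 1, 0));
view = glm::rotate(view, glm::radians(-CRx), glm::vec3(1, 0, 0));
view = glm::translate(view, glm::vec3(-Cx, -Cy, -Cz));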
I think this will give you the kind of camera you're mentioning in your image.
I suggest that you bite the bullet and implement a camera class that stores the current state of the camera (position, view direction, up vector, right vector) and manipulates that state according to your control scheme. Then you can set up the view matrix using gluLookAt(), and the order of operations becomes unimportant. Here is an example:
Let camPos be the current position of the camera, camView its view direction, camUp the up vector and camRight the right vector.
To translate the camera by moveDelta, simply add moveDelta to camPos. Rotation is a bit more difficult, but if you understand quaternions you'll be able to understand it quickly.
First you need to create a quaternion for each of your two rotations. I assume that your horizontal rotation is always about the positive Z axis (which points at the "ceiling", if you will). Let hRot be the quaternion representing the horizontal rotation. I further assume that you want to rotate the camera about its right axis for your vertical rotation (creating a pitch effect). For this, you must apply the horizontal rotation to the camera's current right vector; the result is the rotation axis vAxis for your vertical rotation vRot. The total rotation quaternion is then rQuat = hRot * vRot. Then you apply rQuat to the camera's view direction, up, and right vectors.
Quat hRot(rotX, 0, 0, 1); // creates a quaternion that rotates by angle rotX about the positive Z axis
Vec3f vAxis = hRot * camRight; // applies hRot to the camera's right vector
Quat vRot(rotY, vAxis); // creates a quaternion that rotates by angle rotY about the rotated camera's right vector
Quat rQuat = hRot * vRot; // creates the total rotation
camUp = rQuat * camUp;
camRight = rQuat * camRight;
camView = rQuat * camView;
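If you don't have a quaternion class at hand, glm's quaternion module offers the same operations; a sketch, assuming rotX and rotY are in degrees:

glm::quat hRot  = glm::angleAxis(glm::radians(rotX), glm::vec3(0, 0, 1));
glm::vec3 vAxis = hRot * camRight;  // rotated right vector
glm::quat vRot  = glm::angleAxis(glm::radians(rotY), vAxis);
glm::quat rQuat = hRot * vRot;      // total rotation
camUp    = rQuat * camUp;
camRight = rQuat * camRight;
camView  = rQuat * camView;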
Hope this helps you solve your problem.
glRotate always works around the origin. If you do:
glPushMatrix();
glTranslated(x,y,z);
glRotated(theta,1,0,0);
glTranslated(-x,-y,-z);
drawObject();
glPopMatrix();
Then the 'object' is rotated around (x,y,z) instead of the origin, because you moved (x,y,z) to the origin, did the rotation, and then moved (x,y,z) back where it started.
However, I don't think that's going to be enough to get the effect you're describing. If you always want transformations to be done with respect to the current frame of reference, then you need to keep track of the transformation matrix yourself. This is why people use quaternion-based cameras.
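As a sketch of what keeping track of it yourself can look like with glm: store the camera-to-world matrix and post-multiply each mouse-drag delta, so every transform happens in the camera's current local frame (the delta names are illustrative):

glm::mat4 camToWorld(1.0f); // persistent camera state

// per mouse drag:
camToWorld = camToWorld * glm::translate(glm::mat4(1.0f), glm::vec3(dx, dy, dz));
camToWorld = camToWorld * glm::rotate(glm::mat4(1.0f), glm::radians(rotX), glm::vec3(1, 0, 0));
camToWorld = camToWorld * glm::rotate(glm::mat4(1.0f), glm::radians(rotY), glm::vec3(0, 1, 0));

// the view matrix is the inverse of the camera's transform:
glm::mat4 viewMatrix = glm::inverse(camToWorld);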