From what I understand, OpenGL uses a right-hand coordinate system, that, at least in clip space, works like this:
X points right
Y points up
Z points into the screen
This means that, without any modifications to all the matrices used for transformations, world space coordinates work like this:
The X-Z plane is horizontal
The X-Y and Z-Y planes are vertical
What if I want to change it so that the Z axis is the one pointing up? How could I go about doing this? I've thought about multiplying all matrices by a rotation matrix that just shifts all coordinates by 90 degrees, or maybe I could change the Y and Z components of a vector once I send data to the GPU, but those seem more like workarounds than actual solutions, and they might also take a hit on performance if done for every mesh in the scene. Is there any standard way to do this? Am I getting something wrong?
The clip and NDC spaces are left-handed axis system, not as you defined each X,Y,Z, axis.
You can have several axis systems. For example some "objects store" use a left-handed system. If you're starting with OpenGL, try to set everything in right-handed system, will be easier for you to understand.
Your objects are normally defined in its own local system (right handed or not). You place them by a "world" matrix. And you see the world from a camera position, which requieres a "view" matrix. And then you project all of them, another "proj" matrix.
As you can see, matrices are used everywhere. Don't be afraid of them.
Changing from an axis system to another is just another matrix. There are many examples in the web.
Related
I am learning openGL from this scratchpixel, and here is a quote from the perspective project matrix chapter:
Cameras point along the world coordinate system negative z-axis so that when a point is converted from world space to camera space (and then later from camera space to screen space), if the point is to left of the world coordinate system y-axis, it will also map to the left of the camera coordinate system y-axis. In other words, we need the x-axis of the camera coordinate system to point to the right when the world coordinate system x-axis also points to the right; and the only way you can get that configuration, is by having camera looking down the negative z-axis.
I think it has something to do with the mirror image? but this explanation just confused me...why is the camera's coordinate by default does not coincide with the world coordinate(like every other 3D objects we created in openGL)? I mean, we will need to transform the camera coordinate anyway with a transformation matrix (whatever we want with the negative z set up, we can simulate it)...why bother?
It is totally arbitrary what to pick for z direction.
But your pick has a lot of deep impact.
One reason to stick with the GL -z way is that the culling of faces will match GL constant names like GL_FRONT. I'd advise just to roll with the tutorial.
Flipping the sign on just one axis also flips the "parity". So a front face becomes a back face. A znear depth test becomes zfar. So it is wise to pick one early on and stick with it.
By default, yes, it's "right hand" system (used in physics, for example). Your thumb is X-axis, index finger Y-axis, and when you make those go to right directions, Z-points (middle finger) to you. Why Z-axis has been selected to point inside/outside screen? Because then X- and Y-axes go on screen, like in 2D graphics.
But in reality, OpenGL has no preferred coordinate system. You can tweak it as you like. For example, if you are making maze game, you might want Y to go outside/inside screen (and Z upwards), so that you can move nicely at XY plane. You modify your view/perspective matrices, and you get it.
What is this "camera" you're talking about? In OpenGL there is no such thing as a "camera". All you've got is a two stage transformation chain:
vertex position → viewspace position (by modelview transform)
viewspace position → clipspace position (by projection transform)
To see why be default OpenGL is "looking down" -z, we have to look at what happens if both transformation steps do "nothing", i.e. full identity transform.
In that case all vertex positions passed to OpenGL are unchanged. X maps to window width, Y maps to window height. All calculations in OpenGL by default (you can change that) have been chosen adhere to the rules of a right hand coordinate system, so if +X points right and +Y points up, then Z+ must point "out of the screen" for the right hand rule to be consistent.
And that's all there is about it. No camera. Just linear transformations and the choice of using right handed coordinates.
I am trying to understand open GL concepts . While reading this tutorial - http://www.arcsynthesis.org/gltut/Positioning/Tut04%20Perspective%20Projection.html,
I came accross this statement :
This is because camera space and NDC space have different viewing directions. In camera space, the camera looks down the -Z axis; more negative Z values are farther away. In NDC space, the camera looks down the +Z axis; more positive Z values are farther away. The diagram flips the axis so that the viewing direction can remain the same between the two images (up is away).
I am confused as to why the viewing direction has to change . Could some one please help me understand this with an example ?
This is mostly just a convention. OpenGL clip space (and NDC space and screen space) has always been defined as left-handed (with z pointing away into the screen) by the spec.
OpenGL eye space had been defined with camera at origin and looking at -z direction (so right-handed). However, this convention was just meaningful in the fixed-function pipeline, where together with the fixed function per vertex lighting which was carried out in eye space, the viewing direction did matter cases like whenGL_LOCAL_VIEWER was disabled (as was the default).
The classic GL projection matrix typically converts the handedness, and the perspecitve division is done with a divisior of -z_eye, typically, so the last row of the projection matrix is typically (0, 0, -1, 0). The old glFrustum(), glOrtho(), and gluPerspective() actually supported that convention by using the z_near and z_far clipping distances negated, so that you had to specify positive values for clip planes to lie before the camera at z<0.
However, with modern GL, this convention is more or less meaningless. There is no fixed-function unit left which does work in eye space, so the eye space (and anything before that) is totally under the user's control. You can use anything you like here. The clip space and all the later spaces are still used by fixed function units (clipping, rasterization, ...), so there most be some convention to define the interface, and it is still a left-handed system.
Even in modern GL, the old right-handed eye space convention is still in use. The popular glm library for example reimplements the old GL matrix functions the same way.
There is really no reason to prefer one of the possible conventions over the other, but at some point, you have to choose and stick to one.
Before I ask the question: Yes I know that there doesn't exist a camera in OpenGL - but the setLookAt-Method is nearly the same for me ;)
What I was wondering about: If I have the task, to look at a specific point with a specific distance in my scene I basically have two options:
I could change the eyeX,eyeY,eyeZ and the centerX, centerY, centerZ values of my lookAt-Method to achieve this or I could translate my model itself.
Let's say I'm translating/rotating my model. How would I ever know where to put my center/eye-coords of my setLookAt to look at a specific point? Because the world is rotated, the point (x,y,z) is also translated and rotated. So basically when I want to look at the point x,y,z the values are changing after the rotation/translation and it's impossible for me, to look at this point.
When I only transform my eye and center-values of my lookAt I can easily look at the point again - am I missing something? Seems not like a good way to move the model instead of the camera...
It helps to understand your vector spaces.
Model Space: The intrinsic coordinate system of an object. Basically how it lines up with XYZ axes in your 3D modeling software.
World Space: Where everything is in your universe. When you move your camera in a scene layout program, the XYZ axes don't change. This is the coordinate system you're used to interacting with and thinking about.
Camera Space: This is where everything is with respect to your camera. The camera's position in camera space is the by definition the origin, and your XYZ axes are your orthonormalized right, up, and look vectors. When you move or rotate your camera, all the positions and orientations of your objects change with it in camera space. This isn't intuitive - when you walk around, you see think of everything "staying the same way" - it didn't actually move. That's because you're thinking in world space. In camera space, the position and orientation of everything is relative to your eye. If a chair's position is 5 units in front of you (ie (0,0,-5) in camera space) and you want 2 units towards it, the chair's position is now (0,0,-3).
How do I set a lookat?
What does a lookat function do, exactly? It's a convenient way to set up your view matrix without you having to understand what it's doing.
Your eye variables are the camera's position in world space. IE they're what you think they are. The same goes for the center variables - they're the position of your object in world space. From here you get the transformation from world space to camera space that you give to OpenGL.
I was trying to understand lesson 9 from NEHEs tutorial, which is about bitmaps being moved in 3d space.
the most interesting thing here is to move 2d bitmap texture on a simple quad through 3d space and keep it facing the screen (viewer) all the time. So the bitmap looks 3d but is 2d facing the viewer all the time no matter where it is in the 3d space.
In lesson 9 a list of stars is generated moving in a circle, which looks really nice. To avoid seeing the star from its side the coder is doing some tricky coding to keep the star facing the viewer all the time.
the code for this is as follows: ( the following code is called for each star in a loop)
glLoadIdentity();
glTranslatef(0.0f,0.0f,zoom);
glRotatef(tilt,1.0f,0.0f,0.0f);
glRotatef(star[loop].angle,0.0f,1.0f,0.0f);
glTranslatef(star[loop].dist,0.0f,0.0f);
glRotatef(-star[loop].angle,0.0f,1.0f,0.0f);
glRotatef(-tilt,1.0f,0.0f,0.0f);
After the lines above, the drawing of the star begins. If you check the last two lines, you see that the transformations from line 3 and 4 are just cancelled (like undo). These two lines at the end give us the possibility to get the star facing the viewer all the time. But i dont know why this is working.
And i think this comes from my misunderstanding of how OpenGL really does the transformations.
For me the last two lines are just like undoing what is done before, which for me, doesnt make sense. But its working.
So when i call glTranslatef, i know that the current matrix of the view gets multiplied with the translation values provided with glTranslatef.
In other words "glTranslatef(0.0f,0.0f,zoom);" would move the place where im going to draw my stars into the scene if zoom is negative. OK.
but WHAT exactly is moved here? Is the viewer moved "away" or is there some sort of object coordinate system which gets moved into scene with glTranslatef? Whats happening here?
Then glRotatef, what is rotated here? Again a coordinate system, the viewer itself?
In a real world. I would place the star somewhere in the 3d space, then rotate it in the world space around my worlds origin, then do the moving as the star is moving to the origin and starts at the edge again, then i would do a rotate for the star itself so its facing to the viewer. And i guess this is done here. But how do i rotate first around the worlds origin, then around the star itself? for me it looks like opengl is switching between a world coord system and a object coord system which doesnt really happen as you see.
I dont need to add the rest of the code, because its pretty standard. Simple GL initializing for 3d drawings, the rotating stuff, and then the simple drawing of QUADS with the star texture using blending. Thats it.
Could somebody explain what im misunderstanding here?
Another way of thinking about the gl matrix stack is to walk up it, backwards, from your draw call. In your case, since your draw is the last line, let's step up the code:
1) First, the star is rotated by -tilt around the X axis, with respect to the origin.
2) The star is rotated by -star[loop].angle around the Y axis, with respect to the origin.
3) The star is moved by star[loop].dist down the X axis.
4) The star is rotated by star[loop].angle around the Y axis, with respect to the origin. Since the star is not at the origin any more due to step 3, this rotation both moves the center of the star, AND rotates it locally.
5) The star is rotated by tilt around the X axis, with respect to the origin. (Same note as 4)
6) The star is moved down the Z axis by zoom units.
The trick here is difficult to type in text, but try and picture the sequence of moves. While steps 2 and 4 may seem like they invert each other, the move in between them changes the nature of the rotation. The key phrase is that the rotations are defined around the Origin. Moving the star changes the effect of the rotation.
This leads to a typical use of stacking matrices when you want to rotate something in-place. First you move it to the origin, then you rotate it, then you move it back. What you have here is pretty much the same concept.
I find that using two hands to visualize matrices is useful. Keep one hand to represent the origin, and the second (usually the right, if you're in a right-handed coordinate system like OpenGL), represents the object. I splay my fingers like the XYZ axes to I can visualize the rotation locally as well as around the origin. Starting like this, the sequence of rotations around the origin, and linear moves, should be easier to picture.
The second question you asked pertains to how the camera matrix behaves in a typical OpenGL setup. First, understand the concept of screen-space coordinates (similarly, device-coordinates). This is the space that is actually displayed. X and Y are the vectors of your screen, and Z is depth. The space is usually in the range -1 to 1. Moving an object down Z effectively moves the object away.
The Camera (or Perspective Matrix) is typically responsible for converting 'World' space into this screen space. This matrix defines the 'viewer', but in the end it is just another matrix. The matrix is always applied 'last', so if you are reading the transforms upward as I described before, the camera is usually at the very top, just as you are seeing. In this case you could think of that last transform (translate by zoom) as a very simple camera matrix, that moves the camera back by zoom units.
Good luck. :)
The glTranslatef in the middle is affected by the rotation : it moves the star along axis x' to distance dist, and axis x' is at that time rotated by ( tilt + angle ) compared to the original x axis.
In opengl you have object coordinates which are multiplied by a (a stack of) projection matrix. So you are moving the objects. If you want to "move a camera" you have to mutiply by the inverse matrix of the camera position and axis :
ProjectedCoords = CameraMatrix^-1 . ObjectMatrix . ObjectCoord
I also found this very confusing but I just played around with some of the NeHe code to get a better understanding of glTranslatef() and glRotatef().
My current understanding is that glRotatef() actually rotates the coordinate system, such that glRotatef(90.0f, 0.0f, 0.0f, 1.0f) will cause the x-axis to be where the y-axis was previously. After this rotation, glTranslatef(1.0f, 0.0f, 0.0f) will move an object upwards on the screen.
Thus, glTranslatef() moves objects in accordance with the current rotation of the coordinate system. Therefore, the order of glTranslatef and glRotatef are important in tutorial 9.
In technical terms my description might not be perfect, but it works for me.
As I have understood, it is recommended to use glTranslate / glRotate in favour of glutLootAt. I am not going to seek the reasons beyond the obvious HW vs SW computation mode, but just go with the wave. However, this is giving me some headaches as I do not exactly know how to efficiently stop the camera from breaking through walls. I am only interested in point-plane intersections, not AABB or anything else.
So, using glTranslates and glRotates means that the viewpoint stays still (at (0,0,0) for simplicity) while the world revolves around it. This means to me that in order to check for any intersection points, I now need to recompute the world's vertices coordinates (which was not needed with the glutLookAt approach) for every camera movement.
As there is no way in obtaining the needed new coordinates from GPU-land, they need to be calculated in CPU land by hand. For every camera movement ... :(
It seems there is the need to retain the current rotations aside each of the 3 axises and the same for translations. There is no scaling used in my program. My questions:
1 - is the above reasoning flawed ? How ?
2 - if not, there has to be a way to avoid such recalculations.
The way I see it (and by looking at http://www.glprogramming.com/red/appendixf.html) it needs one matrix multiplication for translations and another one for rotating (only aside the y axis needed). However, having to compute so many additions / multiplications and especially the sine / cosine will certainly be killing FPS. There are going to be thousands or even tens of thousands of vertices to compute on. Every frame... all the maths... After having computed the new coordinates of the world things seem to be very easy - just see if there is any plane that changed its 'd' sign (from the planes equation ax + by + cz + d = 0). If it did, use a lightweight cross products approach to test if the point is inside the space inside each 'moving' triangle of that plane.
Thanks
edit: I have found about glGet and I think it is the way to go but I do not know how to properly use it:
// Retains the current modelview matrix
//glPushMatrix();
glGetFloatv(GL_MODELVIEW_MATRIX, m_vt16CurrentMatrixVerts);
//glPopMatrix();
m_vt16CurrentMatrixVerts is a float[16] which gets filled with 0.f or 8.67453e-13 or something similar. Where am I screwing up ?
gluLookAt is a very handy function with absolutely no performance penalty. There is no reason not to use it, and, above all, no "HW vs SW" consideration about that. As Mk12 stated, glRotatef is also done on the CPU. The GPU part is : gl_Position = ProjectionMatrix x ViewMatrix x ModelMatrix x VertexPosition.
"using glTranslates and glRotates means that the viewpoint stays still" -> same thing for gluLookAt
"at (0,0,0) for simplicity" -> not for simplicity, it's a fact. However, this (0,0,0) is in the Camera coordinate system. It makes sense : relatively to the camera, the camera is at the origin...
Now, if you want to prevent the camera from going through the walls, the usual method is to trace a ray from the camera. I suspect this is what you're talking about ("to check for any intersection points"). But there is no need to do this in camera space. You can do this in world space. Here's a comparison :
Tracing rays in camera space : ray always starts from (0,0,0) and goes to (0,0,-1). Geometry must be transformed from Model space to World space, and then to Camera space, which is what annoys you
Tracing rays in world space : ray starts from camera position (in world space) and goes to (eyeCenter - eyePos).normalize(). Geometry must be transformed from Model space to World space.
Note that there is no third option (Tracing rays in Model space) which would avoid to transform the geometry from Model space to World space. However, you have a pair of workarounds :
First, your game's world is probably still : the Model matrix is probably always identity. So transforming its geometry from Model to World space is equivalent to doing nothing at all.
Secondly, for all other objets, you can take the opposite approach. Intead of transforming the entire geometry in one direction, transform only the ray the other way around : Take your Model matrix, inverse it, and you've got a matrix which goes from world space to model space. Multiply your ray's origin and direction by this matrix : your ray is now in model space. Intersect the normal way. Done.
Note that all I've said is standard techniques. No hacks or other weird stuff, just math :)