What is world space and eye space in game development? - opengl

I am reading a book about 3D concepts and OpenGL. The book always talks about world space, eye space, and so on.
What exactly is a world inside the computer monitor screen?
What is the world space?
What is eye space? Is it synonymous to projection?

World space
World space is the (arbitrarily chosen) frame of reference in which everything within the world is located in absolute coordinates.
Local space
Local space is space relative to another local frame of reference, in coordinates relative to the local frame.
For example, the mesh of a model will be constructed in relation to a coordinate system local to the model. When you move around the model in the world, the relative positions to each other of the points making up the model don't change. But they change within the world.
Hence there exists a model-to-world transformation from local to world space.
Eye (or view) space
Eye (or view) space is the world as seen by the viewer, i.e. all the positions of things in the world are no longer in relation to the (arbitrary) world coordinate system, but in relation to the viewer.
View space is somewhat special, because it not arbitrarily chosen. The coordinate (0, 0, 0) in view space is the position of the viewer and a certain direction (usually parallel to Z) is the direction of viewing.
So there exists a transformation world-to-view. Now because the viewer is always at the origin of the view space, setting the viewpoint is done by defining the world-to-view transformation.
Since for the purposes of rendering the graphics world space is of little use, you normally coalesce model-to-world and world-to-view transformations into a single model-to-view transformation.
Note that eye (or view) space is not the projection. Projection happens by a separate projection transform that transforms view-to-clip space.

You should read this: http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/
That tutorial uses term "camera space" instead of "eye space" but they are the same.

Related

Why does the camera face the negative end of the z-axis by default?

I am learning openGL from this scratchpixel, and here is a quote from the perspective project matrix chapter:
Cameras point along the world coordinate system negative z-axis so that when a point is converted from world space to camera space (and then later from camera space to screen space), if the point is to left of the world coordinate system y-axis, it will also map to the left of the camera coordinate system y-axis. In other words, we need the x-axis of the camera coordinate system to point to the right when the world coordinate system x-axis also points to the right; and the only way you can get that configuration, is by having camera looking down the negative z-axis.
I think it has something to do with the mirror image? but this explanation just confused me...why is the camera's coordinate by default does not coincide with the world coordinate(like every other 3D objects we created in openGL)? I mean, we will need to transform the camera coordinate anyway with a transformation matrix (whatever we want with the negative z set up, we can simulate it)...why bother?
It is totally arbitrary what to pick for z direction.
But your pick has a lot of deep impact.
One reason to stick with the GL -z way is that the culling of faces will match GL constant names like GL_FRONT. I'd advise just to roll with the tutorial.
Flipping the sign on just one axis also flips the "parity". So a front face becomes a back face. A znear depth test becomes zfar. So it is wise to pick one early on and stick with it.
By default, yes, it's "right hand" system (used in physics, for example). Your thumb is X-axis, index finger Y-axis, and when you make those go to right directions, Z-points (middle finger) to you. Why Z-axis has been selected to point inside/outside screen? Because then X- and Y-axes go on screen, like in 2D graphics.
But in reality, OpenGL has no preferred coordinate system. You can tweak it as you like. For example, if you are making maze game, you might want Y to go outside/inside screen (and Z upwards), so that you can move nicely at XY plane. You modify your view/perspective matrices, and you get it.
What is this "camera" you're talking about? In OpenGL there is no such thing as a "camera". All you've got is a two stage transformation chain:
vertex position → viewspace position (by modelview transform)
viewspace position → clipspace position (by projection transform)
To see why be default OpenGL is "looking down" -z, we have to look at what happens if both transformation steps do "nothing", i.e. full identity transform.
In that case all vertex positions passed to OpenGL are unchanged. X maps to window width, Y maps to window height. All calculations in OpenGL by default (you can change that) have been chosen adhere to the rules of a right hand coordinate system, so if +X points right and +Y points up, then Z+ must point "out of the screen" for the right hand rule to be consistent.
And that's all there is about it. No camera. Just linear transformations and the choice of using right handed coordinates.

OpenGL understand two viewpoint to see model transformation

There are two viewpoints to understand model transformations,I read from Red book 7th edition,[ Grand, Fixed Coordinate System ] and [ Moving a Local Coordinate System ].
My Question is:
What is the difference between the two viewpoints and when to use them for a specified situation ?
additional context info:
I'd like to give some context for you to help me,Or you can just ignore below details.
I understood these two viewpoints in the following way.Think I have following code:
(functions like glTranslatef are deprecated,replaced by math library ,but theory may keep helpful.)
//render the sence,and use orthogonal projection
void display( void )
{
glClear(GL_COLOR_BUFFER_BIT|GL_DEPTH_BUFFER_BIT);
glLoadIdentity();
drawAixs(4.8f);//draw x y z aixs,4.8 is axis length
glRotatef(45.0,0.0,0.0,1.0);
glTranslatef(3.0,0.0,0.0);
glutSolidCube(2.0);
glutSwapBuffers();
}
From the local coordinate view :
In this viewpoint,we can understand it like :
And the current transformation matrix(CTM) is:
From global fixed coordinate view :
In this viewpoint,we can get :
The two coordinate systems are just a convention. There is always one global coordinate system, but there may be many local systems. The notion of a local system is only a convention. A general transformation matrix M transforms vertices from one space to another:
v' = M * v
So that v is in one coordinate space and v' is in another. In case M describes position of an object, we say that it is putting vertices from local object coordinate frame to the global world coordinate frame. On the other hand, if it describes camera position and projection, it puts vertices from the world coordinate frame to another local eye coordinate frame.
A complex objects with joints (such as humanoid characters, or mechanisms with hinges) may actually have several intermediate local spaces, where the transformations are chained, depending on the structure of the skeleton of the object.
Usually, one uses object space where the vertices of the models are defined, world space which is a global space where the coordinates can be related to each other, and camera space or eye space which is space of screen coordinates. But with the introduction of OpenGL 3 shaders, these are completely arbitrary: you can write your shader so that it uses a single matrix that transforms the vertices from object space directly to screen space. So do not worry about coordinate frames, just focus on the task at hand - what it is you want to display and how the objects should be moving (relative to each other or relative to some common reference).

openGL Camera space vs NDC viewing direction

I am trying to understand open GL concepts . While reading this tutorial - http://www.arcsynthesis.org/gltut/Positioning/Tut04%20Perspective%20Projection.html,
I came accross this statement :
This is because camera space and NDC space have different viewing directions. In camera space, the camera looks down the -Z axis; more negative Z values are farther away. In NDC space, the camera looks down the +Z axis; more positive Z values are farther away. The diagram flips the axis so that the viewing direction can remain the same between the two images (up is away).
I am confused as to why the viewing direction has to change . Could some one please help me understand this with an example ?
This is mostly just a convention. OpenGL clip space (and NDC space and screen space) has always been defined as left-handed (with z pointing away into the screen) by the spec.
OpenGL eye space had been defined with camera at origin and looking at -z direction (so right-handed). However, this convention was just meaningful in the fixed-function pipeline, where together with the fixed function per vertex lighting which was carried out in eye space, the viewing direction did matter cases like whenGL_LOCAL_VIEWER was disabled (as was the default).
The classic GL projection matrix typically converts the handedness, and the perspecitve division is done with a divisior of -z_eye, typically, so the last row of the projection matrix is typically (0, 0, -1, 0). The old glFrustum(), glOrtho(), and gluPerspective() actually supported that convention by using the z_near and z_far clipping distances negated, so that you had to specify positive values for clip planes to lie before the camera at z<0.
However, with modern GL, this convention is more or less meaningless. There is no fixed-function unit left which does work in eye space, so the eye space (and anything before that) is totally under the user's control. You can use anything you like here. The clip space and all the later spaces are still used by fixed function units (clipping, rasterization, ...), so there most be some convention to define the interface, and it is still a left-handed system.
Even in modern GL, the old right-handed eye space convention is still in use. The popular glm library for example reimplements the old GL matrix functions the same way.
There is really no reason to prefer one of the possible conventions over the other, but at some point, you have to choose and stick to one.

Translate/Rotate "World" or "Camera" in OpenGL

Before I ask the question: Yes I know that there doesn't exist a camera in OpenGL - but the setLookAt-Method is nearly the same for me ;)
What I was wondering about: If I have the task, to look at a specific point with a specific distance in my scene I basically have two options:
I could change the eyeX,eyeY,eyeZ and the centerX, centerY, centerZ values of my lookAt-Method to achieve this or I could translate my model itself.
Let's say I'm translating/rotating my model. How would I ever know where to put my center/eye-coords of my setLookAt to look at a specific point? Because the world is rotated, the point (x,y,z) is also translated and rotated. So basically when I want to look at the point x,y,z the values are changing after the rotation/translation and it's impossible for me, to look at this point.
When I only transform my eye and center-values of my lookAt I can easily look at the point again - am I missing something? Seems not like a good way to move the model instead of the camera...
It helps to understand your vector spaces.
Model Space: The intrinsic coordinate system of an object. Basically how it lines up with XYZ axes in your 3D modeling software.
World Space: Where everything is in your universe. When you move your camera in a scene layout program, the XYZ axes don't change. This is the coordinate system you're used to interacting with and thinking about.
Camera Space: This is where everything is with respect to your camera. The camera's position in camera space is the by definition the origin, and your XYZ axes are your orthonormalized right, up, and look vectors. When you move or rotate your camera, all the positions and orientations of your objects change with it in camera space. This isn't intuitive - when you walk around, you see think of everything "staying the same way" - it didn't actually move. That's because you're thinking in world space. In camera space, the position and orientation of everything is relative to your eye. If a chair's position is 5 units in front of you (ie (0,0,-5) in camera space) and you want 2 units towards it, the chair's position is now (0,0,-3).
How do I set a lookat?
What does a lookat function do, exactly? It's a convenient way to set up your view matrix without you having to understand what it's doing.
Your eye variables are the camera's position in world space. IE they're what you think they are. The same goes for the center variables - they're the position of your object in world space. From here you get the transformation from world space to camera space that you give to OpenGL.

point - plane collision without the glutLookAt* functions

As I have understood, it is recommended to use glTranslate / glRotate in favour of glutLootAt. I am not going to seek the reasons beyond the obvious HW vs SW computation mode, but just go with the wave. However, this is giving me some headaches as I do not exactly know how to efficiently stop the camera from breaking through walls. I am only interested in point-plane intersections, not AABB or anything else.
So, using glTranslates and glRotates means that the viewpoint stays still (at (0,0,0) for simplicity) while the world revolves around it. This means to me that in order to check for any intersection points, I now need to recompute the world's vertices coordinates (which was not needed with the glutLookAt approach) for every camera movement.
As there is no way in obtaining the needed new coordinates from GPU-land, they need to be calculated in CPU land by hand. For every camera movement ... :(
It seems there is the need to retain the current rotations aside each of the 3 axises and the same for translations. There is no scaling used in my program. My questions:
1 - is the above reasoning flawed ? How ?
2 - if not, there has to be a way to avoid such recalculations.
The way I see it (and by looking at http://www.glprogramming.com/red/appendixf.html) it needs one matrix multiplication for translations and another one for rotating (only aside the y axis needed). However, having to compute so many additions / multiplications and especially the sine / cosine will certainly be killing FPS. There are going to be thousands or even tens of thousands of vertices to compute on. Every frame... all the maths... After having computed the new coordinates of the world things seem to be very easy - just see if there is any plane that changed its 'd' sign (from the planes equation ax + by + cz + d = 0). If it did, use a lightweight cross products approach to test if the point is inside the space inside each 'moving' triangle of that plane.
Thanks
edit: I have found about glGet and I think it is the way to go but I do not know how to properly use it:
// Retains the current modelview matrix
//glPushMatrix();
glGetFloatv(GL_MODELVIEW_MATRIX, m_vt16CurrentMatrixVerts);
//glPopMatrix();
m_vt16CurrentMatrixVerts is a float[16] which gets filled with 0.f or 8.67453e-13 or something similar. Where am I screwing up ?
gluLookAt is a very handy function with absolutely no performance penalty. There is no reason not to use it, and, above all, no "HW vs SW" consideration about that. As Mk12 stated, glRotatef is also done on the CPU. The GPU part is : gl_Position = ProjectionMatrix x ViewMatrix x ModelMatrix x VertexPosition.
"using glTranslates and glRotates means that the viewpoint stays still" -> same thing for gluLookAt
"at (0,0,0) for simplicity" -> not for simplicity, it's a fact. However, this (0,0,0) is in the Camera coordinate system. It makes sense : relatively to the camera, the camera is at the origin...
Now, if you want to prevent the camera from going through the walls, the usual method is to trace a ray from the camera. I suspect this is what you're talking about ("to check for any intersection points"). But there is no need to do this in camera space. You can do this in world space. Here's a comparison :
Tracing rays in camera space : ray always starts from (0,0,0) and goes to (0,0,-1). Geometry must be transformed from Model space to World space, and then to Camera space, which is what annoys you
Tracing rays in world space : ray starts from camera position (in world space) and goes to (eyeCenter - eyePos).normalize(). Geometry must be transformed from Model space to World space.
Note that there is no third option (Tracing rays in Model space) which would avoid to transform the geometry from Model space to World space. However, you have a pair of workarounds :
First, your game's world is probably still : the Model matrix is probably always identity. So transforming its geometry from Model to World space is equivalent to doing nothing at all.
Secondly, for all other objets, you can take the opposite approach. Intead of transforming the entire geometry in one direction, transform only the ray the other way around : Take your Model matrix, inverse it, and you've got a matrix which goes from world space to model space. Multiply your ray's origin and direction by this matrix : your ray is now in model space. Intersect the normal way. Done.
Note that all I've said is standard techniques. No hacks or other weird stuff, just math :)