Can I rotate the view of glOrtho()? - opengl

I'm making a program where I need an orthographic projection.
So, I'm using glOrtho(). I made a zoom function but I was wandering if you can rotate your view?
Because glOrho() only looks parralel to other planes.
Or is there another projection wich can do that.
glLookAt can rotate but it changes dimensions when further from the camera.
I read
How can I make the glOrtho parallelepiped rotating?
but it didn't give me an answer.

What's essential here is that you usually split the operations into two matrices: The (Model)View and The Projection.
While glOrtho() is usually called with glMatrixMode(GL_PROJECTION), all operations regarding moving and rotating the camera (such as glRotate*, glTranslate* and gluLookAt) should be preceeded by glMatrixMode(GL_MODELVIEW).
In fixed pipeline, the final position of the vertex is calculated by multiplying input data by these two matrices, and the projection used (orthographic or perspective or nonlinear whatnot) is separate from camera transformations.

Related

Correct way to translate in a 3D world after rotating?

I have a basic cube generated in my 3D world. I can rotate correctly around the camera, but when I translate after rotating, the translations are not correct.
For example, if I rotate 90 degrees and translate into the Z axis, it would move as if translating in the X axis.
glLoadIdentity();
glRotatef(angle,0,1,0); //Rotate around the camera.
glTranslatef(movX,movY,movZ); //Translate after rotating around the camera.
glCallList(cubes[0]);
I need some help with this. Also, I tried translating before rotating, but the rotation is not at the camera. It is at the edge of the cube.
Keep in mind that in OpenGL that the transformation is applied to the camera, not the objects rendered; so you observe the inverse of the transformation you expected.
Also in OpenGL the Y and Z axes are flipped (Y is vertical), so you observe a horizontal translation instead of a vertical one.
Also, because the object is rotated though 90 degrees about Y, the X and Z axes replace each other (one of them is reversed).
willywonkadailyblah's answer is half correct. Because you are using the old OpenGL, you're using the old matrix stack. You are modifying the modelview matrix when you're doing your glRotatef and glTranslatef calls. The modelview matrix is actually the model's matrix and the camera's view matrix precombined (already multiplied together). These matrices are what determine where your object is in 3D space and where your viewing position/direction of the world is. So you can think of your calls as moving the camera, but it's probably easier to think of them as moving and rotating the world.
These rotate and translation calls are linear transformations. This has a precise definition, but for our purposes it means that you can represent the transformation as a matrix and you multiply it with the point's coordinates to apply the transformation to a point. Now matrix multiplication is not commutative, meaning AB != BA. All this to say that when you rotate, then translate it is different than translating and rotating, which I think you know. But then when you translate, rotate, and translate again, it might be a little more difficult to follow what you're actually doing. Worse even if you throw in some scaling in there. So I would suggest learning how linear transformations work and maintaining your own matrices for the objects and camera if you're serious about learning OpenGL.
learnopengl.org is an excellent website, but it teaches you Modern OpenGL, not what you're currently using. But the lesson on transformations and on coordinate systems are probably generally helpful, even without exact code for you to follow

Why do modeview and camera matrices use RUB orientation

I usually find matrix libraries building both modelview and cameras matrices from the RUB (right-up-back) vectors, as depicted in these pages:
http://3dengine.org/Right-up-back_from_modelview
http://3dengine.org/Modelview_matrix
Is the RUB tuple just a common standard?
Otherwise, is there a reason the RUB vectors are preferred over any other orientation (such as forward-up-right)?
Particularly if you're using the programmable pipeline, you have almost complete freedom about the coordinate system you work in, and how you transform your geometry. But once all your transformations are applied in the vertex shader (resulting in the vector assigned to gl_Position), there is still a fixed function block in the pipeline between the vertex shader and fragment shader. That fixed function block relies on the transformed vertices being in a well defined coordinate system.
gl_Position is in a coordinate system called "clip coordinates", which then turns into "normalized device coordinates" (NDC) after dividing by the w coordinate of the vector.
Based on the vector in NDC, the fixed function rasterization block generates pixels. It will use the first coordinate to map to the horizontal window direction, and the second coordinate to map to the vertical window direction. The third coordinate will be used to calculate the depth, which can be used for depth testing.
This means that after all transformations are applied, the first coordinate has to be left-right, the second coordinate has to be bottom-up, and the third coordinate has to be front-back (well, it could be back-front if you change the depth test).
If you use a classic setup with modelview and projection matrix, it makes sense to use the modelview matrix to transform the original geometry into this orientation, and then use the projection matrix to apply e.g. a perspective.
I don't think there's anything stopping you from using a different orientation as the result of the modelview transformation, and then include a rotation in the projection matrix to transform the whole thing into the correct clip coordinate space. But I don't see a benefit, and it looks like it would just add unnecessary confusion.

OpenGL spectrum between perspective and orthogonal projection

In OpenGL (all versions, though I happen to be working in OpenGL ES 2.0) there is the option of using a perspective projection versus an orthogonal one. Is there a way to control the degree of orthogonality?
For the sake of picturing the issue (and please don't take this as the actual question, I am well aware there is no camera in OpenGL) assume that a scene is rendered with the viewport "looking" down the -z axis. Two parallel lines extending a finite distance down the -z axis at (x,y)=1,1 and (x,y)=-1,1 will appear as points in orthogonal projection, or as two lines that eventually converge to a single pixel in perspective projection. Is there a way to have the x- and y- values represented by the outer edges of the screen remain the same as in projection space - I assume this requires not changing the frustum - but have the lines only converge part of the way to a single pixel?
Is there a way to control the degree of orthogonality?
Either something is orthogonal, or it is not. There's no such thing like "just a little orthogonal".
Anyway, from a mathematical point of view, a perspective projection with an infinitely narrow field of view is orthogonal. So you can use glFrustum with a very large near and far plane distance, together with a countering translation in modelview to bring the far away viewing volume back to the origin.

Why would it be beneficial to have a separate projection matrix, yet combine model and view matrix?

When you are learning 3D programming, you are taught that it's easiest think in terms of 3 transformation matrices:
The Model Matrix. This matrix is individual to every single model and it rotates and scales the object as desired and finally moves it to its final position within your 3D world. "The Model Matrix transforms model coordinates to world coordinates".
The View Matrix. This matrix is usually the same for a large number of objects (if not for all of them) and it rotates and moves all objects according to the current "camera position". If you imaging that the 3D scene is filmed by a camera and what is rendered on the screen are the images that were captured by this camera, the location of the camera and its viewing direction define which parts of the scene are visible and how the objects appear on the captured image. There are little reasons for changing the view matrix while rendering a single frame, but those do in fact exists (e.g. by rendering the scene twice and changing the view matrix in between, you can create a very simple, yet impressive mirror within your scene). Usually the view matrix changes only once between two frames being drawn. "The View Matrix transforms world coordinates to eye coordinates".
The Projection Matrix. The projection matrix decides how those 3D coordinates are mapped to 2D coordinates, e.g. if there is a perspective applied to them (objects get smaller the farther they are away from the viewer) or not (orthogonal projection). The projection matrix hardly ever changes at all. It may have to change if you are rendering into a window and the window size has changed or if you are rendering full screen and the resolution has changed, however only if the new window size/screen resolution has a different display aspect ratio than before. There are some crazy effects for that you may want to change this matrix but in most cases its pretty much constant for the whole live of your program. "The Projection Matrix transforms eye coordinates to screen coordinates".
This makes all a lot of sense to me. Of course one could always combine all three matrices into a single one, since multiplying a vector first by matrix A and then by matrix B is the same as multiplying the vector by matrix C, where C = B * A.
Now if you look at the classical OpenGL (OpenGL 1.x/2.x), OpenGL knows a projection matrix. Yet OpenGL does not offer a model or a view matrix, it only offers a combined model-view matrix. Why? This design forces you to permanently save and restore the "view matrix" since it will get "destroyed" by model transformations applied to it. Why aren't there three separate matrices?
If you look at the new OpenGL versions (OpenGL 3.x/4.x) and you don't use the classical render pipeline but customize everything with shaders (GLSL), there are no matrices available any longer at all, you have to define your own matrices. Still most people keep the old concept of a projection matrix and a model-view matrix. Why would you do that? Why not using either three matrices, which means you don't have to permanently save and restore the model-view matrix or you use a single combined model-view-projection (MVP) matrix, which saves you a matrix multiplication in your vertex shader for ever single vertex rendered (after all such a multiplication doesn't come for free either).
So to summarize my question: Which advantage has a combined model-view matrix together with a separate projection matrix over having three separate matrices or a single MVP matrix?
Look at it practically. First, the fewer matrices you send, the fewer matrices you have to multiply with positions/normals/etc. And therefore, the faster your vertex shaders.
So point 1: fewer matrices is better.
However, there are certain things you probably need to do. Unless you're doing 2D rendering or some simple 3D demo-applications, you are going to need to do lighting. This typically means that you're going to need to transform positions and normals into either world or camera (view) space, then do some lighting operations on them (either in the vertex shader or the fragment shader).
You can't do that if you only go from model space to projection space. You cannot do lighting in post-projection space, because that space is non-linear. The math becomes much more complicated.
So, point 2: You need at least one stop between model and projection.
So we need at least 2 matrices. Why model-to-camera rather than model-to-world? Because working in world space in shaders is a bad idea. You can encounter numerical precision problems related to translations that are distant from the origin. Whereas, if you worked in camera space, you wouldn't encounter those problems, because nothing is too far from the camera (and if it is, it should probably be outside the far depth plane).
Therefore: we use camera space as the intermediate space for lighting.
In most cases your shader will need the geometry in world or eye coordinates for shading so you have to seperate the projection matrix from the model and view matrices.
Making your shader multiply the geometry with two matrices hurts performance. Assuming each model have thousends (or more) vertices it is more efficient to compute a model view matrix in the cpu once, and let the shader do one less mtrix-vector multiplication.
I have just solved a z-buffer fighting problem by separating the projection matrix. There is no visible increase of the GPU load. The two folowing screenshots shows the two results - pay attention to the green and white layers fighting.

OpenGL define vertex position in pixels

I've been writing a 2D basic game engine in OpenGL/C++ and learning everything as I go along. I'm still rather confused about defining vertices and their "position". That is, I'm still trying to understand the vertex-to-pixels conversion mechanism of OpenGL. Can it be explained briefly or can someone point to an article or something that'll explain this. Thanks!
This is rather basic knowledge that your favourite OpenGL learning resource should teach you as one of the first things. But anyway the standard OpenGL pipeline is as follows:
The vertex position is transformed from object-space (local to some object) into world-space (in respect to some global coordinate system). This transformation specifies where your object (to which the vertices belong) is located in the world
Now the world-space position is transformed into camera/view-space. This transformation is determined by the position and orientation of the virtual camera by which you see the scene. In OpenGL these two transformations are actually combined into one, the modelview matrix, which directly transforms your vertices from object-space to view-space.
Next the projection transformation is applied. Whereas the modelview transformation should consist only of affine transformations (rotation, translation, scaling), the projection transformation can be a perspective one, which basically distorts the objects to realize a real perspective view (with farther away objects being smaller). But in your case of a 2D view it will probably be an orthographic projection, that does nothing more than a translation and scaling. This transformation is represented in OpenGL by the projection matrix.
After these 3 (or 2) transformations (and then following perspective division by the w component, which actually realizes the perspective distortion, if any) what you have are normalized device coordinates. This means after these transformations the coordinates of the visible objects should be in the range [-1,1]. Everything outside this range is clipped away.
In a final step the viewport transformation is applied and the coordinates are transformed from the [-1,1] range into the [0,w]x[0,h]x[0,1] cube (assuming a glViewport(0, w, 0, h) call), which are the vertex' final positions in the framebuffer and therefore its pixel coordinates.
When using a vertex shader, steps 1 to 3 are actually done in the shader and can therefore be done in any way you like, but usually one conforms to this standard modelview -> projection pipeline, too.
The main thing to keep in mind is, that after the modelview and projection transforms every vertex with coordinates outside the [-1,1] range will be clipped away. So the [-1,1]-box determines your visible scene after these two transformations.
So from your question I assume you want to use a 2D coordinate system with units of pixels for your vertex coordinates and transformations? In this case this is best done by using glOrtho(0.0, w, 0.0, h, -1.0, 1.0) with w and h being the dimensions of your viewport. This basically counters the viewport transformation and therefore transforms your vertices from the [0,w]x[0,h]x[-1,1]-box into the [-1,1]-box, which the viewport transformation then transforms back to the [0,w]x[0,h]x[0,1]-box.
These have been quite general explanations without mentioning that the actual transformations are done by matrix-vector-multiplications and without talking about homogenous coordinates, but they should have explained the essentials. This documentation of gluProject might also give you some insight, as it actually models the transformation pipeline for a single vertex. But in this documentation they actually forgot to mention the division by the w component (v" = v' / v'(3)) after the v' = P x M x v step.
EDIT: Don't forget to look at the first link in epatel's answer, which explains the transformation pipeline a bit more practical and detailed.
It is called transformation.
Vertices are set in 3D coordinates which is transformed into a viewport coordinates (into your window view). This transformation can be set in various ways. Orthogonal transformation can be easiest to understand as a starter.
http://www.songho.ca/opengl/gl_transform.html
http://www.opengl.org/wiki/Vertex_Transformation
http://www.falloutsoftware.com/tutorials/gl/gl5.htm
Firstly be aware that OpenGL not uses standard pixel coordinates. I mean by that for particular resolution, ie. 800x600 you dont have horizontal coordinates in range 0-799 or 1-800 stepped by one. You rather have coordinates ranged from -1 to 1 later send to graphic card rasterizing unit and after that matched to particular resolution.
I ommited one step here - before all that you have an ModelViewProjection matrix (or viewProjection matrix in some simple cases) which before all that will cast coordinates you use to an projection plane. Default use of that is to implement a camera which converts 3D space of world (View for placing an camera into right position and Projection for casting 3d coordinates into screen plane. In ModelViewProjection it's also step of placing a model into right place in world).
Another case (and you can use Projection matrix this way to achieve what you want) is to use these matrixes to convert one range of resolutions to another.
And there's a trick you will need. You should read about modelViewProjection matrix and camera in openGL if you want to go serious. But for now I will tell you that with proper matrix you can just cast your own coordinate system (and ie. use ranges 0-799 horizontaly and 0-599 verticaly) to standarized -1:1 range. That way you will not see that underlying openGL api uses his own -1 to 1 system.
The easiest way to achieve this is glOrtho function. Here's the link to documentation:
http://www.opengl.org/sdk/docs/man/xhtml/glOrtho.xml
This is example of proper usage:
glMatrixMode (GL_PROJECTION)
glLoadIdentity ();
glOrtho (0, 800, 600, 0, 0, 1)
glMatrixMode (GL_MODELVIEW)
Now you can use own modelView matrix ie. for translation (moving) objects but don't touch your projection example. This code should be executed before any drawing commands. (Can be after initializing opengl in fact if you wont use 3d graphics).
And here's working example: http://nehe.gamedev.net/tutorial/2d_texture_font/18002/
Just draw your figures instead of drawing text. And there is another thing - glPushMatrix and glPopMatrix for choosen matrix (in this example projection matrix) - you wont use that until you combining 3d with 2d rendering.
And you can still use model matrix (ie. for placing tiles somewhere in world) and view matrix (in example for zooming view, or scrolling through world - in this case your world can be larger than resolution and you could crop view by simple translations)
After looking at my answer I see it's a little chaotic but If you confused - just read about Model, View, and Projection matixes and try example with glOrtho. If you're still confused feel free to ask.
MSDN has a great explanation. It may be in terms of DirectX but OpenGL is more-or-less the same.
Google for "opengl rendering pipeline". The first five articles all provide good expositions.
The key transition from vertices to pixels (actually, fragments, but you won't be too far off if you think "pixels") is in the rasterization stage, which occurs after all vertices have been transformed from world-coordinates to screen coordinates and clipped.