In old OpenGL versions spatial projection (where objects become smaller with growing distance) could be enabled via a call to
glMatrixMode(GL_PROJECTION);
(at least as far as I remember this was the call to enable this mode).
In OpenGL 3 - when this slow stack-mode is no longer used - this function does not work any more.
So how can I have the same spatial effect here? What is the intended way for this?
You've completely misunderstood what glMatrixMode actually did. The old, legacy OpenGL fixed function pipeline kept a set of matrices around, which were all used indiscriminately when drawing stuff. The two most important matrices were:
the modelview matrix, which is used to describe the transformation from model local space into view space. View space is still kind of abstract, but it can be understood as the world transformed into the coordinate space of the "camera". Illumination calculations happened in that space.
the projection matrix, which is used to describe the transformation from view space into clip space. Clip space is an intermediary stage right before reaching device coordinates (there are a few important details involved in this, but those are not important right now), which mostly involves applying the homogeneous divide, i.e. scaling the clip coordinate vector by the reciprocal of its w-component.
The fixed transformation pipeline always was
position_view := Modelview · position
do illumination calculations with position_view
position_clip := Projection · position_view
position_pre_ndc := position_clip · 1/position_clip.w
In legacy OpenGL the modelview and projection matrices are always there. glMatrixMode is a selector that determines which of the existing matrices is subject to the operations done by the matrix manipulation functions. One of these functions is glFrustum, which generates a perspective matrix and multiplies it onto the selected matrix, i.e. a matrix which will create a perspective effect through the homogeneous divide.
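As a rough illustration of the pipeline steps listed above, here is a minimal C sketch that does the same arithmetic on the CPU. The type and function names (Mat4, Vec4, frustum, mat4_mul_vec4, perspective_divide) are made up for this example and are not part of OpenGL; only the matrix entries follow what the glFrustum documentation describes. The illumination step is left out and the modelview matrix is assumed to be the identity.

```c
#include <assert.h>

/* 4x4 matrix in column-major order, as OpenGL stores it: m[col * 4 + row]. */
typedef struct { double m[16]; } Mat4;
typedef struct { double x, y, z, w; } Vec4;

/* Builds the matrix that glFrustum(l, r, b, t, n, f) multiplies onto the
   currently selected matrix. */
Mat4 frustum(double l, double r, double b, double t, double n, double f)
{
    assert(n > 0.0 && f > n && r != l && t != b);
    Mat4 p = {{0}};
    p.m[0]  = 2.0 * n / (r - l);
    p.m[5]  = 2.0 * n / (t - b);
    p.m[8]  = (r + l) / (r - l);
    p.m[9]  = (t + b) / (t - b);
    p.m[10] = -(f + n) / (f - n);
    p.m[11] = -1.0;                    /* puts -z_view into w_clip */
    p.m[14] = -2.0 * f * n / (f - n);
    return p;
}

/* position_clip := Projection * position_view */
Vec4 mat4_mul_vec4(const Mat4 *a, Vec4 v)
{
    Vec4 r;
    r.x = a->m[0] * v.x + a->m[4] * v.y + a->m[8]  * v.z + a->m[12] * v.w;
    r.y = a->m[1] * v.x + a->m[5] * v.y + a->m[9]  * v.z + a->m[13] * v.w;
    r.z = a->m[2] * v.x + a->m[6] * v.y + a->m[10] * v.z + a->m[14] * v.w;
    r.w = a->m[3] * v.x + a->m[7] * v.y + a->m[11] * v.z + a->m[15] * v.w;
    return r;
}

/* position_pre_ndc := position_clip * 1/position_clip.w */
Vec4 perspective_divide(Vec4 c)
{
    assert(c.w != 0.0);
    Vec4 r = { c.x / c.w, c.y / c.w, c.z / c.w, 1.0 };
    return r;
}
```

With frustum(-1, 1, -1, 1, 1, 10), the view-space point (1, 0, -2) ends up at x = 0.5 after the divide, while (1, 0, -4) ends up at x = 0.25: twice the distance, half the size. That shrinking comes entirely from the division by w.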
So how can I have the same spatial effect here? What is the intended way for this?
You generate a perspective matrix of the desired properties, and use it to transform the vertex attribute you designate as model local position into clip space and submit that into the gl_Position output of the vertex shader. The usual way to do this is by passing in a modelview and a projection matrix as uniforms.
The most bare bones GLSL vertex shader doing that would be
#version 330
uniform mat4 modelview;
uniform mat4 projection;
in vec4 pos;
void main(){
gl_Position = projection * modelview * pos;
}
As for generating the projection matrix: All the popular computer graphics math libraries got you covered and have functions for that.
I am drawing a stack of decals on a quad. Same geometry, different textures. Z-fighting is the obvious result. I cannot control the rendering order or use glPolygonOffset due to batched rendering. So I adjust depth values inside the vertex shader.
gl_Position = uMVPMatrix * pos;
gl_Position.z += aDepthLayer * uMinStep * gl_Position.w;
gl_Position holds clip coordinates. That means a change in z will move a vertex along its view ray and bring it to the front or push it to the back. For normalized device coordinates the clip coords get divided by gl_Position.w (= -Zeye). As a result the depth buffer does not have a linear distribution and has higher resolution towards the near plane. By premultiplying with gl_Position.w that should be fixed and I should be able to apply a flat amount (uMinStep) to the NDC.
That minimum step should be something like 1/(2^GL_DEPTH_BITS - 1). Or, since NDC space goes from -1.0 to 1.0, it might have to be twice that amount. However, it does not work with these values. The minStep is roughly 0.00000006 but it does not bring a texture to the front. Neither does doubling that value. If I drop a zero (scale by 10), it works. (Yay, that's something!)
But it does not work evenly along the frustum. A value that brings a texture in front of another while the quad is close to the near plane does not necessarily do the same when the quad is close to the far plane. The same effect happens when I make the frustum deeper. I would expect that behaviour if I was changing eye coordinates, because of the nonlinear z-Buffer distribution. But it seems that premultiplying gl_Position.w is not enough to counter that.
Am I missing some part of the transformations that happen to clip coords? Do I need to use a different formula in general? Do I have to include the depth range [0,1] somehow?
Could the different behaviour along the frustum be a result of nonlinear floating point precision instead of nonlinear z-Buffer distribution? So maybe the calculation is correct, but the minStep just cannot be handled correctly by floats at some point in the pipeline?
The general question: How do I calculate a z-Shift for gl_Position (clip coordinates) that will create a fixed change in the depth buffer later? How can I make sure that the z-Shift will bring one texture in front of another no matter where in the frustum the quad is placed?
Some material:
OpenGL depth buffer faq
https://www.opengl.org/archives/resources/faq/technical/depthbuffer.htm
Same with better readable formulas (but some typos, be careful)
https://www.opengl.org/wiki/Depth_Buffer_Precision
Calculation from eye coords to z-buffer. Most of that happens already when I multiply the projection matrix.
http://www.sjbaker.org/steve/omniv/love_your_z_buffer.html
Explanation about the elements in the projection matrix that turn into the A and B parts in most depth buffer calculation formulas.
http://www.songho.ca/opengl/gl_projectionmatrix.html
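To make the arithmetic in the question concrete, here is a small C sketch of what the premultiplied shift does in exact arithmetic. It assumes a fixed-point depth buffer and the default glDepthRange(0, 1), and the function names (ndc_shift_for_depth_steps, offset_clip_z, window_depth) are invented for this example. Since (z + k*w)/w = z/w + k, the shift in NDC is the same constant k everywhere in the frustum, and one depth-buffer step corresponds to 2/(2^bits - 1) in NDC.

```c
#include <assert.h>
#include <stdint.h>

/* NDC z shift that corresponds to `steps` increments of a fixed-point depth
   buffer with `depth_bits` bits, assuming glDepthRange(0, 1). */
double ndc_shift_for_depth_steps(int depth_bits, double steps)
{
    assert(depth_bits > 0 && depth_bits <= 32);
    double max_value = (double)((UINT64_C(1) << depth_bits) - 1);
    /* window depth = 0.5 * ndc_z + 0.5, so one buffer step is 2 / max_value in NDC */
    return 2.0 * steps / max_value;
}

/* Same form as the shader line: clip z plus (shift * clip w). */
double offset_clip_z(double clip_z, double clip_w, double ndc_shift)
{
    return clip_z + ndc_shift * clip_w;
}

/* Perspective divide followed by the default depth range mapping. */
double window_depth(double clip_z, double clip_w)
{
    assert(clip_w != 0.0);
    return 0.5 * (clip_z / clip_w) + 0.5;
}
```

This sketch runs in double precision, so it only shows that the formula itself is depth-independent. A shader typically computes in 32-bit floats, whose spacing near 1.0 is roughly 6e-8 to 1.2e-7, i.e. the same order as a single 24-bit depth step, so it is at least plausible that shifts of that size get rounded away before they reach the depth buffer. That is an assumption about the cause, not something verified here.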
How can one mix orthographic and perspective projection in OpenGL?
Some 2D elements have to be drawn in screen space (no scaling, rotation, etc.).
These 2D elements have a z position; they have to appear in front of or behind other 3D elements.
So I set up an orthographic projection, draw all 2D elements, then set up a perspective projection and draw all 3D elements.
The result is that all 2D elements are drawn on top. It seems that the z values from the ortho projection and the z values from the perspective projection are not compatible (GL_DEPTH_TEST).
Separately all 2D and all 3D elements work fine; the problem is when I try to mix them.
Does the perspective projection change the z values? In what way?
Is it possible to use z values from the ortho projection mixed with z values from the perspective projection for the depth test, or is this whole concept flawed?
Bare OpenGL 1.5
It seems that the z values from the ortho projection and the z values from the perspective projection are not compatible (GL_DEPTH_TEST).
That is indeed the case. Perspective transformation maps the Z values nonlinearly to the depth buffer values. The usual way to address this problem is to copy the depth buffer after the perspective pass into a depth texture and use that as an additional input in the fragment shader of the orthographically drawn stuff, reverse the nonlinearity in the depth input and compare the incoming Z coordinate with that; then discard appropriately.
It's also possible to emit linear depth values in the fragment shaders of the perspective-drawn geometry; however, the depth nonlinearity of perspective projection has its purpose: without it you lose depth precision where it matters most, close to the point of view.
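To see how different the two mappings are, here is a small C sketch, assuming both passes use the same near and far planes, a glFrustum/glOrtho style projection and the default glDepthRange(0, 1). The function names are made up for the example; d is the distance in front of the camera (d = -z_eye).

```c
#include <assert.h>

/* Depth buffer value in [0,1] written by a perspective projection. */
double perspective_depth(double d, double n, double f)
{
    assert(n > 0.0 && f > n && d > 0.0);
    double ndc_z = (f + n) / (f - n) - 2.0 * f * n / ((f - n) * d);
    return 0.5 * ndc_z + 0.5;
}

/* Depth buffer value in [0,1] written by an orthographic projection. */
double ortho_depth(double d, double n, double f)
{
    assert(f != n);
    return (d - n) / (f - n);
}

/* Reverses the perspective nonlinearity: depth buffer value back to distance. */
double linearize_perspective_depth(double depth, double n, double f)
{
    assert(n > 0.0 && f > n);
    double ndc_z = 2.0 * depth - 1.0;
    return 2.0 * f * n / (f + n - ndc_z * (f - n));
}
```

With n = 1 and f = 100, a point at distance 2 gets a depth of about 0.505 under perspective but only about 0.010 under ortho, which is why the orthographically drawn 2D elements win the depth test almost everywhere. linearize_perspective_depth is the "reverse the nonlinearity" step mentioned above, here done on the CPU instead of in a fragment shader.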
I usually find matrix libraries building both modelview and camera matrices from the RUB (right-up-back) vectors, as depicted in these pages:
http://3dengine.org/Right-up-back_from_modelview
http://3dengine.org/Modelview_matrix
Is the RUB tuple just a common standard?
Otherwise, is there a reason the RUB vectors are preferred over any other orientation (such as forward-up-right)?
Particularly if you're using the programmable pipeline, you have almost complete freedom about the coordinate system you work in, and how you transform your geometry. But once all your transformations are applied in the vertex shader (resulting in the vector assigned to gl_Position), there is still a fixed function block in the pipeline between the vertex shader and fragment shader. That fixed function block relies on the transformed vertices being in a well defined coordinate system.
gl_Position is in a coordinate system called "clip coordinates", which then turns into "normalized device coordinates" (NDC) after dividing by the w coordinate of the vector.
Based on the vector in NDC, the fixed function rasterization block generates pixels. It will use the first coordinate to map to the horizontal window direction, and the second coordinate to map to the vertical window direction. The third coordinate will be used to calculate the depth, which can be used for depth testing.
This means that after all transformations are applied, the first coordinate has to be left-right, the second coordinate has to be bottom-up, and the third coordinate has to be front-back (well, it could be back-front if you change the depth test).
If you use a classic setup with modelview and projection matrix, it makes sense to use the modelview matrix to transform the original geometry into this orientation, and then use the projection matrix to apply e.g. a perspective.
I don't think there's anything stopping you from using a different orientation as the result of the modelview transformation, and then include a rotation in the projection matrix to transform the whole thing into the correct clip coordinate space. But I don't see a benefit, and it looks like it would just add unnecessary confusion.
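For illustration, here is a minimal C sketch of a view matrix built from right/up/back vectors so that the result already has the orientation the fixed function block expects (x to the right, y up, camera looking down -z). It assumes the three vectors are orthonormal and given in world space; Vec3, view_from_rub and transform_point are names invented for this example.

```c
#include <assert.h>

typedef struct { double x, y, z; } Vec3;

double dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Fills out[16] (column-major) with a world-to-view matrix for a camera at
   `eye` whose axes are `right`, `up` and `back` (assumed orthonormal). */
void view_from_rub(double out[16], Vec3 right, Vec3 up, Vec3 back, Vec3 eye)
{
    assert(out != 0);
    /* The rows of the rotation part are the camera axes; the last column
       moves the eye position to the origin. */
    out[0] = right.x; out[4] = right.y; out[8]  = right.z; out[12] = -dot3(right, eye);
    out[1] = up.x;    out[5] = up.y;    out[9]  = up.z;    out[13] = -dot3(up, eye);
    out[2] = back.x;  out[6] = back.y;  out[10] = back.z;  out[14] = -dot3(back, eye);
    out[3] = 0.0;     out[7] = 0.0;     out[11] = 0.0;     out[15] = 1.0;
}

/* Applies an affine matrix to a point (w is assumed to be 1). */
Vec3 transform_point(const double m[16], Vec3 p)
{
    Vec3 r;
    r.x = m[0] * p.x + m[4] * p.y + m[8]  * p.z + m[12];
    r.y = m[1] * p.x + m[5] * p.y + m[9]  * p.z + m[13];
    r.z = m[2] * p.x + m[6] * p.y + m[10] * p.z + m[14];
    return r;
}
```

For a camera at the origin looking along world +x (right = (0, 0, 1), up = (0, 1, 0), back = (-1, 0, 0)), the world point (5, 0, 0) comes out as (0, 0, -5) in view space, i.e. straight ahead on the negative z axis.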
I've been writing a basic 2D game engine in OpenGL/C++ and learning everything as I go along. I'm still rather confused about defining vertices and their "position". That is, I'm still trying to understand the vertex-to-pixels conversion mechanism of OpenGL. Can it be explained briefly, or can someone point to an article or something that'll explain this? Thanks!
This is rather basic knowledge that your favourite OpenGL learning resource should teach you as one of the first things. But anyway the standard OpenGL pipeline is as follows:
The vertex position is transformed from object-space (local to some object) into world-space (with respect to some global coordinate system). This transformation specifies where your object (to which the vertices belong) is located in the world.
Now the world-space position is transformed into camera/view-space. This transformation is determined by the position and orientation of the virtual camera by which you see the scene. In OpenGL these two transformations are actually combined into one, the modelview matrix, which directly transforms your vertices from object-space to view-space.
Next the projection transformation is applied. Whereas the modelview transformation should consist only of affine transformations (rotation, translation, scaling), the projection transformation can be a perspective one, which basically distorts the objects to realize a real perspective view (with farther away objects being smaller). But in your case of a 2D view it will probably be an orthographic projection, that does nothing more than a translation and scaling. This transformation is represented in OpenGL by the projection matrix.
After these 3 (or 2) transformations (and then following perspective division by the w component, which actually realizes the perspective distortion, if any) what you have are normalized device coordinates. This means after these transformations the coordinates of the visible objects should be in the range [-1,1]. Everything outside this range is clipped away.
In a final step the viewport transformation is applied and the coordinates are transformed from the [-1,1] range into the [0,w]x[0,h]x[0,1] box (assuming a glViewport(0, 0, w, h) call), which are the vertex's final positions in the framebuffer and therefore its pixel coordinates.
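The final viewport step is simple enough to write down directly. The following C sketch assumes glViewport(vx, vy, width, height) and the default glDepthRange(0, 1); WindowCoord and ndc_to_window are names made up for this example.

```c
#include <assert.h>

typedef struct { double x, y, z; } WindowCoord;

/* Maps normalized device coordinates in [-1,1] to window coordinates for a
   viewport set with glViewport(vx, vy, width, height). */
WindowCoord ndc_to_window(double ndc_x, double ndc_y, double ndc_z,
                          int vx, int vy, int width, int height)
{
    assert(width > 0 && height > 0);
    WindowCoord wc;
    wc.x = vx + (ndc_x + 1.0) * 0.5 * width;
    wc.y = vy + (ndc_y + 1.0) * 0.5 * height;
    wc.z = (ndc_z + 1.0) * 0.5;   /* depth range [0,1] */
    return wc;
}
```

For an 800x600 viewport at the origin, the NDC point (0, 0, 0) maps to (400, 300, 0.5), and the corners (-1, -1) and (1, 1) map to (0, 0) and (800, 600).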
When using a vertex shader, steps 1 to 3 are actually done in the shader and can therefore be done in any way you like, but usually one conforms to this standard modelview -> projection pipeline, too.
The main thing to keep in mind is, that after the modelview and projection transforms every vertex with coordinates outside the [-1,1] range will be clipped away. So the [-1,1]-box determines your visible scene after these two transformations.
So from your question I assume you want to use a 2D coordinate system with units of pixels for your vertex coordinates and transformations? In this case this is best done by using glOrtho(0.0, w, 0.0, h, -1.0, 1.0) with w and h being the dimensions of your viewport. This basically counters the viewport transformation and therefore transforms your vertices from the [0,w]x[0,h]x[-1,1]-box into the [-1,1]-box, which the viewport transformation then transforms back to the [0,w]x[0,h]x[0,1]-box.
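As a quick check of that claim, here is a C sketch of the matrix glOrtho builds, applied to a pixel position. The matrix entries follow the glOrtho documentation; the function names ortho_matrix and pixel_to_ndc are invented for this example, and pixel_to_ndc is a shortcut that is only valid for this kind of orthographic matrix (w stays 1, so no divide is needed).

```c
#include <assert.h>

/* Fills out[16] (column-major) with the matrix of glOrtho(l, r, b, t, n, f). */
void ortho_matrix(double out[16], double l, double r, double b, double t,
                  double n, double f)
{
    assert(r != l && t != b && f != n);
    for (int i = 0; i < 16; ++i)
        out[i] = 0.0;
    out[0]  = 2.0 / (r - l);
    out[5]  = 2.0 / (t - b);
    out[10] = -2.0 / (f - n);
    out[12] = -(r + l) / (r - l);
    out[13] = -(t + b) / (t - b);
    out[14] = -(f + n) / (f - n);
    out[15] = 1.0;
}

/* Transforms a pixel position by an orthographic matrix built above. */
void pixel_to_ndc(const double m[16], double px, double py,
                  double *ndc_x, double *ndc_y)
{
    *ndc_x = m[0] * px + m[12];
    *ndc_y = m[5] * py + m[13];
}
```

With ortho_matrix(m, 0, 800, 0, 600, -1, 1), pixel (0, 0) maps to NDC (-1, -1) and pixel (800, 600) maps to (1, 1). Swapping bottom and top, as in glOrtho(0, 800, 600, 0, ...), puts pixel (0, 0) at the top-left corner (-1, 1) instead.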
These have been quite general explanations without mentioning that the actual transformations are done by matrix-vector multiplications and without talking about homogeneous coordinates, but they should have explained the essentials. This documentation of gluProject might also give you some insight, as it actually models the transformation pipeline for a single vertex. But in this documentation they actually forgot to mention the division by the w component (v" = v' / v'(3)) after the v' = P x M x v step.
EDIT: Don't forget to look at the first link in epatel's answer, which explains the transformation pipeline a bit more practical and detailed.
It is called transformation.
Vertices are specified in 3D coordinates, which are transformed into viewport coordinates (into your window view). This transformation can be set up in various ways. An orthographic transformation is probably the easiest to understand as a starter.
http://www.songho.ca/opengl/gl_transform.html
http://www.opengl.org/wiki/Vertex_Transformation
http://www.falloutsoftware.com/tutorials/gl/gl5.htm
First, be aware that OpenGL does not use standard pixel coordinates. That is, for a particular resolution, e.g. 800x600, you don't have horizontal coordinates in the range 0-799 or 1-800 in steps of one. Instead you have coordinates ranging from -1 to 1, which are later sent to the graphics card's rasterizing unit and only then matched to the particular resolution.
I omitted one step here: before all that you have a ModelViewProjection matrix (or a ViewProjection matrix in some simple cases), which casts the coordinates you use onto a projection plane. The default use of that is to implement a camera which converts the 3D space of the world (View for placing the camera in the right position and Projection for casting 3D coordinates onto the screen plane; ModelViewProjection also includes the step of placing a model in the right place in the world).
Another case (and you can use the Projection matrix this way to achieve what you want) is to use these matrices to convert one coordinate range to another.
And that's the trick you will need. You should read about the ModelViewProjection matrix and the camera in OpenGL if you want to get serious. But for now I will tell you that with the proper matrix you can just map your own coordinate system (e.g. ranges 0-799 horizontally and 0-599 vertically) to the standardized -1:1 range. That way you will not notice that the underlying OpenGL API uses its own -1 to 1 system.
The easiest way to achieve this is the glOrtho function. Here's the link to the documentation:
http://www.opengl.org/sdk/docs/man/xhtml/glOrtho.xml
This is an example of proper usage:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(0, 800, 600, 0, 0, 1);
glMatrixMode(GL_MODELVIEW);
Now you can use your own modelview matrix, e.g. for translating (moving) objects, but don't touch the projection matrix. This code should be executed before any drawing commands. (In fact it can run right after initializing OpenGL if you won't use 3D graphics.)
And here's working example: http://nehe.gamedev.net/tutorial/2d_texture_font/18002/
Just draw your figures instead of drawing text. And there is another thing: glPushMatrix and glPopMatrix for the chosen matrix (in this example the projection matrix). You won't need those unless you are combining 3D with 2D rendering.
And you can still use the model matrix (e.g. for placing tiles somewhere in the world) and the view matrix (for example for zooming the view or scrolling through the world; in this case your world can be larger than the resolution and you can crop the view by simple translations).
After looking at my answer I see it's a little chaotic, but if you're confused, just read about the Model, View, and Projection matrices and try the example with glOrtho. If you're still confused, feel free to ask.
MSDN has a great explanation. It may be in terms of DirectX but OpenGL is more-or-less the same.
Google for "opengl rendering pipeline". The first five articles all provide good expositions.
The key transition from vertices to pixels (actually, fragments, but you won't be too far off if you think "pixels") is in the rasterization stage, which occurs after all vertices have been transformed from world-coordinates to screen coordinates and clipped.