Glut Mouse Coordinates - opengl

I am trying to obtain the Cartesian coordinates of a point. For that I am using a simple function that returns window coordinates, something like (100, 200). I want to convert this to Cartesian coordinates; the problem is that the window size is variable, so I really don't know how to implement this.
Edit:
I tried to use gluUnProject by doing something like this:
GLdouble modelMatrix[16];
glGetDoublev(GL_MODELVIEW_MATRIX, modelMatrix);
GLdouble projMatrix[16];
glGetDoublev(GL_PROJECTION_MATRIX, projMatrix);
GLint viewport[4];
glGetIntegerv(GL_VIEWPORT, viewport);
GLdouble position[3];
gluUnProject(
    x,
    y,
    1,                // window-space depth; 1 corresponds to the far plane
    modelMatrix,
    projMatrix,
    viewport,
    &position[0],     // receives the unprojected x
    &position[1],     // receives the unprojected y
    &position[2]      // receives the unprojected z
);
cout << position[0];
However, the position I got seems completely random.

Mouse coordinates are already in Cartesian coordinates, namely the coordinates of window space. I think what you're looking for is a transformation into the world space of your OpenGL scene. The 2D mouse coordinates in the window lack one piece of information though: the depth.
So what you can actually obtain is a ray from the "camera" into the scene. For this you need to perform the back projection from screen space to world space: take the projection matrix P and the view matrix V (what's generated by gluLookAt or similar), form the product P*V and invert it, i.e. (P*V)^-1 = V^-1 * P^-1. Take notice that inversion is not transposition. Inverting a matrix is done by Gauss-Jordan elimination, preferably with some pivoting; any math textbook on linear algebra explains it.
Then you need two vectors to form a ray. For this you take two screen space positions with the same XY coordinates but differing depth (say, 0 and 1) and do the backprojection:
(P*V)^-1 * (x, y, {0, 1})
The difference of the resulting vectors gives you a ray direction.
The whole backprojection process has been wrapped up into gluUnProject. However, I recommend not using it in its GLU form, but either looking at its source code to learn from it, or using a modern-day substitute like the one offered by GLM.
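For example, a minimal sketch of building such a ray with GLM's glm::unProject (the mouse and window variables, view and projection are placeholders for whatever your application uses):
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>   // glm::unProject

// Returns origin and direction of the ray under the mouse cursor.
// mouseX/mouseY are GLUT window coordinates (origin at the upper left).
void mouseRay(float mouseX, float mouseY,
              float windowWidth, float windowHeight,
              const glm::mat4 &view, const glm::mat4 &projection,
              glm::vec3 &rayOrigin, glm::vec3 &rayDirection)
{
    glm::vec4 viewport(0.0f, 0.0f, windowWidth, windowHeight);

    // GLM expects the window origin at the lower left, so flip y
    glm::vec3 winNear(mouseX, windowHeight - mouseY, 0.0f);
    glm::vec3 winFar (mouseX, windowHeight - mouseY, 1.0f);

    // Back-project the same pixel at two depths to get two world-space points
    glm::vec3 p0 = glm::unProject(winNear, view, projection, viewport);
    glm::vec3 p1 = glm::unProject(winFar,  view, projection, viewport);

    rayOrigin    = p0;
    rayDirection = glm::normalize(p1 - p0);
}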

Related

How to check if a point is inside a quad in perspective projection?

I want to test whether any given point in the world is on a quad/plane. The quad/plane can be translated/rotated/scaled by any values, but it should still be possible to detect whether the given point lies on it. I also need to get the location where the point would have been if no rotation/scale/translation had been applied to the quad.
For example, consider a quad at 0, 0, 0 with size 100x100, rotated at an angle of 45 degrees around the z axis. If my mouse location in the world is at (x, y, 0), I need to know if that point falls on that quad in its current transformation. If yes, then I need to know where that point would have been on the quad if no transformations had been applied. Any code sample would be of great help.
A ray-casting approach is probably simplest:
Use gluUnProject() to get the world-space direction of the ray to cast into the scene. The ray's origin is the camera position.
Put this ray into object space by transforming it by the inverse of your rectangle's transform. Note that you need to transform both the ray's origin point and direction vector.
Compute the intersection point between this ray and the XY plane with a standard ray-plane intersection test.
Check that the intersection point's x and y values are within your rectangle's bounds; if they are, then that's your desired result.
A math library such as GLM will be very helpful if you aren't confident about some of the math involved here, it has corresponding functions such as glm::unProject() as well as functions to invert matrices and do all the other transformations you'd need.
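A rough sketch of these steps with GLM might look like this (rectTransform, halfW/halfH and the ray parameters are illustrative names, not from the question; the ray itself comes from an unproject step like the one above):
#include <cmath>
#include <glm/glm.hpp>

// Returns true if the ray hits the quad; localHit receives the point in the
// quad's untransformed (object-space) coordinates.
bool pointOnQuad(const glm::vec3 &rayOrigin, const glm::vec3 &rayDir,
                 const glm::mat4 &rectTransform, float halfW, float halfH,
                 glm::vec2 &localHit)
{
    // 1. Move the ray into the quad's object space
    glm::mat4 inv = glm::inverse(rectTransform);
    glm::vec3 o = glm::vec3(inv * glm::vec4(rayOrigin, 1.0f)); // point: w = 1
    glm::vec3 d = glm::vec3(inv * glm::vec4(rayDir,    0.0f)); // direction: w = 0

    // 2. Intersect with the z = 0 plane (the quad's own XY plane)
    if (std::abs(d.z) < 1e-6f)
        return false;                 // ray parallel to the plane
    float t = -o.z / d.z;
    if (t < 0.0f)
        return false;                 // intersection behind the ray origin

    glm::vec3 hit = o + t * d;

    // 3. Bounds test in the quad's untransformed coordinates
    localHit = glm::vec2(hit.x, hit.y);
    return std::abs(hit.x) <= halfW && std::abs(hit.y) <= halfH;
}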

Using GLM's UnProject

I'm not sure how to use the Unproject method provided by GLM.
Specifically, in what format is the viewport passed in? And why doesn't the function require a view matrix as well as a projection and world matrix?
A bit of history is required here. GLM's unProject is actually a more or less direct replacement for the gluUnProject function from the deprecated OpenGL fixed-function pipeline. In that mode the model and view matrices were combined into the "modelview" matrix. Apparently the GLM author dropped the 'view' part in the naming, which confuses things even more, but it comes down to passing something like view*model.
Now for the actual use:
win is a vector holding three components that have meaning in window coordinates: the 'x, y' coordinates in your viewport, and the 'z' coordinate that you usually retrieve by reading your depth buffer at (x, y).
The model, view and projection matrices should speak for themselves if you're even thinking of using this function, but a good (OpenGL-specific) refresher can be useful.
The viewport is defined as in glViewport, which means (x, y, w, h): x and y specify the lower-left corner of your viewport (usually 0, 0), followed by the width and height. Note that in many other systems x, y specify the upper-left corner; you then have to transform your y-coordinate, which is shown in the NeHe code I link to below.
When applied, you simply end up converting the provided window coordinates back to object coordinates, more or less the inverse of what your render code usually does.
A half-decent explanation on the original gluUnProject can be found as a NeHe article. But of course that is OpenGL-specific, while glm can be used in other contexts.
The viewport is passed in as four floats: the x and y window coordinates of the viewport, followed by its width and height. That's the same order as used e.g. by glGetFloatv(GL_VIEWPORT, ...). So in most cases, the first two values should be 0.
As KillianDS already pointed out, the model argument is in fact a modelview matrix; see the example use of unProject() in gtx_simd_mat4.cpp, function test_compute_gtx():
glm::mat4 E = glm::translate(D, glm::vec3(1.4f, 1.2f, 1.1f));
glm::mat4 F = glm::perspective(i, 1.5f, 0.1f, 1000.f);
glm::mat4 G = glm::inverse(F * E);
glm::vec3 H = glm::unProject(glm::vec3(i), G, F, E[3]);
As you can see, the matrix passed as the second argument basically is the product of a translation and a perspective transformation.
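For completeness, here is a hedged sketch of typical usage that reads the depth buffer under the mouse and passes view*model as the 'model' argument, as discussed above (everything besides the GL and GLM calls is a placeholder name):
#include <GL/gl.h>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Converts a mouse position back to object-space coordinates.
glm::vec3 windowToObject(float mouseX, float mouseY,
                         const glm::mat4 &model, const glm::mat4 &view,
                         const glm::mat4 &projection)
{
    GLint vp[4];
    glGetIntegerv(GL_VIEWPORT, vp);              // x, y, w, h

    float winX = mouseX;
    float winY = vp[3] - mouseY;                 // flip y if the mouse origin is top-left

    GLfloat winZ = 0.0f;
    glReadPixels((GLint)winX, (GLint)winY, 1, 1,
                 GL_DEPTH_COMPONENT, GL_FLOAT, &winZ);

    // Second argument is the combined modelview matrix
    return glm::unProject(glm::vec3(winX, winY, winZ),
                          view * model,
                          projection,
                          glm::vec4(vp[0], vp[1], vp[2], vp[3]));
}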

How to use OpenGL orthographic projection with the depth buffer?

I've rendered a 3D scene using glFrustum() perspective mode. I then have a 2D object that I place over the 3D scene to act as a label for a particular 3D object. I have calculated the 2D position of the 3D object using gluProject(), at which position I then place my 2D label object. The 2D label object is rendered using glOrtho() orthographic mode. This works perfectly, and the 2D label object hovers over the 3D object.
Now, what I want to do is to give the 2D object a z value so that it can be hidden behind other 3D objects in the scene using the depth buffer. I have given the 2D object a z value that I know should be hidden by the depth buffer, but when I render the object it is always visible.
So the question is, why is the 2D object still visible and not hidden?
I did read somewhere that orthographic and perspective projections store incompatible depth buffer values. Is this true, and if so how do I convert between them?
I don't want it to be transformed; instead I want it to appear as a flat 2D label that always faces the camera and remains the same size on the screen at all times. However, if it is hidden behind something, I want it to appear hidden.
First, you should have put that in your question; it explains far more about what you're trying to do than your question does.
To achieve this, what you need to do is find the z-coordinate in the orthographic projection that matches the z-coordinate in pre-projective space of where you want the label to appear.
When you used gluProject, you got three coordinates back. The Z coordinate is important. What you need to do is reverse-transform the Z coordinate based on the zNear and zFar values you give to glOrtho.
Pedantic note: gluProject doesn't transform the Z coordinate to window space. To do so, it would have to take the glDepthRange parameters. What it really does is assume a depth range of near = 0.0 and far = 1.0.
So our first step is to transform from window space Z to normalized device coordinate (NDC) space Z. We use this simple equation:
ndcZ = (2 * winZ) - 1
Simple enough. Now, we need to go to clip space. Which is a no-op, because with an orthographic projection, the W coordinate is assumed to be 1.0. And the division by W is the difference between clip space and NDC space.
clipZ = ndcZ
But we don't need clip space Z. We need pre-orthographic projection space Z (aka: camera space Z). And that requires the zNear and zFar parameters you gave to glOrtho. To get to camera space, we do this:
cameraZ = ((clipZ + (zFar + zNear)/(zFar - zNear)) * (zFar - zNear))/-2
And you're done. Use that Z position in your rendering. Oh, and make sure your modelview matrix doesn't include any transforms in the Z direction (unless you're using the modelview matrix to apply this Z position to the label, which is fine).
Based on Nicol's answer, you can simply set zNear to 0 (which generally makes sense for 2D elements acting as part of the GUI), and then you simply have:
cameraZ = -winZ*zFar
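Put together, a minimal sketch of the conversion (zNear and zFar are the values you passed to glOrtho, winZ is the third value returned by gluProject):
// Converts a gluProject window-space depth back to the camera-space Z
// expected under the glOrtho projection.
double orthoCameraZ(double winZ, double zNear, double zFar)
{
    double ndcZ  = 2.0 * winZ - 1.0;   // window space -> NDC
    double clipZ = ndcZ;               // w == 1 for an orthographic projection
    // NDC -> camera space, inverting the glOrtho z mapping
    return ((clipZ + (zFar + zNear) / (zFar - zNear)) * (zFar - zNear)) / -2.0;
}
// With zNear == 0 this reduces to -winZ * zFar, as noted above.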

OpenGL define vertex position in pixels

I've been writing a 2D basic game engine in OpenGL/C++ and learning everything as I go along. I'm still rather confused about defining vertices and their "position". That is, I'm still trying to understand the vertex-to-pixels conversion mechanism of OpenGL. Can it be explained briefly or can someone point to an article or something that'll explain this. Thanks!
This is rather basic knowledge that your favourite OpenGL learning resource should teach you as one of the first things. But anyway the standard OpenGL pipeline is as follows:
1. The vertex position is transformed from object-space (local to some object) into world-space (with respect to some global coordinate system). This transformation specifies where your object (to which the vertices belong) is located in the world.
2. Now the world-space position is transformed into camera/view-space. This transformation is determined by the position and orientation of the virtual camera through which you see the scene. In OpenGL these two transformations are actually combined into one, the modelview matrix, which directly transforms your vertices from object-space to view-space.
3. Next the projection transformation is applied. Whereas the modelview transformation should consist only of affine transformations (rotation, translation, scaling), the projection transformation can be a perspective one, which basically distorts the objects to realize a real perspective view (with farther away objects being smaller). In your case of a 2D view, though, it will probably be an orthographic projection that does nothing more than a translation and scaling. This transformation is represented in OpenGL by the projection matrix.
4. After these 3 (or 2) transformations (and the following perspective division by the w component, which actually realizes the perspective distortion, if any) what you have are normalized device coordinates. This means after these transformations the coordinates of the visible objects should be in the range [-1,1]. Everything outside this range is clipped away.
5. In a final step the viewport transformation is applied, and the coordinates are transformed from the [-1,1] range into the [0,w]x[0,h]x[0,1] cube (assuming a glViewport(0, 0, w, h) call), which is the vertex's final position in the framebuffer and therefore its pixel coordinates.
When using a vertex shader, steps 1 to 3 are actually done in the shader and can therefore be done in any way you like, but usually one conforms to this standard modelview -> projection pipeline, too.
The main thing to keep in mind is that after the modelview and projection transforms, every vertex with coordinates outside the [-1,1] range will be clipped away. So the [-1,1]-box determines your visible scene after these two transformations.
So from your question I assume you want to use a 2D coordinate system with units of pixels for your vertex coordinates and transformations? In this case this is best done by using glOrtho(0.0, w, 0.0, h, -1.0, 1.0) with w and h being the dimensions of your viewport. This basically counters the viewport transformation and therefore transforms your vertices from the [0,w]x[0,h]x[-1,1]-box into the [-1,1]-box, which the viewport transformation then transforms back to the [0,w]x[0,h]x[0,1]-box.
These have been quite general explanations without mentioning that the actual transformations are done by matrix-vector multiplications and without talking about homogeneous coordinates, but they should have explained the essentials. This documentation of gluProject might also give you some insight, as it actually models the transformation pipeline for a single vertex. But in this documentation they actually forgot to mention the division by the w component (v" = v' / v'(3)) after the v' = P x M x v step.
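If it helps, here is a rough sketch of that per-vertex pipeline written out with GLM types (modelview, projection and the viewport values stand in for whatever your application sets up, and the default glDepthRange(0, 1) is assumed):
#include <glm/glm.hpp>

// Transforms an object-space vertex all the way to window coordinates.
glm::vec3 vertexToWindow(const glm::vec3 &objectPos,
                         const glm::mat4 &modelview,
                         const glm::mat4 &projection,
                         float vpX, float vpY, float vpW, float vpH)
{
    glm::vec4 clip = projection * modelview * glm::vec4(objectPos, 1.0f); // steps 1-3
    glm::vec3 ndc  = glm::vec3(clip) / clip.w;                            // perspective division

    glm::vec3 win;                                                        // viewport transform
    win.x = vpX + (ndc.x * 0.5f + 0.5f) * vpW;
    win.y = vpY + (ndc.y * 0.5f + 0.5f) * vpH;
    win.z = ndc.z * 0.5f + 0.5f;          // assuming the default glDepthRange(0, 1)
    return win;
}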
EDIT: Don't forget to look at the first link in epatel's answer, which explains the transformation pipeline a bit more practical and detailed.
It is called transformation.
Vertices are given in 3D coordinates, which are transformed into viewport coordinates (into your window view). This transformation can be set up in various ways. An orthogonal transformation is the easiest to understand as a starter.
http://www.songho.ca/opengl/gl_transform.html
http://www.opengl.org/wiki/Vertex_Transformation
http://www.falloutsoftware.com/tutorials/gl/gl5.htm
First, be aware that OpenGL does not use standard pixel coordinates. For a particular resolution, e.g. 800x600, you don't have horizontal coordinates in the range 0-799 or 1-800 stepped by one. Instead you have coordinates ranging from -1 to 1, which are later sent to the graphics card's rasterizing unit and only then mapped to the particular resolution.
I omitted one step here: before all that you have a ModelViewProjection matrix (or a ViewProjection matrix in some simple cases) which projects the coordinates you use onto a projection plane. Its default use is to implement a camera: View places the camera in the right position, Projection projects 3D coordinates onto the screen plane, and in ModelViewProjection there is also the step of placing the model in the right place in the world.
Another case (and you can use the Projection matrix this way to achieve what you want) is to use these matrices to convert one range of resolutions to another.
And there's a trick you will need. You should read about the ModelViewProjection matrix and the camera in OpenGL if you want to get serious. But for now I will tell you that with the proper matrix you can simply map your own coordinate system (e.g. using the range 0-799 horizontally and 0-599 vertically) to the standardized -1:1 range. That way you will never notice that the underlying OpenGL API uses its own -1 to 1 system.
The easiest way to achieve this is glOrtho function. Here's the link to documentation:
http://www.opengl.org/sdk/docs/man/xhtml/glOrtho.xml
This is example of proper usage:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(0, 800, 600, 0, 0, 1);   // left, right, bottom, top, near, far
glMatrixMode(GL_MODELVIEW);
Now you can use your own modelview matrix, e.g. for translating (moving) objects, but don't touch your projection matrix. This code should be executed before any drawing commands. (In fact it can run right after initializing OpenGL, if you won't use 3D graphics.)
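For instance, with the projection above in place, a hypothetical draw call working directly in pixel coordinates could look like this:
// Minimal sketch: with glOrtho(0, 800, 600, 0, 0, 1) active, vertex
// coordinates are plain pixels with the origin in the upper left.
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glTranslatef(100.0f, 50.0f, 0.0f);   // move the quad to pixel (100, 50)
glBegin(GL_QUADS);
    glVertex2f( 0.0f,  0.0f);        // a 64x64 pixel quad
    glVertex2f(64.0f,  0.0f);
    glVertex2f(64.0f, 64.0f);
    glVertex2f( 0.0f, 64.0f);
glEnd();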
And here's working example: http://nehe.gamedev.net/tutorial/2d_texture_font/18002/
Just draw your figures instead of drawing text. And there is another thing: glPushMatrix and glPopMatrix for the chosen matrix (in this example the projection matrix); you won't need those until you combine 3D with 2D rendering.
And you can still use the model matrix (e.g. for placing tiles somewhere in the world) and the view matrix (for example for zooming the view, or scrolling through the world; in this case your world can be larger than the resolution and you can crop the view with simple translations).
After looking at my answer I see it's a little chaotic, but if you're confused, just read about the model, view and projection matrices and try the example with glOrtho. If you're still confused, feel free to ask.
MSDN has a great explanation. It may be in terms of DirectX but OpenGL is more-or-less the same.
Google for "opengl rendering pipeline". The first five articles all provide good expositions.
The key transition from vertices to pixels (actually, fragments, but you won't be too far off if you think "pixels") is in the rasterization stage, which occurs after all vertices have been transformed from world-coordinates to screen coordinates and clipped.

Opengl transform relative body coordinates to world coordinates

I have an OpenGL C++ project I'm working on, where I draw a player and a ball in his right hand. At first I want to make the ball follow the player's hand movement by setting the ball's position. After some time I need to make the ball travel using kinematics equations. I get the player's right-hand transform matrix using (I do this the moment I draw the player):
glGetFloatv(GL_MODELVIEW_MATRIX, ball->transformMatrix);
and then the moment I draw the ball I try to do:
glLoadMatrixf(ball->transformMatrix);
glTranslated(0, 0, 0);
glCallList(lists[BALL]);
This works perfectly and the ball is drawn in place.
The problem is when I need to move the ball:
x += velocityX*dt;
y += velocityY*dt + gravity*dt*dt;
z += velocityZ*dt;
velocityY += gravity*dt;
the movement is relative not to the player's transform but to the global one. I tried many ways to solve this but none worked.
So how can I properly calculate the x, y, z of the ball so that I can translate it after it has left the player's hand?
Edit:
I've managed to obtain the ball's coordinates by multiplying ball->transformMatrix by the vector {0, 0, 0, 1}, storing the result in a world vector, and after that calling (the moment I draw the ball):
glGetFloatv(GL_MODELVIEW_MATRIX, currentTransformMatrix);
then inverting currentTransformMatrix and multiplying the inverse by the world vector. The result is the ball's proper coordinates {x, y, z}.
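In code, the procedure described in this edit could look roughly like the following GLM-based sketch (the function and variable names are illustrative, only the GL and GLM calls are real API):
#include <GL/gl.h>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

// handTransform is the matrix stored when drawing the hand (ball->transformMatrix).
glm::vec3 ballLocalPosition(const GLfloat handTransform[16])
{
    // Hand origin in the hand's stored space
    glm::vec4 world = glm::make_mat4(handTransform) * glm::vec4(0.0f, 0.0f, 0.0f, 1.0f);

    GLfloat current[16];
    glGetFloatv(GL_MODELVIEW_MATRIX, current);         // modelview at ball draw time
    glm::vec4 local = glm::inverse(glm::make_mat4(current)) * world;

    return glm::vec3(local);                            // x, y, z to translate the ball by
}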
You need to transform back to the player's hand position; invert the transformation from hand to world. Since OpenGL is not a math library, you need to either implement the matrix math yourself or use some 3rd-party linear math library.
EDIT due to comment:
What you need to transform into the ball's local coordinate system are the velocity and acceleration. Both velocity and gravity are just direction vectors, so the transformation is independent of the position of the ball in the world (well, gravity may change, but that's no issue here). So you need the inverse transform from world to ball space, the part that is independent of position. This is the very same problem as transforming normals (just the other way round), so you need to take the transposed inverse of the inverted matrix (explained in detail here: http://www.lighthouse3d.com/tutorials/glsl-tutorial/the-normal-matrix/ ). However, you already have the inverse of that inverse; it's the ball->world matrix itself. So you just have to transpose the modelview matrix, multiply the velocity and gravity vectors by it, and apply them in the ball's local space.
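A small, hedged sketch of that last step with GLM (the helper name is illustrative; only the GL and GLM calls are real API):
#include <GL/gl.h>
#include <glm/glm.hpp>
#include <glm/gtc/type_ptr.hpp>

// Rotate a world-space direction vector (velocity, gravity) into the ball's
// local frame by multiplying with the transposed upper-left 3x3 of the
// current modelview matrix, as described above.
glm::vec3 worldDirToLocal(const glm::vec3 &worldDir)
{
    GLfloat mv[16];
    glGetFloatv(GL_MODELVIEW_MATRIX, mv);
    glm::mat3 rotation(glm::make_mat4(mv));          // drop the translation part
    return glm::transpose(rotation) * worldDir;      // transpose ~ inverse for pure rotations
}

// e.g. glm::vec3 localGravity = worldDirToLocal(glm::vec3(0.0f, gravity, 0.0f));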