Convert projection to orthogonal raycast - c++

I have a completely implemented, working engine in OpenGL that supports a projection camera with raycasting. Recently, I implemented an orthogonal camera type, and visually, it's working just fine. For reference, here's how I compute the orthographic matrix:
double l = -viewportSize.x / 2 * zoom_;
double r = -l;
double t = -viewportSize.y / 2 * zoom_;
double b = -t;
double n = getNear();
double f = getFar();
m = Matrix4x4(
2 / (r - l),
0,
0,
-(r + l) / (r - l),
0,
2 / (t - b),
0,
-(t + b) / (t - b),
0,
0,
-2 / (f - n),
-(f + n) / (f - n),
0,
0,
0,
1);
However, my issue now is that raycasting does not work with the orthogonal camera. The issue seems to be that the raycasting engine was coded with projection-type cameras in mind, therefore when using the orthographic matrix instead it stops functioning. For reference, here's a high-level description of how the raycasting is implemented:
Get the world-space origin vector
Get normalized screen coordinate from input screen coordinates
Build mouseVector = (normScreenCoords.x, normScreenCoords.y, 0 if "near" or 1 if "far"
Build view-projection matrix (get view and projection matrices from Camera and multiply them)
Multiply the mouseVector by the inverse of the view-projection matrix.
Get the world-space forward vector
Get mouse world coordinates (far) and subtract them from mouse world coordinates (near)
Send the world-space origin and world-space forward vectors to the raycasting engine, which handles the logic of comparing these vectors to all the visible objects in the scene efficiently by using bounding boxes.
How do I modify this algorithm to work with orthographic cameras?

Your steps are fine and should work as expected with an orthographic camera. There may be a problem with the way you are calculating the origin and direction.
1.) Get the origin vector. First calculate the mouse position in world-space units, ie float rayX = (mouseX - halfResolution) / viewport.width * (r - l) or similar. It should be offset so the center of the screen is (0, 0), and the extreme values the mouse can reach translate to the edges of the viewport l, r, t, b. Then start with the camera position in world space and add two vectors rayX * camera.local.right and rayY * camera.local.up, where right and up are unit vectors in the camera's local co-ordinate system.
2.) The world space forward vector is always the camera forward vector for any mouse position.
3.) This should work without modification as long as you have the correct vectors for 1 and 2.

Related

How to rotate model to follow path

I have a spaceship model that I want to move along a circular path. I want the nose of the ship to always point in the direction it is moving in.
Here is the code I have to move it in a circle right now:
glm::mat4 m = glm::mat4(1.0f);
//time
long value_ms = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::time_point_cast<std::chrono::milliseconds>(std::chrono::
high_resolution_clock::now())
.time_since_epoch())
.count();
//translate
m = glm::translate(m, translate);
m = glm::translate(m, glm::vec3(-50, 0, -20));
m = glm::scale(m, glm::vec3(0.025f, 0.025f, 0.025f));
m = glm::translate(m, glm::vec3(1800, 0, 3000));
float speed = .002;
float x = 100 * cos(value_ms * speed); // + 1800;
float y = 0;
float z = 100 * sin(value_ms * speed); // + 3000;
m = glm::translate(m, glm::vec3(x, y, z));
How would I move it so the nose always points ahead? I tried doing glm::rotate with the rotation axis set as x or y or z but I cannot get it to work properly.
First see Understanding 4x4 homogenous transform matrices as I am using terminology and stuff from there...
Its usual to use a transform matrix of object for its navigation purposes and not the other way around ... So you should have a transform matrix M for your space ship that represents its position and orientation in [GCS] (global coordinate system). On top of that is sometimes multiplied another matrix M0 that align your space ship mesh to the first matrix (you know some meshes are not centered around (0,0,0) nor axis aligned...)
Now when you are moving your object you just do local transformations on the M so moving forward is just translating M origin position by a multiple of forward axis basis vector. The same goes for sliding to sides (just use different basis vector) resulting in that the object is alway aligned to where it supposed to be (in respect to movement). The same goes for turns. So going in circle is just moving forward and turning at constant speeds per time iteration step (timer).
You are doing this backwards first you compute position and orientation and then you are trying to make operations resulting in matrix that would do the same... In such case is much much easier to construct the matrix M instead of creating transformations that will create it... So what you need is:
origin position
3 perpendicular (most likely unit) basis vectors
So the origin is your x,y,z position. 2 basis vectors can be obtained from the circle so forward is tangent (or position-last_position) and vector towards circle center cen be used as (right or left). The 3th vector can be obtained by cross product so let assume:
+X axis is right
+Y axis is up
+Z axis is forward
you got:
r=100.0
a=speed*t
pos = (r*cos(a),0.0,r*sin(a))
center = (0.0,0.0,0.0)
so:
Z = (cos(a-0.5*M_PI),0.0,sin(a-0.5*M_PI))
X = (cos(a),0.0,sin(a))-ceneter
Y = cross(X,Z)
O = pos
normalize:
X /= length(X)
Y /= length(Y)
Z /= length(Z)
So now just feed your X,Y,Z,O to your matrix (depending on the conventions you use like multiplication order, direct/inverse matrix, row-major or column-major matrices ...)
so for example like this:
double M[16]=
{
X[0],X[1],X[2],0.0,
Y[0],Y[1],Y[2],0.0,
Z[0],Z[1],Z[2],0.0,
O[0],O[1],O[2],1.0,
};
or:
double M[16]=
{
X[0],Y[0],Z[0],O[0],
X[1],Y[1],Z[1],O[1],
X[2],Y[2],Z[2],O[2],
0.0 ,0.0 ,0.0 ,1.0,
};
And that is all ... The matrix might be transposed, inverted etc based on the conventions you use. Sorry I do not use GLM but the syntax should be very siilar ... the matrix feeding might be even simpler if rows or columns are loadable by a vector ...

Get camera matrix from OpenGL

I render a 3D mesh model using OpenGL with perspective camera – gluPerspective(fov, aspect, near, far).
Then I use rendered image in a computer vision algorithm.
At some point that algorithm requires camera matrix K (along with several vertices on the model and their corresponding projections) in order to estimate camera position: rotation matrix R and translation vector t. I can estimate R and t by using any algorithm which solves Perspective-n-Point problem.
I construct K from the OpenGL projection matrix (see how here)
K = [fX, 0, pX | 0, fY, pY | 0, 0, 1]
If I want to project a model point 'by hand' I can compute:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1] / X_proj[3]
y_pixel = X_proj[2] / X_proj[3]
Anyway, I pass this camera matrix in a PnP algorithm and it works just fine.
But then I had to change perspective projection to orthographic one.
As far as I understand when using orthographic projection the camera matrix becomes:
K = [1, 0, 0 | 0, 1, 0 | 0, 0, 0]
So I changed gluPerspective to glOrtho. Following the same way I constructed K from OpenGL projection matrix, and it turned out that fX and fY are not ones but 0.0037371. Is this a scaled orthographic projection or what?
Moreover, in order to project model vertices 'by hand' I managed to do the following:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1] + width / 2
y_pixel = X_proj[2] + height / 2
Which is not what I expected (that plus width and hight divided by 2 seems strange...). I tried to pass this camera matrix to POSIT algorithm to estimate R and t, and it doesn't converge. :(
So here are my questions:
How to get orthographic camera matrix from OpenGL?
If the way I did it is correct then is it true orthographic? Why POSIT doesn't work?
Orthographic projection will not use the depth to scale down farther points. Though, it will scale the points to fit inside the NDC which means it will scale the values to fit inside the range [-1,1].
This matrix from Wikipedia shows what this means:
So, it is correct to have numbers other than 1.
For your way of computing by hand, I believe it's not scaling back to screen coordinates and that makes it wrong. As I said, the output of projection matrices will be in the range [-1,1], and if you want to get the pixel coordinates, I believe you should do something similar to this:
X_proj = K*(R*X_model + t)
x_pixel = X_proj[1]*width/2 + width / 2
y_pixel = X_proj[2]*height/2 + height / 2
Anyway, I think you'd be better if you used modern OpenGL with libraries like GLM. In this case, you have the exact projection matrices used at hand.

An inconsistency in my understanding of the GLM lookAt function

Firstly, if you would like an explanation of the GLM lookAt algorithm, please look at the answer provided on this question: https://stackoverflow.com/a/19740748/1525061
mat4x4 lookAt(vec3 const & eye, vec3 const & center, vec3 const & up)
{
vec3 f = normalize(center - eye);
vec3 u = normalize(up);
vec3 s = normalize(cross(f, u));
u = cross(s, f);
mat4x4 Result(1);
Result[0][0] = s.x;
Result[1][0] = s.y;
Result[2][0] = s.z;
Result[0][1] = u.x;
Result[1][1] = u.y;
Result[2][1] = u.z;
Result[0][2] =-f.x;
Result[1][2] =-f.y;
Result[2][2] =-f.z;
Result[3][0] =-dot(s, eye);
Result[3][1] =-dot(u, eye);
Result[3][2] = dot(f, eye);
return Result;
}
Now I'm going to tell you why I seem to be having a conceptual issue with this algorithm. There are two parts to this view matrix, the translation and the rotation. The translation does the correct inverse transformation, bringing the camera position to the origin, instead of the origin position to the camera. Similarly, you expect the rotation that the camera defines to be inversed before being put into this view matrix as well. I can't see that happening here, that's my issue.
Consider the forward vector, this is where your camera looks at. Consequently, this forward vector needs to be mapped to the -Z axis, which is the forward direction used by openGL. The way this view matrix is suppose to work is by creating an orthonormal basis in the columns of the view matrix, so when you multiply a vertex on the right hand side of this matrix, you are essentially just converting it's coordinates to that of different axes.
When I play the rotation that occurs as a result of this transformation in my mind, I see a rotation that is not the inverse rotation of the camera, like what's suppose to happen, rather I see the non-inverse. That is, instead of finding the camera forward being mapped to the -Z axis, I find the -Z axis being mapped to the camera forward.
If you don't understand what I mean, consider a 2D example of the same type of thing that is happening here. Let's say the forward vector is (sqr(2)/2 , sqr(2)/2), or sin/cos of 45 degrees, and let's also say a side vector for this 2D camera is sin/cos of -45 degrees. We want to map this forward vector to (0,1), the positive Y axis. The positive Y axis can be thought of as the analogy to the -Z axis in openGL space. Let's consider a vertex in the same direction as our forward vector, namely (1,1). By using the logic of GLM.lookAt, we should be able to map (1,1) to the Y axis by using a 2x2 matrix that consists of the forward vector in the first column and the side vector in the second column. This is an equivalent calculation of that calculation http://www.wolframalpha.com/input/?i=%28sqr%282%29%2F2+%2C+sqr%282%29%2F2%29++1+%2B+%28sqr%282%29%2F2%2C+-sqr%282%29%2F2+%29+1.
Note that you don't get your (1,1) vertex mapped the positive Y axis like you wanted, instead you have it mapped to the positive X axis. You might also consider what happened to a vertex that was on the positive Y axis if you applied this transformation. Sure enough, it is transformed to the forward vector.
Therefore it seems like something very fishy is going on with the GLM algorithm. However, I doubt this algorithm is incorrect since it is so popular. What am I missing?
Have a look at GLU source code in Mesa: http://cgit.freedesktop.org/mesa/glu/tree/src/libutil/project.c
First in the implementation of gluPerspective, notice the -1 is using the indices [2][3] and the -2 * zNear * zFar / (zFar - zNear) is using [3][2]. This implies that the indexing is [column][row].
Now in the implementation of gluLookAt, the first row is set to side, the next one to up and the final one to -forward. This gives you the rotation matrix which is post-multiplied by the translation that brings the eye to the origin.
GLM seems to be using the same [column][row] indexing (from the code). And the piece you just posted for lookAt is consistent with the more standard gluLookAt (including the translational part). So at least GLM and GLU agree.
Let's then derive the full construction step by step. Noting C the center position and E the eye position.
Move the whole scene to put the eye position at the origin, i.e. apply a translation of -E.
Rotate the scene to align the axes of the camera with the standard (x, y, z) axes.
2.1 Compute a positive orthonormal basis for the camera:
f = normalize(C - E) (pointing towards the center)
s = normalize(f x u) (pointing to the right side of the eye)
u = s x f (pointing up)
with this, (s, u, -f) is a positive orthonormal basis for the camera.
2.2 Find the rotation matrix R that aligns maps the (s, u, -f) axes to the standard ones (x, y, z). The inverse rotation matrix R^-1 does the opposite and aligns the standard axes to the camera ones, which by definition means that:
(sx ux -fx)
R^-1 = (sy uy -fy)
(sz uz -fz)
Since R^-1 = R^T, we have:
( sx sy sz)
R = ( ux uy uz)
(-fx -fy -fz)
Combine the translation with the rotation. A point M is mapped by the "look at" transform to R (M - E) = R M - R E = R M + t. So the final 4x4 transform matrix for "look at" is indeed:
( sx sy sz tx ) ( sx sy sz -s.E )
L = ( ux uy uz ty ) = ( ux uy uz -u.E )
(-fx -fy -fz tz ) (-fx -fy -fz f.E )
( 0 0 0 1 ) ( 0 0 0 1 )
So when you write:
That is, instead of finding the camera forward being mapped to the -Z
axis, I find the -Z axis being mapped to the camera forward.
it is very surprising, because by construction, the "look at" transform maps the camera forward axis to the -z axis. This "look at" transform should be thought as moving the whole scene to align the camera with the standard origin/axes, it's really what it does.
Using your 2D example:
By using the logic of GLM.lookAt, we should be able to map (1,1) to the Y
axis by using a 2x2 matrix that consists of the forward vector in the
first column and the side vector in the second column.
That's the opposite, following the construction I described, you need a 2x2 matrix with the forward and row vector as rows and not columns to map (1, 1) and the other vector to the y and x axes. To use the definition of the matrix coefficients, you need to have the images of the standard basis vectors by your transform. This gives directly the columns of the matrix. But since what you are looking for is the opposite (mapping your vectors to the standard basis vectors), you have to invert the transformation (transpose, since it's a rotation). And your reference vectors then become rows and not columns.
These guys might give some further insights to your fishy issue:
glm::lookAt vertical camera flips when z <= 0
The answer might be of interest to you?

3d coordinate from point and angles

I'm working on a simple OpenGL world- and so far I've got a bunch of cubes randomly placed about and it's pretty fun to go zooming about. However I'm ready to move on. I would like to drop blocks in front of my camera, but I'm having trouble with the 3d angles. I'm used to 2d stuff where to find an end point we simply do something along the lines of:
endy = y + (sin(theta)*power);
endx = x + (cos(theta)*power);
However when I add the third dimension I'm not sure what to do! It seems to me that the power of the second dimensional plane would be determined by the z axis's cos(theta)*power, but I'm not positive. If that is correct, it seems to me I'd do something like this:
endz = z + (sin(xtheta)*power);
power2 = cos(xtheta) * power;
endx = x + (cos(ytheta) * power2);
endy = y + (sin(ytheta) * power2);
(where x theta is the up/down theta and y = left/right theta)
Am I even close to the right track here? How do I find an end point given a current point and an two angles?
Working with euler angles doesn't work so well in 3D environments, there are several issues and corner cases in which they simply don't work. And you actually don't even have to use them.
What you should do, is exploit the fact, that transformation matrixes are nothing else, then coordinate system bases written down in a comprehensible form. So you have your modelview matrix MV. This consists of a model space transformation, followed by a view transformation (column major matrices multiply right to left):
MV = V * M
So what we want to know is, in which way the "camera" lies within the world. That is given to you by the inverse view matrix V^-1. You can of course invert the view matrix using Gauss Jordan method, but most of the time your view matrix will consist of a 3×3 rotation matrix with a translation vector column P added.
R P
0 1
Recall that
(M * N)^-1 = N^-1 * M^-1
and also
(M * N)^T = M^T * N^T
so it seems there is some kind of relationship between transposition and inversion. Not all transposed matrices are their inverse, but there are some, where the transpose of a matrix is its inverse. Namely it are the so called orthonormal matrices. Rotations are orthonormal. So
R^-1 = R^T
neat! This allows us to find the inverse of the view matrix by the following (I suggest you try to proof it as an exersice):
V = / R P \
\ 0 1 /
V^-1 = / R^T -P \
\ 0 1 /
So how does this help us to place a new object in the scene at a distance from the camera? Well, V is the transformation from world space into camera space, so V^-1 transforms from camera to world space. So given a point in camera space you can transform it back to world space. Say you wanted to place something at the center of the view in distance d. In camera space that would be the point (0, 0, -d, 1). Multiply that with V^-1:
V^-1 * (0, 0, -d, 1) = (R^T)_z * d - P
Which is exactly what you want. In your OpenGL program you somewhere have your view matrix V, probably not properly named yet, but anyway it is there. Say you use old OpenGL-1 and GLU's gluLookAt:
void display(void)
{
/* setup viewport, clear, set projection, etc. */
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
gluLookAt(...);
/* the modelview matrix now holds the View transform */
At this point we can extract the modelview matrix
GLfloat view[16];
glGetFloatv(GL_MODELVIEW_MATRIX, view);
Now view is in column major order. If we were to use it directly we could directly address the columns. But remember that transpose is inverse of a rotation, so we actually want the 3rd row vector. So let's assume you keep view around, so that in your event handler (outside display) you can do the following:
GLfloat z_row[3];
z_row[0] = view[2];
z_row[1] = view[6];
z_row[2] = view[10];
And we want the position
GLfloat * const p_column = &view[12];
Now we can calculate the new objects position at distance d:
GLfloat new_object_pos[3] = {
z_row[0]*d - p_column[0],
z_row[1]*d - p_column[1],
z_row[2]*d - p_column[2],
};
There you are. As you can see, nowhere you had to work with angles or trigonometry, it's just straight linear algebra.
Well I was close, after some testing, I found the correct formula for my implementation, it looks like this:
endy = cam.get_pos().y - (sin(toRad(180-cam.get_rot().x))*power1);
power2 = cos(toRad(180-cam.get_rot().x))*power1;
endx = cam.get_pos().x - (sin(toRad(180-cam.get_rot().y))*power2);
endz = cam.get_pos().z - (cos(toRad(180-cam.get_rot().y))*power2);
This takes my camera's position and rotational angles and get's the corresponding points. Works like a charm =]

c++ opengl converting model coordinates to world coordinates for collision detection

(This is all in ortho mode, origin is in the top left corner, x is positive to the right, y is positive down the y axis)
I have a rectangle in world space, which can have a rotation m_rotation (in degrees).
I can work with the rectangle fine, it rotates, scales, everything you could want it to do.
The part that I am getting really confused on is calculating the rectangles world coordinates from its local coordinates.
I've been trying to use the formula:
x' = x*cos(t) - y*sin(t)
y' = x*sin(t) + y*cos(t)
where (x, y) are the original points,
(x', y') are the rotated coordinates,
and t is the angle measured in radians
from the x-axis. The rotation is
counter-clockwise as written.
-credits duffymo
I tried implementing the formula like this:
//GLfloat Ax = getLocalVertices()[BOTTOM_LEFT].x * cosf(DEG_TO_RAD( m_orientation )) - getLocalVertices()[BOTTOM_LEFT].y * sinf(DEG_TO_RAD( m_orientation ));
//GLfloat Ay = getLocalVertices()[BOTTOM_LEFT].x * sinf(DEG_TO_RAD( m_orientation )) + getLocalVertices()[BOTTOM_LEFT].y * cosf(DEG_TO_RAD( m_orientation ));
//Vector3D BL = Vector3D(Ax,Ay,0);
I create a vector to the translated point, store it in the rectangles world_vertice member variable. That's fine. However, in my main draw loop, I draw a line from (0,0,0) to the vector BL, and it seems as if the line is going in a circle from the point on the rectangle (the rectangles bottom left corner) around the origin of the world coordinates.
Basically, as m_orientation gets bigger it draws a huge circle around the (0,0,0) world coordinate system origin. edit: when m_orientation = 360, it gets set back to 0.
I feel like I am doing this part wrong:
and t is the angle measured in radians
from the x-axis.
Possibly I am not supposed to use m_orientation (the rectangles rotation angle) in this formula?
Thanks!
edit: the reason I am doing this is for collision detection. I need to know where the coordinates of the rectangles (soon to be rigid bodies) lie in the world coordinate place for collision detection.
What you do is rotation [ special linear transformation] of a vector with angle Q on 2d.It keeps vector length and change its direction around the origin.
[linear transformation : additive L(m + n) = L(m) + L(n) where {m, n} € vector , homogeneous L(k.m) = k.L(m) where m € vector and k € scalar ] So:
You divide your vector into two pieces. Like m[1, 0] + n[0, 1] = your vector.
Then as you see in the image, rotation is made on these two pieces, after that your vector take
the form:
m[cosQ, sinQ] + n[-sinQ, cosQ] = [mcosQ - nsinQ, msinQ + ncosQ]
you can also look at Wiki Rotation
If you try to obtain eye coordinates corresponding to your object coordinates, you should multiply your object coordinates by model-view matrix in opengl.
For M => model view matrix and transpose of [x y z w] is your object coordinates you do:
M[x y z w]T = Eye Coordinate of [x y z w]T
This seems to be overcomplicating things somewhat: typically you would store an object's world position and orientation separately from its set of own local coordinates. Rotating the object is done in model space and therefore the position is unchanged. The world position of each coordinate is the same whether you do a rotation or not - add the world position to the local position to translate the local coordinates to world space.
Any rotation occurs around a specific origin, and the typical sin/cos formula presumes (0,0) is your origin. If the coordinate system in use doesn't currently have (0,0) as the origin, you must translate it to one that does, perform the rotation, then transform back. Usually model space is defined so that (0,0) is the origin for the model, making this step trivial.