Model matrix in 3D graphics / OpenGL

I'm following some tutorials to learn OpenGL (from www.opengl-tutorial.org, if it makes any difference), and there is an exercise that asks me to draw a cube and a triangle on the screen. As a hint, it says that I'm supposed to calculate two MVP matrices, one for each object. The MVP matrix is given by Projection*View*Model, and as far as I understand, the projection and view matrices are the same for all the objects on the screen (they are only affected by my choice of "camera" location and settings). However, the model matrix should change, since it's supposed to give me the position and rotation of the object in global coordinates. Following the tutorials, for my cube the model matrix is just the identity matrix, since it is located at the origin and there's no rotation or scaling. Then I draw my triangle so that its vertices are at (2,2,0), (2,3,0) and (3,2,0). Now my question is: what is the model matrix for my triangle?
My own reasoning says that if I don't want to rotate or scale it, the model matrix should be just a translation matrix. But what gives the translation coordinates here? Should it be the location of one of the vertices, or the center of the triangle, or something else? Or have I completely misunderstood what the model matrix is?

The model matrix is, like the other matrices (projection, view), a 4x4 matrix with the same layout. Depending on whether you're using column or row vectors, the matrix consists of the x, y, z axes of your local frame and a vector (t1, t2, t3) specifying the translation part.
For a column vector p, the transformation matrix M looks like
x1 y1 z1 t1
x2 y2 z2 t2
x3 y3 z3 t3
0  0  0  1
with the local axes as the columns: the x axis is (x1, x2, x3), the y axis (y1, y2, y3), the z axis (z1, z2, z3). The point transforms as p' = M * p.
For row vectors the layout is the transpose of the above, and the multiplication order is reversed: p' = p * M.
If you have no rotational component, your local frame has the usual x, y, z axes as the columns of the 3x3 submatrix of the model matrix:
1 0 0 t1
0 1 0 t2
0 0 1 t3
0 0 0 1
The fourth column specifies the translation vector (t1, t2, t3). If you have a point p = (1, 0, 0, 1) in a local coordinate system and you want to translate it by +1 in the z direction to place it in the world coordinate system, the model matrix is simply:
1 0 0 0
0 1 0 0
0 0 1 1
0 0 0 1
p' = M * p, where p' is the transformed point in world coordinates.
For your example above, you could simply specify the triangle with vertices (2,2,0), (2,3,0) and (3,2,0) directly in your local coordinate system; then the model matrix is trivially the identity. Otherwise you have to work out how to compute rotation etc. I recommend reading the first few chapters of Mathematics for 3D Game Programming and Computer Graphics. It's a very approachable 3D math book; there you should get the minimal background you need to handle most of the math in 3D graphics.
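To make this concrete, here is a minimal sketch using GLM, the math library the opengl-tutorial.org tutorials use (the camera values below are placeholders I picked, not from the tutorial):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Projection and view are shared by every object in the scene.
glm::mat4 Projection = glm::perspective(glm::radians(45.0f), 4.0f / 3.0f, 0.1f, 100.0f);
glm::mat4 View = glm::lookAt(glm::vec3(4, 3, 3),   // camera position
                             glm::vec3(0, 0, 0),   // look-at target
                             glm::vec3(0, 1, 0));  // up direction

// Cube: modeled at the origin, no rotation or scaling, so its model matrix is the identity.
glm::mat4 CubeModel = glm::mat4(1.0f);
glm::mat4 CubeMVP   = Projection * View * CubeModel;

// Triangle, option A: its vertex buffer already holds (2,2,0), (2,3,0), (3,2,0).
// Then its model matrix is the identity too.
glm::mat4 TriangleModel = glm::mat4(1.0f);

// Triangle, option B: its vertex buffer holds (0,0,0), (0,1,0), (1,0,0) around
// the local origin. Then the model matrix is a pure translation by (2,2,0):
// glm::mat4 TriangleModel = glm::translate(glm::mat4(1.0f), glm::vec3(2, 2, 0));

glm::mat4 TriangleMVP = Projection * View * TriangleModel;

In other words, the translation column of the model matrix holds wherever you want the object's local origin to land in world space; it is tied to a particular vertex or to the centroid only if you modeled the object around that point.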

Related

OpenGL: How to calculate world-space coordinates from frustum-aligned vectors?

I am a graphics programming beginner working on my own engine and tried to implement frustum-aligned volume rendering.
The idea was to render multiple planes as vertical slices across the view frustum and then use the world coordinates of those planes for procedural volumes.
Rendering the slices as a 3d model and using the vertex positions as worldspace coordinates works perfectly fine:
//Vertex Shader
gl_Position = P*V*vec4(vertexPosition_worldspace,1);
coordinates_worldspace = vertexPosition_worldspace;
Result: (screenshot omitted)
However, rendering the slices in frustum space and trying to reverse-engineer the world-space coordinates doesn't give the expected results. The closest I got was this:
//Vertex Shader
gl_Position = vec4(vertexPosition_worldspace,1);
coordinates_worldspace = (inverse(V) * inverse(P) * vec4(vertexPosition_worldspace,1)).xyz;
Result: (screenshot omitted)
My guess is that the standard projection matrix somehow gets rid of some crucial depth information, but other than that I have no clue what I am doing wrong or how to fix it.
Well, it is not 100% clear what you mean by "frustum space". I'm going to assume that it refers to normalized device coordinates in OpenGL, where the view frustum is (by default) the axis-aligned cube -1 <= x,y,z <= 1. I'm also going to assume a perspective projection, so that the NDC z coordinate is actually a hyperbolic function of eye-space z.
"My guess is that the standard projection matrix somehow gets rid of some crucial depth information, but other than that I have no clue what I am doing wrong or how to fix it."
No, a standard perspective matrix in OpenGL looks like
( sx 0 tx 0 )
( 0 sy ty 0 )
( 0 0 A B )
( 0 0 -1 0 )
When you multiply this by an (x, y, z, 1) eye-space vector, you get the homogeneous clip coordinates. Consider only the last two rows of the matrix as separate equations:
z_clip = A * z_eye + B
w_clip = -z_eye
Since we do the perspective divide by w_clip to get from clip space to NDC, we end up with
z_ndc = - A - B/z_eye
which is exactly the hyperbolically remapped depth information, so that information is completely preserved. (Note that we also do the division for x and y.)
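As a quick sanity check (the near/far values are mine, not from the question): for the standard glFrustum/gluPerspective-style matrix, A = -(f+n)/(f-n) and B = -2fn/(f-n). The snippet below verifies that the near plane z_eye = -n maps to z_ndc = -1 and the far plane z_eye = -f maps to z_ndc = +1:

#include <cstdio>

int main() {
    const double n = 1.0, f = 3.0;            // example near/far planes
    const double A = -(f + n) / (f - n);      // = -2
    const double B = -2.0 * f * n / (f - n);  // = -3
    for (double z_eye : { -1.0, -2.0, -3.0 }) {
        double z_ndc = -A - B / z_eye;        // the hyperbolic depth mapping
        std::printf("z_eye = %g  ->  z_ndc = %g\n", z_eye, z_ndc);
    }
    return 0;
}

This prints -1, 0.5 and 1: the mapping is invertible (nothing is lost), it is just nonlinear in z_eye.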
When you calculate inverse(P), you only invert the 4D -> 4D homogeneous mapping. But you will get a resulting w that is not 1 again, so here:
coordinates_worldspace = (inverse(V) * inverse(P) * vec4(vertexPosition_worldspace,1)).xyz;
^^^
lies your information loss. You just drop the resulting w and use the xyz components as if they were Cartesian 3D coordinates, but they are 4D homogeneous coordinates representing some 3D point.
The correct approach would be to divide by w:
vec4 coordinates_worldspace = inverse(V) * inverse(P) * vec4(vertexPosition_worldspace, 1);
coordinates_worldspace /= coordinates_worldspace.w;
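A CPU-side sketch of the same round trip, using GLM with matrices of my own choosing, shows exactly where the divide matters:

#include <cstdio>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main() {
    glm::mat4 P = glm::perspective(glm::radians(60.0f), 1.0f, 0.1f, 100.0f);
    glm::mat4 V = glm::lookAt(glm::vec3(0, 0, 5), glm::vec3(0), glm::vec3(0, 1, 0));

    glm::vec3 world(1.0f, 2.0f, -3.0f);               // arbitrary world-space point
    glm::vec4 clip = P * V * glm::vec4(world, 1.0f);  // forward transform
    glm::vec3 ndc  = glm::vec3(clip) / clip.w;        // what the rasterizer sees

    // Pretend the NDC point is all we have (the "frustum-space" vertex):
    glm::vec4 back = glm::inverse(V) * glm::inverse(P) * glm::vec4(ndc, 1.0f);
    back /= back.w;  // without this line, back.xyz is NOT the world position

    std::printf("%f %f %f\n", back.x, back.y, back.z);  // ~ 1 2 -3
    return 0;
}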

Drawing an Euler-angles rotation model on a 2D image

I'm currently attempting to draw a 3D representation of Euler angles within a 2D image (no OpenGL or 3D graphics windows). The output would be similar to the example image below (image omitted).
Essentially, I am looking for research or an algorithm which can take a rotation matrix or a set of Euler angles and render them onto a 2D image, like the one above. This will be implemented in a C++ application that uses OpenCV, and will be used to output annotation information in an OpenCV window based on the state of the object.
I think I'm overthinking this, because I should be able to decompose the unit vectors from a rotation matrix, extract their x, y components, and draw a line in Cartesian space from (0,0). Am I correct in this thinking?
EDIT: I'm looking for an Orthographic Projection. You can assume the image above has the correct camera/viewing angle.
Any help would be appreciated.
Thanks,
EDIT: The example source code can now be found in my repo.
Header: https://bitbucket.org/jluzwick/tennisspindetector/src/6261524425e8d80772a58fdda76921edb53b4d18/include/projection_matrix.h?at=master
Class Definitions: https://bitbucket.org/jluzwick/tennisspindetector/src/6261524425e8d80772a58fdda76921edb53b4d18/src/projection_matrix.cpp?at=master
It's not the best code, but it works and shows the steps necessary to get the projection matrix described in the accepted answer.
Also, here is a YouTube video of the projection matrix in action (along with scale and translation added): http://www.youtube.com/watch?v=mSgTFBFb_68
Here are my two cents. Hope it helps.
If I understand correctly, you want to rotate 3D system of coordinates and then project it orthogonally onto a given 2D plane (2D plane is defined with respect to the original, unrotated 3D system of coordinates).
"Rotating and projecting 3D system of coordinates" is "rotating three 3D basis vectors and projecting them orthogonally onto a 2D plane so they become 2D vectors with respect to 2D basis of the plane". Let the original 3D vectors be unprimed and the resulting 2D vectors be primed. Let {e1, e2, e3} = {e1..3} be 3D orthonormal basis (which is given), and {e1', e2'} = {e1..2'} be 2D orthonormal basis (which we have to define). Essentially, we need to find such operator PR that PR * v = v'.
While we could talk at length about linear algebra, operators and matrix representations, it would make for too long a post. It suffices to say that:
For both the 3D rotation and the 3D->2D projection operator there are real matrix representations (linear transformations; 2D is a subspace of 3D).
These are two transformations applied consecutively, i.e. PR * v = P * (R * v) = v', so we need to find a rotation matrix R and a projection matrix P. Clearly, after we rotate v using R, we can project the resulting vector vR using P.
You already have the rotation matrix R, so we take it as a given 3x3 matrix, and for simplicity we will talk about projecting the vector vR = R * v.
The projection matrix P is a 2x3 matrix whose i-th column is the projection of the i-th 3D basis vector ei onto the {e1..2'} basis.
Let's find the projection matrix P such that a 3D vector vR is linearly transformed into a 2D vector v' on a 2D plane with an orthonormal basis {e1..2'}.
A 2D plane can be conveniently defined by a vector normal to it. For example, from the figures in the OP, it seems that our 2D plane (the plane of the paper) has the normal unit vector n = 1/sqrt(3) * (1, 1, 1). We need to find a 2D basis in the 2D plane defined by this n. Since any two linearly independent vectors lying in our 2D plane would form such a basis, there are infinitely many of them. From the problem's geometry, and for the sake of simplicity, let's impose two additional conditions: first, the basis should be orthonormal; second, it should be visually appealing (although this is somewhat subjective). As can easily be seen, such a basis is formed trivially in the primed system by setting e1' = (1, 0)' = x'-axis (horizontal, positive direction from left to right) and e2' = (0, 1)' = y'-axis (vertical, positive direction from bottom to top).
Let's now find this {e1', e2'} 2D basis in {e1..3} 3D basis.
Let's denote e1' and e2', written in the original basis, as e1" and e2". Noting that in our case e1" has no e3-component (z-component), and using the fact that n dot e1" = 0, we get that e1' = (1, 0)' -> e1" = (-1/sqrt(2), 1/sqrt(2), 0) in the {e1..3} basis. Here, dot denotes the dot product.
Then e2" = n cross e1" = ( -1/sqrt(6), -1/sqrt(6), 2/sqrt(6) ). Here, cross denotes cross-product.
The 2x3 projection matrix P for the 2D plane defined by n = 1/sqrt(3) * ( 1, 1, 1 ) is then given by:
( -1/sqrt(2) 1/sqrt(2) 0 )
( -1/sqrt(6) -1/sqrt(6) 2/sqrt(6) )
where the first, second and third columns are the {e1..3} 3D basis vectors transformed into our 2D basis {e1..2'}; i.e., e1 = (1, 0, 0) from the 3D basis has coordinates (-1/sqrt(2), -1/sqrt(6)) in our 2D basis, and so on.
To verify the result we can check a few obvious cases:
n is orthogonal to our 2D plane, so there should be no projection. Indeed, P * n ~ P * (1, 1, 1) = 0.
e1, e2 and e3 should be transformed into their representations in {e1..2'}, namely the corresponding columns of the P matrix. Indeed, P * e1 = P * (1, 0, 0) = (-1/sqrt(2), -1/sqrt(6)), and so on.
To finalize the problem: we have now constructed a projection matrix P from 3D into 2D for an arbitrarily chosen 2D plane. We can now project any vector, previously rotated by the rotation matrix R, onto this plane, for example the rotated original basis {R * e1, R * e2, R * e3}. Moreover, we can multiply the given P and R to get a combined rotation-projection transformation matrix PR = P * R.
P.S. C++ implementation is left as a homework exercise ;).
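Since the P.S. invites it, here is one possible plain C++ sketch (all names are mine) that hard-codes the P derived above for n = 1/sqrt(3) * (1, 1, 1) and projects the rotated basis vectors:

#include <cmath>
#include <cstdio>

struct Vec3 { double x, y, z; };
struct Vec2 { double x, y; };

// Rows of the 2x3 projection matrix P, i.e. e1" and e2" from the text.
static const Vec3 E1 = { -1.0 / std::sqrt(2.0),  1.0 / std::sqrt(2.0), 0.0 };
static const Vec3 E2 = { -1.0 / std::sqrt(6.0), -1.0 / std::sqrt(6.0), 2.0 / std::sqrt(6.0) };

double dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// P * v: orthogonal projection of a 3D vector onto the drawing plane.
Vec2 project(const Vec3& v) { return { dot(E1, v), dot(E2, v) }; }

// R * v for a row-major 3x3 rotation matrix.
Vec3 rotate(const double R[3][3], const Vec3& v) {
    return { R[0][0]*v.x + R[0][1]*v.y + R[0][2]*v.z,
             R[1][0]*v.x + R[1][1]*v.y + R[1][2]*v.z,
             R[2][0]*v.x + R[2][1]*v.y + R[2][2]*v.z };
}

int main() {
    const double R[3][3] = { {1,0,0}, {0,1,0}, {0,0,1} };  // identity; substitute your own R
    const Vec3 basis[3]  = { {1,0,0}, {0,1,0}, {0,0,1} };
    for (const Vec3& e : basis) {
        Vec2 p = project(rotate(R, e));
        std::printf("axis endpoint in 2D: (%g, %g)\n", p.x, p.y);  // draw a line from (0,0) to p
    }
    return 0;
}

With the identity rotation this prints exactly the columns of P, matching the verification cases above.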
The rotation matrix will be easy to display.
A rotation matrix can be constructed from a normal, a binormal and a tangent.
You should be able to get them back out as follows:
Bi-Normal (y') : matrix[0][0], matrix[0][1], matrix[0][2]
Normal (z') : matrix[1][0], matrix[1][1], matrix[1][2]
Tangent (x') : matrix[2][0], matrix[2][1], matrix[2][2]
Using a perspective transform you can then add perspective: (x, y) = (x/z, y/z).
To achieve an orthographic projection similar to that shown, you will need to multiply by another fixed rotation matrix to move to the "camera" view (45° right and then up).
You can then multiply your end points x (1,0,0), y (0,1,0), z (0,0,1) and the center (0,0,0) by the final matrix and use only the x, y coordinates.
The center should always transform to (0,0,0).
You can then scale these values to draw to your 2D canvas.
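For an OpenCV-flavored sketch of that recipe (the function name and the fixed view rotation are mine; adjust to taste):

#include <cmath>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Draw the rotated x/y/z axes of R as 2D lines, orthographically projected.
void drawAxes(cv::Mat& img, const cv::Matx33d& R, double scale) {
    // Fixed "camera" rotation: 45 degrees about Y, then 45 degrees about X,
    // a placeholder for the "45° right and then up" view described above.
    const double c = std::cos(CV_PI / 4), s = std::sin(CV_PI / 4);
    cv::Matx33d rotY( c, 0, s,   0, 1, 0,   -s, 0, c);
    cv::Matx33d rotX( 1, 0, 0,   0, c, -s,   0, s, c);
    cv::Matx33d view = rotX * rotY * R;

    cv::Point center(img.cols / 2, img.rows / 2);
    const cv::Scalar colors[3] = { {0,0,255}, {0,255,0}, {255,0,0} };  // x=red, y=green, z=blue (BGR)
    for (int i = 0; i < 3; ++i) {
        // Column i of 'view' is the transformed i-th axis (use rows instead if
        // your convention stores the axes in rows, as in the listing above).
        // Orthographic projection: keep x and y, drop z; flip y for image coords.
        cv::Point tip(center.x + int(scale * view(0, i)),
                      center.y - int(scale * view(1, i)));
        cv::line(img, center, tip, colors[i], 2);
    }
}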

How to calculate extrinsic parameters of one camera relative to the second camera?

I have calibrated two cameras with respect to some world coordinate system. I know the rotation matrix and translation vector for each of them relative to the world frame. From these matrices, how do I calculate the rotation matrix and translation vector of one camera with respect to the other?
Any help or suggestion please. Thanks!
Here is an easier solution, since you already have the 3x3 rotation matrices R1 and R2, and the 3x1 translation vectors t1 and t2.
These express the motion from the world coordinate frame to each camera, i.e. they are the matrices such that, if p is a point expressed in the world coordinate frame, then the same point expressed in, say, the camera-1 frame is p1 = R1 * p + t1.
The motion from camera 1 to 2 is then simply the composition of (a) the motion FROM camera 1 TO the world frame, and (b) the motion FROM the world frame TO camera 2. You can easily compute this composition as follows:
Form the 4x4 roto-translation matrices Qw1 = [R1 t1] and Qw2 = [ R2 t2 ], both with the 4th row equal to [0 0 0 1]. These matrices completely express the roto-translation FROM the world coordinate frame TO camera 1 and 2 respectively.
The motion FROM camera 1 TO the world frame is simply Q1w = inv(Qw1). Here inv() is the algebraic inverse matrix, i.e. the one such that inv(X) * X = X * inv(X) = IdentityMatrix, for every nonsingular matrix X.
The roto-translation from camera 1 to 2 is then Q12 = Qw2 * Q1w (with column vectors, the transformation applied first stands on the right), and vice versa, the one from camera 2 to 1 is Q21 = Qw1 * Q2w = Qw1 * inv(Qw2).
Once you have Q12 you can extract from it the rotation and translation parts, if you so wish, respectively from its upper 3x3 submatrix and right 3x1 sub-column.
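In code, and exploiting the closed-form rigid-transform inverse inv([R|t]) = [R^T | -R^T t], a hedged sketch using OpenCV's small fixed-size matrices (function names are mine) could look like:

#include <opencv2/core.hpp>

// Pack R (3x3) and t (3x1) into a 4x4 roto-translation matrix.
cv::Matx44d makeQ(const cv::Matx33d& R, const cv::Vec3d& t) {
    return cv::Matx44d(R(0,0), R(0,1), R(0,2), t[0],
                       R(1,0), R(1,1), R(1,2), t[1],
                       R(2,0), R(2,1), R(2,2), t[2],
                       0,      0,      0,      1);
}

// Motion taking camera-1 coordinates to camera-2 coordinates.
cv::Matx44d relativePose(const cv::Matx33d& R1, const cv::Vec3d& t1,
                         const cv::Matx33d& R2, const cv::Vec3d& t2) {
    cv::Matx44d Qw1 = makeQ(R1, t1);  // world -> camera 1
    cv::Matx44d Qw2 = makeQ(R2, t2);  // world -> camera 2
    return Qw2 * Qw1.inv();           // camera 1 -> world -> camera 2
}

The upper-left 3x3 block of the result is the relative rotation and the right 3x1 column is the relative translation.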
First convert each rotation matrix into a rotation vector. Now you have two 3D vectors for each camera (a rotation vector and a translation vector); call them A1, A2 for one camera and B1, B2 for the other. You have all four of them with respect to some origin O. The rule you need is
A relative to B = (A relative to O) - (B relative to O)
Apply that rule to your two pairs of vectors and you will have their pose relative to one another. (Note that this plain subtraction is only an approximation: rotation vectors do not subtract exactly unless the rotations are small, and it ignores the coupling between rotation and translation. The matrix composition in the other answers is exact.)
Some documentation on converting from a rotation matrix to Euler angles can be found here, as well as in many other places. If you are using OpenCV you can just use cv::Rodrigues. Here is some MATLAB/Octave code I found.
Here is a very simple solution. Suppose your 1st camera has rotation matrix R1 and translation vector T1, and your 2nd camera has rotation matrix R2 and translation vector T2, both with respect to a common world reference.
The translation and rotation from the 1st to the 2nd camera can then be calculated with the following two lines of MATLAB code:
R=R2*R1';
T=T2-R*T1;
but note that this is true only if you have just one R and T per camera (i.e., rotations and translations for one unique world reference). If you have several reference points, you should calculate R, T for every single one. They will probably be very close to each other, but may differ slightly. You can then take the mean of the translation vectors and, for the rotation, convert all the found rotation matrices to rotation vectors, take their mean, and convert the result back to a rotation matrix.

How to position an object using forward, top and center coordinates in OpenGL

I have a scene and an object placed in some coordinates. I can transform the object using
glTranslate(center) and then glRotate...
But how do I rotate an object using not angles but rather the top and forward directions?
Thanks
What I'm looking for is the transformation between the model coordinate system and the global coordinate system.
Say you know the 3 axes of your object in object space. For simplicity we'll assume these are the Cartesian axes (if that's not the case, the process described below can be applied twice to take care of it):
ox = (1, 0, 0)
oy = (0, 1, 0)
oz = (0, 0, 1)
And say you have 3 other orthogonal and normalized axes in world space, indicating the top, forward and side directions of your object [*]:
wx = (wx.x, wx.y, wx.z)
wy = (wy.x, wy.y, wy.z)
wz = (wz.x, wz.y, wz.z)
Then the following (assuming column vectors) is a rotation matrix taking you from object space to world space:
[ wx.x wx.y wx.z ]
M = [ wy.x wy.y wy.z ]
[ wz.x wz.y wz.z ]
It's a rotation matrix because the determinant is 1 (orthogonal and normalized rows). To verify that it goes from world space to object space, just note how M * wx = (1, 0, 0), etc.
Now you want the exact opposite: from object space to world space. Just invert the matrix. In this case the inverse is the same as the transpose, so your final answer is:
objectToWorld = transpose(M)
Two things remain:
1) Loading this matrix in OpenGL. glMultMatrix will do this for you (be careful: glMultMatrix expects column-major storage and a 4x4 matrix):
double objectToWorld[16] = { wx.x, wx.y, wx.z, 0,   // first column: wx
                             wy.x, wy.y, wy.z, 0,   // second column: wy
                             wz.x, wz.y, wz.z, 0,   // third column: wz
                             0,    0,    0,    1 }; // each source line above is one column
glMultMatrixd( objectToWorld );
2) Translating. This is done with a call to glTranslate issued just before the glMultMatrixd call, so the object is first rotated about its local origin and then placed in the world.
[*] If you have only two of the three, say top and forward, you can easily compute the side axis with a cross product. If they're not normalized, simply normalize them. If they're not orthogonal, it all gets trickier. A sketch follows below.
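Putting the footnote into code: a hedged sketch in legacy OpenGL (helper names are mine; here I arbitrarily map side/top/forward to the object's local x/y/z axes):

#include <cmath>
#include <GL/gl.h>

struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
}
Vec3 normalize(const Vec3& v) {
    double len = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return { v.x/len, v.y/len, v.z/len };
}

// Position an object given its world-space top, forward and center.
void placeObject(Vec3 top, Vec3 forward, const Vec3& center) {
    top = normalize(top);
    forward = normalize(forward);
    Vec3 side = normalize(cross(top, forward));  // third axis from the other two

    // Column-major: each source line below is one column of the matrix.
    const double objectToWorld[16] = {
        side.x,    side.y,    side.z,    0,  // object x-axis in world space
        top.x,     top.y,     top.z,     0,  // object y-axis in world space
        forward.x, forward.y, forward.z, 0,  // object z-axis in world space
        0,         0,         0,         1
    };
    glTranslated(center.x, center.y, center.z);  // place at center...
    glMultMatrixd(objectToWorld);                // ...then orient about it
}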
Since OpenGL works only with matrices, there is no built-in concept of top, bottom and so on.
You'll have to find a correspondence between the rotation you want to give and the orientation needed. Since glRotatef(angle, x, y, z) takes an angle in degrees, you can use a two-dimensional array and store all 16 possibilities (from one of top, bottom, left, right to another one). Otherwise, you can count how many 90° steps are needed to get from the actual orientation to the new one and multiply that count by 90. For example:
from top to bottom = 2 steps = 180°
from right to top = 1 step backward = -90°

Why does sign matter in opengl projection matrix

I'm working on a computer vision problem which requires rendering a 3D model using a calibrated camera. I'm writing a function that decomposes the calibrated camera matrix into a modelview matrix and a projection matrix, but I've run into an interesting phenomenon in OpenGL that defies explanation (at least by me).
The short description is that negating the projection matrix results in nothing being rendered (at least in my experience). I would expect that multiplying the projection matrix by any scalar would have no effect, because it transforms homogeneous coordinates, which are unaffected by scaling.
Below is my reasoning why I find this to be unexpected; maybe someone can point out where my reasoning is flawed.
Imagine the following perspective projection matrix, which gives correct results:
[ a b c 0 ]
P = [ 0 d e 0 ]
[ 0 0 f g ]
[ 0 0 h 0 ]
Multiplying this by camera coordinates gives homogeneous clip coordinates:
[x_c] [ a b c 0 ] [X_e]
[y_c] = [ 0 d e 0 ] * [Y_e]
[z_c] [ 0 0 f g ] [Z_e]
[w_c] [ 0 0 h 0 ] [W_e]
Finally, to get normalized device coordinates, we divide x_c, y_c, and z_c by w_c:
[x_n] [x_c/w_c]
[y_n] = [y_c/w_c]
[z_n] [z_c/w_c]
Now, if we negate P, the resulting clip coordinates should be negated, but since they are homogeneous coordinates, multiplying by any scalar (e.g. -1) shouldn't have any effect on the resulting normalized device coordinates. However, in OpenGL, negating P results in nothing being rendered. I can multiply P by any positive scalar and get the exact same rendered results, but as soon as I multiply by a negative scalar, nothing renders. What is going on here?
Thanks!
Well, the gist of it is that clip testing is done through:
-w_c < x_c < w_c
-w_c < y_c < w_c
-w_c < z_c < w_c
Multiplying by a negative value breaks this test.
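A tiny numeric example (values mine) makes this concrete:

#include <cstdio>

// The (simplified) clip test in clip space.
bool insideFrustum(double x, double y, double z, double w) {
    return -w < x && x < w &&
           -w < y && y < w &&
           -w < z && z < w;
}

int main() {
    // A vertex comfortably inside the frustum is kept...
    std::printf("%d\n", insideFrustum( 0.1,  0.2,  0.5,  1.0));  // prints 1
    // ...but after negating P every sign flips, w becomes negative, and the
    // test demands -(-1) < x < -1, which nothing can satisfy.
    std::printf("%d\n", insideFrustum(-0.1, -0.2, -0.5, -1.0));  // prints 0
    return 0;
}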
I just found this tidbit, which makes progress toward an answer:
From Red book, appendix G:
Avoid using negative w vertex coordinates and negative q texture coordinates. OpenGL might not clip such coordinates correctly and might make interpolation errors when shading primitives defined by such coordinates.
Negating the projection matrix will result in a negative w clip coordinate, and apparently OpenGL doesn't like this. But can anyone explain WHY OpenGL doesn't handle this case?
reference: http://glprogramming.com/red/appendixg.html
Reasons I can think of:
By negating the projection matrix, the coordinates will no longer be within your zNear and zFar planes of the view frustum (which must necessarily be greater than 0).
To create window coordinates, the normalized device coordinates are translated/scaled by the viewport. So, if you've used a negative scalar for the clip coordinates, the (now inverted) normalized device coordinates map to window coordinates that are off your window (to the left and below, if you will).
Also, since you mentioned using a camera matrix and that you have negated the projection matrix, I have to ask... which parts of the camera matrix are you applying to which matrices? Modifying the projection matrix beyond near/far/fovy/aspect causes all sorts of problems in the depth buffer, including anything that uses z (depth testing, face culling, etc.).
The OpenGL FAQ section on transformations has some more details.