Problem multiplying vec3 by model matrix (scaling problem) - C++

In my 3D application I have an Object3D class which constructs its bounding box at creation; this bounding box is a simple cuboid with a min and a max point.
However, when I translate my Object3D, I want to update the bounding box using the model matrix of my object:
BoundingBox& TransformBy(const glm::mat4& modelMatrice){
    // Extract the position from my model matrix
    glm::vec3 pos(modelMatrice[3]);
    // Extract the scale from my model matrix
    glm::vec3 scale(glm::length(glm::vec3(modelMatrice[0])),
                    glm::length(glm::vec3(modelMatrice[1])),
                    glm::length(glm::vec3(modelMatrice[2])));
    // Create vec4 min/max from vec3
    glm::vec4 min_(min.x, min.y, min.z, 1.0f);
    glm::vec4 max_(max.x, max.y, max.z, 1.0f);
    // Invert the matrix and multiply my vec4s with it in order to rotate my min/max
    glm::mat4 inverse = glm::inverse(modelMatrice);
    min_ = min_ * inverse;
    max_ = max_ * inverse;
    // Scale min_ and max_
    min_ *= glm::vec4(scale, 1.0f);
    max_ *= glm::vec4(scale, 1.0f);
    // Add the model matrix translation to min_ and max_
    min_ += glm::vec4(pos, 0.0f);
    max_ += glm::vec4(pos, 0.0f);
    // Redefine max and min
    max = glm::vec3((min_.x > max_.x) ? min_.x : max_.x,
                    (min_.y > max_.y) ? min_.y : max_.y,
                    (min_.z > max_.z) ? min_.z : max_.z);
    min = glm::vec3((min_.x < max_.x) ? min_.x : max_.x,
                    (min_.y < max_.y) ? min_.y : max_.y,
                    (min_.z < max_.z) ? min_.z : max_.z);
    return *this;
}
Here is how I call my TransformBy function:
box.TransformBy(transform.GetModelMatrix());
For some reason it rotates my min and max points correctly, and it translates them too, but the scaling is not applied correctly, which makes my bounding box larger than my 3D object.
Why is my scaling not working properly?
Maybe there is a less complicated way of doing what I want?

If I understand correctly how the result of GetModelMatrix() is supposed to look, i.e. that it is just a normal 4x4 transformation matrix, then I think I can guess what the problem is. I find it easiest to visualize/explain with an example, so hopefully that's alright.
Suppose your result of GetModelMatrix() is the following, a simple transformation matrix with no rotational component, just a translation by [5 6 7].
| 1 0 0 5 |
| 0 1 0 6 |
| 0 0 1 7 |
| 0 0 0 1 |
The inverse of that, which is what min_ and max_ actually get multiplied by, is just:
| 1 0 0 -5 |
| 0 1 0 -6 |
| 0 0 1 -7 |
| 0 0 0 1 |
After calling glm::vec4 min_(min.x, min.y, min.z, 1.0f);, let us represent min_ by the vector [x y z 1]. Then min_ * inverse looks like:

                | 1 0 0 -5 |
                | 0 1 0 -6 |
    [x y z 1] * | 0 0 1 -7 | = [x y z (-5x-6y-7z+1)]
                | 0 0 0  1 |
That is, the x, y, and z components are not altered because it is the |0 0 0 1| row that gets applied to them in the multiplication, not the column containing the translation components.
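To see this concretely, here is a small standalone sketch (assuming GLM is available; the translation and the test vector are made up) showing that v * M leaves x, y, and z untouched, while M * v applies the translation:

#include <cstdio>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

int main()
{
    // A transformation matrix with no rotation, just a translation by [5 6 7].
    glm::mat4 M = glm::translate(glm::mat4(1.0f), glm::vec3(5.0f, 6.0f, 7.0f));
    glm::vec4 v(1.0f, 2.0f, 3.0f, 1.0f);

    glm::vec4 row = v * M; // row-vector convention: translation only reaches w
    glm::vec4 col = M * v; // column-vector convention: translation reaches x, y, z

    std::printf("v * M = (%g, %g, %g, %g)\n", row.x, row.y, row.z, row.w); // (1, 2, 3, 39)
    std::printf("M * v = (%g, %g, %g, %g)\n", col.x, col.y, col.z, col.w); // (6, 8, 10, 1)
}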
May I ask, do you have a particular reason for (a) inverting GetModelMatrix() and (b) multiplying the matrix by the vector as opposed to vice versa? I'd guess that perhaps your code should instead look as follows:
BoundingBox& TransformBy(const glm::mat4& modelMatrice){
    ...
    min_ = modelMatrice * min_;
    max_ = modelMatrice * max_;
    ...
    return *this;
}
Called with box.TransformBy(transform.GetModelMatrix());
Update:
In regards to your new code, I question the need to invert the transformation matrix for rotation, to split the scaling/translation/rotation into three separate steps, and so on. It feels like, unless there is an error somewhere else, the process should be as simple as taking your object's transformation matrix, multiplying min_ and max_ by it, and then doing the test for whether any min/max components swapped that you currently have at the end of the function:
BoundingBox& TransformBy(const glm::mat4& modelMatrice){
    glm::vec4 min_(min.x, min.y, min.z, 1.0f);
    glm::vec4 max_(max.x, max.y, max.z, 1.0f);
    min_ = modelMatrice * min_;
    max_ = modelMatrice * max_;
    max = glm::vec3((min_.x > max_.x) ? min_.x : max_.x,
                    (min_.y > max_.y) ? min_.y : max_.y,
                    (min_.z > max_.z) ? min_.z : max_.z);
    min = glm::vec3((min_.x < max_.x) ? min_.x : max_.x,
                    (min_.y < max_.y) ? min_.y : max_.y,
                    (min_.z < max_.z) ? min_.z : max_.z);
    return *this;
}
If the above function does not work, please let me know how it fails; in that case, I feel the problem may lie elsewhere in your code.
However, to answer your updated question as to why the scaling does not work: when you invert the transformation matrix, you also invert the scaling. Thus you are not just rotating when you multiply by inverse: you are also applying the inverted scale. While you could fix this by multiplying by the scale again, the far simpler solution seems to be getting the above function working.

Why does graphics pipeline need mapping to clip coordinates and normalized device coordinates?

On perspective projection, if I use a simple projection matrix like:

    | 1 0 0      0 |
    | 0 1 0      0 |
    | 0 0 1      0 |
    | 0 0 1/near 0 |

which just projects onto the image plane, then it is easy to get view-space coordinates back by discarding and normalizing, I think.
With an orthographic projection, the projection matrix is not even needed.
But the OpenGL graphics pipeline still performs the above mapping, even though the perspective projection causes depth precision errors.
Why does it need mapping to clip coordinates and normalized device coordinates?
Added
If I use the above projection matrix,

        | 1 0 0   0 |
    p = | 0 1 0   0 |
        | 0 0 1   0 |
        | 0 0 1/n 0 |

    v_eye  = (x, y, z, 1)
    v_clip = p * v_eye = (x, y, z, z/n)
    v_ndc  = v_clip / v_clip.w = (nx/z, ny/z, n, 1)
Then v_ndc can be clipped by discarding values beyond top, bottom, left, and right.
Values beyond far can also be clipped in the same way before multiplying by the projection matrix.
Well, it may look silly, but I think it's simpler than before.
ps. I noticed that the depth buffer can't be written this way. Then, can't it be written before the projection?
Sorry for the silly question and gibberish...
In case of orthographic projections, you are right: the perspective divide is not required, but it does not introduce any error, since it is a division by 1. (An orthographic projection matrix always contains [0, 0, 0, 1] in the last row.)
For perspective projection, this is a bit more complex:
Let's look at the simplest perspective projection:

        | 1 0 0 0 |
    P = | 0 1 0 0 |
        | 0 0 1 0 |
        | 0 0 1 0 |
Then a vector v = [x, y, z, 1] (in view space) gets projected to

    v_p = P * v = [x, y, z, z],

which is in projective space.
Now the perspective divide is needed to get the perspective effect (objects closer to the viewer look larger):

    v_ndc = v_p / v_p.w = [x/z, y/z, 1, 1]
I don't see how this could be achieved without the perspective divide.
Why does it need mapping to clip coordinates and normalized device coordinates?
The space where the programmer leaves the vertices to the GL to be taken care of is the clip space. It's the 4D homogeneous space where the vertices exist before normalization / perspective division. This division, useful to perform perspective projection, is the mapping needed to transform the vertices from clip space to NDC (3D). Why? Similar triangles.
  Y ^                                  View-space point
    |                                         *
    |                                     /-- |
    |                                 /--     |
    |           Proj.             /--         |
    |           plane          /--            |
    |             |         /--               | y
    |             |      /--                  |
    |             *   /--                     |
    |          /--|                           |
    |       /--   | y'                        |
    |    /--      |                           |
 <--+-------------+---------------------------+------
  Z O             |                           |
    |------d------|                           |
    |-------------------z--------------------|
Perspective projection is where rays from the eye/origin cut through a projection plane, hitting the points present in the space. The point where a ray intersects the plane is the projection of the point hit. Let's say we want to project point P onto the projection plane, where all points have z = d. The projected location of P, i.e. P', needs to be found. We know that z' will be d (since the projection plane lies there). To find y', we know
    y / z = y' / z'    (similar triangles)
    y / z = y' / d     (z' = d by defn. of proj. plane)
    y' = (d * y) / z
This division by z is called the perspective division. It shows that in perspective projection objects farther away, with larger z, appear smaller, and objects closer, with smaller z, appear larger.
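A tiny numeric sketch of that relation (plain C++; the plane distance d and the sample points are made up):

#include <cstdio>

struct Vec3 { float x, y, z; };

// Perspective-project a view-space point onto the plane z = d
// using the similar-triangles relation y' = d * y / z (same for x).
Vec3 projectToPlane(const Vec3& p, float d)
{
    return { d * p.x / p.z, d * p.y / p.z, d };
}

int main()
{
    Vec3 near_pt = { 2.0f, 4.0f,  5.0f };  // closer to the eye
    Vec3 far_pt  = { 2.0f, 4.0f, 20.0f };  // same x/y, farther away
    Vec3 a = projectToPlane(near_pt, 1.0f);
    Vec3 b = projectToPlane(far_pt, 1.0f);
    // The farther point projects closer to the center: the perspective effect.
    std::printf("near -> (%.3f, %.3f)\n", a.x, a.y); // (0.400, 0.800)
    std::printf("far  -> (%.3f, %.3f)\n", b.x, b.y); // (0.100, 0.200)
}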
Another thing that is convenient to perform in clip space is, obviously, clipping. In 4D, clipping is just checking whether the points lie within a range, as opposed to the costlier division.
In case of orthographic projection, the projection isn't a frustum but a cuboid — parallel rays come from infinity and not the origin. Hence for point P = (x, y, z), the Z values are just dropped, giving P' = (x, y). Thus the perspective division does nothing (divides by 1) in this case.

Surface normal on depth image

How can I estimate the surface normal at point I(i,j) of a depth image (pixel values in mm) without using the Point Cloud Library (PCL)? I've gone through (1), (2), and (3), but I'm looking for a simple estimation of the surface normal at each pixel using the C++ standard library or OpenCV.
You need to know the camera's intrinsic parameters, so that you can also know the distance between pixels in the same units (mm). This inter-pixel distance is only exact at a certain distance from the camera (i.e. at the depth of the center pixel).
If the camera matrix is K, which is typically something like:

        | f 0 cx |
    K = | 0 f cy |
        | 0 0 1  |
Then, taking pixel coordinates (x, y), a ray from the camera origin through the pixel (in camera world-coordinate space) is defined by:

                 | x |
    P = inv(K) * | y |
                 | 1 |
Depending on whether the distance in your image is a projection onto the Z axis or just a Euclidean distance from the center, you need to either normalize the vector P so that its magnitude is the distance to the pixel you want, or make sure the z component of P equals this distance. For pixels around the center of the frame the two should be nearly identical.
If you do the same operation for nearby pixels (say, left and right), you get Pl and Pr in units of mm.
Then just find the norm of (Pl - Pr), which is twice the distance between adjacent pixels in mm.
Then you calculate the gradients in X and Y:

    gx = (P(i+1,j) - P(i-1,j)) / (2 * pixel_size)
    gy = (P(i,j+1) - P(i,j-1)) / (2 * pixel_size)
Then take the two gradients as direction vectors:

    ax = atan(gx), ay = atan(gy)

         |  cos ax  0  sin ax |   | 1 |
    dx = |  0       1  0      | * | 0 |
         | -sin ax  0  cos ax |   | 0 |

         | 1  0        0      |   | 0 |
    dy = | 0  cos ay  -sin ay | * | 1 |
         | 0  sin ay   cos ay |   | 0 |

    N = cross(dx, dy);
You may need to check that the signs make sense by looking at a particular gradient and seeing whether dx and dy point in the expected directions; you may need to negate none, one, or both of the angles, and likewise the resulting N vector.
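Putting the steps together, here is a minimal OpenCV sketch under the assumptions above (not a drop-in implementation: the intrinsics fx, fy, cx, cy are assumed known, depth is assumed to be a Z-projection in mm, and it takes the cross product of the two tangent vectors directly rather than going through the rotation matrices; signs may still need flipping as noted):

#include <cmath>
#include <opencv2/core.hpp>

// Per-pixel surface normals from a depth image (CV_32F, mm along Z).
cv::Mat_<cv::Vec3f> estimateNormals(const cv::Mat_<float>& depth,
                                    float fx, float fy, float cx, float cy)
{
    cv::Mat_<cv::Vec3f> normals(depth.size(), cv::Vec3f(0, 0, 0));
    // Back-project pixel (row i, col j) to a 3D point in camera coordinates.
    auto backproject = [&](int i, int j) {
        float z = depth(i, j);
        return cv::Vec3f((j - cx) * z / fx, (i - cy) * z / fy, z);
    };
    for (int i = 1; i < depth.rows - 1; ++i)
    {
        for (int j = 1; j < depth.cols - 1; ++j)
        {
            cv::Vec3f dx = backproject(i, j + 1) - backproject(i, j - 1);
            cv::Vec3f dy = backproject(i + 1, j) - backproject(i - 1, j);
            cv::Vec3f n = dx.cross(dy);        // may need a sign flip
            float len = std::sqrt(n.dot(n));
            if (len > 1e-6f)
                normals(i, j) = n * (1.0f / len);
        }
    }
    return normals;
}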

Explanation of the Perspective Projection Matrix (Second row)

I'm trying to figure out how the perspective projection matrix works.
According to this: https://www.opengl.org/sdk/docs/man2/xhtml/gluPerspective.xml
f = cotangent(fovy/2)
Logically I understand how it works (the x and y values move further away from the bounding box, or vice versa), but I need a mathematical explanation of why this works. Maybe because of the theorem of intersecting lines?
I found an explanation here: http://www.songho.ca/opengl/gl_projectionmatrix.html
But I don't understand the relevant part of it.
In my opinion, the explanation of the perspective projection matrix at songho.ca is the best one.
I'll try to retell the main idea without going into details. But first of all, let's clarify why the cotangent is used in the OpenGL docs.
What is cotangent? According to Wikipedia:
The cotangent of an angle is the ratio of the length of the adjacent side to the length of the opposite side.
Look at the picture below: near is the length of the adjacent side and top is the length of the opposite side.
The fov/2 is the angle we are interested in.
The angle fov is the angle between the top and bottom planes; accordingly, fov/2 is the angle between the top (or bottom) plane and the symmetry axis.
So, the [1,1] element of the projection matrix, which is defined as cotangent(fovy/2) in the OpenGL docs, is equivalent to the ratio near/top.
Let's have a look at the point A marked in the picture. Let's find the y' coordinate of the point A', which is the projection of point A onto the near plane.
Using the ratio of similar triangles, the following relation can be inferred:
y' / near = y / -z
Or:
y' = near * y / -z
The y coordinate in normalized device coordinates can be obtained by dividing by the value top (the range (-top, top) is mapped to the range (-1.0,1.0)), so:
y_ndc = (near / top) * y / -z
The coefficient near / top is a constant, but what about z? There is one very important detail about normalized device coordinates.
The output of the vertex shader is a four-component vector that is transformed to a three-component vector in the interpolator by dividing the first three components by the fourth component.
So, we can assign the value -z to the fourth component. That can be done by setting element [2,3] of the projection matrix to -1.
Similar reasoning can be done for the x coordinate.
We have found the following elements of the projection matrix:

    | near / right  0           0   0 |
    | 0             near / top  0   0 |
    | 0             0           ?   ? |
    | 0             0          -1   0 |
There are two elements that we haven't found yet; they are marked with '?'.
To make things clear, let's project an arbitrary point (x, y, z) to normalized device coordinates:
    | near / right  0           0   0 |   | x |   | near / right * x |
    | 0             near / top  0   0 | * | y | = | near / top * y   |
    | 0             0           ?   ? |   | z |   | ?                |
    | 0             0          -1   0 |   | 1 |   | -z               |
And finally, after dividing by the w component we will get:

    | -near / right * x / z |
    | -near / top * y / z   |
    | ?                     |
Note that the result matches the equation inferred earlier.
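As a quick sanity check of the derivation, here is a sketch (assuming GLM; the '?' entries are left at zero, since only x and y are inspected after the divide):

#include <cstdio>
#include <glm/glm.hpp>

int main()
{
    float zNear = 1.0f, top = 0.5f, right = 0.5f;

    // Fill in the elements derived above (GLM matrices are column-major,
    // so P[col][row]); element [2,3] puts -z into clip.w.
    glm::mat4 P(0.0f);
    P[0][0] = zNear / right;
    P[1][1] = zNear / top;
    P[2][3] = -1.0f;

    glm::vec4 v(0.25f, 0.25f, -2.0f, 1.0f);   // a view-space point
    glm::vec4 clip = P * v;
    glm::vec3 ndc = glm::vec3(clip) / clip.w; // the perspective divide

    // Expect x_ndc = (near / right) * x / -z = 2 * 0.25 / 2 = 0.25; same for y.
    std::printf("ndc = (%g, %g)\n", ndc.x, ndc.y);
}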
As for the third component, marked with '?': more complex reasoning is needed to find out how to calculate it. Refer to songho.ca for more information.
I hope that my explanations make things a bit more clear.

Generating Fractals with transformation matrices

I'm trying to generate fractals using five different transformations that I have implemented from skeleton code: translate, rotate, scale, non-uniform scale, and image. These transformations are all 3x3 matrices, for example:
// Rotation by theta about the point p, as a 3x3 homogeneous matrix.
Matrix rotate ( Pt p, float theta )
{
    Matrix rvalue;
    rvalue.data[0][0] = cos(theta);
    rvalue.data[0][1] = -sin(theta);
    rvalue.data[0][2] = p.x + p.y*sin(theta) - p.x*cos(theta);
    rvalue.data[1][0] = sin(theta);
    rvalue.data[1][1] = cos(theta);
    rvalue.data[1][2] = p.y - p.y*cos(theta) - p.x*sin(theta);
    rvalue.data[2][0] = 0;
    rvalue.data[2][1] = 0;
    rvalue.data[2][2] = 1;
    return rvalue;
}
where Matrix is defined as
class Matrix
{
    public:
        float data [ 3 ] [ 3 ];
        Matrix ( void )
        {
            int i, j;
            for ( i = 0; i < 3; i++ )
            {
                for ( j = 0; j < 3; j++ )
                {
                    data [ i ] [ j ] = 0;
                }
            }
        }
};
In a test file, there is the following code that is supposed to generate Sierpinski's Triangle:
vector<Matrix> iat;
iat.push_back ( scale ( Pt ( -.9, -.9 ), 0.5 ) );
iat.push_back ( scale ( Pt ( .9, -.9 ), 0.5 ) );
iat.push_back ( scale ( Pt ( 0, .56 ), 0.5 ) );
setIATTransformations ( iat );
where Pt is defined as:
class Pt
{
    public:
        float x, y;
        Pt ( float newX, float newY )
        {
            x = newX;
            y = newY;
        }
        Pt ( void )
        {
            x = y = 0;
        }
};
How should I implement setIATTransformations? Multiply the matrices until there is one transformation matrix and loop it a number of times to generate the fractal?
Do you want a Sierpinski triangle, or a fractal generator driven by an input script?
1. Triangle
This is easy enough: no rotations, translations, or anything of the sort are needed. Just create the points according to the Sierpinski rule: http://en.wikipedia.org/wiki/Sierpinski_triangle
All sides of the triangles are divided in half, so the new points are just the averages of the start and end points of each line; then fill the sub-triangles. If you want just a wire-frame, even the point list is not needed.
2. Generator
You did not provide any rules or commands for the control script. The only thing I see in your script is the input of the three vertices of the triangle, and that is all. I do not see any rule for triangle division, which parts are filled or not, or how many recursions are used. The only thing you mentioned was that you use five 3x3 transformation matrices for rotation, scale, and translation, but you did not specify when and why.
You will have to implement the chaos game. That is, you randomly select one of the transformations and apply it to the iteration point. Do this a number of times (30, 50, or 100) without painting the point, and after that mark all the points. The resulting point cloud will, in time, fill the fractal.
Your operation scale(Pt(a,b), s) should realize the operation

    (x', y') = s*(x, y) + (1-s)*(a, b)

that is, in matrix terms:

    | x' |   | s 0 (1-s)*a |   | x |
    | y' | = | 0 s (1-s)*b | * | y |
    | 1  |   | 0 0    1    |   | 1 |
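Here is a minimal sketch of that chaos game built on the question's Matrix and Pt classes; setIATTransformations simply stores the list, and plotPoint is a hypothetical stand-in for whatever drawing call you actually use:

#include <cstdlib>
#include <vector>

using std::vector;

void plotPoint(const Pt& p);   // hypothetical drawing call, defined elsewhere

static vector<Matrix> gTransforms;

void setIATTransformations(const vector<Matrix>& iat)
{
    gTransforms = iat;
}

// Apply a 3x3 homogeneous transform to a 2D point.
Pt apply(const Matrix& m, const Pt& p)
{
    Pt r;
    r.x = m.data[0][0]*p.x + m.data[0][1]*p.y + m.data[0][2];
    r.y = m.data[1][0]*p.x + m.data[1][1]*p.y + m.data[1][2];
    return r;
}

void chaosGame(int iterations)
{
    Pt p(0, 0);
    // Warm-up: let the point settle onto the attractor before plotting.
    for (int i = 0; i < 50; ++i)
        p = apply(gTransforms[rand() % gTransforms.size()], p);
    for (int i = 0; i < iterations; ++i)
    {
        p = apply(gTransforms[rand() % gTransforms.size()], p);
        plotPoint(p);
    }
}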

How to convert Euler angles to directional vector?

I have pitch, roll, and yaw angles. How would I convert these to a directional vector?
It'd be especially cool if you can show me a quaternion and/or matrix representation of this!
Unfortunately there are different conventions on how to define these things (and roll, pitch, yaw are not quite the same as Euler angles), so you'll have to be careful.
If we define pitch=0 as horizontal (z=0) and yaw as counter-clockwise from the x axis, then the direction vector will be
x = cos(yaw)*cos(pitch)
y = sin(yaw)*cos(pitch)
z = sin(pitch)
Note that I haven't used roll; this is a direction unit vector, it doesn't specify attitude. It's easy enough to write a rotation matrix that will carry things into the frame of the flying object (if you want to know, say, where the left wing-tip is pointing), but it's really a good idea to specify the conventions first. Can you tell us more about the problem?
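In code, that convention looks like this (a small sketch; the struct and function names are mine):

#include <cmath>

struct Vec3 { float x, y, z; };

// Direction unit vector from yaw and pitch (radians), using the convention
// above: pitch = 0 is horizontal (z = 0), yaw is counter-clockwise from +x.
Vec3 directionFromYawPitch(float yaw, float pitch)
{
    return { std::cos(yaw) * std::cos(pitch),
             std::sin(yaw) * std::cos(pitch),
             std::sin(pitch) };
}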
EDIT:
(I've been meaning to get back to this question for two and a half years.)
For the full rotation matrix, if we use the convention above and we want the vector to yaw first, then pitch, then roll, in order to get the final coordinates in the world coordinate frame we must apply the rotation matrices in the reverse order.
First roll:
| 1 0 0 |
| 0 cos(roll) -sin(roll) |
| 0 sin(roll) cos(roll) |
then pitch:
| cos(pitch) 0 -sin(pitch) |
| 0 1 0 |
| sin(pitch) 0 cos(pitch) |
then yaw:
| cos(yaw) -sin(yaw) 0 |
| sin(yaw) cos(yaw) 0 |
| 0 0 1 |
Combine them, and the total rotation matrix is:
| cos(yaw)cos(pitch) -cos(yaw)sin(pitch)sin(roll)-sin(yaw)cos(roll) -cos(yaw)sin(pitch)cos(roll)+sin(yaw)sin(roll)|
| sin(yaw)cos(pitch) -sin(yaw)sin(pitch)sin(roll)+cos(yaw)cos(roll) -sin(yaw)sin(pitch)cos(roll)-cos(yaw)sin(roll)|
| sin(pitch)         cos(pitch)sin(roll)                             cos(pitch)cos(roll)                           |
So for a unit vector that starts at the x axis, the final coordinates will be:
x = cos(yaw)cos(pitch)
y = sin(yaw)cos(pitch)
z = sin(pitch)
And for the unit vector that starts at the y axis (the left wing-tip), the final coordinates will be:
x = -cos(yaw)sin(pitch)sin(roll)-sin(yaw)cos(roll)
y = -sin(yaw)sin(pitch)sin(roll)+cos(yaw)cos(roll)
z = cos(pitch)sin(roll)
There are six different ways to convert three Euler angles into a matrix, depending on the order in which they are applied:
#include <math.h>

struct Matrix { float M[3][3]; };
struct EulerAngle { float X, Y, Z; };

// Euler order enum.
enum EEulerOrder
{
    ORDER_XYZ,
    ORDER_YZX,
    ORDER_ZXY,
    ORDER_ZYX,
    ORDER_YXZ,
    ORDER_XZY
};
Matrix EulerAnglesToMatrix(const EulerAngle &inEulerAngle, EEulerOrder EulerOrder)
{
    // Convert Euler angles, passed in as radians, into a rotation
    // matrix. The individual Euler angles are processed in the
    // order requested.
    Matrix Mx;
    const float Sx = sinf(inEulerAngle.X);
    const float Sy = sinf(inEulerAngle.Y);
    const float Sz = sinf(inEulerAngle.Z);
    const float Cx = cosf(inEulerAngle.X);
    const float Cy = cosf(inEulerAngle.Y);
    const float Cz = cosf(inEulerAngle.Z);
    switch (EulerOrder)
    {
        case ORDER_XYZ:
            Mx.M[0][0] = Cy*Cz;
            Mx.M[0][1] = -Cy*Sz;
            Mx.M[0][2] = Sy;
            Mx.M[1][0] = Cz*Sx*Sy + Cx*Sz;
            Mx.M[1][1] = Cx*Cz - Sx*Sy*Sz;
            Mx.M[1][2] = -Cy*Sx;
            Mx.M[2][0] = -Cx*Cz*Sy + Sx*Sz;
            Mx.M[2][1] = Cz*Sx + Cx*Sy*Sz;
            Mx.M[2][2] = Cx*Cy;
            break;
        case ORDER_YZX:
            Mx.M[0][0] = Cy*Cz;
            Mx.M[0][1] = Sx*Sy - Cx*Cy*Sz;
            Mx.M[0][2] = Cx*Sy + Cy*Sx*Sz;
            Mx.M[1][0] = Sz;
            Mx.M[1][1] = Cx*Cz;
            Mx.M[1][2] = -Cz*Sx;
            Mx.M[2][0] = -Cz*Sy;
            Mx.M[2][1] = Cy*Sx + Cx*Sy*Sz;
            Mx.M[2][2] = Cx*Cy - Sx*Sy*Sz;
            break;
        case ORDER_ZXY:
            Mx.M[0][0] = Cy*Cz - Sx*Sy*Sz;
            Mx.M[0][1] = -Cx*Sz;
            Mx.M[0][2] = Cz*Sy + Cy*Sx*Sz;
            Mx.M[1][0] = Cz*Sx*Sy + Cy*Sz;
            Mx.M[1][1] = Cx*Cz;
            Mx.M[1][2] = -Cy*Cz*Sx + Sy*Sz;
            Mx.M[2][0] = -Cx*Sy;
            Mx.M[2][1] = Sx;
            Mx.M[2][2] = Cx*Cy;
            break;
        case ORDER_ZYX:
            Mx.M[0][0] = Cy*Cz;
            Mx.M[0][1] = Cz*Sx*Sy - Cx*Sz;
            Mx.M[0][2] = Cx*Cz*Sy + Sx*Sz;
            Mx.M[1][0] = Cy*Sz;
            Mx.M[1][1] = Cx*Cz + Sx*Sy*Sz;
            Mx.M[1][2] = -Cz*Sx + Cx*Sy*Sz;
            Mx.M[2][0] = -Sy;
            Mx.M[2][1] = Cy*Sx;
            Mx.M[2][2] = Cx*Cy;
            break;
        case ORDER_YXZ:
            Mx.M[0][0] = Cy*Cz + Sx*Sy*Sz;
            Mx.M[0][1] = Cz*Sx*Sy - Cy*Sz;
            Mx.M[0][2] = Cx*Sy;
            Mx.M[1][0] = Cx*Sz;
            Mx.M[1][1] = Cx*Cz;
            Mx.M[1][2] = -Sx;
            Mx.M[2][0] = -Cz*Sy + Cy*Sx*Sz;
            Mx.M[2][1] = Cy*Cz*Sx + Sy*Sz;
            Mx.M[2][2] = Cx*Cy;
            break;
        case ORDER_XZY:
            Mx.M[0][0] = Cy*Cz;
            Mx.M[0][1] = -Sz;
            Mx.M[0][2] = Cz*Sy;
            Mx.M[1][0] = Sx*Sy + Cx*Cy*Sz;
            Mx.M[1][1] = Cx*Cz;
            Mx.M[1][2] = -Cy*Sx + Cx*Sy*Sz;
            Mx.M[2][0] = -Cx*Sy + Cy*Sx*Sz;
            Mx.M[2][1] = Cz*Sx;
            Mx.M[2][2] = Cx*Cy + Sx*Sy*Sz;
            break;
    }
    return Mx;
}
FWIW, some CPUs can compute sin & cos simultaneously (for example fsincos on x86). If you do this, you can make it a bit faster with three calls rather than six to compute the initial sin & cos values.
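For example, with glibc's sincosf (a GNU extension; availability varies by platform and compiler):

#define _GNU_SOURCE   // exposes sincosf in glibc's <math.h>
#include <math.h>

// Three calls instead of six to fill in the initial sin/cos values.
void sinCosAll(const float angles[3], float s[3], float c[3])
{
    for (int i = 0; i < 3; ++i)
        sincosf(angles[i], &s[i], &c[i]);
}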
Update: There are actually 12 ways depending if you want right-handed or left-handed results -- you can change the "handedness" by negating the angles.
Beta's answer saved my day. However, I'm using a slightly different reference coordinate system, and my definition of pitch is up/down (nodding your head in agreement), where a positive pitch results in a negative y-component. My reference vector is OpenGL style (down the -z axis), so with yaw=0, pitch=0 the resulting unit vector should equal (0, 0, -1).
If anyone comes across this post and has difficulties translating Beta's formulas to this particular system, the equations I use are:
vDir->X = sin(yaw);
vDir->Y = -(sin(pitch)*cos(yaw));
vDir->Z = -(cos(pitch)*cos(yaw));
Note the sign changes and the yaw <-> pitch swap. Hope this will save someone some time.
You need to be clear about your definitions here - in particular, what is the vector you want? If it's the direction an aircraft is pointing, the roll doesn't even affect it, and you're just using spherical coordinates (probably with axes/angles permuted).
If on the other hand you want to take a given vector and transform it by these angles, you're looking for a rotation matrix. The wiki article on rotation matrices contains a formula for a yaw-pitch-roll rotation, based on the xyz rotation matrices. I'm not going to attempt to enter it here, given the greek letters and matrices involved.
If someone stumbles upon this looking for an implementation in FreeCAD:
import FreeCAD, FreeCADGui
from FreeCAD import Vector
from math import sin, cos, pi
cr = FreeCADGui.ActiveDocument.ActiveView.getCameraOrientation().toEuler()
crx = cr[2] # Roll
cry = cr[1] # Pitch
crz = cr[0] # Yaw
crx = crx * pi / 180.0
cry = cry * pi / 180.0
crz = crz * pi / 180.0
x = sin(crz)
y = -(sin(crx) * cos(crz))
z = cos(crx) * cos(cry)
view = Vector(x, y, z)