I am implementing perspective from scratch for an academic project. I am using "Computer Graphics: Principles and Practice" by Foley, van Dam, Feiner and Hughes (second edition in C).
I followed the book and implemented all the matrix transformations needed to translate, rotate, shear, scale, project, transform from the perspective to the parallel canonical view volume, and clip. The book apparently uses a right-handed coordinate system, yet I ended up with primitives appearing in a left-handed coordinate system and I cannot explain why.
Here are the matrices that I used:
Translation:
1, 0, 0, dx
0, 1, 0, dy
0, 0, 1, dz
0, 0, 0, 1
Rotation (to align a coordinate system (rx, ry, rz) to XYZ):
rx1, rx2, rx3, 0
ry1, ry2, ry3, 0
rz1, rz2, rz3, 0
0 , 0 , 0 , 1
Scale:
sx, 0 , 0 , 0
0 , sy, 0 , 0
0 , 0 , sz, 0
0 , 0 , 0 , 1
Shear XY:
1, 0, shx, 0
0, 1, shy, 0
0, 0, 1 , 0
0, 0, 0 , 1
Projecting onto a plane at z = d, with PRP at origin, looking in the positive z direction:
1, 0, 0 , 0
0, 1, 0 , 0
0, 0, 1 , 0
0, 0, 1/d, 0
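(Applied to a homogeneous point (x, y, z, 1), this matrix gives (x, y, z, z/d); after the division by w = z/d the projected point is (x*d/z, y*d/z, d), which lies on the plane z = d.)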
Then, given VRP, VPN, PRP, VUP, f and b (and the direction of projection dop), I reduce the space to the canonical perspective view volume using P:
rz = VPN / |VPN|
rx = (VUP x rz) / |VUP x rz|
ry = rz x rx
P = ScaleUniform(-1 / (vrp1Z + b)) *
Scale(-2 * vrp1Z / deltaU, -2 * vrp1Z / deltaV, 1) *
Shear(-dopX / dopZ, -dopY / dopZ) *
T(PRP) *
R(rx, ry, rz) *
T(-VRP)
Where vrp1 is ShearXY * T(-PRP) * (0, 0, 0, 1), and deltaU and deltaV are the width and height of the viewing window. dop is computed as CW - PRP, where CW is the center of the viewing window.
Then Projection(d) * P gives me the projection matrix.
I projected simple lines representing the unit vectors along x, y and z, but what was drawn on the screen was clearly a left-handed coordinate system... I need to work in a right-handed coordinate system, so is there a way to find out where I went wrong?
Here is the code I used:
As you can see, the Z component of the scale matrix has the opposite sign: clipping wasn't working properly because some things ended up right-handed and others left-handed, but I couldn't discern what exactly, so I flipped the sign of the Z scale, since the negative sign isn't needed in a left-handed system.
Vector rz = vpn.toUnitVector();
Vector rx = vup.cross(rz).toUnitVector();
Vector ry = rz.cross(rx).toUnitVector();
Vector cw = viewWindow.getCenter();
Vector dop = cw - prp;
Matrix t1 = Matrix::traslation(-vrp[X], -vrp[Y], -vrp[Z]);
Matrix r = Matrix::rotation(rx, ry, rz);
Matrix t2 = Matrix::traslation(-prp[X], -prp[Y], -prp[Z]);
Matrix partial = t2 * r * t1;
Matrix shear = Matrix::shearXY(-dop[X] / dop[Z], -dop[Y] / dop[Z]);
Matrix inverseShear = Matrix::shearXY(dop[X] / dop[Z], dop[Y] / dop[Z]);
Vector vrp1 = shear * t2 * Vector(0, 0, 0, 1);
Matrix scale = Matrix::scale(
    2 * vrp1[Z] / ((viewWindow.xMax - viewWindow.xMin) * (vrp1[Z] + b)),
    2 * vrp1[Z] / ((viewWindow.yMax - viewWindow.yMin) * (vrp1[Z] + b)),
    1 / (vrp1[Z] + b)); // HERE <--- WAS NEGATIVE
Matrix inverseScale = Matrix::scale(
    ((viewWindow.xMax - viewWindow.xMin) * (vrp1[Z] + b)) / (2 * vrp1[Z]),
    ((viewWindow.yMax - viewWindow.yMin) * (vrp1[Z] + b)) / (2 * vrp1[Z]),
    (vrp1[Z] + b));
float zMin = -(vrp1[Z] + f) / (vrp1[Z] + b);
Matrix parallel = Perspective::toParallelCvv(zMin);
Matrix inverseParallel = Perspective::inverseToParallelCvv(zMin);
Matrix perspective = Perspective::copAtOrigin(-vrp1[Z]);
projection = perspective * shear * partial;
canonicalView = parallel * scale * shear * partial;
canonicalToProjection = perspective * inverseScale * inverseParallel;
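A quick sanity check I could add (just a sketch; it assumes Vector also has a dot() method, which may or may not exist in this code base) is to push the basis vectors through the affine part of the pipeline (everything before toParallelCvv, so w stays 1) and look at the sign of the triple product:
Matrix affine = scale * shear * partial;
Vector o = affine * Vector(0, 0, 0, 1);
Vector ex = affine * Vector(1, 0, 0, 1) - o;
Vector ey = affine * Vector(0, 1, 0, 1) - o;
Vector ez = affine * Vector(0, 0, 1, 1) - o;
// (ex cross ey) . ez > 0 means the transformed frame is still right-handed,
// < 0 means it has become left-handed somewhere along the way
bool rightHanded = ex.cross(ey).dot(ez) > 0;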
I am trying to build a 3D model from 2 images taken with the same camera, using OpenCV with C++. I followed this method. I am still not able to fix the mistake in the R and T computation.
Image 1: background removed to eliminate mismatches.
Image 2: translated only in the X direction with respect to Image 1, background removed to eliminate mismatches.
I found the intrinsic camera matrix (K) using the MATLAB toolbox. It came out to be:
K=
[3058.8 0 -500
0 3057.3 488
0 0 1]
All matching image keypoints (found using SIFT and brute-force matching, with mismatches eliminated) were aligned with respect to the center of the image as follows:
obj_points.push_back(Point2f(keypoints1[symMatches[i].queryIdx].pt.x - image1.cols / 2, -1 * (keypoints1[symMatches[i].queryIdx].pt.y - image1.rows / 2)));
scene_points.push_back(Point2f(keypoints2[symMatches[i].trainIdx].pt.x - image1.cols / 2, -1 * (keypoints2[symMatches[i].trainIdx].pt.y - image1.rows / 2)));
From the point correspondences, I found the Fundamental Matrix using RANSAC in OpenCV.
Fundamental Matrix:
[0 0 -0.0014
0 0 0.0028
0.00149 -0.00572 1 ]
Essential Matrix obtained using:
E = (camera_Intrinsic.t())*f*camera_Intrinsic;
E obtained:
[ 0.0094 36.290 1.507
-37.2245 -0.6073 14.71
-1.3578 -23.545 -0.442]
SVD of E:
E.convertTo(E, CV_32F);
Mat W = (Mat_<float>(3, 3) << 0, -1, 0, 1, 0, 0, 0, 0, 1);
Mat Z = (Mat_<float>(3, 3) << 0, 1, 0, -1, 0, 0, 0, 0, 0);
SVD decomp = SVD(E);
Mat U = decomp.u;
Mat Lambda = decomp.w;
Mat Vt = decomp.vt;
New Essential Matrix for epipolar constraint:
Mat diag = (Mat_<float>(3, 3) << 1, 0, 0, 0, 1, 0, 0, 0, 0);
Mat new_E = U*diag*Vt;
SVD new_decomp = SVD(new_E);
Mat new_U = new_decomp.u;
Mat new_Lambda = new_decomp.w;
Mat new_Vt = new_decomp.vt;
Rotation from SVD:
Mat R1 = new_U*W*new_Vt;
Mat R2 = new_U*W.t()*new_Vt;
Translation from SVD:
Mat T1 = (Mat_<float>(3, 1) << new_U.at<float>(0, 2), new_U.at<float>(1, 2), new_U.at<float>(2, 2));
Mat T2 = -1 * T1;
I was getting the R matrices to be:
R1:
[ -0.58 -0.042 0.813
-0.020 -0.9975 -0.066
0.81 -0.054 0.578]
R2:
[ 0.98 0.0002 0.81
-0.02 -0.99 -0.066
0.81 -0.054 0.57 ]
Translation Matrices:
T1:
[0.543
-0.030
0.838]
T2:
[-0.543
0.03
-0.83]
Please point out wherever there is a mistake.
These 4 combinations of R and T, used as P2 = [R|T] with P1 = [I|0], are all giving incorrect triangulated models.
Also, I think the T obtained is incorrect, as it was supposed to be only an x shift with no z shift.
When I tried with image1 = image2, I got T = [0, 0, 1]. What does Tz = 1 mean, given that there is no z shift (both images are the same)?
And should I be aligning my keypoint coordinates with the image center, or with the principal point obtained from calibration?
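For reference, here is a minimal sketch of what I understand to be the usual way to resolve the four-fold (R, T) ambiguity (not taken from my code above; it assumes using namespace cv as in the snippets, that everything has been converted to CV_32F, and that the keypoints are expressed in coordinates consistent with K): triangulate the correspondences with each candidate P2 and keep the candidate that puts the most points in front of both cameras.
Mat Kf;
camera_Intrinsic.convertTo(Kf, CV_32F);
Mat P1 = Kf * Mat::eye(3, 4, CV_32F);                          // P1 = K [I | 0]
Mat candR[2] = { R1, R2 };
Mat candT[2] = { T1, T2 };
int bestInFront = -1;
Mat bestP2;
for (int ri = 0; ri < 2; ++ri) {
    for (int ti = 0; ti < 2; ++ti) {
        Mat Rt, points4D;
        hconcat(candR[ri], candT[ti], Rt);                     // [R | T]
        Mat P2 = Kf * Rt;
        triangulatePoints(P1, P2, obj_points, scene_points, points4D);
        int inFront = 0;
        for (int c = 0; c < points4D.cols; ++c) {
            Mat X = points4D.col(c) / points4D.at<float>(3, c);  // dehomogenize
            Mat X2 = candR[ri] * X.rowRange(0, 3) + candT[ti];   // same point in camera 2
            if (X.at<float>(2) > 0 && X2.at<float>(2) > 0)
                ++inFront;                                       // in front of both cameras
        }
        if (inFront > bestInFront) { bestInFront = inFront; bestP2 = P2; }
    }
}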
I am given N points on a straight line, say (x1, y1), (x2, y2), ..., (xn, yn); these points represent a wire in 3D. I want this wire to bend into the shape of a circle or an ellipse, so these points should map to points on the circle or ellipse. Can you suggest a mapping technique that maps points on a straight line onto points on a circle or an ellipse?
Reduce the line points to scalar parametric coordinates 0 <= t <= 1.
Multiply the t coordinates by 2*pi (giving theta) and plug them into the parametric circle equation:
x = cos( theta )
y = sin( theta )
Example:
Given 4 points (0,0), (1,1), (5,5), and (10,10) convert to parametric coordinates like so:
length = | (10,10) - (0,0) | = sqrt( 10^2 + 10^2 ) = sqrt( 200 )
t0 = 0.0 = | (0,0) - (0,0) | / length = 0
t1 = 0.1 = | (1,1) - (0,0) | / length = sqrt( 2 ) / length
t2 = 0.5 = | (5,5) - (0,0) | / length = sqrt( 50 ) / length
t3 = 1.0 = | (10,10) - (0,0) | / length = sqrt( 200 ) / length
p0.x = cos( t0 * 2 * pi ) = 1
p0.y = sin( t0 * 2 * pi ) = 0
p1.x = cos( t1 * 2 * pi ) = 0.80901699437
p1.y = sin( t1 * 2 * pi ) = 0.58778525229
...
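In C++, the same mapping might look like the sketch below (the function and type names are made up for illustration); using different semi-axes a and b bends the wire onto an ellipse instead of a circle (a == b gives the circle):
#include <cmath>
#include <vector>

struct Point2 { double x, y; };

// Bend points lying on a straight line onto an ellipse with semi-axes a and b.
std::vector<Point2> bendOntoEllipse(const std::vector<Point2>& line, double a, double b)
{
    const double pi = std::acos(-1.0);
    std::vector<Point2> out;
    if (line.size() < 2) return out;

    // total length of the straight wire (first point to last point)
    const double length = std::hypot(line.back().x - line.front().x,
                                     line.back().y - line.front().y);
    for (const Point2& p : line) {
        // t in [0, 1]: fraction of the total length covered so far
        const double t = std::hypot(p.x - line.front().x, p.y - line.front().y) / length;
        const double theta = t * 2.0 * pi;                   // map [0, 1] -> [0, 2*pi]
        out.push_back({ a * std::cos(theta), b * std::sin(theta) });
    }
    return out;
}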
I found this example of how to transform a unit cube into a frustum (truncated pyramid) via a non-affine transformation. I need a matrix that can be pushed onto my matrix stack and does the transform for me. How can this calculation
x' = (M11•x + M21•y + M31•z + OffsetX) ÷ (M14•x + M24•y + M34•z + M44)
y' = (M12•x + M22•y + M32•z + OffsetY) ÷ (M14•x + M24•y + M34•z + M44)
z' = (M13•x + M23•y + M33•z + OffsetZ) ÷ (M14•x + M24•y + M34•z + M44)
be expressed as a single matrix? Is that possible?
For now I am using an inverse projection matrix to transform the unit cube into a frustum, but I have to divide every 3D point by w whenever I want to pick something.
The homogeneous matrix representing those equations is simply
        [ M11 M12 M13 M14 ]          [ 1 0 0 0 ]
    M = [ M21 M22 M23 M24 ]  ,  M0 = [ 0 1 0 0 ]
        [ M31 M32 M33 M34 ]          [ 0 0 1 0 ]
        [ M41 M42 M43 M44 ]          [ 0 0 1 0 ]
You can simply multiply your model data D of the cube by it to get the truncated pyramid, as well as continue stacking it with other matrices, such as camera + projection:
((M * D) * V ) * P;
There's no need to worry about the division by 'w' -- playing with the 4x4 matrices postpones that to the final stages of the rasterizer.
M0 here is the simplest projection matrix; however, to use it you must first translate your cube along the z-axis further away from the camera, multiply by M0, and translate it back to its origin. Define a translation matrix T:
    [ 1 0 0 0 ]
T = [ 0 1 0 0 ]
    [ 0 0 1 4 ]
    [ 0 0 0 1 ]
Then (D * T * M0 * T⁻¹), where T⁻¹ undoes the translation, is a truncated pyramid that has just gone through a perspective transform as if its center were 4 units away from the origin.
(Disclaimer: in OpenGL, M43 is most likely -1.)
To calculate the matrix, it would be wise to choose a math library that is already implemented. The matrix is usually composed as projection_matrix * view_matrix * world_transform_matrix. All three matrices can be created using a library such as GLM, and the usage would be:
glm::perspective(..args...) * glm::lookAt(..args..) * object_transformation
In your case you could ignore lookAt and object_transformation and use only the projection to view the cube.
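For example, a rough sketch with GLM (the specific argument values here are placeholders, not anything given in the question):
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Compose projection * view * object as described above; the inverse of that
// matrix maps the canonical unit cube back out to the view frustum.
glm::mat4 projection = glm::perspective(glm::radians(60.0f), // vertical field of view
                                        16.0f / 9.0f,        // aspect ratio
                                        0.1f, 100.0f);       // near / far planes
glm::mat4 view = glm::lookAt(glm::vec3(0.0f, 0.0f, 5.0f),    // camera position
                             glm::vec3(0.0f),                // look-at target
                             glm::vec3(0.0f, 1.0f, 0.0f));   // up vector
glm::mat4 object = glm::mat4(1.0f);                          // object transform (identity here)

glm::mat4 mvp = projection * view * object;
glm::mat4 cubeToFrustum = glm::inverse(mvp);                 // maps the unit cube to the frustum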
I was able to find an example of a Polar clock at http://raphaeljs.com/polar-clock.html
I modified it to draw concentric circles, but I need the arc to start at 6 o'clock. I am trying to dissect how it works, but haven't been able to figure it out.
JS Fiddle:
http://jsfiddle.net/5frQ8/
var r = Raphael("holder", 600, 600);
// Custom Attribute
r.customAttributes.arc = function (value, total, R, color)
{
    var alpha = 360 / total * value,
        a = (90 - alpha) * Math.PI / 180,
        x = 300 + R * Math.cos(a),
        y = 300 - R * Math.sin(a),
        path;
    if (total == value)
    {
        path = [["M", 300, 300 - R], ["A", R, R, 0, 1, 1, 299.99, 300 - R]];
    }
    else
    {
        path = [["M", 300, 300 - R], ["A", R, R, 0, +(alpha > 180), 1, x, y]];
    }
    return {path: path, stroke: color, "stroke-width": 30};
};
//West
r.path().attr({arc: [575, 2000, 200, '#19A69C']});
//Total#
r.path().attr({arc: [1000, 2000, 160, '#FEDC38']});
//East
r.path().attr({arc: [425, 2000, 120, '#7BBD26']});
I have modified the main function to make the arcs start from the 6 o'clock position. Please note that the formulas for a point in polar coordinates are always:
x = centerX + radius * cos(angle)
y = centerY + radius * sin(angle)
Find the starting and ending points accordingly.
To change the starting angle by "delta", add "delta" to all angles. Thus,
newAngle = angle + delta
The values of delta are -90 and +90 for the arcs to start from 12 o'clock and 6 o'clock respectively (remember that the SVG y axis points downwards, so an angle of +90 degrees corresponds to the bottom of the circle, i.e. 6 o'clock).
The arc drawing function is changed accordingly.
// Custom Attribute
r.customAttributes.arc = function (value, total, R, color)
{
    var angleShift = 90,
        alpha = 360 / total * value,
        a = (alpha + angleShift) * Math.PI / 180,
        x = 300 + R * Math.cos(a),
        y = 300 + R * Math.sin(a),
        path;
    if (total == value)
    {
        path = [["M", 300, 300 + R], ["A", R, R, 0, 1, 1, 300.01, 300 + R]];
    }
    else
    {
        path = [["M", 300, 300 + R], ["A", R, R, 0, +(alpha > 180), 1, x, y]];
    }
    return {path: path, stroke: color, "stroke-width": 30};
};
I've been trying to implement color picking and it just isn't working right. The problem is that if I initially paint my model in the different colors used for picking (I mean, I give each triangle a different color, which is its ID color), it works fine (without texture or anything), but if I put the texture on the model and only paint each triangle with its unique color when the mouse is clicked, it doesn't work.
Here is the code:
public int selection(int x, int y) {
    GL11.glDisable(GL11.GL_LIGHTING);
    GL11.glDisable(GL11.GL_TEXTURE_2D);
    IntBuffer viewport = BufferUtils.createIntBuffer(16);
    ByteBuffer pixelbuff = BufferUtils.createByteBuffer(16);
    GL11.glGetInteger(GL11.GL_VIEWPORT, viewport);
    this.render(this.mesh);
    GL11.glReadPixels(x, y, 1, 1, GL11.GL_RGB, GL11.GL_UNSIGNED_BYTE, pixelbuff);
    for (int m = 0; m < 3; m++)
        System.out.println(pixelbuff.get(m));
    GL11.glEnable(GL11.GL_TEXTURE_2D);
    GL11.glEnable(GL11.GL_LIGHTING);
    return 0;
}
public void render(GL_Mesh m, boolean inPickingMode)
{
    GLMaterial[] materials = m.materials; // loaded from the .mtl file
    GLMaterial mtl;
    GL_Triangle t;
    int currMtl = -1;
    int i = 0;
    // draw all triangles in object
    for (i = 0; i < m.triangles.length; ) {
        t = m.triangles[i];
        // activate new material and texture
        currMtl = t.materialID;
        mtl = (materials != null && materials.length > 0 && currMtl >= 0) ? materials[currMtl] : defaultMtl;
        mtl.apply();
        GL11.glBindTexture(GL11.GL_TEXTURE_2D, mtl.textureHandle);
        // draw triangles until material changes
        for ( ; i < m.triangles.length && (t = m.triangles[i]) != null && currMtl == t.materialID; i++) {
            drawTriangle(t, i, inPickingMode);
        }
    }
}
private void drawTriangle(GL_Triangle t, int i, boolean inPickingMode) {
    if (inPickingMode) {
        byte[] triColor = this.triangleToColor(i);
        GL11.glColor3ub((byte) triColor[2], (byte) triColor[1], (byte) triColor[0]);
    }
    GL11.glBegin(GL11.GL_TRIANGLES);
    GL11.glTexCoord2f(t.uvw1.x, t.uvw1.y);
    GL11.glNormal3f(t.norm1.x, t.norm1.y, t.norm1.z);
    GL11.glVertex3f((float) t.p1.pos.x, (float) t.p1.pos.y, (float) t.p1.pos.z);
    GL11.glTexCoord2f(t.uvw2.x, t.uvw2.y);
    GL11.glNormal3f(t.norm2.x, t.norm2.y, t.norm2.z);
    GL11.glVertex3f((float) t.p2.pos.x, (float) t.p2.pos.y, (float) t.p2.pos.z);
    GL11.glTexCoord2f(t.uvw3.x, t.uvw3.y);
    GL11.glNormal3f(t.norm3.x, t.norm3.y, t.norm3.z);
    GL11.glVertex3f((float) t.p3.pos.x, (float) t.p3.pos.y, (float) t.p3.pos.z);
    GL11.glEnd();
}
As you can see, I have a selection function that is called every time the mouse is clicked. I disable the lighting and the texture, render the scene again in the unique colors, and then read the pixel buffer. The call:
GL11.glReadPixels(x, y, 1, 1, GL11.GL_RGB, GL11.GL_UNSIGNED_BYTE, pixelbuff);
gives me wrong values, and it's driving me nuts!
By the way, the main render function is render(GL_Mesh m, boolean inPickingMode), as you can see, and you can also see that there is a texture on the model before the mouse click.
There are several problems with the example.
First, you're not clearing the color and depth buffers when the mouse is clicked (that causes the scene with colored polygons to be mixed into the scene with textured polygons, and then it doesn't work). You need to call:
GL11.glClear(GL11.GL_COLOR_BUFFER_BIT | GL11.GL_DEPTH_BUFFER_BIT);
Second, it is probably a bad idea to use materials when color picking. I'm not familiar with the GLMaterial class, but it might enable GL_COLOR_MATERIAL or some other state that modifies the final color even when lighting is disabled. Try this:
if (!inPickingMode) { // === add this line ===
    // activate new material and texture
    currMtl = t.materialID;
    mtl = (materials != null && materials.length > 0 && currMtl >= 0) ? materials[currMtl] : defaultMtl;
    mtl.apply();
    GL11.glBindTexture(GL11.GL_TEXTURE_2D, mtl.textureHandle);
} // === and this line ===
Next, and this is not related to color picking: you call glBegin() more often than necessary. You can call it in render(), before the triangle-drawing loop (this shouldn't change how the result looks):
GL11.glBegin(GL11.GL_TRIANGLES);
// draw triangles until material changes
for ( ; i < m.triangles.length && (t = m.triangles[i]) != null && currMtl == t.materialID; i++) {
    drawTriangle(t, i, inPickingMode);
}
GL11.glEnd();
--- Now I am answering a little beyond the original question ---
The thing about color picking is that the renderer has only a limited number of bits to represent colors (as few as 5 bits per channel), so you need to choose ID colors whose low-order bits are not set, i.e. colors that survive the quantization. It might be a bad idea to do this on a mobile device.
If your objects are simple enough (can be represented by, say, a sphere for picking), it might be a good idea to use raytracing for picking instead. It is pretty simple: take the inverse of the modelview-projection matrix and transform the points (mouse_x, mouse_y, -1) and (mouse_x, mouse_y, +1) by it. That gives you the position of the mouse at the near and at the far view plane, in object space. All you need to do is subtract them to get the direction of the ray (the origin is at the near plane), and you can pick your objects using this ray (google ray-sphere intersection).
float[] mvp = new float[16]; // this is your modelview-projection
float mouse_x, mouse_y; // those are mouse coordinates (in -1 to +1 range)
// inputs
float[] mvp_inverse = new float[16];
Matrix.invertM(mvp_inverse, 0, mvp, 0);
// inverse the matrix
float nearX = mvp_inverse[0 * 4 + 0] * mouse_x +
mvp_inverse[1 * 4 + 0] * mouse_y +
mvp_inverse[2 * 4 + 0] * -1 +
mvp_inverse[3 * 4 + 0];
float nearY = mvp_inverse[0 * 4 + 1] * mouse_x +
mvp_inverse[1 * 4 + 1] * mouse_y +
mvp_inverse[2 * 4 + 1] * -1 +
mvp_inverse[3 * 4 + 1];
float nearZ = mvp_inverse[0 * 4 + 2] * mouse_x +
mvp_inverse[1 * 4 + 2] * mouse_y +
mvp_inverse[2 * 4 + 2] * -1 +
mvp_inverse[3 * 4 + 2];
float nearW = mvp_inverse[0 * 4 + 3] * mouse_x +
mvp_inverse[1 * 4 + 3] * mouse_y +
mvp_inverse[2 * 4 + 3] * -1 +
mvp_inverse[3 * 4 + 3];
// transform the near point
nearX /= nearW;
nearY /= nearW;
nearZ /= nearW;
// dehomogenize the coordinate
float farX = mvp_inverse[0 * 4 + 0] * mouse_x +
mvp_inverse[1 * 4 + 0] * mouse_y +
mvp_inverse[2 * 4 + 0] * +1 +
mvp_inverse[3 * 4 + 0];
float farY = mvp_inverse[0 * 4 + 1] * mouse_x +
mvp_inverse[1 * 4 + 1] * mouse_y +
mvp_inverse[2 * 4 + 1] * +1 +
mvp_inverse[3 * 4 + 1];
float farZ = mvp_inverse[0 * 4 + 2] * mouse_x +
mvp_inverse[1 * 4 + 2] * mouse_y +
mvp_inverse[2 * 4 + 2] * +1 +
mvp_inverse[3 * 4 + 2];
float farW = mvp_inverse[0 * 4 + 3] * mouse_x +
mvp_inverse[1 * 4 + 3] * mouse_y +
mvp_inverse[2 * 4 + 3] * +1 +
mvp_inverse[3 * 4 + 3];
// transform the far point
farX /= farW;
farY /= farW;
farZ /= farW;
// dehomogenize the coordinate
float rayX = farX - nearX, rayY = farY - nearY, rayZ = farZ - nearZ;
// ray direction
float orgX = nearX, orgY = nearY, orgZ = nearZ;
// ray origin
And finally, a debugging suggestion: try to render with inPickingMode set to true so you can see what it is that you are actually drawing on screen. If you see texture or lighting, then something went wrong.