I have a list L = [[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]] that represents my matrix. The size can change dynamically, so the block size can be different: 4x4 = 4 elements per block, 9x9 = 9 elements per block.
I want to obtain the 4 square blocks that compose the list (in this case it's a 4 by 4 matrix). If I have this matrix:
5 6 7 8
10 11 12 13
1 2 3 4
14 15 16 17
The result should be:
R = [[5,6,10,11],[7,8,12,13],[1,2,14,15],[3,4,16,17]].
Any suggestions are welcomed. Thanks
The first thing you need is really a lever for turning a list of lists into a matrix. What distinguishes a 2-dimensional matrix from a list of lists? The idea of a coordinate system. So you need a way to relate a coordinate pair with the corresponding value in the matrix.
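% at(+Matrix, ?X, ?Y, ?V): V is the value in row X, column Y of Matrix (0-based indices).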
at(Matrix, X, Y, V) :- nth0(X, Matrix, Row), nth0(Y, Row, V).
This predicate makes it possible to index the matrix at (X,Y) and get the value V. This turns out to be, IMO, a massive demonstration of what makes Prolog powerful, because once you have this one, simple predicate, you gain:
The ability to obtain the value at the point supplied:
?- at([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], 1,3, V).
V = 13.
The ability to iterate the entire matrix (only instantiate Matrix and leave the other arguments as variables):
?- at([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], X,Y, V).
X = Y, Y = 0,
V = 5 ;
X = 0,
Y = 1,
V = 6 ;
...
X = 3,
Y = 2,
V = 16 ;
X = Y, Y = 3,
V = 17.
The ability to search the matrix for values:
?- at([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], X,Y, 14).
X = 3,
Y = 0 ;
false.
So this is a pretty useful lever! In a conventional language, you'd need three different functions to do all these things, but in Prolog we just have to define the relationship between things (in this case, a data structure and a coordinate pair) and Prolog can do quite a bit of the heavy lifting.
It's easy to see how we could produce a particular submatrix now, by just defining the sets of X and Y values we'd like to see. For instance, to get the upper-left matrix we would do this:
?- between(0,1,X), between(0,1,Y),
at([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], X,Y, V).
X = Y, Y = 0,
V = 5 ;
X = 0,
Y = 1,
V = 6 ;
X = 1,
Y = 0,
V = 10 ;
X = Y, Y = 1,
V = 11.
We can of course use findall/3 to gather up the solutions in one place:
?- findall(V, (between(0,1,X), between(0,1,Y),
at([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], X,Y, V)),
Vs).
Vs = [5, 6, 10, 11].
What's left for your problem is basically some arithmetic. Let's see if we have a square matrix:
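% square_matrix(+M, ?Degree): true if M is a square list-of-lists matrix with Degree rows and Degree columns.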
square_matrix(M, Degree) :-
    length(M, Degree),
    maplist(length, M, InnerDegrees),
    forall(member(I, InnerDegrees), I = Degree).
This is not a perfect predicate, in that it will not generate! But it will tell us whether a matrix is square and if so, what degree it has:
?- square_matrix([[5,6,7,8],[10,11,12,13],[1,2,3,4],[14,15,16,17]], D).
D = 4.
Once you have that, what you have to do is sort of formulaic:
Make sure the degree is a perfect square
Take the square root of the degree. That's how many rows and columns of blocks you have (sqrt(4) = 2, so 2 rows and 2 columns of blocks; sqrt(9) = 3, so 3 rows and 3 columns).
Make a relationship between the (row,column) coordinate and a list of (x,y) coordinates for the matrix in that location. For instance in the 4x4 matrix, you have four tiles: (0,0), (0,1), (1,0) and (1,1). The coordinates for (0,0) will be (0,0), (0,1), (1,0), (1,1), but the coordinates for (1,1) will be (2,2),(2,3),(3,2),(3,3). If you do a few of these by hand, you'll see it's going to amount to adding an x and y offset to all the permutations from 0 to row/column count (minus one) for both coordinates.
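(In general, with 0-based indices: for an N-by-N matrix whose blocks have side B = sqrt(N), the tile at (Row, Col) covers the cells (Row*B + I, Col*B + J) for I and J in 0..B-1. With N = 4 and B = 2, tile (1,1) gives exactly the coordinates (2,2), (2,3), (3,2), (3,3) listed above.)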
Now that you have that relationship, you need to do the iteration and assemble your output. I think maplist/N will suffice for this.
Hope this helps!
I am attempting to work in LWJGL to display a simple quad using my own matrices. I've been looking around for a while and have found a few perspective matrix implementations, these two in particular:
[cot(fov/2)/a 0 0 0]
[0 cot(fov/2) 0 0]
[0 0 -f/(f-n) -1]
[0 0 -f*n/(f-n) 0]
and:
[cot(fov/2)/a 0 0 0]
[0 cot(fov/2) 0 0]
[0 0 -(f+n)/(f-n) -1]
[0 0 -(2*f*n)/(f-n) 0]
Both of these provide the same effect, as expected (got them from here and here, respectively). The issue is in my understanding of how multiplying this by the modelview matrix, then by a vertex, then dividing each x, y, and z value by its w value gives a screen coordinate. More specifically, if I multiply either of these by the modelview matrix and then by a vertex (10, 10, 0, 1), it gives w=0. That in itself is a big smack in the face. I conclude that either the matrices are wrong, or I am missing something completely. In my actual test program, the vertices don't even end up on screen, even though the camera being at (0,0,0) with no rotation should make them visible. I have even tried many different z values, positive and negative, to see if it was just a clipping plane. Am I missing something here?
EDIT: After a lot of checking over, I've narrowed down the problem I am facing. The biggest issue is that the z-axis does not appear to be remapped to the range I specify (n to f). Any object just zooms in or out a little bit when I translate it along the z-axis then pops out of existence as it moves past the range [-1, 1]. I think this is also making me more confused. I set my far plane to 100 and my near to 0.1, and it behaves like anything but.
Both of these provide the same effect, as expected
While the second projection matrix form is very standard, the first one gives a different effect. For a vector with z==1 and w==0, the projected depth will be:
Matrix 1: (-f/(f-n)) / (-f*n/(f-n)) = f/(f*n) = 1/n
Matrix 2: (-(f+n)/(f-n)) / (-(2*f*n)/(f-n)) = (f+n)/(2*f*n)
The result is clearly different. You should always use the second form.
if I multiply either of these by the modelview matrix then by a vertex
(10, 10, 0, 1), it gives a w=0. That in itself is a big smack in the
face
For a focal length d the projection is computed as (ignoring aspect ratio):
x'= d*x/z = x / w
y'= d*y/z = y / w
where
w = z / d
If you have z==0, that means you are trying to project a point that sits at the eye itself; only points beyond d are visible. In practice this point will be clipped, because z is not within the range between n (near) and f (far) (n is expected to be a positive constant).
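For what it's worth, here is a minimal C++ sketch (not the poster's code) that builds the second, standard form of the matrix, applies it to a column vector and does the divide. It assumes the usual OpenGL convention where the camera looks down -z, so clip-space w is simply -z_eye; that is exactly why a vertex with eye-space z = 0 comes out with w = 0:

#include <cmath>
#include <cstdio>

struct Vec4 { double x, y, z, w; };

// Apply the standard perspective matrix (second form above) to an eye-space point.
Vec4 project(double fov, double aspect, double n, double f, Vec4 p) {
    double c = 1.0 / std::tan(fov / 2.0);   // cot(fov/2)
    Vec4 clip;
    clip.x = (c / aspect) * p.x;
    clip.y = c * p.y;
    clip.z = (-(f + n) / (f - n)) * p.z + (-(2.0 * f * n) / (f - n)) * p.w;
    clip.w = -p.z;                          // w is just -z_eye
    return clip;
}

int main() {
    double fov = 3.141592653589793 / 4.0, aspect = 1.0, n = 0.1, f = 100.0;

    Vec4 a = project(fov, aspect, n, f, {10, 10, 0, 1});    // z_eye = 0 -> w = 0, cannot divide
    Vec4 b = project(fov, aspect, n, f, {10, 10, -50, 1});  // inside the frustum
    std::printf("w for z=0      : %f\n", a.w);              // prints 0
    std::printf("NDC z for z=-50: %f\n", b.z / b.w);        // roughly 0.998, inside [-1, 1]
    return 0;
}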
I have a mesh model in X, Y, Z format. Let's say:
Points *P;
In first step, I want to normalize this mesh into (-1, -1, -1) to (1, 1, 1).
Here normalize means to fit this mesh into a box of (-1, -1, -1) to (1, 1, 1).
Then after that I do some processing on the normalized mesh; finally I want to revert the dimensions so that they match the original mesh.
step-1:
P = Original Mesh dimensions;
step-2:
nP = Normalize(P); // from (-1, -1, -1) to (1, 1, 1)
step-3:
cnP = do something with (nP); // the number of vertices may have increased or decreased
step-4:
Original Mesh dimensions = Revert(cnP); // dimensions should be the same as the original mesh
How can I do that?
I know how easy it can be to get lost in programming and completely miss the simplicity of the underlying math. But trust me, it really is simple.
The most intuitive way to go about your problem is probably this:
Determine the minimum and maximum value for all three coordinate axes (i.e., x, y and z). Together these define the eight corner vertices of the mesh's bounding box. Save these six values in six variables (e.g., min_x, max_x, etc.).
For all points p = (x,y,z) in the mesh, compute
q = ( 2.0*(x-min_x)/(max_x-min_x) - 1.0,
      2.0*(y-min_y)/(max_y-min_y) - 1.0,
      2.0*(z-min_z)/(max_z-min_z) - 1.0 )
Now q equals p mapped into the box (-1,-1,-1) -- (+1,+1,+1).
Do whatever you need to do on this intermediate grid.
Convert all coordinates q = (xx, yy, zz) back to the original grid by doing the inverse operation:
p = ( (xx+1.0)*(max_x-min_x)/2.0 + min_x,
      (yy+1.0)*(max_y-min_y)/2.0 + min_y,
      (zz+1.0)*(max_z-min_z)/2.0 + min_z )
Clean up any mess you've made and continue with the rest of your program.
This is so easy, it's probably a lot more work to find out which library contains these functions than it is to write them yourself.
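If it helps, here is a minimal C++ sketch of the two helpers described above. The names Normalize and Revert are just taken from the question's pseudocode; the mesh is assumed to be non-empty, and a degenerate axis (max == min) is not handled:

#include <algorithm>
#include <vector>

struct Point  { double x, y, z; };
struct Bounds { double min_x, max_x, min_y, max_y, min_z, max_z; };

// Step 1: find the axis-aligned bounding box of the original mesh.
Bounds computeBounds(const std::vector<Point>& pts) {
    Bounds b = { pts[0].x, pts[0].x, pts[0].y, pts[0].y, pts[0].z, pts[0].z };
    for (const Point& p : pts) {
        b.min_x = std::min(b.min_x, p.x); b.max_x = std::max(b.max_x, p.x);
        b.min_y = std::min(b.min_y, p.y); b.max_y = std::max(b.max_y, p.y);
        b.min_z = std::min(b.min_z, p.z); b.max_z = std::max(b.max_z, p.z);
    }
    return b;
}

// Step 2: map every point into the cube (-1,-1,-1)..(+1,+1,+1).
void Normalize(std::vector<Point>& pts, const Bounds& b) {
    for (Point& p : pts) {
        p.x = 2.0 * (p.x - b.min_x) / (b.max_x - b.min_x) - 1.0;
        p.y = 2.0 * (p.y - b.min_y) / (b.max_y - b.min_y) - 1.0;
        p.z = 2.0 * (p.z - b.min_z) / (b.max_z - b.min_z) - 1.0;
    }
}

// Step 4: inverse mapping. It only uses the saved bounds, so it still works
// if the processing step changed the number of vertices.
void Revert(std::vector<Point>& pts, const Bounds& b) {
    for (Point& p : pts) {
        p.x = (p.x + 1.0) * (b.max_x - b.min_x) / 2.0 + b.min_x;
        p.y = (p.y + 1.0) * (b.max_y - b.min_y) / 2.0 + b.min_y;
        p.z = (p.z + 1.0) * (b.max_z - b.min_z) / 2.0 + b.min_z;
    }
}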
It's easy - use shape functions. Here's a 1D example for two points:
-1 <= u <= +1
x(u) = x1*(1-u)/2.0 + x2*(1+u)/2.0
x(-1) = x1
x(+1) = x2
You can transform between coordinate systems using the Jacobian.
Let's see what it looks like in 2D:
-1 <= u <= +1
-1 <= v <= +1
x(u, v) = x1*(1-u)*(1-v)/4.0 + x2*(1+u)*(1-v)/4.0 + x3*(1+u)*(1+v)/4.0 + x4*(1-u)*(1+v)/4.0
y(u, v) = y1*(1-u)*(1-v)/4.0 + y2*(1+u)*(1-v)/4.0 + y3*(1+u)*(1+v)/4.0 + y4*(1-u)*(1+v)/4.0
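In code, evaluating those 2D shape functions is a one-liner per coordinate. Here is a small C++ sketch (the corner numbering is counter-clockwise, matching the formulas; the names are made up):

struct P2 { double x, y; };

// Map (u,v) in [-1,1]x[-1,1] to a point inside the quad with corners c1..c4.
P2 mapUV(double u, double v, P2 c1, P2 c2, P2 c3, P2 c4) {
    double n1 = (1 - u) * (1 - v) / 4.0;
    double n2 = (1 + u) * (1 - v) / 4.0;
    double n3 = (1 + u) * (1 + v) / 4.0;
    double n4 = (1 - u) * (1 + v) / 4.0;
    return { n1 * c1.x + n2 * c2.x + n3 * c3.x + n4 * c4.x,
             n1 * c1.y + n2 * c2.y + n3 * c3.y + n4 * c4.y };
}

// mapUV(-1, -1, ...) returns corner 1, mapUV(+1, +1, ...) returns corner 3,
// and mapUV(0, 0, ...) returns the centre of the quad.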
I'm trying to calculate the camera's position for an image. I have 2 images of a Rubik's cube. The first image is considered to be the base image and the next image is the image after the camera has moved. So for the first image I assume that the camera is at (0,0,0). On this image I then identify the 4 corners of the front face of the Rubik's cube as shown here (4 corners identified by the 4 blue circles).
Then for the next image (after camera movement), I identify the same face of the Rubik's cube as shown here.
So by taking the first image as the base image, does anyone know if/how I can calculate how much the camera has moved for image 2, as shown here:
I would suggest you use OpenCV for this. I also think this question would be more suited to StackOverflow.
The textbook on this subject would be "Multiple-View Geometry" by Hartley and Zisserman. http://www.robots.ox.ac.uk/~vgg/hzbook/ (There is a sample chapter on the Fundamental Matrix on that website.)
Basically, first find the Fundamental Matrix, then by knowing the intrinsic parameters of the camera, find a solution to the position.
Fundamental Matrix: http://en.wikipedia.org/wiki/Fundamental_matrix_%28computer_vision%29
Intrinsic Parameters: Stuff like the focal length and where the principal point is on the image plane. If you have F, then E = K^t * F * K, where K is the intrinsic matrix (assumed to be the same for both images).
How to find a solution to the camera position: http://en.wikipedia.org/wiki/Essential_matrix#Determining_R_and_t_from_E
Algorithm
This is how I would do it in OpenCV. I have done this before, so it ought to work; a rough code sketch follows the numbered steps.
1. Run a feature detector and descriptor extractor on both images.
2. Match Features.
3. Use F = cv::findFundamentalMat with RANSAC.
4. E = K.t() * F * K. // K needs to be found beforehand.
5. Do SingularValueDecomposition of E such that E = U * S * V.t()
6. R = U * W.inv() * V.t() // W = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
7. Tx = V * Z * V.t() // Z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
8. get t from Tx (matrix version of cross product)
9. Find the correct solution. R.t() and -t are possibilities.
10. Get the overall scale by knowing the length of the sides of the Rubik's cube.
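For reference, a rough C++ sketch of steps 3-8 might look like the following. The matching point lists and the intrinsic matrix K are assumed to come from steps 1-2, error handling is omitted, and this is an outline rather than tested code:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

void relativePose(const std::vector<cv::Point2f>& pts1,
                  const std::vector<cv::Point2f>& pts2,
                  const cv::Mat& K,              // 3x3 intrinsic matrix (CV_64F)
                  cv::Mat& R, cv::Mat& t)
{
    // Step 3: fundamental matrix with RANSAC.
    cv::Mat F = cv::findFundamentalMat(pts1, pts2, cv::FM_RANSAC);

    // Step 4: essential matrix.
    cv::Mat E = K.t() * F * K;

    // Step 5: SVD of E.
    cv::SVD svd(E);

    cv::Mat W = (cv::Mat_<double>(3, 3) << 0, -1, 0,  1, 0, 0,  0, 0, 1);
    cv::Mat Z = (cv::Mat_<double>(3, 3) << 0, -1, 0,  1, 0, 0,  0, 0, 0);

    // Steps 6-8: one candidate rotation, and t read off the skew-symmetric Tx.
    R = svd.u * W.inv() * svd.vt;
    cv::Mat Tx = svd.vt.t() * Z * svd.vt;          // V * Z * V.t()
    t = (cv::Mat_<double>(3, 1) << Tx.at<double>(2, 1),
                                   Tx.at<double>(0, 2),
                                   Tx.at<double>(1, 0));

    // Step 9 (picking the right one of the four R/t combinations) and
    // step 10 (absolute scale from the known cube size) are left to the caller.
}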
Alternative Solutions
I am certain that a more straightforward approach can also work. The benefit of this approach is that no human input is needed (unsupervised). This is not true for the optional step 10 (determining scale).
A different solution would exploit knowledge of the geometry of the Rubik's cube. For example, six (5.5) points are needed to estimate the position of the camera, if the points' 3D positions are known.
Unfortunately, I am not aware of any software that does this for you automatically.
So here is the alternative algorithm (a rough code sketch follows the steps):
Write down the coordinates of the corners of the cube as (X_i, Y_i, Z_i), and possibly also points with other knowable positions.
Mark the corresponding points u_i = (x_i, y_i).
For every correspondence, create two rows in a matrix A:
(X_i, Y_i, Z_i, 1, 0, 0, 0, 0, -x_i*X_i, -x_i*Y_i, -x_i*Z_i, -x_i)
(0, 0, 0, 0, X_i, Y_i, Z_i, 1, -y_i*X_i, -y_i*Y_i, -y_i*Z_i, -y_i)
Then find p such that Ap = 0; i.e., p is the right kernel of A, or the least-squares solution of Ap = 0.
De-flatten p to create the 3x4 camera matrix P.
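A rough OpenCV sketch of this DLT step might look like the following (the names are made up; the coordinate normalization that Hartley and Zisserman recommend is omitted):

#include <opencv2/core/core.hpp>
#include <vector>

// objectPts: known 3D positions (e.g. cube corners); imagePts: the marked pixels.
cv::Mat dltCameraMatrix(const std::vector<cv::Point3d>& objectPts,
                        const std::vector<cv::Point2d>& imagePts) {
    CV_Assert(objectPts.size() == imagePts.size() && objectPts.size() >= 6);

    cv::Mat A(2 * (int)objectPts.size(), 12, CV_64F);
    for (size_t i = 0; i < objectPts.size(); ++i) {
        double X = objectPts[i].x, Y = objectPts[i].y, Z = objectPts[i].z;
        double x = imagePts[i].x,  y = imagePts[i].y;
        double r1[12] = { X, Y, Z, 1, 0, 0, 0, 0, -x*X, -x*Y, -x*Z, -x };
        double r2[12] = { 0, 0, 0, 0, X, Y, Z, 1, -y*X, -y*Y, -y*Z, -y };
        for (int j = 0; j < 12; ++j) {
            A.at<double>(2 * (int)i,     j) = r1[j];
            A.at<double>(2 * (int)i + 1, j) = r2[j];
        }
    }

    cv::Mat p;
    cv::SVD::solveZ(A, p);      // right kernel / least-squares solution of Ap = 0
    return p.reshape(1, 3);     // de-flatten into the 3x4 camera matrix P
}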
So I'm writing my own custom 3D transformation pipeline in order to gain a better understanding of how it all works. I can get everything rendering to the screen properly and I'm now about to go back and look at clipping.
From my understanding, I should be clipping a vertex point if the x or y value after the perspective divide is outside the bounds of [-1, 1] and in my case if the z value is outside the bounds of [0, 1].
When I implement that, however, my z value is always -1.xxxxxxxxxxx where the xxxxxxx part is a very small number.
This is a bit long, and I apologize, but I wanted to make sure I gave all the information I could.
First conventions:
I'm using a left-handed system where a Matrix looks like this:
[m00, m01, m02, m03]
[m10, m11, m12, m13]
[m20, m21, m22, m23]
[m30, m31, m32, m33]
And my vectors are columns looking like this:
[x]
[y]
[z]
[w]
My camera is set up with:
A vertical FOV in radians of PI/4.
An aspect ratio of 1 (square viewport).
A near clip value of 1.
A far clip value of 1000.
An initial world x position of 0.
An initial world y position of 0.
An initial world z position of -500.
The camera is looking down the positive Z axis (0, 0, 1).
Given a vertex, the pipeline works like this:
Step 1: Multiply the vertex by the camera matrix.
Step 2: Multiply the vertex by the projection matrix.
Projection matrix is:
[2.41421, 0, 0, 0]
[0, 2.41421, 0, 0]
[0, 0, 1.001001, 1]
[0, 0, -1.001001, 0]
Step 3: Multiply the x, y and z components by 1/w.
Step 4: [This is where the problem is] Clip the vertex if outside bounds.
Step 5: Convert to screen coordinates.
An example vertex that I have is
(-100, -100, 0, 1)
After multiplying by the camera matrix i get:
(-100, -100, 500, 1)
Which makes sense because relative to the camera, that vertex is 100 units to the left and down and 500 units ahead. It is also between the near clip of 1 and the far clip of 1000. W is still 1.
After multiplying by the projection matrix i get:
(-241.42135, -241.42135, 601.600600, -600.600600)
I'm not sure whether this makes sense. The x and y seem to be correct, but I'm iffy about the z and w, since the next step of perspective divide is odd.
After the perspective divide I get:
(0.401966, 0.401966, -1.001665, 1)
Again the x and y make sense; they are both within the bounds of [-1, 1]. But the z value is clearly outside the bounds even though I believe it should still be within the frustum. W is back to 1, which again makes sense.
Again apologies for the novel, but I'm hoping someone can help me figure out what I'm doing incorrectly.
Thanks!
OK, it looks like I figured out what the problem was.
My projection matrix was:
[2.41421, 0, 0, 0]
[0, 2.41421, 0, 0]
[0, 0, 1.001001, 1]
[0, 0, -1.001001, 0]
But it really should be transposed and be:
[2.41421, 0, 0, 0]
[0, 2.41421, 0, 0]
[0, 0, 1.001001, -1.001001]
[0, 0, 1, 0]
When using this matrix, my x and y values stay the same as expected, and now my z values are constrained to be within [0, 1] and only exceed that range if they are outside the near or far clip plane.
The only issue now is that I'm quite confused as to whether I'm using a right- or left-handed system.
All I know is that now it works...
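For what it's worth, here is a tiny sketch (not from the original post) that checks the depth mapping of the corrected third and fourth rows, using the near/far values of 1 and 1000 from the question:

#include <cstdio>

int main() {
    const double m22 = 1000.0 / 999.0;    // f/(f-n)  = 1.001001...
    const double m23 = -1000.0 / 999.0;   // -f*n/(f-n)

    for (double z : {1.0, 500.0, 1000.0}) {
        double clipZ = m22 * z + m23;     // third row of the corrected matrix
        double clipW = z;                 // fourth row: w' = z
        std::printf("z = %7.1f  ->  depth = %f\n", z, clipZ / clipW);
    }
    // Prints 0 for z = 1 (near), about 0.999 for z = 500, and 1 for z = 1000 (far).
    return 0;
}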
I may be out of my league here, but I thought that the purpose of the projection matrix and perspective divide was to discover the 2D position of that point on the screen. In that case, the left-over z value would not necessarily have any meaning any more, since the math is all geared towards finding those two x and y values.
Update: I think I have it figured out. Your math is all correct. The camera and frustum you describe have a near clipping plane at Z=1, so your example point at (-100, -100, 0) is actually outside of the clipping plane, so that z-buffer value of just below -1 makes perfect sense.
Try a sample point with a z-coordinate inside your frustum, say with a z-coordinate of 2.