OpenGL Superbible linear algebra - is this correct? - opengl

I recently started reading OpenGL Superbible 5th edition and noticed the following:
Having just taken linear algebra, this seemed odd to me. The column vector is of dimension 4x1 and the matrix is 4x4, so how is it possible to multiply them in the order shown, with the vector on the left? If the vector were a row vector and the output were a row vector as well, I agree it would work, but as printed?
Update: I emailed the author and he said that I was correct. He had noticed the order was wrong in the previous edition of the book; however, it ended up not being fixed in the 5th edition.

I agree: it should be a column vector that's pre-multiplied by the identity matrix.
If it's a row vector on the left, then the right-hand side of the equation needs to be a row vector as well to make the dimensions match.

This is not a typo or an error; it's a common way in 3D graphics to express vector-matrix multiplication. But mathematically speaking you are correct: the left-hand vector should be written horizontally, as a row vector. In 3D graphics material you will almost never see it written that way, though.

It's a common mistake throughout the book's matrix-related examples. See LISTING 4.1: the caption says "Translate then Rotate", while both the code printed in the book and the executable sample code show rotate-then-translate behavior. Sigh.
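For anyone who wants to sanity-check the convention in code, here is a minimal C++ sketch (the types and names are illustrative, not from the book) of the only order in which a 4x4 matrix and a 4x1 column vector are dimensionally compatible, M * v:

#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<std::array<float, 4>, 4>; // row-major: m[row][col]

// M (4x4) times v (4x1) yields a 4x1 column vector; writing v * M with a
// column vector on the left has mismatched dimensions (4x1 times 4x4).
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}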

Related

Perspective projection based on 4 points in 2D

I'm writing to ask about homography and perspective projection.
I'm trying to write a piece of code that will "warp" my image so that its corners align with 4 reference points in 3D space. However, the game engine I'm running it in already lets me get their screen positions, so I already have the screen-space coordinates of both the source corners xi,yi and the targets ui,vi, normalized to values between 0 and 1.
I have to mention that I don't have a degree in mathematics, which seems to be a requirement in the posts I've seen on this topic so far, but I'm hoping there is actually a solution to this problem that one can comprehend. I never had a chance to take classes in Computer Vision.
The reason I came here is that in all the posts I've seen online, the simplest explanation I came across is that each point must be written as a 3x1 homogeneous vector and multiplied by a 3x3 homography matrix, which consists of 9 components h1,h2,h3...h9, and this transformation matrix will map each point to the correct perspective. And that's where I'm hitting a brick wall: how do I calculate the transformation matrix? It feels like it should be a relatively simple algebraic task, but apparently it's not.
At this point I've spent days reading on the topic. The solutions I've come across are either based on MATLAB (which has a ton of mathematical functions built in), or include elaborations and discussions that don't really explain much; sometimes they suggest lots of different parameters and simplifications but rarely explain why or what their purpose is, or they reference books and studies that have since been removed from the web, and I found myself more confused than I was in the beginning. Most of the resources I managed to find online are also written in a different context: image stitching and 3D engine development.
I also want to mention that I need to run this code every frame on the CPU, so I'm fairly concerned about the cost of running too many matrix transformations and solving a ton of linear-algebra equations.
I apologize for not asking about any specific code, but my general question is - can anyone point me in the right direction with this issue?
Limit the problem you deal with.
For example, if you always warp the entire rectangular image, you can treat the coordinates of the image corners as {(0,0), (1,0), (0,1), (1,1)}.
This simplifies the equations enough that you can solve them by hand,
and then you'll be able to implement the answer directly.
Note: a homography is scale invariant, so you can reduce the degrees of freedom from 9 to 8 (e.g. you can solve the equations under the constraint h9=1).
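To make that concrete, here is a minimal C++ sketch of this approach (the names findHomography and solve8 are mine, not from any library). Rearranging u = (h1 x + h2 y + h3)/(h7 x + h8 y + 1) and v = (h4 x + h5 y + h6)/(h7 x + h8 y + 1) into equations linear in h gives, for 4 correspondences, an 8x8 system with h9 fixed to 1:

#include <array>
#include <cmath>
#include <utility>

// Solve the augmented 8x9 system in place via Gauss-Jordan elimination
// with partial pivoting; column 8 holds the right-hand side.
bool solve8(std::array<std::array<double, 9>, 8>& a) {
    for (int col = 0; col < 8; ++col) {
        int piv = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[piv][col])) piv = r;
        if (std::fabs(a[piv][col]) < 1e-12) return false; // degenerate points
        std::swap(a[col], a[piv]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            double f = a[r][col] / a[col][col];
            for (int c = col; c < 9; ++c) a[r][c] -= f * a[col][c];
        }
    }
    for (int r = 0; r < 8; ++r) a[r][8] /= a[r][r];
    return true;
}

// Compute h1..h8 (with h9 = 1) mapping (x[i], y[i]) -> (u[i], v[i]).
bool findHomography(const double x[4], const double y[4],
                    const double u[4], const double v[4], double h[9]) {
    std::array<std::array<double, 9>, 8> a{};
    for (int i = 0; i < 4; ++i) {
        // u*(h7 x + h8 y + 1) = h1 x + h2 y + h3, rearranged to be linear in h:
        a[2 * i]     = {x[i], y[i], 1, 0, 0, 0, -x[i] * u[i], -y[i] * u[i], u[i]};
        // v*(h7 x + h8 y + 1) = h4 x + h5 y + h6:
        a[2 * i + 1] = {0, 0, 0, x[i], y[i], 1, -x[i] * v[i], -y[i] * v[i], v[i]};
    }
    if (!solve8(a)) return false;
    for (int i = 0; i < 8; ++i) h[i] = a[i][8];
    h[8] = 1.0;
    return true;
}

With only 4 correspondences this is a single 8x8 solve, a few hundred floating-point operations per frame, so the per-frame CPU cost you are worried about should be negligible.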
Best advice I can give: read a good book on the subject. For example, "Multiple View Geometry" by Hartley and Zisserman.

How to disambiguate P3P 4 solutions for Visual Odometry

I have implemented the P3P algorithm described in the paper "A Novel Parametrization of the Perspective-Three-Point Problem for Direct Computation of Absolute Camera Position and Orientation".
However, the procedure yields 4 solutions, i.e., 4 candidate (translation, orientation) pairs.
Now I am supposed to disambiguate the 4 solutions and get a unique solution by BACK PROJECTION OF A FOURTH POINT.
My understanding is that back projection means to take the fourth point and re-project it on the image plane. But how is that going to help me with finding a unique solution from the above 4?
Any help would be much appreciated.
Thanks
Max
I might have an idea of what this means.
Basically, you take a fourth world point and re-project it into the image using each of the four rotation+translation candidates. Then you choose the solution whose re-projection is closest to the point's observed image position.
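Concretely, that selection step might look like this minimal C++ sketch (the Pose struct and function names are illustrative, and it assumes normalized image coordinates, i.e. the camera intrinsics have already been removed):

#include <array>
#include <limits>

struct Pose {
    std::array<std::array<double, 3>, 3> R; // rotation, row-major
    std::array<double, 3> t;                // translation
};

// Pinhole projection: x_cam = R*X + t, then perspective division by depth.
static std::array<double, 2> project(const Pose& p, const std::array<double, 3>& X) {
    std::array<double, 3> c{};
    for (int i = 0; i < 3; ++i)
        c[i] = p.R[i][0] * X[0] + p.R[i][1] * X[1] + p.R[i][2] * X[2] + p.t[i];
    return {c[0] / c[2], c[1] / c[2]};
}

// Return the index of the P3P candidate whose re-projection of the fourth
// world point lands closest to that point's observed image position.
int disambiguate(const std::array<Pose, 4>& candidates,
                 const std::array<double, 3>& fourthWorld,
                 const std::array<double, 2>& fourthImage) {
    int best = -1;
    double bestErr = std::numeric_limits<double>::max();
    for (int i = 0; i < 4; ++i) {
        auto uv = project(candidates[i], fourthWorld);
        double dx = uv[0] - fourthImage[0], dy = uv[1] - fourthImage[1];
        double err = dx * dx + dy * dy;
        if (err < bestErr) { bestErr = err; best = i; }
    }
    return best;
}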

How to work out the angle between two 2D vectors using cross product?

So here's a link for the same question but the best answer doesn't explain it fully:
Rotate Sprite to Mouse Position
It's the cross product that I'm stuck on, since the formula in that link is given in mathematical notation rather than in a form I can write directly in code.
What is the actual formula to calculate the cross product in computing form?
If you can post it as C++ code that would be great.
Keep in mind I'm looking for the cross product between two 2D vectors, not 3D ones.
The title says you are interested in computing the angle between two 2D vectors, so that's what I'm going with.
If you look at, for instance, http://mathworld.wolfram.com/DotProduct.html, it is fairly straightforward to implement in code.
There is, however, the atan2 function, which makes this a cinch:
double angle = atan2(p2y, p2x) - atan2(p1y, p1x);
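Note that the difference of two atan2 results can fall outside (-pi, pi], so it may need wrapping. To answer the cross-product part of the question directly: in 2D the "cross product" degenerates to a single scalar, the z-component of the 3D cross product, and atan2(cross, dot) gives the signed angle in one step with no wrap-around fix-up. A small sketch (function names are mine):

#include <cmath>

// 2D "cross product": the z-component of the 3D cross product of the two
// vectors lifted into the plane z = 0. Its sign tells you on which side
// b lies relative to a.
double cross2d(double ax, double ay, double bx, double by) {
    return ax * by - ay * bx;
}

// Signed angle from a to b in (-pi, pi]: the cross product supplies the
// sine, the dot product the cosine, and atan2 combines them.
double signedAngle(double ax, double ay, double bx, double by) {
    return std::atan2(cross2d(ax, ay, bx, by), ax * bx + ay * by);
}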

Implementation of Non Local Means Noise reduction algorithm in image processing

I am working on an implementation of the Non-Local Means noise-reduction algorithm in C++. There are papers on this algorithm (such as this paper), but they are not very clear about it either.
I know it uses a weighted mean, but I don't know what the use of the research window is here and how it relates to the comparison window.
As a new user, StackOverflow won't let me upload images, but you can find the formula under the NL-means section of the link provided above.
From the paper you refer to, when determining the result value for a given pixel p, all the other pixels of the image will be weighted and summed according to the similarity between their neighborhoods and the neighborhood of the pixel p.
But that is computationally very expensive. So the authors restrict the number of pixels which will contribute to the weighted sum; that must be what you call the search window. This search window is a 21x21 region centered on the pixel p. The neighborhoods being compared are of size 7x7 (section 5).
I quickly put together a prototype in Mathematica, and I can confirm it becomes very costly as the size of the search window increases. Expect the same behavior when you implement it in C++.
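For reference, here is a naive single-channel sketch of that scheme in C++ (names and parameters are illustrative; half-sizes 10 and 3 give the paper's 21x21 search window and 7x7 comparison patches). It is deliberately the brute-force version, so the cost per pixel is O(search^2 * patch^2), which is why the search window size matters so much:

#include <algorithm>
#include <cmath>
#include <vector>

// Naive Non-Local Means for a row-major grayscale image; h controls how
// quickly the weights fall off with patch dissimilarity.
std::vector<float> nlMeans(const std::vector<float>& img, int w, int ht, float h) {
    const int search = 10, patch = 3; // half-sizes: 21x21 window, 7x7 patches
    auto at = [&](int x, int y) {     // clamp-to-edge pixel access
        x = std::max(0, std::min(w - 1, x));
        y = std::max(0, std::min(ht - 1, y));
        return img[y * w + x];
    };
    std::vector<float> out(img.size());
    for (int y = 0; y < ht; ++y)
        for (int x = 0; x < w; ++x) {
            float sum = 0.f, wsum = 0.f;
            for (int dy = -search; dy <= search; ++dy)
                for (int dx = -search; dx <= search; ++dx) {
                    // mean squared distance between the patches around
                    // (x, y) and (x+dx, y+dy): the similarity measure
                    float d2 = 0.f;
                    for (int py = -patch; py <= patch; ++py)
                        for (int px = -patch; px <= patch; ++px) {
                            float diff = at(x + px, y + py)
                                       - at(x + dx + px, y + dy + py);
                            d2 += diff * diff;
                        }
                    d2 /= float((2 * patch + 1) * (2 * patch + 1));
                    float wgt = std::exp(-d2 / (h * h)); // similarity weight
                    sum += wgt * at(x + dx, y + dy);
                    wsum += wgt;
                }
            out[y * w + x] = sum / wsum; // weighted mean over the search window
        }
    return out;
}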
There's some GPL'd C++ code along with a brief writeup of the algorithm by the original authors here: http://www.ipol.im/pub/algo/bcm_non_local_means_denoising/
This has since been added to OpenCV:
http://docs.opencv.org/modules/photo/doc/denoising.html
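Minimal usage of the OpenCV routine, if you'd rather not implement it yourself (file names are placeholders):

#include <opencv2/opencv.hpp>

int main() {
    cv::Mat noisy = cv::imread("noisy.png", cv::IMREAD_GRAYSCALE);
    cv::Mat denoised;
    // h = 10 (filter strength), 7x7 comparison window, 21x21 search window
    cv::fastNlMeansDenoising(noisy, denoised, 10.0f, 7, 21);
    cv::imwrite("denoised.png", denoised);
}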

GJK collision detection implementation from 2D to 3D

I apologize for the length of this question, and thanks in advance to anyone who reads through it!
So I've spent the last few days going over the GJK algorithm. I understand the general concepts behind it, and most of the nitty-gritty details of its 2D implementation, thanks to the wonderful article by William Bittle at http://www.codezealot.org/archives/88.
I've implemented his pseudocode (found at the end of the article) in my own C++ project, but I want to make a 3D implementation. My weak point is using dot products to test the Voronoi regions and triple products to get perpendicular directions, but I'm trying to read up more on that.
My problem comes down to the containsOrigin function. I'm having trouble visualizing and accounting for the new Voronoi regions that the z-axis adds. I just can't seem to wrap my head around how to determine which region contains the origin. I assume there are 4 I have to account for, each extending from the triangular planes that comprise the 4 faces of the tetrahedron simplex. If the origin is not within any of those regions, then it is contained, and we have a collision.
How do I go about testing whether the origin is contained in a particular Voronoi region, i.e., which triangular face is pointing in the direction of the origin?
The current 2D algorithm checks if a triangle has been made; if not, the simplex is a line and it finds the 3rd point. I assume the 3D algorithm will check if a tetrahedron has been made; if not, it will check for a triangle, and if one exists it will find a 4th point to make a tetrahedron (how would I get this? using the triangle's normal in the direction of the origin?). If a triangle hasn't been made either, it will find a 3rd point to make a triangle (do I still use the triple product for this like in 2D?).
Any suggestions, outlines, resources, code augmentations, or comments are much appreciated.
Depending on what result you expect from the GJK algorithm, you might want to look at this nice tutorial from Molly Rocket: https://mollyrocket.com/849
Be aware, though, that his implementation only outputs a yes/no intersection result. But it might be a nice start.
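On the tetrahedron question specifically: because the newest vertex a was deliberately chosen past the origin, the face (b, c, d) can already be ruled out, so only the three faces containing a need testing. Here is a minimal C++ sketch of that step under those assumptions (Vec3 and the function names are mine; a full implementation would also rebuild the simplex down to the offending face's triangle case, exactly as Bittle's 2D doSimplex does):

struct Vec3 {
    float x, y, z;
    Vec3 operator-(const Vec3& o) const { return {x - o.x, y - o.y, z - o.z}; }
    Vec3 operator-() const { return {-x, -y, -z}; }
};
static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}

// Normal of face (p, q, r), flipped so it points away from the tetrahedron's
// remaining vertex `opposite` -- this sidesteps winding-order bookkeeping.
static Vec3 outwardNormal(Vec3 p, Vec3 q, Vec3 r, Vec3 opposite) {
    Vec3 n = cross(q - p, r - p);
    if (dot(n, opposite - p) > 0) n = -n;
    return n;
}

// Tetrahedron step of containsOrigin. `a` is the most recently added vertex.
// If the origin lies outside one of the three faces containing `a`, the
// caller drops the simplex back to that triangle and searches along `dir`;
// if it is inside all of them, the shapes intersect.
bool containsOriginTetra(Vec3 a, Vec3 b, Vec3 c, Vec3 d, Vec3& dir) {
    Vec3 ao = -a; // from a toward the origin (a lies on all three faces)
    Vec3 nABC = outwardNormal(a, b, c, d);
    if (dot(nABC, ao) > 0) { dir = nABC; return false; } // origin beyond face abc
    Vec3 nACD = outwardNormal(a, c, d, b);
    if (dot(nACD, ao) > 0) { dir = nACD; return false; } // origin beyond face acd
    Vec3 nADB = outwardNormal(a, d, b, c);
    if (dot(nADB, ao) > 0) { dir = nADB; return false; } // origin beyond face adb
    return true; // inside all faces containing a; face bcd was ruled out earlier
}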