Mathematical Issue: Triangle, Pyramid, Rotation, Translation, Zoom - c++

Another tricky question. What you can see here is my physical pyramid, built with 3 LEDs which form a triangle in one plane and another LED in the center, about 18mm above the other 3. The 4th one turns the triangle into a pyramid. (You can see it better if you look at the right triangle. This one is rotated about the horizontal axis, and you can see a diode on a stick very well.)
The second picture shows my running program. The left box shows the raw picture of the LEDs (photo with IR filter). The picture in the center shows that my program found the points and is also able to tell which point is which, based on some conditions (like C is always where the two lines with maximal distance between diodes intersect; and the two longest lengths are always a and b). But don't worry about this, I know the points are found 100% correctly.
Then on the right picture are some calculated values, like the height between C and c and so on. I would be able to calculate more, but I didn't bother for now, because I am stuck.
I want to calculate the pyramid's rotation and translation in 3-dimensional space.
The yellow points are the LEDs after rotation around an axis through the center of the triangle in the camera z-direction. So now I do not have to worry about this when calculating the other 2: the rotation around the horizontal axis and the rotation around the vertical axis. I could easily calculate these with the lengths of the distance from the center of the triangle to the 4th diode (as you can see, the 4th diode moves on the image plane with rotation), or the lengths of the two axes.
But my problem is the unknown depth.
It affects all lengths (a, b, c, and also the lengths from the center to the 4th diode, if we call these d and e). I know the measurements of the real pyramid, with a tolerance of +-5% or so, but they are also affected by the zoom. So how do I deal with this?
I thought of an equation with a ratio between the length of the horizontal axis, the length of the vertical axis, the angles alpha, beta and gamma, and the lengths d and e.
Alpha, beta and gamma are only affected by rotation around the axes (which is what I want to know: the rotation and the zoom), where a rotation around one axis has the opposite effect of a rotation around the other. So if you rotate around both axes by the same angle, the ratio between the lengths of the axes is the same as before.
The zoom (in reality: how close it is to the camera; what I want to know in the first place: a multiplication factor such as 2x, 3x, 0.5, 0.4322344, ...) does not affect the angles, but it affects all the lengths in the same way (I guess): a, b, c, d, e, hc (length of the vertical axis) and hx (length of the horizontal axis; I have not calculated it yet, but it would be easy, and the name is just something I made up right now).
You see, I have thought about this a lot, but I think I am too dumb.
So, is there any math genius out there who can give me the right equations, for the rotation and/or the zoom factor?
(I also thought about using POSIT, downhill simplex, and so on, but a closed-form equation would be the nicest, since I already know so much, like all the points.)
Please, please, I really need your help! I am writing this in C++ with the help of OpenCV, if you need to know, but I think it's more a mathematical problem.
Thanks in advance!
Ah, and Alpha seems to be always the same as Beta!
Edit: Had to delete the second picture

Have a look at Boost Geometry or here also

Have a look at SolvePnP() in OpenCV. Even if you don't use it directly, the documentation has citations for the methods used.

Related

How a feature descriptior for traffic sign detection works in opencv

I am trying to understand how the Centroid to Contour (CtC) detector works. Here is some sample code that I found on GitHub, and I am trying to understand the idea behind it. In this sample, the author is trying to detect a speed sign using CtC. There are just two important functions:
pre_process()
CtC_features
I have understood a part of the code and how it works but I have some problems in understanding how CtC_features function works.
If you can help me I would like to understand the following parts (just 3 points):
Why if centroid.x > curr.x we need to add PI value to the angle result ( if (centroid.x > curr.x) ang += 3.14159; //PI )
When we start selecting the features on line 97 we set the start angle ( double ang = - 1.57079; ). Why is this half the pi value and negative? How was this value chosen?
And a more general question: how can you know that the features you select are related to the speed limit sign? You find the centroid of the image and adjust the angle in the first step, but in the second step, in the loop ( while (feature_v[i].first > ang) ), why do we add that distance as a feature when the current angle is bigger than the hardcoded angle (in the first case ang = - 1.57079)?
I would like to understand the idea behind this code and if someone with more experience and with some knowledge about trigonometry would help me it will be just great.
Thank you.
The code you provided is not the best, but let's see what happens.
I took this starting image of a sign:
Then, pre_process is called, which basically runs a Canny edge detector, along with some tricks which should lead to a better edge detection. I won't look into them, but this is what it returns:
Not the greatest. Maybe some parameter tuning would help.
But now, CtC_features is called, which is the scope of the question. The role of CtC_features is to obtain some features for a machine learning algorithms. This amounts to finding a numerical description of the image which would help the ML algorithm detect the sign. Such a description can be anything. Think about how someone who never saw a STOP sign and does not know how to read would describe it. They would say something like "A red, flat plate, with 8 sides and some white stuff in the middle". Based on this description, someone might be able to tell it's a STOP sign. We want to do the same, but since computers are computers, we look for numerical features. And, with them, some algorithm could be trained to "learn" what features each sign has.
So, let's see what features CtC_features obtains from the contours.
The first thing it does is to call findContours. This function takes a binary image and returns arrays of points representing the contours of the image. Basically, it takes the edges and puts them into arrays of points. With connectivity, so we basically know which points are connected. If we use the code from here for visualization, we can see what happens:
So, the array contours is a std::vector<std::vector<cv::Point>> and you have in each sub-array a continuous contour, here drawn with a different color.
Next, we compute the number of points (edge pixels), and do an average over their coordinates to find the centroid of the edge image. The centroid is the filled circle:
Then, we iterate over all points, and create a vector of std::pair<double, double>, recording for each point the distance from the centroid and the angle. The angle function is defined at the bottom of the file as
double angle(Point2f a, Point2f b) {
return atan((a.y - b.y) / (a.x - b.x));
}
It basically computes the angle of the line from a to b with respect to the x axis, using the arctangent function. I'll let you watch a video on arctangent, but the tl;dr is that it gives you the angle from a ratio, in radians (a circle is 2 PI radians, half a circle is PI radians). The problem is that the tangent is periodic, with a period of PI. This means that there are 2 angles on the circle (the circle of all points at the same distance around the centroid) which give you the same value. So, we compute the ratio (the ratio is, by the way, known as the tangent of the angle), apply the inverse function (arctangent) and we get an angle (corresponding to a point). But what if it's the other point? Well, we know that the other point is offset by exactly PI radians (it is diametrically opposite), so we add PI if we detect that it's the other point.
The picture below also helps understand why there are 2 points:
The tangent of the angle is the highlighted vertical distance. But the angle on the other side of the diagonal line, which intersects the circle in the bottom left, also has the same tangent. The atan function only returns angles on the right side of the center. Note that on that side there are no 2 directions with the same tangent.
What the check does is to ask whether the point is to the left of the centroid (centroid.x > curr.x). This is done in order to be able to add half a circle (PI radians or 180 degrees) to correct the result of atan.
Now, we know the distance (a simple formula) and we have found (and corrected) the angle. We insert this pair into the vector feature_v, and we sort it. The sort function, called like that, sorts by the first element of the pair and then by the second, so we sort by angle, then by distance.
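The same quadrant correction can also be left to atan2, which looks at the signs of both differences. Below is a minimal sketch, assuming plain double coordinates instead of the cv::Point2f used in the sample; the function names angleAtanCorrected and angleAtan2 are made up for illustration and do not appear in the original code.

```cpp
#include <cmath>

const double kPi = 3.14159265358979323846;

// atan plus the "add PI on the left side" correction discussed above.
// The result lies in [-PI/2, 3*PI/2]. The centroid itself (dx == dy == 0)
// has no direction and yields NaN.
double angleAtanCorrected(double cx, double cy, double px, double py) {
    double dx = px - cx;
    double dy = py - cy;
    double ang = std::atan(dy / dx);  // dx == 0 gives +-infinity, which atan maps to +-PI/2
    if (dx < 0.0) ang += kPi;         // the point is to the left of the centroid
    return ang;
}

// atan2 resolves the quadrant by itself. The result lies in [-PI, PI].
double angleAtan2(double cx, double cy, double px, double py) {
    return std::atan2(py - cy, px - cx);
}
```

The two versions agree except for points to the lower left of the centroid, where they differ by exactly 2 PI. If atan2 were used instead, the binning would have to start at -PI rather than at -PI/2.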
The interval variable:
int degree = 10;
double interval = double((double(degree) / double(360)) * 2 * 3.14159); //5 degrees interval
is simply the value of degree, converted from degrees into radians. We need radians since the angles have been computed in radians so far, while degrees are more user friendly. Yep, the comment is wrong, the interval is 10 degrees, not 5.
The ang variable defined below it is -PI / 2 (a quarter of a circle):
double ang = - 1.57079;
Now, what it does is divide the points around the centroid into bins, based on the angle. Each bin is 10 degrees wide. This is done by iterating over the points sorted by angle; all are accumulated until we get to the next bin. We are only interested in the largest distance of a point in each bin. The starting angle should be small enough that all the directions (points) are captured.
In order to understand why it starts from -PI/2, we have to go back to the trigonometric function diagram above. What happens if the angle goes like this:
See how the highlighted vertical segment goes "downwards" on the y axis. This means that its length (and implicitly the tangent) is negative here. Also, the angle is considered to be negative (otherwise there would be 2 angles on the same side of the center with the same tangent). Now, we are interested in the range of angles we have. It's all the angles on the right side of the centroid, starting from the bottom at -PI/2 to the top at PI/2. A range of PI radians, or 180 degrees. This is also written in the documentation of atan:
If no errors occur, the arc tangent of arg (arctan(arg)) in the range [-PI/2, +PI/2] radians, is returned.
So, we simply split all the possible directions (360 degrees) into buckets of 10 degrees, and take the distance of the farthest point in each bin. Since the circle has 360 degrees, we'll get 360 / 10 = 36 bins. Then, these are normalized such that the greatest value is 1. This helps a bit with the machine learning algorithm.
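The binning and normalization described above can be sketched as follows. This is not the loop from the sample code (which walks the sorted vector and advances ang bin by bin); it is a simplified version that computes the bin index directly, assuming the angles are already corrected to the range [-PI/2, 3*PI/2). The name maxDistancePerBin is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

const double kPi = 3.14159265358979323846;

// Input: (angle in radians, distance from centroid) for every edge point.
// Output: one value per angular bin, holding the largest distance seen in
// that bin, scaled so that the overall maximum is 1.
std::vector<double> maxDistancePerBin(
    const std::vector<std::pair<double, double>>& angleDistance,
    int degreesPerBin = 10) {
    const int numBins = 360 / degreesPerBin;
    const double interval = degreesPerBin * kPi / 180.0;  // bin width in radians
    std::vector<double> bins(numBins, 0.0);

    for (const auto& entry : angleDistance) {
        // Shift by PI/2 so that the first bin starts at -PI/2.
        int idx = static_cast<int>(std::floor((entry.first + kPi / 2.0) / interval));
        idx = std::max(0, std::min(idx, numBins - 1));  // guard against rounding at the ends
        bins[idx] = std::max(bins[idx], entry.second);
    }

    // Normalize so that the greatest value is 1 (skipped if there are no points).
    const double largest = *std::max_element(bins.begin(), bins.end());
    if (largest > 0.0) {
        for (double& b : bins) b /= largest;
    }
    return bins;
}
```

Bins that contain no edge point stay at 0 in this sketch, which is one possible convention; the sample code may treat empty bins differently.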
How can we know if the point we selected belongs to the sign? We don't. Most computer vision algorithms make some assumptions regarding the image in order to simplify the problem. The idea of the algorithm is to determine the shape of the sign by recording the distance from the center to the edges. This makes the assumption that the centroid is roughly in the middle of the sign. Depending on the ML algorithm used, and on the training data, different levels of robustness can be obtained.
Also, it assumes that (some of) the edges can be reliably identified. See how in my image, the algorithm was not able to detect the upper left edge?
The good news is that this doesn't have to be perfect. ML algorithms know how to handle this variation (up to some extent) provided that they are appropriately trained. It doesn't have to be perfect, but it has to be good enough. In order to answer what good enough means, what are the actual limitations of the algorithm, some more testing needs to be done, as well as some understanding of the ML algorithm used. But this is also why ML is so popular in vision: it can handle a lot of variation quite well.
At the end, we basically get an array of 36 numbers, one for each of the 36 bins of 10 degrees, representing the maximum distance of a point in the bin. I assume this is because the developer of the algorithm wanted a way to capture the shape of the sign, by looking at distances from center in various directions. This assumes that no edges are detected in the background, and the sign looks something like:
The max distance is used so that the border is picked, and not other symbols on the sign.
It is not directly used here, but possibly related reading is the Hough transform, which uses a similar parametrization to detect straight lines in an image.

OpenCV get 3D coordinates from 2D pixel

For my undergraduate paper I am working on an iPhone application using OpenCV to detect domino tiles. The detection works well in close areas, but when the camera is angled, the tiles far away are difficult to detect.
To solve this, I want to do some spatial calculations. For this I would need to convert a 2D pixel value into world coordinates, calculate a new 3D position with a vector, convert these coordinates back to 2D, and then check the colour/shape at that position.
Additionally I would need to know the 3D positions for Augmented Reality additions.
The camera matrix I got through this link: create opencv camera matrix for iPhone 5 solvepnp
The rotation matrix of the camera I get from Core Motion.
Using Aruco markers would be my last resort, as I wouldn't get the desired effect that I need for the paper.
Now my question is: can I not make these calculations when I know the locations and distances of the circles on, let's say, a tile with a 5 on it?
I wouldn't need to have a measurement in mm/inches, I can live with vectors without measurements.
The camera needs to be able to be rotated freely.
I tried to invert the calculation sm'=A[R|t]M' to be able to calculate the 3D coordinates of a 2D point. But I am stuck with inverting the [R|t] even on paper, and I don't know how I'd do that in Swift or C++ either.
I have read so many different posts on forums, in books etc. and I am completely stuck and appreciate any help/input you can give me. Otherwise I'm screwed.
Thank you so much for your help.
Update:
By using the solvePnP that was suggested by Micka I was able to get the Rotation and Translation Vectors for the angle of the camera.
Meaning that if you are able to identify multiple 2D Points in your image and know their respective 3D World coordinates (in mm, cm, inch, ...), then you can get the mechanisms to project points from known 3D World coordinates onto the respective 2D coordinates in your image. (use the opencv projectPoints function).
What is up next for me to solve is the translation from 2D into 3D coordinates, where I need to follow ozlsn's approach with the inverse of the received matrices out of solvePnP.
Update 2:
With a top-down view I am getting along quite well with detecting the tiles and their positions in the 3D world:
tile from top Down
However, if I now angle the view, my calculations no longer work. For example, I check the bottom edge of a 9-dot group and the center of the black division bar for 90° angles. If Corner1 -> Middle Edge -> Bar Center and Corner2 -> Middle Edge -> Bar Center are both 90° angles, then the bar in the middle is found and the position of the tile can be determined.
When the view is angled, these angles are shifted by the perspective to, let's say, 130° and 50°. (I'll provide an image later.)
The idea I have now is to take 4 points (bottom edge plus middle), calculate solvePnP, and then transform the needed dots and the center bar from their 2D positions to 3D positions (height should be irrelevant?). Then I could check with the transformed points whether the angles are 90° and also do the other needed distance calculations.
Here is an image of what I am trying to accomplish:
Markings for Problem
I first find the 9 dots and arrange them. For each Edge I try to find the black bar. As said above, seen from Top, the angle blue corner, green middle edge to yellow bar center is 90°.
However, as the camera is angled, the angle is not 90° anymore. I also cannot check if both angles are 180° together, that would give me false positives.
So I wanted to do the following steps:
Detect Center
Detect Edges (3 dots)
SolvePnP with those 4 points
rotate the edge and the center points (coordinates) to 3D positions
Measure the angles (check if both 90°)
Now I wonder how I can transform the 2D Coordinates of those points to 3D. I don't care about the distance, as I am just calculating those with reference to others (like 1.4 times distance Middle-Edge) etc., if I could measure the distance in mm, that would even be better though. Would give me better results.
With solvePnP I get the rvec which I could change into the rotation Matrix (with Rodrigues() I believe). To measure the angles, my understanding is that I don't need to apply the translation (tvec) from solvePnP.
This leads to my last question: when using the iPhone, can't I use the angles from the motion detection to build the rotation matrix beforehand, and only use this to rotate the tile to show it from the top? I feel that this would save me a lot of CPU time if I don't have to run solvePnP for each tile (there can be up to about 100 tiles).
Find Homography
// Detected positions in the image (pixels)
vector<Point2f> tileDots;
tileDots.push_back(corner1);
tileDots.push_back(edgeMiddle);
tileDots.push_back(corner2);
tileDots.push_back(middle.Dot->ellipse.center);

// Corresponding positions on the real tile
vector<Point2f> realLivePos;
realLivePos.push_back(Point2f(5.5, 19.44));
realLivePos.push_back(Point2f(12.53, 19.44));
realLivePos.push_back(Point2f(19.56, 19.44));
realLivePos.push_back(Point2f(12.53, 12.19));

Mat M = findHomography(tileDots, realLivePos, CV_RANSAC);
cout << "M = " << endl << " " << M << endl << endl;

// Points to map from image to tile coordinates, including the bar candidate
vector<Point2f> barPerspective;
barPerspective.push_back(corner1);
barPerspective.push_back(edgeMiddle);
barPerspective.push_back(corner2);
barPerspective.push_back(middle.Dot->ellipse.center);
barPerspective.push_back(possibleBar.center);

vector<Point2f> barTransformed;
if (countNonZero(M) < 1)
{
    cout << "No Homography found" << endl;
} else {
    perspectiveTransform(barPerspective, barTransformed, M);
}
This, however, gives me wrong values, and I don't know where to look anymore (I can't see the forest for the trees).
Image Coordinates https://i.stack.imgur.com/c67EH.png
World Coordinates https://i.stack.imgur.com/Im6M8.png
Points to Transform https://i.stack.imgur.com/hHjBM.png
Transformed Points https://i.stack.imgur.com/P6lLS.png
You see I am even too stupid to post 4 images here??!!?
The 4th index item should be at x 2007 y 717.
I don't know what I am doing wrong here.
Update 3:
I found the following post Computing x,y coordinate (3D) from image point which is doing exactly what I need. I don't know maybe there is a faster way to do it, but I am not able to find it otherwise. At the moment I can do the checks, but still need to do tests if the algorithm is now robust enough.
Result with SolvePnP to find bar Center
The matrix [R|t] is not square, so, by definition, you cannot invert it. However, this matrix lives in projective space, which is nothing but an extension of R^n (Euclidean space) with a '1' added as the (n+1)st element. For compatibility, the matrices that multiply vectors of the projective space get an extra row with a '1' in the lower-right corner. That is, R becomes
[R|0]
[0|1]
In your case [R|t] becomes
[R|t]
[0|1]
and you can take its inverse which reads as
[R'|-R't]
[0 | 1 ]
where ' is a transpose. The portion that you need is the top row.
Since the phone translates in 3D space, you need the distance (depth) of the pixel under consideration. This means that the answer to your question about whether you need distances in mm/inches is yes. The answer changes only if you can assume that the ratio of camera translation to depth is very small; this is called a weak perspective camera. The question that you're trying to tackle is not an easy one. There are still people researching this at PhD level.

How to find the angle formed by blades of a wind turbine when the yaw is changed?

This is a continuation of the question from Here-How to find angle formed by the blades of a wind turbine with respect to a horizontal imaginary axis?
I've decided to use the following methodology for this-
- Getting a frame from a camera and putting it in a loop.
- Performing Canny edge detection.
- Performing HoughLinesP to detect lines in the image.
Finding Blade Angle:
- Perform the Probabilistic Hough Lines Transform on the image. Restrict the blade lines to the length of the blades, which is already known.
- The returned value will have the start and end points of the lines detected. Since there is no background noise, this gives the start and end points of the blade lines, and the image will contain only the blade lines.
- Now, find the dot product with the vector (1,0) using the vectors of the detected blade lines, or use atan2 to find the angle of each detected line relative to the horizontal.
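The atan2 variant of the last step can be sketched like this, assuming each detected line is given by its two end points in pixel coordinates (as HoughLinesP returns them) and that the image y axis points downwards. The function name segmentAngleDeg is hypothetical.

```cpp
#include <cmath>

// Angle in degrees between the segment (x1, y1) -> (x2, y2) and the
// horizontal direction (1, 0), in the range [-180, 180].
// The y difference is negated because image rows grow downwards, so that
// counter-clockwise angles, as seen on screen, come out positive.
double segmentAngleDeg(double x1, double y1, double x2, double y2) {
    const double kPi = 3.14159265358979323846;
    return std::atan2(-(y2 - y1), x2 - x1) * 180.0 / kPi;
}
```

For example, a segment from (0, 0) to (10, -10) in image coordinates points up and to the right on screen and gives 45 degrees. Since Hough lines have no preferred direction, the order of the two end points decides whether the result or its 180 degree counterpart is returned.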
Problem:
When the yaw angle of the turbine is changed and it is not directly facing the camera, how do I calculate the blade angle formed?
The idea is to basically map the angles when rotated back into the form when viewed head on. From what I've been able to understand, I thought I'd find the homography matrix, decompose the matrix to get rotation, convert to Euler angles to calculate shift from the original axis, then shift all the axes with that angle. However, it's just a vague idea with no concrete planning to go upon.
Or I begin with trying to find the projection matrix, then get camera matrix and rotation matrix? I am lost on this account completely and feel overwhelmed with the many functions...
Other things I came across were the perspective transform and solvePnP.
It would be great if anyone could suggest another way to deal with this? Any links of code snippets would be helpful. I'm not that familiar with OpenCV and would be grateful for any help.
Thanks!
Edit:
[Edit by Spektre]
Assume the tips of the blades plus the center (or the three "roots" of the blades) lie on a common plane.
Fit a homography between those points and the corresponding ones in a reference pose for the turbine (cv::findHomography in OpenCV).
Decompose the homography into rotation and translation using an estimated or assumed camera calibration (cv::decomposeHomographyMat).
Convert the rotation into Euler angles.
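The last step can be sketched in plain C++ as follows. Note that "Euler angles" is ambiguous: this sketch assumes the Z-Y-X (yaw, pitch, roll) convention, R = Rz(yaw) * Ry(pitch) * Rx(roll), which is a choice made here and not something the steps above prescribe. The names EulerZYX and rotationToEulerZYX are made up.

```cpp
#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

// Angles in radians for R = Rz(yaw) * Ry(pitch) * Rx(roll).
struct EulerZYX {
    double yaw;
    double pitch;
    double roll;
};

// Extract Z-Y-X Euler angles from a rotation matrix.
// Near pitch = +-90 degrees (gimbal lock) yaw and roll are not unique,
// and the values returned for them are not meaningful.
EulerZYX rotationToEulerZYX(const Mat3& R) {
    EulerZYX e;
    e.pitch = std::atan2(-R[2][0], std::sqrt(R[2][1] * R[2][1] + R[2][2] * R[2][2]));
    e.yaw = std::atan2(R[1][0], R[0][0]);
    e.roll = std::atan2(R[2][1], R[2][2]);
    return e;
}
```

Which of the three angles corresponds to the turbine's yaw depends on how the reference pose and the camera axes are defined, so the mapping has to be checked against a known pose.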

Calculate an elliptical arc start and end angles given two vectors

I am working on a program which draws shapes based on a cgm file input. I am trying to draw elliptical arc and it gives the opening portion in terms of a start and end vector from the center of the arc. I need help calculating the angle to the vector so I can draw.
I have been trying to use the standard atan2(y, x), but then I found that it is valid for circles and not for ellipses.
This image gives an example of what I'm trying to do. I am looking for angles A and B.
edit: This is related to my other question here. (Also note, this question is based on the math behind my problem while the other question was for programming help with qt.)
The wiki page on ellipses kind of shows why the math isn't working but I'm not sure how to implement it.
The angles A and B you were drawing in your picture in fact have nothing to do with the ellipse.
Just calculate once the angle between the x-axis and the line from origin to point (75,50). This is given by arctan(50/75) = 33.69°. And by symmetry, it is the same as the angle to point (75, -50).
Then, by simple trigonometry, for angle A you get A = 360° - 33.69°, whereas for B you get B= 180° + 33.69°.
Considering A, this is the same information that is obtained by atan2(-50, 75). However, the result of atan2 is (i) in radians and (ii) in the range [-pi, pi]. You could add 2*pi and convert to degrees, and you get the same result as above.
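A minimal sketch of that conversion, using the numbers from this answer. The end point (-75, -50) for angle B is an assumption inferred from B = 180° + 33.69°, since the picture is not reproduced here, and the function name directionDegrees is made up.

```cpp
#include <cmath>

// Direction of the vector (x, y) measured counter-clockwise from the
// positive x axis, in degrees in the range [0, 360).
double directionDegrees(double x, double y) {
    const double kPi = 3.14159265358979323846;
    double deg = std::atan2(y, x) * 180.0 / kPi;  // atan2 returns radians in [-pi, pi]
    if (deg < 0.0) deg += 360.0;                  // same as adding 2*pi before converting
    return deg;
}
```

With this, directionDegrees(75, -50) gives about 326.31 (angle A) and directionDegrees(-75, -50) gives about 213.69 (angle B), matching the values computed by hand above.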

my plane is not vertical, How to update coordinate of point cloud to lie on a vertical plane

I have a bunch of points lying on a vertical plane. In reality this plane
should be exactly vertical. But, when I visualize the point cloud, there is a
slight inclination (nearly 2 degrees) from the verticality. At the moment, I can calculate
this inclination only. Concerning other errors, I assume there are no
shifts or something like that.
So, I want to update the coordinates of my point data so that they lie on the vertical plane. I think I should apply some kind of transformation. It may be only a rotation about the X-axis, but I am not sure.
I guess, you understood my question. Honestly, I am poor at
mathematics. So, please let me know how to update my point coordinates
to lie on the exact vertical plane.
Note: AS I am implementing this in c++ and there are many programmers who have sound knowledge on these things, I am posting this question under c++.
UPDATES
If I say exactly what I have done so far;
I have point cloud data representing a vertical object plus its surroundings. (The data is collected by a moving scanner and its axes may deviate from the correct world axes.) The problem is that I cannot say for certain whether there is an error in my data or not. Therefore, I checked this with a vertical planar object (which is also the dominant object in my data). In reality that plane is truly vertical. But when I fit a plane after removing outliers, that plane is not truly vertical and has an inclination of nearly 2 degrees. Therefore, I suspect that my data has some error. So I want to update my whole point cloud (including the points on the plane and the points which represent other objects) in such a way that those particular planar points lie exactly on the vertical plane. Then, I guess, all the points will be moved to their correct positions as in reality. That is, all (x,y,z) coordinates should be updated.
As an example please refer the below figure.
The left figure represents the original point cloud (as you can see, the points themselves are not vertical); the black line shows the vertical plane which I fitted and the red line is the zenith line. As you can see, the fitted plane is inclined.
So, I want to update my whole data set as in the right figure. Then, after updating, if I fit a plane again (removing outliers), it should be exactly parallel to the zenith line. Please help me.
I may be able to help you out, considering I worked with planes recently. First of all, how come the points aren't coplanar from the get go? I'd make the points coplanar in the first place instead of them being at an inclination (from what origin?), and then having to fix them. Also, having the points be coplanar on your first go would increase efficiency.
Sorry if this is the answer you're not looking for, but I need more information before I can help you out. Also, 3D math is hard. If you work with it enough, it starts to get pounded into your head, where you will NEVER forget it, especially if you went through the headaches I had to go through.
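I did a bit of thinking on it, and since the correction is a rotation about a single coordinate axis, it takes place in one coordinate plane, which means we can make this a 2D problem. Strictly speaking, a rotation about the x-axis acts in the yz-plane; the formulas below are written for the xz-plane, so substitute y for x if your inclination really is about the x-axis. After doing a bit of research on Wikipedia, this may be your solution.
new x = ((x - intended x) * cos(angle)) - (z * sin(angle)) + intended x
new z = ((x - intended x) * sin(angle)) + (z * cos(angle))
What I'm doing here is subtracting our intended x value from our current x value, so that (intended x, 0) becomes the point of origin to rotate around. After the point is rotated, I add (intended x, 0) back to our coordinate so that we get the correct result, which is why only the x component gets the offset added back.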
Depending on where you got your points from (some kind of measurement, I guess) and what you want to do with them, there are several different things you could do with your data.
The search keyword "regression plane" might help - there are several ways of finding planes approximating point clouds, and several ways to "snap" points to planes.
Edit: You want to apply a rotation around the axis defined by the cross product of the normal vector on your regression plane and the normal of your desired plane, and a point your choice. From your illustration I take it that you probably want the bottom of your vertical planar object to be the point of reference for the rotation.
So you've got your point of reference, you know the axis around which you want to rotate, and the angle. All you need to do is:
Translation (to get to your point of reference)
Rotation
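The translation and rotation steps can be sketched with Rodrigues' rotation formula. This is a sketch under assumptions: both plane normals are unit vectors, the pivot is a point you choose (for example on the bottom of the planar object), and all function names are made up for illustration.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 cross3(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

double dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Axis (unit length) and angle that turn the unit normal `from` onto the
// unit normal `to`. Returns false if the normals are (anti)parallel, in
// which case the cross product does not define an axis.
bool axisAngleBetween(const Vec3& from, const Vec3& to, Vec3& axis, double& angle) {
    Vec3 c = cross3(from, to);
    double len = std::sqrt(dot3(c, c));
    if (len < 1e-12) return false;
    axis = {c[0] / len, c[1] / len, c[2] / len};
    angle = std::atan2(len, dot3(from, to));
    return true;
}

// Rotate p by `angle` radians around the unit axis k passing through pivot:
// translate to the pivot, apply Rodrigues' formula, translate back.
Vec3 rotateAroundAxis(const Vec3& p, const Vec3& pivot, const Vec3& k, double angle) {
    Vec3 v = {p[0] - pivot[0], p[1] - pivot[1], p[2] - pivot[2]};
    Vec3 kxv = cross3(k, v);
    double kdv = dot3(k, v);
    double c = std::cos(angle);
    double s = std::sin(angle);
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        out[i] = v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) + pivot[i];
    return out;
}
```

Applying rotateAroundAxis with the same axis, angle and pivot to every point of the cloud moves the fitted plane onto the desired one while keeping all points rigidly attached to each other.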
I read your question again, and hopefully this answer will help you out. If there's anything else I need to know, please tell me.
Now, in order to rotate anything, there must be a center point to rotate around. You've already been able to detect the angle of inclination, so now we need a formula for rotating a point by a certain angle around an origin. In addition, since this problem only occurs on a 2D plane, we can use this basic formula to readjust the points. For any two axes x and y:
x' = (x - x.origin) * cos(theta) - (y - y.origin) * sin(theta) + x.origin
y' = (x - x.origin) * sin(theta) + (y - y.origin) * cos(theta) + y.origin
Theta is the angle that you will rotate around in a counter-clockwise direction. x' and y' are your new points. x.origin and y.origin are the coordinates for the point you will be going around. Now I don't know if my math is 100% correct on this but if it's not, hopefully you can change a thing or two and it will work.
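The rotation of a point around an origin described here can be sketched in C++ as follows, assuming the standard counter-clockwise 2D rotation. The type Point2 and the function names are made up for illustration; for the point cloud, x and y stand for whichever two coordinates span the plane in which the inclination was measured.

```cpp
#include <cmath>
#include <vector>

struct Point2 {
    double x;
    double y;
};

// Rotate p counter-clockwise by theta radians around origin.
Point2 rotateAbout(const Point2& p, const Point2& origin, double theta) {
    const double dx = p.x - origin.x;  // move the origin to (0, 0)
    const double dy = p.y - origin.y;
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    // Rotate, then move the origin back.
    return {dx * c - dy * s + origin.x,
            dx * s + dy * c + origin.y};
}

// Apply the same rotation to every point in place.
void rotateAllAbout(std::vector<Point2>& points, const Point2& origin, double theta) {
    for (Point2& p : points) p = rotateAbout(p, origin, theta);
}
```

To undo an inclination of about 2 degrees, theta would be roughly 2 * PI / 180 radians with the sign chosen opposite to the measured tilt; the third coordinate of each point is left unchanged.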