Axis-Aligned Bounding Box (AABB) Calculation for different orientations of a 3D object - computer-vision

I am trying to calculate the Axis-Aligned Bounding Box of a 3d CAD model (.stp file) for different orientations.
More specifically, imagine a 3d object lying on a virtual workbench and we have a top view of it in a CAD program.
We only care about the top view (representing the projection of the object on the XY plane).
The final goal is to create a table containing the ratio of the bounding box X and Y sides for every degree of rotation.
The following sketches clarify what I mean.
Any ideas/suggestions for any part of the task?

I've got two ideas for solution approaches (depending on the capabilities of your CAD software):
Use some kind of "extreme point" function: get the coordinates of these extreme points while varying the direction in which the point is generated.
Create a straight line (or a plane in 3D) which does not intersect your geometry. Measure the minimal distance between your body and the line/plane, then rotate the line/plane around your body (around its centre of gravity) stepwise to get multiple measurements.
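If your CAD package can export the mesh vertices, a brute-force version of this is also easy to write yourself. A minimal sketch in C++, assuming the vertices have already been projected onto the XY plane (Vec2 and printAabbRatios are illustrative names, not part of any CAD API):

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec2 { double x, y; };

// Print the AABB side ratio X/Y for every degree of rotation.
// (By symmetry the values repeat after 180 degrees.)
void printAabbRatios(const std::vector<Vec2>& pts)
{
    const double kPi = 3.14159265358979323846;
    for (int deg = 0; deg < 360; ++deg) {
        double a = deg * kPi / 180.0;
        double c = std::cos(a), s = std::sin(a);
        double xmin = 1e300, xmax = -1e300, ymin = 1e300, ymax = -1e300;
        for (const Vec2& p : pts) {
            // Rotate the projected point, then grow the axis-aligned extents.
            double rx = c * p.x - s * p.y;
            double ry = s * p.x + c * p.y;
            xmin = std::min(xmin, rx); xmax = std::max(xmax, rx);
            ymin = std::min(ymin, ry); ymax = std::max(ymax, ry);
        }
        std::printf("%3d deg: X/Y = %f\n", deg, (xmax - xmin) / (ymax - ymin));
    }
}

Rotating the points while the axes stay fixed is equivalent to rotating the measuring line/plane around the part, as in the second idea above.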

Determining homography from known planes?

I've got a question related to multiple view geometry.
I'm currently dealing with a problem where I have a number of images collected by a drone flying around an object of interest. This object is planar, and I am hoping to eventually stitch the images together.
Leaving aside the classical way of identifying corresponding feature pairs, computing a homography, and warping/blending, I want to see what information related to this task I can infer from prior known data.
Specifically, for each acquired image I know the following two things: I know the correspondence between the central point of my image and a point on the object of interest (onto whose plane I would eventually want to warp my image), and I also have a normal vector to the plane of each image.
So, knowing the centre point (in object-centric world coordinates) and the normal, I can derive the plane equation of each image.
My question is: knowing the plane equations of the 2 images, is it possible to compute a homography (or part of the transformation matrix, such as the rotation) between the two?
I get the feeling that this may seem like a very straightforward/obvious answer to someone with deep knowledge of visual geometry but since it's not my strongest point I'd like to double check...
Thanks in advance!
Your "normal" is the direction of the focal axis of the camera.
So, IIUC, you have a 3D point that projects on the image center in both images, which is another way of saying that (absent other information) the motion of the camera consists of the focal axis orbiting about a point on the ground plane, plus an arbitrary rotation about the focal axis, plus an arbitrary translation along the focal axis.
The motion has a non-zero baseline, therefore the transformation between images is generally not a homography. However, the portion of the image occupied by the ground plane does, of course, transform as a homography.
Such a motion is defined by 5 parameters, e.g. the 3 components of the rotation vector for the orbit, plus the angle of rotation about the focal axis, plus the displacement along the focal axis. However, the one point correspondence you have gives you only two equations.
It follows that you don't have enough information to constrain the homography between the images of the ground plane.
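For reference, the homography induced by a world plane is a standard result (see Hartley & Zisserman): writing the plane as n^T X + d = 0 in the first camera's frame, with relative motion (R, t) and intrinsics K,

H = K (R - t n^T / d) K^-1   (up to scale)

so even with the plane parameters n and d known, H still depends on the full R and t, which is exactly where the 5-parameters-versus-2-equations deficit above bites.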

How to access min and max coordinates of a 3D object in C++?

I am creating a game in Qt Creator using C++ and OpenGL, and am attempting to add bounding boxes to my scene in order to implement collision detection. I am using objects imported from Maya as .obj files in my scene, so their dimensions are not set in the code, only their position, rotation and scale. I am able to create a bounding box around each object which matches its position, but am struggling to find a way to access the min and max x, y and z values of the objects in order to match the box to the size of the object.
Does anyone have any ideas on how I could access the min and max coordinates? I know how to implement the code if I could access these values.
The problem you face is that each object geometry has different means of internal storage, and hence of determining a bounding box.
Let's try some examples to illustrate this:
Suppose we have a circle, whose drawing parameters stored internally are the center coordinates x_center and y_center and the radius radius. If you try to determine the bounding box for this object, you'll see that it extends from (x_center - radius, y_center - radius) to (x_center + radius, y_center + radius).
In case you have an unrotated rectangle, given by the two points of its principal diagonal, the bounding box just coincides with its shape, so you only have to give the coordinates of the same two points that represent it.
If, on the other hand, we have a polygon, the bounding box will be determined by the minimum and maximum coordinates of all the polygon vertices. If you allow the polygon to rotate, you'll need to rotate all the vertex coordinates before determining their maximum and minimum values to get the bounding box.
If, for another example, we have a cubic spline determined by the coordinates of its four control points, you'll be determining the maximum and minimum values of two cubic polynomials, which means solving two quadratic equations (after differentiation) in the general case.
To cope with all this, a geometric shape normally includes some means of polymorphically constructing its bounding box via an instance method (the box is normally even cached, so it only has to be recalculated after rotations or changes in position or scale).
Of course, all of this depends on how, and by whom, the shapes' implementation was defined; perhaps your case is simpler than what I'm describing here, but you don't say. You also don't show any code or input/output data, as requested in the How to create a Minimal, Complete, and Verifiable example page, so you should edit your question and add your sample code, which will give more information about your exact problem.
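To illustrate, a minimal sketch of that polymorphic scheme (class and member names are made up for the example, not taken from any particular framework):

#include <algorithm>
#include <utility>
#include <vector>

struct BoundingBox { float xmin, ymin, xmax, ymax; };

struct Shape {
    virtual ~Shape() = default;
    virtual BoundingBox boundingBox() const = 0;  // each shape knows its own extents
};

struct Circle : Shape {
    float cx, cy, r;
    BoundingBox boundingBox() const override {
        return { cx - r, cy - r, cx + r, cy + r };
    }
};

struct Polygon : Shape {
    std::vector<std::pair<float, float>> verts;  // (x, y) vertices, assumed non-empty
    BoundingBox boundingBox() const override {
        BoundingBox b{ verts[0].first, verts[0].second,
                       verts[0].first, verts[0].second };
        for (const auto& v : verts) {
            b.xmin = std::min(b.xmin, v.first);
            b.ymin = std::min(b.ymin, v.second);
            b.xmax = std::max(b.xmax, v.first);
            b.ymax = std::max(b.ymax, v.second);
        }
        return b;
    }
};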
If you have an OBJ loader, then you have an array of vertex data, for example:
#include <cfloat>

// 2100 floats = 700 vertices stored as consecutive (x, y, z) triples.
float t[2100];
float xmax = -FLT_MAX;             // start below any possible coordinate
for (int i = 0; i < 2100; i += 3)  // x components sit at indices 0, 3, 6, ...
{
    if (xmax < t[i]) xmax = t[i];
}
This gives the maximum x of the object; repeat with offsets 1 and 2 (and track minima as well) to get all six extents of the bounding box.

Fit a circle or a spline into a bunch of 3D Points

I have some 3D points that roughly, but clearly, form a segment of a circle. I now have to determine the circle that best fits all the points. I think there has to be some sort of least-squares best fit, but I can't figure out how to start.
The points are sorted the way they would be situated on the circle. I also have an estimated curvature at each point.
I need the radius and the plane of the circle.
I have to work in C/C++ or use an external script.
You could use a Principal Component Analysis (PCA) to map your coordinates from three dimensions down to two dimensions.
Compute the PCA and project your data onto the first two principal components. You can then use any 2D algorithm to find the centre of the circle and its radius. Once these have been found/fitted, you can project the centre back into 3D coordinates.
Since your data is noisy, there will still be some data in the third dimension you squeezed out, but bear in mind that the PCA chooses this dimension so as to minimize the amount of information lost, i.e. by maximizing the variance that is represented in the first two components, so you should be safe.
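As a sketch of this pipeline, assuming the Eigen library is available (Circle3D and fitCircle are illustrative names; the 2D stage here uses the simple algebraic least-squares circle fit, one possible choice among many):

#include <Eigen/Dense>
#include <cmath>
#include <vector>

struct Circle3D { Eigen::Vector3d center, normal; double radius; };

Circle3D fitCircle(const std::vector<Eigen::Vector3d>& pts)
{
    // 1. Centre the data on its mean.
    Eigen::Vector3d mean = Eigen::Vector3d::Zero();
    for (const auto& p : pts) mean += p;
    mean /= static_cast<double>(pts.size());

    Eigen::MatrixXd D(pts.size(), 3);
    for (size_t i = 0; i < pts.size(); ++i)
        D.row(i) = (pts[i] - mean).transpose();

    // 2. PCA: the two dominant right singular vectors span the circle's plane.
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(D, Eigen::ComputeThinV);
    Eigen::Vector3d u = svd.matrixV().col(0);
    Eigen::Vector3d v = svd.matrixV().col(1);

    // 3. Project to 2D and fit a circle algebraically:
    //    x^2 + y^2 = 2ax + 2by + c, which is linear in (a, b, c).
    Eigen::MatrixXd A(pts.size(), 3);
    Eigen::VectorXd rhs(pts.size());
    for (size_t i = 0; i < pts.size(); ++i) {
        double x = D.row(i).dot(u), y = D.row(i).dot(v);
        A(i, 0) = 2 * x; A(i, 1) = 2 * y; A(i, 2) = 1;
        rhs(i) = x * x + y * y;
    }
    Eigen::Vector3d sol = A.colPivHouseholderQr().solve(rhs);

    // 4. Map the 2D centre back to 3D; c = r^2 - a^2 - b^2 gives the radius.
    return { mean + sol(0) * u + sol(1) * v,
             u.cross(v).normalized(),
             std::sqrt(sol(2) + sol(0) * sol(0) + sol(1) * sol(1)) };
}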
A good algorithm for such data fitting is RANSAC (Random Sample Consensus). You can find a good description in the link, so this is just a short outline of the important parts:
In your special case the model would be the 3D circle. To build it up, pick three random non-collinear points from your set, compute the plane they span (cross product), project the random points onto the plane, and then apply the usual 2D circle fitting. With this you get the circle centre, radius and plane equation. Now it's easy to check the support by each of the remaining points. The support may be expressed as the distance from the circle, which consists of two parts: the orthogonal distance from the plane, and the distance from the circle boundary inside the plane.
Edit:
The reason I would prefer RANSAC over ordinary least squares (LS) is its superior stability in the presence of heavy outliers. The following image shows an example comparison of LS vs. RANSAC: the ideal model line is recovered by RANSAC, while the dashed line is produced by LS.
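A bare-bones sketch of that loop, again assuming Eigen (the helper names, tolerance and iteration count are placeholders you would tune for your data):

#include <Eigen/Dense>
#include <cmath>
#include <cstdlib>
#include <vector>

using Eigen::Vector3d;

struct Circle3D { Vector3d center, normal; double radius; };

// Exact circle through three non-collinear points (3D circumcircle).
Circle3D circleFrom3(const Vector3d& a, const Vector3d& b, const Vector3d& c)
{
    Vector3d u = b - a, v = c - a, n = u.cross(v);
    Vector3d center = a + (v.squaredNorm() * n.cross(u)
                         + u.squaredNorm() * v.cross(n)) / (2.0 * n.squaredNorm());
    return { center, n.normalized(), (a - center).norm() };
}

// Support metric from the answer: out-of-plane and in-plane distances combined.
double circleDistance(const Circle3D& c, const Vector3d& p)
{
    Vector3d d = p - c.center;
    double dPlane  = d.dot(c.normal);                            // orthogonal to plane
    double dRadial = (d - dPlane * c.normal).norm() - c.radius;  // within the plane
    return std::sqrt(dPlane * dPlane + dRadial * dRadial);
}

Circle3D ransacCircle(const std::vector<Vector3d>& pts, double tol = 0.01, int iters = 500)
{
    Circle3D best{};
    int bestSupport = -1;
    for (int it = 0; it < iters; ++it) {
        // Draw three distinct sample indices.
        int i = std::rand() % pts.size(), j = std::rand() % pts.size(),
            k = std::rand() % pts.size();
        if (i == j || j == k || i == k) continue;
        // Skip (nearly) collinear samples -- they define no circle.
        if ((pts[j] - pts[i]).cross(pts[k] - pts[i]).squaredNorm() < 1e-12) continue;
        Circle3D cand = circleFrom3(pts[i], pts[j], pts[k]);
        int support = 0;
        for (const auto& p : pts)
            if (circleDistance(cand, p) < tol) ++support;
        if (support > bestSupport) { bestSupport = support; best = cand; }
    }
    return best;
}

Once the best candidate is found, the usual practice is to refit on its inliers with least squares for the final answer.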
The arguably easiest algorithm is called least-squares curve fitting.
You may want to check the math,
or look at similar questions, such as polynomial least squares for image curve fitting
However I'd rather use a library for doing it.

library for beam tracing (beam intersection) on a 3D Polygon model

I want to simulate a laser scanner which emits a laser beam onto a 3D model to measure distance or other features of the model. The 3D model consists of vertices in xyz coordinates and faces; each vertex also has some user-defined features.
The method should be simple. I define a view point and a view vector (i.e. the laser beam); what I need to do is find the first vertex or face intersected by the view vector, then I can measure the distance and evaluate features from the nearest vertices.
Is there any available library or tools to do that?
What you are talking about is, in a very literal sense, ray tracing. The maths and code behind doing this is not particularly complicated, especially if you don't have to consider reflections. There's a tutorial for doing exactly this in C++ here; triangle intersection is almost as simple as sphere intersection, and you can completely ignore the surface properties. If you don't want to write your own code (but seriously, it's maybe a hundred lines to do what you're looking for), there's a hint as to how to get Povray to do what you're after here.
EDIT: More maths, including triangle intersection, is here.
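To give an idea of the scale involved, here is a self-contained sketch of the standard Moller-Trumbore ray/triangle test, which is the core of such a tracer; loop it over all faces and keep the smallest positive t to find the first surface the beam reaches:

#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };

Vec3 sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
Vec3 cross(Vec3 a, Vec3 b) { return { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x }; }
double dot(Vec3 a, Vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Returns the distance t along the ray if it hits the triangle, or nothing.
std::optional<double> intersect(Vec3 orig, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2)
{
    const double kEps = 1e-9;
    Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    double det = dot(e1, p);
    if (std::fabs(det) < kEps) return std::nullopt;  // ray parallel to triangle
    double inv = 1.0 / det;
    Vec3 tv = sub(orig, v0);
    double u = dot(tv, p) * inv;                     // first barycentric coordinate
    if (u < 0.0 || u > 1.0) return std::nullopt;
    Vec3 q = cross(tv, e1);
    double v = dot(dir, q) * inv;                    // second barycentric coordinate
    if (v < 0.0 || u + v > 1.0) return std::nullopt;
    double t = dot(e2, q) * inv;                     // hit distance along the ray
    return t > kEps ? std::optional<double>(t) : std::nullopt;
}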

Coordinate Transformation C++

I have a webcam pointed at a table at a slant and with it I track markers.
I have a transformationMatrix in OpenSceneGraph and its translation part contains the relative coordinates from the tracked Object to the Camera.
Because the camera is pointed at a slant, when I move the marker across the table both the Y and Z values are updated, although all I want updated is the Z value, because the height of the marker doesn't change, only its distance to the camera.
This has the effect that when I project a model on the marker in OpenSceneGraph, the model is slightly off, and when I move the marker around, the Y and Z values are updated incorrectly.
So my guess is I need a transformation matrix with which I multiply each point so that I have a new coordinate system which lies orthogonal to the table surface.
Something like this: A * v1 = v2, with v1 being the camera coordinates and v2 being my "table coordinates".
So what I did was choose 4 points to "calibrate" my system. I placed the marker at the top left corner of the screen and defined v1 as the current camera coordinates and v2 as (0,0,0), and I did that for 4 different points.
Then, taking the linear equations I get from having an unknown matrix and pairs of known vectors, I solved for the matrix.
I thought the values I would get for the matrix would be the ones I needed to multiply the camera coordinates by so that the model would be updated correctly on the marker.
But when I multiply the known camera coordinates I gathered before with the matrix, I didn't get anything close to what my "table coordinates" were supposed to be.
Is my approach completely wrong, or did I just mess something up in the equations? (I solved them with the help of wolframalpha.com.) Is there an easier or better way of doing this?
Any help would be greatly appreciated, as I am kind of lost and under some time pressure :-/
Thanks,
David
when I move the marker across the table both the Y and Z values are updated, although all I want updated is the Z value, because the height of the marker doesn't change, only its distance to the camera.
Only true when your camera's view direction is aligned with your Y axis (or Z axis). If the camera is not aligned with Y, it means the transform will apply a rotation around the X axis, hence modifying both the Y and Z coordinates of the marker.
So my guess is I need a transformation matrix with which I multiply each point so that I have a new coordinate system which lies orthogonal to the table surface.
Yes, it is. After that, you will have 2 transforms:
T_table, to express the marker's coordinates in the table frame of reference;
T_camera, to express table coordinates in the camera frame of reference.
Finding T_camera from a single 2d image is hard because there's no depth information.
This is known as the pose problem; it has been studied by, among others, Daniel DeMenthon, who developed a fast and robust algorithm to find the pose of an object:
articles are available on his research homepage, section 4 "Model Based Object Pose" (in particular "Model-Based Object Pose in 25 Lines of Code", 1995);
code is at the same place, section "POSIT (C and Matlab)".
Note that the OpenCV library offers an implementation of DeMenthon's algorithm. This library also offers a convenient and easy-to-use interface to grab images from a webcam. It's worth a try: OpenCV homepage
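For what it's worth, in current OpenCV versions the usual entry point for this is solvePnP rather than the old POSIT interface. A minimal sketch, assuming a calibrated camera (all numeric values below are placeholders):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

int main()
{
    // Known positions of the four calibration markers on the table (world frame).
    std::vector<cv::Point3f> tablePts = { {0,0,0}, {1,0,0}, {1,1,0}, {0,1,0} };
    // Where the camera saw them, in pixels (placeholder measurements).
    std::vector<cv::Point2f> imagePts = { {110,400}, {400,410}, {390,120}, {100,130} };

    // Intrinsics from a prior calibration (placeholder values).
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                                      0, 800, 240,
                                                      0,   0,   1);
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64F);

    cv::Mat rvec, tvec;  // rotation (Rodrigues vector) and translation
    cv::solvePnP(tablePts, imagePts, cameraMatrix, distCoeffs, rvec, tvec);

    // rvec/tvec map table coordinates into the camera frame; invert this
    // rigid transform to express camera-space points in table coordinates.
    return 0;
}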
If you know the location in the physical world of your four markers and you've recorded the positions as they appear on the camera, you ought to be able to derive some sort of transform.
When you do the calibration, surely you'd want to put the marker at the four corners of the table, not the screen? If you're just doing the corners of the screen, I imagine you're probably not taking into account the slant of the table.
Is the table literally just slanted relative to the camera or is it also rotated at all?