I'm working with modern OpenGL and I need some help with rotating around an arbitrary axis. Basically, when I rotate the 3D model I also need to rotate its collision box. glm handles all the rotation math for the model itself; I just need to be able to rotate the collision box. I tried learning the math from the link below, but I'm having trouble. Can anyone help with this? https://www.siggraph.org/education/materials/HyperGraph/modeling/mod_tran/3drota.htm
// Per-axis overlap test between this object's box and the model's box
// (assuming aX/aY/aZ hold the max corner and ax/ay/az the min corner).
if (model->collision_box->aX.x >= this->collision_box->ax.x && this->collision_box->aX.x > model->collision_box->ax.x) {
    if (model->collision_box->aY.y >= this->collision_box->ay.y && this->collision_box->aY.y > model->collision_box->ay.y) {
        if (model->collision_box->aZ.z >= this->collision_box->az.z && this->collision_box->aZ.z > model->collision_box->az.z) {
            mx = 0; // stop movement
            my = 0;
            mz = 0;
        }
    }
}
Bump: by the way, the 8 points of the collision box are stored as vec3 points so they can be rotated; it's not just min/max. Also, even if I used convex collision boxes I would still need to rotate around an arbitrary axis!
When you have an object whose bounding box is defined as the min and max of its vertices, what you have is an Axis-Aligned Bounding Box (AABB). It has to be updated as the object moves or rotates, like so: the box stays aligned to the primary axes. If instead you rotate the box along with the object, you have an Object-Aligned Bounding Box (OOBB), and you don't need to update its dimensions unless the object's vertices are modified (or it's scaled).
Here's an example of an AABB vs an OOBB.
An AABB is more efficient to intersect, but OOBBs are easier to maintain for dynamic objects. For first-order tests, you can also use spherical bounding volumes (which are rotation invariant).
PS: Credit for the GIF, taken from the video here
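If you want to keep an AABB-style test while still rotating the model, one option is to rotate the box's 8 corner points with the same glm rotation you apply to the model and then recompute the axis-aligned min/max from them. A minimal sketch, assuming the corners are stored as glm::vec3 (the struct and function names here are made up):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>
#include <limits>

// Hypothetical layout: 8 corners plus cached axis-aligned bounds.
struct CollisionBox {
    glm::vec3 corners[8];
    glm::vec3 aabbMin, aabbMax;
};

// Rotate the box by angleRadians around an arbitrary axis passing through pivot,
// then refresh the axis-aligned bounds from the rotated corners.
void rotateCollisionBox(CollisionBox &box, float angleRadians,
                        const glm::vec3 &axis, const glm::vec3 &pivot)
{
    glm::mat4 m(1.0f);
    m = glm::translate(m, pivot);
    m = glm::rotate(m, angleRadians, axis);
    m = glm::translate(m, -pivot);

    box.aabbMin = glm::vec3( std::numeric_limits<float>::max());
    box.aabbMax = glm::vec3(-std::numeric_limits<float>::max());
    for (int i = 0; i < 8; ++i) {
        box.corners[i] = glm::vec3(m * glm::vec4(box.corners[i], 1.0f));
        box.aabbMin = glm::min(box.aabbMin, box.corners[i]);
        box.aabbMax = glm::max(box.aabbMax, box.corners[i]);
    }
}

If you instead keep the box object-aligned (OOBB), you only rotate the corners and skip the min/max step, but then a simple per-axis overlap test like the one in the question no longer applies and you need something like a separating-axis test.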
For my undergraduate paper I am working on an iPhone application using OpenCV to detect domino tiles. The detection works well at close range, but when the camera is angled, the tiles far away are difficult to detect.
My approach to solving this would be to do some spatial calculations. For this I would need to convert a 2D pixel value into world coordinates, calculate a new 3D position with a vector, convert those coordinates back to 2D, and then check the colour/shape at that position.
Additionally I would need to know the 3D positions for Augmented Reality additions.
The camera matrix I got through this link: create opencv camera matrix for iPhone 5 solvepnp
The rotation matrix of the camera I get from Core Motion.
Using ArUco markers would be my last resort, as I wouldn't get the desired effect that I need for the paper.
Now my question is: can I not make those calculations when I know the locations and distances of the circles on, let's say, a tile with a 5 on it?
I wouldn't need measurements in mm/inches; I can live with vectors without units.
The camera needs to be able to be rotated freely.
I tried to invert the projection equation s·m' = A·[R|t]·M' to be able to go from 2D coordinates back to 3D. But I am stuck on inverting [R|t], even on paper, and I don't know how I would do that in Swift or C++ either.
I have read so many different posts on forums, in books, etc., and I am completely stuck, so I appreciate any help/input you can give me. Otherwise I'm screwed.
Thank you so much for your help.
Update:
By using solvePnP, as suggested by Micka, I was able to get the rotation and translation vectors for the angle of the camera.
Meaning that if you are able to identify multiple 2D points in your image and know their respective 3D world coordinates (in mm, cm, inches, ...), then you have the means to project points from known 3D world coordinates onto the corresponding 2D coordinates in your image (use the OpenCV projectPoints function).
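For reference, a minimal sketch of that projection step with cv::projectPoints; the world-point values here are placeholders, and cameraMatrix/distCoeffs/rvec/tvec are assumed to come from your calibration and solvePnP call:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

// Project known 3D world points into the image using the pose returned by solvePnP.
void projectKnownPoints(const cv::Mat &cameraMatrix, const cv::Mat &distCoeffs,
                        const cv::Mat &rvec, const cv::Mat &tvec)
{
    // Placeholder world coordinates, in the same units used when calling solvePnP.
    std::vector<cv::Point3f> worldPoints = {
        { 0.0f,  0.0f, 0.0f},
        {10.0f,  0.0f, 0.0f},
        { 0.0f, 10.0f, 0.0f}
    };

    std::vector<cv::Point2f> imagePoints;
    cv::projectPoints(worldPoints, rvec, tvec, cameraMatrix, distCoeffs, imagePoints);

    for (const auto &p : imagePoints)
        std::cout << p << std::endl;   // pixel coordinates of each world point
}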
What is up next for me to solve is the transformation from 2D into 3D coordinates, where I need to follow ozlsn's approach with the inverse of the matrices obtained from solvePnP.
Update 2:
With a top-down view I am getting along quite well and am able to detect the tiles and their positions in the 3D world:
tile from top Down
However, if I angle the view, my calculations no longer work. For example, I check the bottom edge of a 9-dot group and the center of the black division bar for 90° angles. If Corner1 -> Middle Edge -> Bar Center and Corner2 -> Middle Edge -> Bar Center are both 90° angles, then the bar in the middle is found and the position of the tile can be determined.
When the view is angled, these angles are shifted by the perspective to, let's say, 130° and 50°. (I'll provide an image later.)
The idea I have now is to run solvePnP on 4 points (bottom edge plus middle) and then rotate the needed dots and the center bar from their 2D positions to 3D positions (height should be irrelevant?). Then I could check with the transformed points whether the angles are 90° and also do the other needed distance calculations.
Here is an image of what I am trying to accomplish:
Markings for Problem
I first find the 9 dots and arrange them. For each edge I try to find the black bar. As said above, seen from the top, the angle from the blue corner via the green middle edge to the yellow bar center is 90°.
However, as the camera is angled, the angle is no longer 90°. I also cannot simply check whether the two angles add up to 180°; that would give me false positives.
So I wanted to do the following steps:
Detect the center
Detect the edges (3 dots)
Run solvePnP with those 4 points
Rotate the edge and the center points (coordinates) to 3D positions
Measure the angles (check whether both are 90°)
Now I wonder how I can transform the 2D coordinates of those points to 3D. I don't care about absolute distances, as I only compute them relative to each other (like 1.4 times the Middle-Edge distance), etc.; if I could measure the distances in mm, that would be even better, though, and would give me better results.
With solvePnP I get the rvec, which I could turn into a rotation matrix (with Rodrigues(), I believe). To measure the angles, my understanding is that I don't need to apply the translation (tvec) from solvePnP.
This leads to my last question: when using the iPhone, can't I use the angles from the motion sensors to build the rotation matrix beforehand and only use that to rotate each tile to show it from the top? I feel that this would save me a lot of CPU time, since I wouldn't have to run solvePnP for each tile (there can be up to about 100 tiles).
Find Homography
// Image coordinates of the four reference points on the tile
vector<Point2f> tileDots;
tileDots.push_back(corner1);
tileDots.push_back(edgeMiddle);
tileDots.push_back(corner2);
tileDots.push_back(middle.Dot->ellipse.center);

// Corresponding "real life" (top-down) coordinates of the same four points
vector<Point2f> realLivePos;
realLivePos.push_back(Point2f(5.5, 19.44));
realLivePos.push_back(Point2f(12.53, 19.44));
realLivePos.push_back(Point2f(19.56, 19.44));
realLivePos.push_back(Point2f(12.53, 12.19));

Mat M = findHomography(tileDots, realLivePos, CV_RANSAC);
cout << "M = " << endl << " " << M << endl << endl;

// Points to warp into the top-down view (the same four plus the bar candidate)
vector<Point2f> barPerspective;
barPerspective.push_back(corner1);
barPerspective.push_back(edgeMiddle);
barPerspective.push_back(corner2);
barPerspective.push_back(middle.Dot->ellipse.center);
barPerspective.push_back(possibleBar.center);

vector<Point2f> barTransformed;
if (countNonZero(M) < 1)
{
    cout << "No Homography found" << endl;
}
else
{
    perspectiveTransform(barPerspective, barTransformed, M);
}
This however gives me wrong values, and I no longer know where to look (I can't see the forest for the trees anymore).
Image Coordinates https://i.stack.imgur.com/c67EH.png
World Coordinates https://i.stack.imgur.com/Im6M8.png
Points to Transform https://i.stack.imgur.com/hHjBM.png
Transformed Points https://i.stack.imgur.com/P6lLS.png
You see I am even too stupid to post 4 images here??!!?
The 4th index item should be at x 2007, y 717.
I don't know what I am doing wrong here.
Update 3:
I found the following post, Computing x,y coordinate (3D) from image point, which does exactly what I need. Maybe there is a faster way to do it, but I was not able to find one. At the moment I can do the checks, but I still need to test whether the algorithm is robust enough.
Result with SolvePnP to find bar Center
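For reference, a hedged sketch of the back-projection from that linked post: intersect the pixel's viewing ray with the known tile/table plane Z = zConst. It assumes cameraMatrix, rvec and tvec are CV_64F (as returned by calibration/solvePnP) and that the pixel is already undistorted; names are placeholders.

#include <opencv2/opencv.hpp>

// Map an image pixel to the 3D point where its viewing ray hits the plane Z = zConst.
cv::Point3d imagePointToPlane(const cv::Point2d &pixel, double zConst,
                              const cv::Mat &cameraMatrix,
                              const cv::Mat &rvec, const cv::Mat &tvec)
{
    cv::Mat R;
    cv::Rodrigues(rvec, R);                       // rvec -> 3x3 rotation matrix

    cv::Mat uv = (cv::Mat_<double>(3, 1) << pixel.x, pixel.y, 1.0);
    cv::Mat Rinv = R.inv();
    cv::Mat Ainv = cameraMatrix.inv();

    // From s * A^-1 * uv = R * M + t, find the scale s that puts M on the plane Z = zConst.
    cv::Mat left  = Rinv * Ainv * uv;
    cv::Mat right = Rinv * tvec;
    double s = (zConst + right.at<double>(2, 0)) / left.at<double>(2, 0);

    cv::Mat world = Rinv * (s * Ainv * uv - tvec);
    return cv::Point3d(world.at<double>(0, 0),
                       world.at<double>(1, 0),
                       world.at<double>(2, 0));
}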
The matrix [R|t] is not square, so by definition you cannot invert it. However, this matrix lives in projective space, which is nothing but an extension of R^n (Euclidean space) with a '1' appended as the (n+1)-st element. For compatibility, the matrices that multiply vectors of the projective space get an extra bottom row so that the lower-right corner is a '1'. That is, R becomes

[R|0]
[0|1]

In your case [R|t] becomes

[R|t]
[0|1]

and you can take its inverse, which reads

[R'|-R't]
[0 |  1 ]

where ' denotes the transpose. The portion that you need is the top (block) row.
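In code, that inverse has a closed form you can build directly; a small sketch with OpenCV, assuming R and t are CV_64F (e.g. R obtained from cv::Rodrigues(rvec, R)):

#include <opencv2/opencv.hpp>

// Invert the rigid transform [R|t] using [R'| -R't] (R' = transpose of R).
cv::Mat invertRt(const cv::Mat &R, const cv::Mat &t)   // R: 3x3, t: 3x1, both CV_64F
{
    cv::Mat Rt = R.t();                    // for a rotation matrix, inverse == transpose
    cv::Mat invT = -(Rt * t);

    cv::Mat M = cv::Mat::eye(4, 4, CV_64F);
    Rt.copyTo(M(cv::Rect(0, 0, 3, 3)));    // top-left 3x3 block
    invT.copyTo(M(cv::Rect(3, 0, 1, 3)));  // top-right 3x1 column
    return M;                              // the top three rows are the [R'| -R't] you need
}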
Since the phone translates in 3D space, you need the depth of the pixel under consideration. This means that the answer to your question about whether you need distances in mm/inches is yes. The answer changes only if you can assume that the ratio of camera translation to depth is very small; this is called a weak-perspective camera. The problem you are trying to tackle is not an easy one; people are still researching it at the PhD level.
I am creating a game in Qt Creator using C++ and OpenGL and am attempting to add bounding boxes to my scene in order to implement collision detection. I am using objects imported from Maya as .obj files in my scene, so their dimensions are not set in the code, only their position, rotation and scale. I am able to create a bounding box around each object which matches its position, but I am struggling to find a way to access the min and max x, y and z values of the objects in order to match the box to the size of the object.
Does anyone have any ideas on how I could access the min and max coordinates? I know how to implement the code if I could access these values.
The problem you face is that each object geometry has different means of internal storage and of determining a bounding box.
Let's try some examples to illustrate this:
Suppose we have a circle whose drawing parameters, stored internally, are the center coordinates x_center and y_center and the radius radius. If you determine the bounding box for this object, you'll see that it extends from (x_center - radius, y_center - radius) to (x_center + radius, y_center + radius).
In the case of an unrotated rectangle, given by the two points of its principal diagonal, the bounding box just coincides with its shape, so you only have to give the coordinates of the same two points that represent it.
If, on the other hand, we have a polygon, the bounding box is determined by the minimum and maximum coordinates over all the polygon vertices. If you allow the polygon to be rotated, you'll need to rotate all the vertex coordinates before determining their maximum and minimum values to get the bounding box.
If, as another example, we have a cubic spline determined by the coordinates of its four control points, you'll be determining the maximum and minimum values of two cubic polynomials, which means solving two quadratic equations (after differentiation) in the general case.
To cope with all this, a geometric shape normally includes some means of polymorphically constructing its bounding box via an instance method (the box is normally even cached, so it only has to be recalculated after rotations or changes in position or scale).
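For the vertex-based cases above, the actual computation is just a component-wise min/max over the (transformed) vertices. A minimal sketch, with Vec3 standing in for whatever vertex type your .obj loader produces:

#include <vector>
#include <algorithm>
#include <limits>

struct Vec3 { float x, y, z; };          // stand-in for your loader's vertex type
struct AABB { Vec3 min, max; };

// Component-wise min/max over a set of (already transformed) vertices.
AABB computeAABB(const std::vector<Vec3> &vertices)
{
    const float inf = std::numeric_limits<float>::max();
    AABB box{ {  inf,  inf,  inf },
              { -inf, -inf, -inf } };
    for (const Vec3 &v : vertices) {
        box.min.x = std::min(box.min.x, v.x);  box.max.x = std::max(box.max.x, v.x);
        box.min.y = std::min(box.min.y, v.y);  box.max.y = std::max(box.max.y, v.y);
        box.min.z = std::min(box.min.z, v.z);  box.max.z = std::max(box.max.z, v.z);
    }
    return box;
}

If the object is positioned/rotated/scaled in code, apply that same model transform to each vertex (or to the 8 corners of the untransformed box) before taking the min/max.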
Of course, all of this depends on how (and by whom) the shapes have been implemented. Perhaps your case is simpler than what I'm describing here, but you don't say. You also don't show any code or input/output data, as described in the How to create a Minimal, Complete, and Verifiable example page. So you'd better edit your question and add your sample code; that will give more information about your exact problem.
If you have an OBJ loader, then you have an array of vertex data, for example:

// t holds the vertex data as consecutive (x, y, z) triples,
// so the x coordinates sit at indices 0, 3, 6, ...
float t[2100];

float xmax = t[0];
for (int i = 3; i < 2100; i += 3)
{
    if (t[i] > xmax) xmax = t[i];
}

That gives you the maximum x of the object; the minima and the other axes work the same way.
I'm making a Minecraft clone as my very first OpenGL project and am stuck at the box selection part. What would be the best method to make a reliable box selection?
I've been going through some AABB algorithms, but none of them explain well enough what they exactly do (especially the super tweaked ones) and I don't want to use stuff I don't understand.
Since the world is composed of cubes, I used octrees to take some strain off the ray-cast calculations; basically the only thing I need is this function:
float cube_intersect(Vector ray, Vector origin, Vector min, Vector max)
{
//???
}
The ray and origin are easily obtained with
Vector ray, origin, point_far;
double mx, my, mz;

// Unproject the screen centre at the far plane (winZ = 1.0) and at the near plane (winZ = 0.0)
gluUnProject(viewport[2]/2, viewport[3]/2, 1.0, (double*)modelview, (double*)projection, viewport, &mx, &my, &mz);
point_far = Vector(mx, my, mz);

gluUnProject(viewport[2]/2, viewport[3]/2, 0.0, (double*)modelview, (double*)projection, viewport, &mx, &my, &mz);
origin = Vector(mx, my, mz);

ray = point_far - origin;
min and max are the opposite corners of a cube.
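For reference, one common way to fill in that function is the classic "slab" test: clip the ray against the three pairs of axis-aligned planes and keep the overlapping t interval. A sketch under the assumption that Vector has public x/y/z members; t is expressed in units of the (unnormalized) ray vector:

#include <algorithm>
#include <cmath>
#include <limits>
#include <utility>

// Returns the ray parameter t of the entry point into the box, or -1.0f on a miss.
float cube_intersect(Vector ray, Vector origin, Vector min, Vector max)
{
    float tmin = -std::numeric_limits<float>::max();
    float tmax =  std::numeric_limits<float>::max();

    const float dir[3]  = { ray.x, ray.y, ray.z };
    const float orig[3] = { origin.x, origin.y, origin.z };
    const float lo[3]   = { min.x, min.y, min.z };
    const float hi[3]   = { max.x, max.y, max.z };

    for (int axis = 0; axis < 3; ++axis) {
        if (std::fabs(dir[axis]) < 1e-8f) {
            // Ray parallel to this slab: miss if the origin lies outside it.
            if (orig[axis] < lo[axis] || orig[axis] > hi[axis])
                return -1.0f;
        } else {
            float t1 = (lo[axis] - orig[axis]) / dir[axis];
            float t2 = (hi[axis] - orig[axis]) / dir[axis];
            if (t1 > t2) std::swap(t1, t2);
            tmin = std::max(tmin, t1);
            tmax = std::min(tmax, t2);
            if (tmin > tmax) return -1.0f;   // slab intervals no longer overlap
        }
    }
    if (tmax < 0.0f) return -1.0f;           // box entirely behind the ray origin
    return (tmin >= 0.0f) ? tmin : tmax;     // tmin < 0 means the origin is inside the box
}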
I'm not even sure this is the right way to do this, considering the number of cubes I'd have to check, even with octrees.
I've also tried gluProject, it works, but is very unreliable and doesn't give me the selected face of the cube.
EDIT
So this is what I've done: calculate positions in space along the ray:
float t = 0;
for (int i = 0; i < 10; i++)
{
    Vector p = ray * t + origin;

    while (/* visible octree */)
    {
        if (/* p inside octree */)
        {
            // then call a recursive function until a cube is found
            break;
        }
        octree = octree->next;
    }

    if (/* found a cube */)
    {
        break;
    }

    t += .5;
}
It's actually surprisingly fast and stops after the first cube found.
As you can see, the ray has to go through multiple octrees before it finds a cube (actually a position in space); there is a crosshair in the middle of the screen. The lower the increment step, the more precise the selection, but also the slower it gets.
Working with boxes as primitives is overkill in memory requirements and processing power.
Cubes are fine for rendering and even there you can find more advanced algorithms that give you a better final image (Marching cubes). Minecraft's graphics are very primitive in this sense as voxel rendering has been around for a long time and significant progress has been made.
Basically you should exploit the fact that all your boxes are equally spaced and of the same size. These are called voxels.
Raycasting in a grid is trivial compared to what you have now (a broad-phase octree and a narrow-phase AABB test). I suggest you read up on voxels and voxel-set collision detection/raycasting; you will find algorithms that are both easier to implement and faster.
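For example, a grid ray walk in the style of Amanatides & Woo steps voxel by voxel along the ray and stops at the first solid cell. This is a sketch only, with isSolid() standing in for your own world lookup and the ray direction assumed to be normalized (1 voxel == 1 world unit):

#include <cmath>
#include <limits>

// Walk the voxel grid along a ray; returns true and the hit cell if a solid voxel is found
// within maxDistance. (ox, oy, oz) is the ray origin, (dx, dy, dz) the normalized direction.
bool raycastVoxels(double ox, double oy, double oz,
                   double dx, double dy, double dz,
                   double maxDistance,
                   bool (*isSolid)(int, int, int),
                   int &hitX, int &hitY, int &hitZ)
{
    int x = (int)std::floor(ox), y = (int)std::floor(oy), z = (int)std::floor(oz);
    const int stepX = dx > 0 ? 1 : -1, stepY = dy > 0 ? 1 : -1, stepZ = dz > 0 ? 1 : -1;

    const double inf = std::numeric_limits<double>::infinity();
    // Ray parameter at which we cross the next voxel boundary on each axis, and per-cell increments.
    double tMaxX = dx != 0 ? ((x + (stepX > 0 ? 1 : 0)) - ox) / dx : inf;
    double tMaxY = dy != 0 ? ((y + (stepY > 0 ? 1 : 0)) - oy) / dy : inf;
    double tMaxZ = dz != 0 ? ((z + (stepZ > 0 ? 1 : 0)) - oz) / dz : inf;
    double tDeltaX = dx != 0 ? std::fabs(1.0 / dx) : inf;
    double tDeltaY = dy != 0 ? std::fabs(1.0 / dy) : inf;
    double tDeltaZ = dz != 0 ? std::fabs(1.0 / dz) : inf;

    double t = 0.0;
    while (t <= maxDistance) {
        if (isSolid(x, y, z)) { hitX = x; hitY = y; hitZ = z; return true; }
        // Advance to whichever voxel boundary is closest.
        if (tMaxX < tMaxY && tMaxX < tMaxZ) { x += stepX; t = tMaxX; tMaxX += tDeltaX; }
        else if (tMaxY < tMaxZ)             { y += stepY; t = tMaxY; tMaxY += tDeltaY; }
        else                                { z += stepZ; t = tMaxZ; tMaxZ += tDeltaZ; }
    }
    return false;
}

Unlike the fixed-step marching in the edit above, this never skips a cell and never tests the same cell twice, and it also tells you which face was crossed last (the axis you stepped on), which is handy for placing blocks.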
You don't need any explicit octree structure in memory. All that is needed is a byte[,,]. Just generate the 8 children of a box lazily during the search (like a chess engine generates child game states).
I'd also argue that you don't have to rely on actual raycasts to determine what to render. Given that you are in a predefined grid formation, you really aren't held hostage to "exactly visible" requirements. If you track your camera's position and maintain an NSWE compass of sorts, you can also use that to determine whether the GPU buffer should even consider an array of vertices for rendering.
I cover this theory in detail here https://stackoverflow.com/a/18028366/94167
But using an octree plus the camera position and the camera's distance/bounds, you basically know what the user is seeing without having to resort to exact ray traces. If you can consolidate your triangles into larger ones for rendering purposes and then use a texture to break the large surface visually back into a cubic form (a sleight of hand), you reduce your vertex count significantly. Then it's a matter of just rendering the hull, and by keeping track of your camera's direction and its xyz position you can get away with letting in some faces that shouldn't be displayed, as that has minimal impact on performance (especially if your shaders also do some of the work).
I'm experimenting further with tracking the centre point of the camera to determine its horizontal focal point; from that you can determine the angle, which in turn determines the depth of the chunk you're likely to see in the direction it faces.
I have a webcam pointed at a table at a slant and with it I track markers.
I have a transformationMatrix in OpenSceneGraph and its translation part contains the relative coordinates from the tracked Object to the Camera.
Because the camera is pointed at a slant, when I move the marker across the table both the Y and Z values are updated, although all I want updated is the Z axis, because the height of the marker doesn't change, only its distance to the camera.
This has the effect that when I project a model onto the marker in OpenSceneGraph, the model is slightly off, and when I move the marker around the Y and Z values are updated incorrectly.
So my guess is that I need a transformation matrix by which I multiply each point, so that I get a new coordinate system which lies orthogonal on the table surface.
Something like this: A * v1 = v2, with v1 being the camera coordinates and v2 being my "table coordinates".
So what I did was choose 4 points to "calibrate" my system. I placed the marker at the top-left corner of the screen, defined v1 as the current camera coordinates and v2 as (0,0,0), and did that for 4 different points.
Then, taking the linear equations I get from having an unknown matrix and two known vectors per point, I solved for the matrix.
I thought the values I would get for the matrix would be the ones I need to multiply the camera coordinates by so that the model would be placed correctly on the marker.
But when I multiply the known camera coordinates I gathered before by the matrix, I don't get anything close to what my "table coordinates" were supposed to be.
Is my approach completely wrong, or did I just mess something up in the equations? (I solved them with the help of wolframalpha.com.) Is there an easier or better way of doing this?
Any help would be greatly appreciated, as I am kind of lost and under some time pressure :-/
Thanks,
David
when I move the marker across the table the Y and Z axis is updated, although all I want to be updated is the Z axis, because the height of the marker doesn't change, only its distance to the camera.
Only true when your camera's view direction is aligned with your Y axis (or Z axis). If the camera is not aligned with Y, it means the transform will apply a rotation around the X axis, hence modifying both the Y and Z coordinates of the marker.
So my guess is I need a Transformation Matrix with which I multiply each point so that I have a new coordinate System which lies orthogonal on the table surface.
Yes it is. After that, you will have 2 transforms:
T_table to express marker's coordinates in the table referential,
T_camera to express table coordinates in the camera referential.
Finding T_camera from a single 2d image is hard because there's no depth information.
This is known as the Pose problem; it has been studied by, among others, Daniel DeMenthon. He developed a fast and robust algorithm to find the pose of an object:
articles are available on his research homepage, section 4 "Model Based Object Pose" (in particular "Model-Based Object Pose in 25 Lines of Code", 1995);
code at the same place, section "POSIT (C and Matlab)".
Note that the OpenCV library offers an implementation of DeMenthon's algorithm. This library also offers a convenient and easy-to-use interface to grab images from a webcam. It's worth a try: OpenCV homepage
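As a concrete (hedged) sketch of that OpenCV route with the current API: cv::solvePnP recovers the table pose from four or more known table points seen in the image. The coordinate values below are placeholders, and cameraMatrix/distCoeffs come from camera calibration.

#include <opencv2/opencv.hpp>
#include <vector>

// Estimate the camera<->table pose from four known table points seen in the image.
// imageCorners must be in the same order as tableCorners.
void estimateTablePose(const std::vector<cv::Point2f> &imageCorners,
                       const cv::Mat &cameraMatrix, const cv::Mat &distCoeffs)
{
    // The four table corners in a table-fixed frame (Z = 0 on the table surface).
    std::vector<cv::Point3f> tableCorners = {
        {  0.0f,  0.0f, 0.0f}, {100.0f,  0.0f, 0.0f},
        {100.0f, 60.0f, 0.0f}, {  0.0f, 60.0f, 0.0f}
    };

    cv::Mat rvec, tvec;
    cv::solvePnP(tableCorners, imageCorners, cameraMatrix, distCoeffs, rvec, tvec);

    cv::Mat R;
    cv::Rodrigues(rvec, R);   // R and tvec express table coordinates in the camera frame (T_camera)
}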
If you know the location in the physical world of your four markers and you've recorded the positions as they appear on the camera, you ought to be able to derive some sort of transform.
When you do the calibration, surely you'd want to put the marker at the four corners of the table, not the screen? If you're just using the corners of the screen, I imagine you're probably not taking into account the slant of the table.
Is the table literally just slanted relative to the camera or is it also rotated at all?
I am writing a program that will draw a solid along the curve of a spline. I am using Visual Studio 2005 and writing in C++ for OpenGL. I'm using FLTK (the Fast Light Toolkit) to open my windows.
I currently have an algorithm that will draw a cardinal cubic spline, given a set of control points, by breaking the intervals between the points into subintervals and drawing line segments between these sub-points. The number of subintervals is variable.
The line-drawing code works wonderfully, and basically works as follows: I generate a set of points along the spline curve using the spline equation and store them in an array (as a special data structure called Pnt3f, where the coordinates are 3 floats and there are some handy functions such as distance, length, dot and cross product). Then I have a single loop that iterates through the array of points and draws them like so:
glBegin(GL_LINE_STRIP);
for (pt = 0; pt <= numsubsegements; ++pt) {
    glVertex3fv(points[pt].v()); // points[] stands for the array of precomputed Pnt3f spline samples
}
glEnd();
As stated, this code works great. Now what I want to do is, instead of drawing a line, extrude a solid. My current exploration is using a 'cylinder' quadric to create a tube along the line. This is a bit trickier, as I have to orient OpenGL in the direction I want to draw the cylinder. My idea is to do this:
Pseudocode:
Push the current matrix,
translate to the first control point
rotate to face the next point
draw a cylinder (length = distance between the points)
Pop the matrix
repeat
My problem is getting the angles between the points. I only need yaw and pitch; roll isn't important. I know that taking the arc-cosine of the dot product of two vectors divided by the product of their magnitudes returns the angle between them, but that is not something I can feed to OpenGL to rotate with. I've tried doing this in 2D, using the XZ plane to get the X rotation and making the points vectors from the origin, but it does not return the correct angle.
My current approach is much simpler. For each plane of rotation (X and Y), I find the angle by:
arc-cosine( (difference in 'x' values) / distance between the points )
The 'x' value depends on how you set your plane up, though for my calculations I always use world x.
Barring a few issues with making it draw in the correct quadrant, which I haven't worked out yet, I'd like advice on whether this is a good implementation, or whether someone knows a better way.
You are correct in forming two vectors from the three points of two adjacent line segments and then using the arc-cosine of the dot product to get the angle between them. To make use of this angle you need to determine the axis around which the rotation should occur: take the cross product of the same two vectors to get that axis. You can then build a transformation matrix using this axis-angle pair, or pass them as parameters to glRotate.
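A sketch of that advice applied to orienting the quadric (gluCylinder extrudes along +Z, so rotate from +Z to the segment direction). The Pnt3f member and method names used here (x, y, z, dot, cross, length, operator-) are assumptions; adjust them to your actual class:

// Draw one cylinder segment from `from` to `to`, oriented via the axis-angle pair
// described above (cross product for the axis, arc-cosine of the dot product for the angle).
void drawSegmentCylinder(GLUquadric *quad, const Pnt3f &from, const Pnt3f &to, float radius)
{
    Pnt3f dir = to - from;
    Pnt3f zAxis(0.0f, 0.0f, 1.0f);

    float len = dir.length();
    float cosAngle = zAxis.dot(dir) / len;                    // |zAxis| == 1
    float angleDeg = acosf(cosAngle) * 180.0f / 3.14159265f;
    Pnt3f axis = zAxis.cross(dir);                            // degenerate if dir is parallel to Z;
                                                              // guard for that in real code

    glPushMatrix();
    glTranslatef(from.x, from.y, from.z);
    glRotatef(angleDeg, axis.x, axis.y, axis.z);
    gluCylinder(quad, radius, radius, len, 16, 1);
    glPopMatrix();
}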
A few notes:
first of all, this:
for(pt = 0; pt<=numsubsegements ; ++pt) {
glBegin(GL_LINE_STRIP);
glVertex3fv(pt.v());
}
glEnd();
is not a good way to draw anything. You MUST have one glEnd() for every single glBegin(); you probably want to move the glBegin() out of the loop. The fact that this works is pure luck.
second thing
My current exploration is using a 'cylinder' quadric to create a tube along the line
This will not work as you expect. The 'cylinder' quadric has a flat top base and a flat bottom base. Even if you succeed in making the correct rotations according to the spline, the edges of the flat tops are going to poke out of the volume of your intended tube and it will not be smooth. You can try it in 2D with just a pen and paper: try to draw a smooth tube using only shorter tubes with flat bases. It is impossible.
Third, to your actual question: the definitive tool for such rotations is quaternions. It's a bit complex to explain in this scope, but you can find plentiful information anywhere you look.
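To make the quaternion idea concrete, a minimal standalone sketch: build a quaternion from the axis-angle pair discussed above and hand the equivalent matrix to OpenGL. In practice you would use an existing class (libQGLViewer's Quaternion, glm::quat, ...) rather than rolling your own:

#include <cmath>
#include <GL/gl.h>

// Bare-bones quaternion, just enough to show the idea.
struct Quat { float w, x, y, z; };

// Build a unit quaternion from a (non-zero) rotation axis and an angle in radians.
Quat quatFromAxisAngle(float ax, float ay, float az, float angleRad)
{
    float len = std::sqrt(ax * ax + ay * ay + az * az);
    float s = std::sin(angleRad * 0.5f) / len;
    return { std::cos(angleRad * 0.5f), ax * s, ay * s, az * s };
}

// Multiply the current OpenGL matrix by the rotation the quaternion represents.
void multiplyCurrentMatrix(const Quat &q)
{
    // Column-major 4x4 rotation matrix equivalent to the unit quaternion q.
    const float m[16] = {
        1 - 2 * (q.y * q.y + q.z * q.z), 2 * (q.x * q.y + q.w * q.z),     2 * (q.x * q.z - q.w * q.y),     0,
        2 * (q.x * q.y - q.w * q.z),     1 - 2 * (q.x * q.x + q.z * q.z), 2 * (q.y * q.z + q.w * q.x),     0,
        2 * (q.x * q.z + q.w * q.y),     2 * (q.y * q.z - q.w * q.x),     1 - 2 * (q.x * q.x + q.y * q.y), 0,
        0,                               0,                               0,                               1
    };
    glMultMatrixf(m);
}

The payoff over raw Euler angles is that quaternions compose cleanly and interpolate smoothly (slerp) along the spline, avoiding gimbal-lock-style surprises between segments.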
If you had used Qt instead of FLTK you could also have used libQGLViewer. It has an integrated Quaternion class which would save you the implementation. If you still have a choice, I strongly recommend moving to Qt.
Have you considered gluLookAt? Put your control point as the eye point, the next point as the reference point, and make the up vector perpendicular to the difference between the two.