C++ Collision Detection causing objects to disappear - c++

I am currently working on some basic 2D rigid-body physics and have run into an issue. I have a function that checks for a collision between a circle and an AABB, but sometimes the circle (in this case the player) will collide and then disappear, and if I print out the position when this happens I just get "nan".
bool Game::Physics::RigidBody2D::CircleAABB(RigidBody2D& body)
{
sf::Vector2f diff = m_Position - body.m_Position;
sf::Vector2f halfExtents = sf::Vector2f(body.m_Size.x / 2.0f, body.m_Size.y / 2.0f);
sf::Vector2f diffContrained = diff;
if (diff.x > halfExtents.x)
{
diffContrained.x = halfExtents.x;
}
else if (diff.x < -halfExtents.x)
{
diffContrained.x = -halfExtents.x;
}
if (diff.y > halfExtents.y)
{
diffContrained.y = halfExtents.y;
}
else if (diff.y < -halfExtents.y)
{
diffContrained.y = -halfExtents.y;
}
sf::Vector2f colCheck = diff - diffContrained;
sf::Vector2f VDirNorm = NormVector(colCheck);
sf::Vector2f colToPlayer = NormVector(m_Position - (diffContrained + body.m_Position));
float dist = getMagnitude(colCheck) - m_fRadius;
//std::cout << dist << std::endl;
if (dist < 0)
{
OnCollision((diffContrained + body.m_Position) - m_Position);
m_Position += (VDirNorm * abs(dist));
body.m_Position -= (VDirNorm * abs(dist))* (1.0f - body.m_fMass);
return true; //Collision has happened
}
return false;
}
This happens seemingly at random and with no clear cause. It seems to happen more often when the circle is moving fast, but it can also happen when it is moving slowly, and once or twice it has happened when it was not moving at all.
An added note: I apply gravity to the Y velocity and, on collision, set the velocity of the corresponding axis to 0.
So my question is, is something clearly wrong here to those with more physics experience than me?
Note: Using SFML for drawing and Vector2 class physics code is all mine.
EDIT: The OnCollision function checks the side of the collision so that objects that inherit from it can use this (e.g. check whether the collision was below to trigger an "isGrounded" boolean). In this case the player checks the side, sets the velocity on that axis to 0, and also triggers an isGrounded boolean when the collision is below.
void Game::GamePlay::PlayerController::OnCollision(sf::Vector2f vDir)
{
if (abs(vDir.x) > abs(vDir.y))
{
if (vDir.x > 0.0f)
{
//std::cout << "Right" << std::endl;
//Collision on the right
m_Velocity.x = 0.0f;
}
if (vDir.x < 0.0f)
{
//std::cout << "Left" << std::endl;
//Collision on the left
m_Velocity.x = 0.0f;
}
return;
}
else
{
if (vDir.y > 0.0f)
{
//std::cout << "Below" << std::endl;
//Collision below
m_Velocity.y = 0.0f;
if (!m_bCanJump && m_RecentlyCollidedNode != nullptr)
{
m_RecentlyCollidedNode->ys += 3.f;
}
m_bCanJump = true;
}
if (vDir.y < 0.0f)
{
//std::cout << "Above" << std::endl;
//Collision above
m_Velocity.y = 0.0f;
}
}
}
From printing out the velocity and position while debugging, no real reason has come to the surface.
inline sf::Vector2f NormVector(sf::Vector2f vec)
{
float mag = getMagnitude(vec);
return vec / mag;
}
Solution:
if (colCheck.x == 0 && colCheck.y == 0)
{
std::cout << "Zero Vector" << std::endl;
float impulse = m_Velocity.x + m_Velocity.y;
m_Velocity.x = 0;
m_Velocity.y = 0;
m_Velocity += NormVector(diff)*impulse;
}
else
{
VDirNorm = NormVector(colCheck);
dist = getMagnitude(colCheck) - m_fRadius;
}

One issue I see is NormVector with a zero vector. You'll divide by zero, generating NaNs in your returned vector. This can happen in your existing code when diff and diffContrained are the same, which is the case whenever the circle's center lies inside the box so that no clamping takes place. colCheck will then be (0,0), causing VDirNorm to have NaNs in it, which will propagate into m_Position.
Typically, a normalized zero length vector should stay a zero length vector (see this post), but in this case, since you're using the normalized vector to offset your bodies after the collision, you'll need to add code to handle it in a reasonable fashion.
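As a minimal sketch of one way to handle the zero-length case, the helper below returns a caller-supplied fallback direction instead of dividing by zero. Vec2 stands in for sf::Vector2f so the sketch compiles without SFML, and NormVectorOr and kEpsilon are hypothetical names and values, not part of the original code.

```cpp
#include <cmath>

// Stand-in for sf::Vector2f so the sketch compiles without SFML.
struct Vec2 {
    float x;
    float y;
};

inline float getMagnitude(Vec2 v) {
    return std::sqrt(v.x * v.x + v.y * v.y);
}

// Returns the unit vector for v, or `fallback` when v is too short to
// normalize safely. kEpsilon is an assumed tolerance, not a value from the post.
inline Vec2 NormVectorOr(Vec2 v, Vec2 fallback) {
    const float kEpsilon = 1e-6f;
    const float mag = getMagnitude(v);
    if (mag < kEpsilon) {
        return fallback;
    }
    return Vec2{v.x / mag, v.y / mag};
}
```

In CircleAABB the fallback could be, for example, a fixed "up" direction or the direction from the box center to the circle; which one is reasonable depends on how you want a fully overlapping circle to be pushed out.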

Related

Rectangle Intersection. print message for empty intersection

I describe a rectangle with four values: x, y, width (w), and height (h). I have two rectangles with the following values:
r1.x=2, r1.y=3, r1.w=5, r1.h=6;
r2.x=0, r2.y=7, r2.w=-4, r2.h=2;
As you can see, this intersection is empty.
What I have so far is:
rectangle intersection (rectangle r1, rectangle r2){
r1.x=max(r1.x,r2.x);
r1.y=max(r1.y,r2.y);
r1.w=min(r1.w,r2.w);
r1.h=min(r1.h,r2.h);
return r1;
}
I think the above code works when there is an intersection, but I do not know what to do when the intersection is empty. I would also like to print the message "empty" when there is no intersection.
Thanks!
The method you are using for rectangle intersection does NOT work when rectangles are represented with their width and height.
It could work if you store the rectangles' two opposite corners (instead of one corner and the dimensions) and make sure that the first corner's coordinates are always less than or equal to the second corner, effectively storing min_x, min_y, max_x, and max_y for your rectangles.
I would suggest that you adopt the convention of making sure the rectangles always include their min coordinates and always exclude their max coords.
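A minimal sketch of that corner-based representation, assuming integer coordinates and the half-open convention described above (min included, max excluded). Box, isEmpty, and intersect are hypothetical names chosen for illustration.

```cpp
#include <algorithm>

// Hypothetical corner-based rectangle covering [min_x, max_x) x [min_y, max_y).
struct Box {
    int min_x;
    int min_y;
    int max_x;
    int max_y;
};

// With half-open ranges a box is empty as soon as min reaches max on either
// axis, so rectangles that merely touch along an edge do not intersect.
bool isEmpty(const Box& b) {
    return b.min_x >= b.max_x || b.min_y >= b.max_y;
}

// Component-wise max of the min corners and min of the max corners.
// The result may be empty; check it with isEmpty().
Box intersect(const Box& a, const Box& b) {
    return Box{std::max(a.min_x, b.min_x), std::max(a.min_y, b.min_y),
               std::min(a.max_x, b.max_x), std::min(a.max_y, b.max_y)};
}
```

With the rectangles from the question converted to corners, r1 becomes Box{2, 3, 7, 9} and r2 (after flipping its negative width) becomes Box{-4, 7, 0, 9}. Their intersection has min_x = 2 and max_x = 0, so isEmpty reports true and the caller can print "empty".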
Assuming you have something not very different from:
struct rectangle {
int x;
int y;
int w;
int h;
};
(or the same using float or double instead of int)
I will assume here that w and h are always positive. If they may be negative, you should first normalize the input rectangle to ensure that they are.
You find the intersection by computing its opposite corners and checking that the lower-left corner comes before the upper-right one:
// Needs <algorithm> for std::max/std::min and <stdexcept> for std::domain_error.
rectangle intersection(const rectangle& r1, const rectangle& r2) {
// Optionally validate the arguments:
if (r1.w < 0 || r1.h < 0 || r2.w < 0 || r2.h < 0) {
throw std::domain_error("Unnormalized rectangles on input");
}
int lowx = std::max(r1.x, r2.x); // x coordinate of lower left corner
int lowy = std::max(r1.y, r2.y); // same for y coordinate
int upx = std::min(r1.x + r1.w, r2.x + r2.w); // x for upper right corner
int upy = std::min(r1.y + r1.h, r2.y + r2.h); // y for upper right corner
if (upx < lowx || upy < lowy) { // empty intersection
throw std::domain_error("Empty intersection");
}
return rectangle{lowx, lowy, upx - lowx, upy - lowy}; // aggregate initialization
}
You can normalize a rectangle by forcing positive values for width and height:
rectangle& normalize(rectangle& r) {
if (r.w < 0) {
r.x += r.w;
r.w = - r.w;
}
if (r.h < 0) {
r.y += r.h;
r.h = -r.h;
}
return r;
}
You can then use that in a second function to display the intersection result:
void display_intersection(std::ostream& out, rectangle r1, rectangle r2) {
try {
rectangle inter = intersection(normalize(r1), normalize(r2));
out << "(" << inter.x << ", " << inter.y << ") to (";
out << inter.x + inter.w << ", " << inter.y + inter.h << ")" << std::endl;
}
catch (const std::domain_error&) {
// intersection() throws when the rectangles do not overlap
out << "empty" << std::endl;
}
}

Optimization issues using Barnes-Hut for graph placing

I've been trying to work out a problem with force-directed graph layout using Barnes-Hut in my graph visualization app. So far I've checked the octree creation, and it looks correct (the tree is represented by boxes, and the circles are my graph nodes):
The fields in my Quadtree class are the following:
class Quadtree
{
public:
int level;
Quadtree* trees[2][2][2];
glm::vec3 vBoundriesBox[8];
glm::vec3 center;
bool leaf;
float combined_weight = 0;
std::vector<Element*> objects;
//Additional methods/fields
private:
//Additional methods/fields
protected:
};
This is how I am adding elements recursively to my quadtree:
#define MAX_LEVELS 5
void Quadtree::AddObject(Element* object)
{
this->objects.push_back(object);
}
void Quadtree::Update()
{
if(this->objects.size()<=1 || level > MAX_LEVELS)
{
for(Element* Element:this->objects)
{
Element->parent_group = this;
this->combined_weight += Element->weight;
}
return;
}
if(leaf)
{
GenerateChildren();
leaf = false;
}
while (!this->objects.empty())
{
Element* obj = this->objects.back();
this->objects.pop_back();
if(contains(trees[0][0][0],obj))
{
trees[0][0][0]->AddObject(obj);
trees[0][0][0]->combined_weight += obj->weight;
} else if(contains(trees[0][0][1],obj))
{
trees[0][0][1]->AddObject(obj);
trees[0][0][1]->combined_weight += obj->weight;
} else if(contains(trees[0][1][0],obj))
{
trees[0][1][0]->AddObject(obj);
trees[0][1][0]->combined_weight += obj->weight;
} else if(contains(trees[0][1][1],obj))
{
trees[0][1][1]->AddObject(obj);
trees[0][1][1]->combined_weight += obj->weight;
} else if(contains(trees[1][0][0],obj))
{
trees[1][0][0]->AddObject(obj);
trees[1][0][0]->combined_weight += obj->weight;
} else if(contains(trees[1][0][1],obj))
{
trees[1][0][1]->AddObject(obj);
trees[1][0][1]->combined_weight += obj->weight;
} else if(contains(trees[1][1][0],obj))
{
trees[1][1][0]->AddObject(obj);
trees[1][1][0]->combined_weight += obj->weight;
} else if(contains(trees[1][1][1],obj))
{
trees[1][1][1]->AddObject(obj);
trees[1][1][1]->combined_weight += obj->weight;
}
}
for(int i=0;i<2;i++)
{
for(int j=0;j<2;j++)
{
for(int k=0;k<2;k++)
{
trees[i][j][k]->Update();
}
}
}
}
bool Quadtree::contains(Quadtree* child, Element* object)
{
if(object->pos[0] >= child->vBoundriesBox[0][0] && object->pos[0] <= child->vBoundriesBox[1][0] &&
object->pos[1] >= child->vBoundriesBox[4][1] && object->pos[1] <= child->vBoundriesBox[0][1] &&
object->pos[2] >= child->vBoundriesBox[3][2] && object->pos[2] <= child->vBoundriesBox[0][2])
return true;
return false;
}
As you can see in the picture, the nodes are very clustered. I've been trying to figure out how to fix my repulsion force calculations, but it is still not working and the result is still the same.
Here is how I'm calculating it:
First, in my main file, I loop through all graph nodes:
for(auto& n_el:graph->node_vector)
{
tree->CheckNode(&n_el);
}
Next, in my Quadtree class (tree is an object of this class), I have this recursive method:
void Quadtree::CheckNode(Node* node)
{
glm::vec3 diff = this->center - node->pos;
double distance_sqr = (diff.x * diff.x) + (diff.y*diff.y) + (diff.z*diff.z);
double width_sqr = (vBoundriesBox[1][0] - vBoundriesBox[0][0]) * (vBoundriesBox[1][0] - vBoundriesBox[0][0]);
if(width_sqr/distance_sqr < 10.0f || leaf)
{
if(leaf)
{
for(auto& n: objects)
{
n->Repulse(&objects);
}
}
else
{
node->RepulseWithGroup(this);
}
}
else
{
for(int i=0; i<2; i++)
{
for(int j=0; j<2; j++)
{
for(int k=0; k<2; k++)
{
trees[i][j][k]->CheckNode(node);
}
}
}
}
}
Finally, I have two methods that calculate the repulsion force, depending on whether it is between a group and a node or between two nodes:
double Node::Repulse(std::vector<Node*>* nodes)
{
double dx;
double dy;
double dz;
double force = 0.0;
double distance_between;
double delta_weights;
double temp;
for(auto& element_node:*nodes)
{
if(this->name == element_node->name)
{
continue;
}
if(!element_node->use) continue;
delta_weights = 0.5 + abs(this->weight - element_node->weight);
dx = this->pos[0] - element_node->pos[0];
dy = this->pos[1] - element_node->pos[1];
dz = this->pos[2] - element_node->pos[2];
distance_between = dx * dx + dy * dy + dz * dz;
force = 0.19998 * delta_weights/(distance_between * distance_between);
temp = std::min(1.0, force);
if(temp<0.0001)
{
temp = 0;
}
double mx = temp * dx;
double my = temp * dy;
double mz = temp * dz;
this->pos[0] += mx;
this->pos[1] += my;
this->pos[2] += mz;
element_node->pos[0] -= mx;
element_node->pos[1] -= my;
element_node->pos[2] -= mz;
}
}
void Node::RepulseWithGroup(Quadtree* tree)
{
double dx;
double dy;
double dz;
double force = 0.0;
double distance_between;
double delta_weights;
double temp;
delta_weights = 0.5 + abs(this->weight - tree->combined_weight);
dx = this->pos[0] - tree->center.x;
dy = this->pos[1] - tree->center.y;
dz = this->pos[2] - tree->center.z;
distance_between = dx * dx + dy * dy + dz * dz;
force = 0.19998 * delta_weights/(distance_between * distance_between);
temp = std::min(1.0, force);
if(temp<0.0001)
{
temp = 0;
}
double mx = temp * dx;
double my = temp * dy;
double mz = temp * dz;
this->pos[0] += mx + this->parent_group->repulsion_force.x;
this->pos[1] += my + this->parent_group->repulsion_force.y;
this->pos[2] += mz + this->parent_group->repulsion_force.z;
}
In case this idea:
if(width_sqr/distance_sqr < 10.0f || leaf)
{
if(leaf)
{
for(auto& n: objects)
{
n->Repulse(&objects);
}
}
else
{
node->RepulseWithGroup(this);
}
}
is not clear: I've figured out that there might actually be multiple elements in one tree leaf. That can happen if the maximum level has already been reached and several elements are still in one box. In that case I also need to calculate the forces between the nodes inside that box.
What bothers me even more is the speed of this approach (which also suggests that the octree is not working correctly). This is a simple plot of time against the number of nodes:
As far as I know, the original force-directed graph algorithm has complexity O(n^2), but with Barnes-Hut it should be O(n log n). Yet the plot is not even close to n log n.
Can someone tell me what I am doing wrong here? I've been looking at this code for quite a while now, and I don't see what I am missing.
EDIT:
Based on @Ilmari Karonen's answer I've run tests for MAX_LEVELS 5, 20, 50, and 100. The results are below. As far as I can tell, there is no meaningful difference (unfortunately).
Just off the top of my head,
#define MAX_LEVELS 5
seems awfully low. You may simply be running out of depth in your octree, causing your algorithm to degenerate into O(n²) direct summing. You may want to try increasing MAX_LEVELS to a significantly higher value (at least, say, 10 or 20) and seeing if that improves the performance.
I haven't tested your code, so I can't be sure if this is the real issue, or the only one. But it's definitely what I'd check first.
Looking a bit more closely at your code, I'm seeing a couple of other potential issues, too. These might not, strictly speaking, affect performance, but they might affect the correctness of the results.
First, you have a center vector in your Quadtree class, presumably representing the center of mass of the nodes within the subtree, but you never seem to update that vector when adding nodes into the tree. Since you do use that vector in your calculations, you might be getting bogus results because of that.
(In fact, since one thing you're using the center vector for is calculating the distance between a node and a subtree, and so deciding whether to descend deeper into the subtree, that might also be messing up your performance.)
Also, you seem to be updating the positions directly while traversing the tree, which means that the trajectories generated by your algorithm will depend on the order in which the nodes are traversed and the tree expanded. For more consistent and reproducible results, you may want to first calculate the displacement of each node during the current iteration of the algorithm, storing it in a separate vector, and then run a second pass over the nodes to add the displacement to their position (and reset it for the next iterations).
Also, surely I can't be the only one who finds the fact that you have a class named Quadtree that implements an octree annoying, can I? :)

Simulation of a point mass in a box (3D space)

I would like to simulate a point mass within a closed box. There is no friction and the point mass obeys the impact law, so there are only elastic collisions with the walls of the box. The output of the program is the time, position (rx, ry, rz) and velocity (vx, vy, vz). I plot the trajectory using gnuplot.
The problem I have now is that the point mass gains energy from somewhere, so its bounces get more intense each time.
Could someone check my code?
/* Start of the code */
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
struct pointmass
{
double m; // mass
double r[3]; // coordinates
double v[3]; // velocity
};
// Grav.constant
const double G[3] = {0, -9.81, 0};
int main()
{
int Time = 0; // Duration
double Dt = 0; // Time steps
pointmass p0;
cerr << "Duration: ";
cin >> Time;
cerr << "Time steps: ";
cin >> Dt;
cerr << "Velocity of the point mass (vx,vy,vz)? ";
cin >> p0.v[0];
cin >> p0.v[1];
cin >> p0.v[2];
cerr << "Initial position of the point mass (x,y,z)? ";
cin >> p0.r[0];
cin >> p0.r[1];
cin >> p0.r[2];
for (double i = 0; i<Time; i+=Dt)
{
cout << i << setw(10);
for (int j = 0; j<=2; j++)
{
////////////position and velocity///////////
p0.r[j] = p0.r[j] + p0.v[j]*i + 0.5*G[j]*i*i;
p0.v[j] = p0.v[j] + G[j]*i;
///////////////////reflection/////////////////
if(p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if(p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
//////////////////////////////////////////////
}
/////////////////////Output//////////////////
for(int j = 0; j<=2; j++)
{
cout << p0.r[j] << setw(10);
}
for(int j = 0; j<=2; j++)
{
cout << p0.v[j] << setw(10);
}
///////////////////////////////////////////////
cout << endl;
}
}
F = ma
a = F / m
a dt = F / m dt
a dt is the acceleration applied over one fixed time step, i.e. the change in velocity for that frame.
You are setting it to (F / m) i.
It is that i which is wrong, as the comments have suggested. It needs to be the duration of a frame, not the duration of the entire simulation so far.
I am a little concerned about the time loop, along with the other commenters: make sure that it represents an increment of time, not a growing duration.
Still, I think the main problem is that you are changing the sign of all three components of velocity on reflection.
That's not consistent with the laws of physics (conservation of linear momentum and energy) at the boundaries.
To see this, consider the case if your particle is moving in just the x-y plane (velocity in z is zero) and about to hit the wall at x= L.
The collision looks like this:
The force exerted on the point mass by the wall acts perpendicular to the wall. So there is no change in the momentum component of the particle parallel to the wall.
Applying conservation of linear momentum and kinetic energy, and assuming a perfectly elastic collision, you will find that
The component of velocity perpendicular to the wall DOES change sign
The component of velocity parallel to the wall DOES NOT change sign
In three dimensions, to have an accurate simulation, you have to work out the momentum components parallel and perpendicular to the wall on collision and code the resulting velocity changes.
In other words, this code:
///////////////////reflection/////////////////
if(p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if(p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
//////////////////////////////////////////////
does not model the physics of reflection correctly. To fix it here is an outline of what to do:
Take the reflection checks out of the loop over the x, y, z coordinates (but keep them within the time loop).
The collision condition needs to be checked for all six walls, according to the direction of the normal vector to each wall.
For example, for the right wall of the cube defined by X=250, 0<=Y<250, 0<=Z<250, the normal vector is in the negative X direction. For the left wall defined by X=0, 0<=Y<250, 0<=Z<250, the normal vector is in the positive X direction.
So on reflection from those two walls, the X component of velocity changes sign because it is normal (perpendicular) to the wall, but the Y and Z components do NOT change sign because they are parallel to the wall.
Apply similar considerations at the top and bottom walls (constant Y) and the front and back walls (constant Z) of the cube; working out the normals to those surfaces is left as an exercise.
Finally, you shouldn't change the sign of the position vector components on reflection, just the velocity vector. Instead, recompute the next value of the position vector given the new velocity.
OK, so there are a few issues. The others have pointed out the need to use Dt rather than i for the integration step.
However, you are correct in stating that there is an issue with the reflection and energy conservation. I've added an explicit track of that below.
Note that the component wise computation of the reflection is actually fine other than the energy issue.
The problem was that during a reflection the acceleration due to gravity changes direction relative to the velocity. In the case of the particle hitting the floor, it was acquiring kinetic energy equal to what it would have had if it had kept falling, but the new position had higher potential energy. So the energy would increase by exactly twice the potential energy difference between the floor and the new position. A bounce off the roof would have the opposite effect.
As noted below, one strategy would be to compute the actual time of reflection. However, working directly with energy is much simpler as well as more robust. Please note, though, that although the simple energy version below ensures that the speed and position are consistent, it does not actually have the correct position. For most purposes that may not matter. If you really need the correct position, I think we need to solve for the bounce time.
/* Start of the code */
#include <iostream>
#include <cmath>
#include <iomanip>
using namespace std;
struct pointmass
{
double m; // mass
double r[3]; // coordinates
double v[3]; // velocity
};
// Grav.constant
const double G[3] = { 0, -9.81, 0 };
int main()
{
// I've just changed the initial values to speed up unit testing; your code worked fine here.
int Time = 50; // Duration
double Dt = 1; // Time steps
pointmass p0;
p0.v[0] = 23;
p0.v[1] = 40;
p0.v[2] = 15;
p0.r[0] = 100;
p0.r[1] = 200;
p0.r[2] = 67;
for (double i = 0; i<Time; i += Dt)
{
cout << setw(10) << i << setw(10);
double energy = 0;
for (int j = 0; j <= 2; j++)
{
double oldR = p0.r[j];
double oldV = p0.v[j];
////////////position and velocity///////////
p0.r[j] = p0.r[j] + p0.v[j] * Dt + 0.5*G[j] * Dt*Dt;
p0.v[j] = p0.v[j] + G[j] * Dt;
///////////////////reflection/////////////////
if (G[j] == 0)
{
if (p0.r[j] >= 250)
{
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -p0.v[j];
}
else if (p0.r[j] <= 0)
{
p0.r[j] = -p0.r[j];
p0.v[j] = -p0.v[j];
}
}
else
{
// Need to capture the fact that the acceleration switches direction relative to velocity half way through the timestep.
// Two approaches, either
// Try to compute the time of the bounce and work out the detail.
// OR
// Use conservation of energy to get the right speed - much easier!
if (p0.r[j] >= 250)
{
double energy = 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
p0.r[j] = 500 - p0.r[j];
p0.v[j] = -sqrt(2 * (energy + G[j] * p0.r[j]));
}
else if (p0.r[j] <= 0)
{
double energy = 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
p0.r[j] = -p0.r[j];
p0.v[j] = sqrt(2*(energy + G[j] * p0.r[j]));
}
}
energy += 0.5*p0.v[j] * p0.v[j] - G[j] * p0.r[j];
}
/////////////////////Output//////////////////
cout << energy << setw(10);
for (int j = 0; j <= 2; j++)
{
cout << p0.r[j] << setw(10);
}
for (int j = 0; j <= 2; j++)
{
cout << p0.v[j] << setw(10);
}
///////////////////////////////////////////////
cout << endl;
}
}
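For comparison, the bounce-time alternative mentioned above could look roughly like the sketch below for one coordinate with a floor at r = 0. It is written under stated assumptions: g is negative, the particle starts the step at r >= 0, and at most one bounce happens per step. stepWithFloor is a hypothetical name, and the ceiling case is left out.

```cpp
#include <cmath>

// One step of a 1D particle under constant acceleration g (< 0) above a floor
// at r = 0. Assumes r >= 0 on entry and at most one bounce within dt.
void stepWithFloor(double& r, double& v, double g, double dt) {
    const double rFree = r + v * dt + 0.5 * g * dt * dt;
    if (rFree >= 0.0) {
        // No wall contact in this step: plain constant-acceleration update.
        r = rFree;
        v += g * dt;
        return;
    }
    // Solve r + v*t + 0.5*g*t^2 = 0 for the impact time t in [0, dt].
    const double disc = std::sqrt(v * v - 2.0 * g * r);
    const double tHit = (-v - disc) / g;
    const double vHit = v + g * tHit; // velocity just before impact (downward)
    const double rest = dt - tHit;    // time left in the step after the bounce
    const double vOut = -vHit;        // elastic reflection at the floor
    r = vOut * rest + 0.5 * g * rest * rest;
    v = vOut + g * rest;
}
```

Because each half of the step is an exact constant-acceleration segment, the quantity 0.5*v*v - g*r should stay constant up to rounding, and the position after the bounce is the physically correct one rather than a mirrored one.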

I need to understand the top function more specifically than I already do. The bottom function is pretty much self-explanatory

bool isOnPerimeter is the function that I need help with.
bool isOnPerimeter(int row, int column, int radius)
{
double dRow=static_cast<double>(row);
double dColumn=static_cast<double>(column);
double dRadius=static_cast<double>(radius);
if (pow(dRow,2.0)+pow(dColumn,2.0)<=pow(dRadius,2.0) &&
pow(dRow,2.0)+pow(abs(dColumn)+1,2.0) > pow(dRadius,2.0))
return true;
else
return false;
}
void drawCircle(int radius)
{
for (int row = -radius;row <= radius;++row)
{
for (int column = -radius;column <= radius;++column)
{
if (isOnPerimeter(row,column,radius))
cout << "*";
else
cout << " ";
}
cout << endl; // end the text line once per row, not once per column
}
}
The function looks like it's drawing a circle inside the square defined by the corners (-radius, -radius) and (radius, radius).
How it does that: consider the trigonometric circle of radius R. A point on it has coordinates x = R cos(t) and y = R sin(t), so x^2 + y^2 = R^2. All the points inside the circle have the property that x^2 + y^2 < R^2, and all the points outside the circle have the property x^2 + y^2 > R^2.
In your example, row and column play the roles of y and x. So the edge of the circle is taken to be all the points for which
row^2 + column^2 <= R^2 && row^2 + (|column|+1)^2 > R^2
that is, the points that are inside the circle (or on it) while the next point outward in the same row is already outside.
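The same test can be written with integer arithmetic only, which may make the condition easier to read. This is a sketch, not the original function: isOnPerimeterInt and renderCircle are hypothetical names, and the circle is rendered into a string so the result can be inspected.

```cpp
#include <cstdlib>
#include <string>

// A cell is on the perimeter when it lies inside (or on) the circle but the
// next cell outward along the same row, at |column| + 1, lies outside it.
bool isOnPerimeterInt(int row, int column, int radius) {
    const int r2 = radius * radius;
    const int outer = std::abs(column) + 1;
    return row * row + column * column <= r2 &&
           row * row + outer * outer > r2;
}

// Renders the circle into a string, one text line per row.
std::string renderCircle(int radius) {
    std::string out;
    for (int row = -radius; row <= radius; ++row) {
        for (int column = -radius; column <= radius; ++column) {
            out += isOnPerimeterInt(row, column, radius) ? '*' : ' ';
        }
        out += '\n';
    }
    return out;
}
```

For radius 3 and row 0, column 3 passes (9 <= 9 and 16 > 9) while column 2 fails (its outward neighbor at column 3 is still inside), so only the outermost cell of each row half is drawn.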
Hope this helps

Implementing Table-Lookup-Based Trig Functions

For a videogame I'm implementing in my spare time, I've tried implementing my own versions of sinf(), cosf(), and atan2f(), using lookup tables. The intent is to have implementations that are faster, although with less accuracy.
My initial implementation is below. The functions work, and return good approximate values. The only problem is that they are slower than calling the standard sinf(), cosf(), and atan2f() functions.
So, what am I doing wrong?
// Geometry.h includes definitions of PI, TWO_PI, etc., as
// well as the prototypes for the public functions
#include "Geometry.h"
namespace {
// Number of entries in the sin/cos lookup table
const int SinTableCount = 512;
// Angle covered by each table entry
const float SinTableDelta = TWO_PI / (float)SinTableCount;
// Lookup table for Sin() results
float SinTable[SinTableCount];
// This object initializes the contents of the SinTable array exactly once
class SinTableInitializer {
public:
SinTableInitializer() {
for (int i = 0; i < SinTableCount; ++i) {
SinTable[i] = sinf((float)i * SinTableDelta);
}
}
};
static SinTableInitializer sinTableInitializer;
// Number of entries in the atan lookup table
const int AtanTableCount = 512;
// Interval covered by each Atan table entry
const float AtanTableDelta = 1.0f / (float)AtanTableCount;
// Lookup table for Atan() results
float AtanTable[AtanTableCount];
// This object initializes the contents of the AtanTable array exactly once
class AtanTableInitializer {
public:
AtanTableInitializer() {
for (int i = 0; i < AtanTableCount; ++i) {
AtanTable[i] = atanf((float)i * AtanTableDelta);
}
}
};
static AtanTableInitializer atanTableInitializer;
// Lookup result in table.
// Preconditions: y > 0, x > 0, y < x
static float AtanLookup2(float y, float x) {
assert(y > 0.0f);
assert(x > 0.0f);
assert(y < x);
const float ratio = y / x;
const int index = (int)(ratio / AtanTableDelta);
return AtanTable[index];
}
}
float Sin(float angle) {
// If angle is negative, reflect around X-axis and negate result
bool mustNegateResult = false;
if (angle < 0.0f) {
mustNegateResult = true;
angle = -angle;
}
// Normalize angle so that it is in the interval (0.0, PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
const int index = (int)(angle / SinTableDelta);
const float result = SinTable[index];
return mustNegateResult? (-result) : result;
}
float Cos(float angle) {
return Sin(angle + PI_2);
}
float Atan2(float y, float x) {
// Handle x == 0 or x == -0
// (See atan2(3) for specification of sign-bit handling.)
if (x == 0.0f) {
if (y > 0.0f) {
return PI_2;
}
else if (y < 0.0f) {
return -PI_2;
}
else if (signbit(x)) {
return signbit(y)? -PI : PI;
}
else {
return signbit(y)? -0.0f : 0.0f;
}
}
// Handle y == 0, x != 0
if (y == 0.0f) {
return (x > 0.0f)? 0.0f : PI;
}
// Handle y == x
if (y == x) {
return (x > 0.0f)? PI_4 : -(3.0f * PI_4);
}
// Handle y == -x
if (y == -x) {
return (x > 0.0f)? -PI_4 : (3.0f * PI_4);
}
// For other cases, determine quadrant and do appropriate lookup and calculation
bool right = (x > 0.0f);
bool top = (y > 0.0f);
if (right && top) {
// First quadrant
if (y < x) {
return AtanLookup2(y, x);
}
else {
return PI_2 - AtanLookup2(x, y);
}
}
else if (!right && top) {
// Second quadrant
const float posx = fabsf(x);
if (y < posx) {
return PI - AtanLookup2(y, posx);
}
else {
return PI_2 + AtanLookup2(posx, y);
}
}
else if (!right && !top) {
// Third quadrant
const float posx = fabsf(x);
const float posy = fabsf(y);
if (posy < posx) {
return -PI + AtanLookup2(posy, posx);
}
else {
return -PI_2 - AtanLookup2(posx, posy);
}
}
else { // right && !top
// Fourth quadrant
const float posy = fabsf(y);
if (posy < x) {
return -AtanLookup2(posy, x);
}
else {
return -PI_2 + AtanLookup2(x, posy);
}
}
return 0.0f;
}
"Premature optimization is the root of all evil" - Donald Knuth
Nowadays compilers provide very efficient intrinsics for trigonometric functions that get the best from modern processors (SSE etc.), which explains why you can hardly beat the built-in functions. Don't lose too much time on these parts and instead concentrate on the real bottlenecks that you can spot with a profiler.
Remember you have a co-processor ... you would have seen an increase in speed if it were 1993 ... however today you will struggle to beat native intrinsics.
Try viewing the disassembly of sinf.
Someone has already benchmarked this, and it looks as though the Trig.Math functions are already optimized, and will be faster than any lookup table you can come up with:
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
(They didn't use anchors on the page so you have to scroll about 1/3 of the way down)
I'm worried about this part:
// Normalize angle so that it is in the interval (0.0, PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
But you can:
Add timers to all the functions, write dedicated performance tests, run them, and print a report of the timings. I think you will know the answer after these tests.
Also, you could use a profiling tool such as AQTime.
The built-in functions are very well optimized already, so it's going to be REALLY tough to beat them. Personally, I'd look elsewhere for places to gain performance.
That said, one optimization I can see in your code:
// Normalize angle so that it is in the interval (0.0, PI)
while (angle >= TWO_PI) {
angle -= TWO_PI;
}
Could be replaced with:
angle = fmod(angle, TWO_PI);
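One detail to keep in mind is that fmod keeps the sign of its first argument, so a negative angle stays negative. In the Sin function above the angle has already been made non-negative, but a general-purpose wrapper needs one extra step. The sketch below assumes a constant kTwoPi standing in for TWO_PI from Geometry.h; wrapAngle is a hypothetical name.

```cpp
#include <cmath>

// Stands in for TWO_PI from Geometry.h (assumed to be 2*pi as a float).
const float kTwoPi = 6.28318530717958647692f;

// Wraps any finite angle into the interval [0, 2*pi).
float wrapAngle(float angle) {
    float wrapped = std::fmod(angle, kTwoPi);
    if (wrapped < 0.0f) {
        wrapped += kTwoPi; // fmod keeps the sign of `angle`
    }
    if (wrapped >= kTwoPi) {
        wrapped = 0.0f; // a tiny negative input can round up to exactly 2*pi
    }
    return wrapped;
}
```

The final guard matters for table lookups: without it, an index computed from an angle of exactly 2*pi would fall one past the end of the table.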