Clamp angle with continuous result - c++

I'm writing my own Inverse Kinematics and working on joint constraints currently. It's mostly fine but I want to remove all discontinuities from the resulting animation, and constraints are the biggest problem.
For example, let's say you're looking at a ball rotating around your head. It moves right and you look right, up to your neck's twisting limit. The ball keeps moving, and when it passes behind your back your head is suddenly clipped against the opposite limit, looking left. This is what we get with typical clamping, and it gives me a frame-to-frame discontinuity.
What I need is for your head to get from right to left over a few frames. My first idea is to reflect the source direction about my X axis (between the shoulders) and, if the reflected direction meets the constraint limits, return it. If it doesn't meet the limits, clip the reflected direction instead of the source one.
This should give me nice, continuous movement. It's not rocket science, but there are a lot of details to take care of. My limits may be min < max but also max < min, and they may or may not pass through the angle wrap-around (like 360->0, or 180->-180). Besides, when my allowed range is bigger than 180deg the logic should be slightly different.
The number of different cases builds up very fast, so I'm wondering if I could find this already done somewhere?
BTW: I'm using Unreal Engine, if it makes any difference.
EDIT:
I'm sorry if I've misguided you with my head example - it was just an analogy to visualize my needs. I'm not going to use this code to orient a head bone (nor any other) in a final animation. I'm planning to use it as a very basic operation in solving IK. It's going to be used in every iteration in every IK chain, for every bone, many times. So it needs to be as fast as possible. All the higher level concepts, like movement planning, dynamics, etc. will be added on another layer, if needed.
For now I'm just asking for a function like:
float clampAngle( float angle, float min, float max )
that would return a continuous value for changing inputs.
I'm working on different issues currently, but I'll post my code when I get back to this one.

If you think of this as a simulation it will be a lot clearer. You won't need any planning, as in your comment, because the next frame is calculated using only information available on the current frame.
Set up one joint for the neck as described in your first two paragraphs. It will flip from left to right in one frame as the ball comes around.
Then set up a second identical joint and write an angular spring that will rotate it towards the first joint over time. By adjusting spring coefficients - strength, damping etc - you will have control over the way the head rotates.
How to write an angular spring?
This may not be the best way, but the code is pretty short and easy, so here goes...
Let's call the 2 joints master and slave. You need to store rotation phi and angular velocity omega on the slave joint.
phi and omega are axis-angle vectors - a 3d vector whose magnitude is the number of radians to rotate around the axis defined by the vector. It makes simulating rotations pretty easy. Whether your joint rotations are stored as euler angles or matrices or quaternions, you'll probably have some classes in the UE API to help extract axis/angle vectors.
When the master joint changes, convert its rotation to axis-angle. Let's call this phi_m.
On the start frame, the Slave rotation phi should be set to the same value as master. phi = phi_m. It has no angular velocity, so omega is set to the zero vector (0,0,0).
Then you move forward one frame. A frame's length dt is maybe 1/60 sec or whatever.
While simulating you first work out a new value for phi_m. Then the difference between master and slave ( the vector phi_m - phi) represents the torque acting on the slave joint. The bigger the angle the stronger the torque. Assume mass is 1.0 then using F=ma, angular acceleration alpha is equal to the torque. That means the change in angular velocity for the slave joint over that frame is alpha*dt. Now we can work out new angular velocity: omega = omega + (alpha*dt). Similarly, the new rotation is the old rotation plus change in rotation over time (angular velocity). phi = phi + (omega * dt)
Putting it all together and adding in a coefficient for spring strength and damping, a simulation step can go something like this:
damping = 0.01 // between 0 and 1
strength = 0.2 // ..experiment
dt = currentTime-lastTimeEvaluated
phi_m = eulerToAxisAngle(master.rotate)
if (currentTime <= startTime || dt < 0) {
slave.phi = phi_m
slave.omega = (0,0,0)
} else {
alpha = (phi_m - slave.phi)*strength
slave.omega = (slave.omega * (1.0 - damping)) + (alpha*dt)
slave.phi = slave.phi + slave.omega*dt
}
slave.rotate = axisAngleToEuler(slave.phi)
lastTimeEvaluated = currentTime
Notice that damping just kills off a little of the stored velocity. If damping is very low, the joint will overshoot and oscillate into place - boing!!

one approach:
define the region where the constraint cannot be satisfied (cone behind the head)
measure of how far into this region your target is (angle between target and center of cone divided by half cone angle)
use this scalar value to smoothly mix your constrained object back to some pre-defined neutral position (ie turn head back to center/rest position smoothly as target moves behind the head)
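For a single angle, this idea could look something like the sketch below. It is only one possible reading of the approach, under these assumptions: angles are in degrees, minAngle < maxAngle (a range that crosses the wrap-around is passed as e.g. 170..190), the neutral position is the middle of the allowed range, and the mix is linear. The name clampAngleContinuous is made up for illustration.

```cpp
#include <cmath>

// Clamp `angle` to [minAngle, maxAngle] without a jump when the target
// passes behind the joint. Inside the limits the angle is returned as is
// (normalized relative to the range center). Outside, the result slides
// from the nearest limit back to the center as the target approaches the
// direction exactly opposite the center.
float clampAngleContinuous(float angle, float minAngle, float maxAngle) {
    const float center = 0.5f * (minAngle + maxAngle);
    const float half = 0.5f * (maxAngle - minAngle);
    // Signed offset from the center, wrapped to [-180, 180].
    const float d = std::remainder(angle - center, 360.0f);
    const float absD = std::fabs(d);
    if (absD <= half) {
        return center + d;  // constraint can be satisfied
    }
    // 1 at the edge of the allowed range, 0 straight behind the center.
    const float t = (180.0f - absD) / (180.0f - half);
    return center + std::copysign(half * t, d);
}
```

With limits of -90..90, an input of 135 gives 45, and inputs just either side of 180 both give values near 0, so the output stays continuous as the target circles around. Replacing the linear t with a smoothstep would soften the corners at the limits.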

Related

3D Rigid body sphere collision response (C++)

I am currently programming a 3D physics engine in C++ and have two spheres in my scene. I can detect when said spheres are colliding and they have some basic collision as a response but the spheres go inside each other a lot and act minimally upon each other so I'm looking to change this to be a RigidBody collision. I've looked online and many tutorials are on 2D collisions and use Elastic collision which is already hard enough to translate into 3D. Any advice on where to look for Rigidbody sphere collision would be a huge help and I'll put below the code I am currently using for my collision. Thanks in advance!
for (size_t i = 0; i < pool.size(); i++)
{
if (pool.at(i) == this)
{
continue;
}
DynamicObjects* other = pool.at(i);
SScollision = PFG::SphereToSphereCollision(pos, other->GetPosition(), rad, other->GetRadious(), contactPoint);
if (SScollision)
{
glm::vec3 relativeVel = this->GetVelocity() - other->GetVelocity();
float eCof = -(1.0f + 1.0f) * glm::dot(relativeVel, contactPoint);
float jLin = eCof / (1.0f / this->GetMass() + 1.0f / other->GetMass());
glm::vec3 collision_impulse_force = jLin * contactPoint / deltaTs;
//ClearForces();
AddForce(1.0f / this->GetMass() * collision_impulse_force);
//pool.at(otherI)->ClearForces();
other->AddForce(-(1.0f / other->GetMass() * collision_impulse_force));
}
}
When dealing with forces (as opposed to pressure, stress, or potentials), you will need to use a mass-spring-damper model. You can look into the discrete particle method for a technique that is used in many fields.
To learn for yourself, try to understand your state variables and your system. Take the acceleration you are seeing and divide the sphere's velocity (or velocity difference) by that acceleration. This will give you an idea of the time scale that the sphere needs to slow down on impact. Multiply by the velocity and you get a displacement, you can get an idea of how far that sphere will travel before slowing down.
Also, take a look at your time step. If the velocity multiplied by the time step allows the sphere to travel further than your threshold for detecting contact, then the spheres will pass through each other. The same is true if your contact force doesn't increase as the spheres begin to intersect.
If you want to take a deeper look at collision mechanics, look for topics under "dynamic behavior of materials". There you will see concepts for wave mechanics and elastic behavior. Most numerical methods transition to energy potential models rather than mass-spring-damper models, you can check out methods such as molecular dynamics, smoothed particle hydrodynamics, and peri-dynamics.
The main point is that these numerical methods operate on governing equations that don't allow particles (or your spheres) to pass through each other. The spring models, or potentials often have very (very) stiff responses when objects begin to intersect, as compared to the normal contact between objects.
A couple of points to take away:
Your integrator (i.e. time step) is likely too coarse or your collision detection is too fine, allowing the spheres to intersect.
Look up the discrete particle method (sometimes called the discrete element method), it is very popular and has many, many people using, studying, and publishing. You will find code, mass-spring-damper models, parameters, and other methods to help you reach your goal.
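As a rough illustration of the spring-damper contact idea, here is a minimal sketch of a penalty force between two overlapping spheres. It is not taken from any particular discrete element code: Vec3, contactForceOnA, stiffness and dampingCoeff are names and parameters assumed for the example, and a linear spring is only the simplest possible choice.

```cpp
#include <cmath>

struct Vec3 {
    double x, y, z;
};

Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, double s) { return {a.x * s, a.y * s, a.z * s}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Force acting on sphere A due to contact with sphere B.
// Returns zero when the spheres do not overlap; apply the negated force to B.
Vec3 contactForceOnA(Vec3 posA, Vec3 velA, double radA,
                     Vec3 posB, Vec3 velB, double radB,
                     double stiffness, double dampingCoeff) {
    const Vec3 d = posA - posB;
    const double dist = std::sqrt(dot(d, d));
    const double overlap = radA + radB - dist;
    if (overlap <= 0.0 || dist == 0.0) {
        return {0.0, 0.0, 0.0};
    }
    const Vec3 n = d * (1.0 / dist);          // unit normal pointing from B to A
    const double vn = dot(velA - velB, n);    // negative while approaching
    double magnitude = stiffness * overlap - dampingCoeff * vn;
    if (magnitude < 0.0) {
        magnitude = 0.0;                      // contact can push, never pull
    }
    return n * magnitude;
}
```

The force grows with the overlap, which is what keeps the spheres from sinking into each other. A stiffer spring needs a smaller time step to stay stable.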

How a feature descriptor for traffic sign detection works in opencv

I am trying to understand how the Centroid to Contour (CtC) detector works. Here is some sample code that I have found on GitHub, and I am trying to understand the idea behind it. In this sample, the author is trying to detect a speed sign using CtC. There are just two important functions:
pre_process()
CtC_features
I have understood a part of the code and how it works but I have some problems in understanding how CtC_features function works.
If you can help me I would like to understand the following parts (just 3 points):
Why if centroid.x > curr.x we need to add PI value to the angle result ( if (centroid.x > curr.x) ang += 3.14159; //PI )
When we start selecting the features on line 97 we set the start angle ( double ang = - 1.57079; ). Why is this half the pi value and negative? How was this value chosen?
And a more general question, how can you know that what feature you select are related to speed limit sign? You find the centroid of the image and adjust the angle in the first step, but in the second step how can you know if ( while (feature_v[i].first > ang) ) the current angle is bigger than your hardcode angle ( in first case ang = - 1.57079) then we add that distance as a feature.
I would like to understand the idea behind this code and if someone with more experience and with some knowledge about trigonometry would help me it will be just great.
Thank you.
The code you provided is not the best, but let's see what happens.
I took this starting image of a sign:
Then, pre_process is called, which basically runs a Canny edge detector, along with some tricks which should lead to a better edge detection. I won't look into them, but this is what it returns:
Not the greatest. Maybe some parameter tuning would help.
But now, CtC_features is called, which is the scope of the question. The role of CtC_features is to obtain some features for a machine learning algorithm. This amounts to finding a numerical description of the image which would help the ML algorithm detect the sign. Such a description can be anything. Think about how someone who never saw a STOP sign and does not know how to read would describe it. They would say something like "A red, flat plate, with 8 sides and some white stuff in the middle". Based on this description, someone might be able to tell it's a STOP sign. We want to do the same, but since computers are computers, we look for numerical features. And, with them, some algorithm could be trained to "learn" what features each sign has.
So, let's see what features CtC_features obtains from the contours.
The first thing it does is to call findContours. This function takes a binary image and returns arrays of points representing the contours of the image. Basically, it takes the edges and puts them into arrays of points. With connectivity, so we basically know which points are connected. If we use the code from here for visualization, we can see what happens:
So, the array contours is a std::vector<std::vector<cv::Point>> and you have in each sub-array a continuous contour, here drawn with a different color.
Next, we compute the number of points (edge pixels), and do an average over their coordinates to find the centroid of the edge image. The centroid is the filled circle:
Then, we iterate over all points, and create a vector of std::pair<double, double>, recording for each point the distance from the centroid and the angle. The angle function is defined at the bottom of the file as
double angle(Point2f a, Point2f b) {
return atan((a.y - b.y) / (a.x - b.x));
}
It basically computes the angle of the line from a to b with respect to the x axis, using the arctangent function. I'll let you watch a video on arctangent, but tl;dr is that it gives you the angle from a ratio. In radians (a circle is 2 PI radians, half a circle is PI radians). The problem is that the tangent function is periodic, with a period of PI. This means that there are 2 angles on the circle (the circle of all points at the same distance around the centroid) which give you the same value. So, we compute the ratio (the ratio is btw known as the tangent of the angle), apply the inverse function (arctangent) and we get an angle (corresponding to a point). But what if it's the other point? Well, we know that the other point is offset by exactly PI radians (it is diametrically opposite), so we add PI if we detect that it's the other point.
The picture below also helps understand why there are 2 points:
The tangent of the angle is the highlighted vertical distance. But the angle on the other side of the diagonal line, which intersects the circle in the bottom left, also has the same tangent. The atan function only returns angles on the right side of the center, where no 2 directions have the same tangent.
What the check does is to ask whether the point is on the left of the centroid (centroid.x > curr.x). If it is, we add half a circle (PI radians or 180 degrees) to correct the result of atan.
Now, we know the distance (a simple formula) and we have found (and corrected) the angle. We insert this pair into the vector feature_v, and we sort it. The sort function, called like that, sorts by the first element of the pair and then by the second, so we sort by angle, then by distance.
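To make the PI correction concrete, here is a small sketch of the same computation on a plain offset (dx, dy) from the centroid to the point. The function name angleFromAtan is made up for this example and is not part of the code from the question.

```cpp
#include <cmath>

const double kPi = 3.14159265358979323846;

// Angle of the direction (dx, dy), where dx = point.x - centroid.x and
// dy = point.y - centroid.y. atan alone only covers the right half-plane,
// so PI is added for points left of the centroid. The result lies in
// [-PI/2, 3*PI/2). The case dx == 0 and dy == 0 is not handled.
double angleFromAtan(double dx, double dy) {
    double ang = std::atan(dy / dx);
    if (dx < 0.0) {
        ang += kPi;  // the diametrically opposite direction
    }
    return ang;
}
```

For example, (1, 1) gives PI/4 while (-1, -1), which has the same ratio, gives PI/4 + PI. The standard function std::atan2(dy, dx) resolves the same ambiguity directly, but it returns values in (-PI, PI] rather than starting at -PI/2, so the binning start angle would have to change with it.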
The interval variable:
int degree = 10;
double interval = double((double(degree) / double(360)) * 2 * 3.14159); //5 degrees interval
is simply the value of degree, converted from degrees into radians. We need radians since the angles have been computed so far in radians, while degrees are more user friendly. Yep, the comment is wrong, the interval is 10 degrees, not 5.
The ang variable defined below it is -PI / 2 (a quarter of a circle):
double ang = - 1.57079;
Now, what it does is to divide the points around the centroid into bins, based on the angle. Each bin is 10 degrees wide. This is done by iterating over the points sorted by angle; all are accumulated until we get to the next bin. We are only interested in the largest distance of a point in each bin. The starting angle should be small enough that all the directions (points) are captured.
In order to understand why it starts from -PI/2, we have to get back at the trigonometric function diagram above. What happens if the angle goes like this:
See how the highlighted vertical segment goes "downwards" on the y axis. This means that its length (and implicitly the tangent) is negative here. Also, the angle is considered to be negative (otherwise there would be 2 angles on the same side of the center with the same tangent). Now, we are interested in the range of angles we have. It's all the angles on the right side of the centroid, starting from the bottom at -PI/2 to the top at PI/2. A range of PI radians, or 180 degrees. This is also written in the documentation of atan:
If no errors occur, the arc tangent of arg (arctan(arg)) in the range [-PI/2, +PI/2] radians, is returned.
So, we simply split all the possible directions (360 degrees) into buckets of 10 degrees, and take the distance of the farthest point in each bin. Since the circle has 360 degrees, we'll get 360 / 10 = 36 bins. Then, these are normalized such that the greatest value is 1. This helps a bit with the machine learning algorithm.
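The binning can be sketched without OpenCV as below. Note that this is not the loop from the question's code: instead of sorting and scanning with a running ang, it computes each point's bin index directly, which should give the same maxima. The function name binMaxDistances is an assumption, and the input pairs are (angle, distance) with angles in [-PI/2, 3*PI/2).

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// features: (angle in radians, distance from centroid) for every edge point.
// Returns the largest distance per angular bin, scaled so the maximum is 1.
std::vector<double> binMaxDistances(
        const std::vector<std::pair<double, double>>& features,
        int degreesPerBin = 10) {
    const double kPi = 3.14159265358979323846;
    const int binCount = 360 / degreesPerBin;
    const double interval = degreesPerBin * kPi / 180.0;  // bin width in radians
    std::vector<double> bins(binCount, 0.0);
    for (const auto& f : features) {
        // Bins start at -PI/2, the smallest angle atan can return.
        int i = static_cast<int>(std::floor((f.first + kPi / 2.0) / interval));
        i = std::max(0, std::min(i, binCount - 1));  // guard against rounding at the ends
        bins[i] = std::max(bins[i], f.second);
    }
    const double largest = *std::max_element(bins.begin(), bins.end());
    if (largest > 0.0) {
        for (double& b : bins) {
            b /= largest;
        }
    }
    return bins;
}
```

With the default of 10 degrees per bin this returns 36 values, one per direction. Bins that contain no edge point stay at 0.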
How can we know if the point we selected belongs to the sign? We don't. Most computer vision algorithms make some assumptions regarding the image in order to simplify the problem. The idea of the algorithm is to determine the shape of the sign by recording the distance from the center to the edges. This makes the assumption that the centroid is roughly in the middle of the sign. Depending on the ML algorithm used, and on the training data, different levels of robustness can be obtained.
Also, it assumes that (some of) the edges can be reliably identified. See how in my image, the algorithm was not able to detect the upper left edge?
The good news is that this doesn't have to be perfect. ML algorithms know how to handle this variation (up to some extent) provided that they are appropriately trained. It doesn't have to be perfect, but it has to be good enough. In order to answer what good enough means, what are the actual limitations of the algorithm, some more testing needs to be done, as well as some understanding of the ML algorithm used. But this is also why ML is so popular in vision: it can handle a lot of variation quite well.
At the end, we basically get an array of 36 numbers, one for each of the 36 bins of 10 degrees, representing the maximum distance of a point in the bin. I assume this is because the developer of the algorithm wanted a way to capture the shape of the sign, by looking at distances from center in various directions. This assumes that no edges are detected in the background, and the sign looks something like:
The max distance is used to pick the border, and not other symbols on the sign.
It is not directly used here, but a possibly related reading is the Hough transform, which uses a similar particularization to detect straight lines in an image.

New velocity after circle collision

On a circular billiard-table, the billiard-ball collides with the boundary of that table with some velocity v1. This collision is detected as follows:
double s = sqrt( (p.x-a)*(p.x-a) + (p.y-b)*(p.y-b) );
if (s<r) // point lies inside circle
// do nothing
else if (s==r) // point lies on circle
// calculate new velocity
else if (s>r) // point lies outside circle
// move point back onto circle (I already have that part)
// calculate new velocity
Now how can the new velocity v2 after the collision be calculated, such that angle of incidence = angle of reflection (elastic collision)?
PS: The billiard-ball is represented by a point p(x,y) with a velocity-vector v(x,y). The simulation is without friction.
Assuming you're making some simple (game-like) billiards simulation you could use something like:
v_new = coeff*(v_old - 2*dot(v_old, boundary_normal)*boundary_normal);
Here v_old is your current velocity vector and boundary_normal is the inward pointing normal of your circular billiards table at the point of impact. If you know the center c of your circular table and you have the point of impact p then the normal is simply normalize(c-p). That is, the normalized vector you obtain when subtracting p from c.
Now I have taken coeff to be a fudge factor between 0 (no velocity at all anymore after impact) and 1 (same velocity after impact). You can make this more physically plausible by determining a correct coefficient of restitution.
In the end all the formula above is, is simple reflection as you might have seen in a basic ray tracer for example. As said, it's a fairly crude abstraction from an accurate physics simulation, but will most likely do the job.
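A minimal C++ version of that reflection, assuming a small hand-written Vec2 type (the names Vec2 and reflectOnCircle are made up for this sketch):

```cpp
#include <cmath>

struct Vec2 {
    double x, y;
};

// Reflect velocity v of a ball that hit the boundary at point p of a
// circular table centered at c. coeff is the fudge factor described above
// (1 keeps the speed unchanged).
Vec2 reflectOnCircle(Vec2 v, Vec2 p, Vec2 c, double coeff = 1.0) {
    double nx = c.x - p.x;
    double ny = c.y - p.y;
    const double len = std::sqrt(nx * nx + ny * ny);
    if (len == 0.0) {
        return v;  // p at the center: the normal is undefined
    }
    nx /= len;
    ny /= len;  // (nx, ny) is now the inward unit normal
    const double vn = v.x * nx + v.y * ny;
    return {coeff * (v.x - 2.0 * vn * nx), coeff * (v.y - 2.0 * vn * ny)};
}
```

For a table centered at the origin, a ball hitting the boundary at (1, 0) with velocity (1, 1) leaves with velocity (-1, 1): the component along the normal flips and the tangential component is kept.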
As the comments say, this is a mechanics question.
Have a look at the momentum definition.
What you want in particular, is covered in the section elastic collisions.

Ballistic curve problem

Ok i know this is quite off-topic for programmers but still I need this for app, so here it is:
Ballistic curve (without wind or any other conditions) is specified by these 2 lines:
So, there is a problem that you got 3 unknown values: x,y and time t, but only 2 equations.
You can't really compute all 3 with just these values, I got:
velocity v
angle Alpha
origin coordinates
Thus, you have to decide which one to specify.
Say you have a 2D tanks game, or anything like that: you have a tank, and using ballistics you have to shoot your opponent down by setting the angle and power.
I need to know when the bullet hits the ground. It can be checked in the air as it flies, or precomputed.
Here comes my problem. Which way to use? Precompute, or check for hitting the ground in each step?
If I wanted to precompute, I would need to know the height of the terrain, which, logically, would have to be constant, as I don't know the x coordinate of the impact. If I knew the x, it would mean that there is a wall just in front of my turret. So the only way to find out when I hit the ground would be to check at intervals of time. This is also good because the terrain doesn't have to be static, yay! But isn't that too much overhead for something that could be made much simpler? Have you encountered such a problem/solution?
Thanks in advance. BTW the terrain can be flat, made of lines, or NURBS, so I'm asking for a general solution, not one specific to the height at which the impact will occur.
You can compute the path of the projectile y(x) by solving one equation for t and substituting into the other. You get
Then finding the landing point is a matter of computing the intersections between that function and the function that defines the height of the terrain. One intersection will be the launch point and the other will be the landing point. (If your terrain is very steep and hilly, there could be more than 2 intersections, in which case you take the first one with x greater than the launch point.) You can use any of various root-finding algorithms to actually compute the intersection; check the documentation of whatever mathematical or game-physical libraries you have to see if they provide a method to do this.
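The formulas themselves appear to be missing from this copy of the thread. As a worked sketch, assuming the usual drag-free model with launch point $(x_0, y_0)$, speed $v$, angle $\alpha$ and gravitational acceleration $g$, the substitution goes like this:

```latex
\begin{aligned}
x &= x_0 + v\,t\cos\alpha, \qquad y = y_0 + v\,t\sin\alpha - \tfrac{1}{2} g t^2 \\
t &= \frac{x - x_0}{v\cos\alpha} \\
y(x) &= y_0 + (x - x_0)\tan\alpha - \frac{g\,(x - x_0)^2}{2\,v^2\cos^2\alpha}
\end{aligned}
```

The landing point is then a root of $y(x) - h(x) = 0$ with $x > x_0$, where $h(x)$ is the terrain height (a symbol introduced here for illustration).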
David Zaslavsky did a good job of answering your question about solving for the equation, but if your ultimate goal is simple ballistics simulation, I suggest you instead use vector decomposition.
By utilizing vector decomposition, you can derive the x- and y-component vectors of your projectile. You can then apply acceleration to each component to account for gravity, wind, etc. Then you can update the projectile's (x,y) position each interval as a function of time.
For example:
#include <cmath>
double Speed = 100.0; // Speed rather than velocity, as it is only the magnitude
double Angle = 30.0 * 3.14159265358979 / 180.0; // Initial angle of 30º, converted to radians for cos/sin
double Position[2] = {0.0,0.0}; // Set the origin to (0,0)
double xvelocity = Speed * cos(Angle);
double yvelocity = Speed * sin(Angle);
Then you can implement a simple Update function as follows:
void Update(double Time) // Time is the length of this step in seconds
{
yvelocity -= 9.8 * Time; // Apply gravity
Position[0] += xvelocity * Time; // update x position
Position[1] += yvelocity * Time; // update y position
CheckCollisions(); // check for collisions
}
Of course this is a very basic example, but you can build on it from here.
Fortunately, this is pretty simple kinematics.
Those equations are parametric: For any given time t, they give you the x and y coordinates for that time. All you need to do is plug in the starting velocity v and angle a.
If you're working on level ground, the time for your projectile to come back down is simply 2sin(a)v/g, i.e. the vertical component of your velocity divided by the downward acceleration due to gravity. The 2 is because it takes that amount of time for the speed to drop to 0, then the same time again for it to accelerate back down. Once you know the time you can solve for x.
If your terrain is not flat, you have some additional fun. Something you could try is work out the time for hitting the ground at the same height, and then correct for the extra vertical distance. This will also change your horizontal distance which might again affect your height... but two or three adjustments and the error will be too small for humans to notice :)
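For the level-ground case, the arithmetic is short enough to write out. This is a sketch with made-up names (Landing, levelGroundLanding), assuming the angle is given in radians and there is no drag:

```cpp
#include <cmath>

struct Landing {
    double time;   // seconds until the projectile is back at launch height
    double range;  // horizontal distance covered in that time
};

// Level ground only: launch and landing are at the same height.
Landing levelGroundLanding(double speed, double angleRad, double g = 9.81) {
    const double t = 2.0 * speed * std::sin(angleRad) / g;
    return {t, speed * std::cos(angleRad) * t};
}
```

With a speed of 10, an angle of 45 degrees and g rounded to 10, this gives a flight time of about 1.414 s and a range of 10.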
I'm not sure you're going about this the right way. The main equation you want is s = si + vi*dt + .5*a*dt*dt. This is a simple equation of one dimension, but it generalizes to vectors cleanly.
Effectively, si is your initial position and vi is your initial velocity and a is acceleration due to gravity.
To make this work, build a vector for perfect horizontal muzzle velocity and project it on the launch angle. That's your vi. Si will be the tip of the barrel. From there it's vector summing and scaling.
Continuous functions do not work well for computers, because computers are implicitly discrete: the floating/double numbers are discrete, the timer is discrete, the game grid is discrete (even if it uses 'doubles').
So just discretize the equation the way Justin Holdsclaw suggested. Have velocity and accelerations vectors in each direction (in your case X and Y; you could also add Z). Update all vectors, and the object's position in space, at every tick.
Note that the result won't be 'exact'. The smaller your 'delta' values (grid coarseness), the closer you'll be to the 'exact' curve. To know precisely how close - if you're interested, find a book on numerical analysis and read the first few chapters. For practical purposes you can just experiment a bit.

How can I determine distance from an object in a video?

I have a video file recorded from the front of a moving vehicle. I am going to use OpenCV for object detection and recognition but I'm stuck on one aspect. How can I determine the distance from a recognized object?
I can know my current speed and real-world GPS position but that is all. I can't make any assumptions about the object I'm tracking. I am planning to use this to track and follow objects without colliding with them. Ideally I would like to use this data to derive the object's real-world position, which I could do if I could determine the distance from the camera to the object.
Your problem's quite standard in the field.
Firstly,
you need to calibrate your camera. This can be done offline (makes life much simpler) or online through self-calibration.
Calibrate it offline - please.
Secondly,
Once you have the calibration matrix of the camera K, determine the projection matrix of the camera in a successive scene (you need to use parallax as mentioned by others). This is described well in this OpenCV tutorial.
You'll have to use the GPS information to find the relative orientation between the cameras in the successive scenes (that might be problematic due to noise inherent in most GPS units), i.e. the R and t mentioned in the tutorial or the rotation and translation between the two cameras.
Once you've resolved all that, you'll have two projection matrices --- representations of the cameras at those successive scenes. Using one of these so-called camera matrices, you can "project" a 3D point M on the scene to the 2D image of the camera on to pixel coordinate m (as in the tutorial).
We will use this to triangulate the real 3D point from 2D points found in your video.
Thirdly,
use an interest point detector to track the same point in your video which lies on the object of interest. There are several detectors available, I recommend SURF since you have OpenCV which also has several other detectors like Shi-Tomasi corners, Harris, etc.
Fourthly,
Once you've tracked points of your object across the sequence and obtained the corresponding 2D pixel coordinates you must triangulate for the best fitting 3D point given your projection matrix and 2D points.
The above image nicely captures the uncertainty and how a best fitting 3D point is computed. Of course in your case, the cameras are probably in front of each other!
Finally,
Once you've obtained the 3D points on the object, you can easily compute the Euclidean distance between the camera center (which is the origin in most cases) and the point.
Note
This is obviously not easy stuff but it's not that hard either. I recommend Hartley and Zisserman's excellent book Multiple View Geometry which has described everything above in explicit detail with MATLAB code to boot.
Have fun and keep asking questions!
When you have moving video, you can use temporal parallax to determine the relative distance of objects. Parallax: (definition).
The effect would be the same as the one we get with our eyes, which can gain depth perception by looking at the same object from slightly different angles. Since you are moving, you can use two successive video frames to get your slightly different angle.
Using parallax calculations, you can determine the relative size and distance of objects (relative to one another). But, if you want the absolute size and distance, you will need a known point of reference.
You will also need to know the speed and direction being traveled (as well as the video frame rate) in order to do the calculations. You might be able to derive the speed of the vehicle using the visual data but that adds another dimension of complexity.
The technology already exists. Satellites determine topographic prominence (height) by comparing multiple images taken over a short period of time. We use parallax to determine the distance of stars by taking photos of night sky at different points in earth's orbit around the sun. I was able to create 3-D images out of an airplane window by taking two photographs within short succession.
The exact technology and calculations (even if I knew them off the top of my head) are way outside the scope of discussing here. If I can find a decent reference, I will post it here.
You need to identify the same points in the same object on two different frames taken a known distance apart. Since you know the location of the camera in each frame, you have a baseline (the vector between the two camera positions). Construct a triangle from the known baseline and the angles to the identified points. Trigonometry gives you the length of the unknown sides of the triangles for the known length of the baseline and the known angles between the baseline and the unknown sides.
You can use two cameras, or one camera taking successive shots. So, if your vehicle is moving at 1 m/s and you take frames every second, then successive frames will give you a 1m baseline which should be good to measure the distance of objects up to, say, 5m away. If you need to range objects further away, then the frames used need to be further apart - however more distant objects will be in view for longer.
Observer at F1 sees target at T with angle a1 to velocity vector. Observer moves distance b to F2. Sees target at T with angle a2.
Required to find r1, range from target at F1
The trigonometric identity for cosine gives
Cos( 90 – a1 ) = x / r1 = c1
Cos( 90 - a2 ) = x / r2 = c2
Cos( a1 ) = (b + z) / r1 = c3
Cos( a2 ) = z / r2 = c4
x is distance to target orthogonal to observer’s velocity vector
z is distance from F2 to intersection with x
Solving for r1
r1 = b / ( c3 – c1 . c4 / c2 )
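The same formula as a small C++ function, as a sketch: the name rangeFromBearings is made up, the angles are taken in radians, and cos(90 - a) is written as sin(a).

```cpp
#include <cmath>

// a1, a2: angles (radians) between the velocity vector and the line of
// sight to the target at F1 and F2; b: distance travelled from F1 to F2.
// Returns r1, the range to the target from F1. The result is meaningless
// when a1 == a2 or a2 == 0, since the two sight lines then do not form a
// triangle with the baseline.
double rangeFromBearings(double a1, double a2, double b) {
    const double c1 = std::sin(a1);  // cos(90 - a1) = x / r1
    const double c2 = std::sin(a2);  // cos(90 - a2) = x / r2
    const double c3 = std::cos(a1);  // (b + z) / r1
    const double c4 = std::cos(a2);  // z / r2
    return b / (c3 - c1 * c4 / c2);
}
```

As a check, a target 4 m ahead and 3 m to the side is 5 m away. After moving 1 m it is 3 m ahead and 3 m to the side, and feeding those two bearings and b = 1 into the function returns 5.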
Two cameras so you can detect parallax. It's what humans do.
edit
Please see ravenspoint's answer for more detail. Also, keep in mind that a single camera with a splitter would probably suffice.
use stereo disparity maps. lots of implementations are afloat, here are some links:
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT11/node4.html
http://www.ece.ucsb.edu/~manj/ece181bS04/L14(morestereo).pdf
In your case you don't have a stereo camera, but depth can be evaluated using video
http://www.springerlink.com/content/g0n11713444148l2/
I think the above will be what might help you the most.
research has progressed so far that depth can be evaluated (though not to a satisfactory extent) from a single monocular image
http://www.cs.cornell.edu/~asaxena/learningdepth/
Someone please correct me if I'm wrong, but it seems to me that if you're going to simply use a single camera and rely purely on a software solution, any processing you might do would be prone to false positives. I highly doubt that there is any processing that could tell the difference between objects that really are at the perceived distance and those which only appear to be at that distance (like "forced perspective" in movies).
Any chance you could add an ultrasonic sensor?
First, you should calibrate your camera so you can get the relation between the objects' positions in the camera plane and their positions in the real-world plane. If you are using a single camera you can use the optical flow technique.
If you are using two cameras you can use the triangulation method to find the real position (it will then be easy to find the distance of the objects), but the problem with the second method is the matching, i.e. how to find the position of an object 'x' in camera 2 if you already know its position in camera 1. Here you can use the SIFT algorithm.
I just gave you some keywords; I hope they help.
Put an object of known size in the camera's field of view. That way you can have a more objective metric to measure angular distances. Without a second viewpoint/camera you'll be limited to estimating size/distance but at least it won't be a complete guess.