Rate of change in pixel coordinates during camera rotation (near vs far subjects) - computer-vision

Let's say a camera is spinning horizontally with its axis of rotation at the center of the camera lens. Do subjects farther away from the camera have a different rate of change in photo x-coordinate than subjects closer to the camera while the lens is rotating? Obviously this is true when translating the camera (when driving in a car, the mountains in the distance go by more slowly than the stop sign), but after playing around a bit and doing some at-home experiments I couldn't find any evidence that suggests there is a difference when rotating...

I don't know the answer for sure, but I will share my thoughts.
Let's pretend we are using a camera with an FOV (field of view) of 90 degrees or so. Let's start the camera at some perpendicular distance away from two same-sized objects that are aligned in a straight line. The camera does not yet lie on that line.
As we translate the camera towards the two objects to bring it into line with them, the object that is further away will enter the image before the closer object, due to the triangular FOV. The further object appears first, but its x-coordinate in the resulting image shifts more slowly than the closer object's.
Now we stop the camera when it is in a straight line with the other two objects. The further object is behind the closer object, so it cannot be seen. I think that no matter how we rotate the camera, we will not be able to see the further object behind the closer one, and I also think changing the FOV will not help us here. This means there is no difference in the rate of change of each object's x-coordinate: if there were, the two objects would drift apart in the image, and we would be able to see the further object peek out from behind the closer one. We would have created an x-ray vision camera!
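That intuition can be checked directly with an ideal pinhole model. Here is a minimal numpy sketch (mine, not from the original post): two points on the same ray from the camera center but at different depths keep identical image x-coordinates under any rotation about that center. Equivalently, a point at horizontal bearing phi projects to x = f * tan(phi - theta), which contains no depth term at all.

import numpy as np

def project(P, f=1.0):
    # Ideal pinhole projection: image x-coordinate of 3D point P.
    return f * P[0] / P[2]

def rot_y(theta):
    # Rotation about the camera's vertical (y) axis.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Two points on the same ray from the camera center, at different depths.
near = np.array([1.0, 0.0, 5.0])
far = near * 4.0  # same direction, four times the distance

for theta in np.linspace(0.0, 0.3, 4):
    R = rot_y(theta)
    # Rotating the camera by R maps world points to R.T @ P in camera coords.
    print(project(R.T @ near), project(R.T @ far))  # always equal

The printed pairs are identical at every angle: under pure rotation about the optical center, pixel motion depends only on where a point is in the image, never on how far away it is (the induced image warp is a depth-independent homography).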

Related

Calculating the rotation and translation (external parameters) of multiple cameras relative to a single camera?

Given a set of 5 cameras positioned as shown in the image below, which capture the top, front, rear, left and right views of an object placed in the center.
Also given that the origin of the world coordinate system is assumed to be at the top view (which is therefore used as the reference view), how do I go about calculating the rotation and translation (external parameters) of the other 4 cameras relative to this top camera? The front, rear, left and right cameras have also been slanted 45 degrees (about the x axis) to capture the object in the middle.
The calculated external parameters will later be used to compute the projection matrix for each camera (the internal parameters are known).
Calibrate the extrinsic parameters with respect to an object of known shape and size which is visible to all cameras, or at least to all pairs of (reference camera, current camera).
For best results use a 3D object, not a plane. For example, a box with three unequal sides, or a dodecahedron. The latter would allow you to calibrate all cameras simultaneously, since each of them should see three faces at least. Depending on your accuracy requirements, you may need to spend some real money on getting this object machined accurately.
As for software, you can of course whip up a script to do it using OpenCV, or just use a CG tool like Blender, where visualization of the results may be much easier.
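To make the OpenCV route concrete, here is a hedged sketch (my code, untested; the point arrays are assumed to come from your own corner detection of the calibration object): cv2.solvePnP recovers each camera's pose relative to the object, and composing two such poses yields the extrinsics relative to the reference (top) camera.

import cv2
import numpy as np

def object_pose(object_pts, image_pts, K, dist):
    # Pose (R, t) of the known calibration object in one camera's frame.
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
    if not ok:
        raise RuntimeError("solvePnP failed")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec

def relative_pose(R_ref, t_ref, R_cam, t_cam):
    # From X_cam = R_cam X_obj + t_cam and X_ref = R_ref X_obj + t_ref,
    # eliminate X_obj to express camera `cam` in the reference camera's frame.
    R_rel = R_cam @ R_ref.T
    t_rel = t_cam - R_rel @ t_ref
    return R_rel, t_rel

def projection_matrix(K, R_rel, t_rel):
    # 3x4 projection matrix with the reference camera as the world origin.
    return K @ np.hstack([R_rel, t_rel.reshape(3, 1)])

Repeat relative_pose for the front, rear, left and right cameras against the top camera's pose to get all four sets of extrinsics.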

OpenCV Camera to Object Horizontal Angle Calculation

So, I'm a high school student and the lead programmer on my local robotics team, and this year I decided to try out OpenCV and do some vision processing on our robot.
From my vision code, I need to know a few things about some objects on our competition field. These things are: distance (ft), horizontal angle from camera, and horizontal distance from camera (ft). Essentially, one large right triangle.
I already have the camera successfully detecting these objects and putting a boundingRect around them. With a gyroscope on our robot, we should be able to get our robot to a ~90 degree angle to the object once it is detected (as it's at a set angle on the field). Thus, I can calculate distance based on an empirically derived function of the area of the object's boundingRect.
The horizontal angle of the object from the camera, however, I'm not exactly sure how to approach. Once I have that, though, I can do some simple trig and get the horizontal distance.
So here's what we have/know: distance to the object in ft, the object is at ~90 degrees to the camera, the camera has a horizontal FOV of 67 degrees with a resolution of 800x600, the real-world dimensions of the object, and a boundingRect around the object.
How would I, using all of this information, calculate the horizontal angle from the camera to the object?
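One standard way to do this, sketched under the assumption of an undistorted pinhole camera (my code, not a known-good solution): convert the horizontal FOV into a focal length in pixels, then take the arctangent of the boundingRect center's horizontal offset from the image center. Note that linearly interpolating across the FOV would be slightly off; the true mapping goes through a tangent.

import math

def horizontal_angle(bbox_center_x, image_width=800, hfov_deg=67.0):
    # Signed horizontal angle (degrees) from the optical axis to the object;
    # positive means the object is to the right of the image center.
    # Half the image width subtends half the horizontal FOV:
    fx = (image_width / 2) / math.tan(math.radians(hfov_deg) / 2)
    return math.degrees(math.atan((bbox_center_x - image_width / 2) / fx))

print(horizontal_angle(600))  # a boundingRect centered at x=600 -> ~18.3 deg

With that angle and the distance from your area-based function, the horizontal distance is then distance * sin(angle).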

My plane is not vertical: how to update point cloud coordinates to lie on a vertical plane

I have a bunch of points lying on a vertical plane. In reality this plane should be exactly vertical, but when I visualize the point cloud there is a slight inclination (nearly 2 degrees) from vertical. At the moment I can only calculate this inclination. As for other errors, I assume there are no shifts or similar distortions.
So, I want to update the coordinates of my point data so that they lie on the vertical plane. I think I should apply some kind of transformation, perhaps just a rotation about the X-axis, but I am not sure.
I guess you understand my question. Honestly, I am poor at mathematics, so please let me know how to update my point coordinates to lie on the exact vertical plane.
Note: As I am implementing this in C++ and there are many programmers who have sound knowledge of these things, I am posting this question under c++.
UPDATES
To explain exactly what I have done so far:
I have point cloud data representing a vertical object plus its surroundings. (The data was collected by a moving scanner and may have axis deviations from the correct world axes.) The problem is that I cannot say for sure whether there is an error in my data or not, so I checked using a vertical planar object (which is also the dominant object in my data). In reality that plane is truly vertical, but when I fit a plane after removing outliers, the fitted plane is not truly vertical and has nearly a 2 degree inclination. Therefore I suspect my data has some error, and I want to update all my points (both the points on the plane and the points representing other objects) in a way that lays those planar points exactly on the vertical plane. Then, I guess, all the points will be updated to their correct real-world positions. That is, all (x, y, z) coordinates should be updated.
As an example, please refer to the figure below.
The left side represents the original point cloud (as you can see, the points themselves are not vertical); the black line shows the vertical plane that I fitted, and the red line is the zenith line. As you can see, the fitted plane is inclined.
So, I want to update my whole data set as in the right figure. Then, after updating, if I fit a plane again (removing outliers), it should be exactly parallel to the zenith line. Please help me.
I may be able to help you out, considering I worked with planes recently. First of all, how come the points aren't coplanar from the get-go? I'd make the points coplanar in the first place, instead of having them at an inclination (from what origin?) and then having to fix them. Also, having the points be coplanar on your first go would be more efficient.
Sorry if this is the answer you're not looking for, but I need more information before I can help you out. Also, 3D math is hard. If you work with it enough, it starts to get pounded into your head, where you will NEVER forget it, especially if you went through the headaches I had to go through.
I did a bit of thinking on it, and since you want to rotate about the x-axis, your rotation will be done in the xz-plane, which means we can make this a 2D problem. After doing a bit of research on Wikipedia, this may be your solution (rotating by angle around the point (intended x, 0) in the xz-plane):
new x = ((x - intended x) * cos(angle)) - (z * sin(angle)) + intended x
new z = ((x - intended x) * sin(angle)) + (z * cos(angle))
What I'm doing here is subtracting our intended x value from our current x value, so that (intended x, 0) becomes the origin to rotate around. After the point is rotated, I add intended x back to the x-coordinate so that we get the correct result.
Depending on where you got your points from (some kind of measurement, I guess) and what you want to do with them, there are several different things you could do with your data.
The search keyword "regression plane" might help - there are several ways of finding planes approximating point clouds, and several ways to "snap" points to planes.
Edit: You want to apply a rotation around the axis defined by the cross product of the normal vector of your regression plane and the normal of your desired plane, about a point of your choice. From your illustration I take it that you probably want the bottom of your vertical planar object to be the point of reference for the rotation.
So you've got your point of reference, you know the axis around which you want to rotate, and the angle. All you need to do then (see the sketch after this list) is:
Translation (to bring your point of reference to the origin)
Rotation
Translation back (to undo the first translation)
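Here is a minimal numpy sketch of that recipe (my code, with made-up names): derive the axis and angle from the fitted and desired plane normals, build the rotation with Rodrigues' formula, and apply it about the chosen reference point.

import numpy as np

def rotation_matrix(axis, angle):
    # Rodrigues' formula: rotation by `angle` about the unit vector `axis`.
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def level_cloud(points, n_fitted, n_target, ref_point):
    # Rotate the whole N x 3 cloud so the fitted normal matches the target one.
    n1 = n_fitted / np.linalg.norm(n_fitted)
    n2 = n_target / np.linalg.norm(n_target)
    axis = np.cross(n1, n2)  # degenerate if the normals are already parallel
    angle = np.arccos(np.clip(np.dot(n1, n2), -1.0, 1.0))
    R = rotation_matrix(axis, angle)
    # Translate to the reference point, rotate, translate back.
    return (points - ref_point) @ R.T + ref_point

For your ~2 degree tilt, n_fitted would be the normal of your regression plane and n_target the same normal with its vertical component removed (and renormalized), so that the corrected plane is exactly vertical.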
I read your question again, and hopefully this answer will help you out. If there's anything else I need to know, please tell me.
In order to rotate anything, there must be a center point to rotate around. You've already been able to detect the angle of inclination, so now we need a formula for rotating a point a certain angle around an origin. In addition, since this problem only occurs on a 2D plane, we can use this basic formula to readjust the points. For any two axes x and y:
x' = (x - x.origin) * cos(theta) - (y - y.origin) * sin(theta) + x.origin
y' = (x - x.origin) * sin(theta) + (y - y.origin) * cos(theta) + y.origin
Theta is the angle that you will rotate through in a counter-clockwise direction, x' and y' are your new points, and x.origin and y.origin are the coordinates of the point you will be rotating around. I don't know if my math is 100% correct on this, but if it's not, hopefully you can change a thing or two and it will work.

Trying to implement a mouse look "camera" in OpenGL/SFML

I've been using OpenGL with SFML 1.6 for some time now, and it has been a blast! With one exception: I can't seem to implement a camera class correctly. You see, I am trying to create a C++ class called "Camera". Here are my functions:
Camera::Strafe(float fSpeed)
checks whether the WASD keys are pressed, and if so, move the camera at "fSpeed" in their respective directions.
Camera::MouseMove(int currentX, int currentY)
should provide a first-person mouse look, taking in the current mouse coordinates and rotating the camera accordingly. My Strafe() implementation works fine, but I can't seem to get MouseMove() right.
I already know from reading other resources on OpenGL mouse-look implementations that I must re-center the mouse after every frame, and I have that part down. But that's about it; I can't figure out how to actually rotate the camera on the spot from the mouse coordinates. I bet I need to use some trig.
I've done something similar to this (it was a 3rd person camera). If I remember what I did correctly, I took the change in mouse position and used that to calculate two angles (I did that with some trig, I believe). One angle gave me horizontal rotation, the other gave me vertical rotation. Pitch, Yaw and Roll specifically, although I can't remember which refers to which direction. There is also one you have to do before the other, or else things will rotate funny. I'm pretty sure it was pitch first, then yaw or roll.
Hopefully it is obvious what the change in mouse position did: it gave me mouse sensitivity. If I moved the mouse fast, I would get a larger change, and so I would rotate "faster."
EDIT: Ok, I looked at my code and it's a very simple calculation.
This was done with C#, so bear with me for syntax:
_angles.X += MathHelper.ToDegrees(changeInX / 100);
_angles.Y += MathHelper.ToDegrees(changeInY / 100);
My angles were stored in a two-dimensional vector (since I only rotated on two axes). You'll see I took my changeInX and changeInY values and simply divided them by 100 to get some arbitrary radian value, then converted that number to degrees. Adjust the 100 for sensitivity. Keep in mind, no solid math was done here to figure this out; I just did some trial and error until I got something that worked well.
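For reference, here is the same idea as a minimal, framework-free sketch in Python (my translation, not the original code): accumulate yaw and pitch from the per-frame mouse delta, clamp the pitch so the camera can't flip, and derive the look direction with basic trig.

import math

class MouseLook:
    def __init__(self, sensitivity=0.005):
        self.yaw = 0.0    # horizontal rotation, radians
        self.pitch = 0.0  # vertical rotation, radians
        self.sensitivity = sensitivity

    def mouse_move(self, dx, dy):
        # dx, dy: mouse offset from the window center this frame.
        self.yaw += dx * self.sensitivity
        self.pitch -= dy * self.sensitivity
        limit = math.radians(89.0)  # stop just short of straight up/down
        self.pitch = max(-limit, min(limit, self.pitch))

    def direction(self):
        # Unit look direction, usable as eye + direction in gluLookAt.
        cp = math.cos(self.pitch)
        return (cp * math.sin(self.yaw),
                math.sin(self.pitch),
                -cp * math.cos(self.yaw))

Rebuilding the direction from the two angles every frame, as above, sidesteps the order-of-rotation issue mentioned earlier.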

How can I determine distance from an object in a video?

I have a video file recorded from the front of a moving vehicle. I am going to use OpenCV for object detection and recognition, but I'm stuck on one aspect: how can I determine the distance to a recognized object?
I can know my current speed and real-world GPS position but that is all. I can't make any assumptions about the object I'm tracking. I am planning to use this to track and follow objects without colliding with them. Ideally I would like to use this data to derive the object's real-world position, which I could do if I could determine the distance from the camera to the object.
Your problem's quite standard in the field.
Firstly,
you need to calibrate your camera. This can be done offline (makes life much simpler) or online through self-calibration.
Calibrate it offline - please.
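For the offline route, here is a minimal sketch using OpenCV's standard chessboard calibration (my code; the board dimensions, square size, and list of grayscale frames are placeholders you would substitute):

import cv2
import numpy as np

def calibrate(frames, board=(9, 6), square=0.025):
    # 3D corner positions of the board in its own plane (z = 0), in meters.
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square

    obj_pts, img_pts = [], []
    for gray in frames:  # grayscale views of the board in varied poses
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)

    # Returns the calibration matrix K and the lens distortion coefficients.
    _, K, dist, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, frames[0].shape[::-1], None, None)
    return K, dist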
Secondly,
Once you have the calibration matrix of the camera K, determine the projection matrix of the camera in a successive scene (you need to use parallax as mentioned by others). This is described well in this OpenCV tutorial.
You'll have to use the GPS information to find the relative orientation between the cameras in the successive scenes (that might be problematic due to noise inherent in most GPS units), i.e. the R and t mentioned in the tutorial or the rotation and translation between the two cameras.
Once you've resolved all that, you'll have two projection matrices, i.e. representations of the cameras at those successive scenes. Using one of these so-called camera matrices, you can "project" a 3D point M in the scene onto the camera's 2D image at pixel coordinate m (as in the tutorial).
We will use this to triangulate the real 3D point from 2D points found in your video.
Thirdly,
use an interest point detector to track the same point in your video which lies on the object of interest. There are several detectors available; I recommend SURF since you have OpenCV, which also offers several other detectors like Shi-Tomasi corners, Harris, etc.
Fourthly,
Once you've tracked points of your object across the sequence and obtained the corresponding 2D pixel coordinates, you must triangulate for the best fitting 3D point given your projection matrices and 2D points.
Triangulation finds the 3D point that best fits the two back-projected rays, given the uncertainty in each. Of course, in your case, the cameras are probably in front of each other!
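Tying the projection matrices, triangulation, and final distance together, here is a hedged OpenCV sketch (mine; it assumes you already have K, matched pixel coordinates between the two frames as 2 x N float arrays, and the relative pose R, t estimated from your GPS/heading data):

import cv2
import numpy as np

def distances_from_two_frames(K, pts1, pts2, R, t):
    # Projection matrices for the two successive camera poses; the first
    # camera is taken as the world origin.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t.reshape(3, 1)])
    # Triangulate the matched 2D points into homogeneous 3D points (4 x N).
    Xh = cv2.triangulatePoints(P1, P2, pts1, pts2)
    X = Xh[:3] / Xh[3]  # Euclidean 3D points
    # Euclidean distance from the first camera center (the origin).
    return np.linalg.norm(X, axis=0)

Note that with a GPS-derived translation the result is in real-world units; with an unknown baseline you would only recover distances up to scale.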
Finally,
Once you've obtained the 3D points on the object, you can easily compute the Euclidean distance between the camera center (which is the origin in most cases) and the point.
Note
This is obviously not easy stuff but it's not that hard either. I recommend Hartley and Zisserman's excellent book Multiple View Geometry which has described everything above in explicit detail with MATLAB code to boot.
Have fun and keep asking questions!
When you have moving video, you can use temporal parallax to determine the relative distance of objects. Parallax: (definition).
The effect would be the same one we get with our eyes, which gain depth perception by looking at the same object from slightly different angles. Since you are moving, you can use two successive video frames to get your slightly different angle.
Using parallax calculations, you can determine the relative size and distance of objects (relative to one another). But, if you want the absolute size and distance, you will need a known point of reference.
You will also need to know the speed and direction being traveled (as well as the video frame rate) in order to do the calculations. You might be able to derive the speed of the vehicle using the visual data but that adds another dimension of complexity.
The technology already exists. Satellites determine topographic prominence (height) by comparing multiple images taken over a short period of time. We use parallax to determine the distance of stars by taking photos of the night sky at different points in Earth's orbit around the Sun. I was able to create 3-D images out of an airplane window by taking two photographs in short succession.
The exact technology and calculations (even if I knew them off the top of my head) are way outside the scope of discussing here. If I can find a decent reference, I will post it here.
You need to identify the same points on the same object in two different frames taken a known distance apart. Since you know the location of the camera in each frame, you have a baseline (the vector between the two camera positions). Construct a triangle from the known baseline and the angles to the identified points. Trigonometry gives you the lengths of the unknown sides of the triangle from the known length of the baseline and the known angles between the baseline and the unknown sides.
You can use two cameras, or one camera taking successive shots. So, if your vehicle is moving at 1 m/s and you take frames every second, then successive frames will give you a 1 m baseline, which should be good for measuring the distance of objects up to, say, 5 m away. If you need to range objects further away, the frames used need to be further apart; fortunately, more distant objects stay in view for longer.
The observer at F1 sees the target at T at angle a1 to the velocity vector. The observer moves distance b to F2 and sees the target at T at angle a2. We are required to find r1, the range to the target from F1.
Basic right-triangle trigonometry gives
cos(90 - a1) = x / r1 = c1
cos(90 - a2) = x / r2 = c2
cos(a1) = (b + z) / r1 = c3
cos(a2) = z / r2 = c4
where x is the distance to the target orthogonal to the observer's velocity vector, and z is the distance from F2 to the intersection with x.
Solving for r1:
r1 = b / (c3 - c1 * c4 / c2)
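A quick numeric sanity check of that formula (my sketch): place a target at a known position, derive a1 and a2 from the geometry, and confirm the recovered range.

import math

# Ground truth: target 4 m ahead of F1 along the velocity vector and 3 m to
# the side, so the true range from F1 is 5 m; the baseline b is 1 m.
x, d1, b = 3.0, 4.0, 1.0
z = d1 - b                 # distance from F2 to the foot of the perpendicular
a1 = math.atan2(x, d1)     # angle to the target at F1
a2 = math.atan2(x, z)      # angle to the target at F2

c1, c2 = math.sin(a1), math.sin(a2)  # cos(90 - a) = sin(a)
c3, c4 = math.cos(a1), math.cos(a2)
print(b / (c3 - c1 * c4 / c2))       # 5.0, matching math.hypot(x, d1)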
Two cameras so you can detect parallax. It's what humans do.
edit
Please see ravenspoint's answer for more detail. Also, keep in mind that a single camera with a splitter would probably suffice.
Use stereo disparity maps. Lots of implementations are afloat; here are some links:
http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT11/node4.html
http://www.ece.ucsb.edu/~manj/ece181bS04/L14(morestereo).pdf
In your case you don't have a stereo camera, but depth can be evaluated using video:
http://www.springerlink.com/content/g0n11713444148l2/
I think the above will be what helps you the most.
Research has progressed so far that depth can be evaluated (though not to a satisfactory extent) even from a single monocular image:
http://www.cs.cornell.edu/~asaxena/learningdepth/
Someone please correct me if I'm wrong, but it seems to me that if you're going to use a single camera and rely purely on a software solution, any processing you might do would be prone to false positives. I highly doubt there is any processing that could tell the difference between objects that really are at the perceived distance and those which only appear to be at that distance (like the "forced perspective" used in movies).
Any chance you could add an ultrasonic sensor?
First, you should calibrate your camera so you can get the relation between object positions in the image plane and their positions in the real-world plane. If you are using a single camera, you can use the optical flow technique.
If you are using two cameras, you can use triangulation to find the real position (it will be easy to find the distance of the objects), but the problem with the second method is matching: how do you find the position of an object 'x' in camera 2 if you already know its position in camera 1? Here you can use the SIFT algorithm.
I just gave you some keywords; I hope they help you.
Put an object of known size in the camera's field of view. That way you can get a more objective metric for measuring angular distances. Without a second viewpoint/camera you'll be limited to estimating size/distance, but at least it won't be a complete guess.
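A sketch of that known-size trick (mine; fx, the focal length in pixels, would come from a one-off calibration or the FOV spec of your camera): similar triangles give the distance directly from the apparent width.

def distance_from_width(pixel_width, real_width_m, fx):
    # Pinhole model: pixel_width / fx = real_width_m / distance.
    return fx * real_width_m / pixel_width

# Example: with fx = 600 px, a 0.5 m wide object spanning 120 px
# is about 2.5 m away.
print(distance_from_width(120, 0.5, 600.0))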