How to turn these camera properties into standard calibration?

I have a product from a long-closed company whose camera properties I need to figure out. I have the following information about the camera (e.g. FOV), and wonder how to turn it into standard camera calibration parameters (focal length + principal point) so that I can construct a camera matrix to use with OpenCV. (NB: there are also radial and tangential distortion parameters, but I left those out of the listing below as they're obvious.)
## SceneCamWidth: 1280
## SceneCamHeight: 960
## SceneCamFOV: 58.410146519580450785724679008126
## SceneCamDistortionModel: None
## SceneCamSensorOffsetX: -6.82891845703125
## SceneCamSensorOffsetY: 4.581512451171875
## SceneCamPixelPitchX: 0.00375
## SceneCamPixelPitchY: 0.00375
## SceneCamPos: 0 10.5 12.5
## SceneCamOrX: 10.06933620662360873154739238089
## SceneCamOrY: 0.5876239088537422716740366013255
## SceneCamOrZ: -0.023156037041591612940516498042598
I can run a camera calibration myself to verify the outcome of the conversion (I assume calibrations are pretty repeatable), but I need to figure out what the above means in order to write a script that others can use without having to calibrate their camera.
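For what it's worth, here is one plausible reading of these fields, with the assumptions spelled out: SceneCamFOV is the full horizontal field of view in degrees, the pixel pitch is in mm per pixel, and the sensor offsets are already expressed in pixels (dividing them by the pitch would push the principal point far outside the image, so millimetres seem unlikely). Under those assumptions, a minimal sketch of the conversion would be:
import math
import numpy as np

# Values from the listing above
width, height = 1280, 960
fov_deg = 58.410146519580450785724679008126      # assumed: full horizontal FOV in degrees
offset_x, offset_y = -6.82891845703125, 4.581512451171875  # assumed: offsets in pixels
pixel_pitch = 0.00375                             # mm per pixel (square pixels)

# Focal length in pixels from the horizontal FOV
fx = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
fy = fx                                           # square pixels, so same focal length in y
f_mm = fx * pixel_pitch                           # focal length in millimetres, if needed

# Principal point: image centre shifted by the sensor offset (the sign is an assumption)
cx = (width - 1) / 2.0 + offset_x
cy = (height - 1) / 2.0 + offset_y

camera_matrix = np.array([[fx, 0.0, cx],
                          [0.0, fy, cy],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)                         # placeholder for the omitted radial/tangential terms
SceneCamPos and the SceneCamOr* values look like extrinsics (the camera's pose relative to the device) and do not enter the intrinsic matrix. Whether the FOV is horizontal, vertical or diagonal, and which way the offsets are signed, is exactly what a verification calibration like the one you describe should settle.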

Related

Crop a raster by a list of sfc_polygons

I want to crop a land use raster by 16 polygons, which are 2km-buffers around agricultural fields.
My usual way of dealing with this (crop and mask, see code below) did not work, as my polygons are a list of sfc_Polygons.
I first use st_centroid and st_buffer to build buffers around the centroids of the fields.
The buffer-object is a list of 16 sfc_POLYGONS.
The land use object is a raster of 10*10m resolution.
fields$centroid <- st_centroid(fields$geometry)
buffer <- st_buffer(fields$centroid, 2000)
class(buffer) # [1] "sfc_POLYGON" "sfc"
buffer_landuse <- crop(land_use, buffer) # Error in .local(x, y, ...) : Cannot get an Extent object from argument y
buffer_landuse <- mask(buffer_landuse, buffer)
I guess I need to convert the list of sfc_POLYGONs into individual shapefiles to use the crop and mask functions, but I have not found a solution so far. I would be happy about any help. Thanks a lot!

Find new camera matrix from two sets of input points with OpenCV

Consider the following input data as given:
CP: camera intrinsic parameters (3x3)
WP: 4x 3D world coordinates (i.e. 4 points in world space)
Now consider the following variables:
CM1: current camera location & rotation (4x4 matrix)
SP1: 4x 2D screen coordinates (i.e. 4 points in screen space)
SP2: again 4x 2D screen coordinates (i.e. 4 points in screen space)
with the constraint that SP1 is derived from looking through the camera at CM1 with camera parameters CP at WP. For clarity, let's express that relationship as follows (although this might not be the correct formula in matrix math):
SP1 = CM1 * CP * WP
I am trying to find the new camera location & rotation CM2 such that
SP2 = CM2 * CP * WP
I've been trying to use cvCalibrateCamera() but haven't been successful. I'm glad that I finally figured out how to convert all the data so that the function no longer spits out an error, but now I think that the function might not be the correct one to use in this case, as I have no idea how to apply the output data. (I have the feeling that the function is used to compute camera intrinsic parameters from input data, but the documentation is not clear to me. In case you are interested in my humble attempt: http://pastebin.com/WC8CKUhZ)
Is there a way to achieve what I am trying to compute with OpenCV? I couldn't find any function that would match my requirements better than cvCalibrateCamera() (guessed from the data I can feed into it).
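For what it's worth, the setup described here (known intrinsics CP, known 3D points WP, observed 2D points SP2, unknown pose CM2) is the classic perspective-n-point problem, which cv2.solvePnP addresses. A minimal sketch with the modern Python API, under the assumption that WP, SP2 and CP hold the data described above:
import numpy as np
import cv2

object_points = np.asarray(WP, dtype=np.float64).reshape(-1, 3)   # 4x 3D world points
image_points = np.asarray(SP2, dtype=np.float64).reshape(-1, 2)   # corresponding 2D screen points

# Default iterative solver; with only 4 points a P3P-style flag can also be passed explicitly
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, CP, None)

# rvec/tvec map world coordinates into camera coordinates; as a 4x4 matrix:
R, _ = cv2.Rodrigues(rvec)
CM2 = np.eye(4)
CM2[:3, :3] = R
CM2[:3, 3] = tvec.ravel()
Depending on how CM2 is defined in your pipeline (camera-to-world rather than world-to-camera), you may need the inverse of that matrix.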

Camera Calibration with OpenCV

I want to perform camera calibration with the OpenCV C++ API, using a set of known world-to-image point matches.
OpenCV has a function called cv::calibrateCamera as documented here. It clearly mentions that the function will deduce the intrinsic camera matrix for planar objects and that it expects the user to specify the matrix for non-planar 3D environments.
In my point correspondences, the world coordinates are not planar. And I do not have a qualified guess for the internal camera matrix.
How would I go about calibrating the camera in this case?
Currently, I am using a simple DLT based approach for the calculation using the cv::SVD::solveZ function. But I would like to use the non-linear estimation that OpenCV performs.
This page explains how to perform camera auto-calibration. This includes a method using Kruppa equations which appears to be solvable using the non-linear techniques you desire.
I was in the same situation: I had a non-planar 3D target, but I wanted to use OpenCV's non-linear LM optimization for the calibration process. (Zhang's initialization method used by OpenCV only allows for planar calibration targets.)
What you can do is extract the camera matrix from your own DLT result and use it as an initial guess for calibrateCamera. It is sufficient to do this for one pair only (image points - object points). Even though the other pairs might produce other camera matrices, they will hopefully be similar, and you need that matrix only for initialization anyway.
Note: I do assume, though, that with your own DLT you obtain a projection matrix P which maps homogeneous world points X to homogeneous image points x via x = P * X.
This would be the way to go. It is in Python, but you should be able to adapt it to your own needs:
import cv2

P = YOUR_DLT(imagePoints[0], objectPoints[0])   # your own DLT, returning a 3x4 projection matrix
cameraMatrix, _, _, _, _, _, _ = cv2.decomposeProjectionMatrix(P)
cameraMatrix /= cameraMatrix[2,2]            # ensure unit element [2,2]
cameraMatrix[0,1] = 0                        # ensure no skew
cameraMatrix[0,0] = abs(cameraMatrix[0,0])   # ensure positive focal lengths
cameraMatrix[1,1] = abs(cameraMatrix[1,1])
# ensure principal point within image (resX, resY = image resolution):
cameraMatrix[0,2] = min(resX-1, max(0, cameraMatrix[0,2]))
cameraMatrix[1,2] = min(resY-1, max(0, cameraMatrix[1,2]))
retval, cameraMatrix, distCoeffs, rvecs, tvecs = cv2.calibrateCamera(
    objectPoints, imagePoints, imageSize, cameraMatrix, None,
    flags=cv2.CALIB_USE_INTRINSIC_GUESS)     # required so the initial guess is actually used
Note: since calibrateCamera assumes cameraMatrix[2,2] == 1 and is constrained to positive focal lengths and zero skew, the camera matrix likely needs to be corrected, as I've shown in the code above.

3d camera transformation effects on 2d image pixel

I have a question. Let's say I capture an image from a camera. After that I rotate the camera by rX, rY, rZ (pitch, yaw, roll) and translate it by (Tx, Ty, Tz), then capture a second image. Where will an image pixel point (Px, Py) from the first image end up in the second image?
Px, Py (any pixel point in the image - given)
rX, rY, rZ, Tx, Ty, Tz (camera rotation and translation vectors - given)
I have to find the new position of that pixel point after the camera motion.
Is there any equation or logic to solve this problem? It may be easy, but I couldn't find the solution. Please help me.
Thanks.
Unfortunately you don't have enough information to solve the problem. Let's see if I can make a drawing here to show you why:
         /
cam1 <      (1)     (2)     (3)
         \
                     \ /
                      v
                     cam2
I hope this is clear. Let's say you take three pictures from cam1, with some object located at (1), (2) and (3). In all three cases the object is located exactly in the center of the picture.
Now you move the camera to the cam2 location, which involves a 90 degree counter clockwise rotation on Y plus some translation on X and Z.
For simplicity, let's say your Px,Py is the center of the picture. The three pictures that you took with cam1 have the same object at that pixel, so whatever equations and calculations you come up with to locate that pixel in the cam2 pictures, they will have the same input for the three pictures, so they will also produce the same output. But clearly, that will be wrong, since from the cam2 location each of the three pictures that you take will see the object in a very different position, moving horizontally across the frame.
Do you see what's missing?
If you wanted to do this properly, you would need your cam1 device to also capture a depth map, so that for each pixel you also know how far away from the camera the object represented by it was. This is what will differentiate the three pictures where the object moves farther away from the camera.
If you had the depth for Px,Py, then you can then do an inverse perspective projection from cam1 and obtain the 3D location of that pixel relative to cam1. You will then apply the inverse rotation and translation to convert the point to the 3D space relative to cam2, and then do a perspective projection from cam2 to find what will be the new pixel location.
Sorry for the bad news, I hope this helps!
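A minimal sketch of the back-project / transform / re-project pipeline described above, assuming the pixel's depth in camera 1 and the intrinsic matrix K are known, and that (rvec, tvec) map camera-1 coordinates into camera-2 coordinates (if they describe the camera's own motion instead, invert them first):
import numpy as np
import cv2

def reproject_pixel(px, py, depth, K, rvec, tvec):
    # inverse perspective projection: pixel + depth -> 3D point in cam1 coordinates
    p1 = depth * np.linalg.inv(K) @ np.array([px, py, 1.0])
    # express the point in cam2 coordinates
    R, _ = cv2.Rodrigues(np.asarray(rvec, dtype=float).reshape(3, 1))
    p2 = R @ p1 + np.asarray(tvec, dtype=float).ravel()
    # perspective projection into camera 2
    u, v, w = K @ p2
    return u / w, v / w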
You might want to read up on epipolar geometry. Without knowing anything more than the image coordinates, your corresponding pixel could be anywhere along a line in the second image.
You can search for OpenGL transformation math. The links should give you the math behind rotation and translation in 3D.
For example, this link shows:
Rotations:
Rotation about the X axis by an angle a:
| 1    0        0      0 |
| 0  cos(a)  -sin(a)   0 |
| 0  sin(a)   cos(a)   0 |
| 0    0        0      1 |
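For reference, the same rotation written as a small numpy helper (just a sketch; the angle is expected in radians):
import numpy as np

def rot_x(a):
    # 4x4 homogeneous rotation about the X axis by angle a (radians),
    # matching the matrix shown above
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0,  0, 0],
                     [0, c, -s, 0],
                     [0, s,  c, 0],
                     [0, 0,  0, 1]])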

How do you judge the (real world) distance of an object in a picture?

I am building a recognition program in C++ and to make it more robust, I need to be able to find the distance of an object in an image.
Say I have an image that was taken 22.3 inches away from an 8.5 x 11 picture. The system correctly identifies that picture in a box with the dimensions 319 pixels by 409 pixels.
What is an effective way of relating the actual height and width (AH and AW) and the pixel height and width (PH and PW) to the distance (D)?
I am assuming that when I actually go to use the equation, PH and PW will be inversely proportional to D and AH and AW are constants (as the recognized object will always be an object where the user can indicate width and height).
I don't know if you changed your question at some point, but my first answer is quite complicated for what you want. You can probably do something simpler.
1) Long and complicated solution (more general problems)
First you need to know the size of the object.
You can look at computer vision algorithms. If you know the object (its dimensions and shape), your main problem is the problem of pose estimation (that is, finding the position of the object relative to the camera), and from this you can find the distance. You can look at [1] [2] (for example; you can find other articles on this if you are interested) or search for POSIT and SoftPOSIT. You can formulate the problem as an optimization problem: find the pose that minimizes the "difference" between the real image and the expected image (the projection of the object given the estimated pose). This difference is usually the sum of the (squared) distances between each image point Ni and the projection P(Mi) of the corresponding object (3D) point Mi for the current parameters.
From this you can extract the distance.
For this you need to calibrate your camera (roughly, find the relation between the pixel position and the viewing angle).
Now you may not want to code all of this yourself; you can use computer vision libs such as OpenCV, Gandalf [3] ...
Now you may want to do something simpler (and approximate). If you can find the image distance between two points at the same "depth" (Z) from the camera, you can relate the image distance d to the real distance D with d = a D/Z (where a is a parameter of the camera related to the focal length, in pixels, which you can find using camera calibration).
2) Short solution (for you simple problem)
But here is the (simple, short) answer: if your picture is on a plane parallel to the "camera plane" (i.e. it is perfectly facing the camera) you can use:
PH = a AH / Z
PW = a AW / Z
where Z is the depth of the plane of the picture and a is an intrinsic parameter of the camera.
For reference, the pinhole camera model relates image coordinates m = (u,v) to world coordinates M = (X,Y,Z) with:
m ~ K M

[u]   [ au  as  u0 ]   [X]
[v] ~ [  0  av  v0 ] * [Y]
[1]   [  0   0   1 ]   [Z]

[u]   [ au  as ]   [X/Z]   [u0]
[v] = [  0  av ] * [Y/Z] + [v0]
where "~" means "proportional to" and K is the matrix of intrinsic parameters of the camera. You need to do camera calibration to find the K parameters. Here I assumed au=av=a and as=0.
You can recover the Z parameter from any of those equations (or take the average for both). Note that the Z parameter is not the distance from the object (which varies on the different points of the object) but the depth of the object (the distance between the camera plane and the object plane). but I guess that is what you want anyway.
[1] Linear N-Point Camera Pose Determination, Long Quan and Zhongdan Lan
[2] A Complete Linear 4-Point Algorithm for Camera Pose Determination, Lihong Zhi and Jianliang Tang
[3] http://gandalf-library.sourceforge.net/
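A tiny worked example of the short solution above, using the numbers from the question to estimate the camera constant a once and then reuse it (this assumes the recognized picture squarely faces the camera; the 300-pixel detection below is a hypothetical new measurement):
# Known calibration shot: an 11 in tall picture appears 409 px tall at 22.3 in away
AH, PH, D_known = 11.0, 409.0, 22.3
a = PH * D_known / AH        # from PH = a * AH / Z  =>  a = PH * Z / AH  (~829 px)

# Later, estimate distance from a new detection of the same picture
PH_new = 300.0               # hypothetical detected pixel height
D_new = a * AH / PH_new      # Z = a * AH / PH  (~30.4 inches)
print(round(a, 1), round(D_new, 1))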
If you know the size of the real-world object and the angle of view of the camera, you can derive the distance as follows. Assume you know the horizontal half angle of view alpha(*) and the horizontal resolution of the image xres, and you are looking for the distance dw to an object in the middle of the image that is xp pixels wide in the image and xw meters wide in the real world (how is your trigonometry?):
# Distance in "pixel space" relates to dinstance in the real word
# (we take half of xres, xw and xp because we use the half angle of view):
(xp/2)/dp = (xw/2)/dw
dw = ((xw/2)/(xp/2))*dp = (xw/xp)*dp (1)
# we know xp and xw, we're looking for dw, so we need to calculate dp:
# we can do this because we know xres and alpha
# (remember, tangent = opposite/adjacent):
tan(alpha) = (xres/2)/dp
dp = (xres/2)/tan(alpha) (2)
# combine (1) and (2):
dw = ((xw/xp)*(xres/2))/tan(alpha)
# pretty print:
dw = (xw*xres)/(xp*2*tan(alpha))
(*) alpha = the angle between the camera axis and a line going through the leftmost just-visible point on the middle row of the image.
Link to your variables:
dw = D, xw = AW, xp = PW
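The final formula as a small function (a sketch; it assumes fov_deg is the full horizontal field of view, so alpha is half of it, and that the object sits near the middle of the frame, facing the camera):
import math

def distance_from_width(xw, xp, xres, fov_deg):
    # xw: real-world width of the object (result comes back in the same unit)
    # xp: width of the object in pixels, xres: horizontal image resolution
    alpha = math.radians(fov_deg) / 2.0      # half angle of view, as defined above
    return (xw * xres) / (xp * 2.0 * math.tan(alpha))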
This may not be a complete answer but may push you in the right direction. Ever seen how NASA does it on those pictures from space, with those tiny crosses all over the images? That's how they get a fair idea about the depth and size of the object, as far as I know. The solution might be to have an object whose correct size and depth you know in the picture and then calculate the others relative to that. Time for you to do some research. If that's the way NASA does it, then it should be worth checking out.
I have got to say this is one of the most interesting questions I have seen for a long time on Stack Overflow :D. I just noticed you have only two tags attached to this question. Adding something more in relation to images might help you better.