Calculate transformation for obtaining bounding box in perspective view - OpenGL

This problem does not depend entirely on the Point Cloud Library; rather, it is a general computer graphics problem.
Given a perspective view of a scene containing a ball, we need to obtain the bounding box of this ball. Please see a sample figure below.
If the yellow ball is projected, a bounding rectangle can be defined by two diagonal points s1 and s2. The 3D coordinates of these points were obtained by the following procedure:
Visualize the point cloud in PCLVisualizer
Define a 2D bounding box using two diagonal points specifically for this view, and get the 3D coordinates of the points using the registerPointPickingCallback() callback function (see the sketch after these steps)
Define minimum and maximum x, y coordinates by comparing diagonal points
Define minimum and maximum z coordinate based on the experimental setup
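For reference, a minimal sketch of how the picking callback in the second step might be registered (the function name is illustrative; getPoint() and registerPointPickingCallback() are the relevant PCL calls):
#include <iostream>
#include <pcl/visualization/pcl_visualizer.h>

void pickCallback (const pcl::visualization::PointPickingEvent& event, void*)
{
  float x, y, z;
  event.getPoint (x, y, z);                              // 3D coordinates of the picked point
  std::cout << "Picked: " << x << " " << y << " " << z << std::endl;
}

int main ()
{
  pcl::visualization::PCLVisualizer viewer ("Cloud Viewer");
  viewer.registerPointPickingCallback (pickCallback);    // shift+click picks a point in the viewer
  viewer.spin ();
  return 0;
}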
From the above procedure, 3D coordinates of s1 and s2 were found as shown below.
Point s1: (0.774406, 0.295306, 2.703)
Point s2: (1.37865, 0.845276, 2.834)
By comparing s1 and s2, a box was defined and addCube was used, but it didn't work. See two screenshots of the output below, taken at different camera angles.
Below is the code snippet:
double x_min = 0.774406, x_max = 1.37865;
double y_min = 0.295306, y_max = 0.845276;
double z_min = 0.2, z_max = 3.0;
pcl::visualization::PCLVisualizer viewer("Cloud Viewer");
viewer.setCameraParameters(camera);
viewer.addCube(x_min, x_max, y_min, y_max, z_min, z_max, 1, 0, 0, "bounding_box");
Fortunately, the addCube function has another overload, which provides a way to apply a translation and rotation.
My question is: using these two points, s1 and s2, and the camera, how do I define the transformation so that the bounding box contains the ball?
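For reference, a minimal sketch of how that overload could be called, building on the snippet above (the box centre and identity rotation here are placeholders, not the transformation being asked for):
#include <Eigen/Geometry>

Eigen::Vector3f translation (0.5 * (x_min + x_max),
                             0.5 * (y_min + y_max),
                             0.5 * (z_min + z_max));           // box centre, as a placeholder
Eigen::Quaternionf rotation = Eigen::Quaternionf::Identity (); // identity, i.e. still axis-aligned
viewer.addCube (translation, rotation,
                x_max - x_min, y_max - y_min, z_max - z_min,
                "bounding_box");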
Below are the links to download all required files in order to run the project:
bounding_box.cpp: The CPP file
scene.pcd: The captured PCD file used in the above CPP code
camera.cam: The camera file used in the above CPP code
Dependency: PCL 1.8

Related

Get known position in one image to another using 8-point algorithm

I have two images and know the position of a point in the first image. Now I want to get the corresponding position in the second image.
This is my idea:
I can use algorithms such as SIFT to match keypoints (as seen in the image)
I know the camera matrix using calibration with e.g. chessboards
Using the 8 point algorithm I calculate the fundamental matrix F
Can I now use F to calculate the corresponding point?
Using the fundamental matrix F alone is not enough. If you have a point in one image, you can't find its position in the second image, because it depends not only on the configuration of the cameras, but also on the distance from the camera to that point.
This can also be seen from the equation x2^T * F * x1 = 0. If you know x1 and F, then for x2 you get the equation x2^T * b = 0, where b = F * x1. This is the equation of a point x2 lying on the line b (points x1, x2 and line b are in homogeneous coordinates). Although you can't find the exact position of the point in the second image, you know that it must lie somewhere on that line.
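To make this concrete, here is a small OpenCV sketch (assuming F has already been estimated with the 8-point algorithm; the point coordinates are placeholders) that computes the epipolar line b = F * x1 on which x2 must lie:
#include <iostream>
#include <opencv2/core.hpp>

int main ()
{
    cv::Matx33d F;                        // fill with your 8-point estimate of the fundamental matrix
    cv::Vec3d x1 (100.0, 200.0, 1.0);     // x1 in homogeneous coordinates (placeholder values)

    cv::Vec3d b = F * x1;                 // epipolar line in the second image
    // Any candidate x2 = (u, v, 1) must satisfy b[0]*u + b[1]*v + b[2] = 0.
    std::cout << "Epipolar line: " << b[0] << " " << b[1] << " " << b[2] << std::endl;
    return 0;
}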
Hartley and Zisserman have a great explanation of these concepts in their book Multiple View Geometry in Computer Vision. Be sure to check it out for more details.

Extract points from PointCloud with PCL

I'm using the library PCL to read .pcd point clouds.
I need to extract a profile cut of this point cloud.
Any advice on how to implement such a feature?
Basically, I want to move a box along the point cloud and project the points present in the box onto one plane.
I already did the reading of the point cloud, but I'm a bit stuck with the extraction of the sub-point cloud.
You can use the pcl::ProjectInliers class, which does exactly that: it projects the points onto a parametric model (e.g. plane, sphere, ...). There's even a handy tutorial for it!
Here's an extract from the tutorial which creates a plane and projects the points on it:
// Create a set of planar coefficients with X=Y=0,Z=1
pcl::ModelCoefficients::Ptr coefficients (new pcl::ModelCoefficients ());
coefficients->values.resize (4);
coefficients->values[0] = coefficients->values[1] = 0;
coefficients->values[2] = 1.0;
coefficients->values[3] = 0;
// Create the filtering object
pcl::ProjectInliers<pcl::PointXYZ> proj;
proj.setModelType (pcl::SACMODEL_PLANE);
proj.setInputCloud (cloud);
proj.setModelCoefficients (coefficients);
proj.filter (*cloud_projected);
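The tutorial extract assumes that cloud and cloud_projected already exist; under that assumption, the surrounding setup might look roughly like this (the file name is a placeholder):
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/ModelCoefficients.h>
#include <pcl/filters/project_inliers.h>

pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud_projected (new pcl::PointCloud<pcl::PointXYZ>);
pcl::io::loadPCDFile ("your_cloud.pcd", *cloud);   // placeholder file name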
If you don't need an actual box, and a distance threshold to a plane will do, try this (see the sketch after these steps):
A plane can be represented by a unit normal vector (3D) and a distance, say norm = (0,0,1), d = 10. This defines the plane z = 10.
Create a point on the plane, just d * norm = (10*0, 10*0, 10*1) -> point_on_plane = (0,0,10).
The point's distance from the plane is dist = (p - point_on_plane).dot(norm).
If fabs(dist) is less than the threshold, project the point onto the plane:
projection = p - dist*norm
To iterate over all cross sections, increase d.
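A minimal sketch of those steps (assuming pcl::PointXYZ clouds; the normal, distance and threshold are just the example values from above):
#include <cmath>
#include <Eigen/Core>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

void projectSlice (const pcl::PointCloud<pcl::PointXYZ>& cloud,
                   pcl::PointCloud<pcl::PointXYZ>& slice,
                   const Eigen::Vector3f& norm,     // unit plane normal, e.g. (0,0,1)
                   float d,                         // plane distance, e.g. 10
                   float threshold)                 // half-thickness of the slice
{
  const Eigen::Vector3f point_on_plane = d * norm;           // e.g. (0,0,10) for z = 10
  for (const auto& p : cloud.points)
  {
    const Eigen::Vector3f q (p.x, p.y, p.z);
    const float dist = (q - point_on_plane).dot (norm);      // signed distance to the plane
    if (std::fabs (dist) < threshold)                        // keep only points near the plane
    {
      const Eigen::Vector3f proj = q - dist * norm;          // project onto the plane
      slice.push_back (pcl::PointXYZ (proj.x (), proj.y (), proj.z ()));
    }
  }
}
Calling this repeatedly with an increasing d sweeps the cross section through the cloud.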

Can I create a transformation matrix from rotation/translation vectors?

I'm trying to deskew an image that has an element of known size. Given this image:
I can use aruco::estimatePoseBoard, which returns rotation and translation vectors. Is there a way to use that information to deskew everything that's in the same plane as the marker board? (Unfortunately my linear algebra is rudimentary at best.)
Clarification
I know how to deskew the marker board. What I want to be able to do is deskew the other things (in this case, the cloud-shaped object) in the same plane as the marker board. I'm trying to determine whether or not that's possible and, if so, how to do it. I can already put four markers around the object I want to deskew and use the detected corners as input to getPerspectiveTransform along with the known distance between them. But for our real-world application it may be difficult for the user to place markers exactly. It would be much easier if they could place a single marker board in the frame and have the software deskew the other objects.
Since you tagged OpenCV:
From the image I can see that you have detected the corners of all the black boxes. So just get the four outermost corner points one way or another:
Then it is like this:
std::vector<cv::Point2f> src_points = {/*Fill your 4 corners here*/};
std::vector<cv::Point2f> dst_points = {cv::Point2f(0, 0), cv::Point2f(width, 0), cv::Point2f(width, height), cv::Point2f(0, height)};
auto H = cv::getPerspectiveTransform(src_points, dst_points);
cv::Mat cropped_image;
cv::warpPerspective(full_image, cropped_image, H, cv::Size(width, height));
I was stuck on the assumption that the destination points in the call to getPerspectiveTransform had to be the corners of the output image (as they are in Humam's suggestion). Once it dawned on me that the destination points could be anywhere within the output image, I had my answer.
float boardX = 1240;
float boardY = 1570;
float boardWidth = 1730;
float boardHeight = 1400;

vector<Point2f> destinationCorners;
destinationCorners.push_back(Point2f(boardX + boardWidth, boardY));
destinationCorners.push_back(Point2f(boardX + boardWidth, boardY + boardHeight));
destinationCorners.push_back(Point2f(boardX, boardY + boardHeight));
destinationCorners.push_back(Point2f(boardX, boardY));

Mat h = getPerspectiveTransform(detectedCorners, destinationCorners);
Mat bigImage(image.size() * 3, image.type(), Scalar(0, 50, 50));
warpPerspective(image, bigImage, h, bigImage.size());
This fixed the perspective of the board and everything in its plane. (The waviness of the board is due to the fact that the paper wasn't lying flat in the original photo.)

OpenCV How to rotate cv::RotatedRect?

How do I apply a transformation (e.g. a rotation) to a cv::RotatedRect?
I tried using cv::warpAffine, but that won't work, as it is meant to be applied to a cv::Mat...
You can control rotation, translation and scale directly using the internal member variables angle, center and size; see the documentation.
More general transformations require getting the vertices using points() and manipulating them with, for example, cv::warpAffine, but once you do that you will no longer have a cv::RotatedRect (by definition).
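For instance, adjusting those members directly might look like this (the numbers are arbitrary):
#include <opencv2/core.hpp>

cv::RotatedRect rect (cv::Point2f (100.f, 100.f), cv::Size2f (60.f, 30.f), 20.f);
rect.angle += 15.f;                                     // rotate by a further 15 degrees
rect.center += cv::Point2f (5.f, 0.f);                  // translate
rect.size = cv::Size2f (rect.size.width * 2.f,
                        rect.size.height * 2.f);        // scale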
If you are planning to do complex operations like affine or perspective transformations, you should deal with the points of the rotated rect, and the result may be a quad shape, not a rectangle.
cv::warpAffine works on images; for points you should use cv::transform and cv::perspectiveTransform.
They take an array of points and produce an array of points.
Example:
cv::RotatedRect rect;
//fill rect somehow
cv::Point2f rect_corners[4];
rect.points(rect_corners);
std::vector<cv::Point2f> rect_corners_transformed(4);
cv::Mat M;
//fill M with affine transformation matrix
cv::transform(std::vector<cv::Point2f>(std::begin(rect_corners), std::end(rect_corners)), rect_corners_transformed, M);
// your transformed points are in rect_corners_transformed
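For instance, M could come from cv::getRotationMatrix2D (the 30-degree angle here is just an example), which yields the 2x3 affine matrix that cv::transform expects:
M = cv::getRotationMatrix2D (rect.center, 30.0, 1.0);   // rotate 30 degrees about the rect centre, scale 1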
TLDR: Create a new rectangle.
I don't know if it will help you, but I solved a similar problem by creating a new rectangle and ignoring the old one. In other words, I calculated the new angle, and then assigned it and the values of the old rectangle (the center point and the size) to the new rectangle:
RotatedRect newRotatedRectangle(oldRectangle.center, oldRectangle.size, newAngle);

Panoramic Image Photogrammetry: How to calculate range?

Assume that I took two panoramic images with a vertical offset of H, and each image is presented in equirectangular projection with size Xm and Ym. To do this, I placed my panoramic camera at a position, say A, and took an image, then moved the camera H metres up and took another image.
I know that a point in image 1 with coordinates (X1, Y1) is the same point in image 2 with coordinates (X2, Y2) (assuming that X1 = X2, as we have only a vertical offset).
My question is: how can I calculate the range of the selected point (the point whose coordinates are (X1, Y1) in image 1 and (X2, Y2) in image 2) from point A (where the camera was when image 1 was taken)?
Yes, you can do it. The key thing is y, the focal length of your lens.
So, I think your question can be re-stated more simply by saying that if you move your camera (on the right in the diagram) up H metres, a point moves down p pixels in the image taken from the new location.
Like this, if you imagine looking from the side, across the scene as you take the picture:
If you know the micron spacing of the camera's CCD from its specification, you can convert p from pixels to metres to match the units of H.
Your range from the camera to the plane of the scene is given by x + y (both in red at the bottom), and
x=H/tan(alpha)
y=p/tan(alpha)
so your range is
R = x + y = H/tan(alpha) + p/tan(alpha)
and
alpha = tan inverse(p/y)
where y is the focal length of your lens. As y is likely to be something like 50mm, it is negligible, so, to a pretty reasonable approximation, your range is
H/tan(alpha)
and
alpha = tan inverse(p in metres/focal length)
Or, by similar triangles
Range = (H x focal length of lens) / ((Y2-Y1) x CCD photosite spacing)
being very careful to put everything in metres.
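As a sanity check, here is a tiny sketch that plugs purely illustrative numbers (not taken from the question) into that similar-triangles formula:
#include <cstdio>

int main ()
{
    // All values are illustrative placeholders.
    double H = 1.0;                    // vertical camera offset, metres
    double focal_length = 0.050;       // 50 mm lens, in metres
    double photosite_pitch = 4.0e-6;   // 4 micron CCD photosite spacing, metres
    double dy_pixels = 120.0;          // Y2 - Y1, vertical disparity in pixels

    double p = dy_pixels * photosite_pitch;   // disparity on the sensor, in metres
    double range = H * focal_length / p;      // Range = H x f / ((Y2-Y1) x pitch)
    std::printf ("Range is roughly %.1f metres\n", range);   // prints about 104.2 metres
    return 0;
}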
Here is a shot in the dark. Given my understanding of the problem at hand, you want to do something similar to computer stereo vision; I point you to http://en.wikipedia.org/wiki/Computer_stereo_vision to start. I'm not sure if this is possible to do in the manner you are suggesting; it sounds like you may need some more physical constraints, but I do remember being able to correlate two 2D points in images after undergoing a strict translation. Think:
lambda * [x, y, 1]^T = W * [r1, tx; r2, ty; r3, tz] * [x; y; z; 1]^T
where lambda is a scale factor, W is the 3x3 matrix of the intrinsic parameters of your camera, r1, r2 and r3 are the row vectors that make up the 3x3 rotation matrix (in your case you can assume the identity matrix, since you have only applied a translation), and tx, ty, tz are your translation components.
Since you are looking at two 2D points of the same 3D point [x, y, z], this 3D point is shared by both 2D points. I cannot say whether you can recover the actual x, y and z values, particularly for your depth calculation, but this is where I would start.