Transform Rectangle to trapezoid for perspective - c++

I have a picture taken from a front view, and I want to turn it into a bird's-eye view.
For each point (x, y) in the rectangle, I want to calculate the transformed (x, y) in the trapezoid.
There must be a formula for this transformation, given x and y and the angle of the trapezoid (a).
I am programming in C and using OpenCV.
Thanks a lot in advance.

Did you consider the homography transform? It is used to create or correct perspective in an image, and I think it is exactly what you want.
With OpenCV, you can use cv::findHomography(). Its arguments are the 4 initial points (the vertices of your rectangle) and the 4 final points (the vertices of the trapezoid). It returns a transformation matrix that you can then use with cv::warpPerspective() (to warp a whole image) or cv::perspectiveTransform() (to map individual points).
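For the per-point formula the question asks about, a homography H maps (x, y) to x' = (h11*x + h12*y + h13) / (h31*x + h32*y + h33) and y' = (h21*x + h22*y + h23) / (h31*x + h32*y + h33). The sketch below is a plain-Python illustration of that formula, not OpenCV's implementation. The helper names (solve_linear, homography_from_4_points, map_point) and the corner values are made up for the example, and h33 is fixed to 1.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # pick the row with the largest entry in this column as the pivot
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for k in range(col, n + 1):
                    m[r][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(src, dst):
    """Return the 3x3 matrix H (h33 = 1) that maps each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # two linear equations per correspondence, unknowns h11..h32
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def map_point(h, x, y):
    """Apply the homography to one point, including the perspective divide."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical 400x300 rectangle mapped onto a trapezoid with a narrower top edge.
rect = [(0, 0), (400, 0), (400, 300), (0, 300)]
trapezoid = [(100, 0), (300, 0), (400, 300), (0, 300)]
H = homography_from_4_points(rect, trapezoid)
print(map_point(H, 200, 150))  # the rectangle's center lands at roughly (200, 100)
```

The trapezoid angle from the question does not appear directly: it is implied by where the four corners are placed.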

I was able to figure out a solution to your problem.
Here is the code I used:
Importing the required packages:
import cv2
import numpy as np
Reading the image to be used:
filename = '1.jpg'
img = cv2.imread(filename)
cv2.imwrite('img.jpg',img)
Storing the height and width of the image in separate variables:
ih, iw, _ = img.shape
Creating a black window whose size is bigger than that of the image and storing its height and width in separate variables:
black = np.zeros((ih + 300, iw + 300, 3), np.uint8)
cv2.imwrite('black.jpg',black)
bh, bw, _ = black.shape
Storing the 4 corner points of the image in an array:
pts_src = np.array([[0.0, 0.0],[float(iw), 0.0],[float(iw), float(ih)],[0.0,float(ih)]])
Storing the 4 corner points of the trapezoid to be obtained:
pts_dst = np.array([[bw * 0.25, 0],[bw * 0.75, 0.0],[float(bw), float(bh)],[0.0,float(bh)]])
Calculating the homography matrix using pts_src and pts_dst:
h, status = cv2.findHomography(pts_src, pts_dst)
Warping the given rectangular image into the trapezoid:
im_out = cv2.warpPerspective(img, h, (black.shape[1],black.shape[0]))
cv2.imwrite("im_outImage.jpg", im_out)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you alter the values in the array pts_dst you will be able to get different kinds of quadrilaterals.

Related

How to convert 2D (x, y) coordinates into 3D (x, y, z) coordinates using Python and a point cloud?

I have been using this GitHub repo: https://github.com/aim-uofa/AdelaiDepth/blob/main/LeReS/Minist_Test/tools/test_shape.py
I am trying to figure out how this piece of code can be used to get x, y, z coordinates:
def reconstruct_3D(depth, f):
    """
    Reconstruct depth to 3D pointcloud with the provided focal length.
    Return:
        pcd: N X 3 array, point cloud
    """
    cu = depth.shape[1] / 2
    cv = depth.shape[0] / 2
    width = depth.shape[1]
    height = depth.shape[0]
    row = np.arange(0, width, 1)
    u = np.array([row for i in np.arange(height)])
    col = np.arange(0, height, 1)
    v = np.array([col for i in np.arange(width)])
    v = v.transpose(1, 0)
I want to use these coordinates to find the distance between two people in 3D for an object detection model. Does anyone have any advice?
I know how to use 2D images with YOLO to figure out the distance between two people, based on this link: Compute the centroid of a rectangle in python
My thinking is that I can use the bounding boxes to get the corners, find the centroid of each of the two people's bounding boxes, and then use triangulation to find the hypotenuse between the two points (which is their distance).
However, I am having a hard time working out how to use a set of 3D coordinates to find the distance between two people. I can get the relative distance from my 2D model.
Given a 2D depth image and the camera's intrinsic matrix, you can convert each pixel (u, v) with depth d to a 3D point as:
z = d
x = (u - cx) * z / f
y = (v - cy) * z / f
// where (cx, cy) is the principal point and f is the focal length.
Alternatively, you can use a third-party library such as Open3D to do the same (the call below is from older Open3D releases; newer ones expose it as open3d.geometry.PointCloud.create_from_depth_image):
xyz = open3d.geometry.create_point_cloud_from_depth_image(depth, intrinsic)
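To connect this back to the distance question, here is a minimal sketch that back-projects the centroid pixel of each person's bounding box with the formulas above and measures the Euclidean distance between the two 3D points. The function names, the intrinsics (f = 500, cx = 320, cy = 240) and the pixel and depth values are all hypothetical. The result is only in real-world units if the depth map is metric; a relative depth map gives a relative distance.

```python
import math


def pixel_to_3d(u, v, depth, cx, cy, f):
    """Back-project pixel (u, v) with depth value `depth` to camera coordinates."""
    z = depth
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return (x, y, z)


def distance_3d(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical intrinsics and bounding-box centroids of two detected people.
f, cx, cy = 500.0, 320.0, 240.0
person_a = pixel_to_3d(320, 240, 2.0, cx, cy, f)  # (0.0, 0.0, 2.0)
person_b = pixel_to_3d(570, 240, 2.0, cx, cy, f)  # (1.0, 0.0, 2.0)
print(distance_3d(person_a, person_b))  # 1.0
```

In practice the depth at a single centroid pixel can be noisy, so taking the median depth over a small patch inside the box may be more robust.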

OpenCV Python: How to warpPerspective a large image based on transform inferred from small region

I am using cv2.getPerspectiveTransform() and cv2.warpPerspective() to warp an image, following Adrian Rosebrock's blog: https://www.pyimagesearch.com/2014/08...
However, in my case I can only select the region B to be warped, but I need to warp (to a top-down view) the whole larger image A.
Can the parameters of the perspective transform inferred from the smaller region B be applied to the full image A? Is that possible?
Here is one way to demonstrate that the matrix from the red square applies to the whole image in Python OpenCV.
Here I rectify the quadrilateral into a rectangle on the basis of its top and left dimensions.
Input:
import numpy as np
import cv2
import math
# read input
img = cv2.imread("red_quadrilateral.png")
hh, ww = img.shape[:2]
# specify input coordinates for corners of red quadrilateral in order TL, TR, BR, BL as x,y
input = np.float32([[136,113], [206,130], [173,207], [132,196]])
# get top and left dimensions and set to output dimensions of red rectangle
width = round(math.hypot(input[0,0]-input[1,0], input[0,1]-input[1,1]))
height = round(math.hypot(input[0,0]-input[3,0], input[0,1]-input[3,1]))
print("width:",width, "height:",height)
# set upper left coordinates for output rectangle
x = input[0,0]
y = input[0,1]
# specify output coordinates for corners of red quadrilateral in order TL, TR, BR, BL as x,y
output = np.float32([[x,y], [x+width-1,y], [x+width-1,y+height-1], [x,y+height-1]])
# compute perspective matrix
matrix = cv2.getPerspectiveTransform(input,output)
print(matrix)
# do perspective transformation setting area outside input to black
# Note that output size is the same as the input image size
imgOutput = cv2.warpPerspective(img, matrix, (ww,hh), flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=(0,0,0))
# save the warped output
cv2.imwrite("red_quadrilateral_warped.jpg", imgOutput)
# show the result
cv2.imshow("result", imgOutput)
cv2.waitKey(0)
cv2.destroyAllWindows()
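Because the output above keeps the input image size, parts of the warped image can fall outside the canvas. One possible way around that, sketched here in plain Python, is to push the four corners of the full image through the matrix, take their bounding box, and prepend a translation so that the box starts at (0, 0). The helper names and the sample matrix are hypothetical, and the sketch assumes the perspective divisor stays positive for all four corners (the horizon does not cross the image).

```python
import math


def map_point(h, x, y):
    """Apply a 3x3 perspective matrix to one point."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def fit_full_image(h, width, height):
    """Return (shifted matrix, (out_w, out_h)) so the whole warped image is visible."""
    corners = [(0, 0), (width - 1, 0), (width - 1, height - 1), (0, height - 1)]
    warped = [map_point(h, x, y) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    min_x, min_y = min(xs), min(ys)
    # translation that moves the top-left of the bounding box to the origin
    shift = [[1.0, 0.0, -min_x], [0.0, 1.0, -min_y], [0.0, 0.0, 1.0]]
    out_size = (math.ceil(max(xs) - min_x) + 1, math.ceil(max(ys) - min_y) + 1)
    return matmul3(shift, h), out_size


# Hypothetical matrix: a pure shift that would push 10 columns off the left edge.
h = [[1.0, 0.0, -10.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]
shifted, size = fit_full_image(h, 100, 50)
print(size)  # (100, 50)
```

The shifted matrix and size could then be passed to cv2.warpPerspective in place of matrix and (ww,hh).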

OpenCV C++: negative X,Y coordinates from detections (YOLO detections)

I am using the YOLO detection algorithm to predict bounding boxes, but the detection returns some values as negative integers. The cv::rectangle function nevertheless draws correct rectangles onto the images, so it is puzzling why there are negative or even very large values in the detection coordinates. Here the get_rect function returns x, y, width, height after applying NMS:
std::vector<uint> rr = get_rect(img, res[j].bbox);
Later I convert it into a format OpenCV understands:
cv::Rect r = cv::Rect(rr[0],rr[1],rr[2],rr[3]);
When I print the following values of Rect r, some of them look wrong:
r.x = 398
r.y = 1431655936
r.width = 22
r.height = -1431655867
As can be seen, the values of the y coordinate and the height are completely outside the canvas. What could be the reason for this?
I also made sure that the input image dimensions, the inference dimensions, and the output image rendering dimensions are all the same, and the cv::rectangle function draws all the rectangles correctly at their respective object locations.

How to find an Equivalent point in a Scaled down image?

I would like to calculate the corner points or contours of the star in a larger image. To do that, I scale the image down to a smaller size, where I am able to get these points clearly. Now, how do I map these points back to the original image? I'm using OpenCV with C++.
Consider a trivial example: the image size is reduced exactly by half.
So, the Cartesian coordinate (x, y) in the original image becomes coordinate (x/2, y/2) in the reduced image, and coordinate (x', y') in the reduced image corresponds to coordinate (x'*2, y'*2) in the original image.
Of course, fractional coordinates typically get rounded off in a reduced-scale image, so the exact mapping is only possible for even-numbered coordinates in this example's original image.
Generalizing this, if the image is scaled by a factor of w horizontally and h vertically, coordinate (x, y) becomes coordinate (x*w, y*h), rounded off. In the example I gave, both w and h are 1/2, or 0.5.
You should be able to figure out the values of w and h yourself and map the coordinates trivially. Of course, due to rounding off, you will not be able to compute the exact coordinates in the original image.
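As a small illustration of the mapping just described, the sketch below derives the scale factors from the two image sizes and converts coordinates in both directions. The function names and the image sizes (1920x1080 reduced to 640x360) are assumptions made for the example.

```python
def to_original(x_small, y_small, small_size, orig_size):
    """Map a point found in the reduced image back to the original image."""
    small_w, small_h = small_size
    orig_w, orig_h = orig_size
    return (x_small * orig_w / small_w, y_small * orig_h / small_h)


def to_reduced(x, y, small_size, orig_size):
    """Map a point from the original image to the reduced image, rounded to a pixel."""
    small_w, small_h = small_size
    orig_w, orig_h = orig_size
    return (round(x * small_w / orig_w), round(y * small_h / orig_h))


small, orig = (640, 360), (1920, 1080)
print(to_original(100, 50, small, orig))  # (300.0, 150.0)
print(to_reduced(301, 151, small, orig))  # (100, 50): the extra pixel is lost to rounding
```

The second call shows the rounding loss mentioned above: (301, 151) and (300, 150) in the original both land on (100, 50) in the reduced image.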
I realize this is an old question. I just wanted to add to Sam's answer above, to deal with "rounding off", in case other readers are wondering the same thing I faced.
This rounding off becomes obvious for even # of pixels across a coordinate axis. For instance, along a 1-D axis, a point demarcating the 2nd quartile gets mapped to an inaccurate value:
axis_prev = [0, 1, 2, 3]
axis_new = [0, 1, 2, 3, 4, 5, 6, 7]
w_prev = len(axis_prev) # This is an axis of length 4
w_new = len(axis_new) # This is an axis of length 8
x_prev = 2
x_new = x_prev * w_new / w_prev
print(x_new)
>>> 4.0
### mapping endpoint to endpoint gives 2 * 7 / 3 ≈ 4.67, so x_new should be 5
In Python, one strategy would be to linearly interpolate values from one axis resolution to another axis resolution. Say for the above, we wish to map a point from the smaller image to its corresponding point of the star in the larger image:
import numpy as np
from scipy.interpolate import interp1d
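x_old = np.linspace(0, 640, 641)
x_new = np.linspace(0, 768, 769)
# interp1d needs x and y arrays of equal length, so interpolate between the matching end points
f = interp1d([x_old[0], x_old[-1]], [x_new[0], x_new[-1]])
x = 35
x_prime = f(x)  # approximately 42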
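The same endpoint-to-endpoint interpolation can be written without SciPy. This is a minimal sketch under the assumption that the first and last samples of the two axes should coincide; the function name is made up for the example.

```python
def map_endpoint_aligned(x, n_old, n_new):
    """Map index x on an axis of n_old samples to an axis of n_new samples,
    keeping the first and last samples aligned."""
    return x * (n_new - 1) / (n_old - 1)


# The 4-sample to 8-sample example from above: 2 * 7 / 3 is about 4.67, which rounds to 5.
print(round(map_endpoint_aligned(2, 4, 8)))  # 5

# The 641-sample to 769-sample axes used with interp1d.
print(map_endpoint_aligned(35, 641, 769))  # 42.0
```

Whether endpoint alignment is the right convention depends on how the image was resized; other resizing schemes align pixel centers instead, which gives slightly different values.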

How to detect image gradient or normal using OpenCV

I wanted to detect an ellipse in an image. Since I was learning Mathematica at the time, I asked a question here and got a satisfactory result from the answer below, which used the RANSAC algorithm to detect the ellipse.
However, I recently needed to port it to OpenCV, and some of the functions exist only in Mathematica. One of the key functions is "GradientOrientationFilter".
Since a general ellipse has five parameters, I need to sample five points to determine one. However, the more points that must be sampled, the lower the chance of a good guess, which lowers the success rate of the ellipse detection. The Mathematica answer therefore adds another condition: the gradient of the image must be parallel to the gradient of the ellipse equation. With that condition, only three points are needed to determine an ellipse by least squares, and the result is quite good.
However, when I try to find the image gradient using the Sobel or Scharr operator in OpenCV, it is not accurate enough, which always leads to a bad result.
How can I calculate the gradient or the tangent of an image accurately? Thanks!
Result with gradient, three points
Result without gradient, five points
----------updated----------
I did some edge detection and a median blur beforehand and drew the result on the edge image. My original test image looks like this:
In general, my final goal is to detect an ellipse in a scene or on an object, something like this:
That's why I chose to use RANSAC to fit the ellipse from edge points.
As for your final goal, you may try findContours and fitEllipse in OpenCV.
The pseudocode is:
1) some image processing
2) find all contours
3) fit each contour with fitEllipse
Here is part of the code I used before:
[... image process ....you get a bwimage ]
vector<vector<Point> > contours;
// In OpenCV 3 and later these constants are named RETR_LIST, CHAIN_APPROX_NONE and LINE_AA.
findContours(bwimage, contours, CV_RETR_LIST, CV_CHAIN_APPROX_NONE);
for(size_t i = 0; i < contours.size(); i++)
{
    size_t count = contours[i].size();
    // fitEllipse needs at least 5 points
    if( count < 5 )
        continue;
    Mat pointsf;
    Mat(contours[i]).convertTo(pointsf, CV_32F);
    RotatedRect box = fitEllipse(pointsf);
    /* You can put some limitation about size and aspect ratio here */
    if( box.size.width > 20 &&
        box.size.height > 20 &&
        box.size.width < 80 &&
        box.size.height < 80 )
    {
        // skip extremely elongated fits
        if( MAX(box.size.width, box.size.height) > MIN(box.size.width, box.size.height)*30 )
            continue;
        //drawContours(SrcImage, contours, (int)i, Scalar::all(255), 1, 8);
        ellipse(SrcImage, box, Scalar(0,0,255), 1, CV_AA);
        ellipse(SrcImage, box.center, box.size*0.5f, box.angle, 0, 360, Scalar(200,255,255), 1, CV_AA);
    }
}
imshow("result", SrcImage);
If you focus on ellipses (no other shape), you can treat the pixel values of the ellipse as the masses of points.
Then you can calculate the moments of inertia Ixx, Iyy, Ixy to find the angle theta that rotates a general ellipse back to the canonical form (X-Xc)^2/a^2 + (Y-Yc)^2/b^2 = 1.
Then you can find Xc and Yc from the center of mass.
Then you can find a and b from min X and min Y (a = Xc - min X, b = Yc - min Y).
--------------- update -----------
This method also applies to filled ellipses.
It will fail if there is more than one ellipse in a single image, unless you segment them first.
Let me explain more.
I will use C to represent cos(theta) and S to represent sin(theta).
After rotation to the canonical form, the new coordinates are [eq0] X = xC - yS and Y = xS + yC, where x and y are the original positions.
The rotation will give you the minimum IYY.
[eq1]
IYY = Sum(m*Y*Y) = Sum{m*(xS+yC)*(xS+yC)} = Sum{m*(xxSS + yyCC + 2xySC)} = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C
where Ixx = Sum(m*x*x), Iyy = Sum(m*y*y) and Ixy = Sum(m*x*y).
For the minimum IYY, d(IYY)/d(theta) = 0, that is
2IxxSC - 2IyySC + 2Ixy(CC-SS) = 0
(Ixx-Iyy)/Ixy = (SS-CC)/SC = S/C - C/S = Z - 1/Z, where Z = S/C = tan(theta)
While programming, the LHS is just a number, let's call it N:
Z^2 - NZ - 1 = 0
So there are two roots for Z, and hence for theta; call them Z1 and Z2. Their product is -1, so the two directions are perpendicular: one minimizes IYY and the other maximizes it.
----------- pseudo code --------
Compute Ixx, Iyy, Ixy for a hollow or filled ellipse.
Compute theta1 = atan(Z1) and theta2 = atan(Z2).
Put these two thetas into eq1 and find which gives the smaller IYY. That is your theta.
Go back to the non-zero pixels and transform them to the new X and Y using the theta you found.
Find the center of mass Xc, Yc and min X and min Y by sort().
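The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a tested detector: every pixel has unit mass, the moments are taken about the center of mass (so the centroid is computed first rather than last), and the function name is made up. It uses IYY = Ixx*S^2 + Iyy*C^2 + 2*Ixy*S*C, whose stationary points satisfy Z^2 - N*Z - 1 = 0 with N = (Ixx - Iyy)/Ixy and Z = tan(theta).

```python
import math


def fit_ellipse_by_moments(points):
    """Estimate (xc, yc, a, b, theta) from points on, or filling, a single ellipse."""
    n = len(points)
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n
    centered = [(x - xc, y - yc) for x, y in points]
    ixx = sum(x * x for x, _ in centered)
    iyy = sum(y * y for _, y in centered)
    ixy = sum(x * y for x, y in centered)

    def iyy_rotated(theta):
        # second moment along the rotated Y axis (eq1)
        s, c = math.sin(theta), math.cos(theta)
        return ixx * s * s + iyy * c * c + 2 * ixy * s * c

    if abs(ixy) < 1e-12 * max(ixx + iyy, 1.0):
        # already axis-aligned: the stationary angles are 0 and 90 degrees
        candidates = [0.0, math.pi / 2]
    else:
        big_n = (ixx - iyy) / ixy
        root = math.sqrt(big_n * big_n + 4)
        candidates = [math.atan((big_n + root) / 2), math.atan((big_n - root) / 2)]
    theta = min(candidates, key=iyy_rotated)

    # rotate with [eq0] and read the semi-axes from the extents
    s, c = math.sin(theta), math.cos(theta)
    a = max(abs(x * c - y * s) for x, y in centered)
    b = max(abs(x * s + y * c) for x, y in centered)
    return xc, yc, a, b, theta


# Synthetic outline: semi-axes 5 and 2, center (10, 20), major axis tilted by 30 degrees.
phi = math.radians(30)
pts = []
for k in range(360):
    t = math.radians(k)
    px, py = 5 * math.cos(t), 2 * math.sin(t)
    pts.append((10 + px * math.cos(phi) - py * math.sin(phi),
                20 + px * math.sin(phi) + py * math.cos(phi)))
print(fit_ellipse_by_moments(pts))
```

With this rotation convention the recovered theta is the angle that undoes the tilt, so for the synthetic outline it comes out close to -30 degrees (in radians).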
-------------- by hand -----------
If you need the original equation of the ellipse, just substitute [eq0] into the canonical form.
You're using terms in an unusual way.
Normally for images, the term "gradient" is interpreted as if the image is a mathematical function f(x,y). This gives us a (df/dx, df/dy) vector in each point.
Yet you're looking at the image as if it's a function y = f(x) and the gradient would be df(x)/dx.
Now, if you look at your image, you'll see that the two interpretations are definitely related. Your ellipse is drawn as a set of contrasting pixels, and as a result there are two sharp gradients in the image - the inner and outer. These of course correspond to the two normal vectors, and therefore are in opposite directions.
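To make the two opposite gradients concrete, here is a small pure-Python sketch that applies the 3x3 Sobel kernels at one pixel and reports the gradient direction. The function name and the 5x5 test image (a one-pixel-wide bright vertical line) are made up for the illustration. In OpenCV the same derivatives would normally come from cv::Sobel or cv::Scharr.

```python
import math


def sobel_at(img, x, y):
    """Return (gx, gy, angle in degrees) of the Sobel gradient at pixel (x, y).
    img is a list of rows; y grows downward as in image coordinates."""
    gx = ((img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1])
          - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]))
    gy = ((img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1])
          - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]))
    return gx, gy, math.degrees(math.atan2(gy, gx))


# A bright vertical line, one pixel wide, in column 2 of a dark 5x5 image.
image = [[255 if x == 2 else 0 for x in range(5)] for y in range(5)]

print(sobel_at(image, 1, 2))  # (1020, 0, 0.0): left of the line, gradient points right
print(sobel_at(image, 3, 2))  # (-1020, 0, 180.0): right of the line, gradient points left
```

Both gradients lie along the normal of the line but point in opposite directions, which is the situation described above for the inner and outer side of the ellipse outline.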
Also note that your image has pixels. The gradient is also pixelated. The way your ellipse is drawn, with a single pixel width means that your local gradient takes on only values that are a multiple of 45 degrees:
▄▄ ▄▀ ▌ ▀▄