I need to extract an object from an image. I know the location of the object inside the image, i.e. the region where the object is located: this region is given as a pair of coordinates [xmin, ymin] and [xmax, ymax].
I would like to modify the coordinates of this region (thus increasing the height and width in a suitable way) in order to extract a subimage with a specified aspect ratio. So, we have the following constraints:
in order to avoid cutting the object incorrectly, the width and height of the region must not be reduced;
bounds checking: the adaptation of the region size must ensure that the new coordinates are inside the image;
the width/height ratio of the subimage should be approximately equal to the specified aspect ratio.
How to solve this problem?
UPDATE: one possible solution
The solution to my problem is essentially the algorithm proposed by Mark in this answer. It produces a new region, wider or taller than the original, whose aspect ratio is very close to the specified one, without moving the center of the original region (when this is feasible, depending on the position of the region within the image). The region obtained from this algorithm can then be refined by the following brute-force post-processing step, which brings the aspect ratio even closer to the specified value.
% Brute force: try all feasible enlargements of the region along the
% four directions and stop at the first one whose aspect ratio matches
% the target (a small tolerance is safer than exact float equality).
for left = 0:(xmin-1)
    for right = 0:(imgWidth-xmax)
        for top = 0:(ymin-1)
            for bottom = 0:(imgHeight-ymax)
                x1 = xmin - left;
                x2 = xmax + right;
                y1 = ymin - top;
                y2 = ymax + bottom;
                newRatio = (x2 - x1) / (y2 - y1);
                if (abs(newRatio - ratio) < 1e-9)
                    rect = [x1 y1 x2 y2];
                    return;
                end
            end
        end
    end
end
Example: an image with 976 rows and 1239 columns, and an initial region [xmin ymin xmax ymax] = [570 174 959 957].
First algorithm (main processing).
Input: the initial region and the image size.
Output: the new region r1 = [568 174 960 957], with width = 392 and height = 783, so the aspect ratio is about 0.5006.
Second algorithm (post-processing).
Input: the region r1.
Output: the new region r2 = [568 174 960 958], with width = 392 and height = 784, so the aspect ratio is exactly 0.5.
obj_width = xmax - xmin
obj_height = ymax - ymin
if (obj_width / obj_height > ratio)
{
    // Region is too wide for the target ratio: grow it vertically,
    // splitting the extra height evenly above and below.
    height_adjustment = ((obj_width / ratio) - (ymax - ymin)) / 2;
    ymin -= height_adjustment;
    ymax += height_adjustment;
    if (ymin < 0)
    {
        // Shift the region down by the amount it overflows the top edge.
        ymax -= ymin;
        ymin = 0;
    }
    if (ymax >= image_height)
        ymax = image_height - 1;
}
else if (obj_width / obj_height < ratio)
{
    // Region is too tall for the target ratio: grow it horizontally,
    // splitting the extra width evenly left and right.
    width_adjustment = ((obj_height * ratio) - (xmax - xmin)) / 2;
    xmin -= width_adjustment;
    xmax += width_adjustment;
    if (xmin < 0)
    {
        // Shift the region right by the amount it overflows the left edge.
        xmax -= xmin;
        xmin = 0;
    }
    if (xmax >= image_width)
        xmax = image_width - 1;
}
Let's start with your region: a w x h rectangle centered on a point p. You want to extend this region to have the aspect ratio r. The idea is to extend the width or the height:
(trivial case) If w / h == r, then return.
Compute w' = h x r.
If w' > w, then the resulting region is of width w', height h and center p.
Else, the resulting region is of width w, height h' = w / r, and center p.
Move the center p to follow the edges of the image if the region has to be clipped. For example, if the upper-left point of the resulting region falls outside the image, let u be that upper-left point and d = (min(u.x, 0), min(u.y, 0)); the final center is then p' = p - d, which shifts the region back inside. Handle the lower-right corner of the region similarly.
Clip the resulting region to the image.
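A minimal C++ sketch of these steps, assuming a simple axis-aligned rectangle with inclusive pixel coordinates (the struct Rect and the function expandToRatio are my own names, not from any library):

#include <algorithm>

// Axis-aligned region [xmin, xmax] x [ymin, ymax], inclusive coordinates.
struct Rect { double xmin, ymin, xmax, ymax; };

// Grow r so that width / height is approximately ratio, keeping the center
// fixed when possible, then shift and clip it to fit a W x H image.
Rect expandToRatio(Rect r, double ratio, double W, double H)
{
    double w = r.xmax - r.xmin, h = r.ymax - r.ymin;
    if (w / h < ratio) {            // too tall: grow the width to h * ratio
        double extra = (h * ratio - w) / 2;
        r.xmin -= extra; r.xmax += extra;
    } else if (w / h > ratio) {     // too wide: grow the height to w / ratio
        double extra = (w / ratio - h) / 2;
        r.ymin -= extra; r.ymax += extra;
    }
    // Move the center to follow the image edges (the p' = p - d step).
    if (r.xmin < 0) { r.xmax -= r.xmin; r.xmin = 0; }
    if (r.ymin < 0) { r.ymax -= r.ymin; r.ymin = 0; }
    if (r.xmax > W - 1) { r.xmin -= r.xmax - (W - 1); r.xmax = W - 1; }
    if (r.ymax > H - 1) { r.ymin -= r.ymax - (H - 1); r.ymax = H - 1; }
    // Final clip, in case the target region is larger than the image.
    r.xmin = std::max(r.xmin, 0.0); r.ymin = std::max(r.ymin, 0.0);
    r.xmax = std::min(r.xmax, W - 1); r.ymax = std::min(r.ymax, H - 1);
    return r;
}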
Context:
Page 8 of this lecture says that the OpenCV HoughLines function returns an N x 2 array of line parameters rho and theta, which is stored in the array called lines.
Then, in order to actually draw the lines from these parameters, we apply some formulae and finally use the line function. The formulae are explained in the comments of the code below.
Code:
// Assuming we start our program with the Input Image shown below.

// This vector will store rho and theta as an N x 2 array.
vector<Vec2f> lines;

// The input bw_roi is a Canny image with detected edges.
HoughLines(bw_roi, lines, 1, CV_PI/180, 70, 0, 0);

// The formulae below estimate two points on each line from rho and theta.
for (size_t i = 0; i < lines.size(); i++)
{
    float rho = lines[i][0], theta = lines[i][1];
    Point2d pt1, pt2;
    double a = cos(theta), b = sin(theta);
    double x0 = a*rho, y0 = b*rho;
    // When we use 1000 below we get the Observation 1 output,
    // but if we use 200 we get the Observation 2 output.
    pt1.x = cvRound(x0 + 1000*(-b));
    pt1.y = cvRound(y0 + 1000*(a));
    pt2.x = cvRound(x0 - 1000*(-b));
    pt2.y = cvRound(y0 - 1000*(a));
    // The line function is independent of HoughLines and is used
    // for drawing any type of line in OpenCV.
    line(frame, pt1, pt2, Scalar(0,0,255), 3, LINE_AA);
}
Input Image:
Observation 1:
Observation 2:
Problem:
In the code shown above, if we play around with the number that a, -a, b and -b are multiplied by, we get lines of different lengths. Observation 2 was obtained when I multiplied by 200 instead of 1000 (which led to Observation 1).
For more information, please refer to the comments about this multiplier in the code shown above.
Question:
When we draw lines from the HoughLines output, how can we control where each line begins and ends?
For instance, I want the right lane in Observation 2 (the red line running from the top left corner towards the bottom right) to begin at the bottom right of the screen and point towards the top left, like a mirror image of the left lane.
Given
a = cos(theta)
b = sin(theta)
x0 = a * rho
y0 = b * rho
you can write the formula for all points lying on the line defined by (rho, theta) as
x = x0 - c * b
y = y0 + c * a
where c is the signed distance from the reference point (x0, y0), the point where the line intersects the perpendicular through the origin.
In your case, you've evaluated it with c = 1000 and c = -1000 to get two points to draw a line between.
You can rewrite those as
c = (x0 - x) / b
c = (y - y0) / a
And then use substitution to calculate horizontal and vertical intercepts:
x = x0 - ((y - y0) / a) * b
or
y = y0 + ((x0 - x) / b) * a
NB: Take care to correctly handle the cases when a or b is 0.
Let's say you have an 800x600 image (to keep numbers simple). We can define the bottom edge of the image as the line y = 599. Calculate the value of x where your line intercepts it using the above formula.
If the intercept point is in the image (0 <= x < 800), there's your starting point.
If it's to the left (x < 0), find the intercept with line x = 0 to use as starting point.
If it's to the right (x >= 800), find the intercept with line x = 799 to use as starting point.
Then use similar technique to find the second point to be able to draw a line.
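To make this concrete, here is a minimal C++ sketch of the border-intercept technique (the struct Pt and the function lineEndpoints are my own names, not OpenCV API; it uses exactly the intercept formulas above and guards the a = 0 and b = 0 cases):

#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Intersections of the line given by (rho, theta) with the four image
// borders; the ones falling inside the w x h image are the endpoints
// to pass to the OpenCV line function.
std::vector<Pt> lineEndpoints(double rho, double theta, int w, int h)
{
    const double a = std::cos(theta), b = std::sin(theta);
    const double x0 = a * rho, y0 = b * rho;
    std::vector<Pt> pts;
    if (a != 0) {  // intercepts with the top (y = 0) and bottom (y = h - 1) rows
        for (double y : {0.0, h - 1.0}) {
            double x = x0 - ((y - y0) / a) * b;
            if (x >= 0 && x <= w - 1) pts.push_back({x, y});
        }
    }
    if (b != 0) {  // intercepts with the left (x = 0) and right (x = w - 1) columns
        for (double x : {0.0, w - 1.0}) {
            double y = y0 + ((x0 - x) / b) * a;
            if (y >= 0 && y <= h - 1) pts.push_back({x, y});
        }
    }
    return pts;  // typically two points (a corner may appear twice)
}

Choosing which of the two points is the "start" (for example, the one on the bottom edge for the right lane) then gives the mirror-image behavior asked about.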
I've got a question about bilinear interpolation in the OSRM-Project.
I understand "normal" bilinear interpolation; here is the illustrating picture from Wikipedia:
Now I'm trying to understand the bilinear interpolation that the OSRM project uses for raster source data.
// Query raster source using bilinear interpolation
RasterDatum RasterSource::GetRasterInterpolate(const int lon, const int lat) const
{
    if (lon < xmin || lon > xmax || lat < ymin || lat > ymax)
    {
        return {};
    }

    const auto xthP = (lon - xmin) / xstep;
    const auto ythP = (ymax - lat) / ystep; // the raster texture uses a different
                                            // coordinate system with y pointing downwards

    const std::size_t top = static_cast<std::size_t>(fmax(floor(ythP), 0));
    const std::size_t bottom = static_cast<std::size_t>(fmin(ceil(ythP), height - 1));
    const std::size_t left = static_cast<std::size_t>(fmax(floor(xthP), 0));
    const std::size_t right = static_cast<std::size_t>(fmin(ceil(xthP), width - 1));

    // Calculate distances from corners for bilinear interpolation
    const float fromLeft = xthP - left; // this is the fraction part of xthP
    const float fromTop = ythP - top;   // this is the fraction part of ythP
    const float fromRight = 1 - fromLeft;
    const float fromBottom = 1 - fromTop;

    return {static_cast<std::int32_t>(raster_data(left, top) * (fromRight * fromBottom) +
                                      raster_data(right, top) * (fromLeft * fromBottom) +
                                      raster_data(left, bottom) * (fromRight * fromTop) +
                                      raster_data(right, bottom) * (fromLeft * fromTop))};
}
The original code is here.
Can someone explain to me how this code works?
The input is SRTM data in ASCII format.
The variables height and width are defined as nrows and ncolumns.
The variables xstep and ystep are defined as:
return (max - min) / (static_cast<float>(count) - 1)
where count is height for ystep and width for xstep, and max and min are the corresponding coordinate bounds.
And another question:
Can I use the same code for data in TIF format and for the whole world?
Horizontal pixel coordinates are in the range [0, width - 1]; similarly, vertical coordinates are in [0, height - 1] (the zero-indexing convention used in many languages, including C++).
The line
const auto xthP = (lon - xmin) / xstep; (and similarly for ythP)
converts the input coordinates (lon, lat) into pixel coordinates; xstep is the width of each pixel in input-space units.
Rounding this down (floor) gives the pixel intersected by the sample point on one side, and rounding up (ceil) gives the pixel on the other side; for the x-coordinate these give left and right.
fmin and fmax clamp the coordinates so that they don't exceed the pixel coordinate range.
EDIT: since you are trying to interpret this picture, I'll list the corresponding parts below:
Q11 = (left, top)
Q12 = (left, bottom), etc.
P = (xthP, ythP)
R1 = fromTop, R2 = fromBottom etc.
A good start point would be http://www.cs.uu.nl/docs/vakken/gr/2011/Slides/06-texturing.pdf, slide 27. In future though, Google is your friend.
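To see the weighting at work outside of OSRM, here is a small self-contained C++ sketch of the same scheme on a single cell of four samples (the function name and the sample values are made up for illustration):

#include <cstdio>

// Bilinear interpolation over one cell of four samples, with the same
// corner weights as GetRasterInterpolate above: each corner is weighted
// by the fractional distances from the opposite corner.
float bilinear(float q_lt, float q_rt, float q_lb, float q_rb,
               float fromLeft, float fromTop)
{
    const float fromRight = 1 - fromLeft;
    const float fromBottom = 1 - fromTop;
    return q_lt * (fromRight * fromBottom) +
           q_rt * (fromLeft * fromBottom) +
           q_lb * (fromRight * fromTop) +
           q_rb * (fromLeft * fromTop);
}

int main()
{
    // A quarter of the way right and three quarters of the way down the
    // cell: the result (18.5) is pulled towards the bottom-left sample.
    std::printf("%f\n", bilinear(10, 14, 20, 24, 0.25f, 0.75f));
    return 0;
}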
I want to implement the following using OpenCV (I'll post my attempt at the bottom of the post). I am aware that OpenCV has a function for something like this, but I want to try to write my own.
In an image (Mat) of width width and height height (the coordinate origin is at the top left, since it is an image), I want to display a filled ellipse with the following properties:
it should be centered at (width/2, height/2)
the image should be binary, so the points corresponding to the ellipse should have a value of 1 and others should be 0
the ellipse should be rotated by angle radians around its center (or degrees, this does not matter much, I can convert)
ellipse: the semi-major axis parameter is a and the semi-minor axis parameter is b, and these two parameters directly give the size of the axes in the picture, so regardless of width and height, the ellipse should have a major axis of length 2*a and a minor axis of length 2*b
OK, so I've found an equation similar to this one (https://math.stackexchange.com/a/434482/403961) for my purpose. My code is as follows. It does seem to do pretty well on the rotation side, but, sadly, depending on the rotation angle, the SIZE (the major axis; I'm not sure about the minor one) visibly increases or decreases, which is not normal, since the ellipse should keep the same size independent of the rotation angle.
NOTE: The biggest size is seemingly reached when the angle is 45 or -45 degrees, and the smallest for angles like -90, 0, 90.
Code:
inline double sqr(double x)
{
    return x * x;
}

Mat ellipticalElement(double a, double b, double angle, int width, int height)
{
    // just to make sure I don't use some bad values for my parameters
    assert(2 * a < width);
    assert(2 * b < height);

    Mat element = Mat::zeros(height, width, CV_8UC1);
    Point center(width / 2, height / 2);

    for (int x = 0; x < width; x++)
        for (int y = 0; y < height; y++)
        {
            if (sqr((x - center.x) * cos(angle) - (y - center.y) * sin(angle)) / sqr(a)
                    + sqr((x - center.x) * sin(angle) - (y - center.y) * cos(angle)) / sqr(b) <= 1)
                element.at<uchar>(y, x) = 1;
        }
    return element;
}
A pesky typo sneaked into your inequality. The first summand must be
sqr((x - center.x) * cos(angle) + (y - center.y) * sin(angle)) / sqr(a)
Note the plus sign instead of minus.
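With both minus signs, the two summands are not the coordinates of a proper rotation; at 45 degrees they even become identical, which is why the drawn shape visibly changes size with the angle. A minimal corrected form of the check, using the same variables as in the question's code:

// Pixel coordinates relative to the center, expressed in the frame
// rotated by angle, so that the ellipse axes align with the coordinate axes.
double xr = (x - center.x) * cos(angle) + (y - center.y) * sin(angle);
double yr = (x - center.x) * sin(angle) - (y - center.y) * cos(angle);
if (sqr(xr) / sqr(a) + sqr(yr) / sqr(b) <= 1)
    element.at<uchar>(y, x) = 1;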
Hi, I have a problem with multiple hgt files when I want to display them.
When I have one map it is not a problem. For example, for a 2D map I can compute a vertex like
vec2(i,j)*vec2(0.01,-0.01).
But I need to have more than one map, and I need to use the equirectangular projection.
So my question is how to transform an (i, j) position from an hgt file into longitude and latitude.
My idea, if we have the file N45E016, is:
x = 44 + i/1201;
y = 16 + j/1201;
But I think this is wrong, because x depends on y.
After I get x and y I can compute the equirectangular projection.
So my question is how to do this properly.
Try this:
x = xmin + dx * i / (w - 1)
y = ymin + dy * j / (h - 1)
with:
dx = xmax - xmin
dy = ymax - ymin
xmin, xmax are the min./max. longitude of the tile (hgt file),
ymin, ymax are the min./max. latitude of the tile,
w, h are the width and height of the tile (number of samples along the longitude/latitude axis).
You may have to adapt the proposed formula slightly, depending on whether the samples are replicated along the tile boundaries or not.
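For the N45E016 example (a 1201 x 1201 SRTM tile whose name gives its south-west corner, so it spans longitudes 16..17 and latitudes 45..46), a minimal C++ sketch of this mapping; I assume row j = 0 is the northern edge, as in SRTM hgt files, and the function name is mine:

#include <cstdio>

// Convert a sample index (i, j) of a one-degree SRTM tile into (lon, lat).
// Assumes i grows eastwards and j grows southwards (j = 0 is the north edge).
void sampleToLonLat(int i, int j, double lonMin, double latMax,
                    int w, int h, double* lon, double* lat)
{
    *lon = lonMin + 1.0 * i / (w - 1);
    *lat = latMax - 1.0 * j / (h - 1);
}

int main()
{
    double lon, lat;
    sampleToLonLat(0, 0, 16.0, 46.0, 1201, 1201, &lon, &lat);
    std::printf("north-west corner: lon=%g lat=%g\n", lon, lat);   // 16 46
    sampleToLonLat(1200, 1200, 16.0, 46.0, 1201, 1201, &lon, &lat);
    std::printf("south-east corner: lon=%g lat=%g\n", lon, lat);   // 17 45
    return 0;
}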
I've got a Mandelbrot set I want to zoom in on. The set is calculated around a center coordinate, a size, and a zoom level. The original view is centered at
real = -0.6 and im = 0.4, with a size of 2 in both the real and imaginary directions.
I want to be able to click on a point in the image and calculate a new view, zoomed in around that point.
The window containing it is 800x800 px, so I figured a click in the lower right corner should map to a center of real = 0.4 and im = -0.6, and a click in the upper left corner to real = -1.6 and im = 1.4.
I calculated it with:
For the real values:
800a + b = 0.4 => a = 0.0025
0a + b = -1.6 => b = -1.6
For the imaginary values:
800c + d = -0.6 => c = -0.0025
0c + d = 1.4 => d = 1.4
However, this does not work if I continue with a Mandelbrot size of 2 and a zoom level of 2. Am I missing something concerning the coordinates at higher zoom levels?
I had similar problems zooming in my C# Mandelbrot. My solution was to calculate the offset of the click position from the center as a fraction, multiply it by the maximum extent in coordinate units from the center (width / zoom * 0.5, with width = height and zoom = n * 100), and add this to the current value. So my code was this (assuming I get sx and sy as parameters from the click):
double[] o = new double[2];
double digressLRUD = width / zoom * 0.5;       // max distance from the center, in coordinates
double shiftCenterCursor_X = sx - width / 2.0; // shift of the cursor from the center, in pixels
double shiftCenterCursor_X_percentage = shiftCenterCursor_X / (width / 2.0); // shift as a fraction of the half-width
o[0] = x + digressLRUD * shiftCenterCursor_X_percentage; // new position
double shiftCenterCursor_Y = sy - width / 2.0;
double shiftCenterCursor_Y_percentage = shiftCenterCursor_Y / (width / 2.0);
o[1] = y - digressLRUD * shiftCenterCursor_Y_percentage;
This works, but you'll have to update the zoom as well (I usually multiply it by 2).
Another point is to move the selected center to the center of the image. I did this with some calculations:
double maxRe = width / zoom;          // extent of the view along the real axis
double centerRe = reC - maxRe * 0.5;  // offset from the selected center to the view edge
double maxIm = height / zoom;
double centerIm = -imC - maxIm * 0.5;
This will give you the coordinates you have to pass to your algorithm so that it renders the selected place.
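In the terms of the original question, the general pixel-to-complex mapping for an arbitrary center and zoom level could look like this minimal C++ sketch (the function and parameter names are mine). The point is that the linear coefficients (the a, b, c, d computed in the question) must be derived again from the current center and extent after every zoom step, not fixed once for the original view:

#include <cstdio>

// Map a pixel (px, py) in a size x size window to a complex coordinate,
// for a view centered at (centerRe, centerIm) whose full extent is
// baseSize / zoom in both directions (screen y grows downwards).
void pixelToComplex(int px, int py, int size,
                    double centerRe, double centerIm,
                    double baseSize, double zoom,
                    double* re, double* im)
{
    const double extent = baseSize / zoom;               // current view size
    *re = centerRe + (px / (double)size - 0.5) * extent;
    *im = centerIm - (py / (double)size - 0.5) * extent; // flip the y axis
}

int main()
{
    double re, im;
    // Original view: center (-0.6, 0.4), size 2, zoom 1. The lower right
    // pixel should map to 0.4 - 0.6i, as computed in the question.
    pixelToComplex(800, 800, 800, -0.6, 0.4, 2.0, 1.0, &re, &im);
    std::printf("lower right: %g %+gi\n", re, im);
    return 0;
}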