I'm looking for some help with a project I'm working on.
What I'm doing is a polygon approximation algorithm. I've got all of the points of my boundary, but in order to start the algorithm, I need to find the top left and bottom right points from the set of points. All of the points are stored in a structure array that has the x and y coordinates of each point. Any ideas on an easy way to loop through the array of points?
Any help would be greatly appreciated. If you need more information, just ask and I'll provide all I can.
Based on your comment that bottom left is min(x+y) and top right is max(x+y)
Top left: min(x+max(y)-y)
Bottom right: max(max(x)-x+y)
Where the inner max is a constant.
Though this may not always give a result that agrees with your eyes.
Alternative metrics can be constructed based on the distance from the corners of the bounding box of your object, or the square of the distance, etc.
Another technique is to translate the polygon around the origin and then top left is the point furthest from the origin but in the top left quadrant... That gives a whole heap of choices as to where to put (0,0) could be average of all, could be weighted average based on some rule, etc. lot of variability in how you pick that, each may give results that differ for a very small number if polygons from what the eye would pick.
Finally you could always train a neural network to pick the answer.... That might result in something that is (insert confidence limits from training)% likely to give an answer you agree with... But you and I may disagree
Top left: min(x+max(y)-y)
Bottom right: min(max(x)-x+y)
Where the inner max is a constant.
For the corners I used 2 indicator variables left and up which can have the values {-1,1}.
For upper corners (i.e. top-Left and top-Right) up := 1,
if its a lower corner (i.e. bottom-left and bottom-right) up:= -1 .
Analogously for the left variable:
left:= 1 for top-left and bottom-left
left := -1 for top-right and bottom right
Then the corner can be found by maximizing the term
left * point.x + up * point.y
over all Coordinates.
Related
I'm trying to fit a rectangle around a set of 8 2D-Points, while trying to minimize the covered area.
Example:
The rectangle may be scaled and rotated. However it needs to stay a rectangle.
My first approach was to brute force each possible rotation, fit the rectangle as close as possible, and calculate the covered area. The best fit would be then the rotation with the lowest area.
However this does not really sound like the best solution.
Is there any better way for doing this?
I don't know what you mean by "try every possible rotation", as there are infinitely many of them, but this basic idea actually yields a very efficient solution:
The first step is to compute the convex hull. How much this actually saves depends on the distribution of your data, but for points picked uniformly from a unit disk, the number of points on the hull is expected to be O(n^1/3). There are a number of ways to do that:
If the points are already sorted by one of their coordinates, the Graham scan algorithm does that in O(n). For every point in the given order, connect it to the previous two in the hull and then remove every concave point (the only candidate are those neighboring the new point) on the new hull.
If the points are not sorted, the gift-wrapping algorithm is a simple algorithm that runs at O(n*h). For each point on the hull starting from the leftmost point of the input, check every point to see if it's the next point on the hull. h is the number of points on the hull.
Chen's algorithm promises O(n log h) performance, but I haven't quite explored how it works.
another simle idea would be to sort the points by their azimuth and then remove the concave ones. However, this only seems like O(n+sort) at first, but I'm afraid it actually isn't.
At this point, checking every angle collected thus far should suffice (as conjenctured by both me and Oliver Charlesworth, and for which Evgeny Kluev offered a gist of a proof). Finally, let me refer to the relevant reference in Lior Kogan's answer.
For each direction, the bounding box is defined by the same four (not necessarily distinct) points for every angle in that interval. For the candidate directions, you will have at least one arbitrary choice to make. Finding these points might seem like an O(h^2) task until you realise that the extremes for the axis-aligned bounding box are the same extremes that you start the merge from, and that consecutive intervals have their extreme points either identical or consecutive. Let us call the extreme points A,B,C,D in the clockwise order, and let the corresponding lines delimiting the bounding box be a,b,c,d.
So, let's do the math. The bounding box area is given by |a,c| * |b,d|. But |a,c| is just the vector (AC) projected onto the rectangle's direction. Let u be a vector parallel to a and c and let v be the perpendicular vector. Let them vary smoothly across the range. In the vector parlance, the area becomes ((AC).v) / |v| * ((BD).u) / |u| = {((AC).v) ((BD).u)} / {|u| |v|}. Let us also choose that u = (1,y). Then v = (y, -1). If u is vertical, this poses a slight problem involving limits and infinities, so let's just choose u to be horizontal in that case instead. For numerical stability, let's just rotate 90° every u that is outside (1,-1)..(1,1). Translating the area to the cartesian form, if desired, is left as an exercise for the reader.
It has been shown that the minimum area rectangle of a set of points is collinear with one of the edges of the collection's convex hull polygon ["Determining the Minimum-Area Encasing Rectangle for an Arbitrary Closed Curve" [Freeman, Shapira 1975]
An O(nlogn) solution for this problem was published in "On the computation of minimum encasing rectangles and set diameters" [Allison, Noga, 1981]
A simple and elegant O(n) solution was published in "A Linear time algorithm for the minimum area rectangle enclosing a convex polygon" [Arnon, Gieselmann 1983] when the input is the convex hull (The complexity of constructing a convex hull is equal to the complexity of sorting the input points). The solution is based on the Rotating calipers method described in Shamos, 1978. An online demonstration is available here.
They first thing that came to mind when I saw this problem was to use principal component analysis. I conjecture that the smallest rectangle is the one that satisfies two conditions: that the edges are parallel with the principal axes and that at least four points lie on the edges (bounded points). There should be an extension to n dimensions.
I have a map that I can zoom in and out of. Gridlines display the latitude and longitude.
Right now I zoom in by shrinking the borders by a specific amount.
Say top edge is lat 30n, bot edge is lat 28n, left edge is lon -40w, right edge is lon -38w.
So for one zoom in, the new edges are 29.5n, 28.5n, -39.5w,-38.5 respectively.
Zooming in enough times would eventually make the opposite edges equal each other and then flip.
So next zoom in for top and bot edge would be 29n, 29n and then next 28.5n,29.5n which flips the orientation.
I'm looking for a function or a way that I can continuously zoom in, but never have the edges equal each other or flip. In others words something where I can get the edges to continuously get closer and closer to each other, but never touch.
So I have an idea that I'll need some function like
lim x-> infinity of x/(x+1) - 1 + midpoint of the two edges
which basically means, as I zoom in I will approach the midpoint.
If this is valid, I don't know how to implement limits in C++.
This is only one equation too. I have two edges on each axis and so I'd need two equations to approach the midpoint from both sides at an equal rate.
In 2d space we have a collection of rectangles.
Here's a picture:
Vertical line on the right is non-moveable "wall".
Arrow on the left shows direction of movement.
I need to move leftmost rectangle to the right.
Rectangles cannot rotate.
Rectangles are not allowed to overlap horizontally.
Rectangle can shrink (horizontally) down to a certain minimum width (which is set individually to each rectangle)
Rectangles can only move horizontally (left/right) and shrink horizontally.
Leftmost rectangle (pointed at by arrow) cannot shrink.
Widths and coordinates are stored as integers.
I need to move leftmost rectangle to the right by X units, pushing everything in its way to the right.
There are two problems:
I need to determine how far I can move leftmost rectangle to the right (it might not be possible to move for X units).
Move rect by X units (or if it is not possible to move by X units, move by maximum possible amount, smaller than X) and get new coordinates and sizes for every rectangle in the system.
Additional complications:
You cannot use Y coordinate and height of rectangle for anything, instead
every rectangle has a list (implemented as pointers) of rectangles it will hit if you keep moving it to the right, you can only retrieve x coordinate, width, and minimum width. This data model cannot be changed. (technically, reppresenting this as a set of rectangles in 2d is simplification)
Important: Children from different levels and branches can have the same rectangle in the "potential collision" list. Here's initial picture with pointers displayed as red lines:
How can I do it?
I know a dumb way (that'll work) to solve this problem: iteratively. I.e.
Memorize current state of the system. If state of the system is already memorized, forget previously memorized state.
Push leftmost rect by 1 unit.
Recursively resolve collisions (if any).
If collision could not be resolved, return memorized state of the system.
If collisions could be resolved, and we already moved by X units, return current state of the system.
Otherwise, go to 1.
This WILL solve the problem, but such iterative solution can be slow if X is large. Is there any better way to solve it?
One possible solution that comes to mind:
You said that each rectangle holds pointers to all the objects it will hit when moving right. I would suggest, take the list of pointers from the big rectangle (the one pointed by the arrow), take all it's nodes (the rectangles it would collide), find the min width, then do the same of all the child nodes, add the widths for each branch recursively. Treat the problem like a tree depth problem. Every node has a min width value so the answer to your question would be the distance between the wall and the x value of the right edge of the big rectangle minus the GREATEST sum of the least width of the rectangles. Create a vector where the depth (sum of min widths) of each branch of the tree is stored and find the max value. Distance minus max is your answer.
imagine the same picture with 4 boxes. one to the left, one to the right and then the wall. lets name them box 1, box 2 (mid top), box 3 (mid bottom) and the last one box 4 (right). each box has a width of 4. all can shrink except the left one. box 2 can shrink by 2, box 3 can shrink by 1, box 4 can shrink by 2. Calculating the 2 branches gives
* Branch 1 : 2 + 2 = 4
* Branch 2 : 3 + 2 = 5
* Only concerned with branch 2 because branch 1 is LESS than branch 2 hence will fit in the space parallel to branch 2and is available to shrink that much.
I am new to stereo matching. I couldn't understand the concept of disparity. What are a disparity map and disparity image, and what is the difference between them?
Disparity
Disparity refers to the distance between two corresponding points in the left and right image of a stereo pair. If you look at the image below you see a labelled point X (ignore X1, X2 & X3). By following the dotted line from X to OL you see the intersection point with the left hand plane at XL. The same principal applies with the right-hand image plane.
If X projects to a point in the left frame XL = (u,v) and to the right frame at XR = (p,q) you can find the disparity for this point as the magnitude of the vector between (u,v) and (p,q).
Obviously this process involves choosing a point in the left hand frame and then finding its match (often called the corresponding point) in the right hand image; often this is a particularly difficult task to do without making a lot of mistakes.
Disparity Map/Image
If you were to perform this matching process for every pixel in the left hand image, finding its match in the right hand frame and computing the distance between them you would end up with an image where every pixel contained the distance/disparity value for that pixel in the left image.
Example
Given a left image
And a right image
By matching every pixel in the left hand image with its corresponding pixel in the right hand image and computing the distance between the pixel values (the disparities) you should end up with images that look like this:
This bottom image is known as a disparity image/map. A useful topic to read about when performing stereo matching is rectification. This will make the process of matching pixels in the left and right image considerably faster as the search will be horizontal.
One of the easiest methods to understand the disparity would be to blink your eyes, one at a time, alternating between your left and right eye. If you observe, the objects closer to you would appear to jump about their position more than the objects further away. This shift would become negligible as the objects move away. Therefore, in the disparity map, the brighter shades represent more shift and lesser distance from the point of view (camera). The darker shades represent lesser shift and therefore greater distance from the camera.
Good Morning everybody,
Today I wanna concern about the topic "Image Manipulation in C++".
So far I am able to filter all the noisy stuff out of the picture and change the color to black and white.
But now I have two questions.
First Question:
Below you see a screenshot of the image. What is the best way to find out how to rotate the text. In the end it would be nice if the text is horizontal. Does anybody have a good link or an example.
Second Question:
How to go on? Do you think I should send the image to an "Optical Character Recognizer" (a) or should I filter out each letter (b)?
If the answer is (a) what is the smallest ocr lib? All libs I found so far seem to be overpowered and difficult to implement in an existing project. (like gocr or tesseract)
If the answer is (b) what is the best way to save each letter as an own image? Shoul i search for an white pixel an than go from pixel to pixel an save the coordinates in an 2D Array? What is with the letter "i" ;)
Thanks to everybody who will help me to find my way!Sorry for the strange english above. I'm still a language noob :-)
The usual name for the problem in your first question is "Skew Correction"
You may Google for it (lot of references). A nice paper here, showing for example how to get this:
An easy way to start (but not as good as the previously mentioned), is to perform a Principal Component Analysis:
For your first question:
First, Remove any "specs" of noisy white pixels that aren't part of the letter sequence. A gentle low-pass filter (pixel color = average of surrounding pixels) followed by a clamping of the pixel values to pure black or pure white. This should get rid of the little "dot" underneath the "a" character in your image and any other specs.
Now search for the following pixels:
xMin = white pixel with the lowest x value (white pixel closest to the left edge)
xMax = white pixel with the largest x value (white pixel closest to the right edge)
yMin = white pixel with the lowest y value (white pixel closest to the top edge)
yMax = white pixel with the largest y value (white pixel closest to the bottom edge)
with these four pixel values, form a bounding box: Rect(xMin, yMin, xMax, yMax);
compute the area of the bounding box and find the center.
using the center of the bounding box, rotate the box by N degrees. (You can pick N: 1 degree would be an ok value).
Repeat the process of finding xMin,xMax,yMin,yMax and recompute the area
Continue rotating by N degrees until you've rotated K degrees. Also rotate by -N degrees until you've rotated by -K degrees. (Where K is the max rotation... say 30 degrees). At each step recompute the area of the bounding box.
The rotation that produces the bounding box with the smallest area is likely the rotation that aligns the letters parallel to the bottom edge (horizontal alignment).
You could measure the height to each white pixel from the bottom and find how much the text is leaning. It's a very simple approach but it worked fine for me when I tried it.