Fastest algorithm to compute avg distance between two sets of points

Please see the picture:
Given the set of points marked in red, I take two consecutive points (here 0 and 1; these numbers are just for illustration and are not the indices in the array holding the points).
I take their midpoint. From the midpoint I draw a normal onto each segment in the green set (a segment is the line between two consecutive points).
The blue line is such a normal. Its intersection point falls between points 10 and 11, so I record its length.
The black line is also a normal, onto the line through points 12 and 13, but the intersection does not fall between 12 and 13, so I reject it.
I want the median of the lengths of all such accepted lines, measured from the midpoints of the segments in the red set.
My brute-force algorithm runs in O(MN) time.
My questions:
Is there a standard algorithm for what I am seeking? That is to say, I do not know if the parameter I am measuring has a common name.
What is the fastest method of measuring it?
I would love to do some parallel processing, but I am using D, and I am getting:
"core.thread.threadbase.ThreadError#src/core/thread/threadbase.d(1219): Error creating thread"
Thank you.
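
For reference, the brute-force procedure described in the question can be sketched as below. This is only a minimal illustration in Python (the question itself uses D); the function names `foot_of_perpendicular` and `median_normal_length` are made up for this sketch, and points are assumed to be `(x, y)` tuples stored in drawing order.

```python
import math
from statistics import median

def foot_of_perpendicular(p, a, b):
    """Project p onto the line through a and b.

    Returns (t, foot) with foot = a + t * (b - a), or None if a == b.
    The foot lies on the segment exactly when 0 <= t <= 1.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    len2 = dx * dx + dy * dy
    if len2 == 0.0:
        return None  # degenerate segment, no direction to project onto
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / len2
    return t, (a[0] + t * dx, a[1] + t * dy)

def median_normal_length(red, green):
    """Median length of the accepted normals, O(M*N) as in the question."""
    lengths = []
    for r0, r1 in zip(red, red[1:]):
        mid = ((r0[0] + r1[0]) / 2.0, (r0[1] + r1[1]) / 2.0)
        for g0, g1 in zip(green, green[1:]):
            hit = foot_of_perpendicular(mid, g0, g1)
            if hit is None:
                continue
            t, foot = hit
            if 0.0 <= t <= 1.0:  # accept only if the foot is on the segment
                lengths.append(math.hypot(mid[0] - foot[0], mid[1] - foot[1]))
    return median(lengths) if lengths else None
```

Each red midpoint is independent of the others, so the outer loop is the natural place to split the work if parallel processing becomes available.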

Related

Interpolating issue during Marching Cube on a scalar field

I have a regular square grid in which all my data points are stored at the cell centroids. I have a scalar field (range: 0 to 1) that indicates the amount of substance inside a cell. I am interested in identifying the interface of this substance inside the cell (for further processing, not for visualization).
I came across the Marching Cubes algorithm (http://paulbourke.net/geometry/polygonise/). It needs the values at the corners of the cell, so I averaged the centroid values of the neighbouring cells.
This averaging, coupled with the further linear interpolation used to find the points of intersection during "Polygonization" in MC, is resulting in non-realistic interfaces such as this one.
Here the gray cell is full of substance while its neighbours have a minimal amount of substance. Ideally the interface should be very close to the boundary of the centre cell. I feel this happens due to linear interpolation between 0.25 and 0, which moves the interface far from its intended position.
Can something be done to sort out this issue?

The Marching Cubes algorithm has one parameter that can be adjusted, namely the isolevel. In your example you seem to have chosen a value around 0.05 for the isolevel. If you choose a value just below 0.25 (e.g. something like 0.24), the interfaces will be much closer to the center cell. But you will still get unsatisfactory results when two cells with value 1 touch each other, because the corners they share will then have an average value of 0.5.
What you can still try: instead of averaging the cell values to compute the corner values, take the maximum cell value as the corner value and raise the isolevel to a value just below 1 (e.g. 0.9).
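
To make the difference concrete, here is a small sketch of both ways of deriving corner values from cell-centred values on a 2D grid. The helper names `corner_values` and `mean` are made up for this illustration, and the grid is assumed to be a list of rows of floats.

```python
def mean(values):
    return sum(values) / len(values)

def corner_values(cells, combine=max):
    """Return a (rows+1) x (cols+1) grid of corner values.

    Each corner combines the (up to four) cells that touch it, using
    `combine` (max by default, or `mean` for the averaging variant).
    """
    rows, cols = len(cells), len(cells[0])
    corners = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows + 1):
        for j in range(cols + 1):
            touching = [cells[r][c]
                        for r in (i - 1, i) for c in (j - 1, j)
                        if 0 <= r < rows and 0 <= c < cols]
            corners[i][j] = combine(touching)
    return corners

# One full cell surrounded by empty cells, as in the question.
cells = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
averaged = corner_values(cells, mean)  # corners of the full cell get 0.25
maximum = corner_values(cells, max)    # corners of the full cell get 1.0
```

With averaging, the corners of the full cell only reach 0.25, which is where the interpolation between 0.25 and 0 in the question comes from. With the maximum they reach 1.0, so an isolevel just below 1 places the crossing close to the full cell.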

C++: Find neighbouring grid points from calibration picture from unsorted list

I have 4 lists of the x and y coordinates of calibration points. They are in no particular order and not aligned on any axis (they come from a real calibration picture with slight rotation and distortion), but the lists share the same indexing and cannot be sorted in such a way that each list is ascending/descending. They also hold floating-point values, not integers. I am now trying to find the four neighbouring points for a given point.
E.g. searching for the neighbours of the point [150,150] would return [140,140], [140,160], [160,140], [160,160] (except that they would actually be more like [139.581239,138.28812]).
At the moment I have to look through all calibration points for each point to check. There are about 500 calibration points.
Later in the process, I need to find the 4 neighbours of a random point within the 1600x1400 grid several million times. So it is crucial to find those points as fast as possible to avoid calculation times of days or even weeks.
My first approach was checking each of the ~500 calibration points for each point to check, looking at their position relative to the checking point (x_calib > x and y_calib > y would be somewhere in the top right region of the point), and calculating their distance to it. The closest point in each region (top left, top right, lower left, lower right) would then be the respective neighbour point. That does not seem to be efficient at all and takes a lot of time.
The second approach was creating a rainbow table for each of the 1600x1400 points and saving the respective neighbours (to be exact, saving their indices in the list of coordinates). Later on, the process would check this rainbow table at position [x,y,0], [x,y,1], [x,y,2] and [x,y,3] to get the 4 indices of the 4 neighbour points. Though calculating the rainbow table takes some time (~20 minutes for those ~2 million points), this approach speeds up the later processing. Unfortunately, this approach makes it difficult to debug the later steps of the process because it takes this much time before the rest even starts.
I still think there should be room for optimization and I would appreciate any suggestion or help to speed up the whole thing. I already read about k-d trees but did not quite see how to use them here. I'm hoping that there's an approach for this kind of unsorted (and unsortable) list of points which is more efficient than the rainbow table, or which is at least faster at creating the table.
Thanks in advance!
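
For readers who want to reproduce the baseline, the first approach from the question (one pass over all calibration points, keeping the closest point per quadrant) might look roughly like this. It is only a sketch in Python rather than C++; the name `quadrant_neighbours` and the use of two parallel coordinate lists `xs` and `ys` are assumptions made for the illustration.

```python
def quadrant_neighbours(xs, ys, x, y):
    """Return {quadrant: index} of the closest calibration point per quadrant.

    A quadrant is the pair (cx > x, cy > y), so (True, True) is the region
    with larger x and larger y than the query point. Quadrants that contain
    no calibration point are simply missing from the result.
    """
    best = {}  # quadrant -> (squared distance, index into xs/ys)
    for i, (cx, cy) in enumerate(zip(xs, ys)):
        quadrant = (cx > x, cy > y)
        d2 = (cx - x) ** 2 + (cy - y) ** 2  # squared distance avoids sqrt
        if quadrant not in best or d2 < best[quadrant][0]:
            best[quadrant] = (d2, i)
    return {quadrant: i for quadrant, (d2, i) in best.items()}

xs = [140.0, 140.0, 160.0, 160.0, 120.0]
ys = [140.0, 160.0, 140.0, 160.0, 120.0]
neighbours = quadrant_neighbours(xs, ys, 150.0, 150.0)
```

The function returns indices rather than coordinates, matching the indices stored in the table described in the question.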

c++, Generate diamond or triangular holes in uniform distribution in a 2D space

How to generate particles in 2D space using uniform random distribution such that there are triangular or diamond shaped holes within?

Acceptance/Rejection - define your cutout areas, generate points uniformly over the 2-d space, and if the result lands in a cutout reject it and try again. Probability of acceptance will be p(accept) = 1 - Area(cutouts) / Area(2-d_generating_space), and the expected number of attempts to generate will be the inverse of that. For example, if the holes make up 80% of your space then p(accept) = 0.2 for a given trial and on average it will take 5 attempts to get an acceptable point.

I would start off with the triangle case, since the diamond case is really the same as having two triangles.
Here is another explanation of pjs' algorithm:
1. Define your 2-d space in terms of x-min, x-max, y-min, y-max.
2. Define the set of triangles you are cutting out in terms of triangle1[point1, point2, point3] ... triangle_n[point1, point2, point3].
3. Pick how many points you want to generate; call this numberOfPoints.
4. Iterate until you have numberOfPoints accepted points:
   a. Pick a random value within your x-range (from x-min to x-max).
   b. Pick a random value within your y-range (from y-min to y-max).
   c. This is the x,y position of your new random point.
   d. Check whether it falls within any of your cutting triangles (you will have another loop here), using a point-in-triangle containment test.
   e. If it is within one of the cutting triangles, throw it away and do not increment your counter. Otherwise, you have successfully added a point.
There are more efficient ways to do this than checking every single point against every single cutting triangle, but this is an OK first approach when there are not too many triangles.
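
The steps above can be sketched as follows. This is a minimal illustration in Python rather than C++; `in_triangle` uses a sign-of-cross-product containment test, which is one common choice and not necessarily the test the answer had in mind, and all function names are made up for the sketch.

```python
import random

def _cross(o, a, b):
    """z-component of (a - o) x (b - o); its sign tells which side b is on."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def in_triangle(p, tri):
    """True if p lies inside or on the edge of triangle tri = (a, b, c)."""
    a, b, c = tri
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)  # all signs agree -> inside

def generate_points(n, x_min, x_max, y_min, y_max, triangles, rng=random):
    """Uniform points in the box, rejecting any that land in a cutout."""
    points = []
    while len(points) < n:
        p = (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))
        if any(in_triangle(p, tri) for tri in triangles):
            continue  # rejected: counter is not incremented
        points.append(p)
    return points
```

A diamond-shaped hole can be passed in as two triangles that share an edge. Note that the loop never terminates if the cutouts cover the whole box.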

OpenGL: split a sprite into 2 other sprites along a given line

I don't usually ask questions without some code in them, but this time I can't find a starting point for what I want to do. I want to split a sprite (UVs, vertices) into two other sprites (UVs, vertices) along the line between 2 points, just like in Fruit Ninja where you split the fruits, but with 2D sprites.
I don't want you to write the code, just explain the general idea of how to do it.
I am using Libgdx, if that matters.

This process is called clipping.
In your case, you have a polygon defined by 4 vertices (including their positions and UV coordinates). You split this via a line given by two points.
A simple algorithm would check on which side of the line each of the 4 points is. If it is on the left side, add it to your first result, if it is on the right side, add it to your second. If two consecutive vertices end up on different sides of the line, you need to compute the intersection of the line and that edge and add it to both results.
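
A rough sketch of that idea, in Python rather than Libgdx/Java, is given below. The vertex layout `(x, y, u, v)` and the names `side`, `lerp` and `split_polygon` are assumptions made for the illustration. The UVs of a new vertex are interpolated with the same factor as its position.

```python
def side(p, a, b):
    """> 0 if p is left of the directed line a -> b, < 0 if right, 0 if on it."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def lerp(v0, v1, t):
    """Interpolate every component (position and UV) between two vertices."""
    return tuple(c0 + t * (c1 - c0) for c0, c1 in zip(v0, v1))

def split_polygon(vertices, a, b):
    """Split a convex polygon of (x, y, u, v) vertices by the line a -> b."""
    left, right = [], []
    n = len(vertices)
    for i in range(n):
        cur, nxt = vertices[i], vertices[(i + 1) % n]
        s_cur, s_nxt = side(cur, a, b), side(nxt, a, b)
        if s_cur >= 0:
            left.append(cur)   # vertices exactly on the line go to both
        if s_cur <= 0:
            right.append(cur)
        if s_cur * s_nxt < 0:  # edge crosses the line: add the intersection
            t = s_cur / (s_cur - s_nxt)
            hit = lerp(cur, nxt, t)
            left.append(hit)
            right.append(hit)
    return left, right

quad = [(0, 0, 0, 0), (1, 0, 1, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
left, right = split_polygon(quad, (0.5, 0.0), (0.5, 1.0))
```

Each half comes back as an ordered vertex list, which can then be triangulated (for example as a triangle fan) to build the two new sprites.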

Finding largest rectangle in 2D array

I need an algorithm which can parse a 2D array and return the largest continuous rectangle. For reference, look at the image I made demonstrating my question.

Generally you solve these sorts of problems using what are called scan line algorithms. They examine the data one row (or scan line) at a time building up the answer you are looking for, in your case candidate rectangles.
Here's a rough outline of how it would work.
Number all the rows in your image from 0..6, I'll work from the bottom up.
Examining row 0 you have the beginnings of two rectangles (I am assuming you are only interested in the black square). I'll refer to rectangles using (x, y, width, height). The two active rectangles are (1,0,2,1) and (4,0,6,1). You add these to a list of active rectangles. This list is sorted by increasing x coordinate.
You are now done with scan line 0, so you increment your scan line.
Examining row 1 you work along the row seeing if you have any of the following:
new active rectangles
space for existing rectangles to grow
obstacles which split existing rectangles
obstacles which require you to remove a rectangle from the active list
As you work along the row you will see that you have a new active rect (0,1,8,1), that one of the existing active ones can grow to (1,0,2,2), and that the active (4,0,6,1) has to be removed. We need to remember this one: it is the largest we have seen so far. It is replaced with two new, narrower active ones: (4,0,4,2) and (9,0,1,2).
So at the end of scan line 1 we have:
Active List: (0,1,8,1), (1,0,2,2), (4,0,4,2), (9, 0, 1, 2)
Biggest so far: (4,0,6,1)
You continue in this manner until you run out of scan lines.
The tricky part is coding up the routine that runs along the scan line updating the active list. If you do it correctly you will consider each pixel only once.
Hope this helps. It is a little tricky to describe.

I like a region growing approach for this.
For each open point in ARRAY
    grow EAST as far as possible
    grow WEST as far as possible
    grow NORTH as far as possible by adding rows
    grow SOUTH as far as possible by adding rows
    save the resulting area for the seed pixel used
After looping through each point in ARRAY, pick the seed pixel with the largest area result
...would be a thorough, but maybe not-the-most-efficient way to go about it.
I suppose you need to answer the philosophical question "Is a line of points a skinny rectangle?" If a line == a thin rectangle, you could optimize further by:
Create a second array of integers called LINES that has the same dimensions as ARRAY
Loop through each point in ARRAY
    Determine the longest valid line to the EAST that begins at each point and save its length in the corresponding cell of LINES.
After doing this for each point in ARRAY, loop through LINES
    For each point in LINES, determine how many neighbors SOUTH have the same length value or less.
    Accept a SOUTHERN neighbor with a smaller length if doing so will increase the area of the rectangle.
    The largest rectangle using that seed point is (Number_of_acceptable_southern_neighbors*the_length_of_longest_accepted_line)
    As the largest rectangular area for each seed is calculated, check to see if you have a new max value and save the result if you do.
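
One possible reading of the LINES idea, sketched in Python: precompute the eastward run length for every cell, then extend each seed south while tracking the narrowest run seen so far. The names are made up for the sketch, truthy cells are assumed to be open, SOUTH is taken to mean increasing row index, and the handling of shorter southern neighbours (try every depth and keep the best area) is my interpretation rather than something the description pins down.

```python
def largest_rectangle_by_runs(grid):
    """Return (area, (row, col, width, height)) of the largest open rectangle."""
    rows, cols = len(grid), len(grid[0])

    # LINES[r][c] = length of the open run starting at (r, c) going east.
    lines = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols - 1, -1, -1):
            if grid[r][c]:
                lines[r][c] = 1 + (lines[r][c + 1] if c + 1 < cols else 0)

    best = (0, None)
    for r in range(rows):
        for c in range(cols):
            width = lines[r][c]
            rr = r
            # Extend south; the usable width is the narrowest run so far.
            while rr < rows and lines[rr][c] > 0:
                width = min(width, lines[rr][c])
                height = rr - r + 1
                if width * height > best[0]:
                    best = (width * height, (r, c, width, height))
                rr += 1
    return best
```

Because every depth is tried with the running minimum width, tall and skinny rectangles are also found, at the cost of O(rows) work per seed.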
And... you could do this without allocating an array LINES, but I thought using it in my explanation made the description simpler.
And... I think you need to do this same sort of thing with VERTICAL_LINES and EASTERN_NEIGHBORS, or some cases might miss big rectangles that are tall and skinny. So maybe this second algorithm isn't so optimized after all.
Use the first method to check your work. I think Knuth said "...premature optimization is the root of all evil."
HTH,
Perry
ADDENDUM:Several edits later, I think this answer deserves a group upvote.

A straightforward approach would be to loop through all the potential rectangles in the grid, figure out their area, and if it is greater than the current highest area, select it as the highest:
var biggestFound
for each potential rectangle:
    if area(this potential rectangle) > area(biggestFound)
        biggestFound = this potential rectangle
Then you simply need to find the potential rectangles.
for each square in grid:
    recursive loop 1:
        if not occupied:
            grow right until occupied, and return a rectangle
            grow down one and recurse (call loop 1)
This will duplicate a lot of work (for example you will re-evaluate a lot of sub-rectangles), but it should give you an answer.
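
A runnable version of this brute-force search could look like the sketch below. It is written iteratively instead of recursively, which is a deviation from the pseudocode; the function name is hypothetical and truthy cells are assumed to be unoccupied.

```python
def largest_rectangle_brute_force(grid):
    """Return (area, (top, left, width, height)) of the largest free rectangle."""
    rows, cols = len(grid), len(grid[0])
    best_area, best_rect = 0, None
    for top in range(rows):
        for left in range(cols):
            max_width = cols - left
            for bottom in range(top, rows):
                # Grow right on this row until occupied (or the width limit).
                width = 0
                while width < max_width and grid[bottom][left + width]:
                    width += 1
                max_width = width  # rows below cannot be wider than this
                if max_width == 0:
                    break          # column is blocked, stop growing down
                area = max_width * (bottom - top + 1)
                if area > best_area:
                    best_area = area
                    best_rect = (top, left, max_width, bottom - top + 1)
    return best_area, best_rect
```

It rescans the same cells many times, as noted above, but it is simple enough to serve as a reference when checking a faster implementation.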
Edit
An alternate approach might be to start with a single square the size of the grid, and "subtract" occupied squares to end up with a final set of potential rectangles. There might be optimization opportunities here using quadtrees, and in ensuring that you keep split rectangles "in order", top to bottom, left to right, in case you need to re-combine rectangles farther down in the algorithm.
If you are actually starting out with rectangular data (for your "populated grid" set), instead of a loose pixel grid, then you could easily get better perf out of a rectangle/region subtracting algorithm.
I'm not going to post pseudo-code for this because the idea is completely experimental, and I have no idea if the perf will be any better for a loose pixel grid ;)
Windows system "regions" and "dirty rectangles", as well as general "temporal caching" might be good inspiration here for more efficiency. There are also a lot of z-buffer tricks if this is for a graphics algorithm...

Use dynamic programming approach. Consider a function S(x,y) such that S(x,y) holds the area of the largest rectangle where (x,y) are the lowest-right-most corner cell of the rectangle; x is the row co-ordinate and y is the column co-ordinate of the rectangle.
For example, in your figure, S(1,1) = 1, S(1,2)=2, S(2,1)=2, and S(2,2) = 4. But, S(3,1)=0, because this cell is filled. S(8,5)=40, which says that the largest rectangle for which the lowest-right cell is (8,5) has the area 40, which happens to be the optimum solution in this example.
You can easily write a dynamic programming equation for S(x,y) from the values of S(x-1,y), S(x,y-1) and S(x-1,y-1). Using that you can obtain the values of all S(x,y) in O(mn) time, where m and n are the row and column dimensions of the given table. Once S(x,y) is known for all 1 <= x <= m and all 1 <= y <= n, we simply need to find the x and y for which S(x,y) is the largest; this step also takes O(mn) time. By keeping additional data, you can also find the side lengths of the largest rectangle.
The overall complexity is O(mn). To understand more about this, read Chapter 15 of Cormen's algorithms book, specifically Section 15.4.
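
The answer does not spell out the recurrence, so the sketch below does not try to reconstruct it. Instead it uses a different, well-known O(mn) formulation: keep a running height of open cells for each column, and for every row find the largest rectangle in that histogram with a stack. The function name is hypothetical and truthy cells are assumed to be open.

```python
def largest_rectangle_area(grid):
    """Area of the largest all-open rectangle, in O(rows * cols) time."""
    if not grid:
        return 0
    cols = len(grid[0])
    heights = [0] * cols  # open cells stacked up in each column so far
    best = 0
    for row in grid:
        for c in range(cols):
            heights[c] = heights[c] + 1 if row[c] else 0
        stack = []  # column indices whose heights are strictly increasing
        for c in range(cols + 1):
            h = heights[c] if c < cols else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= h:
                top = stack.pop()
                left = stack[-1] + 1 if stack else 0
                best = max(best, heights[top] * (c - left))
            stack.append(c)
    return best
```

Each column index is pushed and popped at most once per row, which is what keeps the total work linear in the number of cells.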
