Reducing time complexity in maximal minimum-sum 2-partitioning of an array - c++

Let array[N] be an array of N non-negative values. We're trying to recursively partition the array into two (2) subarrays so as to maximize the "minimum-sum" of the subarrays. The solution is described by the following recursion:
opt[x][y] = max over x <= w < y of MIN( c[x][w] + opt[x][w], c[w+1][y] + opt[w+1][y] ), with opt[x][x] = 0.
We want to calculate opt[0][N-1].
Let c[x][y] denote the sum{array[i]} for i from x up to y (inclusive).
I have managed to unwind the recursion in the following C++ code snippet, using dynamic programming:
for ( uint16_t K1 = 0; K1 < N; K1 ++ ) {
    for ( uint16_t K2 = 0; K2 < N-K1; K2 ++ ) {
        const uint16_t x = K2, y = K2 + K1;
        opt[x][y] = 0;
        for ( uint16_t w = x; w < y; w ++ ) {
            uint32_t left = c[x][w] + opt[x][w];
            uint32_t right = c[w+1][y] + opt[w+1][y];
            /* Choose minimum between left-right */
            uint32_t val = MIN( left, right );
            /* Best opt[x][y] ? */
            if ( val > opt[x][y] ) {
                opt[x][y] = val;
            }
        }
    } /* K2 */
} /* K1 */
This technique parses all subarrays, beginning from size 1 and up to size N. The final solution will thus be stored in opt[0][N-1].
For example, if N=6, the matrix will be iterated as follows: (0,0) (1,1) (2,2) (3,3) (4,4) (5,5) (0,1) (1,2) (2,3) (3,4) (4,5) (0,2) (1,3) (2,4) (3,5) (0,3) (1,4) (2,5) (0,4) (1,5) (0,5). The final answer will be in opt[0][5].
I have tested and verified that the above technique works to unwind the recursion. I am trying to further reduce the complexity, as this will run in O(n^3), if I'm correct. Could this be achieved?
edit: I'm also noting the physical meaning of the recursion, as it was asked in the comments. Let N denote N cities across a straight line. We're a landlord who controls these cities; at the end of a year, each city i pays an upkeep of array[i] coins as long as it's under our control.
Our cities are under attack by a superior force and defeat is unavoidable. At the beginning of each year, we erect a wall between two adjacent cities i,i+1, x <= i <= y. During each year, the enemy forces will attack either from the west, thus conquering all cities in [x,i], or will attack from the east, thus conquering all cities in [i+1,y]. The remaining cities will pay us their upkeep at the end of the year. The enemy forces destroy the wall at the end of the year, retreat, and launch a new attack in the following year. The game ends when only 1 city is left standing.
The enemy forces will always attack from the optimal position, in order to reduce our maximum income over time. Our strategy is to choose the optimal position of the wall, so as to maximize our total income at the end of the game.

Here's the final answer to the problem, following the contribution of @NiklasB. Let w(x,y) denote the optimal partition position for the subproblem opt[x][y]; by definition, x <= w(x,y) < y. Assume that the optimal positions are known for all subproblems opt[x][y] of a given size k = y-x.
Let's now try to find the optimal w positions for all subproblems of size k+1. We can argue that w(x,y+1) >= w(x,y); in other words, if we add another element on the right, the optimal partition might "move to the right" in order to balance the two sums more evenly, but it cannot "move to the left". Similarly, w(x-1,y) <= w(x,y).
NB: it would be helpful if someone could attempt to mathematically verify the above.
Accordingly, let wall[x][y] store the optimal w for the subproblem opt[x][y]. The loop for ( uint16_t w = x; w < y; w ++ ) in the original snippet is modified as follows:
for ( uint16_t w = wall[x][y-1]; w <= wall[x+1][y]; w ++ ) {
    ...
    if ( val > opt[x][y] ) {
        opt[x][y] = val;
        wall[x][y] = w;
    }
}
A few modifications are needed to deal with corner cases when 0 <= y-x <= 1, but it does the job. It reduces the running time complexity from O(n^3) to O(n^2), since the time to compute the solution for a larger subproblem is amortized O(1), by taking into account the w boundaries. Example: with N = 2500, the recursive algorithm (with memoization) runs in 58 sec. The O(n^2) algorithm runs in only 148 msec.
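For anyone who wants to try this, here is a minimal sketch of both variants side by side. It assumes the monotonicity of the wall positions described above (which, as noted, has not been formally verified here). The function names, the prefix-sum helper and the defensive fallback to a full scan are my own additions for illustration and are not part of the original snippet.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<long long>>;

// Hypothetical helper: pre[i] = a[0] + ... + a[i-1].
static std::vector<long long> prefixSums(const std::vector<long long>& a) {
    std::vector<long long> pre(a.size() + 1, 0);
    for (std::size_t i = 0; i < a.size(); ++i) pre[i + 1] = pre[i] + a[i];
    return pre;
}

// c[x][y] from the text: sum of a[x..y], inclusive.
static long long rangeSum(const std::vector<long long>& pre, std::size_t x, std::size_t y) {
    return pre[y + 1] - pre[x];
}

// Reference O(n^3) version: tries every wall position w in [x, y).
long long maxMinSumCubic(const std::vector<long long>& a) {
    const std::size_t n = a.size();
    if (n == 0) return 0;
    const std::vector<long long> pre = prefixSums(a);
    Table opt(n, std::vector<long long>(n, 0));
    for (std::size_t len = 1; len < n; ++len) {
        for (std::size_t x = 0; x + len < n; ++x) {
            const std::size_t y = x + len;
            for (std::size_t w = x; w < y; ++w) {
                const long long left = rangeSum(pre, x, w) + opt[x][w];
                const long long right = rangeSum(pre, w + 1, y) + opt[w + 1][y];
                opt[x][y] = std::max(opt[x][y], std::min(left, right));
            }
        }
    }
    return opt[0][n - 1];
}

// O(n^2) version: only scans w between wall[x][y-1] and wall[x+1][y].
long long maxMinSumQuadratic(const std::vector<long long>& a) {
    const std::size_t n = a.size();
    if (n == 0) return 0;
    const std::vector<long long> pre = prefixSums(a);
    Table opt(n, std::vector<long long>(n, 0));
    std::vector<std::vector<std::size_t>> wall(n, std::vector<std::size_t>(n, 0));
    for (std::size_t i = 0; i < n; ++i) wall[i][i] = i;  // corner case: size-1 subarrays
    for (std::size_t len = 1; len < n; ++len) {
        for (std::size_t x = 0; x + len < n; ++x) {
            const std::size_t y = x + len;
            std::size_t lo = wall[x][y - 1];
            std::size_t hi = std::min(wall[x + 1][y], y - 1);  // w must stay below y
            if (lo > hi) { lo = x; hi = y - 1; }  // assumption: fall back to a full scan
            long long best = -1;  // all values are non-negative, so the first w is always taken
            for (std::size_t w = lo; w <= hi; ++w) {
                const long long left = rangeSum(pre, x, w) + opt[x][w];
                const long long right = rangeSum(pre, w + 1, y) + opt[w + 1][y];
                const long long val = std::min(left, right);
                if (val > best) { best = val; wall[x][y] = w; }
            }
            opt[x][y] = best;
        }
    }
    return opt[0][n - 1];
}
```

For the array {1, 2, 3} both functions return 3 (wall between 2 and 3), and for {4, 1, 1, 4} both return 6. Comparing the two functions on random inputs is a cheap way to test the monotonicity assumption empirically.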

Related

Given N lines on a Cartesian plane. How to find the bottommost intersection of lines efficiently?

I have N distinct lines on a Cartesian plane. Each line is given in slope-intercept form, y = mx + c, by its slope and y-intercept. I have to find the y-coordinate of the bottommost intersection of any two lines.
I have implemented an O(N^2) brute-force solution in C++, which is too slow for N = 10^5. Here is my code:
#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int main() {
    int n;
    cin >> n;
    vector<pair<int, int>> lines(n);
    for (int i = 0; i < n; ++i) {
        int slope, y_intercept;
        cin >> slope >> y_intercept;
        lines[i].first = slope;
        lines[i].second = y_intercept;
    }
    double min_y = 1e9;
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            // since lines are distinct, two lines with the same slope will never intersect
            if (lines[i].first == lines[j].first)
                continue;
            // x-coordinate of intersection point
            double x = (double) (lines[j].second - lines[i].second) / (lines[i].first - lines[j].first);
            // y-coordinate of intersection point
            double y = lines[i].first * x + lines[i].second;
            min_y = min(y, min_y);
        }
    }
    cout << min_y << endl;
}
How to solve this efficiently?
In case you are considering solving this by means of Linear Programming (LP), it could be done efficiently, since the solution which minimizes or maximizes the objective function always lies in the intersection of the constraint equations. I will show you how to model this problem as a maximization LP. Suppose you have N=2 first degree equations to consider:
y = 2x + 3
y = -4x + 7
then you will set up your simplex tableau like this:
x0   x1   x2   x3   b
-2    1    1    0   3
 4    1    0    1   7
where column x0 holds the negated coefficient of "x" from the original first-degree functions, x1 holds the coefficient of "y" (generally +1), x2 and x3 form the N-by-N identity matrix (they are the slack variables), and b holds the independent term. In this case, the constraints use the <= operator.
Now, the objective function should be:
x0   x1   x2   x3
 1    1    0    0
To solve this LP, you may use the "simplex" algorithm which is generally efficient.
Furthermore, the result will be an array representing the assigned values to each variable. In this scenario the solution is:
x0             x1             x2    x3
0.6666666667   4.3333333333   0.0   0.0
The pair (x0, x1) represents the point you are looking for, where x0 is its x-coordinate and x1 is its y-coordinate. Other outcomes are possible; for example, there may be no solution. You can find out more in books such as "Linear Programming and Extensions" by George Dantzig.
Keep in mind that the simplex algorithm only works for non-negative values of x0, x1, ..., xn. This means that before applying the simplex, you must make sure the optimum point you are looking for is not outside the feasible region.
EDIT 2:
I believe making the problem feasible could be done easily in O(N) by shifting the original functions into a new position by means of adding a big factor to the independent terms of each function. Check my comment below. (EDIT 3: this implies it won't work for every possible scenario, though it's quite easy to implement. If you want an exact answer for any possible scenario, check the following explanation on how to convert the infeasible quadrants into the feasible back and forth)
EDIT 3:
A better method to address this problem, one that is capable of precisely inferring the minimum point even if it is in the negative side of either x or y: converting to quadrant 1 all of the other 3.
Consider the following generic first degree function template:
f(x) = mx + k
Consider the following generic cartesian plane point template:
p = (p0, p1)
Converting a function and a point from y-negative quadrants to y-positive:
y_negative_to_y_positive( f(x) ) = -mx - k
y_negative_to_y_positive( p ) = (p0, -p1)
Converting a function and a point from x-negative quadrants to x-positive:
x_negative_to_x_positive( f(x) ) = -mx + k
x_negative_to_x_positive( p ) = (-p0, p1)
Summarizing:
quadrant     sign of corresponding (x, y)   converting f(x) or p to Q1
Quadrant 1   (+, +)                         f(x)
Quadrant 2   (-, +)                         x_negative_to_x_positive( f(x) )
Quadrant 3   (-, -)                         y_negative_to_y_positive( x_negative_to_x_positive( f(x) ) )
Quadrant 4   (+, -)                         y_negative_to_y_positive( f(x) )
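The conversions in the table are just reflections across the axes, so they can be sketched in a few lines. The Line and Point structs and the camelCase function names are assumptions made for this illustration.

```cpp
struct Line { double m; double k; };    // f(x) = m*x + k
struct Point { double x; double y; };   // p = (p0, p1)

// Reflect across the x-axis: y-negative quadrants become y-positive.
Line yNegativeToYPositive(Line f) { return Line{-f.m, -f.k}; }
Point yNegativeToYPositive(Point p) { return Point{p.x, -p.y}; }

// Reflect across the y-axis: x-negative quadrants become x-positive.
Line xNegativeToXPositive(Line f) { return Line{-f.m, f.k}; }
Point xNegativeToXPositive(Point p) { return Point{-p.x, p.y}; }

// Evaluate the line at x.
double eval(Line f, double x) { return f.m * x + f.k; }
```

A point that lies on a line still lies on it after both are converted: (-1, 1) is on f(x) = 2x + 3, and after the x-reflection (1, 1) is on f(x) = -2x + 3. Each reflection is its own inverse, so converting back uses the same functions.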
Now convert the functions from quadrants 2, 3 and 4 into quadrant 1. Run simplex 4 times, one based on the original quadrant 1 and the other 3 times based on the converted quadrants 2, 3 and 4. For the cases originating from a y-negative quadrant, you will need to model your simplex as a minimization instance, with negative slack variables, which will turn your constraints to the >= format. I will leave to you the details on how to model the same problem based on a minimization task.
Once you have the results for each quadrant, you will have at most 4 points in hand (because you might find, for example, that there is no point in a specific quadrant). Convert each of them back to its original quadrant by reversing the original conversion.
Now you may freely compare the 4 points with each other and decide which one is the one you need.
EDIT 1:
Note that you may have the quantity N of first degree functions as huge as you wish.
Other methods for solving this problem could be better.
EDIT 3: Check out the complexity of simplex. In the average case scenario, it works efficiently.
Cheers!

How to count how many valid colourings in a graph?

I attempted this SPOJ problem.
Problem:
AMR10J - Mixing Chemicals
There are N bottles each having a different chemical. For each chemical i, you have determined C[i] which means that mixing chemicals i and C[i] causes an explosion. You have K distinct boxes. In how many ways can you divide the N chemicals into those boxes such that no two chemicals in the same box can cause an explosion together?
INPUT
The first line of input is the number of test cases T. T test cases follow each containing 2 lines.
The first line of each test case contains 2 integers N and K.
The second line of each test case contains N integers, the ith integer denoting the value C[i]. The chemicals are numbered from 0 to N-1.
OUTPUT
For each testcase, output the number of ways modulo 1,000,000,007.
CONSTRAINTS
T <= 50
2 <= N <= 100
2 <= K <= 1000
0 <= C[i] < N
For all i, i != C[i]
SAMPLE INPUT
3
3 3
1 2 0
4 3
1 2 0 0
3 2
1 2 0
SAMPLE OUTPUT
6
12
0
EXPLANATION
In the first test case, we cannot mix any 2 chemicals. Hence, each of the 3 boxes must contain 1 chemical, which leads to 6 ways in total.
In the third test case, we cannot put the 3 chemicals in the 2 boxes satisfying all the 3 conditions.
To summarize the problem: given a set of chemicals and a set of boxes, count the number of ways to place the chemicals in the boxes such that no chemicals will explode.
At first I used a brute-force method: I recursively placed chemicals in boxes and counted the valid configurations. I got TLE on my first attempt.
Later I learned that the problem can be solved with graph colouring.
I can represent the chemicals as vertices, with an edge between two chemicals if they cannot be placed together.
The set of boxes can be used as the vertex colours, so all I need to do is count the number of different valid colourings of the graph.
I applied this concept to solve the problem, but unfortunately I got TLE again. I don't know how to improve my code; I need help.
code:
#include <bits/stdc++.h>
#define MAXN 100
using namespace std;
const int mod = (int) 1e9 + 7;
int n;
int k;
int ways;
void greedy_coloring(vector<int> adj[], int color[])
{
    int u = 0;
    for (; u < n; ++u)
        if (color[u] == -1)//found first uncolored vertex
            break;
    if (u == n)//no uncolored vertices means all vertices are colored
    {
        ways = (ways + 1) % mod;
        return;
    }
    bool available[k];
    memset(available, true, sizeof(available));
    for (int v : adj[u])
        if (color[v] != -1)//if the adjacent vertex is colored, make its color unavailable
            available[color[v]] = false;
    for (int c = 0; c < k; ++c)
        if (available[c])
        {
            color[u] = c;
            greedy_coloring(adj, color);
            color[u] = -1;//don't forget to reset the color
        }
}
int main()
{
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);
    int T;
    cin >> T;
    while (T--)
    {
        cin >> n >> k;
        vector<int> adj[n];
        int c[n];
        for (int i = 0; i < n; ++i)
        {
            cin >> c[i];
            adj[i].push_back(c[i]);
            adj[c[i]].push_back(i);
        }
        ways = 0;
        int color[n];
        memset(color, -1, sizeof(color));
        greedy_coloring(adj, color);
        cout << ways << "\n";
    }
    return 0;
}
Counting the number of colorings in a general graph is #P-hard, but this graph has some special structure, which I'll exploit in a minute after I enumerate some basic properties of counting colorings. The first observation is that, if the graph has a node with no neighbors, if we delete that node, the number of colorings decreases by a factor of k. The second observation is that, if a node has exactly one neighbor and we delete it, the number of colorings decreases by a factor of k-1. The third is that the number of colorings is equal to the product of the number of colorings for each connected component. The fourth is that we can delete all but one parallel edge.
Using these properties, it suffices to determine a formula for each connected component of the 2-core of this graph, which is a simple cycle of some length. Let P(n) and C(n) be the number of ways to color a path or cycle respectively with n nodes. We use the basic properties above to find
P(n) = k (k-1)^(n-1).
Finding a formula for C(n) I think requires the deletion contraction formula, which leads to a recurrence
C(3) = k (k-1) (k-2), i.e., three nodes of different colors;
C(n) = P(n) - C(n-1) = k (k-1)^(n-1) - C(n-1).
Multiply the above recurrence by (-1)^n.
(-1)^3 C(3) = -k (k-1) (k-2)
(-1)^n C(n) = (-1)^n k (k-1)^(n-1) - (-1)^n C(n-1)
= (-1)^n k (k-1)^(n-1) + (-1)^(n-1) C(n-1)
(-1)^n C(n) - (-1)^(n-1) C(n-1) = (-1)^n k (k-1)^(n-1)
Let D(n) = (-1)^n C(n).
D(3) = -k (k-1) (k-2)
D(n) - D(n-1) = (-1)^n k (k-1)^(n-1)
Now we can write D(n) as a telescoping sum:
D(n) = [sum_{i=4}^n (D(i) - D(i-1))] + D(3)
D(n) = [sum_{i=4}^n (-1)^i k (k-1)^(i-1)] - k (k-1) (k-2).
Break it down as two geometric sums which then cancel nicely.
D(n) = [sum_{i=4}^n (-1)^i ((k-1) + 1) (k-1)^(i-1)] - k (k-1) (k-2)
= sum_{i=4}^n (1-k)^i - sum_{i=4}^n (1-k)^(i-1) - k (k-1) (k-2)
= (1-k)^n - (1-k)^3 - k (k-1) (k-2)
= (1-k)^n - (1 - 3k + 3k^2 - k^3) - (2k - 3k^2 + k^3)
= (1-k)^n - (1-k)
C(n) = (-1)^n (1-k)^n - (-1)^n (1-k)
= (k-1)^n + (-1)^n (k-1).
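The closed form can be cross-checked numerically against the recurrence. This is only a small sketch; the function names are made up for the check, and it uses plain 64-bit integers without a modulus, so it is meant for small n and k.

```cpp
#include <cstdint>

// Integer power for small exponents (no overflow checks; illustration only).
std::int64_t ipow(std::int64_t base, int exp) {
    std::int64_t result = 1;
    for (int i = 0; i < exp; ++i) result *= base;
    return result;
}

// C(n) from the recurrence: C(3) = k(k-1)(k-2), C(n) = k(k-1)^(n-1) - C(n-1).
std::int64_t cycleByRecurrence(int n, std::int64_t k) {
    std::int64_t c = k * (k - 1) * (k - 2);
    for (int i = 4; i <= n; ++i) c = k * ipow(k - 1, i - 1) - c;
    return c;
}

// C(n) from the closed form: (k-1)^n + (-1)^n (k-1).
std::int64_t cycleClosedForm(int n, std::int64_t k) {
    const std::int64_t sign = (n % 2 == 0) ? 1 : -1;
    return ipow(k - 1, n) + sign * (k - 1);
}
```

For k = 3 both give C(3) = 6 and C(4) = 18.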
Note that after removing all parallel edges, we can have at most n edges. This means that in any one connected component we can only see one cycle (and simple at that), which makes the combinatorics rather straightforward. (Cycles are only dependent on how many edges each node can spawn, which is capped at 1.)
Second example:
k = 3
  << 0 <-- 3
 /   ^
/    ^
1 --> 2
Since cycles are self contained, any connection to one removes the possibility of another. In the example above, we cannot make a second cycle involving node 3 by adding more nodes, and the same issue would extend to any subsequent connected nodes.
It should be enough, therefore, to perform a search, separating out connected components and marking their node count and whether they contain a cycle. Given a connected component, where c of the nodes are part of a cycle and m nodes are not, we have the following formula (David Eisenstat helped me correct my combinatoric for the count of colourings of a cycle):
if the component has a cycle:
[(k - 1)^c + (-1)^c * (k - 1)] *
(k - 1)^(m)
otherwise:
k * (k - 1)^(m - 1)
As David Eisenstat noted, multiply all these results for the final tally.
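Putting the formulas together, a possible sketch of the whole count looks like this. It relies on the fact that every chemical i has exactly one C[i] with C[i] != i, so every connected component contains exactly one cycle (a 2-cycle when two chemicals point at each other) and the "otherwise" branch never occurs for this input format. The names countColourings and powMod are assumptions, not taken from either answer.

```cpp
#include <cstdint>
#include <vector>

const std::int64_t MOD = 1000000007;

// Modular exponentiation by squaring.
std::int64_t powMod(std::int64_t base, std::int64_t exp) {
    std::int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Number of ways to put the chemicals into k boxes, modulo MOD.
std::int64_t countColourings(const std::vector<int>& C, std::int64_t k) {
    const int n = static_cast<int>(C.size());
    std::vector<int> state(n, 0);  // 0 = unseen, 1 = on the current walk, 2 = finished
    std::int64_t ways = 1;
    int cycleNodes = 0;
    for (int start = 0; start < n; ++start) {
        if (state[start] != 0) continue;
        int u = start;
        while (state[u] == 0) { state[u] = 1; u = C[u]; }
        if (state[u] == 1) {  // the walk ran into itself: u lies on a new cycle
            int len = 0;
            int v = u;
            do { ++len; v = C[v]; } while (v != u);
            cycleNodes += len;
            // (k-1)^len + (-1)^len (k-1)
            std::int64_t term = powMod(k - 1, len);
            const std::int64_t corr = (k - 1) % MOD;
            term = (len % 2 == 0) ? (term + corr) % MOD : (term - corr + MOD) % MOD;
            ways = ways * term % MOD;
        }
        // Mark everything visited on this walk as finished.
        for (int v = start; state[v] == 1; v = C[v]) state[v] = 2;
    }
    // Every node that is not on a cycle contributes a factor of (k-1).
    return ways * powMod(k - 1, n - cycleNodes) % MOD;
}
```

On the three sample cases this gives 6, 12 and 0, and it runs in O(N log N) per test case instead of enumerating colourings.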

Bellman-Ford algorithm in 2D array

I've got a problem applying the Bellman-Ford algorithm to a 2D array (not to a graph).
The input array has m x n dimensions:
         s[1,1] s[1,2] ... s[1,n] -> Exit
         s[2,1] s[2,2] ... s[2,n]
         ...
Entry -> s[m,1] s[m,2] ... s[m,n]
It is room-like: each entry is a room with an entrance cost of s[x,y]. A room can also have a negative cost, and we have to find the cheapest path from Entry to a chosen room and then to Exit.
For example, we've got this array of rooms and costs:
1 5 6
2 -3 4
5 2 -8
And we want to walk through room [3,2], s[3,2] = 4. We start from 5 at [1,3] and must walk through [3,2] before we go to [3,3].
My question is: what is the best way to implement this with the Bellman-Ford algorithm? I know that Dijkstra's algorithm will not work because of the negative costs.
Is it correct to go through each room from [0, maxHeight] and relax all of its neighbors? Like this:
for (int i = height-1; i >= 0; --i) {
    for (int j = 0; j < width; ++j) {
        int x = i;
        int y = j;
        if (x > 0) // up
            Relax(x, y, x - 1, y);
        if (y + 1 < width) // right
            Relax(x, y, x, y + 1);
        if (y > 0) // left
            Relax(x, y, x, y - 1);
        if (x + 1 < height) // down
            Relax(x, y, x + 1, y);
    }
}
But how can I then read the cost to the chosen room, and from that room to the exit?
If you already know how to treat the array as a graph, you can skip to the "Additional condition" paragraph, but read the next paragraph as well.
In fact, you can look at the building as a graph.
[Figure: the rooms drawn as a graph; the doors in the second row are missing from the drawing.]
So, how can it be implemented? Ignore the additional condition (visiting a particular vertex before leaving) for the moment.
Weight function:
Let S[][] be the array of entry costs. Notice that the weight of an edge is decided only by the vertex at its end. It does not matter whether the edge is (1, 2) -> (1, 3) or (2, 3) -> (1, 3); the cost is defined by the second vertex. So the function may look like:
cost_type cost(vertex v, vertex w) {
    return S[w.y][w.x];
}
//As you can see, first argument is unnecessary.
Edges:
In fact you don't have to keep all the edges in an array. You can compute them in a function every time you need them.
The neighbours of vertex (x, y) are (x+1, y), (x-1, y), (x, y+1), (x, y-1), if those nodes exist. You have to check that, but it's easy (check that the new coordinates are within bounds). It may look like this:
//Size of matrix is M x N
bool is_correct(vertex w) {
    if(w.y < 1 || w.y > M || w.x < 1 || w.x > N) {
        return INCORRECT;
    }
    return CORRECT;
}
Generating neighbours can look like:
std::tie(x, y) = std::make_tuple(v.x, v.y);
for(vertex w : {{x+1, y}, {x-1, y}, {x, y+1}, {x, y-1}}) {
    if(is_correct(w) == CORRECT) {//CORRECT may be true
        relax(v, w);
    }
}
I believe that this shouldn't take extra memory for the four edges. If you don't know std::tie, look it up on cppreference. (The extra variables x, y take more memory, but I believe they make this more readable. They need not appear in your code.)
Obviously you need another 2D array for the distances and (if necessary) the predecessors, but I think that's clear and I don't have to describe it.
Additional condition:
You want to know the cost from entry to exit, but you must visit some compulsory vertex on the way. The easiest way is to compute the cost from the entry to the compulsory vertex and from the compulsory vertex to the exit. (These are two separate calculations.) This does not change the big-O time. After that you can just add the results.
You just have to guarantee that it's impossible to visit the exit before the compulsory vertex. That's easy: you can erase the outgoing edges of the exit by adding an extra line to the is_correct function (then the vertex v will be needed as an argument) or to the neighbour-generating code fragment.
Now you can implement it based on the Wikipedia description of Bellman-Ford. You have a graph.
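To make the pieces above concrete, here is a rough sketch of Bellman-Ford run directly on the grid, where entering a room costs S[row][col] and the start room is not paid for. The Grid alias, the function name and the 0-based (row, col) indexing are my assumptions. The sketch does not remove the outgoing edges of the exit, and it reports failure when the distances keep improving after H*W passes, which means a negative cycle is reachable.

```cpp
#include <limits>
#include <vector>

using Grid = std::vector<std::vector<long long>>;

// Fills dist[r][c] with the cheapest cost of reaching (r, c) from (startRow, startCol).
// Returns false if a negative cycle is reachable (distances never settle).
bool gridBellmanFord(const Grid& S, int startRow, int startCol, Grid& dist) {
    const int H = static_cast<int>(S.size());
    const int W = static_cast<int>(S[0].size());
    const long long INF = std::numeric_limits<long long>::max();
    dist.assign(H, std::vector<long long>(W, INF));
    dist[startRow][startCol] = 0;
    const int dr[4] = {-1, 1, 0, 0};
    const int dc[4] = {0, 0, -1, 1};
    for (int pass = 0; pass < H * W; ++pass) {
        bool changed = false;
        for (int r = 0; r < H; ++r) {
            for (int c = 0; c < W; ++c) {
                if (dist[r][c] == INF) continue;  // not reached yet
                for (int d = 0; d < 4; ++d) {
                    const int nr = r + dr[d], nc = c + dc[d];
                    if (nr < 0 || nr >= H || nc < 0 || nc >= W) continue;
                    // The edge weight is the cost of the room being entered.
                    const long long cand = dist[r][c] + S[nr][nc];
                    if (cand < dist[nr][nc]) { dist[nr][nc] = cand; changed = true; }
                }
            }
        }
        if (!changed) return true;  // converged
    }
    return false;  // still changing after H*W passes
}
```

For the compulsory room, one run from that room gives the cost to the exit, and a second run from the entry gives the cost to the room. Note that when moving back and forth is allowed, two adjacent rooms whose costs sum to less than zero (such as -3 and 2 in the question's example) already form a negative cycle, so this sketch returns false on that grid.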
Why shouldn't you listen to the above?
A better way is to run the Bellman-Ford algorithm from the other vertex. Notice that if you know the optimal path from A to B, you also know the optimal path from B to A. Why? You always pay for the last vertex and never for the first, so you can ignore the costs of both endpoints. The rest is obvious.
Now, if you want the paths A->B and B->C, you can compute B->A and B->C with a single Bellman-Ford run from node B and then reverse the path B->A. That's it.
You just have to erase the outgoing edges from the entry and exit nodes.
However, if you need a very fast algorithm, you have to optimize this further. But that is for another topic, I think. Also, it looks like no one is interested in heavy optimization.
I can quickly add one small and easy optimization: you can skip relaxation from vertices that are sufficiently far away. In an array the distance is easy to compute, so it's a pleasant optimization.
I have not mentioned the well-known optimizations, because I believe all of them can be found in any course on the web.

4 by 3 lock pattern

I came across this problem, which asks for the number of ways a lock pattern of a specific length can be drawn on a 4x3 grid while following the rules below. Some of the points may be excluded from the path.
A valid pattern has the following properties:
A pattern can be represented using the sequence of points which it's touching for the first time (in the same order of drawing the pattern), a pattern going from (1,1) to (2,2) is not the same as a pattern going from (2,2) to (1,1).
For every two consecutive points A and B in the pattern representation, if the line segment connecting A and B passes through some other points, these points must be in the sequence also and comes before A and B, otherwise the pattern will be invalid. For example a pattern representation which starts with (3,1) then (1,3) is invalid because the segment passes through (2,2) which didn't appear in the pattern representation before (3,1), and the correct representation for this pattern is (3,1) (2,2) (1,3). But the pattern (2,2) (3,2) (3,1) (1,3) is valid because (2,2) appeared before (3,1).
In the pattern representation we don't mention the same point more than once, even if the pattern will touch this point again through another valid segment, and each segment in the pattern must be going from a point to another point which the pattern didn't touch before and it might go through some points which already appeared in the pattern.
The length of a pattern is the sum of the Manhattan distances between every two consecutive points in the pattern representation. The Manhattan distance between two points (X1, Y1) and (X2, Y2) is |X1 - X2| + |Y1 - Y2| (where |X| means the absolute value of X).
A pattern must touch at least two points
My approach was brute force: loop over the points, start at each point and recursively decrement the length until it reaches zero, then add 1 to the number of combinations.
Is there a way to calculate this with a mathematical formula, or is there a better algorithm for this?
UPDATE:
Here is what I have done. It gives some wrong answers! I think the problem is in the isOk function.
notAllowed is a global bit mask of the points that are not allowed.
bool isOk(int i, int j, int di,int dj, ll visited){
    int mini = (i<di)?i:di;
    int minj = (j<dj)?j:dj;
    if(abs(i-di) == 2 && abs(j-dj) == 2 && !getbit(visited, mini+1, minj+1) )
        return false;
    if(di == i && abs(j - dj) == 2 && !getbit(visited, i,minj+1) )
        return false;
    if(di == i && abs(j-dj) == 3 && (!getbit(visited, i,1) || !getbit(visited, i,2)) )
        return false;
    if(dj == j && abs(i - di) == 2 && !getbit(visited, 1,j) )
        return false;
    return true;
}
int f(int i, int j, ll visited, int l){
    if(l > L) return 0;
    short& res = dp[i][j][visited][l];
    if(res != -1) return res;
    res = 0;
    if(l == L) return ++res;
    for(int di=0 ; di<gN ; ++di){
        for(int dj=0 ; dj<gM ; ++dj){
            if( getbit(notAllowed, di, dj) || getbit(visited, di, dj) || !isOk(i,j, di,dj, visited) )
                continue;
            res += f(di, dj, setbit(visited, di, dj), l+dist(i,j , di,dj));
        }
    }
    return res;
}
My answer to another question can be adapted to this problem as well.
Let f(i,j,visited,k) be the number of ways to complete a partial pattern when we are currently at node (i,j), have already visited the vertices in the set visited, and have so far walked a path of length k. We can represent visited as a bitmask.
We can compute f(i,j,visited,k) recursively by trying all possible next moves and apply DP to reuse subproblem solutions:
f(i,j, visited, L) = 1
f(i,j, visited, k) = 0 if k > L
f(i,j, visited, k) = sum(possible moves (i', j'): f(i', j', visited UNION {(i',j')}, k + dis((i,j), (i',j'))))
Possible moves are those that cross a number of visited vertices and then end in an unvisited (and not forbidden) one.
If D is the set of forbidden vertices, the answer to the question is
sum((i,j) not in D: f(i,j, {(i,j)}, L)).
The runtime is something like O(X^2 * Y^2 * 2^(X*Y) * maximum possible length). I guess the maximum possible length is in fact well below 1000.
UPDATE: I implemented this solution and it got accepted. I enumerated the possible moves in the following way: Assume we are at point (i,j) and have already visited the set of vertices visited. Enumerate all distinct coprime pairs (dx,dy) 0 <= dx < X and 0 <= dy < Y. Then find the smallest k with P_k = (i + kdx, j + kdy) still being a valid grid point and P_k not in visited. If P_k is not forbidden, it is a valid move.
The maximum possible path length is 39.
I'm using a DP array of size 3 * 4 * 2^12 * 40 to store the subproblem results.
There are a couple of attributes of the combinations that may be used to optimize the brute force method:
Using mirror images (horizontal, vertical, or both) you can generate 4 combinations for each one found (except horizontal or vertical lines). Maybe you could consider only combinations starting in one quadrant.
You can usually generate additional combinations of the same length by translation (moving a combination).

Improve minimum distance filter for pointset

I created a minimum distance filter for points.
The function takes a stream of points (x1,y1,x2,y2...) and removes the ones that are too close to another point.
void minDistanceFilter(vector<float> &points, float distance = 0.0)
{
    float p0x, p0y;
    float dx, dy, dsq;
    float mdsq = distance*distance; // minimum distance square
    size_t i, j;
    for(i=0; i+1<points.size(); i+=2) // step over (x,y) pairs
    {
        p0x = points[i];
        p0y = points[i+1];
        for(j=i+2; j+1<points.size(); ) // later points only, so a point never tests itself
        {
            dx = p0x - points[j]; // delta x (p0x - p1x)
            dy = p0y - points[j+1]; // delta y (p0y - p1y)
            dsq = dx*dx + dy*dy; // distance square
            if (dsq < mdsq)
            {
                auto del = points.begin() + j;
                points.erase(del,del+2); // erase both x and y; the next point slides into j
            }
            else
            {
                j += 2;
            }
        }
    }
}
The only problem is that it is very slow, because it tests all points against all points (n^2).
How could it be improved?
kd-trees or range trees could be used for your problem. However, if you want to code from scratch and want something simpler, then you can use a hash table structure. For each point (a,b), hash using the key (round(a/d),round(b/d)) and store all the points that have the same key in a list. Then, for each key (m,n) in your hash table, compare all points in the list to the list of points that have key (m',n') for all 9 choices of (m',n') where m' = m + (-1 or 0 or 1) and n' = n + (-1 or 0 or 1). These are the only points that can be within distance d of your points that have key (m,n). The downside compared to a kd-tree or range tree is that for a given point, you are effectively searching within a square of side length 3*d for points that might have distance d or less, instead of searching within a square of side length 2*d which is what you would get if you used a kd-tree or range tree. But if you are coding from scratch, this is easier to code; also kd-trees and range trees are kinda overkill if you only have one universal distance d that you care about for all points.
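A rough sketch of the hash-table idea, under a few assumptions of mine: points are processed in input order, a point is dropped when an already kept point is closer than the distance, and the cell coordinates fit in 32 bits so two of them can be packed into one key. The function name and the choice to return a new vector are also just for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Pack two cell coordinates into one 64-bit key (assumes each fits in 32 bits).
static std::uint64_t cellKey(long long cx, long long cy) {
    return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
           static_cast<std::uint64_t>(static_cast<std::uint32_t>(cy));
}

// Returns the points (x1,y1,x2,y2,...) that are at least `distance` away
// from every point kept before them.
std::vector<float> minDistanceFilterGrid(const std::vector<float>& points, float distance) {
    if (distance <= 0.0f) return points;  // nothing can be closer than 0
    const float mdsq = distance * distance;
    std::vector<float> kept;
    // cell key -> indices (of the x value) of kept points in that cell
    std::unordered_map<std::uint64_t, std::vector<std::size_t>> cells;
    for (std::size_t i = 0; i + 1 < points.size(); i += 2) {
        const float px = points[i], py = points[i + 1];
        const long long cx = std::llround(px / distance);
        const long long cy = std::llround(py / distance);
        bool tooClose = false;
        // Only the 9 surrounding cells can hold a point within `distance`.
        for (int dx = -1; dx <= 1 && !tooClose; ++dx) {
            for (int dy = -1; dy <= 1 && !tooClose; ++dy) {
                const auto it = cells.find(cellKey(cx + dx, cy + dy));
                if (it == cells.end()) continue;
                for (const std::size_t idx : it->second) {
                    const float ddx = px - kept[idx];
                    const float ddy = py - kept[idx + 1];
                    if (ddx * ddx + ddy * ddy < mdsq) { tooClose = true; break; }
                }
            }
        }
        if (!tooClose) {
            cells[cellKey(cx, cy)].push_back(kept.size());
            kept.push_back(px);
            kept.push_back(py);
        }
    }
    return kept;
}
```

Because kept points are at least `distance` apart, each cell holds only a handful of them, so the expected work per input point is constant and the whole pass is roughly linear instead of quadratic.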
Look up range tree, e.g. en.wikipedia.org/wiki/Range_tree . You can use this structure to store 2-dimensional points and very quickly find all the points that lie inside a query rectangle. Since you want to find points within a certain distance d of a point (a,b), your query rectangle will need to be [a-d,a+d]x[b-d,b+d] and then you test any points found inside the rectangle to make sure they are actually within distance d of (a,b). Range tree can be built in O(n log n) time and space, and range queries take O(log n + k) time where k is the number of points found in the rectangle. Seems optimal for your problem.