Cluster terrain map disjoint by river through map/matrix? - c++

I have a simple matrix that represents the terrain map of a 2D game. It contains ASCII characters, for example 'm' for mountain, 'v' for valley and 'r' for river. The map contains at most one river. The river can flow from any position in the matrix to any other, and it always separates the map into two distinct parts: there is no river source on the map, so the river always enters at one edge and exits at another. How can I separate the matrix/terrain map into two clusters when a river is present?
example terrain
v v v v v v v v r v v v v v
v v v v v m m m r m m m m m
v v v v v m m r r m m m m m
m m v m m m m r r m m m v v
v v v v v v r r v v v v v v
Here I should get a left cluster and a right cluster of the coordinates that are not river.

You should try looking up the flood fill algorithm.
http://en.wikipedia.org/wiki/Flood_fill
Basically, pick a point that is not in the river and start the flood fill from it; this gives you the set of points connected to the starting point. Now you have one part, and finding the other one is easy: it is every non-river cell that was not reached.

Your map induces a graph:
there's one vertex for each map cell
two vertices are connected if they are adjacent and neither of them is an 'r'
Once the graph is constructed, you can run a graph traversal algorithm like breadth-first search (BFS) or depth-first search (DFS) to find the connected components of the graph.
I'd recommend using BFS, because if the map is large then DFS might get you into a stack overflow (if its recursive implementation is used).
You'll want to run the BFS only on non-'r' nodes, so that in the end you'll end up with two connected components.
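As a rough illustration of the BFS idea, here is a minimal C++ sketch. It assumes the map is stored as a vector of equal-length strings without the spaces shown in the question, and that cells are adjacent only horizontally and vertically (4-connectivity); the function name labelComponents is made up for this example.

```cpp
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Labels every non-river cell with a component id (0, 1, ...).
// River cells ('r') keep the label -1. Returns the number of components.
// Assumption: 4-connectivity (diagonal neighbours do not count as adjacent).
int labelComponents(const std::vector<std::string>& map,
                    std::vector<std::vector<int>>& label) {
    const int rows = static_cast<int>(map.size());
    const int cols = rows ? static_cast<int>(map[0].size()) : 0;
    label.assign(rows, std::vector<int>(cols, -1));
    const int dr[] = {-1, 1, 0, 0};
    const int dc[] = {0, 0, -1, 1};
    int components = 0;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (map[r][c] == 'r' || label[r][c] != -1) continue;
            // Start a new BFS from this unvisited non-river cell.
            std::queue<std::pair<int, int>> q;
            q.push(std::make_pair(r, c));
            label[r][c] = components;
            while (!q.empty()) {
                const std::pair<int, int> cur = q.front();
                q.pop();
                for (int k = 0; k < 4; ++k) {
                    const int nr = cur.first + dr[k];
                    const int nc = cur.second + dc[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                    if (map[nr][nc] == 'r' || label[nr][nc] != -1) continue;
                    label[nr][nc] = components;
                    q.push(std::make_pair(nr, nc));
                }
            }
            ++components;
        }
    }
    return components;
}
```

If diagonal moves should count as adjacent, a diagonally flowing river no longer separates the map, so the choice of neighbourhood matters here.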

Related

How to find all equals paths in degenerate tree from specific start vertex? [duplicate]

I have a degenerate tree (it looks like an array or a doubly linked list). For example, this tree:
Each edge has some weight. I want to find all equal paths, which starts in each vertex.
In other words, I want to get all tuples (v1, v, v2) where v1 and v2 are an arbitrary ancestor and descendant such that c(v1, v) = c(v, v2).
Let edges have the following weights (it is just example):
a-b = 3
b-c = 1
c-d = 1
d-e = 1
Then:
The vertex A does not have any equal path (there is no vertex from left side).
The vertex B has one equal pair. The path B-A equals to the path B-E (3 == 3).
The vertex C has one equal pair. The path B-C equals to the path C-D (1 == 1).
The vertex D has one equal pair. The path C-D equals to the path D-E (1 == 1).
The vertex E does not have any equal path (there is no vertex from right side).
I implemented a simple algorithm that works in O(n^2), but it is too slow for me.
You write, in comments, that your current approach is
It seems, I looking for a way to decrease constant in O(n^2). I choose
some vertex. Then I create two set. Then I fill these sets with
partial sums, while iterating from this vertex to start of tree and to
finish of tree. Then I find set intersection and get number of paths
from this vertex. Then I repeat algorithm for all other vertices.
There is a simpler and, I think, faster O(n^2) approach, based on the so called two pointers method.
For each vertex v, go in both possible directions at the same time. Have one "pointer" to a vertex (vl) moving in one direction and another (vr) moving in the other direction, and try to keep the distance from v to vl as close to the distance from v to vr as possible. Each time these distances become equal, you have found a pair of equal paths.
    ans = 0
    for v in vertices
        vl = prev(v)
        vr = next(v)
        while (vl is still inside the tree)
                and (vr is still inside the tree)
            if dist(v,vl) < dist(v,vr)
                vl = prev(vl)
            else if dist(v,vr) < dist(v,vl)
                vr = next(vr)
            else // dist(v,vr) == dist(v,vl)
                ans = ans + 1
                vl = prev(vl)
                vr = next(vr)
(By precalculating the prefix sums, you can find dist in O(1).)
It's easy to see that no equal pair will be missed provided that you do not have zero-length edges.
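The two pointers loop could look roughly like this in C++. This is only a sketch: it assumes the chain is given as a vector of strictly positive edge weights (weights[k] joins vertex k and vertex k+1), and the function name countEqualPaths is invented for the example.

```cpp
#include <cstddef>
#include <vector>

// Counts tuples (vl, v, vr) with dist(vl, v) == dist(v, vr) on a chain.
// weights[k] is the weight of the edge between vertex k and vertex k+1.
// Assumption: all weights are strictly positive.
long long countEqualPaths(const std::vector<long long>& weights) {
    const std::size_t n = weights.size() + 1;   // number of vertices
    std::vector<long long> pos(n, 0);           // prefix sums: pos[k] = dist(v0, vk)
    for (std::size_t k = 1; k < n; ++k) pos[k] = pos[k - 1] + weights[k - 1];

    long long ans = 0;
    for (std::size_t v = 1; v + 1 < n; ++v) {   // end vertices have no pairs
        std::size_t vl = v - 1, vr = v + 1;
        while (true) {
            const long long dl = pos[v] - pos[vl];
            const long long dr = pos[vr] - pos[v];
            if (dl < dr) {                      // left path is shorter: extend it
                if (vl == 0) break;
                --vl;
            } else if (dr < dl) {               // right path is shorter: extend it
                if (++vr == n) break;
            } else {                            // equal paths found
                ++ans;
                if (vl == 0) break;
                --vl;
                if (++vr == n) break;
            }
        }
    }
    return ans;
}
```

For the weights 3, 1, 1, 1 from the question this counts the three pairs centred at B, C and D.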
Regarding a faster solution, if you want to list all pairs, then you can't do it faster, because the number of pairs will be O(n^2) in the worst case. But if you need only the amount of these pairs, there might exist faster algorithms.
UPD: I came up with another algorithm for calculating the amount, which might be faster in case your edges are rather short. If you denote the total length of your chain (sum of all edges weight) as L, then the algorithm runs in O(L log L). However, it is much more advanced conceptually and more advanced in coding too.
Firstly some theoretical reasoning. Consider some vertex v. Let us have two arrays, a and b, not the C-style zero-indexed arrays, but arrays with indexation from -L to L.
Let us define
for i>0, a[i]=1 iff to the right of v on the distance exactly i there
is a vertex, otherwise a[i]=0
for i=0, a[i]≡a[0]=1
for i<0, a[i]=1 iff to the left of v on the distance exactly -i there is a vertex, otherwise a[i]=0
A simple understanding of this array is as follows. Stretch your graph and lay it along the coordinate axis so that each edge has the length equal to its weight, and that vertex v lies in the origin. Then a[i]=1 iff there is a vertex at coordinate i.
For your example and for vertex "b" chosen as v:
           a--------b--c--d--e
      --|--|--|--|--|--|--|--|--|-->
       -4 -3 -2 -1  0  1  2  3  4
a: ...  0  1  0  0  1  1  1  1  0 ...
For another array, array b, we define the values in a symmetrical way with respect to origin, as if we have inverted the direction of the axis:
for i>0, b[i]=1 iff to the left of v on the distance exactly i there
is a vertex, otherwise b[i]=0
for i=0, b[i]≡b[0]=1
for i<0, b[i]=1 iff to the right of v on the distance exactly -i there is a vertex, otherwise b[i]=0
Now consider a third array c such that c[i]=a[i]*b[i], where the asterisk stands for ordinary multiplication. Obviously c[i]=1 iff the path of length abs(i) to the left ends in a vertex, and the path of length abs(i) to the right ends in a vertex. So for i>0 each position in c that has c[i]=1 corresponds to a path you need. There are also negative positions (c[i]=1 with i<0), which just mirror the positive positions, and one more position where c[i]=1, namely position i=0.
Calculate the sum of all elements in c. This sum will be sum(c)=2P+1, where P is the total number of paths which you need with v being its center. So if you know sum(c), you can easily determine P.
Let us now look more closely at the arrays a and b and at how they change when we change the vertex v. Let us denote by v0 the leftmost vertex (the root of your tree) and by a0 and b0 the corresponding a and b arrays for that vertex.
For arbitrary vertex v denote d=dist(v0,v). Then it is easy to see that for vertex v the arrays a and b are just arrays a0 and b0 shifted by d:
a[i]=a0[i+d]
b[i]=b0[i-d]
It is obvious if you remember the picture with the tree stretched along a coordinate axis.
Now let us consider one more array, S (one array for all vertices), and for each vertex v let us put the value of sum(c) into the S[d] element (d and c depend on v).
More precisely, let us define array S so that for each d
S[d] = sum_over_i(a0[i+d]*b0[i-d])
Once we know the S array, we can iterate over vertices and for each vertex v obtain its sum(c) simply as S[d] with d=dist(v,v0), because for each vertex v we have sum(c)=sum(a0[i+d]*b0[i-d]).
But the formula for S is very simple: S is just the convolution of the a0 and b0 sequences. (The formula does not exactly follow the definition, but is easy to modify to the exact definition form.)
So what we now need is given a0 and b0 (which we can calculate in O(L) time and space), calculate the S array. After this, we can iterate over S array and simply extract the numbers of paths from S[d]=2P+1.
Direct application of the formula above is O(L^2). However, the convolution of two sequences can be calculated in O(L log L) by applying the Fast Fourier transform algorithm. Moreover, you can apply a similar Number theoretic transform (don't know whether there is a better link) to work with integers only and avoid precision problems.
So the general outline of the algorithm becomes
    calculate a0 and b0    // O(L)
    calculate S = corrected_convolution(a0, b0)    // O(L log L)
    v0 = leftmost vertex (root)
    for v in vertices:
        d = dist(v0, v)
        ans = ans + (S[d]-1)/2
(I call it corrected_convolution because S is not exactly a convolution, but a very similar object for which a similar algorithm can be applied. Moreover, you can even define S'[2*d]=S[d]=sum(a0[i+d]*b0[i-d])=sum(a0[i]*b0[i-2*d]), and then S' is the convolution proper.)
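To make the definition of S concrete, here is a small C++ sketch that computes it with a direct (quadratic) convolution instead of FFT or NTT; a fast transform would replace the double loop. It relies on b0[i] = a0[-i], so S[d] = sum(a0[i+d]*a0[d-i]) is the element at index 2*d of the convolution of a0 with itself. It assumes strictly positive integer weights, and the function name is invented for the example.

```cpp
#include <cstddef>
#include <vector>

// Counts equal-path pairs via S[d] = (a0 convolved with a0)[2*d].
// Assumption: weights are strictly positive integers, so every vertex
// sits at a distinct integer coordinate in [0, L].
long long countEqualPathsByConvolution(const std::vector<int>& weights) {
    std::vector<std::size_t> pos(1, 0);         // coordinate of each vertex
    for (int w : weights) pos.push_back(pos.back() + static_cast<std::size_t>(w));
    const std::size_t L = pos.back();

    std::vector<long long> a0(L + 1, 0);        // a0[i] = 1 iff a vertex is at i
    for (std::size_t p : pos) a0[p] = 1;

    // Naive O(L^2) convolution; this is the step FFT/NTT would speed up.
    std::vector<long long> conv(2 * L + 1, 0);
    for (std::size_t i = 0; i <= L; ++i) {
        if (a0[i] == 0) continue;
        for (std::size_t j = 0; j <= L; ++j) conv[i + j] += a0[i] * a0[j];
    }

    long long ans = 0;
    for (std::size_t d : pos) ans += (conv[2 * d] - 1) / 2;   // S[d] = 2P + 1
    return ans;
}
```

For the example weights 3, 1, 1, 1 the vertices sit at 0, 3, 4, 5, 6, and S is 1, 3, 3, 3, 1 at those coordinates, which gives the same three pairs as the two pointers method.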

Adjacency matrix and Bron–Kerbosch algorithm

I want to find maximum cliques in a graph that is given to me in the form of an adjacency matrix. I am given the number of shops that must share the same product tag, and I need to determine whether a sufficient number of such shops was found
so input goes along lines
x - shop count.
y - product count/tag.
z - in how many shops does the product need to be present.
so let's say I got
5 - x
2 - y
4 - z
Then the adjacency matrix going with it is:
0 1 1 1 1
1 0 2 2 1
1 2 0 2 2
1 2 2 0 1
1 1 2 1 0
There are two different products available, and now I want to find out whether there are at least 4 shops selling a specific product. I found out about the Bron–Kerbosch algorithm, e.g. http://en.wikipedia.org/wiki/Bron%E2%80%93Kerbosch_algorithm
But I don't know how to pick my R, P and X subsets and how to represent them. It does not have to be very efficient nor I believe there is a need for any more advanced data structure than a 2D array but I just don't know how to use this adjacency matrix as a list of my vertices etc. Could anyone give me an idea on how to get started with this algorithm? Probably telling me how to treat R, P and X with my data would be sufficient. I would want to create my program in C++
From the wikipedia article:
BronKerbosch3(G):
    P = V(G)
    R = X = empty
    for each vertex v in a degeneracy ordering of G:
        BronKerbosch2(R ⋃ {v}, P ⋂ N(v), X ⋂ N(v))
        P := P \ {v}
        X := X ⋃ {v}
where G is the graph and BronKerbosch2() is defined as:
BronKerbosch2(R,P,X):
    if P and X are both empty:
        report R as a maximal clique
    choose a pivot vertex u in P ⋃ X
    for each vertex v in P \ N(u):
        BronKerbosch2(R ⋃ {v}, P ⋂ N(v), X ⋂ N(v))
        P := P \ {v}
        X := X ⋃ {v}
So now you know your choice of R, P and X: R starts empty, P starts as the set of all vertices, and X starts empty. Also look up what an adjacency matrix actually is, as mentioned in the comments, and read the article carefully for the use of degeneracy in the algorithm.
In C++, you could use std::array<std::array<int, SIZE>, SIZE> for the 2D adjacency matrix and then proceed with the algorithm.
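As a starting point, here is a sketch of the basic Bron–Kerbosch recursion (without pivoting or degeneracy ordering) in C++. It makes an assumption about the question's data that the question does not confirm: two shops i and j are treated as connected for a given product when adj[i][j] equals that product's tag. R, P and X are plain vectors of vertex indices; the names bronKerbosch and maxCliqueSize are made up for the example, and std::vector is used instead of std::array so the size need not be known at compile time.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Basic Bron–Kerbosch (no pivot). R = current clique, P = candidates,
// X = already processed vertices. Tracks the largest maximal clique size.
// Assumption: u and v are adjacent for this product iff adj[u][v] == tag.
void bronKerbosch(const Matrix& adj, int tag, std::vector<int> R,
                  std::vector<int> P, std::vector<int> X, std::size_t& best) {
    if (P.empty() && X.empty()) {               // R is a maximal clique
        best = std::max(best, R.size());
        return;
    }
    const std::vector<int> candidates = P;      // P is modified inside the loop
    for (int v : candidates) {
        std::vector<int> R2 = R;
        R2.push_back(v);                        // R ⋃ {v}
        std::vector<int> P2, X2;
        for (int u : P) if (u != v && adj[v][u] == tag) P2.push_back(u);  // P ⋂ N(v)
        for (int u : X) if (u != v && adj[v][u] == tag) X2.push_back(u);  // X ⋂ N(v)
        bronKerbosch(adj, tag, R2, P2, X2, best);
        P.erase(std::find(P.begin(), P.end(), v));   // P := P \ {v}
        X.push_back(v);                              // X := X ⋃ {v}
    }
}

// Size of the largest clique among edges carrying the given tag.
std::size_t maxCliqueSize(const Matrix& adj, int tag) {
    std::vector<int> P;
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) P.push_back(v);
    std::size_t best = 0;
    bronKerbosch(adj, tag, std::vector<int>(), P, std::vector<int>(), best);
    return best;
}
```

Under that reading of the matrix, checking the requirement is a comparison such as maxCliqueSize(adj, tag) >= z for each tag.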

Find line-meshgrid intersections without sorting?

I am trying to find line-meshgrid intersections without sorting. Here is the figure:
Known:
The two intersection points on the boundary, (x0,y0) and (xN,yN), are known.
The position of each meshgrid line is known. [-R R] is the span of the meshgrid.
The meshgrid is centered at Cartesian origin symmetrically.
What I want:
I'd like to get an array of all intersections in either ascending or descending order, based on the distance from each point to either the starting point (x0,y0), or the end point (xN,yN).
For example:
(x0,y0), (x1,y1), (x2,y2), ..., (xN,yN): acceptable.
(xN,yN), (xN-1,yN-1), (xN-2,yN-2), ..., (x0,y0): acceptable.
(x0,y0), (x3,y3), (x1,y1), ..., (xN,yN): not acceptable.
What I am stuck at:
I understand I can at least calculate each intersection with a for loop, but I don't know how to store the intersections in the order mentioned above without sorting (e.g. bubble sort). Say I start from (x0,y0): which way should I go to find my first intersection? In particular, should I move along the x direction or along the y direction to hit my first intersection? And what about the next move, to the second one?
I wonder whether there is a way to do it in a "natural" geometric way. The slope of the line is known (assuming the line is not vertical), and the meshgrid is known, so is there any trick we can play here? Thanks a lot
In addition:
What if I'd like to do all the intersections in parallel? Say, in CUDA.
Assuming a unit tile size, the coordinates of the intersections are found at x = i and y = j respectively, for increasing indexes.
Using the parametric line equation x = X + t U, y = Y + t V, the intersections occur at t = (i - X) / U and t = (j - Y) / V, which we rewrite U V t = V (i - X), U V t = U (j - Y), for convenience.
These two sequences are naturally sorted, they follow two arithmetic progressions of common differences V and U and initial indexes i = Ceil(X), j = Ceil(Y). Then what you need to do is a merge of the two sequences.
    # Initialize
    i = Ceil(X), j = Ceil(Y)
    Tx = V (i - X), Ty = U (j - Y)
    # Loop until the final point
    while i < XX and j < YY:
        # Emit the nearer pending intersection, then advance that sequence
        if Tx < Ty:
            Report the crossing at x = i, increment i, Tx += V
        elif Tx > Ty:
            Report the crossing at y = j, increment j, Ty += U
        else:
            Report the grid corner (i, j), increment both i and j, Tx += V, Ty += U
The second coordinate of an intersection is found from the relevant value of T.
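The merge can be sketched in C++ as follows. This version works directly with the parameter t (scaled so that t = 0 is the start point and t = 1 the end point) rather than with U V t, and it assumes a unit grid and a line going up and to the right (U > 0 and V > 0); other directions would need the signs handled. The function name gridCrossings is made up for the example.

```cpp
#include <cmath>
#include <utility>
#include <vector>

// Returns the crossings of the segment (X,Y) -> (XX,YY) with the unit grid
// lines x = i and y = j, ordered by distance from the start point.
// Assumption: U = XX - X > 0 and V = YY - Y > 0.
std::vector<std::pair<double, double>> gridCrossings(double X, double Y,
                                                     double XX, double YY) {
    const double U = XX - X, V = YY - Y;
    std::vector<std::pair<double, double>> out;
    double i = std::ceil(X), j = std::ceil(Y);  // next vertical / horizontal line
    double tx = (i - X) / U;                    // parameter of next x = i crossing
    double ty = (j - Y) / V;                    // parameter of next y = j crossing
    while (tx <= 1.0 || ty <= 1.0) {
        if (tx < ty) {                          // vertical line comes first
            out.push_back(std::make_pair(i, Y + tx * V));
            i += 1.0;
            tx = (i - X) / U;
        } else if (ty < tx) {                   // horizontal line comes first
            out.push_back(std::make_pair(X + ty * U, j));
            j += 1.0;
            ty = (j - Y) / V;
        } else {                                // both at once: a grid corner
            out.push_back(std::make_pair(i, j));
            i += 1.0;
            j += 1.0;
            tx = (i - X) / U;
            ty = (j - Y) / V;
        }
    }
    return out;
}
```

For the parallel (CUDA) variant asked about in the question, note that each crossing depends only on its own index i or j, so the two sequences can be computed independently and the merge is the only sequential part.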

how to transform dfs

I have Depth first searching algorithm whose pseudo code is given below:
DFS(Vertex v)
    mark v visited
    make an empty Stack S
    push all vertices adjacent to v onto S
    while S is not empty do
        Vertex w is popped off S
        for all Vertex u adjacent to w do
            if u is not visited then
                mark u visited
                push u onto S
Now, I wish to convert the above dfs algorithm to breadth first search. I am implementing the program in C++. I am clueless how to go about the same.
EDIT: I know the pseudo code of bfs. What i am searching for is how to convert the above pseudo code of dfs to bfs.
BFS(Vertex v)
    mark v visited
    make an empty Queue Q
    Enqueue all vertices adjacent to v onto Q
    while Q is not empty do
        Vertex w is dequeued from Q
        for all Vertex u adjacent to w do
            if u is not visited then
                mark u visited
                enqueue u into Q
I hope this helps
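In C++ the only structural change is the container: std::stack becomes std::queue, and top() becomes front(). The sketch below assumes the graph is an adjacency list indexed by vertex number. It differs slightly from the pseudocode above in that it enqueues the start vertex itself rather than its neighbours, which avoids enqueuing the neighbours unmarked; the function name bfsOrder is invented for the example.

```cpp
#include <queue>
#include <vector>

// Returns the vertices reachable from start in BFS visiting order.
// adj[w] lists the neighbours of vertex w (assumed representation).
std::vector<int> bfsOrder(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> order;
    std::queue<int> q;                  // a std::stack<int> here would give DFS-like order
    visited[start] = true;
    q.push(start);
    while (!q.empty()) {
        const int w = q.front();        // with a stack this would be top()
        q.pop();
        order.push_back(w);
        for (int u : adj[w]) {
            if (!visited[u]) {
                visited[u] = true;      // mark when enqueued, so u is queued once
                q.push(u);
            }
        }
    }
    return order;
}
```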

How to project a point onto a plane in 3D?

I have a 3D point (point_x,point_y,point_z) and I want to project it onto a 2D plane in 3D space. The plane is defined by a point (orig_x,orig_y,orig_z) and a unit normal vector (normal_dx,normal_dy,normal_dz).
How should I handle this?
Make a vector from your orig point to the point of interest:
v = point-orig (in each dimension);
Take the dot product of that vector with the unit normal vector n:
dist = vx*nx + vy*ny + vz*nz; dist = scalar distance from point to plane along the normal
Multiply the unit normal vector by the distance, and subtract that vector from your point.
projected_point = point - dist*normal;
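The three steps above could be written in C++ roughly like this. It is a sketch that assumes the normal already has unit length; the Vec3 alias and the function names are made up for the example.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Orthogonal projection of point onto the plane through orig with the
// given normal. Assumption: normal has unit length.
Vec3 projectOntoPlane(const Vec3& point, const Vec3& orig, const Vec3& normal) {
    // v = point - orig
    const Vec3 v = {point[0] - orig[0], point[1] - orig[1], point[2] - orig[2]};
    // signed distance from the point to the plane along the normal
    const double dist = dot(v, normal);
    // projected_point = point - dist*normal
    const Vec3 projected = {point[0] - dist * normal[0],
                            point[1] - dist * normal[1],
                            point[2] - dist * normal[2]};
    return projected;
}
```

If the normal is not normalised, dividing dist by dot(normal, normal) before the last step gives the same result.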
Edit with picture:
I've modified your picture a bit. Red is v. dist is the length of blue and green, equal to v dot normal. Blue is normal*dist. Green is the same vector as blue, they're just plotted in different places. To find planar_xyz, start from point and subtract the green vector.
This is really easy, all you have to do is find the perpendicular (abbr here |_) distance from the point P to the plane, then translate P back by the perpendicular distance in the direction of the plane normal. The result is the translated P sits in the plane.
Taking an easy example (that we can verify by inspection) :
Set n=(0,1,0), and P=(10,20,-5).
The projected point should be (10,10,-5). You can see by inspection that P is 10 units from the plane along the normal, and that its projection, lying in the plane, must have y=10.
So how do we find this analytically?
The plane equation is Ax+By+Cz+d=0. What this equation means is "in order for a point (x,y,z) to be in the plane, it must satisfy Ax+By+Cz+d=0".
What is the Ax+By+Cz+d=0 equation for the plane drawn above?
The plane has normal n=(0,1,0). The d is found simply by using a test point already in the plane:
(0)x + (1)y + (0)z + d = 0
The point (0,10,0) is in the plane. Plugging in above, we find, d=-10. The plane equation is then 0x + 1y + 0z - 10 = 0 (if you simplify, you get y=10).
A nice interpretation of d is it speaks of the perpendicular distance you would need to translate the plane along its normal to have the plane pass through the origin.
Anyway, once we have d, we can find the |_ distance of any point P to the plane by the following equation (assuming the normal n = (A,B,C) has unit length):
dist = n ⋅ P + d = A*Px + B*Py + C*Pz + d
There are 3 possible classes of results for |_ distance to plane:
0: ON PLANE EXACTLY (almost never happens with floating point inaccuracy issues)
+1: >0: IN FRONT of plane (on normal side)
-1: <0: BEHIND plane (ON OPPOSITE SIDE OF NORMAL)
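Anyway, the projected point is P moved back by that distance along the normal: Pproj = P - (n ⋅ P + d) n. For the example, n ⋅ P + d = 20 - 10 = 10, so Pproj = (10,20,-5) - 10 (0,1,0) = (10,10,-5),
which you can verify as correct by inspection in the diagram above.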
This answer is an addition to two existing answers.
I aim to show how the explanations by @tmpearce and @bobobobo boil down to the same thing, while at the same time providing quick answers to those who are merely interested in copying the equation best suited for their situation.
Method for planes defined by normal n and point o
This method was explained in the answer by @tmpearce.
Given a point-normal definition of a plane with normal n and point o on the plane, a point p', being the point on the plane closest to the given point p, can be found by:
p' = p - (n ⋅ (p - o)) × n    (1)
Method for planes defined by normal n and scalar d
This method was explained in the answer by @bobobobo.
Given a plane defined by normal n and scalar d, a point p', being the point on the plane closest to the given point p, can be found by:
p' = p - (n ⋅ p + d) × n    (2)
If instead you've got a point-normal definition of a plane (the plane is defined by normal n and point o on the plane) @bobobobo suggests to find d:
d = -n ⋅ o    (3)
and insert this into equation 2. This yields:
p' = p - (n ⋅ p - n ⋅ o) × n    (4)
A note about the difference
Take a closer look at equations 1 and 4. By comparing them you'll see that equation 1 uses n ⋅ (p - o) where equation 4 uses n ⋅ p - n ⋅ o. That's actually two ways of writing down the same thing:
n ⋅ (p - o) = n ⋅ p - n ⋅ o = n ⋅ p + d
One may thus choose to interpret the scalar d as if it were a 'pre-calculation'. I'll explain: if a plane's n and o are known, but o is only used to calculate n ⋅ (p - o),
we may as well define the plane by n and d and calculate n ⋅ p + d instead, because we've just seen that that's the same thing.
Additionally for programming using d has two advantages:
Finding p' now is a simpler calculation, especially for computers. Compare:
using n and o: 3 subtractions + 3 multiplications + 2 additions
using n and d: 0 subtractions + 3 multiplications + 3 additions.
Using d limits the definition of a plane to only 4 real numbers (3 for n + 1 for d), instead of 6 (3 for n + 3 for o). This saves ⅓ memory.
It's not sufficient to provide only the plane origin and the normal vector. This does define the 3d plane, however this does not define the coordinate system on the plane.
Think that you may rotate your plane around the normal vector with regard to its origin (i.e. put the normal vector at the origin and "rotate").
You may however find the distance of the projected point to the origin (which is obviously invariant to rotation).
Subtract the origin from the 3d point. Then do a cross product with the normal direction. If your normal vector is normalized - the resulting vector's length equals to the needed value.
EDIT
A complete answer would need an extra parameter. Say, you supply also the vector that denotes the x-axis on your plane.
So we have vectors n and x. Assume they're normalized.
The origin is denoted by O, your 3D point is p.
Then your point is projected by the following:
x = (p - O) dot x
y = (p - O) dot (n cross x)
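A short C++ sketch of these two formulas follows. It assumes n and the in-plane x-axis are both unit length and perpendicular to each other; the Vec3 alias and the function names are invented for the example.

```cpp
#include <array>
#include <utility>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

Vec3 cross(const Vec3& a, const Vec3& b) {
    const Vec3 c = {a[1] * b[2] - a[2] * b[1],
                    a[2] * b[0] - a[0] * b[2],
                    a[0] * b[1] - a[1] * b[0]};
    return c;
}

// 2D coordinates of the projection of p in the plane's own coordinate system.
// Assumption: n and xAxis are unit vectors and perpendicular to each other.
std::pair<double, double> planeCoordinates(const Vec3& p, const Vec3& O,
                                           const Vec3& n, const Vec3& xAxis) {
    const Vec3 v = {p[0] - O[0], p[1] - O[1], p[2] - O[2]};   // p - O
    const Vec3 yAxis = cross(n, xAxis);                       // n cross x
    return std::make_pair(dot(v, xAxis), dot(v, yAxis));
}
```

The component of p - O along n is simply dropped, which is what makes this a projection.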
Let V = (orig_x,orig_y,orig_z) - (point_x,point_y,point_z)
N = (normal_dx,normal_dy,normal_dz)
Let d = V.dotproduct(N);
Projected point P = (point_x,point_y,point_z) + d.N
I think you should slightly change the way you describe the plane. Indeed, the best way to describe the plane is via a vector n and a scalar c
(x, n) = c
The (absolute value of the) constant c is the distance of the plane from the origin, and is equal to (P, n), where P is any point on the plane.
So, let P be your orig point and A' be the projection of a new point A onto the plane. What you need to do is find a such that A' = A - a*n satisfies the equation of the plane, that is
(A - a*n, n) = (P, n)
Solving for a, you find that
a = (A, n) - (P, n) = (A, n) - c
which gives
A' = A - [(A, n) - c]n
Using your names, this reads
c = orig_x*normal_dx + orig_y*normal_dy+orig_z*normal_dz;
a = point_x*normal_dx + point_y*normal_dy + point_z*normal_dz - c;
planar_x = point_x - a*normal_dx;
planar_y = point_y - a*normal_dy;
planar_z = point_z - a*normal_dz;
Note: your code would save one scalar product if instead of the orig point P you store c=(P, n), which means basically 25% less flops for each projection (in case this routine is used many times in your code).
Let r be the point to project and p be the result of the projection. Let c be any point on the plane and let n be a normal to the plane (not necessarily normalised). Write p = r + m d, where d is the direction of projection and m is a scalar that will be seen to be indeterminate if there is no solution.
Since every point on the plane satisfies (p - c).n = 0, one has (r - c).n + m(d.n) = 0 and so m = [(c - r).n]/[d.n], where (.) denotes the dot product. If d.n = 0, that is, if d and n are perpendicular to one another (the projection direction is parallel to the plane), there is no solution.
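In C++ this projection along an arbitrary direction might be sketched as below. The Vec3 alias and the function names are made up for the example, and the exact comparison with zero is a simplification; real code would probably use a small tolerance.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Projects r onto the plane through c with normal n (any length), moving
// along direction d. Returns false when d.n == 0, i.e. no solution exists.
bool projectAlong(const Vec3& r, const Vec3& c, const Vec3& n, const Vec3& d,
                  Vec3& p) {
    const double denom = dot(d, n);
    if (denom == 0.0) return false;             // d is parallel to the plane
    const Vec3 cr = {c[0] - r[0], c[1] - r[1], c[2] - r[2]};
    const double m = dot(cr, n) / denom;        // m = [(c - r).n]/[d.n]
    p[0] = r[0] + m * d[0];                     // p = r + m d
    p[1] = r[1] + m * d[1];
    p[2] = r[2] + m * d[2];
    return true;
}
```

Choosing d = n recovers the ordinary perpendicular projection.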