Shortest path graph algorithm help Boost

Shortest path graph algorithm help Boost - c++

I have a rectangular grid shaped DAG where the horizontal edges always point right and the vertical edges always point down. The edges have positive costs associated with them. Because of the rectangular format, the nodes are being referred to using zero-based row/column. Here's an example graph:
Now, I want to perform a search. The starting vertex will always be in the left column (column with index 0) and in the upper half of the graph. This means I'll pick the start to be either (0,0), (1,0), (2,0), (3,0) or (4,0). The goal vertex is always in the right column (column with index 6) and "corresponds" to the start vertex:
start vertex (0,0) corresponds to goal vertex (5,6)
start vertex (1,0) corresponds to goal vertex (6,6)
start vertex (2,0) corresponds to goal vertex (7,6)
start vertex (3,0) corresponds to goal vertex (8,6)
start vertex (4,0) corresponds to goal vertex (9,6)
I only mention this to demonstrate that the goal vertex will always be reachable. It's possibly not very important to my actual question.
What I want to know is what search algorithm should I use to find the path from start to goal? I am using C++ and have access to the Boost Graph Library.
For those interested, I'm trying to implement Fuchs' suggestions from his Optimal Surface Reconstruction from Planar Contours paper.
I looked at A* but to be honest didn't understand it and wasn't how the heuristic works or even whether I could come up with one!
Because of the rectangular shape and regular edge directions, I figured there might be a well-suited algorithm. I considered Dijkstra
but the paper I mention said there were quicker algorithms (but annoyingly for me doesn't provide an implementation), plus that's
single-source and I think I want single-pair.

So, this is your problem,
DAG no cycles
weights > 0
Left weight < Right weight
You can use a simple exhaustive search defining every possible route. So you have a O(NxN) algoririthm. And then you will choose the shortest path. It is not a very smart solution, but it is effective.
But I suppose you want to be smarter than that, let's consider that if a particular node can be reached from two nodes, you can find the minimum of the weights at the two nodes plus the cost for arriving to the current node. You can consider this as an extension of the previous exhaustive search.
Remember that a DAG can be drawn in a line. For DAG linearization here's an interesting resource.
Now you have just defined a recursive algorithm.
MinimumPath(A,B) = MinimumPath(MinimumPath(A,C)+MinimumPath(A,D)+,MinimumPath(...)+MinimumPath(...))
Of course the starting point of recursion is
MinimumPath(Source,Source)
which is 0 of course.
As far as I know, there isn't an out of the box algorithm from boost to do this. But this is quite straightforward to implement it.
A good implementation is here.
If, for some reason, you do not have a DAG, Dijkstra's or Bellman-Ford should be used.

if I'm not mistaken, from the explanation this is really an optimal path problem not a search problem since the goal is known. In optimization I don't think you can avoid doing an exhaustive search if you want the optimal path and not an approximate path.
From the paper it seems like you are really subdividing a space many times then running the algorithm. This would reduce your N to closer to a constant in the context of the entire surface making O(N^2) not so bad.
That being said perhaps dynamic programming would be a good strategy where the N would be bounded by the difference between your start and goal node. Here is an example form genomic alignment. Just an illustration to give you an idea of how it works.
Construct a N by N array of cost values all set to 0 or some default.
#
For i in size N:
For j in size N:
#cost of getting here from i-1 or j-1
cost[i,j] = min(cost[i-1,j] + i edge cost , cost[i,j-1] + j edge cost)
Once you have your table filled in, start at bottom right corner. Starting from your goal, Choose to go to the node with the lowest cost of getting there. Work your way backwards choosing the lowest value until you reach the start node (or rather the array entry corresponding to the start node). This should give you the optimal path very simply.
Dynamic programming works by solving the optimization on smaller sub-problems. In this case the sub-problems are optimal paths to preceding nodes. I think the rectangular nature of your graph makes this a good fit.

Related

MST from an array of 2D triangles, but with a little twist

Here's an illustration of the steps taken thus far:
Pseudo-random rectangle generation
"Central node" insertion, rect separation, and final node selections
Delaunay triangulation (shown with previously selected nodes)
Rendering of triangle edges
At this point (Step 5), I would like to use this data to form a Minimum Spanning Tree, but there's a slight catch...
Somewhere in the graph (likely near the center, but not always) will be a node that requires between 3-5 connections to it from other unique nodes. This complicates things, since every other node should only contain a single connection, and the data structures being used make it difficult to determine "what's connected to what" in a solid, traversable format.
So, given an array of triangles in the above format, and a random vertex to use as the "root node", how could I properly traverse the network to create an MST where there are at least 3 connections to our "central node", but no more than 5 connections to it? Is this possible?

Since it's rare to have a vertex in a Delaunay triangulation have much more than 6 edges, you can use brute force: there are only 20+15+6 ways to select 3, 4, or 5 edges out of 6 (respectively), so try all of them. For each of the 41 (up to 336 for degree 9) small trees (the root and a few edges) thus created, run either Kruskal's algorithm or Prim's algorithm starting with that tree already "found" to be part of the MST. (Ignore the root's other edges so as not to increase its degree further.) Then just pick the best one (including the cost of the seed tree).
As for the general neighbor information problem, it seems you just need to build a standard graph representation first. For example, you can make an adjacency list by scanning over all your Edge objects and storing each endpoint in a list associated with the other (in a map<Vector2<T>,vector<Vector2<T>>> or an equivalent based on whatever identifiers for your vertices).

I've taken a workaround approach...
After step 3 of my algorithm, I simply remove all edges which connect to the "central node", keeping track of which edges form the "ring" (aka "edge loop") around it, and run the MST on all remaining edges.
For the MST, I went with the boost graph library.
That made it easy to loop through the triangles I had, adding each of its three edges to an adjacency_list. Then a simple call to whichever boost-provided MST algorithm took care of the rest.
Finally, I readd the edges that were previously taken out. The shortest path is whatever it was in the previous step, plus the length of whichever readded edge that connects to another edge on the "ring" is shortest.
I can then add (or remove) an arbitrary number of previous edges to ensure there are between 3 to 5 edges connecting from the edge loop to the "central node".
Doing things in this order also allows me to know as soon as step 3 if we'll even have a valid number of edges, so we don't waste cycles running a MST.

Maze least turns

I have a problem that i can't solve.I have a maze and I have to find a path from a point S to a point E,which has the least turns.It is known that the point E is reacheable.I can move only in 4 directions,left,right,up and down.It doesn't have to be the shortest path,just to have least turns.
I tried to store the number of turns in a priority queue.For example when I reach a certain place in the maze I will add the numbers of turns till there.From there I would add his neighbours to the priority queue,if they weren't visited already or they weren't walls,with the value of the current block i was sitting,for example t + x which can have the following values ( 0-if the neighbour is facing in the same direction I was facing when i got near him,or 1 if it is in a different direction).It seems that this approach doesn't work for every case.
I will appreciate if somebody could offer me some hints, without any code.

You are on the right track. What you need to implement for this problem is Dijkstra's algorithm. You just need to consider not just points as graph vertices, but pair of (point,direction). From every vertex (p,d) you have 5 edges (although last one can be blocked by wall): (p,0), (p,1), (p,2), (p,3), (neighbour of p in direction d, d). First four edges are of weight 1 (as you turn here), and the last one is of weight 0 (no turn, just move forward). Algorithm is good enough to ignore loops and works fine for edges of weight 0. You should end when any vertex (end point, _) is extracted from priority queue.
This method has one issue, as too many verticies are inspected in the process. If your maze is small, that's not the problem. Otherwise, consider a slight modification known as A*. You need a good heuristic function, describing lower bound on number of turns to the goal.

'Stable' multi-dimensional scaling algorithm

I have a wireless mesh network of nodes, each of which is capable of reporting its 'distance' to its neighbors, measured in (simplified) signal strength to them. The nodes are geographically in 3d space but because of radio interference, the distance between nodes need not be trigonometrically (trigonomically?) consistent. I.e., given nodes A, B and C, the distance between A and B might be 10, between A and C also 10, yet between B and C 100.
What I want to do is visualize the logical network layout in terms of connectness of nodes, i.e. include the logical distance between nodes in the visual.
So far my research has shown the multidimensional scaling (MDS) is designed for exactly this sort of thing. Given that my data can be directly expressed as a 2d distance matrix, it's even a simpler form of the more general MDS.
Now, there seem to be many MDS algorithms, see e.g. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html and http://tapkee.lisitsyn.me/ . I need to do this in C++ and I'm hoping I can use a ready-made component, i.e. not have to re-implement an algo from a paper. So, I thought this: https://sites.google.com/site/simpmatrix/ would be the ticket. And it works, but:
The layout is not stable, i.e. every time the algorithm is re-run, the position of the nodes changes (see differences between image 1 and 2 below - this is from having been run twice, without any further changes). This is due to the initialization matrix (which contains the initial location of each node, which the algorithm then iteratively corrects) that is passed to this algorithm - I pass an empty one and then the implementation derives a random one. In general, the layout does approach the layout I expected from the given input data. Furthermore, between different runs, the direction of nodes (clockwise or counterclockwise) can change. See image 3 below.
The 'solution' I thought was obvious, was to pass a stable default initialization matrix. But when I put all nodes initially in the same place, they're not moved at all; when I put them on one axis (node 0 at 0,0 ; node 1 at 1,0 ; node 2 at 2,0 etc.), they are moved along that axis only. (see image 4 below). The relative distances between them are OK, though.
So it seems like this algorithm only changes distance between nodes, but doesn't change their location.
Thanks for reading this far - my questions are (I'd be happy to get just one or a few of them answered as each of them might give me a clue as to what direction to continue in):
Where can I find more information on the properties of each of the many MDS algorithms?
Is there an algorithm that derives the complete location of each node in a network, without having to pass an initial position for each node?
Is there a solid way to estimate the location of each point so that the algorithm can then correctly scale the distance between them? I have no geographic location of each of these nodes, that is the whole point of this exercise.
Are there any algorithms to keep the 'angle' at which the network is derived constant between runs?
If all else fails, my next option is going to be to use the algorithm I mentioned above, increase the number of iterations to keep the variability between runs at around a few pixels (I'd have to experiment with how many iterations that would take), then 'rotate' each node around node 0 to, for example, align nodes 0 and 1 on a horizontal line from left to right; that way, I would 'correct' the location of the points after their relative distances have been determined by the MDS algorithm. I would have to correct for the order of connected nodes (clockwise or counterclockwise) around each node as well. This might become hairy quite quickly.
Obviously I'd prefer a stable algorithmic solution - increasing iterations to smooth out the randomness is not very reliable.
Thanks.
EDIT: I was referred to cs.stackexchange.com and some comments have been made there; for algorithmic suggestions, please see https://cs.stackexchange.com/questions/18439/stable-multi-dimensional-scaling-algorithm .
Image 1 - with random initialization matrix:
Image 2 - after running with same input data, rotated when compared to 1:
Image 3 - same as previous 2, but nodes 1-3 are in another direction:
Image 4 - with the initial layout of the nodes on one line, their position on the y axis isn't changed:

Most scaling algorithms effectively set "springs" between nodes, where the resting length of the spring is the desired length of the edge. They then attempt to minimize the energy of the system of springs. When you initialize all the nodes on top of each other though, the amount of energy released when any one node is moved is the same in every direction. So the gradient of energy with respect to each node's position is zero, so the algorithm leaves the node where it is. Similarly if you start them all in a straight line, the gradient is always along that line, so the nodes are only ever moved along it.
(That's a flawed explanation in many respects, but it works for an intuition)
Try initializing the nodes to lie on the unit circle, on a grid or in any other fashion such that they aren't all co-linear. Assuming the library algorithm's update scheme is deterministic, that should give you reproducible visualizations and avoid degeneracy conditions.
If the library is non-deterministic, either find another library which is deterministic, or open up the source code and replace the randomness generator with a PRNG initialized with a fixed seed. I'd recommend the former option though, as other, more advanced libraries should allow you to set edges you want to "ignore" too.

I have read the codes of the "SimpleMatrix" MDS library and found that it use a random permutation matrix to decide the order of points. After fix the permutation order (just use srand(12345) instead of srand(time(0))), the result of the same data is unchanged.

Obviously there's no exact solution in general to this problem; with just 4 nodes ABCD and distances AB=BC=AC=AD=BD=1 CD=10 you cannot clearly draw a suitable 2D diagram (and not even a 3D one).
What those algorithms do is just placing springs between the nodes and then simulate a repulsion/attraction (depending on if the spring is shorter or longer than prescribed distance) probably also adding spatial friction to avoid resonance and explosion.
To keep a "stable" diagram just build a solution and then only update the distances, re-using the current position from previous solution as starting point. Picking two fixed nodes and aligning them seems a good idea to prevent a slow drift but I'd say that spring forces never end up creating a rotational momentum and thus I'd expect that just scaling and centering the solution should be enough anyway.

How to solve binary labeling with graph cut?

I have 32 segments of the overlapped regions of two images. I have to assign each of the segment to either one of the images based on lowest cost. So, it is a binary labeling problem, and above are the energy minimization function.
L is the vector of length of 32(equal to the number of segments) and value of each element depends on its index corresponding to the segment number. Say, if 3rd segment is assigned image 1, then L(2)=0, and 14th segments is assigned to image 2, so L(13)=1. That is L[x]'s value is either 0 or 1. Thus, there are 2^32 possible assignment of L. So, I can compute E(L) for each combination, after performing 2^32 calculation, I can get the minimum E(L), and use that combination. This is what my intuition suggests. But this is impractical, because the complexity is exponential.
But, many literatures suggest this binary labeling problem can be solved as a graph cut problem with max flow/min cut algorithm. But, how do I formulate this problem as max flow/min cut problem? The 32 segments are the nodes of the graph, but what would be the weight of the edges? And what would be the capacity?

The formulation as a graph theory problem and proof of the "if and only if" relationship can be found in "What Energy Functions Can Be Minimized
via Graph Cuts?" by Vladimir Kolmogorov and Ramin Zabih.
The key idea is to construct a directed edge between i and j of weight Vij(0,1)+Vij(1,0)-Vij(0,0)-Vij(1,1).
If Vij(1,0)-Vij(0,0)>0 you also need to construct a directed edge between the source and i of weight Vij(1,0)-Vij(0,0).
Otherwise you need to construct a directed edge between i and the destination of weight Vij(0,0)-Vij(1,0).
Similarly, if Vij(0,1)-Vij(0,0)>0 you also need to construct a directed edge between the source and j of weight Vij(0,1)-Vij(0,0).
Otherwise you need to construct a directed edge between j and the destination of weight Vij(0,0)-Vij(0,1).
Note that the min-cut of this graph will be offset by V(0,0)-sum of weights on edges connecting to the destination.

3D Math - Only keeping positions within a certain amount of yards

I'm trying to determine from a large set of positions how to narrow my list down significantly.
Right now I have around 3000 positions (x, y, z) and I want to basically keep the positions that are furthest apart from each other (I don't need to keep 100 positions that are all within a 2 yard radius from each other).
Besides doing a brute force method and literally doing 3000^2 comparisons, does anyone have any ideas how I can narrow this list down further?
I'm a bit confused on how I should approach this from a math perspective.

Well, I can't remember the name for this algorithm, but I'll tell you a fun technique for handling this. I'll assume that there is a semi-random scattering of points in a 3D environment.
Simple Version: Divide and Conquer
Divide your space into a 3D grid of cubes. Each cube will be X yards on each side.
Declare a multi-dimensional array [x,y,z] such that you have an element for each cube in your grid.
Every element of the array should either be a vertex or reference to a vertex (x,y,z) structure, and each should default to NULL
Iterate through each vertex in your dataset, determine which cube the vertex falls in.
How? Well, you might assume that the (5.5, 8.2, 9.1) vertex belongs in MyCubes[5,8,9], assuming X (cube-side-length) is of size 1. Note: I just truncated the decimals/floats to determine which cube.
Check to see if that relevant cube is already taken by a vertex. Check: If MyCubes[5,8,9] == NULL then (inject my vertex) else (do nothing, toss it out! spot taken, buddy)
Let's save some memory
This will give you a nicely simplified dataset in one pass, but at the cost of a potentially large amount of memory.
So, how do you do it without using too much memory?
I'd use a hashtable such that my key is the Grid-Cube coordinate (5,8,9) in my sample above.
If MyHashTable.contains({5,8,9}) then DoNothing else InsertCurrentVertex(...)
Now, you will have a one-pass solution with minimal memory usage (no gigantic array with a potentially large number of empty cubes. What is the cost? Well, the programming time to setup your structure/class so that you can perform the .contains action in a HashTable (or your language-equivalent)
Hey, my results are chunky!
That's right, because we took the first result that fit in any cube. On average, we will have achieved X-separation between vertices, but as you can figure out by now, some vertices will still be close to one another (at the edges of the cubes).
So, how do we handle it? Well, let's go back to the array method at the top (memory-intensive!).
Instead of ONLY checking to see if a vertex is already in the cube-in-question, also perform this other check:
If Not ThisCubeIsTaken()
For each SurroundingCube
If not Is_Your_Vertex_Sufficiently_Far_Away_From_Me()
exit_loop_and_outer_if_statement()
end if
Next
//Ok, we got here, we can add the vertex to the current cube because the cube is not only available, but the neighbors are far enough away from me
End If
I think you can probably see the beauty of this, as it is really easy to get neighboring cubes if you have a 3D array.
If you do some smoothing like this, you can probably enforce a 'don't add if it's with 0.25X' policy or something. You won't have to be too strict to achieve a noticeable smoothing effect.
Still too chunky, I want it smooth
In this variation, we will change the qualifying action for whether a vertex is permitted to take residence in a cube.
If TheCube is empty OR if ThisVertex is closer to the center of TheCube than the Cube's current vertex
InsertVertex (overwrite any existing vertex in the cube
End If
Note, we don't have to perform neighbor detection for this one. We just optimize towards the center of each cube.
If you like, you can merge this variation with the previous variation.
Cheat Mode
For some people in this situation, you can simply take a 10% random selection of your dataset and that will be a good-enough simplification. However, it will be very chunky with some points very close together. On the bright side, it takes a few minutes max. I don't recommend it unless you are prototyping.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js