Recursive search of all viable paths in graph by edge length - c++

I have a school assignment to create a program which takes a graph and finds a minimal spanning tree, with the condition that the path between two points (which are preselected at program start) must be the shortest by NUMBER OF EDGES between them.
The task itself is OK, but where I struggle is optimization. When I find my path between A and B (the preselected points), I try to recursively find all other possible options by DFS, then build the MSTs and choose the smallest. Since the path must have the lowest number of edges, and I found one such path with an initial BFS, I know that I can cut off my DFS search after X recursions, where X is the number of edges between A and B found in the BFS. It works very fast on certain types of graphs (where the number of edges is up to 3 times the number of vertices), but when there are, for example, 10 times more edges than vertices, it just runs without stopping.
I asked my friend for a hint and he told me he uses BFS for recursively finding the other paths, and it works fine for him, but what is the performance difference? DFS will first try to run down and stops when it reaches the target point or runs out of available hops; BFS goes wide first and ends all paths at the same depth step, but I still do the same amount of hopping, right?
What am I missing here? Or did I understand him wrong? Thanks for any ideas.
EDIT: I tried tracking which edges I had already visited in a particular DFS run, to avoid going in the opposite direction, back to the point where I was and so on, but it only added delay on a certain group of graphs while not helping noticeably with others.
EDIT2: swapped edge and vertex quantities (there can't be more vertices than edges)

So I finished my assignment, and when looking for other paths/all paths between two points A/B, DFS and BFS get the same result as far as I found out.
The key for me was the right optimization of the solution. My final solution combines going wide with going deep and works like this (maybe it can be done faster, but I doubt it):
We know the length (in number of edges) of the path between points A/B (we ran BFS once to find a path).
We run something like BFS from end point B, and in every vertex we mark how many hops we made from the end, then call this again on all outgoing edges (we stop when hops == lengthOfPath(A,B)).
Then we do DFS from source point A and combine all vertices which hold the proper hop value into the path by recursion.
In other words, we mark all points by their reachability from B, and then from A we go over those that are accessible (with a set value other than the default and with the proper number of hops from A) and combine all possible paths together.
There are our paths. (I left this vague intentionally because task is still active). If anyone would need better explanation in future, post comment and I will elaborate. If someone finds better solution, post it as your answer and if right I will tick it as right answer.
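For future readers, here is a minimal sketch of the idea described above, not the poster's actual code. It assumes an undirected graph stored as adjacency lists, and the names `Graph`, `hopsTo`, `extend` and `allShortestPaths` are made up for illustration. One difference from the description: the BFS here labels every reachable vertex instead of stopping at hops == lengthOfPath(A,B), which gives the same paths but does a bit more work.

```cpp
#include <cassert>
#include <queue>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // adjacency lists (hypothetical layout)

// BFS from `target`: dist[v] = number of edges from v to target, -1 if unreachable.
std::vector<int> hopsTo(const Graph& g, int target) {
    std::vector<int> dist(g.size(), -1);
    std::queue<int> q;
    dist[target] = 0;
    q.push(target);
    while (!q.empty()) {
        int v = q.front();
        q.pop();
        for (int w : g[v]) {
            if (dist[w] == -1) {
                dist[w] = dist[v] + 1;
                q.push(w);
            }
        }
    }
    return dist;
}

// DFS from the last vertex of `path`, stepping only to neighbours that are
// exactly one hop closer to the target, so every finished path is a shortest one.
void extend(const Graph& g, const std::vector<int>& dist,
            std::vector<int>& path, std::vector<std::vector<int>>& out) {
    int v = path.back();
    if (dist[v] == 0) {  // reached the target
        out.push_back(path);
        return;
    }
    for (int w : g[v]) {
        if (dist[w] == dist[v] - 1) {
            path.push_back(w);
            extend(g, dist, path, out);
            path.pop_back();
        }
    }
}

// All paths from a to b that use the minimum number of edges.
std::vector<std::vector<int>> allShortestPaths(const Graph& g, int a, int b) {
    const int n = static_cast<int>(g.size());
    assert(a >= 0 && a < n && b >= 0 && b < n);
    std::vector<std::vector<int>> out;
    std::vector<int> dist = hopsTo(g, b);
    if (dist[a] == -1) return out;  // b is not reachable from a
    std::vector<int> path{a};
    extend(g, dist, path, out);
    return out;
}
```

Because the DFS never steps to a vertex that is not one hop closer to B, it never explores a dead end, which is why this stays fast on dense graphs where a plain depth-limited DFS does not.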

Related

How to find a path that visits as many vertices as possible?

Given a square grid (an undirected graph), is there any way to find a path which visits as many vertices as possible?
Each vertex can be visited only once. This means the path will be a Hamiltonian path if one exists, or otherwise a longest path.
The graph has some walls. A wall is a vertex that has no edges connecting it to its neighbors.
I have a solution (in mind), but it's very similar to finding all paths and choosing the first one that has the most vertices visited:
Find a path that visits neighbors from the given start vertex until the end (no way to go further).
Look back along the current path to the starting vertex; if there is any vertex that has neighbors outside of the current path, process it like step 1, from the found vertex and its new neighbors.
Analyze and choose the longest path (the one with the most vertices).
I found a similar problem, but cannot understand what @Juho means:
Choose a successor s_i to top(S), and try to find a path s_{i-1} ⇝ s_i avoiding vertices in F. If a path is found, insert the vertices on the path s_{i-1} ⇝ s_i to F.
I don't have enough reputation to add a comment there.
My solution has performance trouble, I guess. Any suggestions?
This is more of a Hamiltonian Path problem. It's NP-complete so you'll need to do an exhaustive search. Get your brute-force on. I can only suggest that it is possible to use threading to alleviate your performance problem; divide up your starting vertices evenly amongst your available threads. Terminate in the event that you find a Hamiltonian Path, else the longest path wins.
The algorithm itself is just going to have to find all possible paths. There is possibly a heuristic to stop bothering with a path that seems to be getting bad results, but that would mean the solution may not always be correct.
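A rough sketch of what the brute force could look like for a single start cell, using plain backtracking. The grid encoding ('.' for free, '#' for wall) and the function names are assumptions for illustration, and the threading part is left out. Running time is exponential, so this is only practical for small grids.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Number of cells on the longest simple path that starts at (r, c).
// `used` marks the cells already on the current path.
int longestFrom(const std::vector<std::string>& grid,
                std::vector<std::vector<bool>>& used, int r, int c) {
    static const int dr[] = {1, -1, 0, 0};
    static const int dc[] = {0, 0, 1, -1};
    used[r][c] = true;
    int best = 1;  // the path consisting of this cell alone
    for (int k = 0; k < 4; ++k) {
        int nr = r + dr[k], nc = c + dc[k];
        if (nr < 0 || nc < 0 || nr >= static_cast<int>(grid.size()) ||
            nc >= static_cast<int>(grid[nr].size()))
            continue;  // outside the grid
        if (grid[nr][nc] == '#' || used[nr][nc]) continue;  // wall or already visited
        best = std::max(best, 1 + longestFrom(grid, used, nr, nc));
    }
    used[r][c] = false;  // backtrack so other branches may use this cell
    return best;
}

// Convenience wrapper: allocates the visited table and starts the search.
int longestPathLength(const std::vector<std::string>& grid, int r, int c) {
    assert(grid[r][c] != '#');
    std::vector<std::vector<bool>> used;
    for (const std::string& row : grid) used.emplace_back(row.size(), false);
    return longestFrom(grid, used, r, c);
}
```

If the result equals the number of free cells, the path found is Hamiltonian and the search over other start cells can stop early.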

Maze least turns

I have a problem that I can't solve. I have a maze and I have to find a path from a point S to a point E which has the fewest turns. It is known that the point E is reachable. I can move only in 4 directions: left, right, up and down. It doesn't have to be the shortest path, it just has to have the fewest turns.
I tried to store the number of turns in a priority queue. For example, when I reach a certain place in the maze, I add the number of turns taken to get there. From there I add its neighbours to the priority queue, if they haven't been visited already and aren't walls, with the value of the current block I am on plus an increment, i.e. t + x, where x is 0 if the neighbour lies in the same direction I was facing when I arrived, or 1 if it lies in a different direction. It seems that this approach doesn't work for every case.
I will appreciate if somebody could offer me some hints, without any code.
You are on the right track. What you need to implement for this problem is Dijkstra's algorithm. You just need to consider not just points as graph vertices, but pair of (point,direction). From every vertex (p,d) you have 5 edges (although last one can be blocked by wall): (p,0), (p,1), (p,2), (p,3), (neighbour of p in direction d, d). First four edges are of weight 1 (as you turn here), and the last one is of weight 0 (no turn, just move forward). Algorithm is good enough to ignore loops and works fine for edges of weight 0. You should end when any vertex (end point, _) is extracted from priority queue.
This method has one issue, as too many vertices are inspected in the process. If your maze is small, that's not a problem. Otherwise, consider a slight modification known as A*. You need a good heuristic function, describing a lower bound on the number of turns to the goal.
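The asker wanted hints only, but for later readers here is a minimal sketch of the (point, direction) Dijkstra described above. The maze encoding ('S', 'E', '#' for walls, all rows the same length) and the name `minTurns` are assumptions for illustration. The initial facing is treated as free, so all four start states get cost 0.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <tuple>
#include <vector>

// Minimum number of turns needed to get from 'S' to 'E', or -1 if unreachable.
int minTurns(const std::vector<std::string>& maze) {
    const int R = static_cast<int>(maze.size());
    assert(R > 0);
    const int C = static_cast<int>(maze[0].size());
    const int dr[] = {-1, 0, 1, 0};
    const int dc[] = {0, 1, 0, -1};

    int sr = -1, sc = -1;
    for (int r = 0; r < R; ++r)
        for (int c = 0; c < C; ++c)
            if (maze[r][c] == 'S') { sr = r; sc = c; }
    assert(sr >= 0);  // the maze must contain a start cell

    const int INF = 1 << 30;
    std::vector<int> dist(R * C * 4, INF);  // one entry per (row, col, direction)
    auto id = [C](int r, int c, int d) { return (r * C + c) * 4 + d; };

    using State = std::tuple<int, int, int, int>;  // (turns, row, col, direction)
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;
    for (int d = 0; d < 4; ++d) {  // any initial facing costs nothing
        dist[id(sr, sc, d)] = 0;
        pq.emplace(0, sr, sc, d);
    }

    while (!pq.empty()) {
        int t, r, c, d;
        std::tie(t, r, c, d) = pq.top();
        pq.pop();
        if (t > dist[id(r, c, d)]) continue;  // stale queue entry
        if (maze[r][c] == 'E') return t;      // first extraction of the goal is optimal

        // Weight-1 edges: stay on the cell and face another direction.
        for (int nd = 0; nd < 4; ++nd) {
            if (nd != d && t + 1 < dist[id(r, c, nd)]) {
                dist[id(r, c, nd)] = t + 1;
                pq.emplace(t + 1, r, c, nd);
            }
        }
        // Weight-0 edge: move one cell forward in the current direction.
        int nr = r + dr[d], nc = c + dc[d];
        if (nr >= 0 && nc >= 0 && nr < R && nc < C && maze[nr][nc] != '#' &&
            t < dist[id(nr, nc, d)]) {
            dist[id(nr, nc, d)] = t;
            pq.emplace(t, nr, nc, d);
        }
    }
    return -1;
}
```

Since the only edge weights are 0 and 1, a deque-based 0-1 BFS would work as well; the priority queue is used here just to follow the answer literally.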

Shortest path graph algorithm help Boost

I have a rectangular grid shaped DAG where the horizontal edges always point right and the vertical edges always point down. The edges have positive costs associated with them. Because of the rectangular format, the nodes are being referred to using zero-based row/column. Here's an example graph:
Now, I want to perform a search. The starting vertex will always be in the left column (column with index 0) and in the upper half of the graph. This means I'll pick the start to be either (0,0), (1,0), (2,0), (3,0) or (4,0). The goal vertex is always in the right column (column with index 6) and "corresponds" to the start vertex:
start vertex (0,0) corresponds to goal vertex (5,6)
start vertex (1,0) corresponds to goal vertex (6,6)
start vertex (2,0) corresponds to goal vertex (7,6)
start vertex (3,0) corresponds to goal vertex (8,6)
start vertex (4,0) corresponds to goal vertex (9,6)
I only mention this to demonstrate that the goal vertex will always be reachable. It's possibly not very important to my actual question.
What I want to know is what search algorithm should I use to find the path from start to goal? I am using C++ and have access to the Boost Graph Library.
For those interested, I'm trying to implement Fuchs' suggestions from his Optimal Surface Reconstruction from Planar Contours paper.
I looked at A* but to be honest didn't understand it and wasn't sure how the heuristic works or even whether I could come up with one!
Because of the rectangular shape and regular edge directions, I figured there might be a well-suited algorithm. I considered Dijkstra but the paper I mention said there were quicker algorithms (but annoyingly for me doesn't provide an implementation), plus that's single-source and I think I want single-pair.
So, this is your problem,
DAG no cycles
weights > 0
Left weight < Right weight
You can use a simple exhaustive search defining every possible route. So you have a O(NxN) algorithm. And then you will choose the shortest path. It is not a very smart solution, but it is effective.
But I suppose you want to be smarter than that, let's consider that if a particular node can be reached from two nodes, you can find the minimum of the weights at the two nodes plus the cost for arriving to the current node. You can consider this as an extension of the previous exhaustive search.
Remember that a DAG can be drawn in a line. For DAG linearization here's an interesting resource.
Now you have just defined a recursive algorithm.
MinimumPath(A,B) = min(MinimumPath(A,C) + cost(C,B), MinimumPath(A,D) + cost(D,B)), where C and D are the nodes with an edge into B
Of course the starting point of recursion is
MinimumPath(Source,Source)
which is 0 of course.
As far as I know, there isn't an out-of-the-box algorithm in Boost for exactly this grid setup, although the Boost Graph Library does ship a generic dag_shortest_paths. Either way, it is quite straightforward to implement.
A good implementation is here.
If, for some reason, you do not have a DAG, Dijkstra's or Bellman-Ford should be used.
If I'm not mistaken, from the explanation this is really an optimal path problem, not a search problem, since the goal is known. In optimization I don't think you can avoid doing an exhaustive search if you want the optimal path and not an approximate path.
From the paper it seems like you are really subdividing a space many times then running the algorithm. This would reduce your N to closer to a constant in the context of the entire surface making O(N^2) not so bad.
That being said, perhaps dynamic programming would be a good strategy, where the N would be bounded by the difference between your start and goal node. Here is an example from genomic alignment. Just an illustration to give you an idea of how it works.
Construct a N by N array of cost values all set to 0 or some default.
For i in size N:
    For j in size N:
        # cost of getting here from i-1 or j-1
        cost[i,j] = min(cost[i-1,j] + i edge cost, cost[i,j-1] + j edge cost)
Once you have your table filled in, start at the bottom right corner. Starting from your goal, choose to go to the node with the lowest cost of getting there. Work your way backwards, choosing the lowest value until you reach the start node (or rather the array entry corresponding to the start node). This should give you the optimal path very simply.
Dynamic programming works by solving the optimization on smaller sub-problems. In this case the sub-problems are optimal paths to preceding nodes. I think the rectangular nature of your graph makes this a good fit.
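To make the table-filling concrete for the grid from the question (edges only go right or down), here is a small sketch. The storage layout (`right[r][c]` for the edge from (r,c) to (r,c+1), `down[r][c]` for the edge from (r,c) to (r+1,c)) and the function name are assumptions, not anything taken from the paper or from Boost. It returns only the cost; the path can be recovered by walking back from the goal as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Cheapest cost from (startRow, 0) to (goalRow, cols - 1) in a grid DAG.
// right has rows x (cols - 1) entries, down has (rows - 1) x cols entries.
double gridShortestPath(const Grid& right, const Grid& down,
                        int rows, int cols, int startRow, int goalRow) {
    assert(cols > 0 && 0 <= startRow && startRow <= goalRow && goalRow < rows);
    const double INF = std::numeric_limits<double>::infinity();
    Grid cost(rows, std::vector<double>(cols, INF));
    cost[startRow][0] = 0.0;

    // Row-major order is a topological order here, because every edge points
    // right or down. Rows above the start can never be reached, so skip them.
    for (int r = startRow; r <= goalRow; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (r > startRow)  // arrive from the node above
                cost[r][c] = std::min(cost[r][c], cost[r - 1][c] + down[r - 1][c]);
            if (c > 0)         // arrive from the node to the left
                cost[r][c] = std::min(cost[r][c], cost[r][c - 1] + right[r][c - 1]);
        }
    }
    return cost[goalRow][cols - 1];
}
```

Each node is visited once and looks at no more than two incoming edges, so the work is proportional to the number of nodes between the start and goal rows.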

'Stable' multi-dimensional scaling algorithm

I have a wireless mesh network of nodes, each of which is capable of reporting its 'distance' to its neighbors, measured in (simplified) signal strength to them. The nodes are geographically in 3d space but because of radio interference, the distance between nodes need not be trigonometrically (trigonomically?) consistent. I.e., given nodes A, B and C, the distance between A and B might be 10, between A and C also 10, yet between B and C 100.
What I want to do is visualize the logical network layout in terms of connectedness of nodes, i.e. include the logical distance between nodes in the visual.
So far my research has shown that multidimensional scaling (MDS) is designed for exactly this sort of thing. Given that my data can be directly expressed as a 2d distance matrix, it's even a simpler form of the more general MDS.
Now, there seem to be many MDS algorithms, see e.g. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html and http://tapkee.lisitsyn.me/ . I need to do this in C++ and I'm hoping I can use a ready-made component, i.e. not have to re-implement an algo from a paper. So, I thought this: https://sites.google.com/site/simpmatrix/ would be the ticket. And it works, but:
The layout is not stable, i.e. every time the algorithm is re-run, the position of the nodes changes (see differences between image 1 and 2 below - this is from having been run twice, without any further changes). This is due to the initialization matrix (which contains the initial location of each node, which the algorithm then iteratively corrects) that is passed to this algorithm - I pass an empty one and then the implementation derives a random one. In general, the layout does approach the layout I expected from the given input data. Furthermore, between different runs, the direction of nodes (clockwise or counterclockwise) can change. See image 3 below.
The 'solution' I thought was obvious, was to pass a stable default initialization matrix. But when I put all nodes initially in the same place, they're not moved at all; when I put them on one axis (node 0 at 0,0 ; node 1 at 1,0 ; node 2 at 2,0 etc.), they are moved along that axis only. (see image 4 below). The relative distances between them are OK, though.
So it seems like this algorithm only changes distance between nodes, but doesn't change their location.
Thanks for reading this far - my questions are (I'd be happy to get just one or a few of them answered as each of them might give me a clue as to what direction to continue in):
Where can I find more information on the properties of each of the many MDS algorithms?
Is there an algorithm that derives the complete location of each node in a network, without having to pass an initial position for each node?
Is there a solid way to estimate the location of each point so that the algorithm can then correctly scale the distance between them? I have no geographic location of each of these nodes, that is the whole point of this exercise.
Are there any algorithms to keep the 'angle' at which the network is derived constant between runs?
If all else fails, my next option is going to be to use the algorithm I mentioned above, increase the number of iterations to keep the variability between runs at around a few pixels (I'd have to experiment with how many iterations that would take), then 'rotate' each node around node 0 to, for example, align nodes 0 and 1 on a horizontal line from left to right; that way, I would 'correct' the location of the points after their relative distances have been determined by the MDS algorithm. I would have to correct for the order of connected nodes (clockwise or counterclockwise) around each node as well. This might become hairy quite quickly.
Obviously I'd prefer a stable algorithmic solution - increasing iterations to smooth out the randomness is not very reliable.
Thanks.
EDIT: I was referred to cs.stackexchange.com and some comments have been made there; for algorithmic suggestions, please see https://cs.stackexchange.com/questions/18439/stable-multi-dimensional-scaling-algorithm .
Image 1 - with random initialization matrix:
Image 2 - after running with same input data, rotated when compared to 1:
Image 3 - same as previous 2, but nodes 1-3 are in another direction:
Image 4 - with the initial layout of the nodes on one line, their position on the y axis isn't changed:
Most scaling algorithms effectively set "springs" between nodes, where the resting length of the spring is the desired length of the edge. They then attempt to minimize the energy of the system of springs. When you initialize all the nodes on top of each other though, the amount of energy released when any one node is moved is the same in every direction. So the gradient of energy with respect to each node's position is zero, so the algorithm leaves the node where it is. Similarly if you start them all in a straight line, the gradient is always along that line, so the nodes are only ever moved along it.
(That's a flawed explanation in many respects, but it works for an intuition)
Try initializing the nodes to lie on the unit circle, on a grid or in any other fashion such that they aren't all co-linear. Assuming the library algorithm's update scheme is deterministic, that should give you reproducible visualizations and avoid degeneracy conditions.
If the library is non-deterministic, either find another library which is deterministic, or open up the source code and replace the randomness generator with a PRNG initialized with a fixed seed. I'd recommend the former option though, as other, more advanced libraries should allow you to set edges you want to "ignore" too.
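As an illustration of the unit-circle initialization suggested above, here is a small sketch that produces a deterministic starting layout. The function name and the pair-of-doubles representation are assumptions; how the coordinates are then copied into the library's own initialization matrix depends on that library and is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Places n nodes evenly on the unit circle, node 0 at (1, 0), going counterclockwise.
// For n >= 3 the points are not all on one line, which avoids the degenerate start.
std::vector<std::pair<double, double>> circleLayout(int n) {
    assert(n > 0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::pair<double, double>> pos;
    pos.reserve(n);
    for (int i = 0; i < n; ++i) {
        double angle = twoPi * i / n;
        pos.emplace_back(std::cos(angle), std::sin(angle));
    }
    return pos;
}
```

The same input always yields the same coordinates, so any remaining run-to-run variation would have to come from randomness inside the library itself.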
I have read the code of the "SimpleMatrix" MDS library and found that it uses a random permutation matrix to decide the order of points. After fixing the permutation order (just use srand(12345) instead of srand(time(0))), the result for the same data is unchanged.
Obviously there's no exact solution in general to this problem; with just 4 nodes ABCD and distances AB=BC=AC=AD=BD=1 CD=10 you cannot clearly draw a suitable 2D diagram (and not even a 3D one).
What those algorithms do is just placing springs between the nodes and then simulate a repulsion/attraction (depending on if the spring is shorter or longer than prescribed distance) probably also adding spatial friction to avoid resonance and explosion.
To keep a "stable" diagram just build a solution and then only update the distances, re-using the current position from previous solution as starting point. Picking two fixed nodes and aligning them seems a good idea to prevent a slow drift but I'd say that spring forces never end up creating a rotational momentum and thus I'd expect that just scaling and centering the solution should be enough anyway.

Shortest Path distance between points given as X-Y coordinates

I am currently working on a project that has a vector containing X and Y coordinates for approximately 800 points. These points represent an electric network of lines.
My goal is to compute the shortest path distance between a Point A and a Point B, which may or may not be located along the path given by the vectors containing the X-Y coordinates of the electric lines.
I have read about the Dijkstra Algorithm, but since I am not that familiar with it, I am not sure if I should go in that direction. I will be very thankful for any feedback or comments that can direct me to solve this matter.
Any pathfinding algorithm depends on paths; points alone are meaningless. What you have now is a list of "waypoints". However, you have not explained how those points connect. For example, if every point is connected to every other point, the shortest distance would simply be the Pythagorean (straight-line) distance between A & B. I'm also unsure what you mean by X-Y coordinates of electric lines; such a "line" would always have a start & end position?
So the first step is to add to each point not only the x,y coordinates, but also a list of connectable points.
Once you have done this you can start using a pathfinding algorithm (in this case A* would seem better than Dijkstra's though). It would simply be a standard implementation with each "cost" being the actual distance between two points. (And for A* the heuristic would be the Pythagorean distance to the end point.)
For a good tutorial about A* (and other algorithms) you should check Amit's pages
EDIT, in reply to the comments.
It seems the first step is to convert a set of line segments to "points". The way I would go through this is:
collection AllPoints {containing Location & LinksToOtherPoints}
for each Segment
    get start/end Point of Segment
    for each of those two Points
        if Point.Location is not in AllPoints
            add Point to AllPoints
        add the other Point of Segment to LinksToOtherPoints
You then have simply a list with all points & the connections between them. As you have to constantly search the allPoints collection I suggest storing that in a binary tree structure (sets?).
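A minimal sketch of that conversion followed by a shortest-path query. All names here (`Point`, `Segment`, `Network`, `shortestDistance`) are made up for illustration, and it assumes that segments which touch share exactly the same endpoint coordinates, since the lookup compares doubles for equality. It uses plain Dijkstra rather than A* to keep it short.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y)
using Segment = std::pair<Point, Point>;  // (start, end)

struct Network {
    std::map<Point, int> index;                            // location -> node id
    std::vector<std::vector<std::pair<int, double>>> adj;  // (neighbour id, length)

    // Returns the id for a location, creating a new node if it is not known yet.
    int nodeFor(const Point& p) {
        auto it = index.find(p);
        if (it != index.end()) return it->second;
        int id = static_cast<int>(adj.size());
        index.emplace(p, id);
        adj.emplace_back();
        return id;
    }

    // Links both endpoints of the segment to each other, weighted by its length.
    void addSegment(const Segment& s) {
        int a = nodeFor(s.first), b = nodeFor(s.second);
        double len = std::hypot(s.first.first - s.second.first,
                                s.first.second - s.second.second);
        adj[a].emplace_back(b, len);
        adj[b].emplace_back(a, len);
    }
};

// Dijkstra: length of the shortest route from src to dst, infinity if none exists.
double shortestDistance(const Network& net, int src, int dst) {
    const int n = static_cast<int>(net.adj.size());
    assert(src >= 0 && src < n && dst >= 0 && dst < n);
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<double> dist(n, INF);
    using Item = std::pair<double, int>;  // (distance so far, node id)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[src] = 0.0;
    pq.emplace(0.0, src);
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        double d = top.first;
        int v = top.second;
        if (d > dist[v]) continue;  // stale queue entry
        if (v == dst) return d;
        for (const auto& edge : net.adj[v]) {
            if (d + edge.second < dist[edge.first]) {
                dist[edge.first] = d + edge.second;
                pq.emplace(dist[edge.first], edge.first);
            }
        }
    }
    return INF;
}
```

The `std::map` plays the role of the binary tree structure mentioned above. If A or B does not lie on an existing endpoint, it would first have to be attached to the network somehow (for example by splitting the nearest segment), which this sketch does not attempt.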
For computing the shortest path, Dijkstra would be fine.
You may get faster results from using A*, which uses a best guess of the distance in order to focus its search in the right direction, thereby getting there quicker.
If you are repeatedly querying the same data set, then memoization is fine.
Those people who recommend a brute-force algorithm are fools - it's worth taking a little time to learn how to program an efficient solution. But you could calculate the shortest path between all points using the Floyd-Warshall algorithm. Unfortunately, in its basic form this won't tell you what the shortest path is, just how long it is; to recover the path itself you have to keep a next-hop table alongside the distances.
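For reference, a sketch of Floyd-Warshall that also records a next-hop table, which is the usual way to get the actual route and not just its length. The names `AllPairs`, `floydWarshall` and `pathBetween` are made up for illustration, and the input is assumed to be a square matrix of edge lengths with infinity where there is no direct link. With about 800 points the cubic running time is still manageable, but for a handful of queries Dijkstra is the cheaper choice.

```cpp
#include <cassert>
#include <limits>
#include <vector>

struct AllPairs {
    std::vector<std::vector<double>> dist;  // dist[i][j]: shortest distance i -> j
    std::vector<std::vector<int>> next;     // next[i][j]: node after i on that route, -1 if none
};

// w[i][j] is the direct edge length from i to j, or infinity if there is no edge.
AllPairs floydWarshall(const std::vector<std::vector<double>>& w) {
    const int n = static_cast<int>(w.size());
    const double INF = std::numeric_limits<double>::infinity();
    AllPairs ap{w, std::vector<std::vector<int>>(n, std::vector<int>(n, -1))};
    for (int i = 0; i < n; ++i) {
        assert(static_cast<int>(w[i].size()) == n);  // the matrix must be square
        for (int j = 0; j < n; ++j) {
            if (i == j) {
                ap.dist[i][j] = 0.0;
                ap.next[i][j] = j;
            } else if (w[i][j] < INF) {
                ap.next[i][j] = j;
            }
        }
    }
    // Allow routes through intermediate node k, one k at a time.
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (ap.dist[i][k] + ap.dist[k][j] < ap.dist[i][j]) {
                    ap.dist[i][j] = ap.dist[i][k] + ap.dist[k][j];
                    ap.next[i][j] = ap.next[i][k];  // first step is the first step towards k
                }
    return ap;
}

// Follows the next-hop table; returns an empty vector if `to` is unreachable.
std::vector<int> pathBetween(const AllPairs& ap, int from, int to) {
    std::vector<int> path;
    if (ap.next[from][to] == -1) return path;
    path.push_back(from);
    while (from != to) {
        from = ap.next[from][to];
        path.push_back(from);
    }
    return path;
}
```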
Just calculate the distance for all possible paths and pick the shortest one.
800 paths is nothing for modern PC. You will not even notice it.