Finding all paths of a certain length in a weighted undirected graph - c++

I need to generate all paths whose length is less than or equal to a specified value in a graph (the graph is undirected and may contain cycles). I tried using BFS while keeping track of the distance already traversed, but I'm not sure how to ensure that every path is different.
Note: I know this probably has a very high computational complexity, but I'm not worrying about that for now.

BFS is roughly the right approach, but you additionally have to keep track of the nodes already visited on each path.
Dijkstra's algorithm does similar bookkeeping, though it only yields the shortest path to each node rather than every path.
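A minimal sketch of one way to enumerate such paths, assuming "different" means simple paths (no vertex repeated within a path) and non-negative integer edge weights. It uses depth-first backtracking with a per-path visited set rather than BFS or Dijkstra. The names `Graph`, `extendPaths` and `boundedPathsFrom` are hypothetical, and the code needs C++17 for structured bindings.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Adjacency list: adj[u] holds (neighbor, edge weight) pairs.
// For an undirected graph, each edge appears in both endpoint lists.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Depth-first backtracking: extend the current path by every neighbor that
// is not already on the path and whose edge fits in the remaining budget.
void extendPaths(const Graph& adj, int u, int remaining,
                 std::vector<bool>& onPath, std::vector<int>& path,
                 std::vector<std::vector<int>>& out) {
    for (const auto& [v, w] : adj[u]) {
        if (onPath[v] || w > remaining) continue;  // repeated vertex or over the limit
        onPath[v] = true;
        path.push_back(v);
        out.push_back(path);                       // every extension is itself a valid path
        extendPaths(adj, v, remaining - w, onPath, path, out);
        path.pop_back();                           // backtrack
        onPath[v] = false;
    }
}

// Returns all simple paths that start at `start`, contain at least one edge,
// and have total weight <= maxLength.
std::vector<std::vector<int>> boundedPathsFrom(const Graph& adj, int start, int maxLength) {
    assert(start >= 0 && start < static_cast<int>(adj.size()));
    std::vector<std::vector<int>> out;
    std::vector<bool> onPath(adj.size(), false);
    std::vector<int> path{start};
    onPath[start] = true;
    extendPaths(adj, start, maxLength, onPath, path, out);
    return out;
}
```

Calling `boundedPathsFrom` once per vertex covers the whole graph; in an undirected graph each path then shows up twice, once from each end, so deduplicate if direction does not matter.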

Related

Visit all nodes exactly once in a directed graph

I have a directed graph and I want to find a path that visits every node exactly one time. I want to do this with a good complexity. Is this possible? And if yes, how?
You are searching for a Hamiltonian path, which is a simple open path that contains each node exactly once.
Finding a Hamiltonian path in a given graph is NP-hard. In fact, merely determining whether a given (directed or undirected) graph contains a Hamiltonian path is already NP-complete (proven via reduction from, e.g., the vertex cover problem).
If you still want to code it, here is an implementation on GitHub. If you want a fast solution, maybe a heuristic is sufficient (for instance, one inspired by DNA molecules), or a solution that works fast on a subset of graphs. For instance, if you have a DAG, you can do a topological sort and then check whether successive vertices are connected. If so, the topological sort gives a Hamiltonian path.
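The DAG special case can be sketched as follows. This is a minimal illustration, assuming the graph is given as adjacency lists of out-neighbors; the function name `hamiltonianPathInDag` is hypothetical. It relies on the fact that a DAG with a Hamiltonian path has exactly one topological order, so any topological sort will do.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <vector>

// Returns a Hamiltonian path of a DAG given as adjacency lists, or an empty
// vector if no such path exists (or if the graph turns out to contain a cycle).
std::vector<int> hamiltonianPathInDag(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> indegree(n, 0);
    for (const auto& outs : adj) {
        for (int v : outs) {
            assert(v >= 0 && v < n);
            ++indegree[v];
        }
    }

    // Kahn's algorithm: repeatedly remove a vertex with no incoming edges.
    std::queue<int> ready;
    for (int v = 0; v < n; ++v) {
        if (indegree[v] == 0) ready.push(v);
    }
    std::vector<int> order;
    while (!ready.empty()) {
        int u = ready.front();
        ready.pop();
        order.push_back(u);
        for (int v : adj[u]) {
            if (--indegree[v] == 0) ready.push(v);
        }
    }
    if (static_cast<int>(order.size()) != n) return {};  // a cycle blocked the sort

    // The order is a Hamiltonian path iff each consecutive pair is joined by an edge.
    for (int i = 0; i + 1 < n; ++i) {
        const auto& outs = adj[order[i]];
        if (std::find(outs.begin(), outs.end(), order[i + 1]) == outs.end()) return {};
    }
    return order;
}
```

The whole check runs in time linear in the number of vertices and edges, in contrast to the general problem.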

Point to point path in a Graph

I want an algorithm to find an optimal path between two vertices of a graph (with positive int weights). The thing is, my graph is relatively big (up to 100 vertices). I have considered Dijkstra's algorithm, but from what I found searching the net, most implementations use an adjacency matrix, which in my case would be 100x100.
If you could recommend a source to read and learn from, or even better provide a C++ implementation, that would be great.
PS: The algorithm needs to output the required route and not just the shortest distance between two points.
Thank you for your time.
Have you looked into A*?
Here's a good article to start reading: http://www.redblobgames.com/pathfinding/a-star/introduction.html
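For a graph of at most 100 vertices, plain Dijkstra over an adjacency list is also a reasonable starting point (A* runs the same loop with a heuristic added to the priority). Below is a minimal sketch, assuming non-negative integer weights, that returns the route itself by recording each vertex's predecessor. The names `Graph` and `shortestRoute` are illustrative, not taken from any library.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbor, weight) pairs; weights must be non-negative.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Returns the vertices of a shortest route from source to target,
// or an empty vector if target is unreachable.
std::vector<int> shortestRoute(const Graph& adj, int source, int target) {
    const int n = static_cast<int>(adj.size());
    assert(source >= 0 && source < n && target >= 0 && target < n);
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(n, INF);
    std::vector<int> prev(n, -1);  // predecessor on the best known route

    using Item = std::pair<long long, int>;  // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[source] = 0;
    pq.push({0, source});

    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;  // stale queue entry
        if (u == target) break;     // target settled, its distance is final
        for (const auto& [v, w] : adj[u]) {
            if (d + w < dist[v]) {
                dist[v] = d + w;
                prev[v] = u;
                pq.push({dist[v], v});
            }
        }
    }
    if (dist[target] == INF) return {};

    // Walk the predecessor chain back from target, then reverse it.
    std::vector<int> route;
    for (int v = target; v != -1; v = prev[v]) route.push_back(v);
    std::reverse(route.begin(), route.end());
    return route;
}
```

The adjacency list keeps memory proportional to the number of edges, although a 100x100 matrix would also be small enough to work fine.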

Shortest distance from a node back to itself in a weighted, directed graph

This is driving me mad...
I have a directed graph with weighted edges between the nodes.
I need to find the shortest distance from node A and back to node A.
I've tried Dijkstra's algorithm and Floyd-Warshall, but neither quite seems to do the trick. I'm not strong on the maths, and most resources I found quickly get overly complex without explaining what I really need to do. Once I understand the steps I can code it; I'm just struggling to find an understandable approach...
Can anyone help?
You want to find the shortest non-trivial cycle containing a particular point. You could do this by taking successive powers A^n, where A is the adjacency matrix, and finding the first nonzero value on the matrix diagonal corresponding to your point. This is illustrative, but a breadth-first search will work faster. I don't think there is a better way than the breadth-first search, performance-wise, though I can't prove that.
Here is some code that implements the search.
http://algs4.cs.princeton.edu/42directed/BreadthFirstDirectedPaths.java.html
You'll have to modify the termination condition, I think.
Alternatively, you could use Dijkstra's and "lift" the point in question into two different start/end points, but this isn't going to be faster for computing just one path, and makes a mess of the graph if you want many paths (point pairs).
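For weighted edges, the Dijkstra variant can be sketched without actually splitting the node: run Dijkstra from A, then close the cycle over every edge that points back to A. This is a minimal sketch assuming non-negative weights; `Graph` and `shortestCycleThrough` are hypothetical names, and a self-loop on A counts as a cycle here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Directed graph: adj[u] holds (successor, weight) pairs, weights >= 0.
using Graph = std::vector<std::vector<std::pair<int, int>>>;

// Returns the weight of the shortest cycle through vertex a, or -1 if
// no cycle passes through a.
long long shortestCycleThrough(const Graph& adj, int a) {
    const int n = static_cast<int>(adj.size());
    assert(a >= 0 && a < n);
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(n, INF);

    // Standard Dijkstra from a.
    using Item = std::pair<long long, int>;  // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[a] = 0;
    pq.push({0, a});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;  // stale queue entry
        for (const auto& [v, w] : adj[u]) {
            if (d + w < dist[v]) {
                dist[v] = d + w;
                pq.push({dist[v], v});
            }
        }
    }

    // A cycle through a is a shortest path a -> u followed by an edge u -> a.
    long long best = INF;
    for (int u = 0; u < n; ++u) {
        if (dist[u] == INF) continue;
        for (const auto& [v, w] : adj[u]) {
            if (v == a) best = std::min(best, dist[u] + w);
        }
    }
    return best == INF ? -1 : best;
}
```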

How to use union-find, minheap, Kruskal's, and a sort algorithm to create a minimum cost spanning tree? (C++)

I apologize if this question is a bit broad, but I'm having a difficult time trying to understand how I would create a minimum cost spanning tree. This is in C++ if it matters at all.
From what I understand, you would use Kruskal's to select the minimum cost edges for building the spanning tree. My thinking is to read the edges into a minheap and that way you can remove from the top in order to get the edge with the minimum cost.
So far I've only been able to implement the minheap and the sets for union-find. I am still unsure of the purpose of union-find and a sorting algorithm when creating a spanning tree.
I would greatly appreciate any advice.
EDIT: I am not limited to union-find, minheap, Kruskal's, and a sorting algorithm, nor am I required to use any of them. These were just the items suggested by the instructor.
These two structures serve different purposes in the algorithm. Kruskal's algorithm works by adding the cheapest possible edge at each point that doesn't form a cycle. It can be shown using some not particularly complex math that this guarantees that the resulting spanning tree is minimal.
The intuition behind why this works is as follows. Suppose that Kruskal's algorithm is not optimal and that there is a cheaper spanning tree. Sort all of the edges in that tree by weight, then compare those edges in sorted order to the edges chosen by Kruskal's algorithm in sorted order. Since we assume for contradiction that Kruskal's algorithm isn't optimal, there must be some place in the sequences where there's a disagreement. If in this disagreement Kruskal's algorithm has a lighter edge than the optimal solution, then we can make the optimal solution even better by adding that edge in, finding the cycle it creates, then deleting the heaviest edge in the cycle. That edge can't be the edge we just added, because otherwise that would have created a cycle in the MST produced by Kruskal's algorithm and Kruskal's algorithm never adds an edge that creates a cycle.
So this means that Kruskal's algorithm must have diverged from the optimal solution by not adding some light edge. But the only reason Kruskal's algorithm skips an edge is if it creates a cycle, and this means that there must be a cycle in the optimal MST, also a contradiction. This means that our assumption was wrong and that Kruskal's algorithm must be optimal.
Hopefully, this motivates why Kruskal's algorithm needs the heap and the union-find structure. We need the heap so that we can get back all the edges in sorted order. If we don't visit the edges in this order, then the above proof breaks down and all bets are off. Interestingly, you don't actually need a heap; you just need some way of visiting all the edges in sorted order. If you want, you can just dump all the edges into a giant array and then sort the array. This doesn't change the runtime of the algorithm from the binary heap case if you use a fast sort.
The union-find structure is a bit trickier. At each point in Kruskal's algorithm you need to be able to tell whether adding an edge would create a cycle in the graph. One way to do this is to store a structure that keeps track of which nodes are already connected to one another. That way, when adding an edge, you can check whether the endpoints are already connected. If they are, then the edge would form a cycle and should be ignored. The union-find structure is a way of maintaining this information efficiently. Its two operations, union and find, map directly onto these needs. The union step corresponds to connecting two distinct groups of nodes that were previously not connected, as would be the case if you added an edge that connected two trees contained in different parts of the spanning forest. The find step gives you a way to check whether two nodes are already connected; if so, you should skip the current edge.
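Putting the pieces together, here is a minimal sketch of Kruskal's algorithm that sorts an edge array (instead of using a heap) and uses a small union-find. The names `Edge`, `DisjointSets` and `kruskal` are hypothetical, and vertices are assumed to be numbered 0 to n-1.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

struct Edge {
    int u, v, weight;
};

// Union-find with path halving and union by size.
class DisjointSets {
public:
    explicit DisjointSets(int n) : parent_(n), size_(n, 1) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each node starts as its own root
    }
    int find(int x) {
        while (parent_[x] != x) {
            parent_[x] = parent_[parent_[x]];  // point x at its grandparent
            x = parent_[x];
        }
        return x;
    }
    // Merges the sets of a and b; returns false if they were already joined.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size_[a] < size_[b]) std::swap(a, b);
        parent_[b] = a;
        size_[a] += size_[b];
        return true;
    }
private:
    std::vector<int> parent_, size_;
};

// Returns the edges of a minimum spanning forest of an n-vertex graph.
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight < b.weight; });
    DisjointSets sets(n);
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        assert(e.u >= 0 && e.u < n && e.v >= 0 && e.v < n);
        if (sets.unite(e.u, e.v)) tree.push_back(e);  // skipped if it would close a cycle
    }
    return tree;
}
```

If the input graph is connected, the result has exactly n-1 edges; otherwise it is a spanning forest with one tree per connected component.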
Hope this helps!

Graph - strongly connected components

Is there any fast way to determine the size of the largest strongly connected component in a graph?
I mean, like, the obvious approach would mean determining every SCC (could be done using two DFS calls, I suppose) and then looping through them and taking the maximum.
I'm pretty sure there has to be some better approach if I only need to have the size of that component and only the largest one, but I can't think of a good solution. Any ideas?
Thanks.
Let me answer your question with another question -
How can you determine which value in a set is the largest without examining all of the values?
Firstly, you could use Tarjan's algorithm, which needs only one DFS instead of two. If you understand the algorithm clearly, the SCCs form a DAG and the algorithm finds them in reverse topological order. So if you have a sense of the graph (like a visual representation) and you know that relatively big SCCs occur at the end of the DAG, then you could stop the algorithm once the first few SCCs are found.
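A minimal sketch of Tarjan's algorithm that only tracks the size of the largest component, assuming the graph is given as adjacency lists of successors. The names `TarjanScc` and `largestSccSize` are hypothetical. The recursion depth can reach the number of vertices, so very deep graphs would need an iterative version.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One-pass DFS that pops each strongly connected component off a stack
// as soon as its root is finished, keeping only the largest size seen.
struct TarjanScc {
    const std::vector<std::vector<int>>& adj;
    std::vector<int> index, low;  // discovery order and lowest reachable index
    std::vector<bool> onStack;
    std::vector<int> stack;
    int counter = 0;
    int largest = 0;

    explicit TarjanScc(const std::vector<std::vector<int>>& g)
        : adj(g), index(g.size(), -1), low(g.size(), 0), onStack(g.size(), false) {}

    void visit(int u) {
        index[u] = low[u] = counter++;
        stack.push_back(u);
        onStack[u] = true;
        for (int v : adj[u]) {
            assert(v >= 0 && v < static_cast<int>(adj.size()));
            if (index[v] == -1) {
                visit(v);
                low[u] = std::min(low[u], low[v]);
            } else if (onStack[v]) {
                low[u] = std::min(low[u], index[v]);
            }
        }
        if (low[u] == index[u]) {  // u is the root of a component
            int size = 0;
            int v;
            do {
                v = stack.back();
                stack.pop_back();
                onStack[v] = false;
                ++size;
            } while (v != u);
            largest = std::max(largest, size);
        }
    }
};

// Returns the number of vertices in the largest strongly connected component.
int largestSccSize(const std::vector<std::vector<int>>& adj) {
    TarjanScc tarjan(adj);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v) {
        if (tarjan.index[v] == -1) tarjan.visit(v);
    }
    return tarjan.largest;
}
```

An early exit along the lines described above could be added by returning as soon as `largest` exceeds the number of vertices not yet assigned to a component, since no later component can then be bigger.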