Shortest Path Algorithm with a Double Weighted Graph - c++

I have a directed graph with two weights between vertices: time and cost. The goal is to minimize time while keeping the cost under a maximum value given by the user. I was told to modify the Bellman-Ford algorithm by maintaining an ordered list based on cost, instead of a single distance, for each vertex of the graph. I am able to correctly implement the Bellman-Ford algorithm when only considering time as a factor; however, what modifications to the algorithm do I need to keep cost under a maximum value as well?
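In case a sketch helps, here is one way that modification can look, assuming non-negative integer times and costs; all names below are illustrative, not from any library. Each vertex keeps a list of Pareto-optimal (cost, time) labels, which plays the role of the ordered-by-cost list, and relaxation propagates whole labels instead of single distances; labels over the budget, or dominated by an existing label, are discarded.

    #include <algorithm>
    #include <limits>
    #include <utility>
    #include <vector>

    // Sketch of Bellman-Ford with per-vertex label lists. A label
    // (cost, time) at vertex v means v is reachable within that cost
    // and that time; only Pareto-optimal labels are kept.
    struct Edge { int to, time, cost; };

    int minTimeUnderBudget(int n, const std::vector<std::vector<Edge>>& adj,
                           int src, int dst, int maxCost) {
        std::vector<std::vector<std::pair<int,int>>> labels(n); // (cost, time)
        labels[src].push_back({0, 0});

        // Plain Bellman-Ford does n-1 relaxation rounds; with label lists
        // we loop until no label changes. This terminates because only
        // non-dominated labels with cost <= maxCost are ever kept.
        bool changed = true;
        while (changed) {
            changed = false;
            for (int u = 0; u < n; ++u) {
                auto snapshot = labels[u]; // copy: label lists mutate below
                for (const Edge& e : adj[u])
                    for (auto [c, t] : snapshot) {
                        int nc = c + e.cost, nt = t + e.time;
                        if (nc > maxCost) continue;            // over budget
                        auto& L = labels[e.to];
                        bool dominated = false;
                        for (auto [c2, t2] : L)
                            if (c2 <= nc && t2 <= nt) { dominated = true; break; }
                        if (dominated) continue;
                        // drop labels the new one dominates, then insert it
                        L.erase(std::remove_if(L.begin(), L.end(),
                                    [nc, nt](std::pair<int,int> p) {
                                        return p.first >= nc && p.second >= nt;
                                    }),
                                L.end());
                        L.push_back({nc, nt});
                        changed = true;
                    }
            }
        }
        int best = std::numeric_limits<int>::max();
        for (auto [c, t] : labels[dst]) best = std::min(best, t);
        return best; // max() sentinel means no path within the budget
    }

The answer is then the smallest time among the destination's surviving labels. Note the worst case is no longer polynomial in n alone: each vertex can hold up to maxCost + 1 labels, which is exactly the price of the budget constraint.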

If I use the DBSCAN algorithm with a minPts of 1, will it still run in O(n log n) time?

I'm doing a homework problem that, simplified, is grouping stars into constellations given their (x, y) coordinates and a minimum distance. Any star can be a constellation by itself, so e.g. if 5 stars can't connect to each other, it will return that there are 5 constellations.
I initially made an algorithm that checks each point against every other, with a runtime of O(n^2). I want to make it faster and saw that DBSCAN runs in O(n log n) time.
My question is: DBSCAN is said to run in O(n log n) time, but if my minPts is 1 (the size of my clusters), will that negate the efficiency of DBSCAN and make it run in O(n^2)?
As far as I understand, the running time of DBSCAN in this case depends on the neighbour calculation, which is performed for each point within the given distance. As you mentioned, a linear scan gives you O(n^2) in total. Nevertheless, to speed up the search you can use an index-based structure that answers each neighbour query in O(log n) time.
Please check out spatial databases.
The runtime depends on epsilon being small enough that the result sizes are small, as well as on an index being able to accelerate these queries. There is no requirement on minPts, so it will work for the "degenerate" case of single-link clustering.
But in your case, you may simply want to use the spatial index for neighbour search directly, rather than going via the proxy of DBSCAN?
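For the constellation use case specifically, here is a rough sketch of that suggestion, skipping DBSCAN and doing the neighbour search plus grouping directly; all names are mine. Stars are hashed into a uniform grid with cell side minDist, so each star is only compared against the 3x3 block of cells around it, and union-find merges stars into constellations. This is roughly O(n) expected when the stars are spread out; inputs that pile everything into a few cells degrade back towards O(n^2).

    #include <cmath>
    #include <cstdint>
    #include <numeric>
    #include <unordered_map>
    #include <utility>
    #include <vector>

    // Union-find over star indices.
    struct DSU {
        std::vector<int> parent;
        explicit DSU(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
        int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
        void unite(int a, int b) { parent[find(a)] = find(b); }
    };

    int countConstellations(const std::vector<std::pair<double,double>>& stars,
                            double minDist) {
        const int n = (int)stars.size();
        auto cellOf = [&](double v) { return (int64_t)std::floor(v / minDist); };
        auto key = [](int64_t cx, int64_t cy) {
            // mix both cell coordinates into one map key; a collision only
            // adds extra candidates, which the distance check rejects
            return (uint64_t)cx * 0x9E3779B97F4A7C15ULL + (uint64_t)cy;
        };
        std::unordered_map<uint64_t, std::vector<int>> grid;
        for (int i = 0; i < n; ++i)
            grid[key(cellOf(stars[i].first), cellOf(stars[i].second))].push_back(i);

        DSU dsu(n);
        for (int i = 0; i < n; ++i) {
            int64_t cx = cellOf(stars[i].first), cy = cellOf(stars[i].second);
            for (int64_t dx = -1; dx <= 1; ++dx)     // scan the 3x3 neighbourhood
                for (int64_t dy = -1; dy <= 1; ++dy) {
                    auto it = grid.find(key(cx + dx, cy + dy));
                    if (it == grid.end()) continue;
                    for (int j : it->second) {
                        double ex = stars[i].first - stars[j].first;
                        double ey = stars[i].second - stars[j].second;
                        if (i != j && ex * ex + ey * ey <= minDist * minDist)
                            dsu.unite(i, j);
                    }
                }
        }
        int groups = 0;
        for (int i = 0; i < n; ++i) if (dsu.find(i) == i) ++groups;
        return groups; // each union-find root is one constellation
    }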

boost induce maximum common subgraph

I want to induce a small graph on a large graph. Both of these graphs are shown below. The vertices with the same color are equivalent.
small graph (the one to be induced)
large graph (the one the smaller graph is induced on)
The problem is that none of the red vertices in the large graph has 4 neighbors like the one in the small graph. So boost::vf2_subgraph_iso fails to induce the small graph on the large one. But if the inducing algorithm were more tolerant (maximal matching instead of matching all vertices), it might yield a closest match.
On the other hand, boost::mcgregor_common_subgraphs takes an exponentially long time to complete. The other consequence is that applying a maximum common subgraph algorithm to two absolutely incompatible graphs wastes a huge amount of CPU time. McGregor's algorithm is based on finding cliques, so it starts with a 1-vertex subgraph and progresses towards larger subgraphs. For a subgraph having more than 10 vertices, it takes an unacceptable amount of time to reach the obvious solution.
So what is the solution in this scenario? Is there a more tolerant algorithm for inducing a subgraph?
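For reference, a minimal sketch of the basic VF2 call, with tiny placeholder graphs rather than the ones from the question. One detail that may matter here: boost::vf2_subgraph_iso looks for an induced match, while boost::vf2_subgraph_mono (declared in the same header) only requires a monomorphism, i.e. it tolerates extra edges among the matched vertices of the large graph, so it is slightly more forgiving. Neither returns a partial (maximal) match, though.

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/vf2_sub_graph_iso.hpp>

    int main() {
        typedef boost::adjacency_list<boost::vecS, boost::vecS,
                                      boost::undirectedS> graph_t;

        // Placeholder graphs: a 3-vertex path and a 4-vertex path.
        graph_t small_g(3), large_g(4);
        boost::add_edge(0, 1, small_g);
        boost::add_edge(1, 2, small_g);
        boost::add_edge(0, 1, large_g);
        boost::add_edge(1, 2, large_g);
        boost::add_edge(2, 3, large_g);

        // The callback fires once per mapping found; returning true
        // keeps the search going, false stops it.
        boost::vf2_print_callback<graph_t, graph_t> callback(small_g, large_g);

        // Swap in vf2_subgraph_mono here for the non-induced variant.
        bool found = boost::vf2_subgraph_iso(small_g, large_g, callback);
        return found ? 0 : 1;
    }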

Best graph algorithm for least transfer in an electric grid

I'm given a series of cities, and each one produces an amount of electricity and needs an amount of electricity. Each city has up to 8 adjacent cities, and I am trying to minimize the number of transfers.
If A sends B 10 energy directly (A->B), the total cost of the transfer is 10.
If A sends C 10 energy through B (A->B->C), the total cost of the transfer is 20.
I thought about using Dijkstra's algorithm on each point that needs energy, ending the search for that point once enough energy has been found, but I thought of several pitfalls.
I was wondering what else I could consider that could potentially work?
I also considered looking into the Floyd-Warshall algorithm, as well as Hagerup's (I read a bit about them on Wikipedia and they seemed potentially viable).
Thanks
Your problem is easily reduced to the well-known minimum-cost flow problem:
The minimum-cost flow problem (MCFP) is to find the cheapest possible
way of sending a certain amount of flow through a flow network.
This reduction can be done in the following way. Add dummy "source" and "sink" vertices to your graph; add a directed edge from the source to each original vertex with capacity equal to the production rate at that vertex; and add a directed edge from each original vertex to the sink with capacity equal to the consumption rate at that vertex. Set capacities and costs on your original edges as you need them, and solve the max-flow min-cost problem on the resulting network.
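To make the reduction concrete, here is a rough sketch with a textbook successive-shortest-paths min-cost max-flow solver (SPFA to find the cheapest augmenting path); the class and all names are mine. The toy instance in main mirrors the A->B->C example from the question and should report flow 10 at cost 20.

    #include <algorithm>
    #include <iostream>
    #include <queue>
    #include <utility>
    #include <vector>

    const int INF = 1000000000;

    // Min-cost max-flow: repeatedly find the cheapest augmenting path
    // and push as much flow as possible along it.
    struct MCMF {
        struct E { int to, cap, cost; };
        std::vector<E> edges;            // stored in pairs: edge, reverse edge
        std::vector<std::vector<int>> g; // g[u] = indices into edges
        explicit MCMF(int n) : g(n) {}
        void addEdge(int u, int v, int cap, int cost) {
            g[u].push_back((int)edges.size()); edges.push_back({v, cap, cost});
            g[v].push_back((int)edges.size()); edges.push_back({u, 0, -cost});
        }
        std::pair<int,int> run(int s, int t) { // returns {flow, cost}
            int flow = 0, cost = 0, n = (int)g.size();
            while (true) {
                std::vector<int> dist(n, INF), pre(n, -1);
                std::vector<bool> inq(n, false);
                dist[s] = 0;
                std::queue<int> q; q.push(s);
                while (!q.empty()) {     // SPFA: cheapest path from s
                    int u = q.front(); q.pop(); inq[u] = false;
                    for (int id : g[u]) {
                        const E& e = edges[id];
                        if (e.cap > 0 && dist[u] + e.cost < dist[e.to]) {
                            dist[e.to] = dist[u] + e.cost; pre[e.to] = id;
                            if (!inq[e.to]) { inq[e.to] = true; q.push(e.to); }
                        }
                    }
                }
                if (pre[t] == -1) break; // no augmenting path left
                int push = INF;
                for (int v = t; v != s; v = edges[pre[v] ^ 1].to)
                    push = std::min(push, edges[pre[v]].cap);
                for (int v = t; v != s; v = edges[pre[v] ^ 1].to) {
                    edges[pre[v]].cap -= push;
                    edges[pre[v] ^ 1].cap += push;
                }
                flow += push; cost += push * dist[t];
            }
            return {flow, cost};
        }
    };

    int main() {
        // Cities 0-1-2 in a line: city 0 produces 10 units, city 2
        // consumes 10 units; moving one unit across one edge costs 1.
        int S = 3, T = 4;
        MCMF net(5);
        net.addEdge(S, 0, 10, 0);        // production at city 0
        net.addEdge(2, T, 10, 0);        // consumption at city 2
        net.addEdge(0, 1, INF, 1); net.addEdge(1, 0, INF, 1);
        net.addEdge(1, 2, INF, 1); net.addEdge(2, 1, INF, 1);
        auto [flow, cost] = net.run(S, T);
        std::cout << "flow " << flow << ", cost " << cost << "\n"; // 10, 20
    }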
I also doubt that Dijkstra's algorithm, or any shortest-path algorithm, will be of much use, as they are concerned with the path of only one unit of electricity from a particular city, and do not take into account "interference" effects from electricity produced in different cities. For example, if you have two cities (A and B) each producing 1 unit of energy, one city (C) close to both A and B consuming 1 unit of energy, and one more city (D) far away consuming 1 unit of energy, then you will have to route energy from either A or B to D, but no shortest-path algorithm will offer you this.
Ending the search as soon as you have enough energy isn't guaranteed to find the shortest path, but letting Dijkstra's run to completion for each point that's a power consumer will, and that is probably still computationally reasonable depending on the size of the network.
Look up the A* algorithm; it improves on Dijkstra's with heuristics, which might remove some pitfalls.
I can't really think of any other algorithm.
Actually I think A* should be fine.

Dijkstra's algorithm: memory consumption

I have an implementation of Dijkstra's Algorithm, based on the code on this website. Basically, I have a number of nodes (say 10000), and each node can have 1 to 3 connections to other nodes.
The nodes are generated randomly within a 3D space. The connections are also randomly generated; however, each node always tries to find connections with its closest neighbors first and slowly increases the search radius. Each connection is given a distance of one. (I doubt any of this matters, but it's just background.)
In this case, then, the algorithm is just being used to find the shortest number of hops from the starting point to all the other nodes. And it works well for 10,000 nodes. The problem I have is that, as the number of nodes increases, say towards 2 million, I use up all of my computer's memory when trying to build the graph.
Does anyone know of an alternative way of implementing the algorithm to reduce the memory footprint, or is there another algorithm out there that uses less memory?
According to your comment above, you are representing the edges of the graph with a distance matrix long dist[GRAPHSIZE][GRAPHSIZE]. This will take O(n^2) memory, which is too much for large values of n. It is also not a good representation in terms of execution time when you only have a small number of edges: it will cause Dijkstra's algorithm to take O(n^2) time (where n is the number of nodes) when it could potentially be faster, depending on the data structures used.
Since in your case you said each node is only connected to at most 3 other nodes, you shouldn't use this matrix: instead, for each node, store a list of the nodes it is connected to. Then, when you want to go over the neighbors of a node, you just need to iterate over this list.
In some specific cases you don't even need to store this list because it can be calculated for each node when needed. For example, when the graph is a grid and each node is connected to the adjacent grid nodes, it's easy to find a node's neighbors on the fly.
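A toy-sized sketch of that representation (names are illustrative): adjacency lists with at most 3 entries per node store O(n) data overall, and Dijkstra's with a binary heap then runs in O((n + m) log n). Since all your distances are one, a plain BFS would give the same hop counts even more simply.

    #include <functional>
    #include <iostream>
    #include <limits>
    #include <queue>
    #include <utility>
    #include <vector>

    // Adjacency-list graph: each node stores only its own
    // (neighbor, weight) pairs, so memory is O(edges), not O(n^2).
    int main() {
        const int n = 6; // toy size
        std::vector<std::vector<std::pair<int,int>>> adj(n);
        auto addEdge = [&](int u, int v, int w) {
            adj[u].push_back({v, w});
            adj[v].push_back({u, w});
        };
        addEdge(0, 1, 1); addEdge(1, 2, 1); addEdge(0, 3, 1);
        addEdge(3, 4, 1); addEdge(4, 5, 1); addEdge(2, 5, 1);

        const int INF = std::numeric_limits<int>::max();
        std::vector<int> dist(n, INF);
        // min-heap of (distance, node) pairs
        std::priority_queue<std::pair<int,int>,
                            std::vector<std::pair<int,int>>,
                            std::greater<>> pq;
        dist[0] = 0;
        pq.push({0, 0});
        while (!pq.empty()) {
            auto [d, u] = pq.top(); pq.pop();
            if (d > dist[u]) continue; // stale heap entry, skip
            for (auto [v, w] : adj[u])
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;
                    pq.push({dist[v], v});
                }
        }
        for (int v = 0; v < n; ++v)
            std::cout << "node " << v << ": " << dist[v] << " hops\n";
    }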
If you really cannot afford the memory, even after minimizing your graph representation, you may develop a variation of Dijkstra's algorithm using a divide and conquer method.
The idea is to split the data into small chunks, so you'll be able to perform Dijkstra's algorithm on each chunk, for each of the points within it.
For each solution generated in these small chunks, consider it as a single node in another data chunk, from which you'll start another execution of Dijkstra.
For example, consider the points below:
.B .C
.E
.A .D
.F .G
You can select the points closest to a given node, say, within two hops, and then use the solution as part of the extended graph, treating the former points as a single set of points, with a distance equal to the resulting distance of the Dijkstra solution.
Say you start from D:
select the closest points to D within a given number of hops;
use Dijkstra's algorithm upon the selected entries, commencing from D;
use the solution as a graph with the central node D and the last nodes in the shortest paths as nodes directly linked to D;
extend the graph, repeating the algorithm until all the nodes have been considered.
Although there's costly extra processing here, you'd be able to work around the memory limitation, and, if you have other machines, you can even distribute the processing.
Please note that this is just the idea of the process; what I've described is not necessarily the best way to do it. You may find something interesting by looking into distributed Dijkstra's algorithms.
I like boost::graph a lot. Its memory consumption is very decent (I've used it on road networks with 10 million nodes and 2 GB of RAM).
It has a Dijkstra implementation, but if the goal is to implement and understand it yourself, you can still use their graph representation (I suggest an adjacency list) and compare your result with theirs to be sure your result is correct.
Some people mentioned other algorithms. I don't think this will play a big role in memory usage, but more likely in speed. With 2M nodes, if the topology is close to a street network, the running time from one node to all others will be less than a second.
http://www.boost.org/doc/libs/1_52_0/libs/graph/doc/index.html
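For what it's worth, a minimal sketch of that approach with the BGL (the graph contents below are placeholders): an adjacency_list with an edge-weight property, then the library's dijkstra_shortest_paths to produce reference distances you can compare your own implementation against.

    #include <boost/graph/adjacency_list.hpp>
    #include <boost/graph/dijkstra_shortest_paths.hpp>
    #include <iostream>
    #include <vector>

    int main() {
        typedef boost::adjacency_list<
            boost::vecS, boost::vecS, boost::undirectedS,
            boost::no_property,
            boost::property<boost::edge_weight_t, int>> graph_t;

        // Placeholder graph: a 4-vertex cycle with unit weights.
        graph_t g(4);
        boost::add_edge(0, 1, 1, g);
        boost::add_edge(1, 2, 1, g);
        boost::add_edge(0, 3, 1, g);
        boost::add_edge(3, 2, 1, g);

        // Distances from vertex 0 to every other vertex.
        std::vector<int> dist(boost::num_vertices(g));
        boost::dijkstra_shortest_paths(
            g, boost::vertex(0, g),
            boost::distance_map(boost::make_iterator_property_map(
                dist.begin(), boost::get(boost::vertex_index, g))));

        for (std::size_t v = 0; v < dist.size(); ++v)
            std::cout << "node " << v << ": " << dist[v] << "\n";
    }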

breadth first or depth first search

I know how these algorithms work, but I can't decide when to use which. Are there guidelines where one performs better than the other, or other considerations?
Thanks very much.
If you want to find a solution with the fewest number of steps, or if your tree has infinite (or very large) height, you should use breadth-first search.
If you have a finite tree and want to traverse all possible solutions using the smallest amount of memory, then you should use depth-first search.
If you are searching for the best chess move to play, you could use iterative deepening, which is a combination of both.
IDDFS combines depth-first search's space-efficiency and breadth-first search's completeness (when the branching factor is finite).
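A minimal sketch of iterative deepening, assuming a tree-like search space such as a game tree (on a general graph you would add cycle checking); names are illustrative. It runs a depth-limited DFS with an increasing cap, so you get BFS-like completeness with DFS-like memory.

    #include <vector>

    // Depth-limited DFS: ordinary recursion that gives up below `limit`.
    static bool depthLimited(const std::vector<std::vector<int>>& children,
                             int node, int target, int limit) {
        if (node == target) return true;
        if (limit == 0) return false;
        for (int next : children[node])
            if (depthLimited(children, next, target, limit - 1))
                return true;
        return false;
    }

    // Iterative deepening: repeat with limit 0, 1, 2, ... Shallow levels
    // are re-explored each round, but with branching factor >= 2 the
    // deepest pass dominates the total work.
    bool iddfs(const std::vector<std::vector<int>>& children,
               int start, int target, int maxDepth) {
        for (int limit = 0; limit <= maxDepth; ++limit)
            if (depthLimited(children, start, target, limit))
                return true;
        return false;
    }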
BFS is generally useful in cases where the graph has some meaningful "natural layering" (e.g., closer nodes represent "closer" results) and your goal result is likely to be located closer to the starting point or the starting points are "cheaper to search".
When you want to find the shortest path, BFS is a natural choice.
If your graph is infinite or programmatically generated, you would probably want to search closer layers before venturing farther afield, as the cost of exploring remote nodes before getting to the closer ones is prohibitive.
If accessing more remote nodes would be more expensive due to memory/disk/locality issues, BFS may again be better.
Which method to use usually depends on the application (i.e., the reason why you have to search a graph): for example, topological sorting requires depth-first search, whereas the Ford-Fulkerson algorithm for finding maximum flow requires breadth-first search.
If you are traversing a tree, depth-first will use memory proportional to its depth. If the tree is reasonably balanced (or has some other limit on its depth), it may be convenient to use recursive depth-first traversal.
However, don't do this for traversing a general graph; it will likely cause a stack overflow. For unbounded trees or general graphs, you will need some kind of auxiliary storage that can expand to a size proportional to the number of input nodes. In this case, breadth-first traversal is simple and convenient.
If your problem provides a reason to choose one node over another, you might consider using a priority queue instead of a stack (for depth-first) or a FIFO queue (for breadth-first). A priority queue will take O(log K) time (where K is the current number of distinct priorities) to find the best node at each step, but the optimization may be worth it.
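To close with a concrete sketch (toy graph, illustrative names): the two traversals share one skeleton and differ only in the frontier container, a FIFO queue for breadth-first versus a LIFO stack for depth-first. Swapping in a std::priority_queue keyed on your node score gives the best-first variant described above.

    #include <algorithm>
    #include <iostream>
    #include <queue>
    #include <stack>
    #include <vector>

    int main() {
        // Small example graph as adjacency lists.
        std::vector<std::vector<int>> adj = {{1, 2}, {0, 3}, {0, 3}, {1, 2}};
        const int start = 0, n = (int)adj.size();

        // BFS: FIFO frontier, visits nodes in order of hop distance.
        std::vector<bool> seen(n, false);
        std::queue<int> fifo;
        fifo.push(start); seen[start] = true;
        std::cout << "BFS:";
        while (!fifo.empty()) {
            int u = fifo.front(); fifo.pop();
            std::cout << " " << u;
            for (int v : adj[u])
                if (!seen[v]) { seen[v] = true; fifo.push(v); }
        }

        // DFS: same skeleton with a LIFO frontier, so it dives deep first.
        std::fill(seen.begin(), seen.end(), false);
        std::stack<int> lifo;
        lifo.push(start);
        std::cout << "\nDFS:";
        while (!lifo.empty()) {
            int u = lifo.top(); lifo.pop();
            if (seen[u]) continue; // may have been pushed twice
            seen[u] = true;
            std::cout << " " << u;
            for (int v : adj[u]) if (!seen[v]) lifo.push(v);
        }
        std::cout << "\n";
    }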