Optimizing a dijkstra implementation - c++

QUESTION EDITED, now I only want to know if a queue can be used to improve the algorithm.
I have found this implementation of a min cost max flow algorithm, which uses dijkstra: http://www.stanford.edu/~liszt90/acm/notebook.html#file2
Gonna paste it here in case it gets lost in the internet void:
// Implementation of min cost max flow algorithm using adjacency
// matrix (Edmonds and Karp 1972). This implementation keeps track of
// forward and reverse edges separately (so you can set cap[i][j] !=
// cap[j][i]). For a regular max flow, set all edge costs to 0.
//
// Running time, O(|V|^2) cost per augmentation
// max flow: O(|V|^3) augmentations
// min cost max flow: O(|V|^4 * MAX_EDGE_COST) augmentations
//
// INPUT:
// - graph, constructed using AddEdge()
// - source
// - sink
//
// OUTPUT:
// - (maximum flow value, minimum cost value)
// - To obtain the actual flow, look at positive values only.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <limits>
#include <vector>
using namespace std;
typedef vector<int> VI;
typedef vector<VI> VVI;
typedef long long L;
typedef vector<L> VL;
typedef vector<VL> VVL;
typedef pair<int, int> PII;
typedef vector<PII> VPII;
const L INF = numeric_limits<L>::max() / 4;
struct MinCostMaxFlow {
int N;
VVL cap, flow, cost;
VI found;
VL dist, pi, width;
VPII dad;
MinCostMaxFlow(int N) :
N(N), cap(N, VL(N)), flow(N, VL(N)), cost(N, VL(N)),
found(N), dist(N), pi(N), width(N), dad(N) {}
void AddEdge(int from, int to, L cap, L cost) {
this->cap[from][to] = cap;
this->cost[from][to] = cost;
}
void Relax(int s, int k, L cap, L cost, int dir) {
L val = dist[s] + pi[s] - pi[k] + cost;
if (cap && val < dist[k]) {
dist[k] = val;
dad[k] = make_pair(s, dir);
width[k] = min(cap, width[s]);
}
}
L Dijkstra(int s, int t) {
fill(found.begin(), found.end(), false);
fill(dist.begin(), dist.end(), INF);
fill(width.begin(), width.end(), 0);
dist[s] = 0;
width[s] = INF;
while (s != -1) {
int best = -1;
found[s] = true;
for (int k = 0; k < N; k++) {
if (found[k]) continue;
Relax(s, k, cap[s][k] - flow[s][k], cost[s][k], 1);
Relax(s, k, flow[k][s], -cost[k][s], -1);
if (best == -1 || dist[k] < dist[best]) best = k;
}
s = best;
}
for (int k = 0; k < N; k++)
pi[k] = min(pi[k] + dist[k], INF);
return width[t];
}
pair<L, L> GetMaxFlow(int s, int t) {
L totflow = 0, totcost = 0;
while (L amt = Dijkstra(s, t)) {
totflow += amt;
for (int x = t; x != s; x = dad[x].first) {
if (dad[x].second == 1) {
flow[dad[x].first][x] += amt;
totcost += amt * cost[dad[x].first][x];
} else {
flow[x][dad[x].first] -= amt;
totcost -= amt * cost[x][dad[x].first];
}
}
}
return make_pair(totflow, totcost);
}
};
My question is if it can be improved by using a priority queue inside of Dijkstra(). I tried but I couldn't get it to work properly.
Actually I suspect that in Dijkstra it should be looping over adjacent nodes, not all nodes...
Thanks a lot.

Dijkstra's algorithm can certainly be improved by using a min-heap. After we put a vertex into the shortest-path tree and process (i.e. label) all adjacent vertices, the next step is to select the vertex with the smallest label that is not yet in the tree.
This is where a min-heap comes in. Rather than scanning all vertices sequentially, we extract the minimum element from the heap and restructure it, which takes O(log n) time instead of O(n). Note that the heap only keeps the vertices that are not yet in the shortest-path tree. However, we also need some way to update vertices inside the heap when their labels change.
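One way to get a heap whose entries can be updated is to use std::set as an ordered container of (label, vertex) pairs: changing a label becomes "erase the old pair, insert the new one". The following is only a minimal sketch on a plain adjacency list, not a drop-in replacement for the MinCostMaxFlow code above; Arc and dijkstraSet are hypothetical names, and edge costs are assumed to be non-negative.

```cpp
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Hypothetical edge type for this sketch: target vertex and non-negative cost.
struct Arc {
    int to;
    long long cost;
};

// Returns the shortest distance from `source` to every vertex.
// Unreachable vertices keep the value numeric_limits<long long>::max().
std::vector<long long> dijkstraSet(const std::vector<std::vector<Arc>>& adj, int source) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);
    // Ordered by (label, vertex), so *begin() is always the smallest label.
    std::set<std::pair<long long, int>> heap;
    dist[source] = 0;
    heap.insert(std::make_pair(0LL, source));
    while (!heap.empty()) {
        int u = heap.begin()->second;
        heap.erase(heap.begin());          // extract-min, O(log n)
        for (const Arc& a : adj[u]) {      // only adjacent vertices are visited
            long long cand = dist[u] + a.cost;
            if (cand < dist[a.to]) {
                // "Decrease key": drop the stale entry (a no-op if absent),
                // then insert the vertex again with its new label.
                heap.erase(std::make_pair(dist[a.to], a.to));
                dist[a.to] = cand;
                heap.insert(std::make_pair(cand, a.to));
            }
        }
    }
    return dist;
}
```

With this structure every extraction and every label update costs O(log n), which is where the O((V + E) log V) bound for the heap variant comes from.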

I am not so sure using a priority queue to implement Dijkstra's algorithm will actually improve the run time because, while using a priority queue decreases the amount of time needed to find the vertex with minimum distance from the source (O(log V) with a priority queue vs. O(V) in the naive implementation), it also increases the amount of time needed to process a new edge (O(log V) with a priority queue vs. O(1) in the naive implementation).
Thus, for the naive implementation, the running time is O(V^2+E).
However, for the priority queue implementation, the running time is O(V log V+E log V).
For very dense graphs, E could be O(V^2), which means the naive implementation would have running time O(V^2+V^2)=O(V^2) while the priority queue implementation would have running time O(V log V+V^2 log V)=O(V^2 log V). Thus, as you can see, the priority queue implementation actually has a worse worst-case run time in the case of dense graphs.
Since the authors of the above implementation stored the edges in an adjacency matrix rather than in adjacency lists, they were apparently expecting a dense graph with O(V^2) edges, so it makes sense that they used the naive implementation rather than the priority queue one.
For more info about running time of Dijkstra's algorithm, read up on this Wikipedia page.

Related

Shortest path in Graph with time limit

Let's say I have a graph G with N vertices and M edges. Each edge has its length and time (let's say in minutes), which it takes to traverse that edge. I need to find the shortest path in the graph between the vertices 1 and N, which is performed in under T minutes time.
Since time is the more valuable resource and we care about traversing the graph in time, and only then with minimal length, I decided to use Dijkstra's algorithm, for which I considered the time of each edge as its weight. I added a vector to store the durations. Thus, the algorithm returns the least time, not the least length. A friend suggested this addition to my code:
int answer(int T) {
int l = 1;
int r = M; // a very big number
int answer = M;
while (l <= r) {
int mid = (l + r) / 2;
int time = dijkstra(mid); // the parameter mid serves as an upper bound for dijkstra and I relax the edge only if its length(not time) is less than mid
if (time <= T) {
answer = mid;
r = mid - 1;
} else {
l = mid + 1;
}
}
if (answer == M) {
return -1; // what we return in case there is no path in the graph, which takes less than T minutes
}
return answer;
}
Here is the dijkstra method (part of class Graph with std::unordered_map<int, std::vector<Node>> adjacencyList member):
int dijkstra(int maxLength) {
std::priority_queue<Node, std::vector<Node>, NodeComparator> heap;//NodeComparator sorts by time of edge
std::vector<int> durations(this->numberOfVertices + 1, M);
std::set<int> visited;
// duration 1->1 is 0
durations[1] = 0;
heap.emplace(1, 0, 0);
while (!heap.empty()) {
int vertex = heap.top().endVertex;
heap.pop();
// to avoid repetition
if (visited.find(vertex) != visited.end()) {
continue;
}
for (Node node: adjacencyList[vertex]) {
// relaxation
if (node.length <= maxLength && durations[node.endVertex] > durations[vertex] + node.time) {
durations[node.endVertex] = durations[vertex] + node.time;
heap.emplace(node.endVertex, durations[node.endVertex], 0);
}
}
// mark as visited to avoid going through the same vertex again
visited.insert(vertex);
}
// return path time between 1 and N bounded by maxLength
return durations.back();
}
This seems to work but seems inefficient to me. To be frank, I don't understand his idea completely. It looks to me like randomly trying to find the best answer (because nobody said that the time of an edge is proportional to its length). I tried searching for shortest path in graph with time limit but I only found algorithms that find the fastest paths, not the shortest with a limit. Does an algorithm for this even exist? How can I improve my solution?
What is this?
int time = dijkstra(mid);
It certainly isn't an implementation of the Dijkstra algorithm!
The Dijkstra algorithm requires the starting node and returns THE shortest path from the starting node to every other.
You are going to need a function that returns all the distinct paths between start and end nodes that take less than T. Then you can search them for the one that is cheapest.
Search graph for all distinct paths from start to end
Discard paths that take more than T
Select cheapest path.
Finding a resource-constrained shortest path is NP-hard. Most approaches to this problem employ a labelling scheme, which is a specialization of dynamic programming. You could use available libraries to accomplish this. See here for a boost implementation.
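To make the labelling idea concrete, here is a minimal sketch, assuming that edge times are non-negative integers, edge lengths are non-negative, and T is small enough that a table of N * (T + 1) entries fits in memory. Each label is a pair (vertex, time used so far), and a Dijkstra-style search over these labels minimizes the length. TimedEdge and shortestWithinTime are hypothetical names and are not taken from the boost implementation.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical edge type: target vertex, length to minimize, time consumed.
struct TimedEdge {
    int to;
    int length;
    int time;
};

// Minimum total length of a path from src to dst whose total time is <= T,
// or -1 if no such path exists.
long long shortestWithinTime(const std::vector<std::vector<TimedEdge>>& adj,
                             int src, int dst, int T) {
    const long long INF = std::numeric_limits<long long>::max();
    // best[v][t]: smallest length found so far for reaching v using exactly t minutes.
    std::vector<std::vector<long long>> best(adj.size(), std::vector<long long>(T + 1, INF));
    typedef std::tuple<long long, int, int> Label; // (length, vertex, time used)
    std::priority_queue<Label, std::vector<Label>, std::greater<Label>> heap;
    best[src][0] = 0;
    heap.push(Label(0, src, 0));
    while (!heap.empty()) {
        Label top = heap.top();
        heap.pop();
        long long len = std::get<0>(top);
        int v = std::get<1>(top);
        int t = std::get<2>(top);
        if (len > best[v][t]) continue; // stale label, a better one was processed
        if (v == dst) return len;       // first time dst is popped, its length is minimal
        for (const TimedEdge& e : adj[v]) {
            int nt = t + e.time;
            if (nt > T) continue;       // label would violate the time limit
            long long nl = len + e.length;
            if (nl < best[e.to][nt]) {
                best[e.to][nt] = nl;
                heap.push(Label(nl, e.to, nt));
            }
        }
    }
    return -1;
}
```

The running time depends on T (roughly O(M * T * log(N * T))), so this is pseudo-polynomial, which is consistent with the problem being NP-hard in general.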

How does Dijkstra's Algorithm work if a max priority queue is used?

I was recently looking at some code for Dijkstra's algorithm. The goal of the code was to find the minimum cost path from vertex 1 to vertex N. I came across this working code when looking at the solution to the problem:
void dijkstra(int start, int n) {
for(int i = 0; i<n; i++) {
dist[i] = INF;
pred[i] = -1;
}
dist[start] = 0;
priority_queue<ll> q;
q.push(0);
int u = 0;
while(q.size()) {
u = q.top();
q.pop();
for(int end : adj[u]) {
ll w = weight.at(mp(u, end));
if(dist[u] + w < dist[end]) {
dist[end] = dist[u] + w;
pred[end] = u;
q.push(end);
}
}
}
}
This program uses a priority queue in order to determine which vertices to traverse to next (starting from vertex 1). However, the priority queue implemented in this algorithm is the standard C++ Priority Queue which is a Max Priority Queue. This means that the largest elements have the highest priority. However, I thought that in Dijkstra's algorithm, we wanted to poll the smallest vertices first? I am unsure how using a Max Priority Queue works for this algorithm.
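Everything that you have mentioned is right. But be careful: u is a vertex index here, not a distance, so the queue orders vertices by index. The code still ends up with correct distances, because every vertex whose distance improves is pushed again and relaxed again; it only behaves like a real Dijkstra when the neighbors with a higher index happen to have a smaller distance.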
Moreover, you should notice that the priority queue can be implemented such that the top element will be the smallest value using std::greater<T>.
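As a small illustration of the std::greater<T> variant, the sketch below stores (distance, vertex) pairs so that the queue is ordered by distance rather than by vertex index. Entry and drainMinFirst are hypothetical names used only for this demonstration.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

typedef std::pair<long long, int> Entry; // (distance, vertex)

// Pushes all entries into a min-first priority queue and returns the
// vertices in the order in which the queue hands them out.
std::vector<int> drainMinFirst(const std::vector<Entry>& entries) {
    // std::greater makes top() the smallest pair instead of the largest one.
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> q;
    for (const Entry& e : entries) q.push(e);
    std::vector<int> order;
    while (!q.empty()) {
        order.push_back(q.top().second);
        q.pop();
    }
    return order;
}
```

For the entries (7, 0), (2, 1) and (5, 2) the vertices come out as 1, 2, 0, i.e. by increasing distance.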

How to avoid using nested loops in cpp?

I am working on digital sampling for sensor. I have following code to compute the highest amplitude and the corresponding time.
struct LidarPoints{
float timeStamp;
float Power;
};
std::vector<LidarPoints> measurement; // To store Lidar points of current measurement
Currently power and energy are the same (because of the delta function) and the vector is arranged in ascending order of time. I would like to change this to a step function. Pulse duration is a constant 10ns.
uint32_t pulseDuration = 5;
The problem is to find any overlap between the samples and if any to add up the amplitudes.
I currently use following code:
for(auto i= 0; i< measurement.size(); i++){
for(auto j=i+1; j< measurement.size(); j++){
if((measurement[j].timeStamp - measurement[i].timeStamp) < pulseDuration){
measurement[i].Power += measurement[j].Power;
measurement[i].timeStamp = (measurement[i].timeStamp + measurement[j].timeStamp)/2.0f;
}
}
}
Is it possible to code this without two for loops since I cannot afford the amount of time being taken by nested loops.
You can take advantage of the fact that the vector is sorted by timeStamp and find the next pulse with binary search, thus reducing the complexity from O(n^2) to O(n log n):
#include <vector>
#include <algorithm>
#include <numeric>
#include <iterator>
auto it = measurement.begin();
auto end = measurement.end();
while (it != end)
{
// next timestamp as in your code
auto timeStampLower = it->timeStamp + pulseDuration;
// next value in measurement with a timestamp >= timeStampLower
// note: the comparator of std::lower_bound takes (element, value), in that order
auto lower_bound = std::lower_bound(it, end, timeStampLower, [](const LidarPoints& a, float b) {
return a.timeStamp < b;
});
// sum over [timeStamp, timeStampLower)
float sum = std::accumulate(it, lower_bound, 0.0f, [] (float a, const LidarPoints& b) {
return a + b.timeStamp;
});
auto num = std::distance(it, lower_bound);
// num should be >= 1 since the vector is sorted and pulseDuration is positive
// you should uncomment next line to catch unexpected error
// Expects(num >= 1); // needs GSL library
// assert(num >= 1); // or standard C if you don't want to use GSL
// average over [timeStamp, timeStampLower)
it->timeStamp = sum / num;
// advance it
it = lower_bound;
}
https://en.cppreference.com/w/cpp/algorithm/lower_bound
https://en.cppreference.com/w/cpp/algorithm/accumulate
Also please note that my algorithm will produce different result than yours because you don't really compute the average over multiple values with measurement[i].timeStamp = (measurement[i].timeStamp + measurement[j].timeStamp)/2.0f
Also to consider (I am by far not an expert in the field, so I am just throwing the idea out there, it's up to you to know if it's valid or not): with your code you just "squash" together close measurements, instead of having a vector of measurements with periodic time. It might be what you intend or not.
Disclaimer: not tested beyond "it compiles". Please don't just copy-paste it. It could be incomplet and incorrekt. But I hope I gave you a direction to investigate.
Due to jitter and other timing complexities, instead of simple summation, you need to switch to numerical integration (e.g. trapezoidal integration).
If your values are in ascending order of timeStamp, adding else break to the if statement shouldn't affect the result but should be a lot quicker.
for(auto i= 0; i< measurement.size(); i++){
for(auto j=i+1; j< measurement.size(); j++){
if((measurement[j].timeStamp - measurement[i].timeStamp) < pulseDuration){
measurement[i].Power += measurement[j].Power;
measurement[i].timeStamp = (measurement[i].timeStamp + measurement[j].timeStamp)/2.0f;
} else {
break;
}
}
}
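Because the samples are sorted, the grouping can also be done in one linear pass. The sketch below is one possible interpretation, not the asker's exact semantics: it assumes that every sample within pulseDuration of the first sample of a group belongs to that group, sums the Power of the group and averages its timestamps, and writes the result to a new vector. mergeOverlaps is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

struct LidarPoints {
    float timeStamp;
    float Power;
};

// Merges runs of samples that start less than pulseDuration after the first
// sample of the run. The input must be sorted by timeStamp.
std::vector<LidarPoints> mergeOverlaps(const std::vector<LidarPoints>& in, float pulseDuration) {
    std::vector<LidarPoints> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const float start = in[i].timeStamp;
        float powerSum = 0.0f;
        float timeSum = 0.0f;
        std::size_t count = 0;
        // do-while: the first sample of a group is always consumed,
        // so the outer loop makes progress even for a tiny pulseDuration.
        do {
            powerSum += in[i].Power;
            timeSum += in[i].timeStamp;
            ++count;
            ++i;
        } while (i < in.size() && in[i].timeStamp - start < pulseDuration);
        LidarPoints merged;
        merged.timeStamp = timeSum / static_cast<float>(count); // mean time of the group
        merged.Power = powerSum;                                // summed amplitude
        out.push_back(merged);
    }
    return out;
}
```

Each sample is visited exactly once, so the pass is O(n); whether "group by first sample" is the right overlap rule for the sensor is something only the domain can decide.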

shortest path algorithm from text input

I've been trying to do this shortest path problem and I realised that the way I was trying to do it was almost completely wrong and that I have no idea how to complete it.
The question requires you to find the shortest path from one point to another given a text file of input.
The input looks like this with the first value representing how many levels there are.
4
14 10 15
13 5 22
13 7 11
5
This would result in an answer of: 14+5+13+11+5=48
The question asks for the shortest path from the bottom left to the top right.
The way I have attempted to do this is to compare the values of either path possible and then add them to a sum. e.g the first step from the input I provided would compare 14 against 10 + 15. I ran into the problem that if both values are the same it will stuff up the rest of the working.
I hope this makes some sense.
Any suggestions on an algorithm to use or any sample code would be greatly appreciated.
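Assume your data file is read into a 2D array of the form (one row per kind of edge, one column per level, which is how the function below indexes it):
int weights[3][HEIGHT] = {
{14, 13, 13, X}, // stairs going up from the left side
{10, 5, 7, 5}, // floors
{15, 22, 11, X} // stairs going up from the right side
};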
where X can be anything, doesn't matter. For this I'm assuming positive weights and therefore there is never a need to consider a path that goes "down" a level.
In general you can say that the minimum cost is the lesser of the following 2 costs:
1) The cost of rising a level: The cost of the path to the opposite side from 1 level below, plus the cost of coming up.
2) The cost of moving across a level : The cost of the path to the opposite from the same level, plus the cost of coming across.
// needs #include <algorithm> for std::min
// weight[0][level]: stair from the left of this level up to the right of the next
// weight[1][level]: floor of this level
// weight[2][level]: stair from the right of this level up to the left of the next
int MinimumCost(int weight[3][HEIGHT]) {
int MinCosts[2][HEIGHT]; // MinCosts[0][Level] stores the minimum cost of reaching
// the left node of that level
// MinCosts[1][Level] stores the minimum cost of reaching
// the right node of that level
MinCosts[0][0] = 0; // cost nothing to get to the start
MinCosts[1][0] = weight[1][0]; // the cost of moving across the bottom
for (int level = 1; level < HEIGHT; level++) {
// cost of coming to left from below right
int LeftCostOneStep = MinCosts[1][level - 1] + weight[2][level - 1];
// cost of coming to left from below left then across
int LeftCostTwoStep = MinCosts[0][level - 1] + weight[0][level - 1] + weight[1][level];
MinCosts[0][level] = std::min(LeftCostOneStep, LeftCostTwoStep);
// cost of coming to right from below left
int RightCostOneStep = MinCosts[0][level - 1] + weight[0][level - 1];
// cost of coming to right from below right then across
int RightCostTwoStep = MinCosts[1][level - 1] + weight[2][level - 1] + weight[1][level];
MinCosts[1][level] = std::min(RightCostOneStep, RightCostTwoStep);
}
return MinCosts[1][HEIGHT - 1];
}
I haven't double checked the syntax, please only use it to get a general idea of how to solve the problem. You could also rewrite the algorithm so that MinCosts uses constant memory, MinCosts[2][2] and your whole algorithm could become a state machine.
You could also use dijkstra's algorithm to solve this, but that's a bit like killing a fly with a nuclear warhead.
My first idea was to represent the graph with a matrix and then run a DFS or Dijkstra to solve it. But for this given question, we can do better.
So, here is a possible solution of this problem that runs in O(n). 2*i means left node of level i and 2*i+1 means right node of level i. Read the comments in this solution for an explanation.
#include <stdio.h>
struct node {
int lup; // Cost to go to level up
int stay; // Cost to stay at this level
int dist; // Dist to top right node
};
int main() {
int N;
scanf("%d", &N);
struct node tab[2*N];
// Read input.
int i;
for (i = 0; i < N-1; i++) {
int v1, v2, v3;
scanf("%d %d %d", &v1, &v2, &v3);
tab[2*i].lup = v1;
tab[2*i].stay = tab[2*i+1].stay = v2;
tab[2*i+1].lup = v3;
}
int v;
scanf("%d", &v);
tab[2*i].stay = tab[2*i+1].stay = v;
// Now the solution:
// The last level is obvious:
tab[2*i+1].dist = 0;
tab[2*i].dist = v;
// Now, for each level, we compute the cost.
for (i = N - 2; i >= 0; i--) {
tab[2*i].dist = tab[2*i+3].dist + tab[2*i].lup;
tab[2*i+1].dist = tab[2*i+2].dist + tab[2*i+1].lup;
// Can we do better by staying at the same level ?
if (tab[2*i].dist > tab[2*i+1].dist + tab[2*i].stay) {
tab[2*i].dist = tab[2*i+1].dist + tab[2*i].stay;
}
if (tab[2*i+1].dist > tab[2*i].dist + tab[2*i+1].stay) {
tab[2*i+1].dist = tab[2*i].dist + tab[2*i+1].stay;
}
}
// Print result
printf("%d\n", tab[0].dist);
return 0;
}
(This code has been tested on the given example.)
Use a depth-first search and add only the minimum values. Then check which side is the shortest stair. If it's a graph problem look into a directed graph. For each stair you need 2 vertices. The cost from ladder to ladder can be something else.
The idea of a simple version of the algorithm is the following:
- define a list of vertices (places where you can stay) and edges (walks you can do)
- every vertex will have a list of edges connecting it to other vertices
- for every edge store the walk length
- for every vertex store a field with 1000000000 with the meaning "how long is the walk to here"
- create a list of "active" vertices initialized with just the starting point
- set the walk-distance field of starting vertex with 0 (you're here)
Now the search algorithm proceeds as
1) pick the (a) vertex from the "active list" with lowest walk_distance and remove it from the list
2) if the vertex is the destination you're done.
3) otherwise for each edge in that vertex compute the walk distance to the other_vertex as
new_dist = vertex.walk_distance + edge.length
4) check if the new distance is shorter than other_vertex.walk_distance and in this case update other_vertex.walk_distance to the new value and put that vertex in the "active list" if it's not already there.
5) repeat from 1
If you run out of nodes in the active list and never processed the destination vertex it means that there was no way to reach the destination vertex from the starting vertex.
For the data structure in C++ I'd use something like
struct Vertex {
double walk_distance;
std::vector<struct Edge *> edges;
...
};
struct Edge {
double length;
Vertex *a, *b;
...
void connect(Vertex *va, Vertex *vb) {
a = va; b = vb;
va->edges.push_back(this); vb->edges.push_back(this);
}
...
};
Then from the input I'd know that for n levels there are 2*n vertices needed (left and right side of each floor) and 2*(n-1) + n edges needed (one per each stair and one for each floor walk).
For each floor except the last you need to build three edges, for last floor only one.
I'd also allocate all edges and vertices in vectors first, fixing the pointers later (post-construction setup is an anti-pattern but here is to avoid problems with reallocations and still maintaining things very simple).
int n = number_of_levels;
std::vector<Vertex> vertices(2*n);
std::vector<Edge> edges(2*(n-1) + n);
for (int i=0; i<n-1; i++) {
Vertex *left = &vertices[i*2];
Vertex *right = &vertices[i*2 + 1];
Vertex *next_left = &vertices[(i+1)*2];
Vertex *next_right = &vertices[(i+1)*2 + 1];
Edge& dl_ur = edges[i*3]; // down-left to up-right stair
Edge& dr_ul = edges[i*3+1]; // down-right to up-left stair
Edge& floor = edges[i*3+2];
dl_ur.connect(left, next_right);
dr_ul.connect(right, next_left);
floor.connect(left, right);
}
// Last floor
edges.back().connect(&vertices[2*n-2], &vertices[2*n-1]);
NOTE: untested code
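For reference, the "active list" search described above can be sketched end to end as follows. This version uses vertex indices instead of pointers to keep it short, and Link, Floor, buildLadder and activeListSearch are hypothetical names; the active list is scanned linearly, exactly as in the description, rather than kept in a heap.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Link { int to; long long length; };          // one direction of an edge
struct Floor { int upRight; int across; int upLeft; }; // one input line "a b c"

// Vertex 2*i is the left side of level i, vertex 2*i+1 the right side.
std::vector<std::vector<Link>> buildLadder(const std::vector<Floor>& floors, int lastFloor) {
    int n = static_cast<int>(floors.size()) + 1;
    std::vector<std::vector<Link>> g(2 * n);
    auto connect = [&g](int a, int b, long long len) {
        g[a].push_back({b, len});
        g[b].push_back({a, len});
    };
    for (int i = 0; i < n - 1; ++i) {
        connect(2 * i, 2 * (i + 1) + 1, floors[i].upRight); // down-left to up-right stair
        connect(2 * i, 2 * i + 1, floors[i].across);        // floor walk
        connect(2 * i + 1, 2 * (i + 1), floors[i].upLeft);  // down-right to up-left stair
    }
    connect(2 * n - 2, 2 * n - 1, lastFloor);               // last floor
    return g;
}

// Returns the walk distance from start to goal, or -1 if goal is unreachable.
long long activeListSearch(const std::vector<std::vector<Link>>& g, int start, int goal) {
    const long long UNREACHED = std::numeric_limits<long long>::max();
    std::vector<long long> walk(g.size(), UNREACHED);
    std::vector<int> active(1, start);
    walk[start] = 0;
    while (!active.empty()) {
        // Step 1: pick the active vertex with the lowest walk distance.
        auto it = std::min_element(active.begin(), active.end(),
                                   [&walk](int a, int b) { return walk[a] < walk[b]; });
        int v = *it;
        active.erase(it);
        if (v == goal) return walk[v];                      // step 2
        for (const Link& e : g[v]) {                        // steps 3 and 4
            long long newDist = walk[v] + e.length;
            if (newDist < walk[e.to]) {
                walk[e.to] = newDist;
                if (std::find(active.begin(), active.end(), e.to) == active.end())
                    active.push_back(e.to);
            }
        }
    }
    return -1;
}
```

For the sample input, buildLadder({{14, 10, 15}, {13, 5, 22}, {13, 7, 11}}, 5) followed by activeListSearch(g, 0, 7) gives 48, matching the expected answer.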
EDIT
Of course this algorithm can solve a much more general problem where the set of vertices and edges is arbitrary (but lengths are non-negative).
For the very specific problem a much simpler algorithm is possible, that doesn't even need any data structure and that can instead compute the result on the fly while reading the input.
#include <iostream>
#include <algorithm>
int main(int argc, const char *argv[]) {
int n; std::cin >> n;
int l=0, r=1000000000;
while (--n > 0) {
int a, b, c; std::cin >> a >> b >> c;
int L = std::min(r+c, l+b+c);
int R = std::min(r+b+a, l+a);
l=L; r=R;
}
int b; std::cin >> b;
std::cout << std::min(r, l+b) << std::endl;
return 0;
}
The idea of this solution is quite simple:
l variable is the walk_distance for the left side of the floor
r variable is the walk_distance for the right side
Algorithm:
we initialize l=0 and r=1000000000 as we're on the left side
for all intermediate steps we read the three distances:
a is the length of the down-left to up-right stair
b is the length of the floor
c is the length of the down-right to up-left stair
we compute the walk_distance for left and right side of next floor
L is the minimum between r+c and l+b+c (either we go up starting from right side, or we go there first starting from left side)
R is the minimum between l+a and r+b+a (either we go up starting from left, or we start from right and cross the floor first)
for the last step we just need to choose the minimum between r and coming there from l by crossing the last floor

Creating random undirected graph in C++

The issue is I need to create a random undirected graph to benchmark Dijkstra's algorithm using an array and a heap to store vertices. AFAIK a heap implementation should be faster than an array when running on sparse and average graphs; however, when it comes to dense graphs, the heap should become less efficient than an array.
I tried to write code that will produce a graph based on the input - number of vertices and total number of edges (maximum number of edges in undirected graph is n(n-1)/2).
On the entrance I divide the total number of edges by the number of vertices so that I have a const number of edges coming out from every single vertex. The graph is represented by an adjacency list. Here is what I came up with:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <list>
#include <set>
#define MAX 1000
#define MIN 1
class Vertex
{
public:
int Number;
int Distance;
Vertex(void);
Vertex(int, int);
~Vertex(void);
};
Vertex::Vertex(void)
{
Number = 0;
Distance = 0;
}
Vertex::Vertex(int C, int D)
{
Number = C;
Distance = D;
}
Vertex::~Vertex(void)
{
}
int main()
{
int VertexNumber, EdgeNumber;
while(scanf("%d %d", &VertexNumber, &EdgeNumber) > 0)
{
int EdgesFromVertex = (EdgeNumber/VertexNumber);
std::list<Vertex>* Graph = new std::list<Vertex> [VertexNumber];
srand(time(NULL));
int Distance, Neighbour;
bool Exist, First;
std::set<std::pair<int, int>> Added;
for(int i = 0; i < VertexNumber; i++)
{
for(int j = 0; j < EdgesFromVertex; j++)
{
First = true;
Exist = true;
while(First || Exist)
{
Neighbour = rand() % (VertexNumber - 1) + 0;
if(!Added.count(std::pair<int, int>(i, Neighbour)))
{
Added.insert(std::pair<int, int>(i, Neighbour));
Exist = false;
}
First = false;
}
}
First = true;
std::set<std::pair<int, int>>::iterator next = Added.begin();
for(std::set<std::pair<int, int>>::iterator it = Added.begin(); it != Added.end();)
{
if(!First)
Added.erase(next);
Distance = rand() % MAX + MIN;
Graph[it->first].push_back(Vertex(it->second, Distance));
Graph[it->second].push_back(Vertex(it->first, Distance));
std::set<std::pair<int, int>>::iterator next = it;
First = false;
}
}
// Dijkstra's implementation
}
return 0;
}
I get an error:
"set iterator not dereferencable" when trying to create graph from set data.
I know it has something to do with erasing set elements on the fly, however I need to erase them asap to diminish memory usage.
Maybe there's a better way to create an undirected graph? Mine is pretty raw, but that's the best I came up with. I was thinking about making a directed graph, which is an easier task, but it doesn't ensure that every two vertices will be connected.
I would be grateful for any tips and solutions!
Piotry had basically the same idea I did, but he left off a step.
Only read half the matrix, and do not write any values to the diagonal. If you always want a node to have an edge to itself, put a one on the diagonal. If you never want a node to have an edge to itself, leave it as a zero.
You can read the other half of your matrix for a second graph for testing your implementation.
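A minimal sketch of the half-matrix idea, assuming an adjacency matrix in which 0 means "no edge" and that each possible edge is included independently with some probability. randomUndirected, edgeProbability and seed are hypothetical names, and weights are drawn from 1 to 1000 to match MIN and MAX in the question.

```cpp
#include <random>
#include <vector>

// Builds a symmetric adjacency matrix for a random undirected graph.
// Only the upper triangle (i < j) is generated; each value is mirrored
// into the lower triangle, and the diagonal stays 0 (no self-loops).
std::vector<std::vector<int>> randomUndirected(int n, double edgeProbability, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution hasEdge(edgeProbability);
    std::uniform_int_distribution<int> weight(1, 1000);
    std::vector<std::vector<int>> m(n, std::vector<int>(n, 0));
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            if (hasEdge(rng)) {
                int w = weight(rng);
                m[i][j] = w;
                m[j][i] = w;
            }
        }
    }
    return m;
}
```

Varying edgeProbability from near 0 to 1 gives sparse to dense graphs for the benchmark, although this does not by itself guarantee that the graph is connected.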
Look at the description of std::set::erase :
Iterator validity
Iterators, pointers and references referring to elements removed by
the function are invalidated.
All other iterators, pointers and
references keep their validity.
In your code, if next is equal to it and you erase the element of the std::set through next, then it is invalidated and you can't use it any more. In this case you must (at least) move it to another element first, and only after that keep using it.
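A common way to erase while iterating, sketched below, is to let erase hand back the next valid iterator (since C++11, std::set::erase(iterator) returns the iterator following the removed element). drainEdges is a hypothetical helper, not the asker's code, and it leaves out the random distances for brevity.

```cpp
#include <set>
#include <utility>
#include <vector>

// Moves every (from, to) pair out of `added` into the adjacency lists,
// erasing each pair as soon as it has been used.
void drainEdges(std::set<std::pair<int, int>>& added,
                std::vector<std::vector<int>>& graph) {
    for (std::set<std::pair<int, int>>::iterator it = added.begin(); it != added.end(); ) {
        graph[it->first].push_back(it->second);
        graph[it->second].push_back(it->first);
        // The erased iterator is invalidated; continue with the one erase returns.
        it = added.erase(it);
    }
}
```

This keeps the memory usage of the set shrinking as the graph grows, without ever touching an invalidated iterator.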