Finding the strongly connected components in a Di-Graph in one DFS - c++

By now, I have used the following algorithm for finding the strongly connected components of a graph.
call DFS(G) to compute the finishing time f[v] for every vertex v, sort the vertices of G in decreasing order of their finishing time;
compute the transpose GT of G;
Perform another DFS on G, this time in the main for-loop we go through the vertices of G
in the decreasing order of f[v];
output the vertices of each tree in the DFS forest (formed by the second DFS) as a separate
strongly connected component.
.
But I was wondering if it is possible to find all the strongly connected components in only one DFS.
Any help in this regard would be highly appreciated.

Check out, Algorithm Design Manual by Steven Skiena. It calculates SCC in one DFS. It's based on the concept of oldest reachable vertex.
Initialize each vertex's reachable vertex and SCComponent# to itself in the beginning.
low[i] = i;
scc[i] = -1;
Do a DFS on the digraph, you are interested only in back edges and cross edges because these two edges will tell you if you've encountered a back edge and entering 1 component from another.
int edge_classification(int x, int y)
{
if (parent[y] == x) return(TREE);
if (discovered[y] && !processed[y]) return(BACK);
if (processed[y] && (entry_time[y]>entry_time[x])) return(FORWARD);
if (processed[y] && (entry_time[y]<entry_time[x])) return(CROSS);
printf("Warning: unclassified edge (%d,%d)\n",x,y);
}
So when you encounter these edges, you set reachable vertex[] recursively, based on the entry times.
if (class == BACK) {
if (entry_time[y] < entry_time[ low[x] ] )
low[x] = y;
}
if (class == CROSS)
{
if (scc[y] == -1) /* component not yet assigned */
if (entry_time[y] < entry_time[ low[x] ] )
low[x] = y;
}
A new strongly connected component is found whenever the lowest reachable vertex from vertex 'v' is itself (loop can say, a->b->c->a, lowest reachable vertex of a is a).
process_vertex_early(int v)
{
push(&active,v);
}
After DFS for a vertex is complete (DFS for it's neighbors would have been completed too), check the lowest reachable vertex for it:
if (low[v] == v)
{ /* edge (parent[v],v) cuts off scc */
pop_component(v);
}
if (entry_time[low[v]] < entry_time[low[parent[v]]])
low[parent[v]] = low[v];
pop_component(...) just pops from the stack until this component is found. If a->b->c->a is scanned the stack will have a(bottom)->b->c(top).. pop until vertex 'a' is seen. You get an SCC for 'a'..and similarly you get all connected components in one DFS.

I found this on the Wikipedia page for Strongly connected component:
Kosaraju's algorithm, Tarjan's algorithm and the path-based strong
component algorithm all efficiently compute the strongly connected
components of a directed graph, but Tarjan's and the path-based
algorithm are favoured in practice since they require only one
depth-first search rather than two.
I think this quite answers your question :)

Related

Why BFS algorithm doesn't always find solution of rubick's cube?

I'm tring to write rubick's cube solver based on BFS algorithm. It finds way if there's one shuffle done (one wall moved). There's problem with memory when I'm doing more complicated shuffe.
I have written cube, so one can does moves on it, so the moves work well. I'm checking if the cube is solve by comparing it with the new one (not shuffle). I know it's not perfect, but it should work anyway...
theres some code:
void search(Node &problem)
{
int flag;
Node *curr;
Node* mvs[12]; //moves
std::vector<Cube> explored; //list of position cube's already been in
std::queue<Node*> Q; //queue of possible ways
if (problem.isgoal() == true) return problem.state;
Q.push(&problem);
explored.push_back(problem.state);
while (!Q.empty())
{
flag = 0;
curr = Q.front();
if (curr->isgoal() == true)
{
return curr->state;
}
if (std::find(explored.begin(), explored.end(), curr->state)!=explored.end()) //checking if cube's been in curr position
{
flag = 1;
break;
}
if (flag == 1) break;
explored.push_back(Q.front()->state);
for (int i = 0; i < 12; i++) {
Q.push(curr->expand(i)); //expand is method that
//spread tree of possible moves from curr node
}
Q.pop();
}
}
TLDR; Too Broad.
As mentioned by #tarkmeper, rubik's cube have a huge number of combinations.
A simple shuffling algorithm will not give you an answer. I would suggest you to make algorithms which solve cube based on it's initial state. As I solve the cube myself, there are 2 basic methods:
1. Solve the cube layer by layer which is beginner's method https://www.youtube.com/watch?v=MaltgJGz-dU
2. CFOP(Cross F2l(First 2 Layers) OLL PLL(oll, pll are algorithms))
https://www.youtube.com/watch?v=WzE7SyDB8vA (Pretty advanced)
There have been machines developed to solve the cube but they take input as images of the cube.
I think implementing CFOP could actually solve your issue as it does not check for random shuffles of the cube but actually solves it systematically, but it would be very difficult.
For your implementation it would be much better to take data as a matrix.
A rubik's cube has 3 parts: 1. Center(1 Color) 2. Edge(2 Color) 3.Corner (3 Color)
There are 6 centers 12 egdes 8 corners. You would also have to take into account valid initial states as you cannot randomize it.
What I could think up right now about a problem of this scale is to make 4 algorithms:
Cross():
.... which takes the cube and makes the cross which is basically all white edges
aligned to white center and their 2nd colored center.
F2L():
.... to make 2nd layers of the cube with the corner pieces of the first layer,
this could use backtracking as there are lot of invalid case occurences
OLL():
.... based on your state of cube after F2L transformation there is a straight
forward set of moves, Same for PLL():
Getting to the bare bones of the cube itself:
You would also need to implement moves which are F, F', R, R', L, L', B, B'.
These are moves on the cube the ones with " ' " denote moving that face in anticlockwise direction with respect to the current face of the cube you are looking at.Imagine you are holding the cube, F is for front in clockwise, R is right in clockwise, L is left in clockwise, B is back in clockwise.
Rubics cubes have a huge number of possible states (https://www.quora.com/In-how-many-ways-can-a-Rubiks-cube-be-arranged).
Conceptually your algorithm might need to go need to include all 43,252,003,274,489,856,000 states into the queue before it hits the correct result. You won't have this much memory to do a breadth-first search.
Some of the states are not lead to solutions because they don't belong to their rotations they belong to all permutation. let's Consider an example 1,2,3,4.
3,1,2,4 are not found in rotations of 1,2,3,4 it's found in all permutations
Some answers require 18 moves and when you have at least 12 moves every step you have 12^18 using breadths first search at worst. Computers are fast but not fast enough to solve the cube using BFS. It harder to see if it is possible to store all solution in a database as only moves that solve the cube are needed to be stored, but this is likely (see end game Chess tables).

Implementation of breadth first search in PacMan

I am currently working on a C++ project to make a PacMan clone. Basically I have done almost everything that the game does. But I have not yet figured out how to implement breadth first search in order for the ghosts to chase pacman. In the last few days, I have read a lot about BFS. I know what it is and what it does. I also know I have to use a queue for this purpose. But still, I am unable to actually implement this algorithm in my game. I have a 2d grid of 36*28 tiles. But I am really unsure about how to implement it in my xy-coordinate system, what to push to the queue and how to manipulate the neighbouring tiles. I'm stuck at this point. I'm not asking for actual code. I just need a clear and simple explanation about the actual implementation of BFS and which things to keep in mind while working on BFS in this 2d game grid.
Your explanation will be really helpful. Thanks.
I assume you want to do the BFS every time a ghost will do a move. What you could do is start a BFS from PacMan until he found all ghosts. Note that you don't actually need the complete route a ghost will take, you only need the next move. While doing the BFS you can store for each cell the distance from PacMan to that cell. When you BFS is done, all ghosts can look in their adjacent cells an pick the cell with the lowest number. Note that you should initialize all cells with a large number.
To do your BFS you can do some tricks, like mapping your (x, y) coordinate to one number. This number can be placed in your queue. Note you should check for wall before putting something in your queue. When you pull something out the queue run a for-loop of length 4 (the number of adjacent cells).
int dx[] = {0, 1, 0, -1};
int dy[] = {1, 0, -1, 0};
void do_bfs() {
std::queue<int> queue;
// initialize grid
// add starting position of pacman to queue
while(!queue.empty()) {
// remove and access first element
cur_place = queue.front(); queue.pop();
map_to_coordinate(cur_x, cur_y, cur_place);
cur_distance = grid[cur_x][cur_y];
for (int i = 0; i < 4; i++) {
if (cur_x + dx[i] >= 0 && /* more checks */) {
queue.push_back(map_to_number(cur_x + dx[i], cur_y + dy[i]));
grid[cur_x + dx[i]][cur_y + dy[i]] = cur_distance + 1;
}
}
}
// now grid is filled, so now you should find out for each ghost how to move
}
As an exercise for the reader I tried to leave open as much while making my point.

Compute the "lower contour" of a set of segments in the plane in `O(n log n)`

Suppose you've a set s of horizontal line segments in the plane described by a starting point p, an end point q and a y-value.
We can assume that all values of p and qare pairwise distinct and no two segments overlap.
I want to compute the "lower contour" of the segment.
We can sort s by p and iterate through each segment j. If i is the "active" segment and j->y < i->y we "switch to" j (and output the corresponding contour element).
However, what can we do, when no such j exists and we find a j with i->q < j->p. Then, we would need to switch to the "next higher segment". But how do we know that segment? I can't find a way such that the resulting algorithm would have a running time of O(n log n). Any ideas?
A sweep line algorithm is an efficient way to solve your problem. As explained previously by Brian, we can sort all the endpoints by the x-coordinate and process them in order. An important distinction to make here is that we are sorting the endpoints of the segment and not the segments in order of increasing starting point.
If you imagine a vertical line sweeping from left to right across your segments, you will notice two things:
At any position, the vertical line either intersects a set of segments or nothing. Let's call this set the active set. The lower contour is the segment within the active set with the smallest y-coordinate.
The only x-coordinates where the lower contour can change are the segment endpoints.
This immediately brings one observation: the lower contour should be a list of segments. A list of points does not provide sufficient information to define the contour, which can be undefined at certain x-coordinates (where there are no segments).
We can model the active set with an std::set ordered by the y position of the segment. Processing the endpoints in order of increasing x-coordinate. When encountering a left endpoint, insert the segment. When encountering a right endpoint, erase the segment. We can find the active segment with the lowest y-coordinate with set::begin() in constant time thanks to the ordering. Since each segment is only ever inserted once and erased once, maintaining the active set takes O(n log n) time in total.
In fact, it is possible to maintain a std::multiset of only the y-coordinates for each segment that intersects the sweep line, if it is easier.
The assumption that the segments are non-overlapping and have distinct endpoints is not entirely necessary. Overlapping segments are handled both by the ordered set of segments and the multiset of y-coordinates. Coinciding endpoints can be handled by considering all endpoints with the same x-coordinate at one go.
Here, I assume that there are no zero-length segments (i.e. points) to simplify things, although they can also be handled with some additional logic.
std::list<segment> lower_contour(std::list<segment> segments)
{
enum event_type { OPEN, CLOSE };
struct event {
event_type type;
const segment &s;
inline int position() const {
return type == OPEN ? s.sp : s.ep;
}
};
struct order_by_position {
bool operator()(const event& first, const event& second) {
return first.position() < second.position();
}
};
std::list<event> events;
for (auto s = segments.cbegin(); s != segments.cend(); ++s)
{
events.push_back( event { OPEN, *s } );
events.push_back( event { CLOSE, *s } );
}
events.sort(order_by_position());
// maintain a (multi)set of the y-positions for each segment that intersects the sweep line
// the ordering allows querying for the lowest segment in O(log N) time
// the multiset also allows overlapping segments to be handled correctly
std::multiset<int> active_segments;
bool contour_is_active = false;
int contour_y;
int contour_sp;
// the resulting lower contour
std::list<segment> contour;
for (auto i = events.cbegin(); i != events.cend();)
{
auto j = i;
int current_position = i->position();
while (j != events.cend() && j->position() == current_position)
{
switch (j->type)
{
case OPEN: active_segments.insert(j->s.y); break;
case CLOSE: active_segments.erase(j->s.y); break;
}
++j;
}
i = j;
if (contour_is_active)
{
if (active_segments.empty())
{
// the active segment ends here
contour_is_active = false;
contour.push_back( segment { contour_sp, current_position, contour_y } );
}
else
{
// if the current lowest position is different from the previous one,
// the old active segment ends here and a new active segment begins
int current_y = *active_segments.cbegin();
if (current_y != contour_y)
{
contour.push_back( segment { contour_sp, current_position, contour_y } );
contour_y = current_y;
contour_sp = current_position;
}
}
}
else
{
if (!active_segments.empty())
{
// a new contour segment begins here
int current_y = *active_segments.cbegin();
contour_is_active = true;
contour_y = current_y;
contour_sp = current_position;
}
}
}
return contour;
}
As Brian also mentioned, a binary heap like std::priority_queue can also be used to maintain the active set and tends to outperform std::set, even if it does not allow arbitrary elements to be deleted. You can work around this by flagging a segment as removed instead of erasing it. Then, repeatedly remove the top() of the priority_queue if it is a flagged segment. This might end up being faster, but it may or may not matter for your use case.
First sort all the endpoints by x-coordinate (both starting and ending points). Iterate through the endpoints and keep a std::set of all the y-coordinates of active segments. When you reach a starting point, add its y-coordinate to the set and "switch" to it if it's the lowest; when you reach an ending point, remove its y-coordinate from the set and recalculate the lowest y-coordinate using the set. This gives an O(n log n) solution overall.
A balanced binary search tree such as that used to implement std::set generally has a large constant factor. You can speed up this approach by using a binary heap (std::priority_queue) instead of a set, with the lowest y-coordinate at the root. In this case, you can't remove a non-root node, but when you reach such an ending point, just mark the segment inactive in an array. When the root node is popped, continue popping until there is a new root node that hasn't been marked inactive already. I think this will be about twice as fast as the set-based approach, but you'll have to code it yourself and see, if that's a concern.

Optimizing the Dijkstra's algorithm

I need a graph-search algorithm that is enough in our application of robot navigation and I chose Dijkstra's algorithm.
We are given the gridmap which contains free, occupied and unknown cells where the robot is only permitted to pass through the free cells. The user will input the starting position and the goal position. In return, I will retrieve the sequence of free cells leading the robot from starting position to the goal position which corresponds to the path.
Since executing the dijkstra's algorithm from start to goal would give us a reverse path coming from goal to start, I decided to execute the dijkstra's algorithm backwards such that I would retrieve the path from start to goal.
Starting from the goal cell, I would have 8 neighbors whose cost horizontally and vertically is 1 while diagonally would be sqrt(2) only if the cells are reachable (i.e. not out-of-bounds and free cell).
Here are the rules that should be observe in updating the neighboring cells, the current cell can only assume 8 neighboring cells to be reachable (e.g. distance of 1 or sqrt(2)) with the following conditions:
The neighboring cell is not out of bounds
The neighboring cell is unvisited.
The neighboring cell is a free cell which can be checked via the 2-D grid map.
Here is my implementation:
#include <opencv2/opencv.hpp>
#include <algorithm>
#include "Timer.h"
/// CONSTANTS
static const int UNKNOWN_CELL = 197;
static const int FREE_CELL = 255;
static const int OCCUPIED_CELL = 0;
/// STRUCTURES for easier management.
struct vertex {
cv::Point2i id_;
cv::Point2i from_;
vertex(cv::Point2i id, cv::Point2i from)
{
id_ = id;
from_ = from;
}
};
/// To be used for finding an element in std::multimap STL.
struct CompareID
{
CompareID(cv::Point2i val) : val_(val) {}
bool operator()(const std::pair<double, vertex> & elem) const {
return val_ == elem.second.id_;
}
private:
cv::Point2i val_;
};
/// Some helper functions for dijkstra's algorithm.
uint8_t get_cell_at(const cv::Mat & image, int x, int y)
{
assert(x < image.rows);
assert(y < image.cols);
return image.data[x * image.cols + y];
}
/// Some helper functions for dijkstra's algorithm.
bool checkIfNotOutOfBounds(cv::Point2i current, int rows, int cols)
{
return (current.x >= 0 && current.y >= 0 &&
current.x < cols && current.y < rows);
}
/// Brief: Finds the shortest possible path from starting position to the goal position
/// Param gridMap: The stage where the tracing of the shortest possible path will be performed.
/// Param start: The starting position in the gridMap. It is assumed that start cell is a free cell.
/// Param goal: The goal position in the gridMap. It is assumed that the goal cell is a free cell.
/// Param path: Returns the sequence of free cells leading to the goal starting from the starting cell.
bool findPathViaDijkstra(const cv::Mat& gridMap, cv::Point2i start, cv::Point2i goal, std::vector<cv::Point2i>& path)
{
// Clear the path just in case
path.clear();
// Create working and visited set.
std::multimap<double,vertex> working, visited;
// Initialize working set. We are going to perform the djikstra's
// backwards in order to get the actual path without reversing the path.
working.insert(std::make_pair(0, vertex(goal, goal)));
// Conditions in continuing
// 1.) Working is empty implies all nodes are visited.
// 2.) If the start is still not found in the working visited set.
// The Dijkstra's algorithm
while(!working.empty() && std::find_if(visited.begin(), visited.end(), CompareID(start)) == visited.end())
{
// Get the top of the STL.
// It is already given that the top of the multimap has the lowest cost.
std::pair<double, vertex> currentPair = *working.begin();
cv::Point2i current = currentPair.second.id_;
visited.insert(currentPair);
working.erase(working.begin());
// Check all arcs
// Only insert the cells into working under these 3 conditions:
// 1. The cell is not in visited cell
// 2. The cell is not out of bounds
// 3. The cell is free
for (int x = current.x-1; x <= current.x+1; x++)
for (int y = current.y-1; y <= current.y+1; y++)
{
if (checkIfNotOutOfBounds(cv::Point2i(x, y), gridMap.rows, gridMap.cols) &&
get_cell_at(gridMap, x, y) == FREE_CELL &&
std::find_if(visited.begin(), visited.end(), CompareID(cv::Point2i(x, y))) == visited.end())
{
vertex newVertex = vertex(cv::Point2i(x,y), current);
double cost = currentPair.first + sqrt(2);
// Cost is 1
if (x == current.x || y == current.y)
cost = currentPair.first + 1;
std::multimap<double, vertex>::iterator it =
std::find_if(working.begin(), working.end(), CompareID(cv::Point2i(x, y)));
if (it == working.end())
working.insert(std::make_pair(cost, newVertex));
else if(cost < (*it).first)
{
working.erase(it);
working.insert(std::make_pair(cost, newVertex));
}
}
}
}
// Now, recover the path.
// Path is valid!
if (std::find_if(visited.begin(), visited.end(), CompareID(start)) != visited.end())
{
std::pair <double, vertex> currentPair = *std::find_if(visited.begin(), visited.end(), CompareID(start));
path.push_back(currentPair.second.id_);
do
{
currentPair = *std::find_if(visited.begin(), visited.end(), CompareID(currentPair.second.from_));
path.push_back(currentPair.second.id_);
} while(currentPair.second.id_.x != goal.x || currentPair.second.id_.y != goal.y);
return true;
}
// Path is invalid!
else
return false;
}
int main()
{
// cv::Mat image = cv::imread("filteredmap1.jpg", CV_LOAD_IMAGE_GRAYSCALE);
cv::Mat image = cv::Mat(100,100,CV_8UC1);
std::vector<cv::Point2i> path;
for (int i = 0; i < image.rows; i++)
for(int j = 0; j < image.cols; j++)
{
image.data[i*image.cols+j] = FREE_CELL;
if (j == image.cols/2 && (i > 3 && i < image.rows - 3))
image.data[i*image.cols+j] = OCCUPIED_CELL;
// if (image.data[i*image.cols+j] > 215)
// image.data[i*image.cols+j] = FREE_CELL;
// else if(image.data[i*image.cols+j] < 100)
// image.data[i*image.cols+j] = OCCUPIED_CELL;
// else
// image.data[i*image.cols+j] = UNKNOWN_CELL;
}
// Start top right
cv::Point2i goal(image.cols-1, 0);
// Goal bottom left
cv::Point2i start(0, image.rows-1);
// Time the algorithm.
Timer timer;
timer.start();
findPathViaDijkstra(image, start, goal, path);
std::cerr << "Time elapsed: " << timer.getElapsedTimeInMilliSec() << " ms";
// Add the path in the image for visualization purpose.
cv::cvtColor(image, image, CV_GRAY2BGRA);
int cn = image.channels();
for (int i = 0; i < path.size(); i++)
{
image.data[path[i].x*cn*image.cols+path[i].y*cn+0] = 0;
image.data[path[i].x*cn*image.cols+path[i].y*cn+1] = 255;
image.data[path[i].x*cn*image.cols+path[i].y*cn+2] = 0;
}
cv::imshow("Map with path", image);
cv::waitKey();
return 0;
}
For the algorithm implementation, I decided to have two sets namely the visited and working set whose each elements contain:
The location of itself in the 2D grid map.
The accumulated cost
Through what cell did it get its accumulated cost (for path recovery)
And here is the result:
The black pixels represent obstacles, the white pixels represent free space and the green line represents the path computed.
On this implementation, I would only search within the current working set for the minimum value and DO NOT need to scan throughout the cost matrix (where initially, the initially cost of all cells are set to infinity and the starting point 0). Maintaining a separate vector of the working set I think promises a better code performance because all the cells that have cost of infinity is surely to be not included in the working set but only those cells that have been touched.
I also took advantage of the STL which C++ provides. I decided to use the std::multimap since it can store duplicating keys (which is the cost) and it sorts the lists automatically. However, I was forced to use std::find_if() to find the id (which is the row,col of the current cell in the set) in the visited set to check if the current cell is on it which promises linear complexity. I really think this is the bottleneck of the Dijkstra's algorithm.
I am well aware that A* algorithm is much faster than Dijkstra's algorithm but what I wanted to ask is my implementation of Dijkstra's algorithm optimal? Even if I implemented A* algorithm using my current implementation in Dijkstra's which is I believe suboptimal, then consequently A* algorithm will also be suboptimal.
What improvement can I perform? What STL is the most appropriate for this algorithm? Particularly, how do I improve the bottleneck?
You're using a std::multimap for 'working' and 'visited'. That's not great.
The first thing you should do is change visited into a per-vertex flag so you can do your find_if in constant time instead of linear times and also so that operations on the list of visited vertices take constant instead of logarithmic time. You know what all the vertices are and you can map them to small integers trivially, so you can use either a std::vector or a std::bitset.
The second thing you should do is turn working into a priority queue, rather than a balanced binary tree structure, so that operations are a (largish) constant factor faster. std::priority_queue is a barebones binary heap. A higher-radix heap---say quaternary for concreteness---will probably be faster on modern computers due to its reduced depth. Andrew Goldberg suggests some bucket-based data structures; I can dig up references for you if you get to that stage. (They're not too complicated.)
Once you've taken care of these two things, you might look at A* or meet-in-the-middle tricks to speed things up even more.
Your performance is several orders of magnitude worse than it could be because you're using graph search algorithms for what looks like geometry. This geometry is much simpler and less general than the problems that graph search algorithms can solve. Also, with a vertex for every pixel your graph is huge even though it contains basically no information.
I heard you asking "how can I make this better without changing what I'm thinking" but nevertheless I'll tell you a completely different and better approach.
It looks like your robot can only go horizontally, vertically or diagonally. Is that for real or just a side effect of you choosing graph search algorithms? I'll assume the latter and let it go in any direction.
The algorithm goes like this:
(0) Represent your obstacles as polygons by listing the corners. Work in real numbers so you can make them as thin as you like.
(1) Try for a straight line between the end points.
(2) Check if that line goes through an obstacle or not. To do that for any line, show that all corners of any particular obstacle lie on the same side of the line. To do that, translate all points by (-X,-Y) of one end of the line so that that point is at the origin, then rotate until the other point is on the X axis. Now all corners should have the same sign of Y if there's no obstruction. There might be a quicker way just using gradients.
(3) If there's an obstruction, propose N two-segment paths going via the N corners of the obstacle.
(4) Recurse for all segments, culling any paths with segments that go out of bounds. That won't be a problem unless you have obstacles that go out of bounds.
(5) When it stops recursing, you should have a list of locally optimised paths from which you can choose the shortest.
(6) If you really want to restrict bearings to multiples of 45 degrees, then you can do this algorithm first and then replace each segment by any 45-only wiggly version that avoids obstacles. We know that such a version exists because you can stay extremely close to the original line by wiggling very often. We also know that all such wiggly paths have the same length.

improving performance for graph connectedness computation

I am writing a program to generate a graph and check whether it is connected or not. Below is the code. Here is some explanation: I generate a number of points on the plane at random locations. I then connect the nodes, NOT based on proximity only. By that I mean to say that a node is more likely to be connected to nodes that are closer, and this is determined by a random variable that I use in the code (h_sq) and the distance. Hence, I generate all links (symmetric, i.e., if i can talk to j the viceversa is also true) and then check with a BFS to see if the graph is connected.
My problem is that the code seems to be working properly. However, when the number of nodes becomes greater than ~2000 it is terribly slow, and I need to run this function many times for simulation purposes. I even tried to use other libraries for graphs but the performance is the same.
Does anybody know how could I possibly speed everything up?
Thanks,
int Graph::gen_links() {
if( save == true ) { // in case I want to store the structure of the graph
links.clear();
links.resize(xy.size());
}
double h_sq, d;
vector< vector<luint> > neighbors(xy.size());
// generate links
double tmp = snr_lin / gamma_0_lin;
// xy is a std vector of pairs containing the nodes' locations
for(luint i = 0; i < xy.size(); i++) {
for(luint j = i+1; j < xy.size(); j++) {
// generate |h|^2
d = distance(i, j);
if( d < d_crit ) // for sim purposes
d = 1.0;
h_sq = pow(mrand.randNorm(0, 1), 2.0) + pow(mrand.randNorm(0, 1), 2.0);
if( h_sq * tmp >= pow(d, alpha) ) {
// there exists a link between i and j
neighbors[i].push_back(j);
neighbors[j].push_back(i);
// options
if( save == true )
links.push_back( make_pair(i, j) );
}
}
if( neighbors[i].empty() && save == false ) {
// graph not connected. since save=false i dont need to store the structure,
// hence I exit
connected = 0;
return 1;
}
}
// here I do BFS to check whether the graph is connected or not, using neighbors
// BFS code...
return 1;
}
UPDATE:
the main problem seems to be the push_back calls within the inner for loops. It's the part that takes most of the time in this case. Shall I use reserve() to increase efficiency?
Are you sure the slowness is caused by the generation but not by your search algorithm?
The graph generation is O(n^2) and you can't do too much to it. However, you can apparently use memory in exchange of some of the time if the point locations are fixed for at least some of the experiments.
First, distances of all node pairs, and pow(d, alpha) can be precomputed and saved into memory so that you don't need to compute them again and again. The extra memory cost for 10000 nodes will be about 800mb for double and 400mb for float..
In addition, sum of square of normal variable is chi-square distribution if I remember correctly.. Probably you can have some precomputed table lookup if the accuracy allowed?
At last, if the probability that two nodes will be connected are so small if the distance exceeds some value, then you don't need O(n^2) and probably you can only calculate those node pairs that have distance smaller than some limits?
As a first step you should try to use reserve for both inner and outer vectors.
If this does not bring performance up to your expectations I believe this is because memory allocations that are still happening.
There is a handy class I've used in similar situations, llvm::SmallVector (find it in Google). It provides a vector with few pre-allocated items, so you can have decrease number of allocations by one per vector.
It still can grow when it is running out of items in pre-allocated space.
So:
1) Examine the number of items you have in your vectors on average during runs (I'm talking about both inner and outer vectors)
2) Put in llvm::SmallVector with a pre-allocation of such size (as vector is allocated on the stack you might need to increase stack size, or reduce pre-allocation if you are restricted on available stack memory).
Another good thing about SmallVector is that it has almost the same interface as std::vector (could be easily put instead of it)