Implementation and Improvability of Depth First Search - C++

I have coded DFS the way it exists in my mind, without referring to any textbook or pseudocode for ideas. I think some lines of my code make unnecessary calculations. Any ideas on reducing the complexity of my algorithm?
vector<int> visited;

bool isFound(vector<int> vec, int value)
{
    if (std::find(vec.begin(), vec.end(), value) == vec.end())
        return false;
    else
        return true;
}

void dfs(int **graph, int numOfNodes, int node)
{
    if (isFound(visited, node) == false)
        visited.push_back(node);
    vector<int> neighbours;
    for (int i = 0; i < numOfNodes; i++)
        if (graph[node][i] == 1)
            neighbours.push_back(i);
    for (int i = 0; i < neighbours.size(); i++)
        if (isFound(visited, neighbours[i]) == false)
            dfs(graph, numOfNodes, neighbours[i]);
}

void depthFirstSearch(int **graph, int numOfNodes)
{
    for (int i = 0; i < numOfNodes; i++)
        dfs(graph, numOfNodes, i);
}
PS: Could somebody please send me a link explaining how to insert C++ code with good formatting? I've tried syntax highlighting, but it didn't work out.

Your DFS has O(n^2) time complexity, which is really bad (with adjacency lists it should run in O(n + m), where n is the number of vertices and m the number of edges).
This line ruins your implementation, because searching in a vector takes time proportional to its length:
if(std::find(vec.begin(),vec.end(),value)==vec.end())
To avoid this, you can remember what was visited in an array of boolean values.
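For example, keeping your recursive structure, that one change alone might look like this (a sketch reusing your graph/numOfNodes/node parameters; visited must be resized and reset before the traversal starts):
std::vector<bool> visited; // call visited.assign(numOfNodes, false) before the first dfs call

void dfs(int **graph, int numOfNodes, int node)
{
    visited[node] = true;                      // O(1) membership test instead of std::find
    for (int i = 0; i < numOfNodes; i++)
        if (graph[node][i] == 1 && !visited[i])
            dfs(graph, numOfNodes, i);
}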
The second problem with your DFS is that for bigger graphs it will probably cause a stack overflow, because the worst-case recursion depth is equal to the number of vertices in the graph. The remedy for this problem is also simple: use a std::list<int> as your own stack.
So, code that does the DFS should look more or less like this (keeping your adjacency-matrix representation for the neighbour scan):
// n is the number of vertices in the graph
std::vector<bool> visited(n, false); // in this vector we save visited vertices
std::list<int> stack;                // explicit stack instead of recursion
std::list<int> order;                // vertices in visiting order
for (int i = 0; i < n; i++) {
    if (!visited[i]) {
        stack.push_back(i);
        while (!stack.empty()) {
            int top = stack.back();
            stack.pop_back();
            if (visited[top])
                continue;
            visited[top] = true;
            order.push_back(top);
            for (int j = 0; j < n; j++)              // all neighbours of top
                if (graph[top][j] == 1 && !visited[j])
                    stack.push_back(j);
        }
    }
}

Related

Parallelization of the bin packing problem with OpenMP

I am learning OpenMP, and I want to parallelize the well-known bin packing problem. But no matter what I try, I can't get a correct solution (the one I get with the sequential version).
So far, I have tried multiple different versions (including reduction, tasks, and schedule), but didn't get anything useful.
Below is my most recent try.
int binPackingParallel(std::vector<int> weight, int n, int c)
{
    int result = 0;
    int bin_rem[n];
    #pragma omp parallel for schedule(dynamic) reduction(+:result)
    for (int i = 0; i < n; i++) {
        bool done = false;
        int j;
        for (j = 0; j < result && !done; j++) {
            int b;
            #pragma omp atomic
            b = bin_rem[j] - weight[i];
            if (b >= 0) {
                bin_rem[j] = bin_rem[j] - weight[i];
                done = true;
            }
        }
        if (!done) {
            #pragma omp critical
            bin_rem[result] = c - weight[i];
            result++;
        }
    }
    return result;
}
Edit: I modified the original problem, so now a number of bins N is given and we need to check whether all elements can be put into N bins. I did this using recursion, but my parallel version is still slower.
bool can_fit_parallel(std::vector<int> arr, std::vector<int> bins, int n) {
    // base case: if the array is empty, we can fit the elements
    if (arr.empty()) {
        return true;
    }
    bool found = false;
    #pragma omp parallel for schedule(dynamic, 10)
    for (int i = 0; i < n; i++) {
        if (bins[i] >= arr[0]) {
            bins[i] -= arr[0];
            if (can_fit_parallel(std::vector<int>(arr.begin() + 1, arr.end()), bins, n)) {
                found = true;
                #pragma omp cancel for
            }
            // if the element doesn't fit or if the recursion fails,
            // restore the bin's capacity and try the next bin
            bins[i] += arr[0];
        }
    }
    // if the element doesn't fit in any of the bins, return false
    return found;
}
Any help would be great.
You do not need parallelization to make your code significantly faster. You have implemented the First Fit method (its complexity is O(n^2)), but it can be significantly faster if you use a binary search tree (O(n log n)). To do so, you just have to use the standard library (std::multiset); in this example I have implemented the Best Fit algorithm:
int binPackingSTL(const std::vector<int>& weight, const int n, const int c)
{
    std::multiset<int> bins;                 // multiset to store bins
    for (const auto x : weight) {
        const auto it = bins.lower_bound(x); // find the best bin to accommodate x
        if (it == bins.end()) {
            bins.insert(c - x);              // if no suitable bin found, insert a new one
        } else {
            // suitable bin found - replace it with a smaller value
            auto value = *it;                // store its value
            bins.erase(it);                  // erase the old value
            bins.insert(value - x);          // insert the new value
        }
    }
    return bins.size();                      // number of bins
}
In my measurements, it is about 100× faster than your code in the case of n = 50000.
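For a quick check, here is a minimal usage sketch (the item sizes and capacity are made-up example values):
#include <iostream>
#include <set>
#include <vector>

// binPackingSTL as defined above

int main() {
    std::vector<int> weight{4, 8, 1, 4, 2, 1}; // hypothetical item sizes
    const int c = 10;                          // bin capacity
    // the total weight is 20, so 2 bins of capacity 10 is optimal; prints 2
    std::cout << binPackingSTL(weight, (int)weight.size(), c) << '\n';
}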
EDIT: Both algorithms mentioned above (First Fit and Best Fit) are approximations to the bin packing problem. To answer your revised question, you have to use an algorithm that finds the optimal solution, i.e. an exact algorithm, not an approximation. Instead of trying to reinvent the wheel, you can consider using already available libraries such as BPPLIB – A Bin Packing Problem Library.
This is not a reduction: a reduction would give each thread its own partial result, and you want result to be global. I think that putting a critical section around the two statements might work. The atomic statement is meaningless, since it is not on a shared variable.
But there is a deeper problem: each i iteration can write a result, which affects how far the search of the other iterations goes. That means that the outer iteration has to be sequential. (You really need to think hard about whether iterations are independent before you slap a parallel directive on them!) Maybe you can make the inner iteration parallel: it's a search, which would be a reduction on j. However, that loop would have to be pretty dang long before you'd see a performance improvement.
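For illustration only, that inner search written as a min-reduction might look like this (a sketch: the outer i loop stays sequential, bin_rem/weight/result/c follow the question's code, and reduction(min:...) requires OpenMP 3.1 or later):
// find the lowest-numbered open bin that can take item i;
// "result" (the number of open bins) doubles as the "not found" sentinel
int best = result;
#pragma omp parallel for reduction(min:best)
for (int j = 0; j < result; j++)
    if (bin_rem[j] >= weight[i] && j < best)
        best = j;

if (best < result)
    bin_rem[best] -= weight[i];            // the item fits in bin "best"
else
    bin_rem[result++] = c - weight[i];     // open a new bin
Whether this beats the sequential scan depends entirely on how long the bin list gets, as noted above.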
This looks to me like the sort of algorithm that you'd have to reformulate before you can make it parallel.

Is this Union Find really O(n) as they claim?

I am solving a problem on LeetCode:
Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence. You must write an algorithm that runs in O(n) time. So for nums = [100,4,200,1,3,2], the output is 4.
The Union Find solution is as below:
class Solution {
public:
    vector<int> parent, sz;

    int find(int i) {
        if (parent[i] == i) return i;
        return parent[i] = find(parent[i]);
    }

    void merge(int i, int j) {
        int p1 = find(i);
        int p2 = find(j);
        if (p1 == p2) return;
        if (sz[p1] > sz[p2]) {
            sz[p1] += sz[p2];
            parent[p2] = p1;
        } else {
            sz[p2] += sz[p1];
            parent[p1] = p2;
        }
    }

    int longestConsecutive(vector<int>& nums) {
        sz.resize(nums.size(), 1);
        parent.resize(nums.size(), 0);
        iota(begin(parent), end(parent), 0);
        unordered_map<int, int> m;
        for (int i = 0; i < nums.size(); i++) {
            int n = nums[i];
            if (m.count(n)) continue;
            if (m.count(n - 1)) merge(i, m[n - 1]);
            if (m.count(n + 1)) merge(i, m[n + 1]);
            m[n] = i;
        }
        int res = 0;
        for (int i = 0; i < parent.size(); i++) {
            if (parent[i] == i && sz[i] > res) {
                res = sz[i];
            }
        }
        return res;
    }
};
This gets accepted by the OJ (Runtime: 80 ms, faster than 76.03% of C++ online submissions for Longest Consecutive Sequence), but is this really O(n), as claimed by many answers, such as this one? My understanding is that Union Find is an O(N log N) algorithm.
Are they right? Or am I missing something?
They are right. A properly implemented Union Find with path compression and union by rank has linear run time complexity as a whole, while any individual operation has an amortized constant run time. The exact complexity of m operations of any type is O(m * alpha(n)), where alpha is the inverse Ackermann function. For any possible n in the physical world, the inverse Ackermann function doesn't exceed 4. Thus, we can state that individual operations are constant and the algorithm as a whole is linear.
The key part for path compression in your code is here:
return parent[i]=find(parent[i]);
vs. the following, which doesn't employ path compression:
return find(parent[i]);
What this part of the code does is flatten the hierarchy, linking each node directly to the final root. Only on the first run of find will you traverse the whole structure; the next time you'll get a direct hit, since you set the node's parent to its ultimate root. Notice that the second code snippet works perfectly fine too; it just does redundant work when you are not interested in the path itself, only in the final root.
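To see the flattening concretely, here is a tiny standalone sketch (the chain 3 -> 2 -> 1 -> 0 is a made-up example):
#include <cstdio>
#include <vector>

std::vector<int> parent{0, 0, 1, 2}; // a chain: 3 -> 2 -> 1 -> 0 (the root)

int find(int i) {
    if (parent[i] == i) return i;
    return parent[i] = find(parent[i]); // path compression on the way back up
}

int main() {
    find(3); // the first call traverses the whole chain 3 -> 2 -> 1 -> 0 ...
    for (int i = 0; i < 4; i++)
        std::printf("parent[%d] = %d\n", i, parent[i]);
    // ... and afterwards every node points directly at the root:
    // parent[0] = 0, parent[1] = 0, parent[2] = 0, parent[3] = 0
}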
Union by size (which serves the same purpose as union by rank) is evident here:
if(sz[p1]>sz[p2]) {...
It makes sure that the root of the tree with more nodes becomes the root of the tree with fewer nodes. Therefore, fewer nodes need to be reassigned a new parent, hence less work.
Note: The above was updated and corrected based on feedback from @Matt-Timmermans and @kcsquared.

A variant of Dijkstra's algorithm

I found this algorithm in the CP3 book for ICPC. It is a variant of Dijkstra's, but it gives TLE in some cases (hidden tests). Although it seems that the running time of this algorithm is the same as Dijkstra's, I think it is different. Can anyone help me with the time complexity of this algorithm?
vector<int> visited(N, 0), dis(N, 0);
vector<pair<int,int> > adj[N]; // value, node

void dijkstra()
{
    for (int i = 2; i <= N; i++)
        dis[i] = N;
    priority_queue<pair<int,int>, vector<pair<int,int> >, greater<pair<int,int> > > pq;
    pq.push(make_pair(0, 1));
    while (!pq.empty()) {
        pair<int,int> p = pq.top();
        int x = p.second;
        pq.pop();
        if (p.first > dis[x])
            continue;
        for (int i = 0; i < adj[x].size(); i++) {
            if (dis[adj[x][i].second] > dis[x] + adj[x][i].first) {
                dis[adj[x][i].second] = dis[x] + adj[x][i].first;
                pq.push(make_pair(dis[adj[x][i].second], adj[x][i].second));
            }
        }
    }
}
Arrays in C++ are zero-based: the first index is 0 and the last is size()-1.
vector<int> visited(N,0), dis(N,0);    <--- dis is initialized with N zeros
vector<pair<int,int> > adj[N]; // value, node
void dijkstra()
{
    for (int i = 2; i <= N; i++)
        dis[i] = N;                    <---- when i == N, writing dis[N] is undefined behaviour
You write beyond the end of the array, with possibly disastrous results.
Your real error might be that
dis[1] = 0
where it should have been N or MAX_INT.
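For illustration, an in-bounds version of that initialization might look like this (a sketch; INT_MAX from <climits> plays the role of MAX_INT):
for (int i = 2; i < N; i++)   // the last valid index is N - 1, so dis[N] is never touched
    dis[i] = INT_MAX;         // a value no real path can reach, instead of N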
This algorithm can run infinitely when a graph with a particular pattern exists: p.first > dis[x] may not always be true, and then it will not exit the loop.
I believe that is the only part which was changed from the original Dijkstra algorithm.

How to print the cycle containing a node in a directed graph in C++

I am writing a program that will detect a cycle in a directed graph and print the nodes that form the cycle. I am trying a recursive method in C++, but I don't understand how to print the nodes after a cycle is detected. Here is my code:
#include <bits/stdc++.h>
using namespace std;

void addedge(list<int>*, int, int);
void cycle_check(list<int>*, int);

// Make a pair between vertex x and vertex y
void addedge(list<int> *ls, int x, int y) {
    ls[x].push_back(y);
    return;
}

void check_cycle_util(list<int> *ls, bool *visit, int curr_node, int &temp) {
    visit[curr_node] = true;
    list<int>::iterator it;
    for (it = ls[curr_node].begin(); it != ls[curr_node].end(); it++) {
        if (!visit[*it]) {
            check_cycle_util(ls, visit, *it, temp);
        }
        else {
            if (temp == 0) {
                temp = 1;
                cout << "There is a cycle in the graph\n";
                break;
            }
        }
    }
}

// checking the cycles in a graph
void cycle_check(list<int> *ls, int num) {
    bool *visit = new bool[num];
    int temp = 0;
    for (int i = 0; i < num; i++)
        visit[i] = false;
    for (int i = 0; i < num; i++) {
        if (!visit[i] && temp == 0) {
            check_cycle_util(ls, visit, i, temp);
        }
    }
}

int main() {
    int num;
    cout << "Enter the no. of vertices :";
    cin >> num;
    list<int> *ls = new list<int>[num];
    addedge(ls, 0, 1);
    addedge(ls, 2, 3);
    addedge(ls, 3, 4);
    addedge(ls, 4, 5);
    addedge(ls, 1, 2);
    addedge(ls, 1, 4);
    addedge(ls, 3, 0);
    cycle_check(ls, 6);
    return 0;
}
I think you could learn Tarjan's algorithm for strongly connected components (sometimes called the "shrink point" technique, because each strongly connected component is condensed into a single point). It is used to find the strongly connected components of a directed graph.
The main idea is to mark all the points of a strongly connected component with the same value, so the points that share a value are in the same cycle.
The main steps are these:
First, we define two arrays over the points. One is the timestamp array: it holds each point's sequence number in the DFS. The other is the low-stamp array: it holds the minimum timestamp reachable from the point through the DFS; in other words, the value for a point is the minimum among the point's own timestamp and the low stamps of the points linked to it.
Use DFS to fill the low-stamp array; then all the points that have the same low stamp are in the same cycle.
P.S.: Because my English is not good, I can't explain the algorithm very clearly, so I recommend reading another article to learn about it.
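A minimal sketch of those steps (the names disc, low, comp, and tarjan are mine, not taken from the code above):
#include <bits/stdc++.h>
using namespace std;

vector<vector<int>> g;        // adjacency lists
vector<int> disc, low, comp;  // timestamp, low stamp, component id
stack<int> stk;
vector<bool> onStack;
int timer_ = 0, sccCount = 0;
// before use: disc.assign(n, -1); low.assign(n, 0); comp.assign(n, -1); onStack.assign(n, false);

void tarjan(int v) {
    disc[v] = low[v] = timer_++;
    stk.push(v);
    onStack[v] = true;
    for (int w : g[v]) {
        if (disc[w] == -1) {             // tree edge: recurse first
            tarjan(w);
            low[v] = min(low[v], low[w]);
        } else if (onStack[w]) {         // edge back into the current component
            low[v] = min(low[v], disc[w]);
        }
    }
    if (low[v] == disc[v]) {             // v is the root of a component:
        int w;                           // pop the whole component off the stack
        do {
            w = stk.top(); stk.pop();
            onStack[w] = false;
            comp[w] = sccCount;          // equal comp ids => same component
        } while (w != v);
        sccCount++;
    }
}
Every component with more than one node (or with a self-loop) contains a cycle, and its members are exactly the nodes to print.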
This is an example of using a stack to save the path (path is a vector<int> defined as a global variable):
visit[curr_node] = true;
path.push_back(curr_node);
/* other code */
        if (temp == 0) {
            temp = 1;
            cout << "There is a cycle in the graph\n";
            // print the cycle: walk back along the path until we reach
            // the revisited node *it that closes it
            for (int i = (int)path.size() - 1; i >= 0; --i) {
                cout << path[i] << ' ';
                if (path[i] == *it) break;
            }
            break; // the break must come after the printing, not before it
        }
    }
}
path.pop_back();

Complexity of Dijkstra's algorithm

I have read from many sources that Dijkstra's shortest path algorithm runs in O(V^2) complexity when using a naive way to get the min element (linear search). However, it can be optimised to O(V log V) if a priority queue is used, as this data structure returns the min element in O(1) time but takes O(log V) time to restore the heap property after deleting the min element.
I have implemented Dijkstra's algo in the following code for the UVA problem at this link: https://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=16&page=show_problem&problem=1927:
#include <iostream>
#include <vector>
#include <climits>
#include <cmath>
#include <set>
using namespace std;

#define rep(a,b,c) for(int c=a;c<b;c++)
typedef std::vector<int> VI;
typedef std::vector<VI> VVI;

struct cmp {
    bool operator()(const pair<int,int> &a, const pair<int,int> &b) const {
        return a.second < b.second;
    }
};

void sp(VVI &graph, set<pair<int,int>,cmp> &minv, VI &ans, int S, int T) {
    int e = -1;
    minv.insert(pair<int,int>(S, 0));
    rep(0, graph.size() && !minv.empty() && minv.begin()->first != T, s) {
        e = minv.begin()->first;
        minv.erase(minv.begin());
        int nb = 0;
        rep(0, graph[e].size(), d) {
            nb = d;
            if (graph[e][d] != INT_MAX && ans[e] + graph[e][d] < ans[d]) {
                set<pair<int,int>,cmp>::iterator si = minv.find(pair<int,int>(d, ans[d]));
                if (si != minv.end())
                    minv.erase(*si);
                ans[d] = ans[e] + graph[e][d];
                minv.insert(pair<int,int>(d, ans[d]));
            }
        }
    }
}

int main(void) {
    int cc = 0, N = 0, M = 0, S = -1, T = -1, A = -1, B = -1, W = -1;
    VVI graph;
    VI ans;
    set<pair<int,int>,cmp> minv;
    cin >> cc;
    rep(0, cc, i) {
        cin >> N >> M >> S >> T;
        graph.clear();
        ans.clear();
        graph.assign(N, VI());
        ans.assign(graph.size(), INT_MAX);
        minv.clear();
        rep(0, N, j) {
            graph[j].assign(N, INT_MAX);
        }
        ans[S] = 0;
        graph[S][S] = 0;
        rep(0, M, j) {
            cin >> A >> B >> W;
            graph[A][B] = min(W, graph[A][B]);
            graph[B][A] = min(W, graph[B][A]);
        }
        sp(graph, minv, ans, S, T);
        cout << "Case #" << i + 1 << ": ";
        if (ans[T] != INT_MAX)
            cout << ans[T] << endl;
        else
            cout << "unreachable" << endl;
    }
}
Based on my analysis, my algorithm has O(V log V) complexity. The STL std::set is implemented as a binary search tree, and it is kept sorted.
Hence getting the minimum element from it is O(1), while insertion and deletion are O(log V) each. However, I am still getting a TLE on this problem, which should be solvable in O(V log V) based on the given time limit.
This led me to think deeper. What if all nodes were interconnected, so that each vertex has V-1 neighbours? Wouldn't Dijkstra's algorithm then run in O(V^2), since each vertex has to look at V-1, V-2, V-3, ... nodes every round?
On second thought, I might have misinterpreted the worst-case complexity. Could someone please advise me on the following issues:
How is Dijkstra's algorithm O(V log V), especially given the above counterexample?
How could I optimise my code so that it achieves O(V log V) complexity (or better)?
Edit:
I realised that my program does not run in O(E log V) after all. The bottleneck was my input processing, which runs in O(V^2). The Dijkstra part indeed runs in O(E log V).
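For reference, reading the edges straight into an adjacency list avoids that O(V^2) setup; a sketch, reusing the N, M, A, B, W input variables and the rep macro from my code above:
// adj[u] holds (neighbour, weight) pairs; building the list is O(V + E)
// instead of assigning an N x N matrix for every test case
vector<vector<pair<int,int> > > adj(N);
rep(0, M, j) {
    cin >> A >> B >> W;
    adj[A].push_back(make_pair(B, W));
    adj[B].push_back(make_pair(A, W));
}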
In order to understand the time complexity of Dijkstra's algorithm, we need to study the operations that are performed on the data structure that is used to implement the Frontier set (i.e. the data structure used for minv in your algorithm):
Insert
Update
Find/Delete minimum
There are O(|V|) inserts, O(|E|) updates, O(|V|) Find/Delete Minimums in total that occur on the data structure for the entire duration of the algorithm.
Originally Dijkstra implemented the Frontier set using an unsorted array. Thus it was O(1) for Insert and Update, but O(|V|) for Find/Delete minimum, resulting in O(|E| + |V|^2); but since |E| < |V|^2, you have O(|V|^2).
If a binary min-heap is used to implement the Frontier set, you have O(log|V|) for all operations, resulting in O(|E|log|V| + |V|log|V|); since it is reasonable to assume |E| > |V|, you have O(|E|log|V|).
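As a concrete illustration (a sketch, not code from the answer): std::priority_queue has no Update operation, so a common trick is to Insert a fresh entry instead and skip stale ones when they surface; this keeps the O(|E|log|V|) bound, because the heap never holds more than O(|E|) entries:
#include <climits>
#include <queue>
#include <vector>
using namespace std;

// adj[u] holds (neighbour, weight) pairs
vector<int> dijkstra(const vector<vector<pair<int,int> > >& adj, int src) {
    vector<int> dist(adj.size(), INT_MAX);
    // min-heap ordered by tentative distance: O(log|V|) per push/pop
    priority_queue<pair<int,int>, vector<pair<int,int> >,
                   greater<pair<int,int> > > pq;
    dist[src] = 0;
    pq.push(make_pair(0, src));                 // (distance, vertex)
    while (!pq.empty()) {
        int d = pq.top().first, u = pq.top().second;
        pq.pop();
        if (d > dist[u]) continue;              // stale entry: skip it ("lazy deletion")
        for (size_t k = 0; k < adj[u].size(); k++) {
            int v = adj[u][k].first, w = adj[u][k].second;
            if (dist[u] + w < dist[v]) {        // each edge is relaxed at most once
                dist[v] = dist[u] + w;
                pq.push(make_pair(dist[v], v)); // "Update" = insert a new entry
            }
        }
    }
    return dist;
}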
Then came the Fibonacci heap, where you have O(1) amortized time for Insert/Update/Find minimum, but O(log|V|) amortized time for Delete minimum, giving you the currently best known time bound of O(|E| + |V|log|V|) for Dijkstra's algorithm.
Finally, an algorithm for solving the Single Source Shortest Paths problem in O(|V|log|V|) worst-case time complexity is not possible if |V|log|V| < |E|, since the problem has the trivial lower time bound of O(|E| + |V|), i.e. you need to inspect each vertex and edge at least once to solve the problem.
Improving Dijkstra by using a BST or heap leads to time complexities like O(|E|log|V|) or O(|E| + |V|log|V|); see Dijkstra's running time. Each edge has to be checked at some point.