Is the Union-Find (or Disjoint Set) data structure in STL? - c++

I would have expected such a useful data structure to be included in the C++ Standard Library but I can't seem to find it.

It is not, but there is one in boost: http://www.boost.org/doc/libs/1_64_0/libs/disjoint_sets/disjoint_sets.html, so if you want an off-the-shelf implementation I'd recommend this.

No. I have written a simple implementation that is easy to extend.
#include <utility> // std::swap
#include <vector>
using namespace std;
struct DisjointSet {
vector<int> parent;
vector<int> size;
DisjointSet(int maxSize) {
parent.resize(maxSize);
size.resize(maxSize);
for (int i = 0; i < maxSize; i++) {
parent[i] = i;
size[i] = 1;
}
}
int find_set(int v) {
if (v == parent[v])
return v;
return parent[v] = find_set(parent[v]);
}
void union_set(int a, int b) {
a = find_set(a);
b = find_set(b);
if (a != b) {
if (size[a] < size[b])
swap(a, b);
parent[b] = a;
size[a] += size[b];
}
}
};
Usage is as follows. Element indices are 0-based, so n must be at least 8 for the calls below to stay in range.
void solve() {
int n;
cin >> n;
DisjointSet S(n); // Initializing with maximum Size
S.union_set(1, 2);
S.union_set(3, 7);
int parent = S.find_set(1); // root of 1
}

This is an implementation of a disjoint set using a tree (forest) representation. There are two operations:
find_set(x): get the representative of the set that contains member x; here the representative is the root node
union_set(x,y): merge the two sets that contain members x and y
The tree representation is more efficient than the linked-list representation when combined with two heuristics:
-- "union by rank" and "path compression" --
union by rank: assign a rank to each node. The rank is an upper bound on the height of the node (the number of edges in the longest simple path between the node and a descendant leaf); the root with the smaller rank is attached under the root with the larger rank
path compression: during the "find_set" operation, make each node on the find path point directly to the root
(Ref: Introduction to Algorithms, 3rd Edition by CLRS)
An implementation using STL containers is given below:
#include <iostream>
#include <vector>
using namespace std;
struct disjointSet{
vector<int> parent, rank;
disjointSet(int n){
rank.assign(n, 0);
for (int i = 0; i < n; i++)
parent.push_back(i);
}
int find_set(int v){
if(parent[v]!=v)
parent[v] = find_set(parent[v]);
return parent[v];
}
void union_set(int x,int y){
x = find_set(x);
y = find_set(y);
if (x == y)
return; // already in the same set; do not touch the ranks
if (rank[x] > rank[y])
parent[y] = x;
else{
parent[x] = y;
if(rank[x]==rank[y])
rank[y]++;
}
}
};
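As an illustration of how such a structure is typically used, here is a minimal sketch that counts the connected components of an undirected graph. The names IterativeDisjointSet and count_components are hypothetical and not part of the answers above. It swaps the recursive find for an iterative one with path halving (a variant of path compression) and uses union by size, on the assumption that deep recursion is undesirable for large inputs.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper type: a disjoint set with an iterative find,
// so long parent chains cannot overflow the call stack.
struct IterativeDisjointSet {
    std::vector<int> parent;
    std::vector<int> size;
    int components;  // number of disjoint sets currently alive

    explicit IterativeDisjointSet(int n) : parent(n), size(n, 1), components(n) {
        std::iota(parent.begin(), parent.end(), 0);  // every element is its own root
    }

    int find_set(int v) {
        assert(v >= 0 && v < static_cast<int>(parent.size()));
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];  // path halving: point to the grandparent
            v = parent[v];
        }
        return v;
    }

    // Returns true if a and b were in different sets, i.e. a merge happened.
    bool union_set(int a, int b) {
        a = find_set(a);
        b = find_set(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);  // attach the smaller tree under the larger
        parent[b] = a;
        size[a] += size[b];
        --components;
        return true;
    }
};

// Counts connected components of an undirected graph given as an edge list.
int count_components(int n, const std::vector<std::pair<int, int>>& edges) {
    IterativeDisjointSet ds(n);
    for (const auto& e : edges) ds.union_set(e.first, e.second);
    return ds.components;
}
```

The boolean returned by union_set is also what a Kruskal-style loop would check to decide whether an edge closes a cycle.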

Related

select desired paths through binary tree

I have already asked a related question here and received an answer.
This is a different kind of path problem: at each level I have two options, up (+1) or down (-1), and there are n levels, so there are 2^n paths.
In this question, I want to select only some desired paths.
My desired paths must satisfy two conditions:
1. The path ends at a given value, say 1; i.e., the sum of the steps over all n levels is 1.
2. The paths from #1 stay bounded between, say, 9 and -9; i.e., the running sum must never reach 9 or -9 (for example, 9 ups or 9 downs in a row are not allowed).
Here is my attempt:
int popcount(unsigned x){
int c = 0;
for (; x != 0; x >>= 1)
if (x & 1)
c++;
return c;
}
void selected_path(vector<int> &d, int n){
d.clear();
int size = 1<<n;
d.resize(size);
for (int i = 0; i < size; ++i) {
d[i] = n - 2 * popcount(i); // final sum of path i: each 0 bit is an up (+1), each 1 bit is a down (-1)
}
}
In the above code, d[] gives me the sums of all 2^n possible paths. But I want to select only those paths that satisfy the two conditions above.
Edit: an answer that works but is not efficient:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
struct node{
int data;
node *left, *right;
};
struct node *create(int x, int n, int limit){
n++;
if(n==(limit+2)) return 0; // depth limit reached: check before allocating to avoid a leak
struct node *newnode = new node();
newnode->data = x;
newnode->left = create(x + 1,n,limit);
newnode->right = create(x - 1,n,limit);
return newnode;
}
void inorder(std::vector<int> &d, node *root, vector<int> &stack, int uplimit, int downlimit){ //uplimit,downlimit
if(root == NULL) return;
stack.push_back(root->data);
inorder(d,root->left,stack,uplimit,downlimit);
if(root->left == 0 and root->right ==0){
if(stack[stack.size() -1] == 1){
int max=*max_element(stack.begin(), stack.end());
int min=*min_element(stack.begin(), stack.end());
if(max < uplimit and min > downlimit){
for(int i = 1; i < stack.size(); i++){
d.push_back(stack[i]);
}
}
}
}
inorder(d,root->right,stack,uplimit,downlimit);
stack.pop_back();
}
int main(){
int limit = 7;
struct node *root;
root = create(0,0,limit);
std::vector<int> stack;
std::vector<int> d;
int uplimit = 9;
int downlimit = -9;
inorder(d,root, stack,uplimit,downlimit);
stack.clear();
int n_path = int(d.size()/(limit));
for(int ip =0; ip < n_path; ip++){
for(int i = 1; i <=limit; i++){
cout << d[ip*(limit)+(i-1)] << "\t";
}
cout << endl;
}
}
A possible answer is the code above. Here d[] is a flat vector used as a two-dimensional array: the first index is the path number and the second is the node value at each level of that path.
The problem is that it is not memory-efficient (for limit > 20), because the tree stores every possible node, which is unnecessary.
I would highly appreciate any ideas for making it more efficient.
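One way to avoid building the whole tree is to walk it implicitly with backtracking and keep only the current path in memory. The following is a minimal sketch under that assumption. The names walk and select_paths are hypothetical, the bound is treated as strict (as in the max < uplimit and min > downlimit test above), and branches are pruned as soon as the remaining steps can no longer reach the target.

```cpp
#include <cstdlib>
#include <vector>

// Recursive helper (hypothetical name). `path` holds the running sums so far.
static void walk(int level, int sum, int n, int target, int bound,
                 std::vector<int>& path, std::vector<std::vector<int>>& out) {
    if (level == n) {
        if (sum == target) out.push_back(path);  // condition 1: ends at target
        return;
    }
    // Prune: the remaining n - level steps cannot bring the sum to target.
    if (std::abs(target - sum) > n - level) return;
    for (int step : {+1, -1}) {
        int next = sum + step;
        // Condition 2: the running sum stays strictly inside (-bound, bound).
        if (next >= bound || next <= -bound) continue;
        path.push_back(next);
        walk(level + 1, next, n, target, bound, path, out);
        path.pop_back();  // backtrack
    }
}

// Returns every path of n steps (as running sums) that ends at `target`
// and never reaches +bound or -bound.
std::vector<std::vector<int>> select_paths(int n, int target, int bound) {
    std::vector<std::vector<int>> out;
    std::vector<int> path;
    path.reserve(n);
    walk(0, 0, n, target, bound, path, out);
    return out;
}
```

Memory use is proportional to the recursion depth plus the paths actually kept, rather than to all 2^n nodes. If the paths only need to be printed or counted, the out vector could be replaced by a callback or a counter.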

Error while Compiling "|75|error: cannot convert 'std::vector<int>' to 'int'" while implementing bfs

I am implementing BFS (breadth-first search) for a graph, but I get a compile error when I assign the first element of the vector to an integer. That integer is the source vertex that I pass to the dfs function (which, despite its name, performs the BFS).
The error is on the line where start is assigned v[i].
Here is the complete code:
#include <iostream>
#include <vector>
#include <queue>
#include <stdio.h>
using namespace std;
vector<int> v[10];
bool visited[10];
int level[10];
int a = 0;
int arr[10];
void dfs(int s) //function should run only one time
{
queue<int> q;
q.push(s);
visited[s] = true;
level[s] = 0;
while (!q.empty())
{
int p = q.front();
arr[a] = p;
a++;
q.pop();
for (int i = 0; i < v[p].size(); i++)
{
if (visited[v[p][i]] == false) {
level[v[p][i]] = level[p] + 1;
q.push(v[p][i]);
visited[v[p][i]] = true;
}
}
}
}
int main()
{
char c;
int start; // starting element of the vector
int i = 0; // for keeping track of the parent
int countt = 0; // keep track of the no of parents
bool check;
printf("Child or Parent ?");
scanf("%c", &c);
while (countt <= 10) {
if (c == 'c') {
check = true;
int j = 0;
while (check) {
// to keep the track of the child;
scanf("%d", &v[i][j]);
j++;
}
}
if (c == 'p')
{
scanf("%d", &v[i]);
if (i == 0)
{
start = v[i];
}
i++;
countt++;
}
}
printf(" Vector input completed");
dfs(start);
printf("DFS completed, printing the dfs now ");
for (int g = 0; g <= 10; g++)
{
printf("%d", &arr[g]);
}
}
In your current code, v is an array of 10 vectors. However, start is an int, so it is no surprise that assigning one to the other fails.
I believe you wanted v to be either an array of ints or a vector of ints. In that case you just have to declare v properly: int v[10] or vector<int> v(10).
The general syntax is: to declare a vector with a known size, put the size in (), not in []. Note that you can also fill the vector with an initial value (say zero) by writing vector<int> v(10, 0).
In case I got you wrong and you wanted to store the graph as a vector of vectors, you can write vector<vector<int>> v(10).
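For the vector-of-vectors case, a breadth-first search might look like the sketch below. The function name bfs_levels and the convention of using -1 for unreachable vertices are assumptions made for illustration, not something taken from the question.

```cpp
#include <queue>
#include <vector>

// Returns the BFS level of every vertex reachable from `start`;
// -1 marks vertices that cannot be reached.
std::vector<int> bfs_levels(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<int> level(adj.size(), -1);
    std::queue<int> q;
    level[start] = 0;
    q.push(start);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int w : adj[u]) {
            if (level[w] == -1) {        // not visited yet
                level[w] = level[u] + 1;
                q.push(w);
            }
        }
    }
    return level;
}
```

Here the start vertex is passed as a plain int index, and adj[u] is the vector<int> of neighbours of u, so no conversion between a vector and an int is ever needed.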

Trouble with Dijkstra , finding all minimum paths

We have a problem here: we're trying to find all the shortest paths in a graph from one node to another. We have already implemented Dijkstra, but we really don't know how to find them all.
Do we have to use BFS?
#include <algorithm> // std::reverse
#include <vector>
#include <iostream>
#include <queue>
using namespace std;
typedef pair <int, int> dist_node;
typedef pair <int, int> edge;
const int MAXN = 10000;
const int INF = 1 << 30;
vector <edge> g[MAXN];
int d[MAXN];
int p[MAXN];
int dijkstra(int s, int n,int t){
for (int i = 0; i <= n; ++i){
d[i] = INF; p[i] = -1;
}
priority_queue < dist_node, vector <dist_node>,greater<dist_node> > q;
d[s] = 0;
q.push(dist_node(0, s));
while (!q.empty()){
int dist = q.top().first;
int cur = q.top().second;
q.pop();
if (dist > d[cur]) continue;
for (int i = 0; i < g[cur].size(); ++i){
int next = g[cur][i].first;
int w_extra = g[cur][i].second;
if (d[cur] + w_extra < d[next]){
d[next] = d[cur] + w_extra;
p[next] = cur;
q.push(dist_node(d[next], next));
}
}
}
return d[t];
}
vector <int> findpath (int t){
vector <int> path;
int cur=t;
while(cur != -1){
path.push_back(cur);
cur = p[cur];
}
reverse(path.begin(), path.end());
return path;
}
This is our code, we believe we have to modify it but we really don't know where.
Currently, you are only saving/retrieving one of the shortest paths that you happen to find. Consider this example:
4 nodes
0 -> 1
0 -> 2
1 -> 3
2 -> 3
It becomes clear that you cannot have a single p[] value for each position, as in fact the 4th node (3) has 2 previous valid nodes: 1 and 2.
You could thus replace it with a vector<int> p[MAXN]; and work as follows:
if (d[cur] + w_extra < d[next]){
d[next] = d[cur] + w_extra;
p[next].clear();
p[next].push_back(cur);
q.push(dist_node(d[next], next));
}
else if(d[cur] + w_extra == d[next]){
p[next].push_back(cur); // a new shortest way of hitting this same node
}
You will also need to update your findpath() function, as it will need to deal with "branches" that result in multiple paths, possibly exponentially many depending on the graph. If you just need to print the paths, you could do something like this:
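#include <cstdio>
int answer[MAXN];
int last_node; // assumed to be set to the target before calling findpath(target, 0)
void findpath (int t, int depth){
if(t == -1){ // we reached the initial node of one shortest path
// answer[depth-1] is the -1 sentinel itself, so start printing at depth-2
for(int i = depth-2; i >= 0; --i){
printf("%d ", answer[i]);
}
printf("%d\n", last_node); // the target end node of the search
return;
}
for(int i = (int)p[t].size()-1; i >= 0; --i){
answer[depth] = p[t][i];
findpath(p[t][i], depth+1);
}
}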
Note that you'll need to do p[s].push_back(-1) at the beginning of your dijkstra, and clear every p[i] (instead of setting p[i] = -1) before each run so that nothing is left over between cases.
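Putting the pieces together, a self-contained version of the idea could look like the following sketch. It is written under a few assumptions: edge weights are strictly positive (so the predecessor lists cannot form a cycle), the graph is passed as a vector of adjacency lists instead of the global arrays, and the name all_shortest_paths is hypothetical. Instead of a -1 sentinel it stops the backward walk when it reaches s.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;  // (neighbour, weight), as in g[] above

// Returns every shortest path from s to t, each as a list of vertices.
// Assumes strictly positive edge weights.
std::vector<std::vector<int>> all_shortest_paths(
        const std::vector<std::vector<Edge>>& g, int s, int t) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> d(g.size(), INF);
    std::vector<std::vector<int>> pred(g.size());  // all optimal predecessors
    using Item = std::pair<long long, int>;        // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    d[s] = 0;
    pq.push({0, s});
    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int cur = top.second;
        if (top.first > d[cur]) continue;  // outdated queue entry
        for (const Edge& e : g[cur]) {
            long long cand = d[cur] + e.second;
            if (cand < d[e.first]) {
                d[e.first] = cand;
                pred[e.first].assign(1, cur);   // strictly better: restart the list
                pq.push({cand, e.first});
            } else if (cand == d[e.first]) {
                pred[e.first].push_back(cur);   // another equally short way in
            }
        }
    }
    std::vector<std::vector<int>> paths;
    if (d[t] == INF) return paths;  // t is unreachable
    std::vector<int> stack;         // current path, stored from t back to s
    std::function<void(int)> back = [&](int v) {
        stack.push_back(v);
        if (v == s) {
            paths.emplace_back(stack.rbegin(), stack.rend());
        } else {
            for (int p : pred[v]) back(p);
        }
        stack.pop_back();
    };
    back(t);
    return paths;
}
```

On the 4-node example above with unit weights, this yields the two paths 0 1 3 and 0 2 3. The number of returned paths can still be exponential in the size of the graph.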

What is an efficient way to sort a graph?

For example, suppose there are 3 nodes A, B, C, where A links to B and C, B links to A and C, and C links to B and A. In visual form it's like this:
C <- A -> B //A links to B & C
A <- B -> C //B links to A & C
B <- C -> A //C links to B & A
Assume A, B, C are held in an array like so: [A,B,C], with indices starting at 0. How can I efficiently sort the array [A,B,C] according to the value held by each node?
For example, if A holds 4, B holds -2 and C holds -1, then sortGraph([A,B,C]) should return [B,C,A]. I hope it's clear. Would it be possible to somehow utilize std::sort?
EDIT: This is not a basic sorting question. Let me clarify a bit more. Assume I have a list of nodes [n0,n1...nm]. Each ni has a left and a right neighbor index. For example, n1's left neighbor is n0 and its right neighbor is n2. I use indices to represent the neighbors: if n1 is at index 1, then its left neighbor is at index 0 and its right neighbor is at index 2. If I sort the array, I need to update the neighbor indices as well. I don't really want to implement my own sorting algorithm; any advice on how to proceed?
If I understand the edited question correctly, your graph is a circular linked list: each node points to the previous and next nodes, and the "last" node points to the "first" node as its next node.
There's nothing particularly special you need to do to perform the sort that you want. Here are the basic steps I'd use.
1. Put all the nodes into an array.
2. Sort the array using any sorting algorithm (e.g. qsort, or std::sort in C++).
3. Loop through the result and reset the prev/next pointers for each node, taking into account the special cases for the first and last node.
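The steps above can be sketched with std::sort as follows. The Node layout and the name sort_ring are assumptions made for illustration; the question does not show its actual types. The wrap-around for the first and last node is handled with modular arithmetic.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical node layout: a value plus the indices of its two neighbours.
struct Node {
    int value;
    std::size_t left;
    std::size_t right;
};

// Sorts the nodes by value, then rebuilds the neighbour indices so that the
// array again forms a ring in its new order.
void sort_ring(std::vector<Node>& nodes) {
    std::sort(nodes.begin(), nodes.end(),
              [](const Node& a, const Node& b) { return a.value < b.value; });
    const std::size_t n = nodes.size();
    for (std::size_t i = 0; i < n; ++i) {
        nodes[i].left = (i + n - 1) % n;  // wraps to the last node for i == 0
        nodes[i].right = (i + 1) % n;     // wraps to the first node for i == n - 1
    }
}
```

For the example values (A = 4, B = -2, C = -1) the array ends up in the order B, C, A, with B's left neighbour being A at index 2.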
Here is a C++ implementation; I hope it is useful. It includes several algorithms such as Dijkstra and Kruskal; for sorting it uses a topological sort based on depth-first search, etc.
Graph.h
#ifndef GRAPH_H // names containing a double underscore are reserved for the implementation
#define GRAPH_H
#include <vector>
#include <stack>
#include <set>
struct edge_t // named directly; identifiers with a double underscore are reserved
{
int v0, v1, w;
edge_t():v0(-1),v1(-1),w(-1){}
edge_t(int from, int to, int weight):v0(from),v1(to),w(weight){}
};
class Graph
{
public:
Graph(void); // construct a graph with no vertex (and thus no edge)
Graph(int n); // construct a graph with n-vertex, but no edge
Graph(const Graph &graph); // deep copy of a graph, avoid if not necessary
public:
// #destructor
virtual ~Graph(void);
public:
inline int getVertexCount(void) const { return this->numV; }
inline int getEdgeCount(void) const { return this->numE; }
public:
// add an edge
// #param: from [in] - starting point of the edge
// #param: to [in] - finishing point of the edge
// #param: weight[in] - edge weight, only allow positive values
void addEdge(int from, int to, int weight=1);
// get all edges
// #param: edgeList[out] - an array with sufficient size to store the edges
void getAllEdges(edge_t edgeList[]);
public:
// topological sort
// #param: vertexList[out] - vertex order
void sort(int vertexList[]);
// dijkstra's shortest path algorithm
// #param: v[in] - starting vertex
// #param: path[out] - an array of <distance, prev> pair for each vertex
void dijkstra(int v, std::pair<int, int> path[]);
// kruskal's minimum spanning tree algorithm
// #param: graph[out] - the minimum spanning tree result
void kruskal(Graph &graph);
// floyd-warshall shortest distance algorithm
// #param: path[out] - a matrix of <distance, next> pair in C-style
void floydWarshall(std::pair<int, int> path[]);
private:
// recursive depth first search
void sort(int v, std::pair<int, int> timestamp[], std::stack<int> &order);
// find which set the vertex is in, used in kruskal
std::set<int>* findSet(int v, std::set<int> vertexSet[], int n);
// union two sets, used in kruskal
void setUnion(std::set<int>* s0, std::set<int>* s1);
// initialize this graph
void init(int n);
// initialize this graph by copying another
void init(const Graph &graph);
private:
int numV, numE; // number of vertices and edges
std::vector< std::pair<int, int> >* adjList; // adjacency list
};
#endif
Graph.cpp
#include "Graph.h"
#include <algorithm>
#include <map>
Graph::Graph()
:numV(0), numE(0), adjList(0)
{
}
Graph::Graph(int n)
:numV(0), numE(0), adjList(0)
{
this->init(n);
}
Graph::Graph(const Graph &graph)
:numV(0), numE(0), adjList(0)
{
this->init(graph);
}
Graph::~Graph()
{
delete[] this->adjList;
}
void Graph::init(int n)
{
if(this->adjList){
delete[] this->adjList;
}
this->numV = n;
this->numE = 0;
this->adjList = new std::vector< std::pair<int, int> >[n];
}
void Graph::init(const Graph &graph)
{
this->init(graph.numV);
for(int i = 0; i < numV; i++){
this->adjList[i] = graph.adjList[i];
}
}
void Graph::addEdge(int from, int to, int weight)
{
if(weight > 0){
this->adjList[from].push_back( std::make_pair(to, weight) );
this->numE++;
}
}
void Graph::getAllEdges(edge_t edgeList[])
{
int k = 0;
for(int i = 0; i < numV; i++){
for(int j = 0; j < this->adjList[i].size(); j++){
// add this edge to edgeList
edgeList[k++] = edge_t(i, this->adjList[i][j].first, this->adjList[i][j].second);
}
}
}
void Graph::sort(int vertexList[])
{
std::pair<int, int>* timestamp = new std::pair<int, int>[this->numV];
std::stack<int> order;
for(int i = 0; i < this->numV; i++){
timestamp[i].first = -1;
timestamp[i].second = -1;
}
for(int v = 0; v < this->numV; v++){
if(timestamp[v].first < 0){
this->sort(v, timestamp, order);
}
}
int i = 0;
while(!order.empty()){
vertexList[i++] = order.top();
order.pop();
}
delete[] timestamp;
return;
}
void Graph::sort(int v, std::pair<int, int> timestamp[], std::stack<int> &order)
{
// discover vertex v
timestamp[v].first = 1;
for(int i = 0; i < this->adjList[v].size(); i++){
int next = this->adjList[v][i].first;
if(timestamp[next].first < 0){
this->sort(next, timestamp, order);
}
}
// finish vertex v
timestamp[v].second = 1;
order.push(v);
return;
}
void Graph::dijkstra(int v, std::pair<int, int> path[])
{
int* q = new int[numV];
int numQ = numV;
for(int i = 0; i < this->numV; i++){
path[i].first = -1; // infinity distance
path[i].second = -1; // no path exists
q[i] = i;
}
// instant reachable to itself
path[v].first = 0;
path[v].second = -1;
while(numQ > 0){
int u = -1; // no candidate node found yet
for(int i = 0; i < numV; i++){
if(q[i] >= 0
&& path[i].first >= 0
&& (u < 0 || path[i].first < path[u].first)){ //
u = i;
}
}
if(u == -1){
// all remaining nodes are unreachable
break;
}
// remove u from Q
q[u] = -1;
numQ--;
for(int i = 0; i < this->adjList[u].size(); i++){
std::pair<int, int>& edge = this->adjList[u][i];
int alt = path[u].first + edge.second;
if(path[edge.first].first < 0 || alt < path[ edge.first ].first){
path[ edge.first ].first = alt;
path[ edge.first ].second = u;
}
}
}
delete[] q;
return;
}
// compare two edges by their weight
bool edgeCmp(edge_t e0, edge_t e1)
{
return e0.w < e1.w;
}
std::set<int>* Graph::findSet(int v, std::set<int> vertexSet[], int n)
{
for(int i = 0; i < n; i++){
if(vertexSet[i].find(v) != vertexSet[i].end()){
return vertexSet+i;
}
}
return 0;
}
void Graph::setUnion(std::set<int>* s0, std::set<int>* s1)
{
if(s1->size() > s0->size()){
std::set<int>* temp = s0;
s0 = s1;
s1 = temp;
}
for(std::set<int>::iterator i = s1->begin(); i != s1->end(); i++){
s0->insert(*i);
}
s1->clear();
return;
}
void Graph::kruskal(Graph &graph)
{
std::vector<edge_t> edgeList;
edgeList.reserve(numE);
for(int i = 0; i < numV; i++){
for(int j = 0; j < this->adjList[i].size(); j++){
// add this edge to edgeList
edgeList.push_back( edge_t(i, this->adjList[i][j].first, this->adjList[i][j].second) );
}
}
// sort the list in ascending order
std::sort(edgeList.begin(), edgeList.end(), edgeCmp);
graph.init(numV);
// create disjoint set of the spanning tree constructed so far
std::set<int>* disjoint = new std::set<int>[this->numV];
for(int i = 0; i < numV; i++){
disjoint[i].insert(i);
}
for(int e = 0; e < edgeList.size(); e++){
// consider edgeList[e]
std::set<int>* s0 = this->findSet(edgeList[e].v0, disjoint, numV);
std::set<int>* s1 = this->findSet(edgeList[e].v1, disjoint, numV);
if(s0 == s1){
// adding this edge will make a cycle
continue;
}
// add this edge to MST
graph.addEdge(edgeList[e].v0, edgeList[e].v1, edgeList[e].w);
// union s0 & s1
this->setUnion(s0, s1);
}
delete[] disjoint;
return;
}
#define IDX(i,j) ((i)*numV+(j))
void Graph::floydWarshall(std::pair<int, int> path[])
{
// initialize
for(int i = 0; i < numV; i++){
for(int j = 0; j < numV; j++){
path[IDX(i,j)].first = -1;
path[IDX(i,j)].second = -1;
}
}
for(int i = 0; i < numV; i++){
for(int j = 0; j < this->adjList[i].size(); j++){
path[IDX(i,this->adjList[i][j].first)].first
= this->adjList[i][j].second;
path[IDX(i,this->adjList[i][j].first)].second
= this->adjList[i][j].first;
}
}
// dynamic programming
for(int k = 0; k < numV; k++){
for(int i = 0; i < numV; i++){
for(int j = 0; j < numV; j++){
if(path[IDX(i,k)].first == -1
|| path[IDX(k,j)].first == -1){
// no path exist from i-to-k or from k-to-j
continue;
}
if(path[IDX(i,j)].first == -1
|| path[IDX(i,j)].first > path[IDX(i,k)].first + path[IDX(k,j)].first){
// going through k is shorter than the best known i-to-j path
path[IDX(i,j)].first = path[IDX(i,k)].first + path[IDX(k,j)].first;
path[IDX(i,j)].second = k; // k is an intermediate vertex on the path, not necessarily the immediate successor of i
}
}
}
}
return;
}
If you are looking for sorting algorithms you should just ask google:
http://en.wikipedia.org/wiki/Sorting_algorithm
My personal favourite is the BogoSort coupled with parallel universe theory. The theory is that if you hook a machine up to the program that can destroy the universe, then if the list isn't sorted after one iteration it will destroy the universe. That way all the parallel universes except the one with the list sorted will be destroyed and you have a sorting algorithm with complexity O(1).
The best ....
Create a struct like this:
#include <cstddef>
#include <functional>
template<typename Container, typename Comparison = std::less<typename Container::value_type>>
struct SortHelper
{
Container const* container;
size_t org_index;
SortHelper( Container const* c, size_t index ):container(c), org_index(index) {}
bool operator<( SortHelper other ) const
{
// compare the elements the two helpers refer to (the member is named container, not c)
return Comparison()( (*container)[org_index], (*other.container)[other.org_index] );
}
};
This lets you re-sort things however you want.
Now make a std::vector<SortHelper<blah>>, sort it, and you have a vector of instructions telling you where everything ends up after sorting.
Apply these instructions (there are a few ways). An easy way would be to reuse container pointer as a bool. Walk the sorted vector of helpers. Move the first entry to where it should go, moving what you found where it should go to where it should go, and repeat until you loop or the entire array is sorted. As you go, clear the container pointers in your helper struct, and check them to make sure you don't move an entry that has already been moved (this lets you detect loops, for example).
Once a loop has occurred, proceed down the vector looking for the next as-yet-not-in-right-place entry (with a non-null container pointer).
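The same idea can be sketched with a plain vector of indices instead of helper structs. This is a simplified variant, not the exact scheme described above: sorted_order and apply_order are hypothetical names, and entries that are already placed are marked by setting order[i] = i instead of clearing a pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Returns `order` such that order[k] is the original index of the element
// that belongs at position k after sorting.
template <typename T>
std::vector<std::size_t> sorted_order(const std::vector<T>& data) {
    std::vector<std::size_t> order(data.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return data[a] < data[b]; });
    return order;
}

// Rearranges data in place so that data[k] becomes the old data[order[k]].
// Works cycle by cycle; `order` is taken by value because it is overwritten.
template <typename T>
void apply_order(std::vector<T>& data, std::vector<std::size_t> order) {
    for (std::size_t i = 0; i < order.size(); ++i) {
        std::size_t cur = i;
        while (order[cur] != i) {          // follow the cycle that starts at i
            std::size_t next = order[cur];
            std::swap(data[cur], data[next]);
            order[cur] = cur;              // mark position cur as finished
            cur = next;
        }
        order[cur] = cur;                  // the cycle is closed
    }
}
```

Because sorted_order leaves the container untouched, the same order vector can also be used to remap neighbour indices before the elements are moved.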

Dijkstra's algorithm question

In the code below:
#define MAX_VERTICES 260000
#include <fstream>
#include <vector>
#include <queue>
#define endl '\n'
using namespace std;
struct edge {
int dest;
int length;
};
bool operator< (edge e1, edge e2) {
return e1.length > e2.length;
}
int C, P, P0, P1, P2;
vector<edge> edges[MAX_VERTICES];
int best1[MAX_VERTICES];
int best2[MAX_VERTICES];
void dijkstra (int start, int* best) {
for (int i = 0; i < P; i++) best[i] = -1;
best[start] = 0;
priority_queue<edge> pq;
edge first = { start, 0 };
pq.push(first);
while (!pq.empty()) {
edge next = pq.top();
pq.pop();
if (next.length != best[next.dest]) continue;
for (vector<edge>::iterator i = edges[next.dest].begin(); i != edges[next.dest].end(); i++) {
if (best[i->dest] == -1 || next.length + i->length < best[i->dest]) {
best[i->dest] = next.length + i->length;
edge e = { i->dest, next.length+i->length };
pq.push(e);
}
}
}
}
int main () {
ifstream inp("apple.in");
ofstream outp("apple.out");
inp >> C >> P >> P0 >> P1 >> P2;
P0--, P1--, P2--;
for (int i = 0; i < C; i++) {
int a, b;
int l;
inp >> a >> b >> l;
a--, b--;
edge e = { b, l };
edges[a].push_back(e);
e.dest = a;
edges[b].push_back(e);
}
dijkstra (P1, best1); // find shortest distances from P1 to other nodes
dijkstra (P2, best2); // find shortest distances from P2 to other nodes
int ans = best1[P0]+best1[P2]; // path: PB->...->PA1->...->PA2
if (best2[P0]+best2[P1] < ans)
ans = best2[P0]+best2[P1]; // path: PB->...->PA2->...->PA1
outp << ans << endl;
return 0;
}
What is this line used for: if (next.length != best[next.dest]) continue;? Is it there to avoid situations where going through the loop would give us the same answer that we already have?
Thanks!
That line is a way to handle the fact that C++'s priority_queue does not have a decrease_key function.
That is, when you do pq.push(e) and there is already an entry with the same destination in the heap, you would prefer to decrease the key of the entry already in the heap. This is not easily done with C++'s priority_queue, so a simple workaround is to allow multiple entries for the same destination in the heap and to ignore all but the first one (for each dest) that you pop from the heap.
Note that this changes the complexity from O(E log V) to O(E log E). Since E is at most V^2 in a graph without parallel edges, log E is at most 2 log V, so the asymptotic bound is the same.
I guess you are contemplating the case where your priority_queue contains the same destination twice, each time with a different "length".
This can happen if you push destination X with a length of Y and later push X again with a length smaller than Y. That is why, if the length of a popped entry is not the lowest you have found for that destination so far, you skip it in that iteration of the loop.
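To make the effect of that check visible, here is a small sketch of Dijkstra with the same "lazy deletion" approach. The name dijkstra_lazy and the optional stale counter are hypothetical additions for illustration; the logic otherwise mirrors the code in the question, with -1 meaning "not reached yet".

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Arc = std::pair<int, int>;  // (destination, length)

// Returns shortest distances from `start`; -1 marks unreachable vertices.
// If `stale` is non-null, it receives the number of outdated heap entries
// that were popped and skipped.
std::vector<int> dijkstra_lazy(const std::vector<std::vector<Arc>>& adj,
                               int start, int* stale = nullptr) {
    std::vector<int> best(adj.size(), -1);
    using Item = std::pair<int, int>;  // (length, vertex); smallest length on top
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    int skipped = 0;
    best[start] = 0;
    pq.push({0, start});
    while (!pq.empty()) {
        Item next = pq.top();
        pq.pop();
        if (next.first != best[next.second]) {  // a shorter entry was handled already
            ++skipped;
            continue;
        }
        for (const Arc& a : adj[next.second]) {
            int cand = next.first + a.second;
            if (best[a.first] == -1 || cand < best[a.first]) {
                best[a.first] = cand;
                pq.push({cand, a.first});  // the old entry, if any, stays in the heap
            }
        }
    }
    if (stale) *stale = skipped;
    return best;
}
```

With edges 0->1 (length 10), 0->2 (length 1) and 2->1 (length 1), vertex 1 is pushed twice, first with length 10 and then with length 2. The entry with length 10 is popped last and skipped, so the counter ends at 1.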