I am reading indirect priority queues in Robert Sedgewick's Algorithms in C++.
The implementation below maintains pq as an array of indices into some client array. For example, if the client defines operator< for arguments of type Index, then, when fixUp compares pq[j] with pq[k], it is comparing data.grade[pq[j]] with data.grade[pq[k]], as desired. We assume that Index is a wrapper class whose object can index arrays, so that we can keep the heap position corresponding to index value k in qp[k], which allows us to implement "change priority" and "remove". We maintain the invariant pq[qp[k]]=qp[pq[k]]=k for all k in the heap.
template <class Index>
class PQ
{
private:
int N; Index* pq; int* qp;
void exch(Index i, Index j)
{
int t;
t = qp[i]; qp[i] = qp[j]; qp[j] = t;
pq[qp[i]] = i; pq[qp[j]] = j;
}
void fixUp(Index a[], int k);
void fixDown(Index a[], int k, int N);
public:
PQ(int maxN)
{
pq = new Index[maxN+1];
qp = new int[maxN+1]; N = 0;
}
int empty() const { return N == 0; }
void insert(Index v) { pq[++N] = v; qp[v] = N; fixUp(pq, N); }
Index getmax()
{
exch(pq[1], pq[N]);
fixDown(pq, 1, N-1);
return pq[N--];
}
void change(Index k)
{
fixUp(pq, qp[k]);
fixDown(pq, qp[k], N);
}
};
The main disadvantage of using indirection in this way is the extra space used. The size of the index arrays has to be the size of the data array, when the maximum size of the priority queue could be much less. Another approach to building a priority queue on top of existing data in an array is to have the client program make records consisting of a key with its array index as associated information, or to use an index key with a client-supplied overloaded operator<. Then, if the implementation uses a linked-allocation representation. Then the space used by the priority queue would be proportional to the maximum number of elements on the queue at any one time. Such approaches would be preferred over Program 9.12 if space must be conserved and if the priority queue involves only a small fraction of the data array.
What does the author mean by the below?
We assume that Index is a wrapper class whose object can index arrays
In improvement, the author is suggesting:
Another approach to building a priority queue on top of existing data in an array is to have the client program make records consisting of a key with its array index as associated information, or to use an index key with a client-supplied overloaded operator<. Then, if the implementation uses a linked-allocation representation
What does author mean here in the two points he is suggesting?
Related
Suppose a n-dimensional array that is passed as template argument and should be traversed in order to save it to a file. First of all I want to find out the size of the elements the array consists of. Thereto I try to dereference the pointers until I get the first element at [0][0][0]...[0]. But I already fail at this stage:
/**
* #brief save a n-dimensional array to file
*
* #param arr: the n-level-pointer to the data to be saved
* #param dimensions: pointer to array where dimensions of <arr> are stored
* #param n: number of levels / dimensions of <arr>
*/
template <typename T>
void save_array(T arr, unsigned int* dimensions, unsigned int n){
// how to put this in a loop ??
auto deref1 = *arr;
auto deref2 = *deref1;
auto deref3 = *deref2;
// do this n times, then derefn is equivalent to arr[0]...[0], 42 should be printed
std::cout << derefn << std::endl;
/* further code */
}
/*
* test call
*/
int main(){
unsigned int dim[4] = {50, 60, 80, 50}
uint8_t**** arr = new uint8_t***[50];
/* further initialization of arr, omitted here */
arr[0][0][0][0] = 42;
save_array(arr, dim, 4);
}
When I think of this from a memory perspective I want to perform a n-indirect load of a given address.
I saw a related question that was asked yesterday:
Declaring dynamic Multi-Dimensional pointer
This would help me a lot as well. One comment states it is not possible since types of all expressions must be known at compile-time. In my case there's actually known everything, all callers of save_array will have n hardcoded before passing it. So I think it could be just a matter of defining stuff at the right place what I am yet not able to.
I know I am writing C-style code in C++ and there could be options to achieve this with classes etc., but my question is: Is it possible to achieve n-level pointer dereference by an iterative or recursive approach? Thanks!
First of all: Do you really need a jagged array? Do you want to have some sort of sparse array? Because otherwise, could you not just flatten your n-dimensional structure into a single, long array? That would not just lead to much simpler code, but most likely also be more efficient.
That being said: It can be done for sure. For example, just use a recursive template and rely on overloading to peel off levels of indirection until you get to the bottom:
template <typename T>
void save_array(T* arr, unsigned int* dimensions)
{
for (unsigned int i = 0U; i < *dimensions; ++i)
std::cout << ' ' << *arr++;
std::cout << std::endl;
}
template <typename T>
void save_array(T** arr, unsigned int* dimensions)
{
for (unsigned int i = 0U; i < *dimensions; ++i)
save_array(*arr, dimensions + 1);
}
You don't even need to explicitly specify the number of indirections n, since that number is implicitly given by the pointer type.
You can do basically the same trick to allocate/deallocate the array too:
template <typename T>
struct array_builder;
template <typename T>
struct array_builder<T*>
{
T* allocate(unsigned int* dimensions) const
{
return new T[*dimensions];
}
};
template <typename T>
struct array_builder<T**> : private array_builder<T*>
{
T** allocate(unsigned int* dimensions) const
{
T** array = new T*[*dimensions];
for (unsigned int i = 0U; i < *dimensions; ++i)
array[i] = array_builder<T*>::allocate(dimensions + 1);
return array;
}
};
Just this way around, you need partial specialization since the approach using overloading only works when the type can be inferred from a parameter. Since functions cannot be partially specialized, you have to wrap it in a class template like that. Usage:
unsigned int dim[4] = { 50, 60, 80, 50 };
auto arr = array_builder<std::uint8_t****>{}.allocate(dim);
arr[0][0][0][0] = 42;
save_array(arr, dim);
Hope I didn't overlook anything; having this many indirections out in the open can get massively confusing real quick, which is why I strongly advise against ever doing this in real code unless absolutely unavoidable. Also this raw usage of new all over the place is anything but great. Ideally, you'd be using, e.g., std::unique_ptr. Or, better yet, just nested std::vectors as suggested in the comments…
Why not just use a data structure like tree with multiple child nodes.
Suppose you need to store n dimensional array values, create a node pointing to the first dimension. Say your first dimension length is 5 then you have 5 child nodes and if your 2nd dimension size is 10. Then for each of these 5 node you have 10 child nodes and so on....
Some thing like,
struct node{
int index;
int dimension;
vector<node*> children;
}
It will be easier to traverse through tree and is much cleaner.
As the title says, I was wondering if constant-time/O(1) access to a container does imply that memory is necessarily contiguous at some point.
When I say contiguous I mean if pointers can be compared with relational operators at some point without invoking undefined behavior.
Take eg std::deque: it does not guarantee that all its elements are stored contiguously (i.e in the same memory array), but is it correct to say that as std::deque satisfy the requirements of a Random Access Iterator, memory will be contiguous at some point independently of the implementation?
I am new to C++ so in case what I said above does not make sense: suppose I was going to implement random access iterators in C. Can
typedef struct random_access_iterator {
void *pointer; /*pointer to element */
void *block; /* pointer to the memory array
* so in case of comparing iterators
* where pointer does not point to addresses in the same array
* it can be used to comparisons instead*/
/* implement other stuff */
} RandomIter;
be used to generically express a similar mechanism to that of C++ random access iterators (considering that even if pointer do not, block will always
point to addresses in the same memory array in iterators of the same container)?
EDIT: just to clarify, constant-time here is used to denote constant-time random access
No. Consider a fixed-size linked list like the following:
struct DumbCounterexample
{
struct node
{
std::vector<int> items;
std::unique_ptr<node> next;
};
std::unique_ptr<node> first;
size_t size;
static constexpr size_t NODE_COUNT = 10;
DumbCounterexample()
: first{new node},
size{0}
{
node* to_fill = first.get();
for (size_t i = 0; i < NODE_COUNT - 1; ++i) {
to_fill->next.reset(new node);
to_fill = to_fill->next.get();
}
}
int& operator[](size_t i)
{
size_t node_num = i % NODE_COUNT;
size_t item_num = i / NODE_COUNT;
node* target_node = first.get();
for (size_t n = 0; n < node_num; ++n) {
target_node = target_node->next.get();
}
return target_node->items[item_num];
}
void push_back(int i)
{
size_t node_num = size % NODE_COUNT;
node* target_node = first.get();
for (size_t n = 0; n < node_num; ++n) {
target_node = target_node->next.get();
}
target_node->items.push_back(i);
++size;
}
};
Lookup time is constant. It does not depend on the number of elements stored in the container (only the constant NODE_COUNT).
Now, this is a strange data structure, and I can't think of any legitimate reason to use it, but it does serve as a counterexample to the claim that there need be a single contiguous block of memory that would be shared by all iterators to elements in the container (i.e. the block pointer in your example random_access_iterator struct).
It seems that a priority queue is just a heap with normal queue operations like insert, delete, top, etc. Is this the correct way to interpret a priority queue? I know you can build priority queues in different ways but if I were to build a priority queue from a heap is it necessary to create a priority queue class and give instructions for building a heap and the queue operations or is it not really necessary to build the class?
What I mean is if I have a function to build a heap and functions to do operations like insert and delete, do I need to put all these functions in a class or can I just use the instructions by calling them in main.
I guess my question is whether having a collection of functions is equivalent to storing them in some class and using them through a class or just using the functions themselves.
What I have below is all the methods for a priority queue implementation. Is this sufficient to call it an implementation or do I need to put it in a designated priority queue class?
#ifndef MAX_PRIORITYQ_H
#define MAX_PRIORITYQ_H
#include <iostream>
#include <deque>
#include "print.h"
#include "random.h"
int parent(int i)
{
return (i - 1) / 2;
}
int left(int i)
{
if(i == 0)
return 1;
else
return 2*i;
}
int right(int i)
{
if(i == 0)
return 2;
else
return 2*i + 1;
}
void max_heapify(std::deque<int> &A, int i, int heapsize)
{
int largest;
int l = left(i);
int r = right(i);
if(l <= heapsize && A[l] > A[i])
largest = l;
else
largest = i;
if(r <= heapsize && A[r] > A[largest])
largest = r;
if(largest != i) {
exchange(A, i, largest);
max_heapify(A, largest, heapsize);
//int j = max_heapify(A, largest, heapsize);
//return j;
}
//return i;
}
void build_max_heap(std::deque<int> &A)
{
int heapsize = A.size() - 1;
for(int i = (A.size() - 1) / 2; i >= 0; i--)
max_heapify(A, i, heapsize);
}
int heap_maximum(std::deque<int> &A)
{
return A[0];
}
int heap_extract_max(std::deque<int> &A, int heapsize)
{
if(heapsize < 0)
throw std::out_of_range("heap underflow");
int max = A.front();
//std::cout << "heapsize : " << heapsize << std::endl;
A[0] = A[--heapsize];
A.pop_back();
max_heapify(A, 0, heapsize);
//int i = max_heapify(A, 0, heapsize);
//A.erase(A.begin() + i);
return max;
}
void heap_increase_key(std::deque<int> &A, int i, int key)
{
if(key < A[i])
std::cerr << "New key is smaller than current key" << std::endl;
else {
A[i] = key;
while(i > 1 && A[parent(i)] < A[i]) {
exchange(A, i, parent(i));
i = parent(i);
}
}
}
void max_heap_insert(std::deque<int> &A, int key)
{
int heapsize = A.size();
A[heapsize] = std::numeric_limits<int>::min();
heap_increase_key(A, heapsize, key);
}
A priority queue is an abstract datatype. It is a shorthand way of describing a particular interface and behavior, and says nothing about the underlying implementation.
A heap is a data structure. It is a name for a particular way of storing data that makes certain operations very efficient.
It just so happens that a heap is a very good data structure to implement a priority queue, because the operations which are made efficient by the heap data strucure are the operations that the priority queue interface needs.
Having a class with exactly the interface you need (just insert and pop-max?) has its advantages.
You can exchange the implementation (list instead of heap, for example) later.
Someone reading the code that uses the queue doesn't need to understand the more difficult interface of the heap data structure.
I guess my question is whether having a collection of functions is
equivalent to storing them in some class and using them through a
class or just using the functions themselves.
It's mostly equivalent if you just think in terms of "how does my program behave". But it's not equivalent in terms of "how easy is my program to understand by a human reader"
The term priority queue refers to the general data structure useful to order priorities of its element. There are multiple ways to achieve that, e.g., various ordered tree structures (e.g., a splay tree works reasonably well) as well as various heaps, e.g., d-heaps or Fibonacci heaps. Conceptually, a heap is a tree structure where the weight of every node is not less than the weight of any node in the subtree routed at that node.
The C++ Standard Template Library provides the make_heap, push_heap
and pop_heap algorithms for heaps (usually implemented as binary
heaps), which operate on arbitrary random access iterators. It treats
the iterators as a reference to an array, and uses the array-to-heap
conversion. It also provides the container adaptor priority_queue,
which wraps these facilities in a container-like class. However, there
is no standard support for the decrease/increase-key operation.
priority_queue referes to abstract data type defined entirely by the operations that may be performed on it. In C++ STL prioroty_queue is thus one of the sequence adapters - adaptors of basic containers (vector, list and deque are basic because they cannot be built from each other without loss of efficiency), defined in <queue> header (<bits/stl_queue.h> in my case actually). As can be seen from its definition, (as Bjarne Stroustrup says):
container adapter provides a restricted interface to a container. In
particular, adapters do not provide iterators; they are intended to be
used only through their specialized interfaces.
On my implementation prioroty_queue is described as
/**
* #brief A standard container automatically sorting its contents.
*
* #ingroup sequences
*
* This is not a true container, but an #e adaptor. It holds
* another container, and provides a wrapper interface to that
* container. The wrapper is what enforces priority-based sorting
* and %queue behavior. Very few of the standard container/sequence
* interface requirements are met (e.g., iterators).
*
* The second template parameter defines the type of the underlying
* sequence/container. It defaults to std::vector, but it can be
* any type that supports #c front(), #c push_back, #c pop_back,
* and random-access iterators, such as std::deque or an
* appropriate user-defined type.
*
* The third template parameter supplies the means of making
* priority comparisons. It defaults to #c less<value_type> but
* can be anything defining a strict weak ordering.
*
* Members not found in "normal" containers are #c container_type,
* which is a typedef for the second Sequence parameter, and #c
* push, #c pop, and #c top, which are standard %queue operations.
* #note No equality/comparison operators are provided for
* %priority_queue.
* #note Sorting of the elements takes place as they are added to,
* and removed from, the %priority_queue using the
* %priority_queue's member functions. If you access the elements
* by other means, and change their data such that the sorting
* order would be different, the %priority_queue will not re-sort
* the elements for you. (How could it know to do so?)
template:
template<typename _Tp, typename _Sequence = vector<_Tp>,
typename _Compare = less<typename _Sequence::value_type> >
class priority_queue
{
In opposite to this, heap describes how its elements are being fetched and stored in memory. It is a (tree based) data structure, others are i.e array, hash table, struct, union, set..., that in addition satisfies heap property: all nodes are either [greater than or equal to] or [less than or equal to] each of its children, according to a comparison predicate defined for the heap.
So in my heap header I find no heap container, but rather a set of algorithms
/**
* #defgroup heap_algorithms Heap Algorithms
* #ingroup sorting_algorithms
*/
like:
__is_heap_until
__is_heap
__push_heap
__adjust_heap
__pop_heap
make_heap
sort_heap
all of them (excluding __is_heap, commented as "This function is an extension, not part of the C++ standard") described as
* #ingroup heap_algorithms
*
* This operation... (what it does)
Not really. The "priority" in the name stems from a priority value for the entries in the queue, defining their ... of course: priority. There are many ways to implement such a PQ, however.
A priority queue is an abstract data structure that can be implemented in many ways-unsorted array,sorted array,heap-. It is like an interface, it gives you the signature of heap:
class PriorityQueue {
top() → element
peek() → element
insert(element, priority)
remove(element)
update(element, newPriority)
size() → int
}
A heap is a concrete implementation of the priority queue using an array (it can conceptually be represented as a particular kind of binary tree) to hold elements and specific algorithms to enforce invariants. Invariants are internal properties that always hold true throughout the life of the data structure.
here is the performance comparison of priority queue implementions:
as in the title is it possible to join a number of arrays together without copying and only using pointers? I'm spending a significant amount of computation time copying smaller arrays into larger ones.
note I can't used vectors since umfpack (some matrix solving library) does not allow me to or i don't know how.
As an example:
int n = 5;
// dynamically allocate array with use of pointer
int *a = new int[n];
// define array pointed by *a as [1 2 3 4 5]
for(int i=0;i<n;i++) {
a[i]=i+1;
}
// pointer to array of pointers ??? --> this does not work
int *large_a = new int[4];
for(int i=0;i<4;i++) {
large_a[i] = a;
}
Note: There is already a simple solution I know and that is just to iteratively copy them to a new large array, but would be nice to know if there is no need to copy repeated blocks that are stored throughout the duration of the program. I'm in a learning curve atm.
thanks for reading everyone
as in the title is it possible to join a number of arrays together without copying and only using pointers?
In short, no.
A pointer is simply an address into memory - like a street address. You can't move two houses next to each other, just by copying their addresses around. Nor can you move two houses together by changing their addresses. Changing the address doesn't move the house, it points to a new house.
note I can't used vectors since umfpack (some matrix solving library) does not allow me to or i don't know how.
In most cases, you can pass the address of the first element of a std::vector when an array is expected.
std::vector a = {0, 1, 2}; // C++0x initialization
void c_fn_call(int*);
c_fn_call(&a[0]);
This works because vector guarantees that the storage for its contents is always contiguous.
However, when you insert or erase an element from a vector, it invalidates pointers and iterators that came from it. Any pointers you might have gotten from taking an element's address no longer point to the vector, if the storage that it has allocated must change size.
No. The memory of two arrays are not necessarily contiguous so there is no way to join them without copying. And array elements must be in contiguous memory...or pointer access would not be possible.
I'd probably use memcpy/memmove, which is still going to be copying the memory around, but at least it's been optimized and tested by your compiler vendor.
Of course, the "real" C++ way of doing it would be to use standard containers and iterators. If you've got memory scattered all over the place like this, it sounds like a better idea to me to use a linked list, unless you are going to do a lot of random access operations.
Also, keep in mind that if you use pointers and dynamically allocated arrays instead of standard containers, it's a lot easier to cause memory leaks and other problems. I know sometimes you don't have a choice, but just saying.
If you want to join arrays without copying the elements and at the same time you want to access the elements using subscript operator i.e [], then that isn't possible without writing a class which encapsulates all such functionalities.
I wrote the following class with minimal consideration, but it demonstrates the basic idea, which you can further edit if you want it to have functionalities which it's not currently having. There should be few error also, which I didn't write, just to make it look shorter, but I believe you will understand the code, and handle error cases accordingly.
template<typename T>
class joinable_array
{
std::vector<T*> m_data;
std::vector<size_t> m_size;
size_t m_allsize;
public:
joinable_array() : m_allsize() { }
joinable_array(T *a, size_t len) : m_allsize() { join(a,len);}
void join(T *a, size_t len)
{
m_data.push_back(a);
m_size.push_back(len);
m_allsize += len;
}
T & operator[](size_t i)
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
const T & operator[](size_t i) const
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
size_t size() const { return m_allsize; }
private:
struct index
{
size_t v;
size_t i;
};
index get_index(size_t i) const
{
index ix = { 0, i};
for(auto it = m_size.begin(); it != m_size.end(); it++)
{
if ( ix.i >= *it ) { ix.i -= *it; ix.v++; }
else break;
}
return ix;
}
};
And here is one test code:
#define alen(a) sizeof(a)/sizeof(*a)
int main() {
int a[] = {1,2,3,4,5,6};
int b[] = {11,12,13,14,15,16,17,18};
joinable_array<int> arr(a,alen(a));
arr.join(b, alen(b));
arr.join(a, alen(a)); //join it again!
for(size_t i = 0 ; i < arr.size() ; i++ )
std::cout << arr[i] << " ";
}
Output:
1 2 3 4 5 6 11 12 13 14 15 16 17 18 1 2 3 4 5 6
Online demo : http://ideone.com/VRSJI
Here's how to do it properly:
template<class T, class K1, class K2>
class JoinArray {
JoinArray(K1 &k1, K2 &k2) : k1(k1), k2(k2) { }
T operator[](int i) const { int s = k1.size(); if (i < s) return k1.operator[](i); else return k2.operator[](i-s); }
int size() const { return k1.size() + k2.size(); }
private:
K1 &k1;
K2 &k2;
};
template<class T, class K1, class K2>
JoinArray<T,K1,K2> join(K1 &k1, K2 &k2) { return JoinArray<T,K1,K2>(k1,k2); }
template<class T>
class NativeArray
{
NativeArray(T *ptr, int size) : ptr(ptr), size(size) { }
T operator[](int i) const { return ptr[i]; }
int size() const { return size; }
private:
T *ptr;
int size;
};
int main() {
int array[2] = { 0,1 };
int array2[2] = { 2,3 };
NativeArray<int> na(array, 2);
NativeArray<int> na2(array2, 2);
auto joinarray = join(na,na2);
}
A variable that is a pointer to a pointer must be declared as such.
This is done by placing an additional asterik in front of its name.
Hence, int **large_a = new int*[4]; Your large_a goes and find a pointer, while you've defined it as a pointer to an int. It should be defined (declared) as a pointer to a pointer variable. Just as int **large_a; could be enough.
I need to use boost::disjoint_sets, but the documentation is unclear to me. Can someone please explain what each template parameter means, and perhaps give a small example code for creating a disjoint_sets?
As per the request, I am using disjoint_sets to implement Tarjan's off-line least common ancestors algorithm, i.e - the value type should be vertex_descriptor.
What I can understand from the documentation :
Disjoint need to associate a rank and a parent (in the forest tree) to each element. Since you might want to work with any kind of data you may,for example, not always want to use a map for the parent: with integer an array is sufficient. You also need a rank foe each element (the rank needed for the union-find).
You'll need two "properties" :
one to associate an integer to each element (first template argument), the rank
one to associate an element to an other one (second template argument), the fathers
On an example :
std::vector<int> rank (100);
std::vector<int> parent (100);
boost::disjoint_sets<int*,int*> ds(&rank[0], &parent[0]);
Arrays are used &rank[0], &parent[0] to the type in the template is int*
For a more complex example (using maps) you can look at Ugo's answer.
You are just giving to the algorithm two structures to store the data (rank/parent) he needs.
disjoint_sets<Rank, Parent, FindCompress>
Rank PropertyMap used to store the size of a set (element -> std::size_t). See union by rank
Parent PropertyMap used to store the parent of an element (element -> element). See Path compression
FindCompress Optional argument defining the find method. Default to find_with_full_path_compression See here (Default should be what you need).
Example:
template <typename Rank, typename Parent>
void algo(Rank& r, Parent& p, std::vector<Element>& elements)
{
boost::disjoint_sets<Rank,Parent> dsets(r, p);
for (std::vector<Element>::iterator e = elements.begin();
e != elements.end(); e++)
dsets.make_set(*e);
...
}
int main()
{
std::vector<Element> elements;
elements.push_back(Element(...));
...
typedef std::map<Element,std::size_t> rank_t; // => order on Element
typedef std::map<Element,Element> parent_t;
rank_t rank_map;
parent_t parent_map;
boost::associative_property_map<rank_t> rank_pmap(rank_map);
boost::associative_property_map<parent_t> parent_pmap(parent_map);
algo(rank_pmap, parent_pmap, elements);
}
Note that "The Boost Property Map Library contains a few adaptors that convert commonly used data-structures that implement a mapping operation, such as builtin arrays (pointers), iterators, and std::map, to have the property map interface"
This list of these adaptors (like boost::associative_property_map) can be found here.
For those of you who can't afford the overhead of std::map (or can't use it because you don't have default constructor in your class), but whose data is not as simple as int, I wrote a guide to a solution using std::vector, which is kind of optimal when you know the total number of elements beforehand.
The guide includes a fully-working sample code that you can download and test on your own.
The solution mentioned there assumes you have control of the class' code so that in particular you can add some attributes. If this is still not possible, you can always add a wrapper around it:
class Wrapper {
UntouchableClass const& mInstance;
size_t dsID;
size_t dsRank;
size_t dsParent;
}
Moreover, if you know the number of elements to be small, there's no need for size_t, in which case you can add some template for the UnsignedInt type and decide in runtime to instantiate it with uint8_t, uint16_t, uint32_tor uint64_t, which you can obtain with <cstdint> in C++11 or with boost::cstdint otherwise.
template <typename UnsignedInt>
class Wrapper {
UntouchableClass const& mInstance;
UnsignedInt dsID;
UnsignedInt dsRank;
UnsignedInt dsParent;
}
Here's the link again in case you missed it: http://janoma.cl/post/using-disjoint-sets-with-a-vector/
I written a simple implementation a while ago. Have a look.
struct DisjointSet {
vector<int> parent;
vector<int> size;
DisjointSet(int maxSize) {
parent.resize(maxSize);
size.resize(maxSize);
for (int i = 0; i < maxSize; i++) {
parent[i] = i;
size[i] = 1;
}
}
int find_set(int v) {
if (v == parent[v])
return v;
return parent[v] = find_set(parent[v]);
}
void union_set(int a, int b) {
a = find_set(a);
b = find_set(b);
if (a != b) {
if (size[a] < size[b])
swap(a, b);
parent[b] = a;
size[a] += size[b];
}
}
};
And the usage goes like this. It's simple. Isn't it?
void solve() {
int n;
cin >> n;
DisjointSet S(n); // Initializing with maximum Size
S.union_set(1, 2);
S.union_set(3, 7);
int parent = S.find_set(1); // root of 1
}
Loic's answer looks good to me, but I needed to initialize the parent so that each element had itself as parent, so I used the iota function to generate an increasing sequence starting from 0.
Using Boost, and I imported bits/stdc++.h and used using namespace std for simplicity.
#include <bits/stdc++.h>
#include <boost/pending/disjoint_sets.hpp>
#include <boost/unordered/unordered_set.hpp>
using namespace std;
int main() {
array<int, 100> rank;
array<int, 100> parent;
iota(parent.begin(), parent.end(), 0);
boost::disjoint_sets<int*, int*> ds(rank.begin(), parent.begin());
ds.union_set(1, 2);
ds.union_set(1, 3);
ds.union_set(1, 4);
cout << ds.find_set(1) << endl; // 1 or 2 or 3 or 4
cout << ds.find_set(2) << endl; // 1 or 2 or 3 or 4
cout << ds.find_set(3) << endl; // 1 or 2 or 3 or 4
cout << ds.find_set(4) << endl; // 1 or 2 or 3 or 4
cout << ds.find_set(5) << endl; // 5
cout << ds.find_set(6) << endl; // 6
}
I changed std::vector to std::array because pushing elements to a vector will make it realloc its data, which makes the references the disjoint sets object contains become invalid.
As far as I know, it's not guaranteed that the parent will be a specific number, so that's why I wrote 1 or 2 or 3 or 4 (it can be any of these). Maybe the documentation explains with more detail which number will be chosen as leader of the set (I haven't studied it).
In my case, the output is:
2
2
2
2
5
6
Seems simple, it can probably be improved to make it more robust (somehow).
Note: std::iota Fills the range [first, last) with sequentially increasing values, starting with value and repetitively evaluating ++value.
More: https://en.cppreference.com/w/cpp/algorithm/iota