How to find smallest connected label in equivalency list - c++

I have a list of numbers stored in a standard vector. Some of the numbers are children of other numbers. Here is an example
3, 4
3, 5
5, 6
7, 3
8, 9
8, 1
8, 2
9, 8
Or as a graph:
1 2 3-4 5-6 7 8-9
|-------------|
  |-----------|
    |---|
    |-------|
That is, there are two clusters: 3,4,5,6,7 and 1,2,8,9. The root number is the smallest number of a cluster, here 3 and 1. I would like to know which algorithms I can use to extract a list like this:
3, 4
3, 5
3, 6
3, 7
1, 2
1, 8
1, 9

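An algorithm similar to disjoint-set union (union-find) can help you:
Initialize N disjoint subsets, each containing exactly one number, so that the root of number i, r(i), is i.
For each edge (u, v), first follow r from u and from v until you reach numbers that are their own root (call them ru and rv), then assign:
t = min(ru, rv)
r(ru) = t
r(rv) = t
Linking the roots rather than u and v themselves matters: if you only overwrite r(u) and r(v), numbers merged earlier keep pointing at a stale root (with edges (2, 3) then (1, 2), 3 would still report 2).
Finally, for each i, follow r from i to its root; if the root differs from i, write out the pair [root, i].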
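The steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not a tuned implementation: the names find_root and smallest_labels are made up for illustration, the edges are assumed to arrive as pairs of ints, and the result is sorted, so the cluster rooted at 1 is listed before the cluster rooted at 3.

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: follows parent links to the representative of x,
// then points every visited node straight at it (path compression).
int find_root(std::map<int, int>& parent, int x) {
    if (parent.find(x) == parent.end()) {
        parent[x] = x;  // first time we see x: it is its own root
        return x;
    }
    int root = x;
    while (parent[root] != root) root = parent[root];
    while (parent[x] != root) {
        int next = parent[x];
        parent[x] = root;
        x = next;
    }
    return root;
}

// Returns (smallest label of the cluster, member) for every non-root member.
std::vector<std::pair<int, int>> smallest_labels(
        const std::vector<std::pair<int, int>>& edges) {
    std::map<int, int> parent;
    for (const auto& e : edges) {
        int ru = find_root(parent, e.first);
        int rv = find_root(parent, e.second);
        if (ru == rv) continue;
        // Always hang the larger root under the smaller one, so the root
        // of a cluster stays its smallest number.
        parent[std::max(ru, rv)] = std::min(ru, rv);
    }
    std::vector<int> labels;
    for (const auto& kv : parent) labels.push_back(kv.first);
    std::vector<std::pair<int, int>> out;
    for (int x : labels) {
        int r = find_root(parent, x);
        if (r != x) out.emplace_back(r, x);
    }
    std::sort(out.begin(), out.end());
    return out;
}
```

Called with the eight pairs from the question, smallest_labels returns the seven pairs 1,2 1,8 1,9 3,4 3,5 3,6 3,7.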

Related

C++ Sort vector by index

I need to sort a std::vector by index. Let me explain it with an example:
Imagine I have a std::vector of 12 positions (but can be 18 for example) filled with some values (it doesn't have to be sorted):
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11
Vector Values: 3 0 2 3 2 0 1 2 2 4 5 3
I want to reorder it in blocks of 3 indices. This means: the first 3 [0-2] stay, then I need to have [6-8], and then the others. So it will end up like this (new index 3 has the value of previous index 6):
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11
Vector Values: 3 0 2 1 2 2 3 2 0 4 5 3
I'm trying to make it in one line using std::sort + lambda but I can't get it. Also discovered the std::partition() function and tried to use it but the result was really bad hehe
Found also this similar question which orders by odd and even index but can't figure out how to make it in my case or even if it is possible: Sort vector by even and odd index
Thank you so much!
Note 0: No, my vector is not always sorted. It was just an example. I've changed the values
Note 1: I know it sounds strange... think of the vector positions like this: yes yes yes no no no yes yes yes no no no yes yes yes... so the 'yes' positions will go in the same order but before the 'no' positions
Note 2: If there isn't a way with a lambda then I thought of doing it with a loop and auxiliary variables, but that's uglier I think.
Note 3: Another example:
Vector Index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Vector Values: 3 0 2 3 2 0 1 2 2 4 5 3 2 3 0 0 2 1
Sorted Values: 3 0 2 1 2 2 2 3 0 3 2 0 4 5 3 0 2 1
The final Vector Values is sorted (in term of old index): 0 1 2 6 7 8 12 13 14 3 4 5 9 10 11 15 16 17
You can imagine those indices in 2 columns, so I want first the left ones and then the right ones:
0 1 2 3 4 5
6 7 8 9 10 11
12 13 14 15 16 17
You don't want std::sort, you want std::rotate.
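#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

int main() {
    std::vector<int> v = {20, 21, 22, 23, 24, 25,
                          26, 27, 28, 29, 30, 31};
    auto b = std::next(std::begin(v), 3); // skip first three elements
    auto const re = std::end(v);          // keep track of the actual end
    // only rotate while a full block of six elements remains, so no
    // iterator is ever advanced past the end of the vector
    while (std::distance(b, re) >= 6) {
        auto mid = std::next(b, 3);
        auto e = std::next(b, 6);         // the end of our current block
        std::rotate(b, mid, e);           // swap the two groups of three
        b = e;
    }
    // print the results
    std::copy(std::begin(v), std::end(v), std::ostream_iterator<int>(std::cout, " "));
}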
This code assumes you always do two groups of 3 for each rotation, but you could obviously work with whichever arbitrary ranges you wanted.
The output looks like what you'd want:
20 21 22 26 27 28 23 24 25 29 30 31
Update: @Blastfurnace pointed out that std::swap_ranges would work as well. The rotate call can be replaced with the following line:
std::swap_ranges(b, mid, mid); // passing mid twice on purpose
With the range-v3 library, you can write this quite conveniently, and it's very readable. Assuming your original vector is called input:
namespace rs = ranges;
namespace rv = ranges::views;
// input [3, 0, 2, 3, 2, 0, 1, 2, 2, 4, 5, 3, 2, 3, 0, 0, 2, 1]
auto by_3s = input | rv::chunk(3); // [[3, 0, 2], [3, 2, 0], [1, 2, 2], [4, 5, 3], [2, 3, 0], [0, 2, 1]]
auto result = rv::concat(by_3s | rv::stride(2), // [[3, 0, 2], [1, 2, 2], [2, 3, 0]]
by_3s | rv::drop(1) | rv::stride(2)) // [[3, 2, 0], [4, 5, 3], [0, 2, 1]]
| rv::join
| rs::to<std::vector<int>>; // [3, 0, 2, 1, 2, 2, 2, 3, 0, 3, 2, 0, 4, 5, 3, 0, 2, 1]
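If neither rotating in place nor range-v3 is an option, the 'yes' blocks first, then the 'no' blocks ordering from the question can also be produced with a plain two-pass copy. This is only a sketch under the assumption that a copy of the vector is acceptable; the function name even_chunks_first and the chunk parameter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies chunks 0, 2, 4, ... (each `chunk` elements long)
// in their original order, followed by chunks 1, 3, 5, ...
std::vector<int> even_chunks_first(const std::vector<int>& in,
                                   std::size_t chunk = 3) {
    std::vector<int> out;
    out.reserve(in.size());
    for (std::size_t parity = 0; parity < 2; ++parity) {
        for (std::size_t i = 0; i < in.size(); ++i) {
            // i / chunk is the chunk number that index i belongs to
            if ((i / chunk) % 2 == parity) out.push_back(in[i]);
        }
    }
    return out;
}
```

For the 18-element example in Note 3 this yields 3 0 2 1 2 2 2 3 0 3 2 0 4 5 3 0 2 1, the "Sorted Values" row of the question. It makes two passes over the input and uses O(n) extra memory, in exchange for keeping every block in its original relative order.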

Row-wise Element Indexing in PyTorch for C++

I am using the C++ frontend for PyTorch and am struggling with a relatively basic indexing problem.
I have an 8 by 6 Tensor such as the one below:
[ Variable[CUDAFloatType]{8,6} ]
0 1 2 3 4 5
0 1.7107e-14 4.0448e-17 4.9708e-06 1.1664e-08 9.9999e-01 2.1857e-20
1 1.8288e-14 5.9356e-17 5.3042e-06 1.2369e-08 9.9999e-01 2.4799e-20
2 2.6828e-04 9.0390e-18 1.7517e-02 1.0529e-03 9.8116e-01 6.7854e-26
3 5.7521e-10 3.1037e-11 1.5021e-03 1.2304e-06 9.9850e-01 1.4888e-17
4 1.7811e-13 1.8383e-15 1.6733e-05 3.8466e-08 9.9998e-01 5.2815e-20
5 9.6191e-06 2.6217e-23 3.1345e-02 2.3024e-04 9.6842e-01 2.9435e-34
6 2.2653e-04 8.4642e-18 1.6085e-02 9.7405e-04 9.8271e-01 6.3059e-26
7 3.8951e-14 2.9903e-16 8.3518e-06 1.7974e-08 9.9999e-01 3.6993e-20
I have another Tensor with just 8 elements in it such as:
[ Variable[CUDALongType]{8} ]
0
3
4
4
4
4
4
4
I would like to index the rows of my first tensor using the second to produce:
0
0 1.7107e-14
1 1.2369e-08
2 9.8116e-01
3 9.9850e-01
4 9.9998e-01
5 9.6842e-01
6 9.8271e-01
7 9.9999e-01
I have tried a few different approaches including index_select but it seems to produce an output that has the same dimensions as the input (8x6).
In Python I think I could index with Python's built-in indexing as discussed here: https://github.com/pytorch/pytorch/issues/1080
Unfortunately, in C++ I can only index a Tensor with a scalar (zero-dimensional Tensor) so I don't think that approach works for me here.
How can I achieve my desired result without resorting to loops?
It turns out you can do this in a couple of different ways: one with gather and one with index. From the PyTorch discussions where I asked the same question:
Using torch::gather
auto x = torch::randn({8, 6});
int64_t idx_data[8] = { 0, 3, 4, 4, 4, 4, 4, 4 };
auto idx = torch::from_blob(idx_data, {8}, torch::kLong); // wraps the array without copying
auto result = x.gather(1, idx.unsqueeze(1));
Using the C++ specific torch::index
auto x = torch::randn({8, 6});
int64_t idx_data[8] = { 0, 3, 4, 4, 4, 4, 4, 4 };
auto idx = torch::from_blob(idx_data, {8}, torch::kLong); // wraps the array without copying
auto rows = torch::arange(0, x.size(0), torch::kLong);
auto result = x.index({rows, idx});
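To make the intended semantics concrete without libtorch, here is a plain C++ sketch of what both variants are meant to compute: element i of the result is x[i][idx[i]]. It is a reference for checking results only, not a replacement for the tensor operations (it loops, which the question wanted to avoid), and pick_per_row is a hypothetical name, not a PyTorch function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference implementation: for each row i, pick column idx[i].
// Throws std::out_of_range if idx is too short or a column index is invalid.
std::vector<float> pick_per_row(const std::vector<std::vector<float>>& x,
                                const std::vector<std::int64_t>& idx) {
    std::vector<float> out;
    out.reserve(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        out.push_back(x[i].at(static_cast<std::size_t>(idx.at(i))));
    }
    return out;
}
```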

Longest sub-sequence the elements of which make up a set of increasing integers

Find the length of the longest continuous sub-sequence of an array the elements of which make up a set of continuous increasing integers.
The input file consists of the number n(the number of elements in the array) followed by n integers.
example input - 10 1 6 4 5 2 3 8 10 7 7
example output - 6 (1 6 4 5 2 3, since they make up the set 1 2 3 4 5 6).
I was able to write an algorithm that satisfies 0<n<5000 but in order to get 100 points the algorithm had to work for 0<=n<=50000.
How about something like this? Arrange the array elements in descending order, each coupled with the index range over which it is the maximum (for example, A[0] = 10 would be the maximum for array indexes [0, 10], while A[3] = 4 would be the local maximum only for array indexes [3, 3]). Now traverse this list and find the longest run of consecutively descending values whose index ranges are all contained in the range of the run's first value.
10 1 6 4 5 2 3 8 10 7 7
=> 10, [ 0,10]
8, [ 1, 7]
7, [ 9,10]
6, [ 1, 6] <--
5, [ 3, 6] | ranges
4, [ 3, 3] | all
3, [ 5, 6] | contained
2, [ 5, 5] | in [1,6]
1, [ 1, 1] <--
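For checking a faster solution on small inputs, a brute-force baseline is handy. The sketch below is O(n^2) and therefore not the 100-point solution; it assumes that a repeated value ends a window (the elements must be distinct) and that the array is passed without the leading count n. The function name longest_consecutive_window is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical baseline: length of the longest contiguous window whose
// distinct elements form a run of consecutive integers.
std::size_t longest_consecutive_window(const std::vector<int>& a) {
    std::size_t best = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::unordered_set<int> seen;
        int lo = a[i], hi = a[i];
        for (std::size_t j = i; j < a.size(); ++j) {
            if (!seen.insert(a[j]).second) break;  // duplicate ends the window
            lo = std::min(lo, a[j]);
            hi = std::max(hi, a[j]);
            // distinct values with max - min == length - 1 are consecutive
            long long spread = static_cast<long long>(hi) - lo;
            if (spread == static_cast<long long>(j - i)) {
                best = std::max(best, j - i + 1);
            }
        }
    }
    return best;
}
```

On the array from the question (1 6 4 5 2 3 8 10 7 7) it returns 6.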

Data clustering and comparison between two arrays

I have two collections of elements. How can I pick out those with duplicates and put them into each group with least amount of comparison? Preferably in C++.
For Example given
Array 1 = {1, 1, 2, 2, 3, 4, 5, 5, 1, 1, 2, 2, 4, 5, 8, …}
Array 2 = {2, 1, 1, 2, 2, 4, 7, 7, 8, 8, 2, 2, 4, 4, 8, …}.
At first, I want to cluster data.
Array 1 = { Group 1 = {1, 1, 1, 1, …}, Group 2 = {2, 2, 2, 2, …}, Group 3 = {3, …}, Group 4 = {4, 4, …}, Group 5 = {5, 5, 5, …}, Group 6 = {8, …} }.
Array 2 = { Group 1 = {1, 1, …}, Group 2 = {2, 2, 2, 2, 2 …}, Group 3 = {4, 4 ,4, …}, Group 4 = {7, 7, …}, Group 5 = {8, 8, 8 …} }.
And second, I want data matching.
Group 1 of Array 1 == Group 1 of Array 2
Group 2 of Array 1 == Group 2 of Array 2
Group 4 of Array 1 == Group 3 of Array 2
Group 6 of Array 1 == Group 5 of Array 2
How can I solve this problem in C++? Please give me your brilliant tips.
Additionally, I will explain my problem in detail. I have two data sets which are calculated from a stereo image pair. Array 1 is the data from the left camera, and Array 2 is the data from the right camera. My final goal is to match groups which have the same values, such as group 6 of array 1 and group 5 of array 2. Data ordering is not a concern for me. I just want to find the same values between groups in the two arrays. (Would you recommend ordering the data first to reduce the number of comparisons?)
In order to solve this problem, should I use ‘std::map’ for data clustering, and compare those N! times (N: no. of groups in array 1 or 2)? Is this best way that I can do?
I’d like to get your advice. Thank you for sharing my problems.
My conclusion
My approach is to use map container in C++ STL.
Make 2 map containers (Array1_map, Array2_map).
Insert value of each array into the map containers as a key, and insert index of each array into the map as a value. (Two data of both arrays are orderly saved in a map without duplication.)
Use find() member function of map container for data matching.
After data matching, I was able to get the indexes of each array which have the matched keys (corresponding keys).
Thank you for all your helpful answers!
The easiest way I can see to do this is to construct a histogram of each array and then compare the histograms. Building each histogram with std::map is O(N log N), where N is the array size, and comparing them is O(M), where M is the number of unique elements (the size of the map). That would look like
#include <map>

int arr1[] = {...};
int arr2[] = {...};
std::map<int, int> arr1_histogram, arr2_histogram;
for (auto e : arr1)
    arr1_histogram[e]++;   // count occurrences of each value
for (auto e : arr2)
    arr2_histogram[e]++;
if (arr1_histogram == arr2_histogram) {
    // true case: both arrays hold the same values with the same counts
} else {
    // false case
}
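Comparing whole histograms only says whether the two arrays agree completely. To find which groups match, as in the asker's conclusion, one option is to map each value to the indices where it occurs and then look up every key of one map in the other. The sketch below assumes int data; IndexGroups, group_by_value and common_values are hypothetical names, and each group keeps all of its indices in a vector.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// value -> indices at which that value occurs
using IndexGroups = std::map<int, std::vector<std::size_t>>;

// Hypothetical helper: clusters one array by value.
IndexGroups group_by_value(const std::vector<int>& data) {
    IndexGroups groups;
    for (std::size_t i = 0; i < data.size(); ++i) {
        groups[data[i]].push_back(i);
    }
    return groups;
}

// Hypothetical helper: values that have a group in both arrays.
std::vector<int> common_values(const IndexGroups& a, const IndexGroups& b) {
    std::vector<int> common;
    for (const auto& entry : a) {
        if (b.find(entry.first) != b.end()) common.push_back(entry.first);
    }
    return common;
}
```

With the listed elements of the two example arrays, the common values are 1, 2, 4 and 8, which corresponds to the four group matches in the question. Each lookup is O(log M) for M groups, so the matching step needs far fewer than N! comparisons.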

Most efficient sorting algorithm to continuously sort a vector<vector<double>>

What is the fastest algorithm to keep a
vector<vector<double>>
continuously "merge" sorted being able to handle updates in realtime?
For example, at T0 the vector<vector<double>> is empty
At T1 (in fact only one vector<double> comes in at a time)
A = 1, 2, 4
B = 1, 3, 4, 5
C = 6, 7
The vector<vector> gets merge-sorted into,
1
1
2
3
4
4
5
6
7
At T2
C = 0, 4
D = 3, 7
The new list would be
0
1
1
2
3
3
4
4
4
5
7
So first we have to remove the old values of C, then "insert" the new values of C correctly.
A function like AVL_Tree Func(vector<vector<double>> vecvec, vector<double> newVec) that returns a tree would seem to be best. Is an AVL tree the right choice? Can someone show me a templatized C++ version of code that would work? Using Boost, the STL, etc. is fine.
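One possible shape for this, offered as a sketch rather than a benchmarked answer: keep all values in a std::multiset (a balanced search tree in the common standard library implementations, typically a red-black tree rather than an AVL tree) and remember what each source last contributed, so an update can erase the old values before inserting the new ones. The class name MergedView and the string keys for A, B, C, D are assumptions made for illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical container: a sorted merged view over several named sources.
class MergedView {
public:
    // Replaces whatever `key` contributed before with `values`.
    void update(const std::string& key, const std::vector<double>& values) {
        auto it = sources_.find(key);
        if (it != sources_.end()) {
            for (double v : it->second) {
                auto pos = merged_.find(v);
                // erase by iterator removes one copy, not every equal value
                if (pos != merged_.end()) merged_.erase(pos);
            }
        }
        for (double v : values) merged_.insert(v);
        sources_[key] = values;
    }

    // Iterating the multiset visits the values in ascending order.
    const std::multiset<double>& merged() const { return merged_; }

private:
    std::map<std::string, std::vector<double>> sources_;
    std::multiset<double> merged_;
};
```

Feeding it A, B, C from T1 and then the new C and D from T2 gives the two merged lists shown in the question. An update with k old and k new values costs O(k log n), where n is the total number of stored values.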