I am using the C++ frontend for PyTorch and am struggling with a relatively basic indexing problem.
I have an 8 by 6 Tensor such as the one below:
[ Variable[CUDAFloatType]{8,6} ]
0 1 2 3 4 5
0 1.7107e-14 4.0448e-17 4.9708e-06 1.1664e-08 9.9999e-01 2.1857e-20
1 1.8288e-14 5.9356e-17 5.3042e-06 1.2369e-08 9.9999e-01 2.4799e-20
2 2.6828e-04 9.0390e-18 1.7517e-02 1.0529e-03 9.8116e-01 6.7854e-26
3 5.7521e-10 3.1037e-11 1.5021e-03 1.2304e-06 9.9850e-01 1.4888e-17
4 1.7811e-13 1.8383e-15 1.6733e-05 3.8466e-08 9.9998e-01 5.2815e-20
5 9.6191e-06 2.6217e-23 3.1345e-02 2.3024e-04 9.6842e-01 2.9435e-34
6 2.2653e-04 8.4642e-18 1.6085e-02 9.7405e-04 9.8271e-01 6.3059e-26
7 3.8951e-14 2.9903e-16 8.3518e-06 1.7974e-08 9.9999e-01 3.6993e-20
I have another Tensor with just 8 elements in it such as:
[ Variable[CUDALongType]{8} ]
0
3
4
4
4
4
4
4
I would like to pick one element from each row of the first tensor, using the second tensor as the column index for that row, to produce:
0
0 1.7107e-14
1 1.2369e-08
2 9.8116e-01
3 9.9850e-01
4 9.9998e-01
5 9.6842e-01
6 9.8271e-01
7 9.9999e-01
I have tried a few different approaches including index_select but it seems to produce an output that has the same dimensions as the input (8x6).
In Python I think I could index with Python's built-in indexing as discussed here: https://github.com/pytorch/pytorch/issues/1080
Unfortunately, in C++ I can only index a Tensor with a scalar (zero-dimensional Tensor) so I don't think that approach works for me here.
How can I achieve my desired result without resorting to loops?
It turns out you can do this in a couple of different ways: one with gather and one with index. From the PyTorch discussion forum, where I asked the same question:
Using torch::gather
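auto x = torch::randn({8, 6});
int64_t idx_data[8] = { 0, 3, 4, 4, 4, 4, 4, 4 };
// torch::from_blob wraps idx_data without copying; older releases spelled this
// x.type().toScalarType(torch::kLong).tensorFromBlob(idx_data, 8)
auto idx = torch::from_blob(idx_data, {8}, torch::kLong);
// gather along dim 1 needs an {8, 1} index; result has shape {8, 1}
auto result = x.gather(1, idx.unsqueeze(1));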
Using the C++-specific torch::index
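auto x = torch::randn({8, 6});
int64_t idx_data[8] = { 0, 3, 4, 4, 4, 4, 4, 4 };
// torch::from_blob wraps idx_data without copying; older releases spelled this
// x.type().toScalarType(torch::kLong).tensorFromBlob(idx_data, 8)
auto idx = torch::from_blob(idx_data, {8}, torch::kLong);
// Pair row i with column idx[i]; result has shape {8}
auto rows = torch::arange(0, x.size(0), torch::kLong);
auto result = x.index({rows, idx});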
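Both calls compute result[i] = x[i][idx[i]]. As a rough reference for checking the output on small CPU data, the same selection can be sketched in plain C++ without libtorch. This does use a loop, so it is only meant for verification, and pick_per_row is a made-up helper name, not part of any library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference semantics of the gather/index calls: out[i] = x[i][idx[i]].
// Hypothetical helper for checking results, not a libtorch function.
std::vector<float> pick_per_row(const std::vector<std::vector<float>>& x,
                                const std::vector<std::int64_t>& idx) {
    assert(x.size() == idx.size());  // one column index per row
    std::vector<float> out;
    out.reserve(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        // at() throws std::out_of_range if idx[i] is not a valid column
        out.push_back(x[i].at(static_cast<std::size_t>(idx[i])));
    }
    return out;
}
```

Copying the tensor to the CPU and comparing it against this helper is one way to convince yourself that the gather and index versions agree.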
I have a list of numbers stored in a standard vector. Some of the numbers are children of other numbers. Here is an example:
3, 4
3, 5
5, 6
7, 3
8, 9
8, 1
8, 2
9, 8
Or as a graph:
1 2 3-4 5-6 7 8-9
|-------------|
|-----------|
|---|
|-------|
That is, there are two clusters: 3,4,5,6,7 and 1,2,8,9. The root number of a cluster is its smallest number, here 3 and 1. I would like to know which algorithms I can use to extract a list like this:
3, 4
3, 5
3, 6
3, 7
1, 2
1, 8
1, 9
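An algorithm similar to disjoint-set union (union-find) can help you:
Initialize N disjoint subsets, each containing exactly one number, so that the root of number i, r(i), is i itself.
For each edge (u, v), first follow the r links from u and from v until you reach their current roots ru and rv (a root is a number whose r value is itself). Then assign:
t = min(ru, rv)
r(ru) = t
r(rv) = t
Updating the roots rather than only u and v matters: numbers merged earlier would otherwise keep pointing at a stale root.
Finally, for each i, follow the r links to its root; if the root differs from i, write out the pair (root, i).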
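The steps above can be sketched in C++ roughly as follows. This is a minimal illustration rather than a tuned implementation: the names find_root and root_pairs are made up for this example, a std::map is assumed so the numbers need not be contiguous, and there is no path compression or union by rank.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Follow the root links until reaching a number that is its own root.
int find_root(const std::map<int, int>& root, int x) {
    assert(root.count(x) == 1);  // x must have been registered
    while (root.at(x) != x) {
        x = root.at(x);
    }
    return x;
}

// For every non-root number, return the pair (smallest number of its cluster, number).
std::vector<std::pair<int, int>> root_pairs(
        const std::vector<std::pair<int, int>>& edges) {
    std::map<int, int> root;
    for (const auto& e : edges) {  // every number starts as its own root
        root.emplace(e.first, e.first);
        root.emplace(e.second, e.second);
    }
    for (const auto& e : edges) {  // merge clusters, keeping the smaller root
        const int ru = find_root(root, e.first);
        const int rv = find_root(root, e.second);
        const int t = std::min(ru, rv);
        root[ru] = t;
        root[rv] = t;
    }
    std::vector<std::pair<int, int>> out;
    for (const auto& kv : root) {
        const int r = find_root(root, kv.first);
        if (r != kv.first) {
            out.emplace_back(r, kv.first);
        }
    }
    std::sort(out.begin(), out.end());  // group the pairs by root
    return out;
}
```

Because a root is only ever replaced by a smaller number, the links cannot form a cycle and each cluster ends up rooted at its minimum. For the edges in the question this yields the pairs for root 1 followed by the pairs for root 3.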
sticks = int(raw_input())
stickList = map(int, raw_input().split())
stickList = sorted(stickList)
for i in xrange(0, len(stickList)):
    stickList[i] = stickList[i] - stickList[0]
print stickList
Given the input:
6
5 4 4 2 2 8
why is the output [0, 2, 4, 4, 5, 8]
instead of [0, 0, 2, 2, 3, 6]?
That is because you are modifying stickList in place inside the for loop.
After the first iteration stickList[0] is 0, so every remaining iteration subtracts 0.
As ShadowRanger mentioned, iterating over the list in reverse will do, because stickList[0] is then modified last:
stickList = map(int, "5 4 4 2 2 8".split())
stickList.sort()
for i in reversed(xrange(len(stickList))):
    stickList[i] -= stickList[0]
print stickList
I have, for example, the following matrix B, which is stored in COO and CSR format (retrieved from the non-symmetric example here). Could you please suggest an efficient C++ way to apply the MATLAB sum(B,2) function using the COO or CSR storage format (or both)? Because the arrays may well be large, can this be done with parallel programming (OpenMP or CUDA, e.g. Thrust)?
Any algorithmic or library based suggestions are highly appreciated.
Thank you!
PS: Code to construct a sparse matrix and get the CSR coordinates can be found for example in the answer of this post.
COO format:                      CSR format:
row_index  col_index  value      columns  row_index  value
1          1           1         0        0           1
1          2          -1         1        3          -1
1          3          -3         3        5          -3
2          1          -2         0        8          -2
2          2           5         1        11          5
3          3           4         2        13          4
3          4           6         3                    6
3          5           4         4                    4
4          1          -4         0                   -4
4          3           2         2                    2
4          4           7         3                    7
5          2           8         1                    8
5          5          -5         4                   -5
For COO it's pretty simple:
// Needs <cstddef> and <vector>.
struct MatrixEntry {
    size_t row;  // 1-based, as in the COO table
    size_t col;  // 1-based
    int value;
};
std::vector<MatrixEntry> matrix = {
    { 1, 1, 1 },
    { 1, 2, -1 },
    { 1, 3, -3 },
    { 2, 1, -2 },
    { 2, 2, 5 },
    { 3, 3, 4 },
    { 3, 4, 6 },
    { 3, 5, 4 },
    { 4, 1, -4 },
    { 4, 3, 2 },
    { 4, 4, 7 },
    { 5, 2, 8 },
    { 5, 5, -5 },
};
std::vector<int> sum(5);  // one accumulator per row, zero-initialized
for (const auto& e : matrix) {
    sum[e.row-1] += e.value;  // convert the 1-based row to a 0-based index
}
For large matrices you can split the for loop into several smaller ranges, accumulate each range into its own partial sum vector, and add the partial results at the end.
If you only need the sum of each row (and not of each column), CSR is also straightforward (and even more efficient):
// Needs <numeric> for std::accumulate and <vector>.
std::vector<int> row_idx = { 0, 3, 5, 8, 11, 13 };
std::vector<int> value = { 1, -1, -3, -2, 5, 4, 6, 4, -4, 2, 7, 8, -5 };
std::vector<int> sum(5);
// Row i owns the values in the half-open range [row_idx[i], row_idx[i + 1]).
for (size_t i = 0; i < row_idx.size()-1; ++i) {
    sum[i] = std::accumulate(value.begin() + row_idx[i], value.begin() + row_idx[i + 1], 0);
}
Again, for parallelism you can simply split up the loop.
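As a sketch of what "splitting up the loop" could look like for the CSR case, here is one possible OpenMP version. The function name csr_row_sums is made up for this example, and int values are assumed as in the snippets above. Each iteration writes only its own sums[i], so no synchronization is needed; if the code is compiled without OpenMP support the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Row sums of a CSR matrix, i.e. the equivalent of MATLAB's sum(B, 2).
// row_ptr holds one start offset per row plus a final end offset; the
// column indices are not needed for row sums.
std::vector<int> csr_row_sums(const std::vector<int>& row_ptr,
                              const std::vector<int>& values) {
    assert(!row_ptr.empty());
    const std::ptrdiff_t n_rows =
        static_cast<std::ptrdiff_t>(row_ptr.size()) - 1;
    std::vector<int> sums(static_cast<std::size_t>(n_rows), 0);

    // Rows are independent, so the iterations can be distributed over threads.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n_rows; ++i) {
        sums[i] = std::accumulate(values.begin() + row_ptr[i],
                                  values.begin() + row_ptr[i + 1], 0);
    }
    return sums;
}
```

With GCC or Clang the pragma takes effect when compiling with -fopenmp. The COO loop is less convenient to parallelize this way, because several entries update the same sum element, which is why per-range partial sums are suggested for it.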
What is the fastest algorithm to keep a vector<vector<double>> continuously "merge" sorted while handling updates in real time?
For example, at T0 the vector<vector<double>> is empty.
At T1 the following vectors arrive (in fact only one vector<double> comes in at a time):
A = 1, 2, 4
B = 1, 3, 4, 5
C = 6, 7
The vector<vector<double>> gets merge-sorted into:
1
1
2
3
4
4
5
6
7
At T2
C = 0, 4
D = 3, 7
The new list would be
0
1
1
2
3
3
4
4
4
5
7
So first we have to remove the old values of C, then "insert" the new values of C correctly.
A function such as AVL_Tree Func(vector<vector<double>> vecvec, vector<double> newVec) that returns a tree would seem to be the best fit. Is an AVL tree the right structure? Can someone show me a templated C++ version of code that would work? Using Boost, the STL, etc. is fine.
I have a matrix in SAS/IML:
x = {7 6 3 3 8,
2 3 5 2 5,
2 6 4 3 8,
7 4 8 1 3,
8 8 6 8 7,
3 2 6 1 5 };
I want to create a new matrix that contains the highest k values of each column in x. For example, if k=3, I want the result matrix to contain:
8 8 8 8 8
7 6 6 3 8
7 6 6 3 7
because, for instance, the largest 3 numbers in the first column of x are 8, 7, and 7.
I've unsuccessfully tried to figure out how to do this using the rank function.
Your code looks fine. Here's a minor revision:
do c=1 to ncol(x);
   r = rank(x[,c]);
   y = x[loc(r>=nrow(x)-k+1), c];
   call sort(y);
   tops[,c] = y;
end;
As to avoiding the loop to make it faster, it's not necessary. Even with 10,000 columns, this code runs in a fraction of a second. Try running the following timing code:
x = j(500, 10000);
call randgen(x,"normal");
k = 3;
t0=time();
tops = j(k,ncol(x),0);
do c=1 to ncol(x);
   r = rank(x[,c]);
   y = x[loc(r>=nrow(x)-k+1), c];
   call sort(y);
   tops[,c] = y;
end;
t=time()-t0;
print t;
Here's a partial answer I've come up with:
k = 3;
tops = j(k,ncol(x),0);
do c=1 to ncol(x);
   r = rank(x[,c]);
   h = loc(r>=nrow(x)-k+1);
   tops[,c] = x[,c][h];
end;
This approach uses a loop, which I'd like to avoid, so please post improvements if possible!