I have two square matrices, A and B.
I must convert B to CSR format and determine the product C:
A * B_csr = C
I have found a lot of information online regarding CSR Matrix - Vector multiplication. The algorithm is:
for (i = 0; i < N; i = i + 1)
    result[i] = 0;
for (i = 0; i < N; i = i + 1)
{
    for (k = RowPtr[i]; k < RowPtr[i+1]; k = k + 1)
    {
        result[i] = result[i] + Val[k]*d[Col[k]];
    }
}
However, I require Matrix - Matrix multiplication.
Further, it seems that most algorithms apply A_csr - vector multiplication where I require A * B_csr. My solution is to transpose the two matrices before converting then transpose the final product.
Can someone explain how to compute a Matrix - CSR Matrix product and/or a CSR Matrix - Matrix product?
Here is a simple solution in Python for the Dense Matrix X CSR Matrix. It should be self-explanatory.
def main():
    # 4 x 4 csr matrix
    # [1, 0, 0, 0],
    # [2, 0, 3, 0],
    # [0, 0, 0, 0],
    # [0, 4, 0, 0],
    csr_values = [1, 2, 3, 4]
    col_idx = [0, 0, 2, 1]
    row_ptr = [0, 1, 3, 3, 4]
    csr_matrix = [
        csr_values,
        col_idx,
        row_ptr
    ]
    dense_matrix = [
        [1, 3, 3, 4],
        [1, 2, 3, 4],
        [1, 4, 3, 4],
        [1, 2, 3, 5],
    ]
    res = [
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    # matrix order, assumes both matrices are square
    n = len(dense_matrix)
    # res = dense X csr
    csr_row = 0  # Current row in CSR matrix
    for i in range(n):
        start, end = row_ptr[i], row_ptr[i + 1]
        for j in range(start, end):
            col, csr_value = col_idx[j], csr_values[j]
            for k in range(n):
                dense_value = dense_matrix[k][csr_row]
                res[k][col] += csr_value * dense_value
        csr_row += 1
    print(res)


if __name__ == '__main__':
    main()
CSR Matrix X Dense Matrix is really just a sequence of CSR Matrix X Vector products, one for each column of the dense matrix, right? So it should be really easy to extend the code you show above to do this.
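A minimal Python sketch of that idea, assuming the same three-array CSR layout (values, column indexes, row pointers) used in the earlier answer. The helper names csr_matvec and csr_times_dense are made up for illustration; each column of the dense matrix is treated as a vector and pushed through the CSR Matrix X Vector routine.

```python
def csr_matvec(values, col_idx, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector."""
    n_rows = len(row_ptr) - 1
    result = [0] * n_rows
    for i in range(n_rows):
        # Entries of row i live in values[row_ptr[i]:row_ptr[i + 1]]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            result[i] += values[k] * vec[col_idx[k]]
    return result


def csr_times_dense(values, col_idx, row_ptr, dense):
    """Compute CSR x dense, one column of the dense matrix at a time."""
    n_rows = len(row_ptr) - 1
    n_cols = len(dense[0])
    res = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in dense]  # j-th column as a vector
        product = csr_matvec(values, col_idx, row_ptr, column)
        for i in range(n_rows):
            res[i][j] = product[i]
    return res


# Same 4 x 4 example data as in the dense X csr answer
csr_values = [1, 2, 3, 4]
col_idx = [0, 0, 2, 1]
row_ptr = [0, 1, 3, 3, 4]
dense_matrix = [
    [1, 3, 3, 4],
    [1, 2, 3, 4],
    [1, 4, 3, 4],
    [1, 2, 3, 5],
]
result = csr_times_dense(csr_values, col_idx, row_ptr, dense_matrix)
print(result)
```

Extracting each column costs an extra pass over the dense matrix; a tuned library would avoid that copy, which is one more reason to prefer an existing implementation for real work.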
Moving forward, I suggest you don't code these routines yourself. If you are using C++ (based on the tag), then you could have a look at Boost uBLAS, for example, or Eigen. The APIs may seem a bit cryptic at first, but it's really worth it in the long term. First, you gain access to a lot more functionality, which you will probably require in the future. Second, these implementations will be better optimised.
I have a MatrixXi, say
[0, 1, 2]
[0, 2, 3]
[4, 7, 6]
[4, 6, 5]
[0, 4, 5]
[0, 5, 1]
[1, 5, 6]
I get a part of it by doing:
MatrixXi MR = F.middleRows(first, last);
with first and last at will. Now I'd like to turn those n rows into a column VectorXi, like:
[0,
1,
2,
0,
2,
3]
possibly without using a for loop. I've tried:
VectorXi VRT(MR.rows() * MR.cols());
VRT.tail(MR.rows() * MR.cols()) = MR.array();
But I get:
Assertion failed: (rows == this->rows() && cols == this->cols() && "DenseBase::resize() does not actually allow to resize."), function resize, file /Users/max/Developer/Stage/Workspace/AutoTools3D/dep/libigl/external/eigen/Eigen/src/Core/DenseBase.h, line 257.
How do I get that? I'm using an Eigen version before 3.4, so I cannot use reshaped()...
Thank you
As pointed out by chtz, this works:
Eigen::VectorXi VR(MR.size());
Eigen::MatrixXi::Map(VR.data(), MR.cols(), MR.rows()) =
MR.transpose();
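Why this works can be illustrated with a NumPy analogy (this is not Eigen code, just a sketch of the same memory-layout idea). Eigen stores matrices column-major by default, so viewing the vector's buffer as a column-major cols x rows matrix and writing the transpose into it lays the rows of MR out one after another.

```python
import numpy as np

MR = np.array([[0, 1, 2],
               [0, 2, 3]])

# Flat destination buffer, like VectorXi VR(MR.size())
VR = np.empty(MR.size, dtype=MR.dtype)

# View the same memory as a column-major (cols x rows) matrix,
# which plays the role of MatrixXi::Map(VR.data(), MR.cols(), MR.rows())
view = VR.reshape((MR.shape[1], MR.shape[0]), order="F")

# Writing the transpose through the view fills VR row by row of MR
view[...] = MR.T
print(VR)  # [0 1 2 0 2 3]
```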
I have a problem: I want to generate all permutations of a list (in Prolog) that contains n zeros and 24 - n ones, without repetitions. I've tried findall(L, permutation(L,P), Bag) and then sorting the result to remove repetitions, but it causes a stack overflow. Does anyone have an efficient way to do this?
Instead of thinking about lists, think about binary numbers. The list will have a length of 24 elements. If all those elements are 1's we have:
?- X is 0b111111111111111111111111.
X = 16777215.
The de facto standard predicate between/3 can be used to generate numbers in the interval [0, 16777215]:
?- between(0, 16777215, N).
N = 0 ;
N = 1 ;
N = 2 ;
...
Only some of these numbers satisfy your condition. Thus, you will need to filter/test them and then convert the numbers that pass into a list representation of its binary equivalent.
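The filter-and-convert step can be sketched in Python (not Prolog) as follows. The function name bit_lists is made up for illustration, and a length of 6 is used instead of 24 so the run stays small; the approach visits all 2^length numbers, so it is simple rather than fast.

```python
def bit_lists(length, zero_count):
    """Yield every 0/1 list of the given length with exactly zero_count zeros."""
    one_count = length - zero_count
    for number in range(2 ** length):
        # Keep only the numbers whose binary form has the right number of ones
        if bin(number).count("1") == one_count:
            # Zero-padded binary digits, converted to a list of ints
            yield [int(ch) for ch in format(number, "0{}b".format(length))]


lists = list(bit_lists(6, 2))
print(len(lists))
print(lists[0])
```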
Select n distinct numbers between 0 and 23 in ascending order. These integers give you the indexes of the zeros, and every such selection yields a different configuration. The key is generating these lists of indexes.
%
% We need N monotonically increasing integer numbers (to be used
% as indexes) from [From,To].
%
need_indexes(N,From,To,Sol) :-
N>0,
!,
Delta is To-From+1,
N=<Delta, % Still have a chance to generate them all
N_less is N-1,
From_plus is From+1,
(
% Case 1: "From" is selected into the collection of index values
(need_indexes(N_less,From_plus,To,SubSol),Sol=[From|SubSol])
;
% Case 2: "From" is not selected, which is only possible if N<Delta
(N<Delta -> need_indexes(N,From_plus,To,Sol))
).
need_indexes(0,_,_,[]).
Now we can get lists of indexes picked from the available indexes.
For example:
Give me 5 indexes from 0 to 23 (inclusive):
?- need_indexes(5,0,23,Collected).
Collected = [0, 1, 2, 3, 4] ;
Collected = [0, 1, 2, 3, 5] ;
Collected = [0, 1, 2, 3, 6] ;
Collected = [0, 1, 2, 3, 7] ;
...
Give them all:
?- findall(Collected,need_indexes(5,0,23,Collected),L),length(L,LL).
L = [[0, 1, 2, 3, 4], [0, 1, 2, 3, 5], [0, 1, 2, 3, 6], [0, 1, 2, 3, 7], [0, 1, 2, 3|...], [0, 1, 2|...], [0, 1|...], [0|...], [...|...]|...],
LL = 42504.
We are expecting: (24! / ((24-5)! * 5!)) solutions.
Indeed:
?- L is 20*21*22*23*24 / (1*2*3*4*5).
L = 42504.
Now the only problem is transforming every solution like [0, 1, 2, 3, 4] into a string of 0 and 1. This is left as an exercise!
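For comparison, here is how the whole idea (pick the zero positions, then build the string) might look in Python rather than Prolog. This is only a sketch: zero_one_strings is a made-up name, and itertools.combinations plays the role of need_indexes by yielding ascending index tuples.

```python
from itertools import combinations


def zero_one_strings(length, zero_count):
    """Yield every string of '0'/'1' with exactly zero_count zeros."""
    for zero_positions in combinations(range(length), zero_count):
        chars = ["1"] * length
        for pos in zero_positions:
            chars[pos] = "0"  # place a zero at each selected index
        yield "".join(chars)


strings = list(zero_one_strings(24, 5))
print(len(strings))
print(strings[0])
```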
Here is an even simpler answer to generate strings directly. Very direct.
need_list(ZeroCount,OneCount,Sol) :-
length(Zs,ZeroCount),maplist([X]>>(X='0'),Zs),
length(Os,OneCount),maplist([X]>>(X='1'),Os),
compose(Zs,Os,Sol).
compose([Z|Zs],[O|Os],[Z|More]) :- compose(Zs,[O|Os],More).
compose([Z|Zs],[O|Os],[O|More]) :- compose([Z|Zs],Os,More).
compose([],[O|Os],[O|More]) :- !,compose([],Os,More).
compose([Z|Zs],[],[Z|More]) :- !,compose(Zs,[],More).
compose([],[],[]).
rt(ZeroCount,Sol) :-
ZeroCount >= 0,
ZeroCount =< 24,
OneCount is 24-ZeroCount,
need_list(ZeroCount,OneCount,SolList),
atom_chars(Sol,SolList).
?- rt(20,Sol).
Sol = '000000000000000000001111' ;
Sol = '000000000000000000010111' ;
Sol = '000000000000000000011011' ;
Sol = '000000000000000000011101' ;
Sol = '000000000000000000011110' ;
Sol = '000000000000000000100111' ;
Sol = '000000000000000000101011' ;
Sol = '000000000000000000101101' ;
Sol = '000000000000000000101110' ;
Sol = '000000000000000000110011' ;
Sol = '000000000000000000110101' ;
....
?- findall(Collected,rt(5,Collected),L),length(L,LL).
L = ['000001111111111111111111', '000010111111111111111111', '000011011111111111111111', '000011101111111111111111', '000011110111111111111111', '000011111011111111111111', '000011111101111111111111', '000011111110111111111111', '000011111111011111111111'|...],
LL = 42504.
The operation takes two arrays, X and idx, of equal length, where the values of idx range from 0 to (k-1) and k is given. The elements of X are grouped by their corresponding value in idx.
This is the general Python code to illustrate this.
import numpy as np
X = np.arange(6) # Just for a sample of elements
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T # Can only contain values in [0..(k-1)]
np.array([X[np.where(idx==i)[0]] for i in range(k)])
Sample output:
array([[0, 4],
[1, 5],
[2, 3]])
Note that there is actually a reason for me to represent idx as a matrix and not as a vector. It was initialised to numpy.zeros((n,1)) as part of its computation, where n is the size of X.
I tried to implement this in Theano like so:
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c = theano.scan(lambda i: X[T.where(T.eq(idx,i))], sequences=T.arange(k))
f = function([X,idx,k],c)
But I received this error at line where c is defined:
TypeError: Wrong number of inputs for Switch.make_node (got 1((<int8>,)), expected 3)
Is there a simple way to implement this in Theano?
Use nonzero() and correct the dimensions of idx.
This code solved the problem
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c, updates = theano.scan(lambda i: X[T.eq(idx,i).nonzero()], sequences=T.arange(k))
f = theano.function([X,idx,k],c)
For the same example, through the use of Theano:
import numpy as np
X = np.arange(6)
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T
f(X, idx.T[0], k).astype(int)
This gives the output as
array([[0, 4],
[1, 5],
[2, 3]])
If idx is defined as np.array([0, 1, 2, 2, 0, 1]), then f(X, idx, k) can be used instead.
I've got a theano function which computes euclidean distances for 2 matrices—X (n vectors x k features) and Y (m vectors x k features). The result is an n x m matrix of pairwise distances of each vector (or row) in X from each vector (or row) in Y.
import theano
from theano import tensor as T
X, Y = T.dmatrices('X', 'Y')
X_squared_sum = T.sum(X ** 2, axis=1, keepdims=True)
Y_squared_sum = T.sum(Y.T ** 2, axis=0, keepdims=True)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
f_distance = theano.function([X, Y], T.sqrt(squared_distances))
Let's say I change the above function to accept a single vector, an array of vectors, and the number of smallest distances. What I want is a theano function that will find the N smallest distances, similar to below:
import numpy as np
import theano
from theano import tensor as T
X = T.dvector('X')
Y = T.dmatrix('Y')
N = T.iscalar('N')
X_squared_sum = T.dot(X, X)
Y_squared_sum = T.sum(Y.T ** 2, axis=0)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
dist_sorted = T.FIND_N_SMALLEST(T.sqrt(squared_distances), N)
n_closest = theano.function([X, Y, N], dist_sorted)
U = np.array([1, 1, 1, 1])  # 1-D, to match the dvector input X
V = np.array([
[ 4, 4, 4, 4],
[ 2, 2, 2, 2],
[ 3, 3, 3, 3],
[ 1, 1, 1, 1]])
n_closest(U, V, 2) # [0.0, 2.0]
I'd like to avoid explicitly sorting all the distances, since the number that I want will generally be much much smaller than the total number of distances.
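Outside Theano, the "N smallest without a full sort" step can be sketched in plain NumPy with np.partition, which only guarantees that the first N entries are the N smallest. This is an illustration of the selection idea, not a Theano op; the function name n_smallest_distances is made up, and it assumes 1 <= n <= number of rows in Y.

```python
import numpy as np


def n_smallest_distances(x, Y, n):
    """Return the n smallest Euclidean distances from vector x to the rows of Y."""
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Same expansion as above: |x|^2 + |y|^2 - 2 x.y for every row y
    squared = np.dot(x, x) + np.sum(Y ** 2, axis=1) - 2.0 * np.dot(Y, x)
    squared = np.maximum(squared, 0.0)  # guard against tiny negative round-off
    # Partial selection: the first n entries are the n smallest, in any order
    smallest = np.partition(squared, n - 1)[:n]
    # Only these n values are sorted, not the whole array
    return np.sqrt(np.sort(smallest))


U = np.array([1, 1, 1, 1])
V = np.array([
    [4, 4, 4, 4],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
    [1, 1, 1, 1]])
closest = n_smallest_distances(U, V, 2)
print(closest)
```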
I have a 2-dimensional array of ones and zeros called M where the g rows represent groups and the a columns represent articles. M maps groups and articles. If a given article "art" belongs to group "gr" then we have M[gr,art]=1; if not we have M[gr,art]=0.
Now, I would like to convert M into a square a x a matrix of ones and zeros (call it N) where, if an article "art1" is in the same group as article "art2", we have N[art1,art2]=1, and N[art1,art2]=0 otherwise. N is clearly symmetric with 1's on the diagonal.
How do I construct N based on M?
Many thanks for your suggestions - and sorry if this is trivial (still new to python...)!
So you have a boolean matrix M like the following:
>>> M
array([[1, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 1],
[0, 0, 1, 0, 0, 0],
[1, 0, 1, 0, 0, 0]])
>>> ngroups, narticles = M.shape
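and what you want is a matrix of shape (narticles, narticles) that represents co-occurrence. With groups as rows and articles as columns, that's simply the transpose of M multiplied by M:
>>> np.dot(M.T, M)
array([[2, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [1, 0, 2, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 1]])
... except that you don't want counts, so set entries > 0 to 1.
>>> N = np.dot(M.T, M)
>>> N[N > 0] = 1
>>> N
array([[1, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 1]])
Note that np.dot(M, M.T) would instead give a (ngroups, ngroups) matrix of groups that share an article. Articles that belong to no group (columns 1 and 4 here) keep a 0 on the diagonal.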
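As a self-contained sketch, assuming the layout from the question (groups as rows, articles as columns), the article-by-article matrix comes from the transpose of M times M. The function name article_cooccurrence and the force_diagonal flag are made up for illustration; the flag covers the question's expectation of 1's on the diagonal even for articles that appear in no group.

```python
import numpy as np

M = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0, 0]])


def article_cooccurrence(M, force_diagonal=False):
    """Return an (articles x articles) 0/1 matrix of shared group membership."""
    # (M.T @ M)[i, j] counts the groups containing both article i and article j
    N = (np.dot(M.T, M) > 0).astype(int)
    if force_diagonal:
        # Treat every article as co-occurring with itself, even if ungrouped
        np.fill_diagonal(N, 1)
    return N


N = article_cooccurrence(M)
N_diag = article_cooccurrence(M, force_diagonal=True)
print(N)
```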