I'm using SymPy to do linear algebra. I want to perform an element-wise multiplication (Hadamard product) on two matrices.
For example,
sympy.MatrixSymbol('X', 4, 3) [operator/method] sympy.MatrixSymbol('W', 4, 3)
would give
[[X[0,0]*W[0,0], X[0,1]*W[0,1], ...],[X[1,0]*W[1,0], X[1,1]*W[1,1], ...]]
But it seems that there isn't a method for it. Is there any way to perform an element-wise multiplication with SymPy?
Yes, SymPy has a method for element-wise multiplication (Hadamard product). According to the SymPy 0.7.6 documentation, it is:
multiply_elementwise(b)
Returns the Hadamard product (elementwise product) of A and B.
Example:
>>> from sympy.matrices import Matrix
>>> A = Matrix([[0, 1, 2], [3, 4, 5]])
>>> B = Matrix([[1, 10, 100], [100, 10, 1]])
>>> A.multiply_elementwise(B)
Matrix([
[ 0, 10, 200],
[300, 40, 5]])
Update: For element-wise multiplication of MatrixSymbols use the following function:
HadamardProduct(A, B)
For example:
>>> from sympy import HadamardProduct, MatrixSymbol, symbols
>>> m, n = symbols('m n', integer=True)
>>> A = MatrixSymbol('A', m, n)
>>> B = MatrixSymbol('B', m, n)
>>> print(HadamardProduct(A, B))
A.*B
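To get the explicit entry-by-entry result asked for in the question, one option is to expand each MatrixSymbol with as_explicit() and then call multiply_elementwise on the resulting matrices. This is only a sketch, and it assumes the symbols have concrete integer dimensions (4 x 3 here, as in the question); the variable name product is mine.

```python
from sympy import MatrixSymbol

X = MatrixSymbol('X', 4, 3)
W = MatrixSymbol('W', 4, 3)

# as_explicit() turns a MatrixSymbol with concrete dimensions into a
# matrix whose entries are the indexed symbols X[i, j].
product = X.as_explicit().multiply_elementwise(W.as_explicit())

print(product[0, 0])   # the (0, 0) entry, a product of X[0, 0] and W[0, 0]
print(product.shape)
```

Unlike HadamardProduct(A, B), which stays a symbolic expression, this gives a concrete matrix of element products.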
Related
I am using scipy and its cdist function to compute a distance matrix from an array of vectors.
import numpy as np
from scipy.spatial import distance
vectorList = [(0, 10), (4, 8), (9.0, 11.0), (14, 14), (16, 19), (25.5, 17.5), (35, 16)]
#Convert to numpy array
arr = np.array(vectorList)
#Computes distances matrix and set self-comparisons to NaN
d = distance.cdist(arr, arr)
np.fill_diagonal(d, np.nan)
Let's say I want to return all the distances that are below a specific threshold (6 for example)
#Find pairs of vectors whose separation distance is < 6
id1, id2 = np.nonzero(d<6)
#id1 --> array([0, 1, 1, 2, 2, 3, 3, 4])
#id2 --> array([1, 0, 2, 1, 3, 2, 4, 3])
I now have 2 arrays of indices.
Question: how can I return the distances between these pairs of vectors as an array / list ?
4.47213595499958 #d[0][1]
4.47213595499958 #d[1][0]
5.830951894845301 #d[1][2]
5.830951894845301 #d[2][1]
5.830951894845301 #d[2][3]
5.830951894845301 #d[3][2]
5.385164807134504 #d[3][4]
5.385164807134504 #d[4][3]
d[id1][id2] returns a matrix, not a list, and the only way I have found so far is to iterate over the index pairs again, which doesn't make sense:
np.array([d[i1][i2] for i1, i2 in zip(id1, id2)])
Use
d[id1, id2]
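This is the form used in the numpy.nonzero documentation example (i.e. a[np.nonzero(a > 3)]), and it differs from the d[id1][id2] you are using: d[id1, id2] pairs the two index arrays element by element, whereas d[id1][id2] first selects rows with id1 and then selects rows of that result with id2.
See arrays.indexing in the NumPy documentation for more details on indexing.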
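The difference between the two indexing forms can be seen in a small sketch. To keep it free of SciPy, the distance matrix is built here with plain NumPy broadcasting instead of cdist; that substitution and the names points, pair_distances and chained are my own.

```python
import numpy as np

points = np.array([(0, 10), (4, 8), (9.0, 11.0), (14, 14),
                   (16, 19), (25.5, 17.5), (35, 16)])

# Pairwise Euclidean distances via broadcasting (stands in for cdist).
diff = points[:, None, :] - points[None, :, :]
d = np.sqrt((diff ** 2).sum(axis=-1))
np.fill_diagonal(d, np.nan)

# Comparisons with NaN are False, so the diagonal is never selected.
with np.errstate(invalid="ignore"):
    id1, id2 = np.nonzero(d < 6)

pair_distances = d[id1, id2]   # one value per (id1[i], id2[i]) pair
chained = d[id1][id2]          # rows picked by id1, then rows picked again by id2

print(pair_distances.shape)    # 1-D, one entry per pair
print(chained.shape)           # 2-D, not what the question wants
```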
Let's say we have two matrices, A and B.
A has the shape (r, k) and B has the shape (r, l).
Now I want to calculate the np.outer product of these two matrices row by row, and then sum the resulting outer products over axis 0. So my result matrix should have the shape (k, l).
E.g.:
Form of A is (4, 2), of B is (4, 3).
import numpy as np
A = np.array([[0, 7], [4, 1], [0, 2], [0, 5]])
B = np.array([[9, 7, 7], [6, 7, 5], [2, 7, 9], [6, 9, 7]])
# This is the first outer product for the first values of A and B
print(np.outer(A[0], B[0])) # This gives [[0, 0, 0], [63, 49, 49]]
# First possibility is to use a list comprehension and then sum over axis 0
sum1 = np.sum([np.outer(x, y) for x, y in zip(A, B)], axis=0)
# Second possibility would be to use the reduce function (in functools on Python 3)
from functools import reduce
sum2 = reduce(lambda acc, xy: acc + np.outer(xy[0], xy[1]), zip(A, B), np.zeros((A.shape[1], B.shape[1])))
# result for sum1 or sum2 looks like this (sum2 holds floats because it starts from np.zeros):
# array([[ 24., 28., 20.], [ 103., 115., 107.]])
I'm wondering whether there is a better or faster solution, because with, e.g., two matrices of more than 10,000 rows this takes some time.
Using np.outer alone is not the solution, because np.outer(A, B) gives me a matrix with shape (8, 12), which is not what I want.
I need this for neural network backpropagation.
You could literally transfer the iterators as string notation to np.einsum -
np.einsum('rk,rl->kl',A,B)
Or with matrix-multiplication using np.dot -
A.T.dot(B)
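As a quick sanity check, here is a sketch that compares both suggestions against the explicit row-by-row sum of outer products, using the arrays from the question. The variable names are mine.

```python
import numpy as np

A = np.array([[0, 7], [4, 1], [0, 2], [0, 5]])
B = np.array([[9, 7, 7], [6, 7, 5], [2, 7, 9], [6, 9, 7]])

# Reference: sum of the outer products of matching rows.
loop_sum = sum(np.outer(a, b) for a, b in zip(A, B))

# 'rk,rl->kl' multiplies A[r, k] * B[r, l] and sums over the shared index r.
via_einsum = np.einsum('rk,rl->kl', A, B)

# The same contraction written as a matrix product.
via_dot = A.T.dot(B)

print(via_dot.shape)
print(np.array_equal(loop_sum, via_einsum), np.array_equal(loop_sum, via_dot))
```

The equivalence holds because entry (k, l) of A.T.dot(B) is the sum over r of A[r, k] * B[r, l], which is exactly what summing the per-row outer products computes.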
The operation takes two arrays, X and idx, of equal length, where the values of idx range from 0 to k-1 for a given k.
This NumPy code illustrates it:
import numpy as np
X = np.arange(6) # Just for a sample of elements
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T # Can only contain values in [0..(k-1)]
np.array([X[np.where(idx==i)[0]] for i in range(k)])
Sample output:
array([[0, 4],
[1, 5],
[2, 3]])
Note that there is a reason for representing idx as a matrix rather than a vector: it was initialised with numpy.zeros((n,1)) as part of its computation, where n is the size of X.
I tried to implement this in Theano like so:
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c = theano.scan(lambda i: X[T.where(T.eq(idx,i))], sequences=T.arange(k))
f = theano.function([X,idx,k],c)
But I received this error at line where c is defined:
TypeError: Wrong number of inputs for Switch.make_node (got 1((<int8>,)), expected 3)
Is there a simple way to implement this in Theano?
Use nonzero() and correct the dimensions of idx.
This code solved the problem
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c, updates = theano.scan(lambda i: X[T.eq(idx,i).nonzero()], sequences=T.arange(k))
f = theano.function([X,idx,k],c)
For the same example, through the use of Theano:
import numpy as np
X = np.arange(6)
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T
f(X, idx.T[0], k).astype(int)
This gives the output as
array([[0, 4],
[1, 5],
[2, 3]])
If idx is defined as np.array([0, 1, 2, 2, 0, 1]), then f(X, idx, k) can be used instead.
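For comparison, the same nonzero()-based selection can be written in plain NumPy. This is only a sketch of the idea with a flat idx, not a substitute for the Theano function; stacking the groups into one array assumes every index value occurs equally often, as it does in this sample.

```python
import numpy as np

X = np.arange(6)
k = 3
idx = np.array([0, 1, 2, 2, 0, 1])   # flat vector, values in [0, k-1]

# (idx == i).nonzero() returns the positions where idx equals i,
# which then index into X, mirroring T.eq(idx, i).nonzero().
groups = np.array([X[(idx == i).nonzero()] for i in range(k)])

print(groups)
```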
I've got a theano function which computes euclidean distances for 2 matrices—X (n vectors x k features) and Y (m vectors x k features). The result is an n x m matrix of pairwise distances of each vector (or row) in X from each vector (or row) in Y.
import theano
from theano import tensor as T
X, Y = T.dmatrices('X', 'Y')
X_squared_sum = T.sum(X ** 2, axis=1, keepdims=True)
Y_squared_sum = T.sum(Y.T ** 2, axis=0, keepdims=True)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
f_distance = theano.function([X, Y], T.sqrt(squared_distances))
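The function above relies on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y. A NumPy sketch of the same computation, with small made-up inputs, shows how the broadcasting works. The function name pairwise_distances and the clamp at zero (to guard against tiny negative values from rounding) are my additions.

```python
import numpy as np

def pairwise_distances(X, Y):
    """Euclidean distances between rows of X (n, k) and rows of Y (m, k)."""
    X_sq = np.sum(X ** 2, axis=1, keepdims=True)   # shape (n, 1)
    Y_sq = np.sum(Y ** 2, axis=1)[np.newaxis, :]   # shape (1, m)
    # (n, 1) + (1, m) broadcasts to (n, m), matching the shape of X.dot(Y.T).
    squared = X_sq + Y_sq - 2 * X.dot(Y.T)
    # Clamp rounding noise below zero before taking the square root.
    return np.sqrt(np.maximum(squared, 0.0))

X = np.array([[0.0, 0.0], [3.0, 4.0]])
Y = np.array([[0.0, 0.0], [6.0, 8.0], [3.0, 0.0]])
D = pairwise_distances(X, Y)
print(D)
```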
Let's say I change the above function to accept a single vector, an array of vectors, and the number of smallest distances. What I want is a theano function that will find the N smallest distances, similar to below:
import numpy as np
import theano
from theano import tensor as T
X = T.dvector('X')
Y = T.dmatrix('Y')
N = T.iscalar('N')
X_squared_sum = T.dot(X, X)
Y_squared_sum = T.sum(Y.T ** 2, axis=0)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
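# FIND_N_SMALLEST is a placeholder for the operation I am looking for; it is not a real Theano function
dist_sorted = T.FIND_N_SMALLEST(T.sqrt(squared_distances), N)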
n_closest = theano.function([X, Y, N], dist_sorted)
U = np.array([1, 1, 1, 1])  # 1-D, to match the dvector X
V = np.array([
[ 4, 4, 4, 4],
[ 2, 2, 2, 2],
[ 3, 3, 3, 3],
[ 1, 1, 1, 1]])
n_closest(U, V, 2) # [0.0, 2.0]
I'd like to avoid explicitly sorting all the distances, since the number that I want will generally be much much smaller than the total number of distances.
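To illustrate what avoiding a full sort could look like, here is a sketch in NumPy rather than Theano. It uses np.partition, which only guarantees that the n smallest values end up in the first n positions, and then sorts just those n values. The function name n_smallest_distances is hypothetical, and how to express the same thing as a Theano op is not covered here.

```python
import numpy as np

def n_smallest_distances(x, Y, n):
    """Return the n smallest Euclidean distances from vector x to rows of Y."""
    squared = np.dot(x, x) + np.sum(Y ** 2, axis=1) - 2 * Y.dot(x)
    squared = np.maximum(squared, 0.0)          # guard against rounding noise
    # Partial selection: the first n entries are the n smallest, in no set order.
    smallest = np.partition(squared, n - 1)[:n]
    # Sorting only n values is cheap when n is much smaller than len(Y).
    return np.sqrt(np.sort(smallest))

U = np.array([1.0, 1.0, 1.0, 1.0])
V = np.array([
    [4.0, 4.0, 4.0, 4.0],
    [2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0, 3.0],
    [1.0, 1.0, 1.0, 1.0]])
closest = n_smallest_distances(U, V, 2)
print(closest)
```

This assumes 1 <= n <= len(Y); np.partition raises an error for an out-of-range position.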
Multiplying elements of numpy arrays based on elements in one array.
import numpy as np
x = np.random.randint(-10,10, size=(12, 4))
x = np.insert(arr=x, values=np.random.choice([1,2,3,4], 12), obj=4, axis=1)
How can I multiply the rows of x[:,:4] element-wise, provided that these rows have an identical element in the last column?
You can use itertools.groupby to group your rows by their 4th element, then use np.multiply within reduce to compute the product of each group:
>>> from functools import reduce
>>> from operator import itemgetter
>>> from itertools import groupby
>>> [reduce(lambda x,y:np.multiply(x,y),g) for _,g in groupby(sorted(x,key=itemgetter(3)),itemgetter(3))]
[array([ 0, -7, -5, -7]), array([ 0, -588, 1296, 1]), array([ 9, -3, -1, 0]), array([ 56, -8, -60, 9]), array([ -9, -3, -10, 6]), array([-72, -9, -15, 64]), array([ 5, -8, -5, 9])]
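An alternative that stays within NumPy is to select the rows for each distinct label with a boolean mask and multiply them with prod(axis=0). The sketch below uses a small fixed array instead of random data so the result is easy to check by hand; it assumes the grouping label sits in the last column, as described in the question.

```python
import numpy as np

# Four value columns followed by a label column.
x = np.array([
    [1, 2, 3, 4, 1],
    [2, 2, 2, 2, 2],
    [3, 1, 0, -1, 1],
    [5, 5, 5, 5, 2],
])

labels = x[:, -1]
# For each distinct label, multiply the matching rows element-wise.
products = {
    int(label): x[labels == label, :4].prod(axis=0)
    for label in np.unique(labels)
}

for label, row_product in products.items():
    print(label, row_product)
```

This makes one pass over the array per distinct label, which is fine for a handful of labels.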