matrix multiply elementwise - sympy

Is it possible to do from matrix_multiply_elementwise in sympy library with more than two matrices? Or any other way for multiplying couple of matrices elementwise?
p.s.
In numpy it is straightforward but since I need high precision calculation I decided to use sympy

What you are looking for is the Hadamard product (or Schur product).
In sympy it is available as sympy.matrices.dense.matrix_multiply_elementwise(A, B), documented here.

Here is a way to multiply three matrices elementwise. The solution can be generalized to more than three matrices, or to functions other than multiplication.
from sympy import Matrix, pprint
a = Matrix([[1,2],[3,4]])
b = Matrix([[1,10],[100,1000]])
c = Matrix([[-1,1],[1,-1]])
x = Matrix(2,2, lambda i,j,m1=a,m2=b,m3=c: m1[i,j]*m2[i,j]*m3[i,j])
pprint(a)
pprint(b)
pprint(c)
pprint(x)
For simplicity, this solution is for matrices of shape 2X2, but it could easily be adapted for other sizes (using the shape property)
Here is the output
⎡1 2⎤
⎢ ⎥
⎣3 4⎦
⎡ 1 10 ⎤
⎢ ⎥
⎣100 1000⎦
⎡-1 1 ⎤
⎢ ⎥
⎣1 -1⎦
⎡-1 20 ⎤
⎢ ⎥
⎣300 -4000⎦

You may have your answer on this post here: Getting element-wise equations of matrix multiplication in sympy
or here
Sympy Documentation

Related

How to create and print nsum or nprod style expression?

Is it possible to print nsum or nprod style express?
e.g.
n = Symbol('n')
nsum(lambda n: 1/fac(n), [0, inf]) #this calculates e
is there any way to create and print an expression of like definition of e
or geometric mean
nsum is not a SymPy function, it's an mpmath function (mpmath was a submodule of SymPy once upon a time; if you are still using ancient SymPy version, an upgrade is recommended).
In SymPy, summation is implemented by Sum, which is unevaluated summation until .doit() method is called. Example:
>>> from sympy import *
>>> init_printing()
>>> Sum(1/factorial(n), (n, 0, oo))
∞
____
╲
╲ 1
╲ ──
╱ n!
╱
╱
‾‾‾‾
n = 0
>>> Sum(1/factorial(n), (n, 0, oo)).doit()
ℯ
Here init_printing() initiates the printing of "pretty" expressions; how pretty they are depends on your environment (I'm using a text terminal in this example). More on printing in SymPy.
Similarly, there is Product for products.
But the formulas you give an example make me think you are looking for something presentation-oriented, which calls for opening LaTeX editor and typing those formulas.

Replicate IDL 'smooth' in Python 2.7

I have been trying to work out how to replicate IDL's smooth function in Python and I just can't get anything like the same results. (Disclaimer: It is probably 10 years since I touched this kind of mathematical problem so it has been dumped to make way for information like where to find the cheapest local fuel). I am trying to code this:
smooth(b,w,/nan)
where b is a 2D float array containing NANs (zeros - missing data - have also been converted to NAN).
From the IDL documents, it appears smooth uses a boxcar, so from scipy.ndimage.filters I have tried:
bsmooth = uniform_filter(b, w)
I am aware that there are some fundamental differences here:
the default edge behaviour from IDL is "the end points are copied
from the original array to the result with no smoothing" whereas I
don't seem to have the option to do this with the uniform filter.
Treatment of the NaN elements. In IDL, the /nan keyword seems to
mean that where possible the NaN values will be filled by the result
of the other points in the window. If there are no valid points to
generate a result, by a MISSING keyword. I thought I could
approximate this behaviour following the smoothing using
scipy.interpolate's NearestNDInterpolator (thanks to the brilliant
explanation by Alex on here:
filling gaps on an image using numpy and scipy)
Here is my test array:
>>>b array([[ 0.97599638, 0.93114936, 0.87070072, 0.5379253 ],
[ 0.34873217, nan, 0.40985891, 0.22407863],
[ nan, nan, nan, 0.67532134],
[ nan, nan, 0.85441768, nan]])
My answers bore not the SLIGHTEST resemblance to IDL, whether I use the /nan keyword or not.
IDL> smooth(b,2,/nan)
0.97599638 0.93114936 0.87070072 0.53792530
0.34873217 0.70728749 0.60817236 0.22407863
NaN 0.53766960 0.54091913 0.67532134
NaN NaN 0.85441768 NaN
IDL> smooth(b,2)
0.97599638 0.93114936 0.87070072 0.53792530
0.34873217 -NaN -NaN 0.22407863
-NaN -NaN -NaN 0.67532134
-NaN -NaN 0.85441768 NaN
I confess I find the scipy documentation rather sparse on detail so I have no idea if I am really doing what I think I doing. The fact that the two python approaches which I believed would both smooth the image give different answers suggests that things are not what I understood them to be.
>>>uniform_filter(b, 2)
array([[ 0.97599638, 0.95357287, 0.90092504, 0.70431301],
[ 0.66236428, nan, nan, nan],
[ nan, nan, nan, nan],
[ nan, nan, nan, nan]])
I thought it was a bit odd it was so empty so I tried this with an array of 100 elements (still using a window of 2) and output the images. The results (first image is 'b' second is 'bsmooth') are not quite what I was hoping for:
Going back to the smaller array and following the examples in: http://scipy.github.io/old-wiki/pages/Cookbook/SignalSmooth which I thought would give the same output as uniform_filter, I tried:
>>> box = np.array([1,1,1,1])
>>> box = box.reshape(2,2)
>>> box
array([[1, 1],
[1, 1]])
>>> bsmooth = scipy.signal.convolve2d(b,box,mode='same')
>>> print bsmooth
[[ 0.97599638 1.90714574 1.80185008 1.40862602]
[ 1.32472855 nan nan 2.04256356]
[ nan nan nan nan]
[ nan nan nan nan]]
Obviously I have completely misunderstood the scipy functions, maybe even the IDL one. If anyone can help me to replicate the IDL smooth function as closely as possible, I would be extremely grateful. I am under considerable time pressure to get a solution for this that doesn't rely on IDL and I am tossing a coin to decide whether to code the function from scratch or develop a very contagious illness.
How can I perform the same smoothing in python?
First: Please use matplotlib.pyplot.imshow with interpolation="none" that's nicer to look at and maybe with greyscale.
So for your example: There is actually no convolution (filter) within scipy and numpy that treat's NaN as missing values (they propagate them within the convolution). At least I've found none so far and your boundary-treatement is also (to my knowledge) not implemented. But the boundary could be just replaced afterwards.
If you want to do convolution with NaN you can for example use astropy.convolution.convolve. There NaNs are interpolated using the kernel of your filter. But their convolution has some drawbacks as well: Border handling like you want isn't implemented there neither and your kernel must be of odd shape and the sum of your kernel must not be zero (or very close to it)
For example:
from astropy.convolution import convolve
import numpy as np
array = np.random.uniform(10,100, (4,4))
array[1,1] = np.nan
kernel = np.ones((3,3))
convolve(array, kernel)
as an example an initial array of
array([[ 97.19514587, 62.36979751, 93.54811286, 30.23567842],
[ 51.02184613, nan, 46.14769821, 60.08088041],
[ 20.86482452, 42.39661484, 36.96961278, 96.89180175],
[ 45.54453509, 76.61274347, 46.44485141, 25.40985372]])
will become:
array([[ 266.9009961 , 406.59680717, 348.69637399, 230.01236989],
[ 330.16243546, 506.82785931, 524.95440336, 363.87378443],
[ 292.75477064, 422.31693304, 487.26826319, 311.94469828],
[ 185.41871792, 268.83318211, 324.72547798, 205.71611967]])
if you want to "normalize" it, astropy offers the normalize_kernel parameter:
convolved = convolve(array, kernel, normalize_kernel=True)
array([[ 29.58753936, 42.09982189, 49.31793529, 33.00203873],
[ 49.87040638, 65.67695002, 66.10447436, 40.44026448],
[ 52.51126383, 63.03914444, 60.85474739, 35.88011742],
[ 39.40188443, 46.82350749, 40.1380926 , 22.46090152]])
If you want to replace the "edge" values with the ones from the original array just replace them:
convolved[0,:] = array[0,:]
convolved[-1,:] = array[-1,:]
convolved[:,0] = array[:,0]
convolved[:,-1] = array[:,-1]
So that's what the existing packages offer (as far as I know it). If you want to learn a bit of Cython or numba you can easily write your own convolutions that is not much slower (only a factor of 2-10) than the numpy/scipy ones but does EXACTLY what you want without messing around.
Since this is not something that is available in the python packages and because I saw the question asked several times during my research without satisfactory answers, here is how I solved the issue.
Provided is a test version of my function that I'm off to tidy up. I am sure there will be better ways to do the things I have done as I'm still fairly new to Python - please do recommend any appropriate changes.
Plots use autumn colourmap just because it allowed me to see the NaNs clearly.
My results:
IDL propagate
0.033369284 0.067915268 0.96602046 0.85623550
0.30435592 NaN NaN 100.00000
0.94065958 NaN NaN 0.90966976
0.018516513 0.044460904 0.051047217 NaN
python propagate
[[ 3.33692829e-02 6.79152655e-02 9.66020487e-01 8.56235492e-01]
[ 3.04355923e-01 nan nan 1.00000000e+02]
[ 9.40659566e-01 nan nan 9.09669768e-01]
[ 1.85165123e-02 4.44609040e-02 5.10472165e-02 nan]]
IDL replace
0.033369284 0.067915268 0.96602046 0.85623550
0.30435592 0.47452110 14.829881 100.00000
0.94065958 0.33833817 17.002417 0.90966976
0.018516513 0.044460904 0.051047217 NaN
python replace
[[ 3.33692829e-02 6.79152655e-02 9.66020487e-01 8.56235492e-01]
[ 3.04355923e-01 4.74521092e-01 1.48298812e+01 1.00000000e+02]
[ 9.40659566e-01 3.38338177e-01 1.70024175e+01 9.09669768e-01]
[ 1.85165123e-02 4.44609040e-02 5.10472165e-02 nan]]
My function:
#!/usr/bin/env python
# smooth.py
__version__ = 0.1
# Version 0.1 29 Feb 2016 ELH Test release
import numpy as np
import matplotlib.pyplot as mp
def Smooth(v1, w, nanopt):
# v1 is the input 2D numpy array.
# w is the width of the square window along one dimension
# nanopt can be replace or propagate
'''
v1 = np.array(
[[3.33692829e-02, 6.79152655e-02, 9.66020487e-01, 8.56235492e-01],
[3.04355923e-01, np.nan , 4.86013025e-01, 1.00000000e+02],
[9.40659566e-01, 5.23314093e-01, np.nan , 9.09669768e-01],
[1.85165123e-02, 4.44609040e-02, 5.10472165e-02, np.nan ]])
w = 2
'''
mp.imshow(v1, interpolation='None', cmap='autumn')
mp.show()
# make a copy of the array for the output:
vout=np.copy(v1)
# If w is even, add one
if w % 2 == 0:
w = w + 1
# get the size of each dim of the input:
r,c = v1.shape
# Assume that w, the width of the window is always square.
startrc = (w - 1)/2
stopr = r - ((w + 1)/2) + 1
stopc = c - ((w + 1)/2) + 1
# For all pixels within the border defined by the box size, calculate the average in the window.
# There are two options:
# Ignore NaNs and replace the value where possible.
# Propagate the NaNs
for col in range(startrc,stopc):
# Calculate the window start and stop columns
startwc = col - (w/2)
stopwc = col + (w/2) + 1
for row in range (startrc,stopr):
# Calculate the window start and stop rows
startwr = row - (w/2)
stopwr = row + (w/2) + 1
# Extract the window
window = v1[startwr:stopwr, startwc:stopwc]
if nanopt == 'replace':
# If we're replacing Nans, then select only the finite elements
window = window[np.isfinite(window)]
# Calculate the mean of the window
vout[row,col] = np.mean(window)
mp.imshow(vout, interpolation='None', cmap='autumn')
mp.show()
return vout

Computation of Kullback-Leibler (KL) distance between text-documents using numpy

My goal is to compute the KL distance between the following text documents:
1)The boy is having a lad relationship
2)The boy is having a boy relationship
3)It is a lovely day in NY
I first of all vectorised the documents in order to easily apply numpy
1)[1,1,1,1,1,1,1]
2)[1,2,1,1,1,2,1]
3)[1,1,1,1,1,1,1]
I then applied the following code for computing KL distance between the texts:
import numpy as np
import math
from math import log
v=[[1,1,1,1,1,1,1],[1,2,1,1,1,2,1],[1,1,1,1,1,1,1]]
c=v[0]
def kl(p, q):
p = np.asarray(p, dtype=np.float)
q = np.asarray(q, dtype=np.float)
return np.sum(np.where(p != 0,(p-q) * np.log10(p / q), 0))
for x in v:
KL=kl(x,c)
print KL
Here is the result of the above code: [0.0, 0.602059991328, 0.0].
Texts 1 and 3 are completely different, but the distance between them is 0, while texts 1 and 2, which are highly related has a distance of 0.602059991328. This isn't accurate.
Does anyone has an idea of what I'm not doing right with regards to KL? Many thanks for your suggestions.
Though I hate to add another answer, there are two points here. First, as Jaime pointed out in the comments, KL divergence (or distance - they are, according to the following documentation, the same) is designed to measure the difference between probability distributions. This means basically that what you pass to the function should be two array-likes, the elements of each of which sum to 1.
Second, scipy apparently does implement this, with a naming scheme more related to the field of information theory. The function is "entropy":
scipy.stats.entropy(pk, qk=None, base=None)
http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.entropy.html
From the docs:
If qk is not None, then compute a relative entropy (also known as
Kullback-Leibler divergence or Kullback-Leibler distance) S = sum(pk *
log(pk / qk), axis=0).
The bonus of this function as well is that it will normalize the vectors you pass it if they do not sum to 1 (though this means you have to be careful with the arrays you pass - ie, how they are constructed from data).
Hope this helps, and at least a library provides it so don't have to code your own.
After a bit of googling to undersand the KL concept, I think that your problem is due to the vectorization : you're comparing the number of appearance of different words. You should either link your column indice to one word, or use a dictionnary:
# The boy is having a lad relationship It lovely day in NY
1)[1 1 1 1 1 1 1 0 0 0 0 0]
2)[1 2 1 1 1 0 1 0 0 0 0 0]
3)[0 0 1 0 1 0 0 1 1 1 1 1]
Then you can use your kl function.
To automatically vectorize to a dictionnary, see How to count the frequency of the elements in a list? (collections.Counter is exactly what you need). Then you can loop over the union of the keys of the dictionaries to compute the KL distance.
A potential issue might be in your NP definition of KL. Read the wikipedia page for formula: http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
Note that you multiply (p-q) by the log result. In accordance with the KL formula, this should only be p:
return np.sum(np.where(p != 0,(p) * np.log10(p / q), 0))
That may help...

How to Add Numbers in a Matrix to Yield Minimum Result?

This is more of an algorithmic/math problem but I'm hoping to implement a solution in C++.
Suppose I have a matrix like so where the dots represent integers:
W X Y Z
A . . . .
B . . . .
C . . . .
D . . . .
How would I yield the minimum result if I had to pick one number from each column such that there is at most one number from each row?
For instance, I could choose AW BX CY DZ or AZ BX CY DW but NOT AW BW CZ DZ
The brute force approach would seem to take n! calculations. Is there a quicker way? Eventually I would like to add numbers in matrices of size ~60.
Also, all numbers range from 0 to 256.
And if you'd rather not code it yourself, you could always use someone else' hard-work and kind publication. This one in Haskell, solves a 60x60 random matrix in less than two tenths of a second on my old laptop. What a great algorithm!
import Data.Algorithm.Munkres
import Data.Array.Unboxed
import Data.List (transpose)
solve n matrix =
hungarianMethodInt (listArray ((1,1),(n,n)) $ concat $ transpose matrix)

Matlab's Conv2 equivalent in OpenCV

I have been trying to do Convolution of a 2D Matrix using OpenCV. I actually went through this code http://blog.timmlinder.com/2011/07/opencv-equivalent-to-matlabs-conv2-function/#respond but it yields correct answers only in positive cases. Is there a simple function like conv2 in Matlab for OpenCV or C++?
Here is an example:
A= [
1 -2
3 4
]
I want to convolve it with [-0.707 0.707]
And the result as by conv2 from Matlab is
-0.7071 2.1213 -1.4142
-2.1213 -0.7071 2.8284
Some function to compute this output in OpenCV or C++? I will be grateful for a response.
If you want an exclusive OpenCV solution, use cv2.filter2D function. But you should adjust the borderType flag if you want to get the correct output as that of matlab.
>>> A = np.array([ [1,-2],[3,4] ]).astype('float32')
>>> A
array([[ 1., -2.],
[ 3., 4.]], dtype=float32)
>>> B = np.array([[ 0.707,-0.707]])
>>> B
array([[ 0.707, -0.707]])
>>> cv2.filter2D(A2,-1,B,borderType = cv2.BORDER_CONSTANT)
array([[-0.70700002, 2.12100005, -1.41400003],
[-2.12100005, -0.70700002, 2.82800007]], dtype=float32)
borderType is important. To find the convolution you need values outside the array. If you want to get matlab like output, you need to pass cv2.BORDER_CONSTANT. See output is greater in size than input.
If you are using OpenCV with Python 2 binding you can use Scipy as long as your images will be ndarrays:
>>> from scipy import signal
>>> A = np.array([[1,-2], [3,4]])
>>> B = np.array([[-0.707, 0.707]])
>>> signal.convolve2d(A,B)
array([[-0.707, 2.121, -1.414],
[-2.121, -0.707, 2.828]])
Be sure that you use the full mode (which is set by default) if you want to achieve the same result as in matlab as long as if you use 'same' mode Scipy will center differently from Matlab.