Tensor contraction with Kronecker deltas in sympy

I'm trying to use sympy to do some index gymnastics for me. I'm trying to calculate the derivatives of a cost function that looks like
cost = sum_i (M_ii)^2
where M is given by a rotation
M_ij = U*_ki M0_kl U_lj
I've written up a parametrization for the rotation matrix, from which I get the derivatives as products of Kronecker deltas. What I've got so far is
from sympy import *

# Derivative of the rotation parametrization, written as Kronecker deltas
def Uder(p, q, r, s):
    return KroneckerDelta(p, r)*KroneckerDelta(q, s) - KroneckerDelta(p, s)*KroneckerDelta(q, r)

# Matrix size
n = symbols('n')
p = symbols('p')
i = Dummy('i')
k = Dummy('k')
l = Dummy('l')
# Matrix elements
M0 = IndexedBase('M')
U = IndexedBase('U')
# Rotation indices
r, s = map(Idx, ['r', 's'])
# Derivative of the cost
cost_x = Sum(Sum(Sum(M0[i,i]*(Uder(k,i,r,s)*M0[k,l]*U[l,i] + U[k,i]*M0[k,l]*Uder(l,i,r,s)), (k,1,n)), (l,1,n)), (i,1,n))
print(cost_x)
but sympy is not evaluating the contractions for me; the result should reduce to simple sums in terms of the rotation indices r and s. Instead, what I get is
Sum(((-KroneckerDelta(_i, r)*KroneckerDelta(_k, s) + KroneckerDelta(_i, s)*KroneckerDelta(_k, r))*M[_k, _l]*U[_l, _i] + (-KroneckerDelta(_i, r)*KroneckerDelta(_l, s) + KroneckerDelta(_i, s)*KroneckerDelta(_l, r))*M[_k, _l]*U[_k, _i])*M[_i, _i], (_k, 1, n), (_l, 1, n), (_i, 1, n))
I'm using the latest git snapshot 4633fd5713c434c3286e3412a2399bd40fbd9569 of sympy.
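For reference, sympy's Sum.doit() can collapse a single Kronecker delta contraction into a Piecewise; a minimal sketch of the expected behaviour (on a current sympy; the git snapshot above may behave differently):
from sympy import Sum, symbols, KroneckerDelta, IndexedBase

i, j, n = symbols('i j n', integer=True)
x = IndexedBase('x')
# summing delta(i, j)*x[i] over i collapses the delta
s = Sum(KroneckerDelta(i, j)*x[i], (i, 1, n)).doit()
print(s)  # Piecewise((x[j], (j >= 1) & (j <= n)), (0, True))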

Related

Sympy expression simplification

I'm solving an eigenvalue problem where the matrix and the eigenvectors are time dependent. The matrix has dimension 8x8 and is Hermitian. The time dependent matrix has the form:
import sympy as sp
t, lbd = sp.symbols(r't,\lambda', real=True)
Had = ...
print(repr(Had))
Matrix([
[2*t, 0, 0, 0, 0, 0, 0, 0],
[0, -2*t, 2*t*(1 - t), 0, 0, 0, 0, 0],
[0, 2*t*(1 - t), 0, 0, 2 - 2*t, 0, 0, 0],
[0, 0, 0, 0, 0, 2 - 2*t, 0, 0],
[0, 0, 2 - 2*t, 0, 0, 0, 0, 0],
[0, 0, 0, 2 - 2*t, 0, 0, 2*t*(1 - t), 0],
[0, 0, 0, 0, 0, 2*t*(1 - t), -2*t, 0],
[0, 0, 0, 0, 0, 0, 0, 2*t]])
Now the characteristic polynomial has the following form:
P = sp.simplify(sp.collect(sp.factor(Had.charpoly(lbd).as_expr()), lbd))
and I get a factored polynomial (shown as an image in the original post).
Then I choose the second term and find the solution for lambda:
P_list = sp.factor_list(P)
a, b = P_list[1]          # unpack the two (factor, exponent) pairs
eq, exp = sp.simplify(b)  # simplify the chosen factor, keeping its exponent
sol = sp.solve(eq)
With that I get the roots in a list:
r_list = []
for i in range(len(sol)):
    a = list(sol[i].values())
    r_list.append(a[0])
Solving the problem using Had.eigenvects():
val_mult_vec = Had.eigenvects()
e_vals = []
mults = []
e_vecs = []
for i in range(len(val_mult_vec)):
    val, mult, [vec_i, vec_j] = val_mult_vec[i]
    e_vals.append(val)
    e_vals.append(val)
    mults.append(mult)
    e_vecs.append(vec_i)
    e_vecs.append(vec_j)
Solving for the eigenvectors I get complicated expressions (shown as an image in the original post). But I know that each of these complicated expressions can be written in terms of the roots of the second factor of the characteristic polynomial, where r1 is one of those roots (again shown as an image in the original post). With the solution to the characteristic polynomial, how can I use sympy to rewrite the eigenvectors in that simplified way, i.e. rewrite e_vec[i] in terms of r_list[j]?
Seems like you want to obtain a compact version of the eigenvectors.
Recipe:
1. Create as many symbols as there are eigenvalues; each symbol represents an eigenvalue.
2. Loop over the eigenvectors and, for each of their elements, substitute the long eigenvalue expression with the respective symbol.
r_symbols = sp.symbols("r0:%s" % len(e_vals))
final_evecs = []
for vec, val, s in zip(e_vecs, e_vals, r_symbols):
    final_evecs.append(
        vec.applyfunc(lambda e: e.subs(val, s))
    )
final_evecs is a list containing eigenvectors in a compact notation.
Let's test one output:
final_evecs[7]
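The same substitution idea on a toy example (my own illustration; sqrt(x**2 + 1) stands in for a long eigenvalue expression):
import sympy as sp

x, r0 = sp.symbols('x r0')
long_expr = sp.sqrt(x**2 + 1)                # stand-in for a messy eigenvalue
v = sp.Matrix([long_expr + x, 2*long_expr])
print(v.applyfunc(lambda e: e.subs(long_expr, r0)))
# Matrix([[r0 + x], [2*r0]])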

Component reconstruction for multivariate lagged time series

I am trying to write a multivariate Singular Spectrum Analysis with Monte Carlo test. To this end I am working on a code piece that can reconstruct the input series using the lagged trajectory matrix and projection base (ST-PCs) that result from the pca/ssa decomposition of the input series. The attached code piece works for a lagged univariate (that is, single) time series, but I am struggling to make this reconstruction work for a lagged multivariate time series. I don't quite get the procedure mathematically and, not surprisingly, I also did not manage to program it. Useful links are attached to the function descriptions of the accompanying code. Input data should be of the form (time x number of series), so say 288x3, implying 3 time series of 288 time levels.
I hope you can help me out!
import numpy as np

def lagged_covariance_matrix(data, M):
    """ Computes the lagged covariance matrix using the Broomhead & King method
    Background: Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and
    weather regimes in the Northern Hemisphere. Journal of the Atmospheric Sciences, 51(2), 210-236.
    Arguments:
    data : pxn time series, where p denotes the length of the time series and n the number of channels
    M : window length """
    # explicitly 'add' a spatial dimension if the input is a single time series
    if np.ndim(data) == 1:
        data = np.reshape(data, (len(data), 1))
    T = data.shape[0]
    L = data.shape[1]
    N = T - M + 1
    X = np.zeros((T, L, M))
    for i in range(M):
        X[:, :, i] = np.roll(data, -i, axis=0)
    X = X[:N]
    # X constitutes the trajectory matrix and is a stacked Hankel matrix
    X = np.reshape(X, (N, M*L), order='C')  # https://www.jstatsoft.org/article/viewFile/v067i02/v67i02.pdf
    # choose the smallest projection basis for computation of the covariance matrix
    if M*L >= N:
        return 1/(M*L) * X.dot(X.T), X
    else:
        return 1/N * X.T.dot(X), X

def sort_by_eigenvalues(eigenvalues, PCs):
    """ Sorts the PCs and eigenvalues by descending size of the eigenvalues """
    desc = np.argsort(-eigenvalues)
    return eigenvalues[desc], PCs[:, desc]

def Reconstruction(M, E, X):
    """ Reconstructs the series as the sum of M subseries.
    See: https://en.wikipedia.org/wiki/Singular_spectrum_analysis, 'Basic SSA' &
    the work of Vivien Sainte Fare Garnot on univariate time series (https://github.com/VSainteuf/mcssa)
    Arguments:
    M : window length
    E : eigenvector basis
    X : trajectory matrix """
    time = len(X) + M - 1
    RC = np.zeros((time, M))
    # step 3: grouping
    for i in range(M):
        d = np.zeros(M)
        d[i] = 1
        I = np.diag(d)
        Q = np.flipud(X @ E @ I @ E.T)
        # step 4: diagonal averaging
        for k in range(time):
            RC[k, i] = np.diagonal(Q, offset=-(time - M - k)).mean()
    return RC
#=====================================================================================================
#=====================================================================================================
#=====================================================================================================
# input data
data = None
# number of lags a.k.a. window length
M = 45 # M = 1 means no lag
covmat, X = lagged_covariance_matrix(data, M)
# get the eigenvalues and vectors of the covariance matrix
vals, vecs = np.linalg.eig(covmat)
eig_data, eigvec_data = sort_by_eigenvalues(vals, vecs)
# component reconstruction
recons_data = Reconstruction(M, eigvec_data, X)
The following works but does not make direct use of the projection base (ST-PCs). Hence the original question still stands, but this already helps a great deal and solves the problem for me. This code piece exploits the similarity between the ST-PCs projection base and the u & vt matrices obtained from the singular value decomposition of the lagged trajectory matrix. I think it gives back the same answer as one would obtain using the ST-PCs projection base.
def lag_reconstruction(data, X, M, pairs=None):
    """ Reconstructs the series as the sum of M subseries using the lagged trajectory matrix.
    Based on equation 2.9 of Plaut, G., & Vautard, R. (1994). Spells of low-frequency oscillations and weather regimes in the Northern Hemisphere. Journal of Atmospheric Sciences, 51(2), 210-236.
    Inspired by work of R. van Westen and C. Wieners """
    time = data.shape[0]  # number of time levels of the original series
    L = data.shape[1]     # number of input series
    N = time - M + 1
    u, s, vt = np.linalg.svd(X, full_matrices=False)
    rc = np.zeros((time, L, M))
    for t in range(time):
        counter = 0
        for i in range(M):
            if t - i >= 0 and t - i < N:
                counter += 1
                if pairs:
                    for k in pairs:
                        rc[t, :, i] += u[t - i, k] * s[k] * vt[k, i*L : i*L + L]
                else:
                    for k in range(len(s)):
                        rc[t, :, i] += u[t - i, k] * s[k] * vt[k, i*L : i*L + L]
        rc[t] = rc[t] / counter
    return rc
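For context, a minimal usage sketch with synthetic input of the stated shape (my own illustration, not from the original post; summing all window components should approximately recover the input, assuming the column layout of X matches the vt slicing above):
rng = np.random.default_rng(0)
data = rng.standard_normal((288, 3))  # 3 series, 288 time levels
M = 45
covmat, X = lagged_covariance_matrix(data, M)
rc = lag_reconstruction(data, X, M)   # shape (288, 3, 45)
recon = rc.sum(axis=2)                # sum over all window components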

Polynomial interpolation in python

I am studying function approximation and, while trying to understand/implement polynomial interpolation, I've found an example here. I find the code below a good example for understanding what is actually going on, instead of using ready-made functions, however it doesn't run:
Defining the interpolation algorithm. Essentially, we are trying to come up with a representation of the true f as a linear combination of basis functions (the psi-s).
import sympy as sym

def interpolation(f, psi, points):
    N = len(psi) - 1  # order of the approximant polynomial
    A = sym.zeros((N+1, N+1))  # initiating the square matrix, whose regular element is psi evaluated at each node
    b = sym.zeros((N+1, 1))    # original function f evaluated at the selected nodes
    psi_sym = psi  # save the symbolic expressions
    # Turn psi and f into Python functions
    x = sym.Symbol('x')
    psi = [sym.lambdify([x], psi[i]) for i in range(N+1)]
    f = sym.lambdify([x], f)
    for i in range(N+1):
        for j in range(N+1):
            A[i, j] = psi[j](points[i])
        b[i, 0] = f(points[i])
    c = A.LUsolve(b)  # finding the weights for each psi
    # c is a sympy Matrix object, turn it into a list
    c = [sym.simplify(c[i, 0]) for i in range(c.shape[0])]
    u = sym.simplify(sum(c[i]*psi_sym[i] for i in range(N+1)))
    return u, c
True function: f := 10(x-1)^2 - 1; nodes: x0 := 1 + 1/3 and x1 := 1 + 2/3; interval: [1, 2].
x = sym.Symbol('x')
f = 10*(x-1)**2 - 1
psi = [1, x] # approximant polynomial of order 1 (linear approximation)
Omega = [1, 2] #interval
points = [1 + sym.Rational(1,3), 1 + sym.Rational(2,3)]
u, c = interpolation(f, psi, points)
comparison_plot(f, u, Omega)  # plotting helper from the linked example (not defined here)
The code doesn't run. The error occurs in line
A = sym.zeros((N+1, N+1))
Error message: ValueError: (2, 2) is not an integer
But A isn't supposed to be an integer; it is a square matrix, each element of which is psi evaluated at a node, with A*c = b.
Thank you!!!
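The error comes from the calling convention of sym.zeros: the linked example targets an older sympy where zeros accepted a shape tuple, while current versions expect the row and column counts as separate arguments. A minimal sketch of the change (assuming a current sympy):
import sympy as sym

N = 1
A = sym.zeros(N+1, N+1)  # instead of sym.zeros((N+1, N+1))
b = sym.zeros(N+1, 1)    # likewise for the right-hand side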

Extracting all square submatrices from a matrix using Numpy

Say I have an NxN numpy matrix. I am looking for the fastest way of extracting all square chunks (sub-matrices) from this matrix, meaning all CxC parts of the original matrix for 0 < C < N+1. The sub-matrices should correspond to contiguous row/column indexes of the original matrix. I want to achieve this in as little time as possible.
You could use Numpy slicing,
import numpy as np

n = 20
x = np.random.rand(n, n)
slice_list = [slice(k, l) for k in range(n) for l in range(k + 1, n + 1)]
results = [x[sl, sl] for sl in slice_list]
Avoiding loops in Numpy is not a goal in itself; as long as you are being mindful about it, there shouldn't be much overhead.
Tricky enough, but here is an example of extracting all MxM submatrices from an NxN matrix.
import numpy as NP
import numpy.random as RNG

N, M = 20, 5  # matrix size and window size (example values; not fixed in the original answer)
P = N - M + 1
x = NP.arange(P).repeat(M)
y = NP.tile(NP.arange(M), P) + x
a = RNG.randn(N, N)
b = a[NP.newaxis].repeat(P, axis=0)
c = b[x, y]
d = c.reshape(P, M, N)
e = d[:, NP.newaxis].repeat(P, axis=1)
f = e[:, x, :, y]
g = f.reshape(P, M, P, M)
h = g.transpose(2, 0, 3, 1)
for i in range(P):
    for j in range(P):
        assert NP.equal(h[i, j], a[i:i+M, j:j+M]).all()
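On newer NumPy (1.20+), all MxM windows can also be obtained without manual index gymnastics via sliding_window_view; a brief sketch (my addition, not part of the original answers):
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

N, M = 20, 5
a = np.random.rand(N, N)
# windows[i, j] is the MxM submatrix a[i:i+M, j:j+M], as a zero-copy view
windows = sliding_window_view(a, (M, M))
assert windows.shape == (N - M + 1, N - M + 1, M, M)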

Incremental entropy computation

Let std::vector<int> counts be a vector of positive integers and let N := counts[0] + ... + counts[counts.size()-1] be the sum of the vector components. Setting p_i := counts[i]/N, I compute the entropy using the classic formula H = -(p_0*log2(p_0) + ... + p_n*log2(p_n)).
The counts vector is changing --- counts are incremented --- and every 200 changes I recompute the entropy. After a quick google and stackoverflow search I couldn't find any method for incremental entropy computation. So the question: Is there an incremental method, like the ones for variance, for entropy computation?
EDIT: Motivation for this question was usage of such formulas for incremental information gain estimation in VFDT-like learners.
Resolved: See this mathoverflow post.
I derived update formulas and algorithms for entropy and Gini index and made the note available on arXiv. (The working version of the note is available here.) Also see this mathoverflow answer.
For the sake of convenience I am including simple Python code, demonstrating the derived formulas:
from math import log
from random import randint

# maps p to -p*log2(p) for p > 0, and to 0 otherwise
h = lambda p: -p*log(p, 2) if p > 0 else 0

# update entropy H of a sample with sum S when a new count x comes in
def update(H, S, x):
    new_S = S + x
    return 1.0*H*S/new_S + h(1.0*x/new_S) + h(1.0*S/new_S)

# entropy of the union of two samples with entropies H1, H2 and sums S1, S2
# (renamed from `update` so it does not shadow the single-example version)
def merge(H1, S1, H2, S2):
    S = S1 + S2
    return 1.0*H1*S1/S + h(1.0*S1/S) + 1.0*H2*S2/S + h(1.0*S2/S)

# compute entropy(L) using only the `update` function
def test(L):
    S = 0.0  # sum of the sample elements
    H = 0.0  # sample entropy
    for x in L:
        H = update(H, S, x)
        S = S + x
    return H

# compute entropy using the classic equation
def entropy(L):
    n = 1.0*sum(L)
    return sum([h(x/n) for x in L])

# entry point
if __name__ == "__main__":
    L = [randint(1, 100) for k in range(100)]
    M = [randint(100, 1000) for k in range(100)]
    L_ent = entropy(L)
    L_sum = sum(L)
    M_ent = entropy(M)
    M_sum = sum(M)
    T = L + M
    print("Full   = ", entropy(T))
    print("Update = ", merge(L_ent, L_sum, M_ent, M_sum))
You could re-compute the entropy by re-computing the counts and using a simple mathematical identity to simplify the entropy formula
K = count.size();
N = count[0] + ... + count[K - 1];
H = -(count[0]/N * log2(count[0]/N) + ... + count[K - 1]/N * log2(count[K - 1]/N))
  = log2(N) - h/N
where
h = count[0] * log2(count[0]) + ... + count[K - 1] * log2(count[K - 1])
which holds because log2(a / b) == log2(a) - log2(b)
Now given an old vector count of observations so far and another vector of 200 new observations called batch, you can do in C++11
#include <cmath>
#include <numeric>
#include <vector>

void update_H(double& H, std::vector<int>& count, int& N, std::vector<int> const& batch)
{
    N += batch.size();
    for (auto b : batch)
        ++count[b];
    // h accumulates count[i] * log2(count[i]); note that std::accumulate's
    // binary op takes the running sum and the next element
    double h = std::accumulate(count.begin(), count.end(), 0.0,
        [](double acc, int c) { return c > 0 ? acc + c * std::log2(c) : acc; });
    H = std::log2(N) - h / N;
}
Here I assume that you have encoded your observations as int. If you have some kind of symbol, you would need a symbol table std::map<Symbol, int>, and do a lookup for each symbol in batch before you update the count.
This seems the quickest way of writing some code for a general update. If you know that in every batch only a few counts actually change, you can do as @migdal suggests and keep track of the changing counts, subtracting their old contribution to the entropy and adding the new contribution.
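A minimal sketch of that bookkeeping (my own illustration in Python, using the H = log2(N) - h/N identity above; the names are hypothetical):
from math import log2

# running state: N = total observations, h = sum of c*log2(c) over all counts
def increment(h, N, c_old):
    """One count goes c_old -> c_old + 1; return the updated (h, N, H)."""
    if c_old > 0:
        h -= c_old * log2(c_old)        # remove the old contribution
    h += (c_old + 1) * log2(c_old + 1)  # add the new contribution
    N += 1
    return h, N, log2(N) - h / N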