User defined SVM kernel with scikit-learn - python-2.7

I am running into a problem when defining my own kernel in scikit-learn.
I defined the Gaussian kernel myself and was able to fit the SVM, but not to use it to make a prediction.
More precisely, I have the following code:
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.utils import shuffle
import scipy.sparse as sparse
import numpy as np
digits = load_digits(2)
X, y = shuffle(digits.data, digits.target)
gamma = 1.0
X_train, X_test = X[:100, :], X[100:, :]
y_train, y_test = y[:100], y[100:]
m1 = SVC(kernel='rbf',gamma=1)
m1.fit(X_train, y_train)
m1.predict(X_test)
def my_kernel(x, y):
    d = x - y
    c = np.dot(d, d.T)
    return np.exp(-gamma * c)
m2 = SVC(kernel=my_kernel)
m2.fit(X_train, y_train)
m2.predict(X_test)
m1 and m2 should be the same, but m2.predict(X_test) returns the error:
operands could not be broadcast together with shapes (260,64) (100,64)
I don't understand the problem.
Furthermore, if x is a single data point, m1.predict(x) gives a +1/-1 result, as expected, but m2.predict(x) gives an array of +1/-1 values...
No idea why.

The error is at the x - y line. You cannot subtract the two like that: at predict time x holds the test samples (260 rows here) and y the training samples (100 rows), so their first dimensions differ and the subtraction cannot broadcast. A kernel callable must instead return the full Gram matrix between every row of x and every row of y. Here is how the rbf kernel is implemented in scikit-learn, taken from here (only keeping the essentials):
def row_norms(X, squared=False):
    if issparse(X):
        norms = csr_row_norms(X)
    else:
        norms = np.einsum('ij,ij->i', X, X)
    if not squared:
        np.sqrt(norms, norms)
    return norms

def euclidean_distances(X, Y=None, Y_norm_squared=None, squared=False):
    """
    Considering the rows of X (and Y=X) as vectors, compute the
    distance matrix between each pair of vectors.
    [...]

    Returns
    -------
    distances : {array, sparse matrix}, shape (n_samples_1, n_samples_2)
    """
    X, Y = check_pairwise_arrays(X, Y)
    if Y_norm_squared is not None:
        YY = check_array(Y_norm_squared)
        if YY.shape != (1, Y.shape[0]):
            raise ValueError(
                "Incompatible dimensions for Y and Y_norm_squared")
    else:
        YY = row_norms(Y, squared=True)[np.newaxis, :]
    if X is Y:  # shortcut in the common case euclidean_distances(X, X)
        XX = YY.T
    else:
        XX = row_norms(X, squared=True)[:, np.newaxis]
    distances = safe_sparse_dot(X, Y.T, dense_output=True)
    distances *= -2
    distances += XX
    distances += YY
    np.maximum(distances, 0, out=distances)
    if X is Y:
        # Ensure that distances between vectors and themselves are set to 0.0.
        # This may not be the case due to floating point rounding errors.
        distances.flat[::distances.shape[0] + 1] = 0.0
    return distances if squared else np.sqrt(distances, out=distances)

def rbf_kernel(X, Y=None, gamma=None):
    X, Y = check_pairwise_arrays(X, Y)
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    K = euclidean_distances(X, Y, squared=True)
    K *= -gamma
    np.exp(K, K)  # exponentiate K in-place
    return K
You might want to dig deeper into the code, but look at the comments for the euclidean_distances function. A naive implementation of what you're trying to achieve would be this:
def my_kernel(x, y):
    d = np.zeros((x.shape[0], y.shape[0]))
    for i, row_x in enumerate(x):
        for j, row_y in enumerate(y):
            # squared Euclidean distance, matching exp(-gamma * ||u - v||^2)
            d[i, j] = np.exp(-gamma * np.linalg.norm(row_x - row_y) ** 2)
    return d
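For anything beyond toy data you will want a vectorized kernel. Here is a minimal sketch of one (my own, not from scikit-learn), using the same expansion ||u - v||^2 = ||u||^2 - 2*u.v + ||v||^2 as euclidean_distances above; it assumes the global gamma from the question:

def my_kernel_vectorized(X, Y):
    XX = np.einsum('ij,ij->i', X, X)[:, np.newaxis]  # squared row norms of X, shape (n, 1)
    YY = np.einsum('ij,ij->i', Y, Y)[np.newaxis, :]  # squared row norms of Y, shape (1, m)
    d2 = XX - 2.0 * X.dot(Y.T) + YY                  # all pairwise squared distances, (n, m)
    np.maximum(d2, 0, out=d2)                        # clip tiny negatives from rounding
    return np.exp(-gamma * d2)

With this, SVC(kernel=my_kernel_vectorized) should behave like SVC(kernel='rbf', gamma=gamma). Alternatively, sklearn.metrics.pairwise.rbf_kernel can be wrapped (e.g. lambda X, Y: rbf_kernel(X, Y, gamma=gamma)) and passed as the kernel callable.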


How to fit a 2D ellipse to given points

I would like to fit a set of 2D points with an ellipse of the form (x / a)² + (y / b)² = 1, and so get a and b.
And then be able to replot it on my graph.
I found many examples on the internet, but none with this simple Cartesian equation. I probably searched badly! I think a basic solution for this problem could help many people.
Here is an example of the data. Sadly, I cannot post the values, so let's assume that I have X, Y arrays defining the coordinates of each of those points.
This can be solved directly using least squares. You can frame it as minimizing the sum of squares of the quantity (alpha * x_i^2 + beta * y_i^2 - 1), where alpha is 1/a^2 and beta is 1/b^2. You have all the x_i's in X and the y_i's in Y, so you can find the minimizer of ||Ax - b||^2, where A is an Nx2 matrix (i.e. [X^2, Y^2]), x is the column vector [alpha; beta], and b is a column vector of all ones.
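As a minimal sketch of that simple axis-aligned case (the function name is mine; it assumes X and Y are (N, 1) column vectors):

import numpy as np

def fit_axis_aligned_ellipse(X, Y):
    A = np.hstack([X**2, Y**2])                 # N x 2 design matrix [x_i^2, y_i^2]
    b = np.ones_like(X)                         # column vector of ones
    sol = np.linalg.lstsq(A, b, rcond=None)[0]  # [alpha, beta] = [1/a^2, 1/b^2]
    return 1.0 / np.sqrt(sol[0, 0]), 1.0 / np.sqrt(sol[1, 0])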
The following code solves the more general problem for an ellipse of the form Ax^2 + Bxy + Cy^2 + Dx + Ey = 1, though the idea is exactly the same. The print statement gives 0.0776x^2 + 0.0315xy + 0.125y^2 + 0.00457x + 0.00314y = 1, and the code below also plots the resulting ellipse.
import numpy as np
import matplotlib.pyplot as plt
alpha = 5
beta = 3
N = 500
DIM = 2
np.random.seed(2)
# Generate random points on the unit circle by sampling uniform angles
theta = np.random.uniform(0, 2*np.pi, (N,1))
eps_noise = 0.2 * np.random.normal(size=[N,1])
circle = np.hstack([np.cos(theta), np.sin(theta)])
# Stretch and rotate circle to an ellipse with a random linear transformation
B = np.random.randint(-3, 3, (DIM, DIM))
noisy_ellipse = circle.dot(B) + eps_noise
# Extract x coords and y coords of the ellipse as column vectors
X = noisy_ellipse[:,0:1]
Y = noisy_ellipse[:,1:]
# Formulate and solve the least squares problem ||Ax - b ||^2
A = np.hstack([X**2, X * Y, Y**2, X, Y])
b = np.ones_like(X)
x = np.linalg.lstsq(A, b, rcond=None)[0].squeeze()
# Print the equation of the ellipse in standard form
print('The ellipse is given by {0:.3}x^2 + {1:.3}xy+{2:.3}y^2+{3:.3}x+{4:.3}y = 1'.format(x[0], x[1],x[2],x[3],x[4]))
# Plot the noisy data
plt.scatter(X, Y, label='Data Points')
# Plot the original ellipse from which the data was generated
phi = np.linspace(0, 2*np.pi, 1000).reshape((1000,1))
c = np.hstack([np.cos(phi), np.sin(phi)])
ground_truth_ellipse = c.dot(B)
plt.plot(ground_truth_ellipse[:,0], ground_truth_ellipse[:,1], 'k--', label='Generating Ellipse')
# Plot the least squares ellipse
x_coord = np.linspace(-5,5,300)
y_coord = np.linspace(-5,5,300)
X_coord, Y_coord = np.meshgrid(x_coord, y_coord)
Z_coord = x[0] * X_coord ** 2 + x[1] * X_coord * Y_coord + x[2] * Y_coord**2 + x[3] * X_coord + x[4] * Y_coord
plt.contour(X_coord, Y_coord, Z_coord, levels=[1], colors=('r'), linewidths=2)
plt.legend()
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Following the suggestion by ErroriSalvo, here is the complete process of fitting an ellipse using the SVD. The arrays x, y are coordinates of the given points, let's say there are N points. Then U, S, V are obtained from the SVD of the centered coordinate array of shape (2, N). So, U is a 2 by 2 orthogonal matrix (rotation), S is a vector of length 2 (singular values), and V, which we do not need, is an N by N orthogonal matrix.
The linear map transforming the unit circle to the ellipse of best fit is
sqrt(2/N) * U * diag(S)
where diag(S) is the diagonal matrix with singular values on the diagonal. To see why the factor of sqrt(2/N) is needed, imagine that the points x, y are taken uniformly from the unit circle. Then sum(x**2) + sum(y**2) is N, and so the coordinate matrix consists of two orthogonal rows of length sqrt(N/2), hence its norm (the largest singular value) is sqrt(N/2). We need to bring this down to 1 to have the unit circle.
import numpy as np
import matplotlib.pyplot as plt

N = 300
t = np.linspace(0, 2*np.pi, N)
x = 5*np.cos(t) + 0.2*np.random.normal(size=N) + 1
y = 4*np.sin(t+0.5) + 0.2*np.random.normal(size=N)
plt.plot(x, y, '.') # given points
xmean, ymean = x.mean(), y.mean()
x -= xmean
y -= ymean
U, S, V = np.linalg.svd(np.stack((x, y)))
tt = np.linspace(0, 2*np.pi, 1000)
circle = np.stack((np.cos(tt), np.sin(tt))) # unit circle
transform = np.sqrt(2/N) * U.dot(np.diag(S)) # transformation matrix
fit = transform.dot(circle) + np.array([[xmean], [ymean]])
plt.plot(fit[0, :], fit[1, :], 'r')
plt.show()
But if you assume that there is no rotation, then np.sqrt(2/N) * S is all you need; these are a and b in the equation of the ellipse.
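For instance, continuing the code above (a sketch; the variable names are mine):

a_fit, b_fit = np.sqrt(2.0/N) * S  # semi-axis lengths of the best-fit ellipse
print(a_fit, b_fit)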
You could try a Singular Value Decomposition of the data matrix.
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.linalg.svd.html
First center the data by subtracting the mean values of X and Y from each coordinate respectively.
X = X - np.mean(X)
Y = Y - np.mean(Y)
D = np.vstack((X, Y))
Then, apply SVD and extract
- the singular values (entries of s) -> axis lengths
- the left singular vectors (columns of U) -> axis orientations
U, s, V = np.linalg.svd(D, full_matrices=True)
This should give a least-squares fit. Of course, things can get more complicated than this; please see
https://www.emis.de/journals/BBMS/Bulletin/sup962/gander.pdf

Tensor contraction with Kronecker deltas in sympy

I'm trying to use sympy to do some index gymnastics for me. I'm trying to calculate the derivatives of a cost function that looks like
cost = sum_i (M_ii)^2
where M is given by a rotation
M_ij = U*_ki M0_kl U_lj   (summed over k and l)
I've written up a parametrization for the rotation matrix, from which I get the derivatives as products of Kronecker deltas. What I've got so far is
from sympy import *

def Uder(p, q, r, s):
    return KroneckerDelta(p, r)*KroneckerDelta(q, s) - KroneckerDelta(p, s)*KroneckerDelta(q, r)

# Matrix size
n = symbols('n')
p = symbols('p')
i = Dummy('i')
k = Dummy('k')
l = Dummy('l')
# Matrix elements
M0 = IndexedBase('M')
U = IndexedBase('U')
# Indices
r, s = map(tensor.Idx, ['r', 's'])
# Derivative
cost_x = Sum(Sum(Sum(M0[i,i]*(Uder(k,i,r,s)*M0[k,l]*U[l,i] + U[k,i]*M0[k,l]*Uder(l,i,r,s)), (k,1,n)), (l,1,n)), (i,1,n))
print(cost_x)
but sympy is not evaluating the contractions for me; they should reduce to simple sums in terms of r and s, which are the rotation indices. Instead, what I get is
Sum(((-KroneckerDelta(_i, r)*KroneckerDelta(_k, s) + KroneckerDelta(_i, s)*KroneckerDelta(_k, r))*M[_k, _l]*U[_l, _i] + (-KroneckerDelta(_i, r)*KroneckerDelta(_l, s) + KroneckerDelta(_i, s)*KroneckerDelta(_l, r))*M[_k, _l]*U[_k, _i])*M[_i, _i], (_k, 1, n), (_l, 1, n), (_i, 1, n))
I'm using the latest git snapshot 4633fd5713c434c3286e3412a2399bd40fbd9569 of sympy.
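As a sanity check (my own minimal sketch, not part of the original question), sympy does contract a single explicit delta when .doit() is called on the Sum, which suggests the difficulty lies in the nested Dummy-index structure rather than in delta handling as such:

from sympy import Sum, symbols, IndexedBase, KroneckerDelta

n, i, j = symbols('n i j', integer=True)
x = IndexedBase('x')
expr = Sum(KroneckerDelta(i, j) * x[i], (i, 1, n))
print(expr.doit())  # contracts to x[j], wrapped in a Piecewise guarding 1 <= j <= n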

Matrix related calculation in python

I am running into an interesting problem while calculating a matrix update in Python. I have to calculate the error, i.e. the difference between the previous and the updated matrix.
import numpy as np
import matplotlib.pyplot as plt
#from matplotlib import animation
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter

def update(A):
    C = A
    D = A
    D[1:-1,1:-1] = (C[0:-2,1:-1] + C[2:,1:-1] + C[1:-1,0:-2] + C[1:-1,2:])/4
    return (np.abs(D - C), D)

def error(A, B):
    C = np.zeros(np.shape(A), np.float64)
    #e = np.max(np.max(np.abs(C)))
    e = (np.abs(C))
    return (e.sum(dtype='float64'))

def initial(C):
    C[0,:] = 0    ## Top Boundary
    C[-1,:] = 0   ## Bottom Boundary
    C[:,0] = 0    ## Left Boundary
    C[:,-1] = 100 ## Right Boundary
    return(C)

def SolveLaplace(nx, ny, epsilon, imax):
    ## Initialize the mesh with some values
    U = np.zeros((nx, ny), np.float64)
    ## Set boundary conditions for the problem
    U = initial(U)
    ## Store previous grid values to check against error tolerance
    UN = np.zeros((nx, ny), np.float64)
    UN = initial(UN)
    ## Constants
    k = 1 ## Iteration counter
    ## Iterative procedure
    while k < imax:
        err, U = update(U)
        print(err.sum())
        k += 1
    return (U)

nx = 50  # grid sizes must be integers for np.zeros and np.linspace
ny = 50
dx = 0.001
epsilon = 1e-6 ## Absolute Error tolerance
imax = 5000 ## Maximum number of iterations allowed
Z = SolveLaplace(nx, ny, epsilon, imax)
#x = np.linspace(0, nx * dx, nx)
#y = np.linspace(0, ny * dx, ny)
#X, Y = np.meshgrid(x,y)
##===================================================================
def PlotSolution(nx, ny, dx, T):
    ## Set up x and y vectors for meshgrid
    x = np.linspace(0, nx * dx, nx)
    y = np.linspace(0, ny * dx, ny)
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    X, Y = np.meshgrid(x, y)
    ax.plot_surface(X, Y, T.transpose(), rstride=1, cstride=1, cmap=cm.cool, linewidth=0, antialiased=False)
    plt.xlabel("X")
    plt.ylabel("Y")
    #plt.zlabel("T(X,Y)")
    plt.figure()
    plt.contourf(X, Y, T.transpose(), 32, cmap=cm.cool)  # contourf takes no rstride/cstride
    plt.colorbar()
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.show()
##===================================================================
PlotSolution(nx, ny, dx, Z)
I am supposed to solve the Laplace equation for a 2-D sheet (temperature distribution); equilibrium is reached when the error falls below a certain minimum value. But while calculating the error I always get 0, even though printing the matrix shows it should not be zero. I think I have a conceptual problem here, so please help.
Your problem is that you only copy references, not data, when assigning C=A; D=A in the update function. After those assignments, all three variables A, C, D point to the same array object. Use
def update(A):
    C = 1.0*A
    D = 1.0*A
    D[1:-1,1:-1] = (C[0:-2,1:-1] + C[2:,1:-1] + C[1:-1,0:-2] + C[1:-1,2:])/4
    return (np.abs(D - C), D)
or even shorter
def update(A):
    D = A.copy()
    D[1:-1,1:-1] = (A[0:-2,1:-1] + A[2:,1:-1] + A[1:-1,0:-2] + A[1:-1,2:])/4
    return (np.abs(D - A), D)
Note that it is the arithmetic operation 1.0*A that automatically creates a new array (a genuine copy of the data); merely passing an array as an argument does not copy it.
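A quick demonstration of the aliasing issue (my own sketch, not from the original answer):

import numpy as np

A = np.zeros((3, 3))
C = A              # C is another name for the same object, not a copy
C[0, 0] = 1.0
print(A[0, 0])     # 1.0 -- mutating C mutated A
print(C is A)      # True
D = A.copy()       # an independent copy
D[0, 0] = 2.0
print(A[0, 0])     # still 1.0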
You know that the (geometric, first-order) convergence rate of this kind of iteration is something like max(1 - C/nx^2, 1 - C/ny^2), i.e., very slow for even moderately large grids? For real applications, it is better to use conjugate gradients, other Krylov-related algorithms, or multigrid approaches (or sparse solver libraries: UMFPACK, ...).
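For illustration, a minimal sketch of the sparse-solver route for this exact problem (my own addition, not part of the original answer): it assembles the 2-D Laplacian with Kronecker products and moves the right boundary value of 100 into the right-hand side.

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

nx = ny = 50
n, m = nx - 2, ny - 2  # interior grid size

def T(k):  # 1-D second-difference matrix tridiag(1, -2, 1)
    return sp.diags([1, -2, 1], [-1, 0, 1], shape=(k, k))

A = sp.kron(T(n), sp.identity(m)) + sp.kron(sp.identity(n), T(m))
b = np.zeros(n * m)
b[m-1::m] -= 100.0                  # nodes next to the 100-degree right boundary
u = spla.spsolve(A.tocsr(), b)      # direct sparse solve of the equilibrium state
U = np.zeros((nx, ny)); U[:, -1] = 100.0
U[1:-1, 1:-1] = u.reshape(n, m)     # same layout as the iterative solution Z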
In the (unused) error procedure, shouldn't there be something like
e = abs(A - B)
At the moment, you return the norm of the freshly generated zero matrix C.
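That is, a corrected sketch of that helper would be:

def error(A, B):
    # sum of elementwise absolute differences between successive iterates
    return np.abs(A - B).sum(dtype='float64')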

Solve for the positions of all six roots PYTHON

I'm using Newton's method to find the positions of all six roots of a sixth-order polynomial, i.e. the points where the function is zero.
I found the rough values on my graph with the code below, but I want to output the positions of all six roots. I'm thinking of using an array for x to input values and find those positions, but I'm not sure. For now I'm using 1.0 as the starting point to locate a rough value. Any suggestions?
import numpy as np
import matplotlib.pyplot as plt

def P(x):
    return 924*x**6 - 2772*x**5 + 3150*x**4 - 1680*x**3 + 420*x**2 - 42*x + 1

def dPdx(x):
    return 5544*x**5 - 13860*x**4 + 12600*x**3 - 5040*x**2 + 840*x - 42

accuracy = 1e-10  # note: 1**-10 is just 1; a float literal is needed for the tolerance
x = 1.0
xlast = float("inf")
while np.abs(x - xlast) > accuracy:
    xlast = x
    x = xlast - P(xlast)/dPdx(xlast)
print(x)

p_points = []
x_points = np.linspace(0, 1, 100)
y_points = np.zeros(len(x_points))
for i in range(len(x_points)):
    y_points[i] = P(x_points[i])
    p_points.append(P(x_points))
plt.plot(x_points, y_points)
plt.savefig("roots.png")
plt.show()
The traditional way is to use deflation to factor out the already found roots. If you want to avoid manipulations of the coefficient array, then you have to divide the roots out.
Having found z[1],...,z[k] as root approximations, form
g(x)=(x-z[1])*(x-z[2])*...*(x-z[k])
and apply Newton's method to h(x) = f(x)/g(x) with h'(x) = f'/g - f*g'/g^2. In the Newton iteration this gives
xnext = x - f(x)/( f'(x) - f(x)*g'(x)/g(x) )
Fortunately the quotient g'/g has a simple form
g'(x)/g(x) = 1/(x-z[1])+1/(x-z[2])+...+1/(x-z[k])
So with a slight modification to the Newton step you can avoid finding the same root over again.
This all still keeps the iteration real. To get at the complex root, use a complex number to start the iteration.
Proof of concept: adding eps = 1e-8j to g'(x)/g(x) allows the iteration to go complex without preventing real values; it solves the equivalent problem 0 = exp(-eps*x)*f(x)/g(x).
import numpy as np
import matplotlib.pyplot as plt

def P(x):
    return 924*x**6 - 2772*x**5 + 3150*x**4 - 1680*x**3 + 420*x**2 - 42*x + 1

def dPdx(x):
    return 5544*x**5 - 13860*x**4 + 12600*x**3 - 5040*x**2 + 840*x - 42

accuracy = 1e-10
roots = []
for k in range(6):
    x = 1.0
    xlast = float("inf")
    x_points = np.linspace(0.0, 1.0, 200)
    y_points = P(x_points).astype(complex)  # complex dtype so dividing by complex roots works
    for rt in roots:
        y_points /= (x_points - rt)
    y_points = np.array([max(-1.0, min(1.0, np.real(y))) for y in y_points])
    plt.plot(x_points, y_points, x_points, 0*y_points)
    plt.show()
    while np.abs(x - xlast) > accuracy:
        xlast = x
        corr = 1e-8j
        for rt in roots:
            corr += 1/(xlast - rt)
        Px = P(xlast)
        dPx = dPdx(xlast)
        x = xlast - Px/(dPx - Px*corr)
    print(x)
    roots.append(x)
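As a cross-check (my own addition, not part of the original answer), numpy's companion-matrix root finder should reproduce the same six values:

import numpy as np
coeffs = [924, -2772, 3150, -1680, 420, -42, 1]  # coefficients of P, highest degree first
print(np.sort(np.roots(coeffs)))                 # six real roots in (0, 1)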

Second order ODE integration using scipy

I am trying to integrate a second-order differential equation using scipy.integrate.odeint. My equation is as follows:
m*x[i]'' + x[i]' = (K/N) * sum(j=0 to N) of sin(x[j] - x[i])
I have converted it into two first-order ODEs, shown below. In the code further down, yinit is an array of the initial values x(0) and x'(0). My question is: what should the values of x(0) and x'(0) be?
x'[i] = y[i]
y'[i] = (-y[i] + (K/N) * sum(j=0 to N) of sin(x[j] - x[i])) / m
from numpy import *
from scipy.integrate import odeint

N = 50

def f(theta, t):
    global N
    x, y = theta
    m = 0.95
    K = 1.0
    fx = zeros(N, float)
    for i in range(N):
        s = 0.0
        for j in range(i+1, N):
            s = s + sin(x[j] - x[i])
        fx[i] = (-y[i] + (K*s)/N)/m
    return array([y, fx])

t = linspace(0, 10, 100, endpoint=False)
# Uniformly generating random numbers
theta = random.uniform(-180, 180, N)
# Integrating function f using odeint
yinit = array([x(0), x'(0)])  # <-- this is the part in question
y = odeint(f, yinit, t)[:,0]
print (y)
You can choose whatever initial condition you want.
In your case, you decided to use a random initial condition for x for all the oscillators. You can use a random initial condition for y as well, I guess, as I did below.
There were a few errors in the above code, mostly in how to unpack x, y from theta and how to repack them at the end (see the concatenate calls in the corrected code, including the one for yinit).
The rest are stylistic/minor changes.
from numpy import concatenate, linspace, random, mod, zeros, sin
from scipy.integrate import odeint

Nosc = 20
assert mod(Nosc, 2) == 0

def f(theta, _):
    N = theta.size // 2  # integer division so the slices below work in Python 3 as well
    x, y = theta[:N], theta[N:]
    m = 0.95
    K = 1.0
    fx = zeros(N, float)
    for i in range(N):
        s = 0.0
        for j in range(i + 1, N):
            s = s + sin(x[j] - x[i])
        fx[i] = (-y[i] + (K * s) / N) / m
    return concatenate(([y, fx]))

t = linspace(0, 10, 50, endpoint=False)
theta = random.uniform(-180, 180, Nosc)
theta2 = random.uniform(-180, 180, Nosc)  # added initial condition for the velocities of the oscillators
yinit = concatenate((theta, theta2))
res = odeint(f, yinit, t)
X = res[:, :Nosc].T
Y = res[:, Nosc:].T
To plot the time evolution of the system, you can use something like
import matplotlib.pylab as plt

fig, ax = plt.subplots()
for displacement in X:
    ax.plot(t, displacement)
ax.set_xlabel('t')
ax.set_ylabel('x')
fig.show()
What are you modelling? At first the equation looked a bit like Kuramoto oscillators, but then I noticed you also have an x[i]'' term.
Notice how in your model, since there is no spring-like term (e.g. an x(t) term on the left-hand side), each x converges to an arbitrary value.