Gradient Descent in Python 2 - python-2.7

Part of my assignment is to implement gradient descent to find the best approximation of the values c_1, c_2 and r_1 for the function
f(x) = c_1 * e^(r_1 * x) + (c_1 * x)^3 - (c_2 * x)^2.
Given is only a list of 30 y-values, corresponding to x = 0, 1, ..., 29. I am implementing this in Enthought Canopy like this:
First I start with random values:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as pyplt
c1 = -0.1
c2 = 0.1
r1 = 0.1
x = np.linspace(0,29,30) #start,stop,numitems
y = c1*np.exp(r1*x) + (c1*x)**3.0 - (c2*x)**2.0
pyplt.plot(x,y)
values_x = np.linspace(0,29,30)
values_y = np.array([0.2, -0.142682939241718, -0.886680607211679, -2.0095087143494, -3.47583798747496, -5.24396052331554, -7.2690008846359, -9.50451068338581, -11.9032604272567, -14.4176327390446, -16.9998176236069, -19.6019094345634, -22.1759550265352, -24.6739776668383, -27.0479889096801, -29.2499944927101, -31.2319972651608, -32.945998641919, -34.3439993255969, -35.3779996651013, -35.9999998336943, -36.161999917415, -35.8159999589895, -34.9139999796348, -33.4079999898869, -31.249999994978, -28.3919999975061, -24.7859999987616, -20.383999999385, -15.1379999996945])
pyplt.plot(values_x,values_y)
The squared error is quite high:
def Error(y, y0):
    return (1.0) * sum((y - y0)**2.0)

print Error(y, values_y)
Now, to implement the gradient descent, I derived the partial derivatives with respect to c_1, c_2 and r_1 and implemented the update loop:
step_size = 0.0000005
accepted_Error = 50
dc1 = c1
dc2 = c2
dr1 = r1
y0 = values_y
previous_Error = 100000
left = True
for _ in range(1000):
    gc1 = (2.0) * sum( ( y - dc1*np.exp(dr1*x) - (dc1*x)**3 + (dc2*x)**2 ) * ( -1*np.exp(dr1*x) - (3*(dc1**2)*(x**3)) ) )
    gc2 = (2.0) * sum( ( y - dc1*np.exp(dr1*x) - (dc1*x)**3 + (dc2*x)**2 ) * ( 2*dc2*(x**2) ) )
    gr1 = (2.0) * sum( ( y - dc1*np.exp(dr1*x) - (dc1*x)**3 + (dc2*x)**2 ) * ( -1*dc1*x*np.exp(dr1*x) ) )
    dc1 = dc1 - step_size*gc1
    dc2 = dc2 - step_size*gc2
    dr1 = dr1 - step_size*gr1
    y1 = dc1*np.exp(dr1*x) + (dc1*x)**3.0 - (dc2*x)**2.0
    current_Error = Error(y0, y1)
    if current_Error > accepted_Error:
        print current_Error
    else:
        break
    if current_Error > previous_Error:
        print current_Error
        print "DIVERGING"
        break
    if current_Error == previous_Error:
        print "CAN'T IMPROVE"
        break
    previous_Error = current_Error
However, the error is not improving at all, even though I tried varying the step size. Is there a mistake in my code?
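Two things are worth checking. First, the residual terms inside gc1, gc2 and gr1 use y, the curve produced by the initial guesses, where the measured data y0 = values_y appears to be intended. Second, analytic gradients can be verified against central finite differences. Below is a sketch of such a check; the helper names model, error and numerical_gradient are mine, not from the post:
import numpy as np

def model(c1, c2, r1, x):
    return c1*np.exp(r1*x) + (c1*x)**3.0 - (c2*x)**2.0

def error(params, x, y0):
    c1, c2, r1 = params
    return np.sum((model(c1, c2, r1, x) - y0)**2.0)

def numerical_gradient(params, x, y0, h=1e-7):
    # central finite differences, one parameter at a time
    grad = np.zeros(len(params))
    for i in range(len(params)):
        up = np.array(params, dtype=float); up[i] += h
        down = np.array(params, dtype=float); down[i] -= h
        grad[i] = (error(up, x, y0) - error(down, x, y0)) / (2.0*h)
    return grad

# compare with the analytic gc1, gc2, gr1 evaluated at the same point
params = np.array([-0.1, 0.1, 0.1])
print numerical_gradient(params, values_x, values_y)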

Related

SymPy `subs` Not Doing Anything

I have a differential equation that I solve with sympy.solvers.ode.dsolve, and I get out
ψ(x) = C₁⋅exp(-x⋅√(E - V_max)) + C₂⋅exp(x⋅√(E - V_max))
From (I put some of the code I used to generate this equation at the end):
-ψ(x)⋅E + ψ(x)⋅V_max + d²/dx²(ψ(x)) = 0
This is all well and good, the problem comes when I know that C₁ and C₂ happen to be equal and want to substitute one for the other. So I try something like
psi_high.subs( sp.Symbol( "C_2" ), sp.Symbol( "C_1" ) )
However it just comes out the same as before
ψ(x) = C₁⋅exp(-x⋅√(E - V_max)) + C₂⋅exp(x⋅√(E - V_max))
I am thinking this may be an object-identity issue: that subs matches not just a sympy.Symbol with the same name/value, but only the very same underlying object.
This is only speculation (I can say that psi_high.subs( x, 0 ) does work), but my question is: how do I resolve it?
Curiously, it seems to work here (I did try this using the sympy.symbols function and by enclosing the symbol references in a tuple and a list, as shown in the question).
Thanks!
import sympy as sp
import sympy.physics.units as sq  # assumption: sq appears to be sympy.physics.units
import unicodedata as ud          # assumption: ud appears to be unicodedata
well_length = sq.Quantity( 'L' )
highest_potential = sq.Quantity( "V_max" )
x = sp.Symbol( 'x' )
m = sq.Quantity( 'm' )
hbar = sq.Quantity( "hbar" )
total_energy = sq.Quantity( 'E' )
inverse_total_energy = 1.0 / total_energy
psi_symbol = ud.lookup( "GREEK SMALL LETTER PSI" )
psi = sp.Function( "psi" )
second_derivative = sp.Derivative( psi( x ), x, 2 )
make_shrodinger_left = lambda potential, psi_parameter : ( second_derivative + ( psi( psi_parameter ) * potential ) )
make_shrodinger_right = lambda psi_parameter : total_energy * psi( psi_parameter )
make_psi_equal = lambda input_value, value : sp.Eq( psi( sp.Eq( x, input_value ) ), value )
set_equal = lambda to_set, value : sp.Eq( to_set, value )
shrodinger_left_high = sp.simplify( make_shrodinger_left( highest_potential, x ) )
shrodinger_right = make_shrodinger_right( x )
high_diff = sp.simplify( set_equal( shrodinger_left_high - shrodinger_right, 0 ) )
Here's a simpler example:
In [3]: eq = Eq(f(x).diff(x, 2), 0)
In [4]: eq
Out[4]: d²/dx²(f(x)) = 0
In [5]: sol = dsolve(eq)
In [7]: sol
Out[7]: f(x) = C₁ + C₂⋅x
We can inspect these symbols:
In [8]: sol.free_symbols
Out[8]: {C₁, C₂, x}
In [9]: [s.name for s in sol.free_symbols]
Out[9]: ['C2', 'C1', 'x']
Note that there are no underscores in the symbol names. What we want to do then is:
In [10]: sol.subs(Symbol("C1"), Symbol("C2"))
Out[10]: f(x) = C₂⋅x + C₂
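A more defensive variant (a sketch, not from the original answer) avoids constructing new Symbol objects altogether by pulling the integration constants out of free_symbols by name:
import sympy as sp

x = sp.Symbol('x')
f = sp.Function('f')
sol = sp.dsolve(sp.Eq(f(x).diff(x, 2), 0))

# look the constants up among the symbols actually present in the solution
constants = {s.name: s for s in sol.free_symbols if s.name.startswith('C')}
print sol.subs(constants['C1'], constants['C2'])  # f(x) == C2*x + C2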

Odeint function from scipy.integrate gives wrong result

I use the odeint function to solve a coupled differential-equation system, and after the system is solved I plot one of the variables (theta_i). My variable theta_i comes from the equation:
theta_i = np.arctan2(g1,g2)
where g1 and g2 are variables calculated in the same function. The results have to be between -pi and pi, and they are supposed to look like this (plot from a matlab simulation):
However, when I try to plot theta_i after odeint has finished, I get this (plot from my python code):
which is really weird. When I print the values of theta_i right after its calculation (still inside the function) they look correct (between -0.2 and 0.5), so it has to be something with how the result is stored, i.e. with my use of odeint. All the other variables that come from the odeint solution are correct. I searched similar posts but nobody had the same problem as me. What might be the problem here? I am new to python and I use python 2.7.12. Thank you in advance.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
added_mass_x = 0.03 # kg
added_mass_y = 0.04
mb = 0.3 # kg
m1 = mb-added_mass_x
m2 = mb-added_mass_y
l1 = 0.07 # m
l2 = 0.05 # m
J = 0.00050797 # kgm^2
Sa = 0.0110 # m^2
Cd = 2.44
Cl = 3.41
Kd = 0.000655 # kgm^2
r = 1000 # kg/m^3
c1 = 0.5*r*Sa*Cd
c2 = 0.5*r*Sa*Cl
c3 = 0.5*mb*(l1**2)
c4 = Kd/J
c5 = (1/(2*J))*(l1**2)*mb*l2
c6 = (1/(3*J))*(l1**3)*mb
theta_0 = 10*(np.pi/180) # rad
theta_A = 20*(np.pi/180) # rad
f = 2 # Hz
t = np.linspace(0,100,8000) # s
def direct(u, t):
    vcx = u[0]
    vcy = u[1]
    wz = u[2]
    psi = u[3]
    x = u[4]
    y = u[5]
    vcx_i = u[6]
    vcy_i = u[7]
    psi_i = u[8]
    wz_i = u[9]
    theta_i = u[10]
    theta_deg_i = u[11]
    # Subsystem 1
    omega = 2*np.pi*f # rad/s
    theta = theta_0 + theta_A*np.sin(omega*t) # rad
    theta_deg = (theta*180)/np.pi # deg
    thetadotdot = -(omega**2)*theta_A*np.sin(omega*t) # rad/s^2
    # Subsystem 2
    vcxdot = (m2/m1)*vcy*wz-(c1/m1)*vcx*np.sqrt((vcx**2)+(vcy**2))+(c2/m1)*vcy*np.sqrt((vcx**2)+(vcy**2))*np.arctan2(vcy,vcx)-(c3/m1)*thetadotdot*np.sin(theta)
    vcydot = -(m1/m2)*vcx*wz-(c1/m2)*vcy*np.sqrt((vcx**2)+(vcy**2))-(c2/m2)*vcx*np.sqrt((vcx**2)+(vcy**2))*np.arctan2(vcy,vcx)+(c3/m2)*thetadotdot*np.cos(theta)
    wzdot = ((m1-m2)/J)*vcx*vcy-c4*wz*wz*np.sign(wz)-c5*thetadotdot*np.cos(theta)-c6*thetadotdot
    psidot = wz
    # Subsystem 3
    xdotdot = vcxdot*np.cos(psi)-vcx*np.sin(psi)*wz+vcydot*np.sin(psi)+vcy*np.cos(psi)*wz # m/s^2
    ydotdot = -vcxdot*np.sin(psi)-vcx*np.cos(psi)*wz+vcydot*np.cos(psi)-vcy*np.sin(psi)*wz # m/s^2
    xdot = vcx*np.cos(psi)+vcy*np.sin(psi) # m/s
    ydot = -vcx*np.sin(psi)+vcy*np.cos(psi) # m/s
    # Subsystem 4
    vcx_i = xdot*np.cos(psi_i)-ydot*np.sin(psi_i)
    vcy_i = ydot*np.cos(psi_i)+xdot*np.sin(psi_i)
    psidot_i = wz_i
    vcxdot_i = xdotdot*np.cos(psi_i)-xdot*np.sin(psi_i)*psidot_i-ydotdot*np.sin(psi_i)-ydot*np.cos(psi_i)*psidot_i
    vcydot_i = ydotdot*np.cos(psi_i)-ydot*np.sin(psi_i)*psidot_i+xdotdot*np.sin(psi_i)+xdot*np.cos(psi_i)*psidot_i
    g1 = -(m1/c3)*vcxdot_i+(m2/c3)*vcy_i*wz_i-(c1/c3)*vcx_i*np.sqrt((vcx_i**2)+(vcy_i**2))+(c2/c3)*vcy_i*np.sqrt((vcx_i**2)+(vcy_i**2))*np.arctan2(vcy_i,vcx_i)
    g2 = (m2/c3)*vcydot_i+(m1/c3)*vcx_i*wz_i+(c1/c3)*vcy_i*np.sqrt((vcx_i**2)+(vcy_i**2))+(c2/c3)*vcx_i*np.sqrt((vcx_i**2)+(vcy_i**2))*np.arctan2(vcy_i,vcx_i)
    A = 12*np.sin(2*np.pi*f*t+np.pi) # tail-frequency equation, from the Simulink model
    if A >= 0.1:
        wzdot_i = ((m1-m2)/J)*vcx_i*vcy_i-c4*wz_i**2*np.sign(wz_i)-c5*g2-c6*np.sqrt((g1**2)+(g2**2))
    elif A < -0.1:
        wzdot_i = ((m1-m2)/J)*vcx_i*vcy_i-c4*wz_i**2*np.sign(wz_i)-c5*g2+c6*np.sqrt((g1**2)+(g2**2))
    else:
        wzdot_i = ((m1-m2)/J)*vcx_i*vcy_i-c4*wz_i**2*np.sign(wz)-c5*g2
    if g2 > 0:
        theta_i = np.arctan2(g1,g2)
    elif g2 < 0 and g1 >= 0:
        theta_i = np.arctan2(g1,g2)-np.pi
    elif g2 < 0 and g1 < 0:
        theta_i = np.arctan2(g1,g2)+np.pi
    elif g2 == 0 and g1 > 0:
        theta_i = -np.pi/2
    elif g2 == 0 and g1 < 0:
        theta_i = np.pi/2
    elif g1 == 0 and g2 == 0:
        theta_i = 0
    theta_deg_i = (theta_i*180)/np.pi
    #print theta_deg_i
    return [vcxdot, vcydot, wzdot, psidot, xdot, ydot, vcxdot_i, vcydot_i, psidot_i, wzdot_i, theta_i, theta_deg_i]
# initial conditions
vcx_0 = 0.1257
vcy_0 = 0
wz_0 = 0
psi_0 = 0
x_0 = 0
y_0 = 0
vcx_i_0 = 0.1257
vcy_i_0 = 0
psi_i_0 = 0
wz_i_0 = 0
theta_i_0 = 0.1745
theta_deg_i_0 = 9.866
u0 = [vcx_0, vcy_0, wz_0, psi_0, x_0, y_0, vcx_i_0, vcy_i_0, psi_i_0, wz_i_0, theta_i_0, theta_deg_i_0]
u = odeint(direct, u0, t, tfirst=False)
vcx = u[:,0]
vcy = u[:,1]
wz = u[:,2]
psi = u[:,3]
x = u[:,4]
y = u[:,5]
vcx_i = u[:,6]
vcy_i = u[:,7]
psi_i = u[:,8]
wz_i = u[:,9]
theta_i = u[:,10]
theta_deg_i = u[:,11]
print theta_i
plt.figure(17)
plt.plot(t,theta_i,'r-',linewidth=1,label='theta_i')
plt.xlabel('t [s]')
plt.title('theta_i [rad] (Main body CF)')
plt.legend()
plt.show()
The problem, as you stated, is that theta_i is not a true state variable: odeint integrates whatever direct returns, so direct should return only the time derivatives of the state vector, i.e. have the form:
def direct(vector, t):
    # compute the derivative of every entry of `vector`
    return vector_dot
The quickest and dirtiest solution (without cleaning up the code) is to recompute theta_i afterwards with the function you already defined:
theta_i = [direct(u_i, t_i)[10] for t_i, u_i in zip(t, u)]
I used a shorter interval, t = np.linspace(0,10,8000). It yielded this:
EDIT: how to remove theta from the integrator (note the original return list has 12 entries, of which only the first 10 are genuine state derivatives):
def direct(u, t):
    # your original function, unchanged
    ...
def direct2(u, t):
    # pad with dummy values for theta_i and theta_deg_i (direct overwrites
    # them anyway) and keep only the ten true state derivatives
    u_full = np.append(u, [0.0, 0.0])
    return direct(u_full, t)[:10]
# now integrate the second function, with the initial state trimmed to match
u = odeint(direct2, u0[:10], t)
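The principle generalizes: the right-hand side passed to odeint must return only the time derivatives of the state, and any algebraic quantity (like theta_i) is best recomputed from the solution afterwards. A minimal sketch on a toy oscillator (not from the original post):
import numpy as np
from scipy.integrate import odeint

def rhs(u, t):
    x, v = u
    return [v, -x]  # only the derivatives of the true states

t = np.linspace(0, 10, 500)
u = odeint(rhs, [1.0, 0.0], t)

# the algebraic output is computed after integration, not inside rhs
phase = np.arctan2(u[:, 1], u[:, 0])
print phase[:5]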

Symbolic entropy maximization in SymPy

A simple problem of entropy maximization in statistical mechanics is formulated as follows.
The goal is to maximize the entropy function (is LaTeX still missing on Stack Exchange?):
H = - sum_x P_x ln P_x
subject to the following constraints: the normalization constraint
1 = sum_x P_x
and the constraint of average energy
U = sum_x E_x P_x
where the index x runs over 1,2,...,n. E_x represents the energy of the system when it is in microscopic state x, and P_x is the probability of the system being in microscopic state x.
The solution to such a problem can be obtained by the method of Lagrange multipliers. In this context, it works as follows...
Firstly, the Lagrangian is defined as
L = H + a( 1 - sum_x P_x ) + b( U - sum_x P_x E_x )
Here, a and b are the Lagrange multipliers. The Lagrangian L is a function of a, b and the probabilities P_x for x=1,2,...,n. The term a( 1 - sum_x P_x ) corresponds to the normalization constraint and the term b( U - sum_x P_x E_x ) to the average-energy constraint.
Secondly, the partial derivatives of L with respect to a, b and the P_x for the different x=1,2,...,n are calculated. These result in
dL/da = 1 - sum_x P_x
dL/db = U - sum_x E_x P_x
dL/dP_x = dH/dP_x - a - b E_x = - ln P_x - 1 - a - b E_x
Thirdly, we find the solution by equating these derivatives to zero. This makes sense since there are 2+n equations and 2+n unknowns: the P_x, a and b. The solution of these equations reads
P_x = exp( - b E_x ) / Z
where
Z = sum_x exp( - b E_x )
and b is implicitly determined by the relation
U = sum_x P_x E_x = ( 1 / Z ) sum_x exp( -b E_x ) E_x
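As a quick numerical sanity check of this closed form (a sketch with made-up energy levels, not part of the derivation):
import numpy as np

E = np.array([0.0, 1.0, 2.5])  # hypothetical energies E_x
b = 0.7                        # a fixed value of the multiplier b

Z = np.sum(np.exp(-b*E))
P = np.exp(-b*E) / Z

print np.sum(P)    # normalization constraint: prints 1.0
print np.sum(P*E)  # the average energy U implied by this choice of b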
Now that the "mathematical" problem is defined, let's state the "computational" problem (which is the one I want to ask about here).
The computational problem is the following: I would like to reproduce the previous derivation in SymPy.
The idea is to automate the process so that, eventually, I can attack similar but more complicated problems.
I have already made some progress. Still, I think I haven't used the best approach. This is my solution:
>>> # Let's attempt to derive these analytical results using SymPy.
>>> import sympy as sy
>>> import sympy.tensor as syt
>>> # Here, n is introduced to specify an abstract range for x and y.
>>> n = sy.symbols( 'n' , integer = True )
>>> a , b = sy.symbols( 'a b' ) # The Lagrange multipliers.
>>> x = syt.Idx( 'x' , n ) # Index x for P_x.
>>> y = syt.Idx( 'y' , n ) # Index y for P_y; required to take derivatives under SymPy's rules.
>>> P = syt.IndexedBase( 'P' ) # The unknowns P_x.
>>> E = syt.IndexedBase( 'E' ) # The knowns E_x; each being the energy of state x.
>>> U = sy.Symbol( 'U' ) # Mean energy.
>>>
>>> # Entropy
>>> H = sy.Sum( - P[x] * sy.log( P[x] ) , x )
>>>
>>> # Lagrangian
>>> L = H + a * ( 1 - sy.Sum( P[x] , x ) ) + b * ( U - sy.Sum( E[x] * P[x] , x ) )
>>> # Let's compute the derivatives
>>> dLda = sy.diff( L , a )
>>> dLdb = sy.diff( L , b )
>>> dLdPy = sy.diff( L , P[y] )
>>> # These look like
>>>
>>> print dLda
-Sum(P[x], (x, 0, n - 1)) + 1
>>>
>>> print dLdb
U - Sum(E[x]*P[x], (x, 0, n - 1))
>>>
>>> print dLdPy
-a*Sum(KroneckerDelta(x, y), (x, 0, n - 1)) - b*Sum(KroneckerDelta(x, y)*E[x], (x, 0, n - 1)) + Sum(-log(P[x])*KroneckerDelta(x, y) - KroneckerDelta(x, y), (x, 0, n - 1))
>>> # The following approach does not work
>>>
>>> tmp = dLdPy.doit()
>>> print tmp
-a*Piecewise((1, 0 <= y), (0, True)) - b*Piecewise((E[y], 0 <= y), (0, True)) + Piecewise((-log(P[y]) - 1, 0 <= y), (0, True))
>>>
>>> sy.solve( tmp , P[y] )
[]
>>> # Hence, we try an ad-hoc procedure
>>> Px = sy.Symbol( 'Px' )
>>> Ex = sy.Symbol( 'Ex' )
>>> tmp2 = dLdPy.doit().subs( P[y] , Px ).subs( E[y] , Ex ).subs( y , 0 )
>>> print tmp2
-Ex*b - a - log(Px) - 1
>>> Px = sy.solve( tmp2 , Px )
>>> print Px
[exp(-Ex*b - a - 1)]
Is there a "better" way to proceed? In particular, I don't like the idea of substituting P[y] with Px and E[y] with Ex in order to solve the equations. Why can't the equations be solved in terms of P[y]?
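One alternative worth sketching (an assumption, not a full answer): for a concrete, finite n the Idx/IndexedBase machinery can be dropped, and solve then handles the unknowns directly:
import sympy as sy

n = 3  # a concrete number of states, for illustration
a, b, U = sy.symbols('a b U')
P = sy.symbols('P0:%d' % n, positive=True)
E = sy.symbols('E0:%d' % n)

H = -sum(p * sy.log(p) for p in P)
L = H + a*(1 - sum(P)) + b*(U - sum(e*p for e, p in zip(E, P)))

# stationarity in each P_x yields the Boltzmann form directly
print [sy.solve(sy.diff(L, p), p)[0] for p in P]
# -> [exp(-E0*b - a - 1), exp(-E1*b - a - 1), exp(-E2*b - a - 1)]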

generate N random numbers from a skew normal distribution using numpy

I need a function in python to return N random numbers from a skew normal distribution. The skew needs to be taken as a parameter.
e.g. my current use is
x = numpy.random.randn(1000)
and the ideal function would be e.g.
x = randn_skew(1000, skew=0.7)
The solution needs to work with Python 2.7 and numpy 1.9.
A similar answer is here: skew normal distribution in scipy. However, that generates the PDF, not the random numbers.
I start by generating the PDF curves for reference:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

NUM_SAMPLES = 100000
SKEW_PARAMS = [-3, 0]

def skew_norm_pdf(x, e=0, w=1, a=0):
    # adapted from:
    # http://stackoverflow.com/questions/5884768/skew-normal-distribution-in-scipy
    # (n.b. a normalized pdf would use 2.0/w; with scale w=1 below it makes no difference)
    t = (x-e) / w
    return 2.0 * w * stats.norm.pdf(t) * stats.norm.cdf(a*t)

# generate the skew normal PDF for reference:
location = 0.0
scale = 1.0
x = np.linspace(-5, 5, 100)
plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = skew_norm_pdf(x, location, scale, alpha_skew)
    # n.b. alpha is a parameter that controls skew, but the 'skewness'
    # as measured will be different. See the wikipedia page:
    # https://en.wikipedia.org/wiki/Skew_normal_distribution
    plt.plot(x, p)
Next I found a VB implementation of sampling random numbers from the skew normal distribution and converted it to python:
# literal adaptation from:
# http://stackoverflow.com/questions/4643285/how-to-generate-random-numbers-that-follow-skew-normal-distribution-in-matlab
# original at:
# http://www.ozgrid.com/forum/showthread.php?t=108175
def rand_skew_norm(fAlpha, fLocation, fScale):
    sigma = fAlpha / np.sqrt(1.0 + fAlpha**2)
    afRN = np.random.randn(2)
    u0 = afRN[0]
    v = afRN[1]
    u1 = sigma*u0 + np.sqrt(1.0 - sigma**2) * v
    if u0 >= 0:
        return u1*fScale + fLocation
    return (-u1)*fScale + fLocation

def randn_skew(N, skew=0.0):
    return [rand_skew_norm(skew, 0, 1) for x in range(N)]
# let's check they at least visually match the PDF:
plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)
And then I wrote a quick vectorized version which (without extensive testing) appears to be correct:
def randn_skew_fast(N, alpha=0.0, loc=0.0, scale=1.0):
    sigma = alpha / np.sqrt(1.0 + alpha**2)
    u0 = np.random.randn(N)
    v = np.random.randn(N)
    u1 = (sigma*u0 + np.sqrt(1.0 - sigma**2)*v) * scale
    u1[u0 < 0] *= -1
    u1 = u1 + loc
    return u1

# let's check again
plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew_fast(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)
from scipy.stats import skewnorm

a = 10
data = skewnorm.rvs(a, size=1000)

Here, a is the skewness parameter; see the scipy.stats.skewnorm documentation:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skewnorm.html
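For completeness (this relies on the standard loc/scale arguments of scipy's distribution API), location and scale can be set in the same call:
from scipy.stats import skewnorm

# skewness a=10, centered at 2.0 with scale 0.5
data = skewnorm.rvs(10, loc=2.0, scale=0.5, size=1000)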
Adapted from the rsnorm function in the fGarch R package:
import numpy

def random_snorm(n, mean=0, sd=1, xi=1.5):
    def random_snorm_aux(n, xi):
        weight = xi/(xi + 1/xi)
        z = numpy.random.uniform(-weight, 1-weight, n)
        xi_ = xi**numpy.sign(z)
        random = -numpy.absolute(numpy.random.normal(0, 1, n))/xi_ * numpy.sign(z)
        m1 = 2/numpy.sqrt(2 * numpy.pi)
        mu = m1 * (xi - 1/xi)
        sigma = numpy.sqrt((1 - m1**2) * (xi**2 + 1/xi**2) + 2 * m1**2 - 1)
        return (random - mu)/sigma
    return random_snorm_aux(n, xi) * sd + mean
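A quick usage check (a sketch): since the helper standardizes its output, the samples should come back with roughly the requested mean and sd:
samples = random_snorm(100000, mean=2.0, sd=3.0, xi=1.5)
print numpy.mean(samples), numpy.std(samples)  # should be close to 2.0 and 3.0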

Why does SymPy ignore my initial condition?

import sympy as sp
t = sp.Symbol("t")
y = sp.Function("y")
v = sp.Function("v")
sol = sp.dsolve([
    y(t).diff(t) - v(t),
    v(t).diff(t) + 2*y(t)
], ics={y(0): 2, v(0): 0.45})
sol
gives:
[y(t) == C1*sin(sqrt(2)*t) + C2*cos(sqrt(2)*t),
v(t) == sqrt(2)*C1*cos(sqrt(2)*t) - sqrt(2)*C2*sin(sqrt(2)*t)]
Why are C1 and C2 not calculated?
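A likely explanation (an assumption based on older SymPy behavior, not something stated in the question): for systems of ODEs, older versions of dsolve silently ignored ics, so the constants must be determined by hand. A sketch of that workaround:
import sympy as sp

t = sp.Symbol("t")
y = sp.Function("y")
v = sp.Function("v")
sol = sp.dsolve([y(t).diff(t) - v(t), v(t).diff(t) + 2*y(t)])

# impose y(0) = 2 and v(0) = 0.45 manually and solve for C1, C2
eqs = [sol[0].rhs.subs(t, 0) - 2, sol[1].rhs.subs(t, 0) - 0.45]
constants = sp.solve(eqs, sp.symbols("C1 C2"))
print [eq.subs(constants) for eq in sol]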