How to provide custom gradient in TensorFlow - python-2.7

I am trying to understand that how to use #tf.custom_gradient function available in TensorFlow 1.7 for providing a custom gradient of a vector with respect to a vector. Below code is the minimum working example which solves following problem to get dz/dx.
y=Ax
z=||y||2
Also, this attached image describes the solution as expected by manually calulation
If I do not use the #tf.custom_gradient then the TensorFlow gives the desired solution as expected. My question is that how can I provide custom gradient for y=Ax? We know that dy/dx = A^T as shown in the above attachment which shows steps of calculation that matches the TensorFlow output.
import tensorflow as tf
#I want to write custom gradient for this function f1
def f1(A,x):
y=tf.matmul(A,x,name='y')
return y
#for y= Ax, the derivative is: dy/dx= transpose(A)
#tf.custom_gradient
def f2(A,x):
y=f1(A,x)
def grad(dzByDy): # dz/dy = 2y reaches here correctly.
dzByDx=tf.matmul(A,dzByDy,transpose_a=True)
return dzByDx
return y,grad
x= tf.constant([[1.],[0.]],name='x')
A= tf.constant([ [1., 2.], [3., 4.]],name='A')
y=f1(A,x) # This works as desired
#y=f2(A,x) #This line gives Error
z=tf.reduce_sum(y*y,name='z')
g=tf.gradients(ys=z,xs=x)
with tf.Session() as sess:
print sess.run(g)

Since your function f2() has two inputs, you have to provide a gradient to flow back to each of them. The error you see:
Num gradients 2 generated for op name: "IdentityN" [...] do not match num inputs 3
is admittedly quite cryptic, though. Supposing you never want to calculate dy/dA, you can just return None, dzByDx. The code below (tested):
import tensorflow as tf
#I want to write custom gradient for this function f1
def f1(A,x):
y=tf.matmul(A,x,name='y')
return y
#for y= Ax, the derivative is: dy/dx= transpose(A)
#tf.custom_gradient
def f2(A,x):
y=f1(A,x)
def grad(dzByDy): # dz/dy = 2y reaches here correctly.
dzByDx=tf.matmul(A,dzByDy,transpose_a=True)
return None, dzByDx
return y,grad
x= tf.constant([[1.],[0.]],name='x')
A= tf.constant([ [1., 2.], [3., 4.]],name='A')
#y=f1(A,x) # This works as desired
y=f2(A,x) #This line gives Error
z=tf.reduce_sum(y*y,name='z')
g=tf.gradients(ys=z,xs=x)
with tf.Session() as sess:
print sess.run( g )
outputs:
[array([[20.],
[28.]], dtype=float32)]
as desired.

Related

Integrating an array in scipy with bounds.

I am trying to integrate over an array of data, but with bounds. Therfore I planned to use simps (scipy.integrate.simps). Because simps itself does not support bounds I decided to feed it only the selection of my data I want to integrate over. Yet this leads to strange results which are twice as big as the expected outcome.
What am I doing wrong, or what am I missing, or missunderstanding?
# -*- coding: utf-8 -*-
from scipy import integrate
from scipy import interpolate
import numpy as np
import matplotlib.pyplot as plt
# my data
x = np.linspace(-10, 10, 30)
y = x**2
# but I only want to integrate from 3 to 5
f = interpolate.interp1d(x, y)
x_selection = np.linspace(3, 5, 10)
y_selection = f(x_selection)
# quad returns the expected result
print 'quad', integrate.quad(f, 3, 5), '<- the expected value (includig error estimation)'
# but simps returns an uexpected result, when using the selected data
print 'simps', integrate.simps(x_selection, y_selection), '<- twice as big'
print 'trapz', integrate.trapz(x_selection, y_selection), '<- also twice as big'
plt.plot(x, y, marker='.')
plt.fill_between(x, y, 0, alpha=0.5)
plt.plot(x_selection, y_selection, marker='.')
plt.fill_between(x_selection, y_selection, 0, alpha=0.5)
plt.show()
Windows7, python2.7, scipy1.0.0
The Arguments for simps() and trapz() are in the wrong order.
You have flipped the calling arguments; simps and trapz expect first the y dimension, and second the x dimension, as per the docs. Once you have corrected this, similar results should obtain. Note that your example function admits a trivial analytic antiderivative, which would be much cheaper to evaluate.
– N. Wouda

Keras ImageDataGenerator: random transform

I'm interested in augmenting my dataset with random image transformations. I'm using Keras ImageDataGenerator, and I'm getting the following error when trying to apply random_transform to a single image:
--> x = apply_transform(x, transform matrix, img_channel_axis, fill_mode, cval)
>>> RuntimeError: affine matrix has wrong number of rows.
I found the source code for the ImageDataGenerator here. However, I'm not sure how to debug the runtime error. Below is the code I have:
from keras.preprocessing.image import img_to_array, load_img
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.inception_v3 import preprocess_input
image_path = './figures/zebra.jpg'
#data augmentation
train_datagen = ImageDataGenerator(
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest')
print "\nloading image..."
image = load_img(image_path, target_size=(299, 299))
image = img_to_array(image)
image = np.expand_dims(image, axis=0) # 1 x input_shape
image = preprocess_input(image)
train_datagen.fit(image)
image = train_datagen.random_transform(image)
The error occurs at the last line when calling random_transform.
The problem is that random_transform expects a 3D-array.
See the docstring:
def random_transform(self, x, seed=None):
"""Randomly augment a single image tensor.
# Arguments
x: 3D tensor, single image.
seed: random seed.
# Returns
A randomly transformed version of the input (same shape).
"""
So you'll need to call it before np.expand_dims.

How to create a container in pymc3

I am trying to build a model for the likelihood function of a particular outcome of a Langevin equation (Brownian particle in a harmonic potential):
Here is my model in pymc2 that seems to work:
https://github.com/hstrey/BayesianAnalysis/blob/master/Langevin%20simulation.ipynb
#define the model/function to be fitted.
def model(x):
t = pm.Uniform('t', 0.1, 20, value=2.0)
A = pm.Uniform('A', 0.1, 10, value=1.0)
#pm.deterministic(plot=False)
def S(t=t):
return 1-np.exp(-4*delta_t/t)
#pm.deterministic(plot=False)
def s(t=t):
return np.exp(-2*delta_t/t)
path = np.empty(N, dtype=object)
path[0]=pm.Normal('path_0',mu=0, tau=1/A, value=x[0], observed=True)
for i in range(1,N):
path[i] = pm.Normal('path_%i' % i,
mu=path[i-1]*s,
tau=1/A/S,
value=x[i],
observed=True)
return locals()
mcmc = pm.MCMC( model(x) )
mcmc.sample( 20000, 2000, 10 )
The basic idea is that each point depends on the previous point in the chain (Markov chain). Btw, x is an array of data, N is its length, delta_t is the time step =0.01. Any idea how to implement this in pymc3? I tried:
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
t = pm.Uniform('t', 0.1, 20)
A = pm.Uniform('A', 0.1, 10)
S=1-pm.exp(-4*delta_t/t)
s=pm.exp(-2*delta_t/t)
path = np.empty(N, dtype=object)
path[0]=pm.Normal('path_0',mu=0, tau=1/A, observed=x[0])
for i in range(1,N):
path[i] = pm.Normal('path_%i' % i,
mu=path[i-1]*s,
tau=1/A/S,
observed=x[i])
Unfortunately the model crashes as soon as I try to run it. I tried some pymc3 examples (tutorial) on my machine and this is working.
Thanks in advance. I am really hoping that the new samplers in pymc3 will help me with this model. I am trying to apply Bayesian methods to single-molecule experiments.
Rather than creating many individual normally-distributed 1-D variables in a loop, you can make a custom distribution (by extending Continuous) that knows the formula for computing the log likelihood of your entire path. You can bootstrap this likelihood formula off of the Normal likelihood formula that pymc3 already knows. See the built-in AR1 class for an example.
Since your particle follows the Markov property, your likelihood looks like
import theano.tensor as T
def logp(path):
now = path[1:]
prev = path[:-1]
loglik_first = pm.Normal.dist(mu=0., tau=1./A).logp(path[0])
loglik_rest = T.sum(pm.Normal.dist(mu=prev*ss, tau=1./A/S).logp(now))
loglik_final = loglik_first + loglik_rest
return loglik_final
I'm guessing that you want to draw a value for ss at every time step, in which case you should make sure to specify ss = pm.exp(..., shape=len(x)-1), so that prev*ss in the block above gets interpreted as element-wise multiplication.
Then you can just specify your observations with
path = MyLangevin('path', ..., observed=x)
This should run much faster.
Since I did not see an answer to my question, let me answer it myself. I came up with the following solution:
# now lets model this data using pymc
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
D = pm.Gamma('D',mu=mu_D,sd=sd_D)
A = pm.Gamma('A',mu=mu_A,sd=sd_A)
S=1.0-pm.exp(-2.0*delta_t*D/A)
ss=pm.exp(-delta_t*D/A)
path=pm.Normal('path_0',mu=0.0, tau=1/A, observed=x[0])
for i in range(1,N):
path = pm.Normal('path_%i' % i,
mu=path*ss,
tau=1.0/A/S,
observed=x[i])
start = pm.find_MAP()
print(start)
trace = pm.sample(100000, start=start)
unfortunately, this code takes at N=50 anywhere between 6hours to 2 days to compile. I am running on a pretty fast PC (24Gb RAM) running Ubuntu. I tried to using the GPU but that runs slightly slower. I suspect memory problems since it uses 99.8% of the memory when running. I tried the same calculation with Stan and it only takes 2min to run.

Fitting a Gaussian, getting a straight line. Python 2.7

As my title suggests, I'm trying to fit a Gaussian to some data and I'm just getting a straight line. I've been looking at these other discussion Gaussian fit for Python and Fitting a gaussian to a curve in Python which seem to suggest basically the same thing. I can make the code in those discussions work fine for the data they provide, but it won't do it for my data.
My code looks like this:
import pylab as plb
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy import asarray as ar,exp
y = y - y[0] # to make it go to zero on both sides
x = range(len(y))
max_y = max(y)
n = len(y)
mean = sum(x*y)/n
sigma = np.sqrt(sum(y*(x-mean)**2)/n)
# Someone on a previous post seemed to think this needed to have the sqrt.
# Tried it without as well, made no difference.
def gaus(x,a,x0,sigma):
return a*exp(-(x-x0)**2/(2*sigma**2))
popt,pcov = curve_fit(gaus,x,y,p0=[max_y,mean,sigma])
# It was suggested in one of the other posts I looked at to make the
# first element of p0 be the maximum value of y.
# I also tried it as 1, but that did not work either
plt.plot(x,y,'b:',label='data')
plt.plot(x,gaus(x,*popt),'r:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
The data I am trying to fit is as follows:
y = array([ 6.95301373e+12, 9.62971320e+12, 1.32501876e+13,
1.81150568e+13, 2.46111132e+13, 3.32321345e+13,
4.45978682e+13, 5.94819771e+13, 7.88394616e+13,
1.03837779e+14, 1.35888594e+14, 1.76677210e+14,
2.28196006e+14, 2.92781632e+14, 3.73133045e+14,
4.72340762e+14, 5.93892782e+14, 7.41632194e+14,
9.19750269e+14, 1.13278296e+15, 1.38551838e+15,
1.68291212e+15, 2.02996957e+15, 2.43161742e+15,
2.89259207e+15, 3.41725793e+15, 4.00937676e+15,
4.67187762e+15, 5.40667931e+15, 6.21440313e+15,
7.09421973e+15, 8.04366842e+15, 9.05855930e+15,
1.01328502e+16, 1.12585509e+16, 1.24257598e+16,
1.36226443e+16, 1.48356404e+16, 1.60496345e+16,
1.72482199e+16, 1.84140400e+16, 1.95291969e+16,
2.05757166e+16, 2.15360187e+16, 2.23933053e+16,
2.31320228e+16, 2.37385276e+16, 2.42009864e+16,
2.45114362e+16, 2.46427484e+16, 2.45114362e+16,
2.42009864e+16, 2.37385276e+16, 2.31320228e+16,
2.23933053e+16, 2.15360187e+16, 2.05757166e+16,
1.95291969e+16, 1.84140400e+16, 1.72482199e+16,
1.60496345e+16, 1.48356404e+16, 1.36226443e+16,
1.24257598e+16, 1.12585509e+16, 1.01328502e+16,
9.05855930e+15, 8.04366842e+15, 7.09421973e+15,
6.21440313e+15, 5.40667931e+15, 4.67187762e+15,
4.00937676e+15, 3.41725793e+15, 2.89259207e+15,
2.43161742e+15, 2.02996957e+15, 1.68291212e+15,
1.38551838e+15, 1.13278296e+15, 9.19750269e+14,
7.41632194e+14, 5.93892782e+14, 4.72340762e+14,
3.73133045e+14, 2.92781632e+14, 2.28196006e+14,
1.76677210e+14, 1.35888594e+14, 1.03837779e+14,
7.88394616e+13, 5.94819771e+13, 4.45978682e+13,
3.32321345e+13, 2.46111132e+13, 1.81150568e+13,
1.32501876e+13, 9.62971320e+12, 6.95301373e+12,
4.98705540e+12])
I would show you what it looks like, but apparently I don't have enough reputation points...
Anyone got any idea why it's not fitting properly?
Thanks for your help :)
The importance of the initial guess, p0 in curve_fit's default argument list, cannot be stressed enough.
Notice that the docstring mentions that
[p0] If None, then the initial values will all be 1
So if you do not supply it, it will use an initial guess of 1 for all parameters you're trying to optimize for.
The choice of p0 affects the speed at which the underlying algorithm changes the guess vector p0 (ref. the documentation of least_squares).
When you look at the data that you have, you'll notice that the maximum and the mean, mu_0, of the Gaussian-like dataset y, are
2.4e16 and 49 respectively. With the peak value so large, the algorithm, would need to make drastic changes to its initial guess to reach that large value.
When you supply a good initial guess to the curve fitting algorithm, convergence is more likely to occur.
Using your data, you can supply a good initial guess for the peak_value, the mean and sigma, by writing them like this:
y = np.array([...]) # starting from the original dataset
x = np.arange(len(y))
peak_value = y.max()
mean = x[y.argmax()] # observation of the data shows that the peak is close to the center of the interval of the x-data
sigma = mean - np.where(y > peak_value * np.exp(-.5))[0][0] # when x is sigma in the gaussian model, the function evaluates to a*exp(-.5)
popt,pcov = curve_fit(gaus, x, y, p0=[peak_value, mean, sigma])
print(popt) # prints: [ 2.44402560e+16 4.90000000e+01 1.20588976e+01]
Note that in your code, for the mean you take sum(x*y)/n , which is strange, because this would modulate the gaussian by a polynome of degree 1 (it multiplies a gaussian with a monotonously increasing line of constant slope) before taking the mean. That will offset the mean value of y (in this case to the right). A similar remark can be made for your calculation of sigma.
Final remark: the histogram of y will not resemble a Gaussian, as y is already a Gaussian. The histogram will merely bin (count) values into different categories (answering the question "how many datapoints in y reach a value between [a, b]?").

Fitting The Theoretical Equation To My Data

I am very, very new to python, so please bear with me, and pardon my naivety. I am using Spyder Python 2.7 on my Windows laptop. As the title suggests, I have some data, a theoretical equation, and I am attempting to fit my data, with what I believe is the Chi-squared fit. The theoretical equation I am using is
import math
import numpy as np
import scipy.optimize as optimize
import matplotlib.pylab as plt
import csv
#with open('1.csv', 'r') as datafile:
# datareader = csv.reader(datafile)
# for row in datareader:
# print ', '.join(row)
t_y_data = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',', usecols=(1,4), skiprows = 1)
print(t_y_data)
t = t_y_data[:,0]
y = t_y_data[:,1]
gamma0 = [.1]
sigma = [(0.345366)/2]*(len(t))
#len(sigma)
#print(sigma)
#print(len(sigma))
#sigma is the error in our measurements, which is the radius of the object
# Dragfunction is the theoretical equation of the position as a function of time when the thing falling experiences a drag force
# This is the function we are trying to fit to our data
# t is the independent variable time, m is the mass, and D is the Diameter
#Gamma is the value of which python will vary, until chi-squared is a minimum
def Dragfunction(x, gamma):
print x
g = 9.8
D = 0.345366
m = 0.715
# num = math.sqrt(gamma)*D*g*x
# den = math.sqrt(m*g)
# frac = num/den
# print "frac", frac
return ((m)/(gamma*D**2))*math.log(math.cosh(math.sqrt(gamma/m*g)*D*g*t))
optimize.curve_fit(Dragfunction, t, y, gamma0, sigma)
This is the error message I am getting:
return ((m)/(gamma*D**2))*math.log(math.cosh(math.sqrt(gamma/m*g)*D*g*t))
TypeError: only length-1 arrays can be converted to Python scalars
My professor and I have spent about three or four hours trying to fix this. He helped me work out a lot of the problems, but this we can't seem to resolve.
Could someone please help? If there is any other information you need, please let me know.
Your error message comes from the fact that those math functions only accept a scalar, so to call functions on an array, use the numpy versions:
In [82]: a = np.array([1,2,3])
In [83]: np.sqrt(a)
Out[83]: array([ 1. , 1.41421356, 1.73205081])
In [84]: math.sqrt(a)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
----> 1 math.sqrt(a)
TypeError: only length-1 arrays can be converted to Python scalars
In the process, I happened to spot a mathematical error in your code. Your equation at top says that g is in the bottom of the square root inside the log(cosh()), but you've got it on the top because a/b*c == a*c/b in python, not a/(b*c)
log(cosh(sqrt(gamma/m*g)*D*g*t))
should instead be any one of these:
log(cosh(sqrt(gamma/m/g)*D*g*t))
log(cosh(sqrt(gamma/(m*g))*D*g*t))
log(cosh(sqrt(gamma*g/m)*D*t)) # the simplest, by canceling with the g from outside sqrt
A second error is that in your function definition, you have the parameter named x which you never use, but instead you're using t which at this point is a global variable (from your data), so you won't see an error. You won't see an effect using curve_fit since it will pass your t data to the function anyway, but if you tried to call the Dragfunction on a different data set, it would still give you the results from the t values. Probably you meant this:
def Dragfunction(t, gamma):
print t
...
return ... D*g*t ...
A couple other notes as unsolicited advice, since you said you were new to python:
You can load and "unpack" the t and y variables at once with:
t, y = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',', usecols=(1,4), skiprows = 1, unpack=True)
If your error is constant, then sigma has no effect on curve_fit, as it only affects the relative weighting for the fit, so you really don't need it at all.
Below is my version of your code, with all of the above changes in place.
import numpy as np
from scipy import optimize # simplified syntax
import matplotlib.pyplot as plt # pylab != pyplot
# `unpack` lets you split the columns immediately:
t, y = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',',
usecols=(1, 4), skiprows=1, unpack=True)
gamma0 = .1 # does not need to be a list
def Dragfunction(x, gamma):
g = 9.8
D = 0.345366
m = 0.715
gammaD_m = gamma*D*D/m # combination is used twice, only calculate once for (small) speedup
return np.log(np.cosh(np.sqrt(gammaD_m*g)*t)) / gammaD_m
gamma_best, gamma_var = optimize.curve_fit(Dragfunction, t, y, gamma0)