Calculate gradient for only part of a shared variable array

I want to do the following:
import theano, numpy, theano.tensor as T
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
b = T.sum(a * w)
grad = T.grad(b, w_sub)
Here, w_sub is for example w[1], but I do not want to explicitly write b as a function of w_sub. Despite going through this and other related issues, I can't solve it.
This is just to show you my problem. Actually, what I really want to do is a sparse convolution with Lasagne. The zero entries in the weight matrix do not need to be updated and therefore there is no need to calculate the gradient for these entries of w.
This is now the complete error message:
Traceback (most recent call last):
File "D:/Jeroen/Project_Lasagne_General/test_script.py", line 9, in <module>
grad = T.grad(b, w_sub)
File "C:\Anaconda2\lib\site-packages\theano\gradient.py", line 545, in grad
handle_disconnected(elem)
File "C:\Anaconda2\lib\site-packages\theano\gradient.py", line 532, in handle_disconnected
raise DisconnectedInputError(message)
theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: Subtensor{int64}.0
Backtrace when the node is created:
File "D:/Jeroen/Project_Lasagne_General/test_script.py", line 6, in <module>
w_sub = w[1]

When Theano compiles the graph, it only sees the variables that are explicitly used to define it. In your example, w_sub is not explicitly used in the computation of b and is therefore not part of the computational graph.
Using Theano's printing library with the following code, you can see in the graph visualization that w_sub is indeed not part of the graph of b.
import theano
import theano.tensor as T
import numpy
import theano.d3viz as d3v
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
b = T.sum(a * w)
o = b, w_sub
d3v.d3viz(o, 'b.html')
To fix the problem, you need to explicitly use w_sub in the computation of b.
Then you will be able to compute the gradient of b with respect to w_sub and update the values of the shared variable, as in the following example:
import theano
import theano.tensor as T
import numpy
a = T.fvector('a')
w = theano.shared(numpy.array([1, 2, 3, 4], dtype=theano.config.floatX))
w_sub = w[1]
b = T.sum(a * w_sub)
grad = T.grad(b, w_sub)
updates = [(w, T.inc_subtensor(w_sub, -0.1*grad))]
f = theano.function([a], b, updates=updates, allow_input_downcast=True)
f(numpy.arange(10))
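For completeness, a quick check (my addition) that the call above updates only the selected entry of w:
print(w.get_value())
# [ 1.  -2.5  3.   4. ]  -- only w[1] changed: grad = sum(a) = 45, step = -0.1 * 45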

Related

Surface Plotting on Python 2.7 with pyplot

I am new to Python. I have been trying to plot a data file that contains 3 columns and 1024 data points. While running the code the following error arises:
Traceback (most recent call last):
File "plot-data.py", line 27, in <module>
linewidth=0, antialiased=False)
File "/home/ritajit/.local/lib/python2.7/site-packages/mpl_toolkits/mplot3d/axes3d.py", line 1624, in plot_surface
X, Y, Z = np.broadcast_arrays(X, Y, Z)
File "/home/ritajit/.local/lib/python2.7/site-packages/numpy/lib/stride_tricks.py", line 249, in broadcast_arrays
shape = _broadcast_shape(*args)
File "/home/ritajit/.local/lib/python2.7/site-packages/numpy /lib/stride_tricks.py", line 184, in _broadcast_shape
b = np.broadcast(*args[:32])
ValueError: shape mismatch: objects cannot be broadcast to a single shape
My code looks like this
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.mlab import griddata
import matplotlib.cm as cm
from pylab import rcParams
rcParams['figure.figsize'] = 9, 9
## 3D surface_plot
fig = plt.figure()
axes = fig.add_subplot(111, projection='3d') #gca = get current axis
data = np.loadtxt('2D-data.txt')
x = data[:,0]
y = data[:,1]
z = data[:,2]
xi = np.unique(x)
yi = np.unique(y)
xv, yv = np.meshgrid(x,y)
Z = griddata(x, y, z, xi, yi, interp='linear')
# surface_plot with color grading and color bar
p = axes.plot_surface(xv,yv,Z, rstride=4, cstride=4, cmap=cm.RdBu,
linewidth=0, antialiased=False)
fig.colorbar(p, shrink=0.5)
axes.set_xlabel('$x$',fontsize=15)
axes.set_ylabel('$y$',fontsize=15)
axes.set_zlabel('$z$',fontsize=15)
plt.tight_layout()
fig.savefig("surface.pdf")
plt.show()
I am unable to work through this.
What am I doing wrong?
Is there any other way to plot a 3D data file?
A few lines from my data file:
1 2 1.30884
2 2 1.30925
3 2 1.30974
4 2 1.30841
5 2 1.30864
6 2 1.30795
The 1st, 2nd, and 3rd columns are x, y, and z respectively.
Three main issues here:
You need to meshgrid the unique values, not the original ones
xi = np.unique(x)
yi = np.unique(y)
xv, yv = np.meshgrid(xi,yi)
You need to interpolate on the gridded values
griddata(x, y, z, xv, yv)
You need to plot Z, not z
p = axes.plot_surface(xv,yv,Z)
In total it looks like you could achieve pretty much the same by reshaping the data columns (but the small data excerpt is not enough to judge this).
Last, matplotlib.mlab.griddata will be deprecated in the next version. As an alternative consider scipy.interpolate.griddata. Also have a look at the Contour plot of irregularly spaced data example.
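Putting the three fixes together, here is a minimal sketch (my addition) of the corrected script, using scipy.interpolate.griddata instead of the deprecated matplotlib.mlab.griddata; the file name '2D-data.txt' comes from the question and the data is assumed to lie on a regular grid:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from scipy.interpolate import griddata
import matplotlib.cm as cm

data = np.loadtxt('2D-data.txt')
x, y, z = data[:, 0], data[:, 1], data[:, 2]

# mesh the unique coordinate values, then interpolate z onto that grid
xi = np.unique(x)
yi = np.unique(y)
xv, yv = np.meshgrid(xi, yi)
Z = griddata((x, y), z, (xv, yv), method='linear')

fig = plt.figure(figsize=(9, 9))
axes = fig.add_subplot(111, projection='3d')
p = axes.plot_surface(xv, yv, Z, rstride=4, cstride=4, cmap=cm.RdBu,
                      linewidth=0, antialiased=False)
fig.colorbar(p, shrink=0.5)
plt.show()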

Method for evaluating the unit vector (or normalising a vector) in Python or in the numerical libraries: numpy, scipy [duplicate]

I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this normalisation function:
import numpy as np

def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm
This function handles the situation where the vector v has a norm of 0.
Are there any similar functions provided in sklearn or numpy?
If you're using scikit-learn you can use sklearn.preprocessing.normalize:
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = normalize(x[:,np.newaxis], axis=0).ravel()
print(np.all(norm1 == norm2))
# True
I agree that it would be nice if such a function were part of the included libraries. But it isn't, as far as I know. So here is a version for arbitrary axes that gives optimal performance.
import numpy as np
def normalized(a, axis=-1, order=2):
    l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
    l2[l2 == 0] = 1
    return a / np.expand_dims(l2, axis)

A = np.random.randn(3, 3, 3)
print(normalized(A, 0))
print(normalized(A, 1))
print(normalized(A, 2))
print(normalized(np.arange(3)[:, None]))
print(normalized(np.arange(3)))
This might also work for you
import numpy as np
normalized_v = v / np.sqrt(np.sum(v**2))
but it fails when v has norm 0.
In that case, introducing a small constant to prevent the zero division solves this.
As proposed in the comments one could also use
v/np.linalg.norm(v)
To avoid division by zero I use eps, but that's maybe not great.
def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        norm = np.finfo(v.dtype).eps
    return v / norm
If you have multidimensional data and want each axis normalized to its max or its sum:
import numpy as np

def normalize(_d, to_sum=True, copy=True):
    # d is an (n x dimension) np array
    d = _d if not copy else np.copy(_d)
    d -= np.min(d, axis=0)
    d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
    return d
This uses numpy's peak-to-peak function, np.ptp.
a = np.random.random((5, 3))
b = normalize(a, copy=False)
b.sum(axis=0) # array([1., 1., 1.]), the columns sum to 1
c = normalize(a, to_sum=False, copy=False)
c.max(axis=0) # array([1., 1., 1.]), the max of each column is 1
If you don't need utmost precision, your function can be reduced to:
v_norm = v / (np.linalg.norm(v) + 1e-16)
You mentioned scikit-learn, so I want to share another solution.
scikit-learn MinMaxScaler
In scikit-learn, there is an API called MinMaxScaler which lets you customize the value range as you like.
It also deals with NaN issues for us.
NaNs are treated as missing values: disregarded in fit, and maintained
in transform. ... see reference [1]
Code sample
The code is simple, just type
# Let's say X_train is your input dataframe
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# call MinMaxScaler object
min_max_scaler = MinMaxScaler()
# feed in a numpy array
X_train_norm = min_max_scaler.fit_transform(X_train.values)
# wrap it up if you need a dataframe
df = pd.DataFrame(X_train_norm)
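To customize the output range mentioned above, MinMaxScaler takes a feature_range argument; a short sketch (my addition), reusing the hypothetical X_train from the sample:
# scale each column into [-1, 1] instead of the default [0, 1]
min_max_scaler = MinMaxScaler(feature_range=(-1, 1))
X_train_norm = min_max_scaler.fit_transform(X_train.values)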
Reference
[1] sklearn.preprocessing.MinMaxScaler
There is also the function unit_vector() to normalize vectors in the popular transformations module by Christoph Gohlke:
import transformations as trafo
import numpy as np
data = np.array([[1.0, 1.0, 0.0],
                 [1.0, 1.0, 1.0],
                 [1.0, 2.0, 3.0]])
print(trafo.unit_vector(data, axis=1))
If you work with multidimensional arrays, the following fast solution is possible.
Say we have a 2D array which we want to normalize along the last axis, while some rows have zero norm.
import numpy as np
arr = np.array([
    [1, 2, 3],
    [0, 0, 0],
    [5, 6, 7]
], dtype=float)
lengths = np.linalg.norm(arr, axis=-1)
print(lengths) # [ 3.74165739 0. 10.48808848]
arr[lengths > 0] = arr[lengths > 0] / lengths[lengths > 0][:, np.newaxis]
print(arr)
# [[0.26726124 0.53452248 0.80178373]
# [0. 0. 0. ]
# [0.47673129 0.57207755 0.66742381]]
If you want to normalize n-dimensional feature vectors stored in a 3D tensor, you could also use PyTorch:
import numpy as np
from torch import FloatTensor
from torch.nn.functional import normalize
vecs = np.random.rand(3, 16, 16, 16)
norm_vecs = normalize(FloatTensor(vecs), dim=0, eps=1e-16).numpy()
If you're working with 3D vectors, you can do this concisely using the toolbelt vg. It's a light layer on top of numpy and it supports single values and stacked vectors.
import numpy as np
import vg
x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = vg.normalize(x)
print(np.all(norm1 == norm2))
# True
I created the library at my last startup, where it was motivated by uses like this: simple ideas which are way too verbose in NumPy.
Without sklearn and using just numpy, just define a function, assuming that the rows are the variables and the columns the samples (axis=1):
import numpy as np
# Example array
X = np.array([[1,2,3],[4,5,6]])
def stdmtx(X):
    means = X.mean(axis=1)
    stds = X.std(axis=1, ddof=1)
    X = X - means[:, np.newaxis]
    X = X / stds[:, np.newaxis]
    return np.nan_to_num(X)
output:
X
array([[1, 2, 3],
       [4, 5, 6]])

stdmtx(X)
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.]])
For a 2D array, you can use the following one-liner to normalize across rows. To normalize across columns, simply set axis=0.
a / np.linalg.norm(a, axis=1, keepdims=True)
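A quick check (my addition) that the one-liner gives unit rows; note that a row of all zeros would still cause a division by zero here:
import numpy as np
a = np.array([[3.0, 4.0], [1.0, 0.0]])
rows_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
print(np.linalg.norm(rows_unit, axis=1))  # [1. 1.] -- every row now has unit norm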
If you want all values in [0, 1] for a 1d array, then just use
(a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
where a is your 1d array.
An example:
>>> a = np.array([0, 1, 2, 4, 5, 2])
>>> (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
array([0. , 0.2, 0.4, 0.8, 1. , 0.4])
A note on the method: to preserve the proportions between values there is a restriction, namely that the 1d array must contain at least one 0 and consist only of 0 and positive numbers.
A simple dot product would do the job. No need for any extra package.
x = x/np.sqrt(x.dot(x))
By the way, if the norm of x is zero, it is inherently a zero vector, and cannot be converted to a unit vector (which has norm 1). If you want to catch the case of np.array([0,0,...0]), then use
norm = np.sqrt(x.dot(x))
x = x/norm if norm != 0 else x

Unable to interpolate data using Rbf in Scipy

I tried to interpolate the data using Rbf.
import numpy as np, matplotlib.pyplot as plt
from scipy.interpolate import Rbf
x = np.array([110, 112, 114, 115, 119, 120, 122, 124]).astype(float)
y = np.array([60, 61, 63, 67, 68, 70, 75, 81]).astype(float)
d = np.array([4, 6, 5, 3, 2, 1, 7, 9]).astype(float)
ulx, lrx = np.min(x), np.max(x)
uly, lry = np.max(y), np.min(y)
xi = np.linspace(ulx, lrx, 4)
yi = np.linspace(uly, lry, 4)
rbfi = Rbf(x, y, d)
di = rbfi(xi, yi)
plt.imshow(di)
plt.show()
However, I got:
TypeError: Invalid dimensions for image data
What is the solution?
With your original data (from the earlier SO question using griddata), this works.
Clean up the x, y, removing the outliers:
yreg=y.reshape(15,15)[:,[0]].repeat(15,1).flatten()
xreg=x.reshape(15,15)[[0],:].repeat(15,0).flatten()
ulx, lrx = np.min(xreg), np.max(xreg)
uly, lry = np.max(yreg), np.min(yreg)
N = 20
xi = np.linspace(ulx, lrx, N)
yi = np.linspace(uly, lry, N)
# grided_data = interpolate.griddata((xreg, yreg), z, (xi.reshape(1,-1), yi.reshape(-1,1)),
#                                    method='nearest', fill_value=0)
I don't think Rbf (and similar interpolators) handle broadcasting like griddata does. So I have to construct 2 vectors defining all the interpolation points.
yyi=np.repeat(yi,N)
xxi=np.repeat(xi[None,:],N,0).flatten()
rbfi=interpolate.Rbf(xreg,yreg,z,function='linear')
zzi=rbfi(xxi,yyi).reshape(N,N)
Timing-wise, Rbf is noticeably slower than griddata.
With xreg, yreg, the interpolation results (for both methods) look similar to the image of z.reshape(15,15) - a square with 2 square plateaus in the lower left corner.
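Applied to the small data set in the question, the same idea looks like this minimal sketch (my addition): the original TypeError comes from passing a 1-D result of length 4 to imshow, so the Rbf is evaluated on a full 2-D grid instead.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import Rbf

x = np.array([110, 112, 114, 115, 119, 120, 122, 124], dtype=float)
y = np.array([60, 61, 63, 67, 68, 70, 75, 81], dtype=float)
d = np.array([4, 6, 5, 3, 2, 1, 7, 9], dtype=float)

N = 50
xi = np.linspace(x.min(), x.max(), N)
yi = np.linspace(y.max(), y.min(), N)
xv, yv = np.meshgrid(xi, yi)                       # all interpolation points, not just 4

rbfi = Rbf(x, y, d)
di = rbfi(xv.ravel(), yv.ravel()).reshape(N, N)    # 2-D array, so imshow works

plt.imshow(di)
plt.colorbar()
plt.show()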

Converting theano tensor types

I have a computation graph built with Theano. It goes like this:
import theano
from theano import tensor as T
import numpy as np
W1 = theano.shared( np.random.rand(45,32).astype('float32'), 'W1')
b1 = theano.shared( np.random.rand(32).astype('float32'), 'b1')
W2 = theano.shared( np.random.rand(32,3).astype('float32'), 'W2')
b2 = theano.shared( np.random.rand(3).astype('float32'), 'b2')
input = T.matrix('input')
hidden = T.tanh(T.dot(input, W1)+b1)
output = T.nnet.softmax(T.dot(hidden, W2)+b2)
Now, the mapping is from a vector to a vector. However, input is set as a matrix type so I can pass many vectors through the mapping simultaneously. I'm doing some machine learning and this makes the learning phase more efficient.
The problem is that after the learning phase, I'd like to view the mapping as vector to vector so I can compute:
jac = theano.gradient.jacobian(output, wrt=input)
jacobian complains that input is not TensorType(float32, vector). Is there a way I can change the input tensor type without rebuilding the whole computation graph?
Technically, this is a possible solution:
import theano
from theano import tensor as T
import numpy as np
W1 = theano.shared( np.random.rand(45,32).astype('float32'), 'W1')
b1 = theano.shared( np.random.rand(32).astype('float32'), 'b1')
W2 = theano.shared( np.random.rand(32,3).astype('float32'), 'W2')
b2 = theano.shared( np.random.rand(3).astype('float32'), 'b2')
input = T.vector('input') # it will be reshaped!
hidden = T.tanh(T.dot(input.reshape((-1, 45)), W1)+b1)
output = T.nnet.softmax(T.dot(hidden, W2)+b2)
#Here comes the trick
jac = theano.gradient.jacobian(output.reshape((-1,)), wrt=input).reshape((-1, 45, 3))
In this way jac.eval({input: np.random.rand(10*45)}).shape will result in (100, 45, 3)!
The problem is that it calculates the derivative across the batch index. So in theory the first 1x45 inputs can affect all the 10x3 outputs (in a batch of length 10).
For that, there are several solutions.
You could take the diagonal across the two batch axes; unfortunately Theano does not implement this, but numpy does, as in the sketch below.
I think it can be done with a scan, but that is another matter.
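A hedged numpy sketch (my addition) of the diagonal idea: assuming the full Jacobian has been evaluated and reshaped to (batch, out_dim, batch, in_dim), the per-sample Jacobians sit where the two batch indices coincide, and np.diagonal extracts them.
import numpy as np

batch, out_dim, in_dim = 10, 3, 45
# stand-in for the evaluated full Jacobian, reshaped to four axes
J_full = np.random.rand(batch, out_dim, batch, in_dim)

per_sample = np.diagonal(J_full, axis1=0, axis2=2)  # shape (out_dim, in_dim, batch)
per_sample = np.moveaxis(per_sample, -1, 0)         # shape (batch, out_dim, in_dim)
print(per_sample.shape)  # (10, 3, 45)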

Using polyfit to plot scatter points with errors

I can use np.polyfit to fit a line in my scatter plot as shown below
a = np.array([1.08,2.05,1.56,0.73,1.1,0.73,0.34,0.73,0.88,2.05])
b=np.array([4.72131259, 6.60937492, 6.41485738, 6.82386894, 6.20293278, 7.22670489, 6.15681295, 5.91595178, 6.43917035, 6.64453907])
m1, b1 = np.polyfit(a, b, 1)
corr1 =a1.plot(a, m1*a+b1, '-', color='black')
a1.scatter(a, b)
Is there any way to fit a line using polyfit, this time taking into account the errors on my points, as shown below?
ae = np.empty(10)
ae.fill(0.15)
be = ae
sca1=a1.errorbar(a, b, ae, be, capsize=0, ls='none', color='black', elinewidth=1)
If you want to compute the fit and plot the error bars (fixed at the given value) over the fit points, then this code will do the job:
import numpy as np
import matplotlib.pyplot as mp
a = np.array([1.08,2.05,1.56,0.73,1.1,0.73,0.34,0.73,0.88,2.05])
b=np.array([4.72131259, 6.60937492, 6.41485738, 6.82386894, 6.20293278, 7.22670489, 6.15681295, 5.91595178, 6.43917035, 6.64453907])
ae = np.empty(10)
ae.fill(0.15)
be = ae
m1, b1 = np.polyfit(a, b, 1)
mp.figure()
corr1 =mp.errorbar(a,m1*a+b1,ae,be, '-', color='black')
mp.scatter(a, b)
mp.show()
If you want to get the covariance of the fit, and use the standard deviation to set the error bars, instead use the code
import numpy as np
import matplotlib.pyplot as mp
import math
a = np.array([1.08,2.05,1.56,0.73,1.1,0.73,0.34,0.73,0.88,2.05])
b=np.array([4.72131259, 6.60937492, 6.41485738, 6.82386894, 6.20293278, 7.22670489, 6.15681295, 5.91595178, 6.43917035, 6.64453907])
coeff,covar = np.polyfit(a, b, 1,cov=True)
m1= coeff[0]
b1= coeff[1]
xe = math.sqrt(covar[0][0])
ye = math.sqrt(covar[1][1])
mp.figure()
corr1 =mp.errorbar(a,m1*a+b1,xe,ye, '-', color='black')
mp.scatter(a, b)
mp.show()
which gives a similar plot, with the error bars now taken from the covariance of the fit.
If you want to do a weighted fit, you can supply a weight vector to polyfit with the syntax
m2, b2 = np.polyfit(a, b, 1,w=weightvector)
According to the documentation the weightvector should contain 1 over the standard deviation of the data points.
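For instance (my addition), with the constant 0.15 errors (be) defined in the first snippet, the weight vector is simply 1/be:
weightvector = 1.0 / be                 # 1 over the standard deviation of each point
m2, b2 = np.polyfit(a, b, 1, w=weightvector)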
If you want to do a least squares fit weighted by errors in BOTH x and y, I don't think polyfit does this - it will accept a weight vector for one dimension.
To supply errors in both dimensions as weights you would have to use scipy.optimize.leastsq.
There is a documentation page at this link of the Scipy documentation about doing fits with scipy.optimize.leastsq. The example talks about a fit to power law, but clearly a straight line could be done as well.
For errors in one dimension (Y) here, an example using leastsq is:
import numpy as np
import matplotlib.pyplot as mp
import math
from scipy import optimize
a = np.array([1.08,2.05,1.56,0.73,1.1,0.73,0.34,0.73,0.88,2.05])
b=np.array([4.72131259, 6.60937492, 6.41485738, 6.82386894, 6.20293278, 7.22670489, 6.15681295, 5.91595178, 6.43917035, 6.64453907])
aerr = np.empty(10)
aerr.fill(0.15)
berr=aerr
# fit a straight line with scipy scipy.optimize.leastsq
# define our (line) fitting function
fitfunc = lambda p, x: p[0] + p[1] * x
errfunc = lambda p, x, y, err: (y - fitfunc(p, x)) / err
pinit = [1.0, -1.0]
out = optimize.leastsq(errfunc, pinit,args=(a, b, aerr), full_output=1)
coeff = out[0]
covar = out[1]
print('coeff', coeff)
print('covar', covar)
m1= coeff[1]
b1= coeff[0]
xe = math.sqrt(covar[0][0])
ye = math.sqrt(covar[1][1])
# plot results
mp.figure()
corr2 =mp.errorbar(a,m1*a+b1,xe,ye, '-', color='red')
mp.scatter(a, b)
mp.show()
To take into account errors in both X and Y, you would have to change the definition of errfunc to reflect the specific technique you are using to do that. If a lambda isn't convenient, you can instead define a function. I can't comment further on this without knowing what technique is being used to weight by X and Y errors; there are several in the literature.
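One technique from the literature that handles errors in both X and Y is orthogonal distance regression; here is a minimal sketch (my addition) with scipy.odr, reusing a, b, aerr and berr from the example above:
from scipy import odr

# straight-line model: beta[0] + beta[1] * x
linear = odr.Model(lambda beta, x: beta[0] + beta[1] * x)
data = odr.RealData(a, b, sx=aerr, sy=berr)
out = odr.ODR(data, linear, beta0=[1.0, 1.0]).run()
print('coeff', out.beta)       # fitted [intercept, slope]
print('stderr', out.sd_beta)   # standard errors of the parameters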