For example, with a NumPy array like the following,
a = ([1,2,3,4,5], [100,200,300,400,500]), with x = a[0] and y = a[1],
how can I plot y against x only where 2 < x < 4?
You could try masks on arrays: http://docs.scipy.org/doc/numpy/reference/routines.ma.html
Here's an example of what I mean:
import numpy as np
x = np.array([1,2,3,4,5])
y = np.array([100,200,300,400,500])
# b is True wherever the corresponding value of x lies outside the closed interval [2, 4]
# (masked_outside keeps the endpoints 2 and 4 themselves)
b = np.ma.masked_outside(x, 2, 4).mask
# x2 is x with the values outside [2, 4] stripped (according to the boolean mask b); the same is done for y2
x2 = x[~b]
y2 = y[~b]
print('x2', x2)
print('y2', y2)
If it's only about plotting, you could just restrict the axis limits:
import matplotlib.pyplot as plt
plt.plot(x,y)
plt.axis((2,4,None,None))
plt.show()
>>> from numpy import array
>>> a=array(([1,2,3,4,5], [100,200,300,400,500]))
>>> a[:, (2 < a[0])*(a[0] < 4)]
array([[  3],
       [300]])
Since that just gives a single point, let's choose another range:
>>> a[:, (1.5 < a[0])*(a[0] < 4.5)]
array([[  2,   3,   4],
       [200, 300, 400]])
To explain, (1.5 < a[0])*(a[0] < 4.5) is a vector of true and false values (multiplying two boolean arrays acts as an elementwise AND). It is true wherever x is between 1.5 and 4.5. NumPy can use such boolean vectors to select just those values. We use that for the second axis.
For the first axis, if we had used 0 (as in a[0, (1.5 < a[0])*(a[0] < 4.5)]), we would have gotten just the x-values between 1.5 and 4.5. If we had used 1, we would have gotten just the y values that correspond to that x-range. If we want an array with both the x and y values, we can use : for the first axis which means "all".
If we want to plot those values:
b = a[:, (1.5 < a[0])*(a[0] < 4.5)]
import matplotlib.pyplot as plt
p = plt.plot(b[0], b[1])
plt.show()
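The same selection can also be sketched with `&` instead of `*`, which some readers find easier to read as "and". This is only an illustration of the boolean-mask idea described above; the names `mask`, `x_sel`, `y_sel` and `both` are made up for the example.

```python
import numpy as np

a = np.array(([1, 2, 3, 4, 5], [100, 200, 300, 400, 500]))
x, y = a[0], a[1]

# Elementwise AND of two comparisons. The parentheses are required
# because & binds more tightly than < in Python.
mask = (1.5 < x) & (x < 4.5)

x_sel = x[mask]      # x-values inside the range
y_sel = y[mask]      # y-values at the same positions
both = a[:, mask]    # both rows at once, same as a[:, ...] above

print(x_sel, y_sel)
```

Passing `x_sel` and `y_sel` to `plt.plot` then draws only the selected points.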
I am attempting to implement a perceptron. I have loaded a 100x2 array of values between 0 and 100. Each item in the array has a label of either -1 or 1.
I believe the perceptron is working; however, I cannot plot the decision boundary as shown in the question "plot decision boundary matplotlib".
When I run my code I only see a single-color background. I would expect to see two colors, one for each label in my data set (-1 and 1).
[Figure: my current output; I expect to see 2 background colors, one per label (-1 or 1)]
[Figure: an example of what I hope to see, from the sklearn documentation]
import numpy as np
from matplotlib import pyplot as plt

def generate_data():
    #generate a dataset that is linearly separable
    group_1 = np.random.randint(50, 100, size=(50,2))
    group_1_labels = np.full((50,1), 1)
    group_2 = np.random.randint(0, 49, size =(50,2))
    group_2_labels = np.full((50,1), -1)
    #add a bias value of -1
    bias = np.full((50,1), -1)
    #add labels, upper right quadrant are 1, lower left are -1
    group_1_with_bias = np.hstack((group_1, bias))
    group_2_with_bias = np.hstack((group_2, bias))
    group_1_labeled = np.hstack((group_1_with_bias, group_1_labels))
    group_2_labeled = np.hstack((group_2_with_bias, group_2_labels))
    #merge our labeled data and shuffle!
    merged_data = np.vstack((group_1_labeled, group_2_labeled))
    np.random.shuffle(merged_data)
    return merged_data

data = generate_data()
#load data, strip labels, add a -1 bias value
X = data[:, :3]
#create labels matrix
l = np.ravel(data[:, 3:])

def perceptron_sgd(X, l, c, epochs):
    #initialize weights
    w = np.zeros(3)
    errors = []
    for epoch in range(epochs):
        total_error = 0
        for i, x in enumerate(X):
            if (np.dot(x, w) * l[i]) <= 0:
                total_error += (np.dot(x, w) * l[i])
                w = w + c * (x * l[i])
        errors.append(total_error * -1)
        print("epoch " + str(epoch) + ": " + str(w))
    return w, errors

def classify(X, l, w):
    z = np.dot(X, w)
    print(z)
    z[z <= 0] = -1
    z[z > 0] = 1
    #return a matrix of predicted labels
    return z

w, errors = perceptron_sgd(X, l, .001, 36)
# X - some data in 2dimensional np.array
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, .2), np.arange(y_min, y_max, .2))
# here "model" is your model's prediction (classification) function
Z = classify(np.c_[xx.ravel(), yy.ravel()], l, w[:-1]) #strip the bias from weights
# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
plt.axis('off')
#Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=l, cmap=plt.cm.Paired)
I got it to work.
Standardize your X:
from sklearn import preprocessing
scaler = preprocessing.StandardScaler().fit(X[:, :-1])
X_trans = np.column_stack((scaler.transform(X[:, :-1]), X[:, -1]))
Use a better initialization than zero:
#initialize weights
r = np.sqrt(2)
w = np.random.uniform(-r, r, (3,))
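Add the learned bias during prediction. The bias input used in training is the constant -1, so the bias weight enters with a minus sign:
z = np.dot(X, w[:-1]) - w[-1]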
Standardize during prediction as well (using the standardization learned from the input):
Z = classify(scaler.transform(np.c_[xx.ravel(), yy.ravel()]),
             l, w) # w is passed whole; classify handles the bias weight
In general, it is a good idea to standardize the inputs.
Entire code:
import numpy as np
from matplotlib import pyplot as plt
# IPython/Jupyter magic; remove this line when running as a plain script
%matplotlib inline

def generate_data():
    #generate a dataset that is linearly separable
    group_1 = np.random.randint(50, 100, size=(50,2))
    group_1_labels = np.full((50,1), 1)
    group_2 = np.random.randint(0, 49, size =(50,2))
    group_2_labels = np.full((50,1), -1)
    #add a bias value of -1
    bias = np.full((50,1), -1)
    #add labels, upper right quadrant are 1, lower left are -1
    group_1_with_bias = np.hstack((group_1, bias))
    group_2_with_bias = np.hstack((group_2, bias))
    group_1_labeled = np.hstack((group_1_with_bias, group_1_labels))
    group_2_labeled = np.hstack((group_2_with_bias, group_2_labels))
    #merge our labeled data and shuffle!
    merged_data = np.vstack((group_1_labeled, group_2_labeled))
    np.random.shuffle(merged_data)
    return merged_data

data = generate_data()
#load data, strip labels, add a -1 bias value
X = data[:, :3]
#create labels matrix
l = np.ravel(data[:, 3:])

from sklearn import preprocessing
#standardize the two feature columns, keep the bias column as it is
scaler = preprocessing.StandardScaler().fit(X[:, :-1])
X_trans = np.column_stack((scaler.transform(X[:, :-1]), X[:, -1]))

def perceptron_sgd(X, l, c, epochs):
    #initialize weights
    r = np.sqrt(2)
    w = np.random.uniform(-r, r, (3,))
    errors = []
    for epoch in range(epochs):
        total_error = 0
        for i, x in enumerate(X):
            if (np.dot(x, w) * l[i]) <= 0:
                total_error += (np.dot(x, w) * l[i])
                w = w + c * (x * l[i])
        errors.append(total_error * -1)
        print("epoch " + str(epoch) + ": " + str(w))
    return w, errors

def classify(X, l, w):
    #the bias input used in training is -1, so the bias weight is subtracted
    z = np.dot(X, w[:-1]) - w[-1]
    print(z)
    z[z <= 0] = -1
    z[z > 0] = 1
    #return a matrix of predicted labels
    return z

w, errors = perceptron_sgd(X_trans, l, .01, 25)
# X - some data in 2dimensional np.array
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, .1), np.arange(y_min, y_max, .1))
# standardize the grid points with the scaler fitted on the training data
Z = classify(scaler.transform(np.c_[xx.ravel(), yy.ravel()]), l, w)
# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.4)
#plt.axis('off')
#Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=l, cmap=plt.cm.Paired)
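One detail worth checking in code like the above is the sign of the bias term. The training rows carry a constant bias input of -1 (see generate_data), so when points are scored without that column, the bias weight should be subtracted rather than added. Here is a small sketch of that check; the weights in `w` and the names `features` and `X_aug` are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 2))          # stand-in for standardized inputs
bias_input = np.full((5, 1), -1.0)          # constant -1 column, as in generate_data()
X_aug = np.hstack((features, bias_input))   # what the training loop sees

w = np.array([0.5, -0.25, 0.1])             # hypothetical learned weights

score_train = np.dot(X_aug, w)                      # score used during training
score_minus = np.dot(features, w[:-1]) - w[-1]      # same score without the bias column
score_plus = np.dot(features, w[:-1]) + w[-1]       # off by 2 * w[-1]

print(np.allclose(score_train, score_minus))
print(np.allclose(score_plus - score_train, 2 * w[-1]))
```

With standardized inputs the bias weight tends to be small, so the two versions may give visually similar boundaries, but only the subtracted form matches what was trained.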
The operation takes two arrays, X and idx, of equal length, where the values of idx range from 0 to (k-1) for a given k, and groups the elements of X by their idx value.
This is the general Python code to illustrate it:
import numpy as np
X = np.arange(6) # Just for a sample of elements
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T # Can only contain values in [0..(k-1)]
np.array([X[np.where(idx==i)[0]] for i in range(k)])
Sample output:
array([[0, 4],
       [1, 5],
       [2, 3]])
Note that there is actually a reason for me to represent idx as a matrix and not as a vector. It was initialised to numpy.zeros((n,1)) as part of its computation, where n is the size of X.
I tried to implement this in Theano like so:
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c = theano.scan(lambda i: X[T.where(T.eq(idx,i))], sequences=T.arange(k))
f = theano.function([X,idx,k],c)
But I received this error at the line where c is defined:
TypeError: Wrong number of inputs for Switch.make_node (got 1((<int8>,)), expected 3)
Is there a simple way to implement this in Theano?
Use nonzero() and correct the dimensions of idx.
This code solved the problem
import theano
import theano.tensor as T
X = T.vector('X')
idx = T.vector('idx')
k = T.scalar()
c, updates = theano.scan(lambda i: X[T.eq(idx,i).nonzero()], sequences=T.arange(k))
f = theano.function([X,idx,k],c)
For the same example, through the use of Theano:
import numpy as np
X = np.arange(6)
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T
f(X, idx.T[0], k).astype(int)
This gives the output:
array([[0, 4],
       [1, 5],
       [2, 3]])
If idx is defined as np.array([0, 1, 2, 2, 0, 1]), then f(X, idx, k) can be used instead.
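For comparison, here is a minimal NumPy-only sketch of the same grouping with idx flattened first, so the column-vector representation can be kept elsewhere. The names `flat_idx` and `groups` are illustrative. It assumes every index value occurs the same number of times; otherwise the groups have different lengths and do not form a rectangular array.

```python
import numpy as np

X = np.arange(6)
k = 3
idx = np.array([[0, 1, 2, 2, 0, 1]]).T   # column vector, shape (6, 1)

flat_idx = idx.ravel()                    # shape (6,), same values
# One boolean mask per group value; each mask selects the matching elements of X.
groups = np.array([X[flat_idx == i] for i in range(k)])

print(groups)
```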
I've got a Theano function which computes Euclidean distances for two matrices: X (n vectors x k features) and Y (m vectors x k features). The result is an n x m matrix of pairwise distances of each vector (or row) in X from each vector (or row) in Y.
import theano
from theano import tensor as T
X, Y = T.dmatrices('X', 'Y')
X_squared_sum = T.sum(X ** 2, axis=1, keepdims=True)
Y_squared_sum = T.sum(Y.T ** 2, axis=0, keepdims=True)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
f_distance = theano.function([X, Y], T.sqrt(squared_distances))
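The function above relies on the expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, applied to all row pairs at once through broadcasting. As a sanity check, the same computation can be sketched in plain NumPy and compared against the direct definition. The function name `pairwise_distances` and the sample values are made up for this example, and the clipping step is an addition that is not in the Theano code.

```python
import numpy as np

def pairwise_distances(X, Y):
    """Euclidean distance from every row of X to every row of Y."""
    X_sq = np.sum(X ** 2, axis=1, keepdims=True)      # shape (n, 1)
    Y_sq = np.sum(Y ** 2, axis=1)[np.newaxis, :]      # shape (1, m)
    squared = X_sq + Y_sq - 2 * np.dot(X, Y.T)        # broadcasts to (n, m)
    # Rounding can leave tiny negative values; clip before the square root.
    return np.sqrt(np.maximum(squared, 0.0))

X = np.array([[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]])
Y = np.array([[4.0, 4.0, 4.0, 4.0], [2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0]])

fast = pairwise_distances(X, Y)
# Direct computation from the definition, for comparison.
slow = np.sqrt(((X[:, np.newaxis, :] - Y[np.newaxis, :, :]) ** 2).sum(axis=2))

print(np.allclose(fast, slow))
```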
Let's say I change the above function to accept a single vector, an array of vectors, and the number of smallest distances. What I want is a theano function that will find the N smallest distances, similar to below:
import numpy as np
import theano
from theano import tensor as T
X = T.dvector('X')
Y = T.dmatrix('Y')
N = T.iscalar('N')
X_squared_sum = T.dot(X, X)
Y_squared_sum = T.sum(Y.T ** 2, axis=0)
squared_distances = X_squared_sum + Y_squared_sum - 2 * T.dot(X, Y.T)
dist_sorted = T.FIND_N_SMALLEST(T.sqrt(squared_distances), N)
n_closest = theano.function([X, Y, N], dist_sorted)
U = np.array([1, 1, 1, 1])  # 1-D, since X is declared as a vector
V = np.array([
[ 4, 4, 4, 4],
[ 2, 2, 2, 2],
[ 3, 3, 3, 3],
[ 1, 1, 1, 1]])
n_closest(U, V, 2) # [0.0, 2.0]
I'd like to avoid explicitly sorting all the distances, since the number that I want will generally be much much smaller than the total number of distances.
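To make the goal concrete, this is roughly what the operation looks like in plain NumPy, where np.partition selects the N smallest values without sorting the whole array. It is a NumPy sketch, not Theano code; the helper name `n_smallest` is hypothetical, and whether a given Theano version offers an equivalent operation is something to check separately.

```python
import numpy as np

def n_smallest(distances, n):
    """Return the n smallest entries of a 1-D array in ascending order.

    Assumes 1 <= n <= len(distances).
    """
    # np.partition puts the n smallest values in the first n slots
    # (in arbitrary order) without sorting the whole array.
    smallest = np.partition(distances, n - 1)[:n]
    return np.sort(smallest)   # sorts only n values

U = np.array([1.0, 1.0, 1.0, 1.0])
V = np.array([
    [4.0, 4.0, 4.0, 4.0],
    [2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0, 3.0],
    [1.0, 1.0, 1.0, 1.0]])

distances = np.sqrt(((V - U) ** 2).sum(axis=1))
print(n_smallest(distances, 2))
```

For these inputs the distances are 6, 2, 4 and 0, so the two smallest are 0 and 2, matching the expected result in the question.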
So I wanted to see if I could make fractal flames using matplotlib and figured a good test would be the Sierpinski triangle. I modified a working version I had that simply performed the chaos game, normalizing the x range from [-2, 2] to [0, 400] and the y range from [0, 2] to [0, 200]. I also truncated the x and y coordinates to 2 decimal places and multiplied by 100 so that the coordinates could be put into a matrix that I could apply a color map to. Here's the code I'm working on right now (please forgive the messiness):
import numpy as np
import matplotlib.pyplot as plt
import math
import random
def f(x, y, n):
    N = np.array([[x, y]])
    M = np.array([[1/2.0, 0], [0, 1/2.0]])
    b = np.array([[.5], [0]])
    b2 = np.array([[0], [.5]])
    if n == 0:
        return np.dot(M, N.T)
    elif n == 1:
        return np.dot(M, N.T) + 2*b
    elif n == 2:
        return np.dot(M, N.T) + 2*b2
    elif n == 3:
        return np.dot(M, N.T) - 2*b

def norm_x(n, minX_1, maxX_1, minX_2, maxX_2):
    rng = maxX_1 - minX_1
    n = (n - minX_1) / rng
    rng_2 = maxX_2 - minX_2
    n = (n * rng_2) + minX_2
    return n

def norm_y(n, minY_1, maxY_1, minY_2, maxY_2):
    rng = maxY_1 - minY_1
    n = (n - minY_1) / rng
    rng_2 = maxY_2 - minY_2
    n = (n * rng_2) + minY_2
    return n

# Plot ranges
x_min, x_max = -2.0, 2.0
y_min, y_max = 0, 2.0
# Even intervals for points to compute orbits of
x_range = np.arange(x_min, x_max, (x_max - x_min) / 400.0)
y_range = np.arange(y_min, y_max, (y_max - y_min) / 200.0)
mat = np.zeros((len(x_range) + 1, len(y_range) + 1))
random.seed()
x = 1
y = 1
for i in range(0, 100000):
    n = random.randint(0, 3)
    V = f(x, y, n)
    x = V.item(0)
    y = V.item(1)
    # cast to int: array indices must be integers
    mat[int(norm_x(x, -2, 2, 0, 400)), int(norm_y(y, 0, 2, 0, 200))] += 50
plt.xlabel('x0')
plt.ylabel('y')
fig = plt.figure(figsize=(10,10))
plt.imshow(mat, cmap="spectral", extent=[-2,2, 0, 2])
plt.show()
The mathematics seem solid here so I suspect something weird is going on with how I'm handling where things should go into the 'mat' matrix and how the values in there correspond to the colormap.
If I understood your problem correctly, you need to transpose your matrix using the .T attribute. So just replace
fig = plt.figure(figsize=(10,10))
plt.imshow(mat, cmap="spectral", extent=[-2,2, 0, 2])
plt.show()
with
fig = plt.figure(figsize=(10,10))
ax = plt.gca()
ax.imshow(mat.T, cmap="spectral", extent=[-2,2, 0, 2], origin="lower")
plt.show()
The argument origin="lower" tells imshow to put the origin of your matrix at the bottom of the figure. (On recent Matplotlib releases the "spectral" colormap is no longer available under that name; "nipy_spectral" is the replacement.)
Hope it helps.
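To see why the transpose is needed, here is a small sketch without any plotting. The matrix in the question is filled as mat[x_index, y_index], while imshow draws the first axis as rows (vertical) and the second as columns (horizontal). The helper `rescale` is a hypothetical stand-in for norm_x and norm_y, and the sample point is made up.

```python
import numpy as np

def rescale(value, src_min, src_max, dst_min, dst_max):
    """Linearly map value from [src_min, src_max] to [dst_min, dst_max]."""
    return (value - src_min) / (src_max - src_min) * (dst_max - dst_min) + dst_min

mat = np.zeros((401, 201))              # indexed as mat[x_index, y_index]

# A sample point right of center and in the lower part: x = 1.0, y = 0.5
xi = int(rescale(1.0, -2, 2, 0, 400))   # 300
yi = int(rescale(0.5, 0, 2, 0, 200))    # 50
mat[xi, yi] += 50

# After transposing, rows correspond to y and columns to x,
# which is the layout imshow expects.
image = mat.T
print(image.shape)      # (201, 401): 201 rows (y), 401 columns (x)
print(image[yi, xi])    # 50.0
```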
I have a set of X and Y coordinates and each point has a different pixel value - the Z quantity. I would like to plot these values using a raster or contour plot.
I am having difficulty doing this because there is no mathematical relationship between the pixel value and the X and Y coordinates.
I have created an array for the range of x and y values and I have tried constructing a dictionary where I can look up the value of z using a concatenated x and y string. At the moment I am having an index issue and I am wondering if there is a better way of achieving this?
My code so far is:
import matplotlib.pyplot as plt
import numpy as np
XY_Zpoints = {'11':8,
'12':8,
'13':8,
'14':6,
'21':6,
'22':8,
'23':6,
'24':6,
'31':8,
'32':3,
'33':8,
'34':6,
'41':8,
'42':3,
'43':3,
'44':8,
}
x, y = np.meshgrid(np.linspace(1,4,4), np.linspace(1,4,4))
z = XY_Zpoints[str(x)+str(y)]
# Plot the grid
plt.imshow(z)
plt.spectral()
plt.show()
Thanks in advance for any help you can offer!
Instead of a dictionary, you can use a NumPy array where the position of each pixel value corresponds to the x and y coordinates. For your example this array would look like:
z = np.array([[8, 8, 8, 6], [6, 8, 6, 6], [8, 3, 8, 6], [8, 3, 3, 8]])
To access the pixel value at x = 2 and y = 3, for example, you can do this:
x = 2
y = 3
pixel = z[x - 1, y - 1]
z can be displayed with:
plt.imshow(z)
plt.show()
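If the values already live in a dictionary keyed by concatenated digit strings, as in the question, the array does not have to be typed out by hand. A minimal sketch, assuming every coordinate is a single digit from 1 to 4 (with multi-digit coordinates the keys would be ambiguous and tuples would be a safer key type):

```python
import numpy as np

XY_Zpoints = {'11': 8, '12': 8, '13': 8, '14': 6,
              '21': 6, '22': 8, '23': 6, '24': 6,
              '31': 8, '32': 3, '33': 8, '34': 6,
              '41': 8, '42': 3, '43': 3, '44': 8}

z = np.zeros((4, 4), dtype=int)
for key, value in XY_Zpoints.items():
    x, y = int(key[0]), int(key[1])   # first digit is x, second digit is y
    z[x - 1, y - 1] = value           # coordinates start at 1, indices at 0

print(z)
```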