How to get accurate predictions from a neural network - python-2.7

I created the neural network below for the truth table of a 3-input logic AND gate, but the predicted output for [1,1,0] is not correct. The output should be 0, but it predicts about 0.99, which is approximately 1. So the output is wrong. What I need to know is how to make the output prediction more accurate. Please guide me.
import numpy as np

class NeuralNetwork():
    def __init__(self):
        # truth table inputs (note: the [1, 1, 0] row is absent)
        self.X = np.array([[0, 0, 0],
                           [0, 0, 1],
                           [0, 1, 0],
                           [0, 1, 1],
                           [1, 0, 0],
                           [1, 0, 1],
                           [1, 1, 1]])
        self.y = np.array([[0],
                           [0],
                           [0],
                           [0],
                           [0],
                           [0],
                           [1]])
        np.random.seed(1)
        # randomly initialize our weights with mean 0
        self.syn0 = 2 * np.random.random((3, 4)) - 1
        self.syn1 = 2 * np.random.random((4, 1)) - 1

    def nonlin(self, x, deriv=False):
        # sigmoid activation, or its derivative when deriv=True
        if deriv:
            return x * (1 - x)
        return 1 / (1 + np.exp(-x))

    def train(self, steps):
        for j in xrange(steps):
            # feed forward through layers 0, 1, and 2
            l0 = self.X
            l1 = self.nonlin(np.dot(l0, self.syn0))
            l2 = self.nonlin(np.dot(l1, self.syn1))
            # how much did we miss the target value?
            l2_error = self.y - l2
            if (j % 10000) == 0:
                print "Error:" + str(np.mean(np.abs(l2_error)))
            # in what direction is the target value?
            # were we really sure? if so, don't change too much.
            l2_delta = l2_error * self.nonlin(l2, deriv=True)
            # how much did each l1 value contribute to the l2 error
            # (according to the weights)?
            l1_error = l2_delta.dot(self.syn1.T)
            # in what direction is the target l1?
            # were we really sure? if so, don't change too much.
            l1_delta = l1_error * self.nonlin(l1, deriv=True)
            self.syn1 += l1.T.dot(l2_delta)
            self.syn0 += l0.T.dot(l1_delta)
        print("Output after training:")
        print(l2)

    def predict(self, newInput):
        # multiply the input with the weights and apply the sigmoid
        # activation for each layer in turn
        layer0 = newInput
        print("predict -> layer 0 : " + str(layer0))
        layer1 = self.nonlin(np.dot(layer0, self.syn0))
        print("predict -> layer 1 : " + str(layer1))
        layer2 = self.nonlin(np.dot(layer1, self.syn1))
        print("predicted output is : " + str(layer2))

if __name__ == '__main__':
    ann = NeuralNetwork()
    ann.train(100000)
    ann.predict([1, 1, 0])
Output:
Error:0.48402933124
Error:0.00603525276229
Error:0.00407346660344
Error:0.00325224335386
Error:0.00277628698655
Error:0.00245737222701
Error:0.00222508289674
Error:0.00204641406194
Error:0.00190360175536
Error:0.00178613765229
Output after training:
[[ 1.36893057e-04]
[ 5.80758383e-05]
[ 1.19857670e-03]
[ 1.85443483e-03]
[ 2.13949603e-03]
[ 2.19360982e-03]
[ 9.95769492e-01]]
predict -> layer 0 : [1, 1, 0]
predict -> layer 1 : [ 0.00998162 0.91479567 0.00690524 0.05241988]
predicted output is : [ 0.99515547]

Actually, it does produce the correct output -- the model is ambiguous. Your input data fits A*B: the value of the third input never affects any of the given outputs, so your model has no way to know that it's supposed to matter in the case [1, 1, 0]. In terms of pure information theory, you don't have the input needed to force the result you want.
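To see this concretely, here is a quick sketch (restating the question's arrays) showing that the given labels coincide exactly with A AND B, so the third column carries no usable signal:

import numpy as np

# the seven training rows and labels from the question
X = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 1, 1],
              [1, 0, 0], [1, 0, 1], [1, 1, 1]])
y = np.array([0, 0, 0, 0, 0, 0, 1])

# on this data the labels are indistinguishable from A AND B
print(np.array_equal(y, X[:, 0] & X[:, 1]))  # True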

This seems to happen for every input you leave out of the AND gate's truth table. For example, try replacing the [0, 1, 1] input with [1, 1, 0] and then predicting [0, 1, 1]: the network predicts a final value close to 1. I tried including biases and a learning rate, but nothing seemed to work.
As Prune mentioned, it might be because a backpropagation network cannot do anything with an incomplete model.
To train your network fully and get optimal weights, provide all the possible inputs, i.e. all 8 rows of the AND gate's truth table. Then you will always get correct predictions, because you already trained the network on those inputs -- which might make prediction pointless in this case. Maybe predictions on such a small dataset simply don't work that well.
This is just my guess, because almost all the networks I have used for prediction had considerably bigger datasets. A sketch of training on the full truth table follows below.
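As a concrete illustration of the fix, here is a minimal sketch of the complete training data; assuming the NeuralNetwork class above, these arrays would replace self.X and self.y in __init__:

import numpy as np

# complete truth table for the 3-input AND gate: all 8 input rows
X = np.array([[0, 0, 0],
              [0, 0, 1],
              [0, 1, 0],
              [0, 1, 1],
              [1, 0, 0],
              [1, 0, 1],
              [1, 1, 0],   # the row missing from the original training set
              [1, 1, 1]])

# AND is 1 only when all three inputs are 1
y = np.array([[0], [0], [0], [0], [0], [0], [0], [1]])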

Related

skimage.feature.greycomatrix only producing diagonal values

I am attempting to produce a GLCM from a trend-reduced digital elevation model. My current problem is that the output of skimage.feature.greycomatrix(image) only contains values on the diagonal of the matrix.
glcm = greycomatrix(image, distances=[1], levels=100, angles=[0], symmetric=True, normed=True)
The image is quantized prior with the following code:
import numpy as np
from skimage.feature import greycomatrix

def quantize(raster):
    print("\n Quantizing \n")
    raster += (np.abs(np.min(raster)) + 1)
    mean = np.nanmean(raster.raster[raster.raster > 0])
    std = np.nanstd(raster.raster[raster.raster > 0])
    raster[raster == None] = 0  # set all None values to 0
    raster[np.isnan(raster)] = 0
    raster[raster > (mean + 1.5*std)] = 0
    raster[raster < (mean - 1.5*std)] = 0  # high pass filter
    raster[raster > 0] = raster[raster > 0] - (np.min(raster[raster > 0]) - 1)
    raster[raster > 101] = 0
    raster = np.rint(raster)
    flat = np.ndarray.flatten(raster[raster > 0])
    range = np.max(flat) - np.min(flat)
    print("\n\nRaster Range: {}\n\n".format(range))
    raster = raster.astype(np.uint8)
    raster[raster > 101] = 0
How would I go about making the GLCM contain values outside of the main diagonal (which currently holds just the frequencies of the values themselves), and is there something fundamentally wrong with my approach?
If pixel intensities are correlated in an image, the co-occurrence of two similar levels is highly probable, and therefore the nonzero elements of the corresponding GLCM will concentrate around the main diagonal. In contrast, if pixel intensities are uncorrelated the nonzero elements of the GLCM will be spread all over the matrix. The following example makes this apparent:
import numpy as np
from skimage import data
import matplotlib.pyplot as plt
from skimage.feature import greycomatrix
x = data.brick()
y = data.gravel()
mx = greycomatrix(x, distances=[1], levels=256, angles=[0], normed=True)
my = greycomatrix(y, distances=[1], levels=256, angles=[0], normed=True)
fig, ax = plt.subplots(2, 2, figsize=(12, 8))
ax[0, 0].imshow(x, cmap='gray')
ax[0, 1].imshow(mx[:, :, 0, 0])
ax[1, 0].imshow(y, cmap='gray')
ax[1, 1].imshow(my[:, :, 0, 0])
Although I haven't seen your raster image, I'm guessing that the intensity changes very smoothly across the image returned by quantize, and hence the GLCM is mostly diagonal.
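If you want to quantify this, a minimal sketch (assuming the mx and my matrices computed above) compares the fraction of co-occurrence mass that sits on the main diagonal:

import numpy as np

def diagonal_fraction(glcm):
    # fraction of the normalized co-occurrence mass on the main diagonal;
    # values near 1 mean neighboring pixels are highly correlated
    p = glcm[:, :, 0, 0]  # first distance, first angle
    return np.trace(p) / p.sum()

print(diagonal_fraction(mx))  # brick: smoother texture, more diagonal mass
print(diagonal_fraction(my))  # gravel: mass spread further off the diagonal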

Comparison of Lists for the 2048 game

def helper(mat):
    for row in mat:
        zero_list = []
        for subrow in row:
            if subrow == 0:
                row.remove(0)
                zero_list.append(0)
        row.extend(zero_list)
    return mat

def merge_left(mat):
    result = mat.copy()
    helper(mat)
    counter = 0
    for i in range(len(mat)):
        current_tile = 0
        for j in range(len(mat)):
            if mat[i][j] == current_tile:
                mat[i][j-1] *= 2
                mat[i][j] = 0
                counter += mat[i][j-1]
            current_tile = mat[i][j]
    helper(mat)
    return result == mat

print(merge_left([[2, 2, 0, 2], [4, 0, 0, 0], [4, 8, 0, 4], [0, 0, 0, 2]]))
Hey guys,
The result I get from merge_left in the above code is True for this test case, even though result is supposed to be a duplicate copy of mat. How is it that result has also been altered in the same way as mat by this code? I'd understand this if I had written result = mat instead of result = mat.copy(). Why is this the case? I'm aiming to compare the two states of the input mat: before the code alters it and after.
list.copy() only clones the outer list. The inner lists are still aliases, so modifying one of them modifies both result and mat. Here's a minimal reproduction of the problem:
>>> x = [[1, 2]]
>>> y = x.copy()
>>> y[0][0] += 1
>>> y
[[2, 2]]
>>> x
[[2, 2]]
You can use [row[:] for row in mat] to copy each row within the matrix; slicing a whole list and list.copy() produce the same shallow copy. You can also use copy.deepcopy, but it's overkill for this.
Also, calling row.remove(0) while iterating over row -- as with adding or removing elements from any list while iterating over it -- is very likely a bug. Consider a redesign, or use for subrow in row[:]: at a minimum.
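For concreteness, here is a minimal sketch of the row-copying fix (the snapshot helper name is my own, not from the post):

import copy

def snapshot(mat):
    # shallow copy of the outer list, but a fresh copy of each row,
    # so later in-place edits to mat's rows don't show up here
    return [row[:] for row in mat]

mat = [[2, 2, 0, 2], [4, 0, 0, 0]]
result = snapshot(mat)  # instead of result = mat.copy()
mat[0][0] = 99
print(result[0][0])  # still 2: the rows are independent
print(copy.deepcopy(mat) == snapshot(mat))  # deepcopy also works, just heavier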

Two-sided moving average in python

Hi, I have some data and I want to compute the centered, or two-sided, moving average.
I've seen how easily this can be done with the numpy.convolve function, and I wonder if there is an equally easy way to do it when the average needs to be two-sided.
A one-sided moving average, with an interval containing three entries, N = 3, usually works in the following way:
import numpy
list = [3, 4, 7, 8, 9, 10]
N = 3
window = numpy.repeat(1., N)/N
moving_avg = numpy.convolve(list, window, 'valid')
moving_avg = array([ 4.66666667, 6.33333333, 8. , 9. ])
Now what I am aiming for is the centered average, so that with N = 3 the intervals over which the mean is taken are [[3, 4, 7], [4, 7, 8], [7, 8, 9], [8, 9, 10]]. This is also tricky when N is an even number. Is there a tool to compute this? I'd prefer to do it either by writing a function or using numpy.
Like the commenters, I'm also confused about what you're trying to accomplish that's different from what you already demonstrated.
In any case, I did want to offer a solution that lets you write your own convolution operations using Numba's @stencil decorator:
import numpy as np
from numba import stencil

@stencil
def ma(a):
    # each output element is the mean of itself and its two neighbors
    return (a[-1] + a[0] + a[1]) / 3

data = np.array([3, 4, 7, 8, 9, 10])
print(ma(data))
[0. 4.66666667 6.33333333 8. 9. 0. ]
Not sure if that's exactly what you're looking for, but the stencil operator is great. The variable you pass it represents a given element, and any indexing you use is relative to that element. As you can see, it was pretty easy to make a 3-element window to calculate a moving average.
Hopefully this gives you what you need.
Using a Large Neighborhood
You can add a neighborhood parameter to the stencil; its bounds are inclusive. Let's make a neighborhood of 9:
@stencil(neighborhood=((-4, 4),))
def ma(a):
    cumul = 0
    for i in range(-4, 5):
        cumul += a[i]
    return cumul / 9
You can shift the window forward or back with neighborhood bounds of (-8, 0) or (0, 8) and by changing the loop range to match.
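For example, a hypothetical sketch of the trailing (one-sided) variant under the same assumptions:

from numba import stencil

@stencil(neighborhood=((-8, 0),))
def trailing_ma(a):
    # average of the current element and the 8 elements before it
    cumul = 0
    for i in range(-8, 1):
        cumul += a[i]
    return cumul / 9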
Setting N Neighborhood
Not sure if this is the best way, but I accomplished it with a wrapper:
import numba as nb
import numpy as np

def wrapper(data, N):
    # note: the bounds floor for even N, so the window is exact only for odd N
    @nb.stencil(neighborhood=((int(-(N-1)/2), int((N-1)/2)),))
    def ma(a):
        cumul = 0
        for i in np.arange(int(-(N-1)/2), int((N-1)/2)+1):
            cumul += a[i]
        return cumul / N
    return ma(data)
Again, indexing is weird, so you'll have to play with it to get the desired effect.
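If you'd rather stay in plain NumPy, here is a minimal sketch (my own helper, not a standard API) that keeps the 'valid' convolution and simply reports where each window is centered; for even N the center falls between two samples, which is the usual difficulty:

import numpy as np

def centered_moving_average(x, N):
    # the k-th 'valid' output is the mean of x[k : k + N], so it is
    # centered on index k + (N - 1) / 2, a half-integer for even N
    window = np.ones(N) / N
    avg = np.convolve(x, window, mode='valid')
    centers = np.arange(len(avg)) + (N - 1) / 2.0
    return centers, avg

x = [3, 4, 7, 8, 9, 10]
centers, avg = centered_moving_average(x, 3)
print(centers)  # [1. 2. 3. 4.] -> windows centered on x[1]..x[4]
print(avg)      # [4.66666667 6.33333333 8. 9.]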

How to deal with overfitting in Tensorflow?

I'm currently trying to train an image classification convolutional neural network, using an architecture similar to the one in the TensorFlow tutorial. After training, I get quite high training accuracy and very low cross entropy, but the test accuracy is always only a little better than random guessing. The network seems to be overfitting. During training I have applied stochastic gradient descent and dropout to try to avoid overfitting, but it just doesn't seem to work.
Here is part of my code.
batch_image = np.ndarray(shape=(100, 9216), dtype='float')
batch_class = np.ndarray(shape=(100, 10), dtype='float')

# first convolutional layer
w_conv1 = weight_variable([5, 5, 3, 64])
b_conv1 = bias_variable([64])
x_image = tf.reshape(x, [-1, 48, 64, 3])
h_conv1 = tf.nn.relu(conv2d(x_image, w_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
norm1 = tf.nn.lrn(tf.to_float(h_pool1, name='ToFloat'), 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)

# second convolutional layer
w_conv2 = weight_variable([5, 5, 64, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(norm1, w_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
norm2 = tf.nn.lrn(tf.to_float(h_pool2, name='ToFloat'), 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)

# first densely connected layer
w_fc1 = weight_variable([12*16*64, 512])
b_fc1 = bias_variable([512])
h_pool2_flat = tf.reshape(norm2, [-1, 12*16*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, w_fc1) + b_fc1)

# second densely connected layer
w_fc2 = weight_variable([512, 256])
b_fc2 = bias_variable([256])
h_fc2 = tf.nn.relu(tf.matmul(h_fc1, w_fc2) + b_fc2)

# dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc2, keep_prob)

# readout layer
w_fc3 = weight_variable([256, 10])
b_fc3 = bias_variable([10])
y_prob = tf.nn.softmax(tf.matmul(h_fc1_drop, w_fc3) + b_fc3)

# train and evaluate the model
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_prob + 0.000000001))
train_step = tf.train.GradientDescentOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_prob, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())

for i in range(100):
    rand_idx = np.random.randint(17778, size=(100))
    k = 0
    for j in rand_idx:
        batch_image[k] = images[j]
        batch_class[k] = np.zeros(shape=(10))
        batch_class[k, classes[j, 0]] = 1.0
        k += 1
    train_step.run(feed_dict={x: batch_image, y_: batch_class, keep_prob: 0.5})
    train_accuracy = accuracy.eval(feed_dict={x: batch_image, y_: batch_class, keep_prob: 1.0})
    train_ce = cross_entropy.eval(feed_dict={x: batch_image, y_: batch_class, keep_prob: 1.0})
I am wondering whether there is a mistake in my code, or whether I have to apply other strategies to get better test accuracy.
Thank you!
You can try the strategies below to avoid overfitting:
Shuffle the input data.
Use early stopping on the loss, with some patience level.
Add L1 and L2 regularization.
Add dropout.
Add batch normalization.
If the pixels are not normalized, dividing the pixel values by 255 also helps.
Perform image data augmentation.
Maybe try hyperparameter tuning with a grid search.
A sketch of one of these follows below. Hope it helps! Happy coding!
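For instance, here is a minimal sketch of adding L2 weight regularization to the loss in the question's TF1-style code (it reuses the w_fc* variables defined there; the 1e-3 strength is an arbitrary starting point, not a recommendation):

import tensorflow as tf  # TF1-style API, matching the question's code

# penalize large fully connected weights so the network cannot
# simply memorize the training batches
l2_strength = 1e-3  # arbitrary; tune on a validation set
l2_penalty = l2_strength * (tf.nn.l2_loss(w_fc1) +
                            tf.nn.l2_loss(w_fc2) +
                            tf.nn.l2_loss(w_fc3))

# same cross entropy as in the question, plus the penalty
loss = cross_entropy + l2_penalty
train_step = tf.train.GradientDescentOptimizer(1e-4).minimize(loss)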

Advanced comparison of lists

I am currently making a program for a school project which is supposed to help farmers overfertilize less (this is kind of a big deal in Denmark). The way the program works is that you enter some information about your fields (content of NPK, field size, type of dirt and other things), and then I'm able to compare each field's nutrient content to the recommended amount. From that I can create the theoretically ideal composition of fertilizer for the field.
This much I have been able to do, but here is the hard part.
I have a long list of fertilizers that are available in Denmark, and I want my program to compare 10 of them to my theoretical ideal composition, and then automatically pick the one that fits best.
I literally have no idea how to do this!
The way I format my fertilizer compositions is in lists like this
>>>print(idealfertilizercomp)
[43.15177154944473, 3.9661554732945534, 43.62771020624008, 4.230565838180857, 5.023796932839768]
Each number represents one element, in percent. For example, the first number, 43.15177154944473, is the amount of potassium I want in my fertilizer, in percent.
TL;DR:
How do I make a program or function that can compare one list of numbers to a handful of other lists, and then pick the one that fits best?
So, while I had dinner I actually came up with a way to compare multiple lists against another:
def numeric(x):
    # absolute value; the built-in abs() does the same thing
    if x >= 0:
        return x
    else:
        return -x

def comparelists(x, y):
    # sum of the absolute element-wise differences between the two lists
    z1 = numeric(x[0] - y[0])
    z2 = numeric(x[1] - y[1])
    z3 = numeric(x[2] - y[2])
    z4 = numeric(x[3] - y[3])
    z5 = numeric(x[4] - y[4])
    zt = z1 + z2 + z3 + z4 + z5
    return zt

def compare2inproportion(x, y, z):
    # compare two candidate lists (y, z) against the ideal (x)
    n1 = comparelists(x, y)
    n2 = comparelists(x, z)
    print(n1)
    print(n2)
    if n1 < n2:
        print(y, "is closer to", x, "than", z)
    elif n1 > n2:
        print(z, "is closer to", x, "than", y)
    else:
        print("Both", y, "and", z, "are equally close to", x)

idealfertilizer = [1, 2, 3, 4, 5]
fertilizer1 = [2, 3, 4, 5, 6]
fertilizer2 = [5, 4, 3, 2, 1]

compare2inproportion(idealfertilizer, fertilizer1, fertilizer2)
This is just a basic version that compares two lists, but it's really easy to expand upon. The output looks like this:
[2, 3, 4, 5, 6] is closer to [1, 2, 3, 4, 5] than [5, 4, 3, 2, 1]
Sorry for taking your time, and thanks for the help.
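As a footnote, here is a minimal sketch of the same idea generalized to any number of candidates, using the built-ins abs and min (the distance and best_fertilizer names are my own, not from the post):

def distance(ideal, candidate):
    # sum of absolute element-wise differences, for lists of any equal length
    # (the same measure comparelists computes above)
    return sum(abs(a - b) for a, b in zip(ideal, candidate))

def best_fertilizer(ideal, candidates):
    # pick the candidate composition closest to the ideal one
    return min(candidates, key=lambda c: distance(ideal, c))

idealfertilizer = [1, 2, 3, 4, 5]
candidates = [[2, 3, 4, 5, 6], [5, 4, 3, 2, 1], [1, 2, 3, 4, 6]]
print(best_fertilizer(idealfertilizer, candidates))  # [1, 2, 3, 4, 6]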