Input 0 of layer "sequential" is incompatible with the layer: expected shape....... found shape=(None, 143) - python-2.7

I am a beginner in machine learning and I am trying to train a model with nltk and tensorflow, but I get the following error when I run my program. It seems that the shape of my list is not accepted, but I do not know why and I have not found a fix. Note that I am using a list of lists with different sizes. I would like to understand the problem, solve it and move forward.
code and error:
github code:
https://github.com/maeltoukap/whatsapp-chat-bot

First of all, you are missing two reshape steps. You need to add the lines
train_x = np.expand_dims(train_x, axis=1)
train_y = np.expand_dims(train_y, axis=1)
after you define train_x and train_y (so after line 67 in your picture). Your input shape is then the shape of your first training example, so change input_shape=train_x[0] to input_shape=train_x[0].shape. Also change the number of neurons in your last Dense layer. Currently your last layer is Dense(len(train_y[0]), ...), and after the expand_dims call len(train_y[0]) evaluates to 1 rather than the number of classes. You need to change that to Dense(30, ...). Then you should be good.
The complete code would look like this:
random.shuffle(training)
training = np.array(training, dtype=object)
train_x = list(training[:, 0])
train_y = list(training[:, 1])
train_x = np.expand_dims(train_x, axis=1)
train_y = np.expand_dims(train_y, axis=1)
print(len(train_x))
model = Sequential()
# model.add(Dense(128, input_shape=113, activation='relu'))
model.add(Dense(128, input_shape=train_x[0].shape, activation='relu'))
# model.add(Dense(128, input_shape=(len(train_x[0])), activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(54, activation='relu'))
model.add(Dropout(0.5))
# model.add(Dense(113, activation='softmax'))
model.add(Dense(30, activation='softmax'))
sgd = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
hist = model.fit(train_x, train_y, epochs=200, batch_size=5, verbose=1)
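To see what the two expand_dims calls do to the shapes, here is a minimal sketch that uses NumPy only. The toy rows and the sizes (3 examples, 4 features, 2 classes) are made up for illustration and are not taken from the linked repository. It also assumes every row has the same length, because otherwise NumPy cannot build a regular array from the list of lists.

```python
import numpy as np

# Toy stand-ins for the bag-of-words rows and one-hot labels (assumed data).
train_x = [[0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]
train_y = [[1, 0], [0, 1], [1, 0]]

# Every row must have the same length, otherwise the list of lists is ragged
# and cannot become a regular 2-D array.
assert len({len(row) for row in train_x}) == 1

x_before = np.array(train_x)
x_after = np.expand_dims(train_x, axis=1)
y_after = np.expand_dims(train_y, axis=1)

print(x_before.shape)     # (3, 4)
print(x_after.shape)      # (3, 1, 4)
print(x_after[0].shape)   # (1, 4), the value passed as input_shape
print(len(y_after[0]))    # 1, so it no longer equals the number of classes
```

If the length check fails for your own data, the rows need to be brought to a common length before any reshaping.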

Related

Tensor flow shuffle a tensor for batch gradient

To whom it may concern,
I am pretty new to TensorFlow. I am trying to solve the well-known MNIST problem with a CNN, but I ran into difficulty when I had to reshuffle the x_training data, which has shape [40000, 28, 28, 1].
My code is below:
x_train_final = tf.reshape(x_train_final, [-1, image_width, image_width, 1])
x_train_final = tf.cast(x_train_final, dtype=tf.float32)
perm = np.arange(num_training_example).astype(np.int32)
np.random.shuffle(perm)
x_train_final = x_train_final[perm]
The following error occurred:
ValueError: Shape must be rank 1 but is rank 2 for 'strided_slice_1371' (op: 'StridedSlice') with input shapes: [40000,28,28,1], [1,40000], [1,40000], [1].
Can anyone advise how I can work around this? Thanks.
I would suggest using scikit-learn's shuffle function.
from sklearn.utils import shuffle
x_train_final = shuffle(x_train_final)
You can also pass in multiple arrays, and shuffle will reorder all of them with the same permutation, so corresponding rows stay aligned. That means you can pass in your label array as well.
For example:
X_train, y_train = shuffle(X_train, y_train)
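The same idea, one permutation applied to both arrays, can be sketched with plain NumPy in case scikit-learn is not at hand. This is only an illustration: the function name shuffle_together and the toy data are my own, and it assumes x and y are NumPy arrays rather than TensorFlow tensors.

```python
import numpy as np

def shuffle_together(x, y, seed=0):
    """Shuffle x and y along the first axis with one shared permutation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))   # 1-D array of row indices
    return x[perm], y[perm]

# Toy data (assumed): the label of each row equals the row's first value.
x = np.arange(10).reshape(5, 2)
y = x[:, 0].copy()

x_shuf, y_shuf = shuffle_together(x, y)
print(np.array_equal(x_shuf[:, 0], y_shuf))  # True: rows and labels stay paired
```

Note that perm is a one-dimensional index array, which is what fancy indexing on the first axis expects.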

Neural network input shape error

I am a beginner in keras and I am trying to classify data with a neural network.
x_train = x_train.reshape(1,x_train.shape[0],window,5)
x_val = x_val.reshape(1,x_val.shape[0],window,5)
x_train = x_train.astype('float32')
x_val = x_val.astype('float32')
model = Sequential()
model.add(Dense(64,activation='relu',input_shape= (data_dim,window,5)))
model.add(Dropout(0.5))
model.add(Dense(64,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(2,activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='sgd',
metrics=['accuracy'])
weights = model.get_weights()
model_info = model.fit(x_train, y_train,batch_size=batchsize, nb_epoch=15,verbose=1,validation_data=(x_val, y_val))
print x_train.shape
#(1,1600,45,5)
print y_train.shape
#(1600,2)
I always have this error with this script and I don't understand why:
ValueError: Error when checking target: expected dense_3 to have 4 dimensions, but got array with shape (16000, 2)
Your model's output (dense_3, so named because it is the third Dense layer) has four dimensions. However, the labels you are comparing it to (y_train) have only two dimensions. You will need to alter your network's architecture so that your model reshapes the data to match the shape of the labels.
Keeping track of tensor shapes is difficult when you're just starting out, so I recommend calling plot_model(model, to_file='model.png', show_shapes=True) before calling model.fit. You can look at the resulting PNG to understand what effect layers are having on the shape of your data.
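If you just want to reason about the shapes without plotting, here is a rough sketch in plain Python. It relies on the fact that a Dense layer only replaces the last axis of its input; the helper names are made up for this example, and the flattened variant at the end is one possible fix I am assuming, not something taken from the question.

```python
def dense_output_shape(input_shape, units):
    """Shape after a Dense layer: only the last axis is replaced by `units`."""
    return tuple(input_shape[:-1]) + (units,)

def model_output_shape(input_shape, layer_units):
    """Apply a stack of Dense layers (Dropout does not change the shape)."""
    shape = tuple(input_shape)
    for units in layer_units:
        shape = dense_output_shape(shape, units)
    return shape

# Shapes from the question, including the batch axis.
print(model_output_shape((1, 1600, 45, 5), [64, 64, 2]))   # (1, 1600, 45, 2)

# One possible fix (an assumption, not the asker's code): flatten each sample
# to a vector first, so that the batch axis is the 1600 samples.
print(model_output_shape((1600, 45 * 5), [64, 64, 2]))     # (1600, 2)
```

The first result has four dimensions, which is what the error message complains about; the second matches labels of shape (1600, 2).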

Can I use a neural network on a linear regression using Keras? If yes , How?

I'm having difficulties setting up a NN in Keras. Please help me!
This is my code and I'm getting random values every time when I predict.
model = Sequential()
layer1 = Dense(5, input_shape = (5,))
model.add(layer1)
model.add(Activation('relu'))
layer2 = Dense(1)
model.add(layer2)
model.add(Activation('relu'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(xtrain, ytrain, verbose=1)
I have five input features and want to predict a single continuous value as the output.
The problem was that I was getting random predictions for the same input. I have now found the solution: it was happening because I was not normalising the features.
Thanks
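For anyone landing here, a minimal sketch of the kind of feature normalisation meant above, using NumPy only. The data is randomly generated for illustration and the function name standardize is my own; the important assumption is that the mean and standard deviation come from the training data and are reused for any other data.

```python
import numpy as np

def standardize(train, other):
    """Scale columns to zero mean and unit variance using training statistics."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0          # avoid division by zero for constant columns
    return (train - mean) / std, (other - mean) / std

# Made-up data with five features on very different scales (assumption).
rng = np.random.default_rng(0)
loc = [0, 10, 100, 1000, 5]
scale = [1, 5, 50, 300, 0.1]
xtrain = rng.normal(loc=loc, scale=scale, size=(200, 5))
xtest = rng.normal(loc=loc, scale=scale, size=(20, 5))

xtrain_s, xtest_s = standardize(xtrain, xtest)
print(xtrain_s.mean(axis=0).round(6))  # approximately 0 for every column
print(xtrain_s.std(axis=0).round(6))   # approximately 1 for every column
```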
From my point of view, you are not specifying your input shape correctly:
layer1 = Dense(5, input_shape = (5,))
What is your actual input shape?

Understanding Deep Learning model accuracy

I need help understanding the accuracy and the dataset output format for a deep learning model.
I did some deep learning training based on this site: https://machinelearningmastery.com/deep-learning-with-python2/
I worked through the examples for the pima-indian-diabetes dataset and the iris flower dataset. I trained a model on the pima-indian-diabetes dataset using the script from this page: http://machinelearningmastery.com/tutorial-first-neural-network-python-keras/
Then I trained a model on the iris-flower dataset using the script below.
# import package
import numpy
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score, KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from keras.callbacks import ModelCheckpoint
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = read_csv("iris_2.csv", header=None)
dataset = dataframe.values
X = dataset[:,0:4].astype(float)
Y = dataset[:,4]
# encode class value as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
### one-hot encoder ###
dummy_y = np_utils.to_categorical(encoded_Y)
# define base model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(4, input_dim=4, init='normal', activation='relu'))
    model.add(Dense(3, init='normal', activation='sigmoid'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model_json = model.to_json()
    with open("iris.json", "w") as json_file:
        json_file.write(model_json)
    model.save_weights('iris.h5')
    return model
estimator = KerasClassifier(build_fn=baseline_model, nb_epoch=1000, batch_size=6, verbose=0)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Everything worked fine until I decided to try another dataset from this link: https://archive.ics.uci.edu/ml/datasets/Glass+Identification
At first I trained on this new dataset using the pima-indian-diabetes example script, changing the X and Y variables to this:
dataset = numpy.loadtxt("glass.csv", delimiter=",")
X = dataset[:,0:10]
Y = dataset[:,10]
and also changed the layer sizes to this:
model = Sequential()
model.add(Dense(10, input_dim=10, init='uniform', activation='relu'))
model.add(Dense(10, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
The result was accuracy = 32.71%.
Then I changed the output column of this dataset, which is originally an integer (1~7), to a string (a~g) and used the example script for the iris-flower dataset with some modifications:
import numpy
from pandas import read_csv
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
seed = 7
numpy.random.seed(seed)
dataframe = read_csv("glass.csv", header=None)
dataset = dataframe.values
X = dataset[:,0:10].astype(float)
Y = dataset[:,10]
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
def create_baseline():
    model = Sequential()
    model.add(Dense(10, input_dim=10, init='normal', activation='relu'))
    model.add(Dense(1, init='normal', activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model_json = model.to_json()
    with open("glass.json", "w") as json_file:
        json_file.write(model_json)
    model.save_weights('glass.h5')
    return model
estimator = KerasClassifier(build_fn=create_baseline, nb_epoch=1000, batch_size=10, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, encoded_Y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
I did not use the 'dummy_y' variable, following this tutorial: http://machinelearningmastery.com/binary-classification-tutorial-with-the-keras-deep-learning-library/
I checked that the dataset there uses letters as the output, and thought that maybe I could reuse that script to train on the glass dataset that I modified.
This time the results become like this
Baseline : 68.42% (3.03%)
According to the article, the 68% and 3% are the mean and standard deviation of the model accuracy.
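As a small illustration of how such a pair of numbers comes out of the cross-validation scores, here is a sketch with invented per-fold accuracies. The values in results are hypothetical stand-ins for the array returned by cross_val_score, not my actual results.

```python
import numpy as np

# Hypothetical per-fold accuracies, standing in for the array returned by
# cross_val_score (these numbers are invented for illustration).
results = np.array([0.70, 0.65, 0.72, 0.66, 0.69])

mean_acc = results.mean()   # average accuracy over the folds
std_acc = results.std()     # spread of the accuracy between folds

print("Baseline: %.2f%% (%.2f%%)" % (mean_acc * 100, std_acc * 100))
```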
My first question: when should I use integers or letters for the output column? And is this kind of change in accuracy common when we tamper with the dataset, for example by changing the output from integers to strings/letters?
My second question: how do I know how many neurons to put in each layer? Is it related to the backend I use when compiling the model (TensorFlow or Theano)?
Thank you in advance.
First question
It doesn't matter, as you can see here:
Y = range(10)
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
print encoded_Y
Y = ['a', 'b', 'c', 'd', 'e', 'f','g','h','i','j']
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
print encoded_Y
results:
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
Which means that your classifier sees exactly the same labels.
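To make the reason visible, here is a rough pure-Python sketch of what label encoding does: each distinct value is mapped to its position among the sorted classes. This is only an illustration of the idea, not scikit-learn's actual implementation, and the function name label_encode is my own.

```python
def label_encode(values):
    """Map each distinct value to its position in the sorted list of classes."""
    classes = sorted(set(values))
    index = {value: i for i, value in enumerate(classes)}
    return [index[value] for value in values]

# Integer labels and the corresponding letters give the same codes.
print(label_encode([1, 2, 3, 5, 6, 7, 3, 1]))                  # [0, 1, 2, 3, 4, 5, 2, 0]
print(label_encode(['a', 'b', 'c', 'e', 'f', 'g', 'c', 'a']))  # [0, 1, 2, 3, 4, 5, 2, 0]
```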
Second question
There is no absolutely correct answer to this question, but it certainly does not depend on your backend.
You should experiment with different numbers of neurons, numbers of layers, types of layers and all the other network parameters in order to understand what the best architecture for your problem is.
With experience you will develop both a good intuition for which parameters work better for which types of problem and a good method for experimenting.
The best rule of thumb I've heard (assuming you have a dataset large enough to sustain such a strategy) is "Make your network as large as you can until it overfits, add regularization until it does not overfit - repeat".
Let's take this in parts. First, if your output includes values in [0, 5], it is impossible to obtain them with the sigmoid activation, because the sigmoid function has a range of [0, 1]. You could use a linear activation (that is, no activation), but I think that is a bad approach because your problem is not to estimate a continuous value.
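A quick way to convince yourself of the range limit is to evaluate the sigmoid at a few points. This is a minimal standard-library sketch; the sample inputs are arbitrary.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(z, sigmoid(z))   # every value lies between 0 and 1

# A target such as 5 can never be matched, whatever the input is.
closest = max(sigmoid(z) for z in range(-50, 51))
print(closest <= 1.0)      # True
```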
Second, the question you should ask yourself is not so much what type of data you are using (in the sense of how you store the information). Is it a string? Is it an int? Is it a float? It does not matter. What you have to ask is what kind of problem you are trying to solve.
In this case, the problem should not be treated as a regression (estimating a continuous value), because your outputs are categorical: they are numbers, but categorical ones. What you really want is to classify the type of glass (the class attribute).
For a classification problem, the following configuration is "normally" used:
The class is encoded with one-hot encoding. This is nothing more than a vector of 0's with a single 1 at the position of the corresponding class.
For instance, class 3 (counting from 0) with 6 classes becomes [0, 0, 0, 1, 0, 0] (the vector has as many entries as you have classes).
As you can see, we no longer have a single output: your model must have as many outputs as your Y has classes (6). The last layer should therefore have as many neurons as classes: Dense(classes, ...).
You also want the output to be the probability of belonging to each class, that is p(y = class_0), ..., p(y = class_n). The softmax activation is used for this; it ensures that all the probabilities sum to 1.
You have to change the loss to categorical_crossentropy so that it works together with the softmax, and use the categorical_accuracy metric.
seed = 7
numpy.random.seed(seed)
dataframe = read_csv("glass.csv", header=None)
dataset = dataframe.values
X = dataset[:,0:10].astype(float)
Y = dataset[:,10]
encoder = LabelEncoder()
encoder.fit(Y)
from keras.utils import to_categorical
encoded_Y = to_categorical(encoder.transform(Y))
def create_baseline():
    model = Sequential()
    model.add(Dense(10, input_dim=10, init='normal', activation='relu'))
    model.add(Dense(encoded_Y.shape[1], init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
    model_json = model.to_json()
    with open("glass.json", "w") as json_file:
        json_file.write(model_json)
    model.save_weights('glass.h5')
    return model
model = create_baseline()
model.fit(X, encoded_Y, epochs=1000, batch_size=100)
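If it helps, the one-hot encoding and the softmax described above can be sketched in a few lines of NumPy, independent of Keras. The function names and the example scores are made up for illustration; the 6 classes follow the example in the text.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1."""
    return np.eye(num_classes, dtype=int)[labels]

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

print(one_hot([3], 6))                      # [[0 0 0 1 0 0]]
probs = softmax(np.array([[2.0, 1.0, 0.1, 0.0, -1.0, 0.5]]))
print(probs.sum(axis=-1))                   # sums to 1 (up to rounding)
print(probs.argmax(axis=-1))                # [0], the predicted class
```

The predicted class is the index of the largest probability, which is what categorical_accuracy compares against the position of the 1 in the one-hot label.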
The number of neurons does not depend on the backend you use.
It is true, however, that you will never get exactly the same results twice. That is because there are several stochastic processes within a network: initialization, dropout (if you use it), batch order, etc.
What is known is that increasing the number of neurons per dense layer makes the model more complex, so it has more potential to represent your problem, but it is also harder to train and more expensive in both time and computation. You therefore always have to look for a balance.
At the moment there is no clear evidence as to which of these is better:
expanding the number of neurons per layer.
adding more layers.
There are models that use one approach and models that use the other.
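To get a feel for how the two options change model size, you can count the trainable parameters of a stack of dense layers. This is a rough sketch: the layer sizes are assumptions (10 inputs as in the code, 6 output classes as in the one-hot example), and the function name is my own.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix + bias vector
    return total

print(dense_param_count([10, 10, 6]))        # 176
print(dense_param_count([10, 40, 6]))        # 686, wider hidden layer
print(dense_param_count([10, 10, 10, 6]))    # 286, one extra hidden layer
```

Parameter count is only a crude proxy for capacity and training cost, so treat the numbers as a starting point for experiments.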
Using this architecture you get the following result:
Epoch 1000/1000
214/214 [==============================] - 0s 17us/step - loss: 0.0777 - categorical_accuracy: 0.9953

DataConversionWarning with GridSearchCV in Sklearn

I'm getting the following warning repeatedly when using GridSearchCV in Sklearn
"DataConversionWarning: Copying input dataframe for slicing."
I tried running some of the models separately outside of Gridsearch and didn't get any warnings. It also didn't prevent Gridsearch from finding a model.
I have 2 Questions:
1) What does this error mean?
2) What are the implications for my output, if any?
The relevant parts of the code are below:
df = pd.read_csv(os.path.join(filepath, "Modeling_Set.csv")) #loads main data
keep_vars = pd.read_csv(os.path.join(filepath, "keep_vars.csv")) #loads a list of variables to keep from a CSV list
model_vars = keep_vars[keep_vars['keep']==1]['name'] #creates a list of vars to keep
modeling_df = df[model_vars] #creates the df with only keep vars
model_feature_vars = model_vars[:-1]
#Splits test and train data
X_train, X_test, y_train, y_test = train_test_split(modeling_df[model_feature_vars], modeling_df['Segment'], test_size=0.30, random_state=42)
#sets up models
#Range of parameters for gridsearch with decision trees
max_depth = range(2,20,2)
min_samples_split = range(2,10,2)
features = range(2, len(X_train.columns))
#set up for decision trees with gridsearch
parametersDT ={'feature_selection__k':features,
'feature_selection__score_func':(chi2, f_classif),
'classification__criterion':('gini','entropy'),
'classification__max_depth':max_depth,
'classification__min_samples_split':min_samples_split}
DT_with_K_Best = Pipeline([
('feature_selection', SelectKBest()),
('classification', DecisionTreeClassifier())
])
clf_DT = GridSearchCV(DT_with_K_Best, parametersDT, cv=10, verbose=2, scoring='f1_weighted', n_jobs = -2)
clf_DT.fit(X_train,y_train)
As far as I can tell it only means that the DataFrame you're using is copied before being fed to the model.
This shouldn't affect the training results. It's only an efficiency problem, unrelated to the performance of the classifier.
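If the repeated message is just noise for you, one option is to filter that warning category with Python's warnings module. The sketch below uses only the standard library: the DataConversionWarning class and fit_like_call are local stand-ins I made up so the example runs on its own. With scikit-learn installed you would, I believe, import the real class from sklearn.exceptions instead. It does not check whether the filter reaches worker processes started through n_jobs.

```python
import warnings

class DataConversionWarning(UserWarning):
    """Local stand-in for the library's warning class (assumption)."""

def fit_like_call():
    # Hypothetical function that warns the way the grid search does.
    warnings.warn("Copying input dataframe for slicing.", DataConversionWarning)
    return "fitted"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")   # record every warning...
    # ...except the category we decided to silence.
    warnings.filterwarnings("ignore", category=DataConversionWarning)
    result = fit_like_call()

print(result)        # fitted
print(len(caught))   # 0, the warning was filtered out
```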