I am trying to create a generative network based on the pre-trained Inception_v3.
1) I fix all the weights in the model
2) create a Variable whose size is (2, 3, 299, 299)
3) create targets of size (2, 1000) that I want my final layer activations to become as close as possible to by optimizing the Variable.
(I do not use a batch size of 1 because, unlike VGG16, Inception_v3 does not accept batchsize=1, but that is beside the point.)
The following code should work, but gives me the error: «RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation».
# minimalist code with Inception_v3 that throws the error:
import torch
from torch.autograd import Variable
import torch.optim as optim
import torch.nn as nn
import torchvision
torch.set_default_tensor_type('torch.FloatTensor')
Iv3 = torchvision.models.inception_v3(pretrained=True)
for i in Iv3.parameters():
    i.requires_grad = False
criterion = nn.CrossEntropyLoss()
x = Variable(torch.randn(2, 3, 299, 299), requires_grad=True)
target = torch.empty(2, dtype=torch.long).random_(1000)
output = Iv3(x)
loss = criterion(output[0], target)
loss.backward()
print(x.grad)
This is very strange, because if I do the same thing with VGG16, everything works fine:
# minimalist working code with VGG16:
import torch
from torch.autograd import Variable
import torch.optim as optim
import torch.nn as nn
import torchvision
# torch.cuda.empty_cache()
# vgg16 = torchvision.models.vgg16(pretrained=True).cuda()
# torch.set_default_tensor_type('torch.cuda.FloatTensor')
torch.set_default_tensor_type('torch.FloatTensor')
vgg16 = torchvision.models.vgg16(pretrained=True)
for i in vgg16.parameters():
    i.requires_grad = False
criterion = nn.CrossEntropyLoss()
x = Variable(torch.randn(2, 3, 229, 229), requires_grad=True)
target = torch.empty(2, dtype=torch.long).random_(1000)
output = vgg16(x)
loss = criterion(output, target)
loss.backward()
print(x.grad)
Please help.
Thanks to #iacolippo, the issue is solved. It turns out the problem was due to PyTorch 1.0.0; there is no problem with PyTorch 0.4.1.
I tried to write an example with Keras, but I get this error: Error when checking target: expected dense_2 to have shape (2,) but got array with shape (1,)
I have tried changing the input_shape, but it doesn't help.
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
import numpy
print("hello")
input=[[1],[2],[3],[4],[5],[6],[7],[8]]
input=numpy.array(input, dtype="float")
# input=input.reshape(8,1)
output=[[1],[0],[1],[0],[1],[0],[1],[0]]
output=numpy.array(output, dtype="float")
(trainx,testx,trainy,testy)=train_test_split(input, output, test_size=0.25, random_state=42)
lb = LabelBinarizer()
trainy=lb.fit_transform(trainy)
testy=lb.transform(testy)
model=Sequential()
model.add(Dense(4,input_shape=(1,),activation="sigmoid"))
# model.add(Dense(4,activation="sigmoid"))
# print len(lb.classes_)
model.add(Dense(len(lb.classes_),activation="softmax",input_shape=(4,)))
INIT_LR = 0.01
EPOCHS = 20
print("[INFO] training network...")
opt = SGD(lr=INIT_LR)
model.compile(loss="categorical_crossentropy", optimizer=opt,metrics=["accuracy"])
H = model.fit(trainx, trainy, validation_data=(testx, testy),epochs=EPOCHS, batch_size=2)
Since you have two classes, you can use a single neuron in the final Dense layer with a sigmoid activation (and binary_crossentropy as the loss). Alternatively, if you want to use softmax, you need to one-hot encode y like this:
(trainx,testx,trainy,testy)=train_test_split(input, output, test_size=0.25, random_state=42)
trainy = keras.utils.to_categorical(trainy, 2)
testy = keras.utils.to_categorical(testy, 2)
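To see why the target shape was (1,) rather than (2,), here is a minimal plain-Python sketch of what one-hot encoding does to two-class labels. The helper `one_hot` is a hypothetical stand-in for `keras.utils.to_categorical`, written only for illustration. LabelBinarizer, by contrast, returns a single 0/1 column when there are only two classes, which is what the softmax layer with two units rejects.

```python
def one_hot(labels, num_classes):
    """Hypothetical stand-in for keras.utils.to_categorical (illustration only)."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes   # one slot per class
        row[int(label)] = 1.0       # mark the slot of the true class
        rows.append(row)
    return rows

labels = [1, 0, 1, 0, 1, 0]         # class index per sample, as in the question
encoded = one_hot(labels, 2)

print(encoded[0])       # [0.0, 1.0]
print(len(encoded[0]))  # 2, matching the two softmax units
```

Each target row now has two entries, so it lines up with a Dense layer of two softmax units.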
You should use "from tensorflow.python.keras.xx" instead of "from keras.xx". This avoids errors such as: "AttributeError: module 'tensorflow' has no attribute 'get_default_graph'"
Below is the code for a simple Bayesian Linear regression. After I obtain the trace and the plots for the parameters, is there any way in which I can save the data that created the plots in a file so that if I need to plot it again I can simply plot it from the data in the file rather than running the whole simulation again?
import pymc3 as pm
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,9,5)
y = 2*x + 5
yerr=np.random.rand(len(x))
def soln(x, p1, p2):
    return p1+p2*x
with pm.Model() as model:
    # Define priors
    intercept = pm.Normal('Intercept', 15, sd=5)
    slope = pm.Normal('Slope', 20, sd=5)
    # Model solution
    sol = soln(x, intercept, slope)
    # Define likelihood
    likelihood = pm.Normal('Y', mu=sol,
                           sd=yerr, observed=y)
    # Sampling
    trace = pm.sample(1000, nchains = 1)
pm.traceplot(trace)
print(pm.summary(trace, ['Slope']))
print(pm.summary(trace, ['Intercept']))
plt.show()
There are two easy ways of doing this:
Use a version after 3.4.1 (currently this means installing from master, with pip install git+https://github.com/pymc-devs/pymc3). There is a new feature that allows saving and loading traces efficiently. Note that you need access to the model that created the trace:
...
pm.save_trace(trace, 'linreg.trace')
# later
with model:
    trace = pm.load_trace('linreg.trace')
Use cPickle (or pickle in Python 3). Note that pickle is insecure, so don't unpickle data from untrusted sources:
import cPickle as pickle # just `import pickle` on python 3
...
with open('trace.pkl', 'wb') as buff:
    pickle.dump(trace, buff)
#later
with open('trace.pkl', 'rb') as buff:
    trace = pickle.load(buff)
Update for anyone like me who still comes across this question:
The load_trace and save_trace functions were removed. Since version 4.0, even the deprecation warning for these functions has been removed.
The way to do it now is to use arviz:
with model:
    trace = pymc.sample(return_inferencedata=True)
trace.to_netcdf("filename.nc")
And it can be loaded with:
trace = arviz.from_netcdf("filename.nc")
This way works for me:
# saving trace
pm.save_trace(trace=trace_nb, directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")
# loading saved traces
with model_nb:
    t_nb = pm.load_trace(directory=r"c:\Users\xxx\Documents\xxx\traces\trace_nb")
I used Keras to train a simple RNN with 2 LSTM layers with dropout. I want to load the .pb graph with the TensorFlow C API and use it for prediction, but I get a segmentation fault. I later found that if I keep the network the same, remove only the dropout option, and retrain it, everything runs OK. However, I want to use the version with dropout because it predicts the test data more accurately. Does anyone have suggestions? There are very few examples of using the TensorFlow C API.
Here is where I get the segmentation fault:
TF_SessionRun(session, NULL,
&inputs[0], &input_values[0], static_cast<int>(inputs.size()),
&outputs[0], &output_values[0], static_cast<int>(outputs.size()),
NULL, 0, NULL, status);
// Assign the values from the output tensor to a variable and iterate over them
ASSERT(!output_values.empty());
float* out_vals = static_cast<float*>(TF_TensorData(output_values[0]));
BTW, I used the following code from a website to convert from .mdl in Keras to .pb in TensorFlow.
import tensorflow as tf
import sys
import numpy as np
# Create function to convert saved keras model to tensorflow graph
def convert_to_pb(weight_file,input_fld='',output_fld=''):
    import os
    import os.path as osp
    from tensorflow.python.framework import graph_util
    from tensorflow.python.framework import graph_io
    from keras.models import load_model
    from keras import backend as K
    # weight_file is a .h5 keras model file
    output_node_names_of_input_network = ["pred0"]
    output_node_names_of_final_network = 'output_node'
    # change filename to a .pb tensorflow file
    output_graph_name = weight_file[:-3]+'pb'
    weight_file_path = osp.join(input_fld, weight_file)
    net_model = load_model(weight_file_path)
    num_output = len(output_node_names_of_input_network)
    pred = [None]*num_output
    pred_node_names = [None]*num_output
    for i in range(num_output):
        pred_node_names[i] = output_node_names_of_final_network+str(i)
        pred[i] = tf.identity(net_model.output[i], name=pred_node_names[i])
    print('output nodes names are: ', pred_node_names)
    sess = K.get_session()
    constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), pred_node_names)
    graph_io.write_graph(constant_graph, output_fld, output_graph_name, as_text=False)
    print('saved the constant graph (ready for inference) at: ', osp.join(output_fld, output_graph_name))
    return output_fld+output_graph_name
tfpath = convert_to_pb(sys.argv[1],'./','./')
print('tfpath: ' + tfpath)
Then
I'm trying to use Keras for image recognition, but I keep getting errors like:
ValueError: Error when checking input: expected input_9 to have 4 dimensions, but got array with shape (100, 300, 300)
I tried changing the parameters that relate to dimensions, and also tried reshaping the images, but I still get errors.
In fact, I don't understand why I get this error. Why does it expect 4 dimensions?
Here's my code:
import os
import numpy as np
import pandas as pd
import scipy
import sklearn
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Convolution2D, Flatten, MaxPooling2D, Reshape, InputLayer
import cv2
from skimage import io
import urllib2
from PIL import Image
import numpy as np
%matplotlib inline
I chose 50 rose images and 50 sunflower images from imagenet:
rose_file = "http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=n04971313"
sunflower_file = "http://www.image-net.org/api/text/imagenet.synset.geturls?wnid=n11978713"
images = []
image_num = 50
rose_urls = urllib2.urlopen(rose_file)
rose_ct = 0
for rose_url in rose_urls:
    try:
        resp = urllib2.urlopen(rose_url)
        rose_image = np.asarray(bytearray(resp.read()), dtype="uint8")
        images.append(rose_image)
        rose_ct += 1
        if rose_ct == image_num: # only use 50 images here, otherwise, loading time is too long
            break
    except: # some images are no longer available
        pass
sunflower_urls = urllib2.urlopen(sunflower_file)
sunflower_ct = 0
for sunflower_url in sunflower_urls:
    try:
        resp = urllib2.urlopen(sunflower_url)
        sunflower_image = np.asarray(bytearray(resp.read()), dtype="uint8")
        images.append(sunflower_image)
        sunflower_ct += 1
        if sunflower_ct == image_num: # only use 50 images here, otherwise, loading time is too long
            break
    except: # some images are no longer available
        pass
Resize training images to 300*300:
from keras.utils.np_utils import to_categorical
for i in range(len(images)):
    images[i]=cv2.resize(np.array(images[i]),(300,300))
images = np.array(images)
labels = [0 for i in range(image_num)]
labels.extend([1 for j in range(image_num)])
labels = np.array(labels)
labels = to_categorical(labels)
Build the model:
filters=10
filtersize=(5,5)
epochs=7
batchsize=128
input_shape=(300,300, 3)
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.convolutional.Conv2D(filters, filtersize, strides=(1, 1),
padding='valid', data_format="channels_last", activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(units=2, input_dim=10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(images, labels, epochs=epochs, batch_size=batchsize, validation_split=0.3)
model.summary()
Here, I tried to change input_shape=(300,300, 3) into input_shape=(300,300, 3, 0), hoping this means 4 dimensions, but got errors saying:
Input 0 is incompatible with layer conv2d_13: expected ndim=4, found ndim=5
Do you know why I get these errors, and how I can deal with this problem?
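For reference, here is a minimal sketch of the shape bookkeeping behind both messages, using plain tuples rather than Keras itself. It assumes the usual Keras convention that input_shape describes a single sample and that the batch axis is prepended automatically. The variable names are illustrative only.

```python
input_shape = (300, 300, 3)              # one sample: height, width, channels
expected_batch = (None,) + input_shape   # Keras prepends the batch axis
supplied = (100, 300, 300)               # shape reported in the ValueError

print(len(expected_batch))  # 4 -> "expected input_9 to have 4 dimensions"
print(len(supplied))        # 3 -> the channel axis is missing from the data

# Adding a fourth entry to input_shape makes the model expect 5 axes instead.
widened = (None,) + (300, 300, 3, 0)
print(len(widened))         # 5 -> "expected ndim=4, found ndim=5"
```

Read this way, input_shape=(300, 300, 3) is already correct and the array passed to fit is what lacks an axis: images.shape would need to be (100, 300, 300, 3). One possible cause, judging only from the code shown, is that the downloaded bytes are resized without first being decoded into an image (for example with cv2.imdecode).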
I modified the FCN net to design a new net that takes two ImageData layers as input, and I want the net to produce a picture as output.
Here are the train_val.prototxt and the deploy.prototxt.
The original picture and the label are both grayscale images of size 224*224.
I trained a caffemodel and used infer.py to run segmentation with it, but I get this error:
F0505 06:15:08.072602 30713 net.cpp:767] Check failed: target_blobs.size() == source_layer.blobs_size() (2 vs. 1) Incompatible number of blobs for layer conv1
Here is the infer.py file:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt  # needed for the plt calls at the end of the script
caffe_root = '/home/zhaimo/'
import sys
sys.path.insert(0, caffe_root + 'caffe-master/python')
import caffe
im = Image.open('/home/zhaimo/fcn-master/data/vessel/test/13.png')
in_ = np.array(im, dtype=np.float32)
#in_ = in_[:,:,::-1]
#in_ -= np.array((104.00698793,116.66876762,122.67891434))
#in_ = in_.transpose((2,0,1))
net = caffe.Net('/home/zhaimo/fcn-master/mo/deploy.prototxt', '/home/zhaimo/fcn-master/mo/snapshot/train/_iter_200000.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)
plt.imshow(out)  # draw the predicted label map; without this the saved figure is empty
plt.axis('off')
plt.savefig('/home/zhaimo/fcn-master/mo/result/13.png')
How can I solve this problem?
The problem is with the bias term in conv1. In your train.prototxt it is set to false, but in your deploy.prototxt it is not set, so it takes the default value, which is true. That is why the weight loader looks for two blobs (weights and bias) while the trained model holds only one.
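As an illustration of the "(2 vs. 1)" in the error message, here is a small plain-Python sketch of the comparison. It only mimics the count check and is not Caffe's actual code; the helper names `blobs_for_conv` and `check_copy` are hypothetical.

```python
def blobs_for_conv(bias_term=True):
    """Learnable blobs of a convolution layer (bias_term defaults to true in Caffe)."""
    return ["weights", "bias"] if bias_term else ["weights"]

def check_copy(layer_name, target_blobs, source_blobs):
    """Mimic the blob-count comparison from the error message (illustration only)."""
    if len(target_blobs) != len(source_blobs):
        return "Incompatible number of blobs for layer %s (%d vs. %d)" % (
            layer_name, len(target_blobs), len(source_blobs))
    return "ok"

source = blobs_for_conv(bias_term=False)  # what the training net saved
target = blobs_for_conv()                 # deploy net with bias_term left unset

message = check_copy("conv1", target, source)
print(message)  # Incompatible number of blobs for layer conv1 (2 vs. 1)

# With the same setting on both sides the counts agree.
fixed = check_copy("conv1", blobs_for_conv(bias_term=False), source)
print(fixed)    # ok
```

Under that reading, adding bias_term: false to the convolution_param of conv1 in deploy.prototxt should make the two counts agree.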