I have a UNet++ (view in private; the code for the model is at the bottom of the article) which I'm trying to reconfigure. I'm getting artifacts in some images, so I'm following this article, which suggests doing upsampling followed by a convolution operation.
I'm replacing the up-sample layers with the sequential operation shown below, but my model isn't learning. I suspect it's to do with how I've configured the channels, so I'd like another opinion.
Old up-sample operation:
self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
New operations:
class upConv(nn.Module):
    """
    Up-sampling / deconv block by factor of 2
    """
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.upc = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
            nn.Conv2d(in_ch, out_ch*2, 3, stride=1, padding=1),
            nn.BatchNorm2d(out_ch*2),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        out = self.upc(x)
        return out
My question is do these two operations have the same output/function within my model?
Do these two operations have the same output/function within my model?
If out_ch*2 == in_ch, then yes, they have the same output shape.
If the input x is the output of a BatchNorm+ReLU op, then they could be even more similar.
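To sanity-check the shapes, a small sketch along these lines may help (the channel counts are made up for illustration, and upConv is the class from the question):

import torch
import torch.nn as nn

in_ch, out_ch = 64, 32   # hypothetical values chosen so that out_ch*2 == in_ch

old_up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
new_up = upConv(in_ch, out_ch)  # upConv as defined above

x = torch.randn(1, in_ch, 16, 16)
print(old_up(x).shape)  # torch.Size([1, 64, 32, 32]) - channel count unchanged
print(new_up(x).shape)  # torch.Size([1, 64, 32, 32]) - matches only because out_ch*2 == in_ch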
I have a model that I need to solve many times, with different objective function coefficients.
Naturally, I want to spend as little time as possible on updating the model.
My current setup is as follows (simplified):
Abstract model:
def obj_rule(m):
    return sum(Dist[i, j] * m.flow[i, j] for (i, j) in m.From * m.To)
m.obj = Objective(rule=obj_rule, sense=minimize)
After a solve, I update the concrete model instance mi this way, with new values in Dist:
mi.obj = sum(Dist[i, j] * mi.flow[i, j] for (i, j) in mi.From * mi.To)
However, I see that the update line takes a lot of time - ca. 1/4 of the overall solution time, several seconds for bigger cases.
Is there some faster way of updating the objective function?
(After all, in the usual way of saving an LP model, the objective function coefficients are in a separate vector, so changing them should not affect anything else.)
Do you have a reason to define an Abstract model before creating your Concrete model? If you define your concrete model with the rule you show above, you should be able to just update your data and re-solve the model without a lot of overhead, since you don't redefine the objective object. Here's a simple example where I change the values of the cost parameter and re-solve.
import pyomo.environ as pyo

a = list(range(2))  # the set the variables are defined over

#%% Begin basic model
model = pyo.ConcreteModel()
model.c = pyo.Param(a, initialize={0: 5, 1: 3}, mutable=True)
model.x = pyo.Var(a, domain=pyo.Binary)
model.con = pyo.Constraint(expr=model.x[0] + model.x[1] <= 1)

def obj_rule(model):
    return sum(model.x[ind] * model.c[ind] for ind in a)

model.obj = pyo.Objective(rule=obj_rule, sense=pyo.maximize)

#%% Solve the first time
solver = pyo.SolverFactory('glpk')
res = solver.solve(model)
print('x[0]: {} \nx[1]: {}'.format(pyo.value(model.x[0]),
                                   pyo.value(model.x[1])))
# x[0]: 1
# x[1]: 0

#%% Update data and re-solve
model.c.reconstruct({0: 0, 1: 5})
res = solver.solve(model)
print('x[0]: {} \nx[1]: {}'.format(pyo.value(model.x[0]),
                                   pyo.value(model.x[1])))
# x[0]: 0
# x[1]: 1
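As a side note, since c was declared with mutable=True, its values can also be updated by plain assignment instead of reconstruct() (a small sketch, assuming the model above):

#%% Alternative update: assign to the mutable Param entries directly
model.c[0] = 0
model.c[1] = 5
res = solver.solve(model)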
I have a model trained to classify RGB values into 1000 categories.
#Model architecture
model = Sequential()
model.add(Dense(512,input_shape=(3,),activation="relu"))
model.add(BatchNormalization())
model.add(Dense(512,activation="relu"))
model.add(BatchNormalization())
model.add(Dense(1000,activation="relu"))
model.add(Dense(1000,activation="softmax"))
I want to be able to extract the output before the softmax layer so I can conduct analyses on different samples of categories within the model. I want to execute softmax for each sample and conduct the analyses using a function named getinfo().
Model
Initially, I enter X_train data into model.predict, to get a vector of 1000 probabilities for each input. I execute getinfo() on this array to get the desired result.
Pop1
I then use model.pop() to remove the softmax layer. I get new predictions for the popped model, and execute scipy.special.softmax. However, getinfo() produces an entirely different result on this array.
Pop2
I write my own softmax function to validate the 2nd result, and I receive an almost identical answer to Pop1.
Pop3
However, when I simply calculate getinfo() on the output of model.pop() with no softmax function, I get the same result as the initial Model.
data = np.loadtxt("allData.csv", delimiter=",")
model = load_model("model.h5")

def getinfo(data):
    objects = scipy.stats.entropy(np.mean(data, axis=0), base=2)
    print(('objects_mean', objects))
    colours_entropy = []
    for i in data:
        e = scipy.stats.entropy(i, base=2)
        colours_entropy.append(e)
    colours = np.mean(np.array(colours_entropy))
    print(('colours_mean', colours))
    info = objects - colours
    print(('objects-colours', info))
    return info

def softmax_max(data):
    # calculate softmax whilst subtracting the max values (axis=1)
    sm = []
    count = 0
    for row in data:
        max = np.argmax(row)
        e = np.exp(row - data[count, max])
        s = np.sum(e)
        sm.append(e / s)
        count += 1  # advance to the next row's max
    sm = np.asarray(sm)
    return sm
#model
preds = model.predict(X_train)
getinfo(preds)
#pop1
model.pop()
preds1 = model.predict(X_train)
sm1 = scipy.special.softmax(preds1,axis=1)
getinfo(sm1)
#pop2
sm2 = softmax_max(preds1)
getinfo(sm2)
#pop3
getinfo(preds1)
I expect to get the same output from Model, Pop1 and Pop2, but a different answer for Pop3, as I did not compute softmax there. I wonder if the issue is with computing softmax after model.predict, and whether I am getting the same result in Model and Pop3 because softmax constrains the values between 0 and 1, so that for the purpose of the getinfo() function the results are mathematically equivalent?
If this is the case, then how do I execute softmax before model.predict?
I've gone around in circles with this, so any help or insight would be much appreciated. Please let me know if anything is unclear. Thank you!
model.pop() does not immediately have an effect. You need to run model.compile() again to recompile the new model that doesn't include the last layer.
Without the recompile, you're essentially running model.predict() twice in a row on the exact same model, which explains why Model and Pop3 give the same result. Pop1 and Pop2 give weird results because they are calculating the softmax of a softmax.
In addition, your model does not have the softmax as a separate layer, so pop takes off the entire last Dense layer. To fix this, add the softmax as a separate layer like so:
model.add(Dense(1000)) # softmax removed from this layer...
model.add(Activation('softmax')) # ...and added to its own layer
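Putting the pieces together, a rough sketch of the suggested workflow might look like this (the optimizer and loss passed to compile are placeholders, not taken from the question):

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation
import scipy.special

# rebuild the architecture with softmax as its own layer
model = Sequential()
model.add(Dense(512, input_shape=(3,), activation="relu"))
model.add(BatchNormalization())
model.add(Dense(512, activation="relu"))
model.add(BatchNormalization())
model.add(Dense(1000, activation="relu"))
model.add(Dense(1000))              # logits
model.add(Activation('softmax'))    # softmax as a separate layer

model.compile(optimizer='adam', loss='categorical_crossentropy')
# ... train as before ...

# remove only the softmax layer and recompile so the change takes effect
model.pop()
model.compile(optimizer='adam', loss='categorical_crossentropy')

logits = model.predict(X_train)               # pre-softmax outputs
probs = scipy.special.softmax(logits, axis=1) # should now match the original predictions
getinfo(logits)
getinfo(probs)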
I'm trying to fine-tune the existing models in Keras to classify my own dataset. So far I have tried the following code (taken from the Keras docs: https://keras.io/applications/), in which Inception V3 is fine-tuned on a new set of classes.
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K
# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
# train the model on the new data for a few epochs
model.fit_generator(...)
# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.
# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)
# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 172 layers and unfreeze the rest:
for layer in model.layers[:172]:
    layer.trainable = False
for layer in model.layers[172:]:
    layer.trainable = True
# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')
# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit_generator(...)
Can anyone please guide me on what changes I should make to the above code in order to fine-tune the ResNet50 model available in Keras?
Thanks in advance.
It is difficult to make out a specific question; have you tried anything more than just copying the code without any changes?
That said, there is an abundance of problems in the code: it is a simple copy/paste from keras.io, not functional as it is, and needs some adaptation before working at all (regardless of using ResNet50 or InceptionV3):
1) You need to define the input_shape when loading InceptionV3; specifically, replace base_model = InceptionV3(weights='imagenet', include_top=False) with base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(299,299,3)).
2) Further, you need to adapt the number of classes in the last added layer, e.g. if you have only 2 classes: predictions = Dense(2, activation='softmax')(x)
3) Change the loss function when compiling your model from categorical_crossentropy to sparse_categorical_crossentropy.
4) Most importantly, you need to define a generator before calling model.fit_generator() and add steps_per_epoch. If you have your training images in ./data/train, with every category in a different subfolder, this can be done e.g. like this:
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    "./data/train",
    target_size=(299, 299),
    batch_size=50,
    class_mode='binary')

model.fit_generator(train_generator, steps_per_epoch=100)
This of course only does basic training; you will, for example, need to define save calls to hold on to the trained weights. Only once you get the code working for InceptionV3 with the changes above do I suggest proceeding to implement this for ResNet50: as a start you can replace InceptionV3() with ResNet50() (of course only after from keras.applications.resnet50 import ResNet50), and change the input_shape to (224,224,3) and target_size to (224,224).
The above-mentioned code changes should work on Python 3.5.3 / Keras 2.0 with the TensorFlow backend.
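For reference, once the InceptionV3 version runs, the ResNet50 variant following points 1)-4) above might look roughly like this (the 2 classes, directory layout and batch size are just the placeholders used above):

from keras.applications.resnet50 import ResNet50
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator

# base model with the ResNet50 input size
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)  # adapt to your number of classes
model = Model(inputs=base_model.input, outputs=predictions)

# freeze the convolutional base for the first training pass
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')

train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(
    "./data/train",
    target_size=(224, 224),
    batch_size=50,
    class_mode='binary')

model.fit_generator(train_generator, steps_per_epoch=100)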
Beyond the important points mentioned in the above answer, for ResNet50 (note: only if your images are shaped into a similar format as in the original Keras code, i.e. (224,224), and not rectangular) you may substitute:
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
by
x = base_model.output
x = Flatten()(x)
EDIT: Please read @Yu-Yang's comment below.
I think I experienced the same issue. It appeared to be a complex problem, which has a decent thread on GitHub (https://github.com/keras-team/keras/issues/9214). The problem is in the Batch Normalization layers of the unfrozen blocks of the net (see the sketch after the options below). You have two solutions:
Only change the top layer (leaving the blocks as they are)
Add a patch from the GitHub thread above.
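For illustration only (this is not the exact patch from the thread), a common workaround along these lines is to keep the BatchNormalization layers frozen while unfreezing the top blocks, e.g.:

from keras.layers import BatchNormalization
from keras.optimizers import SGD

# unfreeze the top blocks but keep every BatchNormalization layer frozen,
# so its moving mean/variance are not updated during fine-tuning
for layer in model.layers[:172]:
    layer.trainable = False
for layer in model.layers[172:]:
    layer.trainable = not isinstance(layer, BatchNormalization)

model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
              loss='sparse_categorical_crossentropy')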
I'm trying to create a soccer match program for a couple of my friends. I currently have a class to instantiate teams, and I'm writing the class that instantiates matches.
What I currently have is:
class Team(object):
    def __init__(self, name, games=0, points=0, goals=0, wins=0, loses=0):
        self.name = name
        self.points = points
        self.games = games
        self.goals = goals
        self.wins = wins
        self.loses = loses

    def win(self):
        self.points += 3
        self.wins += 1
        self.games += 1

    def lose(self):
        self.points += 1
        self.loses += 1
        self.games += 1

    def ratio(self):
        print "(Wins/Loses/Games)\n ", self.wins, "/", self.loses, "/", self.games

class Match(object):
    def __init__(self, teamagoals, teambgoals):
        self.teamagoals = teamagoals
        self.teambgoals = teambgoals

    def playgame(teama, teamb):
        # conditional statements which will decide who wins
        pass

alpha = Team("George's Team")
beta = Team("Josh's Team")
gamma = Team("Fred's Team")
At this point I run into issues about how to go about doing this. As per my question, I'm trying to involve two instances of the Team class in the Match() class. For example, I would like to call a function in Match() and specify that team alpha and team gamma play against each other, then, when they win or lose, modify the corresponding points, games, etc. values of their instances.
What is the way I can do this? Is it possible by putting all of the Team() instances into a list and then passing that list into the Match() class? Or is there some other, more elegant, way? Please help.
I'm not sure exactly how you intend to select which teams play each other, but say that's done by one function. Then, Match could be implemented as follows:
class Match(object):
    def __init__(self, teama, teamb):
        self.teama = teama
        self.teamb = teamb

    def play(self):
        a_points = 0
        b_points = 0
        # code/function call to decide how many points each team gets.
        self.teama.points += a_points
        self.teamb.points += b_points
        if a_points > b_points:
            self.teama.wins += 1
        elif b_points > a_points:
            self.teamb.wins += 1
        else:
            # code for when the teams draw
            pass
I'm not sure how you intend to structure some of the functions, like deciding how many points teams get, but this is a framework that is hopefully useful/gives you ideas on how you could do things.
Following up on your comment, if you would like to be able to pass in specific teams created outside the class, you could do so as follows:
match1 = Match(gamma, alpha). This will create a new match instance, where the teams 'gamma' and 'alpha' will be modified. (teama and teamb in the class will refer to these objects.)
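As a variation, here is a rough sketch of a play() method that reuses the win()/lose() methods already defined on Team; the random goal counts are only placeholders for your own match logic:

import random

class Match(object):
    def __init__(self, teama, teamb):
        self.teama = teama
        self.teamb = teamb

    def play(self):
        # placeholder scoring: replace with your own conditional logic
        a_goals = random.randint(0, 5)
        b_goals = random.randint(0, 5)
        self.teama.goals += a_goals
        self.teamb.goals += b_goals
        if a_goals > b_goals:
            self.teama.win()
            self.teamb.lose()
        elif b_goals > a_goals:
            self.teamb.win()
            self.teama.lose()
        else:
            # a draw: here we only count the game played
            self.teama.games += 1
            self.teamb.games += 1

match1 = Match(gamma, alpha)
match1.play()
gamma.ratio()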
I am trying to build a model for the likelihood function of a particular outcome of a Langevin equation (a Brownian particle in a harmonic potential).
Here is my model in pymc2 that seems to work:
https://github.com/hstrey/BayesianAnalysis/blob/master/Langevin%20simulation.ipynb
# define the model/function to be fitted.
def model(x):
    t = pm.Uniform('t', 0.1, 20, value=2.0)
    A = pm.Uniform('A', 0.1, 10, value=1.0)

    @pm.deterministic(plot=False)
    def S(t=t):
        return 1 - np.exp(-4*delta_t/t)

    @pm.deterministic(plot=False)
    def s(t=t):
        return np.exp(-2*delta_t/t)

    path = np.empty(N, dtype=object)
    path[0] = pm.Normal('path_0', mu=0, tau=1/A, value=x[0], observed=True)
    for i in range(1, N):
        path[i] = pm.Normal('path_%i' % i,
                            mu=path[i-1]*s,
                            tau=1/A/S,
                            value=x[i],
                            observed=True)
    return locals()

mcmc = pm.MCMC(model(x))
mcmc.sample(20000, 2000, 10)
The basic idea is that each point depends on the previous point in the chain (a Markov chain). By the way, x is an array of data, N is its length, and delta_t is the time step (= 0.01). Any idea how to implement this in pymc3? I tried:
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
    t = pm.Uniform('t', 0.1, 20)
    A = pm.Uniform('A', 0.1, 10)
    S = 1 - pm.exp(-4*delta_t/t)
    s = pm.exp(-2*delta_t/t)
    path = np.empty(N, dtype=object)
    path[0] = pm.Normal('path_0', mu=0, tau=1/A, observed=x[0])
    for i in range(1, N):
        path[i] = pm.Normal('path_%i' % i,
                            mu=path[i-1]*s,
                            tau=1/A/S,
                            observed=x[i])
Unfortunately the model crashes as soon as I try to run it. I tried some of the pymc3 examples (from the tutorial) on my machine and those work.
Thanks in advance. I am really hoping that the new samplers in pymc3 will help me with this model. I am trying to apply Bayesian methods to single-molecule experiments.
Rather than creating many individual normally-distributed 1-D variables in a loop, you can make a custom distribution (by extending Continuous) that knows the formula for computing the log likelihood of your entire path. You can bootstrap this likelihood formula off of the Normal likelihood formula that pymc3 already knows. See the built-in AR1 class for an example.
Since your particle follows the Markov property, your likelihood looks like
import theano.tensor as T

def logp(path):
    now = path[1:]
    prev = path[:-1]
    loglik_first = pm.Normal.dist(mu=0., tau=1./A).logp(path[0])
    loglik_rest = T.sum(pm.Normal.dist(mu=prev*ss, tau=1./A/S).logp(now))
    loglik_final = loglik_first + loglik_rest
    return loglik_final
I'm guessing that you want to draw a value for ss at every time step, in which case you should make sure to specify ss = pm.exp(..., shape=len(x)-1), so that prev*ss in the block above gets interpreted as element-wise multiplication.
Then you can just specify your observations with
path = MyLangevin('path', ..., observed=x)
This should run much faster.
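For concreteness, a hedged sketch of how that logp could be wrapped in a custom distribution (the class name Langevin and the exact parameterisation are assumptions, following the pattern of pymc3's built-in AR1; delta_t, N and x are as in the question):

import pymc3 as pm
import theano.tensor as T

class Langevin(pm.Continuous):
    # joint log-likelihood of the whole path (assumed parameterisation)
    def __init__(self, A=None, S=None, ss=None, *args, **kwargs):
        super(Langevin, self).__init__(*args, **kwargs)
        self.A = A
        self.S = S
        self.ss = ss

    def logp(self, path):
        A, S, ss = self.A, self.S, self.ss
        now, prev = path[1:], path[:-1]
        loglik_first = pm.Normal.dist(mu=0., tau=1./A).logp(path[0])
        loglik_rest = T.sum(pm.Normal.dist(mu=prev*ss, tau=1./A/S).logp(now))
        return loglik_first + loglik_rest

with pm.Model() as DHP_model:
    t = pm.Uniform('t', 0.1, 20)
    A = pm.Uniform('A', 0.1, 10)
    S = 1 - pm.exp(-4*delta_t/t)
    ss = pm.exp(-2*delta_t/t)
    path = Langevin('path', A=A, S=S, ss=ss, shape=N, observed=x)
    trace = pm.sample(1000)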
Since I did not see an answer to my question, let me answer it myself. I came up with the following solution:
# now lets model this data using pymc
# define the model/function for diffusion in a harmonic potential
DHP_model = pm.Model()
with DHP_model:
    D = pm.Gamma('D', mu=mu_D, sd=sd_D)
    A = pm.Gamma('A', mu=mu_A, sd=sd_A)
    S = 1.0 - pm.exp(-2.0*delta_t*D/A)
    ss = pm.exp(-delta_t*D/A)
    path = pm.Normal('path_0', mu=0.0, tau=1/A, observed=x[0])
    for i in range(1, N):
        path = pm.Normal('path_%i' % i,
                         mu=path*ss,
                         tau=1.0/A/S,
                         observed=x[i])
    start = pm.find_MAP()
    print(start)
    trace = pm.sample(100000, start=start)
unfortunately, this code takes at N=50 anywhere between 6hours to 2 days to compile. I am running on a pretty fast PC (24Gb RAM) running Ubuntu. I tried to using the GPU but that runs slightly slower. I suspect memory problems since it uses 99.8% of the memory when running. I tried the same calculation with Stan and it only takes 2min to run.