Get intermediate output of layers on MCU using tflite micro? - c++

Sorry that my first question was unclear; I have edited it to be more specific.
Because the outputs of the middle layers of some neural networks are very interesting, I would like to get the output of a certain layer during inference on a microcontroller (MCU) running the TensorFlow Lite Micro C++ library.
The normal way to do this in tensorflow:
# The model we train
model = tf.keras.models.Sequential([
...
])
model.compile(...)
model.fit(...)
# Create an aux-model which includes the layers up to the one we want
layer_output_model = Model(model.inputs, model.layers[theIndexYouWant].outputs)
To deploy the model on the MCU, we first quantize/prune it, convert it into a C array, and flash it to the MCU, like this:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_model = converter.convert()
open(tflite_mnist_model, "wb").write(tflite_model)
And the inference will be called in c++ like this:
// Initialization
const tflite::Model* model = ::tflite::GetModel(model);
TfLiteTensor* input = interpreter.input(0);
TfLiteTensor* output = interpreter.output(0);
// Give input, run inference and get output
input->data.f[0] = 0.;
TfLiteStatus invoke_status = interpreter.Invoke();
float value = output->data.f[0];
If I want to extract the output of a certain middle layer during inference on the MCU, how could I do it?
The only method I can come up with is to convert the aux-model layer_output_model above into a C array and upload it to the MCU as an additional model.
converter = tf.lite.TFLiteConverter.from_keras_model(layer_output_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
aux_tflite_model = converter.convert()
Is this the right way to do it? I'm not sure that the aux model converted here represents the wanted layer output of the original model, especially after quantization using representative_dataset.
Thanks.
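On the quantization concern: assuming the intermediate tensor ends up int8-quantized, what is read on the device is an integer that maps to a float through that tensor's own scale and zero point (TfLiteTensor exposes them as params.scale and params.zero_point). A minimal sketch of the affine mapping, with made-up parameter values, shows why the value can only match the Keras layer output up to a rounding error:

```python
def quantize(real, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to int8 with the affine scheme q = round(real / scale) + zero_point."""
    q = round(real / scale) + zero_point
    return max(qmin, min(qmax, q))  # clamp to the int8 range


def dequantize(q, scale, zero_point):
    """Recover an approximate float: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Hypothetical parameters for one intermediate tensor (illustrative values only).
scale, zero_point = 0.05, -10

activation = 1.234                      # float activation from the Keras layer
q = quantize(activation, scale, zero_point)
approx = dequantize(q, scale, zero_point)
error = abs(approx - activation)        # bounded by scale / 2 when no clamping occurs
```

Because each converted model gets its own calibration, the scale and zero point of the last tensor in aux_tflite_model need not equal those of the corresponding tensor inside the full model, so small differences between the two would not be surprising.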

Related

I want to deploy a pytorch segmentation model in a C++ application .. C++ equivalent preprocessing

I want to deploy a PyTorch segmentation model in a C++ application. I know that I have to convert the model to TorchScript and use libtorch.
However, what is the C++ equivalent of the following preprocessing? (Converting the OpenCV calls is fine, but I don't know how to convert the others.)
import torch.nn.functional as F
train_tfms = transforms.Compose([transforms.ToTensor(), transforms.Normalize(channel_means, channel_stds)])
input_width, input_height = input_size[0], input_size[1]
img_1 = cv.resize(img, (input_width, input_height), cv.INTER_AREA)
X = train_tfms(Image.fromarray(img_1))
X = Variable(X.unsqueeze(0)).cuda() # [N, 1, H, W]
mask = model(X)
mask = F.sigmoid(mask[0, 0]).data.cpu().numpy()
mask = cv.resize(mask, (img_width, img_height), cv.INTER_AREA)
To create the transformed dataset, you will need to call MapDataset<DatasetType, TransformType> map(DatasetType dataset,TransformType transform) (see doc).
You will likely have to implement your 2 transforms yourself, just look at how they implemented theirs and imitate that.
The libtorch tutorial will guide you through datasets and dataloaders
You can call the sigmoid function with torch::nn::functional::sigmoid, I believe (torch::sigmoid also works directly on a tensor).
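For reference, the two transforms in the question only do simple arithmetic, which is what a C++ port has to reproduce. A minimal pure-Python sketch of that arithmetic follows; the tiny image and the channel statistics are made-up values, and the function names are illustrative rather than torchvision's own:

```python
def to_tensor(image_hwc):
    """Mimic transforms.ToTensor: HxWxC uint8 in [0, 255] -> CxHxW floats in [0, 1]."""
    height, width = len(image_hwc), len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [[[image_hwc[y][x][c] / 255.0 for x in range(width)]
             for y in range(height)]
            for c in range(channels)]


def normalize(tensor_chw, means, stds):
    """Mimic transforms.Normalize: (value - mean) / std, applied per channel."""
    return [[[(v - m) / s for v in row] for row in plane]
            for plane, m, s in zip(tensor_chw, means, stds)]


# Tiny 1x2 RGB image and made-up channel statistics (illustrative values only).
image = [[[0, 128, 255], [255, 128, 0]]]
channel_means = [0.5, 0.5, 0.5]
channel_stds = [0.5, 0.5, 0.5]

X = normalize(to_tensor(image), channel_means, channel_stds)
```

In C++ the same steps would be applied to the tensor built from the resized image: reorder to channel-first, divide by 255, subtract the mean, and divide by the standard deviation.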

Extracting output before the softmax layer, then manually calculating softmax gives a different result

I have a model trained to classify rgb values into 1000 categories.
#Model architecture
model = Sequential()
model.add(Dense(512,input_shape=(3,),activation="relu"))
model.add(BatchNormalization())
model.add(Dense(512,activation="relu"))
model.add(BatchNormalization())
model.add(Dense(1000,activation="relu"))
model.add(Dense(1000,activation="softmax"))
I want to be able to extract the output before the softmax layer so I can conduct analyses on different samples of categories within the model. I want to execute softmax for each sample, and conduct analyses using a function named getinfo().
Model
Initially, I enter X_train data into model.predict, to get a vector of 1000 probabilities for each input. I execute getinfo() on this array to get the desired result.
Pop1
I then use model.pop() to remove the softmax layer. I get new predictions for the popped model, and execute scipy.special.softmax. However, getinfo() produces an entirely different result on this array.
Pop2
I write my own softmax function to validate the 2nd result, and I receive an almost identical answer to Pop1.
Pop3
However, when I simply calculate getinfo() on the output of model.pop() with no softmax function, I get the same result as the initial Model.
import numpy as np
import scipy.special
import scipy.stats

data = np.loadtxt("allData.csv",delimiter=",")
model = load_model("model.h5")

def getinfo(data):
    objects = scipy.stats.entropy(np.mean(data, axis=0), base=2)
    print(('objects_mean',objects))
    colours_entropy = []
    for i in data:
        e = scipy.stats.entropy(i, base=2)
        colours_entropy.append(e)
    colours = np.mean(np.array(colours_entropy))
    print(('colours_mean',colours))
    info = objects - colours
    print(('objects-colours',info))
    return info

def softmax_max(data):
    # calculate softmax whilst subtracting the max values (axis=1)
    sm = []
    for row in data:
        e = np.exp(row - np.max(row))  # subtract this row's own maximum
        s = np.sum(e)
        sm.append(e/s)
    sm = np.asarray(sm)
    return sm

#model
preds = model.predict(X_train)
getinfo(preds)

#pop1
model.pop()
preds1 = model.predict(X_train)
sm1 = scipy.special.softmax(preds1,axis=1)
getinfo(sm1)

#pop2
sm2 = softmax_max(preds1)
getinfo(sm2)

#pop3
getinfo(preds1)
I expect to get the same output from Model, Pop1 and Pop2, but a different answer to Pop3, as I did not compute softmax here. I wonder if the issue is with computing softmax after model.predict? And whether I am getting the same result in Model and Pop3 because softmax is constraining the values between 0-1, so for the purpose of the getinfo() function, the result is mathematically equivalent?
If this is the case, then how do I execute softmax before model.predict?
I've gone around in circles with this, so any help or insight would be much appreciated. Please let me know if anything is unclear. Thank you!
model.pop() does not immediately have an effect. You need to run model.compile() again to recompile the new model that doesn't include the last layer.
Without the recompile, you're essentially running model.predict() twice in a row on the exact same model, which explains why Model and Pop3 give the same result. Pop1 and Pop2 give weird results because they are calculating the softmax of a softmax.
In addition, your model does not have the softmax as a separate layer, so pop takes off the entire last Dense layer. To fix this, add the softmax as a separate layer like so:
model.add(Dense(1000)) # softmax removed from this layer...
model.add(Activation('softmax')) # ...and added to its own layer
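The "softmax of a softmax" effect is easy to reproduce without Keras. A small sketch with made-up logits (the helper names are illustrative, not from the question): applying softmax a second time flattens the distribution, so an entropy-based measure such as getinfo() would report a different value.

```python
import math


def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


logits = [4.0, 1.0, 0.0]   # made-up pre-softmax activations
once = softmax(logits)     # what the unmodified model outputs
twice = softmax(once)      # what Pop1/Pop2 compute if the softmax is still in the model

print(entropy_bits(once), entropy_bits(twice))
```

The second entropy is noticeably larger than the first, which matches the symptom that Pop1 and Pop2 agree with each other but not with Model.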

Tensorflow return similar images

I want to use Google's Tensorflow to return similar images to an input image.
I have installed Tensorflow from http://www.tensorflow.org (using PIP installation - pip and python 2.7) on Ubuntu14.04 on a virtual machine CPU.
I have downloaded the trained model Inception-V3 (inception-2015-12-05.tgz) from http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz that is trained on ImageNet Large Visual Recognition Challenge using the data from 2012, but I think it has both the Neural network and the classifier inside it (as the task there was to predict the category). I have also downloaded the file classify_image.py that classifies an image in 1 of the 1000 classes in the model.
So I have a random image image.jpg that I am using to test the model. When I run the command:
python /home/amit/classify_image.py --image_file=/home/amit/image.jpg
I get the below output: (Classification is done using softmax)
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 3
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 3
trench coat (score = 0.62218)
overskirt (score = 0.18911)
cloak (score = 0.07508)
velvet (score = 0.02383)
hoopskirt, crinoline (score = 0.01286)
Now, the task at hand is to find images similar to the input image (image.jpg) in a database of 60,000 images (JPG format, kept in the folder /home/amit/images). I believe this can be done by removing the final classification layer from the Inception-v3 model and computing the cosine similarity between the feature set of the input image and the feature sets of all 60,000 images; the images with the highest similarity, i.e. the smallest angle (cos 0 = 1), are returned.
Please suggest a way forward for this problem and how to do this using the Python API.
I think I found an answer to my question:
In the file classify_image.py that classifies the image using the pre trained model (NN + classifier), I made the below mentioned changes (statements with #ADDED written next to them):
def run_inference_on_image(image):
    """Runs inference on an image.

    Args:
        image: Image file name.

    Returns:
        Nothing
    """
    if not gfile.Exists(image):
        tf.logging.fatal('File does not exist %s', image)
    image_data = gfile.FastGFile(image, 'rb').read()

    # Creates graph from saved GraphDef.
    create_graph()

    with tf.Session() as sess:
        # Some useful tensors:
        # 'softmax:0': A tensor containing the normalized prediction across
        #   1000 labels.
        # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
        #   float description of the image.
        # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
        #   encoding of the image.
        # Runs the softmax tensor by feeding the image_data as input to the graph.
        softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
        feature_tensor = sess.graph.get_tensor_by_name('pool_3:0') #ADDED
        predictions = sess.run(softmax_tensor,
                               {'DecodeJpeg/contents:0': image_data})
        predictions = np.squeeze(predictions)
        feature_set = sess.run(feature_tensor,
                               {'DecodeJpeg/contents:0': image_data}) #ADDED
        feature_set = np.squeeze(feature_set) #ADDED
        print(feature_set) #ADDED

        # Creates node ID --> English string lookup.
        node_lookup = NodeLookup()

        top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
        for node_id in top_k:
            human_string = node_lookup.id_to_string(node_id)
            score = predictions[node_id]
            print('%s (score = %.5f)' % (human_string, score))
I ran the pool_3:0 tensor by feeding the image_data to it. Please let me know if I am making a mistake. If this is correct, I believe we can use this tensor for further calculations.
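As a sketch of those further calculations, assuming one feature vector has been extracted and stored per database image, the ranking step could look like the following. The file names and the 3-dimensional vectors are toy values (the real pool_3 vectors have 2048 entries), and the helper names are hypothetical:

```python
import math


def cosine_similarity(a, b):
    """cos(angle) between two non-zero vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, database, top_k=2):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy stand-ins for the stored feature sets of the database images.
database = {
    "a.jpg": [1.0, 0.0, 0.0],
    "b.jpg": [0.9, 0.1, 0.0],
    "c.jpg": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]   # toy stand-in for the feature set of image.jpg

results = most_similar(query, database)
```

For 60,000 images it would make sense to compute the database features once and save them, so that only the query image has to go through the network at search time.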
Tensorflow now has a nice tutorial on how to get the activations before the final layer and retrain a new classification layer with different categories:
https://www.tensorflow.org/versions/master/how_tos/image_retraining/
The example code:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py
In your case, yes, you can get the activations from pool_3, the layer below the softmax layer (the so-called bottlenecks), and send them to other operations as input.
Finally, about finding similar images, I don't think ImageNet's bottleneck activations are a very pertinent representation for image search. You could consider using an autoencoder network with direct image inputs.
Your problem sounds similar to this visual search project

PCA for dimensionality reduction before Random Forest

I am working on binary class random forest with approximately 4500 variables. Many of these variables are highly correlated and some of them are just quantiles of an original variable. I am not quite sure if it would be wise to apply PCA for dimensionality reduction. Would this increase the model performance?
I would like to be able to know which variables are more significant to my model, but if I use PCA, I would be only able to tell what PCs are more important.
Many thanks in advance.
My experience is that PCA before RF is not a great advantage, if any. Principal component regression (PCR), for example, uses PCA to regularize the training features before OLS linear regression, which is very much needed for sparse data sets. As RF itself already performs a good/fair regularization without assuming linearity, PCA is not necessarily an advantage. That said, I found myself writing a PCA-RF wrapper for R two weeks ago. The code includes a simulated data set of 100 features comprising only 5 true linear components. Under such circumstances it is in fact a small advantage to pre-filter with PCA.
The code is a seamless implementation, such that all RF parameters are simply passed on to RF. The loading vectors are saved in model_fit for use during prediction.
On "I would like to be able to know which variables are more significant to my model, but if I use PCA, I would be only able to tell what PCs are more important":
The easy way is to run without PCA and obtain variable importances and expect to find something similar for PCA-RF.
The tedious way, wrap the PCA-RF in a new bagging scheme with your own variable importance code. Could be done in 50-100 lines or so.
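As a rough illustration of what such variable importance code could look like, here is a simpler permutation-based variant in Python. It is not the bagging/out-of-bag scheme mentioned above: it shuffles each original column on a held-out set and measures how much the error grows. The predict argument can wrap the whole PCA plus forest pipeline, so the scores refer to the original variables rather than the PCs. All names and the toy model are hypothetical:

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, seed=0):
    """Increase in MSE when each original column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        # Rebuild the data with only this one column permuted.
        X_perm = [row[:col] + [value] + row[col + 1:]
                  for row, value in zip(X, shuffled)]
        importances.append(mean_squared_error(y, predict(X_perm)) - baseline)
    return importances


# Hypothetical stand-in for a fitted model: it only ever looks at column 0.
def toy_predict(rows):
    return [2.0 * row[0] for row in rows]


X = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * row[0] for row in X]
scores = permutation_importance(toy_predict, X, y)
```

Here the score of column 0 is positive and the score of column 1 is exactly zero, because the toy model ignores column 1. With highly correlated variables, as in the question, permutation scores tend to be spread across the correlated group, so they should be read with some care.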
The source-code suggestion for PCA-RF:
#wrap PCA around randomForest, forward any other arguments to randomForest
#define as new S3 model class
train_PCA_RF = function(x,y,ncomp=5,...) {
  f.args=as.list(match.call()[-1])
  pca_obj = princomp(x)
  rf_obj = do.call(randomForest,c(alist(x=pca_obj$scores[,1:ncomp]),f.args[-1]))
  out=mget(ls())
  class(out) = "PCA_RF"
  return(out)
}

#print method
print.PCA_RF = function(object) print(object$rf_obj)

#predict method
predict.PCA_RF = function(object,Xtest=NULL,...) {
  print("predicting PCA_RF")
  f.args=as.list(match.call()[-1])
  if(is.null(f.args$Xtest)) stop("cannot predict without newdata parameter")
  sXtest = predict(object$pca_obj,Xtest) #scale Xtest as Xtrain was scaled before
  return(do.call(predict,c(alist(object = object$rf_obj, #class(x)="randomForest" invokes method predict.randomForest
                                 newdata = sXtest),      #newdata input, see help(predict.randomForest)
                           f.args[-1:-2])))              #any other parameters are passed to predict.randomForest
}
#testTrain predict #
make.component.data = function(
  inter.component.variance = .9,
  n.real.components = 5,
  nVar.per.component = 20,
  nObs=600,
  noise.factor=.2,
  hidden.function = function(x) apply(x,1,mean),
  plot_PCA =T
){
  Sigma=matrix(inter.component.variance,
               ncol=nVar.per.component,
               nrow=nVar.per.component)
  diag(Sigma) = 1
  x = do.call(cbind,replicate(n = n.real.components,
                              expr = {mvrnorm(n=nObs,
                                              mu=rep(0,nVar.per.component),
                                              Sigma=Sigma)},
                              simplify = FALSE)
  )
  if(plot_PCA) plot(prcomp(x,center=T,scale.=T)) #prcomp's scaling argument is named scale.
  y = hidden.function(x)
  ynoised = y + rnorm(nObs,sd=sd(y)) * noise.factor
  out = list(x=x,y=ynoised)
  pars = ls()[!ls() %in% c("x","y","Sigma")]
  attr(out,"pars") = mget(pars) #attach all pars as attributes
  return(out)
}
An example run:
#start script------------------------------
#source above from separate script
#test
library(MASS)
library(randomForest)
Data = make.component.data(nObs=600)#plots PC variance
train = list(x=Data$x[ 1:300,],y=Data$y[1:300])
test = list(x=Data$x[301:600,],y=Data$y[301:600])
rf = randomForest (train$x, train$y,ntree =50) #regular RF
rf2 = train_PCA_RF(train$x, train$y,ntree= 50,ncomp=12)
rf
rf2
pred_rf = predict(rf ,test$x)
pred_rf2 = predict(rf2,test$x)
cat("rf, R^2:",cor(test$y,pred_rf )^2,"PCA_RF, R^2", cor(test$y,pred_rf2)^2)
cor(test$y,predict(rf ,test$x))^2
cor(test$y,predict(rf2,test$x))^2
pairs(list(trueY = test$y,
native_rf = pred_rf,
PCA_RF = pred_rf2)
)
You can have a look here to get a better idea. The link recommends using PCA for smaller datasets. Some of my colleagues have used random forests for the same purpose when working with genomes; they had ~30,000 variables and a large amount of RAM.
Another thing I found is that random forests use a lot of memory, and you have 4500 variables. So maybe you could apply PCA to the individual trees.

Scikit-learn 0.15.2 - OneVsRestClassifier not works due to predict_proba not available

I am trying to do onevsrest classification like below:
classifier = Pipeline([('vectorizer', CountVectorizer()),('tfidf', TfidfTransformer()),('clf', OneVsRestClassifier(SVC(kernel='rbf')))])
classifier.fit(X_train, Y)
predicted = classifier.predict(X_test)
And I get the error 'predict_proba is not available when probability = false'. I saw that a bug had been reported for this:
https://github.com/scikit-learn/scikit-learn/issues/1946
It was closed as fixed, so I removed scikit-learn from my Windows PC and reinstalled it from scratch to get version 0.15.2. But I still get this error. Any suggestions? Or have I understood this wrong, and I still can't use SVC with OneVsRestClassifier unless I specify probability=True?
UPDATE: just to clarify, I am actually trying to achieve multi-label classification. Here is the data source:
df = pd.read_csv(fileIn, header = 0, encoding='utf-8-sig')
rows = random.sample(df.index, int(len(df) * 0.9))
work = df.ix[rows]
work_test = df.drop(rows)
X_train = []
y_train = []
X_test = []
y_test = []
for i in work[[i for i in list(work.columns.values) if i.startswith('Change')]].values:
    X_train.append(','.join(i.T.tolist()))
X_train = np.array(X_train)
for i in work[[i for i in list(work.columns.values) if i.startswith('Corax')]].values:
    y_train.append(list(i))
for i in work_test[[i for i in list(work_test.columns.values) if i.startswith('Change')]].values:
    X_test.append(','.join(i.T.tolist()))
X_test = np.array(X_test)
for i in work_test[[i for i in list(work_test.columns.values) if i.startswith('Corax')]].values:
    y_test.append(list(i))
lb = preprocessing.MultiLabelBinarizer()
Y = lb.fit_transform(y_train)
After that I send it to the pipeline mentioned earlier.
OK, I did some investigation in the code. OneVsRestClassifier tries to call decision_function first, and if that fails it falls back to the predict_proba function of the base classifier (svm.SVC in our case).
As far as I can see, my X_test is a numpy.array of lists of strings. After it goes through the transformations specified in the pipeline (CountVectorizer -> TfidfTransformer), it becomes a sparse matrix (by design). As far as I can see, decision_function is currently not available for sparse matrices, and there is even an open suggestion on GitHub: https://github.com/scikit-learn/scikit-learn/issues/73
So, to summarize, it looks like you can't do multi-label classification with svm.SVC unless you specify probability=True. Doing so adds some overhead to classifier.fit, but it will work.