We're currently training various neural networks using Keras, which is ideal because it has a nice interface and is relatively easy to use, but we'd like to be able to apply them in our production environment.
Unfortunately the production environment is C++, so our plan is to:
Use the TensorFlow backend to save the model to a protobuf
Link our production code to TensorFlow, and then load in the protobuf
The trouble is that I don't know how to access TensorFlow's saving utilities from Keras, which normally saves to HDF5 and JSON. How do I save to protobuf?
In case you don't need to utilize a GPU in the environment you are deploying to, you could also use my library, called frugally-deep. It is available on GitHub and published under the MIT License: https://github.com/Dobiasd/frugally-deep
frugally-deep allows running forward passes on already-trained Keras models directly in C++ without the need to link against TensorFlow or any other backend.
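If you go that route, the typical workflow (as I understand it from the repo's README; the script path and file names follow the repo's examples) is to save the model to HDF5 in Python and convert it with the bundled script:
model.save('model.h5', include_optimizer=False)  # your trained Keras model
# then, on the command line:
#   python3 keras_export/convert_model.py model.h5 model.json
The resulting model.json is what fdeep::load_model loads on the C++ side.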
This seems to be answered in "Keras as a simplified interface to TensorFlow: tutorial", posted on The Keras Blog by Francois Chollet.
In particular, section II, "Using Keras models with TensorFlow".
You can access the TensorFlow backend with:
import keras.backend.tensorflow_backend as K
Then you can call any TensorFlow utility or function like:
K.tf.ConfigProto
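For example, a common use (a minimal sketch; allow_growth is just one illustrative option) is to configure the TensorFlow session that Keras uses:
import keras.backend.tensorflow_backend as K
config = K.tf.ConfigProto()
config.gpu_options.allow_growth = True  # let TF allocate GPU memory on demand
K.set_session(K.tf.Session(config=config))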
Save your Keras model as an HDF5 file.
You can then do the conversion with the following code:
import os.path as osp
from keras import backend as K
from keras.models import load_model
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io
weight_file_path = 'path to your keras model'
net_model = load_model(weight_file_path)
sess = K.get_session()
# note: the third argument must be a list of output node names
constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph.as_graph_def(), ['name of the output node'])
graph_io.write_graph(constant_graph, 'output_folder_path', 'output.pb', as_text=False)
print('saved the constant graph (ready for inference) at: ', osp.join('output_folder_path', 'output.pb'))
Here is my sample code, which handles the multiple-input and multiple-output cases:
https://github.com/amir-abdi/keras_to_tensorflow
Make sure you set the learning phase of the Keras backend so that layers like dropout and batch normalization use their inference-time behavior. Here is a discussion about it.
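For example (a minimal sketch), set the learning phase to 0 (test mode) before loading the model:
from keras import backend as K
K.set_learning_phase(0)  # 0 = test/inference; must be set before load_model
net_model = load_model(weight_file_path)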
Given that the tf.layers API has been deprecated, how do I build models in TF 2 without using tf.keras (or what is the recommended way to build models)? Issue #30829 asks the same question, but was closed without any answers.
Update:
I'm okay with using tf.keras.layers instead of tf.layers, but once I've built all the layers and I need to return the model, is there a way to NOT use the Keras Model class with compile, fit, predict, and evaluate, and just do it TensorFlow's way?
If you were wondering why I would want to do something like that: I would like to use estimators for training rather than Keras' fit function. There is a keras_model_to_estimator, but it doesn't seem mature enough yet.
Google released a migration guide from TF 1 to TF 2; see the section "Converting Models".
Recommended way to build models
The guide (section "Models based on tf.layers") recommends converting tf.layers models to tf.keras.layers:
The conversion was one-to-one because there is a direct mapping from v1.layers to tf.keras.layers.
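For illustration (a hypothetical snippet, not taken from the guide), a dense layer maps roughly like this:
import tensorflow as tf
inputs = tf.keras.Input(shape=(10,))
# TF 1.x (deprecated): hidden = tf.layers.dense(inputs, 64, activation=tf.nn.relu)
hidden = tf.keras.layers.Dense(64, activation=tf.nn.relu)(inputs)  # TF 2.x equivalent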
Build models without Keras
One option is to provide your own layer implementations (example from the guide):
import tensorflow as tf

W = tf.Variable(tf.ones(shape=(2, 2)), name="W")
b = tf.Variable(tf.zeros(shape=(2)), name="b")

@tf.function
def forward(x):
    return W * x + b

out_a = forward([1, 0])
print(out_a)
But it is worth considering tf.keras.layers.Layer (example), which gives some degree of freedom but integrates with the rest of Keras (and its layers).
Even with layers written with tf.keras, you can still write your own training loop (example).
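For example, a minimal custom training loop with tf.GradientTape (a sketch with a tiny model and made-up data):
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

x = tf.random.normal((32, 4))  # made-up inputs
y = tf.random.normal((32, 1))  # made-up targets

for step in range(100):
    with tf.GradientTape() as tape:
        pred = model(x, training=True)
        loss = loss_fn(y, pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))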
To sum up, in practice TF 2.0 requires you to use tf.keras.
I'm opening this thread to discuss how to bring my NN model to deployment.
I built and trained a NN in Matlab with mdCNN (mdCNN is a simple Matlab library for building NNs with multi-dimensional input, e.g. conv3x3x3, which Matlab does not currently support). I trained my model in Matlab, and now I want to bring it to production.
After a few hours of research, I plan to do the following:
Train a NN model in Keras with the TF backend. I chose Keras because I want backward compatibility with Matlab in the future.
Grab a TensorFlow session from the Keras model (there is an example of how to do that here), then save the session to a *.pb file.
Load the NN model with OpenCV's dnn module; there is a specific function that does that:
cv::readNet()
Run the NN in C++ using OpenCV with
net.setInput(blob);
cv::Mat prob = net.forward();
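For what it's worth, you can prototype the same flow with OpenCV's Python bindings before moving to C++ (a sketch; the file names and input size are placeholders):
import cv2

net = cv2.dnn.readNet('frozen_model.pb')  # placeholder path to the frozen graph
image = cv2.imread('input.png')           # placeholder input image
blob = cv2.dnn.blobFromImage(image, 1.0 / 255, (224, 224))
net.setInput(blob)
prob = net.forward()
print(prob.shape)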
I want to check with you whether this flow would really work. Are there any suggestions for how to do the deployment better, or any improvements to the flow?
Maybe have a look at this question: Convert Keras model to C++
The general idea is to save the model architecture as JSON and the weights as HDF5, and then use the keras2cpp solution to convert it to C++.
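On the Python side, that split save looks roughly like this (a sketch; keras2cpp's own converter script then consumes these files):
# save the architecture as JSON and the weights as HDF5
with open('model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('weights.h5')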
Apologies if this has been answered somewhere, but I've been looking for about an hour and can't find a good answer.
I have a simple Logistic Regression model trained in Scikit-Learn that I'm exporting to a .pmml file.
from sklearn.linear_model import LogisticRegression
from sklearn2pmml import PMMLPipeline, sklearn2pmml
my_pipeline = PMMLPipeline([
    ("classifier", LogisticRegression())
])
my_pipeline.fit(X, y)  # placeholder for your training data
sklearn2pmml(my_pipeline, "filename.pmml")
etc....
So what I'm wondering is if/how I can import this file back into Python (2.7 preferably) or Scikit-Learn to use as I would in Java/Scala. Something along the lines of:
import (filename.pmml) as pm
pm.predict(data)
Thanks for any help!
Scikit-learn does not offer support for importing PMML files, so what you're trying to achieve cannot be done I'm afraid.
The concept of using libraries such as sklearn2pmml is really to extend sklearn with functionality it lacks, namely exporting models to the PMML format.
Typically, those who use sklearn2pmml are really looking to re-use the PMML models in other platforms (e.g. IBM's SPSS, Apache Spark ML, Weka or any other consumer as listed in the Data Mining Group's website).
If you're looking to save a model created with scikit-learn and re-use it afterwards with scikit-learn as well, then you should explore its native model persistence mechanism, Pickle, which uses a binary data format.
You can read more about how to save/load models in Pickle format (together with its known issues) here.
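For example, a minimal sketch with the standard pickle module (the dataset is just a stand-in):
import pickle
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression().fit(X, y)

with open('model.pkl', 'wb') as f:  # save
    pickle.dump(clf, f)

with open('model.pkl', 'rb') as f:  # load later, ideally with the same sklearn version
    clf_loaded = pickle.load(f)

print(clf_loaded.predict(X[:5]))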
I created a simple solution to generate sklearn k-means models from PMML files which I exported from the KNIME Analytics Platform. You can check it out: pmml2sklearn
You could use PyPMML to make predictions on a new dataset using PMML in Python, for example:
from pypmml import Model
model = Model.fromFile('the/pmml/file/path')
result = model.predict(data)
The data can be a dict, JSON, or a Pandas Series or DataFrame.
I believe you can import/export a PMML file with Python. After you load your model back, you can predict again without any problem. However, output formats can differ, like a 1-D array or n×n Pandas tables, etc.
from sklearn2pmml import make_pmml_pipeline, sklearn2pmml
from pypmml import Model

# Export as PMML
yourModelPipeline = make_pmml_pipeline(yourModelObjectGoesHere)
sklearn2pmml(yourModelPipeline, "yourModel.pmml")

# Load from PMML
yourModelLoaded = Model.fromFile('yourModel.pmml')
prediction = yourModelLoaded.predict(yourPredictionDataSet)
Lastly, reproducing the results may take a long time; don't let that discourage you :). I would like to share the developer's comment about the issue: https://github.com/autodeployai/pypmml/issues/53
I've found a few resources on how to import a model with the TensorFlow C++ API after exporting it to a .pb file, but my understanding is that the .pb file method has been replaced with a newer method that uses tf.train.Saver.save to produce .meta, .index, .data-00000-of-00001, and checkpoint files. I cannot find anything on how to import a model from these file types with the C++ API.
How can I do this?
I use the TFLearn wrapper on top of TensorFlow, but the process should be identical for plain TensorFlow models. You can save a checkpoint from TFLearn in this way correctly, but you can also freeze the graph if you want to use a .pb model. It is possible to load either a checkpoint or a frozen model file in C++, and in either case the inference part is identical.
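In case it helps, freezing a TF1 checkpoint (the .meta/.index/.data files) into a single .pb looks roughly like this ('model' and 'output_node' are placeholders for your checkpoint prefix and output op name):
import tensorflow as tf

saver = tf.train.import_meta_graph('model.meta')
with tf.Session() as sess:
    saver.restore(sess, 'model')  # prefix of the .index/.data files
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['output_node'])
    tf.train.write_graph(frozen, '.', 'frozen_model.pb', as_text=False)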
I've followed this guide to save a machine learning model for later use. The model was dumped in one machine:
from sklearn.externals import joblib
joblib.dump(clf, 'model.pkl')
And when I loaded it with joblib.load('model.pkl') on another machine, I got this warning:
UserWarning: Trying to unpickle estimator DecisionTreeClassifier from
version pre-0.18 when using version 0.18.1. This might lead to
breaking code or invalid results. Use at your own risk.
So is there any way to know the sklearn version of the saved model to compare it with the current version?
Versioning of pickled estimators was added in scikit-learn 0.18. Starting from v0.18, you can get the version of scikit-learn used to create the estimator with:
estimator.__getstate__()['_sklearn_version']
The warning you get is produced by the __setstate__ method of the estimator, which is automatically called upon unpickling. It doesn't look like there is a straightforward way of getting this version without loading the estimator from disk. You can filter out the warning with:
import warnings
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    estimator = joblib.load('model.pkl')
For pre-0.18 versions there is no such mechanism, but I imagine you could, for instance, use not hasattr(estimator, '__getstate__') as a test to detect, at least, pre-0.18 versions.
I had the same problem; just re-train the model and save 'model.pkl' again with joblib.dump using your current scikit-learn version. This will resolve the warning. Good luck!