TensorFlow 0.12 Model Files - C++

I train a model and save it using:
saver = tf.train.Saver()
saver.save(session, './my_model_name')
Besides the checkpoint file, which simply contains pointers to the most recent checkpoints of the model, this creates the following 3 files in the current path:
my_model_name.meta
my_model_name.index
my_model_name.data-00000-of-00001
I wonder what each of these files contains.
I'd like to load this model in C++ and run inference. The label_image example loads the model from a single .pb file using ReadBinaryProto(). I wonder how I can load it from these three files. What is the C++ equivalent of the following?
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')

What your saver creates is called "Checkpoint V2" and was introduced in TF 0.12. The .meta file holds the MetaGraphDef (the graph structure plus the saver definition), while the .index and .data-* files together hold the values of the variables.
I got it working quite nicely (though the docs on the C++ part are horrible, so it took me a day to solve it). Some people suggest converting all variables to constants or freezing the graph, but neither of these is actually needed.
Python part (saving)
with tf.Session() as sess:
    tf.train.Saver(tf.trainable_variables()).save(sess, 'models/my-model')
If you create the Saver with tf.trainable_variables(), you can save yourself some headache and storage space. Some more complicated models may need all data to be saved; in that case, remove this argument to Saver, but make sure you create the Saver after your graph is created. It is also very wise to give all variables/layers unique names, otherwise you can run into various problems.
C++ part (inference)
Note that checkpointPath isn't a path to any of the existing files, just their common prefix. If you mistakenly put the path to the .index file there, TF won't tell you that it was wrong, but it will die during inference due to uninitialized variables.
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/protobuf/meta_graph.pb.h>

using namespace std;
using namespace tensorflow;

...
// set up your input paths
const string pathToGraph = "models/my-model.meta";
const string checkpointPath = "models/my-model";
...

auto session = NewSession(SessionOptions());
if (session == nullptr) {
    throw runtime_error("Could not create Tensorflow session.");
}

Status status;

// Read in the protobuf graph we exported
MetaGraphDef graph_def;
status = ReadBinaryProto(Env::Default(), pathToGraph, &graph_def);
if (!status.ok()) {
    throw runtime_error("Error reading graph definition from " + pathToGraph + ": " + status.ToString());
}

// Add the graph to the session
status = session->Create(graph_def.graph_def());
if (!status.ok()) {
    throw runtime_error("Error creating graph: " + status.ToString());
}

// Read the weights from the saved checkpoint
Tensor checkpointPathTensor(DT_STRING, TensorShape());
checkpointPathTensor.scalar<std::string>()() = checkpointPath;
status = session->Run(
        {{ graph_def.saver_def().filename_tensor_name(), checkpointPathTensor },},
        {},
        {graph_def.saver_def().restore_op_name()},
        nullptr);
if (!status.ok()) {
    throw runtime_error("Error loading checkpoint from " + checkpointPath + ": " + status.ToString());
}

// and run the inference to your liking
auto feedDict = ...
auto outputOps = ...
std::vector<tensorflow::Tensor> outputTensors;
status = session->Run(feedDict, outputOps, {}, &outputTensors);
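To make the placeholders above concrete, here is a minimal sketch of what feedDict and outputOps could look like. The tensor names "input:0" and "output:0" are assumptions; use the real node names from your own graph (e.g. by inspecting graph_def or looking at it in TensorBoard).

// Hypothetical feed/fetch setup -- "input:0" and "output:0" are placeholder
// names, and the input shape is just an example.
Tensor inputTensor(DT_FLOAT, TensorShape({1, 100}));
// ... fill inputTensor.flat<float>() with your input data ...

std::vector<std::pair<string, Tensor>> feedDict = {
    {"input:0", inputTensor},
};
std::vector<string> outputOps = {"output:0"};
// these plug directly into the session->Run(feedDict, outputOps, {}, &outputTensors) call above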
For completeness, here's the Python equivalent:
Inference in Python
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('models/my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('models/'))
    outputTensors = sess.run(outputOps, feed_dict=feedDict)

I'm currently struggling with this myself; I've found it's not very straightforward to do at the moment. The two most commonly cited tutorials on the subject are:
https://medium.com/jim-fleming/loading-a-tensorflow-graph-with-the-c-api-4caaff88463f#.goxwm1e5j
and
https://medium.com/@hamedmp/exporting-trained-tensorflow-models-to-c-the-right-way-cf24b609d183#.g1gak956i
The equivalent of
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')
Is just
Status load_graph_status = LoadGraph(graph_path, &session);
This assumes you've "frozen the graph" (used a script that combines the graph file with the checkpoint values).
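For context, a LoadGraph helper along the lines of the one in the label_image example might look roughly like the sketch below; it assumes the graph was frozen into a single .pb file, and the error text is illustrative.

// Sketch of a LoadGraph-style helper, modeled on the label_image example.
#include "tensorflow/core/framework/graph.pb.h"
#include "tensorflow/core/lib/core/errors.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

tensorflow::Status LoadGraph(const std::string& graph_file_name,
                             std::unique_ptr<tensorflow::Session>* session) {
    tensorflow::GraphDef graph_def;
    tensorflow::Status load_graph_status =
        ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
    if (!load_graph_status.ok()) {
        return tensorflow::errors::NotFound("Failed to load compute graph at '",
                                            graph_file_name, "'");
    }
    // Create a session and install the (frozen) graph in it.
    session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
    tensorflow::Status session_create_status = (*session)->Create(graph_def);
    if (!session_create_status.ok()) {
        return session_create_status;
    }
    return tensorflow::Status::OK();
}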
Also, see the discussion here: Tensorflow Different ways to Export and Run graph in C++

Related

More than one input is Const Op

I am trying to serve the following git repo in OpenCV: https://github.com/una-dinosauria/3d-pose-baseline, and the checkpoint data can be found at the following link: https://drive.google.com/file/d/0BxWzojlLp259MF9qSFpiVjl0cU0/view
I have already constructed a frozen graph which I can serve in Python and which was generated using the following script:
meta_path = 'checkpoint-4874200.meta'  # Your .meta file
output_node_names = ['linear_model/add_1']  # Output nodes
export_dir = os.path.join('export_dir')
graph = tf.Graph()

with tf.Session(graph=graph) as sess:
    # Restore the graph
    loader = tf.train.import_meta_graph(meta_path)
    loader.restore(sess, 'checkpoint-4874200')
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(sess,
                                         [tf.saved_model.SERVING],
                                         strip_default_attrs=True)
    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)
    # Save the frozen graph
    with open('C:\\Users\\FrozenGraph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
Then I optimized the graph by running:
optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def,
    ['inputs/enc_in'],
    ['linear_model/add_1'],
    tf.float32.as_datatype_enum)
g = tf.gfile.FastGFile('optimized_inference_graph.pb', 'wb')
g.write(optimized_graph_def.SerializeToString())
and the optimized frozen graph can be found at: https://github.com/alecda573/frozen_graph/blob/master/optimized_inference_graph.pb
When I try to run in opencv the following I get this runtime error:
OpenCV(4.3.0) Error: Unspecified error (More than one input is Const op) in cv::dnn::dnn4_v20200310::`anonymous-namespace'::TFImporter::getConstBlob, file C:\build\master_winpack-build-win64-vc15\opencv\modules\dnn\src\tensorflow\tf_importer.cpp, line 570
Steps to reproduce
To reproduce the problem, you just need to download the frozen graph from the above link (or create it yourself from the checkpoint data) and then call the following in OpenCV with the headers below:
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include "opencv2/dnn.hpp"
string pbFilePath = "C:/Users/optimized_inferene_graph.pb";
//Create 3d-pose-baseline model
cv::dnn::Net inputNet;
inputNet = cv::dnn::readNetFromTensorflow(pbFilePath);
Would love to know if anyone has any thoughts on how to address this error.
You can see the frozen graph and the optimized graph I generated with TensorBoard in the attached photos.
I have a feeling the error arises from the training flag inputs, but I am not certain, and I do not want to start editing the graph if that is not the problem.
I am attaching the function in opencv that is causing the issue:
const tensorflow::TensorProto& TFImporter::getConstBlob(const tensorflow::NodeDef &layer, std::map<String, int> const_layers,
                                                        int input_blob_index, int* actual_inp_blob_idx) {
    if (input_blob_index == -1) {
        for (int i = 0; i < layer.input_size(); i++) {
            Pin input = parsePin(layer.input(i));
            if (const_layers.find(input.name) != const_layers.end()) {
                if (input_blob_index != -1)
                    CV_Error(Error::StsError, "More than one input is Const op");

                input_blob_index = i;
            }
        }
    }

    if (input_blob_index == -1)
        CV_Error(Error::StsError, "Const input blob for weights not found");

    Pin kernel_inp = parsePin(layer.input(input_blob_index));
    if (const_layers.find(kernel_inp.name) == const_layers.end())
        CV_Error(Error::StsError, "Input [" + layer.input(input_blob_index) +
                                  "] for node [" + layer.name() + "] not found");
    if (kernel_inp.blobIndex != 0)
        CV_Error(Error::StsError, "Unsupported kernel input");

    if (actual_inp_blob_idx) {
        *actual_inp_blob_idx = input_blob_index;
    }

    int nodeIdx = const_layers.at(kernel_inp.name);
    if (nodeIdx < netBin.node_size() && netBin.node(nodeIdx).name() == kernel_inp.name)
    {
        return netBin.node(nodeIdx).attr().at("value").tensor();
    }
    else
    {
        CV_Assert_N(nodeIdx < netTxt.node_size(),
                    netTxt.node(nodeIdx).name() == kernel_inp.name);
        return netTxt.node(nodeIdx).attr().at("value").tensor();
    }
}
As you pointed out, the error originates in getConstBlob (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L570). getConstBlob is called several times in populateNet (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L706), which is called by all overloaded definitions of readNetFromTensorflow (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L2278). Those may be starting points for where to place breakpoints if you want to step through with a debugger.
The other thing I noticed is that the definition I believe you're using (supplying a std::string: https://docs.opencv.org/master/d6/d0f/group__dnn.html#gad820b280978d06773234ba6841e77e8d) takes two arguments: the model path (model) and a configuration (config), which is optional and defaults to an empty string. In the unit tests, it looks like there are cases both with and without a configuration provided (https://github.com/opencv/opencv/blob/master/modules/dnn/test/test_tf_importer.cpp). I'm not sure whether that has an impact.
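If you want to try the two-argument form, it might look like the sketch below; the .pbtxt path is a placeholder (such a text graph is assumed to exist alongside the .pb), mirroring the style of your snippet.

// Hypothetical two-argument call: pass an optional text graph (.pbtxt) as config.
// Both paths are placeholders -- the .pbtxt is assumed to exist.
std::string modelPath  = "C:/Users/optimized_inference_graph.pb";
std::string configPath = "C:/Users/optimized_inference_graph.pbtxt";
cv::dnn::Net net = cv::dnn::readNetFromTensorflow(modelPath, configPath);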
Lastly, in the script you provided to replicate the results, I believe the model file name is misspelled - it says optimized_inferene_graph.pb, but the file you point to in the github repo is spelled optimized_inference_graph.pb.
Just a few suggestions, I hope this may help!

TensorFlow and TFLearn C++ API

To start with, I am new to both TensorFlow and Python.
I have Python code that contains a TFLearn DNN network. I need to convert that code to C++ so that I can later turn it into a library for mobile application development.
I read about the C++ API for TensorFlow (whose documentation is vague and unclear), so I took the code line by line to try converting it.
The first step was loading the saved model that was previously trained and saved in Python (I don't need training to be done in C++, so just loading the TFLearn model is enough).
The python code to save the file was as follows:
network = input_data(shape=[None, 100, 100, 1], name='input')
network = conv_2d(network, 32, 5, activation='relu')
network = avg_pool_2d(network, 2)
network = conv_2d(network, 64, 5, activation='relu')
network = avg_pool_2d(network, 2)
network = fully_connected(network, 128, activation='relu')
network = fully_connected(network, 64, activation='relu')
network = fully_connected(network, 2, activation='softmax',restore=False)
network = regression(network, optimizer='adam', learning_rate=0.0001,
                     loss='categorical_crossentropy', name='target')
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, y.toarray(), n_epoch=3, validation_set=0.1, shuffle=True,
          show_metric=True, batch_size=32, snapshot_step=100,
          snapshot_epoch=False, run_id='model_finetuning')
model.save('model/my_model.tflearn')
To load the model python code was:
network = input_data(shape=[None, 100, 100, 1], name='input')
network = conv_2d(network, 32, 5, activation='relu')
network = avg_pool_2d(network, 2)
network = conv_2d(network, 64, 5, activation='relu')
network = avg_pool_2d(network, 2)
network = fully_connected(network, 128, activation='relu')
network = fully_connected(network, 64, activation='relu')
network = fully_connected(network, 2, activation='softmax')
network = regression(network, optimizer='adam', learning_rate=0.001,
                     loss='categorical_crossentropy', name='target')
model = tflearn.DNN(network, tensorboard_verbose=0)
model.load('model/my_model.tflearn')
This code worked like a charm in Python, but the saved model actually consists of four files inside the model folder, as follows:
model
|------------checkpoint
|------------my_model.tflearn.data-00000-of-00001
|------------my_model.tflearn.index
|------------my_model.tflearn.meta
Now I come to the C++ part of it. After a lot of research I came up with the following code:
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/platform/env.h"
#include <iostream>
using namespace tensorflow;
using namespace std;
int main()
{
Session* session;
Status status = NewSession(SessionOptions(), &session);
if (!status.ok())
{
cerr << status.ToString() << "\n";
return 1;
}
else
{
cout << "Session created successfully" << endl;
}
tensorflow::Tensor input_tensor(tensorflow::DT_FLOAT, tensorflow::TensorShape({1,100,100,1}));
GraphDef graph_def;
status = ReadBinaryProto(Env::Default(), "/home/user/PycharmProjects/untitled/model/my_model.tflearn", &graph_def);
if (!status.ok())
{
cerr << status.ToString() << "\n";
return 1;
}
else
{
cout << "Read Model File" << endl;
}
return 0;
}
And now for my questions: the code compiles correctly (with no faults) using the Bazel build (as described in the "short" explanation of the TensorFlow C++ API), but when I try to run it the model file is not found.
Is what I did in C++ correct? Is this the correct way to load the saved model (and why are four files generated during saving)? Or is there another approach to do it?
Is there any "full and decent" manual for the TensorFlow C++ API?
If you just want to load an already trained model, a C++ loader already exists. Directly in TensorFlow, look here and here.
Patwie also has a really good example of loading a saved model; the code below is from Patwie:
// helper type used below (from Patwie's example)
typedef std::vector<std::pair<std::string, tensorflow::Tensor>> tensor_dict;

tensorflow::Status LoadModel(tensorflow::Session *sess, std::string graph_fn, std::string checkpoint_fn = "") {
    tensorflow::Status status;

    // Read in the protobuf graph we exported
    tensorflow::MetaGraphDef graph_def;
    status = ReadBinaryProto(tensorflow::Env::Default(), graph_fn, &graph_def);
    if (status != tensorflow::Status::OK())
        return status;

    // create the graph in the current session
    status = sess->Create(graph_def.graph_def());
    if (status != tensorflow::Status::OK())
        return status;

    // restore model from checkpoint, iff checkpoint is given
    if (checkpoint_fn != "") {
        const std::string restore_op_name = graph_def.saver_def().restore_op_name();
        const std::string filename_tensor_name = graph_def.saver_def().filename_tensor_name();

        tensorflow::Tensor filename_tensor(tensorflow::DT_STRING, tensorflow::TensorShape());
        filename_tensor.scalar<std::string>()() = checkpoint_fn;

        tensor_dict feed_dict = {{filename_tensor_name, filename_tensor}};
        status = sess->Run(feed_dict,
                           {},
                           {restore_op_name},
                           nullptr);
        if (status != tensorflow::Status::OK())
            return status;
    } else {
        // virtual Status Run(const std::vector<std::pair<string, Tensor> >& inputs,
        //                    const std::vector<string>& output_tensor_names,
        //                    const std::vector<string>& target_node_names,
        //                    std::vector<Tensor>* outputs) = 0;
        status = sess->Run({}, {}, {"init"}, nullptr);
        if (status != tensorflow::Status::OK())
            return status;
    }

    return tensorflow::Status::OK();
}
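For completeness, calling this helper might look roughly like the following sketch; the paths are placeholders for your own .meta file and checkpoint prefix, not names from the original answer.

// Hypothetical usage of the LoadModel helper above; paths are placeholders.
tensorflow::Session* sess = nullptr;
TF_CHECK_OK(tensorflow::NewSession(tensorflow::SessionOptions(), &sess));
TF_CHECK_OK(LoadModel(sess, "model/my_model.tflearn.meta", "model/my_model.tflearn"));
// ...then call sess->Run(...) with your graph's input/output tensor names.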
Unfortunately there isn't a "full and decent" manual for the TensorFlow C++ API yet (AFAIK).
I wrote up the steps for how to save a TFLearn checkpoint correctly:
...
model = tflearn.DNN(network)

class MonitorCallback(tflearn.callbacks.Callback):
    def on_epoch_end(self, training_state):
        # Create another session to clone the model and avoid affecting the training process
        with tf.Session() as second_sess:
            # Clone the current model
            model2 = model
            # Delete the training ops
            del tf.get_collection_ref(tf.GraphKeys.TRAIN_OPS)[:]
            # Save the checkpoint
            model2.save('checkpoint_' + str(training_state.step) + ".ckpt")
            # Write a text protobuf to have a human-readable form of the model
            tf.train.write_graph(second_sess.graph_def, '.', 'checkpoint_' + str(training_state.step) + ".pbtxt", as_text=True)
        return

mycb = MonitorCallback()
model.fit({'input': X}, {'target': Y}, n_epoch=500, run_id="mymodel", callbacks=mycb)
...
After you have the checkpoint, you can load it in C++:
https://github.com/kecsap/tensorflow_cpp_packaging#load-a-checkpoint-in-c
...and use it for inference:
https://github.com/kecsap/tensorflow_cpp_packaging#inference-in-c
You can also find example code for C, as well as for how to freeze a model and then load it in C++.

Select a specific GPU for the TensorFlow C++ API session

How can I tell TensorFlow to use a specific GPU for inference?
Part of the source code:
std::unique_ptr<tensorflow::Session> session;
Status const load_graph_status = LoadGraph(graph_path, &session);
if (!load_graph_status.ok()) {
    LOG(ERROR) << "LoadGraph ERROR!!!!" << load_graph_status;
    return -1;
}

std::vector<Tensor> resized_tensors;
Status const read_tensor_status = ReadTensorFromImageFile(image_path, &resized_tensors);
if (!read_tensor_status.ok()) {
    LOG(ERROR) << read_tensor_status;
    return -1;
}

std::vector<Tensor> outputs;
Status run_status = session->Run({{input_layer, resized_tensor}},
                                 output_layer, {}, &outputs);
So far so good, but TensorFlow always selects the same GPU when I execute Run. Is there a way to specify which GPU to use?
In case you need the complete source code, I placed it on pastebin.
Edit: Looks like options.config.mutable_gpu_options()->set_visible_device_list("0") works, but I am not sure.
It turns out the C++ API has a series of (nested) structures: tensorflow::SessionOptions, tensorflow::ConfigProto, and tensorflow::GPUOptions. The latter contains a method called set_visible_device_list(::std::string&& value) with which you can select the GPU you would like:
auto options = tensorflow::SessionOptions();
options.config.mutable_gpu_options()->set_visible_device_list("0");
// session_ is a unique_ptr to a tensorflow::Session
session_.reset(tensorflow::NewSession(options));
Similar to this (for memory usage restriction):
how to limit GPU usage in tensorflow (r1.1) with C++ API
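Along the same lines, the device selection can be combined with the memory-related GPUOptions setters mentioned in that link. A sketch, where the device id and memory fraction are just arbitrary example values:

// Sketch: pick GPU "0" and also cap/relax memory allocation.
tensorflow::SessionOptions options;
tensorflow::GPUOptions* gpu_options = options.config.mutable_gpu_options();
gpu_options->set_visible_device_list("0");               // only expose GPU 0
gpu_options->set_per_process_gpu_memory_fraction(0.5);   // cap at ~50% of its memory
gpu_options->set_allow_growth(true);                     // allocate on demand

std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));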

Not found: FeedInputs: unable to find feed output TensorFlow

I was trying this example of using a TensorFlow saved model in C++ from this website:
https://medium.com/jim-fleming/loading-a-tensorflow-graph-with-the-c-api-4caaff88463f#.ji310n4zo
It works well, but it does not save the values of the variables a and b, as it only saves the graph, not the variables. I tried to replace the following line:
tf.train.write_graph(sess.graph_def, 'models/', 'graph.pb', as_text=False)
with
saver.save(sess, 'models/graph', global_step=0)
(of course after creating the saver object). It does not work, and it outputs:
Not found: FeedInputs: unable to find feed output a
I checked the nodes that are loaded, and they are only:
_SOURCE
_SINK
whereas when I use the write_graph function and then load the model in C++, I get the following nodes loaded:
_SOURCE
_SINK
save/restore_slice_1/shape_and_slice
save/restore_slice_1/tensor_name
save/restore_slice/shape_and_slice
save/restore_slice/tensor_name
save/save/shapes_and_slices
save/save/tensor_names
save/Const
save/restore_slice_1
save/restore_slice
b
save/Assign_1
b/read
b/initial_value
b/Assign
a
save/Assign
save/restore_all
save/save
save/control_dependency
a/read
c
a/initial_value
a/Assign
init
Tensor
Also, the graph file created by saver.save() is much smaller (165 B) compared to the one created by write_graph (1.9 KB).
I'm not sure if that is the best way of solving the problem but at least it solves it.
As write_graph can also store the values of the constants, I added the following code to the python just before writing the graph with write_graph function:
for v in tf.trainable_variables():
    vc = tf.constant(v.eval())
    tf.assign(v, vc, name="assign_variables")
This creates constants that store the variables' values after training, and then creates "assign_variables" tensors to assign them back to the variables. Now, when you call write_graph, it will store the variables' values in the file.
The only remaining part is to call these "assign_variables" tensors in the C++ code to make sure your variables are assigned the constant values stored in the file. Here is one way to do it:
Status status = NewSession(SessionOptions(), &session);
std::vector<tensorflow::Tensor> outputs;

for (int i = 0; status.ok(); i++) {
    char name[100];
    if (i == 0)
        sprintf(name, "assign_variables");
    else
        sprintf(name, "assign_variables_%d", i);

    status = session->Run({}, {name}, {}, &outputs);
}
There is another way of restoring the variables: calling the save/restore_all operation that should be present in the graph:
std::vector<tensorflow::Tensor> outputs;
Tensor checkpoint_filepath(DT_STRING, TensorShape());
checkpoint_filepath.scalar<std::string>()() = "path to the checkpoint file";
status = session->Run({{ "save/Const", checkpoint_filepath },},
                      {}, {"save/restore_all"}, &outputs);

GATE Embedded runtime

I want to use "GATE" through web. Then I decide to create a SOAP web service in java with help of GATE Embedded.
But for the same document and saved Pipeline, I have a different run-time duration, when GATE Embedded runs as a java web service.
The same code has a constant run-time when it runs as a Java Application project.
In the web service, the run-time will be increasing after each execution until I get a Timeout error.
Does any one have this kind of experience?
This is my Code:
@WebService(serviceName = "GateWS")
public class GateWS {

    @WebMethod(operationName = "gateengineapi")
    public String gateengineapi(@WebParam(name = "PipelineNumber") String PipelineNumber, @WebParam(name = "Documents") String Docs) throws Exception {
        try {
            System.setProperty("gate.home", "C:\\GATE\\");
            System.setProperty("shell.path", "C:\\cygwin2\\bin\\sh.exe");
            Gate.init();

            File GateHome = Gate.getGateHome();
            File FrenchGapp = new File(GateHome, PipelineNumber);

            CorpusController FrenchController;
            FrenchController = (CorpusController) PersistenceManager.loadObjectFromFile(FrenchGapp);

            Corpus corpus = Factory.newCorpus("BatchProcessApp Corpus");
            FrenchController.setCorpus(corpus);

            File docFile = new File(GateHome, Docs);
            Document doc = Factory.newDocument(docFile.toURL(), "utf-8");
            corpus.add(doc);

            FrenchController.execute();

            String docXMLString = null;
            docXMLString = doc.toXml();

            String outputFileName = doc.getName() + ".out.xml";
            File outputFile = new File(docFile.getParentFile(), outputFileName);
            FileOutputStream fos = new FileOutputStream(outputFile);
            BufferedOutputStream bos = new BufferedOutputStream(fos);
            OutputStreamWriter out;
            out = new OutputStreamWriter(bos, "utf-8");
            out.write(docXMLString);
            out.close();

            gate.Factory.deleteResource(doc);

            return outputFileName;
        } catch (Exception ex) {
            return "ERROR: -> " + ex.getMessage();
        }
    }
}
I really appreciate any help you can provide.
The problem is that you're loading a new instance of the pipeline for every request, but then not freeing it again at the end of the request. GATE maintains a list internally of every PR/LR/controller that is loaded, so anything you load with Factory.createResource or PersistenceManager.loadObjectFrom... must be freed using Factory.deleteResource once it is no longer needed, typically using a try-finally:
FrenchController = (CorpusController) PersistenceManager.loadObjectFromFile(FrenchGapp);
try {
    // ...
} finally {
    Factory.deleteResource(FrenchController);
}
But...
Rather than loading a new instance of the pipeline every time, I would strongly recommend you explore a more efficient approach to load a smaller number of instances of the pipeline but keep them in memory to serve multiple requests. There is a fully worked-through example of this technique in the training materials on the GATE wiki, in particular module number 8 (track 2 Thursday).