I have a model with 3 inputs and 1 output. I generated the TensorRT engine from my ONNX model as shown below:
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>

int main() {
    int maxBatchSize = 32;
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
    const auto explicitBatch = 1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    nvinfer1::INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
    nvonnxparser::IParser* parser = nvonnxparser::createParser(*network, gLogger);
    parser->parseFromFile("model3.onnx", 1);
    for (int i = 0; i < parser->getNbErrors(); ++i)
    {
        std::cout << parser->getError(i)->desc() << std::endl;
    }
    builder->setMaxBatchSize(maxBatchSize);
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 20);
    nvinfer1::ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
    parser->destroy();
    network->destroy();
    config->destroy();
    builder->destroy();
    nvinfer1::IHostMemory* serializedModel = engine->serialize();
    std::ofstream engine_file("model.engine", std::ios::binary);
    engine_file.write((const char*)serializedModel->data(), serializedModel->size());
    serializedModel->destroy();
    return 0;
}
How can I perform inference when I have multiple inputs? The NVIDIA guide only covers the single-input, single-output scenario.
You can make use of the files that ship with the TensorRT installation.
Check out the C:\TensorRT\samples\common directory and take a look at the buffers.h header file, as well as the MNIST sample that uses it.
buffers.h takes care of multiple inputs and outputs: it provides a BufferManager that allocates host and device buffers for every binding of the engine and copies the data in and out for you.
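If you would rather manage the buffers yourself, here is a minimal hand-rolled sketch. It assumes the engine has already been deserialized, and the binding names ("input0", "input1", "input2", "output") and the byte sizes are placeholders for whatever your ONNX model actually declares:
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

// Sketch only: binding names and sizes are placeholders, not taken from the original model.
void runInference(nvinfer1::ICudaEngine* engine,
                  const float* hIn0, size_t in0Bytes,
                  const float* hIn1, size_t in1Bytes,
                  const float* hIn2, size_t in2Bytes,
                  float* hOut, size_t outBytes)
{
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // One device pointer per binding, in binding-index order (3 inputs + 1 output here).
    std::vector<void*> bindings(engine->getNbBindings(), nullptr);

    const int i0 = engine->getBindingIndex("input0");   // hypothetical names
    const int i1 = engine->getBindingIndex("input1");
    const int i2 = engine->getBindingIndex("input2");
    const int o0 = engine->getBindingIndex("output");

    cudaMalloc(&bindings[i0], in0Bytes);
    cudaMalloc(&bindings[i1], in1Bytes);
    cudaMalloc(&bindings[i2], in2Bytes);
    cudaMalloc(&bindings[o0], outBytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy every input to the device, run the network, copy the single output back.
    cudaMemcpyAsync(bindings[i0], hIn0, in0Bytes, cudaMemcpyHostToDevice, stream);
    cudaMemcpyAsync(bindings[i1], hIn1, in1Bytes, cudaMemcpyHostToDevice, stream);
    cudaMemcpyAsync(bindings[i2], hIn2, in2Bytes, cudaMemcpyHostToDevice, stream);
    context->enqueueV2(bindings.data(), stream, nullptr);
    cudaMemcpyAsync(hOut, bindings[o0], outBytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    for (void* p : bindings) cudaFree(p);
    cudaStreamDestroy(stream);
    context->destroy();
}
This is essentially what BufferManager automates; with BufferManager you would instead call its copyInputToDevice/copyOutputToHost helpers around the enqueue call.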
I tried to use the OpenVINO Inference Engine to accelerate my DL inference. It works with one image, but I want to create a batch of two images and then run inference on it.
This is my code:
InferenceEngine::Core core;
InferenceEngine::CNNNetwork network = core.ReadNetwork("path/to/model.xml");
InferenceEngine::InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
std::string input_name = network.getInputsInfo().begin()->first;
InferenceEngine::DataPtr output_info = network.getOutputsInfo().begin()->second;
std::string output_name = network.getOutputsInfo().begin()->first;
InferenceEngine::ExecutableNetwork executableNetwork = core.LoadNetwork(network, "CPU");
InferenceEngine::InferRequest inferRequest = executableNetwork.CreateInferRequest();
std::string input_image_01 = "path/to/image_01.png";
cv::Mat image_01 = cv::imread(input_image_01 );
InferenceEngine::Blob::Ptr imgBlob_01 = wrapMat2Blob(image_01);
std::string input_image_02 = "path/to/image_02.png";
cv::Mat image_02 = cv::imread(input_image_02 );
InferenceEngine::Blob::Ptr imgBlob_02 = wrapMat2Blob(image_02);
InferenceEngine::BlobMap imgBlobMap;
std::pair<std::string, InferenceEngine::Blob::Ptr> pair01(input_image_01, imgBlob_01);
imgBlobMap.insert(pair01);
std::pair<std::string, InferenceEngine::Blob::Ptr> pair02(input_image_02, imgBlob_02);
imgBlobMap.insert(pair02);
inferRequest.SetInput(imgBlobMap);
inferRequest.StartAsync();
inferRequest.Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
InferenceEngine::Blob::Ptr output = inferRequest.GetBlob(output_name);
std::vector<unsigned> class_results;
ClassificationResult cls(output, {"x", "y"}, 2, 3);
class_results = cls.getResults();
Unfortunately, I received the following error message from the command
inferRequest.SetInput(imgBlobMap);
[NOT_FOUND] Failed to find input or output with name: 'path/to/image_02.png'
C:\j\workspace\private-ci\ie\build-windows-vs2019#2\b\repos\openvino\inference-engine\src\plugin_api\cpp_interfaces/impl/ie_infer_request_internal.hpp:303
C:\Program Files (x86)\Intel\openvino_2021.3.394\inference_engine\include\details/ie_exception_conversion.hpp:66
How can I create a batch of more than one image, run inference, and get the classification class and confidence for each image? Are the confidence and class contained in the blob returned by GetBlob()? Do I need the call ClassificationResult cls(output, {"x", "y"}, 2, 3);?
I'd recommend reviewing the Using Shape Inference article from the OpenVINO online documentation to be aware of the limitations of using batches. It also refers to the Open Model Zoo smart_classroom_demo, where dynamic batching is used while processing multiple previously detected faces. Basically, when batching is enabled in the model, the memory buffer of your input blob is allocated with room for the whole batch of images, and it is your responsibility to fill the input blob with the data for each image in the batch. You may take a look at the function CnnDLSDKBase::InferBatch of smart_classroom_demo, which is located in smart_classroom_demo/cpp/src/cnn.cpp, line 51. As you can see, in the loop over num_imgs an auxiliary function matU8ToBlob fills the input blob with data for current_batch_size images, then the batch size is set for the infer request and inference is run:
for (size_t batch_i = 0; batch_i < num_imgs; batch_i += batch_size) {
    const size_t current_batch_size = std::min(batch_size, num_imgs - batch_i);
    for (size_t b = 0; b < current_batch_size; b++) {
        matU8ToBlob<uint8_t>(frames[batch_i + b], input, b);
    }
    if (config_.max_batch_size != 1)
        infer_request_.SetBatch(current_batch_size);
    infer_request_.Infer();
    // ... (the demo then reads the output blob for this batch)
}
There is also a similar sample within OpenVINO that uses batched inputs to the model. You can refer to the link below:
https://github.com/openvinotoolkit/openvino/blob/ae2913d3b5970ce0d3112cc880d03be1708f13eb/inference-engine/samples/hello_nv12_input_classification/main.cpp#L236
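To address the original two-image case directly, here is a minimal sketch under a few assumptions (not from the original post): the model allows its batch dimension to be changed with setBatchSize, its input precision is switched to U8, and the matU8ToBlob helper from the Open Model Zoo common utilities (ocv_common.hpp) is available.
#include <inference_engine.hpp>
#include <opencv2/opencv.hpp>
#include <samples/ocv_common.hpp>  // assumption: provides matU8ToBlob; the header path may differ between releases

int main() {
    InferenceEngine::Core core;
    InferenceEngine::CNNNetwork network = core.ReadNetwork("path/to/model.xml");
    network.setBatchSize(2);  // room for two images in the single input blob
    network.getInputsInfo().begin()->second->setPrecision(InferenceEngine::Precision::U8);

    std::string input_name  = network.getInputsInfo().begin()->first;
    std::string output_name = network.getOutputsInfo().begin()->first;

    InferenceEngine::ExecutableNetwork executableNetwork = core.LoadNetwork(network, "CPU");
    InferenceEngine::InferRequest inferRequest = executableNetwork.CreateInferRequest();

    // Fill batch slots 0 and 1 of the *same* input blob, looked up by the input name
    // (not by the image file name, which caused the NOT_FOUND error).
    InferenceEngine::Blob::Ptr input = inferRequest.GetBlob(input_name);
    cv::Mat image_01 = cv::imread("path/to/image_01.png");
    cv::Mat image_02 = cv::imread("path/to/image_02.png");
    matU8ToBlob<uint8_t>(image_01, input, 0);
    matU8ToBlob<uint8_t>(image_02, input, 1);

    inferRequest.Infer();

    // The output blob now holds one score vector per image in the batch.
    InferenceEngine::Blob::Ptr output = inferRequest.GetBlob(output_name);
    return 0;
}
As far as I know, ClassificationResult is just a convenience helper from the samples that sorts and prints the top-N classes per batch entry; the raw scores themselves are already in the blob returned by GetBlob(output_name), so using it is optional.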
My goal is to run a Keras model I have made on my ESP32 microcontroller. I have the libraries all working correctly.
I created a Keras model using Google Colab, and it looks to be working fine when I give it random test data within Colab. The model has two input features and 4 different outputs (a multiple-output regression model).
However, when I export the model and load it into my C++ application on the ESP32, it does not matter what the inputs are: it always predicts the same output.
I based my code for loading and running the model in C++ on this example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/magic_wand/main_functions.cc
And this is my version of the code:
namespace {
tflite::ErrorReporter* error_reporter = nullptr;
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
int inference_count = 0;
// Create an area of memory to use for input, output, and intermediate arrays.
// Finding the minimum value for your model may require some trial and error.
constexpr int kTensorArenaSize = 2 * 2048;
uint8_t tensor_arena[kTensorArenaSize];
} // namespace
static void setup() {
    static tflite::MicroErrorReporter micro_error_reporter;
    error_reporter = &micro_error_reporter;
    model = tflite::GetModel(venti_model);
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        error_reporter->Report(
            "Model provided is schema version %d not equal "
            "to supported version %d.",
            model->version(), TFLITE_SCHEMA_VERSION);
        return;
    }
    // This pulls in all the operation implementations we need.
    // NOLINTNEXTLINE(runtime-global-variables)
    static tflite::ops::micro::AllOpsResolver resolver;
    // Build an interpreter to run the model with.
    static tflite::MicroInterpreter static_interpreter(
        model, resolver, tensor_arena, kTensorArenaSize, error_reporter);
    interpreter = &static_interpreter;
    // Allocate memory from the tensor_arena for the model's tensors.
    TfLiteStatus allocate_status = interpreter->AllocateTensors();
    if (allocate_status != kTfLiteOk) {
        error_reporter->Report("AllocateTensors() failed");
        return;
    }
    // Obtain pointers to the model's input and output tensors.
    input = interpreter->input(0);
    ESP_LOGI("TENSOR SETUP", "input size = %d", input->dims->size);
    ESP_LOGI("TENSOR SETUP", "input size in bytes = %d", input->bytes);
    ESP_LOGI("TENSOR SETUP", "Is input float32? = %s", (input->type == kTfLiteFloat32) ? "true" : "false");
    ESP_LOGI("TENSOR SETUP", "Input data dimensions = %d", input->dims->data[1]);
    output = interpreter->output(0);
    ESP_LOGI("TENSOR SETUP", "output size = %d", output->dims->size);
    ESP_LOGI("TENSOR SETUP", "output size in bytes = %d", output->bytes);
    ESP_LOGI("TENSOR SETUP", "Is output float32? = %s", (output->type == kTfLiteFloat32) ? "true" : "false");
    ESP_LOGI("TENSOR SETUP", "Output data dimensions = %d", output->dims->data[1]);
}
static bool setupDone = true;

static void the_ai_algorithm_task() {
    /* The first time the task runs, set up the AI model */
    if (setupDone == false) {
        setup();
        setupDone = true;
    }
    /* Load the input data, i.e. deltaT1 and deltaT2 */
    //int i = 0;
    input->data.f[0] = 2.0; /* Different values don't change the output */
    input->data.f[1] = 3.2;
    // Run inference, and report any error
    TfLiteStatus invoke_status = interpreter->Invoke();
    if (invoke_status != kTfLiteOk) {
        error_reporter->Report("Invoke failed");
        // return;
    }
    /* Retrieve outputs: Fan, AC, Vent 1, Vent 2 */
    double fan = output->data.f[0];
    double ac = output->data.f[1];
    double vent1 = output->data.f[2];
    double vent2 = output->data.f[3];
    ESP_LOGI("TENSOR SETUP", "fan = %lf", fan);
    ESP_LOGI("TENSOR SETUP", "ac = %lf", ac);
    ESP_LOGI("TENSOR SETUP", "vent1 = %lf", vent1);
    ESP_LOGI("TENSOR SETUP", "vent2 = %lf", vent2);
}
The model seems to load OK, as the dimensions and sizes are correct. But the output is always the same 4 values:
fan = 0.0087
ac = 0.54
vent1 = 0.73
vent2 = 0.32
Any idea what could be going wrong? Is it something about my model, or am I just not using the model correctly in my C++ application?
Could you refer to the "Test the model" section here - https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/micro/examples/hello_world/train/train_hello_world_model.ipynb#scrollTo=f86dWOyZKmN9 and verify if the TFLite model is producing the correct results?
You can localize the issue by testing 1) the TF model (which you have done already), 2) the TFLite model, and then 3) the TFLite Micro model (the C source file).
You also need to verify that the inputs passed to the model are of the same type and distribution as in training. E.g., if your TF model was trained on images in the range 0-255, then you need to pass the same range to the TFLite and TFLite Micro models. Instead, if you trained the model on preprocessed data (0-255 normalized to 0-1 during training), then you need to do the same and preprocess the data for the TFLite and TFLite Micro models.
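As an illustration of that last point, here is a minimal sketch of what the preprocessing could look like on the ESP32 side; the scale factors are hypothetical, not values from the original model:
// Sketch only: kDeltaT1Scale / kDeltaT2Scale are placeholders -- use whatever
// normalization (mean/std, min-max, ...) was applied to the training data.
constexpr float kDeltaT1Scale = 1.0f / 50.0f;
constexpr float kDeltaT2Scale = 1.0f / 50.0f;

float deltaT1 = 2.0f;
float deltaT2 = 3.2f;

// Apply the same preprocessing as in training before filling the input tensor.
input->data.f[0] = deltaT1 * kDeltaT1Scale;
input->data.f[1] = deltaT2 * kDeltaT2Scale;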
I have found the issue and the answer.
It was not the C++ code, it was the model. Originally, I made my model with 3 hidden layers of 64, 20, and 8 units (I am new to ML, so I was only playing with random values), and it was giving me the issue.
To solve it, I changed the hidden layers to 32, 16, and 8, and the C++ code then output the right values.
I am trying to serve the following Git repo in OpenCV: https://github.com/una-dinosauria/3d-pose-baseline. The checkpoint data can be found at the following link: https://drive.google.com/file/d/0BxWzojlLp259MF9qSFpiVjl0cU0/view
I have already constructed a frozen graph which I can serve in Python; it was generated using the following script:
import os
import tensorflow as tf

meta_path = 'checkpoint-4874200.meta'       # Your .meta file
output_node_names = ['linear_model/add_1']  # Output nodes
export_dir = os.path.join('export_dir')
graph = tf.Graph()

with tf.Session(graph=graph) as sess:
    # Restore the graph
    loader = tf.train.import_meta_graph(meta_path)
    loader.restore(sess, 'checkpoint-4874200')
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(sess,
                                         [tf.saved_model.SERVING],
                                         strip_default_attrs=True)
    # Freeze the graph
    frozen_graph_def = tf.graph_util.convert_variables_to_constants(
        sess,
        sess.graph_def,
        output_node_names)
    # Save the frozen graph
    with open('C:\\Users\\FrozenGraph.pb', 'wb') as f:
        f.write(frozen_graph_def.SerializeToString())
Then I optimized the graph by running:
from tensorflow.python.tools import optimize_for_inference_lib

optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def,
    ['inputs/enc_in'],
    ['linear_model/add_1'],
    tf.float32.as_datatype_enum)
g = tf.gfile.FastGFile('optimized_inference_graph.pb', 'wb')
g.write(optimized_graph_def.SerializeToString())
and the optimized frozen graph can be found at: https://github.com/alecda573/frozen_graph/blob/master/optimized_inference_graph.pb
When I try to run in opencv the following I get this runtime error:
OpenCV(4.3.0) Error: Unspecified error (More than one input is Const op) in cv::dnn::dnn4_v20200310::`anonymous-namespace'::TFImporter::getConstBlob, file C:\build\master_winpack-build-win64-vc15\opencv\modules\dnn\src\tensorflow\tf_importer.cpp, line 570
Steps to reproduce
To reproduce the problem, you just need to download the frozen graph from the above link (or create it yourself from the checkpoint data) and then call the following in OpenCV with the headers below:
#include <iostream>
#include <vector>
#include <cmath>
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include "opencv2/dnn.hpp"
string pbFilePath = "C:/Users/optimized_inferene_graph.pb";
//Create 3d-pose-baseline model
cv::dnn::Net inputNet;
inputNet = cv::dnn::readNetFromTensorflow(pbFilePath);
I would love to know if anyone has any thoughts on how to address this error.
You can see the frozen graph and the optimized graph I generated with TensorBoard in the attached photos.
I have a feeling the error arises from the training-flag inputs, but I am not certain, and I do not want to start editing the graph if that is not the problem.
I am attaching the function in OpenCV that is causing the issue:
const tensorflow::TensorProto& TFImporter::getConstBlob(const tensorflow::NodeDef &layer, std::map<String, int> const_layers,
                                                        int input_blob_index, int* actual_inp_blob_idx) {
    if (input_blob_index == -1) {
        for (int i = 0; i < layer.input_size(); i++) {
            Pin input = parsePin(layer.input(i));
            if (const_layers.find(input.name) != const_layers.end()) {
                if (input_blob_index != -1)
                    CV_Error(Error::StsError, "More than one input is Const op");

                input_blob_index = i;
            }
        }
    }

    if (input_blob_index == -1)
        CV_Error(Error::StsError, "Const input blob for weights not found");

    Pin kernel_inp = parsePin(layer.input(input_blob_index));
    if (const_layers.find(kernel_inp.name) == const_layers.end())
        CV_Error(Error::StsError, "Input [" + layer.input(input_blob_index) +
                                  "] for node [" + layer.name() + "] not found");
    if (kernel_inp.blobIndex != 0)
        CV_Error(Error::StsError, "Unsupported kernel input");

    if (actual_inp_blob_idx) {
        *actual_inp_blob_idx = input_blob_index;
    }

    int nodeIdx = const_layers.at(kernel_inp.name);
    if (nodeIdx < netBin.node_size() && netBin.node(nodeIdx).name() == kernel_inp.name)
    {
        return netBin.node(nodeIdx).attr().at("value").tensor();
    }
    else
    {
        CV_Assert_N(nodeIdx < netTxt.node_size(),
                    netTxt.node(nodeIdx).name() == kernel_inp.name);
        return netTxt.node(nodeIdx).attr().at("value").tensor();
    }
}
As you pointed out, the error originates in getConstBlob (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L570). getConstBlob is called several times in populateNet (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L706), which is called in all overloaded definitions of readNetFromTensorflow (https://github.com/opencv/opencv/blob/master/modules/dnn/src/tensorflow/tf_importer.cpp#L2278). Those may be starting points for where to place breakpoints if you want to step through with a debugger.
The other thing I noticed is that the overload of readNetFromTensorflow I believe you're using (the one taking a std::string: https://docs.opencv.org/master/d6/d0f/group__dnn.html#gad820b280978d06773234ba6841e77e8d) takes two arguments - both the model path (model) and a configuration (config), which is optional and defaults to an empty string. In the unit tests, it looks like there are both cases - with and without a configuration provided (https://github.com/opencv/opencv/blob/master/modules/dnn/test/test_tf_importer.cpp). I'm not sure if that would have an impact; a call with both arguments might look like the sketch below.
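A minimal sketch of the two-argument call (the .pbtxt path here is hypothetical, a text-format graph definition; the config argument can simply be omitted):
#include <opencv2/dnn.hpp>

// The second argument is optional and defaults to an empty string.
cv::dnn::Net net = cv::dnn::readNetFromTensorflow(
    "C:/Users/optimized_inference_graph.pb",     // binary frozen graph
    "C:/Users/optimized_inference_graph.pbtxt"); // hypothetical text graph (config)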
Lastly, in the script you provided to reproduce the results, I believe the model file name is misspelled - it says optimized_inferene_graph.pb, but the file you point to in the GitHub repo is spelled optimized_inference_graph.pb.
Just a few suggestions, I hope this may help!
I am using the json-c library (libjson) to create a JSON string like the one below.
{ "author-details": {
"name" : "Joys of Programming",
"Number of Posts" : 10
}
}
My code looks like this:
json_object *jobj = json_object_new_object();
json_object *jStr1 = json_object_new_string("Joys of Programming");
json_object *jstr2 = json_object_new_int(10);
json_object_object_add(jobj, "name", jStr1);
json_object_object_add(jobj, "Number of Posts", jstr2);
This gives me the JSON string:
{
  "name" : "Joys of Programming",
  "Number of Posts" : 10
}
How do I add the top part associated with author details?
To paraphrase an old advertisement, "libjson users would rather fight than switch."
At least I assume you must like fighting with the library. Using nlohmann's JSON library, you could use code like this:
nlohmann::json j {
    { "author-details", {
        { "name", "Joys of Programming" },
        { "Number of Posts", 10 }
    } }
};
At least to me, this seems somewhat simpler and more readable.
Parsing is about equally straightforward. For example, let's assume we had a file named somefile.json that contained the JSON data shown above. To read and parse it, we could do something like this:
nlohmann::json j;
std::ifstream in("somefile.json");
in >> j; // Read the file and parse it into a json object
// Let's start by retrieving and printing the name.
std::cout << j["author-details"]["name"];
Or, let's assume we found a post, so we want to increment the count of posts. This is one place that things get...less tasteful--we can't increment the value as directly as we'd like; we have to obtain the value, add one, then assign the result (like we would in lesser languages that lack ++):
j["author-details"]["Number of Posts"] = j["author-details"]["Number of Posts"] + 1;
Then we want to write out the result. If we want it "dense" (e.g., we're going to transmit it over a network for some other machine to read it) we can just use <<:
somestream << j;
On the other hand, we might want to pretty-print it so a person can read it more easily. The library respects the width we set with setw, so to have it print out indented with 4-column tab stops, we can do:
somestream << std::setw(4) << j;
Create a new JSON object and add the one you already created as a child.
Just insert code like this after what you've already written:
json_object* root = json_object_new_object();
json_object_object_add(root, "author-details", jobj); // This is the same "jobj" as original code snippet.
Based on the comment from Dominic, I was able to figure out the correct answer.
json_object *jobj = json_object_new_object();   // outer object
json_object *root = json_object_new_object();   // inner "author-details" object
json_object_object_add(jobj, "author-details", root);

json_object *jStr1 = json_object_new_string("Joys of Programming");
json_object *jstr2 = json_object_new_int(10);
json_object_object_add(root, "name", jStr1);
json_object_object_add(root, "Number of Posts", jstr2);
I train a model and save it using:
saver = tf.train.Saver()
saver.save(session, './my_model_name')
Besides the checkpoint file, which simply contains pointers to the most recent checkpoints of the model, this creates the following 3 files in the current path:
my_model_name.meta
my_model_name.index
my_model_name.data-00000-of-00001
I wonder what each of these files contains.
I'd like to load this model in C++ and run the inference. The label_image example loads the model from a single .pb file using ReadBinaryProto(). I wonder how I can load it from these 3 files. What is the C++ equivalent of the following?
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')
What your saver creates is called "Checkpoint V2" and was introduced in TF 0.12.
I got it working quite nicely (though the docs on the C++ part are horrible, so it took me a day to solve). Some people suggest converting all variables to constants or freezing the graph, but none of these is actually needed.
Python part (saving)
with tf.Session() as sess:
    tf.train.Saver(tf.trainable_variables()).save(sess, 'models/my-model')
If you create the Saver with tf.trainable_variables(), you can save yourself some headache and storage space. Some more complicated models may need all data to be saved; in that case, remove this argument to Saver, and just make sure you're creating the Saver after your graph is created. It is also very wise to give all variables/layers unique names, otherwise you can run into various problems.
C++ part (inference)
Note that checkpointPath isn't a path to any of the existing files, just their common prefix. If you mistakenly put the path to the .index file there, TF won't tell you that it was wrong, but it will die during inference due to uninitialized variables.
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/protobuf/meta_graph.pb.h>
using namespace std;
using namespace tensorflow;
...
// set up your input paths
const string pathToGraph = "models/my-model.meta";
const string checkpointPath = "models/my-model";
...
auto session = NewSession(SessionOptions());
if (session == nullptr) {
throw runtime_error("Could not create Tensorflow session.");
}
Status status;
// Read in the protobuf graph we exported
MetaGraphDef graph_def;
status = ReadBinaryProto(Env::Default(), pathToGraph, &graph_def);
if (!status.ok()) {
throw runtime_error("Error reading graph definition from " + pathToGraph + ": " + status.ToString());
}
// Add the graph to the session
status = session->Create(graph_def.graph_def());
if (!status.ok()) {
throw runtime_error("Error creating graph: " + status.ToString());
}
// Read weights from the saved checkpoint
Tensor checkpointPathTensor(DT_STRING, TensorShape());
checkpointPathTensor.scalar<std::string>()() = checkpointPath;
status = session->Run(
{{ graph_def.saver_def().filename_tensor_name(), checkpointPathTensor },},
{},
{graph_def.saver_def().restore_op_name()},
nullptr);
if (!status.ok()) {
throw runtime_error("Error loading checkpoint from " + checkpointPath + ": " + status.ToString());
}
// and run the inference to your liking
auto feedDict = ...
auto outputOps = ...
std::vector<tensorflow::Tensor> outputTensors;
status = session->Run(feedDict, outputOps, {}, &outputTensors);
For completeness, here's the Python equivalent:
Inference in Python
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('models/my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('models/'))
    outputTensors = sess.run(outputOps, feed_dict=feedDict)
I'm currently struggling with this myself; I've found it's not very straightforward to do at the moment. The two most commonly cited tutorials on the subject are:
https://medium.com/jim-fleming/loading-a-tensorflow-graph-with-the-c-api-4caaff88463f#.goxwm1e5j
and
https://medium.com/@hamedmp/exporting-trained-tensorflow-models-to-c-the-right-way-cf24b609d183#.g1gak956i
The equivalent of
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')
Is just
Status load_graph_status = LoadGraph(graph_path, &session);
Assuming you've "frozen the graph" (used a script which combines the graph file with the checkpoint values). A paraphrased sketch of that LoadGraph helper is shown below.
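For reference, here is roughly what the LoadGraph helper in the label_image example does (read one frozen .pb file and create a session that owns it); treat it as an approximation, not a verbatim copy of the sample:
#include <memory>
#include <string>
#include <tensorflow/core/framework/graph.pb.h>
#include <tensorflow/core/lib/core/errors.h>
#include <tensorflow/core/platform/env.h>
#include <tensorflow/core/public/session.h>

// Reads a single frozen GraphDef (.pb) and creates a session from it.
tensorflow::Status LoadGraph(const std::string& graph_file_name,
                             std::unique_ptr<tensorflow::Session>* session) {
  tensorflow::GraphDef graph_def;
  tensorflow::Status load_graph_status =
      tensorflow::ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
  if (!load_graph_status.ok()) {
    return tensorflow::errors::NotFound("Failed to load compute graph at '",
                                        graph_file_name, "'");
  }
  session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
  return (*session)->Create(graph_def);
}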
Also, see the discussion here: Tensorflow Different ways to Export and Run graph in C++