I used the xgboost R package to train a model, and I want to make predictions in a C/C++ environment. I have succeeded in saving the trained model from R and loading it in my C code.
I want to test this code by saving the test data I used in R (as a DMatrix), loading it back into my C program, and doing the prediction there.
In R I used the xgb.DMatrix.save() command to save the test data to a file. My C code looks like this:
DMatrixHandle d = 0;
int y = XGDMatrixCreateFromFile("test_data.DMatrix",1,&d);
This code compiles, but fails at runtime with the following error:
dmlc-core/include/dmlc/logging.h:245: [13:57:27] src/data/data.cc:51: Check failed: (version) == (kVersion) MetaInfo: invalid format
Any suggestions about how to tell xgboost to save/load things in the right format?
Any clue would be helpful.
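In case it helps, a slightly fuller, self-contained version of the same call with error checking added would look roughly like this (the file name and the silent flag are just what I use for testing):

#include <stdio.h>
#include <xgboost/c_api.h>

int main(void) {
  DMatrixHandle d = 0;
  /* second argument is the silent flag */
  int y = XGDMatrixCreateFromFile("test_data.DMatrix", 1, &d);
  if (y != 0) {
    /* XGBGetLastError() returns the message of the last failed xgboost C API call */
    fprintf(stderr, "load failed: %s\n", XGBGetLastError());
    return 1;
  }
  XGDMatrixFree(d);
  return 0;
}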
I've been trying to figure out how to use TFLite with C++ for a Raspberry Pi project, and I've had trouble understanding the documentation and how to perform inference. I've been confused about the code shown on this page:
https://www.tensorflow.org/lite/api_docs/cc/class/tflite/interpreter
For context, I am inexperienced in C++ and usually work with Python and MATLAB.
Here are the lines of code I'm trying to understand:
auto input = interpreter->typed_tensor<float>(0);
for (int i = 0; i < input_size; i++) {
  input[i] = ...; interpreter->Invoke();
My understanding of these lines is: they set input equal to the result of interpreter's typed_tensor function, with 0 as the argument.
Then they loop over values of i between 0 and input_size. For each i value they set input[i] equal to some value, and then call interpreter->Invoke(), which has the neural net perform inference with input[i] as the input?
This is then repeated for each i value, each time calling interpreter->Invoke() with input[i] as the input to the neural net.
Is my understanding of this process correct?
What shape should the input take? For example, if I had a TensorFlow model that takes a 1x100 input and converted it to a TFLite model, how would I create the input to feed into the TFLite model in C++?
What data type should the input to the TFLite model be?
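To make my question concrete, here is roughly what I imagine the full flow looks like for a model with a single 1x100 float input (the file name and the fill values are just placeholders, and I'm not sure any of this is right):

#include <cstdio>
#include <memory>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
  auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  interpreter->AllocateTensors();

  // typed_input_tensor<float>(0) should give a float* to the first input
  // tensor, i.e. a contiguous buffer of 100 floats for a 1x100 input?
  float* input = interpreter->typed_input_tensor<float>(0);
  for (int i = 0; i < 100; ++i) {
    input[i] = 0.0f;  // fill with the real input values here
  }

  // my guess: Invoke() runs the whole model once, after the buffer is filled
  interpreter->Invoke();

  float* output = interpreter->typed_output_tensor<float>(0);
  printf("first output value: %f\n", output[0]);
  return 0;
}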
Thank you,
Simon
I've tried looking at example code from other people's uses of TFLite in C++, but I haven't been able to examine the inputs to their models. I would also appreciate help with viewing the input tensors while debugging. I'm using Code::Blocks as an IDE for now.
I'm trying to implement a deep learning model in the TensorRT runtime. The model conversion step went fine and I'm fairly confident about it.
There are two parts I'm currently struggling with: copying data from host to device (e.g. from OpenCV into TensorRT) and getting the right output shape in order to get the right data. So my questions are:
How does the shape of the input dims relate to the memory buffer? What is the difference when the model input dims are NCHW vs. NHWC? When I read an OpenCV image it is NHWC and the model input is also NHWC; do I have to re-arrange the buffer data, and if yes, what is the actual consecutive memory format I have to produce? Or simply, what format or sequence of data does the engine expect?
About the output (assuming the input is correctly buffered): how do I get the right result shape for each task (detection, classification, etc.)?
E.g. an array or something similar to what I would get when working with Python.
I've read the NVIDIA docs and they're not beginner-friendly at all.
// Let's say I have a model with a dynamic-shape input dim in NHWC format.
auto input_dims = nvinfer1::Dims4{1, 386, 342, 3}; // using fixed H, W for testing
context->setBindingDimensions(input_idx, input_dims);
auto input_size = getMemorySize(input_dims, sizeof(float));
// How do I get an OpenCV Mat into this kind of dims, and if I encounter a new input dim format, how do I adapt to that?
And the expected output dims are something like (1, 32, 53, 8), for example. The output ends up in a raw buffer behind a pointer, and I don't know the ordering of the data needed to reconstruct the expected array shape.
// Run TensorRT inference
void* bindings[] = {input_mem, output_mem};
bool status = context->enqueueV2(bindings, stream, nullptr);
if (!status)
{
std::cout << "[ERROR] TensorRT inference failed" << std::endl;
return false;
}
auto output_buffer = std::unique_ptr<int[]>{new int[output_size]};
if (cudaMemcpyAsync(output_buffer.get(), output_mem, output_size, cudaMemcpyDeviceToHost, stream) != cudaSuccess)
{
std::cout << "ERROR: CUDA memory copy of output failed, size = " << output_size << " bytes" << std::endl;
return false;
}
cudaStreamSynchronize(stream);
// How do I use this output_buffer to recover the right output shape, (1, 32, 53, 8) in this case?
Could you please edit your question and tell us which model you're using? If it's a commonly known NN, perhaps it is one we can download and test locally.
Now to the parts of the answer that don't depend on the model (even though knowing it would help):
How does the shape of the input dims relate to the memory buffer?
If the input is NxCxHxW, you need to allocate N*C*H*W*sizeof(float) memory for it on your CPU and GPU. To be more precise: you need to allocate space on the GPU for all the bindings, and on the CPU only for the input and output bindings.
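A minimal sketch of that sizing/allocation, assuming one input and one output binding, float data, and a context whose binding dimensions are already set (input_idx and output_idx are placeholders for your binding indices):

// Sketch: size and allocate buffers from the binding dimensions.
auto volume = [](const nvinfer1::Dims& d) {
  size_t v = 1;
  for (int i = 0; i < d.nbDims; ++i) v *= static_cast<size_t>(d.d[i]);
  return v;
};

const size_t input_bytes  = volume(context->getBindingDimensions(input_idx))  * sizeof(float);
const size_t output_bytes = volume(context->getBindingDimensions(output_idx)) * sizeof(float);

void* input_mem  = nullptr;
void* output_mem = nullptr;
cudaMalloc(&input_mem,  input_bytes);   // device side: one allocation per binding
cudaMalloc(&output_mem, output_bytes);

std::vector<float> input_host(input_bytes / sizeof(float));   // host side: input
std::vector<float> output_host(output_bytes / sizeof(float)); // host side: output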
When I read an OpenCV image it is NHWC and the model input is also NHWC; do I have to re-arrange the buffer data?
No, you do not have to re-arrange the buffer data. If you did have to convert between NHWC and NCHW, you could check this or google 'opencv NHWC to NCHW'.
Full working code example here, especially this function.
Or simply, what format or sequence of data does the engine expect?
This depends on how the neural network was trained. In general you should know exactly which kind of preprocessing and image data format was used to train the NN, and if possible even use the same libraries to load and process the images. This is an open problem in ML: if you try to replicate the results of a paper and use its model, but the authors haven't open-sourced the preprocessing, you may get worse results. In the "worst" case you can implement both NHWC and NCHW and test which of them works (a sketch of the NHWC-to-NCHW shuffle follows below).
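If you do end up needing the NHWC-to-NCHW rearrangement, the copy itself is just an index shuffle. A sketch, assuming the cv::Mat has already been resized to the network's H x W and converted to CV_32FC3:

#include <opencv2/core.hpp>

// Copy an HWC (interleaved) float cv::Mat into a CHW (planar) float buffer.
void hwcToChw(const cv::Mat& img, float* chw) {
  const int H = img.rows, W = img.cols, C = img.channels();
  for (int c = 0; c < C; ++c)
    for (int h = 0; h < H; ++h)
      for (int w = 0; w < W; ++w)
        chw[(c * H + h) * W + w] = img.ptr<float>(h)[w * C + c];
}

cv::dnn::blobFromImage does essentially the same transpose (plus optional scaling and mean subtraction) in a single call, if you prefer that.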
About the output (assuming the input is correctly buffered): how do I get the right result shape for each task (detection, classification, etc.)? E.g. an array or something similar to what I would get when working with Python.
Answering this properly really requires knowing which NNs you are referring to, but I myself do the following:
Load the TensorRT .engine file in my code like this and deserialize like this
Print the bindings like this
Then I know the size of the input binding or bindings if there are many inputs, and the size of the output binding or bindings if there are many outputs.
This way you know the right result shape for each task. I hope this answered your question. If not, please add detailed comments and edit your post to be more precise. Thank you.
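To make the result-shape part concrete: the output binding is just a flat, row-major buffer over its dims (last dimension varies fastest), so for your (1, 32, 53, 8) example the element at (n, i, j, k) sits at a fixed offset. A sketch, with the dimensions hard-coded as placeholders:

// Read element (n, i, j, k) from a flat buffer whose binding dims are
// (N=1, D1=32, D2=53, D3=8), laid out row-major (last dim fastest).
inline float at(const float* out, int n, int i, int j, int k,
                int D1 = 32, int D2 = 53, int D3 = 8) {
  return out[((n * D1 + i) * D2 + j) * D3 + k];
}

// e.g. walking the whole output after the device-to-host copy,
// where output_ptr is the host-side float pointer holding the copied data:
// for (int i = 0; i < 32; ++i)
//   for (int j = 0; j < 53; ++j)
//     for (int k = 0; k < 8; ++k)
//       ... = at(output_ptr, 0, i, j, k);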
I've read the NVIDIA docs and they're not beginner-friendly at all.
Yes, I agree. You're better off searching for TensorRT C++ (or Python) repositories on GitHub and studying their code. Have you seen the TensorRT samples? It doesn't really take many lines of code to implement TensorRT inference.
Thanks in advance for your support.
I'm trying to get the output of a tensor after inference with a .tflite U-Net neural network. I'm using the TensorFlow Lite image classification code as a baseline.
I need to adapt the code for a segmentation task. My question is how I can access the output of the inferenced model (which is 128x128x1) and write the result into an image.
I already debugged the code and explored many different approaches. Unfortunately, I'm not confident with the C++ language. What I found is that the command: interpreter->typed_output_tensor<float>(0) should be what I need, as also referenced here: https://www.tensorflow.org/lite/guide/inference#loading_a_model. However, I cannot access the 128x128 tensor generated by the network.
You can find the code at the address: https://github.com/tensorflow/tensorflow/blob/770481fb3e9126f9a29db5667f528e450d54d719/tensorflow/lite/examples/label_image/label_image.cc
The interesting part is here (lines 217-224):
const float threshold = 0.001f;
std::vector<std::pair<float, int>> top_results;
int output = interpreter->outputs()[0];
TfLiteIntArray* output_dims = interpreter->tensor(output)->dims;
// assume output dims to be something like (1, 1, ... ,size)
auto output_size = output_dims->data[output_dims->size - 1];
I expect the values to be saved in an image, or an alternative way of saving the output tensor.
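For reference, this is the kind of thing I've been attempting after Invoke(), assuming the output values are in [0, 1] (the PGM helper below is just my own guess, not from the example code):

#include <cstdio>

// Dump a 128x128x1 float output as an 8-bit grayscale PGM image.
void WriteMaskAsPGM(const float* mask, int height, int width, const char* path) {
  FILE* f = fopen(path, "wb");
  if (!f) return;
  fprintf(f, "P5\n%d %d\n255\n", width, height);
  for (int i = 0; i < height * width; ++i) {
    // assumes the network outputs values in [0, 1]
    fputc(static_cast<unsigned char>(mask[i] * 255.0f), f);
  }
  fclose(f);
}

// called after interpreter->Invoke():
// WriteMaskAsPGM(interpreter->typed_output_tensor<float>(0), 128, 128, "mask.pgm");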
I have trained some models using TensorFlow 1.5.1 and I have the checkpoints for those models (including .ckpt and .meta files). Now I want to do inference in C++ using those files.
In Python, I would do the following to save and load the graph and the checkpoints.
For saving:
images = tf.placeholder(...)  # the input layer
# ... the graph definition ...
output = tf.nn.softmax(net)  # the output layer
tf.add_to_collection('images', images)
tf.add_to_collection('output', output)
For inference, I restore the graph and the checkpoint, then restore the input and output layers from collections like so:
meta_file = './models/last-100.meta'
ckpt_file = './models/last-100'
with tf.Session() as sess:
    saver = tf.train.import_meta_graph(meta_file)
    saver.restore(sess, ckpt_file)
    images = tf.get_collection('images')[0]
    output = tf.get_collection('output')[0]
    outputTensors = sess.run(output, feed_dict={images: np.array(an_image)})
Now, assuming that I did the saving in Python as usual, how can I do inference and restore in C++ with simple code like in Python?
I have found examples and tutorials, but for TensorFlow versions 0.7 and 0.12, and the same code doesn't work for version 1.5. I found no tutorials for restoring models using the C++ API on the TensorFlow website.
For the sake of this thread, I will rephrase my comment into an answer.
Posting a full example would require either a CMake setup or putting the file into a specific directory to run bazel. As I favor the first way, and it would burst all limits of this post to cover all parts, I would like to point to a complete implementation in C99, C++ and GO without Bazel, which I tested for TF > v1.5.
Loading a graph in C++ is not much more difficult than in Python, given you have already compiled TensorFlow from source.
Starting with an MWE that creates a very dumb network graph is always a good idea to figure out how things work:
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[1, 2], name='input')
output = tf.identity(tf.layers.dense(x, 1), name='output')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver(tf.global_variables())
    saver.save(sess, './exported/my_model')
There are probably tons of answers here on SO about this part. So I just let it stay here without further explanation.
Loading in Python
Before doing stuff in other languages, we can try to do it properly in Python first -- in the sense that we just need to rewrite it in C++ afterwards.
Even restoring is very easy in Python:
import tensorflow as tf

with tf.Session() as sess:
    # load the computation graph
    loader = tf.train.import_meta_graph('./exported/my_model.meta')
    sess.run(tf.global_variables_initializer())
    loader = loader.restore(sess, './exported/my_model')

    x = tf.get_default_graph().get_tensor_by_name('input:0')
    output = tf.get_default_graph().get_tensor_by_name('output:0')
But this is not helpful, as most of these API endpoints do not exist in the C++ API (yet?). An alternative version would be:
import tensorflow as tf

with tf.Session() as sess:
    metaGraph = tf.train.import_meta_graph('./exported/my_model.meta')
    restore_op_name = metaGraph.as_saver_def().restore_op_name
    restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
    filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name
    sess.run(restore_op, {filename_tensor_name: './exported/my_model'})

    x = tf.get_default_graph().get_tensor_by_name('input:0')
    output = tf.get_default_graph().get_tensor_by_name('output:0')
Hang on. You can always use print(dir(object)) to get the properties like restore_op_name, ... .
Restoring a model is an operation in TensorFlow like every other operation. We just call this operation and provide the path (a string tensor) as an input. We can even write our own restore operation:
def restore(sess, metaGraph, fn):
    restore_op_name = metaGraph.as_saver_def().restore_op_name  # u'save/restore_all'
    restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
    filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name  # u'save/Const'
    sess.run(restore_op, {filename_tensor_name: fn})
Even though this looks strange, it now greatly helps to do the same thing in C++.
Loading in C++
Starting with the usual stuff
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/public/session_options.h>
#include <tensorflow/core/protobuf/meta_graph.pb.h>

#include <string>
#include <iostream>

typedef std::vector<std::pair<std::string, tensorflow::Tensor>> tensor_dict;

int main(int argc, char const *argv[]) {
  const std::string graph_fn = "./exported/my_model.meta";
  const std::string checkpoint_fn = "./exported/my_model";

  // prepare session
  tensorflow::Session *sess;
  tensorflow::SessionOptions options;
  TF_CHECK_OK(tensorflow::NewSession(options, &sess));

  // here we will put our loading of the graph and weights
  return 0;
}
You should be able to compile this by either putting it in the TensorFlow repo and using bazel, or simply following the instructions here to use CMake.
We need to load the MetaGraphDef that tf.train.import_meta_graph reads. This can be done by
tensorflow::MetaGraphDef graph_def;
TF_CHECK_OK(ReadBinaryProto(tensorflow::Env::Default(), graph_fn, &graph_def));
In C++ reading a graph from file is not the same as importing a graph in Python. We need to create this graph in a session by
TF_CHECK_OK(sess->Create(graph_def.graph_def()));
By looking at the strange python restore function above:
restore_op_name = metaGraph.as_saver_def().restore_op_name
restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name
we can code the equivalent piece in C++
const std::string restore_op_name = graph_def.saver_def().restore_op_name();
const std::string filename_tensor_name = graph_def.saver_def().filename_tensor_name();
Having this in place, we just run the operation by
sess->Run(feed_dict,          // inputs
          {},                 // output_tensor_names (we do not need them)
          {restore_op_name},  // target_node_names
          nullptr);           // outputs (there are no outputs this time)
Creating the feed_dict is probably a post on its own, and this answer is already long enough. It only covers the most important stuff. I would like to point to a complete implementation in C99, C++ and GO without Bazel, which I tested for TF > v1.5. This is not that hard -- it just can get very long in the case of the plain C version.
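Still, a minimal sketch of what that feed_dict could look like for the restore call above (untested in this exact form; the checkpoint path simply goes in as a scalar string tensor bound to the filename tensor):

// Build a scalar string tensor holding the checkpoint path ...
tensorflow::Tensor checkpoint_path(tensorflow::DT_STRING, tensorflow::TensorShape());
checkpoint_path.scalar<std::string>()() = checkpoint_fn;

// ... and feed it to the restore op.
tensor_dict feed_dict = {{filename_tensor_name, checkpoint_path}};
TF_CHECK_OK(sess->Run(feed_dict, {}, {restore_op_name}, nullptr));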
I am running LibSVM through Weka. Its output accuracy looks good to me, so I am planning to write an SVM model by myself. However, Weka didn't produce any training parameters, such as the number of support vectors, so I cannot do anything. Searching the web, I found somebody saying it should generate output like the following:
optimization finished, #iter = 27
nu = 0.058475864943863545
obj = -1.871013102744184, rho = -0.19357337828800944
nSV = 9, nBSV = 0
Total nSV = 9
But why didn't I see any of this? Is there a step that I missed? Please help me. Thanks a lot.
Weka writes the output you mentioned to stderr.
So if you have started weka.sh or weka.bat from a terminal (or "command window" if you are on Windows), you should see that output appear in your terminal window after clicking "classify"
If you want to have access to this information via scripts, you can
redirect the output to a file and read in that file.
Here is how to edit the startup file weka.sh / weka.bat.
Edit this line (it is probably the last line) in order to write log info to a file instead of the terminal window:
java -cp $CP -Xmx8092m weka.gui.GUIChooser 2>>/opt/weka-stable/weka.log &
You can also add a properties file to your home directory to add more fine-grained behaviour.
https://weka.wikispaces.com/Properties+file
(You probably can also access information via the Weka Java API somehow, but you did not ask for that)