I was trying this example of using a TensorFlow saved model in C++ from this website:
https://medium.com/jim-fleming/loading-a-tensorflow-graph-with-the-c-api-4caaff88463f#.ji310n4zo
It works well, but it does not save the values of the variables a and b, as it only saves the graph, not the variables. I tried to replace the following line:
tf.train.write_graph(sess.graph_def, 'models/', 'graph.pb', as_text=False)
with
saver.save(sess, 'models/graph', global_step=0)
(of course after creating the saver object). It does not work, and it outputs:
Not found: FeedInputs: unable to find feed output a
I checked the nodes that are loaded, and they are only:
_SOURCE
_SINK
whereas when I use the write_graph function and then load the model in C++, I get the following nodes loaded:
_SOURCE
_SINK
save/restore_slice_1/shape_and_slice
save/restore_slice_1/tensor_name
save/restore_slice/shape_and_slice
save/restore_slice/tensor_name
save/save/shapes_and_slices
save/save/tensor_names
save/Const
save/restore_slice_1
save/restore_slice
b
save/Assign_1
b/read
b/initial_value
b/Assign
a
save/Assign
save/restore_all
save/save
save/control_dependency
a/read
c
a/initial_value
a/Assign
init
Tensor
Even the graph file created by saver.save() is much smaller, 165 B, compared to the one created by write_graph, 1.9 KB.
I'm not sure if this is the best way of solving the problem, but at least it solves it.
As write_graph can also store the values of constants, I added the following code to the Python script just before writing the graph with the write_graph function:
for v in tf.trainable_variables():
    vc = tf.constant(v.eval())
    tf.assign(v, vc, name="assign_variables")
This creates constants that store the variables' values after training and then creates "assign_variables" ops to assign them back to the variables. Now, when you call write_graph, it will store the variables' values in the file.
The only remaining part is to call these "assign_variables" tensors in the C++ code to make sure that your variables are assigned the constant values stored in the file. Here is one way to do it:
Status status = NewSession(SessionOptions(), &session);
std::vector<tensorflow::Tensor> outputs;
// Runs assign_variables, assign_variables_1, assign_variables_2, ... in turn;
// the loop stops once a name is not found and status becomes non-ok.
for (int i = 0; status.ok(); i++) {
    char name[100];
    if (i == 0)
        sprintf(name, "assign_variables");
    else
        sprintf(name, "assign_variables_%d", i);
    status = session->Run({}, {name}, {}, &outputs);
}
There is another way of restoring the variables: calling the save/restore_all operation, which should be present in the graph:
std::vector<tensorflow::Tensor> outputs;
Tensor checkpoint_filepath(DT_STRING, TensorShape());
checkpoint_filepath.scalar<std::string>()() = "path to the checkpoint file";
status = session->Run({{ "save/Const", checkpoint_filepath }},
                      {}, {"save/restore_all"}, &outputs);
I cannot put, and keep, the model on the CUDA device. I cannot send a tensor that is already on CUDA through a model without getting the "found at least two devices, cpu and cuda" error.
Did I miss some simple way of putting the model on the CUDA device in LibTorch? I cannot find it or figure it out.
The full, reproducible example is below, but the lines in question are quite simple, as shown below.
I have a tensor that is on CUDA, and I want to send it through a model. This causes an error:
auto the_tensor = torch::rand({42, 427}).to(device);
std::cout << net.forward(the_tensor).to(device);
terminate called after throwing an instance of 'c10::Error'
what(): Expected all tensors to be on the same device, but found at least two devices, cpu and
cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)
If I do NOT put the tensor on CUDA, I can run the tensor and the model both on CUDA like so:
auto the_tensor = torch::rand({42, 427});
std::cout << net.forward(the_tensor).to(device);
I can also send the tensor back to the CPU, and this also does NOT create an error. But I have a large script with a lot of tensors that will already be on the CUDA device, and I DO NOT want to send tensors from CUDA back to the CPU and then back to the CUDA device. This is why I call it a bug. How do I put the model on the CUDA device and keep it there, other than putting .to(device) on the end of the model only when it is being called with forward, as in net.forward(tensor)?
auto the_tensor = torch::rand({42, 427}).to(device);
std::cout << net.forward(the_tensor.to(torch::kCPU)).to(device);
I have tried permanently putting the model on the device but nothing I try works.
net.to(device);
net->to(device);
Critic_Net().to(device);
I've tried many variations like those above to put the model on the CUDA device and keep it there, but nothing works except moving things at call time with net.forward(the_tensor).to(device);
The full, reproducible example.
#include <torch/torch.h>
using namespace torch::indexing;
torch::Device device(torch::kCUDA);
struct Critic_Net : torch::nn::Module {
    torch::Tensor next_state_batch__sampled_action;
  public:
    Critic_Net() {
        lin1 = torch::nn::Linear(427, 42);
        lin2 = torch::nn::Linear(42, 286);
        lin3 = torch::nn::Linear(286, 1);
    }
    torch::Tensor forward(torch::Tensor next_state_batch__sampled_action) {
        auto h = next_state_batch__sampled_action;
        h = torch::relu(lin1->forward(h));
        h = torch::tanh(lin2->forward(h));
        h = lin3->forward(h);
        return torch::nan_to_num(h);
    }
    torch::nn::Linear lin1{nullptr}, lin2{nullptr}, lin3{nullptr};
};

auto net = Critic_Net();

int main() {
    net.to(device);
    auto the_tensor = torch::rand({42, 427}).to(device);
    std::cout << net.forward(the_tensor).to(device);
}
When you move your model to the GPU with the to function, libtorch does not actually move anything, because you have not registered anything as parameters/buffers/modules. Hence, when you call the forward method, incompatible devices are found and an error is raised. Here is how to register your submodules (see the docs):
struct Critic_Net : torch::nn::Module {
  public:
    Critic_Net() {
        lin1 = register_module("lin1", torch::nn::Linear(427, 42));
        lin2 = register_module("lin2", torch::nn::Linear(42, 286));
        lin3 = register_module("lin3", torch::nn::Linear(286, 1));
    }
    torch::Tensor forward(torch::Tensor next_state_batch__sampled_action) {
        // unchanged
    }
    torch::nn::Linear lin1{nullptr}, lin2{nullptr}, lin3{nullptr};
};
Note: in addition to register_module, you have access to register_parameter and register_buffer, which take tensors instead of modules. The difference is that "parameters" are trainable tensors, while "buffers" are not trainable (they are useful if you want to keep a moving average of your inputs, for example).
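For illustration, here is a minimal sketch of the tensor-registering variants; the module name, the shape {427} and the forward logic are made up for the example, not part of your model:

struct Normalizer : torch::nn::Module {
    torch::Tensor weight, running_mean;
    Normalizer() {
        // trainable tensor: shows up in parameters() and is moved by to(device)
        weight = register_parameter("weight", torch::ones({427}));
        // non-trainable state: moved by to(device) but ignored by optimizers
        running_mean = register_buffer("running_mean", torch::zeros({427}));
    }
    torch::Tensor forward(torch::Tensor x) {
        return (x - running_mean) * weight;
    }
};

Both registered tensors follow the module when you call net.to(device), which is exactly what the unregistered members in the question's code were not doing.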
I tried to use the OpenVINO Inference Engine to accelerate my DL inference. It works with one image, but I want to create a batch of two images and then do an inference.
This is my code:
InferenceEngine::Core core;
InferenceEngine::CNNNetwork network = core.ReadNetwork("path/to/model.xml");
InferenceEngine::InputInfo::Ptr input_info = network.getInputsInfo().begin()->second;
std::string input_name = network.getInputsInfo().begin()->first;
InferenceEngine::DataPtr output_info = network.getOutputsInfo().begin()->second;
std::string output_name = network.getOutputsInfo().begin()->first;
InferenceEngine::ExecutableNetwork executableNetwork = core.LoadNetwork(network, "CPU");
InferenceEngine::InferRequest inferRequest = executableNetwork.CreateInferRequest();
std::string input_image_01 = "path/to/image_01.png";
cv::Mat image_01 = cv::imread(input_image_01 );
InferenceEngine::Blob::Ptr imgBlob_01 = wrapMat2Blob(image_01);
std::string input_image_02 = "path/to/image_02.png";
cv::Mat image_02 = cv::imread(input_image_02 );
InferenceEngine::Blob::Ptr imgBlob_02 = wrapMat2Blob(image_02);
InferenceEngine::BlobMap imgBlobMap;
std::pair<std::string, InferenceEngine::Blob::Ptr> pair01(input_image_01, imgBlob_01);
imgBlobMap.insert(pair01);
std::pair<std::string, InferenceEngine::Blob::Ptr> pair02(input_image_02, imgBlob_02);
imgBlobMap.insert(pair02);
inferRequest.SetInput(imgBlobMap);
inferRequest.StartAsync();
inferRequest.Wait(InferenceEngine::IInferRequest::WaitMode::RESULT_READY);
InferenceEngine::Blob::Ptr output = inferRequest.GetBlob(output_name);
std::vector<unsigned> class_results;
ClassificationResult cls(output, {"x", "y"}, 2, 3);
class_results = cls.getResults();
Unfortunately, I received the following error message from the command
inferRequest.SetInput(imgBlobMap);
[NOT_FOUND] Failed to find input or output with name: 'path/to/image_02.png'
C:\j\workspace\private-ci\ie\build-windows-vs2019#2\b\repos\openvino\inference-engine\src\plugin_api\cpp_interfaces/impl/ie_infer_request_internal.hpp:303
C:\Program Files (x86)\Intel\openvino_2021.3.394\inference_engine\include\details/ie_exception_conversion.hpp:66
How can I create a batch of more than one image, do an inference, and get the classification class and confidence? Are the confidence and class located in the variable returned by GetBlob()? Do I need the call ClassificationResult cls(output, {"x", "y"}, 2, 3);?
I'd recommend reviewing the Using Shape Inference article from the OpenVINO online documentation to be aware of the limitations of using batches. It also refers to the Open Model Zoo smart_classroom_demo, where dynamic batching is used to process multiple previously detected faces. Basically, when batching is enabled in the model, the memory buffer of your input blob is allocated with room for the whole batch of images, and it is your responsibility to fill the input blob with data for each image in the batch. You may take a look at the function CnnDLSDKBase::InferBatch of smart_classroom_demo, located in the file smart_classroom_demo/cpp/src/cnn.cpp at line 51. As you can see, in the loop over num_imgs an auxiliary function matU8ToBlob fills the input blob with data for current_batch_size images, then the batch size is set on the infer request and inference is run (a sketch adapting this to the two images from your code follows the snippet):
for (size_t batch_i = 0; batch_i < num_imgs; batch_i += batch_size) {
    const size_t current_batch_size = std::min(batch_size, num_imgs - batch_i);
    for (size_t b = 0; b < current_batch_size; b++) {
        matU8ToBlob<uint8_t>(frames[batch_i + b], input, b);
    }
    if (config_.max_batch_size != 1)
        infer_request_.SetBatch(current_batch_size);
    infer_request_.Infer();
    // ... (the demo then collects the results for the current batch)
}
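Adapted to the two images from your question, that approach could look roughly like the sketch below. It assumes the network batch size was set to 2 before LoadNetwork (for example with network.setBatchSize(2)), and that matU8ToBlob is the helper from the demos' common ocv_common.hpp header; it is not part of the core Inference Engine API.

// Assumption: network.setBatchSize(2) was called before core.LoadNetwork(...)
InferenceEngine::Blob::Ptr input = inferRequest.GetBlob(input_name);
matU8ToBlob<uint8_t>(image_01, input, 0);   // fill batch slot 0
matU8ToBlob<uint8_t>(image_02, input, 1);   // fill batch slot 1
inferRequest.Infer();
InferenceEngine::Blob::Ptr output = inferRequest.GetBlob(output_name);
// one label per image in the batch; top-3 classes per image
ClassificationResult cls(output, {input_image_01, input_image_02}, 2, 3);
std::vector<unsigned> class_results = cls.getResults();

Note also that blobs are keyed by the network input name (input_name), not by the image path, which is why SetInput in your code fails with "Failed to find input or output with name: 'path/to/image_02.png'".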
There is a similar sample within OpenVINO that uses batched inputs as input to the model. You can refer to the link below:
https://github.com/openvinotoolkit/openvino/blob/ae2913d3b5970ce0d3112cc880d03be1708f13eb/inference-engine/samples/hello_nv12_input_classification/main.cpp#L236
I train a model and save it using:
saver = tf.train.Saver()
saver.save(session, './my_model_name')
Besides the checkpoint file, which simply contains pointers to the most recent checkpoints of the model, this creates the following 3 files in the current path:
my_model_name.meta
my_model_name.index
my_model_name.data-00000-of-00001
I wonder what each of these files contains.
I'd like to load this model in C++ and run the inference. The label_image example loads the model from a single .pb file using ReadBinaryProto(). I wonder how I can load it from these 3 files. What is the C++ equivalent of the following?
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')
What your saver creates is called "Checkpoint V2" and was introduced in TF 0.12.
I got it working quite nicely (though the docs on the C++ part are horrible, so it took me a day to solve). Some people suggest converting all variables to constants or freezing the graph, but none of these is actually needed.
Python part (saving)
with tf.Session() as sess:
    tf.train.Saver(tf.trainable_variables()).save(sess, 'models/my-model')
If you create the Saver with tf.trainable_variables(), you can save yourself some headache and storage space. Some more complicated models may need all data to be saved; in that case remove this argument to Saver. Just make sure you're creating the Saver after your graph is created. It is also very wise to give all variables/layers unique names, otherwise you can run into different problems.
C++ part (inference)
Note that checkpointPath isn't a path to any of the existing files, just their common prefix. If you mistakenly put the path to the .index file there, TF won't tell you that it was wrong, but it will die during inference due to uninitialized variables.
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/protobuf/meta_graph.pb.h>
using namespace std;
using namespace tensorflow;
...
// set up your input paths
const string pathToGraph = "models/my-model.meta";
const string checkpointPath = "models/my-model";
...
auto session = NewSession(SessionOptions());
if (session == nullptr) {
    throw runtime_error("Could not create Tensorflow session.");
}
Status status;
// Read in the protobuf graph we exported
MetaGraphDef graph_def;
status = ReadBinaryProto(Env::Default(), pathToGraph, &graph_def);
if (!status.ok()) {
    throw runtime_error("Error reading graph definition from " + pathToGraph + ": " + status.ToString());
}

// Add the graph to the session
status = session->Create(graph_def.graph_def());
if (!status.ok()) {
    throw runtime_error("Error creating graph: " + status.ToString());
}

// Read weights from the saved checkpoint
Tensor checkpointPathTensor(DT_STRING, TensorShape());
checkpointPathTensor.scalar<std::string>()() = checkpointPath;
status = session->Run(
        {{ graph_def.saver_def().filename_tensor_name(), checkpointPathTensor }},
        {},
        {graph_def.saver_def().restore_op_name()},
        nullptr);
if (!status.ok()) {
    throw runtime_error("Error loading checkpoint from " + checkpointPath + ": " + status.ToString());
}
// and run the inference to your liking
auto feedDict = ...
auto outputOps = ...
std::vector<tensorflow::Tensor> outputTensors;
status = session->Run(feedDict, outputOps, {}, &outputTensors);
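For reference, a minimal sketch of what the feed/fetch containers could look like; the names "input:0" and "output:0" are placeholders for whatever your graph actually defines, and inputTensor stands for a tensorflow::Tensor you have already filled with your data:

// placeholder names — replace with the real tensor/op names from your graph
std::vector<std::pair<std::string, tensorflow::Tensor>> feedDict = {
    {"input:0", inputTensor},
};
std::vector<std::string> outputOps = {"output:0"};
std::vector<tensorflow::Tensor> outputTensors;
status = session->Run(feedDict, outputOps, {}, &outputTensors);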
For completeness, here's the Python equivalent:
Inference in Python
with tf.Session() as sess:
    saver = tf.train.import_meta_graph('models/my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('models/'))
    outputTensors = sess.run(outputOps, feed_dict=feedDict)
I'm currently struggling with this myself; I've found it's not very straightforward to do. The two most commonly cited tutorials on the subject are:
https://medium.com/jim-fleming/loading-a-tensorflow-graph-with-the-c-api-4caaff88463f#.goxwm1e5j
and
https://medium.com/@hamedmp/exporting-trained-tensorflow-models-to-c-the-right-way-cf24b609d183#.g1gak956i
The equivalent of
new_saver = tf.train.import_meta_graph('./my_model_name.meta')
new_saver.restore(session, './my_model_name')
is just
Status load_graph_status = LoadGraph(graph_path, &session);
assuming you've "frozen the graph" (i.e., used a script that combines the graph file with the checkpoint values).
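A minimal LoadGraph along the lines of the label_image sample might look like this (a sketch, not the exact sample code; graph_path points to the frozen .pb file):

#include <tensorflow/core/public/session.h>
#include <tensorflow/core/platform/env.h>

tensorflow::Status LoadGraph(const std::string& graph_path,
                             std::unique_ptr<tensorflow::Session>* session) {
    tensorflow::GraphDef graph_def;
    // read the frozen GraphDef from disk
    tensorflow::Status status =
        tensorflow::ReadBinaryProto(tensorflow::Env::Default(), graph_path, &graph_def);
    if (!status.ok()) {
        return status;
    }
    // create a session and load the graph into it
    session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
    return (*session)->Create(graph_def);
}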
Also, see the discussion here: Tensorflow Different ways to Export and Run graph in C++
I want to use the DCMTK 3.6.1 library in an existing project that creates DICOM images. I want to use this library because I want to compress the DICOM images. In a new solution (Visual Studio 2013/C++), following the example in the official DCMTK documentation, I have this code, which works properly:
using namespace std;
int main()
{
    DJEncoderRegistration::registerCodecs();
    DcmFileFormat fileformat;
    /**** MONO FILE ******/
    if (fileformat.loadFile("Files/test.dcm").good())
    {
        DcmDataset *dataset = fileformat.getDataset();
        DcmItem *metaInfo = fileformat.getMetaInfo();
        DJ_RPLossless params; // codec parameters, we use the defaults

        // this causes the lossless JPEG version of the dataset
        // to be created (EXS_JPEGProcess14SV1)
        dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);

        // check if everything went well
        if (dataset->canWriteXfer(EXS_JPEGProcess14SV1))
        {
            // force the meta-header UIDs to be re-generated when storing the file
            // since the UIDs in the data set may have changed
            delete metaInfo->remove(DCM_MediaStorageSOPClassUID);
            delete metaInfo->remove(DCM_MediaStorageSOPInstanceUID);
            metaInfo->putAndInsertString(DCM_ImplementationVersionName, "New Implementation Version Name");
            //delete metaInfo->remove(DCM_ImplementationVersionName);
            //dataset->remove(DCM_ImplementationVersionName);

            // store in lossless JPEG format
            fileformat.saveFile("Files/carrellata_esami_compresso.dcm", EXS_JPEGProcess14SV1);
        }
    }
    DJEncoderRegistration::cleanup();
    return 0;
}
Now I want to use the same code in an existing C++ application where
if (infoDicom.arrayImgDicom.GetSize() != 0) //Things of existing previous code
{
    //I have added here the registration
    DJEncoderRegistration::registerCodecs(); // register JPEG codecs
    DcmFileFormat fileformat;
    DcmDataset *dataset = fileformat.getDataset();
    DJ_RPLossless params;

    dataset->putAndInsertUint16(DCM_Rows, infoDicom.rows);
    dataset->putAndInsertUint16(DCM_Columns, infoDicom.columns);
    dataset->putAndInsertUint16(DCM_BitsStored, infoDicom.m_bitstor);
    dataset->putAndInsertUint16(DCM_HighBit, infoDicom.highbit);
    dataset->putAndInsertUint16(DCM_PixelRepresentation, infoDicom.pixelrapresentation);
    dataset->putAndInsertUint16(DCM_RescaleIntercept, infoDicom.rescaleintercept);
    dataset->putAndInsertString(DCM_PhotometricInterpretation, "MONOCHROME2");
    dataset->putAndInsertString(DCM_PixelSpacing, "0.086\\0.086");
    dataset->putAndInsertString(DCM_ImagerPixelSpacing, "0.096\\0.096");

    BYTE* pData = new BYTE[sizeBuffer];
    LPBYTE pSorg;
    for (int nf = 0; nf < iNumberFrames; nf++)
    {
        //this contains all the PixelData and I put it into the dataset
        pSorg = (BYTE*)infoDicom.arrayImgDicom.GetAt(nf);
        dataset->putAndInsertUint8Array(DCM_PixelData, pSorg, sizeBuffer);
        dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
        //and I put it in my data set
        //but this IF returns false, so the canWriteXfer fails...
        if (dataset->canWriteXfer(EXS_JPEGProcess14SV1))
        {
            dataset->remove(DCM_MediaStorageSOPClassUID);
            dataset->remove(DCM_MediaStorageSOPInstanceUID);
        }
        //the saveFile fails too, and the error is "Pixel
        //representation not found", but I have set the Pixel rep with
        //dataset->putAndInsertUint16(DCM_PixelRepresentation, infoDicom.pixelrapresentation);
        OFCondition status = fileformat.saveFile("test1.dcm", EXS_JPEGProcess14SV1);
        DJEncoderRegistration::cleanup();
        if (status.bad())
        {
            int error = 0; //only for test
        }
        thefile.Write(pSorg, sizeBuffer); //previous code
    }
}
Actually I made tests with images that have only one frame, so the for cycle runs only one time. I don't understand why, if I choose dataset->chooseRepresentation(EXS_LittleEndianImplicit, &params); or dataset->chooseRepresentation(EXS_LittleEndianExplicit, &params);, it works perfectly, but not when I choose dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
If I use the same image in the first application, I can compress the image without problems...
EDIT: I think the main problem to solve is that status = dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &rp_lossless) returns "Tag not found". How can I know which tag is missing?
EDIT2: As suggested in the DCMTK forum, I have added the tag for Bits Allocated and now it works for a few images, but not for all of them. For some images I again get "Tag not found": how can I know which one of the tags is missing? As a rule, is it better to insert all the tags?
I solved the problem by adding the tags DCM_BitsAllocated and DCM_PlanarConfiguration. These are the tags that were missing. I hope this is useful for someone.
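For illustration, inserting the two tags before chooseRepresentation could look like the sketch below; the value 16 for Bits Allocated is only an assumption and has to match your actual pixel data:

// assumptions: BitsAllocated must match the pixel data you insert,
// PlanarConfiguration 0 means interleaved (color-by-pixel)
dataset->putAndInsertUint16(DCM_BitsAllocated, 16);
dataset->putAndInsertUint16(DCM_PlanarConfiguration, 0);
dataset->putAndInsertUint8Array(DCM_PixelData, pSorg, sizeBuffer);
dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);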
In any case, you should call the chooseRepresentation function only after you have applied the pixel data:
dataset->putAndInsertUint8Array(DCM_PixelData, pSorg, sizeBuffer);
dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
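If chooseRepresentation still fails, a small sketch like the one below at least surfaces the error text (for example "Tag not found"), which helps narrow down which attribute is missing:

OFCondition cond = dataset->chooseRepresentation(EXS_JPEGProcess14SV1, &params);
if (cond.bad())
{
    // cond.text() returns a human-readable description of the failure
    std::cerr << "chooseRepresentation failed: " << cond.text() << std::endl;
}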
Though I've followed the excellent Protocol Buffer documentation and tutorials for C++ and Python, I can't achieve my goal, which is:
- to serialize data from a C++ process,
- insert it into LevelDB from that same process,
- extract the serialized data from a Python process,
- deserialize it from this same Python process,
- use the deserialized data in Python.
I can serialize my data using protocol buffers in C++ (using a std::string container). I can insert it into LevelDB. But when I levelDB->Get my serialized data, Python seems to recognize it as a string and shows me its raw content; yet whenever I deserialize it into a Python string, it is empty!
Here is how I serialize and insert my data in C++:
int main(int arg, char** argv)
{
    GOOGLE_PROTOBUF_VERIFY_VERSION;

    leveldb::DB* db;
    leveldb::Options options;
    leveldb::Status status;
    tutorial::AddressBook address_book;
    tutorial::Person* person1;
    tutorial::Person* person2;

    options.create_if_missing = true;
    status = leveldb::DB::Open(options, "test_db", &db);
    assert(status.ok());

    person1 = address_book.add_person();
    person1->set_id(1);
    person1->set_name("ME");
    person1->set_email("me@me.com");

    person2 = address_book.add_person();
    person2->set_id(2);
    person2->set_name("SHE");
    person2->set_email("she@she.com");

    std::string test;
    if (!address_book.SerializeToString(&test))
    {
        std::cerr << "Failed to write address book" << std::endl;
        return -1;
    }

    if (status.ok()) status = db->Put(leveldb::WriteOptions(), "Test", test);
And here is how I try to deserialize it in Python:
address_book = addressbook_pb2.AddressBook()
db = leveldb.LevelDB('test_db')
ab = address_book.ParseFromString(db.Get("Test"))
The ab variable's type is NoneType.
Edit:
Before the db.Get(), ab.ByteSize() returns 0, and 76 after the ParseFromString(); I assume it's a type problem then...
Also, ab.ListFields() returns an unexploitable list of the contained fields: it successfully counts two person instances, but doesn't let me access them.
Any clues, any ideas of what I didn't understand or of what I'm doing wrong here?
Many thanks!
Ok, so this was my bad.
I went back to the Protocol Buffers Python documentation, and the fact is that even if the AddressBook object I was retrieving did not show any description, it was still able to be iterated over and even had a __str__() method.
So, if anyone runs into this problem again, just try to explore your Protocol Buffers object using IPython like I did, and you'll find that every one of your proto elements is a field of your object.
Using my example:
ab = address_book.ParseFromString(db.Get('Test'))
address_book.__str__()  # Shows a readable version of my object
for person in address_book.person:  # I'm even able to iterate over any of my fields' values
    print person.id
    print person.name
Try using ' instead of ":
ab = address_book.ParseFromString(db.Get('Test'))