Deploying Caffe RNNs - C++

I'm trying to deploy a Caffe model containing an RNN layer. The issue I'm having is how to compute the output from the network over time. My assumption was that I could call
net->Forward();
to update the network and then
net->output_blobs()[0]->mutable_cpu_data()[x];
once every timestep to read the output. However, with a constant input, running net->Forward() multiple times does not change the output as one would expect from a recurrent layer. I've tried different weights/biases, which changes the output value, but no matter what configuration I use the output stays static across timesteps. Does anyone know the proper procedure for deploying Caffe RNNs with C++?
Edit:
This was tested with a single-neuron RNN layer as shown below.
model.prototxt:
layer {
  name: "input"
  type: "Input"
  top: "states"
  input_param {
    shape: {
      dim: 1
      dim: 1
    }
  }
}
input: "clip"
input_shape { dim: 1 dim: 1 dim: 1 }
layer {
  name: "rnn"
  type: "RNN"
  top: "rnn"
  bottom: "clip"
  bottom: "states"
  recurrent_param {
    num_output: 1
  }
}
And the .cpp:
caffe::Blob<float>* input_layer = test_net->input_blobs()[0];
float* input_data = input_layer->mutable_cpu_data();
input_data[0] = 1.0;
for (int i = 0; i < 5; i++)
{
  test_net->Forward();
  cout << "Output: " << test_net->output_blobs().back()->cpu_data()[0] << endl;
}


(OpenCV / DNN) Face Recognition isn't working - Euclidean distance is always 0

I want to verify whether one person is similar to another, so I want to compute the similarity between two faces.
These are the input faces
Mindy Face
Madonna Face
Now I want to push them through the DNN and then compute the Euclidean distance between the two resulting matrices.
I've used the following OpenFace model: https://storage.cmusatyalab.org/openface-models/nn4.small2.v1.t7
This is my code for calculating the distances:
cv::Mat madonna = cv::imread("/home/der/Madonna_Face.jpg");
cv::Mat mindy = cv::imread("/home/der/Mindy_Face.jpg");
cv::resize(madonna, madonna, cv::Size(96, 96));
cv::resize(mindy, mindy, cv::Size(96, 96));
cv::Mat madonnaBlob = cv::dnn::blobFromImage(madonna, 1.0 / 255, cv::Size(96, 96), cv::Scalar{0,0,0}, true, false);
cv::Mat mindyBlob = cv::dnn::blobFromImage(mindy, 1.0 / 255, cv::Size{96, 96}, cv::Scalar{0,0,0}, true, false);
cv::dnn::Net _net = cv::dnn::readNetFromTorch("/home/der/nn4.small2.v1.t7");
_net.setInput(madonnaBlob);
cv::Mat madonnaMat = _net.forward();
_net.setInput(mindyBlob);
cv::Mat mindyMat = _net.forward();
std::cout << cv::norm(madonnaMat, mindyMat);
But when I do so, the result of cv::norm is 0.
The representations are exactly the same:
std::vector<float> master = madonnaMat;
std::vector<float> slave = mindyMat;
for (int i = 0; i < 128; i++) {
  std::cout << master[i] << " # " << slave[i] << std::endl;
}
Output:
> -0.0865457 # -0.0865457
> 0.133816 # 0.133816
> -0.105774 # -0.105774
> 0.05389 # 0.05389
> -0.00391233 # -0.00391233
> ...
Results:
Madonna Representation: [-0.060358506, 0.14156586, -0.10181303, 0.060315549, 0.0016125928, 0.066964693, -0.044892643, -0.043857966, 0.088271223, 0.047121659, 0.078663647, 0.025775915, 0.062051967, 0.034234334, -0.049976062, 0.045926169, 0.084343202, 0.046965379, -0.092582494, 0.13601208, -0.003582818, -0.15382886, 0.075037867, 0.19894752, -0.041007876, -0.12050319, -0.056161541, -0.018724455, 0.024790274, 0.0092850979, 0.095108159, 0.067354925, 0.06044127, 0.041365273, -0.12024247, 0.18279234, 0.027767293, 0.09874554, -0.16951905, 0.062370241, -0.014530737, 0.015518869, -0.0056175897, -0.066358574, -0.02390888, -0.07608442, 0.13011196, 0.031423025, -0.010443882, 0.12755248, -0.010195011, 0.0051672528, -0.10453289, -0.013270194, 0.096139617, 0.10375636, -0.047089621, 0.050923191, 0.066422582, -0.046726897, -0.1845296, 0.031028474, 0.086226918, -0.27064508, 0.055891197, -0.0053421594, 0.035870265, -0.026942547, -0.17279817, 0.13772435, 0.0071162563, 0.075375959, -0.046405111, 0.12658595, 0.11093359, 0.0030428318, 0.070016958, 0.1725318, -0.056130294, -0.14420295, -0.12438529, 0.056423288, -0.080888703, -0.052004829, -0.06481386, 0.14163122, -0.059617694, -0.026075639, 0.052098148, -0.0055074869, -0.014869845, -0.11943244, 0.068051606, -0.096071519, 0.19727865, -0.016027609, -0.05776047, 0.069935486, -0.020494614, 0.013407955, -0.06065508, -0.056143567, -0.04608072, 0.072748154, -0.035580911, 0.15261506, -0.074352823, -0.081481896, 0.020475708, -0.021581693, -0.16350025, 0.12794609, 0.082243897, 0.015881324, 0.011330541, -0.026391003, 0.086644463, -0.10490314, 0.088207267, 0.17892174, 0.025871141, 0.012454472, 0.010682535, 0.1253885, -0.12909022, 0.082067415, -0.035789803, 0.032903988]
Madonna Size: 1 x 128
Madonna Dims: 2
Mindy Representation: [-0.082645342, 0.14463238, -0.10716592, 0.065654278, 0.0045089996, 0.064019054, -0.047334831, -0.056190431, 0.099919245, 0.048234992, 0.068906084, 0.028518379, 0.057044145, 0.046223734, -0.056203742, 0.033566523, 0.082230642, 0.055683684, -0.080982864, 0.12431844, -0.00075431512, -0.14511517, 0.045022864, 0.20965824, -0.030178605, -0.11852413, -0.066858761, -0.01461118, 0.032898057, 0.02857255, 0.1088237, 0.07066118, 0.044605579, 0.022743503, -0.10785796, 0.20373915, 0.010565795, 0.063950166, -0.18701579, 0.062780239, -0.0042907735, 0.031276166, -0.006556896, -0.038440779, -0.01419229, -0.072688736, 0.13676986, 0.040385362, 0.010314438, 0.095734902, -0.0080824783, 0.011763249, -0.098884396, -0.040797569, 0.10534941, 0.12088351, -0.07317061, 0.063644305, 0.0830286, -0.050620016, -0.18088549, 0.03330183, 0.090282671, -0.25393733, 0.056058947, -0.020288708, 0.049997903, -0.044997148, -0.15860014, 0.15251927, 0.015151619, 0.088731326, -0.028061632, 0.11127418, 0.090425298, 0.0052096732, 0.053858042, 0.18543676, -0.066999368, -0.15851147, -0.11389373, 0.088093147, -0.08713299, -0.048095752, -0.063261949, 0.12453313, -0.051213119, -0.023759408, 0.048403475, -0.012721839, -0.021282939, -0.098075315, 0.066707589, -0.11601795, 0.20438787, -0.015739718, -0.052848384, 0.057336167, -0.01592578, 0.014057826, -0.058749981, -0.043632519, -0.031006066, 0.046038814, -0.065755703, 0.15442967, -0.082077362, -0.099808514, 0.016168201, 0.0046916353, -0.14556217, 0.11152669, 0.062443323, -0.00032889194, 0.0020548289, -0.026999777, 0.096809812, -0.11947374, 0.085579365, 0.16317753, 0.028130196, 0.014577032, 0.0079531483, 0.11340163, -0.15006165, 0.094127603, -0.0440454, 0.033095147]
Mindy Size: 1 x 128
Mindy Dims: 2
Any ideas what I'm doing wrong? Thanks.
I've run into this several times. I couldn't find it explicitly mentioned in the OpenCV documentation, but cv::dnn::Net::forward returns a cv::Mat whose data member always points to the same internal memory. On the second forward pass that memory is overwritten, so madonnaMat and mindyMat both end up referring to the same data.
As @Christoph Rackwitz pointed out, you need to clone the cv::Mat before running the second inference:
_net.setInput(madonnaBlob);
cv::Mat madonnaMat = _net.forward();
madonnaMat = madonnaMat.clone(); // Copy memory
_net.setInput(mindyBlob);
cv::Mat mindyMat = _net.forward();

Parsing a YAML file?

How can I parse the following YAML file using yaml-cpp?
scene:
  - camera:
      film:
        width: 800
        height: 600
        filename: "out.svg"
  - shape:
      name: "muh"
I tried:
#include <yaml-cpp/yaml.h>

int main() {
  YAML::Node root_node = YAML::LoadFile("Scenes/StanfordBunny.flatland.yaml");
  // throws an exception
  int value = root_node["scene"]["camera"]["film"]["width"].as<int>();
}
How can I get the value of the width attribute?
How can I get the name of the shape attribute?
The "-" in front of camera means scene is a sequence of maps, so you have to index into it before descending further. My guess would be:
root_node["scene"][0]["camera"]["film"]["width"].as<int>();
The shape name should then be reachable the same way:
root_node["scene"][1]["shape"]["name"].as<std::string>();

Writing nested maps and sequences to a yaml file using YAML::Emitter

I have been trying to write a YAML file using YAML::Emitter. For instance, I need my YAML file to look like this:
annotations:
  - run:
      type: range based
      attributes:
        start_frame:
          frame_number: 25
        end_frame:
          frame_number: 39
So far, using my code
for (auto element : obj)
{
  basenode = YAML::LoadFile(filePath); // loading a file throws an exception when it is not a valid YAML file
  // Check if metadata is already there
  if (!basenode["file_reference"])
  {
    writeMetaData(element.getGttStream(), element.getTotalFrames(), element.getFileHash());
  }
  annotationNode["annotations"].push_back(element.getGestureName());
  annotationNode["type"] = "range based";
  output << annotationNode;
  attributesNode["attributes"]["start_frame"]["frame_number"] = element.getStartFrame();
  attributesNode["attributes"]["end_frame"]["frame_number"] = element.getEndFrame();
  output << typeNode;
  output << attributesNode;
  ofs.open(filePath, std::ios_base::app);
  ofs << std::endl << output.c_str();
}
I am getting an output like this
annotations:
  - run
type: range based
---
attributes:
  start_frame:
    frame_number: 26
  end_frame:
    frame_number: 57
I want "type" and "attributes" to go under the most recently pushed sequence item in "annotations", and the same for all following elements.
I even tried using something like this
annotationNode[0][type] = "range based"
and the output was like this
0: type: "range based"
How do I get the most recently pushed item in the "annotations" sequence?
If you're building up your root node, annotationNode, then just build it up completely and output it once; you don't need to write typeNode or attributesNode to the emitter at all. To get one sequence item per element, push each per-element node into the "annotations" sequence instead of assigning to it. For example, you might write:
YAML::Node annotationNode;
for (auto element : obj) {
  YAML::Node annotation;
  annotation["name"] = element.getGestureName();
  annotation["type"] = ...;
  annotation["attributes"] = ...;
  annotationNode["annotations"].push_back(annotation);
}
output << annotationNode;
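Building the whole document under a single root node and emitting it once produces one well-formed document rather than several separated by "---". Assuming getGestureName() returns "run" and the attribute maps are filled in as in the question, the result would look roughly like this (a sketch, not verified output):

```yaml
annotations:
  - name: run
    type: range based
    attributes:
      start_frame:
        frame_number: 25
      end_frame:
        frame_number: 39
```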

How can I get layer's top label in c++?

1) Is it possible to get each layer's top labels (e.g. ip1, ip2, conv1, conv2) in C++?
If my layer is
layer {
  name: "relu1_1"
  type: "Input"
  top: "pool1"
  input_param {
    shape: {
      dim: 1
      dim: 1
      dim: 28
      dim: 28
    }
  }
}
I want to get the top label which in my case is "pool1"
I searched the examples provided, but I couldn't find anything. Currently I'm able to get only the layer names and layer types with the following commands:
cout << "Layer name: " << net_->layer_names()[layer_index] << endl;
cout << "Layer type: " << net_->layers()[layer_index]->type() << endl;
2) Where can I find tutorials or examples that explain the most commonly used APIs of the Caffe framework in C++?
Thank you in advance.
Look at Net class in doxygen:
const vector<vector<Blob<Dtype>*>>& all_tops = net_->top_vecs(); // get the "top" blobs of all layers
Blob<Dtype>* ptop = all_tops[layer_index][0]; // pointer to top blob of layer
If you want the layer's name, you can
const string layer_name = net_->layer_names()[layer_index];
You can access all sorts of names/data using net_ interface, just read the doc!

Setting input layer in CAFFE with C++

I'm writing C++ code using CAFFE to predict a single (for now) image. The image has already been preprocessed and is in .png format. I have created a Net object and read in the trained model. Now, I need to use the .png image as an input layer and call net.Forward() - but can someone help me figure out how to set the input layer?
I found a few examples on the web, but none of them work, and almost all of them use deprecated functionality. According to: Berkeley's Net API, using "ForwardPrefilled" is deprecated, and using "Forward(vector, float*)" is deprecated. API indicates that one should "set input blobs, then use Forward() instead". That makes sense, but the "set input blobs" part is not expanded on, and I can't find a good C++ example on how to do that.
I'm not sure if using a caffe::Datum is the right way to go or not, but I've been playing with this:
float lossVal = 0.0;
caffe::Datum datum;
caffe::ReadImageToDatum("myImg.png", 1, imgDims[0], imgDims[1], &datum);
caffe::Blob< float > *imgBlob = new caffe::Blob< float >(1, datum.channels(), datum.height(), datum.width());
//How to get the image data into the blob, and the blob into the net as input layer???
const vector< caffe::Blob< float >* > &result = caffeNet.Forward(&lossVal);
Again, I'd like to follow the API's direction of setting the input blobs and then using the (non-deprecated) caffeNet.Forward(&lossVal) to get the result as opposed to making use of the deprecated stuff.
EDIT:
Based on an answer below, I updated to include this:
caffe::MemoryDataLayer<unsigned char> *memory_data_layer = (caffe::MemoryDataLayer<unsigned char> *)caffeNet.layer_by_name("input").get();
vector< caffe::Datum > datumVec;
datumVec.push_back(datum);
memory_data_layer->AddDatumVector(datumVec);
but now the call to AddDatumVector is segfaulting. I wonder if this is related to my prototxt format? Here's the top of my prototxt:
name: "deploy"
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 100
  dim: 100
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
I base this part of the question on this discussion about a "source" field being important in the prototxt...
caffe::Datum datum;
caffe::ReadImageToDatum("myImg.png", 1, imgDims[0], imgDims[1], &datum);
caffe::MemoryDataLayer<float>* memory_data_layer = (caffe::MemoryDataLayer<float>*)caffeNet.layer_by_name("data").get();
std::vector<caffe::Datum> datumVec;
datumVec.push_back(datum);
memory_data_layer->AddDatumVector(datumVec);
const vector<caffe::Blob<float>*>& result = caffeNet.Forward(&lossVal);
Something like this could be useful. Here you will have to use a MemoryData layer as the input layer; I am expecting that layer to be named data.
The way the datum variable is used may not be correct. If my memory serves, you have to pass a vector of Datum objects.
I think this should get you started.
Happy brewing. :D
Here is an excerpt from my code located here where I used Caffe in my C++ code. I hope this helps.
Net<float> caffe_test_net("models/sudoku/deploy.prototxt", caffe::TEST);
caffe_test_net.CopyTrainedLayersFrom("models/sudoku/sudoku_iter_10000.caffemodel");
// Get datum
Datum datum;
if (!ReadImageToDatum("examples/sudoku/cell.jpg", 1, 28, 28, false, &datum)) {
  LOG(ERROR) << "Error during file reading";
}
// Get the blob
Blob<float>* blob = new Blob<float>(1, datum.channels(), datum.height(), datum.width());
// Get the blobproto
BlobProto blob_proto;
blob_proto.set_num(1);
blob_proto.set_channels(datum.channels());
blob_proto.set_height(datum.height());
blob_proto.set_width(datum.width());
int size_in_datum = std::max<int>(datum.data().size(), datum.float_data_size());
for (int ii = 0; ii < size_in_datum; ++ii) {
  blob_proto.add_data(0.);
}
const string& data = datum.data();
if (data.size() != 0) {
  for (int ii = 0; ii < size_in_datum; ++ii) {
    blob_proto.set_data(ii, blob_proto.data(ii) + (uint8_t)data[ii]);
  }
}
// Set data into blob
blob->FromProto(blob_proto);
// Fill the vector
vector<Blob<float>*> bottom;
bottom.push_back(blob);
float type = 0.0;
const vector<Blob<float>*>& result = caffe_test_net.Forward(bottom, &type);
What about:
Caffe::set_mode(Caffe::CPU);
caffe_net.reset(new caffe::Net<float>("your_arch.prototxt", caffe::TEST));
caffe_net->CopyTrainedLayersFrom("your_model.caffemodel");
Blob<float> *your_blob = caffe_net->input_blobs()[0];
your_blob->set_cpu_data(your_image_data_as_pointer_to_float);
caffe_net->Forward();