Which algorithm is used to train/predict Opencv LBPH face recognizer? - c++

I couldn't understand how the training stage and prediction stage work. Is it using another algorithm, like SVM or k-nearest neighbour, after extracting the LBPH features?

If you check: https://github.com/Itseez/opencv_contrib/blob/master/modules/face/src/lbph_faces.cpp
Then you will see that they use a 1-nearest-neighbour search. Excerpt from the prediction code:
// find 1-nearest neighbor
collector->init((int)_histograms.size(), state);
for (size_t sampleIdx = 0; sampleIdx < _histograms.size(); sampleIdx++) {
    double dist = compareHist(_histograms[sampleIdx], query, HISTCMP_CHISQR_ALT);
    int label = _labels.at<int>((int)sampleIdx);
    if (!collector->collect(label, dist, state)) return;
}
A 1-nearest-neighbour classifier is used since the Local Binary Pattern descriptor is simple enough. For a more in-depth explanation, see the paper "Face Recognition with Local Binary Patterns".
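To illustrate the idea (a minimal sketch only, not OpenCV's actual implementation; histograms, labels and queryHist are hypothetical names for data you would compute yourself from LBP images):

#include <opencv2/imgproc.hpp>
#include <vector>
#include <limits>

int predictNearestNeighbour(const std::vector<cv::Mat>& histograms,
                            const std::vector<int>& labels,
                            const cv::Mat& queryHist)
{
    double bestDist = std::numeric_limits<double>::max();
    int bestLabel = -1;
    for (size_t i = 0; i < histograms.size(); i++) {
        // Chi-square distance between spatial LBP histograms
        double dist = cv::compareHist(histograms[i], queryHist, cv::HISTCMP_CHISQR_ALT);
        if (dist < bestDist) {
            bestDist = dist;
            bestLabel = labels[i];
        }
    }
    return bestLabel; // label of the closest training histogram
}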
On a side note: this is not really an implementation/practical question, so it does not really belong on this forum. I would suggest using the OpenCV forum.

Related

Parse Individual Curves from General_polygon_set_2 in CGAL

To start, I want to thank everyone who has helped me so far on previous problems I have had with working through the CGAL Library, it is greatly appreciated.
Background on myself: I am still very new to C++ and my coding experience is in MATLAB, so there are a lot of concepts that I am learning very quickly and that are therefore very new to me; please excuse any erroneous language I may use with regard to C++.
The Problem:
I have recently written some code that finds the Minkowski sum of a polyline and a circle (i.e., the buffer of a polyline), based on the code found in the documentation of Boolean Set Operations on General Polygons.
Here, a General_polygon_set_2 is used for the output, and if the output code from the example above is used, I get the following printout of a Polygon_with_holes_2:
48 [775.718 -206.547 --> 769.134 -157.991] (769 -157 1 1) [769.134 -157.991 --> 770 -157] (769 -157 1 1) [770 -157 --> 768.866 -156.009] [768.866 -156.009 --> 762.282 -107.453] [762.282 -107.453 --> 703.282 -115.453] [703.282 -115.453 --> 708.072 -150.778] ...
7 15 [549.239 -193.612 --> 569.403 -216.422] ... 3 [456.756 -657.812 --> 657.930 908.153] ...
Here, if I understand correctly, the first integer refers to the number of vertices in the .outer_boundary(), followed by descriptions of the curves for each "edge" of the general polygon. In my problem, the outputs will only consist of linear segments and circular arcs.
Linear: [775.718 -206.547 --> 769.134 -157.991]
Circular Arc (x-monotone): (769 -157 1 1) [769.134 -157.991 --> 770 -157]
The linear element is simple: go from this x-y coordinate to the other one along a line. The circular arc is a little different: it says to use the circle described by the arguments in the parentheses () to go from one x-y coordinate to the other contained in the brackets []. The arguments to the circle are: (x, y, radius, orientation).
Next, since we have holes, after the .outer_boundary() has been written out, two more integers are displayed. The first one states the number of holes, the second states the number of vertices in the first hole, followed by the vertices of that hole. Then, once that hole is written out, another integer gives the number of vertices in the next hole, and this continues for all of the holes, completing the description of the polygon.
So with that, my current problem is parsing out each individual curve one at a time so that I can do operations on them.
I have the following functions from the documentation to work with:
.outer_boundary(): returns the general polygon that represents the outer boundary.
.holes_begin(): returns the begin iterator of the holes.
.holes_end(): returns the past-the-end iterator of the holes.
So my thought is to break the General_polygon_set_2 to General_polygon_2, then break that down into the .outer_boundary() and the different holes. Finally, for each set of curves, break those down into individual curves.
I am not really sure how to go about this, I just know that I need individual curve data so I can do my own operations on them. Any help, will be, as always, greatly appreciated!
Note: I actually deleted this post after reading through the arrangements documentation, thinking that this was too obvious of an answer, but after some time I still really do not see how to pull this info out properly. I think the biggest issue is my lacking knowledge of C++. Sorry about this being a noob-ish question.
Solution in Progress:
list<Polygon_with_holes_2> res;
S.polygons_with_holes(back_inserter(res));
list<Polygon_with_holes_2>::iterator i = res.begin();
Polygon_with_holes_2 mink = *i;

// Outer boundary of the first polygon-with-holes
minkOuter = mink.outer_boundary();
cout << minkOuter << endl;

// The hole iterator is not guaranteed to be random-access, so use
// std::distance rather than subtracting iterators.
int numHoles = std::distance(mink.holes_begin(), mink.holes_end());
cout << numHoles << endl;
Now I am working on isolating the holes, followed by breaking those down into each individual curve.
The doc here states that the value_type of a Hole_const_iterator is a General_polygon_2, which means you can iterate through all the holes using holes_begin() and holes_end(), like you thought. To do that, use the following syntax:
for (auto h_it = mink.holes_begin(); h_it != mink.holes_end(); ++h_it)
{
    // Here h_it is an iterator with value type General_polygon_2, so *h_it
    // is the polygon describing one hole. Every step of this loop gives you
    // another hole.
}
Then, you can iterate the curves of each polygon with curves_begin() and curves_end() the same way.
So to iterate each curve of every hole in a polygon_with_holes:
for (auto h_it = mink.holes_begin(); h_it != mink.holes_end(); ++h_it)
{
    for (auto curve_it = h_it->curves_begin(); curve_it != h_it->curves_end(); ++curve_it)
    {
        // *curve_it gives you a curve.
    }
}
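Putting the pieces together, here is a sketch of iterating every individual curve, outer boundary included (assuming the typedefs from the Minkowski-sum example, so each curve is an x-monotone line segment or circular arc):

// Curves of the outer boundary
const auto& outer = mink.outer_boundary();
for (auto c_it = outer.curves_begin(); c_it != outer.curves_end(); ++c_it)
{
    // *c_it is one curve of the outer boundary (segment or circular arc)
    std::cout << *c_it << std::endl;
}
// Curves of every hole
for (auto h_it = mink.holes_begin(); h_it != mink.holes_end(); ++h_it)
{
    for (auto c_it = h_it->curves_begin(); c_it != h_it->curves_end(); ++c_it)
    {
        std::cout << *c_it << std::endl;   // one curve of the current hole
    }
}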

extracting output data from typed_output_tensor in TFlite

Thanks in advance for your support.
I'm trying to get the output tensor after inference with a .tflite U-Net neural network. I'm using the TensorFlow Lite image classification code as a baseline.
I need to adapt the code for a segmentation task. My question is: how can I access the output of the model after inference (which is 128x128x1) and write the result into an image?
I already debugged the code and explored many different approaches. Unfortunately, I'm not confident with the C++ language. What I found is that the call interpreter->typed_output_tensor<float>(0) should be what I need, as also referenced here: https://www.tensorflow.org/lite/guide/inference#loading_a_model. However, I cannot access the 128x128 tensor generated by the network.
You can find the code at the address: https://github.com/tensorflow/tensorflow/blob/770481fb3e9126f9a29db5667f528e450d54d719/tensorflow/lite/examples/label_image/label_image.cc
The interesting part is here (lines 217-224):
const float threshold = 0.001f;
std::vector<std::pair<float, int>> top_results;
int output = interpreter->outputs()[0];
TfLiteIntArray* output_dims = interpreter->tensor(output)->dims;
// assume output dims to be something like (1, 1, ... ,size)
auto output_size = output_dims->data[output_dims->size - 1];
I expect the values to be saved into an image, or some alternative way of saving the output tensor.
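For what it's worth, here is a minimal sketch of one way to do this, assuming the output tensor is at index 0 with shape (1, 128, 128, 1) and float values in [0, 1], and using OpenCV to write the image (these are assumptions, not something the label_image example ships with):

// Sketch only: drop in after interpreter->Invoke() in label_image.cc.
// Needs <opencv2/core.hpp> and <opencv2/imgcodecs.hpp>.
float* output = interpreter->typed_output_tensor<float>(0);
cv::Mat mask(128, 128, CV_32FC1, output);   // wraps the output buffer, no copy
cv::Mat mask8u;
mask.convertTo(mask8u, CV_8UC1, 255.0);     // scale [0, 1] -> [0, 255]
cv::imwrite("segmentation_mask.png", mask8u);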

Problems with implementing approximate (feature-based) Q-learning

I am new to reinforcement learning. I recently learned about approximate Q-learning, or feature-based Q-learning, in which you describe states by features to save space. I have tried to implement this in a simple grid game. Here, the agent is supposed to learn not to go into a firepit (signaled by an f) and instead to eat up as many dots as possible. Here is the grid used:
...A
.f.f
.f.f
...f
Here A signals the agent's starting location. When implementing this, I set up two features. One was 1/((distance to closest dot)^2), and the other was (distance to firepit) + 1. When the agent enters a firepit, the program returns a reward of -100. If it goes to a non-firepit position that was already visited (and thus there is no dot to be eaten), the reward is -50. If it goes to an unvisited dot, the reward is +500. In the above grid, no matter what the initial weights are, the program never learns the correct weight values. Specifically, in the output, the first training session gains a score (how many dots it ate) of 3, but for all other training sessions the score is just 1, and the weights converge to incorrect values of -125 for weight 1 (distance to firepit) and 25 for weight 2 (distance to unvisited dot). Is there something specifically wrong with my code, or is my understanding of approximate Q-learning incorrect?
I have tried to play around with the rewards that the environment is giving and also with the initial weights. None of these have fixed the problem.
Here is the link to the entire program: https://repl.it/repls/WrongCheeryInterface
Here is what is going on in the main loop:
while (points != NUMPOINTS) {
    bool playerDied = false;
    if (!start) {
        if (!atFirepit()) {
            r = 0;
            if (visited[player.x][player.y] == 0) {
                points += 1;
                r += 500;
            } else {
                r += -50;
            }
        } else {
            playerDied = true;
            r = -100;
        }
    }

    // Update visited
    visited[player.x][player.y] = 1;

    if (!start) {
        // This is based off the q-learning update formula
        pairPoint qAndA = getMaxQAndAction();
        double maxQValue = qAndA.q;
        double sample = r;
        if (!playerDied && points != NUMPOINTS)
            sample = r + (gamma2 * maxQValue);
        double diff = sample - qVal;
        updateWeights(player, diff);
    }

    // Checking end-game condition
    if (playerDied || points == NUMPOINTS) break;

    pairPoint qAndA = getMaxQAndAction();
    qVal = qAndA.q;
    int bestAction = qAndA.a;

    // Update player and q value
    player.x += dx[bestAction];
    player.y += dy[bestAction];

    start = false;
}
I would expect that both weights would still be positive, but one of them is negative (the one giving distance to the firepit).
I also expected the program to learn over time that it is bad to enter a firepit and also bad, but not as bad, to revisit a position whose dot has already been eaten.
Probably not the answer you want to hear, but:
Have you tried to implement the simpler tabular Q-learning before approximate Q-learning? In your setting, with a few states and actions, it will work perfectly. If you are learning, I strongly recommend you start with the simpler cases in order to get a better understanding/intuition about how Reinforcement Learning works.
Do you know the implications of using approximators instead of learning the exact Q function? In some cases, due to the complexity of the problem (e.g., when the state space is continuous), you should approximate the Q function (or the policy, depending on the algorithm), but this may introduce convergence problems. Additionally, in your case, you are trying to hand-pick some features, which usually requires deep knowledge of the problem (i.e., the environment) and of the learning algorithm.
Do you understand the meaning of the hyperparameters alpha and gamma? You cannot choose them randomly. Sometimes they are critical to obtaining the expected results, though not always; it depends heavily on the problem and the learning algorithm. In your case, taking a look at the convergence curve of your weights, it's pretty clear that you are using a value of alpha that is too high. As you pointed out, after the first training session your weights remain constant.
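For reference, this is where alpha and gamma enter the approximate Q-learning update. The snippet below is a generic sketch; the names (weights, features) are illustrative and not taken from the linked program:

#include <vector>
#include <cstddef>

// Q(s, a) is approximated as a weighted sum of features:
//   Q(s, a) = w_1 * f_1(s, a) + w_2 * f_2(s, a) + ...
// After observing reward r and next state s':
//   diff = (r + gamma * max_a' Q(s', a')) - Q(s, a)
//   w_i  = w_i + alpha * diff * f_i(s, a)
void updateWeightsSketch(std::vector<double>& weights,
                         const std::vector<double>& features,
                         double diff, double alpha)
{
    for (std::size_t i = 0; i < weights.size(); ++i)
        weights[i] += alpha * diff * features[i];
}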
Therefore, practical recommendations:
Be sure to solve your grid game using a tabular Q-learning algorithm before trying more complex things (see the sketch after this list).
Experiment with different values of alpha, gamma and rewards.
Read more in depth about approximate RL. A very good and accessible book (starting from zero knowledge) is the classic Sutton and Barto book, Reinforcement Learning: An Introduction, which you can obtain for free and which was updated in 2018.
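As a sketch of the first recommendation (illustrative only; the state encoding below ignores the visited-dots part of the game state, which the real task would also need to include):

#include <algorithm>

const int W = 4, H = 4, NUM_ACTIONS = 4;
double Q[W * H][NUM_ACTIONS] = {};                 // Q-table, initialised to 0

int stateIndex(int x, int y) { return y * W + x; }

// One tabular Q-learning update after taking `action` in (x, y),
// landing in (nx, ny) with reward r.
void qUpdate(int x, int y, int action, int nx, int ny, double r,
             double alpha, double gamma, bool terminal)
{
    double maxNext = 0.0;
    if (!terminal)
        maxNext = *std::max_element(Q[stateIndex(nx, ny)],
                                    Q[stateIndex(nx, ny)] + NUM_ACTIONS);
    int s = stateIndex(x, y);
    Q[s][action] += alpha * (r + gamma * maxNext - Q[s][action]);
}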

How to port the MATLAB libSVM parameters in C++

In my cross-validation in MATLAB with libSVM I found that these are the best parameters to use:
model = svmtrain( labels, training, '-s 0 -t 2 -c 10000 -g 100');
Now I want to replicate the classification in C++ with OpenCV.
But I do not understand how to set the C++ parameters to be the same as MATLAB:
Based on this documentation I tried the following:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;
//params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 10000, 1e-6);
params.Cvalue = 10000;
params.gamma = 100;
CvSVM SVM;
SVM.train(train, labels, Mat(), Mat(), params);
but I get this error:
error: no member named 'Cvalue' in 'CvSVMParams' params.Cvalue = 10000;
Last thing, should I uncomment
//params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 10000, 1e-6);
and try other values or is it not important?
Because I can't even work out how to set the same parameters in MATLAB.
Not every parameter has an exact equivalent when porting from LibSVM in MATLAB to OpenCV's SVM. The termination criteria are one of them. Keep in mind that OpenCV's SVM might have some bugs depending on the version you use (not an issue with the latest version).
You should uncomment the line to have better control over your termination criteria. This line says that the algorithm should stop after 10000 iterations. If you use CV_TERMCRIT_EPS, it will stop when a precision below the specified value (for you, 1e-6) is achieved. Use both stopping criteria, and it will stop when either of them is met.
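As a sketch, assuming the pre-3.0 OpenCV C API that the question uses: the member the compiler is complaining about is called C, not Cvalue, so something like this should be closer to the MATLAB call:

CvSVMParams params;
params.svm_type    = CvSVM::C_SVC;
params.kernel_type = CvSVM::RBF;
params.C           = 10000;   // matches -c 10000
params.gamma       = 100;     // matches -g 100
// Stop after 10000 iterations or when the change falls below 1e-6,
// whichever happens first.
params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 10000, 1e-6);

CvSVM SVM;
SVM.train(train, labels, Mat(), Mat(), params);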
Alternatively, you could also try using LibSVM for C++ by linking it as a library. This will give you the exact same algorithms and functions that you are using in MATLAB.

OpenCV (C++) using SIFT descriptors increases the number of detected features?

I have a somewhat confusing situation while using the SIFT descriptor implementation from OpenCV.
I'm trying to test various feature detector + descriptor calculation methods, so I am using a combination of cv::FeatureDetector and cv::DescriptorExtractor interfaces which allow me to simply change between different detector methods and descriptors.
When calling cv::DescriptorExtractor::compute(...) (the variant for a single image), the documentation says that it is possible for the number of key points given to the algorithm to decrease if it is impossible to calculate their descriptors, and I understand how and why that is done.
But what happens to me is that the number of key points after the descriptor computation actually increases. It is clearly so, and I'm not trying to stop it from happening; I am just hoping for an explanation as to why (just an intuitive description would be cool, although I appreciate anything more than that).
I've got layers upon layers of wrappers around the actual OpenCV that don't have any code (just setting up some local non-OpenCV flags), so here's the OpenCV code that's being called at the bottom of it all:
cv::Ptr<cv::FeatureDetector> dect = cv::FeatureDetector::create("MSER");
cv::Mat input = cv::imread("someImg.ppm", 0);
std::vector<cv::KeyPoint> keypoints;
dect->detect(input, keypoints);
cv::Ptr<cv::DescriptorExtractor> deEx = cv::DescriptorExtractor::create("SIFT");
std::cout << "before computing, feats size " << keypoints.size() << std::endl;
// code to print out 10 features
cv::Mat desc;
deEx->compute(input, keypoints, desc);
std::cout << "after computing, feats size " << keypoints.size() << std::endl;
// code to print out 10 features
I've printed out the first 10 key points just before and after the descriptor calculations, so here are some concrete numbers as an example:
before computing, feats size 379
feat[0]: 10.7584 39.9262 176.526 0 12.5396
feat[1]: 48.2209 207.904 275.091 0 11.1319
feat[2]: 160.894 313.781 170.278 0 9.63786
feat[3]: 166.061 239.115 158.33 0 19.5027
feat[4]: 150.043 233.088 171.887 0 11.9569
feat[5]: 262.323 322.173 188.103 0 8.65429
feat[6]: 189.501 183.462 177.396 0 12.3069
feat[7]: 218.135 253.027 171.763 0 123.069
feat[8]: 234.508 353.236 173.281 0 11.8375
feat[9]: 234.404 394.079 176.23 0 8.99652
after computing, feats size 463
feat[0]: 10.7584 39.9262 13.1313 0 12.5396
feat[1]: 48.2209 207.904 69.0472 0 11.1319
feat[2]: 48.2209 207.904 107.438 0 11.1319
feat[3]: 160.894 313.781 9.57937 0 9.63786
feat[4]: 166.061 239.115 166.144 0 19.5027
feat[5]: 150.043 233.088 78.8696 0 11.9569
feat[6]: 262.323 322.173 167.259 0 8.65429
feat[7]: 189.501 183.462 -1.49394 0 12.3069
feat[8]: 218.135 253.027 -117.067 3 123.069
feat[9]: 218.135 253.027 7.44055 3 123.069
I can see from this example that the original feat[1] and feat[7] have been split into two new key points each, but I do not see any logical explanation for the compute method doing that :(
The printout I have given here is from using MSER for detection of keypoints and then calculating SIFT descriptors, but the same increase in size also happens with STAR, SURF, and SIFT (i.e. DoG) keypoints. I didn't try to change the SIFT descriptor into something else, but if someone thinks it's relevant to the question, I'll try it and edit my question.
First of all, as you can see in the documentation, cv::DescriptorExtractor::compute takes a std::vector<cv::KeyPoint> argument which is non-const. This means that the vector can be modified by cv::DescriptorExtractor::compute.
In practice, KeyPointsFilter::runByImageBorder and KeyPointsFilter::runByKeypointSize (two non-const functions) will be applied to the vector and will remove the keypoints for which a descriptor cannot be computed. No re-extraction of keypoints is done.
You should post the few lines of code you are using for further diagnostic.
--
Well, I finally found where the problem occurs: the cv::SiftDescriptorExtractor::compute method calls SIFT::operator(), which (re)calculates the orientation of the features and also duplicates the points that have several dominant orientations.
A solution could be to change descriptorParams.recalculateAngles to false.
Looks like it is due to OpenCV using Rob Hess's SIFT implementation, which sometimes duplicates the keypoints with more than one dominant orientation.
Looking around the OpenCV reported bugs did the trick; the issue was reported here.
It is not a bug; the behavior was not corrected in the newer versions but instead just documented. Since I am tied to the OpenCV version I am using right now (v2.1), it did not occur to me to look at the newer documentation for additional behavior, since the behavior described in the old one made sense to me.
This is not a bug but by design:
SIFT returns multiple interest points at the same location with different orientations if there is not clearly a single dominant orientation. Usually, up to three (depending on the actual image patch) orientations are estimated.
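If you want to verify this in your own run, here is a small sketch (not OpenCV code, just a check) that counts the locations carrying more than one orientation after compute(); it assumes the keypoints vector from the question's snippet:

#include <map>
#include <utility>
#include <iostream>

std::map<std::pair<float, float>, int> counts;
for (size_t i = 0; i < keypoints.size(); ++i)
    counts[std::make_pair(keypoints[i].pt.x, keypoints[i].pt.y)]++;

int duplicated = 0;
for (std::map<std::pair<float, float>, int>::const_iterator it = counts.begin();
     it != counts.end(); ++it)
    if (it->second > 1)
        duplicated++;   // this (x, y) location appears with several orientations
std::cout << duplicated << " locations carry more than one orientation" << std::endl;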