Opencv cascade classifier with SVM as weak learner - c++

I'm working on a project related to people detection. I successfully implemented both a HOG + SVM classifier (with libSVM) and a cascade classifier (with OpenCV). The SVM classifier works really well: I tested it on a number of videos and it detects people correctly, with very few false positives and false negatives. The problem is the computation time: roughly 1.2-1.3 s over the entire image and 0.2-0.4 s over the foreground patches. Since the project has to run in a near real-time environment, I switched to the cascade classifier to reduce the computation time.
So I trained many different cascade classifiers with OpenCV (opencv_traincascade). The result is good in terms of computation time (0.2-0.3 s over the entire image, much less when run only on the foreground), so that goal is achieved, let's say. The problem now is detection quality: I'm getting a lot of false positives and a lot of false negatives. Since the only difference between the two methods is the base classifier used in OpenCV (decision trees or decision stumps, but no SVM as far as I understand), I'm starting to think that the base classifier could be my problem (I guess HOG features are best separated by hyperplanes).
Of course, the dataset used with libSVM and OpenCV is exactly the same, both for training and for testing. For the sake of completeness, I used nearly 9 thousand positive samples and nearly 30 thousand negative samples.
Here are my two questions:
Is it possible to change the base weak learner in opencv_traincascade? If yes, is the SVM one of the possible choices? If both answers are yes, how can I do such a thing? :)
Are there other computer vision or machine learning libraries that implement the SVM as a weak classifier and have methods to train a cascade classifier? (Are these libraries suitable for use in conjunction with OpenCV?)
thank you in advance as always!
Marco.

In principle a weak classifier can be anything, but the strength of AdaBoost-related methods is that they get good results out of simple classifiers (they are called “weak” for a reason).
Using an SVM inside an AdaBoost cascade is a contradiction: the former has no need for such a framework, since it can do its job by itself, and the latter is fast precisely because it relies on weak classifiers.
Furthermore, I don’t know of any study about it and OpenCV doesn’t support it: you would have to write the code yourself. It is a huge undertaking and you probably won’t get any interesting result.
Anyway, if you think that HOG features are better suited to your task, OpenCV’s traincascade has an option for them, in addition to Haar and LBP.
As to your second question, I’m not certain, but I’m fairly confident that the answer is no.
My advice is: try to get the most you can out of traincascade. For example, try increasing the number of samples if you can and compare the results.

This paper is quite good. It simply says that an SVM can be treated as a weak classifier if you train it on fewer samples (say, less than half of the training set). The higher a sample's weight, the more likely it is to be picked for training the 'weak SVM'.
Unfortunately the source code is not widely available. If you want a quick prototype, use Python's scikit-learn and see whether you get the results you want before modifying OpenCV.
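To make the sampling idea concrete, here is a minimal sketch of just the subset-selection step, assuming samples are drawn with replacement in proportion to their current boosting weights. The function name `sample_weak_subset` and its parameters are made up for illustration, and the SVM training itself is left out; this is not the paper's code.

```python
import random

def sample_weak_subset(weights, subset_size, rng=None):
    """Pick training indices for one 'weak SVM'.

    Indices are drawn with replacement, with probability proportional to
    the current boosting weights, so heavily weighted (hard) samples are
    more likely to end up in the subset.
    """
    rng = rng or random.Random()
    return rng.choices(range(len(weights)), weights=weights, k=subset_size)

# Toy example: 10 samples, only samples 2 and 3 carry any weight.
weights = [0.0, 0.0, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Use fewer than half of the samples, as suggested above.
subset = sample_weak_subset(weights, subset_size=4, rng=random.Random(0))
```

The weak SVM would then be trained only on the rows listed in `subset`, and the weights updated as in ordinary AdaBoost before the next round.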

Related

What are the main rates and values we should compute to evaluate feature detection, description and matching?

I work on palmprint recognition using the features2d module of the OpenCV library, and I use algorithms such as SIFT, SURF, ORB... to detect features and to extract and match descriptors. My tests include 1-vs-1 palmprint matching and also 1-vs-database matching.
Once I get the results, I need to evaluate the algorithm. I know there are rates or scores (like EER, rank-1 identification, recall and accuracy) that estimate how successful a method is. I need to know whether any of them are implemented in OpenCV and how to use them. If they aren't, what are the formulas used in the literature?
As far as I know there is little implemented in OpenCV. A common way is to store the results (e.g. in JSON) and process those with other programs such as Matlab or Python. This also allows you to change the evaluation without the need to recompute the algorithms.
There is no overall best method to show the results. It always depends on what you want to show. In my opinion ROC is the best way to express your output. It is also very widely used in research.
If you insist on doing it in C++, then you could use:
Roceasy or
DLIB

Why does OpenCV SVM predict a different result for the same object after training twice? [duplicate]

I am using a multi-dimensional SVM classifier (SVM.NET, a wrapper for libSVM) to classify a set of features.
Given an SVM model, is it possible to incorporate new training data without having to recalculate on all previous data? I guess another way of putting it would be: is an SVM mutable?
Actually, it's usually called incremental learning. The question has come up before and is pretty well answered here : A few implementation details for a Support-Vector Machine (SVM).
In brief, it's possible but not easy, you would have to change the library you are using or implement the training algorithm yourself.
I found two possible solutions, SVMHeavy and LaSVM, that support incremental training. But I haven't used either and don't know anything about them.
Online and incremental learning are similar but differ slightly. Online learning is generally a single pass (epoch=1), or the number of epochs can be configured. Incremental learning means that you already have a model, no matter how it was built, and that the model can then be updated with new examples. Often a combination of online and incremental learning is what is required.
Here is a list of tools with some remarks on the online and/or incremental SVM : https://stats.stackexchange.com/questions/30834/is-it-possible-to-append-training-data-to-existing-svm-models/51989#51989
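To illustrate what "mutable by new examples" can look like, here is a small sketch of a linear SVM trained by stochastic gradient descent on the regularised hinge loss. Note that this is a different technique from the solver in libSVM or SVM.NET and only covers the linear case; the class name `OnlineLinearSVM` and its parameters are invented for the example.

```python
class OnlineLinearSVM:
    """Linear SVM updated one example at a time (SGD on the hinge loss)."""

    def __init__(self, n_features, lr=0.1, reg=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr    # learning rate
        self.reg = reg  # L2 regularisation strength

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, x, y):
        """One update step; y must be +1 or -1."""
        margin = y * self.decision(x)
        # Regularisation shrinks the weights a little on every step.
        self.w = [wi * (1 - self.lr * self.reg) for wi in self.w]
        if margin < 1:  # example is misclassified or inside the margin
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1

model = OnlineLinearSVM(n_features=2)

# Initial training: a few passes over the first batch.
first_batch = [([2.0, 0.0], 1), ([-2.0, 0.0], -1)]
for _ in range(20):
    for x, y in first_batch:
        model.partial_fit(x, y)

# Later: new examples arrive and only they are fed to the existing model.
new_batch = [([3.0, 1.0], 1), ([-3.0, -1.0], -1)]
for _ in range(5):
    for x, y in new_batch:
        model.partial_fit(x, y)
```

The old examples are never revisited in the second loop, which is the property being asked about; kernel SVMs need more machinery to get the same behaviour.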

Stanford classifier - Why?

Given that the Stanford Classifier is relatively new, what added value does it offer to users of Weka or RapidMiner working on text ML?
I'm not sure the Stanford classifier qualifies as "new" -- but, in my (admittedly biased) experience it's quite fast and robust at the types of classification problems we often encounter in NLP. That is, in situations where you have a lot of sparse indicator features (e.g., bag of words), but relatively few features fire per example (< 100 or so). On these problems, it is orders of magnitude faster than Weka. I don't have any personal experience with RapidMiner, so I can't say much in the way of comparison there.

choosing kernel for digit recognition in C

I'm trying to classify digits read on images at known positions in C++, using SVM.
For that, I sample over a rectangle at the known position of the digit, and I train with a ground_truth.
I wonder how to choose the kernel of the SVM. I use the default linear kernel, but my intuition tells me that it might not be the best choice.
How could I choose the kernel?
You will need to tune the kernel (if you use a nonlinear one). This guide may be useful for you: A practical guide to SVM classification
Unfortunately there is no magic bullet for this, so experimentation is your best friend.
I would probably start with RBF, which tends to work decently in most cases. I agree with your intuition that linear is probably not the best choice, although sometimes (especially when you have tons of data) it can give you pleasant surprises :)
The problem I have found with RBF is that it tends to overfit the training set. This stops being an issue if you have a lot of data, but then a new problem arises: it scales poorly and training becomes slow on big data.
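For the tuning itself, the guide mentioned above suggests (as far as I recall) a coarse grid of exponentially growing values, C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3, scored by cross-validation. A rough sketch of that loop follows; `evaluate` is a placeholder for whatever trains your SVM and returns a cross-validation accuracy, and the dummy scorer at the end only exists to make the example runnable.

```python
import itertools
import math

def log2_grid(first_exp, last_exp, step):
    """Powers of two from 2**first_exp to 2**last_exp (inclusive)."""
    return [2.0 ** e for e in range(first_exp, last_exp + 1, step)]

C_VALUES = log2_grid(-5, 15, 2)      # 2^-5, 2^-3, ..., 2^15
GAMMA_VALUES = log2_grid(-15, 3, 2)  # 2^-15, 2^-13, ..., 2^3

def grid_search(evaluate, c_values=C_VALUES, gamma_values=GAMMA_VALUES):
    """Return the (C, gamma) pair with the highest evaluate(C, gamma)."""
    return max(itertools.product(c_values, gamma_values),
               key=lambda pair: evaluate(*pair))

# Stand-in scorer that peaks at C = 2^3, gamma = 2^-5.
# Replace it with real cross-validation on your digit samples.
def dummy_cv_accuracy(c, gamma):
    return -(abs(math.log2(c) - 3) + abs(math.log2(gamma) + 5))

best_c, best_gamma = grid_search(dummy_cv_accuracy)
```

Once the coarse grid points to a promising region, a finer grid around it is the usual next step.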

Support Vector Machine works in Matlab, doesn't work in C++

I'm writing an application that uses an SVM to do classification on some images (specifically these). My Matlab implementation works really well. Using a SIFT bag-of-words approach, I'm able to get near 100% accuracy with a linear kernel.
I need to implement this in C++ for speed/portability reasons, and so I've tried using both libsvm and dlib. I've tried multiple SVM types (c_svm, nu_svm, one_class) and multiple kernels (linear, polynomial, rbf). The best I've been able to achieve is around 50% accuracy - even on the same samples that I've trained on. I've confirmed that my feature generators are working, because when I export my c++-generated features to Matlab and train on those, I'm able to get near-perfect results again.
Is there something magical about Matlab's SVM implementation? Are there any common pitfalls or areas that I might look into that would explain the behavior I'm seeing? I know this is a little vague, but part of the problem is that I don't know where to go. Please let me know in the comments if there is other info I can provide that would be helpful.
There is nothing magical about the Matlab version of the libraries, other than that it runs in Matlab, which makes it harder to shoot yourself in the foot.
A checklist:
Are you normalizing your data, making all values lie between 0 and 1 (or between -1 and 1), either linearly or using the mean and the standard deviation?
Are you searching for a good value of C (or C and gamma in the case of an RBF kernel)? Doing cross-validation or using a hold-out set?
Are you sure that you're handling NaN and all other floating-point nastiness? Matlab is very good at hiding this from you, C++ not so much.
Could it be that you're loading your data incorrectly, reading a "%s" into a double or something that is adding noise to your input data?
Could it be that libsvm/dlib expects the data in row-major order and you're sending it in column-major (or the other way around)? Again, Matlab makes this almost impossible, C++ not so much.
32/64-bit nastiness: one version of the library, executable compiled with the other?
Some other things:
Could it be that in Matlab you're somehow leaking the class (y) into the preprocessing? No one does this on purpose, but I've seen it happen. If you make almost any f(y) a feature, you'll get almost 100% every time.
Sometimes it helps to verify that everything is numerically identical by printing to a file before training, both in C++ and in Matlab.
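Two of the checks above (scaling and NaN handling) are quick to prototype on exported feature files before touching the C++ code. The sketch below assumes dense feature rows as lists of floats; the helper names are made up. The important detail is that the min/max ranges come from the training set only and are reused unchanged for test data.

```python
import math

def fit_minmax(rows):
    """Per-feature (min, max) computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_minmax(row, ranges, lo=-1.0, hi=1.0):
    """Linearly map each feature into [lo, hi] using the training ranges."""
    scaled = []
    for value, (mn, mx) in zip(row, ranges):
        if mx == mn:
            scaled.append(0.0)  # constant feature carries no information
        else:
            scaled.append(lo + (hi - lo) * (value - mn) / (mx - mn))
    return scaled

def find_bad_values(rows):
    """Return (row, column) positions of NaN or infinite entries."""
    return [(i, j)
            for i, row in enumerate(rows)
            for j, value in enumerate(row)
            if not math.isfinite(value)]

train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
ranges = fit_minmax(train)
scaled_sample = apply_minmax([5.0, 30.0], ranges)

suspicious = find_bad_values([[1.0, float("nan")], [float("inf"), 2.0]])
```

Running the same checks on the features dumped from Matlab and from C++ makes it easier to see where the two pipelines diverge.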
I'm very happy with libsvm using the RBF kernel. carlosdc pointed out the most common errors in the correct order :-). For libsvm: did you use the Python tools shipped with libsvm? If not, I recommend doing so. Write your feature vectors to a file (from Matlab and/or C++) and run a meta-training for the RBF kernel with easy.py. You get the parameters and a prediction for the generated model. If this prediction is OK, continue with C++. From training you also get a scaled feature file (min/max transformed to -1.0/1.0 for every feature). Compare these to your C++ implementation as well.
Some libsvm issues: a nasty habit is (if I remember correctly) that values scaling to 0 (zero) are omitted in the scaled file. In grid.py there is a parameter "nr_local_worker" which defines the number of threads. You might wish to increase it.
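For reference, the text format those tools read is one sample per line, `label index:value index:value ...`, with 1-based feature indices, and entries equal to zero may simply be left out. A small sketch of writing and reading such lines follows (the function names are illustrative); the reader fills the omitted entries back in with zeros, which matters when comparing a scaled file against your own C++ output.

```python
def to_libsvm_line(label, features):
    """Format one sample as 'label index:value ...', skipping zeros."""
    parts = [str(label)]
    for index, value in enumerate(features, start=1):  # indices start at 1
        if value != 0:
            parts.append(f"{index}:{value!r}")
    return " ".join(parts)

def from_libsvm_line(line, n_features):
    """Parse one line back into (label, dense feature list)."""
    fields = line.split()
    features = [0.0] * n_features  # omitted entries are zeros
    for field in fields[1:]:
        index, value = field.split(":")
        features[int(index) - 1] = float(value)
    return float(fields[0]), features

line = to_libsvm_line(1, [0.5, 0.0, -1.0])
label, dense = from_libsvm_line(line, n_features=3)
```

Here `line` contains only features 1 and 3, and `dense` has the zero for feature 2 restored.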