I'm using OpenCV (3.1) SVM with 3 classes. Is there any way to handle input data that does not belong to any of these classes? Is there a possibility to get a probability from the prediction?
I just simply want to mark data from unknown class as "Does not belong to any of trained classes".
Thank you
Looking at the SVM docs (the predict function, in particular), it seems that the best you can do is get the signed distance to the margin (the raw decision function value), and it looks like you can only get even that from a binary classifier.
Not sure how constrained to OpenCV you are, but if you can use scikit-learn for your problem, its SVM has a predict_proba function that should be helpful. There is also a predict_log_proba function, if that's your preference. Also, note that you'll need to set probability=True when constructing the classifier (before calling fit) if you go this route.
If you're constrained to C/C++, you might look into LibSVM, which can also output probabilities, although I'm not as familiar with its API. Also note that the OpenCV and scikit-learn implementations are both based on LibSVM.
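Once you have per-class probabilities (for example from predict_proba), one simple way to mark a sample as "unknown" is to reject it when the best probability is below a threshold. This is only a sketch: the function name classify_with_rejection and the threshold of 0.6 are my own assumptions, not part of OpenCV, scikit-learn or LibSVM.

```python
def classify_with_rejection(probabilities, labels, threshold=0.6):
    """Return the most probable label, or None when no class is confident enough.

    `probabilities` is assumed to hold one value per entry in `labels`,
    e.g. one row of the output of a probabilistic classifier.
    The threshold is a hypothetical value and has to be tuned on held-out data.
    """
    if len(probabilities) != len(labels):
        raise ValueError("need exactly one probability per label")
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    if probabilities[best] < threshold:
        return None  # treat as "does not belong to any trained class"
    return labels[best]


labels = ["class_a", "class_b", "class_c"]
confident = classify_with_rejection([0.05, 0.90, 0.05], labels)
uncertain = classify_with_rejection([0.40, 0.35, 0.25], labels)
```

Keep in mind that these probabilities sum to 1 over the trained classes only, so a sample from an unseen class can still receive a high score. The threshold is a heuristic, not a guarantee.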
Hope one of these works for you!
Related
I work on palmprint recognition using feature2D with the OpenCV library, and I use algorithms such as SIFT, SURF, ORB... to detect features and extract/match descriptors. My tests include (1 vs 1) palmprint matching and also (1 vs database) matching.
Once I get the results, I need to evaluate the algorithm, and for this I know that there are some rates or scores (like EER, rank-1 identification, recall and accuracy) which give an estimate of how successful the method was. Now I need to know if any of those rates are implemented in OpenCV, and how to use them. If they aren't, what are the formulas used in the literature?
As far as I know there is little implemented in OpenCV. A common way is to store the results (e.g. in JSON) and process those with other programs such as Matlab or Python. This also allows you to change the evaluation without the need to recompute the algorithms.
There is no overall best method to show the results. It always depends on what you want to show. In my opinion ROC is the best way to express your output. It is also very widely used in research.
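For the formulas asked about in the question, here is a minimal sketch of FAR/FRR, an approximate EER and the rank-1 rate, computed from stored match scores. It assumes similarity scores where higher means a better match (negate distances first if you store descriptor distances). The function names are my own, not from any library.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR when a comparison is accepted if its score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
    frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate EER: the threshold at which FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]  # (EER estimate, threshold)


def rank1_rate(probe_ids, best_match_ids):
    """Fraction of probes whose top-scoring database entry has the right identity."""
    hits = sum(p == b for p, b in zip(probe_ids, best_match_ids))
    return hits / len(probe_ids)


# Toy scores: same-palm comparisons vs. different-palm comparisons.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
```

Plotting the (FAR, 1 - FRR) pairs over all thresholds gives the ROC curve mentioned above.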
If you insist on doing it in C++, then you could use:
Roceasy or
DLIB
I am using a multi-dimensional SVM classifier (SVM.NET, a wrapper for libSVM) to classify a set of features.
Given an SVM model, is it possible to incorporate new training data without having to recalculate on all previous data? I guess another way of putting it would be: is an SVM mutable?
Actually, it's usually called incremental learning. The question has come up before and is pretty well answered here: A few implementation details for a Support-Vector Machine (SVM).
In brief, it's possible but not easy: you would have to change the library you are using or implement the training algorithm yourself.
I found two possible solutions, SVMHeavy and LaSVM, that support incremental training. But I haven't used either and don't know anything about them.
Online and incremental learning are similar but differ slightly. Online learning generally makes a single pass over the data (epoch = 1), or a configurable number of epochs. Incremental learning means that you already have a model, no matter how it was built, and that the model can then be updated with new examples. Often a combination of online and incremental learning is what is required.
Here is a list of tools with some remarks on online and/or incremental SVMs: https://stats.stackexchange.com/questions/30834/is-it-possible-to-append-training-data-to-existing-svm-models/51989#51989
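To make the "mutable model" idea concrete, here is a small sketch of a linear SVM whose weights are kept between calls and updated with new examples. It uses a Pegasos-style stochastic subgradient step on the hinge loss. This is an assumption for illustration only: it is not how libSVM or SVM.NET train, and the class name, the partial_fit method and the default lam are made up.

```python
class IncrementalLinearSVM:
    """Linear SVM (hinge loss, L2 penalty) updated one example at a time."""

    def __init__(self, n_features, lam=0.01):
        self.w = [0.0] * n_features
        self.lam = lam   # regularization strength (hypothetical default)
        self.t = 0       # number of updates so far, preserved across calls

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def partial_fit(self, samples, labels, epochs=1):
        """Update the existing weights with new (x, y) pairs, y in {-1, +1}."""
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                self.t += 1
                eta = 1.0 / (self.lam * self.t)        # decaying step size
                margin = y * self.decision(x)
                # Shrink step that comes from the L2 penalty.
                self.w = [(1.0 - eta * self.lam) * wi for wi in self.w]
                if margin < 1.0:                        # hinge loss is active
                    self.w = [wi + eta * y * xi for wi, xi in zip(self.w, x)]
        return self


model = IncrementalLinearSVM(n_features=2)
model.partial_fit([[2.0, 0.5], [-2.0, 0.3]], [1, -1])    # initial training data
model.partial_fit([[1.5, -0.5], [-1.0, -0.4]], [1, -1])  # later: new data only
```

The second call touches only the new examples, which is the incremental part. Running with epochs=1 over a stream is the online part.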
I'm working on a project related to people detection. I successfully implemented both a HOG + SVM classifier (with libSVM) and a cascade classifier (with OpenCV). The SVM classifier works really well: I tested it on a number of videos and it detects people correctly, with very few false positives and false negatives. The problem is the computation time: nearly 1.2-1.3 sec over the entire image and 0.2-0.4 sec over the foreground patches. Since the project must be able to work in a nearly real-time environment, I switched to the cascade classifier to reduce the computation time.
So I trained many different cascade classifiers with OpenCV (opencv_traincascade). The output is good in terms of computation time (0.2-0.3 sec over the entire image, a lot less when run only over the foreground), so I achieved that goal, let's say. The problem here is the quality of detection: I'm getting a lot of false positives and a lot of false negatives. Since the only difference between the two methods is the base classifier used in OpenCV (decision trees or decision stumps, in any case no SVM as far as I understand), I'm starting to think that my problem could be the base classifier (in some way, HOG features are best separated with hyperplanes, I guess).
Of course, the dataset used with libSVM and OpenCV is exactly the same, both for training and for testing. For the sake of completeness, I used nearly 9 thousand positive samples and nearly 30 thousand negative samples.
Here are my two questions:
Is it possible to change the base weak learner in the opencv_traincascade function? If yes, is the SVM one of the possible choices? If both answers are yes, how can I do such a thing? :)
Are there other computer vision or machine learning libraries that implement the SVM as a weak classifier and have some methods to train a cascade classifier? (Are these libraries suitable to be used in conjunction with OpenCV?)
thank you in advance as always!
Marco.
In principle a weak classifier can be anything, but the strength of Adaboost related methods is that they are able to obtain good results out of simple classifiers (they are called “weak” for a reason).
Using an SVM in an Adaboost cascade is a contradiction: the former has no need to be used in such a framework, since it is able to do its job by itself, and the latter is fast precisely because it takes advantage of weak classifiers.
Furthermore, I don't know of any study about it and OpenCV doesn't support it: you would have to write the code yourself. It is a huge undertaking and you probably won't get any interesting result.
Anyway, if you think that HOG features are better suited to your task, OpenCV's traincascade has an option for them, in addition to Haar and LBP.
As to your second question, I’m not sure but quite confident that the answer is negative.
My advice is: try to get the most you can from traincascade. For example, try increasing the number of samples if you can and compare the results.
This paper is quite good. It simply says that SVM can be treated as a weak classifier if you use fewer samples to train it (let's say less than half of the training set). The higher the weights the more chance it will be trained by the 'weak-SVM'.
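The sampling idea can be sketched like this: in each boosting round, draw fewer than half of the training indices with probability proportional to their current weights, and train the "weak" SVM on that subset only. The function name, the 0.4 fraction and sampling with replacement are my assumptions, not details taken from the paper.

```python
import random


def weighted_subsample(weights, fraction=0.4, rng=None):
    """Draw training indices with probability proportional to boosting weights.

    Samples with replacement, so heavily weighted examples may appear
    several times in the subset handed to the weak SVM.
    """
    rng = rng or random.Random()
    k = max(1, int(len(weights) * fraction))
    return rng.choices(range(len(weights)), weights=weights, k=k)


# Ten training samples; only samples 2 and 7 currently carry any weight.
boost_weights = [0, 0, 1.0, 0, 0, 0, 0, 3.0, 0, 0]
subset = weighted_subsample(boost_weights, rng=random.Random(0))
```

The SVM for that round would then be trained on the rows selected by subset, and the weights updated as in ordinary AdaBoost.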
Unfortunately, the source code is not widely available. If you want a quick prototype, use Python's scikit-learn and see if you can get the desired results before modifying OpenCV.
We're working on a machine learning project in which we'd like to see the influence of certain online sample embedding methods on SVMs.
In the process we've tried interfacing with Pegasos and dlib as well as designing (and attempting to write) our own SVM implementation.
dlib seems promising as it allows interfacing with user written kernels.
Yet kernels don't give us the desired "online" behavior (unless that assumption is wrong).
Therefore, if you know about an SVM library which supports online embedding and custom-written embedders, it would be of great help.
Just to be clear about "online".
It is crucial that the embedding process will happen online in order to avoid heavy memory usage.
We basically want to do the following within Stochastic subGradient Descent (in very general pseudo code):
w = 0 vector
for t = 1:T
    i = random integer from [1, n]
    embed(sample_xi)
    // sample_xi is sent to sub gradient loss i as a parameter
    w = w - (alpha/t) * sub_gradient(loss_i)
end
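For reference, a runnable version of that loop could look like the sketch below. It assumes a Pegasos-style update (hinge loss plus an L2 penalty with strength lam, so alpha corresponds to 1/lam) and uses a made-up embed function that appends pairwise products. Both are illustration-only choices, not something any of the libraries mentioned provide in this form.

```python
import random


def embed(x):
    """Hypothetical explicit embedding: raw features plus pairwise products."""
    n = len(x)
    return list(x) + [x[i] * x[j] for i in range(n) for j in range(i, n)]


def sgd_svm(samples, labels, T=1000, lam=0.01, seed=0):
    """Stochastic subgradient descent on the hinge loss, embedding per step."""
    rng = random.Random(seed)
    w = [0.0] * len(embed(samples[0]))
    for t in range(1, T + 1):
        i = rng.randrange(len(samples))
        phi = embed(samples[i])      # embedded on demand, never stored
        y = labels[i]
        eta = 1.0 / (lam * t)        # step size alpha / t with alpha = 1 / lam
        margin = y * sum(wi * pi for wi, pi in zip(w, phi))
        w = [(1.0 - eta * lam) * wi for wi in w]   # subgradient of the penalty
        if margin < 1.0:                           # subgradient of the hinge loss
            w = [wi + eta * y * pi for wi, pi in zip(w, phi)]
    return w


# XOR-like toy data: not linearly separable in the raw 2-D space.
xs = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
ys = [1, 1, -1, -1]
w = sgd_svm(xs, ys)
```

Only w and one embedded sample are in memory at any time, which is the point of embedding inside the loop.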
I think in your case you might want to consider Budgeted Stochastic Gradient Descent for Large-Scale SVM Training (BSGD) [1] by Wang, Crammer and Vucetic.
Because of the "Curse of Kernelization" described in that paper, you might want to explore this option instead of what you have indicated in the pseudocode in your question.
The Shark Machine Learning Library implements BSGD. Check a quick tutorial here
Maybe you want to use something like dlib's empirical kernel map. You can read its documentation, and particularly the example program, for the gory details of what it does, but basically it lets you project a sample into the span of some basis set in a kernel feature space. There are even algorithms in dlib that iteratively build the basis set, which is maybe what you are asking about.
I'm writing an application that uses an SVM to do classification on some images (specifically these). My Matlab implementation works really well. Using a SIFT bag-of-words approach, I'm able to get near 100% accuracy with a linear kernel.
I need to implement this in C++ for speed/portability reasons, and so I've tried using both libsvm and dlib. I've tried multiple SVM types (c_svm, nu_svm, one_class) and multiple kernels (linear, polynomial, rbf). The best I've been able to achieve is around 50% accuracy - even on the same samples that I've trained on. I've confirmed that my feature generators are working, because when I export my c++-generated features to Matlab and train on those, I'm able to get near-perfect results again.
Is there something magical about Matlab's SVM implementation? Are there any common pitfalls or areas that I might look into that would explain the behavior I'm seeing? I know this is a little vague, but part of the problem is that I don't know where to go. Please let me know in the comments if there is other info I can provide that would be helpful.
There is nothing magical about the Matlab version of the libraries, other than that it runs in Matlab, which makes it harder to shoot yourself in the foot.
A check list:
Are you normalizing your data, making all values lie between 0 and 1
(or between -1 and 1), either linearly or using the mean and the
standard deviation?
Are you parameter searching for a good value of C (or C and gamma in
the case of an RBF kernel)? Doing cross validation or on a hold out set?
Are you sure that you're handling NaN and all other floating point
nastiness? Matlab is very good at hiding this from you, C++ not so
much.
Could it be that you're loading your data incorrectly, reading a
"%s" into a double or something that is adding noise to your input
data?
Could it be that libsvm/dlib expects the data in row major order and
you're sending it in column major (or the other way around)? Again, Matlab makes this almost impossible, C++ not so much.
32/64-bit nastiness: one version of the library, executable compiled
with the other?
Some other things:
Could it be that in Matlab you're somehow leaking the class (y) into
the preprocessing? No one does this on purpose, but I've seen it happen.
If you make almost any f(y) a feature, you'll get almost 100%
every time.
Sometimes it helps to verify that everything is numerically
identical by printing to file before training both in C++ and
Matlab.
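Two of the checks above (non-finite values and normalization) are easy to script. Here is a minimal sketch, using Python only to keep it short; the same logic is a few lines in C++. The function names are my own, and the choice to map a constant feature to 0.0 is an arbitrary assumption.

```python
import math


def check_finite(rows):
    """Raise if any feature value is NaN or infinite."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if not math.isfinite(value):
                raise ValueError(f"non-finite value at row {r}, column {c}")


def fit_minmax(rows):
    """Per-feature (min, max), computed on the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(rows, ranges, lo=-1.0, hi=1.0):
    """Map each feature linearly to [lo, hi] using the stored training ranges."""
    scaled = []
    for row in rows:
        out = []
        for value, (mn, mx) in zip(row, ranges):
            if mx == mn:
                out.append(0.0)  # constant feature: arbitrary placeholder value
            else:
                out.append(lo + (hi - lo) * (value - mn) / (mx - mn))
        scaled.append(out)
    return scaled


train = [[0.0, 10.0], [5.0, 10.0], [10.0, 10.0]]
check_finite(train)
ranges = fit_minmax(train)
train_scaled = scale(train, ranges)
```

Reuse the ranges from the training set when scaling test data. Computing them again on the test set is a common source of train/test mismatch.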
I'm very happy with libsvm using the RBF kernel. carlosdc pointed out the most common errors in the correct order :-). For libsvm: did you use the Python tools shipped with libsvm? If not, I recommend doing so. Write your feature vectors to a file (from Matlab and/or C++) and do a meta-training for the RBF kernel with easy.py. You get the parameters and a prediction for the generated model. If this prediction is OK, continue with C++. From training you also get a scaled feature file (min/max transformed to -1.0/1.0 for every feature). Compare these to your C++ implementation as well.
Some libsvm issues: a nasty habit is (if I remember correctly) that values scaling to 0 (zero) are omitted in the scaled file. In grid.py there is a parameter "nr_local_worker" which defines the number of threads. You might wish to increase it.
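For comparing files between Matlab and C++, it helps to remember that the libsvm text format is sparse: each line is a label followed by index:value pairs with 1-based indices, and zero values are simply left out. Here is a small sketch of writing and reading such lines; the helper names are mine, not part of the libsvm tools.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> <index>:<value> ...' (1-based, zeros omitted)."""
    pairs = [f"{i}:{v!r}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([str(label)] + pairs)


def from_libsvm_line(line, n_features):
    """Parse a line back into (label, dense vector), restoring omitted zeros."""
    parts = line.split()
    dense = [0.0] * n_features
    for item in parts[1:]:
        index, value = item.split(":")
        dense[int(index) - 1] = float(value)
    return parts[0], dense


line = to_libsvm_line(1, [0.5, 0.0, -1.0])
label, vector = from_libsvm_line(line, n_features=3)
```

When you diff a scaled file against your own C++ output, densify both sides first, otherwise the omitted zeros look like missing features.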