Get SVM classification score in multiclass classification with OpenCV - c++

I'm working on a project where I'm doing multiclass classification with SVM in OpenCV.
My goal is to get the confidence score of the classification as well as the predicted class.
How can I do that? Right now I'm doing something like
float result = mysvm.predict(sample);
Since I have a fairly large number of classes, I would prefer to avoid running many one-vs-all classifications and then calculating the scores.
Since OpenCV SVM is implemented using LibSVM, I'm quite sure that there is a way to do this, but looking at http://docs.opencv.org/modules/ml/doc/support_vector_machines.html doesn't really help.
Thanks for any input provided.

In opencv/include/opencv2/ml/ml.hpp there is a struct called CvSVMDecisionFunc. It is used on line 546 as a protected member variable:
CvSVMDecisionFunc* decision_func;
What you need to do is move that line into the public section and then do a complete rebuild of OpenCV. This variable, decision_func, holds the data for each decision function's support vectors (i.e. the alpha and rho values).

Related

OpenCV Random Decision Forest: How to get posterior probability

I did research on multiple websites, but I couldn't find any solution.
Here's the problem:
I am implementing a pixel-wise classification using RTrees from OpenCV. I need the posterior probability for each class. I tried to get it via cv::ml::StatModel::predict(), but the output matrix only contains the predicted value. Is there another way to get the posterior probability from RTrees?
PS: I'm still quite new to Machine Learning, so please forgive me my lack of knowledge ^^"
Instead of using cv::ml::StatModel::predict, you could use the cv::ml::RTrees::getVotes member function. In the classification case it gives you the number of trees that voted for each class for a given sample. Dividing these vote counts by the forest size gives an approximation of the posterior probabilities.
The getVotes function should be called instead of predict like this:
cv::Mat samples; // fill with one or multiple samples (their feature vectors)
cv::Mat votes;
// classifier is a cv::Ptr<cv::ml::RTrees>;
// pass 0 as the last argument unless you want to manipulate the RTrees flags
classifier->getVotes(samples, votes, 0);
Be aware that the votes matrix has one more row than the number of samples. The first row lists your classes (in ascending order, if I remember the OpenCV source code correctly).
The answer is up to date as of the 3.4.1 version of OpenCV.

SVM with probability estimates

I have a binary classification problem that I am solving with an SVM. The classes are unbalanced in the training data. I now need posterior probability outputs, not just a binary score. I tried Platt scaling with both Weka's SMO and LibSVM. For both of these implementations I get results which, in terms of F1-measure for the minority class, are worse than when I generated only binary results.
Do you know of a way to transform SVM binary results into probabilities that respects the following rule:
"prob >= 0.5 if and only if decision value >= 0".
In other words, the label each sample gets should be the same whether I use the binary classification or the probabilities.
SVMs can be configured to output class membership probabilities. Consult the documentation of your toolkit to learn how to enable this.
For example, in scikit-learn:
When the constructor option probability is set to True, class
membership probability estimates (from the methods predict_proba and
predict_log_proba) are enabled.
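Platt scaling fits a sigmoid with both a slope and an offset, and a non-zero offset moves the 0.5 crossing away from decision value 0, which is one way the labels can change. A mapping that does satisfy the rule quoted in the question is a logistic function with no offset term. The sketch below is only an illustration of that property (the function name and the default slope are assumptions, and the output is not a calibrated probability unless the slope is fitted on held-out data).

```cpp
#include <cmath>

// Logistic map with no offset term: p = 1 / (1 + exp(-slope * f)).
// For any slope > 0, p >= 0.5 exactly when f >= 0
// (up to floating-point rounding very close to zero).
double decisionToProbability(double decisionValue, double slope = 1.0) {
    return 1.0 / (1.0 + std::exp(-slope * decisionValue));
}
```

Because only the slope is free, thresholding the output at 0.5 reproduces the sign of the decision value.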

What does the class_weight parameter do in scikit-learn SGD

I am a frequent user of scikit-learn, and I want some insight into the "class_weight" parameter with SGD.
I was able to trace it as far as the function call
plain_sgd(coef, intercept, est.loss_function,
penalty_type, alpha, C, est.l1_ratio,
dataset, n_iter, int(est.fit_intercept),
int(est.verbose), int(est.shuffle), est.random_state,
pos_weight, neg_weight,
learning_rate_type, est.eta0,
est.power_t, est.t_, intercept_decay)
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/stochastic_gradient.py
After this it goes into sgd_fast, and I am not very good with Cython. Can you give some clarity on these questions?
My dev set is class-imbalanced: the positive class has about 15k samples and the negative class about 36k. Will class_weight resolve this problem, or would undersampling be a better idea? I am getting better numbers, but it is hard to explain why.
If yes, how does it actually work? Is it applied to the feature penalization, or is it a weight in the optimization function? How can I explain this to a layman?
class_weight can indeed help increasing the ROC AUC or f1-score of a classification model trained on imbalanced data.
You can try class_weight="auto" (renamed "balanced" in later scikit-learn releases) to select weights that are inversely proportional to class frequencies. You can also pass your own weights as a Python dictionary with class labels as keys and weights as values.
Tuning the weights can be achieved via grid search with cross-validation.
Internally this is done by deriving sample_weight from the class_weight (depending on the class label of each sample). Sample weights are then used to scale the contribution of individual samples to the loss function used to train the linear classification model with Stochastic Gradient Descent.
The feature penalization is controlled independently via the penalty and alpha hyperparameters. sample_weight / class_weight have no impact on it.

cvSVM training produces poor results for HOGDescriptor

My objective is to train an SVM and get support vectors which I can plug into OpenCV's HOGDescriptor for object detection.
I have gathered ~4000 positives and ~15000 negatives, and I train using the SVM provided by OpenCV. The results give me too many false positives (up to 20 per image). I clip out the false positives and add them to the pool of negatives to retrain, and at times I end up with even more false positives! I have tried adjusting the L2HysThreshold of my HOGDescriptor upwards to 300 without significant improvement. Is my pool of positives and negatives large enough?
The SVM training is also much faster than expected. I have tried feature vector sizes of 2916 and 12996, using grayscale images and color images on separate tries. SVM training has never taken longer than 20 minutes. I use train_auto. I am new to machine learning, but from what I hear, training with a dataset as large as mine should take at least a day, no?
I believe cvSVM is not doing much learning, and according to http://opencv-users.1802565.n2.nabble.com/training-a-HOG-descriptor-td6363437.html it is not suited for this purpose. Does anyone with experience with cvSVM have more input on this?
I am considering using SVMLight http://svmlight.joachims.org/ but it looks like there isn't a way to visualize the SVM hyperplane. What are my options?
I use OpenCV 2.4.3 and have tried the following setups for HOGDescriptor:
// Setup 1: 12996-element feature vector
hog.winSize = cv::Size(100,100);
hog.cellSize = cv::Size(5,5);
hog.blockSize = cv::Size(10,10);
hog.blockStride = cv::Size(5,5);
// Setup 2: 2916-element feature vector
hog.winSize = cv::Size(100,100);
hog.cellSize = cv::Size(10,10);
hog.blockSize = cv::Size(20,20);
hog.blockStride = cv::Size(10,10);
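The two feature vector sizes quoted above follow directly from the window, block, stride and cell sizes. The helper below is a hypothetical sketch (not an OpenCV function) that assumes square windows, blocks and cells and 9 orientation bins per cell, which is the value that reproduces both numbers.

```cpp
#include <cstddef>

// Length of a HOG descriptor for a square window, assuming square
// blocks/cells and nbins orientation bins per cell.
std::size_t hogDescriptorSize(int win, int block, int stride, int cell, int nbins = 9) {
    const int blocksPerSide = (win - block) / stride + 1;       // block positions along one axis
    const int cellsPerBlock = (block / cell) * (block / cell);  // cells inside one block
    return static_cast<std::size_t>(blocksPerSide) * blocksPerSide * cellsPerBlock * nbins;
}
```

For the first setup this gives 19 * 19 blocks * 4 cells * 9 bins = 12996, and for the second 9 * 9 * 4 * 9 = 2916.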
Your first descriptor dimension is far too large to be of any use. To form a reliable SVM hyperplane, you need at least as many positive and negative samples as you have descriptor dimensions. This is because ideally you need separating information in every dimension of the hyperplane.
The number of positive and negative samples should be more or less the same unless you provide your SVM trainer with a bias parameter (may not be available in cvSVM).
There is no guarantee that HOG is a good descriptor for the type of problem you are trying to solve. Can you visually confirm that the object you are trying to detect has a distinct shape with similar orientation in all samples? A single type of flower for example may have a unique shape, however many types of flowers together don't have the same unique shape. A bamboo has a unique shape but may not be distinguishable from other objects easily, or may not have the same orientation in all sample images.
cvSVM is normally not the tool used to train SVMs for OpenCV HOG. Use the binary form of SVMLight (not free for commercial purposes) or libSVM (OK for commercial purposes). Calculate HOGs for all samples using your C++/OpenCV code and write them to a text file in the correct input format for SVMLight/libSVM. Use either program to train a model using a linear kernel with the optimal C. Find the optimal C by searching for the best accuracy while changing C in a loop. Calculate the detector vector (an N+1 dimensional vector, where N is the dimension of your descriptor) by finding all the support vectors, multiplying each support vector by its corresponding alpha value, and then summing the resulting alpha * support-vector products dimension by dimension to get an N-dimensional vector. As the last element, add -b, where b is the hyperplane bias (you can find it in the model file coming out of SVMLight/libSVM training). Feed this N+1 dimensional detector to HOGDescriptor::setSVMDetector() and use HOGDescriptor::detect() or HOGDescriptor::detectMultiScale() for detection.
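The detector-vector calculation described above can be sketched as follows. This is plain C++ with a hypothetical function name; it assumes the alpha values already carry the sign of each support vector's label (as the coefficients stored in a trained model file typically do) and that b has been read from that model file.

```cpp
#include <cstddef>
#include <vector>

// Collapses a linear SVM into the (N+1)-element detector vector:
// w = sum_i alpha_i * sv_i, followed by -b as the last element.
std::vector<float> buildDetector(const std::vector<std::vector<float>>& supportVectors,
                                 const std::vector<float>& alpha, float b) {
    const std::size_t n = supportVectors.empty() ? 0 : supportVectors[0].size();
    std::vector<float> detector(n + 1, 0.0f);
    for (std::size_t i = 0; i < supportVectors.size(); ++i)
        for (std::size_t d = 0; d < n; ++d)
            detector[d] += alpha[i] * supportVectors[i][d];
    detector[n] = -b;
    return detector;
}
```

The returned vector has the N+1 layout that the paragraph above says to pass to HOGDescriptor::setSVMDetector().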
I have had successful results using SVMLight to learn SVM models when training from OpenCV, but haven't used cvSVM, so can't compare.
The hogDraw function from http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html will visualise your descriptor.

OpenCV's KNN Unknown Classifications

At the moment I am using OpenCV's KNN implementation to classify images. It currently classifies images into P, S or rectangle, and correctly. However if I feed it an image of noise it will attempt to classify it as 1 of the 3 classifications I stated earlier. To get it to classify as noise, should I train the KNN to put noise in a 'noise' category, or is there some kind of accuracy rating I can use?
The way to do it is to use the dists variable of the knn_nearest function. It returns the distances between your vector and its K nearest neighbours; the larger the distance, the less they have in common with the test data.
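A distance-based rejection rule along these lines might look like the sketch below. It is plain C++ with hypothetical names: dists stands for the neighbour distances reported by the library (in whatever distance measure it uses), and maxMeanDist is a threshold you would have to tune on known samples and known noise.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true when the mean distance to the k nearest training
// samples exceeds maxMeanDist, i.e. the query resembles none of them.
bool isUnknown(std::vector<float> dists, std::size_t k, float maxMeanDist) {
    k = std::min(k, dists.size());
    if (k == 0) return true;  // no neighbours at all: treat as unknown
    std::partial_sort(dists.begin(), dists.begin() + k, dists.end());
    float sum = 0.0f;
    for (std::size_t i = 0; i < k; ++i) sum += dists[i];
    return sum / k > maxMeanDist;
}
```

A sample flagged this way can be reported as noise instead of being forced into one of the three trained classes.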
Yes, but I wouldn't advise it. If you have a classifier which is good at distinguishing between oranges and apples, you shouldn't try to make it recognize "not a fruit". First, because you can feed wrong inputs to almost anything; second, because it will lower its original performance; and third, because noise needs to have a pattern to be learned. How do you define noise?