Weka SVM multi-class classifier

I understand that Weka uses a one-vs-one approach for multi-class SVM. However, I would like to classify documents, and I have 10 class labels.
Is it possible to change the parameters so that it uses a one-vs-rest approach instead?
How should I go about doing that?
The official page http://weka.wikispaces.com/LibSVM does not help much.
I have already tried other classification methods such as naive Bayes, and I would like to compare their results against SVM methods.

LIBSVM also supports multi-label classification. You can find example implementations here:
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/multilabel/
You can also search for papers that use LibSVM with non-binary datasets,
e.g. http://link.springer.com/article/10.1007/s00521-011-0793-1
An alternative to LibSVM is the SMO classifier in the Weka library.

In Weka 3.8, the MultiClassClassifier meta classifier can be used. Its options include both the 1-against-1 and 1-against-all multi-class methods.
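To make the difference between the two schemes concrete, here is a small sketch in plain Python (not Weka code; the function and label names are made up for illustration). It just enumerates the binary subproblems each scheme would train for 10 class labels.

```python
from itertools import combinations


def one_vs_one_tasks(labels):
    """One binary problem per unordered pair of classes."""
    return list(combinations(labels, 2))


def one_vs_rest_tasks(labels):
    """One binary problem per class: that class against all the others."""
    return [(label, [other for other in labels if other != label])
            for label in labels]


# Hypothetical label names standing in for the 10 document classes.
labels = [f"class_{i}" for i in range(10)]

ovo = one_vs_one_tasks(labels)
ovr = one_vs_rest_tasks(labels)

print("one-vs-one classifiers: ", len(ovo))   # k * (k - 1) / 2 = 45
print("one-vs-rest classifiers:", len(ovr))   # k = 10
```

So with 10 labels, one-vs-one trains 45 small classifiers (each on two classes only), while one-vs-rest trains 10 classifiers that each see the whole dataset.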

Related

Is the 'warm start' option of dlib's DCD trainer only for one-class classification?

I am using dlib for a program that classifies medical images using SVM. Because the images are large (many features, say 10000 to 100000) and I use a linear kernel, it sounds as though the svm_c_linear_dcd_trainer is a good class to use.
Another reason I like the svm_c_linear_dcd_trainer class is that it claims to support 'warm starting', so retraining should be efficient, even for long vectors, when a single observation is repeatedly added to or removed from the sample (as in LOOCV).
But the only example of svm_c_linear_dcd_trainer uses one-class classification. The documentation suggests that the force_last_weight_to_1 option, which implements the warm start, is for one-class classification only.
Is that true, i.e. is this warm-start option not available for binary classification? And in that case, would another implementation be faster?
That is not a limitation. Did you read the documentation for the class? http://dlib.net/dlib/svm/svm_c_linear_dcd_trainer_abstract.h.html#svm_c_linear_dcd_trainer Where in dlib's documentation does it say that warm-starting is limited to one-class classification? The documentation for svm_c_linear_dcd_trainer doesn't even mention one-class classification, as near as I can see.

Why is a classification model in Weka predicting all instances as one class?

I have built a classification model using Weka. I have two classes, {spam, non-spam}. After applying the StringToWordVector filter, I get 10000 attributes for 19000 records. I then use the LibLINEAR library to build a model, which gives me the following F-scores:
Spam: 94%
Non-spam: 98%
When I use the same model to predict new instances, it predicts all of them as spam.
Also, when I use a test set that is the same as the training set, it predicts all of them as spam too. I am mentally exhausted from trying to find the problem. Any help will be appreciated.
I also get this wrong every so often. Then I watch this video to remind myself how it's done: https://www.youtube.com/watch?v=Tggs3Bd3ojQ In it, Prof. Witten, one of the Weka developers/architects, shows how to use the FilteredClassifier (which in turn is configured to load the StringToWordVector filter) correctly on the training set and the test set.
This is shown for Weka 3.6; Weka 3.7 might be slightly different.
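The idea behind wrapping the filter and the classifier together can be sketched in plain Python (this is not Weka code, and the helper names and toy documents are made up): the word-to-attribute mapping is built from the training data only, and exactly the same mapping is reused for the test data, so attribute positions line up.

```python
def build_vocabulary(documents):
    """Assign each distinct training word a fixed attribute index."""
    vocab = {}
    for doc in documents:
        for word in doc.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(doc, vocab):
    """Count words using the training vocabulary; unseen words are ignored."""
    vec = [0] * len(vocab)
    for word in doc.lower().split():
        idx = vocab.get(word)
        if idx is not None:
            vec[idx] += 1
    return vec


# Toy documents, purely illustrative.
train_docs = ["cheap pills now", "meeting at noon"]
test_docs = ["cheap meeting tomorrow"]

vocab = build_vocabulary(train_docs)            # built from training data only
train_vectors = [vectorize(d, vocab) for d in train_docs]
test_vectors = [vectorize(d, vocab) for d in test_docs]

print(train_vectors)
print(test_vectors)
```

If the test set were filtered on its own instead, it would get its own vocabulary, and attribute 0 in the test data would no longer mean the same word as attribute 0 in the model. That kind of mismatch is one plausible cause of a model that puts everything into a single class.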
What does ZeroR give you? If it's close to 100%, you know that any classification algorithm should not be too far off either.
Why do you optimize for F-measure? Just asking. I have never used it and don't know much about it. (I would optimize for precision, assuming you have much more spam than non-spam.)

Read svm data and retrain with more data?

I am implementing facial expression recognition and am using an SVM to classify a given expression.
When I train, I use this code:
svm.train(myFeatureVector,myLabels,Mat(),Mat(), myParameters);
svm.save("myClassifier.yml");
Later, I predict using:
response = svm.predict(incomingFeatureVector);
But when I want to train more than once (after exiting the program and starting it again), it seems to overwrite my previous SVM file. Is there any way I could read the previous SVM file and add more data to it (and then save it again, etc.)? I looked through the OpenCV documentation and found nothing. However, on this page there is a method called CvSVM::read. I don't know what it does or how to use it.
I hope someone can help me :(
What you are trying to do is incremental learning, but unfortunately the Support Vector Machine is a batch algorithm, so if you want to add more data you have to retrain on the whole set again.
There are online learning alternatives, such as Pegasos SVM, but I am not aware of any that is implemented in OpenCV.
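For a feel of what an online alternative looks like, here is a minimal pure-Python sketch of the Pegasos update for a linear SVM (no bias term, no optional projection step). It is not OpenCV code; the function names, the toy data, and the value of lam are assumptions for illustration. The point is that the state is just the weight vector w and the step counter t, so both could be saved and training resumed when new samples arrive.

```python
import random


def pegasos_update(w, x, y, lam, t):
    """One Pegasos step on a single example (x, y), with y in {-1, +1}."""
    eta = 1.0 / (lam * t)                           # decaying step size
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    w = [(1.0 - eta * lam) * wi for wi in w]        # shrink (regularization)
    if margin < 1.0:                                # hinge loss is active
        w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def train(w, t, data, lam, epochs, rng):
    """Continue training from state (w, t); returns the updated state."""
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            w = pegasos_update(w, x, y, lam, t)
    return w, t


def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


rng = random.Random(0)
lam = 0.1  # regularization strength (illustrative value)

# Toy, linearly separable data (separable through the origin).
data = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([-2.0, -1.0], -1), ([-1.0, -2.5], -1)]
w, t = train([0.0, 0.0], 0, data, lam, epochs=50, rng=rng)

# Later: new samples arrive, and training resumes from the saved (w, t).
new_data = [([3.0, 0.5], 1), ([-0.5, -3.0], -1)]
w, t = train(w, t, new_data, lam, epochs=50, rng=rng)

print([predict(w, x) for x, _ in data + new_data])
```

Note that this only covers the linear case; a kernel SVM like the one typically trained with CvSVM does not reduce to a single weight vector this way.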

NLTK wrapper for Weka to build a classifier

I'm building a named entity classifier with NLTK, focusing on location retrieval (of any type, from countries to museums, restaurants or roads). I'm trying to vary the feature sets and methods I use.
So far, I've used NLTK's built-in Maxent, NaiveBayes, PositiveNaiveBayes, DecisionTree and SVM classifiers. I'm using 40 different combinations of feature sets.
Maxent seems to be the best, but it's too slow. NLTK's SVM is for binary classification, and I had some issues with pickling the final classifier. Then I tried NLTK's wrapper for scikit-learn's SVM, but it didn't accept my inputs; I tried to adapt them but ran into a float coercion problem.
Now I'm considering using NLTK's wrapper for Weka, but I don't know whether it would give me results different enough to be worth trying, and I don't have much time. My question is: what advantages does Weka have over NLTK's built-in classifiers?

How can I translate my feature matrix to Weka's format?

I need some help, please.
I have some feature vectors from 2 classes (2 different movements of the upper limb). Now I need to load my feature matrix (all the feature vectors) into Weka to classify my movements, specifically with the SVM algorithm. But I have never worked with Weka, Java, or the ARFF format before. How can I translate my feature matrix into Weka's format?
Thank you very much. I will appreciate any help.
Lilia
Realized this should probably be a full answer: there are a number of great documents out there that detail the .arff file format. Since you already have feature vectors, you can simply treat each entry in the feature vector as a separate numeric attribute.
There's a good explanation of the ARFF format here: http://www.cs.waikato.ac.nz/ml/weka/arff.html
There's a Java example showing how to convert a CSV file to an ARFF file programmatically:
http://weka.wikispaces.com/Converting+CSV+to+ARFF
And there's even an online tool that will do most of it for you (I don't really recommend it, as it sometimes makes critical mistakes):
http://slavnik.fe.uni-lj.si/markot/csv2arff/csv2arff.php
Though if all you want to do is run some regression, Weka will let you do that without converting anything to ARFF.
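If you would rather generate the ARFF file yourself, the format is simple enough to write by hand. A minimal sketch in plain Python, assuming the feature matrix is a list of numeric rows with one class label per row (the function name, attribute names f0, f1, ..., the relation name, and the sample values are all made up for illustration):

```python
def to_arff(relation, features, labels, class_values):
    """Render a numeric feature matrix plus nominal class labels as ARFF text."""
    n_features = len(features[0])
    lines = [f"@relation {relation}", ""]
    for i in range(n_features):
        lines.append(f"@attribute f{i} numeric")     # one numeric attribute per column
    lines.append("@attribute class {" + ",".join(class_values) + "}")
    lines.append("")
    lines.append("@data")
    for row, label in zip(features, labels):
        lines.append(",".join(str(v) for v in row) + "," + label)
    return "\n".join(lines) + "\n"


# Illustrative values only: two feature vectors, one per movement class.
features = [[0.12, 3.4, 1.0], [0.98, 2.1, 0.5]]
labels = ["movement_a", "movement_b"]

arff_text = to_arff("upper_limb", features, labels, ["movement_a", "movement_b"])
print(arff_text)

# To save it for Weka (file name is arbitrary):
# with open("movements.arff", "w") as f:
#     f.write(arff_text)
```

The class attribute is declared as nominal (the list in braces), which is what Weka's classifiers expect for a classification target; by default Weka treats the last attribute as the class.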