NLTK wrapper for Weka to build a classifier

I'm building a Named Entity classifier with nltk, with a focus on location retrieval (of any type, from countries to museums, restaurants or roads). I'm trying to vary the featuresets and methods I use.
So far I've used nltk's built-in Maxent, NaiveBayes, PositiveNaiveBayes, DecisionTree and SVM classifiers, with 40 different combinations of featuresets.
Maxent seems to be the best, but it's too slow. nltk's SVM is for binary classification only, and I had some issues pickling the final classifier. I then tried nltk's wrapper for the scikit-learn SVM, but it didn't accept my inputs; I tried to adapt them, but ran into a float coercion problem.
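For reference, here is roughly how I'm calling the scikit-learn wrapper (a minimal sketch with made-up features, not my real featuresets):

    # Minimal sketch of nltk's scikit-learn wrapper. SklearnClassifier
    # vectorizes dict featuresets internally; numeric feature values
    # must already be floats, which seems to be where my coercion
    # problem comes from.
    from nltk.classify.scikitlearn import SklearnClassifier
    from sklearn.svm import LinearSVC

    train_set = [
        ({"word": "louvre", "prev-word": "the", "is-capitalized": True}, "LOCATION"),
        ({"word": "ate", "prev-word": "he", "is-capitalized": False}, "OTHER"),
    ]

    classifier = SklearnClassifier(LinearSVC()).train(train_set)
    print(classifier.classify_many(
        [{"word": "museum", "prev-word": "a", "is-capitalized": False}]))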
Now I'm considering nltk's wrapper for Weka, but I don't know whether it could give me results different enough to be worth trying, and I don't have much time. My question is: what advantages does Weka have over nltk's built-in classifiers?

Related

How to run half precision inference on a TensorRT model, written with TensorRT C++ API?

I'm trying to run half precision inference with a model natively written in the TensorRT C++ API (not parsed from other frameworks, e.g. Caffe or TensorFlow).
To the best of my knowledge, there is no public working example of this; the closest thing I found is the sampleMLP sample code, released with TensorRT 4.0.0.3, yet its release notes say there is no support for fp16.
My toy example code can be found in this repo. It contains the API-implemented architecture and inference routine, plus the Python script I use to convert my dictionary of trained weights to the wtd TensorRT format.
My toy architecture consists of just one convolution. The goal is to obtain similar results between fp32 and fp16, apart from some reasonable loss of precision. The code seems to work with fp32, whereas with fp16 inference I obtain values of totally different orders of magnitude (~1e40), so it looks like I'm doing something wrong during the conversions.
I'd appreciate any help in understanding the problem.
After quickly reading through your code, I can see you did more than is necessary to get a half precision optimized network. You shouldn't manually convert the loaded weights from float32 to float16 yourself. Instead, create your network as you normally would and call nvinfer1::IBuilder::setFp16Mode(true) on your nvinfer1::IBuilder object to let TensorRT do the conversions for you where suitable.
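For illustration, the same idea sketched with the TensorRT Python bindings (this assumes the TensorRT 5/6-era builder API; the C++ equivalent is the setFp16Mode call above, and the network-definition step is elided):

    # Sketch: let the builder handle fp16, keep the weights in fp32.
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    # ... define your layers here with the ordinary float32 weights ...

    builder.fp16_mode = True  # TensorRT inserts fp32->fp16 conversions
    engine = builder.build_cuda_engine(network)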

Is the 'warm start' option of dlib's dcd trainer only for 1-class classification?

I am using dlib in a program that classifies medical images using an SVM. Because the images are large (many features, say 10,000 to 100,000) and I use a linear kernel, it sounds as though svm_c_linear_dcd_trainer is a good class to use.
Another reason I like the svm_c_linear_dcd_trainer class is that it claims to support 'warm starting', which is efficient for long vectors when a single observation is repeatedly added to or removed from the sample (such as in LOOCV).
But the only example of svm_c_linear_dcd_trainer uses one-class classification, and the documentation suggests that the force_last_weight_to_1 option that implements the warm start is for 1-class classification only.
Is that true, i.e. is this warm-start option not available for binary classification? And in that case, would another implementation be faster?
That is not a limitation. Did you read the documentation for the class? http://dlib.net/dlib/svm/svm_c_linear_dcd_trainer_abstract.h.html#svm_c_linear_dcd_trainer Where in dlib's documentation does it say warm starting is limited to one-class classification? As near as I can see, the documentation for svm_c_linear_dcd_trainer doesn't even mention one-class classification.

Extracting MatConvNet model weights

I am currently developing an application for facial recognition.
The algorithms are implemented and trained using the MatConvNet library (http://www.vlfeat.org/matconvnet/). At the end, I have a network stored as a .mat file.
I would like to know whether it is possible to extract the weights of the network from its .mat file, write them to an XML file, and read them with Caffe C++. I would like to reuse them in Caffe C++ for testing and a hardware implementation. Is there an efficient and practical way to proceed?
Thank you very much for your help.
The layers whose parameters you'd like to store must be set as 'precious'. You can then access the parameters in net.vars and write them out.
There is a conversion script that converts MatConvNet models to Caffe models here, which you may find useful.
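If you only need the raw weights out of the .mat file, something along these lines may already be enough (a sketch that assumes the usual SimpleNN layout with a net.layers field; the field names may differ for your network):

    # Sketch: dump MatConvNet weights from the .mat file with SciPy.
    # Assumes the common SimpleNN layout (net.layers, each layer with
    # a 'weights' cell array of {filters, biases}); adjust the field
    # names to match your file.
    import numpy as np
    import scipy.io

    mat = scipy.io.loadmat("net.mat", struct_as_record=False, squeeze_me=True)
    net = mat["net"]

    for layer in np.atleast_1d(net.layers):
        weights = getattr(layer, "weights", None)
        if weights is None:
            continue
        for i, w in enumerate(np.atleast_1d(weights)):
            # Each parameter array can then be re-serialized (XML, .npy, ...)
            np.save(f"{layer.name}_param{i}.npy", w)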
You can't directly use the weights of a network trained with MatConvNet in Caffe; you can merely import your model from MatConvNet to Caffe (https://github.com/vlfeat/matconvnet/blob/4ce2871ec55f0d7deed1683eb5bd77a8a19a50cd/utils/import-caffe.py). But this script does not support all layers, and you may have difficulties employing it.
The best way is to define your Caffe prototxt in Python to mirror the MatConvNet model.

Why is my Weka classification model predicting all instances as one class?

I have built a classification model using Weka. I have two classes, {spam, non-spam}. After applying the StringToWordVector filter, I get 10,000 attributes for 19,000 records. I then use the LibLINEAR library to build a model, which gives me the following F-scores:
spam: 94%
non-spam: 98%
When I use the same model to predict new instances, it predicts all of them as spam.
Also, when I use a test set identical to the training set, it predicts all of them as spam too. I am mentally exhausted trying to find the problem. Any help will be appreciated.
I also get this wrong every so often. Then I watch this video to remind myself how it's done: https://www.youtube.com/watch?v=Tggs3Bd3ojQ where Prof. Witten, one of the Weka developers/architects, shows how to use the FilteredClassifier (which in turn is configured to load the StringToWordVector filter) on the training set and the test set correctly.
This is shown for Weka 3.6; Weka 3.7 might be slightly different.
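Scripted outside the GUI, the setup could look roughly like this (a sketch using python-weka-wrapper3, which is an assumption since you didn't say how you drive Weka; the file names are placeholders):

    # Sketch: wrap StringToWordVector inside a FilteredClassifier so
    # exactly the same vectorization is applied to the training data
    # and to any new instances.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.classifiers import Classifier

    jvm.start(packages=True)  # LibLINEAR is a Weka package

    loader = Loader(classname="weka.core.converters.ArffLoader")
    train = loader.load_file("train.arff")  # placeholder file name
    train.class_is_last()

    fc = Classifier(
        classname="weka.classifiers.meta.FilteredClassifier",
        options=["-F", "weka.filters.unsupervised.attribute.StringToWordVector",
                 "-W", "weka.classifiers.functions.LibLINEAR"],
    )
    fc.build_classifier(train)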
What does ZeroR give you? If it's close to 100%, your classes are so imbalanced that any classification algorithm shouldn't be too far off either: a model only has to predict the majority class to match it.
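Checking that baseline takes only a few lines, e.g. continuing the python-weka-wrapper3 sketch from the answer above:

    # Sketch: cross-validated ZeroR as a majority-class baseline
    # (reuses the loaded 'train' data and running JVM from above).
    from weka.classifiers import Classifier, Evaluation
    from weka.core.classes import Random

    zr = Classifier(classname="weka.classifiers.rules.ZeroR")
    evaluation = Evaluation(train)
    evaluation.crossvalidate_model(zr, train, 10, Random(1))
    print("ZeroR accuracy: %.1f%%" % evaluation.percent_correct)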
Why do you optimize for F-measure? Just asking; I have never used it and don't know much about it. (I would optimize for the "Precision" metric, assuming you have much more spam than non-spam.)

Weka SVM multi-class classifier

I understand that Weka uses a 1-vs-1 approach for multi-class SVM. However, I would like to classify documents, and I have 10 class labels.
Is it possible to change the parameters to a 1-vs-rest approach instead?
How should I actually go about doing it?
The official site http://weka.wikispaces.com/LibSVM does not help much.
Other classification methods such as naive Bayes have been tried, but I would like to compare the results against SVM methods.
LIBSVM also allows multi-label classification. You can find example implementations here:
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/multilabel/
You can also search for papers that use LibSVM with non-binary datasets, e.g. http://link.springer.com/article/10.1007/s00521-011-0793-1
Another alternative to LibSVM is Weka's SMO classifier.
In Weka version 3.8, the MultiClassClassifier meta classifier can be used. Its options include both 1-against-1 and 1-against-all multi-classification methods.
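For example, with python-weka-wrapper3 (a sketch; the -M flag belongs to Weka's MultiClassClassifier, where 0 selects 1-against-all and 3 selects 1-against-1, and train.arff is a placeholder for your 10-label document dataset):

    # Sketch: force a 1-vs-rest SVM in Weka by wrapping SMO in the
    # MultiClassClassifier meta classifier.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.classifiers import Classifier

    jvm.start()

    loader = Loader(classname="weka.core.converters.ArffLoader")
    data = loader.load_file("train.arff")  # placeholder file name
    data.class_is_last()

    mcc = Classifier(
        classname="weka.classifiers.meta.MultiClassClassifier",
        options=["-M", "0",  # 0 = 1-against-all, 3 = 1-against-1
                 "-W", "weka.classifiers.functions.SMO"],
    )
    mcc.build_classifier(data)

    jvm.stop()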