I am new to machine learning and I want to do two-class classification with only a few attributes. From researching online, I have learned that the two-class averaged perceptron algorithm is good for two-class classification with a linear model.
However, I have been reading through the scikit-learn documentation, and I am a bit confused about whether scikit-learn provides an averaged perceptron algorithm.
I wonder if the sklearn.linear_model.Perceptron class can be set up as the two-class averaged perceptron algorithm by configuring its parameters correctly.
Thank you very much for your kind help.
I'm sure someone will correct me if I'm wrong, but I do not believe the averaged perceptron is implemented in sklearn. If I recall correctly, Perceptron in sklearn is simply SGD with certain default parameters.
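If that equivalence holds, you may be able to get close to an averaged perceptron through SGDClassifier directly. A minimal sketch, assuming I'm remembering the relevant parameters correctly (the `average` flag keeps an averaged copy of the SGD weights):

```python
from sklearn.linear_model import Perceptron, SGDClassifier

# Per the scikit-learn docs, Perceptron() is shorthand for SGD with
# perceptron loss, a constant learning rate of 1, and no regularization:
plain = Perceptron()
plain_equiv = SGDClassifier(loss="perceptron", eta0=1.0,
                            learning_rate="constant", penalty=None)

# SGDClassifier also exposes an `average` flag that keeps a running
# average of the weights, which should behave like an averaged perceptron:
averaged = SGDClassifier(loss="perceptron", eta0=1.0,
                         learning_rate="constant", penalty=None,
                         average=True)
```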
With that said, have you tried good old logistic regression? While it may not be the sexiest algorithm around, it often does provide good results and can serve as a baseline to see if you need to explore more complicated methods.
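For example, a quick baseline sketch (the toy data here just stands in for your few attributes):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy two-class data standing in for your real attributes
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

baseline = LogisticRegression()
print(cross_val_score(baseline, X, y, cv=5).mean())  # baseline accuracy
```

If a fancier model can't beat that number, it's probably not worth the extra machinery.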
In pose estimation using the associative embedding technique, I still don't have clarity on how we can group the detected points from heatmaps into individual human poses using the associative embedding layer. Is there any code that clearly illustrates this? I'm using the EfficientHRNet approach for pose estimation.
I have extracted keypoints from the heatmaps and need to group those points into individual poses using the embedding layer output.
From an OpenVINO perspective, we could offer:
This model: human-pose-estimation-0007
This IE demo: Human Pose Estimation Python* Demo
This model utilizes the associative embedding technique.
However, if you want to build it from scratch, you'll need to design your own deep learning architecture, then implement and train the neural network.
This research paper might give you some insight into the things you need to decide (e.g., batch size, optimization algorithm, learning rate, etc.).
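As for the grouping step itself: associative embedding trains the network to emit a scalar "tag" per detected keypoint, and keypoints whose tags are close are assigned to the same person. A minimal greedy sketch of that idea in Python (the names and threshold are illustrative, not from any particular implementation):

```python
import numpy as np

def group_keypoints(keypoints, tags, tag_threshold=1.0):
    """Greedily assign keypoints to poses by embedding-tag similarity.

    keypoints: per joint type, a sequence of (x, y) heatmap peaks
    tags: per joint type, the embedding value read off at each peak
    """
    poses, pose_tags = [], []  # detected poses and their running mean tags
    for joint_idx, (pts, joint_tags) in enumerate(zip(keypoints, tags)):
        for pt, tag in zip(pts, joint_tags):
            if pose_tags:
                dists = np.abs(np.asarray(pose_tags) - tag)
                best = int(np.argmin(dists))
                if dists[best] < tag_threshold:
                    poses[best][joint_idx] = pt
                    pose_tags[best] = 0.5 * (pose_tags[best] + tag)
                    continue
            # No sufficiently close pose: start a new person
            poses.append({joint_idx: pt})
            pose_tags.append(tag)
    return poses
```

Real implementations refine this with per-joint ordering by detection score and optimal matching, but tag-distance grouping is the core idea.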
I'm using OpenCV (3.1) SVM with 3 classes. Is there any way to handle input data which does not belong to any of these classes? Is there a possibility to get a probability from the prediction?
I simply want to mark data from an unknown class as "does not belong to any of the trained classes".
Thank you
Looking at the SVM docs (the predict function, in particular), it seems that the best you can do is get the distance to the decision boundary, and it looks like you can only even get that from a binary classifier.
Not sure how constrained to OpenCV you are, but if you can use scikit-learn for your problem, their SVM has a predict_proba function that should be helpful. There is also a predict_log_proba function, if that's your preference. Also note that if you go this route, you'll need to set probability=True when constructing the classifier, before calling its fit function.
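A quick sketch of how that could look, including rejecting low-confidence samples as "unknown" (toy data, and the 0.5 cutoff is an arbitrary value you'd want to tune):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy 3-class data standing in for the real training set
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# probability=True goes on the estimator itself, before fit
clf = SVC(probability=True).fit(X, y)

proba = clf.predict_proba(X)            # shape: (n_samples, 3)
best = proba.max(axis=1)
pred = clf.classes_[proba.argmax(axis=1)]

# Reject anything whose best class probability is under the cutoff
pred = np.where(best < 0.5, -1, pred)   # -1 = "none of the trained classes"
```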
If you're constrained to C/C++, you might look into LibSVM, as it also has the ability to give probabilities, although I'm not as familiar with its API. Also note that the OpenCV and scikit-learn implementations are both based on LibSVM.
Hope one of these works for you!
I'm working on a project related to people detection. I successfully implemented both a HOG SVM based classifier (with libSVM) and a cascade classifier (with OpenCV). The SVM classifier works really well: I tested it over a number of videos and it correctly detects people with very few false positives and false negatives. The problem is the computational time: nearly 1.2-1.3 s over the entire image and 0.2-0.4 s over the foreground patches. Since I'm working on a project that must be able to work in a nearly real-time environment, I switched to the cascade classifier (to get a lower computational time).
So I trained many different cascade classifiers with OpenCV (opencv_traincascade). The output is good in terms of computational time (0.2-0.3 s over the entire image, a lot less when launched only over the foreground), so I achieved that goal, let's say. The problem here is the quality of detection: I'm getting a lot of false positives and a lot of false negatives. Since the only difference between the two methods is the base classifier used in OpenCV (decision trees or decision stumps; in any case no SVM, as far as I understand), I'm starting to think that my problem could be the base classifier (I guess HOG features are, in some way, best separated by hyperplanes).
Of course, the dataset used in libSVM and OpenCV is exactly the same, both for training and for testing. For the sake of completeness, I used nearly 9 thousand positive samples and nearly 30 thousand negative samples.
Here are my two questions:
Is it possible to change the base weak learner in the opencv_traincascade function? If yes, is the SVM one of the possible choices? If both answers are yes, how can I do such a thing? :)
Are there other computer vision or machine learning libraries that implement the SVM as a weak classifier and have some method to train a cascade classifier? (Are these libraries suitable to be used in conjunction with OpenCV?)
Thank you in advance, as always!
Marco.
In principle a weak classifier can be anything, but the strength of AdaBoost-related methods is that they are able to obtain good results out of simple classifiers (they are called “weak” for a reason).
Using an SVM in an AdaBoost cascade is a contradiction: the former has no need to be used in such a framework, since it is able to do its job by itself, and the latter is fast precisely because it takes advantage of weak classifiers.
Furthermore, I don’t know of any study about it, and OpenCV doesn’t support it: you would have to write the code yourself. It is a huge undertaking, and you probably won’t get any interesting result.
Anyway, if you think that HOG features are better suited to your task, OpenCV’s traincascade has an option for them, apart from Haar and LBP.
As to your second question, I’m not sure, but I’m quite confident that the answer is negative.
My advice is: try to get the most you can out of traincascade; for example, try increasing the number of samples if you can and compare the results.
This paper is quite good. It says that an SVM can be treated as a weak classifier if you train it on fewer samples (say, less than half of the training set); samples with higher boosting weights have a greater chance of being used to train the 'weak' SVM.
Unfortunately, the source code is not widely available. If you want a quick prototype, use Python's scikit-learn and see if you can get desirable results before modifying OpenCV.
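For instance, scikit-learn's AdaBoost can boost an SVM, since the base learner only needs to accept sample weights, which SVC does. This is a rough stand-in for the cascade in the paper, not a reimplementation of it, but it's enough to check whether boosted weak SVMs look promising on your data (on older scikit-learn versions the keyword is base_estimator rather than estimator):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

# Toy data standing in for your HOG feature vectors
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# A deliberately "weakened" linear SVM as the base learner; the SAMME
# variant only needs hard predictions plus sample_weight support.
weak_svm = SVC(kernel="linear", C=0.01)
boosted = AdaBoostClassifier(estimator=weak_svm, n_estimators=10,
                             algorithm="SAMME").fit(X, y)
print(boosted.score(X, y))
```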
I'm attempting to use the RBM neural network in sklearn, but I can't find a predict function. I see how you can train it (I think), but I can't seem to figure out how to actually predict a value.
http://scikit-learn.org/stable/auto_examples/neural_networks/plot_rbm_logistic_classification.html#example-neural-networks-plot-rbm-logistic-classification-py
I'm working on a class assignment. This is the assignment:
You will then use randomized hill climbing algorithm to find good weights for a neural network.
Is it possible to do this with sklearn? Is there a better recommended tool for selecting different weights for a NN? (The goal is to experiment with around 3 different search optimization techniques and learn about them, not necessarily to write them, nor to write a NN in this case.)
RBMs do not do prediction tasks. They are generative models. You can use the transform method to get a hidden-state transformation of the input, or you can use the gibbs method to sample from the network.
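The usual supervised recipe, as in the scikit-learn example you linked, is to let the RBM learn features and put a classifier on top. A minimal sketch:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline

X, y = load_digits(return_X_y=True)
X = X / 16.0  # BernoulliRBM expects inputs scaled to [0, 1]

# The RBM learns a hidden representation; logistic regression on top
# of those features is what actually does the predicting.
model = Pipeline([
    ("rbm", BernoulliRBM(n_components=64, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)
print(model.score(X, y))
```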
You will then use randomized hill climbing algorithm to find good weights for a neural network.
No, this is not available in scikit-learn.
It sounds like your assignment might be meant for you to implement a simpler problem from scratch rather than use another library, since hill climbing isn't normally used for training a neural network, and they probably don't want you to do hill climbing for an RBM.
You should probably consult your professor for more direction on what you really should be doing.
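For what it's worth, randomized hill climbing itself is only a few lines, so implementing it from scratch is very doable. A toy sketch over the weights of a one-layer "network" (plain NumPy, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data with a handful of features
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def loss(w):
    # Logistic loss of a single-layer network with weight vector w
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

w = rng.normal(size=5)
best = loss(w)
for _ in range(5000):
    candidate = w + rng.normal(scale=0.1, size=5)  # random neighbor
    c_loss = loss(candidate)
    if c_loss < best:                              # accept only improvements
        w, best = candidate, c_loss
print(best)
```

The "randomized" part usually amounts to restarting from several initial weight vectors and keeping the best run.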
I'm performing an experiment in which I need to compare the classification performance of several classification algorithms for spam filtering, viz. Naive Bayes, SVM, J48, k-NN, Random Forests, etc. I'm using the WEKA data mining tool. While going through the literature I came to know about various dimension reduction methods, which can be broadly classified into two types:
Feature Reduction: Principal Component Analysis, Latent Semantic Analysis, etc.
Feature Selection: Chi-Square, InfoGain, GainRatio, etc.
I have also read this WEKA tutorial by Jose Maria on his blog: http://jmgomezhidalgo.blogspot.com.es/2013/02/text-mining-in-weka-revisited-selecting.html
In this blog post he writes, "A typical text classification problem in which dimensionality reduction can be a big mistake is spam filtering". So now I'm confused about whether dimensionality reduction is of any use in the case of spam filtering.
Further, I have also read in the literature about Document Frequency and TF-IDF being feature reduction techniques. But I'm not sure how they work and come into play during classification.
I know how to use WEKA, chain filters and classifiers, etc. The problem I'm facing is that, since I don't have enough of an idea about feature selection/reduction (including TF-IDF), I am unable to decide how and which feature selection techniques and classification algorithms I should combine to make my study meaningful. I also have no idea about the optimal threshold value that I should use with chi-square, info gain, etc.
In the StringToWordVector class, I have the option of IDFTransform, so does it make sense to set it to TRUE and also use a feature selection technique, say InfoGain?
Please guide me, and if possible please provide links to resources where I can learn about dimension reduction in detail and can plan my experiment meaningfully!
Well, Naive Bayes seems to work best for spam filtering, and it doesn't play nicely with dimensionality reduction.
Many dimensionality reduction methods try to identify the features with the highest variance. That of course won't help much with spam detection; you want discriminative features.
Plus, there is not only one type of spam but many, which is likely why naive Bayes works better than many other methods that assume there is only one type of spam.
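To make the distinction concrete outside of WEKA: feature selection methods like chi-square score each term by how strongly it is associated with the class labels, which is exactly the "discriminative" criterion, whereas PCA-style reduction only looks at variance. A toy scikit-learn sketch of the selection-plus-Naive-Bayes pipeline (the corpus and k are purely illustrative; in WEKA the analogous chain would be StringToWordVector, an AttributeSelection filter with ChiSquaredAttributeEval, then NaiveBayes):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

docs = ["win cash now", "cheap pills online", "meeting at noon",
        "project report attached", "free money win", "lunch tomorrow?"]
labels = [1, 1, 0, 0, 1, 0]  # 1 = spam, 0 = ham

# chi2 keeps the k terms most associated with the class labels
# (discriminative), unlike variance-based reduction such as PCA.
model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("select", SelectKBest(chi2, k=5)),
    ("nb", MultinomialNB()),
]).fit(docs, labels)
print(model.predict(["free cash pills", "see you at the meeting"]))
```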