How to save the result of feature selection in Weka? - weka

I’m trying to use InfoGainAttributeEval in Weka for feature selection. How do I save the result? I tried to save it, but Weka seems to save only my input data, not the result of the feature selection.

Welcome to SO. As far as I understand, you want the ranked values of the attributes. To get them, right-click the "Ranker + InfoGainAttributeEval" entry in the "Result List" section and select "Save result buffer". You can open the saved file in a text editor such as Notepad, or import it into Excel and chart it. I assume you selected "Ranker" in the Search Method section.
After you run "InfoGainAttributeEval" with "Ranker" (with "Use full training set" selected), Weka gives you a ranked list. Right-click the result, select "Save Reduced Data", and save the file. You can open the saved file in Notepad, and in Weka as well. In Weka, select the attributes whose rank value is 0 and delete them with "Remove", keeping only those with a nonzero rank value. Save the result in .arff format; this is your reduced data set.
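For anyone curious what the ranked list measures, here is a minimal Python sketch of information gain for nominal attributes. It is not Weka's implementation (Weka's evaluator also copes with numeric attributes, which this sketch ignores), and the attribute names and toy data are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def info_gain(values, labels):
    """Information gain of one nominal attribute with respect to the class."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_attributes(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    scores = [(name, info_gain(vals, labels)) for name, vals in columns.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: "outlook" determines the class, "windy" does not.
columns = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "windy": ["yes", "no", "yes", "no"],
}
labels = ["no", "no", "yes", "yes"]

ranked = rank_attributes(columns, labels)
# Keep only attributes with a nonzero score, as the answer suggests.
kept = [name for name, gain in ranked if gain > 1e-12]
```

Attributes that end up with a score of 0 carry no information about the class in this measure, which is why the answer removes them.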

If "Save Reduced Data" is not working for you, here is another approach.
Attribute selection can be accomplished in the Preprocess tab.
There is a bar near the top for Filtering the data. Click the
"Choose" button. Under Filters->Supervised->Attribute you will
find AttributeSelection. Select that.
Once it says "AttributeSelection" in the Filter bar, you can click
on the bar to pick a selection method and a search method as well as
set the parameters for those choices.
Once you have made your choices for the feature selection algorithm,
click Apply to the right of the filter bar so that the filter is
actually applied to the data. The data should now have the reduced
feature set. So all you need to do is save it by clicking on the
Save button at the top right.
This should save the reduced data set.

Related

Formula help on IF ELSE on Smartsheet

I want a condition where, if the Delivered column checkbox is checked, the whole row is deleted. Is that feasible?
How can I get started?
Formulas can't change the condition of an item (like a row), only the value in a cell. So, in other words, you can't delete a row with a formula.
You "could" do this with an external script using the Smartsheet API, but you'll want to take the situations that @Ken White mentioned in the comments into account. Your script should make sure that there is a way for users to recover the deleted row if the box is checked by mistake.
There are a couple of ways this might be possible. If you set up a default filter on the sheet so that it always loads only the rows where the complete box is unchecked, then any tasks you check off will no longer be visible the next time the sheet loads.
To do this:
Create a new filter.
Title it and check the Share Filter checkbox.
Set the criteria so that the checkbox is unchecked.
Click OK.
Save the sheet to save the shared filter.
Click on SHARE.
Scroll down and click edit next to the default view.
Set the filter to the new filter you saved.
Save.
Check off some boxes and save the sheet.
Reload the sheet and the completed items will not be visible.

improve weka classifier results

I have a database which consists of 27 attributes and 597 instances.
I want to classify it with the best results possible using Weka.
It does not matter which classifier is used. The class attribute is nominal and the rest are numeric.
The best results so far were LWL (83.2215) and OneR (83.389). I used the attribute selection filter, but the results did not improve, and no other classifier gives better results, not even NN, SMO, or the meta classifiers.
Any ideas on how to improve on this, knowing that there are no missing values and that the database covers 597 patients gathered over three years?
Have you tried boosting or bagging? These generally can help improve results.
http://machinelearningmastery.com/improve-machine-learning-results-with-boosting-bagging-and-blending-ensemble-methods-in-weka/
Boosting
Boosting is an ensemble method that starts out with a base classifier
that is prepared on the training data. A second classifier is then
created behind it to focus on the instances in the training data that
the first classifier got wrong. The process continues to add
classifiers until a limit is reached in the number of models or
accuracy.
Boosting is provided in Weka in the AdaBoostM1 (adaptive boosting)
algorithm.
Click “Add new…” in the “Algorithms” section. Click the “Choose”
button. Click “AdaBoostM1” under the “meta” selection. Click the
“Choose” button for the “classifier” and select “J48” under the “tree”
section and click the “choose” button. Click the “OK” button on the
“AdaBoostM1” configuration.
Bagging
Bagging (Bootstrap Aggregating) is an ensemble method that creates
separate samples of the training dataset and creates a classifier for
each sample. The results of these multiple classifiers are then
combined (such as averaged or majority voting). The trick is that each
sample of the training dataset is different, giving each classifier
that is trained, a subtly different focus and perspective on the
problem.
Click “Add new…” in the “Algorithms” section. Click the “Choose”
button. Click “Bagging” under the “meta” selection. Click the “Choose”
button for the “classifier” and select “J48” under the “tree” section
and click the “choose” button. Click the “OK” button on the “Bagging”
configuration.
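To make the bagging idea concrete, here is a small Python sketch of bootstrap sampling plus majority voting. It is an illustration only, not Weka's Bagging: the base learner is a toy 1-nearest-neighbour rule standing in for J48, and the data, seed, and number of models are arbitrary assumptions.

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def train_1nn(sample):
    """Toy base learner: predict the label of the nearest training point."""
    def predict(x):
        return min(sample, key=lambda point: abs(point[0] - x))[1]
    return predict

def bagging(data, n_models=11, seed=1):
    """Train one base learner per bootstrap sample; predict by majority vote."""
    rng = random.Random(seed)
    models = [train_1nn(bootstrap(data, rng)) for _ in range(n_models)]
    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]
    return predict

# Hypothetical one-feature dataset with two well-separated classes.
data = ([(float(x), "a") for x in range(5)]
        + [(float(x), "b") for x in range(10, 15)])
model = bagging(data)
```

Each model sees a slightly different sample, so individual mistakes tend to be outvoted when the predictions are combined.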
I tried boosting and bagging as @applecrusher suggested. It gave a small improvement in accuracy, but for the same data I was getting much better accuracy with SKLearn. When I compared the code and output at each step, I found that SKLearn's train-test split function shuffles the data by default. When I shuffled the data for WEKA using Collections.shuffle(), I saw improved results. Give it a try.
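Collections.shuffle() is a Java call; the same idea in Python looks roughly like the sketch below. The function name, split fraction, seed, and toy rows are assumptions for illustration. The point is that data stored in class order should be shuffled before a percentage split, and that a fixed seed keeps the split reproducible.

```python
import random

def shuffled_split(rows, train_fraction=0.66, seed=42):
    """Shuffle a copy of rows with a fixed seed, then split into train/test."""
    rows = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical dataset stored in class order, which is the problematic case.
rows = ([("patient%d" % i, "healthy") for i in range(6)]
        + [("patient%d" % i, "sick") for i in range(6, 12)])

train, test = shuffled_split(rows)
```

Without the shuffle, the last third of this list would contain only one class, so the test set would not be representative.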

Dealing with huge select lists

I often use select lists in my projects, but I couldn't find a solution for a huge select list. I need an easy, plug-and-play solution, because it will be used in a few places.
When a select box or text box has to be filled from model data, I want to show the user a text box with a button to its right for choosing the value. Clicking that button opens a popup or modal where I can filter all the records and find my value; clicking the value closes the modal or popup and puts the chosen value into the form control.
Imagine you have a text box for choosing a customer from among 2500 customers.
PS: don't suggest autocomplete, I don't want to go that route.
Why don't you look at something like the Chosen plugin, http://harvesthq.github.io/chosen/? It allows you to easily search large select lists.

Weka - How to remove an attribute whose all values are missing?

I have a CSV file containing data for a market-basket analysis. I have imported the file successfully into Weka, but I found that some attributes do not have any value, i.e., all values are missing. Weka doesn't let me use the Apriori algorithm with this data, so I would like to know if there is a way to remove those attributes from the imported data.
PS.: There are thousands of attributes, so I don't want to specify the attributes that need to be removed.
You can remove them using the "Remove" filter in WEKA's Explorer.
Once the data has been loaded into WEKA:
1) Go to "Preprocess" (1st main tab).
2) In the "Filter" area, click "Choose" to pick a filter.
3) Navigate through the tree to "filters" -> "unsupervised" -> "attribute" -> "Remove".
4) Once the "Remove" filter has been chosen, click the "Remove" label next to the "Choose" button; this opens a dialog.
5) Fill out the attributeIndices text field with the index/indices of the attributes to be removed, e.g. "1,4,10" or "1-3,7".
6) Click "OK" in the dialog, then click the "Apply" button in the filter area of the main window.
That's all!
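The attributeIndices field in step 5 takes 1-based indices and ranges. As an illustration of how such a spec expands, here is a short Python sketch; it handles only plain numbers and "a-b" ranges and is not Weka's own parser.

```python
def parse_indices(spec):
    """Expand a 1-based index spec such as '1-3,7' into a sorted list."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-"))
            indices.update(range(start, end + 1))   # ranges are inclusive
        else:
            indices.add(int(part))
    return sorted(indices)

selected = parse_indices("1-3,7")
```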
I believe weka.filters.unsupervised.attribute.RemoveUseless might help
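If you would rather clean the CSV before importing it, dropping the columns in which every value is missing is straightforward. Below is a minimal Python sketch, assuming a rectangular CSV with a header row and assuming that missing values appear as empty fields or "?" (adjust MISSING for your file). The function name and sample data are made up for illustration.

```python
import csv
import io

MISSING = {"", "?"}   # assumed markers for a missing value

def drop_all_missing_columns(csv_text):
    """Return csv_text without the columns whose values are all missing."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    # Keep a column if at least one data row has a real value in it.
    keep = [i for i in range(len(header))
            if any(row[i].strip() not in MISSING for row in body)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()

# Hypothetical market-basket sample: the "milk" column is entirely empty.
sample = "bread,milk,eggs\nt,,?\n?,,t\n"
cleaned = drop_all_missing_columns(sample)
```

This avoids listing thousands of attribute indices by hand, since the columns to drop are detected from the data itself.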

Weka Vote: how to tune individual classifier

Hi, I am using weka.classifiers.meta.Vote. I am combining three classifiers, one of which is SMO. I want to know how I can set the parameter values for SMO specifically. Is there a way to do this using the graphical user interface? I want to change the value of C.
Yes -- In the classify dialog, just click the name of the classifier next to the choose button, and a dialog with the parameters will be displayed.
It's not very intuitive, because it isn't a button you push to get the parameters dialog; rather, it's the label with the classifier name and parameters next to the choose button.