Why won't Weka allow me to start Association Rule generation? - weka

I have created an arff file for a data set that I would like to use in Weka. The file is formatted as a sparse arff file. Anyway, I have successfully loaded in the data. I then switch to the Association tab and set my parameters. However, the Start button won't become enabled, so I can't click it to start the association generation. Why is this? Has anyone run into this issue before and know how to solve it?

All attributes must be nominal; the Apriori method does not work on numeric attributes.

You may want to check the attribute types in your arff file. Weka is very particular about types when they are used for associations and even though it may let you set parameters, the routine will not run.
Try using an attribute declaration like the following:
@attribute "attr1" {t}
And define your rows as follows:
{1957 "t", 9163 "t", 10143 "t"}
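To check the attribute types before switching to the Associate tab, the header of the arff file can be scanned for declarations that are not nominal. The following is a minimal sketch using only the Java standard library, not the Weka API; the class and method names are hypothetical, and it assumes each attribute declaration sits on a single line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Sketch: list the attributes in an ARFF header that are not declared as nominal.
class ArffTypeCheck {

    // Returns the names of attributes whose declared type is not a {...} value list.
    static List<String> nonNominalAttributes(List<String> arffLines) {
        List<String> offenders = new ArrayList<>();
        for (String raw : arffLines) {
            String line = raw.trim();
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("@data")) {
                break; // the header is over, the rest is instance data
            }
            if (!lower.startsWith("@attribute")) {
                continue; // comments, @relation, blank lines
            }
            String rest = line.substring("@attribute".length()).trim();
            if (rest.isEmpty()) {
                continue; // malformed declaration, nothing to inspect
            }
            String name;
            String type;
            char first = rest.charAt(0);
            if (first == '"' || first == '\'') {
                // Quoted name: read up to the matching closing quote.
                int close = rest.indexOf(first, 1);
                if (close < 0) {
                    continue; // unterminated quote, skip the line
                }
                name = rest.substring(1, close);
                type = rest.substring(close + 1).trim();
            } else {
                // Unquoted name: read up to the first whitespace or opening brace.
                int split = 0;
                while (split < rest.length()
                        && !Character.isWhitespace(rest.charAt(split))
                        && rest.charAt(split) != '{') {
                    split++;
                }
                name = rest.substring(0, split);
                type = rest.substring(split).trim();
            }
            if (!type.startsWith("{")) {
                offenders.add(name); // numeric, string, date, ...
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        List<String> header = List.of(
                "@relation baskets",
                "@attribute \"attr1\" {t}",
                "@attribute age numeric",
                "@attribute 'item name' {t}",
                "@data");
        System.out.println(nonNominalAttributes(header)); // prints [age]
    }
}
```

For a real file, the lines could be read with java.nio.file.Files.readAllLines and passed to the same method. Any name it reports is an attribute that would need to be converted to nominal first.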

Related

strange Train and test set are not compatible error in weka

I have read many solutions for this error, but my problem is different from the others. I have a "train" dataset (arff) and a "test" dataset (arff), and both have a string attribute "id". Everything works if I remove "id" from both files at the same time (if I don't remove it from "test", I get an error). What confuses me is that my friend can do it by removing "id" only from "train", so his output contains the "id".
(Since he didn't remove "id" from "test", the number of attributes is not the same, which goes against what I have read: that the number of attributes should be exactly the same.)
I really need output that contains the "id".
Maybe I did something wrong with the "remove"? I read somewhere that the test set may have more attributes than the training set. I also found a paragraph describing how to remove the attribute: "Instead of using a nominal ID attribute, declare it as STRING
attribute. With this you don't have to declare each possible value
like with NOMINAL attributes and it therefore doesn't matter what
strings are used in the test set that you're trying to use the trained
model on. In order to be able to work with this STRING ID attribute
you have to use the FilteredClassifier in conjunction with the Remove
filter (package weka.filters.unsupervised.attribute) and your original
base classifier. This setup will remove the ID attribute for the
learning process (i.e., the base classifier), but you'll still be able
to use it outside for tracking instances. "
http://weka.8497.n7.nabble.com/use-saved-model-td22857.html
Does anyone have an idea?
Any help will be appreciated.
My 2 arff files, left: train; right: test
Left: my friend's output with ids such as test_subject1005; right: my output
I finally found my solution: just select "Supplied test set" directly and click "Yes" in the prompt that appears. That's all! (It seems I had not seen this prompt before, so I did not try it.)

libpq: get data type

I am writing a C++ project that uses a PostgreSQL database.
I created a table in my database with a column of type character varying(40).
Now I need to SELECT this data from the table in my C++ project. I know that I should use libpq, the PostgreSQL interface for C/C++.
I have succeeded in selecting data from the table. Now I am wondering whether it is possible to get the data type of the column. For example, here I want to get character varying(40).
You need to use PQftype.
As described here: http://www.idiap.ch/~formaz/doc/postgreSQL/libpq-chapter17861.htm
And just take a look here about decoding return values: http://www.postgresql.org/message-id/da7021e0608040738l3b0880a1q5a76b838937f8c78@mail.gmail.com
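You can also use PQfsize to get the field size. Note that PQfsize returns a negative value for variable-length types such as character varying; the declared length is carried in the type modifier, which PQfmod returns.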
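Decoding the return values into something like character varying(40) is plain arithmetic once the numbers are in hand. The sketch below shows that step in Java for consistency with the other examples here; the logic carries over to C++ unchanged. It assumes the two inputs are the type OID from PQftype and the type modifier from libpq's PQfmod, that 1043 is the OID of character varying in pg_type, and that the varchar modifier is the declared length plus a 4-byte header. The class and method names are hypothetical, and only varchar is handled.

```java
// Sketch: turn a (type OID, type modifier) pair into a readable type name.
class PgTypeName {
    static final int VARCHAR_OID = 1043; // character varying in pg_type
    static final int VARHDRSZ = 4;       // header size included in the varchar modifier

    static String describe(int typeOid, int typeModifier) {
        if (typeOid == VARCHAR_OID) {
            if (typeModifier < VARHDRSZ) {
                // A modifier of -1 means no length was declared.
                return "character varying";
            }
            return "character varying(" + (typeModifier - VARHDRSZ) + ")";
        }
        // Other types are not decoded here; the OID can be looked up in pg_type.
        return "oid " + typeOid;
    }

    public static void main(String[] args) {
        // Values as they would come back for a character varying(40) column.
        System.out.println(describe(1043, 44)); // prints character varying(40)
    }
}
```

An alternative that avoids hard-coded OIDs is to ask the server itself, for example by querying format_type(oid, typmod) with the two values.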

Nominal to binary conversion in weka tool

I was trying to preprocess the Leukemia dataset, which has two classes, ALL and AML. I need to convert it into binary values. I used the NominalToBinary filter, but it does not convert the class to binary values. My Weka version is 3.6.11.
Well, on my 3.6 version of Weka, it is working.
1. Load the file in the Explorer.
2. Go to Filter -> Weka filters -> unsupervised -> attribute -> NominalToBinary.
3. In the attributeIndices, indicate the "nominal" attribute index that you are trying to change to "binary".
4. Leave all other options to default. Click OK.
5. Click apply.
To get the NominalToBinary filter to work on the class attribute,
make sure the attribute selected in the class dropdown is changed to another attribute, temporarily, then you can switch back after applying the filter.
Weka apparently does not let you apply the NominalToBinary filter on the selected class attribute.

Remove Missing Values in Weka

I'm using a dataset in Weka for classification that includes missing values. As far as I understand, Weka replaces them automatically with the modes or means of the training data (using the filter unsupervised/attribute/ReplaceMissingValues) when using a classifier like NaiveBayes.
I would like to try removing them, to see how this affects the quality of the classifier. Is there a filter to do that?
See this answer below for a better, modern approach.
My approach is not perfect, because it becomes quite cumbersome to apply if you have more than 5 or 6 attributes, but if only a few attributes have missing values I suggest using MultiFilter for this purpose.
If you have missing values in 2 attributes then you'll use RemoveWithValues 2 times in a MultiFilter.
1. Load your data in Weka Explorer.
2. Select MultiFilter from the Filter area.
3. Click on MultiFilter and add one RemoveWithValues filter per attribute.
4. Configure each RemoveWithValues filter with the attribute index and set matchMissingValues to True.
5. Save the filter settings and click Apply in Explorer.
Use the removeIf() method on weka.core.Instances with a method reference to hasMissingValue from weka.core.Instance, which returns true if a given Instance has any missing values.
Instances dataset = source.getDataSet(); // for some source
dataset.removeIf(Instance::hasMissingValue);
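The same removeIf pattern can be tried without Weka on the classpath. In the sketch below, Row is a hypothetical stand-in for weka.core.Instance, and a missing value is assumed to be stored as NaN; only the removeIf call with a method reference mirrors the two lines above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: drop every row that has at least one missing value.
class DropMissingRows {

    // Hypothetical stand-in for weka.core.Instance; NaN marks a missing value.
    static class Row {
        final double[] values;

        Row(double... values) {
            this.values = values;
        }

        boolean hasMissingValue() {
            for (double v : values) {
                if (Double.isNaN(v)) {
                    return true;
                }
            }
            return false;
        }
    }

    // Returns a copy of the rows with all incomplete rows removed.
    static List<Row> withoutMissing(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows); // leave the caller's list untouched
        copy.removeIf(Row::hasMissingValue);
        return copy;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1.0, 2.0),
                new Row(Double.NaN, 3.0),
                new Row(4.0, 5.0));
        System.out.println(withoutMissing(rows).size()); // prints 2
    }
}
```

Unlike this sketch, calling removeIf directly on the dataset modifies it in place, so a copy is worth keeping if the original data is still needed for comparison.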

Add new attribute calculated based on other attributes

I'm starting with WEKA and want to achieve the following.
I have a file with 2 attributes: user_id, user_age.
I can successfully load data using WEKA API and get Instances object.
Now I want to calculate new attribute user_age_range - like (0-18) - 0, (19-25) - 1, etc.
Is there a way to calculate this attribute using WEKA Filters?
I would also prefer not to iterate manually through all instances, but to define a method that operates on a single Instance and use some filter (or other abstraction) that applies the corresponding "transformation" to all instances.
Please advise how I could achieve this.
Thanks in advance.
After looking through the docs I found one or two filters that you could use in conjunction to achieve what you want.
http://weka.sourceforge.net/doc.dev/weka/filters/unsupervised/attribute/Copy.html
Use Copy to create a copy of the attribute that you will transform.
http://weka.sourceforge.net/doc.dev/weka/filters/unsupervised/attribute/NumericTransform.html
NumericTransform takes a class name and a method name as options. You could write your own class with a method that maps each age to the range you want, and supply that class and method as the options.
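A class for that purpose could look like the sketch below. The class name AgeRange and the method name toRange are hypothetical. The first two ranges come from the question; the final catch-all bucket is an assumption, since the question does not list the remaining ranges. It is also an assumption that the filter expects a static method taking a double and returning a double, in the same way it can call the methods of java.lang.Math.

```java
// Sketch: map an age to a range code, written so it can be tested on its own.
// To hand it to the filter, it is probably safest to declare the class public
// in its own file (AgeRange.java) and put it on Weka's classpath.
class AgeRange {

    // static double -> double, the shape assumed to be required by the filter
    public static double toRange(double age) {
        if (age <= 18) {
            return 0; // range (0-18)
        }
        if (age <= 25) {
            return 1; // range (19-25)
        }
        return 2;     // hypothetical catch-all for every older age
    }

    public static void main(String[] args) {
        System.out.println(toRange(17)); // prints 0.0
        System.out.println(toRange(22)); // prints 1.0
        System.out.println(toRange(40)); // prints 2.0
    }
}
```

The result is still a numeric attribute holding the codes 0, 1, 2; if a nominal attribute is needed afterwards, a further conversion step would be required.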
Hope this helps
With a csv file you can do that in Excel.
If you are using arff files, convert the file to csv first. Then add as many columns as you need new attributes, write the formula that computes each new value from one or more existing attributes in the first row, and extend it to all rows.