Remove instances where nominal attribute = value (Weka GUI) - weka

I have a dataset with 400 instances and one nominal attribute, cluster, with the values:
cluster1
cluster2
...
cluster10
How do I remove instances where e.g. cluster=cluster5? (using the GUI)
I was told to use the filter weka.filters.unsupervised.instance.RemoveWithValues, but it seems able to remove only instances whose numeric value falls below a certain splitPoint. I could of course use the Edit window, but note that I have 400 instances!

weka.filters.unsupervised.instance.RemoveWithValues can also remove instances by nominal value. In the filter's options, set attributeIndex to the index of the desired attribute, then enter in the "nominalIndices" field the index of the nominal value whose instances you would like to have removed.
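To make the index lookup concrete, here is a minimal plain-Java sketch. It does not call the Weka API; it only models the idea that the number typed into nominalIndices is the 1-based position of the label in the attribute's declared label list. The class and method names are hypothetical, and the label order cluster1 through cluster10 is an assumption about how the attribute was declared.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: models how a label maps to a nominal index.
class NominalIndexSketch {

    // Returns the 1-based position of a label in the declared label list,
    // i.e. the number to enter in nominalIndices. Returns 0 if the label is absent.
    static int nominalIndex(List<String> declaredLabels, String label) {
        return declaredLabels.indexOf(label) + 1;
    }

    // Models the effect on the data: keeps only the rows whose label differs.
    static List<String> removeWithLabel(List<String> column, String label) {
        List<String> kept = new ArrayList<>(column);
        kept.removeIf(label::equals);
        return kept;
    }

    public static void main(String[] args) {
        // Assumed declaration order: cluster1, cluster2, ..., cluster10.
        List<String> labels = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            labels.add("cluster" + i);
        }
        System.out.println("nominalIndices = " + nominalIndex(labels, "cluster5"));

        // A tiny stand-in for the cluster column of the dataset.
        List<String> column = Arrays.asList("cluster1", "cluster5", "cluster2", "cluster5");
        System.out.println(removeWithLabel(column, "cluster5"));
    }
}
```

If the labels were declared in a different order, the index to enter changes accordingly, so it is worth checking the label list shown for the attribute in the Preprocess panel.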

Related

Algolia Grouping (attributeForDistinct)

I'm grouping data with the attributeForDistinct option from the Algolia dashboard. Is there any way to use this option from react-instantsearch-dom? There is a distinct option in the dashboard that can be switched between true and false from code. I want to set the attributeForDistinct option from code so I can group data dynamically.
You would turn on distinct in your index configuration, or you can configure it client side by adding it to a Configure widget.
Here's a simple example:
https://www.algolia.com/doc/api-reference/widgets/configure/react/#examples
Set distinct to an integer value between 0 and 4 to control how many records with the same attributeForDistinct value are returned in the result set.

Set all negative attribute values to zero in weka

I have an attribute that I added using AddExpression filter, and now I want to change its values so that all negative values are set to zero. I tried using MathExpression filter like this:
MathExpression -E "ifelse(A > 0, A, 0)" -V -R 17
17 is the attribute index shown in Weka's Preprocess/Attributes panel. But after applying the filter, the minimum value for my attribute is still -5, not 0 as expected. What am I doing wrong?
In case it matters, I removed some attributes before applying this filter, so the attribute index changed.
The problem appears to be some kind of bug: it sometimes works and sometimes doesn't, and I don't know how to reproduce it reliably. However, I've found a workaround that works if there aren't many negative values.
If you click on the Edit button, you can sort rows by the attribute you want to modify and manually change negative values to zeros. If you want to preserve the original row order, add an ID attribute with the AddID filter before sorting. After you've finished modifying values, sort the data by ID to restore the original order.
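If there are too many negative values to edit by hand, the same rule can be applied in code. The following is a minimal plain-Java sketch of the clamping rule from the expression ifelse(A > 0, A, 0), applied to one column held as an array; it is not the Weka filter itself. The class and method names are hypothetical, and the sketch assumes missing values are encoded as NaN and should be left untouched.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: applies the rule ifelse(A > 0, A, 0) to one column.
class ClampNegativesSketch {

    // Returns a copy in which every non-positive value is replaced by 0.
    // Row order is preserved because values are rewritten position by position.
    static double[] clampNegativesToZero(double[] values) {
        double[] result = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            double v = values[i];
            if (Double.isNaN(v)) {
                result[i] = v;              // assumption: NaN marks a missing value, keep it
            } else {
                result[i] = v > 0 ? v : 0.0;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] column = {-5.0, 3.0, -0.5, 7.25};
        System.out.println(Arrays.toString(clampNegativesToZero(column)));
    }
}
```

Because nothing is sorted, there is no need for an extra ID attribute to restore the original order afterwards.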

Infragistics UltraGrid - How to use displayed values in group by headers when using an IEditorDataFilter?

I have a situation where I'm using the IEditorDataFilter interface within a custom UltraGrid editor control to automatically map values from a bound data source when they're displayed in the grid cells. In this case it's converting guid-based key values into user-friendly values, and it works well by displaying what I need in the cell, but retaining the GUID values as the 'value' behind the scenes.
My issue is what happens when I enable the built-in group by functionality and the user groups by a column using my editor. In that case the group by headers default to using the cell's value, which is the guid in my case, so I end up with headers like this:
Column A: 7F720CE8-123A-4A5D-95A7-6DC6EFFE5009 (10 items)
What I really want is the cell's display value to be used instead so it's something like this:
Column A: Item 1 (10 items)
What I've tried so far
Infragistics provides a couple of mechanisms for modifying what's shown in group by rows:
GroupByRowDescriptionMask property of the grid (http://bit.ly/1g72t1b)
Manually set the row description via the InitializeGroupByRow event (http://bit.ly/1ix1CbK)
Option 1 doesn't appear to give me what I need because the cell's display value is not exposed in the set of tokens they provide. Option 2 looks promising but it's not clear to me how to get at the cell's display value. The event argument only appears to contain the cell's backing value, which in my case is the GUID.
Is there a proper approach for using the group by functionality when you're also using an IEditorDataFilter implementation to convert values?
This may be frowned upon, but I asked my question on the Infragistics forums as well, and a complete answer is available there (along with an example solution demonstrating the problem):
http://www.infragistics.com/community/forums/p/88541/439210.aspx
In short, I was applying my custom editors at the cell level, which made them unavailable when the rows were grouped together. A better approach would be to apply the editor at the column level, which would make the editor available at the time of grouping, and would provide the expected behavior.

Remove Missing Values in Weka

I'm using a dataset in Weka for classification that includes missing values. As far as I understand, Weka replaces them automatically with the mode or mean of the training data (using the filter unsupervised/attribute/ReplaceMissingValues) when using a classifier like NaiveBayes.
I would like to try removing them instead, to see how this affects the quality of the classifier. Is there a filter to do that?
See the answer further below for a better, more modern approach.
My approach is not perfect: with more than 5 or 6 attributes it becomes quite cumbersome to apply. If only a few attributes have missing values, though, MultiFilter can be used for this purpose.
If you have missing values in 2 attributes, you add RemoveWithValues 2 times to a MultiFilter, once per attribute.
Load your data in Weka Explorer
Select MultiFilter from the Filter area
Click on MultiFilter and Add RemoveWithValues
Then configure each RemoveWithValues filter with the attribute index and select True in matchMissingValues
Save the filter settings and click Apply in Explorer.
Use the removeIf() method on weka.core.Instances with a method reference to weka.core.Instance's hasMissingValue method, which returns true if the given Instance has any missing values.
import weka.core.Instance;
import weka.core.Instances;
Instances dataset = source.getDataSet(); // for some source, e.g. a ConverterUtils.DataSource
dataset.removeIf(Instance::hasMissingValue); // drops every instance with at least one missing value

How can I query to get multiple values in SimpleDB (AWS)

In the picture I have highlighted one part. I have an attribute called "deviceModel" that contains more than one value. I want a query that returns, from my domain, the items (by itemName()) whose deviceModel attribute holds more than one value.
Thanks,
Senthil Raja
There is no direct way to get what you are asking for. You need to handle it in your own code. Running a SELECT query returns each item's attribute-value pairs, so you need to traverse each itemName() and count the values of the attribute you are interested in.
I think what you are referring to is called multi-valued attributes. When you put a value into an attribute without replacing the existing value, the values accumulate, giving you an array of values attached to that attribute name.
How you create them will depend on the SDK/language you are using for your REST calls; look for the Replace=true/false option when you set the attribute's value.
Here is the documentation page on retrieving them: http://docs.amazonwebservices.com/AmazonSimpleDB/latest/DeveloperGuide/ (look under Using Amazon SimpleDB -> Using Select to Create Amazon SimpleDB Queries -> Queries on Attributes with Multiple Values)
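The "traverse each item and count the values" step suggested above can be sketched in plain Java as follows. This is only an illustration and makes no SimpleDB calls: it assumes you have already copied the SELECT results into a map from item name to attribute name to list of values. The class name, method name and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: works on data already fetched from the domain, not on SimpleDB itself.
class MultiValuedAttributeSketch {

    // Returns the names of the items whose given attribute holds more than one value.
    static List<String> itemsWithMultipleValues(
            Map<String, Map<String, List<String>>> items, String attribute) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Map<String, List<String>>> item : items.entrySet()) {
            List<String> values =
                    item.getValue().getOrDefault(attribute, Collections.emptyList());
            if (values.size() > 1) {
                names.add(item.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for SELECT results.
        Map<String, Map<String, List<String>>> items = new LinkedHashMap<>();
        items.put("item1", Collections.singletonMap("deviceModel", Arrays.asList("modelA", "modelB")));
        items.put("item2", Collections.singletonMap("deviceModel", Arrays.asList("modelA")));
        System.out.println(itemsWithMultipleValues(items, "deviceModel"));
    }
}
```

A LinkedHashMap is used in the example only so that item names come back in insertion order.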