Negative filtering by filter_box or some other mechanism - apache-superset

Let's say I have a column named Column1. There are more than 10k different values for this column, but my goal is to display on a dashboard all data except for a few of them. Is it possible to achieve this in Superset? As far as I understand, the only option to filter a dashboard is a filter_box, and I have to choose values explicitly in the filter_box, so there is no way to use a negative filter. Is that true, or is there some hidden mechanism?

You can use the limit selector values option to filter out the values you don't need: specify the column name and the list of values you would like to ignore, using the appropriate condition such as equals, not equals, etc.

Related

Using Filter Formula with Multiple Criteria and Allowing Blank Criteria

Below is a link to a file with fake data. The 2nd tab is designed to allow users to filter the larger data set by selecting criteria from the drop-downs. How do I design a formula (currently located in cell B12) so that it filters the larger data set from Sheet1, but if the user leaves "Activity" blank, it returns all results? Currently, the formula requires a selection in every dropdown. I want it so that if someone does NOT select something in a dropdown, it returns all results matching the criteria that have been entered.
Dummy data
=FILTER(Sheet1!$A$2:$F,
IF(C2<>"", Sheet1!$A$2:$A>=$C$2, Sheet1!$A$2:$A<>""),
IF(F2<>"", Sheet1!$A$2:$A<=$F$2, Sheet1!$A$2:$A<>""),
IF(C4<>"", Sheet1!$B$2:$B =$C$4, Sheet1!$B$2:$B<>""),
IF(F4<>"", Sheet1!$C$2:$C =$F$4, Sheet1!$C$2:$C<>""))
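The idea behind the formula above is that each criterion only constrains the result when its cell is filled in, and otherwise collapses to a condition that is true for every row. As a rough sketch of the same logic outside a spreadsheet (the column names `date`, `activity`, and `owner` and the sample rows are hypothetical, since the real sheet is not shown here), it could look like this in Python:

```python
from datetime import date

# Hypothetical sample rows; the real sheet's columns are not shown in the post.
ROWS = [
    {"date": date(2021, 1, 5), "activity": "Call", "owner": "Ann"},
    {"date": date(2021, 2, 10), "activity": "Email", "owner": "Bob"},
    {"date": date(2021, 3, 15), "activity": "Call", "owner": "Bob"},
]


def is_blank(value):
    """A criterion counts as 'not selected' when it is None or an empty string."""
    return value is None or value == ""


def filter_rows(rows, start=None, end=None, activity=None, owner=None):
    """Keep rows that satisfy every criterion that was actually supplied."""
    def keep(row):
        return (
            (is_blank(start) or row["date"] >= start)
            and (is_blank(end) or row["date"] <= end)
            and (is_blank(activity) or row["activity"] == activity)
            and (is_blank(owner) or row["owner"] == owner)
        )
    return [row for row in rows if keep(row)]


# Only the activity dropdown is filled in; the other criteria are ignored.
calls = filter_rows(ROWS, activity="Call")
```

One difference to be aware of: the formula's fallback condition (`<>""`) also drops rows whose cell in that column is empty, while this sketch keeps them when the criterion is blank.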

QuickSight: How can I use ifelse() or any other alternative for multiple conditions according to input provided in the added parameter?

I see ifelse() in the Functions list when I add a calculated field while editing the data, but not when I use the 'Add' option, where I get Add title, Add description, Add calculated field, and Add parameter. There I get functions like sumif, avgif, and countif, but those accept only one condition.
I want to create an ifelse() function with multiple conditions that depend on a parameter value which the user selects from a dropdown.
If you want to add an ifelse() function, you have to add it in the Dataset section. It is not available in the Data analysis section.
If you want to add multiple values to the parameter, add the list of items as follows:
1. Click on Add Parameter.
2. In the "Create new parameter" dialogue box, select multiple values, then write the list of items, one per line, in the text area below.
3. Click on Create.
Currently ifelse() is not supported in an analysis based on a SPICE dataset. If you want to use ifelse() in an analysis, convert the SPICE dataset to a Direct Query dataset.

How to get row count for large dataset in Informatica?

I am trying to get the row count for a dataset with 280 fields without affecting performance. I am looking for the best way to do this.
The better option to avoid performance issues is to use a Sorter transformation to sort the columns and pass the pipeline to an Aggregator transformation. In the Aggregator transformation, check the Sorted Input option.
If your source is a database, index the columns used in the conditions and also partition the table if required.
For your solution, I have 2 options in mind:
1. Using an Aggregator (remember to use a predefined order by to improve performance with the next transformation): SQ > Aggregator > Target. Inside the Aggregator, add new ports with the sum() and/or count() functions. Remember to select the columns to group by.
Check out this example:
https://www.guru99.com/aggregator-transformation-informatica.html
2. Using a Source Qualifier query override. Use a traditional select count/sum with group by from the database: SQ > Target.
By the way, Informatica performs very well. More than the number of columns, you need to review how many records you are processing. A best practice is always to put the load on the data source/database rather than on the Informatica app.
Regards,
Juan
If all you need is just to count the rows, use the Aggregator. That's what it's for. However, this will create a cache. To limit its size, use a single port.
To avoid caching, you can use a variable in an Expression and just increment it. This, however, will give you an extra column with all rows numbered, not just a single value. You'll still need to aggregate it. Here it would be possible to use an Aggregator with no function to return just the last value.
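To illustrate the second approach conceptually (this is a plain Python sketch of the idea, not Informatica code, and the function names are made up for the example): a running counter numbers each row as it streams past, and a final step keeps only the last number, so no rows need to be held in memory.

```python
def running_count(rows):
    """Mimic an expression variable: attach an incrementing number to each row."""
    count = 0
    for row in rows:
        count += 1
        yield count, row


def last_value(numbered_rows):
    """Mimic an aggregator with no function: keep only the final value seen."""
    last = 0
    for count, _row in numbered_rows:
        last = count
    return last


# Works on any iterable, including a generator that is consumed only once.
total = last_value(running_count(iter(["r1", "r2", "r3"])))
```

Because both steps process one row at a time, memory use stays constant no matter how many rows or fields the source has.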

Merge cells with similar but different data, different spelling

I am trying Tableau with data extracted from Salesforce. The input includes a "Country" field where the rows have different spellings for the same thing.
Example: Cananda, CANADA, CAnada etc.
Is there a way to fix this in Tableau?
The easiest solution is to create a group field based on your Country field.
Select Country in the data pane on the left sidebar, right-click, and choose Create Group. Select the elements that you want to group together and put them into a single group, say Canada, that contains all variations of the spelling.
This new group field initially has a name of Country (group). You may want to rename it Country_Corrected. (Or even better, rename the first field, Country_Original, and call the group field simply Country. Then you can hide Country_Original)
Groups are implemented using SQL case statements. They have many uses, but one application is to easily tolerate some inconsistent spellings in your data source without having to change your data. In general, you can specify several transformations like this that take effect at query and visualization time. For very large data sets, or for very complicated transformations, you may eventually want to push some of them upstream in your data pipeline to get better performance. But make those optimizations later when you've proven the necessity.
If the differences are just in case (upper vs lower), you can right-click the Country dimension, and create a calculated field called something like "New Country", and use the following formula to make the case consistent:
upper([Country])
Use this new "New Country" calc dimension instead of your "Country" dimension, and it will group them all without case sensitivity, and display as uppercase. Or you can use "lower" instead of "upper" if preferred.
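If you later decide to push this cleanup upstream of Tableau, the two ideas above (a group that maps known misspellings to one value, plus case normalization) can be combined. Here is a minimal sketch in Python; the mapping table and the function name are hypothetical, and you would fill the table with the variants that actually occur in your data:

```python
# Hypothetical lookup: uppercase variant -> corrected country name.
COUNTRY_GROUPS = {
    "CANADA": "Canada",
    "CANANDA": "Canada",  # known misspelling from the example
}


def normalize_country(raw):
    """Compare case-insensitively; leave values with no known group unchanged."""
    return COUNTRY_GROUPS.get(raw.upper(), raw)


cleaned = [normalize_country(c) for c in ["Cananda", "CANADA", "CAnada", "France"]]
```

Uppercasing the lookup key handles pure case differences, so the table only needs one entry per distinct misspelling.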

Calcite, id filters: what is the easiest way to get them?

I would like to find the easiest way to get the filters for id columns of my tables. Currently I use FilterableTable but that returns the filters as an expression tree and I would have to scan for it. I am wondering if there is an easier way to get the filter of my PK columns (the one I declare as keys or as indexed), i.e. get a from-to kind of structure.
EDIT: so what ideally I would expect is to extract a list of id ranges for the query, i.e. from filters to [id1 ... id2] , [id3...id4] and so on, where id1