How to edit the Query and remove Order By clause - apache-superset

I picked an entire table as a Data Source and picked my fields. The SQL of it returns as:
SELECT customer_id AS customer_id,
country AS country,
count(invoice_num) AS total_invoices
FROM sales
GROUP BY customer_id,
country
ORDER BY total_invoices DESC
LIMIT 10000;
I do not want this ORDER BY total_invoices DESC as it is ruining the entire result. What should I do?

I think it always orders by the first metric. If you have multiple metrics, you can re-order them to change which one is used in the Order By clause.
Under Customization, the bar chart also has a "Sort Bars" option to sort the x-axis by label, which might work for you depending on what kind of result you're looking for.

Related

AWS LogInsights query stats average of count

I have cloudwatch entries that may be group with respect to a certain field. To be clear assume that field is city. I would like to count the entries with respect to cities. This is the easy part.
fields city
|stats count(*) by city
However I also want to get maximum minimum and average of this count, but I can not. Is it possible to have such queries i.e:
fields city
|stats avg(count(*) by city)
The console return an error for such query : mismatched input 'by' expecting {SYM_COMMA, SYM_RParen}
Here's how you'd do it: You'd first get the count (that you already figured) and then get the metrics you want by calling the relevant functions like so:
fields city
|stats count(*) as cityCount, avg(cityCount), max(cityCount), min(cityCount)

Remove duplicates based on sort

I have a customers table with ID's and some datetime columns. But those ID's have duplicates and i just want to Analyse distinct ID values.
I tried using groupby but this makes the process very slow.
Due to data sensitivity can't share it.
Any suggestions would be helpful.
I'd suggest using ROW_NUMBER() This lets you rank the rows by chosen columns and you can then pick out the first result.
Given you've shared no data or table and column names here's an example based on the Adventureworks database. The technique will be the same, you partition by whatever makes the group of rows you want to deduplicate unique (ProductKey below) and order in a way that makes the version you want to keep first (Children, birthdate and customerkey in my example).
USE AdventureWorksDW2017;
WITH CustomersOrdered AS
(
SELECT S.ProductKey, C.CustomerKey, C.TotalChildren, C.BirthDate
, ROW_NUMBER() OVER (
PARTITION BY S.ProductKey
ORDER BY C.TotalChildren DESC, C.BirthDate DESC, C.CustomerKey ASC
) AS CustomerSequence
FROM dbo.FactInternetSales AS S
INNER JOIN dbo.DimCustomer AS C
ON S.CustomerKey = C.CustomerKey
)
SELECT ProductKey, CustomerKey
FROM CustomersOrdered
WHERE CustomerSequence = 1
ORDER BY ProductKey, CustomerKey;
you can also just sort the columns with date column an than click on id column and delete duplicates...

Obtain MAX of column and display for each row

I'm trying to obtain the MAX of a particular column in a Power BI Report and place this as a new Measure within each ROW of the same dataset. Please see the example below.
Is this possible in DAX and via DirectQuery/LiveConnection? The report is pointing to a tabular model but due to outside factors the measure must be created in the report.
Thanks
You can accomplish this a few ways. Essentially, you need override the filter context so that the MAX function isn't just running over whatever slice you're showing in the visual. Using CALCULATE or the iterator function MAXX, set the wrap the table in the ALL() function to override the context and calculate max over all rows.
= CALCULATE(MAX([Calendar`Year]), ALL('Smithfield_Fiscal_Calendar'))
or
= MAXX(ALL('Smithfield_Fiscal_Calendar'), [Calendar`Year])
To get the breakout by date, you'll need to include a Date table in your model. PowerBI makes this possible with a few different DAX options. As an example, go to your Model tab, click 'New Table' and put in the following expression:
MyCalendar = CALENDAR(DATE(2019,1,1), DATE (2019,1,10))
This is a little trivial -- you'd want to use a useful range of dates but this one matches your example above. Next, add a column to [MyCalendar]
CalendarMonthYear = month([date]) & "-" & year([date])
Go to your budget table and add a similar field
BudgetMonthYear = month([date]) & "-" & year([date])
Go into your Model view and create a relationship between CalendarMonthYear and BudgetMonthYear. This will associate every date in the date table with the particular budget row from your budget table.
Hope it helps.

power BI diaplay one value

I am using Power BI to bring together data from several systems and display a dash board with data from all of the systems.
The dashboard has a couple of filters which are then used to display the data relating to one object across all systems.
When the dashboard is first loaded and none of the filter have been selected, the data cards display information from all rows in the table.
Is there a way to make a data card only display one row of data?
or
Be blank if there are more than one row of data?
There's no direct way to look at the number of rows in the visual, count them, and do something different if there's more than 1.
That said, there are a few things you can do.
HASONEFILTER
If you have a specific column in your table that, when selected, filters your results to a single row, then you can check if there's a filter on that column using HASONEFILTER. (If you have multiple alternative columns,any of which filter to a single row, that's ok too.)
You could then create a measure for each column that tests HASONEFILTER. If true, return the MAX of the column. (The reason for MAX is because measures always have to aggregate, but the MAX of a 1-row column will be the same as the value in that column.) If false, return either BLANK() or an empty string, depending on your preference.
E.g.
ColumnAMeasure = IF(HASONEFILTER(Sheet1[Slicer Column]),MAX(Sheet1[COLUMN A]), "")
ColumnBMeasure = IF(HASONEFILTER(Sheet1[Slicer Column]),MAX(Sheet1[COLUMN B]), "")
where Sheet1 is the name of the table and "Slicer Column" is the name of the column being used as a slicer
HASONEVALUE
If you have multiple columns that could be used as filters in combination (meaning that having a filter applied on "Slicer Column" doesn't guarantee only 1 row in the table), then rather than testing HASONEFILTER, you can test HASONEVALUE.
ColumnAMeasure = IF(HASONEVALUE(Sheet1[COLUMN A]),MAX(Sheet1[COLUMN A]), "")
ColumnBMeasure = IF(HASONEVALUE(Sheet1[Column B]),MAX(Sheet1[COLUMN B]), "")
Notice that HASONEVALUE tests the current column you're trying to display, rather than a slicer column like HASONEFILTER.
One side-effect of HASONEVALUE is that, if you're filtered to 3 rows, but all 3 rows have the same value for column A, then column A will display that value. (Whereas with HASONEFILTER, column A would stay blank until you're filtered to one thing.)
Low Tech
Both answers above depend on a measure existing for every column you want to display, so that you can test whether to display a blank row or not. That could become a pain if you have dozens of columns.
A lower-tech alternative is to add in an additional row with blanks for each column and then sort your table so that that row always appears first. (And shorten your visual so only the top row is visible.) Technically the other rows would be underneath and there'd be a scrollbar, but at least the initial display would be blank rather than showing a random row.
Hopefully something here has helped. Other people might have better solutions too. More information:
HASONEFILTER documentation: https://msdn.microsoft.com/en-us/library/gg492135.aspx
HASONEVALUE documentation: https://msdn.microsoft.com/en-us/library/gg492190.aspx

Auto incrementing a virtual column after a GROUP BY, ORDER BY query

I've made extensive research about auto-increment before posting this but couldn't find similar case:
I have a query pulling data from a main table, grouping by player_id and ordering by points desc, therefore creating a ranking output. My aim is to make the same query, once it finishes aggregating and sorting data, create a new column 'Rank' and auto increment it so it shows 1,2,3 etc since everything is already grouped by player and ordered by points DESC.
Thanks guys.
Example of the source table :
player_id-----------points-----
---1-------------------5----------
---1-------------------10---------
---1-------------------5---------
---2-------------------20---------
---2-------------------5---------
Desired output according to this example:
Rank------player_id-------score-----
----1----------2-----------25 POINTS ---------
----2----------1-----------20 POINTS ---------
EDIT
Rownum does the job well, no need for an auto increment virtual column! see Mutnowski's accepted answer below please.
Try this
SELECT #rownum:=#rownum+1 AS ‘rank’, Player_ID, Points FROM (SELECT Player_ID, SUM(Points) AS 'Points' FROM tblScores GROUP BY Player_ID ORDER BY Points DESC) AS foo, (SELECT #rownum:=0) AS foo2
I think you need to run a query to get your results without the rank, then run another query on first that selects all and adds the rank
Applying SUM to the Points column should produce the results you want.
SELECT #rownum:=#rownum+1 AS ‘rank‘,player_id, SUM(points)
FROM scores
GROUP BY player_id
ORDER BY SUM(points) DESC;