Return top value ordered by another column - powerbi

Suppose I have a table as follows:
TableA =
DATATABLE (
"Year", INTEGER,
"Group", STRING,
"Value", DOUBLE,
{
{ 2015, "A", 2 },
{ 2015, "B", 8 },
{ 2016, "A", 9 },
{ 2016, "B", 3 },
{ 2016, "C", 7 },
{ 2017, "B", 5 },
{ 2018, "B", 6 },
{ 2018, "D", 7 }
}
)
I want a measure that returns the top Group based on its Value and that works inside or outside a Year filter context. That is, it can be used in a matrix visual like this (including the Total row):
It's not hard to find the maximal value using DAX:
MaxValue = MAX(TableA[Value])
or
MaxValue = MAXX(TableA, TableA[Value])
But what is the best way to look up the Group that corresponds to that value?
I've tried this:
Top Group = LOOKUPVALUE(TableA[Group],
TableA[Year], MAX(TableA[Year]),
TableA[Value], MAX(TableA[Value]))
However, this doesn't work for the Total row and I'd rather not have to use the Year in the measure if possible (there are likely other columns to worry about in a real scenario).
Note: I am providing a couple solutions in the answers below, but I'd love to see any other approaches as well.
Ideally, it would be nice if there were an extra argument in the MAXX function that would specify which column to return after finding the maximum, much like the MAXIFS Excel function has.

Another way to do this is through the use of the TOPN function.
The TOPN function returns entire row(s) instead of a single value. For example, the code
TOPN(1, TableA, TableA[Value])
returns the top 1 row of TableA ordered by TableA[Value]. The Group value associated with that top Value is in the row, but we need to be able to access it. There are a couple of possibilities.
Use MAXX:
Top Group = MAXX(TOPN(1, TableA, TableA[Value]), TableA[Group])
This finds the maximum Group from the TOPN table in the first argument. (There is only one Group value, but this allows us to convert a table into a single value.)
Use SELECTCOLUMNS:
Top Group = SELECTCOLUMNS(TOPN(1, TableA, TableA[Value]), "Group", TableA[Group])
SELECTCOLUMNS normally returns a table (with the columns that are specified), but here the table has a single row and a single column, so DAX implicitly converts it to a scalar value. Note that TOPN returns every tied row, so this version raises an error if two groups share the top Value.

One way to do this is to store the maximum value and use that as a filter condition.
For example,
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN MAXX(FILTER(TableA, TableA[Value] = MaxValue), TableA[Group])
or similarly,
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN CALCULATE(MAX(TableA[Group]), TableA[Value] = MaxValue)
If there are multiple groups with the same maximum value, the measures above return the last one alphabetically, because MAX and MAXX on a text column return the largest value in sort order (use MIN or MINX to get the first one instead). If you want to show all of them, you could use a concatenating iterator function:
Top Group =
VAR MaxValue = MAX(TableA[Value])
RETURN
    CONCATENATEX(
        CALCULATETABLE(
            VALUES(TableA[Group]),
            TableA[Value] = MaxValue
        ),
        TableA[Group],
        ", "
    )
If you changed the 9 in TableA to an 8, this last measure would return A, B rather than A in the Total row.
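The filter-on-maximum logic above can also be sketched outside DAX. The following is a minimal Python illustration, not a Power BI feature: the function names `top_groups` and `top_group_label` are hypothetical, and the optional `year` argument is an assumption standing in for the Year filter context (with `None` playing the role of the Total row).

```python
# Rows mirror TableA from the question: (year, group, value).
TABLE_A = [
    (2015, "A", 2.0), (2015, "B", 8.0),
    (2016, "A", 9.0), (2016, "B", 3.0), (2016, "C", 7.0),
    (2017, "B", 5.0),
    (2018, "B", 6.0), (2018, "D", 7.0),
]


def top_groups(rows, year=None):
    """Return all groups whose value equals the maximum in the current context.

    year=None stands in for the Total row (no Year filter applied).
    """
    visible = [row for row in rows if year is None or row[0] == year]
    if not visible:
        return []
    max_value = max(value for _, _, value in visible)
    # Keep every tied group, sorted so the label is deterministic.
    return sorted({group for _, group, value in visible if value == max_value})


def top_group_label(rows, year=None):
    """Join tied groups with ', ', similar in spirit to CONCATENATEX."""
    return ", ".join(top_groups(rows, year))


if __name__ == "__main__":
    for year in (2015, 2016, 2017, 2018, None):
        print(year, top_group_label(TABLE_A, year))
```

Because the maximum is computed over whatever rows are visible, the same function serves both the per-year rows and the total, which is the property the measures above rely on.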

Related

Convert a table into a function that can act like a Table.SelectRows condition

I have a table of Project:
that I would like to filter by the FIELD, OPERATOR, and VALUE columns contained in the Project Group table:
The Power Query M to apply this filter would be:
let
Source = #"Project",
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Projectid", Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Projectid] >= 100000 and [Projectid] <= 500000)
in
#"Filtered Rows"
Results (need to remove the error row):
How do I convert the FIELD, OPERATOR, and VALUE columns into a function that can be used as a condition for the SelectRows function?
If you need to do comparisons, it might be best to first change the types of the columns being compared (in both tables), preferably to type number.
The code below assumes that:
the OPERATOR column of Project Group table can only contain: > or < and that these values should be interpreted as >= and <= respectively.
the column in Project table (that needs to be compared) can change and its name will be in the FIELD column of the Project Group. It's assumed that the name matches exactly. If this is not the case, you might need to standardise things (or at least perform a case-insensitive search) to ensure values can be mapped to column names correctly.
Based on the assumptions above, here's one approach:
let
    // Dummy table for example purposes
    project = Table.FromColumns({
        {0..10},
        {5..15}
    }, type table [projectId = number, name = number]),
    // Dummy table for example purposes
    projectGroup = Table.FromColumns({
        {"projectId", "projectId"},
        {">", "<"},
        {5, 7}
    }, type table [FIELD = text, OPERATOR = text, VALUE = number]),
    // Should take in a row from "Project" table and return a boolean
    // representing whether said row matches the criteria contained
    // within "Project Group" table.
    selectorFunc = (projectRow as record) as logical =>
        let
            shouldKeepProjectRow = Table.MatchesAllRows(projectGroup, (projectGroupRow as record) =>
                let
                    fieldNameToCheck = projectGroupRow[FIELD],
                    valueFromProjectRow = Record.Field(projectRow, fieldNameToCheck),
                    compared = if projectGroupRow[OPERATOR] = ">" then
                            valueFromProjectRow >= projectGroupRow[VALUE]
                        else
                            valueFromProjectRow <= projectGroupRow[VALUE]
                in compared
            )
        in shouldKeepProjectRow,
    selectedRows = Table.SelectRows(project, selectorFunc)
in
    selectedRows
The main function used is Table.MatchesAllRows (https://learn.microsoft.com/en-us/powerquery-m/table-matchesallrows).
Another approach might be Expression.Evaluate (https://learn.microsoft.com/en-us/powerquery-m/expression-evaluate). However, I've not used it, so I'm not sure whether there are any "gotchas" or implications to be aware of.
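For readers who find the nested M harder to follow, the same idea (turn a table of conditions into a single row predicate) can be sketched in Python. This is only an illustration under the answer's assumptions: ">" is read as >= and "<" as <=, and the names `OPERATORS` and `make_selector` are hypothetical. The sample data mirrors the dummy tables in the M code.

```python
import operator

# Assumption carried over from the answer: ">" means >= and "<" means <=.
OPERATORS = {">": operator.ge, "<": operator.le}


def make_selector(conditions):
    """Build a predicate that is True only if a row satisfies every condition.

    Each condition is a dict with FIELD, OPERATOR and VALUE keys, mirroring
    the columns of the Project Group table.
    """
    def selector(row):
        return all(
            OPERATORS[cond["OPERATOR"]](row[cond["FIELD"]], cond["VALUE"])
            for cond in conditions
        )
    return selector


# Dummy data matching the M example: projectId 0..10, name 5..15.
project = [{"projectId": i, "name": i + 5} for i in range(11)]
project_group = [
    {"FIELD": "projectId", "OPERATOR": ">", "VALUE": 5},
    {"FIELD": "projectId", "OPERATOR": "<", "VALUE": 7},
]

keep = make_selector(project_group)
selected_rows = [row for row in project if keep(row)]

if __name__ == "__main__":
    print([row["projectId"] for row in selected_rows])
```

Here `all(...)` plays the part that Table.MatchesAllRows plays in the M version: a project row is kept only when every condition row accepts it.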

How to remove null rows from MDX query results

How can I remove the null row from my MDX query results?
Here is the query I'm currently working with
select
non empty
{
[Measures].[Average Trips Per Day]
,[Measures].[Calories Burned]
,[Measures].[Carbon Offset]
,[Measures].[Median Distance]
,[Measures].[Median Duration]
,[Measures].[Rider Trips]
,[Measures].[Rides Per Bike Per Day]
,[Measures].[Total Distance]
,[Measures].[Total Riders]
,[Measures].[Total Trip Duration in Minutes]
,[Measures].[Total Members]
} on columns
,
non empty
{
(
[Promotion].[Promotion Code Name].children
)
} on rows
from [BCycle]
where ([Program].[Program Name].&[Madison B-cycle])
;
Results:
This is not a null value; it is one of the children of [Promotion].[Promotion Code Name].
You can exclude that particular member from the children using the MDX Except function.
Example query:
//This query shows the number of orders for all products,
//with the exception of Components, which are not
//sold.
SELECT
[Date].[Month of Year].Children ON COLUMNS,
Except
([Product].[Product Categories].[All].Children ,
{[Product].[Product Categories].[Components]}
) ON ROWS
FROM
[Adventure Works]
WHERE
([Measures].[Order Quantity])
Reference -> https://learn.microsoft.com/en-us/sql/mdx/except-mdx-function?view=sql-server-2017

How to Return Text with IF Function in an Array

In Google Sheets, I'm trying to check a column for a state abbreviation and return "East" if the abbreviation matches my list and "West" if it doesn't.
Our territory managers are split into two domains, East and West, so I want an easy way to sort my data by East/West.
Here's what I have:
=IF(M:M={"AL", "CA", "DE","FL","GA","IA","KY","ME","MD","MA","MN","MS","NH","NJ","NY","ND","RI","SD","TN","VT","VA","WV","WI"},"East","West")
But, when I fill down, it just fills down East, and does not seem to actually query M:M
Thoughts?
Not the cleanest code, but this should work:
=ARRAYFORMULA(IF(LEN(A:A), IF((A:A = "foo")+(A:A = "bar") = 1, "East", "West"), ))
OR does not evaluate row by row inside an ARRAYFORMULA, so you add the comparisons instead. Each comparison such as A:A = "foo" yields TRUE or FALSE, which count as 1 and 0 when added, so the sum is 1 whenever a cell matches one of your criteria.
You have a lot of criteria so writing each of them in will take a while ...
E.g. IF( (A:A = "AL") + (A:A = "CA") ... (A:A = "WI") = 1, "East", "West")
Use ISERROR/MATCH():
=IF(ISERROR(MATCH(M:M,{"AL", "CA", "DE","FL","GA","IA","KY","ME","MD","MA","MN","MS","NH","NJ","NY","ND","RI","SD","TN","VT","VA","WV","WI"},0)),"West","East")
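Both answers reduce to the same membership test. As a rough cross-check in Python (the function names `territory_by_sum` and `territory_by_match` are hypothetical, and the comparison here is case-sensitive), summing the boolean comparisons and testing membership directly give the same label:

```python
# State list copied from the formula in the question.
EAST_STATES = (
    "AL", "CA", "DE", "FL", "GA", "IA", "KY", "ME", "MD", "MA", "MN", "MS",
    "NH", "NJ", "NY", "ND", "RI", "SD", "TN", "VT", "VA", "WV", "WI",
)


def territory_by_sum(state):
    """Mirror the (A = x) + (A = y) + ... = 1 trick: True counts as 1."""
    hits = sum(state == east for east in EAST_STATES)
    return "East" if hits == 1 else "West"


def territory_by_match(state):
    """Mirror the ISERROR/MATCH version: a plain membership test."""
    return "East" if state in EAST_STATES else "West"


if __name__ == "__main__":
    for state in ("NY", "TX", ""):
        print(repr(state), territory_by_sum(state), territory_by_match(state))
```

The sum can only be 0 or 1 because the list has no duplicates; with duplicates in the list, comparing against `>= 1` would be the safer form.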

What GROUPBY aggregator can I use to test if grouped values are equal to a constant?

Situation: I have table Bob where each row has a bunch of columns, including a Result, SessionID1, SessionID2.
Goal: I want to GroupBy SessionID1 and SessionID2 and see if any Results in the group are 0; I expect multiple rows to have the same ID1 and ID2 values. I then want to divide the count of groups with 0 results / the count of all groups.
Questions: I think I want something like:
GROUPBY (
Bob,
SessionID1,
SessionID2,
"Has at least 1 success",
???)
But what aggregator can I use for ??? to get a boolean indicating if any result in the group equals 0?
Also, if I want a count of groups with successes, do I just wrap the GROUPBY in a COUNT?
Consider this sample table:
You can try the following DAX to create a new summary table:
Summary = GROUPBY(Bob, Bob[SessionID1], Bob[SessionID2],
"Number of rows", COUNTX(CURRENTGROUP(), Bob[Result]),
"Number of successes", SUMX(CURRENTGROUP(), IF(Bob[Result] = 0, 1, 0)))
Then you can add a calculated column for the success ratio:
Success ratio = Summary[Number of successes] / Summary[Number of rows]
Results:
EDIT:
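If what you want to calculate is something like "Any success", then SUMMARIZE may be a better option than GROUPBY: GROUPBY only allows X iterator functions over CURRENTGROUP(), whereas SUMMARIZE evaluates each expression in the filter context of the current group.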
Summary2 = SUMMARIZE(Bob, Bob[SessionID1], Bob[SessionID2],
"Any success", IF(COUNTROWS(FILTER(Bob, Bob[Result] = 0)) > 0, 1, 0),
"Number of rows", COUNTROWS(Bob))
Results:
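As a cross-check of the grouping logic outside DAX, here is a small Python sketch of the ratio the question asks for: the number of (SessionID1, SessionID2) groups containing at least one Result of 0, divided by the number of groups. The function name `zero_result_ratio` and the sample rows are made up for illustration and are not the table from the answer.

```python
from collections import defaultdict


def zero_result_ratio(rows):
    """Return (groups with any Result == 0) / (all groups).

    rows is an iterable of (session_id1, session_id2, result) tuples.
    """
    groups = defaultdict(list)
    for session_id1, session_id2, result in rows:
        groups[(session_id1, session_id2)].append(result)
    if not groups:
        return 0.0  # assumption: treat an empty table as ratio 0
    groups_with_zero = sum(1 for results in groups.values() if 0 in results)
    return groups_with_zero / len(groups)


# Hypothetical sample rows: (SessionID1, SessionID2, Result).
SAMPLE_ROWS = [
    (1, 1, 0), (1, 1, 1),
    (1, 2, 1),
    (2, 1, 0), (2, 1, 0),
    (2, 2, 3),
]

if __name__ == "__main__":
    print(zero_result_ratio(SAMPLE_ROWS))
```

Note that this counts each group once, however many zero results it contains, which corresponds to the "Any success" flag rather than the "Number of successes" sum.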

Grouping the output of a CouchDB View

I have a map reduce view:
.....
emit( diffYears, doc.xyz );
reduced with _sum.
xyz is then a number which is summed per integer(diffYears).
The output looks roughly like this:
4 1204.9
5 796.19
6 1124.8
7 1112.6
8 1993.62
9 159.26
10 395.41
11 456.05
12 457.97
13 39.80
14 483.68
15 269.469
etc..
What I would like to do is group the results as follows:
Grouping Total per group
0-4 1959.2 i.e add up the xyz's for years 0,1,2,3,4
5-9 3998.5 same for 5,6,7,8,9 ...etc.
10-14 3566.3
I saw a suggestion where a list was used on a view output here: Using a CouchDB view, can I count groups and filter by key range at the same time?
but have been unable to adapt it to get any kind of result.
The code given is:
{
  _id: "_design/authors",
  views: {
    authors_by_date: {
      map: function(doc) {
        emit(doc.date, doc.author);
      }
    }
  },
  lists: {
    count_occurrences: function(head, req) {
      start({ headers: { "Content-Type": "application/json" }});
      var result = {};
      var row;
      while(row = getRow()) {
        var val = row.value;
        if(result[val]) result[val]++;
        else result[val] = 1;
      }
      return result;
    }
  }
}
I substituted var val = row.key in this section:
while(row = getRow()) {
var val = row.value;
if(result[val]) result[val]++;
else result[val] = 1;
}
(although in this case the result is a count.)
This seems to be the way to do it.
(It is like having a startkey and endkey for each grouping which I can do manually, naturally, but not inside a process. Or is there a way of entering multiple start- and endkeys into one GET command???? )
This must be a fairly normal thing to do especially for researchers using statistical analysis.
I assume therefore that it does get done but I cannot locate examples
as far as CouchDB is concerned.
I would appreciate some help with this please or a pointer in the right direction.
Many thanks.
EDIT:
Perhaps the answer lies in a process in 'reduce' to group the output??
You can accomplish what you want using a complex key. The limitation is that the group size is static and needs to be defined in the view.
You'll need a simple step function to create your groups within map like:
var size = 5;
var group = ( doc.diffYears - (doc.diffYears % size)) / size;
emit( [group, doc.diffYears], doc.xyz);
The reduce function can remain _sum.
Now when you query the view use group_level to control the grouping. At group_level=0, everything will be summed and one value will be returned. At group_level=1 you'll receive your desired sums of 0-4, 5-9 etc. At group_level=2 you'll get your original output.
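To see how the complex key and group_level interact, the following Python sketch imitates the map step and a _sum reduce in memory. It is not CouchDB code: `map_row` and `reduce_sum` are hypothetical helpers, the sample values are invented, and the grouping simply truncates each key to its first `group_level` elements.

```python
from collections import defaultdict

SIZE = 5  # bucket width, fixed in the view as in the answer


def map_row(diff_years, xyz):
    """Emit ([bucket, diffYears], xyz); for non-negative years,
    floor division gives the same bucket as (d - d % size) / size."""
    return ([diff_years // SIZE, diff_years], xyz)


def reduce_sum(emitted, group_level):
    """Sum values per key prefix of length group_level, like _sum with group_level."""
    totals = defaultdict(int)
    for key, value in emitted:
        totals[tuple(key[:group_level])] += value
    return dict(totals)


# Invented (diffYears, xyz) pairs for illustration only.
DOCS = [(3, 10), (4, 20), (5, 1), (9, 2), (10, 7)]
EMITTED = [map_row(diff_years, xyz) for diff_years, xyz in DOCS]

if __name__ == "__main__":
    print(reduce_sum(EMITTED, 0))  # grand total
    print(reduce_sum(EMITTED, 1))  # one sum per 5-year bucket
    print(reduce_sum(EMITTED, 2))  # one sum per individual year
```

Bucket 0 covers years 0-4, bucket 1 covers 5-9, and so on, so the level-1 result corresponds to the grouped totals the question asks for.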