Rescale Dataset using Power BI - powerbi

I'm trying to rescale a dataset using Power BI Desktop. I've imported a dataset full of raw data, but I can't work out how to combine row context with an aggregate. This is what I'm trying to accomplish:
Data:
+---------+-----+
| Name | Bar |
+---------+-----+
| Alfred | 0 |
| Alfred | -1 |
| Alfred | 1 |
| Burt | 1 |
| Burt | 0 |
| Charlie | 1 |
| Charlie | 1 |
| Charlie | 0 |
+---------+-----+
Calculations:
Foo: = SUM(Bar) / COUNT(Bar) GROUP BY Name
Which would Generate this dataset:
+---------+-----+
| Name | Foo |
+---------+-----+
| Alfred | 0 |
| Burt | .5 |
| Charlie | .67 |
+---------+-----+
Final Calculation:
Score: = (#Foo - MIN(Foo)) / (MAX(Foo)-MIN(Foo))
The goal is to grade on a curve with a set of data. I can do it in Excel, but was hoping that Power BI could handle all the heavy lifting.
At this point it might be easier to do it all in SQL before bringing it into PowerBI, but that would make it significantly less dynamic (with date filters and the like). Thanks for any insight you might have!

I think you're looking for the GROUPBY DAX function. https://support.office.com/en-us/article/GROUPBY-Function-DAX-d6d064b2-fd8b-4c1b-97f8-c6d03cdf8ad0
You would then GROUPBY on the Name field and proceed from there. If you need to use the measure outside of a visual that groups by each Name (for example, to show the average score after applying the curve), then you'll need to wrap it in a calculated table that includes the names and your measure projected as a column, and then compute your aggregates (min/max/average) over that calculated table.
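To make the arithmetic concrete, here is a minimal Python sketch of the two-step calculation from the question (group by Name, then min-max scale the grouped values). It only models the numbers and is not DAX; the function name `rescale` and the tie-handling rule are assumptions made for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (Name, Bar)
rows = [
    ("Alfred", 0), ("Alfred", -1), ("Alfred", 1),
    ("Burt", 1), ("Burt", 0),
    ("Charlie", 1), ("Charlie", 1), ("Charlie", 0),
]

def rescale(rows):
    """Return {name: score}, where score is Foo min-max scaled to [0, 1]."""
    # Step 1: group by Name and compute Foo = SUM(Bar) / COUNT(Bar).
    bars = defaultdict(list)
    for name, bar in rows:
        bars[name].append(bar)
    foo = {name: sum(v) / len(v) for name, v in bars.items()}

    # Step 2: MIN and MAX are taken over the grouped table, not the raw rows.
    lo, hi = min(foo.values()), max(foo.values())
    if hi == lo:
        # Assumption: when every group ties, score everyone 0 instead of dividing by zero.
        return {name: 0.0 for name in foo}
    return {name: (f - lo) / (hi - lo) for name, f in foo.items()}

scores = rescale(rows)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

The point to carry over to Power BI is that the minimum and maximum must be evaluated over the per-Name results, which is why the grouped values have to be materialised before the final score is computed.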

Related

Power BI - Filtered count not grouping by values in table

I have two tables (subject & category) that are both related to the same parent table (main). Because of the foreign key constraints, it looks like Power BI automatically created the links.
Simple mock-up of table links
I need to count the subjects by type for each possible distance range. I tried a simple calculation shown below for each distance category.
less than 2m =
CALCULATE(
    COUNTA('Category'[Descr]),
    'Subject'[Distance] IN { "less than 2m" }
)
However, the filter doesn't seem to apply properly.
I want...
+-------+--------------+--------------+
| Descr | less than 2m | more than 2m |
+-------+--------------+--------------+
| Car   | 2            | 1            |
| Sign  | 4            | 2            |
+-------+--------------+--------------+
but I'm getting...
+-------+--------------+--------------+
| Descr | less than 2m | more than 2m |
+-------+--------------+--------------+
| Car   | 3            | 3            |
| Sign  | 6            | 6            |
+-------+--------------+--------------+
It's just giving me the total count by type which is correct but isn't applying the filter by distance so I can break it down.
I'm sure this is probably really simple but I'm pretty new with DAX and I can't figure this one out.
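I wish I could mark Kosuke's comment as an answer. The issue was indeed that cross-filtering had to be enabled. This can be done either by clicking on the relationship link in your model and changing its cross-filter direction, or by using a function (such as CROSSFILTER inside CALCULATE) to enable the cross filter temporarily for a single measure.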
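The effect of the cross-filter direction can be illustrated with a toy Python model. This is only a sketch of the idea, not how the Power BI engine works: the row data is hypothetical (chosen to reproduce the numbers in the question), and it assumes each Main id has exactly one Subject row and one Category row.

```python
# Hypothetical data: Main id -> Subject[Distance] and Main id -> Category[Descr].
subject = {1: "less than 2m", 2: "less than 2m", 3: "more than 2m",
           4: "less than 2m", 5: "less than 2m", 6: "less than 2m",
           7: "less than 2m", 8: "more than 2m", 9: "more than 2m"}
category = {1: "Car", 2: "Car", 3: "Car",
            4: "Sign", 5: "Sign", 6: "Sign", 7: "Sign", 8: "Sign", 9: "Sign"}

def count_by_descr(distance, cross_filter_both):
    """Count Category rows per Descr under a filter on Subject[Distance]."""
    if cross_filter_both:
        # The Subject filter reaches Main, and Main then filters Category.
        visible_main = {mid for mid, d in subject.items() if d == distance}
    else:
        # Single direction: the Subject filter never reaches Main,
        # so every Main id (and every Category row) stays visible.
        visible_main = set(category)
    counts = {}
    for mid, descr in category.items():
        if mid in visible_main:
            counts[descr] = counts.get(descr, 0) + 1
    return counts

print(count_by_descr("less than 2m", cross_filter_both=False))
print(count_by_descr("less than 2m", cross_filter_both=True))
```

With the single-direction setting the counts are the unfiltered totals per type, which matches the symptom described in the question.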

Sum where version is highest by another variable (no max version in the whole data)

I'm struggling having this measure to work.
I would like to have a measure that will sum the Value only for the max version of each house.
So following this example table:
|---------------------|------------------|------------------|
| House_Id | Version_Id | Value |
|---------------------|------------------|------------------|
| 1 | 1 | 1000 |
|---------------------|------------------|------------------|
| 1 | 2 | 2000 |
|---------------------|------------------|------------------|
| 2 | 1 | 3000 |
|---------------------|------------------|------------------|
| 3 | 1 | 5000 |
|---------------------|------------------|------------------|
The result of this measure should be 10000, because version 1 of House_Id 1 is ignored: that house has a higher version.
By House_Id the result should be:
|---------------------|------------------|
| House_Id | Value |
|---------------------|------------------|
| 1 | 2000 |
|---------------------|------------------|
| 2 | 3000 |
|---------------------|------------------|
| 3 | 5000 |
|---------------------|------------------|
Can anyone help me?
EDIT:
Given the correct answer @RADO gave, I now want to enhance this measure further.
In reality, my main Data table has more columns.
What if I want to add this measure to a table visual that splits the measure by another column from (or related to) the Data table?
For example (simplified data table):
|---------------------|------------------|------------------|------------------|
| House_Id | Version_Id | Color_Id | Value |
|---------------------|------------------|------------------|------------------|
| 1 | 1 | 1 (Green) | 1000 |
|---------------------|------------------|------------------|------------------|
| 1 | 2 | 2 (Red) | 2000 |
|---------------------|------------------|------------------|------------------|
| 2 | 1 | 1 (Green) | 3000 |
|---------------------|------------------|------------------|------------------|
| 3 | 1 | 1 (Green) | 5000 |
|---------------------|------------------|------------------|------------------|
There's a Color_Id in the main table that is connected to a Color table.
Then I add a visual table with ColorName (from the ColorTable) and the measure (ColorId 1 is Green, 2 is Red).
With the given answer the result is wrong when filtered by ColorName. Although the Total row is indeed correct:
|---------------------|------------------|
| ColorName | Value |
|---------------------|------------------|
| Green | 9000 |
|---------------------|------------------|
| Red | 2000 |
|---------------------|------------------|
| Total | 10000 |
|---------------------|------------------|
This result is wrong per ColorName as 9000 + 2000 is 11000 and not 10000.
The measure should ignore the rows with an old version. In the example before this is the row for House_Id 1 and Color_Id Green because the version is old (there's a newer version for that House_Id).
So:
How can I address this situation?
What if I want to filter by another column from (or related to) the Data table, such as Location_Id? Is it possible to define the measure so that it works for any number of splits by columns in the main Data table?
I use "Data" as the name of your table.
Sum of Latest Values =
VAR Latest_Versions =
    SUMMARIZE ( Data, Data[House_id], "Latest_Version", MAX ( Data[Version_Id] ) )
VAR Latest_Values =
    TREATAS ( Latest_Versions, Data[House_id], Data[Version_Id] )
VAR Result =
    CALCULATE ( SUM ( Data[Value] ), Latest_Values )
RETURN
    Result
How it works:
1. We calculate a virtual table of house_ids and their max versions, and store it in the variable "Latest_Versions".
2. We use the table from the first step to filter data for the latest versions only, and establish proper data lineage (https://www.sqlbi.com/articles/understanding-data-lineage-in-dax/).
3. We calculate the sum of the latest values by filtering data for the latest versions only.
You can learn more about this pattern here:
https://www.sqlbi.com/articles/propagate-filters-using-treatas-in-dax/
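The same logic can be sketched outside DAX. The following Python is an illustrative model of the measure using the sample rows from the question (without the Color_Id column); the helper names are hypothetical and nothing here is Power BI API.

```python
# Sample rows from the question: (House_Id, Version_Id, Value)
data = [(1, 1, 1000), (1, 2, 2000), (2, 1, 3000), (3, 1, 5000)]

def latest_versions(rows):
    """Map each house to its highest version (the 'Latest_Versions' table)."""
    latest = {}
    for house, version, _ in rows:
        latest[house] = max(version, latest.get(house, version))
    return latest

def sum_of_latest_values(rows):
    """Sum Value over rows whose (house, version) pair is the latest one."""
    latest = latest_versions(rows)
    return sum(value for house, version, value in rows
               if latest[house] == version)

total = sum_of_latest_values(data)
per_house = {
    house: sum_of_latest_values([r for r in data if r[0] == house])
    for house in sorted({r[0] for r in data})
}
print(total)
print(per_house)
```

Filtering on the (house, version) pair, rather than on the version alone, is what lets each house keep its own maximum.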

Filter out outliers dynamically using PERCENTILE

I'm building a sales dashboard in PowerBI.
I have a Sales table.
My data is entered by hand, so I have a few extreme values caused by human errors, mistypes, etc.
Let's say I want to build a histogram with:
On the X axis, the stock age of each sale, i.e. how long the product had been in stock at the time of sale. It is given by the [Product_Age] column.
On values, the number of sales.
What I want to do is exclude the top 1% extreme values from my calculations (average, etc.) and visualizations.
I've created a measure:
SalesByAge_Adjusted =
VAR TEMP =
    FILTER(
        SALES;
        VAR StockAgingMAX =
            PERCENTILE.INC(
                SALES[Sales_Age];
                0,99
            )
        RETURN
            SALES[Sales_Age] < StockAgingMAX
    )
RETURN
    COUNTROWS(TEMP)
It uses PERCENTILE.INC to get the 99th percentile of Sales_Age values in the current context, and I try to use it as a filter.
However, it just won't work.
I can display the measure on its own, and it shows how many sales I have. But as soon as I drag and drop "Sales_Age" to summarize the values, it shows nothing.
I have created the following table as an example.
+-------+--------+
| Axis | Values |
+-------+--------+
| 1 | 1067 |
| 2 | 1725 |
| 4 | 298 |
| 8 | 402 |
| 16 | 1848 |
| 32 | 1395 |
| 64 | 1116 |
| 128 | 1027 |
| 256 | 1948 |
| 512 | 790 |
| 1024 | 2173 |
| 2048 | 2025 |
| 4096 | 104 |
| 8192 | 1243 |
| 16384 | 1676 |
| 32768 | 1285 |
| 65536 | 806 |
+-------+--------+
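To filter out the values that fall outside the 99th percentile, I've created the following measure. It computes the overall percentile with the filter context removed (via ALL) and compares it to each Axis value. This matters because, once the axis column is on the visual, each bar only sees its own value, so a percentile computed in that context equals the value itself and a strict "less than" test never passes.
Filter = IF(CALCULATE(PERCENTILE.INC('Table'[Axis],0.99),ALL('Table'))>=MAX('Table'[Axis]),1,0)
In the chart visual, use the Filter measure as a filter (keeping only the rows where it is 1) to exclude your outliers.
In this case, it will filter out the last value of the table: 65,536.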
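As a check on the example, here is a small Python sketch that reproduces the flag for the sample Axis values. It assumes PERCENTILE.INC uses the usual inclusive definition (linear interpolation at rank k * (n - 1) in the sorted values); `percentile_inc` is a hand-written helper for illustration, not a library function.

```python
import math

# Axis values from the example table: 1, 2, 4, ..., 65536
axis = [2 ** i for i in range(17)]

def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation between neighbours."""
    ordered = sorted(values)
    rank = k * (len(ordered) - 1)          # zero-based fractional position
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Overall cutoff, computed once over all values (the role of ALL in the measure).
cutoff = percentile_inc(axis, 0.99)

# Same comparison as the Filter measure: 1 keeps the value, 0 marks an outlier.
keep_flag = {x: 1 if cutoff >= x else 0 for x in axis}
outliers = [x for x, flag in keep_flag.items() if flag == 0]

print(round(cutoff, 2))
print(outliers)
```

Under that assumption the cutoff falls between 32768 and 65536, so only the largest value is flagged.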

PowerBI Sort Columns in Matrix Visual

I have a Matrix visual in Microsoft PowerBI with Australian 'States' as rows and 'Months Ago' as columns.
By default the Matrix shows my columns from 0 months ago to 12. I would like it to show from 12 months ago on the left to 0 months ago on the right.
+-------------------+-----------------------------+-------+
| | Months Ago | |
+-------------------+-----------------------------+-------+
| State | 0 | 1 | 2 | 3 | 4 | 5 | Total |
+-------------------+----+----+----+----+----+----+-------+
| Queensland | 10 | 10 | 10 | 10 | 10 | 10 | 60 |
+-------------------+----+----+----+----+----+----+-------+
| New South Wales | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Victoria | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| South Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Western Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
Currently I am only given the option to sort by the value type fields (ie revenue etc).
Is there any option to sort/order the Column Headers?
I don't think there is an option for you to sort column headers directly.
However, you can change the default sort order for the Months Ago column so that it will be reflected in general.
You can add a custom column MonthSrt = 12 - [Months Ago] in query editor:
(It won't work in DAX because of a known issue)
Then you can select the Months Ago column and sort it by MonthSrt:
The custom sort will be applied when you use the Months Ago column in visuals:
You can also make groups (one item per group) and give each of them a logical number:
The order will then change automatically in the matrix.
The following solution worked for me to display the dates in descending order in a matrix:
how to sort column dates in descending order of matrix in power bi

Total/Sum working incorrectly in Power Bi

I have created a Report in which I have created some measures like -
X =
CALCULATE (
    DISTINCTCOUNT ( ActivityNew[Name] ),
    FILTER (
        ActivityNew,
        ActivityNew[Resource Owner Name] = MAX ( 'Resource Owners'[Manager Name] )
            && ActivityNew[LocationId] = 2
    )
)
When I use this measure in a table, the column values don't add up. For example, if the values of this measure are 2, 2, 2, 2, 2, then the Total in the table should be 10, but it is showing 2.
I have noticed that wherever I have used MAX(), the measure values are not adding up.
Why is this happening, and is there any solution for it?
You are using DISTINCTCOUNT, which in general is not additive: the distinct count over the whole table is not the sum of the distinct counts of its parts.
Say you have the following table Sales:
+----------+------+-------+
| Customer | Item | Count |
+----------+------+-------+
| Albert | Coke | 3 |
| Bertram | Beer | 5 |
| Bertram | Coke | 2 |
| Charlie | Beer | 1 |
+----------+------+-------+
If you wanted to count the number of distinct items each customer has bought, you would create a new measure with the formula:
[Distinct Items] := DISTINCTCOUNT(Sales[Item])
If you include the [Customer] column and your [Distinct Items] measure in a report, it would output the following:
+----------+----------------+
| Customer | Distinct Items |
+----------+----------------+
| Albert | 1 |
| Bertram | 2 |
| Charlie | 1 |
+----------+----------------+
| Total | 2 |
+----------+----------------+
As you can see, this does not sum up, because the context of the total row is the entire table, not filtered by any particular customer. To change this behaviour, you have to explicitly tell your measure that it should sum the values derived at the customer level. To do this, use the SUMX function. In my example, the measure formula should be changed like this:
[Distinct Items] := SUMX(VALUES(Sales[Customer]), CALCULATE(DISTINCTCOUNT(Sales[Item])))
As I only want to sum over unique customers, I use VALUES(Sales[Customer]). The CALCULATE wrapper is needed so that the customer currently being iterated is applied as a filter (context transition); without it, DISTINCTCOUNT would be evaluated over all visible customers on every iteration. If you want to sum over every row in the table, simply do: SUMX(<table name>, <expression>).
With this change, the output in the above example would be:
+----------+----------------+
| Customer | Distinct Items |
+----------+----------------+
| Albert | 1 |
| Bertram | 2 |
| Charlie | 1 |
+----------+----------------+
| Total | 4 |
+----------+----------------+
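The difference between the two totals can be reproduced with a few lines of Python. This is only a sketch of the arithmetic on the example Sales table, with hypothetical function names; it is not DAX.

```python
# Example Sales table: (Customer, Item, Count)
sales = [
    ("Albert", "Coke", 3),
    ("Bertram", "Beer", 5),
    ("Bertram", "Coke", 2),
    ("Charlie", "Beer", 1),
]

def distinct_items(rows):
    """Distinct count of Item over whatever rows are visible."""
    return len({item for _, item, _ in rows})

def distinct_items_summed(rows):
    """Evaluate the distinct count per customer, then add the results."""
    customers = {customer for customer, _, _ in rows}
    return sum(
        distinct_items([r for r in rows if r[0] == customer])
        for customer in customers
    )

plain_total = distinct_items(sales)          # total row of the first table
summed_total = distinct_items_summed(sales)  # total row of the second table
print(plain_total, summed_total)
```

Coke and Beer are each bought by two customers, so counting them once overall gives 2, while counting them once per customer gives 1 + 2 + 1 = 4.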