Adding a measure which finds the next row value for every row (similar to SQL Lead window function) - powerbi

I would be very grateful if you could share your experience and advice on the following problem in Power BI.
Three tables are given in the data model:
calendar dimension table
fact table on sessions
fact table on spending
| CW | Total cost | Sessions | Expected Column 1 | Expected Column 2 |
+----+-------------+-----------+-------------------+-------------------+
| 1 | 1200 | 50 | | |
| 2 | 1500 | 60 | 1200 | 50 |
| 3 | 1700 | 48 | 1500 | 60 |
| 4 | 1150 | 36 | 1700 | 48 |
| 5 | 900 | 29 | 1150 | 36 |
+----+-------------+-----------+-------------------+-------------------+
The CW column indicates the calendar week and comes from the calendar table. Sessions and Total cost come from the sessions and spending tables respectively. Data is aggregated and visualized at calendar-week level.
Problem: I need to create measures that derive Expected Column 1 and Expected Column 2 from the Total cost and Sessions columns. Basically, each row should pick up the value from an adjacent row, similar to the LEAD/LAG window functions in SQL (in the table above, each week shows the previous week's value).
I have checked the Power BI community and there are several ideas (for example here: https://community.powerbi.com/t5/Desktop/DAX-Query-to-Find-Next-Value/td-p/833896).
But these solutions assume that all columns come from the same table, whereas in the case described above the 3 columns come from different tables.
Will it be possible to get Expected Columns 1 and 2, and how? Many thanks in advance!
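One possible approach, shown only as a sketch: because both fact tables are filtered through the calendar table, a measure can shift the calendar-week filter by one and let the relationship do the rest. The names 'Calendar'[CW] and Spending[Cost] are assumptions (the question does not give column names), and the sketch assumes CW is a whole number that does not wrap around a year boundary.

```dax
-- Hypothetical names: 'Calendar'[CW] and Spending[Cost] are not given in the question.
-- Assumes 'Calendar' filters Spending through a relationship and CW is numeric.
Total Cost Previous CW =
VAR CurrentCW = MAX ( 'Calendar'[CW] )        -- week of the current row in the visual
RETURN
    CALCULATE (
        SUM ( Spending[Cost] ),
        FILTER (
            ALL ( 'Calendar' ),               -- drop the current week filter
            'Calendar'[CW] = CurrentCW - 1    -- keep only the previous week
        )
    )
```

The same pattern would apply to sessions by swapping the aggregation for one over the sessions table. For the first week there is no previous week, so the measure returns BLANK, which matches the empty cells in the expected columns.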

Related

Power BI - Filtered count not grouping by values in table

I have two tables (subject & category) that are both related to the same parent table (main). Because of the foreign key constraints, it looks like Power BI automatically created the links.
Simple mock-up of table links
I need to count the subjects by type for each possible distance range. I tried a simple calculation shown below for each distance category.
less than 2m =
CALCULATE (
    COUNTA ( 'Category'[Descr] ),
    'Subject'[Distance] IN { "less than 2m" }
)
However, the filter doesn't seem to apply properly.
I want...
+------+--------------+--------------+--+
| Descr| less than 2m | more than 2m | |
+------+--------------+--------------+--+
| Car | 2 | 1 | |
| Sign | 4 | 2 | |
+------+--------------+--------------+--+
but I'm getting...
+------+--------------+--------------+--+
| Descr| less than 2m | more than 2m | |
+------+--------------+--------------+--+
| Car | 3 | 3 | |
| Sign | 6 | 6 | |
+------+--------------+--------------+--+
It's just giving me the total count by type which is correct but isn't applying the filter by distance so I can break it down.
I'm sure this is probably really simple but I'm pretty new with DAX and I can't figure this one out.
I wish I could mark Kosuke's comment as an answer. The issue was indeed that cross-filtering had to be enabled. This can be done either by clicking on the relationship in your model and changing its cross-filter direction, or by using a function (CROSSFILTER) to enable the cross filter temporarily inside the measure.
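As a rough sketch of the function-based option, the original measure could enable bidirectional filtering only for its own evaluation. The key columns 'Subject'[MainId] and 'Main'[Id] are assumptions, since the question does not name the relationship columns.

```dax
-- Hypothetical key columns: 'Subject'[MainId] and 'Main'[Id] stand in for
-- whatever columns the Subject-to-Main relationship actually uses.
less than 2m =
CALCULATE (
    COUNTA ( 'Category'[Descr] ),
    'Subject'[Distance] IN { "less than 2m" },
    -- Let the Subject filter flow up to Main (and from there down to Category)
    CROSSFILTER ( 'Subject'[MainId], 'Main'[Id], BOTH )
)
```

CROSSFILTER only affects this measure, so the model's relationship settings stay untouched.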

Sum where version is highest by another variable (no max version in the whole data)

I'm struggling to get this measure to work.
I would like to have a measure that will sum the Value only for the max version of each house.
So following this example table:
|---------------------|------------------|------------------|
| House_Id | Version_Id | Value |
|---------------------|------------------|------------------|
| 1 | 1 | 1000 |
|---------------------|------------------|------------------|
| 1 | 2 | 2000 |
|---------------------|------------------|------------------|
| 2 | 1 | 3000 |
|---------------------|------------------|------------------|
| 3 | 1 | 5000 |
|---------------------|------------------|------------------|
The result of this measure should be: 10.000 because the house_id 1 version 1 is ignored as there's another version higher.
By House_id the result should be:
|---------------------|------------------|
| House_Id | Value |
|---------------------|------------------|
| 1 | 2000 |
|---------------------|------------------|
| 2 | 3000 |
|---------------------|------------------|
| 3 | 5000 |
|---------------------|------------------|
Can anyone help me?
EDIT:
Given the correct answer @RADO gave, I now want to enhance this measure further.
In reality, my main Data table has more columns.
What if I want to add this measure to a table visual that splits the measure by another column from (or related to) the Data table?
For example (simplified data table):
|---------------------|------------------|------------------|------------------|
| House_Id | Version_Id | Color_Id | Value |
|---------------------|------------------|------------------|------------------|
| 1 | 1 | 1 (Green) | 1000 |
|---------------------|------------------|------------------|------------------|
| 1 | 2 | 2 (Red) | 2000 |
|---------------------|------------------|------------------|------------------|
| 2 | 1 | 1 (Green) | 3000 |
|---------------------|------------------|------------------|------------------|
| 3 | 1 | 1 (Green) | 5000 |
|---------------------|------------------|------------------|------------------|
There's a Color_Id in the main table that is connected to a Color table.
Then I add a visual table with ColorName (from the ColorTable) and the measure (ColorId 1 is Green, 2 is Red).
With the given answer the result is wrong when filtered by ColorName. Although the Total row is indeed correct:
|---------------------|------------------|
| ColorName | Value |
|---------------------|------------------|
| Green | 9000 |
|---------------------|------------------|
| Red | 2000 |
|---------------------|------------------|
| Total | 10000 |
|---------------------|------------------|
This result is wrong per ColorName as 9000 + 2000 is 11000 and not 10000.
The measure should ignore the rows with an old version. In the example before this is the row for House_Id 1 and Color_Id Green because the version is old (there's a newer version for that House_Id).
So:
How can I address this situation?
What if I want to filter by another column from (or related to) the Data table, such as Location_Id? Is it possible to define the measure in such a way that it works for any number of splits by columns in the main Data table?
I use "Data" as a name of your table.
Sum of Latest Values =
VAR Latest_Versions =
    SUMMARIZE ( Data, Data[House_id], "Latest_Version", MAX ( Data[Version_Id] ) )
VAR Latest_Values =
    TREATAS ( Latest_Versions, Data[House_id], Data[Version_Id] )
VAR Result =
    CALCULATE ( SUM ( Data[Value] ), Latest_Values )
RETURN
    Result
Measure output:
How it works:
1. We calculate a virtual table of house_ids and their max versions, and store it in a variable "Latest_Versions".
2. We use the table from the first step to filter data for the latest versions only, and establish proper data lineage (https://www.sqlbi.com/articles/understanding-data-lineage-in-dax/).
3. We calculate the sum of latest values by filtering data for the latest values only.
You can learn more about this pattern here:
https://www.sqlbi.com/articles/propagate-filters-using-treatas-in-dax/
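For the follow-up in the EDIT (splitting by ColorName or any other column), one possible variation, offered as a sketch and not as the answerer's method, is to look up each house's latest version with all other filters removed and only then apply the visual's filters to the sum. It uses the same Data table and column names as above; the measure name is made up for illustration.

```dax
-- Sketch only: the measure name is hypothetical; table and columns follow the answer above.
Sum of Latest Values (any split) =
SUMX (
    VALUES ( Data[House_Id] ),                    -- houses visible in the current cell
    VAR LatestVersion =
        CALCULATE (
            MAX ( Data[Version_Id] ),
            ALLEXCEPT ( Data, Data[House_Id] )    -- ignore color, location, etc.
        )
    RETURN
        CALCULATE (
            SUM ( Data[Value] ),
            Data[Version_Id] = LatestVersion      -- visual filters still apply here
        )
)
```

On the sample data this would give Green = 8000 (houses 2 and 3), Red = 2000 (house 1, version 2) and Total = 10000, because house 1's Green row is an old version and no longer counts. Note that ALLEXCEPT also ignores slicers on other columns when finding the latest version, which may or may not be what you want.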

Filter out outliers dynamically using PERCENTILE

I'm building a sales dashboard in PowerBI.
I have a Sales table.
My source of data is declarative, so I have a few extreme values caused by human errors and mistypes, etc.
Let's say I want to build a histogram with:
On the X axis, the stock aging of any sales. Which is "how long the product has been in stock at the time of sale". It is given by the [Product_Age] column
On values, the number of sales.
What I want to do is exclude the top 1% extreme values from my calculations (average, etc.) and visualizations.
I've created a measure :
SalesByAge_Adjusted =
VAR TEMP =
    FILTER(
        SALES;
        VAR StockAgingMAX =
            PERCENTILE.INC(
                SALES[Sales_Age];
                0,99
            )
        RETURN
            SALES[Sales_Age] < StockAgingMAX
    )
RETURN
    COUNTROWS(TEMP)
It uses PERCENTILE.INC to get the 99th percentile of the Sales_Age values in the current context, which I then try to use as a filter.
However, it just won't work.
I can display the measure on its own (how many sales I have), but as soon as I drag and drop "Sales_Age" to summarize the values, it shows nothing.
I have created the following table as an example.
+-------+--------+
| Axis | Values |
+-------+--------+
| 1 | 1067 |
| 2 | 1725 |
| 4 | 298 |
| 8 | 402 |
| 16 | 1848 |
| 32 | 1395 |
| 64 | 1116 |
| 128 | 1027 |
| 256 | 1948 |
| 512 | 790 |
| 1024 | 2173 |
| 2048 | 2025 |
| 4096 | 104 |
| 8192 | 1243 |
| 16384 | 1676 |
| 32768 | 1285 |
| 65536 | 806 |
+-------+--------+
To filter out the values above the 99th percentile, I created the following measure. It computes the overall percentile with the filter context removed and compares it to each Axis value.
Filter = IF ( CALCULATE ( PERCENTILE.INC ( 'Table'[Axis], 0.99 ), ALL ( 'Table' ) ) >= MAX ( 'Table'[Axis] ), 1, 0 )
In the chart visual, you use the Filter measure as a visual-level filter (keeping only the rows where it equals 1) to exclude your outliers.
In this case, it will filter out the last value of the table: 65,536.
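The same idea can be folded back into the asker's measure. The original shows nothing because, once Sales_Age is on the axis, the percentile is computed inside a single age bucket and no value is strictly below itself. A minimal sketch, assuming the axis uses the Sales[Sales_Age] column directly, and written with comma separators and a decimal point (the question's locale uses `;` and `0,99`):

```dax
-- Sketch only: assumes the histogram axis is Sales[Sales_Age] itself.
SalesByAge_Adjusted =
VAR StockAgingMax =
    CALCULATE (
        PERCENTILE.INC ( Sales[Sales_Age], 0.99 ),
        ALL ( Sales[Sales_Age] )                  -- ignore the axis bucket, keep other slicers
    )
RETURN
    CALCULATE (
        COUNTROWS ( Sales ),
        -- KEEPFILTERS intersects with the axis filter instead of replacing it
        KEEPFILTERS ( Sales[Sales_Age] < StockAgingMax )
    )
```

Buckets at or above the threshold return BLANK and therefore drop out of the visual.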

PowerBI Sort Columns in Matrix Visual

I have a Matrix visual in Microsoft PowerBI with Australian 'States' as rows and 'Months Ago' as columns.
By default the Matrix shows my columns from 0 months ago to 12. I would like it to show from 12 months ago on the left to 0 months ago on the right.
+-------------------+-----------------------------+-------+
| | Months Ago | |
+-------------------+-----------------------------+-------+
| State | 0 | 1 | 2 | 3 | 4 | 5 | Total |
+-------------------+----+----+----+----+----+----+-------+
| Queensland | 10 | 10 | 10 | 10 | 10 | 10 | 60 |
+-------------------+----+----+----+----+----+----+-------+
| New South Wales | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Victoria | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| South Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
| Western Australia | | | | | | | |
+-------------------+----+----+----+----+----+----+-------+
Currently I am only given the option to sort by the value-type fields (i.e. revenue, etc.).
Is there any option to sort/order the Column Headers?
I don't think there is an option for you to sort column headers directly.
However, you can change the default sort order for the Months Ago column so that it will be reflected in general.
You can add a custom column MonthSrt = 12 - [Months Ago] in query editor:
(It won't work in DAX because of a known issue)
Then you can select the Months Ago column and sort it by MonthSrt:
The custom sort will be applied when you use the Months Ago column in visuals:
You can also make groups (one item per group) and give them a logical number:
The order will then change automatically in the matrix.
The following solution worked for me to display the dates in descending order in a matrix:
how to sort column dates in descending order of matrix in power bi

SUM of column conditional to many values of another column

I am trying to accomplish something, but don't know how to do it.
I have a Dimension (Table called TEntry) that represents time entries for employees like so :
Id | EmployeeId | EntryDT | TimeInMinutes | PriceAgreementId
------ | ---------- | ---------- | ------------- | ----------------
1 | 1 | 2017-03-20 | 100 | 1
2 | 1 | 2017-03-31 | 50 | null
3 | 2 | 2017-03-21 | 100 | 1
4 | 2 | 2017-03-23 | 125 | 2
5 | 3 | 2017-03-15 | 90 | null
6 | 3 | 2017-03-25 | 60 | 1
Sometimes they work on "PriceAgreements", and sometimes they don't.
In my dashboard, I have a table that groups TEntry by EmployeeId and sums TimeInMinutes. I also have a slicer for EntryDT:
EmployeeId | TimeInMinutes
-------------- | -------------
1 | 150
2 | 225
3 | 150
I need to create 2 new columns that represent :
The total TimeInMinutes an Employee has worked on all PriceAgreements
So for EmployeeId #1, the Total would be 100.
The total TimeInMinutes ALL Employees have worked, but only for the PriceAgreements the current Employee (current row) has worked on.
The Table would look like this (without the PriceAgreementIds in parenthesis) :
EmployeeId | TimeInMinutes | TimeInMinutes on PriceAgreements | TimeInMinutes on PriceAgreements ALL other EmployeeIds
-------------- | ------------- | -------------------------------- | ------------------------------------------------------
1 | 150 | 100 (PriceAgreementId=1) | 260 (PriceAgreementId=1)
2 | 225 | 225 (PriceAgreementId=1 and 2) | 385 (PriceAgreementId=1 and 2)
3 | 150 | 60 (PriceAgreementId=1) | 260 (PriceAgreementId=1)
Column "TimeInMinutes on PriceAgreements" is quite easy, but for the other one I cannot find a solution...
I have this DAX expression I started, but it is not complete:
CALCULATE(SUM(TEntry[TimeInMinutes]), NOT ISBLANK(TEntry[PriceAgreementId]), ALL(TEmployee))
TEmployee is a Dimension linked to the main TEntry Table.
Any help would be appreciated.
Thank you
I'm throwing this on as an answer because (a) it might get you (or someone else) going in the right direction and (b) if it's guaranteed that an Employee would only ever have time entries corresponding to 2 price agreements, this would work - which is unlikely the case for you, but might be the case for others trying to accomplish a similar thing.
Measure =
CALCULATE (
    SUM ( TEntry[TimeInMinutes] ),
    FILTER (
        ALL ( TEntry ),
        (
            TEntry[PriceAgreementID] = MIN ( TEntry[PriceAgreementID] )
                || TEntry[PriceAgreementID] = MAX ( TEntry[PriceAgreementID] )
        )
            && TEntry[PriceAgreementID] <> BLANK ()
    )
)
This measure is saying: SUM the TimeInMinutes for all records in the TEntry table where the PriceAgreementID matches either the minimum OR maximum PriceAgreementID (in the context of the current row) AND the PriceAgreementID isn't blank.
The fatal flaw in this answer is in the MIN and MAX. For Employee ID 2, who has 2 PriceAgreementIDs (1 & 2) - the MIN will calculate the minutes for PriceAgreementID 1 and the MAX will calculate the minutes for PriceAgreementID 2. However, to expand to a case where there might be more than 2 PriceAgreements...I don't know how to do that.
It does work on the sample data in your question, though (since there is a max of 2 price agreements per employee):
Typically when I'm faced with a problem like this that isn't easy to solve, I think about my data model and make sure that it conforms to a star schema as closely as possible.
In your case, an employee can have multiple price agreements, and a price agreement can be associated with many employees. That, to me, suggests a many-to-many relationship. I'd strongly recommend reading more about many-to-many relationships and whether restructuring the underlying tables (e.g. to include a bridge table) would help get you closer to the answer you need.
A good starting point might be: https://www.sqlbi.com/articles/many-to-many-relationships-in-power-bi-and-excel-2016/
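For the case of more than two price agreements, which the MIN/MAX answer above leaves open, one option that avoids remodelling is to capture the current employee's set of price agreements as a table and reuse it as a filter after the employee filter is removed. This is a sketch under the question's table and column names; the measure name is made up, and it assumes null PriceAgreementId values load as BLANK.

```dax
-- Sketch only: measure name is hypothetical; tables/columns come from the question.
TimeInMinutes on PriceAgreements (all employees) =
VAR EmployeeAgreements =
    CALCULATETABLE (
        VALUES ( TEntry[PriceAgreementId] ),         -- agreements of the current employee
        NOT ISBLANK ( TEntry[PriceAgreementId] )     -- skip entries without an agreement
    )
RETURN
    CALCULATE (
        SUM ( TEntry[TimeInMinutes] ),
        EmployeeAgreements,                          -- keep only those agreements
        ALL ( TEmployee ),                           -- drop the employee filter (dimension)
        ALL ( TEntry[EmployeeId] )                   -- and on the fact column, if grouped by it
    )
```

On the sample data this would return 260, 385 and 260 for employees 1, 2 and 3. The EntryDT slicer still applies, because only the employee filters are removed.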