DAX: Calculate differences between 2 years when value is not 0 - powerbi

+---------+-------+---------+
| NAME | YEAR | SCORE |
+---------+-------+---------+
| A | 2019 | 100 |
| B | 2019 | 67 |
| C | 2019 | 38 |
| A | 2020 | 48 |
| B | 2020 | 78 |
| C | 2020 | 0 |
| A | 2021 | 0 |
| B | 2021 | 50 |
| C | 2021 | 100 |
+---------+-------+---------+
I have a data table with the structure above, and I am trying to create a card that shows the difference between two years (current year - previous year), filtered by a Name slicer. However, I can't work out how to meet the requirements below. Is there a way to achieve this?
If the score for the previous year is 0, compare the current year with the most recent non-zero year. Example: if 2019 value is 0, it will find the difference between 2021 - 2018.
If the score for the current year is 0, compare the last two non-zero years. Example: if the 2021 value is 0, compare 2020 - 2019.

You can use a measure like this
Measure =
VAR _tbl =
    TOPN ( 2, FILTER ( 'Table', 'Table'[SCORE] <> 0 ), 'Table'[YEAR], DESC )
VAR _yr1 =
    CALCULATE ( MAX ( 'Table'[YEAR] ), _tbl )
VAR _score1 =
    CALCULATE (
        MAX ( 'Table'[SCORE] ),
        FILTER ( VALUES ( 'Table'[YEAR] ), 'Table'[YEAR] = _yr1 )
    )
VAR _yr2 =
    CALCULATE ( MIN ( 'Table'[YEAR] ), _tbl )
VAR _score2 =
    CALCULATE (
        MAX ( 'Table'[SCORE] ),
        FILTER ( VALUES ( 'Table'[YEAR] ), 'Table'[YEAR] = _yr2 )
    )
RETURN
    _score1 - _score2
This returns the difference between the two most recent non-zero years for the name selected in the slicer.
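As a rough check of the logic outside Power BI, here is a minimal Python sketch that applies the same rule (take the two latest years with a non-zero score, then subtract) to the sample rows from the question. The function name `last_two_nonzero_diff` is made up for illustration, and returning `None` when fewer than two non-zero years exist is my own assumption, not something the measure does.

```python
# Rows copied from the sample table in the question: (name, year, score).
rows = [
    ("A", 2019, 100), ("B", 2019, 67), ("C", 2019, 38),
    ("A", 2020, 48), ("B", 2020, 78), ("C", 2020, 0),
    ("A", 2021, 0), ("B", 2021, 50), ("C", 2021, 100),
]


def last_two_nonzero_diff(rows, name):
    """Latest non-zero score minus the previous non-zero score for one name."""
    # Keep only this name's non-zero rows, newest year first.
    nonzero = sorted(
        ((year, score) for n, year, score in rows if n == name and score != 0),
        reverse=True,
    )
    if len(nonzero) < 2:
        return None  # assumption: nothing to compare against
    latest_score = nonzero[0][1]
    previous_score = nonzero[1][1]
    return latest_score - previous_score


result = {name: last_two_nonzero_diff(rows, name) for name in ("A", "B", "C")}
print(result)
```

For A the zero in 2021 is skipped, so the comparison is 2020 against 2019; for C the zero in 2020 is skipped, so it is 2021 against 2019.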

Related

Get latest value for each ID in Power BI / DAX measure

In Power BI I would like to create a DAX measure that will retrieve the latest string value for specific IDs. Example source table:
Name_ID | Name | DateTime | Value
----------------------------------------------------------
1 | Child_1 | 18.8.2021 12:33:24 | F
32 | Parent_32 | 18.8.2021 11:41:09 | F
13 | Child_1 | 18.8.2021 11:30:58 | E
48 | Parent_48 | 18.8.2021 09:13:11 | F
2 | Child_2 | 17.8.2021 00:09:42 | S
1 | Child_1 | 17.8.2021 23:03:34 | F
48 | Parent_48 | 17.8.2021 21:46:27 | S
6 | Parent_6 | 16.8.2021 17:31:26 | S
.
.
.
The specific parent IDs here are, for example, 6, 32 and 48, so the result should look something like this:
Name_ID | Name | DateTime (of last execution) | Value
------------------------------------------------------------------------------
32 | Parent_32 | 18.8.2021 11:41:09 | F
48 | Parent_48 | 18.8.2021 09:13:11 | F
6 | Parent_6 | 16.8.2021 17:31:26 | S
The result I'm after contains only each parent's latest appearance, returning either the whole row or just the Value from the last column.
This seems easy in theory and on paper, but I just can't get it to work in DAX. I have tried various CALCULATE formulas without any result worth mentioning.
I'm a beginner in Power BI, and any help would be much appreciated!
You can use a measure like this one, where we check Max Date per Name:
Flag =
VAR MaxDatePerName =
    CALCULATE (
        MAX ( Sheet3[DateTime] ),
        FILTER ( ALL ( Sheet3 ), SELECTEDVALUE ( Sheet3[Name] ) = Sheet3[Name] )
    )
RETURN
    IF (
        MaxDatePerName = SELECTEDVALUE ( Sheet3[DateTime] )
            && LEFT ( SELECTEDVALUE ( Sheet3[Name] ), 6 ) = "Parent",
        1,
        BLANK ()
    )
With RANKX
Measure2 =
VAR _0 =
    MAX ( 'Table 1'[DateTime] )
VAR _00 =
    MAX ( 'Table 1'[Name] )
VAR _1 =
    CALCULATE (
        RANKX (
            FILTER ( ALL ( 'Table 1' ), 'Table 1'[Name] = _00 ),
            CALCULATE ( MAX ( 'Table 1'[DateTime] ) ),
            ,
            DESC
        )
    )
VAR _2 =
    IF ( _1 = 1 && CONTAINSSTRING ( _00, "Parent" ) = TRUE (), _0, BLANK () )
RETURN
    _2
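Both measures boil down to "keep the row with the latest DateTime per Name, and only for names starting with Parent". A minimal Python sketch of that rule on the sample rows, in case it helps to sanity-check the expected result; the function name `latest_parent_values` is hypothetical, and the timestamps are assumed to be in day.month.year order.

```python
from datetime import datetime

# Rows copied from the sample table: (name_id, name, timestamp, value).
rows = [
    (1, "Child_1", "18.8.2021 12:33:24", "F"),
    (32, "Parent_32", "18.8.2021 11:41:09", "F"),
    (13, "Child_1", "18.8.2021 11:30:58", "E"),
    (48, "Parent_48", "18.8.2021 09:13:11", "F"),
    (2, "Child_2", "17.8.2021 00:09:42", "S"),
    (1, "Child_1", "17.8.2021 23:03:34", "F"),
    (48, "Parent_48", "17.8.2021 21:46:27", "S"),
    (6, "Parent_6", "16.8.2021 17:31:26", "S"),
]


def latest_parent_values(rows):
    """Map each parent name to the Value of its most recent row."""
    latest = {}  # name -> (timestamp, value)
    for _name_id, name, stamp, value in rows:
        if not name.startswith("Parent"):
            continue  # children are ignored
        when = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")
        if name not in latest or when > latest[name][0]:
            latest[name] = (when, value)
    return {name: value for name, (_when, value) in latest.items()}


result = latest_parent_values(rows)
print(result)
```

This matches the three rows in the expected result table of the question.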

persist future values from measure

I have a measure that displays the number of employees by date.
Each day FactEmployee is updated to reflect who is working. This means that my measure (obviously) can't display how many employees there will be tomorrow.
I would like to persist the latest value (i.e. today's value) into the future.
Data model
My (not perfect) measure
Count, employee :=
VAR today = TODAY()
VAR res =
IF (
MAX ( DimDate[fulldate] ) > today,
CALCULATE (
COUNT ( DimEmployee[emp_key] ),
FILTER ( ALL ( FactEmployee ), RELATED ( DimDate[fulldate] ) = today)
),
CALCULATE ( COUNT ( DimEmployee[emp_key] ), FactEmployee )
)
RETURN
res
Output
year-month count, emp
---------------------------
2020-01 182
2020-02 180
2020-03 174
2020-04 171
2020-05 171
2020-06 173
2020-07 172
2020-08 175
2020-09 172
Expected Output
year-month count, emp
--------------------------
2020-01 182
2020-02 180
2020-03 174
2020-04 171
2020-05 171
2020-06 173
2020-07 172
2020-08 175
2020-09 172
2020-10 172 <----repeated value from 2020-09
2020-11 172 <----repeated value from 2020-09
2020-12 172 <----repeated value from 2020-09
How can I fix my measure to fill in the missing values (October to December)?
I have replicated your model in a simplified version; I don't think you need DimEmployee in this case.
Assume the tables look like this:
FactEmployee
+----------+---------+
| date_key | emp_key |
+----------+---------+
| 20200101 | 1 |
+----------+---------+
| 20200102 | 1 |
+----------+---------+
| 20200103 | 1 |
+----------+---------+
| 20200104 | 1 |
+----------+---------+
| 20200105 | 1 |
+----------+---------+
| 20200101 | 2 |
+----------+---------+
| 20200102 | 2 |
+----------+---------+
| 20200104 | 2 |
+----------+---------+
| 20200101 | 3 |
+----------+---------+
| 20200102 | 3 |
+----------+---------+
| 20200103 | 3 |
+----------+---------+
| 20200104 | 3 |
+----------+---------+
| 20200105 | 4 |
+----------+---------+
DimDate
+------------+----------+
| Date | Date_key |
+------------+----------+
| 01/01/2020 | 20200101 |
+------------+----------+
| 02/01/2020 | 20200102 |
+------------+----------+
| 03/01/2020 | 20200103 |
+------------+----------+
| 04/01/2020 | 20200104 |
+------------+----------+
| 05/01/2020 | 20200105 |
+------------+----------+
| 06/01/2020 | 20200106 |
+------------+----------+
| 07/01/2020 | 20200107 |
+------------+----------+
I have created a calculation that follows these steps:
Compute the latest date_key that has a non-blank distinct count of emp_key, and store it in the variable MaxDateKey.
Test whether the current date_key is greater than MaxDateKey (here, 20200106 and 20200107). For those dates, the calculation returns the distinct count of emp_key on MaxDateKey.
When the IF condition is false, the distinct count is calculated as usual.
Count =
VAR MaxDateKey =
    CALCULATE (
        LASTNONBLANK ( FactEmployee[date_key], DISTINCTCOUNT ( FactEmployee[emp_key] ) ),
        REMOVEFILTERS ( DimDate[Date] )
    )
VAR Result =
    IF (
        MAX ( DimDate[Date_key] ) > MaxDateKey,
        CALCULATE (
            DISTINCTCOUNT ( FactEmployee[emp_key] ),
            ALL ( DimDate[Date] ),
            DimDate[Date_key] = MaxDateKey
        ),
        DISTINCTCOUNT ( FactEmployee[emp_key] )
    )
RETURN
    Result
In the output, the value from the last valid date (5 January) carries over to the subsequent dates (6 and 7 January).
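To make the carry-forward step concrete, here is a small Python sketch of the same idea on the sample FactEmployee and DimDate rows above. The function name `headcount_with_carry_forward` is made up, and returning 0 (rather than a blank) for a past date with no rows is an assumption of this sketch.

```python
# FactEmployee rows copied from the sample: (date_key, emp_key).
fact = [
    (20200101, 1), (20200102, 1), (20200103, 1), (20200104, 1), (20200105, 1),
    (20200101, 2), (20200102, 2), (20200104, 2),
    (20200101, 3), (20200102, 3), (20200103, 3), (20200104, 3),
    (20200105, 4),
]
# Date keys from the sample DimDate table.
date_keys = [20200101 + i for i in range(7)]


def headcount_with_carry_forward(fact, date_keys):
    """Distinct employees per date; dates after the last fact date reuse its count."""
    by_date = {}
    for date_key, emp_key in fact:
        by_date.setdefault(date_key, set()).add(emp_key)
    max_date_key = max(by_date)  # plays the role of MaxDateKey
    last_count = len(by_date[max_date_key])
    return {
        dk: last_count if dk > max_date_key else len(by_date.get(dk, ()))
        for dk in date_keys
    }


result = headcount_with_carry_forward(fact, date_keys)
print(result)
```

The count for 20200105 (employees 1 and 4) is repeated for 20200106 and 20200107.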
For a line chart, you can also enable the Forecast option in the Analytics pane.

Rolling average over time with multiple values per date

I'm trying to calculate a rolling average for each row of a table, using the values in that table that fall within a sliding time window extending a fixed number of days before and after the row's date.
Given the following table:
myTable
+------------+-------+
| Date | Value |
+------------+-------+
| 31/05/2020 | 5 |
+------------+-------+
| 31/05/2020 | 10 |
+------------+-------+
| 01/06/2020 | 50 |
+------------+-------+
| 01/08/2020 | 50 |
+------------+-------+
and the measure
myMeasure =
VAR LookAheadAndBehindInDays = 28
RETURN
AVERAGEX (
DATESINPERIOD (
myTable[Date],
DATEADD ( LASTDATE ( myTable[Date] ), LookAheadAndBehindInDays, DAY ),
-2 * LookAheadAndBehindInDays,
DAY
),
myTable[Value]
)
I checked that the DATESINPERIOD returns effectively the right dates. My problem lies in the calculation of the average.
Instead of calculating the average of all values directly (expected result)
+------------+-------+---------------------------+
| Date | Value | myMeasure |
+------------+-------+---------------------------+
| 31/05/2020 | 5 | (5 + 10 + 50) / 3 = 21.66 |
+------------+-------+---------------------------+
| 31/05/2020 | 10 | (5 + 10 + 50) / 3 = 21.66 |
+------------+-------+---------------------------+
| 01/06/2020 | 50 | (5 + 10 + 50) / 3 = 21.66 |
+------------+-------+---------------------------+
| 01/08/2020 | 27 | 27 / 1 = 27 |
+------------+-------+---------------------------+
It first calculates the average of each date, and then the average of those values:
+------------+-------+--------------------+------------------------+
| Date | Value | Avg. by Date | myMeasure |
+------------+-------+--------------------+------------------------+
| 31/05/2020 | 5 | (5 + 10) / 2 = 7.5 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 31/05/2020 | 10 | (5 + 10) / 2 = 7.5 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 01/06/2020 | 50 | 50 / 1 = 50 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 01/08/2020 | 27 | 27 / 1 = 27 | 27 / 1 = 27 |
+------------+-------+--------------------+------------------------+
I found out about this behavior by using this measure:
myMeasure DEBUG =
VAR LookAheadAndBehindInDays = 28
VAR vTable =
DATESINPERIOD (
myTable[Date],
DATEADD ( LASTDATE ( myTable[Date] ), LookAheadAndBehindInDays , DAY ),
-2 * LookAheadAndBehindInDays,
DAY
)
RETURN
FIRSTDATE ( vTable ) & " - " & LASTDATE ( vTable ) & UNICHAR(10)
& " - Row Count: " & COUNTROWS ( vTable ) & UNICHAR(10)
& " - Avg: " & AVERAGEX(vTable, myTable[Value]) & UNICHAR(10)
& " - Dates: " & CONCATENATEX ( vTable, myTable[Date], "," ) & UNICHAR(10)
& " - Values: " & CONCATENATEX ( vTable, myTable[Value], "," )
This returns the following value for the rows dated '31/05/2020' and '01/06/2020':
31/05/2020 - 01/06/2020
Row Count: 2
Avg: 28.75
Dates: 31/05/2020,01/06/2020
Values: 7.5,50
Most notable are the Row Count of 2, which I would expect to be 3, and the Values 7.5 and 50, which I would expect to be 5, 10 and 50 (as in the tables above).
So my question is: how can I calculate the rolling average over time by weighting each value equally, instead of weighting each day equally?
I'm not sure I completely understood the problem, but I think you just need a standard AVERAGE rather than the AVERAGEX iterator.
I've changed the formula a bit and dropped DATESINPERIOD; this version achieves the same result and is (to me) clearer and more readable:
Avg =
VAR DaysInterval = 28
RETURN
    CALCULATE (
        AVERAGE ( myTable[Value] ),
        DATESBETWEEN (
            myTable[Date],
            MAX ( myTable[Date] ) - DaysInterval, -- from
            MAX ( myTable[Date] ) + DaysInterval -- to
        )
    )
here is the result (based on the sample dataset)
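What you are looking for is the average over the window from 28 days before to 28 days after the current date:
myMeasure =
VAR LookAheadAndBehindInDays = 28
-- A measure needs an aggregate here; a bare column reference only works in a calculated column.
VAR curDate = MAX ( rolling[ Date] )
RETURN
    CALCULATE (
        AVERAGE ( rolling[Value] ),
        FILTER (
            ALL ( rolling ), -- lift the current row's filters so neighbouring dates are visible
            rolling[ Date] + LookAheadAndBehindInDays >= curDate
                && rolling[ Date] - LookAheadAndBehindInDays <= curDate
        )
    )
As you can see, I am using FILTER to select the rows that fall within the date range and then averaging over those rows.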
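The difference between weighting each value equally and weighting each day equally can be reproduced outside DAX. The Python sketch below uses the myTable rows exactly as listed in the question and a symmetric window of 28 days either side; both function names are invented for this illustration.

```python
from datetime import date, timedelta

# myTable rows as listed in the question: (date, value).
rows = [
    (date(2020, 5, 31), 5),
    (date(2020, 5, 31), 10),
    (date(2020, 6, 1), 50),
    (date(2020, 8, 1), 50),
]


def rolling_average(rows, center, days=28):
    """Average of every value within +/- `days` of `center` (each row counts once)."""
    window = timedelta(days=days)
    values = [v for d, v in rows if center - window <= d <= center + window]
    return sum(values) / len(values) if values else None


def average_of_daily_averages(rows, center, days=28):
    """Average per date first, then average those (each date counts once)."""
    window = timedelta(days=days)
    by_day = {}
    for d, v in rows:
        if center - window <= d <= center + window:
            by_day.setdefault(d, []).append(v)
    daily = [sum(vs) / len(vs) for vs in by_day.values()]
    return sum(daily) / len(daily) if daily else None


per_value = rolling_average(rows, date(2020, 5, 31))
per_day = average_of_daily_averages(rows, date(2020, 5, 31))
print(round(per_value, 2), per_day)
```

The first function gives (5 + 10 + 50) / 3 for 31/05/2020, which is what the question expects; the second gives (7.5 + 50) / 2 = 28.75, which is what the original AVERAGEX over dates produced.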

Calculated Column - Filtering MAX on date range from filter

Consider the following tables - one of printers, the other of page counts from meter readings:
Printers
+------------+---------+--------+
| Printer ID | Make | Model |
+------------+---------+--------+
| 1 | Xerox | ABC123 |
| 2 | Brother | DEF456 |
| 3 | Xerox | ABC123 |
+------------+---------+--------+
Meter Read
+-------+------------+-----------+------------+
| Index | Printer ID | Poll Date | Mono Pages |
+-------+------------+-----------+------------+
| 1 | 1 | 1/1/2019 | 1000 |
| 2 | 2 | 1/1/2019 | 800 |
| 3 | 3 | 1/1/2019 | 33000 |
| 4 | 1 | 1/2/2019 | 1100 |
| 5 | 2 | 1/2/2019 | 850 |
| 6 | 3 | 1/2/2019 | 34000 |
| 7 | 1 | 1/3/2019 | 1200 |
| 8 | 2 | 1/3/2019 | 900 |
| 9 | 3 | 1/3/2019 | 35000 |
| 10 | 1 | 1/4/2019 | 1400 |
| 11 | 2 | 1/4/2019 | 950 |
| 12 | 3 | 1/4/2019 | 36000 |
| 13 | 1 | 1/5/2019 | 1800 |
| 14 | 2 | 1/5/2019 | 1000 |
| 15 | 3 | 1/5/2019 | 36500 |
| 16 | 1 | 1/6/2019 | 2000 |
| 17 | 2 | 1/6/2019 | 1050 |
| 18 | 3 | 1/6/2019 | 37500 |
| 19 | 1 | 1/7/2019 | 2100 |
| 20 | 2 | 1/7/2019 | 1100 |
| 21 | 3 | 1/7/2019 | 39000 |
| 22 | 1 | 1/8/2019 | 2200 |
| 23 | 2 | 1/8/2019 | 1150 |
| 24 | 3 | 1/8/2019 | 40000 |
+-------+------------+-----------+------------+
In my Power BI report, I have a Dates table:
Dates = CALENDAR(DATE(2019, 1, 1), DATE(2019, 1, 31))
that I am using as a slicer. The goal is to end up with a delta of Mono Pages during the date range from the slicer. I'm able to grab the difference between each meter read with a fairly complicated calculated column on the Meter Read table:
PagesSinceLastPoll =
IF(
ISBLANK(
LOOKUPVALUE(
'Meter Read'[Mono Pages],
'Meter Read'[Index], CALCULATE(
MAX(
'Meter Read'[Index]
), FILTER(
'Meter Read',
'Meter Read'[Index] < EARLIER('Meter Read'[Index])
&& 'Meter Read'[Printer ID] = EARLIER('Meter Read'[Printer ID] )
)
)
)
),
BLANK(),
'Meter Read'[Mono Pages] -
LOOKUPVALUE(
'Meter Read'[Mono Pages],
'Meter Read'[Index], CALCULATE(
MAX(
'Meter Read'[Index]
), FILTER(
'Meter Read',
'Meter Read'[Index] < EARLIER('Meter Read'[Index])
&& 'Meter Read'[Printer ID] = EARLIER('Meter Read'[Printer ID] )
)
)
)
)
But the performance over 10,000+ rows is pretty bad. I'd like to grab the max and min values for a device in the filtered date range and just subtract instead, but I'm having a hard time getting the right value. My DAX so far keeps getting me the max value from the ENTIRE table, not the table filtered on the dates in my slicer. Everything I've tried so far is some variation on:
MaxInRange =
CALCULATE (
MAX ( 'Meter Read'[Mono Pages] ),
FILTER ( 'Meter Read', 'Meter Read'[Printer ID] = Printers[Printer ID] )
)
To summarize: If I have a slicer starting 1/2/2019 and ending 1/5/2019, the max value for Printer ID 1 should read 1800, not 2200.
Thoughts?
The calculated column can be done more efficiently like this:
PagesSinceLastPoll =
VAR PrevRow =
    TOPN (
        1,
        FILTER (
            'Meter Read',
            'Meter Read'[PrinterID] = EARLIER ( 'Meter Read'[PrinterID] )
                && 'Meter Read'[PollDate] < EARLIER ( 'Meter Read'[PollDate] )
        ),
        'Meter Read'[PollDate]
    )
RETURN
    'Meter Read'[MonoPages] - SELECTCOLUMNS ( PrevRow, "Pages", 'Meter Read'[MonoPages] )
With that column in place, the number of pages printed between two dates is simply the sum of this column over those dates.
If you want to skip that and go straight to a measure, try something like this:
PagesInPeriod =
VAR StartDate = FIRSTDATE ( Dates[Date] )
VAR EndDate = LASTDATE ( Dates[Date] )
RETURN
    SUMX (
        VALUES ( 'Meter Read'[PrinterID] ),
        CALCULATE (
            MAX ( 'Meter Read'[MonoPages] ),
            Dates[Date] = EndDate
        )
            - CALCULATE (
                MAX ( 'Meter Read'[MonoPages] ),
                Dates[Date] < StartDate
            )
    )
Note that if you use Dates[Date] = StartDate, the result will be off: you need the maximum page count from before your first included date.
Both of these methods should give the same result:
Alexis' measure is the correct way to handle this (my thanks!), but I made one very small edit. Since a reading may not have been taken on the end date, we need to look on or before that date; otherwise the max on the end date is treated as zero. The final code then becomes:
PagesInPeriod =
VAR StartDate = FIRSTDATE ( Dates[Date] )
VAR EndDate = LASTDATE ( Dates[Date] )
RETURN
    SUMX (
        VALUES ( 'Meter Read'[PrinterID] ),
        CALCULATE (
            MAX ( 'Meter Read'[MonoPages] ),
            Dates[Date] <= EndDate
        )
            - CALCULATE (
                MAX ( 'Meter Read'[MonoPages] ),
                Dates[Date] < StartDate
            )
    )
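For anyone who wants to verify the arithmetic, here is a Python sketch of the same per-printer calculation (highest reading on or before the end date, minus highest reading before the start date) on the Meter Read sample. It assumes the poll dates are month/day/year, that the meter only ever counts up, and that a printer with no reading before the start date contributes 0 for that term, which mirrors how a blank behaves in the subtraction. The name `pages_in_period` is just for illustration.

```python
from datetime import date

# Mono Pages per printer for 1/1/2019 through 1/8/2019, from the Meter Read table.
pages = {
    1: [1000, 1100, 1200, 1400, 1800, 2000, 2100, 2200],
    2: [800, 850, 900, 950, 1000, 1050, 1100, 1150],
    3: [33000, 34000, 35000, 36000, 36500, 37500, 39000, 40000],
}
# Flatten into (printer_id, poll_date, mono_pages) rows.
readings = [
    (printer, date(2019, 1, day), count)
    for printer, counts in pages.items()
    for day, count in enumerate(counts, start=1)
]


def pages_in_period(readings, start, end):
    """Sum over printers of (max reading <= end) - (max reading < start)."""
    total = 0
    for printer in {pid for pid, _, _ in readings}:
        up_to_end = max(
            (p for pid, d, p in readings if pid == printer and d <= end), default=0
        )
        before_start = max(
            (p for pid, d, p in readings if pid == printer and d < start), default=0
        )
        total += up_to_end - before_start
    return total


total = pages_in_period(readings, date(2019, 1, 2), date(2019, 1, 5))
print(total)
```

With the slicer at 1/2/2019 to 1/5/2019, Printer 1 contributes 1800 - 1000, Printer 2 contributes 1000 - 800, and Printer 3 contributes 36500 - 33000.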

How to dynamically compare same period based on a slicer

I would like to compare sessions per day against the same period before it. If I'm looking at Oct 10th 2018 to Oct 16th 2018 (Wednesday to Tuesday), I would like to compare it to the same range of days in the previous week:
+------+-------+-----+----------+-------------+
| year | month | day | sessions | last_period |
+------+-------+-----+----------+-------------+
| 2018 | oct | 10 | 2000 | 2500 |
| 2018 | oct | 11 | 2500 | 2400 |
| 2018 | oct | 12 | 2600 | 2300 |
| 2018 | oct | 13 | 2700 | 2450 |
| 2018 | oct | 14 | 2400 | 2500 |
| 2018 | oct | 15 | 2300 | 2200 |
| 2018 | oct | 16 | 2000 | 1150 |
+------+-------+-----+----------+-------------+
A simple formula can make it work based on the 7-day interval:
same_last_period = CALCULATE(SUM(table[Sessions]),DATEADD(table[Date],-7,DAY))
but I would like the formula to depend on a date slicer. Say I wanted to look at Oct 1 to Oct 20: the formula should then look at the period immediately before it, with the same number of days. Ultimately this would be graphed as well.
Try this:
same_last_period =
VAR DayCount =
    CALCULATE ( DISTINCTCOUNT ( 'table'[Date] ), ALLSELECTED ( 'table'[Date] ) )
RETURN
    CALCULATE ( SUM ( 'table'[Sessions] ), DATEADD ( 'table'[Date], -DayCount, DAY ) )
Edit:
The above doesn't work as I intended, because year, month, and day are still in your filter context. Those filters need to be removed.
same_last_period =
VAR DayCount =
    CALCULATE (
        DISTINCTCOUNT ( 'table'[Date] ),
        ALLSELECTED ( 'table'[Date] ),
        ALLEXCEPT ( 'table', 'table'[Date] )
    )
RETURN
    CALCULATE (
        SUM ( 'table'[Sessions] ),
        DATEADD ( 'table'[Date], -DayCount, DAY ),
        ALLEXCEPT ( 'table', 'table'[Date] )
    )
The ALLEXCEPT removes any extra filter context except for Date.
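The idea of the measure is: count the days in the slicer selection, then shift each date back by that many days. A short Python sketch of that shift, using the sample table; it assumes the last_period column holds the sessions for Oct 3 to Oct 9, and the function name simply reuses the measure's name for readability.

```python
from datetime import date, timedelta

# Oct 10-16 values come from the "sessions" column of the sample table.
current = [2000, 2500, 2600, 2700, 2400, 2300, 2000]
# Assumption: the "last_period" column holds the sessions for Oct 3-9.
previous = [2500, 2400, 2300, 2450, 2500, 2200, 1150]

# Daily sessions keyed by date, Oct 3 through Oct 16.
sessions = {
    date(2018, 10, 3) + timedelta(days=i): value
    for i, value in enumerate(previous + current)
}


def same_last_period(sessions, selected_dates):
    """For each selected date, look up the value N days earlier (N = days selected)."""
    day_count = len(set(selected_dates))  # like DISTINCTCOUNT over the selection
    shift = timedelta(days=day_count)
    return {d: sessions.get(d - shift) for d in selected_dates}


selected = [date(2018, 10, 10) + timedelta(days=i) for i in range(7)]
result = same_last_period(sessions, selected)
print([result[d] for d in selected])
```

Note that counting distinct dates only equals the length of the period when the selection is a contiguous range with a row for every day; a selection with gaps would shift by too few days.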