Rolling average over time with multiple values per date - powerbi

I'm trying to calculate a rolling average for each row of a table, based on the values in that table that fall within a sliding time window looking a certain number of days ahead and back.
Given the following table:
myTable
+------------+-------+
| Date | Value |
+------------+-------+
| 31/05/2020 | 5 |
+------------+-------+
| 31/05/2020 | 10 |
+------------+-------+
| 01/06/2020 | 50 |
+------------+-------+
| 01/08/2020 | 27 |
+------------+-------+
and the measure
myMeasure =
VAR LookAheadAndBehindInDays = 28
RETURN
AVERAGEX (
DATESINPERIOD (
myTable[Date],
DATEADD ( LASTDATE ( myTable[Date] ), LookAheadAndBehindInDays, DAY ),
-2 * LookAheadAndBehindInDays,
DAY
),
myTable[Value]
)
I checked that DATESINPERIOD does return the right dates. My problem lies in how the average is calculated.
Instead of averaging all values directly (the expected result):
+------------+-------+---------------------------+
| Date | Value | myMeasure |
+------------+-------+---------------------------+
| 31/05/2020 | 5 | (5 + 10 + 50) / 3 = 21.67 |
+------------+-------+---------------------------+
| 31/05/2020 | 10 | (5 + 10 + 50) / 3 = 21.67 |
+------------+-------+---------------------------+
| 01/06/2020 | 50 | (5 + 10 + 50) / 3 = 21.67 |
+------------+-------+---------------------------+
| 01/08/2020 | 27 | 27 / 1 = 27 |
+------------+-------+---------------------------+
It first calculates the average of each date, and then the average of those values:
+------------+-------+--------------------+------------------------+
| Date | Value | Avg. by Date | myMeasure |
+------------+-------+--------------------+------------------------+
| 31/05/2020 | 5 | (5 + 10) / 2 = 7.5 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 31/05/2020 | 10 | (5 + 10) / 2 = 7.5 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 01/06/2020 | 50 | 50 / 1 = 50 | (7.5 + 50) / 2 = 28.75 |
+------------+-------+--------------------+------------------------+
| 01/08/2020 | 27 | 27 / 1 = 27 | 27 / 1 = 27 |
+------------+-------+--------------------+------------------------+
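The difference between the two tables above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not DAX), using only the three rows that fall inside the window; the function names are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Rows inside the window for the 31/05/2020 and 01/06/2020 rows of myTable.
rows = [("31/05/2020", 5), ("31/05/2020", 10), ("01/06/2020", 50)]

def average_per_value(rows):
    """Weight every row equally (the expected result)."""
    return mean(value for _, value in rows)

def average_per_date(rows):
    """Average each date first, then average those daily averages."""
    by_date = defaultdict(list)
    for date, value in rows:
        by_date[date].append(value)
    return mean(mean(values) for values in by_date.values())
```

Here average_per_value(rows) gives about 21.67, while average_per_date(rows) gives 28.75, which is the value the measure currently returns.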
I found out about this behavior by using this measure:
myMeasure DEBUG =
VAR LookAheadAndBehindInDays = 28
VAR vTable =
DATESINPERIOD (
myTable[Date],
DATEADD ( LASTDATE ( myTable[Date] ), LookAheadAndBehindInDays , DAY ),
-2 * LookAheadAndBehindInDays,
DAY
)
RETURN
FIRSTDATE ( vTable ) & " - " & LASTDATE ( vTable ) & UNICHAR(10)
& " - Row Count: " & COUNTROWS ( vTable ) & UNICHAR(10)
& " - Avg: " & AVERAGEX(vTable, myTable[Value]) & UNICHAR(10)
& " - Dates: " & CONCATENATEX ( vTable, myTable[Date], "," ) & UNICHAR(10)
& " - Values: " & CONCATENATEX ( vTable, myTable[Value], "," )
For the rows dated 31/05/2020 and 01/06/2020, this returns the following value:
31/05/2020 - 01/06/2020
Row Count: 2
Avg: 28.75
Dates: 31/05/2020,01/06/2020
Values: 7.5,50
Most notable are the Row Count of 2, which I would expect to be 3, and the Values 7.5,50, which I would expect to be 5, 10 and 50 (as reflected in the tables above).
So my question is: how can I calculate the rolling average over time by weighting each value equally, instead of weighting each day equally?

I'm not sure I completely understood the problem, but it seems to me that you just need a standard AVERAGE rather than the AVERAGEX iterator.
I changed the formula a bit and didn't use DATESINPERIOD; this version achieves the same result and is (to me) clearer and more readable:
Avg =
VAR DaysInterval = 28
RETURN
CALCULATE (
AVERAGE ( myTable[Value] ),
DATESBETWEEN (
myTable[Date],
MAX ( myTable[Date] ) - DaysInterval, --from
MAX ( myTable[Date] ) + DaysInterval --to
)
)
here is the result (based on the sample dataset)

What you are looking for is the average of all values that fall within 28 days before or after the current date:
myMeasure =
VAR LookAheadAndBehindInDays = 28
VAR curDate = MAX ( rolling[Date] )
RETURN
CALCULATE (
AVERAGE ( rolling[Value] ),
FILTER (
ALL ( rolling ),
rolling[Date] + LookAheadAndBehindInDays >= curDate
&& rolling[Date] - LookAheadAndBehindInDays <= curDate
)
)
As you can see, I am using FILTER to get the rows that fall within the date range, and the average is calculated over those rows, so every value carries the same weight.
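To make the windowing logic concrete, here is a small sketch in Python (not DAX) of the same idea: keep every row whose date lies within 28 days of the current date and average those rows directly. The function name is hypothetical, and the 01/08/2020 row uses the value 27 shown in the question's result tables.

```python
from datetime import date, timedelta

rows = [
    (date(2020, 5, 31), 5),
    (date(2020, 5, 31), 10),
    (date(2020, 6, 1), 50),
    (date(2020, 8, 1), 27),  # value as shown in the result tables
]

def rolling_average(rows, current, days=28):
    """Average all values dated within +/- `days` of `current`, one weight per row."""
    window = timedelta(days=days)
    values = [v for d, v in rows if current - window <= d <= current + window]
    return sum(values) / len(values)
```

For 31/05/2020 and 01/06/2020 this yields about 21.67, and for 01/08/2020 it yields 27, matching the expected result in the question.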

Related

DAX: Calculate differences between 2 years when value is not 0

+---------+-------+---------+
| NAME | YEAR | SCORE |
+---------+-------+---------+
| A | 2019 | 100 |
| B | 2019 | 67 |
| C | 2019 | 38 |
| A | 2020 | 48 |
| B | 2020 | 78 |
| C | 2020 | 0 |
| A | 2021 | 0 |
| B | 2021 | 50 |
| C | 2021 | 100 |
+---------+-------+---------+
I have a data table with the structure above, and I am trying to create a card that shows the difference between two years (current year - previous year), which will be affected by a Name slicer. However, I can't seem to meet the requirements below. Is there any way to achieve this?
If the score for the previous year is 0, it should take the difference between the current year and the most recent non-zero year. Example: if the 2019 value is 0, it will find the difference between 2021 - 2018.
If the score for the current year is 0, it should compare the last two non-zero years. Example: if the 2021 value is 0, it will compare 2020 - 2019.
You can use a measure like this:
Measure =
VAR _tbl =
TOPN ( 2, FILTER ( 'Table', 'Table'[SCORE] <> 0 ), 'Table'[YEAR], DESC )
VAR _yr1 =
CALCULATE ( MAX ( 'Table'[YEAR] ), _tbl )
VAR _score1 =
CALCULATE (
MAX ( 'Table'[SCORE] ),
FILTER ( VALUES ( 'Table'[YEAR] ), 'Table'[YEAR] = _yr1 )
)
VAR _yr2 =
CALCULATE ( MIN ( 'Table'[YEAR] ), _tbl )
VAR _score2 =
CALCULATE (
MAX ( 'Table'[SCORE] ),
FILTER ( VALUES ( 'Table'[YEAR] ), 'Table'[YEAR] = _yr2 )
)
RETURN
_score1 - _score2
which will give you this
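As a cross-check of what the measure is meant to return per name, the sketch below reproduces the "two most recent non-zero years" rule in Python (not DAX) on the sample data. The function name is made up, and returning None when fewer than two non-zero years exist is my own assumption, not something the measure does.

```python
# SCORE by YEAR for each NAME, copied from the sample table.
scores = {
    "A": {2019: 100, 2020: 48, 2021: 0},
    "B": {2019: 67, 2020: 78, 2021: 50},
    "C": {2019: 38, 2020: 0, 2021: 100},
}

def last_two_nonzero_difference(by_year):
    """Latest non-zero score minus the non-zero score before it."""
    nonzero_years = sorted((y for y, s in by_year.items() if s != 0), reverse=True)
    if len(nonzero_years) < 2:
        return None  # assumption: nothing to compare against
    latest, previous = nonzero_years[:2]
    return by_year[latest] - by_year[previous]
```

With a single name selected, this gives -52 for A (2020 vs 2019), -28 for B (2021 vs 2020) and 62 for C (2021 vs 2019).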

DAX get N'th last non-blank value

For any given date, I would like to get an average Sales of the most recent 3 days with non-blank sales. So I need to retrieve not only the last non-blank sales (which might be easy) but I also need to get the second last and third last sales. Generally, I need N'th last sales.
Sample data:
+------------+--------+--------+--------+--------+------------------+
| Date | Amount | N'th 1 | N'th 2 | N'th 3 | Expected Results |
+------------+--------+--------+--------+--------+------------------+
| 2021-02-01 | 1 | 1 | | | 1.00 |
| 2021-02-02 | 2 | 2 | 1 | | 1.50 |
| 2021-02-03 | 2 | 2 | 2 | 1 | 1.67 |
| 2021-02-04 | | 2 | 2 | 1 | 1.67 |
| 2021-02-05 | 3 | 3 | 2 | 2 | 2.33 |
| 2021-02-06 | | 3 | 2 | 2 | 2.33 |
| 2021-02-07 | | 3 | 2 | 2 | 2.33 |
| 2021-02-08 | 4 | 4 | 3 | 2 | 3.00 |
| 2021-02-09 | | 4 | 3 | 2 | 3.00 |
| 2021-02-10 | | 4 | 3 | 2 | 3.00 |
| 2021-02-11 | | 4 | 3 | 2 | 3.00 |
+------------+--------+--------+--------+--------+------------------+
The N'th 1 is the last "non-blank" sales. The N'th 2 is the "last but one". The expected result is the average of N1, N2, N3.
Link to sample data file with solutions suggested by accepted answer:
DAX Rolling Average NonBlanks.pbix
Here's my take (it's a measure):
Non-blank average =
var curDate = SELECTEDVALUE(Data[Date], MAX(Data[Date]))
var nonBlankTab = FILTER(ALL(Data), NOT(ISBLANK(Data[Amount])) && Data[Date] <= curDate)
var rankedTab = FILTER ( ADDCOLUMNS ( nonBlankTab, "Rank", RANKX ( nonBlankTab, [Date] ) ), [Rank] <= 3 )
return AVERAGEX(rankedTab, [Amount])
EDIT:
Just an explanation:
the measure is calculated for the selected date. If no date context is present, the latest date is assumed.
Then I filter out the table to contain only rows with non blank sales not later than curDate
Then I rank the dates so that the latest 3 dates always receive ranks 1, 2 and 3.
Then I filter out all the dates with rank higher than 3
Finally, I calculate an average over the remaining 3 data points.
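The steps above can be sketched outside DAX to verify the expected column of the sample data. This is a minimal Python illustration with a hypothetical function name; blanks are represented as None and the list is assumed to be ordered by date.

```python
# Amount for 2021-02-01 .. 2021-02-11; None marks a blank.
amounts = [1, 2, 2, None, 3, None, None, 4, None, None, None]

def non_blank_average(amounts, index, n=3):
    """Average of the last `n` non-blank amounts up to and including `index`."""
    non_blank = [a for a in amounts[: index + 1] if a is not None]
    latest = non_blank[-n:]
    return sum(latest) / len(latest) if latest else None

results = [round(non_blank_average(amounts, i), 2) for i in range(len(amounts))]
```

The results list reproduces the Expected Results column: 1.0, 1.5, 1.67, 1.67, 2.33, 2.33, 2.33 and then 3.0 for the remaining dates.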
EDIT2:
I simplified the measure a bit - lastSalesDate is not necessary. Also, as per request in the comments, I left the first attempt as it was, and here is the modified version with TOPN instead of ADDCOLUMNS/RANKX/FILTER combo:
Non-blank average =
var curDate = SELECTEDVALUE(Data[Date], MAX(Data[Date]))
var nonBlankTab = FILTER(ALL(Data), NOT(ISBLANK(Data[Amount])) && Data[Date] <= curDate)
var rankedTab = TOPN(3, nonBlankTab, [Date])
return AVERAGEX(rankedTab, [Amount])
EDIT3:
A more universal version of the measure that just removes filters from Date column, which is actually all we need. No need to butcher all the other filters on the table:
Non-blank average =
var curDate = SELECTEDVALUE(Data[Date], MAX(Data[Date]))
var nonBlankTab = CALCULATETABLE(FILTER(Data, NOT(ISBLANK(Data[Amount])) && Data[Date] <= curDate), REMOVEFILTERS(Data[Date]))
var rankedTab = TOPN(3, nonBlankTab, [Date])
return AVERAGEX(rankedTab, [Amount])
First, create the three measures below:
n1 =
VAR current_date = MIN(your_table_name[Date])
VAR first_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] <= current_date && your_table_name[Amount] <> BLANK())
)
RETURN
CALCULATE(
SUM(your_table_name[Amount]),
FILTER(
ALL(your_table_name),
your_table_name[Date] = first_max_date_with_no_blank
)
)
n2 =
VAR current_date = MIN(your_table_name[Date])
VAR first_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] <= current_date && your_table_name[Amount] <> BLANK())
)
VAR second_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] < first_max_date_with_no_blank && your_table_name[Amount] <> BLANK())
)
RETURN
CALCULATE(
SUM(your_table_name[Amount]),
FILTER(
ALL(your_table_name),
your_table_name[Date] = second_max_date_with_no_blank
)
)
n3 =
VAR current_date = MIN(your_table_name[Date])
VAR first_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] <= current_date && your_table_name[Amount] <> BLANK())
)
VAR second_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] < first_max_date_with_no_blank && your_table_name[Amount] <> BLANK())
)
VAR third_max_date_with_no_blank =
CALCULATE(
MAX(your_table_name[Date]),
FILTER(ALL(your_table_name), your_table_name[Date] < second_max_date_with_no_blank && your_table_name[Amount] <> BLANK())
)
RETURN
CALCULATE(
SUM(your_table_name[Amount]),
FILTER(
ALL(your_table_name),
your_table_name[Date] = third_max_date_with_no_blank
)
)
Now create this final measure:
average =
VAR sum_sales = [n1] + [n2] + [n3]
VAR divide_by = IF([n1] = BLANK(),0,1) + IF([n2] = BLANK(),0,1) + IF([n3] = BLANK(),0,1)
RETURN DIVIDE(sum_sales,divide_by)
Here is the final output:

How do I calculate the number of positive days?

I have the following measure:
no_positive_bets =
COUNTAX(
FILTER(
'belgarath match_',
'belgarath match_'[ogion_pnl] >= 0
),
'belgarath match_'[ogion_pnl]
)
Assuming there is a field called 'belgarath match_'[date] how would I group 'belgarath match_'[ogion_pnl] by day so that I can work out the number of positive days?
Edit:
By way of some sample data:
+-----+------------+-----------+
| id_ | date | ogion_pnl |
+-----+------------+-----------+
| 1 | 01/01/2020 | 100 |
| 2 | 02/01/2020 | 100 |
| 3 | 02/01/2020 | -50 |
| 4 | 03/01/2020 | 100 |
| 5 | 03/01/2020 | -150 |
+-----+------------+-----------+
The current snippet I have will return 3 because three rows are positive. However I would like it to return 2 as the first two days are positive.
Try this; it may help:
no_positive_bets = COUNTAX(FILTER('belgarath match_', [ogion_pnl] >= 0), [ogion_pnl])
First, create the calculated column below:
is positive =
VAR current_date = your_table_name[date]
VAR current_date_sum =
CALCULATE(
SUM(your_table_name[ogion_pnl]),
FILTER(
ALL(your_table_name),
your_table_name[date] = current_date
)
)
RETURN IF(current_date_sum >= 0, 1, 0)
Now create the measure below:
number of positive days =
CALCULATE(
DISTINCTCOUNT(your_table_name[date]),
FILTER(
ALL(your_table_name),
your_table_name[is positive] = 1
)
)
Now add the measure number of positive days to a card to display the result.
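The grouping idea behind this answer (sum per day first, then count the days whose total is not negative) can be checked against the sample data with a short Python sketch. It is not DAX, and the function name is hypothetical.

```python
from collections import defaultdict

# (date, ogion_pnl) rows from the sample data.
bets = [
    ("01/01/2020", 100),
    ("02/01/2020", 100),
    ("02/01/2020", -50),
    ("03/01/2020", 100),
    ("03/01/2020", -150),
]

def count_positive_days(bets):
    """Sum the P&L per day, then count days whose total is >= 0."""
    daily_total = defaultdict(int)
    for day, pnl in bets:
        daily_total[day] += pnl
    return sum(1 for total in daily_total.values() if total >= 0)
```

On the sample data this returns 2 (01/01 totals 100, 02/01 totals 50, 03/01 totals -50), whereas counting rows with a non-negative value gives 3.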

Calculated Column - Filtering MAX on date range from filter

Consider the following tables - one of printers, the other of page counts from meter readings:
Printers
+------------+---------+--------+
| Printer ID | Make | Model |
+------------+---------+--------+
| 1 | Xerox | ABC123 |
| 2 | Brother | DEF456 |
| 3 | Xerox | ABC123 |
+------------+---------+--------+
Meter Read
+-------+------------+-----------+------------+
| Index | Printer ID | Poll Date | Mono Pages |
+-------+------------+-----------+------------+
| 1 | 1 | 1/1/2019 | 1000 |
| 2 | 2 | 1/1/2019 | 800 |
| 3 | 3 | 1/1/2019 | 33000 |
| 4 | 1 | 1/2/2019 | 1100 |
| 5 | 2 | 1/2/2019 | 850 |
| 6 | 3 | 1/2/2019 | 34000 |
| 7 | 1 | 1/3/2019 | 1200 |
| 8 | 2 | 1/3/2019 | 900 |
| 9 | 3 | 1/3/2019 | 35000 |
| 10 | 1 | 1/4/2019 | 1400 |
| 11 | 2 | 1/4/2019 | 950 |
| 12 | 3 | 1/4/2019 | 36000 |
| 13 | 1 | 1/5/2019 | 1800 |
| 14 | 2 | 1/5/2019 | 1000 |
| 15 | 3 | 1/5/2019 | 36500 |
| 16 | 1 | 1/6/2019 | 2000 |
| 17 | 2 | 1/6/2019 | 1050 |
| 18 | 3 | 1/6/2019 | 37500 |
| 19 | 1 | 1/7/2019 | 2100 |
| 20 | 2 | 1/7/2019 | 1100 |
| 21 | 3 | 1/7/2019 | 39000 |
| 22 | 1 | 1/8/2019 | 2200 |
| 23 | 2 | 1/8/2019 | 1150 |
| 24 | 3 | 1/8/2019 | 40000 |
+-------+------------+-----------+------------+
In my Power BI report, I have a Dates table:
Dates = CALENDAR(DATE(2019, 1, 1), DATE(2019, 1, 31))
that I am using as a slicer. The goal is to end up with a delta of Mono Pages during the date range from the slicer. I'm able to grab the difference between each meter read with a fairly complicated calculated column on the Meter Read table:
PagesSinceLastPoll =
IF(
ISBLANK(
LOOKUPVALUE(
'Meter Read'[Mono Pages],
'Meter Read'[Index], CALCULATE(
MAX(
'Meter Read'[Index]
), FILTER(
'Meter Read',
'Meter Read'[Index] < EARLIER('Meter Read'[Index])
&& 'Meter Read'[Printer ID] = EARLIER('Meter Read'[Printer ID] )
)
)
)
),
BLANK(),
'Meter Read'[Mono Pages] -
LOOKUPVALUE(
'Meter Read'[Mono Pages],
'Meter Read'[Index], CALCULATE(
MAX(
'Meter Read'[Index]
), FILTER(
'Meter Read',
'Meter Read'[Index] < EARLIER('Meter Read'[Index])
&& 'Meter Read'[Printer ID] = EARLIER('Meter Read'[Printer ID] )
)
)
)
)
But the performance over 10,000+ rows is pretty bad. I'd like to grab the max and min values for a device in the filtered date range and just subtract instead, but I'm having a hard time getting the right value. My DAX so far keeps getting me the max value from the ENTIRE table, not the table filtered on the dates in my slicer. Everything I've tried so far is some variation on:
MaxInRange =
CALCULATE (
MAX ( 'Meter Read'[Mono Pages] ),
FILTER ( 'Meter Read', 'Meter Read'[Printer ID] = Printers[Printer ID] )
)
To summarize: If I have a slicer starting 1/2/2019 and ending 1/5/2019, the max value for Printer ID 1 should read 1800, not 2200.
Thoughts?
The calculated column can be done more efficiently like this:
PagesSinceLastPoll =
VAR PrevRow =
TOPN(1,
FILTER('Meter Read',
'Meter Read'[Printer ID] = EARLIER('Meter Read'[Printer ID]) &&
'Meter Read'[Poll Date] < EARLIER('Meter Read'[Poll Date])
),
'Meter Read'[Poll Date]
)
RETURN 'Meter Read'[Mono Pages] - SELECTCOLUMNS(PrevRow, "Pages", 'Meter Read'[Mono Pages])
With that column in place, the number of pages printed between two dates is simply the sum of this column over those dates.
If you want to skip that and go straight to a measure, try something like this:
PagesInPeriod =
VAR StartDate = FIRSTDATE(Dates[Date])
VAR EndDate = LASTDATE(Dates[Date])
RETURN
SUMX(
VALUES('Meter Read'[Printer ID]),
CALCULATE(
MAX('Meter Read'[Mono Pages]),
Dates[Date] = EndDate
)
-
CALCULATE(
MAX('Meter Read'[Mono Pages]),
Dates[Date] < StartDate
)
)
Note that if you use Dates[Date] = StartDate in the second CALCULATE, the result will be off, because the pages counted on the first included date would be left out. You want the max pages before your first included date.
Both of these methods should give the same result:
Alexis' measure is the correct way to handle this (my thanks!), but I made a very small edit. Since it is possible that a reading was not taken on the end date, we need to look on or before that date, else it treats the max on end date like a zero. The final code then becomes:
PagesInPeriod =
VAR StartDate = FIRSTDATE(Dates[Date])
VAR EndDate = LASTDATE(Dates[Date])
RETURN
SUMX(
VALUES('Meter Read'[Printer ID]),
CALCULATE(
MAX('Meter Read'[Mono Pages]),
Dates[Date] <= EndDate
)
-
CALCULATE(
MAX('Meter Read'[Mono Pages]),
Dates[Date] < StartDate
)
)
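For readers who want to verify the numbers, the following Python sketch (not DAX) mirrors the logic of the final measure on part of the sample data: per printer, take the highest reading on or before the end date and subtract the highest reading before the start date. The function name is made up, and treating a missing reading as 0 is an assumption.

```python
from datetime import date

# (printer_id, poll_date, mono_pages) for 1-6 January 2019 from the sample table.
readings = [
    (1, date(2019, 1, 1), 1000), (2, date(2019, 1, 1), 800), (3, date(2019, 1, 1), 33000),
    (1, date(2019, 1, 2), 1100), (2, date(2019, 1, 2), 850), (3, date(2019, 1, 2), 34000),
    (1, date(2019, 1, 3), 1200), (2, date(2019, 1, 3), 900), (3, date(2019, 1, 3), 35000),
    (1, date(2019, 1, 4), 1400), (2, date(2019, 1, 4), 950), (3, date(2019, 1, 4), 36000),
    (1, date(2019, 1, 5), 1800), (2, date(2019, 1, 5), 1000), (3, date(2019, 1, 5), 36500),
    (1, date(2019, 1, 6), 2000), (2, date(2019, 1, 6), 1050), (3, date(2019, 1, 6), 37500),
]

def pages_in_period(readings, start, end):
    """Sum over printers of (max reading <= end) - (max reading < start)."""
    total = 0
    for printer in {p for p, _, _ in readings}:
        up_to_end = [n for p, d, n in readings if p == printer and d <= end]
        before_start = [n for p, d, n in readings if p == printer and d < start]
        # Assumption: no reading in a range counts as 0.
        total += max(up_to_end, default=0) - max(before_start, default=0)
    return total
```

For a slicer from 1/2/2019 to 1/5/2019, Printer ID 1 contributes 1800 - 1000 = 800 pages, and the total across the three printers is 4500.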

How to calculate an expression and group by 2 fields in DAX?

I want to write an expression in DAX that will group by 2 fields: AgentID and LoginDate. Here is the expression:
Average Availability % Per Day = (LoginTime + WorkTime) / (LoginTime + WorkTime + BreakTime)
What I have written in DAX so far is :
Average Availability % Per Day =
AVERAGEX (
VALUES ( Logins[LoginDay] ),
(
DIVIDE (
SUM ( Logins[LoginDuration] ) + SUM ( Logins[WorkDuration] ),
SUM ( Logins[LoginDuration] ) + SUM ( Logins[WorkDuration] )
+ SUM ( Logins[BreakDuration] )
)
)
)
However, the problem is the expression is summing everything and then getting the average as opposed to evaluating the expression and grouping by each day and each AgentID before calculating the average.
EDIT: Adding sample data:
AgentID | LoginDay | LoginDuration | BreakDuration | WorkDuration
96385 | 7/5/2018 | 14472 | 803 | 1447
96385 | 7/6/2018 | 14742 | 857 | 1257
96385 | 7/12/2018 | 14404 | 583 | 291
96385 | 7/13/2018 | 14276 | 636 | 368
96385 | 7/19/2018 | 14456 | 788 | 543
96385 | 7/20/2018 | 14550 | 390 | 1727
96385 | 7/26/2018 | 66670 | 53224 | 1076
96385 | 7/27/2018 | 14592 | 277 | 1928
So for example, for this agent, I am getting an average availability % per day of .75 when it should really be .91
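The two figures mentioned here can be reproduced from the sample rows. The sketch below is plain Python (not DAX) with made-up names; it compares one division over the overall sums with the mean of one division per day, following the question's DAX, which uses LoginDuration + WorkDuration in the numerator.

```python
from statistics import mean

# (LoginDuration, BreakDuration, WorkDuration) per LoginDay for agent 96385.
days = [
    (14472, 803, 1447),
    (14742, 857, 1257),
    (14404, 583, 291),
    (14276, 636, 368),
    (14456, 788, 543),
    (14550, 390, 1727),
    (66670, 53224, 1076),
    (14592, 277, 1928),
]

def availability(login, brk, work):
    """(login + work) / (login + work + break), as in the question's DAX."""
    return (login + work) / (login + work + brk)

# Ratio of the overall sums: a single division over the whole table.
ratio_of_sums = availability(
    sum(d[0] for d in days), sum(d[1] for d in days), sum(d[2] for d in days)
)

# Mean of the per-day ratios: one division per (AgentID, LoginDay) group.
mean_of_daily_ratios = mean(availability(*d) for d in days)
```

ratio_of_sums comes out at roughly 0.75 and mean_of_daily_ratios at roughly 0.91, so the current measure behaves like the first calculation while the desired result is the second.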