Average aggregation in difference of dates between two different tables

I have 2 tables:
Calendar table
Date
-----
Dec 2021
Jan 2022
Feb 2022
Mar 2022
.
.
.
Event table
Event | Last Date
-----------------
Event A | 01-Jan-2013
Event B | 01-Mar-2017
Event C | 01-Feb-2022
. | .
. | .
. | .
I want to create a table that calculates the average difference in months between each date in the calendar table and the last date of each event in the event table, aggregated per calendar month.
The calculation is as follows:
For Dec 2021
Diff in month for Event A= 107
Diff in month for Event B= 57
Event C is not considered because its last date is after Dec 2021
Avg = (107+57)/2 = 82
For Jan 2022
Diff in month for Event A= 108
Diff in month for Event B= 58
Event C is not considered because its last date is after Jan 2022
Avg = (108+58)/2 = 83
For Feb 2022
Diff in month for Event A= 109
Diff in month for Event B= 59
Diff in month for Event C= 0
Avg = (109+59+0)/3 = 56
For Mar 2022
Diff in month for Event A= 110
Diff in month for Event B= 60
Diff in month for Event C= 1
Avg = (110+60+1)/3 = 57
Output table should be as follows:
Date | Avg Diff
------------------------
Dec 2021 | 82
Jan 2022 | 83
Feb 2022 | 56
Mar 2022 | 57
. |
. |
. |
Any help implementing this through DAX(PowerBI) or PowerQuery is appreciated.

Result
Event table looks like this:
Calendar table looks like this:
Full code for calendar table to get result:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcklNVjAyMDJUitWJVvJKzANxjMAct9QkBMc3sQjKiQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t]),
#"Duplicated Column" = Table.DuplicateColumn(Source, "Date", "Date2"),
#"Changed Type" = Table.TransformColumnTypes(#"Duplicated Column",{{"Date2", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom",
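    let a = (x)=>
        let
            // events whose last date is on or before the calendar date
            b = Table.SelectRows( Event, (y)=> y[Last Date] <= x[Date2] ),
            // month difference: (year*12 + month) of the calendar date minus the same for the event
            c = Table.AddColumn(b, "diff", (z)=>( Date.Month(x[Date2]) + (Date.Year(x[Date2])*12)) - (Date.Month(z[Last Date]) + (Date.Year(z[Last Date])*12)) ),
            d = Table.RowCount(b),
            e = List.Sum(c[diff])
        in e/d
    in a),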
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Date2"})
in
#"Removed Columns"
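As a cross-check of the arithmetic, here is a minimal Python sketch of the same logic (it is not part of the original answer): keep the events whose last date is on or before the calendar month, take the difference of year*12 + month, and average. The names `month_index` and `avg_month_diff` are made up for illustration, and the three events are the ones listed in the question.

```python
from datetime import date


def month_index(d):
    """Number of months since year 0, so that subtraction gives a month difference."""
    return d.year * 12 + d.month


def avg_month_diff(calendar_date, event_dates):
    """Average month difference to all events dated on or before calendar_date."""
    diffs = [
        month_index(calendar_date) - month_index(event)
        for event in event_dates
        if event <= calendar_date
    ]
    # No qualifying events: return None instead of dividing by zero.
    return sum(diffs) / len(diffs) if diffs else None


# Sample events from the question (Event A, Event B, Event C).
events = [date(2013, 1, 1), date(2017, 3, 1), date(2022, 2, 1)]

for month in [date(2021, 12, 1), date(2022, 1, 1), date(2022, 2, 1), date(2022, 3, 1)]:
    print(month.isoformat(), avg_month_diff(month, events))
```

For the sample data this gives 82, 83, 56 and 57, matching the expected output table.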

Related

DAX - Rankx by multiple Categories Issue

I have 4 categories (GP, ID, Age, Date). I would like to create a calculated column that groups by GP, ID, and Age and ranks/counts by Date, to see how many months each member has in the past 24 months.
My code works until there are members who cancelled their membership for a few months and then resumed. In that case I need the count to restart from the first month after the gap. For example:
GP ID AGE DATE RKING Desired RANK
1 220 35-44 202206 12 6
1 220 35-44 202205 12 5
1 220 35-44 202204 12 4
1 220 35-44 202203 12 3
1 220 35-44 202202 12 2
1 220 35-44 202201 12 1
1 220 35-44 202012 24 24
1 220 35-44 202011 23 23
1 220 35-44 202010 22 22
1 220 35-44 202009 21 21
1 220 35-44 202008 20 20
1 220 35-44 202007 19 19
1 220 35-44 202006 18 18
1 220 35-44 202005 17 17
1 220 35-44 202004 16 16
… … … … … …
1 220 35-44 201901 1 1
This is what I have tried, but it doesn't work when dates are skipped.
RKING Column=
RANKX (
CALCULATETABLE (
VALUES ('tbl'[Date] ),
ALLEXCEPT ( 'tblW', 'tbl'[GP], 'tbl'[ID] ),
'tbl'[AGE] = 'tbl'[AGE],
'tbl'[date] >= start_date && 'tbl'[date] <= end_date // date slicer
),
[Date] ,
,ASC
)

Looking at your code, it seems you were trying to write a measure for a visual (a version for a calculated column is included as well). As I understand it, you want a matrix that shows, for each date, the number of consecutive months per ID/GP/AGE. I see the following approach.
As you know, a measure is evaluated for each cell of a matrix, with the data model filtered by the matrix rows and columns (and by any slicers). So my idea is:
1) Get the date from the matrix row and use it as the maximum date for the table.
2) Then apply FILTER(). FILTER() is an iterator, so it goes through each row and checks the filter condition: if it is true the row is kept, if false it is dropped.
I use the following filter condition:
Compute dateInMatrix - dateInCurrentTableRow in months (for example: 202203 - 202201 = 2 months).
Then count the rows in the table with DATE >= 202201 and DATE < 202203.
If there are fewer rows than the month difference (that is, a month is missing in between), the condition is FALSE() and the row is removed from the table.
3) The last step is to count the rows in the filtered table.
A measure for matrix:
Ranking =
VAR matrixDate = MAX('table'[DATE])
VAR filteredTable =
    FILTER(
        ALL('table')
        ,DATEDIFF(
            DATE(LEFT([DATE],4),RIGHT([DATE],2),1)
            ,DATE(LEFT(matrixDate,4),RIGHT(matrixDate,2),1)
            ,MONTH
        )
        =
        VAR dateInRow = [DATE]
        RETURN
            CALCULATE(
                COUNTROWS('table')
                ,'table'[DATE]>=dateInRow
                ,'table'[DATE]<matrixDate
            )
    )
RETURN
    COUNTROWS(filteredTable)
A version for a calculated column:
RankColl =
VAR currentDate = [Start_Date]
VAR MyFilt = {('Table'[AGE],'Table'[ID],'Table'[GROUP])}
VAR withColl =
    ADDCOLUMNS(
        CALCULATETABLE(
            'table'
            ,ALL('Table')
            ,TREATAS(MyFilt,'Table'[AGE],'Table'[ID],'Table'[GROUP])
        )
        ,"dateDiff",
            DATEDIFF(
                [Start_Date]
                ,currentDate
                ,MONTH
            )
        ,"RowsInTable",
            VAR dateInRow = [Start_Date]
            VAR startDate = IF(dateInRow<currentDate,dateInRow,currentDate)
            VAR endDay = IF(dateInRow>currentDate,dateInRow,currentDate)
            VAR myDates = GENERATESERIES(startDate,endDay,1)
            RETURN
                COUNTROWS(
                    CALCULATETABLE(
                        'Table'
                        ,ALL('Table')
                        ,TREATAS(MyFilt,'Table'[AGE],'Table'[ID],'Table'[GROUP])
                        ,TREATAS(myDates,'Table'[Start_Date])
                    )
                )
    )
VAR filtered =
    FILTER(
        withColl
        ,[dateDiff]=[RowsInTable]-1 -- for ex.:
                                    -- dateDiff=01/01/2022-01/01/2022=0,
                                    -- but it will be 1 row in the table for 01/01/2022
    )
RETURN
    COUNTROWS(filtered)
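To make the target behaviour concrete, here is a small Python sketch of the desired ranking, independent of the DAX above: walk the YYYYMM periods in order and restart the count whenever a month is skipped. The function names are hypothetical, and the sample periods assume that the rows elided in the question ("…") are contiguous months from 201901 to 202012.

```python
def months_between(a, b):
    """Whole months from YYYYMM value a to YYYYMM value b."""
    return (b // 100 - a // 100) * 12 + (b % 100 - a % 100)


def consecutive_rank(periods):
    """Map each YYYYMM period to its position within its run of consecutive months."""
    ranks = {}
    previous = None
    run = 0
    for period in sorted(set(periods)):
        if previous is not None and months_between(previous, period) == 1:
            run += 1  # still in the same unbroken run
        else:
            run = 1   # first period overall, or first period after a gap
        ranks[period] = run
        previous = period
    return ranks


# Assumed data: 24 contiguous months (2019-2020), a gap in 2021, then 6 months in 2022.
periods = (
    [2019 * 100 + m for m in range(1, 13)]
    + [2020 * 100 + m for m in range(1, 13)]
    + [2022 * 100 + m for m in range(1, 7)]
)
ranks = consecutive_rank(periods)
print(ranks[202012], ranks[202201], ranks[202206])
```

Under that assumption, 202012 gets rank 24, while 202201 restarts at 1 and 202206 reaches 6, which is the "Desired RANK" column from the question.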

DAX how to Ignore certain slicers in measure?

I have a table like below:
BU Value Date Measure Agg_Lvl_1 Agg_Lvl_2 Agg_Lvl_3
AA 10 01/01/2021 Sale Firm COO A
AB 20 01/04/2021 Sale Firm Non-COO A
AC 32 01/05/2021 Sale Firm COO A
BA 32 01/01/2021 Sale Firm Non-COO B
BB 43 01/04/2021 Sale Firm Non-COO B
BC 19 01/08/2021 Sale Firm Non-COO B
CA 11 01/11/2021 Sale Firm Non-COO C
CB 16 01/12/2021 Sale Firm Non-COO C
CC 18 01/13/2021 Sale Firm COO C
D 18 01/01/2021 Sale Ext Non-CIO D
AA 10 01/01/2021 non-Sale Ext Non-CIO A
AB 20 01/04/2021 non-Sale Firm Non-CIO A
I need to calculate each BU's contribution to Firm Sale by period:
contribution = Sum(Table(Value) where Measure = 'Sale' & BU = 'slicer select') /
Sum(Table(Value) where Measure = 'Sale' & BU = 'Firm')
This "contribution" measure should also respond to the date slicer.
I have tried different DAX approaches, but all I got was a contribution of 1 (I think the slicer/filter isn't set up right). Can anyone please help?
E.g. AA contribution between 1/1/2021 - 1/4/2021 = 10 / (10+20+32+18) = 12.5%

You can try the function ALLEXCEPT as shown below-
contribution =
CALCULATE(
    SUM(Table_name[value]),
    FILTER(
        ALLEXCEPT(Table_name,Table_name[Date],Table_name[BU]),
        Table_name[measure] = "Sale"
    )
)
/
CALCULATE(
    SUM(Table_name[value]),
    FILTER(
        ALLEXCEPT(Table_name,Table_name[Date]),
        Table_name[measure] = "Sale"
            && Table_name[BU] = "Firm"
    )
)

Find ID's not present in a date represented by a yyyyweek_number

I have 2 data sets: one that represents a list of all the customers, and another with their order dates.
The order dates are in a yyyyweek_number format, so for instance, as today (2020-09-29) is in week 40, the order date would be represented as 202040.
I want to get a list of dealers who haven't placed orders, in 4 day ranges, viz.
30 days or less
60 days or less
90 days or less and
90+ days
To illustrate, let's say the customer dataset is as follows:
+----+
| ID |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 12 |
| 13 |
| 14 |
| 15 |
+----+
and the Order table is as follows:
+----+-----------------+
| ID | Order_YYYY_WEEK |
+----+-----------------+
| 1 | 202001 |
| 2 | 202003 |
| 3 | 202004 |
| 5 | 202006 |
| 2 | 202008 |
| 3 | 202010 |
| 6 | 202012 |
| 8 | 202009 |
| 1 | 202005 |
| 10 | 202015 |
| 11 | 202018 |
| 13 | 202038 |
| 15 | 202039 |
| 12 | 202040 |
+----+-----------------+
I have a slicer with these day ranges.
Now say, for instance, the "30 days or less" button is selected.
The resulting table should then list all the IDs from the Customer table that are not present in the Order table with an Order_YYYY_WEEK within 30 days of today's week:
+----+
| ID |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
| 11 |
| 14 |
+----+

Steps:
1. Create relationship between Customer id's in Customer table and Order table (if not already there)
2. Create a Date table
3. Convert Weeks to dates in a new calculated column in the Order table
4. Create relationship between Customer id's in Customer table and Order table
5. Create relationship between Dates in Date table and Order table
6. Create calculated column in Date Table with Day ranges ("30 days or less" etc)
7. Create measure to identify if an order was placed
8. Add slicer with date range from Date table and table visual with Customer id.
9. Add measure to table visual on filter pane and set to "No"
Some of these steps have additional detail below.
2. Create a Date table
We can do this in Power Query or in DAX. Here's the DAX version:
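Calendar =
VAR Days = CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2020, 12, 31 ) )
RETURN
    ADDCOLUMNS (
        Days,
        -- pad the week number to two digits so the value matches the yyyyww format (e.g. 202001)
        "Year Week", YEAR ( [Date] ) & FORMAT ( WEEKNUM ( [Date] ), "00" )
    )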
Now mark this table as a date table in the "Table Tools" ribbon with the button "Mark as date table"
3. Convert Weeks to dates
For this to work, I have had to create a calculated column in the Order table with the first day of the year first. This can probably be improved upon.
StartYear = DATE(Left(Orders[Year week], 4), 01, 01)
Next comes the calculated column that we need in the Order table, which identifies the first day of the week. The variable "DayNoInYear" multiplies the week number by 7 and subtracts 7 to arrive at the first day of the week, expressed as an offset in days from the start of the year. This is then converted to a date with the variable "DateWeek":
Date =
VAR DayNoInYear = RIGHT(Orders[Year week], 2) * 7 - 7
VAR DateWeek = DATEADD(Orders[StartYear].[Date], DayNoInYear, DAY)
RETURN
DateWeek
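The week-to-date rule above can be checked outside Power BI. This is a rough Python sketch of the same arithmetic (1 January plus week * 7 - 7 days); `week_start` is a name made up for the example, and it follows the simple rule used in this answer rather than ISO 8601 week numbering, so results can differ by a few days from other week definitions.

```python
from datetime import date, timedelta


def week_start(year_week):
    """Convert a yyyyww value such as 202040 to a date using the rule above."""
    text = str(year_week)
    year, week = int(text[:4]), int(text[4:])
    # Same offset as DayNoInYear: week number times 7, minus 7.
    return date(year, 1, 1) + timedelta(days=week * 7 - 7)


for value in (202001, 202038, 202040):
    print(value, week_start(value).isoformat())
```

Week 1 maps to 1 January itself, since the offset is 0 days.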
6. Create calculated column in Date Table with Day ranges
Day ranges =
VAR Today = TODAY()
VAR CheckDate = 'Calendar'[Date]
RETURN
    SWITCH(TRUE(),
        CheckDate - Today <= -90, "90+ days",
        CheckDate - Today <= -60 && CheckDate - Today > -90 , "90 days or less",
        CheckDate - Today <= -30 && CheckDate - Today > -60 , "60 days or less",
        CheckDate - Today <= 0 && CheckDate - Today > -30 , "30 days or less",
        "In the future"
    )
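To see where the boundaries of these buckets fall, here is a minimal Python sketch that mirrors the SWITCH above. The function name `day_range` and the fixed `today` value (the date mentioned in the question) are assumptions for the example; the DAX column uses TODAY() instead.

```python
from datetime import date


def day_range(check_date, today):
    """Bucket a date by how many days it lies before today."""
    delta = (check_date - today).days  # negative for dates in the past
    if delta <= -90:
        return "90+ days"
    if delta <= -60:
        return "90 days or less"
    if delta <= -30:
        return "60 days or less"
    if delta <= 0:
        return "30 days or less"
    return "In the future"


today = date(2020, 9, 29)
for d in (date(2020, 9, 30), date(2020, 9, 15), date(2020, 8, 30), date(2020, 7, 1)):
    print(d.isoformat(), day_range(d, today))
```

Note that a date exactly 30 days back already falls into "60 days or less", and one exactly 90 days back into "90+ days", because each comparison uses <=.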
7. Create measure to identify if an order was placed
Yes - No order =
VAR Yes_No =
    IF(
        ISBLANK(FIRSTNONBLANK(Orders[Customer id], Orders[Customer id])),
        "No",
        "Yes"
    )
VAR ThirtyDays = SELECTEDVALUE('Calendar'[Day ranges]) = "30 days or less"
VAR SixtyDays = SELECTEDVALUE('Calendar'[Day ranges]) = "30 days or less" || SELECTEDVALUE('Calendar'[Day ranges]) = "60 days or less"
VAR NinetyDays = SELECTEDVALUE('Calendar'[Day ranges]) = "30 days or less" || SELECTEDVALUE('Calendar'[Day ranges]) = "60 days or less" || SELECTEDVALUE('Calendar'[Day ranges]) = "90 days or less"
RETURN
    SWITCH(TRUE(),
        AND(ThirtyDays = TRUE(), Yes_No = "No"), "No",
        AND(SixtyDays = TRUE(), Yes_No = "No"), "No",
        AND(NinetyDays = TRUE(), Yes_No = "No"), "No",
        Yes_No = "No",
        "Yes"
    )
Steps 8 and 9
Create a slicer with the newly created "Day ranges" column in the Date table, and create a table visual with the "Yes - No order" measure as a visual-level filter set to "No".

Power BI What if analysis

I have a matrix visualization in Power BI that looks like this:
Jan Feb Mar April
Client1 10 20 30 10
Client2 15 25 65 80
Client3 66 22 54 12
I have created 3 what-if parameters, each with its own slicer table (with values from 1 to 4), one for each client.
For example, if the value of the first slicer is 1, the second is 2, and the third is 2, then I want:
Jan Feb Mar April
Client1 0 20 30 10
Client2 0 0 65 80
Client3 0 0 54 12
That is, it should replace the values with zero. I have been able to achieve that for one client using the DATEADD function (by adding months):
Measure = CALCULATE(SUM('Table'[Value]),
DATEADD('Table'[Column], Parameter[Parameter Value], MONTH))
I have used this measure to display the values, but how can I make it work for the other two clients as well?

Let's say you have three parameter tables as follows:
Parameter1   Parameter2   Parameter3
Value1       Value2       Value3
------       ------       ------
1            1            1
2            2            2
3            3            3
4            4            4
and each of them has its own slicer. Then the measure you are after might look something like this:
Measure =
VAR Val1 = MAX(Parameter1[Value1])
VAR Val2 = MAX(Parameter2[Value2])
VAR Val3 = MAX(Parameter3[Value3])
VAR CurrClient = MAX('Table'[Client])
VAR CurrMonth = MONTH(DATEVALUE(MAX('Table'[Month]) & " 1, 2000"))
RETURN
    SWITCH(CurrClient,
        "Client1", IF(CurrMonth <= Val1, 0, SUM('Table'[Value])),
        "Client2", IF(CurrMonth <= Val2, 0, SUM('Table'[Value])),
        "Client3", IF(CurrMonth <= Val3, 0, SUM('Table'[Value])),
        SUM('Table'[Value])
    )
Basically, you read in each parameter and compare them to the month in the current cell.
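The comparison the measure performs for each cell can be sketched in plain Python as follows. The dictionary layout and the function name `apply_cutoffs` are made up for the example; the numbers are the matrix and slicer values from the question.

```python
# Matrix from the question: one list of monthly values (Jan, Feb, Mar, April) per client.
values = {
    "Client1": [10, 20, 30, 10],
    "Client2": [15, 25, 65, 80],
    "Client3": [66, 22, 54, 12],
}


def apply_cutoffs(values, cutoffs):
    """Zero out every month whose number is <= the client's slicer value."""
    result = {}
    for client, row in values.items():
        cutoff = cutoffs.get(client, 0)  # clients without a parameter are left unchanged
        # Month number is the list position plus 1 (Jan = 1, ..., April = 4).
        result[client] = [0 if i + 1 <= cutoff else v for i, v in enumerate(row)]
    return result


adjusted = apply_cutoffs(values, {"Client1": 1, "Client2": 2, "Client3": 2})
for client, row in adjusted.items():
    print(client, row)
```

With slicer values 1, 2 and 2 this reproduces the table the question asks for.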

How to sum by group and add new variable dependent by the other two variables in SAS SQL

data work.want2;
input Y M $ ID $ volume;
datalines;
2009 JAN A1 100
2009 FEB A1 20
2009 FEB A1 80
2009 JAN A2 100
2009 JAN A2 100
2009 FEB A2 20
2009 FEB A2 80
2009 JAN A3 100
2009 FEB A3 150
2009 MAR A3 100
2011 DEC A1 100
2011 DEC A1 20
2011 DEC A2 20
2011 DEC A3 120
2011 DEC A3 80
2011 OCT A1 100
2011 OCT A2 20
2011 OCT A2 100
;
proc print data=want2;
run;
/*Code 2--> to sum by Y M ID*/
PROC SQL;
create table want3 as SELECT
Y,
M,
ID,
sum(volume) AS sumvolume
FROM want2
GROUP BY Y, M ,ID;
QUIT;
/*Code 3 -->get sum by Y M*/
PROC SQL;
SELECT
Y,
M,
sum(sumvolume) AS sumvolume_MO
FROM want3
GROUP BY Y, M;
QUIT;
I have used SAS SQL (code 2) to sum by ID, Y and M. I want to add a new variable, the monthly volume, which depends on Y and M. I have used "code 3" to get those results.
Is it possible to combine code 2 and code 3 to get the following results? I always get errors.
Thanks in advance.
Y M ID sumvolume sumvolume_MO
2009 FEB A1 100 350
2009 FEB A2 100 350
2009 FEB A3 150 350
2009 JAN A1 100 400
2009 JAN A2 200 400
2009 JAN A3 100 400
2009 MAR A3 100 100
2011 DEC A1 120 340
2011 DEC A2 20 340
2011 DEC A3 200 340
2011 OCT A1 100 220
2011 OCT A2 120 220

Updated to reflect that the wanted results use sum(volume) instead of raw volume.
In general you would want to use subqueries. You could calculate the sum over the different groupings in separate subqueries and merge the results back together.
select a.y,a.m,a.id,a.sumvolume,b.sumvolume_mo
from
(select y,m,id,sum(volume) as sumvolume
from have
group by 1,2,3
) a
natural join
(select y,m,sum(volume) as sumvolume_mo
from have
group by 1,2
) b
;
But PROC SQL in SAS will also let you include non-grouping, non-aggregate variables in the SELECT and will automatically remerge the data for you. So you could get SUMVOLUME_MO by adding up the values of SUMVOLUME.
select y,m,id,sumvolume,sum(sumvolume) as sumvolume_mo
from
(select y,m,id,sum(volume) as sumvolume
from have
group by 1,2,3
)
group by 1,2
;
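As a way to verify the expected numbers independently of SAS, here is a small Python sketch of the same two-level aggregation: one total per (Y, M, ID) and one per (Y, M), merged back together. The function name `summarize` is hypothetical, and the rows are the datalines from the question.

```python
from collections import defaultdict

# Rows from the question's datalines: (Y, M, ID, volume).
rows = [
    (2009, "JAN", "A1", 100), (2009, "FEB", "A1", 20), (2009, "FEB", "A1", 80),
    (2009, "JAN", "A2", 100), (2009, "JAN", "A2", 100), (2009, "FEB", "A2", 20),
    (2009, "FEB", "A2", 80), (2009, "JAN", "A3", 100), (2009, "FEB", "A3", 150),
    (2009, "MAR", "A3", 100), (2011, "DEC", "A1", 100), (2011, "DEC", "A1", 20),
    (2011, "DEC", "A2", 20), (2011, "DEC", "A3", 120), (2011, "DEC", "A3", 80),
    (2011, "OCT", "A1", 100), (2011, "OCT", "A2", 20), (2011, "OCT", "A2", 100),
]


def summarize(rows):
    """Return (Y, M, ID, sumvolume, sumvolume_mo) tuples sorted by Y, M, ID."""
    by_id = defaultdict(int)     # totals per (Y, M, ID)
    by_month = defaultdict(int)  # totals per (Y, M)
    for y, m, id_, volume in rows:
        by_id[(y, m, id_)] += volume
        by_month[(y, m)] += volume
    # "Remerge": attach the monthly total to every per-ID row.
    return [
        (y, m, id_, total, by_month[(y, m)])
        for (y, m, id_), total in sorted(by_id.items())
    ]


result = summarize(rows)
for line in result:
    print(*line)
```

The output has 12 rows, the same as the wanted table in the question, for example 2009 FEB A1 100 350.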

Thanks to Tom's answer, I can get the results with the following code.
PROC SQL;
create table newwant2 as
select y,m,id, sum(volume) as sumvolume_mo2,sumvolume_mo
from newwant
group by Y,M,id
;
Then I use the following code to delete the duplicate rows and keep the last row of each duplicate.
data newwant3;
set newwant2;
by Y M ID sumvolume_mo2 ;
if last.ID;
run;
proc print data=newwant3;
run;
