DAX: Cumulative Completion Rate with Month Slicer - powerbi

I'm trying to calculate the cumulative completion rate across all users over months. The issue is that, in the table below, when I filter on October for example, it divides the users who finished up to October by all users except those who finished in November.
I have a dim_date table which is connected to the data table; the relationship is between Date from dim_date and Completion Date from the data table.
Also, in the dim_date table I'm numbering the months 1, 2, 3, 4, etc.
ID  Completion_status  Completion Date
1   0
2   0
3   0
4   0
5   0
6   1                  11/1/2022
7   1                  11/1/2022
8   1                  11/1/2022
9   1                  11/2/2022
10  1                  11/1/2022
11  1                  11/6/2022
12  1                  11/4/2022
13  1                  11/2/2022
14  1                  10/13/2022
15  1                  10/14/2022
16  1                  10/14/2022
17  1                  10/13/2022
18  1                  10/15/2022
19  1                  10/13/2022
20  1                  10/13/2022
21  1                  10/13/2022
22  1                  10/13/2022
23  1                  10/18/2022
24  1                  10/13/2022
25  1                  10/13/2022
26  1                  10/13/2022
27  1                  10/13/2022
28  1                  9/10/2022
29  1                  9/8/2022
The formula I use:
Completion% =
VAR comp rate = SUM(Table[completion_status]) / count(Table[ID])
Return
CALCULATE(Table[Completion%],filter(ALL(Dim_Date),Dim_Date[Month Number] <= MAX(Dim_Date[Month Number])))
The expected results when I filter:
on September: 2/29 = 7%
on October: 16/29 = 55%
on November: 24/29 = 83%

Something like:
=
VAR SelectedMonth =
    MIN( Dim_Date[Month Number] )
VAR CumulativeTotal =
    CALCULATE(
        COUNTROWS( 'Table' ),
        FILTER(
            ALL( Dim_Date ),
            Dim_Date[Month Number] <= SelectedMonth
                && NOT ( ISBLANK( Dim_Date[Month Number] ) )
        )
    )
VAR CountAllRows =
    CALCULATE( COUNTROWS( 'Table' ), ALL( Dim_Date ) )
RETURN
    DIVIDE( CumulativeTotal, CountAllRows )
I'm presuming that Dim_Date[Month Number] is blank when Table[Completion Date] is blank.
You may want to replace ALL with, for example, ALLSELECTED, depending on your required set-up.
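To sanity-check the expected percentages independently of the data model, here is a minimal Python sketch that recomputes them from the sample rows in the question. The function name is made up for illustration, and comparing on the month number alone assumes all dates fall in a single year, as they do here.

```python
from datetime import date

# Sample data from the question: 5 users without a completion date,
# then 8 completions in November, 14 in October and 2 in September 2022.
completion_dates = (
    [None] * 5
    + [date(2022, 11, d) for d in (1, 1, 1, 2, 1, 6, 4, 2)]
    + [date(2022, 10, d) for d in (13, 14, 14, 13, 15, 13, 13, 13, 13, 18, 13, 13, 13, 13)]
    + [date(2022, 9, d) for d in (10, 8)]
)

def cumulative_completion_rate(dates, selected_month):
    """Share of ALL users who completed in or before selected_month."""
    completed = sum(1 for d in dates if d is not None and d.month <= selected_month)
    return completed / len(dates)  # denominator ignores the month filter

for month in (9, 10, 11):
    print(month, f"{cumulative_completion_rate(completion_dates, month):.0%}")
```

The denominator is always the full 29 users, which is the role `CountAllRows` plays in the measure above.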

Related

How to create chain index USING dax?

I have a table with department, category, year and sales. See the sample table below:
Table name: ABC
department  category  year  sales
Finance     1         2012  20
HR          1         2012  30
Marketing   1         2012  60
Finance     2         2012  50
HR          2         2012  15
Marketing   2         2012  17
Finance     1         2013  60
HR          1         2013  40
Marketing   1         2013  90
Finance     2         2013  7
HR          2         2013  20
Marketing   2         2013  22
Finance     1         2014  50
HR          1         2014  39
Marketing   1         2014  120
Using the DAX query language, I was able to create the relative index
department  category  year  sales  relative_index
Finance     1         2012  20     100
HR          1         2012  30     100
Marketing   1         2012  60     100
Finance     2         2012  50     100
HR          2         2012  15     100
Marketing   2         2012  17     100
Finance     1         2013  60     (60/20)*100 = 300
HR          1         2013  40     (40/30)*100 = 133
Marketing   1         2013  90     (90/60)*100 = 150
Finance     2         2013  7      (7/50)*100 = 14
HR          2         2013  20     (20/15)*100 = 133
Marketing   2         2013  22     (22/17)*100 = 129
Finance     1         2014  50     (50/60)*100 = 83
HR          1         2014  39     (39/40)*100 = 97.5
Marketing   1         2014  120    (120/90)*100 = 133
I used the following DAX code to create the relative_index column:
Relative_Link =
// Inception = the minimum year
var Inception = MIN(ABC[year])
// FY_LY = the previous year
var FY_LY = ABC[year]-1
// LY_level = the previous year's sales for the same department and category
var LY_level = calculate(sum(ABC[sales]), filter(allexcept(ABC, ABC[department], ABC[category]), ABC[year]=FY_LY))
return if(ABC[year]=Inception, 100, (ABC[sales]/LY_level)*100)
I am having trouble creating the chain_index column
department  category  year  sales  relative_index       chain_index
Finance     1         2012  20     100                  100
HR          1         2012  30     100                  100
Marketing   1         2012  60     100                  100
Finance     2         2012  50     100                  100
HR          2         2012  15     100                  100
Marketing   2         2012  17     100                  100
Finance     1         2013  60     (60/20)*100 = 300    (100*300)/100 = 300
HR          1         2013  40     (40/30)*100 = 133    (100*133)/100 = 133
Marketing   1         2013  90     (90/60)*100 = 150    (100*150)/100 = 150
Finance     2         2013  7      (7/50)*100 = 14      (100*14)/100 = 14
HR          2         2013  20     (20/15)*100 = 133    (100*133)/100 = 133
Marketing   2         2013  22     (22/17)*100 = 129    (100*129)/100 = 129
Finance     1         2014  50     (50/60)*100 = 83     (83*300)/100 = 249
HR          1         2014  39     (39/40)*100 = 97.5   (97.5*133)/100 = 130
Marketing   1         2014  120    (120/90)*100 = 133   (133*150)/100 = 199.5
I am trying to use the following formula:
Chain_Index =
// Inception = the minimum year
var Inception = MIN(ABC[year])
// FY_LY = the previous year
var FY_LY = ABC[year]-1
// Next_Year = Inception + 1
var next_year = Inception + 1
// LY_ChainIndex_Value = the chain index value from the previous year
var LY_ChainIndex_Value = calculate(sum(ABC[chain_index]), filter(allexcept(ABC, ABC[department], ABC[category]), ABC[FY_CY]=FY_LY))
return if(ABC[FY_CY]=Inception, 100, (ABC[relative_index]*LY_ChainIndex_Value)/100)
I am getting the following error message:
A circular dependency was detected: ABC[chain_index].
I am trying to create chain index value described in this youtube video:
https://www.youtube.com/watch?v=TXzNvxCB0_g&t=234
Thanks for reading and help will be appreciated
After some quick testing, the following column works great to calculate the chain_index:
chain_index =
VAR _ly = [year] - 1
VAR _ly_index =
    CALCULATE (
        SUM ( 'ABC'[relative_index] ),
        ALLEXCEPT ( 'ABC', 'ABC'[department], 'ABC'[category] ),
        'ABC'[year] = _ly
    )
RETURN
    IF (
        ISBLANK ( _ly_index ),
        100,
        ( [relative_index] * _ly_index ) / 100
    )
Basing the relative_index on the following:
relative_index =
VAR _ly = [year] - 1
VAR _ly_sales =
    CALCULATE (
        SUM ( 'ABC'[sales] ),
        ALLEXCEPT ( 'ABC', 'ABC'[department], 'ABC'[category] ),
        'ABC'[year] = _ly
    )
RETURN
    IF (
        ISBLANK ( _ly_sales ),
        100,
        100.0 * ( [sales] / _ly_sales )
    )
I suspect the circular dependency is caused by whatever it is you are doing to calculate this column 'ABC'[FY_CY] - a column you do not need for this calculation at least.
Unless you plan to use these index columns for categorization or slicing, best practice is to calculate these values with measures rather than enlarging your data model with calculated columns.
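A chain index is normally built by carrying the previous year's chain value forward. As far as I can tell, the chain_index column above multiplies by the previous year's relative_index instead, which gives the same numbers for the three years in the sample but would diverge from a fourth year onward. Below is a minimal Python sketch of the running calculation on the question's rows. The function and variable names are mine, and it assumes the years within each department/category are consecutive.

```python
from collections import defaultdict

# (department, category, year, sales) rows from the question's ABC table.
rows = [
    ("Finance", 1, 2012, 20), ("HR", 1, 2012, 30), ("Marketing", 1, 2012, 60),
    ("Finance", 2, 2012, 50), ("HR", 2, 2012, 15), ("Marketing", 2, 2012, 17),
    ("Finance", 1, 2013, 60), ("HR", 1, 2013, 40), ("Marketing", 1, 2013, 90),
    ("Finance", 2, 2013, 7), ("HR", 2, 2013, 20), ("Marketing", 2, 2013, 22),
    ("Finance", 1, 2014, 50), ("HR", 1, 2014, 39), ("Marketing", 1, 2014, 120),
]

def chain_indexes(rows):
    """Return {(department, category, year): (relative_index, chain_index)}."""
    by_group = defaultdict(dict)
    for dept, cat, year, sales in rows:
        by_group[(dept, cat)][year] = sales

    result = {}
    for (dept, cat), sales_by_year in by_group.items():
        prev_sales = None
        prev_chain = None
        for year in sorted(sales_by_year):
            sales = sales_by_year[year]
            if prev_sales is None:
                # Inception year: both indexes start at 100.
                relative = chain = 100.0
            else:
                relative = 100.0 * sales / prev_sales
                # Carry the previous CHAIN value forward, not the relative one.
                chain = relative * prev_chain / 100.0
            result[(dept, cat, year)] = (relative, chain)
            prev_sales, prev_chain = sales, chain
    return result

indexes = chain_indexes(rows)
print(indexes[("Finance", 1, 2014)])
```

Because the intermediate 83.3 is not rounded here, Finance / category 1 / 2014 comes out at about 250 rather than the 249 shown in the question's table.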

DAX - Rankx by multiple Categories Issue

I have 4 categories (GP, ID, Age, Date). I would like to create a calculated column that groups by GP, ID, and Age and ranks/counts by Date, to see how many months each member has in the past 24 months.
My code works until I have members who cancelled their membership for a few months and then resumed. I need the count to restart from the first month after the gap. For example:
GP ID AGE DATE RKING Desired RANK
1 220 35-44 202206 12 6
1 220 35-44 202205 12 5
1 220 35-44 202204 12 4
1 220 35-44 202203 12 3
1 220 35-44 202202 12 2
1 220 35-44 202201 12 1
1 220 35-44 202012 24 24
1 220 35-44 202011 23 23
1 220 35-44 202010 22 22
1 220 35-44 202009 21 21
1 220 35-44 202008 20 20
1 220 35-44 202007 19 19
1 220 35-44 202006 18 18
1 220 35-44 202005 17 17
1 220 35-44 202004 16 16
… … … … … …
1 220 35-44 201901 1 1
This is what I have tried, but it doesn't handle skipped dates.
RKING Column=
RANKX (
CALCULATETABLE (
VALUES ('tbl'[Date] ),
ALLEXCEPT ( 'tblW', 'tbl'[GP], 'tbl'[ID] ),
'tbl'[AGE] = 'tbl'[AGE],
'tbl'[date] >= start_date && 'tbl'[date] <= end_date // date slicer
),
[Date] ,
,ASC
)
Looking at your code, it seems you were trying to write a measure for a visual (a version for a calculated column is included as well). As I understand it, you want to show the count of consecutive months in a matrix for each date, per ID/GP/AGE/DATE. I see the following approach.
As you know, a calculation is evaluated for each row in a matrix, and the data model is filtered according to the values in the matrix rows and columns (and slicers). So my idea is:
1) Get the date from the matrix row and use it as the max date for the table.
2) Then use FILTER(). FILTER() is an iterator, so it goes through each row and checks the filter condition: if it is true the row remains, otherwise it is dropped. I use the following filter conditions:
   - Compute dateInMatrix - dateInCurrentTableRow (for example: 202203 - 202201 = 2 months).
   - Count the rows in the table with a date >= 202201 and < 202203.
   - If there are fewer rows than the date difference, the condition is FALSE() and the row is removed from the table.
3) The last step is counting the rows in the filtered table.
A measure for the matrix:
Ranking =
VAR matrixDate=MAX('table'[DATE])
VAR filteredTable =
    FILTER(
        ALL('table')
        ,DATEDIFF(
            DATE(LEFT([DATE],4),RIGHT([DATE],2),1)
            ,DATE(LEFT(matrixDate,4),RIGHT(matrixDate,2),1)
            ,MONTH
        )
        =
        VAR dateInRow=[DATE]
        RETURN
            CALCULATE(
                COUNTROWS('table')
                ,'table'[DATE]>=dateInRow
                ,'table'[DATE]<matrixDate
            )
    )
RETURN
    COUNTROWS(filteredTable)
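The filter condition in the measure can be checked outside Power BI. The following Python sketch applies the same test (month difference equals the number of rows in between) to one member's dates from the question. The helper names are hypothetical, and it assumes each month appears at most once per member.

```python
def month_index(yyyymm):
    """Convert a YYYYMM integer to a running month number."""
    return (yyyymm // 100) * 12 + (yyyymm % 100 - 1)

def consecutive_rank(dates, current):
    """Count the rows in the unbroken run of months that ends at `current`."""
    count = 0
    for d in dates:
        months_apart = month_index(current) - month_index(d)
        # Rows on or after d and strictly before the current date.
        rows_between = sum(1 for other in dates if d <= other < current)
        # A row belongs to the run only if no month is missing in between.
        if months_apart >= 0 and months_apart == rows_between:
            count += 1
    return count

# One member's months: all of 2019 and 2020, then a gap, then 202201-202206.
dates = [y * 100 + m for y in (2019, 2020) for m in range(1, 13)]
dates += [202200 + m for m in range(1, 7)]

print(consecutive_rank(dates, 202206))  # 6
print(consecutive_rank(dates, 202012))  # 24
```

These match the "Desired RANK" column in the question: the count restarts at 1 in 202201 after the gap.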
A version for a calculated column:
RankColl =
VAR currentDate=[Start_Date]
VAR MyFilt={('Table'[AGE],'Table'[ID],'Table'[GROUP])}
VAR withColl =
    ADDCOLUMNS(
        CALCULATETABLE(
            'table'
            ,ALL('Table')
            ,TREATAS(MyFilt,'Table'[AGE],'Table'[ID],'Table'[GROUP])
        )
        ,"dateDiff",
            DATEDIFF(
                [Start_Date]
                ,currentDate
                ,MONTH
            )
        ,"RowsInTable",
            VAR dateInRow=[Start_Date]
            VAR startDate=IF(dateInRow<currentDate,dateInRow,currentDate)
            VAR endDay =IF(dateInRow>currentDate,dateInRow,currentDate)
            VAR myDates = GENERATESERIES(startDate,endDay,1)
            RETURN
                COUNTROWS(
                    CALCULATETABLE(
                        'Table'
                        ,ALL('Table')
                        ,TREATAS(MyFilt,'Table'[AGE],'Table'[ID],'Table'[GROUP])
                        ,TREATAS(myDates,'Table'[Start_Date])
                    )
                )
    )
VAR filtered =
    FILTER(
        withColl
        ,[dateDiff]=[RowsInTable]-1 -- e.g. dateDiff = 01/01/2022 - 01/01/2022 = 0,
                                    -- but there is 1 row in the table for 01/01/2022
    )
RETURN
    COUNTROWS( filtered )

How to create a column based on grouped condition?

My test table in Power BI:
IdRecord  Date                     Value
1         2022-04-25 23:45:00.000  100
1         2022-04-24 18:07:00.000  344
2         2022-05-01 23:45:00.000  5
2         2022-05-02 18:07:00.000  66
2         2022-05-03 18:07:00.000  31
I need to create a calculated column that marks the earliest record within each id group.
Desired output
IdRecord  Date                     Value  IsFirst
1         2022-04-25 23:45:00.000  100    0
1         2022-04-24 18:07:00.000  344    1
2         2022-05-01 23:45:00.000  5      1
2         2022-05-02 18:07:00.000  66     0
2         2022-05-03 18:07:00.000  31     0
Answering my own question:
FirstRes =
VAR MYMIN =
    CALCULATE(
        MIN(Table[Date]),
        FILTER ( Table, Table[IdRecord] = EARLIER ( Table[IdRecord] ) )
    )
RETURN
    IF(
        CALCULATE(
            MIN(MIN(Table[Date]),MYMIN),
            FILTER ( Table, Table[IdRecord] = EARLIER ( Table[IdRecord] ) )
        ) = Table[Date],
        1,
        0
    )
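The logic of the column is: find the minimum date per IdRecord, then flag the row whose date equals that minimum. A small Python sketch of the same idea on the sample rows (the function name is only for illustration):

```python
from datetime import datetime

# (IdRecord, Date, Value) rows from the question.
records = [
    (1, datetime(2022, 4, 25, 23, 45), 100),
    (1, datetime(2022, 4, 24, 18, 7), 344),
    (2, datetime(2022, 5, 1, 23, 45), 5),
    (2, datetime(2022, 5, 2, 18, 7), 66),
    (2, datetime(2022, 5, 3, 18, 7), 31),
]

def mark_first(records):
    """Return 1 for the earliest record of each IdRecord, else 0."""
    earliest = {}
    for id_record, when, _ in records:
        if id_record not in earliest or when < earliest[id_record]:
            earliest[id_record] = when
    return [1 if when == earliest[id_record] else 0 for id_record, when, _ in records]

is_first = mark_first(records)
print(is_first)  # [0, 1, 1, 0, 0]
```

If two rows of the same id share the earliest timestamp, both are flagged, which is also how the equality test in the DAX column behaves.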

Amazon Athena: Query to find out patients with compliance=0 for consecutive 10 days

Find all patients who have had compliance=0 for the past 10 consecutive days, counting back from the current date, using Amazon Athena.
patient id compliance create_date
1 0 2021-01-01
1 0 2021-01-02
1 0 2021-01-03
1 0 2021-01-04 -- rejected: not 10 consecutive days
2 0 2021-01-01
2 0 2021-01-02
2 0 2021-01-03
2 0 2021-01-04
2 0 2021-01-05
2 0 2021-01-06
2 0 2021-01-07
2 0 2021-01-08
2 0 2021-01-09
2 0 2021-01-10 -- accepted: 10 consecutive days
There are multiple ways to achieve this. One is to take the difference in days between each date and the previous one for the same patient, then check three things over a window made of the current row and the 9 rows before it: the window holds 10 rows, the 9 gaps between those rows sum to 9 days (so the dates are consecutive), and the compliance values sum to exactly 0:
with base as (
    select
        *,
        -- the 9 gaps between the 10 rows ending at the current row
        sum(delta) over (partition by patient_id order by date rows between 8 preceding and current row) as cumdelta,
        -- compliance total and row count over those same 10 rows
        sum(compliance) over (partition by patient_id order by date rows between 9 preceding and current row) as cumcompliance,
        count(*) over (partition by patient_id order by date rows between 9 preceding and current row) as cumrows
    from (
        -- delta is null on each patient's first row; assumes one row per patient per day
        select *, date_diff('day', prev_date, date) as delta
        from (
            select
                patient_id,
                compliance,
                date,
                lag(date) over (partition by patient_id order by date) as prev_date
            from (
                select patient_id, compliance, try_cast(date as date) as date
                from data
            )
        )
    )
)
select
    patient_id,
    compliance,
    date,
    case when (cumrows = 10 and cumdelta = 9 and cumcompliance = 0) then 'yes' else null end as validated_compliance
from base
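To make the "10 consecutive days" rule concrete, here is a minimal Python sketch that runs it against the two sample patients from the question. It is an illustration of the rule, not of the SQL; the function name is made up, and it does not handle the "counting back from the current date" part of the requirement.

```python
from datetime import date, timedelta

def patients_with_streak(rows, days=10):
    """Return patient ids with `days` consecutive dates where compliance is 0."""
    by_patient = {}
    for patient_id, compliance, day in rows:
        by_patient.setdefault(patient_id, {})[day] = compliance

    matched = set()
    for patient_id, by_day in by_patient.items():
        streak = 0
        previous = None
        for day in sorted(by_day):
            consecutive = previous is not None and day - previous == timedelta(days=1)
            if by_day[day] != 0:
                streak = 0          # a compliant day breaks the run
            elif consecutive and streak > 0:
                streak += 1         # next calendar day, still non-compliant
            else:
                streak = 1          # first day, or a gap in the dates
            if streak >= days:
                matched.add(patient_id)
            previous = day
    return matched

start = date(2021, 1, 1)
rows = [(1, 0, start + timedelta(days=i)) for i in range(4)]    # patient 1: 4 days
rows += [(2, 0, start + timedelta(days=i)) for i in range(10)]  # patient 2: 10 days
print(patients_with_streak(rows))  # {2}
```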

Power BI : DAX: Running Sum with fixed start date - even when filtering

I have two tables, with:
Entrydate, several categories
ChurnDate, several categories
The categories are connected via different tables, and the dates are connected with a Calendar.
Now I want to calculate how many customers I have. So I have following DAX formulas
1. SumChurn =
CALCULATE(
    SUM('kuendigungen'[KUENDIGUNG]);
    FILTER(
        ALLSELECTED('Calendar'[Date]);
        ISONORAFTER('Calendar'[Date]; MAX('Calendar'[Date]); DESC)
    )
)
2. SumEntry =
CALCULATE(
    SUM('eintritt'[NEUMITGLIED]);
    FILTER(
        ALLSELECTED('Calendar'[Date]);
        ISONORAFTER('Calendar'[Date]; MAX('Calendar'[Date]); DESC)
    )
)
3. TotalCustomers = SumEntry - SumChurn
This works, but in my diagram I want to filter the dates so that it only visualizes 2020 or the last 3 years. When I do this, the calculation is wrong because it only counts within this interval.
Is there a solution where I can filter the date in my visuals while the start date of the cumulative sum in my calculation stays fixed?
I don't want a new column because I still want to filter by my customer categories...
Thanks,
Michaela
Edit: Trying to explain it more clearly
Example Table 1: contains new customers
Date unique_id1 unique_id2 unique_id3 cat1 cat2 cat3 cat4 cat5 cat6
1886-02-01 2070030124 550261 207000152145 207 0 0 1 0 0
1887-01-01 4350002756 4081878 435000010707 435 0 0 1 0 0
1888-01-01 7030000597 3206858 703000001279 703 0 0 1 0 0
1888-06-01 7030016696 3208056 703000005002 703 0 0 1 0 0
1888-09-01 8210024182 204124 821000008664 821 1 0 1 0 1
1889-01-01 7050055324 1988250 705000018309 705 1 0 1 0 0
1889-01-01 8250000278 439485 825000600296 825 0 0 1 0 0
1889-05-01 7030023754 3208355 703000000884 703 0 0 1 0 0
1889-10-01 2110071206 2849359 211000330019 211 0 1 1 0 0
1889-10-01 2110071236 2851371 211000120014 211 0 0 1 0 0
1889-11-14 5190529889 4260192 519000123846 519 1 0 1 0 0
1890-07-01 7330349030 4819467 733000013102 733 0 0 1 0 0
1890-07-01 7330152914 4817492 733000075604 733 1 0 1 0 1
1890-07-01 8190000889 486170 819000215708 819 0 0 1 0 0
1890-07-01 8190444976 486199 819000215740 819 0 0 1 0 0
1890-12-01 8190001388 476049 819000100005 819 0 0 1 0 0
1891-01-01 7030001248 3206975 703000000043 703 0 0 1 0 1
Example Table 2: contains leaving customers
similar to table 1
Example Calendar Table:
01.01.1990
02.01.1990
03.01.1990 ... (till today)
The output should be a measure
for each day in the calendar: number of customers at this date = cumulative_sum(newcustomer) - cumulative_sum(churncustomer)
I get exactly this output when I run the calculations I wrote, but I want the measure to behave so that, when I filter the date, the sum is still the cumulative sum from the very first date; otherwise the numbers are wrong.
Edit3:
I did exactly the same thing as mkrabbani posted, but it doesn't work for me. These are my calculations:
TotalKuendigungen =
CALCULATE(
    SUM('kuendigungen'[KUENDIGUNG]);
    FILTER (
        ALL ( 'Calendar'[Date] );
        'Calendar'[Date] <= MAX ( 'Calendar'[Date] )
    )
)
TotalNeukunden =
CALCULATE(
    SUM('eintritt'[NEUMITGLIED]);
    FILTER (
        ALL ( 'Calendar'[Date] );
        'Calendar'[Date] <= MAX ( 'Calendar'[Date] )
    )
)
AnzahlMitglieder = [SummeNeumitglied] - [SummeKuendigung]
This is how it looks for me: (Neukunden: new customers, kündigungen: leaving, aktuellemitglieder: number of customers)
Picture 1 correct calculation
Picture 2: also a correct calculation, but the filter doesn't work
Thanks for adding some sample data and more explanation. If I understand your requirement correctly, the steps below should help you solve your issue.
Assumption: you have 3 tables, Date, new_customer and leaving_customer, and they are related as shown in the diagram.
I have created some sample data for 10 days to illustrate your requirement. The cumulative counts in the table are calculated with a basic cumulative calculation.
At this stage, you need a measure that calculates the current number of customers for each row as "cumulative_new_customer - cumulative_leaving_customer", which should not be hard.
However, you run into trouble when you slice your data with a Date slicer. Say you select date number 5, which is "January 05 2020" in my sample data. You want the final count based on January 01 to 05, but you get only the count from the single date "January 05 2020".
If the above explanation is correct, I would suggest writing 3 separate measures, as explained in this answer. The picture compares the output before and after slicing the data: the number of current users for "January 05 2020" is 41 in both cases.
If all of this meets your expectation, you can use the 3 measures below as written. They use ALL rather than ALLSELECTED on the date column, so a slicer selection does not cut off the start of the running total.
1.
cumulative_new_customer =
CALCULATE (
    COUNT(new_customer[unique_id]),
    FILTER (
        ALL ( 'Dates'[Date] ),
        'Dates'[Date] <= MAX ( 'Dates'[Date] )
    )
)
2.
cumulative_leaving_customer =
CALCULATE (
    COUNT(leaving_customer[unique_id]),
    FILTER (
        ALL ( 'Dates'[Date] ),
        'Dates'[Date] <= MAX ( 'Dates'[Date] )
    )
)
3.
number_of_cutomer_today = [cumulative_new_customer] - [cumulative_leaving_customer]
Hope the above details will help you.
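To illustrate why the start of the running total has to stay fixed, here is a short Python sketch with made-up daily counts (the numbers are hypothetical and are not the sample data from the pictures). The running totals are computed over the whole calendar first, and the slicer only decides which days are displayed.

```python
from datetime import date, timedelta
from itertools import accumulate

# Hypothetical daily counts for five days; not taken from the question.
days = [date(2020, 1, 1) + timedelta(days=i) for i in range(5)]
new_customers = [5, 3, 4, 2, 6]
leaving_customers = [1, 0, 2, 1, 1]

# Running totals over the WHOLE calendar, the effect of ALL('Dates'[Date]).
cumulative_new = list(accumulate(new_customers))
cumulative_leaving = list(accumulate(leaving_customers))
current = {d: n - l for d, n, l in zip(days, cumulative_new, cumulative_leaving)}

def visible(selected_days):
    """Values shown when a slicer keeps only `selected_days`."""
    return {d: current[d] for d in selected_days}

print(visible(days[4:]))  # only January 5 is displayed, value is still 15
```

Restarting the sum inside the selected range would instead give 6 - 1 = 5 for January 5, which is the kind of wrong number described in the question.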