Calculating the cumulative values in Power BI - powerbi

I have two tables:
user table (contains user registration data; columns: user_id, create_date)
customer order table (contains the order history; columns: user_id, order_date, order_id)
* A user and a customer aren't the same: a user becomes a customer when he registers his first order.
For each month of each year, I want the cumulative count of distinct users and the cumulative count of distinct customers, because in the end I want to calculate the ratio of the cumulative count of distinct customers to the cumulative count of distinct users for each month.
I don't know how to calculate these cumulative values and the ratio using DAX.
Note that if a customer registers more than one order in a month, I want to count him just once for that month; if he registers a new order in a later month, I count him again in that month.
Maybe these pictures help you to understand my question better.
- I don't have count_of_users and count_of_customers columns in my tables; I need to calculate them.
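The user table:
| user_id | create_date |
|---------|-------------|
| 1 | 2017-12-03 |
| 2 | 2018-01-01 |
| 3 | 2018-01-01 |
| 4 | 2018-02-04 |
| 5 | 2018-03-10 |
| 6 | 2018-04-07 |
| 7 | 2018-04-08 |
| 8 | 2018-09-12 |
| 9 | 2018-10-02 |
| 10 | 2018-10-02 |
| 11 | 2018-10-09 |
| 12 | 2018-10-11 |
| 13 | 2018-10-12 |
| 14 | 2018-10-12 |
| 15 | 2018-10-20 |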
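The customer order table:
| user_id | order_date | order_id |
|---------|------------|----------|
| 1 | 2018-03-28 | 120 |
| 1 | 2018-03-28 | 514 |
| 1 | 2018-03-30 | 426 |
| 2 | 2018-02-11 | 125 |
| 2 | 2018-03-01 | 547 |
| 3 | 2018-02-10 | 588 |
| 3 | 2018-04-03 | 111 |
| 4 | 2018-02-10 | 697 |
| 5 | 2018-04-02 | 403 |
| 5 | 2018-04-05 | 321 |
| 6 | 2018-04-09 | 909 |
| 11 | 2018-10-25 | 8401 |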

You need a few building blocks for this. Here is the data model I used:
<edit>
I see that user_id in the two tables does not refer to the same thing. In that case you can omit the relationship between the tables, and the two relationships from the Calendar table can both be active, with no need to switch relationships in the count_of_customers measure. </edit>
The calendar table is important because we can't rely on one single date column to aggregate data from different tables, so we create a common calendar table with this sample DAX code:
Calendar =
ADDCOLUMNS (
    CALENDARAUTO (),
    "Year", YEAR ( [Date] ),
    "Month", FORMAT ( [Date], "MMM" ),
    "Month-Year", FORMAT ( [Date], "MMM" ) & "-" & YEAR ( [Date] ),
    "YearMonthNo", YEAR ( [Date] ) * 12 + MONTH ( [Date] ) - 1
)
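To see why YearMonthNo works as a sort key, here is a minimal Python sketch of the same arithmetic (the function name is made up for illustration; this is not DAX). Consecutive months map to consecutive integers, even across a year boundary.

```python
from datetime import date


def year_month_no(d: date) -> int:
    # Same arithmetic as the YearMonthNo column: year * 12 + month - 1.
    return d.year * 12 + d.month - 1


dec_2017 = year_month_no(date(2017, 12, 3))
jan_2018 = year_month_no(date(2018, 1, 1))

# The number can be split back into (year, zero-based month).
year, month_index = divmod(jan_2018, 12)
```

Because the value grows by exactly one per month, sorting Month-Year by it gives chronological rather than alphabetical order.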
Make sure to sort the Month-Year column by the YearMonthNo column so your tables look nice:
Set your relationships as shown, with the active relationship going from Calendar to user. If you set them up differently, the measures will not work unless you adjust the relationship handling in the code accordingly. In my data model the inactive relationship is the one between Calendar and customer order.
Next up are the measures we will use for this. First off we count the users, a simple row count:
count_of_users = COUNTROWS ( user )
Then we count the distinct user ids in the order table to get the customers. Here we need the inactive relationship between Calendar and customer order, and to activate it we have to invoke CALCULATE with USERELATIONSHIP:
count_of_customers =
CALCULATE (
    DISTINCTCOUNT ( 'customer order'[user_id] ),
    USERELATIONSHIP (
        'Calendar'[Date],
        'customer order'[order_date]
    )
)
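As a rough cross-check of what this measure should return per month, the sample order table from the question can be grouped in plain Python. This is only a sketch of the distinct-count-per-month idea, not a translation of the DAX; the variable names are made up.

```python
from collections import defaultdict

# (user_id, order_date, order_id) rows copied from the question's order table.
orders = [
    (1, "2018-03-28", 120), (1, "2018-03-28", 514), (1, "2018-03-30", 426),
    (2, "2018-02-11", 125), (2, "2018-03-01", 547),
    (3, "2018-02-10", 588), (3, "2018-04-03", 111),
    (4, "2018-02-10", 697),
    (5, "2018-04-02", 403), (5, "2018-04-05", 321),
    (6, "2018-04-09", 909),
    (11, "2018-10-25", 8401),
]

# Collect the set of user ids per "YYYY-MM" so repeat orders count once.
customers_by_month = defaultdict(set)
for user_id, order_date, _order_id in orders:
    customers_by_month[order_date[:7]].add(user_id)

monthly_customers = {m: len(ids) for m, ids in sorted(customers_by_month.items())}
```

User 1 places three orders in March 2018 but contributes only one to that month's count, which is the behaviour the question asks for.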
We can use this measure to count users cumulatively:
cumulative_users =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
    CALCULATE (
        [count_of_users],
        ALL ( 'Calendar' ),
        'Calendar'[Date] <= _maxVisibleDate
    )
And this measure to count cumulative customers per month:
cumulative_customers =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
    CALCULATE (
        SUMX (
            VALUES ( 'Calendar'[YearMonthNo] ),
            [count_of_customers]
        ),
        ALL ( 'Calendar' ),
        'Calendar'[Date] <= _maxVisibleDate
    )
Lastly we want the ratio of these last cumulative measures:
cumulative_customers/users =
DIVIDE (
    [cumulative_customers],
    [cumulative_users]
)
And here is your result:
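The numbers can also be reproduced outside Power BI. The following Python sketch runs the question's sample data through the same logic as the measures above: new users per month, distinct customers per month, running totals of both, and their ratio. It is an independent cross-check under that reading of the measures, and all names in it are made up.

```python
from collections import Counter, defaultdict

# create_date values for user_id 1..15, copied from the question.
user_created = [
    "2017-12-03", "2018-01-01", "2018-01-01", "2018-02-04", "2018-03-10",
    "2018-04-07", "2018-04-08", "2018-09-12", "2018-10-02", "2018-10-02",
    "2018-10-09", "2018-10-11", "2018-10-12", "2018-10-12", "2018-10-20",
]
# (user_id, order_date) pairs, copied from the question.
orders = [
    (1, "2018-03-28"), (1, "2018-03-28"), (1, "2018-03-30"),
    (2, "2018-02-11"), (2, "2018-03-01"), (3, "2018-02-10"),
    (3, "2018-04-03"), (4, "2018-02-10"), (5, "2018-04-02"),
    (5, "2018-04-05"), (6, "2018-04-09"), (11, "2018-10-25"),
]

new_users = Counter(d[:7] for d in user_created)
month_customers = defaultdict(set)
for user_id, order_date in orders:
    month_customers[order_date[:7]].add(user_id)

# Every month from 2017-12 to 2018-10, including months without activity.
months = ["2017-12"] + [f"2018-{m:02d}" for m in range(1, 11)]

rows = []
cum_users = cum_customers = 0
for month in months:
    cum_users += new_users[month]
    cum_customers += len(month_customers.get(month, set()))
    rows.append((month, cum_users, cum_customers, cum_customers / cum_users))
```

Note that the ratio can exceed 1 (8 / 7 in April 2018), because a customer is counted once in every month in which he orders, while a user is counted only once overall.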

Related

power bi dax, sum up all latest monthly entries

Hi, I have a data table in Power BI structured like this:
id date data
1 2022-10-30 123
1 2022-11-01 130
1 2022-11-30 456
The data spans multiple user ids and multiple years, and the values are cumulative (like minutes on a phone plan, for instance). This is not the real data.
I want to add up the end-of-month data. In the ideal case my table would be complete, so that 2022-10-31 would exist for instance, and then I could do
Measure =
CALCULATE (
    SUM ( 'Table'[data] ),
    'Table'[date] = EOMONTH ( 'Table'[date], 0 )
)
This returns 456, but I want 579 (123 + 456), so I cannot use EOMONTH.
I think the answer is some combination of the DAX above and
FILTER( Table, Table[date] = MAX( Table[date] ) )
though if I use that on its own, it grabs only the actual latest date, not the latest date of every month.
I will also use slicers on user IDs, in case that changes the DAX.
Please use this measure to get what you need:
Measure_ =
VAR TblSummary =
    ADDCOLUMNS ( YoursTable, "EOM", CALCULATE ( ENDOFMONTH ( YoursTable[date] ) ) )
RETURN
    SUMX ( TblSummary, IF ( [EOM] = [date], [data] ) )
If we test our above measure on a table visual:
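For comparison, the intended result on the three sample rows can be sketched in plain Python: keep the latest available row per id and calendar month, then sum the kept values. This is one reading of the requirement (latest per id per month), not a translation of the DAX measure, and the names are made up.

```python
from datetime import date

# (id, date, data) rows copied from the question.
rows = [
    (1, date(2022, 10, 30), 123),
    (1, date(2022, 11, 1), 130),
    (1, date(2022, 11, 30), 456),
]

# (id, year, month) -> (date, data) of the latest row seen for that month.
latest = {}
for user_id, d, data in rows:
    key = (user_id, d.year, d.month)
    if key not in latest or d > latest[key][0]:
        latest[key] = (d, data)

total = sum(data for _d, data in latest.values())
```

October contributes 123 even though 2022-10-31 is missing, because 2022-10-30 is the latest date present for that month.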

dax - counting rows between dates

I have a table like
Account Open Close
1 01/01/2018 01/01/2019
2 01/01/2018 01/01/2020
3 01/01/2019 01/01/2021
4 01/01/2021
5 01/01/2019 01/01/2020
I'm interested in counting the number of accounts that are still active at the end of each year:
Year Count
2018 2
2019 3
2020 1
2021 1
I'm not sure if this can be derived from the Account table itself, so I created a date table with dates spanning many years. I added a column like
active_accounts = COUNTROWS ( FILTER ( Accounts, Accounts[Open] <= Date_table[Date] && Date_table[Date] < Accounts[Close] ) )
The formula seemed to be working as an added column but took extremely long to calculate as the date table contains many dates. So I tried to use the formula as a DAX measure, but it seems to have trouble comparing columns between more than one table:
a single value for column 'Date' in table 'date_table' can't be determined. This can happen when a measure refers to a column containing many values without specifying aggregation
What's the simplest way to accomplish counting the number of active accounts in a particular year? Can this be done without a date table?
Edit: wrapping date_table[date] in MIN() and MAX() makes the measure valid, but the figures are not right.
Further research indicates this might require CROSSJOIN().
Edit: it looks like this can be accomplished by building a Cartesian product of date_table and Account and filtering it to the rows where date_table[Date] is greater than Open but less than Close.
You have already created a Calendar table, and you can achieve the end goal with the following measure:
_mX:=
VAR _1 =
    GENERATEALL (
        'fact',
        DATESBETWEEN ( 'Calendar'[Calendar_Date], 'fact'[Open], 'fact'[Close]-1 )
    )
VAR _2 =
    ADDCOLUMNS (
        _1,
        "Calendar_Year",
            CALCULATE (
                MAXX (
                    FILTER ( 'Calendar', 'Calendar'[Calendar_Date] = [Calendar_Date] ),
                    'Calendar'[Calendar_Year]
                ) + 0
            )
    )
VAR _3 =
    SUMMARIZE ( _2, [Calendar_Year], [Account] )
VAR _4 =
    GROUPBY ( _3, [Calendar_Year], "count", COUNTX ( CURRENTGROUP (), [Account] ) )
VAR _5 =
    ADDCOLUMNS (
        'Calendar',
        "ct",
            CALCULATE (
                MAXX (
                    FILTER ( _4, [Calendar_Year] = MAX ( 'Calendar'[Calendar_Year] ) ),
                    [count]
                )
            )
    )
RETURN
    MAXX ( _5, [ct] )
The measure depends on a calendar table with at least the following columns, and the calendar table has no relationship to the fact table:
| Calendar_Date | Calendar_Year |
|---------------|---------------|
| 1/1/2018 | 2018 |
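The expected counts from the question can be checked with a short Python sketch that applies the question's definition directly: an account is active at the end of a year if it was opened on or before 31 December and is either still open or closed after that day. This is a cross-check of the target numbers under that assumption, not a translation of the measure above.

```python
from datetime import date

# (account, open, close) rows from the question; None means still open.
accounts = [
    (1, date(2018, 1, 1), date(2019, 1, 1)),
    (2, date(2018, 1, 1), date(2020, 1, 1)),
    (3, date(2019, 1, 1), date(2021, 1, 1)),
    (4, date(2021, 1, 1), None),
    (5, date(2019, 1, 1), date(2020, 1, 1)),
]


def active_at_year_end(year: int) -> int:
    # Count accounts opened by 31 December and not closed by then.
    year_end = date(year, 12, 31)
    return sum(
        1
        for _account, opened, closed in accounts
        if opened <= year_end and (closed is None or closed > year_end)
    )


counts = {year: active_at_year_end(year) for year in range(2018, 2022)}
```

The blank Close of account 4 is treated as "still open", which mirrors how the measure lets DATESBETWEEN run to the end of the calendar when the end date is blank.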

What is the purpose of using VALUES or ALL in the first parameter of an iterator function?

I know that only CALCULATE can modify the filter context. However, the following are two examples that use VALUES and ALL.
Example 1:
Revenue =
SUMX (
    Sales,
    Sales[Order Quantity] * Sales[Unit Price]
)
Revenue Avg Order =
AVERAGEX (
    VALUES ( 'Sales Order'[Sales Order] ),
    [Revenue]
)
What is the purpose of VALUES in the AVERAGEX function? Is it there to add an additional filter context?
Example 2:
Product Quantity Rank =
RANKX (
    ALL ( 'Product'[Product] ),
    [Quantity]
)
What is the purpose of using ALL in an iterator function?
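Suppose we have a table like this:
| ID | Sales Order | Order Quantity | UnitID | Unit Price |
|----|-------------|----------------|--------|------------|
| 1 | 101 | 10 | 4 | 39.99 |
| 2 | 101 | 15 | 3 | 24.99 |
| 3 | 102 | 5 | 2 | 15.99 |
| 4 | 103 | 5 | 1 | 14.99 |
| 5 | 103 | 10 | 3 | 24.99 |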
Since the Sales Order column has duplicates,
Revenue Avg Order = AVERAGEX ( VALUES ( Sales[Sales Order] ), [Revenue] )
gives a different result than
Revenue Avg ID = AVERAGEX ( Sales, [Revenue] )
since the first averages over the three Sales Order values whereas the second averages over the five ID rows.
Using DISTINCT instead of VALUES would work too.
Using ALL instead of VALUES gives the same total but ignores the local filter context from the table visual:
Revenue Avg All = AVERAGEX ( ALL ( Sales[Sales Order] ), [Revenue] )
In this context, ALL is acting as a table function that returns all of the distinct values of the column specified ignoring filter context.
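The difference between iterating over distinct Sales Order values and iterating over rows can be reproduced with the sample table in plain Python. This is only a sketch of the two averages at the grand-total level (no visual filter context is modelled), with made-up variable names.

```python
from collections import defaultdict

# (ID, Sales Order, Order Quantity, Unit Price) rows from the sample table.
sales = [
    (1, 101, 10, 39.99),
    (2, 101, 15, 24.99),
    (3, 102, 5, 15.99),
    (4, 103, 5, 14.99),
    (5, 103, 10, 24.99),
]

# Revenue per row, as an iterator over the Sales table would see it.
row_revenue = [qty * price for _id, _order, qty, price in sales]

# Revenue per distinct Sales Order, as an iterator over VALUES would see it.
revenue_by_order = defaultdict(float)
for _id, order, qty, price in sales:
    revenue_by_order[order] += qty * price

avg_per_order = sum(revenue_by_order.values()) / len(revenue_by_order)
avg_per_row = sum(row_revenue) / len(row_revenue)
```

The total revenue is the same in both cases (1179.55); only the number of items it is divided by changes, three orders versus five rows.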

Calculate price based on distinct count

I am having trouble working out a measure (Revenue) in power bi.
I have a measure which basically counts distinct values in a table (Table 1). I want to multiply this distinct count by the price to get the total revenue (prices are in another table).
See below for an example
Table 1
Product DistinctCount Revenue (Measure I am trying to Calculate)
A 15 45.00
B 30 60.00
Prices Table
Product Price
A 3.00
B 2.00
At the moment the Revenue is calculating based on COUNT and not DISTINCTCOUNT.
Any help would be much appreciated.
thanks!
I am assuming you have a relationship set up between these two tables on [Product]. If this is the case you can do something like this to create a calculated column:
Revenue =
CALCULATE (
    SUMX ( 'Table 1', 'Table 1'[DistinctCount] * RELATED ( 'Prices Table'[Price] ) )
)
If you are trying to create a table visual try the DAX below, where ID is just a transaction ID for each product in your 'Table 1':
Revenue =
VAR DistinctCountOfProductTransactions =
    CALCULATE ( DISTINCTCOUNT ( 'Table'[Id] ) )
VAR Result =
    CALCULATE (
        DistinctCountOfProductTransactions * SUM ( Prices[Price] ),
        TREATAS ( VALUES ( 'Table'[Product] ), Prices[Product] )
    )
RETURN
    Result
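To illustrate the difference between multiplying the price by a plain count and by a distinct count, here is a small Python sketch. The prices come from the question; the transaction rows are hypothetical and were generated so that the distinct counts match the 15 and 30 shown in the question.

```python
# Prices per product, copied from the question's Prices Table.
prices = {"A": 3.00, "B": 2.00}

# Hypothetical (product, transaction id) rows; ids repeat on purpose so that
# the plain row count differs from the distinct count.
transactions = [("A", i % 15) for i in range(20)] + [("B", i % 30) for i in range(45)]

# Distinct transaction ids per product.
distinct_ids = {}
for product, tx_id in transactions:
    distinct_ids.setdefault(product, set()).add(tx_id)

revenue = {p: len(ids) * prices[p] for p, ids in distinct_ids.items()}

# What a plain COUNT-style calculation would give instead.
row_counts = {p: sum(1 for q, _ in transactions if q == p) for p in prices}
revenue_from_count = {p: row_counts[p] * prices[p] for p in prices}
```

With these made-up rows the distinct-count version gives 45.00 and 60.00, matching the table in the question, while the plain count overstates both.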

What is the PowerBI/DAX query equivalent of this SQL windowed function

I have the following Table:
NDayNo Customer Date CallID
0 A 02/09/2018 48451
24 A 26/09/2018 48452
0 B 21/09/2018 48453
4 B 25/09/2018 48454
0 C 17/09/2018 48455
8 C 25/09/2018 48456
9 C 26/09/2018 48457
9 C 26/09/2018 48458
0 D 09/09/2018 48459
The NDayNo value was worked out using this expression in SQL:
COALESCE(DATEDIFF(day,FIRST_VALUE(Date) OVER (PARTITION BY Customer ORDER By Date),Date),0)
NDayNo is 0 for the first time the customer makes contact in the month; for each later contact it is the number of days since that first contact.
I'm trying to replicate the same logic in Power BI. Does anybody know how I can calculate this as a calculated column / DAX query?
This should work for you:
NDayNo =
DATEDIFF (
    CALCULATE ( MIN ( 'table'[Date] ), ALLEXCEPT ( 'table', 'table'[Customer] ) ),
    'table'[Date],
    DAY
)
This DAX expression returns, for each row, the difference in days between the minimum [Date] in the whole table (filtered only to the [Customer] of that row) and the [Date] in that row.
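The same "days since the customer's first contact" logic can be checked against the sample table with a few lines of Python. This is a sketch of the calculation only, using the rows from the question with dates read as day/month/year; the variable names are made up.

```python
from datetime import date

# (Customer, Date, CallID) rows from the question, dates read as dd/mm/yyyy.
calls = [
    ("A", date(2018, 9, 2), 48451), ("A", date(2018, 9, 26), 48452),
    ("B", date(2018, 9, 21), 48453), ("B", date(2018, 9, 25), 48454),
    ("C", date(2018, 9, 17), 48455), ("C", date(2018, 9, 25), 48456),
    ("C", date(2018, 9, 26), 48457), ("C", date(2018, 9, 26), 48458),
    ("D", date(2018, 9, 9), 48459),
]

# Earliest contact date per customer (the partition-level minimum).
first_contact = {}
for customer, d, _call_id in calls:
    first_contact[customer] = min(d, first_contact.get(customer, d))

# Day difference between each row's date and its customer's first contact.
n_day_no = [(d - first_contact[customer]).days for customer, d, _call_id in calls]
```

The resulting list matches the NDayNo column in the question row for row.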