I'm trying to create a report that shows the number of males, females, and total customers in a given day by hour.
The data is inserted into the database as a transaction whenever somebody enters the building. It stores their gender there.
The query to gather the data looks as follows:
initial_query = EventTransactions.objects.using('reportsdb').filter(
    building_id=id,
    actualdatetime__lte=end_date,
    actualdatetime__gte=beg_date,
)
From there, I annotate the query to extract the date:
ordered_query = initial_query.annotate(
    year=ExtractYear('actualdatetime'),
    month=ExtractMonth('actualdatetime'),
    day=ExtractDay('actualdatetime'),
    hour=ExtractHour('actualdatetime'),
    male=Coalesce(Sum(Case(When(customer_gender='M', then=1)), output_field=IntegerField()), Value(0)),
    female=Coalesce(Sum(Case(When(customer_gender='F', then=1)), output_field=IntegerField()), Value(0))
).values(
    'year', 'month', 'day', 'hour', 'male', 'female'
)
How do I then sum the male customers and female customers by hour?
By this, I mean that I wish to provide a table to the user which contains each hour of the day (can just be a number from 0-23 at this point), total males for that hour, total females for that hour, and total customers for that hour:
TIME | MALE | FEMALE | TOTAL
0 12 4 16
1 5 8 13
2 2 3 5
3 20 38 58
etc.
I'd be happy to provide more information if necessary. Thank you!
You're nearly there. Apply the annotation for the gendered counts after calling values(), so that the database groups by the extracted datetime values and computes the aggregates per group.
ordered_query = initial_query.annotate(
    year=ExtractYear('actualdatetime'),
    month=ExtractMonth('actualdatetime'),
    day=ExtractDay('actualdatetime'),
    hour=ExtractHour('actualdatetime'),
).values(
    'year', 'month', 'day', 'hour',
).annotate(
    male=Coalesce(Sum(Case(When(customer_gender='M', then=1)), output_field=IntegerField()), Value(0)),
    female=Coalesce(Sum(Case(When(customer_gender='F', then=1)), output_field=IntegerField()), Value(0))
)
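To make the grouping explicit, here is a minimal plain-Python sketch of what the values(...).annotate(...) chain asks the database to do. The function name hourly_gender_counts and the sample events are hypothetical, and the extra total column is an assumption about how the requested TOTAL would be computed (every entry counts, whatever its gender value).

```python
from collections import defaultdict
from datetime import datetime


def hourly_gender_counts(events):
    """Group (timestamp, gender) pairs by year/month/day/hour and count entries."""
    counts = defaultdict(lambda: {"male": 0, "female": 0, "total": 0})
    for ts, gender in events:
        # Same grouping key as .values('year', 'month', 'day', 'hour')
        bucket = counts[(ts.year, ts.month, ts.day, ts.hour)]
        if gender == "M":
            bucket["male"] += 1
        elif gender == "F":
            bucket["female"] += 1
        bucket["total"] += 1  # assumption: total counts every entry
    return dict(counts)


# Hypothetical sample events, not taken from the question's database.
events = [
    (datetime(2018, 5, 1, 0, 5), "M"),
    (datetime(2018, 5, 1, 0, 40), "F"),
    (datetime(2018, 5, 1, 1, 15), "M"),
]
result = hourly_gender_counts(events)
```

In the ORM itself, the total could probably be obtained by adding a Count aggregate to the second annotate call, or by adding male and female when rendering the table.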
Related
I am trying to get a cumulative distinct count of uids on a daily basis. My dataset consists of dates and the user IDs active on each date. Example: say 2 uids (235, 2354) appeared on 2022-01-01, and they appeared again the next day, 2022-01-02, together with a new uid 125 (235, 2354, 125). At this point I want the stored cumulative count to be 3, not 5, because user IDs 235 and 2354 already appeared on the previous day.
My sample data looks as follows:
https://github.com/manish-tripathi/Datasets/blob/main/Sample%20Data.xlsx
and my output should look as follows:
[image: expected output]
Here's one way that seems to work, using your linked Excel sheet as the data source.
Create a new table:
Table 2 = DISTINCT('Table'[Date])
Add the columns:
MAU = CALCULATE(
    DISTINCTCOUNT('Table'[User ID]),
    'Table'[Date] <= EARLIER('Table 2'[Date]))
DAU = CALCULATE(
    DISTINCTCOUNT('Table'[User ID]),
    'Table'[Date] = EARLIER('Table 2'[Date]))
Result from your Excel data
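For readers who want to check the numbers outside Power BI, the same idea (distinct users per day, and distinct users seen up to and including each day) can be sketched in plain Python. The function name cumulative_distinct is hypothetical; the sample rows are the ones described in the question.

```python
from datetime import date


def cumulative_distinct(rows):
    """Return {day: (daily_distinct, cumulative_distinct)} for (day, uid) rows."""
    by_day = {}
    for day, uid in rows:
        by_day.setdefault(day, set()).add(uid)

    seen = set()  # every uid observed so far
    out = {}
    for day in sorted(by_day):
        seen |= by_day[day]
        out[day] = (len(by_day[day]), len(seen))
    return out


# The example from the question: 235 and 2354 on day one, plus 125 on day two.
rows = [
    (date(2022, 1, 1), 235), (date(2022, 1, 1), 2354),
    (date(2022, 1, 2), 235), (date(2022, 1, 2), 2354), (date(2022, 1, 2), 125),
]
result = cumulative_distinct(rows)
```

The first element of each tuple corresponds to the DAU column and the second to the cumulative column (named MAU above).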
Our Subscription table contains around 40k entries. I am trying to calculate the monthly recurring revenue (MRR). Dim is a simple calendar table used to map the date ranges between start and end to individual days.
SUMX (
    'Subscription Clean',
    CALCULATE (
        SUMX (
            SUMMARIZE (
                FILTER (
                    CROSSJOIN ( 'Subscription Clean', Dim ),
                    Dim[Date] <= 'Subscription Clean'[End At]
                        && Dim[Date] > 'Subscription Clean'[Start At]
                ),
                'Subscription Clean'[Duration total],
                'Subscription Clean'[Duration],
                'Subscription Clean'[End At],
                'Subscription Clean'[Start At],
                'Subscription Clean'[Price]
            ),
            DIVIDE (
                DIVIDE (
                    'Subscription Clean'[Price],
                    DATEDIFF ( [Start At], [End At], DAY ) + 1
                ),
                1.19 -- remove the 19% tax
            )
        )
    )
)
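The data looks like this:
Sales Id | Product name | created At | end at     | Cancelled At | Price in € | Duration
-----------------------------------------------------------------------------------------
1        | bananas      | 01.01.2021 | 01.03.2021 | 28.02.2021   | 20.00      | 2
2        | apples       | 01.02.2021 | 01.05.2021 | null         | 90.00      | 3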
The output should look like this for the MRR. Each price is distributed evenly over the months between the start and end dates: the first row contributes 10€ to each of months {1, 2}, and the second row contributes 30€ to each of months {2, 3, 4}.
jan = 10
feb = 40
mar = 30
apr = 30
may = 0
The DAX query works properly but is extremely slow. Is there any way to improve the performance? The calculated measure will also be used in further calculations, so it should be held in memory for quick access if possible. Maybe someone knows how to do this properly and I can pick up some good hints ;)
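As a sanity check on the expected monthly figures listed above, here is a minimal Python sketch of the distribution. It is deliberately simpler than the DAX: it spreads each price evenly over whole months, ignores the 19% tax, and assumes all months fall in one calendar year. The function name monthly_revenue and the (start_month, price, duration) tuple layout are hypothetical.

```python
from collections import defaultdict


def monthly_revenue(subscriptions):
    """Spread each price evenly over `duration` months from the start month."""
    totals = defaultdict(float)
    for start_month, price, duration in subscriptions:
        for offset in range(duration):
            # Assumption: no year rollover, months are plain integers 1..12
            totals[start_month + offset] += price / duration
    return dict(totals)


# The two sample rows: bananas (20€ over 2 months from January),
# apples (90€ over 3 months from February).
subscriptions = [(1, 20.0, 2), (2, 90.0, 3)]
result = monthly_revenue(subscriptions)
```

Months with no active subscription, such as May in the example, simply do not appear in the result and can be treated as 0.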
I have two tables:
user table (contains: user registration data. columns: user_id, create_date)
customer order table (contains: history of orders. columns: user_id, order_date, order_id)
*User and customer aren't the same. When a user places his first order, he becomes a customer.
For each month of each year, I want the cumulative count of distinct users and the cumulative count of distinct customers, because in the end I want to calculate the ratio of the cumulative count of distinct customers to the cumulative count of distinct users for each month.
I don't know how to calculate the cumulative values and this ratio using DAX.
Note that if a customer places more than one order in a month, I want to count him just once for that month, and if he places a new order in a later month, I count him again in each such month.
Maybe these pictures help you to understand my question better.
Note: I don't have count_of_users and count_of_customers columns in my tables. I have to calculate them.
the user table:
user_id | create_date
---------------------
1       | 2017-12-03
2       | 2018-01-01
3       | 2018-01-01
4       | 2018-02-04
5       | 2018-03-10
6       | 2018-04-07
7       | 2018-04-08
8       | 2018-09-12
9       | 2018-10-02
10      | 2018-10-02
11      | 2018-10-09
12      | 2018-10-11
13      | 2018-10-12
14      | 2018-10-12
15      | 2018-10-20
the customer order table:
user_id | order_date | order_id
-------------------------------
1       | 2018-03-28 | 120
1       | 2018-03-28 | 514
1       | 2018-03-30 | 426
2       | 2018-02-11 | 125
2       | 2018-03-01 | 547
3       | 2018-02-10 | 588
3       | 2018-04-03 | 111
4       | 2018-02-10 | 697
5       | 2018-04-02 | 403
5       | 2018-04-05 | 321
6       | 2018-04-09 | 909
11      | 2018-10-25 | 8401
You need a few building blocks for this. Here is the data model I used:
<edit>
I see that user_id in the two tables does not refer to the same thing. In that case you can omit the relationship between the tables, and the two relationships from the Calendar table can both be active, with no need to change the relationship semantics in the count_of_customers measure. </edit>
The calendar table is important because we can't rely on one single date column to aggregate data from different tables, so we create a common calendar table with this sample DAX code:
Calendar =
ADDCOLUMNS (
    CALENDARAUTO () ,
    "Year" , YEAR ( [Date] ) ,
    "Month" , FORMAT ( [Date] , "MMM" ) ,
    "Month-Year" , FORMAT ( [Date] , "MMM")&"-"&YEAR ( [Date] ) ,
    "YearMonthNo" , YEAR ( [Date] ) * 12 + MONTH ( [Date] ) - 1
)
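The YearMonthNo expression gives every month a single integer that keeps increasing across year boundaries, which is what makes it usable as a sort key. A small Python sketch of the same arithmetic (the function name year_month_no is hypothetical):

```python
from datetime import date


def year_month_no(d):
    """Mirror of YEAR([Date]) * 12 + MONTH([Date]) - 1 from the Calendar table."""
    return d.year * 12 + d.month - 1


december = year_month_no(date(2017, 12, 3))
january = year_month_no(date(2018, 1, 1))

# divmod recovers (year, zero-based month) from the key
year, month_index = divmod(january, 12)
```

Consecutive months always differ by exactly 1, so December 2017 sorts directly before January 2018.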
Make sure to sort the Month-Year column by the YearMonthNo column so your tables look nice:
Set your relationships as shown, with the active relationship going from Calendar to user. Otherwise the measures will not work unless you alter the relationships in the code accordingly. In my data model the inactive relationship is between Calendar and customer order.
Next up are the measures we will use for this. First off we count the users, a simple row count:
count_of_users = COUNTROWS ( user )
Then we count distinct user ids in the order table to count customers. Here we need to use the inactive relationship between Calendar and customer order, and to do this we have to invoke CALCULATE:
count_of_customers =
CALCULATE (
    DISTINCTCOUNT ( 'customer order'[user_id] ) ,
    USERELATIONSHIP (
        'Calendar'[Date] ,
        'customer order'[order_date]
    )
)
We can use this measure to count users cumulatively:
cumulative_users =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
    CALCULATE (
        [count_of_users] ,
        ALL ( 'Calendar' ) ,
        'Calendar'[Date] <= _maxVisibleDate
    )
And this measure to count cumulative customers per month:
cumulative_customers =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
    CALCULATE (
        SUMX (
            VALUES ( 'Calendar'[YearMonthNo] ) ,
            [count_of_customers]
        ),
        ALL ( 'Calendar' ) ,
        'Calendar'[Date] <= _maxVisibleDate
    )
Lastly we want the ratio of these last cumulative measures:
cumulative_customers/users =
DIVIDE (
    [cumulative_customers] ,
    [cumulative_users]
)
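To cross-check the three cumulative measures outside Power BI, the logic can be sketched in plain Python using the sample tables from the question. This is only an illustration under stated assumptions: the function name cumulative_ratio is hypothetical, dates are kept as ISO strings, and months with neither a new user nor an order are skipped (a calendar table would list them too).

```python
from collections import defaultdict

# Sample data from the question: user_id -> create_date, and (user_id, order_date).
users = {
    1: "2017-12-03", 2: "2018-01-01", 3: "2018-01-01", 4: "2018-02-04",
    5: "2018-03-10", 6: "2018-04-07", 7: "2018-04-08", 8: "2018-09-12",
    9: "2018-10-02", 10: "2018-10-02", 11: "2018-10-09", 12: "2018-10-11",
    13: "2018-10-12", 14: "2018-10-12", 15: "2018-10-20",
}
orders = [
    (1, "2018-03-28"), (1, "2018-03-28"), (1, "2018-03-30"), (2, "2018-02-11"),
    (2, "2018-03-01"), (3, "2018-02-10"), (3, "2018-04-03"), (4, "2018-02-10"),
    (5, "2018-04-02"), (5, "2018-04-05"), (6, "2018-04-09"), (11, "2018-10-25"),
]


def cumulative_ratio(users, orders):
    """Return {"YYYY-MM": (cumulative_users, cumulative_customers, ratio)}."""
    new_users = defaultdict(int)
    for created in users.values():
        new_users[created[:7]] += 1          # users registered per month

    customers = defaultdict(set)
    for uid, ordered in orders:
        customers[ordered[:7]].add(uid)      # distinct customers per month

    result = {}
    cum_users = cum_customers = 0
    for month in sorted(set(new_users) | set(customers)):
        cum_users += new_users.get(month, 0)
        cum_customers += len(customers.get(month, ()))
        ratio = cum_customers / cum_users if cum_users else None
        result[month] = (cum_users, cum_customers, ratio)
    return result


result = cumulative_ratio(users, orders)
```

A customer who orders in several months is counted once in each of those months, matching the SUMX over YearMonthNo in cumulative_customers, so the ratio can exceed 1 (as it does for April 2018 in this data).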
And here is your result:
I have a yearly goal for number of graduates. I want to distribute this yearly number to monthly level using predefined percent numbers.
I would like to get
jan = 1.7% * 292 = 4.96
feb = 1.4% * 292 = 4.09
etc...
The problem is that the yearly number of graduates has the date 2021-01-01 and a relationship to the date table, so it only works for the first month (the other months are blank). I cannot change the date relationship because I have other goals in the same table that use months.
Here are my measures
Graduates goal = CALCULATE( SUM(value), Measure = 'Graduates goal')
Goal% = CALCULATE( SUM(value), Measure = 'Graduates%')
Montly graduate target = CALCULATE( [Graduates goal] * [Goal%])
I have tried using ALL(Dates[Month]) and ALL(Dates[Year]), but I cannot get past the month-level restriction on the yearly goal.
Update:
I was able to solve this with CROSSFILTER, something like this:
Montly graduate target = CALCULATE([Graduates goal], CROSSFILTER( Goals, Dates, None), YEAR(Pvm) = YEAR(TODAY())) * [Goal%]
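The intended arithmetic (yearly goal times each month's predefined share) can be sketched in Python to check the expected values. The function name monthly_targets is hypothetical, and only the two shares quoted in the question are included.

```python
def monthly_targets(yearly_goal, monthly_share):
    """Multiply the yearly goal by each month's share, rounded to 2 decimals."""
    return {
        month: round(yearly_goal * share, 2)
        for month, share in monthly_share.items()
    }


# Shares from the question: 1.7% for January and 1.4% for February.
shares = {"jan": 0.017, "feb": 0.014}
targets = monthly_targets(292, shares)
```

With a yearly goal of 292 this gives about 4.96 for January and about 4.09 for February.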
In Power BI I have 2 tables:
Table 1 (Total no of seats)
Venue 1 100 seats
Venue 2 150 seats
Table 2 (No of seats used)
Venue 1 40 seats
Venue 2 75 seats
I need to calculate the percentage of seats used,
e.g. (40/100) * 100 = 40%
Can someone help me?
**In the current database I can't join these tables.
I tried using the AVERAGE function, but it did not work.
Assuming your tables look like this:
Table 1:
Venue | Total Seats
-----------------------
Venue 1 | 100
Venue 2 | 150
Table 2:
Venue | Seats Used
----------------------
Venue 1 | 40
Venue 2 | 75
Create a relationship between Table 1 and Table 2 on field Venue, then you can create a measure:
Seat Utilisation =
DIVIDE (
    SUM ( 'Table 2'[Seats Used] ),
    SUM ( 'Table 1'[Total Seats] ),
    BLANK()
)
See https://pwrbi.com/so_55470616/ for an example PBIX file
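The measure above divides seats used by total seats per venue and returns blank when the divisor is missing or zero. A minimal Python sketch of the same calculation, using the sample values from the question (the function name seat_utilisation is hypothetical, and None stands in for BLANK()):

```python
def seat_utilisation(total_seats, seats_used):
    """Return used/total per venue, or None when the total is zero."""
    result = {}
    for venue, total in total_seats.items():
        used = seats_used.get(venue, 0)
        result[venue] = used / total if total else None  # guard like DIVIDE
    return result


total_seats = {"Venue 1": 100, "Venue 2": 150}
seats_used = {"Venue 1": 40, "Venue 2": 75}
utilisation = seat_utilisation(total_seats, seats_used)
```

Multiplying each value by 100 gives the percentages asked for, 40% for Venue 1 and 50% for Venue 2.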