Calculate age cluster dynamically in DAX - Power BI - powerbi

I have created a report for the HR department. One of the visuals aims to display the pyramid of ages of the employees by gender (we take cluster of 5 years, e.g. 20-25 for people between 20 and 25 years old).
To make it simple, data wise, I have a table with the list of employees, including their date of birth and many other fields, not relevant for this post. I added a calculated column with the age cluster based on the today’s date:
AgeCluster =
VAR AgeCalc=if(HR_DATA[Birthdate]=blank(),blank(),DATEDIFF(HR_DATA[Birthdate],today(),YEAR))
VAR Mult5=INT(AgeCalc/5)
RETURN
if(isblank(AgeCalc),blank(),5*Mult5&"-"&5*(Mult5+1))
And I have a basic visual (tornado chart with the AgeCluster in Group, showing male and female)
Now my issue is that my report should be dynamic, so the user should be able to see the situation in the past or in the future... I have a calendar table (not linked to my HR_Data table), and a date slicer on my report's page. I need the age cluster to be recalculated.
I have tried a calculated table, but I can’t get it working properly. I have read various blog posts on similar issue, but still can't figure out how to solve it...
Any idea or tips much appreciated.
Thank you so much!

Dynamic filtering usually means in these situations that you need a fixed x-axis to work with. That x-axis is calculated outside the original table, such as a parameter table.
Assuming your data looks like this:
Table: Employees
EmployeeID
DOB
1
20 March 1977
2
05 December 1981
3
25 December 1951
4
20 December 1954
5
04 March 1980
6
24 July 1968
7
07 June 1984
8
01 October 1992
9
25 February 1999
10
02 November 1987
First, we need to create the Age Groups with their respective lower and top bands. This table uses a similar code when a parameter table is created.
Table: Buckets
Because you are using whole numbers, I think is best to calculate the TopBand by adding 4. In that case, you don't have repeating numbers. In other words, ranges shouldn't overlap (20-25 and 25-30)
Buckets =
SELECTCOLUMNS (
GENERATESERIES ( 20, 80, 5 ),
"Age Series",
[Value] & " - " & [Value] + 4,
"LowerBand", [Value],
"TopBand", [Value] + 4
)
Also, we will need a Calendar Table to filter accordingly.
Table: CalendarHR
CalendarHR =
ADDCOLUMNS (
CALENDAR ( MIN ( Employees[DOB] ), TODAY () ),
"Year", YEAR ( [Date] )
)
With all that, you can create a calculation that can be filter dynamically.
DAX Measure:
EmployeesbyBracket =
VAR _SelectedDate =
MAX ( CalendarHR[Date] )
VAR _SelectedLowerBand =
SELECTEDVALUE ( Buckets[LowerBand] )
VAR _SelectedTopBand =
SELECTEDVALUE ( Buckets[TopBand] )
VAR EmployeesAge =
ADDCOLUMNS (
Employees,
"Age",
INT ( DATEDIFF ( [DOB], _SelectedDate, YEAR ) / 5 ) * 5
)
RETURN
COUNTROWS (
FILTER (
EmployeesAge,
[Age] >= _SelectedLowerBand
&& [Age] <= _SelectedTopBand
)
)
Output
I've created a table visual with the Age Groups from the Buckets table.

Both a calculated column and a calculated table will be static and not achieve your desired objective. You need age to be calculated via a measure.
Create a disconnected table with your age buckets and then implement the dynamic segmentation pattern.

Related

Counting a conditional flag within a group over a subgroup in Power BI

I am looking for help in calculating a total by month, that counts how many applications met an aggregated criteria within that month.
A simplified version of the data is that we receive applications, and complete a number of tasks to process that application. Each task has a duration, and adding all the tasks for an application gives the total processing time for the application. We then want to know how many applications were processed in 10 days or less.
Note - the tasks to sum can vary, so we want to calculate the application total duration at the DAX layer, rather than in PowerQuery.
Example raw data:
Application
Task
Period
Duration
A
2859
Aug-22
2
A
2860
Aug-22
2
A
2861
Aug-22
1
B
1990
Aug-22
1
B
1991
Aug-22
20
C
9940
Sep-22
0
C
9941
Sep-22
27
D
1891
Sep-22
1
D
1892
Sep-22
8
E
3697
Sep-22
-
E
3698
Sep-22
26
The calculation condition would then look like this (we don't need to create this view, I'm just including it to explain the logic):
The actual output we're looking to create is a total by month (based on when application was received):
I've got a version of this working by creating a summary table in PowerQuery that makes a unique list of Application and Period combinations, and then links back to the main table to sum the duration for each, but I feel like it should be possible to do this using a DAX formula - as much for my own learning as anything else. Any suggestions on a) how to do it directly in DAX and b) whether this is sensible would be greatly appreciated!
You'll have to adjust the table and column names, but you can accomplish this logic with the following measures. The variable applicationsLessThan10 is a virtual table by Month and application and total duration for each. This is then filtered accordingly and we return the number of rows.
I split out month name into a new column with powerquery - ideally, you would have a date table which would account for year and month and the sorting of the month name.
LessThan10Days =
VAR applicationsLessThan10 =
FILTER (
ADDCOLUMNS (
SUMMARIZE (
ApplicationProcessing,
ApplicationProcessing[Month Name],
ApplicationProcessing[Application]
),
"Days", CALCULATE ( SUM ( ApplicationProcessing[Duration] ) )
),
[Days] <= 10
)
RETURN
COUNTROWS ( applicationsLessThan10 )
MoreThan10Days =
VAR applicationsMoreThan10 =
FILTER (
ADDCOLUMNS (
SUMMARIZE (
ApplicationProcessing,
ApplicationProcessing[Month Name],
ApplicationProcessing[Application]
),
"Days", CALCULATE ( SUM ( ApplicationProcessing[Duration] ) )
),
[Days] > 10
)
RETURN
COUNTROWS ( applicationsMoreThan10 )

Calculating the cumulative values in Power BI

I have two tables:
user table (contains: user registration data. columns: user_id, create_date)
customer order table (contains: history of orders. columns: user_id, order_date, order_id)
*user and customer aren't the same. when a user registers his first order, he becomes a customer.
For each month of each year, I want the accumulative count of distinct users and the accumulative count of the distinct customers because at last, I want to calculate the ratio of the accumulative count of the distinct customers to the accumulative count of the distinct users for each month.
I don't know how can I calculate the accumulative values and the Ratio that I said, using DAX.
Note that if a customer registers more than one order in a month, I want to count him just once for that month and if he registers a new order in the next months, also I count him in each new month.
Maybe these pictures help you to understand my question better.
-I don't count_of_users and count_of_customers columns in my tables. I should calculate them.
the user table:
user_id
create_date
1
2017-12-03
2
2018-01-01
3
2018-01-01
4
2018-02-04
5
2018-03-10
6
2018-04-07
7
2018-04-08
8
2018-09-12
9
2018-10-02
10
2018-10-02
11
2018-10-09
12
2018-10-11
13
2018-10-12
14
2018-10-12
15
2018-10-20
the customer order table:
user_id
order_date
order_id
1
2018-03-28
120
1
2018-03-28
514
1
2018-03-30
426
2
2018-02-11
125
2
2018-03-01
547
3
2018-02-10
588
3
2018-04-03
111
4
2018-02-10
697
5
2018-04-02
403
5
2018-04-05
321
6
2018-04-09
909
11
2018-10-25
8401
You need a few building blocks for this. Here is the data model I used:
<edit>
I see user_id in the different tables are not the same, in that case you can omit the relationship between the tables and the two relationships from the Calendar table will both be active - with no need to change the relationship semantics in the count_of_customer measure. </edit>
The calendar table is important because we can't rely on one single date column to aggregate data from different tables, so we create a common calendar table with this sample DAX code:
Calendar =
ADDCOLUMNS (
CALENDARAUTO () ,
"Year" , YEAR ( [Date] ) ,
"Month" , FORMAT ( [Date] , "MMM" ) ,
"Month-Year" , FORMAT ( [Date] , "MMM")&"-"&YEAR ( [Date] ) ,
"YearMonthNo" , YEAR ( [Date] ) * 12 + MONTH ( [Date] ) - 1
)
Make sure to sort the Month-Year column by the YearMonthNo column so your tables look nice:
Set your relationships as shown with the active relationship from Calendar to user - if not the measures will not work unless you alter the relationships accordingly in the code! In my data model the inactive relationship is between Calendar and customer order.
Next up are the measures we will use for this. First off we count the users, a simple row count:
count_of_users = COUNTROWS ( user )
Then we count distinct user ids in the order table to count customers, here we need to use the inactive relationship between Calendar and customer order and to do this we have to invoke CALCULATE:
count_of_customers =
CALCULATE (
DISTINCTCOUNT ( 'customer order'[user_id] ) ,
USERELATIONSHIP (
'Calendar'[Date] ,
'customer order'[order_date]
)
)
We can use this measure to count users cumulatively:
cumulative_users =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
CALCULATE (
[count_of_users] ,
ALL ( 'Calendar' ) ,
'Calendar'[Date] <= _maxVisibleDate
)
And this measure to count cumulative customers per month:
cumulative_customers =
VAR _maxVisibleDate = MAX ( 'Calendar'[Date] )
RETURN
CALCULATE (
SUMX (
VALUES ( 'Calendar'[YearMonthNo] ) ,
[count_of_customers]
),
ALL ( 'Calendar' ) ,
'Calendar'[Date] <= _maxVisibleDate
)
Lastly we want the ratio of these last cumulative measures:
cumulative_customers/users =
DIVIDE (
[cumulative_customers] ,
[cumulative_users]
)
And here is your result:

Custom Calendar for sales data aggregated by weeks and months

I am struggling with creating Custom Calendar that would allow me to use time intelligence function on data that is already aggregated by weeks and months. The original table with transactions contains over 20M rows, so for performance and space saving reasons the grouping is necessary and I perform it while querying the database in SQL Server. I want to create:
Month vs Same Month Last Year
Week vs Same Week Last Year
Year-To-Date vs Last Year Year-To-Date (by Months)
Year-To-Date vs Last Year Year-To-Date (by Weeks)
My idea was to assign some dummy dates to each row of data, and then create custom calendar that would have each of these dummy dates along with other details. I just cannot figure out the key (how to create dummy date having Year number Month number and Week number - please keep in mind that for example Week 5 of 2020 is partially in January and partially in February)
Is it possible? Maybe I have to create 2 separate Calendars, one for Weeks and one for Months?
Add calculated table:
Calendar =
GENERATE (
CALENDAR (
DATE ( 2016, 1, 1 ),
DATE ( 2020, 12, 31 )
),
VAR VarDates = [Date]
VAR VarDay = DAY ( VarDates )
VAR VarMonth = MONTH ( VarDates )
VAR VarYear = YEAR ( VarDates )
VAR YM_text = FORMAT ( [Date], "yyyy-MM" )
VAR Y_week = YEAR ( VarDates ) & "." & WEEKNUM(VarDates)
RETURN
ROW (
"day" , VarDay,
"month" , VarMonth,
"year" , VarYear,
"YM_text" , YM_text,
"Y_week" , Y_week
)
)
You can customize it. In relation pane connect your Date field of FactTable (which is by week, as you say. This table can have missing dates and duplicate dates, to be precise) to the field Date of the Calendar table (which has unique Date, by day). Then, in all your visuals or measures always use the the Date field of the Calendar table. It is a good practice to hide the field Date in your FactTable.
More explanations here https://stackoverflow.com/a/54980662/1903793

Rolling 6 month Open Contracts in Power BI

I need to show how many active contracts we have open for each month in the last 6 months. I am trying to figure out a way to display this. Here is my table
Machine Enrollment# StartDate EndDate
A 1 1/2/2016 6/18/2019
B 2 12/15/2012 5/12/2034
C 3 3/25/2019 4/25/2021
D 4 1/7/2000 7/15/2019
A 5 10/1/2019 10/1/2025
I have thousands of rows. I want to be able to show a rolling 6 month visual for how many machines are under contract. So in this small example it would look like this
Apr-19 June-19 Jul-19 Aug-19 Sep-19 Oct-19
4 4 3 2 2 3
Where do I even begin in creating this? In the past, we have just looked at the numbers for the current month and tacked those results onto the end of a static table and deleted the column from over 6 months ago. I have been assigned to automate this report in Power BI. I am guessing I need to create a column/measure that looks at the EndDate and compares it to the filtered Date in the visual (ie: Aug-19) and determines if the contract was open at that time. But I do not know. Any help is much appreciated. Thanks in advance!
I think I found a solution for what you are looking for. You may find a sample pbix file here.
1. Create a calendar table
A calendar table is required to filter/slice the time periods. The calendar table needs to have a unique Date column, and optional columns such as Year, Quarter, and Month, depending on what units of period you need in the analysis.
A calendar table can be most easily created as a DAX calculated table. Here is an example of a minimal calendar table required in this use case.
Calendar =
ADDCOLUMNS(
CALENDAR( MIN( Contracts[StartDate] ), MAX( Contracts[EndDate] ) ),
"Year Month", FORMAT( [Date], "mmm-yy" ),
"Year Month Number", YEAR( [Date] ) * 100 + MONTH( [Date] )
)
2. Create a measure to calculate number of open contracts
Every numbers calculated and shown in reports need to be defined as measures.
Let's think about the number of May-19. The current filter context includes all 31 dates in the Calendar table between 2019-05-01 and 2019-05-31. In this case, how can we think of an open contract? If the contract starts after 2019-05-31, it is not open. If the contract ends before 2019-05-01, it is not open as well. Therefore the open contract meets this condition.
Starts on or before 2019-05-31 and
Ends on or after 2019-05-01
Below is the measure definition to count the number of contracts based on this condition.
# Open Contracts =
VAR MinDate = MIN( 'Calendar'[Date] )
VAR MaxDate = MAX( 'Calendar'[Date] )
RETURN COUNTROWS(
FILTER(
Contracts,
Contracts[StartDate] <= MaxDate
&& Contracts[EndDate] >= MinDate
)
)
3. Add dynamic filter for last 6 months
If I was understanding correctly, the requirement is to show monthly number of last 6 calendar months, excluding this month. I could not find a straightforward way for this. My solution may contain a bit of hacky scent.
Power BI does not have built-in filter support based on calendar months relative to now. We need to build a custom logic to achieve this. I did it by creating a measure that indicates whether current filter context is within the desired period. This measure is a flag that returns 1 if the filter context is a single calendar month which is included in the last 6 calendar months, or returns BLANK otherwise.
__Last6MonthFlag =
VAR YearMonths = CALCULATETABLE(
VALUES( 'Calendar'[Year Month] ),
REMOVEFILTERS( 'Calendar'[Year Month] ),
'Calendar'[Date] > EOMONTH( TODAY(), -7 )
&& 'Calendar'[Date] <= EOMONTH( TODAY(), -1 )
)
RETURN IF(
HASONEVALUE( 'Calendar'[Year Month] )
&& SELECTEDVALUE( 'Calendar'[Year Month] ) IN YearMonths,
1
)
Then I used this measure in the visual filter like this.
You need to do below activities to achieve your requirement.
Define a calendar table holding all the dates for your report. Define a calculated column for Month-Year. Month-Year = FORMAT('CalendarTable'[Date], "MM-YYYY")
You can define a calculated measure, which will return 1 if the end date of the contract is < 6 months from current date (you can use TODAY() function). This function will help in rolling calculation based on current date. Otherwise, this calculated measure will return NULL and SUM them.
You can drag the month-Year calculated column, defined in step no. 1, to the column axis. You can drag the calculated measure, defined in step no.2, to the values section. PowerBI control by default filters out the NULL values. So, when you use the calculated measure defined in step no. 2, you will get values only for the last 6 months. As the calculation is defined based on TODAY(), it will be a running calculation.

Working with Duration (Availability) in PowerBi

I'm trying to create a line graph in PowerBI. What I am trying to plot is somewhat complex.
I have the following tables:
Staffing - this table describes staffing for every employee in the company. By "staffing" I'm referring to how their time is allocated. For example, Employee #7 is staffed in "Chicken Manufacturing" with a StartDate of 1/1/2016 and an EndDate of 1/10/2016
EmployeeID Project StartDate EndDate
5 Cutting Lemons 12/1/2015 12/31/2015
5 Chicken Manufacturing 1/1/2016 1/10/2016
6 Fishing Lobsters 1/2/2016 1/5/2016
7 Chicken Manufacturing 1/5/2016 2/1/2016
8 Drinking 2/1/2016 null
I also have a standard Date dimension as well as an Employees table and a Project table. The Employees table has a row for every employee and the Project has a row for each activity.
I am trying to create a line graph that has dates on the x-axis and the line will show me how many employees are active at the given date. So for dates 12/1/2015 - 1/10/2016 Employee 5 should be counted as "Staffed" but on 1/11/2016, he should not be included in the total.
What I'm actually trying to do is calculate Availability and by that I mean, how many Employee hours are available on each day (I have a project called Available) so ultimately I will want to count the number of Hours rather than employees, but I think if I can get to work with counting employees, I shouldn't have too much trouble multiplying number of employees by 8 hours per day.
Try something like this:
Count of Emp =
CALCULATE (
DISTINCTCOUNT ( Employee[EmployeeID] ),
FILTER (
Staffing,
[StartDate] <= MAX ( 'Date'[Date] )
&& (
[EndDate] >= MAX ( 'Date'[Date] )
|| ISBLANK ( [EndDate] )
)
)
)
It is not tested but should work as long as you have relatinship between Employee - Staffing. Also be sure to use your date column in the Axis setting.
Let me know if this helps.