I have a table sales:
idcustomer | year of birth | amount | salesdate
-----------|---------------|--------|----------
112 | 1970 | 200 | 12/02/2022
12 | 1980 | 400 | 12/03/2012
122 | 1990 | 600 | 12/04/2012
300 | 1977 | 20 | 12/06/2012
500 | 1996 | 250 | 12/04/2012
I need to see how different age groups perform and how much is sold per month and year, grouped by year of birth in 5-year bands, like 1980-1984, 1985-1989. I'd like the age group to be created dynamically as a new column, in Power Query for example.
Not exactly clear what you want.
I assumed you always wanted your groupings to start on a multiple of five.
Create a list of groupings based on the earliest year of birth, rounded down to the nearest multiple of five.
Add a column that holds the start year of the grouping.
Add another column that takes that year and creates the text you want for your grouping.
Original data
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table24"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"idcustomer", Int64.Type}, {"year of birth", Int64.Type},
{"amount", Int64.Type}, {"salesdate", type date}}),
//add grouping column depending on min/max year of birth
//round firstYear down to a multiple of 5
firstYear = Number.IntegerDivide(List.Min(#"Changed Type"[year of birth]), 5) * 5,
lastYear = List.Max(#"Changed Type"[year of birth]),
//create list of groupings
groupings = List.Numbers(firstYear, Number.IntegerDivide(lastYear-firstYear,5)+1,5),
//group for first year selected from the list
#"First Year" = Table.AddColumn(#"Changed Type","firstYear",
each List.Last(List.Select(groupings, (li)=> li <= [year of birth])), Int64.Type),
//grouping column added as text
#"Add Grouping Column" = Table.AddColumn(#"First Year","Grouper",
each Text.From([firstYear]) & "-" & Text.From([firstYear]+4),type text),
//remove first year column
#"Removed Columns" = Table.RemoveColumns(#"Add Grouping Column",{"firstYear"})
in
#"Removed Columns"
Results
Your question is poorly described, but the sample Power Query code below sets up columns for year, month, and bucket. You can then group on the bucket column and one of the other columns to do whatever math you want (a grouping sketch follows the code).
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"idcustomer", Int64.Type}, {"year of birth", Int64.Type}, {"amount", Int64.Type}, {"salesdate", type date}}),
// add columns for year, month and year/month
// you could probably do some math on the year instead, but too lazy to figure it out
#"Added Custom1" = Table.AddColumn(#"Changed Type", "SalesYear", each Date.Year([salesdate])),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "SalesMonth", each Date.Month([salesdate])),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "MonthYear", each #date(Date.Year([salesdate]),Date.Month([salesdate]),1)),
// generate list of 5-year bucket start years, from 1940 to 2045 (covering years of birth through 2049)
List = Table.FromList(List.Transform(List.Generate(() => 1940, each _ < 2050, each _ + 5), each Text.From(_)), null, {"Bucket"}),
#"Added Custom" = Table.AddColumn(List, "Year", each {Number.From([Bucket]) .. Number.From([Bucket])+4 }),
#"Expanded Year" = Table.ExpandListColumn(#"Added Custom", "Year"),
// add bucket column
#"Merged Queries" = Table.NestedJoin(#"Added Custom3", {"year of birth"}, #"Expanded Year", {"Year"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Bucket"}, {"Bucket"})
in #"Expanded Table2"
Hi, I need to add 1000+ calculated columns in Power BI which provide the count per entry, for example (* means calculated column):
ID | RankCode | Count_RankCode* | RankAdvance | Count_RankAdvance*
-----|----------|-----------------|-------------|-------------------
1000 | AAA | 2 | XYZ | 2
1001 | AAA | 2 | XYA | 1
1002 | AAB | 1 | XYZ | 2
I found the right way to count in Power BI DAX:
COUNTROWS(FILTER('24Jun_1973',[rankCode]=EARLIER([rankCode])))
Requirement:
Add 1000 columns that count rows, probably in one piece of DAX code,
or create the 1000 count columns in the Power Query M language (it needs to be fast, since the raw data is 60 GB).
As suggested by #smpa01, I was able to complete this task using Tabular Editor. I just used the DAX script in Tabular Editor and put all my measures in there; since the expressions just repeat, I was able to build them all in Excel, and voilà, 1000 measures added.
example:
Measure '24Jun_1973'[measure]=calculate(COUNTROWS(FILTER('28Jun_1973',[rankcode]='28Jun_1973'[rankcode])))
Measure '24Jun_1973'[measure2]=calculate(COUNTROWS(FILTER('28Jun_1973',[rankAdvance]='28Jun_1973'[rankAdvance])))
I have no idea why you would want another 1000 columns.
If you really want to, though, in Power Query you could unpivot, group and count, append the results to the original data, then re-pivot. I don't know how fast it would be; I suspect not very.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"RankCode", type text}, {"RankAdvance", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID"}, "Attribute", "Value"),
// group and count
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Attribute", "Value"}, {{"Count", each Table.RowCount(_), type number}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Grouped Rows", "Attribute", "Attribute - Copy"),
#"Change column name" = Table.TransformColumns(#"Duplicated Column",{{"Attribute - Copy", each "count_" & _, type text}}),
// append back to original table, then repivot
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns",{"Attribute", "Value"},#"Change column name",{"Attribute", "Value"},"Table2",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Count", "Attribute - Copy"}, {"Count", "Attribute - Copy"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table1",{"Attribute", "Value"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Count", "Value"}, {"Attribute - Copy", "Attribute"}}),
combined = #"Unpivoted Other Columns" & #"Renamed Columns",
#"Pivoted Column" = Table.Pivot(combined, List.Distinct(combined[Attribute]), "Attribute", "Value")
in #"Pivoted Column"
I have Table A with data of specific people doing their tasks like this:
I have Table B with data of needs for specific people for different periods of time like this:
I also have additional table C with period definitions:
Period no | Date from | Date to
--------------------------------------
1 | 27/01/2021 | 24/02/2021
2 | 25/02/2021 | 24/03/2021
...
There are 2 problems here:
Someone in Table A can have Start and End dates spanning multiple periods, like for example Human B
The Start and End dates may not encompass whole Periods; they can be, for example, just a couple of days. So there's an algorithm that calculates whether this counts as a period or not:
if this is less than 5 days, then it doesn't count
if this is between 6 and 14 days, then it's 0.5 period
if it's more than 14 days, then it's 1 period
So now I want to merge Table A with Table B, to compare needs with what was delivered, for every period. The question is how to go with this?
My first thought was to add columns to Table A for Period and Quantity, to be able to group & merge over it - but what about when this deployment can span over multiple periods? Also how to implement this conditional logic for periods?
I think this works
Pull in Period definitions as Table1
Add a custom column using formula
= {Number.From([Date from])..Number.From([Date to])}
And then expand that to rows. That gives you a match for every date to every period
File .. Close and Load ... Connection
Full sample code for that part is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period no", Int64.Type}, {"Date from", type date}, {"Date to", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Date from])..Number.From([Date to])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Date from", "Date to"})
in #"Removed Columns"
Pull in your TableA, called Table2 here
Add custom column with similar formula and expand to rows
= {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}
Now merge the other table into this one and pull in period
Click to select the Type and Period columns and group them, pulling in the maximum and minimum values of the new custom column
Add custom column for working duration with formula
= 1+[DayMax]-[DayMin]
Then add a custom column to apply your algo
= if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1
Remove extra columns. Done. File ... Close and Load ... Connection
You can merge this back into your Table B as needed (a sketch of that merge follows the full code below)
Full code for this table
let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Type", type text}, {"Start of Deployment", type date}, {"End of Deployment", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Start of Deployment", "End of Deployment"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"Custom"},Table1,{"Custom"},"Table1",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Period no"}, {"Period no"}),
#"Grouped Rows" = Table.Group(#"Expanded Table1", {"Type", "Period no"}, {{"DayMin", each List.Min([Custom]), type number}, {"DayMax", each List.Max([Custom]), type number}}),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Duration", each 1+[DayMax]-[DayMin]),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Algo", each if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"DayMin", "DayMax", "Duration"})
in #"Removed Columns1"
I have a performance monitoring table in which users are listed vertically and months are listed across the header. These months are dynamically generated on a 12-month rolling window. At the beginning of each month, one month falls off the back of the query and another appears at the front. After the beginning of the month, I get the following error until I manually re-run the report:
The '<#MONTH>' column does not exist in the rowset.
Where '<#MONTH>' is the month that gets dropped off, e.g. if it is Sept 2019, it would be 'August 2018'.
I tried adding a window to the query that moved the start of the query ahead a day to try to eliminate the perceived race condition. This did not work.
Here is the M query I have currently:
let
Source = US_TOTALS,
#"Appended Query" = Table.SelectRows(Table.Combine({Source, CA_TOTALS}), each [DATELASTFULFILLMENT] >= #"This Year"),
#"Reordered Columns" = Table.ReorderColumns(#"Appended Query",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "Customer", "GROUP"}),
#"Inserted Start of Month" = Table.AddColumn(#"Reordered Columns", "StartOfMonth", each Date.StartOfMonth([DATELASTFULFILLMENT]), type date),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Start of Month",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "StartOfMonth", "GROUP", "Customer"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns1",{"Customer"}),
#"Date One" = Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),1),
#"Date Two" = Date.AddYears(Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),0),-1),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each [DATELASTFULFILLMENT] < Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),1) and [DATELASTFULFILLMENT] >= Date.AddYears(Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),0),-1)),
//#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each Date.From([DATELASTFULFILLMENT]) < #"Date One" and Date.From([DATELASTFULFILLMENT]) >= #"Date Two"),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"SALESMAN", "RegionName", "StartOfMonth", "GROUP"}, {{"Total", each List.Sum([Total Sales]), type number}}),
#"Reordered Columns2" = Table.ReorderColumns(#"Grouped Rows",{"SALESMAN", "RegionName", "GROUP", "StartOfMonth", "Total"}),
#"Inserted Month Name" = Table.AddColumn(#"Reordered Columns2", "Month Name", each Date.MonthName([StartOfMonth]), type text),
#"Inserted Year" = Table.AddColumn(#"Inserted Month Name", "Year", each Date.Year([StartOfMonth]), type number),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Inserted Year", {{"Year", type text}}, "en-US"),{"Month Name", "Year"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Month"),
#"Removed Columns2" = Table.RemoveColumns(#"Merged Columns",{"StartOfMonth", "GROUP"}),
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"SALESMAN", "Month", "RegionName"}, {{"Total", each List.Sum([Total]), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"SALESMAN", Order.Ascending}}),
#"Pivoted Columns" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[Month]), "Month", "Total", List.Sum),
Columns = List.RemoveFirstN(Table.ColumnNames(#"Pivoted Columns"),2),
#"Replaced Value" = Table.ReplaceValue(#"Pivoted Columns",null,0,Replacer.ReplaceValue,Columns),
#"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",List.Transform(Columns, each {_, Currency.Type })),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"SALESMAN"},USER_MAPPING_COMBINED,{"USERNAME"},"USER_MAPPING_COMBINED",JoinKind.LeftOuter),
#"Expanded USER_MAPPING" = Table.ExpandTableColumn(#"Merged Queries", "USER_MAPPING_COMBINED", {"NAME"}, {"NAME"})
in
#"Expanded USER_MAPPING"
Expected Results:
The query refreshes as normal
Actual Results:
The query errors with The '<#MONTH>' column does not exist in the rowset.
Where '<#MONTH>' is the month that gets dropped off, e.g. if it is Sept 2019, it would be 'August 2018'.
screenshot for reference:
Ok, I'm going to take a second swing at this. If I'm way off base, I apologize. But let me bring out an example to simulate your error.
let
Seed = Number.Mod(Number.Round(Time.Second(DateTime.LocalNow())), 7) + 1,
Source = List.Generate(()=> Seed, each _ < Seed + 6, each _ + 1),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"MonthNumbers"}, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "MonthNames", each Date.MonthName(#date(2019, [MonthNumbers], 1))),
#"Pivoted Column" = Table.Pivot(#"Added Custom", List.Distinct(#"Added Custom"[MonthNames]), "MonthNames", "MonthNumbers", List.Sum)
in
#"Pivoted Column"
So, this creates a table that looks like so:
But every single second, the frame shifts by a month. So if you refresh the preview every second, you'll see the frame slide Jan-Jun, Feb-Jul, Mar-Aug... when December is the last month in the window, it skips back to the Jan-Jun. The point is, the columns are changing.
Now you try to load this model. It does not work!
From the time you create the model with one column set to the time the data is loaded, the columns have changed, so one of the columns isn't there any more. This is what happens when your month changes. When you go in and refresh manually, you update your model and things work fine until the next change in columns. But a scheduled load doesn't update the model; it just tries to load the data and runs into this column mismatch.
So, how do we fix it without losing this dynamic naming? Let's look at that pivot... what if we don't do it and leave our Power Query looking like this?
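That is, the same demo query with the final Table.Pivot step dropped:
let
Seed = Number.Mod(Number.Round(Time.Second(DateTime.LocalNow())), 7) + 1,
Source = List.Generate(()=> Seed, each _ < Seed + 6, each _ + 1),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"MonthNumbers"}, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "MonthNames", each Date.MonthName(#date(2019, [MonthNumbers], 1)))
in
#"Added Custom"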
Now the column names won't change when we load it into the model. We create a matrix visualization like so, and do some refreshes:
No errors, nice dynamic headers.
So, that's the approach that I think you need. I hope it helps.
Edit: Per comment, this answer shows how to deal with a source that comes with changing column names over time, which is not the problem the asker has.
#RyanB is correct. The right approach here is to do your crosstab layout in reports rather than in the data model. The right way, in general, to deal with things that change is to reify them as data rather than as schema.
Original post below
You're looking for the 'Unpivot other columns' transform:
Select the columns whose names do not change.
Use 'Unpivot other columns' transform
Rename columns
Deal with months as a single month column
Make sure this comes before any steps that depend on the changing column names.
Here are two sample queries that are identical in code except for the source, which has differently named columns:
// query1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"May-2019" = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"
// query 2
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t, #"Sep-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"
Distribute a value evenly across a start date and end date into daily buckets.
Working with a Date Table
The TBTR value is split evenly between Today's Date and the End Date.
There are nine days (3/29-4/7); each day would get a value of 2.91, bucketed by day, so that 2.91 per day for that period could ultimately be graphed.
Here you go; it will create extra rows that you can use to your liking:
let
Source = Excel.Workbook(File.Contents("C:\...\Test.xlsx"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Added Custom1" = Table.AddColumn(#"Promoted Headers", "TBTRAverage", each [TBTR] / Duration.Days([End date]-[Todays date])),
#"Added Custom" = Table.AddColumn(#"Added Custom1", "Date", each let EndThisRow = [End date] in List.Generate(()=>[Todays date], each _ <= EndThisRow , each Date.AddDays( _ , 1))),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Date"),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Date",{{"Date", type date}})
in
#"Changed Type"
Result:
My question is: is there a way to get the first/earliest date per grouping in a table and then filter the table to only include rows within the first x months of that first date, per grouping? Probably easiest to ask with an example. Say I have the following table, and want to keep data for the first 6 months of each Group:
The resulting table would look like:
Is there a way to accomplish this with DAX or M?
This seems to work:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Group", type text}, {"Date", type date}, {"Quantity", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Group"}, {{"AllData", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each let DateThisRow = [AllData][Date] in Table.SelectRows([AllData],each [Date] <= Date.AddMonths(List.Min(DateThisRow),6))),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Group", "Date", "Quantity"}, {"Group", "Date", "Quantity"})
in
#"Expanded Custom"
Starting from your original table setup, named Table1 in Excel:
...it gives me this as an end result:
The number 6 near the end of the #"Added Custom" line in the M code above is the number of months.