I have a table, sales:
idcustomer | year of birth | amount | salesdate
112 | 1970 | 200 | 12/02/2022
12 | 1980 | 400 | 12/03/2012
122 | 1990 | 600 | 12/04/2012
300 | 1977 | 20 | 12/06/2012
500 | 1996 | 250 | 12/04/2012
I need to see how different age groups perform: how much the sales are per month and per year, grouped by year of birth in 5-year bands such as 1980-1984 and 1985-1989. I'd like the age group to be created dynamically as a new column, in Power Query for example.
It is not exactly clear what you want.
I assumed you always want your groupings to start on a multiple of five.
Create a list of groupings, starting from the earliest year of birth rounded down to the nearest multiple of five.
Add a column that holds the start year of each row's grouping.
Add another column that takes that year and builds the text you want for the grouping.
Original data
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table24"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"idcustomer", Int64.Type}, {"year of birth", Int64.Type},
{"amount", Int64.Type}, {"salesdate", type date}}),
//add grouping column depending on min/max year of birth
//round firstYear down to a multiple of 5
firstYear =Number.IntegerDivide(List.Min(#"Changed Type"[year of birth]),5)*5,
lastYear = List.Max(#"Changed Type"[year of birth]),
//create list of groupings
groupings = List.Numbers(firstYear, Number.IntegerDivide(lastYear-firstYear,5)+1,5),
//group for first year selected from the list
#"First Year" = Table.AddColumn(#"Changed Type","firstYear",
each List.Last(List.Select(groupings, (li)=> li <= [year of birth])), Int64.Type),
//grouping column added as text
#"Add Grouping Column" = Table.AddColumn(#"First Year","Grouper",
each Text.From([firstYear]) & "-" & Text.From([firstYear]+4),type text),
//remove first year column
#"Removed Columns" = Table.RemoveColumns(#"Add Grouping Column",{"firstYear"})
in
#"Removed Columns"
Results
Your question is not clearly described, but the sample Power Query code below sets up columns for the sales year, the sales month, and the birth-year bucket. You can then group on the bucket column plus one of the other columns and aggregate however you need.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"idcustomer", Int64.Type}, {"year of birth", Int64.Type}, {"amount", Int64.Type}, {"salesdate", type date}}),
// add columns for year, month and year/month
// you could probably do some math on the year instead, but too lazy to figure it out
#"Added Custom1" = Table.AddColumn(#"Changed Type", "SalesYear", each Date.Year([salesdate])),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "SalesMonth", each Date.Month([salesdate])),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "MonthYear", each #date(Date.Year([salesdate]),Date.Month([salesdate]),1)),
// generate list of buckets from 1940 through 2050, every 5 years
List = Table.FromList(List.Transform(List.Generate(() => 1940, each _ < 2050, each _ + 5), each Text.From(_)), null, {"Bucket"}),
#"Added Custom" = Table.AddColumn(List, "Year", each {Number.From([Bucket]) .. Number.From([Bucket])+4 }),
#"Expanded Year" = Table.ExpandListColumn(#"Added Custom", "Year"),
// add bucket column
#"Merged Queries" = Table.NestedJoin(#"Added Custom3", {"year of birth"}, #"Expanded Year", {"Year"}, "Table2", JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Bucket"}, {"Bucket"})
in #"Expanded Table2"
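A comment in the code above wonders whether the bucket could be computed with arithmetic instead of a lookup join. Here is a minimal sketch of that idea, followed by the grouping step the answer mentions. The inline `#table`, the step names, and the `TotalAmount` column are assumptions for illustration, and the sample dates are read as day/month/year, which the question does not confirm.

```powerquery
let
    // Sample rows copied from the question; dates assumed to be day/month/year
    Source = #table(
        type table [idcustomer = Int64.Type, #"year of birth" = Int64.Type, amount = Int64.Type, salesdate = date],
        {
            {112, 1970, 200, #date(2022, 2, 12)},
            {12, 1980, 400, #date(2012, 3, 12)},
            {122, 1990, 600, #date(2012, 4, 12)},
            {300, 1977, 20, #date(2012, 6, 12)},
            {500, 1996, 250, #date(2012, 4, 12)}
        }),
    // Round the birth year down to a multiple of 5 and build the "start-end" label
    AddBucket = Table.AddColumn(Source, "Bucket", each
        let start = Number.IntegerDivide([year of birth], 5) * 5
        in Text.From(start) & "-" & Text.From(start + 4), type text),
    AddYear = Table.AddColumn(AddBucket, "SalesYear", each Date.Year([salesdate]), Int64.Type),
    AddMonth = Table.AddColumn(AddYear, "SalesMonth", each Date.Month([salesdate]), Int64.Type),
    // Total sales per bucket, year and month
    Grouped = Table.Group(AddMonth, {"Bucket", "SalesYear", "SalesMonth"},
        {{"TotalAmount", each List.Sum([amount]), type number}})
in
    Grouped
```

With this approach no fixed 1940-2050 range is needed, since each row's bucket is derived from its own birth year.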
I have a source table that has projects with a start date and duration in months. I'm looking to write a PowerQuery for PowerBI that will create a row for each month of the project, counting up the months. For example:
Source:
Project(string) | Date (ms timestamp) | Duration (integer)
A | Jan-2022 | 3
B | Sep-2022 | 2
Result:
Project | Date
A | Jan-2022
A | Feb-2022
A | Mar-2022
B | Sep-2022
B | Oct-2022
Not sure where to start or what this query should look like. Any ideas?
Edit: Changed sample tables to make them readable
Edit: Dates in the source table are provided in millisecond timestamp format (eg 1641024000000). My intent in the result table is to have them in a human-readable date format.
Here is one way to do this in Power Query.
Paste the code into a blank query.
Then change the Source line so that it loads your actual data table.
I used an Excel table for the source, but you may use whatever you like.
My Source table also holds the Unix timestamp, which the M code converts to a Power Query date.
If not all of your timestamps fall on the start of a month, some additional logic may be required.
Read the code comments and explore the Applied Steps to understand the algorithm.
let
//Read in the Source data
Source = Excel.CurrentWorkbook(){[Name="Table27"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Project", type text}, {" Date", Int64.Type}, {" Duration", Int64.Type}}),
//convert date from unixTime in milliseconds to a PQ date
unixTime = Table.TransformColumns(#"Changed Type",{" Date", each #duration(0,0,0,_/1000)+#date(1970,1,1)}),
//add custom column with a List of the desired dates
#"Added Custom" = Table.AddColumn(unixTime, "Months", each
List.Accumulate(
{0..[#" Duration"]-1},
{},
(state,current)=> state & {Date.AddMonths([#" Date"],current)})),
//Remove unneeded columns
//Expand the list and set the data type
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{" Date", " Duration"}),
#"Expanded Months" = Table.ExpandListColumn(#"Removed Columns", "Months"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Months",{{"Months", type date}})
in
#"Changed Type1"
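The answer notes that extra logic may be needed when a timestamp does not fall exactly on the start of a month. One possible approach, shown here as a sketch rather than as part of the answer's method, is to snap the converted date to the first of its month with `Date.StartOfMonth`. The function name `ToMonthStart` and the second timestamp are hypothetical; the first timestamp is the example value from the question.

```powerquery
let
    // Convert a millisecond Unix timestamp to a date, then snap it to the first of its month
    ToMonthStart = (ms as number) as date =>
        Date.StartOfMonth(
            Date.From(#datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, ms / 1000))),
    // Example value from the question (2022-01-01 08:00 when read as UTC)
    FromQuestion = ToMonthStart(1641024000000),
    // Hypothetical mid-month value (2022-01-15 00:00 when read as UTC)
    MidMonth = ToMonthStart(1642204800000)
in
    [FromQuestion = FromQuestion, MidMonth = MidMonth]
```

Both fields evaluate to 1 January 2022, so `Date.AddMonths` would then step through clean month starts. No time zone conversion is applied here; whether the timestamps should be shifted to local time first is left open.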
For some reason sqlfiddle was down for me, so I made an example in db-fiddle using Postgres instead of MS SQL.
What you're looking to accomplish can be done with a recursive CTE. The syntax in MS SQL is slightly different, but this should get you most of the way there.
WITH RECURSIVE project_dates AS(
SELECT
start_date as starting_date,
CAST(start_date + duration*INTERVAL '1 month' as date) as end_date,
project
FROM projects
UNION
SELECT
CAST(starting_date + INTERVAL '1 month' as date),
pd.end_date,
p.project
FROM projects p
JOIN project_dates pd ON pd.project = p.project
WHERE CAST(starting_date + INTERVAL '1 month' as date) < pd.end_date
)
SELECT starting_date, project FROM project_dates
ORDER BY project, starting_date
My results using your data look like this.
You can check out my answer on db-fiddle with this link: https://www.db-fiddle.com/f/iS7uWFGwiMbEmFtNmhsiWt/0
Try the steps below:
Divide the milliseconds by 86400000 (the number of milliseconds in a day) and add the result, as days, to 1/1/1970 to get the date
Create a list based on Duration, expand it to rows, and add each value, as months, to the start date
Remove the extra columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ConvertToDays = Table.TransformColumns(Source,{{"Date", each Number.RoundDown(Number.From(_) / 86400000)}}),
// offset the epoch by each row's day count instead of a hard-coded 18993
#"Added Custom" = Table.AddColumn(ConvertToDays, "Custom", each Date.AddDays(#date(1970,1,1),[Date]), type date),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each List.Numbers(0,[Duration])),
#"Expanded Custom.1" = Table.ExpandListColumn(#"Added Custom1", "Custom.1"),
#"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Custom.2", each Date.AddMonths([Custom],[Custom.1]), type date),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Date", "Duration", "Custom", "Custom.1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.2", "Date"}}),
TextDate = Table.AddColumn(#"Renamed Columns", "TextDate", each Date.ToText([Date],"MMM-yy"))
in TextDate
I have Table A with data of specific people doing their tasks like this:
I have Table B with data of needs for specific people for different periods of time like this:
I also have additional table C with period definitions:
Period no | Date from | Date to
--------------------------------------
1 | 27/01/2021 | 24/02/2021
2 | 25/02/2021 | 24/03/2021
...
There are 2 problems here:
Someone in Table A can have Start and End dates spanning multiple periods, like for example Human B
The Start and End dates may not encompass whole Periods, they can be for example just for a couple of days. And so there's an algorithm that calculates whether this counts as a period or not:
if this is less than 5 days, then it doesn't count
if this is between 6 and 14 days, then it's 0.5 period
if it's more than 14 days, then it's 1 period
So now I want to merge Table A with Table B, to compare the needs with what was delivered, for every period. The question is how to go about this.
My first thought was to add Period and Quantity columns to Table A, so that I could group and merge on them. But what happens when a deployment spans multiple periods? And how do I implement the conditional logic for periods?
I think this works
Pull in Period definitions as Table1
Add a custom column using formula
= {Number.From([Date from])..Number.From([Date to])}
And then expand that to rows. That gives you a match for every date to every period
File .. Close and Load ... Connection
Full sample code for that part is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period no", Int64.Type}, {"Date from", type date}, {"Date to", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Date from])..Number.From([Date to])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Date from", "Date to"})
in #"Removed Columns"
Pull in your TableA, called Table2 here
Add custom column with similar formula and expand to rows
= {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}
Now merge the other table into this one and pull in the period
Select the type and period columns and group by them, pulling in the maximum and minimum dates from the new custom column
Add custom column for working duration with formula
= 1+[DayMax]-[DayMin]
Then add a custom column to apply your algo
= if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1
Remove extra columns. Done. File ... Close and Load ... Connection
You can merge this back into your Table B as needed
Full code for this table
let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Type", type text}, {"Start of Deployment", type date}, {"End of Deployment", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Start of Deployment", "End of Deployment"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"Custom"},Table1,{"Custom"},"Table1",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Period no"}, {"Period no"}),
#"Grouped Rows" = Table.Group(#"Expanded Table1", {"Type", "Period no"}, {{"DayMin", each List.Min([Custom]), type number}, {"DayMax", each List.Max([Custom]), type number}}),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Duration", each 1+[DayMax]-[DayMin]),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Algo", each if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"DayMin", "DayMax", "Duration"})
in #"Removed Columns1"
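The threshold rule on its own can be sketched as a small function. This covers only the day-count rule, not the split across periods that the merge above handles. The name `PeriodFraction` and the example dates are assumptions. Note also that the question leaves exactly 5 days unspecified; like the answer's code, this sketch counts anything under 6 days as 0.

```powerquery
let
    // Map an inclusive day count between two dates to a period fraction
    PeriodFraction = (startDate as date, endDate as date) as number =>
        let
            days = Duration.Days(endDate - startDate) + 1
        in
            if days < 6 then 0 else if days < 15 then 0.5 else 1,
    Examples = {
        PeriodFraction(#date(2021, 1, 27), #date(2021, 1, 30)),  // 4 days inclusive
        PeriodFraction(#date(2021, 1, 27), #date(2021, 2, 5)),   // 10 days inclusive
        PeriodFraction(#date(2021, 1, 27), #date(2021, 2, 24))   // 29 days inclusive
    }
in
    Examples
```

The three examples evaluate to 0, 0.5 and 1 respectively.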
I have a performance monitoring table in which users are listed vertically and then months are listed across the header. These months are dynamically generated on a 12-month rolling window. At the beginning of each month, one month falls off the back of the query and another appears at the front. After beginning of month, I get the following error until I manually re-run the report:
The '<#MONTH>' column does not exist in the rowset.
Where '<#MONTH>' is the month that gets dropped off, e.g. if it is Sept 2019, it would be 'August 2018'.
I tried adding a window to the query that moved the start of the query ahead a day to try to eliminate the perceived race condition. This did not work.
Here is the M query I have currently:
let
Source = US_TOTALS,
#"Appended Query" = Table.SelectRows(Table.Combine({Source, CA_TOTALS}), each [DATELASTFULFILLMENT] >= #"This Year"),
#"Reordered Columns" = Table.ReorderColumns(#"Appended Query",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "Customer", "GROUP"}),
#"Inserted Start of Month" = Table.AddColumn(#"Reordered Columns", "StartOfMonth", each Date.StartOfMonth([DATELASTFULFILLMENT]), type date),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Start of Month",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "StartOfMonth", "GROUP", "Customer"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns1",{"Customer"}),
#"Date One" = Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),1),
#"Date Two" = Date.AddYears(Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),0),-1),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each [DATELASTFULFILLMENT] < Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),1) and [DATELASTFULFILLMENT] >= Date.AddYears(Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),0),-1)),
//#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each Date.From([DATELASTFULFILLMENT]) < #"Date One" and Date.From([DATELASTFULFILLMENT]) >= #"Date Two"),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"SALESMAN", "RegionName", "StartOfMonth", "GROUP"}, {{"Total", each List.Sum([Total Sales]), type number}}),
#"Reordered Columns2" = Table.ReorderColumns(#"Grouped Rows",{"SALESMAN", "RegionName", "GROUP", "StartOfMonth", "Total"}),
#"Inserted Month Name" = Table.AddColumn(#"Reordered Columns2", "Month Name", each Date.MonthName([StartOfMonth]), type text),
#"Inserted Year" = Table.AddColumn(#"Inserted Month Name", "Year", each Date.Year([StartOfMonth]), type number),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Inserted Year", {{"Year", type text}}, "en-US"),{"Month Name", "Year"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Month"),
#"Removed Columns2" = Table.RemoveColumns(#"Merged Columns",{"StartOfMonth", "GROUP"}),
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"SALESMAN", "Month", "RegionName"}, {{"Total", each List.Sum([Total]), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"SALESMAN", Order.Ascending}}),
#"Pivoted Columns" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[Month]), "Month", "Total", List.Sum),
Columns = List.RemoveFirstN(Table.ColumnNames(#"Pivoted Columns"),2),
#"Replaced Value" = Table.ReplaceValue(#"Pivoted Columns",null,0,Replacer.ReplaceValue,Columns),
#"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",List.Transform(Columns, each {_, Currency.Type })),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"SALESMAN"},USER_MAPPING_COMBINED,{"USERNAME"},"USER_MAPPING_COMBINED",JoinKind.LeftOuter),
#"Expanded USER_MAPPING" = Table.ExpandTableColumn(#"Merged Queries", "USER_MAPPING_COMBINED", {"NAME"}, {"NAME"})
in
#"Expanded USER_MAPPING"
Expected Results:
The query refreshes as normal
Actual Results:
The query errors with The '<#MONTH>' column does not exist in the rowset.
Where '<#MONTH>' is the month that gets dropped off, e.g. if it is Sept 2019, it would be 'August 2018'.
screenshot for reference:
Ok, I'm going to take a second swing at this. If I'm way off base, I apologize. But let me bring out an example to simulate your error.
let
Seed = Number.Mod(Number.Round(Time.Second(DateTime.LocalNow())), 7) + 1,
Source = List.Generate(()=> Seed, each _ < Seed + 6, each _ + 1),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"MonthNumbers"}, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "MonthNames", each Date.MonthName(#date(2019, [MonthNumbers], 1))),
#"Pivoted Column" = Table.Pivot(#"Added Custom", List.Distinct(#"Added Custom"[MonthNames]), "MonthNames", "MonthNumbers", List.Sum)
in
#"Pivoted Column"
So, this creates a table that looks like so:
But every single second, the frame shifts by a month. So if you refresh the preview every second, you'll see the frame slide Jan-Jun, Feb-Jul, Mar-Aug, and so on; once December is the last month in the window, it skips back to Jan-Jun. The point is that the columns are changing.
Now you try to load this model. It does not work!
From the time you create the model with one column set, to the time it takes to load the column set, those columns have been changed and so one of the columns isn't there any more. This is like when your month changes. When you go in and load manually, you'll update your model and things will work fine until the next change in columns. But when you're doing it with the scheduled load, that doesn't update the model, it just tries to load the data and runs into this column mismatch.
So, how do we fix it without losing this dynamic naming? Let's look at that pivot... what if we don't do it and leave our power query looking like this?
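The original screenshot is not reproduced here. As a sketch, assuming nothing else changes, the simulation query with the pivot step simply dropped would look like this:

```powerquery
let
    // Same shifting six-month window as in the simulation above
    Seed = Number.Mod(Number.Round(Time.Second(DateTime.LocalNow())), 7) + 1,
    Source = List.Generate(() => Seed, each _ < Seed + 6, each _ + 1),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"MonthNumbers"}, null, ExtraValues.Error),
    // The month names stay as row values, so the output always has the same two columns
    #"Added Custom" = Table.AddColumn(#"Converted to Table", "MonthNames", each Date.MonthName(#date(2019, [MonthNumbers], 1)), type text)
in
    #"Added Custom"
```

The schema is then always MonthNumbers and MonthNames, whichever months are currently in the window.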
Now the column names won't change when we load it into the model. We create a matrix visualization like so, and do some refreshes:
No errors, nice dynamic headers.
So, that's the approach that I think you need. I hope it helps.
Edit: Per comment, this answer shows how to deal with a source that comes with changing column names over time, which is not the problem the asker has.
@RyanB is correct. The right approach here is to do your crosstab layout in the report rather than in the data model. In general, the right way to deal with things that change is to reify them as data rather than as schema.
Original post below
You're looking for the 'Unpivot other columns' transform:
Select the columns whose names do not change.
Use 'Unpivot other columns' transform
Rename columns
Deal with months as a single month column
Make sure this comes before any steps that depend on the changing column names.
Here are two sample queries that are identical in code except for the source, which has differently named columns:
// query1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"May-2019" = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"
// query 2
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t, #"Sep-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"
Is there a way to get the first (earliest) date per grouping in a table, and then filter the table to include only the rows that fall within the first x months after that date, per grouping? It is probably easiest to ask with an example. Say I have the following table and want to keep the data for the first 6 months of each Group:
The resulting table would look like:
Is there a way to accomplish this with DAX or M?
This seems to work:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Group", type text}, {"Date", type date}, {"Quantity", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Group"}, {{"AllData", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each let DateThisRow = [AllData][Date] in Table.SelectRows([AllData],each [Date] <= Date.AddMonths(List.Min(DateThisRow),6))),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Group", "Date", "Quantity"}, {"Group", "Date", "Quantity"})
in
#"Expanded Custom"
Starting from your original table setup, named as Table1 in Excel:
...it gives me this as an end result:
The number 6 near the end of the #"Added Custom" line in the M code above is the number of months.
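The answer's filter uses `<=`, so a row dated exactly 6 months after the first date is kept. If a strict cut-off is wanted instead, a variant could look like the sketch below. The sample rows, the `MonthsToKeep` name, and the use of `Table.Combine` in place of expanding a column are all assumptions for illustration, since the question's table is not reproduced here.

```powerquery
let
    // Hypothetical sample data
    Source = #table(
        type table [Group = text, Date = date, Quantity = Int64.Type],
        {
            {"A", #date(2019, 1, 1), 10},
            {"A", #date(2019, 6, 30), 20},
            {"A", #date(2019, 7, 1), 30},
            {"B", #date(2019, 3, 15), 5},
            {"B", #date(2019, 12, 1), 7}
        }),
    MonthsToKeep = 6,
    // For each group, keep rows strictly before (earliest date + MonthsToKeep months)
    Grouped = Table.Group(Source, {"Group"}, {{"AllData", each
        let
            cutoff = Date.AddMonths(List.Min([Date]), MonthsToKeep)
        in
            Table.SelectRows(_, (r) => r[Date] < cutoff), type table}}),
    // Stack the filtered per-group tables back into one table
    Result = Table.Combine(Grouped[AllData])
in
    Result
```

With these rows, group A keeps 1 January and 30 June but drops 1 July, and group B keeps only 15 March.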