Query to return multiple rows based on the value in a column - powerbi

I have a source table that has projects with a start date and duration in months. I'm looking to write a PowerQuery for PowerBI that will create a row for each month of the project, counting up the months. For example:
Source:
Project(string) | Date (ms timestamp) | Duration (integer)
A | Jan-2022 | 3
B | Sep-2022 | 2
Result:
Project | Date
A | Jan-2022
A | Feb-2022
A | Mar-2022
B | Sep-2022
B | Oct-2022
Not sure where to start or what this query should look like. Any ideas?
Edit: Changed sample tables to make them readable
Edit: Dates in the source table are provided in millisecond timestamp format (eg 1641024000000). My intent in the result table is to have them in a human-readable date format.

Here is one way to do this in Power Query.
Paste the code into a blank query.
Then change the Source line so that it loads your actual data table.
I used an Excel table for the source, but you may use whatever source you like.
My Source table also holds the Unix timestamp, which the M code converts to a PQ date.
If not all of your timestamps fall on the start of a month, some additional logic may be required.
Read the code comments and explore the Applied Steps to understand the algorithm.
let
//Read in the Source data
Source = Excel.CurrentWorkbook(){[Name="Table27"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Project", type text}, {" Date", Int64.Type}, {" Duration", Int64.Type}}),
//convert date from unixTime in milliseconds to a PQ date
unixTime = Table.TransformColumns(#"Changed Type",{" Date", each #duration(0,0,0,_/1000)+#date(1970,1,1)}),
//add custom column with a List of the desired dates
#"Added Custom" = Table.AddColumn(unixTime, "Months", each
List.Accumulate(
{0..[#" Duration"]-1},
{},
(state,current)=> state & {Date.AddMonths([#" Date"],current)})),
//Remove unneeded columns
//Expand the list and set the data type
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{" Date", " Duration"}),
#"Expanded Months" = Table.ExpandListColumn(#"Removed Columns", "Months"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Months",{{"Months", type date}})
in
#"Changed Type1"
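On the caveat about timestamps that do not fall on the first of a month: one possible way to handle it is to snap each converted date to the start of its month before adding months. This is only a minimal sketch, assuming the timestamps are UTC milliseconds. The inline table and the second timestamp (a mid-September 2022 value) are made up for illustration and stand in for the workbook source.

```powerquery
let
    // Hypothetical inline sample; 1641024000000 is the example value from the question,
    // 1663200000000 is an invented mid-month value (15 Sep 2022 UTC)
    Source = #table(
        type table [Project = text, Date = Int64.Type, Duration = Int64.Type],
        {{"A", 1641024000000, 3}, {"B", 1663200000000, 2}}
    ),
    // Milliseconds -> datetime -> date, then snap to the first day of that month
    ToMonthStart = Table.TransformColumns(
        Source,
        {{"Date", each Date.StartOfMonth(
            Date.From(#datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, _ / 1000))), type date}}
    ),
    // One list entry per project month: offsets 0 .. Duration-1
    AddMonths = Table.AddColumn(
        ToMonthStart, "Month",
        each List.Transform(List.Numbers(0, [Duration]), (n) => Date.AddMonths([Date], n))
    ),
    Expanded = Table.ExpandListColumn(Table.SelectColumns(AddMonths, {"Project", "Month"}), "Month"),
    Typed = Table.TransformColumnTypes(Expanded, {{"Month", type date}})
in
    Typed
```

With the snapping step in place, project B's rows start at 1 Sep 2022 rather than 15 Sep 2022.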

For some reason sqlfiddle was down for me so I made an example in db-fiddle using postgres instead of ms-sql.
What you're looking to accomplish can be done with a recursive CTE. The syntax in MS-SQL is slightly different, but this should get you most of the way there.
WITH RECURSIVE project_dates AS(
SELECT
start_date as starting_date,
CAST(start_date + duration*INTERVAL '1 month' as date) as end_date,
project
FROM projects
UNION
SELECT
CAST(starting_date + INTERVAL '1 month' as date),
pd.end_date,
p.project
FROM projects p
JOIN project_dates pd ON pd.project = p.project
WHERE CAST(starting_date + INTERVAL '1 month' as date) < pd.end_date
)
SELECT starting_date, project FROM project_dates
ORDER BY project, starting_date
My results using your data look as such.
You can check out my answer on db-fiddle with this link: https://www.db-fiddle.com/f/iS7uWFGwiMbEmFtNmhsiWt/0

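Try the steps below:
Divide the milliseconds by 86400000 (the number of milliseconds in a day) and add the result, as days, to 1/1/1970 to get the date
Create a list of month offsets based on Duration, expand it to rows, and add each offset to the start date
Remove the extra columns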
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ConvertToDays = Table.TransformColumns(Source,{{"Date", each Number.RoundDown(Number.From(_) / 86400000)}}),
#"Added Custom" = Table.AddColumn(ConvertToDays, "Custom", each Date.AddDays(#date(1970,1,1),[Date])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each List.Numbers(0,[Duration])),
#"Expanded Custom.1" = Table.ExpandListColumn(#"Added Custom1", "Custom.1"),
#"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Custom.2", each Date.AddMonths([Custom],[Custom.1]), type date),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Date", "Duration", "Custom", "Custom.1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.2", "Date"}}),
TextDate = Table.AddColumn(#"Renamed Columns", "TextDate", each Date.ToText([Date],"MMM-yy"))
in TextDate
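The two building blocks of that approach (whole days since 1970 and a list of month offsets) can be tried on a single value. A small sketch, using the sample timestamp from the question and an assumed duration of 3:

```powerquery
let
    Ms = 1641024000000,                                  // sample value from the question
    Days = Number.RoundDown(Ms / 86400000),              // whole days since 1970-01-01 (18993 here)
    StartDate = Date.AddDays(#date(1970, 1, 1), Days),   // 1 Jan 2022
    Offsets = List.Numbers(0, 3),                        // {0, 1, 2} for a duration of 3
    Months = List.Transform(Offsets, each Date.AddMonths(StartDate, _))
in
    Months   // 1 Jan 2022, 1 Feb 2022, 1 Mar 2022
```

Note that List.Numbers takes a start value and a count, so a duration of 3 yields the offsets 0, 1 and 2.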

Related

PowerBI DAX Create Table of Hours from Project List and Start of Week Table

I am trying to create a new table that contains a column with start of week and the hours estimated to be spent for that week on a per project basis.
Start of Week | Project | Hours
6/20/2022 | ABC_XXX | 10
6/27/2022 | ABC_XXX | 10
6/20/2022 | ABC_YYY | 40
6/27/2022 | ABC_YYY | 40
I have a table of dates representing the start of week for every project in the project table.
week start date = [date]-weekday([date],2)+1
Start of Week
6/20/2022
6/27/2022
7/4/2022
The project table contains (among other things) the project name, estimated start date, duration, and hours per week.
Project Name | Estimated Start Date | Duration in weeks | Hours Per Week
ABC_XXX | 6/13/2022 | 8 | 10
ABC_YYY | 6/04/2022 | 27 | 40
I am having trouble getting off the starting line. I know I need to evaluate on a per project basis and loop through all of the dates in my date table but can't find a good method to start with. I have done a lot of more simple things with creating new tables and calculations but this one is a little more complicated for me to get started. Any advice would be greatly appreciated.
The ultimate goal for this data is to present a trend showing estimated project demand over time that can be filtered by project or summed across all projects as well as filtered by timeline and displayed in a calendar view but it all starts with getting the data into this format I believe.
Here's a Power Query solution. The steps in the code below are:
use Date.AddWeeks to calculate the end date
List dates between two dates
Expand the list of dates and convert to date format
Use Date.DayOfWeek to create a day of week column
filter the table for day of week = 1 to include only weekly values (starting on Monday)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRyjo+IiFDSUTLTNzTWNzIwMgKyLYDY0EApVgeiIDIyEqzABCZvZA4kTBAKoqKigALmCAVAOaAqpdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Project Name" = _t, #"Estimated Start Date" = _t, #"Duration in weeks" = _t, #"Hours Per Week" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Project Name", type text}, {"Estimated Start Date", type date}, {"Duration in weeks", Int64.Type}, {"Hours Per Week", Int64.Type}}),
#"Added Custom1" = Table.AddColumn(#"Changed Type", "Estimated End Date", each Date.AddWeeks([Estimated Start Date],[Duration in weeks])),
#"Added Custom" = Table.AddColumn(#"Added Custom1", "dates", each {Number.From([Estimated Start Date])..Number.From([Estimated End Date])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "dates"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Custom",{{"dates", type date}}),
#"Added Custom4" = Table.AddColumn(#"Changed Type1", "day of week", each Date.DayOfWeek([dates])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom4", each ([day of week] = 1))
in
#"Filtered Rows"
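A possible alternative, sketched here under the assumption that exactly one row per project week is wanted: generate the Mondays directly with List.Dates instead of listing every day and filtering. The inline table is a hypothetical copy of the two sample projects from the question.

```powerquery
let
    // Hypothetical inline copy of the sample project table
    Projects = #table(
        type table [#"Project Name" = text, #"Estimated Start Date" = date,
                    #"Duration in weeks" = Int64.Type, #"Hours Per Week" = Int64.Type],
        {{"ABC_XXX", #date(2022, 6, 13), 8, 10}, {"ABC_YYY", #date(2022, 6, 4), 27, 40}}
    ),
    // One Monday per project week, starting with the week that contains the start date
    AddWeeks = Table.AddColumn(
        Projects, "Start of Week",
        each List.Dates(
            Date.StartOfWeek([Estimated Start Date], Day.Monday),
            [Duration in weeks],
            #duration(7, 0, 0, 0)
        )
    ),
    Expanded = Table.ExpandListColumn(AddWeeks, "Start of Week"),
    Result = Table.TransformColumnTypes(
        Table.SelectColumns(Expanded, {"Start of Week", "Project Name", "Hours Per Week"}),
        {{"Start of Week", type date}}
    )
in
    Result
```

This differs from the query above in two ways: it always returns exactly [Duration in weeks] rows per project, and it counts the week containing the start date even when the project starts mid-week. Whether that matches the intended demand calculation is a judgment call.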

Power BI: how to merge specific tables

I have Table A with data of specific people doing their tasks like this:
I have Table B with data of needs for specific people for different periods of time like this:
I also have additional table C with period definitions:
Period no | Date from | Date to
--------------------------------------
1 | 27/01/2021 | 24/02/2021
2 | 25/02/2021 | 24/03/2021
...
There are 2 problems here:
Someone in Table A can have Start and End dates spanning multiple periods, like for example Human B
The Start and End dates may not encompass whole Periods, they can be for example just for a couple of days. And so there's an algorithm that calculates whether this counts as a period or not:
if this is less than 5 days, then it doesn't count
if this is between 6 and 14 days, then it's 0.5 period
if it's more than 14 days, then it's 1 period
So now I want to merge Table A with Table B, to compare needs with what was delivered, for every period. The question is how to go with this?
My first thought was to add columns to Table A for Period and Quantity, to be able to group & merge over it - but what about when this deployment can span over multiple periods? Also how to implement this conditional logic for periods?
I think this works
Pull in Period definitions as Table1
Add a custom column using formula
= {Number.From([Date from])..Number.From([Date to])}
And then expand that to rows. That gives you a match for every date to every period
File .. Close and Load ... Connection
Full sample code for that part is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period no", Int64.Type}, {"Date from", type date}, {"Date to", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Date from])..Number.From([Date to])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Date from", "Date to"})
in #"Removed Columns"
Pull in your TableA, called Table2 here
Add custom column with similar formula and expand to rows
= {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}
Now merge the other table into this one and pull in period
Select the Type and Period no columns and group by them, pulling in the maximum and minimum dates from the new custom column
Add custom column for working duration with formula
= 1+[DayMax]-[DayMin]
Then add a custom column to apply your algo
= if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1
Remove extra columns. Done. File ... Close and Load ... Connection
You can merge this back into your Table B as needed
Full code for this table
let Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Type", type text}, {"Start of Deployment", type date}, {"End of Deployment", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each {Number.From([Start of Deployment]) .. Number.From([End of Deployment])}),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Start of Deployment", "End of Deployment"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"Custom"},Table1,{"Custom"},"Table1",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"Period no"}, {"Period no"}),
#"Grouped Rows" = Table.Group(#"Expanded Table1", {"Type", "Period no"}, {{"DayMin", each List.Min([Custom]), type number}, {"DayMax", each List.Max([Custom]), type number}}),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows", "Duration", each 1+[DayMax]-[DayMin]),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Algo", each if [Duration]<6 then 0 else if [Duration] <15 then 0.5 else 1),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"DayMin", "DayMax", "Duration"})
in #"Removed Columns1"
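The threshold rule used in the Algo step can be checked in isolation. A small sketch, with PeriodWeight as a hypothetical helper name that does not appear in the queries above:

```powerquery
let
    // Hypothetical helper: maps the number of days covered within one period
    // to the fraction of a period that counts
    PeriodWeight = (days as number) as number =>
        if days < 6 then 0
        else if days < 15 then 0.5
        else 1,
    // Try the boundaries on both sides of each threshold
    Check = List.Transform({3, 5, 6, 14, 15, 28}, PeriodWeight)
in
    Check   // {0, 0, 0.5, 0.5, 1, 1}
```

The question leaves a span of exactly 5 days unspecified ("less than 5" versus "between 6 and 14"). The formula above treats it as 0, which is an interpretation rather than something the question states.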

Power BI - How to pull only Sunday dates within Power Query

I have generated a list of dates in Power Query. I want it to only show dates that are Sunday. Is there a way to do this? I want the outcome to actually show a date.
Example: 3/15/2020, 3/22/2020, 3/29/2020
Add a custom column that references the date column using the Date.DayOfWeekName() function, then filter on that column and remove it:
#"Added Custom" = Table.AddColumn(#"PreviousStep", "Custom", each Date.DayOfWeekName([date])),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = "Sunday")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"})
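A variant that skips the helper column is to filter on the day number directly. A minimal sketch, assuming a column named date and using a made-up list of the days in March 2020:

```powerquery
let
    // Hypothetical date list: the 31 days of March 2020
    Dates = Table.FromColumns(
        {List.Dates(#date(2020, 3, 1), 31, #duration(1, 0, 0, 0))},
        type table [date = date]
    ),
    // With Day.Sunday as the first day of the week, Sunday is numbered 0
    Sundays = Table.SelectRows(Dates, each Date.DayOfWeek([date], Day.Sunday) = 0)
in
    Sundays
```

Comparing a day number instead of the text "Sunday" also avoids depending on the language in which day names are returned.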

How can I properly use variable month columns in a PowerBI query?

I have a performance monitoring table in which users are listed vertically and months are listed across the header. These months are dynamically generated on a 12-month rolling window. At the beginning of each month, one month falls off the back of the query and another appears at the front. After the beginning of the month, I get the following error until I manually re-run the report:
The '<#MONTH>' column does not exist in the rowset.
Where '<#MONTH>' is the month that gets dropped off, e.g. if it is Sept 2019, it would be 'August 2018'.
I tried adding a window to the query that moved the start of the query ahead a day to try to eliminate the perceived race condition. This did not work.
Here is the M query I have currently:
let
Source = US_TOTALS,
#"Appended Query" = Table.SelectRows(Table.Combine({Source, CA_TOTALS}), each [DATELASTFULFILLMENT] >= #"This Year"),
#"Reordered Columns" = Table.ReorderColumns(#"Appended Query",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "Customer", "GROUP"}),
#"Inserted Start of Month" = Table.AddColumn(#"Reordered Columns", "StartOfMonth", each Date.StartOfMonth([DATELASTFULFILLMENT]), type date),
#"Reordered Columns1" = Table.ReorderColumns(#"Inserted Start of Month",{"SALESMAN", "RegionName", "DATELASTFULFILLMENT", "Total Sales", "StartOfMonth", "GROUP", "Customer"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns1",{"Customer"}),
#"Date One" = Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),1),
#"Date Two" = Date.AddYears(Date.AddMonths(Date.StartOfMonth(Date.AddDays(DateTime.Date(DateTime.LocalNow()),1)),0),-1),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each [DATELASTFULFILLMENT] < Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),1) and [DATELASTFULFILLMENT] >= Date.AddYears(Date.AddMonths(Date.StartOfMonth(DateTime.Date(DateTime.LocalNow())),0),-1)),
//#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each Date.From([DATELASTFULFILLMENT]) < #"Date One" and Date.From([DATELASTFULFILLMENT]) >= #"Date Two"),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"SALESMAN", "RegionName", "StartOfMonth", "GROUP"}, {{"Total", each List.Sum([Total Sales]), type number}}),
#"Reordered Columns2" = Table.ReorderColumns(#"Grouped Rows",{"SALESMAN", "RegionName", "GROUP", "StartOfMonth", "Total"}),
#"Inserted Month Name" = Table.AddColumn(#"Reordered Columns2", "Month Name", each Date.MonthName([StartOfMonth]), type text),
#"Inserted Year" = Table.AddColumn(#"Inserted Month Name", "Year", each Date.Year([StartOfMonth]), type number),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Inserted Year", {{"Year", type text}}, "en-US"),{"Month Name", "Year"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Month"),
#"Removed Columns2" = Table.RemoveColumns(#"Merged Columns",{"StartOfMonth", "GROUP"}),
#"Grouped Rows1" = Table.Group(#"Removed Columns2", {"SALESMAN", "Month", "RegionName"}, {{"Total", each List.Sum([Total]), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"SALESMAN", Order.Ascending}}),
#"Pivoted Columns" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[Month]), "Month", "Total", List.Sum),
Columns = List.RemoveFirstN(Table.ColumnNames(#"Pivoted Columns"),2),
#"Replaced Value" = Table.ReplaceValue(#"Pivoted Columns",null,0,Replacer.ReplaceValue,Columns),
#"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",List.Transform(Columns, each {_, Currency.Type })),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"SALESMAN"},USER_MAPPING_COMBINED,{"USERNAME"},"USER_MAPPING_COMBINED",JoinKind.LeftOuter),
#"Expanded USER_MAPPING" = Table.ExpandTableColumn(#"Merged Queries", "USER_MAPPING_COMBINED", {"NAME"}, {"NAME"})
in
#"Expanded USER_MAPPING"
Expected Results:
The query refreshes as normal
Actual Results:
The query errors with The '<#MONTH>' column does not exist in the rowset.
Ok, I'm going to take a second swing at this. If I'm way off base, I apologize. But let me bring out an example to simulate your error.
let
Seed = Number.Mod(Number.Round(Time.Second(DateTime.LocalNow())), 7) + 1,
Source = List.Generate(()=> Seed, each _ < Seed + 6, each _ + 1),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), {"MonthNumbers"}, null, ExtraValues.Error),
#"Added Custom" = Table.AddColumn(#"Converted to Table", "MonthNames", each Date.MonthName(#date(2019, [MonthNumbers], 1))),
#"Pivoted Column" = Table.Pivot(#"Added Custom", List.Distinct(#"Added Custom"[MonthNames]), "MonthNames", "MonthNumbers", List.Sum)
in
#"Pivoted Column"
So, this creates a table that looks like so:
But every second, the frame shifts by a month. So if you refresh the preview every second, you'll see the frame slide Jan-Jun, Feb-Jul, Mar-Aug, and so on. When December is the last month in the window, it skips back to Jan-Jun. The point is, the columns are changing.
Now you try to load this model. It does not work!
From the time you create the model with one column set, to the time it takes to load the column set, those columns have been changed and so one of the columns isn't there any more. This is like when your month changes. When you go in and load manually, you'll update your model and things will work fine until the next change in columns. But when you're doing it with the scheduled load, that doesn't update the model, it just tries to load the data and runs into this column mismatch.
So, how do we fix it without losing this dynamic naming? Let's look at that pivot... what if we don't do it and leave our power query looking like this?
Now the column names won't change when we load it into the model. We create a matrix visualization like so, and do some refreshes:
No errors, nice dynamic headers.
So, that's the approach that I think you need. I hope it helps.
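The un-pivoted shape described above can be sketched as follows. The sample rows and the Month label column are hypothetical; the point is only that the query ends with fixed column names and keeps the months as row values.

```powerquery
let
    // Hypothetical rows standing in for the grouped sales data (one row per salesman and month)
    Sales = #table(
        type table [SALESMAN = text, StartOfMonth = date, Total = number],
        {
            {"alice", #date(2019, 8, 1), 100},
            {"alice", #date(2019, 9, 1), 150},
            {"bob",   #date(2019, 9, 1), 75}
        }
    ),
    // Add a display label; the schema stays the same no matter which months are in the window
    WithLabel = Table.AddColumn(
        Sales, "Month",
        each Date.ToText([StartOfMonth], "MMMM yyyy", "en-US"),
        type text
    )
in
    WithLabel
```

In the report, a matrix visual with Month on columns and Total as the value then produces the crosstab, and StartOfMonth can be used to sort the labels chronologically.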
Edit: Per comment, this answer shows how to deal with a source that comes with changing column names over time, which is not the problem the asker has.
@RyanB is correct. The right approach here is to do your crosstab layout in the report rather than in the data model. The right way, in general, to deal with things that change is to reify them as data, rather than as schema.
Original post below
You're looking for the 'Unpivot other columns' transform:
Select the columns whose names do not change.
Use 'Unpivot other columns' transform
Rename columns
Deal with months as a single month column
Make sure this comes before any steps that depend on the changing column names.
Here are two sample queries that are identical in code except for the source, which has differently named columns:
// query1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"May-2019" = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"
// query 2
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RU85FsQgCL2LdQoWUTiLL8VkJve/QliSTKF8/oK4VsO2tY8fpLyEe1aWKm3fVgvpiF7SBNLZZulQQLrD9LK339I6mAOYpoJBo5tmUOgUYNr7YwfTiUwShA/VyL/wnufhyMRqv1oHRswTJANNBshGAWN63edNkf41j4GJvzseMn67Xw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Somethingstatic = _t, #"Jun-2019" = _t, #"Jul-2019" = _t, #"Aug-2019" = _t, #"Sep-2019" = _t]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"ID", "Somethingstatic"}, "Attribute", "Value"),
#"Renamed Columns" = Table.RenameColumns(#"Unpivoted Other Columns",{{"Attribute", "Month"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", Int64.Type}, {"ID", Int64.Type}})
in
#"Changed Type"

How to Find the Most Current Date From a Column in Power Query - MAX()

This is for a Power Query:
I am working on a report that compiles information from different dates. I need one column showing the most recent date in the list and another column showing the date just before it:
Most Current Date must be the same for the whole column (same for Previous Date Column)
Table Name : Skipped_Issue
Worker |Case |Report_Date |MOST_CURRENT_DATE |PREVIOUS_DATE
Tran |3000 |1/2018
Dhni |52451 |4/2018
Dhtuni |39656 |2/2018
For the most recent date, you can create a custom column with this formula:
= Date.From(List.Max(NameOfPreviousStep[Report_Date]))
Where NameOfPreviousStep references the prior step in your query (e.g. #"Changed Type" or Source).
To get the second to last date, you can create a custom column that evaluates the max after removing the MOST_CURRENT_DATE
= Date.From(
List.Max(
List.RemoveItems(#"Added Custom"[Report_Date],
#"Added Custom"[MOST_CURRENT_DATE])))
Here's the whole query for the sample data:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCilKzFPSUTI2MDAAUob6hvpGBoYWSrE60UouGXmZQDFTIxNTQyBtgipXUgqWNbY0MzUD0kZw2VgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Worker = _t, Case = _t, Report_Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Worker", type text}, {"Case", Int64.Type}, {"Report_Date", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "MOST_CURRENT_DATE", each Date.From(List.Max(#"Changed Type"[Report_Date])), type date),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "PREVIOUS_DATE", each Date.From(List.Max(List.RemoveItems(#"Added Custom"[Report_Date], #"Added Custom"[MOST_CURRENT_DATE]))), type date)
in
#"Added Custom1"
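Since both values are the same for every row, they can also be computed once as separate steps and then referenced from the custom columns. A minimal sketch on a plain list of dates, with made-up values matching the sample rows:

```powerquery
let
    // Hypothetical report dates (1/2018, 4/2018, 2/2018 from the sample)
    Dates = {#date(2018, 1, 1), #date(2018, 4, 1), #date(2018, 2, 1)},
    MostCurrent = List.Max(Dates),
    // Largest date strictly earlier than the most current one
    Previous = List.Max(List.Select(Dates, each _ < MostCurrent)),
    Result = [MOST_CURRENT_DATE = MostCurrent, PREVIOUS_DATE = Previous]
in
    Result   // MOST_CURRENT_DATE = 1 Apr 2018, PREVIOUS_DATE = 1 Feb 2018
```

In a table query, the same steps would read the list from a typed column such as #"Changed Type"[Report_Date], and the custom columns become each MostCurrent and each Previous.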
@Alexis-Olson That's useful, my respect for lists goes up! I needed to get the max date row for each item (worker), so I wrote code like this:
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Worker"}, {{"AllDates", each _, type table}}),
#"Added ReportDate List" = Table.AddColumn(#"Grouped Rows", "ReportDates", each [AllDates][Report_Date]),
#"Added MaxReportDate" = Table.AddColumn(#"Added ReportDate List", "Report_Date", each List.Max([ReportDates])),
and then merged back to get the single item for the max date for each worker. I'm finding Grouped Rows with all rows handy when I need a list of a column.
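If only the latest date per worker is needed, the grouping can aggregate directly, without the intermediate table and list columns. A sketch under that assumption, with invented sample rows and step names:

```powerquery
let
    // Hypothetical sample with two report dates for one worker
    Source = #table(
        type table [Worker = text, Case = Int64.Type, Report_Date = date],
        {
            {"Tran", 3000, #date(2018, 1, 1)},
            {"Tran", 3001, #date(2018, 3, 1)},
            {"Dhni", 52451, #date(2018, 4, 1)}
        }
    ),
    // Aggregate straight to the latest date per worker
    MaxPerWorker = Table.Group(
        Source, {"Worker"},
        {{"Report_Date", each List.Max([Report_Date]), type nullable date}}
    ),
    // Inner join back on both keys to keep only each worker's latest row
    Joined = Table.NestedJoin(
        Source, {"Worker", "Report_Date"},
        MaxPerWorker, {"Worker", "Report_Date"},
        "Max", JoinKind.Inner
    ),
    LatestRows = Table.RemoveColumns(Joined, {"Max"})
in
    LatestRows
```

The all-rows grouping from the comment remains the more flexible option when other columns of the group are needed as lists.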