Replace list in cell by values from other query - vlookup

I have an issue defining a smart data replacement in Power Query.
I am querying data from multiple SharePoint lists to build a report.
When I need to replace values in a column that contains only one number per cell, I use the Merge Queries function as a "vlookup" replacement.
The problem starts when a column contains multiple numbers separated by semicolons.
Example
Source list:
| Unique ID | Name | Assignees_ID|
|-|-|-|
| Epic1 | Blabla1| 1 |
|Epic2 | Blabla2| 1;2;3|
"Vlookup_list" query:
|Assignees_ID|Assignees_Names|
|-|-|
|1|Mark|
|2|Irina|
|3|Bart|
Expected output:
| Unique ID | Name | Assignees_ID |Assignees_Names |
|-|-|-| - |
| Epic1 | Blabla1| 1 | Mark|
|Epic2 | Blabla2| 1;2;3| Mark; Irina; Bart|
So is there a smart way to perform such a transformation? I have tried several approaches, but my knowledge is too limited to get it working.
Kind regards
Bartosz

In Power Query:
Load the Vlookup_list into Power Query and name the query VlookupNamesQuery. Then File .. Close and load to ... Create connection only
Load the example source list into Power Query (the code below assumes its first column is named UniqueID)
Right click the Assignees_ID column and split by delimiter on each semicolon, into rows
Merge in VlookupNamesQuery, matching on Assignees_ID with a left outer join. Expand using the arrows atop the new column to get Assignees_Names
Group on UniqueID and Name. Use Home ... Advanced Editor ... to modify the code so that Text.Combine puts back together the values that were split, as per below
let Source = Excel.CurrentWorkbook(){[Name="ExampleSourceListRange"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"UniqueID", type text}, {"Name", type text}, {"Assignees_ID", type text}}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Assignees_ID", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Assignees_ID"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Assignees_ID", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type1",{"Assignees_ID"},VlookupNamesQuery,{"Assignees_ID"},"Names",JoinKind.LeftOuter),
#"Expanded Names" = Table.ExpandTableColumn(#"Merged Queries", "Names", {"Assignees_Names"}, {"Assignees_Names"}),
#"Grouped Rows" = Table.Group(#"Expanded Names", {"UniqueID", "Name"}, {
{"Assignees_ID", each Text.Combine(List.Transform([Assignees_ID], Text.From), ";"), type text},
{"Assignees_Names", each Text.Combine(List.Transform([Assignees_Names], Text.From), "; "), type text}
})
in #"Grouped Rows"
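As an alternative to split, merge and group, the lookup can also be done inside a single custom column. The following is only a minimal sketch under stated assumptions: the two inline `#table` values stand in for the real queries, the IDs are kept as text on both sides, and the helper name `NameOf` is hypothetical. An ID with no match is left as-is.

```powerquery
let
    // Hypothetical inline sample data standing in for the two queries
    Source = #table(
        type table [UniqueID = text, Name = text, Assignees_ID = text],
        {{"Epic1", "Blabla1", "1"}, {"Epic2", "Blabla2", "1;2;3"}}),
    Lookup = #table(
        type table [Assignees_ID = text, Assignees_Names = text],
        {{"1", "Mark"}, {"2", "Irina"}, {"3", "Bart"}}),
    // Look up one ID; fall back to the ID itself when there is no match
    NameOf = (id as text) as text =>
        let
            matches = Table.SelectRows(Lookup, each [Assignees_ID] = id)[Assignees_Names]
        in
            if List.IsEmpty(matches) then id else matches{0},
    // Split each cell on ";", translate every ID, and join the names again
    AddedNames = Table.AddColumn(Source, "Assignees_Names",
        each Text.Combine(List.Transform(Text.Split([Assignees_ID], ";"), NameOf), "; "),
        type text)
in
    AddedNames
```

For large lists the merge-based approach is likely to perform better, because this sketch scans the lookup table once per ID.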

Convert a single row into multiple rows, depending on values in a specific column in Power Bi

I have rows of data which can have information in multiple columns that I need to extract and convert into an individual row for each.
E.g.
Original table
Headers are:
Product Code | Description | Location 1 | Location 2 | Location 3
and I need to convert it to:
Product Code | Description | Location
Some products will be available in multiple regions.
If a product is available in Germany and France, there may be a DE in the Location 1 column and an FR in the Location 2 column, while the Location 3 column will be blank.
I need to convert it so that there is a single location column with corresponding entries for each region that product had.
Desired output table
Is there a way to automate this in Power Bi?
Select the Product Code and Description columns, then:
Unpivot Other Columns
Remove the blank entries
Remove the Attribute column
I'm not sure how you want your results sorted, but you could easily add a sort step to the code below.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUfJILAGSLq5AAoRidaKVjICM4OTEojQg7RYElwVJGQMZ7jn5ZanFQEaoN5KC2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Product Code" = _t, Description = _t, #"Location 1" = _t, #"Location 2" = _t, #"Location 3" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Product Code", Int64.Type}, {"Description", type text}, {"Location 1", type text}, {"Location 2", type text}, {"Location 3", type text}}),
//Select the Product Code and Description columns
//then "Unpivot Other Columns"
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type",
{"Product Code", "Description"}, "Attribute", "Location"),
//Remove the blank locations and the "Attribute" column
#"Filtered Rows" = Table.SelectRows(#"Unpivoted Other Columns", each ([Location] <> "")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Attribute"})
in
#"Removed Columns"
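The answer leaves the sort order open. One possible ordering, sketched here on a small hypothetical result table, sorts by Product Code and then by Location; in the real query the same `Table.Sort` call would be appended as a final step.

```powerquery
let
    // Hypothetical unpivoted result, deliberately out of order
    Source = #table(
        type table [#"Product Code" = Int64.Type, Description = text, Location = text],
        {{2, "Scarf", "UK"}, {1, "Hat", "FR"}, {1, "Hat", "DE"}}),
    // Sort by product first, then alphabetically by location
    Sorted = Table.Sort(Source,
        {{"Product Code", Order.Ascending}, {"Location", Order.Ascending}})
in
    Sorted
```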

Query to return multiple rows based on the value in a column

I have a source table that has projects with a start date and duration in months. I'm looking to write a PowerQuery for PowerBI that will create a row for each month of the project, counting up the months. For example:
Source:
Project(string) | Date (ms timestamp) | Duration (integer)
A | Jan-2022 | 3
B | Sep-2022 | 2
Result:
Project | Date
A | Jan-2022
A | Feb-2022
A | Mar-2022
B | Sep-2022
B | Oct-2022
Not sure where to start or what this query should look like. Any ideas?
Edit: Changed sample tables to make them readable
Edit: Dates in the source table are provided in millisecond timestamp format (eg 1641024000000). My intent in the result table is to have them in a human-readable date format.
Here is one way to do this in Power Query.
Paste the code into a blank query.
Then change the Source line so that it loads your actual data table.
I used an Excel table for the source, but you may use whatever source you have.
My Source table also holds the Unix timestamp, which the M code converts to a Power Query date.
If some of your timestamps do not fall on the start of a month, additional logic may be required.
Read the code comments and explore the Applied Steps to understand the algorithm.
let
//Read in the Source data
Source = Excel.CurrentWorkbook(){[Name="Table27"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Project", type text}, {" Date", Int64.Type}, {" Duration", Int64.Type}}),
//convert date from unixTime in milliseconds to a PQ date
unixTime = Table.TransformColumns(#"Changed Type",{" Date", each #duration(0,0,0,_/1000)+#date(1970,1,1)}),
//add custom column with a List of the desired dates
#"Added Custom" = Table.AddColumn(unixTime, "Months", each
List.Accumulate(
{0..[#" Duration"]-1},
{},
(state,current)=> state & {Date.AddMonths([#" Date"],current)})),
//Remove unneeded columns
//Expand the list and set the data type
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{" Date", " Duration"}),
#"Expanded Months" = Table.ExpandListColumn(#"Removed Columns", "Months"),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Months",{{"Months", type date}})
in
#"Changed Type1"
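If some timestamps fall in the middle of a month, one option is to snap each start date to the first of its month before generating the list. This is only a sketch: the inline sample table, its column names without leading spaces, and the step names are assumptions, and the timestamps are treated as UTC.

```powerquery
let
    // Hypothetical sample: the first timestamp is 2022-01-01 08:00 UTC,
    // the second is 2022-09-15 00:00 UTC (mid-month)
    Source = #table(
        type table [Project = text, Date = Int64.Type, Duration = Int64.Type],
        {{"A", 1641024000000, 3}, {"B", 1663200000000, 2}}),
    // Unix milliseconds -> datetime -> date, then move to the first of the month
    WithStart = Table.AddColumn(Source, "Start",
        each Date.StartOfMonth(
            Date.From(#datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, [Date] / 1000))),
        type date),
    // One list entry per month of the project
    WithMonths = Table.AddColumn(WithStart, "Month",
        each List.Transform({0 .. [Duration] - 1}, (i) => Date.AddMonths([Start], i))),
    Expanded = Table.ExpandListColumn(WithMonths, "Month"),
    Result = Table.TransformColumnTypes(
        Table.SelectColumns(Expanded, {"Project", "Month"}),
        {{"Month", type date}})
in
    Result
```

With this sample, project B yields 1 September and 1 October 2022 even though its timestamp is 15 September.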
For some reason sqlfiddle was down for me, so I made an example in db-fiddle using Postgres instead of MS-SQL.
What you're looking to accomplish can be done with a recursive CTE; the syntax in MS-SQL is slightly different, but this should get you most of the way there.
WITH RECURSIVE project_dates AS(
SELECT
start_date as starting_date,
CAST(start_date + duration*INTERVAL '1 month' as date) as end_date,
project
FROM projects
UNION
SELECT
CAST(starting_date + INTERVAL '1 month' as date),
pd.end_date,
p.project
FROM projects p
JOIN project_dates pd ON pd.project = p.project
WHERE CAST(starting_date + INTERVAL '1 month' as date) < pd.end_date
)
SELECT starting_date, project FROM project_dates
ORDER BY project, starting_date
My results using your data look like this.
You can check out my answer on db-fiddle with this link: https://www.db-fiddle.com/f/iS7uWFGwiMbEmFtNmhsiWt/0
Try the steps below:
Divide your milliseconds by 86400000 and add the result, as days, to 1/1/1970 to get the date
Create a list based on Duration, expand it to rows, and add each value, as months, to the start date
Remove the extra columns
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ConvertToDays = Table.TransformColumns(Source,{{"Date", each Number.RoundDown(Number.From(_) / 86400000)}}),
// add each row's own day count to 1/1/1970 (not a fixed number of days)
#"Added Custom" = Table.AddColumn(ConvertToDays, "Custom", each Date.AddDays(#date(1970,1,1),[Date])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each List.Numbers(0,[Duration])),
#"Expanded Custom.1" = Table.ExpandListColumn(#"Added Custom1", "Custom.1"),
#"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Custom.2", each Date.AddMonths([Custom],[Custom.1]), type date),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Date", "Duration", "Custom", "Custom.1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Custom.2", "Date"}}),
TextDate = Table.AddColumn(#"Renamed Columns", "TextDate", each Date.ToText([Date],"MMM-yy"))
in TextDate

Slicer to search for a specific string in a column - Power BI

I have two tables.
Table1
Category
A
B
...
Table2
Companies | Indistries
1 | A,D,X
2 | Z,B,X
3 | N,D,R,B,Q
I would like to have a slicer with different categories (A-Z). When clicking A all Diagrams should be filtered according to the Companies that "contain" industry A.
Long story short: it would be like a normal relationship but instead of finding the same, it would be a "contains".
Thank you for your help! Really appreciated.
Please download the sample report file from the link - HERE
Steps:
Create an Index column in table your_table_name
Create a new table slicer_new with the code below:
let
Source = your_table_name,
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Indistries", Splitter.SplitTextByDelimiter(", ", QuoteStyle.Csv), {"Indistries.1", "Indistries.2", "Indistries.3", "Indistries.4"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Indistries.1", type text}, {"Indistries.2", type text}, {"Indistries.3", type text}, {"Indistries.4", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Index"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"})
in
#"Removed Columns"
Below is the final output of the slicer_new table:
Go back to the report and check the relationship between tables your_table_name and slicer_new.
Create a slicer from table slicer_new
Add a table visual for column Indistries from table your_table_name
Now select a value in the slicer; everything should work as expected.
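The query above splits into a fixed set of four columns, so a cell with five industries (such as N,D,R,B,Q in the question) would lose its last value. A possible variant splits into rows instead, which works for any number of entries. This is a sketch on hypothetical inline data: it assumes the Index column already exists and trims each value so that both "A,D" and "A, D" are handled.

```powerquery
let
    // Hypothetical inline copy of Table2 with the Index column already added
    Source = #table(
        type table [Index = Int64.Type, Companies = Int64.Type, Indistries = text],
        {{0, 1, "A,D,X"}, {1, 2, "Z,B,X"}, {2, 3, "N,D,R,B,Q"}}),
    // Keep only the key and the delimited column
    KeyAndList = Table.SelectColumns(Source, {"Index", "Indistries"}),
    // Turn each cell into a list of trimmed values
    AsLists = Table.TransformColumns(KeyAndList,
        {{"Indistries", each List.Transform(Text.Split(_, ","), Text.Trim)}}),
    // One row per (Index, industry) pair
    Expanded = Table.ExpandListColumn(AsLists, "Indistries"),
    Renamed = Table.RenameColumns(Expanded, {{"Indistries", "Value"}}),
    Typed = Table.TransformColumnTypes(Renamed, {{"Value", type text}})
in
    Typed
```

The result has the same Index and Value columns as slicer_new, so the relationship on Index can stay as described.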

Power BI: Count multiple values in column

I'm working with the dataset where the gender of all participants is described in one single column. I'd like to create two new columns for both genders and fill it up with the number of males / females involved.
The syntax used in a column looks like this:
0::Male||1::Male||3::Male||4::Female
(so we have 4 participants, the value in col "Male" would be 3, in col "Female" 1)
Would you be so kind and help me to extract this information? ♥
I'm sorry to ask you as I know I'd be able to eventually find the solution by myself, but I'm really under pressure right now. :/
This is the screenshot of the column I wanna extract values from.
Thanks a lot to everyone who tries to help! :)
In Power Query, you'd need two splits (one on || and one on ::), then a group and a pivot to get the data into the right format
Select your data, including header column of participant_gender and load your table into powerquery with Data ... From Table/Range...[x] my data has headers. I assume below that this is the first table of your file, and is named Table1
Add Column..Index Column...
right click participant_gender column ... split column ... by delimiter ... custom ... || [x] each occurrence of the delimiter , Advanced options [x] rows
right click participant_gender column ... split column ... by delimiter ... custom ... :: [x] each occurrence of the delimiter (ignore advanced options)
Click to select both participant_gender.2 and Index columns .. right click group by ... group by (participant_gender.2) (Index) new column name (Count) operation (Count Rows)
Click to select participant_gender.2 column ... Transform..pivot column...Values column (Count)
File...Close and Load to ... table
Code produced, which you can paste into Home .. Advanced Editor... if you wish
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"participant_gender", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Added Index", {{"participant_gender", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "participant_gender"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"participant_gender", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "participant_gender", Splitter.SplitTextByDelimiter("::", QuoteStyle.Csv), {"participant_gender.1", "participant_gender.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"participant_gender.1", Int64.Type}, {"participant_gender.2", type text}}),
// count the rows per gender; summing participant_gender.1 would add up the participant numbers instead
#"Grouped Rows" = Table.Group(#"Changed Type2", {"participant_gender.2", "Index"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[participant_gender.2]), "participant_gender.2", "Count", List.Sum)
in #"Pivoted Column"
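The same counts can also be produced without splitting to rows or pivoting, by counting inside each cell. The sketch below is an illustration on hypothetical sample data; the helper names `Genders` and `CountOf` are made up, and it assumes every cell is non-null and follows the `number::Gender||number::Gender` pattern.

```powerquery
let
    // Hypothetical sample rows in the format from the question
    Source = #table(type table [participant_gender = text],
        {{"0::Male||1::Male||3::Male||4::Female"}, {"0::Female"}}),
    // Gender labels of one cell: split on "||", keep the text after "::"
    Genders = (cell as text) as list =>
        List.Transform(Text.Split(cell, "||"), each Text.AfterDelimiter(_, "::")),
    // Number of participants in one cell with the given gender
    CountOf = (cell as text, gender as text) as number =>
        List.Count(List.Select(Genders(cell), each _ = gender)),
    AddedMale = Table.AddColumn(Source, "Male",
        each CountOf([participant_gender], "Male"), Int64.Type),
    AddedFemale = Table.AddColumn(AddedMale, "Female",
        each CountOf([participant_gender], "Female"), Int64.Type)
in
    AddedFemale
```

For the first sample row this gives Male = 3 and Female = 1, matching the counts described in the question.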

M formula to add missing dates to table

Suppose I have a PowerBI date table, with some dates missing, similar to the following:
|---------------------|------------------|
| Date | quantity |
|---------------------|------------------|
| 1/1/2015 | 34 |
|---------------------|------------------|
| 1/4/2015 | 34 |
|---------------------|------------------|
Is there an M formula that would add the missing date rows (and just put in null for the second column), resulting in a table like below:
|---------------------|------------------|
| Date | quantity |
|---------------------|------------------|
| 1/1/2015 | 34 |
|---------------------|------------------|
| 1/2/2015 | null |
|---------------------|------------------|
| 1/3/2015 | null |
|---------------------|------------------|
| 1/4/2015 | 34 |
|---------------------|------------------|
I know this could be accomplished by merging a full [dates] table with my dataset, but that is not an option in my scenario. And I need to do this in M, during query manipulation, and not in DAX.
Appreciate the help!
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Base = Table.TransformColumnTypes(Source,{{"Date", type date}, {"quantity", Int64.Type}}),
// Generate list of dates between Max and Min dates of Table1
DateRange = Table.Group(Base, {}, {{"MinDate", each List.Min([Date]), type date}, {"MaxDate", each List.Max([Date]), type date}}),
StartDate = DateRange[MinDate]{0},
EndDate = DateRange[MaxDate]{0},
List ={Number.From(StartDate)..Number.From(EndDate)},
#"Converted to Table" = Table.FromList(List, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
FullList = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", type date}}),
//Right Anti Join to find dates not in original Table1
#"Merged Queries" = Table.NestedJoin(Base,{"Date"},FullList,{"Column1"},"Table2",JoinKind.RightAnti),
#"Removed Other Columns" = Table.SelectColumns(#"Merged Queries",{"Table2"}),
Extras = Table.ExpandTableColumn(#"Removed Other Columns", "Table2", {"Column1"}, {"Date"}),
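// append the missing dates, then sort so they land between the existing rows
Combined = Table.Sort(Base & Extras, {{"Date", Order.Ascending}})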
in Combined
Here's another way:
I start with a table named Table2 in an Excel worksheet and use it as my source. It looks like this:
Then, use PowerBI's Get Data, then select All > Excel and the Connect button, and navigate to the Excel file that has the table I'm going to use as my source and select it and click Open. Then I select Table2 (the name of the table I want to use) from the tables presented for selection, and I click the Edit button. This loads Table2 as my source.
The second and third lines in my M code below (Source and Table2_Table) are generated by the steps above; they locate the table and load it. They will be different for you, because your source path, file, and table names will be different.
let
Source = Excel.Workbook(File.Contents("mypath\myfile.xlsx"), null, true),
Table2_Table = Source{[Item="Table2",Kind="Table"]}[Data],
#"Generate Dates" = List.Generate(()=> Date.From(List.Min(Table2_Table[Date])), each _ <= Date.From(List.Max(Table2_Table[Date])), each Date.AddDays(DateTime.Date(_), 1)),
#"Converted to Table" = Table.FromList(#"Generate Dates", Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
#"Merged Queries" = Table.NestedJoin(#"Converted to Table",{"Date"},Table2_Table,{"Date"},"Converted to Table",JoinKind.LeftOuter),
#"Expanded Converted to Table" = Table.ExpandTableColumn(#"Merged Queries", "Converted to Table", {"Quantity"}, {"Quantity"})
in
#"Expanded Converted to Table"
I get this table as output:
Which I can then use in PowerBI. For example, in a table like this:
P.S. I noticed that when using this in PowerQuery from within Excel only and not from within PowerBI, I need to explicitly change the type for the date fields or else the merge won't work right and the Quantity numbers won't appear. So if doing this only from within Excel and not within PowerBI, this code change seems to work:
let
Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}}),
#"Generate Dates" = List.Generate(()=> Date.From(List.Min(#"Changed Type"[Date])), each _ <= Date.From(List.Max(#"Changed Type"[Date])), each Date.AddDays(DateTime.Date(_), 1)),
#"Converted to Table" = Table.FromList(#"Generate Dates", Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),
#"Changed Type1" = Table.TransformColumnTypes(#"Converted to Table",{{"Date", type date}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type1",{"Date"},#"Changed Type",{"Date"},"Converted to Table",JoinKind.LeftOuter),
#"Expanded Converted to Table" = Table.ExpandTableColumn(#"Merged Queries", "Converted to Table", {"Quantity"}, {"Quantity"})
in
#"Expanded Converted to Table"
Of course, it probably wouldn't hurt to explicitly assign the date types when working within PowerBI as well...just in case.
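A more compact variant of the same idea builds the full date range with `List.Dates` and left-joins the original rows onto it. This is a sketch under assumptions: the inline `Base` table stands in for the real source, and the step names are illustrative.

```powerquery
let
    // Hypothetical inline sample matching the question
    Base = #table(type table [Date = date, quantity = Int64.Type],
        {{#date(2015, 1, 1), 34}, {#date(2015, 1, 4), 34}}),
    StartDate = List.Min(Base[Date]),
    EndDate = List.Max(Base[Date]),
    // Number of days in the range, both ends included
    DayCount = Duration.Days(EndDate - StartDate) + 1,
    // Every date from StartDate to EndDate, one day apart
    AllDates = Table.FromColumns(
        {List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0))}, {"Date"}),
    // Left outer join keeps every date; missing ones get null for quantity
    Joined = Table.NestedJoin(AllDates, {"Date"}, Base, {"Date"}, "Match", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "Match", {"quantity"}),
    Sorted = Table.Sort(Expanded, {{"Date", Order.Ascending}}),
    Result = Table.TransformColumnTypes(Sorted, {{"Date", type date}, {"quantity", Int64.Type}})
in
    Result
```

Because the date list is generated inside the query, no separate dates table is needed.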