Splitting a column across multiple rows in Power BI - powerbi

I am using Power BI Desktop.
I have a column that contains multiple values separated by ','. The same combined value is repeated on several distinct rows. I want to split it and assign one value to each of those rows. For example:
Original data:
Input SN   Output SN
--------------------
a          1,2,3
b          1,2,3
c          1,2,3
Desired result:
Input SN   Output SN
--------------------
a          1
b          2
c          3
I have tried splitting by delimiter on Output SN and then trying to group by input SN, but I can't seem to get to the desired results.
Any suggestions?
Thanks,
Sussie.

Try something like this:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTLUMdIxVorViVZKQuElI3ixAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Input SN" = _t, #"Output SN" = _t]),
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Inserted Text After Delimiter" = Table.AddColumn(#"Added Index", "Text After Delimiter", each Text.BeforeDelimiter( Text.AfterDelimiter([Output SN], ",", [Index]-1),","), type text),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Text After Delimiter",{"Output SN", "Index"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Text After Delimiter", "Output SN"}})
in
#"Renamed Columns"
Which transforms the original data into the desired result.
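The same idea can also be sketched with Text.Split and positional access instead of nested delimiter functions. This is only a minimal sketch, assuming the n-th row should receive the n-th item of its own list; the inline #table stands in for the real source, and the step and column names ("AddedIndex", "Picked", and so on) are illustrative.

```powerquery
let
    // Sample data shaped like the question (assumed, replace with your own source)
    Source = #table(
        type table [#"Input SN" = text, #"Output SN" = text],
        {{"a", "1,2,3"}, {"b", "1,2,3"}, {"c", "1,2,3"}}
    ),
    // Zero-based row position
    AddedIndex = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
    // Split the text on "," and pick the item at the row's position;
    // the trailing ? returns null instead of an error when the position is out of range
    Picked = Table.AddColumn(
        AddedIndex,
        "Picked",
        each Text.Split([Output SN], ","){[Index]}?,
        type nullable text
    ),
    Removed = Table.RemoveColumns(Picked, {"Output SN", "Index"}),
    Renamed = Table.RenameColumns(Removed, {{"Picked", "Output SN"}})
in
    Renamed
```

If the list restarts for each group of rows, the index would have to be added per group rather than over the whole table.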

Related

How can I group my values into columns by category in Power BI?

I'm not sure how to word this, so the title may be off, but what I want to do is go from Table A to Table B:
Table A:
Category  Code  Color
----------------------
A         B1B   Green
A         A1A   Blue
B         C1C   Olive
B         B1B   Green
B         EF9   Red
B         K32   Purple
Table B:
Category  Code1  Color1  Code2  Color2  Code3  Color3  Code4  Color4
---------------------------------------------------------------------
A         AIA    Green   B1B    Blue
B         C1C    Olive   B1B    Blue    EF9    Red     K32    Purple
We have a lot of categories, each with its own colors. For some tests, every code and color needs to be in one row per category, and we don't know how to do that. Note that some categories have 16 colors and others only 2, so the number of columns is not fixed. Any tips? TIA
Since your results in Table B don't seem to match up with the data in Table A, it's not clear exactly what you want. In particular, I would expect that, for Category A, the Code1 and Color1 entries would match the same line in Table A. But Table A shows B1B|Green, while Table B shows A1A|Green for Code1|Color1, and there are multiple other similar mismatches.
If those mismatches are just a sloppy way of expressing your desired results, and you actually want them to match, you can:
Group by Category
For each sub group
Unpivot the Code and Color columns
Create unique names for each Color/Code attribute pair
Then Pivot on the Attribute column with no aggregation
To use the code: Home => Transform Data => Home => Advanced Editor.
Paste the code below into the window that opens, and it will create your data and what I think might be your desired outcome.
let
//Change next two lines to reflect your actual data source and typing
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUXIydAKS7kWpqXlKsToQMUdDsExOaSpYCKTA2dAZSPrnZJYhxNC1gniubpZAMig1BS7ibWwEJANKiwpygFpjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Code = _t, Color = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Code", type text}, {"Color", type text}}),
//Group By Category
#"Group Category" = Table.Group(#"Changed Type","Category", {
{"Grouped", (t)=> let
#"Unpivoted Columns" = Table.UnpivotOtherColumns(t, {"Category"}, "Attribute", "Value"),
//create unique names for each code/color pair in the Attribute column => "Merged" column
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Columns", "Index", 0, 1, Int64.Type),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 2), Int64.Type),
#"Inserted Merged Column" = Table.AddColumn(#"Inserted Integer-Division", "Merged", each Text.Combine({[Attribute], Text.From([#"Integer-Division"], "en-US")}, "."), type text),
//Remove unneeded columns and then Pivot (with no aggregation) on the Merged column
#"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Attribute", "Index", "Integer-Division"}),
Pivot = Table.Pivot(#"Removed Columns", #"Removed Columns"[Merged],"Merged","Value")
in
Pivot
}
}),
//Combine the resultant subtables
#"Combine Tables" = Table.Combine(#"Group Category"[Grouped])
in
#"Combine Tables"

Power Query: Unpivoting Multiple different Attributes at the same time

In Power BI's Power Query, let's say I have the table below.
Customer  Product 1: Type  Product 1: Cost  Product 2: Type  Product 2: Cost
-----------------------------------------------------------------------------
Cust 1    A                $5               B                $15
Cust 2    A                $5               C                $20
The goal is to unpivot so that there is no Product 1 or Product 2, just Product, effectively collapsing the 4 columns into 2:
Customer  Product Type  Product Cost
-------------------------------------
Cust 1    A             $5
Cust 1    B             $15
Cust 2    A             $5
Cust 2    C             $20
I know this is fairly simple when unpivoting many columns into one column with the Unpivot Columns function.
But how do you unpivot many columns into n columns without doing this n times and rejoining the results?
You can unpivot all columns except Customer, split the Attribute values into Product and Header, then re-pivot:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wci4tLlEwVNJRcgRiFVMg4QRiGJoqxepAZY2QZZ1BDCMDpdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, #"Product 1: Type" = _t, #"Product 1: Cost" = _t, #"Product 2: Type" = _t, #"Product 2: Cost" = _t]),
Unpivoted = Table.UnpivotOtherColumns(Source, {"Customer"}, "Attribute", "Value"),
#"Split Attribute" = Table.SplitColumn(Unpivoted, "Attribute", Splitter.SplitTextByEachDelimiter({": "}, QuoteStyle.Csv, false), {"Product", "Header"}),
#"Pivoted Column" = Table.Pivot(#"Split Attribute", List.Distinct(#"Split Attribute"[Header]), "Header", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Product"})
in
#"Removed Columns"
Which turns the first table above into the second.
This should also work:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
base_columns=1, groupsof=2, // stack them
Combo = List.Transform(List.Split(List.Skip(Table.ColumnNames(Source),base_columns),groupsof), each List.FirstN(Table.ColumnNames(Source),base_columns) & _),
#"Added Custom" =List.Accumulate(
Combo,
#table({"Column1"}, {}),
(state,current)=> state & Table.Skip(Table.DemoteHeaders(Table.SelectColumns(Source, current)),1))
in #"Added Custom"
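To see what the Combo step in the query above produces, here is a small sketch that runs the same expression on a hard-coded list of column names. The list ColumnNames is an assumption taken from the question's table; in the real query it comes from Table.ColumnNames(Source).

```powerquery
let
    // Column names from the question's table (hard-coded here for illustration)
    ColumnNames = {
        "Customer",
        "Product 1: Type", "Product 1: Cost",
        "Product 2: Type", "Product 2: Cost"
    },
    base_columns = 1,  // leading columns repeated in every group
    groupsof = 2,      // number of columns per repeating block
    // Drop the base columns, chunk the rest, then prepend the base columns to each chunk
    Combo = List.Transform(
        List.Split(List.Skip(ColumnNames, base_columns), groupsof),
        each List.FirstN(ColumnNames, base_columns) & _
    )
in
    Combo
    // {{"Customer", "Product 1: Type", "Product 1: Cost"},
    //  {"Customer", "Product 2: Type", "Product 2: Cost"}}
```

Each inner list is then used to select a three-column slice of the source, and the slices are stacked. Because the headers are demoted and skipped before stacking, the result columns come out as Column1, Column2 and Column3 and would need renaming afterwards.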

Repeat the last value over time

I have a table with power plant capacities in different years. There are only entries when something changes in the capacities. In the years not listed, the last value applies.
Plant       Year  Capacity
---------------------------
Cottam      2003  800
Cottam      2009  600
Cottam      2015  800
Drax        2000  600
Drax        2005  1200
Drax        2010  1800
Drax        2013  1200
Drax        2020  0
Ironbridge  2007  500
Ironbridge  2015  0
Now I would like to transform the initial table so that it also has values for all the years in between, and I can display them in a stacked column chart, for example. The result should look like the table below, where the numbers from the initial table are marked in yellow.
You can do this easily in the Query Editor in M code.
To reproduce, paste the code below into a blank query:
let
//change next line to reflect your actual data source
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs4vKUnMVdJRMjIwMAZSFgYGSrE6qOKWQMoMU9zQFEm9S1FiBUS1AZJqhChIraERurAhSLEhhhmGxlhVG4FUQ8Q8i/LzkooyU9JTIcabAylTA6xyYGcCZWIB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Plant = _t, Year = _t, Capacity = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Plant", type text}, {"Year", Int64.Type}, {"Capacity", Int64.Type}}),
//generate Table of all years
#"All Years" = Table.FromColumns(
{List.Numbers(List.Min(#"Changed Type"[Year]), List.Max(#"Changed Type"[Year])- List.Min(#"Changed Type"[Year]) + 1 )}),
//Group by Plant
// Aggregate by joining with the All Years table and "Fill Down" to replace blanks with previous year.
// then expand the grouped column
#"Group by Plant" = Table.Group(#"Changed Type","Plant",{
{"Joined", each Table.FillDown(Table.Join(#"All Years","Column1",_,"Year",JoinKind.FullOuter),{"Capacity"})}
}),
#"Expanded Joined" = Table.ExpandTableColumn(#"Group by Plant", "Joined", {"Column1", "Capacity"}, {"Column1", "Capacity"}),
//Replace nulls with zero's
#"Replaced Value" = Table.ReplaceValue(#"Expanded Joined",null,0,Replacer.ReplaceValue,{"Capacity"}),
//Pivot on year
// then set the data types
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Replaced Value", {{"Column1", type text}}, "en-US"),
List.Distinct(Table.TransformColumnTypes(#"Replaced Value", {{"Column1", type text}}, "en-US")[Column1]), "Column1", "Capacity"),
//set data type
#"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column",
List.Transform(List.Sort(List.RemoveFirstN(Table.ColumnNames(#"Pivoted Column"),1), Order.Ascending), each {_, Int64.Type}))
in
#"Changed Type1"
Edit Note:
Actually, to create the graph in Power BI, you do NOT want to pivot the data, so use this shorter code instead:
let
//change next line to reflect your actual data source
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs4vKUnMVdJRMjIwMAZSFgYGSrE6qOKWQMoMU9zQFEm9S1FiBUS1AZJqhChIraERurAhSLEhhhmGxlhVG4FUQ8Q8i/LzkooyU9JTIcabAylTA6xyYGcCZWIB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Plant = _t, Year = _t, Capacity = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Plant", type text}, {"Year", Int64.Type}, {"Capacity", Int64.Type}}),
//generate Table of all years
#"All Years" = Table.FromColumns(
{List.Numbers(List.Min(#"Changed Type"[Year]), List.Max(#"Changed Type"[Year])- List.Min(#"Changed Type"[Year]) + 1 )}),
//Group by Plant
// Aggregate by joining with the All Years table and "Fill Down" to replace blanks with previous year.
// then expand the grouped column
#"Group by Plant" = Table.Group(#"Changed Type","Plant",{
{"Joined", each Table.FillDown(Table.Join(#"All Years","Column1",_,"Year",JoinKind.FullOuter),{"Capacity"})}
}),
#"Expanded Joined" = Table.ExpandTableColumn(#"Group by Plant", "Joined", {"Column1", "Capacity"}, {"Year", "Capacity"}),
//Replace nulls with zero's
#"Replaced Value" = Table.ReplaceValue(#"Expanded Joined",null,0,Replacer.ReplaceValue,{"Capacity"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Replaced Value",{{"Year", Int64.Type}, {"Capacity", Int64.Type}})
in
#"Changed Type1"
Then, in Power BI, you can generate this:
Note:
The code below presents the Table.FillDown / Table.Join sequence from the first query using variables and more comments. It should be easier to understand, though it might be less efficient.
...
{"Joined", each
let
//join the subtable with the All Years table
#"Joined Table" = Table.Join(#"All Years", "Column1", _, "Year", JoinKind.FullOuter),
//Fill down the Capacity column so as to fill with the "last year" data
// since that column will contain a null after the Table.Join for years with no data
#"Fill Down" = Table.FillDown(#"Joined Table",{"Capacity"})
in
#"Fill Down"
}
...
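The join-then-fill-down step can also be tried in isolation on a single plant. The sketch below is a hypothetical standalone query, not part of the answer's code: the subtable holds the Cottam rows from the question, the year column is named "AllYear" instead of "Column1", and an explicit sort is added before filling down because fill-down depends on row order and the join is not assumed to preserve it.

```powerquery
let
    // Subtable for one plant (Cottam rows from the question)
    Sub = #table(
        type table [Plant = text, Year = Int64.Type, Capacity = Int64.Type],
        {{"Cottam", 2003, 800}, {"Cottam", 2009, 600}, {"Cottam", 2015, 800}}
    ),
    // All years from 2003 to 2015 (13 values)
    AllYears = Table.FromColumns({List.Numbers(2003, 13)}, {"AllYear"}),
    // Years without an entry get null in Plant, Year and Capacity
    Joined = Table.Join(AllYears, "AllYear", Sub, "Year", JoinKind.FullOuter),
    // Make the row order explicit before filling down
    Sorted = Table.Sort(Joined, {{"AllYear", Order.Ascending}}),
    // Carry the last known values forward into the null rows
    Filled = Table.FillDown(Sorted, {"Plant", "Capacity"})
in
    Filled
```

With this data, 2003 to 2008 carry 800, 2009 to 2014 carry 600, and 2015 carries 800.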
Here's how to solve this (more easily) in DAX:
A prerequisite is a separate Calendar table with a 1:many relationship to Plants on the year:
Calendar =
SELECTCOLUMNS(
GENERATESERIES(
MIN(Plants[Year]),
MAX(Plants[Year])
),
"Year", [Value]
)
Next calculate the Last Given Capacity per year
Last Given Capacity =
VAR current_year =
MAX('Calendar'[Year])
VAR last_capacity_year =
CALCULATE(
MAX(Plants[Year]),
'Calendar'[Year] <= current_year
)
RETURN
CALCULATE(
MAX(Plants[Capacity]),
'Calendar'[Year] = last_capacity_year
)
Finally put it all together in a Stacked Column Chart with
X-axis: 'Calendar'[Year]
Y-axis: [Last Given Capacity]
Legend: 'Plants'[Plant]

Powerquery extract values from list in column B based on another columnA value

I have two columns A and B
A is a normal column. Let's say it may hold any of the following colors: black, white, orange.
B holds a list in each record. Let's say "white shirt", "white trousers", "orange t-shirt".
I'm trying to get in column B the items related to the color in column A.
If A = white, then I want in one cell "white shirt" and "white trousers".
If I hard code "white", it works, but I can't pass [A] to Text.Contains (or I don't know how)
= Table.TransformColumns(#"Added Custom", {"B", each Text.Combine(List.Transform(List.Select(_,each Text.Contains(_,"white")), Text.From), "#(lf)"), type text})
Please, I appreciate any help.
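The trick is to store the value of the first column in a variable before you start working with the list in the second column, because inside the inner each, _ refers to the list item rather than the row. Note that Table.TransformColumns only passes the cell value to its function, so it cannot see column A; add a column with Table.AddColumn instead (the new column name "B filtered" is arbitrary).
So for your example:
= Table.AddColumn(#"Added Custom", "B filtered", each let search = [A] in Text.Combine(List.Transform(List.Select([B], each Text.Contains(_, search)), Text.From), "#(lf)"), type text)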
but in a more general example:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each
let search = [Column1] in
Text.Combine(
List.Select(
Text.Split([Column2],","),
each Text.Contains(_,search)
)
," ,")
)
in #"Added Custom"
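For the case in the question, where column B already holds a list rather than delimited text, the same pattern might look like the sketch below. The sample table and the new column name "Matching items" are assumptions made for illustration.

```powerquery
let
    // Hypothetical sample: A is a color, B is a list of item descriptions
    Items = {"white shirt", "white trousers", "orange t-shirt"},
    Source = #table({"A", "B"}, {
        {"white", Items},
        {"orange", Items},
        {"black", Items}
    }),
    Matched = Table.AddColumn(
        Source,
        "Matching items",
        each
            let
                // Capture the row's values first: inside the inner "each",
                // _ refers to the list item, not the row
                search = [A],
                items = [B]
            in
                Text.Combine(
                    List.Select(items, each Text.Contains(_, search)),
                    "#(lf)"
                ),
        type text
    )
in
    Matched
```

Here the "white" row gets both white items separated by a line feed, the "orange" row gets "orange t-shirt", and the "black" row gets an empty string because nothing matches.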

Power M function to return most common string of a column as part of a Table.Group aggregate clause

I have written a custom aggregate function that returns the most common value of one or more columns for a set of records with a unique grouping, i.e. you can use it where you might use MAX, MIN, etc., but the result is the most commonly occurring value.
Can anyone advise on a better or more performant solution? It is my first M function. Or feel free to adapt.
Functionality: if you group on column G1 and aggregate on column A1, with the first table as input data, you get the second table as output.
Function MostCommon
let
fnMostCommon = (ListIn) =>
let
uniquevalues=List.Distinct(ListIn),
result=Table.FromList(uniquevalues,null,{"u"}),
result2=Table.AddColumn(result ,"freq", each List.Count(List.PositionOf(ListIn, [u], Occurrence.All))),
result3 = Table.Sort(result2,{{"freq", Order.Descending},{"u",Order.Ascending}}),
result4 = List.First(result3[u])
in result4
in fnMostCommon
Code to create data and run:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUQozhBCxOlC+ETa+EZifhKY+CU19MpAJRGUm6LxYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [G1 = _t, A1 = _t, A2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"G1", type text}, {"A1", type text}, {"A2", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"G1"}, {{"Agg", each MostCommon([A1]), type text}, {"Agg2", each AllConcat([A1]), type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Agg2"})
in
#"Removed Columns"
How about this?
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUQozVIrVgTKN0JlJCAVJCNFkIBOJEQsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [G1 = _t, A1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"G1", type text}, {"A1", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"G1"}, {{"Count", each Table.First(Table.Sort(Table.Group(_, "A1", {"Count", List.Count}), {{"Count", Order.Descending}}))[A1], type text}})
in
#"Grouped Rows"
Everything is handled in the last step which I've expanded out below:
Table.Group(
#"Changed Type",
{"G1"},
{{
"Count",
each Table.First(
Table.Sort(
Table.Group(_, "A1", {"Count", List.Count}),
{{"Count", Order.Descending}}
)
)[A1],
type text
}}
)
This groups by the G1 column and, for each G1 value, builds a grouped table with each A1 value and the number of times that value occurs. For example, the grouped table for the "a" row is
A1 Count
---------
V1 1
V2 2
It then sorts by count, highest first (Table.Sort), selects the top row (Table.First), and returns the value in the [A1] column, which is V2 in the "a" row example.
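The standard library's List.Mode may be worth trying as well, since it returns the most frequent item of a list directly. The following is a minimal sketch on made-up data shaped like the "a" row example; the sample rows and the output column name "MostCommon" are assumptions.

```powerquery
let
    // Hypothetical sample: group "a" has V1 once and V2 twice
    Source = #table(
        type table [G1 = text, A1 = text],
        {{"a", "V1"}, {"a", "V2"}, {"a", "V2"}, {"b", "V3"}}
    ),
    // [A1] inside the aggregation is the list of A1 values for the current group
    Grouped = Table.Group(
        Source,
        {"G1"},
        {{"MostCommon", each List.Mode([A1]), type text}}
    )
in
    Grouped
```

For this data the result is V2 for group a and V3 for group b. When several values tie for the highest count, List.Mode applies its own tie-breaking rule, which may differ from an explicit secondary sort such as the one in the question's function.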