Group By a column based on multiple other columns - Power Query

Group By a column based on multiple other columns - Power Query - powerbi

Hello beautiful people,
Could someone please help me with my below request, (noting that I am working with Power Query Editor). So I need it to be done using creating conditional columns in power query maybe?, please help.
I need to group users in a table based on a category with showing their count in multiple different fields, As per the below example:
I need results to be:
Muuuuuuuuch Appreciated

In powerquery, try this
Click select period and name columns
Right click, unpivot other columns
Click select period, attribute and value columns
Right click, group ... new column name Count, operation Count rows
Click select attribute column. Transform ... pivot column... values column Count, Advanced options, do not aggregate
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Period", "Name"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Period", "Attribute", "Value"}, {{"Count", each Table.RowCount(_), type number}}),
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[Attribute]), "Attribute", "Count", List.Sum)
in #"Pivoted Column"
Alternately,
Click select period and name columns
Right click, unpivot other columns
Right click name column and remove
Right click value column and duplicate
Click select attribute column .. Transform pivot column ... Values column:Values Copy
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Period", "Name"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Name"}),
#"Duplicated Column" = Table.DuplicateColumn(#"Removed Columns", "Value", "Value - Copy"),
#"Pivoted Column" = Table.Pivot(#"Duplicated Column", List.Distinct(#"Duplicated Column"[Attribute]), "Attribute", "Value - Copy", List.Count)
in #"Pivoted Column"
Neither one is going to put in a blank row for a combination that does not exist like Dec No. Thats more complicated if required

Related

Power BI Count/Sum Based on Multiple Columns

I'm trying to count and sum based on multiple columns. When I tried the "group by" function in transform data, it gives me a timeout error.
Below is an illustrative example of what I'm trying to do. In the real dataset, the number of columns is 30+ and the number of possible entries in each column is also large, resulting in many unique combinations.
I'm not sure if there are other functionalities in Power BI that can achieve this, please send help!
Have this:
Want this:

Im going to go ahead and guess that perhaps you want to sum all your columns without knowing how many of them you have, and this code can work on any number of columns, grouping the first two
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Column1", "Column2"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Column1", "Column2", "Attribute"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"SUM", each List.Sum([Value]), type number}}),
#"Pivoted Column" = Table.Pivot(#"Grouped Rows", List.Distinct(#"Grouped Rows"[Attribute]), "Attribute", "SUM", List.Sum)
in #"Pivoted Column"

In Power Query, on the Tranform ribbon, click "Group By" and enter these settings.

best way to add multiple columns in power bi

Hi need to add 1000+ calculated columns in power bi which provide the count per entry, for example.*means calculated columns
ID
RankCode
Count_RankCode*
RankAdvance
Count_RanAdvance*
1000
AAA
2
XYZ
2
1001
AAA
2
XYA
1
1002
AAB
1
XYZ
2
found the right way to count in power BI DAX:
COUNTROWS(FILTER('24Jun_1973',[rankCode]=earlier([rankCode])))
Requirement:
add 1000 columns that count rows in probably in one code using DAX
or create the 1000 count cloumn in power query M language (need it to be fast since raw date is 60gb).

As suggested by #smpa01, I was able to complete this task using the tabular editor. Just used the DAX script in tabular editor, put my all my measures in there since I was able to create all expressio in excels as it is just repeating then voila, 1000 measures added.
example:
Measure '24Jun_1973'[measure]=calculate(COUNTROWS(FILTER('28Jun_1973',[rankcode]='28Jun_1973'[rankcode])))
Measure '24Jun_1973'[measure2]=calculate(COUNTROWS(FILTER('28Jun_1973',[rankAdvance]='28Jun_1973'[rankAdvance])))

I have no idea why you would want another 1000 columns.
If you really want to though, in powerquery, you could unpivot, group and count, append the results to original data, then re-pivot. I don't know how fast it would be. I suspect not very.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"RankCode", type text}, {"RankAdvance", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID"}, "Attribute", "Value"),
// group and count
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"Attribute", "Value"}, {{"Count", each Table.RowCount(_), type number}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Grouped Rows", "Attribute", "Attribute - Copy"),
#"Change column name" = Table.TransformColumns(#"Duplicated Column",{{"Attribute - Copy", each "count_" & _, type text}}),
// append back to original table, then repivot
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns",{"Attribute", "Value"},#"Change column name",{"Attribute", "Value"},"Table2",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Count", "Attribute - Copy"}, {"Count", "Attribute - Copy"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table1",{"Attribute", "Value"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Count", "Value"}, {"Attribute - Copy", "Attribute"}}),
combined = #"Unpivoted Other Columns" & #"Renamed Columns",
#"Pivoted Column" = Table.Pivot(combined, List.Distinct(combined[Attribute]), "Attribute", "Value")
in #"Pivoted Column"

power query - M Language - Convert Columns Into Rows

I have a spreadsheet that contains column Names as the product name, quantity, cost.
I want to convert this to rows of data that contain Product Name, Quantity, Cost.
See image below as to what I want.
What is the best way to handle this in Power Query M Language?
Not sure if I want to pivot just the columns that have prod name, quantity and cost?
Thanks

Here's A way...
Starting with this table as Table1:
You can select the Customer column and Unpivot Other Columns to get this:
Then you can add an index column (keep it named Index) and then also a custom column (keep it named Custom) with if Text.EndsWith([Attribute],"Cost") then 1 else 0 as its formula to get this:
Then add another custom column... Name it Total Cost and enter #"Unpivoted Other Columns"[Value]{[Index]+(List.Count(#"Added Custom"[Custom])/List.Sum(#"Added Custom"[Custom]))} as its formula to get:
The two steps above were, first, to set up to locate the corresponding Cost of the Tshirts based on the Cost's position in the Value column and, then, to actually locate the cost and record it on the same line as the respective Tshirts. The Index column provides row positioning information while the Custom column provides count information--both the overall list count and the count of rows with Cost. I use the count information to determine how many index positions to move down the Value column to get associated cost values dynamically.
Then filter on the Attribute column, using Text Filters > Does Not End With... and type the word Cost. All the rows with an Attribute entry ending with the word Cost should disappear:
Remove the Index and Custom columns and Rename the Attribute and Value columns to Product Name and Quantity, respectively to get your final result:
Here's my M code:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Customer"}, "Attribute", "Value"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each if Text.EndsWith([Attribute],"Cost") then 1 else 0),
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Total Cost", each #"Unpivoted Other Columns"[Value]{[Index]+(List.Count(#"Added Custom"[Custom])/List.Sum(#"Added Custom"[Custom]))}),
#"Filtered Rows" = Table.SelectRows(#"Added Custom2", each not Text.EndsWith([Attribute], "Cost")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Custom"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Attribute", "Product Name"}, {"Value", "Quantity"}})
in
#"Renamed Columns"

They key here is pivoting and unpivoting.
Starting with a table like this,
Select the right four columns and click Transform > Unpivot Columns to get this table:
Now create a custom column that classifies the value using this formula.
if Text.EndsWith([Attribute], "Cost") then "Cost" else "Quantity"
I also chopped off the " Cost" piece at the end of the Attribute column. You can either Transform > Replace Values and replace " Cost" with nothing or Transform > Extract > Text Before Delimiter " Cost".
Now pivot the custom column (choose the Value column as your Values Column choice) and, finally, rename the Attribute column to Product Name.
Here's my M code for all the steps:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcknNK0stUnAuLS7Jz00tUtJRMjIGEoYmIJapqZ6pAYhnZKpnYKAUqxOt5JyRmZyYno+swdAQSJiagtUZgNQBeeYQDbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Customer = _t, #"Product Orange T-shirt" = _t, #"Product Blue T-shirt" = _t, #"Product Orange T-shirt Cost" = _t, #"Product Blue T-shirt Cost" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Product Orange T-shirt", Int64.Type}, {"Product Blue T-shirt", Int64.Type}, {"Product Orange T-shirt Cost", type number}, {"Product Blue T-shirt Cost", type number}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Customer"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Columns", "Custom", each if Text.EndsWith([Attribute], "Cost") then "Cost" else "Quantity"),
#"Replaced Value" = Table.ReplaceValue(#"Added Custom"," Cost","",Replacer.ReplaceText,{"Attribute"}),
#"Pivoted Column" = Table.Pivot(#"Replaced Value", List.Distinct(#"Replaced Value"[Custom]), "Custom", "Value", List.Sum),
#"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"Attribute", "Product Name"}})
in
#"Renamed Columns"

Power BI - Duplicate Rows

In Power BI, I have a table that looks like this:
ID
234
435
3435
58
48504
7820
I want to convert it to a table that looks like this:
ID
234-101
234-102
435-101
435-102
3435-101
343-102
58-101
58-102
48504-101
48504-102
7820-101
7820-102
Is this even possible within Power BI?

I thought of two ways to do this, though there are probably others.
NOTE - I prefer the second method as it lets the "101" and "102" be data driven allowing them to be changed or added to in the future more easily.
A) Through the Query Editor (requires hard-coding the "101"/"102" values)
Step 1: Start with your data in the Query Editor
Step 2: Add two additional columns for your suffixes. Click on the "Custom Column from Examples" button and then type in "234-101" in the first cell. After arrowing down to the next cell, it should auto-populate the rest. Do this again for "-102".
Step 3: Unpivot the two new columns to get them into one. With the "ID" column selected, click on the dropdown for "Unpivot Columns" and click on "Unpivot Other Columns".
Step 4: Remove extra columns. In the resulting data, you will have the original "ID" column, along with two new ones; "Attribute" and "Value". Since the "Value" column contains the desired values, select the "ID" and "Attribute" columns, right click one of their headers, and select "Remove Columns".
Step 5: Rename the "Value" column to "ID" and you're finished.
Here is the resulting M code for all of those actions.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjI2UYrViVYyMTYF08YwhqkFRNzC1ACiwtzCyEApNhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}}),
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Merged", each Text.Combine({Text.From([ID], "en-US"), "-101"}), type text),
#"Inserted Merged Column1" = Table.AddColumn(#"Inserted Merged Column", "Merged.1", each Text.Combine({Text.From([ID], "en-US"), "-102"}), type text),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Inserted Merged Column1", {"ID"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"ID", "Attribute"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Value", "ID"}})
in
#"Renamed Columns"
B) Through DAX
Step 1: Start with your data in the data view.
Step 2: Click on "Enter Data" and add the data for the suffixes. (Skip this if these numbers are sourced somewhere else)
Step 3: Click on "New Table" and enter the following formula.
NewData = CROSSJOIN(Data, Suffixes)
Step 4: Click on "New Column and enter the following formula.
NewID = CONCATENATE(CONCATENATE(NewData[ID], "-"), NewData[Value])
If you want the new column to be named "ID", you'll need to rename the old "ID" column first, as you can't simply remove it like was done in the first method.

If you're okay with using Power BI's query editor (Power Query) for this, you can do it with this query code:
let
Source = Table1,
#"Inserted Merged Column3" = Table.AddColumn(Source, "DelimitedListWithSuffixes", each Text.Combine({[ID], "-101,", [ID],"-102"}), type text),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Inserted Merged Column3", {{"DelimitedListWithSuffixes", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "DelimitedListWithSuffixes"),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"DelimitedListWithSuffixes", type text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"DelimitedListWithSuffixes"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"DelimitedListWithSuffixes", "ID"}})
in
#"Renamed Columns"
(Table1 is your original ID column table.)

Calculated column with the sum of values from many columns in a row

I need to sum all values in each row and display them in a calculated column. As I deal with lots of columns in lots of tables, adding something like
CalculatedColumn = 'public table_name'[column1] + 'public table_name'[column2] + ... + 'public table_name'[column528]
is really inefficient. Is there a shorter way of doing this?

Yes, there is. You should "Unpivot other columns" and then "Group By" using the Query Editor.
Suppose this dataset:
item;col1;col2;col3;col4;col5
apple;1;2;3;4;5
orange;1;2;3;5;8
banana;1;2;4;6;8
Load it up, and open the query editor.
Choose "Unpivot Other Columns":
You should now see this:
On the "Transform" tab in the ribbon, choose the leftmost "Group By" option. And fill out the dialog like so:
You should now have the wanted end result:
You could also skip the Group By step and let your visualization handle that.
PS. Should you need a few non-summed columns too I recommend either creating a duplicate dataset with the same source and either linking it to the original table with a relationship, or merging it so you get a final table with all wanted columns.
Footnote, this is the Power Query that is generated for you:
let
Source = Csv.Document(File.Contents("D:\Experiments\PowerBi\denormalized.csv"),[Delimiter=";", Columns=6, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"item", type text}, {"col1", Int64.Type}, {"col2", Int64.Type}, {"col3", Int64.Type}, {"col4", Int64.Type}, {"col5", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"item"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"item"}, {{"SumCol", each List.Sum([Value]), type number}})
in
#"Grouped Rows"

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Group By a column based on multiple other columns - Power Query - powerbi

Related

Power BI Count/Sum Based on Multiple Columns

best way to add multiple columns in power bi

power query - M Language - Convert Columns Into Rows

Power BI - Duplicate Rows

Calculated column with the sum of values from many columns in a row

Categories

Resources