I have the following table that I want to edit with Power Query (I am still a beginner).
Every month we get a new row for every parameter (there are many) for every object. What I want is a table with one row for every object and every month, with all the parameters listed as columns. The parameters can include numbers, dates, empty values, etc.
I hope I have explained my issue well enough.
Thanks for your help!
What I have:
Parameter  Object    Location  Size  Month     Value
1          Object A  USA       4     Jan 2002  180
1          Object A  USA       4     Feb 2002  210
2          Object A  USA       4     Jan 2002  312
2          Object A  USA       4     Feb 2002  140
1          Object B  CAN       6     Jan 2002  164
1          Object B  CAN       6     Feb 2002  130
2          Object B  CAN       6     Jan 2002  95
2          Object B  CAN       6     Feb 2002  122
What I want:
Object    Month     Location  Size  Parameter 1  Parameter 2  Parameter 3...
Object A  Jan 2002  USA       4     180          312          ...
Object A  Feb 2002  USA       4     210          140          ...
Object B  Jan 2002  CAN       6     164          95           ...
Object B  Feb 2002  CAN       6     130          122          ...
Load the data into Power Query with Data > From Table/Range, with "My table has headers" checked.
Click to select the Parameter column.
Transform > Pivot Column.
Values column: Value, then OK.
File > Close & Load.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Parameter", Int64.Type}, {"Object", type text}, {"Location", type text}, {"Size", Int64.Type}, {"Month", type datetime}, {"Value", Int64.Type}}),
    // Parameter is converted to text because its values become the new column names
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Changed Type", {{"Parameter", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Changed Type", {{"Parameter", type text}}, "en-US")[Parameter]), "Parameter", "Value", List.Sum)
in
    #"Pivoted Column"
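To make the effect of the pivot step concrete, here is a minimal plain-Python sketch of the same reshaping, using the sample rows from the question. It is an illustration of the idea, not what Power Query runs internally. The function name `pivot_parameters` and the "Parameter N" column labels are assumptions made for this sketch (the M code above names the new columns after the raw parameter values).

```python
from collections import defaultdict

# Sample rows from the question:
# (Parameter, Object, Location, Size, Month, Value)
rows = [
    (1, "Object A", "USA", 4, "Jan 2002", 180),
    (1, "Object A", "USA", 4, "Feb 2002", 210),
    (2, "Object A", "USA", 4, "Jan 2002", 312),
    (2, "Object A", "USA", 4, "Feb 2002", 140),
    (1, "Object B", "CAN", 6, "Jan 2002", 164),
    (1, "Object B", "CAN", 6, "Feb 2002", 130),
    (2, "Object B", "CAN", 6, "Jan 2002", 95),
    (2, "Object B", "CAN", 6, "Feb 2002", 122),
]


def pivot_parameters(rows):
    """Turn one row per parameter into one row per (Object, Month, Location, Size)."""
    wide = defaultdict(dict)
    for parameter, obj, location, size, month, value in rows:
        key = (obj, month, location, size)
        column = f"Parameter {parameter}"  # hypothetical column label
        # Duplicates are summed, mirroring the List.Sum argument in the M code.
        wide[key][column] = wide[key].get(column, 0) + value
    return dict(wide)


wide = pivot_parameters(rows)
for key, parameters in wide.items():
    print(key, parameters)
```

Every column that is not the pivoted column or the value column acts as part of the row key, which is why Location and Size survive unchanged.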
Here's an example of my existing data set. I have unique parts (Part) grouped by what factory # (Factory) they're used in and when the parts started to be used in operation (Part Install Year).
Part  Factory  Part Install Year
1     100      2018
2     100      2018
3     100      2018
3     200      2019
3     300      2020
4     400      2019
5     400      2020
6     500      2018
Desired Output is below. I need to group all the related parts by the lowest numbered factory they are installed in (Part Grouping) and then calculate the earliest year any part was installed in that factory (Factory Install Year). I'm having trouble figuring out how to create the Factory Install Year. Thank you!
Part  Factory  Part Install Year  Part Grouping  Factory Install Year
1     100      2018               100            min of all Dates in Factory 100
2     100      2018               100            min of all Dates in Factory 100
3     100      2018               100            min of all Dates in Factory 100
3     200      2019               100            min of all Dates in Factory 100
3     300      2020               100            min of all Dates in Factory 100
4     400      2019               400            min of all Dates in Factory 400
5     400      2020               400            min of all Dates in Factory 400
6     500      2018               500            2018
Try
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Part Install Year", type number}, {"Factory", type number}}),
    // Earliest install year per factory
    EarliestYear = Table.Group(#"Changed Type", {"Factory"}, {{"EarliestYear", each List.Min([Part Install Year]), type nullable number}}),
    // Per part: keep the original rows, find the lowest factory, and look up that factory's earliest year
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Part"}, {
        {"data", each _, type table},
        {"MinFactory", each List.Min([Factory])},
        {"EarliestYear", each try EarliestYear[EarliestYear]{List.PositionOf(EarliestYear[Factory], List.Min([Factory]))} otherwise null}
    }),
    // Expand the nested rows again so every original row carries both new columns
    #"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Factory", "Part Install Year"}, {"Factory", "Part Install Year"})
in
    #"Expanded data"
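The logic of the query above is a two-step lookup: first the earliest year per factory, then the lowest factory per part. A plain-Python sketch on the question's sample data may make that easier to check by hand. The variable names are made up for this illustration, and the output columns correspond to Part Grouping and Factory Install Year in the desired table.

```python
# Sample rows from the question: (Part, Factory, Part Install Year)
rows = [
    (1, 100, 2018),
    (2, 100, 2018),
    (3, 100, 2018),
    (3, 200, 2019),
    (3, 300, 2020),
    (4, 400, 2019),
    (5, 400, 2020),
    (6, 500, 2018),
]

# Step 1: earliest install year for each factory.
earliest_year = {}
for _, factory, year in rows:
    earliest_year[factory] = min(year, earliest_year.get(factory, year))

# Step 2: lowest-numbered factory each part is installed in.
min_factory = {}
for part, factory, _ in rows:
    min_factory[part] = min(factory, min_factory.get(part, factory))

# Step 3: attach both values to every original row.
result = [
    (part, factory, year, min_factory[part], earliest_year[min_factory[part]])
    for part, factory, year in rows
]

for row in result:
    print(row)
```

Part 5 is a useful check: it was installed in 2020, but its factory (400) already had part 4 installed in 2019, so its Factory Install Year is 2019.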
I have two tables in Power BI: a modified date table and a fact table of customer scores. The relationship uses the "Month Num" column. Score assessments take place every June, so I would like to average the scores over 12 months (June 1 to June 30). Then I will have a card comparing the previous year's score with the current year's score. Is there a way to do this dynamically, so that I do not have to change the year in the function every new year? I know the AVERAGE function will be nested in the function somehow, but I am confused because the period is not a calendar year, and I am not experienced enough to use Time Intelligence functions yet.
Customer Score Table
Month   Month Num  Year  Score  Customer #
June    6          2020  94.9   11111
July    7          2020  97     11111
months  continue   2020  100
June    6          2021  89     22222
July    7          2021  91     22222
months  continue   2021  100
June    6          2022  93     33333
July    7          2022  94     33333
Date Table
Month    Month Num  Month Initial
january  1          J
feb      2          F
march    3          M
other    months     continued
I am working in Power BI and have a problem in the Power Query editor. My source table contains cross-sectional time-series data with both a cross-sectional and a temporal primary key (two identifiers):
primary_key time_year data_feature_A
1 2019 A
1 2020 C
1 2021 L
1 2022 B
2 2019 K
2 2020 H
2 2021 M
2 2022 D
3 2019 A
3 2020 F
3 2021 X
3 2022 Y
4 2019 Y
4 2020 D
4 2021 M
4 2022 H
At the end of the procedure, the table should have the following structure:
primary_key 2019 2020 2021 2022
1 A C L B
2 K H M D
3 A F X Y
4 Y D M H
How do I do this most efficiently in Power Query? I would like to run such a procedure on a very big data set (original data).
In Power Query, just pivot the data:
Click to select the time_year column.
Transform > Pivot Column.
Values column: data_feature_A.
Advanced options > Don't Aggregate.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(Source, {{"time_year", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(Source, {{"time_year", type text}}, "en-US")[time_year]), "time_year", "data_feature_A")
in
    #"Pivoted Column"
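For readers who want to see what a pivot without aggregation amounts to, here is a small plain-Python sketch on the question's sample data. It assumes each (primary_key, time_year) pair occurs at most once; raising an error on a duplicate is a choice made for this sketch, and the variable names are illustrative only.

```python
# Sample rows from the question: (primary_key, time_year, data_feature_A)
rows = [
    (1, 2019, "A"), (1, 2020, "C"), (1, 2021, "L"), (1, 2022, "B"),
    (2, 2019, "K"), (2, 2020, "H"), (2, 2021, "M"), (2, 2022, "D"),
    (3, 2019, "A"), (3, 2020, "F"), (3, 2021, "X"), (3, 2022, "Y"),
    (4, 2019, "Y"), (4, 2020, "D"), (4, 2021, "M"), (4, 2022, "H"),
]

# The distinct years become the new column headers.
years = sorted({year for _, year, _ in rows})

wide = {}
for key, year, feature in rows:
    cells = wide.setdefault(key, {})
    if year in cells:
        # Without an aggregation there is no rule for combining two values.
        raise ValueError(f"duplicate value for key {key}, year {year}")
    cells[year] = feature

# One output row per primary_key; a missing year would show up as None.
header = ["primary_key"] + years
table = [[key] + [wide[key].get(year) for year in years] for key in sorted(wide)]

print(header)
for row in table:
    print(row)
```

The whole reshape is a single pass over the rows plus one pass to lay out the columns, so the cost grows linearly with the size of the input.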
I have a custom dates table.
I need to create a Period ID for each period.
Creating an index does not solve the problem, because it does not rank the rows by period.
Here is sample data:
id MonthInYear
----------- -----------
1 20180100
2 20180100
3 20180100
4 20180100
5 20180100
6 20180200
7 20180200
8 20180200
9 20180200
10 20180200
11 20180200
12 20180200
13 20180200
14 20180300
15 20180300
16 20180300
17 20180300
18 20180300
19 20180300
20 20180300
21 20180300
22 20180300
23 20180300
My required results screenshot:
How do I create a ranked Period ID as per my expected results above in Power Query?
Assume your table is named base_table. Create a new table base_table_date_id using the code below:
let
Source = base_table,
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"MonthInYear", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"id"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"MonthInYear", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type)
in
#"Added Index"
Finally, merge the two tables base_table and base_table_date_id into a new table on the MonthInYear column. The code for the new table is as follows:
let
Source = Table.NestedJoin(base_table, {"MonthInYear"}, base_table_date_id, {"MonthInYear"}, "base_table_date_id", JoinKind.LeftOuter),
#"Expanded base_table_date_id" = Table.ExpandTableColumn(Source, "base_table_date_id", {"Index"}, {"base_table_date_id.Index"})
in
#"Expanded base_table_date_id"
The final output is base_table with an added base_table_date_id.Index column that holds the Period ID.
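The two queries together compute what is often called a dense rank: number the distinct MonthInYear values in ascending order, then look that number up for every row. A minimal plain-Python sketch of the same idea, using the sample data from the question (the names `period_id` and `ranked` are made up for this illustration):

```python
# MonthInYear values from the question, in row order (ids 1 to 23).
month_in_year = [20180100] * 5 + [20180200] * 8 + [20180300] * 10

# Distinct, sorted periods numbered from 1: the counterpart of
# Table.Distinct, Table.Sort and Table.AddIndexColumn.
period_id = {
    period: index
    for index, period in enumerate(sorted(set(month_in_year)), start=1)
}

# Look up the period number for every row: the counterpart of the merge.
ranked = [
    (row_id, period, period_id[period])
    for row_id, period in enumerate(month_in_year, start=1)
]

for row in ranked:
    print(row)
```

Rows 1 to 5 get Period ID 1, rows 6 to 13 get 2, and rows 14 to 23 get 3.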
I have a csv file like below.
Beat,Hour,Month,Primary Type,COUNTER
111,10AM,Apr,ASSAULT,12
111,10AM,Apr,BATTERY,5
111,10AM,Apr,BURGLARY,1
111,10AM,Apr,CRIMINAL DAMAGE,4
111,10AM,Aug,MOTOR VEHICLE THEFT,2
111,10AM,Aug,NARCOTICS,1
111,10AM,Aug,OTHER OFFENSE,18
111,10AM,Aug,THEFT,38
Now I want to find the percentage of each Primary Type within each group defined by the first three columns. For example, for Beat = 111, Hour = 10AM, Month = Apr, %Assault = 12/(12+5+1+4) * 100. Can anyone give me a clue on how to do this using pandas?
You can use transform with 'sum':
df['New']=df.COUNTER/df.groupby(['Beat','Hour','Month']).COUNTER.transform('sum')*100
df
Out[575]:
Beat Hour Month Primary Type COUNTER New
0 111 10AM Apr ASSAULT 12 54.545455
1 111 10AM Apr BATTERY 5 22.727273
2 111 10AM Apr BURGLARY 1 4.545455
3 111 10AM Apr CRIMINAL DAMAGE 4 18.181818
4 111 10AM Aug MOTOR VEHICLE THEFT 2 3.389831
5 111 10AM Aug NARCOTICS 1 1.694915
6 111 10AM Aug OTHER OFFENSE 18 30.508475
7 111 10AM Aug THEFT 38 64.406780
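As a cross-check of the numbers in the New column, the same share-of-group arithmetic can be written without pandas. This sketch is only meant to show what transform('sum') contributes, namely the group total repeated on every row of the group; the variable names are illustrative.

```python
from collections import defaultdict

# Rows from the CSV in the question: (Beat, Hour, Month, Primary Type, COUNTER)
rows = [
    (111, "10AM", "Apr", "ASSAULT", 12),
    (111, "10AM", "Apr", "BATTERY", 5),
    (111, "10AM", "Apr", "BURGLARY", 1),
    (111, "10AM", "Apr", "CRIMINAL DAMAGE", 4),
    (111, "10AM", "Aug", "MOTOR VEHICLE THEFT", 2),
    (111, "10AM", "Aug", "NARCOTICS", 1),
    (111, "10AM", "Aug", "OTHER OFFENSE", 18),
    (111, "10AM", "Aug", "THEFT", 38),
]

# Total COUNTER per (Beat, Hour, Month) group.
totals = defaultdict(int)
for beat, hour, month, _, counter in rows:
    totals[(beat, hour, month)] += counter

# Each row's share of its own group's total, as a percentage.
percent = [
    counter / totals[(beat, hour, month)] * 100
    for beat, hour, month, _, counter in rows
]

for (_, _, month, primary_type, _), share in zip(rows, percent):
    print(month, primary_type, round(share, 6))
```

The Apr group sums to 22 and the Aug group to 59, which gives 12/22 * 100 for ASSAULT and 38/59 * 100 for THEFT, matching the first and last values in the output above.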