I am working on a Power BI report and I am using a parquet file as the source. I have this column that holds info about the month:
As you see here, in MONTH_RUN, I have the data stored both as "202201" (yyyymm format) and "12/1/2021" (more like a date format).
I need to change the values that appear like "12/1/2021" to "202112", so that all the values have the same format.
I need to do this dynamically, e.g. by finding the rows whose values contain the "/" character and converting those to the "yyyymm" format.
Is this possible?
I'm fairly new to Power Query and don't know how to implement this, but I would prefer to do it without creating any additional columns.
Please, do not post sample data as images. It makes it difficult to answer your question.
One way to deal with this is to add a custom column, where the value will be either the same as MONTH_RUN (if it doesn't contain /), or it will parse the text to date, and then format it accordingly. For example:
if Text.Contains([MONTH_RUN], "/") then
Date.ToText(Date.FromText([MONTH_RUN], [Format="M/d/yyyy", Culture="en-US"]), [Format="yyyyMM", Culture="en-US"])
else
[MONTH_RUN]
Then use this custom column in your report.
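If you would rather not add a column, the same expression can be applied to MONTH_RUN in place with Table.TransformColumns. This is only a minimal sketch: it assumes the column is text with no nulls, and the sample table and step names are made up for illustration.

```powerquery
let
    // Hypothetical sample data standing in for the parquet source
    Source = #table(type table [MONTH_RUN = text], {{"202201"}, {"12/1/2021"}}),
    // Rewrite only the values containing "/", leave the rest untouched
    Normalized = Table.TransformColumns(
        Source,
        {{"MONTH_RUN",
          each if Text.Contains(_, "/")
               then Date.ToText(
                        Date.FromText(_, [Format = "M/d/yyyy", Culture = "en-US"]),
                        [Format = "yyyyMM", Culture = "en-US"])
               else _,
          type text}})
in
    Normalized
```

In a real query, Source would be replaced by the name of your previous step.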
I would just delete the existing MONTH_RUN column and create a new one based on the date in the DAY_RUN column.
eg:
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"MONTH_RUN"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "MONTH_RUN", each Date.ToText([DAY_RUN],"yyyyMM"), type text)
Or, from the UI:
I'm relatively new to Power Query. I'm looking for the best way to replace a column's value as below.
The column has date values in a mixed format such as below.
09/16/2022
09/20/2022
09/26/2022
09/30/2022
10-01-2022
10-03-2022
10-05-2022
I'm looking to standardize and make the format generic as below.
09-16-2022
09-20-2022
09-26-2022
09-30-2022
10-01-2022
10-03-2022
10-05-2022
It seems one way to implement this is to use the Advanced Editor and write M code for the replacement, using functions like Table.TransformColumns and Text.Replace.
I can't figure out the exact code to use, or whether there is a better way.
Looking for suggestions. Thanks.
If you're a beginner, let the UI write this code for you. Highlight the column by clicking the column header, go to Transform on the ribbon and click Replace Values. Replace "-" with "/" and click OK.
Finally, right-click the column header again, click Change Type, and then select Date.
In M code you could use:
Table.TransformColumns(#"Previous Step",{{"Column Name", each Text.Replace(_,"/","-")}})
As an example:
let
//create sample table
Source = Table.FromColumns(
{{"09/16/2022",
"09/20/2022",
"09/26/2022",
"09/30/2022",
"10-01-2022",
"10-03-2022",
"10-05-2022"}},
type table[dates=text]),
//replace "/" with "-"
Normalize = Table.TransformColumns(Source,{{"dates", each Text.Replace(_,"/","-"), type text}})
in
Normalize
Notes:
Original data is dates as text strings
Final data is also dates as text strings
To convert the strings to dates instead, you could use something like: Table.TransformColumns(Source,{{"dates", each Date.From(_, "en-US"), type date}}). The separator displayed would then follow your Windows regional date settings.
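If the goal is the MM-dd-yyyy text from the question regardless of regional settings, one option is to parse each string to a date and then format it explicitly. A minimal sketch, assuming every value is month-first; the step names are illustrative only.

```powerquery
let
    // Two sample values, one per separator style
    Source = Table.FromColumns({{"09/16/2022", "10-01-2022"}}, type table [dates = text]),
    // Unify the separator first so a single format string can parse every value
    Parsed = Table.TransformColumns(
        Source,
        {{"dates",
          each Date.FromText(Text.Replace(_, "-", "/"), [Format = "MM/dd/yyyy", Culture = "en-US"]),
          type date}}),
    // Format back to text with an explicit pattern
    Formatted = Table.TransformColumns(
        Parsed,
        {{"dates", each Date.ToText(_, [Format = "MM-dd-yyyy", Culture = "en-US"]), type text}})
in
    Formatted
```

Going through a real date also means invalid strings surface as errors instead of passing through silently.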
I'm trying to Add Custom Column in Power Query with the objective to return a Table from a List of dates.
The syntax used is as follows below:
= Table.AddColumn(TypeDate, "AddTable", each Table.FromList(
List.Dates([Date_begin],1,#duration(1,0,0,0)
)))
where:
TypeDate is the name of last step in Power Query
"AddTable" is the name of added custom column
[Date_begin] is a column with dates to be considered as the start of my list
Although the syntax seems correct, Power Query returns an error described as follows:
Expression.Error: We could not convert the value #date(2021, 1, 1) on to Text.
Details:
Value=01/01/2021
Type=[Type]
Does anyone know how to handle this problem?
Your question is unclear.
Do you want to add a column that holds a table of dates for each row, running from Date_Begin to Mes_Final? If so:
#"Added Custom" = Table.AddColumn(TypeDate, "AddTable", each Table.TransformColumnTypes(Table.FromList({Number.From([Date_Begin])..Number.From([Mes_Final])}, Splitter.SplitByNothing(), {"date"}),{{"date", type date}}))
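As for the error itself: when Table.FromList is called without a splitter, it tries to split each list item as text, which is why a date value cannot be converted. Passing Splitter.SplitByNothing() should avoid that. A minimal sketch that keeps List.Dates from the question; the sample table, the count of 3 and the column name "date" are assumptions for illustration.

```powerquery
let
    // Hypothetical stand-in for the TypeDate step
    TypeDate = #table(type table [Date_begin = date], {{#date(2021, 1, 1)}}),
    AddTable = Table.AddColumn(
        TypeDate,
        "AddTable",
        each Table.FromList(
            // Three consecutive days starting at Date_begin
            List.Dates([Date_begin], 3, #duration(1, 0, 0, 0)),
            // Keep each list item as-is instead of splitting it as text
            Splitter.SplitByNothing(),
            {"date"}))
in
    AddTable
```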
I have a table that's generated when I pull data from an accounting software - the example columns are months/years in the format as follows (It pulls all the way to current day, and the last month will be partial month data):
Nov_2020
Dec_2020
Jan_2021
Feb_1_10_2021 (Current month, column to remove)
... So on and so forth.
What I have been trying to figure out is how to use the Power Query editor to remove the last column (the partial month). I tried working with the text length, to no avail (the idea being to remove any column whose name is longer than 8 characters, so the full months would remain but the last month would not). I can't just remove it with a text filter, because if someone pulls the data a year from now it has to account for 2021/2022.
Is this possible to do in PQ? Sorry, I'm new to it so if I need to elaborate more I can.. Thanks!
You can do this with Table.SelectColumns where you use List.Select on the Table.ColumnNames.
= Table.SelectColumns(
PrevStep,
List.Select(Table.ColumnNames(PrevStep), each Text.Length(_) <= 8)
)
Although both Alexis Olson's and Justyna MK's answers are valid, there is another approach. Since it appears that you're getting data for each month in a separate column, you will almost certainly want to unpivot your data, that is, transform those columns into rows. That is the most practical shape for analysis, so I would suggest unpivoting the columns first, then simply filtering out the rows for the last month.
To make it dynamic, I would use the Unpivot Other Columns option: you select the columns to keep, and the remaining columns are transformed into rows, producing two new columns, one containing the former column names and the other containing the values.
To illustrate what I mean by unpivoting, when you have data like this:
You're automatically transforming that into this:
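In M, the unpivot-then-filter idea might look roughly like this. It is only a sketch: the "Account" column and the sample numbers are hypothetical, and it assumes the partial month is always the last column.

```powerquery
let
    // Hypothetical sample in the shape described in the question
    Source = #table(
        {"Account", "Nov_2020", "Dec_2020", "Jan_2021", "Feb_1_10_2021"},
        {{"Sales", 10, 20, 30, 5}, {"Costs", 4, 8, 12, 2}}),
    // Remember the name of the partial-month column before unpivoting
    LastMonth = List.Last(Table.ColumnNames(Source)),
    // Every column except "Account" becomes a Month/Value pair of rows
    Unpivoted = Table.UnpivotOtherColumns(Source, {"Account"}, "Month", "Value"),
    // Drop the rows that came from the partial month
    Filtered = Table.SelectRows(Unpivoted, each [Month] <> LastMonth)
in
    Filtered
```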
You can try to do it through Power Query's Advanced Editor. Assign the name of the last column to LastColumn variable and then use it in the last step (Removed Columns).
let
Source = Excel.Workbook(File.Contents("<Excel file path>"), null, true),
tblPQ_Table = Source{[Item="tblPQ",Kind="Table"]}[Data],
#"Changed Type" = Table.TransformColumnTypes(tblPQ_Table,{{"Nov_2020", Int64.Type}, {"Dec_2020", Int64.Type}, {"Jan_2021", Int64.Type}, {"Feb_1_10_2021", Int64.Type}}),
LastColumn = List.Last(Table.ColumnNames(#"Changed Type")),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{LastColumn})
in
#"Removed Columns"
The picture I have attached shows what my power query table looks like (exactly the same as source file) and then underneath what I would like the final end product to look like.
Correct me if I'm wrong, but I thought the purpose of Power Query/Power BI was to leave the source file untouched and do this kind of manipulation in Power Query/Power BI?
If that's the case, how can I enter new columns and data to the existing table below?
You can add custom columns in Power BI without manipulating the source file. Please refer to the link below.
https://learn.microsoft.com/en-us/power-bi/desktop-add-custom-column
EDIT: Based on your comment, I have edited my answer. Not sure if this helps.
Click Edit Queries after loading the source file into Power BI.
Enter Data - Use the Enter Data button to create a new table containing the sample data you provided. Data can be copied and pasted from Excel, or you can enter new rows manually. Keep the Tag Number column as the reference key.
Merge Queries - Once the new table is created, merge it with the original table on the Tag Number column.
Expand Table - In the original table, expand the merged table column. Uncheck Tag Number (as it is already present) and uncheck "Use original column name as prefix".
Now the table will look the way you wanted.
You can always change the data (add new columns/rows) manually in the new table by clicking the gear button next to Source.
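For reference, the merge and expand steps correspond roughly to Table.NestedJoin and Table.ExpandTableColumn in M. A minimal sketch, with both tables made up inline for illustration; in practice they would be your original query and the Enter Data query.

```powerquery
let
    // Hypothetical stand-ins for the original table and the manually entered table
    Original = #table({"Tag Number", "Live Weight"}, {{"4353654", 6}, {"4353655", 7}}),
    Extra = #table({"Tag Number", "Mob"}, {{"4353654", "15-OHIO"}}),
    // Merge Queries: left outer join on Tag Number, result nested in column "Extra"
    Merged = Table.NestedJoin(Original, {"Tag Number"}, Extra, {"Tag Number"}, "Extra", JoinKind.LeftOuter),
    // Expand Table: pull only Mob out of the nested table, without a prefix
    Expanded = Table.ExpandTableColumn(Merged, "Extra", {"Mob"})
in
    Expanded
```

Rows without a match in the manually entered table get null in the new column.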
Here is the closest solution I found to "manual data entry", giving you as much freedom as you like to add rows of data when the columns you want to create do not follow a specific pattern.
I used an example for the column "Mob". I have not exactly reproduced the content of your cells but I hope that this will not be an issue to understand the logic.
Here is the data I am starting with:
Here is the Power Query in which I "manually" add a row:
#"Added Conditional Column" = Table.AddColumn(#"Changed Type", "Mob", each if [Tag Number] = "v" then null else null),
NewRows = Table.InsertRows(#"Added Conditional Column", 2, {[Mob="15-OHIO", Tag Number="4353654", Electronic ID=1.5, NLIS="", Date="31/05/2015", Live Weight="6", Draft="", Condition store="", Weighing Type="WEAN"]})
in
NewRows
1) I first created a column with only null values:
#"Added Conditional Column" = Table.AddColumn(#"Changed Type", "Mob", each if [Tag Number] = "v" then null else null),
2) With the "Table.InsertRows" function:
I indicated the position: 2 (the offset is zero-based and counts data rows only, not the header, so the new row becomes the third data row)
I indicated the column in which I wanted to insert the value, i.e. "Mob"
I indicated the values that the other columns should have in the new row:
NewRows = Table.InsertRows(#"Added Conditional Column", 2, {[Mob="15-OHIO", Tag Number="4353654", Electronic ID=1.5, NLIS="", Date="31/05/2015", Live Weight="6", Draft="", Condition store="", Weighing Type="WEAN"]})
Here is the result:
I hope this helps.
You can apply this logic to all the other rows.
I do not think this is very scalable, however, because each time you also have to specify the row's values in the other columns. There might be a better option.
I have imported a JSON file into PowerBI and it contains a column in which the values are of type "List". I am looking to expand that column into multiple columns.
Specifically, the data contains a Sprint Name, the start date and the end date of the sprint, along with some other values associated with each sprint.
Trying to use "Expand to new rows" duplicates each sprint instance, creating a table that looks like this, duplicating each sprint instance multiple times for each associated value:
Sprint Name Value
JAN(S1Dev) 2019-01-01
JAN(S1Dev) 2019-01-13
JAN(S1Dev) {attribute}
JAN(S1Dev) {attribute}
JAN(S2Dev) 2019-01-14
JAN(S2Dev) 2019-01-31
JAN(S2Dev) {attribute}
JAN(S2Dev) {attribute}
FEB(S1Test) 2019-02-01
FEB(S1Test) 2019-02-15
... ...
I would like to do something similar to the "expand" feature, but one that creates a new column for each attribute rather than a new row. The current behaviour vastly increases the size of my table for no reason, while also making the data practically unusable. Any help would be appreciated, cheers!
I have found a very simple solution to this, but as it took me some time to figure it out I will answer my own question instead of deleting it to help others in the future...
Upon importing the JSON data into PowerBI first select "Convert to Table" to view the data as a table with editable properties.
Next, click the arrows pointing away from each other at the top of the column of Lists, and select "Extract Values".
Select a delimiter to use for concatenating the values. I am choosing a comma, since I know that the data contained within the list does not have any commas in it. If your data does contain commas, choose a delimiter that does not appear in your data.
It should now display a comma-separated list where it previously displayed "List" in orange text.
Now, right-click on the column and select "Split Column" then choose "By Delimiter"
Select the delimiter that you previously chose, and under "split at" select "Each occurrence of the delimiter" then click OK.
Your column should now be split into multiple columns based on the list!
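The UI steps above come out roughly as the following M. This is a sketch rather than the exact code the UI generates: the sample table and the new column names "Start" and "End" are assumptions, and it assumes every list has the same number of items.

```powerquery
let
    // Hypothetical sample with a list-valued column
    Source = #table(
        {"Sprint Name", "Values"},
        {{"JAN(S1Dev)", {"2019-01-01", "2019-01-13"}},
         {"JAN(S2Dev)", {"2019-01-14", "2019-01-31"}}}),
    // Extract Values: join each list into one comma-separated string
    Extracted = Table.TransformColumns(
        Source,
        {{"Values", each Text.Combine(List.Transform(_, Text.From), ","), type text}}),
    // Split Column > By Delimiter: one new column per list item
    Split = Table.SplitColumn(
        Extracted,
        "Values",
        Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
        {"Start", "End"})
in
    Split
```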