Load multiple files from folder without duplicating headers - powerbi

I am using the Folder data source in Power BI.
The files are Excel files with the same structure.
Each file has the column names in its first row.
When I load the data, the column names show up as data rows, even though I clicked the
"Use First Row as Headers" button.
How can I remove the first row from all files?
I may end up with hundreds of files in that folder, so I can't remove them one by one.

I had a similar problem to solve. I found the answer here:
https://powerbi.tips/2016/08/load-multiple-excel-xlsx-files/
The logic is:
When you get data from an Excel file, you get a column named Data that holds a Table value for each sheet.
Refer to the Data column without expanding it. Add a new column:
= Table.AddColumn(#"Previous Step", "TablesWithHeaders", each Table.PromoteHeaders([Data], [PromoteAllScalars=true]))
Expand column TablesWithHeaders.
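For reference, the steps above could be put together roughly as follows. This is a minimal sketch, not the linked article's exact query: the folder path, the step names and the "Item Name" column are assumptions, and it assumes that every sheet in every workbook has the same layout.

```powerquery
let
    // Hypothetical folder path; replace it with your own
    Source = Folder.Files("C:\Data\Reports"),
    // Keep only the Excel workbooks
    ExcelFiles = Table.SelectRows(Source, each [Extension] = ".xlsx"),
    // Parse each file's binary content into its list of sheets, tables and ranges
    Workbooks = Table.AddColumn(ExcelFiles, "Workbook", each Excel.Workbook([Content])),
    // Folder.Files already has a Name column, so the workbook's Name is renamed on expansion
    Expanded = Table.ExpandTableColumn(Workbooks, "Workbook", {"Name", "Data", "Kind"}, {"Item Name", "Data", "Kind"}),
    Sheets = Table.SelectRows(Expanded, each [Kind] = "Sheet"),
    // Promote headers inside each nested table BEFORE combining, so no header row leaks into the data
    WithHeaders = Table.AddColumn(Sheets, "TablesWithHeaders", each Table.PromoteHeaders([Data], [PromoteAllScalars=true])),
    // Append all per-sheet tables into one
    Combined = Table.Combine(WithHeaders[TablesWithHeaders])
in
    Combined
```

Table.Combine is used here in place of expanding the column through the UI; both give one table with a single header row.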

Sample data:
You can apply a text filter on Column1 to filter out the repeated header values.
Results:
This assumes that no genuine data row happens to contain the text "Column1" in Column1.
M query for your reference:
#"Filtered Rows" = Table.SelectRows(#"Your Previous Steps", each [Column1] <> "Column1")

Related

Power Query / Power BI - How to move a cell value to a separate cell the easiest way?

I want to move a single value from column B to column A. What is the simplest way to do this in Power Query / Query Editor (Power BI)?
Please see attached images.
I know I might need to declare a variable, so please enlighten me. By the way, afterwards I will delete row 1, promote my headers, and rename column2 to PERIOD.
Thank you.
This might be along the lines of what you want to do.
If I start with this table named as Table1:
Then I click on the fx to the left of the formula bar:
And type = Table.InsertRows(Source, Table.RowCount(Source), {[Column2 = Source[KP20 rate]{0}, KP20 rate = null, Column4 = null]}) into the formula bar:
I used Table.InsertRows to add a new row to Table1. Source is the name of the latest state of Table1 after it is pulled into Power Query and before this step, so the step refers to Source rather than Table1. (Each applied step basically results in its own table. You probably know this already, but others may not.) Since I want the new row to appear at the bottom of Source, I pass Table.RowCount(Source) as the position for the new row. Then I list each column name with the value to add. For Column2, I entered Source[KP20 rate]{0}, which treats the KP20 rate column as a list, where {0} points to the first item in that list. To target the second item you would use Source[KP20 rate]{1}. The other two columns (KP20 rate and Column4) are set to null.
The result:
Here's the M code in case you want to see it:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Custom1 = Table.InsertRows(Source, Table.RowCount(Source), {[Column2 = Source[KP20 rate]{0}, KP20 rate = null, Column4 = null]})
in
    Custom1

POWER BI - How to add manual columns/data to existing table instead of adding columns/data to the source csv file

The picture I have attached shows what my power query table looks like (exactly the same as source file) and then underneath what I would like the final end product to look like.
Correct me if I'm wrong, but I thought the purpose of Power Query / Power BI was to avoid manipulating the source file and to do this kind of work in Power Query / Power BI instead?
If that's the case, how can I add new columns and data to the existing table below?
You can add custom columns in Power BI without manipulating the source file. Please refer to the link below.
https://learn.microsoft.com/en-us/power-bi/desktop-add-custom-column
EDIT: Based on your comment, I have edited my answer. Not sure if this helps.
Click Edit Queries after loading the source file into Power BI.
Enter Data - Using the 'Enter Data' button, I entered the sample data you provided and created a new table. Data can be copied and pasted from Excel, and you can enter new rows manually. I used the Tag Number column as the reference key.
Merge Queries - Once the above table is created, merge it with the original table on the Tag Number column.
Expand Table - In the original table, expand the merged table. Uncheck Tag Number (as it is already present) and uncheck "Use original column name as prefix".
Now the table will look the way you wanted.
You can always change the data (add new columns/rows) manually in the new table by clicking the gear button next to Source.
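In M terms, the merge and expand steps amount to something like the following. This is only a sketch with made-up inline tables standing in for the source file and the 'Enter Data' table: the second tag number, the weights and the step names are hypothetical.

```powerquery
let
    // Stand-in for the table loaded from the source file (values are illustrative)
    Original = #table({"Tag Number", "Live Weight"}, {{"4353654", 6}, {"4353655", 7}}),
    // Stand-in for the table created with 'Enter Data'
    Manual = #table({"Tag Number", "Mob"}, {{"4353654", "15-OHIO"}}),
    // Merge Queries: left outer join on Tag Number keeps every original row
    Merged = Table.NestedJoin(Original, {"Tag Number"}, Manual, {"Tag Number"}, "Manual", JoinKind.LeftOuter),
    // Expand Table: bring in only the manual column, without a prefix
    Expanded = Table.ExpandTableColumn(Merged, "Manual", {"Mob"})
in
    Expanded
```

Rows without a match in the manual table get null in the Mob column, so the source file never needs to change.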
Here is the closest solution I found to "manual data entry". It gives you as much freedom as you like to add rows of data when the columns you want to create do not follow a specific pattern.
I used the column "Mob" as an example. I have not reproduced the content of your cells exactly, but I hope this does not get in the way of understanding the logic.
Here is the data I am starting with:
Here is the Power Query in which I "manually" add a row:
let
    // ... earlier steps (up to #"Changed Type") are not shown here
    #"Added Conditional Column" = Table.AddColumn(#"Changed Type", "Mob", each if [Tag Number] = "v" then null else null),
    NewRows = Table.InsertRows(#"Added Conditional Column", 2, {[Mob="15-OHIO", Tag Number="4353654", Electronic ID=1.5, NLIS="", Date="31/05/2015", Live Weight="6", Draft="", Condition store="", Weighing Type="WEAN"]})
in
    NewRows
1) I first created a column with only null values:
#"Added Conditional Column" = Table.AddColumn(#"Changed Type", "Mob", each if [Tag Number] = "v" then null else null),
2) With the Table.InsertRows function:
I indicated the position of the new row: 2 (the offset is zero-based, so the new row becomes the third row of the table)
I indicated the column in which I wanted to insert the value, i.e. "Mob"
I indicated the values that the new row should have in all the other columns:
NewRows = Table.InsertRows(#"Added Conditional Column", 2, {[Mob="15-OHIO", Tag Number="4353654", Electronic ID=1.5, NLIS="", Date="31/05/2015", Live Weight="6", Draft="", Condition store="", Weighing Type="WEAN"]})
Here is the result:
I hope this helps.
You can apply this logic to all the other rows.
I do not think this is very scalable, however, because each time you have to specify the values for the other columns as well. There might be a better option.

Power Query combine three external Excel source files and append specific columns

I'm trying to create a lookup table that combines the primary key columns of my 3 source files, so that I won't have to do an outer join to find the missing records from each source and then append them together. I've found how to "combine" two source files, but I can't figure out how to drill into the column/field lists so that I can select only Column 1 (or the "Item Code" header name in the Excel files).
Here is the code I have so far to combine 2 of the 3 files (as a trial):
let
Source = Table.Combine({Excel.Workbook(File.Contents("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources\JDE_MRP_Dmd.xlsx"), null, true),
Excel.Workbook(File.Contents("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources\JDE_Open_PO.xlsx"), null, true)})
in Source
If you've got a less-than-ideal data source (i.e. lots of irrelevant columns, or duplicates in the data you want), one way to avoid materialising a whole bunch of unnecessary data is to perform all your transformations and filtering on the nested table cells, rather than loading all the data just to remove columns and duplicates.
The M code below should be a rough start that hopefully gets you on the way:
let
    //Adjust the Source step to refer to the folder your 3 source files are saved in
    Source = Folder.Files("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources"),
    //Filter the file list to leave just your 3 source files if required
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx")),
    //Remove all columns except the Binary file column
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content"}),
    //Convert the binary file to the file data ie sheets, tables, named ranges etc - the same data you get when you use a file as a source
    #"Workbook Data" = Table.TransformColumns(#"Removed Other Columns",{"Content", each Excel.Workbook(_)}),
    //Filter the nested file data table cell to select the sheet you need from your source files - may not be necessary depending on what's in the files
    #"Sheet Filter" = Table.TransformColumns(#"Workbook Data",{"Content", each Table.SelectRows(_, each [Name] = "Sheet1")}),
    //Step to name the column you want to extract data from
    #"Column Name" = "Column1",
    //Extract a list of the distinct values in the specified column
    #"Column Values" = Table.TransformColumns(#"Sheet Filter",{"Content", each List.Distinct(Table.Column(_{0}[Data],#"Column Name"))}),
    //Expand all the lists
    #"Expanded Content" = Table.ExpandListColumn(#"Column Values", "Content"),
    //Remove duplicates across files
    #"Removed Duplicates" = Table.Distinct(#"Expanded Content")
in
    #"Removed Duplicates"
EDIT
To select multiple columns and return the distinct rows, you could change the steps starting from #"Column Name".
This could end up taking a fair bit longer than the previous version, depending on how much data you have, but it should do the job:
    //Step to name the columns you want to extract data from
    #"Column Name" = {"Column1","Column2","Column5"},
    //Keep only the specified columns in each nested table
    #"Column Values" = Table.TransformColumns(#"Sheet Filter",{"Content", each Table.SelectColumns(_{0}[Data],#"Column Name")}),
    //In each nested table, filter down to distinct rows
    #"Distinct rows in Nested Tables" = Table.TransformColumns(#"Column Values",{"Content", each Table.Distinct(_)}),
    //Expand nested table column
    #"Expanded Content" = Table.ExpandTableColumn(#"Distinct rows in Nested Tables", "Content", #"Column Name"),
    //Remove duplicates in the combined table
    #"Removed Duplicates" = Table.Distinct(#"Expanded Content")
in
    #"Removed Duplicates"
If you're starting out with Power Query, don't try to write your code manually and don't cram everything into one statement. Rather, use the ribbon commands and then edit the code if required.
For your scenario, you could create a separate query for each data source. Load these as connections only. Shape each data source to contain the columns you need. Then you can append the three data queries and further refine the result.
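The shape-then-append idea can be sketched in M as below, condensed into one query for brevity. The helper name LoadItemCodes is hypothetical, and the sketch assumes that the first item in each workbook is the sheet holding the data and that its header row contains "Item Code". Only the two files named in the question are listed; a third would be added the same way.

```powerquery
let
    // Folder taken from the question; adjust as needed
    FolderPath = "C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources\",
    // Hypothetical helper: load one workbook and keep only its key column
    LoadItemCodes = (fileName as text) as table =>
        let
            // Second argument true = use the first row as headers
            Workbook = Excel.Workbook(File.Contents(FolderPath & fileName), true),
            // Assumption: the first item in the workbook is the sheet with the data
            Sheet = Workbook{0}[Data],
            Keys = Table.SelectColumns(Sheet, {"Item Code"})
        in
            Keys,
    // Append the shaped tables, then drop duplicate keys
    Combined = Table.Combine({LoadItemCodes("JDE_MRP_Dmd.xlsx"), LoadItemCodes("JDE_Open_PO.xlsx")}),
    DistinctKeys = Table.Distinct(Combined)
in
    DistinctKeys
```

The code in the question combines the outputs of Excel.Workbook directly, which appends the workbooks' navigation tables (Name, Data, Item, Kind, Hidden) rather than the sheet contents; drilling into [Data] first is what makes the "Item Code" column reachable.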

Power BI : How to count occurrence of value from source table?

I have my data source something like below.
I need to show output in the report as below.
I tried unpivoting the columns and got something like this. How do I count the occurrences of each Business value?
Plot the following measure against the Value column (from your unpivoted table):
Business Occurance = COUNTROWS('your unpivot table')
We have to remove the Attribute column as the next step after Unpivot. Then the table should look like this.
Now create a new table with the following DAX function. Let's call the current table Business Data (your unpivoted table).
Occurrence Table = DISTINCT('Business Data')
Now the end result table should look like this:
You can use this table for your table visual in the report.
Note: You can add any number of rows and columns to your source table and this logic will still produce the correct result.
I have marked two places. In the first marked place, add the Value column. Then click the second marked place; a dropdown opens, and you select Count from the menu.
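If you would rather do the counting in Power Query than in DAX or in the visual, a Table.Group step after the unpivot is one option. This is an alternative to the answers above, not a restatement of them, and the sample table below is invented because the original screenshots are not available.

```powerquery
let
    // Hypothetical source table: one row per record, one column per business slot
    Source = #table({"ID", "Business 1", "Business 2"}, {{1, "A", "B"}, {2, "B", "C"}, {3, "A", "B"}}),
    // Unpivot everything except the key column into Attribute/Value pairs
    Unpivoted = Table.UnpivotOtherColumns(Source, {"ID"}, "Attribute", "Value"),
    // Group by Value and count the rows in each group
    Counts = Table.Group(Unpivoted, {"Value"}, {{"Occurrences", each Table.RowCount(_), Int64.Type}})
in
    Counts
```

With this sample data, A is counted 2 times, B 3 times and C once.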

Split a column of lists into multiple columns in PowerBI

I have imported a JSON file into PowerBI and it contains a column in which the values are of type "List". I am looking to expand that column into multiple columns.
Specifically, the data contains a Sprint Name, the start date and the end date of the sprint, along with some other values associated with each sprint.
Trying to use "Expand to new rows" duplicates each sprint instance once for each associated value, creating a table that looks like this:
Sprint Name     Value
JAN(S1Dev)      2019-01-01
JAN(S1Dev)      2019-01-13
JAN(S1Dev)      {attribute}
JAN(S1Dev)      {attribute}
JAN(S2Dev)      2019-01-14
JAN(S2Dev)      2019-01-31
JAN(S2Dev)      {attribute}
JAN(S2Dev)      {attribute}
FEB(S1Test)     2019-02-01
FEB(S1Test)     2019-02-15
...             ...
I would like to do something similar to the "expand" feature, but one that creates a new column for each attribute rather than a new row. The current behaviour vastly increases the size of my table for no reason, while also making the data practically unusable. Any help would be appreciated, cheers!
I have found a very simple solution to this, but as it took me some time to figure it out, I will answer my own question instead of deleting it, to help others in the future...
Upon importing the JSON data into PowerBI, first select "Convert to Table" to view the data as a table with editable properties.
Next, click the arrows pointing away from each other at the top of the column of Lists, and select "Extract Values".
Select a delimiter to use for concatenating values. I am choosing a comma since I know that the data in the list does not contain any commas. If your data contains commas, choose something else. In general, do not pick a delimiter that appears in your data.
It should now display a comma-separated list where it previously displayed "List" in orange text.
Now, right-click on the column and select "Split Column", then choose "By Delimiter".
Select the delimiter that you previously chose, and under "Split at" select "Each occurrence of the delimiter", then click OK.
Your column should now be split into multiple columns based on the list!
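The same two UI steps correspond roughly to the M below. It is a sketch on an inline sample table: the step names and the output column names "Start Date" and "End Date" are assumptions, and each list is assumed to hold exactly two values.

```powerquery
let
    // Sample table with a column of lists, modelled on the question's data
    Source = #table({"Sprint Name", "Value"}, {
        {"JAN(S1Dev)", {"2019-01-01", "2019-01-13"}},
        {"JAN(S2Dev)", {"2019-01-14", "2019-01-31"}}
    }),
    // "Extract Values": join each list into one comma-separated text value
    Extracted = Table.TransformColumns(Source, {{"Value", each Text.Combine(List.Transform(_, Text.From), ","), type text}}),
    // "Split Column > By Delimiter": one new column per list item
    Split = Table.SplitColumn(Extracted, "Value", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Start Date", "End Date"})
in
    Split
```

Each sprint stays on a single row, with one column per list item.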