I have a source table in more or less this format:
And I am looking for a way to extract the data for each section by using only the data from the 'Name' column.
But since I'm pretty new to Power Query and Power BI, I can't find the right command to reach my goal.
The first possibility is to add a new column and identify each line by the title, like this:
Or the second possibility is to create 3 new tables, one for each title, and separate the data like this:
Thanks
Add a custom column that checks the Value 1 column for a null, and if it's a null, then return the value in Name:
= if [Value 1]=null then [Name] else null
Then right-click and fill down that new column, and apply a filter to it:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each if [Value 1] = null then [Name] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom", {"Custom"})
    // add a step here to optionally filter on the Custom column
in
    #"Filled Down"
Or just keep one query, and create more queries that apply specific filters to it, like this one:
let
    Source = Table.SelectRows(OtherQueryNameHere, each [Custom] = "m")
in
    Source
This relates to Power BI:
I have this code (M code, not DAX):
#"Split Column by Delimiter3" = Table.SplitColumn(#"Reordered Columns" , "Custom2", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), ListB),
[image: Custom2 column]
Which creates multiple columns from this list with data from the Custom2 column.
[image: List]
And this is the result:
[image: Result of Custom2 & List]
Now I want to add logic so that if the result in each cell is blank or null it becomes 0 (zero), else $200.
So the result would look like:
[image: Result Wanted]
Given that the code creates multiple columns at once, what M code do I need to wrap around or put inside the existing #"Split Column by Delimiter3" step?
I can do this manually but the whole point is to make it dynamic.
I have tried List.ReplaceValue(Text.Split([Custom2], ","), "0", "$200", Replacer.ReplaceText), which does work as text, but I don't know how to make it replace any non-blank value with $200.
Any help would be appreciated.
Thanks
Update: I found a solution myself:
1st step
use #"Added Custom13" =Table.AddColumn(#"Duplicated Column4", "CustomTest", each List.Transform(Text.Split([Custom3Prep],","),each "$200")),
2nd step
use #"Extracted Values1" = Table.TransformColumns(#"Added Custom13", {"CustomTest", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
All the items within the delimiters were replaced with $200. :)
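For the blank-or-null-to-zero part of the original requirement, the replacement inside List.Transform can also be made conditional. A sketch, assuming the same Custom3Prep column as in the first step:

#"Added Custom13" = Table.AddColumn(#"Duplicated Column4", "CustomTest",
    each List.Transform(
        Text.Split([Custom3Prep], ","),
        // blank items become "0", everything else "$200"
        (item) => if item = null or Text.Trim(item) = "" then "0" else "$200"
    )
),

The same Extracted Values step then recombines the list into text.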
I have data from an external source that is downloaded in CSV format. This data shows the interactions of several users and doesn't have an ID column. The problem I'm having is that I'm not able to use an index, because multiple entries represent interactions and processes. An interaction is the group of processes a specific user does, and a process represents each action taken in a specific interaction. Any user could repeat the same interaction at any time of day. The data looks like this:
User1 has 2 processes but there were 3 interactions. How can I assign an ID to each interaction, taking into consideration that there might be multiple processes for a single user on the same day? I tried grouping them in Power Query, but it groups the overall processes and I'm not able to distinguish the number of interactions. Is it better to do it in DAX?
Edit:
I realize that it is hard to understand what I need, but I think this is a better way to see it:
Process2 contains the steps done in an interaction. As in the yellow column, I need to add an ID taking into consideration where an interaction starts and where it ends.
I'm not exactly sure I follow what you describe. It looks to me like user1 has 4 interactions (Processes AA, AB, BA, and BB), but you say 3.
Still, I decided to take a shot at providing an answer anyway. I started with a CSV file set up like you show.
Then I brought the CSV into Power Query and, just to add a future point of reference so that you could follow the Id assignments better, added an index column that I called startingIndex.
Then I added a custom column combining the processes that I understand actually define an interaction.
Then I grouped everything by users and Interactions into a column named allData.
Then I added a custom column to copy the column that was created from the earlier grouping, to sort the tables within it, and to add an index to each table within it. This essentially indexed each user's interaction group. (Because all of your interactions occur on the same date(s), the sorting doesn't help much. But I did it to show where you could do it if you included datetime info instead of just a date.)
Then I added a custom column to copy the column that was created earlier to add the interactions index, and to add an Id item within each table within it. I constructed each Id by combining the user, interactions, and interactionIndex for each.
Then I selected the latest column I had created (complexId) and removed all other columns.
Last, I expanded all tables without including the Interactions and Index columns. (The Index column was the index used for the interactions within the groups and no longer needed.) I included the startingIndex column just so you could see where items originally were at the start, in comparison to their final Id.
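In M, that sequence of steps looks roughly like the sketch below. The file path and the column names (User, Process, Process2, Date) are assumptions standing in for the real data, so adjust them to match:

let
    // bring the CSV into Power Query (hypothetical path)
    Source = Csv.Document(File.Contents("C:\data\interactions.csv"), [Delimiter = ",", Encoding = 65001]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // point of reference so the Id assignments are easier to follow
    startingIndex = Table.AddIndexColumn(#"Promoted Headers", "startingIndex", 0, 1, Int64.Type),
    // combine the processes that actually define an interaction
    withInteractions = Table.AddColumn(startingIndex, "Interactions", each [Process] & [Process2], type text),
    // group everything by user and interaction into a column named allData
    grouped = Table.Group(withInteractions, {"User", "Interactions"}, {{"allData", each _, type table}}),
    // sort each nested table and index it, numbering each user's interaction group
    // (sort on a datetime column instead of Date if you have one)
    indexed = Table.AddColumn(grouped, "indexedData",
        each Table.AddIndexColumn(Table.Sort([allData], {{"Date", Order.Ascending}}), "interactionIndex", 1, 1)),
    // build an Id inside each nested table from user, interactions, and interaction index
    complexId = Table.AddColumn(indexed, "withId",
        each Table.AddColumn([indexedData], "Id",
            (r) => r[User] & "-" & r[Interactions] & "-" & Text.From(r[interactionIndex]), type text)),
    // keep only the last column created and expand it, leaving out the per-group index
    onlyId = Table.SelectColumns(complexId, {"withId"}),
    expanded = Table.ExpandTableColumn(onlyId, "withId",
        {"startingIndex", "User", "Process", "Process2", "Date", "Id"})
in
    expanded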
Given your new example, to create the Interaction ID you show, you only need the first two columns of the table. If it is not part of the original data, you can easily generate the third column (Process2).
It appears you want to increment the interaction ID whenever the Process changes.
Please read the comments in the M code and explore the Applied Steps to better understand the algorithm:
M Code
let
    //be sure to change the table name in the next row to your real table name
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"User", type text},
        {"Process", type text},
        {"Process2", type text}
    }),
    //add an index column
    idx = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    //Custom column returns the Index if
    //  the current Index is 0 (first row) or
    //  there has been a change in user or process comparing current/previous row
    //else returns null
    #"Added Custom" = Table.AddColumn(idx, "Custom",
        each if [Index] = 0
            then 0
            else if [Process] <> idx[Process]{[Index]-1} or [User] <> idx[User]{[Index]-1}
                then [Index]
                else null),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
    //Fill down the custom column
    //  now each interaction group has the same number
    #"Filled Down" = Table.FillDown(#"Removed Columns",{"Custom"}),
    //Group by the "filled down" custom column with no aggregation
    #"Grouped Rows" = Table.Group(#"Filled Down", {"Custom"}, {
        {"all", each _, type table [User=nullable text, Process=nullable text, Process2=nullable text, Custom=number]}
    }),
    //add a one-based Index column to the grouped table
    #"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Interaction ID", 1, 1, Int64.Type),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Index",{"Custom"}),
    //Re-expand the table
    #"Expanded all" = Table.ExpandTableColumn(#"Removed Columns1", "all",
        {"User", "Process", "Process2"}, {"User", "Process", "Process2"})
in
    #"Expanded all"
[image: Source]
[image: Results]
I'm trying to create a lookup table combining my 3 source files' primary key columns; this way I won't have to do an outer join to find the missing records from each source and then append them together. I've found how to "combine" two source files, but I can't figure out how to drill into the column/field lists so that I can select only Column1 (or the "Item Code" header name in the Excel files).
Here is the code I have so far to combine 2 of the 3 files (as a trial):
let
    Source = Table.Combine({
        Excel.Workbook(File.Contents("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources\JDE_MRP_Dmd.xlsx"), null, true),
        Excel.Workbook(File.Contents("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources\JDE_Open_PO.xlsx"), null, true)
    })
in
    Source
If you've got a less-than-ideal data source (i.e. lots of irrelevant columns, duplicates in the data you want), then one way to avoid materialising a whole bunch of unnecessary data would be to perform all your transformations/filtering on nested table cells, rather than loading all the data up just to remove columns/dupes.
The M code below should be a rough start that hopefully gets you on the way.
let
    //Adjust the Source step to refer to the relevant folder your 3 source files are saved in
    Source = Folder.Files("C:\Users\Desktop\Dry Good Demad-Supply Report\MRP_ParentDmd\Data_Sources"),
    //Filter the file list to leave just your 3 source files, if required
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx")),
    //Remove all columns except the binary Content column
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows", {"Content"}),
    //Convert each binary file to its workbook data, i.e. sheets, tables, named ranges etc - the same data you get when you use a file as a source
    #"Workbook Data" = Table.TransformColumns(#"Removed Other Columns", {"Content", each Excel.Workbook(_)}),
    //Filter the nested workbook-data table in each cell to select the sheet you need from your source files - may not be necessary depending on what's in the files
    #"Sheet Filter" = Table.TransformColumns(#"Workbook Data", {"Content", each Table.SelectRows(_, each [Name] = "Sheet1")}),
    //Step to name the column you want to extract data from
    #"Column Name" = "Column1",
    //Extract a list of the distinct values in the specified column
    #"Column Values" = Table.TransformColumns(#"Sheet Filter", {"Content", each List.Distinct(Table.Column(_{0}[Data], #"Column Name"))}),
    //Expand all the lists
    #"Expanded Content" = Table.ExpandListColumn(#"Column Values", "Content"),
    #"Removed Duplicates" = Table.Distinct(#"Expanded Content")
in
    #"Removed Duplicates"
EDIT
To select multiple columns and return the distinct rows, you could change the steps starting from #"Column Name":
This could end up taking a fair bit longer than the previous version, depending on how much data you have, but it should do the job.
    //Step to name the columns you want to extract data from
    #"Column Name" = {"Column1", "Column2", "Column5"},
    //Extract just the specified columns from each nested table
    #"Column Values" = Table.TransformColumns(#"Sheet Filter", {"Content", each Table.SelectColumns(_{0}[Data], #"Column Name")}),
    //In each nested table, filter down to distinct rows
    #"Distinct rows in Nested Tables" = Table.TransformColumns(#"Column Values", {"Content", each Table.Distinct(_)}),
    //Expand the nested table column
    #"Expanded Content" = Table.ExpandTableColumn(#"Distinct rows in Nested Tables", "Content", #"Column Name"),
    //Remove duplicates in the combined table
    #"Removed Duplicates" = Table.Distinct(#"Expanded Content")
in
    #"Removed Duplicates"
If you're starting out with Power Query, don't try to write your code manually and don't cram everything into one statement. Rather, use the ribbon commands and then edit the code if required.
For your scenario, you could create a separate query for each data source. Load these as connections only. Shape each data source to contain the columns you need. Then you can append the three data queries and further refine the result.
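As a minimal sketch of that approach (Source1, Source2, and Source3 are placeholder names for your three connection-only queries, each already shaped down to its Item Code column), the appending query could be as simple as:

let
    // stack the three key columns and drop duplicate codes
    Source = Table.Combine({Source1, Source2, Source3}),
    #"Removed Duplicates" = Table.Distinct(Source)
in
    #"Removed Duplicates"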
How do I set up a date filter before retrieving the full data? Right now Power BI exports all data from day one, which is more than 1.7 million rows. I reach the limit after 1 million rows, and using a date filter inside Power BI is not an option.
So, is there a possible way to set a date range parameter (for example, the last 3 months) before the export starts?
The source code in the Advanced Editor:
let
    Source = GoogleAnalytics.Accounts(),
    #"1*******1" = Source{[Id="1*******1"]}[Data],
    #"UA-1*******1-1" = #"1*******1"{[Id="UA-1*******1-1"]}[Data],
    #"1*******6" = #"UA-1*******1-1"{[Id="1*******6"]}[Data],
    #"Added Items" = Cube.Transform(#"1*******6",
        {
            {Cube.AddAndExpandDimensionColumn, "ga:eventAction", {"ga:eventAction"}, {"Event Action"}},
            {Cube.AddAndExpandDimensionColumn, "ga:eventLabel", {"ga:eventLabel"}, {"Event Label"}},
            /*{Cube.AddAndExpandDimensionColumn, "ga:date", {"ga:date"}, {"Date"}},*/
            {Cube.AddMeasureColumn, "Unique Events", "ga:uniqueEvents"}
        })
in
    #"Added Items"
#"1*******1" = Source{[Id="1*******1"]}[Data],
#"UA-1*******1-1" = #"1*******1"{[Id="UA-1*******1-1"]}[Data],
#"1*******6" = #"UA-1*******1-1"{[Id="1*******6"]}[Data],
My guess is that somewhere after the source location steps (the #"1*******1" through #"1*******6" steps) we need to somehow set date range parameters.
I know that this is an old question and I suppose you have solved it already.
But I think it may be useful for those who google this question.
After the #"Added Items" step, replace the closing text (in #"Added Items")
(be aware that the first argument of Table.SelectRows should be the name of the previous step)
with the following:
#"combinedData" = Table.Combine({
Table.SelectRows(#"Added Items", each Text.Contains([Month of Year], "2020"))
Table.SelectRows(#"Added Items", each Text.Contains([Month of Year], "2019"))
})
in
combinedData
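If you want a rolling window, such as the last 3 months from the original question, rather than hard-coded years, a hedged variation (assuming you re-enable the commented-out ga:date dimension so the query has a Date column) could be:

#"combinedData" = Table.SelectRows(#"Added Items",
    // keep only rows from the last 3 months, relative to refresh time
    each Date.From([Date]) >= Date.AddMonths(Date.From(DateTime.LocalNow()), -3))

Whether this filter folds back to the Analytics API or is applied locally depends on the connector.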
I am importing data into Power BI Desktop from an XML file. If a field value is empty, e.g.
<Address></Address>, then Power BI assigns that column the type "Table" and it is not possible to change the type to text. I am getting this error:
Expression.Error: We cannot convert a value of type Table to type Text.
Details:
Value=Table
Type=Type
It looks like if the first encountered value is empty, then all following rows of that column in the same "page" (data is being read in chunks, a.k.a. "pages", by the data connector) are assigned type "Table", though not always.
For some columns, it is not even possible to change the column data type (they have a "table" icon).
Any ideas on how to change the column type to text?
Edit1: I have noticed that clicking "Table" actually shows there is a text value in that column. Maybe there is some hidden setting in Power BI to overcome this blank string value problem?
Edit2: It looks like I have the same problem as described here: https://community.powerbi.com/t5/Desktop/XML-Import-Table-Type-in-Column/td-p/97512. Is there no solution to this simple problem?
I found this: https://community.powerbi.com/t5/Issues/Power-Query-should-not-wrap-simple-xml-elements-in-a-table/idi-p/125525. It looks like it is "by design" and no solution exists. It's a pity to have wasted time on this...
I have found someone's solution:
let
    ConvertTableField = (aTable as table, fieldName as text) =>
        let
            //alternative: https://community.powerbi.com/t5/Desktop/Expand-value-from-table/td-p/214838
            //#"Expanded g_v" = Table.TransformColumns(#"Expanded ts_info", {{"g_v", each if _ is table then Table.FirstValue(_, "") else _}})
            #"Lists from table" = Table.TransformColumns(
                aTable,
                {{fieldName, each if _ is table then Table.ToList(_) else {_}}}
            ),
            #"Expanded List" = Table.ExpandListColumn(#"Lists from table", fieldName)
        in
            #"Expanded List"
in
    ConvertTableField
I still haven't figured out how to run this function for all the columns of the table. The M language is the most difficult of the programming languages I more or less know...
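For what it's worth, one way to run a function like this over every column is List.Accumulate. A minimal sketch, assuming the function above is saved as a query named ConvertTableField and SomeXmlTable stands in for the query holding the imported XML data:

let
    Source = SomeXmlTable,
    // fold ConvertTableField over every column name, threading the table through each call
    AllFixed = List.Accumulate(
        Table.ColumnNames(Source),
        Source,
        (state, columnName) => ConvertTableField(state, columnName)
    )
in
    AllFixed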
I am not sure about the XML usage, but as far as table-type-to-text conversion goes, this worked for me. Maybe it's overkill.
let
    Source = (Table1 as table) =>
        let
            Section = Record.ToTable(#sections[Section1]),
            Compare = Table.SelectRows(Section, each [Value] = Table1){0},
            GetName = Record.Field(Compare, "Name")
        in
            GetName
in
    Source
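For context, a hypothetical invocation, assuming the function query above is saved under the name GetTableName and SomeQuery is another query in the same file:

// returns the text "SomeQuery" if the table values match
GetTableName(SomeQuery)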