Power Query join on the least difference

How can I join two tables in Power Query on the least difference between two columns? By difference I mean the absolute difference between the numbers.
I followed this great article: https://exceed.hr/blog/merging-with-date-range-using-power-query/
I tried adding this custom column analogously, where L stands for Tab1 and R for Tab2:
= Table.AddColumn(
    Source,
    "LeastAbsDifference",
    (L) =>
        Table.SelectRows(
            Tab2,
            (R) => L[category] = R[category] and Number.Abs(L[target] - R[actual])
        )
)
It produces this error:
Expression.Error: We cannot convert the value 4 to type Logical.
Tables to recreate example:
// Tab1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJVitWJVnKCs5whrFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, target = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"target", Int64.Type}})
in
#"Changed Type"
// Tab2
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsIzgLGMwywnIMoGzTOEsM6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, actual = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"actual", Int64.Type}})
in
#"Changed Type"

Here's one way:
Append the two tables
Group by Category
Output the desired columns as a Group aggregation
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJVitWJVnKCs5whrFgA", BinaryEncoding.Base64),
Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, target = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"target", Int64.Type}}),
Source2 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsIzgLGMwywnIMoGzTOEsM6XYWAA=",
BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, actual = _t]),
#"Changed Type1" = Table.TransformColumnTypes(Source2,{{"actual", Int64.Type}}),
//append the tables
append = Table.Combine({#"Changed Type",#"Changed Type1"}),
//Group by category, then output the desired columns
#"Grouped Rows" = Table.Group(append, {"category"}, {
{"target", each [target]{0},Int64.Type},
{"actual", (t)=> t[actual]{
List.PositionOf(List.Transform(t[actual], each Number.Abs(t[target]{0} - _)),
List.Min(List.Transform(t[actual], each Number.Abs(t[target]{0} - _))),Occurrence.First)},Int64.Type},
{"least difference", (t)=> List.Min(List.Transform(t[actual], each Number.Abs(t[target]{0} - _))),Int64.Type
}})
in
#"Grouped Rows"

I would like to thank Kristian Rados, who answered my question in the comments of his article, Merging with date range using Power Query. With my gratitude, and by courtesy of the author, I am quoting the answer in full:
The reason your formula produces an error is the second argument of the Table.SelectRows function. In it, you need to filter the table with a boolean (true/false) expression. In your case, the part of the code with the Number.Abs function returns a number instead of true/false (e.g. L[target] - R[actual] = 5 - 1 = 4). Filtering the table this way could be possible, but it would require multiple nested environments, which would result in a very complicated formula and slow performance.
I would suggest trying a different approach. Using your example from Stack Overflow, I reproduced the problem. Below is the complete M code I came up with, followed by an explanation:
let
    Source = Tab1,
    #"Merged Queries" = Table.NestedJoin(Source, {"category"}, Tab2, {"category"}, "Tab2", JoinKind.LeftOuter),
    #"Expanded Tab2" = Table.ExpandTableColumn(#"Merged Queries", "Tab2", {"actual"}, {"actual"}),
    #"Inserted Subtraction" = Table.AddColumn(#"Expanded Tab2", "Least difference", each Number.Abs([target] - [actual]), Int64.Type),
    #"Grouped Rows" = Table.Group(#"Inserted Subtraction", {"category"}, {{"All", each Table.First(Table.Sort(_, {{"Least difference", Order.Ascending}}))}}),
    #"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"target", "actual", "Least difference"}, {"target", "actual", "Least difference"})
in
    #"Expanded All"
First, we merged the queries by using the category column. After expanding the table, we subtracted two columns to get the absolute difference between target and actual. Finally, we group by category and sort the table by the Least difference column in ascending order (Table.Sort function inside the grouped rows). After this, we take the first row of the nested table (Table.First function), and finally expand the record column.
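For comparison, the original Table.SelectRows idea can be made to work once the predicate is a real comparison: compute the smallest absolute difference for the current row first, then keep the Tab2 rows whose difference equals it. This is only a sketch and not part of the quoted answer. The #table literals are made-up sample data (not the question's tables), and the names Tab2Buffered, SameCategory, MinDiff and Closest are hypothetical.

```powerquery
let
    // Hypothetical sample data, not the tables from the question
    Tab1 = #table(type table [category = text, target = Int64.Type],
        {{"A", 5}, {"B", 5}}),
    Tab2 = #table(type table [category = text, actual = Int64.Type],
        {{"A", 1}, {"A", 2}, {"B", 3}, {"B", 4}, {"B", 6}}),
    // Buffer Tab2 because it is scanned once for every row of Tab1
    Tab2Buffered = Table.Buffer(Tab2),
    Closest = Table.AddColumn(
        Tab1,
        "Closest",
        (L) =>
            let
                // Rows of Tab2 in the same category as the current Tab1 row
                SameCategory = Table.SelectRows(Tab2Buffered, (R) => R[category] = L[category]),
                // Smallest absolute difference within that category
                MinDiff = List.Min(
                    List.Transform(SameCategory[actual], (a) => Number.Abs(L[target] - a)))
            in
                // The predicate now returns true/false, so no conversion error
                Table.SelectRows(SameCategory, (R) => Number.Abs(L[target] - R[actual]) = MinDiff)
    )
in
    Closest
```

With this sample data, category B has two equally close values (4 and 6), so its nested table holds both rows, whereas the merge-and-sort approach keeps only the first one. Since Tab2 is filtered once per Tab1 row, this is likely to be slower on large tables, as the quoted answer warns.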

Related

Power BI - power query editor - combine several rows in a table

Given this sample raw table (there are some more columns..):
agg_group          count_%
CHARGED_OFF        1.2
DELINQUENT         1.8
ELIGIBLE           90
MERCHANT_DELINQ    7
NOT_VERIFIED       0
How can I transform this table, to create 2 new columns, using either DAX in Power BI Desktop, or in Power Query?
Desired result:
agg_group     outstanding_principal
ELIGIBLE      90
DELINQUENT    10

Here is a Power Query sample that does this. Paste the following into a blank query to see the steps, based on your sample data:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("JcsxDoAgDIXhu3QmRl3UUaVAEyyRoAsh3P8WNnX93v9qhTPs2aPtyTkwMA0zNFPBYiS+H+SiuCqKeToiCm2jyoVZ/lz638uwqHMq/cVMjtAKStw+", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [agg_group = _t, #"count_%" = _t]),
#"Cleaned Text" = Table.TransformColumns(Source,{{"count_%", each Number.FromText(_, "en-US"), type number}}),
#"Added Conditional Column" = Table.AddColumn(#"Cleaned Text", "agg_group_", each if [#"agg_group"] = "ELIGIBLE" then "ELIGIBLE" else "DELINQUENT"),
#"Grouped Rows" = Table.Group(#"Added Conditional Column", {"agg_group_"}, {{"outstanding_principal", each List.Sum([#"count_%"]), type number}}),
#"Renamed Columns" = Table.RenameColumns(#"Grouped Rows",{{"agg_group_", "agg_group"}})
in
#"Renamed Columns"

Remove Duplicated strings from a cell in Power Query for Power BI

I'm trying to remove duplicates from two cells, and I seem to be nowhere close to a solution, so I would like to ask for your assistance.
I have a table with a few tickets (TKS00XX) and their corresponding causing records (CSR00XX). In addition, I have the investigation records (INV00XX) linked to the tickets. The investigation records also have causing records, which are sometimes the same causing records that are linked to the tickets.
The example above shows the table as it is at the source. In Power BI, I can group the records by ticket and investigation record, which gives me the following.
I'd like to merge these two causing record columns and remove the duplicated items (for example, the item highlighted in red, CSR0032). This is the result I'm trying to get.
Do you know how I can achieve this result in the Power Query Editor for Power BI?
I have tried unpivoting the causing record columns and then grouping them by ticket number and investigation record number, but that removed some entries that shouldn't have been removed.

Duplicate your table
From the 1st table remove the 2nd CAUSING RECORD column
From the 2nd table remove the 1st CAUSING RECORD column and rename the remaining one to CAUSING RECORD
Append both tables as new
Remove duplicate rows
Group by TICKET and INVESTIGATION RECORD and aggregate CAUSING RECORD AS SUM of CAUSING RECORD.
Since the last row will lead to an error, edit the formula in the formula bar and change List.Sum([CAUSING RECORD]) to Text.Combine([CAUSING RECORD], ", ")
The M-Code for the 3 involved tables is
Table1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCvEONjAwNFLSUXIODjIwMAKxPP3CgCyYkLGRUqwOhkJjLAqNsSk0NMRQaGKCUGhqCtduClMIFzKzxKYQqB1doamFUmwsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TICKET = _t, #"CAUSING RECORD" = _t, #"INVESTIGATION RECORD" = _t, #"CAUSING RECORD.1" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"TICKET", type text}, {"CAUSING RECORD", type text}, {"INVESTIGATION RECORD", type text}, {"CAUSING RECORD.1", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"CAUSING RECORD.1"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"TICKET", "INVESTIGATION RECORD", "CAUSING RECORD"})
in
#"Reordered Columns"
Table2
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCvEONjAwNFLSUXIODjIwMAKxPP3CgCyYkLGRUqwOhkJjLAqNMRX6+3t6Yig0MUEoNDWFazeFKYQLmVliUwjUjq7Q1EIpNhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TICKET = _t, #"CAUSING RECORD" = _t, #"INVESTIGATION RECORD" = _t, #"CAUSING RECORD.1" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"TICKET", type text}, {"CAUSING RECORD", type text}, {"INVESTIGATION RECORD", type text}, {"CAUSING RECORD.1", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"CAUSING RECORD"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"CAUSING RECORD.1", "CAUSING RECORD"}})
in
#"Renamed Columns"
Table3
let
    Source = Table.Combine({Table1, Table2}),
    #"Removed Duplicates" = Table.Distinct(Source),
    #"Sorted Rows" = Table.Sort(
        #"Removed Duplicates", {{"TICKET", Order.Ascending}}),
    #"Grouped Rows" = Table.Group(
        #"Sorted Rows",
        {"TICKET", "INVESTIGATION RECORD"},
        {
            {"CAUSING RECORD", each Text.Combine([CAUSING RECORD], ", "), type nullable text}
        }
    )
in
    #"Grouped Rows"

Once you join your two tables, you can Group by TICKET and INVESTIGATION RECORD.
Within the Table.Group function add a custom aggregation of the CAUSING RECORD columns, remove the duplicates, optionally sort, and combine them as a text string.
In the code below, it is the Table.Group function that is important; the rest is just setup and housekeeping.
Tickets
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCvEONjAwNFLSUXIODjIwMDJSitXBEDXGKmpoiBA1NYWrNcUmamKiFBsLAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TICKET = _t, #"CAUSING RECORD" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"TICKET", type text}, {"CAUSING RECORD", type text}})
in
#"Changed Type"
Investigation
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8vQLMzAwUtJRcg4OMjAwNlKK1cEQNMYiaGKCEDSFCZpZYhE0tVCKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"INVESTIGATION RECORD" = _t, #"CAUSING RECORD" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"INVESTIGATION RECORD", type text}, {"CAUSING RECORD", type text}})
in
#"Changed Type"
Combined
let
    //Combine the two tables (the queries defined above are named Tickets and Investigation)
    #"Append Columns" = Table.FromColumns(
        Table.ToColumns(Tickets) & Table.ToColumns(Investigation),
        type table[TICKET=text, CAUSING RECORD=text, INVESTIGATION RECORD=text, CAUSING RECORD2=text]),
    //Group by TICKET, INVESTIGATION RECORD
    #"Grouped Rows" = Table.Group(#"Append Columns", {"TICKET", "INVESTIGATION RECORD"}, {
        {"CAUSING RECORDS", each
            Text.Combine(
                List.Sort(
                    List.Distinct(
                        List.Combine({[CAUSING RECORD],[CAUSING RECORD2]}))),
                ", "), type text}})
in
    #"Grouped Rows"

Power Query - Add custom column based on values found in another lookup Query

I would like to add a column to the Main Query and fill it with values from a separate Lookup Query, based on finding the matching unique value embedded in the text of Column1 in the Main Query. If no match is found, it should just return blank (null). Thank you.
Main Query
Column1
...111.
..ABC..
..34C..
...xyz.
yyy.....
Lookup Query
Uniques
34C
ABC
111
Desired Output in Main Query
Column1     New Column
...111..    111
..ABC...    ABC
..34C...    34C
....xyz.
yyy.....

Here's another method using List.Accumulate to loop through the lookup list. Not sure which of the methods will be more efficient.
Note: If there might be more than one lookup result for a given item in column 1, a small change in the List.Accumulate function can accommodate showing them all with a separator
let
    //Read in Main Query
    Source = Excel.CurrentWorkbook(){[Name="Main"]}[Content],
    Main = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    //Read in Lookup Query as a list of text items
    Lookup = List.Buffer(
        List.Transform(Excel.CurrentWorkbook(){[Name="Lookup"]}[Content][Uniques],
            each Text.From(_))),
    //add Column
    #"Add Column" =
        Table.AddColumn(
            Main, "New Column", each
                List.Accumulate(Lookup,
                    null,
                    (state, current)=>
                        if Text.Contains([Column1], current)
                        then current else state),
            type nullable text)
in
    #"Add Column"
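The "small change" mentioned in the note could look something like the sketch below: instead of overwriting the state, each match is appended to it with a separator. The inline sample data and the ", " separator are assumptions for illustration; in the real workbook, Main and Lookup would come from Excel.CurrentWorkbook as in the code above.

```powerquery
let
    // Hypothetical sample data standing in for the Main and Lookup queries
    Main = #table(type table [Column1 = text],
        {{"...111."}, {"..ABC34C.."}, {"....xyz."}}),
    Lookup = List.Buffer({"34C", "ABC", "111"}),
    AddColumn = Table.AddColumn(
        Main,
        "New Column",
        (row) =>
            List.Accumulate(
                Lookup,
                null,
                (state, current) =>
                    if Text.Contains(row[Column1], current) then
                        // First match starts the text; later matches are appended
                        (if state = null then current else state & ", " & current)
                    else
                        state
            ),
        type nullable text
    )
in
    AddColumn
```

With this sample data, the second row matches both 34C and ABC and gets "34C, ABC" (in lookup-list order), while the last row matches nothing and stays null.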

There's no built-in join that does this, but you can do a cross join followed by a filter. To cross join in Power Query, add a custom column to one table containing the full value of the other table, then run Table.ExpandTableColumn.
For example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjE0NFSK1YlWKk5Jc3RyNrY0BvMSs4uzgDLZxYlZKWnZxibOSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Lookup),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Uniques"}, {"Uniques"}),
#"Selected Rows" = Table.SelectRows(#"Expanded Custom", each Text.Contains([Column1], [Uniques]))
in
#"Selected Rows"

Dax formula to extract the first 3 words from a cell?

For example, I have a column titled "root cause" with a sentence describing the cause.
Is it possible to extract the first 3 words?
(Or is there maybe a better way to analyze sentences?) Thanks

You can use an M Language transformation.
Here is an example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("LcxBCsMwDATAryw+9xPJC3rozfigxKIyRBbYMqG/j0kDu7CwMDGGj5SOGULn6lx3DukVw1LNhRusMs7iArU2l7Xc738txwGhDrfxFQdtNvxPPMKb2mTzUP0hk2KbFdKQ0gU=",
BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Sentence = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Sentence", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Text.Combine(List.FirstN(Text.Split([Sentence]," "), 3), " " ))
in
#"Added Custom"

I am trying to get the characters after the last delimiter / dax only solution please

ResName = IFERROR(Mid('Hellodetails'[resourceId],Find("/",'Hellodetails'[resourceId],1)),BLANK())
Column =
/subscriptions/3e5b4f75-12834/rgs/-test1-rg/providers/microsoft.web/sites/re2-test1
/subscriptions/3e5b4f75-12834/rgs/-test1-rg/providers/microsoft.web/sites/groups/re2-test1-wa-asset
/subscriptions/3e5b4f75-12834/rgs/-test1-rg/providers/microsoft.web/sites/r/re2-test1-wa-product
Output
re2-test1
re2-test1-wa-asset
re2-test1-wa-product
This is what I am trying to get, but I am not getting this outcome.

To add a column with your desired output, using Power Query M-Code (not DAX):
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rc09DoMwDEDhuzDXWPypvQtiIMFEHlpHtlOuT6QuZWf/9N48N2glWFTOzvIxHGgK4/6coOtfw4iaDMHJvANNmFW+vJEavjmqmOzeHhTQuBJU6n+0WR53dpNKyX95OFZYzchv/uh1UfFWYp0sJw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
lastDelimited = Table.AddColumn(#"Changed Type","Last Part",
each List.Last(Text.Split([Column1],"/")), Text.Type)
in
lastDelimited
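As a possible alternative to splitting into a list, Text.AfterDelimiter can count delimiters from the end of the text. The sketch below assumes the same Column1 layout; the two sample rows are copied from the question, and the step name lastPart is made up.

```powerquery
let
    // Sample rows taken from the question's resource IDs
    Source = #table(type table [Column1 = text], {
        {"/subscriptions/3e5b4f75-12834/rgs/-test1-rg/providers/microsoft.web/sites/re2-test1"},
        {"/subscriptions/3e5b4f75-12834/rgs/-test1-rg/providers/microsoft.web/sites/groups/re2-test1-wa-asset"}
    }),
    // {0, RelativePosition.FromEnd} selects the last "/" in the text
    lastPart = Table.AddColumn(
        Source,
        "Last Part",
        each Text.AfterDelimiter([Column1], "/", {0, RelativePosition.FromEnd}),
        type text
    )
in
    lastPart
```

For these two rows the new column should contain re2-test1 and re2-test1-wa-asset.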
Using DAX
Column = TRIM(RIGHT(SUBSTITUTE([Column1],"/",REPT(" ",99)),99))
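To see why that one-liner works, here are the same three steps spelled out one at a time. They are written in M purely as an illustration of the logic (the DAX formula is what you would actually use in a calculated column), and the shortened sample path is an assumption.

```powerquery
let
    // Shortened sample path, for illustration only
    path = "/sites/groups/re2-test1-wa-asset",
    // SUBSTITUTE + REPT: turn every "/" into a run of 99 spaces
    padded = Text.Replace(path, "/", Text.Repeat(" ", 99)),
    // RIGHT(..., 99): the last 99 characters hold the final segment plus leading spaces
    tail = Text.End(padded, 99),
    // TRIM: drop the padding, leaving only the last segment
    result = Text.Trim(tail)
in
    result
```

Here result is "re2-test1-wa-asset". The trick relies on the last segment being no longer than 99 characters; for longer segments the padding width would have to be increased.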
