Power BI - Matching closest 3D points from two tables - powerbi

I have two tables (Table 1 and Table 2) both containing thousands of three dimensional point coordinates (X, Y, Z), Table 2 also has an attribute column.
Table 1
X
Y
Z
6007
44268
1053
6020
44269
1051
Table 2
X
Y
Z
Attribute
6011
44310
1031
A
6049
44271
1112
B
I need to populate a calculated column in Table 1 with an attribute from Table 2 based on the minimum distance between points in 3D space. Basically, match the points in Table 1 to the closest point in Table 2 and then fetch the attribute from Table 2.
So far I have tried rounding X, Y and Z in both tables, then concatenating the rounded values into a separate column in each table. I then use DAX:
CALCULATE(FIRSTNONBLANK(Table 2 [Attribute],1),FILTER(ALL(Table2), Table 2[XYZ]=Table 1 [XYZ])).
This has given me reasonable success depending on the degree of rounding applied to the coordinates.
Is there a better way to achieve this in Power Bi?

This is similar to this post, except with a simpler distance function. See also this post.
Assuming you want the standard Euclidean Distance:
ClosestPointAttribute =
MINX (
TOPN (
1,
Table2,
( Table2[X] - Table1[X] ) ^ 2 +
( Table2[Y] - Table1[Y] ) ^ 2 +
( Table2[Z] - Table1[Z] ) ^ 2,
ASC
),
Table2[Attribute]
)
Note: I've omitted the SQRT from the formula because we don't need the actual distance, just the ordering (and SQRT preserves order since it's a strictly increasing function). You can include it if you prefer.

A function in M Code:
(p1 as list, q1 as list)=>
let
f = List.Generate(
()=> [x = Number.Power(p1{0}-q1{0},2), idx=0],
each [idx]<List.Count(p1),
each [x = Number.Power(p1{[idx]+1}-q1{[idx]+1},2), idx=[idx]+1],
each [x]
),
r = Number.Sqrt(List.Sum(f))
in
r
Each list is a set of coordinates and the function will return the distance between p and q
The above function (which I named fnDistance) can be incorporated into power query code as in this example:
let
//Read in both tables and set data types
Source2 =Excel.CurrentWorkbook(){[Name="Table_2"]}[Content],
table2 = Table.TransformColumnTypes(Source2,{{"X", Int64.Type}, {"Y", Int64.Type}, {"Z", Int64.Type},{"Attribute", Text.Type}}),
Source = Excel.CurrentWorkbook(){[Name="Table_1"]}[Content],
table1 = Table.TransformColumnTypes(Source,{{"X", Int64.Type}, {"Y", Int64.Type}, {"Z", Int64.Type}}),
//calculate distances from Table 1 coordinates to each of the Table 2 coordinates and store in a List
custom = Table.AddColumn(table1,"Distances", each
let
t2 = Table.ToRecords(table2),
X=[X],
Y=[Y],
Z=[Z],
distances = List.Generate(()=>
[d=fnDistance({X,Y,Z},{t2{0}[X],t2{0}[Y],t2{0}[Z]}),a=t2{0}[Attribute], idx=0],
each [idx] < List.Count(t2),
each [d=fnDistance({X,Y,Z},{t2{[idx]+1}[X],t2{[idx]+1}[Y],t2{[idx]+1}[Z]}),a=t2{[idx]+1}[Attribute], idx=[idx]+1],
each {[d],[a]}),
//determine set of coordinates with the minimum distance and return associate Attribute
minDistance = List.Min(List.Alternate(List.Combine(distances),1,1,1)),
attribute = List.Range(List.Combine(distances), List.PositionOf(List.Combine(distances),minDistance)+1,1){0}
in
attribute, Text.Type)
in
custom

Related

Change column values based on its position

I'm trying to adjust some columns with negative values in my table, I want to all negative values be changed to 0,
The only problem is that the columns keep changing their names, so I would like to be able to make such adjustment based on column position,
For example, the columns are located in 3 and 4 position,
I have created a conditional column to adjust the negatives volumes,
#"New Column" = Table.AddColumn(#Previous Step", "New Column", each if OldColumnName < 0 then 0 else NewColumn),
Is there a way to make this conditional column based on the OldColumn position, and not by its name?
add column, custom column with formula
= if Record.Field(_,Table.ColumnNames(Source){2})<0 then 0 else Record.Field(_,Table.ColumnNames(Source){2})
or
= if Record.Field(_,Table.ColumnNames(Source){2})<0 then 0 else [some other column])
where {2} is the position in column names
Sample to transform in place to remove negatives
Stepname = Table.TransformColumns(#"PriorStepNameHere",{{Table.ColumnNames(#"PriorStepNameHere"){2}, each if _<0 then 0 else _, Int64.Type}})
for multiple column transformations
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
ColumnsToTransform = {Table.ColumnNames(Source){2},Table.ColumnNames(Source){3}},
#"MultipleTransform" = Table.TransformColumns(Source, List.Transform(ColumnsToTransform,(x)=>{x, each if _<0 then 0 else _, type number}))
in #"MultipleTransform"

Combine rows with similar information into 1

I have a table that looks like this example:
Order Bagged Shipped
----------------------------------
1 Y
2 Y
1 Y
3 Y
I want to combine like order numbers into 1 row like below:
Order Bagged Shipped
----------------------------------
1 Y Y
2 Y
3 Y
How can I do this in PowerBi desktop?
Assuming your data really is as simple as your example (values are either null or 'Y' and no conflicts), I suggest something like:
SELECT Order, MAX(Bagged), MAX(Shipped)
FROM mytable
GROUP BY Order
The GROUP BY Order indicates you want one row per order, the MAX for the other columns ensures you get the 'Y' (if it exists for that Order) or null (if 'Y' doesn't exist for that Order).
In BI, select Transform, then add the GroupBy function to your existing code:
#"Grouped Rows" = Table.Group(#"Previous Step", {"Order"}, {
{"Bagged", each if List.Contains([Bagged], "Y") then "Y" else null},
{"Shipped", each if List.Contains([Shipped], "Y") then "Y" else null}
})
in
#"Grouped Rows"

PowerBI table: how to add number to column name

I have a table with the following column names:
A
B
C
D
E
F
G
I need to rename my columns so that from a certain column onwards they are numbered sequentially:
A
B
C
D (1)
E (2)
F (3)
G (4)
I know how to do it manually, but since I have 65 of such columns I was hoping to use something like TransformColumnNames to do it programmatically.
Many thanks!
Here's one way: It starts with a table named Table 1 as the source.
let
Source = Table1,
//Replace the "D" below with the name of your column that you want to start numbering at
#"Get Column Number to Start Adding Numbers At" = List.PositionOf(Table.ColumnNames(Source),"D"),
#"Setup Column Numbers" = List.Transform({1..List.Count(Table.ColumnNames(Source))}, each if _-#"Get Column Number to Start Adding Numbers At" > 0 then " (" & Text.From(_-#"Get Column Number to Start Adding Numbers At") & ")" else ""),
#"Create New Column Names" = List.Zip({Table.ColumnNames(Source), #"Setup Column Numbers"}),
#"Converted to Table" = Table.FromList(#"Create New Column Names", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Extracted Values" = Table.TransformColumns(#"Converted to Table", {"Column1", each Text.Combine(List.Transform(_, Text.From)), type text}),
Result = Table.RenameColumns(Source, List.Zip({Table.ColumnNames(Source),#"Extracted Values"[Column1]}))
in
Result
Maybe if you pivot the columns that need to have the number, then add an index and create a new concatenated column with number included. remove the other columns and unpivot again?

How to calculate performance curve for each row of data

I want to plot a performance curve for each row of data I have.
A simple version of what I want to do is plot the function with the equation as Y= m*X+b, where I have a table with m and b values and I want Y values for X = 1 to 10.
How is this calculated?
A Y = mX + b example can be seen in the following plot:
The following works:
WITH NUMBERS AS
(
SELECT N FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10))N(N)
),
Examples AS
(
SELECT m,b FROM (VALUES (1,2),(2,2))N(m,b)
)
SELECT
'Y = ' + CAST(Examples.m as varchar(10)) + 'X + ' + CAST(Examples.b as varchar(10)) AS Formula
,Numbers.N AS X
, Numbers.N * Examples.m + Examples.b
FROM Examples
CROSS JOIN NUMBERS

pyspark mathematical computation in a dataframe

I have extracted a Dataframe from a larger Dataframe, and now I need to do simple computation like addition and division in dataframe.
sample dataframe is like.
item counts
z 23156
x 15462
What I need to do is to divide x by sum of x and z
for example
value= x/x+z
You must compute the sum of x and first then divide x by sum(x) + sum(y)
for example:
Table 1(original table):
x z
1 2
3 4
Table 2 (Aggregated table):
table2 = sqlCtx.sql("select sum(x) + sum(z) as sum_xz")
table2.registerTempTable("table2")
sum_xz
10
Then join both table and divide
table3 = sqlCtx.sql("select a.x / bs.um_xz from table1 a join table2 b")
For your reference.