Compare similar values within a table in power BI - powerbi

I have fields within a table with values originally filled in by hand. Even if the same values are entered/meant, there can be "slight" deviations. Now I want to compare whether values in 2 columns within a row are quite similar.
If there is some similarity, I would like to have a True in a new column, otherwise a False. The use case is similar to the fuzzy join when two tables are merged, but the fields are within a table and do not work as a primary key. I have created a table below of what this should look like:
No
​A header
Another header
Calculated Column
1
Zürich, 1.OG Telefonzentrale
Telefonzentrale
True
2
Mittelterrasse 1.OG Raum T190
Mittelterrasse T1
True
1
TM-Raum 225
Bern, Bollwerk 10 / 2.OG
False
2
G7803
91G7803
True
It would be great if someone could help me in this topic.

I dont know if is there a way to do this, but we can try to verify how many word from column1 appear in column2:
CheckIfTrue__ =
VAR SplitByCharacter = " "
VAR Org = SELECTEDVALUE(Sheet3[​A header])
VAR CurrentF = SELECTEDVALUE(Sheet3[Another header] )
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", Org),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CurrentF),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
RETURN
COUNTROWS(INTERSECT(Table0, Table1))+0

Related

I want to setup a RLS using this table in Power BI Embedded where I am only fetching Scheme Code as userprincipalname()

I am looking to setup RLS in Power BI Embedded using below table where SCHEMECODE should be my userprincipalname() and through the webapp user can have multiple SCHEMECODE's which will be comma separated. This is an example of what I will be getting from the webapp "16856,39760"
What you may try is to split your string into table of separate values;
Then in RLS 'YourTable'[SCHEMACODE] in TableOf (code below);
EVALUATE
VAR _userprincipalname = "16856,39760"
VAR SplitByCharacter = ","
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", _userprincipalname ),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
RETURN
Table0

DAX New & Lost Customers Table

I am trying to create a List for New Customers Added during the year and List of Lost Customers, I have written a DAX which works fine in summary count but doesn't work in table matrix.
NTB =
VAR currentCustomers =
VALUES ( Deposits[CIF ID] )
VAR currentDate =
MAX ( Deposits[Source.Date] )
VAR pastCustomers =
CALCULATETABLE (
VALUES ( Deposits[CIF ID] ),
ALL (
Deposits[Source.Date].[Month],
Deposits[Source.Date].[MonthNo],
Deposits[Source.Date].[Year]
),
Deposits[Source.Date] < currentDate
)
VAR newCustomers =
EXCEPT ( currentCustomers, pastCustomers )
RETURN
COUNTROWS ( newCustomers )
Total Row is correct, even if I remove one function the table remains same..
Appreciate your help
try this :
Modelling --> Add Table
Table =
VAR _max =
MAX ( Deposits[Source.Date] )
RETURN
ADDCOLUMNS (
SUMMARIZE ( Deposits, Deposits[CIF ID] ),
"NTB",
CALCULATE (
COUNT ( Deposits[CIF ID] ),
FILTER (
ALLEXCEPT ( Deposits, Deposits[CIF ID] ),
Deposits[Source.Date] >= _max
)
),
"Lost Customers",
CALCULATE (
COUNT ( Deposits[CIF ID] ),
FILTER (
ALLEXCEPT ( Deposits, Deposits[CIF ID] ),
Deposits[Source.Date] < _max
)
)
)

Power BI Dax - Trying to break circular dependency after one attempt

I am trying to figure out ancestor bloodline in data I have. I feel like there is something I am missing to make this work. This data wouldn't change so not sure if I am missing something I can write in the power query editor when loading / refreshing the data.
While technically it is circular in fields it never going to return the same row it is currently in and the first generation is hard number making a starting point. I only want to reference the mother and father to calculate the child's bloodline. The first generation is a basic IF() statement. Below is as far as I can get before hitting the circular dependency error. I have tried a few things to break it thinking its going to loop.
Logic is:
Each blood is 100% for 1st generations based on their birthplace then it is ((mother blood + father blood) / 2) for each generation after that. I found I can use PATHITEM() to isolate the type of blood but errors with a circular dependency. (This is where I can't figure out how to reference the mother / father to do the calculation.) If I take this part out I get the image below working for 1st generation and correct mother / father for second generation.
Asisa Blood =
VAR current_id = 'Sheet1'[ID]
VAR current_gen = 'Sheet1'[Generation]
VAR current_blood = 'Sheet1'[Birthplace]
VAR current_mother_blood =
PATHITEM(
CALCULATE(
DISTINCT('Sheet1'[Mother's Blood Mix]),
FILTER(
ALLNOBLANKROW('Sheet1'[ID]),
'Sheet1'[ID] = current_id
),
REMOVEFILTERS('Sheet1')
),1,INTEGER)
VAR current_father_blood =
PATHITEM(
CALCULATE(
DISTINCT('Sheet1'[Father's Blood Mix]),
FILTER(
ALLNOBLANKROW('Sheet1'[ID]),
'Sheet1'[ID] = current_id
),
REMOVEFILTERS('Sheet1')
),1,INTEGER)
VAR gen1_value = 100
RETURN
IF(AND(LOWER(current_gen) = "1",LOWER(current_blood) = "asisa"),
gen1_value,
((current_mother_blood + current_father_blood)/2)
)
Blood Mix concatenates the four blood types into one field for easy look up in next step.
Blood mix =
VAR current__id = 'Sheet1'[ID]
VAR current_blood_a = 'Sheet1'[Asisa Blood]
VAR current_blood_b = 'Sheet1'[Africa Blood]
VAR current_blood_c = 'Sheet1'[Europe Blood]
VAR current_blood_d = 'Sheet1'[North America Blood]
RETURN
current_blood_a & "|" & current_blood_b & "|" & current_blood_c & "|" & current_blood_d
Mother and Father are lookups on blood mix with mother or father ids
Mother's Blood Mix =
VAR current_id = 'Sheet1'[ID]
VAR current_gen = 'Sheet1'[Generation]
VAR gen_value = 'Sheet1'[Blood mix]
VAR current_parent_id =
IF(LOWER(current_gen) = "1",current_id,'Sheet1'[Mother ID])
VAR result =
CALCULATE(
DISTINCT('Sheet1'[Blood mix]),
FILTER(
ALLNOBLANKROW('Sheet1'[ID]),
'Sheet1'[ID] = current_parent_id
),
REMOVEFILTERS('Sheet1')
)
RETURN
result
You can try to do this way (rather high resource consuming):
NEW COLUMN:
heritage_Father = PATH(Blood[ID],(Blood[FatherID]))
heritage_Mother = PATH(Blood[ID],(Blood[MotherID]))
Mancestor = LOOKUPVALUE(Blood[heritage_Mother],Blood[ID],Blood[FatherID]) &"|"& Blood[heritage_Mother]
Fancestor = LOOKUPVALUE(Blood[heritage_Father],Blood[ID],Blood[MotherID])&"|"&Blood[heritage_Father]
NEW MEASURE:
ASIA_check =
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", "0"&SELECTEDVALUE(Blood[Fancestor]) ),
VAR TokenCount =
PATHLENGTH ( [Text] )+1
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( [Text], [Value] )
),
"Word",VALUE([Word]))
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", "0"&SELECTEDVALUE(Blood[Mancestor]) ),
VAR TokenCount =
PATHLENGTH ( [Text] )+1
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( [Text], [Value] )
),
"Word",VALUE([Word]))
RETURN
DIVIDe(
if(SELECTEDVALUE(Blood[Generation]) <> 1,
calculate(sum(Blood[AsiaBlood]), FILTER(ALL(Blood), Blood[ID] in Table0 || Blood[ID] in Table1)),0
),
(POWER(2, SELECTEDVALUE(Blood[Generation])-1)))
AFRICA_check =
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", "0"&SELECTEDVALUE(Blood[Fancestor]) ),
VAR TokenCount =
PATHLENGTH ( [Text] )+1
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( [Text], [Value] )
),
"Word",VALUE([Word]))
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", "0"&SELECTEDVALUE(Blood[Mancestor]) ),
VAR TokenCount =
PATHLENGTH ( [Text] )+1
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( [Text], [Value] )
),
"Word",VALUE([Word]))
RETURN
DIVIDe(
if(SELECTEDVALUE(Blood[Generation]) <> 1,
calculate(sum(Blood[AfricaBlood]), FILTER(ALL(Blood), Blood[ID] in Table0 || Blood[ID] in Table1)),0
),
(POWER(2, SELECTEDVALUE(Blood[Generation])-1)))

Check if any part of one string matches any part of other string

I'm looking for a method to check if any part of one string is in another string. Each string is placed in a separate column and is containing a set of random codes. Please see examples of these below.
The method should check if any of the codes in the Original Fault column is present in the Current Fault Column string. As seen in the example each code is created with a DI and 4 digits following. Each raw can contain a random number of those codes. Right now I'm able to check if the whole string from the Original Fault columns is present in the Current fault columns, but I need a solution that would be able to check each individual code from the Original Fault column string in the Current Fault column string. if there is at least one match it should return a yes in a new column
Here is an example how we can get this by using DAX:
CheckIfTrue = VAR SplitByCharacter = " " VAR Org = SELECTEDVALUE(code[Orginal Faults]) VAR CurrentF = SELECTEDVALUE(code[Current Faults] ) VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", Org),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CurrentF),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word]) RETURN COUNTROWS(INTERSECT(Table0, Table1))+0 // Org
Additional example:
ShowError = VAR SplitByCharacter = " " VAR Org = SELECTEDVALUE(code[Orginal Faults]) VAR CurrentF = SELECTEDVALUE(code[Current Faults] ) VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", Org),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
VAR Table1 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CurrentF),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, MAX(TokenCount,1) )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word]) RETURN CONCATENATEX(INTERSECT(Table0, Table1),[Word],";") // Org

POWER BI : I want DAX formula to find out the 1st highest and 2nd highest value for a group in my Main Table

Below are my two tables , I want the 1st and 2nd highest amount from my main table through DAX, So that I can get the Desired table with two different column , one having first highest for Sydney and Brisbane and another having 2nd highest for the same cities
Table 2 =
VAR _1 =
ADDCOLUMNS (
_t,
"rank", RANKX ( ALLEXCEPT ( _t, _t[City] ), CALCULATE ( MAX ( _t[val] ) ),, DESC )
)
VAR _2 =
SELECTCOLUMNS (
FILTER ( _1, [rank] = 1 ),
"City", [City] & "",
"firstHighest", [val] + 0
)
VAR _3 =
SELECTCOLUMNS (
FILTER ( _1, [rank] = 2 ),
"City", [City] & "",
"secondHighest", [val] + 0
)
VAR _4 =
NATURALLEFTOUTERJOIN ( _2, _3 )
RETURN
_4