I have 2 tables in SSAS / Power BI:
Table1:
| ValueName| ValueKey |
|:---- |:------: |
| abc | 1,2,3 |
Table2:
| ID | ValueKey | Value |
|:---- |:------: |:------: |
| ID1 | 1 | 87,8 |
| ID2 | 85 | 14 |
| ID3 | 90 | 95,8 |
| ID4 | 3 | 13,4 |
I need to retrieve ID and Value (into a temporary table that I will later run calculations over), keeping only the rows whose ValueKey is 1, 2 or 3.
I need to do this in DAX. For this situation SQL has the STRING_SPLIT function. Is there a way to achieve the same thing in DAX? The ValueKey column in Table1 is comma-separated text, and the ValueKey column in Table2 is INT.
Thanks in advance
As @Jeroen Mostert suggests, you can do this by (ab)using the PATHCONTAINS function like this:
FilteredTable2 =
VAR CurrKey = SELECTEDVALUE ( Table1[ValueKey] )
VAR PathFromKey = SUBSTITUTE ( CurrKey, ",", "|" ) /* Paths use | as separator. */
RETURN
FILTER ( Table2, PATHCONTAINS ( PathFromKey, Table2[ValueKey] ) )
However, this is not best practice for relating tables. In general, you don't want multiple keys in a single field.
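To make the logic of the expression above concrete, here is a minimal plain-Python sketch (not DAX) of the same filtering, using the sample rows from the question. The names `table2` and `filter_by_key_list` are made up for this illustration.

```python
# Plain-Python illustration of the FILTER + PATHCONTAINS logic (not DAX).
# Sample rows from Table2 in the question; Value is kept as text because
# the source data uses a decimal comma.
table2 = [
    {"ID": "ID1", "ValueKey": 1, "Value": "87,8"},
    {"ID": "ID2", "ValueKey": 85, "Value": "14"},
    {"ID": "ID3", "ValueKey": 90, "Value": "95,8"},
    {"ID": "ID4", "ValueKey": 3, "Value": "13,4"},
]


def filter_by_key_list(rows, key_list):
    """Keep rows whose integer ValueKey appears in a comma-separated string."""
    # Whole items are compared, so the key "1" does not match the item "13".
    wanted = set(key_list.split(","))
    return [row for row in rows if str(row["ValueKey"]) in wanted]


filtered = filter_by_key_list(table2, "1,2,3")
print([row["ID"] for row in filtered])  # ['ID1', 'ID4']
```

Splitting the list into a set of whole items is what avoids false positives that a plain substring search would produce.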
In Power BI I would like to create a DAX measure that will retrieve the latest string value for specific IDs. Example source table:
Name_ID | Name | DateTime | Value
----------------------------------------------------------
1 | Child_1 | 18.8.2021 12:33:24 | F
32 | Parent_32 | 18.8.2021 11:41:09 | F
13 | Child_1 | 18.8.2021 11:30:58 | E
48 | Parent_48 | 18.8.2021 09:13:11 | F
2 | Child_2 | 17.8.2021 00:09:42 | S
1 | Child_1 | 17.8.2021 23:03:34 | F
48 | Parent_48 | 17.8.2021 21:46:27 | S
6 | Parent_6 | 16.8.2021 17:31:26 | S
.
.
.
The specific parent IDs here are, for example, 6, 32 and 48, so the result should look something like this:
Name_ID | Name | DateTime (of last execution) | Value
------------------------------------------------------------------------------
32 | Parent_32 | 18.8.2021 11:41:09 | F
48 | Parent_48 | 18.8.2021 09:13:11 | F
6 | Parent_6 | 16.8.2021 17:31:26 | S
The result I'm trying to get contains only the latest appearance of each parent, returning either the whole row or just the Value from the last column.
This seems easy in theory and on paper, but I just can't get it to work in DAX. I have tried various CALCULATE formulas without any result worth mentioning.
I'm a beginner in Power BI, and any help would be much appreciated!
You can use a measure like this one, where we check Max Date per Name:
Flag =
VAR MaxDatePerName =
    CALCULATE (
        MAX ( Sheet3[DateTime] ),
        FILTER ( ALL ( Sheet3 ), SELECTEDVALUE ( Sheet3[Name] ) = Sheet3[Name] )
    )
RETURN
    IF ( MaxDatePerName = SELECTEDVALUE ( Sheet3[DateTime] ) && LEFT ( SELECTEDVALUE ( Sheet3[Name] ), 6 ) = "Parent", 1, BLANK () )
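As a cross-check of what the measure is meant to select, the following plain-Python sketch (not DAX) applies the same two conditions to the sample rows from the question: keep the row with the latest DateTime per Name, and only for names starting with "Parent". The helper names `parse` and `latest_parent_rows` are hypothetical.

```python
from datetime import datetime

# Sample rows from the question: (Name_ID, Name, DateTime, Value).
rows = [
    (1, "Child_1", "18.8.2021 12:33:24", "F"),
    (32, "Parent_32", "18.8.2021 11:41:09", "F"),
    (13, "Child_1", "18.8.2021 11:30:58", "E"),
    (48, "Parent_48", "18.8.2021 09:13:11", "F"),
    (2, "Child_2", "17.8.2021 00:09:42", "S"),
    (1, "Child_1", "17.8.2021 23:03:34", "F"),
    (48, "Parent_48", "17.8.2021 21:46:27", "S"),
    (6, "Parent_6", "16.8.2021 17:31:26", "S"),
]


def parse(timestamp):
    """Parse the day.month.year timestamps used in the sample data."""
    return datetime.strptime(timestamp, "%d.%m.%Y %H:%M:%S")


def latest_parent_rows(rows):
    """Return the latest row per Name, restricted to names starting with 'Parent'."""
    latest = {}
    for row in rows:
        name = row[1]
        # Keep this row if the name is new or its timestamp is later.
        if name not in latest or parse(row[2]) > parse(latest[name][2]):
            latest[name] = row
    return [row for row in latest.values() if row[1].startswith("Parent")]


for row in latest_parent_rows(rows):
    print(row)
```

The measure does the same thing row by row: it returns 1 only where the row's DateTime equals the maximum for its Name and the Name begins with "Parent".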
Alternatively, with RANKX:
Measure2 =
VAR _0 =
    MAX ( 'Table 1'[DateTime] )
VAR _00 =
    MAX ( 'Table 1'[Name] )
VAR _1 =
    CALCULATE (
        RANKX (
            FILTER ( ALL ( 'Table 1' ), 'Table 1'[Name] = _00 ),
            CALCULATE ( MAX ( 'Table 1'[DateTime] ) ),
            ,
            DESC
        )
    )
VAR _2 =
    IF ( _1 = 1 && CONTAINSSTRING ( _00, "Parent" ) = TRUE (), _0, BLANK () )
RETURN
    _2
Can someone help me convert this SQL expression to DAX?
row_number() p over (partition by date, customer, type order by day)
The row number is my desired output.
Assuming that your data looks like this table:
Sample
+------------+----------+---------+--------+
| Date | Customer | Product | Gender |
+------------+----------+---------+--------+
| 01/01/2018 | 1234 | P2 | F |
| 01/01/2018 | 1234 | P2 | M |
| 03/01/2018 | 1235 | P1 | F |
| 03/01/2018 | 1235 | P2 | F |
+------------+----------+---------+--------+
I have created a calculated column called Rank, using the RANKX and FILTER function.
The first part of the calculation is to create variables outside the scope of the FILTER function. The second part uses RANKX that takes an expression value - in this case Gender - to order the values.
Rank =
VAR _currentdate = 'Sample'[Date]
VAR _customer = 'Sample'[Customer]
VAR _product = 'Sample'[Product]
RETURN
    RANKX (
        FILTER (
            'Sample',
            [Date] = _currentdate && [Customer] = _customer && [Product] = _product
        ),
        [Gender], , ASC
    )
I compared the output with the SQL equivalent:
select
*,
row_number() over(partition by Date,Customer,Product order by Gender)
from (
select '2018-01-01' as Date,1234 as CUSTOMER,'P2' AS PRODUCT, 'M' Gender union
select '2018-01-01' as Date,1234,'P2','F' UNION
select '2018-01-03' as Date,1235,'P1','F' UNION
select '2018-01-03' as Date,1235,'P2','F'
)t1
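For readers who want to check the ranking logic outside Power BI, here is a small plain-Python sketch (not DAX) of ranking by Gender within each (Date, Customer, Product) group, using the sample rows above. The function name `rank_within_partition` is made up for illustration. One caveat it makes visible: this style of ranking gives tied values the same rank, so it only coincides with SQL's ROW_NUMBER when the ordering value is unique within each partition.

```python
from collections import defaultdict

# Sample rows from the answer: (Date, Customer, Product, Gender).
sample = [
    ("01/01/2018", 1234, "P2", "F"),
    ("01/01/2018", 1234, "P2", "M"),
    ("03/01/2018", 1235, "P1", "F"),
    ("03/01/2018", 1235, "P2", "F"),
]


def rank_within_partition(rows):
    """Rank each row by Gender (ascending) within its (Date, Customer, Product) group."""
    partitions = defaultdict(list)
    for date, customer, product, gender in rows:
        partitions[(date, customer, product)].append(gender)

    ranks = []
    for date, customer, product, gender in rows:
        peers = partitions[(date, customer, product)]
        # Rank = 1 + number of peers that sort strictly before this row,
        # so tied values share the same rank.
        ranks.append(1 + sum(1 for other in peers if other < gender))
    return ranks


print(rank_within_partition(sample))  # [1, 2, 1, 1]
```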
I use a SerDe to read data in a specific format with | as the delimiter.
One line of my data may look like: key1=value2|key2=value2|key3="va , lues", and I create the Hive table as below:
CREATE EXTERNAL TABLE(
field1 STRING,
field2 STRING,
field3 STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "([^\\|]*)\\|([^\\|]*)\\|([^\\|]*)",
"output.format.string" = "%1$s %2$s %3$s"
)
STORED AS TEXTFILE;
I need to extract all the values and ignore any quotes, if they exist.
The result should look like:
value2 value2 va , lues
How can I change my current regexp to extract the values?
I can currently offer 2 options, neither of which is perfect.
BTW, "output.format.string" is obsolete and has no effect.
Option 1: capture each optional quote in an extra column (the values come out unquoted)
create external table mytable
(
q1 string
,field1 string
,q2 string
,field2 string
,q3 string
,field3 string
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties ('input.regex' = '.*?=(?<q1>"?)(.*?)(?:\\k<q1>)\\|.*?=(?<q2>"?)(.*?)(?:\\k<q2>)\\|.*?=(?<q3>"?)(.*?)(?:\\k<q3>)')
stored as textfile
;
select * from mytable
;
+----+--------+----+--------+----+-----------+
| q1 | field1 | q2 | field2 | q3 | field3 |
+----+--------+----+--------+----+-----------+
| | value2 | | value2 | " | va , lues |
+----+--------+----+--------+----+-----------+
Option 2: no extra columns (the quotes stay in the value)
create external table mytable
(
field1 string
,field2 string
,field3 string
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties ('input.regex' = '.*?=(".*?"|.*?)\\|.*?=(".*?"|.*?)\\|.*?=(".*?"|.*?)')
stored as textfile
;
select * from mytable
;
+--------+--------+-------------+
| field1 | field2 | field3 |
+--------+--------+-------------+
| value2 | value2 | "va , lues" |
+--------+--------+-------------+
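The backreference trick from option 1 can be tried outside Hive. The sketch below is a Python translation, not Hive code: Python spells named groups `(?P<name>...)` and backreferences `(?P=name)`, whereas the Java regex engine used by Hive writes `(?<name>...)` and `\k<name>`. It uses `fullmatch` on the assumption that RegexSerDe requires the whole line to match. The names `FIELD`, `PATTERN` and `extract_values` are hypothetical.

```python
import re

# One key=value field: an optional opening quote is captured in its own
# group, and the same text (quote or nothing) must close the value.
FIELD = r'.*?=(?P<q{n}>"?)(.*?)(?P=q{n})'

# Three fields separated by a literal pipe.
PATTERN = re.compile(r"\|".join(FIELD.format(n=i) for i in (1, 2, 3)))


def extract_values(line):
    """Return the three unquoted values, or None if the line does not match."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    # Groups alternate quote, value, quote, value, ...; keep only the values.
    return match.groups()[1::2]


line = 'key1=value2|key2=value2|key3="va , lues"'
print(extract_values(line))  # ('value2', 'value2', 'va , lues')
```

The quote groups correspond to the q1, q2 and q3 columns in option 1; they exist only so the closing quote can be matched and are discarded here.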
I have a comma-separated column(string) with duplicate values. I want to remove duplicates:
e.g.
column_name
-----------------
gun,gun,man,gun,man
shuttle,enemy,enemy,run
hit,chase
I want result like:
column_name
----------------
gun,man
shuttle,enemy,run
hit,chase
I am using Hive.
Option 1: keep last occurrence
This will keep the last occurrence of every word.
E.g. 'hello,world,hello,world,hello' will result in 'world,hello'
select regexp_replace
(
column_name
,'(?<=^|,)(?<word>.*?),(?=.*(?<=,)\\k<word>(?=,|$))'
,''
)
from mytable
;
+-------------------+
| gun,man |
| shuttle,enemy,run |
| hit,chase |
+-------------------+
Option 2: keep first occurrence
This will keep the first occurrence of every word.
E.g. 'hello,world,hello,world,hello' will result in 'hello,world'
select reverse
(
regexp_replace
(
reverse(column_name)
,'(?<=^|,)(?<word>.*?),(?=.*(?<=,)\\k<word>(?=,|$))'
,''
)
)
from mytable
;
Option 3: sorted
E.g. 'Cherry,Apple,Cherry,Cherry,Cherry,Banana,Apple' will result in 'Apple,Banana,Cherry'
select regexp_replace
(
concat_ws(',',sort_array(split(column_name,',')))
,'(?<=^|,)(?<word>.*?)(,\\k<word>(?=,|$))+'
,'${word}'
)
from mytable
;
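To sanity-check what each of the three options should return, here is a plain-Python sketch (not Hive) that produces the same results without regular expressions. The function names are made up for illustration; the inputs are the examples quoted in the options above.

```python
def dedupe_keep_first(csv):
    """Keep the first occurrence of every item, preserving order."""
    # dict.fromkeys keeps insertion order and drops later duplicates.
    return ",".join(dict.fromkeys(csv.split(",")))


def dedupe_keep_last(csv):
    """Keep the last occurrence of every item, preserving order."""
    # Deduplicate from the right, then restore left-to-right order.
    kept = list(dict.fromkeys(reversed(csv.split(","))))
    return ",".join(reversed(kept))


def dedupe_sorted(csv):
    """Return the distinct items in sorted order."""
    return ",".join(sorted(set(csv.split(","))))


print(dedupe_keep_last("hello,world,hello,world,hello"))   # world,hello
print(dedupe_keep_first("hello,world,hello,world,hello"))  # hello,world
print(dedupe_sorted("Cherry,Apple,Cherry,Cherry,Cherry,Banana,Apple"))  # Apple,Banana,Cherry
```

Keep-last is keep-first applied to the reversed list, which is the same idea option 2 expresses by wrapping the regex in `reverse`.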
If the order of the values is not a concern:
with mytable as (
select 'gun,gun,man,gun,man' as column_name union
select 'shuttle,enemy,enemy,run' as column_name union
select 'hit,chase' as column_name
) -- test data
SELECT column_name, concat_ws(',',collect_set(item)) from (
select distinct column_name, s.item from mytable
lateral view explode(split(column_name,',')) s as item
) t
group by column_name
;
+--------------------------+--------------------+--+
| column_name | _c1 |
+--------------------------+--------------------+--+
| gun,gun,man,gun,man | gun,man |
| hit,chase | chase,hit |
| shuttle,enemy,enemy,run | enemy,run,shuttle |
+--------------------------+--------------------+--+
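If you want to keep the values in their original order (this relies on collect_set seeing the rows in the subquery's order, which Hive does not strictly guarantee):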
with mytable as (
select 'gun,gun,man,gun,man' as column_name union
select 'shuttle,enemy,enemy,run' as column_name union
select 'hit,chase' as column_name
) -- test data
select column_name,concat_ws(',',collect_set(item)) as column_name_distincted
from (
select column_name,item, min(pos) as pos
from (
select column_name,pos,item
from mytable
lateral view posexplode(split(column_name,',')) s as pos,item
) t
group by column_name,item
order by column_name,pos
) t
group by column_name
;
+--------------------------+-------------------------+--+
| column_name | column_name_distincted |
+--------------------------+-------------------------+--+
| gun,gun,man,gun,man | gun,man |
| hit,chase | hit,chase |
| shuttle,enemy,enemy,run | shuttle,enemy,run |
+--------------------------+-------------------------+--+