PowerBI - Get the Index of the First Occurrence of a value in the column - powerbi

I am trying to return the index of the first occurrence of a value in a column.
I want to use the Calculated Column functionality in Power BI.
For Example,
Input  Output
ASD    1
ASD    1
ASD    1
GEF    4
GEF    4
HIJ    6
GEF    4
This can be done in Excel using a simple formula like
MATCH(A2,A:A,0)-1
For Power BI to understand an index, I have created a column called Index in the Query Editor and made the data look like:
Index  Input  Output
1      ASD    ?
2      ASD    ?
3      ASD    ?
4      GEF    ?
5      GEF    ?
6      HIJ    ?
7      GEF    ?
How to do this in PowerBI?

The way I did this was to find the minimal index that corresponds to the Input value in the table:
Output =
MINX(
    FILTER(
        TableName,
        TableName[Input] = EARLIER(TableName[Input])
    ),
    TableName[Index]
)
This takes the minimal index over the table, where Input matches the value of Input in the original (earlier) row context.
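The same "minimal index per matching Input" logic can be sketched outside DAX. This is a minimal Python illustration of the idea (not Power BI code), using the example data from the question:

```python
inputs = ['ASD', 'ASD', 'ASD', 'GEF', 'GEF', 'HIJ', 'GEF']
indexed = list(enumerate(inputs, start=1))  # the 1-based Index column

# record the smallest Index at which each Input value first appears
first_seen = {}
for idx, val in indexed:
    first_seen.setdefault(val, idx)

# every row gets the first-occurrence index of its own value
output = [first_seen[val] for _, val in indexed]
print(output)  # [1, 1, 1, 4, 4, 6, 4]
```

The dictionary plays the role that FILTER + MINX plays in the DAX measure: for each row, look up the minimum Index among rows sharing the same Input.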

Related

Simpler regular expression for oracle regexp_replace

I have a |-separated string with 20 |s, like 123|1|42|13||94123|2983191|2|98863|...|211|, up to 20 |. This is an Oracle DB column. The string is just 20 numbers, each followed by |.
I am trying to get a string out of it where I remove the numbers at positions 4, 6, 8, 9, 11, 12 and 13. Also, I need to move the number at position 16 to position 4. So far, I have a regex like
select regexp_replace(col1, '^((\d*\|){4})(\d*\|)(\d*\|)(\d*\|)(\d*\|)((\d*\|){2})(\d*\|)((\d*\|){3})((\d*\|){2})(\d*\|)(.*)$', '\1|\4|\6||\9||||||||') as cc from table
This is where I get stuck, since Oracle only supports backreferences up to 9 groups. Is there any way to make this regex simpler so it has fewer groups and can fit into the replace? Any alternative solutions/suggestions are also welcome.
Note - The position counter begins at 0, so 123 in above string is the 0th number.
Edit: Example -
Source string
|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||
Expected result
|||16666|10107||10115||||||11512||||
You can get the result you want by removing capture groups for the numbers you are removing from the string anyway, and writing (for example) ((\d*\|){2}) as (\d*\|\d*\|). This reduces the number of capture groups to 7, allowing your code to work as is:
select regexp_replace(col1,
'^(\d*\|\d*\|\d*\|\d*\|)\d*\|(\d*\|)\d*\|(\d*\|)\d*\|\d*\|(\d*\|)\d*\|\d*\|\d*\|(\d*\|\d*\|)(\d*\|)(.*)$',
'\1\6\2|\3||\4|||\5|\7') as cc
from table
Output (for your test data and also @Littlefoot's column example):
CC
|||14444|16666|227931|||||11361|||||11512|||||
0|1|2|3|16|5||7|||10||||14|15||17|18|19|
Demo on dbfiddle
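The pattern itself is portable regex, so one quick way to sanity-check it outside the database is Python's re module. This sketch applies the same 7-group pattern and replacement to the 0|1|...|19| index string (a hypothetical test column, not from the question's data):

```python
import re

# the 7-group pattern and replacement from the answer above
pattern = (r'^(\d*\|\d*\|\d*\|\d*\|)\d*\|(\d*\|)\d*\|(\d*\|)\d*\|\d*\|'
           r'(\d*\|)\d*\|\d*\|\d*\|(\d*\|\d*\|)(\d*\|)(.*)$')
repl = r'\1\6\2|\3||\4|||\5|\7'

# index string: token i holds the value i; 20 tokens, each followed by '|'
col = '|'.join(str(i) for i in range(20)) + '|'
print(re.sub(pattern, repl, col))  # 0|1|2|3|16|5||7|||10||||14|15||17|18|19|
```

In the output you can read off exactly which positions were blanked (4, 6, 8, 9, 11, 12, 13) and that 16 was moved into position 4, which makes the group bookkeeping easy to verify.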
As there's a unique column (ID, as you said), see if this helps:
split every column value into rows
compose them back (using listagg), which uses 2 CASEs:
one to remove the values you don't want
another to properly sort them ("put the value at position 16 at position 4")
Note that my result differs from yours; if I counted correctly, 16666 isn't at position 16 but 17, so 11512 has to be moved to position 4.
I also added another dummy row, which is here to confirm whether I counted positions correctly, and to show why you have to use lines #10-12 (because of duplicates).
OK, here you are:
SQL> with test (id, col) as
2 (
3 select 1, '|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||' from dual union all
4 select 2, '1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20' from dual
5 ),
6 temp as
7 (select replace(regexp_substr(col, '(.*?)(\||$)', 1, column_value), '|', '') val,
8 column_value lvl,
9 id
10 from test cross join table(cast(multiset(select level from dual
11 connect by level <= regexp_count(col, '\|') + 1
12 ) as sys.odcinumberlist))
13 )
14 select id,
15 listagg(case when lvl in (4, 6, 8, 9, 11, 12, 13) then '|'
16 else val || case when lvl = 20 then '' else '|' end
17 end, '')
18 within group (order by case when lvl = 16 then 4
19 when lvl = 4 then 16
20 else lvl
21 end) result
22 from temp
23 group by id;
ID RESULT
---------- ------------------------------------------------------------
1 |||11512|10107||10115|||||||10110|||16666|||
2 1|2|3|16|5||7|||10||||14|15||17|18|19|20
SQL>
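The split/blank/reorder/rejoin steps of the SQL above can be sketched procedurally. This is a minimal Python illustration of the same logic (using 1-based positions, like LEVEL in the query), not a replacement for the Oracle solution:

```python
def transform(col):
    # split into 1-based positional tokens, like the CONNECT BY LEVEL rows
    toks = col.split('|')
    # blank the values at positions 4, 6, 8, 9, 11, 12, 13 (the slots stay)
    for p in (4, 6, 8, 9, 11, 12, 13):
        toks[p - 1] = ''
    # swap positions 4 and 16, mirroring the ORDER BY CASE in the LISTAGG
    toks[3], toks[15] = toks[15], toks[3]
    return '|'.join(toks)

src = '|||14444|10107|227931|10115||10118||11361|11485||10110||11512|16666|||'
print(transform(src))  # |||11512|10107||10115|||||||10110|||16666|||
```

Blanking a position rather than deleting the token is what keeps the pipe count intact, exactly as the first CASE in the LISTAGG emits a bare '|' for removed positions.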

Sumif and IF, to add up one column and compare it to another column

I have two lists that I want to compare to see if they match, but in one list the numbers are broken down into individual lots, so I need to sum them first to make sure they match the other list, which only shows the total amount.
Here's an example:
List 1
5 ABC
6 ABC
7 ABC
1 CDE
5 CDE
2 CDE
List 2
18 ABC
8 CDE
So I want to make sure that the sum of the ABC and CDE in List 1 matches the amount of ABC and CDE in List 2. I can do this using multiple columns, but I am trying for a more...elegant way (one nested formula).
If you are looking for a confirmation that the numbers match you can use the following:
=SUMIF($B$3:$B$11,E1,$A$3:$A$11)=SUMIF($B$15:$B$17,E1,$A$15:$A$17)
What this does is check whether the sum of ABC in List 1 is equal to the sum of ABC in List 2, returning TRUE if they are equal and FALSE if they are not.
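The comparison behind the formula (sum the detail rows per key, then compare against the total line) can be sketched in a few lines of Python, using the example lists from the question:

```python
from collections import defaultdict

list1 = [(5, 'ABC'), (6, 'ABC'), (7, 'ABC'), (1, 'CDE'), (5, 'CDE'), (2, 'CDE')]
list2 = [(18, 'ABC'), (8, 'CDE')]

# sum the detail rows per key, like SUMIF over List 1
sums = defaultdict(int)
for amount, key in list1:
    sums[key] += amount

# compare against the totals in List 2
checks = {key: sums[key] == total for total, key in list2}
print(checks)  # {'ABC': True, 'CDE': True}
```

Each entry in checks plays the role of the TRUE/FALSE cell the Excel formula produces for one key.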

Replace String B with String C if it contains (but not exactly matches) String A

I have a data frame match_df which shows "matching rules": the column old should be replaced with the column new in the data frames it is applied to.
old <- c("10000","20000","300ZZ","40000")
new <- c("Name1","Name2","Name3","Name4")
match_df <- data.frame(old,new)
old new
1 10000 Name1
2 20000 Name2
3 300ZZ Name3 # watch the letters
4 40000 Name4
I want to apply the matching rules above on a data frame working_df
id <- c(1,2,3,4)
value <- c("xyz-10000","20000","300ZZ-230002112","40")
working_df <- data.frame(id,value)
id value
1 1 xyz-10000
2 2 20000
3 3 300ZZ-230002112
4 4 40
My desired result is
# result
id value
1 1 Name1
2 2 Name2
3 3 Name3
4 4 40
This means that I am not looking for an exact match. I'd rather like to replace the whole string working_df$value as soon as it includes any part of the string in match_df$old.
I like the solution posted in R: replace characters using gsub, how to create a function?, but it works only for exact matches. I experimented with gsub and str_replace_all from stringr, but I couldn't find a solution that works for me. There are many solutions for exact matches on Stack Overflow, but I couldn't find a comprehensible one for this problem.
Any help is highly appreciated.
I'm not sure this is the most elegant/efficient way of doing it but you could try something like this:
working_df$value <- sapply(working_df$value, function(y) {
  idx <- which(sapply(match_df$old, function(x) grepl(x, y)))[1]
  if (is.na(idx)) idx <- 0
  ifelse(idx > 0, as.character(match_df$new[idx]), as.character(y))
})
It uses grepl to find, for each value of working_df, if there is a row of match_df that is partially matching and get the index of that row. If there is more than one, it takes the first one.
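The "first rule whose old string occurs anywhere in the value wins" behavior is easy to restate outside R. This is a minimal Python sketch of the same partial-match replacement, using the question's data:

```python
# the matching rules from match_df: old -> new
rules = {'10000': 'Name1', '20000': 'Name2', '300ZZ': 'Name3', '40000': 'Name4'}
values = ['xyz-10000', '20000', '300ZZ-230002112', '40']

def replace_partial(value):
    # first rule whose old string occurs anywhere in the value wins
    for old, new in rules.items():
        if old in value:
            return new
    return value  # no rule matched: keep the original value

result = [replace_partial(v) for v in values]
print(result)  # ['Name1', 'Name2', 'Name3', '40']
```

Note that '40' survives unchanged: it contains none of the old strings, even though it is a prefix of '40000' — substring containment runs in one direction only, which is exactly what grepl(x, y) checks.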
You need the grep function. This will return the indices of a vector that match a pattern (any pattern, not necessarily a full string match). For instance, this will tell you which of your "old" values match the "10000" pattern:
grep(match_df[1,1], working_df$value)
Once you have that information, you can look up the corresponding "new" value for that pattern, and replace it on the matching rows.
Here are 2 approaches using Map + <<- and a for loop:
working_df[["value2"]] <- as.character(working_df[["value"]])
Map(function(x, y){working_df[["value2"]][grepl(x, working_df[["value2"]])] <<- y}, old, new)
working_df
## id value value2
## 1 1 xyz-10000 Name1
## 2 2 20000 Name2
## 3 3 300ZZ-230002112 Name3
## 4 4 40 40
## or...
working_df[["value2"]] <- as.character(working_df[["value"]])
for (i in seq_along(working_df[["value2"]])) {
working_df[["value2"]][grepl(old[i], working_df[["value2"]])] <- new[i]
}

How to create serial number for each group in rdlc?

I have created a report with a serial number using RowNumber(Nothing) in an rdlc report. Since I have used a group in the report, the serial number continues from the previous group.
I get the report like
Group1  1 value
        2 value
        3 value
Group2  4 value
        5 value
        6 value
But I need it like
Group1  1 value
        2 value
        3 value
Group2  1 value
        2 value
        3 value
I even tried the RunningValue function. Is it possible to get it the way I need?
To create a serial number for each group in rdlc, you can use the following expression in the text box:
=RowNumber("table1_Group1")
where table1_Group1 is the group name.
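Scoping RowNumber to the group name makes the counter restart at 1 for every group. The same restart-per-group numbering can be sketched in Python (the group/value data here is hypothetical, just to show the behavior):

```python
from itertools import groupby

rows = [('Group1', 'a'), ('Group1', 'b'), ('Group1', 'c'),
        ('Group2', 'd'), ('Group2', 'e'), ('Group2', 'f')]

numbered = []
for _, items in groupby(rows, key=lambda r: r[0]):
    # enumerate restarts at 1 for every group, like RowNumber scoped to the group
    for n, (grp, val) in enumerate(items, start=1):
        numbered.append((grp, n, val))
print(numbered)
```

With RowNumber(Nothing) the counter would instead run over the whole dataset, producing 1..6 here, which is the continuing behavior the question describes.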

R - Need to subset a data frame using matches from a regex expression

I'm looking to subset a data frame based on matches from a regular expression that scans a single column, and returns the data in all the rows where column 2 has a match from the regular expression.
I'm using R 3.0.1 and am a relatively inexperienced R programmer.
My data frame looks like this:
data:
        Column1  Column2  Column3
Row 1   data     string   data
Row 2   data     string   data
Row 3   data     string   data
Row 4   data     string   data
I'm using the following to scan column 2:
grep("word1", data$Column2, perl=TRUE)
So far, I get all the strings returned from column2 that contain word1, but I'm looking to subset the entire row(s) where those matches are found.
new.data.frame <- data[grep("word1", data$Column2, perl=TRUE), ]
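The same idea — filter whole rows by a regex match on one column — can be sketched in Python for comparison. The row contents here are hypothetical placeholders mirroring the question's layout:

```python
import re

rows = [
    {'Column1': 'data1', 'Column2': 'has word1 in it', 'Column3': 'x'},
    {'Column1': 'data2', 'Column2': 'no match here',   'Column3': 'y'},
    {'Column1': 'data3', 'Column2': 'word1 again',     'Column3': 'z'},
]

# keep the whole row whenever Column2 matches the pattern
subset = [r for r in rows if re.search('word1', r['Column2'])]
print([r['Column1'] for r in subset])  # ['data1', 'data3']
```

As in the R one-liner, the match happens on one column but the result keeps every column of the matching rows.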