Google Data Studio Calculated Field by Extracting String from Event Label Values - regex

I'm trying to use a CASE statement with RegEx to output string values for the Event Label field, so that a table shows the number of events for each value. For example, I'm looking for foobar (and, separately, other string values) within the Event Label values; it may either stand alone or be part of a URL, like so:
|[object HTMLLabelElement] | Foobar |
/images/foobar-26.svg
It seems REGEXP_EXTRACT might suit this the best:
CASE WHEN REGEXP_EXTRACT(Event Label, '.(?i)foobar.') THEN Foobar
However, the table produced using the calculated field as the dimension only contains a blank row that seems to be the sum of the number of events.
What am I missing?

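I think you need to use REGEXP_MATCH, not REGEXP_EXTRACT, given your existing syntax: CASE WHEN expects a boolean condition, which REGEXP_MATCH returns, whereas REGEXP_EXTRACT returns the extracted text. Alternatively, change the syntax to a straight REGEXP_EXTRACT without the CASE element.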
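To make the distinction concrete, here is a minimal Python sketch (not Data Studio syntax) of what the calculated field is meant to do: test each Event Label for a case-insensitive match and return a category name, then count events per category. The function name `categorize`, the fallback value "Other", and the third label are assumptions for illustration; the first two labels are the sample values from the question.

```python
import re
from collections import Counter

# Case-insensitive test for "foobar" anywhere in the label.
FOOBAR = re.compile(r"foobar", re.IGNORECASE)

def categorize(event_label: str) -> str:
    # A boolean test (the role REGEXP_MATCH plays) is what a CASE WHEN
    # condition needs; an extraction would return the matched text instead.
    return "Foobar" if FOOBAR.search(event_label) else "Other"

labels = [
    "|[object HTMLLabelElement] | Foobar |",
    "/images/foobar-26.svg",
    "/images/logo.svg",  # hypothetical non-matching label
]

# Number of events per category, as the table in the report would show.
counts = Counter(categorize(label) for label in labels)
print(counts)
```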

PowerBI custom combined column with Text.Combine and with embedded conditions leading to the insertion of different strings

I created a custom column in Power BI that concatenates columns.
I have the following:
Text.Combine({[Nip],[Nap],[Noup]},"_")
However, I would like to insert a specific text that changes depending on whether or not data is present in certain columns. I need to check whether there is data in four columns. If there is data, a specific string of characters should be inserted; if there is not, no data should be inserted.
I am trying to insert the outcome of the "IF"s, but there is some complexity. I have tried the following, but it does not work; Power BI tells me "Token Eof expected":
If [Lapino] <> null or [Lapinou] <> null or [Werwolf] <> null or [Ciocolato] then
Text.Combine({[Nip],"Snoubadiuba",[Nap],[Noup]},"_")
else Text.Combine({[Nip],"BruttoCativo",[Nap],[Noup]},"_")
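I believe this is as simple as changing your If to lowercase if. M code is case-sensitive. Also, the last condition, [Ciocolato], has no comparison; it presumably needs <> null like the other three.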

Conditionally format cell if value entered already appears in same row, next column which is CSV

I want to conditionally format A3:A if the value entered in A3:A already appears more than once in B3:B, which contains CSV.
(A3:A will be CONCATENATED to B3:B, so the value will automatically appear at least once.)
Basically, if the value is not already present, there will be no formatting and I know to go ahead and add it (leave it). If it is present, the cell should be formatted to alert me not to add it (or to delete it). Some cells may contain numerous values, so it is not easy to see at a glance whether the value in question is already present.
I attempted to use REGEXMATCH, but I'm not really sure how to turn the TRUE into a numeric value.
=IF(LEN(A3),REGEXMATCH(B3,A3),)
I've also found other formulas using COUNTIF and COUNTA that perform a similar action, but none that consider CSV.
My sheet
Custom formula for the conditional formatting (CF) rule:
=ARRAYFORMULA(REGEXMATCH(A3,TEXTJOIN("|",1,TRANSPOSE(QUERY(QUERY(TRANSPOSE(TRIM(
SPLIT(B3,","))), "select Col1,count(Col1) group by Col1"),
"select Col1 where Col2 > 1", 0)))))

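The logic of that formula can be sketched in Python, for illustration only (the function name `appears_more_than_once` is hypothetical): split the CSV cell on commas, trim each item, count occurrences, and flag the value when it occurs more than once. Note one difference: this sketch compares whole items exactly, whereas REGEXMATCH in the formula matches the duplicated items as patterns anywhere in A3.

```python
from collections import Counter

def appears_more_than_once(value: str, csv_cell: str) -> bool:
    # Split the comma-separated cell and trim whitespace around each item.
    items = [item.strip() for item in csv_cell.split(",")]
    # Counter returns 0 for items that never occur.
    return Counter(items)[value.strip()] > 1

print(appears_more_than_once("apple", "apple, pear, apple"))
print(appears_more_than_once("pear", "apple, pear, apple"))
```
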
Extracting String Portions in SQL using Regular expressions

Hi All,
I have a query related to Regular expressions in SQL.
I have a case where a portion of a string has to be extracted from a column. That portion is prefixed with the value of my column A. Please see the screenshot for the sample data. I have also added the expected output in a separate column (highlighted in green).
Scenarios:
If a column value contains more than one unique number, the result has to be Null.
Eg: To verify CAN06010025, CAN06010026 & CAN06010030 after the approval.
In the above string there is more than one number (the bold portion),
so this case should be ignored (meaning it has to give me a Null value).
If there is only one number and it is repeated, then I have to consider that case and extract that portion of the string.
Eg: Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated
In this example, the portion I want to extract is repeated, and I am looking to extract the highlighted portion alone.
The same applies to the other cases as well.
I tried the SQL below. The challenge is that my Col A value can also be present on its own in Col B (line 2 in the screenshot); the code counts that occurrence too when I count with the REGEXP_COUNT function, and gives me Null. My expectation is to extract the USA12S001 portion from the column.
Could you please help me achieve this so that both of the above conditions are satisfied?
SQL:
SELECT
ColA,
ColB,
case when REGEXP_COUNT(ColB,ColA) >2 THEN NULL
ELSE REPLACE(REPLACE(concat(regexp_substr(ColB,ColA||'([[:alnum:]]+\.?)'),
nvl(regexp_substr(ColB,ColA||'(\-[[:digit:]]+)'),
regexp_substr(ColB,ColA||'([[:space:]]\-[[:space:]][[:digit:]]+)'))),
' ',''),'.','')
END AS Result
FROM
table
Test Data:
Col A | Col B
CAN06 | to verify CAN06010025, CAN06010026 & CAN06010030 after the approval
USA12 | Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated
USA27 | Project USA27: Id: USA27S001: Prod
HUN04 | To review id HUN04S002-HUN04S004 after the due date.
CAN05 | ID: CAN05S005 with the details as CAN05S005 are completed.
USA24 | Project USA24: Id: USA24S009: Data Issue
CAN06 | "Project: Subject CAN06S009: V2 & V3- Id CAN06S010: V1"
If the REGEXP_COUNT is the only issue, then the answer is simple: change
case when REGEXP_COUNT(ColB,ColA) >2
to:
case when REGEXP_COUNT(ColB,ColA || '[[:alnum:]]') >2
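The effect of that change can be checked with a small Python sketch (an illustration, not the SQL itself). The function names are hypothetical, and `[A-Za-z0-9]` stands in for `[[:alnum:]]`, which Python's `re` module does not support. The sample text is row 2 of the test data.

```python
import re

def count_plain(col_b: str, col_a: str) -> int:
    # Counts every occurrence of the prefix, including the bare
    # "USA12" in "Project USA12:".
    return len(re.findall(re.escape(col_a), col_b))

def count_with_suffix(col_b: str, col_a: str) -> int:
    # Counts only occurrences that are followed by a letter or digit,
    # so the bare prefix is no longer counted.
    return len(re.findall(re.escape(col_a) + r"[A-Za-z0-9]", col_b))

text = "Project USA12: Id USA12S001: Contact required -USA12S001- form to be updated"
print(count_plain(text, "USA12"))        # 3, so "> 2" is true and the row becomes NULL
print(count_with_suffix(text, "USA12"))  # 2, so the row is kept
```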

RPA(blueprism) date validation

I am trying to validate a date that comes from an Excel sheet; the format should be dd/mm/yyyy.
I tried the regex pattern [0-9]{2}/[0-9]{2}/[0-9]{4},
but this doesn't work with single-digit days or months, and since we cannot add a leading 0 in the Excel sheet, the pattern fails. (This is for the Blue Prism tool, which has an action for regex matching.)
To build in the resiliency you require, you'll have to accept either 1 or 2 digits for both dd and mm:
[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}
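Keep in mind that this pattern only checks the shape of the text, not whether the date exists. A minimal Python sketch of the difference (illustrative only; the function names are hypothetical, and whether the Blue Prism regex action anchors the pattern to the whole string is not stated here, so the sketch uses a full match as an assumption):

```python
import re
from datetime import datetime

DATE_SHAPE = re.compile(r"[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}")

def has_date_shape(text: str) -> bool:
    # fullmatch rejects extra characters before or after the date.
    return DATE_SHAPE.fullmatch(text) is not None

def is_real_date(text: str) -> bool:
    # Shape check first, then let the date parser reject impossible dates.
    if not has_date_shape(text):
        return False
    try:
        datetime.strptime(text, "%d/%m/%Y")
        return True
    except ValueError:
        return False

for sample in ["5/3/2021", "05/03/2021", "31/02/2021", "5/3/21"]:
    print(sample, has_date_shape(sample), is_real_date(sample))
```

Here "31/02/2021" has the right shape but is not a real date, which is why a parser-based check is usually the safer choice for validation.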
Since you mention that you are working with Blue Prism, are you sure that you really need regex for validating dates? Blue Prism has a built-in feature for that, callable directly inside a Calc stage. Check out the following example (you can see the Expression of the selected Calc stage in the top Expression bar).
The function used for validating dates is IsDate([Some date as string]); the result is saved into a Flag data item.
After the check, you can use that Flag data item in a Decision block and do whatever you consider appropriate if the value is not an actual date.
Note: of course, if you are working with lists/datatables in a Code stage instead of iterating over a collection in the process layout, then you need something else, but this might still be helpful.
In a Code stage I would probably simply use the DateTime.Parse(String) method, which converts a date given as a String into a DateTime instance. Example:
' DateTime.Parse throws an Exception if parsing failed.
Dim valid As Boolean = False
Try
    Dim d = DateTime.Parse(First_Date)
    valid = True
Catch e As Exception
    valid = False
End Try
See more about parsing dates using DateTime.Parse at MSDN: https://msdn.microsoft.com/en-us/library/1k1skd40(v=vs.110).aspx
There is also a nice post about parsing dates here: https://stackoverflow.com/a/18465222/7439802
In Blue Prism you can use
FormatDate(Now(), FormatOfDate)
To compare two dates, first convert both to the same format with FormatDate; then you can compare them.
For the FormatDate options, refer to the Blue Prism help: search for dateadd, then select "Calculation and decision".

How to search multiple strings in a string?

I want to check, in a new Power Query column, whether a string like "This is a test string" contains any of the items in the list {"dog","string","bark"}.
I already tried Text.PositionOfAny("This is a test string",{"dog","string","bark"}), but the function only accepts single-character values:
Expression.Error: The value isn't a single-character string.
Any solution for this?
This is a case where you'll want to combine a few M library functions together.
You'll want to use Text.Contains many times against a list, which is a good case for List.Transform. List.AnyTrue will tell you if any string matched.
List.AnyTrue(List.Transform({"dog","string","bark"}, (substring) => Text.Contains("This is a test string", substring)))
If you wish there were a Text.ContainsAny function, you can write it!
let
    Text.ContainsAny = (string as text, list as list) as logical =>
        List.AnyTrue(List.Transform(list, (substring) => Text.Contains(string, substring))),
    Invoked = Text.ContainsAny("This is a test string", {"dog","string","bark"})
in
    Invoked
Another simple solution is this:
List.ContainsAny(Text.SplitAny("This is a test string", " "), {"dog","string","bark"})
It splits the text into a list of words, because for lists there is a function that does what you need.
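The two approaches differ in what counts as a match: Text.Contains looks for substrings, while splitting first compares whole words. A small Python sketch of that difference (illustrative only; the function names and the second sample sentence are hypothetical):

```python
from typing import Iterable

def contains_any_substring(text: str, needles: Iterable[str]) -> bool:
    # Substring semantics, like Text.Contains applied to each list item.
    return any(needle in text for needle in needles)

def contains_any_word(text: str, needles: Iterable[str]) -> bool:
    # Whole-word semantics, like splitting on spaces and comparing list items.
    return not set(text.split(" ")).isdisjoint(needles)

needles = ["dog", "string", "bark"]
print(contains_any_substring("This is a test string", needles))  # True
print(contains_any_word("This is a test string", needles))       # True
print(contains_any_substring("Two test strings", needles))       # True ("string" is inside "strings")
print(contains_any_word("Two test strings", needles))            # False ("strings" is not in the list)
```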
If it's a specific (static) list of matches, you'll want to add a custom column with an if then else statement in PQ. Then use a filter on that column to keep or remove the rows. AFAIK PQ doesn't support regex, so Alexey's solution won't work.
If you need the lookup to be dynamic, it gets more complicated, but it is doable. You essentially need to:
1. Have an ID column for the original row.
2. Duplicate the query so you have two queries; the following steps happen in the newly created query.
3. Split the text field into separate columns, usually by space.
4. Unpivot the newly created columns.
5. Get the list of intended names.
6. Use the List.Generate method to generate a list that shows 1 if there's a match and 0 if there isn't.
7. Sum the values of the list.
8. If the sum > 0, mark that row as a match; I usually use the value 1 in a new column.
9. Filter the table to keep only rows with value 1 in the new column.
10. Group this table on ID; this gives the list of IDs that contain the match.
11. Use the merge feature to merge in the first table, ensuring you keep only rows that match the IDs.
That should get you to where you want to be.
Thanks for giving me the lead. In my own case I needed to ensure that two items both exist in a string, so I changed the formula to:
List.AllTrue(List.Transform({"/","2017"},(substring) => Text.Contains("4/6/2017 13",substring)))
It returned true, as expected.
You can use a regex here with the alternation (logical OR) operator |, for example in JavaScript:
/dog|string|bark/.test("This is a test string") // returns true
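When the search terms come from a list rather than being typed into the pattern by hand, they should be escaped before being joined with |. A minimal Python sketch of that idea (the function name `contains_any` is hypothetical, and returning False for an empty list is an assumption about the desired behaviour):

```python
import re

def contains_any(text, needles):
    # An empty alternation would match everything, so handle it explicitly.
    if not needles:
        return False
    # re.escape keeps characters such as "." or "|" inside a search term
    # from being interpreted as regex syntax.
    pattern = "|".join(re.escape(needle) for needle in needles)
    return re.search(pattern, text) is not None

print(contains_any("This is a test string", ["dog", "string", "bark"]))
```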