Match strings using input letters with regex

I have a column with values like:
col1
ABB
CDD
EFF
GHH
IJJ
KLL
If I input A,D, it should return
ABB
CDD
On inputting J,K, it should return
IJJ
KLL
I'm trying to do this using regex.

If you have to use regex, remove the commas and wrap the input in square brackets to build a character class as the search expression:
A,D ---> [AD]
J,K ---> [JK]
If this expression matches anywhere in a string from your list, add that string to the output.
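The construction above can be sketched in Python as follows. This is only an illustration, assuming each comma-separated part of the input is a single character; the function name filter_by_letters is made up for the example.

```python
import re

def filter_by_letters(values, user_input):
    """Return the values that contain any of the comma-separated letters."""
    # "A,D" -> ["A", "D"]; re.escape keeps characters like ^ or ] literal
    # inside the character class.
    letters = [re.escape(p.strip()) for p in user_input.split(",") if p.strip()]
    if not letters:
        return []
    pattern = re.compile("[" + "".join(letters) + "]")  # e.g. [AD]
    # search() succeeds if the class matches anywhere in the string.
    return [v for v in values if pattern.search(v)]

col1 = ["ABB", "CDD", "EFF", "GHH", "IJJ", "KLL"]
print(filter_by_letters(col1, "A,D"))  # ['ABB', 'CDD']
print(filter_by_letters(col1, "J,K"))  # ['IJJ', 'KLL']
```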

Related

Regular Expression Spotfire flavor counting occurrences backwards

I have input strings with an arbitrary number of '-' in them, and I need a regular expression that returns everything from the start of the input up to the SECOND-TO-LAST occurrence of '-', or a copy of the input if there are fewer than 2 occurrences.
For example:
input: "VS-TEST1-tbl0-max" output: "VS-TEST1"
input: "VSTEST1-tbl0-max" output: "VSTEST1"
input: "AllOneWord" output: "AllOneWord"
input: "AllOneWord-VS" output: "AllOneWord-VS"
Regex flavor: the TIBCO Spotfire one, see https://support.tibco.com/s/article/Tibco-KnowledgeArticle-Article-43748
Much thanks to all regexperts.
The transform you want is equivalent to replacing the last 2 blocks (each including its leading separator) with an empty string. The regex that matches exactly those two trailing blocks is (-[^-]*){2}$
Use the function RXReplace for this purpose.
Usage example:
RXReplace('VS-TEST1-tbl0-max', '(-[^-]*){2}$', '') -> 'VS-TEST1'
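To check the pattern against all four inputs from the question, here is a small Python sketch. It uses Python's re.sub as a stand-in for RXReplace, on the assumption that the two engines treat this particular pattern the same way; strip_last_two_blocks is a name invented for the example.

```python
import re

# A "block" is a hyphen followed by any run of non-hyphen characters;
# {2}$ requires exactly two such blocks at the very end of the input.
TAIL = re.compile(r"(-[^-]*){2}$")

def strip_last_two_blocks(text):
    # With fewer than two hyphens the pattern cannot match,
    # so the input comes back unchanged.
    return TAIL.sub("", text)

for sample in ["VS-TEST1-tbl0-max", "VSTEST1-tbl0-max", "AllOneWord", "AllOneWord-VS"]:
    print(sample, "->", strip_last_two_blocks(sample))
```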

Extract strings in a pattern from within another regex pattern: REGEX (Replace, Extract $1)

How can I extract strings in a pattern from within another regex pattern
Example:
REGEXREPLACE("Ex.: <a;b;c;> e <d;e;>.","<(.*?)>","($1)...")
to get
Ex: (a)(b)(c) e (d)(e).
Suggestion:
You can try nested REGEXREPLACE functions, then use SUBSTITUTE at the end, as in the sample below:
=SUBSTITUTE(REGEXREPLACE
(REGEXREPLACE
(REGEXREPLACE
("<a;b;c;> e <d;e;>.", "<(.*?)>", "($1)"),
"[^a-z()> .a-z]", ")*("),
"[^a-z()> .a-z]",""),
"()","")
Sample demonstration
How it works:
Use your original regex (here placed in cell A2):
=REGEXREPLACE("<a;b;c;> e <d;e;>.", "<(.*?)>", "($1)")
Result: (a;b;c;) e (d;e;).
Replace ; with )*(:
=REGEXREPLACE(A2, "[^a-z()> .a-z]", ")*(")
Result: (a)*(b)*(c)*() e (d)*(e)*().
Remove asterisk (*) sign:
=REGEXREPLACE(B2,"[^a-z()> .a-z]","")
Result: (a)(b)(c)() e (d)(e)().
Replace empty () using the SUBSTITUTE function:
=SUBSTITUTE(C2,"()","")
Result: (a)(b)(c) e (d)(e).
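For comparison, outside Google Sheets the same result can be reached in one pass when the regex engine accepts a replacement function. The Python sketch below is a different technique from the nested REGEXREPLACE chain above, not a translation of it; expand_groups is a hypothetical helper name.

```python
import re

def expand_groups(text):
    """Turn each <a;b;c;> group into (a)(b)(c)."""
    def repl(match):
        # match.group(1) is the text between < and >, e.g. "a;b;c;".
        # Splitting on ";" leaves an empty trailing item, which is skipped.
        items = [item for item in match.group(1).split(";") if item]
        return "".join(f"({item})" for item in items)
    return re.sub(r"<(.*?)>", repl, text)

print(expand_groups("Ex.: <a;b;c;> e <d;e;>."))  # Ex.: (a)(b)(c) e (d)(e).
```

Because only the text inside each <...> group is rewritten, characters elsewhere in the string (such as the uppercase E and the colon in "Ex.:") are left untouched.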
Reference:
Multiple regex matches in Google Sheets formula

Scala split and line start in the regex

I am trying to split the string into four parts: P, Q, R, S.
The string starts with P, as in the following example:
"P|VAL1|VAL2|VAL3|BLANK|Q|VAL4|BLANK|BLANK|R|VAL5|BLANK|VAL6|HELP|BLANK|VAL7|S|EDIT|BLANK|VAL8|(SDK 1.8)|BLANK".split("[(^?P\\|)][(Q?\\|)]?[(R?\\|)]?[(S?\\|)]") foreach println
gives
VAL1|VAL2|VAL3|BLANK
VAL4|BLANK|BLANK
VAL5|BLANK|VAL6|HEL
BLANK|VAL7
|EDIT|BLANK|VAL8
DK 1.8
BLANK
where my expectation is :
VAL1|VAL2|VAL3|BLANK
VAL4|BLANK|BLANK
VAL5|BLANK|VAL6|HELP|BLANK|VAL7
EDIT|BLANK|VAL8|(SDK 1.8)|BLANK
However, checking the first element of the split:
"P|VAL1|VAL2|VAL3|BLANK|Q|VAL4|BLANK|BLANK|R|VAL5|BLANK|VAL6|HELP|BLANK|VAL7|S|EDIT|BLANK|VAL8|(SDK 1.8)|BLANK".split("[(^P\\|)][(Q?\\|)]?[(R?\\|)]?[(S?\\|)]") (0)
gives
res9: String = ""
It seems that the start of the string is not honored here. I tried this on regex101 as well, and it correctly matches P| at the start, but it also matches the P| in |HELP|, so my regex is evidently flawed. My question, though, is: where does the empty string above come from?
You can use the following regex if having an empty first element of your list is not important:
\\|[QRS]\\||^P\\|
You can replace this regex with \\|[PQRS]\\||^P\\| if you expect other P separators inside the string.
OUTPUT:
"P|VAL1|VAL2|VAL3|BLANK|Q|VAL4|BLANK|BLANK|R|VAL5|BLANK|VAL6|HELP|BLANK|VAL7|S|EDIT|BLANK|VAL8|(SDK 1.8)|BLANK".split("\\|[QRS]\\||^P\\|");
[, VAL1|VAL2|VAL3|BLANK, VAL4|BLANK|BLANK, VAL5|BLANK|VAL6|HELP|BLANK|VAL7, EDIT|BLANK|VAL8|(SDK 1.8)|BLANK]
Otherwise you need to do it in 2 steps:
match and remove the P| at the beginning of your string using ^P\\| and replacing it with nothing
split the string using the regex \\|[QRS]\\| (replace it with \\|[PQRS]\\| if you expect other P separators inside the string)
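Both variants can be compared side by side in a short Python sketch. Python's re.split is used here in place of Scala's String.split, so it only illustrates the regex behaviour, under the assumption that both engines handle this simple pattern alike. It also shows where the empty first element comes from: a delimiter matched at position 0 has nothing in front of it, so the piece before it is the empty string.

```python
import re

s = ("P|VAL1|VAL2|VAL3|BLANK|Q|VAL4|BLANK|BLANK|R|VAL5|BLANK|VAL6|HELP"
     "|BLANK|VAL7|S|EDIT|BLANK|VAL8|(SDK 1.8)|BLANK")

# One step: "P|" at index 0 is itself a delimiter, so the text before it
# (nothing) becomes an empty first element.
one_step = re.split(r"\|[QRS]\||^P\|", s)

# Two steps: strip the leading "P|" first, then split on |Q|, |R|, |S|.
without_prefix = re.sub(r"^P\|", "", s)
two_step = re.split(r"\|[QRS]\|", without_prefix)

print(one_step[0] == "")  # True
for part in two_step:
    print(part)
```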
Here's one approach that defines the delimiter as one of P, Q, R, S enclosed by word boundary \b and optional |:
val s = "P|VAL1|VAL2|VAL3|BLANK|Q|VAL4|BLANK|BLANK|R|VAL5|BLANK|VAL6|HELP|BLANK|VAL7|S|EDIT|BLANK|VAL8|(SDK 1.8)|BLANK"
s.split("""\|?\b[PQRS]\b\|?""").filter(_ != "")
// res1: Array[String] = Array(VAL1|VAL2|VAL3|BLANK, VAL4|BLANK|BLANK, VAL5|BLANK|VAL6|HELP|BLANK|VAL7, EDIT|BLANK|VAL8|(SDK 1.8)|BLANK)
Skip the filter in case you want to include extracted empty strings.

Extract number not in brackets from this string using regular expressions [70-(90)]

[15-]
[41-(32)]
[48-(45)]
[70-15]
[40-(64)]
[(128)-42]
[(128)-56]
I have these values, from which I want to extract the number that is not in parentheses. If there is more than one, add them together.
What is the regular expression to do this?
So the solution would look like this:
[15-] -> 15
[41-(32)] -> 41
[48-(45)] -> 48
[70-15] -> 85
[40-(64)] -> 40
[(128)-42] -> 42
[(128)-56] -> 56
You would be overcomplicating things with a regex approach (in this case, at least). Also, regular expressions do not support mathematical operations, as pointed out by @richardtallent.
Instead, extract the substring without the initial and final square brackets, then use the Split function to split it in two at the dash. Lastly, use the InStr function to check whether either of the resulting substrings contains a parenthesis.
Substrings that contain a parenthesis are omitted from the addition; the others are added up.
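The non-regex approach described above (strip the square brackets, split on the dash, skip parenthesized parts, add the rest) can be sketched like this. It is written in Python rather than VBA purely for illustration, and sum_unbracketed is a made-up name.

```python
def sum_unbracketed(value):
    """Add up the numbers in e.g. "[70-15]" that are not in parentheses."""
    inner = value.strip()[1:-1]   # drop the surrounding [ and ]
    parts = inner.split("-")      # "41-(32)" -> ["41", "(32)"]
    # isdigit() is False for "(32)" and for the empty string left by "[15-]",
    # so only bare numbers take part in the sum.
    return sum(int(part) for part in parts if part.isdigit())

for sample in ["[15-]", "[41-(32)]", "[48-(45)]", "[70-15]",
               "[40-(64)]", "[(128)-42]", "[(128)-56]"]:
    print(sample, "->", sum_unbracketed(sample))
```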
Regular expressions do not support performing math on the terms. You can loop through the groups that are matched and perform the math outside of the regex.
Here's the pattern to extract a number within the square brackets that is not in parentheses:
\[
(?:(?:\d+|\([^\)]*\))-)*
(\d+)
(?:-[^\]]*)*
\]
The number will be returned in $1. Because the whole bracketed string is consumed by a single match, an input with two bare numbers such as [70-15] yields only one of them (15), so the other has to be extracted separately before adding.
This works by looking for a number that is prefixed by any number of "words" separated by dashes, where the "words" are either numbers themselves or parenthesized strings, and followed by, optionally, a dash and some other stuff before hitting the closing bracket.
If VBA's RegEx doesn't support non-capturing groups (?:), remove all of the ?:'s and your captured number will be in $3 instead.
A simpler pattern also works:
\[
(?:[^\]]*-)*
(\d+)
(?:-[^\]]*)*
\]
This simply looks for numbers delimited by dashes and allowing for the number to be at the beginning or end.
Private Sub regEx()
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim RegexObj As New VBScript_RegExp_55.RegExp
    RegexObj.Pattern = "\[(\(?[0-9]*?\)?)-(\(?[0-9]*?\)?)\]"
    Dim str As String
    str = "[15-]"
    Dim Matches As Object
    Set Matches = RegexObj.Execute(str)
    Dim result As Integer
    Dim value1 As Integer
    Dim value2 As Integer
    Dim part1 As String
    Dim part2 As String
    ' The two sides of the dash, e.g. "15" and "" for "[15-]"
    part1 = Matches.Item(0).SubMatches.Item(0)
    part2 = Matches.Item(0).SubMatches.Item(1)
    ' InStr returns 0 when "(" is absent. "Not InStr(...)" is a bitwise Not
    ' and is never False, so compare with 0 explicitly.
    If InStr(1, part1, "(", 1) = 0 And part1 <> "" Then
        value1 = CInt(part1)
    End If
    If InStr(1, part2, "(", 1) = 0 And part2 <> "" Then
        value2 = CInt(part2)
    End If
    result = value1 + value2
    MsgBox (result)
End Sub
Replace [15-] with the other strings to test them.
Ok! It's been 6 years and 6 months since the question was posted. Still, for anyone looking for something like that maybe now or in the future...
Step 1:
Trim Leading and Trailing Spaces, if any
Step 2:
Find/Search:
\]|\[|\(.*\)
Replace With:
<Leave this field Empty>
Step 3:
Trim Leading and Trailing Spaces, if any
Step 4:
Find/Search:
^-|-$
Replace With:
<Leave this field Empty>
Step 5:
Find/Search:
-
Replace With:
\+
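The five find/replace steps above can be replayed in Python to see what they produce. This is a sketch only: the function names are invented, and a plain "+" is used as the replacement in step 5 because Python's replacement strings do not need the backslash. The steps end with an expression such as 70+15 rather than a number, so a final summing step is added here as an assumption about how the result would be used.

```python
import re

def to_sum_expression(value):
    text = value.strip()                        # step 1: trim spaces
    text = re.sub(r"\]|\[|\(.*\)", "", text)    # step 2: drop [ ] and (...)
    text = text.strip()                         # step 3: trim spaces again
    text = re.sub(r"^-|-$", "", text)           # step 4: drop a leading or trailing dash
    return text.replace("-", "+")               # step 5: remaining dash becomes +

def evaluate(expression):
    # Not part of the original steps: add up the terms of "70+15".
    return sum(int(term) for term in expression.split("+"))

print(to_sum_expression("[70-15]"))               # 70+15
print(evaluate(to_sum_expression("[70-15]")))     # 85
print(evaluate(to_sum_expression("[(128)-42]")))  # 42
```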

Regular Expression to match specific string followed by number?

What regular expression can I use to find this?
&v=15151651616
Where &v= is a static string and the number part may vary.
"^&v=[0-9]+$" if you want at least one digit, or "^&v=[0-9]*$" if an empty number should match too.
If you want it to match inside a longer string, just remove the ^ and $, which anchor the match to the start (^) and the end ($) of the string.
You can use the following regular expression:
&v=\d+
This matches &v= and then one or more digits.
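The difference between the anchored and unanchored forms can be seen in a short Python sketch. The sample string watch?x=1&v=15151651616&t=3 is invented for the illustration and does not come from the question.

```python
import re

WHOLE = re.compile(r"^&v=[0-9]+$")   # the entire string must be &v=<digits>
ANYWHERE = re.compile(r"&v=\d+")     # &v=<digits> may appear inside a longer string

sample = "watch?x=1&v=15151651616&t=3"  # hypothetical input

print(bool(WHOLE.search("&v=15151651616")))  # True
print(bool(WHOLE.search(sample)))            # False: extra text around the match
print(ANYWHERE.search(sample).group())       # &v=15151651616
```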
I tried the other solutions, but they did not work for me. The following did.
NAME(column):
dbbdb
abcdef=1244
abc =123sfdafs
abc= 1223 adsfa
abc = 1323def
abcasdafs =adfd 1323def
To find 'bc' followed by a number, Code:
. -> match any character
? -> optional (zero or one occurrence of the preceding element)
+ -> one or more repetitions of the preceding element (judging by the output below, .?+ behaves like .* here)
where regexp_like (NAME, 'bc.?+[0-9]');
Output:
abcdef=1244
abc =123sfdafs
abc= 1223 adsfa
abc = 1323def
abcasdafs =adfd 1323def
To find 'bc' followed by '=' and a number, no matter the spaces, Code:
where regexp_like (NAME, 'bc ?+[=] ?+[0-9]');
Output:
abc =123sfdafs
abc= 1223 adsfa
abc = 1323def
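The two filters can be reproduced in Python to check them against the listed outputs. One assumption is made loudly here: the patterns are rewritten as bc.*[0-9] and bc *= *[0-9], because Python's re module does not read .?+ the way the regexp_like calls above appear to (before Python 3.11 it is rejected, from 3.11 on it is a possessive quantifier). The rewritten patterns select the same rows as the outputs shown.

```python
import re

names = [
    "dbbdb",
    "abcdef=1244",
    "abc =123sfdafs",
    "abc= 1223 adsfa",
    "abc = 1323def",
    "abcasdafs =adfd 1323def",
]

# 'bc' followed, after any characters, by a digit.
bc_then_digit = [n for n in names if re.search(r"bc.*[0-9]", n)]

# 'bc', optional spaces, '=', optional spaces, then a digit.
bc_equals_digit = [n for n in names if re.search(r"bc *= *[0-9]", n)]

print(bc_then_digit)
print(bc_equals_digit)
```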