How to make regexp for multiple condition? - regex

I have the regexp code below (I'm using the VerbalExpression Dart plugin). My goal is to check that a string starts with "36", followed by "01", "02", or "03". After that, anything is allowed as long as the whole string is 16 characters long.
var regex = VerbalExpression()
..startOfLine()
..then("36")
..then("01")
..or("02")
..anythingBut(" ")
..endOfLine();
String nik1 = "3601999999999999";
String nik2 = "3602999999999999";
String nik3 = "3603999999999999";
print('result : ${regex.hasMatch(nik1)}');
print('result : ${regex.hasMatch(nik2)}');
print('result : ${regex.hasMatch(nik3)}');
My code only returns true for nik1 and nik2, but I also want true for nik3. I noticed that I can't chain a second or() after the first one for multiple checks; doing so makes every result false. How do I achieve that?

I'm not familiar with VerbalExpression, but a RegExp that does this is straightforward enough.
const pattern = r'^36(01|02|03)\S{12}$';
void main() {
final regex = RegExp(pattern);
print(regex.hasMatch('3601999999999999')); // true
print(regex.hasMatch('3602999999999999')); // true
print(regex.hasMatch('3603999999999999')); // true
print(regex.hasMatch('360199999999999')); // false
print(regex.hasMatch('3600999999999999')); // false
print(regex.hasMatch('36019999999999999')); // false
}
Pattern explanation:
The r prefix means Dart will interpret it as a raw string ("$" and "\" are not treated as special).
The ^ and $ represent the beginning and end of the string, so it will only match the whole string and cannot find matches from just part of the string.
(01|02|03) "01" or "02" or "03". | means OR. Wrapping it in parentheses lets it know where to stop the OR.
\S matches any non-whitespace character.
{12} means the previous thing must be repeated 12 times, so \S{12} means any 12 non-whitespace characters.
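For comparison, here is a minimal JavaScript sketch of the same check. The helper name isValidNik and the allowedCodes array are hypothetical, not part of the original answer; building the alternation from an array is just one way to add further prefixes without touching the rest of the pattern.

```javascript
// Hypothetical list of allowed two-digit codes that may follow "36".
const allowedCodes = ["01", "02", "03"];

// Builds the same pattern as above: ^36(01|02|03)\S{12}$
const nikPattern = new RegExp(`^36(${allowedCodes.join("|")})\\S{12}$`);

// Hypothetical helper: true only when the whole string fits the pattern.
function isValidNik(value) {
  return nikPattern.test(value);
}

console.log(isValidNik("3603999999999999")); // true
console.log(isValidNik("3600999999999999")); // false ("00" is not allowed)
console.log(isValidNik("360199999999999"));  // false (only 15 characters)
```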

Related

match everything but a given string and do not match single characters from that string

Let's start with the following input.
Input = 'blue, blueblue, b l u e'
I want to match everything that is not the string 'blue'. Note that blueblue should not match, but single characters should (even if present in match string).
If I replace the matches with an empty string, the result should be:
Result = 'blueblueblue'
I have tried with [^\bblue\b]+
but this matches the last four single characters 'b', 'l','u','e'
Another solution:
(?<=blue)(?:(?!blue).)+(?=blue|$)|^(?:(?!blue).)+(?=blue|$)
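As a quick check, the lookbehind pattern above can be tried in JavaScript. This is only a sketch and assumes an engine with lookbehind support (ES2018 or later); the variable names are made up for illustration.

```javascript
// Matches every stretch of text that is not the literal "blue".
// Requires lookbehind support (ES2018+).
const notBlue = /(?<=blue)(?:(?!blue).)+(?=blue|$)|^(?:(?!blue).)+(?=blue|$)/g;

const sample = "blue, blueblue, b l u e";

// Replacing each match with an empty string leaves only the "blue" words.
const stripped = sample.replace(notBlue, "");

console.log(stripped); // blueblueblue
```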
If your regex engine supports the \K escape, then we can try:
/blue\K|.*?(?=blue|$)/gm
This pattern says to match:
blue match "blue"
\K but then forget that match
| OR
.*? match anything else until reaching
(?=blue|$) the next "blue" or the end of the string
Edit:
On JavaScript, we can try the following replacement:
var input = "blue, blueblue, b l u e";
var output = input.replace(/blue|.*?(?=blue|$)/g, (x) => x != "blue" ? "" : "blue");
console.log(output);

Why does the regex [a-zA-Z]{5} return true for non-matching string?

I defined a regular expression to check if the string only contains alphabetic characters and with length 5:
use regex::Regex;
fn main() {
let re = Regex::new("[a-zA-Z]{5}").unwrap();
println!("{}", re.is_match("this-shouldn't-return-true#"));
}
The text I use contains many illegal characters and is longer than 5 characters, so why does this return true?
You have to put it inside ^...$ to match the whole string and not just parts:
use regex::Regex;
fn main() {
let re = Regex::new("^[a-zA-Z]{5}$").unwrap();
println!("{}", re.is_match("this-shouldn't-return-true#"));
}
As explained in the docs:
Notice the use of the ^ and $ anchors. In this crate, every expression is executed with an implicit .*? at the beginning and end, which allows it to match anywhere in the text. Anchors can be used to ensure that the full text matches an expression.
Your pattern returns true because it matches any run of 5 consecutive alphabetic characters anywhere in the text; in your case it finds such runs inside "shouldn" and "return".
Change your regex to: ^[a-zA-Z]{5}$
^ start of string
[a-zA-Z]{5} matches 5 alpha chars
$ end of string
This will match a string only if the string has a length of 5 chars and all of the chars from start to end fall in range a-z and A-Z.
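The same difference between a partial match and a whole-string match can be seen in other engines. Here is a small JavaScript sketch (variable names are illustrative only):

```javascript
const unanchored = /[a-zA-Z]{5}/;   // may match anywhere in the text
const anchored = /^[a-zA-Z]{5}$/;   // must match the whole text

const text = "this-shouldn't-return-true#";

console.log(unanchored.test(text));     // true
console.log(text.match(unanchored)[0]); // shoul (the first run of 5 letters)
console.log(anchored.test(text));       // false
console.log(anchored.test("hello"));    // true
```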

regex to extract substring for special cases

I have a scenario where i want to extract some substring based on following condition.
search for any pattern myvalue=123& , extract myvalue=123
If the "myvalue" present at end of the line without "&", extract myvalue=123
for ex:
The string is abcdmyvalue=123&xyz => then it should return myvalue=123
The string is abcdmyvalue=123 => then it should return myvalue=123
For the first scenario, the following regex works for me: myvalue*=(.*?(?=[&,""]))
I am looking for how to modify this regex to include my second scenario as well. I am using https://regex101.com/ to test this.
Thanks in advance!
Some notes about the pattern that you tried
if you want to only match, you can omit the capture group
e* matches 0+ times an e char
the part .*?(?=[&,""]) matches as few chars as possible until it can assert either & , or " to the right, so the positive lookahead expects a single char to be present to the right, which is never the case at the end of the line
You could shorten the pattern to a match only, using a negated character class that matches 0+ times any character except a whitespace char or &
myvalue=[^&\s]*
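A short JavaScript sketch of how this pattern handles both scenarios. The helper name extractPair is hypothetical, used only for illustration.

```javascript
// Matches "myvalue=" followed by any run of chars that are not & or whitespace.
const valuePattern = /myvalue=[^&\s]*/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function extractPair(line) {
  const found = line.match(valuePattern);
  return found ? found[0] : null;
}

console.log(extractPair("abcdmyvalue=123&xyz")); // myvalue=123
console.log(extractPair("abcdmyvalue=123"));     // myvalue=123
```

Because the character class stops at & or at the end of the input, no lookahead is needed for either case.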
function regex(data) {
var test = data.match(/=(.*)&/);
if (test === null) {
return data.split('=')[1]
} else {
return test[1]
}
}
console.log(regex('abcdmyvalue=123&3e')); //123
console.log(regex('abcdmyvalue=123')); //123
Here is working code. If there is no & after the value, match returns null and the function takes the if branch, where it simply splits the string on = and returns the second part. If an & is present, the regex extracts the value between = and &.
If you want to stay with a single regex, you can do it like this:
var test = data.match(/=(.*)&|=(.*)/);
const result = test[1] ? test[1] : test[2];
console.log(result);

VBA regexp to check special symbols

I have tried to use what I've learnt in this post,
and now I want to compose a RegExp which checks whether a string contains only digits and commas. For example, "1,2,55,2" should be ok, whereas "a,2,55,2" or "1.2,55,2" should fail the test. My code:
Private Function testRegExp(str, pattern) As Boolean
Dim regEx As New RegExp
If pattern <> "" Then
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
End With
If regEx.Test(str) Then
testRegExp = True
Else
testRegExp = False
End If
Else
testRegExp = True
End If
End Function
Public Sub foo()
MsgBox testRegExp("2.d", "[0-9]+")
End Sub
MsgBox yields True instead of False. What's the problem?
Your regex matches a partial string: it finds a digit in each of the input strings 1,2,55,2, a,2,55,2 and 1.2,55,2.
Use anchors ^ and $ to enforce a full string match and add a comma to the character class as you say you want to match strings that only contain digits and commas:
MsgBox testRegExp("2.d", "^[0-9,]*$")
                          ^    ^  ^
I also suggest using * quantifier to match 0 or more occurrences, rather than + (1 or more occurrences), but it is something you need to decide for yourself (whether you want to allow an empty string match or not).
Here is the regex demo. Note it is for PCRE regex flavor, but this regex will perform similarly in VBA.
Yes, as @Chaz suggests, if you do not need to match the string/line itself, the alternative is to match an inverse character class:
MsgBox testRegExp("2.d", "[^0-9,]")
This way, the negated character class [^0-9,] will match any character but a comma / digit, invalidating the string. If the result is True, it will mean the string contains some characters other than digits and a comma.
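The two approaches give complementary answers, which a small JavaScript sketch can illustrate (the variable names are made up here; the patterns are the ones from this answer):

```javascript
const onlyDigitsAndCommas = /^[0-9,]*$/; // whole string must be digits/commas
const anyOtherChar = /[^0-9,]/;          // finds the first offending character

const samples = ["1,2,55,2", "a,2,55,2", "1.2,55,2", "2.d"];

for (const s of samples) {
  const viaAnchors = onlyDigitsAndCommas.test(s);
  const viaNegation = !anyOtherChar.test(s); // valid when nothing offending is found
  console.log(s, viaAnchors, viaNegation);   // both columns always agree
}
```

Note that both checks accept an empty string; use + instead of * in the anchored pattern if that should be rejected.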
You can use the limited built in pattern matching for that:
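Function isOk(str) As Boolean
Dim i As Long
' Leave early (result stays False) on the first char that is not a digit or comma
For i = 1 To Len(str)
If Mid$(str, i, 1) Like "[!0-9,]" Then Exit Function
Next
' All chars are valid; an empty string is still rejected
isOk = Len(str) > 0
End Function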

regular expressions, delimiting plus sign

Private Const SEPARATOR_REG_EXP1 As String = "SCD\+4\+[A-Z]\+"
Public Function TestReg() As Boolean
Dim s1 As String = "SCD+4+ADJUSTMENT+"
Dim match As Match = Regex.Match(s1, SEPARATOR_REG_EXP1)
If match.Success Then
Return True
Else : Return False
End If
End Function
Not sure why this does not match - haven't really used regular expressions much.
The regex pattern should be :
"SCD\+4\+[A-Z]+\+"
You have to add a + sign after [A-Z], because you want to match one or multiple of these [A-Z] characters.
This does not match because [A-Z] matches only a single character of the given character class. You can use the + quantifier to match multiple chars. The resulting RegEx would be
SCD\+4\+[A-Z]+\+
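The effect of the missing quantifier is easy to reproduce. Here is a JavaScript sketch using the same two patterns (the variable names are illustrative only):

```javascript
const singleLetter = /SCD\+4\+[A-Z]\+/;   // exactly one letter between the plus signs
const oneOrMore = /SCD\+4\+[A-Z]+\+/;     // one or more letters between the plus signs

const segment = "SCD+4+ADJUSTMENT+";

console.log(singleLetter.test(segment));    // false
console.log(oneOrMore.test(segment));       // true
console.log(singleLetter.test("SCD+4+A+")); // true (a single letter does fit)
```

The escaped \+ matches a literal plus sign, while the unescaped + after [A-Z] is the quantifier.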