Check validation of an expression with Regex

I need my code to verify the validity of an expression entered in a textbox. I thought of using Regex, but I cannot get it to work.
So here is my expression: [3 Numbers]-[1 character Shift].[1 Number].
For example: 007-L.4
I tried this:
Dim MyRegex As Regex = New Regex("^[0-9]{3}-[a-zA-Z].[O-9]$")
but it does not work.
Thank you in advance.

You have two errors in your pattern:
^[0-9]{3}-[a-zA-Z].[O-9]$
                  ^ ^
                  1 2
1. The . is a metacharacter that matches any character. You need to escape it as \. to match a literal period only.
2. Your range is not valid, since you wrote O (the letter) instead of 0 (the digit). :-)
Here's the corrected pattern:
Dim MyRegex As Regex = New Regex("^[0-9]{3}-[a-zA-Z]\.[0-9]$")
(demo)
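To see what escaping the dot changes, here is a minimal sketch in Python rather than VB.NET (the pattern syntax used here is the same in both engines). The helper name is_valid_code and the UNESCAPED_DOT variant are hypothetical and exist only for this comparison.

```python
import re

# Corrected pattern from the answer: 3 digits, hyphen, one letter,
# a literal dot, one digit.
CORRECTED = re.compile(r"^[0-9]{3}-[a-zA-Z]\.[0-9]$")

# Same pattern with the dot left unescaped, for comparison only.
UNESCAPED_DOT = re.compile(r"^[0-9]{3}-[a-zA-Z].[0-9]$")


def is_valid_code(text):
    """Return True if text has the form 007-L.4 (hypothetical helper)."""
    return CORRECTED.match(text) is not None


print(is_valid_code("007-L.4"))                    # valid input
print(is_valid_code("007-LX4"))                    # rejected: no literal dot
print(UNESCAPED_DOT.match("007-LX4") is not None)  # accepted by the loose dot
```

With the unescaped dot, any character is accepted in the position where a period is required, which is why the escape matters for validation.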

Related

Validating a string's first 3 letters as uppercase with regex

I have a question on Classic ASP regarding validating a string's first 3 letters to be uppercase while the last 4 characters should be in numerical form using regex.
For example:
Dim myString
myString = "abc1234"
How do I validate that it should be "ABC1234" instead of "abc1234"?
Apologies for my broken English and for being a newbie in Classic ASP.
@ndn has a good regex pattern for you. To apply it in Classic ASP, you just need to create a RegExp object that uses the pattern and then call the Test() function to test your string against the pattern.
For example:
Dim re
Set re = New RegExp
re.Pattern = "^[A-Z]{3}.*[0-9]{4}$" ' @ndn's pattern
If re.Test(myString) Then
    ' Match. First three characters are uppercase letters and last four are digits.
Else
    ' No match.
End If
^[A-Z]{3}.*[0-9]{4}$
Explanation:
Surround everything with ^ and $ (start and end of string) to ensure the pattern has to match the whole string, not just part of it
[A-Z] - gives you all capital letters in the English alphabet
{3} - three of those
.* - optionally, there can be something in between (if there can't be, you can just remove this)
[0-9] - any digit
{4} - 4 of those
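The effect of the optional .* in the middle can be checked with a short sketch. It is written in Python instead of VBScript so it can be run anywhere; the pattern syntax is the same. The names LOOSE and STRICT are hypothetical labels for the two variants described above.

```python
import re

LOOSE = re.compile(r"^[A-Z]{3}.*[0-9]{4}$")   # pattern from the answer
STRICT = re.compile(r"^[A-Z]{3}[0-9]{4}$")    # same pattern with .* removed

for candidate in ("ABC1234", "abc1234", "ABC-xyz-1234"):
    # Print the candidate, then whether each variant accepts it.
    print(candidate, bool(LOOSE.match(candidate)), bool(STRICT.match(candidate)))
```

If the string must be exactly three uppercase letters followed by exactly four digits, the strict variant is the one to use.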

VB.NET regex searching for AAA-9999

I need help with finding the first 3 capital letters A-Z and then a space followed by 4 numbers 0-9.
Dim IndividualClasses As MatchCollection = Regex.Matches(AllExitClasses(a), "([A-Z])([A-Z])([A-Z]) ([0-9])([0-9])([0-9])([0-9])")
An example input string would be AML 4309 or DEF 4298.
The above 7 characters are what I want to get out of string.
EDIT: Since you preprocess your input string, you can use
Dim IndividualClasses As MatchCollection = Regex.Matches(AllExitClasses(a).Replace(" ", "-"), "[A-Z]{3}[-][0-9]{4}")
REGEX EXPLANATION:
[A-Z]{3} - 3 occurrences of English letters A to Z
[-] - A character class matching exactly one hyphen
[0-9]{4} - Exactly 4 occurrences of digits from 0 to 9.
Note that I removed capturing groups since you do not seem to be using them at all, and I am using limiting quantifiers, e.g. {4}.
Note that you could use your input string as is with the previous regex [A-Z]{3}\p{Zs}[0-9]{4}, but you would then need to iterate through the match collection and replace the space in each Match.Value with a hyphen, creating a new array.
Here is an IDEONE demo
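The alternative of matching on the unmodified input and fixing up each match afterwards could look like the following sketch. It uses Python instead of VB.NET, and a plain space instead of \p{Zs} because Python's re module does not support Unicode property classes. The helper name extract_class_codes and the sample sentence are assumptions made for illustration.

```python
import re

# Three uppercase letters, one space, four digits.
CLASS_CODE = re.compile(r"[A-Z]{3} [0-9]{4}")


def extract_class_codes(text):
    """Find codes like 'AML 4309' and return them as 'AML-4309'."""
    return [code.replace(" ", "-") for code in CLASS_CODE.findall(text)]


codes = extract_class_codes("Exit classes: AML 4309, DEF 4298 and abc 1234")
print(codes)
```

The input string is left untouched; only the extracted values are rewritten.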
OK, I replaced the spaces with a dash and am now using this regular expression:
"([A-Z])([A-Z])([A-Z])([-])([0-9])([0-9])([0-9])([0-9])"
which works:
AllExitClasses(a) = AllExitClasses(a).Replace(" ", "-")
MyClassString = AllExitClasses(a).ToString
Dim IndividualClasses As MatchCollection = Regex.Matches(MyClassString, "([A-Z])([A-Z])([A-Z])([-])([0-9])([0-9])([0-9])([0-9])")
Regex.Matches([variable], "^([A-Z]{3,3})(\s)([0-9]{4,4})$")
This regex will find your AAA 1111: 3 uppercase letters with ([A-Z]{3,3}), one whitespace character with (\s), and exactly 4 digits with ([0-9]{4,4}). Because of the ^ and $ anchors, the input must consist of the code alone. I have found that http://regex101.com helps a lot with expressions in different languages.

Getting "TRUE" when RegEx doesn't match?

I am trying to use Regular Expressions to do a pattern match in VBA. I've added a reference to the Regular Expression libraries and am using the following code as a test.
Sub testing()
    Dim re
    Dim val
    Set re = New RegExp
    re.Pattern = "[0-9]{5}"
    re.IgnoreCase = True
    val = Range("A8").Value
    MsgBox val
    MsgBox re.Test(val)
End Sub
The issue is that when I'm testing a string formatted as:
1234 565 4444543 12 33
I am receiving "True" when I use {5} and "False" when using {6}. Why is this?
Shouldn't both {5} and {6} return "False" in this case?
If the RegEx is matching on the whitespace, how do I prevent this? I want to match exactly 4 numbers followed by exactly one space followed by exactly 3 numbers, etc.
Help!
You need to anchor your regular expression:
re.Pattern = "^[0-9]{5}$"
Otherwise it matches if it finds the pattern anywhere in the input. ^ matches the beginning of the input, $ matches the end of the input.
I'm not sure why [0-9]{6} returns False with your input, since 4444543 contains a run of more than 6 digits.
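The difference between an unanchored and an anchored pattern can be illustrated with a small sketch, written in Python rather than VBA. The LAYOUT pattern is an assumption: it encodes the group sizes seen in the sample string (4, 3, 7, 2 and 2 digits), since the question does not spell out the full format.

```python
import re

sample = "1234 565 4444543 12 33"

# Unanchored: succeeds if five consecutive digits occur anywhere ("44445" here).
unanchored = re.search(r"[0-9]{5}", sample) is not None

# Anchored: the whole input must be exactly five digits.
anchored = re.search(r"^[0-9]{5}$", sample) is not None

# Assumed layout taken from the sample: groups of 4, 3, 7, 2 and 2 digits,
# each separated by exactly one space.
LAYOUT = re.compile(r"^[0-9]{4} [0-9]{3} [0-9]{7} [0-9]{2} [0-9]{2}$")
layout_ok = LAYOUT.match(sample) is not None

print(unanchored, anchored, layout_ok)
```

Spelling out each group between ^ and $ is what enforces "exactly 4 digits, one space, exactly 3 digits" and so on; whitespace is never matched by [0-9] itself.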

Regular expression in vb.net

How can I check whether a particular value starts with a letter or a digit? Here is my code. I am getting an error like "identifier expected".
Code:
Dim i As String
dim ReturnValue as boolean
i = 400087
Dim s_str As String = i.Substring(0, 1)
Dim regex As Regex = New Regex([(a - z)(A-Z)])
ReturnValue = Regex.IsMatch(s_str, Regex)
Error:
Regex is a type and cannot be used as an expression
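regex is your variable and Regex is its type. VB.NET identifiers are case-insensitive, so the two spellings refer to the same name; call the instance method on your variable instead of passing it as an argument:
ReturnValue = regex.IsMatch(s_str)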
But your regex is also flawed. [(a - z)(A-Z)] creates a character class that matches exactly the characters (, ), a, z and a space, plus the range A-Z, and nothing else. The " - " in the middle is read as a range from space to space.
It looks to me as if you want to match letters. For that, just use \p{L}, which is a Unicode property that matches any character that is a letter in any language.
Dim regex As Regex = New Regex("[\p{L}\d]")
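What the original character class accepts can be checked directly. This is a sketch in Python, not VB.NET, on the assumption that both engines read this particular class the same way; the sample characters are arbitrary.

```python
import re

# The class from the question, quoted as a string. Inside the brackets,
# " - " is a range from space to space, so it contributes only the space.
ODD_CLASS = re.compile(r"[(a - z)(A-Z)]")

# Keep only the sample characters that the class accepts.
matched = [ch for ch in "()az QbmZ" if ODD_CLASS.fullmatch(ch)]
print(matched)
```

Lowercase letters other than a and z are rejected, which shows that the class does not express "any letter".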
Maybe you mean:
Dim _regex As Regex = New Regex("[(a-z)(A-Z)]")
Dim regex As Regex = New Regex([(a - z)(A-Z)])
ReturnValue = Regex.IsMatch(s_str, Regex)
Call IsMatch on your Regex instance, regex.IsMatch(s_str), rather than passing the instance as a second argument. You also need to quote the regex string: "[(a - z)(A-Z)]".
Finally, that regex doesn't make sense, you are matching any letter or opening/closing parenthesis anywhere in the string.
To match at the start of the string you need to include the start anchor ^, something like: ^[a-zA-Z] matches any ASCII letter at the start of the string.
Check if a string starts with a letter or digit:
ReturnValue = Regex.IsMatch(s_str,"^[a-zA-Z0-9]+")
Regex Explanation:
^ # Matches start of string
[a-zA-Z0-9] # Followed by any letter or number
+ # at least one letter or number
See it in action here.
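The start-of-string check can be sketched as follows, in Python rather than VB.NET. The helper names are hypothetical, and str() is applied because the question assigns a number (400087) to the variable being tested.

```python
import re

STARTS_ALNUM = re.compile(r"^[a-zA-Z0-9]")
STARTS_LETTER = re.compile(r"^[a-zA-Z]")


def starts_with_letter_or_digit(value):
    """True if the first character is an ASCII letter or digit."""
    return STARTS_ALNUM.match(str(value)) is not None


def starts_with_letter(value):
    """True if the first character is an ASCII letter."""
    return STARTS_LETTER.match(str(value)) is not None


print(starts_with_letter_or_digit(400087), starts_with_letter(400087))
```

For a yes/no test of the first character, the + quantifier is not required: one character after ^ already decides the result.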

Recognize numbers in french format inside document using regex

I have a document containing numbers in various formats, french, english, custom formats.
I wanted a regex that could catch ONLY numbers in french format.
This is a complete list of numbers I want to catch (d represents a digit, decimal separator is comma , and thousands separator is space)
d,d d,dd d,ddd
dd,d dd,dd dd,ddd
ddd,d ddd,dd ddd,ddd
d ddd,d d ddd,dd d ddd,ddd
dd ddd,d dd ddd,dd dd ddd,ddd
ddd ddd,d ddd ddd,dd ddd ddd,ddd
d ddd ddd,d...
dd ddd ddd,d...
ddd ddd ddd,d...
This is the regex I have
(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})
It catches French formats like the ones above, so I am on the right track, but it also catches parts of numbers like d,ddd.dd (because it matches d,ddd) or d,ddd,ddd (because it matches d,ddd).
What should I add to my regex?
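The over-matching described above can be reproduced with a short sketch. It uses Python instead of VBA, with the pattern copied unchanged from the question; the helper name full_matches and the sample inputs are assumptions.

```python
import re

# Pattern copied unchanged from the question.
ORIGINAL = re.compile(r"(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})")


def full_matches(text):
    """Return the full text of every match (group 0), ignoring sub-groups."""
    return [m.group(0) for m in ORIGINAL.finditer(text)]


print(full_matches("1 234,56"))   # a French-format number
print(full_matches("1,234.56"))   # an English-format number
```

Nothing in the pattern looks at the characters around a match, so the d,ddd prefix of an English-format number is accepted on its own.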
The VBA code I have:
Sub ChangeNumberFromFRformatToENformat()
    Dim SectionText As String
    Dim RegEx As Object, RegC As Object, RegM As Object
    Dim i As Integer
    Set RegEx = CreateObject("vbscript.regexp")
    With RegEx
        .Global = True
        .MultiLine = False
        ' Regular expression used by the macro to recognise FR-formatted numbers
        .Pattern = "(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})"
    End With
    For i = 1 To ActiveDocument.Sections.Count()
        SectionText = ActiveDocument.Sections(i).Range.Text
        If RegEx.test(SectionText) Then
            Set RegC = RegEx.Execute(SectionText)
            ' RegC is the matches collection holding the French-format numbers
            For Each RegM In RegC
                Call ChangeThousandAndDecimalSeparator(RegM.Value)
            Next 'For Each RegM In RegC
            Set RegC = Nothing
            Set RegM = Nothing
        End If
    Next 'For i = 1 To ActiveDocument.Sections.Count()
    Set RegEx = Nothing
End Sub
The user stema gave me a nice solution. The regex should be:
(?<=^|\s)\d{1,3}(?:\s\d{3})*(?:\,\d{1,3})?(?=\s|$)
But VBA complains that the regexp has unescaped characters. I found one in (?: \d{3}), between (?: and \d{3}: a blank character, which I can replace with \s. The second one, I think, is in (?:,\d{1,3}), between ?: and \d: the comma character, which becomes \, when escaped.
So the regex is now (?<=^|\s)\d{1,3}(?:\s\d{3})*(?:\,\d{1,3})?(?=\s|$) and it works fine in RegExr but my VBA code will not accept it.
NEW LINE IN POST :
I have just discovered that VBA does not accept this part of the regex: ?<=^
What about this?
\b\d{1,3}(?: \d{3})*(?:,\d{1,3})?\b
See it here on Regexr
\b are word boundaries
First, \d{1,3} matches 1 to 3 digits; then there can be 0 or more groups of a leading space followed by 3 digits ((?: \d{3})*); and finally there can be an optional fractional part ((?:,\d{1,3})?).
Edit:
If you want to avoid matching 1,111.1, then the \b anchors are not good for you. Try this:
(?<=^|\s)\d{1,3}(?: \d{3})*(?:,\d{1,3})?(?=\s|$)
Regexr
This regex now requires whitespace or the start of the string before the number, and whitespace or the end of the string after it.
Edit 2:
Since look behinds are not supported you can change to
(?:^|\s)\d{1,3}(?: \d{3})*(?:,\d{1,3})?(?=\s|$)
This changes nothing at the start of the string, but if the number is preceded by a whitespace character, that character is now included in the match. If the result of the match is used for something, the leading whitespace has to be stripped first (I am quite sure VBA has a method for that; try Trim()).
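The look-behind-free pattern, together with the stripping step, can be sketched like this. The sketch is in Python, not VBA. The function to_english_format is a hypothetical stand-in for the ChangeThousandAndDecimalSeparator routine called in the question, whose body is not shown there, and the sample sentence is an assumption.

```python
import re

# Pattern from "Edit 2": no look-behind, so a leading whitespace character
# may be part of the match and is stripped afterwards.
FRENCH_NUMBER = re.compile(r"(?:^|\s)\d{1,3}(?: \d{3})*(?:,\d{1,3})?(?=\s|$)")


def find_french_numbers(text):
    """Return every French-format number in text, without leading whitespace."""
    return [m.group(0).strip() for m in FRENCH_NUMBER.finditer(text)]


def to_english_format(number):
    """Hypothetical converter: decimal comma to point, space to comma."""
    return number.replace(",", ".").replace(" ", ",")


found = find_french_numbers("Total 1 234,56 and 1,234.56 and 12,5 end")
converted = [to_english_format(n) for n in found]
print(found, converted)
```

The English-format number in the middle of the sample is skipped because the look-ahead demands whitespace or the end of the string right after the match.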
If you are reading on a line by line basis, you might consider adding anchors (^ and $) to your regex, so you will end up with something like so:
^(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})$
This instructs the RegEx engine to match from the beginning of the line to the very end, so a line matches only if it consists of the number alone.
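A line-by-line check with the anchored pattern might look like the sketch below, in Python rather than VBA. The helper name french_number_lines and the sample lines are assumptions.

```python
import re

# Anchored version of the pattern from the question.
ANCHORED = re.compile(r"^(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})$")


def french_number_lines(text):
    """Return the lines that consist of exactly one French-format number."""
    return [line for line in text.splitlines() if ANCHORED.match(line)]


lines = french_number_lines("1 234,5\n1,234.56\n12,5\n1 234 567")
print(lines)
```

This only helps when each number sits on its own line; numbers embedded in running text need the whitespace-based approach instead.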