Using Regular expression in VBA - regex

This is a sample record from my comma-delimited text file:
901,BLL,,,BQ,ARCTICA,,,,
I need to replace ,,, with ,,
This is the regular expression that I tried:
With regex
.MultiLine = False
.Global = True
.IgnoreCase = False
.Pattern="^(?=[A-Z]{3})\\,{3,}",",,"))$ -- error
I want to read the file line by line and pass each line to the regex to correct the record. Can somebody guide me on how to fix this? I am very new to VBA.

Looking at your original pattern, I first tried .Pattern = "^\d{3},\D{3},,," which works on the sample record because it starts with 3 digits, a comma, 3 non-digit characters (here, letters) and then three commas.
In the answer I have used a more generalised pattern, .Pattern = "^\w*,\w*,\w*,," This also works on the sample: it matches 3 commas, each preceded by 0 or more word characters (letters, digits or underscore), followed directly by a fourth comma. Both patterns require the match to start at the beginning of the string.
The pattern .Pattern = "^\d+,[a-zA-Z]+,\w*,," also works on the sample record. It requires 1 or more digits (and only digits) before the first comma and 1 or more letters (and only letters) before the second comma. Before the third comma there can be 0 or more word characters.
The Left function removes the rightmost character of the match, i.e. the last comma, to generate the replacement string passed to objRegEx.Replace.
Sub Test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
Debug.Print
Debug.Print str
str = strConv(str)
Debug.Print str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object
Dim oMatches As Object
Dim oMatch As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = "^\w*,\w*,\w*,,"
End With
Set oMatches = objRegEx.Execute(str)
If oMatches.Count > 0 Then
For Each oMatch In oMatches
str = objRegEx.Replace(str, Left(oMatch.Value, oMatch.Length - 1))
Next oMatch
End If
strConv = str
End Function
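As a possible alternative to trimming the match with Left, the same idea can be sketched with a capture group and a "$1" back-reference in the replacement string. The function name CollapseThirdField is hypothetical, and the pattern assumes the record layout of the sample above.

```vba
Function CollapseThirdField(ByVal record As String) As String
    ' Hypothetical helper; same pattern idea as above, with a capture group.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = False
        .IgnoreCase = False
        ' Group 1 keeps the first three fields and their commas;
        ' the comma after the group is the surplus one that gets dropped.
        .Pattern = "^(\w*,\w*,\w*,),"
    End With
    CollapseThirdField = re.Replace(record, "$1")
End Function

Sub TestCollapseThirdField()
    ' Sample record from the question
    Debug.Print CollapseThirdField("901,BLL,,,BQ,ARCTICA,,,,")
End Sub
```

If the record does not match the pattern, Replace returns the input unchanged, so no separate check on the match count is needed in this sketch.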

Try this
Sub test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
str = strConv(str)
MsgBox str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object, allMatches As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = ",,,"
End With
strConv = objRegEx.Replace(str, ",,")
End Function
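The question also asks how to feed a file to the regex line by line. A minimal sketch, assuming one of the strConv functions above is in the same module; the Sub name and both file paths are hypothetical:

```vba
Sub FixRecordsInFile(ByVal inputPath As String, ByVal outputPath As String)
    ' Reads inputPath one line at a time, corrects each record with strConv
    ' and writes the result to outputPath.
    Dim inFile As Integer, outFile As Integer
    Dim record As String

    inFile = FreeFile
    Open inputPath For Input As #inFile
    outFile = FreeFile                      ' next free number after inFile is in use
    Open outputPath For Output As #outFile

    Do While Not EOF(inFile)
        Line Input #inFile, record          ' one record per line
        Print #outFile, strConv(record)
    Loop

    Close #outFile
    Close #inFile
End Sub

Sub RunFixRecords()
    ' Hypothetical paths; adjust to your own files
    FixRecordsInFile "C:\data\records.txt", "C:\data\records_fixed.txt"
End Sub
```

Note that the two answers are not equivalent: the ",,," pattern with Global = True also shortens the run of commas at the end of the sample record, whereas the anchored pattern only touches the start of the line.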

Related

Use Wildcard (*) with RegEx in VBA to match anything

I'm trying to use Regex to match any character (This is just a piece of code from a larger project). I got the below to work, but seems like it is wrong, is there a proper way to search for any character via RegEx?
strPattern = "([!##$%^&*()]?[a-z]?[0-9]?)"
Eg: MCVE
Public Sub RegExSearch()
Dim regexp As Object
Dim rng As Range, rcell As Range
Dim strInput As String, strPattern As String
Set regexp = CreateObject("vbscript.regexp")
Set rng = ActiveSheet.Range("A1:A1")
' Assign the pattern before handing it to the RegExp object
strPattern = "([!##$%^&*()]?[a-z]?[0-9]?)" ' This matches everything, but seems improper
With regexp
.Global = False
.MultiLine = False
.ignoreCase = True
.Pattern = strPattern
End With
For Each rcell In rng.Cells
If strPattern <> "" Then
strInput = rcell.Value
If regexp.test(strInput) Then
MsgBox rcell & " Matched in Cell" & rcell.Address
End If
End If
Next
End Sub
"." is the wildcard: an unescaped period matches any single character except a newline.
strPattern = "."
Or, as @RonRosenfeld pointed out, if you need to match everything INCLUDING a newline, then this would work:
strPattern = "[\S\s]*"
https://wellsr.com/vba/2018/excel/vba-regex-regular-expressions-guide/
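To illustrate the difference between the two patterns, here is a small sketch that tests both against a string containing a line feed. The Sub name and the test string are made up for the example, and the anchors ^ and $ are added only so that the whole string has to match.

```vba
Sub CompareAnyCharPatterns()
    Dim re As Object
    Dim twoLines As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.MultiLine = False                    ' ^ and $ refer to the whole string
    twoLines = "first" & vbLf & "second"

    re.Pattern = "^.+$"
    Debug.Print re.Test(twoLines)           ' expected False: "." stops at the line feed

    re.Pattern = "^[\S\s]+$"
    Debug.Print re.Test(twoLines)           ' expected True: \S or \s covers every character
End Sub
```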

Visual Basic Excel Regular Expression {}

I am having some trouble with {}. When I set a maximum, as in {1,8}, it does not work and I don't know why. The minimum value is validated correctly.
Private Sub Highlvl_Expression()
Dim strPattern As String: strPattern = "[a-zA-Z0-9_]{1,8}"
Dim strReplace As String: strReplace = ""
Dim regEx As New RegExp
Dim Test As Boolean
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
Test = regEx.Test(Highlvl.Value)
If regEx.Test(Highlvl.Value) Then
MsgBox ("Validate")
Else
MsgBox ("Not Validate")
End If
End Sub
You specified the pattern that looks for 1 to 8 alphanumeric characters inside a string. If you run the regex against a 9-character string "ABCDE6789" (regEx.Execute("ABCDE6789")), you will have 2 matches: ABCDE678 and 9.
If you want to validate a string that should have a minimum or a maximum number of characters, you need to use anchors, i.e. start and end of string assertions ^ and $. So, use
Dim strPattern As String: strPattern = "^[a-zA-Z0-9_]{1,8}$"
And
.Global = False
The global flag is not necessary since we are not looking for multiple matches, but for a single true or false result with test.
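Put together, the anchored check could look like the sketch below. The function name IsValidName is hypothetical, and MultiLine is set to False here on the assumption that the input is a single line; with MultiLine = True the anchors would also match at line breaks inside the string.

```vba
Function IsValidName(ByVal candidate As String) As Boolean
    ' True only if the whole string is 1 to 8 characters from [a-zA-Z0-9_]
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = False
        .MultiLine = False
        .IgnoreCase = False
        .Pattern = "^[a-zA-Z0-9_]{1,8}$"
    End With
    IsValidName = re.Test(candidate)
End Function

Sub TestIsValidName()
    Debug.Print IsValidName("ABCDE678")     ' 8 characters: True
    Debug.Print IsValidName("ABCDE6789")    ' 9 characters: False
    Debug.Print IsValidName("")             ' empty string: False
End Sub
```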

RegEx to extract first set of digits only

I am trying to extract the first group of digits from this expression: "CSTAR ADJ REF # A-3080101078AZ Keying error Deposit $ 938,710.33 on 05/20/2011 item keying". That is, I want to get only "3080101078" as the result in an adjacent cell.
I have tried using this:
Function CleanString(strIn As String) As String
Dim objRegex
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "[^0-9\s]"
CleanString= Trim(.Replace(strIn, vbNullString))
End With
End Function
This gives "3080101078 93871033 05202011". How can I get only the first group of digits (3080101078)?
Try this
Function CleanString(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "\d+"
CleanString = .Execute(strIn)(0)
End With
End Function
\d+ = one or more consecutive digits
.Execute(strIn) executes the regex against the input, returning a MatchCollection
.Execute(strIn)(0) returns the first item in the match collection (and raises a runtime error if there is no match)
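Because indexing an empty match collection fails, a slightly more defensive variant may be useful when some cells contain no digits at all. This is only a sketch; the name FirstDigits and the choice to return an empty string on no match are assumptions.

```vba
Function FirstDigits(ByVal strIn As String) As String
    Dim re As Object, matches As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = False                       ' only the first run of digits is needed
    re.Pattern = "\d+"
    Set matches = re.Execute(strIn)
    If matches.Count > 0 Then
        FirstDigits = matches(0).Value
    Else
        FirstDigits = vbNullString          ' assumption: empty result when no digits are found
    End If
End Function
```

Used as a worksheet function, =FirstDigits(A1) would then give 3080101078 for the sample text above.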

vbscript: replace text in activedocument with hyperlink

Starting out at a new job and I have to go through a whole lot of documents that my predecessor left. They are MS Word-files that contain information on several hundreds of patents. Instead of copy/pasting every single patent-number in an online form, I would like to replace all patent-numbers with a clickable hyperlink. I guess this should be done with vbscript (I'm not used to working with MS Office).
I have so far:
<obsolete>
This is not working for me:
1. I (probably) need to add something to loop through the ActiveDocument
2. The replace-function probably needs a string and not an object for a parameter - is there a __toString() in vbscript?
THX!
UPDATE:
I have this partially working (regex and finding matches) - now if only I could get the anchor for the hyperlink.add-method right...
Sub HyperlinkPatentNumbers()
'
' HyperlinkPatentNumbers Macro
'
Dim objRegExp, Matches, match, myRange
Set myRange = ActiveDocument.Content
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
.Global = True
.IgnoreCase = False
.Pattern = "(WO|EP|US)([0-9]*)(A1|A2|B1|B2)"
End With
Set Matches = objRegExp.Execute(myRange)
If Matches.Count >= 1 Then
For Each match In Matches
ActiveDocument.Hyperlinks.Add Anchor:=objRegExp.match, Address:="http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&CC=$1&NR=$2&KC=$3"
Next
End If
Set Matches = Nothing
Set objRegExp = Nothing
End Sub
Is this VBA or VBScript? In VBScript you cannot declare types like Dim newText As hyperLink, but every variable is a variant, so: Dim newText and nothing more.
objRegEx.Replace returns the string with replacements and needs two parameters passed into it: The original string and the text you want to replace the pattern with:
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Global = True
objRegEx.IgnoreCase = False
objRegEx.Pattern = "^(WO|EP|US)([0-9]*)(A1|A2|B1|B2)$"
' assuming plainText contains the text you want to create the hyperlink for
strName = objRegEx.Replace(plainText, "$1$2$3")
strAddress = objRegEx.Replace(plainText, "http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&CC=$1&NR=$2&KC=$3")
Now you can use strName and strAddress to create the hyperlink with.
Pro-tip: You can use objRegEx.Test(plainText) to see if the regexp matches anything for early handling of errors.
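The Test-then-Replace idea can be wrapped in a small helper, sketched below. The function name PatentUrl and the empty-string return value for non-matching input are assumptions; the pattern and URL are the ones from this answer.

```vba
Function PatentUrl(ByVal plainText As String) As String
    ' Returns the espacenet address for a patent number, or "" if the
    ' text does not look like one.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.IgnoreCase = False
    re.Pattern = "^(WO|EP|US)([0-9]*)(A1|A2|B1|B2)$"
    If re.Test(plainText) Then
        ' $1, $2 and $3 are replaced by the country code, number and kind code
        PatentUrl = re.Replace(plainText, _
            "http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&CC=$1&NR=$2&KC=$3")
    Else
        PatentUrl = vbNullString
    End If
End Function
```

Because the pattern is anchored with ^ and $, plainText has to be exactly one patent number; text with anything before or after the number is rejected by Test.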
Problem solved:
Sub addHyperlinkToNumbers()
Dim objRegExp As Object
Dim matchRange As Range
Dim Matches
Dim match
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
.Global = True
.IgnoreCase = False
.Pattern = "(WO|EP|US|FR|DE|GB|NL)([0-9]+)(A1|A2|A3|A4|B1|B2|B3|B4)"
End With
Set Matches = objRegExp.Execute(ActiveDocument.Content)
For Each match In Matches
'This doesn't work, because of the WYSIWYG-model of MS Word:
'Set matchRange = ActiveDocument.Range(match.FirstIndex, match.FirstIndex + Len(match.Value))
Set matchRange = ActiveDocument.Content
With matchRange.Find
.Text = match.Value
.MatchWholeWord = True
.MatchCase = True
.Wrap = wdFindStop
.Execute
End With
ActiveDocument.Hyperlinks.Add Anchor:=matchRange, _
Address:="http://worldwide.espacenet.com/publicationDetails/biblio?DB=EPODOC&adjacent=true&locale=en_EP&CC=" _
& match.Submatches(0) & "&NR=" & match.Submatches(1) & "&KC=" & match.Submatches(2)
Next
MsgBox "Hyperlink added to " & Matches.Count & " patent numbers"
Set objRegExp = Nothing
Set matchRange = Nothing
Set Matches = Nothing
Set match = Nothing
End Sub

Excel VBA Regex Match Position

How do I grab the position of the first matched result in a regular expression? See below.
Function MYMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim strPosition As Integer
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Pattern = strPattern
objRegEx.IgnoreCase = blnCase
' Do the search match.
strPosition = objRegEx.Match(strValue)
MYMATCH = strPosition
End Function
For one, I'm not entirely certain what .Match is returning (string, integer, etc.). The one solution I found said I should create a Match object and then grab the position from there, but unlike VB, VBA does not recognize the Match object. I've also seen some code like the following, but I'm not looking for the value, just the position of the first match:
If allMatches.count <> 0 Then
result = allMatches.Item(0).submatches.Item(0)
End If
Somewhat ignoring any of the possible syntax errors above (mostly due to me changing variable types right and left), how do I easily/simply accomplish this?
Thanks!
You can use the FirstIndex property of a match returned by the Execute method to get its position. FirstIndex is zero-based, hence the + 1 in the code, i.e.
Function MYMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim strPosition As Integer
Dim RegMC
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.Pattern = strPattern
.IgnoreCase = blnCase
If .test(strValue) Then
Set RegMC = .Execute(strValue)
MYMATCH = RegMC(0).firstindex + 1
Else
MYMATCH = "no match"
End If
End With
End Function
Sub TestMe()
MsgBox MYMATCH("test 1", "\d+")
End Sub
For the benefit of others who may be having this problem, I finally figured it out.
Option Explicit
Function CHAMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim objPosition As Object
Dim strPosition As String
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Pattern = strPattern
objRegEx.IgnoreCase = blnCase
' Do the search match.
Set objPosition = objRegEx.Execute(strValue)
strPosition = objPosition(0).FirstIndex
CHAMATCH = strPosition
End Function
Instead of a Match type, a plain Object type will do (all that is being returned is a class instance). Then, if you want the index location, just use .FirstIndex on the match of your choice, or if you want the value, use .Value. Note that .FirstIndex is zero-based, so add 1 if you need a 1-based character position.
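The same properties work for every match when Global is set to True. A short sketch, with a made-up Sub name and test string, that prints the value, 1-based position and length of each match:

```vba
Sub ListMatchPositions()
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                        ' collect every match, not just the first
    re.Pattern = "\d+"
    For Each m In re.Execute("test 1 and 22")
        ' FirstIndex is zero-based, so add 1 for a character position
        Debug.Print m.Value, m.FirstIndex + 1, m.Length
    Next m
End Sub
```

For this input the loop should report "1" at position 6 and "22" at position 12.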