Get the ASCII value of a regex backreference in VBA

I have the following snippet in VBA
Dim RegEx As Object
Dim myResult As String
Set RegEx = CreateObject("vbscript.regexp")
With RegEx
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "([^a-z|A-Z|0-9|\s])"
End With
myResult = "Hello, World!"
I want to replace each regex match with its ASCII value. In this case, that means replacing anything that is not a letter, digit, or whitespace with its ASCII value, so the resulting string should be
"Hello44 World33"
I basically want something like this to use the Asc() function on a backreference:
myResult = RegEx.Replace(myResult, Asc("$1"))
except that's not valid. I've tried escaping in various ways but I think I am barking up the wrong tree.
Thanks!

I don't know if you can do it in one go with Replace(), but you can use Execute() and loop through the matches. Note that | is a literal inside a character class, so your original negated class also excluded | from the matches, which I don't think you wanted.
Sub Tester()
Dim RegEx As Object, matches As Object, match As Object
Dim myResult As String
Set RegEx = CreateObject("vbscript.regexp")
With RegEx
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "([^a-z0-9\s])"
End With
myResult = "Hello, World!"
Set matches = RegEx.Execute(myResult)
For Each match In matches
Debug.Print "<" & match.Value & "> = " & Asc(match.Value)
myResult = Replace(myResult, match.Value, Asc(match.Value))
Next match
Debug.Print myResult
End Sub
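
The loop above could also be written as a single pass that builds the output from each match's position, so the input string is never modified while it is being scanned. This is only a sketch: the function name ReplaceWithAsc and the variables source and pos are hypothetical, and it assumes the FirstIndex and Length properties of the VBScript.RegExp match object.

```vba
' Hypothetical helper: replaces every character matched by the pattern
' with its character code, building the result in one pass.
Function ReplaceWithAsc(ByVal source As String) As String
    Dim re As Object, matches As Object, m As Object
    Dim result As String
    Dim pos As Long          ' 1-based position of the next unread character

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    re.Pattern = "[^a-z0-9\s]"

    pos = 1
    Set matches = re.Execute(source)
    For Each m In matches
        ' FirstIndex is 0-based: copy the unmatched text before this match
        result = result & Mid$(source, pos, m.FirstIndex + 1 - pos)
        ' Append the character code in place of the matched character
        result = result & CStr(Asc(m.Value))
        pos = m.FirstIndex + m.Length + 1
    Next m

    ' Append whatever follows the last match
    ReplaceWithAsc = result & Mid$(source, pos)
End Function
```

Calling Debug.Print ReplaceWithAsc("Hello, World!") should print Hello44 World33.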

In .NET (VB.NET rather than VBA), one of the overloads of Regex.Replace takes an evaluator instead of a string for the replacement value. Take a look at this:
Replace Method (String, MatchEvaluator)
Let me know if you need further help.
Edit: Added the actual code.
' VB.NET (System.Text.RegularExpressions), not VBA
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic
Module RegExSample
    ' Called once per match; returns the replacement text for that match
    Function AscText(ByVal m As Match) As String
        Return Asc(m.Value).ToString()
    End Function
    Sub Main()
        Dim re As New Regex("[^a-z0-9\s]", RegexOptions.IgnoreCase)
        Dim myResult As String = "Hello, World!"
        ' Pass the evaluator instead of a replacement string
        myResult = re.Replace(myResult, AddressOf AscText)
        Console.WriteLine(myResult)
    End Sub
End Module

Related

VBA regex - Value used in formula is of the wrong data type

I can't figure out why this function, which uses a regex, keeps returning a wrong data type error. I'm trying to return a match for the identified pattern from a file path string in an Excel document. An example of the pattern I'm looking for is "02 Package_2018-1011" from a sample string "H:\H1801100 MLK Middle School Hartford\2-Archive! Issued Bid Packages\01 Package_2018-0905 Demolition and Abatement Bid Set_Drawings - PDF\00 HazMat\HM-1.pdf". The VBA code is listed below.
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\D{2}\sPackage_\d{4}-\d{4}"
.Global = True
End With
Set textpart = regex.Execute(strInput)
End Function
You need to use \d{2} to match a two-digit chunk, not \D{2}, which matches two non-digit characters. Besides, you are trying to assign the whole match collection to the function result, while you should extract the first match value and assign that value to the function result:
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Dim matches As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\d{2}\sPackage_\d{4}-\d{4}"
End With
Set matches = regex.Execute(strInput)
If matches.Count > 0 Then
textpart = matches(0).Value
End If
End Function
Note that to match it as a whole word you may add word boundaries (\b at both ends of the pattern):
.Pattern = "\b\d{2}\sPackage_\d{4}-\d{4}\b"
To only match it after \, you may use a capturing group:
.Pattern = "\\(\d{2}\sPackage_\d{4}-\d{4})\b"
' ...
' and then
' ...
textpart = matches(0).Submatches(0)
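
Put together, the capturing-group variant might look like the sketch below. The function name TextPartAfterSlash and its String parameter are hypothetical, and returning an empty string when nothing matches is an assumption, not something the original function specified.

```vba
' Hypothetical variant of textpart: returns the package label that
' directly follows a backslash, or "" when the path contains none.
Function TextPartAfterSlash(ByVal source As String) As Variant
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    ' Group 1 holds the label without the leading backslash
    re.Pattern = "\\(\d{2}\sPackage_\d{4}-\d{4})\b"

    Set matches = re.Execute(source)
    If matches.Count > 0 Then
        TextPartAfterSlash = matches(0).SubMatches(0)
    Else
        TextPartAfterSlash = ""   ' assumption: empty result when not found
    End If
End Function
```

For the sample path in the question, this should return 01 Package_2018-0905.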

Excel VBA Regex replace loses one character

The below code matches and replaces, but the digit next to the capture group is consumed. Where am I going wrong?
Sub test()
Dim regex As Object 'Regexp object.
Set regex = CreateObject("VBScript.RegExp") 'Regexp object.
Dim strPattern As String: strPattern = "\d(AM|PM)" 'Declare regex pattern.
Dim strReplace As String 'Placeholder string for replace operation.
Dim target As String
target = "1:05PM"
strReplace = " $1"
With regex
.Global = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regex.test(target) Then
Debug.Print regex.Replace(target, strReplace)
End If
End Sub
Output:
1:0 PM
It's because you have an un-captured \d in your regex. Try putting () around the \d i.e. (\d)(AM|PM).
You also need to change strReplace to "$1 $2"
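
As a minimal sketch of that fix (the Sub name TestKeepDigit is hypothetical), both the digit and the AM/PM marker are captured and written back with a space between them:

```vba
Sub TestKeepDigit()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    With re
        .Global = True
        .IgnoreCase = False
        ' Group 1 = the digit, group 2 = AM or PM
        .Pattern = "(\d)(AM|PM)"
    End With

    ' $1 restores the digit that the original replacement dropped
    Debug.Print re.Replace("1:05PM", "$1 $2")   ' prints 1:05 PM
End Sub
```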

Visual Basic Excel Regular Expression {}

I have some trouble with {}. When I set a maximum like {1,8}, it does not work, and I don't know why. The minimum value is validated correctly.
Private Sub Highlvl_Expression()
Dim strPattern As String: strPattern = "[a-zA-Z0-9_]{1,8}"
Dim strReplace As String: strReplace = ""
Dim regEx As New RegExp
Dim Test As Boolean
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
Test = regEx.Test(Highlvl.Value)
If regEx.Test(Highlvl.Value) Then
MsgBox ("Validate")
Else
MsgBox ("Not Validate")
End If
End Sub
You specified the pattern that looks for 1 to 8 alphanumeric characters inside a string. If you run the regex against a 9-character string "ABCDE6789" (regEx.Execute("ABCDE6789")), you will have 2 matches: ABCDE678 and 9.
If you want to validate a string that should have a minimum or a maximum number of characters, you need to use anchors, i.e. start and end of string assertions ^ and $. So, use
Dim strPattern As String: strPattern = "^[a-zA-Z0-9_]{1,8}$"
And
.Global = False
The global flag is not necessary since we are not looking for multiple matches, but for a single true or false result with test.
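
A small sketch of the anchored validation as a reusable function follows. The name IsValidName and its parameter are hypothetical. MultiLine is left at its default of False here, on the assumption that the whole value, not just one line of it, must be 1 to 8 characters long.

```vba
' Hypothetical helper: True only when the whole string consists of
' 1 to 8 letters, digits, or underscores.
Function IsValidName(ByVal candidate As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    With re
        .Global = False       ' a single yes/no answer is enough
        .IgnoreCase = False
        ' ^ and $ force the quantifier to cover the entire string
        .Pattern = "^[a-zA-Z0-9_]{1,8}$"
    End With

    IsValidName = re.Test(candidate)
End Function
```

With this, IsValidName("ABCDE678") should be True and IsValidName("ABCDE6789") should be False.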

Using Regular expression in VBA

This is my sample record, in comma-delimited text format:
901,BLL,,,BQ,ARCTICA,,,,
I need to replace ,,, with ,,
The regular expression that I tried:
With regex
.MultiLine = False
.Global = True
.IgnoreCase = False
.Pattern="^(?=[A-Z]{3})\\,{3,}",",,"))$ -- error
Now I want to pass each line from a file to the regex to correct the record. Can somebody guide me on how to fix this? I am very new to VBA.
I want to read the file line by line and pass each line to the regex.
Looking at your original pattern, I tried .Pattern = "^\d{3},\D{3},,,", which works on the sample record: three digits, a comma, three non-digit characters (here, letters), and then three commas.
In the answer I have used a more general pattern, .Pattern = "^\w*,\w*,\w*,,". This also works on the sample: it matches three commas, each preceded by zero or more word characters (letters, digits, or underscore), followed directly by a fourth comma. Both patterns require the match to start at the beginning of the string.
The pattern .Pattern = "^\d+,[a-zA-Z]+,\w*,," also works on the sample record. It requires one or more digits (and only digits) before the first comma and one or more letters (and only letters) before the second comma. Before the third comma there can be zero or more word characters.
The Left function drops the rightmost character of the match, i.e. the last comma, to produce the replacement string passed to Regex.Replace.
Sub Test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
Debug.Print
Debug.Print str
str = strConv(str)
Debug.Print str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object
Dim oMatches As Object
Dim oMatch As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = "^\w*,\w*,\w*,,"
End With
Set oMatches = objRegEx.Execute(str)
If oMatches.Count > 0 Then
For Each oMatch In oMatches
str = objRegEx.Replace(str, Left(oMatch.Value, oMatch.Length - 1))
Next oMatch
End If
strConv = str
End Function
Try this
Sub test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
str = strConv(str)
MsgBox str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object, allMatches As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = ",,,"
End With
strConv = objRegEx.Replace(str, ",,")
End Function
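
Neither answer shows the line-by-line file handling the question asks about. A rough sketch using VBA's built-in file statements follows. Both file paths are hypothetical placeholders, and the pattern is a capturing-group variant of "^\w*,\w*,\w*,," chosen here so that a single Replace drops the extra comma; writing the corrected lines to a second file is also an assumption.

```vba
Sub FixRecordsInFile()
    ' Hypothetical paths: adjust to the real input and output files
    Const IN_PATH As String = "C:\temp\records.txt"
    Const OUT_PATH As String = "C:\temp\records_fixed.txt"

    Dim re As Object
    Dim inFile As Integer, outFile As Integer
    Dim textLine As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    ' Group 1 keeps the first three fields and their commas;
    ' the fourth comma is outside the group and gets dropped
    re.Pattern = "^(\w*,\w*,\w*,),"

    inFile = FreeFile
    Open IN_PATH For Input As #inFile
    outFile = FreeFile
    Open OUT_PATH For Output As #outFile

    ' Read one record at a time and write the corrected version
    Do While Not EOF(inFile)
        Line Input #inFile, textLine
        Print #outFile, re.Replace(textLine, "$1")
    Loop

    Close #inFile
    Close #outFile
End Sub
```

Lines that do not match the pattern are written out unchanged, because Replace returns the input as is when there is no match.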

VBA: Submatching regex

I have the following code:
Dim results(1) As String
Dim RE As Object, REMatches As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.MultiLine = False
.Global = True
.IgnoreCase = True
.Pattern = "(.*?)(\[(.*)\])?"
End With
Set REMatches = RE.Execute(str)
results(0) = REMatches(0).submatches(0)
results(1) = REMatches(0).submatches(2)
Basically if I pass in a string "Test" I want it to return an array where the first element is Test and the second element is blank.
If I pass in a string "Test [bar]", the first element should be "Test " and the second element should be "bar".
I can't seem to find any issues with my regex. What am I doing wrong?
You need to add beginning and end of string anchors to your regex:
...
.Pattern = "^(.*?)(\[(.*)\])?$"
...
Without these anchors, the lazy .*? matches zero characters, and because the bracketed group is optional, the overall match already succeeds on an empty string, so the engine never has to expand .*? to match more.
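
Wrapped into a function, the anchored version might look like the sketch below. The name SplitBracket and its parameter are hypothetical, and it assumes the input is a single line, since . does not match line breaks.

```vba
' Hypothetical helper: returns a two-element array holding the text
' before an optional trailing [..] part and the text inside the brackets.
Function SplitBracket(ByVal source As String) As Variant
    Dim re As Object, matches As Object
    Dim results(1) As String

    Set re = CreateObject("VBScript.RegExp")
    With re
        .MultiLine = False
        .Global = False
        .IgnoreCase = True
        ' Anchors force the lazy group to grow until the whole string fits
        .Pattern = "^(.*?)(\[(.*)\])?$"
    End With

    Set matches = re.Execute(source)
    If matches.Count > 0 Then
        results(0) = matches(0).SubMatches(0)   ' text before the bracket
        results(1) = matches(0).SubMatches(2)   ' text inside the bracket
    End If

    SplitBracket = results
End Function
```

For "Test [bar]" the array should hold "Test " and "bar"; for "Test" the second element stays an empty string.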