Extract Excel string from matched Regular Expression (VBA) - regex

I would like to extract the matched RegExp pattern from a given string in Excel VBA.
For example,
Given this expression:
"[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
from this string:
"CSDT2_EXC_6+000#6+035_JM_150323"
I'd like to get: "6+000#6+035"
But I don't know how to accomplish this.
The nearest I could get was this:
Function getStations(file_name As String)
'Use Regular Expressiosn for grabbing the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
strReplace = ""
getStations = regEx.Replace(file_name, strReplace)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function
But this would bring me the following:
"CSDT2_EXC__JM_150323"
I'd like to only take the matched pattern. How can I achieve this?
Thanks a million for all the replies ;)

You can use this:
Function getStations(file_name As String)
'Use Regular Expressiosn for grabbing the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function

Some minor suggestions to Rory's excellent answer (given you have redundancy in your initial function):
Function getStations(file_name As String) As String
'Use Regular Expressionn for grabbing the input and automatically filter it
Dim regEx As Object
Set regEx = CreateObject("vbscript.regexp")
regEx.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "Hay un problema con el nombre. Por favor, arréglalo"
End If
End Function

Related

Excel VBA RegEx Replace Function Substituting a Literal $1 [duplicate]

This question already has answers here:
How to change case of matching letter with a VBA regex Replace?
(2 answers)
Closed 2 years ago.
I often rely on the blunt ease of the Replace function in VBA to do simple string replacements, but I have long been attracted to the magical allure of regular expressions to perform more sophisticated string manipulations. But in my experimenting, I am simply stuck that my replacement value, "$1", is being returned as a literal part of the output string instead of as the text matched by the RegEx pattern. I assume whatever I am doing wrong is something ugly simple, but I can't see it. Can anyone provide some guidance?
I have included the Microsoft VBScript Regular Expressions 5.5 library as a reference in my VBA project. Here is a simplified snippet of my code:
Dim regEx As RegExp
Dim strInput As String
Dim strPattern As String
Dim strReplace As String ' I've tried type Variant also
strPattern = "/[a-z]" ' Find strings with a forward slash followed by a lowercase letter; this works
strReplace = "$1" ' I've also tried using this value directly in the Replace function without first assigning it to a string value.
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
strInput = regEx.Replace(strInput, strPattern)
End If
If my input string is something like, "High/low Value", the result will be "High$1ow Value" when what I'm after is "High/Low Value". I'm stumped. Any thoughts?
Use "$1" if you are using a capture group in your pattern, which you are not.
This should work given the info provided, and will convert more than one instance of the pattern being matched.
Sub x()
Dim regEx As RegExp
Dim strInput As String
Dim strPattern As String
Dim strReplace As String ' I've tried type Variant also
Dim i As Long, f, s As String
Set regEx = New RegExp
strPattern = "/[a-z]" ' Find strings with a forward slash followed by a lowercase letter; this works
strInput = "High/low and Low/high"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
If .Test(strInput) Then
s = strInput
For i = 0 To .Execute(strInput).Count - 1
f = .Execute(strInput)(i).FirstIndex
s = Left(s, f) & UCase(.Execute(strInput)(i)) & Right(s, Len(s) - f - 2)
Next i
strInput = s
MsgBox strInput
End If
End With
End Sub

VBA Regex substitution codes

any experience with vba regex substition codes?
I've tried the followings, which are working both on regex101.com and on regexr.com.
$&
\0
They are unfortunately not working in my VBA code.
Any similar experience?
Example: https://regex101.com/r/5Fb0EV/1
VBA code:
Dim MsgTxt As String
...
strPattern = "(Metodo di pagamento).*\r\x07?.*"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
MsgTxt = regEx.Replace(MsgTxt, "\0#END")
End With
Input string:
Metodo di pagamento selezionato:
Mastercard
Expected ouput:
Metodo di pagamento selezionato:
Mastercard #END
Try the below code:
Sub test()
Dim MsgTxt As String
MsgTxt = Chr(7) & "Metodo di pagamento selezionato:" & vbCr & Chr(7) & "Mastercard "
With New RegExp
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "(Metodo di pagamento.*\r\x07?.*)"
MsgTxt = .Replace(MsgTxt, "$1#END")
End With
Debug.Print MsgTxt
End Sub
Input
Metodo di pagamento selezionato:
Mastercard
Output
Metodo di pagamento selezionato:
Mastercard #END

VBA regex - Value used in formula is of the wrong data type

I can't seem to figure out why this function which includes a regex keeps returning an error of wrong data type? I'm trying to return a match to the identified pattern from a file path string in an excel document. An example of the pattern I'm looking for is "02 Package_2018-1011" from a sample string "H:\H1801100 MLK Middle School Hartford\2-Archive! Issued Bid Packages\01 Package_2018-0905 Demolition and Abatement Bid Set_Drawings - PDF\00 HazMat\HM-1.pdf". Copy of the VBA code is listed below.
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\D{2}\sPackage_\d{4}-\d{4}"
.Global = True
End With
Set textpart = regex.Execute(strInput)
End Function
You need to use \d{2} to match 2-digit chunk, not \D{2}. Besides, you are trying to assign the whole match collection to the function result, while you should extract the first match value and assign that value to the function result:
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Dim matches As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\d{2}\sPackage_\d{4}-\d{4}"
End With
Set matches = regex.Execute(strInput)
If matches.Count > 0 Then
textpart = matches(0).Value
End If
End Function
Note that to match it as a whole word you may add word boundaries:
.Pattern = "\b\d{2}\sPackage_\d{4}-\d{4}\b"
^^ ^^
To only match it after \, you may use a capturing group:
.Pattern = "\\(\d{2}\sPackage_\d{4}-\d{4})\b"
' ...
' and then
' ...
textpart = matches(0).Submatches(0)

Conditional Regular Expression in VBA

I am parsing multiple HTML files using RegEx in Excel VBA (i know not the best thing to do) but I have this case which can either be - Scenario 1:
<span class="big vc vc_2 "><strong><i class="icon icon-angle-circled-down text-danger"></i>£51,038</strong> <span class="small">(-2.12%)</span></span>
or could be - Scenario 2:
<span class="big vc vc_2 "><strong><i class="icon icon-angle-circled-up text-success"></i>£292,539</strong> <span class="small">(14.13%)</span></span>
If the class ends in danger, I want to return -51038 and -2.12%
If the class ends in success, I want to return +292539 and 14.13%
The code I have been using for the second scenario and works fine is:
Sub Test()
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = "<i class=""icon icon-angle-circled-up text-success""></i>([\s\S]*?)<"
sValue = HtmlSpecialCharsDecode(.Execute(sContent).Item(0).SubMatches(0))
End With
sValue = CleanString(sValue)
End sub
Function HtmlSpecialCharsDecode(sText)
With CreateObject("htmlfile")
.Open
With .createElement("textarea")
.innerHTML = sText
HtmlSpecialCharsDecode = .Value
End With
End With
End Function
Function CleanString(strIn As String) As String
Dim objRegex
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "[^\d]+"
CleanString = .Replace(strIn, vbNullString)
End With
End Function
All you need to do is add some more capturing groups with "or" conditions in them. In your case, you want the group (success|danger) (also (up|down) based on the examples). Then, instead of just checking the only submatch, check for the conditions that you put in your pattern:
Dim regex As Object
Dim matches As Object
Dim expr As String
expr = "<i class=""icon icon-angle-circled-(up|down) text-(success|danger)""></i>(.*?)</.*\((.*)%\)<.*"
Set regex = CreateObject("VBScript.RegExp")
With regex
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = expr
Set matches = .Execute(sContent)
End With
Dim isDanger As Boolean
If matches.Count > 0 Then
isDanger = (HtmlSpecialCharsDecode(matches.item(0).SubMatches(1)) = "danger")
sValue1 = HtmlSpecialCharsDecode(matches.item(0).SubMatches(2))
sValue2 = HtmlSpecialCharsDecode(matches.item(0).SubMatches(3))
End If
If isDanger Then
'Was "danger"
Debug.Print -CLng(CleanString(sValue1))
Debug.Print -CDbl(sValue2)
Else
'Was "success"
Debug.Print CLng(CleanString(sValue1))
Debug.Print CDbl(sValue2)
End If

Regex VBA in Access - finding text between two strings

I am having a heck of a time with a RegEx question in Access VBA.
My goal is to extract the server from a linked database connection string. Basically, the connection string looks like
ODBC;DRIVER=SQL Server;SERVER=compName\sqlexpress;Trusted_Connection=Yes;APP=Microsoft Office 2010;DATABASE=databaseName
I am able to get the first regex to work, but it is returning
SERVER=compName\sqlexpress
I would like this to only return
compName\sqlexpress
My understanding is the ?<= operator should allow the RegEx to work correctly, but I get the following error "Method 'Execute' of object 'IRegExp2' failed."
The only documentation I can find for any Microsoft RegEx syntax is here which is not the runtime 5.5 VBScript library, but I'm not sure where else to get supported syntax.
Here is the code I'm using to test this. My database has a lot of linked tables.
Sub printServerStringInformation()
Dim rxPattern As String
rxPattern = "(?=SERVER)(.*)(?=;Trusted)"
Debug.Print RxMatch(CurrentDb.tableDefs(1).Connect, rxPattern, False)
rxPattern = "(?<=SERVER)(.*)(?=;Trusted)"
Debug.Print RxMatch(CurrentDb.tableDefs(1).Connect, rxPattern, False)
End Sub
Here is the function I am using:
Public Function RxMatch( _
ByVal SourceString As String, _
ByVal Pattern As String, _
Optional ByVal IgnoreCase As Boolean = True, _
Optional ByVal MultiLine As Boolean = True) As Variant
'Microsoft VBScript Regular Expressions 5.5
'http://www.zytrax.com/tech/web/regex.htm#more
'http://bytecomb.com/regular-expressions-in-vba/
'http://xkcd.com/1171/
On Error GoTo errHandler
Dim oMatches As MatchCollection
With New RegExp
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Global = False
.Pattern = Pattern
Set oMatches = .Execute(SourceString)
If oMatches.Count > 0 Then
RxMatch = oMatches(0).value
Else
RxMatch = ""
End If
End With
errHandler:
Debug.Print Err.Description
End Function
Here goes solution with RegEx (complete code which could be converted into function):
Sub qTest_3()
Dim objRE As New RegExp
Dim Tekst As String
Dim Wynik As Variant
Tekst = "ODBC;DRIVER=SQL Server;SERVER=compName\sqlexpress;Trusted_Connection=Yes;APP=Microsoft Office 2010;DATABASE=databaseName"
With objRE
.Global = True
.IgnoreCase = True
.Pattern = "(^.*;SERVER=)(.*)(;Trusted.*)"
Wynik = .Replace(Tekst, "$2") 'only 2nd part of the pattern will be returned
End With
Debug.Print Wynik
End Sub
Your function changed could be as follows (I added additional parameter setting part of the pattern which should be returned):
Public Function RxMatchReturn( _
ByVal SourceString As String, _
ByVal Pattern As String, _
StringPart As Byte, _
Optional ByVal IgnoreCase As Boolean = True, _
Optional ByVal MultiLine As Boolean = True) As Variant
'Microsoft VBScript Regular Expressions 5.5
On Error GoTo errHandler
Dim oMatches As MatchCollection
With New RegExp
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Global = True
.Pattern = Pattern
RxMatchReturn = .Replace(SourceString, "$" & StringPart)
End With
errHandler:
Debug.Print err.Description
End Function