VBA Regex substitution codes - regex

Does anyone have experience with VBA regex substitution codes?
I've tried the following, both of which work on regex101.com and on regexr.com:
$&
\0
Unfortunately, they do not work in my VBA code.
Has anyone had a similar experience?
Example: https://regex101.com/r/5Fb0EV/1
VBA code:
Dim MsgTxt As String
...
strPattern = "(Metodo di pagamento).*\r\x07?.*"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
MsgTxt = regEx.Replace(MsgTxt, "\0#END")
End With
Input string:
Metodo di pagamento selezionato:
Mastercard
Expected output:
Metodo di pagamento selezionato:
Mastercard #END

Try the code below, which wraps the whole pattern in a capture group and refers to it as $1 in the replacement:
Sub test()
Dim MsgTxt As String
MsgTxt = Chr(7) & "Metodo di pagamento selezionato:" & vbCr & Chr(7) & "Mastercard "
With New RegExp
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "(Metodo di pagamento.*\r\x07?.*)"
MsgTxt = .Replace(MsgTxt, "$1#END")
End With
Debug.Print MsgTxt
End Sub
Input
Metodo di pagamento selezionato:
Mastercard
Output
Metodo di pagamento selezionato:
Mastercard #END

Related

Excel VBA Regex replace loses one character

The below code matches and replaces, but the digit next to the capture group is consumed. Where am I going wrong?
Sub test()
Dim regex As Object 'Regexp object.
Set regex = CreateObject("VBScript.RegExp") 'Regexp object.
Dim strPattern As String: strPattern = "\d(AM|PM)" 'Declare regex pattern.
Dim strReplace As String 'Placeholder string for replace operation.
Dim target As String
target = "1:05PM"
strReplace = " $1"
With regex
.Global = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regex.test(target) Then
Debug.Print regex.Replace(target, strReplace)
End If
End Sub
Output:
1:0 PM
It's because you have an un-captured \d in your regex. Try putting () around the \d i.e. (\d)(AM|PM).
You also need to change strReplace to "$1 $2"
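As a minimal sketch of that fix, here is the question's routine with both groups captured. The procedure name FixedTimeSpacing is only illustrative; the pattern and replacement string are the ones suggested above.

```vba
Sub FixedTimeSpacing()
    ' Late-bound RegExp object, as in the question.
    Dim regex As Object
    Set regex = CreateObject("VBScript.RegExp")

    With regex
        .Global = True
        .IgnoreCase = False
        ' The digit is now captured as group 1, AM/PM as group 2.
        .Pattern = "(\d)(AM|PM)"
    End With

    ' Put the captured digit back, then a space, then AM/PM.
    Debug.Print regex.Replace("1:05PM", "$1 $2")   ' prints 1:05 PM
End Sub
```

Anything the pattern matches is replaced, so every part of the match you want to keep has to be captured and referenced in the replacement string.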

Extract Excel string from matched Regular Expression (VBA)

I would like to extract the matched RegExp pattern from a given string in Excel VBA.
For example,
Given this expression:
"[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
from this string:
"CSDT2_EXC_6+000#6+035_JM_150323"
I'd like to get: "6+000#6+035"
But I don't know how to accomplish this.
The nearest I could get was this:
Function getStations(file_name As String)
'Use Regular Expressions to grab the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
strReplace = ""
getStations = regEx.Replace(file_name, strReplace)
Else
getStations = "There is a problem with the name. Please fix it"
End If
End Function
But this would bring me the following:
"CSDT2_EXC__JM_150323"
I'd like to only take the matched pattern. How can I achieve this?
Thanks a million for all the replies ;)
You can use this:
Function getStations(file_name As String)
'Use Regular Expressions to grab the input and automatically filter it
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
'This matches the pattern: e.g. 06+900#07+230
.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
End With
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "There is a problem with the name. Please fix it"
End If
End Function
Some minor suggestions to Rory's excellent answer (given you have redundancy in your initial function):
Function getStations(file_name As String) As String
'Use Regular Expressions to grab the input and automatically filter it
Dim regEx As Object
Set regEx = CreateObject("vbscript.regexp")
regEx.Pattern = "[0-9]*\+[0-9]{3}\#[0-9]*\+[0-9]{3}"
If regEx.Test(file_name) Then
getStations = regEx.Execute(file_name)(0)
Else
getStations = "There is a problem with the name. Please fix it"
End If
End Function

excel VB regexp 5.5 capturing group

I have a problem using RegExp in an Excel macro. When I call regex.Execute(string), instead of getting an array of the captured groups, I always get a single result, which is the whole string matched by the pattern.
By using the same pattern in http://www.regexr.com/, I can see the return nicely grouped. What am I missing from this:
Private Sub ParseFileName(strInput As String)
Dim regEx As New RegExp
Dim strPattern As String
Dim strReplace
'Sample string \\Work_DIR\FTP\Results\RevA\FTP_01_01_06_Results\4F\ACC2X2R33371_SASSSD_run1
strPattern = "FTP_(\w+)_Results\\(\w+)\\([\d,\D]+)_(SAS|SATA)(HDD|SSD)_run(\d)"
With regEx
.Global = True
.MultiLine = False
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
Set strReplace = regEx.Execute(strInput)
ActiveCell.Offset(0, 1) = strReplace.Count
Else
ActiveCell.Offset(0, 1) = "(Not matched)"
End If
End Sub
In the end, strReplace.Count always shows 1, and the only match is the whole string FTP_01_01_06_Results\4F\ACC2X8R133371_SASSSD_run1
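Execute returns one Match object per match of the whole pattern, which is why Count is 1 here. Use .SubMatches on that Match to get the capturing groups' values: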
Private Sub ParseFileName(strInput As String)
Dim regEx As New RegExp
Dim strPattern As String
Dim strReplace As MatchCollection
Dim i As Long
'Sample string \\Work_DIR\FTP\Results\RevA\FTP_01_01_06_Results\4F\ACC2X2R33371_SASSSD_run1
strPattern = "FTP_(\w+)_Results\\(\w+)\\([\d,\D]+)_(SAS|SATA)(HDD|SSD)_run(\d)"
With regEx
.Global = True
.MultiLine = False
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
Set strReplace = regEx.Execute(strInput)
ActiveCell.Offset(0, 1) = strReplace.Count
For i = 0 To 5
ActiveCell.Offset(i + 1, 1) = strReplace(0).SubMatches(i)
Next
Else
ActiveCell.Offset(0, 1) = "(Not matched)"
End If
End Sub

Excel RegEx Extraction

Recently I've been trying to extract some strings from text in Excel. I used a script from another post here: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
The macro code works fine, but I couldn't use the in-cell function: it shows a #NAME? error. I've added the "Microsoft VBScript Regular Expressions 5.5" reference, but still no luck.
I can use the macro, but the script needs some changes. I have strings in A1:A50 and would like to extract the date in the format DD Month YYYY (e.g. 28 July 2014) to B1:B50 and the account number in the format G1234567Y to C1:C50.
For now, the script replaces the date with "". The regular expression for the date is correct, but how do I insert the date into column B, and then the account number into column C, working over rows 1 to 50?
This is the code:
Sub simpleRegex()
Dim strPattern As String: strPattern = "[0-9][0-9].*[0-9][0-9][0-9][0-9]"
Dim strReplace As String: strReplace = ""
Dim regEx As New RegExp
Dim strInput As String
Dim Myrange As Range
Dim Out As Range
Set Myrange = ActiveSheet.Range("A1")
Set Out = ActiveSheet.Range("B1")
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
Out = regEx.Replace(strInput, strReplace)
Else
MsgBox ("Not matched")
End If
End If
End Sub
Thank You kindly for any assistance.
Currently you're replacing the matching string with an empty string "", which is why you're getting no result. You need to return the actual match by using () to define capture groups and $1 to retrieve the first one.
Based on your example, I'll assume your text in column A looks like this: 28 July 2014 G1234567Y
Here is a routine that will split apart the text into a date and then the text following the date.
Private Sub splitUpRegexPattern()
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim Myrange As Range
Dim C As Range
Set Myrange = ActiveSheet.Range("A1:A50")
For Each C In Myrange
strPattern = "([0-9]{1,2}.*[0-9]{4}) (.*)"
'strPattern = "(\d{1,2}.*\d{4}) (.*)"
If strPattern <> "" Then
strInput = C.Value
strReplace = "$1"
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
C.Offset(0, 1) = regEx.Replace(strInput, "$1")
C.Offset(0, 2) = regEx.Replace(strInput, "$2")
Else
C.Offset(0, 1) = "(Not matched)"
End If
End If
Next
End Sub
To use an in-cell function, set it up to extract a single piece, such as the date or everything else. The following code extracts the date. Cell B1 would contain the formula: =extractDate(A1)
Function extractDate(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "(\d{1,2}.*\d{4}) (.*)"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
extractDate = regEx.Replace(strInput, "$1")
Else
extractDate = "Not matched"
End If
End If
End Function
To make another function that extracts the rest of the text, simply change $1 to $2 and it will return the second capture group defined in the pattern.
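A minimal sketch of such a second function might look like the following. The name extractRest is hypothetical, and late binding through CreateObject is used here only so the sketch does not depend on the library reference being set; the pattern is the one from extractDate.

```vba
Function extractRest(Myrange As Range) As String
    ' extractRest is an illustrative name, not part of the original answer.
    Dim regEx As Object
    Dim strInput As String

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = "(\d{1,2}.*\d{4}) (.*)"   ' same pattern as extractDate

    strInput = Myrange.Value
    If regEx.Test(strInput) Then
        ' $2 is the second capture group: everything after the date.
        extractRest = regEx.Replace(strInput, "$2")
    Else
        extractRest = "Not matched"
    End If
End Function
```

Cell C1 would then contain =extractRest(A1). As with the #NAME? error mentioned in the question, the function most likely needs to live in a standard module to be callable from a cell.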

VBA macro in Excel 2007. I want to use Regular Expressions in Excel VBA to replace "He" with "She", "he" with "she", "Him" with "her"

Sub Test()
Dim strTest As String
Dim strTemp As String
strTest = Sheet1.Cells(1, 1).Value
MsgBox RE6(strTest)
Sheet1.Cells(2, 1).Value = RE6(strTest)
End Sub
Function RE6(strData As String)
Dim RE As Object 'REMatches As Object
Dim P As String, A As String
Dim Q As String, B As String
Dim R As String, C As String
Dim S As String, D As String
Dim T As String, E As String
Dim U As String, F As String
Dim V As String, G As String
Dim W As String, H As String
Dim N As Integer
Set RE = CreateObject("vbscript.regexp")
P = "(?:^|\b)He"
A = "She"
Q = "(?:^|\b)he"
B = "she"
R = "(?:^|\b)Him"
C = "Her"
S = "(?:^|\b)him"
D = "her"
T = "(?:^|\b)Himself"
E = "Herself"
U = "(?:^|\b)himself"
F = "herself"
V = "(?:^|\b)His"
G = "Her"
W = "(?:^|\b)his"
H = "her"
'This section replaces "He" with"She"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = P
End With
RE6 = RE.Replace(strData, A)
'This section replaces "he" with "she"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = Q
End With
RE6 = RE.Replace(strData, B)
'
'This section replaces "Him" with "Her"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = R
End With
RE6 = RE.Replace(strData, C)
'This section replaces "him" with "her"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = S
End With
RE6 = RE.Replace(strData, D)
'This section replaces "Himself" with "Herself"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = T
End With
RE6 = RE.Replace(strData, E)
'This section replaces "himself" with "herself"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = U
End With
RE6 = RE.Replace(strData, F)
'This section replaces "His" with "Her"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = V
End With
RE6 = RE.Replace(strData, G)
'This section replaces "his" with "her"
With RE
.MultiLine = True
.Global = True
.IgnoreCase = False
.Pattern = W '
RE6 = RE.Replace(strData, H)
End With
End Function
When I run this code on this piece of text:
James has settled effortlessly in his new class. He has shown seriousness and demonstrated traits of a serious student in the first half of the term. I am very optimistic that his positive attitude towards work, if he does not relent, will yield positive dividends. However, James needs to respond positively to prompts on getting himself better organised in school. I wish Him, him the best in the second half of the term.
I only get "his" replaced with "her". If I comment out the last bit then I get only "Him" replaced with "Her". Any help will be very welcome.
The issue is that you repeatedly run each replacement on strData rather than on the result of the previous replacement. That is, you take your original string, replace "He" with "She", and store the result in RE6. Then you take your original string again, replace "he" with "she", and store that in RE6, overwriting the first replacement, and so on. This is why you only see the result of the last replacement.
To fix it, leave your first replacement as
RE6 = RE.Replace(strData, A)
but change all of your other replacements to be
RE6 = RE.Replace(RE6, B) 'do the same for C through H
This will give you your desired output.
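The same chaining can be sketched more compactly with a loop over pattern/replacement pairs. This is only one possible arrangement: the function name ReplacePronouns and the pairs array are assumptions for illustration, while the patterns and replacements are copied from the question.

```vba
Function ReplacePronouns(strData As String) As String
    ' ReplacePronouns and pairs are illustrative names, not from the question.
    Dim re As Object
    Dim pairs As Variant
    Dim result As String
    Dim i As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.MultiLine = True
    re.IgnoreCase = False

    ' Pattern, replacement, pattern, replacement, ... in the question's order.
    pairs = Array("(?:^|\b)He", "She", "(?:^|\b)he", "she", _
                  "(?:^|\b)Him", "Her", "(?:^|\b)him", "her", _
                  "(?:^|\b)Himself", "Herself", "(?:^|\b)himself", "herself", _
                  "(?:^|\b)His", "Her", "(?:^|\b)his", "her")

    result = strData
    For i = LBound(pairs) To UBound(pairs) Step 2
        re.Pattern = pairs(i)
        ' Feed the previous result back in so no replacement is lost.
        result = re.Replace(result, pairs(i + 1))
    Next i

    ReplacePronouns = result
End Function
```

Because each step works on the accumulated result, the order of the pairs matters. The patterns have no trailing word boundary, so they also match the start of longer words, which may or may not be what you want.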