I'm trying to extract a 10 character string out of the following string:
<div class="a-column a-span5 a-span-last"><div class="a-row a-spacing-mini"><span name="B01B5BBNPS">
I want to extract B01B5BBNPS. The string will be in Cell "A1". I tried using the following code but it only works when Cell "A1" only contains "B01B5BBNPS".
Function CleanString(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "^[B0]{2}[\w]{8}"
On Error Resume Next
CleanString = .Execute(strIn)(0)
End With
End Function
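Your pattern must not begin with the "^" character: it anchors the match to the start of the string, so the pattern only matches when the string itself begins with the code. Also, [B0]{2} is a character class repeated twice, so it matches any two of "B" and "0" (B0, 0B, BB or 00), not just the literal "B0".
You should try this pattern: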
B0[\w]{8}
https://regex101.com/r/zpo3Th/1
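A minimal sketch of the function with the unanchored pattern applied. The name ExtractCode is hypothetical, and the Count check is an addition so that an input without a match returns an empty string instead of relying on On Error Resume Next:

```vba
' Hypothetical helper: returns the first "B0" + 8 word characters found in strIn,
' or an empty string when there is no match.
Function ExtractCode(strIn As String) As String
    Dim objRegex As Object
    Dim matches As Object

    Set objRegex = CreateObject("VBScript.RegExp")
    objRegex.Global = False          ' only the first match is needed
    objRegex.Pattern = "B0\w{8}"     ' no ^ anchor, so the code may sit anywhere

    Set matches = objRegex.Execute(strIn)
    If matches.Count > 0 Then
        ExtractCode = matches(0).Value
    End If
End Function
```

With the sample HTML in cell A1, the worksheet formula =ExtractCode(A1) should return B01B5BBNPS.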
I can't seem to figure out why this function which includes a regex keeps returning an error of wrong data type? I'm trying to return a match to the identified pattern from a file path string in an excel document. An example of the pattern I'm looking for is "02 Package_2018-1011" from a sample string "H:\H1801100 MLK Middle School Hartford\2-Archive! Issued Bid Packages\01 Package_2018-0905 Demolition and Abatement Bid Set_Drawings - PDF\00 HazMat\HM-1.pdf". Copy of the VBA code is listed below.
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\D{2}\sPackage_\d{4}-\d{4}"
.Global = True
End With
Set textpart = regex.Execute(strInput)
End Function
You need to use \d{2} to match a two-digit chunk, not \D{2}, which matches two non-digit characters. Besides, you are trying to assign the whole match collection to the function result, while you should extract the first match value and assign that value to the function result:
Function textpart(Myrange As Range) As Variant
Dim strInput As String
Dim regex As Object
Dim matches As Object
Set regex = CreateObject("VBScript.RegExp")
strInput = Myrange.Value
With regex
.Pattern = "\d{2}\sPackage_\d{4}-\d{4}"
End With
Set matches = regex.Execute(strInput)
If matches.Count > 0 Then
textpart = matches(0).Value
End If
End Function
Note that to match it as a whole word, you may add word boundaries (\b) at both ends of the pattern:
.Pattern = "\b\d{2}\sPackage_\d{4}-\d{4}\b"
To only match it after \, you may use a capturing group:
.Pattern = "\\(\d{2}\sPackage_\d{4}-\d{4})\b"
' ...
' and then
' ...
textpart = matches(0).Submatches(0)
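Put together, the capturing-group variant might look like the sketch below. The function name PackageAfterSlash is hypothetical, and returning an empty string when nothing matches is an assumption about the desired behaviour:

```vba
' Hypothetical helper: returns the "NN Package_YYYY-MMDD" chunk that directly
' follows a backslash, or an empty string when there is no such chunk.
Function PackageAfterSlash(Myrange As Range) As Variant
    Dim regex As Object
    Dim matches As Object

    Set regex = CreateObject("VBScript.RegExp")
    ' \\ matches the literal backslash; the parentheses capture the part to return
    regex.Pattern = "\\(\d{2}\sPackage_\d{4}-\d{4})\b"

    Set matches = regex.Execute(CStr(Myrange.Value))
    If matches.Count > 0 Then
        PackageAfterSlash = matches(0).SubMatches(0)   ' first capturing group only
    Else
        PackageAfterSlash = ""
    End If
End Function
```

For the sample path in the question, this should return 01 Package_2018-0905 without the leading backslash.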
I'm having some trouble with the {} quantifier. When I set a maximum like {1,8}, it does not work and I don't know why. The minimum value works fine.
Private Sub Highlvl_Expression()
Dim strPattern As String: strPattern = "[a-zA-Z0-9_]{1,8}"
Dim strReplace As String: strReplace = ""
Dim regEx As New RegExp
Dim Test As Boolean
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
Test = regEx.Test(Highlvl.Value)
If regEx.Test(Highlvl.Value) Then
MsgBox ("Validate")
Else
MsgBox ("Not Validate")
End If
End Sub
You specified a pattern that looks for 1 to 8 word characters anywhere inside a string, so Test returns True as soon as it finds one such run, however long the whole string is. If you run the regex against a 9-character string "ABCDE6789" (regEx.Execute("ABCDE6789") with .Global = True), you will have 2 matches: ABCDE678 and 9.
If you want to validate a string that should have a minimum or a maximum number of characters, you need to use anchors, i.e. start and end of string assertions ^ and $. So, use
Dim strPattern As String: strPattern = "^[a-zA-Z0-9_]{1,8}$"
And
.Global = False
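The global flag is not necessary since we are not looking for multiple matches, but for a single True or False result from Test. Also set .MultiLine = False: with MultiLine enabled, ^ and $ also match at line breaks, so a multi-line value could pass if any single line fits the pattern.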
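As a rough sketch, the anchored validation can be wrapped in a small function. The name IsValidName is hypothetical, and late binding via CreateObject is used here instead of New RegExp so that no library reference is required:

```vba
' Hypothetical helper: True only when the whole string consists of
' 1 to 8 characters from the set a-z, A-Z, 0-9 and underscore.
Function IsValidName(ByVal strText As String) As Boolean
    Dim regEx As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Global = False       ' a single yes/no answer is enough
        .MultiLine = False    ' ^ and $ refer to the whole string
        .IgnoreCase = False
        .Pattern = "^[a-zA-Z0-9_]{1,8}$"
        IsValidName = .Test(strText)
    End With
End Function
```

IsValidName("ABCDE678") gives True, while IsValidName("ABCDE6789") and IsValidName("") give False, because the anchors force the quantifier to cover the entire string.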
This is my sample record, in comma-delimited text format:
901,BLL,,,BQ,ARCTICA,,,,
I need to replace ,,, with ,,
The regular expression that I tried:
With regex
.MultiLine = False
.Global = True
.IgnoreCase = False
.Pattern="^(?=[A-Z]{3})\\,{3,}",",,"))$ -- error
Now I want to pass each line from the file to the regex to correct the record. Can somebody guide me on how to fix this? I am very new to VBA.
I want to read the file line by line and pass each line to the regex.
Looking at your original pattern, I tried .Pattern = "^\d{3},\D{3},,," which works on the sample record: three digits, a comma, three non-digit characters (here, letters), then three commas.
In the answer I have used a more general pattern, .Pattern = "^\w*,\w*,\w*,," which also works on the sample. It matches three commas, each preceded by zero or more word characters (letters, digits or underscore), followed directly by a fourth comma. Both patterns require the match to start at the beginning of the string.
The pattern .Pattern = "^\d+,[a-zA-Z]+,\w*,," also works on the sample record. It requires one or more digits (and only digits) before the first comma and one or more letters (and only letters) before the second comma. Before the third comma there can be zero or more word characters.
The Left function drops the rightmost character of the match, i.e. the last comma, to build the replacement string passed to Regex.Replace.
Sub Test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
Debug.Print
Debug.Print str
str = strConv(str)
Debug.Print str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object
Dim oMatches As Object
Dim oMatch As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = "^\w*,\w*,\w*,,"
End With
Set oMatches = objRegEx.Execute(str)
If oMatches.Count > 0 Then
For Each oMatch In oMatches
str = objRegEx.Replace(str, Left(oMatch.Value, oMatch.Length - 1))
Next oMatch
End If
strConv = str
End Function
Try this
Sub test()
Dim str As String
str = "901,BLL,,,BQ,ARCTICA,,,,"
str = strConv(str)
MsgBox str
End Sub
Function strConv(ByVal str As String) As String
Dim objRegEx As Object, allMatches As Object
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.MultiLine = False
.IgnoreCase = False
.Global = True
.Pattern = ",,,"
End With
strConv = objRegEx.Replace(str, ",,")
End Function
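The question also asks how to feed a file to the regex line by line. A minimal sketch using VBA's built-in file statements is shown below; the procedure name CleanFile and both file paths are hypothetical, and it assumes a strConv function like the one above is in the same module:

```vba
' Hypothetical driver: reads a text file line by line, corrects each record
' with strConv, and writes the result to a second file.
Sub CleanFile()
    Dim inPath As String, outPath As String
    Dim inFile As Integer, outFile As Integer
    Dim lineText As String

    inPath = "C:\temp\input.txt"     ' hypothetical path, adjust as needed
    outPath = "C:\temp\output.txt"   ' hypothetical path, adjust as needed

    inFile = FreeFile
    Open inPath For Input As #inFile
    outFile = FreeFile               ' requested after the first Open, so it differs
    Open outPath For Output As #outFile

    Do While Not EOF(inFile)
        Line Input #inFile, lineText           ' one record per line
        Print #outFile, strConv(lineText)      ' write the corrected record
    Loop

    Close #inFile
    Close #outFile
End Sub
```

Writing to a separate output file keeps the original data untouched, which makes it easy to compare the two while the pattern is still being tuned.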
I am trying to extract the first group of digits from this expression: "CSTAR ADJ REF # A-3080101078AZ Keying error Deposit $ 938,710.33 on 05/20/2011 item keying". That is, I want to get only "3080101078" as the result in an adjacent cell.
I have tried using this:
Function CleanString(strIn As String) As String
Dim objRegex
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "[^0-9\s]"
CleanString= Trim(.Replace(strIn, vbNullString))
End With
End Function
This is giving "3080101078 93871033 05202011". How can I get only the first group of digits (3080101078)?
Try this
Function CleanString(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.Pattern = "\d+"
CleanString = .Execute(strIn)(0)
End With
End Function
\d+ matches one or more digits
.Execute(strIn) runs the regex against the input, returning a MatchCollection
.Execute(strIn)(0) returns the first item in the match collection; note that this raises a run-time error if the input contains no digits at all
How do I grab the position of the first matched result in a regular expression? See below.
Function MYMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim strPosition As Integer
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Pattern = strPattern
objRegEx.IgnoreCase = blnCase
' Do the search match.
strPosition = objRegEx.Match(strValue)
MYMATCH = strPosition
End Function
For one, I'm not entirely certain what .Match is returning (string, integer, etc.). The one solution I found said I should create a Match object and then grab the position from there, but unlike VB, VBA does not recognize the Match object. I've also seen some code like the following, but I'm not necessarily looking for the value, just the position of the first match in the string:
If allMatches.count <> 0 Then
result = allMatches.Item(0).submatches.Item(0)
End If
Somewhat ignoring any of the possible syntax errors above (mostly due to me changing variable types right and left), how do I easily/simply accomplish this?
Thanks!
You can use FirstIndex on the matches returned by the Execute method to get the position. FirstIndex is zero-based, hence the + 1 in the code, i.e.
Function MYMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim strPosition As Integer
Dim RegMC
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.Pattern = strPattern
.IgnoreCase = blnCase
If .test(strValue) Then
Set RegMC = .Execute(strValue)
MYMATCH = RegMC(0).firstindex + 1
Else
MYMATCH = "no match"
End If
End With
End Function
Sub TestMe()
MsgBox MYMATCH("test 1", "\d+")
End Sub
For the benefit of others who may be having this problem, I finally figured it out.
Option Explicit
Function CHAMATCH(strValue As String, strPattern As String, Optional blnCase As Boolean = True, Optional blnBoolean = True) As String
Dim objRegEx As Object
Dim objPosition As Object
Dim strPosition As String
' Create regular expression.
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.Pattern = strPattern
objRegEx.IgnoreCase = blnCase
' Do the search match.
Set objPosition = objRegEx.Execute(strValue)
strPosition = objPosition(0).FirstIndex
CHAMATCH = strPosition
End Function
Instead of a Match type, a plain Object type will do (since Execute simply returns an object). Then, if you want the index location, use .FirstIndex on the match of your choice, or, if you want the value, use .Value. Note that FirstIndex is zero-based, so the first character of the string is position 0.