VBA and RegEx matching arbitrary strings in Excel 2010 - regex

I need to extract the address and, where present, the zip code as separate entities from the same line. The address line may or may not contain a zip code, and may or may not contain other unwanted strings. This is due to a bug in a web form, which is fixed, but the damage is already done to a set of elements.
Possible forms and results:
Address: Some address 251, 99302 Something Telephone: 555 6798 8473 -- Return "some address 251" and "99302 something" in separate strings. Comma may or may not be trailed by whitespace.
Address: Some address 251 -- Return "some address 251"
Address: Some address 251, 99302 -- Return "some address 251" and "99302". Again, comma may or may not be trailed by whitespace.
I have a basic understanding of how this could be done programmatically in VBA by iterating over the string and checking individual characters and substrings, but I feel like it will be time-consuming and not very robust afterwards. Or if it's robust, it would end up being huge because of all the possible variations.
I am struggling the most with how to form the regular expression(s) and possibly the conditionals to get the desired results.
This is part of a larger project, so I won't paste all the various code, but I am pulling mailitems from Outlook to analyze and dump relevant info into an Excel sheet. I have both the Outlook and Excel code working, but the logic that extracts information is a bit flawed.
Here are the new snippets I've been working on:
Function regexp(str As String, regP As String)
Dim rExp As Object, rMatch As Object
Set rExp = CreateObject("vbscript.regexp")
With rExp
.Global = False
.MultiLine = False
.IgnoreCase = True
.Pattern = regP
End With
Set rMatch = rExp.Execute(str)
If rMatch.Count > 0 Then
regexp = rMatch(0)
Else
regexp = vbNullString
Debug.Print "No match found!"
End If
End Function
Sub regexpAddress(str As String)
Dim result As String
Dim pattern As String
If InStr(str, "Telephone:") Then pattern = "/.+?(?=Telephone:)/"
result = regexp(str, pattern)
End Sub
I'm not sure how to form the regexps here. The one outlined should pull the right information (in 1 string instead of 2, but that's still an improvement), but only when the line contains the string "Telephone:", and I have a lot of cases where it won't contain that.
This is the current and somewhat flawed logic, which for some reason doesn't always yield the results I want:
For Each objMail In olFolder.Items
name = ""
address = ""
telephone = ""
email = ""
vIterations = vIterations + 1
arrBody = Split(objMail.body, Chr(10)) ' Split mail body when linebreak is encountered, throwing each line into its own array position
For i = 0 To UBound(arrBody)
arrLine = Split(arrBody(i), ": ") ' For each element (line), make new array, and if text search matches then write the 2nd half of the element to variable
If InStr(arrBody(i), "Name:") > 0 Then ' L2
name = arrLine(1) ' Reference 2nd column in array after the split
ElseIf InStr(arrBody(i), "Address:") > 0 Then
address = arrLine(1)
ElseIf InStr(arrBody(i), "Telephone:") > 0 Then
telephone = CLng(arrLine(1))
ElseIf InStr(arrBody(i), "Email:") > 0 Then
email = arrLine(1)
End If ' L2
Next
Next ' Next/end-for
This logic accepts and formats input of the following type:
Name: Joe
Address: Road
Telephone: 55555555555555
Email: joe#road.com
and returns joe, road, 55555 and joe#road.com to some defined Excel cells. This works fine when the mailitems are ordered as expected.
Problem: A bug led to my web form not inserting a linebreak after the address in some cases. The script still worked for the most part, but the mailitem contents sometimes ended up looking like this:
Name: Joe
Address: Road Telephone: 55555555555555
Email: joe#road.com
The address field was contaminated when it reached Excel ("Road Telephone" instead of just "Road"), but there was no loss of information. That was acceptable, as it's easy to remove the surplus string.
But in the following case (no email is entered), the phone number is not only lost but is actually replaced by a phone number from some other, arbitrary mailitem and I can't FOR THE LIFE OF ME figure out (1) why it won't get the correct number, (2) why it jumps to a new mail item to find the phone number or (3) how it selects this other mailitem:
Name: Joe
Address: Road Telephone: 5555555555555
Email:
In Excel:
Name: Joe
Address: Road Telephone
Telephone: 8877445511
Email:
So, TL;DR: my selection logic is flawed, and being that it is so hastily hacked together, not to mention how it yields false information and I am unable to figure out how and why, I would like to do a better operation using some other solution (like regexp?) instead for a more robust code.

Not so long ago I had a similar problem.
Code may not be very professional, but it can be helpful :)
Could you check whether this code works correctly for you?
Function regexp(str As String, regP As String)
Dim rExp As Object, rMatch As Object
Set rExp = CreateObject("vbscript.regexp")
With rExp
.Global = False
.MultiLine = False
.IgnoreCase = True
.pattern = regP
End With
Set rMatch = rExp.Execute(str)
If rMatch.Count > 0 Then
regexp = rMatch(0)
Else
regexp = vbNullString
Debug.Print "No match found!"
End If
End Function
Function for_vsoraas()
For Each objMail In olFolder.Items
vIterations = vIterations + 1
objMail_ = Replace(objMail.body, Chr(10), " ")
Dim StringToSearch(3) As String
StringToSearch(0) = "Name:"
StringToSearch(1) = "Address:"
StringToSearch(2) = "Telephone:"
StringToSearch(3) = "Email:"
Dim ArrResults(4) As String 'name,address,telephone,email, zipcode
For i = 0 To UBound(StringToSearch)
ResultString = ""
StartString = InStr(objMail_, StringToSearch(i))
If StartString > 0 Then
If i = UBound(StringToSearch) Then 'last string to search, dont search EndString
ResultString = Mid(objMail_, StartString + Len(StringToSearch(i)) + 1) ' everything after the last label
Else
EndString = 0
j = i
While (EndString = 0) 'prevent case no existing EndString
EndString = InStr(objMail_, StringToSearch(j + 1))
j = j + 1
If j = UBound(StringToSearch) And EndString = 0 Then
EndString = Len(objMail_) + 1
End If
Wend
ResultString = Mid(objMail_, StartString + Len(StringToSearch(i)) + 1, EndString - 1 - StartString - Len(StringToSearch(i)))
End If
ArrResults(i) = ResultString
End If
Next i
'search zipcode and address
ArrResults(4) = regexp(ArrResults(1), "\b(\d{5})\b")
ArrResults(1) = regexp(ArrResults(1), "([a-z ]{2,}\s{0,1}\d{0,3})")
'your variables
Name = ArrResults(0)
Address = ArrResults(1)
Telephone = ArrResults(2)
Email = ArrResults(3)
ZipCode = ArrResults(4)
Next ' Next/end-for
End Function

I don't know if it was dumb luck or if I actually managed to learn some regex, but these patterns turn out to do exactly what I need.
' regex patterns - use flag /i
adrPattern = "([a-z ]{2,}\s{0,1}\d{0,3})" ' Select from a-z or space, case insensitive and at least 2 characters long, followed by optional space, ending with 0-3 digits
adrZipcode = "\b(\d{4})\b" ' Exactly 4 digits with a word boundary on both sides, e.g. a space, a non-word character like a comma, or the start/end of the string
Edit: "Fixed" the telephone problem too. After spending 2 hours trying to write it in regex, and failing miserably, it dawned on me that solving the problem as a matter of faulty creation of the array had to be so much easier than treating it as a computational problem. And it was:
mailHolder = Replace(objMail.body, "Telephone:", Chr(10) + "Telephone:")
arrBody = Split(mailHolder, Chr(10))
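To return the address and the zip part as two separate strings, as the question originally asked, a single pattern with two capture groups is one option. The following is a minimal sketch, not the code used above: the function name SplitAddressLine and the pattern are assumptions, and it assumes the zip part always follows the first comma and that anything from "Telephone:" onward should be discarded.

```vba
Function SplitAddressLine(ByVal lineText As String) As Variant
    Dim rx As Object, matches As Object

    Set rx = CreateObject("VBScript.RegExp")
    rx.Global = False
    rx.IgnoreCase = True
    ' group 1: everything up to the first comma (or up to "Telephone:" / end of line)
    ' group 2: optional text after the comma, again stopping before "Telephone:"
    rx.Pattern = "^\s*(?:Address:\s*)?([^,]+?)\s*(?:,\s*(.*?))?\s*(?:Telephone:.*)?$"

    Set matches = rx.Execute(lineText)
    If matches.Count = 0 Then
        SplitAddressLine = Array(vbNullString, vbNullString)
    Else
        ' & "" turns a group that did not take part in the match into an empty string
        SplitAddressLine = Array(matches(0).SubMatches(0) & "", _
                                 matches(0).SubMatches(1) & "")
    End If
End Function

Sub DemoSplitAddressLine()
    Dim parts As Variant
    parts = SplitAddressLine("Address: Some address 251, 99302 Something Telephone: 555 6798 8473")
    Debug.Print parts(0)    ' address part
    Debug.Print parts(1)    ' zip part, empty when the line has no comma
End Sub
```

For the three forms listed in the question, the first element should be `Some address 251` each time, and the second `99302 Something`, an empty string, and `99302` respectively. The comma may be followed by whitespace or not.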

Related

Find '~XX~' within a string with specific values

I have classic ASP written in VBScript. I have a record pulled from SQL Server and the data is a string. In this string, I need to find text enclosed in ~12345~ and I need to replace with very specific text. Example 1 would be replaced with M, 2 would be replaced with A. I then need to display this on the web page. We don't know how many items will be enclosed with ~.
Example Data:
Group Pref: (To be paid through WIT)
~2.5~ % Quarterly Rebate - Standard Commercial Water Heaters
Display on webpage after:
Group Pref: (To be paid through WIT)
~A.H~ % Quarterly Rebate - Standard Commercial Water Heaters
I tried the following, but there are too many cases and this would be unrealistic to maintain. It does replace the text and display correctly.
dim strSearchThis
strSearchThis =(rsResults("PREF"))
set re = New RegExp
with re
.global = true
.pattern = "~[^>]*~"
strSearchThis = .replace(strSearchThis, "X")
end with
I am also trying this code. I can find the text contained between each ~ ~, but when displayed, the information between the ~ ~ is not changed:
dim strSearchThis
strSearchThis =(rsResults("PREF"))
Set FolioPrefData = New RegExp
FolioPrefData.Pattern = "~[^>]*~"
FolioPrefData.Global = True
FolioPrefData.IgnoreCase = True
'will contain all found instances of ~ ~'
set colmatches = FolioPrefData.Execute(strSearchThis)
Dim itemLength, found
For Each objMatch in colMatches
Select Case found
Case "~"
'ignore - doing nothing'
Case "1"
found = replace(strSearchThis, "M")
End Select
Next
response.write(strSearchThis)
You can do it without using Regular Expressions, just checking the individual characters and writing a function that handles the different cases you have. The following function finds your delimited text and loops through all characters, calling the ReplaceCharacter function defined further down:
Function FixString(p_sSearchString)
Dim iStartIndex
Dim iEndIndex
Dim iIndex
Dim sReplaceString
Dim sReturnString
sReturnString = p_sSearchString
' Locate start ~
iStartIndex = InStr(sReturnString, "~")
Do While iStartIndex > 0
' Look for end ~
iEndIndex = InStr(iStartIndex + 1, sReturnString, "~")
If iEndIndex > 0 Then
sReplaceString = ""
' Loop through all characters
For iIndex = iStartIndex + 1 To iEndIndex - 1
sReplaceString = sReplaceString & ReplaceCharacter(Mid(sReturnString, iIndex, 1))
Next
' Replace string
sReturnString = Left(sReturnString, iStartIndex) & sReplaceString & Mid(sReturnString, iEndIndex)
' Locate next ~
iStartIndex = InStr(iEndIndex + 1, sReturnString, "~")
Else
' End couldn't be found, exit
Exit Do
End If
Loop
FixString = sReturnString
End Function
This is the function where you will enter the different character substitutions you might have:
Function ReplaceCharacter(p_sCharacter)
Select Case p_sCharacter
Case "1"
ReplaceCharacter = "M"
Case "2"
ReplaceCharacter = "A"
Case Else
ReplaceCharacter = p_sCharacter
End Select
End Function
You can use this in your existing code:
response.write(FixString(strSearchThis))
You can also use a Split and Join method...
Const SEPARATOR = "~"
Dim deconstructString, myOutputString
Dim arrayPointer
deconstructString = Split(myInputString, SEPARATOR)
For arrayPointer = 0 To UBound(deconstructString)
If IsNumeric(deconstructString(arrayPointer)) Then
'Do whatever you need to with your value...
End If
Next 'arrayPointer
myOutputString = Join(deconstructString, SEPARATOR) ' rejoin with the separator so the ~ delimiters are kept
This does rely, obviously, on breaking a string apart and rejoining it, so there is a slight overhead from rebuilding the string.
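For completeness, the regular-expression attempt from the question can also be made to work by walking the matches and rebuilding the string. This is only a sketch under assumptions: the names MapChar and MapDelimited are hypothetical, the pattern uses [^~] rather than the [^>] from the question, and the "5" to "H" mapping is inferred from the ~2.5~ / ~A.H~ example. It is written as VBA; for classic ASP (VBScript) the `As ...` type clauses would have to be dropped.

```vba
' Hypothetical helper: one-character substitution table (extend as needed).
Function MapChar(ByVal ch As String) As String
    Select Case ch
        Case "1": MapChar = "M"
        Case "2": MapChar = "A"
        Case "5": MapChar = "H"   ' assumption, inferred from the ~2.5~ -> ~A.H~ example
        Case Else: MapChar = ch   ' leave "." and anything unmapped as-is
    End Select
End Function

Function MapDelimited(ByVal src As String) As String
    Dim rx As Object, matches As Object, m As Object
    Dim i As Long, k As Long
    Dim inner As String, mapped As String

    Set rx = CreateObject("VBScript.RegExp")
    rx.Global = True
    rx.Pattern = "~([^~]*)~"          ' text between a pair of tildes
    Set matches = rx.Execute(src)

    ' Work from the last match to the first so the FirstIndex
    ' values of the earlier matches stay valid while src changes.
    For i = matches.Count - 1 To 0 Step -1
        Set m = matches(i)
        inner = m.SubMatches(0)
        mapped = vbNullString
        For k = 1 To Len(inner)
            mapped = mapped & MapChar(Mid(inner, k, 1))
        Next k
        ' FirstIndex is zero-based, so Left takes everything before the match.
        src = Left(src, m.FirstIndex) & "~" & mapped & "~" & _
              Mid(src, m.FirstIndex + m.Length + 1)
    Next i
    MapDelimited = src
End Function
```

With these definitions, `MapDelimited("~2.5~ % Quarterly Rebate")` should give `~A.H~ % Quarterly Rebate`, and text outside the tildes is left untouched.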

Check string has a date in it and extract part of the string

I have thousands of lines of text that I need to work through, and the lines I am interested in look like the following:
01/04/2019 09:35:41 - Test user (Additional Comments)
I am currently using this code to filter out all the other rows:
If InStr(FullCell(i), " - ") <> 0 And InStr(FullCell(i), ":") <> 0 And InStr(FullCell(i), "(") <> 0 Then
FullCell is the array that I am working through.
I know this is not the best way to do it. Is there a way to check that there is a date at the beginning of the string in the format dd/mm/yyyy and then extract the user name in between the '-' and the '(' symbols?
I had a play with regex to see if that could help, but my skills are too limited to pull off both VBA and regex in the same code.
What's the best way to do this?
Assuming Fullcell(i) contains the string,
If Left(Fullcell(i), 10) Like "##/##/####" Then
will be True if the string starts with a date (note that it will not differentiate between dd/mm/yyyy and mm/dd/yyyy).
And
Mid(Fullcell(i), InStr(Fullcell(i), " - ") + 3, InStr(Fullcell(i), " (") - InStr(Fullcell(i), " - ") - 3)
will return the username.
I'm sure there is a more efficient way to do this, but I've used the following solution quite a few times:
This will select the date:
x = 1
Do While Mid(FullCell, x, 1) <> " "
x = x + 1
Loop
strDate = Left(FullCell, x - 1)
This will find the character number of the hyphen; the username starts 2 characters after it.
x = 1
Do While Mid(FullCell, x, 1) <> "-"
x = x + 1
Loop
Then we find the opening parenthesis that follows the username (the username itself may contain spaces):
y = x + 2
Do While Mid(FullCell, y, 1) <> "("
y = y + 1
Loop
The username is now characters x + 2 to y - 2:
strUsername = Mid(FullCell, x + 2, y - (x + 2) - 1)
Here's how I would do it.
Dim your variables
Dim rng As Range
Dim dat As Variant
Dim FullCell() As String
Dim User As String
Dim i As Long, pos As Long
Set your range
' Set rng = <any range you choose>
dat = rng.Value2
Loop dat
For i = 1 To UBound(dat, 1)
Split the data
FullCell = Split(CStr(dat(i, 1)), "-")
Test if it split
If UBound(FullCell) > 0 Then
Test if it matches
If IsDate(Trim$(FullCell(0))) Then
pos = InStr(FullCell(1), "(") - 1
If pos > 0 Then
User = Trim$(Left$(FullCell(1), pos))
' Found a user
End If
End If
End If
Next
Abstraction is your friend, it's always helpful to break these into their own private functions whenever you can. You could put your code in a function and call it something like ExtractUsername.
Below I did an example of this, and I decided to go with the RegExp approach (late binding), but you could use string functions like the examples above as well.
This function returns the username if it finds the pattern you mentioned above, otherwise, it returns an empty string.
Private Function ExtractUsername(ByVal SourceString As String) As String
Dim RegEx As Object
Set RegEx = CreateObject("vbscript.regexp")
'(FIRST GROUP FINDS THE DATE FORMATTED AS DD/MM/YYYY AND EVERYTHING UP TO THE HYPHEN)
'(SECOND GROUP FINDS THE USERNAME) THIS WILL BE SUBMATCH 1
With RegEx
.Pattern = "(^\d{2}\/\d{2}\/\d{4}.*-)(.+)(\()"
.Global = True
End With
Dim Match As Object
Set Match = RegEx.Execute(SourceString)
'ONLY RETURN IF A MATCH WAS FOUND
If Match.Count > 0 Then
ExtractUsername = Trim(Match(0).SubMatches(1))
End If
Set RegEx = Nothing
End Function
The regex pattern is grouped into three parts: the date (plus everything up to the hyphen), the username, and the opening parenthesis. What you are interested in is the username, which is SubMatches(1) because the SubMatches collection is zero-based.
Regexr is a helpful site for practicing regular expressions and can show you a bit more of what the pattern I went with is doing.
Please note that using regular expressions might give you performance issues and you should test it against regular string functions to see what works best for your situation.
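If the lines always follow the exact layout shown in the question (date, time, " - ", user, " ("), a stricter pattern can validate the whole prefix and capture only the user name. This is a sketch under that assumption; the function name UserFromLogLine is hypothetical, and the Static object is there so the RegExp is created only once when the function is called for thousands of lines.

```vba
Function UserFromLogLine(ByVal lineText As String) As String
    Static rx As Object
    Dim matches As Object

    If rx Is Nothing Then
        Set rx = CreateObject("VBScript.RegExp")
        ' dd/mm/yyyy hh:mm:ss - <user> (   with the user name as the only capture group
        rx.Pattern = "^\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} - (.+?) \("
    End If

    Set matches = rx.Execute(lineText)
    ' Returns an empty string when the line does not match the layout
    If matches.Count > 0 Then UserFromLogLine = matches(0).SubMatches(0)
End Function
```

For the sample line `01/04/2019 09:35:41 - Test user (Additional Comments)` this should return `Test user`; lines without the date prefix return an empty string, so the same call can double as the filter.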

VBA - Modify sheet naming from source file

I received help in the past for an issue regarding grabbing a source file name and naming a newly created worksheet the date from said source file name, i.e. "010117Siemens Hot - Cold Report.xls" and outputting "010117".
However the code only works for file names with this exact format, for example, file named "Siemens Hot - Cold Report 010117.xls", an error occurs because the newly created sheet does not find the date in the source file.
CODE
Application.ScreenUpdating = False
Dim n As Double
Dim wksNew As Excel.Worksheet
Dim src As Workbook
Set src = Workbooks.Open(filePath, False, False)
Dim srcRng As Range
With src.Worksheets("Sheet1")
Set srcRng = .Range(.Range("A1"), .Range("A1").End(xlDown).End(xlToRight))
End With
With ThisWorkbook
Set wksNew = .Worksheets.Add(After:=.Worksheets(.Sheets.Count))
n = .Sheets.Count
.Worksheets(n).Range("A1").Resize(srcRng.Rows.Count, srcRng.Columns.Count).Value = srcRng.Value
End With
' ======= get the digits part from src.Name using a RegEx object =====
' RegEx variables
Dim Reg As Object
Dim RegMatches As Variant
Set Reg = CreateObject("VBScript.RegExp")
With Reg
.Global = True
.IgnoreCase = True
.Pattern = "\d{0,9}" ' Match any set of 0 to 9 digits
End With
Set RegMatches = Reg.Execute(src.Name)
On Error GoTo CloseIt
If RegMatches.Count >= 1 Then ' make sure there is at least 1 match
ThisWorkbook.Worksheets(n).Name = RegMatches(0) ' rename "Sheet2" to the numeric part of the filename
End If
src.Close False
Set src = Nothing
So, my question is, how can I get my code to recognize the string of digits no matter its position in the file name?
Code
^\d{0,9}\B|\b\d{0,9}(?=\.)
Usage
I decided to make a function that can be called inside a cell as such: =GetMyNum(x) where x is a pointer to a cell (i.e. A1).
To get the code below to work:
Open Microsoft Visual Basic for Applications (ALT + F11)
Insert a new module (right click in the Project Pane and select Insert -> Module).
Click Tools -> References and find Microsoft VBScript Regular Expressions 5.5, enable it and click OK
Now copy/paste the following code into the new module:
Option Explicit
Function GetMyNum(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
Dim match As Object
strPattern = "^\d{0,9}\B|\b\d{0,9}(?=\.)"
If strPattern <> "" Then
strInput = Myrange.Value
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.test(strInput) Then
Set match = regEx.Execute(strInput)
GetMyNum = match.Item(0)
Else
GetMyNum = ""
End If
End If
End Function
Results
Input
A1: Siemens Hot - Cold Report 010117.xls
A2: 010117Siemens Hot - Cold Report.xls
B1: =GetMyNum(A1)
B2: =GetMyNum(A2)
Output
010117 # Contents of B1
010117 # Contents of B2
Explanation
I will explain each regex option separately. You can reorder the options in terms of importance in such a way that the most important option is first and least important is last.
^\d{0,9}\B Match the following
^ Assert position at the start of the line
\d{0,9} Match a digit between zero and nine times
\B Ensure position does not match where a word boundary matches (this is used but may be dropped depending on usage - I added it because it seems the number you're trying to get is immediately followed by a word character and not followed by a space - if that's not always the case just remove this token)
\b\d{0,9}(?=\.) Match the following
\b Assert position as a word boundary
\d{0,9} Match a digit between zero and nine times
(?=\.) Positive lookahead ensuring a literal dot . follows
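A note on why the pattern in the question fails: \d{0,9} is allowed to match zero digits, so for a file name that starts with a letter the first match is the empty string at position 0, and an empty string is not a valid sheet name. A simpler sketch than the alternation above, assuming the file name contains only one run of digits, is to require at least one digit with \d+. The function name FirstDigitRun is hypothetical.

```vba
Function FirstDigitRun(ByVal fileName As String) As String
    Dim rx As Object, matches As Object

    Set rx = CreateObject("VBScript.RegExp")
    rx.Global = False        ' only the first run of digits is needed
    rx.Pattern = "\d+"       ' one or more digits, anywhere in the name

    Set matches = rx.Execute(fileName)
    ' Returns an empty string when the name contains no digits at all
    If matches.Count > 0 Then FirstDigitRun = matches(0).Value
End Function
```

Both `FirstDigitRun("010117Siemens Hot - Cold Report.xls")` and `FirstDigitRun("Siemens Hot - Cold Report 010117.xls")` should return `010117`. If other digits can appear earlier in the name, a fixed-length pattern such as \d{6} would be the safer choice.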
Just my alternative solution to RegEx :)
This finds the first occurrence of 6 consecutive digits, omitting blanks and periods... although there are probably some more issues with using IsNumeric, as I believe a lowercase e is considered acceptable by it...
Sub FindTheNumber()
For i = 1 To Len(Range("A1").Value)
If IsNumeric(Mid(Range("A1").Value, i, 6)) = True And InStr(Mid(Range("A1").Value, i, 6), " ") = 0 And InStr(Mid(Range("A1").Value, i, 6), ".") = 0 Then
MyNumber = Mid(Range("A1").Value, i, 6)
Debug.Print MyNumber
Exit For
End If
Next i
For i = 1 To Len(Range("A2").Value)
If IsNumeric(Mid(Range("A2").Value, i, 6)) = True And InStr(Mid(Range("A2").Value, i, 6), " ") = 0 And InStr(Mid(Range("A2").Value, i, 6), ".") = 0 Then
MyNumber = Mid(Range("A2").Value, i, 6)
Debug.Print MyNumber
Exit For
End If
Next i
End Sub

Manipulate string to extract address

I'm currently doing some work with a very large data source on city addresses where the data looks something like this.
137 is the correct address but it belongs in a building that takes up 135-138A on the street.
source:
137 9/F 135-138A KING STREET 135-138A KING STREET TOR
I've used a function shown on extendoffice that removes the duplicates.
the second column has become this:
137 9/F 135-138A KING STREET TOR
What I want to do now is:
find the address number and add it in front of the street name
remove the numbers that are connected to the dash (-)
so that the result is:
9/F 137 KING STREET TOR
What would be the best way to accomplish this?
The main problem I'm having with this is there are many inconsistent spaces in address names ex. "van dyke rd".
Is there any way I can locate the "-" in an array, set variables for the 2 numbers on either side of the dash, and replace them with the correct address number located at the front?
Function RemoveDupes2(txt As String, Optional delim As String = " ") As String
Dim x
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For Each x In Split(txt, delim)
If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
Next
If .Count > 0 Then RemoveDupes2 = Join(.keys, delim)
End With
End Function
Thanks
Regular Expressions are a way to (amongst other things) search for a feature in a string.
It looks like the feature you are looking for is: number:maybe some spaces : dash : maybe some spaces : number
In regex notation this would be expressed as:
([0-9]*)[ ]*-[ ]*([0-9]*)
Which translates to: Find a sequential group of digits followed by zero or more spaces, then a dash, then zero or more spaces, then some more digits.
The parentheses indicate the elements that will be returned, so you could assign variables to be the first number or the second number.
You might need to tweak this if a dash can potentially occur elsewhere in the address.
Further information on actually implementing that is available here: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
This meets the case you want: it captures the two ends of the address range as separate submatches (in case you want to process them further).
The current code simply removes the range altogether.
What logic is there to move the 9/F to front?
See regex here
Function StripString(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(\d+[A-C]?)-(\d+[A-C]?)"
If .test(strIn) Then
StripString = .Replace(strIn, vbNullString)
Else
StripString = "No match"
End If
End With
End Function
I'd just:
swap 1st and 2nd substrings
erase the substring with "-" in it
Function RemoveDupes2(txt As String, Optional delim As String = " ") As String
Dim x As Variant, arr As Variant, temp As Variant
Dim iArr As Long
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For Each x In Split(txt, delim)
If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
Next
If .count > 0 Then
arr = .keys
temp = arr(0)
arr(0) = arr(1)
arr(1) = temp
For iArr = LBound(arr) To UBound(arr)
If InStr(arr(iArr), "-") <> 0 Then arr(iArr) = ""
Next
RemoveDupes2 = Join(arr, delim)
End If
End With
End Function
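The swap and the removal of the range can also be done in one regular-expression replace on the de-duplicated string. This is a sketch that assumes every line has the same shape as the sample (address number, then floor/unit, then the dashed range); the function name ReorderAddress is hypothetical.

```vba
Function ReorderAddress(ByVal txt As String) As String
    Dim rx As Object

    Set rx = CreateObject("VBScript.RegExp")
    rx.IgnoreCase = True
    ' $1 = first token (address number), $2 = second token (floor/unit),
    ' followed by a range such as 135-138A, which is dropped
    rx.Pattern = "^(\S+)\s+(\S+)\s+\d+[A-Z]?\s*-\s*\d+[A-Z]?\s+"

    ' Replace returns the input unchanged when the pattern does not match
    ReorderAddress = rx.Replace(txt, "$2 $1 ")
End Function
```

For `137 9/F 135-138A KING STREET TOR` this should return `9/F 137 KING STREET TOR`. Unlike blanking an array element before Join, it does not leave a double space where the range used to be.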

Excel UDF for capturing numbers within characters

I have a variable text field sitting in cell A1 which contains the following:
Text;#Number;#Text;#Number
This format can keep repeating, but the pattern is always Text;#Number.
The numbers can vary from 1 digit to n digits (limit 7)
Example:
Original Value
MyName;#123;#YourName;#3456;#HisName;#78
Required value:
123, 3456, 78
From my understanding, the field is too variable for Excel formulas.
I tried using regexp, but I am a beginner when it comes to coding. If you can break down the code with some explanation text, it would be much appreciated.
I have tried some of the suggestions below and they work perfectly. One more question.
Now that I can split the numbers from the text, is there any way to utilize the code below and add another layer, where we split the numbers into x cells.
For example: once we run the function, if we get 1234, 567 in the same cell, the function would put 1234 in cell B2, and 567 in cell C2. This would keep updating all cells in the same row until the string has exhausted all of the numbers that are retrieved from the function.
Thanks
This is John Coleman's suggested method:
Public Function GetTheNumbers(st As String) As String
ary = Split(st, ";#")
GetTheNumbers = ""
For Each a In ary
If IsNumeric(a) Then
If GetTheNumbers = "" Then
GetTheNumbers = a
Else
GetTheNumbers = GetTheNumbers & ", " & a
End If
End If
Next a
End Function
If the pattern is fixed, and the location of the numbers never changes, you can assume the numbers will be located in the even places in the string. This means that in the array result of a split on the source string, you can use the odd indexes of the resulting array. For example in this string "Text;#Number;#Text;#Number" array indexes 1, 3 would be the numbers ("Text(0);#Number(1);#Text(2);#Number(3)"). I think this method is easier and safer to use if the pattern is indeed fixed, as it avoids the need to verify data types.
Public Function GetNums(src As String) As String
Dim arr
Dim i As Integer
Dim result As String
arr = Split(src, ";#") ' Split the string to an array.
result = ""
For i = 1 To UBound(arr) Step 2 ' Loop through the array, starting with the second item, and skipping one item (using Step 2).
result = result & arr(i) & ", "
Next
If Len(result) > 2 Then
GetNums = Left(result, Len(result) - 2) ' Remove the extra ", " at the end of the result string.
Else
GetNums = ""
End If
End Function
The numbers can vary from 1 digit to n digits (limit 7)
None of the other responses seems to take the provided parameters into consideration so I kludged together a true regex solution.
Option Explicit
Option Base 0 '<~~this is the default but I've included it because it has to be 0
Function numsOnly(str As String, _
Optional delim As String = ", ")
Dim n As Long, nums() As Variant
Static rgx As Object, cmat As Object
'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
If rgx Is Nothing Then
Set rgx = CreateObject("VBScript.RegExp")
End If
numsOnly = vbNullString
With rgx
.Global = True
.MultiLine = False
.Pattern = "[0-9]{1,7}"
If .Test(str) Then
Set cmat = .Execute(str)
'resize the nums array to accept the matches
ReDim nums(cmat.Count - 1)
'populate the nums array with the matches
For n = LBound(nums) To UBound(nums)
nums(n) = cmat.Item(n)
Next n
'convert the nums array to a delimited string
numsOnly = Join(nums, delim)
End If
End With
End Function
      
Regexp option that uses Replace
Sub Test()
Debug.Print StrOut("MyName;#123;#YourName;#3456;#HisName;#78")
End Sub
function
Option Explicit
Function StrOut(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(^|.+?)(\d{1,7})"
.Global = True
If .Test(strIn) Then
StrOut = .Replace(strIn, "$2, ")
StrOut = Left$(StrOut, Len(StrOut) - 2)
Else
StrOut = "Nothing"
End If
End With
End Function
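On the follow-up question about putting each number in its own cell: a function called from a worksheet cell cannot write to other cells, so one option is a Sub that writes the matches to the right of the source cell. This is a sketch only; the name SpreadNumbers is hypothetical, and the pattern assumes the numbers always sit between ";#" separators as in the sample.

```vba
Sub SpreadNumbers(ByVal src As Range)
    Dim rx As Object, matches As Object
    Dim i As Long

    Set rx = CreateObject("VBScript.RegExp")
    rx.Global = True
    ' A run of 1 to 7 digits between ";#" separators (or at either end of the string)
    rx.Pattern = "(?:^|;#)(\d{1,7})(?=;#|$)"
    Set matches = rx.Execute(CStr(src.Value))

    For i = 0 To matches.Count - 1
        ' First number goes one column to the right of src, the next one two columns, and so on
        src.Offset(0, i + 1).Value = CLng(matches(i).SubMatches(0))
    Next i
End Sub
```

Calling `SpreadNumbers ActiveSheet.Range("A2")` with `MyName;#123;#YourName;#3456;#HisName;#78` in A2 should put 123, 3456 and 78 in B2, C2 and D2. The sketch does not clear cells left over from an earlier, longer result.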