Excel VBA: search a string to find the first non-text character - regex

Cells contain a mixture of characters within a string, such as:
Abcdef_8765
QWERTY3_JJHH
Xyz9mnop
I need to find the first non A-Za-z character so that I can strip out the subsequent remainder of the string.
So the results would be:
Abcdef
QWERTY
Xyz
I know how to do this if I know exactly what character I'm looking for, but I'm not intuitively grasping how to find ANY character other than A-Za-z.
Btw, this is intended to be used within a vba solution.
====================
EDIT:
I've had success with the following...
a = "abc123"
b = Len(a)
For x = 1 To b
c = (Mid(a, x, 1) Like "[a-zA-Z]")
If c = False Then
d = Left(a, x - 1)
Exit Sub
End If
Next x
Have I stumbled upon a suitable solution, or is this destined to break?
I ask only because I look at Doug Glancy's solution and it seems much more substantial.
(btw, I have not yet tested Doug's solution)

Here is a simple way which doesn't use RegEx. I am deliberately not using RegEx as the other two answer are based on RegEx. RegEx is definitely faster but this is almost equally fast. The difference in speed is almost negligible.
Function GetWord(Rng As Range)
Dim i As Long, pos As Long
For i = 1 To Len(Rng.Value)
Select Case Asc(Mid(Rng.Value, i, 1))
Case 65 To 90, 97 To 122
Case Else: pos = i: Exit For
End Select
Next i
GetWord = Left(Rng.Value, pos - 1)
End Function
Usage:
=GetWord(A1)
EDIT:
Followup from comments. Fine tuned the code (Courtesy #brettdj) .
Function GetWord(Rng As Range)
Dim i As Long, pos As Long
Dim sString As String
sString = UCase$(Rng.Value)
For i = 1 To Len(sString)
Select Case Asc(Mid$(sString, i, 1))
Case 65 To 90
Case Else: pos = i: Exit For
End Select
Next i
GetWord = Left(Rng.Value, pos - 1)
End Function
More Followup.
Here is something which I had never tried before. I did an actual test of my code vs RegXp and I was surprised to see my code was faster than RegXp which I had not anticipated.
I tested it on 10k cells and each cell had a string of 2256 of length
The string that I put in Cell A1:A10000 is
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5RoutaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccdddddddddddddddddddddddddddddddddddddddddddddddddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeSiddharth5Rout
Next I ran this test

The regexp below looks to remove from the first non A-Z character.
Function StrChange(strIn As String) As String
Dim objRegEx As Object
Set objRegEx = CreateObject("vbscript.regexp")
With objRegEx
.ignorecase = True
.Pattern = "^([a-z]+)([^a-z].*)"
.Global = True
StrChange = .Replace(strIn, "$1")
End With
End Function

You can use a simple regular expression to specify a numeral followed by anything and use this function to replace anything that matches that pattern:
Function Regex_Replace(strOriginal As String, strPattern As String, strReplacement, varIgnoreCase As Boolean) As String
Dim objRegExp As Object
Set objRegExp = CreateObject("vbscript.regexp")
With objRegExp
.Pattern = strPattern
.IgnoreCase = varIgnoreCase
.Global = True
End With
Regex_Replace = objRegExp.Replace(strOriginal, strReplacement)
Set objRegExp = Nothing
End Function
You'd call it like this:
Sub DeleteAfterNums()
Dim cell As Excel.Range
'Change "Selection" to your range
For Each cell In Selection
'"\d.+" is a numeral and whatever follows it
cell.Value = Regex_Replace(cell.Value, "\d.+", "", True)
Next cell
End Sub

Here is a lightweight and fast method that avoids regex/reference additions, thus helping with overhead and transportability should that be an advantage.
Public Function GetText(xValue As String) As Variant
For GetText = 1 To Len(xValue)
If UCase(Mid(xValue, GetText, 1)) Like "[!A-Z]" Then GetText = Left(xValue, GetText - 1): Exit Function
Next
GetText = xValue
End Function
This is then called by using GetText("Submission String") from vba or prepended with a "=" from within a cell formula.

Related

VBA function Regex

Do you have an idea of what is wrong in this code please? It should extract all caps and the pattern "1WO" if available.
For example in "User:399595:Account:ETH:balance", i should have "UAETH" and in "User:197755:Account:1WO:balance" i should have "UA1WO"
Thank you
Option Explicit
Function ExtractCap(Txt As String) As String
Application.Volatile
Dim xRegEx As Object
Set xRegEx = CreateObject("VBSCRIPT.REGEXP")
If xRegEx.Pattern = "[^A-Z]" Then
xRegEx.Global = True
xRegEx.MultiLine = False
ExtractCap = xRegEx.Replace(Txt, "")
Set xRegEx = Nothing
Else: xRegEx.Pattern = "1WO"
ExtractCap = xRegEx.Execute(Txt)
End If
End Function
I'm not a "RegEx" expert, so you may want to try an alternative:
Function ExtractCap(Txt As String) As String
Application.Volatile
Dim i As Long
For i = 1 To Len(Txt)
Select Case Asc(Mid(Txt, i, 1))
Case 65 To 90
ExtractCap = ExtractCap & Mid(Txt, i, 1)
End Select
Next
End Function
while, should the pattern of your data strictly be as you showed, you could also consider:
Function ExtractCap(Txt As String) As String
Application.Volatile
ExtractCap = "UA" & Split(Txt, ":")(3)
End Function
Your RegEx works like this:
Function ExtractCap(Txt As String) As String
Application.Volatile
Dim xRegEx As Object
Set xRegEx = CreateObject("VBScript.RegExp")
With xRegEx
.Pattern = "[^A-Z]"
.Global = True
.MultiLine = False
ExtractCap = .Replace(Txt, vbNullString)
End With
If Txt = ExtractCap Then ExtractCap = "1WO"
End Function
Public Sub TestMe()
Debug.Print ExtractCap("User:399595:Account:ETH:balance")
End Sub
In your code, there were 2 errors, which stopped the execution:
xRegEx was set to Nothing and then it was asked to provide a value;
the check If xRegEx.Pattern = "[^A-Z]" does not actually mean a lot to VBA. E.g., you are setting a Pattern and making a condition out of it. If you want to know whether a pattern exists in a RegEx, you should compare the two strings - before and after the execution of the pattern.
Your problem can be easily solved.
Firstly, I assumed that 1WO can appears at most once in your string.
Based on that assumption, logic is as follows:
Define function, which extracts all capital letters from strings.
Now, in the main function, you split your string first using 1WO as delimeter. Now, pass every string (after splitting) to function, get all the caps from those strings and concatenate them again with 1WO in its place.
Option Explicit
Public Function Extract(str As String) As String
Dim s As Variant
For Each s In Split(str, "1WO")
'append extracted caps with 1WO at the end
Extract = Extract & ExtractCaps(s) & "1WO"
Next
'delete lest 1WO from result
Extract = Left(Extract, Len(Extract) - 3)
End Function
Function ExtractCaps(str As Variant) As String
Dim i As Long, char As String
For i = 1 To Len(str)
char = Mid(str, i, 1)
If Asc(char) > 64 And Asc(char) < 91 And char = UCase(char) Then
ExtractCaps = ExtractCaps & char
End If
Next
End Function
If you put this code in inserted Module, you can use it in a worksheet in formula: =Extract(A1).

VBA - Modify sheet naming from source file

I received help in the past for an issue regarding grabbing a source file name and naming a newly created worksheet the date from said source file name, i.e. "010117Siemens Hot - Cold Report.xls" and outputting "010117".
However the code only works for file names with this exact format, for example, file named "Siemens Hot - Cold Report 010117.xls", an error occurs because the newly created sheet does not find the date in the source file.
CODE
Application.ScreenUpdating = False
Dim n As Double
Dim wksNew As Excel.Worksheet
Dim src As Workbook
Set src = Workbooks.Open(filePath, False, False)
Dim srcRng As Range
With src.Worksheets("Sheet1")
Set srcRng = .Range(.Range("A1"), .Range("A1").End(xlDown).End(xlToRight))
End With
With ThisWorkbook
Set wksNew = .Worksheets.Add(After:=.Worksheets(.Sheets.Count))
n = .Sheets.Count
.Worksheets(n).Range("A1").Resize(srcRng.Rows.Count, srcRng.Columns.Count).Value = srcRng.Value
End With
' ======= get the digits part from src.Name using a RegEx object =====
' RegEx variables
Dim Reg As Object
Dim RegMatches As Variant
Set Reg = CreateObject("VBScript.RegExp")
With Reg
.Global = True
.IgnoreCase = True
.Pattern = "\d{0,9}" ' Match any set of 0 to 9 digits
End With
Set RegMatches = Reg.Execute(src.Name)
On Error GoTo CloseIt
If RegMatches.Count >= 1 Then ' make sure there is at least 1 match
ThisWorkbook.Worksheets(n).Name = RegMatches(0) ' rename "Sheet2" to the numeric part of the filename
End If
src.Close False
Set src = Nothing
So, my question is, how can I get my code to recognize the string of digits no matter its position in the file name?
Code
^\d{0,9}\B|\b\d{0,9}(?=\.)
Usage
I decided to make a function that can be called inside a cell as such: =GetMyNum(x) where x is a pointer to a cell (i.e. A1).
To get the code below to work:
Open Microsoft Visual Basic for Applications (ALT + F11)
Insert a new module (right click in the Project Pane and select Insert -> Module).
Click Tools -> References and find Microsoft VBScript Regular Expressions 5.5, enable it and click OK
Now copy/paste the following code into the new module:
Option Explicit
Function GetMyNum(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
Dim match As Object
strPattern = "^\d{0,9}\B|\b\d{0,9}(?=\.)"
If strPattern <> "" Then
strInput = Myrange.Value
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern
End With
If regEx.test(strInput) Then
Set match = regEx.Execute(strInput)
GetMyNum = match.Item(0)
Else
GetMyNum = ""
End If
End If
End Function
Results
Input
A1: Siemens Hot - Cold Report 010117.xls
A2: 010117Siemens Hot - Cold Report.xls
B1: =GetMyNum(A1)
B2: =GetMyNum(A1)
Output
010117 # Contents of B1
010117 # Contents of B2
Explanation
I will explain each regex option separately. You can reorder the options in terms of importance in such a way that the most important option is first and least important is last.
^\d{0,9}\B Match the following
^ Assert position at the start of the line
\d{0,9} Match any digit 0-9 times
\B Ensure position does not match where a word boundary matches (this is used but may be dropped depending on usage - I added it because it seems the number you're trying to get is immediately followed by a word character and not followed by a space - if that's not always the case just remove this token)
\b\d{0,9}(?=\.) Match the following
\b Assert position as a word boundary
\d{0,9} Match any digit 0-9 times
(?=\.) Positive lookahead ensuring a literal dot . follows
Just my alternative solution to RegEx :)
This finds the first occurence of 6 consecutive digits, omitting blanks and periods... although there are probably some more issues with using IsNumeric as I believe a lowercase e is considered acceptable by it...
Sub FindTheNumber()
For i = 1 To Len(Range("A1").Value)
If IsNumeric(Mid(Range("A1").Value, i, 6)) = True And InStr(Mid(Range("A1").Value, i, 6), " ") = 0 And InStr(Mid(Range("A1").Value, i, 6), ".") = 0 Then
MyNumber = Mid(Range("A1").Value, i, 6)
Debug.Print MyNumber
Exit For
End If
Next i
For i = 1 To Len(Range("A2").Value)
If IsNumeric(Mid(Range("A2").Value, i, 6)) = True And InStr(Mid(Range("A2").Value, i, 6), " ") = 0 And InStr(Mid(Range("A2").Value, i, 6), ".") = 0 Then
MyNumber = Mid(Range("A2").Value, i, 6)
Debug.Print MyNumber
Exit For
End If
Next i
End Sub
Examples:
Immediate window:

How to extract ad sizes from a string with excel regex

I am trying to extract ad sizes from string. The ad sizes are all set standard sizes. So while I'd prefer to have a regex that looks for a pattern, IE 3 numbers followed by 2 or 3 numbers, hard coding it will also work, since we know what the sizes will be. Here's an example of some of the ad sizes:
300x250
728x90
320x50
I was able to find some VBScript that I modified that almost works, but because my strings that I'm searching are inconsistent, it's pulling too much in some cases. For example:
You see how it's not matching correctly in every instance.
The VB code I found is actually matching everything EXCEPT that ad sizes. I don't know enough about VBScript to reverse it to just look for ad sizes and pull them. So instead it looks for all other text and removes it.
The code is below. Is there a way to fix the Regex so that it just returns the ad sizes?
Function getAdSize(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "([^300x250|728x90])"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = strPattern
End With
If regEx.Test(strInput) Then
getAdSize = regEx.Replace(strInput, strReplace)
Else
getAdSize = "Not matched"
End If
End If
End Function
NOTE, THE DATA IS NOT ALWAYS PRECEDED BY AN UNDERSCORE, SOMETIMES IT IS A DASH OR A SPACE BEFORE AND AFTER.
EDIT: Since it's not actually underscore delimited we can't use Split. We can however iterate over the string and extract the "#x#" manually. I have updated the code to reflect this and verified that it works successfully.
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
If IsNumeric(Temp) Then
Ad = Ad & Temp
Else
If Temp = "x" Then
Ad = Ad & Temp
Else
If Ad Like "*#x#*" Then
ExtractAdSize = Ad
Exit Function
Else
Ad = vbNullString
End If
End If
End If
Next i
End If
End Function
Alternate version of the same function using Select Case boolean logic instead of nested If statements:
Public Function ExtractAdSize(ByVal arg_Text As String) As String
Dim i As Long
Dim Temp As String
Dim Ad As String
If arg_Text Like "*#x#*" Then
For i = 1 To Len(arg_Text) + 1
Temp = Mid(arg_Text & " ", i, 1)
Select Case Abs(IsNumeric(Temp)) + Abs((Temp = "x")) * 2 + Abs((Ad Like "*#x#*")) * 4
Case 0: Ad = vbNullString 'Temp is not a number, not an "x", and Ad is not valid
Case 1, 2, 5: Ad = Ad & Temp 'Temp is a number or an "x"
Case 4, 6: ExtractAdSize = Ad 'Temp is not a number, Ad is valid
Exit Function
End Select
Next i
End If
End Function
I have managed to make about 95% of the required answer - the RegEx below will remove the DDDxDD size and would return the rest.
Option Explicit
Public Function regExSampler(s As String) As String
Dim regEx As Object
Dim inputMatches As Object
Dim regExString As String
Set regEx = CreateObject("VBScript.RegExp")
With regEx
.Pattern = "(([0-9]+)x([0-9]+))"
.IgnoreCase = True
.Global = True
Set inputMatches = .Execute(s)
If regEx.test(s) Then
regExSampler = .Replace(s, vbNullString)
Else
regExSampler = s
End If
End With
End Function
Public Sub TestMe()
Debug.Print regExSampler("uni3uios3_300x250_ASDF.html")
Debug.Print regExSampler("uni3uios3_34300x25_ASDF.html")
Debug.Print regExSampler("uni3uios3_8x4_ASDF.html")
End Sub
E.g. you would get:
uni3uios3__ASDF.html
uni3uios3__ASDF.html
uni3uios3__ASDF.html
From here you can continue trying to find a way to reverse the display.
Edit:
To go from the 95% to the 100%, I have asked a question here and it turns out that the conditional block should be changed to the following:
If regEx.test(s) Then
regExSampler = InputMatches(0)
Else
regExSampler = s
End If
This formula could work if it's always 3 characters, then x, and it's always between underscores - adjust accordingly.
=iferror(mid(A1,search("_???x*_",A1)+1,search("_",A1,search("_???x*_",A1)+1)-(search("_???x*_",A1)+1)),"No match")

Excel VBA Regex Function That Returns Multiple Matches Into A Single Cell

I need help getting my Excel function to work. The goal is to run an in-cell function that extracts all pattern matches of a regex function from the input of another cell into one cell, not an array of cells.
I have tried this using an array which returns two matches in the function dialogue box preview but only outputs the first match in the cell. I have also tried using a collection but had no luck with that.
Here is my current code and a sample of text that would be used as the function's string input:
Function RegexMatches(strInput As String) As Variant
Dim rMatch As Match
Dim arrayMatches
Dim i As Long
arrayMatches = Array()
With New RegExp
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "(Code)[\s][\d]{2,4}"
For Each rMatch In .Execute(strInput)
ReDim Preserve arrayMatches(i)
arrayMatches(i) = rMatch.Value
i = i + 1
Next
End With
RegexMatches = arrayMatches
End Function
Sample strInput from an Excel cell:
Code 123 some random text
goes here and continues to the next line
Code 4567 followed by more text
including new lines not just wrapped text
The desired output from this function would be both (2) matched values from the regex function into a single cell (e.g. "Code 123 Code 4567").
Any help is greatly appreciated!
Looks like you missed off the end of your function (as per Mat's Mug's comment)? Try this (which is as per Wiktor's comment).
Edit: amended in light of Mat's Mug's suggestion.
Function RegexMatches(strInput As String) As String
Dim rMatch As Object
Dim s As String
Dim arrayMatches()
Dim i As Long
With New RegExp
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "(Code)[\s][\d]{2,4}"
If .test(strInput) Then
For Each rMatch In .Execute(strInput)
ReDim Preserve arrayMatches(i)
arrayMatches(i) = rMatch.Value
i = i + 1
's = s & " " & rMatch
Next
End If
End With
RegexMatches = Join(arrayMatches, " ")
End Function

Using VBA to parse and split a string with wildcards?

I've got a sheet that contains item numbers of alphanumeric characters, and a bunch of other information in the row. Sometimes, similar items are combined into one row, and the difference on the item number will be shown with (X/Y) to choose which character to use at that point in the item number (not just X or Y, can be any alphanumeric character). In other words, these entries will look like this:
AB(X/Y)CD123
What I need is a way to separate that into the two item numbers ABXCD123 and ABYCD123. After that I'll have to create a row below the current one and copy the current row into it, with the changed item number, but that part is easy. I've tried using InStr to get the (X/Y) flagged, but I don't know how to pull out the X and Y characters to make new strings with them. I also don't know if a wildcard will work with InStr, and I'm not too familiar with RegEx.
Any ideas?
Here is s little introduction to regex¹ in a UDF².
Function partNums(str As String, _
Optional num As Integer = 1)
Dim tmp As String
Static rgx As Object
'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
If rgx Is Nothing Then
Set rgx = CreateObject("VBScript.RegExp")
End If
partNums = vbNullString
With rgx
.Global = True
.IgnoreCase = True
.MultiLine = False
.Pattern = "\([A-Z]{1}/[A-Z]{1}\)"
If .Test(str) Then
tmp = .Execute(str)(0)
Select Case num
Case 2
tmp = Mid(tmp, 4, 1)
Case Else
tmp = Mid(tmp, 2, 1)
End Select
partNums = .Replace(str, tmp)
End If
End With
End Function
In B2:B3 as,
=partNums(A2)
=partNums(A3,2)
            
Here is a largely duplicated UDF that handles from 1 to 3 characters.
Function partNums(str As String, _
Optional num As Integer = 1)
Dim tmp As String
Static rgx As Object
'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
If rgx Is Nothing Then
Set rgx = CreateObject("VBScript.RegExp")
End If
partNums = vbNullString
With rgx
.Global = True
.IgnoreCase = True
.MultiLine = False
.Pattern = "\([A-Z]{1,3}/[A-Z]{1,3}\)"
If .Test(str) Then
tmp = .Execute(str)(0)
tmp = Split(Replace(Replace(tmp, Chr(40), vbNullString), Chr(41), vbNullString), Chr(47))(num - 1)
partNums = .Replace(str, tmp)
End If
End With
End Function
            
¹ regex questions can usually be answered by the solutions in How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops.
² A User Defined Function (aka UDF) is placed into a standard module code sheet. Tap Alt+F11 and when the VBE opens, immediately use the pull-down menus to Insert ► Module (Alt+I,M). Paste the function code into the new module code sheet titled something like Book1 - Module1 (Code). Tap Alt+Q to return to your worksheet(s).