I need some code to replace out a token if block such as
myText = "some text. [IfIsFile:True]this is a file[EndIfIsFile:True] [IfIsFile:False]this is not a file[EndIfIsFile:False]. more text"
I am happy to call each if block separately (true then false)
I need to send in the
function myReplace(text, token, Boolean if leave text or blank it) as string
I may call it as
myReplace(text, "[IfIsFile:True]", true)
and then
myReplace(text, "[IfIsFile:False]", false)
results would be
"some text. this is a file. more text"
ok I thought this would be much harder but I was able to write it up. if anyone wants to post improved code - pls do
Protected Function ReplaceIfToken(ByVal sText As String, ByVal startToken As String, ByVal endToken As String, ByVal bLeave As Boolean) As String
Dim str As String = sText
Dim iStart As Integer = 0
Dim iEnd As Integer = 0
iStart = InStr(str, startToken)
iEnd = InStr(str, endToken)
While iStart > 0
Dim fullTokenText As String = str.Substring(iStart - 1, iEnd + endToken.Length - iStart)
Dim fullTokenTextwoToken As String = fullTokenText.Replace(startToken, "").Replace(endToken, "")
If bLeave Then
str = str.Replace(fullTokenText, fullTokenTextwoToken)
Else
str = str.Replace(fullTokenText, "")
End If
iStart = InStr(str, startToken)
iEnd = InStr(str, endToken)
End While
Return str
End Function
Related
First of all, sorry for the long title. I just don't know how to put it succinctly. I am trying to do this in VBA as normal Excel will not cut it.
Basically, I have a column. Each cells may contain data in the format of something like
flat 10-14;Flat 18-19;unit 7-9;flat A-D;ABC;DEF;
What I need is to find the string that has "-" in it, and attempt to replace it with anything in between. so the above code will become
Flat 10, Flat 11; Flat 12, Flat 14;Flat 18, Flat 19;Unit 7, Unit 8, Unit 9;Flat A, Flat B, Flat C; ABC;DEF;
With the help of this article on RegExpression, I have managed to work out how to expand the bits of data with number, which I will post the code below. However, I don't know a good way to expand the data with the letter. i.e from Flat A-C to Flat A, Flat B, Flat C
My code is below, please feel free to give any pointers if you think it can be more efficient. I am very much an amateur at this. Thank you in advance.
Sub CallRegEx()
Dim r As Match
Dim mcolResults As MatchCollection
Dim strInput As String, strPattern As String
Dim test As String, StrOutput As String, prefix As String
Dim startno As Long, endno As Long
Dim myrange As Range
strPattern = "(Flat|Unit) [0-9]+-+[0-9]+"
With Worksheets("Sheet1")
lrow = .Cells(Rows.Count, 9).End(xlUp).Row
For Each x In .Range("A2:A" & lrow)
strInput = Range("A" & x.Row).Value
Set mcolResults = RegEx(strInput, strPattern, True, , True)
If Not mcolResults Is Nothing Then
StrOutput = strInput
For Each r In mcolResults
startno = Mid(r, (InStr(r, "-") - 2), 2)
endno = Mid(r, (InStr(r, "-") + 1))
prefix = Mid(r, 1, 4)
test = ""
For i = startno To endno - 1
test = test & prefix & " " & i & ","
Next i
test = test & prefix & " " & endno
'this is because I don't want the comma at the end of the last value
StrOutput = Replace(StrOutput, r, test)
Debug.Print r ' remove in production
Next r
End If
.Range("D" & x.Row).Value = StrOutput
Next x
End With
End Sub
This function below is to support the Sub above
Function RegEx(strInput As String, strPattern As String, _
Optional GlobalSearch As Boolean, Optional MultiLine As Boolean, _
Optional IgnoreCase As Boolean) As MatchCollection
Dim mcolResults As MatchCollection
Dim objRegEx As New RegExp
If strPattern <> vbNullString Then
With objRegEx
.Global = GlobalSearch
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Pattern = strPattern
End With
If objRegEx.test(strInput) Then
Set mcolResults = objRegEx.Execute(strInput)
Set RegEx = mcolResults
End If
End If
End Function
Letters have character codes that are ordinal (A < B < C ...) & these can be accessed via asc()/chr$() - here is one way to do it:
inputStr = "flat 10-14;Flat 18-19;unit 7-9;flat A-D;ABC;DEF;flat 6;flat T"
Dim re As RegExp: Set re = New RegExp
re.Pattern = "(flat|unit)\s+((\d+)-(\d+)|([A-Z])-([A-Z]))"
re.Global = True
re.IgnoreCase = True
Dim m As MatchCollection
Dim start As Variant, fin As Variant
Dim tokens() As String
Dim i As Long, j As Long
Dim isDigit As Boolean
tokens = Split(inputStr, ";")
For i = 0 To UBound(tokens) '// loop over tokens
Set m = re.Execute(tokens(i))
If (m.Count) Then
With m.Item(0)
start = .SubMatches(2) '// first match number/letter
isDigit = Not IsEmpty(start) '// is letter or number?
If (isDigit) Then '// number
fin = .SubMatches(3)
Else '// letter captured as char code
start = Asc(.SubMatches(4))
fin = Asc(.SubMatches(5))
End If
tokens(i) = ""
'// loop over items
For j = start To fin
tokens(i) = tokens(i) & .SubMatches(0) & " " & IIf(isDigit, j, Chr$(j)) & ";"
Next
End With
ElseIf i <> UBound(tokens) Then tokens(i) = tokens(i) & ";"
End If
Next
Debug.Print Join(tokens, "")
flat 10;flat 11;flat 12;flat 13;flat 14;Flat 18;Flat 19;unit 7;unit 8;unit 9;flat A;flat B;flat C;flat D;ABC;DEF;flat 6;flat T
I have code to import email body data from Outlook to Excel. I only need Name, ID, code from the email.
I have done everything except to extract the ID from a fixed sentence:
cn=SVCLMCH,OU=Users,OU=CX,DC=dm001,DC=corp,DC=dcsa,DC=com
The id is SVCLMCH in this case, that means I need to extract the text between "cn=" and ",OU=Users".
Sub import_code()
Dim O As Outlook.Application
Set O = New Outlook.Application
Dim ONS As Outlook.Namespace
Set ONS = O.GetNamespace("MAPI")
Dim OMAIL As Outlook.MailItem
Set OMAIL = Nothing
Dim ws As Object
Set ws = ThisWorkbook.Worksheets("Import code from Outlook")
Dim rcount As Long
Dim vText As Variant
Dim sText As String
Dim i As Long
If O.ActiveExplorer.Selection.Count = 0 Then
msgbox "No Items selected!", vbCritical, "Error"
End If
On Error Resume Next
'Process each selected record
rcount = ws.UsedRange.Rows.Count
For Each OMAIL In O.ActiveExplorer.Selection
sText = OMAIL.Body
vText = Split(sText, Chr(13))
'Find the next empty line of the worksheet
rcount = rcount + 1
'Check each line of text in the message body
For i = UBound(vText) To 0 Step -1
If InStr(1, vText(i), "Password Generated and set for:") > 0 Then
vItem = Split(vText(i), Chr(58))
ws.Range("A" & rcount) = Trim(vItem(1))
End If
If InStr(1, vText(i), "cn=") > 0 Then
vItem = Split(vText(i), Chr(58))
ws.Range("b" & rcount) = Trim(vItem(1))
End If
If InStr(1, vText(i), "Password:") > 0 Then
vItem = Split(vText(i), Chr(58))
ws.Range("c" & rcount) = Trim(vItem(1))
End If
Next i
Next OMAIL
End Sub
The trick here is to use the Split() function
Dim Arr() As String
Dim j As Integer
Dim k As Integer
Dim strvar As String
Dim strval As String
Dim strID As String
If InStr(1, vtext(i), "cn=") > 0 Then
' split the whole line in an array - "," beeing the value separator
Arr = Split(vtext(i), ",")
' loop through all array elements
For j = 0 To UBound(r) - 1
' find the position of =
k = InStr(Arr(j), "=")
strvar = Left(Arr(j), k - 1) ' this is the variable name, ie "cn"
strval = Mid(Arr(j), k + 1) ' this is the value, ie "SVCLMCH"
' now do what you want with a specific variable
Select Case strvar
Case "cn"
strID = strval
Case Else
' do nothing
End Select
Next j
End If
you can use a helper function like this:
Function GetID(strng As String)
Dim el As Variant
For Each el In Split(strng, ",")
If InStr(1, el, "cn=") > 0 Then
GetID = Mid(el, InStr(1, el, "cn=") + 3)
Exit Function
End If
Next
End Function
and your main code would exploit it as:
If InStr(1, vText(i), "cn=") > 0 Then ws.Range("b" & rcount) = GetID(CStr(vText(i)))
Use Regular Expression extract the ID from the sentence
Example Pattern = "(?<=cn=)([^.]+)(?=\,OU=Users)"
https://regex101.com/r/67u84s/2
Code Example
Option Explicit
Private Sub Examplea()
Dim Matches As Variant
Dim RegEx As Object
Set RegEx = CreateObject("VbScript.RegExp")
Dim olApp As Object
Set olApp = CreateObject("Outlook.Application")
Dim Item As Outlook.MailItem
Set Item = olApp.ActiveExplorer.Selection.Item(1)
Dim Pattern As String
Pattern = "(?<=cn=)([^.]+)(?=\,OU=Users)"
With RegEx
.Global = False
.Pattern = Pattern
.IgnoreCase = True
Set Matches = .Execute(Item.Body)
End With
If Matches.Count > 0 Then
Debug.Print Matches(0).SubMatches(0)
With ThisWorkbook.Sheets("Sheet1")
.Range("A1").Value = Trim(Matches(0).SubMatches(0))
End With
End If
End Sub
I have a macro that send an XMLHTTP request to a server and it gets as response a plain text string, not a JSON format string or other standard formats (at least for what I know).
I would like to parse the output string in order to access the data in an structured approach in the same fashion as the parseJson subroutine in this link
My problem is I am not good with regular expressions and I am not able to modify the routine for my needs.
The string that I need to parse has the following structure:
The string is a single line
Each single parameter is defined by its parameter name the equal simbol, its value and ending with; "NID=3;" or "SID=Test;"
Parameter can be collected in "structures" starts and end with the symbol | and they are identified with their name followed by ; such as |STEST;NID=3;SID=Test;|
A structure can contain also other structures
An example of a output string is the following
|KC;|AD;PE=5;PF=3;|CD;PE=5;HP=test;|CD;PE=3;HP=abc;|
In this case there is a macro structure KC which contains a structure AD. The structure AD is composed by the parameters PE, PF and 2 structures CD. And finaly the structures CD have the parameters PE and HP
So I would like to parse the string to obtain an Object/Dictionary that reflects this structure, can you help me?
Adds after the first answers
Hi all, thank you for your help, but I think I should make more clear the output that I would like to get.
For the example string that I have, I would like to have an object with the following structure:
<KC>
<AD>
<PE>5</PE>
<PF>3</PF>
<CD>
<PE>5</PE>
<HP>test</HP>
</CD>
<CD>
<PE>3</PE>
<HP>abc</HP>
</CD>
</AD>
</KC>
So I started to wrote a possible working code base on some hint from #Nvj answer and the answer in this link
Option Explicit
Option Base 1
Sub Test()
Dim strContent As String
Dim strState As String
Dim varOutput As Variant
strContent = "|KC;|AD;PE=5;PF=3;|CD;PE=5;HP=test;|CD;PE=3;HP=abc;|"
Call ParseString(strContent, varOutput, strState)
End Sub
Sub ParseString(ByVal strContent As String, varOutput As Variant, strState As String)
' strContent - source string
' varOutput - created object or array to be returned as result
' strState - Object|Array|Error depending on processing to be returned as state
Dim objTokens As Object
Dim lngTokenId As Long
Dim objRegEx As Object
Dim bMatched As Boolean
Set objTokens = CreateObject("Scripting.Dictionary")
lngTokenId = 0
Set objRegEx = CreateObject("VBScript.RegExp")
With objRegEx
.Global = True
.MultiLine = True
.IgnoreCase = True
.Pattern = "\|[A-Z]{2};" 'Pattern for the name of structures
Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "str"
.Pattern = "[A-Z]{2}=[^\|=;]+;" 'Pattern for parameters name and values
Tokenize objTokens, objRegEx, strContent, lngTokenId, bMatched, "par"
End With
End Sub
Sub Tokenize(objTokens, objRegEx, strContent, lngTokenId, bMatched, strType)
Dim strKey As String
Dim strKeyPar As String
Dim strKeyVal As String
Dim strWork As String
Dim strPar As String
Dim strVal As String
Dim strLevel As String
Dim strRes As String
Dim lngCopyIndex As Long
Dim objMatch As Object
strRes = ""
lngCopyIndex = 1
With objRegEx
For Each objMatch In .Execute(strContent)
If strType = "str" Then
bMatched = True
With objMatch
strWork = Replace(.Value, "|", "")
strWork = Replace(strWork, ";", "")
strLevel = get_Level(strWork)
strKey = "<" & lngTokenId & strLevel & strType & ">"
objTokens(strKey) = strWork
strRes = strRes & Mid(strContent, lngCopyIndex, .FirstIndex - lngCopyIndex + 1) & strKey
lngCopyIndex = .FirstIndex + .Length + 1
End With
lngTokenId = lngTokenId + 1
ElseIf strType = "par" Then
strKeyPar = "<" & lngTokenId & "par>"
strKeyVal = "<" & lngTokenId & "val>"
strKey = strKeyPar & strKeyVal
bMatched = True
With objMatch
strWork = Replace(.Value, ";", "")
strPar = Split(strWork, "=")(0)
strVal = Split(strWork, "=")(1)
objTokens(strKeyPar) = strPar
objTokens(strKeyVal) = strVal
strRes = strRes & Mid(strContent, lngCopyIndex, .FirstIndex - lngCopyIndex + 1) & strKey
lngCopyIndex = .FirstIndex + .Length + 1
End With
lngTokenId = lngTokenId + 2
End If
Next
strContent = strRes & Mid(strContent, lngCopyIndex, Len(strContent) - lngCopyIndex + 1)
End With
End Sub
Function get_Level(strInput As String) As String
Select Case strInput
Case "KC"
get_Level = "L1"
Case "AD"
get_Level = "L2"
Case "CD"
get_Level = "L3"
Case Else
MsgBox ("Error")
End
End Select
End Function
This function creates a dictionary with an item for each structure name, parameter name and parameter value as shown in the figure
Thanks to the function get_Level the items associated to structures have a level that should help to preserve the original hierarchy of the data.
So what I am missing is a function to create an object that has the original structure of the input string. This is what the Retrieve function do in this answer link, but I do not know how to adapt it to my case
This looks like a simple nested delimited string. A couple of Split() functions will do the trick:
Option Explicit
Function parseString(str As String) As Collection
Dim a1() As String, i1 As Long, c1 As Collection
Dim a2() As String, i2 As Long, c2 As Collection
Dim a3() As String
a1 = Split(str, "|")
Set c1 = New Collection
For i1 = LBound(a1) To UBound(a1)
If a1(i1) <> "" Then
Set c2 = New Collection
a2 = Split(a1(i1), ";")
For i2 = LBound(a2) To UBound(a2)
If a2(i2) <> "" Then
a3 = Split(a2(i2), "=")
If UBound(a3) > 0 Then
c2.Add a3(1), a3(0)
ElseIf UBound(a3) = 0 Then
c2.Add a3(0)
End If
End If
Next i2
c1.Add c2
End If
Next i1
Set parseString = c1
End Function
Sub testParseString()
Dim c As Collection
Set c = parseString("|KC;|AD;PE=5;PF=3;|CD;PE=5;HP=test;|CD;PE=3;HP=abc;|")
Debug.Assert c(1)(1) = "KC"
Debug.Assert c(2)("PE") = "5"
Debug.Assert c(3)(1) = "CD"
Debug.Assert c(4)("HP") = "abc"
Debug.Assert c(4)(3) = "abc"
End Sub
Note that you can address values by both, index and key (if key existed in the input). If key was not provided you can only access the value by its index. You can also iterate collection recursively to get all the values in a tree structure.
Food for thought: since your structures may have repeated names (in your case "CD" structure happens twice) Collections / Dictionaries would find it problematic to store this elegantly (due to key collisions). Another good way to approach this is to create an XML structure with DOMDocument and use XPath to access its elements. See Program with DOM in Visual Basic
UPDATE: I've added XML example below as well. Have a look.
Here is another take on your string parsing issue using DOMDocument XML parser. You need to include Microsoft XML, v.6.0 in your VBA references.
Function parseStringToDom(str As String) As DOMDocument60
Dim a1() As String, i1 As Long
Dim a2() As String, i2 As Long
Dim a3() As String
Dim dom As DOMDocument60
Dim rt As IXMLDOMNode
Dim nd As IXMLDOMNode
Set dom = New DOMDocument60
dom.async = False
dom.validateOnParse = False
dom.resolveExternals = False
dom.preserveWhiteSpace = True
Set rt = dom.createElement("root")
dom.appendChild rt
a1 = Split(str, "|")
For i1 = LBound(a1) To UBound(a1)
If a1(i1) <> "" Then
a2 = Split(a1(i1), ";")
Set nd = dom.createElement(a2(0))
For i2 = LBound(a2) To UBound(a2)
If a2(i2) <> "" Then
a3 = Split(a2(i2), "=")
If UBound(a3) > 0 Then
nd.appendChild dom.createElement(a3(0))
nd.LastChild.Text = a3(1)
End If
End If
Next i2
rt.appendChild nd
End If
Next i1
Set parseStringToDom = dom
End Function
Sub testParseStringToDom()
Dim dom As DOMDocument60
Set dom = parseStringToDom("|KC;|AD;PE=5;PF=3;|CD;PE=5;HP=test;|CD;PE=3;HP=abc;|")
Debug.Assert Not dom.SelectSingleNode("/root/KC") Is Nothing
Debug.Assert dom.SelectSingleNode("/root/AD/PE").Text = "5"
Debug.Assert dom.SelectSingleNode("/root/CD[1]/HP").Text = "test"
Debug.Assert dom.SelectSingleNode("/root/CD[2]/HP").Text = "abc"
Debug.Print dom.XML
End Sub
As you can see this converts your text into an XML DOM document preserving all the structures and allowing for duplicates in naming. You can then use XPath to access any node or value. This can also be extended to have more nesting levels and further structures.
This is the XML document it creates behind the scenes:
<root>
<KC/>
<AD>
<PE>5</PE>
<PF>3</PF>
</AD>
<CD>
<PE>5</PE>
<HP>test</HP>
</CD>
<CD>
<PE>3</PE>
<HP>abc</HP>
</CD>
</root>
I've started to write a parser in VBA for the string structure specified by you, and it's not complete, but I'll post it anyways. Maybe you can pick up some ideas from it.
Sub ParseString()
Dim str As String
str = "|KC;|AD;PE=5;PF=3;|CD;PE=5;HP=test;|CD;PE=3;HP=abc;|"
' Declare an object dictionary
' Make a reference to Microsoft Scripting Runtime in order for this to work
Dim dict As New Dictionary
' If the bars are present in the first and last character of the string, replace them
str = Replace(str, "|", "", 1, 1)
If (Mid(str, Len(str), 1) = "|") Then
str = Mid(str, 1, Len(str) - 1)
End If
' Split the string by bars
Dim substring_array() As String
substring_array = Split(str, "|")
' Declare a regex object
' Check the reference to Microsoft VBScript Regular Expressions 5.5 in order for this to work
Dim regex As New RegExp
With regex
.Global = True
.IgnoreCase = True
.MultiLine = True
End With
' Object to store the regex matches
Dim matches As MatchCollection
Dim param_name_matches As MatchCollection
Dim parameter_value_matches As MatchCollection
' Define some regex patterns
pattern_for_structure_name = "^[^=;]+;"
pattern_for_parameters = "[^=;]+=[^=;]+;"
pattern_for_parameter_name = "[^=;]="
pattern_for_parameter_val = "[^=;];"
' Loop through the elements of the array
Dim i As Integer
For i = 0 To UBound(substring_array) - LBound(substring_array)
' Get the array element in a string
str1 = substring_array(i)
' Check if it contains a structure name
regex.Pattern = pattern_for_structure_name
Set matches = regex.Execute(str1)
If matches.Count = 0 Then
' This substring does not contain a structure name
' Check if it contains parameters
regex.Pattern = pattern_for_parameter
Set matches = regex.Execute(matches(0).Value)
If matches.Count = 0 Then
' There are no parameters as well as no structure name
' This means the string had || - invalid string
MsgBox ("Invalid string")
Else
' The string contains parameter names
' Add each parameter name to the dictionary
Dim my_match As match
For Each my_match In matches
' Get the name of the parameter
regex.Pattern = pattern_for_parameter_name
Set parameter_name_matches = regex.Execute(my_match.Value)
' Check if the above returned any matches
If parameter_name_matches.Count = 1 Then
' Remove = sign from the parameter name
parameter_name = Replace(parameter_name_matches(0).Value, "=", "")
' Get the value of the parameter
regex.Pattern = pattern_for_parameter_value
Set parameter_value_matches = regex.Execute(my_match.Value)
' Check if the above returned any matches
If parameter_value_matches.Count = 1 Then
' Get the value
parameter_value = Replace(parameter_value_matches(0).Value, ";", "")
' Add the parameter name and value as a key pair to the Dictionary object
dict.Item(parameter_name) = parameter_value
Else
' Number of matches is either 0 or greater than 1 - in both cases the string is invalid
MsgBox ("Invalid string")
End If
Else
' Parameter name did not match - invalid string
MsgBox ("Invalid string")
End If
Next
End If
ElseIf matches.Count = 1 Then
' This substring contains a single structure name
' Check if it has parameter names
Else
' This substring contains more than one structure name - the original string is invalid
MsgBox ("Invalid string")
End If
Next i
End Sub
I am looking for a code to test date Format, the date should be in one of these formats
year : 13xx - 20xx
month: xx,x
day: xx,x
the hole date would be on of the following
2012/1/1
2012/01/01
2012/1/01
2012/01/1
I tried the following
Option Explicit
Sub ttt()
MsgBox (testDate("2012/01/01"))
End Sub
Function testDate(strDateToBeTested As String) As Boolean
Dim regularExpression, match
Set regularExpression = CreateObject("vbscript.regexp")
testDate = False
'regularExpression.Pattern = "(14|13|19|20)[0-9]{2}[- /.]([0-9]{1,2})[- /.]([0-9]{1,2})"
'regularExpression.Pattern = "(\d\d\d\d)/(\d|\d\d)/(\d|/dd)"
regularExpression.Pattern = "([0-9]{4}[ /](0[1-9]|[12][0-9]|3[01])[ /](0[1-9]|1[012]))"
regularExpression.Global = True
regularExpression.MultiLine = True
If regularExpression.Test(strDateToBeTested) Then
' For Each match In regularExpression.Execute(strDateToBeTested)
If Len(strDateToBeTested) < 10 Then
testDate = True
' Exit For
End If
'End If
End If
Set regularExpression = Nothing
End Function
The more and more I thought about this (and some research), the more I figured that regex is not the best solution to this format problem. Combining a couple of other ideas (with the ReplaceAndSplit function attributed to the owner), this is what I came up with.
Option Explicit
Sub ttt()
Dim dateStr() As String
Dim i As Integer
dateStr = Split("2012/1/1,2012/01/01,2012/1/01,2012/01/1,1435/2/2," & _
"1435/02/02,1900/07/07,1435/02/02222222,2015/Jan/03", ",")
For i = 1 To UBound(dateStr)
Debug.Print "trying '" & dateStr(i) & "' ... " & testDate(dateStr(i))
Next i
End Sub
Function testDate(strDateToBeTested As String) As Boolean
Dim dateParts() As String
Dim y, m, d As Long
dateParts = ReplaceAndSplit(strDateToBeTested, "/.-")
testDate = False
If IsNumeric(dateParts(0)) Then
y = Int(dateParts(0))
Else
Exit Function
End If
If IsNumeric(dateParts(1)) Then
m = Int(dateParts(1))
Else
Exit Function
End If
If IsNumeric(dateParts(2)) Then
d = Int(dateParts(2))
Else
Exit Function
End If
If (y >= 1435) And (y < 2020) Then 'change or remove the upper limit as needed
If (m >= 1) And (m <= 12) Then
If (d >= 1) And (d <= 30) Then
testDate = True
End If
End If
End If
End Function
'=======================================================
'ReplaceAndSplit by alainbryden, optimized by aikimark
'Uses the native REPLACE() function to replace all delimiters with a common
'delimiter, and then splits them based on that.
'=======================================================
Function ReplaceAndSplit(ByRef Text As String, ByRef DelimChars As String) As String()
Dim DelimLen As Long, Delim As Long
Dim strTemp As String, Delim1 As String, Arr() As String, ThisDelim As String
strTemp = Text
Delim1 = Left$(DelimChars, 1)
DelimLen = Len(DelimChars)
For Delim = 2 To DelimLen
ThisDelim = Mid$(DelimChars, Delim, 1)
If InStr(strTemp, ThisDelim) <> 0 Then _
strTemp = Replace(strTemp, ThisDelim, Delim1)
Next
ReplaceAndSplit = Split(strTemp, Delim1)
End Function
I'm trying to write a function that takes in a string, parses it, and returns another string that summarizes the number of consecutive alpha or numeric characters in the original string.
For example, the string 999aa45bbx would return 3N2A2N3A,
i.e.
3 numbers,
followed by 2 alpha,
by 2 numbers,
by 3 alpha.
I'm using the function to analyze formats of insurance policy ID numbers. So far, I've found solutions online that extract either alpha or numeric characters, but nothing that describes the format or order in which these characters exist in the original string.
Can anyone help?
A regexp like this will do the job
press altf11 together to go the VBE
Insert Module
copy and paste the code below
press altf11 together to go back to Excel
then you can use the function (which also detects invalid strings) within Excel, ie in B1
=AlphaNumeric(A1)
Function AlphaNumeric(strIn As String) As String
Dim objRegex As Object
Dim objRegMC As Object
Dim objRegM As Object
Dim strOut As String
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Global = True
.ignorecase = True
.Pattern = "[^\w]"
If .test(strIn) Then
AlphaNumeric = "One or more characters is invalid"
Else
.Pattern = "(\d+|[a-z]+)"
Set objRegMC = .Execute(strIn)
For Each objRegM In objRegMC
strOut = strOut & (objRegM.Length & IIf(IsNumeric(objRegM), "N", "A"))
Next
AlphaNumeric = strOut
End If
End With
End Function
Old school, looping through all characters in the string:
Function IdentifyCharacterSequences(s As String) As String
Dim i As Long
Dim charCounter As Long
Dim currentCharType As String
Dim sOut As String
sOut = ""
charCounter = 1
currentCharType = CharType(Mid(s, 1, 1))
For i = 2 To Len(s)
If (Not CharType(Mid(s, i, 1)) = currentCharType) Or (i = Len(s)) Then
sOut = sOut & charCounter & currentCharType
currentCharType = CharType(Mid(s, i, 1))
charCounter = 1
Else
charCounter = charCounter + 1
End If
Next i
IdentifyCharacterSequences = sOut
End Function
This uses the following helper function. Note that non-alphanumeric characters are identified using the letter "X". You can easily modify this to suit your purposes.
Function CharType(s As String) As String
If s Like "[A-z]" Then
CharType = "A"
ElseIf s Like "[0-9]" Then
CharType = "N"
Else
CharType = "X"
'Or raise an error if non-alphanumerical chars are unacceptable.
End If
End Function
Usage example: