Excel VBA + Regular Expression - regex

Ok, to start. I'm a little rusty on VBA, 3 + years since Ive need to use it.
In short, im struggling to extract text from a string. Im using regular expression to extract my department name and date from this string.
The Department will always fall between : and -.
I can't share the document due to security. But, I can explain the format and hopefully we can work from that.
Col A----Col B----Col C---Col D
Date(e)--Dept(e)--String--Duration
Where (e) means it was extracted from the string.
My code for the extraction, thus far, is below. Currently it will loop through all available rows and extract the department, but it always take the : and - with it! I can't seem to find a way to cut these out.
Any assistance?
I can probably work out the date bit eventually.
The final output from this code is ": Inbound Contacts -"
Where I need, "Inbound Contacts".
Sub stringSearch()
Dim ws As Worksheet
Dim lastRow As Long, x As Long
Dim matches As Variant, match As Variant
Dim Reg_Exp As Object
Set Reg_Exp = CreateObject("vbscript.regexp")
Reg_Exp.Pattern = "\:\s(\w.+)\s\-"
Set ws = Sheet2
lastRow = ws.Range("C" & Rows.Count).End(xlUp).Row
For x = 1 To lastRow
Set matches = Reg_Exp.Execute(CStr(ws.Range("C" & x).Value))
If matches.Count > 0 Then
For Each match In matches
ws.Range("B" & x).Value = match.Value
Next match
End If
Next x
End Sub

This is how to achieve what you want without regex, in general it should be a bit faster and way more understandable:
Sub TestMe()
Dim inputString As String
inputString = "Planning Unit: Inbound Contacts = Tuesday, 27/03/2018"
Debug.Print Split(Split(inputString, ":")(1), "=")(0)
End Sub
split the inputString by : and take the second part;
split the taken part by = and take the first part;

You are not accessing Group 1 value.
Instead of ws.Range("B" & x).Value = match.Value use
ws.Range("B" & x).Value = match.Submatches(0)
You may also enhance the regex a bit to
Reg_Exp.Pattern = ":\s*(\w.*?)\s*-"
This way, you will "trim" the Group 1 value. See the regex demo.
Details
: - a : char
\s* - 0+ whitespace chars
(\w.*?) - Group 1 (.Submatches(0)): a word char followed with any 0+ chars (other than line break chars) as few as possible (NOTE that \w does not match non-ASCII letters, probably you want to match any char that is not whitespace and not a -, then use [^\s-] instead of \w)
\s* - 0+ whitespace chars
- - a hyphen.

Regex:
You can use this Regex: ([\s\S]+?):\s*([\s\S]+?)\s*-\s*([A-z]+)\s*,\s*([0-9]{2}\/[0-9]{2}\/[0-9]{4})\b
And the demo
Code:
And this code:
Sub stringSearch()
Dim ws As Worksheet
Dim lastRow As Long, x As Long
Dim matches As Variant, match As Variant
Dim Reg_Exp As Object
Set Reg_Exp = CreateObject("vbscript.regexp")
Reg_Exp.Pattern = "([\s\S]+?):\s*([\s\S]+?)\s*-\s*([A-z]+)\s*,\s*([0-9]{2}\/[0-9]{2}\/[0-9]{4})\b"
Set ws = Sheet2
lastRow = ws.Range("C" & Rows.Count).End(xlUp).Row
For x = 1 To lastRow
Set matches = Reg_Exp.Execute(CStr(ws.Range("C" & x).Value))
If matches.Count > 0 Then
For Each match In matches
For i = 0 To match.SubMatches.Count - 1
Debug.Print match.SubMatches(i)
Next i
Next match
End If
Next x
End Sub
Result
This is the result on the immediate window:
+-------------------+
| Planning Unit |
| Inbound Contracts |
| Tuesday |
| 27/03/2018 |
| Planning Unit |
| Payments & Orders |
| Tuesday |
| 27/03/2018 |
| Planning Unit |
| Scheduling |
| Tuesday |
| 27/03/2018 |
+-------------------+

I'd use Left/Right/Mid and InStr/InStrRev instead of RegEx in this case.
For extracting the department:
Dim mainStr As String
Dim deptStr As String
mainStr = "Planning Unit: Inbound Contacts - Tuesday, 27/03/2018"
deptStr = Mid(mainStr, InStr(mainStr, ":") + 2)
deptStr = Left(deptStr, InStr(deptStr, "-") - 2)
For extracting the date:
Dim mainStr As String
Dim dateStr As String
mainStr = "Planning Unit: Inbound Contacts - Tuesday, 27/03/2018"
dateStr = Right(mainStr, Len(mainStr) - InStrRev(mainStr, " "))
To be honest, this kind of situation is common enough that you might want to write some sort of "extractText" function to get the text between delimiters. Here's the one I use.
Function extractText(str As String, leftDelim As String, rightDelim As String, _
Optional reverseSearch As Boolean = False) As String
'Extracts text between two delimiters in a string
'By default, searches for first instance of each delimiter in string from left to right
'To search from right to left, set reverseSearch = True
'If left delimiter = "", function returns text up to right delimiter
'If right delimiter = "", function returns text after left delimiter
'If left or right delimiter not found in string, function returns empty string
Dim leftPos As Long
Dim rightPos As Long
Dim leftLen As Long
If reverseSearch Then
leftPos = InStrRev(str, leftDelim)
rightPos = InStrRev(str, rightDelim)
Else
leftPos = InStr(str, leftDelim)
rightPos = InStr(str, rightDelim)
End If
leftPos = IIf(leftDelim = "", -1, leftPos)
rightPos = IIf(rightDelim = "", -1, rightPos)
leftLen = Len(leftDelim)
If leftPos > 0 Then
If rightPos = -1 Then
extractText = Mid(str, leftPos + leftLen)
ElseIf rightPos > leftPos Then
extractText = Mid(str, leftPos + leftLen, rightPos - leftPos - leftLen)
End If
ElseIf leftPos = -1 Then
If rightPos > 0 Then
extractText = Left(str, rightPos - 1)
End If
End If
End Function

Related

Find '~XX~' within a string with specific values

I have classic ASP written in VBScript. I have a record pulled from SQL Server and the data is a string. In this string, I need to find text enclosed in ~12345~ and I need to replace with very specific text. Example 1 would be replaced with M, 2 would be replaced with A. I then need to display this on the web page. We don't know how many items will be enclosed with ~.
Example Data:
Group Pref: (To be paid through WIT)
~2.5~ % Quarterly Rebate - Standard Commercial Water Heaters
Display on webpage after:
Group Pref: (To be paid through WIT)
~A.H~ % Quarterly Rebate - Standard Commercial Water Heaters
I tried this following, but there are two many cases and this would be unrealistic to maintain. I does replace the text and display correctly.
dim strSearchThis
strSearchThis =(rsResults("PREF"))
set re = New RegExp
with re
.global = true
.pattern = "~[^>]*~"
strSearchThis = .replace(strSearchThis, "X")
end with
I am also trying this code, I can find the text contained between each ~ ~, but when displayed its the information between the ~ ~ is not changed:
dim strSearchThis
strSearchThis =(rsResults("PREF"))
Set FolioPrefData = New RegExp
FolioPrefData.Pattern = "~[^>]*~"
FolioPrefData.Global = True
FolioPrefData.IgnoreCase = True
'will contain all found instances of ~ ~'
set colmatches = FolioPrefData.Execute(strSearchThis)
Dim itemLength, found
For Each objMatch in colMatches
Select Case found
Case "~"
'ignore - doing nothing'
Case "1"
found = replace(strSearchThis, "M")
End Select
Next
response.write(strSearchThis)
You can do it without using Regular Expressions, just checking the individual characters and writing a function that handles the different cases you have. The following function finds your delimited text and loops through all characters, calling the ReplaceCharacter function defined further down:
Function FixString(p_sSearchString) As String
Dim iStartIndex
Dim iEndIndex
Dim iIndex
Dim sReplaceString
Dim sReturnString
sReturnString = p_sSearchString
' Locate start ~
iStartIndex = InStr(sReturnString, "~")
Do While iStartIndex > 0
' Look for end ~
iEndIndex = InStr(iStartIndex + 1, sReturnString, "~")
If iEndIndex > 0 Then
sReplaceString = ""
' Loop htrough all charatcers
For iIndex = iStartIndex + 1 To iEndIndex - 1
sReplaceString = sReplaceString & ReplaceCharacter(Mid(sReturnString, iIndex, 1))
Next
' Replace string
sReturnString = Left(sReturnString, iStartIndex) & sReplaceString & Mid(sReturnString, iEndIndex)
' Locate next ~
iStartIndex = InStr(iEndIndex + 1, sReturnString, "~")
Else
' End couldn't be found, exit
Exit Do
End If
Loop
FixString = sReturnString
End Function
This is the function where you will enter the different character substitutions you might have:
Function ReplaceCharacter(p_sCharacter) As String
Select Case p_sCharacter
Case "1"
ReplaceCharacter = "M"
Case "2"
ReplaceCharacter = "A"
Case Else
ReplaceCharacter = p_sCharacter
End Select
End Function
You can use this in your existing code:
response.write(FixString(strSearchThis))
You can also use a Split and Join method...
Const SEPARATOR = "~"
Dim deconstructString, myOutputString
Dim arrayPointer
deconstructString = Split(myInputString, SEPARATOR)
For arrayPointer = 0 To UBound(deconstructString)
If IsNumeric(deconstructString(arrayPointer)) Then
'Do whatever you need to with your value...
End If
Next 'arrayPointer
myOutputString = Join(deconstructString, "")
This does rely, obviously, on breaking a string apart and rejoining it, so there is a sleight overhead on string mutability issues.

Extract data between square brackets using Regex and VBA [duplicate]

In sentences like:
"[x] Alpha
[33] Beta"
I extract an array of bracketed data as ([x], [33])
using VBA regex Pattern:
"(\[x\])|(\[\d*\])"
I cannot extract directly the array of un-bracketed data as (x, 33)
using web resources advice for pattern
"(?<=\[)(.*?)(?=\])"
Is this a VBA specific problem (i.e. limits on its implementation of Regex)
or did I misunderstand 'looking forward and backward' patterns?
Public Function Regx( _
ByVal SourceString As String, _
ByVal Pattern As String, _
Optional ByVal IgnoreCase As Boolean = True, _
Optional ByVal MultiLine As Boolean = True, _
Optional ByVal MatchGlobal As Boolean = True) _
As Variant
Dim oMatch As Match
Dim arrMatches
Dim lngCount As Long
' Initialize to an empty array
arrMatches = Array()
With New RegExp
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Global = MatchGlobal
.Pattern = Pattern
For Each oMatch In .Execute(SourceString)
ReDim Preserve arrMatches(lngCount)
arrMatches(lngCount) = oMatch.Value
lngCount = lngCount + 1
Next
End With
Sub testabove()
Call Regx("[x] Alpha" & Chr(13) & _
"[33] Beta", "(\[x\])|(\[\d*\])")
End Sub
Use capturing around the subpatterns that will fetch you your required value.
Use
"\[(x)\]|\[(\d*)\]"
(or \d+ if you need to match at least 1 digit, as * means zero or more occurrences, and + means one or more occurrences).
Or, use the generic pattern to extract anything inside the square brackets without the brackets:
"\[([^\][]+)]"
Then, access the right Submatches index by checking the submatch length (since you have an alternation, either of the submatch will be empty), and there you go. Just change your for loop with
For Each oMatch In .Execute(SourceString)
ReDim Preserve arrMatches(lngCount)
If Len(oMatch.SubMatches(0)) > 0 Then
arrMatches(lngCount) = oMatch.SubMatches(0)
Else
arrMatches(lngCount) = oMatch.SubMatches(1)
End If
' Debug.Print arrMatches(lngCount) ' - This outputs x and 33 with your data
lngCount = lngCount + 1
Next
With Excel and VBA you can strip the brackets after the regex extraction:
Sub qwerty()
Dim inpt As String, outpt As String
Dim MColl As MatchCollection, temp2 As String
Dim regex As RegExp, L As Long
inpt = "38c6v5hrk[x]537fhvvb"
Set regex = New RegExp
regex.Pattern = "(\[x\])|(\[\d*\])"
Set MColl = regex.Execute(inpt)
temp2 = MColl(0).Value
L = Len(temp2) - 2
outpt = Mid(temp2, 2, L)
MsgBox inpt & vbCrLf & outpt
End Sub
Try this:
\[(x)\]|\[(\d*)\]
What you don't want to be captured, don't put them inside (). this is used for grouping
Explanation
You will get x and 33 in $1 and $2
Dot Net Sample
Alright, I prepared it for you , although far away from vb for long. Lots of it might be not needed, yet it might help you to understand it better
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim text As String = "[x] Alpha [33] Beta]"
Dim pattern As String = "\[(x)\]|\[(\d*)\]"
' Instantiate the regular expression object.
Dim r As Regex = new Regex(pattern, RegexOptions.IgnoreCase)
' Match the regular expression pattern against a text string.
Dim m As Match = r.Match(text)
Dim matchcount as Integer = 0
Do While m.Success
matchCount += 1
Console.WriteLine("Match" & (matchCount))
Dim i As Integer
For i = 1 to 2
Dim g as Group = m.Groups(i)
Console.WriteLine("Group" & i & "='" & g.ToString() & "'")
Dim cc As CaptureCollection = g.Captures
Dim j As Integer
For j = 0 to cc.Count - 1
Dim c As Capture = cc(j)
Console.WriteLine("Capture" & j & "='" & c.ToString() _
& "', Position=" & c.Index)
Next
Next
m = m.NextMatch()
Loop
End Sub
End Module
Array Without Regex:
For Each Value In Split(SourceString, Chr(13))
ReDim Preserve arrMatches(lngCount)
arrMatches(lngCount) = Split(Split(Value, "]")(0), "[")(1)
lngCount = lngCount + 1
Next

Extracting Text Between Brackets with Regex

In sentences like:
"[x] Alpha
[33] Beta"
I extract an array of bracketed data as ([x], [33])
using VBA regex Pattern:
"(\[x\])|(\[\d*\])"
I cannot extract directly the array of un-bracketed data as (x, 33)
using web resources advice for pattern
"(?<=\[)(.*?)(?=\])"
Is this a VBA specific problem (i.e. limits on its implementation of Regex)
or did I misunderstand 'looking forward and backward' patterns?
Public Function Regx( _
ByVal SourceString As String, _
ByVal Pattern As String, _
Optional ByVal IgnoreCase As Boolean = True, _
Optional ByVal MultiLine As Boolean = True, _
Optional ByVal MatchGlobal As Boolean = True) _
As Variant
Dim oMatch As Match
Dim arrMatches
Dim lngCount As Long
' Initialize to an empty array
arrMatches = Array()
With New RegExp
.MultiLine = MultiLine
.IgnoreCase = IgnoreCase
.Global = MatchGlobal
.Pattern = Pattern
For Each oMatch In .Execute(SourceString)
ReDim Preserve arrMatches(lngCount)
arrMatches(lngCount) = oMatch.Value
lngCount = lngCount + 1
Next
End With
Sub testabove()
Call Regx("[x] Alpha" & Chr(13) & _
"[33] Beta", "(\[x\])|(\[\d*\])")
End Sub
Use capturing around the subpatterns that will fetch you your required value.
Use
"\[(x)\]|\[(\d*)\]"
(or \d+ if you need to match at least 1 digit, as * means zero or more occurrences, and + means one or more occurrences).
Or, use the generic pattern to extract anything inside the square brackets without the brackets:
"\[([^\][]+)]"
Then, access the right Submatches index by checking the submatch length (since you have an alternation, either of the submatch will be empty), and there you go. Just change your for loop with
For Each oMatch In .Execute(SourceString)
ReDim Preserve arrMatches(lngCount)
If Len(oMatch.SubMatches(0)) > 0 Then
arrMatches(lngCount) = oMatch.SubMatches(0)
Else
arrMatches(lngCount) = oMatch.SubMatches(1)
End If
' Debug.Print arrMatches(lngCount) ' - This outputs x and 33 with your data
lngCount = lngCount + 1
Next
With Excel and VBA you can strip the brackets after the regex extraction:
Sub qwerty()
Dim inpt As String, outpt As String
Dim MColl As MatchCollection, temp2 As String
Dim regex As RegExp, L As Long
inpt = "38c6v5hrk[x]537fhvvb"
Set regex = New RegExp
regex.Pattern = "(\[x\])|(\[\d*\])"
Set MColl = regex.Execute(inpt)
temp2 = MColl(0).Value
L = Len(temp2) - 2
outpt = Mid(temp2, 2, L)
MsgBox inpt & vbCrLf & outpt
End Sub
Try this:
\[(x)\]|\[(\d*)\]
What you don't want to be captured, don't put them inside (). this is used for grouping
Explanation
You will get x and 33 in $1 and $2
Dot Net Sample
Alright, I prepared it for you , although far away from vb for long. Lots of it might be not needed, yet it might help you to understand it better
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim text As String = "[x] Alpha [33] Beta]"
Dim pattern As String = "\[(x)\]|\[(\d*)\]"
' Instantiate the regular expression object.
Dim r As Regex = new Regex(pattern, RegexOptions.IgnoreCase)
' Match the regular expression pattern against a text string.
Dim m As Match = r.Match(text)
Dim matchcount as Integer = 0
Do While m.Success
matchCount += 1
Console.WriteLine("Match" & (matchCount))
Dim i As Integer
For i = 1 to 2
Dim g as Group = m.Groups(i)
Console.WriteLine("Group" & i & "='" & g.ToString() & "'")
Dim cc As CaptureCollection = g.Captures
Dim j As Integer
For j = 0 to cc.Count - 1
Dim c As Capture = cc(j)
Console.WriteLine("Capture" & j & "='" & c.ToString() _
& "', Position=" & c.Index)
Next
Next
m = m.NextMatch()
Loop
End Sub
End Module
Array Without Regex:
For Each Value In Split(SourceString, Chr(13))
ReDim Preserve arrMatches(lngCount)
arrMatches(lngCount) = Split(Split(Value, "]")(0), "[")(1)
lngCount = lngCount + 1
Next

Extract Email address from a table in .HTMLbody

I would like to reply to a webform extracting the email address from the form.
The webform is in a table, thus the ParseTextLinePair() function returns blanks as the email address in the column next to the label.
How can I extract the email address from a webform?
Sub ReplywithTemplatev2()
Dim Item As Outlook.MailItem
Dim oRespond As Outlook.MailItem
'Get Email
Dim intLocAddress As Integer
Dim intLocCRLF As Integer
Dim strAddress As String
Set Item = GetCurrentItem()
If Item.Class = olMail Then
' find the requestor address
strAddress = ParseTextLinePair(Item.Body, "Email-Adresse des Ansprechpartners *")
' This sends a response back using a template
Set oRespond = Application.CreateItemFromTemplate("C:\Users\Reply.oft")
With oRespond
.Recipients.Add Item.SenderEmailAddress
.Subject = "Your Subject Goes Here"
.HTMLBody = oRespond.HTMLBody & vbCrLf & _
"---- original message below ---" & vbCrLf & _
Item.HTMLBody & vbCrLf
' includes the original message as an attachment
' .Attachments.Add Item
oRespond.To = strAddress
' use this for testing, change to .send once you have it working as desired
.Display
End With
End If
Set oRespond = Nothing
End Sub
Function GetCurrentItem() As Object
Dim objApp As Outlook.Application
Set objApp = Application
On Error Resume Next
Select Case TypeName(objApp.ActiveWindow)
Case "Explorer"
Set GetCurrentItem = objApp.ActiveExplorer.Selection.Item(1)
Case "Inspector"
Set GetCurrentItem = objApp.ActiveInspector.CurrentItem
End Select
Set objApp = Nothing
End Function
Function ParseTextLinePair(strSource As String, strLabel As String)
Dim intLocLabel As Integer
Dim intLocCRLF As Integer
Dim intLenLabel As Integer
Dim strText As String
' locate the label in the source text
intLocLabel = InStr(strSource, strLabel)
intLenLabel = Len(strLabel)
If intLocLabel > 0 Then
intLocCRLF = InStr(intLocLabel, strSource, vbCrLf)
If intLocCRLF > 0 Then
intLocLabel = intLocLabel + intLenLabel
strText = Mid(strSource, _
intLocLabel, _
intLocCRLF - intLocLabel)
Else
intLocLabel = Mid(strSource, intLocLabel + intLenLabel)
End If
End If
ParseTextLinePair = Trim(strText)
End Function
A picture of the table to clarify.
Have you looked in to Regular Expressions in VBA, I haven't worked on it in while but here is an example.
Option Explicit
Sub Example()
Dim Item As MailItem
Dim RegExp As Object
Dim Search_Email As String
Dim Pattern As String
Dim Matches As Variant
Set RegExp = CreateObject("VbScript.RegExp")
Pattern = "\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b"
For Each Item In ActiveExplorer.Selection
Search_Email = Item.body
With RegExp
.Global = False
.Pattern = Pattern
.IgnoreCase = True
Set Matches = .Execute(Search_Email)
End With
If Matches.Count > 0 Then
Debug.Print Matches(0)
Else
Debug.Print "Not Found "
End If
Next
Set RegExp = Nothing
End Sub
Or Pattern = "(\S*#\w+\.\w+)" Or "(\w+(?:\W+\w+)*#\w+\.\w+)"
Regular-expressions.info/tutorial
\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,}\b Simple pattern that describes an email address.
A series of letters, digits, dots, underscores, percentage signs and hyphens, followed by an at sign, followed by another series of letters, digits and hyphens, finally followed by a single dot and two or more letters
[A-Z0-9._%+-]+ Match a single character present in the list below
A-Z A single character in the range between A and Z (case sensitive)
0-9 A single character in the range between 0 and 9
._%+- A single character in the list
# Matches the character # literally
Quantifiers
Udemy.com/vba-regex/
+---------+---------------------------------------------+------------------------------------------------------------+
| Pattern | Meaning | Example |
+---------+---------------------------------------------+------------------------------------------------------------+
| | | |
| – | Stands for a range | a-z means all the letters a to z |
| [] | Stands for any one of the characters quoted | [abc] means either a, b or c.[A-Z] means either A, B, …, Z |
| () | Used for grouping purposes | |
| | | Meaning is ‘or’ | X|Y, means X or Y |
| + | Matches the character one or more times | zo+ matches ‘zoo’, but not ‘z’ |
| * | Matches the character zero or more times | “lo*” matches either “l” or “loo” |
| ? | Matches the character zero or once | “b?ve?” matches the “ve” in “never”. |
+---------+---------------------------------------------+------------------------------------------------------------+
Wikibooks.org/wiki/Visual_Basic/Regular_Expressions
https://regex101.com/r/oP2yR0/1

match date pattern in the string vba excel

Edit:
Since my string became more and more complicated looks like regexp is the only way.
I do not have a lot experience in that and your help is much appreciated.
Basically from what I read on the web I construct the following exp to try matching occurrence in my sample string:
"My very long long string 12Mar2012 is right here 23Apr2015"
[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]
and trying this code. I do not have any match. Any good link on regexp tutorial much appreciated.
Dim re, match, RegExDate
Set re = CreateObject("vbscript.regexp")
re.Pattern = "(^[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]$)"
re.Global = True
For Each match In re.Execute(str)
MsgBox match.Value
RegExDate = match.Value
Exit For
Next
Thank you
This code validates the actual date from the Regexp using DateValuefor robustness
Sub Robust()
Dim Regex As Object
Dim RegexMC As Object
Dim RegexM As Object
Dim strIn As String
Dim BDate As Boolean
strIn = "My very long long string 12Mar2012 is right here 23Apr2015 and 30Feb2002"
Set Regex = CreateObject("vbscript.regexp")
With Regex
.Pattern = "(([0-9])|([0-2][0-9])|([3][0-1]))(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})"
.Global = True
If .test(strIn) Then
Set RegexMC = .Execute(strIn)
On Error Resume Next
For Each RegexM In RegexMC
BDate = False
BDate = IsDate(DateValue(RegexM.submatches(0) & " " & RegexM.submatches(4) & " " & RegexM.submatches(5)))
If BDate Then Debug.Print RegexM
Next
On Error GoTo 0
End If
End With
End Sub
thanks for all your help !!!
I managed to solve my problem using this simple code.
Dim rex As New RegExp
Dim dateCol As New Collection
rex.Pattern = "(\d|\d\d)(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})?"
rex.Global = True
For Each match In rex.Execute(sStream)
dateCol.Add match.Value
Next
Just note that on my side I'm sure that I got valid date in the string so the reg expression is easy.
thnx
Ilya
The following is a quick attempt I made. It's far from perfect.
Basically, it splits the string into words. While looping through the words it cuts off any punctuation (period and comma, you might need to add more).
When processing an item, we try to remove each month name from it. If the string gets shorter we might have a date.
It checks to see if the length of the final string is about right (5 or 6 characters, 1 or 2 + 4 for day and year)
You could instead (or also) check to see that there all numbers.
Private Const MonthList = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
Public Function getDates(ByVal Target As String) As String
Dim Data() As String
Dim Item As String
Dim Index As Integer
Dim List() As String
Dim Index2 As Integer
Dim Test As String
Dim Result As String
List = Split(MonthList, ",")
Data = Split(Target, " ")
Result = ""
For Index = LBound(Data) To UBound(Data)
Item = UCase(Replace(Replace(Data(Index), ".", ""), ",", ""))
For Index2 = LBound(Data) To UBound(Data)
Test = Replace(Item, List(Index2), "")
If Not Test = Item Then
If Len(Test) = 5 Or Len(Test) = 6 Then
If Result = "" Then
Result = Item
Else
Result = Result & ", " & Item
End If
End If
End If
Next Index2
Next
getDates = Result
End Function