Using Regex To find total occurrence of &T [duplicate] - regex

I have a string (for example: "Hello there. My name is John. I work very hard. Hello there!") and I am trying to find the number of occurrences of the string "hello there". So far, this is the code I have:
Dim input as String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase as String = "hello there"
Dim Occurrences As Integer = 0
If input.toLower.Contains(phrase) = True Then
Occurrences = input.Split(phrase).Length
'REM: Do stuff
End If
Unfortunately, what this line of code seems to do is split the string every time it sees the first letter of phrase, in this case, h. So instead of the result Occurrences = 2 that I would hope for, I actually get a much larger number. I know that counting the number of splits in a string is a horrible way to go about doing this, even if I did get the correct answer, so could someone please help me out and provide some assistance?

Yet another idea:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "Hello there"
' \ is integer division; / would produce a Double and fail under Option Strict On
Dim Occurrences As Integer = (input.Length - input.Replace(phrase, String.Empty).Length) \ phrase.Length
You just need to make sure that phrase.Length > 0.
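The length-difference idea does not depend on VB. Below is a minimal Python sketch of it, given only as an illustration; `count_by_replace` is a hypothetical helper name, not something from the answer above. It also shows that the approach is case-sensitive unless both strings are normalised first.

```python
def count_by_replace(text: str, phrase: str) -> int:
    """Count non-overlapping occurrences by measuring how much shorter
    the text becomes once every occurrence is removed."""
    if not phrase:
        raise ValueError("phrase must not be empty")  # avoids dividing by zero
    removed_chars = len(text) - len(text.replace(phrase, ""))
    return removed_chars // len(phrase)  # integer division


text = "Hello there. My name is John. I work very hard. Hello there!"
print(count_by_replace(text, "Hello there"))          # exact-case search
print(count_by_replace(text.lower(), "hello there"))  # case-insensitive search
```

Lower-casing both the input and the phrase is the simplest way to make the count case-insensitive.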

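Another way is to use Regex.Split. Escape the search string first; otherwise characters such as "." or "?" are treated as regex metacharacters:
Public Function countString(ByVal inputString As String, ByVal stringToBeSearchedInsideTheInputString As String) As Integer
Return System.Text.RegularExpressions.Regex.Split(inputString, System.Text.RegularExpressions.Regex.Escape(stringToBeSearchedInsideTheInputString)).Length - 1
End Function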
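When a literal search string is handed to a regex engine, any metacharacters in it need escaping. The following Python sketch illustrates the point; it is not the answer's code, and `count_with_regex` is a hypothetical name. It counts matches directly instead of splitting, and the last line shows what happens when "." is not escaped.

```python
import re


def count_with_regex(text: str, phrase: str, ignore_case: bool = True) -> int:
    """Count non-overlapping matches of a literal phrase using a regex."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape makes characters such as "." or "?" match themselves
    return len(re.findall(re.escape(phrase), text, flags))


text = "Hello there. My name is John. I work very hard. Hello there!"
print(count_with_regex(text, "hello there"))  # phrase, ignoring case
print(count_with_regex(text, "."))            # literal full stops only
print(len(re.findall(".", text)))             # unescaped "." matches every character
```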

str="Thisissumlivinginsumgjhvgsum in the sum bcoz sum ot ih sum"
b= LCase(str)
array1=Split(b,"sum")
l=Ubound(array1)
msgbox l
The output gives you the number of occurrences of one string within another.

You can create a Do Until loop that stops once an integer cursor variable reaches the length of the string you're checking. If the phrase is found, increment your occurrence count and move the cursor just past the match (the position at which the phrase was found plus the length of the phrase). If the phrase cannot be found, you are done searching (no more results), so set the cursor to the length of the target string. To avoid counting the same occurrence more than once, search only from the cursor to the end of the target string on each pass of the loop (strCheckThisString).
Dim input As String = "hello there. this is a test. hello there hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer = 0
Dim intCursor As Integer = 0
Do Until intCursor >= input.Length
Dim strCheckThisString As String = Mid(LCase(input), intCursor + 1, (Len(input) - intCursor))
Dim intPlaceOfPhrase As Integer = InStr(strCheckThisString, phrase)
If intPlaceOfPhrase > 0 Then
Occurrences += 1
intCursor += (intPlaceOfPhrase + Len(phrase) - 1)
Else
intCursor = input.Length
End If
Loop

You just have to pass the separator to the Split function as a string array and then declare the StringSplitOptions.
Try out this line of code:
Occurrences = input.Split({phrase}, StringSplitOptions.None).Length
I haven't checked this, but I'm thinking you'll also have to account for the fact that occurrences would be too high due to the fact that you're splitting using your string and not actually counting how many times it is in the string, so I think Occurrences = Occurrences - 1
Hope this helps
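The reason for subtracting one is that n separators always produce n + 1 pieces. A short Python sketch, given only as an illustration, shows the same arithmetic; Python's `str.split` takes a whole string as the separator.

```python
text = "Hello there. My name is John. I work very hard. Hello there!"
phrase = "hello there"

# Split on the whole phrase (after lower-casing), not on a single character
pieces = text.lower().split(phrase)
occurrences = len(pieces) - 1  # n separators produce n + 1 pieces

print(pieces)
print(occurrences)
```

The first piece is an empty string because the text starts with the phrase; it still counts as a piece, which is why the subtraction works.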

You could create a recursive function using IndexOf. Passing the string to be searched and the string to locate, each recursion increments a Counter and sets the StartIndex to +1 the last found index, until the search string is no longer found. Function will require optional parameters Starting Position and Counter passed by reference:
Function InStrCount(ByVal SourceString As String, _
ByVal SearchString As String, _
Optional ByRef StartPos As Integer = 0, _
Optional ByRef Count As Integer = 0) As Integer
If SourceString.IndexOf(SearchString, StartPos) > -1 Then
Count += 1
InStrCount(SourceString, _
SearchString, _
SourceString.IndexOf(SearchString, StartPos) + 1, _
Count)
End If
Return Count
End Function
Call function by passing string to search and string to locate and, optionally, start position:
Dim input As String = "Hello there. My name is John. I work very hard. Hello there!"
Dim phrase As String = "hello there"
Dim Occurrences As Integer
Occurrences = InStrCount(input.ToLower, phrase.ToLower)
Note the use of .ToLower, which makes the comparison ignore case. Omit it if you want the comparison to be case-sensitive.

One more solution, based on the InStr(i, str, substr) function (which searches for substr in str starting from position i):
Function findOccurancesCount(baseString, subString)
occurancesCount = 0
i = 1
Do
foundPosition = InStr(i, baseString, subString) 'searching from i position
If foundPosition > 0 Then 'substring is found at foundPosition index
occurancesCount = occurancesCount + 1 'count this occurance
i = foundPosition + 1 'searching from i+1 on the next cycle
End If
Loop While foundPosition <> 0
findOccurancesCount = occurancesCount
End Function
As soon as no substring is found (InStr returns 0 instead of the position of the substring in the base string), the search is over and the occurrence count is returned.
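Search loops that restart one character after each hit count overlapping matches. The Split- and Replace-based answers, and loops that skip the whole phrase, count only non-overlapping ones. The Python sketch below shows the difference; `count_occurrences` and its `overlapping` flag are hypothetical names chosen for this example.

```python
def count_occurrences(text: str, phrase: str, overlapping: bool = False) -> int:
    """Count occurrences of phrase, optionally allowing matches to overlap."""
    if not phrase:
        raise ValueError("phrase must not be empty")
    # Advance by one character to allow overlaps, or by the phrase length to forbid them
    step = 1 if overlapping else len(phrase)
    count = 0
    position = text.find(phrase)
    while position != -1:
        count += 1
        position = text.find(phrase, position + step)
    return count


print(count_occurrences("aaaa", "aa"))                    # non-overlapping
print(count_occurrences("aaaa", "aa", overlapping=True))  # overlapping
```

For a phrase such as "hello there", which cannot overlap itself, both modes return the same count.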

Looking at your original attempt, this should do the trick, since Split creates an array. Note that the phrase has to be passed as a string array; otherwise only its first character is used as the separator:
Occurrences = UBound(input.Split({phrase}, StringSplitOptions.None))
This is case-sensitive, so in your case the phrase should equal "Hello there", as there is no "hello there" in the input.

Expanding on Sumit Kumar's simple solution, here it is as a one-line working function:
Public Function fnStrCnt(ByVal str As String, ByVal substr As String) As Integer
fnStrCnt = UBound(Split(LCase(str), LCase(substr))) ' lower-case both strings so the search ignores case
End Function
Demo:
Sub testit()
Dim thePhrase
thePhrase = "Once upon a midnight dreary while a man was in a house in the usa."
If fnStrCnt(thePhrase, " a ") > 1 Then
MsgBox "Found " & fnStrCnt(thePhrase, " a ") & " occurrences."
End If
End Sub 'testit()

I don't know if this is more obvious?
Starting from the beginning of longString, compare the next phrase.Length characters with phrase. If they do not match, move on by one character and try again. If they match, increment occurences and continue from the current position plus the number of characters in phrase.
Module Module1
Sub Main()
Dim longString As String = "Hello there. My name is John. I work very hard. Hello there! Hello therehello there"
Dim phrase As String = "hello There"
Dim occurences As Integer = 0
Dim n As Integer = 0
Do Until n >= longString.Length - (phrase.Length - 1)
If longString.ToLower.Substring(n, phrase.Length).Contains(phrase.ToLower) Then
occurences += 1
n = n + (phrase.Length - 1)
End If
n += 1
Loop
Console.WriteLine(occurences)
End Sub
End Module

I used this in VBScript; you can convert it to VB.NET as well.
Dim str, strToFind
str = "sdfsdf:sdsdgs::"
strToFind = ":"
MsgBox GetNoOfOccurranceOf( strToFind, str)
Function GetNoOfOccurranceOf(ByVal subStringToFind, ByVal strReference) ' VBScript parameters are untyped
Dim iTotalLength, newString, iTotalOccCount
iTotalLength = Len(strReference)
newString = Replace(strReference, subStringToFind, "")
' Divide by the length of the search string so multi-character searches are counted correctly
iTotalOccCount = (iTotalLength - Len(newString)) \ Len(subStringToFind)
GetNoOfOccurranceOf = iTotalOccCount
End Function

I know this thread is really old, but I got another solution too:
Function countOccurencesOf(needle As String, s As String) As Integer
Dim count As Integer = 0
For i As Integer = 0 to s.Length - 1
If s.Substring(i).Startswith(needle) Then
count = count + 1
End If
Next
Return count
End Function

Related

Sort a String array

I have as input the string in the below format
"[1_5,3,7,1],[1_2,4,1,9],[],[1_1,,4,,,9,2]"
What I need to obtain is the same string but with the number after the _ sorted:
"[1_1,3,5,7],[1_1,2,4,9],[],[1_1,2,4,9,,,]"
Dim tmprequestedArea_selectionAreaIn As String = "[1_5,3,7,1],[1_2,4,1,9],[],[1_1,,4,,,9,2]"
tmprequestedArea_selectionAreaIn = Regex.Replace(requestedArea_selectionAreaIn,"\],\[","#")
tmprequestedArea_selectionAreaIn = Regex.Replace(tmprequestedArea_selectionAreaIn,"\[|\]","")
bracList.AddRange(tmprequestedArea_selectionAreaIn.Split(New Char() {"#"c}, StringSplitOptions.None ))
If sortNumber Then
'Split braclist by _ and puts the value in strList
'If after _ is only one number put only that number, else split it by char "," and put in strList the join of the split by , array
'Sort the array
'in previous example strList will contain a,b,c in position 0 and _d_f (instead of f,d) in position 1
For i As Integer = 0 To bracList.Count -1
Dim tmp As String()
Dim tmpInt As New System.Collections.Generic.List(Of Integer)
If Not(String.IsNullOrEmpty(bracList(i))) Then
Dim tmpRequested As String = bracList(i).Split(New Char() {"_"c})(0)
Dim tmpSelection As String = bracList(i).Split(New Char() {"_"c})(1)
If tmpSelection.Contains(",") Then
tmp = tmpSelection.Split(New Char() {","c})
For j As Integer = 0 To tmp.Length -1
tmpInt.Add(Convert.toInt32(tmp(j)))
Next
tmpInt.Sort
strList.Add("[" + tmpRequested + "_" + String.Join(",",tmpInt ) + "]")
Else
strList.Add("[" + tmpRequested + "_" + tmpSelection + "]" )
End If
Else
strList.Add("[]")
End If
Next i
I'm looking for a better way to manage it.
Try this, as a possible substitute for what you're doing now.
Given this input string:
Dim input As String = "[1_5,3,7,1],[1_2,4,1,9],[],[1_1,,4,,,9,2]"
Note: this will also deal with decimal values without changes. E.g.,
"[1_5.5,3.5,7,1],[1_2.564,4,2.563,9],[],[1_1,,4.23,,,9.0,2.45]"
You can extract the content of the brackets with this pattern: \[(.*?)\] and use Regex.Matches to return a MatchCollection of all the substrings that match the pattern.
Then use a StringBuilder as a container to rebuild the string while the parts are being treated.
Imports System.Linq
Imports System.Text
Imports System.Text.RegularExpressions
Dim pattern As String = "\[(.*?)\]"
Dim matches = Regex.Matches(input, pattern, RegexOptions.Singleline)
Dim sb As New StringBuilder()
For Each match As Match In matches
Dim value As String = match.Groups(1).Value
If String.IsNullOrEmpty(value) Then
sb.Append("[],")
Continue For
End If
Dim sepPosition As Integer = value.IndexOf("_"c) + 1
sb.Append("[" & value.Substring(0, sepPosition))
Dim values = value.Substring(sepPosition).Split(","c)
sb.Append(String.Join(",", values.Where(Function(n) n.Length > 0).OrderBy(Function(n) CDec(n))))
sb.Append(","c, values.Count(Function(n) n.Length = 0))
sb.Append("],")
Next
Dim result As String = sb.ToString().TrimEnd(","c)
If you don't know about LINQ, this is what it's doing:
String.Join(",", values.Where(Function(n) n.Length > 0).OrderBy(Function(n) CDec(n)))
values is an array of strings, generated by String.Split().
values.Where(Function(n) n.Length > 0): creates an Enumerable(Of String) from values Where the content, n, is a string of length > 0.
I could have written values.Where(Function(n) Not String.IsNullOrEmpty(n)).
.OrderBy(Function(n) CDec(n)): orders the resulting Enumerable(Of String) by the string value converted to Decimal and generates an ordered Enumerable(Of String), which is passed to String.Join() to rebuild the string, adding a char (","c) between the parts.
values.Count(Function(n) n.Length = 0): Counts the elements of values that have Length = 0 (empty strings). This is the number of empty elements that are represented by a comma, appended at the end of the partial string.
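The same steps (extract each bracket group, keep the prefix up to "_", sort the non-empty values numerically, append one comma per empty value) can be sketched in Python for comparison. This is an illustration under the same assumptions as the VB.NET code, not a replacement for it; `sort_bracket_groups` is a hypothetical name.

```python
import re
from decimal import Decimal


def sort_bracket_groups(text: str) -> str:
    """Sort the numbers after "_" in each [...] group; empty slots go last."""
    rebuilt = []
    for body in re.findall(r"\[(.*?)\]", text):
        if not body:
            rebuilt.append("[]")
            continue
        split_at = body.find("_") + 1          # 0 when the group has no "_"
        prefix, rest = body[:split_at], body[split_at:]
        values = rest.split(",")
        # Sort numerically (Decimal handles values such as "2.564"), skipping empties
        numbers = sorted((v for v in values if v), key=Decimal)
        empties = len(values) - len(numbers)   # one trailing comma per empty slot
        rebuilt.append("[" + prefix + ",".join(numbers) + "," * empties + "]")
    return ",".join(rebuilt)


source = "[1_5,3,7,1],[1_2,4,1,9],[],[1_1,,4,,,9,2]"
print(sort_bracket_groups(source))
```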
If you are looking for another way:
I think it is easier to read the string one character at a time. If the character is a number, put it in an array (and when the character is ']', start a new array). Then sort the arrays and replace each number in the string with its sorted counterpart, so you only assign values in place without having to rebuild the string with a regular expression.
I wish I had Visual Studio at hand to provide you the code (it is joyful to code a riddle) ^_^
PS: for the empty entries, you can keep a counter of the blank commas and then append them at the end.

Extract largest numeric sequence from string (regex, or?)

I have strings similar to the following:
4123499-TESCO45-123
every99999994_54
And I want to extract the largest numeric sequence in each string, respectively:
4123499
99999994
I have previously tried regex (I am using VB6)
Set rx = New RegExp
rx.Pattern = "[^\d]"
rx.Global = True
StringText = rx.Replace(StringText, "")
Which gets me partway there, but it only removes the non-numeric values, and I end up with the first string looking like:
412349945123
Can I find a regex that will give me what I require, or will I have to try another method? Essentially, my pattern would have to be anything that isn't the longest numeric sequence. But I'm not actually sure if that is even a reasonable pattern. Could anyone with a better handle of regex tell me if I am going down a rabbit hole? I appreciate any help!
You cannot get the result with a regex alone. You will have to extract all numeric chunks and pick the longest one using other programming means.
Here is an example:
Dim strPattern As String: strPattern = "\d+"
Dim str As String: str = "4123499-TESCO45-123"
Dim regEx As New RegExp
Dim matches As MatchCollection
Dim m As Match
Dim result As String
With regEx
.Global = True
.MultiLine = False
.IgnoreCase = False
.Pattern = strPattern
End With
Set matches = regEx.Execute(str)
For Each m In matches
' Keep this match only if it is longer than the longest one seen so far
If Len(result) < Len(m.Value) Then result = m.Value
Next
Debug.Print result
The \d+ with RegExp.Global=True will find all digit chunks and then only the longest will be printed after all matches are processed in a loop.
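The "find all digit chunks, then keep the longest" idea is compact enough to sketch in Python as well. This is only an illustration, and `longest_number` is a hypothetical name. `max` returns the first of several equally long chunks.

```python
import re


def longest_number(text: str) -> str:
    """Return the longest run of ASCII digits in text, or "" if there is none."""
    chunks = re.findall(r"[0-9]+", text)       # every digit chunk, left to right
    return max(chunks, key=len, default="")    # first chunk wins a tie


print(longest_number("4123499-TESCO45-123"))
print(longest_number("every99999994_54"))
```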
That's not solvable with an RE on its own.
Instead you can simply walk along the string tracking the longest consecutive digit group:
For i = 1 To Len(StringText)
If IsNumeric(Mid$(StringText, i, 1)) Then
a = a & Mid$(StringText, i, 1)
Else
a = ""
End If
If Len(a) > Len(longest) Then longest = a
Next
MsgBox longest
(first result wins a tie)
If the two examples you gave follow a standard, where
<long_number>-<some_other_data>-<short_number>
<text><long_number>_<short_number>
are the only two formats the strings come in, there are some solutions.
However, if you are searching strings in arbitrary formats for the longest number, these will not work.
Solution 1
([0-9]+)[_-].*
See the demo
In the first capture group, you should have the longest number for those 2 formats.
Note: This assumes that the longest number will be the first number it encounters with an underscore or a hyphen next to it, matching those two examples given.
Solution 2
\d{6,}
See the demo
Note: This assumes that the shortest number will never exceed 5 characters in length, and the longest number will never be shorter than 6 characters in length
Please try this.
Pure VB, no external libraries or objects.
No brain-breaking regexp patterns.
No string manipulation, so it is fast: ~30 times faster than regexp :)
Easy to adapt to various needs, for example concatenating all digits from the source string into a single string.
Moreover, if the target string is only an intermediate step, it is possible to work with the numbers directly.
Public Sub sb_BigNmb()
Dim sSrc$, sTgt$
Dim taSrc() As Byte, taTgt() As Byte, tLB As Byte, tUB As Byte
Dim s As Byte, t As Byte, tLenMin As Byte
tLenMin = 4
sSrc = "every99999994_54"
sTgt = vbNullString
taSrc = StrConv(sSrc, vbFromUnicode)
tLB = LBound(taSrc)
tUB = UBound(taSrc)
ReDim taTgt(tLB To tUB)
t = 0
For s = tLB To tUB
Select Case taSrc(s)
Case 48 To 57
taTgt(t) = taSrc(s)
t = t + 1
Case Else
If CBool(t) Then Exit For ' *** EXIT FOR ***
End Select
Next
If (t > tLenMin) Then
ReDim Preserve taTgt(tLB To (t - 1))
sTgt = StrConv(taTgt, vbUnicode)
End If
Debug.Print "'" & sTgt & "'"
Stop
End Sub
How to handle sSrc = "ev_1_ery99999994_54" is left as an exercise :)
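One way to handle an input such as "ev_1_ery99999994_54" is to keep scanning past the first digit run and remember the longest run seen so far. Here is a minimal Python sketch of that single-pass scan, without regex; it is an assumption about what the exercise intends, and `longest_digit_run` is a hypothetical name.

```python
def longest_digit_run(text: str) -> str:
    """Scan once, tracking the current digit run and the longest one so far."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, ch in enumerate(text):
        if "0" <= ch <= "9":
            if run_len == 0:
                run_start = i          # a new run begins here
            run_len += 1
            if run_len > best_len:     # strictly longer, so the first run wins a tie
                best_start, best_len = run_start, run_len
        else:
            run_len = 0                # a non-digit ends the current run
    return text[best_start:best_start + best_len]


print(longest_digit_run("ev_1_ery99999994_54"))
```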

Excel UDF for capturing numbers within characters

I have a variable text field sitting in cell A1 which contains the following:
Text;#Number;#Text;#Number
This format can keep repeating, but the pattern is always Text;#Number.
The numbers can vary from 1 digit to n digits (limit 7)
Example:
Original Value
MyName;#123;#YourName;#3456;#HisName;#78
Required value:
123, 3456, 78
From my understanding, the field is too variable for Excel formulas.
I tried using regexp, but I am a beginner when it comes to coding. If you can break down the code with some explanatory text, it would be much appreciated.
I have tried some of the suggestions below and they work perfectly. One more question.
Now that I can split the numbers from the text, is there any way to utilize the code below and add another layer, where we split the numbers into x cells.
For example: once we run the function, if we get 1234, 567 in the same cell, the function would put 1234 in cell B2, and 567 in cell C2. This would keep updating all cells in the same row until the string has exhausted all of the numbers that are retrieved from the function.
Thanks
This is John Coleman's suggested method:
Public Function GetTheNumbers(st As String) As String
ary = Split(st, ";#")
GetTheNumbers = ""
For Each a In ary
If IsNumeric(a) Then
If GetTheNumbers = "" Then
GetTheNumbers = a
Else
GetTheNumbers = GetTheNumbers & ", " & a
End If
End If
Next a
End Function
If the pattern is fixed, and the location of the numbers never changes, you can assume the numbers will be located in the even places in the string. This means that in the array result of a split on the source string, you can use the odd indexes of the resulting array. For example in this string "Text;#Number;#Text;#Number" array indexes 1, 3 would be the numbers ("Text(0);#Number(1);#Text(2);#Number(3)"). I think this method is easier and safer to use if the pattern is indeed fixed, as it avoids the need to verify data types.
Public Function GetNums(src As String) As String
Dim arr
Dim i As Integer
Dim result As String
arr = Split(src, ";#") ' Split the string to an array.
result = ""
For i = 1 To UBound(arr) Step 2 ' Loop through the array, starting with the second item, and skipping one item (using Step 2).
result = result & arr(i) & ", "
Next
If Len(result) > 2 Then
GetNums = Left(result, Len(result) - 2) ' Remove the extra ", " at the end of the the result string.
Else
GetNums = ""
End If
End Function
The numbers can vary from 1 digit to n digits (limit 7)
None of the other responses seems to take the provided parameters into consideration so I kludged together a true regex solution.
Option Explicit
Option Base 0 '<~~this is the default but I've included it because it has to be 0
Function numsOnly(str As String, _
Optional delim As String = ", ")
Dim n As Long, nums() As Variant
Static rgx As Object, cmat As Object
'with rgx as static, it only has to be created once; beneficial when filling a long column with this UDF
If rgx Is Nothing Then
Set rgx = CreateObject("VBScript.RegExp")
End If
numsOnly = vbNullString
With rgx
.Global = True
.MultiLine = False
.Pattern = "[0-9]{1,7}"
If .Test(str) Then
Set cmat = .Execute(str)
'resize the nums array to accept the matches
ReDim nums(cmat.Count - 1)
'populate the nums array with the matches
For n = LBound(nums) To UBound(nums)
nums(n) = cmat.Item(n)
Next n
'convert the nums array to a delimited string
numsOnly = Join(nums, delim)
End If
End With
End Function
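It is worth knowing what a `{1,7}` quantifier does with longer digit runs and with digits inside the text parts. The Python sketch below (an illustration only; the stricter pattern is an assumption about the intended format, not part of the answer above) shows the plain pattern next to a variant that requires the ";#" delimiter.

```python
import re

plain = re.compile(r"[0-9]{1,7}")
# Hypothetical stricter variant: digits must follow ";#" and be followed by ";#" or the end
strict = re.compile(r";#([0-9]{1,7})(?=;#|$)")

sample = "MyName;#123;#YourName;#3456;#HisName;#78"
print(", ".join(plain.findall(sample)))

print(plain.findall("Name;#12345678"))   # an 8-digit run is split into two matches
print(plain.findall("Name2;#5"))         # a digit inside the text part is picked up too
print(strict.findall("Name;#12345678"))  # the stricter variant rejects the 8-digit run
print(strict.findall("Name2;#5"))
```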
      
Regexp option that uses Replace
Sub Test()
Debug.Print StrOut("MyName;#123;#YourName;#3456;#HisName;#78")
End Sub
function
Option Explicit
Function StrOut(strIn As String) As String
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "(^|.+?)(\d{1,7})"
.Global = True
If .Test(strIn) Then
StrOut = .Replace(strIn, "$2, ")
StrOut = Left$(StrOut, Len(StrOut) - 2)
Else
StrOut = "Nothing"
End If
End With
End Function

Slight adaptation of a User Defined Function

I would like to extract a combination of text and numbers from a larger string located within a column within excel.
The constants I have to work with are that each text string will
•either start with an A, C or S, and
•will always be 7 characters long
•the position of the string I would like to extract varies
The code I have been using, which has been working efficiently, is:
Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(r.Text, " ")
For Each a In ary
If Len(a) = 7 And a Like "[SAC]*" Then
Xtractor = a
Exit Function
End If
Next a
Xtractor = ""
End Function
However today I have learnt that sometimes my data may include scenarios like this;
What I would like is to adapt my code so that, if the 8th character is an underscore and the 1st of the 7 characters is S, A or C, it extracts everything up to the underscore.
Secondly, I would like to exclude common words like "Support" and "Collect" from being extracted.
Finally, the 7th character should be a number.
Any ideas around this would be much appreciated.
Thanks
try this
ary = Split(Replace(r.Text, "_", " "))
or
ary = Split(Replace(r.Text, "_", " ")," ")
result will be same for both variants
update
Do you know how I could leave the result blank if the 7th character returned a letter?
Public Function Xtractor(r As Range) As String
Dim a, ary
ary = Split(Replace(r.Text, "_", " "))
For Each a In ary
If Len(a) = 7 And a Like "[SAC]*" And IsNumeric(Mid(a, 7, 1)) Then
Xtractor = a
Exit Function
End If
Next a
Xtractor = ""
End Function
Add Microsoft VBScript Regular Expressions 5.5 to project references. Use the following code to test matching and extracting with Xtractor:
Public Function Xtractor(ByVal p_val As String) As String
Xtractor = ""
Dim ary As String, v_re As New VBScript_RegExp_55.RegExp, Matches
v_re.Pattern = "^([SAC][^_]{1,6})_?"
Set Matches = v_re.Execute(p_val)
If Matches.Count > 0 Then Xtractor = Matches(0).SubMatches(0) Else Xtractor = ""
End Function
Sub test_Xtractor(p_cur As Range, p_val As String, p_expected As String)
Dim v_cur As Range, v_res As Range
p_cur.Value = p_val
Set v_cur = p_cur.Offset(columnOffset:=1)
v_cur.FormulaR1C1 = "='" & ThisWorkbook.Name & "'!Xtractor(RC[-1])"
Set v_res = v_cur.Offset(columnOffset:=1)
v_res.FormulaR1C1 = "=RC[-1]=""" & p_expected & """"
Debug.Print p_val; "->"; v_cur.Value; ":"; v_res.Value
End Sub
Sub test()
test_Xtractor ActiveCell, "A612002_MDC_308", "A612002"
test_Xtractor ActiveCell.Offset(1), "B612002_MDC_308", ""
test_Xtractor ActiveCell.Offset(2), "SUTP038_MDC_3", "SUTP038"
test_Xtractor ActiveCell.Offset(3), "KUTP038_MDC_3", ""
End Sub
Choose the workbook and cell for writing test fixture, then run test from the VBA Editor.
Output in the Immediate window (Ctrl+G):
A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True
UPD
Is it possible to amend this code so that it returns blank if the 7th character is a letter?
Replace the line that assigns v_re.Pattern with the following:
v_re.Pattern = "^([SAC](?![^_]{5}[A-Z]_?)[^_]{1,6})_?"
v_re.IgnoreCase = True
And add to the test suite:
test_Xtractor ActiveCell.Offset(4), "SUTP03A_MDC_3", ""
Output:
A612002_MDC_308->A612002:True
B612002_MDC_308->:True
SUTP038_MDC_3->SUTP038:True
KUTP038_MDC_3->:True
SUTP03A_MDC_3->:True
I inserted the negative lookahead subrule (?![^_]{5}[A-Z]_?) to reject SUTP03A_MDC_3. But pay attention: the rejecting rule applies only to the 7th character. v_re.IgnoreCase is now set to True; if only capital letters are allowed, set it to False. See also Regular Expression Syntax on MSDN.
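The lookahead pattern can also be tried outside Excel. The Python sketch below reuses the same pattern with case-insensitive matching and runs the five test values from the output above; `xtractor` is a hypothetical stand-in for the VBA function, not a port of the test fixture.

```python
import re

# Same pattern as the VBA version: reject the value when the 7th character is a letter
PATTERN = re.compile(r"^([SAC](?![^_]{5}[A-Z]_?)[^_]{1,6})_?", re.IGNORECASE)


def xtractor(value: str) -> str:
    """Return the leading code (up to 7 characters before "_"), or "" if rejected."""
    match = PATTERN.match(value)
    return match.group(1) if match else ""


for value in ("A612002_MDC_308", "B612002_MDC_308", "SUTP038_MDC_3",
              "KUTP038_MDC_3", "SUTP03A_MDC_3"):
    print(value, "->", xtractor(value))
```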

match date pattern in the string vba excel

Edit:
Since my string became more and more complicated, it looks like regexp is the only way.
I do not have a lot of experience with it, and your help is much appreciated.
Based on what I read on the web, I constructed the following expression to try to match the occurrences in my sample string:
"My very long long string 12Mar2012 is right here 23Apr2015"
[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]
and I am trying this code, but I do not get any match. Any good link to a regexp tutorial would be much appreciated.
Dim re, match, RegExDate
Set re = CreateObject("vbscript.regexp")
re.Pattern = "(^[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]$)"
re.Global = True
For Each match In re.Execute(str)
MsgBox match.Value
RegExDate = match.Value
Exit For
Next
Thank you
This code validates the actual date from the Regexp using DateValue for robustness:
Sub Robust()
Dim Regex As Object
Dim RegexMC As Object
Dim RegexM As Object
Dim strIn As String
Dim BDate As Boolean
strIn = "My very long long string 12Mar2012 is right here 23Apr2015 and 30Feb2002"
Set Regex = CreateObject("vbscript.regexp")
With Regex
.Pattern = "(([0-9])|([0-2][0-9])|([3][0-1]))(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})"
.Global = True
If .test(strIn) Then
Set RegexMC = .Execute(strIn)
On Error Resume Next
For Each RegexM In RegexMC
BDate = False
BDate = IsDate(DateValue(RegexM.submatches(0) & " " & RegexM.submatches(4) & " " & RegexM.submatches(5)))
If BDate Then Debug.Print RegexM
Next
On Error GoTo 0
End If
End With
End Sub
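The two-stage idea (let the regex find date-shaped text, then let a date library decide whether the date exists) can be sketched in Python too. This is only an illustration: `find_valid_dates` and the `MONTHS` table are hypothetical names, and the pattern is a simplified version of the one above.

```python
import re
from datetime import date

# Month abbreviations mapped to month numbers (English names assumed)
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

DATE_RE = re.compile(r"([0-9]{1,2})(" + "|".join(MONTHS) + r")([0-9]{4})")


def find_valid_dates(text: str) -> list:
    """Return date-shaped substrings that are also real calendar dates."""
    found = []
    for match in DATE_RE.finditer(text):
        day, month, year = int(match.group(1)), MONTHS[match.group(2)], int(match.group(3))
        try:
            date(year, month, day)   # raises ValueError for dates such as 30 Feb
        except ValueError:
            continue
        found.append(match.group(0))
    return found


text = "My very long long string 12Mar2012 is right here 23Apr2015 and 30Feb2002"
print(find_valid_dates(text))
```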
thanks for all your help !!!
I managed to solve my problem using this simple code.
Dim rex As New RegExp
Dim dateCol As New Collection
rex.Pattern = "(\d|\d\d)(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})?"
rex.Global = True
For Each match In rex.Execute(sStream)
dateCol.Add match.Value
Next
Note that on my side I am sure the string contains only valid dates, so the regular expression can stay simple.
thnx
Ilya
The following is a quick attempt I made. It's far from perfect.
Basically, it splits the string into words. While looping through the words it cuts off any punctuation (period and comma, you might need to add more).
When processing an item, we try to remove each month name from it. If the string gets shorter we might have a date.
It checks to see if the length of the final string is about right (5 or 6 characters, 1 or 2 + 4 for day and year)
You could instead (or also) check that they are all numbers.
Private Const MonthList = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
Public Function getDates(ByVal Target As String) As String
Dim Data() As String
Dim Item As String
Dim Index As Integer
Dim List() As String
Dim Index2 As Integer
Dim Test As String
Dim Result As String
List = Split(MonthList, ",")
Data = Split(Target, " ")
Result = ""
For Index = LBound(Data) To UBound(Data)
Item = UCase(Replace(Replace(Data(Index), ".", ""), ",", ""))
For Index2 = LBound(List) To UBound(List) ' loop over the month names, not the words
Test = Replace(Item, List(Index2), "")
If Not Test = Item Then
If Len(Test) = 5 Or Len(Test) = 6 Then
If Result = "" Then
Result = Item
Else
Result = Result & ", " & Item
End If
End If
End If
Next Index2
Next
getDates = Result
End Function