Named groups for Regex in VBA - regex

Is there any way to use named groups with regular expressions in VBA?
I would like to write an Excel VBA Sub that matches the dates in file names and decrements these dates by a specified amount. I need to be able to distinguish between dd/mm and mm/dd formats -- among other irregularities -- and using named groups something like this would solve the problem:
(?:<month>\d\d)(?:<day>\d\d)
Advice is appreciated

Nope, no named groups in VBScript regular expressions.
VBScript uses the same regexp engine that JScript uses, so it's compatible with the older (pre-ES2018) JavaScript regex syntax, which also doesn't have named groups.
You have to use unnamed groups and retrieve them by index after running the expression, going by the order in which they appear in it.
In general, dd/mm and mm/dd can't be automatically distinguished since there are valid dates that could be either. (e.g. 01/04 could be January 4th or April 1st). I don't think you'd be able to solve this with a regular expression.
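To get a feel for how often that ambiguity applies, here is a small sketch in Python (the idea carries over to VBA unchanged). The function name `possible_readings` and the decision to validate against a caller-supplied year are assumptions made for illustration; they are not part of the question.

```python
from datetime import date

def possible_readings(first, second, year=2019):
    """List the orders in which two numbers form a valid date in `year`."""
    readings = []
    for label, day, month in (("dd/mm", first, second), ("mm/dd", second, first)):
        try:
            date(year, month, day)      # raises ValueError for impossible dates
        except ValueError:
            continue
        readings.append(label)
    return readings

print(possible_readings(1, 4))    # both orders are valid
print(possible_readings(23, 1))   # only dd/mm is valid
```

Whenever the function returns two readings, no regular expression can decide between them; the format has to come from somewhere else, such as the file naming convention.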

Here is an implementation of named groups in VBA that I made today. Hopefully it will be useful to someone else:
'Description:
' An implementation of Regex which includes Named Groups
' and caching implemented in VBA
'Example:
' Dim match as Object
' set match = RegexMatch("01/01/2019","(?<month>\d\d)\/(?<day>\d\d)\/(?<year>\d\d\d\d)")
' debug.print match("day") & "/" & match("month") & "/" & match("year")
'Options:
' "i" = IgnoreCase
' "g" = Global
'Return value:
' A dictionary object with the following keys:
' 0 = Whole match
' 1,2,3,... = Submatch 1,2,3,...
' "Count" stores the count of matches
' "<<NAME>>" stores the match of a specified name
Function RegexMatch(ByVal haystack As String, ByVal pattern As String, Optional ByVal options As String) As Object
'Cache regexes for optimisation
Static CachedRegex As Object
Static CachedNames As Object
If CachedRegex Is Nothing Then Set CachedRegex = CreateObject("Scripting.Dictionary")
If CachedNames Is Nothing Then Set CachedNames = CreateObject("Scripting.Dictionary")
'Named regexp used to detect capturing groups and named capturing groups
Static NamedRegexp As Object
If NamedRegexp Is Nothing Then
Set NamedRegexp = CreateObject("VBScript.RegExp")
NamedRegexp.pattern = "\((?:\?\<(.*?)\>)?"
NamedRegexp.Global = True
End If
'If cached pattern doesn't exist, create it
If Not CachedRegex.Exists(options & ")" & pattern) Then
'Create names/capture group object
Dim testPattern As String, oNames As Object
testPattern = pattern
testPattern = Replace(testPattern, "\\", "asdasd")
testPattern = Replace(testPattern, "\(", "asdasd")
'Store names for optimisation
Set CachedNames(options & ")" & pattern) = NamedRegexp.Execute(testPattern)
'Create new VBA valid pattern
Dim newPattern As String
newPattern = NamedRegexp.Replace(pattern, "(")
'Create regexp from new pattern
Dim oRegexp As Object
Set oRegexp = CreateObject("VBScript.RegExp")
oRegexp.pattern = newPattern
'Set regex options
Dim i As Integer
For i = 1 To Len(options)
Select Case Mid(options, i, 1)
Case "i"
oRegexp.ignoreCase = True
Case "g"
oRegexp.Global = True
End Select
Next
'Store regex for optimisation
Set CachedRegex(options & ")" & pattern) = oRegexp
End If
'Get matches object
Dim oMatches As Object
Set oMatches = CachedRegex(options & ")" & pattern).Execute(haystack)
'Get names object
Dim CName As Object
Set CName = CachedNames(options & ")" & pattern)
'Create dictionary to return
Dim oRet As Object
Set oRet = CreateObject("Scripting.Dictionary")
'Fill dictionary with names and indexes
'0 = Whole match
'1,2,3,... = Submatch 1,2,3,...
'"Count" stores the count of matches
'"<<NAME>>" stores the match of a specified name
For i = 1 To CName.Count
oRet(i) = oMatches(0).Submatches(i - 1)
If Not IsEmpty(CName(i - 1).Submatches(0)) Then oRet(CName(i - 1).Submatches(0)) = oMatches(0).Submatches(i - 1)
Next i
oRet(0) = oMatches(0)
oRet("Count") = CName.Count
Set RegexMatch = oRet
End Function
P.S. for a Regex library (built by myself) which has this additional functionality, check out stdRegex. The equivalent can be done with:
set match = stdRegex.Create("(?:<month>\d\d)(?:<day>\d\d)").Match(sSomeString)
Debug.print match("month")
stdRegex also has more features than VBScript's standard RegExp object. See the test suite for more info.
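For readers who want the core idea of RegexMatch without the caching, here is a minimal sketch in Python. It is not the author's code: `regex_match` and `GROUP_OPEN` are names chosen for illustration, and the sketch deliberately ignores character classes and lookbehind. It records the name (if any) of each capturing group in order, strips the names to get a plain pattern, and then maps names to the index-based submatches.

```python
import re

# Finds every "(" that opens a capturing group, with an optional .NET-style name.
GROUP_OPEN = re.compile(r"\((?!\?[:=!])(?:\?<([A-Za-z_]\w*)>)?")

def regex_match(haystack, pattern, flags=0):
    # Mask escaped backslashes and escaped parentheses so they are not counted.
    masked = pattern.replace("\\\\", "__").replace("\\(", "__")
    names = [m.group(1) for m in GROUP_OPEN.finditer(masked)]
    # Drop the names so that only plain, index-based groups remain.
    plain = re.sub(r"\(\?<[A-Za-z_]\w*>", "(", pattern)
    m = re.search(plain, haystack, flags)
    if m is None:
        return None
    result = {0: m.group(0), "Count": len(names)}
    for index, name in enumerate(names, start=1):
        result[index] = m.group(index)
        if name is not None:
            result[name] = m.group(index)
    return result

match = regex_match("01/23/2019", r"(?<month>\d\d)\/(?<day>\d\d)\/(?<year>\d\d\d\d)")
print(match["day"], match["month"], match["year"])
```

Returning None when nothing matches is a design choice of this sketch; the VBA version above would raise an error on oMatches(0) in that case.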

Thanks @Sancarn for his code!
For a few reasons I've revised it. The changes I've made are documented inside the code:
' Procedure for testing 'RegexMatch'.
' - It shows how to convert a date from 'mm/dd/yyyy' to 'dd.mm.yyyy' format.
' - It shows how to retrieve named groups by real name: 'Match.Item("group name")'
' as well as by number: 'Match.Items(group number)'.
' - It shows how to retrieve unnamed groups by number-generated name as well as by number.
' - It shows how to retrieve group count and the whole match by number-generated name as well as by number.
' - It shows that non-capturing groups like '(?:y)?' won't be listed.
' - It shows that a left parenthesis inside a character class like '([x(])?' won't interfere.
' Take notice of:
' - the small difference between 'Item' and 'Items'
' - the quotes in 'Match.Item("number of an unnamed group")'
Sub TestRegexMatch()
Dim Match As Scripting.Dictionary
Set Match = RegexMatch("01/23/2019z", "(?<month>\d\d)\/([x(])?(?<day>\d\d)\/(?:y)?(?<year>\d\d\d\d)(z)?")
Debug.Print Match.Item("day") & "." & Match.Item("month") & "." & Match.Item("year") & " vs. " & Match.Items(2) & "." & Match.Items(0) & "." & Match.Items(3)
Debug.Print "'" & Match.Item("1") & "'" & ", '" & Match.Item("4") & "' vs. '" & Match.Items(1) & "', '" & Match.Items(4) & "'"
Debug.Print Match.Item("98") & " vs. " & Match.Items(Match.Count - 2)
Debug.Print Match.Item("99") & " vs. " & Match.Items(Match.Count - 1)
End Sub
' An implementation of regex which includes named groups and caching implemented in VBA.
' The 'Microsoft VBScript Regular Expressions 5.5' library must be referenced (in VBA-editor: Tools -> References).
' Parameters:
' - haystack: the string the regex is applied on.
' - originalPattern: the regex pattern with or without named groups.
' The group naming has to follow .net regex syntax: '(?<group name>group content)'.
' Group names may contain the following characters: a-z, A-Z, 0-9, _ (underscore).
' Group names must not be an empty string.
' - options: a string that may contain:
' - 'i' (the regex will work case-insensitive)
' - 'g' (the regex will work globally)
' - 'm' (the regex will work in multi-line mode)
' or any combination of these.
' Returned value: a Scripting.Dictionary object with the following entries:
' - Item 0 or "0", 1 or "1" ... for the groups content/submatches,
' following the convention of VBScript_RegExp_55.SubMatches collection, which is 0-based.
' - Item Match.Count - 2 or "98" for the whole match, assuming that the number of groups is below 98.
' - Item Match.Count - 1 or "99" for number of groups/submatches.
' Changes compared to the original version:
' - Handles non-capturing and positive and negative lookahead groups.
' - Handles left parenthesis inside a character class.
' - Named groups do not count twice.
' E.g. in the original version the second named group occupies items 3 and 4 of the returned
' dictionary, in this revised version only item 1 (item 0 is the first named group).
' - Additional 'm' option.
' - Fixed fetching cached regexes.
' - Early binding.
' - Some code cleaning.
' For an example take a look at the 'TestRegexMatch' procedure above.
Function RegexMatch(ByVal haystack As String, ByVal originalPattern As String, Optional ByVal options As String) As Scripting.Dictionary
Dim GroupsPattern As String
Dim RealPattern As String
Dim RealRegExp As VBScript_RegExp_55.RegExp
Dim RealMatches As VBScript_RegExp_55.MatchCollection
Dim ReturnData As Scripting.Dictionary
Dim GroupNames As VBScript_RegExp_55.MatchCollection
Dim Ctr As Integer
' Cache regexes and group names for optimisation.
Static CachedRegExps As Scripting.Dictionary
Static CachedGroupNames As Scripting.Dictionary
' Group 'meta'-regex used to detect named and unnamed capturing groups.
Static GroupsRegExp As VBScript_RegExp_55.RegExp
If CachedRegExps Is Nothing Then Set CachedRegExps = New Scripting.Dictionary
If CachedGroupNames Is Nothing Then Set CachedGroupNames = New Scripting.Dictionary
If GroupsRegExp Is Nothing Then
Set GroupsRegExp = New VBScript_RegExp_55.RegExp
' Original version: GroupsRegExp.Pattern = "\((?:\?\<(.*?)\>)?"
GroupsRegExp.Pattern = "\((?!(?:\?:|\?=|\?!|[^\]\[]*?\]))(?:\?<([a-zA-Z0-9_]+?)>)?"
GroupsRegExp.Global = True
End If
' If the pattern isn't cached, create it.
If Not CachedRegExps.Exists("(" & options & ")" & originalPattern) Then
' Prepare the pattern for retrieving named and unnamed groups.
GroupsPattern = Replace(Replace(Replace(Replace(originalPattern, "\\", "X"), "\(", "X"), "\[", "X"), "\]", "X")
' Store group names for optimisation.
CachedGroupNames.Add "(" & options & ")" & originalPattern, GroupsRegExp.Execute(GroupsPattern)
' Create new VBScript regex valid pattern and set regex for this pattern.
RealPattern = GroupsRegExp.Replace(originalPattern, "(")
Set RealRegExp = New VBScript_RegExp_55.RegExp
RealRegExp.Pattern = RealPattern
' Set regex options.
For Ctr = 1 To Len(options)
Select Case Mid(options, Ctr, 1)
Case "i"
RealRegExp.IgnoreCase = True
Case "g"
RealRegExp.Global = True
Case "m"
RealRegExp.MultiLine = True
End Select
Next
' Store this regex for optimisation.
CachedRegExps.Add "(" & options & ")" & originalPattern, RealRegExp
End If
' Get matches.
Set RealMatches = CachedRegExps.Item("(" & options & ")" & originalPattern).Execute(haystack)
' Get group names.
Set GroupNames = CachedGroupNames.Item("(" & options & ")" & originalPattern)
' Create dictionary to return.
Set ReturnData = New Scripting.Dictionary
' Fill dictionary with names and indexes as described in the remarks introducing this procedure.
For Ctr = 1 To GroupNames.Count
If IsEmpty(GroupNames(Ctr - 1).SubMatches(0)) Then
ReturnData.Add CStr(Ctr - 1), RealMatches(0).SubMatches(Ctr - 1)
Else
ReturnData.Add GroupNames(Ctr - 1).SubMatches(0), RealMatches(0).SubMatches(Ctr - 1)
End If
Next
ReturnData.Add "98", RealMatches.Item(0)
ReturnData.Add "99", GroupNames.Count
' Return the result.
Set RegexMatch = ReturnData
End Function
For further improvement this code could be the base of a class module for replacement of the VBScript regex.
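The group detection is the fragile part of this approach. As an alternative to the meta-regex, a plain character scan can track escapes and character classes explicitly. The sketch below is in Python and uses a different technique from the code above (a hand-written scanner, not a regex); the function name `capture_group_names` is hypothetical, and a "]" placed first inside a character class is not treated specially.

```python
def capture_group_names(pattern):
    """Return one entry per capturing group, in order of appearance:
    the group's name, or None for an unnamed capturing group."""
    names = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2                      # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False        # end of character class
        elif ch == "[":
            in_class = True             # parentheses in here are literals
        elif ch == "(":
            is_named = (pattern.startswith("(?<", i)
                        and not pattern.startswith(("(?<=", "(?<!"), i))
            if is_named:
                end = pattern.index(">", i)
                names.append(pattern[i + 3:end])
            elif not pattern.startswith("(?", i):
                names.append(None)      # plain capturing group
            # anything else starting with "(?" is non-capturing or a lookaround
        i += 1
    return names

print(capture_group_names(r"(?<month>\d\d)\/([x(])?(?<day>\d\d)\/(?:y)?(?<year>\d\d\d\d)(z)?"))
```

The position of a name in the returned list is the zero-based submatch index, which is what a class module wrapping the VBScript regex would need in order to look up groups by name.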

Related

Find '~XX~' within a string with specific values

I have classic ASP written in VBScript. I have a record pulled from SQL Server and the data is a string. In this string, I need to find text enclosed in ~12345~ and I need to replace with very specific text. Example 1 would be replaced with M, 2 would be replaced with A. I then need to display this on the web page. We don't know how many items will be enclosed with ~.
Example Data:
Group Pref: (To be paid through WIT)
~2.5~ % Quarterly Rebate - Standard Commercial Water Heaters
Display on webpage after:
Group Pref: (To be paid through WIT)
~A.H~ % Quarterly Rebate - Standard Commercial Water Heaters
I tried the following, but there are too many cases and this would be unrealistic to maintain. It does replace the text and display correctly.
dim strSearchThis
strSearchThis =(rsResults("PREF"))
set re = New RegExp
with re
.global = true
.pattern = "~[^>]*~"
strSearchThis = .replace(strSearchThis, "X")
end with
I am also trying this code. I can find the text contained between each ~ ~, but when it is displayed, the information between the ~ ~ is not changed:
dim strSearchThis
strSearchThis =(rsResults("PREF"))
Set FolioPrefData = New RegExp
FolioPrefData.Pattern = "~[^>]*~"
FolioPrefData.Global = True
FolioPrefData.IgnoreCase = True
'will contain all found instances of ~ ~'
set colmatches = FolioPrefData.Execute(strSearchThis)
Dim itemLength, found
For Each objMatch in colMatches
Select Case found
Case "~"
'ignore - doing nothing'
Case "1"
found = replace(strSearchThis, "M")
End Select
Next
response.write(strSearchThis)
You can do it without using Regular Expressions, just checking the individual characters and writing a function that handles the different cases you have. The following function finds your delimited text and loops through all characters, calling the ReplaceCharacter function defined further down:
Function FixString(p_sSearchString)
Dim iStartIndex
Dim iEndIndex
Dim iIndex
Dim sReplaceString
Dim sReturnString
sReturnString = p_sSearchString
' Locate start ~
iStartIndex = InStr(sReturnString, "~")
Do While iStartIndex > 0
' Look for end ~
iEndIndex = InStr(iStartIndex + 1, sReturnString, "~")
If iEndIndex > 0 Then
sReplaceString = ""
' Loop through all characters
For iIndex = iStartIndex + 1 To iEndIndex - 1
sReplaceString = sReplaceString & ReplaceCharacter(Mid(sReturnString, iIndex, 1))
Next
' Replace string
sReturnString = Left(sReturnString, iStartIndex) & sReplaceString & Mid(sReturnString, iEndIndex)
' Locate next ~
iStartIndex = InStr(iEndIndex + 1, sReturnString, "~")
Else
' End couldn't be found, exit
Exit Do
End If
Loop
FixString = sReturnString
End Function
This is the function where you will enter the different character substitutions you might have:
Function ReplaceCharacter(p_sCharacter)
Select Case p_sCharacter
Case "1"
ReplaceCharacter = "M"
Case "2"
ReplaceCharacter = "A"
Case Else
ReplaceCharacter = p_sCharacter
End Select
End Function
You can use this in your existing code:
response.write(FixString(strSearchThis))
You can also use a Split and Join method...
Const SEPARATOR = "~"
Dim deconstructString, myOutputString
Dim arrayPointer
deconstructString = Split(myInputString, SEPARATOR)
For arrayPointer = 0 To UBound(deconstructString)
If IsNumeric(deconstructString(arrayPointer)) Then
'Do whatever you need to with your value...
End If
Next 'arrayPointer
'Rejoin with the separator so the ~ markers are kept in the output
myOutputString = Join(deconstructString, SEPARATOR)
Obviously, this relies on breaking the string apart and rejoining it, so there is a slight overhead from the extra string allocations.
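Both answers come down to the same two steps: locate each ~...~ run, then map its characters through a lookup table. As a compact illustration of that idea, here is a sketch in Python using a regex replace with a callback. The names `CODES` and `fix_string` are made up for the example, and the table only contains the three substitutions that can be read off the question (1 to M, 2 to A, 5 to H).

```python
import re

# Substitution table; only 1 -> M, 2 -> A and 5 -> H come from the question.
CODES = str.maketrans({"1": "M", "2": "A", "5": "H"})

def fix_string(text):
    """Translate the characters inside each ~...~ run, keeping the tildes."""
    return re.sub(
        r"~([^~]*)~",
        lambda m: "~" + m.group(1).translate(CODES) + "~",
        text,
    )

print(fix_string("~2.5~ % Quarterly Rebate - Standard Commercial Water Heaters"))
```

Keeping the substitutions in one table means a new code only needs a new table entry, which addresses the maintenance concern raised in the question.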

How to match escaped group signs {&date:dd.\{mm\}.yyyy} but not {&date:dd.{mm}.yyyy} with vba and regex

I'm trying to create a pattern for finding placeholders within a string to be able to replace them with variables later. I'm stuck on a problem to find all these placeholders within a string according to my requirement.
I already found this post, but it only helped a little:
Regex match ; but not \;
Placeholders will look like this
{&var} --> Variable stored in a dictionary --> dict("var")
{$prop} --> Property of a class cls.prop read by CallByName and PropGet
{#const} --> Some constant values by name from a function
Generally I have this pattern and it works well
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.pattern = "\{([#\$&])([\w\.]+)\}"
For example I have this string:
"Value of foo is '{&var}' and bar is '{$prop}'"
I get 2 matches as expected
(&)(var)
($)(prop)
I also want to add a formatting part to this expression, like in .NET.
String.Format("This is a date: {0:dd.MM.yyyy}", DateTime.Now);
// This is a date: 05.07.2019
String.Format("This is a date, too: {0:dd.(MM).yyyy}", DateTime.Now);
// This is a date, too: 05.(07).2019
I extended the RegEx to get that optional formatting string
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.pattern = "\{([#\$&])([\w\.]+):{0,1}([^\}]*)\}"
RegEx.Execute("Value of foo is '{&var:DD.MM.YYYY}' and bar is '{$prop}'")
I get 2 matches as expected
(&)(var)(DD.MM.YYYY)
($)(prop)()
At this point I noticed I have to take care of escaped "{" and "}", because I may want to have some brackets within the formatted result.
This does not work properly, because my pattern stops after "...{MM"
RegEx.Execute("Value of foo is '{&var:DD.{MM}.YYYY}' and bar is '{$prop}'")
It would be okay to add escape signs to the text before checking the regex:
RegEx.Execute("Value of foo is '{&var:DD.\{MM\}.YYYY}' and bar is '{$prop}'")
But how can I correctly add the negative lookbehind?
And second: How can this also work for variables that should not be resolved because the outer bracket is escaped, even though they otherwise have the correct syntax?
RegEx.Execute("This should not match '\{&var:DD.\{MM\}.YYYY\}' but this one '{&var:DD.\{MM\}.YYYY}'")
I hope my question is not confusing and someone can help me
Update 05.07.19 at 12:50
After the great help of @wiktor-stribiżew the result is complete.
As requested, I provide some example code:
Sub testRegEx()
Debug.Print FillVariablesInText(Nothing, "Date\\\\{$var01:DD.\{MM\}.YYYY}\\\\ Var:\{$nomatch\}{$var02} Double: {#const}{$var01} rest of string")
End Sub
Function FillVariablesInText(ByRef dict As Dictionary, ByVal txt As String) As String
Const c_varPattern As String = "(?:(?:^|[^\\\n])(?:\\{2})*)\{([#&\$])([\w.]+)(?:\:([^}\\]*(?:\\.[^\}\\]*)*))?(?=\})"
Dim part As String
Dim snippets As New Collection
Dim allMatches, m
Dim i As Long, j As Long, x As Long, n As Long
' Create a RegEx object and execute pattern
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.pattern = c_varPattern
RegEx.MultiLine = True
RegEx.Global = True
Set allMatches = RegEx.Execute(txt)
' Start at position 1 of txt
j = 1
n = 0
For Each m In allMatches
n = n + 1
Debug.Print "(" & n & "):" & m.value
Debug.Print " [0] = " & m.SubMatches(0) ' Type [&$#]
Debug.Print " [1] = " & m.SubMatches(1) ' Name
Debug.Print " [2] = " & m.SubMatches(2) ' Format
part = "{" & m.SubMatches(0)
' Get offset for pre-match-string
x = 1 ' Index to position: at least +1
Do While Mid(m.value, x, 2) <> part
x = x + 1
Loop
' Position in txt
i = m.FirstIndex + x
' Anything to add to result?
If i <> j Then
snippets.Add Mid(txt, j, i - j)
End If
' Next start position (not index!) + 1 for the "}" of the positive lookahead
j = m.FirstIndex + m.Length + 2
' Here comes a function to get the actual value
' e.g.: snippets.Add dict(m.SubMatches(1))
' or : snippets.Add Format(dict(m.SubMatches(1)), m.SubMatches(2))
snippets.Add "<<" & m.SubMatches(0) & m.SubMatches(1) & ">>"
Next m
' Any text at the end?
If j <= Len(txt) Then
snippets.Add Mid(txt, j)
End If
' Join snippets
For i = 1 To snippets.Count
FillVariablesInText = FillVariablesInText & snippets(i)
Next
End Function
The function testRegEx gives me this result and debug print:
(1):e\\\\{$var01:DD.\{MM\}.YYYY
[0] = $
[1] = var01
[2] = DD.\{MM\}.YYYY
(2):}{$var02
[0] = $
[1] = var02
[2] =
(3): {#const
[0] = #
[1] = const
[2] =
(4):}{$var01
[0] = $
[1] = var01
[2] =
Date\\\\<<$var01>>\\\\ Var:\{$nomatch\}<<$var02>> Double: <<#const>><<$var01>> rest of string
You may use
((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?}
To make sure the consecutive matches are found, too, turn the last } into a lookahead, and when extracting matches just append it to the result, or if you need the indices increment the match length by 1:
((?:^|[^\\])(?:\\{2})*)\{([#$&])([\w.]+)(?::([^}\\]*(?:\\.[^}\\]*)*))?(?=})
                                                                      ^^^^^
See the regex demo and regex demo #2.
Details
((?:^|[^\\])(?:\\{2})*) - Group 1 (makes sure the { that comes next is not escaped): start of string or any char but \ followed with 0 or more double backslashes
\{ - a { char
([#$&]) - Group 2: any of the three chars
([\w.]+) - Group 3: 1 or more word or dot chars
(?::([^}\\]*(?:\\.[^}\\]*)*))? - an optional sequence of : and then Group 4:
[^}\\]* - 0 or more chars other than } and \
(?:\\.[^}\\]*)* - zero or more repetitions of a \-escaped char and then 0 or more chars other than } and \
} - a } char
Welcome to the site! If you need to match only balanced escapes, you will need something more powerful. If not (I haven't tested this), you could try replacing [^\}]* with (?:[^\{\}]|\\\{|\\\})*. That is, match non-braces and escaped brace sequences separately. You may need to change this depending on how you want to handle backslashes in your formatting string.
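To show how the lookahead variant of the pattern can drive the replacement without searching for the brace by hand, here is a sketch in Python. It is an illustration under assumptions, not the asker's code: `fill_placeholders` and the `resolve` callback are hypothetical names, and the pattern is the lookahead version from the answer with the newline exclusion from the asker's constant. Group 1 is kept as a capturing group so that its end position gives the index of the opening brace directly.

```python
import re

# Group 1 is the unescaped context: start of line, or a non-backslash char,
# followed by any number of backslash pairs.
PLACEHOLDER = re.compile(
    r"((?:^|[^\\\n])(?:\\{2})*)"          # 1: context before the opening brace
    r"\{([#$&])([\w.]+)"                   # 2: type sign, 3: name
    r"(?::([^}\\]*(?:\\.[^}\\]*)*))?"      # 4: optional format string
    r"(?=\})",                             # closing brace, not consumed
    re.MULTILINE,
)

def fill_placeholders(text, resolve):
    """Replace every unescaped placeholder with resolve(sign, name, fmt)."""
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(text):
        brace = m.end(1)                   # index of the opening "{"
        parts.append(text[pos:brace])      # literal text before the placeholder
        parts.append(resolve(m.group(2), m.group(3), m.group(4)))
        pos = m.end() + 1                  # skip the "}" seen by the lookahead
    parts.append(text[pos:])
    return "".join(parts)

sample = r"Date\\\\{$var01:DD.\{MM\}.YYYY}\\\\ Var:\{$nomatch\}{$var02} Double: {#const}{$var01} rest of string"
print(fill_placeholders(sample, lambda sign, name, fmt: f"<<{sign}{name}>>"))
```

Because the closing brace is only looked at and never consumed, it is still available as the "non-backslash char" in front of a directly following placeholder such as {#const}{$var01}.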

Regex - Quantifier {x,y} following nothing

I'm creating a basic text editor and I'm using regex to achieve a find and replace function. To do this I've gotten this code:
Private Function GetRegExpression() As Regex
Dim result As Regex
Dim regExString As [String]
' Get what the user entered
If TabControl1.SelectedIndex = 0 Then
regExString = txtbx_Find2.Text
ElseIf TabControl1.SelectedIndex = 1 Then
regExString = txtbx_Find.Text
End If
If chkMatchCase.Checked Then
result = New Regex(regExString)
Else
result = New Regex(regExString, RegexOptions.IgnoreCase)
End If
Return result
End Function
And this is the Find method
Private Sub FindText()
''
Dim WpfTest1 As New Spellpad.Tb
Dim ElementHost1 As System.Windows.Forms.Integration.ElementHost = frm_Menu.Controls("ElementHost1")
Dim TheTextBox As System.Windows.Controls.TextBox = CType(ElementHost1.Child, Tb).ctrl_TextBox
''
' Is this the first time find is called?
' Then make instances of RegEx and Match
If isFirstFind Then
regex = GetRegExpression()
match = regex.Match(TheTextBox.Text)
isFirstFind = False
Else
' match.NextMatch() is also ok, except in Replace
' In replace as text is changing, it is necessary to
' find again
'match = match.NextMatch();
match = regex.Match(TheTextBox.Text, match.Index + 1)
End If
' found a match?
If match.Success Then
' then select it
Dim row As Integer = TheTextBox.GetLineIndexFromCharacterIndex(TheTextBox.CaretIndex)
MoveCaretToLine(TheTextBox, row + 1)
TheTextBox.SelectionStart = match.Index
TheTextBox.SelectionLength = match.Length
Else
If TabControl1.SelectedIndex = 0 Then
MessageBox.Show([String].Format("Cannot find ""{0}"" ", txtbx_Find2.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
ElseIf TabControl1.SelectedIndex = 1 Then
MessageBox.Show([String].Format("Cannot find ""{0}"" ", txtbx_Find.Text), Application.ProductName, MessageBoxButtons.OK, MessageBoxIcon.Information)
End If
isFirstFind = True
End If
End Sub
When I run the program I get errors:
For ?, parsing "?" - Quantifier {x,y} following nothing.; and
For *, parsing "*" - Quantifier {x,y} following nothing.
It's as if I can't use these but I really need to. How can I solve this problem?
? and * are quantifiers in regular expressions:
? is used to specify that something is optional, for instance b?au can match both bau and au.
* means the element it binds to can be repeated zero, one or multiple times: for instance ba*u can match bu, bau, baau, baaaaaaaau,...
Now most regular expression engines offer {l,u} as a third quantifier, with l the lower bound on the number of times something is repeated and u the upper bound on the number of occurrences. So ? is equivalent to {0,1} and * to {0,}.
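The equivalence is easy to check. The following sketch uses Python (its `re` module follows the same quantifier rules); the helper name `matching` and the sample words are chosen for illustration only.

```python
import re

def matching(pattern, samples):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

samples = ["au", "bau", "bu", "baau", "bbau"]

# "?" behaves like {0,1} and "*" like {0,}
print(matching(r"b?au", samples), matching(r"b{0,1}au", samples))
print(matching(r"ba*u", samples), matching(r"ba{0,}u", samples))
```

In both lines the two lists come out identical, which is why the error message talks about a "Quantifier {x,y}" even when the user only typed ? or *.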
Now if you provide them without anything before them, the regex parser evidently doesn't know what should be repeated, hence the error. They only work when they follow something, as in this session (I used csharp, but the ideas are generally applicable):
$ csharp
Mono C# Shell, type "help;" for help
Enter statements below.
csharp> Regex r = new Regex("fo*bar");
csharp> r.Replace("Fooobar fooobar fbar fobar","<MATCH>");
"Fooobar <MATCH> <MATCH> <MATCH>"
csharp> r.Replace("fooobar far qux fooobar quux fbar echo fobar","<MATCH>");
"<MATCH> far qux <MATCH> quux <MATCH> echo <MATCH>"
If you wish to do a "raw text find and replace", you should use string.Replace.
EDIT:
Another way to process them is by escaping special regex characters. Ironically enough, you can do this by replacing them by a regex ;).
Private Function GetRegExpression() As Regex
Dim result As Regex
Dim regExString As [String]
' Get what the user entered
If TabControl1.SelectedIndex = 0 Then
regExString = txtbx_Find2.Text
ElseIf TabControl1.SelectedIndex = 1 Then
regExString = txtbx_Find.Text
End If
'Added code
Dim baseRegex As Regex = new Regex("[\\.$^{\[(|)*+?]")
regExString = baseRegex.Replace(regExString,"\$0")
'End added code
If chkMatchCase.Checked Then
result = New Regex(regExString)
Else
result = New Regex(regExString, RegexOptions.IgnoreCase)
End If
Return result
End Function
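The same escape-before-compile idea, sketched in Python for comparison. The function `build_search` and its `use_regex` switch are assumptions made for this example; `re.escape` is Python's built-in helper for quoting metacharacters, and .NET offers `Regex.Escape` for the same purpose, which may be simpler than a hand-written character class.

```python
import re

def build_search(text, use_regex=False, match_case=False):
    """Compile the user's input, escaping it unless regex mode is requested."""
    flags = 0 if match_case else re.IGNORECASE
    source = text if use_regex else re.escape(text)
    return re.compile(source, flags)

# A bare quantifier is not a valid pattern ...
try:
    re.compile("?")
except re.error as exc:
    print("invalid pattern:", exc)

# ... but once escaped it can be searched for as literal text.
print(build_search("?").findall("Why? Because?"))
```

Making the escaping optional lets a text editor offer both a plain "find" and a "find with regular expressions" mode from the same code path.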

RegEx VBA Excel complex string

I have a function pulled from here. My problem is that I don't know what RegEx pattern I need to use to split out the following data:
+1 vorpal unholy longsword +31/+26/+21/+16 (2d6+13)
+1 vorpal flaming whip +30/+25/+20 (1d4+7 plus 1d6 fire and entangle)
2 slams +31 (1d10+12)
I want it to look like:
+1 vorpal unholy longsword, 31
+1 vorpal flaming whip, 30
2 slams, 31
Here is the VBA code that does the RegExp validation:
Public Function RXGET(ByRef find_pattern As Variant, _
ByRef within_text As Variant, _
Optional ByVal submatch As Long = 0, _
Optional ByVal start_num As Long = 0, _
Optional ByVal case_sensitive As Boolean = True) As Variant
' RXGET - Looks for a match for regular expression pattern find_pattern
' in the string within_text and returns it if found, error otherwise.
' Optional long submatch may be used to return the corresponding submatch
' if specified - otherwise the entire match is returned.
' Optional long start_num specifies the number of the character to start
' searching for in within_text. Default=0.
' Optional boolean case_sensitive makes the regex pattern case sensitive
' if true, insensitive otherwise. Default=true.
Dim objRegex As VBScript_RegExp_55.RegExp
Dim colMatch As VBScript_RegExp_55.MatchCollection
Dim vbsMatch As VBScript_RegExp_55.Match
Dim colSubMatch As VBScript_RegExp_55.SubMatches
Dim sMatchString As String
Set objRegex = New VBScript_RegExp_55.RegExp
' Initialise Regex object
With objRegex
.Global = False
' Default is case sensitive
If case_sensitive Then
.IgnoreCase = False
Else: .IgnoreCase = True
End If
.pattern = find_pattern
End With
' Return out of bounds error
If start_num >= Len(within_text) Then
RXGET = CVErr(xlErrNum)
Exit Function
End If
sMatchString = Right$(within_text, Len(within_text) - start_num)
' Create Match collection
Set colMatch = objRegex.Execute(sMatchString)
If colMatch.Count = 0 Then ' No match
RXGET = CVErr(xlErrNA)
Else
Set vbsMatch = colMatch(0)
If submatch = 0 Then ' Return match value
RXGET = vbsMatch.Value
Else
Set colSubMatch = vbsMatch.SubMatches ' Use the submatch collection
If colSubMatch.Count < submatch Then
RXGET = CVErr(xlErrNum)
Else
RXGET = CStr(colSubMatch(submatch - 1))
End If
End If
End If
End Function
I don't know about Excel but this should get you started on the RegEx:
/(?:^|, |and |or )(\+?\d?\s?[^\+]*?) (?:\+|-)(\d+)/
NOTE: There is a slight caveat here. This will also match if an element begins with + only (not being followed by a digit).
Capture groups 1 and 2 contain the strings that go left and right of your comma (if the whole pattern has index 0). So you can do something like capture[1] + ', ' + capture[2] (whatever your syntax for that is).
Here is an explanation of the regex:
/(?:^|, |and |or ) # make sure that we only start looking at
# the beginning of the string, after a comma, after an
# "and" or after an "or"; the "?:" makes sure that this
# subpattern is not capturing
(\+? # an optional literal "+"
\d? # an optional digit
\s? # an optional whitespace character
[^\+]*?) # arbitrarily many non-plus characters; the ? makes it
# non-greedy, otherwise it might span multiple lines
# (followed by) a literal space
(?:\+|-) # a literal "+" or "-"
(\d+)/ # at least one digit (and the brackets are for capturing)
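To check the pattern against the sample lines from the question, here is a short sketch in Python; the same pattern string should work with VBScript.RegExp, since it uses no features beyond non-capturing groups and lazy quantifiers. The names `ATTACK` and `summarize` are made up for the example.

```python
import re

# The pattern from the answer, without the surrounding slashes.
ATTACK = re.compile(r"(?:^|, |and |or )(\+?\d?\s?[^\+]*?) (?:\+|-)(\d+)")

def summarize(line):
    """Return 'name, first attack bonus' or None if the line does not match."""
    m = ATTACK.search(line)
    return f"{m.group(1)}, {m.group(2)}" if m else None

for line in (
    "+1 vorpal unholy longsword +31/+26/+21/+16 (2d6+13)",
    "+1 vorpal flaming whip +30/+25/+20 (1d4+7 plus 1d6 fire and entangle)",
    "2 slams +31 (1d10+12)",
):
    print(summarize(line))
```

With the RXGET function from the question, the two parts would correspond to submatch 1 and submatch 2.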

Extract/convert date from string in MS Access

I'm trying to extract date/times from strings with the following patterns and convert them to date types in Access.
"08-Apr-2012 21:26:49"
"...Confirmed by SMITH, MD, JOHN (123) on 4/2/2012 11:11:01 AM;"
Can anyone help?
Try this
Dim d As Date
d = CDate("08-Apr-2012 21:26:49")
Debug.Print Format(d, "dd-MMM-yyyy")
Debug.Print Format(d, "hh:nn:ss")
Will give
08-Apr-2012
21:26:49
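Use this regex to get the date-time between " on " (i.e. space, on, space) and the first semicolon after it. Note that it relies on lookbehind, which VBScript.RegExp (the engine usually used from VBA) does not support; there you can use the pattern " on (.*?);" and read the first submatch instead.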
(?<=\ on )(.*?)(?=\;)
As already mentioned by Romeo in his answer, you need to use CDate() to convert a string with a valid date value to a Date variable.
You can get the date value out of the string like this:
(given that the strings always look like the one in the example, " on " (with blanks) before the date and ";" after it):
Public Function Test()
Dim Source As String
Dim Tmp As String
Dim DateStart As Integer
Dim DateEnd As Integer
Dim DateValue As Date
Source = "...Confirmed by SMITH, MD, JOHN (123) on 4/2/2012 11:11:01 AM;"
'find the place in the source string where " on " ends
DateStart = InStr(1, Source, " on ") + 4
'find the first semicolon after the date
DateEnd = InStr(DateStart, Source, ";")
'get the part with the date
Tmp = Mid(Source, DateStart, DateEnd - DateStart)
'convert to date
DateValue = CDate(Tmp)
End Function
Add this function to a VBA module:
' ----------------------------------------------------------------------'
' Return a Date object or Null if no date could be extracted '
' ----------------------------------------------------------------------'
Public Function ExtractDate(value As Variant) As Variant
If IsNull(value) Then
ExtractDate = Null
Exit Function
End If
' Using a static, we avoid re-creating the same regex object for every call '
Static regex As Object
' Initialise the Regex object '
If regex Is Nothing Then
Set regex = CreateObject("vbscript.regexp")
With regex
.Global = True
.IgnoreCase = True
.MultiLine = True
.pattern = "(\d+\/\d+/\d+\s+\d+:\d+:\d+\s+\w+|\d+-\w+-\d+\s+\d+:\d+:\d+)"
End With
End If
' Test the value against the pattern '
Dim matches As Object
Set matches = regex.Execute(value)
If matches.count > 0 Then
' Convert the match to a Date if we can '
ExtractDate = CDate(matches(0).value)
Else
' No match found, just return Null '
ExtractDate = Null
End If
End Function
And then use it like this, for instance in a query:
SELECT ID, LogData, ExtractDate(LogData) as LogDate
FROM MyLog
Make sure you check that the dates returned are in the proper format and make sense to you.
CDate() interprets the date string in different ways depending on your locale.
If you're not getting the desired result, you will need to modify the code to separate the individual components of the date and rebuild them using DateSerial() for instance.
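As an illustration of that last suggestion, here is a sketch in Python that captures the individual components and builds the date from them explicitly, which is the role DateSerial() and TimeSerial() would play in VBA. Everything here is an assumption for the example: the names `SLASHED`, `NAMED_MONTH` and `extract_date`, the reading of 4/2/2012 as month/day/year, and the English month abbreviations.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "jan feb mar apr may jun jul aug sep oct nov dec".split(), start=1)}

# Assumption: slashed dates are month/day/year with a 12-hour clock.
SLASHED = re.compile(r"(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+)\s+([AP]M)", re.I)
NAMED_MONTH = re.compile(r"(\d+)-([A-Za-z]{3})-(\d+)\s+(\d+):(\d+):(\d+)")

def extract_date(text):
    """Return a datetime built from explicit components, or None."""
    m = SLASHED.search(text)
    if m:
        month, day, year, hour, minute, second = (int(g) for g in m.groups()[:6])
        # Convert the 12-hour clock: 12 AM -> 0, 12 PM -> 12
        hour = hour % 12 + (12 if m.group(7).upper() == "PM" else 0)
        return datetime(year, month, day, hour, minute, second)
    m = NAMED_MONTH.search(text)
    if m and m.group(2).lower() in MONTHS:
        day, year, hour, minute, second = (int(m.group(i)) for i in (1, 3, 4, 5, 6))
        return datetime(year, MONTHS[m.group(2).lower()], day, hour, minute, second)
    return None

print(extract_date("...Confirmed by SMITH, MD, JOHN (123) on 4/2/2012 11:11:01 AM;"))
print(extract_date("08-Apr-2012 21:26:49"))
```

Because the order of day and month is fixed in the code rather than taken from the regional settings, the result is the same on every machine; if the source data is day/month/year instead, only the unpacking order has to change.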