regex for "finding whole word only" plus allowing one character - regex

I'm using this regular expression to match whole words only.
Example:
Dim oRE, bMatch
Set oRE = New RegExp
oRE.Pattern = "\bFunction\b"
bMatch = oRE.Test("Functions") 'return false
bMatch = oRE.Test("Function dummy") 'return true
I want to allow one extra character at the end of the word. The character I want to allow is the double quote ("). So I would like this line of code to return true:
bMatch = oRE.Test("Function"+chr(34)+" dummy") 'chr(34) is the charcode of doublequote (")

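Store Chr(34) in a variable and concatenate it into your pattern.
dq = Chr(34)
oRE.Pattern = "\bFunction" & dq & "?(?!\w)"
Then the pattern matches Function with or without a trailing double quote.
? makes the double quote optional (use + instead if one or more quotes are required). (?!\w) replaces the closing \b, because \b cannot match between a double quote and a space: neither of them is a word character.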
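As a rough check of the whole-word behaviour, here is a minimal sketch that runs one possible pattern against the three inputs from the question. The pattern uses an optional quote followed by a negative lookahead, which is an assumption about what the asker needs rather than the only way to write it; the array name inputs is hypothetical.

```vbscript
' Sketch: whole word "Function", optionally followed by one double quote.
' (?!\w) rejects a following word character, so "Functions" is not matched.
Dim re, dq, inputs, s
dq = Chr(34)
Set re = New RegExp
re.Pattern = "\bFunction" & dq & "?(?!\w)"

inputs = Array("Functions", "Function dummy", "Function" & dq & " dummy")
For Each s In inputs
    ' Test returns a Boolean: no match for the first input, a match for the other two
    WScript.Echo s & " -> " & re.Test(s)
Next
```

Run with cscript, this should report no match for Functions and a match for the other two strings.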

The double quote can be written as \x22, which makes it easy to embed in your pattern.
I hope this gives the result you want:
Dim oRE, aMatch, bMatch
Set oRE = New RegExp
'Function, then at least one character, up to the first double quote (\x22)
oRE.Pattern = "\bFunction.+?\x22"
aMatch = oRE.Test("Functions""")
bMatch = oRE.Test("Function dummy""")
wscript.echo "Functions " & aMatch
wscript.echo "Function dummy " & bMatch


Break String into individual elements and test for type of Character - NUM - LETTER - SPECIAL - Excel VBA

I need to figure out how to break a string into individual characters and test each one to see whether it is a number, a letter, or a special character.
E.g.:
var = 1S#
Result1 = Num
Result2 = Alpha
Result3 = Special
If you mean escaping user input that is to be treated as a literal string within a regular expression, and that would otherwise be mistaken for special characters, then you can match the characters that need escaping with this regular expression:
/[.*+?^${}()|[\]\\]/g
I got it to work by combining a few different posts on SO. This code breaks the string into an array of characters, then checks each one for num/alpha/special, with a special case for *.
Split string into array of characters?
Regex Expression to check if there are any special characters in string like(!,#<#,$,%<^< etc)
How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
Sub test()
    '''Special Character Section'''
    Dim special_charArr() As String
    Dim special_char As String
    special_char = "!,#,#,$,%,^,&,*,+,/,\,;,:"
    special_charArr() = Split(special_char, ",")
    '''Special Character Section'''
    '''Alpha Section'''
    Dim regexp As Object
    Set regexp = CreateObject("vbscript.regexp")
    Dim strPattern As String
    strPattern = "([a-z])"
    With regexp
        .ignoreCase = True
        .Pattern = strPattern
    End With
    '''Alpha Section'''
    Dim buff() As String
    my_string = "t3s!*"
    ReDim buff(Len(my_string) - 1)
    For i = 1 To Len(my_string)
        buff(i - 1) = Mid$(my_string, i, 1)
        char = buff(i - 1)
        If IsNumeric(char) = True Then
            MsgBox char & " = Number"
        End If
        For Each Key In special_charArr
            special = InStr(char, Key)
            If special = 1 Then
                If Key <> "*" Then
                    MsgBox char & " = Special NOT *"
                Else
                    MsgBox char & " = *"
                End If
            End If
        Next
        If regexp.test(char) Then
            MsgBox char & " = Alpha"
        End If
    Next
End Sub
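The same classification can be sketched more compactly with two character classes. This is only an illustration: the function name CharType is hypothetical, and it assumes that anything that is neither an ASCII digit nor an ASCII letter counts as "Special", which is broader than the explicit list used above.

```vbscript
' Sketch: classify a single character as Num, Alpha or Special.
' Assumption: everything that is not 0-9 or a-z/A-Z is "Special".
Function CharType(ch)
    Dim re
    Set re = New RegExp
    re.IgnoreCase = True

    re.Pattern = "[0-9]"
    If re.Test(ch) Then
        CharType = "Num"
        Exit Function
    End If

    re.Pattern = "[a-z]"
    If re.Test(ch) Then
        CharType = "Alpha"
    Else
        CharType = "Special"
    End If
End Function

Dim s, i
s = "1S#"
For i = 1 To Len(s)
    ' Prints "1 = Num", "S = Alpha", "# = Special"
    WScript.Echo Mid(s, i, 1) & " = " & CharType(Mid(s, i, 1))
Next
```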

Marking word as "find and replace" in Microsoft Word with RegEx

I am trying to highlight each word found by a RegEx and, if the user confirms, replace it with its corresponding substitute.
The code works correctly only when nothing is substituted.
Should the matches perhaps be recomputed after every replacement?
Sub Replace()
Dim regExp As Object
Set regExp = CreateObject("vbscript.regexp")
Dim arr As Variant
Dim arrzam As Variant
Dim i As Long
Dim choice As Integer
Dim Document As Word.Range
Set Document = ActiveDocument.Content
On Error Resume Next
'EGN
'IBAN
arr = VBA.Array("((EGN(:{0,1})){0,1})[0-9]{10}", _
"[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}")
arrzam = VBA.Array("[****]", _
"[IBAN]")
With regExp
For i = 0 To UBound(arr)
.Pattern = arr(i)
.Global = True
For Each Match In regExp.Execute(Document)
ActiveDocument.Range(Match.FirstIndex, Match.FirstIndex + Match.Length).Duplicate.Select
choice = MsgBox("Replace " & Chr(34) & Match.Value & Chr(34) & " with " & Chr(34) & arrzam(i) & Chr(34) & "?", _
vbYesNoCancel + vbDefaultButton1, "Replace")
If choice = vbYes Then
Document = .Replace(Document, arrzam(i))
ElseIf choice = vbCancel Then
Next
End If
Next
Next
End With
End Sub
Actually, there are several things wrong with this.
First, the collection iterated by For Each Match is static: it is computed once, when Execute runs. You're changing the document in the meantime, so each successive Match refers to a stale position.
Second, you're replacing all the occurrences at once, so there is no need to loop through them. A single one-line Replace would do the same thing.
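One way around the stale-position problem is to process the matches from last to first, so that a replacement never shifts the positions of the matches still waiting to be handled. The sketch below shows the idea on a plain string rather than a Word document; the sample text is made up, and in the real macro the confirmation prompt and the Word range handling would go inside the loop.

```vbscript
' Sketch: replace matches one at a time, last match first, so the
' FirstIndex values of the remaining matches stay valid.
Dim re, text, matches, i, m
Set re = New RegExp
re.Pattern = "[0-9]{10}"
re.Global = True

text = "EGN 1234567890 and 0987654321 end"   ' hypothetical sample text
Set matches = re.Execute(text)

For i = matches.Count - 1 To 0 Step -1
    Set m = matches(i)
    ' FirstIndex is zero-based; Left/Mid are one-based
    text = Left(text, m.FirstIndex) & "[****]" & _
           Mid(text, m.FirstIndex + m.Length + 1)
Next

WScript.Echo text   ' EGN [****] and [****] end
```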

Excluding line breaks from regex capture

I realise that a similar question has been asked before and answered, but the problem persists after I've tried the solution proposed in that answer.
I want to write an Excel macro to separate a multi-line string into multiple single lines, trimmed of whitespace including line breaks. This is my code:
Sub testRegexMatch()
Dim r As New VBScript_RegExp_55.regexp
Dim str As String
Dim mc As MatchCollection
r.Pattern = "[\r\n\s]*([^\r\n]+?)[\s\r\n]*$"
r.Global = True
r.MultiLine = True
str = "This is a haiku" & vbCrLf _
& "You may read it if you wish " & vbCrLf _
& " but you don't have to"
Set mc = r.Execute(str)
For Each Line In mc
Debug.Print "^" & Line & "$"
Next Line
End Sub
Expected output:
^This is a haiku$
^You may read it if you wish$
^but you don't have to$
Actual output:
^This is a haiku
$
^
You may read it if you wish
$
^
but you don't have to$
I've tried the same thing on Regex101, but this appears to show the correct captures, so it must be a quirk of VBA's regex engine.
Any ideas?
You just need to access the captured values via SubMatches():
When a regular expression is executed, zero or more submatches can result when subexpressions are enclosed in capturing parentheses. Each item in the SubMatches collection is the string found and captured by the regular expression.
Here is my demo:
Sub DemoFn()
Dim re, targetString, colMatch, objMatch
Set re = New regexp
With re
.pattern = "\s*([^\r\n]+?)\s*$"
.Global = True ' Same as /g at the online tester
.MultiLine = True ' Same as /m at regex101.com
End With
targetString = "This is a haiku " & vbLf & " You may read it if you wish " & vbLf & " but you don't have to"
Set colMatch = re.Execute(targetString)
For Each objMatch In colMatch
Debug.Print objMatch.SubMatches.Item(0) ' <== SEE HERE
Next
End Sub
It prints:
This is a haiku
You may read it if you wish
but you don't have to
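To make the difference between the whole match and the captured group concrete, here is a minimal sketch on a single padded word. The sample input and the simplified pattern are assumptions chosen for illustration.

```vbscript
' Sketch: Match.Value is the whole match, SubMatches(0) is only the
' text captured by the first pair of parentheses.
Dim re, m
Set re = New RegExp
re.Pattern = "\s*(\w+)\s*"

Set m = re.Execute("   haiku   ")(0)
WScript.Echo "[" & m.Value & "]"           ' [   haiku   ]
WScript.Echo "[" & m.SubMatches(0) & "]"   ' [haiku]
```

Printing a Match object directly uses its default Value property, which is why the surrounding whitespace showed up in the question's output.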

VBS script to report AD groups - Regex pattern not working with multiple matches

Having an issue with getting a regex statement to accept two expressions.
The "re.pattern" code here works:
If UserChoice = "" Then WScript.Quit 'Detect Cancel
re.Pattern = "[^(a-z)^(0,4,5,6,7,8,9)]"
re.Global = True
re.IgnoreCase = True
if re.test( UserChoice ) then
Exit Do
End if
MsgBox "Please choose either 1, 2 or 3 ", 48, "Invalid Entry"
The "regEx.Pattern" code below, however, does not. I want to use it to format the results of a DSQUERY command where groups are collected, but I don't want any of the info after the ",", nor do I want the leading CN= that is normally returned when the following dsquery is run:
"dsquery.exe user forestroot -samid "& strInput &" | dsget user -memberof")
The string I want to format would look something like this before formatting:
CN=APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso,DC=biz
This is the result I want:
APP_GROUP_123
Set regEx = New RegExp
regEx.Pattern = "[,.*]["CN=]"
Result = regEx.Replace(StrLine, "")
I'm only able to get the regex to work when used individually, either
regEx.Pattern = ",."
or
regEx.Pattern = "CN="
The code is nested here:
Set InputFile = FSO.OpenTextFile("Temp.txt", 1)
set OutPutFile = FSO.OpenTextFile(StrInput & "-Results.txt", 8, True)
do While InputFile.AtEndOfStream = False
StrLine = InputFile.ReadLine
If inStr(strLine, TaskChoice) then
Set regEx = New RegExp
regEx.Pattern = "[A-Za-z]{2}=(.+?),.*"
Result = regEx.Replace(StrLine, "")
OutputFile.write(Replace(Result,"""","")) & vbCrLf
End if
This should get you started:
str = "CN=APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso,DC=biz"
Set re = New RegExp
re.pattern = "[A-Za-z]{2}=(.+?),.*"
if re.Test(str) then
set matches = re.Execute(str)
matched_str = "Matched: " & matches(0).SubMatches(0)
Wscript.echo matched_str
else
Wscript.echo "Not a match"
end if
Output: Matched: APP_GROUP_123
The regex you need is [A-Za-z]{2}=(.+?),.*
If the match is successful, it captures everything in the parentheses. .+? matches any characters non-greedily, so it stops at the first comma. The ? in .+? is what makes the expression non-greedy. If you were to omit it, you would capture everything up to the final comma, at ,DC=biz.
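The effect of the ? can be seen by running both variants against the sample string. This is a small sketch using the distinguished name from the question; the variable names are arbitrary.

```vbscript
' Sketch: lazy (.+?) versus greedy (.+) capture on the same input.
Dim re, s, m
s = "CN=APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso,DC=biz"
Set re = New RegExp

re.Pattern = "[A-Za-z]{2}=(.+?),.*"   ' lazy: stops at the first comma
Set m = re.Execute(s)(0)
WScript.Echo m.SubMatches(0)          ' APP_GROUP_123

re.Pattern = "[A-Za-z]{2}=(.+),.*"    ' greedy: runs to the last comma
Set m = re.Execute(s)(0)
WScript.Echo m.SubMatches(0)          ' APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso
```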
Your regular expression "[,.*]["CN=]" doesn't work for two reasons:
It contains an unescaped double quote. Double quotes inside VBScript strings must be escaped by doubling them; otherwise the interpreter reads your expression as a string "[,.*][", followed by an (invalid) variable name CN=] (without an operator, too) and the beginning of the next string (the third double quote).
You misunderstand regular expression syntax. Square brackets indicate a character class. The expression [,.*] matches any single comma, period or asterisk, not a comma followed by any number of characters.
What you meant to use was an alternation, which is expressed by a pipe symbol (|), and the beginning of a string is matched by a caret (^):
regEx.Pattern = ",.*|^CN="
regEx.Global = True 'needed so that both alternatives are replaced, not just the first match
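Note that with an alternation like this, Replace only removes both parts when Global is True, since the RegExp object defaults to replacing the first match only. A minimal sketch on the sample line from the question:

```vbscript
' Sketch: the alternation needs Global = True to strip both the
' leading CN= and everything from the first comma onwards.
Dim regEx, strLine
strLine = "CN=APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso,DC=biz"
Set regEx = New RegExp
regEx.Pattern = ",.*|^CN="

regEx.Global = False                    ' default: first match only
WScript.Echo regEx.Replace(strLine, "") ' APP_GROUP_123,OU=Global Groups,OU=Accounts,DC=corp,DC=contoso,DC=biz

regEx.Global = True                     ' replace every match
WScript.Echo regEx.Replace(strLine, "") ' APP_GROUP_123
```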
With that said, in your case a better approach would be using a group and replacing the whole string with just the group match:
regEx.Pattern = "^cn=(.*?),.*"
regEx.IgnoreCase = True
Result = regEx.Replace(strLine, "$1")

Using regexp in Excel can I perform some arithmetic on the matched pattern before replacing the matched string?

I am using VBScript.RegExp to find and replace using a regular expression. I'm trying to do something like this:
Dim regEx
Set regEx = CreateObject("VBScript.RegExp")
regEx.Pattern = "ID_(\d{3})"
regEx.IgnoreCase = False
regEx.Global = True
regEx.Replace(a_cell.Value, "=HYPERLINK(A" & CStr(CInt("$1") + 2) )
I.e. I have cells which contain things like ID_006 and I want to replace the contents of such a cell with a hyperlink to cell A8. So I match the three digits, and then want to add 2 to those digits to get the correct row to hyperlink to.
But the CStr(CInt("$1") + 2) part doesn't work. Any suggestions on how I can make it work?
I've posted this answer with these points in mind:
you should test for a valid match before trying a replace
in your current code Global is redundant, as you can add only one hyperlink (one match) to a cell
your current code will accept a partial string match; if you want to avoid matching ID_9999, then match the entire string using ^ and $. This version does that; you can revert to your current pattern with .Pattern = "ID_(\d{3})"
normally when adding a hyperlink a visible address is needed. The code below does this (with the row manipulation in one shot)
The code below runs on A1:A10 (sample shown dumping to B1:B10 for pre and post code)
Sub ParseIt()
    Dim rng1 As Range
    Dim rng2 As Range
    Dim regEx
    Set rng1 = Range([a1], [a10])
    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        'match entire string
        .Pattern = "^ID_(\d{3})$"
        'match anywhere
        ' .Pattern = "ID_(\d{3})"
        .IgnoreCase = False
        For Each rng2 In rng1
            If .test(rng2.Value) Then
                'use Anchor:=rng2.Offset(0, 1) to dump one column to the right)
                ActiveSheet.Hyperlinks.Add Anchor:=rng2, Address:="", SubAddress:= _
                    Cells(.Replace(rng2.Value, "$1") + 2, rng2.Column).Address, TextToDisplay:=Cells(.Replace(rng2.Value, "$1") + 2, rng2.Column).Address
            End If
        Next
    End With
End Sub
This is because: "=HYPERLINK(A" & CStr(CInt("$1") + 2) is evaluated once, when the code is executed, not once for every match.
You need to capture and process the match like this:
a_cell_Value = "*ID_006*"
Set matches = regEx.Execute(a_cell_Value)
Debug.Print "=HYPERLINK(A" & CLng(matches(0).SubMatches(0)) + 2 & ")"
>> =HYPERLINK(A8)
Or, if they are all in ??_NUM format:
a_cell_Value = "ID_11"
?"=HYPERLINK(A" & (2 + val(mid$(a_cell_Value, instr(a_cell_Value,"_") +1))) & ")"
=HYPERLINK(A13)
The line -
regEx.Replace(a_cell.Value, "=HYPERLINK(A" & CStr(CInt("$1") + 2) )
won't work as VBA will try to do a CInt on the literal string "$1" rather than on the match from your RegEx.
It would work if you did your replace in 2 steps, something like this -
Dim a_cell
a_cell = Sheets(1).Cells(1, 1)
Dim regEx
Set regEx = CreateObject("VBScript.RegExp")
regEx.Pattern = "ID_(\d{3})"
regEx.IgnoreCase = False
regEx.Global = True
a_cell = regEx.Replace(a_cell, "$1")
Sheets(1).Cells(1, 1) = "=HYPERLINK(A" & CStr(CInt(a_cell) + 2) & ")"
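Since the replacement string passed to Replace is built before any matching happens, arithmetic on a captured group has to be done by walking the matches yourself. The sketch below generalises the two-step idea to a string with several IDs; the function name ShiftIds and the sample input are hypothetical, and it emits plain cell references such as A8 rather than full hyperlink formulas.

```vbscript
' Sketch: rebuild the string match by match, adding 2 to each captured number.
Function ShiftIds(s)
    Dim re, matches, m, pos, out
    Set re = New RegExp
    re.Pattern = "ID_(\d{3})"
    re.Global = True
    Set matches = re.Execute(s)

    pos = 1        ' one-based position of the next unprocessed character
    out = ""
    For Each m In matches
        ' copy the text before this match, then the computed reference
        out = out & Mid(s, pos, m.FirstIndex + 1 - pos) & _
              "A" & (CLng(m.SubMatches(0)) + 2)
        pos = m.FirstIndex + m.Length + 1
    Next
    ShiftIds = out & Mid(s, pos)   ' append whatever follows the last match
End Function

WScript.Echo ShiftIds("see ID_006 and ID_011")   ' see A8 and A13
```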