Use regex in a wrapped text value cell

I have a cell like this :
1 parent
1 child
I am getting the value in vba :
Dim nbChild As String
Dim nbParent As String
Sheets("Feuil1").Cells(C.Row - 1, C.Column).Value
However, I would like to put the number of parents and children into two separate variables, nbParent and nbChild, so I was thinking of using a regex to capture the groups (the digits before "parent" and the digits before "child").
But I don't know how to do it. Thanks in advance for your help.

Dim arr, parent, child
arr = Split(ActiveCell.Value, Chr(10)) ' split on hard return
parent = arr(0)
child = arr(1)
' then split each line on space....
Debug.Print Split(parent, " ")(0) ' number
Debug.Print Split(parent, " ")(1) ' text
Debug.Print Split(child, " ")(0) ' number
Debug.Print Split(child, " ")(1) ' text

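With data like "1 parent 1 child" in one cell (this assumes the parts are separated by spaces and the text is merely wrapped, not split by a hard return), click on the cell and run: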
Sub Family()
    ary = Split(ActiveCell.Value, " ")
    nParent = CLng(ary(0))
    nChild = CLng(ary(2))
    MsgBox nParent & vbCrLf & nChild
End Sub
(regex is not necessary)

I agree with @Gary's Student that you don't seem to need regex, but if you insist on it, I think the code below works. You'll need to add a reference to Microsoft VBScript Regular Expressions 5.5 from Tools -> References.
Sub main()
    Dim value As String
    Dim re As VBScript_RegExp_55.RegExp
    Dim matches As VBScript_RegExp_55.MatchCollection
    Dim match As VBScript_RegExp_55.match
    value = Range("A1").value
    Set re = New VBScript_RegExp_55.RegExp
    re.Pattern = "\d+"
    re.Global = True
    Set matches = re.Execute(value)
    For Each match In matches
        Debug.Print match.value
    Next
End Sub
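If capture groups are still wanted, as the question asked, the idea can be sketched as below. This is only a sketch under assumptions: the function name ParseFamily is made up for illustration, the counts are returned as Long rather than the String variables declared in the question, and the cell text is assumed to read "<n> parent" followed by "<n> child" (a trailing plural is tolerated). It uses late binding, so no reference needs to be added.

```vba
Option Explicit

' Hypothetical helper: pull both counts out of the cell text with capture groups.
' Returns False when the text does not have the expected shape.
Function ParseFamily(ByVal cellText As String, _
                     ByRef nbParent As Long, _
                     ByRef nbChild As Long) As Boolean
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    ' \s+ also matches the line feed that separates the two lines in the cell
    re.Pattern = "(\d+)\s*parent\w*\s+(\d+)\s*child"

    Set matches = re.Execute(cellText)
    If matches.Count = 0 Then Exit Function   ' ParseFamily stays False

    nbParent = CLng(matches(0).SubMatches(0)) ' first group: digits before "parent"
    nbChild = CLng(matches(0).SubMatches(1))  ' second group: digits before "child"
    ParseFamily = True
End Function

Sub DemoParseFamily()
    Dim p As Long, c As Long
    If ParseFamily("1 parent" & vbLf & "1 child", p, c) Then Debug.Print p, c
End Sub
```

Returning a Boolean keeps the "unexpected format" case separate from a legitimate count of zero.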

Related

How to save SubMatches as array and print not empty submatches?

When I run the following regex code and add a watch (Shift + F9) on Matches:
Sub TestRegEx1()
    Dim regex As Object, Matches As Object
    Dim str As String
    str = "This is text for the submatches"
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "Th(is).+(for).+(submatches)|.+(\d)|([A-Z]{3})"
    regex.IgnoreCase = True
    Set Matches = regex.Execute(str)
End Sub
I see that Matches holds one match with five submatches, two of which are empty.
2 questions:
How can I save in an array variable the SubMatches?
How can I Debug.Print only elements that are not empty?
I've tried the following, but it is not working:
Set Arr = Matches.SubMatches
Set Arr = Matches(1).SubMatches
Set Arr = Matches.Item(1).SubMatches
Thanks in advance
Is the following what you intended? Oversize an array at the start and ReDim it at the end. The first version prints only non-empty submatches but stores all of them. The second version prints and stores only the non-empty ones.
You probably want to call .Test first to ensure there are matches.
Option Explicit
Sub TestRegEx1()
    Dim regex As Object, matches As Object, match As Object, subMatch As Variant
    Dim str As String, subMatches(), i As Long
    ReDim subMatches(0 To 1000)
    str = "This is text for the submatches"
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "Th(is).+(for).+(submatches)|.+(\d)|([A-Z]{3})"
    regex.IgnoreCase = True
    Set matches = regex.Execute(str)
    For Each match In matches
        For Each subMatch In match.subMatches
            subMatches(i) = subMatch ' store every submatch, empty or not
            If Not IsEmpty(subMatch) Then Debug.Print subMatch
            i = i + 1
        Next
    Next
    If i > 0 Then ReDim Preserve subMatches(0 To i - 1) ' trim to the number stored
End Sub
If you only want to store non-empty then
Option Explicit
Sub TestRegEx1()
    Dim regex As Object, matches As Object, match As Object, subMatch As Variant
    Dim str As String, subMatches(), i As Long
    ReDim subMatches(0 To 1000)
    str = "This is text for the submatches"
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "Th(is).+(for).+(submatches)|.+(\d)|([A-Z]{3})"
    regex.IgnoreCase = True
    Set matches = regex.Execute(str)
    For Each match In matches
        For Each subMatch In match.subMatches
            If Not IsEmpty(subMatch) Then
                subMatches(i) = subMatch ' store only non-empty submatches
                Debug.Print subMatch
                i = i + 1
            End If
        Next
    Next
    If i > 0 Then ReDim Preserve subMatches(0 To i - 1) ' trim to the number stored
End Sub
You may use a Collection and fill it on the go.
Add
Dim m, coll As Collection
Initialize the collection:
Set coll = New Collection
Then, once you get the matches, use
If Matches.Count > 0 Then ' if there are matches
    For Each m In Matches(0).SubMatches ' you need the first match submatches
        If Len(m) > 0 Then coll.Add (m) ' if not 0 length, save value to collection
    Next
End If
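The two approaches above can also be combined into a reusable function. The following is a sketch only: the name NonEmptySubMatches and its parameters are invented for illustration. It looks at the first match only (as the Collection version does) and sizes the array from SubMatches.Count instead of a fixed 1000.

```vba
Option Explicit

' Hypothetical helper: return the non-empty submatches of the first match
' as a zero-based Variant array, or Empty when there is nothing to return.
Function NonEmptySubMatches(ByVal pattern As String, ByVal source As String) As Variant
    Dim re As Object, matches As Object, subMatch As Variant
    Dim result() As Variant, n As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pattern
    re.IgnoreCase = True

    Set matches = re.Execute(source)
    If matches.Count = 0 Then Exit Function
    If matches(0).SubMatches.Count = 0 Then Exit Function

    ' Size for the worst case (every group participated), then trim
    ReDim result(0 To matches(0).SubMatches.Count - 1)
    For Each subMatch In matches(0).SubMatches
        If Len(subMatch) > 0 Then      ' Len is 0 for an empty submatch
            result(n) = subMatch
            n = n + 1
        End If
    Next subMatch

    If n > 0 Then
        ReDim Preserve result(0 To n - 1)
        NonEmptySubMatches = result
    End If
End Function

Sub DemoNonEmptySubMatches()
    Dim arr As Variant, item As Variant
    arr = NonEmptySubMatches("Th(is).+(for).+(submatches)|.+(\d)|([A-Z]{3})", _
                             "This is text for the submatches")
    If Not IsEmpty(arr) Then
        For Each item In arr
            Debug.Print item
        Next item
    End If
End Sub
```

Callers should check IsEmpty on the result before looping, since the function returns Empty when nothing matched.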

Excel VBA - Looking up a string with wildcards

I'm trying to look up a string which contains wildcards. I need to find where in a specific row the string occurs. The strings all take the form "IP##W## XX", where XX are the 2 letters by which I look up the value and each # is a wildcard that can be any digit. Hence this is what my lookup string looks like:
FullLookUpString = "IP##W## " & LookUpString
I tried using the Find method to find the column where this first occurs, but I keep getting errors. Here's what I have so far, but it doesn't work. Does anyone have an easy way of doing this? I'm quite new to VBA.
Dim GatewayColumn As Variant
Dim GatewayDateColumn As Variant
Dim FirstLookUpRange As Range
Dim SecondLookUpRange As Range
FullLookUpString = "IP##W## " & LookUpString
Set FirstLookUpRange = wsMPNT.Range(wsMPNT.Cells(3, 26), wsMPNT.Cells(3, lcolumnMPNT))
Debug.Print FullLookUpString
GatewayColumn = FirstLookUpRange.Find(What:=FullLookUpString, After:=Range("O3")).Column
Debug.Print GatewayColumn
Per the comment by @SJR, you can do this two ways. Using LIKE, the pattern is:
IP##W## [A-Z][A-Z]
Using regular expressions, the pattern is:
IP\d{2}W\d{2} [A-Z]{2}
Example code:
Option Explicit
Sub FindString()
    Dim ws As Worksheet
    Dim rngData As Range
    Dim rngCell As Range
    Set ws = ThisWorkbook.Worksheets("Sheet1") '<-- set your sheet
    Set rngData = ws.Range("A1:A4")
    ' with LIKE operator
    For Each rngCell In rngData
        If rngCell.Value Like "IP##W## [A-Z][A-Z]" Then
            Debug.Print rngCell.Address
        End If
    Next rngCell
    ' with regular expression
    Dim objRegex As Object
    Dim objMatch As Object
    Set objRegex = CreateObject("VBScript.RegExp")
    objRegex.Pattern = "IP\d{2}W\d{2} [A-Z]{2}"
    For Each rngCell In rngData
        If objRegex.Test(rngCell.Value) Then
            Debug.Print rngCell.Address
        End If
    Next rngCell
End Sub
If we can assume that ALL the strings in the row match the given pattern, then we can examine only the last three characters:
Sub FindAA()
    Dim rng As Range, r As Range, Gold As String
    Set rng = Range(Range("A1"), Cells(1, Columns.Count))
    Gold = " AA"
    For Each r In rng
        If Right(r.Value, 3) = Gold Then
            MsgBox r.Address(0, 0)
            Exit Sub
        End If
    Next r
End Sub
Try this:
If FullLookUpString Like "*IP##W## [a-zA-Z][a-zA-Z]*" Then
    MsgBox "Match is found"
End If
It will find your pattern anywhere in the string (the leading and trailing * allow any surrounding characters).
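Since the question asks for the position of the first match in a row, the Like test can be wrapped in a small search function. As far as I know, Range.Find only treats * and ? as wildcards, so # in the lookup string would be taken literally there. The sketch below is an assumption-laden illustration: FirstLikeIndex is a made-up name, and it works on an array of values rather than on a Range.

```vba
Option Explicit

' Hypothetical helper: return the 1-based position of the first value that
' matches a Like pattern, or 0 if none does.
Function FirstLikeIndex(ByVal values As Variant, ByVal likePattern As String) As Long
    Dim v As Variant, pos As Long

    For Each v In values
        pos = pos + 1
        If Not IsError(v) Then                 ' skip error cells such as #N/A
            If CStr(v) Like likePattern Then
                FirstLikeIndex = pos
                Exit Function
            End If
        End If
    Next v
End Function

Sub DemoFirstLikeIndex()
    ' Prints 2: the second element is the first one of the form IP##W## AB
    Debug.Print FirstLikeIndex(Array("header", "IP12W34 AB", "IP99W00 AB"), "IP##W## AB")
End Sub
```

With the question's variables, something like FirstLikeIndex(FirstLookUpRange.Value, "IP##W## " & LookUpString) should give the offset within the row, and FirstLookUpRange.Column + offset - 1 the sheet column when the offset is greater than 0. This assumes the range is a single row with more than one cell, because a one-cell range returns a scalar rather than an array.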

Regular expression for an Excel cell with R1C1 notation

I need some code to test if a cell contains a formula with a reference to another cell.
I found the answer "Find all used references in Excel formula", but that solution also wrongly matches formulas with references to table columns, such as:
=SearchValInCol2(Tabella1[articolo];[#articolo];Tabella1[b])
Then, I wrote the following VBA code using the Like operator, but surely a solution with a regular expression would be more solid (I think the following code won't work in many scenarios).
Private Function TestIfCellContainsAFormula(cellToTest As Variant) As Boolean
    Dim result As Object
    Dim r As Range
    Dim testExpression As String
    Dim objRegEx As Object
    Set r = cellToTest ' INPUT THE CELL HERE , e.g. RANGE("A1")
    Set objRegEx = CreateObject("VBScript.RegExp")
    objRegEx.IgnoreCase = True
    objRegEx.Global = True
    objRegEx.Pattern = """.*?""" ' remove expressions
    testExpression = CStr(r.FormulaR1C1)
    ' search for pattern "=R[-3]C+4"
    If testExpression Like "*R[[]*[]]*C*" Then
        TestIfCellContainsAFormula = True
        Exit Function
    End If
    ' search for pattern "=RC[2]"
    If testExpression Like "*R*C[[]*[]]*" Then
        'If InStr(1, testExpression, "C[", vbTextCompare) <> 0 Then
        TestIfCellContainsAFormula = True
        Exit Function
    End If
    TestIfCellContainsAFormula = False
End Function
Option 1
To match R1C1 style references you can use this regex:
R(\[-?\d+\])C(\[-?\d+\])|R(\[-?\d+\])C|RC(\[-?\d+\])
At the core is the 'offset', -?\d+, which is an optional minus sign followed by one or more digits. This sequence goes in brackets ([]) to give \[-?\d+\]. Then the regex allows any of these combinations:
R[offset]C[offset] or (|)
R[offset]C or (|)
RC[offset]
Option 2
The regex above won't match R, C, or RC. It will match R[0]C, RC[0], and R[0]C[0], which are more or less equivalent to RC. To eliminate those matches you might use this regex:
R(\[-?[1-9][0-9]*\])C(\[-?[1-9][0-9]*\])|R(\[-?[1-9][0-9]*\])C|RC(\[-?[1-9][0-9]*\])
But it seems entering R[0], C[0] and R[0]C[0] in my Excel (v2013) turns them into R, C and RC anyways - so you can avoid the additional complexity if this is not a concern.
Option 3
If you want to allow R, C and RC you can use a simpler regex:
R(\[-?\d+\])?C(\[-?\d+\])?
VBA test code
This uses Option 1.
Option Explicit
Sub Test()
    Dim varTests As Variant
    Dim varTest As Variant
    Dim varMatches As Variant
    Dim varMatch As Variant
    varTests = Array("RC", _
                     "R[1]C", _
                     "RC[1]", _
                     "R[1]C[1]", _
                     "R[-1]C", _
                     "RC[-1]", _
                     "R[-1]C[-1]", _
                     "=SUM(A1:B2)", _
                     "RC[1]+R[-1]C+R[2]C[-99]", _
                     "R[-1]C-R[1]C[-44]-RC[999]+R[0]C[0]", _
                     "SearchValInCol2(Tabella1[articolo];[#articolo];Tabella1[b])")
    For Each varTest In varTests
        varMatches = FormulaContainsR1C1Reference(CStr(varTest))
        Debug.Print "Input: " & CStr(varTest)
        Debug.Print VBA.String(Len(CStr(varTest)) + 7, "-")
        If IsEmpty(varMatches) Then
            Debug.Print "No matches"
        Else
            Debug.Print UBound(varMatches) & " matches"
            For Each varMatch In varMatches
                Debug.Print varMatch
            Next varMatch
        End If
        Debug.Print vbCrLf
    Next varTest
End Sub
Function FormulaContainsR1C1Reference(ByVal strFormula As String) As Variant
    Dim objRegex As Object
    Dim strPattern As String
    Dim objMatches As Object
    Dim varMatches As Variant
    Dim lngCounter As Long
    Set objRegex = CreateObject("VBScript.RegExp")
    With objRegex
        ' setup regex
        .Global = True
        .IgnoreCase = False
        .Pattern = "R(\[-?\d+\])C(\[-?\d+\])|R(\[-?\d+\])C|RC(\[-?\d+\])"
        ' get matches
        Set objMatches = .Execute(strFormula)
        ' iterate matches
        If objMatches.Count > 0 Then
            ReDim varMatches(1 To objMatches.Count)
            For lngCounter = 1 To objMatches.Count
                varMatches(lngCounter) = objMatches.Item(lngCounter - 1)
            Next lngCounter
        Else
            varMatches = Empty
        End If
    End With
    FormulaContainsR1C1Reference = varMatches
End Function
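For the yes/no test the question asks for, a Boolean wrapper may be enough. The sketch below is only an illustration under assumptions: the name ContainsR1C1Offset is invented, it uses the Option 1 pattern, and it first strips quoted text with the """.*?""" pattern that the question's code sets up but never applies.

```vba
Option Explicit

' Hypothetical helper: True when an R1C1 formula string contains at least one
' relative reference with an offset (Option 1 pattern).
Function ContainsR1C1Offset(ByVal formulaR1C1 As String) As Boolean
    Dim re As Object

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True          ' needed so Replace strips every quoted literal
    re.IgnoreCase = False

    ' Drop "quoted text" first so a literal such as "R[1]C" is not counted
    re.Pattern = """.*?"""
    formulaR1C1 = re.Replace(formulaR1C1, "")

    ' At least one of the row/column parts must carry a bracketed offset
    re.Pattern = "R(\[-?\d+\])C(\[-?\d+\])|R(\[-?\d+\])C|RC(\[-?\d+\])"
    ContainsR1C1Offset = re.Test(formulaR1C1)
End Function

Sub DemoContainsR1C1Offset()
    Debug.Print ContainsR1C1Offset("=RC[1]+R[-1]C")
    Debug.Print ContainsR1C1Offset("=SearchValInCol2(Tabella1[articolo];[#articolo];Tabella1[b])")
End Sub
```

A cell could then be checked with ContainsR1C1Offset(CStr(r.FormulaR1C1)), where r is the Range from the question's code.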
A1 style references
I posted a regex here for A1 style references:
^(?:[A-Z]|[A-Z][A-Z]|[A-X][A-F][A-D])(?:[1-9]|[1-9][0-9]|[1-9][0-9][0-9]|[1-9][0-9][0-9][0-9]|[1-9][0-9][0-9][0-9][0-9]|[1-9][0-9][0-9][0-9][0-9][0-9]|10[0-3][0-9][0-9][0-9][0-9]|104[0-7][0-9][0-9][0-9]|1048[0-4][0-9][0-9]|10485[0-6][0-9]|104857[0-6])$

Marking word as "find and replace" in Microsoft Word with RegEx

I am trying to highlight each word found by a RegEx and, if the user confirms, replace it with its corresponding substitute.
The code works correctly only when nothing is replaced.
Presumably the match positions need to be recalculated after every replacement?
Sub Replace()
    Dim regExp As Object
    Set regExp = CreateObject("vbscript.regexp")
    Dim arr As Variant
    Dim arrzam As Variant
    Dim i As Long
    Dim choice As Integer
    Dim Document As Word.Range
    Set Document = ActiveDocument.Content
    On Error Resume Next
    'EGN
    'IBAN
    arr = VBA.Array("((EGN(:{0,1})){0,1})[0-9]{10}", _
                    "[a-zA-Z]{2}[0-9]{2}[a-zA-Z0-9]{4}[0-9]{7}([a-zA-Z0-9]?){0,16}")
    arrzam = VBA.Array("[****]", _
                       "[IBAN]")
    With regExp
        For i = 0 To UBound(arr)
            .Pattern = arr(i)
            .Global = True
            For Each Match In regExp.Execute(Document)
                ActiveDocument.Range(Match.FirstIndex, Match.FirstIndex + Match.Length).Duplicate.Select
                choice = MsgBox("Replace " & Chr(34) & Match.Value & Chr(34) & " with " & Chr(34) & arrzam(i) & Chr(34) & "?", _
                                vbYesNoCancel + vbDefaultButton1, "Replace")
                If choice = vbYes Then
                    Document = .Replace(Document, arrzam(i))
                ElseIf choice = vbCancel Then
                    Exit Sub ' a bare Next here would not compile; Cancel stops the macro
                End If
            Next
        Next
    End With
End Sub
Actually, there are several things wrong with this.
First, the set of matches that For Each Match iterates over is static, determined at the moment the loop starts. You're changing the document in the meantime, so each successive Match looks at an old position.
Second, you're replacing all the occurrences at one time, so there is no need to loop through them. It seems a one line, one time Replace could do the same thing.
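One common way around the stale-position problem is to process the matches from last to first, so that a replacement never shifts the offsets of the matches still to be visited. The sketch below shows this on a plain string only. ReplaceFromEnd and its parameters are hypothetical names, the replacement is inserted literally, and mapping string offsets onto Word Range positions is deliberately left out.

```vba
Option Explicit

' Hypothetical helper: replace matches one at a time, walking backwards so the
' offsets of earlier matches stay valid. Optionally ask before each replacement.
Function ReplaceFromEnd(ByVal source As String, ByVal pattern As String, _
                        ByVal replacement As String, _
                        Optional ByVal confirmEach As Boolean = False) As String
    Dim re As Object, matches As Object
    Dim i As Long, doReplace As Boolean

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pattern
    re.Global = True
    Set matches = re.Execute(source)

    For i = matches.Count - 1 To 0 Step -1
        doReplace = True
        If confirmEach Then
            doReplace = (MsgBox("Replace """ & matches(i).Value & """?", vbYesNo) = vbYes)
        End If
        If doReplace Then
            ' FirstIndex is zero-based, so Left$ takes exactly the text before the match
            source = Left$(source, matches(i).FirstIndex) & replacement & _
                     Mid$(source, matches(i).FirstIndex + matches(i).Length + 1)
        End If
    Next i

    ReplaceFromEnd = source
End Function

Sub DemoReplaceFromEnd()
    Debug.Print ReplaceFromEnd("EGN 1234567890 and 0987654321", "[0-9]{10}", "[****]")
End Sub
```

With confirmEach left at False this behaves like a single global Replace. The backward loop only matters when some matches are skipped.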

match date pattern in the string vba excel

Edit:
Since my string has become more and more complicated, it looks like regexp is the only way.
I do not have a lot of experience with it, and your help is much appreciated.
Basically, from what I read on the web, I constructed the following expression to try to match the occurrences in my sample string:
"My very long long string 12Mar2012 is right here 23Apr2015"
[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]
and I am trying this code, but I do not get any match. A link to any good regexp tutorial would also be much appreciated.
Dim re, match, RegExDate
Set re = CreateObject("vbscript.regexp")
re.Pattern = "(^[0-9][0-9] + [a-zA-Z] + [0-9][0-9][0-9][0-9]$)"
re.Global = True
For Each match In re.Execute(str)
    MsgBox match.Value
    RegExDate = match.Value
    Exit For
Next
Thank you
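A likely reason the pattern in the question finds nothing: the spaces inside it are literal, [a-zA-Z] matches a single letter, and ^...$ ties the match to the whole string. A minimal sketch without those three things follows. FirstDateToken is a made-up name, and the sketch only checks the shape (two digits, three letters, four digits), not whether the date is valid.

```vba
Option Explicit

' Hypothetical helper: return the first token shaped like 12Mar2012, or "" if none.
Function FirstDateToken(ByVal source As String) As String
    Dim re As Object, matches As Object

    Set re = CreateObject("VBScript.RegExp")
    ' Two digits, three letters, four digits; no spaces and no ^...$ anchors
    re.Pattern = "[0-9]{2}[a-zA-Z]{3}[0-9]{4}"
    re.Global = False          ' only the first occurrence is needed

    Set matches = re.Execute(source)
    If matches.Count > 0 Then FirstDateToken = matches(0).Value
End Function

Sub DemoFirstDateToken()
    Debug.Print FirstDateToken("My very long long string 12Mar2012 is right here 23Apr2015")
End Sub
```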
This code validates each date found by the regexp with DateValue, for robustness:
Sub Robust()
    Dim Regex As Object
    Dim RegexMC As Object
    Dim RegexM As Object
    Dim strIn As String
    Dim BDate As Boolean
    strIn = "My very long long string 12Mar2012 is right here 23Apr2015 and 30Feb2002"
    Set Regex = CreateObject("vbscript.regexp")
    With Regex
        .Pattern = "(([0-9])|([0-2][0-9])|([3][0-1]))(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})"
        .Global = True
        If .test(strIn) Then
            Set RegexMC = .Execute(strIn)
            On Error Resume Next
            For Each RegexM In RegexMC
                BDate = False
                BDate = IsDate(DateValue(RegexM.submatches(0) & " " & RegexM.submatches(4) & " " & RegexM.submatches(5)))
                If BDate Then Debug.Print RegexM
            Next
            On Error GoTo 0
        End If
    End With
End Sub
Thanks for all your help!
I managed to solve my problem using this simple code:
Dim rex As New RegExp
Dim dateCol As New Collection
rex.Pattern = "(\d|\d\d)(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d{4})?"
rex.Global = True
For Each match In rex.Execute(sStream)
    dateCol.Add match.Value
Next
Just note that on my side I'm sure the string contains only valid dates, so the regular expression can stay simple.
thnx
Ilya
The following is a quick attempt I made. It's far from perfect.
Basically, it splits the string into words. While looping through the words it cuts off any punctuation (period and comma, you might need to add more).
When processing an item, we try to remove each month name from it. If the string gets shorter we might have a date.
It checks to see whether the length of the final string is about right (5 or 6 characters: 1 or 2 for the day plus 4 for the year).
You could instead (or also) check that they're all digits.
Private Const MonthList = "JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC"
Public Function getDates(ByVal Target As String) As String
    Dim Data() As String
    Dim Item As String
    Dim Index As Integer
    Dim List() As String
    Dim Index2 As Integer
    Dim Test As String
    Dim Result As String
    List = Split(MonthList, ",")
    Data = Split(Target, " ")
    Result = ""
    For Index = LBound(Data) To UBound(Data)
        ' upper-case the word and strip periods and commas
        Item = UCase(Replace(Replace(Data(Index), ".", ""), ",", ""))
        ' loop over the month names (the bounds of List, not of Data)
        For Index2 = LBound(List) To UBound(List)
            Test = Replace(Item, List(Index2), "")
            If Not Test = Item Then
                If Len(Test) = 5 Or Len(Test) = 6 Then
                    If Result = "" Then
                        Result = Item
                    Else
                        Result = Result & ", " & Item
                    End If
                End If
            End If
        Next Index2
    Next
    getDates = Result
End Function