How can I remove all text within parentheses with a regex in the VB ArcGIS field calculator?

The "Comment" field records include samples like:
"Line marker, Fence, Test station (Single: Struct P/S -1.2375V IRF -1.0976V) (ACV: 0.0412V)"
"Direction: DownStreamDate Collected: 5/8/2013:START POS RUN ON , Line marker, Test station, Fence, , Aerial marker , 222 MP 221.89 CALI 0.2 0.3 SUNNY END WARM"
I am trying to populate a "CrossingName" field from the "Comment" records with all text inside parentheses omitted.
I am using:
Pre-Logic Script Code:
Set re = CreateObject("VBScript.RegExp")
With re
.Pattern = "\([^()]*\)"
.Global = False
.IgnoreCase = False
End With
CrossingName = re.Execute(targetString).Item(0)
The regular expression is not returning every instance of parentheses. I copied this to try and isolate the parentheses text. I want to omit every portion of text inside parentheses. Many records have two sets of parentheses. Other questions did not address this specific scenario. Thank you for your assistance.
Currently the script throws an error and only works on one record.
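Two details in the code above would explain both symptoms. With Global = False the RegExp stops after the first match, and Execute(...).Item(0) raises an error on any record that has no parentheses (such as the second sample), because the match collection is empty. A minimal sketch of one way around this, assuming the goal is to delete the parenthesized text rather than extract it, is to call Replace with Global = True. The function name StripParens and the expression StripParens([Comment]) are illustrative assumptions, not part of the original post.

```vbscript
' Pre-Logic Script Code (VBScript parser).
' StripParens is a hypothetical helper name chosen for this sketch.
Function StripParens(text)
    Dim re
    ' Leave Null field values untouched instead of raising an error.
    If IsNull(text) Then
        StripParens = text
        Exit Function
    End If
    Set re = CreateObject("VBScript.RegExp")
    ' Optional leading whitespace, then one non-nested (...) group.
    re.Pattern = "\s*\([^()]*\)"
    re.Global = True          ' replace every occurrence, not just the first
    re.IgnoreCase = False
    StripParens = Trim(re.Replace(text, ""))
End Function
```

In the field calculator the expression box for CrossingName would then contain something like StripParens([Comment]). Records without parentheses pass through unchanged, because Replace simply finds nothing to remove.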

Related

How to find and format multiple matching words with Regex in Word document using VB script

I have a word document in which I have to do the formatting of the words using VB script. The text can be as follows :
hello <bu ABC bu>, We are pleased to confirm our offer of employment to you. The terms and conditions that will apply to your employment with are set forth in this letter and Exhibit A attached hereto and incorporated herein by reference together, the “Agreement”
You have been offered and accepted the position of , presently reporting to . Your start date is expected to be
The words written inside the <bu ... bu> tags need to be bold and underlined. So far I have written a VBScript that finds the text passed as an argument and makes it bold and underlined as required.
But to make the solution/script more dynamic, I want the script to match Regular Expression pattern which I have written : (?<=(<bu))[a-zA-Z0-9 -:/\[]()]+(?=(bu>))
The script I have written :
Option Explicit
Function Macro1()
Dim strFilePath
strFilePath = "C:\Users\<UserID>\Documents\OfferLetterTemplate.docx"
Dim strTextToReplace
strTextToReplace = "<bu XYZ bu>"
Dim Word, objDoc, objSelection
Set Word = CreateObject("Word.Application")
Word.Visible = True
Dim wordfile
Set wordfile = Word.Documents.Open(strFilePath)
Set objDoc = Word.ActiveDocument
Set objSelection = Word.Selection
objSelection.Find.Forward = True
objSelection.Find.MatchWholeWord = False
objSelection.Find.ClearFormatting
objSelection.Find.Replacement.ClearFormatting
objSelection.Find.Replacement.Font.Bold = True
objSelection.Find.Replacement.Font.Underline = True
objSelection.Find.Text = strTextToReplace
objSelection.Find.Replacement.Text = ""
objSelection.Find.Execute , , , , , , , 0, , , 2
wordfile.save
Word.Quit
End Function
call Macro1
Can someone show me how to search with the RegEx given above and format all the matching occurrences at once?
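One thing to note about the pattern: the VBScript.RegExp engine does not support lookbehind, so (?<=...) is rejected as a syntax error. A capture group can do the same job. The sketch below only covers the regex half of the problem; it collects the text between the <bu and bu> markers and leaves the Word formatting to the existing Find code. The function name TaggedPhrases and the "|" separator are assumptions made for this example.

```vbscript
' Returns the phrases found between "<bu" and "bu>", joined with "|".
' TaggedPhrases is a hypothetical helper name.
Function TaggedPhrases(text)
    Dim re, m, out
    Set re = CreateObject("VBScript.RegExp")
    ' Group 1 is the tagged phrase; the lazy .*? stops at the first closing marker.
    re.Pattern = "<bu\s+(.*?)\s+bu>"
    re.Global = True
    out = ""
    For Each m In re.Execute(text)
        If out <> "" Then out = out & "|"
        out = out & m.SubMatches(0)
    Next
    TaggedPhrases = out
End Function
```

For the text "hello <bu ABC bu>, see <bu XYZ 1/2 bu>" this returns "ABC|XYZ 1/2". Each phrase could then be fed to the Find.Text line of the script above in a loop.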

Changing formulas on the fly with VBA RegEx

I'm trying to change formulas in Excel; I need to change the row number in each formula.
I'm trying to use a regex replace to do this. I use a loop to iterate through the rows of the sheet and need to change the formula for the row being processed at the time. Here is an example of the code:
For i = 2 To rows_aux
DoEvents
Formula_string= "=IFS(N19='Z001';'xxxxxx';N19='Z007';'xxxxxx';0=0;'xxxxxxx')"
Formula_string_new = regEx.Replace(Formula_string, "$1" & i)
wb.Cells(i, 33) = ""
wb.Cells(i, 33).Formula = Formula_string_new
.
.
.
Next i
I need to replace the row references, but not the ones inside single or double quotes. Example:
If i = 2, I want the new string to be this:
"=IFS(N2='Z001';'xxxxxx';N2='Z007';'xxxxxx';0=0;'xxxxxxx')"
I'm trying to use this regex:
([a-zA-Z]+)(\d+)
But it's changing everything in quotes too. Like this:
If i = 2:
"=IFS(N2='Z2';'xxxxxx';N2='Z2';'xxxxxx';0=0;'xxxxxxx')"
If anyone can help me I will be very grateful!
Thanks in advance.
As others have written, there are probably better ways to write this code. But for a regex that will capture just the Column letter in capturing group #1, try:
\$?\b(XF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?
Note that it will NOT include the $ absolute addressing token, but could be altered if that were necessary.
Note that you can avoid the loop completely with:
Formula_string = "=IFS(N19=""Z001"",""xxxxxx"",N$19=""Z007"",""xxxxxx"",0=0,""xxxxxxx"")"
Formula_string_new = regEx.Replace(Formula_string, "$1" & firstRow)
With Range(wb.Cells(firstRow, 33), wb.Cells(lastRow, 33))
.Clear
.Formula = Formula_string_new
End With
When we write a formula to a range like this, the references will automatically adjust the way you were doing in your loop.
Depending on unstated factors, you may want to use the FormulaLocal property instead of the Formula property.
Edit:
To make this a little more robust, in case there happens to be, within the quote marks, a string that exactly mimics a valid address, you can try checking to be certain that a quote (single or double) neither precedes nor follows the target.
Pattern: ([^"'])\$?\b(XF[A-D]|X[A-E][A-Z]|[A-W][A-Z]{2}|[A-Z]{2}|[A-Z])\$?(?:104857[0-6]|10485[0-6]\d|1048[0-4]\d{2}|104[0-7]\d{3}|10[0-3]\d{4}|[1-9]\d{1,5}|[1-9])d?\b(?!['"])
Replace: "$1$2" & i
However, this is not "bulletproof" as various combinations of included data might match. If it is a problem, let me know and I'll come up with something more robust.
If you can identify some unique features, as in the example (a preceding bracket ( or semicolon ; and a trailing equals sign =), then this might work:
Sub test()
    Dim s As String, sNew As String, i As Long
    Dim Regex As Object
    Set Regex = CreateObject("vbscript.regexp")
    With Regex
        .Global = True
        .MultiLine = False
        .IgnoreCase = True
        .Pattern = "([(;][a-zA-Z]{1,3})(\d+)="
    End With
    i = 1
    s = "=IFS(NANA19='Z001';'xxxxxx';NA19='Z007';'xxxxxx';0=0;'xxxxxxx')"
    sNew = Regex.Replace(s, "$1" & i & "=")
    Debug.Print s & vbCr & sNew
End Sub
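Another way to keep quoted text out of the replacement, sketched here under the assumption that string literals are delimited by single quotes exactly as in the question's example, is to split the formula on the quote character and run the regex only on the pieces that lie outside quotes. The function name SetRowOutsideQuotes is hypothetical, and the simple pattern would also touch names such as LOG10 and ignores $ absolute references.

```vbscript
' Replaces the row number of cell references that are outside '...' literals.
' SetRowOutsideQuotes is a hypothetical helper name for this sketch.
Function SetRowOutsideQuotes(formula, rowNum)
    Dim re, parts, k
    Set re = CreateObject("VBScript.RegExp")
    ' Group 1 = column letters; the digits after it are the row to replace.
    re.Pattern = "\b([A-Za-z]{1,3})\d+\b"
    re.Global = True
    ' After splitting on the quote, even-numbered pieces are outside quotes.
    parts = Split(formula, "'")
    For k = 0 To UBound(parts) Step 2
        ' Same "$1" & row replacement string the question already uses.
        parts(k) = re.Replace(parts(k), "$1" & rowNum)
    Next
    SetRowOutsideQuotes = Join(parts, "'")
End Function
```

Called with the question's formula and rowNum = 2, this gives =IFS(N2='Z001';'xxxxxx';N2='Z007';'xxxxxx';0=0;'xxxxxxx'), leaving 'Z001' and 'Z007' untouched. For formulas that use double quotes, the delimiter passed to Split and Join would need to change accordingly.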

excel vba - use regex to return information between indicators

I have an app which returns data in the form of a table copied into the clipboard.
the table takes the form of:
table name
other info
-------------------------------
|heading 1|heading 2|heading 3|
-------------------------------
|data|date|other Data|
|data|date|other Data|
-------------------------------
time stamp
etc
I'm looking to pull back only the heading and data rows, minus the horizontal rows which are represented by dashes (---) in my data.
I need the pipes (|) as they are used to split the rows for passing back to excel.
I've used the following regex attempts
strPattern = "(?<=\|)[^|]++(?=\|)"
strPattern = "(\|[^|]++(\|)"
strPattern = "(^\s\|[\d\D]+?\|\s$)"
strPattern = "(^\s\|[\d\D]*\|\s$)"
strReplace = "$1"
My thinking was that the above use the pipes as bookends and return any digit or non-digit character between them. None of these work; at best they return the entire string (I know I don't have anything removing the dashes yet).
Looking for:
|heading 1|heading 2|heading 3|
|data|date|other Data|
|data|date|other Data|
Thanks in advance for any help
To answer your question, for a regex that will take your text as a block (multi-line variable) and only return the desired lines, try:
^(?:(?:(?:(?=-).)+)|(?:[^|]+))\n?
There may be better ways to accomplish your overall goal, but this accomplishes what you requested.
Option Explicit
Function PipedLines(S As String)
    Dim RE As Object
    Const sPat As String = "^(?:(?:(?:(?=-).)+)|(?:[^|]+))\n?"
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True
        .MultiLine = True
        .Pattern = sPat
        PipedLines = .Replace(S, "")
    End With
End Function
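For reference, the attempts in the question fail partly because the VBScript.RegExp engine supports neither lookbehind, (?<=...), nor possessive quantifiers such as ++. An alternative to deleting the unwanted lines is to match the wanted ones: every line that starts with a pipe. The following is a minimal sketch of that idea; the function name KeepPipedLines is an assumption, and it presumes each table row both starts and ends with a pipe, as in the sample.

```vbscript
' Returns only the lines that start with "|" and runs to their last "|".
' KeepPipedLines is a hypothetical helper name for this sketch.
Function KeepPipedLines(text)
    Dim re, m, out
    Set re = CreateObject("VBScript.RegExp")
    ' From a pipe at the start of a line to the last pipe on that line.
    re.Pattern = "^\|[^\r\n]*\|"
    re.Global = True
    re.MultiLine = True       ' let ^ match at the start of every line
    out = ""
    For Each m In re.Execute(text)
        If out <> "" Then out = out & vbLf
        out = out & m.Value
    Next
    KeepPipedLines = out
End Function
```

The dashed separators and the surrounding "table name" or "time stamp" lines never match, so only the heading and data rows are returned, joined with line feeds and ready for Split on "|".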
Hi #tsuimark, have you tried copying the clipboard data directly into Excel?
I tried it (screenshot attached) and then removed the unwanted rows in the sheet.
Thanks.

Why does Find/Replace zRngResult.Find work fine, but RegEx myRegExp.Execute(zRngResult) mess up the range.Start?

I wish to select and add comments after certain words, e.g. “not”, “never”, “don’t”, in sentences in a Word document with VBA. Find/Replace with wildcards works fine, but “Use wildcards” cannot be combined with “Match case”. The RegEx can use IgnoreCase = True, but the selection of the word is not reliable when there is more than one comment in a sentence. Range.Start seems to be modified in a way that I cannot understand.
A similar question was asked in June 2010. https://social.msdn.microsoft.com/Forums/office/en-US/f73ca32d-0af9-47cf-81fe-ce93b13ebc4d/regex-selecting-a-match-within-the-document?forum=worddev
Is there a new/different way of solving this problem?
Any suggestion will be appreciated.
The code using RegEx follows:
Function zRegExCommentor(zPhrase As String, tComment As String) As Long
Dim sTheseSentences As Sentences
Dim rThisSentenceToSearch As Word.Range, rThisSentenceResult As Word.Range
Dim myRegExp As RegExp
Dim myMatches As MatchCollection
Options.CommentsColor = wdByAuthor
Set myRegExp = New RegExp
With myRegExp
.IgnoreCase = True
.Global = False
.Pattern = zPhrase
End With
Set sTheseSentences = ActiveDocument.Sentences
For Each rThisSentenceToSearch In sTheseSentences
Set rThisSentenceResult = rThisSentenceToSearch.Duplicate
rThisSentenceResult.Select
Do
DoEvents
Set myMatches = myRegExp.Execute(rThisSentenceResult)
If myMatches.Count > 0 Then
rThisSentenceResult.Start = rThisSentenceResult.Start + myMatches(0).FirstIndex
rThisSentenceResult.End = rThisSentenceResult.Start + myMatches(0).Length
rThisSentenceResult.Select
Selection.Comments.Add Range:=Selection.Range
Selection.TypeText Text:=tComment & "{" & zPhrase & "}"
rThisSentenceResult.Start = rThisSentenceResult.Start + 1 'so as not to find the same phrase again and again
rThisSentenceResult.End = rThisSentenceToSearch.End
rThisSentenceResult.Select
End If 'If myMatches.Count > 0 Then
Loop While myMatches.Count > 0
Next 'For Each rThisSentenceToSearch In sTheseSentences
End Function
Relying on Range.Start or Range.End for position in a Word document is not reliable due to how Word stores non-printing information in the text flow. For some kinds of things you can work around it using Range.TextRetrievalMode, but the non-printing characters inserted by Comments aren't affected by these settings.
I must admit I don't understand why Word's built-in Find with wildcards won't work for you - no case matching shouldn't be a problem. For instance, based on the example: "Never has there been, never, NEVER, a total drought.":
FindText:="[nN][eE][vV][eE][rR]"
This will find all instances of n-e-v-e-r regardless of capitalization. The brackets define a set of characters, in this case the lower- and upper-case form of each letter in the search term. (No commas are needed inside the brackets; a comma there would itself be treated as a character to match.)
The workarounds described in my MSDN post you link to are pretty much all you can do if you insist on RegEx:
Using the Office Open XML (or possibly Word 2003 XML) file format will let you use RegEx and standard XML processing tools to find the information, add comment "tags" into the Word XML, close it all up... And when the user sees the document it will all be there.
If you need to do this in the Word UI, a slightly different approach should work (assuming you're targeting Word 2003 or later): work through the document on a range-by-range basis (by paragraph, perhaps). Read the XML representation of the text into memory using the Range.WordOpenXML property, perform the RegEx search, add comments as WordOpenXML, then write the WordOpenXML back into the document using the InsertXML method, replacing the original range (paragraph). Since you'd be working with the Paragraph object, Range.Start won't be a factor.

Writing a word macro to organize chat logs

I need some help writing a word macro to organize some chat logs. What I want is to eliminate repeated consecutive occurrences of names, regardless of timestamp. Besides this, each person will be using their own formatting style (font, font color, etc.). Edit: the raw logs have no formatting (i.e. specific fonts, font color ,etc.). I want the macro to automatically add a specific (already existent) word style to each user.
So, what I have is:
[12:40] Steve: this is an example text.
[12:41] Steve: this is another example text.
[12:41] Steve: this is yet another example text.
[12:45] Bob: some more text.
[12:46] Bob: even more text.
[12:47] Steve: yadda yadda yadda.
The expected output would be:
[12:40] Steve: *style1*this is an example text.
this is another example text.
this is yet another example text.*/style1*
[12:45] Bob: *style2*some more text.
even more text.*/style2*
[12:47] Steve: *style1*yadda yadda yadda.*/style1*
As of now, unfortunately, I know next to nothing of VBA for Applications. I was thinking of maybe searching for the names by a regex pattern and assigning them to a variable, comparing each match to the previous and, if they're equal, deleting the latter. The problem is I'm not fluent in VBA, so I don't know how to do what I want.
So far, all I've got is this:
Sub Organize()
Dim re As RegExp
Dim names As MatchCollection, name As Match
Set re = New RegExp 'the object must be created before its properties are set
re.Pattern = "\[[0-9]{2}:[0-9]{2}\] [a-zA-Z]{1,20}:"
re.IgnoreCase = True
re.Global = True
Set names = re.Execute(ActiveDocument.Range)
For Each name In names
'This is where I get lost
Next name
End Sub
So, in the interest of solving this problem and me learning some VBA, could I get some help?
EDIT: the question has been edited to better reflect what I want the macro to do.
Assuming that each line in your log is a separate paragraph, I would do it without Regex, using the Find object instead. The following code works fine for the sample data you provided.
Sub qTest()
    Dim PAR As Paragraph
    Dim PrevName As String
    For Each PAR In ActiveDocument.Content.Paragraphs
        PAR.Range.Select 'highlight current paragraph
        'find name in paragraph
        With Selection.Find
            .ClearFormatting
            .MatchWildcards = True 'the search text below is a wildcard pattern
            .Text = "\]*\:"
            .Execute
        End With
        If Selection.Text = PrevName Then
            'extend region for the whole paragraph
            'and delete it
            ActiveDocument.Range(PAR.Range.Start, Selection.End + 1).Delete
        Else
            PrevName = Selection.Text
            Debug.Print PrevName
        End If
    Next
End Sub
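If the log is available as plain text, the regex idea from the question (compare each name with the previous one and drop the repeat) can be finished roughly as follows. This is only a sketch of the string handling: it does not apply any Word styles, and the function name CollapseNames and its sep parameter (the line separator, e.g. vbCr for text read from a Word range) are assumptions made for the example.

```vbscript
' Removes the "[hh:mm] Name:" prefix from a line when the previous
' speaker was the same. CollapseNames is a hypothetical helper name.
Function CollapseNames(logText, sep)
    Dim re, rows, k, m, prevName
    Set re = CreateObject("VBScript.RegExp")
    ' Group 1 = speaker name, group 2 = the message text.
    re.Pattern = "^\[\d{2}:\d{2}\] ([A-Za-z]{1,20}): (.*)$"
    rows = Split(logText, sep)
    prevName = ""
    For k = 0 To UBound(rows)
        If re.Test(rows(k)) Then
            Set m = re.Execute(rows(k))(0)
            If m.SubMatches(0) = prevName Then
                rows(k) = m.SubMatches(1)   ' same speaker: keep the message only
            Else
                prevName = m.SubMatches(0)  ' new speaker: keep the full line
            End If
        End If
    Next
    CollapseNames = Join(rows, sep)
End Function
```

Lines that do not match the timestamp-and-name pattern are left as they are. Applying a per-user style would still have to be done on the Word side, for example inside the paragraph loop shown above.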