Excel VBA Find & Replace Text In A String With Loop - regex

I'm trying to replace the text that follows each "&CC[number]:[number]" marker in a string with "==".
Here is the string: "T &CC3:5 Q8 Party/ Self-Identify&CC6:8 Male&CC9:11 Female&CC12:15 Q1 Vote"
This is what I need it to look like: T &CC3:5==&CC6:8==&CC9:11==&CC12:15==
I know I need to loop through this string, but I'm not sure of the best way to set this up.
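Sub ReplaceWithRegex() ' placeholder name; the question omitted the Sub line
    ' New RegExp needs a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim stringOne As String
    Dim regexOne As Object
    Set regexOne = New RegExp
    regexOne.Pattern = "([Q])+[0-9]"
    regexOne.Global = False
    stringOne = "T &CC3:5 Q8 Party/ Self-Identify&CC6:8 Male&CC9:11 Female&CC12:15 Q1 Vote"
    Debug.Print regexOne.Replace(stringOne, "==")
End Sub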
I have also explored using this regular expression regexOne.Pattern = "([&])+[C]+[C]+[0-9]+[:]+[0-9]"
I plan to eventually set the variable stringOne to Range("A1").Text

You could simplify the pattern a bit and use a capturing group and a positive lookahead
(&CC[0-9]+:[0-9]+).*?(?=&C|$)
Explanation
( Capture group 1
&CC[0-9]+:[0-9]+ Match &CC 1+ digits, : and 1+ digits
) Close group
.*? Match any char except a newline 0+ times, non-greedy
(?=&C|$) Positive lookahead, assert what is directly on the right is either &C or the end of the string
In the replacement, use the first capturing group followed by ==, that is "$1==", and set Global = True so that every occurrence is replaced, not just the first.
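As a quick check of the pattern and replacement outside Excel, here is a minimal JavaScript sketch. JavaScript is used only for illustration and the variable names are made up; the g flag plays the role of Global = True.

```javascript
// Input string taken from the question.
const input = "T &CC3:5 Q8 Party/ Self-Identify&CC6:8 Male&CC9:11 Female&CC12:15 Q1 Vote";

// Group 1 keeps the &CC<digits>:<digits> marker; the lazy .*? consumes the text
// up to (but not including) the next "&C" or the end of the string.
const markerPattern = /(&CC[0-9]+:[0-9]+).*?(?=&C|$)/g;

// "$1==" re-inserts the captured marker followed by "==".
const replaced = input.replace(markerPattern, "$1==");

console.log(replaced); // T &CC3:5==&CC6:8==&CC9:11==&CC12:15==
```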

Related

Regex pattern in vbscript to match Text with multiple line

I have a long string with serial numbers (Slno.) in it. I want to split the string into sentences at each serial number.
Sample text:
1. Able to click new button and proceed to ONB-002 dialogue.
2. - Partner connection name **(text field empty)(MANDATORY)**
- GS1 company prefix **(text field empty)(MANDATORY)**
I tried using a VBScript regex to match a pattern, but it matches only the first line of the string (1. text), not the second one.
^\d+\.\s(-?).*[\r\n].[\r\n\*+]*.*|^\d+\.\s(-?).*[\r\n]
When splitting the string, for Slno. 2 I also want to get the line below it, which I am having difficulty with.
Please assist me.
Set regex = CreateObject("VBScript.RegExp")
With regex
.Pattern = "^\d+\.\s(-?).*[\r\n].[\r\n\*+]*.*|^\d+\.\s(-?).*[\r\n]"
.Global = True
End With
Set matches = regex.Execute(txt)
I am looking for a regex pattern that matches
1. Able to click new button and proceed to ONB-002 dialogue.
&
2. - Partner connection name **(text field empty)(MANDATORY)**
- GS1 company prefix **(text field empty)(MANDATORY)**
as separate sentences or groups.
If I am not mistaken, to get the 2 separate parts, including the line after, you could use the following pattern (with .MultiLine = True so that ^ matches at the start of each line):
^\d+\..*(?:\r?\n(?!\d+\.).*)*
Explanation
^ Start of a line (with MultiLine enabled)
\d+\. Match 1+ digits followed by a dot
.* Match any character except a newline 0+ times
(?: Non capturing group
\r?\n(?!\d+\.).* Match a newline, use a negative lookahead to assert that what is on the right is not 1+ digits followed by a dot, then match the rest of that line
)* Close non capturing group and repeat 0+ times
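To see how the pattern groups the lines, here is a small JavaScript sketch using the sample text from the question (the variable names are illustrative). The m flag corresponds to MultiLine and the g flag to Global in VBScript.

```javascript
// Sample text from the question, joined with newlines.
const text = [
  "1. Able to click new button and proceed to ONB-002 dialogue.",
  "2. - Partner connection name **(text field empty)(MANDATORY)**",
  "- GS1 company prefix **(text field empty)(MANDATORY)**",
].join("\n");

// m: ^ matches at the start of every line; g: collect all numbered items.
const itemPattern = /^\d+\..*(?:\r?\n(?!\d+\.).*)*/gm;

// Each match is one numbered item together with its continuation lines.
const items = text.match(itemPattern) || [];
items.forEach((item, i) => console.log(`--- item ${i + 1} ---\n${item}`));
```

The first match is the single line of item 1; the second match spans both lines of item 2, because the continuation line does not start with digits and a dot.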

Extract only the first occurence of the search string and ignore everything after /

I'm new to regex and want to list every path that contains the directory name "spark", ignoring any characters or inner directories after it.
Using regex only
(*spark?/)
Below are the set of directories:
/app-logs/spark/logs/application_15262_85484
/user/oozie/share/lib/lib_36456456/spark
/app-logs/spark/logs
/app-logs/spark
/apps/spark/warehouse
My result should be:
/app-logs/spark
/user/oozie/share/lib/lib_36456456/spark
/app-logs/spark
/apps/spark
The expression we might be looking for here is:
(spark)\/?.*
Replace each match with the first capturing group, $1.
Test
const regex = /(spark)\/?.*/gm;
const str = `/app-logs/spark/logs/application_15262_85484
/user/oozie/share/lib/lib_36456456/spark
/app-logs/spark/logs
/app-logs/spark
/apps/spark/warehouse`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log(result);
Your pattern (*spark?/) is not valid because the quantifier * directly follows the opening parenthesis of the capturing group, so it has nothing to repeat. The question mark after the k makes the character k optional.
You could use a repeating group that matches a forward slash followed by one or more characters other than a forward slash, and then match /spark:
^(?:/[^/\n]+)+/spark
Explanation
^ Assert start of string
(?: Non capturing group
/[^/\n]+ Match /, then match 1+ times not / or a newline
)+ Close non capturing group and repeat 1+ times
/spark Match /spark
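A minimal JavaScript sketch of this pattern applied to the directories from the question, one path at a time (the variable names are illustrative):

```javascript
// Directories from the question.
const paths = [
  "/app-logs/spark/logs/application_15262_85484",
  "/user/oozie/share/lib/lib_36456456/spark",
  "/app-logs/spark/logs",
  "/app-logs/spark",
  "/apps/spark/warehouse",
];

// One or more "/segment" parts, then "/spark"; anything after it is left unmatched.
const sparkDir = /^(?:\/[^\/\n]+)+\/spark/;

// Keep the matched prefix of each path; drop paths that do not match at all.
const trimmed = paths
  .map((p) => p.match(sparkDir))
  .filter((m) => m !== null)
  .map((m) => m[0]);

console.log(trimmed.join("\n"));
```

Note that the pattern does not check what follows /spark, so a directory such as /apps/sparkling would also match; appending (?=/|$) would be one way to tighten it.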

Use Regex to Split Numbered List array into Numbered List Multiline

I am trying to learn Regex to answer a question on Stack Overflow in Portuguese.
Input (Array or String on a Cell, so .MultiLine = False)?
1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With number 0n mid. 4. Number 9 incorrect. 11.12 More than one digit. 12.7 Ending (no word).
Output
1 One without dot.
2. Some Random String.
3.1 With SubItens.
3.2 With number 0n mid.
4. Number 9 incorrect.
11.12 More than one digit.
12.7 Ending (no word).
What I thought was to use Regex with Split, but I wasn't able to adapt the following example to Excel.
Imports System.Text.RegularExpressions
Module Example
Public Sub Main()
Dim input As String = "plum-pear"
Dim pattern As String = "(-)"
Dim substrings() As String = Regex.Split(input, pattern) ' Split on hyphens.
For Each match As String In substrings
Console.WriteLine("'{0}'", match)
Next
End Sub
End Module
' The method writes the following to the console:
' 'plum'
' '-'
' 'pear'
After some reading, I tried the expression /([0-9]{1,2})([.]{0,1})([0-9]{0,2})/igm on the input at the RegExr website.
Is there a better way to do this? Is the regex correct, or is there a better way to build it? The examples I found on Google didn't show me how to use RegEx with Split correctly.
Maybe I am confused about the logic of the Split function: I wanted the regex to act as the separator and to get the split index.
I can guarantee that each item ends with a word and a period.
Use
\d+(?:\.\d+)*[\s\S]*?\w+\.
Details
\d+ - 1 or more digits
(?:\.\d+)* - zero or more sequences of:
\. - dot
\d+ - 1 or more digits
[\s\S]*? - any 0+ chars, as few as possible, up to the first...
\w+\. - 1+ word chars followed by a dot (.).
Here is a sample VBA code:
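Sub PrintListItems() ' placeholder name; the original snippet had no Sub line
    Dim str As String
    Dim objRegExp As Object
    Dim objMatches As Object
    Dim m As Object
    str = " 1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With Another SubItem. 4. List item. 11.12 More than one digit."
    ' New RegExp needs a reference to the VBScript regex library;
    ' CreateObject("VBScript.RegExp") is the late-bound alternative
    Set objRegExp = New RegExp
    objRegExp.Pattern = "\d+(?:\.\d+)*[\s\S]*?\w+\."
    objRegExp.Global = True
    Set objMatches = objRegExp.Execute(str)
    If objMatches.Count <> 0 Then
        ' Print each list item on its own line
        For Each m In objMatches
            Debug.Print m.Value
        Next
    End If
End Sub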
NOTE
You may require the matches to stop only at a word + . that is followed by 0+ whitespace characters and a number (or the end of the string), using \d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$)).
The (?=\s*(?:\d+|$)) positive lookahead requires the presence of 0+ whitespaces (\s*) followed with 1+ digits (\d+) or end of string ($) immediately to the right of the current location.
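The lookahead variant can be tried out with this short JavaScript sketch, which reuses the sample string from the VBA code (the variable names are illustrative):

```javascript
// Sample string from the VBA example.
const list = " 1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With Another SubItem. 4. List item. 11.12 More than one digit.";

// An item starts at a number (optionally with .sub-numbers) and ends at the first
// "letters + dot" that is followed by optional whitespace and a digit, or by the end.
const itemRe = /\d+(?:\.\d+)*[\s\S]*?[a-zA-Z]+\.(?=\s*(?:\d+|$))/g;

const found = list.match(itemRe) || [];
console.log(found.join("\n"));
```

This prints the six list items on separate lines. An item that does not end in letters directly followed by a dot, such as "12.7 Ending (no word)." from the question, is not matched by this pattern.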
If a regex-based split with lookahead support is available (VBA's built-in Split only takes a plain-string delimiter), this one may work, assuming there's no digit except in the indexes:
\s(?=\d)
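For illustration, here is the split-based approach in JavaScript, whose String.prototype.split accepts a regex. The input is the VBA sample string without its leading space, and the variable names are made up:

```javascript
const flat = "1 One without dot. 2. Some Random String. 3.1 With SubItens. 3.2 With Another SubItem. 4. List item. 11.12 More than one digit.";

// Split at each whitespace character that is immediately followed by a digit.
// The lookahead keeps the digit itself in the next piece.
const parts = flat.split(/\s(?=\d)/);

console.log(parts.join("\n"));
```

The assumption about digits matters: with the question's original input, "With number 0n mid." and "Number 9 incorrect." would each be split in two.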

Relevant Regular Expression in scala

I want to keep only the last term of a string separated by dots
Example:
My string is:
abc"val1.val2.val3.val4"zzz
Expected string after i use regex:
abc"val4"zzz
In other words, I want only the content to the right of the last dot (.).
The most relevant I tried was
val json="""abc"val1.val2.val3.val4"zzz"""
val sortie="""(([A-Za-z0-9]*)\.([A-Za-z0-9]*){2,10})\.([A-Za-z0-9]*)""".r.replaceAllIn(json, a=> a.group(3))
the result was:
abc".val4"zzz
Can you tell me if you have different solution for regex please?
Thanks
You may use
val s = """abc"val1.val2.val3.val4"zzz"""
val res = "(\\w+\")[^\"]*\\.([^\"]*\")".r replaceAllIn (s, "$1$2")
println(res)
// => abc"val4"zzz
Pattern details:
(\\w+\") - Group 1 capturing 1+ word chars and a "
[^\"]* - 0+ chars other than "
\\. - a dot
([^\"]*\") - Group 2 capturing 0+ chars other than " and then a ".
The $1 is the backreference to the first group and $2 inserts the text inside Group 2.
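The same pattern without the string-literal escaping, as a small JavaScript sketch (for illustration only; the names are made up). It shows how the greedy [^"]* backtracks to the last dot inside the quotes:

```javascript
const s = 'abc"val1.val2.val3.val4"zzz';

// Group 1: word chars plus the opening quote; [^"]* runs greedily to the last dot;
// Group 2: the final term plus the closing quote.
const lastTerm = s.replace(/(\w+")[^"]*\.([^"]*")/g, "$1$2");

console.log(lastTerm); // abc"val4"zzz
```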
Maybe without Regex at all:
scala> json.split("\"").map(_.split("\\.").last).mkString("\"")
res4: String = abc"val4"zzz
This assumes you want each "token" (separated by ") to become the last dot-separated inner token.

How do I match the contents of parenthesis in a scala regular expression

I'm trying to get at the contents of a string like this (2.2,3.4) with a scala regular expression to obtain a string like the following 2.2,3.4
This will get me the string with parenthesis and all from a line of other text:
"""\(.*?\)"""
But I can't seem to find a way to get just the contents of the parenthesis.
I've tried """\((.*?)\)""", """((.*?))""", and some other combinations, without luck.
I've used this one in the past in other Java apps: \\((.*?)\\), which is why I thought the first attempt in the line above """\((.*?)\)""" would work.
For my purposes, this looks something like:
var points = "pointA: (2.12, -3.48), pointB: (2.12, -3.48)"
var parenth_contents = """\((.*?)\)""".r;
val center = parenth_contents.findAllIn(points(0));
var cxy = center.next();
val cx = cxy.split(",")(0).toDouble;
Use Lookahead and Lookbehind
You can use this regex:
(?<=\()\d+\.\d+,\d+\.\d+(?=\))
Or, if you don't need precision inside the parentheses:
(?<=\()[^)]+(?=\))
Explanation
The lookbehind (?<=\() asserts that what precedes is a (
\d+\.\d+,\d+\.\d+ matches two decimal numbers separated by a comma, such as 2.2,3.4
or, in Option 2, [^)]+ matches any chars that are not a closing parenthesis
The lookahead (?=\)) asserts that what follows is a )
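A short JavaScript sketch of the second (looser) pattern, run against the points string from the question. The stricter first pattern is not used here because that sample contains a space and a minus sign. The variable names are illustrative, and the engine must support lookbehind (current Node.js and browsers do):

```javascript
const points = "pointA: (2.12, -3.48), pointB: (2.12, -3.48)";

// The lookbehind and lookahead check for the parentheses without consuming them,
// so each match is only the text between "(" and ")".
const inside = points.match(/(?<=\()[^)]+(?=\))/g) || [];

console.log(inside); // [ '2.12, -3.48', '2.12, -3.48' ]
```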
Maybe try this:
scala> val parenth_contents = "\\(([^)]+)\\)".r
parenth_contents: scala.util.matching.Regex = \(([^)]+)\)
scala> val parenth_contents(r) = "(123, abc)"
r: String = 123, abc
An even simpler regex matches every occurrence of the parentheses together with the content inside them:
(\([^)]+\)+)
1st Capturing Group (\([^)]+\)+)
\( matches the character ( literally
[^)]+ matches one or more characters other than ), as many times as possible, giving back as needed (greedy)
\)+ matches the character ) literally, one or more times, as many times as possible, giving back as needed (greedy)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
https://regex101.com/r/MMNRRo/1
\((.*?)\) works - you just need to extract the matched group. The easiest way to do that is to use the unapplySeq method of scala.util.matching.Regex:
scala> val wrapped = raw"\((.*?)\)".r
wrapped: scala.util.matching.Regex = \((.*?)\)
scala> val wrapped(r) = "(123,abc)"
r: String = 123,abc
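The capture-group idea carries over to other engines. As a rough JavaScript sketch (names are illustrative), matchAll exposes group 1 of every match, and the first pair can then be parsed into numbers as the question intended:

```javascript
const pointsText = "pointA: (2.12, -3.48), pointB: (2.12, -3.48)";

// m[0] is the whole match including parentheses; m[1] is the captured content.
const pairs = [...pointsText.matchAll(/\((.*?)\)/g)].map((m) => m[1]);

// Number() ignores the leading space in " -3.48".
const [cx, cy] = pairs[0].split(",").map(Number);

console.log(pairs, cx, cy);
```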