VBScript RegEx - match between words - regex

I'm having a hard time coming up with a RegEx that works in VBScript. I'm trying to match all text between 2 keywords:
(?<=key)(.*)(?=Id)
This throws a RegEx error in VBScript.
Blob I'm matching against:
\"key\":[\"food\",\"real\",\"versus\",\"giant\",\"giant gummy\",\"diy candy\",\"candy\",\"gummy worm\",\"pizza\",\"fries\",\"spooky diy science\",\"spooky\",\"trapped\"],\"Id\"
Ideally, I'd end up with a comma-delimited list like this:
food,real,versus,giant,giant gummy,diy candy,candy,gummy worm,pizza,fries,spooky diy science,spooky,trapped
but, I'd settle for all text between 2 keywords working in VBScript.
Thanks in advance!

VBScript's regular expression engine doesn't support lookbehind assertions, so you'll want to do something like this instead:
s = "\""key\"":[\""food\"",\""real\"",\""trapped\""],\""Id\"""
'remove backslashes and double quotes from string
s1 = Replace(s, "\", "")
s1 = Replace(s1, Chr(34), "")
Set re = New RegExp
re.Pattern = "key:\[(.*?)\],Id"
For Each m In re.Execute(s1)
list = m.Submatches(0)
Next
WScript.Echo list
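For comparison, here is a minimal Python sketch of the same capture-group idea: match the keywords themselves and pull out only the group between them, so no lookbehind is needed. The function name extract_list and the shortened sample blob are made up for illustration; this is not the VBScript answer itself, just the same technique in a language that is easy to run.

```python
import re

def extract_list(blob):
    """Return the comma-delimited text between key:[ and ],Id, or "" if absent."""
    # Strip backslashes and double quotes first, as the VBScript answer does.
    cleaned = blob.replace("\\", "").replace('"', "")
    # A capturing group stands in for the lookbehind/lookahead pair.
    match = re.search(r"key:\[(.*?)\],Id", cleaned)
    return match.group(1) if match else ""

# Shortened, hypothetical version of the blob from the question.
sample = r'\"key\":[\"food\",\"real\",\"giant gummy\",\"trapped\"],\"Id\"'
print(extract_list(sample))
```

The lazy .*? stops at the first ],Id, which matters if the blob contains more than one such section.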

Related

regex .NET to find and replace underscores only if found between > and <

I have a list of strings looking like this:
Title_in_Title_by_-_Mr._John_Doe
and I need to replace the _ with a SPACE from the text between the html"> and </a> ONLY.
so that the result looks like this:
Title in Title by - Mr. John Doe
I've tried to do it in 2 steps:
first isolate that part only with .*html">(.*)<\/a.* & ^.*>(.*)<.* & .*>.*<.* or ^.*>.*<.*
and then do the replace but the return is always unchanged and now I'm stuck.
Any help to accomplish this is much appreciated
How I would do it is to .split it and then .replace it, no need for regex.
Dim line as string = "Title_in_Title_by_-_Mr._John_Doe"
Dim split as string() = line.split(">"c)
Dim correctString as String = split(1).replace("_"c," "c)
Boom done
here is the string.replace article
Though if you had to use regex, this would probably be a better way of doing it
Dim inputString = "Title_in_Title_by_-_Mr._John_Doe"
Dim reg As New Regex("(?<=\>).*?(?=\<)")
Dim correctString = reg.match(inputString).value.replace("_"c, " "c)
Dim line as string = "Title_and_Title_by_-_Mr._John_Doe"
line = Regex.Replace(line, "(?<=\.html"">)[^<>]+(?=</a>)", _
Function (m) m.Value.Replace("_", " "))
This uses a regex with lookarounds to isolate the title, and a MatchEvaluator delegate in the form of a lambda expression to replace the underscores in the title, then it plugs the result back into the string.
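The same "isolate with lookarounds, rewrite with a callback" pattern can be sketched in Python, where a function passed to re.sub plays the role of the MatchEvaluator. The anchor tag used as input is invented for illustration, since the markup in the question did not survive; the function name clean_link_text is also hypothetical.

```python
import re

def clean_link_text(line):
    # Only the text between .html"> and </a> is matched; the callback rewrites it.
    return re.sub(r'(?<=\.html">)[^<>]+(?=</a>)',
                  lambda m: m.group(0).replace("_", " "),
                  line)

# Hypothetical input: the surrounding anchor markup is an assumption.
line = '<a href="Title_in_Title.html">Title_in_Title_by_-_Mr._John_Doe</a>'
print(clean_link_text(line))
```

Because the lookarounds do not consume anything, the underscores inside the href are left alone and only the link text changes.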

Regex to find words and wrap quotes

I am trying to find words with spaces that are surrounded by (, ) or , and wrap them in quotes.
For example, in this expression, "development life cycle" and "enterprise service bus" are to be wrapped in quotes.
Edit - Only phrases, i.e. words that contain spaces between them, are to be wrapped.
(AND(OR(SDLC,development life cycle),design,requirements,OR(biztalk,Websphere,TIBCO,Webmethods,ESB,enterprise service bus)))
(?<=[(,])([^(),]* [^(),]*)(?=[),])
Try this. See DEMO.
Replace by "$1" or "\1"
string strRegex = @"(?<=[(,])([^(),]* [^(),]*)(?=[),])";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);
string strTargetString = @"(AND(OR(SDLC,development life cycle),design,requirements,OR(biztalk,Websphere,TIBCO,Webmethods,ESB,enterprise service bus)))" + "\n" + @" AND(OR(SDLC,""development life cycle""),OR(banking,AML,anti-money laundering,KYC,know your customer),OR(technology strategy,technical strategy,technical architecture,technology architecture,architect*)";
string strReplace = @"""$1""";
return myRegex.Replace(strTargetString, strReplace);
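As a quick way to try the pattern outside .NET, here is a small Python sketch using the same regex; the replacement is written as "\1" because that is Python's backreference syntax. The name quote_phrases and the shortened expression are assumptions made for the example.

```python
import re

# Same pattern as above: a phrase with at least one space, bounded by ( , or ).
PHRASE = re.compile(r"(?<=[(,])([^(),]* [^(),]*)(?=[),])")

def quote_phrases(expression):
    # \1 is the captured phrase; the lookarounds leave the delimiters untouched.
    return PHRASE.sub(r'"\1"', expression)

# Shortened, hypothetical version of the expression from the question.
expr = "(AND(OR(SDLC,development life cycle),design,OR(ESB,enterprise service bus)))"
print(quote_phrases(expr))
```

One thing to watch: the pattern does not exclude quote characters, so a phrase that is already quoted would be wrapped a second time.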

Simple Regular Expression matching

I'm new to regular expressions and I'm trying to use RegExp on the GWT client side. I want to do simple * matching (say, if the user enters 006*, I want to match 006...). I'm having trouble writing this. What I have is:
input = (006*)
input = input.replaceAll("\\*", "(" + "\\" + "\\" + "S\\*" + ")");
RegExp regExp = RegExp.compile(input);
It returns true for strings like BKLFD006* too. What am I doing wrong?
Put a ^ at the start of the regex you're generating.
The ^ character means to match at the start of the source string only.
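To make the anchoring concrete, here is a rough Python sketch of turning a user wildcard into an anchored regex. It is not GWT code; the helper name wildcard_to_regex is made up, and escaping the literal parts is an extra precaution that the question's code does not take.

```python
import re

def wildcard_to_regex(user_input):
    # Escape each literal chunk so characters like "." or "(" stay literal.
    chunks = [re.escape(chunk) for chunk in user_input.split("*")]
    # Each "*" becomes \S* (any run of non-whitespace); ^ pins the match to the start.
    return re.compile("^" + r"\S*".join(chunks))

pattern = wildcard_to_regex("006*")
print(bool(pattern.search("006123")))     # True
print(bool(pattern.search("BKLFD006*")))  # False
```

Without the leading ^, the second string would match as well, because the engine is free to start the match at the 006 in the middle.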
I think you are mixing two things here, namely replacement and matching.
Matching is used when you want to extract the part of the input string that matches a specific pattern. In your case that seems to be what you want. To get one or more digits at the very start of the string that are followed by a star, you can use the following regex:
^[0-9]+(?=\*)
and here is a Java snippet:
String subjectString = "006*";
String ResultString = null;
Pattern regex = Pattern.compile("^[0-9]+(?=\\*)");
Matcher regexMatcher = regex.matcher(subjectString);
if (regexMatcher.find()) {
ResultString = regexMatcher.group();
}
On the other hand, replacement is used when you want to replace a re-occurring pattern from the input string with something else.
For example, if you want to replace all digits followed by a star with the same digits surrounded by parentheses then you can do it like this:
String input = "006*";
String result = input.replaceAll("^([0-9]+)\\*", "($1)");
Notice the use of $1 to reference the digits that were captured by the capture group ([0-9]+) in the regex pattern.

Whole word replacements using Regular Expression

I have a list of original words paired with replacement words, and I want to replace every occurrence of an original word in some sentences with its replacement.
For example my list:
theabove the above
myaddress my address
So the sentence "This is theabove." will become "This is the above."
I am using Regular Expression in VB like this:
Dim strPattern As String
Dim regex As New RegExp
regex.Global = True
If Not IsEmpty(myReplacementList) Then
For intRow = 0 To UBound(myReplacementList, 2)
strReplaceWith = IIf(IsNull(myReplacementList(COL_REPLACEMENTWORD, intRow)), " ", varReplacements(COL_REPLACEMENTWORD, intRow))
strPattern = "\b" & myReplacementList(COL_ORIGINALWORD, intRow) & "\b"
regex.Pattern = strPattern
TextToCleanUp = regex.Replace(TextToReplace, strReplaceWith)
Next
End If
I loop over all entries in my list myReplacementList against the text TextToReplace I want to process, and the replacement has to match whole words only, so I used the "\b" token around the original word.
It works well but I have a problem when the original words contain some special characters for example
overla) overlay
I tried to escape the ) in the pattern but it does not work:
\boverla\)\\b
I can't turn the sentence "This word is overla) with that word." into "This word is overlay with that word."
Not sure what is missing. Are regular expressions the right tool for the above scenario?
I'd use string.replace().
That way you don't have to escape special characters, only double quotes ("").
See here for examples: http://www.dotnetperls.com/replace-vbnet
Regex is good if you're looking for patterns. Or renaming your mp3 collection ;-) and much, much more. But in your case, I'd use string.replace().
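If whole-word matching is still needed, it may help to see why the escaped pattern fails: \b only matches between a word character and a non-word character, and ")" followed by a space is two non-word characters, so there is no boundary after "overla)". A minimal Python sketch of one workaround replaces \b with "no word character on this side" lookarounds. The helper name replace_whole is made up, and the leading lookbehind is an assumption about the engine: the VBScript-style RegExp object does not support lookbehind, so only the trailing (?!\w) carries over there.

```python
import re

def replace_whole(text, original, replacement):
    # (?<!\w) and (?!\w) mean "no word character on this side", which also
    # holds next to punctuation such as ")" where \b finds no boundary.
    pattern = r"(?<!\w)" + re.escape(original) + r"(?!\w)"
    # A function replacement keeps any backslashes in `replacement` literal.
    return re.sub(pattern, lambda m: replacement, text)

sentence = "This word is overla) with that word."
print(re.sub(r"\boverla\)\b", "overlay", sentence))  # unchanged: \b fails after ")"
print(replace_whole(sentence, "overla)", "overlay"))
```

Ordinary words such as theabove still behave as they did with \b, since a word character next to a space or period passes both checks.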

Regular Expressions Vbscript

I am fiddling with regular expressions to shorten a string splitting routine I have been using.
I have a string for my cart that is submitted to an asp script as follows:
addnothing|-1, addRST115400112*2xl|0, addnothing|-1, addnothing|-1, addRST115400115*xs|0, addnothing|-1
I want to be able to extract the two entries that represent two stock items:
addRST115400112*2xl|0
addRST115400115*xs|0
I have managed to get this bit of code to work but I am unsure about the pattern I am using:
add[^n](.*)\*(.*)\|[0-9],
This returns this:
addRST115400112*2xl|0, addnothing|-1, addnothing|-1, addRST115400115*xs|0,
but I only want it to return :
addRST115400112*2xl|0
addRST115400115*xs|0
Can anybody point me in the right direction please?
You were matching greedily: .* consumes as much as it can, so in your case it runs on to the last \|[0-9] it can reach, i.e. the final |0.
You should match lazily by using .*? instead of .*
So your regex should be
add(?!nothing)(.*?)\*(.*?)\|\d
\d is equivalent to [0-9] in VBScript.
(?!nothing) is just a check: it doesn't match or consume anything. It's better than [^n] because it's more reliable and expressive, and it doesn't consume a character.
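The difference between the two patterns is easy to check side by side. Here is a short Python sketch (not VBScript; the helper name find_items is made up) that runs the original greedy pattern and the lazy one against the cart string from the question.

```python
import re

cart = ("addnothing|-1, addRST115400112*2xl|0, addnothing|-1, "
        "addnothing|-1, addRST115400115*xs|0, addnothing|-1")

def find_items(pattern, text):
    # Return the full text of every non-overlapping match.
    return [m.group(0) for m in re.finditer(pattern, text)]

# Original pattern: greedy .* swallows everything up to the last |digit,
greedy = find_items(r"add[^n](.*)\*(.*)\|[0-9],", cart)
# Suggested pattern: lazy .*? stops at the first * and the first |digit
lazy = find_items(r"add(?!nothing)(.*?)\*(.*?)\|\d", cart)

print(len(greedy))
print(lazy)
```

The greedy version yields a single long match spanning both stock items, while the lazy version yields the two entries separately.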
Trying to keep the .Pattern simple (this is VBScript!) and to make tinkering with it easier (what really singles out stock items is by no means clear):
Dim sInp : sInp = "addnothing|-1, addRST115400112*2xl|0, addnothing|-1, addnothing|-1, addRST115400115*xs|0, addnothing|-1"
Dim reCut : Set reCut = New RegExp
reCut.Global = True
reCut.Pattern = "addR[^|]+\|\d"
Dim oMTS : Set oMTS = reCut.Execute(sInp)
If 2 = oMTS.Count Then
WScript.Echo "Success:", Join(Array(oMTS(0).Value, oMTS(1).Value))
Else
WScript.Echo "Bingo:", reCut.Pattern
End If
output:
Success: addRST115400112*2xl|0 addRST115400115*xs|0