vbscript multiple replace regex - regex

How do you match more than one pattern in vbscript?
Set regEx = New RegExp
regEx.Pattern = "[?&]cat=[\w-]+" & "[?&]subcat=[\w-]+" ' tried this
regEx.Pattern = "([?&]cat=[\w-]+)([?&]subcat=[\w-]+)" ' and this
param = regEx.Replace(param, "")
I want to replace any parameter called cat or subcat in a string called param with nothing.
For instance
string?cat=meow&subcat=purr or string?cat=meow&dog=bark&subcat=purr
I would want to remove cat=meow and subcat=purr from each string.

regEx.Global = True ' replace every occurrence, not just the first
regEx.Pattern = "([?&])(cat|subcat)=[\w-]+"
param = regEx.Replace(param, "$1") ' The $1 brings our ? or & back

Generally, OR in regex is written with a pipe (|):
[?&]cat=[\w-]+|[?&]subcat=[\w-]+
In this case you can also make the sub prefix optional:
[?&](sub)?cat=[\w-]+
Another option is to alternate only over the parts that differ:
[?&](cat|dog|bird)=[\w-]+
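As a rough illustration of the alternation approach, here is a minimal Python sketch (Python's re module stands in for VBScript.RegExp; the helper name strip_params is hypothetical and not from the answers above). It builds the (cat|subcat) alternation from a list of names and removes every matching pair in one pass.

```python
import re

def strip_params(url, names):
    """Remove every ?name=value or &name=value pair whose name is in `names`."""
    # Build an alternation over the escaped names, e.g. (?:cat|subcat)
    alternation = "|".join(re.escape(n) for n in names)
    # re.ASCII keeps \w to [A-Za-z0-9_], close to the VBScript behaviour
    pattern = re.compile(r"[?&](?:" + alternation + r")=[\w-]+", re.ASCII)
    return pattern.sub("", url)

print(strip_params("string?cat=meow&subcat=purr", ["cat", "subcat"]))
# string
print(strip_params("string?cat=meow&dog=bark&subcat=purr", ["cat", "subcat"]))
# string&dog=bark
```

Note that the second result starts its query with & instead of ?, so a real URL would probably need a follow-up fix for the first separator.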

Related

Access vba Replace/Regex?

Afternoon,
I'm having trouble with some data imports from PowerPoint into Access.
Initially when I import the data the notes section comes in as the below for each row:
<div class="ExternalClass63DBAC931E7D4E4680E207BF938770AA"><p>xxxxxxxxxxx.</p> <p>xxxxxxxxxxxx</p></div>
The xxxxxxx is where the data I want to pull out is.
I have tried Regex in the form of replacing everything between the <> as seen below
Public Function AddPipesBeforeDates(ByVal strText As String) As String
Dim regex As Object
Dim matches As Object
Dim m As Object
Set regex = CreateObject("VBScript.RegExp")
regex.Global = True
regex.pattern = "<.*>"
Set matches = regex.Execute(strText)
For Each m In matches
strText = Replace(strText, m, "")
Next
AddPipesBeforeDates = strText
Set matches = Nothing
Set regex = Nothing
End Function
The problem is that it wipes out everything.
I just found out about Regex and I'm not familiar with it.
Is there a way to delete the unwanted data?
Note that the xxxxxx data can be any value, including spaces or special characters.
Any thoughts or ideas on how to do this would be appreciated. I may be going at this the wrong way.
Thanks
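Note that . matches any character except a newline, including < and >, and that .* is greedy: <.*> runs from the first < to the last > in the string, so the whole value is removed.
To remove only the substrings between < and >, you may use
regex.pattern = "<[^<]+>"
The negated character class cannot cross into the next tag, so each match stops at a single tag instead of "overfiring" and matching more than you need.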
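The difference between the greedy pattern and the negated character class can be sketched in Python (an illustration only, using Python's re module instead of VBScript.RegExp; the html value is a simplified stand-in for the imported notes field):

```python
import re

# Simplified stand-in for the imported notes field
html = '<div class="x"><p>first.</p> <p>second</p></div>'

# Greedy: .* runs from the first "<" to the last ">" in the string
greedy = re.sub(r"<.*>", "", html)

# Negated class: a match can never run past the start of the next tag
negated = re.sub(r"<[^<]+>", "", html)

print(repr(greedy))   # ''
print(repr(negated))  # 'first. second'
```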

How to remove a string between certain slashes regex or excel

I'm looking for a way to remove the segment that sits between two particular forward slashes in a URL (the 4th and 5th, counting the two in http://).
E.g http://www.website.com/content/remove-this/product
to http://www.website.com/content/product
I can use notepad++, regex or excel
I tried using
/.*?/(.*?)/
but that didn't work
Try using Notepad++ with "Replace" and using expression
^(.*://)([^/]*/)([^/]*/)([^/]*/)(.*)$
and replace with
$1$2$3$5
For the answers using Excel:
Formula
=LEFT(A1,FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),4)))&MID(A1,1+FIND(CHAR(1),SUBSTITUTE(A1,"/",CHAR(1),5)),99)
UDF (using regex)
Option Explicit
Function Remove4th(S As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = "^((?:.*?/){4})[^/]*/"
.MultiLine = True
Remove4th = .Replace(S, "$1")
End With
End Function
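For comparison, the same pattern as in the UDF can be tried out in Python (a sketch only; the function name remove_segment_after_fourth_slash is made up for this example). The group keeps everything up to and including the 4th slash, and the segment that follows is dropped.

```python
import re

# Group 1: everything up to and including the 4th "/"; then one more segment
SEGMENT_PATTERN = re.compile(r"^((?:.*?/){4})[^/]*/")

def remove_segment_after_fourth_slash(url):
    """Drop the path segment between the 4th and 5th forward slash."""
    return SEGMENT_PATTERN.sub(r"\1", url, count=1)

print(remove_segment_after_fourth_slash(
    "http://www.website.com/content/remove-this/product"))
# http://www.website.com/content/product
```

A URL with fewer than five slashes does not match and is returned unchanged.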
I would do something like this:
<?php
$string = " http://www.website.com/content/remove-this/product";
// Groups: 1 = host, 2 = first path segment, 3 = segment to drop, 4 = last segment
preg_match_all('#http:\/\/([a-zA-Z0-9-.]*)\/([a-zA-Z0-9-]*)\/([a-zA-Z0-9-]*)\/([a-zA-Z0-9-]*)#ism',$string,$out);
$new_string = 'http://'.$out[1][0].'/'.$out[2][0].'/'.$out[4][0];
echo $new_string;
// => http://www.website.com/content/product
?>

Find specific instance of a match in string using RegEx

I am very new to RegEx and I can't seem to find what I'm looking for. I have a string such as:
[cmdSubmitToDatacenter_Click] in module [Form_frm_bk_UnsubmittedWires]
and I want to get everything within the first set of brackets as well as the second set of brackets. If there is a way that I can do this with one pattern so that I can just loop through the matches, that would be great. If not, that's fine. I just need to be able to get the different sections of text separately. So far, the following is all I have come up with, but it just returns the whole string minus the first opening bracket and the last closing bracket:
[\[-\]]
(Note: I'm using the replace function, so this might be the reverse of what you are expecting.)
In my research, I have discovered that there are different RegEx engines. I'm not sure the name of the one that I'm using, but I'm using it in MS Access.
If you're using Access, you can use the VBScript Regular Expressions Library to do this. For example:
Const SOME_TEXT = "[cmdSubmitToDatacenter_Click] in module [Form_frm_bk_UnsubmittedWires]"
Dim re
Set re = CreateObject("VBScript.RegExp")
re.Global = True
re.Pattern = "\[([^\]]+)\]"
Dim m As Object
For Each m In re.Execute(SOME_TEXT)
Debug.Print m.Submatches(0)
Next
Output:
cmdSubmitToDatacenter_Click
Form_frm_bk_UnsubmittedWires
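The same pattern can be checked quickly outside Access. This is a small Python sketch (Python's re module rather than the VBScript library; the function name bracketed_parts is hypothetical) that returns the text inside each pair of brackets:

```python
import re

def bracketed_parts(text):
    """Return the contents of every [ ... ] pair, in order of appearance."""
    # [^\]]+ matches one or more characters that are not a closing bracket
    return re.findall(r"\[([^\]]+)\]", text)

sample = "[cmdSubmitToDatacenter_Click] in module [Form_frm_bk_UnsubmittedWires]"
print(bracketed_parts(sample))
# ['cmdSubmitToDatacenter_Click', 'Form_frm_bk_UnsubmittedWires']
```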
Here is what I ended up using, as it made it easier to get the individual values returned. I set a reference to the Microsoft VBScript Regular Expressions 5.5 library so that I could get Intellisense help.
Public Sub GetText(strInput As String)
Dim regex As RegExp
Dim colMatches As MatchCollection
Dim strModule As String
Dim strProcedure As String
Set regex = New RegExp
With regex
.Global = True
.Pattern = "\[([^\]]+)\]"
End With
Set colMatches = regex.Execute(strInput)
With colMatches
strProcedure = .Item(0).submatches.Item(0)
strModule = .Item(1).submatches.Item(0)
End With
Debug.Print "Module: " & strModule
Debug.Print "Procedure: " & strProcedure
Set regex = Nothing
End Sub

using classic asp for regular expression

We have some Classic ASP sites that I'm working on a little, and I was wondering how I can write a regular expression check and extract the matched expression.
The expression I need is in the script's name.
So let's say this
Response.Write Request.ServerVariables("SCRIPT_NAME")
Prints out:
review_blabla.asp
review_foo.asp
review_bar.asp
How can I get the blabla, foo and bar from there?
Thanks.
Whilst Yots' answer is almost certainly correct, you can achieve the result you are looking for with a lot less code and somewhat more clearly:
'A handy function i keep lying around for RegEx matches'
Function RegExResults(strTarget, strPattern)
Set regEx = New RegExp
regEx.Pattern = strPattern
regEx.Global = true
Set RegExResults = regEx.Execute(strTarget)
Set regEx = Nothing
End Function
'Pass the original string and pattern into the function and get a collection object back'
Set arrResults = RegExResults(Request.ServerVariables("SCRIPT_NAME"), "review_(.*?)\.asp")
'In your pattern the answer is the first group, so all you need is'
For each result in arrResults
Response.Write(result.Submatches(0))
Next
Set arrResults = Nothing
Additionally, I have yet to find a better RegEx playground than Regexr. It's brilliant for trying out your regex patterns before diving into code.
You have to use the SubMatches collection of the Match object to get your data out of the review_(.*?)\.asp pattern:
Function getScriptNamePart(scriptname)
dim RegEx : Set RegEx = New RegExp
dim result : result = ""
With RegEx
.Pattern = "review_(.*?)\.asp"
.IgnoreCase = True
.Global = True
End With
Dim Match, Submatch
dim Matches : Set Matches = RegEx.Execute(scriptname)
dim SubMatches
For Each Match in Matches
For Each Submatch in Match.SubMatches
result = Submatch
Exit For
Next
Exit For
Next
Set Matches = Nothing
Set SubMatches = Nothing
Set Match = Nothing
Set RegEx = Nothing
getScriptNamePart = result
End Function
You can do
review_(.*?)\.asp
See it here on Regexr
You will then find your result in capture group 1.
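To see capture group 1 in action without an ASP page, here is a short Python sketch (Python's re module, not the VBScript RegExp object; the function name script_name_part is made up for this example):

```python
import re

# Same pattern as above; IGNORECASE mirrors .IgnoreCase = True
SCRIPT_PATTERN = re.compile(r"review_(.*?)\.asp", re.IGNORECASE)

def script_name_part(script_name):
    """Return the text between 'review_' and '.asp', or '' if there is no match."""
    match = SCRIPT_PATTERN.search(script_name)
    return match.group(1) if match else ""

for name in ("review_blabla.asp", "review_foo.asp", "review_bar.asp"):
    print(script_name_part(name))
# blabla
# foo
# bar
```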
You can use the RegExp object to do so.
Your code is going to look like this:
Set RegularExpressionObject = New RegExp
RegularExpressionObject.Pattern = "review_(.*)\.asp"
' Execute returns a MatchCollection object, so the assignment needs Set
Set matches = RegularExpressionObject.Execute("review_blabla.asp")
Sorry, I can't test the code above right now.
Check out usage at MSDN http://msdn.microsoft.com/en-us/library/ms974570.aspx

Find text and replace with hyperlink

I am trying to replace text in the body that matches the pattern ASA###### with a hyperlinked ASA######.
I have code which works if there is only one match in the body.
But if I have several matches, like
ASA3422df
ASA2389ds
ASA1265sa
the entire body gets replaced to
ASAhuyi65
My code is here.
Dim strID As String
Dim Body As String
Dim objMail As Outlook.MailItem
Dim temp As String
Dim RegExpReplace As String
Dim RegX As Object
strID = MyMail.EntryID
Set objMail = Application.Session.GetItemFromID(strID)
Body = objMail.HTMLBody
Body = Body + "Test"
objMail.HTMLBody = Body
Set RegX = CreateObject("VBScript.RegExp")
With RegX
.Pattern = "ASA[0-9][0-9][0-9][0-9][a-z][a-z]"
.Global = True
.IgnoreCase = Not MatchCase
End With
'RegExpReplace = RegX.Replace(Body, "http://www.code.com/" + RegX.Pattern + "/ABCD")
'if the replacement is longer than the search string, future .FirstIndexes will be off
Offset = 0
'Set matches = RegX.Execute(Body)
For Each m In RegX.Execute(Body)
RegExReplace = "" & m.Value & ""
Next
Set RegX = Nothing
objMail.HTMLBody = RegExReplace
objMail.Save
Set objMail = Nothing
End Sub
It looks like you were on the right track originally with that commented-out line. With the Replace method you don't need to loop over matches (that's what the Global flag is for), and you can use backreferences like $1, $2, etc. as placeholders for the captured substrings. As with most languages, there's a dedicated page on Regular-Expressions.info for VBScript.
The following will do what you're looking for:
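body = "Blah blah ASA3422df ASA2389ds ASA1265sa"
' $1 refers to the first capture group, so the pattern must be wrapped in parentheses
RegX.Pattern = "(ASA[0-9][0-9][0-9][0-9][a-z][a-z])"
body = RegX.Replace(body, "<a href='http://www.code.com/$1'>$1</a>")
Debug.Print body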
'-> Blah blah <a href='http://www.code.com/ASA3422df'>ASA3422df</a> <a href='http://www.code.com/ASA2389ds'>ASA2389ds</a> <a href='http://www.code.com/ASA1265sa'>ASA1265sa</a>
This replaces the matches (and only the matches) with a link, and leaves everything else untouched.
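The same idea, a single global replace with a backreference, looks like this in Python (an illustrative sketch, not Outlook VBA; the function name link_ids is hypothetical, and \g<0> is Python's way of referring to the whole match):

```python
import re

# ASA followed by four digits and two lowercase letters
ID_PATTERN = re.compile(r"ASA[0-9]{4}[a-z]{2}")

def link_ids(body):
    """Wrap every ASA id in an anchor tag and leave all other text untouched."""
    return ID_PATTERN.sub(r"<a href='http://www.code.com/\g<0>'>\g<0></a>", body)

print(link_ids("Blah blah ASA3422df ASA2389ds"))
# Blah blah <a href='http://www.code.com/ASA3422df'>ASA3422df</a> <a href='http://www.code.com/ASA2389ds'>ASA2389ds</a>
```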
Over at codedawn, there is a fantastic add-in for Excel that gives you the same UI search and replace that you know and love, but for regular expressions.
http://www.codedawn.com/excel-add-ins.php
While this doesn't exactly help answer your question, it's useful for trying out regular expressions one after another without altering data or code.