Regular expression for changing links in Dreamweaver - regex

I'm in the process of moving my Dreamweaver-based website to a CMS, and I would like to replace site-wide the following kind of links:
a href="http://www.domain.com/category/item ### title.html" (where ### is a number)
to
a href="http://www.domain.com/category/item###"
What is the correct regular expression I should use in the find and replace built-in engine?

I propose
'(http://www.domain.com/category/item) *(\d+).+?\.html'
as the regular expression, and substitute the entire match with $1$2 (group 1 immediately followed by group 2).
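To check the proposed pattern outside Dreamweaver, here is a minimal Python sketch. The sample link, the constant name LINK_PATTERN, and the helper rewrite_links are made up for illustration; Python writes the group references as \1 and \2 rather than $1 and $2.

```python
import re

# Same pattern as proposed above (dots left unescaped, as in the answer).
LINK_PATTERN = re.compile(r'(http://www.domain.com/category/item) *(\d+).+?\.html')

def rewrite_links(html: str) -> str:
    # \1 is the URL prefix and \2 the number; the title and ".html" are dropped.
    return LINK_PATTERN.sub(r'\1\2', html)

# Hypothetical sample link, not taken from the real site.
sample = '<a href="http://www.domain.com/category/item 123 some title.html">Item</a>'
print(rewrite_links(sample))
# <a href="http://www.domain.com/category/item123">Item</a>
```

Because .+? requires at least one character between the number and .html, a link with no title part at all would be left untouched.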

Related

Regex, WordPress, and the Search Regex plugin - converting a script to a shortcode

I am using WordPress and the Search Regex plugin.
I need to convert several hundred posts that contain a script with a unique photo album number.
Each script needs to be changed to a shortcode which contains the same Photo album number.
Here is an example:
Search for the script:
<tt>%%wppa%%</tt> <tt>%%album=15%%</tt>
and replace it with the shortcode:
[wppa type="album" album="15"][/wppa]
What would I place in the Search field, and what would I place in the Replace field?
Use this regex:
<tt>%%([^%]+)%%</tt>\s*<tt>%%([^%=]+)=([^%=]+)%%</tt>
and this replacement:
[$1 type="$2" $2="$3"][/$1]
The $ numbers in the replacement refer to the capture groups: (). Overall, it's very simple, with [^%]+ matching "not %", \s* matching whitespace, and [^%=]+ matching "not % or =".
The plugin uses PCRE, but I'm not sure if the regex needs any adjustments since I don't use the plugin. (It may require escaping the / characters.)
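Since the answer was not tested in the plugin, it can at least be checked in isolation. The sketch below uses Python's re module rather than the plugin's PCRE engine, so it only confirms the pattern logic; the names SCRIPT_PATTERN, SHORTCODE, and to_shortcode are illustrative.

```python
import re

# Pattern from the answer: three capture groups (tag name, key, value).
SCRIPT_PATTERN = re.compile(r'<tt>%%([^%]+)%%</tt>\s*<tt>%%([^%=]+)=([^%=]+)%%</tt>')

# Python spells the group references \1, \2, \3 instead of $1, $2, $3.
SHORTCODE = r'[\1 type="\2" \2="\3"][/\1]'

def to_shortcode(post: str) -> str:
    return SCRIPT_PATTERN.sub(SHORTCODE, post)

print(to_shortcode('<tt>%%wppa%%</tt> <tt>%%album=15%%</tt>'))
# [wppa type="album" album="15"][/wppa]
```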

how to remove all website addresses in bulk using regex

I have a lot of site URLs and I want to strip the website address from all of them.
Example:
http://www.website1.com/product.php?id=
http://www.website2.net/list.php?cid=
http://www.website3.org/view.php?page=
Once removed:
product.php?id=
list.php?cid=
view.php?page=
I want to remove them in bulk using regex101 or a regex in Notepad++.
Which regular expression should I use to remove all of them?
I find PHP Live Regex easier to use for that purpose since you see the replace results directly (choose preg_replace instead of preg_match):
You can use this regex, choose replace, and keep only the first capturing group, $1:
(?:[a-z]{4,5}://[a-z.0-9]*\/)?([a-z.\?_=]*)([0-9]*)
Result:
product.php?id=
list.php?cid=
view.php?page=
See: http://www.phpliveregex.com/p/g5q
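The same replace can be tried without the online tool. This is a rough Python sketch of the idea, not the PHP Live Regex setup itself; the names PATTERN and strip_domain are assumptions, and count=1 is a choice made here so that only the match starting at the beginning of each line is replaced.

```python
import re

# Regex from the answer, unchanged: optional scheme and host, then the file part.
PATTERN = re.compile(r'(?:[a-z]{4,5}://[a-z.0-9]*\/)?([a-z.\?_=]*)([0-9]*)')

def strip_domain(line: str) -> str:
    # Keep only the first capturing group (\1 here, $1 in PHP).
    return PATTERN.sub(r'\1', line, count=1)

urls = [
    'http://www.website1.com/product.php?id=',
    'http://www.website2.net/list.php?cid=',
    'http://www.website3.org/view.php?page=',
]
for url in urls:
    print(strip_domain(url))
```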
Use the following regex to search in Notepad++:
.*\/
Then replace with nothing (leave the Replace field empty).
Because .* is greedy, this matches everything from the beginning of the line up to the last /, and that whole span is removed.
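The greedy "everything up to the last slash" search can be sketched in Python as follows. The replacement here is an empty string, which is what produces the output shown in the question; the variable names are illustrative.

```python
import re

urls = '\n'.join([
    'http://www.website1.com/product.php?id=',
    'http://www.website2.net/list.php?cid=',
    'http://www.website3.org/view.php?page=',
])

# "." does not match a newline by default, so ".*/" works line by line:
# it greedily consumes each line up to and including its last "/".
stripped = re.sub(r'.*/', '', urls)
print(stripped)
# product.php?id=
# list.php?cid=
# view.php?page=
```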

How to extract a part of the url through regular expression in textwrangler?

My work involves manipulating lots of data. I use TextWrangler as my text editor, but I guess things would be the same in any text editor.
So I have a URL:
http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity&sid=812%2Cf13&offer=GsdOfferOnWatches07.&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af
The above is a sample URL.
I want to capture the text GsdOfferOnWatches07. (that is, the text after offer= and up to &ref) using a regular expression in TextWrangler's Ctrl+F find feature.
How can I do that?
$link = 'http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity&sid=812%2Cf13&offer=GsdOfferOnWatches07.&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af';
// Lazily capture everything between "offer=" and "&ref".
preg_match('/offer=(.*?)&ref/', $link, $match);
echo $match[1]; // GsdOfferOnWatches07.
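For comparison, the same lazy capture works unchanged in other regex engines. A minimal Python equivalent of the PHP snippet, with illustrative variable names:

```python
import re

link = ('http://example.com/swatches/frisk-watches/pr?p[]=sort%3Dpopularity'
        '&sid=812%2Cf13&offer=GsdOfferOnWatches07.'
        '&ref=4c83d65f-bfaf-4db6-b5f5-d733d7b1d2af')

# (.*?) is lazy, so it stops at the first "&ref" after "offer=".
match = re.search(r'offer=(.*?)&ref', link)
offer = match.group(1) if match else None
print(offer)
# GsdOfferOnWatches07.
```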

Regex Assistance for a url filepath

Can someone assist in creating a regex for the following situation:
I have about 2000 records in which I need to do a search/replace on a known item in each record that looks like this:
<li>View Product Information</li>
The FILEPATH and FILE are variable, but the surrounding HTML is always the same. Can someone assist with what kind of Regex I would substitute for the "FILEPATH/FILE" part of the search?
You may match the constant parts and use grouping to put them back:
(<li>View Product Information</li>)
Then replace the string with $1your_replacement$2, where $1 is the first matching group and $2 the second (in Python, for instance, you would call Match.group(1) and Match.group(2)).
You would have to escape \ chars if you're using Java instead.
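The grouping idea can be sketched in Python. The exact markup around FILEPATH/FILE is not visible in the question as shown here, so the anchor tag, the sample paths, and the names ITEM_PATTERN and replace_path below are all hypothetical; only the technique (capture the constant parts, put them back around the new value) comes from the answer.

```python
import re

# Hypothetical markup: the href attribute is an assumption for illustration.
ITEM_PATTERN = re.compile(
    r'(<li><a href=")[^"]*(">View Product Information</a></li>)'
)

def replace_path(record: str, new_path: str) -> str:
    # Use a function as the replacement so new_path is inserted literally,
    # with the two captured constant parts placed back around it.
    return ITEM_PATTERN.sub(lambda m: m.group(1) + new_path + m.group(2), record)

record = '<li><a href="docs/manuals/widget.pdf">View Product Information</a></li>'
print(replace_path(record, 'new/path/file.pdf'))
# <li><a href="new/path/file.pdf">View Product Information</a></li>
```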

regular expression to match all domain names except admin / www / mail

I am new to regular expressions. Given this list, I need to find matches:
a.com
b.com
c.com
aa.com
admin.com
www.com
mail.com
vg.com
The result should be a regular expression that matches all domains except admin / www / mail.
I wrote this:
[a-zA-Z0-9]+.com
But how do I exclude admin, mail, and www?
I tried this:
^(www|mail|admin)[a-zA-Z0-9]+.com
But it doesn't work
Try this
\w+(?<!admin|mail|www)\.com
Here it is with some tests
http://www.rubular.com/r/frRl1ucR8J
Further reading on Regular Expressions: http://www.regular-expressions.info/tutorial.html
The trick I used is called negative lookbehind: http://www.regular-expressions.info/lookaround.html
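One caveat when porting this: Python's re module only accepts fixed-width lookbehind, so the alternation admin|mail|www inside (?<!...) is rejected there. The sketch below therefore swaps in a negative lookahead anchored at the start of the string, which is a different technique from the lookbehind above. It excludes exactly admin.com, mail.com, and www.com; the names PATTERN, domains, and allowed are illustrative.

```python
import re

# Negative lookahead: fail if the whole string is one of the excluded names.
PATTERN = re.compile(r'^(?!(?:admin|mail|www)\.com$)\w+\.com$')

domains = ['a.com', 'b.com', 'c.com', 'aa.com',
           'admin.com', 'www.com', 'mail.com', 'vg.com']

allowed = [d for d in domains if PATTERN.match(d)]
print(allowed)
# ['a.com', 'b.com', 'c.com', 'aa.com', 'vg.com']
```

Unlike the lookbehind version, which also rejects names that merely end in one of the excluded words, this variant rejects only the three exact names.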
It is not simple to exclude some things, but here is a link to help:
http://www.codinghorror.com/blog/2005/10/excluding-matches-with-regular-expressions.html
Is it possible to use a replace first? You could first do a find/replace to eliminate lines that match the things you want to skip, then use your regular expression.
You would do this to search for a string that doesn't contain admin:
^((?!admin).)*$
I'm not sure how to do it for multiple strings...
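One possible way to extend this to several strings is to put an alternation inside the lookahead. The following Python sketch assumes that is acceptable for the use case; note that it rejects any line containing one of the words anywhere, so a name such as email.com would also be excluded. The names PATTERN, lines, and kept are illustrative.

```python
import re

# At every position, assert that none of the excluded words starts here.
PATTERN = re.compile(r'^((?!admin|mail|www).)*$')

lines = ['a.com', 'b.com', 'c.com', 'aa.com',
         'admin.com', 'www.com', 'mail.com', 'vg.com']

kept = [line for line in lines if PATTERN.match(line)]
print(kept)
# ['a.com', 'b.com', 'c.com', 'aa.com', 'vg.com']
```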
I use this for e-mail addresses; it is somewhat similar to the answers already given:
/^[A-Za-z0-9._'%+-]+@(\[(\d{1,3}\.){3}|(?!hotmail|gmail|yahoo|live|msn|outlook|comcast|verizon)(([a-zA-Z\d-]+\.)+))([a-zA-Z]{2,4}|\d{1,3})(\]?)$/i