Replace characters in URL by regex

I'm totally new to regex.
I'm using Yoast SEO - Redirects in WordPress.
How do I replace "-and-" with "-" in a URL using regex?
For example:
wwww.website.com/top-products-and-brands/product1/
To:
wwww.website.com/top-products-brands/product1/
I need to know the regex that matches -and-
and how to redirect to the new link.
Thanks a lot.

Disclaimer: I have never used Yoast SEO.
I think this will work:
Regular Expression:
(.*)(?:-and-)(.*)
New Url:
$1-$2
But honestly, I can't say for sure, because their docs on regex don't specify the syntax they use for capture groups (or whether they support them at all).
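As a rough illustration of what the pattern above does, here is a minimal Python sketch. Python's `re` module is only a stand-in: Yoast's redirect engine may use a different replacement syntax (Python writes the groups as `\1` and `\2`), and `strip_and` is a hypothetical helper name.

```python
import re

# The pattern from the answer: everything before "-and-", then everything after.
PATTERN = re.compile(r"(.*)(?:-and-)(.*)")


def strip_and(url: str) -> str:
    """Replace "-and-" with "-" (hypothetical helper, not part of Yoast)."""
    # The first (.*) is greedy, so only the last "-and-" in the URL is rewritten.
    return PATTERN.sub(r"\1-\2", url)


print(strip_and("wwww.website.com/top-products-and-brands/product1/"))
```

Note that a URL containing "-and-" more than once would need the substitution applied repeatedly, since each pass only rewrites the last occurrence.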

Related

Multiple slash in URL replacement through regex

I am trying to create a regex in PCRE that is going to sanitize URLs with multiple slashes like the following:
https://www.domin.com/test1/////test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142/////
https://www.domin.com/test1/test2///somemoretests_67142
so that I can replace it with the following: https://\2\4 and the resulting link looks like: https://www.domin.com/test1/test2/somemoretests_67142
I have been struggling with it for the past couple of days, so any regex guru help is more than welcome :)
I have tried the following and more:
(http|https):\/\/(.*)(\/\/+)(.*)
(http|https):\/\/(.*)(\/\/){2,}(.*)
(http|https):\/\/(.*)(\/\/{2})(.*)
I am going to use these in Akamai to sanitize our URLs through a cloudlet.
You can try:
(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)
And substitute the first group with an empty string.
Regex demo.
This will produce this output:
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
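The substitution can be checked with a short Python sketch. This assumes Python's `re` module behaves like PCRE for this pattern (both lookbehinds are fixed-width, which `re` requires); `collapse_slashes` is a hypothetical helper name, and Akamai's own engine is not exercised here.

```python
import re

# Pattern from the answer: leave the "//" after the scheme alone, then match
# either trailing slashes or any extra slashes that follow another slash.
EXTRA_SLASHES = re.compile(r"(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)")


def collapse_slashes(url: str) -> str:
    """Remove duplicate and trailing slashes (hypothetical helper name)."""
    return EXTRA_SLASHES.sub("", url)


urls = [
    "https://www.domin.com/test1/////test2/somemoretests_67142",
    "https://www.domin.com/test1/test2/somemoretests_67142/////",
    "https://www.domin.com/test1/test2///somemoretests_67142",
]
for url in urls:
    print(collapse_slashes(url))
```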

Regex for this URL, http://www.chip.de and this domain chip.de

I am trying to create a regex that matches a URL and a domain like the ones below:
*chip.de
http://www.chip.de*
I tried this regular expression:
http?:\/\/([\w\.-]+)([\/\w \.-]*)
It did not capture the URL.
I used https://www.regextester.com/99497 to test it, and it failed.
What am I missing?
Please create two rules for domain and URL
Thank you
If you're simply looking for a regex that will match URLs which include chip.de, then please try this and let me know if it is sufficient:
https?\:\/\/www\.chip\.de.*
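Since the question asks for two rules, here is a minimal Python sketch of how they might look. The URL rule is the pattern above; the domain rule is an assumption about what `*chip.de` is meant to cover (chip.de itself plus any subdomain), and both constant names are hypothetical.

```python
import re

# URL rule from the answer: http or https, then www.chip.de and anything after.
URL_RULE = re.compile(r"https?://www\.chip\.de.*")

# Domain rule (assumed meaning of "*chip.de"): chip.de or any subdomain of it.
DOMAIN_RULE = re.compile(r"(?:[\w-]+\.)*chip\.de")

for candidate in ["http://www.chip.de/news", "https://www.chip.de", "http://www.example.de"]:
    print(candidate, bool(URL_RULE.fullmatch(candidate)))

for candidate in ["chip.de", "www.chip.de", "microchip.de"]:
    print(candidate, bool(DOMAIN_RULE.fullmatch(candidate)))
```

Written this way, the domain rule rejects look-alikes such as microchip.de; if a literal "anything ending in chip.de" match is wanted, the rule would need to be loosened.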

How to fix regex url pattern

I need to fix my url pattern:
/^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]/
I thought this regex was OK, but it is not working for URLs like https://xx.xx (without www). 'www' should be optional ((www.)?). Where is the bug?
The problem is not in the (www\.)? part but in the parts after it.
Take a look at the [^\\\/#?] and the [^\.,;:\?\!\#\^\$ -] parts.
Each of them requires one more character, so a valid URL would be https://xx.xx plus one character that is none of \/#? plus one character that is none of .,;:?!#^$- or a space. The URL becomes valid if you add those, for example https://xx.xx11.
I advise you not to create your own regex, because you are missing a lot!
For example, tlds like .amsterdam are valid. And why are you capturing so many groups?
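The behaviour described above can be reproduced with a small Python sketch. It assumes Python's `re` module treats the pattern the same way as the asker's engine (the surrounding `/` delimiters are dropped); `URL_PATTERN` is just an illustrative name.

```python
import re

# The asker's pattern, copied unchanged apart from the removed delimiters.
URL_PATTERN = re.compile(
    r"^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])"
    r"[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]"
)

for url in ["https://xx.xx", "https://xx.xx11", "https://www.xx.xx"]:
    print(url, bool(URL_PATTERN.match(url)))
```

The first URL is rejected and the other two are accepted. In the www case the engine backtracks, treats "www" as the host and ".xx" as the TLD, and lets the trailing character classes consume the remaining ".xx", which is why the pattern only appears to need "www".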
Your regex as an image made with https://www.debuggex.com/:

How can I make this regex for a URL more specific?

I have the following regex that attempts to match URLs:
/((http|https):(([A-Za-z0-9$_.+!*(),;/?:#&~=-])|%[A-Fa-f0-9]{2}){2,}(#([a-zA-Z0-9][a-zA-Z0-9$_.+!*(),;/?:#&~=%-]*))?([A-Za-z0-9$_+!*();/?:~-]))/g
How can I modify this regex to only match URLs of a single domain?
For example, I only want to match URLs that begin with http://www.google.com?
This should simplify my regex, but I'm too much of a regex noob to get it working (after all these years...)
Did you write that RegEx? I don't know what it's trying to do, but it certainly doesn't match URLs correctly. Here's something it matches:
http:###9#?~
which I'm pretty sure isn't a valid URL.
You shouldn't be using RegEx to match URLs like this. You haven't said what language you're working in, but use whatever its equivalent of urlparse is.
Here's a relevant question: How do you validate a URL with a regular expression in Python?
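For example, in Python the single-domain check could be sketched with the standard library's `urllib.parse.urlparse` instead of a regex. The function name `is_google_url` is hypothetical, and the asker's actual language is unknown, so treat this as an illustration only.

```python
from urllib.parse import urlparse


def is_google_url(url: str) -> bool:
    """Return True for http(s) URLs whose host is exactly www.google.com."""
    parts = urlparse(url)
    # hostname is lower-cased by urlparse and is None when there is no host.
    return parts.scheme in ("http", "https") and parts.hostname == "www.google.com"


for url in [
    "http://www.google.com/search?q=regex",
    "http://www.google.com.evil.example/",
    "http:###9#?~",
]:
    print(url, is_google_url(url))
```

Comparing the parsed host, rather than checking that the string starts with http://www.google.com, also rejects look-alike hosts such as www.google.com.evil.example.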

regular expression to match all domain names except admin / www / mail

I am new to regular expressions. Given this list, I need to find the matches:
a.com
b.com
c.com
aa.com
admin.com
www.com
mail.com
vg.com
As a result, I want a regular expression that matches all domains except admin / www / mail.
I wrote this:
[a-zA-Z0-9]+.com
But how do I exclude admin, mail, and www?
I tried this:
^(www|mail|admin)[a-zA-Z0-9]+.com
But it doesn't work.
Try this:
\w+(?<!admin|mail|www)\.com
Here it is with some tests
http://www.rubular.com/r/frRl1ucR8J
Further reading on Regular Expressions: http://www.regular-expressions.info/tutorial.html
And the trick I used is called Negative LookBehind http://www.regular-expressions.info/lookaround.html
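If you want to try the same idea in Python, note that its `re` module requires lookbehinds to be fixed-width, so the alternation of different-length words has to be split into one lookbehind per word. A minimal sketch under that assumption (`ALLOWED` is an illustrative name):

```python
import re

# One negative lookbehind per excluded name, because Python's re module
# rejects (?<!admin|mail|www) as a variable-width lookbehind.
ALLOWED = re.compile(r"\w+(?<!admin)(?<!mail)(?<!www)\.com")

domains = ["a.com", "b.com", "c.com", "aa.com",
           "admin.com", "www.com", "mail.com", "vg.com"]
matches = [d for d in domains if ALLOWED.fullmatch(d)]
print(matches)
```

Be aware that this also rejects any name that merely ends in one of the words, such as email.com.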
It is not simple to exclude some things, but here is a link to help:
http://www.codinghorror.com/blog/2005/10/excluding-matches-with-regular-expressions.html
Is it possible to do a replace first? You could first do a find/replace to eliminate lines that match the things you want to skip, then use your regular expression.
You would do this to search for a string that doesn't contain admin:
^((?!admin).)*$
I'm not sure how to do it for multiple strings...
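One possible way to extend this to several strings is to list the alternatives inside the same negative lookahead. A minimal Python sketch, assuming one domain per line (`NO_RESERVED` is a hypothetical name):

```python
import re

# Same idea as ^((?!admin).)*$, with all excluded words inside the lookahead.
NO_RESERVED = re.compile(r"^((?!admin|mail|www).)*$")

lines = ["a.com", "b.com", "c.com", "aa.com",
         "admin.com", "www.com", "mail.com", "vg.com"]
kept = [line for line in lines if NO_RESERVED.match(line)]
print(kept)
```

This rejects a line if any of the words appears anywhere in it, so gmail.com would be dropped as well.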
I use this, which is somewhat similar to the answers already given.
/^[A-Za-z0-9._'%+-]+@(\[(\d{1,3}\.){3}|(?!hotmail|gmail|yahoo|live|msn|outlook|comcast|verizon)(([a-zA-Z\d-]+\.)+))([a-zA-Z]{2,4}|\d{1,3})(\]?)$/i