How do I exclude a folder from a URL using Regex? - regex

I need to use Regex to check for URLs that do not contain 'folder', so the following URL should be excluded:
subdomain.domain.co.uk/section/folder/page
I'm using:
subdomain.domain.co.uk\/.*\/(?!folder\/).*
but it's still finding 'folder'. Any ideas?

Try this regex:
^subdomain\.domain\.co\.uk\/((?!folder).)*$

First off, you need slashes around "folder", otherwise you'll also exclude "/anotherfolder/" and "/folder.jpg" etc.
Put the negative lookahead before the ".*" and add ".*" inside it, before "folder":
subdomain.domain.co.uk\/(?!.*\/folder\/).*
This won't match a URL with "/folder/" anywhere in it.
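
A minimal Python sketch of the lookahead approach above, for illustration only. The pattern is the one from this answer with the literal dots escaped; the function name is_allowed is an assumption, not part of the original.

```python
import re

# Pattern from the answer above, with the literal dots escaped.
# The negative lookahead rejects a path that contains "/folder/" further along.
PATTERN = re.compile(r"subdomain\.domain\.co\.uk/(?!.*/folder/).*")


def is_allowed(url: str) -> bool:
    """Return True when the whole URL matches, i.e. no "/folder/" was seen."""
    return PATTERN.fullmatch(url) is not None


for candidate in (
    "subdomain.domain.co.uk/section/folder/page",
    "subdomain.domain.co.uk/section/anotherfolder/page",
    "subdomain.domain.co.uk/section/folder.jpg",
):
    print(candidate, is_allowed(candidate))
```

Note that the lookahead starts after the first slash, so a URL whose first path segment is "folder" (subdomain.domain.co.uk/folder/page) still matches with this pattern; if that case matters, the lookahead would need to sit before that slash.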

Related

Multiple slash in URL replacement though regex

I am trying to create a regex in PCRE that will sanitize URLs containing multiple slashes, like the following:
https://www.domin.com/test1/////test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142/////
https://www.domin.com/test1/test2///somemoretests_67142
so that I can replace them with https://\2\4 and the resulting link looks like this:
https://www.domin.com/test1/test2/somemoretests_67142
I have been struggling with it for the past couple of days, so any regex guru help is more than welcome :)
I have tried the following and more:
(http|https):\/\/(.*)(\/\/+)(.*)
(http|https):\/\/(.*)(\/\/){2,}(.*)
(http|https):\/\/(.*)(\/\/{2})(.*)
I am going to use these in Akamai to sanitize our URLs through a Cloudlet.
You can try:
(?<!https:\/)(?<!http:\/)(\/+$|(?<=\/)\/+)
Then replace each match (the first group) with an empty string.
This produces the following output:
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
https://www.domin.com/test1/test2/somemoretests_67142
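
The substitution can be tried out with a short Python sketch. It uses the pattern from the answer unchanged apart from dropping the unnecessary slash escapes; Python's re module is only a stand-in here, the behaviour of the Akamai Cloudlet engine is not verified, and the function name collapse_slashes is made up for the example.

```python
import re

# Two negative lookbehinds protect the "//" after the scheme; the group then
# matches either trailing slashes or any slashes that follow another slash.
DUPLICATE_SLASHES = re.compile(r"(?<!https:/)(?<!http:/)(/+$|(?<=/)/+)")


def collapse_slashes(url: str) -> str:
    """Remove repeated and trailing slashes while keeping the scheme's "//"."""
    return DUPLICATE_SLASHES.sub("", url)


for raw in (
    "https://www.domin.com/test1/////test2/somemoretests_67142",
    "https://www.domin.com/test1/test2/somemoretests_67142/////",
    "https://www.domin.com/test1/test2///somemoretests_67142",
):
    print(collapse_slashes(raw))
```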

How to fix regex url pattern

I need to fix my url pattern:
/^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]/
I thought this regex was ok, but it is not working for urls like: https://xx.xx (without www). 'www' should be optional ((www.)?). Where is the bug?
The problem is not in the (www\.)? part but in the parts after it.
Take a look at the [^\\\/#?] and the [^\.,;:\?\!\#\^\$ -] parts.
Each of them requires a character to be present, so a valid URL would be https://xx.xx plus one character that is none of \/#? plus one character that is none of .,;:?!#^$ - (including the space). Adding two such characters makes the URL valid, for example https://xx.xx11.
I advise you not to try to create your own regex, because you are missing a lot!
For example, tlds like .amsterdam are valid. And why are you capturing so many groups?
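
To see the effect of those two trailing character classes, here is a small Python sketch. It copies the pattern from the question without the surrounding / delimiters; it assumes Python's re treats this pattern the same way as the original engine, which seems to hold for these inputs but is not guaranteed in general.

```python
import re

# The question's pattern, split over two lines only for readability.
URL_PATTERN = re.compile(
    r"^((http(s)?(\:\/\/)){1}(www\.)?([\w\-\.\/])*(\.[a-zA-Z]{2,4}\/?)[^\\\/#?])"
    r"[^\s\b\n|]*[^\.,;:\?\!\#\^\$ -]"
)


def matches(url: str) -> bool:
    """Return True when the pattern finds a match at the start of the URL."""
    return URL_PATTERN.search(url) is not None


# The bare domain fails because two more characters are required after the TLD.
print(matches("https://xx.xx"))
print(matches("https://xx.xx11"))
```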

Regex Adding a URL path except the current one I'm at

I'm trying to add something along the lines of this regex logic.
For Input:
reading/
reading/123
reading/456
reading/789
I want the regex to match only
reading/123
reading/456
reading/789
Excluding reading/.
I've tried reading\/* but that doesn't work because it includes reading/
You must escape your backslashes in Hugo, \\/\\d+.
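
For reference, the attempt reading\/* fails because \/* means "zero or more slashes", so reading/ on its own still matches. A quick Python sketch of the \d+ idea follows; it is only an illustration outside Hugo, so the backslashes are not doubled, and the variable names are invented for the example.

```python
import re

# Require at least one digit after the slash, so "reading/" alone is rejected.
READING_WITH_ID = re.compile(r"reading/\d+")

paths = ["reading/", "reading/123", "reading/456", "reading/789"]

# fullmatch ensures the whole path has the expected shape.
matching = [p for p in paths if READING_WITH_ID.fullmatch(p)]
print(matching)
```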

Conditional Regex to match url

I am trying to make an if/then condition to match the URL, but I can't seem to get it to work. I am trying to match URLs and then capture the non-optional group. So, if a URL comes in like this:
/en/testing.aspx
I want to capture /testing.aspx
if the url comes in like this:
/testing.aspx
I want to capture /testing.aspx
Is there an easy way to do this using regex?
EDIT:
The Url can be multi-part url, like /en/sub1/sub2/testing.aspx - I essentially want everything after "/en/".
Use the regex \/en(\/.+)$
https://regex101.com/r/lwowhi/6
If there is no "/en/" in the URL and you still want to capture /testing.aspx, then here is an edited version: (?:\/en)*(\/.+)$
https://regex101.com/r/lwowhi/8
You can use a lazy prefix, which consumes as little as possible, followed by an optional /en. Then capture everything which comes after that point.
^.*?(?:\/en)?(\/.*)$
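
A minimal Python sketch of the optional-prefix pattern above, using the sample URLs from the question. The function name path_without_language is an assumption made for the example.

```python
import re
from typing import Optional

# Lazy prefix, optional "/en", then capture the rest of the path.
AFTER_EN = re.compile(r"^.*?(?:/en)?(/.*)$")


def path_without_language(url: str) -> Optional[str]:
    """Return the captured path, or None when the URL contains no slash."""
    match = AFTER_EN.match(url)
    return match.group(1) if match else None


for url in ("/en/testing.aspx", "/testing.aspx", "/en/sub1/sub2/testing.aspx"):
    print(url, "->", path_without_language(url))
```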
Assuming all pages are .aspx, use a capture group.
regex: .*(/.*\.aspx)
This will capture "/testing.aspx" in all of the samples below:
/testing.aspx or
/en/testing.aspx or
www.abc.com/en-us/testing.aspx

Excluding in Live HTTP Headers plugin for Firefox

I am trying to exclude Gmail's requests from Live HTTP Headers, but I can't
seem to get the exclude regex to work.
My exclude regex is this: .gif$|.jpg$|.ico$|.css$|.js$|.*mail.google.com.*
Any ideas/suggestions?
I had the same problem and the solution was stupidly simple:
Have you enabled the check box ("exclude URL by RegExp" or similar; I only have the German version)?
Hint: you do not need to add the .* at the start and end of your expression, because the request will be excluded if it contains the pattern (it need not match the complete URL).
I think you should use "\." to match a literal dot. A dot without a backslash matches any character.
Like this:
\.gif$|\.jpg$|\.ico$|\.css$|\.js$|.*mail\.google\.com.*
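
The difference between an escaped and an unescaped dot can be checked with a short Python sketch. This only illustrates the regex semantics; it assumes the plugin applies the pattern as a "contains" search, as the first answer suggests, and the example URLs are made up.

```python
import re

UNESCAPED = re.compile(r".js$")   # "." matches any character
ESCAPED = re.compile(r"\.js$")    # "\." matches only a literal dot

# Exclusion pattern with every dot escaped; no leading/trailing ".*" is
# needed because search() looks for the pattern anywhere in the URL.
EXCLUDE = re.compile(r"\.gif$|\.jpg$|\.ico$|\.css$|\.js$|mail\.google\.com")


def is_excluded(url: str) -> bool:
    """Return True when the URL contains something the pattern excludes."""
    return EXCLUDE.search(url) is not None


# The unescaped version also hits URLs that merely end in "js".
print(bool(UNESCAPED.search("http://example.com/nojs")))
print(bool(ESCAPED.search("http://example.com/nojs")))
print(is_excluded("https://mail.google.com/mail/u/0"))
```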