Regular Expression for redirect - regex

How do I redirect all the following URLs to "/" using single regular expression?
members/kaleem/
members/kaleem/activity/just-me/
members/kaleem/activity/
members/kaleem/activity/favorites/
members/kaleem/activity/groups/
members/kaleem/friends/
I am using it with a WordPress redirect plugin.

I'm not sure how the WordPress redirect plugin works, but this regular expression will match all of the above, as well as any other pages under members/kaleem.
members/kaleem[\w\-\/]*
It grabs word characters, dashes, and slashes that appear after members/kaleem. If there are certain pages after members/kaleem that shouldn't be matched, it gets more complicated. I was assuming that the examples you showed were part of a pattern.
If you want to only match kaleem/activity and kaleem/friends, plus any pages that are children of them, you can use this:
members/kaleem/((activity|friends)[\w\/\-]*)?
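As a rough check of the two patterns above, here is a minimal Python sketch. It assumes the redirect plugin requires the whole path to match (the plugin's actual matching mode is not stated here), and members/kaleem/settings/ is a hypothetical page added only to show where the two patterns differ.

```python
import re

# Patterns from the answer above; the first is written with a single
# opening bracket in its character class.
BROAD = re.compile(r"members/kaleem[\w\-\/]*")
NARROW = re.compile(r"members/kaleem/((activity|friends)[\w\/\-]*)?")

urls = [
    "members/kaleem/",
    "members/kaleem/activity/just-me/",
    "members/kaleem/activity/",
    "members/kaleem/activity/favorites/",
    "members/kaleem/activity/groups/",
    "members/kaleem/friends/",
]


def matches(pattern, url):
    """Return True if the pattern covers the entire path (assumed mode)."""
    return pattern.fullmatch(url) is not None


for url in urls:
    print(url, matches(BROAD, url), matches(NARROW, url))

# Hypothetical extra page: only the broad pattern covers it in full.
print(matches(BROAD, "members/kaleem/settings/"))
print(matches(NARROW, "members/kaleem/settings/"))
```

If the plugin searches for the pattern anywhere in the path instead of matching it in full, the narrow pattern would still hit the members/kaleem/ prefix of other pages, so an end anchor ($) may be needed there.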

It seems members/ is the common identifier. Correct? If so, you just have to match that: ^members/. Otherwise it becomes a bit more complicated: ^members/kaleem/(?:friends|activity/(?:(?:just-me|favorites|groups)/)?). See: http://regex101.com/r/jJ4rM8

Related

301 redirection, matching urls through regex. Matching dashes

I'm trying to match URLs for a migration, but I can't come up with a regex that matches them.
I've tried different expressions and used regex checkers to determine where exactly it's broken, but it's not clear to me.
This is my regex
https:\/\/blog\.xyz\.ca\/EN\/post\/201[0-9]\/[0-9][0-9]\/[0-9][0-9]\/*\).aspx
I'm trying to match these kinds of urls (hundreds)
https://blog.xyz.ca/EN/post/2019/05/14/how-test-higher-education-test-can-test-more-test-students-and-test-sdf-the-test.aspx
https://blog.xyz.ca/EN/post/2019/05/14/how-test.aspx
https://blog.xyz.ca/EN/post/2019/05/14/how-test-higher-the-test.aspx
And remap them to something like this
https://blog.xyz.ca/2017/12/21/test-how-the-testaspx
I thought that I could match the dash section using the wildcard, but it seems to not be working and none of the generators are giving me a clear warning. I've tried https://regexr.com/ and https://www.regextester.com/
If I understand the problem correctly, a simple expression that captures the URL components we need should be enough to build the redirect rules from. We can likely start with:
(.+\.ca)\/EN\/post(\/[0-9]{4}\/[0-9]{2}\/[0-9]{2})(\/.+)\.aspx
If necessary, we can tighten or loosen the constraints; I'm guessing that no further validation is required.
or:
(.+\.ca)\/EN\/post(\/[0-9]{4}\/[0-9]{2}\/[0-9]{2})(\/.+)(\.aspx)
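To illustrate how the capture groups could feed a redirect, here is a minimal Python sketch using the first expression. The target layout (host, then date, then slug, with /EN/post and .aspx dropped) is an assumption, since the question's target example is ambiguous about the extension; remap is a hypothetical helper name.

```python
import re

# First expression from the answer: host, date path, and slug are captured.
PATTERN = re.compile(
    r"(.+\.ca)\/EN\/post(\/[0-9]{4}\/[0-9]{2}\/[0-9]{2})(\/.+)\.aspx"
)


def remap(url):
    """Rebuild the URL from the three captured parts (assumed target layout)."""
    return PATTERN.sub(r"\1\2\3", url)


print(remap("https://blog.xyz.ca/EN/post/2019/05/14/how-test.aspx"))
print(remap("https://blog.xyz.ca/EN/post/2019/05/14/how-test-higher-the-test.aspx"))
```

URLs that do not fit the pattern are returned unchanged, which makes it easy to spot the ones that still need a rule.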

Regex for both website url versions with wildcard

I'm trying to add allowed URLs to a WatchGuard Firebox WebBlocker list using regular expressions. I'm trying to keep my list short by having one entry apply to both the www and non-www versions of a site, including subdomains. I'm currently using the following:
(www\.)?ups\.com/*
This works great for both versions plus subdomains, but it also lets through other sites whose domain ends in ups.com, such as jobs-ups.com.
How can I make the regular expression require that, when there is no subdomain, the URL is exactly ups.com with no other letters before the u, so that it blocks sites like jobs-ups.com?
You can use the caret ^ to accomplish this
^(?:www\.)?ups\.com\/
The caret forces the check at the start of the string. This means it will not match in mid-string, which is what you want.
I'm not familiar with Firebox at all, but generally you should escape your periods and forward slashes, and you would usually use a non-capturing group as well. But if this is a simple regex dialect, you can still preserve your original formatting:
^(www.)?ups.com/*
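The effect of the caret can be checked with a small Python sketch. Python's re.search stands in for the Firebox matcher here, which is an assumption, as is the hypothetical subdomain tracking.ups.com. The third pattern is not from the answer; it is one possible variant that also allows dot-separated subdomains, which the question asks for.

```python
import re

UNANCHORED = re.compile(r"(www\.)?ups\.com/*")      # pattern from the question
ANCHORED = re.compile(r"^(?:www\.)?ups\.com\/")     # pattern from the answer
# Hypothetical variant (not from the answer): any number of subdomain labels.
WITH_SUBDOMAINS = re.compile(r"^(?:[\w-]+\.)*ups\.com\/")


def allowed(pattern, url):
    """Return True if the pattern is found in the URL (search semantics)."""
    return pattern.search(url) is not None


for url in ["ups.com/", "www.ups.com/", "jobs-ups.com/", "tracking.ups.com/"]:
    print(url, allowed(UNANCHORED, url), allowed(ANCHORED, url),
          allowed(WITH_SUBDOMAINS, url))
```

Note that the anchored pattern from the answer rejects jobs-ups.com/ but also rejects real subdomains, because only an optional www. is permitted before ups.com.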

JMeter Proxy exclusion patterns still being recorded

I am using JMeter to record traffic in my browser. In my URL Patterns to Exclude are:
.*\.jpg,
.*\.js,
.*\.png
These look like they should block these patterns (I've even tested them with a regex tester).
Yet, I still see plenty of these files get pulled up. In a related forum someone had a similar issue, but his was caused by having additional url parameters afterwards (eg www.website.com/image.jpg?asdf=thisdoesntmatch). However this doesn't seem to be the case here. Can anyone point me in the right direction?
As already mentioned in the question comments, it is probably a problem with the trailing characters. The pattern matcher is executed against the complete URL, including parameters.
So a URL such as http://example.com/layout.css?id=123 is not matched by the pattern .*\.css. The JMeter HTTP Request sampler separates the path and the parameters, so this might not be obvious when you look at the URL.
Solution: Change the pattern to allow trailing characters: .*\.css.*
Explained
.* Any sequence of characters (possibly empty)
\. A literal . (dot) character
css The character sequence css
.* Any sequence of characters (possibly empty)
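The difference can be reproduced outside JMeter with a short Python sketch. JMeter uses its own regex engine, so re.fullmatch is only a stand-in for the "pattern must cover the complete URL" behaviour described above; excluded is a hypothetical helper name.

```python
import re


def excluded(pattern, url):
    """Return True if the pattern matches the complete URL, query included."""
    return re.fullmatch(pattern, url) is not None


url = "http://example.com/layout.css?id=123"

print(excluded(r".*\.css", url))     # trailing ?id=123 is not covered
print(excluded(r".*\.css.*", url))   # trailing characters are allowed
```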
Maybe you can do the opposite: leave the URL Patterns to Exclude blank and negate those patterns in the URL Patterns to Include box:
(?!.*\.(bmp|css|js|gif|ico|jpe?g|png|swf|woff))(.*)
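A minimal Python sketch of the include-by-negation idea, reading the pattern as a negative lookahead of the form written out in the code (an assumption about the intended expression). As before, re.fullmatch stands in for JMeter's matcher, and included is a hypothetical helper name.

```python
import re

# Assumed form of the include pattern: reject any URL that contains a dot
# followed by one of the listed extensions, accept everything else.
INCLUDE = re.compile(r"(?!.*\.(bmp|css|js|gif|ico|jpe?g|png|swf|woff))(.*)")


def included(url):
    """Return True if the URL would pass the include filter."""
    return INCLUDE.fullmatch(url) is not None


print(included("http://example.com/page.html"))
print(included("http://example.com/layout.css?id=123"))
print(included("http://example.com/logo.png"))
```

Because the lookahead only checks for a dot followed by the extension, it would also drop URLs such as page.jsp (which contains .js), so the list may need tightening for some sites.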

Regex for simple urls

I am looking for a regex for simple URLs such as
http://www.google.com
http://www.yahoo.in
http://www.example.eu
http://www.example.net
etc.
No subdirectories are allowed. For example, it must not validate http://www.google.com/ or http://www.yahoo.in/mail.
Does anyone know any regex to do this?
I'm still a noob, but try this:
^http:\/\/[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+$
This one should do:
^(https?:\/\/)?[0-9a-zA-Z]+\.[-_0-9a-zA-Z]+\.[0-9a-zA-Z]+$
This should work for URLs starting with http:// or https:// or without the protocol name.
The regex should also be used case-insensitively. In that case, it can be shortened a bit:
^(https?:\/\/)?[0-9a-z]+\.[-_0-9a-z]+\.[0-9a-z]+$
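The shortened pattern can be tried against the question's examples with a small Python sketch; re.IGNORECASE supplies the case-insensitivity mentioned above, and is_simple_url is a hypothetical helper name.

```python
import re

SIMPLE_URL = re.compile(
    r"^(https?:\/\/)?[0-9a-z]+\.[-_0-9a-z]+\.[0-9a-z]+$",
    re.IGNORECASE,
)


def is_simple_url(text):
    """Return True for three dot-separated labels with no path after them."""
    return SIMPLE_URL.search(text) is not None


for candidate in [
    "http://www.google.com",
    "http://www.yahoo.in",
    "http://www.example.eu",
    "http://www.example.net",
    "http://www.google.com/",    # trailing slash: should be rejected
    "http://www.yahoo.in/mail",  # subdirectory: should be rejected
]:
    print(candidate, is_simple_url(candidate))
```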
If you don't care whether it is a valid url, you can use:
\S*www\.\S+
All the examples contain www. followed by non-space characters, a sequence that is unlikely to occur in a normal word.

Regex with URLs - syntax

We're using a proprietary tracking system that requires the use of regular expressions to load third party scripts on the URLs we specify.
I wanted to check the syntax of the regex we're using to see if it looks right.
To match the following URL
/products/18/indoor-posters
We are using this rule:
.*\/products\/18\/indoor-posters.*
Does this look right? Also, if there was a query parameter on the URL, would it still work? e.g.
/products/18/indoor-posters?someParam=someValue
There's another URL to match:
/products
The rule for this is:
.*\/products
Would this match correctly?
Well, "right" is a relative term. Usually, .* is not a good idea because it matches anything, even nothing. So while these regexes will all match your example strings, they'll also match much more. The question is: What are you using the regexes for?
If you only want to check whether those substrings are present anywhere in the string, then they are fine (but then you don't need regex anyway, just check for substrings).
If you want to somehow check whether it's a valid URL, then no, the regexes are not fine because they'd also match foo-bar!$%(§$§$/products/18/indoor-postersssssss)(/$%/§($/.
If you can be sure that you'll always get a correct URL as your input and just want to check whether it matches your pattern, then I'd suggest
^.*\/products$
to match any URL that ends in /products, and
^.*\/products\/18\/indoor-posters(?:\?[\w-]+=[\w-]+)?$
to match a URL that ends in /products/18/indoor-posters with an optional ?name=value bit at the end, assuming only word characters (letters, digits, underscores) and hyphens are legal for name and value.
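The suggested patterns can be compared with the original rule in a short Python sketch. The tracking system's own regex engine is unknown, so Python's re.search is used as a stand-in, and found is a hypothetical helper name.

```python
import re

LOOSE = re.compile(r".*\/products\/18\/indoor-posters.*")  # original rule
POSTERS = re.compile(
    r"^.*\/products\/18\/indoor-posters(?:\?[\w-]+=[\w-]+)?$"
)
ENDS_IN_PRODUCTS = re.compile(r"^.*\/products$")


def found(pattern, url):
    """Return True if the pattern matches somewhere in the URL."""
    return pattern.search(url) is not None


print(found(POSTERS, "/products/18/indoor-posters"))
print(found(POSTERS, "/products/18/indoor-posters?someParam=someValue"))
print(found(POSTERS, "/products/18/indoor-postersssssss"))
print(found(LOOSE, "/products/18/indoor-postersssssss"))
print(found(ENDS_IN_PRODUCTS, "/products"))
print(found(ENDS_IN_PRODUCTS, "/products/18/indoor-posters"))
```

The anchored version accepts a single name=value pair only; URLs with several parameters joined by & would need the optional group extended.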