Regex OR without grouping

I want to extract the thread ID from a URL via regex.
The URL can take one of these forms:
https://mypage.com/threads/an-example-thread/
https://mypage.com/threads/an-example-thread/page-1
https://mypage.com/threads/an-example-thread
My pattern .+/threads/(.+)/.+ covers the first two options. Now I need a pattern that also covers option 3. .+/threads/(.+)(/.+|$) works, but I use the first group to get the thread ID/name. So how is it possible to create an or-pattern without grouping?

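As mentioned in the comments, try /threads/([^\/]*), which matches all three forms and captures the thread ID in group 1. If you do need alternation without creating an extra capture group, wrap it in a non-capturing group, e.g. (?:/.+|$).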
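A minimal Python sketch of the suggested pattern, run against the three URL forms from the question. The helper name thread_id is hypothetical, and the forward slash is left unescaped because Python's re module does not require escaping it.

```python
import re

# Pattern from the answer above: capture everything after /threads/
# up to (but not including) the next slash.
THREAD_RE = re.compile(r"/threads/([^/]*)")


def thread_id(url):
    """Return the thread slug following /threads/, or None if absent."""
    match = THREAD_RE.search(url)
    return match.group(1) if match else None


urls = [
    "https://mypage.com/threads/an-example-thread/",
    "https://mypage.com/threads/an-example-thread/page-1",
    "https://mypage.com/threads/an-example-thread",
]

for url in urls:
    print(thread_id(url))  # an-example-thread (for all three)
```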

Related

regexing url parameters IIS

I have reviewed a couple of questions on regexing URL parameters, but none of them seems to address my issue specifically. I have been trying to work out the correct regex pattern on www.regex101.com without success. I have a URL whose parameters are separated by slashes. I am able to match one parameter at a time, but ideally I would like a pattern that extracts all of the parameters. So far this is what I have:
\/([a-zA-z]+)\/([a-zA-z]+)\/([a-zA-z-]+)\/
The url that I am trying to modify is:
www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/
The above pattern works for this example, but I need it to also work for these two examples:
www.mydomain.com/firstparameter/secondparameter/
www.mydomain.com/firstparameter/
Is it even possible to write one singular regex that can extract the parameters from each example above?
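Try this regex: \/([a-zA-Z]+)\/(?:(?:([a-zA-Z]+)\/)?([a-zA-Z-]+)\/)?
Note that the class [a-zA-z] used in the question should be [a-zA-Z]; the range A-z also matches [, \, ], ^, _ and `.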
Details:
? Quantifier — Matches between zero and one times, as many times as possible
The assumption here is that, there is at least one parameter and max 3 parameters.
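As a rough illustration in Python (the helper name extract_params is hypothetical), the optional groups can be tried on the three URLs from the question. With fewer than three parameters, the inner optional group is the one left unmatched, so the groups that did not participate are filtered out here.

```python
import re

# One required parameter followed by up to two optional ones.
# [a-zA-Z] is used instead of [a-zA-z], which would also match
# punctuation such as "[" and "_".
PARAMS_RE = re.compile(r"/([a-zA-Z]+)/(?:(?:([a-zA-Z]+)/)?([a-zA-Z-]+)/)?")


def extract_params(url):
    """Return the 1 to 3 path parameters found in url, in order."""
    match = PARAMS_RE.search(url)
    if match is None:
        return []
    # Groups that did not participate in the match are None.
    return [group for group in match.groups() if group is not None]


print(extract_params("www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/"))
print(extract_params("www.mydomain.com/firstparameter/secondparameter/"))
print(extract_params("www.mydomain.com/firstparameter/"))
```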
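This should work for any number of parameters when applied repeatedly (for example with a global or find-all match). The | has been dropped from the character class, where it would match a literal pipe rather than act as alternation:
\/([\w-]+)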
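A short Python sketch of the find-all approach, assuming the URL is given without a scheme as in the question (with a leading https:// the host name would be picked up as the first "parameter"). The function name all_params is hypothetical.

```python
import re

# Each match is one slash followed by a run of word characters or hyphens.
SEGMENT_RE = re.compile(r"/([\w-]+)")


def all_params(url):
    """Return every slash-separated parameter in a scheme-less URL."""
    return SEGMENT_RE.findall(url)


print(all_params("www.mydomain.com/firstparameter/secondparameter/hyphenated-url-parameter/"))
# ['firstparameter', 'secondparameter', 'hyphenated-url-parameter']
print(all_params("www.mydomain.com/firstparameter/"))
# ['firstparameter']
```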

Negate some string pattern

In Jenkins with Git Parameter plugin (which helps me filter out tags)
I have the pattern *-rc, which simply displays all tags with -rc as a suffix. But how do I negate this pattern? I already tried (?!-rc).*$ but it is not working.
EDIT 1
I have tags named:
3.11.2-rc
3.11.1-rc
3.11.0
3.10.0
and so forth.
With the pattern *-rc I can display the tags that end in '-rc'.
What I want to achieve now is to display all tags without '-rc'.
EDIT 2
As seen in this blog post, you could use the Extensible Choice plugin and:
add 'Extensible Choice' as a second parameter (as seen here)
write a groovy script which does the filtering for you
You can see an example here for branches, that you can adapt for tags.
You need to first turn on the regex option - see this answer for how to do that.
Use this negative look behind based regex:
.*(?<!-rc)$
See live demo of this regex with your sample data.
The look behind, which is anchored to end of input via $, requires that the preceding chars not be "-rc".
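The look-behind can be checked outside Jenkins with a few lines of Python; this is only a sketch against the sample tags from the question, not the plugin's own filtering code.

```python
import re

# Matches a whole tag only if its last three characters are not "-rc".
NOT_RC = re.compile(r".*(?<!-rc)$")

tags = ["3.11.2-rc", "3.11.1-rc", "3.11.0", "3.10.0"]

release_tags = [tag for tag in tags if NOT_RC.match(tag)]
print(release_tags)  # ['3.11.0', '3.10.0']
```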

Regex Redirect Destination

I need to redirect multiple pages based on a single section of the URL, so these pages:
site/type1
site/type1/page1
site/type1/page2
site/type1/page3
need to redirect to:
site/type2
site/type2/page1
site/type2/page2
site/type2/page3
In reality there are more than 4 pages involved.
I think I can use the trigger:
(.*)/type2(.*)
but I'm stumped on how to set the destination so it dynamically populates the rest of the URL with what was there on the previous page.
Any ideas welcome.
Thanks.
You can simply look for site\/type1 and replace it with site\/type2.
I tested it with vim and it worked fine for me: :%s/site\/type1/site\/type2/g
Hope it helps.
Your language probably supports replacing with regex.
This is the regex you should use:
(.*\/)type1(\/.*|$)
This is a bit different from the one you proposed. Your regex will match type2 instead of type1, but you want to redirect from type1 to type2, right? Another thing I changed is group 2. If you don't want things like site/type1b to turn into site/type2b, then you should write the second group as \/.*|$.
For the replacement, you should use $1type2$2, which means "the stuff captured in group 1, followed by type2, followed by the stuff captured in group 2".
Here is an online demo of the above regex and replacement. You can use the "code generator" feature to generate the code that performs this replacement in your language, if you are not sure how to replace with regex.
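For illustration, the same replacement in Python might look like the sketch below. Python's re.sub writes backreferences as \1 and \2 rather than $1 and $2, and the function name redirect_target is hypothetical.

```python
import re

# Group 1: everything up to and including the slash before "type1".
# Group 2: either a slash plus the rest of the path, or nothing at end of string.
REDIRECT_RE = re.compile(r"(.*/)type1(/.*|$)")


def redirect_target(path):
    """Swap the type1 path segment for type2, leaving other paths unchanged."""
    return REDIRECT_RE.sub(r"\1type2\2", path)


print(redirect_target("site/type1"))        # site/type2
print(redirect_target("site/type1/page1"))  # site/type2/page1
print(redirect_target("site/type1b"))       # site/type1b (not a whole segment)
```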

Negating a regex query

I have looked at multiple posts about this, and am still having issues.
I am attempting to write a regex query that finds the names of S3 buckets that do not follow the naming scheme we want. The scheme we want is as follows:
test-bucket-logs**-us-east-1**
The bolded part (-us-east-1, marked with **) is optional. Meaning, the following two are valid bucket names:
test-bucket-logs
test-bucket-logs-us-east-1
Now, what I want to do is negate this. So I want to catch all buckets that do not follow the scheme above. I have successfully formed a query that will match for the naming scheme, but am having issues forming one that negates it. The regex is below:
^(.*-bucket-logs)(-[a-z]{2}-[a-z]{4,}-\d)?$
So some more valid bucket names:
example-bucket-logs-ap-northeast-1
something-bucket-logs-eu-central-1
Invalid bucket names (we want to match these):
Iscrewedthepooch
test-bucket-logs-us-ee
bucket-logs-us-east-1
Thank you for the help.
As Barmar said, probably the best approach in these circumstances is to solve it programmatically: write the usual regex that matches the correct pattern, and exclude the names that match it from the collection.
But you can try this:
^(?:.(?!-bucket-logs-[a-z]{2}-[a-z]{4,}-\d|-bucket-logs$))*$
This is a typical solution using a negative lookahead (?!...), a zero-width assertion that consumes no characters and captures nothing. The pattern matches a line only if no character in it is followed by -bucket-logs at the end of the line, or by -bucket-logs plus a region suffix.
EDITED
As Ibrahim pointed out (thank you!), there was a small issue with my first regex. I have fixed it and I think it is OK now. I had forgotten to make the last part of the inner regex optional (?).
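Both approaches can be compared on the sample names from the question. The sketch below is in Python (an assumption, since the question does not name a language); the list names is just the question's examples gathered together.

```python
import re

# The "positive" pattern from the question: names that follow the scheme.
VALID_RE = re.compile(r"^(.*-bucket-logs)(-[a-z]{2}-[a-z]{4,}-\d)?$")

# The negative-lookahead pattern from this answer: names that do not.
INVALID_RE = re.compile(r"^(?:.(?!-bucket-logs-[a-z]{2}-[a-z]{4,}-\d|-bucket-logs$))*$")

names = [
    "test-bucket-logs",
    "test-bucket-logs-us-east-1",
    "example-bucket-logs-ap-northeast-1",
    "something-bucket-logs-eu-central-1",
    "Iscrewedthepooch",
    "test-bucket-logs-us-ee",
    "bucket-logs-us-east-1",
]

# Programmatic approach: match the valid scheme, keep everything else.
invalid_by_exclusion = [n for n in names if not VALID_RE.match(n)]

# Regex-only approach: match the invalid names directly.
invalid_by_lookahead = [n for n in names if INVALID_RE.match(n)]

print(invalid_by_exclusion)
print(invalid_by_lookahead)
```

On these samples the two lists agree. The exclusion approach is the easier one to keep in sync if the naming scheme changes, because only the positive pattern has to be edited.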

Regex for youtube URL

I am using the following regex to validate YouTube video share URLs.
var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;
I want the regex to support the following URL formats:
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrX1w5luM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
youtu.be/cCnrX1w5luM
I tried different regexes but could not find one that suits share links. Can anyone help me solve this?
Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:
^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]+)(\S+)?$
Works with the following URLs:
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
The captured groups are:
protocol
subdomain
domain
path
video code
query string
https://regex101.com/r/vHEc61/1
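A hedged Python sketch of how the six groups could be read out. It uses the pattern with the -nocookie part made non-capturing and the dot in youtu\.be escaped, so that the group numbers line up with the list above. The function name parse_youtube_url and the dictionary keys are hypothetical.

```python
import re

YOUTUBE_RE = re.compile(
    r"^((?:https?:)?//)?"                            # 1: protocol
    r"((?:www|m)\.)?"                                # 2: subdomain
    r"((?:youtube(?:-nocookie)?\.com|youtu\.be))"    # 3: domain
    r"(/(?:[\w\-]+\?v=|embed/|v/)?)"                 # 4: path
    r"([\w\-]+)"                                     # 5: video code
    r"(\S+)?$"                                       # 6: query string
)

FIELDS = ("protocol", "subdomain", "domain", "path", "video_code", "query")


def parse_youtube_url(url):
    """Return a dict of the six captured parts, or None if url does not match."""
    match = YOUTUBE_RE.match(url)
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


parts = parse_youtube_url("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured")
print(parts["video_code"])  # DFYRQ_zQ-gk
print(parts["query"])       # &feature=featured
```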
You're missing www in your regex.
The second \. should be optional if you want to match both youtu.be and youtube (but I didn't change this, since plain youtube isn't actually a valid domain; see the note below).
+ in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wild-cards.
You need to use a . to indicate a wild-card, and + to indicate you want one or more of them.
Try:
^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$
If you want it to match URLs with or without the www., just make it optional:
^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$
Invalid alternatives:
If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:
^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$
youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly mentions that the regex should support that. To include this, replace youtu\.be with youtu\.?be in any regex above. Live demo.
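As a quick check, the variant with the optional www. and youtu\.?be can be run against the five formats listed in the question. This is a Python sketch; the function name is_share_url is hypothetical.

```python
import re

# Optional scheme, then either (www.)youtube.com or youtu.be / youtube,
# then a slash and at least one more character.
SHARE_RE = re.compile(r"^(https?://)?((www\.)?youtube\.com|youtu\.?be)/.+$")


def is_share_url(url):
    """Return True if url looks like one of the share formats from the question."""
    return SHARE_RE.match(url) is not None


for url in [
    "http://youtu.be/cCnrX1w5luM",
    "http://youtube/cCnrX1w5luM",
    "www.youtube.com/cCnrX1w5luM",
    "youtube/cCnrX1w5luM",
    "youtu.be/cCnrX1w5luM",
]:
    print(url, is_share_url(url))  # True for all five
```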
I know I'm about 2 years late to the party, but I needed to write something up anyway, and this seems to fit every test case I can throw at it. You should be able to reference the first capture group ($1) to get the ID. It matches http and https, www and non-www, youtube.com and youtu.be, /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it still matches when there are other variables in the URL string (?t= for time, ?list= for playlists, etc.).
(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
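A small Python sketch of pulling the ID out of the first capture group (written group(1) in Python rather than $1). The function name video_id is hypothetical. One caveat that follows from the greedy .* before v=: if the query string contains a later parameter whose name also ends in v=, that later value is the one captured.

```python
import re

ID_RE = re.compile(
    r"(?:https?://)?"
    r"(?:youtu\.be/|(?:www\.|m\.)?youtube\.com/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|/))"
    r"([a-zA-Z0-9_-]+)"
)


def video_id(url):
    """Return the video ID captured by group 1, or None if url does not match."""
    match = ID_RE.search(url)
    return match.group(1) if match else None


print(video_id("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured"))  # DFYRQ_zQ-gk
print(video_id("https://youtu.be/DFYRQ_zQ-gk?t=120"))                            # DFYRQ_zQ-gk
print(video_id("https://www.youtube.com/embed/DFYRQ_zQ-gk"))                     # DFYRQ_zQ-gk
```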
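The format of YouTube video URLs has changed. This regex covers the watch?v= and youtu.be forms; the dot in youtu\.be is escaped and the + sits inside the last group, so that the group captures the whole ID rather than only its final character:
^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu\.be\/))([a-zA-Z0-9\-_]+)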
Based on many of the other regexes here, this is the best I have come up with (with the dot in youtu\.be escaped):
((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu\.be\/))[\S]+
Test:
http://regexr.com/3bga2
Try this:
((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+
http://regexr.com?36o7a
I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid url.
^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$
I tried this one and it works fine for me.
(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)
You can check here https://regex101.com/r/Kvk0nB/1
https://regexr.com/62kgd
^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$
https://www.youtube.com/watch?v=YPz9zqakRbk
https://www.youtube.com/watch?v=YPz9zqakRbk&t=11
http://youtu.be/cCnrX1w5luM&y=12
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrXswsluM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
Check this pattern instead:
r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'