Regex: starts with something and ends with anything except something

I need a regex for the URL Rewrite module that validates URLs as follows:
1) spa/ - match
2) spa/some/url - match
3) spa/some-url - match
4) spa/some.js - no match
5) spa/some.css - no match
So it should match if the URL
a) starts with "spa"
b) ends with whatever except ".js" or ".css"
What I tried to test is ^(spa/)((?!.js)|(?!.css))$
but it's not working.
Thank you and sorry if it's duplicated.

Try this regex:
^spa\/((.+)\/)*.*(?<!\.js|\.css)$
with g and m flags set.
Please note that this regex allows several characters that URLs are not supposed to contain. I have tried to keep it simple, so you might want to tune it a bit before using it.

You need negative-lookbehind for this.
Try this (you may need to modify it slightly)
^spa.*(?<!(\.js|\.css))$
^spa : string beginning with spa
.* : followed by any character(s)
(?<!(\.js|\.css))$ : not ending with .js or .css
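To check the five sample URLs outside the rewrite module, here is a minimal Python sketch. Python's re module rejects a variable-width lookbehind such as (?<!\.js|\.css), so the sketch assumes two fixed-width lookbehinds instead, and also shows a negative-lookahead variant for engines without lookbehind. The names LOOKBEHIND, LOOKAHEAD, samples and results are illustrative only.

```python
import re

# Two fixed-width lookbehinds: the string must not end in ".js" or ".css".
LOOKBEHIND = re.compile(r"^spa.*(?<!\.js)(?<!\.css)$")

# Similar idea with a lookahead placed right after "spa/".
# Note: this variant also requires the slash after "spa".
LOOKAHEAD = re.compile(r"^spa/(?!.*\.(?:js|css)$).*$")

samples = ["spa/", "spa/some/url", "spa/some-url", "spa/some.js", "spa/some.css"]

# Map each sample URL to whether the lookbehind pattern accepts it.
results = {url: bool(LOOKBEHIND.match(url)) for url in samples}

for url in samples:
    print(url, results[url], bool(LOOKAHEAD.match(url)))
```

Whether the rewrite module's own regex engine accepts lookbehind is worth testing there directly; the lookahead form is the fallback if it does not.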

Related

Django url regex hit end of url: *./something

I am getting 404s on different URLs that end with the same string, and instead of creating multiple redirects I would like to catch them all on that last string. It always appears at the same position; the pattern looks like this:
/some-of-my-urls/the-same-string
No trailing slash there. I tried something like this:
url(r'^[a-zA-Z0-9_]+/the-same-string', redirect_func),
url(r'^./the-same-string', redirect_func),
But that doesn't work. This is probably obvious to somebody with more regex knowledge; I am not very advanced. Any ideas?
You may use a negated character class [^/] to match any char but / and quantify it with a + quantifier that matches 1 or more repetitions:
r'^[^/]+/the-same-string'
See the regex demo.
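A quick way to see why the two attempts fail and the negated character class works is to try them in plain Python. This is only a sketch: it assumes the path is matched without its leading slash (as Django does for URL patterns), and the variable names are illustrative.

```python
import re

# Sample path, written without the leading slash (assumption: Django
# matches URL patterns against the path with that slash stripped).
path = "some-of-my-urls/the-same-string"

# First attempt: the class has no "-", so it stops at "some".
attempt_1 = re.compile(r"^[a-zA-Z0-9_]+/the-same-string")
# Second attempt: "." matches exactly one character before the slash.
attempt_2 = re.compile(r"^./the-same-string")
# Suggested fix: one or more characters that are not "/".
suggested = re.compile(r"^[^/]+/the-same-string")

matches = [bool(p.search(path)) for p in (attempt_1, attempt_2, suggested)]
print(matches)
```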

nginx regex - match variable number of fields

I have a route with URLs that can have an optional extra field. A URL can take either of these forms:
"/my-route/azezaezaeazeaze.123x456.jpg"
"/my-route/azezaezaeazeaze.123x456.6786786786.jpg"
where:
- "azezaezaeazeaze" is a mongoId
- 123x456 is two integers separated by "x"
- 6786786786 is a unix timestamp
- jpg is an image extension (could be jpeg, png, gif...)
- all of these are separated by a "."
I would like to remove the optional part (the timestamp) from the request with the http rewrite module, so that the second URL effectively becomes like the first.
I made a small test on regex101 to get the groups, but:
- it doesn't seem to be the right syntax for nginx
- I do not see how it will allow me to remove the timestamp
How can I remove the timestamp from that url?
Starting from the right-hand end, you need to match a dot followed by anything
except a dot, so we have (\.[^.]*)$. Moving to the left, we want
to match a dot followed by only digits, \.[0-9]*, which we don't want to
capture, and to the left of that we want everything.
I ended up with something like this:
rewrite ^(.*)\.[0-9]*(\.[^.]*)$ $1$2 ;
Building on my first attempt and @meuh's answer, I ended up with the following:
rewrite ^(/.*\..*)(\..*)(\..*)$ $1$3 last;
Now it works, but I would welcome any comment regarding the style/efficiency of this rewrite.
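The effect of the first rewrite rule can be checked outside nginx with a small Python sketch. It reuses the pattern from that rule and treats \1\2 as the counterpart of $1$2; the helper name strip_timestamp is hypothetical, and nginx itself uses PCRE, so this is only an approximation of its behaviour.

```python
import re

# Same pattern as the rewrite rule: everything, then a dot plus digits
# (not captured), then the final dot and extension.
TIMESTAMP = re.compile(r"^(.*)\.[0-9]*(\.[^.]*)$")

def strip_timestamp(uri: str) -> str:
    """Drop the all-digit field before the extension, if there is one."""
    return TIMESTAMP.sub(r"\1\2", uri)

with_ts = "/my-route/azezaezaeazeaze.123x456.6786786786.jpg"
without_ts = "/my-route/azezaezaeazeaze.123x456.jpg"

print(strip_timestamp(with_ts))
print(strip_timestamp(without_ts))
```

The URL without a timestamp is left alone because "123x456" contains an "x" and so cannot be consumed by [0-9]*. Using [0-9]+ instead would additionally refuse an empty field between two dots.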

Regex for URL to sites

I have two URLs with the patterns:
1.http://localhost:9001/f/
2.http://localhost:9001/flight/
I have a site filter which redirects to the respective sites if the regex matches. I tried the following regex patterns for the 2 URLs above:
http?://localhost[^/]/f[^flight]/.*
http?://localhost[^/]/flight/.*
Both URLS are getting redirected to the first site, as both URLs are matched by the first regex.
I have also tried http?://localhost[^/]/[f]/.* for the first URL. I am unable to see what I am missing. I feel that this regex should not accept anything other than "f", but it allows "flight" as well.
Please help me by pointing out the mistake I have made.
Keep things simple:
.*/f(/[^/]*)?$
vs
.*/flight(/[^/]*)?$
Adding ? before $ makes the trailing slash, together with the optional path segment after it, optional.
The first URL is matched by the following regex:
/^http:[\/]{2}localhost:9001\/f[^light]$/
The other URL is rejected by that pattern and is matched by the following regex instead:
/^http:[\/]{2}localhost:9001\/flight\/$/
Your regex has several issues: 1) p? means an optional p (so htt:// will match), 2) [^/] will only match the : in your URLs, since it matches exactly one character (and you have a port number after it), 3) [^light] is a negated character class that means any single character that is not l, i, g, h, or t.
So, if you want to only capture localhost URLs, you'd better use this regex for the 1st site:
http://localhost[^/]*/f/.*
And this one for the second
http://localhost[^/]*/flight/.*
Please also bear in mind that, depending on where you use the regexes, your actual input may or may not include the protocol.
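The character-class point and the corrected patterns can be tried out with a short Python sketch. The two URLs are the ones from the question; the names site_f, site_flight and routing are illustrative, and the site filter's own regex engine may behave differently from Python's.

```python
import re

urls = ["http://localhost:9001/f/", "http://localhost:9001/flight/"]

# [^light] is ONE character that is not l, i, g, h or t;
# it does not mean "not followed by the word light".
class_checks = [
    bool(re.fullmatch(r"f[^light]", "fl")),  # "l" is excluded by the class
    bool(re.fullmatch(r"f[^light]", "fa")),  # "a" is allowed by the class
]

# Corrected patterns: [^/]* consumes the whole ":9001" port part.
site_f = re.compile(r"http://localhost[^/]*/f/.*")
site_flight = re.compile(r"http://localhost[^/]*/flight/.*")

# Decide which site each URL would be routed to.
routing = {
    u: "f" if site_f.match(u) else "flight" if site_flight.match(u) else None
    for u in urls
}
print(class_checks, routing)
```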
These should work for you:
http[s]{0,1}:\/\/localhost:[0-9]{4}\/f\/
http[s]{0,1}:\/\/localhost:[0-9]{4}\/flight\/
You can see it working here

substring regex for first part of url

I've got a large database of projects and issue trackers, some of which have urls.
I'd like to query it to figure out a list of urls for each project, but many have extra data I'd like to avoid.
I'd like to do something like this:
substring(tracker_extra_field_data.field_data FROM 'http://([^/]*).*')
Except some urls are https, and I'd like to capture that as well as the first sub directory.
For example, given the url:
https://dev.foo.com/bar/action/?param=val
I'd like the select to return:
https://dev.foo.com/bar/
Is there a semi-simple way to do this with substring/regex in pgsql?
try this:
select substring('https://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');
template1=# select substring('https://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');
substring
-------------------------
https://dev.foo.com/bar/
(1 row)
template1=# select substring('http://dev.foo.com/bar/action/?param=val' from '(https?://([^/]*/){1,2})');
substring
------------------------
http://dev.foo.com/bar/
Updated after I didn't read the Q properly at first.
Use the pattern
^https?://[^/]+(?:/[^/]+)?/?
^ .. start of string
? .. zero or one atoms
(?:) .. non-capturing parens
[^/]+ .. any character except /, 1 or more of them
This only accepts URLs starting with http:// or https:// (protocol header required).
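The pattern can be sanity-checked outside the database with a minimal Python sketch. The helper name base_url is hypothetical, and Python's regex flavour is not identical to PostgreSQL's, so treat this as a check of the pattern's logic rather than of the SQL itself.

```python
import re

# Scheme, host, then at most one further path segment and its slash.
FIRST_DIR = re.compile(r"^https?://[^/]+(?:/[^/]+)?/?")

def base_url(url):
    """Return scheme + host + first sub directory, or None if no match."""
    m = FIRST_DIR.match(url)
    return m.group(0) if m else None

print(base_url("https://dev.foo.com/bar/action/?param=val"))
print(base_url("http://dev.foo.com/bar/action/?param=val"))
print(base_url("ftp://dev.foo.com/bar/"))
```

The non-capturing (?:) group matters on the SQL side: when the pattern contains capturing parentheses, substring returns only the first captured group rather than the whole match.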
->SQLfiddle with a bigger test case.

Regex to match anything after /

I'm basically clueless about regex, but I need a regex that will recognise anything after the / in a URL.
Basically, I'm developing a site for someone, and a page's URL (a local URL, of course) is, say, (http://)localhost/sweettemptations/available-sweets. This page is filled with custom post types (it's a WordPress site) which have URLs of the form (http://)localhost/sweettemptations/sweets/sweet-name.
What I want to do is redirect the URL (http://)localhost/sweettemptations/sweets back to (http://)localhost/sweettemptations/available-sweets which is easy to do, but I also need to redirect any type of sweet back to (http://)localhost/sweettemptations/available-sweets. So say I need to redirect (http://)localhost/sweettemptations/sweets/* back to (http://)localhost/sweettemptations/available-sweets.
If anyone could help by telling me how to write a proper regex statement to match everything after sweets/ in the URL, it would be hugely appreciated.
To do what you ask you need to use groups. In regular expressions, groups allow you to isolate parts of the whole match.
for example:
input string of: aaaaaaaabbbbcccc
regex: a*(b*)
The parentheses mark a group; in this case it is group 1, since it is the first group in the pattern.
Note: group 0 is implicit and is the complete match.
So the matches in my above case will be:
group 0: aaaaaaaabbbb
group 1: bbbb
In order to achieve what you want with the sweets pattern above, you just need to put a group around the end.
possible solution: /sweets/(.*)
the more precise you are with the pattern before the group the less likely you will have a possible false positive.
If what you really want is to match anything after the last / you can take another approach:
possible other solution: /([^/]*)
The pattern above will find a / with a string of characters that are NOT another / and keep it in group 1. Issue here is that you could match things that do not have sweets in the URL.
Note if you do not mind the / at the beginning then just remove the ( and ) and you do not have to worry about groups.
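The group numbering described above can be reproduced with a short Python sketch. The sample URL is the one from the question; the $ anchor on the last pattern is my own addition (an assumption), since without it a leftmost search stops at the first slash rather than the last.

```python
import re

# Group 0 is the whole match, group 1 the first parenthesised part.
m = re.search(r"a*(b*)", "aaaaaaaabbbbcccc")
print(m.group(0), m.group(1))

url = "http://localhost/sweettemptations/sweets/sweet-name"

# Capture everything after "/sweets/".
after_sweets = re.search(r"/sweets/(.*)", url).group(1)

# Capture everything after the last "/" (anchored with $, see lead-in).
after_last_slash = re.search(r"/([^/]*)$", url).group(1)

print(after_sweets, after_last_slash)
```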
I like to use http://regexpal.com/ to test my regex.. It will mark in different colors the different matches.
Hope this helps.
I may have misunderstood your requirement in my original post.
If you just want to change any string that matches
(http://)localhost/sweettemptations/sweets/*
into the other one you provided (without adding the part matched by your * at the end), I would use a regular expression to match the pattern in the URL but then just blindly replace the whole string with the desired one:
(http://)localhost/sweettemptations/available-sweets
So if you want the URL:
http://localhost/sweettemptations/sweets/somethingmore.html
to turn into:
http://localhost/sweettemptations/available-sweets
and not into:
localhost/sweettemptations/available-sweets/somethingmore.html
Then the solution is simpler, no groups required :).
When doing this, I would avoid hard-coding the "localhost" part in the pattern. I am also assuming that (http://) really means an optional http:// in front, since (http://) is not a valid protocol prefix.
so if that is what you want then this should match the pattern:
(http://)?[^/]+/sweettemptations/sweets/.*
This regular expression will match the http:// part optionally with a host (be it localhost, an IP or the host name). You could omit the .* at the end if you want.
If that pattern matches just replace the whole URL with the one you want to redirect to.
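A minimal Python sketch of this "match, then replace the whole URL" approach follows. The names SWEETS, TARGET and redirect_target are hypothetical, and how the redirect is actually issued depends on where the rule lives (WordPress, web server config, and so on), which this sketch does not cover.

```python
import re

# Optional "http://", any host, then the sweets path and whatever follows.
SWEETS = re.compile(r"(http://)?[^/]+/sweettemptations/sweets/.*")
TARGET = "http://localhost/sweettemptations/available-sweets"

def redirect_target(url):
    """Return TARGET when the whole URL matches the pattern, else None."""
    return TARGET if SWEETS.fullmatch(url) else None

print(redirect_target("http://localhost/sweettemptations/sweets/somethingmore.html"))
print(redirect_target("http://localhost/sweettemptations/available-sweets"))
```

Note that the pattern as written requires the slash after "sweets", so the bare .../sweets URL is not matched and would need its own rule or an optional group such as (/.*)? at the end.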
Use this regular expression: (?<=://).+