Regex: optional match in a URL

I spent a couple of hours on this with no good result (maybe my mood isn't helping).
I am trying to build a regex that matches both of these URLs:
/reservables/imagenes/4/editar/6
/reservables/imagenes/4/subir
As you can see, the last segment of the first URL (6) is missing from the second URL, because that segment is optional. I need to match both URLs with one regex, so I tried this:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]+)
That works only for the first URL. From a few notes I read about regex, it seems I need the ? symbol, right? So I tried this one, but it did not work:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]+)?
Well, I do not know what I am doing wrong.

The ? needs to cover the / as well, so wrap both in an optional non-capturing group, like so:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)(?:/([0-9]+))?
You can see that it matches correctly on debuggex.
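To check this without an online tester, here is a minimal Python sketch. The pattern is the one from this answer; the helper name parse is made up for illustration, and the use of Python's re module is an assumption, since the question does not say which regex engine is in use.

```python
import re

# Pattern from the answer: the optional non-capturing group wraps "/<number>".
PATTERN = re.compile(
    r"reservables/(editar|imagenes)/([0-9]+)/"
    r"(imagen|editar|actualizar|subir)(?:/([0-9]+))?"
)

def parse(url):
    """Return the captured groups as a tuple, or None if the URL does not match."""
    m = PATTERN.search(url)
    return m.groups() if m else None

print(parse("/reservables/imagenes/4/editar/6"))  # last group holds "6"
print(parse("/reservables/imagenes/4/subir"))     # last group is None
```

When the optional segment is absent, the fourth group comes back as None, so the calling code can tell the two URL shapes apart.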

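This one also works, but only when the second URL ends with a trailing slash (/reservables/imagenes/4/subir/), because the final / is still required: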
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]*)

Related

Match a URL with a query string

I'm pretty new to regex.
I researched a lot, but I can't figure out the problem.
I have this url
https://kompozitor.fr/thenotebar/?s=test.
The query string ?s= is the search parameter on my blog.
I'd like to write a regex expression that matches only
/thenotebar/?s=
and any parameter given to the search engine.
I tried a few things like /thenotebar/\?s=(.*), but it didn't work.
Any help is appreciated. Thank you in advance.
As I understand it, you need to match "thenotebar/?s=" followed by at least one character of parameter.
Try this regex:
/\/thenotebar\/\?s=.(.*)/g
In your regex the '/' characters are not escaped with '\', which is probably why it does not work when the pattern is written between / delimiters. The dot before (.*) requires at least one character as the parameter. If you only want to check for the "/thenotebar/?s=" string, with or without a parameter, drop that extra dot.
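For comparison, here is a small sketch of the same match in Python, where patterns are plain strings and the slashes need no escaping (only the ? does). The function name search_term is hypothetical, and Python itself is an assumption, since the question does not say where the regex will run.

```python
import re

# "\?" matches a literal question mark; "(.*)" captures whatever follows "s=".
SEARCH = re.compile(r"/thenotebar/\?s=(.*)")

def search_term(url):
    """Return the search parameter, or None if the URL is not a search URL."""
    m = SEARCH.search(url)
    return m.group(1) if m else None

print(search_term("https://kompozitor.fr/thenotebar/?s=test"))
print(search_term("https://kompozitor.fr/thenotebar/?s="))
print(search_term("https://kompozitor.fr/thenotebar/"))
```

An empty search ("?s=" with nothing after it) still matches and yields an empty string. Changing (.*) to (.+) would reject it instead.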

How to write a regex to validate a specific URL format?

My URL will look something like this :
"/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
The product name (here SMART+TV+SAMSUNG+UE55H6500SLXXH+3D) and the product id (here ger_20295028) can change. I tried writing a regex, but it is wrong.
How can I correct it for the above URL?
Regex:
.*/products/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$
The ? after each * is unnecessary (there it only makes the quantifier lazy), and your pattern also expects many more parts at the end than the example you've given. Try something like this:
.*/products/[^/]*/productDetail/[^/]*/
You should read up on quantifiers: on its own, ? means zero or one time, while directly after * it makes the match lazy. This regex might work for you:
/^.*\/products\/[^\/]+\/productDetail\/[^\/]+\/$/
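As a rough sketch of how the anchored pattern behaves, here it is in Python. The two capture groups and the name parse_product_url are additions for illustration, not part of either answer, and Python is an assumption about the environment.

```python
import re

# Same shape as the anchored regex above, with capture groups added
# (an assumption) around the product name and the product id.
PRODUCT_URL = re.compile(r"^.*/products/([^/]+)/productDetail/([^/]+)/$")

def parse_product_url(path):
    """Return {'name': ..., 'id': ...} for a valid product URL, else None."""
    m = PRODUCT_URL.match(path)
    if m is None:
        return None
    return {"name": m.group(1), "id": m.group(2)}

url = "/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
print(parse_product_url(url))
print(parse_product_url("/eshop/products/productDetail/ger_20295028/"))
```

Because [^/]+ needs at least one character, a URL with the product name missing is rejected rather than matched with an empty name.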

Get all url from text by regex

I need to get all URLs from a text file using regex. Not every URL, only those that start with a certain template. For example, I have this text:
{"Field_Name1":"http://google.ru","FieldName2":
"["some text", "http://example.com/view/...&id..&.."]",
"FieldName3": "http://example.com/edit/&id..."}someText"
["some text", "http://example.com/view/...&id..&.."]",
"FieldName3": "http://example.com/view/&id..."}someText2{..}someText.({})
I need to take all URLs like http://example.com/view/.....
I tried this regex, but it doesn't work. Maybe I have a mistake in it.
^(http|https|ftp)\://example\.com\/view\/+[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?[^\.\,\)\(\s]$
I don't need a pure URL checker; I need one that gets the URLs that start with a given template.
What about this?
((ftp|http[s]?):\/\/example.com\/view\/.*?)\"
The first part, up to "/view/", should be clear. The rest, ".*?)\"", means: capture everything up to the next double quote.
I think this will work! I tried it on regexr.com and it seemed to select just the URL part, provided the text doesn't actually have multiple periods in a row.
(?!")h.+.+[a-z]*
EDIT: Made a better one, or at least I think I did. Basically the expression says: "look for a quotation mark, and if the next character is an h, include it in the match and make it the starting point; then include any characters after that up to a single period, followed by any number of lowercase letters." There could be a million of them. As long as there was a period before them, you're good, and it won't select beyond that unless there's another period after the string.
Universal:
/(ftp|http|https)\:\/\/([\d\w\W]*?)(?=\")/igm
Template:
/(ftp|http|https)\:\/\/example\.com\/view\/([\d\w\W]*?)(?=\")/igm
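Here is a minimal Python sketch of the template approach. The sample text is a simplified stand-in for the question's input, with made-up query strings where the original has "..."; the name view_urls is hypothetical. Instead of a lookahead it uses [^"]* to stop at the next double quote, which should behave the same for this kind of input.

```python
import re

# Simplified sample; the query strings are invented for illustration.
TEXT = (
    '{"Field_Name1":"http://google.ru","FieldName2":'
    '"["some text", "http://example.com/view/a?id=1"]",'
    '"FieldName3": "http://example.com/edit/b?id=2"}someText'
    '"FieldName3": "https://example.com/view/c?id=3"}someText2'
)

# Scheme, then the fixed template, then everything up to the next double quote.
VIEW_URL = re.compile(r'(?:ftp|https?)://example\.com/view/[^"]*')

def view_urls(text):
    """Return every URL in text that starts with the /view/ template."""
    return VIEW_URL.findall(text)

for url in view_urls(TEXT):
    print(url)
```

The scheme group is non-capturing so that findall returns whole URLs rather than just the scheme.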

Perl/lighttpd regex

I'm using regex in lighttpd to rewrite URLs, but I can't write an expression that does what I want (which I thought was pretty basic, apparently not, I'm probably missing something).
Say I have this URL: /page/variable_to_pass/ OR /page/variable_to_pass/
I want to rewrite the URL to this: /page.php?var=variable_to_pass
I've already got rules like ^/login/(.*?)$ to handle specific pages, but I wanted to make one that can match any page without needing one expression per page.
I tried this: ^/([^.?]*) but it matches the whole /page/variable_to_pass/ instead of just page.
Any help is appreciated, thanks!
This regexp should do what you need:
/([^\/]+)/(.+)
The first group would be the page name, and the second the variable value.
Try:
/([^/.?]+)/([^/.?]+)/
That should give you two capture groups.
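To see how two capture groups map onto the rewritten URL, here is a small Python simulation. It is only a sketch: the pattern is one possible variant (slashes excluded from both groups, trailing slash optional), the name rewrite is made up, and the lighttpd configuration itself is not shown.

```python
import re

# Group 1: page name, group 2: variable; neither may contain "/", "." or "?".
# The trailing slash is optional, so both URL forms are accepted.
RULE = re.compile(r"^/([^/.?]+)/([^/.?]+)/?$")

def rewrite(path):
    """Rewrite /page/value[/] to /page.php?var=value; leave other paths alone."""
    return RULE.sub(r"/\1.php?var=\2", path)

print(rewrite("/page/variable_to_pass/"))
print(rewrite("/page/variable_to_pass"))
print(rewrite("/login"))
```

Excluding "/" inside the groups is what stops the first group from swallowing the whole path, which was the problem with ^/([^.?]*).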

Regex pattern to format url

I have this pattern ^(?:http://)?(?:www.)?(.*?)/?(.*?)$ but it's still not perfect.
Let's say we have these urls to test against it:
example.com
example.com/
www.example.com/
http://example.com/
example.com/param
http://example.com/params/
The final output should be example.com/ if there are no parameters and example.com/params/ if there are. My problem is that only the second group matches anything. It doesn't look like /? is working; otherwise it would stop at the slash character. Is it possible to achieve what I want with a single pattern?
So you want the host name in $1? Your regex is ambiguous: there are many ways it can match, and the engine takes the first one it finds. With the lazy (.*?), the first group is satisfied with an empty match, so everything ends up in the second group. If you don't want slashes in the first part, then say so. Explicitly:
(?:http://)?(?:www\.)?([^/]*)?/?(.*)?$
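A quick way to try this idea against the six test URLs is the Python sketch below. It uses a slightly simplified form of the pattern (anchored at the start, without the redundant ? after the groups). The name normalize and the choice to always end the result with a slash are assumptions based on the outputs the question asks for.

```python
import re

# Host is everything up to the first slash; the rest (if any) is the path.
URL = re.compile(r"^(?:http://)?(?:www\.)?([^/]*)/?(.*)$")

def normalize(url):
    """Return 'host/' or 'host/path/' with scheme and 'www.' stripped."""
    host, rest = URL.match(url).groups()
    rest = rest.strip("/")
    return host + "/" + (rest + "/" if rest else "")

for u in ["example.com", "example.com/", "www.example.com/",
          "http://example.com/", "example.com/param",
          "http://example.com/params/"]:
    print(u, "->", normalize(u))
```

Since [^/]* cannot cross a slash, the host group stops where the path begins, and the second group only receives what comes after it.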
One that I've used is:
((?:(?:https?://)?[\w\d:##%/;$()~_?\+\-=&]+|www|ftp)\.[\w\d:##%/;$()~_?\+\-=&\.]+)
The problem with URLs is that there are SO many ways one can be written, which is why the above code looks so congested. This will match all your examples above, but it will also match things like:
alkasi.jaias
Hopefully this will get you headed to where you need or want to go, and perhaps someone might be able to come up behind me and clean it up some (it's early morning, I'm getting ready for work, and am exhausted. :P)