How to write a regex to validate a specific URL format? - regex

My URL will look something like this:
"/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
The product name (here SMART+TV+SAMSUNG+UE55H6500SLXXH+3D) and the product id (here ger_20295028) can change. I tried writing a regex, but it is wrong.
How can I correct it for the above URL?
Regex:
.*/products/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$

The lazy quantifiers (*?) are unnecessary here, but they are not the real problem: your pattern has many more parts at the end than the example you've given, so it expects path segments that the URL does not contain. Try something like this
.*/products/[^/]*/productDetail/[^/]*/

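You should read up on quantifiers: ? on its own means zero or one time, while directly after * it only makes the * lazy, which you do not need here. Your pattern also requires more path segments than the URL has. This regex might work for you: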
/^.*\/products\/[^\/]+\/productDetail\/[^\/]+\/$/
Try it online here.
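For anyone who wants to check the corrected pattern outside an online tester, here is a minimal sketch in Python. The language and the helper name is_product_url are assumptions made for illustration; the question does not say which regex engine is in use. The pattern is the one from the answer above with the / delimiters dropped, and fullmatch plays the role of the ^ and $ anchors.

```python
import re

# Pattern from the answer above, written without /.../ delimiters.
PRODUCT_URL = re.compile(r".*/products/[^/]+/productDetail/[^/]+/")

def is_product_url(path):
    """Return True if the whole path has the product-detail shape."""
    # fullmatch requires the pattern to cover the entire string.
    return PRODUCT_URL.fullmatch(path) is not None

good = "/eshop/products/SMART+TV+SAMSUNG+UE55H6500SLXXH+3D/productDetail/ger_20295028/"
bad = "/eshop/products//productDetail/ger_20295028/"  # empty product name

print(is_product_url(good))
print(is_product_url(bad))
```

Using [^/]+ rather than [^/]* means an empty product name or id is rejected.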

Related

please regex url request

I would like to know if anybody can help me with a regular expression problem. I want to write a regular expression to catch URLs similar to this URL:
www.justin.tv/channel_name_here
I have tried:
/justin\.tv\/(.*)
The problem I get is that when this channel goes live, sometimes the URL transforms to something like this:
www.justin.tv/channel_name_here#/w/45365675688
I can't catch this. :( Can anybody please help me with this? I just want to catch the channel name without the pound symbol and the rest of the URL.
Here are some example URLs:
www.justin.tv/winning_movies#/w/6347562128
http://www.justin.tv/cine_accion_hd16#/w/6347562128/18
http://www.justin.tv/fox_movies_hd1/
I would want to get:
winning_movies
cine_accion_hd16
fox_movies_hd1
Thanks in advance! :)
Short answer:
(?<=justin\.tv\/)([^#\/]+)
Long answer:
Let's split this up into parts. Look at the back part first.
([^#\/]+)
This matches a run of one or more characters that are neither '#' nor '/', so the match stops at the first '#' or '/'.
Now let's look at the first part.
(?<=justin\.tv\/)
The syntax "(?<=" followed by ")" is called positive lookbehind (this page has good examples and explanation of the different types of lookaround). Using a simple example:
(?<=A)B
The above example says "I want all 'B' that are immediately after an 'A'." Going to our big example, we're saying we want all parts (separated by '#' or '/') that are immediately after a part called "justin.tv/".
Look here for an example of the expression in action.
~justin\.tv/([^#/]+)~
If you want everything up to a certain character (or set of characters), use a negated class.
Also, when working on regexes for URLs, using / as the delimiter is error-prone, as you have to escape all the /'s. Use a delimiter that does not occur in the pattern instead (like ~ in this case; # would not work here because the character class contains a #).
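To compare the two approaches side by side, here is a small sketch in Python (an assumption, since the question does not name a language; the helper channel_name is made up for illustration). It runs the lookbehind pattern and the capture-group pattern against the example URLs from the question. Python patterns have no delimiters, so no escaping of / is needed.

```python
import re

# Lookbehind version: the match itself is the channel name.
LOOKBEHIND = re.compile(r"(?<=justin\.tv/)[^#/]+")
# Capture-group version: the channel name is group 1.
CAPTURE = re.compile(r"justin\.tv/([^#/]+)")

urls = [
    "www.justin.tv/winning_movies#/w/6347562128",
    "http://www.justin.tv/cine_accion_hd16#/w/6347562128/18",
    "http://www.justin.tv/fox_movies_hd1/",
]

def channel_name(url):
    """Return the channel name, or None if the URL does not match."""
    m = CAPTURE.search(url)
    return m.group(1) if m else None

for url in urls:
    # Both patterns should agree on every example URL.
    print(channel_name(url), LOOKBEHIND.search(url).group(0))
```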

Regex, optional match in url

I have spent a couple of hours on this with no good result (maybe my mood is not helping).
I am trying to build a regex to help me match both urls:
/reservables/imagenes/4/editar/6
/reservables/imagenes/4/subir
As you can see above, the last segment of the first URL (6) is not present at the end of the second URL, because this segment is optional. I need to match both URLs with one regex, so I tried this:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]+)
That works only for the first URL. From what I have read about regex, it seems I need the ? symbol, right? So I tried this one, but it did not work:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]+)?
I do not know what I am doing wrong.
You want the ? to apply to the / as well, so wrap the slash and the number in a non-capturing group and make the whole group optional, like so:
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)(?:/([0-9]+))?
You can see that it matches correctly on debuggex.
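This one will also work, but only if the URL keeps the trailing slash (for example /reservables/imagenes/4/subir/), because the final / is still required: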
reservables/(editar|imagenes)/([0-9]+)/(imagen|editar|actualizar|subir)/([0-9]*)
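Here is a quick way to check the optional-group version against both URLs, sketched in Python (an assumption; the question does not say which language or router is used). The trailing $ is an addition for this sketch so that extra path segments are rejected.

```python
import re

# Optional final segment: the slash and the digits sit inside one
# non-capturing group that is made optional as a whole.
ROUTE = re.compile(
    r"reservables/(editar|imagenes)/([0-9]+)/"
    r"(imagen|editar|actualizar|subir)(?:/([0-9]+))?$"
)

paths = ["/reservables/imagenes/4/editar/6", "/reservables/imagenes/4/subir"]
results = [ROUTE.search(p).groups() for p in paths]

for path, groups in zip(paths, results):
    # The last group is None when the optional segment is absent.
    print(path, groups)
```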

Reg Ex Facebook

I am trying to extract some information from facebook using Regex. Here is a link with an example:
https://graph.facebook.com/210989592315921
I would like to know what regular expression would extract just the number of likes from this string.
I have tried for example this expression:
"likes":\s[0-9]$
Thank you in advance for any advice regarding this matter,
Mark
You should follow the "#Hope I helped" comment and use a JSON parser. You can't be sure the text will always be formatted the same way:
Are you always going to have a single space between : and the number?
By the way, here is the error you are looking for: your current regex matches a single digit, not a multi-digit number. You should use something like [0-9]+ and probably remove the $, which is not correct in your example because there is a comma after the number.
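A minimal sketch of both options in Python (an assumption, as the question names no language). The response body below is made up for illustration and the real Graph API payload may look different; only the "likes" key is taken from the question.

```python
import json
import re

# Hypothetical response body, not real Graph API output.
body = '{"id": "210989592315921", "likes": 1234, "name": "Example"}'

# Preferred: parse the JSON and read the key directly.
likes_from_json = json.loads(body)["likes"]

# Regex fallback: \s* tolerates any amount of whitespace, [0-9]+ takes
# all the digits, and there is no $ because a comma follows the number.
m = re.search(r'"likes":\s*([0-9]+)', body)
likes_from_regex = int(m.group(1)) if m else None

print(likes_from_json, likes_from_regex)
```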

Match all cases between string and comma using Regex

I have a large string. Here is a part of it:
{"status":"ok","items":[{"image_versions":[{"url":"http:\/\/distilleryimage8.instagram.com\/11a67042c62311e1bf341231380f8a12_7.jpg","width":612,"type":7,"height":612},{"url":"http:\/\/distilleryimage8.instagram.com\/11a67042c62311e1bf341231380f8a12_6.jpg","width":306,"type":6,"height":306},{"url":"http:\/\/distilleryimage8.instagram.com\/11a67042c62311e1bf341231380f8a12_5.jpg","width":150,"type":5,"height":150}],"code":"MrMBxJo-O8","has_more_comments":true,"taken_at":1341438972.0,"comments":[{"media_id":228329104165036988,"_spam":false,"text":"I live in Oklahoma! :D Shoot them off with me! :D","created_at":1341441914.0,"user":{"username":"heather_all_over","pk":13296276,"profile_pic_url":"http:\/\/images.instagram.com\/profiles\/profile_13296276_75sq_1339538236.jpg","full_name":"Heather\ud83c\udf80","is_private":false},"content_type":"comment","pk":228353791620276525,"type":0},{"media_id":228329104165036988,"_spam":false,"text":"Wish I had that much money to spend.......","created_at":1341441916.0,"user":{"username":"l_mcnair","pk":23775741,"profile_pic_url":"http:\/\/images.instagram.com\/profiles\/profile_23775741_75sq_1339894045.jpg","full_name":"Lauryn","is_private":true},"content_type":"comment","pk":228353803204944174,"type":0},{"media_id":228329104165036988,"_spam":false,"text":"You should video tape you setting them all off","created_at":1341441939.0,"user":{"username":"ahrii_","pk":37732021,"profile_pic_url":"http:\/\/images.instagram.com\/profiles\/profile_37732021_75sq_1340907381.jpg","full_name":"Ahriana;-*","is_private":false},"content_type":"comment","pk":228353997065675057,"type":0},{"media_id":228329104165036988,"_spam":false,"text":"When did skrillex start selling
I am trying to match every number after "pk":. I have been trying lookaheads but can't quite seem to get it right. I don't know much about regex, so if somebody could point me in the right direction that would be great!
This looks like a JSON response. Why not just parse the JSON and pull out the values for all the "pk" keys?
Depending on what language you're using, the regex might look different, but this should work on most languages:
/"pk":(\d+)/g
That looks for the string "pk": followed by one or more digits and places those digits in a capturing group. The g at the end makes it search for all occurrences. Depending on the language you're using, though, you might not be able to retrieve all of the captures in one call.
If you want the part after something you should use look-behind:
(?<="pk":)\d+
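Both suggestions (regex and JSON parsing) can be sketched in Python, which is an assumption since the question does not name a language. The sample string is a shortened, hand-made excerpt shaped like the data in the question, and collect_pks is a hypothetical helper, not part of any library.

```python
import json
import re

# Shortened sample in the same shape as the response in the question.
sample = (
    '{"items":[{"comments":['
    '{"user":{"username":"heather_all_over","pk":13296276},"pk":228353791620276525},'
    '{"user":{"username":"l_mcnair","pk":23775741},"pk":228353803204944174}'
    ']}]}'
)

# Regex route: findall returns the captured digits of every match.
pks_regex = re.findall(r'"pk":(\d+)', sample)

def collect_pks(node):
    """Recursively gather every value stored under a "pk" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "pk":
                found.append(value)
            found.extend(collect_pks(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_pks(item))
    return found

# JSON route: no assumptions about spacing or key order in the text.
pks_json = collect_pks(json.loads(sample))

print(pks_regex)
print(pks_json)
```

The regex route returns strings while the JSON route returns integers, which may matter if the values are used as numbers later.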

Perl/lighttpd regex

I'm using regex in lighttpd to rewrite URLs, but I can't write an expression that does what I want (which I thought was pretty basic, apparently not, I'm probably missing something).
Say I have this URL: /page/variable_to_pass/ OR /page/variable_to_pass/
I want to rewrite the URL to this: /page.php?var=variable_to_pass
I've already got rules like ^/login/(.*?)$ to handle specific pages, but I wanted to make one that can match any page without needing one expression per page.
I tried this: ^/([^.?]*) but it matches the whole /page/variable_to_pass/ instead of just page.
Any help is appreciated, thanks!
This regexp should do what you need:
/([^\/]+)/(.+)
The first capture group is the page name and the second is the variable value. Note that (.+) also captures a trailing slash, if there is one.
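Try:
/([^.?/]+)/([^.?/]+)/
The + goes inside each group so that the group captures the whole segment rather than only its last character, and / is excluded so that a group cannot run across segments. That should give you two capture groups.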
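To try out the capturing logic without touching the server config, here is a sketch in Python. This is an assumption for illustration only: lighttpd uses PCRE and its own rewrite syntax, and the function rewrite and the optional trailing slash (/?) are additions made here, not something taken from the answers.

```python
import re

# Two path segments, neither containing /, ? or .; trailing slash optional.
RULE = re.compile(r"^/([^/?.]+)/([^/?.]+)/?$")

def rewrite(path):
    """Map /page/value/ to /page.php?var=value; leave other paths alone."""
    m = RULE.match(path)
    if m is None:
        return path
    return "/{}.php?var={}".format(m.group(1), m.group(2))

print(rewrite("/page/variable_to_pass/"))
print(rewrite("/page/variable_to_pass"))
print(rewrite("/page.php"))
```

Excluding the dot keeps already rewritten paths such as /page.php from being matched a second time.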