How to make a regex

I was trying to solve a problem with regex, but I find it very hard to write the pattern. Let's look at an example; maybe you can help me out and point me to some good sources for learning regex. My problem is that I want to write a regex for a sentence, e.g. www.facebook.com www.goole.com www.online.facebook.com www.live.com. In these examples the www and the com are the same, but the part between them changes. I tried to build it using this link but couldn't.
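One way to approach this, sketched in Python (the question names no language, so that choice is an assumption): keep the fixed `www.` and `.com` parts literal and allow letters, digits, dots and hyphens in between. The name `HOST_RE` and the exact character class are illustrative; this is not a complete hostname validator.

```python
import re

# Literal "www." and ".com" with one or more host characters in between.
# The character class is an assumption about what may appear in the middle.
HOST_RE = re.compile(r"\bwww\.[A-Za-z0-9.-]+\.com\b")

text = "www.facebook.com www.goole.com www.online.facebook.com www.live.com"

# findall returns every non-overlapping match as a string.
hosts = HOST_RE.findall(text)
print(hosts)
```

The character class excludes spaces, so each match stops at the end of one host name instead of running into the next.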

Related

How to cut url down correctly by regex?

I have a question about regex, and it would be great if you could help me solve an issue. I have tons of URLs and need to find all the unique ones that contain the word promo.
For instance, I have a bunch of URLs like this:
/promo/vygoda-do-20-na-samsung?from=hb
/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151?from=hb
/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151
but I need to get this:
/promo/vygoda-do-20-na-samsung
/promo/antikrizisnaya-rasprodazha-skidki-do-50
/promo/antikrizisnaya-rasprodazha-skidki-do-50
All I could come up with is
https://regex101.com/r/Ot8xzV/1
I have just started my journey with regex and don't have strong knowledge yet, so please help me do this. I'll be very grateful.
Use
(.*/promo/[^?]+?)(?:-mark\d+|\?).*
Replace with $1 if your tool supports replacement. Otherwise, capturing group 1 may already give you what you need.
See proof.
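As a rough illustration, here is the same pattern applied in Python to the three URLs from the question. The variable names are made up for the example, and Python writes the first group as `\1` where the answer writes `$1`.

```python
import re

# Pattern from the answer above; group 1 is the part of the URL to keep.
PROMO_RE = re.compile(r"(.*/promo/[^?]+?)(?:-mark\d+|\?).*")

urls = [
    "/promo/vygoda-do-20-na-samsung?from=hb",
    "/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151?from=hb",
    "/promo/antikrizisnaya-rasprodazha-skidki-do-50-mark164615151",
]

# Replace each whole match with group 1; URLs that do not match stay unchanged.
cleaned = [PROMO_RE.sub(r"\1", url) for url in urls]

# dict.fromkeys drops duplicates while keeping first-seen order.
unique = list(dict.fromkeys(cleaned))

print(cleaned)
print(unique)
```

Deduplicating after the substitution is what yields the unique promo URLs the question asks for.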

Rewrite regex without negation

I wrote this regex to help me extract some links from some text files:
https?:\/\/(?:.(?!https?:\/\/))+$
Because I am using the golang/regexp lib, I can't use it, because of the negative lookahead (?!...).
What I would like to do is select all the text from the last occurrence of http/https to the end.
sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/#query2
=> Output: http://websites.com/path/subpath/#query2
Can anyone help me with a solution? I've spent several hours trying different ways of reproducing the same result, with no success.
Try this regex:
https?:[^:]*$
Regex live here.
The lookaheads exist for a reason.
However, if you insist on a supposedly equivalent alternative, a general strategy you can use is:
(?!xyz)
is somewhat equivalent to:
$|[^x]|x(?:[^y]|$)|xy(?:[^z]|$)
With that said, hopefully I didn't make any mistakes:
https?:\/\/(?:$|(?:[^h]|$)|(?:h(?:[^t]|$))|(?:ht(?:[^t]|$))|(?:htt(?:[^p]|$))|(?:http(?:[^s:]|$))|(?:https?(?:[^:]|$))|(?:https?:(?:[^\/]|$))|(?:https?:\/(?:[^\/]|$)))*$
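A minimal Python sketch of the first suggestion, run against the sample string from the question. It also shows a second lookahead-free option that is not from the answers above: a greedy `.*` in front of a capturing group, which backtracks to the last `http://` or `https://`. Both patterns use only greedy quantifiers, character classes and groups, so they should carry over to Go's regexp package, though that is untested here. Note that `https?:[^:]*$` finds nothing when the final URL itself contains a colon, such as a port number.

```python
import re

SAMPLE = ("sometextsometexhttp://websites.com/path/subpath/#query1"
          "sometexthttp://websites.com/path/subpath/#query2")

# Pattern from the answer: a scheme followed by colon-free text up to the end.
NO_COLON_RE = re.compile(r"https?:[^:]*$")

# Alternative (an assumption, not from the thread): the greedy .* backtracks
# until the group can match, so group 1 starts at the last URL.
LAST_URL_RE = re.compile(r".*(https?://.*)$")


def last_url(text):
    """Return the text from the last http(s):// to the end, or None."""
    match = LAST_URL_RE.search(text)
    return match.group(1) if match else None


with_port = "x http://a.example:8080/p"  # hypothetical input with a port

print(NO_COLON_RE.search(SAMPLE).group(0))
print(last_url(SAMPLE))
print(NO_COLON_RE.search(with_port))  # no match: the port's colon blocks it
print(last_url(with_port))
```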

Regex to find a web address

I'm trying to isolate links from HTML using a regex, and the one I found that is supposed to do it doesn't seem to work.
/^(http?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
Am I missing something? I'm using Brackets as my text editor
^(?:http|https):\/\/(?:[a-z0-9\-\.]+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:\.\?\+=&%#!\-\/\(\)]+)|\?(?:[\w#!:\.\?\+=&%#!\-\/\(\)]+))?$
Messy, but works.
Also, you might want to look at a similar question: Regex expression for valid website link
Hope this helps :)
It is hard to make it 100% accurate.
A URL could also be an IP address, for example.
http://ip/
It can contain query strings.
http://www.google.com/?a=1&b=2
It can contain spaces.
http://www.google.com/this is my url/
It depends on how accurate you need it to be.
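To see where the longer pattern above stands on these cases, here is a small Python check (the choice of Python and the names `URL_RE` and `candidates` are assumptions for illustration). Because the pattern is anchored with `^` and `$`, it tests whether a whole string is a URL; it will not pick a link out of surrounding HTML.

```python
import re

# Pattern from the answer above, copied unchanged and split over two lines.
URL_RE = re.compile(
    r"^(?:http|https):\/\/(?:[a-z0-9\-\.]+)(?::[0-9]+)?"
    r"(?:\/|\/(?:[\w#!:\.\?\+=&%#!\-\/\(\)]+)|\?(?:[\w#!:\.\?\+=&%#!\-\/\(\)]+))?$"
)

candidates = [
    "http://ip/",
    "http://www.google.com/?a=1&b=2",
    "http://www.google.com/this is my url/",
    '<a href="http://www.google.com/">',  # a link embedded in markup
]

# Map each candidate to whether the whole string matches.
results = {url: bool(URL_RE.match(url)) for url in candidates}

for url, ok in results.items():
    print(ok, url)
```

The first two candidates match; the URL with spaces and the one embedded in markup do not.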

ColdFusion regex to match valid Youtube links (need improvement)

Although similar questions have been asked on here multiple times already, I've got a request to amend an existing regex line to improve it. I'm pretty sure this will help others in the same situation too.
What I'm trying to achieve is to match valid YouTube video URLs using ColdFusion regex.
Here's what I've currently got:
ReMatch('^.*(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^##\&\?]*).*',mylink)
This works for the following URL types:
http://www.youtube.com/watch?v=0zM3nApSvMg&feature=feedrec_grec_index
http://www.youtube.com/user/IngridMichaelsonVEVO#p/a/u/1/QdK8U-VIH_o
http://www.youtube.com/v/0zM3nApSvMg?fs=1&hl=en_US&rel=0
http://www.youtube.com/watch?v=0zM3nApSvMg#t=0m10s
http://www.youtube.com/embed/0zM3nApSvMg?rel=0
http://www.youtube.com/watch?v=0zM3nApSvMg
http://youtu.be/0zM3nApSvMg
However, the following URL for whatever reason is getting matched too:
http://www.theguardian.com/media/2013/nov/29/russell-brand-rages-sun-rupert-murdoch
How can I amend the code to be a bit more accurate? Maybe making sure that the 'youtu' part is paramount to the link would help as I think the current regex only takes it as one of the optional parts? Trouble is I'm not able to amend this code myself, hence asking for help here.
//////EDITED////////////////
Thanks to Omega's answer below, with a little amendment here's the pattern that worked for my case:
ReMatch('(http:\/\/)(?:www\.)?youtu(?:be\.com\/(?:watch\?|user\/|v\/|embed\/)\S+|\.be\/\S+)',mylink)
Also, it is worth noting I had to strip the lookbehind part from the suggested pattern as ColdFusion does not support it.
(?<=http:\/\/)(?:www\.)?youtu(?:be\.com\/(?:watch\?|user\/|v\/|embed\/)\S+|\.be\/\S+)
See this demo.
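The two patterns can be compared outside ColdFusion. The sketch below uses Python's `re` module, on the assumption that these particular patterns behave the same in both regex dialects; the doubled `##` from the ColdFusion string is written as a single `#`. It shows that the original pattern accepts the Guardian URL because its `v\/` alternative matches the "v/" in "nov/", while the amended pattern requires the literal "youtu".

```python
import re

# Original pattern from the question; ColdFusion's "##" is an escaped "#".
ORIGINAL_RE = re.compile(
    r"^.*(youtu.be\/|v\/|u\/\w\/|embed\/|watch\?v=|\&v=)([^#\&\?]*).*"
)

# Amended pattern from the edit above, copied unchanged.
AMENDED_RE = re.compile(
    r"(http:\/\/)(?:www\.)?youtu"
    r"(?:be\.com\/(?:watch\?|user\/|v\/|embed\/)\S+|\.be\/\S+)"
)

YOUTUBE_URLS = [
    "http://www.youtube.com/watch?v=0zM3nApSvMg&feature=feedrec_grec_index",
    "http://www.youtube.com/user/IngridMichaelsonVEVO#p/a/u/1/QdK8U-VIH_o",
    "http://www.youtube.com/v/0zM3nApSvMg?fs=1&hl=en_US&rel=0",
    "http://www.youtube.com/watch?v=0zM3nApSvMg#t=0m10s",
    "http://www.youtube.com/embed/0zM3nApSvMg?rel=0",
    "http://www.youtube.com/watch?v=0zM3nApSvMg",
    "http://youtu.be/0zM3nApSvMg",
]
GUARDIAN = ("http://www.theguardian.com/media/2013/nov/29/"
            "russell-brand-rages-sun-rupert-murdoch")

# Which alternative let the Guardian URL through in the original pattern.
false_positive = ORIGINAL_RE.search(GUARDIAN).group(1)
all_youtube_match = all(AMENDED_RE.search(url) for url in YOUTUBE_URLS)
guardian_match = AMENDED_RE.search(GUARDIAN)

print(false_positive)
print(all_youtube_match)
print(guardian_match)
```

The amended pattern only accepts `http://`, so `https://` links would need `https?:\/\/` in place of `http:\/\/`.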

Updating Textmate's Markdown Bundle URL Regex

Hey all, I'm looking to do pretty much what the title says. As-is, the seemingly favored Textmate Markdown bundle chokes on some URLs. What I'd like to do is implement #diegoperini's pattern to properly match URLs.
Some problems on my end: I suck with Regular Expressions, and I've never edited a Textmate bundle.
I'm not sure how to even explain what I'm trying. So essentially what I'm looking for is someone to point me in the right direction and help me along in getting this sorted out, and learn a bit more about Textmate and regex along the way.
Thanks so much.
I like to use Rubular.com for learning and testing Regex.
You want to use the Bundle Editor.
Read this.
Check out Textmate Screencasts.
You could also buy the Peepcode screencast on Textmate. It has some info about bundle editing that may be useful, although there's not huge amounts on what you want.
There is also a book
Good luck!