Regex to check a Valid URL - regex

Problem: I need a regex that checks whether a given author URL is valid.
Requirement: An author URL is a URL from a social networking site, blog, etc. that contains an author ID (profile ID).
For example:
www.facebook.com/RyanMathews
www.mouthshut.com/zobo.786
As I understand it, the regex has to accept any string (any combination of characters) after the site's complete address, separated from it by a "/".
I tried this regex, but it doesn't support author IDs:
var urlregex = /^((https?:\/\/)?((([a-z\d])+(\-)?([a-z\d])+)+)(\.([a-z\d])+(\-)?([az\d])+)?)(\.[a-z]{2,4}?){1,2}$/i;
PS : Please explain the Regex & Logic too :D

This should help, but I recommend doing a little background reading first:
What is the best regular expression to check if a string is a valid URL?
Getting parts of a URL (Regex)
Please spend some time reading these links and understanding them. Hope this helps, cheers!

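^(http:\/\/){0,1}(www\.[^\W]+\.com)(\/[^\W]+)+
Maybe this would work. The dots are escaped so that they match only a literal "."; note that it only accepts hosts of the form www.<name>.com, optionally preceded by http://.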
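Since the question also asks for an explanation, here is a minimal JavaScript sketch of one way to accept a host followed by "/" and an author ID. The pattern and the name isAuthorUrl are illustrative assumptions, not the regex from either post, and the sketch makes no attempt to cover every valid URL.

```javascript
// Hypothetical pattern, piece by piece:
//   ^(https?:\/\/)?                  optional http:// or https://
//   ([a-z\d]([a-z\d-]*[a-z\d])?\.)+  one or more host labels, each followed by "."
//   [a-z]{2,}                        top-level domain (letters only)
//   \/\S+$                           "/" and then a non-empty author ID without whitespace
var authorUrlRegex = /^(https?:\/\/)?([a-z\d]([a-z\d-]*[a-z\d])?\.)+[a-z]{2,}\/\S+$/i;

// Hypothetical helper: true when the string looks like "host/author-id".
function isAuthorUrl(url) {
  return authorUrlRegex.test(url);
}

console.log(isAuthorUrl('www.facebook.com/RyanMathews'));
console.log(isAuthorUrl('www.mouthshut.com/zobo.786'));
console.log(isAuthorUrl('www.facebook.com')); // no "/author-id" part
```

The author ID is matched with \S+ rather than \w+ so that IDs containing dots, such as zobo.786, are accepted.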

Related

Regex > get YouTube Playlist ID?

Hi I am trying to get the playlist ID of a youtube url. The code below is not solid since the id 'PLcfQmtiAG0X-fmM85dPlql5wfYbmFumzQ' will not be extracted properly. It only returns 'PLcfQmtiAG0X'. Can someone help me?
var reg = new RegExp("[&?]list=([a-z0-9_]+)","i");
var url = 'https://www.youtube.com/playlist?list=PLcfQmtiAG0X-fmM85dPlql5wfYbmFumzQ';
var match = reg.exec(url);
return match[1];
I do a fair amount of regex work with URLs. Usually you'll want to use a parser but sometimes that is not an option. So to gather params I like to use a negative character class like this
/[&?]list=([^&]+)/i
The [&?] means that list= must be directly preceded by & or ?, so a parameter whose name merely ends in "list" is not matched.
The [^&]+ is the real magic: it captures every character that is not &, which is the value you are going for. If you want to play around, this site is pretty good:
https://regex101.com/
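As a rough sketch, the negated character class can be compared with a parser-based approach. The function names below are hypothetical; the second one assumes the standard URL API is available, as it is in current browsers and Node.js.

```javascript
// Regex approach: capture everything after "list=" up to the next "&".
function playlistIdByRegex(url) {
  var match = /[&?]list=([^&]+)/i.exec(url);
  return match ? match[1] : null; // null when there is no list parameter
}

// Parser approach: let the URL API split the query string.
function playlistIdByParser(url) {
  return new URL(url).searchParams.get('list');
}

var url = 'https://www.youtube.com/playlist?list=PLcfQmtiAG0X-fmM85dPlql5wfYbmFumzQ';
console.log(playlistIdByRegex(url));
console.log(playlistIdByParser(url));
```

Both calls return the full ID, including the part after the hyphen that the original [a-z0-9_]+ class cut off.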

Regular expression to match string from url

I want to match the shop name from a URL. It's for URL redirection in a WordPress application.
See the examples given below:
http://example.com/outlets/19-awok?page=2
http://example.com/outlets/19-awok
http://example.com/outlets/159-awok?page=3
In all cases I need to get only awok from the URL. It is the text coming after '-' and before the query string.
I tried the rule below and it's not working:
/outlets/(\d+)-(.*)? => /shop/$2
Any help will be greatly appreciated.
You can use this regex:
/outlets/\d+-([^?]+)?
Trailing ? is used to strip previous query string.
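To see what the capture group picks up, here is a small JavaScript sketch that applies the same idea outside the redirect rule. The function name shopName is hypothetical, and how the redirection tool itself treats the query string is not covered here.

```javascript
// [^?]+ captures everything after "<digits>-" up to, but not including, the "?".
function shopName(url) {
  var match = /\/outlets\/\d+-([^?]+)/.exec(url);
  return match ? match[1] : null; // null when the URL is not an outlet URL
}

console.log(shopName('http://example.com/outlets/19-awok?page=2'));
console.log(shopName('http://example.com/outlets/19-awok'));
console.log(shopName('http://example.com/outlets/159-awok?page=3'));
```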

in regex get a single match just before the match pattern?

I have a response like below
{"id":9,"announcementName":"Test","announcementText":"<p>TestAssertion</p>\n","effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":103,"announcementName":"d3mgcwtqhdu8003","announcementText":"<p>This announcement is a test announcement”,"effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":113,"announcementName":"asdfrtwju3f5gh7f21","announcementText":"<p>This announcement is a test announcement”,"effectiveStartDate":"03/02/2016","effectiveEndDate":"03/03/2016","updatedDate":"02/29/2016","status":"InActive","moduleName":"Individual Portal"}
I am trying to get the value of id (103) for the announcementName d3mgcwtqhdu8003.
I am using the regex pattern below to get the id:
"id":(.*?),"announcementName":"${announcementName}","announcementText":"
But it matches everything from the first id to the announcementName, and returns:
9,"announcementName":"Test","announcementText":"<p>TestAssertion</p>\n","effectiveStartDate":"03/01/2016","effectiveEndDate":"03/02/2016","updatedDate":"02/29/2016","status":"Active","moduleName":"Individual Portal"}
{"id":103,"announcementName":"d3mgcwtqhdu8003","announcementText":
But I want to match only the id just before the required announcementName.
How can I do this with a regex? Can someone please help me with this?
As answered here as well: either use appropriate JSON functions or, if that is not possible, a simple regex like:
"id":(\d+)
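will probably do, as the IDs are numeric. Used in place of (.*?) in your pattern, \d+ cannot run across the earlier records, because the text between them is not all digits.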
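A minimal JavaScript sketch of that idea follows. It combines (\d+) with the announcementName part of the original pattern; the helper names and the shortened sample records are assumptions made for illustration.

```javascript
// Hypothetical helper: escape regex metacharacters in the name before embedding it.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Returns the numeric id that directly precedes the given announcementName.
function findAnnouncementId(response, announcementName) {
  var pattern = new RegExp(
    '"id":(\\d+),"announcementName":"' + escapeRegExp(announcementName) + '"'
  );
  var match = pattern.exec(response);
  return match ? Number(match[1]) : null;
}

// Shortened sample records (assumption: same key order as in the question).
var response =
  '{"id":9,"announcementName":"Test","status":"Active"}\n' +
  '{"id":103,"announcementName":"d3mgcwtqhdu8003","status":"Active"}\n' +
  '{"id":113,"announcementName":"asdfrtwju3f5gh7f21","status":"InActive"}';

console.log(findAnnouncementId(response, 'd3mgcwtqhdu8003'));
```

If each record is well-formed JSON, parsing it and comparing the announcementName field is usually more robust than any regex.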

Regular expression groups

For all the regex experts out there! I'm trying to figure out how to group my url into parts using regular expressions.
Example:
site.com/user/account/info/settings
I want to be able to capture the user/account/info part of the URL, NOT /settings.
Can anyone take this challenge and be kind enough to help me out? Thanks!
If you want to get the beginning of the URL try this:
(\/.*\/(?!.*\/.+))
Input:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output:
/foo/remove-me/
/user/account/info/
/foo/bar/
/foo/
https://regex101.com/r/yI5rG4/2
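For reference, a short JavaScript sketch of how this pattern could be applied to the URL from the question. The function names are hypothetical, and stripping the surrounding slashes is an added step, not part of the answer.

```javascript
// Greedy \/.*\/ runs from the first "/" to the last "/" on the line;
// the lookahead rejects any end point that still has "/something" after it.
function leadingPath(url) {
  var match = /(\/.*\/(?!.*\/.+))/.exec(url);
  return match ? match[1] : null;
}

// Added step (assumption): drop the leading and trailing "/" from the match.
function leadingSegments(url) {
  var path = leadingPath(url);
  return path === null ? null : path.replace(/^\/|\/$/g, '');
}

console.log(leadingPath('site.com/user/account/info/settings'));
console.log(leadingSegments('site.com/user/account/info/settings'));
```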
After consideration of all your comments under your post, I understand that you want to get the last segment for controller name extraction. Hence try this:
(?:\/(?!.*\/.+))([^\?\n]*)
Used on these inputs:
site.com/foo/remove-me/
site.com/user/account/info/settings
site.com/foo/bar/remove-me
site.com/foo/remove-me?param1=true&param2=hello+world
Output for group 1:
remove-me/
settings
remove-me
remove-me
Test here: https://regex101.com/r/kR5tX6/2
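The same pattern can be tried in JavaScript with a sketch like the one below. The function name lastSegment is hypothetical; the inputs are the ones listed above.

```javascript
// The lookahead picks the last "/" that is not followed by another "/x";
// group 1 then takes everything up to a "?" or the end of the line.
function lastSegment(url) {
  var match = /(?:\/(?!.*\/.+))([^\?\n]*)/.exec(url);
  return match ? match[1] : null;
}

var inputs = [
  'site.com/foo/remove-me/',
  'site.com/user/account/info/settings',
  'site.com/foo/bar/remove-me',
  'site.com/foo/remove-me?param1=true&param2=hello+world'
];

inputs.forEach(function (url) {
  console.log(lastSegment(url));
});
```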

Regexp to simplify Yahoo Answers Feed Title

I am trying to parse the yahoo answers feed - http://answers.yahoo.com/rss/allq
The issue is that the titles have
[ Category ] : Open Question :
in every title, which I do not want. I want to write a regexp to remove this.
Anything that removes all the characters from the starting [ through the first : should do it.
There is also a space after the :, which needs to be removed too.
Thanks in advance; I will also try to find a solution myself.
Have you considered using Yahoo's YQL service to parse this feed (or other web pages)?
Querying html using Yahoo YQL
Yahoo! Query Language
YQL Console
They already have sample queries for you to get at Yahoo Answers data:
answers.getbycategory:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getbycategory%20where%20category_id%3D2115500137%20and%20type%3D%22resolved%22
answers.getbyuser:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getbyuser%20where%20user_id%3D%22YbaMGtHFaa%22
answers.getquestion:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.getquestion%20where%20question_id%3D%2220090526102023AAkRbch%22
answers.search:
http://developer.yahoo.com/yql/console/#h=select%20*%20from%20answers.search%20where%20query%3D%22cars%22%20and%20category_id%3D2115500137%20and%20type%3D%22resolved%22
(Just an FYI in case you weren't aware of this convenient service. I use it instead of screen scraping with RegEx's.)
The following regex should do the job:
^\[.*?:
Usage sample in C#:
string resultString = Regex.Replace(subjectString, @"^\[.*?: ", "");
It starts at the opening [ bracket, takes as few characters as possible up to the first :, and then takes the following space.
Hope this helps,
Tom.
Thanks @cmptrgeekken for pointing the non-greedy thing out!
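For comparison, a JavaScript sketch of the same replacement. The sample title and both function names are made up for illustration. The second pattern is an assumed variant, not part of the answer, for the case where the "Open Question :" part should go as well.

```javascript
// Same pattern as the answer: removes everything from "[" through the first ": ".
function stripCategory(title) {
  return title.replace(/^\[.*?: /, '');
}

// Assumed variant: also removes the status segment up to the second ":".
function stripCategoryAndStatus(title) {
  return title.replace(/^\[[^\]]*\]\s*:\s*[^:]*:\s*/, '');
}

// Made-up sample title in the format described in the question.
var title = '[ Cars ] : Open Question : How do I change a tire?';
console.log(stripCategory(title));
console.log(stripCategoryAndStatus(title));
```

Because .*? is non-greedy, the first pattern stops at the first colon even when the question text itself contains more colons.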