Need IP Address mask and DNS host name regular expressions? - regex

I need to accept an IP address or DNS name from a text box. I am looking for a regular expression that validates IP addresses.
I am currently using this regular expression:
/\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/
It works for the 0-255 range, but it also accepts invalid IPs such as 121.21.05.234.01, which has 5 parts.
I need a regular expression that handles all scenarios like those below:
10.2.22.1 - true
123.123.123.123 - true
123.123.023.12 - true
12.23.12.0 - true
121.21.05.234.01 - false
Please provide a DNS expression as well.

Try to anchor your regex with ^ and $, which will make it match the whole string.
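As a rough illustration of the anchoring advice, here is a minimal Python sketch. The names OCTET, IP_ANCHORED and is_ip are made up for this example, and the octet alternation is copied from the question; the repeated "dot + octet" group is only a shorter way of writing the same four parts.

```python
import re

# One octet, 0-255, leading zeros tolerated (same alternation as in the question).
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# ^ and $ force the four octets to span the whole input.
IP_ANCHORED = re.compile(r"^" + OCTET + r"(\." + OCTET + r"){3}$")

def is_ip(text):
    """Return True if text consists of exactly four dot-separated octets."""
    return IP_ANCHORED.match(text) is not None

samples = ["10.2.22.1", "123.123.123.123", "123.123.023.12",
           "12.23.12.0", "121.21.05.234.01"]
results = {s: is_ip(s) for s in samples}
```

With the anchors in place, the five-part input is rejected while the four valid samples are accepted. Note that in Python `$` also matches just before a trailing newline; `re.fullmatch` is an alternative if that matters.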

Are you looking for a way to specify an occurrence count?
You can achieve this with curly brackets.
An example here.
In your case, it would lead to:
/\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}\b/
(I added a \ to escape the dot, too)
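A quick Python check (a sketch; OCTET and IP_WORD_BOUNDED are names invented here) suggests that the quantifier on its own does not reject the five-part input, because \b is satisfied right before the trailing ".01". Combining {3} with the ^ and $ anchors from the other answer appears to be necessary.

```python
import re

OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# The pattern above: {3} repeats the "dot + octet" group, \b marks word boundaries.
IP_WORD_BOUNDED = re.compile(r"\b" + OCTET + r"(\." + OCTET + r"){3}\b")

found = IP_WORD_BOUNDED.search("121.21.05.234.01")
# The first four parts still match inside the longer string.
matched_text = found.group(0) if found else None
```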

Related

Regex - find all hosts inside a url

Given the following url:
http://clk.atdmt.com/FLO/go/364329512/direct/01/?href=http://www.****123****.com/refer.do?r=linkshare&lsid=vl0mfKZlvKU-I%2AKKCkbqWO7Zb9aqRSVLEw&lsurl=http%3A%2F%2F****123****%2Fcollection.do%3Fdataset%3D12905%26cm_mmc%3DIM_AFFILIATES-_-Linkshare-vl0mfKZlvKU-_-10003079-_-3
Is there any expression that will match all the hosts above (e.g. http://clk.atdmt.com, http://*123*.com, ...)?
I need the same expression to work even if the input string is just http://clk.atdmt.com
Thanks
You can use this (javascript flavour) to extract the host (capture group 1):
/http:\/\/(?:www\.)?([^\/]+)/g
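For comparison, the same pattern can be tried in Python, where `re.findall` plays the role of the /g flag. This is only a sketch: the URL below is a shortened copy of the one in the question, and HOST is a name chosen for this example.

```python
import re

# Shortened copy of the URL from the question (the lsid parameter is omitted).
url = ("http://clk.atdmt.com/FLO/go/364329512/direct/01/"
       "?href=http://www.****123****.com/refer.do?r=linkshare"
       "&lsurl=http%3A%2F%2F****123****%2Fcollection.do")

# Same pattern as above; the / needs no escaping outside a /.../ literal.
HOST = re.compile(r"http://(?:www\.)?([^/]+)")

# findall returns the contents of capture group 1 for every match.
hosts = HOST.findall(url)
```

The percent-encoded occurrence (http%3A%2F%2F...) is not picked up, since the pattern looks for a literal "http://".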

REGULAR EXPRESSION (url.rewrite)

I want to use url.rewrite in lighttpd.conf on Linux, but I can't get the regular expression right.
My web URL is ip/cgi/aaa/bbb and I want to rewrite the URL path. My target is /var/www/cgi/aaa.cgi?par=bbb
I wrote the rule as "^/cgi/([^/]+)\/(.*)?"=> "/var/www/cgi/$1.cgi?par=$2"
But somehow I can't get the value of the parameter par.
Your issue is caused by anchoring at the beginning of the input.
^/cgi
will not match ip/cgi, because ip is at the start, not /cgi. To fix it, put ip in front of what you have:
^ip/cgi/([^/]+)\/(.*)?
# ^^ this part
"^" matches at beginning of string. As your ip is dynamic, try to match from "cgi" as it is absolute.
I tried the input below at http://gskinner.com/RegExr/ and it worked fine.
/cgi/([^/]+)\/(.*)?
/var/www/cgi/$1.cgi?par=$2
ip/cgi/aaa/bbb
result is ip/var/www/cgi/aaa.cgi?par=bbb
If you don't want "ip" in result string, use
(.*)/cgi/([^/]+)\/(.*)?
/var/www/cgi/$2.cgi?par=$3
ip/cgi/aaa/bbb
result is /var/www/cgi/aaa.cgi?par=bbb (without ip)
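The three variants discussed above can be reproduced with Python's `re.sub` as a sketch. Python writes backreferences as \1 instead of $1, and this only imitates the substitution; it says nothing about how lighttpd itself feeds the path to url.rewrite.

```python
import re

path = "ip/cgi/aaa/bbb"

# Anchored rule from the question: ^ fails on "ip/...", so nothing is replaced.
anchored = re.sub(r"^/cgi/([^/]+)/(.*)?", r"/var/www/cgi/\1.cgi?par=\2", path)

# Unanchored rule: matches from "/cgi", the leading "ip" is left in place.
unanchored = re.sub(r"/cgi/([^/]+)/(.*)?", r"/var/www/cgi/\1.cgi?par=\2", path)

# Extra leading group swallows "ip", so group numbers shift by one.
with_prefix_group = re.sub(r"(.*)/cgi/([^/]+)/(.*)?",
                           r"/var/www/cgi/\2.cgi?par=\3", path)
```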

Regex to match anything after /

I'm pretty much clueless about regex, but I need a regex statement that will recognise anything after the / in a URL.
Basically, I'm developing a site for someone, and a page's URL (a local URL, of course) is, say, (http://)localhost/sweettemptations/available-sweets. This page is filled with custom post types (it's a WordPress site) which have URLs of the form (http://)localhost/sweettemptations/sweets/sweet-name.
What I want to do is redirect the URL (http://)localhost/sweettemptations/sweets back to (http://)localhost/sweettemptations/available-sweets, which is easy to do, but I also need to redirect any type of sweet back to (http://)localhost/sweettemptations/available-sweets. That is, I need to redirect (http://)localhost/sweettemptations/sweets/* back to (http://)localhost/sweettemptations/available-sweets.
If anyone could help by telling me how to write a proper regex statement to match everything after sweets/ in the URL, it would be hugely appreciated.
To do what you ask, you need to use groups. In regular expressions, groups allow you to isolate parts of the whole match.
For example:
input string: aaaaaaaabbbbcccc
regex: a*(b*)
The parentheses mark a group; in this case it will be group 1, since it is the first in the pattern.
Note: group 0 is implicit and is the complete match.
So the matches in the case above will be:
group 0: aaaaaaaabbbb
group 1: bbbb
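The group numbering described above can be checked with a short Python sketch (the variable names are just for illustration):

```python
import re

m = re.match(r"a*(b*)", "aaaaaaaabbbbcccc")

whole_match = m.group(0)   # group 0: everything the pattern matched
first_group = m.group(1)   # group 1: only the part inside the parentheses
```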
To achieve what you want with the sweets pattern above, you just need to put a group around the end.
Possible solution: /sweets/(.*)
The more precise you are with the pattern before the group, the less likely you are to get a false positive.
If what you really want is to match anything after the last /, you can take another approach:
Possible other solution: /([^/]*)$
The pattern above will find a / followed by a string of characters that are NOT another /, up to the end of the input, and keep that string in group 1. The $ anchor matters: without it, a search stops at the first / in the URL. The issue here is that you could match things that do not have sweets in the URL.
Note: if you do not mind the / at the beginning, just remove the ( and ) and you do not have to worry about groups.
I like to use http://regexpal.com/ to test my regex. It will mark the different matches in different colors.
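Here is a small Python sketch of the two approaches, using a URL of the form given in the question. The "last segment" pattern is written with a trailing `$` here, since without it a search stops at the first / in the string (the one inside "http://") and captures nothing.

```python
import re

url = "http://localhost/sweettemptations/sweets/sweet-name"

# Everything after "/sweets/" ends up in group 1.
after_sweets = re.search(r"/sweets/(.*)", url).group(1)

# Anchored at the end: group 1 is whatever follows the last "/".
last_segment = re.search(r"/([^/]*)$", url).group(1)

# Unanchored: the first "/" of "http://" matches and group 1 is empty.
unanchored_first = re.search(r"/([^/]*)", url).group(1)
```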
Hope this helps.
I may have misunderstood your requirement in my original post.
If you just want to change any string that matches
(http://)localhost/sweettemptations/sweets/*
into the other one you provided (without adding the part matched by your * at the end), I would use a regular expression to match the pattern in the URL but then just blindly replace the whole string with the desired one:
(http://)localhost/sweettemptations/available-sweets
So if you want the URL:
http://localhost/sweettemptations/sweets/somethingmore.html
to turn into:
http://localhost/sweettemptations/available-sweets
and not into:
localhost/sweettemptations/available-sweets/somethingmore.html
Then the solution is simpler, no groups required :).
When doing this, I would avoid hard-coding the "localhost" part in the pattern. Also, I am assuming that (http://) really means an optional http:// in front, as (http://) is not a valid protocol prefix.
If that is what you want, then this should match the pattern:
(http://)?[^/]+/sweettemptations/sweets/.*
This regular expression will match an optional http:// followed by a host (be it localhost, an IP, or a host name). You could omit the .* at the end if you want.
If that pattern matches just replace the whole URL with the one you want to redirect to.
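The match-then-replace idea could look roughly like this in Python. This is a sketch only: SWEETS_URL, TARGET and redirect_target are hypothetical names, and a real WordPress site would more likely do the redirect in its rewrite rules or in PHP.

```python
import re

# Pattern from the answer above: optional http://, any host, then the sweets path.
SWEETS_URL = re.compile(r"(http://)?[^/]+/sweettemptations/sweets/.*")

# Assumed redirect target, taken from the question.
TARGET = "http://localhost/sweettemptations/available-sweets"

def redirect_target(url):
    """Return TARGET when url is under /sweets/, otherwise return url unchanged."""
    return TARGET if SWEETS_URL.fullmatch(url) else url
```

For instance, `redirect_target("http://localhost/sweettemptations/sweets/somethingmore.html")` gives back the available-sweets URL with nothing appended, while the available-sweets URL itself is returned untouched.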
Use this regular expression: (?<=://).+

Regular expression - Negative look-ahead

I'm trying to use Perl's negative look-ahead in a regular expression to exclude certain strings from the targeted strings. Please give me your advice.
I am trying to get the strings which do not have -sm, -sp, or -sa.
REGEX:
hostname .+-(?!sm|sp|sa).+
INPUT
hostname 9amnbb-rp01c
hostname 9tlsys-eng-vm-r04-ra01c
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
Expected Output:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c
However, I got this actual Output below:
hostname 9amnbb-rp01c - SELECTED
hostname 9tlsys-eng-vm-r04-ra01c - SELECTED
hostname 9tlsys-eng-vm-r04-sa01c - SELECTED
hostname 9amnbb-sa01
hostname 9amnbb-aaa-sa01c - SELECTED
Please help me.
P.S.: I used Regex Coach to visualize my result.
Move the .+- inside of the lookahead:
hostname (?!.+-(?:sm|sp|sa)).+
Rubular: http://www.rubular.com/r/OuSwOLHhEy
Your current expression is not working properly because when the .+- is outside of the lookahead, it can backtrack until the lookahead no longer causes the regex to fail. For example, with the string hostname 9amnbb-aaa-sa01c and the regex hostname .+-(?!sm|sp|sa).+, the first .+ would match 9amnbb, the lookahead would see aa as the next two characters and continue, and the second .+ would match aaa-sa01c.
An alternative to my current regex would be the following:
hostname .+-(?!sm|sp|sa)[^-]+?$
This would prevent the backtracking because no - can occur after the lookahead. The non-greedy ? is used so that this works correctly in a multiline global mode.
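The behaviour described above can be reproduced with a short Python sketch that runs each pattern against the input lines from the question one line at a time (the helper `selected` and the variable names are invented for this example):

```python
import re

lines = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

# Pattern from the question: .+- can backtrack to an earlier dash.
original = re.compile(r"hostname .+-(?!sm|sp|sa).+")
# .+- moved inside the lookahead: any dash followed by sm/sp/sa rejects the line.
lookahead_first = re.compile(r"hostname (?!.+-(?:sm|sp|sa)).+")
# Alternative: no further dash is allowed after the lookahead.
no_later_dash = re.compile(r"hostname .+-(?!sm|sp|sa)[^-]+?$")

def selected(pattern):
    """Return the input lines in which the pattern finds a match."""
    return [line for line in lines if pattern.search(line)]
```

Here `selected(original)` reproduces the unwanted output from the question, while the other two patterns select only the first two lines.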
The following passes your testcases:
hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$
I think it is a little easier to read than F.J.'s answer.
To answer Rudy: the question was posed as an exclusion-of-cases situation. That seems to fit negative lookahead well. :)
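As a quick sanity check of the segment-by-segment pattern, this Python sketch runs it over the test cases from the question (variable names are illustrative only):

```python
import re

test_cases = [
    "hostname 9amnbb-rp01c",
    "hostname 9tlsys-eng-vm-r04-ra01c",
    "hostname 9tlsys-eng-vm-r04-sa01c",
    "hostname 9amnbb-sa01",
    "hostname 9amnbb-aaa-sa01c",
]

# Each "-segment" is checked by its own lookahead, so no dash can be skipped.
per_segment = re.compile(r"hostname [^-]+(-(?!sm|sp|sa)[^-]+)+$")

picked = [line for line in test_cases if per_segment.search(line)]
```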

This regex matches and shouldn't. Why is it?

This regex:
^((https?|ftp)\:(\/\/)|(file\:\/{2,3}))?(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))|(((([a-zA-Z0-9]+)(\.)?)+?)(\.)([a-z]{2}
|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum))([a-zA-Z0-9\?\=\&\%\/]*)?$
Formatted for readability:
^( # Begin regex / begin address clause
(https?|ftp)\:(\/\/)|(file\:\/{2,3}))? # protocol
( # container for two address formats, more to come later
((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) # match IP addresses
)|( # delimiter for address formats
((([a-zA-Z0-9]+)(\.)?)+?) # match domains and any number of subdomains
(\.) #dot for .com
([a-z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum) #TLD clause
) # end address clause
([a-zA-Z0-9\?\=\&\%\/]*)? # querystring support, will pretty this up later
$
is matching:
www.google
and shouldn't be. This is one of my "fail" test cases. I have declared the TLD portion of the URL to be mandatory when matching on alpha instead of on IP, and "google" doesn't fit into the "[a-z]{2}" clause.
Keep in mind I will fix the following issues separately; this question is only about why it matches www.google when it shouldn't.
Querystring needs to support proper formats only, currently accepts any combination of querystring characters
Several protocols not supported, though the scope of my requirements may not include them
uncommon TLDs with 3 characters not included
Probably matches http://www.google..com - will check for consecutive dots
Doesn't support decimal IP address formats
What's wrong with my regex?
edit: See also a previous problem with an earlier version of this regex on a different test case:
How can I make this regex match correctly?
edit2: Fixed - The corrected regex (as asked) is:
^((https?|ftp)\:(\/\/)|(file\:\/{2,3}))?(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))|(((([a-zA-Z0-9]+)(\.)?)+?)(\.)([a-z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum))([\/][\/a-zA-Z0-9\.]*)*?([\/]?[\?][a-zA-Z0-9\=\&\%\/]*)?$
"google" might not fit in [a-z]{2}, but it does fit in [a-z]{2}([a-zA-Z0-9\?\=\&\%\/]*)? - you forgot to require a / after the TLD if the URL extends beyond the domain. So it's interpreting it with "www.go" as the domain and then "ogle" following it, with no slash in between. You can fix it by adding a [?/] to the front of that last group to require one of those two symbols between the TLD and any further portion of the URL.
Your TLD clause matches "go" in google and the querystring support part matches "ogle" afterwards. Try changing the querystring part to this:
([?/][a-zA-Z0-9\?\=\&\%\/]*)?
"google" doesn't fit into the "[a-z]{2}" clause.
But "go" does and then "ogle" matches "([a-zA-Z0-9\?\=\&\%/]*)?"
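To see the "go" + "ogle" split concretely, here is a Python sketch that uses only the domain half of the regex from the question (the protocol and IP parts are left out, which is a simplification made for this example). DOMAIN, ORIGINAL_TAIL and FIXED_TAIL are names chosen here; FIXED_TAIL is the change suggested in the answers above.

```python
import re

# Domain part of the question's regex, anchored at the start for this test.
DOMAIN = (r"^((([a-zA-Z0-9]+)(\.)?)+?)(\.)"
          r"([a-z]{2}|com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum)")

# Original trailing clause: any run of these characters may follow the TLD.
ORIGINAL_TAIL = r"([a-zA-Z0-9\?\=\&\%\/]*)?$"
# Suggested fix: require a ? or / before anything that follows the TLD.
FIXED_TAIL = r"([?/][a-zA-Z0-9\?\=\&\%\/]*)?$"

original = re.compile(DOMAIN + ORIGINAL_TAIL)
fixed = re.compile(DOMAIN + FIXED_TAIL)

m = original.match("www.google")
tld, rest = m.group(6), m.group(7)   # group 6 is the TLD clause, group 7 the tail
```

With the fixed tail, "www.google" no longer matches, while inputs such as "www.google.com" and "www.google.com/search?q=1" still do.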