Regex for specifying a date format

I have defined the following regex for a specific date:
(0[1-9]|1[012]|[1-9])[\/-]
(0[1-9]|1[0-9]|2[0-9]|3[0]|[1-9])[\/-]
(18[0-9]+|19[0-9]+|20[0-9]+|0[1-9]|1[0-9]|2[0-9]|3[0-9]|4[0-9]|5[0-9]|6[0-9]|7[0-9]|8[0-9]|9[0-9])
The first line defines the month, the second line the day, and the third line the year.
I am happy with the limits for days, months and years, but I don't know how to reject mixed delimiters such as mm/dd-yyyy or mm-dd/yyyy.
Can someone please help?

You can match the first delimiter, then use a back reference to it.
# /(0[1-9]|1[012]|[1-9])([\/-])(0[1-9]|1[0-9]|2[0-9]|3[0]|[1-9])\2(18[0-9]+|19[0-9]+|20[0-9]+|0[1-9]|[1-9][0-9])/
( 0 [1-9] | 1 [012] | [1-9] ) # (1), Month
( [/-] ) # (2), Delimiter / or -
( # (3 start), Day
0 [1-9]
| 1 [0-9]
| 2 [0-9]
| 3 [0]
| [1-9]
) # (3 end)
\2 # Delimiter backreference
( # (4 start), Year
18 [0-9]+
| 19 [0-9]+
| 20 [0-9]+
| 0 [1-9]
| [1-9] [0-9]
) # (4 end)
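As a quick check of the backreference idea, here is a minimal Python sketch. The pattern is copied from the answer above; the names DATE_RE and is_consistent_date are made up for this illustration, and fullmatch is used on the assumption that the whole string should be a date.

```python
import re

# Pattern copied from the answer above. Group 2 captures the first
# delimiter, and \2 requires the second delimiter to be the same one.
DATE_RE = re.compile(
    r"(0[1-9]|1[012]|[1-9])"                           # (1) month
    r"([/-])"                                          # (2) delimiter / or -
    r"(0[1-9]|1[0-9]|2[0-9]|3[0]|[1-9])"               # (3) day
    r"\2"                                              # same delimiter again
    r"(18[0-9]+|19[0-9]+|20[0-9]+|0[1-9]|[1-9][0-9])"  # (4) year
)

def is_consistent_date(text: str) -> bool:
    """Return True when the whole string matches and both delimiters agree."""
    return DATE_RE.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ["12/25/1999", "12-25-1999", "12/25-1999", "12-25/1999"]:
        print(sample, is_consistent_date(sample))
```

The only change from the original three-part pattern is that the first delimiter is captured and reused, so the month, day and year limits stay exactly as they were.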

Related

error parsing regexp invalid or unsupported Perl syntax: `(?!`

I'm validating phone numbers and email addresses with this regex, but I'm getting a Perl syntax error. Can anyone tell me what to use here?
^(?:(\d)(?!\1{2}))\d{4,15}$|([A-Za-z0-9]+@[A-za-z]+\.[A-Za-z]{2,3})
I'm validating international numbers of 4 to 15 digits and also rejecting continuously repeated digits such as 1111111111111, 99999999999 or 77777777777; a number can't contain more than 3 repeated digits. I'm also validating email addresses. Everything else is fine, but for the repeated digits I have to use the Perl syntax (?! and that is why I'm getting the error. Please help me solve this.
You're not using Perl; you're using RE2. While similar to Perl, it's not quite compatible.
Specifically, it can't handle the pattern you provided. That's what the message is saying. You'll need to provide something RE2 can handle.
The following is the relevant part:
^(?:(\d)(?!\1{2}))\d{4,15}$
In Perl, that checks for a string of 5-16 digits that's optionally followed by a line feed, with the caveat that the first three digits can't all be the same.
This is equivalent[1] and will work in RE2:
^
(?: 0 (?: 0 [1-9] | [1-9] [0-9] )
| 1 (?: 1 [02-9] | [02-9] [0-9] )
| 2 (?: 2 [0-13-9] | [0-13-9] [0-9] )
| 3 (?: 3 [0-24-9] | [0-24-9] [0-9] )
| 4 (?: 4 [0-35-9] | [0-35-9] [0-9] )
| 5 (?: 5 [0-46-9] | [0-46-9] [0-9] )
| 6 (?: 6 [0-57-9] | [0-57-9] [0-9] )
| 7 (?: 7 [0-68-9] | [0-68-9] [0-9] )
| 8 (?: 8 [0-79] | [0-79] [0-9] )
| 9 (?: 9 [0-8] | [0-8] [0-9] )
)
[0-9]{2,13}
\n?
\z
I don't know RE2, so there might be a better solution.
[1] Assuming \d was meant to match [0-9]. It actually matches a whole lot more.
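To see that the lookahead-free alternation accepts the same strings, here is a rough Python sketch that builds the ten branches programmatically and compares them with the lookahead version. Python's re module supports lookahead, so it can act as the reference here; it is not RE2. The helper names are invented for this example, and fullmatch replaces the ^ ... \n?\z anchoring, so the optional trailing line feed is not modelled.

```python
import re

DIGITS = "0123456789"

def first_three_not_identical() -> str:
    """Build the alternation from the answer: for each leading digit d,
    accept 'd d x' or 'd x y', where x is any digit other than d."""
    branches = []
    for d in DIGITS:
        others = DIGITS.replace(d, "")
        branches.append(f"{d}(?:{d}[{others}]|[{others}][0-9])")
    return "(?:" + "|".join(branches) + ")"

# Reference pattern with the negative lookahead (unsupported by RE2).
WITH_LOOKAHEAD = re.compile(r"([0-9])(?!\1{2})[0-9]{4,15}")
# Lookahead-free rewrite: three constrained digits, then 2 to 13 more.
WITHOUT_LOOKAHEAD = re.compile(first_three_not_identical() + r"[0-9]{2,13}")

def accepts(pattern: re.Pattern, text: str) -> bool:
    return pattern.fullmatch(text) is not None

if __name__ == "__main__":
    for sample in ["11123", "11213", "12345", "1234", "9999999999"]:
        print(sample, accepts(WITH_LOOKAHEAD, sample), accepts(WITHOUT_LOOKAHEAD, sample))
```

Note that both patterns only constrain the first three digits, as described above; a run of identical digits later in the number is still accepted.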

How can I re-arrange this regex to work with the MM/DD/YYYY format?

There's a brilliant bit of regex by Macs Dickinson that validates DD/MM/YYYY strings, taking into account the allowable days for each month (e.g. 28 vs 30 vs 31) and the possibility of February 29th, but only in leap years:
^(((0[1-9]|[12][0-9]|3[01])[- /.](0[13578]|1[02])|(0[1-9]|[12][0-9]|30)[- /.](0[469]|11)|(0[1-9]|1\d|2[0-8])[- /.]02)[- /.]\d{4}|29[- /.]02[- /.](\d{2}(0[48]|[2468][048]|[13579][26])|([02468][048]|[1359][26])00))$
I'm looking to re-arrange this to use for MM/DD/YYYY strings, but I can't wrap my head around it enough to get it there. Any help is much appreciated!
The regex you posted has a mistake in the Leap Year section.
Otherwise it is a fair facsimile of that form.
I've fixed his mistake, and rearranged the day/month for you.
It now matches MM/DD/YYYY.
((?:0[13578]|1[02])[- /.](?:0[1-9]|[12][0-9]|3[01])|(?:0[469]|11)[- /.](?:0[1-9]|[12][0-9]|30)|02[- /.](?:0[1-9]|1\d|2[0-8]))[- /.](\d{4})|(02[- /.]29)[- /.]([0-9]{2}(?:0[48]|[13579][26]|[2468][048])|(?:[02468][048]|[13579][26])00)
Formatted / explained
( # (1 start), Non-LeapYr MM/DD
(?: 0 [13578] | 1 [02] ) # Months with 31 days
[- /.]
(?: 0 [1-9] | [12] [0-9] | 3 [01] )
|
(?: 0 [469] | 11 ) # Months with 30 days
[- /.]
(?: 0 [1-9] | [12] [0-9] | 30 )
|
02 # February with 28 days
[- /.]
(?: 0 [1-9] | 1 \d | 2 [0-8] )
) # (1 end)
[- /.]
( \d{4} ) # (2), Any year 0000 - 9999
| # OR,
( 02 [- /.] 29 ) # (3), LeapYear MM/DD
[- /.]
( # (4 start), Leap Years 0000 - 9996
[0-9]{2}
(?: 0 [48] | [13579] [26] | [2468] [048] )
|
(?: [02468] [048] | [13579] [26] )
00
) # (4 end)
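One way to gain confidence in the rearranged pattern is to compare it with a calendar library. The sketch below is in Python and copies the one-line regex from above; the names MMDDYYYY_RE, is_valid_mmddyyyy and disagreements are made up for this example, and it assumes the whole string must be a date (fullmatch), since the pattern as posted has no ^ or $ anchors.

```python
import calendar
import re

# One-line MM/DD/YYYY pattern copied from the answer, split for readability.
MMDDYYYY_RE = re.compile(
    r"((?:0[13578]|1[02])[- /.](?:0[1-9]|[12][0-9]|3[01])"
    r"|(?:0[469]|11)[- /.](?:0[1-9]|[12][0-9]|30)"
    r"|02[- /.](?:0[1-9]|1\d|2[0-8]))[- /.](\d{4})"
    r"|(02[- /.]29)[- /.]"
    r"([0-9]{2}(?:0[48]|[13579][26]|[2468][048])|(?:[02468][048]|[13579][26])00)"
)

def is_valid_mmddyyyy(text: str) -> bool:
    """True when the entire string is a date the pattern accepts."""
    return MMDDYYYY_RE.fullmatch(text) is not None

def disagreements(first_year: int, last_year: int) -> list:
    """List every MM/DD/YYYY string where the regex and calendar disagree."""
    bad = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            days_in_month = calendar.monthrange(year, month)[1]
            for day in range(1, 32):
                text = f"{month:02d}/{day:02d}/{year:04d}"
                if is_valid_mmddyyyy(text) != (day <= days_in_month):
                    bad.append(text)
    return bad

if __name__ == "__main__":
    print(disagreements(1896, 2104))
```

The character class [- /.] is matched independently at each position, so mixed separators such as 02-29/2024 are accepted by this pattern as well.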

RegEx for matching various dates

I am trying to put together a regex statement to match on each of the below date formats.
* Mar 7, 2017
Mar. 7, 2017
* March 7, 2017
3-7-2017
03-07-2017
3-7-17
03-07-17
* 03/7/2017
* 03/07/17
* 3/7/17
Mar-07-2017
Mar-7-2017
March-07-2017
The below regex matches on the date formats that are indicated by an asterisk above. I have tried in vain to add to what I already have but have been unsuccessful.
([0-9]+)/([0-9]+)/([0-9]+)|([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))|\w+\s\d{2},\s\d{4}|(?i)\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec](?:ember)?)\b
(?:0?[1-9]|[1-2][0-9]|3[01]),? \d{4}
Any help is always appreciated!
* Bonus question *
On some occasions there may be multiple date matches, and I need to find the one that follows a certain word. In the past I've used the syntax below, placing the regex statement between the parentheses after the period.
(?<=Word).(StatementHere)
Try this then ...
([0-9]+)/([0-9]+)/([0-9]+)|((0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-(\d{4}|\d{2}))|\w+\s\d{2},\s\d{4}|(?i)\b(Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May|Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?|Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?)([ ](?:0?[1-9]|[1-2][0-9]|3[01]),?[ ]|-(?:0?[1-9]|[1-2][0-9]|3[01])-)(\d{4})
https://regex101.com/r/k1vaVN/1
Readable version
( [0-9]+ ) # (1)
/
( [0-9]+ ) # (2)
/
( [0-9]+ ) # (3)
|
( # (4 start)
( 0? [1-9] | 1 [0-2] ) # (5)
-
( 0? [1-9] | [12] \d | 3 [01] ) # (6)
-
( \d{4} | \d{2} ) # (7)
) # (4 end)
|
\w+ \s \d{2} , \s \d{4}
|
(?i)
\b
( # (8 start)
Jan
(?: uary | \. )?
| Feb
(?: ruary | \. )?
| Mar
(?: ch | \. )?
| Apr
(?: il | \. )?
| May
| Jun
(?: e | \. )?
| Jul
(?: y | \. )?
| Aug
(?: ust | \. )?
| Sep
(?: tember | \. )?
| Oct
(?: ober | \. )?
| Nov
(?: ember | \. )?
| Dec
(?: ember | \. )?
) # (8 end)
( # (9 start)
[ ]
(?: 0? [1-9] | [1-2] [0-9] | 3 [01] )
,? [ ]
| -
(?: 0? [1-9] | [1-2] [0-9] | 3 [01] )
-
) # (9 end)
( \d{4} ) # (10)
update
Just wrap the dates in a (?: ) group, then add whatever qualifier before
it that you need.
word[ ]or[ ]phrase[ ]+\K(?:([0-9]+)/([0-9]+)/([0-9]+)|((0?[1-9]|1[0-2])-(0?[1-9]|[12]\d|3[01])-(\d{4}|\d{2}))|\w+\s\d{2},\s\d{4}|(?i)\b(Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May|Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?|Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?)([ ](?:0?[1-9]|[1-2][0-9]|3[01]),?[ ]|-(?:0?[1-9]|[1-2][0-9]|3[01])-)(\d{4}))
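For readers whose engine lacks \K (Python's built-in re module is one such engine), a capture group can play the same role. The following is a rough sketch under several assumptions: the date alternatives are copied from the answer but with non-capturing groups, re.IGNORECASE is applied to the whole pattern instead of the inline (?i), and the names DATE, SAMPLES and find_date_after, as well as the keywords in the usage lines, are invented for illustration.

```python
import re

MONTHS = (
    r"Jan(?:uary|\.)?|Feb(?:ruary|\.)?|Mar(?:ch|\.)?|Apr(?:il|\.)?|May|"
    r"Jun(?:e|\.)?|Jul(?:y|\.)?|Aug(?:ust|\.)?|Sep(?:tember|\.)?|"
    r"Oct(?:ober|\.)?|Nov(?:ember|\.)?|Dec(?:ember|\.)?"
)
DAY = r"(?:0?[1-9]|[1-2][0-9]|3[01])"

# Same four alternatives as the answer, wrapped in one (?: ) group.
DATE = (
    r"(?:[0-9]+/[0-9]+/[0-9]+"
    r"|(?:0?[1-9]|1[0-2])-(?:0?[1-9]|[12]\d|3[01])-(?:\d{4}|\d{2})"
    r"|\w+\s\d{2},\s\d{4}"
    r"|\b(?:" + MONTHS + r")(?:[ ]" + DAY + r",?[ ]|-" + DAY + r"-)\d{4})"
)

# The formats listed in the question.
SAMPLES = [
    "Mar 7, 2017", "Mar. 7, 2017", "March 7, 2017", "3-7-2017",
    "03-07-2017", "3-7-17", "03-07-17", "03/7/2017", "03/07/17",
    "3/7/17", "Mar-07-2017", "Mar-7-2017", "March-07-2017",
]

def find_date_after(keyword: str, text: str):
    """Return the first date that follows `keyword` and spaces, or None.
    A named group stands in for \\K, which Python's re does not support."""
    pattern = re.escape(keyword) + r"[ ]+(?P<date>" + DATE + r")"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group("date") if match else None

if __name__ == "__main__":
    print(find_date_after("due on", "Filed 3/7/17, due on 03-07-2017."))
    print(find_date_after("Shipped", "Shipped Mar. 7, 2017 by courier"))
```

Because the keyword is passed through re.escape, it is matched literally; with re.IGNORECASE on the whole pattern, the keyword is also matched case-insensitively, which may or may not be what you want.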

Matching known hosts warning in regex

How could I match the following where the IP address can change:
Warning: Permanently added '100.124.61.161' (RSA) to the list of known hosts.
Thanks in advance!
You can try the code below; adjust the pattern if you need to restrict the match to more specific text.
if ($string =~ m/Warning: Permanently added '(.*?)' \(RSA\) to the list of known hosts\./)
{
    print "Match Successful, IP address: $1\n";
}
else
{
    print "String did not match\n";
}
A general regex for the ipv4 (no port) would be
(?<!\d)(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])(?:\.(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])){3}(?!\d)
Explained
(?<! \d )
(?:
\d # 0 - 9
| [1-9] \d # 10 - 99
| 1 \d{2} # 100 - 199
| 2 [0-4] \d # 200 - 249
| 25 [0-5] # 250 - 255
)
(?:
\.
(?:
\d
| [1-9] \d
| 1 \d{2}
| 2 [0-4] \d
| 25 [0-5]
)
){3}
(?! \d )
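The two answers can be combined by placing the strict IPv4 pattern inside the warning message. Here is a minimal Python sketch; the octet and address patterns are copied from above, while the names OCTET, IPV4, KNOWN_HOSTS_RE and added_host are hypothetical, and the key type is assumed to be the literal (RSA) from the question.

```python
import re

# One octet, 0 to 255, as in the explained pattern above.
OCTET = r"(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])"
# Four octets separated by dots, with no digit directly before or after.
IPV4 = r"(?<!\d)" + OCTET + r"(?:\." + OCTET + r"){3}(?!\d)"

KNOWN_HOSTS_RE = re.compile(
    r"Warning: Permanently added '(" + IPV4 + r")' "
    r"\(RSA\) to the list of known hosts\."
)

def added_host(line: str):
    """Return the IP address from a known-hosts warning line, or None."""
    match = KNOWN_HOSTS_RE.search(line)
    return match.group(1) if match else None

if __name__ == "__main__":
    line = "Warning: Permanently added '100.124.61.161' (RSA) to the list of known hosts."
    print(added_host(line))
```

Compared with (.*?), this rejects lines whose quoted text is not a well-formed IPv4 address, at the cost of a longer pattern.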

Perl Regular expression for IP address range

I have some internet traffic data to analyze. I need to analyze only those packets that are within a certain IP range, so I need to write an if statement. I suppose I need a regular expression for the test condition, but my knowledge of regexps is a little weak. Can someone tell me how I would construct a regular expression for that condition? An example range might look like
Group A
56.286.75.0/19
57.256.106.0/21
64.131.14.0/22
Group B
58.176.44.0/21
58.177.92.0/19
The if statement would be like
if("IP in A" || "IP in B") {
do something
}
else { do something else }
So I would need to write the equivalent regexps for the "IP in A" and "IP in B" conditions.
I don't think that regexps provide much advantage for this problem.
Instead, use the Net::Netmask module. The "match" method should do what you want.
I have to echo the disagreement with using a regex to check IP addresses...however, here is a way to pull IPs out of text:
qr{
(?<!\d) # No digit having come immediately before
(?: [1-9] \d? # any one or two-digit number
| 1 \d \d # OR any three-digit number starting with 1
| 2 (?: [0-4] \d # OR 200 - 249
| 5 [0-6] # OR 250 - 256
)
)
(?: \. # followed by a dot
(?: [1-9] \d? # 1-256 reprise...
| 1 \d \d
| 2 (?: [0-4] \d
| 5 [0-6]
)
)
){3} # that group exactly 3 times
(?!\d) # no digit following immediately after
}x
;
But given that general pattern, we can construct an IP parser. But for the given "ranges", I wouldn't do anything less than the following:
A => qr{
(?<! \d )
(?: 56\.186\. 75
| 57\.256\.106
| 64\.131\. 14
)
\.
(?: [1-9] \d?
| 1 \d \d
| 2 (?: [0-4] \d
| 5 [0-6]
)
)
(?! \d )
}x
B => qr{
(?<! \d )
58 \.
(?: 176\.44
| 177\.92
)
\.
(?: [1-9] \d?
| 1 \d \d
| 2 (?: [0-4] \d
| 5 [0-6]
)
)
(?! \d )
}x
I'm doing something like:
use NetAddr::IP;
my @group_a = map NetAddr::IP->new($_), @group_a_masks;
...
my $addr = NetAddr::IP->new( $ip_addr_in );
if ( grep $_->contains( $addr ), @group_a ) {
print "group a";
}
I chose NetAddr::IP over Net::Netmask for IPv6 support.
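For comparison, the same module-based membership test can be sketched in Python with the standard ipaddress module, which handles both IPv4 and IPv6. This is an illustration under assumptions, not a port of the Perl code: make_group, in_group, GROUP_A and GROUP_B are made-up names, only the ranges from the question with valid octets are used (two Group A entries contain octets above 255 and are left out), and strict=False is passed because the listed addresses have host bits set.

```python
import ipaddress

def make_group(cidrs):
    """Parse CIDR strings into network objects.
    strict=False masks off host bits, e.g. 58.176.44.0/21 -> 58.176.40.0/21."""
    return [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]

def in_group(ip: str, group) -> bool:
    """True when the address falls inside any network of the group."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in group)

GROUP_A = make_group(["64.131.14.0/22"])
GROUP_B = make_group(["58.176.44.0/21", "58.177.92.0/19"])

if __name__ == "__main__":
    for ip in ["58.176.45.9", "58.176.48.1", "64.131.15.200"]:
        if in_group(ip, GROUP_A) or in_group(ip, GROUP_B):
            print(ip, "is in A or B")
        else:
            print(ip, "is outside both groups")
```

A /21 or /19 spans several values of the third octet, which is why a library that understands prefix lengths is easier to get right here than a pattern that fixes the first three octets.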
Martin is right, use Net::Netmask. If you really want to use a regex though...
$prefix = "192.168.1.0/25";
$ip1 = "192.168.1.1";
$ip2 = "192.168.1.129";
$prefix =~ s/([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)\/([0-9]+)/$mask=(2**32-1)<<(32-$5); $1<<24|$2<<16|$3<<8|$4/e;
$ip1 =~ s/([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)/$1<<24|$2<<16|$3<<8|$4/e;
$ip2 =~ s/([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)/$1<<24|$2<<16|$3<<8|$4/e;
if (($prefix & $mask) == ($ip1 & $mask)) {
    print "ip1 matches\n";
}
if (($prefix & $mask) == ($ip2 & $mask)) {
    print "ip2 matches\n";
}
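The arithmetic in that last snippet (pack the four octets into a 32-bit integer, build a mask from the prefix length, compare the masked values) can be written without any regex. Below is a small Python sketch of the same idea; ip_to_int and in_prefix are hypothetical helper names, and the input is assumed to be well-formed dotted-quad IPv4 text.

```python
def ip_to_int(ip: str) -> int:
    """Pack a dotted-quad IPv4 string into a 32-bit integer."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def in_prefix(ip: str, prefix: str) -> bool:
    """True when `ip` lies inside the CIDR `prefix`, e.g. 192.168.1.0/25."""
    base, bits = prefix.split("/")
    # Keep the top `bits` bits set and clamp the result to 32 bits.
    mask = (0xFFFFFFFF << (32 - int(bits))) & 0xFFFFFFFF
    return (ip_to_int(ip) & mask) == (ip_to_int(base) & mask)

if __name__ == "__main__":
    print(in_prefix("192.168.1.1", "192.168.1.0/25"))
    print(in_prefix("192.168.1.129", "192.168.1.0/25"))
```

With a /25 the mask covers the top bit of the last octet, so 192.168.1.1 falls inside the prefix while 192.168.1.129 does not.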