Regex to validate port number - regex

I'm using this regex (6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3} to validate port numbers. Somehow this is not working. What is wrong with it? Can anybody point out the problem?

What exactly do you mean by not working?
You could try something like so: ^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$ (obtained from here).
This will make sure that any given string is numeric, has no leading zeros, and is in the range 1 to 65535.
Assuming your regular expression matches the same range, it is missing the start and end anchors (^ and $ respectively), so it would allow other strings besides the actual port.
Update 2 Feb 2022: Fixed the regex to reject values like 00 etc. The updated regex is sourced from the comment below. This regex can be better understood and visualized here: https://www.debuggex.com/r/jjEFZZQ34aPvCBMA

When we search for "how to validate port number" on Google, we unfortunately land here.
However (unless you really have no other choice),
regex is clearly not the way to validate a port number!
One (slightly better) way may be:
step 1: Convert your string into a number, and return FALSE if that fails
step 2: Return TRUE if the number is in the [1-65535] range, and FALSE otherwise
Some reasons why regex is not the right way:
Code readability (it would take a few minutes to understand)
Code robustness (there are various ways to introduce a typo, so a unit test would be required)
Code flexibility (what if port numbers were extended to 64 bits?)
etc.
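The two steps above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the function name is_valid_port is made up for illustration, and it assumes that only plain ASCII digits are acceptable (no sign, whitespace, or separators) and that leading zeros such as "080" are tolerated.

```python
def is_valid_port(text):
    """Return True if text is a decimal integer in the range 1..65535."""
    # Assumption: only ASCII digits are allowed, so reject anything else
    # (signs, spaces, letters, empty string) before converting.
    if not (text.isascii() and text.isdigit()):
        return False
    try:
        number = int(text)  # step 1: convert, fail cleanly if not possible
    except ValueError:
        return False
    return 1 <= number <= 65535  # step 2: range check


print(is_valid_port("8080"))
print(is_valid_port("65536"))
```

If leading zeros should be rejected as well, an extra check such as text == str(number) would do it.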

Number() is the function you want: Number("123a") returns NaN.
parseInt() ignores trailing letters: parseInt("123a") returns 123.
<input type="text" id="txtFld" onblur="if(Number(this.value)>0 && Number(this.value)<65536){alert('valid port number');}" />
jsfiddle

Here is the example I'm using to validate port settings for a firewall. The original answer will match 2 strings. I can only have 1 string match.
(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[1-9](\d){0,3})
To get: 22,24:100,333,678,100:65535 my full validation (that will only return 1 match) is
(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3})(:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}))?(,(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}){1}(:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}))?)*
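For comparison, a list such as 22,24:100,333,678,100:65535 can also be checked by splitting it instead of repeating the port pattern four times. The sketch below is one possible way to do that in Python; the function name parse_port_list and the choice to return (low, high) tuples are assumptions made for illustration, and it does not check that the first port of a range is smaller than the second.

```python
def parse_port_list(text):
    """Parse '22,24:100' into [(22, 22), (24, 100)]; raise ValueError on bad input."""
    ranges = []
    for item in text.split(","):
        parts = item.split(":")
        if len(parts) > 2:
            raise ValueError(f"too many ':' in {item!r}")
        ports = []
        for part in parts:
            # Only ASCII digits with a value in 1..65535 count as a port.
            if not (part.isascii() and part.isdigit()) or not 1 <= int(part) <= 65535:
                raise ValueError(f"invalid port: {part!r}")
            ports.append(int(part))
        # A single port is stored as a range of length one.
        ranges.append((ports[0], ports[-1]))
    return ranges


print(parse_port_list("22,24:100,333,678,100:65535"))
```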

A more strict approach is to have a regex matching all numbers up to 5 digits
with the following string:
*(^[1-9]{1}$|^[0-9]{2,4}$|^[0-9]{3,4}$|^[1-5]{1}[0-9]{1}[0-9]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-4]{1}[0-9]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-5]{1}[0-4]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-5]{1}[0-5]{1}[0-3]{1}[0-5]{1}$)*

The accepted answer by npinti is not right. It will not allow port number 1000 to be entered, for example. For me, this one (not nice, I'm a beginner) works correctly:
/^((((([1-9])|([1-9][0-9])|([1-9][0-9][0-9])|([1-9][0-9][0-9][0-9])|([1-6][0-5][0-5][0-3][0-5])))))$/

"^((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0-9]{1,4}))$"
It will allow everything between 0-65535 inclusive.

Here is single port regex validation that excludes ports that start with 0
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
Here is validation for port range (ex. 1111-1111)
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])(-([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5]))?$
link:
https://github.com/findhit/proxywrap/issues/13

Landed here as well, searching specifically for REGEX to validate port number.
I see the approved solution was not fixed yet to cover all scenarios ( eg: 007 port, and others ) and solutions from other sites not updated either (eg).
Reached same minimal solution as saber tabatabaee yazdi, that should cover the 1-65535 range properly:
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
Enjoy !

@npinti's answer allows leading zeros in the port number, and port 0 means "pick any available port", so I would exclude that too. The regex becomes
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
If you want to allow port 0 then
^(0|[1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
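Range regexes like these are easy to get subtly wrong (one quantifier off and values above 65535 slip through), so it can be worth brute-forcing a candidate pattern against every number. The Python sketch below does that for the 1-65535 pattern quoted in this thread; the helper name mismatches and the upper limit of 100000 are arbitrary choices for illustration.

```python
import re

PORT_RE = re.compile(
    r"^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}"
    r"|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$"
)


def mismatches(pattern, low=1, high=65535, limit=100000):
    """Return every n in range(limit) where the regex disagrees with low <= n <= high."""
    bad = []
    for n in range(limit):
        accepted = pattern.match(str(n)) is not None
        if accepted != (low <= n <= high):
            bad.append(n)
    return bad


print(mismatches(PORT_RE))  # an empty list means the pattern and the range agree
```

This only covers canonical decimal strings; inputs with leading zeros or other characters need separate test cases.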

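The solution (a regex quantifier such as \d{1,65536} limits the number of digits, not the value, so parse the input and compare the number instead):
Dim port As Integer
Dim isValidPort As Boolean = Integer.TryParse(sInput, port) AndAlso port >= 1 AndAlso port <= 65535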

If the variable is an integer between 1 and 65535 (inclusive) then...
if [[ "$port" =~ ^[0-9]+$ && $port -ge 1 && $port -le 65535 ]]; then

^((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0][0-9]{1,4})|([0-9]{1,4}))$
I have tested the above regex with JUnit, running a for loop from 0 to 65535.
Ex: 00001 - 65535 with leading zeros
1 - 65535 without leading zeros
Breakdown:
(6553[0-5]) : 65530-65535
(655[0-2][0-9]) : 65500-65529
(65[0-4][0-9]{2}): 65000-65499
(6[0-4][0-9]{3}) : 60000-64999
([1-5][0-9]{4}) : 10000-59999
([0-5]{0,5}) : 00000-55555 (for leading Zeros)
([0][0-9]{1,4}) : 00000-09999 (for leading Zeros)
([0-9]{1,4}) : 0000-9999 (for leading Zeros)

Related

Number groups with 0 as delimiter

There's a long natural number that can be grouped to smaller numbers by the 0 (zero) delimiter.
Example: 4201100370880
This would divide to Group1: 42, Group2: 110, Group3: 370880
There are 3 groups; groups never start with 0 and are at least 1 char long. Also, the last group is "as is", meaning it's not terminated by a trailing 0.
This is what I came up with, but it only works for certain inputs (like 420110037880):
(\d+)0([1-9][0-9]{1,2})0([1-9]\d+)
This shows I'm attempting to declare the 2nd group's length to min2 max3, but I'm thinking the correct solution should not care about it. If the delimiter was non-numeric I could probably tackle it, but I'm stumped.
All right, factoring in comment information, try splitting on a regex (this may vary based on what language you're using - .split(/.../) in JavaScript, preg_split in PHP, etc.)
The regex you want to split on is: 0(?!0). This translates to "a zero that is not followed by a zero". I believe this will solve your splitting problem.
If your language allows a limit parameter (PHP does), set it to 3. If not, you will need to do something like this (JavaScript):
result = input.split(/0(?!0)/);
result = result.slice(0,2).concat(result.slice(2).join("0"));
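Python's re.split accepts a limit directly through its maxsplit argument, so the same idea can be sketched there without the slice-and-join step. The function name split_groups and the default of three groups are assumptions taken from the example in the question.

```python
import re


def split_groups(number, groups=3):
    """Split on a zero that is not followed by another zero, at most groups - 1 times."""
    return re.split(r"0(?!0)", number, maxsplit=groups - 1)


print(split_groups("4201100370880"))  # ['42', '110', '370880']
```

Because the split stops after two delimiters, the last group keeps any later zeros, including a trailing one.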
The following one should suit your needs:
^(.*?)0(?!0)(.*?)0(?!0)(.*)$
Visualization by Debuggex
The following regex works:
(\d+?)0(?!0) with the g modifier
Demo: http://regex101.com/r/rS4dE5
For only three matches, you can do:
(\d+?)0(?!0)(\d+?)0(?!0)(.*)

Regular Expression (RegEx) For Hours with Increments

I need to only accept input that meets these rules...
0.25-24
Increments of .25 (.00, .25, .50, .75)
First digit doesn't have to be required.
Would like trailing zeros to be optional.
Examples of some valid entries:
0.25
.50
.5
1
1.0
5.50
23.75
24 (max allowed)
UPDATE: nothing at all, null/blank, should also be accepted as valid
Example of some invalid entries:
0
.0
.00
0.0
0.00
24.25
-1
I understand that RegEx is a pattern matching language and therefore not great for ranges or less-than and greater-than checks. So checking that a value is less than or equal to 24 means I'd have to find a pattern, right? There are 24 possible patterns, which would make this a long RegEx; am I understanding this correctly? I could use ColdFusion to make sure the value is in the 0-24 range. It's not the end of the world if I have to use ColdFusion for this part, but it'd be nice to get it all into the RegEx if that doesn't make it too long. This is what I have so far:
^\d{0,2}((\.(0|00|25|5|50|75))?)$
http://regex101.com/r/iS7zM3
This handles pretty much all of it except for the 0-24 range check or the check for just a zero. I'll keep plugging away at it but any help would be appreciated. Thanks!
Change \d{0,2} to (?:1[0-9]?|2[0-4]?|[3-9])? and it'll match from 1 to 24 (or nothing).
You can also simplify the second part to (?:\.(?:00?|25|50?|75))? - you could go further to (?:\.(?:[05]0?|[27]5))? but that might obfuscate the intent a bit too far.
To exclude 24.25 you could perhaps use a negative lookahead (?!24\.[^0]) to prevent anything other than 24.0 or 24.00, but it's probably simpler to just exclude 24 from the main pattern and include a specific check for 24/24.0/24.00 at the start:
(?x)
# checks for 24
^24$|^24\.00?$
|
# integer part
^
(?:1[0-9]?|2[0-3]?|[3-9]|0(?=\.[^0])|(?=\.[^0]))
# decimal part
(?:\.(?:00?|25|50?|75))?
$
That also includes a check for 0(?=\.[^0]) which uses a positive lookahead to only allow an initial 0 if the next char is a . followed by a non-zero (so 0.0 and 0.00 isn't allowed).
The (?x) flag allows whitespace to be ignored, allowing readable regex in your code - obviously preferable to squashing it all onto a single line - and also enables the use of # to start line comments to explain parts of a pattern. (Literal whitespaces and hashes can be escaped with backslash, or encoded via e.g. \x23 for hash.)
For comparison, here's a pure-CFML way of doing it:
IsNumeric(Num)
AND Num GT 0
AND Num LTE 24
AND NOT find('.',Num*4)
Now, are you really sure it's better as a regex...
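The same arithmetic check can be sketched in Python with the decimal module, which avoids binary floating-point surprises. This is only an illustration under stated assumptions: the name is_valid_hours is made up, blank input is accepted because the question's update asks for that, and Decimal parsing is more lenient than the regexes above (it also accepts forms such as "1e1" or "1.250").

```python
from decimal import Decimal, InvalidOperation


def is_valid_hours(text):
    """Accept blank input, or a value in (0, 24] that is a multiple of 0.25."""
    if text.strip() == "":
        return True  # nothing at all counts as valid, per the question
    try:
        value = Decimal(text)
    except InvalidOperation:
        return False
    if not value.is_finite():  # rules out NaN and Infinity
        return False
    # A multiple of 0.25 becomes a whole number when multiplied by 4.
    return 0 < value <= 24 and (value * 4) % 1 == 0


print([is_valid_hours(s) for s in ["0.25", ".5", "24", "24.25", "0.0"]])
```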
You could try this regex (broken down):
^
(?:
(?:[1-9]|1\d|2[0-3])(?:\.(?:[05]0?|[27]5))? # Non-zeros with optional decimal
|
0?(?:\.(?:50?|[27]5)) # Decimals under 1
|
24(?:\.00?)? # The maximum
)
$
In one line:
^(?:(?:[1-9]|1\d|2[0-3])(?:\.(?:[05]0?|[27]5))?|0?(?:\.(?:50?|[27]5))|24(?:\.00?)?)$
regex101 demo
^([0-1]?[0-9]|2[0-4])((\.(0|00|25|5|50|75))?)$
This means the one's place can be 0-9 if the tens place is missing, a 0, or 1.
If the tens place is a 2, then the ones place can be 0-4.
The second part is great, it's simple and readable too. It has an extra set of parens though that can be removed, reducing it to this:
^([0-1]?[0-9]|2[0-4])(\.(0|00|25|5|50|75))?$

Regular expression prices

I'm trying to find a valid price validation for my needs..
Valid input format (xxx means no maximum length - 0000 means 4 decimal places at maximum):
15,0000
15.0000
150.0000
150,0000
xxxxxxxxxxxx.0000
xxxxxxxxxxxx,0000
15,00
15,1
15.00
15.1
Invalid input format (basically everything that starts by 0):
01.0000
01.00
01
My regular expression so far: ^\$?[1-9][1-9,]*[0-9]\.?[0-9]{0,2}$
Edit 1: Changed my regex for this one: ^\$?[1-9]*[1-9]((\,)|(\.))?[0-9]{0,4}$ but now I need to be able to add 150000000 and it only allows me 150000
EDIT: just saw that you updated the question and added 0 as a valid input. I'll see if I can add that.
How about:
^([1-9].*[,\.][0-9]*)$
This will work on the examples above.
But be careful with input like 15x,001
See it in action
Okay this one seems okay to me
^[^0]\d+(\.|\,)?[0-9]{0,4}$
checked here http://rubular.com/r/97Ra9VS9h4
and yes one more thing if you want to check for one digit numbers also like 1,2 etc
then you can just replace the + with * like this ^[^0]\d*(\.|\,)?[0-9]{0,4}$
What about this one:
^\$?[1-9][0-9]*(,|\.)[0-9]{1,4}$
The first part of the regex makes sure the price doesn't start with a zero.
Then any digits are allowed, zero or more of them.
Then there must be a comma or a point.
Finally, one to four digits are allowed.
^[1-9][0-9]*([.,][0-9]{1,4})?$
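To see how the last pattern behaves on the examples from the question, a small Python check can help. The sketch uses re.fullmatch instead of the ^ and $ anchors, and the name is_valid_price is an assumption for illustration.

```python
import re

# No leading zero, any number of integer digits, and an optional
# '.' or ',' followed by one to four decimals.
PRICE_RE = re.compile(r"[1-9][0-9]*(?:[.,][0-9]{1,4})?")


def is_valid_price(text):
    return PRICE_RE.fullmatch(text) is not None


for sample in ["15,0000", "150.0000", "15.1", "150000000", "01.00", "01"]:
    print(sample, is_valid_price(sample))
```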

Validating IPv4 addresses with regexp

I've been trying to get an efficient regex for IPv4 validation, but without much luck. It seemed at one point I had had it with (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}, but it produces some strange results:
$ grep --version
grep (GNU grep) 2.7
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.1
192.168.1.1
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.255
192.168.1.255
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.255.255
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.2555
192.168.1.2555
I did a search to see if this had already been asked and answered, but other answers appear to simply show how to determine 4 groups of 1-3 numbers, or do not work for me.
Best for Now (43 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}$
This version shortens things by another 6 characters while not making use of the negative lookahead, which is not supported in some regex flavors.
Newest, Shortest, Least Readable Version (49 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)(\.(?!$)|$)){4}$
The [0-9] blocks can be substituted by \d in 2 places - makes it a bit less readable, but definitely shorter.
Even Newer, even Shorter, Second least readable version (55 chars)
^((25[0-5]|(2[0-4]|1[0-9]|[1-9]|)[0-9])(\.(?!$)|$)){4}$
This version looks for the 250-5 case, after that it cleverly ORs all the possible cases for 200-249 100-199 10-99 cases. Notice that the |) part is not a mistake, but actually ORs the last case for the 0-9 range. I've also omitted the ?: non-capturing group part as we don't really care about the captured items, they would not be captured either way if we didn't have a full-match in the first place.
Old and shorter version (less readable) (63 chars)
^(?:(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(?!$)|$)){4}$
Older (readable) version (70 chars)
^(?:(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])(\.(?!$)|$)){4}$
It uses the negative lookahead (?!) to remove the case where the ip might end with a .
Alternative answer, using some of the newer techniques (71 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.){3}(25[0-5]|(2[0-4]|1\d|[1-9]|)\d)$
Useful in regex implementations where lookaheads are not supported
Oldest answer (115 chars)
^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$
I think this is the most accurate and strict regex, it doesn't accept things like 000.021.01.0. it seems like most other answers here do and require additional regex to reject cases similar to that one - i.e. 0 starting numbers and an ip that ends with a .
You've already got a working answer but just in case you are curious what was wrong with your original approach, the answer is that you need parentheses around your alternation otherwise the (\.|$) is only required if the number is less than 200.
'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b'
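(The added parentheses are the second opening parenthesis and the one that closes the alternation, just before (\.|$).)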
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Accept:
127.0.0.1
192.168.1.1
192.168.1.255
255.255.255.255
0.0.0.0
1.1.1.01 # This is an invalid IP address!
Reject:
30.168.1.255.1
127.1
192.168.1.256
-1.2.3.4
1.1.1.1.
3...3
Try online with unit tests: https://www.debuggex.com/r/-EDZOqxTxhiTncN6/1
IPv4 address (accurate capture)
Matches 0.0.0.0 through 255.255.255.255, but does capture invalid addresses such as 1.1.000.1
Use this regex to match IP numbers with accuracy.
Each of the 4 numbers is stored into a capturing group, so you can access them for further processing.
\b
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\b
taken from JGsoft RegexBuddy library
Edit: this (\.|$) part seems weird
I think many people reading this post will be looking for simpler regular expressions, even if they match some technically invalid IP addresses. (And, as noted elsewhere, regex probably isn't the right tool for properly validating an IP address anyway.)
Remove ^ and, where applicable, replace $ with \b, if you don't want to match the beginning/end of the line.
Basic Regular Expression (BRE) (tested on GNU grep, GNU sed, and vim):
/^[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+$/
Extended Regular Expression (ERE):
/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/
or:
/^([0-9]+(\.|$)){4}/
Perl-compatible Regular Expression (PCRE) (tested on Perl 5.18):
/^\d+\.\d+\.\d+\.\d+$/
or:
/^(\d+(\.|$)){4}/
Ruby (tested on Ruby 2.1):
Although supposed to be PCRE, Ruby for whatever reason allowed this regex not allowed by Perl 5.18:
/^(\d+[\.$]){4}/
My tests for all these are online here.
I was in search of something similar for IPv4 addresses - a regex that also stopped commonly used private ip addresses from being validated (192.168.x.y, 10.x.y.z, 172.16.x.y) so used negative look aheads to accomplish this:
(?!(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.).*)
(?!255\.255\.255\.255)(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|[1-9])
(\.(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|\d)){3}
(These should be on one line of course, formatted for readability purposes on 3 separate lines)
Debuggex Demo
It may not be optimised for speed, but works well when only looking for 'real' internet addresses.
Things that will (and should) fail:
0.1.2.3 (0.0.0.0/8 is reserved for some broadcasts)
10.1.2.3 (10.0.0.0/8 is considered private)
172.16.1.2 (172.16.0.0/12 is considered private)
172.31.1.2 (same as previous, but near the end of that range)
192.168.1.2 (192.168.0.0/16 is considered private)
255.255.255.255 (reserved broadcast is not an IP)
.2.3.4
1.2.3.
1.2.3.256
1.2.256.4
1.256.3.4
256.2.3.4
1.2.3.4.5
1..3.4
IPs that will (and should) work:
1.0.1.0 (China)
8.8.8.8 (Google DNS in USA)
100.1.2.3 (USA)
172.15.1.2 (USA)
172.32.1.2 (USA)
192.167.1.2 (Italy)
Provided in case anybody else is looking for validating 'Internet IP addresses not including the common private addresses'
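Where a programming language is available, the standard library can often do this classification without a hand-written pattern. The sketch below uses Python's ipaddress module; the function name is_public_ipv4 is made up, and note that is_global follows the IANA special-purpose registry, so it also excludes ranges the regex above does not mention (loopback and link-local, for example).

```python
import ipaddress


def is_public_ipv4(text):
    """Return True for a syntactically valid, globally routable IPv4 address."""
    try:
        address = ipaddress.IPv4Address(text)
    except ValueError:  # malformed input such as "1.2.3.256" or "1..3.4"
        return False
    return address.is_global


for sample in ["8.8.8.8", "172.15.1.2", "10.1.2.3", "192.168.1.2", "1.2.3.256"]:
    print(sample, is_public_ipv4(sample))
```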
Here is a better one with passing/failing IPs attached
/^((?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])[.]){3}(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/
Accepts
127.0.0.1
192.168.1.1
192.168.1.255
255.255.255.255
10.1.1.1
0.0.0.0
Rejects
1.1.1.01
30.168.1.255.1
127.1
192.168.1.256
-1.2.3.4
1.1.1.1.
3...3
192.168.1.099
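Accept and reject lists like these translate directly into a small regression check. The Python sketch below rebuilds the same pattern from a single octet fragment and reports any address that lands on the wrong side; the names OCTET, IPV4_RE and failures are chosen here for illustration.

```python
import re

# One octet: 0-255 without leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])"
IPV4_RE = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}")

ACCEPT = ["127.0.0.1", "192.168.1.1", "192.168.1.255",
          "255.255.255.255", "10.1.1.1", "0.0.0.0"]
REJECT = ["1.1.1.01", "30.168.1.255.1", "127.1", "192.168.1.256",
          "-1.2.3.4", "1.1.1.1.", "3...3", "192.168.1.099"]


def failures():
    """Return the addresses that the pattern classifies incorrectly."""
    wrong = [ip for ip in ACCEPT if not IPV4_RE.fullmatch(ip)]
    wrong += [ip for ip in REJECT if IPV4_RE.fullmatch(ip)]
    return wrong


print(failures())  # an empty list means both lists behave as described
```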
Above answers are valid but what if the ip address is not at the end of line and is in between text.. This regex will even work on that.
code: '\b((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\.)){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\b'
input text file:
ip address 0.0.0.0 asfasf
sad sa 255.255.255.255 cvjnzx
zxckjzbxk 999.999.999.999 jshbczxcbx
sjaasbfj 192.168.0.1 asdkjaksb
oyo 123241.24121.1234.3423 yo
yo 0000.0000.0000.0000 y
aw1a.21asd2.21ad.21d2
yo 254.254.254.254 y0
172.24.1.210 asfjas
200.200.200.200
000.000.000.000
007.08.09.210
010.10.30.110
output text:
0.0.0.0
255.255.255.255
192.168.0.1
254.254.254.254
172.24.1.210
200.200.200.200
'''
This code works for me, and is as simple as that.
Here I have taken the value of ip and I am trying to match it with regex.
import re

ip = "25.255.45.67"
# Escape the dots and require the whole string to match.
op = re.fullmatch(r'(\d+)\.(\d+)\.(\d+)\.(\d+)', ip)
# op is None when the string is not four dot-separated numbers.
if op and all(int(octet) <= 255 for octet in op.groups()):
    print("valid ip")
else:
    print("Not valid")
Above condition checks if the value exceeds 255 for all the 4 octets then it is not a valid. But before applying the condition we have to convert them into integer since the value is in a string.
group(0) prints the matched output, Whereas group(1) prints the first matched value and here it is "25" and so on.
'''
/^(?:(25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)\.){3}(?1)$/m
Demo
I managed to construct a regex from all other answers.
(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)(\.(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)){3}
This is a little longer than some but this is what I use to match IPv4 addresses. Simple with no compromises.
^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])$
For number from 0 to 255 I use this regex:
(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))
Above regex will match integer number from 0 to 255, but not match 256.
So for IPv4 I use this regex:
^(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})$
It is in this structure: ^(N)((\.(N)){3})$ where N is the regex used to match number from 0 to 255.
This regex will match IP like below:
0.0.0.0
192.168.1.2
but not those below:
10.1.0.256
1.2.3.
127.0.1-2.3
For IPv4 CIDR (Classless Inter-Domain Routing) I use this regex:
^(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))$
It is in this structure: ^(N)((\.(N)){3})\/M$ where N is the regex used to match number from 0 to 255, and M is the regex used to match number from 0 to 32.
This regex will match CIDR like below:
0.0.0.0/0
192.168.1.2/32
but not those below:
10.1.0.256/16
1.2.3./24
127.0.0.1/33
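The ^(N)((\.(N)){3})\/M$ structure is easier to keep correct when N and M are written once and the full pattern is assembled from them. A minimal Python sketch of that idea follows; the names OCTET, PREFIX and is_cidr are assumptions, and non-capturing groups are used instead of the capturing ones above.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"  # N: 0..255
PREFIX = r"(?:3[0-2]|[12][0-9]|[0-9])"                          # M: 0..32

# N, then three times ".N", then "/M"; fullmatch supplies the anchoring.
CIDR_RE = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}/{PREFIX}")


def is_cidr(text):
    return CIDR_RE.fullmatch(text) is not None


for sample in ["0.0.0.0/0", "192.168.1.2/32", "10.1.0.256/16", "127.0.0.1/33"]:
    print(sample, is_cidr(sample))
```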
And for list of IPv4 CIDR like "10.0.0.0/16", "192.168.1.1/32" I use this regex:
^("(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))")((,([ ]*)("(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))"))*)$
It is in this structure: ^("C")((,([ ]*)("C"))*)$ where C is the regex used to match CIDR (like 0.0.0.0/0).
This regex will match list of CIDR like below:
"10.0.0.0/16","192.168.1.2/32", "1.2.3.4/32"
but not those below:
"10.0.0.0/16" 192.168.1.2/32 "1.2.3.4/32"
Maybe it might get shorter but for me it is easy to understand so fine by me.
Hope it helps!
IPv4 address is a very complicated thing.
Note: Indentation and lining are only for illustration purposes and do not exist in the real RegEx.
\b(
((
(2(5[0-5]|[0-4][0-9])|1[0-9]{2}|[1-9]?[0-9])
|
0[Xx]0*[0-9A-Fa-f]{1,2}
|
0+[1-3]?[0-9]{1,2}
)\.){1,3}
(
(2(5[0-5]|[0-4][0-9])|1[0-9]{2}|[1-9]?[0-9])
|
0[Xx]0*[0-9A-Fa-f]{1,2}
|
0+[1-3]?[0-9]{1,2}
)
|
(
[1-3][0-9]{1,9}
|
[1-9][0-9]{,8}
|
(4([0-1][0-9]{8}
|2([0-8][0-9]{7}
|9([0-3][0-9]{6}
|4([0-8][0-9]{5}
|9([0-5][0-9]{4}
|6([0-6][0-9]{3}
|7([0-1][0-9]{2}
|2([0-8][0-9]{1}
|9([0-5]
))))))))))
)
|
0[Xx]0*[0-9A-Fa-f]{1,8}
|
0+[1-3]?[0-7]{,10}
)\b
These IPv4 addresses are validated by the above RegEx.
127.0.0.1
2130706433
0x7F000001
017700000001
0x7F.0.0.01 # Mixed hex/dec/oct
000000000017700000001 # Have as many leading zeros as you want
0x0000000000007F000001 # Same as above
127.1
127.0.1
These are rejected.
256.0.0.1
192.168.1.099 # 099 is not a valid number
4294967296 # UINT32_MAX + 1
0x100000000
020000000000
(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))
Test to find matches in text,
https://regex101.com/r/9CcMEN/2
Following are the rules defining the valid combinations in each number of an IP address:
Any one- or two-digit number.
Any three-digit number beginning with 1.
Any three-digit number beginning with 2 if the second digit is 0 through 4.
Any three-digit number beginning with 25 if the third digit is 0 through 5.
Let's start with (((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.), a set of four nested subexpressions, and we’ll look at them in reverse order. (\d{1,2}) matches any one- or two-digit number or numbers 0 through 99. (1\d{2}) matches any three-digit number starting with 1 (1 followed by any two digits), or numbers 100 through 199. (2[0-4]\d) matches numbers 200 through 249. (25[0-5]) matches numbers 250 through 255. Each of these subexpressions is enclosed within another subexpression with an | between each (so that one of the four subexpressions has to match, not all). After the range of numbers comes \. to match ., and then the entire series (all the number options plus \.) is enclosed into yet another subexpression and repeated three times using {3}. Finally, the range of numbers is repeated (this time without the trailing \.) to match the final IP address number. By restricting each of the four numbers to values between 0 and 255, this pattern can indeed match valid IP addresses and reject invalid addresses.
Excerpt From: Ben Forta. “Learning Regular Expressions.”
If neither a character is wanted at the beginning of IP address nor at the end, ^ and $ metacharacters ought to be used, respectively.
^(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))$
Test to find matches in text,
https://regex101.com/r/uAP31A/1
Valid regex for IPV4 address for Java
^((\\d|[1-9]\\d|[0-1]\\d{2}|2[0-4]\\d|25[0-5])[\\.]){3}(\\d|[1-9]\\d|[0-1]\\d{2}|2[0-4]\\d|25[0-5])$
Find a valid ip address in the text is a very difficult problem
I have a regexp, that match (extract) valid ip addresses from strings in text files.
my regexp
\b(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.)(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2}(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\b
\b word boundary
(?: - means start non capturing group
^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.) - string must start with first right octet with dot char
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9]) - find first right octet - (first octet can not start with - 0)
(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2} - find next right two octets with dot string
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\b - string must end with right fourth octet (now zero char is allowed)
But this IP regexp produces a few false positive matches:
https://regexr.com/69dk7
Finding or extracting valid IP addresses from a text file with a regexp alone is impossible. Without checking additional conditions you will always get false positive matches.
Solution
I wrote a Perl one-liner to extract IP addresses from text files. It applies these conditions:
when the IP address is at the beginning of the line, the next char is one or more whitespace chars (space, tab, new line...)
when the IP address is at the end of the line, the next char is the new line and the IP address is preceded by one or more whitespace chars
in the middle of the text, the IP address has one or more whitespace chars before and after it
The consequence is that Perl does not match strings like https://84.25.74.125 and other URI strings, or an IP address at the end of a line followed by a dot char. But it finds any valid IP address in the text.
perl one liner solution:
$ cat ip.txt | perl -lane 'use warnings; use strict; for my $i (@F){if ($i =~/^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.)(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2}(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$/) { print $i; } }'
36.42.84.233
158.22.45.0
36.84.84.233
12.0.5.4
1.25.45.36
255.3.6.5
4.255.2.1
127.0.0.1
127.0.0.5
126.0.0.1
testing text file:
$ cat ip.txt
36.42.84.233 stop 158.22.45.0 and 56.32.58.2.
25.36.84.84abc and abc2.4.8.2 is error.
1.2.3.4_
But false positive is 2.2.2.2.2.2.2.2 or 1.1.1.1.1
http://23.54.212.1:80
https://89.35.248.1/abc
36.84.84.233 was 25.36.58.4/abc/xyz&158.133.26.4&another_var
and 42.27.0.1:8333 in http://212.158.45.2:26
0.25.14.15 ip can not start with zero
2.3.0
abc 12.0.5.4
1.25.45.36
12.05.2.5
256.1.2.5
255.3.6.5
4.255.2.1
4.256.5.6
127.0.0.1 is localhost.
this ip 127.0.0.5 is not localhost
126.0.0.1
Appendix
For people from other planets, for whom the strings 2130706433, 127.1, 24.005.04.52 are valid IP addresses, I have a message: try to find a solution yourself!
Considering some variants suggested, \d and \b may not be supported. Hence, just in case:
IPv4 address
^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)$
Test: https://debuggex.com/r/izHiog3KkYztRMSJ
With subnet mask :
^$|([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])((/([01]?\\d\\d?|2[0-4]\\d|25[0-5]))?)$
I tried to make it a bit simpler and shorter.
^(([01]?\d{1,2}|2[0-4]\d|25[0-5])\.){3}([01]?\d{1,2}|2[0-4]\d|25[0-5])$
If you are looking for java/kotlin:
^(([01]?\\d{1,2}|2[0-4]\\d|25[0-5])\\.){3}([01]?\\d{1,2}|2[0-4]\\d|25[0-5])$
If someone wants to know how it works here is the explanation. It's really so simple. Just give it a try :p :
1. ^.....$: '^' is the starting and '$' is the ending.
2. (): These are called a group. You can think of like "if" condition groups.
3. |: 'Or' condition - as same as most of the programming languages.
4. [01]?\d{1,2}: '[01]' indicates one of the number between 0 and 1. '?' means '[01]' is optional. '\d' is for any digit between 0-9 and '{1,2}' indicates the length can be between 1 and 2. So here the number can be 0-199.
5. 2[0-4]\d: '2' is just plain 2. '[0-4]' means a number between 0 to 4. '\d' is for any digit between 0-9. So here the number can be 200-249.
6. 25[0-5]: '25' is just plain 25. '[0-5]' means a number between 0 to 5. So here the number can be 250-255.
7. \.: It's just a plain '.' (dot) for separating the numbers.
8. {3}: It means exactly 3 repetitions of the previous group inside '()'.
9. ([01]?\d{1,2}|2[0-4]\d|25[0-5]): Exactly the same as points 4-6
Mathematically it is like:
(0-199 OR 200-249 OR 250-255).{Repeat exactly 3 times}(0-199 OR 200-249 OR 250-255)
So, as you can see normally this is the pattern for the IP addresses. I hope it helps to understand Regular Expression a bit. :p
To validate any IP address in the valid range 0.0.0.0 to 255.255.255.255 can be written in very simple form as below.
((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])
const char*ipv4_regexp = "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
I adapted the regular expression taken from JGsoft RegexBuddy library to C language (regcomp/regexec) and I found out it works but there's a little problem in some OS like Linux.
That regular expression accepts ipv4 address like 192.168.100.009 where 009 in Linux is considered an octal value so the address is not the one you thought.
I changed that regular expression as follows:
const char* ipv4_regex = "\\b(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
Using that regular expression, 192.168.100.009 is no longer a valid IPv4 address, while 192.168.100.9 is OK.
I modified a regular expression for multicast address too and it is the following:
const char* mcast_ipv4_regex = "\\b(22[4-9]|23[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
I think you have to adapt the regular expression to the language you're using to develop your application.
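In Python, for instance, the multicast check can be cross-checked without any regex. This sketch leans on the standard ipaddress module, which is a different technique from the patterns above; is_ipv4_multicast is a made-up helper name.

```python
import ipaddress

def is_ipv4_multicast(text: str) -> bool:
    try:
        addr = ipaddress.IPv4Address(text)
    except ValueError:
        # AddressValueError is a subclass of ValueError.
        return False
    # True for addresses in 224.0.0.0/4, i.e. first octet 224-239.
    return addr.is_multicast

if __name__ == "__main__":
    for candidate in ["224.0.0.1", "239.255.255.255", "240.0.0.1", "10.0.0.1"]:
        print(candidate, is_ipv4_multicast(candidate))
```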
Here is an example in Java:
package utility;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class NetworkUtility {
    // Octets: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
    private static String ipv4RegExp = "\\b(?:(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\.){3}(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\b";
    // First octet 224-239, followed by three ordinary octets.
    private static String ipv4MulticastRegExp = "2(?:2[4-9]|3\\d)(?:\\.(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]\\d?|0)){3}";
    public NetworkUtility() {
    }
    public static boolean isIpv4Address(String address) {
        Pattern pattern = Pattern.compile(ipv4RegExp);
        Matcher matcher = pattern.matcher(address);
        // matches() requires the whole input to match the pattern.
        return matcher.matches();
    }
    public static boolean isIpv4MulticastAddress(String address) {
        Pattern pattern = Pattern.compile(ipv4MulticastRegExp);
        Matcher matcher = pattern.matcher(address);
        return matcher.matches();
    }
}
-bash-3.2$ echo "191.191.191.39" | egrep '(^|[^0-9])((2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)\.){3}(2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)($|[^0-9])'
>> 191.191.191.39
This is a DFA that matches the entire address space (including broadcasts, etc.) and nothing else.
I think this one is the shortest.
^(([01]?\d\d?|2[0-4]\d|25[0-5])\.){3}([01]?\d\d?|2[0-4]\d|25[0-5])$
I found this sample very useful; it also allows different IPv4 notations.
Sample code using Python:
import re
def is_valid_ipv4(ip4):
    """Validates IPv4 addresses."""
    pattern = re.compile(r"""
        ^
        (?:
          # Dotted variants:
          (?:
            # Decimal 1-255 (no leading 0's)
            [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
          |
            0x0*[0-9a-f]{1,2}  # Hexadecimal 0x0 - 0xFF (possible leading 0's)
          |
            0+[1-3]?[0-7]{0,2} # Octal 0 - 0377 (possible leading 0's)
          )
          (?:                  # Repeat 0-3 times, separated by a dot
            \.
            (?:
              [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
            |
              0x0*[0-9a-f]{1,2}
            |
              0+[1-3]?[0-7]{0,2}
            )
          ){0,3}
        |
          0x0*[0-9a-f]{1,8}    # Hexadecimal notation, 0x0 - 0xffffffff
        |
          0+[0-3]?[0-7]{0,10}  # Octal notation, 0 - 037777777777
        |
          # Decimal notation, 1-4294967295:
          429496729[0-5]|42949672[0-8]\d|4294967[01]\d\d|429496[0-6]\d{3}|
          42949[0-5]\d{4}|4294[0-8]\d{5}|429[0-3]\d{6}|42[0-8]\d{7}|
          4[01]\d{8}|[1-3]\d{0,9}|[4-9]\d{0,8}
        )
        $
    """, re.VERBOSE | re.IGNORECASE)
    # "<>" is Python 2 only; "is not None" works in Python 2 and 3.
    return pattern.match(ip4) is not None
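^(?!\.)((\.|^)(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)){4}$
The (?!\.) lookahead stops a leading dot from standing in for the start of the string. This regex will not accept octets with a leading zero, such as
08.8.8.8 or 8.08.8.8 or 8.8.08.8 or 8.8.8.08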
Finds valid IP addresses as long as the IP is not directly preceded or followed by a digit. Four named groups are created: $+{first}.$+{second}.$+{third}.$+{forth}
Find String:
#any valid IP address
(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
#only valid private IP address RFC1918
(?<IP>(?<![\d])(?:(?:(?<first>10)[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5])))|(?:(?<first>172)[\.](?<second>(?:1[6-9])|(?:2[0-9])|(?:3[0-1])))|(?:(?<first>192)[\.](?<second>168)))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
Notepad++ Replace String Option 1: Replaces the whole IP (NO Change):
$+{IP}
Notepad++ Replace String Option 2: Replaces the whole IP octet by octet (NO Change)
$+{first}.$+{second}.$+{third}.$+{forth}
Notepad++ Replace String Option 3: Replaces the whole IP octet by octet (replace 3rd octet value with 0)
$+{first}.$+{second}.0.$+{forth}
NOTE: The above will match any valid IP, including 255.255.255.255 for example, and change it to 255.255.0.255, which is wrong and of course not very useful.
However, by replacing a portion of each octet with an actual value, you can build your own find and replace, which is actually useful for amending IPs in text files:
For example, replace the first octet group of the original Find regex above:
(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<first>10)
and
(?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<second>216)
and you are now matching only addresses starting with 10.216
Find on Notepad++:
(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
You could still perform Replace using back-reference groups in exactly the same fashion as before.
You can get an idea of how the above matches from the test file below:
cat ipv4_validation_test.txt
Full Match:
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
Partial Match (IP Extraction from line)
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd
NO Match
1.1.1.01
3...3
127.1.
192.168.1..
192.168.1.256
da11.15.112.2554adfdsfds
da311.15.112.255adfdsfds
Using grep you can see the results below:
grep -oP '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0
1.2.3.4
10.216.24.23
11.15.112.255
10.216.24.23
grep -P '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd
#matching ip addresses starting with 10.216
grep -oP '(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
10.216.1.212
10.216.24.23
10.216.24.23
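The same "no digit directly before or after" extraction can be sketched in Python. This is an illustration rather than a port of the Notepad++ setup: it drops the named groups, and OCTET, IP_IN_TEXT and extract_ips are hypothetical names.

```python
import re

# One octet, 0-255, with no leading zeros.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)"
# The lookbehind and lookahead refuse a digit on either side of the address.
IP_IN_TEXT = re.compile(rf"(?<!\d){OCTET}(?:\.{OCTET}){{3}}(?!\d)")

def extract_ips(line: str) -> list:
    # With no capturing groups, findall returns the full matches.
    return IP_IN_TEXT.findall(line)

if __name__ == "__main__":
    for line in ["sfds10.216.24.23kgfd", "30.168.1.0.1", "192.168.1.256"]:
        print(line, "->", extract_ips(line))
```

On the sample lines this gives the same extractions as grep -oP above: 10.216.24.23 from the first line, 30.168.1.0 from the second, and nothing from the third.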
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.)){3}((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))$
The regex above matches IP addresses whose octets may have leading zeros, like:
221.234.000.112
and also 221.234.0.112, 221.24.03.112, 221.234.0.1
Inside a string literal in a language such as Java, the backslash must be doubled (\\.).
I would use PCRE and the define keyword:
/^
((?&byte))\.((?&byte))\.((?&byte))\.((?&byte))$
(?(DEFINE)
(?<byte>25[0-5]|2[0-4]\d|[01]?\d\d?))
/gmx
Demo: https://regex101.com/r/IB7j48/2
The reason for this is to avoid repeating the (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) pattern four times. Other solutions, such as the one below, work well, but they do not capture each octet separately, which many people need.
/^((\d+?)(\.|$)){4}/
The only other way to have 4 capture groups is to repeat the pattern four times:
/^(?<one>\d+)\.(?<two>\d+)\.(?<three>\d+)\.(?<four>\d+)$/
Capturing an IPv4 address in Perl is therefore very easy:
$ echo "Hey this is my IP address 138.131.254.8, bye!" | \
perl -ne 'print "[$1, $2, $3, $4]" if
/\b((?&byte))\.((?&byte))\.((?&byte))\.((?&byte))\b
(?(DEFINE)
(?<byte>25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))
/x'
[138, 131, 254, 8]
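Python's built-in re module does not support (?(DEFINE)...) or (?&name) calls, so a rough equivalent is to build the pattern from a single fragment in ordinary code. A minimal sketch; BYTE, IPV4_GROUPS and capture_octets are names made up for this example.

```python
import re

# The octet fragment is written once and reused four times below.
BYTE = r"25[0-5]|2[0-4]\d|[01]?\d\d?"
# Four capturing groups joined by literal dots, bounded by \b on both sides.
IPV4_GROUPS = re.compile(r"\b" + r"\.".join([f"({BYTE})"] * 4) + r"\b")

def capture_octets(text: str):
    # Returns a tuple of four strings, or None when no address is found.
    m = IPV4_GROUPS.search(text)
    return m.groups() if m else None

if __name__ == "__main__":
    print(capture_octets("Hey this is my IP address 138.131.254.8, bye!"))
    # ('138', '131', '254', '8')
```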

Regex - Find numbers between 2000 and 3000

I need to find all 4-digit numbers between 2000 and 3000.
There may be letters before and after the number.
I thought I could use [2000-3000]{4}, but it doesn't work. Why?
Thank you.
How about
^(2\d{3}|3000)$
Or as Amarghosh & Bart K. & jleedev pointed out, to match multiple instances
\b(?:2[0-9]{3}|3000)\b
If you need to match a3000 or 3000a but not 13000, you would need lookahead and lookbehind like
(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])
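The difference between the \b version and the lookaround version shows up as soon as letters touch the number. A small Python sketch (WORD_BOUNDED and DIGIT_BOUNDED are illustrative names):

```python
import re

# \b needs a word/non-word transition, so a letter next to the number blocks it.
WORD_BOUNDED = re.compile(r"\b(?:2[0-9]{3}|3000)\b")
# The lookarounds only refuse a neighbouring digit, so letters are fine.
DIGIT_BOUNDED = re.compile(r"(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])")

if __name__ == "__main__":
    sample = "a3000 2500 13000 2999b"
    print(WORD_BOUNDED.findall(sample))   # ['2500']
    print(DIGIT_BOUNDED.findall(sample))  # ['3000', '2500', '2999']
```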
Regular expressions are rarely suitable for checking ranges since for ranges like 27 through 9076 inclusive, they become incredibly ugly. It can be done but you're really better off just doing a regex to check for numerics, something like:
^[0-9]+$
which should work on just about every regex engine, and then check the range manually.
In toto:
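import re
def isBetween2kAnd3k(s):
    # Reject anything that is not made up purely of ASCII digits.
    if not re.fullmatch(r"[0-9]+", s):
        return False
    # Then check the range numerically.
    i = int(s)
    if i < 2000 or i > 3000:
        return False
    return True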
What your particular regex [2000-3000]{4} is checking for is exactly four occurrences of any of the following character: 2,0,0,0-3,0,0,0 - in other words, exactly four digits drawn from 0-3.
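This is easy to confirm in Python; the snippet below is only a demonstration of the character-class behaviour, with CHAR_CLASS and EQUIVALENT as made-up names.

```python
import re

# Inside [...], "2000-3000" is the characters 2, 0, 0, the range 0-3, then 0, 0, 0.
CHAR_CLASS = re.compile(r"[2000-3000]{4}")
# The same set of characters written the short way.
EQUIVALENT = re.compile(r"[0-3]{4}")

if __name__ == "__main__":
    for candidate in ["3210", "0000", "2500", "2999"]:
        print(candidate,
              CHAR_CLASS.fullmatch(candidate) is not None,
              EQUIVALENT.fullmatch(candidate) is not None)
```

So 3210 and 0000 are accepted, while 2500 and 2999 are rejected because 5 and 9 are not in the class.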
With letters before and after, you will need to modify the regex and check the correct substring, something like:
import re
def isBetween2kAnd3kWithLetters(s):
    # Optional letters, exactly four digits (captured), optional letters.
    m = re.fullmatch(r"[A-Za-z]*([0-9]{4})[A-Za-z]*", s)
    if m is None:
        return False
    i = int(m.group(1))
    if i < 2000 or i > 3000:
        return False
    return True
As an aside, a regex for checking the range 27 through 9076 inclusive would be something like this hideous monstrosity:
^(2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])$
I think that's substantially less readable than using ^[1-9][0-9]+$ and then checking if it's between 27 and 9076 with an if statement.
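Range regexes like this are easy to get subtly wrong, so it is worth brute-force checking one against a plain integer comparison. A sketch in Python, with the alternation grouped so that ^ and $ apply to every branch; all names here are made up for the example.

```python
import re

# 27-29, 30-99, 100-999, 1000-8999, 9000-9069, 9070-9076
RANGE_27_9076 = re.compile(
    r"^(?:2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])$"
)

def in_range_by_regex(s: str) -> bool:
    return RANGE_27_9076.match(s) is not None

def in_range_by_int(s: str) -> bool:
    # The readable alternative: digits only, then a numeric comparison.
    return s.isascii() and s.isdigit() and 27 <= int(s) <= 9076

# Every value where the two approaches disagree; an empty list means they agree.
mismatches = [n for n in range(10000)
              if in_range_by_regex(str(n)) != in_range_by_int(str(n))]

if __name__ == "__main__":
    print(mismatches)
```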
Hmm, tricky one. The dash - only applies to the characters immediately before and after it, so what your regex actually matches is exactly 4 characters between 0 and 3 inclusive (i.e., 0, 1, 2 and 3), e.g., 3210, 1230, 3333, etc. Try the expression below.
(2[0-9]{3})|(3000)
Here's an explanation of why, and ways to detect ranges: http://www.regular-expressions.info/numericranges.html
The correct regex is \b(2\d{3}|3000)\b. That means: match the character '2' followed by exactly three digits (this matches anything from 2000 to 2999), or just match '3000'. There are some good tutorials on regular expressions:
http://gnosis.cx/publish/programming/regular_expressions.html
http://immike.net/blog/2007/04/06/the-absolute-bare-minimum-every-programmer-should-know-about-regular-expressions/
http://www.regular-expressions.info/
Why don't you check for greater than or less than? It's simpler than a regex:
num >= 2000 and num <=3000