Regex to match Exact port number (netstat command) - regex

We have this regex:\\*.*?(.600[0-9]).*?.(LISTEN|ESTABLISHED)
OS = Solaris 10
The purpose of this regex is to match the ports in output of "netstat -an" and report if any ports between 6000-6009 are getting used. The only problem is if I have something like this (sample output as mentioned below), the regex matches everything with 6000 in it. It matches 46000, 60006 and 6000. Because of that we are getting faulty alerts. How can we fix this to just ONLY pick up ports (6000-6009)? Please help.
10.10.10.10.2055 10.10.4.10.60006 49552 0 49552 0 ESTABLISHED
10.10.10.10.6360 10.10.4.10.6000 65290 0 49640 0 LISTEN
10.10.10.10.2044 10.10.4.10.46000 49552 0 49552 0 ESTABLISHED

The '.' before 600 matches any character, not just a literal '.' (which is why 46000 matched), and you need to require a space after 600x (which is why 60006 matched):
\\*.*?([.]600[0-9]) .*?.(LISTEN|ESTABLISHED)

You can use awk to match only field 2 against the port number pattern and the last field against the status:
awk '$NF ~ /^(LISTEN|ESTABLISHED)$/ && $2 ~ /\.600[0-9]$/'
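If the netstat output is post-processed in a script, the same idea can be sketched in Python. This is only an illustration: the function name busy_ports is made up, and it assumes the port is the last dot-separated component of the first two columns, as in the Solaris sample from the question.

```python
import re

# The port must be the final dot-separated component of the address column.
PORT_RE = re.compile(r"\.(600[0-9])$")
STATES = {"LISTEN", "ESTABLISHED"}

def busy_ports(netstat_output):
    """Return (port, state) pairs for connections using ports 6000-6009."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[-1] not in STATES:
            continue
        # Only the local and remote address columns are inspected.
        for address in fields[:2]:
            m = PORT_RE.search(address)
            if m:
                hits.append((int(m.group(1)), fields[-1]))
    return hits

sample = """\
10.10.10.10.2055 10.10.4.10.60006 49552 0 49552 0 ESTABLISHED
10.10.10.10.6360 10.10.4.10.6000 65290 0 49640 0 LISTEN
10.10.10.10.2044 10.10.4.10.46000 49552 0 49552 0 ESTABLISHED
"""
print(busy_ports(sample))  # [(6000, 'LISTEN')]
```

Anchoring on the literal dot and the end of the field is what excludes 46000 and 60006.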

Related

Regular expression for matching a specific substring of a string

I have a log file that logs connection drops of computers in a LAN. I want to extract name of each computer from every line of the log file and for that I am doing this: (?<=Name:)\w+|(-PC)
The target text:
`[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked) LanId: | (11/23 10:54:09 - 11/23 10:54:44) | Average limit (300) exceeded while pinging www.google.com [74.125.224.147] 8x
[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C LanId: | (11/23 17:02:00) | Client is connected to agent.`
The problem is that some computer names contain -PC and some don't. The expression I have created matches computers without -PC in their names, but if a computer has -PC in its name, it treats that as a separate match, which I don't want. In short, it gives me 3 matches, but I want only 2. That's why I need help here; I am a beginner in regex.
You may use
(?<=Name:)\w+(?:-PC)?
Details
(?<=Name:) - a place immediately preceded with Name:
\w+ - 1+ word chars
(?:-PC)? - an optional non-capturing group that matches 1 or 0 occurrences of -PC substring.
Consider using word boundaries if you need to match PC as a whole word,
(?<=Name:)\w+(?:-PC\b)?
See the regex demo.
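A quick way to check the pattern, sketched in Python (the question does not say which language or tool is used, so this is an assumption; the log lines are shortened versions of the ones in the question):

```python
import re

log = (
    "[C417] ComputerName:KCUTSHALL-PC UserID:GO kcutshall Station 9900 (locked)\n"
    "[C445] ComputerName:FRONTOFFICE UserID:YB Yenae Ball Station 7C\n"
)

# Lookbehind for "Name:", then word characters, then an optional "-PC" suffix.
names = re.findall(r"(?<=Name:)\w+(?:-PC\b)?", log)
print(names)  # ['KCUTSHALL-PC', 'FRONTOFFICE']
```

Because the group is non-capturing, re.findall returns the whole match for each line, which gives the two names instead of three separate matches.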

grep regex unexpected match - literal matches against number

Why does the following literal string
1998-${year}
..match against the grep command:
grep "[0-9 ]*-[ 0-9]*" filename.txt ?
What I need is a regex to match any of the following strings containing either a year range or one value of year only.
sdkfmslf 1998-2008
asdassdadsa 1998 - 2008
mkklml mklsmdf 2006
..but NOT this one:
asdsad a s 1998-${year}
* means "match zero or more". You want + which means "one or more" (with grep, + needs extended syntax, i.e. grep -E).
grep -E "[0-9 ]+-[0-9]+" filename.txt
Try [0-9]{4}(\s*-\s*[0-9]{4})?. This will match a 4 digit number, or if it is followed by (optional white space)-(optional whitespace) then that must be followed by another 4 digit number.
Your string "asdsad a s 1998-${year}" would still match, since it has a single 4 digit value in it.
I don't like answering my own question, but none of the above worked. Here is what I found by experimenting. I'm sure there could be more elegant solutions, but here is a working version:
grep "[0-9][0-9][0-9][0-9][ ]*[\-]*[ ]*[0-9]*" filename.txt
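For comparison, here is a hedged Python sketch of one pattern that accepts the three wanted lines and rejects the 1998-${year} line. It is not taken from any of the answers above; the negative lookahead is my own addition.

```python
import re

# A 4-digit year, optionally followed by "- <year>". The negative lookahead
# rejects a year that is followed by a dangling hyphen, as in "1998-${year}".
YEAR_OR_RANGE = re.compile(r"\b\d{4}\b(?:\s*-\s*\d{4}\b)?(?!\s*-)")

lines = [
    "sdkfmslf 1998-2008",
    "asdassdadsa 1998 - 2008",
    "mkklml mklsmdf 2006",
    "asdsad a s 1998-${year}",
]
matching = [line for line in lines if YEAR_OR_RANGE.search(line)]
print(matching)
```

Plain grep has no lookahead, so with GNU grep this particular pattern would need the -P option.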

Regex to validate port number

I'm using this regex (6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3} to validate port numbers. Somehow this is not working. What is wrong with this? Can anybody point me out.
What exactly do you mean by not working?
You could try something like so: ^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$ (obtained from here).
This will make sure that any given string is numeric and in the range 1 to 65535.
Assuming your regular expression matches the same range, it is missing the start and end anchors (^ and $ respectively), so it would allow other strings besides the actual port.
Update 2 Feb 2022: Fixed the regex to reject values like 00 etc. The updated regex is sourced from the comment below. This regex can be better understood and visualized here: https://www.debuggex.com/r/jjEFZZQ34aPvCBMA
When we search "how to validate port number" on Google, we unfortunately land here.
However (unless you really have no other choice),
regex is clearly not the way to validate a port number!
One (slightly better) way may be:
step 1: Convert your string into a number, and return FALSE if that fails
step 2: Return TRUE if your number is in the [1-65535] range, and FALSE otherwise
Some reasons why regex is not the right way:
Code readability (the regex takes a few minutes to understand)
Code robustness (there are various ways to introduce a typo, so a unit test would be required)
Code flexibility (what if port numbers were extended to 64 bits?)
etc.
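The two steps above can be sketched as follows in Python. The function name is_valid_port is just an illustration, and the choice to accept only plain ASCII digits (so leading zeros such as "0080" pass, while signs and spaces do not) is my assumption.

```python
def is_valid_port(text):
    """Return True if text is a decimal integer in the range 1-65535."""
    # Step 1: conversion. Only plain ASCII digits are accepted, so inputs
    # such as "123a", "", "-1" or " 80 " are rejected here.
    if not (text.isascii() and text.isdigit()):
        return False
    # Step 2: range check.
    return 1 <= int(text) <= 65535

print(is_valid_port("8080"), is_valid_port("65536"), is_valid_port("123a"))
```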
Number() is the function you want: Number("123a") returns NaN.
parseInt() ignores trailing letters: parseInt("123a") returns 123.
<input type="text" id="txtFld" onblur="if(Number(this.value)>0 && Number(this.value)<65536){alert('valid port number');}" />
Here is the example I'm using to validate port settings for a firewall. The original answer will match 2 strings. I can only have 1 string match.
(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[1-9](\d){0,3})
To match 22,24:100,333,678,100:65535, my full validation (which returns only 1 match) is
(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3})(:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}))?(,(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}){1}(:(6553[0-5]|655[0-2][0-9]|65[0-4][0-9][0-9]|6[0-4][0-9][0-9][0-9]|[1-5](\d){4}|[0-9](\d){0,3}))?)*
A more strict approach is to have a regex matching all numbers up to 5 digits
with the following string:
*(^[1-9]{1}$|^[0-9]{2,4}$|^[0-9]{3,4}$|^[1-5]{1}[0-9]{1}[0-9]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-4]{1}[0-9]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-5]{1}[0-4]{1}[0-9]{1}[0-9]{1}$|^[1-6]{1}[0-5]{1}[0-5]{1}[0-3]{1}[0-5]{1}$)*
The accepted answer by npinti is not right. It will not allow entering port number 1000, for example. For me, this one (not nice, I'm a beginner) works correctly:
/^((((([1-9])|([1-9][0-9])|([1-9][0-9][0-9])|([1-9][0-9][0-9][0-9])|([1-6][0-5][0-5][0-3][0-5])))))$/
"^((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0-9]{1,4}))$"
It will allow everything between 0-65535 inclusive.
Here is single port regex validation that excludes ports that start with 0
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
Here is validation for port range (ex. 1111-1111)
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])(-([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5]))?$
link:
https://github.com/findhit/proxywrap/issues/13
Landed here as well, searching specifically for REGEX to validate port number.
I see the approved solution was not fixed yet to cover all scenarios ( eg: 007 port, and others ) and solutions from other sites not updated either (eg).
Reached same minimal solution as saber tabatabaee yazdi, that should cover the 1-65535 range properly:
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
Enjoy !
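Since several regexes in this thread turned out to be subtly wrong, it may be worth checking a candidate exhaustively against a plain numeric comparison. A small Python sketch of such a check (the upper bound of 100000 for the test loop is an arbitrary choice of mine):

```python
import re

PORT_RE = re.compile(
    r"^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}"
    r"|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$"
)

# Compare the regex with a numeric range check for every candidate value.
mismatches = [
    n for n in range(0, 100000)
    if bool(PORT_RE.match(str(n))) != (1 <= n <= 65535)
]
print(mismatches)  # [] means the regex agrees with the range 1-65535
```

The same loop can be pointed at any of the other patterns here to see where they disagree with the numeric range.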
@npinti's answer allows leading zeros in the port number. Also, port 0 means "pick any available port", so I would exclude it. The regex becomes
^([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
If you want to allow port 0 then
^(0|[1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])$
The solution:
Dim MinPort As Integer = 1
Dim MaxPort As Integer = 65535
Dim isValid As Boolean = Regex.IsMatch(sInput, "^[0-9]{1,5}$") AndAlso CInt(sInput) >= MinPort AndAlso CInt(sInput) <= MaxPort
If the variable is an integer between 1 and 65535 (inclusive) then...
if [[ "$port" =~ ^[0-9]+$ && $port -ge 1 && $port -le 65535 ]]; then
^((6553[0-5])|(655[0-2][0-9])|(65[0-4][0-9]{2})|(6[0-4][0-9]{3})|([1-5][0-9]{4})|([0-5]{0,5})|([0][0-9]{1,4})|([0-9]{1,4}))$
I have tested the above regex with JUnit, running a for loop from 0 to 65535.
Ex: 00001 - 65535 with leading Zeros
1 - 65535 without leading Zeros
Ex:====
(6553[0-5]) : 65530-65535
(655[0-2][0-9]) : 65500-65529
(65[0-4][0-9]{2}): 65000-65499
(6[0-4][0-9]{3}) : 60000-64999
([1-5][0-9]{4}) : 10000-59999
([0-5]{0,5}) : 00000-55555 (for leading Zeros)
([0][0-9]{1,4}) : 00000-09999 (for leading Zeros)
([0-9]{1,4}) : 0000-9999 (for leading Zeros)

regular expression to find in-between content

I am trying to find the content between %%EndPageSetup and LH(%%[Page: 1]%%) = using regular expression. I tried various patterns but not getting the correct output. Can someone please help me on this?
%EndPageSetup
/DeviceGray dup setcolorspace
/colspABC exch def ‹ … scol
… „A VM? Pscript_WinNT_Incr begin
%%BeginResource: file Pscript_T42Hdr
5.0 0 /asc42 0.0 d/sF42{/asc42 ~ d Ji}bind d/bS42{0 asc42 -M}bind
d/eS42{0 asc42 neg
-M}b/Is2015?{version cvi 2015 ge}bind d/AllocGlyphStorage{Is2015?{!}{{string}
forall}?}bind d/Type42DictBegin{25
dict /FontName ~ d/Encoding ~ d 4
array astore cvx/FontBBox ~
d/PaintType 0 d/FontType 42
d/FontMatrix[1 0 0 1 0 0]d
/CharStrings 256 dict/.notdef 0 d &
E d/sfnts}bind d/Type42DictEnd{& #
/FontName get ~ definefont ! E}bind
d/RDS{string currentfile ~ readstring
!} executeonly
d/PrepFor2015{Is2015?{/GlyphDirectory
16 dict d sfnts 0 get # 2 ^
(glyx)putinterval 2 ^(locx)putinterval
! !}{! !}?}bind d/AddT42Char{Is2015?
{findfont/GlyphDirectory get ` d E !
!}{findfont/sfnts get 4 ^ get 3 ^ 2 ^
LH(%%[Page: 1]%%) =
Thanks.
this may work
/EndPageSetup(.*?)LH\((?:.*?)\[Page: 1\](?:.*?)\) =/
This works with your examples
%%EndPageSetup(.*?)\(%%\[.*?Page.*?\]%%\) =
See it here online on Regexr
Make sure to activate the s (dotall) modifier, so that it is possible to match newline characters with the dot (.).
Your result is then in capture group 1.
How to activate the modifier and how to get the result depends on your language.
This should work:
(?:%%EndPageSetup)(.*\n)*(?=LH\(%%\[Page: 1\]%%\) =)
Explanation
The last part, (?=LH\(%%\[Page: 1\]%%\) =), is a positive lookahead (not a capture group), so the end marker must follow the match but is not included in the result.
The capture group (.*\n) matches one full line including its line break. With *, it repeats 0 or more times, so it spans all the lines in between.
The non-capturing group (?:%%EndPageSetup) matches the start marker without capturing it; note that it is still part of the overall match.
Note
You can use lookbehinds too, but JavaScript engines older than ES2018 don't support them.
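As noted above, how to enable dotall and read the capture group depends on the language. A minimal Python sketch, assuming the text is already in a string (the sample here is a shortened version of the PostScript from the question):

```python
import re

text = """%%EndPageSetup
/DeviceGray dup setcolorspace
/colspABC exch def
LH(%%[Page: 1]%%) =
"""

# re.DOTALL lets "." cross line breaks; the lazy ".*?" stops at the first end marker.
m = re.search(r"%%EndPageSetup(.*?)LH\(%%\[Page: 1\]%%\) =", text, re.DOTALL)
between = m.group(1) if m else None
print(repr(between))
```

Group 1 holds everything between the two markers, including the surrounding line breaks.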

Validating IPv4 addresses with regexp

I've been trying to get an efficient regex for IPv4 validation, but without much luck. It seemed at one point I had had it with (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}, but it produces some strange results:
$ grep --version
grep (GNU grep) 2.7
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.1
192.168.1.1
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.255
192.168.1.255
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.255.255
$ grep -E '\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?(\.|$)){4}\b' <<< 192.168.1.2555
192.168.1.2555
I did a search to see if this had already been asked and answered, but other answers appear to simply show how to determine 4 groups of 1-3 numbers, or do not work for me.
Best for Now (43 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}$
This version shortens things by another 6 characters while not making use of the negative lookahead, which is not supported in some regex flavors.
Newest, Shortest, Least Readable Version (49 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)(\.(?!$)|$)){4}$
The [0-9] blocks can be substituted by \d in 2 places - makes it a bit less readable, but definitely shorter.
Even Newer, even Shorter, Second least readable version (55 chars)
^((25[0-5]|(2[0-4]|1[0-9]|[1-9]|)[0-9])(\.(?!$)|$)){4}$
This version looks for the 250-5 case, after that it cleverly ORs all the possible cases for 200-249 100-199 10-99 cases. Notice that the |) part is not a mistake, but actually ORs the last case for the 0-9 range. I've also omitted the ?: non-capturing group part as we don't really care about the captured items, they would not be captured either way if we didn't have a full-match in the first place.
Old and shorter version (less readable) (63 chars)
^(?:(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])(\.(?!$)|$)){4}$
Older (readable) version (70 chars)
^(?:(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])(\.(?!$)|$)){4}$
It uses the negative lookahead (?!) to remove the case where the ip might end with a .
Alternative answer, using some of the newer techniques (71 chars)
^((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.){3}(25[0-5]|(2[0-4]|1\d|[1-9]|)\d)$
Useful in regex implementations where lookaheads are not supported
Oldest answer (115 chars)
^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){3}
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$
I think this is the most accurate and strict regex, it doesn't accept things like 000.021.01.0. it seems like most other answers here do and require additional regex to reject cases similar to that one - i.e. 0 starting numbers and an ip that ends with a .
You've already got a working answer but just in case you are curious what was wrong with your original approach, the answer is that you need parentheses around your alternation otherwise the (\.|$) is only required if the number is less than 200.
'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}\b'
^ ^
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
Accept:
127.0.0.1
192.168.1.1
192.168.1.255
255.255.255.255
0.0.0.0
1.1.1.01 # This is an invalid IP address!
Reject:
30.168.1.255.1
127.1
192.168.1.256
-1.2.3.4
1.1.1.1.
3...3
Try online with unit tests: https://www.debuggex.com/r/-EDZOqxTxhiTncN6/1
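The accept/reject lists above can also be replayed locally. A small Python sketch, using the same regex but built from a shared octet fragment (the names OCTET and IPV4_RE are mine):

```python
import re

OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
# Three "octet." groups followed by a final octet, anchored at both ends.
IPV4_RE = re.compile(rf"^({OCTET}\.){{3}}{OCTET}$")

accept = ["127.0.0.1", "192.168.1.1", "192.168.1.255",
          "255.255.255.255", "0.0.0.0", "1.1.1.01"]
reject = ["30.168.1.255.1", "127.1", "192.168.1.256",
          "-1.2.3.4", "1.1.1.1.", "3...3"]

accepted = [ip for ip in accept + reject if IPV4_RE.match(ip)]
print(accepted == accept)
```

Note that 1.1.1.01 is accepted because [01]?[0-9][0-9]? allows a leading zero.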
IPv4 address (accurate capture)
Matches 0.0.0.0 through 255.255.255.255, but does capture invalid addresses such as 1.1.000.1
Use this regex to match IP numbers with accuracy.
Each of the 4 numbers is stored into a capturing group, so you can access them for further processing.
\b
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
\b
taken from JGsoft RegexBuddy library
Edit: this (\.|$) part seems weird
I think many people reading this post will be looking for simpler regular expressions, even if they match some technically invalid IP addresses. (And, as noted elsewhere, regex probably isn't the right tool for properly validating an IP address anyway.)
Remove ^ and, where applicable, replace $ with \b, if you don't want to match the beginning/end of the line.
Basic Regular Expression (BRE) (tested on GNU grep, GNU sed, and vim):
/^[0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+$/
Extended Regular Expression (ERE):
/^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/
or:
/^([0-9]+(\.|$)){4}/
Perl-compatible Regular Expression (PCRE) (tested on Perl 5.18):
/^\d+\.\d+\.\d+\.\d+$/
or:
/^(\d+(\.|$)){4}/
Ruby (tested on Ruby 2.1):
Although supposed to be PCRE, Ruby for whatever reason allowed this regex not allowed by Perl 5.18:
/^(\d+[\.$]){4}/
My tests for all these are online here.
I was in search of something similar for IPv4 addresses - a regex that also stopped commonly used private ip addresses from being validated (192.168.x.y, 10.x.y.z, 172.16.x.y) so used negative look aheads to accomplish this:
(?!(10\.|172\.(1[6-9]|2\d|3[01])\.|192\.168\.).*)
(?!255\.255\.255\.255)(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|[1-9])
(\.(25[0-5]|2[0-4]\d|[1]\d\d|[1-9]\d|\d)){3}
(These should be on one line of course, formatted for readability purposes on 3 separate lines)
It may not be optimised for speed, but works well when only looking for 'real' internet addresses.
Things that will (and should) fail:
0.1.2.3 (0.0.0.0/8 is reserved for some broadcasts)
10.1.2.3 (10.0.0.0/8 is considered private)
172.16.1.2 (172.16.0.0/12 is considered private)
172.31.1.2 (same as previous, but near the end of that range)
192.168.1.2 (192.168.0.0/16 is considered private)
255.255.255.255 (reserved broadcast is not an IP)
.2.3.4
1.2.3.
1.2.3.256
1.2.256.4
1.256.3.4
256.2.3.4
1.2.3.4.5
1..3.4
IPs that will (and should) work:
1.0.1.0 (China)
8.8.8.8 (Google DNS in USA)
100.1.2.3 (USA)
172.15.1.2 (USA)
172.32.1.2 (USA)
192.167.1.2 (Italy)
Provided in case anybody else is looking for validating 'Internet IP addresses not including the common private addresses'
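Where a library is available, the same filter can be written without a regex. A hedged Python sketch using the standard ipaddress module; the function name is_public_ipv4 is made up, and note that is_private covers more special-purpose ranges (loopback and link-local, for example) than the regex above, so the two are not exactly equivalent.

```python
import ipaddress

def is_public_ipv4(text):
    """Return True if text parses as IPv4 and is not in a private/reserved block."""
    try:
        addr = ipaddress.IPv4Address(text)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    # is_private covers 10/8, 172.16/12, 192.168/16 and several other
    # special-purpose ranges (for example 0.0.0.0/8 and 255.255.255.255).
    return not addr.is_private

for candidate in ["8.8.8.8", "172.15.1.2", "10.1.2.3", "192.168.1.2", "1.2.3.256"]:
    print(candidate, is_public_ipv4(candidate))
```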
Here is a better one with passing/failing IPs attached
/^((?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])[.]){3}(?:[0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])$/
Accepts
127.0.0.1
192.168.1.1
192.168.1.255
255.255.255.255
10.1.1.1
0.0.0.0
Rejects
1.1.1.01
30.168.1.255.1
127.1
192.168.1.256
-1.2.3.4
1.1.1.1.
3...3
192.168.1.099
Above answers are valid but what if the ip address is not at the end of line and is in between text.. This regex will even work on that.
code: '\b((([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])(\.)){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]))\b'
input text file:
ip address 0.0.0.0 asfasf
sad sa 255.255.255.255 cvjnzx
zxckjzbxk 999.999.999.999 jshbczxcbx
sjaasbfj 192.168.0.1 asdkjaksb
oyo 123241.24121.1234.3423 yo
yo 0000.0000.0000.0000 y
aw1a.21asd2.21ad.21d2
yo 254.254.254.254 y0
172.24.1.210 asfjas
200.200.200.200
000.000.000.000
007.08.09.210
010.10.30.110
output text:
0.0.0.0
255.255.255.255
192.168.0.1
254.254.254.254
172.24.1.210
200.200.200.200
'''
This code works for me, and is as simple as that.
Here I have taken the value of ip and I am trying to match it with regex.
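import re

ip = "25.255.45.67"
# Escape the dots and anchor the pattern so that only a full dotted quad matches.
op = re.match(r'^(\d+)\.(\d+)\.(\d+)\.(\d+)$', ip)
# op is None when the string does not look like four dot-separated numbers.
if op and all(int(op.group(i)) <= 255 for i in range(1, 5)):
    print("valid ip")
else:
    print("Not valid")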
The condition above checks each of the 4 octets; if any value exceeds 255, the address is not valid. Before applying the condition we have to convert the values to integers, since the matched groups are strings.
group(0) returns the whole match, whereas group(1) returns the first captured value (here "25"), and so on.
'''
/^(?:(25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)\.){3}(?1)$/m
I managed to construct a regex from all other answers.
(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)(\.(25[0-5]|2[0-4][0-9]|[1][0-9][0-9]|[1-9][0-9]|[0-9]?)){3}
This is a little longer than some but this is what I use to match IPv4 addresses. Simple with no compromises.
^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])$
For number from 0 to 255 I use this regex:
(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))
Above regex will match integer number from 0 to 255, but not match 256.
So for IPv4 I use this regex:
^(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})$
It is in this structure: ^(N)((\.(N)){3})$ where N is the regex used to match number from 0 to 255.
This regex will match IP like below:
0.0.0.0
192.168.1.2
but not those below:
10.1.0.256
1.2.3.
127.0.1-2.3
For IPv4 CIDR (Classless Inter-Domain Routing) I use this regex:
^(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))$
It is in this structure: ^(N)((\.(N)){3})\/M$ where N is the regex used to match number from 0 to 255, and M is the regex used to match number from 0 to 32.
This regex will match CIDR like below:
0.0.0.0/0
192.168.1.2/32
but not those below:
10.1.0.256/16
1.2.3./24
127.0.0.1/33
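The N and M structure described above lends itself to building the patterns from named fragments instead of pasting them by hand. A sketch in Python, assuming shorter but equivalent fragments for N and M and non-capturing groups (both are my choices, not the author's exact regex):

```python
import re

N = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"  # 0-255, no leading zeros
M = r"(?:3[0-2]|[12]?[0-9])"                          # 0-32

# ^(N)((\.(N)){3})$ and ^(N)((\.(N)){3})\/M$ from the text, assembled from parts.
IPV4 = re.compile(rf"^{N}(?:\.{N}){{3}}$")
CIDR = re.compile(rf"^{N}(?:\.{N}){{3}}/{M}$")

print(bool(IPV4.match("192.168.1.2")), bool(IPV4.match("10.1.0.256")))
print(bool(CIDR.match("192.168.1.2/32")), bool(CIDR.match("127.0.0.1/33")))
```

Keeping N and M in one place means a fix to the octet pattern only has to be made once.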
And for list of IPv4 CIDR like "10.0.0.0/16", "192.168.1.1/32" I use this regex:
^("(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))")((,([ ]*)("(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))((\.(([0-9])|([1-9][0-9])|(1([0-9]{2}))|(2[0-4][0-9])|(25[0-5]))){3})\/(([0-9])|([12][0-9])|(3[0-2]))"))*)$
It is in this structure: ^("C")((,([ ]*)("C"))*)$ where C is the regex used to match CIDR (like 0.0.0.0/0).
This regex will match list of CIDR like below:
"10.0.0.0/16","192.168.1.2/32", "1.2.3.4/32"
but not those below:
"10.0.0.0/16" 192.168.1.2/32 "1.2.3.4/32"
Maybe it might get shorter but for me it is easy to understand so fine by me.
Hope it helps!
IPv4 address is a very complicated thing.
Note: Indentation and lining are only for illustration purposes and do not exist in the real RegEx.
\b(
((
(2(5[0-5]|[0-4][0-9])|1[0-9]{2}|[1-9]?[0-9])
|
0[Xx]0*[0-9A-Fa-f]{1,2}
|
0+[1-3]?[0-9]{1,2}
)\.){1,3}
(
(2(5[0-5]|[0-4][0-9])|1[0-9]{2}|[1-9]?[0-9])
|
0[Xx]0*[0-9A-Fa-f]{1,2}
|
0+[1-3]?[0-9]{1,2}
)
|
(
[1-3][0-9]{1,9}
|
[1-9][0-9]{,8}
|
(4([0-1][0-9]{8}
|2([0-8][0-9]{7}
|9([0-3][0-9]{6}
|4([0-8][0-9]{5}
|9([0-5][0-9]{4}
|6([0-6][0-9]{3}
|7([0-1][0-9]{2}
|2([0-8][0-9]{1}
|9([0-5]
))))))))))
)
|
0[Xx]0*[0-9A-Fa-f]{1,8}
|
0+[1-3]?[0-7]{,10}
)\b
These IPv4 addresses are validated by the above RegEx.
127.0.0.1
2130706433
0x7F000001
017700000001
0x7F.0.0.01 # Mixed hex/dec/oct
000000000017700000001 # Have as many leading zeros as you want
0x0000000000007F000001 # Same as above
127.1
127.0.1
These are rejected.
256.0.0.1
192.168.1.099 # 099 is not a valid number
4294967296 # UINT32_MAX + 1
0x100000000
020000000000
(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))
Test to find matches in text,
https://regex101.com/r/9CcMEN/2
Following are the rules defining the valid combinations in each number of an IP address:
Any one- or two-digit number.
Any three-digit number beginning with 1.
Any three-digit number beginning with 2 if the second digit is 0
through 4.
Any three-digit number beginning with 25 if the third digit is 0
through 5.
Let's start with (((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.), a set of four nested subexpressions, and we’ll look at them in reverse order. (\d{1,2}) matches any one- or two-digit number or numbers 0 through 99. (1\d{2}) matches any three-digit number starting with 1 (1 followed by any two digits), or numbers 100 through 199. (2[0-4]\d) matches numbers 200 through 249. (25[0-5]) matches numbers 250 through 255. Each of these subexpressions is enclosed within another subexpression with an | between each (so that one of the four subexpressions has to match, not all). After the range of numbers comes \. to match ., and then the entire series (all the number options plus \.) is enclosed into yet another subexpression and repeated three times using {3}. Finally, the range of numbers is repeated (this time without the trailing \.) to match the final IP address number. By restricting each of the four numbers to values between 0 and 255, this pattern can indeed match valid IP addresses and reject invalid addresses.
Excerpt From: Ben Forta. “Learning Regular Expressions.”
If neither a character is wanted at the beginning of IP address nor at the end, ^ and $ metacharacters ought to be used, respectively.
^(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2}))\.){3}(((25[0-5])|(2[0-4]\d)|(1\d{2})|(\d{1,2})))$
Test to find matches in text,
https://regex101.com/r/uAP31A/1
Valid regex for IPV4 address for Java
^((\\d|[1-9]\\d|[0-1]\\d{2}|2[0-4]\\d|25[0-5])[\\.]){3}(\\d|[1-9]\\d|[0-1]\\d{2}|2[0-4]\\d|25[0-5])$
Finding a valid IP address in text is a very difficult problem.
I have a regexp that matches (extracts) valid IP addresses from strings in text files.
My regexp:
\b(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.)(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2}(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\b
\b word boundary
(?: - means start non capturing group
^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.) - string must start with first right octet with dot char
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9]) - find the first valid octet (the first octet cannot start with 0)
(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2} - find next right two octets with dot string
(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\b - string must end with right fourth octet (now zero char is allowed)
But this IP regexp still produces a few false positive matches:
https://regexr.com/69dk7
Finding or extracting valid IP addresses from a text file with a regexp alone is impossible. Without checking additional conditions you always get false positive matches.
Solution
I wrote a Perl one-liner to extract IP addresses from text files. It applies these conditions:
when the IP address is at the beginning of the line, the next char is one or more whitespace chars (space, tab, new line...)
when the IP address is at the end of the line, the new line is the next char and the IP address is preceded by one or more whitespace chars
in the middle of text, the IP address has one or more whitespace chars before and after it
The consequence is that Perl does not match strings like https://84.25.74.125 and other URI strings, or an IP address at the end of a line followed by a dot. But it finds any valid IP address that is delimited by whitespace.
Perl one-liner solution:
$ cat ip.txt | perl -lane 'use warnings; use strict; for my $i (@F){if ($i =~/^(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])\.)(?:(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\.){2}(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])$/) { print $i; } }'
36.42.84.233
158.22.45.0
36.84.84.233
12.0.5.4
1.25.45.36
255.3.6.5
4.255.2.1
127.0.0.1
127.0.0.5
126.0.0.1
testing text file:
$ cat ip.txt
36.42.84.233 stop 158.22.45.0 and 56.32.58.2.
25.36.84.84abc and abc2.4.8.2 is error.
1.2.3.4_
But false positive is 2.2.2.2.2.2.2.2 or 1.1.1.1.1
http://23.54.212.1:80
https://89.35.248.1/abc
36.84.84.233 was 25.36.58.4/abc/xyz&158.133.26.4&another_var
and 42.27.0.1:8333 in http://212.158.45.2:26
0.25.14.15 ip can not start with zero
2.3.0
abc 12.0.5.4
1.25.45.36
12.05.2.5
256.1.2.5
255.3.6.5
4.255.2.1
4.256.5.6
127.0.0.1 is localhost.
this ip 127.0.0.5 is not localhost
126.0.0.1
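For readers without Perl at hand, the whitespace-splitting approach can be sketched in Python as well. This is my own translation of the idea, not the author's code: the function name extract_ips is made up, and the regex is regrouped slightly (first octet, then three ".octet" parts), which should match the same strings.

```python
import re

IP_RE = re.compile(
    r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[1-9])"           # first octet: 1-255
    r"(?:\.(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])){3}"  # other octets: 0-255
)

def extract_ips(text):
    """Return whitespace-delimited tokens that are complete IPv4 addresses."""
    # fullmatch() plays the role of the ^...$ anchors applied to each field.
    return [tok for tok in text.split() if IP_RE.fullmatch(tok)]

sample = """36.42.84.233 stop 158.22.45.0 and 56.32.58.2.
http://23.54.212.1:80
0.25.14.15 ip can not start with zero
abc 12.0.5.4
12.05.2.5
256.1.2.5
127.0.0.1 is localhost.
"""
print(extract_ips(sample))
```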
Appendix
For people from another planets for whom the strings 2130706433, 127.1, 24.005.04.52 is a valid ip address I have a message: Try to find a solution yourself!!!
Considering some variants suggested, \d and \b may not be supported. Hence, just in case:
IPv4 address
^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)\.){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0)$
Test: https://debuggex.com/r/izHiog3KkYztRMSJ
With subnet mask :
^$|([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\
.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\
.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\
.([01]?\\d\\d?|2[0-4]\\d|25[0-5])
((/([01]?\\d\\d?|2[0-4]\\d|25[0-5]))?)$
I tried to make it a bit simpler and shorter.
^(([01]?\d{1,2}|2[0-4]\d|25[0-5])\.){3}([01]?\d{1,2}|2[0-4]\d|25[0-5])$
If you are looking for java/kotlin:
^(([01]?\\d{1,2}|2[0-4]\\d|25[0-5])\\.){3}([01]?\\d{1,2}|2[0-4]\\d|25[0-5])$
If someone wants to know how it works here is the explanation. It's really so simple. Just give it a try :p :
1. ^.....$: '^' is the starting and '$' is the ending.
2. (): These are called a group. You can think of like "if" condition groups.
3. |: 'Or' condition - as same as most of the programming languages.
4. [01]?\d{1,2}: '[01]' indicates one of the number between 0 and 1. '?' means '[01]' is optional. '\d' is for any digit between 0-9 and '{1,2}' indicates the length can be between 1 and 2. So here the number can be 0-199.
5. 2[0-4]\d: '2' is just plain 2. '[0-4]' means a number between 0 to 4. '\d' is for any digit between 0-9. So here the number can be 200-249.
6. 25[0-5]: '25' is just plain 25. '[0-5]' means a number between 0 to 5. So here the number can be 250-255.
7. \.: It's just a plain '.' (dot) for separating the numbers.
8. {3}: It means the exact 3 repetition of the previous group inside '()'.
9. ([01]?\d{1,2}|2[0-4]\d|25[0-5]): Totally same as point 2-6
Mathematically it is like:
(0-199 OR 200-249 OR 250-255).{Repeat exactly 3 times}(0-199 OR 200-249 OR 250-255)
So, as you can see normally this is the pattern for the IP addresses. I hope it helps to understand Regular Expression a bit. :p
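The breakdown above maps naturally onto a commented pattern. A sketch in Python using re.VERBOSE (the names OCTET and IPV4 are mine, and the groups are written as non-capturing, which does not change what is matched):

```python
import re

OCTET = r"(?:[01]?\d{1,2}|2[0-4]\d|25[0-5])"  # 0-199 OR 200-249 OR 250-255

IPV4 = re.compile(rf"""
    ^
    (?:{OCTET}\.){{3}}   # three octets, each followed by a literal dot
    {OCTET}              # final octet without a trailing dot
    $
""", re.VERBOSE | re.ASCII)

for candidate in ["192.168.1.1", "192.168.1.256", "1.1.1.01", "192.168.1.099"]:
    print(candidate, bool(IPV4.match(candidate)))
```

One thing the walkthrough does not mention: because of [01]?\d{1,2}, octets with leading zeros such as 01 or 099 are accepted too.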
To validate any IP address in the valid range 0.0.0.0 to 255.255.255.255 can be written in very simple form as below.
((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])
const char*ipv4_regexp = "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\."
"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
I adapted the regular expression taken from JGsoft RegexBuddy library to C language (regcomp/regexec) and I found out it works but there's a little problem in some OS like Linux.
That regular expression accepts ipv4 address like 192.168.100.009 where 009 in Linux is considered an octal value so the address is not the one you thought.
I changed that regular expression as follows:
const char* ipv4_regex = "\\b(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
Using that regular expression, 192.168.100.009 is no longer a valid IPv4 address while 192.168.100.9 is OK.
I modified a regular expression for multicast address too and it is the following:
const char* mcast_ipv4_regex = "\\b(22[4-9]|23[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\."
"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]|[0-9])\\b";
I think you have to adapt the regular expression to the language you're using to develop your application
I put an example in java:
package utility;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NetworkUtility {
    // Each octet is 0-255; "[1-9]?\\d" requires at least one digit, so empty octets are rejected.
    private static String ipv4RegExp = "\\b(?:(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\.){3}(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\b";
    // First octet 224-239, followed by three octets in the range 0-255.
    private static String ipv4MulticastRegExp = "2(?:2[4-9]|3\\d)(?:\\.(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]\\d?|0)){3}";

    public NetworkUtility() {
    }

    // matches() requires the whole input to match the pattern.
    public static boolean isIpv4Address(String address) {
        Pattern pattern = Pattern.compile(ipv4RegExp);
        Matcher matcher = pattern.matcher(address);
        return matcher.matches();
    }

    public static boolean isIpv4MulticastAddress(String address) {
        Pattern pattern = Pattern.compile(ipv4MulticastRegExp);
        Matcher matcher = pattern.matcher(address);
        return matcher.matches();
    }
}
-bash-3.2$ echo "191.191.191.39" | egrep '(^|[^0-9])((2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)\.){3}(2([6-9]|5[0-5]?|[0-4][0-9]?)?|1([0-9][0-9]?)?|[3-9][0-9]?|0)($|[^0-9])'
>> 191.191.191.39
This is a DFA that matches the entire address space (including broadcasts, etc.) and nothing else.
I think this one is the shortest:
^(([01]?\d\d?|2[0-4]\d|25[0-5])\.){3}([01]?\d\d?|2[0-4]\d|25[0-5])$
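In a pattern of this shape the separator has to be an escaped dot, because an unescaped `.` matches any character. The following Python sketch shows the difference; the names `BYTE`, `unescaped` and `escaped` are made up for the example.

```python
import re

# One octet, leading zeros allowed (same alternatives as the pattern above).
BYTE = r"(?:[01]?\d\d?|2[0-4]\d|25[0-5])"

# Separator written as "." (any character) versus "\." (a literal dot).
unescaped = re.compile(rf"^({BYTE}.){{3}}{BYTE}$")
escaped = re.compile(rf"^({BYTE}\.){{3}}{BYTE}$")

sample = "1a2b3c4"
print(bool(unescaped.match(sample)))     # True: letters are accepted as separators
print(bool(escaped.match(sample)))       # False
print(bool(escaped.match("1.2.3.4")))    # True
```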
I found this sample very useful; furthermore, it allows different IPv4 notations.
Sample code using Python:
import re

def is_valid_ipv4(ip4):
    """Validates IPv4 addresses.
    """
    pattern = re.compile(r"""
        ^
        (?:
          # Dotted variants:
          (?:
            # Decimal 1-255 (no leading 0's)
            [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
          |
            0x0*[0-9a-f]{1,2}  # Hexadecimal 0x0 - 0xFF (possible leading 0's)
          |
            0+[1-3]?[0-7]{0,2} # Octal 0 - 0377 (possible leading 0's)
          )
          (?:                  # Repeat 0-3 times, separated by a dot
            \.
            (?:
              [3-9]\d?|2(?:5[0-5]|[0-4]?\d)?|1\d{0,2}
            |
              0x0*[0-9a-f]{1,2}
            |
              0+[1-3]?[0-7]{0,2}
            )
          ){0,3}
        |
          0x0*[0-9a-f]{1,8}    # Hexadecimal notation, 0x0 - 0xffffffff
        |
          0+[0-3]?[0-7]{0,10}  # Octal notation, 0 - 037777777777
        |
          # Decimal notation, 1-4294967295:
          429496729[0-5]|42949672[0-8]\d|4294967[01]\d\d|429496[0-6]\d{3}|
          42949[0-5]\d{4}|4294[0-8]\d{5}|429[0-3]\d{6}|42[0-8]\d{7}|
          4[01]\d{8}|[1-3]\d{0,9}|[4-9]\d{0,8}
        )
        $
    """, re.VERBOSE | re.IGNORECASE)
    # "<>" is Python 2 only; "is not None" works in Python 2 and 3.
    return pattern.match(ip4) is not None
((\.|^)(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9][0-9]?|0$)){4}
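This regex does not accept octets with a leading zero, for example
08.8.8.8 or 8.08.8.8 or 8.8.08.8 or 8.8.8.08
Note that because of the 0$ alternative a zero octet is accepted only at the very end of the input, so addresses such as 0.0.0.0 or 10.0.0.1 are rejected as well.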
Finds valid IP addresses as long as the IP is surrounded by any character other than a digit (before or after the IP). 4 backreferences are created: $+{first}.$+{second}.$+{third}.$+{forth}
Find String:
#any valid IP address
(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
#only valid private IP address RFC1918
(?<IP>(?<![\d])(?:(?:(?<first>10)[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5])))|(?:(?<first>172)[\.](?<second>(?:1[6-9])|(?:2[0-9])|(?:3[0-1])))|(?:(?<first>192)[\.](?<second>168)))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
Notepad++ Replace String Option 1: Replaces the whole IP (no change):
$+{IP}
Notepad++ Replace String Option 2: Replaces the whole IP octet by octet (no change):
$+{first}.$+{second}.$+{third}.$+{forth}
Notepad++ Replace String Option 3: Replaces the whole IP octet by octet (replaces the 3rd octet value with 0):
$+{first}.$+{second}.0.$+{forth}
NOTE: The above will match any valid IP, including 255.255.255.255 for example, and change it to 255.255.0.255, which is wrong and of course not very useful.
However, by replacing a portion of each octet with an actual value you can build your own find and replace, which is actually useful for amending IPs in text files:
For example, replace the first octet group of the original Find regex above:
(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<first>10)
and
(?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))
with
(?<second>216)
and you are now matching only addresses starting with 10.216
Find in Notepad++:
(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))
You can still perform the Replace using the back-reference groups in exactly the same fashion as before.
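The same find-and-replace idea can be sketched outside Notepad++. The Python version below is an illustration only: Python spells named groups `(?P<name>...)` and references them as `\g<name>` in the replacement, and the names `OCTET`, `PATTERN` and `zero_third_octet` are made up for the example.

```python
import re

# One octet without leading zeros, longest alternatives first.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)"

# Addresses starting with 10.216, not glued to other digits on either side.
PATTERN = re.compile(
    rf"(?<!\d)(?P<first>10)\.(?P<second>216)\."
    rf"(?P<third>{OCTET})\.(?P<forth>{OCTET})(?!\d)"
)


def zero_third_octet(text: str) -> str:
    """Replace the third octet of every 10.216.x.y address with 0."""
    return PATTERN.sub(r"\g<first>.\g<second>.0.\g<forth>", text)


print(zero_third_octet("sfds10.216.24.23kgfd and 192.168.1.1"))
# sfds10.216.0.23kgfd and 192.168.1.1
```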
You can get an idea of what the find regex matches from the test file below:
cat ipv4_validation_test.txt
Full Match:
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
Partial Match (IP Extraction from line)
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd
NO Match
1.1.1.01
3...3
127.1.
192.168.1..
192.168.1.256
da11.15.112.2554adfdsfds
da311.15.112.255adfdsfds
Using grep you can see the results below:
From grep:
grep -oP '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0
1.2.3.4
10.216.24.23
11.15.112.255
10.216.24.23
grep -P '(?<IP>(?<![\d])(?<first>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<second>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
0.0.0.1
12.108.1.34
192.168.1.1
10.249.24.212
10.216.1.212
192.168.1.255
255.255.255.255
0.0.0.0
30.168.1.0.1
-1.2.3.4
sfds10.216.24.23kgfd
da11.15.112.255adfdsfds
sfds10.216.24.23kgfd
#matching ip addresses starting with 10.216
grep -oP '(?<IP>(?<![\d])(?<first>10)[\.](?<second>216)[\.](?<third>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))[\.](?<forth>(?:\d)|(?:[1-9]\d)|(?:1\d{2})|(?:2[0-4]\d)|(?:25[0-5]))(?![\d]))' ipv4_validation_test.txt
10.216.1.212
10.216.24.23
10.216.24.23
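For comparison, the extraction that grep -o performs can be approximated in Python with `finditer`. This is a sketch under assumptions: the sample lines are copied from the test file above, and `OCTET`, `IPV4` and `extract_ipv4` are illustrative names, not part of any tool mentioned here.

```python
import re

# One octet without leading zeros, longest alternatives first.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)"

# A dotted quad that is not directly preceded or followed by another digit.
IPV4 = re.compile(rf"(?<!\d){OCTET}(?:\.{OCTET}){{3}}(?!\d)")


def extract_ipv4(lines):
    """Return every address found in the given lines, in order of appearance."""
    return [m.group(0) for line in lines for m in IPV4.finditer(line)]


sample_lines = [
    "30.168.1.0.1",
    "-1.2.3.4",
    "sfds10.216.24.23kgfd",
    "1.1.1.01",
    "192.168.1.256",
    "da311.15.112.255adfdsfds",
]
print(extract_ipv4(sample_lines))
# ['30.168.1.0', '1.2.3.4', '10.216.24.23']
```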
^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\\.)){3}+((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))$
The above is a regex for IP addresses such as
221.234.000.112
and also 221.234.0.112, 221.24.03.112 and 221.234.0.1
In other words, octets are accepted with or without leading zeros.
I would use PCRE and the DEFINE construct:
/^
((?&byte))\.((?&byte))\.((?&byte))\.((?&byte))$
(?(DEFINE)
(?<byte>25[0-5]|2[0-4]\d|[01]?\d\d?))
/gmx
Demo: https://regex101.com/r/IB7j48/2
The reason for this is to avoid repeating the (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) pattern four times. Other solutions, such as the one below, work well but do not capture each octet in its own group, which is often what is wanted.
/^((\d+?)(\.|$)){4}/
The only other way to have 4 capture groups is to repeat the pattern four times:
/^(?<one>\d+)\.(?<two>\d+)\.(?<three>\d+)\.(?<four>\d+)$/
Capturing an IPv4 address in Perl is therefore very easy:
$ echo "Hey this is my IP address 138.131.254.8, bye!" | \
perl -ne 'print "[$1, $2, $3, $4]" if
/\b((?&byte))\.((?&byte))\.((?&byte))\.((?&byte))\b
(?(DEFINE)
(?<byte>25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))
/x'
[138, 131, 254, 8]
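Python's standard `re` module has no `(?(DEFINE)...)` construct, but the same goal of writing the byte pattern once while keeping four capture groups can be reached by building the pattern from a string. A minimal sketch, with `BYTE`, `IPV4` and `capture_octets` as made-up names:

```python
import re

# The byte pattern is written once (leading zeros allowed, as above).
BYTE = r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)"

# Four capturing copies of BYTE joined by literal dots, between word boundaries.
IPV4 = re.compile(r"\b" + r"\.".join([f"({BYTE})"] * 4) + r"\b")


def capture_octets(text: str):
    """Return the four octets of the first address in text, or None."""
    match = IPV4.search(text)
    return list(match.groups()) if match else None


print(capture_octets("Hey this is my IP address 138.131.254.8, bye!"))
# ['138', '131', '254', '8']
```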