Regex: How can I match the third IPv4 address?

I'm a regex noob and for the life of me I can't figure out how to match the third IPv4 address on a line that contains three IPv4 addresses.
The line in question:
ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN
The regex I have so far that matches one IP:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
If I put a {3} at the end of it, it breaks. I think it has something to do with the spaces between the addresses but I can't figure out how to handle that. I need to capture the third address.
https://regex101.com/r/mN3cR6/1

You just need to add the global modifier (g) so that the regex returns every match rather than only the first.
Your new code should look like this:
/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/g
See this demo https://regex101.com/r/mN3cR6/2

Try
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?)+
This should match one, two, or three, or even more "IPs".
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
for exactly 3.
Or
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s?){3}
for a shorter pattern, with caveats: the group keeps only its last repetition (the third address), and the optional \s? is captured along with it.
Note that the basic idea is problematic too, as it matches "999.999.999.999" when it is definitely not a valid IP address.
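A quick Ruby sketch of the {3} variant above, to show how a quantified capture group behaves: only the last repetition is kept, and the whitespace inside the group comes along with it. The variable names are illustrative only.

```ruby
line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"
octets = /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/

# The capture group is repeated three times; only the last repetition is kept.
repeated = /(#{octets}\s?){3}/
m = line.match(repeated)

whole_match  = m[0]               # all three addresses plus the trailing space
last_capture = m[1]               # third address, still carrying the captured space
third_ip     = last_capture.strip # drop the whitespace captured by \s?
```

So the short form does reach the third address, but the capture needs trimming before use.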

The following should match the third ip
(?:[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\s){2}([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
It's possible to be more compact depending on what language you're using. For instance, in Ruby
string.scan(/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/)[2]
would give you what you want (without the capture group, scan returns plain strings). You could also collapse the repeated [0-9]{1,3}\. instances using non-capturing groups and counts.
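Both ideas from this answer can be tried side by side in Ruby. This is only a sketch: the variable names are made up, and the ip pattern is the collapsed form with a non-capturing group and a count.

```ruby
line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"

# (?:[0-9]{1,3}\.){3}[0-9]{1,3} is the collapsed form of the four-part pattern.
ip = /(?:[0-9]{1,3}\.){3}[0-9]{1,3}/

# Option 1: skip two addresses with a non-capturing group, capture the third.
third_by_group = line[/(?:#{ip}\s){2}(#{ip})/, 1]

# Option 2: collect every match and index into the list.
# With no capture groups in the pattern, scan returns plain strings.
third_by_scan = line.scan(ip)[2]
```

Either way the result is the string 16.48.75.46. The scan version is easier to adapt if the number of addresses on the line can vary.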

The problem is that the regex needs to contain not only the IPs but also the spaces between them.
So adding a space into the repeated group should do the trick:
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ){3}
If you don't want that space in the final match, make it non-greedy (lazy) using ?? (or *?):
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3} ??){3}
Also note, that your regex matches more than just valid IPs. e.g. 999.999.999.999 would match nicely.
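To see the difference between the mandatory space and the lazy one, here is a small Ruby sketch (variable names are illustrative). It compares the full match text of the two patterns on the line from the question.

```ruby
line = "ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN"
ip = /[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/

# Mandatory space after each address: the match ends with a space.
with_space = line[/(#{ip} ){3}/]

# Lazy optional space: it is only consumed when the next repetition needs it,
# so nothing is consumed after the third address.
lazy_space = line[/(#{ip} ??){3}/]
```

The lazy version still consumes the spaces between addresses, because the engine backtracks into them when the next repetition cannot start on a space.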

You are already matching all three IPs with that regex.
([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})
Match 1
214.25.48.547
Match 2
255.255.255.255
Match 3
16.48.75.46
You can test it here:
http://rubular.com/
The problem may be with how you are trying to access them.
In Ruby, your regex works perfectly:
regex = /([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/
"ip route 214.25.48.547 255.255.255.255 16.48.75.46 name Chicago-VPN".scan(regex)
=> [["214.25.48.547"], ["255.255.255.255"], ["16.48.75.46"]]

Related

I want a regex support for characters that uses IP Address with Subnet

I have a regex ^[a-zA-Z0-9.*?]+$ that supports IP addresses like 31.202.216.280. How can I modify it to also support an IP address with a subnet, like 31.202.216.280/38?
Be careful: if you use [0-9] for the blocks, an IP address with 999 as a number will be accepted.
If you want to limit the blocks to 0-255, you need to do this explicitly with:
^
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.
(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\/
([1-3][0-2]$|[0-2][0-9]$|0?[0-9]$)
This makes it possible to write addresses of the form:
(0-255).(0-255).(0-255).(0-255)/(0-32)
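The same idea can be assembled from smaller pieces, which makes it easier to read. The following is a Ruby sketch, not a drop-in replacement: OCTET, PREFIX, CIDR and cidr? are illustrative names, and unlike the pattern above this PREFIX does not accept leading zeros such as /08, which is a simplifying assumption.

```ruby
OCTET  = /25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?/   # 0-255
PREFIX = /3[0-2]|[12]?[0-9]/                      # 0-32, no leading zeros
# \A and \z anchor the whole string (in Ruby, ^ and $ anchor lines instead).
CIDR   = /\A(?:#{OCTET})(?:\.(?:#{OCTET})){3}\/(?:#{PREFIX})\z/

# True when text is an IPv4 address followed by /0 to /32.
def cidr?(text)
  CIDR.match?(text)
end
```

For example, cidr?("31.202.216.28/24") is true, while the address from the question, 31.202.216.280/38, is rejected twice over: 280 is not a valid block and 38 is not a valid prefix length.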
Look at the pre-made regular expressions provided by these Perl modules.
Regexp::Common::net
Regexp::Common::net::CIDR
You can use them on the command line or from a Perl script. You can also just copy the regexes and use them in another language.
With such a regex for IPs you might catch a lot of false positives. Try at least to remove a-zA-Z. There are already great regexes on SO to match IPs.
If you want to match your subnet add /[0-9]{1,3} before your string end ($). You might need to escape the slash depending on your programming language: \/.
With a subnet:
^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\/[0-9]{1,3}$
Without a subnet, you may try:
^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$
You can also use a ttp template to capture IPs and prefixes, as follows. |IP captures IPs and |PREFIX captures prefixes.
See the following ttp example:
ttp_template = """
{{Prefix|PREFIX}}
{{Ip|IP}}
"""

Regex for finding domains in a sentence but not IP addresses

I am trying to write a regular expression that will match domains in a sentence.
I found this post, which was very useful and helped me create the following to match domains, but unfortunately it also matches IP addresses, which I do not want:
((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
I want to update my expression so that the following can still be found: in a sentence, between brackets, etc.:
www.example.com
subdomain.example.com
subdomain.example.co.uk
But not:
192.168.0.0
127.0.0.1
Is there a way to do this?
We could use a simple lookahead that excludes combinations of numbers and dots only: (?![\d.]+)
(?![\d.]+)((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
The answer from @wp78de is correct; however, it would not detect domains that start with digits, e.g. 123reg.com
So remove the first group in the regex, like this:
((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
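A different way to handle both requirements, sketched in Ruby: match candidates with a loose pattern, then discard any candidate made up only of digits and dots. Note that this swaps the lookahead for a filtering step in code. DOMAIN below is a simplified, hypothetical stand-in for the full pattern in the question, and domains_in is an illustrative helper name.

```ruby
# Simplified stand-in pattern: dot-separated labels, last label 2+ characters.
DOMAIN = /\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9-]{2,}\b/i

# A candidate that consists of nothing but digits and dots.
DIGITS_AND_DOTS = /\A[\d.]+\z/

# Returns domain-like tokens from text, dropping all-numeric candidates.
def domains_in(text)
  text.scan(DOMAIN).reject { |candidate| candidate.match?(DIGITS_AND_DOTS) }
end

found = domains_in("Visit www.example.com (or subdomain.example.co.uk), " \
                   "not 192.168.0.0 or 127.0.0.1; 123reg.com is fine.")
```

Because the numeric check runs on the whole candidate rather than on its first character, names such as 123reg.com survive while 192.168.0.0 does not.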

Why doesn't this regexp for IPv4 work?

So this is the regex I've made:
^(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5])))\.){3}(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5]))))$
I have used several sites to break it down and it seems that it should work, but it doesn't. The desired result is to match any IPv4 - four numbers between 0 and 255 delimited by dots.
As an example, 1.1.1.1 won't give you a match.
The purpose of this question is not to find out a regex for IPv4 address, but to find out why this one, which seems correct, is not.
The literal . is only part of the 200-255 section of the capture group: railroad diagram.
Here's (([01]?\d{1,2})|(2([0-4]\d)|(5[0-5]))\.) formatted differently to help you spot the reason:
(
([01]?\d{1,2})
|
(2([0-4]\d)|(5[0-5])) \.
)
You're matching 0-199 or 200-255 with a dot. The dot is conditional on matching 200-255.
Additionally, as @SebastianProske pointed out, 2([0-4]\d)|(5[0-5]) matches 200-249 or 50-55, not 200-255.
You can fix your regex by adding capturing groups, but ultimately I would recommend not reinventing the wheel and using A) a pre-existing regex solution or B) parsing the IPv4 address by splitting on dots. The latter method is easier to read and understand.
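A rough Ruby sketch of option B, splitting on dots. ipv4? is a hypothetical helper name, and this version accepts leading zeros such as 01, which is an assumption rather than a requirement from the question.

```ruby
# True when text is four dot-separated decimal numbers, each in 0..255.
def ipv4?(text)
  # The -1 limit keeps trailing empty fields, so "1.1.1." still yields 4 parts.
  parts = text.split(".", -1)
  return false unless parts.length == 4

  # Each part must be 1-3 digits and no larger than 255.
  parts.all? { |part| part.match?(/\A\d{1,3}\z/) && part.to_i <= 255 }
end
```

The range check is ordinary integer comparison here, so there is no alternation to get wrong.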
To fix yours up, just account for the "decimal" after each of the first three groups:
((2[0-4]\d|25[0-5]|[01]?\d{1,2})\.){3}(2[0-4]\d|25[0-5]|[01]?\d{1,2})
(*note that I reversed the order of the 2xx vs 1xx tests as well - prefer SPECIAL|...|NORMAL, or more restrictive first, when using alternations like this)
see it in action
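To check both patterns against a few inputs, here is a small Ruby sketch. ORIGINAL is the regex from the question and FIXED is the one above; the only changes are the anchors, which are written as \A and \z because ^ and $ are line anchors in Ruby (and FIXED is anchored here as an assumption, since the version above has no anchors).

```ruby
# Regex from the question: the \. belongs only to the 2xx alternative.
ORIGINAL = /\A(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5])))\.){3}(([01]?\d{1,2})|(2(([0-4]\d)|(5[0-5]))))\z/

# Fixed version: the \. follows the whole alternation.
FIXED = /\A((2[0-4]\d|25[0-5]|[01]?\d{1,2})\.){3}(2[0-4]\d|25[0-5]|[01]?\d{1,2})\z/

samples = ["1.1.1.1", "250.250.250.1", "1111", "256.1.1.1"]

# Map each sample to [matches ORIGINAL?, matches FIXED?].
results = samples.to_h { |s| [s, [ORIGINAL.match?(s), FIXED.match?(s)]] }
```

The original accepts 250.250.250.1 (every block takes the 2xx branch, which carries the dot) and even 1111 (four dotless 0-199 blocks), but rejects 1.1.1.1, which is consistent with the explanation that the dot is conditional on matching 200-255.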

How to avoid different capture group numbers in a regex?

I'm trying to capture an IP address in a log and fall back to a hostname if the address is 0.0.0.0.
Here are some examples of logs:
Foo bar ip=0.0.0.0 baz host=YOLO-PC foobar bazinga
In this case, I want "YOLO-PC" because IP is 0.0.0.0
Foo bar ip=12.23.34.45 baz host=FOOBAR-PC foobar bazinga
In this case, I want 12.23.34.45.
Here's what I tried:
ip=(?:0\.0\.0\.0|(\d+\.\d+\.\d+\.\d+)).*?host=(?(1).|(\S+))
It works, but when IP is 0.0.0.0, it creates a second group and the program behind it can't fetch group #2, only group #1.
How can I do this? Put it all in only one group? Is there a better solution?
It's unclear from your question which environment/language/regex flavour you're dealing with. But PCRE regexes actually let you do this with the (?|some(capture)|another(capture)) syntax:
ip=(?|0\.0\.0\.0.*?host=(\S+)|(\d+\.\d+\.\d+\.\d+))
You can see from the debuggex visualisation that both groups are numbered 1. And on regex101 you see the captures on the right.
Alternatively (if you're not using PCRE), I guess you could do this. It's less strict, but works in almost every engine. Your current regex isn't particularly strict with the IP format (allowing numbers higher than 255, etc.) so maybe this is not an issue for you.
ip=(?:0\.0\.0\.0.*?host=)?(\S+)
Debuggex Demo
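The non-PCRE variant can be tried in Ruby, which has no (?|...) branch reset. This is a sketch using the two log lines from the question; the variable names are illustrative.

```ruby
# Optional prefix: if the IP is 0.0.0.0, skip ahead to host= before capturing.
pattern = /ip=(?:0\.0\.0\.0.*?host=)?(\S+)/

logs = [
  "Foo bar ip=0.0.0.0 baz host=YOLO-PC foobar bazinga",
  "Foo bar ip=12.23.34.45 baz host=FOOBAR-PC foobar bazinga"
]

# Group 1 is the only capture group, so the caller always reads group 1.
values = logs.map { |log| log[pattern, 1] }
```

This is where the "less strict" caveat shows: group 1 is just \S+, so whatever non-space text follows ip= is accepted, including 0.0.0.0 itself when the line has no host= field.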
The number of groups in your result is equal to the number of ( ) groups in the regex. And the order you reference them is the order the opening parens appear in the regex. Some of the groups might not match and be empty.
So in your case, you will always have two groups. Group 1 is the non-zero ip and group 2 is the host-name. If the IP is 0.0.0.0, then group 1 will be empty. If not, then group 2 will be empty.
Can't you just check in your code which group is empty and use the other one?
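If the calling code can be changed, checking the groups might look like this in Ruby. The pattern is a simplified variant of the one in the question (the conditional is dropped, so group 2 always holds the hostname), and ip_or_host is a hypothetical helper name.

```ruby
# Group 1: the IP, unless it is 0.0.0.0 (then group 1 is nil).
# Group 2: the hostname.
PATTERN = /ip=(?:0\.0\.0\.0|(\d+\.\d+\.\d+\.\d+)).*?host=(\S+)/

# Returns the IP when group 1 matched, otherwise the hostname; nil if no match.
def ip_or_host(log)
  m = PATTERN.match(log)
  return nil unless m

  m[1] || m[2]
end
```

In Ruby, a group that did not take part in the match is nil, so m[1] || m[2] picks the first group that actually matched.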
Use an alternation, which is attempted left to right:
(?<=ip=)(?!0\.0\.0\.0)\S+|(?<=host=)\S+
See demo
This matches only your target input because it uses lookarounds. The negative lookahead rejects the IP if it's all zeros.
Just pick only the first match.
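A Ruby sketch of the lookaround approach, assuming the lookbehind includes the = sign and the dots in the lookahead are escaped. ip_or_host is a hypothetical helper name.

```ruby
# Alternative 1: text after "ip=", unless it starts with 0.0.0.0.
# Alternative 2: text after "host=".
PATTERN = /(?<=ip=)(?!0\.0\.0\.0)\S+|(?<=host=)\S+/

# String#[] with a regex returns the first match, which is the one we want.
def ip_or_host(log)
  log[PATTERN]
end
```

There are no capture groups at all here, so the question of group numbers disappears: the caller reads the whole match. The hostname of a line with a usable IP would still show up as a second match, which is why only the first one is taken.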

How does this Squid regex filter rule work?

On our Squid server, the admin has put on a new regex rule:
^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+
I know that it stands for an IP address, but it lets all URLs through; only pinging external addresses has stopped. Also, tunneling software like UltraSurf has stopped connecting to the server, and Skype is not connecting either.
Please explain how this works! Thanks.
I am not sure about your particular issue with the Squid server, but here is what the regex does:
[0-9]+ means "any digit one or more times", so the regex matches a string that begins with one or more digits, followed by a dot, followed by one or more digits, followed by a dot, followed by one or more digits, followed by a dot, followed by one or more digits, then anything else. In essence, it is matching any IP address, so it wouldn't filter anything out. It will also match things that are not valid IP addresses, like 123456.123456.123456.123456 or 125.252.252.252asdf, as well as valid ones like 1.1.1.1.
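The behaviour of the pattern itself is easy to check in Ruby. This sketch says nothing about how Squid applies the rule; it only shows which strings the regex matches. \A is used in place of ^ because ^ matches at every line start in Ruby, and rule_matches? is an illustrative helper name.

```ruby
RULE = /\A[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/

# True when text starts with four dot-separated runs of digits.
def rule_matches?(text)
  RULE.match?(text)
end

samples = [
  "1.1.1.1",
  "123456.123456.123456.123456",
  "125.252.252.252asdf",
  "www.example.com",
  "host 1.1.1.1"
]
matched = samples.select { |s| rule_matches?(s) }
```

The first three samples match because the pattern has no end anchor and no limit on digit count. The last two do not, because the string has to begin with digits.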
Paolo has explained the meaning of the Regex well! As mentioned, the Regex currently being used is too weak (or should I say too restrictive!)
If you want a much better Regex to match IP addresses, see this page.